From 64e7bac810daac236c14ce31c84e4158bf9c7759 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Mon, 26 Oct 2020 15:37:55 -0400 Subject: [PATCH] view: Avoid using ':' from metadata when generating a view MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Because it's a special character on Windows ("c:"). Use same technique already used for '/' and '\'. I didn't record how I generated their encoded forms before, so am sure there was a better way, but the way I did it now is to look at ghci> encodeFilePath "∕" "\226\136\149" And then the difference from that to "\56546\56456\56469" is adding 56320 to each, to get up to the escaped code plane. See comment for why I think handling ':' is ok, but that other illegal windows filenames won't. Note that, this should be enough to make the test suite always work. Other windows illegal filenames will fail at checkout time when it tries to put the illegal filename on the filesystem. --- Annex/View.hs | 39 ++++++++++------- CHANGELOG | 2 + ...__58___prop__95__view__95__roundtrips.mdwn | 2 + ..._06dd9606225cf881f7681fc27fb4833d._comment | 42 +++++++++++++++++++ 4 files changed, 69 insertions(+), 16 deletions(-) create mode 100644 doc/bugs/1_test_failure_under_conda_on_Windows_10__58___prop__95__view__95__roundtrips/comment_3_06dd9606225cf881f7681fc27fb4833d._comment diff --git a/Annex/View.hs b/Annex/View.hs index fa9a7b632d..725cce3cb6 100644 --- a/Annex/View.hs +++ b/Annex/View.hs @@ -235,27 +235,34 @@ pseudoSlash = "\56546\56456\56469" pseudoBackslash :: String pseudoBackslash = "\56546\56469\56498" +-- And this is '﹕' similarly. +pseudoColon :: String +pseudoColon = "\56559\56505\56469" + toViewPath :: MetaValue -> FilePath -toViewPath = escapeslash [] . decodeBS . fromMetaValue +toViewPath = escapepseudo [] . decodeBS . fromMetaValue where - escapeslash s ('/':cs) = escapeslash (pseudoSlash:s) cs - escapeslash s ('\\':cs) = escapeslash (pseudoBackslash:s) cs - escapeslash s ('%':cs) = escapeslash ("%%":s) cs - escapeslash s (c1:c2:c3:cs) - | [c1,c2,c3] == pseudoSlash = escapeslash ("%":pseudoSlash:s) cs - | [c1,c2,c3] == pseudoBackslash = escapeslash ("%":pseudoBackslash:s) cs - | otherwise = escapeslash ([c1]:s) (c2:c3:cs) - escapeslash s cs = concat (reverse (cs:s)) + escapepseudo s ('/':cs) = escapepseudo (pseudoSlash:s) cs + escapepseudo s ('\\':cs) = escapepseudo (pseudoBackslash:s) cs + escapepseudo s (':':cs) = escapepseudo (pseudoColon:s) cs + escapepseudo s ('%':cs) = escapepseudo ("%%":s) cs + escapepseudo s (c1:c2:c3:cs) + | [c1,c2,c3] == pseudoSlash = escapepseudo ("%":pseudoSlash:s) cs + | [c1,c2,c3] == pseudoBackslash = escapepseudo ("%":pseudoBackslash:s) cs + | [c1,c2,c3] == pseudoColon = escapepseudo ("%":pseudoColon:s) cs + | otherwise = escapepseudo ([c1]:s) (c2:c3:cs) + escapepseudo s cs = concat (reverse (cs:s)) fromViewPath :: FilePath -> MetaValue -fromViewPath = toMetaValue . encodeBS . deescapeslash [] +fromViewPath = toMetaValue . encodeBS . deescapepseudo [] where - deescapeslash s ('%':escapedc:cs) = deescapeslash ([escapedc]:s) cs - deescapeslash s (c1:c2:c3:cs) - | [c1,c2,c3] == pseudoSlash = deescapeslash ("/":s) cs - | [c1,c2,c3] == pseudoBackslash = deescapeslash ("\\":s) cs - | otherwise = deescapeslash ([c1]:s) (c2:c3:cs) - deescapeslash s cs = concat (reverse (cs:s)) + deescapepseudo s ('%':escapedc:cs) = deescapepseudo ([escapedc]:s) cs + deescapepseudo s (c1:c2:c3:cs) + | [c1,c2,c3] == pseudoSlash = deescapepseudo ("/":s) cs + | [c1,c2,c3] == pseudoBackslash = deescapepseudo ("\\":s) cs + | [c1,c2,c3] == pseudoColon = deescapepseudo (":":s) cs + | otherwise = deescapepseudo ([c1]:s) (c2:c3:cs) + deescapepseudo s cs = concat (reverse (cs:s)) prop_viewPath_roundtrips :: MetaValue -> Bool prop_viewPath_roundtrips v = fromViewPath (toViewPath v) == v diff --git a/CHANGELOG b/CHANGELOG index 126b9aa6ff..3652ec89eb 100644 --- a/CHANGELOG +++ b/CHANGELOG @@ -21,6 +21,8 @@ git-annex (8.20201008) UNRELEASED; urgency=medium will work. Thanks to John Thorvald Wodder II and Yaroslav Halchenko for their work on this. + * view: Avoid using ':' from metadata when generating a view, because + it's a special character on Windows ("c:") -- Joey Hess Thu, 08 Oct 2020 10:48:17 -0400 diff --git a/doc/bugs/1_test_failure_under_conda_on_Windows_10__58___prop__95__view__95__roundtrips.mdwn b/doc/bugs/1_test_failure_under_conda_on_Windows_10__58___prop__95__view__95__roundtrips.mdwn index 9e073c552a..4e804b39fa 100644 --- a/doc/bugs/1_test_failure_under_conda_on_Windows_10__58___prop__95__view__95__roundtrips.mdwn +++ b/doc/bugs/1_test_failure_under_conda_on_Windows_10__58___prop__95__view__95__roundtrips.mdwn @@ -30,3 +30,5 @@ Cheers, [[!meta author=yoh]] [[!tag projects/datalad]] + +> [[fixed|done]] --[[Joey]] diff --git a/doc/bugs/1_test_failure_under_conda_on_Windows_10__58___prop__95__view__95__roundtrips/comment_3_06dd9606225cf881f7681fc27fb4833d._comment b/doc/bugs/1_test_failure_under_conda_on_Windows_10__58___prop__95__view__95__roundtrips/comment_3_06dd9606225cf881f7681fc27fb4833d._comment new file mode 100644 index 0000000000..029540d2ba --- /dev/null +++ b/doc/bugs/1_test_failure_under_conda_on_Windows_10__58___prop__95__view__95__roundtrips/comment_3_06dd9606225cf881f7681fc27fb4833d._comment @@ -0,0 +1,42 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 3""" + date="2020-10-26T18:41:27Z" + content=""" +Hmm, this uses viewedFiles, which generates filenames +based on the MetaValue. Note use of pathProduct, which uses +System.FilePath.combine. + +So, generating random ascii (including escape sequences) +bytestrings, and passing them through decodeBS to generate FilePaths, +and then operating on those filepaths. What could possibly go wrong. + +And aha! I made pathProduct use System.FilePath.Windows.combine +and was able to reproduce the test suite failure on Linux. + +And aha again: + + MetaValue (CurrentlySet True) "c:" + +Which of course breaks it on windows because it wanted to generate +something like "bar/c:/baz/a" but instead it gets "c:/bar/baz/a" + +git-annex does replace '/' and '\' when generating these filenames. +Not as a security measure (when the view branch is checked out, git's +security checks apply same as any branch so it piggybacks on those), +but to let the user build a view and successfully check it out +when their metadata happens to include such stuff. + +However, windows does have enough special filenames and gotchas +that it simply does not seem to make sense for git-annex to try to work +around them all in the view code. If a MetaValue happens to end with a +period, or is "nul", and so the generated filename is illegal on Windows, +it'll blow up at checkout time, and I am ok with that. + +So I think it would make sense to also escape ':', but that's about as far +as this should go. *Especially* because the filenames it generates need to +roundtrip back to metadata cleanly, which is what this test case is +testing. While I can finesse individual characters, it would be quite hard +to make a filename w/o a trailing dot roundtrip back to one with it, for +example. +"""]]