view: Avoid using ':' from metadata when generating a view

Because it's a special character on Windows ("c:").

Use same technique already used for '/' and '\'.

I didn't record how I generated their encoded forms before, so am sure
there was a better way, but the way I did it now is to look at

	ghci> encodeFilePath "∕"
	"\226\136\149"

And then the difference from that to "\56546\56456\56469"
is adding 56320 to each, to get up to the escaped code plane.

See comment for why I think handling ':' is ok, but that other illegal
windows filenames won't. Note that, this should be enough to make the
test suite always work. Other windows illegal filenames will fail at
checkout time when it tries to put the illegal filename on the
filesystem.
This commit is contained in:
Joey Hess 2020-10-26 15:37:55 -04:00
parent 2d291bf200
commit 64e7bac810
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
4 changed files with 69 additions and 16 deletions

View file

@ -235,27 +235,34 @@ pseudoSlash = "\56546\56456\56469"
pseudoBackslash :: String
pseudoBackslash = "\56546\56469\56498"
-- And this is '﹕' similarly.
pseudoColon :: String
pseudoColon = "\56559\56505\56469"
toViewPath :: MetaValue -> FilePath
toViewPath = escapeslash [] . decodeBS . fromMetaValue
toViewPath = escapepseudo [] . decodeBS . fromMetaValue
where
escapeslash s ('/':cs) = escapeslash (pseudoSlash:s) cs
escapeslash s ('\\':cs) = escapeslash (pseudoBackslash:s) cs
escapeslash s ('%':cs) = escapeslash ("%%":s) cs
escapeslash s (c1:c2:c3:cs)
| [c1,c2,c3] == pseudoSlash = escapeslash ("%":pseudoSlash:s) cs
| [c1,c2,c3] == pseudoBackslash = escapeslash ("%":pseudoBackslash:s) cs
| otherwise = escapeslash ([c1]:s) (c2:c3:cs)
escapeslash s cs = concat (reverse (cs:s))
escapepseudo s ('/':cs) = escapepseudo (pseudoSlash:s) cs
escapepseudo s ('\\':cs) = escapepseudo (pseudoBackslash:s) cs
escapepseudo s (':':cs) = escapepseudo (pseudoColon:s) cs
escapepseudo s ('%':cs) = escapepseudo ("%%":s) cs
escapepseudo s (c1:c2:c3:cs)
| [c1,c2,c3] == pseudoSlash = escapepseudo ("%":pseudoSlash:s) cs
| [c1,c2,c3] == pseudoBackslash = escapepseudo ("%":pseudoBackslash:s) cs
| [c1,c2,c3] == pseudoColon = escapepseudo ("%":pseudoColon:s) cs
| otherwise = escapepseudo ([c1]:s) (c2:c3:cs)
escapepseudo s cs = concat (reverse (cs:s))
fromViewPath :: FilePath -> MetaValue
fromViewPath = toMetaValue . encodeBS . deescapeslash []
fromViewPath = toMetaValue . encodeBS . deescapepseudo []
where
deescapeslash s ('%':escapedc:cs) = deescapeslash ([escapedc]:s) cs
deescapeslash s (c1:c2:c3:cs)
| [c1,c2,c3] == pseudoSlash = deescapeslash ("/":s) cs
| [c1,c2,c3] == pseudoBackslash = deescapeslash ("\\":s) cs
| otherwise = deescapeslash ([c1]:s) (c2:c3:cs)
deescapeslash s cs = concat (reverse (cs:s))
deescapepseudo s ('%':escapedc:cs) = deescapepseudo ([escapedc]:s) cs
deescapepseudo s (c1:c2:c3:cs)
| [c1,c2,c3] == pseudoSlash = deescapepseudo ("/":s) cs
| [c1,c2,c3] == pseudoBackslash = deescapepseudo ("\\":s) cs
| [c1,c2,c3] == pseudoColon = deescapepseudo (":":s) cs
| otherwise = deescapepseudo ([c1]:s) (c2:c3:cs)
deescapepseudo s cs = concat (reverse (cs:s))
prop_viewPath_roundtrips :: MetaValue -> Bool
prop_viewPath_roundtrips v = fromViewPath (toViewPath v) == v

View file

@ -21,6 +21,8 @@ git-annex (8.20201008) UNRELEASED; urgency=medium
will work.
Thanks to John Thorvald Wodder II and Yaroslav Halchenko for their work
on this.
* view: Avoid using ':' from metadata when generating a view, because
it's a special character on Windows ("c:")
-- Joey Hess <id@joeyh.name> Thu, 08 Oct 2020 10:48:17 -0400

View file

@ -30,3 +30,5 @@ Cheers,
[[!meta author=yoh]]
[[!tag projects/datalad]]
> [[fixed|done]] --[[Joey]]

View file

@ -0,0 +1,42 @@
[[!comment format=mdwn
username="joey"
subject="""comment 3"""
date="2020-10-26T18:41:27Z"
content="""
Hmm, this uses viewedFiles, which generates filenames
based on the MetaValue. Note use of pathProduct, which uses
System.FilePath.combine.
So, generating random ascii (including escape sequences)
bytestrings, and passing them through decodeBS to generate FilePaths,
and then operating on those filepaths. What could possibly go wrong.
And aha! I made pathProduct use System.FilePath.Windows.combine
and was able to reproduce the test suite failure on Linux.
And aha again:
MetaValue (CurrentlySet True) "c:"
Which of course breaks it on windows because it wanted to generate
something like "bar/c:/baz/a" but instead it gets "c:/bar/baz/a"
git-annex does replace '/' and '\' when generating these filenames.
Not as a security measure (when the view branch is checked out, git's
security checks apply same as any branch so it piggybacks on those),
but to let the user build a view and successfully check it out
when their metadata happens to include such stuff.
However, windows does have enough special filenames and gotchas
that it simply does not seem to make sense for git-annex to try to work
around them all in the view code. If a MetaValue happens to end with a
period, or is "nul", and so the generated filename is illegal on Windows,
it'll blow up at checkout time, and I am ok with that.
So I think it would make sense to also escape ':', but that's about as far
as this should go. *Especially* because the filenames it generates need to
roundtrip back to metadata cleanly, which is what this test case is
testing. While I can finesse individual characters, it would be quite hard
to make a filename w/o a trailing dot roundtrip back to one with it, for
example.
"""]]