fix Arbitrary AssociatedFile to not crash when LANG=C
Even letting through things that Data.Char.generalCategory said were UppercaseLetter caused the crash. Apparently what's going on is that, in LANG=C, the filesystem encoding does not expect to find unicode chars in a String, except presumably ones that are surrogates. But ascii is good enough to test the things we need to test about associated files.
parent 6f7a09c50f
commit d44fb89d4f
3 changed files with 41 additions and 3 deletions
11
Key.hs
@@ -1,6 +1,6 @@
 {- git-annex Keys
  -
- - Copyright 2011-2019 Joey Hess <id@joeyh.name>
+ - Copyright 2011-2020 Joey Hess <id@joeyh.name>
  -
  - Licensed under the GNU AGPL version 3 or higher.
  -}
@@ -28,6 +28,7 @@ module Key (
     prop_isomorphic_key_encode
 ) where

+import Data.Char
 import qualified Data.Text as T
 import qualified Data.ByteString as S
 import qualified Data.Attoparsec.ByteString as A
@@ -79,11 +80,15 @@ instance Arbitrary KeyData where
         <*> ((succ . abs <$>) <$> arbitrary) -- chunknum cannot be 0 or negative

 -- AssociatedFile cannot be empty, and cannot contain a NUL
--- (but can be Nothing)
+-- (but can be Nothing).
 instance Arbitrary AssociatedFile where
-    arbitrary = (AssociatedFile . fmap toRawFilePath <$> arbitrary)
+    arbitrary = (AssociatedFile . fmap conv <$> arbitrary)
         `suchThat` (/= AssociatedFile (Just S.empty))
         `suchThat` (\(AssociatedFile f) -> maybe True (S.notElem 0) f)
+      where
+        -- Generating arbitrary unicode leads to encoding errors
+        -- when LANG=C, so limit to ascii.
+        conv = toRawFilePath . filter isAscii

 instance Arbitrary Key where
     arbitrary = mkKey . const <$> arbitrary

@@ -18,3 +18,5 @@ Full build logs are at http://neuro.debian.net/_files/_buildlogs/git-annex/7.201

 [[!meta author=yoh]]
 [[!tag projects/datalad]]
+
+> [[fixed|done]] --[[Joey]]

@@ -0,0 +1,31 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2020-02-02T19:41:34Z"
+ content="""
+Minimal reproducer:
+
+    bash$ LANG=C ghci Utility/FileSystemEncoding.hs
+    ghci> useFileSystemEncoding
+    ghci> toRawFilePath "\611584"
+    "*** Exception: recoverEncode: invalid argument (invalid character)
+
+No such problem in a unicode locale.
+
+The problem does not, though, affect actually using git-annex in LANG=C
+with a filename with that in its name.
+
+Odd because the filesystem encoding is supposed to round-trip, well,
+anything, but here encoding a string with it is failing internally.
+Maybe the thing is, it's not really round-tripping? QuickCheck arbitrary
+magics up a FilePath that contains that, so it's starting in the middle and
+trying to convert it out.
+
+[[!commit 70395659db9f662e61009d984fc9b0b2f24fdece]] introduced this while
+fixing another intermittent encoding test case failure.
+
+    ghci> Data.Char.generalCategory '\611584'
+    NotAssigned
+
+I think it would make sense to filter out NotAssigned and PrivateUse.
+"""]]
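
The two filtering strategies discussed in this commit can be compared in isolation. The following is a minimal sketch using only `Data.Char` from base. The names `asciiOnly` and `assignedOnly` are hypothetical and do not appear in git-annex; `asciiOnly` mirrors the `filter isAscii` part of `conv`, and `assignedOnly` is one reading of the comment's suggestion to drop `NotAssigned` and `PrivateUse` characters. Per the commit message, filtering by general category was not enough to avoid the crash, which is why the commit limits the generator to ascii.

```haskell
import Data.Char (GeneralCategory (..), generalCategory, isAscii)

-- Hypothetical helper: the same filter the commit applies before
-- calling toRawFilePath, keeping only ASCII characters.
asciiOnly :: String -> String
asciiOnly = filter isAscii

-- Hypothetical helper: the alternative floated in the comment, dropping
-- only unassigned and private-use code points. Other non-ASCII
-- characters still pass through this filter.
assignedOnly :: String -> String
assignedOnly = filter keep
  where
    keep c = generalCategory c `notElem` [NotAssigned, PrivateUse]

main :: IO ()
main = do
    -- '\611584' is the character from the reproducer; '\955' is a
    -- lowercase Greek letter, an assigned non-ASCII character.
    let sample = "a\611584b\955c"
    print (asciiOnly sample)
    print (assignedOnly sample)
```

Note that NUL is an ASCII character, so `asciiOnly` alone does not remove it. That is consistent with the `suchThat` checks for NUL and the empty string remaining in the instance after this change.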