fix failing quickcheck properties

QuickCheck 2.10 found a counterexample; eg, "\929184" broke the property.

As far as I can tell, Git.Filename is matching how git handles encoding
of strange high unicode characters in filenames for display. Git does
not display high unicode characters, and instead displays the C-style
escaped form of each byte. This is ambiguous, but since git is not
unicode aware, it doesn't need to roundtrip parse it.

So, making Git.Filename's roundtrip test only chars < 256 seems fine.

Utility.Format.format uses encode_c, in order to mimic git, so that's
ok.

Utility.Format.gen uses decode_c, but only so that stuff like "\n"
in the format string is handled. If the format string contains C-style
octal escapes, they will be converted to ascii characters, and not
combined into unicode characters, but that should not be a problem.
If the user wants unicode characters, they can include them in the
format string, without escaping them.
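To illustrate the decoding side, here is a rough sketch of a C-style
decoder that turns each escape into exactly one character and never
combines adjacent escapes into a unicode character. decodeSketch and its
helpers are hypothetical names, and the set of named escapes handled is
an assumption; this is not the actual decode_c.

```haskell
import Data.Char (chr, isHexDigit, isOctDigit)
import Numeric (readHex, readOct)

-- Hypothetical stand-in for a C-style decoder.
-- Each \NNN or \xNN escape becomes one character.
decodeSketch :: String -> String
decodeSketch [] = []
decodeSketch ('\\':'x':h1:h2:rest)
    | isHexDigit h1 && isHexDigit h2 =
        chr (fromDigits readHex [h1, h2]) : decodeSketch rest
decodeSketch ('\\':o1:o2:o3:rest)
    | all isOctDigit [o1, o2, o3] =
        chr (fromDigits readOct [o1, o2, o3]) : decodeSketch rest
decodeSketch ('\\':c:rest) = named c : decodeSketch rest
decodeSketch (c:rest) = c : decodeSketch rest

-- Parse a digit string that has already been validated by the caller.
fromDigits :: ReadS Int -> String -> Int
fromDigits reader s = case reader s of
    [(n, "")] -> n
    _ -> 0

-- A few named escapes (assumed subset); any other escaped
-- character, such as a backslash, stands for itself.
named :: Char -> Char
named 'n' = '\n'
named 't' = '\t'
named 'r' = '\r'
named c = c
```

With this sketch, decodeSketch "\\343\\200\\271" gives the three
characters "\227\128\185" rather than the single character "\12345",
which matches the behaviour described above. A hex escape such as
"\\x3a" decodes to ":".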

Finally, decode_c is used by Utility.Gpg.secretKeys, because gpg
--with-colons hex-escapes some characters, in particular ':' and '\\'.
gpg passes unicode through, so this use of decode_c is not a problem.

This commit was sponsored by Henrik Riomar on Patreon.
Joey Hess 2017-06-17 16:17:09 -04:00
parent 89df21b8b8
commit da8e84efe9
GPG key ID: DB12DB0FF05F8F38
4 changed files with 30 additions and 12 deletions

@@ -11,7 +11,7 @@ module Utility.Format (
 	format,
 	decode_c,
 	encode_c,
-	prop_isomorphic_deencode
+	prop_encode_c_decode_c_roundtrip
 ) where
 
 import Text.Printf (printf)
@@ -100,8 +100,8 @@ empty :: Frag -> Bool
 empty (Const "") = True
 empty _ = False
 
-{- Decodes a C-style encoding, where \n is a newline, \NNN is an octal
- - encoded character, and \xNN is a hex encoded character.
+{- Decodes a C-style encoding, where \n is a newline (etc),
+ - \NNN is an octal encoded character, and \xNN is a hex encoded character.
  -}
 decode_c :: FormatString -> String
 decode_c [] = []
@@ -173,6 +173,15 @@ encode_c' p = concatMap echar
 	e_asc c = showoctal $ ord c
 	showoctal i = '\\' : printf "%03o" i
 
-{- for quickcheck -}
-prop_isomorphic_deencode :: String -> Bool
-prop_isomorphic_deencode s = s == decode_c (encode_c s)
+{- For quickcheck.
+ -
+ - Encoding and then decoding roundtrips only when
+ - the string does not contain high unicode, because eg,
+ - both "\12345" and "\227\128\185" are encoded to "\343\200\271".
+ -
+ - This property papers over the problem, by only testing chars < 256.
+ -}
+prop_encode_c_decode_c_roundtrip :: String -> Bool
+prop_encode_c_decode_c_roundtrip s = s' == decode_c (encode_c s')
+  where
+	s' = filter (\c -> ord c < 256) s