2019-01-07 18:18:24 +00:00
|
|
|
{- Simple Base64 encoding
|
Fix setting/viewing metadata that contains unicode or other special characters, when in a non-unicode locale.
Oh boy, not again. So, another place that the filesystem encoding needs to
be applied. Yay.
In passing, I changed decodeBS so if a NUL is embedded in the input, the
resulting FilePath doesn't get truncated at that NUL. This was needed to
make prop_b64_roundtrips pass, and on reviewing the callers of decodeBS, I
didn't see any where this wouldn't make sense. When a FilePath is used to
operate on the filesystem, it'll get truncated at a NUL anyway, whereas if
a String is being used for something else, it might conceivably have a NUL
in it, and we wouldn't want it to get truncated when going through
decodeBS.
(NB: There may be a speed impact from this change.)
2015-08-11 22:40:59 +00:00
|
|
|
-
|
2019-01-07 18:18:24 +00:00
|
|
|
- Copyright 2011-2019 Joey Hess <id@joeyh.name>
|
2011-05-01 18:27:40 +00:00
|
|
|
-
|
2014-05-10 14:01:27 +00:00
|
|
|
- License: BSD-2-clause
|
2011-05-01 18:27:40 +00:00
|
|
|
-}
|
|
|
|
|
2019-01-07 18:18:24 +00:00
|
|
|
module Utility.Base64 where
|
2011-05-01 18:27:40 +00:00
|
|
|
|
2015-08-12 14:36:51 +00:00
|
|
|
import Utility.FileSystemEncoding
|
|
|
|
|
2013-08-06 09:00:52 +00:00
|
|
|
import qualified "sandi" Codec.Binary.Base64 as B64
|
2013-03-06 20:29:19 +00:00
|
|
|
import Data.Maybe
|
2019-01-07 18:18:24 +00:00
|
|
|
import qualified Data.ByteString as B
|
2013-08-06 09:00:52 +00:00
|
|
|
import Data.ByteString.UTF8 (fromString, toString)
|
2015-08-12 14:36:51 +00:00
|
|
|
import Data.Char
|
2011-05-01 18:27:40 +00:00
|
|
|
|
2019-01-07 18:18:24 +00:00
|
|
|
-- | This uses the FileSystemEncoding, so it can be used on Strings
|
|
|
|
-- that represent filepaths containing arbitrarily encoded characters.
|
|
|
|
toB64 :: String -> String
|
2019-01-01 18:54:06 +00:00
|
|
|
toB64 = toString . B64.encode . encodeBS
|
2011-05-01 18:27:40 +00:00
|
|
|
|
2019-01-07 18:18:24 +00:00
|
|
|
toB64' :: B.ByteString -> B.ByteString
|
|
|
|
toB64' = B64.encode
|
|
|
|
|
2013-03-06 20:29:19 +00:00
|
|
|
fromB64Maybe :: String -> Maybe String
|
2019-01-07 18:18:24 +00:00
|
|
|
fromB64Maybe s = either (const Nothing) (Just . decodeBS)
|
2013-08-06 09:00:52 +00:00
|
|
|
	(B64.decode $ fromString s)
|
2013-03-06 20:29:19 +00:00
|
|
|
|
2019-01-07 18:18:24 +00:00
|
|
|
fromB64Maybe' :: B.ByteString -> Maybe B.ByteString
|
|
|
|
fromB64Maybe' = either (const Nothing) Just . B64.decode
|
|
|
|
|
2011-05-01 18:27:40 +00:00
|
|
|
fromB64 :: String -> String
|
2013-03-06 20:29:19 +00:00
|
|
|
fromB64 = fromMaybe bad . fromB64Maybe
|
|
|
|
  where
|
|
|
|
	bad = error "bad base64 encoded data"
|
metadata: Fix encoding problem that led to mojibake when storing metadata strings that contained both unicode characters and a space (or '!') character.
The fix is to stop using w82s, which does not properly reconstitute unicode
strings. Instead, use utf8 bytestring to get the [Word8] to base64. This
passes unicode through perfectly, including any invalid filesystem encoded
characters.
Note that toB64 / fromB64 are also used for creds and cipher
embedding. It would be unfortunate if this change broke those uses.
For cipher embedding, note that ciphers can contain arbitrary bytes (should
really be using ByteString.Char8 there). Testing indicated it's not safe to
use the new fromB64 there; I think that characters were incorrectly
combined.
For credpair embedding, the username or password could contain unicode.
Before, that unicode would fail to round-trip through the b64.
So, I guess this is not going to break any embedded creds that worked
before.
This bug may have affected some creds before, and if so,
this change will not fix old ones, but should fix new ones at least.
2015-03-04 15:16:03 +00:00
|
|
|
|
2019-01-07 18:18:24 +00:00
|
|
|
fromB64' :: B.ByteString -> B.ByteString
|
|
|
|
fromB64' = fromMaybe bad . fromB64Maybe'
|
|
|
|
  where
|
|
|
|
	bad = error "bad base64 encoded data"
|
|
|
|
|
2015-08-12 14:36:51 +00:00
|
|
|
-- Only ascii strings are tested, because an arbitrary string may contain
|
2015-08-12 14:57:48 +00:00
|
|
|
-- characters not encoded using the FileSystemEncoding, which would thus
|
2019-01-07 18:18:24 +00:00
|
|
|
-- not roundtrip, as decodeBS always generates an output encoded that way.
|
2015-03-04 15:16:03 +00:00
|
|
|
prop_b64_roundtrips :: String -> Bool
|
2015-08-12 14:36:51 +00:00
|
|
|
prop_b64_roundtrips s
|
2019-01-07 18:18:24 +00:00
|
|
|
	| all isAscii s = s == decodeBS (fromB64' (toB64' (encodeBS s)))
|
2015-08-12 14:36:51 +00:00
|
|
|
	| otherwise = True
|
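
-- Sketch only, not part of the original module: two hypothetical extra
-- properties for the ByteString variants, written under the assumption
-- that they sit in this module and reuse its existing imports.
--
-- Unlike prop_b64_roundtrips, the first one never goes through
-- encodeBS/decodeBS, so it should not depend on the FileSystemEncoding
-- and need not be limited to ascii input. The String is only a convenient
-- way to generate bytes; fromString turns it into UTF-8.
prop_b64_bytes_roundtrip :: String -> Bool
prop_b64_bytes_roundtrip s = fromB64Maybe' (toB64' b) == Just b
  where
	b = fromString s

-- Input containing characters outside the base64 alphabet (here a space
-- and '!') is expected to be rejected with Nothing rather than an error.
prop_b64_rejects_invalid :: Bool
prop_b64_rejects_invalid = isNothing (fromB64Maybe' (fromString "not base64!"))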