git-annex/Git/Filename.hs
Joey Hess 067aabdd48
wip RawFilePath 2x git-annex find speedup
Finally builds (oh the agony of making it build), but still very
unmergeable, only Command.Find is included and lots of stuff is badly
hacked to make it compile.

Benchmarking vs master, this git-annex find is significantly faster!
Specifically:

	num files	old	new	speedup
	48500		4.77	3.73	28%
	12500		1.36	1.02	66%
	20		0.075	0.074	0% (so startup time is unchanged)

That's without really finishing the optimization. Things still to do:

* Eliminate all the fromRawFilePath, toRawFilePath, encodeBS,
  decodeBS conversions.
* Use versions of IO actions like getFileStatus that take a RawFilePath.
* Eliminate some Data.ByteString.Lazy.toStrict, which is a slow copy.
* Use ByteString for parsing git config to speed up startup.

It's likely several of those will speed up git-annex find further.
And other commands will certainly benefit even more.
2019-11-26 16:01:58 -04:00


{- Some git commands output encoded filenames, in a rather annoyingly complex
 - C-style encoding.
 -
 - Copyright 2010, 2011 Joey Hess <id@joeyh.name>
 -
 - Licensed under the GNU AGPL version 3 or higher.
 -}

module Git.Filename where

import Common
import Utility.Format (decode_c, encode_c)

import Data.Char
import Data.Word
import qualified Data.ByteString as S

-- encoded filenames will be inside double quotes
decode :: S.ByteString -> RawFilePath
decode b = case S.uncons b of
    Nothing -> b
    Just (h, t)
        | h /= q -> b
        | otherwise -> case S.unsnoc t of
            Nothing -> b
            Just (i, l)
                | l /= q -> b
                | otherwise ->
                    encodeBS $ decode_c $ decodeBS i
  where
    q :: Word8
    q = fromIntegral (ord '"')

{- Should not need to use this, except for testing decode. -}
encode :: RawFilePath -> S.ByteString
encode s = encodeBS $ "\"" ++ encode_c (decodeBS s) ++ "\""

{- For quickcheck. -}
prop_encode_decode_roundtrip :: FilePath -> Bool
prop_encode_decode_roundtrip s = s == fromRawFilePath (decode (encode (toRawFilePath s)))
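
The quote check at the heart of decode can be tried in isolation. The following is a minimal, standalone sketch and not part of git-annex: the name unquote and the examples list are hypothetical, and it deliberately leaves out the C-style unescaping that decode_c performs. It only shows how S.uncons and S.unsnoc peel off the first and last byte, so that unquoted input can be passed through without any String conversion.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Char8 as S8
import Data.Word (Word8)

-- Hypothetical helper (an assumption for illustration, not git-annex code):
-- returns the bytes between a leading and a trailing double quote, or
-- Nothing when the input is not quoted on both ends.
unquote :: S.ByteString -> Maybe S.ByteString
unquote b = do
    (h, t) <- S.uncons b   -- first byte and the rest
    (i, l) <- S.unsnoc t   -- everything but the last byte, and the last byte
    if h == q && l == q then Just i else Nothing
  where
    q :: Word8
    q = 34 -- ASCII code of '"'

-- A few inputs paired with what unquote makes of them.
examples :: [(S.ByteString, Maybe S.ByteString)]
examples =
    [ (s, unquote s)
    | s <- map S8.pack ["\"a b\"", "plain", "\"", "\"\""]
    ]
```

Only input that both starts and ends with a double quote yields a Just. A lone quote character yields Nothing, because nothing is left for S.unsnoc once the first byte has been removed, which mirrors the inner Nothing branch of decode.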