This is not as efficient as using ByteStrings throughout, but converting
the String to a ByteString and parsing that is still significantly faster
than the old parser (roughly 10x in the benchmark below):

benchmarking parse/old
time                 9.657 μs   (9.600 μs .. 9.732 μs)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 9.703 μs   (9.645 μs .. 9.785 μs)
std dev              231.6 ns   (161.5 ns .. 323.7 ns)
variance introduced by outliers: 25% (moderately inflated)

benchmarking parse/new
time                 834.6 ns   (797.1 ns .. 886.9 ns)
                     0.987 R²   (0.976 R² .. 0.999 R²)
mean                 816.4 ns   (802.7 ns .. 845.1 ns)
std dev              62.39 ns   (37.66 ns .. 108.4 ns)
variance introduced by outliers: 82% (severely inflated)

There is a small behavior change: the old parsePOSIXTime accepted any
amount of trailing whitespace after the timestamp. That behavior was not
documented, and nothing seems to have relied on it.
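
To illustrate the behavior change, here is a minimal sketch of a strict
timestamp parser over ByteString. It is not git-annex's parsePOSIXTime;
the name parseTimestampSketch and the use of readInteger from
Data.ByteString.Char8 are assumptions made for the example. The point is
only that the whole input must be consumed, so trailing whitespace is
rejected.

```haskell
import Data.Char (isDigit)
import qualified Data.ByteString.Char8 as B8
import Data.Time.Clock.POSIX (POSIXTime)

-- Hypothetical helper, not the parser from this commit.
-- Accepts "123" or "123.456"; anything left over makes the parse fail.
parseTimestampSketch :: B8.ByteString -> Maybe POSIXTime
parseTimestampSketch b = do
    (whole, rest) <- B8.readInteger b
    case B8.uncons rest of
        -- Nothing left after the integer part: a whole number of seconds.
        Nothing -> Just (fromInteger whole)
        -- A '.' followed by one or more digits: the fractional part.
        Just ('.', digits)
            | not (B8.null digits) && B8.all isDigit digits -> do
                (n, _) <- B8.readInteger digits
                let frac = fromInteger n / 10 ^ B8.length digits
                    -- Keep the sign even when the integer part is "-0".
                    sign = if B8.isPrefixOf (B8.pack "-") b then negate else id
                Just (fromInteger whole + sign frac)
        -- Trailing whitespace or any other leftover input is rejected.
        _ -> Nothing
```

With this sketch, "1234567890" and "1234567890.5" parse, while
"1234567890 " (with a trailing space) yields Nothing.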
This should make == comparison of UUIDs somewhat faster, and perhaps speed
up a few other operations, such as those on maps of UUIDs.
FromUUID/ToUUID are used to convert to and from String, which is still used
for all IO of UUIDs. Eventually the hope is that those instances can be
removed, and that all git-annex branch log files etc. use ByteString
throughout, for a real speed improvement.
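
A rough sketch of the idea, under stated assumptions: UUIDSketch,
uuidFromString and uuidToString are made-up names, plain functions stand
in for the FromUUID/ToUUID classes, and Data.ByteString.Char8 is used for
the conversion, which is only correct for ASCII input (the real code uses
the filesystem encoding instead).

```haskell
import qualified Data.ByteString.Char8 as B8

-- Hypothetical stand-in for the UUID type: the payload is a ByteString,
-- so the derived Eq and Ord compare packed bytes rather than walking
-- a linked list of Char.
newtype UUIDSketch = UUIDSketch B8.ByteString
    deriving (Eq, Ord, Show)

-- Conversion from String at the IO boundary (ASCII only in this sketch).
uuidFromString :: String -> UUIDSketch
uuidFromString = UUIDSketch . B8.pack

-- Conversion back to String for code that still does IO with String.
uuidToString :: UUIDSketch -> String
uuidToString (UUIDSketch b) = B8.unpack b
```

Keeping the conversions in one place means callers that still use String
only pay the packing cost at the edges.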
Note the use of fromRawFilePath / toRawFilePath -- while a UUID usually
contains only alphanumerics and so could be treated as ASCII, it's
conceivable that some git-annex repository has been initialized using
a UUID that is not only not a canonical UUID, but contains high Unicode
or invalid Unicode. Using the filesystem encoding avoids any problems
with such a thing. However, a NUL in a UUID seems extremely unlikely,
so I avoided encodeBS / decodeBS and their extra overhead in handling
NULs.
The Read/Show instance for UUID luckily serializes the same way for
ByteString as it did for String.
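
This can be checked in isolation. The sketch below uses made-up helper
names (sameShow, readBack) and only covers ASCII text; it relies on the
Show and Read instances of ByteString rendering and parsing a string
literal, the same as String does.

```haskell
import qualified Data.ByteString.Char8 as B8

-- Hypothetical check: does an ASCII String show the same way once packed
-- into a ByteString?
sameShow :: String -> Bool
sameShow s = show (B8.pack s) == show s

-- Hypothetical check: text written by show on a String can be read back
-- as a ByteString.
readBack :: String -> B8.ByteString
readBack = read . show
```

A derived Show or Read instance for a type wrapping either field type
therefore produces and accepts the same text.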
It used to display the "bad feed content" message indicating there were no
enclosures found, which was misleading when the http request for the feed
failed.
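
One way to picture the fix is to keep the two failure cases apart so each
gets its own message. This is a sketch only: FeedOutcome and feedWarning
are hypothetical names, the message wording is illustrative, and the
actual code in this commit is structured differently.

```haskell
-- Hypothetical outcome type, separating a failed request from an
-- empty feed.
data FeedOutcome
    = DownloadFailed String   -- the http request for the feed failed
    | NoEnclosures            -- feed was fetched, but had no enclosures
    | Enclosures [String]     -- enclosure URLs found in the feed
    deriving (Eq, Show)

-- Pick a warning that matches what actually went wrong.
-- The message strings here are illustrative only.
feedWarning :: FeedOutcome -> Maybe String
feedWarning (DownloadFailed err) = Just ("failed to download feed: " ++ err)
feedWarning NoEnclosures = Just "bad feed content; no enclosures found"
feedWarning (Enclosures _) = Nothing
```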
This commit was sponsored by Ewen McNeill on Patreon.