blog for the day

(may be updated later)
This commit is contained in:
Joey Hess 2012-07-04 13:22:32 -04:00
parent 40729e7fa2
commit 1f3f221b80

View file

@ -0,0 +1,86 @@
In a series of airport layovers all day. Since I woke up at 3:45 am,
didn't feel up to doing serious new work, so instead I worked through some
OSX support backlog.
git-annex will now use Haskell's SHA library if the `sha256sum`
command is not available. That library is slow, but it's guaranteed to be
available; git-annex already depended on it to calculate HMACs.
Then I decided to see if it makes sense to use the SHA library
when adding smaller files. At some point, its slower implementation should
win over needing to fork and parse the output of `sha256sum`. This was
the first time I tried out Haskell's
[Criterion](http://hackage.haskell.org/package/criterion) benchmarker,
and I built this simple benchmark in short order.
[[!format haskell """
import Data.Digest.Pure.SHA
import Data.ByteString.Lazy as L
import Criterion.Main
import Common
testfile :: FilePath
testfile = "/tmp/bar" -- on ram disk
main = defaultMain
[ bgroup "sha256"
[ bench "internal" $ whnfIO internal
, bench "external" $ whnfIO external
]
]
internal :: IO String
internal = showDigest . sha256 <$> L.readFile testfile
external :: IO String
external = pOpen ReadFromPipe "sha256sum" [testfile] $ \h ->
fst . separate (== ' ') <$> hGetLine h
"""]]
The nice thing about benchmarking in Airports is when you're running a
benchmark locally, you don't want to do anything else with the computer,
so can alternate people watching, spacing out, and analizing results.
100 kb file:
benchmarking sha256/internal
mean: 15.64729 ms, lb 15.29590 ms, ub 16.10119 ms, ci 0.950
std dev: 2.032476 ms, lb 1.638016 ms, ub 2.527089 ms, ci 0.950
benchmarking sha256/external
mean: 8.217700 ms, lb 7.931324 ms, ub 8.568805 ms, ci 0.950
std dev: 1.614786 ms, lb 1.357791 ms, ub 2.009682 ms, ci 0.950
75 kb file:
benchmarking sha256/internal
mean: 12.16099 ms, lb 11.89566 ms, ub 12.50317 ms, ci 0.950
std dev: 1.531108 ms, lb 1.232353 ms, ub 1.929141 ms, ci 0.950
benchmarking sha256/external
mean: 8.818731 ms, lb 8.425744 ms, ub 9.269550 ms, ci 0.950
std dev: 2.158530 ms, lb 1.916067 ms, ub 2.487242 ms, ci 0.950
50 kb file:
benchmarking sha256/internal
mean: 7.699274 ms, lb 7.560254 ms, ub 7.876605 ms, ci 0.950
std dev: 801.5292 us, lb 655.3344 us, ub 990.4117 us, ci 0.950
benchmarking sha256/external
mean: 8.715779 ms, lb 8.330540 ms, ub 9.102232 ms, ci 0.950
std dev: 1.988089 ms, lb 1.821582 ms, ub 2.181676 ms, ci 0.950
10 kb file:
benchmarking sha256/internal
mean: 1.586105 ms, lb 1.574512 ms, ub 1.604922 ms, ci 0.950
std dev: 74.07235 us, lb 51.71688 us, ub 108.1348 us, ci 0.950
benchmarking sha256/external
mean: 6.873742 ms, lb 6.582765 ms, ub 7.252911 ms, ci 0.950
std dev: 1.689662 ms, lb 1.346310 ms, ub 2.640399 ms, ci 0.950
It's possible to get nice graphical reports out of Criterion, but
this is clear enough, so I stopped here. 50 kb seems a reasonable
cutoff point.