git-annex/Utility
Joey Hess 7390f08ef9 Use cryptohash rather than SHA for hashing.
This is a massive win on OSX, which doesn't have a sha256sum normally.

Only use external hash commands when the file is > 1 MB,
since cryptohash is quite close to them in speed.
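
A minimal sketch of that size cutoff, not the actual git-annex code. The names HashBackend, externalThreshold, chooseBackend and backendFor are hypothetical, and "1 MB" is assumed here to mean 1048576 bytes. Only base is used, so the hashing itself is left out.

```haskell
import System.IO (IOMode (ReadMode), hFileSize, withFile)

-- Which implementation would hash the file (hypothetical type).
data HashBackend = Internal | External
        deriving (Eq, Show)

-- Cutoff from the commit message; assumed to be 1 MB = 1048576 bytes.
externalThreshold :: Integer
externalThreshold = 1024 * 1024

-- Use the external command only for files strictly larger than the cutoff.
chooseBackend :: Integer -> HashBackend
chooseBackend size
        | size > externalThreshold = External
        | otherwise = Internal

-- Look up the size of a file on disk and pick a backend for it.
backendFor :: FilePath -> IO HashBackend
backendFor f = fmap chooseBackend (withFile f ReadMode hFileSize)
```

Keeping chooseBackend pure makes the cutoff easy to check without touching the disk; backendFor only adds the size lookup.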

SHA is still used to calculate HMACs. I don't quite understand
cryptohash's API for those.

Used the following benchmark to arrive at the 1 MB number.

1 MB file:

benchmarking sha256/internal
mean: 13.86696 ms, lb 13.83010 ms, ub 13.93453 ms, ci 0.950
std dev: 249.3235 us, lb 162.0448 us, ub 458.1744 us, ci 0.950
found 5 outliers among 100 samples (5.0%)
  4 (4.0%) high mild
  1 (1.0%) high severe
variance introduced by outliers: 10.415%
variance is moderately inflated by outliers

benchmarking sha256/external
mean: 14.20670 ms, lb 14.17237 ms, ub 14.27004 ms, ci 0.950
std dev: 230.5448 us, lb 150.7310 us, ub 427.6068 us, ci 0.950
found 3 outliers among 100 samples (3.0%)
  2 (2.0%) high mild
  1 (1.0%) high severe

2 MB file:

benchmarking sha256/internal
mean: 26.44270 ms, lb 26.23701 ms, ub 26.63414 ms, ci 0.950
std dev: 1.012303 ms, lb 925.8921 us, ub 1.122267 ms, ci 0.950
variance introduced by outliers: 35.540%
variance is moderately inflated by outliers

benchmarking sha256/external
mean: 26.84521 ms, lb 26.77644 ms, ub 26.91433 ms, ci 0.950
std dev: 347.7867 us, lb 210.6283 us, ub 571.3351 us, ci 0.950
found 6 outliers among 100 samples (6.0%)

import Crypto.Hash
import qualified Data.ByteString.Lazy as L
import Criterion.Main
import Common -- git-annex's Common module; provides readProcess and separate

testfile :: FilePath
testfile = "/run/shm/data" -- on ram disk

main :: IO ()
main = defaultMain
        [ bgroup "sha256"
                [ bench "internal" $ whnfIO internal
                , bench "external" $ whnfIO external
                ]
        ]

sha256 :: L.ByteString -> Digest SHA256
sha256 = hashlazy

internal :: IO String
internal = show . sha256 <$> L.readFile testfile

external :: IO String
external = do
        s <- readProcess "sha256sum" [testfile]
        -- sha256sum prints "<digest>  <file>"; keep only the digest
        return $ fst $ separate (== ' ') s
2013-09-22 20:06:02 -04:00
DirWatcher let's put type modules under the parent module, not in a Types directory 2013-03-10 22:24:13 -04:00
Applicative.hs pointlessness 2012-06-29 10:00:05 -04:00
Base64.hs tag xmpp pushes with jid 2013-03-06 16:29:19 -04:00
Batch.hs simpler ifdef for linux 2013-06-21 13:09:09 -04:00
CoProcess.hs get rid of __WINDOWS__, use mingw32_HOST_OS 2013-08-02 12:27:32 -04:00
CopyFile.hs get rid of __WINDOWS__, use mingw32_HOST_OS 2013-08-02 12:27:32 -04:00
Daemon.hs avoid more build warnings on Windows 2013-08-04 14:05:36 -04:00
DataUnits.hs refactor and unify code 2013-07-19 19:39:14 -04:00
DBus.hs finished where indentation changes 2012-12-13 00:24:19 -04:00
Directory.hs better nukefile 2013-05-21 13:03:46 -04:00
DirWatcher.hs assistant: Fix OSX bug that prevented committing changed files to a repository when in indirect mode. 2013-03-17 17:01:43 -04:00
DiskFree.hs tweak 2013-03-13 14:54:52 -04:00
Dot.hs finished where indentation changes 2012-12-13 00:24:19 -04:00
Env.hs fix the day's windows permissions damage 2013-05-12 19:09:48 -04:00
Exception.hs avoid warnings when built with ghc 7.6 2013-06-02 15:01:58 -04:00
ExternalSHA.hs Use cryptohash rather than SHA for hashing. 2013-09-22 20:06:02 -04:00
FileMode.hs get rid of __WINDOWS__, use mingw32_HOST_OS 2013-08-02 12:27:32 -04:00
FileSystemEncoding.hs Fix a few bugs involving filenames that are at or near the filesystem's maximum filename length limit. 2013-07-30 19:18:29 -04:00
Format.hs gpg secret keys list parsing 2013-09-16 12:57:39 -04:00
FreeDesktop.hs linux standalone auto-install icons 2013-07-09 20:50:41 -04:00
FSEvents.hs let's put type modules under the parent module, not in a Types directory 2013-03-10 22:24:13 -04:00
Gpg.hs webapp gpg key generation 2013-09-17 15:36:15 -04:00
Hash.hs Use cryptohash rather than SHA for hashing. 2013-09-22 20:06:02 -04:00
HumanNumber.hs refactor and unify code 2013-07-19 19:39:14 -04:00
HumanTime.hs finished where indentation changes 2012-12-13 00:24:19 -04:00
InodeCache.hs fix permission damage (thanks, Windows) 2013-05-11 23:54:25 -04:00
INotify.hs catch does not exist error when adding a watch 2013-07-17 15:32:24 -04:00
JSONStream.hs whitespace fixes 2012-12-13 00:45:27 -04:00
Kqueue.hs let's put type modules under the parent module, not in a Types directory 2013-03-10 22:24:13 -04:00
libdiskfree.c Makefile now builds using cabal, taking advantage of cabal's automatic detection of appropriate build flags. 2013-02-27 02:39:22 -04:00
libdiskfree.h
libkqueue.c include sys/types.h 2013-04-24 10:39:52 -04:00
libkqueue.h fix prototype 2012-06-19 01:57:19 -04:00
libmounts.c cleanup 2012-07-19 21:20:38 -04:00
libmounts.h Got removable media mount detection working on Android. 2013-05-04 16:19:25 -04:00
LogFile.hs avoid more build warnings on Windows 2013-08-04 14:05:36 -04:00
Lsof.hs clean up from windows porting 2013-05-11 18:23:41 -04:00
Matcher.hs fix handling of Not in the matcher 2013-05-25 13:50:27 -04:00
Metered.hs webapp: Progress bar fixes for many types of special remotes. 2013-03-28 17:04:37 -04:00
Misc.hs refactor git-annex branch log filename code into central location 2013-08-29 19:13:00 -04:00
Monad.hs Speed up the 'unused' command. 2013-08-25 21:02:13 -04:00
Mounts.hsc avoid warnings when built with ghc 7.6 2013-06-02 15:01:58 -04:00
Network.hs finished where indentation changes 2012-12-13 00:24:19 -04:00
NotificationBroadcaster.hs webapp: Fix a race that sometimes caused alerts or other notifications to be missed if they occurred while a page was loading. 2013-03-27 14:56:20 -04:00
OSX.hs squelch warning 2012-11-26 16:29:05 -04:00
Parallel.hs finished where indentation changes 2012-12-13 00:24:19 -04:00
PartialPrelude.hs
Path.hs Youtube support! (And 53 other video hosts) 2013-08-22 18:50:43 -04:00
Percentage.hs cleanup 2013-07-20 20:56:04 -04:00
Process.hs avoid more build warnings on Windows 2013-08-04 14:05:36 -04:00
QuickCheck.hs Stop depending on testpack. 2013-02-27 23:23:41 -04:00
Quvi.hs better error message 2013-08-22 21:12:41 -04:00
Rsync.hs deal with Cygwin rsync paths issue 2013-05-14 13:24:15 -04:00
SafeCommand.hs fix syntax 2013-08-02 12:42:14 -04:00
Shell.hs fix use of wrong shebang when android is installing git-annex-shell wrapper on server 2013-05-06 15:58:13 -04:00
SRV.hs fix build with haskell DNS 1.0.0 2013-09-17 11:54:09 -04:00
Tense.hs finished where indentation changes 2012-12-13 00:24:19 -04:00
ThreadLock.hs
ThreadScheduler.hs get rid of __WINDOWS__, use mingw32_HOST_OS 2013-08-02 12:27:32 -04:00
TList.hs add two long-running XMPP push threads, no more inversion of control 2013-05-22 15:13:31 -04:00
Tmp.hs Fix a few bugs involving filenames that are at or near the filesystem's maximum filename length limit. 2013-07-30 19:18:29 -04:00
Touch.hsc finished where indentation changes 2012-12-13 00:24:19 -04:00
Url.hs Set --clobber when running wget to ensure resuming works properly. 2013-08-21 18:19:01 -04:00
UserInfo.hs get rid of __WINDOWS__, use mingw32_HOST_OS 2013-08-02 12:27:32 -04:00
Verifiable.hs finished where indentation changes 2012-12-13 00:24:19 -04:00
WebApp.hs Use cryptohash rather than SHA for hashing. 2013-09-22 20:06:02 -04:00
Yesod.hs fix a build failure on android 2013-06-27 15:25:28 -04:00