2015-08-06 19:02:25 +00:00
|
|
|
{- git-annex hashing backends
|
2011-03-01 20:50:53 +00:00
|
|
|
-
|
2021-02-09 19:00:51 +00:00
|
|
|
- Copyright 2011-2021 Joey Hess <id@joeyh.name>
|
2011-03-01 20:50:53 +00:00
|
|
|
-
|
2019-03-13 19:48:14 +00:00
|
|
|
- Licensed under the GNU AGPL version 3 or higher.
|
2011-03-01 20:50:53 +00:00
|
|
|
-}
|
|
|
|
|
2019-01-11 20:34:04 +00:00
|
|
|
{-# LANGUAGE OverloadedStrings #-}
|
2018-03-15 15:16:00 +00:00
|
|
|
|
2014-08-01 19:09:49 +00:00
|
|
|
module Backend.Hash (
|
|
|
|
backends,
|
|
|
|
testKeyBackend,
|
2019-08-02 17:56:55 +00:00
|
|
|
keyHash,
|
add equivilant key log for VURL keys
When downloading a VURL from the web, make sure that the equivilant key
log is populated.
Unfortunately, this does not hash the content while it's being
downloaded from the web. There is not an interface in Backend currently
for incrementally hash generation, only for incremental verification of an
existing hash. So this might add a noticiable delay, and it has to show
a "(checksum...") message. This could stand to be improved.
But, that separate hashing step only has to happen on the first download
of new content from the web. Once the hash is known, the VURL key can have
its hash verified incrementally while downloading except when the
content in the web has changed. (Doesn't happen yet because
verifyKeyContentIncrementally is not implemented yet for VURL keys.)
Note that the equivilant key log file is formatted as a presence log.
This adds a tiny bit of overhead (eg "1 ") per line over just listing the
urls. The reason I chose to use that format is it seems possible that
there will need to be a way to remove an equivilant key at some point in
the future. I don't know why that would be necessary, but it seemed wise
to allow for the possibility.
Downloads of VURL keys from other special remotes that claim urls,
like bittorrent for example, does not popilate the equivilant key log.
So for now, no checksum verification will be done for those.
Sponsored-by: Nicholas Golder-Manning on Patreon
2024-02-29 19:41:57 +00:00
|
|
|
descChecksum,
|
2014-08-01 19:09:49 +00:00
|
|
|
) where
|
2011-03-01 20:50:53 +00:00
|
|
|
|
2016-01-20 20:36:33 +00:00
|
|
|
import Annex.Common
|
2011-03-01 20:50:53 +00:00
|
|
|
import qualified Annex
|
2020-11-05 15:26:34 +00:00
|
|
|
import Backend.Utilities
|
2017-02-24 19:16:56 +00:00
|
|
|
import Types.Key
|
2011-06-02 01:56:04 +00:00
|
|
|
import Types.Backend
|
2012-06-20 20:07:14 +00:00
|
|
|
import Types.KeySource
|
Use cryptohash rather than SHA for hashing.
This is a massive win on OSX, which doesn't have a sha256sum normally.
Only use external hash commands when the file is > 1 mb,
since cryptohash is quite close to them in speed.
SHA is still used to calculate HMACs. I don't quite understand
cryptohash's API for those.
Used the following benchmark to arrive at the 1 mb number.
1 mb file:
benchmarking sha256/internal
mean: 13.86696 ms, lb 13.83010 ms, ub 13.93453 ms, ci 0.950
std dev: 249.3235 us, lb 162.0448 us, ub 458.1744 us, ci 0.950
found 5 outliers among 100 samples (5.0%)
4 (4.0%) high mild
1 (1.0%) high severe
variance introduced by outliers: 10.415%
variance is moderately inflated by outliers
benchmarking sha256/external
mean: 14.20670 ms, lb 14.17237 ms, ub 14.27004 ms, ci 0.950
std dev: 230.5448 us, lb 150.7310 us, ub 427.6068 us, ci 0.950
found 3 outliers among 100 samples (3.0%)
2 (2.0%) high mild
1 (1.0%) high severe
2 mb file:
benchmarking sha256/internal
mean: 26.44270 ms, lb 26.23701 ms, ub 26.63414 ms, ci 0.950
std dev: 1.012303 ms, lb 925.8921 us, ub 1.122267 ms, ci 0.950
variance introduced by outliers: 35.540%
variance is moderately inflated by outliers
benchmarking sha256/external
mean: 26.84521 ms, lb 26.77644 ms, ub 26.91433 ms, ci 0.950
std dev: 347.7867 us, lb 210.6283 us, ub 571.3351 us, ci 0.950
found 6 outliers among 100 samples (6.0%)
import Crypto.Hash
import Data.ByteString.Lazy as L
import Criterion.Main
import Common
testfile :: FilePath
testfile = "/run/shm/data" -- on ram disk
main = defaultMain
[ bgroup "sha256"
[ bench "internal" $ whnfIO internal
, bench "external" $ whnfIO external
]
]
sha256 :: L.ByteString -> Digest SHA256
sha256 = hashlazy
internal :: IO String
internal = show . sha256 <$> L.readFile testfile
external :: IO String
external = do
s <- readProcess "sha256sum" [testfile]
return $ fst $ separate (== ' ') s
2013-09-22 23:45:08 +00:00
|
|
|
import Utility.Hash
|
2019-06-25 15:37:52 +00:00
|
|
|
import Utility.Metered
|
2020-11-05 15:26:34 +00:00
|
|
|
import qualified Utility.RawFilePath as R
|
2012-07-04 13:08:20 +00:00
|
|
|
|
2019-01-11 20:34:04 +00:00
|
|
|
import qualified Data.ByteString as S
|
2021-10-06 00:20:08 +00:00
|
|
|
import qualified Data.ByteString.Short as S (toShort, fromShort)
|
2019-01-11 20:34:04 +00:00
|
|
|
import qualified Data.ByteString.Char8 as S8
|
2012-07-04 13:08:20 +00:00
|
|
|
import qualified Data.ByteString.Lazy as L
|
2019-06-25 17:10:06 +00:00
|
|
|
import Control.DeepSeq
|
|
|
|
import Control.Exception (evaluate)
|
2011-03-01 20:50:53 +00:00
|
|
|
|
2015-08-06 19:02:25 +00:00
|
|
|
data Hash
|
|
|
|
= MD5Hash
|
|
|
|
| SHA1Hash
|
|
|
|
| SHA2Hash HashSize
|
|
|
|
| SHA3Hash HashSize
|
|
|
|
| SkeinHash HashSize
|
2018-03-13 20:15:35 +00:00
|
|
|
| Blake2bHash HashSize
|
2019-07-05 19:29:00 +00:00
|
|
|
| Blake2bpHash HashSize
|
2018-03-13 20:15:35 +00:00
|
|
|
| Blake2sHash HashSize
|
|
|
|
| Blake2spHash HashSize
|
2011-05-16 15:46:34 +00:00
|
|
|
|
2020-07-20 16:08:37 +00:00
|
|
|
cryptographicallySecure :: Hash -> Bool
|
|
|
|
cryptographicallySecure (SHA2Hash _) = True
|
|
|
|
cryptographicallySecure (SHA3Hash _) = True
|
|
|
|
cryptographicallySecure (SkeinHash _) = True
|
|
|
|
cryptographicallySecure (Blake2bHash _) = True
|
|
|
|
cryptographicallySecure (Blake2bpHash _) = True
|
|
|
|
cryptographicallySecure (Blake2sHash _) = True
|
|
|
|
cryptographicallySecure (Blake2spHash _) = True
|
|
|
|
cryptographicallySecure SHA1Hash = False
|
|
|
|
cryptographicallySecure MD5Hash = False
|
|
|
|
|
2024-03-01 20:42:02 +00:00
|
|
|
{- Order is significant. The first hash is the default one that git-annex
|
|
|
|
- uses, and must be cryptographically secure.
|
|
|
|
-
|
|
|
|
- Also, want more common sizes earlier than uncommon sizes. -}
|
2013-10-02 00:34:06 +00:00
|
|
|
hashes :: [Hash]
|
|
|
|
hashes = concat
|
2017-02-24 19:16:56 +00:00
|
|
|
[ map (SHA2Hash . HashSize) [256, 512, 224, 384]
|
|
|
|
, map (SHA3Hash . HashSize) [256, 512, 224, 384]
|
|
|
|
, map (SkeinHash . HashSize) [256, 512]
|
2018-03-13 20:15:35 +00:00
|
|
|
, map (Blake2bHash . HashSize) [256, 512, 160, 224, 384]
|
2019-07-05 19:29:00 +00:00
|
|
|
, map (Blake2bpHash . HashSize) [512]
|
2018-03-13 20:15:35 +00:00
|
|
|
, map (Blake2sHash . HashSize) [256, 160, 224]
|
|
|
|
, map (Blake2spHash . HashSize) [256, 224]
|
2015-08-06 19:02:25 +00:00
|
|
|
, [SHA1Hash]
|
2015-02-04 17:47:54 +00:00
|
|
|
, [MD5Hash]
|
2013-10-02 00:34:06 +00:00
|
|
|
]
|
2011-05-16 15:46:34 +00:00
|
|
|
|
2013-10-02 00:34:06 +00:00
|
|
|
{- The SHA256E backend is the default, so genBackendE comes first. -}
|
2011-12-31 08:11:39 +00:00
|
|
|
backends :: [Backend]
|
2015-08-06 19:02:25 +00:00
|
|
|
backends = concatMap (\h -> [genBackendE h, genBackend h]) hashes
|
2011-03-02 17:47:45 +00:00
|
|
|
|
2014-08-01 19:09:49 +00:00
|
|
|
genBackend :: Hash -> Backend
|
|
|
|
genBackend hash = Backend
|
2017-02-24 19:16:56 +00:00
|
|
|
{ backendVariety = hashKeyVariety hash (HasExt False)
|
2020-07-20 18:06:05 +00:00
|
|
|
, genKey = Just (keyValue hash)
|
2015-10-01 17:28:49 +00:00
|
|
|
, verifyKeyContent = Just $ checkKeyChecksum hash
|
2021-02-09 19:00:51 +00:00
|
|
|
, verifyKeyContentIncrementally = Just $ checkKeyChecksumIncremental hash
|
2013-09-25 07:09:06 +00:00
|
|
|
, canUpgradeKey = Just needsUpgrade
|
2014-07-10 21:06:04 +00:00
|
|
|
, fastMigrate = Just trivialMigrate
|
2014-07-27 16:33:46 +00:00
|
|
|
, isStableKey = const True
|
2024-02-29 21:21:29 +00:00
|
|
|
, isCryptographicallySecure = cryptographicallySecure hash
|
|
|
|
, isCryptographicallySecureKey = const $ pure $
|
|
|
|
cryptographicallySecure hash
|
2012-07-04 13:08:20 +00:00
|
|
|
}
|
2011-04-08 00:08:11 +00:00
|
|
|
|
2014-08-01 19:09:49 +00:00
|
|
|
genBackendE :: Hash -> Backend
|
|
|
|
genBackendE hash = (genBackend hash)
|
2017-02-24 19:16:56 +00:00
|
|
|
{ backendVariety = hashKeyVariety hash (HasExt True)
|
2020-07-20 18:06:05 +00:00
|
|
|
, genKey = Just (keyValueE hash)
|
2014-08-01 19:09:49 +00:00
|
|
|
}
|
2011-05-16 15:46:34 +00:00
|
|
|
|
2017-02-24 19:16:56 +00:00
|
|
|
hashKeyVariety :: Hash -> HasExt -> KeyVariety
|
2019-01-16 20:33:08 +00:00
|
|
|
hashKeyVariety MD5Hash he = MD5Key he
|
|
|
|
hashKeyVariety SHA1Hash he = SHA1Key he
|
|
|
|
hashKeyVariety (SHA2Hash size) he = SHA2Key size he
|
|
|
|
hashKeyVariety (SHA3Hash size) he = SHA3Key size he
|
|
|
|
hashKeyVariety (SkeinHash size) he = SKEINKey size he
|
|
|
|
hashKeyVariety (Blake2bHash size) he = Blake2bKey size he
|
2019-07-05 19:29:00 +00:00
|
|
|
hashKeyVariety (Blake2bpHash size) he = Blake2bpKey size he
|
2019-01-16 20:33:08 +00:00
|
|
|
hashKeyVariety (Blake2sHash size) he = Blake2sKey size he
|
|
|
|
hashKeyVariety (Blake2spHash size) he = Blake2spKey size he
|
2011-05-16 15:46:34 +00:00
|
|
|
|
2013-10-02 00:34:06 +00:00
|
|
|
{- A key is a hash of its contents. -}
|
2020-05-15 16:51:09 +00:00
|
|
|
keyValue :: Hash -> KeySource -> MeterUpdate -> Annex Key
|
2019-06-25 17:10:06 +00:00
|
|
|
keyValue hash source meterupdate = do
|
2020-11-05 15:26:34 +00:00
|
|
|
let file = contentLocation source
|
2015-01-20 20:58:48 +00:00
|
|
|
filesize <- liftIO $ getFileSize file
|
2019-06-25 17:10:06 +00:00
|
|
|
s <- hashFile hash file meterupdate
|
2020-05-15 16:51:09 +00:00
|
|
|
return $ mkKey $ \k -> k
|
2021-10-06 00:20:08 +00:00
|
|
|
{ keyName = S.toShort (encodeBS s)
|
2017-02-24 19:16:56 +00:00
|
|
|
, keyVariety = hashKeyVariety hash (HasExt False)
|
2012-07-04 17:04:01 +00:00
|
|
|
, keySize = Just filesize
|
2011-05-16 15:46:34 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
{- Extension preserving keys. -}
|
2020-05-15 16:51:09 +00:00
|
|
|
keyValueE :: Hash -> KeySource -> MeterUpdate -> Annex Key
|
2019-06-25 15:37:52 +00:00
|
|
|
keyValueE hash source meterupdate =
|
2020-07-29 21:12:22 +00:00
|
|
|
keyValue hash source meterupdate
|
|
|
|
>>= addE source (const $ hashKeyVariety hash (HasExt True))
|
2011-03-01 20:50:53 +00:00
|
|
|
|
2020-11-05 15:26:34 +00:00
|
|
|
checkKeyChecksum :: Hash -> Key -> RawFilePath -> Annex Bool
|
2015-12-06 20:26:38 +00:00
|
|
|
checkKeyChecksum hash key file = catchIOErrorType HardwareFault hwfault $ do
|
2022-06-28 19:28:14 +00:00
|
|
|
fast <- Annex.getRead Annex.fast
|
2020-11-05 15:26:34 +00:00
|
|
|
exists <- liftIO $ R.doesPathExist file
|
2018-08-28 22:01:13 +00:00
|
|
|
case (exists, fast) of
|
|
|
|
(True, False) -> do
|
2023-04-10 21:03:41 +00:00
|
|
|
showAction (UnquotedString descChecksum)
|
2021-02-09 19:00:51 +00:00
|
|
|
sameCheckSum key
|
|
|
|
<$> hashFile hash file nullMeterUpdate
|
2015-12-06 20:26:38 +00:00
|
|
|
_ -> return True
|
2012-11-11 04:51:07 +00:00
|
|
|
where
|
2015-05-27 20:40:03 +00:00
|
|
|
hwfault e = do
|
filter out control characters in warning messages
Converted warning and similar to use StringContainingQuotedPath. Most
warnings are static strings, some do refer to filepaths that need to be
quoted, and others don't need quoting.
Note that, since quote filters out control characters of even
UnquotedString, this makes all warnings safe, even when an attacker
sneaks in a control character in some other way.
When json is being output, no quoting is done, since json gets its own
quoting.
This does, as a side effect, make warning messages in json output not
be indented. The indentation is only needed to offset warning messages
underneath the display of the file they apply to, so that's ok.
Sponsored-by: Brett Eisenberg on Patreon
2023-04-10 18:47:32 +00:00
|
|
|
warning $ UnquotedString $ "hardware fault: " ++ show e
|
2015-05-27 20:40:03 +00:00
|
|
|
return False
|
|
|
|
|
2021-02-09 19:00:51 +00:00
|
|
|
sameCheckSum :: Key -> String -> Bool
|
|
|
|
sameCheckSum key s
|
|
|
|
| s == expected = True
|
|
|
|
{- A bug caused checksums to be prefixed with \ in some
|
|
|
|
- cases; still accept these as legal now that the bug
|
|
|
|
- has been fixed. -}
|
|
|
|
| '\\' : s == expected = True
|
|
|
|
| otherwise = False
|
|
|
|
where
|
|
|
|
expected = decodeBS (keyHash key)
|
|
|
|
|
|
|
|
checkKeyChecksumIncremental :: Hash -> Key -> Annex IncrementalVerifier
|
2021-02-09 21:03:27 +00:00
|
|
|
checkKeyChecksumIncremental hash key = liftIO $ (snd $ hasher hash) key
|
2021-02-09 19:00:51 +00:00
|
|
|
|
2019-01-11 20:34:04 +00:00
|
|
|
keyHash :: Key -> S.ByteString
|
|
|
|
keyHash = fst . splitKeyNameExtension
|
2012-12-20 19:43:14 +00:00
|
|
|
|
2018-05-23 18:07:51 +00:00
|
|
|
{- Upgrade keys that have the \ prefix on their hash due to a bug, or
|
|
|
|
- that contain non-alphanumeric characters in their extension.
|
|
|
|
-
|
|
|
|
- Also, for a while migrate from eg SHA256E to SHA256 resulted in a SHA256
|
|
|
|
- key that contained an extension inside its keyName. Upgrade those
|
|
|
|
- keys, removing the extension.
|
|
|
|
-}
|
2012-12-20 19:43:14 +00:00
|
|
|
needsUpgrade :: Key -> Bool
|
2018-05-23 18:07:51 +00:00
|
|
|
needsUpgrade key = or
|
2019-01-11 20:34:04 +00:00
|
|
|
[ "\\" `S8.isPrefixOf` keyHash key
|
handle keys with extensions consistently in all locales
Fix some cases where handling of keys with extensions varied depending on
the locale.
A filename with a unicode extension would before generate a key with an
extension in a unicode locale, but not in LANG=C, because the extension
was not all alphanumeric. Also the the length of the extension could be
counted differently depending on the locale.
In a non-unicode locale, git-annex migrate would see that the extension
was not all alphanumeric and want to "upgrade" it. Now that doesn't happen.
As far as backwards compatability, this does mean that unicode
extensions are counted by the number of bytes, not number of characters.
So, if someone is using unicode extensions, they may find git-annex
stops using them when adding files, because their extensions are too
long. Keys already in their repo with the "too long" extensions will
still work though, so this only prevents adding the same content with
the same extension generating the same key. Documented this by
documenting that annex.maxextensionlength is a number of bytes.
Also, if a filename has an extension that is not valid utf-8 and the
locale is utf-8, the extension will be allowed now, and an old
git-annex, in the same locale would not, and would also want to
"upgrade" that.
2020-02-20 21:18:59 +00:00
|
|
|
, S.any (not . validInExtension) (snd $ splitKeyNameExtension key)
|
2021-10-06 00:20:08 +00:00
|
|
|
, not (hasExt (fromKey keyVariety key)) && keyHash key /= S.fromShort (fromKey keyName key)
|
2018-05-23 18:07:51 +00:00
|
|
|
]
|
2013-10-02 00:34:06 +00:00
|
|
|
|
2024-03-01 20:42:02 +00:00
|
|
|
trivialMigrate :: Key -> Backend -> AssociatedFile -> Bool -> Annex (Maybe Key)
|
|
|
|
trivialMigrate oldkey newbackend afile _inannex = trivialMigrate' oldkey newbackend afile
|
2018-09-24 16:07:46 +00:00
|
|
|
<$> (annexMaxExtensionLength <$> Annex.getGitConfig)
|
|
|
|
|
|
|
|
trivialMigrate' :: Key -> Backend -> AssociatedFile -> Maybe Int -> Maybe Key
|
|
|
|
trivialMigrate' oldkey newbackend afile maxextlen
|
2015-01-04 16:33:10 +00:00
|
|
|
{- Fast migration from hashE to hash backend. -}
|
2019-11-22 20:24:04 +00:00
|
|
|
| migratable && hasExt oldvariety = Just $ alterKey oldkey $ \d -> d
|
2021-10-06 00:20:08 +00:00
|
|
|
{ keyName = S.toShort (keyHash oldkey)
|
2017-02-24 19:16:56 +00:00
|
|
|
, keyVariety = newvariety
|
2014-07-10 21:06:04 +00:00
|
|
|
}
|
2015-01-04 16:33:10 +00:00
|
|
|
{- Fast migration from hash to hashE backend. -}
|
2018-05-23 18:07:51 +00:00
|
|
|
| migratable && hasExt newvariety = case afile of
|
2017-03-10 17:12:24 +00:00
|
|
|
AssociatedFile Nothing -> Nothing
|
2019-11-22 20:24:04 +00:00
|
|
|
AssociatedFile (Just file) -> Just $ alterKey oldkey $ \d -> d
|
2021-10-06 00:20:08 +00:00
|
|
|
{ keyName = S.toShort $ keyHash oldkey
|
handle keys with extensions consistently in all locales
Fix some cases where handling of keys with extensions varied depending on
the locale.
A filename with a unicode extension would before generate a key with an
extension in a unicode locale, but not in LANG=C, because the extension
was not all alphanumeric. Also the the length of the extension could be
counted differently depending on the locale.
In a non-unicode locale, git-annex migrate would see that the extension
was not all alphanumeric and want to "upgrade" it. Now that doesn't happen.
As far as backwards compatability, this does mean that unicode
extensions are counted by the number of bytes, not number of characters.
So, if someone is using unicode extensions, they may find git-annex
stops using them when adding files, because their extensions are too
long. Keys already in their repo with the "too long" extensions will
still work though, so this only prevents adding the same content with
the same extension generating the same key. Documented this by
documenting that annex.maxextensionlength is a number of bytes.
Also, if a filename has an extension that is not valid utf-8 and the
locale is utf-8, the extension will be allowed now, and an old
git-annex, in the same locale would not, and would also want to
"upgrade" that.
2020-02-20 21:18:59 +00:00
|
|
|
<> selectExtension maxextlen file
|
2017-02-24 19:16:56 +00:00
|
|
|
, keyVariety = newvariety
|
2015-01-04 16:33:10 +00:00
|
|
|
}
|
2018-05-23 18:07:51 +00:00
|
|
|
{- Upgrade to fix bad previous migration that created a
|
|
|
|
- non-extension preserving key, with an extension
|
|
|
|
- in its keyName. -}
|
|
|
|
| newvariety == oldvariety && not (hasExt oldvariety) &&
|
2021-10-06 00:20:08 +00:00
|
|
|
keyHash oldkey /= S.fromShort (fromKey keyName oldkey) =
|
2019-11-22 20:24:04 +00:00
|
|
|
Just $ alterKey oldkey $ \d -> d
|
2021-10-06 00:20:08 +00:00
|
|
|
{ keyName = S.toShort (keyHash oldkey)
|
2019-11-22 20:24:04 +00:00
|
|
|
}
|
2014-07-10 21:06:04 +00:00
|
|
|
| otherwise = Nothing
|
2017-02-24 19:16:56 +00:00
|
|
|
where
|
|
|
|
migratable = oldvariety /= newvariety
|
|
|
|
&& sameExceptExt oldvariety newvariety
|
2019-11-22 20:24:04 +00:00
|
|
|
oldvariety = fromKey keyVariety oldkey
|
2017-02-24 19:16:56 +00:00
|
|
|
newvariety = backendVariety newbackend
|
2014-07-10 21:06:04 +00:00
|
|
|
|
2020-11-05 15:26:34 +00:00
|
|
|
hashFile :: Hash -> RawFilePath -> MeterUpdate -> Annex String
|
2019-06-25 17:10:06 +00:00
|
|
|
hashFile hash file meterupdate =
|
2020-11-05 15:26:34 +00:00
|
|
|
liftIO $ withMeteredFile (fromRawFilePath file) meterupdate $ \b -> do
|
2021-02-09 19:00:51 +00:00
|
|
|
let h = (fst $ hasher hash) b
|
2019-06-25 17:10:06 +00:00
|
|
|
-- Force full evaluation of hash so whole file is read
|
|
|
|
-- before returning.
|
|
|
|
evaluate (rnf h)
|
|
|
|
return h
|
2021-02-09 19:00:51 +00:00
|
|
|
|
|
|
|
type Hasher = (L.ByteString -> String, Key -> IO IncrementalVerifier)
|
|
|
|
|
|
|
|
hasher :: Hash -> Hasher
|
|
|
|
hasher MD5Hash = md5Hasher
|
|
|
|
hasher SHA1Hash = sha1Hasher
|
|
|
|
hasher (SHA2Hash hashsize) = sha2Hasher hashsize
|
|
|
|
hasher (SHA3Hash hashsize) = sha3Hasher hashsize
|
|
|
|
hasher (SkeinHash hashsize) = skeinHasher hashsize
|
|
|
|
hasher (Blake2bHash hashsize) = blake2bHasher hashsize
|
|
|
|
hasher (Blake2bpHash hashsize) = blake2bpHasher hashsize
|
|
|
|
hasher (Blake2sHash hashsize) = blake2sHasher hashsize
|
|
|
|
hasher (Blake2spHash hashsize) = blake2spHasher hashsize
|
|
|
|
|
|
|
|
mkHasher :: HashAlgorithm h => (L.ByteString -> Digest h) -> Context h -> Hasher
|
2021-08-18 17:19:02 +00:00
|
|
|
mkHasher h c = (show . h, mkIncrementalVerifier c descChecksum . sameCheckSum)
|
2021-02-09 19:00:51 +00:00
|
|
|
|
|
|
|
sha2Hasher :: HashSize -> Hasher
|
2018-08-28 22:01:13 +00:00
|
|
|
sha2Hasher (HashSize hashsize)
|
2021-02-09 19:00:51 +00:00
|
|
|
| hashsize == 256 = mkHasher sha2_256 sha2_256_context
|
|
|
|
| hashsize == 224 = mkHasher sha2_224 sha2_224_context
|
|
|
|
| hashsize == 384 = mkHasher sha2_384 sha2_384_context
|
|
|
|
| hashsize == 512 = mkHasher sha2_512 sha2_512_context
|
2023-04-10 17:38:14 +00:00
|
|
|
| otherwise = giveup $ "unsupported SHA2 size " ++ show hashsize
|
2013-10-02 01:10:56 +00:00
|
|
|
|
2021-02-09 19:00:51 +00:00
|
|
|
sha3Hasher :: HashSize -> Hasher
|
2017-02-24 19:16:56 +00:00
|
|
|
sha3Hasher (HashSize hashsize)
|
2021-02-09 19:00:51 +00:00
|
|
|
| hashsize == 256 = mkHasher sha3_256 sha3_256_context
|
|
|
|
| hashsize == 224 = mkHasher sha3_224 sha3_224_context
|
|
|
|
| hashsize == 384 = mkHasher sha3_384 sha3_384_context
|
|
|
|
| hashsize == 512 = mkHasher sha3_512 sha3_512_context
|
2023-04-10 17:38:14 +00:00
|
|
|
| otherwise = giveup $ "unsupported SHA3 size " ++ show hashsize
|
2015-08-06 19:02:25 +00:00
|
|
|
|
2021-02-09 19:00:51 +00:00
|
|
|
skeinHasher :: HashSize -> Hasher
|
2017-02-24 19:16:56 +00:00
|
|
|
skeinHasher (HashSize hashsize)
|
2021-02-09 19:00:51 +00:00
|
|
|
| hashsize == 256 = mkHasher skein256 skein256_context
|
|
|
|
| hashsize == 512 = mkHasher skein512 skein512_context
|
2023-04-10 17:38:14 +00:00
|
|
|
| otherwise = giveup $ "unsupported SKEIN size " ++ show hashsize
|
2014-08-01 19:09:49 +00:00
|
|
|
|
2021-02-09 19:00:51 +00:00
|
|
|
blake2bHasher :: HashSize -> Hasher
|
2018-03-13 20:15:35 +00:00
|
|
|
blake2bHasher (HashSize hashsize)
|
2021-02-09 19:00:51 +00:00
|
|
|
| hashsize == 256 = mkHasher blake2b_256 blake2b_256_context
|
|
|
|
| hashsize == 512 = mkHasher blake2b_512 blake2b_512_context
|
|
|
|
| hashsize == 160 = mkHasher blake2b_160 blake2b_160_context
|
|
|
|
| hashsize == 224 = mkHasher blake2b_224 blake2b_224_context
|
|
|
|
| hashsize == 384 = mkHasher blake2b_384 blake2b_384_context
|
2023-04-10 17:38:14 +00:00
|
|
|
| otherwise = giveup $ "unsupported BLAKE2B size " ++ show hashsize
|
2018-03-13 20:15:35 +00:00
|
|
|
|
2021-02-09 19:00:51 +00:00
|
|
|
blake2bpHasher :: HashSize -> Hasher
|
2019-07-05 19:29:00 +00:00
|
|
|
blake2bpHasher (HashSize hashsize)
|
2021-02-09 19:00:51 +00:00
|
|
|
| hashsize == 512 = mkHasher blake2bp_512 blake2bp_512_context
|
2023-04-10 17:38:14 +00:00
|
|
|
| otherwise = giveup $ "unsupported BLAKE2BP size " ++ show hashsize
|
2019-07-05 19:29:00 +00:00
|
|
|
|
2021-02-09 19:00:51 +00:00
|
|
|
blake2sHasher :: HashSize -> Hasher
|
2018-03-13 20:15:35 +00:00
|
|
|
blake2sHasher (HashSize hashsize)
|
2021-02-09 19:00:51 +00:00
|
|
|
| hashsize == 256 = mkHasher blake2s_256 blake2s_256_context
|
|
|
|
| hashsize == 160 = mkHasher blake2s_160 blake2s_160_context
|
|
|
|
| hashsize == 224 = mkHasher blake2s_224 blake2s_224_context
|
2023-04-10 17:38:14 +00:00
|
|
|
| otherwise = giveup $ "unsupported BLAKE2S size " ++ show hashsize
|
2018-03-13 20:15:35 +00:00
|
|
|
|
2021-02-09 19:00:51 +00:00
|
|
|
blake2spHasher :: HashSize -> Hasher
|
2018-03-13 20:15:35 +00:00
|
|
|
blake2spHasher (HashSize hashsize)
|
2021-02-09 19:00:51 +00:00
|
|
|
| hashsize == 256 = mkHasher blake2sp_256 blake2sp_256_context
|
|
|
|
| hashsize == 224 = mkHasher blake2sp_224 blake2sp_224_context
|
2023-04-10 17:38:14 +00:00
|
|
|
| otherwise = giveup $ "unsupported BLAKE2SP size " ++ show hashsize
|
2018-03-13 20:15:35 +00:00
|
|
|
|
2021-02-09 19:00:51 +00:00
|
|
|
sha1Hasher :: Hasher
|
|
|
|
sha1Hasher = mkHasher sha1 sha1_context
|
|
|
|
|
|
|
|
md5Hasher :: Hasher
|
|
|
|
md5Hasher = mkHasher md5 md5_context
|
2018-08-28 22:01:13 +00:00
|
|
|
|
incremental hashing for fileRetriever
It uses tailVerify to hash the file while it's being written.
This is able to sometimes avoid a separate checksum step. Although
if the file gets written quickly enough, tailVerify may not see it
get created before the write finishes, and the checksum still happens.
Testing with the directory special remote, incremental checksumming did
not happen. But then I disabled the copy CoW probing, and it did work.
What's going on with that is the CoW probe creates an empty file on
failure, then deletes it, and then the file is created again. tailVerify
will open the first, empty file, and so fails to read the content that
gets written to the file that replaces it.
The directory special remote really ought to be able to avoid needing to
use tailVerify, and while other special remotes could do things that
cause similar problems, they probably don't. And if they do, it just
means the checksum doesn't get done incrementally.
Sponsored-by: Dartmouth College's DANDI project
2021-08-13 19:43:29 +00:00
|
|
|
descChecksum :: String
|
|
|
|
descChecksum = "checksum"
|
|
|
|
|
2023-03-14 02:39:16 +00:00
|
|
|
{- A variant of the SHA256E backend, for testing that needs special keys
|
2014-08-01 19:09:49 +00:00
|
|
|
- that cannot collide with legitimate keys in the repository.
|
|
|
|
-
|
|
|
|
- This is accomplished by appending a special extension to the key,
|
|
|
|
- that is not one that selectExtension would select (due to being too
|
|
|
|
- long).
|
|
|
|
-}
|
|
|
|
testKeyBackend :: Backend
|
|
|
|
testKeyBackend =
|
2017-02-24 19:16:56 +00:00
|
|
|
let b = genBackendE (SHA2Hash (HashSize 256))
|
2020-07-20 18:06:05 +00:00
|
|
|
gk = case genKey b of
|
2020-05-15 16:51:09 +00:00
|
|
|
Nothing -> Nothing
|
2020-07-29 21:12:22 +00:00
|
|
|
Just f -> Just (\ks p -> addTestE <$> f ks p)
|
2020-07-20 18:06:05 +00:00
|
|
|
in b { genKey = gk }
|
2014-08-01 19:09:49 +00:00
|
|
|
where
|
2020-07-29 21:12:22 +00:00
|
|
|
addTestE k = alterKey k $ \d -> d
|
2019-11-22 20:24:04 +00:00
|
|
|
{ keyName = keyName d <> longext
|
|
|
|
}
|
2014-08-01 19:09:49 +00:00
|
|
|
longext = ".this-is-a-test-key"
|