use SHA256 by default
To get old behavior, add a .gitattributes containing: * annex.backend=WORM

I feel that SHA256 is a better default for most people, as long as their
systems are fast enough that checksumming their files isn't a problem.
git-annex should default to preserving the integrity of data as well as git
does. Checksum backends also work better with editing files via
unlock/lock.

I considered just using SHA1, but since that hash is believed to be somewhat
near to being broken, and git-annex deals with large files which would be a
perfect exploit medium, I decided to go to a SHA-2 hash.

SHA512 is annoyingly long when displayed, and git-annex displays it in a few
places (and notably it is shown in ls -l), so I picked the shorter hash.
Considered SHA224 as it's even shorter, but feel it's a bit weird.

I expect git-annex will use SHA-3 at some point in the future, but
probably not soon!

Note that systems without a sha256sum (or sha256) program will fall back to
defaulting to SHA1.
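
The fallback described in the commit message's closing note could look roughly like the sketch below. It is illustrative only and is not taken from the git-annex source: the function name `default_backend` and the use of a PATH lookup are assumptions; the commit message only says that systems without a `sha256sum` (or `sha256`) program default to SHA1.

```python
import shutil


def default_backend(which=shutil.which):
    """Return "SHA256" if a SHA-256 command-line tool is found, else "SHA1".

    Hypothetical helper: `which` is injectable so the lookup can be tested
    without depending on what is installed on the current machine.
    """
    # The two program names come from the commit message.
    for program in ("sha256sum", "sha256"):
        if which(program) is not None:
            return "SHA256"
    # Neither program was found, so fall back to SHA1.
    return "SHA1"
```

Calling `default_backend()` with no arguments checks the real PATH of the machine it runs on.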
parent 1089e85d48
commit ef3457196a
8 changed files with 37 additions and 30 deletions
@@ -5,17 +5,19 @@ to retrieve the file's content (its value).
 Multiple pluggable key-value backends are supported, and a single repository
 can use different ones for different files.
 
-* `WORM` ("Write Once, Read Many") This assumes that any file with
-the same basename, size, and modification time has the same content.
-This is the default, and the least expensive backend.
-* `SHA1` -- This uses a key based on a sha1 checksum. This allows
+* `SHA256` -- The default backend for new files. This allows
 verifying that the file content is right, and can avoid duplicates of
 files with the same content. Its need to generate checksums
-can make it slower for large files.
-* `SHA512`, `SHA384`, `SHA256`, `SHA224` -- Like SHA1, but larger
-checksums. Mostly useful for the very paranoid, or anyone who is
-researching checksum collisions and wants to annex their colliding data. ;)
-* `SHA1E`, `SHA512E`, etc -- Variants that preserve filename extension as
+can make it slower for large files.
+* `WORM` ("Write Once, Read Many") This assumes that any file with
+the same basename, size, and modification time has the same content.
+This is the the least expensive backend, recommended for really large
+files or slow systems.
+* `SHA512` -- Best currently available hash, for the very paranoid.
+* `SHA1` -- Smaller hash than `SHA256` for those who want a checksum
+but are not concerned about security.
+* `SHA384`, `SHA224` -- Hashes for people who like unusual sizes.
+* `SHA256E`, `SHA1E`, etc -- Variants that preserve filename extension as
 part of the key. Useful for archival tasks where the filename extension
 contains metadata that should be preserved.
 
@@ -27,9 +29,11 @@ For finer control of what backend is used when adding different types of
 files, the `.gitattributes` file can be used. The `annex.backend`
 attribute can be set to the name of the backend to use for matching files.
 
-For example, to use the SHA1 backend for sound files, which tend to be
-smallish and might be modified or copied over time, you could set in
-`.gitattributes`:
+For example, to use the SHA256 backend for sound files, which tend to be
+smallish and might be modified or copied over time,
+while using the WORM backend for everything else, you could set
+in `.gitattributes`:
 
-*.mp3 annex.backend=SHA1
-*.ogg annex.backend=SHA1
+* annex.backend=WORM
+*.mp3 annex.backend=SHA256
+*.ogg annex.backend=SHA256
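
In the new `.gitattributes` example the catch-all `*` line comes first because, when several patterns match a path, git lets a later line override an earlier one. The sketch below shows that rule on the three lines from the example. It is a simplification, not git's matcher: it only handles slash-free glob patterns compared against the file's basename, and the names `RULES` and `backend_for` are made up for illustration.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# The three lines from the example, in file order.
RULES = [
    ("*", "WORM"),
    ("*.mp3", "SHA256"),
    ("*.ogg", "SHA256"),
]


def backend_for(path, rules=RULES, default="SHA256"):
    """Return the annex.backend value of the last rule matching `path`.

    Simplified: slash-free patterns are matched against the basename only.
    `default` stands in for the built-in default when no rule matches.
    """
    name = PurePosixPath(path).name
    chosen = default
    for pattern, backend in rules:
        if fnmatchcase(name, pattern):
            chosen = backend  # a later matching line overrides an earlier one
    return chosen
```

With these rules, `backend_for("music/song.mp3")` picks `SHA256` while `backend_for("video/big.iso")` falls through to the `*` line and picks `WORM`. Had the `*` line been placed last, it would have overridden the two sound-file lines.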
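
The difference between the `WORM` and checksum backends described in the diff can be made concrete with a small sketch. It is not git-annex's key format or implementation: the tuples returned here are illustrative stand-ins, and `worm_key` and `sha256_key` are hypothetical names. It only shows which inputs each kind of key depends on.

```python
import hashlib
import os


def worm_key(path):
    """WORM-style identity: basename, size, and modification time.

    The file content is never read, which is why this is cheap.
    """
    st = os.stat(path)
    return ("WORM", os.path.basename(path), st.st_size, int(st.st_mtime))


def sha256_key(path, chunk_size=1 << 20):
    """Checksum-style identity: SHA-256 of the file content.

    The whole file is read in chunks, so the cost grows with file size.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return ("SHA256", digest.hexdigest())
```

Two copies of the same content under different names get the same `sha256_key`, which is what makes duplicate detection and content verification possible, while their `worm_key` values differ because the basename is part of the key.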