To get old behavior, add a .gitattributes containing: * annex.backend=WORM
I feel that SHA256 is a better default for most people, as long as their
systems are fast enough that checksumming their files isn't a problem.
git-annex should default to preserving the integrity of data as well as git
does. Checksum backends also work better with editing files via
unlock/lock.
I considered just using SHA1, but since that hash is believed to be somewhat
near to being broken, and git-annex deals with large files which would be a
perfect exploit medium, I decided to go to a SHA-2 hash.
SHA512 is annoyingly long when displayed, and git-annex displays it in a
few places (and notably it is shown in ls -l), so I picked the shorter
hash. Considered SHA224 as it's even shorter, but feel it's a bit weird.
I expect git-annex will use SHA-3 at some point in the future, but
probably not soon!
Note that systems without a sha256sum (or sha256) program will fall back to
defaulting to SHA1.
The backend usage graph shows present keys as well as keys found in the
repository tree, so it will also be populated for bare repositories.
Changed wording to "visible annex keys", which explains why it's 0 in
a bare repository (no keys visible as no tree), and also why it varies
depending on which branch is checked out. This seemed better than doing
something expensive to look up keys from the git-annex branch.
Checks location log information, and file contents.
Does not check that numcopies is satisfied, as .gitattributes information
about numcopies is not available in a bare repository. In practice, that
should not be a problem, since fsck is also run in a checkout and will
check numcopies there.
This new approach allows filtering out checks from the default set that are
not appropriate for a command, rather than having to list every check
that is appropriate. It also reduces some boilerplate.
Haskell does not define Eq for functions, so I had to go a long way around
with each check having a unique id. Meh.