124 lines
5.7 KiB
Markdown
124 lines
5.7 KiB
Markdown
The "backend" in git-annex specifies how a key is generated from a file's
|
|
content and/or filesystem metadata. Most backends are different kinds of
|
|
hashes. A single repository can use different backends for different files.
|
|
The [[key|internals/key_format]] includes the backend that is used for that
|
|
key.
|
|
|
|
## configuring which backend to use
|
|
|
|
The `annex.backend` git-config setting can be used to configure the
|
|
default backend to use when adding new files.
|
|
|
|
For finer control of what backend is used when adding different types of
|
|
files, the `.gitattributes` file can be used. The `annex.backend`
|
|
attribute can be set to the name of the backend to use for matching files.
|
|
|
|
For example, to use the SHA256E backend for sound files, which tend to be
|
|
smallish and might be modified or copied over time,
|
|
while using the WORM backend for everything else, you could set
|
|
in `.gitattributes`:
|
|
|
|
* annex.backend=WORM
|
|
*.mp3 annex.backend=SHA256E
|
|
*.ogg annex.backend=SHA256E
|
|
|
|
## recommended backends to use
|
|
|
|
* `SHA256E` -- The default backend for new files, combines a 256 bit SHA-2
|
|
hash of the file's content with the file's extension. This allows
|
|
verifying that the file content is right, and can avoid duplicates of
|
|
files with the same content. Its need to generate checksums
|
|
can make it slower for large files.
|
|
* `SHA256` -- SHA-2 hash that does not include the file extension in the
|
|
key, which can lead to better deduplication but can confuse some programs.
|
|
* `SHA512`, `SHA512E` -- Best SHA-2 hash, for the very paranoid.
|
|
* `SHA384`, `SHA384E`, `SHA224`, `SHA224E` -- SHA-2 hashes for
|
|
people who like unusual sizes.
|
|
* `SHA3_512`, `SHA3_512E`, `SHA3_384`, `SHA3_384E`, `SHA3_256`, `SHA3_256E`, `SHA3_224`, `SHA3_224E`
|
|
-- SHA-3 hashes, for bleeding edge fun.
|
|
* `SKEIN512`, `SKEIN512E`, `SKEIN256`, `SKEIN256E`
|
|
-- [Skein hash](http://en.wikipedia.org/wiki/Skein_hash),
|
|
a well-regarded SHA3 hash competition finalist.
|
|
* `BLAKE2B160`, `BLAKE2B224`, `BLAKE2B256`, `BLAKE2B384`, `BLAKE2B512`
|
|
`BLAKE2B160E`, `BLAKE2B224E`, `BLAKE2B256E`, `BLAKE2B384E`, `BLAKE2B512E`
|
|
-- Fast [Blake2 hash](https://blake2.net/) variants optimised for 64 bit
|
|
platforms.
|
|
* `BLAKE2S160`, `BLAKE2S224`, `BLAKE2S256`
|
|
`BLAKE2S160E`, `BLAKE2S224E`, `BLAKE2S256E`
|
|
-- Fast [Blake2 hash](https://blake2.net/) variants optimised for 32 bit
|
|
platforms.
|
|
* `BLAKE2BP512`, `BLAKE2BP512E`
|
|
-- Fast [Blake2 hash](https://blake2.net/) variants optimised for
|
|
4-way CPUs.
|
|
* `BLAKE2SP224`, `BLAKE2SP256`
|
|
`BLAKE2SP224E`, `BLAKE2SP256E`
|
|
-- Fast [Blake2 hash](https://blake2.net/) variants optimised for
|
|
8-way CPUs.
|
|
* `VURL` -- This is like an `URL` (see below) but the content can
|
|
be verified with a cryptographically secure checksum that is
|
|
recorded in the git-annex branch. It's generated when using
|
|
eg `git-annex addurl --fast --verifiable`.
|
|
|
|
## non-cryptograpgically secure backends
|
|
|
|
The backends below do not guarantee cryptographically that the
|
|
content of an annexed file remains unchanged.
|
|
|
|
* `SHA1`, `SHA1E`, `MD5`, `MD5E` -- Smaller hashes than `SHA256`
|
|
for those who want a checksum but are not concerned about security.
|
|
* `WORM` ("Write Once, Read Many") -- This assumes that any file with
|
|
the same filename, size, and modification time has the same content.
|
|
This is the least expensive backend, recommended for really large
|
|
files or slow systems.
|
|
* `URL` -- This is a key that is generated from the url to a file.
|
|
It's generated when using eg, `git annex addurl --fast`, when the file
|
|
content is not available for hashing.
|
|
The key may not contain the full URL; for long URLs, part of the URL may be
|
|
represented by a checksum.
|
|
The URL key may contain `&` characters; be sure to quote the key if
|
|
passing it to a shell script. These types of keys are distinct from URLs/URIs
|
|
that may be attached to a key (using any backend) indicating the key's location
|
|
on the web or in one of [[special_remotes]].
|
|
|
|
## external backends
|
|
|
|
While most backends are built into git-annex, it also supports external
|
|
backends. These are programs with names like `git-annex-backend-XFOO`,
|
|
which can be provided by others. See [[design/external_backend_protocol]]
|
|
for details about how to write them.
|
|
|
|
Here's a list of external backends. Edit this page to add yours to the list.
|
|
|
|
* [[design/external_backend_protocol/git-annex-backend-XFOO]]
|
|
is a demo program implementing the protocol with a shell script.
|
|
|
|
Like with git-annex's builtin backends, you can add "E" to the end of the
|
|
name of an external backend, to get a version that includes the file
|
|
extension in the key.
|
|
|
|
## internal use backends
|
|
|
|
Keys using these backends can sometimes be visible, but they are used by
|
|
git-annex for its own purposes, and not for your annexed files.
|
|
|
|
* `GIT` -- This is used internally by git-annex when exporting trees
|
|
containing files stored in git, rather than git-annex. It represents a
|
|
git sha. This is never used for git-annex links, but information about
|
|
keys of this type is stored in the git-annex branch.
|
|
* `GITBUNDLE` and `GITMANIFEST` -- Used by [[git-remote-annex]] to store
|
|
a git repository in a special remote. See
|
|
[[this_page|internals/git-remote-annex]] for details about these.
|
|
|
|
## notes
|
|
|
|
If you want to be able to prove that you're working with the same file
|
|
contents that were checked into a repository earlier, you should avoid
|
|
using non-cryptographically-secure backends, and will need to use
|
|
signed git commits. See [[tips/using_signed_git_commits]] for details.
|
|
|
|
Retrieval of WORM and URL from many [[special_remotes]] is prohibited
|
|
for [[security_reasons|security/CVE-2018-10857_and_CVE-2018-10859]].
|
|
|
|
Note that the various 512 and 384 length hashes result in long paths,
|
|
which are known to not work on Windows. If interoperability on Windows is a
|
|
concern, avoid those.
|