update documentation for new, neutered key-value backends

Backends are now only used to generate keys (and check them); they
are not arbitrary key-value stores for data, because it turned out such
a store is better modeled as a special remote. Updated docs to not
imply backends do more than they do now.

Sometimes I'm tempted to rename "backend" to "keytype" or something,
which would really be more clear. But it would be an annoying transition
for users, with annex.backends etc.
Joey Hess 2011-08-28 16:28:38 -04:00
parent 999d5df90b
commit bbba6c19bd
5 changed files with 29 additions and 37 deletions


@@ -1,24 +1,16 @@
git-annex uses a key-value abstraction layer to allow file contents to be
stored in different ways. In theory, any key-value storage system could be
used to store file contents.
When a file is annexed, a key is generated from its content and/or metadata.
The file checked into git symlinks to the key. This key can later be used
to retrieve the file's content (its value).
Multiple pluggable backends are supported, and a single repository
can use different backends for different files.
Multiple pluggable key-value backends are supported, and a single repository
can use different ones for different files.
These backends can transfer file contents between configured git remotes.
It's also possible to use [[special_remotes]], such as Amazon S3 with
these backends.
* `WORM` ("Write Once, Read Many") This backend assumes that any file with
the same basename, size, and modification time has the same content. So with
this backend, files can be moved around, but should never be added to
* `WORM` ("Write Once, Read Many") This assumes that any file with
the same basename, size, and modification time has the same content. So
files can be moved around, but should never be added to
or changed. This is the default, and the least expensive backend.
* `SHA1` -- This backend uses a key based on a sha1 checksum. This backend
allows modifications of files to be tracked. Its need to generate checksums
* `SHA1` -- This uses a key based on a sha1 checksum. This allows
modifications of files to be tracked. Its need to generate checksums
can make it slower for large files.
* `SHA512`, `SHA384`, `SHA256`, `SHA224` -- Like SHA1, but larger
checksums. Mostly useful for the very paranoid, or anyone who is
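
The two key schemes described in this hunk can be sketched roughly as follows. This is a minimal illustration, not git-annex's implementation: the function names `worm_key` and `sha1_key` are hypothetical, and the key string layout is an assumption made for the example.

```python
import hashlib
import os


def worm_key(path):
    """Build a WORM-style key from basename, size, and modification time.

    The string layout is an assumption for illustration; the real
    git-annex key format may differ.
    """
    st = os.stat(path)
    return "WORM-s{}-m{}--{}".format(
        st.st_size, int(st.st_mtime), os.path.basename(path)
    )


def sha1_key(path, chunk_size=1 << 20):
    """Build a SHA1-style key by hashing the file's content in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # Read in fixed-size chunks so large files are not loaded whole.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return "SHA1-s{}--{}".format(os.path.getsize(path), digest.hexdigest())
```

The WORM-style key never reads the file's content, which is why it is cheap but cannot notice a modification that leaves the name, size, and mtime unchanged. The SHA1-style key must read every byte, which is the cost the text mentions for large files.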


@@ -1,8 +1,8 @@
The WORM and SHA1 key-value [[backends]] store data inside
your git repository's `.git` directory, not in some external data store.
Annexed data is stored inside your git repository's `.git/annex` directory.
Some [[special_remotes]] can store annexed data elsewhere.
It's important that data not get lost by an ill-considered `git annex drop`
command. So, when using those backends, git-annex can be configured to try
command. So, git-annex can be configured to try
to keep N copies of a file's content available across all repositories.
(Although [[untrusted_repositories|trust]] don't count toward this total.)
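
The copy-counting rule in this hunk can be sketched as below. This is a simplified model, assuming the check is that at least N copies remain in other repositories that are not untrusted; the function name `safe_to_drop` and the repository names in the usage note are hypothetical, and git-annex's actual logic is more involved.

```python
def safe_to_drop(copies_elsewhere, numcopies, untrusted=frozenset()):
    """Return True if dropping the local copy still leaves at least
    `numcopies` copies in other repositories.

    copies_elsewhere: set of repository names known to hold the content.
    untrusted: repositories whose copies do not count toward the total.
    """
    counted = {repo for repo in copies_elsewhere if repo not in untrusted}
    return len(counted) >= numcopies
```

For example, with `numcopies` set to 2 and copies on `usbdrive` and `server`, the drop is allowed; if `usbdrive` is marked untrusted, only one copy counts and the drop is refused.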


@@ -72,15 +72,15 @@ Many git-annex commands will stage changes for later `git commit` by you.
* get [path ...]
Makes the content of annexed files available in this repository. Depending
on the backend used, this will involve copying them from another repository,
or downloading them, or transferring them from some kind of key-value store.
Makes the content of annexed files available in this repository. This
will involve copying them from another repository, or downloading them,
or transferring them from some kind of key-value store.
* drop [path ...]
Drops the content of annexed files from this repository.
git-annex may refuse to drop content if the backend does not think
git-annex may refuse to drop content if it does not think
it is safe to do so, typically because of the setting of annex.numcopies.
* move [path ...]
@@ -207,14 +207,14 @@ Many git-annex commands will stage changes for later `git commit` by you.
* migrate [path ...]
Changes the specified annexed files to store their content in the
default backend (or the one specified with --backend). Only files whose
content is currently available are migrated.
Changes the specified annexed files to use the default key-value backend
(or the one specified with --backend). Only files whose content
is currently available are migrated.
Note that the content is not removed from the backend it was previously in.
Use `git annex unused` to find and remove such content.
Note that the content is also still available using the old key after
migration. Use `git annex unused` to find and remove the old key.
Normally, nothing will be done to files already in the backend.
Normally, nothing will be done to files already using the new backend.
However, if a backend changes the information it uses to construct a key,
this can also be used to migrate files to use the new key format.
@@ -293,7 +293,7 @@ Many git-annex commands will stage changes for later `git commit` by you.
* fromkey file
This plumbing-level command can be used to manually set up a file
to link to a specified key in the key-value backend.
in the git repository to link to a specified key.
* dropkey [key ...]
@@ -500,8 +500,8 @@ Here are all the supported configuration settings.
# CONFIGURATION VIA .gitattributes
The backend used when adding a new file to the annex can be configured
on a per-file-type basis via `.gitattributes` files. In the file,
The key-value backend used when adding a new file to the annex can be
configured on a per-file-type basis via `.gitattributes` files. In the file,
the `annex.backend` attribute can be set to the name of the backend to
use. For example, here's how to use the WORM backend by default,
but the SHA1 backend for ogg files:


@@ -1,8 +1,8 @@
You can use the fsck subcommand to check for problems in your data.
What can be checked depends on the [[backend|backends]] you've used to store
the data. For example, when you use the SHA1 backend, fsck will verify that
the checksums of your files are good. Fsck also checks that the annex.numcopies
setting is satisfied for all files.
You can use the fsck subcommand to check for problems in your data. What
can be checked depends on the key-value [[backend|backends]] you've used
for the data. For example, when you use the SHA1 backend, fsck will verify
that the checksums of your files are good. Fsck also checks that the
annex.numcopies setting is satisfied for all files.
# git annex fsck
fsck some_file (checksum...) ok
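
The checksum verification that fsck is described as doing for the SHA1 backend amounts to rehashing the content and comparing it with the checksum recorded in the key. A minimal sketch, assuming the expected checksum is available as a hex string; the function name `content_matches_sha1` is hypothetical and this is not git-annex's own code:

```python
import hashlib


def content_matches_sha1(path, expected_hex, chunk_size=1 << 20):
    """Rehash the file at `path` and compare it with the expected SHA1."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # Stream the file so memory use stays bounded for large content.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()
```

A key that carries no checksum, such as a WORM key, offers nothing to rehash against, which is why what fsck can check depends on the backend.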


@@ -2,7 +2,7 @@ It's possible for data to accumulate in the annex that no files point to
anymore. One way it can happen is if you `git rm` a file without
first calling `git annex drop`. And, when you modify an annexed file, the old
content of the file remains in the annex. Another way is when migrating
between backends.
between key-value [[backends|backend]].
This might be historical data you want to preserve, so git-annex defaults to
preserving it. So from time to time, you may want to check for such data and