In the world of git, we're not scared of internal implementation
details, and sometimes we like to dive in and tweak things by hand. Here's
some documentation to that end.

## `.git/annex/objects/aa/bb/*/*`

This is where locally available file contents are actually stored.
Files added to the annex get a symlink checked into git that points
to the file content.

First there are two levels of directories used for hashing, to prevent
too many things ending up in any one directory.
See [[hashing]] for details.

Each subdirectory has the [[name_of_a_key|key_format]] in one of the
[[key-value_backends|backends]]. The file inside also has the name of the key.
This two-level structure is used because it allows the write bit to be removed
from the subdirectories as well as from the files. That prevents accidentally
deleting or changing the file contents.

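As a rough illustration of the layout described above, here is a minimal
sketch that assembles the path to a key's content. The function name, the
hash directory names `aa` and `bb`, and the key `EXAMPLEKEY` are all
hypothetical; how the hash directories are derived from the key is covered
in [[hashing]] and is not reproduced here.

```python
from pathlib import PurePosixPath


def object_path(hash_dir1: str, hash_dir2: str, key: str) -> PurePosixPath:
    """Return the location of a key's content under .git/annex/objects.

    The two hash directory names are taken as inputs, because deriving
    them from the key is outside the scope of this sketch.
    """
    # Two hashing levels, then a directory named after the key,
    # then the content file, which is also named after the key.
    return PurePosixPath(".git/annex/objects") / hash_dir1 / hash_dir2 / key / key


# Hypothetical values, for illustration only.
print(object_path("aa", "bb", "EXAMPLEKEY"))
```
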
In [[direct_mode]], file contents are not stored in here, and instead
are stored directly in the file. However, the same symlinks are still
committed to git, internally.

Also in [[direct_mode]], some additional data is stored in these directories.
`.cache` files contain cached file stats used in detecting when a file has
changed, and `.map` files contain a list of file(s) in the work directory
that contain the key.

## The git-annex branch

This branch is managed by git-annex, with the contents listed below.

The file `.git/annex/index` is a separate git index file it uses
to accumulate changes for the git-annex branch.
Also, `.git/annex/journal/` is used to record changes before they
are added to git.

### `uuid.log`

Records the UUIDs of known repositories, and associates them with a
description of the repository. This allows git-annex to display something
more useful than a UUID when it refers to a repository that does not have
a configured git remote pointing at it.

The file format is simply one line per repository, with the uuid followed by a
space and then the description, followed by a timestamp. Example:

    e605dca6-446a-11e0-8b2a-002170d25c55 laptop timestamp=1317929189.157237s
    26339d22-446b-11e0-9101-002170d25c55 usb disk timestamp=1317929330.769997s

If there are multiple lines for the same uuid, the one with the most recent
timestamp wins. git-annex union merges this and other files.

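The "most recent timestamp wins" rule can be sketched as follows. This is
a minimal reading of the format described above, not git-annex's own
parser. The function names are hypothetical, and the third sample line is
invented to show a description being superseded.

```python
def parse_timestamp(field: str) -> float:
    """Turn a field like 'timestamp=1317929189.157237s' into seconds."""
    prefix = "timestamp="
    if not field.startswith(prefix) or not field.endswith("s"):
        raise ValueError(f"not a timestamp field: {field!r}")
    return float(field[len(prefix):-1])


def parse_uuid_log(text: str) -> dict[str, str]:
    """Map each uuid to its description, keeping the newest line per uuid."""
    latest: dict[str, tuple[float, str]] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        uuid, rest = line.split(" ", 1)
        # The description may itself contain spaces, so the timestamp
        # is taken from the last space-separated field.
        description, _, ts_field = rest.rpartition(" ")
        ts = parse_timestamp(ts_field)
        if uuid not in latest or ts > latest[uuid][0]:
            latest[uuid] = (ts, description)
    return {uuid: description for uuid, (_, description) in latest.items()}


SAMPLE = """\
e605dca6-446a-11e0-8b2a-002170d25c55 laptop timestamp=1317929189.157237s
26339d22-446b-11e0-9101-002170d25c55 usb disk timestamp=1317929330.769997s
e605dca6-446a-11e0-8b2a-002170d25c55 old laptop timestamp=1317929400.000000s
"""

descriptions = parse_uuid_log(SAMPLE)
```

Because the newest line wins regardless of where it appears in the file,
the result does not depend on line order, which is what makes a
line-based union merge safe for this file.
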
### `remote.log`

Holds persistent configuration settings for [[special_remotes]] such as
Amazon S3.

The file format is one line per remote, starting with the uuid of the
remote, followed by a space, and then a series of var=value pairs,
each separated by whitespace, and finally a timestamp.

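A line in this format could be taken apart roughly like this. The sketch
assumes that values contain no whitespace and that the timestamp uses the
same `timestamp=...s` form as in `uuid.log`. The function name and the
`type` and `bucket` settings in the sample are illustrative only.

```python
def parse_remote_log_line(line: str) -> tuple[str, dict[str, str], float]:
    """Split one remote.log line into (uuid, settings, timestamp)."""
    uuid, *fields = line.split()
    if not fields:
        raise ValueError("line has no timestamp field")
    # The timestamp is the final field; everything before it is var=value.
    *pairs, ts_field = fields
    name, _, value = ts_field.partition("=")
    if name != "timestamp" or not value.endswith("s"):
        raise ValueError(f"not a timestamp field: {ts_field!r}")
    timestamp = float(value[:-1])

    settings: dict[str, str] = {}
    for pair in pairs:
        # Split on the first '=' only, so values may contain '=' themselves
        # (base64 padding, for example).
        var, sep, val = pair.partition("=")
        if not sep:
            raise ValueError(f"not a var=value pair: {pair!r}")
        settings[var] = val
    return uuid, settings, timestamp


# Hypothetical line, for illustration only.
example = parse_remote_log_line(
    "e605dca6-446a-11e0-8b2a-002170d25c55 type=S3 bucket=example "
    "timestamp=1317929189.157237s"
)
```
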
Encrypted special remotes store their encryption key here,
in the "cipher" value. It is base64 encoded, and unless shared [[encryption]]
is used, is encrypted to one or more gpg keys. The first 256 bytes of
the cipher are used as the HMAC SHA1 encryption key, to encrypt filenames
stored on the special remote. The remainder of the cipher is used as a gpg
symmetric encryption key, to encrypt the content of files stored on the special
remote.

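The split of the cipher into two keys can be sketched as below. This
assumes the cipher is already available as a byte string, so any gpg
decryption and base64 handling are left out. The function names are
hypothetical, and the exact form of the names git-annex stores on the
remote is not shown here; the sketch only computes an HMAC SHA1 digest.

```python
import hashlib
import hmac

HMAC_KEY_LENGTH = 256  # "the first 256 bytes of the cipher"


def split_cipher(cipher: bytes) -> tuple[bytes, bytes]:
    """Return (hmac_key, symmetric_key) taken from the cipher."""
    if len(cipher) <= HMAC_KEY_LENGTH:
        raise ValueError("cipher too short to hold both keys")
    return cipher[:HMAC_KEY_LENGTH], cipher[HMAC_KEY_LENGTH:]


def hmac_sha1_name(hmac_key: bytes, name: str) -> str:
    """Compute the HMAC SHA1 digest of a name, as a hex string."""
    return hmac.new(hmac_key, name.encode("utf-8"), hashlib.sha1).hexdigest()


# A made-up 512 byte cipher, for illustration only.
demo_cipher = bytes(range(256)) * 2
demo_hmac_key, demo_symmetric_key = split_cipher(demo_cipher)
demo_digest = hmac_sha1_name(demo_hmac_key, "EXAMPLEKEY")
```
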
### `trust.log`

Records the [[trust]] information for repositories. Does not exist unless
[[trust]] values are configured.

The file format is one line per repository, with the uuid followed by a
space, and then one of `1` (trusted), `0` (untrusted), `?` (semi-trusted),
or `X` (dead), and finally a timestamp.

Example:

    e605dca6-446a-11e0-8b2a-002170d25c55 1 timestamp=1317929189.157237s
    26339d22-446b-11e0-9101-002170d25c55 ? timestamp=1317929330.769997s

Repositories not listed are semi-trusted.

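Looking up a trust level, including the semi-trusted default for unlisted
repositories, might look like this. The function names are hypothetical.
The sketch also assumes that the newest line wins when a uuid appears more
than once, as stated for `uuid.log`.

```python
TRUST_LEVELS = {
    "1": "trusted",
    "0": "untrusted",
    "?": "semi-trusted",
    "X": "dead",
}
DEFAULT_TRUST = "semi-trusted"


def parse_trust_log(text: str) -> dict[str, str]:
    """Map each uuid to its trust level, keeping the newest line per uuid."""
    latest: dict[str, tuple[float, str]] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        uuid, code, ts_field = line.split()
        # Strip the 'timestamp=' prefix and the trailing 's'.
        ts = float(ts_field[len("timestamp="):-1])
        if uuid not in latest or ts > latest[uuid][0]:
            latest[uuid] = (ts, TRUST_LEVELS[code])
    return {uuid: level for uuid, (_, level) in latest.items()}


def trust_level(levels: dict[str, str], uuid: str) -> str:
    """Return the recorded level, or semi-trusted when the uuid is unlisted."""
    return levels.get(uuid, DEFAULT_TRUST)


levels = parse_trust_log(
    "e605dca6-446a-11e0-8b2a-002170d25c55 1 timestamp=1317929189.157237s\n"
    "26339d22-446b-11e0-9101-002170d25c55 ? timestamp=1317929330.769997s\n"
)
```
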
### `group.log`

Used to group repositories together.

The file format is one line per repository, with the uuid followed by a space,
and then a space-separated list of groups this repository is part of,
and finally a timestamp.

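Reading one line of this format could be sketched as follows. The function
name and the group names `archive` and `backup` are made up for
illustration, and the timestamp is assumed to use the same
`timestamp=...s` form as in the other logs.

```python
def parse_group_log_line(line: str) -> tuple[str, set[str], float]:
    """Split one group.log line into (uuid, groups, timestamp)."""
    uuid, *rest = line.split()
    if not rest:
        raise ValueError("line has no timestamp field")
    # Everything between the uuid and the final timestamp is a group name.
    *groups, ts_field = rest
    prefix = "timestamp="
    if not ts_field.startswith(prefix) or not ts_field.endswith("s"):
        raise ValueError(f"not a timestamp field: {ts_field!r}")
    return uuid, set(groups), float(ts_field[len(prefix):-1])


# Hypothetical line, for illustration only.
group_example = parse_group_log_line(
    "e605dca6-446a-11e0-8b2a-002170d25c55 archive backup "
    "timestamp=1317929189.157237s"
)
```
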
### `preferred-content.log`

Used to indicate which repositories prefer to contain which file contents.

The file format is one line per repository, with the uuid followed by a space,
then a boolean expression, and finally a timestamp.

Files matching the expression are preferred to be retained in the
repository, while files not matching it are preferred to be stored
somewhere else.

### `aaa/bbb/*.log`

These log files record [[location_tracking]] information
for file contents. Again these are placed in two levels of subdirectories
for hashing. See [[hashing]] for details.

The name of the key is the filename, and the content
consists of a timestamp, either 1 (present) or 0 (not present), and
the UUID of the repository that has or lacks the file content.

Example:

    1287290776.765152s 1 e605dca6-446a-11e0-8b2a-002170d25c55
    1287290767.478634s 0 26339d22-446b-11e0-9101-002170d25c55

These files are designed to be auto-merged using git's [[union merge driver|git-union-merge]].
The timestamps allow the most recent information to be identified.

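To show why timestamps make a union merge workable, here is a small
sketch that combines two versions of a location log and then works out
which repositories currently have the content. `union_merge` here is a
simplified stand-in that keeps every distinct line from both sides; it is
not git's merge driver. The function names and the extra line in `theirs`
are invented for illustration.

```python
def union_merge(ours: str, theirs: str) -> list[str]:
    """Keep every distinct non-blank line from both versions, in order."""
    combined = ours.splitlines() + theirs.splitlines()
    return [line for line in dict.fromkeys(combined) if line.strip()]


def present_in(lines: list[str]) -> set[str]:
    """Return the uuids whose newest log line says the content is present."""
    latest: dict[str, tuple[float, bool]] = {}
    for line in lines:
        if not line.strip():
            continue
        ts_field, status, uuid = line.split()
        ts = float(ts_field.rstrip("s"))  # e.g. '1287290776.765152s'
        if uuid not in latest or ts > latest[uuid][0]:
            latest[uuid] = (ts, status == "1")
    return {uuid for uuid, (_, present) in latest.items() if present}


ours = (
    "1287290776.765152s 1 e605dca6-446a-11e0-8b2a-002170d25c55\n"
    "1287290767.478634s 0 26339d22-446b-11e0-9101-002170d25c55\n"
)
# Another clone later recorded that the second repository got the content.
theirs = (
    "1287290767.478634s 0 26339d22-446b-11e0-9101-002170d25c55\n"
    "1287290800.000000s 1 26339d22-446b-11e0-9101-002170d25c55\n"
)

merged = union_merge(ours, theirs)
holders = present_in(merged)
```

Both the old "not present" line and the newer "present" line survive the
merge; the timestamp decides which one counts.
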
### `aaa/bbb/*.log.web`

These log files record urls used by the
[[web_special_remote|special_remotes/web]]. Their format is similar
to the location tracking files, but with urls rather than UUIDs.