# NAME

git-annex - manage files with git, without checking their contents in

# SYNOPSIS

git annex command [params ...]

# DESCRIPTION

git-annex allows managing files with git, without checking the file
contents into git. While that may seem paradoxical, it is useful when
dealing with files larger than git can currently easily handle, whether due
to limitations in memory, checksumming time, or disk space.

Even without file content tracking, being able to manage files with git,
move files around and delete files with versioned directory trees, and use
branches and distributed clones, are all very handy reasons to use git. And
annexed files can co-exist in the same git repository with regularly
versioned files, which is convenient for maintaining documents, Makefiles,
etc that are associated with annexed files but that benefit from full
revision control.

When a file is annexed, its content is moved into a key-value store, and
a symlink is made that points to the content. These symlinks are checked into
git and versioned like regular files. You can move them around, delete
them, and so on. Pushing to another git repository will make git-annex
there aware of the annexed file, and it can be used to retrieve its
content from the key-value store.

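The symlink-plus-store arrangement described above can be imitated by hand with ordinary shell tools. The sketch below is only an illustration: the `annex_by_hand` function, the `store/` directory, and the `SHA256-<checksum>` key name are assumptions made for this example, not git-annex's actual on-disk layout. It also assumes GNU `sha256sum` is installed and that the file is in the current directory.

```shell
# Illustration only: imitate "content in a key-value store, symlink in the tree".
# The store/ directory and the SHA256-<checksum> key name are made up for this sketch.
annex_by_hand() {
    file=$1
    key="SHA256-$(sha256sum "$file" | cut -d' ' -f1)"   # key derived from the content
    mkdir -p store
    mv "$file" "store/$key"        # the content now lives in the store
    ln -s "store/$key" "$file"     # a small symlink takes its place
}

cd "$(mktemp -d)"
printf 'pretend this is a large file\n' > bigfile
annex_by_hand bigfile
readlink bigfile    # prints store/SHA256-<checksum>
cat bigfile         # the content is still readable through the symlink
```

Only the symlink would need to be committed to git in such a scheme; the content itself stays outside version control, which is the point the description makes.
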
# EXAMPLES

    # git annex get video/hackity_hack_and_kaxxt.mov
    get video/hackity_hack_and_kaxxt.mov (not available)
      I was unable to access these remotes: server
      Try making some of these repositories available:
        5863d8c0-d9a9-11df-adb2-af51e6559a49 -- my home file server
        58d84e8a-d9ae-11df-a1aa-ab9aa8c00826 -- portable USB drive
        ca20064c-dbb5-11df-b2fe-002170d25c55 -- backup SATA drive
    failed
    # sudo mount /media/usb
    # git remote add usbdrive /media/usb
    # git annex get video/hackity_hack_and_kaxxt.mov
    get video/hackity_hack_and_kaxxt.mov (from usbdrive...) ok

    # git annex add iso
    add iso/Debian_5.0.iso ok

    # git annex drop iso/Debian_4.0.iso
    drop iso/Debian_4.0.iso ok

    # git annex move iso --to=usbdrive
    move iso/Debian_5.0.iso (moving to usbdrive...) ok

# COMMONLY USED COMMANDS

Like many git commands, git-annex can be passed a path that
is either a file or a directory. In the latter case it acts on all relevant
files in the directory. When no path is specified, most git-annex commands
default to acting on all relevant files in the current directory (and
subdirectories).

* `add [path ...]`

  Adds files in the path to the annex. If no path is specified, adds
  files from the current directory and below.

  Files that are already checked into git, or that git has been configured
  to ignore, will be silently skipped. (Use `--force` to add ignored files.)

  Dotfiles are skipped unless explicitly listed, or the `--include-dotfiles`
  option is used.

* `get [path ...]`

  Makes the content of annexed files available in this repository. This
  will involve copying them from another repository, or downloading them,
  or transferring them from some kind of key-value store.

  Normally git-annex will choose which repository to copy the content from,
  but you can override this using the `--from` option.

  Rather than specifying a filename, the `--all` option can be used to
  get all available versions of all files, or the `--key=KEY`
  option can be used to get a specified key.

* `drop [path ...]`

  Drops the content of annexed files from this repository.

  git-annex will refuse to drop content if it cannot verify it is
  safe to do so. This can be overridden with the `--force` switch.

  To drop content from a remote, specify `--from`.

* `move [path ...]`

  When used with the `--from` option, moves the content of annexed files
  from the specified repository to the current one.

  When used with the `--to` option, moves the content of annexed files from
  the current repository to the specified one.

* `copy [path ...]`

  When used with the `--from` option, copies the content of annexed files
  from the specified repository to the current one.

  When used with the `--to` option, copies the content of annexed files from
  the current repository to the specified one.

  To avoid contacting the remote to check if it has every file
  when copying `--to` the repository, specify `--fast`.

  To force checking the remote for every file when copying `--from` the
  repository, specify `--force`.

* `status [path ...]`

  Similar to `git status --short`, displays the status of the files in the
  working tree. Shows files that are not checked into git, files that
  have been deleted, and files that have been modified.
  Particularly useful in direct mode.

* `unlock [path ...]`

  Normally, the content of annexed files is protected from being changed.
  Unlocking an annexed file allows it to be modified. This replaces the
  symlink for each specified file with a copy of the file's content.
  You can then modify it and `git annex add` (or `git commit`) to inject
  it back into the annex.

* `edit [path ...]`

  This is an alias for the unlock command. May be easier to remember,
  if you think of this as allowing you to edit an annexed file.

* `lock [path ...]`

  Use this to undo an unlock command if you don't want to modify
  the files, or have made modifications you want to discard.

* `sync [remote ...]`

  Use this command when you want to synchronize the local repository with
  one or more of its remotes. You can specify the remotes (or remote
  groups) to sync with by name; the default if none are specified is to
  sync with all remotes.
  Or specify `--fast` to sync with the remotes with the
  lowest annex-cost value.

  The sync process involves first committing any local changes to files
  that have previously been added to the repository,
  then fetching and merging the `synced/master` and the `git-annex` branch
  from the remote repositories, and finally pushing the changes back to
  those branches on the remote repositories. You can use standard git
  commands to do each of those steps by hand, or if you don't want to
  worry about the details, you can use sync.

  Merge conflicts are automatically handled by sync. When two conflicting
  versions of a file have been committed, both will be added to the tree,
  under different filenames. For example, file "foo" would be replaced
  with "foo.somekey" and "foo.otherkey".

  Note that syncing with a remote will not update the remote's working
  tree with changes made to the local repository. However, those changes
  are pushed to the remote, so they can be merged into its working tree
  by running "git annex sync" on the remote.

  With the `--content` option, the contents of annexed files in the work
  tree will also be uploaded and downloaded from remotes. By default,
  this tries to get each annexed file that the local repository does not
  yet have, and then copies each file to every remote that it is syncing with.
  This behavior can be overridden by configuring the preferred content of
  a repository. See PREFERRED CONTENT below.

  The `--message` or `-m` option can be used to specify a commit message.

* `merge`

  This performs the same merging (and merge conflict resolution)
  that is done by the sync command, but without pushing or pulling any data.

  One way to use this is to put `git annex merge` into a repository's
  post-receive hook. Then any syncs to the repository will update its working
  copy automatically.

* `mirror [path ...]`

  This causes a destination repository to mirror a source repository.

  To use the local repository as the source repository,
  specify mirror `--to` remote.

  To use a remote as the source repository, specify mirror `--from` remote.

  Each specified file in the source repository is mirrored to the destination
  repository. If a file's content is present in the source repository, it is
  copied to the destination repository. If a file's content is not present in
  the source repository, it will be dropped from the destination repository
  when the numcopies setting allows.

  Note that mirror does not sync the git repository, but only the file
  contents.

  Also, `--all` may be specified to mirror all objects stored in the git
  annex, not only objects used by currently existing files. However, this
  bypasses checking the .gitattributes annex.numcopies setting when
  dropping files.

* `addurl [url ...]`

  Downloads each url to its own file, which is added to the annex.

  To avoid immediately downloading the url, specify `--fast`.

  To avoid storing the size of the url's content, and accept whatever
  is there at a future point, specify `--relaxed`. (Implies `--fast`.)

  Normally the filename is based on the full url, so will look like
  "www.example.com_dir_subdir_bigfile". For a shorter filename, specify
  `--pathdepth=N`. For example, `--pathdepth=1` will use "dir/subdir/bigfile",
  while `--pathdepth=3` will use "bigfile". It can also be negative;
  `--pathdepth=-2` will use the last two parts of the url.

  Or, to directly specify what file the url is added to, specify `--file`.
  This changes the behavior; now all the specified urls are recorded as
  alternate locations from which the file can be downloaded. In this mode,
  addurl can be used both to add new files and to add urls to existing files.

  When `quvi` is installed, urls are automatically tested to see if they
  point to a video hosting site, and the video is downloaded instead.

  Urls to torrent files (including magnet links) will cause the content of
  the torrent to be downloaded, using `aria2c`.

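The `--pathdepth` rule for addurl amounts to dropping leading parts of the url (or, for a negative value, keeping only trailing parts). The following POSIX shell sketch models that rule as it is worded in the addurl entry. The function name `url_to_filename` is hypothetical, the example url is inferred from the filenames quoted there, and the code is not taken from git-annex, which may treat unusual urls differently.

```shell
# Hypothetical helper: derive a file name from a url, following the
# --pathdepth rule as worded in the addurl entry (not git-annex's own code).
# usage: url_to_filename URL [PATHDEPTH]
url_to_filename() {
    url=$1
    depth=${2:-}                      # empty means --pathdepth was not given
    rest=${url#*://}                  # drop the "http://" style scheme

    set -f                            # no globbing while splitting
    old_ifs=$IFS
    IFS=/
    set -- $rest                      # host and path parts become $1, $2, ...
    IFS=$old_ifs
    set +f

    if [ -z "$depth" ]; then
        skip=0 sep=_                  # default: all parts, joined with "_"
    elif [ "$depth" -lt 0 ]; then
        skip=$(( $# + depth )) sep=/  # negative: keep only the last N parts
    else
        skip=$depth sep=/             # positive: drop the first N parts
    fi
    [ "$skip" -lt 0 ] && skip=0       # clamp so that shift cannot fail
    [ "$skip" -gt "$#" ] && skip=$#
    shift "$skip"

    out=
    for part in "$@"; do
        out=${out:+$out$sep}$part     # join the remaining parts
    done
    printf '%s\n' "$out"
}

url_to_filename http://www.example.com/dir/subdir/bigfile      # www.example.com_dir_subdir_bigfile
url_to_filename http://www.example.com/dir/subdir/bigfile 1    # dir/subdir/bigfile
url_to_filename http://www.example.com/dir/subdir/bigfile 3    # bigfile
url_to_filename http://www.example.com/dir/subdir/bigfile -2   # subdir/bigfile
```
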
* `rmurl file url`

  Record that the file is no longer available at the url.

* `import [path ...]`

  Moves files from somewhere outside the git working copy, and adds them to
  the annex. Individual files to import can be specified.
  If a directory is specified, the entire directory is imported.

      git annex import /media/camera/DCIM/*

  By default, importing two files with the same contents from two different
  locations will result in both files being added to the repository.
  (With all checksumming backends, including the default SHA256E,
  only one copy of the data will be stored.)

  To not delete files from the import location, use the
  `--duplicate` option. This could allow importing the same files repeatedly
  to different locations in a repository. More likely, it could be used to
  import the same files to a number of different branches or separate git
  repositories.

  To only import files whose content has not been seen before by git-annex,
  use the `--deduplicate` option. Duplicate files will be deleted from the
  import location.

  To only import files whose content has not been seen before by git-annex,
  but avoid deleting duplicate files, use the `--skip-duplicates` option.

  The `--clean-duplicates` option does not import any new files, but any files
  found in the import location that are duplicates of content in the annex
  are deleted.

  (Note that using `--deduplicate` or `--clean-duplicates` with the WORM
  backend does not look at file content, but filename and mtime.)

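The import options differ only in two respects: whether a file gets imported, and whether it is left behind in the import location. The lookup below restates the import entry in that form. It is a sketch for orientation only: the function name `import_action`, the pseudo-option `default`, and the yes/no "duplicate" flag (meaning git-annex has already seen the content) are all made up for this example.

```shell
# Hypothetical summary of the import options, restating the documentation.
# usage: import_action OPTION DUPLICATE    (DUPLICATE is "yes" or "no")
import_action() {
    case "$1:$2" in
        default:*)               echo "import, remove source" ;;
        --duplicate:*)           echo "import, keep source" ;;
        --deduplicate:no)        echo "import, remove source" ;;
        --deduplicate:yes)       echo "skip import, delete source" ;;
        --skip-duplicates:no)    echo "import, remove source" ;;
        --skip-duplicates:yes)   echo "skip import, keep source" ;;
        --clean-duplicates:no)   echo "skip import, keep source" ;;
        --clean-duplicates:yes)  echo "skip import, delete source" ;;
        *)                       echo "unknown option: $1" >&2; return 1 ;;
    esac
}

import_action --deduplicate yes      # skip import, delete source
import_action --skip-duplicates yes  # skip import, keep source
```
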
* `importfeed [url ...]`

  Imports the contents of podcast feeds. Only downloads files whose
  urls have not already been added to the repository before, so you can
  delete, rename, etc the resulting files and repeated runs won't duplicate
  them. (Use `--force` to force downloading urls it's seen before.)

  Use `--template` to control where the files are stored.
  The default template is '${feedtitle}/${itemtitle}${extension}'.
  (Other available variables: feedauthor, itemauthor, itemsummary, itemdescription, itemrights, itemid, itempubdate, title, author.)

  The `--relaxed` and `--fast` options behave the same as they do in addurl.

  When quvi is installed, links in the feed are tested to see if they
  are on a video hosting site, and the video is downloaded. This allows
  importing e.g., YouTube playlists.

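To see what a `--template` value produces, it can help to expand one by hand. The sketch below substitutes three of the documented variables into the default template using plain POSIX shell. The helper names `replace_all` and `expand_template` and the sample feed values are hypothetical, and git-annex itself may additionally sanitize characters in the resulting file name, which this sketch does not attempt.

```shell
# replace_all STRING SEARCH REPLACEMENT: literal (non-pattern) substitution.
replace_all() {
    rest=$1 out=
    while :; do
        case $rest in
            *"$2"*)
                out=$out${rest%%"$2"*}$3   # text before the match, then the replacement
                rest=${rest#*"$2"}         # continue after the match
                ;;
            *) break ;;
        esac
    done
    printf '%s\n' "$out$rest"
}

# expand_template TEMPLATE FEEDTITLE ITEMTITLE EXTENSION (hypothetical helper)
expand_template() {
    result=$1
    result=$(replace_all "$result" '${feedtitle}' "$2")
    result=$(replace_all "$result" '${itemtitle}' "$3")
    result=$(replace_all "$result" '${extension}' "$4")
    printf '%s\n' "$result"
}

expand_template '${feedtitle}/${itemtitle}${extension}' LinuxTalk episode_12 .mp3
# prints LinuxTalk/episode_12.mp3
```
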
* `undo [filename|directory] ...`

  When passed a filename, undoes the last change that was made to that
  file.

  When passed a directory, undoes the last change that was made to the
  contents of that directory.

  Running undo a second time will undo the undo, returning the working
  tree to the same state it had before. So that an undo of staged changes
  can itself be undone, any staged changes are first committed by the
  undo command.

  Note that this does not undo get/drop of a file's content; it only
  operates on the file tree committed to git.

* `watch`

  Watches for changes to files in the current directory and its subdirectories,
  and takes care of automatically adding new files, as well as dealing with
  deleted, copied, and moved files. With this running as a daemon in the
  background, you no longer need to manually run git commands when
  manipulating your files.

  By default, all files in the directory will be added to the repository.
  (Including dotfiles.) To block some files from being added, use
  `.gitignore` files.

  By default, all files that are added are added to the annex, the same
  as when you run `git annex add`. If you configure annex.largefiles,
  files that it does not match will instead be added with `git add`.

  To not daemonize, run with `--foreground`; to stop a running daemon,
  run with `--stop`.

* `assistant`

  Like watch, but also automatically syncs changes to other remotes.
  Typically started at boot, or when you log in.

  With the `--autostart` option, the assistant is started in any repositories
  it has created. These are listed in `~/.config/git-annex/autostart`.

* `webapp`

  Opens a web app that allows easy setup of a git-annex repository,
  and control of the git-annex assistant. If the assistant is not
  already running, it will be started.

  By default, the webapp can only be accessed from localhost, and running
  it opens a browser window.

  To use the webapp on a remote computer, use the `--listen=address`
  option to specify the address the web server should listen on
  (or set annex.listen).
  This disables running a local web browser, and outputs the url you
  can use to open the webapp.

  When using the webapp on a remote computer, you'll almost certainly
  want to enable HTTPS. The webapp will use HTTPS if it finds
  a .git/annex/privkey.pem and .git/annex/certificate.pem. Here's
  one way to generate those files, using a self-signed certificate:

      openssl genrsa -out .git/annex/privkey.pem 4096
      openssl req -new -x509 -key .git/annex/privkey.pem > .git/annex/certificate.pem

# REPOSITORY SETUP COMMANDS

* `init [description]`

  Until a repository (or one of its remotes) has been initialized,
  git-annex will refuse to operate on it, to avoid accidentally
  using it in a repository that was not intended to have an annex.

  It's useful, but not mandatory, to initialize each new clone
  of a repository with its own description. If you don't provide one,
  one will be generated using the username, hostname and the path.

* `describe repository description`

  Changes the description of a repository.

  The repository to describe can be specified by git remote name or
  by uuid. To change the description of the current repository, use
  "here".

* `initremote name [param=value ...]`

  Creates a new special remote, and adds it to `.git/config`.

  The remote's configuration is specified by the parameters. Different
  types of special remotes need different configuration values. The
  command will prompt for parameters as needed.

  All special remotes support encryption. You can either specify
  `encryption=none` to disable encryption, or specify
  `encryption=hybrid keyid=$keyid ...` to specify a GPG key id (or an email
  address associated with a key).

  There are actually three schemes that can be used for management of the
  encryption keys. When using the encryption=hybrid scheme, additional
  GPG keys can be given access to the encrypted special remote easily
  (without re-encrypting everything). When using encryption=shared,
  a shared key is generated and stored in the git repository, allowing
  anyone who can clone the git repository to access it. Finally, when using
  encryption=pubkey, content in the special remote is directly encrypted
  to the specified GPG keys, and additional ones cannot easily be given
  access.

  Note that with encryption enabled, a cryptographic key is created.
  This requires sufficient entropy. If initremote seems to hang or take
  a long time while generating the key, you may want to Ctrl-c it and
  re-run with `--fast`, which causes it to use a lower-quality source of
  randomness.

  Example Amazon S3 remote:

      git annex initremote mys3 type=S3 encryption=hybrid keyid=me@example.com datacenter=EU

* `enableremote name [param=value ...]`

  Enables use of an existing special remote in the current repository,
  which may be a different repository than the one in which it was
  originally created with the initremote command.

  The name of the remote is the same name used when originally
  creating that remote with "initremote". Run "git annex enableremote"
  without any name to get a list of special remote names.

  Some special remotes may need parameters to be specified every time.
  For example, the directory special remote requires a directory= parameter.

  This command can also be used to modify the configuration of an existing
  special remote, by specifying new values for parameters that were
  originally set when using initremote. (However, some settings such as
  the encryption scheme cannot be changed once a special remote
  has been created.)

  The GPG keys that an encrypted special remote is encrypted with can be
  changed using the keyid+= and keyid-= parameters. These respectively
  add and remove keys from the list. However, note that removing a key
  does NOT necessarily prevent the key's owner from accessing data
  in the encrypted special remote
  (which is by design impossible, short of deleting the remote).

  One use-case of keyid-= is to replace a revoked key with
  a new key:

      git annex enableremote mys3 keyid-=revokedkey keyid+=newkey

  Also, note that for encrypted special remotes using plain public-key
  encryption (encryption=pubkey), adding or removing a key has NO effect
  on files that have already been copied to the remote. Hence keyid+= and
  keyid-= should be used with care with such remotes, and
  make little sense except in cases like the revoked key example above.

* `numcopies [N]`

  Tells git-annex how many copies it should preserve of files, over all
  repositories. The default is 1.

  Run without a number to get the current value.

  When git-annex is asked to drop a file, it first verifies that the
  required number of copies can be satisfied among all the other
  repositories that have a copy of the file.

  This can be overridden on a per-file basis by the annex.numcopies setting
  in .gitattributes files.

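The drop check described for numcopies boils down to a single comparison. The sketch below models it under simplifying assumptions: the function name `can_drop` is hypothetical, the count of other copies is taken as given rather than verified against remotes, and trust levels are ignored.

```shell
# Hypothetical model of the drop check: dropping is allowed only when at
# least NUMCOPIES other repositories are known to hold the content.
# usage: can_drop NUMCOPIES OTHER_COPIES
can_drop() {
    [ "$2" -ge "$1" ]
}

if can_drop 2 1; then
    echo "drop allowed"
else
    echo "drop refused"    # this branch runs: 1 other copy, but 2 are required
fi
```
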
* `trust [repository ...]`

  Records that a repository is trusted to not unexpectedly lose
  content. Use with care.

  To trust the current repository, use "here".

* `untrust [repository ...]`

  Records that a repository is not trusted and could lose content
  at any time.

* `semitrust [repository ...]`

  Returns a repository to the default semitrusted state.

* `dead [repository ...]`

  Indicates that the repository has been irretrievably lost.
  (To undo, use semitrust.)

* `group repository groupname`

  Adds a repository to a group, such as "archival", "enduser", or "transfer".
  The groupname must be a single word.

  Omit the groupname to show the current groups that a repository is in.

* `ungroup repository groupname`

  Removes a repository from a group.

* `wanted repository [expression]`

  When run with an expression, configures the content that is preferred
  to be held in the repository. See PREFERRED CONTENT below.

  For example:

      git annex wanted . "include=*.mp3 or include=*.ogg"

  Without an expression, displays the current preferred content setting
  of the repository.

* `schedule repository [expression]`

  When run with an expression, configures scheduled jobs to run at a
  particular time. This can be used to make the assistant periodically run
  incremental fscks. See SCHEDULED JOBS below.

* `vicfg`

  Opens EDITOR on a temp file containing most of the above configuration
  settings, as well as a few others, and when it exits, stores any changes
  made back to the git-annex branch.

* `direct`

  Switches a repository to use direct mode, where rather than symlinks to
  files, the files are directly present in the repository.

  As part of the switch to direct mode, any changed files will be committed.

  Note that git commands that operate on the work tree will refuse to
  run in direct mode repositories. Use `git annex proxy` to safely run such
  commands.

* `indirect`

  Switches a repository back from direct mode to the default, indirect mode.

  As part of the switch from direct mode, any changed files will be committed.

# REPOSITORY MAINTENANCE COMMANDS

* `fsck [path ...]`

  With no parameters, this command checks the whole annex for consistency,
  and warns about or fixes any problems found. This is a good complement to
  `git fsck`.

  With parameters, only the specified files are checked.

  To fsck a remote, specify `--from`.

  To avoid expensive checksum calculations (and expensive transfers when
  fscking a remote), specify `--fast`.

  To start a new incremental fsck, use the `--incremental` option. Then
  the next time you fsck, you can instead use the `--more` option
  to skip over files that have already been checked, and continue
  where it left off.

  The `--incremental-schedule` option makes a new incremental fsck be
  started a configurable time after the last incremental fsck was started.
  Once the current incremental fsck has completely finished, it causes
  a new one to start.

  Maybe you'd like to run a fsck for 5 hours at night, picking up each
  night where it left off. You'd like this to continue until all files
  have been fscked. And once it's done, you'd like a new fsck pass to start,
  but no more often than once a month. Then put this in a nightly cron job:

      git annex fsck --incremental-schedule 30d --time-limit 5h

  To verify data integrity only while disregarding required number of copies,
  use `--numcopies=1`.

* `unused`

  Checks the annex for data that does not correspond to any files present
  in any tag or branch, and prints a numbered list of the data.

  To only show unused temp and bad files, specify `--fast`.

  To check for annexed data on a remote, specify `--from`.

  After running this command, you can use the `--unused` option to
  operate on all the unused data that was found. For example, to
  move all unused data to origin:

      git annex unused; git annex move --unused --to origin

* `dropunused [number|range ...]`

  Drops the data corresponding to the numbers, as listed by the last
  `git annex unused`.

  You can also specify ranges of numbers, such as "1-1000".
  Or, specify "all" to drop all unused data.

  To drop the data from a remote, specify `--from`.

* `addunused [number|range ...]`

  Adds back files for the content corresponding to the numbers or ranges,
  as listed by the last `git annex unused`. The files will have names
  starting with "unused."

* `fix [path ...]`

  Fixes up symlinks that have become broken to again point to annexed content.
  This is useful to run if you have been moving the symlinks around,
  but is done automatically when committing a change with git too.

* `upgrade`

  Upgrades the repository to the current layout.

* `forget`

  Causes the git-annex branch to be rewritten, throwing away historical
  data about past locations of files. The resulting branch will use less
  space, but `git annex log` will not be able to show where
  files used to be located.

  To also prune references to repositories that have been marked as dead,
  specify `--drop-dead`.

  When this rewritten branch is merged into other clones of
  the repository, `git-annex` will automatically perform the same rewriting
  to their local `git-annex` branches. So the forgetfulness will automatically
  propagate out from its starting point until all repositories running
  git-annex have forgotten their old history. (You may need to force
  git to push the branch to any git repositories not running git-annex.)

* `repair`

  This can repair many of the problems with git repositories that `git fsck`
  detects, but does not itself fix. It's useful if a repository has become
  badly damaged. One way this can happen is if a repository used by git-annex
  is on a removable drive that gets unplugged at the wrong time.

  This command can actually be used inside git repositories that do not
  use git-annex at all; when used in a repository using git-annex, it
  does additional repairs of the git-annex branch.

  It works by deleting any corrupt objects from the git repository, and
  retrieving all missing objects it can from the remotes of the repository.

  If that is not sufficient to fully recover the repository, it can also
  reset branches back to commits before the corruption happened, delete
  branches that are no longer available due to the lost data, and remove any
  missing files from the index. It will only do this if run with the
  `--force` option, since that rewrites history and throws out missing data.
  Note that the `--force` option never touches tags, even if they are no
  longer usable due to missing data.

  After running this command, you will probably want to run `git fsck` to
  verify it fixed the repository. Note that fsck may still complain about
  objects referenced by the reflog, or the stash, if they were unable to be
  recovered. This command does not try to clean up either the reflog or the
  stash.

  It is also a good idea to run `git annex fsck --fast` after this command,
  to make sure that the git-annex branch reflects reality.

# QUERY COMMANDS

* `find [path ...]`

  Outputs a list of annexed files in the specified path. With no path,
  finds files in the current directory and its subdirectories.

  By default, only lists annexed files whose content is currently present.
  This can be changed by specifying matching options. To list all
  annexed files, present or not, specify `--include "*"`. To list all
  annexed files whose content is not present, specify `--not --in=here`.

  To output filenames terminated with nulls, for use with xargs -0,
  specify `--print0`. Or, a custom output formatting can be specified using
  `--format`. The default output format is the same as `--format='${file}\n'`.

  These variables are available for use in formats: file, key, backend,
  bytesize, humansize, keyname, hashdirlower, hashdirmixed, mtime (for
  the mtime field of a WORM key).

* `whereis [path ...]`

  Displays information about where the contents of files are located.

* `list [path ...]`

  Displays a table of remotes that contain the contents of the specified
  files. This is similar to whereis but a more compact display. Only
  configured remotes are shown by default; specify `--allrepos` to list
  all repositories.

* `log [path ...]`

  Displays the location log for the specified file or files,
  showing each repository they were added to ("+") and removed from ("-").

  To limit how far back to search for location log changes, the options
  `--since`, `--after`, `--until`, `--before`, and `--max-count` can be specified.
  They are passed through to git log. For example, `--since "1 month ago"`.

  To generate output suitable for the gource visualization program,
  specify `--gource`.

* `info [directory|file|remote|uuid ...]`

  Displays statistics and other information for the specified item,
  which can be a directory, or a file, or a remote, or the uuid of a
  repository.

  When no item is specified, displays statistics and information
  for the repository as a whole.

  When a directory is specified, the file matching options can be used
  to select the files in the directory that are included in the statistics.

  To only show the data that can be gathered quickly, use `--fast`.

  For example, suppose you want to run "git annex get .", but
  would first like to see how much disk space that will use.
  Then run:

      git annex info --fast . --not --in here

* `version`

  Shows the version of git-annex, as well as repository version information.

* `map`

  Helps you keep track of your repositories, and the connections between them,
  by going out and looking at all the ones it can get to, and generating a
  Graphviz file displaying it all. If the `dot` command is available, it is
  used to display the file to your screen (using x11 backend). (To disable
  this display, specify `--fast`.)

  This command only connects to hosts that the host it's run on can
  directly connect to. It does not try to tunnel through intermediate hosts.
  So it might not show all connections between the repositories in the network.

  Also, if connecting to a host requires a password, you might have to enter
  it several times as the map is being built.

  Note that this subcommand can be used to graph any git repository; it
  is not limited to git-annex repositories.

# METADATA COMMANDS
|
|
|
|
* `metadata [path ...] [-s field=value -s field+=value -s field-=value ...] [-g field]`
|
|
|
|
The content of a file can have any number of metadata fields
|
|
attached to it to describe it. Each metadata field can in turn
|
|
have any number of values.
|
|
|
|
This command can be used to set metadata, or show the currently set
|
|
metadata.
|
|
|
|
To show current metadata, run without any -s parameters. The --json
|
|
option will enable json output.
|
|
|
|
To only get the value(s) of a single field, use -g field.
|
|
The values will be output one per line, with no other output, so
|
|
this is suitable for use in a script.
|
|
|
|
To set a field's value, removing any old value(s), use -s field=value.
|
|
|
|
To add an additional value, use -s field+=value.
|
|
|
|
To remove a value, use -s field-=value.
|
|
|
|
To set a value, only if the field does not already have a value,
|
|
use -s field?=value
|
|
|
|
To set a tag, use -t tag, and use -u tag to remove a tag.
|
|
|
|
For example, to set some tags on a file and also its author:
|
|
|
|
git annex metadata annexscreencast.ogv -t video -t screencast -s author+=Alice
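Because `-g` prints only the values and `field?=value` leaves an existing value alone, the two combine well in scripts. The following is a minimal sketch for a POSIX shell; the helper name `set_default_author` and the `author` field are illustrative assumptions, not part of git-annex.

```shell
# Hypothetical helper: give a file a default author only when the
# author field has no value yet, then print the resulting value(s).
set_default_author() {
    file=$1
    default=$2
    # field?=value sets the value only if the field is currently unset.
    git annex metadata "$file" -s "author?=$default" >/dev/null || return 1
    # -g prints the field's values one per line, with no other output.
    git annex metadata "$file" -g author
}
```

Calling `set_default_author annexscreencast.ogv Alice` would then print whichever author values end up recorded for the file.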
* `view [tag ...] [field=value ...] [field=glob ...] [!tag ...] [field!=value ...]`

  Uses metadata to build a view branch of the files in the current branch,
  and checks out the view branch. Only files in the current branch whose
  metadata matches all the specified field values and tags will be
  shown in the view.

  Multiple values for a metadata field can be specified, either by using
  a glob (`field="*"`) or by listing each wanted value. The resulting view
  will put files in subdirectories according to the value of their fields.

  Once within such a view, you can make additional directories, and
  copy or move files into them. When you commit, the metadata will
  be updated to correspond to your changes.

  There are fields corresponding to the path to the file. So a file
  "foo/bar/baz/file" has fields "/=foo", "foo/=bar", and "foo/bar/=baz".
  These location fields can be used the same as other metadata to construct
  the view.

  For example, `/=podcasts` will only include files from the podcasts
  directory in the view, while `podcasts/=*` will preserve the
  subdirectories of the podcasts directory in the view.

* `vpop [N]`

  Switches from the currently active view back to the previous view.
  Or, from the first view back to the original branch.

  The optional number tells how many views to pop.

* `vfilter [tag ...] [field=value ...] [!tag ...] [field!=value ...]`

  Filters the current view to only the files that have the
  specified field values and tags.

* `vadd [field=glob ...] [field=value ...] [tag ...]`

  Changes the current view, adding an additional level of directories
  to categorize the files.

  For example, when the view is by author/tag, `vadd year=*` will
  change it to year/author/tag.

  So will `vadd year=2014 year=2013`, but limiting the years in view
  to only those two.

* `vcycle`

  When a view involves nested subdirectories, this cycles the order.

  For example, when the view is by year/author/tag, `vcycle` will switch
  it to author/tag/year.
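The view commands are meant to be chained. As a rough sketch of one session, wrapped in a function so that nothing runs until it is called: the function name `view_walkthrough`, the `video` tag, and the `author` and `year` fields are placeholders for whatever metadata a repository actually carries.

```shell
# Hypothetical walk-through of the view commands described above.
view_walkthrough() {
    # Only files tagged "video", in one directory per author.
    # The glob is quoted so the shell does not expand it.
    git annex view video 'author=*' || return 1
    # Add a level of directories: the view is now by year/author.
    git annex vadd 'year=*' || return 1
    # Rotate the nesting: the view is now by author/year.
    git annex vcycle || return 1
    # Step back to the previous view; repeat, or pass a number, to
    # work back toward the original branch.
    git annex vpop
}
```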
# UTILITY COMMANDS

* `migrate [path ...]`

  Changes the specified annexed files to use the default key-value backend
  (or the one specified with `--backend`). Only files whose content
  is currently available are migrated.

  Note that the content is also still available using the old key after
  migration. Use `git annex unused` to find the old key, and
  `git annex dropunused` to remove it.

  Normally, nothing will be done to files already using the new backend.
  However, if a backend changes the information it uses to construct a key,
  this can also be used to migrate files to use the new key format.

* `reinject src dest`

  Moves the src file into the annex as the content of the dest file.
  This can be useful if you have obtained the content of a file from
  elsewhere and want to put it in the local annex.

  Automatically runs fsck on dest to check that the expected content was
  provided.

  Example:

      git annex reinject /tmp/foo.iso foo.iso

* `unannex [path ...]`

  Use this to undo an accidental `git annex add` command. It puts the
  file back how it was before the add.

  Note that for safety, the content of the file remains in the annex,
  until you use `git annex unused` and `git annex dropunused`.

  This is not the command you should use if you intentionally annexed a
  file and don't want its contents any more. In that case you should use
  `git annex drop` instead, and you can also `git rm` the file.

  Normally this does a slow copy of the file. In `--fast` mode, it
  instead makes a hard link from the file to the content in the annex.
  But use `--fast` mode with caution, because editing the file will
  change the content in the annex.

* `uninit`

  Use this to stop using git annex. It will unannex every file in the
  repository, and remove all of git-annex's other data, leaving you with a
  git repository plus the previously annexed files.

* `reinit uuid|description`

  Normally, initializing a repository generates a new, unique identifier
  (UUID) for that repository. Occasionally it may be useful to reuse a
  UUID -- for example, if a repository got deleted, and you're
  setting it back up.

  Use this with caution; it can be confusing to have two existing
  repositories with the same UUID. Also, you will probably want to run
  a fsck.

# PLUMBING COMMANDS

* `pre-commit [path ...]`

  This is meant to be called from git's pre-commit hook. `git annex init`
  automatically creates a pre-commit hook using this.

  Fixes up symlinks that are staged as part of a commit, to ensure they
  point to annexed content. Also handles injecting changes to unlocked
  files into the annex. When in a view, updates metadata to reflect changes
  made to files in the view.

* `lookupkey [file ...]`

  This plumbing-level command looks up the key used for a file in the
  index. The key is output to stdout. If there is no key (because
  the file is not present in the index, or is not a git-annex managed file),
  nothing is output, and it exits nonzero.

* `examinekey [key ...]`

  This plumbing-level command is given a key, and prints information
  that can be determined purely by looking at the key.

  To specify what information to print, use `--format`. Or use `--json`
  to get all available information in JSON format.

  The same variables can be used in the format string as can be used in
  the format string of `git annex find` (except there is no file option
  here).

  For example, the location a key's value is stored (in indirect mode)
  can be looked up by running:

      git annex examinekey --format='.git/annex/objects/${hashdirmixed}${key}/${key}'
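`lookupkey` and `examinekey` compose naturally: the first turns a file name into a key, the second turns the key into derived information. A small sketch follows, in which the helper name `annex_object_path` is hypothetical and the trailing `\n` in the format string is an addition to the example above, so that the output ends with a newline.

```shell
# Hypothetical helper: print where the content of an annexed file is
# stored in an indirect mode repository.
annex_object_path() {
    # lookupkey prints nothing and exits nonzero when the file has no key.
    key=$(git annex lookupkey "$1") || return 1
    # Single quotes keep the shell from expanding ${...} itself.
    git annex examinekey \
        --format='.git/annex/objects/${hashdirmixed}${key}/${key}\n' "$key"
}
```

For a file that is not annexed, the function returns nonzero without printing anything, mirroring the behavior of `lookupkey`.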
* `fromkey key file`

  This plumbing-level command can be used to manually set up a file
  in the git repository to link to a specified key.

* `dropkey [key ...]`

  This plumbing-level command drops the annexed data for the specified
  keys from this repository.

  This can be used to drop content for arbitrary keys, which do not need
  to have a file in the git repository pointing at them.

* `transferkey`

  This plumbing-level command is used to request a single key be
  transferred. Either the --from or the --to option can be used to specify
  the remote to use. A --file option can be used to hint at the file
  associated with the key.

* `transferkeys`

  This plumbing-level command is used by the assistant to transfer data.
  It is fed instructions about the keys to transfer using an internal
  stdio protocol, which is intentionally not documented (as it may change
  at any time).

* `setpresentkey key uuid [1|0]`

  This plumbing-level command changes git-annex's records about whether
  the specified key is present in a remote with the specified uuid.

* `rekey [file key ...]`

  This plumbing-level command is similar to migrate, but you specify
  both the file, and the new key to use for it.

  With `--force`, even files whose content is not currently available will
  be rekeyed. Use with caution.

* `findref [ref]`

  This is similar to the find command, but instead of finding files in the
  current work tree, it finds files in the specified git ref.

  Most MATCHING OPTIONS can be used with findref, to limit the files it
  finds. However, the --include and --exclude options will not work.

* `proxy -- git cmd [options]`

  Only useful in a direct mode repository, this runs the specified git
  command with a temporary work tree, and updates the working tree to
  reflect any changes staged or committed by the git command.

  For example, to revert the most recent change that was committed
  to the repository:

      git annex proxy -- git revert HEAD

  To check out a past version of the repository:

      git annex proxy -- git checkout HEAD^^

  To rename a directory:

      git annex proxy -- git mv mydir newname

* `resolvemerge`

  Resolves a conflicted merge, by adding both conflicting versions of the
  file to the tree, using variants of their filename. This is done
  automatically when using `git annex sync` or `git annex merge`.

  Note that only merge conflicts that involve an annexed file are resolved.
  Merge conflicts between two files that are not annexed will not be
  automatically resolved.

* `diffdriver`

  This is an external git diff driver shim. Normally, when using `git diff`
  with an external git driver, the symlinks to annexed files are not set up
  right, so the external git driver cannot read them in order to perform
  smart diffing of their contents. This command works around the problem,
  by passing the fixed up files to the real external diff driver.

  To use, just configure git to use "git-annex diffdriver -- cmd params --"
  as the external diff command, where cmd is the real external diff
  command you want to use, and params are any extra parameters to pass
  to it. Note the trailing "--", which is required.

  For example, set `GIT_EXTERNAL_DIFF=git-annex diffdriver -- j-c-diff --`

* `remotedaemon`

  Detects when network remotes have received git pushes and fetches from them.

* `xmppgit`

  This command is used internally to perform git pulls over XMPP.

# TESTING COMMANDS

* `test`

  This runs git-annex's built-in test suite.

  There are several parameters, provided by Haskell's tasty test framework.
  Pass --help for details.

* `testremote remote`

  This tests a remote by generating some random objects and sending them to
  the remote, then redownloading them, removing them from the remote, etc.

  It's safe to run in an existing repository (the repository contents are
  not altered), although it may perform expensive data transfers.

  To perform a smaller set of tests, use --fast.

  The --size option can be used to tune the size of the generated objects.

  Testing a single remote will use the remote's configuration,
  automatically varying the chunk sizes, and with simple shared encryption
  enabled and disabled.

* `fuzztest`

  Generates random changes to files in the current repository,
  for use in testing the assistant. This is dangerous, so it will not
  do anything unless run with `--force`.

# OPTIONS

* `--force`

  Force unsafe actions, such as dropping a file's content when no other
  source of it can be verified to still exist, or adding ignored files.
  Use with care.

* `--fast`

  Enable less expensive, but also less thorough versions of some commands.
  What is avoided depends on the command.

* `--auto`

  Enable automatic mode. Commands that get, drop, or move file contents
  will only do so when needed to help satisfy the setting of numcopies,
  and preferred content configuration.

* `--all`

  Operate on all data that has been stored in the git annex,
  including old versions of files. This is the default behavior when
  running git-annex in a bare repository; in a non-bare repository the
  normal behavior is to only operate on specified files in the working
  tree.

* `--unused`

  Operate on all data that has been determined to be unused by
  a previous run of `git-annex unused`.

* `--key=key`

  Operate on only the specified key.

* `--quiet`

  Avoid the default verbose display of what is done; only show errors
  and progress displays.

* `--verbose`

  Enable verbose display.

* `--json`

  Rather than the normal output, generate JSON. This is intended to be
  parsed by programs that use git-annex. Each line of output is a JSON
  object. Note that JSON output is only usable with some git-annex commands,
  like info, find, whereis, and metadata.

* `--debug`

  Show debug messages.

* `--no-debug`

  Disable debug messages.

* `--from=repository`

  Specifies a repository that content will be retrieved from, or that
  should otherwise be acted on.

  It should be specified using the name of a configured remote.

* `--to=repository`

  Specifies a repository that content will be sent to.

  It should be specified using the name of a configured remote.

* `--numcopies=n`

  Overrides the numcopies setting, forcing git-annex to ensure the
  specified number of copies exist.

  Note that setting numcopies to 0 is very unsafe.

* `--time-limit=time`

  Limits how long a git-annex command runs. The time can be something
  like "5h", or "30m" or even "45s" or "10d".

  Note that git-annex may continue running a little past the specified
  time limit, in order to finish processing a file.

  Also, note that if the time limit prevents git-annex from doing all it
  was asked to, it will exit with a special code, 101.
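The special exit code lets a script tell "ran out of time" apart from a real failure. A minimal sketch follows; the wrapper name `get_with_limit` and its messages are hypothetical.

```shell
# Hypothetical wrapper around a time-limited `git annex get`.
get_with_limit() {
    limit=$1
    shift
    git annex get --time-limit="$limit" "$@"
    status=$?
    case $status in
        0)   echo "done" ;;
        # 101 means the time limit stopped git-annex before it finished.
        101) echo "time limit reached; run again to continue" ;;
        *)   echo "failed with exit status $status" >&2 ;;
    esac
    return "$status"
}
```

A nightly job might call `get_with_limit 30m .` and treat status 101 as a signal to resume on the next run rather than as an error.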
* `--trust=repository`
* `--semitrust=repository`
* `--untrust=repository`

  Overrides trust settings for a repository. May be specified more than once.

  The repository should be specified using the name of a configured remote,
  or the UUID or description of a repository.

* `--trust-glacier-inventory`

  Amazon Glacier inventories take hours to retrieve, and may not represent
  the current state of a repository. So git-annex does not trust that
  files that the inventory claims are in Glacier are really there.
  This switch can be used to allow it to trust the inventory.

  Be careful using this, especially if you or someone else might have recently
  removed a file from Glacier. If you try to drop the only other copy of the
  file, and this switch is enabled, you could lose data!

* `--backend=name`

  Specifies which key-value backend to use. This can be used when
  adding a file to the annex, or migrating a file. Once files
  are in the annex, their backend is known and this option is not
  necessary.

* `--format=value`

  Specifies a custom output format. The value is a format string,
  in which '${var}' is expanded to the value of a variable. To right-justify
  a variable with whitespace, use '${var;width}'; to left-justify
  a variable, use '${var;-width}'; to escape unusual characters in a variable,
  use '${escaped_var}'.

  Also, '\\n' is a newline, '\\000' is a NUL, etc.
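When a format string is typed in a shell, it needs single quotes so that the shell leaves `${...}` for git-annex to expand. A small sketch, assuming the `key` and `file` variables of `git annex find`; the helper name `list_keys_and_files` and the column width of 70 are arbitrary choices for illustration.

```shell
# Hypothetical helper: list each matched file next to its key, with
# the key left-justified in a 70 character column.
list_keys_and_files() {
    # Single quotes keep the shell from expanding ${...} itself;
    # git-annex turns the \n into a newline.
    git annex find --format='${key;-70} ${file}\n' "$@"
}
```

Any matching options or paths are passed through, for example `list_keys_and_files --in=here`.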
* `--user-agent=value`

  Overrides the User-Agent to use when downloading files from the web.

* `--notify-finish`

  Causes a desktop notification to be displayed after each successful
  file download and upload.

  (Only supported on some platforms, e.g. Linux with dbus. A no-op when
  not supported.)

* `--notify-start`

  Causes a desktop notification to be displayed when a file upload
  or download has started, or when a file is dropped.

* `-c name=value`

  Overrides git configuration settings. May be specified multiple times.

# MATCHING OPTIONS

These options can all be specified multiple times, and can be combined to
limit which files git-annex acts on.

Arbitrarily complicated expressions can be built using these options.
For example:

    --exclude '*.mp3' --and --not -( --in=usbdrive --or --in=archive -)

The above example prevents git-annex from working on mp3 files, and on
files whose contents are present at either of two repositories.

* `--exclude=glob`

  Skips files matching the glob pattern. The glob is matched relative to
  the current directory. For example:

      --exclude='*.mp3' --exclude='subdir/*'

  Note that this will not match anything when using --all or --unused.

* `--include=glob`

  Skips files not matching the glob pattern. (Same as `--not --exclude`.)
  For example, to include only mp3 and ogg files:

      --include='*.mp3' --or --include='*.ogg'

  Note that this will not skip anything when using --all or --unused.

* `--in=repository`

  Matches only files that git-annex believes have their contents present
  in a repository. Note that it does not check the repository to verify
  that it still has the content.

  The repository should be specified using the name of a configured remote,
  or the UUID or description of a repository. For the current repository,
  use `--in=here`.

* `--in=repository@{date}`

  Matches files currently in the work tree whose content was present in
  the repository on the given date.

  The date is specified in the same syntax documented in
  gitrevisions(7). Note that this uses the reflog, so dates far in the
  past cannot be queried.

  For example, you might need to run `git annex drop .` to temporarily
  free up disk space. The next day, you can get back the files you dropped
  using `git annex get . --in=here@{yesterday}`.

* `--copies=number`

  Matches only files that git-annex believes to have the specified number
  of copies, or more. Note that it does not check remotes to verify that
  the copies still exist.

* `--copies=trustlevel:number`

  Matches only files that git-annex believes have the specified number of
  copies, on remotes with the specified trust level. For example,
  `--copies=trusted:2`

  To match any trust level at or higher than a given level,
  use 'trustlevel+'. For example, `--copies=semitrusted+:2`

* `--copies=groupname:number`

  Matches only files that git-annex believes have the specified number of
  copies, on remotes in the specified group. For example,
  `--copies=archive:2`

* `--lackingcopies=number`

  Matches only files that git-annex believes need the specified number or
  more additional copies to be made in order to satisfy their numcopies
  settings.

* `--approxlackingcopies=number`

  Like lackingcopies, but does not look at .gitattributes annex.numcopies
  settings. This makes it significantly faster.

* `--inbackend=name`

  Matches only files whose content is stored using the specified key-value
  backend.

* `--inallgroup=groupname`

  Matches only files that git-annex believes are present in all repositories
  in the specified group.

* `--smallerthan=size`
* `--largerthan=size`

  Matches only files whose content is smaller than, or larger than the
  specified size.

  The size can be specified with any commonly used units, for example,
  "0.5 gb" or "100 KiloBytes".

* `--metadata field=glob`

  Matches only files that have a metadata field attached with a value that
  matches the glob. The values of metadata fields are matched case
  insensitively.

* `--want-get`

  Matches files that the preferred content settings for the repository
  make it want to get. Note that this will match even files that are
  already present, unless limited with e.g., `--not --in .`

  Note that this will not match anything when using --all or --unused.

* `--want-drop`

  Matches files that the preferred content settings for the repository
  make it want to drop. Note that this will match even files that have
  already been dropped, unless limited with e.g., `--in .`

  Note that this will not match anything when using --all or --unused.

* `--not`

  Inverts the next matching option. For example, to only act on
  files with fewer than 3 copies, use `--not --copies=3`

* `--and`

  Requires that both the previous and the next matching option match.
  The default.

* `--or`

  Requires that either the previous, or the next matching option matches.

* `-(`

  Opens a group of matching options.

* `-)`

  Closes a group of matching options.
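One practical point when typing grouped expressions: `(` and `)` are special characters in POSIX shells, so `-(` and `-)` have to be quoted or escaped there. A minimal sketch follows, in which the helper name `find_unarchived` is hypothetical and `usbdrive` and `archive` are placeholder remote names.

```shell
# Hypothetical helper: list files whose content git-annex believes is
# present in neither the "usbdrive" nor the "archive" remote.
find_unarchived() {
    # '-(' and '-)' are quoted so the shell passes them through as
    # ordinary arguments instead of treating the parentheses as syntax.
    git annex find "$@" --not '-(' --in=usbdrive --or --in=archive '-)'
}
```

Extra matching options can be passed in front of the group, for example `find_unarchived --exclude='*.mp3'`.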
# PREFERRED CONTENT

Each repository has a preferred content setting, which specifies content
that the repository wants to have present. These settings can be configured
using `git annex vicfg` or `git annex wanted`.
They are used by the `--auto` option, and by the git-annex assistant.

The preferred content settings are similar, but not identical to
the matching options specified above, just without the dashes.
For example:

    exclude=archive/* and (include=*.mp3 or smallerthan=1mb)

The main differences are that `exclude=` and `include=` always
match relative to the top of the git repository, and that there is
no equivalent to `--in`.

When a repository is in one of the standard predefined groups, like "backup"
and "client", setting its preferred content to "standard" will use a
built-in preferred content expression developed for that group.
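As a sketch of how an expression might be applied from a script: the helper name `want_here` is hypothetical, and the use of `here` to name the current repository with `git annex wanted` is an assumption carried over from `--in=here`.

```shell
# Hypothetical helper: set the preferred content of the current
# repository, then fetch whatever it now wants.
want_here() {
    # The expression contains globs and parentheses, so it is always
    # passed as one quoted argument.
    git annex wanted here "$1" || return 1
    # --auto limits get to content the preferred content and numcopies
    # settings call for.
    git annex get --auto .
}
```

For example: `want_here 'exclude=archive/* and (include=*.mp3 or smallerthan=1mb)'`.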
# SCHEDULED JOBS

The git-annex assistant daemon can be configured to run scheduled jobs.
This is similar to cron and anacron (and you can use them if you prefer),
but has the advantage of being integrated into git-annex, and so being able
to e.g., fsck a repository on a removable drive when the drive gets
connected.

The scheduled jobs can be configured using `git annex vicfg` or
`git annex schedule`.

These actions are available: "fsck self", "fsck UUID" (where UUID
is the UUID of a remote to fsck). After the action comes the duration
to allow the action to run, and finally the schedule of when to run it.

To schedule multiple jobs, separate them with "; ".

Some examples:

    fsck self 30m every day at any time
    fsck self 1h every month at 3 AM
    fsck self 1h on day 1 of every month at any time
    fsck self 1h every week divisible by 2 at any time
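Since several jobs are given as one string separated by "; ", a script can assemble that string from individual jobs. A minimal sketch, assuming `git annex schedule` takes a repository followed by the expression; the helper name `schedule_jobs` is hypothetical.

```shell
# Hypothetical helper: join any number of jobs with "; " and hand the
# result to `git annex schedule` for the given repository.
schedule_jobs() {
    repo=$1
    shift
    joined=
    for job in "$@"; do
        # Prefix every job after the first with the "; " separator.
        joined="${joined:+$joined; }$job"
    done
    git annex schedule "$repo" "$joined"
}
```

For example: `schedule_jobs here "fsck self 30m every day at any time" "fsck self 1h every month at 3 AM"`.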
# CONFIGURATION VIA .git/config

Like other git commands, git-annex is configured via `.git/config`.
Here are all the supported configuration settings.

* `annex.uuid`

  A unique UUID for this repository (automatically set).

* `annex.backends`

  Space-separated list of names of the key-value backends to use.
  The first listed is used to store new files by default.

* `annex.diskreserve`

  Amount of disk space to reserve. Disk space is checked when transferring
  content to avoid running out, and additional free space can be reserved
  via this option, to make space for more important content (such as git
  commit logs). Can be specified with any commonly used units, for example,
  "0.5 gb", "500M", or "100 KiloBytes"

  The default reserve is 1 megabyte.

* `annex.largefiles`

  Allows configuring which files `git annex add` and the assistant consider
  to be large enough to need to be added to the annex. By default,
  all files are added to the annex.

  The value is a preferred content expression. See PREFERRED CONTENT
  for details.

  Example:

      annex.largefiles = largerthan=100kb and not (include=*.c or include=*.h)

* `annex.numcopies`

  This is a deprecated setting. You should instead use the
  `git annex numcopies` command to configure how many copies of files
  are kept across all repositories.

  This config setting is only looked at when `git annex numcopies` has
  never been configured.

  Note that setting numcopies to 0 is very unsafe.

* `annex.genmetadata`

  Set this to `true` to make git-annex automatically generate some metadata
  when adding files to the repository.

  In particular, it stores year and month metadata, from the file's
  modification date.

  When importfeed is used, it stores additional metadata from the feed.

* `annex.queuesize`

  git-annex builds a queue of git commands, in order to combine similar
  commands for speed. By default the size of the queue is limited to
  10240 commands; this can be used to change the size. If you have plenty
  of memory and are working with very large numbers of files, increasing
  the queue size can speed it up.

* `annex.bloomcapacity`

  The `git annex unused` command uses a bloom filter to determine
  what data is no longer used. The default bloom filter is sized to handle
  up to 500000 keys. If your repository is larger than that,
  you can adjust this to avoid `git annex unused` not noticing some unused
  data files. Increasing this will make `git-annex unused` consume more memory;
  run `git annex info` for memory usage numbers.

* `annex.bloomaccuracy`

  Adjusts the accuracy of the bloom filter used by
  `git annex unused`. The default accuracy is 1000 --
  1 unused file out of 1000 will be missed by `git annex unused`. Increasing
  the accuracy will make `git annex unused` consume more memory;
  run `git annex info` for memory usage numbers.

* `annex.sshcaching`

  By default, git-annex caches ssh connections using ssh's
  ControlMaster and ControlPersist settings
  (if built using a new enough ssh). To disable this, set to `false`.

* `annex.alwayscommit`

  By default, git-annex automatically commits data to the git-annex branch
  after each command is run. If you have a series
  of commands that you want to make a single commit, you can
  run the commands with `-c annex.alwayscommit=false`. You can later
  commit the data by running `git annex merge` (or by automatic merges)
  or `git annex sync`.
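A sketch of that pattern, batching two commands into a single git-annex branch commit: the helper name `add_tagged` is hypothetical, and passing `-c` after the subcommand assumes it is accepted there like the other common options.

```shell
# Hypothetical helper: add files and tag them, deferring the commit to
# the git-annex branch until both steps are done.
add_tagged() {
    tag=$1
    shift
    git annex add -c annex.alwayscommit=false "$@" || return 1
    git annex metadata -c annex.alwayscommit=false -t "$tag" "$@" || return 1
    # Commit everything that was held back above.
    git annex merge
}
```

For example: `add_tagged video talk1.ogv talk2.ogv`.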
* `annex.hardlink`

  Set this to `true` to make file contents be hard linked into the
  repository when possible, instead of a more expensive copy.

  Use with caution -- this can invalidate numcopies counting, since
  with hard links, fewer copies of a file can exist. So, it is a good
  idea to mark a repository using this setting as untrusted.

  When a repository is set up using `git clone --shared`, git-annex init
  will automatically set annex.hardlink and mark the repository as
  untrusted.

* `annex.delayadd`

  Makes the watch and assistant commands delay for the specified number of
  seconds before adding a newly created file to the annex. Normally this
  is not needed, because they already wait for all writers of the file
  to close it. On Mac OS X, when not using direct mode this defaults to
  1 second, to work around a bad interaction with software there.

* `annex.expireunused`

  Controls what the assistant does about unused file contents
  that are stored in the repository.

  The default is `false`, which causes
  all old and unused file contents to be retained, unless the assistant
  is able to move them to some other repository (such as a backup repository).

  Can be set to a time specification, like "7d" or "1m", and then
  file contents that have been known to be unused for a week or a
  month will be deleted.

* `annex.fscknudge`

  When set to false, prevents the webapp from reminding you when using
  repositories that lack consistency checks.

* `annex.autoupgrade`

  When set to ask (the default), the webapp will check for new versions
  and prompt if they should be upgraded to. When set to true, automatically
  upgrades without prompting (on some supported platforms). When set to
  false, disables any upgrade checking.

  Note that upgrade checking is only done when git-annex is installed
  from one of the prebuilt images from its website. This does not
  bypass e.g., a Linux distribution's own upgrade handling code.

  This setting also controls whether to restart the git-annex assistant
  when the git-annex binary is detected to have changed. That is useful
  no matter how you installed git-annex.

* `annex.autocommit`

  Set to false to prevent the git-annex assistant from automatically
  committing changes to files in the repository.

* `annex.startupscan`

  Set to false to prevent the git-annex assistant from scanning the
  repository for new and changed files on startup. This will prevent it
  from noticing changes that were made while it was not running, but can be
  a useful performance tweak for a large repository.

* `annex.listen`

  Configures which address the webapp listens on. The default is localhost.
  Can be either an IP address, or a hostname that resolves to the desired
  address.

* `annex.debug`

  Set to true to enable debug logging by default.

* `annex.version`

  Automatically maintained, and used to automate upgrades between versions.

* `annex.direct`

  Set to true when the repository is in direct mode. Should not be set
  manually; use the "git annex direct" and "git annex indirect" commands
  instead.

* `annex.crippledfilesystem`

  Set to true if the repository is on a crippled filesystem, such as FAT,
  which does not support symbolic links, or hard links, or unix permissions.
  This is automatically probed by "git annex init".

* `remote.<name>.annex-cost`

  When determining which repository to
  transfer annexed files from or to, ones with lower costs are preferred.
  The default cost is 100 for local repositories, and 200 for remote
  repositories.

* `remote.<name>.annex-cost-command`

  If set, the command is run, and the number it outputs is used as the cost.
  This allows varying the cost based on e.g., the current network. The
  cost-command can be any shell command line.
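As an illustration of what a cost command could look like, here is a minimal sketch that prints a low cost when some probe command succeeds and a high cost otherwise. The function name `print_cost` and the costs 150 and 400 are arbitrary assumptions; only the convention of printing a number comes from the description above.

```shell
# Hypothetical cost helper: run the probe command given as arguments
# and print a cost for git-annex to read.
print_cost() {
    if "$@" >/dev/null 2>&1; then
        echo 150    # probe succeeded: cheaper than the default remote cost of 200
    else
        echo 400    # probe failed: make this remote a last resort
    fi
}
```

Saved in an executable script that ends with a call such as `print_cost ping -c 1 fileserver.lan` (a placeholder host name), it could be wired up with `git config remote.server.annex-cost-command /path/to/script`, where both the remote name `server` and the script path are placeholders.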
|
|
|
|
* `remote.<name>.annex-start-command`
|
|
|
|
A command to run when git-annex begins to use the remote. This can
|
|
be used to, for example, mount the directory containing the remote.
|
|
|
|
The command may be run repeatedly when multiple git-annex processes
|
|
are running concurrently.
|
|
|
|
* `remote.<name>.annex-stop-command`
|
|
|
|
A command to run when git-annex is done using the remote.
|
|
|
|
The command will only be run once *all* running git-annex processes
|
|
are finished using the remote.
|
|
|
|
* `remote.<name>.annex-shell`
|
|
|
|
Specify an alternative git-annex-shell executable on the remote
|
|
instead of looking for "git-annex-shell" on the PATH.
|
|
|
|
This is useful if the git-annex-shell program is outside the PATH
|
|
or has a non-standard name.
|
|
|
|
* `remote.<name>.annex-ignore`
|
|
|
|
If set to `true`, prevents git-annex
|
|
from storing file contents on this remote by default.
|
|
(You can still request it be used by the `--from` and `--to` options.)
|
|
|
|
This is, for example, useful if the remote is located somewhere
|
|
without git-annex-shell. (For example, if it's on GitHub).
|
|
Or, it could be used if the network connection between two
|
|
repositories is too slow to be used normally.
|
|
|
|
This does not prevent git-annex sync (or the git-annex assistant) from
|
|
syncing the git repository to the remote.
|
|
|
|
* `remote.<name>.annex-sync`
|
|
|
|
If set to `false`, prevents git-annex sync (and the git-annex assistant)
|
|
from syncing with this remote.
|
|
|
|
* `remote.<name>.annex-readonly`

  If set to `true`, prevents git-annex from making changes to a remote.
  This prevents git-annex sync from pushing changes, and also prevents
  storing files on or removing files from the read-only remote.

* `remote.<name>.annexUrl`

  Can be used to specify a different url than the regular `remote.<name>.url`
  for git-annex to use when talking with the remote. Similar to the `pushUrl`
  used by git-push.

* `remote.<name>.annex-uuid`

  git-annex caches UUIDs of remote repositories here.

* `remote.<name>.annex-trustlevel`

  Configures a local trust level for the remote. This overrides the value
  configured by the trust and untrust commands. The value can be any of
  "trusted", "semitrusted" or "untrusted".

* `remote.<name>.annex-availability`

  Can be used to tell git-annex whether a remote is LocallyAvailable
  or GloballyAvailable. Normally, git-annex determines this automatically.

* `remote.<name>.annex-bare`

  Can be used to tell git-annex if a remote is a bare repository
  or not. Normally, git-annex determines this automatically.

* `remote.<name>.annex-ssh-options`

  Options to use when using ssh to talk to this remote.

* `remote.<name>.annex-rsync-options`

  Options to use when using rsync
  to or from this remote. For example, to force IPv6, and limit
  the bandwidth to 100Kbyte/s, set it to `-6 --bwlimit 100`.

* `remote.<name>.annex-rsync-upload-options`

  Options to use when using rsync to upload a file to a remote.

  These options are passed after other applicable rsync options,
  so can be used to override them. For example, to limit upload bandwidth
  to 10Kbyte/s, set `--bwlimit 10`.

* `remote.<name>.annex-rsync-download-options`

  Options to use when using rsync to download a file from a remote.

  These options are passed after other applicable rsync options,
  so can be used to override them.

* `remote.<name>.annex-rsync-transport`

  The remote shell to use to connect to the rsync remote. Possible
  values are `ssh` (the default) and `rsh`, together with their
  arguments, for instance `ssh -p 2222 -c blowfish`. Note that the
  remote hostname should not appear there; see rsync(1) for details.
  When the transport used is `ssh`, connections are automatically cached
  unless `annex.sshcaching` is set to `false`.

* `remote.<name>.annex-bup-split-options`

  Options to pass to bup split when storing content in this remote.
  For example, to limit the bandwidth to 100Kbyte/s, set it to `--bwlimit 100k`.
  (There is no corresponding option for bup join.)

* `remote.<name>.annex-gnupg-options`

  Options to pass to GnuPG for symmetric encryption. For instance, to
  use the AES cipher with a 256-bit key and disable compression, set it
  to `--cipher-algo AES256 --compress-algo none`. (These options take
  precedence over the default GnuPG configuration, which is otherwise
  used.)

* `annex.ssh-options`, `annex.rsync-options`,
  `annex.rsync-upload-options`, `annex.rsync-download-options`,
  `annex.bup-split-options`, `annex.gnupg-options`

  Default options to use if a remote does not have more specific options
  as described above.

* `annex.web-options`

  Options to pass when running wget or curl.
  For example, to force IPv4 only, set it to `-4`.

* `annex.quvi-options`

  Options to pass to quvi when using it to find the url to download for a
  video.

* `annex.aria-torrent-options`

  Options to pass to aria2c when using it to download a torrent.

* `annex.http-headers`

  HTTP headers to send when downloading from the web. Multiple lines of
  this option can be set, one per header.

* `annex.http-headers-command`

  If set, the command is run and each line of its output is used as an HTTP
  header. This overrides `annex.http-headers`.

* `annex.web-download-command`

  Specifies a command to run to download a file from the web.
  (The default is to use wget or curl.)

  In the command line, `%url` is replaced with the url to download,
  and `%file` is replaced with the file that it should be saved to.

* `annex.secure-erase-command`

  This can be set to a command that should be run whenever git-annex
  removes the content of a file from the repository.

  In the command line, `%file` is replaced with the file that should be
  erased.

  For example, to use the wipe command, set it to `wipe -f %file`.

* `remote.<name>.rsyncurl`

  Used by rsync special remotes, this configures
  the location of the rsync repository to use. Normally this is automatically
  set up by `git annex initremote`, but you can change it if needed.

* `remote.<name>.buprepo`

  Used by bup special remotes, this configures
  the location of the bup repository to use. Normally this is automatically
  set up by `git annex initremote`, but you can change it if needed.

* `remote.<name>.ddarrepo`

  Used by ddar special remotes, this configures
  the location of the ddar repository to use. Normally this is automatically
  set up by `git annex initremote`, but you can change it if needed.

* `remote.<name>.directory`

  Used by directory special remotes, this configures
  the location of the directory where annexed files are stored for this
  remote. Normally this is automatically set up by `git annex initremote`,
  but you can change it if needed.

* `remote.<name>.s3`

  Used to identify Amazon S3 special remotes.
  Normally this is automatically set up by `git annex initremote`.

* `remote.<name>.glacier`

  Used to identify Amazon Glacier special remotes.
  Normally this is automatically set up by `git annex initremote`.

* `remote.<name>.webdav`

  Used to identify webdav special remotes.
  Normally this is automatically set up by `git annex initremote`.

* `remote.<name>.tahoe`

  Used to identify tahoe special remotes.
  Points to the configuration directory for tahoe.

* `remote.<name>.annex-xmppaddress`

  Used to identify the XMPP address of a Jabber buddy.
  Normally this is set up by the git-annex assistant when pairing over XMPP.

* `remote.<name>.gcrypt`

  Used to identify gcrypt special remotes.
  Normally this is automatically set up by `git annex initremote`.

  It is set to "true" if this is a gcrypt remote.
  If the gcrypt remote is accessible over ssh and has git-annex-shell
  available to manage it, it's set to "shell".

* `remote.<name>.hooktype`, `remote.<name>.externaltype`

  Used by hook special remotes and external special remotes to record
  the type of the remote.

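As a rough illustration of the `remote.<name>.annex-cost-command`,
`annex-start-command` and `annex-stop-command` settings described above, the
following sketch configures a removable drive. The remote name `usbdrive`, the
path `/media/usb/annex`, and the cost values 50 and 400 are assumptions chosen
for the example. It also assumes the mount point can be mounted by the current
user without extra arguments.

```shell
# Sketch only: the remote name "usbdrive" and the path /media/usb/annex are
# assumptions for illustration, not values git-annex requires.

# Print a low cost when the drive's directory is present, a high cost otherwise.
# The directory to test can be passed as the first argument.
usbdrive_cost() {
    if [ -d "${1:-/media/usb/annex}" ]; then
        echo 50
    else
        echo 400
    fi
}

# Store equivalent settings in the current repository's git config.
# Defined as a function so that nothing is changed until it is called.
configure_usbdrive() {
    git config remote.usbdrive.annex-cost-command \
        'test -d /media/usb/annex && echo 50 || echo 400'
    git config remote.usbdrive.annex-start-command 'mount /media/usb'
    git config remote.usbdrive.annex-stop-command 'umount /media/usb'
}
```

With these values, the drive would be preferred over repositories with the
default costs (100 and 200) while it is mounted, and tried last otherwise.
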
# CONFIGURATION VIA .gitattributes

The key-value backend used when adding a new file to the annex can be
configured on a per-file-type basis via `.gitattributes` files. In the file,
the `annex.backend` attribute can be set to the name of the backend to
use. For example, here's how to use the WORM backend by default,
but the SHA256E backend for ogg files:

    * annex.backend=WORM
    *.ogg annex.backend=SHA256E

The numcopies setting can also be configured on a per-file-type basis via
the `annex.numcopies` attribute in `.gitattributes` files. This overrides
other numcopies settings.
For example, this requires 2 copies of wav files and 3 copies
of flac files:

    *.wav annex.numcopies=2
    *.flac annex.numcopies=3

Note that setting numcopies to 0 is very unsafe.

These settings are honored by git-annex whenever it's operating on a
matching file. However, when using `--all`, `--unused`, or `--key` to specify
keys to operate on, git-annex is operating on keys and not files, so it will
not honor the settings from `.gitattributes`.

Also note that when using views, only the toplevel `.gitattributes` file is
preserved in the view, so settings in other `.gitattributes` files won't
have any effect.

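To check which of these attributes apply to a given path, git's own
`git check-attr` command can be used. The sketch below is illustrative only:
it builds a scratch repository in a temporary directory, the helper name
`check_attrs` and the file names are made up, and it shows what git reports
for the attributes rather than anything git-annex itself prints.

```shell
# Scratch repository in a temporary directory (assumed to be throwaway).
repo=$(mktemp -d)
git init -q "$repo"

# The attribute lines from the examples above.
cat > "$repo/.gitattributes" <<'EOF'
* annex.backend=WORM
*.ogg annex.backend=SHA256E
*.wav annex.numcopies=2
*.flac annex.numcopies=3
EOF

# Ask git which values apply to some path names; the files need not exist.
check_attrs() {
    git -C "$repo" check-attr annex.backend annex.numcopies -- "$@"
}

check_attrs song.ogg song.flac
```

When several lines match a path, the later line wins for a given attribute,
which is why `song.ogg` gets the SHA256E backend rather than WORM.
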
# FILES

These files are used by git-annex:

`.git/annex/objects/` in your git repository contains the annexed file
contents that are currently available. Annexed files in your git
repository symlink to that content.

`.git/annex/` in your git repository contains other run-time information
used by git-annex.

`~/.config/git-annex/autostart` is a list of git repositories
to start the git-annex assistant in.

`.git/hooks/pre-commit-annex` in your git repository will be run whenever
a commit is made, either by git commit, git-annex sync, or the git-annex
assistant.

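A minimal sketch of installing a `pre-commit-annex` hook follows. The helper
name `install_pre_commit_annex` and the hook body, which only prints a
message, are placeholders. The sketch assumes that the hook, like other git
hooks, must be an executable file.

```shell
# Sketch only: writes a placeholder hook into the repository given as $1.
install_pre_commit_annex() {
    repo=$1
    hook="$repo/.git/hooks/pre-commit-annex"
    mkdir -p "$repo/.git/hooks"
    # The quoted heredoc writes the hook body exactly as shown.
    cat > "$hook" <<'EOF'
#!/bin/sh
# Placeholder action: report that the hook ran.
echo "pre-commit-annex ran" >&2
EOF
    # Mark the hook executable (assumed to be required, as for git hooks).
    chmod +x "$hook"
}
```

It could be called as `install_pre_commit_annex /path/to/repo`, where the
path is that of an existing git-annex repository.
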
# SEE ALSO

Most of git-annex's documentation is available on its web site,
<http://git-annex.branchable.com/>

If git-annex is installed from a package, a copy of its documentation
should be included in, for example, `/usr/share/doc/git-annex/`.

# AUTHOR

Joey Hess <id@joeyh.name>

<http://git-annex.branchable.com/>

Warning: Automatically converted into a man page by mdwn2man. Edit with care.