# NAME
|
|
|
|
git-annex - manage files with git, without checking their contents in
|
|
|
|
# SYNOPSIS
|
|
|
|
git annex command [params ...]
|
|
|
|
# DESCRIPTION
|
|
|
|
git-annex allows managing files with git, without checking the file
|
|
contents into git. While that may seem paradoxical, it is useful when
|
|
dealing with files larger than git can currently easily handle, whether due
|
|
to limitations in memory, checksumming time, or disk space.
|
|
|
|
Even without file content tracking, being able to manage files with git,
|
|
move files around and delete files with versioned directory trees, and use
|
|
branches and distributed clones, are all very handy reasons to use git. And
|
|
annexed files can co-exist in the same git repository with regularly
|
|
versioned files, which is convenient for maintaining documents, Makefiles,
|
|
etc that are associated with annexed files but that benefit from full
|
|
revision control.
|
|
|
|
When a file is annexed, its content is moved into a key-value store, and
|
|
a symlink is made that points to the content. These symlinks are checked into
|
|
git and versioned like regular files. You can move them around, delete
|
|
them, and so on. Pushing to another git repository will make git-annex
|
|
there aware of the annexed file, and it can be used to retrieve its
|
|
content from the key-value store.
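
The mechanism described above can be illustrated with plain shell commands. This is a conceptual sketch only: the `store/` directory and the `CKSUM-` key naming are made up for illustration and are not git-annex's actual on-disk layout or key format.

```shell
# Conceptual sketch: NOT git-annex's real layout or key format.
set -eu
work=$(mktemp -d)
cd "$work"
mkdir -p store
printf 'big file content\n' > video.mov

# Derive a key from the content (hypothetical naming scheme).
key="CKSUM-$(cksum < video.mov | cut -d' ' -f1)"

# Move the content into the store and leave a symlink behind.
mv video.mov "store/$key"
ln -s "store/$key" video.mov

# The symlink is tiny and can be versioned; reading through it
# still yields the original content.
link_target=$(readlink video.mov)
content=$(cat video.mov)
echo "$link_target"
```

In this picture, git would only ever record the small symlink, while the large content stays in the store.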
|
|
|
|
# EXAMPLES
|
|
|
|
# git annex get video/hackity_hack_and_kaxxt.mov
|
|
get video/hackity_hack_and_kaxxt.mov (not available)
|
|
I was unable to access these remotes: server
|
|
Try making some of these repositories available:
|
|
5863d8c0-d9a9-11df-adb2-af51e6559a49 -- my home file server
|
|
58d84e8a-d9ae-11df-a1aa-ab9aa8c00826 -- portable USB drive
|
|
ca20064c-dbb5-11df-b2fe-002170d25c55 -- backup SATA drive
|
|
failed
|
|
# sudo mount /media/usb
|
|
# git remote add usbdrive /media/usb
|
|
# git annex get video/hackity_hack_and_kaxxt.mov
|
|
get video/hackity_hack_and_kaxxt.mov (from usbdrive...) ok
|
|
|
|
# git annex add iso
|
|
add iso/Debian_5.0.iso ok
|
|
|
|
# git annex drop iso/Debian_4.0.iso
|
|
drop iso/Debian_4.0.iso ok
|
|
|
|
# git annex move iso --to=usbdrive
|
|
move iso/Debian_5.0.iso (moving to usbdrive...) ok
|
|
|
|
# COMMONLY USED COMMANDS
|
|
|
|
Like many git commands, git-annex can be passed a path that
|
|
is either a file or a directory. In the latter case it acts on all relevant
|
|
files in the directory. When no path is specified, most git-annex commands
|
|
default to acting on all relevant files in the current directory (and
|
|
subdirectories).
|
|
|
|
* add [path ...]
|
|
|
|
Adds files in the path to the annex. Files that are already checked into
|
|
git, or that git has been configured to ignore, will be silently skipped.
|
|
(Use --force to add ignored files.) Dotfiles are skipped unless explicitly
|
|
listed.
|
|
|
|
* get [path ...]
|
|
|
|
Makes the content of annexed files available in this repository. This
|
|
will involve copying them from another repository, or downloading them,
|
|
or transferring them from some kind of key-value store.
|
|
|
|
Normally git-annex will choose which repository to copy the content from,
|
|
but you can override this using the --from option.
|
|
|
|
* drop [path ...]
|
|
|
|
Drops the content of annexed files from this repository.
|
|
|
|
git-annex will refuse to drop content if it cannot verify it is
|
|
safe to do so. This can be overridden with the --force switch.
|
|
|
|
To drop content from a remote, specify --from.
|
|
|
|
* move [path ...]
|
|
|
|
When used with the --from option, moves the content of annexed files
|
|
from the specified repository to the current one.
|
|
|
|
When used with the --to option, moves the content of annexed files from
|
|
the current repository to the specified one.
|
|
|
|
* copy [path ...]
|
|
|
|
When used with the --from option, copies the content of annexed files
|
|
from the specified repository to the current one.
|
|
|
|
When used with the --to option, copies the content of annexed files from
|
|
the current repository to the specified one.
|
|
|
|
To avoid contacting the remote to check if it has every file, specify --fast
|
|
|
|
* unlock [path ...]
|
|
|
|
Normally, the content of annexed files is protected from being changed.
|
|
Unlocking an annexed file allows it to be modified. This replaces the
|
|
symlink for each specified file with a copy of the file's content.
|
|
You can then modify it and `git annex add` (or `git commit`) to inject
|
|
it back into the annex.
|
|
|
|
* edit [path ...]
|
|
|
|
This is an alias for the unlock command. It may be easier to remember,
|
|
if you think of this as allowing you to edit an annexed file.
|
|
|
|
* lock [path ...]
|
|
|
|
Use this to undo an unlock command if you don't want to modify
|
|
the files, or have made modifications you want to discard.
|
|
|
|
* sync [remote ...]
|
|
|
|
Use this command when you want to synchronize the local repository with
|
|
one or more of its remotes. You can specify the remotes to sync with;
|
|
the default is to sync with all remotes. Or specify --fast to sync with
|
|
the remotes with the lowest annex-cost value.
|
|
|
|
The sync process involves first committing all local changes (git commit -a),
|
|
then fetching and merging the `synced/master` and `git-annex` branches
|
|
from the remote repositories and finally pushing the changes back to
|
|
those branches on the remote repositories. You can use standard git
|
|
commands to do each of those steps by hand, or if you don't want to
|
|
worry about the details, you can use sync.
|
|
|
|
Note that syncing with a remote will not update the remote's working
|
|
tree with changes made to the local repository. However, those changes
|
|
are pushed to the remote, so can be merged into its working tree
|
|
by running "git annex sync" on the remote.
|
|
|
|
Note that sync does not transfer any file contents from or to the remote
|
|
repositories.
|
|
|
|
* addurl [url ...]
|
|
|
|
Downloads each url to a file, which is added to the annex.
|
|
|
|
To avoid immediately downloading the url, specify --fast
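
The steps that `sync` performs, as described in its entry above, can be approximated by hand with standard git commands. The sketch below is a rough illustration under stated assumptions: it uses throwaway repositories in a temporary directory, a remote named `origin`, and a placeholder commit identity, and it leaves out the `git-annex` branch because that only exists once git-annex has initialized the repository.

```shell
set -eu
tmp=$(mktemp -d)

# Throwaway bare remote and local repository (paths are illustrative).
git init -q --bare "$tmp/remote.git"
git init -q "$tmp/local"
cd "$tmp/local"
git symbolic-ref HEAD refs/heads/master   # make sure the branch is named master
git config user.name "Example User"       # placeholder identity for the commits
git config user.email "user@example.com"
git remote add origin "$tmp/remote.git"
echo "hello" > notes.txt
git add notes.txt
git commit -q -m "initial commit"

# Step 1: commit all local changes.
echo "more" >> notes.txt
git commit -q -a -m "sync"

# Step 2: fetch, then merge synced/master if the remote already has it.
git fetch -q origin
if git rev-parse -q --verify refs/remotes/origin/synced/master >/dev/null; then
    git merge -q origin/synced/master
fi

# Step 3: push the result back to the remote's synced/master branch.
git push -q origin master:synced/master

local_head=$(git rev-parse master)
remote_head=$(git --git-dir="$tmp/remote.git" rev-parse refs/heads/synced/master)
```

After the push, the remote's `synced/master` points at the same commit as the local `master`. As noted above, the remote's working tree is not touched by this.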
|
|
|
|
# REPOSITORY SETUP COMMANDS
|
|
|
|
* init [description]
|
|
|
|
Until a repository (or one of its remotes) has been initialized,
|
|
git-annex will refuse to operate on it, to avoid accidentally
|
|
using it in a repository that was not intended to have an annex.
|
|
|
|
It's useful, but not mandatory, to initialize each new clone
|
|
of a repository with its own description.
|
|
|
|
* describe repository description
|
|
|
|
Changes the description of a repository.
|
|
|
|
The repository to describe can be specified by git remote name or
|
|
by uuid. To change the description of the current repository, use
|
|
"."
|
|
|
|
* initremote name [param=value ...]
|
|
|
|
Sets up a special remote. The remote's
|
|
configuration is specified by the parameters. If a remote
|
|
with the specified name has already been configured, its configuration
|
|
is modified by any values specified. In either case, the remote will be
|
|
added to `.git/config`.
|
|
|
|
Example Amazon S3 remote:
|
|
|
|
initremote mys3 type=S3 encryption=none datacenter=EU
|
|
|
|
* trust [repository ...]
|
|
|
|
Records that a repository is trusted to not unexpectedly lose
|
|
content. Use with care.
|
|
|
|
To trust the current repository, use "."
|
|
|
|
* untrust [repository ...]
|
|
|
|
Records that a repository is not trusted and could lose content
|
|
at any time.
|
|
|
|
* semitrust [repository ...]
|
|
|
|
Returns a repository to the default semi-trusted state.
|
|
|
|
* dead [repository ...]
|
|
|
|
Indicates that the repository has been irretrievably lost.
|
|
(To undo, use semitrust.)
|
|
|
|
# REPOSITORY MAINTENANCE COMMANDS
|
|
|
|
* fsck [path ...]
|
|
|
|
With no parameters, this command checks the whole annex for consistency,
|
|
and warns about or fixes any problems found.
|
|
|
|
With parameters, only the specified files are checked.
|
|
|
|
To avoid expensive checksum calculations, specify --fast
|
|
|
|
* unused
|
|
|
|
Checks the annex for data that does not correspond to any files present
|
|
in any tag or branch, and prints a numbered list of the data.
|
|
|
|
To only show unused temp and bad files, specify --fast.
|
|
|
|
To check for annexed data on a remote, specify --from.
|
|
|
|
* dropunused [number ...]
|
|
|
|
Drops the data corresponding to the numbers, as listed by the last
|
|
`git annex unused`
|
|
|
|
To drop the data from a remote, specify --from.
|
|
|
|
* merge
|
|
|
|
Automatically merges remote tracking branches */git-annex into
|
|
the git-annex branch. While git-annex mostly handles keeping the
|
|
git-annex branch merged automatically, if you find you are unable
|
|
to push the git-annex branch due to a non-fast-forward, this will fix it.
|
|
|
|
* fix [path ...]
|
|
|
|
Fixes up symlinks that have become broken to again point to annexed content.
|
|
This is useful to run if you have been moving the symlinks around,
|
|
but is done automatically when committing a change with git too.
|
|
|
|
* upgrade
|
|
|
|
Upgrades the repository to the current layout.
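
The kind of breakage that `fix` repairs can be reproduced with plain coreutils. This is a minimal sketch, assuming relative symlinks; the `store/KEY` path is a hypothetical stand-in for the real content location under `.git/annex/objects/`, and the final `ln -sf` only mimics the idea of re-pointing a link, not git-annex's implementation.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
mkdir -p store subdir
printf 'content\n' > store/KEY

# A relative symlink that is valid from the top-level directory.
ln -s store/KEY file

# Moving the link changes what its relative target resolves to
# (now subdir/store/KEY, which does not exist), so it dangles.
mv file subdir/file
if [ -e subdir/file ]; then broken=no; else broken=yes; fi

# Re-point the link with a target that is correct for its new location.
ln -sf ../store/KEY subdir/file
fixed=$(cat subdir/file)
echo "broken after move: $broken"
```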
|
|
|
|
# QUERY COMMANDS
|
|
|
|
* version
|
|
|
|
Shows the version of git-annex, as well as repository version information.
|
|
|
|
* find [path ...]
|
|
|
|
Outputs a list of annexed files in the specified path. With no path,
|
|
finds files in the current directory and its subdirectories.
|
|
|
|
By default, only lists annexed files whose content is currently present.
|
|
This can be changed by specifying file matching options. To list all
|
|
annexed files, present or not, specify --include "*". To list all
|
|
annexed files whose content is not present, specify --not --in="."
|
|
|
|
To output filenames terminated with nulls, for use with xargs -0,
|
|
specify --print0. Or, a custom output formatting can be specified using
|
|
--format. The default output format is the same as --format='${file}\\n'
|
|
|
|
These variables are available for use in formats: file, key, backend,
|
|
bytesize, humansize
|
|
|
|
* whereis [path ...]
|
|
|
|
Displays a list of repositories known to contain the content of the
|
|
specified file or files.
|
|
|
|
* log [path ...]
|
|
|
|
Displays the location log for the specified file or files,
|
|
showing each repository they were added to ("+") and removed from ("-").
|
|
|
|
To limit how far back to search for location log changes, the options
|
|
--since, --after, --until, --before, and --max-count can be specified.
|
|
They are passed through to git log. For example, --since "1 month ago"
|
|
|
|
To generate output suitable for the gource visualisation program,
|
|
specify --gource.
|
|
|
|
* status
|
|
|
|
Displays some statistics and other information, including how much data
|
|
is in the annex and a list of all known repositories.
|
|
|
|
To only show the data that can be gathered quickly, use --fast.
|
|
|
|
* map
|
|
|
|
Helps you keep track of your repositories, and the connections between them,
|
|
by going out and looking at all the ones it can get to, and generating a
|
|
Graphviz file displaying it all. If the `dot` command is available, it is
|
|
used to display the file to your screen (using x11 backend). (To disable
|
|
this display, specify --fast)
|
|
|
|
This command only connects to hosts that the host it's run on can
|
|
directly connect to. It does not try to tunnel through intermediate hosts.
|
|
So it might not show all connections between the repositories in the network.
|
|
|
|
Also, if connecting to a host requires a password, you might have to enter
|
|
it several times as the map is being built.
|
|
|
|
Note that this subcommand can be used to graph any git repository; it
|
|
is not limited to git-annex repositories.
|
|
|
|
# UTILITY COMMANDS
|
|
|
|
* migrate [path ...]
|
|
|
|
Changes the specified annexed files to use the default key-value backend
|
|
(or the one specified with --backend). Only files whose content
|
|
is currently available are migrated.
|
|
|
|
Note that the content is also still available using the old key after
|
|
migration. Use `git annex unused` to find and remove the old key.
|
|
|
|
Normally, nothing will be done to files already using the new backend.
|
|
However, if a backend changes the information it uses to construct a key,
|
|
this can also be used to migrate files to use the new key format.
|
|
|
|
* reinject src dest
|
|
|
|
Moves the src file into the annex as the content of the dest file.
|
|
This can be useful if you have obtained the content of a file from
|
|
elsewhere and want to put it in the local annex.
|
|
|
|
Automatically runs fsck on dest to check that the expected content was
|
|
provided.
|
|
|
|
Example:
|
|
|
|
git annex reinject /tmp/foo.iso foo.iso
|
|
|
|
* unannex [path ...]
|
|
|
|
Use this to undo an accidental `git annex add` command. You can use
|
|
`git annex unannex` to move content out of the annex at any point,
|
|
even if you've already committed it.
|
|
|
|
This is not the command you should use if you intentionally annexed a
|
|
file and don't want its contents any more. In that case you should use
|
|
`git annex drop` instead, and you can also `git rm` the file.
|
|
|
|
In --fast mode, this command leaves content in the annex, simply making
|
|
a hard link to it.
|
|
|
|
* uninit
|
|
|
|
Use this to stop using git annex. It will unannex every file in the
|
|
repository, and remove all of git-annex's other data, leaving you with a
|
|
git repository plus the previously annexed files.
|
|
|
|
# PLUMBING COMMANDS
|
|
|
|
* pre-commit [path ...]
|
|
|
|
Fixes up symlinks that are staged as part of a commit, to ensure they
|
|
point to annexed content. Also handles injecting changes to unlocked
|
|
files into the annex.
|
|
|
|
This is meant to be called from git's pre-commit hook. `git annex init`
|
|
automatically creates a pre-commit hook using this.
|
|
|
|
* fromkey key file
|
|
|
|
This plumbing-level command can be used to manually set up a file
|
|
in the git repository to link to a specified key.
|
|
|
|
* dropkey [key ...]
|
|
|
|
This plumbing-level command drops the annexed data for the specified
|
|
keys from this repository.
|
|
|
|
This can be used to drop content for arbitrary keys, which do not need
|
|
to have a file in the git repository pointing at them.
|
|
|
|
Example:
|
|
|
|
git annex dropkey SHA1-s10-7da006579dd64330eb2456001fd01948430572f2
|
|
|
|
# OPTIONS
|
|
|
|
* --force
|
|
|
|
Force unsafe actions, such as dropping a file's content when no other
|
|
source of it can be verified to still exist, or adding ignored files.
|
|
Use with care.
|
|
|
|
* --fast
|
|
|
|
Enables less expensive, but also less thorough versions of some commands.
|
|
What is avoided depends on the command.
|
|
|
|
* --auto
|
|
|
|
Enables automatic mode. Commands that get, drop, or move file contents
|
|
will only do so when needed to help satisfy the setting of annex.numcopies.
|
|
|
|
* --quiet
|
|
|
|
Avoid the default verbose display of what is done; only show errors
|
|
and progress displays.
|
|
|
|
* --verbose
|
|
|
|
Enable verbose display.
|
|
|
|
* --json
|
|
|
|
Rather than the normal output, generate JSON. This is intended to be
|
|
parsed by programs that use git-annex. Each line of output is a JSON
|
|
object.
|
|
|
|
* --debug
|
|
|
|
Show debug messages.
|
|
|
|
* --from=repository
|
|
|
|
Specifies a repository that content will be retrieved from, or that
|
|
should otherwise be acted on.
|
|
|
|
It should be specified using the name of a configured remote.
|
|
|
|
* --to=repository
|
|
|
|
Specifies a repository that content will be sent to.
|
|
|
|
It should be specified using the name of a configured remote.
|
|
|
|
* --numcopies=n
|
|
|
|
Overrides the `annex.numcopies` setting, forcing git-annex to ensure the
|
|
specified number of copies exist.
|
|
|
|
* --trust=repository
|
|
* --semitrust=repository
|
|
* --untrust=repository
|
|
|
|
Overrides trust settings for a repository. May be specified more than once.
|
|
|
|
The repository should be specified using the name of a configured remote,
|
|
or the UUID or description of a repository.
|
|
|
|
* --backend=name
|
|
|
|
Specifies which key-value backend to use. This can be used when
|
|
adding a file to the annex, or migrating a file. Once files
|
|
are in the annex, their backend is known and this option is not
|
|
necessary.
|
|
|
|
* --format=value
|
|
|
|
Specifies a custom output format. The value is a format string,
|
|
in which '${var}' is expanded to the value of a variable. To right-justify
|
|
a variable with whitespace, use '${var;width}' ; to left-justify
|
|
a variable, use '${var;-width}'; to escape unusual characters in a variable,
|
|
use '${escaped_var}'
|
|
|
|
Also, '\\n' is a newline, '\\000' is a NULL, etc.
|
|
|
|
* -c name=value
|
|
|
|
Used to override git configuration settings. May be specified multiple times.
|
|
|
|
# FILE MATCHING OPTIONS
|
|
|
|
These options can all be specified multiple times, and can be combined to
|
|
limit which files git-annex acts on.
|
|
|
|
Arbitrarily complicated expressions can be built using these options.
|
|
For example:
|
|
|
|
--exclude '*.mp3' --and --not -( --in=usbdrive --or --in=archive -)
|
|
|
|
The above example prevents git-annex from working on mp3 files, and on
files whose contents are present at either of two repositories.
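
When such an expression is typed at a shell prompt, `-(` and `-)` generally need to be quoted, because POSIX-style shells treat bare parentheses as syntax. The sketch below only builds and prints the argument list so the quoting is visible. The `git annex find` invocation is left as a comment because it assumes an annex with remotes named `usbdrive` and `archive`.

```shell
# Quote -( and -) so the shell passes them through as ordinary arguments.
set -- --exclude='*.mp3' --and --not '-(' --in=usbdrive --or --in=archive '-)'

# Print one argument per line to confirm what the command would receive.
printf '%s\n' "$@"

# In an annex that has remotes with these names, the list could be used as:
#   git annex find "$@"
```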
|
|
|
|
* --exclude=glob
|
|
|
|
Skips files matching the glob pattern. The glob is matched relative to
|
|
the current directory. For example:
|
|
|
|
--exclude='*.mp3' --exclude='subdir/*'
|
|
|
|
* --include=glob
|
|
|
|
Skips files not matching the glob pattern. (Same as --not --exclude.)
|
|
For example, to include only mp3 and ogg files:
|
|
|
|
--include='*.mp3' --or --include='*.ogg'
|
|
|
|
* --in=repository
|
|
|
|
Matches only files that git-annex believes have their contents present
|
|
in a repository. Note that it does not check the repository to verify
|
|
that it still has the content.
|
|
|
|
The repository should be specified using the name of a configured remote,
|
|
or the UUID or description of a repository. For the current repository,
|
|
use "--in=."
|
|
|
|
* --copies=number
|
|
|
|
Matches only files that git-annex believes to have the specified number
|
|
of copies, or more. Note that it does not check remotes to verify that
|
|
the copies still exist.
|
|
|
|
* --inbackend=name
|
|
|
|
Matches only files whose content is stored using the specified key-value
|
|
backend.
|
|
|
|
* --not
|
|
|
|
Inverts the next file matching option. For example, to only act on
|
|
mp3s, use: --not --exclude='*.mp3'
|
|
|
|
* --and
|
|
|
|
Requires that both the previous and the next file matching option match.
|
|
The default.
|
|
|
|
* --or
|
|
|
|
Requires that either the previous, or the next file matching option matches.
|
|
|
|
* -(
|
|
|
|
Opens a group of file matching options.
|
|
|
|
* -)
|
|
|
|
Closes a group of file matching options.
|
|
|
|
# CONFIGURATION
|
|
|
|
Like other git commands, git-annex is configured via `.git/config`.
|
|
Here are all the supported configuration settings.
|
|
|
|
* `annex.uuid`
|
|
|
|
A unique UUID for this repository (automatically set).
|
|
|
|
* `annex.numcopies`
|
|
|
|
Number of copies of files to keep across all repositories. (default: 1)
|
|
|
|
* `annex.backends`
|
|
|
|
Space-separated list of names of the key-value backends to use.
|
|
The first listed is used to store new files by default.
|
|
|
|
* `annex.diskreserve`
|
|
|
|
Amount of disk space to reserve. Disk space is checked when transferring
|
|
content to avoid running out, and additional free space can be reserved
|
|
via this option, to make space for more important content (such as git
|
|
commit logs). Can be specified with any commonly used units, for example,
|
|
"0.5 gb" or "100 KiloBytes"
|
|
|
|
The default reserve is 1 megabyte.
|
|
|
|
* `annex.version`
|
|
|
|
Automatically maintained, and used to automate upgrades between versions.
|
|
|
|
* `remote.<name>.annex-cost`
|
|
|
|
When determining which repository to
|
|
transfer annexed files from or to, ones with lower costs are preferred.
|
|
The default cost is 100 for local repositories, and 200 for remote
|
|
repositories.
|
|
|
|
* `remote.<name>.annex-cost-command`
|
|
|
|
If set, the command is run, and the number it outputs is used as the cost.
|
|
This allows varying the cost based on, e.g., the current network. The
|
|
cost-command can be any shell command line.
|
|
|
|
* `remote.<name>.annex-ignore`
|
|
|
|
If set to `true`, prevents git-annex
|
|
from using this remote by default. (You can still request it be used
|
|
by the --from and --to options.)
|
|
|
|
This is, for example, useful if the remote is located somewhere
|
|
without git-annex-shell. (For example, if it's on GitHub).
|
|
Or, it could be used if the network connection between two
|
|
repositories is too slow to be used normally.
|
|
|
|
* `remote.<name>.annexUrl`
|
|
|
|
Can be used to specify a different url than the regular `remote.<name>.url`
|
|
for git-annex to use when talking with the remote. Similar to the `pushUrl`
|
|
used by git-push.
|
|
|
|
* `remote.<name>.annex-uuid`
|
|
|
|
git-annex caches UUIDs of remote repositories here.
|
|
|
|
* `remote.<name>.annex-ssh-options`
|
|
|
|
Options to use when using ssh to talk to this remote.
|
|
|
|
* `remote.<name>.annex-rsync-options`
|
|
|
|
Options to use when using rsync
|
|
to or from this remote. For example, to force ipv6, and limit
|
|
the bandwidth to 100Kbyte/s, set it to "-6 --bwlimit 100"
|
|
|
|
* `remote.<name>.annex-web-options`
|
|
|
|
Options to use when using wget or curl to download a file from the web.
|
|
(wget is always used in preference to curl if available).
|
|
For example, to force ipv4 only, set it to "-4"
|
|
|
|
* `remote.<name>.annex-bup-split-options`
|
|
|
|
Options to pass to bup split when storing content in this remote.
|
|
For example, to limit the bandwidth to 100Kbyte/s, set it to "--bwlimit 100k"
|
|
(There is no corresponding option for bup join.)
|
|
|
|
* `annex.ssh-options`, `annex.rsync-options`, `annex.web-options`, `annex.bup-split-options`
|
|
|
|
Default ssh, rsync, wget/curl, and bup options to use if a remote does not
|
|
have specific options.
|
|
|
|
* `remote.<name>.buprepo`
|
|
|
|
Used by bup special remotes, this configures
|
|
the location of the bup repository to use. Normally this is automatically
|
|
set up by `git annex initremote`, but you can change it if needed.
|
|
|
|
* `remote.<name>.directory`
|
|
|
|
Used by directory special remotes, this configures
|
|
the location of the directory where annexed files are stored for this
|
|
remote. Normally this is automatically set up by `git annex initremote`,
|
|
but you can change it if needed.
|
|
|
|
* `remote.<name>.s3`
|
|
|
|
Used to identify Amazon S3 special remotes.
|
|
Normally this is automatically set up by `git annex initremote`.
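
Since these settings live in `.git/config`, they can be set and inspected with plain `git config`. The following sketch uses a scratch repository. The remote name `usbdrive`, its path, and all values are illustrative assumptions, and the cost command is just one possible shell one-liner.

```shell
set -eu
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Repository-wide settings (example values).
git config annex.numcopies 2
git config annex.diskreserve "0.5 gb"

# Per-remote settings for a hypothetical remote named usbdrive.
git remote add usbdrive /media/usb
git config remote.usbdrive.annex-cost 50
git config remote.usbdrive.annex-rsync-options "-6 --bwlimit 100"

# A cost command prints a number; here it is cheap only while the drive is mounted.
git config remote.usbdrive.annex-cost-command \
    'test -d /media/usb && echo 50 || echo 500'

# Read the values back.
numcopies=$(git config annex.numcopies)
cost=$(git config remote.usbdrive.annex-cost)
cost_from_command=$(sh -c "$(git config remote.usbdrive.annex-cost-command)")
echo "numcopies=$numcopies cost=$cost dynamic cost=$cost_from_command"
```

Individual settings can also be overridden for a single invocation with the `-c name=value` option described under OPTIONS.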
|
|
|
|
# CONFIGURATION VIA .gitattributes
|
|
|
|
The key-value backend used when adding a new file to the annex can be
|
|
configured on a per-file-type basis via `.gitattributes` files. In the file,
|
|
the `annex.backend` attribute can be set to the name of the backend to
|
|
use. For example, here's how to use the WORM backend by default,
|
|
but the SHA1 backend for ogg files:
|
|
|
|
* annex.backend=WORM
|
|
*.ogg annex.backend=SHA1
|
|
|
|
The numcopies setting can also be configured on a per-file-type basis via
|
|
the `annex.numcopies` attribute in `.gitattributes` files.
|
|
For example, this makes two copies be needed for wav files:
|
|
|
|
*.wav annex.numcopies=2
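
To see which attribute values apply to a given path, git's own `check-attr` command can be used. This sketch writes the example lines above into a scratch repository and queries a few made-up filenames; the files do not need to exist for the lookup to work.

```shell
set -eu
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# The attribute lines from the examples above.
printf '%s\n' \
    '* annex.backend=WORM' \
    '*.ogg annex.backend=SHA1' \
    '*.wav annex.numcopies=2' > .gitattributes

# Ask git which values apply to some hypothetical paths.
ogg_backend=$(git check-attr annex.backend -- song.ogg)
iso_backend=$(git check-attr annex.backend -- disk.iso)
wav_copies=$(git check-attr annex.numcopies -- take1.wav)
printf '%s\n' "$ogg_backend" "$iso_backend" "$wav_copies"
```

Later lines in `.gitattributes` override earlier ones for the same attribute, which is why the `*.ogg` rule wins over the catch-all `*` rule.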
|
|
|
|
# FILES
|
|
|
|
These files are used by git-annex, in your git repository:
|
|
|
|
`.git/annex/objects/` contains the annexed file contents that are currently
|
|
available. Annexed files in your git repository symlink to that content.
|
|
|
|
# SEE ALSO
|
|
|
|
Most of git-annex's documentation is available on its web site,
|
|
<http://git-annex.branchable.com/>
|
|
|
|
If git-annex is installed from a package, a copy of its documentation
|
|
should be included, in, for example, `/usr/share/doc/git-annex/`
|
|
|
|
# AUTHOR
|
|
|
|
Joey Hess <joey@kitenet.net>
|
|
|
|
<http://git-annex.branchable.com/>
|
|
|
|
Warning: Automatically converted into a man page by mdwn2man. Edit with care
|