git-annex/doc/git-annex-export.mdwn
Joey Hess 7294d23d78
export: Added --from option
This is similar to git-annex copy --from --to, in that it downloads a
local copy, locks it for removal, uploads it, and drops it. Removal of
the temporary local copy is done without verifying numcopies for the
same reason as that command.

I do wonder, looking at this, if there's a race where the local copy
gets used as a copy to allow some other drop in the narrow window after
it is downloaded and before it gets locked for removal. That would need
some other repository to have an out of date location log that says the
repository contains a copy of the key, in order for it to try to use it
as a copy. If there is such a race, git-annex copy/move would also be
vulnerable to it. It would be better to lock it for removal before
starting to download it! That is possible in v10 repositories, which do
use a separate content lock file.

Note that, when the exported tree contains several files that use the
same key, it will be downloaded repeatedly, once per time needed to
upload it. It would be possible to avoid that extra work, but it would
complicate this since the local copy would need to be preserved, locked
for removal, until the end. Also, that would mean that interrupting the
export would leave possibly a lot of temporarily downloaded keys in the
local repository, while currently it can only leave one.
2024-08-08 12:08:55 -04:00

188 lines
6.4 KiB
Markdown

# NAME
git-annex export - export a tree of files to a special remote
# SYNOPSIS
git annex export `treeish --to remote`
# DESCRIPTION
Use this command to export a tree of files from a git-annex repository.
Normally files are stored on a git-annex special remote named by their
keys. That is great for reliable data storage, but your filenames are
obscured. Exporting replicates the tree to the special remote as-is.
To use this, you have to configure a special remote with
`exporttree=yes` when initially setting it up with
[[git-annex-initremote]](1).
The treeish to export can be the name of a git branch, or a tag, or any
other treeish accepted by git, including eg master:subdir to only export a
subdirectory from a branch.
When the remote has a preferred content expression set by
[[git-annex-wanted]](1), the treeish is
filtered through it, excluding annexed files it does not want from
being exported to it. (Note that things in the expression like
"include=" match relative to the top of the treeish being exported.)
Any files in the treeish that are stored on git will also be exported to
the special remote.
Repeated exports are done efficiently, by diffing the old and new tree,
and transferring only the changed files, and renaming files as necessary.
Exports can be interrupted and resumed. However, partially uploaded files
will be re-started from the beginning in most cases.
Once content has been exported to a remote, commands like `git annex get`
can download content from there the same as from other remotes. However,
since an export is not a key/value store, git-annex has to do more
verification of content downloaded from an export. Some types of keys,
that are not based on checksums, cannot be downloaded from an export.
And, git-annex will never trust an export to retain the content of a key.
However, some special remotes, notably S3, support keeping track of old
versions of files stored in them. If a special remote is set up to do
that, it can be used as a key/value store and the limitations in the above
paragraph do not apply. Note that dropping content from such a remote is
not supported. See individual special remotes' documentation for
details of how to enable such versioning.
Commands like `git-annex push` can also be used to export a branch to a
special remote, updating the special remote whenever the branch is changed.
To do this, you need to configure "remote.<name>.annex-tracking-branch" to
tell it what branch to track. For example:
git config remote.myremote.annex-tracking-branch master
git annex push myremote
You can combine using `git annex export` to send changes to a special
remote with `git annex import` to fetch changes from a special remote.
When a file on a special remote has been modified by software other than
git-annex, exporting to it will not overwrite the modified file, and the
export will not succeed. You can resolve this conflict by using
`git annex import`.
(Some types of special remotes such as S3 with versioning may instead
let an export overwrite the modified file; then `git annex import`
will create a sequence of commits that includes the modified file,
so the overwritten modification is not lost.)
# OPTIONS
* `--to=remote`
Specify the special remote to export to.
* `--from=remote`
When the content of a file is not available in the local repository,
this option lets it be downloaded from another remote, and sent on to the
destination remote. The file will be temporarily stored on local disk,
but will never enter the local repository.
This option can be repeated multiple times.
It is possible to use --from with the same remote as --to. If the tree
contains several files with the same content, and the remote being
exported to already contains one copy of the content, this allows making
a copy by downloading the content from it.
* `--tracking`
This is a deprecated way to set "remote.<name>.annex-tracking-branch".
Instead of using this option, you should just set the git configuration
yourself.
* `--fast`
This sets up an export of a tree, but avoids any expensive file uploads to
the remote. You can later run `git annex push` to upload
the files to the export.
* `--jobs=N` `-JN`
Exports multiple files in parallel. This may be faster.
For example: `-J4`
Setting this to "cpus" will run one job per CPU core.
* `--json`
Enable JSON output. This is intended to be parsed by programs that use
git-annex. Each line of output is a JSON object.
* `--json-progress`
Include progress objects in JSON output.
* `--json-error-messages`
Messages that would normally be output to standard error are included in
the JSON instead.
* Also the [[git-annex-common-options]](1) can be used.
# EXAMPLE
git annex initremote myremote type=directory directory=/mnt/myremote \
exporttree=yes encryption=none
git annex export master --to myremote
After that, /mnt/myremote will contain the same tree of files as the master
branch does.
git mv myfile subdir/myfile
git commit -m renamed
git annex export master --to myremote
That updates /mnt/myremote to reflect the renamed file.
git annex export master:subdir --to myremote
That updates /mnt/myremote, to contain only the files in the "subdir"
directory of the master branch.
# EXPORT CONFLICTS
If two different git-annex repositories are both exporting different trees
to the same special remote, it's possible for an export conflict to occur.
This leaves the special remote with some files from one tree, and some
files from the other. Files in the special remote may have entirely the
wrong content as well.
It's not possible for git-annex to detect when making an export will result
in an export conflict. The best way to avoid export conflicts is to either
only ever export to a special remote from a single repository, or to have a
rule about the tree that you export to the special remote. For example, if
you always export origin/master after pushing to origin, then an export
conflict can't happen.
An export conflict can only be detected after the two git repositories
that produced it get back in sync. Then the next time you run `git annex
export`, it will detect the export conflict, and resolve it.
# SEE ALSO
[[git-annex]](1)
[[git-annex-initremote]](1)
[[git-annex-import]](1)
[[git-annex-push]](1)
[[git-annex-preferred-content]](1)
# HISTORY
The `export` command was introduced in git-annex version 6.20170925.
# AUTHOR
Joey Hess <id@joeyh.name>
Warning: Automatically converted into a man page by mdwn2man. Edit with care.