git-annex/doc/git-annex-import.mdwn

239 lines
8.4 KiB
Text
Raw Normal View History

# NAME
git-annex import - add a tree of files to the repository
# SYNOPSIS
git annex import --from remote branch[:subdir] | `[path ...]`
# DESCRIPTION
This command is a way to import a tree of files from elsewhere into your
git-annex repository. It can import files from a git-annex special remote,
or from a directory.
2019-05-14 18:29:40 +00:00
# IMPORTING FROM A SPECIAL REMOTE
Importing from a special remote first downloads or hashes all new content
from it, and then constructs a git commit that reflects files that have
changed on the special remote since the last time git-annex looked at it.
Merging that commit into your repository will update it to reflect changes
made on the special remote.
This way, something can be using the special remote for file storage,
adding files, modifying files, and deleting files, and you can track those
changes using git-annex.
You can combine using `git annex import` to fetch changes from a special
remote with `git annex export` to send your local changes to the special
remote.
You can only import from special remotes that were configured with
`importtree=yes` when set up with [[git-annex-initremote]](1). Only some
kinds of special remotes will let you configure them this way. A perhaps
non-exhastive list is the directory, s3, and adb special remotes.
To import from a special remote, you must specify the name of a branch.
2020-07-03 15:54:21 +00:00
A corresponding remote tracking branch will be updated by `git annex import`.
After that point, it's the same as if you had run a `git fetch`
from a regular git remote; you can merge the changes into your
currently checked out branch.
For example:
git annex import master --from myremote
git annex merge myremote/master
You could just as well use `git merge myremote/master` as the second step,
but using `git-annex merge` avoids a couple of gotchas. When using adjusted
branches, it adjusts the branch before merging from it. And it avoids
the merge failing on the first merge from an import due to unrelated
histories.
If you do use `git merge`, you can pass `--allow-unrelated-histories` the
first time you `git merge` from an import. Think of this as the remote
being a separate git repository with its own files. If you first
`git annex export` files to a remote, and then `git annex import` from it,
you won't need that option.
2019-05-14 18:43:51 +00:00
You can import into a subdirectory, using the "branch:subdir" syntax. For
example, if "camera" is a special remote that accesses a camera, and you
want to import those into the photos directory, rather than to the root of
your repository:
git annex import master:photos --from camera
git merge camera/master
The `git annex sync --content` command (and the git-annex assistant)
can also be used to import from a special remote.
To do this, you need to configure "remote.<name>.annex-tracking-branch"
to tell it what branch to track. For example:
git config remote.myremote.annex-tracking-branch master
git annex sync --content
2019-05-14 18:43:51 +00:00
If a preferred content expression is configured for the special remote,
it will be honored when importing from it. Files that are not preferred
content of the remote will not be imported from it, but will be left on the
remote.
However, preferred content expressions that relate to the key
can't be matched when importing, because the content of the file is not
known. Importing will fail when such a preferred content expression is
set. This includes expressions containing "copies=", "metadata=", and other
things that depend on the key. Preferred content expressions containing
"include=", "exclude=" "smallerthan=", "largerthan=" will work.
2019-05-14 18:43:51 +00:00
# OPTIONS FOR IMPORTING FROM A SPECIAL REMOTE
* `--content`, `--no-content`
Controls whether content is downloaded from the special remote.
The default is to download content into the git-annex repository.
With --no-content, git-annex keys are generated from information
provided by the special remote, without downloading it. Commands like
`git-annex get` can later be used to download files, as desired.
The --no-content option is not supported by all special remotes,
and the kind of git-annex key that is generated is left up to
each special remote. So while the directory special remote hashes
the file and generates the same key it usually would, other
special remotes may use unusual keys like SHA1, or WORM, depending
on the limitations of the special remote.
The annex.securehashesonly configuration, if set, will prevent
--no-content importing from a special remote that uses insecure keys.
Using --no-content prevents annex.largefiles from being checked,
because the files are not downloaded. So, when using --no-content,
files that would usually be considered non-large will be added to the
annex, rather than adding them directly to the git repository.
Note that a different git tree will often be generated when using
--no-content than would be generated when using --content, because
the options cause different kinds of keys to be used when importing
new/changed files. So mixing uses of --content and --no-content can
lead to merge conflicts in some situations.
2019-05-14 18:29:40 +00:00
# IMPORTING FROM A DIRECTORY
When run with a path, `git annex import` moves files from somewhere outside
2019-02-23 19:57:18 +00:00
the git working copy, and adds them to the annex.
This is a legacy interface. It is still supported, but please consider
switching to importing from a directory special remote instead, using the
interface documented above.
Individual files to import can be specified. If a directory is specified,
the entire directory is imported.
git annex import /media/camera/DCIM/*
2015-03-26 15:44:20 +00:00
When importing files, there's a possibility of importing a duplicate
of a file that is already known to git-annex -- its content is either
2015-04-29 17:25:10 +00:00
present in the local repository already, or git-annex knows of another
repository that contains it, or it was present in the annex before but has
been removed now.
2015-03-26 15:44:20 +00:00
By default, importing a duplicate of a known file will result in
a new filename being added to the repository, so the duplicate file
is present in the repository twice. (With all checksumming backends,
including the default SHA256E, only one copy of the data will be stored.)
Several options can be used to adjust handling of duplicate files, see
`--duplicate`, `--deduplicate`, `--skip-duplicates`, `--clean-duplicates`,
and `--reinject-duplicates` documentation below.
# OPTIONS FOR IMPORTING FROM A DIRECTORY
* `--duplicate`
Do not delete files from the import location.
Running with this option repeatedly can import the same files into
different git repositories, or branches, or different locations in a git
repository.
* `--deduplicate`
2015-03-26 15:44:20 +00:00
Only import files that are not duplicates;
duplicate files will be deleted from the import location.
* `--skip-duplicates`
Only import files that are not duplicates. Avoids deleting any
files from the import location.
* `--clean-duplicates`
Does not import any files, but any files found in the import location
2015-03-26 15:44:20 +00:00
that are duplicates are deleted.
* `--reinject-duplicates`
Imports files that are not duplicates. Files that are duplicates have
their content reinjected into the annex (similar to
2018-07-06 16:51:18 +00:00
[[git-annex-reinject]](1)).
2015-04-29 17:32:50 +00:00
* `--force`
Allow existing files to be overwritten by newly imported files.
Also, causes .gitignore to not take effect when adding files.
* file matching options
Many of the [[git-annex-matching-options]](1)
can be used to specify files to import.
git annex import /dir --include='*.png'
## COMMON OPTIONS
* `--jobs=N` `-JN`
Imports multiple files in parallel. This may be faster.
For example: `-J4`
Setting this to "cpus" will run one job per CPU core.
* `--json`
Enable JSON output. This is intended to be parsed by programs that use
git-annex. Each line of output is a JSON object.
* `--json-progress`
Include progress objects in JSON output.
* `--json-error-messages`
Messages that would normally be output to standard error are included in
the json instead.
# CAVEATS
Note that using `--deduplicate` or `--clean-duplicates` with the WORM
backend does not look at file content, but filename and mtime.
2016-01-25 17:41:21 +00:00
If annex.largefiles is configured, and does not match a file, `git annex
import` will add the non-large file directly to the git repository,
instead of to the annex.
# SEE ALSO
[[git-annex]](1)
[[git-annex-add]](1)
[[git-annex-export]](1)
2019-05-14 18:43:51 +00:00
[[git-annex-preferred-content]](1)
# AUTHOR
Joey Hess <id@joeyh.name>
Warning: Automatically converted into a man page by mdwn2man. Edit with care.