improve import duplicate docs
parent
b4fb5eac3b
commit
e1b8853174
2 changed files with 43 additions and 11 deletions
@@ -13,11 +13,18 @@ the annex. Individual files to import can be specified.
 If a directory is specified, the entire directory is imported.

 git annex import /media/camera/DCIM/*

-By default, importing two files with the same contents from two different
-locations will result in both files being added to the repository.
-(With all checksumming backends, including the default SHA256E,
-only one copy of the data will be stored.)

+When importing files, there's a possibility of importing a duplicate
+of a file that is already known to git-annex -- its content is either
+present in the local repository already, or git-annex knows of another
+repository that contains it.
+
+By default, importing a duplicate of a known file will result in
+a new filename being added to the repository, so the duplicate file
+is present in the repository twice. (With all checksumming backends,
+including the default SHA256E, only one copy of the data will be stored.)
+
+Several options can be used to adjust handling of duplicate files.

 # OPTIONS

@@ -32,19 +39,18 @@ only one copy of the data will be stored.)

 * `--deduplicate`

-Only import files whose content has not been seen before by git-annex.
-
-Duplicate files will be deleted from the import location.
+Only import files that are not duplicates;
+duplicate files will be deleted from the import location.

 * `--skip-duplicates`

-Only import files whose content has not been seen before by git-annex,
-but avoid deleting duplicate files.
+Only import files that are not duplicates; and avoid deleting
+duplicate files from the import location.

 * `--clean-duplicates`

 Does not import any files, but any files found in the import location
-that are duplicates of content in the annex are deleted.
+that are duplicates are deleted.

 * file matching options

@@ -0,0 +1,26 @@
+[[!comment format=mdwn
+username="joey"
+subject="""comment 2"""
+date="2015-03-26T15:28:45Z"
+content="""
+Well, you've found an edge case here.
+
+It behaves as documented as long as the file being imported is located in some
+repository known to git-annex. The file content does not have to be present in
+the local repository for it to behave as documented.
+
+In your case, the file being imported has a symlink in the git repo, but
+git-annex knows about 0 annexed copies of the file, so it's treated as
+if it's a new file and not a duplicate.
+
+Since import is working at the key level, there's not a good way to look up
+that there are some symlinks in the git repo even though the content is
+gone. And even if there was, I think I'd be uncomfortable with it deleting
+the file as "duplicate" when its content is not available in any known
+repository. The only behavior improvement might be to import the content
+but not make a redundant symlink in this case.
+
+I think it's best to change the documentation. I've added a new
+paragraph that more exactly and clearly explains what duplicate files
+are for the purposes of importing.
+"""]]
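The duplicate rule described in the comment and the revised option text can be sketched as a toy model. This is not git-annex's implementation: `content_key`, `is_duplicate`, `plan_import`, and the `known_copies` mapping are hypothetical names invented for illustration, and a plain SHA-256 hex digest stands in for a real checksumming key. It only assumes what the text above states: a file is a duplicate when at least one known repository holds its content, and a key with 0 known copies is treated as a new file.

```python
import hashlib


def content_key(data: bytes) -> str:
    # Hypothetical stand-in for a checksumming backend key; this is a bare
    # SHA-256 hex digest, not the key format git-annex actually uses.
    return hashlib.sha256(data).hexdigest()


def is_duplicate(data: bytes, known_copies: dict) -> bool:
    # known_copies maps key -> number of repositories known to hold the
    # content (an assumed structure). Zero known copies means "new file".
    return known_copies.get(content_key(data), 0) > 0


def plan_import(data: bytes, known_copies: dict, mode: str = "default") -> str:
    # For each mode: (action for a new file, action for a duplicate).
    # "leave" means the file stays untouched in the import location.
    actions = {
        "default": ("import", "import"),
        "deduplicate": ("import", "delete"),
        "skip-duplicates": ("import", "leave"),
        "clean-duplicates": ("leave", "delete"),
    }
    new_action, dup_action = actions[mode]
    return dup_action if is_duplicate(data, known_copies) else new_action


if __name__ == "__main__":
    known = {content_key(b"photo"): 1, content_key(b"lost"): 0}
    for mode in ("default", "deduplicate", "skip-duplicates", "clean-duplicates"):
        print(mode, plan_import(b"photo", known, mode), plan_import(b"lost", known, mode))
```

In this model the edge case from the comment is the `b"lost"` entry: its key is recorded but no copies are known, so every mode handles it like a file that has never been seen.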