This commit is contained in:
parent
47a5853da8
commit
e7de6fffc1
1 changed files with 100 additions and 0 deletions
100
doc/bugs/Files_recorded_with_other_file__39__s_checksums.mdwn
Normal file
100
doc/bugs/Files_recorded_with_other_file__39__s_checksums.mdwn
Normal file
|
@ -0,0 +1,100 @@
|
|||
### Please describe the problem.
|
||||
|
||||
I have a special remote "source" set up as type=directory importtree=yes.
|
||||
|
||||
I pulled it into one repo in which none of the files were wanted. So far so good. I cloned that repo to a second, archive, repo. git annex sync worked (but took 2 hours). Then I did `git annex get --auto`, most of the files came through OK. But some said things like this:
|
||||
|
||||
```
|
||||
get Pictures/.dtrash/info/IMG_2979_v1-e9dced7b.dtrashinfo (from source...) (checksum...)
|
||||
verification of content failed
|
||||
|
||||
Unable to access these remotes: source
|
||||
|
||||
No other repository is known to contain the file.
|
||||
failed
|
||||
```
|
||||
|
||||
(Both repos are unlocked via adjust --unlocked)
|
||||
|
||||
Upon looking at the file in the repo where it wasn't wanted, I saw this:
|
||||
|
||||
```
|
||||
$ cat IMG_2979_v1-e9dced7b.dtrashinfo
|
||||
/annex/objects/SHA256E-s144--cec5c7b6a9d97344e374e8395e02b74350678147ff65d6df091f5115cf18bf72
|
||||
```
|
||||
|
||||
Interesting. So, in the source directory:
|
||||
|
||||
```
|
||||
$ sha256sum IMG_2979_v1-e9dced7b.dtrashinfo
|
||||
aca3ed7243def7a0bd5fcad542c66841b8e7d2a670b4cafe749eb27e032d8975 IMG_2979_v1-e9dced7b.dtrashinfo
|
||||
```
|
||||
|
||||
That's not a match at all. Well, OK then:
|
||||
|
||||
```
|
||||
$ sha256sum * | grep cec5c7
|
||||
cec5c7b6a9d97344e374e8395e02b74350678147ff65d6df091f5115cf18bf72 IMG_2981_v1-5fc99c7a.dtrashinfo
|
||||
```
|
||||
|
||||
Yikes. So for IMG_2979_v1-e9dced7b.dtrashinfo, git-annex recorded a checksum that belonged to IMG_2981_v1-5fc99c7a.dtrashinfo. Well then, what is this other file recorded as, back in the git-annex repo?
|
||||
|
||||
```
|
||||
$ cat IMG_2981_v1-5fc99c7a.dtrashinfo
|
||||
/annex/objects/SHA256E-s144--cec5c7b6a9d97344e374e8395e02b74350678147ff65d6df091f5115cf18bf72
|
||||
```
|
||||
|
||||
OK, so two files that were not identical in the source directory got recorded with an identical checksum in git-annex somehow. And, when they were attempted to be imported via `git annex get --auto`, this at least was detected there.
|
||||
|
||||
In this .dtrash/info directory, 436 files out of 719 were not loaded by `git annex get`, presumably due to this issue.
|
||||
|
||||
In this directory, the source files were ranging in size from 140 to 227 bytes.
|
||||
|
||||
In a companion directory, .dtrash/files, 24 out of 719 files exhibited this issue. These files tended to be larger, but one that was 495MB triggered it also.
|
||||
|
||||
I have not yet seen it outside .dtrash, but it will be many hours until this get process completes fully, as it needs to copy about 1TB of data.
|
||||
|
||||
In case you are wondering if there is a race condition with .dtrash: no. The only application that writes to it isn't running, and the last time a file was modified in there was over a year ago.
|
||||
|
||||
### What steps will reproduce the problem?
|
||||
|
||||
I have laid that out as best I can above.
|
||||
|
||||
### What version of git-annex are you using? On what operating system?
|
||||
|
||||
8;.20210223 on Debian
|
||||
|
||||
### Please provide any additional information below.
|
||||
|
||||
Assistant is not being used.
|
||||
|
||||
Setup:
|
||||
|
||||
```
|
||||
REPO=Pictures
|
||||
cd /acrypt/git-annex/repos
|
||||
mkdir $REPO
|
||||
cd $REPO
|
||||
git init
|
||||
git config annex.thin true
|
||||
git annex init 'local hub'
|
||||
git annex wanted . "include=* and exclude=$REPO/*"
|
||||
|
||||
# Now initialize things.
|
||||
touch mtree
|
||||
git annex add mtree
|
||||
git annex sync
|
||||
git annex adjust --unlock
|
||||
|
||||
git annex initremote source type=directory directory=/acrypt/git-annex/bind-ro/$REPO importtree=yes encryption=none
|
||||
|
||||
git annex enableremote source directory=/acrypt/git-annex/bind-ro/$REPO
|
||||
git config remote.source.annex-readonly true
|
||||
git config remote.source.annex-tracking-branch main:$REPO
|
||||
git config annex.securehashesonly true
|
||||
git config annex.genmetadata true
|
||||
git config annex.diskreserve 100M
|
||||
```
|
||||
|
||||
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
|
||||
|
Loading…
Add table
Reference in a new issue