Merge branch 'master' of ssh://git-annex.branchable.com

Joey Hess 2022-09-05 13:20:44 -04:00
commit 600d3f7141
GPG key ID: DB12DB0FF05F8F38
15 changed files with 297 additions and 0 deletions

@ -0,0 +1,34 @@
### Please describe the problem.
I have set up a repository with the assistant. Then, within it, I ran:
```
git annex initremote source type=directory directory=... importtree=yes encryption=none
git annex enableremote source type=directory directory=...
git config remote.source.annex-readonly true
git config remote.source.annex-tracking-branch main:data
git annex import main:data --from source
```
At this point, `git annex sync` will (usually) sync this.
### What steps will reproduce the problem?
There are two problems.
1. The assistant will never sync this, no matter what I do. I can request a manual sync of either the repo or the remote, and neither does anything.
2. It appears that the assistant is creating a locking race with the CLI. For instance, I got `fatal: Unable to create '.git/index.lock': .git/index.lock: openFd: already exists (File exists)` with one run of `git annex sync`, but the run of it before and after worked fine.
When there isn't a race with `git annex sync`, it behaves as desired.
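
As a possible stopgap for the `index.lock` collision described above, here is a minimal sketch that waits for the lock file to go away before running the CLI. The helper name `wait_for_index_lock` and its timeout argument are hypothetical and not part of git-annex. It only narrows the race window: the assistant could still take the lock between the check and the sync.

```shell
#!/bin/sh
# Hypothetical helper (not part of git-annex): poll until
# <repo>/.git/index.lock is gone, or give up after <timeout> seconds.
# Returns 0 when the lock is absent, 1 on timeout.
wait_for_index_lock() {
    repo=$1
    timeout=${2:-30}   # default of 30 seconds is an arbitrary assumption
    waited=0
    while [ -e "$repo/.git/index.lock" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example use, from inside the repository:
#   wait_for_index_lock . 60 && git annex sync
```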
### What version of git-annex are you using? On what operating system?
8.20210223-2 on Debian bullseye.
### Please provide any additional information below.
I will probably disable the assistant.
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
I'm pretty excited about using this approach to help archive some photos and such!

@ -0,0 +1,33 @@
### Please describe the problem.
I have a directory remote with importtree=yes. In that remote, I have some symlinks that are broken. (Long story; this is a file server and they work on the system that has mounted them, but are broken here.)
### What steps will reproduce the problem?
I've added it with `git config remote.source.annex-tracking-branch main:$REPO`. When I run `git annex sync`, I get:
```
commit
On branch adjusted/main(unlocked)
nothing to commit, working tree clean
ok
list source
git-annex: Unable to list contents of source: [redacted]: getFileStatus: does not exist (No such file or directory)
failed
git-annex: sync: 1 failed
```
### What version of git-annex are you using? On what operating system?
8.20210223-2 on Debian
### Please provide any additional information below.
I would like git-annex to either:
1. Store the symlink as a symlink, or
2. Ignore bad symlinks
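
To find which symlinks in the directory remote are broken, a minimal sketch using plain `find` may help. The function name `list_broken_symlinks` is hypothetical, and the directory argument stands for whatever path the remote's `directory=` setting points to.

```shell
#!/bin/sh
# Hypothetical helper: print every symlink under the given directory
# whose target does not exist on this system.
list_broken_symlinks() {
    # -type l selects symlinks themselves; `test -e` follows the link,
    # so it fails exactly when the target is missing.
    find "$1" -type l ! -exec test -e {} \; -print
}

# Example use (path is a placeholder):
#   list_broken_symlinks /path/to/source-directory
```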
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
Loading in other parts of my photo collection as we speak!

@ -0,0 +1,100 @@
### Please describe the problem.
I have a special remote "source" set up as type=directory importtree=yes.
I pulled it into one repo in which none of the files were wanted. So far so good. I cloned that repo to a second, archive, repo. `git annex sync` worked (but took 2 hours). Then I ran `git annex get --auto`, and most of the files came through OK. But some failed with messages like this:
```
get Pictures/.dtrash/info/IMG_2979_v1-e9dced7b.dtrashinfo (from source...) (checksum...)
verification of content failed
Unable to access these remotes: source
No other repository is known to contain the file.
failed
```
(Both repos are unlocked via `git annex adjust --unlock`.)
Upon looking at the file in the repo where it wasn't wanted, I saw this:
```
$ cat IMG_2979_v1-e9dced7b.dtrashinfo
/annex/objects/SHA256E-s144--cec5c7b6a9d97344e374e8395e02b74350678147ff65d6df091f5115cf18bf72
```
Interesting. So, in the source directory:
```
$ sha256sum IMG_2979_v1-e9dced7b.dtrashinfo
aca3ed7243def7a0bd5fcad542c66841b8e7d2a670b4cafe749eb27e032d8975 IMG_2979_v1-e9dced7b.dtrashinfo
```
That's not a match at all. Well, OK then:
```
$ sha256sum * | grep cec5c7
cec5c7b6a9d97344e374e8395e02b74350678147ff65d6df091f5115cf18bf72 IMG_2981_v1-5fc99c7a.dtrashinfo
```
Yikes. So for IMG_2979_v1-e9dced7b.dtrashinfo, git-annex recorded a checksum that belonged to IMG_2981_v1-5fc99c7a.dtrashinfo. Well then, what is this other file recorded as, back in the git-annex repo?
```
$ cat IMG_2981_v1-5fc99c7a.dtrashinfo
/annex/objects/SHA256E-s144--cec5c7b6a9d97344e374e8395e02b74350678147ff65d6df091f5115cf18bf72
```
OK, so two files that were not identical in the source directory somehow got recorded with an identical checksum in git-annex. And when I tried to retrieve them via `git annex get --auto`, the mismatch was at least detected.
In this .dtrash/info directory, 436 files out of 719 were not loaded by `git annex get`, presumably due to this issue.
In this directory, the source files were ranging in size from 140 to 227 bytes.
In a companion directory, .dtrash/files, 24 out of 719 files exhibited this issue. These files tended to be larger, but one that was 495MB triggered it also.
I have not yet seen it outside .dtrash, but it will be many hours until this get process completes fully, as it needs to copy about 1TB of data.
In case you are wondering if there is a race condition with .dtrash: no. The only application that writes to it isn't running, and the last time a file was modified in there was over a year ago. Also the content of the .info files is just JSON and a corresponding filename embedded in them, so it is very clear that the files on the filesystem are correct and the calculated checksums at issue here were never correct.
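
The manual comparison above (pointer file contents in the repo versus `sha256sum` in the source directory) can be sketched as a small script. This is only a sketch under assumptions: the function names `pointer_digest` and `check_pair` are hypothetical, it assumes the repo file is still an unlocked pointer whose first line looks like `/annex/objects/SHA256E-s144--<digest>`, optionally followed by a file extension, and it assumes GNU `sha256sum` is available.

```shell
#!/bin/sh
# Hypothetical helper: extract the hex digest from a pointer line such as
#   /annex/objects/SHA256E-s144--<digest>[.ext]
pointer_digest() {
    key=${1##*/}                    # drop the /annex/objects/ prefix
    digest=${key#*--}               # drop the "SHA256E-s144--" part
    printf '%s\n' "${digest%%.*}"   # drop a trailing extension, if any
}

# Hypothetical helper: compare the digest recorded in a pointer file
# against the actual SHA-256 of the corresponding source file.
# Prints "MISMATCH <source file>" when they differ, nothing otherwise.
check_pair() {
    pointer_file=$1
    source_file=$2
    recorded=$(pointer_digest "$(head -n 1 "$pointer_file")")
    actual=$(sha256sum "$source_file" | cut -d ' ' -f 1)
    if [ "$recorded" != "$actual" ]; then
        printf 'MISMATCH %s\n' "$source_file"
    fi
}

# Example use (both paths are placeholders):
#   check_pair repo/file.dtrashinfo source/file.dtrashinfo
```

Running `check_pair` over every file in an affected directory would give a count of wrongly recorded keys without waiting for `git annex get` to fail on each one.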
### What steps will reproduce the problem?
I have laid that out as best I can above.
### What version of git-annex are you using? On what operating system?
8.20210223 on Debian
### Please provide any additional information below.
Assistant is not being used.
Setup:
```
REPO=Pictures
cd /acrypt/git-annex/repos
mkdir $REPO
cd $REPO
git init
git config annex.thin true
git annex init 'local hub'
git annex wanted . "include=* and exclude=$REPO/*"
# Now initialize things.
touch mtree
git annex add mtree
git annex sync
git annex adjust --unlock
git annex initremote source type=directory directory=/acrypt/git-annex/bind-ro/$REPO importtree=yes encryption=none
git annex enableremote source directory=/acrypt/git-annex/bind-ro/$REPO
git config remote.source.annex-readonly true
git config remote.source.annex-tracking-branch main:$REPO
git config annex.securehashesonly true
git config annex.genmetadata true
git config annex.diskreserve 100M
```
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="jgoerzen"
avatar="http://cdn.libravatar.org/avatar/090740822c9dcdb39ffe506b890981b4"
subject="comment 1"
date="2022-09-05T01:14:53Z"
content="""
Update: This also occurred in other directories, with some video files from 2018. One directory contains 1945 files, and another 21 files. I'm not finding an obvious pattern to the issue.
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="Lukey"
avatar="http://cdn.libravatar.org/avatar/c7c08e2efd29c692cc017c4a4ca3406b"
subject="comment 2"
date="2022-09-05T09:35:18Z"
content="""
You really should upgrade to the latest version.
"""]]