48 lines
2.4 KiB
Markdown
48 lines
2.4 KiB
Markdown
### Please describe the problem.
|
|
|
|
Hi!
|
|
|
|
I've got a directory special remote set up, with importtree. After scanning through it (import filename... ok), it stops reading from the filesystem and spends hours just reading from cidsdb. Verifying with strace, there is not reading from the filesystem here.
|
|
|
|
### What steps will reproduce the problem?
|
|
|
|
Here are my prep steps:
|
|
|
|
```
|
|
git init
|
|
git config annex.thin true
|
|
git annex init 'local hub'
|
|
git annex wanted . "include=* and exclude=testdata/*"
|
|
touch mtree
|
|
git annex add mtree
|
|
git annex sync
|
|
git annex adjust --unlock-present
|
|
git annex initremote source type=directory directory=/acrypt/git-annex/bind-ro/testdata importtree=yes encryption=none
|
|
git annex enableremote source directory=/acrypt/git-annex/bind-ro/testdata
|
|
git config remote.source.annex-readonly true
|
|
git config remote.source.annex-tracking-branch main:testdata # (says to put the files in the "$REPO" directory)
|
|
git config annex.securehashesonly true
|
|
git config annex.genmetadata true
|
|
git config annex.diskreserve 100M
|
|
git annex sync
|
|
```
|
|
|
|
It is that last sync that runs into this problem.
|
|
|
|
### What version of git-annex are you using? On what operating system?
|
|
|
|
10.20230407 on Debian bullseye
|
|
|
|
### Please provide any additional information below.
|
|
|
|
There are about 150,000 files in that tree. This problem occurs *after* git-annex is done scanning the tree (verified with strace and lsof). This problem does not occur with the same number of files in a local/standard git-annex repo (as opposed to the directory special remote).
|
|
|
|
.git/annex/cidsdb/db is only 51M so it is certainly entirely cached. git-annex is entirely CPU-bound at this point.
|
|
|
|
I can rerun the sync with an unchanged import directory. It still takes 107 minutes, the majority of which is spent reading cidsdb. Only the first minute or two are spent scanning the source area.
|
|
|
|
I have tested this on a source directory that's 2.2G and another that's 1.1T, both with about 150,000 files. After the first import, the subsequent syncs are similar in performance. In other words, this behavior appears to be related to the number of files, not the size of files.
|
|
|
|
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
|
|
|
|
Yes, and I hope to use it for a project to archive family photos and videos to BD-R (that's what this is about here)
|