From e1fa97001037cf6f4b6a63a875ac9ad8fcfdcf56 Mon Sep 17 00:00:00 2001
From: jgoerzen
Date: Tue, 30 May 2023 12:23:28 +0000
Subject: [PATCH]

---
 doc/bugs/importtree_spends_hours_reading_cidsdb.mdwn | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/doc/bugs/importtree_spends_hours_reading_cidsdb.mdwn b/doc/bugs/importtree_spends_hours_reading_cidsdb.mdwn
index 075f0c52d0..b5d2f5d3ad 100644
--- a/doc/bugs/importtree_spends_hours_reading_cidsdb.mdwn
+++ b/doc/bugs/importtree_spends_hours_reading_cidsdb.mdwn
@@ -39,6 +39,10 @@ There are about 150,000 files in that tree. This problem occurs *after* git-ann
 
 .git/annex/cidsdb/db is only 51M so it is certainly entirely cached. git-annex is entirely CPU-bound at this point.
 
+I can rerun the sync with an unchanged import directory. It still takes 107 minutes, the majority of which is spent reading cidsdb. Only the first minute or two are spent scanning the source area.
+
+I have tested this on a source directory that's 2.2G and another that's 1.1T, both with about 150,000 files. After the first import, the subsequent syncs are similar in performance. In other words, this behavior appears to be related to the number of files, not the size of files.
+
 ### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
 
 Yes, and I hope to use it for a project to archive family photos and videos to BD-R (that's what this is about here)