don't frontload reconcileStaged in git-annex init

init: Avoid scanning for annexed files, which can be lengthy in a large repository. Instead that scan is done on demand. This lets git-annex init be run and some query commands be used in a repository without waiting.

Note that autoinit already behaved this way, so while this will mean some commands like git-annex get/unlock/add will do the scan the first time they are run, that is not really a significant behavior change. And, it's really better to have consistent behavior.

The reason for the inconsistency was a strange bug discussed in b3c4579c79. Avoiding reconcileStaged in init will keep avoiding whatever that was.

Sponsored-by: Dartmouth College's DANDI project
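The on-demand behavior described in the message above can be pictured with a small sketch. This is only an illustration in Python, not git-annex's Haskell implementation: `LazyKeysDb`, `init`, `get`, and the `scan` callback are hypothetical names standing in for the keys database, the two commands, and the scan for annexed files.

```python
from typing import Callable, Optional, Set


class LazyKeysDb:
    """Hypothetical stand-in for the keys database (not git-annex's API)."""

    def __init__(self, scan: Callable[[], Set[str]]) -> None:
        self._scan = scan                        # the expensive scan, supplied by the caller
        self._files: Optional[Set[str]] = None   # None means "not scanned yet"
        self.scans_run = 0                       # counts how often the scan actually ran

    def associated_files(self) -> Set[str]:
        # Run the scan only on first access, then reuse the cached result.
        if self._files is None:
            self._files = self._scan()
            self.scans_run += 1
        return self._files


def init(db: LazyKeysDb) -> None:
    """Stand-in for `git-annex init`: sets things up without touching the scan."""
    return None


def get(db: LazyKeysDb, path: str) -> bool:
    """Stand-in for a command that needs the keys database, so it triggers the scan."""
    return path in db.associated_files()
```

In this sketch `init` returns immediately, and the first `get` pays the cost of the scan. Later calls reuse the cached result, so the scan runs at most once.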
2022-11-18 13:58:35 -04:00
commit 2b014f1a8b
parent c834d2025a
8 changed files with 26 additions and 16 deletions
--- a/doc/bugs/performance_regression63_init_takes_times_more/comment_14_8c3b13806adb731435b346a64990527b._comment
+++ b/doc/bugs/performance_regression63_init_takes_times_more/comment_14_8c3b13806adb731435b346a64990527b._comment
@@ -5,4 +5,15 @@
 content="""
 Implemented the two optimisations discussed above, and init in that
 repository dropped from 24 seconds to 19 seconds, a 21% speedup.
+
+I think that's as fast as reconcileStaged is likely to get without
+some deep optimisation of the persistent library.
+
+Then I realized that `git-annex init` does not really need to scan for
+associated files. That can be done later, when running a command that needs
+to access the keys database. Indeed, when git-annex is used in a clone of
+an annexed repo without explicitly running `git-annex init`, that's what
+it already did. I've implemented that, so now `git-annex init` takes 3
+seconds or so. The price will be paid later, the first time running a
+`git-annex add` or `git-annex unlock` or `git-annex get`.
 """]]