don't frontload reconcileStaged in git-annex init

init: Avoid scanning for annexed files, which can be lengthy in a
large repository. Instead that scan is done on demand. This lets git-annex
init be run and some query commands be used in a repository without
waiting.

Note that autoinit already behaved this way, so while this means that
commands like git-annex get/unlock/add will do the scan the first time
they are run, that is not really a significant behavior change.

It's also better to have consistent behavior. The reason for the
inconsistency was a strange bug discussed in b3c4579c79. Avoiding
reconcileStaged in init will keep avoiding whatever that bug was.
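
The deferral described above, where init skips the scan and the first command that needs the keys database pays for it, can be illustrated with a minimal sketch. This is not git-annex code (git-annex is written in Haskell); `KeysDatabase`, `init_repo`, and `slow_scan` are hypothetical names invented for the illustration, and the scan is a stand-in for whatever reconcileStaged actually does.

```python
class KeysDatabase:
    """Toy stand-in for a keys database populated by an expensive scan."""

    def __init__(self, scan):
        self._scan = scan        # hypothetical callable doing the slow scan
        self._associated = None  # None means the scan has not run yet
        self.scan_count = 0      # how many times the scan has actually run

    def _ensure_scanned(self):
        # Run the scan only on first use; later calls reuse the result.
        if self._associated is None:
            self._associated = self._scan()
            self.scan_count += 1

    def associated_files(self, key):
        self._ensure_scanned()
        return self._associated.get(key, [])


def init_repo(scan):
    """Analogue of init: set up state, but do not scan yet."""
    return KeysDatabase(scan)


def slow_scan():
    # Placeholder data; a real scan would walk the repository.
    return {"KEY1": ["a.bin", "copy/a.bin"], "KEY2": ["b.bin"]}


db = init_repo(slow_scan)
scans_after_init = db.scan_count       # init did no scanning
files = db.associated_files("KEY1")    # first access triggers the scan
db.associated_files("KEY2")            # second access reuses the result
```

In this sketch, commands that never touch the database never trigger the scan, which mirrors why init and some query commands can return without waiting.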

Sponsored-by: Dartmouth College's DANDI project
Joey Hess 2022-11-18 13:58:35 -04:00
parent c834d2025a
commit 2b014f1a8b
8 changed files with 26 additions and 16 deletions

@@ -5,4 +5,15 @@
content="""
Implemented the two optimisations discussed above, and init in that
repository dropped from 24 seconds to 19 seconds, a 21% speedup.
I think that's as fast as reconcileStaged is likely to get without
some deep optimisation of the persistent library.

Then I realized that `git-annex init` does not really need to scan for
associated files. That can be done later, when running a command that needs
to access the keys database. Indeed, when git-annex is used in a clone of
an annexed repo without explicitly running `git-annex init`, that's what
it already did. I've implemented that, so now `git-annex init` takes 3
seconds or so. The price will be paid later, the first time running a
`git-annex add` or `git-annex unlock` or `git-annex get`.
"""]]