annex.adjustedbranchrefresh

Added annex.adjustedbranchrefresh git config to update adjusted branches
set up by git-annex adjust --unlock-present/--hide-missing.

Note, in a few cases, I was not able to make the adjusted branch
be updated in calls to moveAnnex, because information about what
file corresponds to a key is not available. They are:

* If two files point to the same key, then e.g. `git annex get foo` will
  update the branch to unlock foo, but will not unlock bar, because it
  does not know about it. Might be fixable by making `git annex get
  bar` do something besides skipping bar?
* git-annex-shell recvkey has the same limitation (so this affects sends
  over ssh from old versions of git-annex)
* git-annex setkey
* git-annex transferkey if the user does not use --file
* git-annex multicast sends keys with no associated file info

Doing a single full refresh at the end, after any incremental refresh,
will deal with those edge cases.
Joey Hess 2020-11-16 14:09:55 -04:00
parent af6af35228
commit 0896038ba7
28 changed files with 311 additions and 180 deletions

@@ -82,7 +82,8 @@ and will also propagate commits back to the original branch.
branch.
To update the adjusted branch to reflect changes to content availability,
-run `git annex adjust --hide-missing` again.
+run `git annex adjust --hide-missing` again. Or, to automate updates,
+set the `annex.adjustedbranchrefresh` config.
Despite missing files being hidden, `git annex sync --content` will
still operate on them, and can be used to download missing
@@ -104,7 +105,8 @@ and will also propagate commits back to the original branch.
not be broken symlinks.
To update the adjusted branch to reflect changes to content availability,
-run `git annex adjust --unlock-present` again. Or use `git-annex sync
+run `git annex adjust --unlock-present` again. Or, to automate updates,
+set the `annex.adjustedbranchrefresh` config. Or use `git-annex sync
--content`, which updates the branch after transferring content.
# SEE ALSO

@@ -1016,6 +1016,22 @@ Like other git commands, git-annex is configured via `.git/config`.
When the `--batch` option is used, this configuration is ignored.
+* `annex.adjustedbranchrefresh`
+When [[git-annex-adjust]](1) is used to set up an adjusted branch
+that needs to be refreshed after getting or dropping files, this config
+controls how frequently the branch is refreshed.
+Refreshing the branch takes some time, so doing it after every file
+can be too slow. The default value is 0 (or false), which does not
+refresh the branch. 1 (or true) will refresh once, after git-annex
+has made other changes. Higher values refresh after approximately that
+many files need to be updated. Ie, 2 refreshes after every file,
+and 100 after every 99 files.
+(If git-annex gets faster in the future, refresh rates will increase
+proportional to the speed improvements.)
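
As a rough illustration of the config described above (not part of the commit), the setting can be tried out with plain `git config`. The sketch below uses a throwaway repository created with `mktemp` so that it is safe to run anywhere; the scratch repository and the `repo` variable are assumptions made for the example. In a real git-annex repository already on an adjusted branch, only the `git config` line would be needed.

```shell
# Scratch repository, used here only so the example does not touch a real repo.
repo=$(mktemp -d)
git init -q "$repo"

# 1 (or true): refresh the adjusted branch once, after git-annex has made
# its other changes. 0 (or false) is the default and never refreshes.
git -C "$repo" config annex.adjustedbranchrefresh 1

# Read the value back to confirm what git-annex would see.
git -C "$repo" config --get annex.adjustedbranchrefresh
```

Higher values trade refresh latency for speed, so a large repository would presumably want a larger number than a small one.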
* `annex.queuesize`
git-annex builds a queue of git commands, in order to combine similar

@@ -10,6 +10,18 @@ it makes sense to do that? And for that matter, can it be done efficiently
enough to do it more frequently? After every file or after some number of
files, or after processing all files in a (sub-)tree?
+> Since the answer to that will change over time, let's make a config for
+> it. annex.adjustedbranchrefresh
+>
+> The config can start out as a range from 0 up that indicates how
+> infrequently to update the branch for this. With 0 being never refresh,
+> 1 being refresh the minimum (once at shutdown), 2 being refresh after
+> 1 file, 100 after every 99 files, etc.
+> If refreshing gets twice as fast, divide the numbers by two, etc.
+> If it becomes sufficiently fast that the overhead doesn't matter,
+> it can change to a simple boolean. Since 0 is false and > 0 is true,
+> in git config, the old values will still work. (done)
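
The mapping from config value to refresh behaviour described above can be sketched as a small shell function. This is only a model of the documented numbers, not how git-annex implements it: `describe_refresh` is a hypothetical helper, it handles integer values only (not `true`/`false`), and the real intervals are approximate.

```shell
# Hypothetical helper: describe the refresh policy for an integer value of
# annex.adjustedbranchrefresh, following the mapping given in the text.
describe_refresh() {
    n=$1
    if [ "$n" -le 0 ]; then
        echo "never refresh"
    elif [ "$n" -eq 1 ]; then
        echo "refresh once, at shutdown"
    else
        # 2 means after every file, 100 means after every 99 files.
        echo "refresh after about every $((n - 1)) file(s)"
    fi
}

describe_refresh 0
describe_refresh 1
describe_refresh 2
describe_refresh 100
```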
Investigation of the obvious things that make it slow follows:
## efficient branch adjustment