git-annex/doc/todo/Incremental_git_annex_sync_--content_--all.mdwn
2023-06-29 13:30:26 -04:00

21 lines
1.3 KiB
Markdown

Hi,<br>
So I have yet another idea to speed up git annex. For now only for the 2nd pass of git annex sync --content --all.
1. Check which remotes are currently available (i.e. online and connected).
2. If any of the (available) remotes doesn't have a commit recorded (See below), do a full sync.
3. If numcopies.log, mincopies.log, trust.log, group.log, preferred-content.log, required-content.log, group-preferred-content.log or transitions.log (on the git-annex branch) changed since the last time we synced, do a full sync (since if any of those logs changed, it may affect the preferred-content expressions and we need to reevaluate every key).
4. If every check passed, do an incremental sync.
##Full sync:
1. Do a normal (full) git annex sync.
2. For every remote that we synced with, record the commit id of the current tip of the git-annex branch.
* Record the commit id only if every file was successfully transferred/dropped.
##Incremental sync:
1. In the 2nd pass of git annex sync --content --all, only look at keys whose location log changed since the last (full or incremental) sync via `git diff-tree -r --name-only <lowest recorded commit id of all remotes> git-annex`.
2. Again, update the commit id of remotes that we successfully synced with.
[[!tag confirmed]]
[[!tag projects/datalad/potential]]