log migration trees to git-annex branch

This will allow distributed migration: Start a migration in one clone of
a repo, and then update other clones.

commitMigration is a bit of a bear. There is some inversion of control
that needs some TMVars. Also streamLogFile's finalizer does not handle
recording the trees, so an interrupt at just the wrong time can cause
migration.log to be emptied but the git-annex branch not updated.

Sponsored-by: Graham Spencer on Patreon
Joey Hess 2023-12-06 15:38:01 -04:00
parent b55efc179a
commit 0bd8b17b59
GPG key ID: DB12DB0FF05F8F38
12 changed files with 219 additions and 43 deletions

@@ -0,0 +1,17 @@
[[!comment format=mdwn
username="joey"
subject="""comment 2"""
date="2023-12-06T19:31:42Z"
content="""
On the `distributedmigration` branch I have `git-annex migrate` recording
migrations on the git-annex branch.
Its method of grafting in 2 trees, one with the old keys and one with the
new, is quite efficient. In a migration of 1000 files from SHA256E to SHA1,
the git objects only need 52 kb to record the migration trees,
compared with 424 kb needed to update the location logs.
The total git repo grew from 508 kb to 984 kb.
Next up: Make `git-annex migrate --update` find new migrations started
elsewhere and apply them to the local annex objects.
"""]]
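
The sizes quoted in the comment can be cross-checked with a little arithmetic. The sketch below is only an illustration: the variable names are made up here, and it assumes the two quoted figures (migration trees and location logs) are the only contributors to the repository growth, which the comment does not state outright.

```python
# Sizes in kb, copied from the comment above.
repo_before_kb = 508
repo_after_kb = 984
migration_trees_kb = 52
location_logs_kb = 424

# Total growth of the git repo during the migration.
growth_kb = repo_after_kb - repo_before_kb

# Assumption: trees plus location logs make up all of the growth.
accounted_kb = migration_trees_kb + location_logs_kb

# Fraction of the growth spent on recording the migration trees.
tree_share = migration_trees_kb / growth_kb

print(f"growth: {growth_kb} kb, accounted for: {accounted_kb} kb")
print(f"migration trees: {tree_share:.1%} of the growth")
```

Under that assumption the two figures sum exactly to the reported growth, and the migration trees make up roughly a tenth of it.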