emptying pushes only delete

This commit is contained in:
Joey Hess 2024-05-21 09:52:35 -04:00
parent 5f4ad2a5de
commit b042dfeb0e
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38

View file

@ -12,42 +12,6 @@ This is implememented and working. Remaining todo list for it:
* Test incremental push edge cases involving checkprereq. * Test incremental push edge cases involving checkprereq.
* A race between an incremental push and a full push can result in
a bundle that the incremental push is based on being deleted by the full
push, and then incremental push's manifest file being written later.
Which will prevent cloning or some pulls from working.
There is no way to prevent this race, but the problem can be avoided:
Make each full push also write to a fallback manifest file that is only
written by full pushes, not incremental pushes. When fetching the main
manifest file, always check that all bundles mentioned in it are still in
the remote. If any are missing, fetch and use the fallback manifest file
instead.
(The only other solution I can think of is to never delete old bundles,
except after some amount of time long enough that it credibly avoids
the race. But since a process could be suspended at any point and resumed
later, the race window could be arbitrarily wide.)
* A race between two full pushes can also result in the manifest file listing
a bundle that has been deleted:
Start with a full push of bundle A.
Then there are 2 racing full pushes X and Y, of bundle A and B
respectively. With this series of operations:
1. Y: write bundle B
1. Y: read manifest (listing A)
1. Y: write B to manifest
1. X: write bundle A
1. Y: delete bundle A
1. X: read manifest (listing B)
1. X: write A to manifest
1. X: delete bundle B
Which results in a manifest that lists A, but that bundle was deleted.
* Cloning from an annex:: url with importtree=yes doesn't work * Cloning from an annex:: url with importtree=yes doesn't work
(with or without exporttree=yes). This is because the ContentIdentifier (with or without exporttree=yes). This is because the ContentIdentifier
db is not populated. It should be possible to work around this. db is not populated. It should be possible to work around this.
@ -95,3 +59,52 @@ This is implememented and working. Remaining todo list for it:
They would have to use git fetch --prune to notice the deletion. They would have to use git fetch --prune to notice the deletion.
Once the user does notice, they can re-push their ref to recover. Once the user does notice, they can re-push their ref to recover.
Can this be improved? Can this be improved?
## bundle deletion problems
Deleting bundles results in some problems involving races,
detailed below, that result in the manifest file listing a bundle that has
been deleted. Which breaks cloning, and is data loss, and so *must*
be solved before release.
* A race between an incremental push and a full push can result in
a bundle that the incremental push is based on being deleted by the full
push, and then incremental push's manifest file being written later.
Which will prevent cloning or some pulls from working.
A fix: Make each full push (and emptying push) also write to a fallback
manifest file that is only written by full pushes (and emptying pushes),
not incremental pushes. When fetching the main manifest file, always
check that all bundles mentioned in it are still in the remote. If any
are missing, fetch and use the fallback manifest file instead.
* A race between two full pushes can also result in the manifest file listing
a bundle that has been deleted:
Start with a full push of bundle A.
Then there are 2 racing full pushes X and Y, of bundle A and B
respectively. With this series of operations:
1. Y: write bundle B
1. Y: read manifest (listing A)
1. Y: write B to manifest
1. X: write bundle A
1. Y: delete bundle A
1. X: read manifest (listing B)
1. X: write A to manifest
1. X: delete bundle B
Which results in a manifest that lists A, but that bundle was deleted.
The problems above *could* be solved by not deleting bundles, but that is
unsatisfactory.
Old bundles could be deleted after some period of time. But a process can
be suspended at any point and resumed later, so the race windows can be
arbitrarily wide.
What if only emptying pushes delete bundles? If a manifest file refers to a
bundle that has been deleted, that can be treated the same as if the
manifest file was empty, because we know that, for that bundle to have been
deleted, there must have been an emptying push.