git-annex/doc/bugs/git-annex_branch_push_race.mdwn
Joey Hess c4c965d602 detect and recover from branch push/commit race
Dealing with a race without using locking is exceedingly difficult and tricky.
Fully tested, I hope.

There are three places left where the branch can be updated that are not
covered by the race recovery code. Let's prove they're all immune to the
race:

1. tryFastForwardTo checks to see if a fast-forward can be done,
   and then does git-update-ref on the branch to fast-forward it.

   If a push comes in before the check, then either no fast-forward
   will be done (ok), or the push set the branch to a ref that can
   still be fast-forwarded (also ok).

   If a push comes in after the check, the git-update-ref will
   undo the ref change made by the push. It's as if the push did not come
   in, and the next git-push will see this, and try to re-do it.
   (acceptable)

2. When creating the branch for the very first time, an empty index
   is created, and a commit of it made to the branch. The commit's ref
   is recorded as the current state of the index. If a push came in
   during that, it will be noticed the next time a commit is made to the
   branch, since the branch will have changed. (ok)

3. Creating the branch from an existing remote branch involves making
   the branch, and then getting its ref, and recording that the index
   reflects that ref.

   If a push creates the branch first, git-branch will fail (ok).

   If the branch is created and a racing push is then able to change it
   (highly unlikely!) we're still ok, because the ref is first recorded in
   the index.lck, and only then is the index updated. The race can cause the
   index.lck to have the old branch ref, while the index has the newly pushed
   branch merged into it, but that only results in an unnecessary update of
   the index file later on.
2011-12-11 20:41:35 -04:00


The fix for the [[git-annex_branch_corruption]] bug is subject to a race.
With that fix, git-annex does this when committing a change to the branch:

1. lock the journal file (this avoids git-annex racing itself, FWIW)
2. check what the head of the branch points to, to see if a newer branch
   has appeared
3. if so, update the index file from the branch
4. stage changes in the index
5. commit to the branch using the index file

If a push to the branch comes in during steps 2-5, then
[[git-annex_branch_corruption]] could still occur.
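
The window in steps 2-5 can be illustrated with a toy model. This is only a sketch, not git-annex's code: commits are plain dictionaries, a "tree" maps file names to sets of log lines, and the function name `race_demo` and the file names are made up for the example. It assumes the commit in step 5 uses whatever the branch head is at that moment as its parent.

```python
def race_demo():
    """Replay steps 2-5 with a push landing between steps 4 and 5."""
    commits = {}  # commit id -> (parent id, tree)
    commits[1] = (None, {"a.log": {"here"}})
    head = 1

    # Steps 2-3: the index is brought in line with the branch head.
    index = {name: set(lines) for name, lines in commits[head][1].items()}

    # Step 4: stage a local change in the index.
    index["b.log"] = {"here"}

    # A push arrives now: it adds a line to a.log and moves the head.
    commits[2] = (head, {"a.log": {"here", "there"}})
    head = 2

    # Step 5: commit the (now stale) index on top of the current head.
    # The commit's tree is the whole index, so it silently drops the
    # line that the push added.
    commits[3] = (head, index)
    head = 3

    return commits[head][1]


tree = race_demo()
print(sorted(tree["a.log"]))  # the pushed line "there" is missing
```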

---

## approach 1, using locking

Add an update hook and a post-update hook. The update hook
will use locking to ensure that no git-annex is currently running
a commit, and block any git-annex processes from starting one. It
will background itself, and remain running during the push.
The post-update hook will signal it to exit.

I don't like this approach much, since it involves a daemon, two hooks,
and lots of things to go wrong. And it blocks using git-annex during a
push. This approach should be a last resort.

## approach 2, lockless method

After a commit is made to the branch, check to see if the parent of
the commit is the same ref that the index file was last updated to. If it's
not, then the race occurred.

How to recover from the race? Well, just union merging the parent of the
commit into the index file and re-committing should work, I think. When
the race occurs, the commit reverts its parent's changes, and this will
redo them.

(Of course, this re-commit will also be subject to the race, and
will need the same check for the race as the other commits. It won't loop
forever, I hope.)
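
The check and the recovery loop can be sketched in a small simulation. This is a model under stated assumptions, not the actual implementation: `Repo`, `union_merge`, `commit_with_recovery` and `demo` are hypothetical names, a tree maps file names to sets of lines, and union merging is modelled as a per-file set union. The `incoming_pushes` argument exists only to simulate pushes landing just before each commit.

```python
import itertools


class Repo:
    """Toy stand-in for the branch: a commit store plus one head ref."""

    def __init__(self):
        self.commits = {}  # commit id -> (parent id, tree)
        self.head = None
        self._ids = itertools.count(1)

    def tree(self, cid):
        return {} if cid is None else dict(self.commits[cid][1])

    def commit(self, tree):
        """Commit a full tree on top of the current head."""
        parent = self.head
        cid = next(self._ids)
        self.commits[cid] = (parent, dict(tree))
        self.head = cid
        return cid, parent


def union_merge(index, tree):
    """Keep every line from both sides of every file."""
    merged = dict(index)
    for name, lines in tree.items():
        merged[name] = merged.get(name, frozenset()) | lines
    return merged


def commit_with_recovery(repo, index, index_ref, incoming_pushes=()):
    """Commit the index; if the parent is not index_ref, merge and retry."""
    pushes = list(incoming_pushes)
    attempts = 0
    while True:
        if pushes:
            # Simulated push landing just before our commit.
            repo.commit(union_merge(repo.tree(repo.head), pushes.pop(0)))
        new, parent = repo.commit(index)
        attempts += 1
        if parent == index_ref:
            return index, attempts  # no race: the index matched the parent
        # Race: our commit reverted the parent's changes. Redo them by
        # union merging the parent into the index, then commit again.
        index = union_merge(index, repo.tree(parent))
        index_ref = new  # the index now covers the commit we just made


def demo():
    repo = Repo()
    base, _ = repo.commit({"a.log": frozenset({"here"})})
    index = repo.tree(base)
    index["b.log"] = frozenset({"here"})
    push = {"a.log": frozenset({"there"})}
    _, attempts = commit_with_recovery(repo, index, base, [push])
    return repo.tree(repo.head), attempts


final_tree, attempts = demo()
print(attempts, sorted(final_tree["a.log"]))
```

In this model the loop ends as soon as one commit goes through without a push landing in between, so it can only repeat as long as pushes keep arriving. The raced commit stays in the history, and the re-commit on top of it restores the pushed lines.
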
> [[done]] and tested.
--[[Joey]]