blog for the day
parent d3af414568
commit 07a8a59dc6
1 changed file with 80 additions and 0 deletions
doc/design/assistant/blog/day_318__forgetting.mdwn (Normal file, 80 additions)
@@ -0,0 +1,80 @@
I spent yesterday shopping for a new laptop, since this one is dying.
(Soon I'll be able to compile git-annex fast-ish! Yay!) And thinking about
[[wishlist:_dropping_git-annex_history]].

Today, I added the `git annex forget` command. So far it's only been lightly
tested, seems to work, and is living in the `forget` branch until I gain
confidence with it. It should be perfectly safe to use, even if it's buggy,
because you can use `git reflog git-annex` to pull out and revert to an old
version of your git-annex branch. So if you've been wanting this feature,
please beta test!

----
I actually implemented something more generic than just forgetting git
history. There's now a whole mechanism for git-annex doing distributed
transitions of whatever sort is needed.

There were several subtleties involved in distributed transitions:

First is how to tell when a given transition has already been done on a
branch. At first I was thinking that the transition log should include the
sha of the first commit on the old branch that got rewritten. However, that
would mean that after a single transition had been done, every git-annex
branch merge would need to look up the first commit of the current branch,
to see if it's done the transition yet. That's slow! Instead, transitions
are logged with a timestamp, and as long as a branch contains a transition
with the same timestamp, it's been done.
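
To make the timestamp idea concrete, here is a minimal sketch in Python (not git-annex's Haskell, and not its real log format). The names `Transition`, `already_done` and `pending` are made up for illustration; the only thing taken from the design above is that a transition counts as done when the branch's log contains an entry with the same timestamp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    """One logged transition (hypothetical shape, for illustration only)."""
    name: str         # assumed label, e.g. "ForgetGitHistory"
    timestamp: float  # when the transition was first requested


def already_done(branch_log, wanted):
    """A branch has done a transition if its log holds the same entry.

    This is a set lookup; no walk back to the first commit is needed.
    """
    return wanted in branch_log


def pending(local_log, remote_log):
    """Transitions the remote has logged that the local branch has not."""
    return remote_log - local_log
```

Comparing two small sets of log entries is cheap, which is the point: nothing here depends on how long the branch's history is.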

A really tricky problem is what to do if the local repository has
transitioned, but a remote has not, and changes keep being made to the
remote. What it does so far is incorporate the changes from the remote into
the index, and re-run the transition code over the whole thing to yield a
single new commit. This might not be very efficient (once I write the more
full-featured transition code), but it lets the local repo keep up with
what's going on in the remote, without directly merging with it (which
would revert the transition). And once the remote repository has its
git-annex upgraded to one that knows about transitions, it will finish up
the transition on its side automatically, and the two branches will once
again merge.

Related to the previous problem, we don't want to keep trying to merge
from a remote branch when it's not yet transitioned. So a blacklist is
used, of untransitioned commits that have already been integrated.
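
The last two points can be sketched together. This is a toy model in Python, assuming a branch is just a dict mapping log file names to sets of lines; `integrate_remote`, `union_merge` and the `ignored` set are hypothetical names, not git-annex's actual code. It shows the shape of the idea: fold the remote's changes in, re-run the transition over the result, and remember the remote commit so it is not integrated again.

```python
def union_merge(a, b):
    """Combine two branch states by taking the union of lines per file."""
    merged = {path: set(lines) for path, lines in a.items()}
    for path, lines in b.items():
        merged.setdefault(path, set()).update(lines)
    return merged


def integrate_remote(local_state, remote_state, remote_commit, ignored, transition):
    """Fold an untransitioned remote branch into a transitioned local one.

    `ignored` plays the role of the blacklist: remote commits that have
    already been integrated. It is updated in place.
    """
    if remote_commit in ignored:
        return local_state  # already integrated, nothing to do

    # Stand-in for staging the remote's changes into the index.
    merged = union_merge(local_state, remote_state)

    # Re-run the transition over everything, giving one new state
    # (one new commit, in the real thing).
    new_state = transition(merged)

    ignored.add(remote_commit)
    return new_state
```

For a plain `git annex forget` the `transition` argument could be the identity function, since only history is dropped; a fancier transition would filter the merged state.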

One really subtle thing is that when the user does a transition more
complicated than `git annex forget`, like the `git annex forget --dead`
that I need to implement to forget dead remotes, they're not just telling
git-annex to forget whatever dead remotes it knows right now. They're
actually telling git-annex to perform the transition one time on every
existing clone of the repository, at some point in the future. Repositories
with unfinished transitions could hang around for years, and at some future
point when git-annex runs in the repository again, it would merge in the
current state of the world, and re-do the transition. So you might tell it
to forget dead remotes today, and then the very repository you ran that in
later becomes dead, and a long-slumbering repo wakes up and forgets about
the repo that started the whole process! I hope users don't find this
massively confusing, but that's how the implementation works right now.

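
Here's a toy illustration of that surprise, in Python. `forget_dead` and the repository names are invented for the example, and `--dead` isn't implemented yet, so treat this purely as a sketch of the semantics described above: the same recorded transition gives different results depending on what is known to be dead when a clone finally runs it.

```python
def forget_dead(location_log, dead_repos):
    """Drop entries for repositories that are dead at the time of the call."""
    return {key: repos - dead_repos for key, repos in location_log.items()}


# The transition is recorded once, but each clone applies it to whatever
# it knows when it eventually runs.
log = {"key1": {"laptop", "server", "usbdrive"}}

# Run today on the laptop: only the usb drive is dead.
today = forget_dead(log, dead_repos={"usbdrive"})

# Run years later in a long-idle clone, after the laptop has died too:
# the repository that requested the transition is itself forgotten.
later = forget_dead(log, dead_repos={"usbdrive", "laptop"})
```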

----

I think I have at least two more days of work to do to finish up this
feature.

* I still need to add some extra features like forgetting about dead remotes,
  and forgetting about keys that are no longer present on any remote.

* After `git annex forget`, `git annex sync`
  will fail to push the synced/git-annex branch to remotes, since the branch
  is no longer a fast-forward of the old one. I will probably fix this by
  making `git annex sync` do a fallback push of a unique branch in this case,
  like the assistant already does. Although I may need to adjust that code
  to handle this case, too.

* For some reason the automatic transitioning code triggers
  a "(recovery from race)" commit. This is certainly a bug somewhere,
  because you can't have a race with only 1 participant.

----
Today's work was sponsored by Richard Hartmann.