blog for the day

2012-09-16 20:13:32 -04:00 · 2012-09-16 20:13:32 -04:00 · cead5d056e
commit cead5d056e
parent ddacbbe798
1 changed files with 73 additions and 0 deletions
--- a/doc/design/assistant/blog/day_83__3-way.mdwn
+++ b/doc/design/assistant/blog/day_83__3-way.mdwn
@ -0,0 +1,73 @@
+Syncing works well when the graph of repositories is strongly connected.
+Now I'm working on making it work reliably with less connected graphs.
+
+I've been focusing on and testing a doubly-connected list of repositories,
+such as: `A <-> B <-> C`
+
+----
+
+I was seeing a lot of git-annex branch push failures occuring in
+this line of repositories topology. Sometimes was is able to recover from
+these, but when two repositories were trying to push to one-another at the
+same time, and both failed, both would pull and merge, which actually keeps
+the git-annex branch still diverged. (The two merge commits differ.)
+
+A large part of the problem was that it pushed directly into the git-annex
+branch on the remote; the same branch the remote modifies. I changed it to
+push to `synced/git-annex` on the remote, which avoids most push failures.
+Only when A and C are both trying to push into `B/synced/git-annex` at the
+same time would one fail, and need to pull, merge, and retry.
+
+-----
+
+With that change, git syncing always succeeded in my tests, and without
+needing any retries. But with more complex sets of repositories, or more
+traffic, it could still fail.
+
+I want to avoid repeated retries, exponential backoffs, and that kind of
+thing. It'd probably be good enough, but I'm not happy with it because
+it could take arbitrarily long to get git in sync.
+
+I've settled on letting it retry once to push to the synced/git-annex
+and synced/master branches. If the retry fails, it enters a fallback mode,
+which is guaranteed to succeed, as long as the remote is accessible.
+
+The problem with the fallback mode is it uses really ugly branch names.
+Which is why Joachim Breitner and I originally decided on making `git annex
+sync` use the single `synced/master` branch, despite the potential for
+failed syncs. But in the assistant, the requirements are different,
+and I'm ok with the uglier names.
+
+It does seem to make sense to only use the uglier names as a fallback,
+rather than by default. This preserves compatability with `git annex sync`,
+and it allows the assistant to delete fallback sync branches after it's
+merged them, so the ugliness is temporary.
+
+---
+
+Also worked some today on a bug that prevents C from receiving files
+added to A.
+
+The problem is that file contents and git metadata sync independantly. So C
+will probably receive the git metadata from B before B has finished
+downloading the file from A. C would normally queue a download of the
+content when it sees the file appear, but at this point it has nowhere to
+get it from.
+
+My first stab at this was a failure. I made each download of a file result
+in uploads of the file being queued to every remote that doesn't have it
+yet. So rather than C downloading from B, B uploads to C. Which works fine,
+but then C sees this download from B has finished, and proceeds to try to
+re-upload to B. Which rejects it, but notices that this download has
+finished, so re-uploads it to C...
+
+The problem with that approach is that I don't have an event when a download
+succeeds, just an event when a download ends.  Of course, C could skip
+uploading back to the same place it just downloaded from, but loops are
+still possible with other network topologies (ie, if D is connected to both
+B and C, there would be an upload loop 'B -> C -> D -> B`). So unless I can
+find a better event to hook into, this idea is doomed.
+
+I do have another idea to fix the same problem. C could certianly remember
+that it saw a file and didn't know where to get the content from, and then
+when it receives a git push of a git-annex branch, try again.