This will avoid losing any messages received from 1 client when a push
involving another client is running.
Additionally, the handling of push initiation is improved,
it's no longer allowed to run multiples of the same type of push to
the same client.
Still stalls sometimes :(
Observed: With 2 xmpp clients, one would sometimes stop responding
to CanPush messages. Often it was in the middle of a receive-pack
of its own (or was waiting for a failed one to time out).
Now these are always immediately responded to, which is fine; the point
of CanPush is to find out if there's another client out there that's
interested in our push.
Also, in queueNetPushMessage, queue push initiation messages when
we're already running the side of the push they would initiate.
Before, these messages were sent into the netMessagesPush channel,
which was wrong. The xmpp send-pack and receive-pack code discarded
such messages.
This still doesn't make XMPP push 100% robust. In testing, I am seeing
it sometimes try to run two send-packs, or two receive-packs at once
to the same client (probably because the client sent two requests).
Also, I'm seeing rather a lot of cases where it stalls out until it
runs into the 120 second timeout and cancels a push.
And finally, there seems to be a bug in runPush. I have logs that
show it running its setup action, but never its cleanup action.
How is this possible given its use of E.bracket? Either some exception
is finding its way through, or the action somehow stalls forever.
When this happens, one of the 2 clients stops syncing.
This was also tripped by the test suite's automatic conflict resolution
test. Which also shows BTW that an unnecessary copy of content is done
sometimes when merging in direct mode. Not going to try to speed that up
now.
This bug was turned up by the test suite, running fsck in direct mode.
A repository was cloned, was put into direct mode, was fscked, and fsck
incorrectly said that no copy existed of a file, that was actually present
in origin.
This turned out to occur because fsck first did a Annex.Branch.change,
recording that it did not locally have the file. That was recorded in the
journal. Since neither the git annex direct not the fsck had yet needed to
read any info from the branch, but had only made changes to it, the
origin/git-annex branch was not yet merged in. So the journal got a
location log entry written to it, but this did not include
the location log info for the origin. When fsck then did a
Annex.Branch.get, it trusted the journal was cosnsitent, and returned it,
again w/o merging from origin/git-annex. This latter behavior is the
actual bug.
Refer to commit e9bfa8eaed for the thinking
behind it being ok to make a change to a file on the branch, without
first merging the branch. That thinking still stands. However, it means
that files in the journal cannot be trusted to be consistent if the branch
has not been merged. So, to fix, just enure the branch gets merged, even
when reading from the journal.
In tests, this does not seem to cause any extra merging. Except, of course,
in the one case described above. But git annex add, etc, are able to make
changes w/o first merging the branch.