But, it's buggy: the server hangs without processing the VALIDITY,
and I can't seem to work out why. As far as I can see, storefile
is getting as far as running the validitycheck, which is supposed to
read that, but never does.
This is especially strange because what seems like the same protocol
doesn't hang when servePut runs it. This made me think that it needed
to use inAnnexWorker to be more like servePut, but that didn't help.
Another small problem with this is that it does create an empty
.git/annex/tmp/ file for the key. Since this will usually be used in
combination with servePut, that doesn't seem worth worrying about much.
This means that when the client sends a truncated data to indicate
invalidity, DATA is not passed the full expected data. That leaves the
P2P connection in a state where it cannot be reused. While so far, they
are not reused, they will be later when proxies are supported. So, have
to close the P2P connection in this situation.
Always truncate instead. The padding risked something not noticing the
content was bad and getting a file that was corrupted in a novel way
with the padding "X" at the end. A truncated file is better.
Made the data-length header required even for v0. This simplifies the
implementation, and doesn't preclude extra verification being done for
v0.
The connectionWaitVar is an ugly hack. In servePut, nothing waits
on the waitvar, and I could not find a good way to make anything wait on
it.
This came down to SendBytes waiting on the waitv. Nothing ever filled
it.
Only Annex.Proxy needs the waitv, and it handles filling it. So make it
optional.
Added v2-v0 endpoints. These are tedious, but will be needed in order to
use the HTTP protocol to proxy to repositories with older git-annex,
where git-annex-shell will be speaking an older version of the protocol.
Changed GET to use 422 when the content is not present. 404 is needed to
detect when a protocol version is not supported.
For clusters, the timestamps have to be translated, since each node can
have its own idea about what time it is. To translate a timestamp, the
proxy remembers what time it asked the node for a timestamp in
GETTIMESTAMP, and applies the delta as an offset in REMOVE-BEFORE.
This does mean that a remove from a cluster has to call GETTIMESTAMP on
every node before dropping from nodes. Not very efficient. Although
currently it tries to drop from every single node anyway, which is also
not very efficient.
I thought about caching the GETTIMESTAMP from the nodes on the first
call. That would improve efficiency. But, since monotonic clocks on
!Linux don't advance when the computer is suspended, consider what might
happen if one node was suspended for a while, then came back. Its
monotonic timestamp would end up behind where the proxying expects it to
be. Would that result in removing when it shouldn't, or refusing to
remove when it should? Have not thought it through. Either way, a
cluster behaving strangly for an extended period of time because one
of its nodes was briefly asleep doesn't seem like good behavior.
Added Maybe POSIXTime to SafeDropProof, which gets set when the proof is
based on a LockedCopy. If there are several LockedCopies, it uses the
closest expiry time. That is not optimal, it may be that the proof
expires based on one LockedCopy but another one has not expired. But
that seems unlikely to really happen, and anyway the user can just
re-run a drop if it fails due to expiry.
Pass the SafeDropProof to removeKey, which is responsible for checking
it for expiry in situations where that could be a problem. Which really
only means in Remote.Git.
Made Remote.Git check expiry when dropping from a local remote.
Checking expiry when dropping from a P2P remote is not yet implemented.
P2P.Protocol.remove has SafeDropProof plumbed through to it for that
purpose.
Fixing the remaining 2 build warnings should complete this work.
Note that the use of a POSIXTime here means that if the clock gets set
forward while git-annex is in the middle of a drop, it may say that
dropping took too long. That seems ok. Less ok is that if the clock gets
turned back a sufficient amount (eg 5 minutes), proof expiry won't be
noticed. It might be better to use the Monotonic clock, but that doesn't
advance when a laptop is suspended, and while there is the linux
Boottime clock, that is not available on other systems. Perhaps a
combination of POSIXTime and the Monotonic clock could detect laptop
suspension and also detect clock being turned back?
There is a potential future flag day where
p2pDefaultLockContentRetentionDuration is not assumed, but is probed
using the P2P protocol, and peers that don't support it can no longer
produce a LockedCopy. Until that happens, when git-annex is
communicating with older peers there is a risk of data loss when
a ssh connection closes during LOCKCONTENT.