This is similar to git-annex copy --from --to, in that it downloads a
local copy, locks it for removal, uploads it, and drops it. Removal of
the temporary local copy is done without verifying numcopies for the
same reason as that command.
I do wonder, looking at this, if there's a race where the local copy
gets used as a copy to allow some other drop in the narrow window after
it is downloaded and before it gets locked for removal. That would need
some other repository to have an out of date location log that says the
repository contains a copy of the key, in order for it to try to use it
as a copy. If there is such a race, git-annex copy/move would also be
vulnerable to it. It would be better to lock it for removal before
starting to download it! That is possible in v10 repositories, which do
use a separate content lock file.
Note that, when the exported tree contains several files that use the
same key, it will be downloaded repeatedly, once per time needed to
upload it. It would be possible to avoid that extra work, but it would
complicate this since the local copy would need to be preserved, locked
for removal, until the end. Also, that would mean that interrupting the
export would leave possibly a lot of temporarily downloaded keys in the
local repository, while currently it can only leave one.
Since git-annex sync sends the sync branch first, and only displays the
output of the push to the sync branch, this makes git-annex
post-retrieve's output when updating the exported tree be visible when
syncing.
This also makes syncing with a non-bare repository still update the
exported tree, even when the checked out branch is not able to be
updated. The sync branch gets sent regardless.
This handles the workflow where the branch is first pushed to the proxy,
and then files in the exported tree are later are copied to the proxied remote.
Turns out that the way the export log is structured, nothing needs
to be done to finalize the export once the last key is sent to it. Which
is great because that would have been a lot of complication. On
receiving the push, Command.Export runs and calls recordExportBeginning,
does as much as it can to update the export with the files currently
on it, and then calls recordExportUnderway. At that point, the
export.log records the export as "complete", but it's not really. And
that's fine. The same happens when using `git-annex export` when some
files are not available to send. Other repositories that have
access to the special remote can already retrieve files from it. As
the missing files get copied to the exported remote, all that needs
to be done is record each in the export db.
At this point, proxying to exporttree=yes annexobjects=yes special remotes
is fully working. Except for in the case where multiple files in the
tree use the same key, and the files are sent to the proxied remote
before pushing the tree.
It seems that even special remotes without annexobjects=yes will work if
used with the workflow where the git-annex branch is pushed before
copying files. But not with the `git-annex push` workflow.
It works when using git-annex sync/push/assist, or when manually sending
all content to the proxied remote before pushing to the proxy remote.
But when the push comes before the content is sent, sending content does
not update the exported tree.
(When possible, of course it may not be there, or it may get renamed from
there for another exported file first. Or the remote may not support
renames.)
This will avoids redundant uploads.
An example case where this is important: Proxying to a exporttree remote,
a file is uploaded to it but is not yet in an exported tree. When the
exported tree is pushed, the remote needs to be updated by exporting to
it. In this case, the proxy doesn't have a copy of the file, so it would
need to download it from annexobjects before uploading it to the final
location. With this optimisation, it can just rename it.
However: If a key is used twice in an exported tree, it seems a proxy
will need to download and reupload anyway. Unless a copy operation is
added to exporttree remotes..
This avoids needing to re-upload the file again to get it to the
annexobjects location, which git-annex sync was doing when it was
preferred content.
If the file is not preferred content, sync will drop it from the
annexobjects location.
If the file has been deleted from the tree, it will remain in the
annexobjects location until an unused/dropunused pass is done.
Decided not to use the annexobjects location for exportTempName.
There doesn't seem to be any actual benefit to doing that, because an
export that renames to exportTempName always renames it back from that
to another location.
Also the annexobjects directory won't actually help with the paired
rename issue.
The file in the annexobjects location may have been renamed from a
previously exported file that got deleted in a subsequent export.
Or it may be renamed to annexobjects temporarily before being renamed to
another name (to handle eg pairwise renames).
But, an exported file is not guaranteed to contain the content of the
key that the local repository last exported there. Another tree could
have been exported from elsewhere in the meantime.
So, files in annexobjects do not necessarily have the content of their
key. And so have to be strongly verified when retrieving. The same as
is done when retrieving exported files.
While usually uploading to a special remote does not verify the content,
the content in a repository is assumed to be valid, and there is no trust
boundary. But with a proxied special remote, there may be users who are
allowed to store objects, but are not really trusted.
Another way to look at this is it's the equivilant of git-annex-shell
checking the hash of received data, which it does (see StoreContent
implementation).
An interrupted PUT to cluster that has a node that is a special remote
over http left open the connection to the cluster, so the next request
opens another one. So did an interrupted PUT directly to the proxied
special remote over http.
proxySpecialRemote was stuck waiting for all the DATA. Its connection
remained open so it kept waiting.
In servePut, checktooshort handles closing the P2P connection
when too short a data is received from PUT. But, checktooshort was only
called after the protoaction, which is what runs the proxy, which is
what was getting stuck. Modified it to run as a background thread,
which waits for the tooshortv to be written to, which gather always does
once it gets to the end of the data received from the http client.
That makes proxyConnection's releaseconn run once all data is received
from the http client. Made it close the connection handles before
waiting on the asyncworker thread. This lets proxySpecialRemote finish
processing any data from the handle, and then it will give up,
more or less cleanly, if it didn't receive enough data.
I say "more or less cleanly" because with both sides of the P2P
connection taken down, some protocol unhappyness results. Which can lead
to some ugly debug messages. But also can cause the asyncworker thread
to throw an exception. So made withP2PConnections not crash when it
receives an exception from releaseconn.
This did have a small change to the behavior of an interrupted PUT when
proxying to a regular remote. proxyConnection has a protoerrorhandler
that closes the proxy connection on a protocol error. But the proxy
connection is also closed by checktooshort when it closes the P2P
connection. Closing the same proxy connection twice is not a problem,
it just results in duplicated debug messages about it.
An interrupted `git-annex copy --to` a cluster via the http server,
when repeated, failed. The http server output "transfer already in
progress, or unable to take transfer lock". Apparently a second
connection was opened to the cluster, because the first connection
never got shut down.
Turned out the problem was that when proxying to a cluster, it would read a
short ByteString from the client, and send that to the nodes. But that left the
nodes warning more. Meanwhile, the proxy was expecting a SUCCESS/FAILURE
message from the nodes. So it didn't return, and so the cluster connection
stayed open.
This fixes a problem when git-annex testremote is run against a cluster
accessed via the http server. Annex.Cluster uses the location log
to find nodes that contain a key when checking if the key is present or getting
it. Just after a key was stored to a cluster node, reading the location log
was not getting the UUID of that node.
Apparently the Annex action that wrote to the location log, and the one
that read from it were run with two different Annex states. The http server
does use several different Annex threads.
BranchState was part of the AnnexState, and so two threads could have
different BranchStates.
Moved BranchState to the AnnexRead, so all threads will see the common state.
This might possibly impact performance. If one thread is writing changes to the
branch, and another thread is reading from the branch, the writing thread will
now invalidate the BranchState's cache, which will cause the reading thread to
need to do extra work. But correctness is surely more important. If did is
found to have impacted performance, it could probably be dealt with by doing
smarter BranchState cache invalidation.
Another way this might impact performance is that the BranchState has a small
cache. If several threads were reading from the branch and relying on the value
they just read still being in the case, now a cache miss will be more likely.
Increasing the BranchState cache to the number of jobs might be a good
idea to amelorate that. But the cache is currently an innefficient list,
so making it large would need changes to the data types.
(Commit 4304f1b6ae dealt with a follow-on
effect of the bug fixed here.)
Wired it up and it seems to basically work, although the test suite is
not fully passing.
Note that --jobs currently gets multiplied by the number of nodes in the
cluster, which is probably not good.
proxyRequest was treating UNLOCKCONTENT as a separate request.
That made it possible for there to be two different connections to the
proxied remote, with LOCKCONTENT being sent to one, and UNLOCKCONTENT
to the other one. A protocol error.
git-annex testremote now passes against a http proxied remote.
sendExactly will now be sure to evaluate the whole lazy ByteString.
In this case, the lazy ByteString was exactly the right lenth.
But, it seems that L.take caused it to not actually be fully evaluated.
In servePut, this manifested as gather never being fully evaluated,
which caused the hang.
Very, very subtle, and horrible bug. Clearly the use of lazy ByteString
(or really just laziness) is at fault, and it would be very worth moving
to conduit or whatever to avoid this.
removeOldestProxyConnectionPool will be innefficient the larger the pool
is. A better data structure could be more efficient. Eg, make each value
in the pool include the timestamp of its oldest element, then the oldest
value can be found and modified, rather than rebuilding the whole Map.
But, for pools of a few hundred items, this should be fine. It's O(n*n log n)
or so.
Also, when more than 1 connection with the same pool key exists,
it's efficient even for larger pools, since removeOldestProxyConnectionPool
is not needed.
The default of 1 idle connection could perhaps be larger.. like the
number of jobs? Otoh, it seems good to ramp up and down the number of
connections, which does happen. With 1, there is at most one stale
connection, which might cause a request to fail.
The proxy always checks the protocol version of a remote before talking
to it in a version-specific way, so the protocol version in the ProxyParams
is the client's protocol version. The remote will always be at the same or
an older protocol version than the client.
Note that in relayDATAFinish, when the client is at protocol version 0,
the remote must thus be as well, and that's why its version is not
checked in the case for that.
With that clarified, it's evident that, in P2P.Http.State, there's no
need to look at the proxied remote's protocol version at all.
But, it's buggy: the server hangs without processing the VALIDITY,
and I can't seem to work out why. As far as I can see, storefile
is getting as far as running the validitycheck, which is supposed to
read that, but never does.
This is especially strange because what seems like the same protocol
doesn't hang when servePut runs it. This made me think that it needed
to use inAnnexWorker to be more like servePut, but that didn't help.
Another small problem with this is that it does create an empty
.git/annex/tmp/ file for the key. Since this will usually be used in
combination with servePut, that doesn't seem worth worrying about much.
This means that when the client sends a truncated data to indicate
invalidity, DATA is not passed the full expected data. That leaves the
P2P connection in a state where it cannot be reused. While so far, they
are not reused, they will be later when proxies are supported. So, have
to close the P2P connection in this situation.
Always truncate instead. The padding risked something not noticing the
content was bad and getting a file that was corrupted in a novel way
with the padding "X" at the end. A truncated file is better.
Made the data-length header required even for v0. This simplifies the
implementation, and doesn't preclude extra verification being done for
v0.
The connectionWaitVar is an ugly hack. In servePut, nothing waits
on the waitvar, and I could not find a good way to make anything wait on
it.
This came down to SendBytes waiting on the waitv. Nothing ever filled
it.
Only Annex.Proxy needs the waitv, and it handles filling it. So make it
optional.
Added v2-v0 endpoints. These are tedious, but will be needed in order to
use the HTTP protocol to proxy to repositories with older git-annex,
where git-annex-shell will be speaking an older version of the protocol.
Changed GET to use 422 when the content is not present. 404 is needed to
detect when a protocol version is not supported.
For clusters, the timestamps have to be translated, since each node can
have its own idea about what time it is. To translate a timestamp, the
proxy remembers what time it asked the node for a timestamp in
GETTIMESTAMP, and applies the delta as an offset in REMOVE-BEFORE.
This does mean that a remove from a cluster has to call GETTIMESTAMP on
every node before dropping from nodes. Not very efficient. Although
currently it tries to drop from every single node anyway, which is also
not very efficient.
I thought about caching the GETTIMESTAMP from the nodes on the first
call. That would improve efficiency. But, since monotonic clocks on
!Linux don't advance when the computer is suspended, consider what might
happen if one node was suspended for a while, then came back. Its
monotonic timestamp would end up behind where the proxying expects it to
be. Would that result in removing when it shouldn't, or refusing to
remove when it should? Have not thought it through. Either way, a
cluster behaving strangly for an extended period of time because one
of its nodes was briefly asleep doesn't seem like good behavior.
Added Maybe POSIXTime to SafeDropProof, which gets set when the proof is
based on a LockedCopy. If there are several LockedCopies, it uses the
closest expiry time. That is not optimal, it may be that the proof
expires based on one LockedCopy but another one has not expired. But
that seems unlikely to really happen, and anyway the user can just
re-run a drop if it fails due to expiry.
Pass the SafeDropProof to removeKey, which is responsible for checking
it for expiry in situations where that could be a problem. Which really
only means in Remote.Git.
Made Remote.Git check expiry when dropping from a local remote.
Checking expiry when dropping from a P2P remote is not yet implemented.
P2P.Protocol.remove has SafeDropProof plumbed through to it for that
purpose.
Fixing the remaining 2 build warnings should complete this work.
Note that the use of a POSIXTime here means that if the clock gets set
forward while git-annex is in the middle of a drop, it may say that
dropping took too long. That seems ok. Less ok is that if the clock gets
turned back a sufficient amount (eg 5 minutes), proof expiry won't be
noticed. It might be better to use the Monotonic clock, but that doesn't
advance when a laptop is suspended, and while there is the linux
Boottime clock, that is not available on other systems. Perhaps a
combination of POSIXTime and the Monotonic clock could detect laptop
suspension and also detect clock being turned back?
There is a potential future flag day where
p2pDefaultLockContentRetentionDuration is not assumed, but is probed
using the P2P protocol, and peers that don't support it can no longer
produce a LockedCopy. Until that happens, when git-annex is
communicating with older peers there is a risk of data loss when
a ssh connection closes during LOCKCONTENT.
The error message is not displayed to the use, but this mirrors the
behavior when a regular get from a special remote fails. At least now
there is not a protocol error.
Now that storeKey can have a different object file passed to it, this
complication is not needed. This avoids a lot of strange situations,
and will also be needed if streaming is eventually supported.
Still needs some work.
The reason that the waitv is necessary is because without it,
runNet loops back around and reads the next protocol message. But it's
not finished reading the whole bytestring yet, and so it reads some part
of it.
Working, but lots of room for improvement...
Without streaming, so there is a delay before download begins as the
file is retreived from the special remote.
And when resuming it retrieves the whole file from the special remote
*again*.
Also, if the special remote throws an exception, currently it
shows as "protocol error".
This makes eg git-annex get default to using the cluster rather than an
arbitrary node, which is better UI.
The actual cost of accessing a proxied node vs using the cluster is
basically the same. But using the cluster allows smarter load-balancing
to be done on the cluster.
Before it was using a node that might have had a higher cost.
Also threw in a random selection from amoung the low cost nodes. Of
course this is a poor excuse for load balancing, but it's better than
nothing. Most of the time...
The VIA extension is still needed to avoid some extra work and ugly
messages, but this is enough that it actually works.
This filters out the RemoteSides that are a proxied connection via a
remote gateway to the cluster.
The VIA extension will not filter those out, but will send VIA to them
on connect, which will cause the ones that are accessed via the listed
gateways to be filtered out.
Walking a tightrope between security and convenience here, because
git-annex-shell needs to only proxy for things when there has been
an explicit, local action to configure them.
In this case, the user has to have run `git-annex extendcluster`,
which now sets annex-cluster-gateway on the remote.
Note that any repositories that the gateway is recorded to
proxy for will be proxied onward. This is not limited to cluster nodes,
because checking the node log would not add any security; someone could
add any uuid to it. The gateway of course then does its own
checking to determine if it will allow proxying for the remote.
Rejected the idea of automatically instantiating remotes for proxies-of-proxies.
That needs cycle protection, while the current behavior, which happened
for free, is that running git-annex updateproxy on the proxy can be used
to configure it, but only for topologies that actually exist.
The problem with that idea is that the cluster's proxy is necessarily a
remote, and necessarily one that we'll want to sync with, since the git
repository is stored there. So when its preferred content wants a file,
and the cluster does too, the file will get uploaded to it as well as to
the cluster. With fanout, the upload to the cluster will populate the
proxy as well, avoiding a second upload. But only if the file is sent to
the cluster first. If it's sent to the proxy first, there will be two
uploads.
Another, lesser problem is that a repository can proxy for more than one
cluster. So when does it make sense to drop content from the repository?
It could be done when dropping from one cluster, but what of the other
one?
This complication was not necessary anyway. Instead, if it's desirable
to have some content accessed from close to the proxy, one of the
cluster nodes can just be put on the same filesystem as it. That will be
just as fast as storing the content on the proxy.
Except when no nodes want a file, it has to be stored somewhere, so
store it on all. Which is not really desirable, but neither is having to
pick one.
ProtoAssociatedFile deserialization is rather broken, and this could
possibly affect preferred content expressions that match on filenames.
The inability to roundtrip whitespace like tabs and newlines through is
not a problem because preferred content expressions can't be written
that match on whitespace such as a tab. For example:
joey@darkstar:~/tmp/bench/z>git-annex wanted origin-node2 'exclude=*CTRL-VTab*'
wanted origin-node2
git-annex: Parse error: Parse failure: near "*"
But, the filtering of control characters could perhaps be a problem. I think
that filtering is now obsolete, git-annex has comprehensive filtering of
control characters when displaying filenames, that happens at a higher level.
However, I don't want to risk a security hole so am leaving in that filtering
in ProtoAssociatedFile deserialization for now.
Avoid `git-annex sync --content` etc from operating on cluster nodes by default
since syncing with a cluster implicitly syncs with its nodes. This avoids a
lot of unncessary work when a cluster has a lot of nodes just in checking
if each node's preferred content is satisfied. And it avoids content
being sent to nodes individually, so instead syncing with clusters always
fanout uploads to nodes.
The downside is that there are situations where a cluster's preferred content
settings can be met, but those of its nodes are not. Or where a node does not
contain a key, but the cluster does, and there are not enough copies of the key
yet, so it would be desirable the send it there. I think that's an acceptable
tradeoff. These kind of situations are ones where the cluster itself should
probably be responsible for copying content to the node. Which it can do much
less expensively than a client can. Part of the balanced preferred content
design that I will be working on in a couple of months involves rebalancing
clusters, so I expect to revisit this.
The use of annex-sync config does allow running git-annex sync with a specific
node, or nodes, and it will sync with it. And it's also possible to set
annex-sync git configs to make it sync with a node by default. (Although that
will require setting up an explicit git remote for the node rather than relying
on the proxied remote.)
Logs.Cluster.Basic is needed because Remote.Git cannot import Logs.Cluster
due to a cycle. And the Annex.Startup load of clusters happens
too late for Remote.Git to use that. This does mean one redundant load
of the cluster log, though only when there is a proxy.
This makes git-annex sync and similar not treat proxied remotes as git
syncable remotes.
Also, display in git-annex info remote when the remote is proxied.
Loading the remote list a second time was removing all proxied remotes.
That happened because setting up the proxied remote added some config
fields to the in-memory git config, and on the second load, it saw those
configs and decided not to overwrite them with the proxy.
Now on the second load, that still happens. But now, the proxied
git configs are used to generate a remote same as if those configs were
all set. The reason that didn't happen before was twofold,
the gitremotes cache was not dropped, and the remote's url field was not
set correctly.
The problem with the remote's url field is that while it was marked as
proxy inherited, all other proxy inherited fields are annex- configs.
And the code to inherit didn't work for the url field.
Now it all works, but git-annex sync is left running git push/pull on
the proxied remote, which doesn't work. That still needs to be fixed.
Tested it with small chunk sizes (like 2) and resumes that were
eg 1 byte from the end of the file or beginning of file.
Also, git-annex testremote passes now against a cluster!
When the destination does not start with a copy, the cluster has one or
more copies. If more, dropping would reduce the number of copies, so
numcopies must be checked.
Considered checking how many nodes of the cluster contain a copy. If
only 1 node does, it could allow a move without checking numcopies.
The problem with that, though, is that other nodes of the cluster could
have copies that we don't know about. And dropping from a cluster tries
to drop from all nodes, so will drop even from those. So any drop from a
cluster can remove more than 1 copy.