Commit graph

3047 commits

Author SHA1 Message Date
Joey Hess
bcd2b9a5c4
idea 2024-08-12 09:43:14 -04:00
Joey Hess
3019b21c40
more formal documentation of balancing 2024-08-11 13:29:06 -04:00
Joey Hess
3ce2e95a5f
balanced preferred content and --rebalance
This all works fine. But it doesn't check repository sizes yet, and
without repository size checking, once a repository gets full, there
will be no other repository that will want its files.

Use of sha2 seems unncessary, probably alder2 or md5 or crc would have
been enough. Possibly just summing up the bytes of the key mod the number
of repositories would have sufficed. But sha2 is there, and probably
hardware accellerated. I doubt very much there is any security benefit
to using it though. If someone wants to construct a key that will be
balanced onto a given repository, sha2 is certianly not going to stop
them.
2024-08-09 14:16:09 -04:00
Joey Hess
3ea835c7e8
proxied exporttree=yes versionedexport=yes remotes are not untrusted
This removes versionedExport, which was only used by the S3 special
remote. Instead, versionedexport=yes is a common way for remotes to
indicate that they are versioned.
2024-08-08 15:24:19 -04:00
Joey Hess
4750ffbd3b
finalized design for proxying to exporttree=yes annexobjects=yes special remotes 2024-08-06 11:45:45 -04:00
Joey Hess
84d27cf34f
update 2024-08-06 11:13:51 -04:00
Joey Hess
d52fd3cf83
update 2024-07-30 12:17:05 -04:00
Joey Hess
1560e0eee9
comment 2024-07-30 10:50:13 -04:00
Joey Hess
b4eb6e3ced
comment 2024-07-29 11:59:33 -04:00
Joey Hess
321e2adf66
don't think I ever implementned the 422 idea, it will 404 2024-07-29 11:49:40 -04:00
Joey Hess
d3f584fcdb
wording 2024-07-29 11:44:44 -04:00
Joey Hess
5f5c29fbe7
link 2024-07-29 11:43:30 -04:00
Joey Hess
f3b207a4b9
wording 2024-07-29 11:37:13 -04:00
Joey Hess
74f81ebd04
Merge remote-tracking branch 'origin/httpproto' 2024-07-29 11:25:27 -04:00
stv0g
6352cebb92 Added a comment: importtree=yes Support 2024-07-29 06:50:01 +00:00
Joey Hess
cd89f91aa5
remove uuid from annex+http urls
Not needed it turns out.
2024-07-28 20:29:42 -04:00
Joey Hess
0ea645944e
thoughts on exporttree 2024-07-27 19:59:54 -04:00
Joey Hess
0fb86d2916
UNLOCKCONTENT is not a top-level request
proxyRequest was treating UNLOCKCONTENT as a separate request.
That made it possible for there to be two different connections to the
proxied remote, with LOCKCONTENT being sent to one, and UNLOCKCONTENT
to the other one. A protocol error.

git-annex testremote now passes against a http proxied remote.
2024-07-26 20:39:06 -04:00
Joey Hess
a3dab58be2
fix hang at end of PUT to proxied p2p http remote
sendExactly will now be sure to evaluate the whole lazy ByteString.

In this case, the lazy ByteString was exactly the right lenth.
But, it seems that L.take caused it to not actually be fully evaluated.

In servePut, this manifested as gather never being fully evaluated,
which caused the hang.

Very, very subtle, and horrible bug. Clearly the use of lazy ByteString
(or really just laziness) is at fault, and it would be very worth moving
to conduit or whatever to avoid this.
2024-07-26 19:50:15 -04:00
Joey Hess
6a3f755bfa
add common parameters to generic get API
Honestly this was just done to make the documentation correct. There's
no point in using these parameters. And they're optional.
2024-07-24 20:55:58 -04:00
Joey Hess
b4d749cc91
Merge branch 'master' into httpproto 2024-07-23 21:17:06 -04:00
Joey Hess
2aa9154b1f
require a valid uuid at the end of an annex+http url 2024-07-23 12:30:27 -04:00
Joey Hess
a6a03ca586
annex+http urls 2024-07-23 08:42:33 -04:00
Joey Hess
758cff0fde
update 2024-07-22 20:59:45 -04:00
Joey Hess
9984252ab5
P2P protocol is finalized 2024-07-22 19:50:08 -04:00
Joey Hess
e979e85bff
make serveKeepLocked check auth just to be safe 2024-07-22 19:15:52 -04:00
Joey Hess
3069e28dd8
implemented servePutOffset and clientPutOffset
But, it's buggy: the server hangs without processing the VALIDITY,
and I can't seem to work out why. As far as I can see, storefile
is getting as far as running the validitycheck, which is supposed to
read that, but never does.

This is especially strange because what seems like the same protocol
doesn't hang when servePut runs it. This made me think that it needed
to use inAnnexWorker to be more like servePut, but that didn't help.

Another small problem with this is that it does create an empty
.git/annex/tmp/ file for the key. Since this will usually be used in
combination with servePut, that doesn't seem worth worrying about much.
2024-07-22 15:04:10 -04:00
Joey Hess
4826a3745d
servePut and clientPut implementation
Made the data-length header required even for v0. This simplifies the
implementation, and doesn't preclude extra verification being done for
v0.

The connectionWaitVar is an ugly hack. In servePut, nothing waits
on the waitvar, and I could not find a good way to make anything wait on
it.
2024-07-22 10:27:44 -04:00
m.risse@77eac2c22d673d5f10305c0bade738ad74055f92
3590a17f9e Added a comment 2024-07-16 09:21:54 +00:00
Joey Hess
eb4fb388bd
only base64 non-utf8 2024-07-11 15:47:16 -04:00
Joey Hess
68227154fb
switch HTTP P2P protocol to base64url
Base64 can include '/', and with UUIDs and keys both used in routes,
the encoding needs to avoid that. Use base64url everywhere in the HTTP
protocol for consistency.
2024-07-11 12:31:41 -04:00
Joey Hess
a7383b5c59
move serveruuid into routes
In particular the generic get route needs it, so that when a single http
server is serving multiple repositories, it knows what repository to
use.
2024-07-11 11:19:20 -04:00
Joey Hess
7c588a5791
implement remove-before
The reason to use removeBeforeRemoteEndTime is twofold.

First, removeBefore sends two protocol commands. Currently, the HTTP
protocol runner only supports sending a single command per invocation.

Secondly, the http server gets a monotonic timestamp from the client. So
translating back to a POSIXTime would be annoying.

The timestamp flow with a proxy will be:

- client gets timestamp, which gets the monotonic timestamp from the
  proxied remote via the proxy. The timestamp is currently not
  proxied when there is a single proxy.
- client calls remove-before
- http server calls removeBeforeRemoteEndTime which sends REMOVE-BEFORE
  to the proxied remote.
2024-07-10 10:03:26 -04:00
Joey Hess
48f76cb3e8
implement serveRemove and send WWW-Authenticate header on auth failure 2024-07-10 09:13:01 -04:00
Joey Hess
6a8a4d1775
authentication is implemented
just need to make Command.P2PHttp generate a GetServerMode from options
2024-07-09 20:54:47 -04:00
Joey Hess
08371c3745
started on auth 2024-07-09 17:30:55 -04:00
Joey Hess
a3dd8b4bcb
capture API version in routes
Needed so the client can send it.
2024-07-09 12:04:29 -04:00
Joey Hess
b758b01692
add lockids to http p2p protocol 2024-07-08 20:18:55 -04:00
Joey Hess
69c4f07ab0
finish get API 2024-07-08 13:27:50 -04:00
Joey Hess
82d66ede5e
convert lockcontent api to http long polling
Websockets would work, but the problem with using them for this is that
each lockcontent call is a separate websocket connection. And that's an
actual TCP connection. One TCP connection per file dropped would be too
expensive. With http long polling, regular http pipelining can be used,
so it will reuse a TCP connection.

Unfortunately, at least with servant, bi-directional streams with long
polling don't result in true bidirectional full duplex communication.
Servant processes the whole client body stream before generating the server
body stream. I think it's entirely possible to do full bi-directional
communication over http, but it would need changes to servant.

And, there's no way for the client to tell if the server successfully
locked the content, since the server will keep processing the client
stream no matter what.:

So, added a new api endpoint, keeplocked. lockcontent will lock the key
for 10 minutes with retention lock, and then a call to keeplocked will
keep it locked for as long as needed. This does mean that there will
need to be a Map of locks by key, and I will probably want to add
some kind of lock identifier that lockcontent returns.
2024-07-08 12:57:46 -04:00
Joey Hess
1dbb5ec70d
servant API type is complete 2024-07-07 12:59:12 -04:00
Joey Hess
4133063ab1
Merge branch 'master' into httpproto 2024-07-07 12:08:24 -04:00
Joey Hess
86ce3bf1e4
started servant implementation of HTTP P2P protocol 2024-07-07 12:08:10 -04:00
Joey Hess
40306d3fcf
finalizing HTTP P2p protocol some more
Added v2-v0 endpoints. These are tedious, but will be needed in order to
use the HTTP protocol to proxy to repositories with older git-annex,
where git-annex-shell will be speaking an older version of the protocol.

Changed GET to use 422 when the content is not present. 404 is needed to
detect when a protocol version is not supported.
2024-07-05 15:34:58 -04:00
Joey Hess
2fb3ef4d41
finalizing HTTP P2P protocol
Managed to avoid netstrings. Actually, using netstrings while streaming
lazy ByteString turns out to be very difficult. So instead, have a
header that specifies the expected amount of data, and then it can just
arrange to send a different amount of data if it needs to indicate
INVALID.

Also improved the interface for GET of a key.
2024-07-05 15:03:51 -04:00
Joey Hess
5e564947d7
use netstrings for framing binary data with json at the end
This will be easy to implement with servant. It's also very efficient,
and fairly future-proof. Eg, could add another frame with other data.

This does make it a bit harder to use this protocol, but netstrings
probably take about 5 minutes to implement? Let's see...

import Text.Read
import Data.List

toNetString :: String -> String
toNetString s = show (length s) ++ ":" ++ s ++ ","

nextNetString :: String -> Maybe (String, String)
nextNetString s = case break (== ':') s of
        ([], _) -> Nothing
        (sn, rest) -> do
                n <- readMaybe sn
                let (v, rest') = splitAt n (drop 1 rest)
                return (v, drop 1 rest')

Ok, well, that took about 10 minutes ;-)
2024-07-05 11:53:03 -04:00
Joey Hess
95ba4d4480
thoughts on CGI, and use json 2024-07-05 10:08:43 -04:00
Joey Hess
3f9569e27f
update 2024-07-04 15:26:05 -04:00
Joey Hess
1243af4a18
toward SafeDropProof expiry checking
Added Maybe POSIXTime to SafeDropProof, which gets set when the proof is
based on a LockedCopy. If there are several LockedCopies, it uses the
closest expiry time. That is not optimal, it may be that the proof
expires based on one LockedCopy but another one has not expired. But
that seems unlikely to really happen, and anyway the user can just
re-run a drop if it fails due to expiry.

Pass the SafeDropProof to removeKey, which is responsible for checking
it for expiry in situations where that could be a problem. Which really
only means in Remote.Git.

Made Remote.Git check expiry when dropping from a local remote.

Checking expiry when dropping from a P2P remote is not yet implemented.
P2P.Protocol.remove has SafeDropProof plumbed through to it for that
purpose.

Fixing the remaining 2 build warnings should complete this work.

Note that the use of a POSIXTime here means that if the clock gets set
forward while git-annex is in the middle of a drop, it may say that
dropping took too long. That seems ok. Less ok is that if the clock gets
turned back a sufficient amount (eg 5 minutes), proof expiry won't be
noticed. It might be better to use the Monotonic clock, but that doesn't
advance when a laptop is suspended, and while there is the linux
Boottime clock, that is not available on other systems. Perhaps a
combination of POSIXTime and the Monotonic clock could detect laptop
suspension and also detect clock being turned back?

There is a potential future flag day where
p2pDefaultLockContentRetentionDuration is not assumed, but is probed
using the P2P protocol, and peers that don't support it can no longer
produce a LockedCopy. Until that happens, when git-annex is
communicating with older peers there is a risk of data loss when
a ssh connection closes during LOCKCONTENT.
2024-07-04 12:39:06 -04:00
Joey Hess
543c610a31
REMOVE-BEFORE and GETTIMESTAMP
Only implemented server side, not used client side yet.

And not yet implemented for proxies/clusters, for which there's a build
warning about unhandled cases.

This is P2P protocol version 3. Probably will be the only change in that
version..

Added a dependency on clock to access a monotonic clock.
On i386-ancient, that is at version 0.2.0.0.
2024-07-03 17:01:58 -04:00