comments
parent 733a74a7e8
commit e13444fb2b
2 changed files with 79 additions and 0 deletions
@@ -0,0 +1,38 @@
[[!comment format=mdwn
username="joey"
subject="""comment 1"""
date="2022-09-13T16:02:58Z"
content="""
This extension to the protocol would only be useful when removing chunks,
because otherwise git-annex doesn't have a way to build up a list of keys
that are going to be removed, in a way that could usefully be sent to the
external special remote together.

For chunks, it has a list of keys. So this is feasible.

I wonder if it's necessary to extend the protocol though. If an external
special remote wants to, it can buffer a list of keys that it's been told
to remove, and return REMOVE-SUCCESS to each request before actually
doing the removal. It could then remove all the buffered keys in a single
API call or whatever.

The risk, of course, is that if the removal fails, or it's interrupted before
it can do the removal, it will have incorrectly claimed to remove the keys.
And git-annex will have recorded incorrect information and wrongly
indicated the removal succeeded. This would not be a good idea for non-chunk
keys (although `fsck --fast --from` the remote could recover from it).

For a set of chunk keys that are all chunks of the same key, though,
git-annex doesn't record anything until they've all been successfully
removed. Also, it so happens that after asking for all the chunked keys to
be removed, git-annex normally[1] then asks for the unchunked key to be
removed too. So, a special remote could buffer chunked keys until it sees
an unchunked key, and then remove them all efficiently, and reply to the
removal of the unchunked key with the combined result of all the removals.

[1] The exception is that, if the special remote is not currently
configured to use chunking, git-annex happens to remove the unchunked key
first, followed by all the chunked keys. I don't think there is a good
reason for this in removal; it's a useful optimisation for retrieving
content that happens to affect removal too.
"""]]
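
The buffering idea from comment 1 might look roughly like the sketch below. It is only an illustration under stated assumptions: `BufferingRemover` and `bulk_delete` are hypothetical names (not part of git-annex or any library), chunk keys are assumed to be recognizable by `-S<chunksize>-C<chunknum>` fields before the `--` separator, and the failure reply is assumed to take the form `REMOVE-FAILURE Key ErrorMsg`. It does not handle the footnoted case where the unchunked key arrives first.

```python
import re
from typing import Callable, List

# Assumption: a chunk key's field section (everything before the first "--")
# ends with "-S<chunksize>-C<chunknum>".
CHUNK_FIELDS = re.compile(r"-S\d+-C\d+$")


def is_chunk_key(key: str) -> bool:
    """Guess whether a key is a chunk key by looking at its field section."""
    fields = key.split("--", 1)[0]
    return CHUNK_FIELDS.search(fields) is not None


class BufferingRemover:
    """Buffers chunk removals and flushes them when an unchunked key is seen."""

    def __init__(self, bulk_delete: Callable[[List[str]], None]) -> None:
        # bulk_delete is a hypothetical callable wrapping the backend's
        # "delete many objects" API; it should raise on failure.
        self.bulk_delete = bulk_delete
        self.pending: List[str] = []

    def handle_remove(self, key: str) -> str:
        """Return the protocol reply line for a REMOVE request."""
        if is_chunk_key(key):
            # Optimistically claim success; nothing has been deleted yet.
            self.pending.append(key)
            return f"REMOVE-SUCCESS {key}"
        # Unchunked key: delete everything buffered plus this key in one
        # call, and report the combined result on this final request.
        batch, self.pending = self.pending + [key], []
        try:
            self.bulk_delete(batch)
        except Exception as err:
            return f"REMOVE-FAILURE {key} {err}"
        return f"REMOVE-SUCCESS {key}"
```

Reporting the combined result only on the unchunked key relies on git-annex not recording anything until every removal for that key has been acknowledged, as the comment describes.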
@@ -0,0 +1,41 @@
[[!comment format=mdwn
username="joey"
subject="""comment 2"""
date="2022-09-13T16:26:06Z"
content="""
I don't know that the approach in the above comment is really a good idea
for an external remote to try to implement. Needing to know about chunk keys
is not too bad, but it also relies on details of git-annex's implementation.

But it seems worth considering possibilities
like that, since this extension would only be used in such a relatively
rare corner case.

Or possibly considering ways to generalize the idea to be usable in more
cases.

Along those lines, it occurs to me that the async extension to the
protocol is somewhat similar, since git-annex can ask the external remote
to do several things at the same time. Removals of chunk keys are not
currently run concurrently, but they could be. An external remote could
then gather together some number of concurrent remove requests and perform
them all in a single API call (or whatever).

But how would the external remote know when it's seen all the remove
requests for chunks of a key? It seems like it would need to use a
heuristic, like no new requests in some amount of time means git-annex is
waiting on it to remove everything it's been requested to remove.

So it might be that a protocol extension would be useful, some way for
git-annex to indicate that it is blocked waiting on current requests to
finish. That seems more general purpose than a MULTIREMOVE extension.
For example, git-annex could also send it when retrieving chunks.
(Although retrieving chunks is also not currently done concurrently.)

(There's also a question of how many concurrent removals of chunk
keys it would make sense for git-annex to request at the same time. It
could request removing all chunks concurrently, but if the special
remote needs to do much work or use resources for each request, that
might not be good. It would probably be more natural to use something
based on -J.)
"""]]
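
The "no new requests in some amount of time" heuristic from comment 2 could be sketched as follows. This is a minimal illustration, not anything git-annex provides: `IdleBatcher`, `add`, and `poll` are hypothetical names, and the clock value is passed in explicitly so the logic can be exercised without real waiting. A real external remote would also need to hold back its replies until the batch has actually been processed.

```python
from typing import List, Optional


class IdleBatcher:
    """Collects remove requests and releases them after an idle period."""

    def __init__(self, idle_seconds: float) -> None:
        # How long to wait without new requests before assuming git-annex
        # is blocked waiting on the ones already sent (a guess, by design).
        self.idle_seconds = idle_seconds
        self.pending: List[str] = []
        self.last_request: Optional[float] = None

    def add(self, key: str, now: float) -> None:
        """Record a remove request that arrived at time `now`."""
        self.pending.append(key)
        self.last_request = now

    def poll(self, now: float) -> List[str]:
        """Return the batch to remove if the idle period has elapsed."""
        if self.last_request is None:
            return []
        if now - self.last_request < self.idle_seconds:
            return []
        batch, self.pending = self.pending, []
        self.last_request = None
        return batch
```

The weakness is the one the comment points out: any fixed idle period is a guess, so a slow sender can split one logical batch in two, which is why an explicit "blocked waiting" signal in the protocol would be more robust.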