external protocol VERSION 2

Support VERSION 2 in the external special remote protocol, which is
identical to VERSION 1, but avoids external remote programs neededing to
work around the above bug. External remote program that support
exporttree=yes are recommended to be updated to send VERSION 2.

Sponsored-by: Kevin Mueller on Patreon
This commit is contained in:
Joey Hess 2023-03-28 17:00:08 -04:00
parent 82c65b7951
commit 18d326cb6f
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
11 changed files with 80 additions and 73 deletions

View file

@ -13,6 +13,10 @@ git-annex (10.20230322) UNRELEASED; urgency=medium
* Fix bug that caused broken protocol to be used with external remotes
that use exporttree=yes. In some cases this could result in the wrong
content being exported to, or retrieved from the remote.
* Support VERSION 2 in the external special remote protocol, which is
identical to VERSION 1, but avoids external remote programs neededing
to work around the above bug. External remote program that support
exporttree=yes are recommended to be updated to send VERSION 2.
-- Joey Hess <id@joeyh.name> Thu, 23 Mar 2023 15:04:41 -0400

View file

@ -415,7 +415,7 @@ newtype JobId = JobId Integer
deriving (Eq, Ord, Show)
supportedProtocolVersions :: [ProtocolVersion]
supportedProtocolVersions = [1]
supportedProtocolVersions = [1, 2]
instance Proto.Serializable JobId where
serialize (JobId n) = show n

View file

@ -18,17 +18,7 @@ but am confident that I fixed it in
----
I'm looking forward to export support in rclone!
I'm wondering about the data loss part of this. You said:
> As a result, one process receives two `EXPORT`s in a row, the first of which it ignores,
> while some other process receives a `TRANSFEREXPORT` without a prior `EXPORT`,
> reusing whatever filename was set in the previous transaction,
> and either ovewriting the last accessed remote file with the wrong content,
> or retrieving the content of the last accessed remote file instead of the one git-annex wanted.
So in your implementation, you're keeping EXPORT set after handling TRANSFEREXPORT
In your implementation, you're keeping EXPORT set after handling TRANSFEREXPORT
and similar commands, and so if you receive a TRANSFEREXPORT not prefixed by an
EXPORT, it can be bad. A natural way to write things, indeed
`doc/special_remotes/external/example.sh` does the same.
@ -37,26 +27,20 @@ An alternative implementation would be to clear the EXPORT after handling a
TRANSFEREXPORT, CHECKPRESENTEXPORT, REMOVEEXPORT, or RENAMEEXPORT.
And have those error out if no EXPORT was received before them.
So that's one way we can avoid the data loss problem if your external remote
is used with an old git-annex that has this bug. You might want to do that now.
But, that does not fully avoid the data loss problem. Because this might
happen:
----
EXPORT foo
EXPORT bar
TRANSFEREXPORT STORE $foo_key $foo_file
But, that requires defensive coding in every external remote... Maybe it
would be worth changing the protocol in some way, that avoids the problem.
Then if external remotes were updated to the new protocol, they'd not work
with the buggy git-annex versions, and avoid data loss. I'm leaving this bug
open for now to consider such a protocol change..
So even ignoring the first EXPORT can result in the external special remote
doing the wrong thing. To fully guard against it, you'd have to error out
if multiple EXPORT were received before a request that uses one, rather
than ignoring the first EXPORT.
Weighing the costs and benefits of such a change, the extent of the data
loss is fairly limited. exporttree=yes remotes are always untrusted, so
if a file on one is overwritten with the wrong content, git-annex will be preserving
the right content elsewhere. And if the wrong file is retrieved from one,
git-annex will notice it has the wrong content (so long as its key uses a checksum).
Still, whatever that export is used for would unexpectedly have the wrong file
content, which could still be a bad day for somebody.
(If I decide against the protocol change, I should fix
`doc/special_remotes/external/example.sh` to defend against the bug,
and tell others who have externals that support exporttree too..)
So, fixing this in your remote and others would need significant defensive
coding. Too much to make sense, IMHO. I think git-annex needs to chnge the
protocol in some way instead, to make it easy for you to avoid speaking to
a buggy git-annex.
"""]]

View file

@ -12,40 +12,5 @@ git-annex could either continue to also support VERSION 1, or it could
refuse to work with externals that don't use VERSION 2. The latter would
force externals to get updated. But then those externals would have no way to work
with old git-annex even if they wanted to. I think forcing an update is not called
for.
Note that git-annex's handling of an external that sends VERSION 2 is not
stellar currently:
joey@darkstar:~/tmp/b>git-annex initremote t type=external externaltype=test encryption=none
initremote t
external special remote protocol error, unexpectedly received "CONFIG directory store data here" (command not allowed at this time)
git-annex: unable to use special remote due to protocol error
joey@darkstar:~/tmp/b>git-annex copy --to t
external special remote protocol error, unexpectedly received "UNSUPPORTED-REQUEST" (command not allowed at this time)
external special remote protocol error, unexpectedly received "UNSUPPORTED-REQUEST" (command not allowed at this time)
copy x
external special remote protocol error, unexpectedly received "UNSUPPORTED-REQUEST" (command not allowed at this time)
(unable to use special remote due to protocol error) failed
copy: 1 failed
Protocol debug shows what's happening:
[2023-03-28 16:00:09.361602472] (Annex.ExternalAddonProcess) /home/joey/bin/git-annex-remote-test[3] --> VERSION 2
[2023-03-28 16:00:09.361764519] (Annex.ExternalAddonProcess) /home/joey/bin/git-annex-remote-test[3] <-- ERROR unsupported VERSION
[2023-03-28 16:00:09.361897595] (Annex.ExternalAddonProcess) /home/joey/bin/git-annex-remote-test[3] <-- EXTENSIONS INFO GETGITREMOTENAME ASYNC
[2023-03-28 16:00:09.362112912] (Annex.ExternalAddonProcess) /home/joey/bin/git-annex-remote-test[3] --> UNSUPPORTED-REQUEST
[2023-03-28 16:00:09.362212948] (Annex.ExternalAddonProcess) /home/joey/bin/git-annex-remote-test[3] <-- PREPARE
[2023-03-28 16:00:09.362332494] (Annex.ExternalAddonProcess) /home/joey/bin/git-annex-remote-test[3] --> UNSUPPORTED-REQUEST
Not really ideal behavior for an old version of git-annex when used with
an external that wants to send VERSION 2 to ensure it does not need to deal with
this bug. But better than nothing I suppose.
(That output really ought to be fixed going forward, but old versions of git-annex
of course can't be fixed now.)
for. So git-annex will keep supporting both versions.
"""]]

View file

@ -0,0 +1,12 @@
[[!comment format=mdwn
username="joey"
subject="""comment 3"""
date="2023-03-28T20:51:40Z"
content="""
I've implemented support for `VERSION 2`. Recommend any external special
remotes that support exporttree=yes use it to avoid talking with a buggy
git-annex version.
(See [[todo/better_message_for_external_special_remote_protocol_mismatch]]
for related todo.)
"""]]

View file

@ -39,7 +39,7 @@ empty, but the separating spaces are still required in that case.
The special remote is responsible for sending the first message, indicating
the version of the protocol it is using.
VERSION 1
VERSION 2
Recent versions of git-annex respond with a message indicating
protocol extensions that it supports. Older versions of
@ -271,7 +271,7 @@ These messages may be sent by the special remote at any time that it's
handling a request.
* `VERSION Int`
Supported protocol version. Current version is 1. Must be sent first
Supported protocol version. Current version is 2. Must be sent first
thing at startup, as until it sees this git-annex does not know how to
talk with the special remote program!
(git-annex does not send a reply to this message, but may give up if it
@ -428,6 +428,18 @@ remote.
git-annex will not talk to it any further. If the program receives
an ERROR from git-annex, it can exit with its own ERROR.
## protocol versions
Currently git-annex supports `VERSION 1` and `VERSION 2`.
The two protocol versions are actually identical.
Old versions of git-annex that supported only `VERSION 1`
had a bug in their implementation of the
part of the protocol documented in the[[export_and_import_appendix]].
The bug could result in ontent being exported to the wrong file.
External special remotes that implement that should use `VERSION 2` to
avoid talking to the buggy old version of git-annex.
## extensions
These protocol extensions are currently supported.

View file

@ -37,7 +37,10 @@ a request, it can reply with `UNSUPPORTED-REQUEST`.
`REMOVEEXPORTDIRECTORY`), specifying the name of the exported file. It
will be in the form of a relative path, and may contain path separators,
whitespace, and other special characters.
No response is made to this message.
No response is made to this message.
Note that old versions of git-annex had a bug that sometimes prevented
sending `EXPORT`. To avoid being used with such a buggy version of
git-annex, send `VERSION 2`.
* `TRANSFEREXPORT STORE|RETRIEVE Key File`
Requests the transfer of a File on local disk to or from the previously
provided `EXPORT` Name on the special remote.

View file

@ -150,7 +150,7 @@ doremove () {
}
# This has to come first, to get the protocol started.
echo VERSION 1
echo VERSION 2
while read line; do
set -- $line

View file

@ -54,7 +54,7 @@ getaddrs () {
}
# This has to come first, to get the protocol started.
echo VERSION 1
echo VERSION 2
while read line; do
set -- $line

View file

@ -100,7 +100,7 @@ downloadtorrent () {
}
# This has to come first, to get the protocol started.
echo VERSION 1
echo VERSION 2
while read line; do
set -- $line

View file

@ -0,0 +1,27 @@
git-annex's handling of an external special remote that sends VERSION 2 is not
stellar currently:
joey@darkstar:~/tmp/b>git-annex initremote t type=external externaltype=test encryption=none
initremote t
external special remote protocol error, unexpectedly received "CONFIG directory store data here" (command not allowed at this time)
git-annex: unable to use special remote due to protocol error
joey@darkstar:~/tmp/b>git-annex copy --to t
external special remote protocol error, unexpectedly received "UNSUPPORTED-REQUEST" (command not allowed at this time)
external special remote protocol error, unexpectedly received "UNSUPPORTED-REQUEST" (command not allowed at this time)
copy x
external special remote protocol error, unexpectedly received "UNSUPPORTED-REQUEST" (command not allowed at this time)
(unable to use special remote due to protocol error) failed
copy: 1 failed
Protocol debug shows what's happening:
[2023-03-28 16:00:09.361602472] (Annex.ExternalAddonProcess) /home/joey/bin/git-annex-remote-test[3] --> VERSION 2
[2023-03-28 16:00:09.361764519] (Annex.ExternalAddonProcess) /home/joey/bin/git-annex-remote-test[3] <-- ERROR unsupported VERSION
[2023-03-28 16:00:09.361897595] (Annex.ExternalAddonProcess) /home/joey/bin/git-annex-remote-test[3] <-- EXTENSIONS INFO GETGITREMOTENAME ASYNC
[2023-03-28 16:00:09.362112912] (Annex.ExternalAddonProcess) /home/joey/bin/git-annex-remote-test[3] --> UNSUPPORTED-REQUEST
[2023-03-28 16:00:09.362212948] (Annex.ExternalAddonProcess) /home/joey/bin/git-annex-remote-test[3] <-- PREPARE
[2023-03-28 16:00:09.362332494] (Annex.ExternalAddonProcess) /home/joey/bin/git-annex-remote-test[3] --> UNSUPPORTED-REQUEST