remove CHECKURL-MULTI single url response special case

Removed undocumented special case in handling of a CHECKURL-MULTI response
with only a single file listed. Rather than ignoring the url that was in
the response, use it. This allows external special remotes that want to
provide some better url to do so, although I don't entirely agree with
using CHECKURL-MULTI to accomplish that. I'm more of the feeling that an
undocumented special case that throws data away is just not a good idea.

This could in theory break some external special remote program that relied
on the current behavior, but it seems unlikely that it would, because such
a program must already handle the multiple url case, unless it only ever
provides a single url response to CHECKURL-MULTI.

Make addurl --file work with a single item CHECKURL-MULTI response.
It already did for external special remotes due to the special case,
but now it also will for builtin ones like the BitTorrent special remote.
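
For reference, a rough sketch of the exchange involved (the message forms are
from the external special remote protocol; the Haskell types are simplified
stand-ins, not git-annex's real definitions):

    -- Simplified sketch, not the actual git-annex definitions.
    --
    -- The protocol exchange in question:
    --   git-annex -> remote:  CHECKURL <url>
    --   remote -> git-annex:  CHECKURL-CONTENTS <size|UNKNOWN> <filename>
    --                    or:  CHECKURL-MULTI <url> <size|UNKNOWN> <filename> ...
    --                    or:  CHECKURL-FAILURE <errmsg>
    type URLString = String

    data UrlContents
        = UrlContents (Maybe Integer) (Maybe FilePath)    -- one file at the checked url
        | UrlMulti [(URLString, Maybe Integer, FilePath)] -- several files, each with its own url

With the special case removed, a CHECKURL-MULTI reply listing exactly one file
stays a one-element UrlMulti instead of being collapsed into UrlContents, so
the url given in the reply is what gets used.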

This commit was sponsored by Ilya Shlyakhter on Patreon.
Joey Hess 2018-10-29 14:41:41 -04:00
parent 3248319ca1
commit a622488758
GPG key ID: DB12DB0FF05F8F38
5 changed files with 87 additions and 18 deletions


@@ -46,6 +46,9 @@ git-annex (7.20181025) UNRELEASED; urgency=medium
     immediately to v7.
   * When annex.thin is set, allow hard links to be made between executable
     work tree files and annex objects.
+  * addurl: Removed undocumented special case in handling of a CHECKURL-MULTI
+    response with only a single file listed. Rather than ignoring the url that
+    was in the response, use it.
 
  -- Joey Hess <id@joeyh.name>  Sat, 13 Oct 2018 00:52:02 -0400

Command/AddUrl.hs

@@ -132,17 +132,21 @@ checkUrl r o u = do
 	go deffile (Right (UrlContents sz mf)) = do
 		let f = adjustFile o (fromMaybe (maybe deffile fromSafeFilePath mf) (fileOption (downloadOptions o)))
 		void $ commandAction $ startRemote r o f u sz
-	go deffile (Right (UrlMulti l))
-		| isNothing (fileOption (downloadOptions o)) =
+	go deffile (Right (UrlMulti l)) = case fileOption (downloadOptions o) of
+		Nothing ->
 			forM_ l $ \(u', sz, f) -> do
 				let f' = adjustFile o (deffile </> fromSafeFilePath f)
-				void $ commandAction $
-					startRemote r o f' u' sz
-		| otherwise = giveup $ unwords
-			[ "That url contains multiple files according to the"
-			, Remote.name r
-			, " remote; cannot add it to a single file."
-			]
+				void $ commandAction $ startRemote r o f' u' sz
+		Just f -> case l of
+			[] -> noop
+			((u',sz,_):[]) -> do
+				let f' = adjustFile o f
+				void $ commandAction $ startRemote r o f' u' sz
+			_ -> giveup $ unwords
+				[ "That url contains multiple files according to the"
+				, Remote.name r
+				, " remote; cannot add it to a single file."
+				]
 
 startRemote :: Remote -> AddUrlOptions -> FilePath -> URLString -> Maybe Integer -> CommandStart
 startRemote r o file uri sz = do

Remote/External.hs

@@ -689,10 +689,6 @@ checkUrlM external url =
 	handleRequest external (CHECKURL url) Nothing $ \req -> case req of
 		CHECKURL_CONTENTS sz f -> result $ UrlContents sz $
 			if null f then Nothing else Just $ mkSafeFilePath f
-		-- Treat a single item multi response specially to
-		-- simplify the external remote implementation.
-		CHECKURL_MULTI ((_, sz, f):[]) ->
-			result $ UrlContents sz $ Just $ mkSafeFilePath f
 		CHECKURL_MULTI l -> result $ UrlMulti $ map mkmulti l
 		CHECKURL_FAILURE errmsg -> Just $ giveup errmsg
 		UNSUPPORTED_REQUEST -> giveup "CHECKURL not implemented by external special remote"


@@ -8,9 +8,30 @@ When an external special remote tells git-annex a fuller URL for a given file, g
 It would be better if, in the above log, the URL key was based on dx://file-FJZjVx001pB2BQPVKY4zX8kk/A4.assembly1-trinity.fasta , which would preserve the .fasta extension in the key and therefore in the symlink target.
 
+> [[fixed|done]] --[[Joey]]
+
-Also, it would be good if the external special remote could return an etag for the URL, which would be a value guaranteed to change if the URL's contents changes;
-and if git-annex would then compute the URL key based on the combination of URL and etag.
+Also, it would be good if the external special remote could return an etag
+for the URL, which would be a value guaranteed to change if the URL's
+contents changes; and if git-annex would then compute the URL key based on
+the combination of URL and etag.
+
+> This might be a good idea if sufficiently elaborated on, but I am a one
+> idea, one bug, one page kind of guy. I dislike reading over a long detailed
+> discussion of something, like the problem above and my analysis of it,
+> only to find a second, unrelated discussion of something else.
+> Suddenly the mental state is polluted with
+> different distinct things, some fixed, other still open. The bug tracking
+> system has then failed because it's not tracking state in any useful way.
+> Which is why I've closed this todo item with my fix of
+> a single item from it. --[[Joey]]
 
-It'd also be good if there was a option to automatically migrate URL keys to the default backend whenever a file from a URL key is downloaded. Also, to record
-the checksummed key (e.g. MD5E) as metadata of the URL key (in a field named e.g. alternateKeys), and if addurl --fast is later done on a URL key for which
-a checksummed key is recorded in the metadata, to add the checksummed key instead of the URL key .
+It'd also be good if there was a option to automatically migrate URL keys
+to the default backend whenever a file from a URL key is downloaded. Also,
+to record the checksummed key (e.g. MD5E) as metadata of the URL key (in a
+field named e.g. alternateKeys), and if addurl --fast is later done on a
+URL key for which a checksummed key is recorded in the metadata, to add the
+checksummed key instead of the URL key .
+
+> Again, mixing discussion of several things in one place is a good way to
+> muddy the waters. I think this idea has several problems, but don't want
+> to discuss them here. --[[Joey]]
 


@@ -0,0 +1,45 @@
[[!comment format=mdwn
username="joey"
subject="""comment 1"""
date="2018-10-29T17:51:27Z"
content="""
Looking at the code, addurl clearly *does* use the urls returned by
CHECKURL. In Command/AddUrl.hs:

	go deffile (Right (UrlMulti l))
		| isNothing (fileOption (downloadOptions o)) =
			forM_ l $ \(u', sz, f) -> do
				let f' = adjustFile o (deffile </> fromSafeFilePath f)
				void $ commandAction $
					startRemote r o f' u' sz

`l` is the list of values the remote returns, and `u'` is the individual urls
from that list, as opposed to `u`, which is the url the user provided.
`u'` is passed to `startRemote`, and `u` is not.
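
To make that concrete, with made-up urls and sizes (simplified types, not the
real git-annex ones):

	-- made-up example values, purely to illustrate the data flow
	u :: String
	u = "http://example.com/collection"    -- the url the user gave to addurl

	l :: [(String, Maybe Integer, FilePath)]
	l = [ ("http://example.com/collection/a.txt", Just 120, "a.txt")
	    , ("http://example.com/collection/b.txt", Just 450, "b.txt")
	    ]
	-- the forM_ above runs startRemote once per element of l, passing the
	-- remote-supplied url u' and filename f; u itself is not passed along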

Hmm, but in Remote/External.hs there is a special case:

	-- Treat a single item multi response specially to
	-- simplify the external remote implementation.
	CHECKURL_MULTI ((_, sz, f):[]) ->
		result $ UrlContents sz $ Just $ mkSafeFilePath f
	CHECKURL_MULTI l -> result $ UrlMulti $ map mkmulti l

That does not have any kind of rationale in [[!commit 8a17bcb0be91c345a52d78c08009285b0fcd6e3a]],
but the next commit added `doc/special_remotes/external/git-annex-remote-torrent`
and I think I can see why I felt it simplified things. That script always
replies with CHECKURL-MULTI, but a torrent often contains a single file, and
it would perhaps be better to use the original url provided by the user for such a
file from a torrent, rather than an url that asks for file "#1" from the torrent.
Although AFAICS either would work, and Remote/BitTorrent.hs contains just the kind
of special case for a single item torrent that I was wanting to avoid external
special remotes needing to worry about.
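
Roughly the kind of single-item special case I mean, as a sketch only
(simplified types and an illustrative "#n" url scheme, not the actual
Remote/BitTorrent.hs code):

	-- sketch: a torrent with exactly one file keeps the user's original url,
	-- while a multi-file torrent gets one url per file, naming its number
	data UrlContents
		= UrlContents (Maybe Integer) (Maybe FilePath)
		| UrlMulti [(String, Maybe Integer, FilePath)]

	torrentContents :: String -> [(FilePath, Integer)] -> UrlContents
	torrentContents _u [(f, sz)] = UrlContents (Just sz) (Just f)
	torrentContents u fs = UrlMulti
		[ (u ++ "#" ++ show n, Just sz, f)
		| (n, (f, sz)) <- zip [1 :: Int ..] fs
		]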

The other benefit to special casing UrlContents is that it lets addurl --file
specify where to put the file, which the fileOption check in the first code
block above prevents for UrlMulti. But, that could just as well be handled by
adding a single file special case to the code in AddUrl.

I suppose changing this won't break anything, or if it does, it was relying
on this undocumented behavior.
"""]]