Urls can now be claimed by remotes. This will allow creating, for example, an external special remote that handles magnet: and *.torrent urls.

Joey Hess 2014-12-08 19:14:24 -04:00
parent ee27298b91
commit 30bf112185
28 changed files with 346 additions and 114 deletions

@ -125,10 +125,16 @@ replying with `UNSUPPORTED-REQUEST` is acceptable.
If the remote replies with `UNSUPPORTED-REQUEST`, its availability
is assumed to be global. So, only remotes that are only reachable
locally need to worry about implementing this.
* `CLAIMURL Value`
* `CLAIMURL Url`
Asks the remote if it wishes to claim responsibility for downloading
an url. If so, the remote should send back a `CLAIMURL-SUCCESS` reply.
If not, it can send `CLAIMURL-FAILURE`.
* `CHECKURL Url`
Asks the remote to check if the url's content can currently be downloaded
(without downloading it). If the url is not accessible, send
`CHECKURL-FAILURE`. If the url is accessible and the size is known,
send the size in `CHECKURL-SIZE`. If the url is accessible, but the size
is unknown, send `CHECKURL-SIZEUNKNOWN`.
More optional requests may be added, without changing the protocol version,
so if an unknown request is seen, reply with `UNSUPPORTED-REQUEST`.
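
As a rough illustration of the two requests above, here is a minimal Python sketch of how an external special remote might answer `CLAIMURL` and `CHECKURL`. The names `claims`, `handle_request` and `probe` are hypothetical, and so is the policy of claiming magnet: and *.torrent urls. How the url is actually probed depends on the remote, so the probe is passed in as a function.

```python
from typing import Callable, Optional, Tuple

# Hypothetical probe type: takes an url, returns (accessible, size_in_bytes).
# The size is None when the url is accessible but its size is not known.
Probe = Callable[[str], Tuple[bool, Optional[int]]]


def claims(url: str) -> bool:
    """Assumed policy for this sketch: claim magnet: and *.torrent urls."""
    return url.startswith("magnet:") or url.endswith(".torrent")


def handle_request(line: str, probe: Probe) -> str:
    """Turn one request line from git-annex into one reply line."""
    # Split off the request name; the rest of the line is the url.
    request, _, url = line.partition(" ")
    if request == "CLAIMURL":
        return "CLAIMURL-SUCCESS" if claims(url) else "CLAIMURL-FAILURE"
    if request == "CHECKURL":
        accessible, size = probe(url)
        if not accessible:
            return "CHECKURL-FAILURE"
        if size is None:
            return "CHECKURL-SIZEUNKNOWN"
        return f"CHECKURL-SIZE {size}"
    # Optional requests this sketch does not know about.
    return "UNSUPPORTED-REQUEST"
```

A real remote would read request lines from stdin and write each reply to stdout. Keeping the probe separate also lets the dispatch be exercised without network access.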
@ -175,6 +181,14 @@ while it's handling a request.
Indicates that the CLAIMURL url will be handled by this remote.
* `CLAIMURL-FAILURE`
Indicates that the CLAIMURL url will not be handled by this remote.
* `CHECKURL-SIZE Size`
Indicates that the requested url has been verified to exist,
and its size is known. The size is in bytes.
* `CHECKURL-SIZEUNKNOWN`
Indicates that the requested url has been verified to exist,
but its size could not be determined.
* `CHECKURL-FAILURE`
Indicates that the requested url could not be accessed.
* `UNSUPPORTED-REQUEST`
Indicates that the special remote does not know how to handle a request.
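
To make the three `CHECKURL` replies concrete, the following Python sketch shows one way the receiving side could decode them. The function name `parse_checkurl_reply` and the `(accessible, size)` return shape are assumptions made for this example. They are not taken from git-annex's implementation.

```python
from typing import Optional, Tuple


def parse_checkurl_reply(reply: str) -> Tuple[bool, Optional[int]]:
    """Decode a CHECKURL reply into (accessible, size_in_bytes)."""
    words = reply.split()
    if words == ["CHECKURL-FAILURE"]:
        # The url could not be accessed.
        return (False, None)
    if words == ["CHECKURL-SIZEUNKNOWN"]:
        # The url exists, but its size could not be determined.
        return (True, None)
    if (
        len(words) == 2
        and words[0] == "CHECKURL-SIZE"
        and words[1].isascii()
        and words[1].isdigit()
    ):
        # The url exists and the remote reported its size in bytes.
        return (True, int(words[1]))
    raise ValueError(f"unexpected CHECKURL reply: {reply!r}")
```

Anything else, including `UNSUPPORTED-REQUEST`, raises `ValueError` in this sketch. A caller would need to decide separately how to treat a remote that does not implement `CHECKURL`.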
@ -255,14 +269,14 @@ in control.
* `GETSTATE Key`
Gets any state that has been stored for the key.
(git-annex replies with VALUE followed by the state.)
* `SETURLPRESENT Key Value`
* `SETURLPRESENT Key Url`
Records an url (or uri) where the Key can be downloaded from.
* `SETURLMISSING Key Value`
* `SETURLMISSING Key Url`
Records that the key can no longer be downloaded from the specified
url (or uri).
* `GETURLS Key Value`
* `GETURLS Key Prefix`
Gets the recorded urls where a Key can be downloaded from.
Only urls that start with the Value will be returned. The Value
Only urls that start with the Prefix will be returned. The Prefix
may be empty to get all urls.
(git-annex replies one or more times with VALUE for each url.
The final VALUE has an empty value, indicating the end of the url list.)
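
The `GETURLS` exchange described above could be driven from the remote's side roughly as follows. This is a Python sketch under stated assumptions: `get_urls` is a hypothetical helper, and the two stream arguments stand in for the remote's stdout and stdin.

```python
import io
from typing import List, TextIO


def get_urls(key: str, prefix: str, to_annex: TextIO, from_annex: TextIO) -> List[str]:
    """Send GETURLS and collect VALUE replies until the empty terminator."""
    to_annex.write(f"GETURLS {key} {prefix}\n")
    to_annex.flush()
    urls: List[str] = []
    for raw in from_annex:
        line = raw.rstrip("\n")
        if line != "VALUE" and not line.startswith("VALUE "):
            raise ValueError(f"expected VALUE, got: {line!r}")
        value = line[len("VALUE "):]
        if value == "":
            # An empty VALUE marks the end of the url list.
            break
        urls.append(value)
    # If the stream ends without a terminator, return what was collected.
    return urls


if __name__ == "__main__":
    # Simulated session: git-annex answers with two urls, then the terminator.
    replies = io.StringIO("VALUE magnet:?xt=a\nVALUE magnet:?xt=b\nVALUE \n")
    print(get_urls("SOMEKEY", "magnet:", io.StringIO(), replies))
```

Passing an empty prefix asks for all recorded urls, as described above.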

@ -0,0 +1,14 @@
Worked on [[todo/extensible_addurl]] today. When `git annex addurl` is run,
remotes will be asked if they claim the url, and whichever remote does will
be used to download it, and location tracking will indicate that remote
contains the object. This is a massive 1000 line patch touching 30 files,
including follow-on changes in `rmurl` and `whereis` and even `rekey`.
It should now be possible to build an external special remote that handles
*.torrent and magnet: urls and passes them off to a bittorrent client for
download, for example.
Another use for this would be to make an external special remote that
uses youtube-dl or some program other than quvi for downloading web videos.
The builtin quvi support could probably be moved out of the web special
remote, to a separate remote. I haven't tried to do that yet.

@ -182,8 +182,9 @@ Example:
## `aaa/bbb/*.log.web`
These log files record urls used by the
[[web_special_remote|special_remotes/web]]. Their format is similar
to the location tracking files, but with urls rather than UUIDs.
[[web_special_remote|special_remotes/web]] and sometimes by other remotes.
Their format is similar to the location tracking files, but with urls
rather than UUIDs.
## `aaa/bbb/*.log.rmt`

@ -25,11 +25,40 @@ Solution: Add a new method to remotes:
claimUrl :: Maybe (URLString -> Annex Bool)
Remotes that implement this method (including special remotes) will
be queried when such an uri is added, to see which claims it. Once the
remote is known, addurl will record that the Key is present on that remote,
and record the uri in the url log.
be queried when such an uri is added, to see which claims it.
Then retrieval of the Key works more or less as usual. The only
Once the remote is known, addurl --file will record that the Key is present
on that remote, and record the uri in the url log.
----
What about using addurl to add a new file? In this mode, the Key is not yet
known. addurl currently handles this by generating a dummy Key for the url
(hitting the url to get its size), and running a Transfer using the dummy
key that downloads from the web. Once the download is done, the dummy Key
is upgraded to the final Key.
Something similar could be done for other remotes, but the url log for the
dummy key would need to have the url added to it, for the remote to know
what to download, and then that could be removed after the download. Which
causes ugly churn in git, and would leave a mess if interrupted.
One option is to add another new method to remotes:
downloadUrl :: Maybe (URLString -> Annex FilePath)
Or, the url log could have support added for recording temporary key
urls in memory. (done)
Another problem is that the size of the Key isn't known. addurl
could always operate in relaxed mode, where it generates a size-less Key.
Or, yet another method could be added: (done)
sizeUrl :: URLString -> Annex (Maybe Integer)
----
Retrieval of the Key works more or less as usual. The only
difference being that remotes that support this interface can look
at the url log to find the one with the right "$downloader:" prefix,
and so know where to download from. (Much as the web special remote already
@ -55,3 +84,5 @@ This could be implemented in either the web special remote or even in an
external special remote.
Some other discussion at <https://github.com/datalad/datalad/issues/10>
> [[done]]! --[[Joey]]