`git annex addurl` supports regular urls, as well as detecting videos that
quvi can download. We'd like to make its uri handling extensible.

Use cases range from torrent download support to pulling data
from scientific data repositories that use their own APIs.

The basic idea is to have external special remotes (or perhaps built-in
ones in some cases) that addurl can use to download an object referred
to by some uri-like thing. The uri starts with "$downloader:" to indicate
that it's not a regular url and so is not handled by the web special
remote.

    git annex addurl torrent:$foo
    git annex addurl CERN:$bar

Problem: This requires mapping from the name of the downloader, which is
probably the same as the git-annex-remote-$downloader program implementing
the special remote protocol (but not always), to the UUID of a remote.
That's assuming we want location tracking to be able to know that a file is
available both from CERN and from a torrent, for example.

Solution: Add a new method to remotes:

    claimUrl :: Maybe (URLString -> Annex Bool)

Remotes that implement this method (including special remotes) will
be queried when such a uri is added, to see which one claims it.

Once the remote is known, addurl --file will record that the Key is present
on that remote, and record the uri in the url log.

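To make the claiming step concrete, here is a minimal sketch of querying remotes for a uri. It is not git-annex's actual code: the `Remote` record is cut down to two fields, `IO` stands in for the `Annex` monad, and `prefixRemote` and `claimingRemote` are hypothetical names.

```haskell
import Control.Monad (filterM)
import Data.List (isPrefixOf)
import Data.Maybe (listToMaybe)

type URLString = String

-- Simplified stand-in for a remote; IO replaces the Annex monad here.
data Remote = Remote
    { remoteName :: String
    , claimUrl :: Maybe (URLString -> IO Bool)
    }

-- Hypothetical remote that claims every uri starting with "<downloader>:".
prefixRemote :: String -> Remote
prefixRemote downloader = Remote
    { remoteName = downloader
    , claimUrl = Just (\u -> return ((downloader ++ ":") `isPrefixOf` u))
    }

-- Query every remote that implements claimUrl and take the first claimant.
claimingRemote :: [Remote] -> URLString -> IO (Maybe Remote)
claimingRemote remotes u = listToMaybe <$> filterM claims remotes
  where
    claims r = maybe (return False) ($ u) (claimUrl r)

main :: IO ()
main = do
    let remotes = [Remote "plain" Nothing, prefixRemote "torrent", prefixRemote "CERN"]
    r <- claimingRemote remotes "CERN:$bar"
    putStrLn (maybe "unclaimed" remoteName r)
```

Running this prints `CERN`. A remote whose `claimUrl` is `Nothing` is never asked, which matches the method being optional.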
----

What about using addurl to add a new file? In this mode, the Key is not yet
known. addurl currently handles this by generating a dummy Key for the url
(hitting the url to get its size), and running a Transfer using the dummy
Key that downloads from the web. Once the download is done, the dummy Key
is upgraded to the final Key.

Something similar could be done for other remotes, but the url would need
to be added to the dummy Key's url log, so that the remote knows what to
download, and then removed again after the download. That causes ugly churn
in git, and would leave a mess if interrupted.

One option is to add another new method to remotes:

    downloadUrl :: Maybe (URLString -> Annex FilePath)

Or, the url log could gain support for recording temporary key
urls in memory. (done)

Another problem is that the size of the Key isn't known. addurl
could always operate in relaxed mode, where it generates a size-less Key.
Or, yet another method could be added: (done)

    sizeUrl :: URLString -> Annex (Maybe Integer)

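The two options for the Key's size can be sketched side by side. This is only an illustration under stated assumptions: the `Key` record is a simplified stand-in, `dummyKey` is a hypothetical helper, `IO` replaces the `Annex` monad, and the `"URL"` backend name is assumed for the dummy Key.

```haskell
type URLString = String

-- Simplified stand-in for a Key; the real type carries more fields.
data Key = Key
    { keyBackend :: String
    , keyName :: String
    , keySize :: Maybe Integer
    } deriving (Show, Eq)

-- Build a dummy Key for a url. In relaxed mode the size is never looked up,
-- so the Key is size-less; otherwise the remote's sizeUrl is consulted.
dummyKey :: Bool -> (URLString -> IO (Maybe Integer)) -> URLString -> IO Key
dummyKey relaxed sizeUrl u = do
    size <- if relaxed then return Nothing else sizeUrl u
    return (Key "URL" u size)

main :: IO ()
main = do
    -- Stand-in for a remote that knows the size of everything it serves.
    let knownSize _ = return (Just 1024)
    dummyKey False knownSize "torrent:$foo" >>= print
    dummyKey True knownSize "torrent:$foo" >>= print
```

The first Key carries `Just 1024` as its size and the second has `Nothing`. Because `sizeUrl` returns a `Maybe`, a remote that cannot determine the size ends up producing the same size-less Key as relaxed mode.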
----

Retrieval of the Key works more or less as usual. The only
difference is that remotes that support this interface can look
at the url log to find the url with the right "$downloader:" prefix,
and so know where to download from. (Much as the web special remote already
does.)

Prerequisite: Expand the external special remote interface to support
accessing the url log. (done)

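The url log lookup during retrieval amounts to a prefix filter. A minimal sketch, assuming the urls recorded for a Key are already available as a plain list of strings; `urlsFor` is a hypothetical name, not part of git-annex:

```haskell
import Data.List (stripPrefix)
import Data.Maybe (mapMaybe)

-- Keep only the urls that belong to one downloader, returning them with
-- the "<downloader>:" prefix removed. Other urls are dropped.
urlsFor :: String -> [String] -> [String]
urlsFor downloader = mapMaybe (stripPrefix (downloader ++ ":"))

main :: IO ()
main = mapM_ putStrLn
    (urlsFor "torrent" ["http://example.com/f", "torrent:$foo", "CERN:$bar"])
```

This prints `$foo`: the regular url and the CERN uri are left for the remotes that own them.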
----

It would also be nice to be able to easily configure a regexp, so that normal
urls that match it are made to use a particular downloader. For
torrents, this would make matching urls have torrent: prefixed to them.

    git config annex.downloader.torrent.regexp '(^magnet:|\.torrent$)'

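The effect of such a setting can be sketched as a rewrite applied to each url before it is handed to the remotes. To stay within the base library, this sketch swaps the regexp for an equivalent hand-written predicate; a real implementation would presumably compile the configured regexp with a regex library. `matchesTorrent` and `applyDownloader` are hypothetical names.

```haskell
import Data.List (isPrefixOf, isSuffixOf)

-- Hand-written stand-in for the regexp '(^magnet:|\.torrent$)'.
matchesTorrent :: String -> Bool
matchesTorrent u = "magnet:" `isPrefixOf` u || ".torrent" `isSuffixOf` u

-- Prefix matching urls with the downloader name; leave others untouched.
applyDownloader :: String -> (String -> Bool) -> String -> String
applyDownloader downloader matches u
    | matches u = downloader ++ ":" ++ u
    | otherwise = u

main :: IO ()
main = mapM_ (putStrLn . applyDownloader "torrent" matchesTorrent)
    [ "magnet:?xt=urn:btih:abc"
    , "http://example.com/file.torrent"
    , "http://example.com/file.iso"
    ]
```

The first two urls come out with `torrent:` in front, and the third is printed unchanged, so it would still be handled as a regular url.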
It might also be useful to allow bypassing the complexity of the external
special remote interface, and let a downloader be specified simply by:

    git config annex.downloader.torrent.command 'aria2c %url $file'

This could be implemented in either the web special remote or even in an
external special remote.

Some other discussion at <https://github.com/datalad/datalad/issues/10>

> [[done]]! --[[Joey]]