Fix resuming a url download when the whole file content has already been downloaded

I don't much like that there's no way to distinguish between having the whole
content and having an old version of the file that's bigger. But of course
resuming an http transfer can always yield the wrong result if the file on
the http server is changing, and git-annex will detect that when it
verifies the downloaded content.
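
As a rough illustration of the trade-off described above, here is a minimal Python sketch of how a client might classify the server's reply when resuming with `Range: bytes=<local_size>-`. It is not git-annex's actual code (git-annex is written in Haskell); the names `ResumeAction` and `decide_resume` are hypothetical, and the handling of each status code is an assumption modelled on the wget behaviour shown in this commit.

```python
from enum import Enum
from typing import Optional


class ResumeAction(Enum):
    """Hypothetical outcomes of a resume attempt (names are illustrative)."""
    APPEND = "append"                    # server sent the missing tail
    RESTART = "restart"                  # server ignored the range, sent everything
    ASSUME_COMPLETE = "assume_complete"  # nothing more to fetch; verify locally
    FAIL = "fail"                        # anything else


def decide_resume(local_size: int, status: int,
                  content_range: Optional[str]) -> ResumeAction:
    """Classify the reply to a `Range: bytes=<local_size>-` request.

    Assumption: a 416 reply is treated as "already fully downloaded",
    mirroring what wget and curl do. A local file that is larger than
    the remote one produces the same 416, so the two cases cannot be
    told apart here; only a later content check can catch that.
    """
    if status == 206:
        # Partial content: only append if the range starts where we stopped.
        if content_range and content_range.startswith(f"bytes {local_size}-"):
            return ResumeAction.APPEND
        return ResumeAction.FAIL
    if status == 200:
        # Server does not support (or ignored) the range request.
        return ResumeAction.RESTART
    if status == 416:
        # Requested range is empty or past the end of the file.
        return ResumeAction.ASSUME_COMPLETE
    return ResumeAction.FAIL
```

In this sketch, `decide_resume(200, 416, None)` and `decide_resume(300, 416, None)` both return `ResumeAction.ASSUME_COMPLETE`, which is why verifying the content afterwards is still needed.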

This work is supported by the NIH-funded NICEMAN (ReproNim TR&D3) project.
Joey Hess 2018-11-12 16:08:47 -04:00
parent c24bdfd689
commit ff9bd9620e
GPG key ID: DB12DB0FF05F8F38
4 changed files with 40 additions and 1 deletions


@@ -98,3 +98,5 @@ git-annex version: 6.20181011+git124-g94aa0e2f6-1~ndall+1
"""]]
Not sure why it ended up not moved into the proper location but I think upon redownload, size should be verified, if "Full" - try to proceed to checksum verification etc.
> [[fixed|done]] --[[Joey]]


@@ -0,0 +1,28 @@
[[!comment format=mdwn
username="joey"
subject="""comment 2"""
date="2018-11-12T19:50:04Z"
content="""
I was able to reproduce this with an apache web server. It seems apache
doesn't send back a Content-Range header when the requested range is empty,
though it does otherwise.
Both wget and curl seem to accept that as indicating that nothing more
needs to be downloaded.

    joey@darkstar:~>wget -c http://localhost/~joey/foo -O foo
    --2018-11-12 15:57:48-- http://localhost/~joey/foo
    Resolving localhost (localhost)... ::1, 127.0.0.1
    Connecting to localhost (localhost)|::1|:80... connected.
    HTTP request sent, awaiting response... 416 Requested Range Not Satisfiable
    The file is already fully retrieved; nothing to do.

Although, it's worth noting that the http server does the same thing
if a range larger than the url's size is requested. In that case wget
behaves the same as above, even though it hasn't actually downloaded the
current content of the file. So this seems like an ugly corner of http:
the two situations cannot be distinguished.
I suppose I'll make git-annex behave the same as wget and curl do.
"""]]