don't rely on exception for http 416
Fix a bug that could make resuming a download from the web fail when the entire content of the file is actually already present locally. What a mess that Request can throw exceptions or not, depending on how it's configured. Makes it very hard if you need to handle some specific http status codes in a function like this! Implementing everything two ways did not seem appealing, if possible at all, so I decided to override the Request if it did come configured to throw exception on non-2xx http status. Other exceptions, like from http-client-restricted, or due to a redirect to a non-http url, still get thrown. This commit was sponsored by Luke Shumaker on Patreon.
This commit is contained in:
parent
3991c8e43d
commit
b90b9b936d
4 changed files with 60 additions and 18 deletions
|
@ -15,6 +15,9 @@ git-annex (8.20201117) UNRELEASED; urgency=medium
|
|||
* Fix build on Windows.
|
||||
* Prevent windows assistant from trying (and failing) to upgrade
|
||||
itself, which has never been supported on windows.
|
||||
* Fix a bug that could make resuming a download from the web fail
|
||||
when the entire content of the file is actually already present
|
||||
locally.
|
||||
|
||||
-- Joey Hess <id@joeyh.name> Mon, 16 Nov 2020 09:38:32 -0400
|
||||
|
||||
|
|
|
@ -452,7 +452,12 @@ download' nocurlerror meterupdate url file uo =
|
|||
{- Download a perhaps large file using conduit, with auto-resume
|
||||
- of incomplete downloads.
|
||||
-
|
||||
- Does not catch exceptions.
|
||||
- A Request can be configured to throw exceptions for non-2xx http
|
||||
- status codes, or not. That configuration is overridden by this,
|
||||
- and if it is unable to download, it throws an exception containing
|
||||
- a user-visible explanation of the problem. (However, exceptions
|
||||
- thrown for reasons other than http status codes will still be thrown
|
||||
- as usual.)
|
||||
-}
|
||||
downloadConduit :: MeterUpdate -> Request -> FilePath -> UrlOptions -> IO ()
|
||||
downloadConduit meterupdate req file uo =
|
||||
|
@ -480,28 +485,29 @@ downloadConduit meterupdate req file uo =
|
|||
filter ((/= hAcceptEncoding) . fst)
|
||||
(requestHeaders req)
|
||||
, decompress = const False
|
||||
-- Avoid throwing exceptions non-2xx http status codes,
|
||||
-- since we rely on parsing the Response to handle
|
||||
-- several such codes.
|
||||
, checkResponse = \_ _ -> return ()
|
||||
}
|
||||
|
||||
-- Resume download from where a previous download was interrupted,
|
||||
-- when supported by the http server. The server may also opt to
|
||||
-- send the whole file rather than resuming.
|
||||
resumedownload sz = catchJust
|
||||
(matchStatusCodeHeadersException (alreadydownloaded sz))
|
||||
dl
|
||||
(const noop)
|
||||
where
|
||||
dl = join $ runResourceT $ do
|
||||
let req'' = req' { requestHeaders = resumeFromHeader sz : requestHeaders req }
|
||||
liftIO $ debugM "url" (show req'')
|
||||
resp <- http req'' (httpManager uo)
|
||||
if responseStatus resp == partialContent206
|
||||
resumedownload sz = join $ runResourceT $ do
|
||||
let req'' = req' { requestHeaders = resumeFromHeader sz : requestHeaders req }
|
||||
liftIO $ debugM "url" (show req'')
|
||||
resp <- http req'' (httpManager uo)
|
||||
if responseStatus resp == partialContent206
|
||||
then do
|
||||
store (toBytesProcessed sz) AppendMode resp
|
||||
return (return ())
|
||||
else if responseStatus resp == ok200
|
||||
then do
|
||||
store (toBytesProcessed sz) AppendMode resp
|
||||
store zeroBytesProcessed WriteMode resp
|
||||
return (return ())
|
||||
else if responseStatus resp == ok200
|
||||
then do
|
||||
store zeroBytesProcessed WriteMode resp
|
||||
return (return ())
|
||||
else if alreadydownloaded sz resp
|
||||
then return (return ())
|
||||
else do
|
||||
rf <- extractFromResourceT (respfailure resp)
|
||||
if responseStatus resp == unauthorized401
|
||||
|
@ -510,8 +516,9 @@ downloadConduit meterupdate req file uo =
|
|||
Just ba -> retryauthed ba
|
||||
else return $ giveup rf
|
||||
|
||||
alreadydownloaded sz s h = s == requestedRangeNotSatisfiable416
|
||||
&& case lookup hContentRange h of
|
||||
alreadydownloaded sz resp
|
||||
| responseStatus resp /= requestedRangeNotSatisfiable416 = False
|
||||
| otherwise = case lookup hContentRange (responseHeaders resp) of
|
||||
-- This could be improved by fixing
|
||||
-- https://github.com/aristidb/http-types/issues/87
|
||||
Just crh -> crh == B8.fromString ("bytes */" ++ show sz)
|
||||
|
|
|
@ -41,3 +41,5 @@ Stock git-annex on Fedora 32:
|
|||
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
|
||||
|
||||
Yes! git-annex has been working great for me so far, and is powering the bioinformatics chat podcast (https://bioinformatics.chat/). Thanks!
|
||||
|
||||
> [[fixed|done]] --[[Joey]]
|
||||
|
|
|
@ -0,0 +1,30 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 1"""
|
||||
date="2020-11-19T16:56:45Z"
|
||||
content="""
|
||||
To reproduce this, interrupt git-annex after it downloads the whole file,
|
||||
but before it moves it from the download location into the annex. (Or,
|
||||
let it get the file, then move the object back to the temp object location.)
|
||||
|
||||
This is a tricky case, because if the total file size is not
|
||||
known when resuming the download, how can it detect if it's got it all
|
||||
already? And git-annex does not always know the total file size, eg when
|
||||
git-annex addurl --relaxed is used, and then git-annex get is later used
|
||||
to download the content.
|
||||
|
||||
What git-annex already tried to do to detect this is,
|
||||
when it got a 416 it looks for a Content-Range header "bytes */$size"
|
||||
where $size is the same as the size of the file on disk.
|
||||
|
||||
That relied on the http library throwing an exception for the 416.
|
||||
Thing is, http may or may not throw exceptions for non-2xx
|
||||
responses, depending on the input Request. IMHO that is a very bad design,
|
||||
it leads to this kind of bug, rather than making it evident with the data
|
||||
types what is going on.
|
||||
|
||||
Currently downloadConduit takes a Request, and assumes it throws exceptions
|
||||
for 416, but not for 401. Both can't be right.
|
||||
|
||||
Ok, fixed this mess..
|
||||
"""]]
|
Loading…
Add table
Reference in a new issue