Disable http-client's default 30 second response timeout when HEADing an url to check if it exists. Some web servers take quite a long time to answer a HEAD request.
commit 69dcb08d7a (parent e5109468e2)
5 changed files with 32 additions and 3 deletions
@@ -11,6 +11,9 @@ git-annex (6.20170521) UNRELEASED; urgency=medium
     directories, by forking a worker process and only deleting the test
     directory once it exits.
   * move, copy: Support --batch.
+  * Disable http-client's default 30 second response timeout when HEADing
+    an url to check if it exists. Some web servers take quite a long time
+    to answer a HEAD request.
 
  -- Joey Hess <id@joeyh.name>  Sat, 17 Jun 2017 13:02:24 -0400
@@ -441,13 +441,11 @@ withS3HandleMaybe c gc u a = do
 		Just creds -> do
 			awscreds <- liftIO $ genCredentials creds
 			let awscfg = AWS.Configuration AWS.Timestamp awscreds debugMapper
-			bracketIO (newManager httpcfg) closeManager $ \mgr ->
+			bracketIO (newManager managerSettings) closeManager $ \mgr ->
 				a $ Just $ S3Handle mgr awscfg s3cfg
 		Nothing -> a Nothing
   where
 	s3cfg = s3Configuration c
-	httpcfg = managerSettings
-		{ managerResponseTimeout = responseTimeoutNone }
 
 s3Configuration :: RemoteConfig -> S3.S3Configuration AWS.NormalQuery
 s3Configuration c = cfg
@@ -56,6 +56,7 @@ managerSettings = tlsManagerSettings
 #else
 managerSettings = conduitManagerSettings
 #endif
+	{ managerResponseTimeout = responseTimeoutNone }
 
 type URLString = String
 
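The `managerResponseTimeout = responseTimeoutNone` record update above is the whole mechanism: every Manager built from the shared `managerSettings` now waits indefinitely for response headers instead of giving up after http-client's 30 second default. A minimal standalone sketch of the same idea (assuming http-client >= 0.5, where `responseTimeoutNone` exists; the URL is a placeholder, not one from the commit):

	{-# LANGUAGE OverloadedStrings #-}
	import Network.HTTP.Client
	import Network.HTTP.Client.TLS (tlsManagerSettings)
	import Network.HTTP.Types.Status (statusCode)
	
	main :: IO ()
	main = do
		-- Build a Manager that never times out waiting for response
		-- headers, mirroring the patched managerSettings.
		mgr <- newManager tlsManagerSettings
			{ managerResponseTimeout = responseTimeoutNone }
		-- Issue a HEAD request; a slow server can now take as long
		-- as it likes to answer. (Placeholder URL.)
		req <- parseRequest "https://example.com/some/file"
		let headReq = req { method = "HEAD" }
		resp <- httpNoBody headReq mgr
		print (statusCode (responseStatus resp))

Note the trade-off: with no response timeout, a hung server blocks the caller indefinitely; git-annex accepts that here because a HEAD that legitimately takes over 30 seconds is exactly the case being fixed.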
@@ -47,3 +47,5 @@ git-annex: drop: 1 failed
 
 [[!meta author=yoh]]
 
+
+> [[done]] --[[Joey]]
@@ -0,0 +1,25 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2017-08-15T17:28:20Z"
+ content="""
+The normal reason for this to happen is if the size of the file
+on the website has changed. git-annex checks the reported size and if it
+differs from the versioned file, it knows that the website no longer
+contains the same file.
+
+In this case, it seems to be a cgi program generating a zip file, and the
+program actually generated two different zip files when I hit it twice with
+wget. (So if git-annex actually did drop the only copy of the version you
+downloaded, you'd not be able to download it again. Not that git-annex can know
+that; this kind of thing is why trusting the web is not a good idea..) They did
+have the same size, but it looks like the web server is not sending a size
+header anyway.
+
+The actual problem is the web server takes a long time to answer a HEAD request
+for this URL. It takes 35 seconds before curl is able to HEAD it. I suspect
+it's generating the 300 mb zip file before it gets around to finishing
+the HEAD request. Not the greatest server behavior, all around.
+
+That breaks http-client due to its default 30 second timeout. So, will remove that timeout then.
+"""]]