fix http-client gzip decompression bug

Prevent haskell http-client from decompressing gzip files, so downloads of
such files work the same as they used to with wget and curl.

Explicitly setting accept-encoding to "identity" is probably not needed,
but that's what wget sends (curl does not send the header). Since
http-client is trying to be excessively smart, it seems we need to set
hAcceptEncoding to something to prevent it from inserting its own value,
and "identity" seems better than some hack like "".

This commit was sponsored by Ole-Morten Duesund on Patreon.
Joey Hess 2018-05-21 15:10:25 -04:00
parent f9e3bcdeb8
commit caaedb2993
GPG key ID: DB12DB0FF05F8F38
4 changed files with 36 additions and 2 deletions

@@ -8,6 +8,8 @@ git-annex (6.20180510) UNRELEASED; urgency=medium
since 6.20180427. The older behavior, which has never been well
documented and seems almost entirely useless, has been removed.
* copy: --force no longer does anything.
* Prevent haskell http-client from decompressing gzip files, so downloads
of such files work the same as they used to with wget and curl.
-- Joey Hess <id@joeyh.name> Mon, 14 May 2018 13:42:41 -0400

@@ -284,11 +284,20 @@ download meterupdate url file uo =
     downloadconduit req = catchMaybeIO (getFileSize file) >>= \case
         Nothing -> runResourceT $ do
-            resp <- http req (httpManager uo)
+            resp <- http req' (httpManager uo)
             if responseStatus resp == ok200
                 then store zeroBytesProcessed WriteMode resp
                 else showrespfailure resp
-        Just sz -> resumeconduit req sz
+        Just sz -> resumeconduit req' sz
+      where
+        -- Override http-client's default decompression of gzip
+        -- compressed files. We want the unmodified file content.
+        req' = req
+            { requestHeaders = (hAcceptEncoding, "identity") :
+                filter ((/= hAcceptEncoding) . fst)
+                    (requestHeaders req)
+            , decompress = const False
+            }
     alreadydownloaded sz s h = s == requestedRangeNotSatisfiable416
         && case lookup hContentRange h of

@@ -46,3 +46,5 @@ $ du -L *
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
Of course! Still love git-annex to bits <3
> [[fixed|done]] --[[Joey]]

@@ -0,0 +1,21 @@
[[!comment format=mdwn
username="joey"
subject="""comment 3"""
date="2018-05-21T18:30:53Z"
content="""
haskell http-client has a strange default handling of compressed files.
It seems to want to decompress them unless the content-type is
"application/x-tar". It defaults to accept-encoding of gzip.
That seems to be targeting being used to implement a web browser or
something, although I don't entirely understand how that behavior would
make sense for a web browser either; I'd expect it to only decompress
content that was transparently compressed in transit, but not other
content. Firefox does not decompress that tarball when downloading it; nor
does it display a foo.html.gz as a web page; it downloads it as-is.
Very strange default for a general purpose http library; IMHO it's a bug.
`Accept-Encoding: identity` and no transparent decompression seems to be
the way to go here, just like wget.
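
A minimal sketch of that approach, assuming http-client's `requestHeaders`
and `decompress` request fields and http-types' `hAcceptEncoding`. The names
`rawContent` and `sample` are hypothetical, made up for illustration; they
are not part of git-annex or of the library.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Network.HTTP.Client (Request, decompress, defaultRequest, requestHeaders)
import Network.HTTP.Types.Header (hAcceptEncoding)

-- | Hypothetical helper: ask the server for the unencoded representation
-- and switch off http-client's on-the-fly gzip decompression.
rawContent :: Request -> Request
rawContent req = req
    { requestHeaders = (hAcceptEncoding, "identity")
        -- Drop any Accept-Encoding already present, so only ours is sent.
        : filter ((/= hAcceptEncoding) . fst) (requestHeaders req)
      -- 'decompress' is a predicate on the response content type;
      -- returning False for every type leaves the body untouched.
    , decompress = const False
    }

-- | Hypothetical request that already carries an Accept-Encoding header.
sample :: Request
sample = defaultRequest { requestHeaders = [(hAcceptEncoding, "gzip")] }

main :: IO ()
main = do
    let req' = rawContent sample
    -- Shows the headers that would be sent after the override.
    print (requestHeaders req')
    -- Shows whether a gzip content type would be decompressed.
    print (decompress req' "application/gzip")
```

Filtering before prepending means a header set earlier in the request's
life is replaced rather than duplicated.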
"""]]