Support http urls that contain ":" that is not followed by a port number

The same as git does.

Sponsored-by: Dartmouth College's DANDI project
Joey Hess 2023-02-10 13:34:47 -04:00
parent 8fa3264f3a
commit 96d46db2d5
GPG key ID: DB12DB0FF05F8F38
4 changed files with 50 additions and 4 deletions

@@ -32,3 +32,5 @@ Backstory: Happened to a user trying to access some NWB files on gin for DANDI p
[[!meta author=yoh]]
[[!tag projects/dandi]]
> [[fixed|done]] --[[Joey]]

@@ -0,0 +1,26 @@
[[!comment format=mdwn
username="joey"
subject="""comment 1"""
date="2023-02-10T17:04:51Z"
content="""
Not a legal url really, RFC 1738 says "If the port is omitted, the colon is as well."
But web browsers, curl, wget, etc do mostly seem to support it, so at least
Postel's law seems to apply.
Here's the root cause of it failing:
    ghci> parseRequest "https://datasets.datalad.org:/dbic/QA/.git/"
    *** Exception: InvalidUrlException "https://datasets.datalad.org:/dbic/QA/.git/" "Invalid port"
So http-conduit refuses to parse it and so can't be used to download it.
Filed an issue, but I don't know if they'll want to change
http-conduit to accept a malformed url.
<https://github.com/snoyberg/http-client/issues/501>
Since network-uri is able to parse it into a URI
that has `uriPort = ":"`, git-annex could special
case handling of the empty port there, changing it to `""`
and so generating a url that http-conduit can parse.
I've implemented this fix.
"""]]
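
The workaround described in the comment above can be sketched roughly as follows. This is only an illustration, assuming the network-uri package is available; the names `dropEmptyPort` and `normalizeUrl` are hypothetical and are not taken from the git-annex source, whose actual change may be structured differently.

```haskell
import Network.URI (URI(..), URIAuth(..), parseURI, uriToString)

-- | Drop a port separator that has no port number after it.
-- network-uri keeps the leading colon in uriPort, so an empty port
-- shows up as ":" rather than "".
-- (Hypothetical helper name, for illustration only.)
dropEmptyPort :: URI -> URI
dropEmptyPort u = case uriAuthority u of
    Just auth | uriPort auth == ":" ->
        u { uriAuthority = Just auth { uriPort = "" } }
    _ -> u

-- | Parse a url with network-uri and render it again without an
-- empty port. Returns Nothing if network-uri cannot parse the input.
-- (Hypothetical helper name, for illustration only.)
normalizeUrl :: String -> Maybe String
normalizeUrl s = do
    u <- parseURI s
    return (uriToString id (dropEmptyPort u) "")
```

The string returned by `normalizeUrl` no longer has the bare colon, so it could then be handed to http-conduit's `parseRequest`. A url that has a real port number, or no port at all, passes through unchanged.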