Commit graph

1228 commits

Author SHA1 Message Date
Joey Hess
26a02cb386
display error when an invalid url is downloaded
download is documented as displaying an error when download fails, but
it didn't when the url was not valid at all. That leads to confusing
behavior.

Also, display the url with --debug
2018-09-25 13:38:20 -04:00
Joey Hess
cc82f81227
More FreeBSD build fixes.
Untested, on FreeBSD but enough to fix the listed build errors.

Seems that System.Posix.Files must have used to export this stuff and it
was split.

This commit was sponsored by Peter on Patreon.
2018-09-24 11:25:56 -04:00
Joey Hess
ceee7758a5
fix \ escaping 2018-09-22 11:33:08 -04:00
Joey Hess
d2c351f547
update windows NUL for ghc 8.6.1
This should also work with older ghc, since the path is a windows device
namespace path.
2018-09-22 11:31:55 -04:00
Joey Hess
2aae6e84af
Support newlines in filenames.
Work around git cat-file --batch's protocol not supporting newlines by
running git cat-file not batched and passing the filename as a
parameter.

Of course this is quite a lot less efficient, especially because it
currently runs it multiple times to query for different pieces of
information.

Also, it has subtly different behavior when the batch process was
started and then some changes were made, in which case the batch process
sees the old index but this workaround sees the current index. Since
that batch behavior is mostly a problem that affects the assistant and has
to be worked around in it, I think I can get away with this difference.

I don't know of any other problems with newlines in filenames, everything
else in git I can think of supports -z. And git-annex's json output
supports newlines in filenames so downstream parsers from git-annex will be ok.
git-annex commands that use --batch themselves don't support newlines
in input filenames; using --json --batch is currently a way around that
problem.

This commit was sponsored by Ewen McNeill on Patreon.
2018-09-20 13:45:44 -04:00
Yaroslav Halchenko
b976eb5353
BF(minor): missing space after "Unsupported url scheme" msg before the scheme 2018-09-18 18:19:20 -04:00
Joey Hess
b3c9c59d3d
--debug urls
When git-annex used wget and curl, --debug would show urls. So there can't
be any new security problem with doing so.

This commit was sponsored by John Pellman on Patreon.
2018-09-14 12:46:39 -04:00
Joey Hess
b18fb1e343
clean P2P protocol shutdown on EOF
Avoids "git-annex-shell: <stdin>: hGetChar: end of file"
being displayed by the test suite, due to the way it
runs git-annex-shell without using ssh.

git-annex-shell over ssh was not affected because git-annex hangs up the
ssh connection and so never sees the error message that git-annnex-shell
probably did emit.

This commit was sponsored by Ryan Newton on Patreon.
2018-09-13 10:46:37 -04:00
Joey Hess
872640549b
comment typo 2018-09-05 13:57:06 -04:00
Joey Hess
f4788f3853
clarify comment
haskell-mountpoints contains android specific code, but it's not used
when git-annex was built for linux and is running on android.
2018-09-05 11:22:27 -04:00
Joey Hess
55f8d90dee
remove Utlity.SRV, no longer used 2018-09-05 11:15:33 -04:00
Joey Hess
f54c72d2e1
Fix build on FreeBSD
This must have been broken for years..

This commit was sponsored by Jack Hill on Patreon.
2018-08-29 12:09:03 -04:00
Joey Hess
c565340adc
stop using external hash programs, since cryptonite is faster
In 2013, I wrote "Cryptohash benchmarks 90 to 101% faster than external
hashers". Re-benchmarking today, I found cryptonite's sha256 consistently
outperformed coreutils by 10% for large files. Tested 10 mb, 100 mb, 1 gb
files with both sha256 and sha512. And for smaller files, the external
process startup time swamps the hash time.

Perhaps cryptonite has improved. Or it could just do better on my
current CPU Intel(R) Pentium(R) CPU 4410Y @ 1.50GHz). Anyway, even if cryptonite
is slower in some situations, seems likely it would only be marginally slower;
it's got the same class of highly optimised C code under the hood as coreutils.
The main difference between the two sha256 implementations seems to be
how much of the inner loop they unroll..

This commit was sponsored by Henrik Riomar on Patreon.
2018-08-28 18:10:58 -04:00
Joey Hess
6a445dc086
support conditionally excluding queued files
Switched code to use a for loop to avoid a filterM that would have
doubled the memory used.

This commit was supported by the NSF-funded DataLad project.
2018-08-16 14:38:37 -04:00
Joey Hess
218c76b789
avoid unused imports warning on non-linux 2018-08-07 15:06:33 -04:00
Joey Hess
e1ab01f94d
Fix reversion in display of http 404 errors.
Switch to using http-client for large file downloads caused the reversion;
the code for displaying a 404 response was instead displaying the raw html
document, which is not useful.

This commit was sponsored by Ryan Newton on Patreon.
2018-07-31 12:15:26 -04:00
Joey Hess
50609da787
fix User-Agent reversion
Send User-Agent and any configured annex.http-headers when downloading with
http, fixes reversion introduced when switching to http-client.

This commit was sponsored by mo on Patreon.
2018-07-16 11:56:47 -04:00
Joey Hess
ac228fa723
don't import all of System.Posix.Files
This avoid a build problem when different versions of posix and
posixcompat are used. Does not normally happen as cabal prevents that,
but this is sometimes used with ghc --make which can get into that
situation.
2018-07-10 12:04:49 -04:00
Joey Hess
3dd7f450c1
fix p2p --pair
p2p --pair: Fix interception of the magic-wormhole pairing code, which
since 0.8.2 it has sent to stderr rather than stdout.

This is highly annoying because I had asked the magic wormhole developers
for a machine-readable way to get the data, and instead they changed how
the data was output, and didn't even mention this in my issue, or in the
changelog.

Seems this needs to be tested periodically to make sure it's still working.

This commit was sponsored by Ethan Aubin.
2018-07-04 15:14:03 -04:00
Joey Hess
3976b89116
fix license date
I wrote this this year
2018-06-22 10:25:53 -04:00
Joey Hess
22f49f216e
get android building the security fix
Had to update http-client and network, with follow-on dep changes.

This commit was sponsored by Brock Spratlen on Patreon.
2018-06-21 10:23:04 -04:00
Joey Hess
923578ad78
improve error message
This commit was sponsored by Jack Hill on Patreon.
2018-06-19 14:21:41 -04:00
Joey Hess
47cd8001bc
call base ManagerSetting's exception wrapper
This commit was sponsored by Henrik Riomar on Patreon.
2018-06-19 14:17:05 -04:00
Joey Hess
fc79f68404
support building on debian stable
Specifically, http-client-0.4.31

This commit was supported by the NSF-funded DataLad project.
2018-06-19 11:25:10 -04:00
Joey Hess
3c0a538335
allow ftp urls by default
They're no worse than http certianly. And, the backport of these
security fixes has to deal with wget, which supports http https and ftp
and has no way to turn off individual schemes, so this will make that
easier.
2018-06-18 15:37:17 -04:00
Joey Hess
cc08135e65
prevent using local http proxies per annex.security.allowed-http-addresses
A local http proxy would bypass the security configuration. So,
the security configuration has to be applied when choosing whether to
use the proxy.

While http rebinding attacks against the dns lookup of the proxy IP
address seem very unlikely, this implementation does prevent them, since
it resolves the IP address once, checks it, and then reconfigures
http-client's proxy using the resolved address.

This commit was sponsored by Ole-Morten Duesund on Patreon.
2018-06-18 13:32:20 -04:00
Joey Hess
b54b2cdc0e
prevent http connections to localhost and private ips by default
Security fix!

* git-annex will refuse to download content from http servers on
  localhost, or any private IP addresses, to prevent accidental
  exposure of internal data. This can be overridden with the
  annex.security.allowed-http-addresses setting.
* Since curl's interface does not have a way to prevent it from accessing
  localhost or private IP addresses, curl defaults to not being used
  for url downloads, even if annex.web-options enabled it before.
  Only when annex.security.allowed-http-addresses=all will curl be used.

Since S3 and WebDav use the Manager, the same policies apply to them too.

youtube-dl is not handled yet, and a http proxy configuration can bypass
these checks too. Those cases are still TBD.

This commit was sponsored by Jeff Goeke-Smith on Patreon.
2018-06-17 13:30:28 -04:00
Joey Hess
43bf219a3c
added makeAddressMatcher
Would be nice to add CIDR notation to this, but this is the minimal
thing needed for the security fix.

This commit was sponsored by Ewen McNeill on Patreon.
2018-06-17 13:29:15 -04:00
Joey Hess
014a3fef34
added isPrivateAddress and isLoopbackAddress
For use in a security boundary enforcement.

Based on https://en.wikipedia.org/wiki/Reserved_IP_addresses

Including supporting IPv4 addresses embedded in IPv6 addresses. Because
while RFC6052 3.1 says "Address translators MUST NOT translate packets
in which an address is composed of the Well-Known Prefix and a non-
global IPv4 address; they MUST drop these packets", I don't want to
trust that implementations get that right when enforcing a security
boundary.

This commit was sponsored by John Pellman on Patreon.
2018-06-17 13:28:25 -04:00
Joey Hess
40e8358284
add Utility.HttpManagerRestricted
This is a clean way to add IP address restrictions to http-client, and
any library using it.
See https://github.com/snoyberg/http-client/issues/354#issuecomment-397830259

Some code from http-client and http-client-tls was copied in and
modified. Credited its author accordingly, and used the same MIT license.

The restrictions don't apply to http proxies. If using http proxies is a
problem, http-client already has a way to disable them.
SOCKS support is not included. As far as I can tell, http-client-tls
does not support SOCKS by default, and so git-annex never has.

The additional dependencies are free; git-annex already transitively
depended on them via http-conduit.

This commit was sponsored by Eric Drechsel on Patreon.
2018-06-16 18:44:13 -04:00
Joey Hess
28720c795f
limit url downloads to whitelisted schemes
Security fix! Allowing any schemes, particularly file: and
possibly others like scp: allowed file exfiltration by anyone who had
write access to the git repository, since they could add an annexed file
using such an url, or using an url that redirected to such an url,
and wait for the victim to get it into their repository and send them a copy.

* Added annex.security.allowed-url-schemes setting, which defaults
  to only allowing http and https URLs. Note especially that file:/
  is no longer enabled by default.

* Removed annex.web-download-command, since its interface does not allow
  supporting annex.security.allowed-url-schemes across redirects.
  If you used this setting, you may want to instead use annex.web-options
  to pass options to curl.

With annex.web-download-command removed, nearly all url accesses in
git-annex are made via Utility.Url via http-client or curl. http-client
only supports http and https, so no problem there.
(Disabling one and not the other is not implemented.)

Used curl --proto to limit the allowed url schemes.

Note that this will cause git annex fsck --from web to mark files using
a disallowed url scheme as not being present in the web. That seems
acceptable; fsck --from web also does that when a web server is not available.

youtube-dl already disabled file: itself (probably for similar
reasons). The scheme check was also added to youtube-dl urls for
completeness, although that check won't catch any redirects it might
follow. But youtube-dl goes off and does its own thing with other
protocols anyway, so that's fine.

Special remotes that support other domain-specific url schemes are not
affected by this change. In the bittorrent remote, aria2c can still
download magnet: links. The download of the .torrent file is
otherwise now limited by annex.security.allowed-url-schemes.

This does not address any external special remotes that might download
an url themselves. Current thinking is all external special remotes will
need to be audited for this problem, although many of them will use
http libraries that only support http and not curl's menagarie.

The related problem of accessing private localhost and LAN urls is not
addressed by this commit.

This commit was sponsored by Brett Eisenberg on Patreon.
2018-06-16 11:57:50 -04:00
Joey Hess
caaedb2993
fix http-client gzip decompression bug
Prevent haskell http-client from decompressing gzip files, so downloads of
such files works the same as it used to with wget and curl.

Explicitly setting accept-encoding to "identity" is probably not needed,
but that's what wget sends (curl does not send the header), and since
http-client is trying to be excessively smart, it seems we need to set
hAcceptEncoding to something to prevent it from inserting its own,
and this seems better than some hack like "".

This commit was sponsored by Ole-Morten Duesund on Patreon.
2018-05-21 15:10:25 -04:00
Joey Hess
6a63920732
fix build 2018-05-21 11:00:23 -04:00
Joey Hess
5204e1dd9d
Workaround for bug in an old version of cryptonite that broke https downloads, by using curl for downloads when git-annex is built with it.
This commit was supported by the NSF-funded DataLad project.
2018-05-20 14:12:37 -04:00
Joey Hess
86958fda5d
fix build with old http-client 2018-05-10 00:22:23 -04:00
Joey Hess
db720f6a9c
Display error message when http download fails.
* Display error message when http download fails.

  There's nothing in the http-client library to nicely format a http
  exception, so in some cases it has to fall back to using show on it.
  Seems better than just saying "it failed" or only showing the http
  status code.

* Avoid forward retry when 0 bytes were received.

  forwardRetry was comparing Nothing to Just 0, and so thought there had
  been progress made when 0 bytes were received.

This commit was supported by the NSF-funded DataLad project.
2018-05-08 16:11:45 -04:00
Joey Hess
7dc28dc705
Support building with hinotify-0.3.10.
Kept backwards compat with old versions via a shim.

This commit was sponsored by mo on Patreon.
2018-05-08 14:43:06 -04:00
Joey Hess
2948f6d916
avoid uname -o on !linux and catch any exception from it
Fix bug in last release that prevented the webapp opening on non-Linux systems.

This commit was sponsored by Jake Vosloo on Patreon.
2018-05-08 14:06:19 -04:00
Joey Hess
a8c91ce69a
add streamDirectoryContents 2018-04-26 13:38:36 -04:00
Joey Hess
f5df6244f3
deal with getMounts crashing on android 2018-04-25 17:42:27 -04:00
Joey Hess
096bb0aa21
improve comment 2018-04-25 16:57:57 -04:00
Joey Hess
6e862f47dd
deal with uname -o newline 2018-04-25 16:53:17 -04:00
Joey Hess
9807e5bead
fix webapp opening in termux
Open real url not html shim since android and file:// urls is a nasty
kettle of fish.

This commit was sponsored by John Pellman on Patreon.
2018-04-25 14:38:42 -04:00
Joey Hess
3c6e60dc69
fix build with old http-conduit 2018-04-24 21:23:40 -04:00
Joey Hess
526243d6f5
catch exceptions from getEffectiveUserID
This fixes a crash when using the linux_standalone build in termux on
android.

This commit was sponsored by Jeff Goeke-Smith on Patreon.
2018-04-24 20:10:10 -04:00
Joey Hess
aebf9e6dd5
Fix build with yesod 1.6.
Also avoid some depreaction warnings.
2018-04-22 13:56:35 -04:00
Joey Hess
256d8f07e8
avoid insertWith' depreaction warning
Switch to Data.Map.Strict everywhere that used it.

There are still lots of lazy maps in git-annex. I think switching these
is safe. The risk is that there might be a map that is used in a way
that relies on the values not being evaluated to WHNF, and switching to
strict might result in bad performance or memory use. So, I have not
switched everything.
2018-04-22 13:28:31 -04:00
Joey Hess
558a0a9328
deal with conduit 1.3 change
I don't know if this will build with older conduit, it may need an
ifdef.
2018-04-22 13:14:55 -04:00
Joey Hess
89e1a05a8f
Fix mangling of --json output of utf-8 characters when not running in a utf-8 locale
As long as all code imports Utility.Aeson rather than Data.Aeson,
and no Strings that may contain utf-8 characters are used for eg, object
keys via T.pack, this is guaranteed to fix the problem everywhere that
git-annex generates json.

It's kind of annoying to need to wrap ToJSON with a ToJSON', especially
since every data type that has a ToJSON instance has to be ported over.
However, that only took 50 lines of code, which is worth it to ensure full
coverage. I initially tried an alternative approach of a newtype FileEncoded,
which had to be used everywhere a String was fed into aeson, and chasing
down all the sites would have been far too hard. Did consider creating an
intentionally overlapping instance ToJSON String, and letting ghc fail
to build anything that passed in a String, but am not sure that wouldn't
pollute some library that git-annex depends on that happens to use ToJSON
String internally.

This commit was supported by the NSF-funded DataLad project.
2018-04-16 16:21:21 -04:00
Joey Hess
e5a404ebe2
fix build with old version of http-client 2018-04-09 13:04:23 -04:00