Moving toward dropping MissingH dep.
I think I've addressed the problem identified earlier in
09a66f702d. On Windows,
absPathFrom "/tmp/repo/xxx" "y/bar" would be "/tmp/repo/xxx\\y/bar",
which then confuses relPathDirToFile. Fixed by converting to unix (git)
style paths.
Also, relPathDirToFile was splitting only on \\ on windows and not /
which broke the example in 09a66f702d of
relPathDirToFile (absPathFrom "/tmp/repo/xxx" "y/bar") "/tmp/repo/.git/annex/objects/xxx"
Now, on windows, that will yield "..\\..\\..\\.git/annex/objects/xxx"
which once converted to unix style paths is what we want.
The bug was that withFile closes the handle afterwards, but the content
of the file was not read due to laziness. Using readFile avoids it.
This commit was sponsored by Nick Daly on Patreon.
The COPYRIGHT had Utility/DirWatcher* listed as GPL, but they were
actually BSD licensed.
No idea why I put the GPL on Utility/GPG.hs file originally.
I wrote all of it, except for guilhem's small changes to it in
00fc21bfec, which seem too small to be
independently copyrightable. I'm relicencing it BSD.
findShellCommand needs a full path to a file in order to check it for a
shebang on Windows. It was being run with only the base name of the external
special remote program, which would only work when it was in the current
directory.
This is why users in
https://github.com/DanielDent/git-annex-remote-rclone/pull/10 and elsewhere
were complaining that the previous improvements to git-annex didn't make
git-remote-rclone work on Windows.
Also, reworked checkearlytermination, which while it worked, seemed
to rely on a race condition. And, improved its error messages.
This commit was sponsored by Shane-o on Patreon.
* Run curl with -S, so HTTP errors are displayed, even when
it's otherwise silent.
* When downloading in --json or --quiet mode, use curl in preference
to wget, since curl is able to display only errors to stderr, unlike
wget.
This does mean that downloadQuiet is only silent on stdout, not necessarily
on stderr, which affects a couple other calls of it. For example,
downloading the .git/config of a http remote may show an error message now,
perhaps with slightly suboptimal formatting due to other output.
This adds one extra line of output when a download is successful,
after the progress bar. I don't much like that, but wget does not provide a
way to show HTTP errors without it.
This allows using functions that generate CreateProcess and passing the
result to processTranscript', which is more flexible, and also simpler
than the old interface.
This commit was sponsored by Riku Voipio.
Refactored some common code into initDb.
This only deals with the problem when creating new databases. If a repo
got bad permissions into it, it's up to the user to deal with it.
This commit was sponsored by Ole-Morten Duesund on Patreon.
Probing for hard link support in the pid locking code is redundant since
git-annex init already probes that. But, it didn't seem worth threading
that data through; the pid locking code runs at most once per git-annex
process, and only on unusual filesystems. Optimising a single hard link
and unlink isn't worth it.
This commit was sponsored by Francois Marier on Patreon.
Wormhole pairing will start to provide an appid to wormhole on 2021-12-31.
An appid can't be provided now because Debian stable is going to ship a
older version of git-annex that does not provide an appid. Assumption is
that by 2021-12-31, this version of git-annex will be shipped in a Debian
stable release. If that turns out to not be the case, this change will need
to be cherry-picked into the git-annex in Debian stable, or its wormhole
pairing will break.
This commit was sponsored by Thomas Hochstein on Patreon.
Turns out that Data.List.Utils.split is slow and makes a lot of
allocations. Here's a much simpler single character splitter that behaves
the same (even in wacky corner cases) while running in half the time and
75% the allocations.
As well as being an optimisation, this helps move toward eliminating use of
missingh.
(Data.List.Split.splitOn is nearly as slow as Data.List.Utils.split and
allocates even more.)
I have not benchmarked the effect on git-annex, but would not be surprised
to see some parsing of eg, large streams from git commands run twice as
fast, and possibly in less memory.
This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
hSetEncoding of a closed handle segfaults.
https://ghc.haskell.org/trac/ghc/ticket/71618484c0c197 introduced the crash.
In particular, stdin may get closed (by eg, getContents) and then trying
to set its encoding will crash. We didn't need to adjust stdin's
encoding anyway, but only stderr, to work around
https://github.com/yesodweb/persistent/issues/474
Thanks to Mesar Hameed for assistance related to reproducing this bug.
Since the user does not know whether it will run su or sudo, indicate
whether the password prompt will be for root or the user's password,
when possible.
I assume that programs like gksu that can prompt for either depending on
system setup will make clear in their prompt what they're asking for.
weasel explained that apparmor limits on what files tor can read do not
apply to sockets (because they're not files). And apparently the
problems I was seeing with hidden services not being accessible had to
do with onion address propigation and not the location of the socket
file.
remotedaemon looks up the HiddenServicePort in torrc, so if it was
previously configured with the socket in /etc, that will still work.
This commit was sponsored by Denis Dzyubenko on Patreon.
This interacts with it using stdio, which is surprisingly hard.
sendFile does not currently work, due to
https://github.com/warner/magic-wormhole/issues/108
Parsing the output to find the magic code is done as robustly as
possible, and should continue to work unless wormhole radically changes
the format of its codes. Presumably it will never output something that
looks like a wormhole code before the actual wormhole code; that would
also break this. It would be better if there was a way to make
wormhole not mix the code with other output, as requested in
https://github.com/warner/magic-wormhole/issues/104
Only exchange of files/directories is supported. To exchange messages,
https://github.com/warner/magic-wormhole/issues/99 would need to be resolved.
I don't need message exchange however.
The attacker could just send a very lot of data, with no \n and it would
all be buffered in memory until the kernel killed git-annex or perhaps OOM
killed some other more valuable process.
This is a low impact security hole, only affecting communication between
local git-annex and git-annex-shell on the remote system. (With either
able to be the attacker). Only those with the right ssh key can do it. And,
there are probably lots of ways to construct git repositories that make git
use a lot of memory in various ways, which would have similar impact as
this attack.
The fix in P2P/IO.hs would have been higher impact, if it had made it to a
released version, since it would have allowed DOSing the tor hidden
service without needing to authenticate.
(The LockContent and NotifyChanges instances may not be really
exploitable; since the line is read and ignored, it probably gets read
lazily and does not end up staying buffered in memory.)
Display progress meter on send and receive from remote.
Added a new hGetMetered that can read an exact number of bytes (or
less), updating a meter as it goes.
This commit was sponsored by Andreas on Patreon.
On Debian, apparmor prevents tor from reading from most locations. And,
it silently fails if it is prevented from reading the hidden service
socket. I filed #846275 about this; violating the FHS is the least bad of a
bad set of choices until that bug is fixed.
Still a couple bugs:
* Closing the connection to the server leaves git upload-pack /
receive-pack running, which could be used to DOS.
* Sometimes the data is transferred, but it fails at the end, sometimes
with:
git-remote-tor-annex: <socket: 10>: commitBuffer: resource vanished (Broken pipe)
Must be a race condition around shutdown.
Almost working, but there's a bug in the relaying.
Also, made tor hidden service setup pick a random port, to make it harder
to port scan.
This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
A bit tricky since Proto doesn't support threads. Rather than adding
threading support to it, ended up using a callback that waits for both
data on a Handle, and incoming messages at the same time.
This commit was sponsored by Denis Dzyubenko on Patreon.
For use with tor hidden services, and perhaps other transports later.
Based on Utility.SimpleProtocol, it's a line-based protocol,
interspersed with transfers of bytestrings of a specified size.
Implementation of the local and remote sides of the protocol is done
using a free monad. This lets monadic code be included here, without
tying it to any particular way to get bytes peer-to-peer.
This adds a dependency on the haskell package "free", although that
was probably pulled in transitively from other dependencies already.
This commit was sponsored by Jeff Goeke-Smith on Patreon.
ghc 8 added backtraces on uncaught errors. This is great, but git-annex was
using error in many places for a error message targeted at the user, in
some known problem case. A backtrace only confuses such a message, so omit it.
Notably, commands like git annex drop that failed due to eg, numcopies,
used to use error, so had a backtrace.
This commit was sponsored by Ethan Aubin.
This avoids needing to bind to the right port before something else
does.
The socket is in /var/run/user/$uid/ which ought to be writable by only
that uid. At least it is on linux systems using systemd.
For Windows, may need to revisit this and use ports or something.
The first version of tor to support sockets for hidden services
was 0.2.6.3. That is not in Debian stable, but is available in
backports.
This commit was sponsored by andrea rota.
Tor unfortunately does not come out of the box configured to let hidden
services register themselves on the fly via the ControlPort.
And, changing the config to enable the ControlPort and a particular type
of auth for it may break something already using the ControlPort, or
lessen the security of the system.
So, this leaves only one option to us: Add a hidden service to the
torrc. git-annex enable-tor does so, and picks an unused high port for
tor to listen on for connections to the hidden service.
It's up to the caller to somehow pick a local port to listen on
that won't be used by something else. That may be difficult to do..
This commit was sponsored by Jochen Bartl on Patreon.
Yesod didn't used to do auth checks for that, but this may have changed.
I don't have a way to reproduce the reported problem yet, but this change
certianly won't hurt anything.
This commit was sponsored by Thom May on Patreon.
Restarting a crashing git process could result in filename encoding issues
when not in a unicode locale, as the restarted processes's handles were not
read in raw mode.
Since rawMode is always used when starting a coprocess, didn't bother
to parameterise it and just always enable it for simplicity.
This commit was sponsored by Jake Vosloo on Patreon.
gpg-agent started deleting its socket file on shutdown, and this tickled an
ugly behavior in removeDirectoryRecursive,
https://github.com/haskell/directory/issues/60
Running removeDirectoryRecursive again on exception avoids the problem.
gpg 2.1.15 (or so) seems to have added some new fields to the --with-colons
--list-secret-keys output. These include "fpr" and "grp", and come before
the "uid" line. So, the parser was giving up before it saw the name. Fix by
continuing to look for the uid line until the next "sec" line.
This commit was sponsored by Ole-Morten,Duesund on Patreon.
This gets rid of quite a lot of ugly hacks around json generation.
I doubt that any real-world json parsers can parse incomplete objects, so
while it's not as nice to need to wait for the complete object, especially
for commands like `git annex info` that take a while, it doesn't seem worth
the added complexity.
This also causes the order of fields within the json objects to be
reordered. Since any real json parser shouldn't care, the only possible
problem would be with ad-hoc parsers of the old json output.
Keeping Text.JSON use for now, because it seems a better fit for most of
the commands, which don't use very structured JSON objects, but just output
whatever fields suites them. But this lets Aeson be used when a more
structured data type is available to serialize to JSON.
Mostly the username is only used for the git committer or other display
purposes, and we can just fall back to a dummy value in these cases.
The only remaining place where an error is thrown is when starting local
pairing, which needs the username to be known.
Sadly my bug report about this is not going to get fixed it seems, so
I have to drag around a whole added module file just to deal with it.
https://github.com/haskell/directory/issues/52
It started exporting a isSymbolicLink which supports windows. But,
git-annex does no use symlinks on windows yet and this conflicts with the
function by the same name from unix-compat, so hide it.