webdav: Checking if a non-existent file is present on Box.com triggered a
bug in its webdav support that generates an infinite series of redirects.
It seems to redirect foo to foo/ to foo/index.php to
foo/index.php/index.php ... Why a webdav endpoint would behave this way
who knows.
Deal with such problems by assuming such behavior means the file is not
present.
Can't simply disable following redirects, because the webdav endpoint could
legitimately be redirected to a new endpoint. So, when this happens
10 redirects have to be followed, before it gives up and assumes this means
the file does not exist.
This commit was supported by the NSF-funded DataLad project.
This avoids needing to deal with the complexity of partially transferred
files in the export. We'd not be able to resume uploading to such a file
anyway, so just avoid them.
The implementation in Remote.Directory is not completely ideal, because
it could leave the temp file hanging around in the export directory.
This only happens if it's killed with -9, or there's a power failure;
normally viaTmp cleans up after itself, even when interrupted. I could
not see a better way to do it though, since the export directory might
be the root of a filesystem.
Also some design thoughts on resuming, which depend on storeExport being
atomic.
This commit was sponsored by Fernando Jimenez on Partreon.
Security fix: Disallow hostname starting with a dash, which would get
passed to ssh and be treated an option. This could be used by an attacker
who provides a crafted ssh url (for eg a git remote) to execute arbitrary
code via ssh -oProxyCommand.
No CVE has yet been assigned for this hole.
The same class of security hole recently affected git itself,
CVE-2017-1000117.
Method: Identified all places where ssh is run, by git grep '"ssh"'
Converted them all to use a SshHost, if they did not already, for
specifying the hostname.
SshHost was made a data type with a smart constructor, which rejects
hostnames starting with '-'.
Note that git-annex already contains extensive use of Utility.SafeCommand,
which fixes a similar class of problem where a filename starting with a
dash gets passed to a program which treats it as an option.
This commit was sponsored by Jochen Bartl on Patreon.
By forking a worker process and only deleting the test directory once it exits.
This way, if a test leaves files open, they'll get closed when the worker
exits, so avoiding failure to delete open files on Windows, and failure to
delete directories due to NFS lock files.
If a test leaves a git worker process running, the closed pipes should
cause the worker to exit too, also avoiding the problem there. The 10
second sleep ought to give plenty of time for such worker processes to
exit, although this is of course a race.
Finally, even if test directory fails to be deleted still,
it won't appear as if the last test in the test suite failed; the error
will be displayed at the very end.
This commit was supported by the NSF-funded DataLad project.
QuickCheck 2.10 found a counterexample eg "\929184" broke the property.
As far as I can tell, Git.Filename is matching how git handles encoding
of strange high unicode characters in filenames for display. Git does
not display high unicode characters, and instead displays the C-style
escaped form of each byte. This is ambiguous, but since git is not
unicode aware, it doesn't need to roundtrip parse it.
So, making Git.FileName's roundtrip test only chars < 256 seems fine.
Utility.Format.format uses encode_c, in order to mimic git, so that's
ok.
Utility.Format.gen uses decode_c, but only so that stuff like "\n"
in the format string is handled. If the format string contains C-style
octal escapes, they will be converted to ascii characters, and not
combined into unicode characters, but that should not be a problem.
If the user wants unicode characters, they can include them in the
format string, without escaping them.
Finally, decode_c is used by Utility.Gpg.secretKeys, because gpg
--with-colons hex-escapes some characters in particular ':' and '\\'.
gpg passes unicode through, so this use of decode_c is not a problem.
This commit was sponsored by Henrik Riomar on Patreon.
QuickCheck added an Arbitrary instance for CTime aka EpochTime. However,
while git-annex's instance disallowed times before the epoch, QuickCheck's
does not. So, rather than using its instance, convert from an Integer.
This commit was sponsored by Thomas Hochstein on Patreon.
Don't trust OSX FSEvents's eventFlagItemModified to be called when the last
writer of a file closes it; apparently that sometimes does not happen,
which prevented files from being quickly added.
This commit was sponsored by John Peloquin on Patreon.
orElse is great, but was not the right thing to use here because
waitTakeLock could retry for other reasons than the lock being held,
which made tryTakeLock fail when it shouldn't.
Instead, move the code to tryTakeLock and implement waitTakeLock using
tryTakeLock and retry.
(Also, in runTransfer, when checkSaneLock fails, dropLock to avoid leaking a
lock handle.)
This commit was supported by the NSF-funded DataLad project.
Removed dependency on MissingH, instead depending on the split
library.
After laying groundwork for this since 2015, it
was mostly straightforward. Added Utility.Tuple and
Utility.Split. Eyeballed System.Path.WildMatch while implementing
the same thing.
Since MissingH's progress meter display was being used, I re-implemented
my own. Bonus: Now progress is displayed for transfers of files of
unknown size.
This commit was sponsored by Shane-o on Patreon.
Cryptonite is faster and allocates less, and I want to get rid of
MissingH use.
Note that the new dependency on memory is free; it's a dependency of
cryptonite.
This commit was supported by the NSF-funded DataLad project.
Moving toward dropping MissingH dep.
I think I've addressed the problem identified earlier in
09a66f702d. On Windows,
absPathFrom "/tmp/repo/xxx" "y/bar" would be "/tmp/repo/xxx\\y/bar",
which then confuses relPathDirToFile. Fixed by converting to unix (git)
style paths.
Also, relPathDirToFile was splitting only on \\ on windows and not /
which broke the example in 09a66f702d of
relPathDirToFile (absPathFrom "/tmp/repo/xxx" "y/bar") "/tmp/repo/.git/annex/objects/xxx"
Now, on windows, that will yield "..\\..\\..\\.git/annex/objects/xxx"
which once converted to unix style paths is what we want.
The bug was that withFile closes the handle afterwards, but the content
of the file was not read due to laziness. Using readFile avoids it.
This commit was sponsored by Nick Daly on Patreon.
The COPYRIGHT had Utility/DirWatcher* listed as GPL, but they were
actually BSD licensed.
No idea why I put the GPL on Utility/GPG.hs file originally.
I wrote all of it, except for guilhem's small changes to it in
00fc21bfec, which seem too small to be
independently copyrightable. I'm relicencing it BSD.
findShellCommand needs a full path to a file in order to check it for a
shebang on Windows. It was being run with only the base name of the external
special remote program, which would only work when it was in the current
directory.
This is why users in
https://github.com/DanielDent/git-annex-remote-rclone/pull/10 and elsewhere
were complaining that the previous improvements to git-annex didn't make
git-remote-rclone work on Windows.
Also, reworked checkearlytermination, which while it worked, seemed
to rely on a race condition. And, improved its error messages.
This commit was sponsored by Shane-o on Patreon.
* Run curl with -S, so HTTP errors are displayed, even when
it's otherwise silent.
* When downloading in --json or --quiet mode, use curl in preference
to wget, since curl is able to display only errors to stderr, unlike
wget.
This does mean that downloadQuiet is only silent on stdout, not necessarily
on stderr, which affects a couple other calls of it. For example,
downloading the .git/config of a http remote may show an error message now,
perhaps with slightly suboptimal formatting due to other output.
This adds one extra line of output when a download is successful,
after the progress bar. I don't much like that, but wget does not provide a
way to show HTTP errors without it.
This allows using functions that generate CreateProcess and passing the
result to processTranscript', which is more flexible, and also simpler
than the old interface.
This commit was sponsored by Riku Voipio.
Refactored some common code into initDb.
This only deals with the problem when creating new databases. If a repo
got bad permissions into it, it's up to the user to deal with it.
This commit was sponsored by Ole-Morten Duesund on Patreon.
Probing for hard link support in the pid locking code is redundant since
git-annex init already probes that. But, it didn't seem worth threading
that data through; the pid locking code runs at most once per git-annex
process, and only on unusual filesystems. Optimising a single hard link
and unlink isn't worth it.
This commit was sponsored by Francois Marier on Patreon.
Wormhole pairing will start to provide an appid to wormhole on 2021-12-31.
An appid can't be provided now because Debian stable is going to ship a
older version of git-annex that does not provide an appid. Assumption is
that by 2021-12-31, this version of git-annex will be shipped in a Debian
stable release. If that turns out to not be the case, this change will need
to be cherry-picked into the git-annex in Debian stable, or its wormhole
pairing will break.
This commit was sponsored by Thomas Hochstein on Patreon.
Turns out that Data.List.Utils.split is slow and makes a lot of
allocations. Here's a much simpler single character splitter that behaves
the same (even in wacky corner cases) while running in half the time and
75% the allocations.
As well as being an optimisation, this helps move toward eliminating use of
missingh.
(Data.List.Split.splitOn is nearly as slow as Data.List.Utils.split and
allocates even more.)
I have not benchmarked the effect on git-annex, but would not be surprised
to see some parsing of eg, large streams from git commands run twice as
fast, and possibly in less memory.
This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
hSetEncoding of a closed handle segfaults.
https://ghc.haskell.org/trac/ghc/ticket/71618484c0c197 introduced the crash.
In particular, stdin may get closed (by eg, getContents) and then trying
to set its encoding will crash. We didn't need to adjust stdin's
encoding anyway, but only stderr, to work around
https://github.com/yesodweb/persistent/issues/474
Thanks to Mesar Hameed for assistance related to reproducing this bug.
Since the user does not know whether it will run su or sudo, indicate
whether the password prompt will be for root or the user's password,
when possible.
I assume that programs like gksu that can prompt for either depending on
system setup will make clear in their prompt what they're asking for.