ghc 8 added backtraces on uncaught errors. This is great, but git-annex was
using error in many places for a error message targeted at the user, in
some known problem case. A backtrace only confuses such a message, so omit it.
Notably, commands like git annex drop that failed due to eg, numcopies,
used to use error, so had a backtrace.
This commit was sponsored by Ethan Aubin.
This makes merging a remote into a freshly created direct mode repository
work the same as it works in indirect mode.
The git-annex branches would get merged in any case by a sync,
since that doesn't use git merge.
This might need to be revisited later to better mirror git's behavior.
git-annex.cabal: Loosen bounds on persistent to allow 2.5, which on Debian
has been patched to work with esqueleto. This may break cabal's resolver on
non-Debian systems; if so, either use stack to build, or run cabal with
--constraint='persistent ==2.2.4.1' Hopefully this mess with esqueleto will
be resolved soon.
https://github.com/prowdsponsor/esqueleto/issues/137
Yesod didn't used to do auth checks for that, but this may have changed.
I don't have a way to reproduce the reported problem yet, but this change
certianly won't hurt anything.
This commit was sponsored by Thom May on Patreon.
ghc 8.0.2 may make this unncessary, but it's not in a stackage version yet,
so put in a workaround.
Note that the linux builds already delete the RPATHs for similar reasons.
This commit was sponsored by Josh Taylor on Patreon.
* S3: Support the special case endpoint needed for the cn-north-1 region.
* Webapp: Don't list the Frankfurt region, as this (and some other new
regions) need V4 authorization which the aws library does not yet use.
This commit was sponsored by Nick Daly on Patreon.
When used with an older version of ssh, any ServerAliveInterval in
~/.ssh/config will be overridden by .git/annex/ssh.config.
This commit was sponsored by Josh Taylor on Patreon.
Restarting a crashing git process could result in filename encoding issues
when not in a unicode locale, as the restarted processes's handles were not
read in raw mode.
Since rawMode is always used when starting a coprocess, didn't bother
to parameterise it and just always enable it for simplicity.
This commit was sponsored by Jake Vosloo on Patreon.
The Makefile was putting them in git-annex.linux/i18n/i18n, and so I18NPATH
did not point to the files. I think that on close enough to Debian systems,
localedef then fell back to using the system-wide locale files, while on
other systems it would fail to generate locales.
So that stalled transfers will be noticed within about 3 minutes,
even if TCPKeepAlive is disabled or doesn't work.
Rather than setting with -o, use -F with another config file,
so that any settings in ~/.ssh/config or /etc/ssh/ssh_config overrides this.
If a transfer fails for some reason, but some data managed to be sent, the
transfer will be retried. (The assistant already did this.)
Possible impacts:
* More ssh prompts if ssh needs to prompt for a password to connect to a
host, or is prompting about some other problem like a ssh key mismatch.
* More data transfer due to retrying, epecially when a remote does not
support resuming a transfer.
In the worst case, a lot of data will be transferred but it fails before
the end, and then all that data gets transferred again plus one byte more;
repeat until it manages to get the whole file.
gpg-agent started deleting its socket file on shutdown, and this tickled an
ugly behavior in removeDirectoryRecursive,
https://github.com/haskell/directory/issues/60
Running removeDirectoryRecursive again on exception avoids the problem.
Closes https://github.com/datalad/datalad/issues/1020
The use of runWriter in scanUnlockedFiles broke due to this change;
it failed with blocked indefinitely in mvar, because the database write
handle was taken while linkFromAnnex needed to also write to it (to update
the inode cache). So, switched to using a separate runWriter for each call
to addAssociatedFileFast. A little less efficient, but not greatly; the
writes should all still be cached.
In the case where the pointer file is in place, and not the content
of the object, lock's performNew was called with filemodified=True,
which caused it to try to repopulate the object from an unmodified
associated file, of which there were none. So, the content of the object
got thrown away incorrectly. This was the cause (although not the root
cause) of data loss in https://github.com/datalad/datalad/issues/1020
The same problem could also occur when the work tree file is modified,
but the object is not, and lock is called with --force. Added a test case
for this, since it's excercising the same code path and is easier to set up
than the problem above.
Note that this only occurred when the keys database did not have an inode
cache recorded for the annex object. Normally, the annex object would be in
there, but there are of course circumstances where the inode cache is out
of sync with reality, since it's only a cache.
Fixed by checking if the object is unmodified; if so we don't need to
try to repopulate it. This does add an additional checksum to the unlock
path, but it's already checksumming the worktree file in another case,
so it doesn't slow it down overall.
Further investigation found a similar problem occurred when smudge --clean
is called on a file and the inode cache is not populated. cleanOldKeys
deleted the unmodified old object file in this case. This was also
fixed by checking if the object is unmodified.
In general, use of getInodeCaches and sameInodeCache is potentially
dangerous if the inode cache has not gotten populated for some reason.
Better to use isUnmodified. I breifly auited other places that check the
inode cache, and did not see any immediate problems, but it would be easy
to miss this kind of problem.
The modification flag was not being set when making modifications deep
in a tree, so parent trees were not updated to contain the modified tree.
Seems to have exposed another bug where the wrong filename gets grafted in.
This commit was sponsored by Brock Spratlen on Patreon.
An easy change now that supportedVersions is a list. Since v3 and v5 are
identical other than version number, just add v3 to the list.
This commit was sponsored by andrea rota.
Fixes a bug introduced with v6 mode that I didn't notice until now.
Probably not many v3 repos left out there, and upgrading them to v6 mode
is not disastrous, only a little premature.
This commit was sponsored by Riku Voipio
.. and have to be checked to see if they are a pointed to an annexed file.
Cases where such memory use could occur included, but were not limited to:
- git commit -a of a large unlocked file (in v5 mode)
- git-annex adjust when a large file was checked into git directly
Generally, any use of catKey was a potential problem.
Fix by using git cat-file --batch-check to check size before catting.
This adds another git batch process, which is included in the CatFileHandle
for simplicity.
There could be performance impact, anywhere catKey is used. Particularly
likely to affect adjusted branch generation speed, and operations on
unlocked files in v6 mode. Hopefully since the --batch-check and
--batch read the same data, disk buffering will avoid most overhead.
Leaving only the overhead of talking to the process over the pipe and
whatever computation --batch-check needs to do.
This commit was sponsored by Bruno BEAUFILS on Patreon.
Currently only done for utf-8 locales because the charset can easily be
told for those. Other locales don't include the charset in their name.
The locale definition is generated under git-annex.linux/locales.
So, this only works if the user can write there.
If locale generation fails for any reason, it's silently skipped.
The git-annex-standalone.deb installs the bundle under /usr, so this locale
generation won't work for non-root users.
Version mismatches between the system locale-archive and the glibc in the
bundle have been observed to cause git crashes.
Unfortunately, this causes locales to not be used in the linux standalone
bundle, as was the case until version 6.20160419.
glibc hardcodes the path to /usr/lib/locale/locale-archive and does not
let an environment variable cause a different locale-archive file to be used.
The only other option to include locales in the bundle would be to include
exploded locale definition directories in the bundle for a number of
locales, generated by localedef. But these take at least 300 kb per locale,
and there are a great many locales; it would be hundreds of megabytes to
include them all.
(Hmm, we could include localdef in the bundle, and check LANG in runshell
and compile the locale directories on the fly. This would need
/usr/share/i18n/ and /usr/lib/locale-archive to be included in the bundle.
It's.. doable.)
I know this is going to once again cause users of the bundle to complain
that eg, ls doesn't show their unicode filenames right. Better than strange
crashes though.
Multiple external special remote processes for the same remote will be
started as needed when using -J.
This should not beak any existing external special remotes, because running
multiple git-annex commands at the same time could already start multiple
processes for the same external special remotes.
This sped up git annex find --not --in web from 6.64s to 5.69s.
The optimised parser is probably more like 50% faster than the general one
it replaced.
Speeds up commands like "git-annex find --in remote" by over 50%.
Profiling showed that adjustGitEnv was 21% of the time and 37% of the
allocations of that command. It copied the environment each time with
getEnvironment.
The only repeated use of adjustGitEnv is in withIndexFile, which tends to
be run at least once per file. So, it was optimised by keeping a cache of
the environment, which can be reused.
There could be other better ways to optimise this. Maybe get the while
environment once at startup. But, then it would have to be serialized back
out each time running a child process, so I doubt that would be a net win.
It might be better to cache a version of the environment that is
pre-modified to use .git-annex/index. But, profiling doesn't show that
modifying the enviroment is taking any significant time.
key2file and file2key were top cost centers according to profiling.
The repeated use of replace was not efficient. This new approach is quite a
lot more efficient.
This commit was sponsored by Denis Dzyubenko on Patreon.
This reverts commit e181603103.
This broke the i386ancient autobuilder due to its use of
--flag git-annex:XMPP --flag=git-annex:dbus
-- Failure when adding dependencies:
fdo-notify: needed ((>=0.3)), stack configuration has no specified version
(latest applicable is 0.3.1)
gnutls: needed ((>=0.1.4)), stack configuration has no specified version
(latest applicable is 0.2)
network-protocol-xmpp: needed (-any), stack configuration has no specified
version (latest applicable is 0.4.8)
OSX autobuilder also seems hosed by it, so too soon.
De-revert later..
* sync: Previously, when run in a branch with a slash in its name,
such as "foo/bar", the sync branch was "synced/bar". That conflicted
with the sync branch used for branch "bar", so has been changed to
"synced/foo/bar".
* adjust: Previously, when adjusting a branch with a slash in its name,
such as "foo/bar", the adjusted branch was "adjusted/bar(unlocked)".
That conflicted with the adjusted branch used for branch "bar",
so has been changed to "adjusted/foo/bar(unlocked)"
* Also, running sync in an adjusted branch did not correctly sync
changes back to the parent branch when it had a slash in its name.
This bug has been fixed.
Eliminate use of Git.Ref.under and Git.Ref.basename; using
Git.Ref.underBase and Git.Ref.base make everything handle deep branches
correctly.
Probably noone was adjusting deep branches, and v6 is still experimental
anyway, so I'm not going to worry about the mess that was left by that bug.
In the case of git-annex sync, using a fixed git-annex with an old unfixed
one will mean they use different sync branches for a deep branch, and so
they may stop syncing until the old one is upgraded. However, that's only
a problem when syncing between repositories without going via a central
bare repository. Added a warning about this to the CHANGELOG, but it's
probably not going to affect many people at all.
This commit was sponsored by Riku Voipio.
gpg 2.1.15 (or so) seems to have added some new fields to the --with-colons
--list-secret-keys output. These include "fpr" and "grp", and come before
the "uid" line. So, the parser was giving up before it saw the name. Fix by
continuing to look for the uid line until the next "sec" line.
This commit was sponsored by Ole-Morten,Duesund on Patreon.