Added a way to use S3 Signature Version 4. Some S3 services seem to
require v4, while others may only support v2, which remains the default.
I'm also not sure if v4 works correctly in all cases; there is this
upstream bug report: https://github.com/aristidb/aws/issues/262
I've only tested it against the default S3 endpoint.
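For illustration, a sketch of how this might be enabled when setting up an
S3 special remote, assuming the option is spelled signature=v4 (the remote
name and bucket here are hypothetical):

    git annex initremote mys3 type=S3 encryption=none \
        bucket=mybucket signature=v4

An existing remote could presumably get the same parameter via enableremote.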
37b42e72e7 made it catch exceptions, but assumed they were unlikely to be
useful to display, which may be right when a git command fails, but not in
the case yoh found.
Now the warning gets displayed, which is better than an arcane git error.
The warning is still kind of ugly, especially when the pull later in the
sync will clear up what it warns about. But, this is an unusual situation
not likely to happen, and if there is no remote to pull from, the warning
message is needed or the sync will seem to succeed despite not merging the
synced master branch.
It would still be better if it could merge the synced master branch in this
situation. Making an empty commit to master to do it seems wrong, and
otherwise it would need a whole separate code path, which would bypass using
git merge in favor of, say, setting master to the synced branch. That would
bypass git configs like (arguably) merge.ff and (certainly)
merge.verifySignatures. So I don't want to do that.
* Display a warning message when a remote uses a protocol, such as
git://, that git-annex does not support. Silently skipping such a
remote was confusing behavior.
It sets annex-ignore, so the warning is only displayed once (see the
example below).
* Also display a warning message when a remote, without a known uuid,
is located in a directory that does not currently exist, to avoid
silently skipping such a remote.
This is a bit more debatable, since git-annex get will say to try making
the repository available. And since it does not set annex-ignore, the
warning will be displayed repeatedly. It's also an extreme edge case;
I don't think I've ever seen it happen in real life.
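For the first warning, the annex-ignore that gets set is the standard
per-remote git config; a sketch, with a hypothetical remote name:

    # what git-annex has effectively done after warning once:
    git config remote.pub.annex-ignore true
    # to make git-annex consider (and warn about) the remote again:
    git config --unset remote.pub.annex-ignore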
aeca7c2207 exposed this problem, but it
was never a good idea to have a series of test cases, some of which depend on
prior ones, while throwing away annex state after each.
Due to eg, too long a path to the agent socket, caused by running gpg in a
container where /run is not mounted, and/or some other gpg behavior like
unnecessarily making relative paths to its home directory absolute.
When the required content is set to "groupwanted", use whatever expression
has been set in groupwanted as the required content of the repo, similar to
how setting required content to "standard" already worked.
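A sketch of the resulting usage (the group name and expression are
hypothetical):

    git annex group . archive
    git annex groupwanted archive "not copies=archive:2"
    # the repo now requires whatever the groupwanted expression says:
    git annex required . groupwanted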
Do not sync with a faster remote that was not specified.
That old behavior was only documented in the changelog, and was certainly
surprising. It also meant that adding --fast made it slower.
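So now something like this syncs only with origin, even when some other
remote has a lower annex-cost:

    git annex sync --fast origin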
readonly=true is used to make an external special remote that does not
need the external program to be installed. It was stored in the
remote.log by default, and so every time it was specified in an
enableremote or initremote, whatever value was used became the new
default for subsequent enableremotes of that remote.
That was surprising, and I consider it to be a bug.
It does not make much sense to pass it to initremote because then how
would you populate that remote with anything? You would have to
enableremote elsewhere, and store content there. I'm assuming nobody
used it that way.
Someone might rely on passing it to enableremote once, and then that
being inherited in other clones. But that is not how it's documented to
be used. It is barely documented in git-annex at all, only in the
external special remote protocol, and the documentation there says to
"Document that this external special remote can be used in readonly
mode." (by the user of it passing readonly=true to enableremote). The
one external special remote that I know of that does document that is
<https://github.com/bgilbert/gcsannex> (the one that motivated adding
it). That one's docs do say to pass it to enableremote.
So, it seemed safe to make this behavior change. If someone was in fact
relying on one of those behaviors, all their current repos will still
work as they configured them (although they will need to deal
with the related change in 9f3c2dfeda).
In new clones, they will find enableremote fails, complaining the
external program is not in path. An easy enough problem to recover from.
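For reference, the documented usage that still works, now per clone, with a
hypothetical remote name:

    # in a clone that does not have the external program installed:
    git annex enableremote gcs readonly=true
    git annex get --from gcs somefile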
get --from, move --from: When used with a local git remote, these used to
silently skip files that the location log thought were present on the
remote, when the remote actually no longer contained them. Since that
behavior could be surprising, now instead display a warning.
I got very confused when I encountered this behavior, since it was silently
skipping a file I needed that whereis said was on the remote.
get without --from already displayed an "unable to access these remotes"
message, which, while a bit misleading (the remote is likely accessible, it
just doesn't contain the file), at least indicated that something went
wrong.
Having get --from display a warning makes it consistent with get
without --from, so seems certainly ok. There might be situations where
move --from is used on, eg, a whole directory, and the user only wants to
move whatever is present in the remote, and is perfectly ok with files
that are not present being skipped. So I'm less sure about the new warning
being ok there. OTOH, only local git remotes avoided displaying a warning
in that case, so this just brings them into line with other remotes.
(Also note that this makes it a little bit faster when dealing with a lot of
files, since it avoids a redundant stat of the file.)
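To illustrate the scenario (remote and file names hypothetical):

    git annex whereis somefile   # location log says it's on usbdrive
    git annex get --from usbdrive somefile
    # if usbdrive no longer contains the content, this used to be
    # silently skipped; now a warning is displayed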
Avoid running a large number of git cat-file child processes when run with
a large -J value.
This implementation takes care to avoid adding any overhead to git-annex
when run without -J. When run with -J, there is a small bit of added
overhead, to manipulate the resource pool. That optimisation added a
fair bit of complexity.
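For example, a run like this now draws git cat-file handles from a bounded
pool shared between worker threads, rather than spawning a set per thread
(the -J value is arbitrary):

    git annex get -J8 .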
The journal read optimisation in aeca7c220 later got fixed in eedd73b84
to stage and commit any files that were left in the journal by a
previous git-annex run. That's necessary for the optimisation to work
correctly. But it also meant that alwayscommit=false started committing
the previous git-annex process's journalled changes, which defeated the
purpose of the config setting entirely.
So, disable the optimisation when alwayscommit=false, leaving the
files in the journal and not committing them. See my comments on the bug
report for why this seemed the best approach.
Also fixes a problem when annex.merge-annex-branches=false and there
are changes in the journal. That config indirectly prevents committing
the journal. (Which seems a bit odd given its name, but it always has.)
So, when there were changes in the journal, perhaps left there due to
alwayscommit=false being set before, the optimisation would prevent
git-annex from reading the journal files, and it would operate with
out-of-date information.
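A sketch of the intended alwayscommit=false workflow, which this change
preserves (I believe git annex merge is one way to commit the journal
manually):

    # updates are journalled, not committed to the git-annex branch:
    git -c annex.alwayscommit=false annex add somefile
    # later, when ready, commit the journalled changes:
    git annex merge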
Considered using the system tmp dir rather than putting it inside .t/,
but then if TEMP were set to a long path, that would be a problem.
Relative path seems the best approach, and will always be nice and
short.
The only downside is that, if git-annex somehow changed the cwd
while running, it would break. But git-annex does not do that, and
should never do that.
Upgrade other repos than the current one by running git-annex upgrade
inside them, which avoids problems with upgrade code making assumptions
that the cwd will be inside the repo being upgraded.
In particular, this fixes a problem where upgrading a v7 repo to v8 caused
an ugly git error message.
I actually could not find a way to make Upgrade.V7 work properly
without changing directory to the remote. Once I got git ls-files to work,
the git cat-file failed because :path can only be used in the current git
repo.
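So the upgrade of another local repo now amounts to something like this,
rather than poking at it from the current repo's cwd (path hypothetical):

    (cd /path/to/other/repo && git annex upgrade)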
This means it will still be a .git file when git-annex init runs. That's
ok, the repo probably contains no annexed objects yet, and even if it does,
git-annex init does not care if symlinks in the worktree don't point to the
objects.
I made init, at the end, run the conversion code. Not really necessary
because the next git-annex command could do it just as well. But, this
avoids commands that don't normally write to the repo needing to write to
it, which might avoid some problem or other, and seems worth avoiding
generally.
md5sum is part of busybox, so it is probably available unless it was
compiled out. If md5sum (or cut, for that matter) is not available, it will
still use the whole path to $base; otherwise it hashes it.
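A minimal sketch of that fallback, assuming $base holds the path as in the
runshell script (the dir variable is hypothetical):

    if command -v md5sum >/dev/null && command -v cut >/dev/null; then
        dir="$(echo "$base" | md5sum | cut -d ' ' -f 1)"
    else
        dir="$base"
    fi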
Of course it's possible for md5sum to be available sometimes and not others
on the same system; in such an event the locales would be built twice for
the same bundle. The cleanup code will delete both sets once that
version of the bundle is upgraded.
* init --version: When the version given is one that automatically
upgrades to a newer version, use the newer version instead.
* Auto upgrades from older repo versions, like v5, now jump right to v8.
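For example, assuming v5 is one of the versions that auto-upgrades:

    git annex init --version=5
    # the repository ends up at v8, not v5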
Fix support for repositories tuned with annex.tune.branchhash1=true,
including --all not working and git-annex log not displaying anything for
annexed files.
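Such a repository is created by setting the tuning at init time, eg:

    git -c annex.tune.branchhash1=true annex init
    # --all and git-annex log now work here too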
fsck --from remote: Fix a concurrency bug that could make it incorrectly
detect that content in the remote is corrupt, and remove it, resulting in
data loss.
* When git-annex is built with a ssh that does not support ssh connection
caching, default annex.sshcaching to false, but let the user override it
(see the example below).
* Improve warning messages further when ssh connection caching cannot
be used, to clearly state why.
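For the first of these, the override is the usual config, eg:

    git config annex.sshcaching true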
This is untested because of rain; also I am operating from truncated
compiler error messages in a bug report that also doesn't mention what the
library version is. Still, it should work.
May break builds with old ghc; in particular, DerivingStrategies is,
I think, fairly new. The pragmas could be ifdefed if necessary. Works with
ghc 8.6.5.