path to a bare repo when git config is not allowed to list the configs
due to the CVE-2022-24765 fix.
That resulted in a confusing error message, and prevented the nice
message that explains how to mark the repo as safe to use.
Made isBare a tristate so that the case where core.bare is not returned can
be handled.
The handling in updateLocation is to check if the directory
contains config and objects and if so assume it's bare.
Note that if that heuristic is somehow wrong, it would construct a repo
that thinks it's bare but is not. That could cause follow-on problems,
but since git-annex then checks checkRepoConfigInaccessible, and skips
using the repo anyway, a wrong guess should not be a problem.
Sponsored-by: Luke Shumaker on Patreon
Used to fail with a bad error message, indicating there was no
repository with the specified name, or something like that. Now, suggest
they use the uuid to disambiguate.
* info, enableremotemote, renameremote: Avoid a confusing message when more
than one repository matches the user provided name.
* info: Exit nonzero when the input is not supported.
Sponsored-by: Kevin Mueller on Patreon
A benchmark in my sound repository with `git-annex view feedtitle=*`
took 2:52 wall clock time before and 1:58 after. Though it still only used
130% of CPU.
This is the same kind of optimisation that is in seekFilteredKeys, though
that precaches location logs while this streams the metadata logs direct
to parsing them.
seekFilteredKeys contains more streaming, to find the annexed files, and
this could be further sped up with similar streaming.
Sponsored-by: Nicholas Golder-Manning on Patreon
Based on https://github.com/golang/go/discussions/58409, the Go compiler
already defaults to using a google proxy server, which would allow
Google to collect information about what dependencies users are
installing. (Of course they claim they won't.) Two separate environment
settings are needed to turn that off, and users in that thread were
surprised to learn about one of them.
So this warning is already appropriate to some extent.
Also based on the minimisation of user concerns by the golang developers
on that issue and elsewhere, it seems best to assume that they are not
going to be dissuaded from increasing data collection efforts in the future,
even if the blowback prevents this particular attempt.
So this warning should not be removed unless the Go community somehow
extricates itself from Google's control. Or unless ipfs is rewritten in
another language.
Some distros do have ipfs. Unfortunately, Debian appears to be structurally
incapable of packaging it. (8 years and counting;
https://bugs.debian.org/779893). So lots of users will be stuck
installing it from source or having to trust its official binaries.
It apparently displays a notice on first use, but I was surprised to
learn about this behavior today, and I've used it. Displaying a notice
does not make violating users' privacy acceptable.
sync: Fix a bug that caused files to be removed from an importtree=yes
exporttree=yes special remote when the remote's annex-tracking-branch was
not the currently checked out branch.
Sponsored-by: Max Thoursie on Patreon
* sync: When run in a view branch, refresh the view branch to reflect any
changes that have been made to the parent branch or metadata.
This is basically working, but probably needs some more work to deal with
all the edge cases of things sync does.
Sponsored-by: Lawrence Brogan on Patreon
* view: New field?=glob and ?tag syntax that includes a directory "_"
in the view for files that do not have the specified metadata set.
* Added annex.viewunsetdirectory git config to change the name of the
"_" directory in a view.
When in a view using the new syntax, old git-annex will fail to parse the
view log. It errors with "Not in a view.", which is not ideal. But that
only affects view commands.
annex.viewunsetdirectory is included in the View for a couple of reasons.
One is to avoid needing to warn the user that it should not be changed when
in a view, since that would confuse git-annex. Another reason is that it
helped with plumbing the value through to some pure functions.
annex.viewunsetdirectory is actually mangled the same as any other view
directory. So if it's configured to something like "N/A", there won't be
multiple levels of directories, which would also confuse git-annex.
Sponsored-By: Jack Hill on Patreon
S3: Support a region= configuration useful for some non-Amazon S3
implementations. This feature needs git-annex to be built with aws-0.24.
datacenter= sets both the AWS hostname and region in one setting, which is
easy when using AWS, but not useful for other hosts. So kept datacenter
as-is, but added this additional config.
Sponsored-By: Brett Eisenberg on Patreon
Use separate stages for download and upload. In the common case where
it downloads the file from one remote and then uploads to the other,
those are by far the most expensive operations, and there's a decent
chance the two remotes bottleneck on different resources.
Suppose it's being run with -J2 and a bunch of 10 mb files. Two threads
will be started both downloading from the src remote. They will probably
finish at the same time. Then two threads will be started uploading to
the dst remote. They will probably take the same time as well. Before
this change, it would alternate back and forth, bottlenecking on src and dst.
With this change, as soon as the two threads start uploading to dst, two
more threads are able to start, downloading from src. So bandwidth to
both remotes is saturated more often.
Other commands that use transferStages only send in one direction at a
time. So the worker threads for the other direction will sit idle, and
there will be no change in their behavior.
Sponsored-by: Dartmouth College's DANDI project
Lock the local content for drop after getting it from src, to prevent another
process from using the local content as a copy and dropping it from src,
which would prevent dropping the local content after sending it to dest.
Support resuming an interrupted move that downloaded the content from
src, leaving the local content populated. In this case, the location log
has not been updated to say the content is present locally, so we can
assume that it's resuming and go ahead and drop the local content after
sending it to dest.
Note that if a `git-annex get` is being ran at the same time as a
`git-annex move --from --to`, it may get a file just before the move
processes it. So the location log has not been updated yet, and the move
thinks it's resuming. Resulting in local copy being dropped after it's
sent to the dest. This race is something we'll just have to live with,
it seems.
I also gave up on the idea of checking if the location log had been updated
by a `git-annex get` that is ran at the same time. That wouldn't work, because
the location log is precached in the seek stage, so reading it again after
sending the content to dest would not notice changes made to it, unless the cache
were invalidated, which would slow it down a lot. That idea anyway was subject
to races where it would not detect the concurrent `git-annex get`.
So concurrent `git-annex get` will have results that may be surprising.
To make that less surprising, updated the documentation of this feature to
be explicit that it downloads content to the local repository
temporarily.
Sponsored-by: Dartmouth College's DANDI project