While intended for converting URL keys added by addurl --fast to be
as if added by addurl --relaxed, it can also be used to remove size
from other types of keys. Although that is not likely to be useful
for checksummed keys, I suppose it could be used for WORM or other
non-checksum keys.
Specifying the --remove-size option does not prevent other migrations
from taking effect if there's a key upgrade to perform, or if the
backend has changed. So --backend=URL needs to be used to prevent
migrating an URL key to the default backend.
Note that it's not possible to use git-annex migrate to convert from a
non-URL key to an URL key, as URL keys cannot be generated, except by
addurl. So while this can get the same effect as --relaxed would have
when addurl --fast was used, when --fast was not used, it won't work, or
if --backend=URL is not used will remove the size but not prevent
checksum verification, which is not useful. Due to this complexity, I
decided not to mention it in the git-annex addurl man page.
Sponsored-by: Jochen Bartl on Patreon
git-lfs: Fix interoperability with gitlab's implementation of the git-lfs
protocol, which requests Content-Encoding chunked.
Sponsored-by: Dartmouth College's Datalad project
* uninit: Avoid error message when no commits have been made to the
repository yet.
* uninit: Avoid error message when there is no git-annex branch.
Sponsored-by: Svenne Krap on Patreon
Based on my earlier benchmark, I have a rough cost model for how
expensive it is for git-annex smudge to be run on a file, vs
how expensive it is for a gigabyte of a file's content to be read and
piped through to filter-process.
So, using that cost model, it can decide if using filter-process will
be more or less expensive than running the smudge filter on the files to
be restaged.
It turned out to be *really* annoying to temporarily disable
filter-process. I did find a way, but urk, this is horrible. Notice
that, if it's interrupted with it disabled, it will remain disabled
until the next time restagePointerFile runs. Which could be some time
later. If the user runs `git add` or `git checkout` on a lot of small
files before that, they will see slower than expected performance.
(This commit also deletes where I wrote down the benchmark results
earlier.)
Sponsored-by: Noam Kremen on Patreon
filter-process: New command that can make git add/checkout faster when
there are a lot of unlocked annexed files or non-annexed files, but that
also makes git add of large annexed files slower.
Use it by running: git
config filter.annex.process 'git-annex filter-process'
Fully tested and working, but I have not benchmarked it at all.
And, incremental hashing is not done when git add uses it, so extra work is
done in that case.
Sponsored-by: Mark Reidenbach on Patreon
metadata --batch --json: Reject input whose "fields" does not consist of
arrays of strings. Such invalid input used to be silently ignored.
Used to be that parseJSON for a JSONActionItem ran parseJSON separately
for the itemAdded, and if that failed, did not propagate the error. That
allowed different items with differently named fields to be parsed.
But it was actually only used to parse "fields" for metadata, so that
flexability is not needed.
The fix is just to parse "fields" as-is. AddJSONActionItemFields is needed
only because of the wonky way Command.MetaData adds onto the started json
object.
Note that this line got a dummy type signature added,
just because the type checker needs it to be some type.
itemFields = Nothing :: Maybe Bool
Since it's Nothing, it doesn't really matter what type it is,
and the value gets turned into json and is then thrown away.
Sponsored-by: Kevin Mueller on Patreon
Turns out that CommandStart actions do not have their exceptions caught,
which is why the giveup was causing a crash. Mostly these actions
do not do very much work on their own, but it does seem possible there
are other commands whose CommandStart also throws an exception.
So, my first attempt at a fix was to catch those exceptions. But,
--json-error-messages then causes a difficulty, because in order to output
a json error message, an action needs to have been started; that sets up
the json object that the error message will be included in a field of.
While it would be possible to output an object with just an error field,
this would be json output of a format that the user has no reason to
expect, that happens only in an exceptional circumstance. That is something
I have always wanted to avoid with the json output; while git-annex man
pages don't document what the json looks like, the output has always
been made to be self-describing. Eg, it includes "error-messages":[]
even when there's no errors.
With that ruled out, it doesn't seem a good idea to catch CommandStart
exceptions and display the error to stderr when --json-error-messages
is set. And so I don't know if it makes sense to catch exceptions from that
at all. Maybe I'd have a different opinion if --json-error-messages did not
exist though.
So instead, output a blank line like other batch commands do.
This also leaves open the possibility of implementing support for matching
object with metadata --json, which would also want to output a blank line
when the input didn't match.
Sponsored-by: Dartmouth College's DANDI project
addurl: Support adding the same url to multiple files at the same time when
using -J with --batch --with-files.
Implementation was easier than expected, was able to reuse OnlyActionOn.
While it will download the url's content multiple times, that seems like
the best thing to do; see my comment for why.
Sponsored-by: Dartmouth College's DANDI project
This makes it be displayed in the error-messages field with
--json-error-messages. And with --quiet, it will let it be displayed,
which makes sense because it's telling the user why what they requested
to do has failed to happen.
This opens the potential for the object file to be in place but
git-annex is interrupted before it can freeze it. git-annex fsck already
fixes that situation, which can also occur when lockContentForRemoval
thaws content.
Also improve comment to not be Windows-specific.
Caused by dirContains ".." "foo" being incorrectly False.
Also added a test of dirContains, which includes all the previous bug fixes
I could find and some obvious cases.
Reversion in version 8.20211011
Sponsored-by: Brett Eisenberg on Patreon
Fix bug that caused stale git-annex branch information to read when
annex.private or remote.name.annex-private is set.
The private journal file should not prevent reading more current
information from the git-annex branch, but used to.
Note that, overBranchFileContents has to do additional work now, when
there's a private journal file, it reads from the branch redundantly
and more slowly.
Sponsored-by: Jack Hill on Patreon
This removes a messy caveat that was easy to forget and caused at least one
bug. The price paid is that, after a write to a MultiWriter db, it has to
close the db connection that it had been using to read, and open a new
connection. So it might be a little bit slower. But, writes are usually
batched together, so there's often only a single write, and so there should
not be much of a slowdown. Notice that SingleWriter already closed the db
connection after a write, so paid the same overhead.
This is the second try at fixing a bug: git-annex get when run as the first
git-annex command in a new repo did not populate all unlocked files.
(Reversion in version 8.20210621)
Sponsored-by: Boyd Stephen Smith Jr. on Patreon
Rather than the error that occurred when trying to download the unchunked
content, which is less likely to actually be stored in the remote.
Sponsored-by: Boyd Stephen Smith Jr. on Patreon
test: Put gpg temp home directory in system temp directory, not filesystem
being tested.
Since I've found indications gpg can fail talking to the agent when the
socket ends up on eg, fat. And to hopefully fix this bug report I've
followed up on.
The main risk in using the system temp dir is that TMPDIR could be set to a
long directory path, which is too long to put a unix socket in. To
partially amelorate that risk, it uses either an absolute or a relative
path, whichever is shorter. (Hopefully gpg will not convert it to a longer
form of the path..)
If the user sets TMPDIR to something so long a path to it +
"S.gpg-agent" is too long, I suppose that's their issue to deal with.
Sponsored-by: Dartmouth College's Datalad project
This negotiation is not supported by versions of git-annex older
than 6.20180312. Well, maybe really 6.20180227 or so, but using that in
the changelog simplifies things since it was the version for the other
changes as well.
See commit c81768d425 for the back story.
As well as allowing for future protocol improvements, this will result
in negoatiating protocol version 1, which is an improvement over default
version 0.
In fact, it looks like no supported version of git-annex will use
protocol version 0, since version 1 was introduced in 6.20180227.
Still, removing the code for version 0 seems unncessary.
See commit 31e1adc005.
Sponsored-by: Brett Eisenberg on Patreon.