The splitting of the tests into parts for parallelism made --pattern
do extra work, because init tests have to be run for each part, but
many of the parts are empty.
For example, git-annex test --pattern '/move (ssh remote)/'
took 12 seconds to run before. This improves the runtime to 4 seconds.
Sponsored-by: Dartmouth College's Datalad project
init: Avoid scanning for annexed files, which can be lengthy in a
large repository. Instead that scan is done on demand. This lets git-annex
init be run and some query commands be used in a repository without
waiting.
Note that autoinit already behaved this way, so while this will mean some
commands like git-annex get/unlock/add will do the scan the first time run,
that is not really a significant behavior change.
And, it's really better to have a consistent behavior. The reason for
the inconsistency was a strange bug discussed in
b3c4579c79. Avoiding reconcileStaged in
init will keep avoiding whatever that was.
Sponsored-by: Dartmouth College's DANDI project
Increasing the size of the queue 10x makes git-annex init 7% faster in a
repository with 86000 annexed files.
The memory use goes up, from 70876 kb to 85376 kb.
Avoids database querying overhead when the database is newly created.
In the large repository where git-annex init took 24 seconds, this sped it
up to 20.47 seconds, a speedup of around 15%.
Sponsored-by: Dartmouth College's DANDI project
export: Fix a bug that left a file on a special remote when two files with
the same content were both deleted in the exported tree.
Case of the wrong data structure leading to the wrong result.
The DiffMap now contains all the old filenames, and all the new filenames.
Note that, when 2 files with the same content are both renamed,
it only renames the first, but deletes and re-exports the second.
Improving that is possible, but it would need to use a different temporary
filename. Anyway, that is an unusual case, and there are known to be other
unusual cases where export does not rename with maximum efficiency, IIRC.
(Or maybe this is the case that I remember?)
Sponsored-by: Dartmouth College's OpenNeuro project
The initTests have to be run once per part, and a point of diminishing
returns can be reached where more work is being done to set up for 1 or
2 tests than to run them.
This is better than a hard cap of -J8 or so, because it lets other
things than these particular tests still be parallelized at -J16.
Sponsored-by: Dartmouth College's Datalad project
Clean the standalone environment before running the su command
to run "sh". Otherwise, PATH leaked through, causing it to run
git-annex.linux/bin/sh, but GIT_ANNEX_DIR was not set,
which caused that script to not work:
[2022-10-26 15:07:02.145466106] (Utility.Process) process [938146] call: pkexec ["sh","-c","cd '/home/joey/tmp/git-annex.linux/r' && '/home/joey/tmp/git-annex.linux/git-annex' 'enable-tor' '1000'"]
/home/joey/tmp/git-annex.linux/bin/sh: 4: exec: /exe/sh: not found
Changed programPath to not use GIT_ANNEX_PROGRAMPATH,
but instead run the scripts at the top of GIT_ANNEX_DIR.
That works both when the standalone environment is set up, and when it's
not.
Sponsored-by: Kevin Mueller on Patreon