Commit graph

3809 commits

Author SHA1 Message Date
Joey Hess
0605cc1bfb
idea 2022-03-29 18:09:41 -04:00
Joey Hess
bc6d64ec8a
comment 2022-03-29 14:53:07 -04:00
yarikoptic
a03aee0033 Added a comment 2022-03-16 20:32:57 +00:00
Joey Hess
025c18128b
test: Added --jobs option
Default to the number of CPU cores, which seems about optimal
on my laptop. Using one more saves me 2 seconds actually.

Better packing of workers improves speed significantly.

In 2 tests runs, I saw segfaulting workers despite my attempt
to work around that issue. So detect when a worker does, and re-run it.

Removed installSignalHandlers again, because I was seeing an
error "lost signal due to full pipe", which I guess was somehow caused
by using it.

Sponsored-by: Dartmouth College's Datalad project
2022-03-16 14:42:07 -04:00
Joey Hess
8d14ce8f38
parallelize git-annex test for 25% speedup
Note the very weird workaround for what appears to be some kind of tasty
bug, which causes a segfault. This is not new to this modification,
I was seeing a segfault before at least intermittently when limiting
git-annex test -p to only run a single test group.

Also, the path from one test repo to a remote test repo used to be
"../../foo", which somehow broke when moving the test repos from .t to
.t/N. I don't actually quite understand how it used to work, but
"../foo" seems correct and works in the new situation.

Test output from the concurrent processes is not yet serialized.
Should be easy to do using concurrent-output.

More test groups will probably make the speedup larger. It would
probably be best to have a larger number of test groups and divvy them
amoung subprocesses numbered based on the number of CPU cores, perhaps
times 2 or 3.

Sponsored-by: Dartmouth College's Datalad project
2022-03-14 15:24:37 -04:00
ErrGe
7580d787cc Added a comment 2022-03-11 02:26:56 +00:00
ErrGe
4d593d4461 Added a comment 2022-03-11 02:23:20 +00:00
Joey Hess
afeb9b728e
comment 2022-03-10 16:10:46 -04:00
Joey Hess
feaf16141e
comment 2022-03-10 13:22:32 -04:00
Atemu
0e39304905 Added a comment 2022-03-09 12:07:45 +00:00
ErrGe
f706a68c43 2022-03-09 01:08:23 +00:00
Joey Hess
c7f7be0236
comment 2022-03-08 15:50:31 -04:00
Joey Hess
82f1d82286
comment 2022-03-08 15:46:29 -04:00
Joey Hess
4cab0c1b05
comment 2022-03-08 14:51:01 -04:00
Joey Hess
14add55c2b
reopen 2022-03-08 14:08:27 -04:00
Joey Hess
2c5bf952cf
comment 2022-03-08 13:47:06 -04:00
Atemu
c7e3414d8d Added a comment 2022-03-08 13:21:03 +00:00
yarikoptic
875a04e1e2 Added a comment: still slow 2022-03-08 00:28:00 +00:00
Joey Hess
1cbbd23109
comment 2022-03-07 15:25:32 -04:00
tomdhunt
0127e5f4e4 Added a comment 2022-03-07 18:27:38 +00:00
Joey Hess
da698437b6
close 2022-03-07 14:12:11 -04:00
Joey Hess
dab9078ab7
close 2022-03-07 13:23:19 -04:00
mih
fa5a001ef6 Added a comment: Thanks! 2022-03-04 16:23:16 +00:00
Joey Hess
5e385cb637
add 2022-03-02 10:44:49 -04:00
Joey Hess
2fc46e1871
git-annex test from standalone speedup
Avoid git-annex test being very slow when run from within the standalone
linux tarball or OSX app.

It may not really be necessary to add to PATH the directory where the
git-annex binary resides, but it can't hurt. Most places where the test
suite or git-annex run git-annex, they use programPath, so won't need
a modified PATH. But I'm not sure if that's always the case.

Sponsored-by: Dartmouth College's Datalad project
2022-03-01 16:08:55 -04:00
Joey Hess
ecf7c29107
update comment 2022-03-01 15:57:13 -04:00
Joey Hess
316a049e96
comment 2022-03-01 15:50:44 -04:00
yarikoptic
ca5834a18c Added a comment: question about backend 2022-02-28 22:42:34 +00:00
yarikoptic
9e6e53af71 initial report on a very slow git annex test on discovery 2022-02-28 20:50:06 +00:00
Joey Hess
525218ef86
commet 2022-02-28 15:42:33 -04:00
yarikoptic
a33b40876d Added a comment 2022-02-28 18:48:51 +00:00
Joey Hess
7a4a1322f5
update 2022-02-28 13:37:05 -04:00
Joey Hess
20875bd5e8
open related todo 2022-02-28 13:26:43 -04:00
Joey Hess
7de469edd0
comment 2022-02-25 13:32:06 -04:00
yarikoptic
7e9ebea910 Added a comment 2022-02-21 21:53:03 +00:00
Joey Hess
5a8b15f6db
comment 2022-02-21 15:46:12 -04:00
Joey Hess
ce1b3a9699
info: Allow using matching options in more situations
File matching options like --include will be rejected in situations where
there is no filename to match against. (Or where there is a filename but
it's not relative to the cwd, or otherwise seemed too bothersome to match
against.)

The addition of listKeys' was necessary to avoid using more memory in the
common case of "git-annex info". Adding a filterM would have caused the
list to buffer in memory and not stream. This is an ugly hack, but listKeys
had previously run Annex operations inside unafeInterleaveIO (for direct
mode). And matching against a matcher should hopefully not change any Annex
state.

This does allow for eg `git-annex info somefile --include=*.ext`
although why someone would want to do that I don't really know. But it
seems to make sense to allow it.
But, consider: `git-annex info ./somefile --include=somefile`
This does not match, so will not display info about somefile.
If the user really wants to, they can `--include=./somefile`.

Using matching options like --copies or --in=remote seems likely to be
slower than git-annex find with those options, because unlike such
commands, info does not have optimised streaming through the matcher.

Note that `git-annex info remote` is not the same as
`git-annex info --in remote`. The former shows info about all files in
the remote. The latter shows local keys that are also in that remote.
The output should make that clear, but this still seems like a point
where users could get confused.

Sponsored-by: Jochen Bartl on Patreon
2022-02-21 14:46:07 -04:00
Joey Hess
d36de3edf9
comment 2022-02-21 12:49:36 -04:00
Atemu
6ca9f5e18a 2022-02-20 18:03:35 +00:00
yarikoptic
b481ec2738 Added a comment 2022-02-18 21:56:19 +00:00
yarikoptic
9d2e6a60f0 Added a comment 2022-02-18 20:18:04 +00:00
Joey Hess
faf84aa5c2
Avoid git status taking a long time after git-annex unlock of many files.
Implemented by making Git.Queue have a FlushAction, which can accumulate
along with another action on files, and runs only once the other action has
run.

This lets git-annex unlock queue up git update-index actions, without
conflicting with the restagePointerFiles FlushActions.

In a repository with filter-process enabled, git-annex unlock will
often not take any more time than before, though it may when the files are
large. Either way, it should always slow down less than git-annex status
speeds up.

When filter-process is not enabled, git-annex unlock will slow down as much
as git status speeds up.

Sponsored-by: Jochen Bartl on Patreon
2022-02-18 15:06:40 -04:00
Joey Hess
c68f52c6a2
restage pointer file after unlock
This avoids a later git status or similar taking a long time to run
as it runs git-annex smudge once per file. While v9 repositories do
avoid that taking long when the files are small, large files can still
make git status take a very long time.

This does make unlock slower, because now git-annex smudge is being run
once per file unlocked. However, the next commit should speed that up in
many cases.

Sponsored-by: Boyd Stephen Smith Jr. on Patreon
2022-02-18 14:55:52 -04:00
Joey Hess
07215cfeb5
complete annex.skipunknown transition
annex.skipunknown now defaults to false, so commands like `git annex get foo*`
will not silently skip over files/dirs that are not checked into git.

Sponsored-by: Brock Spratlen on Patreon
2022-02-18 13:18:05 -04:00
Joey Hess
0edf01d7d4
registerurl,unregisterurl: rework output and support --json
* registerurl, unregisterurl: Improved output when reading from stdin
  to be more like other batch commands.
* registerurl, unregisterurl: Added --json and --json-error-messages options.

Note that this did change the --batch output in a way that could possibly
break something that expected the old output to never change. I think it's
acceptable to break that because there has never been a guarantee of
unchanging output format except with --batch for most commands. The old
output was just really weird too!

One possible wart is that "git-annex registerurl" with no options now
seems to just hang, since it's waiting for stdin input. Before, it said
"registerurl (stdin)" which was clearer about what's happenening. But this
is a deprecated mode anyway, --batch makes clear what's happening. If
anything, this problem would be a reason to eventually remove the support
for reading from stdin w/o --batch.

Sponsored-by: Dartmouth College's Datalad project
2022-02-14 13:29:20 -04:00
Joey Hess
291dc0d1a9
comment 2022-02-14 12:42:37 -04:00
yarikoptic
c908046235 initial todo for --json for registerurl 2022-02-09 21:39:46 +00:00
Joey Hess
ad2f0446a0
comment 2022-02-08 13:24:28 -04:00
Atemu
d20550ac69 2022-02-08 10:47:21 +00:00
Joey Hess
47084b8a1d
enable filter.annex.process in v9
This has tradeoffs, but is generally a win, and users who it causes git add to
slow down unacceptably for can just disable it again.

It needed to happen in an upgrade, since there are git-annex versions
that do not support it, and using such an old version with a v8
repository with filter.annex.process set will cause bad behavior.
By enabling it in v9, it's guaranteed that any git-annex version that
can use the repository does support it. Although, this is not a perfect
protection against problems, since an old git-annex version, if it's
used with a v9 repository, will cause git add to try to run
git-annex filter-process, which will fail. But at least, the user is
unlikely to have an old git-annex in path if they are using a v9
repository, since it won't work in that repository.

Sponsored-by: Dartmouth College's Datalad project
2022-01-21 13:11:18 -04:00