Commit graph

3790 commits

Author SHA1 Message Date
tomdhunt
0127e5f4e4 Added a comment 2022-03-07 18:27:38 +00:00
Joey Hess
da698437b6
close 2022-03-07 14:12:11 -04:00
Joey Hess
dab9078ab7
close 2022-03-07 13:23:19 -04:00
mih
fa5a001ef6 Added a comment: Thanks! 2022-03-04 16:23:16 +00:00
Joey Hess
5e385cb637
add 2022-03-02 10:44:49 -04:00
Joey Hess
2fc46e1871
git-annex test from standalone speedup
Avoid git-annex test being very slow when run from within the standalone
linux tarball or OSX app.

It may not really be necessary to add to PATH the directory where the
git-annex binary resides, but it can't hurt. Most places where the test
suite or git-annex run git-annex, they use programPath, so won't need
a modified PATH. But I'm not sure if that's always the case.

Sponsored-by: Dartmouth College's Datalad project
2022-03-01 16:08:55 -04:00
Joey Hess
ecf7c29107
update comment 2022-03-01 15:57:13 -04:00
Joey Hess
316a049e96
comment 2022-03-01 15:50:44 -04:00
yarikoptic
ca5834a18c Added a comment: question about backend 2022-02-28 22:42:34 +00:00
yarikoptic
9e6e53af71 initial report on a very slow git annex test on discovery 2022-02-28 20:50:06 +00:00
Joey Hess
525218ef86
commet 2022-02-28 15:42:33 -04:00
yarikoptic
a33b40876d Added a comment 2022-02-28 18:48:51 +00:00
Joey Hess
7a4a1322f5
update 2022-02-28 13:37:05 -04:00
Joey Hess
20875bd5e8
open related todo 2022-02-28 13:26:43 -04:00
Joey Hess
7de469edd0
comment 2022-02-25 13:32:06 -04:00
yarikoptic
7e9ebea910 Added a comment 2022-02-21 21:53:03 +00:00
Joey Hess
5a8b15f6db
comment 2022-02-21 15:46:12 -04:00
Joey Hess
ce1b3a9699
info: Allow using matching options in more situations
File matching options like --include will be rejected in situations where
there is no filename to match against. (Or where there is a filename but
it's not relative to the cwd, or otherwise seemed too bothersome to match
against.)

The addition of listKeys' was necessary to avoid using more memory in the
common case of "git-annex info". Adding a filterM would have caused the
list to buffer in memory and not stream. This is an ugly hack, but listKeys
had previously run Annex operations inside unafeInterleaveIO (for direct
mode). And matching against a matcher should hopefully not change any Annex
state.

This does allow for eg `git-annex info somefile --include=*.ext`
although why someone would want to do that I don't really know. But it
seems to make sense to allow it.
But, consider: `git-annex info ./somefile --include=somefile`
This does not match, so will not display info about somefile.
If the user really wants to, they can `--include=./somefile`.

Using matching options like --copies or --in=remote seems likely to be
slower than git-annex find with those options, because unlike such
commands, info does not have optimised streaming through the matcher.

Note that `git-annex info remote` is not the same as
`git-annex info --in remote`. The former shows info about all files in
the remote. The latter shows local keys that are also in that remote.
The output should make that clear, but this still seems like a point
where users could get confused.

Sponsored-by: Jochen Bartl on Patreon
2022-02-21 14:46:07 -04:00
Joey Hess
d36de3edf9
comment 2022-02-21 12:49:36 -04:00
Atemu
6ca9f5e18a 2022-02-20 18:03:35 +00:00
yarikoptic
b481ec2738 Added a comment 2022-02-18 21:56:19 +00:00
yarikoptic
9d2e6a60f0 Added a comment 2022-02-18 20:18:04 +00:00
Joey Hess
faf84aa5c2
Avoid git status taking a long time after git-annex unlock of many files.
Implemented by making Git.Queue have a FlushAction, which can accumulate
along with another action on files, and runs only once the other action has
run.

This lets git-annex unlock queue up git update-index actions, without
conflicting with the restagePointerFiles FlushActions.

In a repository with filter-process enabled, git-annex unlock will
often not take any more time than before, though it may when the files are
large. Either way, it should always slow down less than git-annex status
speeds up.

When filter-process is not enabled, git-annex unlock will slow down as much
as git status speeds up.

Sponsored-by: Jochen Bartl on Patreon
2022-02-18 15:06:40 -04:00
Joey Hess
c68f52c6a2
restage pointer file after unlock
This avoids a later git status or similar taking a long time to run
as it runs git-annex smudge once per file. While v9 repositories do
avoid that taking long when the files are small, large files can still
make git status take a very long time.

This does make unlock slower, because now git-annex smudge is being run
once per file unlocked. However, the next commit should speed that up in
many cases.

Sponsored-by: Boyd Stephen Smith Jr. on Patreon
2022-02-18 14:55:52 -04:00
Joey Hess
07215cfeb5
complete annex.skipunknown transition
annex.skipunknown now defaults to false, so commands like `git annex get foo*`
will not silently skip over files/dirs that are not checked into git.

Sponsored-by: Brock Spratlen on Patreon
2022-02-18 13:18:05 -04:00
Joey Hess
0edf01d7d4
registerurl,unregisterurl: rework output and support --json
* registerurl, unregisterurl: Improved output when reading from stdin
  to be more like other batch commands.
* registerurl, unregisterurl: Added --json and --json-error-messages options.

Note that this did change the --batch output in a way that could possibly
break something that expected the old output to never change. I think it's
acceptable to break that because there has never been a guarantee of
unchanging output format except with --batch for most commands. The old
output was just really weird too!

One possible wart is that "git-annex registerurl" with no options now
seems to just hang, since it's waiting for stdin input. Before, it said
"registerurl (stdin)" which was clearer about what's happenening. But this
is a deprecated mode anyway, --batch makes clear what's happening. If
anything, this problem would be a reason to eventually remove the support
for reading from stdin w/o --batch.

Sponsored-by: Dartmouth College's Datalad project
2022-02-14 13:29:20 -04:00
Joey Hess
291dc0d1a9
comment 2022-02-14 12:42:37 -04:00
yarikoptic
c908046235 initial todo for --json for registerurl 2022-02-09 21:39:46 +00:00
Joey Hess
ad2f0446a0
comment 2022-02-08 13:24:28 -04:00
Atemu
d20550ac69 2022-02-08 10:47:21 +00:00
Joey Hess
47084b8a1d
enable filter.annex.process in v9
This has tradeoffs, but is generally a win, and users who it causes git add to
slow down unacceptably for can just disable it again.

It needed to happen in an upgrade, since there are git-annex versions
that do not support it, and using such an old version with a v8
repository with filter.annex.process set will cause bad behavior.
By enabling it in v9, it's guaranteed that any git-annex version that
can use the repository does support it. Although, this is not a perfect
protection against problems, since an old git-annex version, if it's
used with a v9 repository, will cause git add to try to run
git-annex filter-process, which will fail. But at least, the user is
unlikely to have an old git-annex in path if they are using a v9
repository, since it won't work in that repository.

Sponsored-by: Dartmouth College's Datalad project
2022-01-21 13:11:18 -04:00
Joey Hess
d427afb347
v9-locking branch still wip 2022-01-11 17:04:25 -04:00
Joey Hess
029820c832
v9-locking branch 2022-01-11 14:49:21 -04:00
Joey Hess
8ae88923b8
moreinfo 2022-01-11 12:24:40 -04:00
Joey Hess
c36895e9cb
comment 2022-01-05 13:09:18 -04:00
yarikoptic
ec9a4945e4 Added a comment 2022-01-03 20:03:38 +00:00
yarikoptic
4f31a27e6a initial report on slow drop 2022-01-03 19:59:08 +00:00
Joey Hess
b060d99fe0
comment 2021-12-08 13:18:13 -04:00
yarikoptic
2719170575 initial todo on more flexible credentials management mechanism 2021-12-07 18:34:58 +00:00
Joey Hess
b7976e08f0
comment 2021-12-01 13:03:05 -04:00
adina.wagner@2a4cac6443aada2bd2a329b8a33f4a7b87cc8eff
a5b635af20 Added a comment: A few Windows benchmarks 2021-11-29 22:17:39 +00:00
Joey Hess
05d79b26d8
clarify 2021-11-29 14:00:32 -04:00
Joey Hess
0f9e5ada82
idea 2021-11-21 11:19:47 -04:00
Joey Hess
9121154a75
new todo 2021-11-09 15:52:17 -04:00
Joey Hess
a0758bdd10
dynamically disable filter-process in restagePointerFile when it would be slower
Based on my earlier benchmark, I have a rough cost model for how
expensive it is for git-annex smudge to be run on a file, vs
how expensive it is for a gigabyte of a file's content to be read and
piped through to filter-process.

So, using that cost model, it can decide if using filter-process will
be more or less expensive than running the smudge filter on the files to
be restaged.

It turned out to be *really* annoying to temporarily disable
filter-process. I did find a way, but urk, this is horrible. Notice
that, if it's interrupted with it disabled, it will remain disabled
until the next time restagePointerFile runs. Which could be some time
later. If the user runs `git add` or `git checkout` on a lot of small
files before that, they will see slower than expected performance.

(This commit also deletes where I wrote down the benchmark results
earlier.)

Sponsored-by: Noam Kremen on Patreon
2021-11-08 16:20:34 -04:00
Joey Hess
054c803f8d
benchmarking of filter-process vs smudge/clean
No firm conclusions yet, but it's doing better than I would have
expected.

Sponsored-by: Graham Spencer on Patreon
2021-11-05 13:37:53 -04:00
Joey Hess
099e8fe061
close 2021-11-05 12:46:56 -04:00
Joey Hess
b25a138e22
update for git-annex filter-process 2021-11-04 15:15:26 -04:00
Joey Hess
8dd91be867
mention filter-process as v9 material 2021-11-04 15:05:24 -04:00
Joey Hess
bf1408f7bf
long-running-smudge branch started 2021-11-03 15:44:05 -04:00