enable filter.annex.process in v9

This has tradeoffs, but is generally a win, and users for whom it makes
git add unacceptably slow can just disable it again.

It needed to happen in an upgrade, since there are git-annex versions
that do not support it, and using such an old version with a v8
repository with filter.annex.process set will cause bad behavior.
By enabling it in v9, it's guaranteed that any git-annex version that
can use the repository does support it. This is not perfect protection,
though: if an old git-annex version is used with a v9 repository, git add
will try to run git-annex filter-process, which will fail. But at least
the user is unlikely to have an old git-annex in PATH if they are using a
v9 repository, since it won't work in that repository.
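
As a rough sketch of the toggle described above: this assumes the setting
is the ordinary git config key filter.annex.process and that its value is
the `git-annex filter-process` command named in this message. The value is
an assumption; check the git-annex documentation for the installed version.

```shell
# Show the current setting, if any (prints nothing when it is unset).
git config --get filter.annex.process

# Disable the long-running filter process again, e.g. if git add
# becomes unacceptably slow in this repository.
git config --unset filter.annex.process

# Re-enable it (assumed value, taken from the command named above).
git config filter.annex.process 'git-annex filter-process'
```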

Sponsored-by: Dartmouth College's Datalad project
Joey Hess 2022-01-21 13:11:18 -04:00
parent fad11c2250
commit 47084b8a1d
7 changed files with 31 additions and 8 deletions

@@ -17,7 +17,8 @@ to git. git-lfs uses it that way.
 The first problem with the interface was that it ran a command once per
 file. This was later fixed by extending it to support long-running filter
 processes, which git-lfs uses. git-annex can also use that interface,
-when `git-annex filter-process` is enabled, but it does not by default.
+when `git-annex filter-process` is enabled. That is the case in v9
+repositories and above.
 A second problem with the interface, which affects git-lfs AFAIK, is that
 git buffers the output of the smudge filter in memory before updating the
@@ -81,12 +82,12 @@ And here's the consequences of git-annex's workarounds:
 * It doesn't use the long-running filter process interface by default,
 so `git add` of a lot of files runs `git-annex smudge --clean` once per file,
 which is slower than it could be. Using `git-annex add` avoids this problem.
-So does enabling `git-annex filter-process`.
+So does enabling `git-annex filter-process`, which is default in v9.
 * After a git-annex get/drop or a git checkout or pull that affects a lot
 of files, the clean filter gets run once per file, which is again, slower
 than ideal. Enabling `git-annex filter-process` can speed this up
-in some cases.
+in some cases, and is default in v9.
 * When `git-annex filter-process` is enabled, it cannot use the trick
 described above that `git-annex smudge --clean` uses to avoid git

@@ -1,7 +1,7 @@
-When `git-annex filter-process` is enabled, `git add` pipes the content of
-files into it, but that's thrown away, and the file is read again by git-annex
-to generate a hash. It would improve performance to hash the content
-provided via the pipe.
+When `git-annex filter-process` is enabled (v9 and above), `git add` pipes
+the content of files into it, but that's thrown away, and the file is read
+again by git-annex to generate a hash. It would improve performance to hash
+the content provided via the pipe.
 When filter-process is not enabled, `git-annex smudge --clean` reads
 the file to hash it, then reads it a second time to copy it into

@@ -18,3 +18,5 @@ could change and if it does, these things could be included.
 seem worth it.
 May want to implement [[incremental_hashing_for_add]] first.
+[[done]] --[[Joey]]