update
parent 9488a53023
commit 3db20b39f2
2 changed files with 29 additions and 76 deletions
@ -0,0 +1,21 @@
+[[!comment format=mdwn
+username="joey"
+subject="""comment 3"""
+date="2018-10-26T16:21:28Z"
+content="""
+While `git add` would be a lot slower when using this interface to add
+large files, it would make `git checkout` and other commands that update
+the work tree a lot faster.
+
+Since the smudge filter is not providing git with the file content any more,
+using filterdriver would avoid git running many git-annex smudge processes,
+greatly speeding up large checkouts.
+
+Unfortunately, `git annex smudge --update` ends up running the smudge filter
+on all files that the clean filter earlier acted on, so even if filterdriver were
+used to speed up the clean filter, there would still be one process spawned per
+file for the smudge filter.
+
+So some interface improvement is needed before git-annex can usefully use
+this.
+"""]]
@ -1,78 +1,10 @@
-git-annex should use smudge/clean filters. v6 mode
+git-annex should use smudge/clean filters. v7 mode
 
-### problems keeping v6 experimental
+## warts
 
-* Checking out a different branch causes git to smudge all changed files,
-and write their content. This does not honor annex.thin. A warning
-message is printed in this case.
-
-This is particularly wasteful when checking out an adjusted unlocked
-branch, which causes 2x the space to be used.
-
-"git annex proxy" could be used to handle this.
-Make it run the git command with smudge filter set to not output content
-but only pointers, and then at the end populate the pointer files, hard
-when appropriate. (As an optimization, the smudge filter could also be
-made to use the long-running filter interface when run this way.)
-
-git-annex adjust and git-annex sync could both use that internally
-when checking out the adjusted branch, and merging a branch into HEAD.
-
-Or: Make the smudge filter never provide the actual file content, but the
-pointer. Install post-checkout and post-merge hooks that populate
-the worktree files that were checked out. Of course, they will also
-need to update the index.
-
-Problem: post-merge hook is not run when there's a merge conflict.
-Git does not actually run the smudge filter in this case;
-the conflicting file becomes a text file containing a merge conflict
-between the two annex pointers. When the user resolves the conflict
-and git add's the result, git runs the smudge filter.
-So, if the smudge filter then provides the pointer, the file would not be
-populated. The post-commit hook would then need to populate the file,
-once the merge got committed.
-
-Problem: No hook seems to be run for git stash / git stash apply
-or for git reset --hard or git cherry-pick.
-Fatal or can we live with needing to run a
-git-annex command to populate the files after those commands?
-
-> implemented on the `delaysmudge` branch now
-
-(My enhanced smudge/clean patch set also fixed this problem, in a much
-nicer way...)
-
-* Optionally: Use the filterdriver interface during checkout. Unfortunately that
-interface is slower for cleaning during git add (see
-[[todo/Long_Running_Filter_Process]]), but since the smudge filter is not
-providing git with the file content any more, using filterdriver would
-avoid git running many git-annex smudge processes, greatly speeding up large
-checkouts. git add could be left slow, with git-annex add being the fast path,
-until the filterdriver interface is improved. Or, make "git annex proxy"
-use the filterdriver interface for checkout.
-
-* When git runs the smudge filter, it buffers all its output in ram before
-writing it to a file. So, checking out a branch with a large v6 unlocked files
-can cause git to use a lot of memory.
-
-This needs to be fixed in git, but my proposed interface in
-<http://thread.gmane.org/gmane.comp.version-control.git/294425> would
-avoid the problem for git checkout, since it would use the new interface
-and not the smudge filter.
-
-Last verified with git 2.18 in 2018.
-
-Note that the long-running filter process interface has the same problem.
-
-The annex.thin idea above could work around this problem.
-
-> implemented on the `delaysmudge` branch now
-
-## other warts
-
-* There are several v6 bugs that are edge cases and
+* There are several bugs that are edge cases and
 need more info or analysis. None of these seem like blockers
-to keep v6 experimental or to replacing direct mode with v6.
+to keep v7 experimental or to replacing direct mode with v7.
 
 - <http://git-annex.branchable.com/bugs/assistant_crashes_in_TransferScanner/>
 - <http://git-annex.branchable.com/bugs/v6_appears_to_not_thin/>
@ -86,14 +18,14 @@ git-annex should use smudge/clean filters. v6 mode
 multiple files, and so should be faster.
 
 See [[todo/Long_Running_Filter_Process]] .. it's not currently actually a
-win but might be a good way to improve git to work better with v6.
+win but might be a good way to improve git to work better with v7.
 
-* Eventually (but not yet), make v6 the default for new repositories.
+* Eventually (but not yet), make v7 the default for new repositories.
 Note that the assistant forces repos into direct mode; that will need to
 be changed then, and it should enable annex.thin instead.
 
 * Later still, remove support for direct mode, and enable automatic
-v5 to v6 upgrades.
+v5 to v7 upgrades.
 
 ### historical notes
 
@ -395,7 +327,7 @@ just look at the repo content in the first place..
 
 #### Upgrading
 
-annex.version changes to 6
+annex.version changes to 7
 
 git config for filter.annex.smudge and filter.annex.clean is set up.
 