recompute: stage new version of file in git

When writing doc/tips/computing_annexed_files.mdwn, I noticed
that a recompute --reproducible followed by a drop and a re-get did not
actually test if the file could be reproducibly computed again.

Turns out that get and drop both operate on staged files. If there is an
unstaged modification in the work tree, that's ignored. Somewhat
surprisingly, other commands like info do operate on unstaged files. So
behavior is inconsistent, and fairly surprising really, when there are
unstaged modifications to files.
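
To illustrate the staged/unstaged split described above, here is a minimal
sketch using plain git only, as a stand-in for the annexed-file case. It does
not run git-annex; the file names `foo` and `bar` and the temporary repository
are made up for the example. It shows that after an `mv` over an existing
file, the index still records the old content, which is what a command that
reads the index would see.

```shell
#!/bin/sh
# Plain-git stand-in (no git-annex): an unstaged overwrite leaves the
# old version recorded in the index. File names are hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

echo "version 1" > foo
git add foo                  # version 1 is now staged

echo "version 2" > bar
mv bar foo                   # overwrite foo without staging the change

# Blob id recorded in the index (second field of `git ls-files -s`).
staged=$(git ls-files -s foo | awk '{print $2}')
# Blob id of what is actually in the work tree right now.
worktree=$(git hash-object foo)

echo "staged:   $staged"
echo "worktree: $worktree"
git diff --name-only         # lists foo: modified, but not staged
```

The two ids differ, and `git cat-file -p "$staged"` still prints the old
content. Running `git add foo` afterwards would bring the index back in line
with the work tree.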

Probably this is rarely noticed because `git-annex add` is used to add a
new version of a file, and then it's staged. Or `git mv` is used to move
a file, rather than `mv` of a file over top of an existing file. So it's
uncommon to have an unstaged annexed file in a worktree.

It might be worth making things more consistent, but that's out of scope
for what I'm working on currently.

Also, I anticipate that supporting unlocked files with recompute will
require it to stage changes anyway.

So, make recompute stage the new version of the file.

I considered having recompute refuse to overwrite an existing staged
file. After all, whatever version was staged before will get lost when
the new version is staged over top of it. But, that's no different than
`git-annex addcomputed` being run with the name of an existing staged
file. Or `git-annex add` being run with a new file content when there is
an existing staged file. Or, for that matter, `git add` being run with a
new content when there is an existing staged file.
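
The `git add` case mentioned above can be sketched with plain git. This is
only an illustration of the index behavior, not of what recompute does
internally; the file name `foo` and the temporary repository are hypothetical.

```shell
#!/bin/sh
# Plain-git sketch: staging a new content over an already staged file
# replaces the index entry without any warning.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

echo "first staged version" > foo
git add foo
before=$(git ls-files -s foo | awk '{print $2}')

echo "second version" > foo
git add foo                  # replaces the staged entry for foo
after=$(git ls-files -s foo | awk '{print $2}')

echo "before: $before"
echo "after:  $after"
```

After the second `git add`, the index has a single entry for `foo` pointing
at the new content, and nothing in the index refers to the first version any
more.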
Joey Hess 2025-03-12 13:36:16 -04:00
parent 21b45da406
commit a673fc7cfd
GPG key ID: DB12DB0FF05F8F38
4 changed files with 7 additions and 24 deletions

@@ -1,13 +1,6 @@
This is the remainder of my todo list while I was building the
compute special remote. --[[Joey]]
* recompute should stage files in git. Otherwise,
`git-annex drop` after recompute --reproducible drops the staged
file, and `git-annex get` gets the staged file, and if it wasn't
actually reproducible, this is not apparent.
This is blocking adding the tip.
* Support parallel get of input files. The design allows for this,
but how much parallelism makes sense? Would it be possible to use the
usual worker pool?