For these, use VURL and URL keys, with an "annex-compute:" URI prefix.
These URL keys will look something like this:
URL--annex-compute&cbar4,63pconvert,3-f4d3d72cf3f16ac9c3e9a8012bde4462
Generally it's too long so most of it gets md5summed. It's a little
ugly, but it's what fell out of the existing URL key generation
machinery. I did consider special casing to eg
"URL--annex-compute&c4d3d72cf3f16ac9c3e9a8012bde4462". But it seems at
least possibly useful that the name of the file that was computed is
visible and perhaps one or two words of the git-annex compute command
parameters.
Note that two different output files from the same computation will get
the same URL key. And these keys should remain stable.
Working pretty well. Mostly. But:
* Does not yet support inputs that are non-annexed files checked into git
* --fast is currently broken (will need something like VURL keys)
* --unreproducible still uses a checksumming backend, so drop and get
again will likely fail (needs probably to use an URL key or something
like one)
The compute special remote seems to work pretty well too. Eg,
getting from it works, and dropping content that is present in it works.
Eg, a computation might be run in "foo/" and refer to "../bar" as an
input or output.
So, the subdir is part of the computation state.
Also, prevent input or output of files that are outside the git
repository. Of course, the program can access any file on disk if it
wants to; this is just a guard against mistakes. And it may also be
useful if the program comunicates with something less trusted than it,
eg a container image, so input/output files communicated by that are not
the source of security problems.
Except for some of the hard parts: progress displays, incremental
verification, and getting inputs before running a computation.
Untested! In order to test this, git-annex addcomputed needs to be
implemented.