**draft**

The [[special_remotes/compute]] special remote uses this interface to run
compute programs.

When a compute special remote is initialized with `git-annex initremote`,
a program is specified:

    git-annex initremote myremote type=compute program=git-annex-compute-foo

The user adds an annexed file that is computed by the program by running
a command like one of these:

    git-annex addcomputed --to=myremote -- convert file.raw file.jpeg passes=10
    git-annex addcomputed --to=myremote -- compress in out --level=9
    git-annex addcomputed --to=myremote -- clip foo 2:01-3:00 combine with bar to baz

Whatever values the user passes to `git-annex addcomputed` are passed to
the program in `ARGV`, followed by any values that the user provided to
`git-annex initremote`.

To simplify the program's option parsing, any value that the user provides
that is in the form "foo=bar" will also result in an environment variable
being set, eg `ANNEX_COMPUTE_passes=10` or `ANNEX_COMPUTE_--level=9`.

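As a rough illustration of the two mechanisms above, a program written in
POSIX shell might tell settings apart from positional parameters along these
lines. The helper names `describe_args` and `passes_setting` are hypothetical,
and the default of one pass is an assumption borrowed from the example script
in this document.

```shell
#!/bin/sh
# Hypothetical helper: classify each parameter the way a compute program
# might when walking its ARGV.
describe_args() {
    for arg in "$@"; do
        case "$arg" in
            *=*) printf 'setting %s=%s\n' "${arg%%=*}" "${arg#*=}" ;;
            *)   printf 'positional %s\n' "$arg" ;;
        esac
    done
}

# Hypothetical helper: read the "passes" setting from the environment,
# assuming a default of 1 when the user did not provide it.
passes_setting() {
    printf '%s\n' "${ANNEX_COMPUTE_passes:-1}"
}
```

Calling `describe_args "$@"` at the top of a program is one way to check what
it was actually given before doing any work.
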
For security, the program should avoid exposing user input to the shell
unprotected, or otherwise executing it.

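One possible precaution is to validate a value before using it. The sketch
below assumes a setting such as `passes` that should be a non-negative
integer. The `is_uint` helper is hypothetical and not something git-annex
provides.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if $1 is non-empty and consists
# solely of decimal digits.
is_uint() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
        *) return 0 ;;
    esac
}
```

A program could run `is_uint "$ANNEX_COMPUTE_passes" || exit 1` before passing
the value on, and should still quote every variable expansion.
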
The program is run in a temporary directory, which will be cleaned up after
it exits. Note that it may be run in a subdirectory of that temporary
directory. This happens when `git-annex addcomputed` was run in a
subdirectory of the git repository.

The content of any annexed file in the repository can be an input
to the computation. The program requests an input by writing a line to
stdout:

    INPUT file.raw

Then it can read a line from stdin, which will be the path to the content
(eg a `.git/annex/objects/` path).

If the program needs multiple input files, it should output multiple
`INPUT` lines first, and then read multiple paths from stdin. This
allows retrieval of the inputs to potentially run in parallel.

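The ordering described above can be sketched in shell as follows. The helper
name `request_two_inputs` and the variables `path_a` and `path_b` are
hypothetical, and the sketch assumes that paths arrive in the same order as
the requests.

```shell
#!/bin/sh
# Hypothetical helper: request two inputs up front, then read both paths.
# Sets path_a and path_b; returns nonzero if either path cannot be read.
request_two_inputs() {
    # Emit every INPUT line before reading any response.
    printf 'INPUT %s\n' "$1"
    printf 'INPUT %s\n' "$2"
    # Assumption: responses arrive one per line, in request order.
    IFS= read -r path_a || return 1
    IFS= read -r path_b || return 1
}
```

A program would call it as `request_two_inputs "$2" "$3" || exit 1` and then
use `"$path_a"` and `"$path_b"` as the files to read.
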
If an input file is not available, the program's stdin will be closed
without a path being written to it. So when reading from stdin fails,
the program should exit.

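A minimal sketch of that check, for a program that does not rely on `set -e`,
might look like this. The helper name `read_path_or_exit`, the variable
`input_path`, the message text, and the exit status of 1 are all assumptions
made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read one path from stdin into input_path, or exit
# when stdin has been closed because the input is not available.
read_path_or_exit() {
    if ! IFS= read -r input_path; then
        echo "input not available" >&2
        exit 1
    fi
}
```

An empty line still counts as a successful read here, so `input_path` may be
empty after the call.
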
When `git-annex addcomputed --fast` is being used to add a computation
to the git-annex repository without actually performing it, the
response to each "INPUT" will be an empty line rather than the path to
an input file. In that case, the program should proceed with the rest of
its output to stdout (eg "OUTPUT" and "REPRODUCIBLE"), but should not
perform any computation.

For each output file that it will compute, the program should write a
line to stdout:

    OUTPUT file.jpeg

The filename of the output file is both the filename in the program's
temporary directory, and also the filename that will be added to the
git-annex repository by `git-annex addcomputed`.

If git-annex sees that an output file is growing, it will use the file's size
when displaying progress to the user. So if possible, the program should
write the content to the file it is computing directly, rather than writing
to somewhere else and renaming it at the end. But if the program seeks
around and writes out of order, it should write to a file somewhere else
and rename it at the end.

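For the out-of-order case, the write-then-rename approach could be sketched
as below. The helper name `write_then_rename` and the scratch file naming
(`.partial.` plus the process ID) are assumptions for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: run a command against a scratch file, then move the
# finished file into place.
# Usage: write_then_rename OUTPUT COMMAND [ARGS...]
# The command receives the scratch file name as its last argument.
write_then_rename() {
    out="$1"
    shift
    scratch="$out.partial.$$"
    mkdir -p "$(dirname "$out")"
    if "$@" "$scratch"; then
        # The output only appears under its final name once it is complete.
        mv -f "$scratch" "$out"
    else
        rm -f "$scratch"
        return 1
    fi
}
```

With this approach git-annex cannot watch the output file grow, so the
program would need to report progress some other way.
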
The program can also output lines to stdout to indicate its current
progress:

    PROGRESS 50%

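A program that processes a known number of work units could derive such a
line as sketched here. The helper name `report_progress` is hypothetical, and
rounding down to a whole percentage is an assumption based on the example
above.

```shell
#!/bin/sh
# Hypothetical helper: print a PROGRESS line from units done and the total.
# Assumes both are non-negative integers and the total is greater than zero.
report_progress() {
    printf 'PROGRESS %d%%\n' "$(( $1 * 100 / $2 ))"
}
```

For example, `report_progress 5 10` writes `PROGRESS 50%` to stdout.
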
The program can optionally also output a "REPRODUCIBLE" line. This
indicates that the results of its computations are expected to be
bit-for-bit reproducible, and makes `git-annex addcomputed` behave as if
the `--reproducible` option were set.

Anything that the program outputs to stderr will be displayed to the user.
Stderr should be used for error messages, and possibly computation
output, but not for progress displays.

If the program exits nonzero, nothing it computed will be stored in the
git-annex repository.

An example `git-annex-compute-foo` shell script follows:

    #!/bin/sh
    set -e
    # Only the "convert" subcommand is supported.
    if [ "$1" != "convert" ]; then
        echo "Usage: convert input output [passes=n]" >&2
        exit 1
    fi
    # Default to a single pass when passes=n was not provided.
    if [ -z "$ANNEX_COMPUTE_passes" ]; then
        ANNEX_COMPUTE_passes=1
    fi
    # Request the input file and read back the path to its content.
    # With set -e, the script exits here if stdin has been closed.
    echo "INPUT $2"
    read -r input
    echo "OUTPUT $3"
    echo REPRODUCIBLE
    # An empty path means --fast is in use, so skip the computation.
    if [ -n "$input" ]; then
        mkdir -p "$(dirname "$3")"
        frobnicate --passes="$ANNEX_COMPUTE_passes" <"$input" >"$3"
    fi