diff --git a/doc/todo/compute_special_remote/comment_9_2e10caa2ecbba0f53a3ab031a94c9907._comment b/doc/todo/compute_special_remote/comment_9_2e10caa2ecbba0f53a3ab031a94c9907._comment
new file mode 100644
index 0000000000..fc72c6e6a9
--- /dev/null
+++ b/doc/todo/compute_special_remote/comment_9_2e10caa2ecbba0f53a3ab031a94c9907._comment
@@ -0,0 +1,74 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 9"""
+ date="2025-01-27T14:46:43Z"
+ content="""
+Circling back to this, I think the fork in the road is whether this is
+about git-annex providing this and that feature to support external special
+remotes that compute, or whether git-annex gets a compute special
+remote of its own with some simpler/better extension interface
+than the external special remote protocol.
+
+Of course, git-annex having its own compute special remote would not
+preclude other external special remotes that compute. And for that matter,
+a single external special remote could implement an extension interface.
+
+---
+
+Thinking about how a generic compute special remote in git-annex could
+work, multiple instances of it could be initremoted:
+
+    git-annex initremote convertfiles type=compute program=csv-to-xslx
+    git-annex initremote cutvideo type=compute program=ffmpeg-cut
+
+Here the "program" parameter would cause a program like
+`git-annex-compute-ffmpeg-cut` to be run to get files from that instance
+of the compute special remote. The interface could be as simple as it
+being run with the key that it is requested to compute, and outputting
+the paths to all the keys it was able to compute. (So allowing for
+"request one key, receive many".) Perhaps also with some way to indicate
+progress of the computation.
+
+It would make sense to store the details of computations in git-annex
+metadata. And a compute program can use git-annex commands to get files
+it depends on. Eg, `git-annex-compute-ffmpeg-cut` could run:
+
+    # look up the configured metadata
+    starttime=$(git-annex metadata --get compute-ffmpeg-starttime --key="$requested")
+    endtime=$(git-annex metadata --get compute-ffmpeg-endtime --key="$requested")
+    source=$(git-annex metadata --get compute-ffmpeg-source --key="$requested")
+
+    # get the source video file
+    git-annex get --key="$source"
+    git-annex examinekey --format='${objectpath}' "$source"
+
+It might be worth formalizing that a given computed key can depend on other
+keys, and have git-annex always get/compute those keys first.
+
+When asked to store a key in the compute special remote, it would verify
+that the key can be generated by it, using the same interface as used to
+get a key.
+
+This all leaves a chicken-and-egg problem: how does the user add a computed
+file if they don't know the key yet?
+
+The user could manually run the commands that generate the computed file,
+then `git-annex add` it, and set the metadata. Then `git-annex copy --to`
+the compute remote would verify if the file can be generated, and add it if
+so. This seems awkward, but also nice to be able to do manually.
+
+Or, something like VURL keys could be used, with an interface something
+like this:
+
+    git-annex addcomputed foo --to ffmpeg-cut \
+        --input compute-ffmpeg-source=input.mov \
+        --set compute-ffmpeg-starttime=15:00 \
+        --set compute-ffmpeg-endtime=30:00
+
+All that would do is generate some arbitrary VURL key or similar,
+provisionally set the provided metadata (how?), and try to store the key
+in the compute special remote. If it succeeds, stage an annex pointer
+and commit the metadata. Since it's a VURL key, storing the key in the
+compute special remote would also record the hash of the generated file
+at that point.
+"""]]