diff --git a/doc/todo/compute_special_remote/comment_18_021858e8032eca84488ec2324ec25a6f._comment b/doc/todo/compute_special_remote/comment_18_021858e8032eca84488ec2324ec25a6f._comment new file mode 100644 index 0000000000..4740ce806f --- /dev/null +++ b/doc/todo/compute_special_remote/comment_18_021858e8032eca84488ec2324ec25a6f._comment @@ -0,0 +1,13 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 18""" + date="2025-02-19T18:29:58Z" + content=""" +I've started a `compute` branch which so far has documentation for +the [compute special remote](http://source.git-annex.branchable.com/?p=source.git;a=blob;f=doc/special_remotes/compute.mdwn;hb=refs/heads/compute), +[git-annex addcomputed](http://source.git-annex.branchable.com/?p=source.git;a=blob;f=doc/git-annex-addcomputed.mdwn;hb=refs/heads/compute), +and +[git-annex recompute](http://source.git-annex.branchable.com/?p=source.git;a=blob;f=doc/git-annex-recompute.mdwn;hb=refs/heads/compute) + +I am pretty happy with how this design is shaping up. +"""]] diff --git a/doc/todo/compute_special_remote/comment_19_fcba8049e659d3238b9f83286777f71f._comment b/doc/todo/compute_special_remote/comment_19_fcba8049e659d3238b9f83286777f71f._comment new file mode 100644 index 0000000000..f2e04df5a7 --- /dev/null +++ b/doc/todo/compute_special_remote/comment_19_fcba8049e659d3238b9f83286777f71f._comment @@ -0,0 +1,65 @@ +[[!comment format=mdwn + username="joey" + subject="""open questions""" + date="2025-02-19T18:39:41Z" + content=""" +One thing that I am unsure about is what should happen if `git-annex get foo` +needs the content of file `bar`, which is not present. Should it get `bar` from +a remote? Or should it fail to get `foo`? + +Consider that, in the case of `git-annex get foo --from computeremote`, the +user has asked it to get a file from that particular remote, not from +whatever remote contains `bar`. 
+ +If the same compute remote can also compute `bar`, it seems quite reasonable +for `git-annex get foo --from computeremote` to also compute `bar`. (This is +similar to a single computation that generates two output files, in which +case getting one of them will get both of them.) + +And it seems reasonable for `git-annex get foo` with no specified remote +to also get or compute `bar`, from wherever. + +But there is no way at the level of a special remote to tell the +difference between those two commands. + +Maybe the right answer is to define getting a file from a compute +special remote as including getting its inputs from other remotes. +It would prefer getting them from the same compute special remote when possible, +and otherwise use the lowest cost remote that works, same as `git-annex +get` does. + +---- + +A related problem is that `foo` might be fairly small, but `bar` very +large. So getting a small object can require getting or generating other +large objects. Getting `bar` might fail because there is not enough space +to meet annex.diskreserve. Or the user might just be surprised that so much +disk space was eaten up. But dropping `bar` after computing `foo` also +doesn't seem like a good idea; the user might want to hang onto their copy +now that they have it, or perhaps move it to some faster remote. + +Maybe preferred content is the solution? After computing `foo` with `bar`, +keep the copy of `bar` if the local repository wants it, drop it otherwise. + +---- + +Progress display is also going to be complicated for this. There is no +way in the special remote interface to display the progress for `bar` +while getting `foo`. + +Probably the thing to do would be to add together the sizes of both files, +and display a combined progress meter. +It would be ok to not say when it's getting the input file. +This will need a way to set the size for a progress display to larger +than the size of the key. + +---- + +.... 
All 3 problems above go away if it doesn't automatically get input files +before computations and the computations instead just fail with an error +saying the input file is not present. + +But then consider the case where you just want every file in the repository. +`git-annex get .` failing to compute some files because their input files +happen to come after them in the directory listing is not good. +"""]]
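
The input-fetching rule floated in comment 19 (prefer the same compute special remote for an input, otherwise fall back to the lowest cost remote that has it) can be sketched roughly as below. This is only an illustrative model in Python, not git-annex code: `Remote`, `pick_remote`, `get_with_inputs`, the `inputs_of` mapping, and the remote names and costs are all hypothetical, and the sketch assumes the input graph has no cycles.

```python
from dataclasses import dataclass, field


@dataclass
class Remote:
    # Hypothetical stand-in for a remote; not a git-annex data structure.
    name: str
    cost: int                               # lower cost is preferred
    keys: set = field(default_factory=set)  # files this remote can provide


def pick_remote(wanted, compute_remote, remotes):
    """Prefer the compute remote itself, else the cheapest remote with the file."""
    if wanted in compute_remote.keys:
        return compute_remote
    candidates = [r for r in remotes if wanted in r.keys]
    return min(candidates, key=lambda r: r.cost) if candidates else None


def get_with_inputs(target, inputs_of, compute_remote, remotes, present, log):
    """Get target from compute_remote, first obtaining any missing inputs.

    Assumes the dependency graph in inputs_of is acyclic.
    """
    for dep in inputs_of.get(target, []):
        if dep in present:
            continue
        source = pick_remote(dep, compute_remote, remotes)
        if source is None:
            raise LookupError(f"no remote can provide input {dep!r}")
        if source is compute_remote:
            # The input is itself computable, so compute it (and its inputs).
            get_with_inputs(dep, inputs_of, compute_remote, remotes, present, log)
        else:
            present.add(dep)
            log.append((dep, source.name))
    present.add(target)
    log.append((target, compute_remote.name))


if __name__ == "__main__":
    compute = Remote("computeremote", 100, {"foo"})
    others = [Remote("cloud", 200, {"bar"}), Remote("usbdrive", 50, {"bar"})]
    present, log = set(), []
    get_with_inputs("foo", {"foo": ["bar"]}, compute, others, present, log)
    print(log)
```

Under this model the order of files in a directory listing does not matter, since each get pulls in its own inputs first. Whether `bar` is kept afterwards (the preferred content question) and how a combined progress meter would be reported are deliberately left out of the sketch.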