git-annex

Author	SHA1	Message	Date
Joey Hess	bcfd554a0f	findcomputed: New command, displays information about computed files.	2025-03-18 12:55:48 -04:00
Joey Hess	d74d2d5d91	--json for addcomputed and recompute Not very useful, but it does work.	2025-03-17 15:51:43 -04:00
Joey Hess	a0d6a6ea2a	support git files as input to computations Using GIT keys, like those used when exporting git files to special remotes. Except here the GIT key refers to a file checked into the git repo. Note that, since the compute remote uses catObject to get the content, a symlink that is checked into git does not get followed. This is important for security, because following a symlink and adding the content to the repo as an annex object would allow exfiltrating content from outside the repository. Instead, the behavior with a symlink is to run the computation on the symlink target. This may turn out to be confusing, and it might be worth addcomputed checking if the file in git is a symlink and erroring out. Or it could follow symlinks as long as the destination is a file in the repository.	2025-03-03 12:09:25 -04:00
Joey Hess	e6ae5e8d56	many recompute improvements I've lost track of them all, but it includes: * Using the same key backend as was used in the original computation. * Fixing bug that prevented updating the source file key in the compute state * Handling --reproducible and --unreproducible. * recompute --original of a file using VURL, when the result is different, but the key remains the same, makes the object file be updated with the new content * Detecting some other ways the program behavior can change, just for completeness. * Also adds --backend to addcomputed.	2025-02-27 15:18:27 -04:00
Joey Hess	3bec89a3c3	started git-annex recompute The perform action of this still needs work to do the right thing. In particular, it currently behaves as if --others was always set. And, it duplicates a lot of code from addcomputed.	2025-02-26 11:54:09 -04:00
Joey Hess	eed522a0f8	addcomputed inherits extra initremote parameters This is limited because the remote config is a field/value map. So order is not preserved, and when 2 parameters have the same field name, only the last one will be passed.	2025-02-26 09:45:35 -04:00
Joey Hess	71e92a509a	use compute program REPRODUCIBLE by default	2025-02-25 17:10:41 -04:00
Joey Hess	16f529c05f	addcomputed --fast and --unreproducible working For these, use VURL and URL keys, with an "annex-compute:" URI prefix. These URL keys will look something like this: URL--annex-compute&cbar4,63pconvert,3-f4d3d72cf3f16ac9c3e9a8012bde4462 Generally it's too long so most of it gets md5summed. It's a little ugly, but it's what fell out of the existing URL key generation machinery. I did consider special casing to eg "URL--annex-compute&c4d3d72cf3f16ac9c3e9a8012bde4462". But it seems at least possibly useful that the name of the file that was computed is visible and perhaps one or two words of the git-annex compute command parameters. Note that two different output files from the same computation will get the same URL key. And these keys should remain stable.	2025-02-25 16:43:15 -04:00
Joey Hess	2e1fe1620e	handle computations in subdirs of the git repository Eg, a computation might be run in "foo/" and refer to "../bar" as an input or output. So, the subdir is part of the computation state. Also, prevent input or output of files that are outside the git repository. Of course, the program can access any file on disk if it wants to; this is just a guard against mistakes. And it may also be useful if the program communicates with something less trusted than it, eg a container image, so input/output files communicated by that are not the source of security problems.	2025-02-25 15:08:38 -04:00
Joey Hess	556f44d404	update for new interface	2025-02-24 16:15:04 -04:00
Joey Hess	b5319ec575	documentation for compute remote and associated commands None of this is implemented yet.	2025-02-19 14:29:18 -04:00
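
Commit 2e1fe1620e describes resolving input and output paths relative to the subdirectory a computation runs in, and refusing paths that end up outside the git repository. The following is a minimal Python sketch of that idea, not git-annex's actual Haskell implementation. The function name `resolve_in_repo` and the `/repo` path are hypothetical, and the check is purely lexical (it does not follow symlinks).

```python
import os


def resolve_in_repo(repo_root, subdir, path):
    """Resolve `path` as seen from `subdir` of the repository.

    Returns the path relative to the repository root, or raises
    ValueError when the path would land outside the repository.
    Hypothetical helper for illustration; lexical check only.
    """
    root = os.path.abspath(repo_root)
    # Joining with an absolute `path` discards the earlier parts,
    # so absolute paths are also caught by the check below.
    candidate = os.path.normpath(os.path.join(root, subdir, path))
    if os.path.commonpath([root, candidate]) != root:
        raise ValueError(f"{path!r} is outside the repository")
    return os.path.relpath(candidate, root)


if __name__ == "__main__":
    # A computation run in "foo/" that refers to "../bar".
    print(resolve_in_repo("/repo", "foo", "../bar"))
    try:
        resolve_in_repo("/repo", "foo", "../../etc/passwd")
    except ValueError as err:
        print("rejected:", err)
```

As the commit message notes, a guard like this only protects against mistakes in the declared inputs and outputs; the program itself can still access any file on disk.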

11 commits
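
Commit eed522a0f8 notes that the remote config is a field/value map, so parameter order is lost and only the last of two parameters with the same field name survives. A minimal Python sketch of that limitation follows. The helper `to_config_map` and the parameter names are hypothetical, and the result is sorted by field name only to model a map that does not remember the original order (Python dicts would otherwise keep insertion order).

```python
def to_config_map(params):
    """Collapse "field=value" strings into a field/value map.

    Hypothetical helper for illustration: later values overwrite
    earlier ones, and the original ordering is discarded.
    """
    config = {}
    for param in params:
        field, _, value = param.partition("=")
        config[field] = value  # a repeated field keeps only the last value
    # Sort by field name to model a map with no memory of input order.
    return dict(sorted(config.items()))


if __name__ == "__main__":
    # Hypothetical parameters; "passes" appears twice.
    print(to_config_map(["passes=10", "mode=fast", "passes=20"]))
```

A program that depends on the order of its extra parameters, or on repeated parameters, would therefore not see them as originally written.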