Commit graph

35352 commits

Author SHA1 Message Date
Joey Hess
bb0bc078fc
Merge branch 'master' of ssh://git-annex.branchable.com 2025-03-11 11:13:21 -04:00
Joey Hess
b02aca8627
reorg and expand security section 2025-03-11 11:12:59 -04:00
yarikoptic
cb9c79c26c Added a comment 2025-03-11 15:09:20 +00:00
Joey Hess
a9df446d5d
expand 2025-03-10 17:35:34 -04:00
Joey Hess
106373c53b
response 2025-03-10 16:46:55 -04:00
Joey Hess
24b6f50b79
Merge branch 'master' of ssh://git-annex.branchable.com 2025-03-10 16:42:24 -04:00
Joey Hess
e0b7653495
added git-annex-compute-singularity
And implemented SANDBOX, which it needs.
2025-03-10 16:41:26 -04:00
Joey Hess
7bda5f470c
document output files must be regular files 2025-03-10 14:15:07 -04:00
Joey Hess
f59c0d1f07
make usage an error 2025-03-10 14:13:32 -04:00
yarikoptic
f36da19adb Added a comment 2025-03-09 01:02:55 +00:00
yarikoptic
1e6324c179 Added a comment: Any way to annotate what are input files? 2025-03-08 14:51:20 +00:00
Joey Hess
9d6c052c27
symlink, don't hardlink
hardlink can cause problems with unlocked files
2025-03-07 17:15:54 -04:00
Joey Hess
45d7f3ca4b
disconnect stdio for wasm binaries 2025-03-07 17:15:21 -04:00
Joey Hess
18be4910d8
use pwd and quote it
Seems more portable and safe
2025-03-07 16:06:37 -04:00
Joey Hess
5ef1c44e07
case 2025-03-07 16:03:35 -04:00
Joey Hess
10e36759bf
layout 2025-03-07 16:03:09 -04:00
Joey Hess
dcd7c207a8
layout 2025-03-07 16:02:43 -04:00
Joey Hess
2391c2802a
add git-annex-compute-wasmedge 2025-03-07 16:02:11 -04:00
Joey Hess
ed51924211
redirect command stdout to stderr
Otherwise it will be interpreted as compute program protocol
2025-03-07 16:01:27 -04:00
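To illustrate why ed51924211 matters: the commit message says a command's stdout would otherwise be interpreted as compute program protocol, so the command's output has to go somewhere else. A minimal Python sketch of the same idea follows. The helper name `run_child` is hypothetical, and this is not taken from any of the compute programs mentioned in this log.

```python
import subprocess
import sys


def run_child(argv):
    """Run a command with its stdout sent to our stderr (file descriptor 2).

    Hypothetical helper: it keeps our own stdout free for protocol messages.
    """
    return subprocess.run(argv, stdout=2, check=False)


if __name__ == "__main__":
    # The child's "noise" line appears on stderr, not on our stdout.
    run_child([sys.executable, "-c", "print('noise')"])
```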
Joey Hess
2c6dce83de
make OUTPUT subdirs
Simplifies compute programs.
2025-03-07 14:57:12 -04:00
Joey Hess
b4becb7167
Merge branch 'master' of ssh://git-annex.branchable.com 2025-03-07 14:50:11 -04:00
Joey Hess
81ce4264df
compute: add response to OUTPUT
This allows rejecting output filenames that are outside the repository,
and also handles converting eg "-foo" to "./-foo" to prevent a command
that it's passed to interpreting the output filename as a dashed option.
2025-03-07 14:47:34 -04:00
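The check described in 81ce4264df can be sketched roughly as follows. This is a minimal Python illustration, not git-annex's actual implementation; the function name `sanitize_output` and its exact rules are assumptions made for the example. It rejects paths that would escape the repository and prefixes a leading dash with `./` so that a command receiving the name cannot mistake it for an option.

```python
import posixpath
from typing import Optional


def sanitize_output(name: str) -> Optional[str]:
    """Return a safe relative path for an output file, or None to reject it.

    Hypothetical helper: the rules below are assumptions for illustration.
    """
    # Absolute paths always point outside the repository's working tree.
    if not name or posixpath.isabs(name):
        return None
    # Collapse "a/../b" style components so that escapes become visible.
    normalized = posixpath.normpath(name)
    if normalized in (".", "..") or normalized.startswith("../"):
        return None
    # A leading dash could be read as an option by a later command.
    if normalized.startswith("-"):
        normalized = "./" + normalized
    return normalized
```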
Joey Hess
6a8e57f0e9
remove todo I just added
If a compute program does this, it has a security hole. Not git-annex.
2025-03-07 13:29:57 -04:00
Joey Hess
78045f8e4f
todo 2025-03-07 13:24:11 -04:00
jasonb@ab4484d9961a46440958fa1a528e0fc435599057
b0d4fe5dd0 2025-03-07 04:13:24 +00:00
yarikoptic
27ef1a47df initial report on slow thaw 2025-03-06 22:40:35 +00:00
Joey Hess
1f59545ad0
improve 2025-03-06 14:54:05 -04:00
Joey Hess
138421449e
add git-annex-compute-imageconvert 2025-03-06 14:47:22 -04:00
Joey Hess
825a648670
prefix output with ./ in example 2025-03-06 14:42:07 -04:00
Joey Hess
b835c8c937
no longer a draft 2025-03-06 14:29:07 -04:00
Joey Hess
6f78341fbf
Merge branch 'compute' 2025-03-06 14:23:58 -04:00
Joey Hess
e952753846
preparing to merge compute 2025-03-06 14:22:45 -04:00
jerome.charousset@86fd8ed1bf55902989d7e70a11c38cb3a444b72d
203a730e28 Added a comment: Special use case for Scientific application 2025-03-06 17:02:22 +00:00
Joey Hess
1e9bb30c4e
update 2025-03-06 12:52:12 -04:00
matrss
629ab3f836 Added a comment 2025-03-05 15:40:44 +00:00
bpoldrack
9f045ed494 Added a comment 2025-03-05 14:23:57 +00:00
msz
62ab16aef3 Tag copy_file_range todo with projects/INM7 (came from our cluster) 2025-03-05 13:35:19 +00:00
msz
f1efad3b94 Added a comment: DataLad exploration of the compute on demand space 2025-03-05 13:31:41 +00:00
msz
e4232791dd Added a comment 2025-03-05 11:27:39 +00:00
kenta
5137cb6d16 filled out bug description 2025-03-05 00:00:19 +00:00
Joey Hess
a2fc471e14
safer git sha object filename
Rather than use the filename provided by INPUT, which could come from user
input, and so could be something that looks like a dashed parameter,
use a .git/object/<sha> filename.

This avoids user input passing through INPUT and back out, with the file
path then passed to a command, which could do something unexpected with
a dashed parameter, or other special parameter.

Added a note in the design about being careful of passing user input to
commands. They still have to be careful of that in general, just not in
this case.
2025-03-04 14:54:13 -04:00
Joey Hess
52f51d065a
rename config to annex.security.allowed-compute-programs
And require for enable as well as autoenable.

It seemed to be asking for trouble for `git-annex enable foo` to use whatever
compute program is stored in the git config, without verifying that the
user wants that program to be used.

Note that it would be good to allow `git-annex enable foo program=...`
to be used without the program being in the git config. Not implemented yet
though.
2025-03-03 16:12:03 -04:00
Joey Hess
f32d2aecce
autoenable security for compute special remote
Added annex.security.autoenable-compute-programs and only allow
autoenabling special remotes that use compute programs on that list.

The reason this is needed is that a user might have some compute programs
that are less safe to use than others. They might want to use an unsafe
one only with one repository, where they are the only committer or other
committers are trusted. They might be ok with others being used by any
repository, and if so they can add them to the list.

Another reason would be a user who has installed a compute program by
accident. Eg, it might be included with git-annex at some point, or
pulled in by some dependency. That user doesn't necessarily want that
compute program to be used in an autoenabled special remote.
2025-03-03 15:52:56 -04:00
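The gating logic described in f32d2aecce amounts to an allowlist check before autoenabling. A rough Python sketch, assuming the configured value is a whitespace-separated list of program names (the value format and the function name `may_autoenable` are assumptions for this example, not taken from git-annex):

```python
def may_autoenable(program: str, allowed_config: str) -> bool:
    """Return True only if `program` appears on the user's allowlist.

    `allowed_config` stands in for the configured list; treating it as
    whitespace-separated names is an assumption made for this sketch.
    """
    allowed = set(allowed_config.split())
    # An empty or unset list allows nothing, so an accidentally installed
    # compute program is never used by an autoenabled special remote.
    return program in allowed
```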
Joey Hess
89bfeada87
recompute: display one of the changed files 2025-03-03 15:12:19 -04:00
Joey Hess
a0d6a6ea2a
support git files as input to computations
Using GIT keys, like those used when exporting git files to special
remotes. Except here the GIT key refers to a file checked into the git
repo.

Note that, since the compute remote uses catObject to get the content,
a symlink that is checked into git does not get followed. This is important
for security, because following a symlink and adding the content to the
repo as an annex object would allow exfiltrating content from outside
the repository.

Instead, the behavior with a symlink is to run the computation on the
symlink target. This may turn out to be confusing, and it might be worth
addcomputed checking if the file in git is a symlink and erroring out.
Or it could follow symlinks as long as the destination is a file in the
repository.
2025-03-03 12:09:25 -04:00
czard
d7569351bf Added a comment: Permission fix 2025-03-03 12:08:28 +00:00
Joey Hess
e6ae5e8d56
many recompute improvements
I've lost track of them all, but it includes:

* Using the same key backend as was used in the original computation.
* Fixing a bug that prevented updating the source file key in the compute
  state.
* Handling --reproducible and --unreproducible.
* recompute --original of a file using VURL, when the result is
  different but the key remains the same, updates the object file
  with the new content.
* Detecting some other ways the program behavior can change, just for
  completeness.
* Also adds --backend to addcomputed.
2025-02-27 15:18:27 -04:00
dmcardle
08d998a8ee Added a comment 2025-02-27 19:02:14 +00:00
Joey Hess
9c2c3002a6
fix recompute of renamed files
When a computed file has been renamed, a recompute needs to write to the
new filename.

I decided to remove --others because it's not clear what it should do in
the face of renames. Should it update only other files that have not
been renamed? Or update files that use the old key to the new key
anywhere in the tree? Or write the other files to the cwd, ignoring
renames? Since --others is just a way to save on compute time, adding
this complexity at this point seems like a bad idea. May revisit later.

Added temporary TODO-compute file
2025-02-27 11:27:26 -04:00
Joey Hess
d6a010a615
recompute closer to working properly
Proper behavior without --others implemented.

And eliminated most of the code duplication through refactoring.

Also, changed it to not stage recomputed files. This way, git diff will
show files that have differences.
2025-02-26 15:52:52 -04:00