Commit graph

35468 commits

Author SHA1 Message Date
matrss
629ab3f836 Added a comment 2025-03-05 15:40:44 +00:00
bpoldrack
9f045ed494 Added a comment 2025-03-05 14:23:57 +00:00
msz
62ab16aef3 Tag copy_file_range todo with projects/INM7 (came from our cluster) 2025-03-05 13:35:19 +00:00
msz
f1efad3b94 Added a comment: DataLad exploration of the compute on demand space 2025-03-05 13:31:41 +00:00
msz
e4232791dd Added a comment 2025-03-05 11:27:39 +00:00
kenta
5137cb6d16 filled out bug description 2025-03-05 00:00:19 +00:00
Joey Hess
a2fc471e14
safer git sha object filename
Rather than use the filename provided by INPUT, which could come from user
input, and so could be something that looks like a dashed parameter,
use a .git/object/<sha> filename.

This avoids user input passing through INPUT and back out, with the file
path then passed to a command, which could do something unexpected with
a dashed parameter, or other special parameter.

Added a note in the design about being careful of passing user input to
commands. They still have to be careful of that in general, just not in
this case.
2025-03-04 14:54:13 -04:00
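The idea in the commit above can be sketched as follows. This is a minimal illustration, not git-annex's code (git-annex is written in Haskell): `object_filename` and `looks_like_option` are hypothetical helpers, and the path layout simply follows the commit message. The point is that a filename built only from a validated hex sha can never begin with a dash, whereas a filename taken from user input can.

```python
import re

# Accept only full-length lowercase hex object ids (SHA-1 or SHA-256).
_SHA_RE = re.compile(r"[0-9a-f]{40}|[0-9a-f]{64}")


def object_filename(sha: str) -> str:
    """Build a filename from a validated sha (hypothetical helper).

    The layout mirrors the commit message and is illustrative only.
    """
    if not _SHA_RE.fullmatch(sha):
        raise ValueError(f"not a git sha: {sha!r}")
    return ".git/object/" + sha


def looks_like_option(filename: str) -> bool:
    """True if a command could mistake this filename for a dashed parameter."""
    return filename.startswith("-")


if __name__ == "__main__":
    user_supplied = "--output=/etc/passwd"  # hostile name arriving as user input
    print(looks_like_option(user_supplied))
    print(looks_like_option(object_filename("a" * 40)))
```

Because the derived name always starts with `.git/`, it is safe to hand to a command even when the name that arrived as input was hostile.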
Joey Hess
52f51d065a
rename config to annex.security.allowed-compute-programs
And require for enable as well as autoenable.

It seemed asking for trouble for `git-annex enable foo` to use whatever
compute program is stored in the git config, without verifying that the
user wants that program to be used.

Note that it would be good to allow `git-annex enable foo program=...`
to be used without the program being in the git config. Not implemented yet
though.
2025-03-03 16:12:03 -04:00
Joey Hess
f32d2aecce
autoenable security for compute special remote
Added annex.security.autoenable-compute-programs and only allow
autoenabling special remotes that use compute programs on that list.

The reason this is needed is a user might have some compute programs
that are less safe to use than others. They might want to use an unsafe
one only with one repository, where they are the only committer or other
committers are trusted. They might be ok with others being used by any
repository, and if so they can add them to the list.

Another reason would be a user who has installed a compute program by
accident. Eg, it might be included with git-annex at some point, or
pulled in by some dependency. That user doesn't necessarily want that
compute program to be used in an autoenabled special remote.
2025-03-03 15:52:56 -04:00
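The allowlist check described in the commit above might look roughly like this. It is a sketch under assumptions: the commit does not say how the config value is formatted, so a whitespace-separated list is assumed here, and `may_autoenable` and the program names are hypothetical.

```python
def may_autoenable(program: str, allowed_config: str) -> bool:
    """Return True only if `program` is on the user's allowlist.

    `allowed_config` stands in for the value of
    annex.security.autoenable-compute-programs; treating it as a
    whitespace-separated list is an assumption of this sketch.
    """
    allowed = allowed_config.split()
    return program in allowed


if __name__ == "__main__":
    config_value = "resize-images transcode"  # hypothetical program names
    print(may_autoenable("resize-images", config_value))
    print(may_autoenable("run-anything", config_value))
```

An empty or unset value allows nothing, which matches the motivation given in the commit: an installed program is not used by an autoenabled special remote unless the user has opted in.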
Joey Hess
89bfeada87
recompute: display one of the changed files 2025-03-03 15:12:19 -04:00
Joey Hess
a0d6a6ea2a
support git files as input to computations
Using GIT keys, like those used when exporting git files to special
remotes. Except here the GIT key refers to a file checked into the git
repo.

Note that, since the compute remote uses catObject to get the content,
a symlink that is checked into git does not get followed. This is important
for security, because following a symlink and adding the content to the
repo as an annex object would allow exfiltrating content from outside
the repository.

Instead, the behavior with a symlink is to run the computation on the
symlink target. This may turn out to be confusing, and it might be worth
addcomputed checking if the file in git is a symlink and erroring out.
Or it could follow symlinks as long as the destination is a file in the
repository.
2025-03-03 12:09:25 -04:00
czard
d7569351bf Added a comment: Permission fix 2025-03-03 12:08:28 +00:00
Joey Hess
e6ae5e8d56
many recompute improvements
I've lost track of them all, but it includes:

* Using the same key backend as was used in the original computation.
* Fixing bug that prevented updating the source file key in the compute
  state
* Handling --reproducible and --unreproducible.
* recompute --original of a file using VURL, when the result is
  different, but the key remains the same, makes the object file
  be updated with the new content
* Detecting some other ways the program behavior can change, just for
  completeness.
* Also adds --backend to addcomputed.
2025-02-27 15:18:27 -04:00
dmcardle
08d998a8ee Added a comment 2025-02-27 19:02:14 +00:00
Joey Hess
9c2c3002a6
fix recompute of renamed files
When a computed file has been renamed, a recompute needs to write to the
new filename.

I decided to remove --others because it's not clear what it should do in
the face of renames. Should it update only other files that have not
been renamed? Or update files that use the old key to the new key
anywhere in the tree? Or write the other files to the cwd, ignoring
renames? Since --others is just a way to save on compute time, adding
this complexity at this point seems like a bad idea. May revisit later.

Added temporary TODO-compute file
2025-02-27 11:27:26 -04:00
Joey Hess
d6a010a615
recompute closer to working properly
Proper behavior without --others implemented.

And eliminated most of the code duplication through refactoring.

Also, changed it to not stage recomputed files. This way, git diff will
show files that have differences.
2025-02-26 15:52:52 -04:00
Joey Hess
3bec89a3c3
started git-annex recompute
The perform action of this still needs work to do the right thing.
In particular, it currently behaves as if --others was always set.
And, it duplicates a lot of code from addcomputed.
2025-02-26 11:54:09 -04:00
Joey Hess
eed522a0f8
addcomputed inherits extra initremote parameters
This is limited because the remote config is a field/value map. So order
is not preserved, and when 2 parameters have the same field name, only
the last one will be passed.
2025-02-26 09:45:35 -04:00
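The limitation noted in the commit above follows directly from storing parameters in a field/value map. A minimal sketch, assuming parameters of the form `field=value`; `to_remote_config` is a hypothetical helper, and sorting the result by field name is only meant to mimic a map that does not remember insertion order.

```python
def to_remote_config(params):
    """Collapse `field=value` parameters into a field/value map.

    Returns the entries sorted by field name, so the original
    parameter order is lost, and a repeated field keeps only its
    last value.
    """
    config = {}
    for param in params:
        field, _, value = param.partition("=")
        config[field] = value  # a later value overwrites an earlier one
    return sorted(config.items())


if __name__ == "__main__":
    print(to_remote_config(["passes=2", "mode=fast", "passes=3"]))
```

Given `passes=2 mode=fast passes=3`, only `passes=3` survives and `mode` no longer sits between the two `passes` parameters.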
Joey Hess
2b8428bb17
wording 2025-02-25 17:26:28 -04:00
Joey Hess
f8c7cea019
update demo program
needed a mkdir
2025-02-25 17:23:38 -04:00
Joey Hess
71e92a509a
use compute program REPRODUCIBLE by default 2025-02-25 17:10:41 -04:00
Joey Hess
16f529c05f
addcomputed --fast and --unreproducible working
For these, use VURL and URL keys, with an "annex-compute:" URI prefix.

These URL keys will look something like this:

	URL--annex-compute&cbar4,63pconvert,3-f4d3d72cf3f16ac9c3e9a8012bde4462

Generally it's too long so most of it gets md5summed. It's a little
ugly, but it's what fell out of the existing URL key generation
machinery. I did consider special casing to eg
"URL--annex-compute&c4d3d72cf3f16ac9c3e9a8012bde4462". But it seems at
least possibly useful that the name of the file that was computed is
visible and perhaps one or two words of the git-annex compute command
parameters.

Note that two different output files from the same computation will get
the same URL key. And these keys should remain stable.
2025-02-25 16:43:15 -04:00
wolf480@8ad1ccdd08efc303a88f7e88c4e629be6637a44e
0713888c7e 2025-02-25 19:58:35 +00:00
wolf480@8ad1ccdd08efc303a88f7e88c4e629be6637a44e
dce725f849 create bug report: creating can't pass spaces in youtube-dl-options 2025-02-25 19:43:44 +00:00
Joey Hess
2e1fe1620e
handle computations in subdirs of the git repository
Eg, a computation might be run in "foo/" and refer to "../bar" as an
input or output.

So, the subdir is part of the computation state.

Also, prevent input or output of files that are outside the git
repository. Of course, the program can access any file on disk if it
wants to; this is just a guard against mistakes. And it may also be
useful if the program communicates with something less trusted than it,
eg a container image, so input/output files communicated by that are not
the source of security problems.
2025-02-25 15:08:38 -04:00
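The path handling described in the commit above can be sketched like this. It is an illustration only: `resolve_in_repo` is a hypothetical helper, paths are treated as POSIX-style strings relative to the repository top, and symlinks are not considered.

```python
import posixpath


def resolve_in_repo(subdir: str, path: str) -> str:
    """Resolve `path`, given relative to `subdir`, to a repo-relative path.

    Raises ValueError if the result would lie outside the repository.
    This works purely on path strings; it does not look at the disk.
    """
    if posixpath.isabs(path):
        raise ValueError(f"absolute path not allowed: {path!r}")
    combined = posixpath.normpath(posixpath.join(subdir, path))
    if combined == ".." or combined.startswith("../"):
        raise ValueError(f"path is outside the repository: {path!r}")
    return combined


if __name__ == "__main__":
    # A computation run in "foo/" that refers to "../bar".
    print(resolve_in_repo("foo", "../bar"))
```

As the commit says, a check like this guards against mistakes and does not sandbox the program, which can still open any file it likes.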
Joey Hess
27ed2f151e
updated interface 2025-02-24 16:15:46 -04:00
Joey Hess
556f44d404
update for new interface 2025-02-24 16:15:04 -04:00
Joey Hess
921850d05c
support addcomputed --fast
This complicates the interface but it's still simpler to understand than
the old interface.
2025-02-24 13:48:46 -04:00
Joey Hess
490174b068
new compute program interface
This is much more flexible, and also simpler to understand.
2025-02-24 12:44:20 -04:00
Basile.Pinsard
7b815199a0 2025-02-24 16:36:56 +00:00
jnkl
0238af33c1 2025-02-23 20:56:00 +00:00
jnkl
8d97fe962b 2025-02-23 20:55:22 +00:00
jnkl
3e86ce3d14 2025-02-23 20:54:56 +00:00
jnkl
1eaa4a9b86 2025-02-23 20:48:35 +00:00
jnkl
3885544628 2025-02-23 20:25:08 +00:00
Joey Hess
e3023f7b0b
Merge branch 'master' of ssh://git-annex.branchable.com 2025-02-22 10:04:58 -04:00
Joey Hess
0c65129ddc
distribits 2025 2025-02-22 10:04:28 -04:00
Atemu
aa2fbf552b 2025-02-22 10:51:45 +00:00
Atemu
958e9068e3 2025-02-22 10:50:55 +00:00
Atemu
8d19f7f09e 2025-02-22 10:48:23 +00:00
Joey Hess
4c032655c2
Merge branch 'master' of ssh://git-annex.branchable.com 2025-02-21 15:31:20 -04:00
Joey Hess
b804f8a3cc
update 2025-02-21 15:09:46 -04:00
yarikoptic
b567aac217 map --json wishlist 2025-02-21 15:31:35 +00:00
yarikoptic
0c98cf9a05 initial report about map infinite loop 2025-02-21 15:28:27 +00:00
Joey Hess
e897229088
wip 2025-02-20 17:23:15 -04:00
Joey Hess
4f3d9f8115
update 2025-02-20 13:27:59 -04:00
Joey Hess
c1b53dbbd0
wip 2025-02-20 13:27:47 -04:00
lell
dd4c1c570e Added a comment 2025-02-20 11:00:05 +00:00
Spencer
4413bba69d Added a comment: Confused 2025-02-19 23:22:46 +00:00
Spencer
4414e97b9b Added a comment: For Those Who Stumble Here 2025-02-19 23:08:42 +00:00