Commit graph

46383 commits

Author SHA1 Message Date
Joey Hess
e6ae5e8d56
many recompute improvements
I've lost track of them all, but it includes:

* Using the same key backend as was used in the original computation.
* Fixing bug that prevented updating the source file key in the compute
  state
* Handling --reproducible and --unreproducible.
* When recompute --original of a file using VURL produces a different
  result but the key remains the same, the object file is updated with
  the new content
* Detecting some other ways the program behavior can change, just for
  completeness.
* Also adds --backend to addcomputed.
2025-02-27 15:18:27 -04:00
Joey Hess
1704b5e327
refactoring 2025-02-27 14:54:03 -04:00
Joey Hess
9c2c3002a6
fix recompute of renamed files
When a computed file has been renamed, a recompute needs to write to the
new filename.

I decided to remove --others because it's not clear what it should do in
the face of renames. Should it update only other files that have not
been renamed? Or update files that use the old key to the new key
anywhere in the tree? Or write the other files to the cwd, ignoring
renames? Since --others is just a way to save on compute time, adding
this complexity at this point seems like a bad idea. May revisit later.

Added temporary TODO-compute file
2025-02-27 11:27:26 -04:00
Joey Hess
5d2a608a56
todo 2025-02-26 15:59:47 -04:00
Joey Hess
d6a010a615
recompute closer to working properly
Proper behavior without --others implemented.

And eliminated most of the code duplication through refactoring.

Also, changed it to not stage recomputed files. This way, git diff will
show files that have differences.
2025-02-26 15:52:52 -04:00
Joey Hess
53d107ca47
refactor 2025-02-26 14:05:37 -04:00
Joey Hess
3bec89a3c3
started git-annex recompute
The perform action of this still needs work to do the right thing.
In particular, it currently behaves as if --others was always set.
And, it duplicates a lot of code from addcomputed.
2025-02-26 11:54:09 -04:00
Joey Hess
d49f371acc
showOutput
when the compute program eg displays usage, it needs to start on its own
line
2025-02-26 09:47:56 -04:00
Joey Hess
eed522a0f8
addcomputed inherits extra initremote parameters
This is limited because the remote config is a field/value map. So order
is not preserved, and when 2 parameters have the same field name, only
the last one will be passed.
2025-02-26 09:45:35 -04:00
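
The limitation described in the commit above can be illustrated with a minimal Python sketch. It is not git-annex code: the function name `to_config_map` and the field names `verbose` and `passthrough` are hypothetical, and sorting by field name is only an assumption used to model a map that does not remember the original parameter order.

```python
def to_config_map(params):
    """Collapse ordered (field, value) pairs into a field/value map.

    Hypothetical helper for illustration only; not part of git-annex.
    """
    config = {}
    for field, value in params:
        # A repeated field name overwrites the earlier value,
        # so only the last one survives.
        config[field] = value
    # Sort by field name to model a map that does not keep the
    # order in which the parameters were originally given.
    return dict(sorted(config.items()))


# Hypothetical parameters, in the order a user might have typed them.
params = [("verbose", "yes"), ("passthrough", "a"), ("passthrough", "b")]
config = to_config_map(params)
print(config)
```

Here the first `passthrough` value is lost, and `verbose` no longer comes first, which is the kind of information a program expecting its original parameter list would miss.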
Joey Hess
a5b53fa98a
todo 2025-02-25 18:45:55 -04:00
Joey Hess
e702cb94ff
add compute remote uuid to compute state url
Otherwise, two different compute remotes that happen to take the same
input would use the same compute state url. Which seems wrong.
2025-02-25 18:44:40 -04:00
Joey Hess
2b8428bb17
wording 2025-02-25 17:26:28 -04:00
Joey Hess
f8c7cea019
update demo program
needed a mkdir
2025-02-25 17:23:38 -04:00
Joey Hess
71e92a509a
use compute program REPRODUCIBLE by default 2025-02-25 17:10:41 -04:00
Joey Hess
233a6954b9
ingest when --unreproducible is used without --fast 2025-02-25 17:04:19 -04:00
Joey Hess
16f529c05f
addcomputed --fast and --unreproducible working
For these, use VURL and URL keys, with an "annex-compute:" URI prefix.

These URL keys will look something like this:

	URL--annex-compute&cbar4,63pconvert,3-f4d3d72cf3f16ac9c3e9a8012bde4462

Generally it's too long so most of it gets md5summed. It's a little
ugly, but it's what fell out of the existing URL key generation
machinery. I did consider special casing to eg
"URL--annex-compute&c4d3d72cf3f16ac9c3e9a8012bde4462". But it seems at
least possibly useful that the name of the file that was computed is
visible and perhaps one or two words of the git-annex compute command
parameters.

Note that two different output files from the same computation will get
the same URL key. And these keys should remain stable.
2025-02-25 16:43:15 -04:00
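
As a rough illustration of why such keys stay stable, here is a minimal Python sketch of shortening a long name by keeping a readable prefix and appending an md5 digest of the whole name. This is not the actual git-annex URL key generation: the function name `shorten_key_name`, the `limit` of 32 characters, and the sample input string are all assumptions made for the example.

```python
import hashlib


def shorten_key_name(name, limit=32):
    """Keep a readable prefix and append the md5 of the full name.

    Hypothetical sketch; the real key generation in git-annex
    differs in its escaping and length rules.
    """
    if len(name) <= limit:
        return name
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    return name[:limit] + "-" + digest


# Hypothetical description of a computation; same input, same result.
description = "annex-compute:convert input.png output.jpg --quality 90"
first = shorten_key_name(description)
second = shorten_key_name(description)
print(first)
```

Because the result depends only on the input string, repeating the call yields the same shortened name, and the visible prefix still hints at what was computed.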
Joey Hess
a154e91513
add git-annex addcomputed
Working pretty well. Mostly. But:

* Does not yet support inputs that are non-annexed files checked into git
* --fast is currently broken (will need something like VURL keys)
* --unreproducible still uses a checksumming backend, so drop and get
  again will likely fail (probably needs to use a URL key or something
  like one)

The compute special remote seems to work pretty well too. Eg,
getting from it works, and dropping content that is present in it works.
2025-02-25 15:50:08 -04:00
Joey Hess
2e1fe1620e
handle computations in subdirs of the git repository
Eg, a computation might be run in "foo/" and refer to "../bar" as an
input or output.

So, the subdir is part of the computation state.

Also, prevent input or output of files that are outside the git
repository. Of course, the program can access any file on disk if it
wants to; this is just a guard against mistakes. And it may also be
useful if the program communicates with something less trusted than it,
eg a container image, so input/output files communicated by that are not
the source of security problems.
2025-02-25 15:08:38 -04:00
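
The guard described in the commit above can be sketched in Python as a purely lexical path check. This is an illustration under stated assumptions, not the git-annex implementation: the function name `resolve_in_repo` and the `/repo` path are hypothetical, and the check does not resolve symlinks.

```python
import os.path


def resolve_in_repo(repo_root, subdir, path):
    """Resolve path relative to subdir and return it relative to the repo.

    Raises ValueError when the path would land outside the repository.
    Hypothetical helper; lexical only, symlinks are not followed.
    """
    root = os.path.abspath(repo_root)
    # normpath collapses ".." components, so "foo/../bar" becomes "bar".
    full = os.path.normpath(os.path.join(root, subdir, path))
    if os.path.commonpath([root, full]) != root:
        raise ValueError("path is outside the repository: " + path)
    return os.path.relpath(full, root)


# A computation run in "foo/" that refers to "../bar" stays inside.
inside = resolve_in_repo("/repo", "foo", "../bar")
print(inside)

# Going up two levels from "foo/" leaves the repository.
try:
    resolve_in_repo("/repo", "foo", "../../etc/passwd")
    escaped = True
except ValueError:
    escaped = False
print(escaped)
```

As the commit message notes, a check like this only guards against mistakes in the input and output names; it does not stop the program itself from touching other files.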
Joey Hess
ce05a92ee7
add field desc 2025-02-24 16:41:02 -04:00
Joey Hess
556f44d404
update for new interface 2025-02-24 16:15:04 -04:00
Joey Hess
40be51c98a
reimplement using new compute program interface 2025-02-24 16:01:03 -04:00
Joey Hess
921850d05c
support addcomputed --fast
This complicates the interface but it's still simpler to understand than
the old interface.
2025-02-24 13:48:46 -04:00
Joey Hess
490174b068
new compute program interface
This is much more flexible, and also simpler to understand.
2025-02-24 12:44:20 -04:00
Joey Hess
b804f8a3cc
update 2025-02-21 15:09:46 -04:00
Joey Hess
e0b46ef7ad
compute special remote mostly implemented
Except for some of the hard parts: progress displays, incremental
verification, and getting inputs before running a computation.

Untested! In order to test this, git-annex addcomputed needs to be
implemented.
2025-02-21 15:02:53 -04:00
Joey Hess
4f1eea9061
remove unused adjustedBranchRefresh associated file parameter 2025-02-21 14:51:02 -04:00
Joey Hess
e897229088
wip 2025-02-20 17:23:15 -04:00
Joey Hess
4f3d9f8115
update 2025-02-20 13:27:59 -04:00
Joey Hess
c1b53dbbd0
wip 2025-02-20 13:27:47 -04:00
Joey Hess
a2fa2a8c5f
update 2025-02-19 16:03:34 -04:00
Joey Hess
2f11c65491
comments 2025-02-19 15:14:52 -04:00
Joey Hess
b5319ec575
documentation for compute remote and associated commands
None of this is implemented yet.
2025-02-19 14:29:18 -04:00
Joey Hess
ace9944d1c
add REPRODUCIBLE 2025-02-19 14:16:36 -04:00
Joey Hess
f52385f63d
optional and required inputs and some other changes 2025-02-19 12:47:32 -04:00
Joey Hess
f4c3fdeaed
improved draft design 2025-02-18 15:46:47 -04:00
Joey Hess
9b42f5fe89
improve apiurl description 2025-02-18 14:46:10 -04:00
Joey Hess
d394f0b020
git-lfs apiurl parameter
git-lfs: Added an optional apiurl parameter.

This needs version 1.2.5 of the haskell git-lfs library to be used.
stack.yaml updated to use that.

Note that git-annex enableremote can be used to add apiurl= to an existing
git-lfs special remote. To allow unsetting the apiurl and instead use
the probed url, support enableremote with apiurl set to an empty string.

Sponsored-by: Luke T. Shumaker
2025-02-18 14:11:21 -04:00
sharad
dcf2f71696 Added a comment: Faced same issue for long time 2025-02-17 19:30:28 +00:00
Joey Hess
6a5131fe0b
OsPath build fix 2025-02-17 14:56:56 -04:00
Joey Hess
f6bd8ac9ab
OsPath build fix 2025-02-17 14:46:43 -04:00
Joey Hess
550ffc98fb
OSX build fix 2025-02-17 14:06:06 -04:00
Joey Hess
03827783bc
OSX build fixes 2025-02-17 14:05:19 -04:00
Joey Hess
1fb69fe01c
OSX build fixes 2025-02-17 14:04:08 -04:00
Joey Hess
9f2ce19858
OSX build fix 2025-02-17 14:01:54 -04:00
Joey Hess
66b8ba8fb0
OSX build fixes 2025-02-17 13:59:52 -04:00
Joey Hess
5324f34092
Merge branch 'ospath' 2025-02-17 11:58:20 -04:00
datamanager
93fb1ba536 Added a comment 2025-02-15 21:46:33 +00:00
puck
f32f22bc64 2025-02-15 10:36:03 +00:00
Joey Hess
70a2661334
OsPath conversion for OSXMkLibs 2025-02-14 16:53:00 -04:00
Joey Hess
e8b00faea8
Merge branch 'master' into ospath 2025-02-14 16:28:43 -04:00