From 3ad4096686ecb515d9b69c10d2e9bc5fdbf90f58 Mon Sep 17 00:00:00 2001 From: Atemu Date: Sun, 26 Jan 2025 02:36:51 +0000 Subject: [PATCH 01/24] Added a comment --- ...comment_1_464adfa71d322249dfed4ba65c24995d._comment | 10 ++++++++++ 1 file changed, 10 insertions(+) create mode 100644 doc/forum/Deduplication_between_two_repos_on_the_same_drive__63__/comment_1_464adfa71d322249dfed4ba65c24995d._comment diff --git a/doc/forum/Deduplication_between_two_repos_on_the_same_drive__63__/comment_1_464adfa71d322249dfed4ba65c24995d._comment b/doc/forum/Deduplication_between_two_repos_on_the_same_drive__63__/comment_1_464adfa71d322249dfed4ba65c24995d._comment new file mode 100644 index 0000000000..bd8ae47014 --- /dev/null +++ b/doc/forum/Deduplication_between_two_repos_on_the_same_drive__63__/comment_1_464adfa71d322249dfed4ba65c24995d._comment @@ -0,0 +1,10 @@ +[[!comment format=mdwn + username="Atemu" + avatar="http://cdn.libravatar.org/avatar/6ac9c136a74bb8760c66f422d3d6dc32" + subject="comment 1" + date="2025-01-26T02:36:51Z" + content=""" +It will not realise this. + +Why do you have separate repos for this though? You can absolutely just use a non-plain git repo for synchronisation purposes too. 
+"""]] From 2d6b31713ae3f9fa966a201141eff19439109b97 Mon Sep 17 00:00:00 2001 From: Atemu Date: Sun, 26 Jan 2025 02:54:18 +0000 Subject: [PATCH 02/24] Added a comment --- ..._ba87cf91217ba01415ff55d33550a75b._comment | 26 +++++++++++++++++++ 1 file changed, 26 insertions(+) create mode 100644 doc/forum/How_to_figure_out_why_files_aren__39__t_being_dropped__63__/comment_2_ba87cf91217ba01415ff55d33550a75b._comment diff --git a/doc/forum/How_to_figure_out_why_files_aren__39__t_being_dropped__63__/comment_2_ba87cf91217ba01415ff55d33550a75b._comment b/doc/forum/How_to_figure_out_why_files_aren__39__t_being_dropped__63__/comment_2_ba87cf91217ba01415ff55d33550a75b._comment new file mode 100644 index 0000000000..603dcfdae8 --- /dev/null +++ b/doc/forum/How_to_figure_out_why_files_aren__39__t_being_dropped__63__/comment_2_ba87cf91217ba01415ff55d33550a75b._comment @@ -0,0 +1,26 @@ +[[!comment format=mdwn + username="Atemu" + avatar="http://cdn.libravatar.org/avatar/6ac9c136a74bb8760c66f422d3d6dc32" + subject="comment 2" + date="2025-01-26T02:54:18Z" + content=""" +My issue apparently had to do with numcopies? I first passed `--numcopies 2` because I was curious but it didn't change anything. Then I passed `--numcopies 1` and it immediately dropped all the files as I'd have expected it to at `numcopies=3`. Running another sync without `--numcopies` didn't attempt to pull in the dropped files either. + +This smells like a bug? If numcopies was actually violated, it should attempt to correct that again, right? (All files were available from a connected repo.) + +Here are the numcopies stats from `git annex info .`: + +``` +numcopies stats: + numcopies +1: 1213 + numcopies +0: 25310 +``` + +Some more background: I have a bunch of drives that are offline that I have set to be trusted. One repo on my NAS is online at all times and semitrusted. + +I have two offline groups: `cold` and `lukewarm`. All drives in those groups are trusted. 
+ +It's weird that it didn't work with 2 but did work with 1. This leads me to believe it could have been due to the one repo being online while the others are offline and trusted; acting more like mincopies. Was behaviour changed in this regard recently? + +I'd still like to know how to debug wanted expressions too though. +"""]] From e09f48b948790e3c010d1e248ff1d919f7c3c628 Mon Sep 17 00:00:00 2001 From: goglu6 Date: Sun, 26 Jan 2025 03:02:03 +0000 Subject: [PATCH 03/24] --- ...r_seems_to_deadlock_for_huge_worktree.mdwn | 24 +++++++++++-------- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/doc/bugs/Unlock_filter_seems_to_deadlock_for_huge_worktree.mdwn b/doc/bugs/Unlock_filter_seems_to_deadlock_for_huge_worktree.mdwn index 5530cd570f..b4253cb281 100644 --- a/doc/bugs/Unlock_filter_seems_to_deadlock_for_huge_worktree.mdwn +++ b/doc/bugs/Unlock_filter_seems_to_deadlock_for_huge_worktree.mdwn @@ -4,13 +4,13 @@ I have a pretty big repository with around 300 000 files in the workdir of a bra I wanted to unlock all those files from that branch on a machine, so I tried to use git-annex-adjust --unlock. Sadly, the command do not seems to finish, ever. -Executing the command with debug from a clone(to avoid interacting with the broken index from the first), it seems to deadlock after executing between 10000 and 20000 "thawing" processes when executing the filter-process logic over the files in the worktree. -The problem seems to be reproducible with any repository with a lot of files in the worktree as far as I can tell, independant of file size. +Executing the command with the debug flag from a clone(to avoid interacting with the broken index from the first), it seems to deadlock after executing 10240 completed processes for the filter-process logic over the files in the worktree, which happens to match the annex.queuesize configuration value in use in those repositories. 
+The problem seems to be reproducible with any repository with more than the aforementioned number of files in the worktree as far as I can tell, independent of file size. -The deadlock described makes higher-level commands like git annex sync also block indefinitely when checkout-ing the unlocked branch for any reason. +The deadlock described makes higher-level commands like git annex sync also block indefinitely when checking out the unlocked branch for any reason in these kinds of unlocked repositories, due to the implicit call to the deadlocking git-annex smudge code. Also, because the filtering is not completely applied, the index is pretty scrambled, its easier to clone the repo and move the annex than fix it, for me at least. -I call the behavior "deadlock" due to the absence of debug log output and low cpu usage on the process when in that state. This seems to indicate some kind of multiprocessing deadlock to me. +I call the behavior "deadlock" due to the absence of debug log output after the 10240th process and 0% cpu usage on the remaining git and git-annex processes when the bug happens. This seems to indicate some kind of multiprocessing deadlock to me. ### What steps will reproduce the problem? 
@@ -27,10 +27,13 @@ Here is a minimum set of bash commands that generate the deadlock on my end: git annex add git commit -m "add all empty files" - # This will get stuck after around ~10000-20000 processes from Utility.Process in the debug log while the git annex thaws files into unlocked files - # The deadlock seems to happens after outputing the start of a new thawing, ctrl-c seems to be the only end state for this - git annex adjust --unlock --debug + # This will get stuck after 10240 processes from Utility.Process have completed in the debug log while git annex thaws files into unlocked files + # The deadlock seems to happen after outputting the start of the last thawing in the queue; ctrl-c seems to be the only way to end it + git annex adjust --unlock --debug 2> ~/unlock-log + # Ctrl-c the command above once the debug output ceases to produce new lines without exiting. + # This command outputs the number of processes run by the command above, which is 10240 for me + grep -c Perms ~/unlock-log ### What version of git-annex are you using? On what operating system? @@ -64,14 +67,15 @@ Debian Bookworm [Compiled via "building from source on Debian"] ### Please provide any additional information below. -Excerpt of the last lines from the huge debug log: +Excerpt of the last lines of the huge debug log from the git annex adjust run above: [2025-01-16 23:30:27.913022014] (Utility.Process) process [493397] done ExitSuccess [2025-01-16 23:30:27.91309169] (Annex.Perms) thawing content .git/annex/othertmp/BKQKGR.0/BKQKGR -Given the huge debug log produced, it may be easier to reproduce the bug to have it than copying it here. If wanted, I can generate one as required. +Given the huge debug log produced for this bug, it may be easier to reproduce the bug to obtain it than to copy it here. If wanted, I can generate one as required with the process documented in the bug reproduction steps above. 
-Repeatedly calling this(and ctrl-c it when it inevitably get stuck) seems to eventually unlock the files, but its not really a valid solution in my case. + +Repeatedly calling this (and pressing ctrl-c when it inevitably gets stuck) seems to eventually unlock the files in batches of 10240, but it's not really a valid solution in my case. git annex smudge --update --debug From f0701123cfd8a78e636ce78d3768133ea473e9d9 Mon Sep 17 00:00:00 2001 From: luciusf Date: Sun, 26 Jan 2025 11:18:11 +0000 Subject: [PATCH 04/24] Initial post --- ...____34___creates_local_folder_as_repo.mdwn | 69 +++++++++++++++++++ 1 file changed, 69 insertions(+) create mode 100644 doc/bugs/rsyncurl_without___34____58____34___creates_local_folder_as_repo.mdwn diff --git a/doc/bugs/rsyncurl_without___34____58____34___creates_local_folder_as_repo.mdwn b/doc/bugs/rsyncurl_without___34____58____34___creates_local_folder_as_repo.mdwn new file mode 100644 index 0000000000..66efa94874 --- /dev/null +++ b/doc/bugs/rsyncurl_without___34____58____34___creates_local_folder_as_repo.mdwn @@ -0,0 +1,69 @@ +### Please describe the problem. + +When setting up an (SSH) rsync remote without the `:` at the end of the hostname, git-annex creates a local folder instead of copying to the remote. + +``` +[joe@laptop]$ git annex initremote myremote type=rsync rsyncurl=ssh.example.com encryption=hybrid keyid=00001111222233334444 +[joe@laptop]$ git annex copy . --to myremote +copy metal-arm64.raw (to rpi50...) +ok +copy nixos-gnome-24.11.712512.3f0a8ac25fb6-x86_64-linux.iso (to myremote...) +ok +(recording state in git...) +[joe@laptop]$ ls -l +insgesamt 246792 +lrwxrwxrwx. 1 joe joe 204 20. Jan 21:01 metal-arm64.raw -> .git/annex/objects/mG/21/SHA256E-s1306525696--21308f635774faf611ba35c9b04d638aeb7afb1b1c1db949ae65ff81cdafe8b7.raw/SHA256E-s1306525696--21308f635774faf611ba35c9b04d638aeb7afb1b1c1db949ae65ff81cdafe8b7.raw +lrwxrwxrwx. 1 joe joe 204 20. 
Jan 21:01 nixos-gnome-24.11.712512.3f0a8ac25fb6-x86_64-linux.iso -> .git/annex/objects/fX/g9/SHA256E-s2550136832--da2fe173a279d273bf5a999eafdb618db0642f4a3df95fd94a6585c45082a7f0.iso/SHA256E-s2550136832--da2fe173a279d273bf5a999eafdb618db0642f4a3df95fd94a6585c45082a7f0.iso +drwxr-xr-x. 1 joe joe 12 26. Jan 11:32 ssh.example.com # <---- for me, that was not expected behaviour +``` + +It might be a feature I don't understand, but because I couldn't find documentation about it, I am leaning towards non-intended behaviour. My assumption would be, that a rsync operation to a local directory is already implemented with the [directory special remote](https://git-annex.branchable.com/special_remotes/directory/). + +### What steps will reproduce the problem? + +Have a remote rsync server, where you don't need to specify the base directory. In my case [this is done with NixOS and this configuration which uses `rrsync`](https://wiki.nixos.org/wiki/Rsync). + +The following configures the rsync remote, and later pushed files to it (so far expected behaviour): + +``` +git annex initremote myremote type=rsync rsyncurl=ssh.example.com: encryption=hybrid keyid=00001111222233334444 +git annex copy . --to myremote +``` + +This however, creates a local folder named `ssh.example.com` in my annexed directory: + +``` +git annex initremote myremote type=rsync rsyncurl=ssh.example.com encryption=hybrid keyid=00001111222233334444 +git annex copy . --to myremote # will copy successfully, BUT +ls -l # shows the folder `ssh.example.com` in my directory +``` + +### What version of git-annex are you using? On what operating system? 
+ +* Fedora 41 + +``` +git-annex version: 10.20240701 +build flags: Assistant Webapp Pairing Inotify DBus DesktopNotify TorrentParser MagicMime Benchmark Feeds Testsuite S3 WebDAV +dependency versions: aws-0.24.1 bloomfilter-2.0.1.2 crypton-0.34 DAV-1.3.4 feed-1.3.2.1 ghc-9.6.6 http-client-0.7.17 persistent-sqlite-2.13.3.0 torrent-10000.1.3 uuid-1.3.15 yesod-1.6.2.1 +key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL GITBUNDLE GITMANIFEST VURL X* +remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs httpalso borg rclone hook external +operating system: linux x86_64 +supported repository versions: 8 9 10 +upgrade supported from repository versions: 0 1 2 3 4 5 6 7 8 9 10 +local repository version: 10 +``` + +### Please provide any additional information below. + +[[!format sh """ +# If you can, paste a complete transcript of the problem occurring here. +# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log + + +# End of transcript or log. +"""]] + +### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders) + +I am just now starting to _really_ use git-annex, after following its development and every blogpost you wrote about it for almost a decade now. Thank you for a tool desperately needed! 
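A rough illustration of why the missing `:` matters, assuming the rsyncurl is handed to rsync largely unchanged (this report does not confirm that): rsync treats a target as remote only when it is an `rsync://` URL or contains a colon that is not preceded by a slash, and anything else is taken as a local path. The Python sketch below models that rule in simplified form. `classify_rsync_target` is a hypothetical helper written for illustration; it is not part of git-annex or rsync.

```python
def classify_rsync_target(target: str) -> str:
    """Simplified model of how rsync interprets a source/destination string.

    Assumption: this mirrors the rule described in the rsync manual and
    ignores platform-specific cases such as drive letters.
    """
    # Explicit daemon URL.
    if target.startswith("rsync://"):
        return "daemon-url"

    colon = target.find(":")
    slash = target.find("/")

    # A colon only marks a host when no slash comes before it,
    # so "./odd:name" stays a local path.
    if colon != -1 and (slash == -1 or colon < slash):
        # "host::module" addresses an rsync daemon, "host:path" a remote shell.
        if target[colon:colon + 2] == "::":
            return "daemon"
        return "remote-shell"

    # No usable colon: rsync treats the string as a local path.
    return "local-path"


if __name__ == "__main__":
    for example in ("ssh.example.com", "ssh.example.com:", "rsync://host/module"):
        print(example, "->", classify_rsync_target(example))
```

Under this model `ssh.example.com` classifies as a local path while `ssh.example.com:` classifies as a remote-shell target, which matches the behaviour described in the report.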
From d2d52136c77f6bbd9eab4736b4a6bf7e3417281b Mon Sep 17 00:00:00 2001 From: luciusf Date: Sun, 26 Jan 2025 11:19:53 +0000 Subject: [PATCH 05/24] rename bugs/rsyncurl_without___34____58____34___creates_local_folder_as_repo.mdwn to bugs/rsyncurl_without___34____58____34___creates_local_folder_as_remote.mdwn --- ...ithout___34____58____34___creates_local_folder_as_remote.mdwn} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename doc/bugs/{rsyncurl_without___34____58____34___creates_local_folder_as_repo.mdwn => rsyncurl_without___34____58____34___creates_local_folder_as_remote.mdwn} (100%) diff --git a/doc/bugs/rsyncurl_without___34____58____34___creates_local_folder_as_repo.mdwn b/doc/bugs/rsyncurl_without___34____58____34___creates_local_folder_as_remote.mdwn similarity index 100% rename from doc/bugs/rsyncurl_without___34____58____34___creates_local_folder_as_repo.mdwn rename to doc/bugs/rsyncurl_without___34____58____34___creates_local_folder_as_remote.mdwn From ff7a5eab631ff6b35039985e9a9cc5c5273fc23c Mon Sep 17 00:00:00 2001 From: luciusf Date: Sun, 26 Jan 2025 11:29:02 +0000 Subject: [PATCH 06/24] Some clarifications in my reproduce steps about the state of the rsync remote --- ...out___34____58____34___creates_local_folder_as_remote.mdwn | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/bugs/rsyncurl_without___34____58____34___creates_local_folder_as_remote.mdwn b/doc/bugs/rsyncurl_without___34____58____34___creates_local_folder_as_remote.mdwn index 66efa94874..dfd518403d 100644 --- a/doc/bugs/rsyncurl_without___34____58____34___creates_local_folder_as_remote.mdwn +++ b/doc/bugs/rsyncurl_without___34____58____34___creates_local_folder_as_remote.mdwn @@ -30,12 +30,12 @@ git annex initremote myremote type=rsync rsyncurl=ssh.example.com: encryption=hy git annex copy . 
--to myremote ``` -This however, creates a local folder named `ssh.example.com` in my annexed directory: +This however, doesn't copy to the correct remote, but creates a local folder named `ssh.example.com` in my annexed directory instead (note the missing `:` after the hostname): ``` git annex initremote myremote type=rsync rsyncurl=ssh.example.com encryption=hybrid keyid=00001111222233334444 git annex copy . --to myremote # will copy successfully, BUT -ls -l # shows the folder `ssh.example.com` in my directory +ls -l # shows the folder `ssh.example.com` in my directory with the files in it, the rsync remote is empty ``` ### What version of git-annex are you using? On what operating system? From 0dedb8077b6287267542453ff08e183867bf5e87 Mon Sep 17 00:00:00 2001 From: jnkl Date: Sun, 26 Jan 2025 13:09:05 +0000 Subject: [PATCH 07/24] Added a comment --- .../comment_2_e4cd3108130efbfa796e1ff5e5f55116._comment | 8 ++++++++ 1 file changed, 8 insertions(+) create mode 100644 doc/forum/Deduplication_between_two_repos_on_the_same_drive__63__/comment_2_e4cd3108130efbfa796e1ff5e5f55116._comment diff --git a/doc/forum/Deduplication_between_two_repos_on_the_same_drive__63__/comment_2_e4cd3108130efbfa796e1ff5e5f55116._comment b/doc/forum/Deduplication_between_two_repos_on_the_same_drive__63__/comment_2_e4cd3108130efbfa796e1ff5e5f55116._comment new file mode 100644 index 0000000000..7080606ce5 --- /dev/null +++ b/doc/forum/Deduplication_between_two_repos_on_the_same_drive__63__/comment_2_e4cd3108130efbfa796e1ff5e5f55116._comment @@ -0,0 +1,8 @@ +[[!comment format=mdwn + username="jnkl" + avatar="http://cdn.libravatar.org/avatar/2ab576f3bf2e0d96b1ee935bb7f33dbe" + subject="comment 2" + date="2025-01-26T13:09:04Z" + content=""" +Sorry, I am new to git. I thought pushes are only allowed to bare repositories. Am I wrong? 
+"""]] From 61c97b7460bbffde299cc4b954ff0e7656d86a79 Mon Sep 17 00:00:00 2001 From: Atemu Date: Sun, 26 Jan 2025 13:30:10 +0000 Subject: [PATCH 08/24] Added a comment --- .../comment_3_f2069c83af180c7026700a102a528827._comment | 8 ++++++++ 1 file changed, 8 insertions(+) create mode 100644 doc/forum/Deduplication_between_two_repos_on_the_same_drive__63__/comment_3_f2069c83af180c7026700a102a528827._comment diff --git a/doc/forum/Deduplication_between_two_repos_on_the_same_drive__63__/comment_3_f2069c83af180c7026700a102a528827._comment b/doc/forum/Deduplication_between_two_repos_on_the_same_drive__63__/comment_3_f2069c83af180c7026700a102a528827._comment new file mode 100644 index 0000000000..e1a73cf187 --- /dev/null +++ b/doc/forum/Deduplication_between_two_repos_on_the_same_drive__63__/comment_3_f2069c83af180c7026700a102a528827._comment @@ -0,0 +1,8 @@ +[[!comment format=mdwn + username="Atemu" + avatar="http://cdn.libravatar.org/avatar/6ac9c136a74bb8760c66f422d3d6dc32" + subject="comment 3" + date="2025-01-26T13:30:10Z" + content=""" +git-annex synchronises branch state via the `synced/branchnamehere` branches. The actual checked out branch in the worktree will only be updated when you run a `merge` or `sync` in the worktree. 
+"""]] From 7de9c8ff5d4afc20ad8889938d038cbc6516d789 Mon Sep 17 00:00:00 2001 From: matrss Date: Mon, 27 Jan 2025 11:28:43 +0000 Subject: [PATCH 09/24] Added a comment --- ...mment_1_b218e908bd2f897415e6d34137f8536b._comment | 12 ++++++++++++ 1 file changed, 12 insertions(+) create mode 100644 doc/bugs/rsyncurl_without___34____58____34___creates_local_folder_as_remote/comment_1_b218e908bd2f897415e6d34137f8536b._comment diff --git a/doc/bugs/rsyncurl_without___34____58____34___creates_local_folder_as_remote/comment_1_b218e908bd2f897415e6d34137f8536b._comment b/doc/bugs/rsyncurl_without___34____58____34___creates_local_folder_as_remote/comment_1_b218e908bd2f897415e6d34137f8536b._comment new file mode 100644 index 0000000000..b38b06793b --- /dev/null +++ b/doc/bugs/rsyncurl_without___34____58____34___creates_local_folder_as_remote/comment_1_b218e908bd2f897415e6d34137f8536b._comment @@ -0,0 +1,12 @@ +[[!comment format=mdwn + username="matrss" + avatar="http://cdn.libravatar.org/avatar/cd1c0b3be1af288012e49197918395f0" + subject="comment 1" + date="2025-01-27T11:28:43Z" + content=""" +I'd say this is intended behavior: I assume that the rsyncurl option is more or less passed verbatim to rsync, and rsync can act on both local and remote paths. There is the possibility to use `rsync://` URLs, remote paths via SSH where the host and path are separated by a colon, and local paths. + +The rsync special remote with local paths behaves a bit differently than the directory special remote, namely the rsyncurl is remembered (e.g. for autoenable) while the directory special remote does not remember the directory. There can be use-cases for both. + +Besides, most of the time I think one would want to specify a remote directory with rsync, in which case the colon is necessary anyway. 
+"""]] From 8c76c04fa97b6b9e6833ff74f1a39672d051f500 Mon Sep 17 00:00:00 2001 From: yarikoptic Date: Mon, 27 Jan 2025 12:32:32 +0000 Subject: [PATCH 10/24] reporting on FTBFS --- ...past_week__58___Variable_not_in_scope.mdwn | 65 +++++++++++++++++++ 1 file changed, 65 insertions(+) create mode 100644 doc/bugs/FTBFS_for_the_past_week__58___Variable_not_in_scope.mdwn diff --git a/doc/bugs/FTBFS_for_the_past_week__58___Variable_not_in_scope.mdwn b/doc/bugs/FTBFS_for_the_past_week__58___Variable_not_in_scope.mdwn new file mode 100644 index 0000000000..14da37c15f --- /dev/null +++ b/doc/bugs/FTBFS_for_the_past_week__58___Variable_not_in_scope.mdwn @@ -0,0 +1,65 @@ +### Please describe the problem. + +I have in my mailbox + +``` + 80 T Jan 26 GitHub Actions *-3.6* (3.7K/0) datalad/git-annex daily summary: 4 FAILED, 8 INCOMPLETE, 1 PASSED, 3 ABSENT + 206 N T Jan 25 GitHub Actions *-3.8* (3.7K/0) datalad/git-annex daily summary: 4 FAILED, 8 INCOMPLETE, 1 PASSED, 3 ABSENT + 357 T Jan 24 GitHub Actions *-4.4* (6.3K/0) datalad/git-annex daily summary: 12 FAILED, 8 INCOMPLETE, 1 PASSED, 3 ABSENT +1279 T Jan 23 GitHub Actions *-4.5* (3.7K/0) datalad/git-annex daily summary: 5 FAILED, 8 INCOMPLETE, 3 ABSENT +1715 T Jan 22 GitHub Actions *-5.0* (3.7K/0) datalad/git-annex daily summary: 5 FAILED, 8 INCOMPLETE, 3 ABSENT 2335 T Jan 21 GitHub Actions *-3.9* (3.7K/0) datalad/git-annex daily summary: 5 FAILED, 8 INCOMPLETE, 3 ABSENT +2656 T Jan 20 GitHub Actions *-4.3* (6.8K/0) datalad/git-annex daily summary: 28 PASSED, 2 ABSENT +2862 T Jan 19 GitHub Actions *-5.0* (6.8K/0) datalad/git-annex daily summary: 28 PASSED, 2 ABSENT +``` + +and looking at the [latest ubuntu build logs](https://github.com/datalad/git-annex/actions/runs/12970824274/job/36176536041) I see + +``` +I: the tail of the log + +Build/LinuxMkLibs.hs:101:17: error: + Variable not in scope: + createDirectoryIfMissing :: Bool -> [Char] -> IO a3 + | +101 | createDirectoryIfMissing True (top ++ libdir takeDirectory d) + | 
^^^^^^^^^^^^^^^^^^^^^^^^ + +Build/LinuxMkLibs.hs:149:9: error: + Variable not in scope: + createDirectoryIfMissing :: Bool -> FilePath -> IO a2 + | +149 | createDirectoryIfMissing True (top shimdir) + | ^^^^^^^^^^^^^^^^^^^^^^^^ + +Build/LinuxMkLibs.hs:150:9: error: + Variable not in scope: + createDirectoryIfMissing :: Bool -> FilePath -> IO a1 + | +150 | createDirectoryIfMissing True (top exedir) + | ^^^^^^^^^^^^^^^^^^^^^^^^ + +Build/LinuxMkLibs.hs:160:19: error: + * Variable not in scope: + renameFile :: FilePath -> FilePath -> IO () + * Perhaps you meant `readFile' (imported from Prelude) + | +160 | , renameFile exe exedest + | ^^^^^^^^^^ + +Build/LinuxMkLibs.hs:165:18: error: + Variable not in scope: doesFileExist :: FilePath -> IO Bool + | +165 | unlessM (doesFileExist (top exelink)) $ + | ^^^^^^^^^^^^^ + +Build/LinuxMkLibs.hs:181:9: error: + Variable not in scope: + createDirectoryIfMissing :: Bool -> FilePath -> IO a0 + | +181 | createDirectoryIfMissing True destdir + | ^^^^^^^^^^^^^^^^^^^^^^^^ +make[3]: *** [Makefile:156: Build/Standalone] Error 1 +make[3]: Leaving directory '/home/runner/work/git-annex/git-annex/git-annex-source' +make[2]: *** [Makefile:164: linuxstandalone] Error 2 +``` + From b61d316c36e04c961acc0a0f9cb08d43be5da589 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Mon, 27 Jan 2025 09:35:33 -0400 Subject: [PATCH 11/24] fix link --- .../comment_3_573cb6c3ee8d1a2072c61559f81dc32c._comment | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/todo/compute_special_remote/comment_3_573cb6c3ee8d1a2072c61559f81dc32c._comment b/doc/todo/compute_special_remote/comment_3_573cb6c3ee8d1a2072c61559f81dc32c._comment index f4c06b6f7d..c5004caa34 100644 --- a/doc/todo/compute_special_remote/comment_3_573cb6c3ee8d1a2072c61559f81dc32c._comment +++ b/doc/todo/compute_special_remote/comment_3_573cb6c3ee8d1a2072c61559f81dc32c._comment @@ -3,5 +3,5 @@ subject="""comment 3""" date="2024-04-30T19:31:35Z" content=""" -See also 
[[todo/wishlist__58___derived_content_support]]. +See also [[todo/wishlist:_derived_content_support]]. """]] From 71206a8603698fd542b8b44e9292aee72586c820 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Mon, 27 Jan 2025 10:25:55 -0400 Subject: [PATCH 12/24] update comment --- ...omment_6_f1760976e65ae16d4d79f004ac924e55._comment | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/doc/todo/compute_special_remote/comment_6_f1760976e65ae16d4d79f004ac924e55._comment b/doc/todo/compute_special_remote/comment_6_f1760976e65ae16d4d79f004ac924e55._comment index 69d2f42283..c88156c5d1 100644 --- a/doc/todo/compute_special_remote/comment_6_f1760976e65ae16d4d79f004ac924e55._comment +++ b/doc/todo/compute_special_remote/comment_6_f1760976e65ae16d4d79f004ac924e55._comment @@ -3,11 +3,14 @@ subject="""comment 6""" date="2024-04-30T19:53:43Z" content=""" -On trust, it seems to me that if someone chooses to enable a particular -special remote, they are choosing to trust whatever kind of computations it -supports. +On trust, it seems to me that if someone chooses to install a +particular special remote, they are choosing to trust whatever kind of +computations it supports. Eg a special remote could choose to always run a computation inside a particular container system and then if you trust that container system is -secure, you can choose to use it. +secure, you can choose to install it. + +Note that enabling the special remote is not necessary, because a +repository can be set to autoenable a special remote. 
"""]] From 02c792b7243e7dbc1a5e63ca4fc3dad05d3be28e Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Mon, 27 Jan 2025 10:37:35 -0400 Subject: [PATCH 13/24] thoughts --- ..._f1760976e65ae16d4d79f004ac924e55._comment | 20 +++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/doc/todo/compute_special_remote/comment_6_f1760976e65ae16d4d79f004ac924e55._comment b/doc/todo/compute_special_remote/comment_6_f1760976e65ae16d4d79f004ac924e55._comment index c88156c5d1..dd71ce09ce 100644 --- a/doc/todo/compute_special_remote/comment_6_f1760976e65ae16d4d79f004ac924e55._comment +++ b/doc/todo/compute_special_remote/comment_6_f1760976e65ae16d4d79f004ac924e55._comment @@ -11,6 +11,22 @@ Eg a special remote could choose to always run a computation inside a particular container system and then if you trust that container system is secure, you can choose to install it. -Note that enabling the special remote is not necessary, because a -repository can be set to autoenable a special remote. +Enabling the special remote is not necessary, because a +repository can be set to autoenable a special remote. In some sense this is +surprising. I had originally talked about enabling here and then I +remembered autoenable. + +It may be that autoenable should only be allowed for +special remote programs that the user explicitly whitelists, not only +installs into PATH. That would break some existing workflows, though +setting some git configs would not be too hard. + +There seems scope for both compute special remotes that execute code that +comes from the git repository, and ones that only have metadata about the +computation recorded in the git repository, in a way that cannot let them +execute arbitrary code under the control of the git repository. + +A well-behaved compute special remote that does run code that comes from a +git repository could require an additional git config to be set to allow it +to do that. 
"""]] From 6b5206db855df529e73363db807b74c680e75f78 Mon Sep 17 00:00:00 2001 From: matrss Date: Mon, 27 Jan 2025 15:08:57 +0000 Subject: [PATCH 14/24] Added a comment --- .../comment_3_a53bfbd63b3ec5834286167a61d5c4ba._comment | 8 ++++++++ 1 file changed, 8 insertions(+) create mode 100644 doc/bugs/git_annex_checkpresentkey_removes_git_credentials/comment_3_a53bfbd63b3ec5834286167a61d5c4ba._comment diff --git a/doc/bugs/git_annex_checkpresentkey_removes_git_credentials/comment_3_a53bfbd63b3ec5834286167a61d5c4ba._comment b/doc/bugs/git_annex_checkpresentkey_removes_git_credentials/comment_3_a53bfbd63b3ec5834286167a61d5c4ba._comment new file mode 100644 index 0000000000..ec4fbe5cf7 --- /dev/null +++ b/doc/bugs/git_annex_checkpresentkey_removes_git_credentials/comment_3_a53bfbd63b3ec5834286167a61d5c4ba._comment @@ -0,0 +1,8 @@ +[[!comment format=mdwn + username="matrss" + avatar="http://cdn.libravatar.org/avatar/cd1c0b3be1af288012e49197918395f0" + subject="comment 3" + date="2025-01-27T15:08:57Z" + content=""" +I can still reproduce this issue with 10.20250115, but in my testing it seems like it only happens against a forgejo-aneksajo instance on localhost without TLS, not against a different remote instance. This setup required `git config annex.security.allowed-ip-addresses 127.0.0.1`, maybe it has something to do with that or TLS... 
+"""]] From cb258ca480d460e571051e0f4bb4ba70f635c88e Mon Sep 17 00:00:00 2001 From: matrss Date: Mon, 27 Jan 2025 15:14:44 +0000 Subject: [PATCH 15/24] Added a comment --- .../comment_4_58ddd2578f115af22e995bd09c2bcea2._comment | 8 ++++++++ 1 file changed, 8 insertions(+) create mode 100644 doc/bugs/git_annex_checkpresentkey_removes_git_credentials/comment_4_58ddd2578f115af22e995bd09c2bcea2._comment diff --git a/doc/bugs/git_annex_checkpresentkey_removes_git_credentials/comment_4_58ddd2578f115af22e995bd09c2bcea2._comment b/doc/bugs/git_annex_checkpresentkey_removes_git_credentials/comment_4_58ddd2578f115af22e995bd09c2bcea2._comment new file mode 100644 index 0000000000..bd64c6c224 --- /dev/null +++ b/doc/bugs/git_annex_checkpresentkey_removes_git_credentials/comment_4_58ddd2578f115af22e995bd09c2bcea2._comment @@ -0,0 +1,8 @@ +[[!comment format=mdwn + username="matrss" + avatar="http://cdn.libravatar.org/avatar/cd1c0b3be1af288012e49197918395f0" + subject="comment 4" + date="2025-01-27T15:14:44Z" + content=""" +It definitely takes a different code path somehow, as I don't see the `Utility.Url` debug messages when the remote is not on localhost. 
+"""]] From 7adf1f45fa24534e84cf891242043af59d0fd7c8 Mon Sep 17 00:00:00 2001 From: matrss Date: Mon, 27 Jan 2025 15:26:15 +0000 Subject: [PATCH 16/24] Added a comment --- ...ent_6_4641d3ad4a8a8f17f8df47e02555dfa2._comment | 14 ++++++++++++++ 1 file changed, 14 insertions(+) create mode 100644 doc/todo/generic_p2p_socket_transport/comment_6_4641d3ad4a8a8f17f8df47e02555dfa2._comment diff --git a/doc/todo/generic_p2p_socket_transport/comment_6_4641d3ad4a8a8f17f8df47e02555dfa2._comment b/doc/todo/generic_p2p_socket_transport/comment_6_4641d3ad4a8a8f17f8df47e02555dfa2._comment new file mode 100644 index 0000000000..ce9a361d40 --- /dev/null +++ b/doc/todo/generic_p2p_socket_transport/comment_6_4641d3ad4a8a8f17f8df47e02555dfa2._comment @@ -0,0 +1,14 @@ +[[!comment format=mdwn + username="matrss" + avatar="http://cdn.libravatar.org/avatar/cd1c0b3be1af288012e49197918395f0" + subject="comment 6" + date="2025-01-27T15:26:15Z" + content=""" +> > If the PSK were fully contained in the remote string then a third-party getting hold of that string could pretend to be the server + +> I agree this would be a problem, but how would a third-party get ahold of the string though? Remote urls don't usually get stored in the git repository, perhaps you were thinking of some other way. + +My thinking was that git remote URLs usually aren't sensitive information that inherently grant access to a repository, so a construct where the remote URL contains the credentials is just unexpected. A careless user might e.g. put it into a `type=git` special remote or treat it in some other way in which one wouldn't treat a password, without considering the implications. I am not aware of a way in which they could be leaked without user intervention, though. + +Having separate credentials explicitly named as such just seems safer. But in the end this would be the responsibility of the one implementing the p2p transport, anyway. 
+"""]] From 754c0a001b97a451ee4c8767102bd8dd6ae85fa3 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Mon, 27 Jan 2025 12:19:16 -0400 Subject: [PATCH 17/24] comment --- ..._2e10caa2ecbba0f53a3ab031a94c9907._comment | 74 +++++++++++++++++++ 1 file changed, 74 insertions(+) create mode 100644 doc/todo/compute_special_remote/comment_9_2e10caa2ecbba0f53a3ab031a94c9907._comment diff --git a/doc/todo/compute_special_remote/comment_9_2e10caa2ecbba0f53a3ab031a94c9907._comment b/doc/todo/compute_special_remote/comment_9_2e10caa2ecbba0f53a3ab031a94c9907._comment new file mode 100644 index 0000000000..fc72c6e6a9 --- /dev/null +++ b/doc/todo/compute_special_remote/comment_9_2e10caa2ecbba0f53a3ab031a94c9907._comment @@ -0,0 +1,74 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 9""" + date="2025-01-27T14:46:43Z" + content=""" +Circling back to this, I think the fork in the road is whether this is +about git-annex providing this and that feature to support external special +remotes that compute, or whether git-annex gets a compute special +remote of its own with some simpler/better extension interface +than the external special remote protocol. + +Of course, git-annex having its own compute special remote would not +preclude other external special remotes that compute. And for that matter, +a single external special remote could implement an extension interface. + +--- + +Thinking about how a generic compute special remote in git-annex could +work, multiple instances of it could be initremoted: + + git-annex initremote convertfiles type=compute program=csv-to-xslx + git-annex initremote cutvideo type=compute program=ffmpeg-cut + +Here the "program" parameter would cause a program like +`git-annex-compute-ffmpeg-cut` to be run to get files from that instance +of the compute special remote. The interface could be as simple as it +being run with the key that it is requested to compute, and outputting +the paths to the all keys it was able to compute. 
(So allowing for +"request one key, receive many".) Perhaps also with some way to indicate +progress of the computation. + +It would make sense to store the details of computations in git-annex +metadata. And a compute program can use git-annex commands to get files +it depends on. Eg, `git-annex-compute-ffmpeg-cut` could run: + + # look up the configured metadata + starttime=$(git-annex metadata --get compute-ffmpeg-starttime --key=$requested) + endtime=$(git-annex metadata --get compute-ffmpeg-endtime --key=$requested) + source=$(git-annex metadata --get compute-ffmpeg-source --key=$requested) + + # get the source video file + git-annex get --key=$source + git-annex examinekey --format='${objectpath}' $source + +It might be worth formalizing that a given computed key can depend on other +keys, and have git-annex always get/compute those keys first. + +When asked to store a key in the compute special remote, it would verify +that the key can be generated by it. Using the same interface as used to +get a key. + +This all leaves a chicken and egg problem, how does the user add a computed +file if they don't know the key yet? + +The user could manually run the commands that generate the computed file, +then `git-annex add` it, and set the metadata. Then `git-annex copy --to` +the compute remote would verify if the file can be generated, and add it if +so. This seems awkward, but also nice to be able to do manually. + +Or, something like VURL keys could be used, with an interface something +like this: + + git-annex addcomputed foo --to ffmpeg-cut + --input compute-ffmpeg-source=input.mov + --set compute-ffmpeg-starttime=15:00 + --set compute-ffmpeg-endtime=30:00 + +All that would do is generate some arbitrary VURL key or similar, +provisionally set the provided metadata (how?), and try to store the key +in the compute special remote. If it succeeds, stage an annex pointer +and commit the metadata. 
Since it's a VURL key, storing the key in the +compute special remote would also record the hash of the generated file +at that point. +"""]] From dd28f97aacb11bc62ed4204794d06b90c4462f81 Mon Sep 17 00:00:00 2001 From: "beryllium@5bc3c32eb8156390f96e363e4ba38976567425ec" Date: Tue, 28 Jan 2025 08:34:40 +0000 Subject: [PATCH 18/24] Added a comment: Simple config amendment for Apache served repositories --- ..._974bf32abc3d093d6ebcda4838a79553._comment | 44 +++++++++++++++++++ 1 file changed, 44 insertions(+) create mode 100644 doc/special_remotes/git/comment_3_974bf32abc3d093d6ebcda4838a79553._comment diff --git a/doc/special_remotes/git/comment_3_974bf32abc3d093d6ebcda4838a79553._comment b/doc/special_remotes/git/comment_3_974bf32abc3d093d6ebcda4838a79553._comment new file mode 100644 index 0000000000..5798ac595b --- /dev/null +++ b/doc/special_remotes/git/comment_3_974bf32abc3d093d6ebcda4838a79553._comment @@ -0,0 +1,44 @@ +[[!comment format=mdwn + username="beryllium@5bc3c32eb8156390f96e363e4ba38976567425ec" + nickname="beryllium" + avatar="http://cdn.libravatar.org/avatar/62b67d68e918b381e7e9dd6a96c16137" + subject="Simple config amendment for Apache served repositories" + date="2025-01-28T08:34:40Z" + content=""" +If you follow the [git-http-backend][id] documentation for serving repositories via Apache, you'll read this section: + + +

To serve gitweb at the same url, use a ScriptAliasMatch to only +those URLs that git http-backend can handle, and forward the +rest to gitweb:

+ +
+
+
ScriptAliasMatch \
+	\"(?x)^/git/(.*/(HEAD | \
+			info/refs | \
+			objects/(info/[^/]+ | \
+				 [0-9a-f]{2}/[0-9a-f]{38} | \
+				 pack/pack-[0-9a-f]{40}\.(pack|idx)) | \
+			git-(upload|receive)-pack))$\" \
+	/usr/libexec/git-core/git-http-backend/$1
+
+ScriptAlias /git/ /var/www/cgi-bin/gitweb.cgi/
+
+
+
+ +If you add the following AliasMatch between the two ScriptAlias directives, you can get Apache to serve the (...).git/config file to the http client, in this case git-annex. + +
+AliasMatch \"(?x)^/git/(.*/config)$\" /var/www/git/$1
+
+ +This allows the annexes to use `autoenable=true` to pin the centralisation afforded by the git-only repository, keeping a \"source of truth\" so to speak (acknowledging that this is antithetical to what git-annex aims to do). + +As an aside, the tip to generate a uuid didn't seem to work for me. But I suspect I missed the point somewhat. + +Regardless, if you are able to alter the configuration of your \"centralised\" git repository, this might be of assistance. + +[id]: https://git-scm.com/docs/git-http-backend \"git-http-backend\" +"""]] From 6fb1dd6afafbc45e72ddb84cf6cc7dfbd6ac1a87 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Tue, 28 Jan 2025 10:28:35 -0400 Subject: [PATCH 19/24] comment --- ..._5addc5ef9399ffedc23190c9d4e566ce._comment | 24 +++++++++++++++++++ 1 file changed, 24 insertions(+) create mode 100644 doc/todo/compute_special_remote/comment_11_5addc5ef9399ffedc23190c9d4e566ce._comment diff --git a/doc/todo/compute_special_remote/comment_11_5addc5ef9399ffedc23190c9d4e566ce._comment b/doc/todo/compute_special_remote/comment_11_5addc5ef9399ffedc23190c9d4e566ce._comment new file mode 100644 index 0000000000..454c11de0b --- /dev/null +++ b/doc/todo/compute_special_remote/comment_11_5addc5ef9399ffedc23190c9d4e566ce._comment @@ -0,0 +1,24 @@ +[[!comment format=mdwn + username="joey" + subject="""Re: worktree provisioning""" + date="2025-01-28T14:08:29Z" + content=""" +@m.risse in your example the "data.nc" file gets new content when +retrieved from the special remote and the source file has changed. + +But if you already have the data.nc file present in a repository, it +does not get updated immediately when you update the source +"data.grib" file. + +So, a drop and re-get of a file changes the version of the file you have +available. For that matter, if the old version has been stored on other +remotes, a get may retrieve either an old or a new version. 
+That is not intuitive and it makes me wonder if using a +special remote is really a good fit for what you're wanting to do. + +In your "cdo" example, it's not clear to me if the new version of the +software generates an identical file to the old, or if it has a bug fix +that causes it to generate a significantly different output. If the two +outputs are significantly different then treating them as the same +git-annex key seems questionable to me. +"""]] From 24d5dbe30b45e6ca4510e5c1769b494722ab7394 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Tue, 28 Jan 2025 11:12:02 -0400 Subject: [PATCH 20/24] comment --- ..._304b925c5c54b1fd980446920780be00._comment | 39 +++++++++++++++++++ ..._2e10caa2ecbba0f53a3ab031a94c9907._comment | 3 +- 2 files changed, 41 insertions(+), 1 deletion(-) create mode 100644 doc/todo/compute_special_remote/comment_10_304b925c5c54b1fd980446920780be00._comment diff --git a/doc/todo/compute_special_remote/comment_10_304b925c5c54b1fd980446920780be00._comment b/doc/todo/compute_special_remote/comment_10_304b925c5c54b1fd980446920780be00._comment new file mode 100644 index 0000000000..44916ca336 --- /dev/null +++ b/doc/todo/compute_special_remote/comment_10_304b925c5c54b1fd980446920780be00._comment @@ -0,0 +1,39 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 10""" + date="2025-01-28T14:06:41Z" + content=""" +Using metadata to store the inputs of computations like I did in my example +above also seems that it would allow the metadata to be changed, which +would change the output when a key gets recomputed. + +It might be possible for git-annex to pin down the current state of +metadata (or the whole git-annex branch) and provide the same input to the +computation when it's run again. (Unless `git-annex forget` has caused +that old branch state to be lost..) But it can't fully isolate the program +from all unpinned inputs without using some form of containerization, +which feels out of scope for git-annex. 
+ +Instead of using metadata, the input values could be stored in the +per-special-remote state of the generated key. Or the input values could be +encoded in the key itself, but then two computations that generate the same +output would have two different keys, rather than hashing to the same key. + +And using a key with a regular hash backend lets the user find out if the +computation turns out to not be reproducible later for whatever reason; +getting the file from the compute special remote will fail at hash +verification time. Something like a VURL key could still alternatively be +used in cases where reproducibility is not important. + +To add a computed file, the interface would look close to the same, +but now the --value options are setting fields in the compute special +remote's state: + + git-annex addcomputed foo --to ffmpeg-cut + --input source=input.mov + --value starttime=15:00 + --value endtime=30:00 + +The values could be provided to the "git-annex-compute-" program with +environment variables. +"""]] diff --git a/doc/todo/compute_special_remote/comment_9_2e10caa2ecbba0f53a3ab031a94c9907._comment b/doc/todo/compute_special_remote/comment_9_2e10caa2ecbba0f53a3ab031a94c9907._comment index fc72c6e6a9..e596f7cd20 100644 --- a/doc/todo/compute_special_remote/comment_9_2e10caa2ecbba0f53a3ab031a94c9907._comment +++ b/doc/todo/compute_special_remote/comment_9_2e10caa2ecbba0f53a3ab031a94c9907._comment @@ -43,7 +43,8 @@ it depends on. Eg, `git-annex-compute-ffmpeg-cut` could run: git-annex examinekey --format='${objectpath}' $source It might be worth formalizing that a given computed key can depend on other -keys, and have git-annex always get/compute those keys first. +keys, and have git-annex always get/compute those keys first. And provide +them to the program in a worktree? When asked to store a key in the compute special remote, it would verify that the key can be generated by it. 
Using the same interface as used to From 4f0e64b6debc60f8e654f1282751d0c647a678cd Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Tue, 28 Jan 2025 11:36:02 -0400 Subject: [PATCH 21/24] update --- ..._304b925c5c54b1fd980446920780be00._comment | 42 ++++++++++++++++--- 1 file changed, 36 insertions(+), 6 deletions(-) diff --git a/doc/todo/compute_special_remote/comment_10_304b925c5c54b1fd980446920780be00._comment b/doc/todo/compute_special_remote/comment_10_304b925c5c54b1fd980446920780be00._comment index 44916ca336..73249ac05c 100644 --- a/doc/todo/compute_special_remote/comment_10_304b925c5c54b1fd980446920780be00._comment +++ b/doc/todo/compute_special_remote/comment_10_304b925c5c54b1fd980446920780be00._comment @@ -4,8 +4,10 @@ date="2025-01-28T14:06:41Z" content=""" Using metadata to store the inputs of computations like I did in my example -above also seems that it would allow the metadata to be changed, which -would change the output when a key gets recomputed. +above seems that it would allow the metadata to be changed later, which +would change the output when a key gets recomputed. That feels surprising, +because metadata could be changed for any reason, without the intention +of affecting a compute special remote. It might be possible for git-annex to pin down the current state of metadata (or the whole git-annex branch) and provide the same input to the @@ -19,7 +21,7 @@ per-special-remote state of the generated key. Or the input values could be encoded in the key itself, but then two computations that generate the same output would have two different keys, rather than hashing to the same key. -And using a key with a regular hash backend lets the user find out if the +Using a key with a regular hash backend also lets the user find out if the computation turns out to not be reproducible later for whatever reason; getting the file from the compute special remote will fail at hash verification time. 
Something like a VURL key could still alternatively be @@ -29,11 +31,39 @@ To add a computed file, the interface would look close to the same, but now the --value options are setting fields in the compute special remote's state: - git-annex addcomputed foo --to ffmpeg-cut - --input source=input.mov - --value starttime=15:00 + git-annex addcomputed foo --to ffmpeg-cut \ + --input source=input.mov \ + --value starttime=15:00 \ --value endtime=30:00 The values could be provided to the "git-annex-compute-" program with environment variables. + +For `--input source=foo`, it could look up the git-annex key (or git sha1) +of that file, and store that in the state. So it would provide the compute +program with the same data every time. But it could *also* store the +filename. And that allows for a command like this: + + git-annex recompute foo --from ffmpeg-cut + +Which, when the input.mov file has been changed, would re-run the +computation with the new content of the file, and stage a new version of +the computed file. It could even be used to recompute every file in a tree: + + git-annex recompute . --from ffmpeg-cut + +Also, that command could let input values be adjusted later: + + git-annex recompute foo --from ffmpeg-cut --value starttime=14:50 + git commit -m 'include the introduction of the speaker in the clip' + +It would also be good to have a command that examines a computed key +and displays the values and inputs. Eg: + + git-annex examinecompute foo --from ffmpeg-cut + source=input.mov (annex key SHA256--xxxxxxxxx) + starttime=15:00 + endtime=30:00 + +This feels like it might allow for some useful workflows... 
"""]] From 67034a02ea3898305643763665ae3db8a4a781d2 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Tue, 28 Jan 2025 11:38:04 -0400 Subject: [PATCH 22/24] update --- .../comment_10_304b925c5c54b1fd980446920780be00._comment | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/doc/todo/compute_special_remote/comment_10_304b925c5c54b1fd980446920780be00._comment b/doc/todo/compute_special_remote/comment_10_304b925c5c54b1fd980446920780be00._comment index 73249ac05c..0a870654f6 100644 --- a/doc/todo/compute_special_remote/comment_10_304b925c5c54b1fd980446920780be00._comment +++ b/doc/todo/compute_special_remote/comment_10_304b925c5c54b1fd980446920780be00._comment @@ -58,12 +58,13 @@ Also, that command could let input values be adjusted later: git commit -m 'include the introduction of the speaker in the clip' It would also be good to have a command that examines a computed key -and displays the values and inputs. Eg: +and displays the values and inputs. That could be `git-annex whereis` +or perhaps a dedicated command with more structured output: git-annex examinecompute foo --from ffmpeg-cut source=input.mov (annex key SHA256--xxxxxxxxx) starttime=15:00 endtime=30:00 -This feels like it might allow for some useful workflows... +This all feels like it might allow for some useful workflows... 
"""]] From da9ca7475e1845c6df9fb5ed2ea223d5aee6fcd8 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Tue, 28 Jan 2025 11:57:03 -0400 Subject: [PATCH 23/24] comment --- ..._ddc985546fee804733c4ec485253e98f._comment | 29 +++++++++++++++++++ 1 file changed, 29 insertions(+) create mode 100644 doc/todo/compute_special_remote/comment_12_ddc985546fee804733c4ec485253e98f._comment diff --git a/doc/todo/compute_special_remote/comment_12_ddc985546fee804733c4ec485253e98f._comment b/doc/todo/compute_special_remote/comment_12_ddc985546fee804733c4ec485253e98f._comment new file mode 100644 index 0000000000..c05e779876 --- /dev/null +++ b/doc/todo/compute_special_remote/comment_12_ddc985546fee804733c4ec485253e98f._comment @@ -0,0 +1,29 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 12""" + date="2025-01-28T15:39:44Z" + content=""" +My design so far does not fully support +"Request one key, receive many". + +My `git-annex addcomputed` command doesn't handle the case where a +computation generates multiple output files. While the `git-annex-compute-` +command's interface could let it return several computed files, addcomputed +would only adds one file to the name that the user specifies. What is it +supposed to do if the computation generates more than one? Maybe it needs a +way to let a whole directory be populated with the files generated by a +computation. Or a way to specify multiple files to add. + +And here's another problem: +Suppose I have one very expensive computation that generates files foo +and bar. And a second, less expensive computation, that also generates foo +(same content) as well as generating baz. Both computations are run on the +same compute special remote. Now if the user runs `git-annex get foo`, +they will be unhappy if it chooses to run the expensive computation, +rather than the less expensive computation. + +Since the per-special remote state for a key is used as the computation +input, only one input can be saved for foo's key. 
So it wouldn't really be +picking between two alternatives, it would just use whatever the current +state for that key is. +"""]] From 87cda29dd7318053a30dd87d97f35dbf8578c052 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Tue, 28 Jan 2025 15:29:25 -0400 Subject: [PATCH 24/24] remove Read instance for AssociatedFile This instance is not used. --- Types/Key.hs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Types/Key.hs b/Types/Key.hs index 7302605c8a..03d5aa4638 100644 --- a/Types/Key.hs +++ b/Types/Key.hs @@ -203,7 +203,7 @@ splitKeyNameExtension' keyname = S8.span (/= '.') keyname {- A filename may be associated with a Key. -} newtype AssociatedFile = AssociatedFile (Maybe RawFilePath) - deriving (Show, Read, Eq, Ord) + deriving (Show, Eq, Ord) {- There are several different varieties of keys. -} data KeyVariety