From 9f17cec7ba1c5de2d5d6f4b78216662f85d7c125 Mon Sep 17 00:00:00 2001 From: matrss Date: Fri, 13 Dec 2024 22:02:15 +0000 Subject: [PATCH 01/10] Added a comment --- ..._0a1208f17265ff77cd3956da22439e4b._comment | 83 +++++++++++++++++++ 1 file changed, 83 insertions(+) create mode 100644 doc/todo/generic_p2p_socket_transport/comment_3_0a1208f17265ff77cd3956da22439e4b._comment diff --git a/doc/todo/generic_p2p_socket_transport/comment_3_0a1208f17265ff77cd3956da22439e4b._comment b/doc/todo/generic_p2p_socket_transport/comment_3_0a1208f17265ff77cd3956da22439e4b._comment new file mode 100644 index 0000000000..1e9b0f03c0 --- /dev/null +++ b/doc/todo/generic_p2p_socket_transport/comment_3_0a1208f17265ff77cd3956da22439e4b._comment @@ -0,0 +1,83 @@ +[[!comment format=mdwn + username="matrss" + avatar="http://cdn.libravatar.org/avatar/59541f50d845e5f81aff06e88a38b9de" + subject="comment 3" + date="2024-12-13T22:02:14Z" + content=""" +Your comment seems to be wrongly formatted. It was shown correctly in the notification mail, but doesn't show up here. + +--- + +Just to document what I have tried out, for completeness: with what is already in place it is possible to connect two repositories over yggstack, it is just very awkward. 
+ +On one system you can do: + +- `sudo mkdir /etc/tor && sudo touch /etc/tor/torrc` (without actually having tor installed) +- `sudo git annex enable-tor $(id -u)` +- `yggstack -genconf > yggstack.conf` +- `echo tor-annex::.pk.ygg:12345` (take the pubkey out of yggstack.conf) +- `socat TCP-LISTEN:12345,fork,reuseaddr UNIX-CONNECT:/var/lib/tor-annex/_/s` +- `yggstack -useconffile yggstack.conf -remote-tcp 12345:127.0.0.1:12345` +- `git annex p2p --gen-addresses` + +On the other system do: + +- `yggstack -autoconf -socks 127.0.0.1:9050` +- `git annex p2p --link` and paste in the generated address when asked (it should have the form `tor-annex::.pk.ygg:12345:`) + +On the server side this simply exposes the p2p socket generated for tor through a different means, and on the client side this works because yggstack can be used similarly enough to tor (doing name resolution through the socks proxy at port 9050 and then connecting the supplied port). + +--- + +I really like your proposal of a `p2p-annex::foo+` remote; together with a way to tell remotedaemon to start a process exposing the socket it would make for an easily extendable mechanism. Imagine this: + +Client side: + +- `p2p-annex::foo+` would start `git-annex-p2p-foo ` and talk to its stdin/stdout. + +Server side: + +- A configuration option `annex.start-p2psocket=true` would instruct remotedaemon to listen on .git/annex/p2psocket (I think a hardcoded location is fine, as there only really needs to be one such socket even with multiple networks, and somewhere under .git/annex is a good location to associate it with the repository and will always be writable by the user). 
+- A configuration option `annex.expose-p2p-via=foo` that could be supplied zero, one, or multiple times, and each of these configurations would instruct remotedaemon to start the external program git-annex-p2ptransport-foo after the p2p socket is ready (this configuration could also just point to a command to execute, but I thought it might be nice to stay with the theme of commonly prefixed programs). + +With these things in place a third-party package git-annex-p2p-yggstack could provide a simple set of shell scripts to implement transport over yggstack: + +For the server side there would be a `git-annex-p2ptransport-yggstack` along these lines (modulo proper process cleanup of course): + +``` +socat TCP-LISTEN:12345,fork,reuseaddr UNIX-CONNECT:.git/annex/p2psocket & +yggstack -useconffile .git/annex/p2ptransport/yggstack/yggstack.conf -remote-tcp 12345:127.0.0.1:12345 +``` + +and a `git-annex-p2ptransport-enable-yggstack` like this: + +``` +git config --local annex.start-p2psocket true +git config --local --add annex.expose-p2p-via yggstack +if [ ! -f .git/annex/p2ptransport/yggstack/yggstack.conf ]; then + yggstack -genconf > .git/annex/p2ptransport/yggstack/yggstack.conf +fi +echo \"p2p-annex::yggstack+.pk.ygg:12345\" >> .git/annex/creds/p2paddrs +``` + +For the client-side it would provide `git-annex-p2p-yggstack` along these lines: + +``` +yggstack -autoconf -socks 127.0.0.1:1080 +nc -X 5 -x 127.0.0.1:1080 .pk.ygg 12345 +``` + +With that package installed one could then do `git annex p2ptransport enable-yggstack` followed by `git annex p2p --gen-addresses`. A `git annex remotedaemon` would now start everything on the server-side, and the client-side could connect using `git annex p2p --link` with the address from `--gen-addresses`. + +--- + +I think this would be sufficiently flexible for most kinds of p2p transport one could come up with. E.g. 
a transport over fowl or even plain magic-wormhole (though the transit relay wouldn't appreciate it) could use `p2p-annex::fowl+` where the code is a pre-generated token instead of the usual passphrases used by magic-wormhole. The server side would be a script that repeatedly waits for connections to that code, the client side just connects to it. + +Even for more traditional p2p setups (tinc, wireguard, yggdrasil, etc.) where the transport is pre-set up at the system level this would just work if there was a helper for `p2p-annex::tcpip+:` (effectively just netcat again). + +--- + +Configuration, program, and subcommand names etc. are of course open to bike-shedding. Some of the hardcoded ports above should be dynamically chosen, or completely avoided if the transport can do so (yggstack and fowl can't expose unix sockets directly yet, so the digression through the loopback device is needed for now). + +What do you think? +"""]] From 740bd74f2929e26b143f1d4d3a0abbfc247e6e3e Mon Sep 17 00:00:00 2001 From: Doable8234 Date: Sat, 14 Dec 2024 08:15:22 +0000 Subject: [PATCH 02/10] Added a comment --- ...mment_4_f8254b4671a4fae5b51398cddcc190c2._comment | 12 ++++++++++++ 1 file changed, 12 insertions(+) create mode 100644 doc/forum/Unable_to_delete_preferred-content.log/comment_4_f8254b4671a4fae5b51398cddcc190c2._comment diff --git a/doc/forum/Unable_to_delete_preferred-content.log/comment_4_f8254b4671a4fae5b51398cddcc190c2._comment b/doc/forum/Unable_to_delete_preferred-content.log/comment_4_f8254b4671a4fae5b51398cddcc190c2._comment new file mode 100644 index 0000000000..6a3dc152e7 --- /dev/null +++ b/doc/forum/Unable_to_delete_preferred-content.log/comment_4_f8254b4671a4fae5b51398cddcc190c2._comment @@ -0,0 +1,12 @@ +[[!comment format=mdwn + username="Doable8234" + avatar="http://cdn.libravatar.org/avatar/b0d5fea745f92c3b8cc8ecc3dafa6278" + subject="comment 4" + date="2024-12-14T08:15:22Z" + content=""" +Thanks, Joey. That seems to work based on my testing. 
Appreciate the quick and precise response! + +Fixing my actual repo will have to wait since one of my nodes is now offline, but hopefully that goes off without a glitch. + +Also just want to say how awesome git annex is. I've been using it for nearly 10 years now and don't see myself ever wanting to stop. +"""]] From e019b0d85c4292a214226f3f2145f2620707ad5c Mon Sep 17 00:00:00 2001 From: eugen Date: Sat, 14 Dec 2024 17:48:08 +0000 Subject: [PATCH 03/10] --- .../Compute_space_required_for_a_git_annex_get_--auto__63__.mdwn | 1 + 1 file changed, 1 insertion(+) create mode 100644 doc/forum/Compute_space_required_for_a_git_annex_get_--auto__63__.mdwn diff --git a/doc/forum/Compute_space_required_for_a_git_annex_get_--auto__63__.mdwn b/doc/forum/Compute_space_required_for_a_git_annex_get_--auto__63__.mdwn new file mode 100644 index 0000000000..f75b843310 --- /dev/null +++ b/doc/forum/Compute_space_required_for_a_git_annex_get_--auto__63__.mdwn @@ -0,0 +1 @@ +Before I run a command that gets new content in a repository -- especially with the --auto flag -- is there a way to find out the size of the data to be copied? My case is simple. I'm just using USB sticks/drives. But I never know if the space is enough for the next `get --auto` command...
From 84c86ad2944ef6aa70c723c1081b2aab7be71dcb Mon Sep 17 00:00:00 2001 From: matrss Date: Sun, 15 Dec 2024 18:13:00 +0000 Subject: [PATCH 04/10] Added a comment --- ..._8e4b4f476284b0a9f77b3ebf158b5c34._comment | 20 +++++++++++++++++++ 1 file changed, 20 insertions(+) create mode 100644 doc/todo/generic_p2p_socket_transport/comment_4_8e4b4f476284b0a9f77b3ebf158b5c34._comment diff --git a/doc/todo/generic_p2p_socket_transport/comment_4_8e4b4f476284b0a9f77b3ebf158b5c34._comment b/doc/todo/generic_p2p_socket_transport/comment_4_8e4b4f476284b0a9f77b3ebf158b5c34._comment new file mode 100644 index 0000000000..0d0f7ce839 --- /dev/null +++ b/doc/todo/generic_p2p_socket_transport/comment_4_8e4b4f476284b0a9f77b3ebf158b5c34._comment @@ -0,0 +1,20 @@ +[[!comment format=mdwn + username="matrss" + avatar="http://cdn.libravatar.org/avatar/cd1c0b3be1af288012e49197918395f0" + subject="comment 4" + date="2024-12-15T18:13:00Z" + content=""" +One more thought: the proposed `p2p-annex::foo+` remote makes one assumption that I don't think holds for all conceivable p2p transports. That assumption is that there is a public address for the server-side that can be trusted to be the expected other side. + +For tor and yggstack this does hold: the public address (onion address of the hidden service for tor and the IPv6 derived from the public key of the yggstack peer (potentially resolved from a .pk.ygg DNS entry like above), respectively) ensures that the server side is who they are expected to be. There is no way for a third-party to pretend that they were the server-side, even if they knew the git remote string, because they would need to have the server's private key to do so. + +This is not the case for fowl: with fowl one would essentially do `fowl ...` on both sides to create a tunnel between server and client.
If the PSK were fully contained in the remote string then a third-party getting hold of that string could pretend to be the server (when the server side is currently not waiting for a connection itself) and steal the auth token from the client. So under the assumption that the remote string is not a secret this would be a problem. + +But this problem can be overcome: with fowl both sides could simply derive the psk from the p2p auth token to establish the connection, essentially like so: `fowl - ...`. The git remote string would only need to contain the information to use fowl and some unique identifier for the remote then, so that the right auth token can be taken from .git/annex/creds. + +Likewise, for other p2p transports that don't have stable and secure public addresses, necessary information exchange could also happen over magic-wormhole using the auth tokens, or the auth tokens could be used as PSKs between both sides if that's what the transport needs. This would e.g. apply for a hypothetical transport over webrtc data channels, where some kind of \"SDP\" has to be exchanged between both sides to establish a connection. + +--- + +All that to say: I think `p2p-annex::foo+` would indeed be general enough for many conceivable means of transport, if a re-use of the auth tokens in the above fashion would be acceptable. And I can't think of anything against it, yet. 
+"""]] From a8970f23b8ea04b7b63e8000ef992717ccd3ee4c Mon Sep 17 00:00:00 2001 From: matrss Date: Sun, 15 Dec 2024 23:39:34 +0000 Subject: [PATCH 05/10] Added a comment --- .../comment_1_a9160519a19f35ce6bb1cc555d7112b7._comment | 8 ++++++++ 1 file changed, 8 insertions(+) create mode 100644 doc/forum/Compute_space_required_for_a_git_annex_get_--auto__63__/comment_1_a9160519a19f35ce6bb1cc555d7112b7._comment diff --git a/doc/forum/Compute_space_required_for_a_git_annex_get_--auto__63__/comment_1_a9160519a19f35ce6bb1cc555d7112b7._comment b/doc/forum/Compute_space_required_for_a_git_annex_get_--auto__63__/comment_1_a9160519a19f35ce6bb1cc555d7112b7._comment new file mode 100644 index 0000000000..ccb0defd56 --- /dev/null +++ b/doc/forum/Compute_space_required_for_a_git_annex_get_--auto__63__/comment_1_a9160519a19f35ce6bb1cc555d7112b7._comment @@ -0,0 +1,8 @@ +[[!comment format=mdwn + username="matrss" + avatar="http://cdn.libravatar.org/avatar/cd1c0b3be1af288012e49197918395f0" + subject="comment 1" + date="2024-12-15T23:39:33Z" + content=""" +Something like this should get you the answer: `git annex info --fast . --not --in here --and --want-get` (adapted from the example here: ). 
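
To turn that number into a yes/no answer for a USB drive, the reported size can be compared with the free space that `df` reports. A rough sketch, assuming `git annex info` is run with `--bytes` and prints a line labelled `size of annexed files in working tree:`; the function names are made up for illustration and are not part of git-annex:

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of git-annex.

# Extract the byte count from `git annex info --bytes` output on stdin.
# Assumes a line of the form "size of annexed files in working tree: N".
needed_bytes() {
    sed -n 's/^size of annexed files in working tree: \([0-9][0-9]*\).*$/\1/p'
}

# Free bytes on the filesystem holding the current directory (POSIX df).
free_bytes() {
    df -Pk . | awk 'NR == 2 { printf "%.0f\n", $4 * 1024 }'
}

# Succeed if $1 bytes fit into $2 bytes of free space.
fits() {
    [ "$1" -le "$2" ]
}

# Put the pieces together; meant to be run inside the repository on the drive.
check_get_auto() {
    need=$(git annex info --fast --bytes . --not --in here --and --want-get | needed_bytes)
    fits "${need:-0}" "$(free_bytes)"
}
```

Something like `check_get_auto && git annex get --auto` would then only start the transfer when the drive looks large enough. If the label in the output differs, only `needed_bytes` needs adjusting.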
+"""]] From 6664feb693539528c60ecfea83aba9b2f6aac6ca Mon Sep 17 00:00:00 2001 From: Doable8234 Date: Mon, 16 Dec 2024 08:20:44 +0000 Subject: [PATCH 06/10] Added a comment --- .../comment_2_8ff41c2d22d49feb7ce8af7feaff7914._comment | 8 ++++++++ 1 file changed, 8 insertions(+) create mode 100644 doc/forum/Requesting_of_files_across_disconnected_devices/comment_2_8ff41c2d22d49feb7ce8af7feaff7914._comment diff --git a/doc/forum/Requesting_of_files_across_disconnected_devices/comment_2_8ff41c2d22d49feb7ce8af7feaff7914._comment b/doc/forum/Requesting_of_files_across_disconnected_devices/comment_2_8ff41c2d22d49feb7ce8af7feaff7914._comment new file mode 100644 index 0000000000..8a0f9fd7e0 --- /dev/null +++ b/doc/forum/Requesting_of_files_across_disconnected_devices/comment_2_8ff41c2d22d49feb7ce8af7feaff7914._comment @@ -0,0 +1,8 @@ +[[!comment format=mdwn + username="Doable8234" + avatar="http://cdn.libravatar.org/avatar/b0d5fea745f92c3b8cc8ecc3dafa6278" + subject="comment 2" + date="2024-12-16T08:20:43Z" + content=""" +I've thought about this exact use case, though I never actually used it yet. One simple way to do this could be by using git annex preferred content settings. In the nodes that push out content, all you need to do is set up a cron job for `git annex sync --content`. Now you can make it push content wherever you want by adjusting the preferred content settings. 
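
As a rough illustration of the cron idea, a small wrapper script could look like the following. The script location, the repository path and the schedule are placeholders, and the `.git` check assumes a non-bare repository:

```shell
#!/bin/sh
# Hypothetical wrapper, e.g. saved as $HOME/bin/annex-sync.sh and run from cron:
#   */30 * * * * $HOME/bin/annex-sync.sh /path/to/repo
# (script location, repository path and schedule are placeholders)

annex_sync() {
    repo=$1
    # Refuse to run if the path does not look like a (non-bare) git repository.
    [ -d "$repo/.git" ] || { echo "not a git repository: $repo" >&2; return 1; }
    # Sync the git branches and transfer content according to the
    # preferred content settings of the repositories involved.
    (cd "$repo" && git annex sync --content)
}

# Only act when a repository path was given on the command line.
if [ "$#" -gt 0 ]; then
    annex_sync "$1"
fi
```

Where the content ends up is then controlled purely through the preferred content settings; the cron job itself never has to change.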
+"""]] From fd41b4c53e5f319a62af2fb8d9f3038b8fcb3e6d Mon Sep 17 00:00:00 2001 From: Doable8234 Date: Mon, 16 Dec 2024 08:24:32 +0000 Subject: [PATCH 07/10] Added a comment --- .../comment_1_1d3e1d89535b72c185cb93c4aeae0ccb._comment | 8 ++++++++ 1 file changed, 8 insertions(+) create mode 100644 doc/todo/support_--not_--unused/comment_1_1d3e1d89535b72c185cb93c4aeae0ccb._comment diff --git a/doc/todo/support_--not_--unused/comment_1_1d3e1d89535b72c185cb93c4aeae0ccb._comment b/doc/todo/support_--not_--unused/comment_1_1d3e1d89535b72c185cb93c4aeae0ccb._comment new file mode 100644 index 0000000000..9775eadfa8 --- /dev/null +++ b/doc/todo/support_--not_--unused/comment_1_1d3e1d89535b72c185cb93c4aeae0ccb._comment @@ -0,0 +1,8 @@ +[[!comment format=mdwn + username="Doable8234" + avatar="http://cdn.libravatar.org/avatar/b0d5fea745f92c3b8cc8ecc3dafa6278" + subject="comment 1" + date="2024-12-16T08:24:31Z" + content=""" +I've absolutely no idea about the relative difficulty of implementing these, but it sounds to me like your second part `It would also perhaps be good to detect when matching options are used that don't make sense, and error out on commands like git-annex find --not or git-annex find -and -(` might actually be more important than the first! +"""]] From 932ad041de50aa8bbca75b1b9c50e5a91430b861 Mon Sep 17 00:00:00 2001 From: matrss Date: Mon, 16 Dec 2024 22:02:42 +0000 Subject: [PATCH 08/10] --- doc/bugs/Installation_error_on_android.mdwn | 76 +++++++++++++++++++++ 1 file changed, 76 insertions(+) create mode 100644 doc/bugs/Installation_error_on_android.mdwn diff --git a/doc/bugs/Installation_error_on_android.mdwn b/doc/bugs/Installation_error_on_android.mdwn new file mode 100644 index 0000000000..3e1fbc1638 --- /dev/null +++ b/doc/bugs/Installation_error_on_android.mdwn @@ -0,0 +1,76 @@ +### Please describe the problem. 
+ +Following the installation instructions for android (termux), I get an error while sourcing git-annex-install: + +``` +Running on Android.. Tuning for optimal behavior. +sed: can't read /data/data/com.termux/files/home/git-annex.linux/git-remote-annex: No such file or directory +``` + +I can confirm that git-remote-annex is indeed missing in that directory. + +### What steps will reproduce the problem? + +``` +pkg install wget +wget https://git-annex.branchable.com/install/Android/git-annex-install +source git-annex-install +``` + +### What version of git-annex are you using? On what operating system? + +None yet x) and on a freshly updated termux. + +### Please provide any additional information below. + +[[!format sh """ +# If you can, paste a complete transcript of the problem occurring here. +# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log +~ $ wget https://git-annex.branchable.com/install/Android/git-annex-install +source git-annex-install +--2024-12-16 23:01:13-- https://git-annex.branchable.com/install/Android/git-annex-install +Resolving git-annex.branchable.com (git-annex.branchable.com)... 2600:3c03::f03c:91ff:fedf:c0e5, 66.228.46.55 +Connecting to git-annex.branchable.com (git-annex.branchable.com)|2600:3c03::f03c:91ff:fedf:c0e5|:443... connected. +HTTP request sent, awaiting response... 200 OK +Length: 1470 (1.4K) +Saving to: ‘git-annex-install’ + +git-annex-ins 100% 1.44K --.-KB/s in 0s + +2024-12-16 23:01:14 (194 MB/s) - ‘git-annex-install’ saved [1470/1470] + +Installing dependencies with termux pkg manager... +Checking availability of current mirror: +[*] https://ftp.fau.de/termux/termux-main: ok +Reading package lists... Done +Building dependency tree... Done +Reading state information... Done +git is already the newest version (2.47.1). +wget is already the newest version (1.25.0). +tar is already the newest version (1.35). +coreutils is already the newest version (9.5-3). 
+proot is already the newest version (5.1.107-65). +0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded. +Downloading git-annex... +--2024-12-16 23:01:14-- https://downloads.kitenet.net/git-annex/linux/current/git-annex-standalone-arm64-ancient.tar.gz +Resolving downloads.kitenet.net (downloads.kitenet.net)... 2600:3c03::f03c:91ff:fe73:b0d2, 66.228.36.95 +Connecting to downloads.kitenet.net (downloads.kitenet.net)|2600:3c03::f03c:91ff:fe73:b0d2|:443... connected. +HTTP request sent, awaiting response... 200 OK +Length: 57553624 (55M) [application/x-gzip] +Saving to: ‘STDOUT’ + +- 100% 54.89M 8.16MB/s in 11s + +2024-12-16 23:01:25 (5.18 MB/s) - written to stdout [57553624/57553624] + +Running on Android.. Tuning for optimal behavior. +sed: can't read /data/data/com.termux/files/home/git-annex.linux/git-remote-annex: No such file or directory + +[Process completed (code 2) - press Enter] + +# End of transcript or log. +"""]] + +### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders) + + From 910b9a5e4a69957b643e73da5ecc2aca3f7ac23d Mon Sep 17 00:00:00 2001 From: matrss Date: Mon, 16 Dec 2024 22:19:34 +0000 Subject: [PATCH 09/10] Added a comment --- ...t_1_828e5d92ec03594170d7ac52d346533d._comment | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) create mode 100644 doc/forum/clone_and_initialize_with_a_given_uuid/comment_1_828e5d92ec03594170d7ac52d346533d._comment diff --git a/doc/forum/clone_and_initialize_with_a_given_uuid/comment_1_828e5d92ec03594170d7ac52d346533d._comment b/doc/forum/clone_and_initialize_with_a_given_uuid/comment_1_828e5d92ec03594170d7ac52d346533d._comment new file mode 100644 index 0000000000..a00f64e35f --- /dev/null +++ b/doc/forum/clone_and_initialize_with_a_given_uuid/comment_1_828e5d92ec03594170d7ac52d346533d._comment @@ -0,0 +1,16 @@ +[[!comment format=mdwn + username="matrss" + 
avatar="http://cdn.libravatar.org/avatar/cd1c0b3be1af288012e49197918395f0" + subject="comment 1" + date="2024-12-16T22:19:34Z" + content=""" +It _looks_ like you can just set annex.uuid before the first `git annex init` to achieve this: + +``` +git init / git clone +git config annex.uuid 00000000-0000-0000-0000-000000000003 +git annex init +``` + +But I would say that doing so is ill-advised. You can set a description for each repository and give the remotes descriptive names instead. If you use shared UUIDs you will run into an issue if it ever happens that two of those repositories become connected. +"""]] From 6724d1fd6246525e00193a8de66e4a159219cc10 Mon Sep 17 00:00:00 2001 From: matrss Date: Mon, 16 Dec 2024 22:48:16 +0000 Subject: [PATCH 10/10] Added a comment --- ...comment_3_48b8108fa3fd16ef72c5beeb0765e5c5._comment | 10 ++++++++++ 1 file changed, 10 insertions(+) create mode 100644 doc/forum/Requesting_of_files_across_disconnected_devices/comment_3_48b8108fa3fd16ef72c5beeb0765e5c5._comment diff --git a/doc/forum/Requesting_of_files_across_disconnected_devices/comment_3_48b8108fa3fd16ef72c5beeb0765e5c5._comment b/doc/forum/Requesting_of_files_across_disconnected_devices/comment_3_48b8108fa3fd16ef72c5beeb0765e5c5._comment new file mode 100644 index 0000000000..406b747057 --- /dev/null +++ b/doc/forum/Requesting_of_files_across_disconnected_devices/comment_3_48b8108fa3fd16ef72c5beeb0765e5c5._comment @@ -0,0 +1,10 @@ +[[!comment format=mdwn + username="matrss" + avatar="http://cdn.libravatar.org/avatar/cd1c0b3be1af288012e49197918395f0" + subject="comment 3" + date="2024-12-16T22:48:16Z" + content=""" +There is a standard group called \"transfer\" which is meant for this kind of thing: . This is especially applicable if there is a static preferred content expression that can be written for each repository (i.e. no ad-hoc gets, just something more structured). 
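
A minimal sketch of that setup, using `git annex group` and `git annex wanted` with the standard preferred content expression. The function name is hypothetical, and it assumes the repository that carries the files around (e.g. a USB drive) goes into the `transfer` group while the machines go into `client`:

```shell
#!/bin/sh
# Sketch: put the current repository into a standard group and let that
# group's standard preferred content expression decide what it wants.
# The function name is hypothetical.
join_standard_group() {
    group=$1   # e.g. "transfer" for the drive, "client" for the machines
    [ -n "$group" ] || { echo "usage: join_standard_group GROUP" >&2; return 2; }
    git annex group here "$group" &&
    git annex wanted here standard
}
```

On the drive one would run `join_standard_group transfer`, on each computer `join_standard_group client`; after that, `git annex sync --content` on each side should move files through the drive until all clients have them.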
+ +To make it more dynamic you could include a match on a metadata tag in a repository's preferred content expression. Requesting a file would then be setting the tag on it (well, and a bunch of syncing in all repositories). +"""]]