From ea44f2416cca538f0bab0679ed6765885788a5ce Mon Sep 17 00:00:00 2001 From: yarikoptic Date: Wed, 18 Jan 2023 17:55:50 +0000 Subject: [PATCH 1/9] Added a comment --- .../comment_6_4397649b4a2115891cb0f597999cca66._comment | 8 ++++++++ 1 file changed, 8 insertions(+) create mode 100644 doc/todo/copy_with_both_--to_and_--from_/comment_6_4397649b4a2115891cb0f597999cca66._comment diff --git a/doc/todo/copy_with_both_--to_and_--from_/comment_6_4397649b4a2115891cb0f597999cca66._comment b/doc/todo/copy_with_both_--to_and_--from_/comment_6_4397649b4a2115891cb0f597999cca66._comment new file mode 100644 index 0000000000..8c52557902 --- /dev/null +++ b/doc/todo/copy_with_both_--to_and_--from_/comment_6_4397649b4a2115891cb0f597999cca66._comment @@ -0,0 +1,8 @@ +[[!comment format=mdwn + username="yarikoptic" + avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4" + subject="comment 6" + date="2023-01-18T17:55:49Z" + content=""" +FWIW: I also feel that the 2nd one (no effect on a possibly present local copy) would be preferable. +"""]] From da6504ee13d520f719a6a392ec3b704a2474d564 Mon Sep 17 00:00:00 2001 From: jpds Date: Wed, 18 Jan 2023 22:45:06 +0000 Subject: [PATCH 2/9] --- ...S3_remote_errors_with_garage_endpoint.mdwn | 30 +++++++++++++++++++ 1 file changed, 30 insertions(+) create mode 100644 doc/bugs/S3_remote_errors_with_garage_endpoint.mdwn diff --git a/doc/bugs/S3_remote_errors_with_garage_endpoint.mdwn b/doc/bugs/S3_remote_errors_with_garage_endpoint.mdwn new file mode 100644 index 0000000000..1bb1f330eb --- /dev/null +++ b/doc/bugs/S3_remote_errors_with_garage_endpoint.mdwn @@ -0,0 +1,30 @@ +### Please describe the problem. + +When I attempt to create an S3 remote against my garage[1] cluster, it errors with the following: + +``` +$ git annex initremote garage type=S3 encryption=none host=my-s3-endpoint.domain.com protocol=https bucket=git-annex requeststyle=path datacenter=garage signature=v4 +initremote garage (checking bucket...) 
(creating bucket in garage...) +git-annex: S3Error {s3StatusCode = Status {statusCode = 400, statusMessage = "Bad Request"}, s3ErrorCode = "AuthorizationHeaderMalformed", s3ErrorMessage = "Authorization header malformed, expected scope: 20230118/my-s3-endpoint.domain.com/s3/aws4_request", s3ErrorResource = Just "/git-annex/", s3ErrorHostId = Nothing, s3ErrorAccessKeyId = Nothing, s3ErrorStringToSign = Nothing, s3ErrorBucket = Nothing, s3ErrorEndpointRaw = Nothing, s3ErrorEndpoint = Nothing} +failed +initremote: 1 failed + +$ git annex initremote garage type=S3 encryption=none host=my-s3-endpoint.domain.com protocol=https bucket=git-annex requeststyle=path datacenter=garage +initremote garage (checking bucket...) (creating bucket in garage...) +git-annex: S3Error {s3StatusCode = Status {statusCode = 400, statusMessage = "Bad Request"}, s3ErrorCode = "InvalidRequest", s3ErrorMessage = "Bad request: Unsupported authorization method", s3ErrorResource = Just "/git-annex/", s3ErrorHostId = Nothing, s3ErrorAccessKeyId = Nothing, s3ErrorStringToSign = Nothing, s3ErrorBucket = Nothing, s3ErrorEndpointRaw = Nothing, s3ErrorEndpoint = Nothing} +failed +initremote: 1 failed +``` + +Garage appears to support v4 signatures: https://garagehq.deuxfleurs.fr/documentation/reference-manual/s3-compatibility/#high-level-features - and other S3 tooling works against the endpoint. + + +### What version of git-annex are you using? On what operating system? + +Fedora Silverblue 37 / git-annex-10.20221212-1.fc37.x86_64 + +### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders) + +Yes, many years ago - now trying to get it up and running with my self-hosted S3 endpoint. 
+ +[1]: https://garagehq.deuxfleurs.fr/ From c071ea267d9b5588c35baa6fb86562cfbecdf54e Mon Sep 17 00:00:00 2001 From: jpds Date: Wed, 18 Jan 2023 22:57:58 +0000 Subject: [PATCH 3/9] Added a comment --- .../comment_1_1d5b499a0cea623aadebf7e3b7fd9752._comment | 8 ++++++++ 1 file changed, 8 insertions(+) create mode 100644 doc/bugs/S3_remote_errors_with_garage_endpoint/comment_1_1d5b499a0cea623aadebf7e3b7fd9752._comment diff --git a/doc/bugs/S3_remote_errors_with_garage_endpoint/comment_1_1d5b499a0cea623aadebf7e3b7fd9752._comment b/doc/bugs/S3_remote_errors_with_garage_endpoint/comment_1_1d5b499a0cea623aadebf7e3b7fd9752._comment new file mode 100644 index 0000000000..94f0e7c110 --- /dev/null +++ b/doc/bugs/S3_remote_errors_with_garage_endpoint/comment_1_1d5b499a0cea623aadebf7e3b7fd9752._comment @@ -0,0 +1,8 @@ +[[!comment format=mdwn + username="jpds" + avatar="http://cdn.libravatar.org/avatar/24d746ec6a7726b162c12ecceb3ee267" + subject="comment 1" + date="2023-01-18T22:57:58Z" + content=""" +Error on Garage's side is triggered here: https://git.deuxfleurs.fr/Deuxfleurs/garage/src/commit/fcc5033466e58e3beec05ee7748d33522b6b32b0/src/api/signature/payload.rs#L297 +"""]] From dce215e11a6f25212d1d28b9332fa9b4c97ce99c Mon Sep 17 00:00:00 2001 From: jpds Date: Thu, 19 Jan 2023 15:09:01 +0000 Subject: [PATCH 4/9] Added a comment --- ...comment_2_ca7ffa315cfa49e028fe6ff2d5c3133b._comment | 10 ++++++++++ 1 file changed, 10 insertions(+) create mode 100644 doc/bugs/S3_remote_errors_with_garage_endpoint/comment_2_ca7ffa315cfa49e028fe6ff2d5c3133b._comment diff --git a/doc/bugs/S3_remote_errors_with_garage_endpoint/comment_2_ca7ffa315cfa49e028fe6ff2d5c3133b._comment b/doc/bugs/S3_remote_errors_with_garage_endpoint/comment_2_ca7ffa315cfa49e028fe6ff2d5c3133b._comment new file mode 100644 index 0000000000..c19475c45d --- /dev/null +++ b/doc/bugs/S3_remote_errors_with_garage_endpoint/comment_2_ca7ffa315cfa49e028fe6ff2d5c3133b._comment @@ -0,0 +1,10 @@ +[[!comment format=mdwn + 
 username="jpds" + avatar="http://cdn.libravatar.org/avatar/24d746ec6a7726b162c12ecceb3ee267" + subject="comment 2" + date="2023-01-19T15:09:01Z" + content=""" +I took a look at the credentialv4 structure at https://github.com/aristidb/aws/blob/9bdc4ee018d0d9047c0434eeb21e2383afaa9ccf/Aws/Core.hs#L621 and found it curious that it has the region inside the scope (as the garage code does)... however, in my error message from git-annex, the hostname of the S3 service is what's inside the scope instead of the 'garage' region name. + +I therefore adjusted the garage API's configuration to have the FQDN as the region and then... git-annex Just Worked. +"""]] From 73cc3fcd12a9562c8cc700201ed3c60bad0b1d9d Mon Sep 17 00:00:00 2001 From: jpds Date: Thu, 19 Jan 2023 16:28:19 +0000 Subject: [PATCH 5/9] Added a comment --- ..._ae9308a3bab8904dd0f501cbe2f09de0._comment | 43 +++++++++++++++++++ 1 file changed, 43 insertions(+) create mode 100644 doc/bugs/S3_remote_errors_with_garage_endpoint/comment_3_ae9308a3bab8904dd0f501cbe2f09de0._comment diff --git a/doc/bugs/S3_remote_errors_with_garage_endpoint/comment_3_ae9308a3bab8904dd0f501cbe2f09de0._comment b/doc/bugs/S3_remote_errors_with_garage_endpoint/comment_3_ae9308a3bab8904dd0f501cbe2f09de0._comment new file mode 100644 index 0000000000..c796318031 --- /dev/null +++ b/doc/bugs/S3_remote_errors_with_garage_endpoint/comment_3_ae9308a3bab8904dd0f501cbe2f09de0._comment @@ -0,0 +1,43 @@ +[[!comment format=mdwn + username="jpds" + avatar="http://cdn.libravatar.org/avatar/24d746ec6a7726b162c12ecceb3ee267" + subject="comment 3" + date="2023-01-19T16:28:19Z" + content=""" +I believe the fix for this is: + +``` +diff --git a/Remote/S3.hs b/Remote/S3.hs +index f5014202e..49f2ebd58 100644 +--- a/Remote/S3.hs ++++ b/Remote/S3.hs +@@ -948,8 +948,8 @@ s3Configuration c = cfg + | otherwise -> AWS.HTTP + cfg = case getRemoteConfigValue signatureField c of + Just (SignatureVersion 4) -> +- S3.s3v4 proto endpoint False S3.SignWithEffort +- _ -> 
S3.s3 proto endpoint False ++ S3.s3v4 proto datacenter False S3.SignWithEffort ++ _ -> S3.s3 proto datacenter False + + data S3Info = S3Info + { bucket :: S3.Bucket +``` + +...however I cannot test it myself right now as it's failing to compile on another bit of code: + +``` +[452 of 679] Compiling Remote.S3 + +git/joeyh/git-annex.branchable.com/Remote/S3.hs:922:68: error: + • Couldn't match type ‘B8.ByteString’ with ‘[Char]’ + Expected type: String + Actual type: B8.ByteString + • In the first argument of ‘T.pack’, namely ‘datacenter’ + In the second argument of ‘($)’, namely ‘T.pack datacenter’ + In the expression: AWS.s3HostName $ T.pack datacenter + | +922 | | h == AWS.s3DefaultHost = AWS.s3HostName $ T.pack datacenter + | ^^^^^^^^^^ +``` +"""]] From f14346bf078f274bf2e7b9a4eec11eb133d89c98 Mon Sep 17 00:00:00 2001 From: nobodyinperson Date: Fri, 20 Jan 2023 10:29:33 +0000 Subject: [PATCH 6/9] --- ...default_preferred_content_expressions.mdwn | 26 +++++++++++++++++++ 1 file changed, 26 insertions(+) create mode 100644 doc/todo/Setting_default_preferred_content_expressions.mdwn diff --git a/doc/todo/Setting_default_preferred_content_expressions.mdwn b/doc/todo/Setting_default_preferred_content_expressions.mdwn new file mode 100644 index 0000000000..19c5100e39 --- /dev/null +++ b/doc/todo/Setting_default_preferred_content_expressions.mdwn @@ -0,0 +1,26 @@ +Hey Joey, + +If I understand correctly, the default content expression (when it's empty, e.g. after a `git annex init` or `git clone ...;git annex sync`) is currently apparently `anything`. This means that a `git annex sync --content` (or just `git annex sync` if `git config --set annex.synccontent true`) will fetch all files. + +It would be very handy if there was something like: + +[[!format bash """ +git annex config --set annex.defaultwanted ... +git annex config --set annex.defaultgroup ... +git annex config --set annex.defaultgroupwanted ... +git annex config --set annex.defaultrequired ... 
+ +# and the corresponding git variant for user-overriding +git config [--global|--system] annex.defaultwanted ... +git config [--global|--system] annex.defaultgroup ... +git config [--global|--system] annex.defaultgroupwanted ... +git config [--global|--system] annex.defaultrequired ... +"""]] + +These defaults would be applied when `git annex` initializes a repository (i.e. gives it an `annex.uuid`, e.g. `git annex init` or `git annex sync` of a fresh clone of a repo with annex). + +I like my annexed/datalad repos (mostly research data next to analysis code for collaboration) to have `annex.synccontent = true` so people can just do (`datalad save`/`git annex add`) `git annex sync` and be sure afterwards everything is in order and safe. However, as the default `wanted` is `anything` (apparently), they also get all files they probably don't want if they don't run `git annex wanted . present` manually (and manual boilerplate config and extra steps are always something that's nice to automate). Something like `git annex config --set annex.defaultwanted present` would solve this. + +Thanks again very much for git-annex, I love it! 💛 + +Yann From 6f95f821cb3e3dc57cc678ceda279728d6c519b6 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Wed, 18 Jan 2023 15:15:41 -0400 Subject: [PATCH 7/9] remove --fast from man page git-annex move does not actually behave any differently with --fast than without it. (git-annex copy does) (cherry picked from commit f74904ee2c7cec936cac3f2536daa1d426739b80) --- doc/git-annex-move.mdwn | 6 ------ 1 file changed, 6 deletions(-) diff --git a/doc/git-annex-move.mdwn b/doc/git-annex-move.mdwn index 0fbea84d58..b57d834915 100644 --- a/doc/git-annex-move.mdwn +++ b/doc/git-annex-move.mdwn @@ -68,12 +68,6 @@ Paths of files or directories to operate on can be specified. Use this option to move a specified key. -* `--fast` - - When moving content to a remote, avoid a round trip to check if the remote - already has content. 
This can be faster, but might skip moving content - to the remote in some cases. - * matching options The [[git-annex-matching-options]](1) From 5645017a03910f787c5eac9eaacbad23ca53697b Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Fri, 20 Jan 2023 11:23:04 -0400 Subject: [PATCH 8/9] comment --- ..._959f6081cb3cb777ea4fad70bad07da3._comment | 30 +++++++++++++++++++ 1 file changed, 30 insertions(+) create mode 100644 doc/todo/copy_with_both_--to_and_--from_/comment_6_959f6081cb3cb777ea4fad70bad07da3._comment diff --git a/doc/todo/copy_with_both_--to_and_--from_/comment_6_959f6081cb3cb777ea4fad70bad07da3._comment b/doc/todo/copy_with_both_--to_and_--from_/comment_6_959f6081cb3cb777ea4fad70bad07da3._comment new file mode 100644 index 0000000000..1a15c3943f --- /dev/null +++ b/doc/todo/copy_with_both_--to_and_--from_/comment_6_959f6081cb3cb777ea4fad70bad07da3._comment @@ -0,0 +1,30 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 6""" + date="2023-01-20T15:11:50Z" + content=""" +I've started on an implementation of this, in the `fromto` branch. + +Downloading to a local temp file has some complications which make me want +to avoid it if possible. For one thing, these temp files would have to +somehow get cleaned up after an interrupted move. For another, two +concurrent move processes from different remotes to different remotes would +need to either use separate temp files (wasting disk space) or locking so +only one uses the temp file at one time. The existing code in +Annex.Transfer would have to be parameterized with the temp file to use, +but then the transfer log/lock files that are used by that code would be +problematic. So perhaps that Annex.Transfer code could not be reused, but +then it would need to independently deal with resuming, locking, and stall +detection. + +So, I'm considering downloading --from the remote as usual, populating the +local annex with the content, sending that --to the remote, and then +dropping the local copy. 
That has its own complications, but they seem +mostly less. Although there are two small races that I have not been able +to resolve yet, which would result in `git-annex move --from --to`, when +run concurrently with a `git-annex get` type process, result in the local +copy not being present at the end (see [[!commit a46c385aec2584419330c5dbb571c19ceb92f6fb]]). +That would be surprising behavior, but also unlikely to happen. This approach +also has the problem that, when the file is unlocked, the unlocked file would +get populated after downloading the content, which would be unnecessary work. +"""]] From 05b2ae30f0546d5371928e928c6f8c9307a6888e Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Mon, 23 Jan 2023 12:45:01 -0400 Subject: [PATCH 9/9] update --- ...omment_6_959f6081cb3cb777ea4fad70bad07da3._comment | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/doc/todo/copy_with_both_--to_and_--from_/comment_6_959f6081cb3cb777ea4fad70bad07da3._comment b/doc/todo/copy_with_both_--to_and_--from_/comment_6_959f6081cb3cb777ea4fad70bad07da3._comment index 1a15c3943f..4745538330 100644 --- a/doc/todo/copy_with_both_--to_and_--from_/comment_6_959f6081cb3cb777ea4fad70bad07da3._comment +++ b/doc/todo/copy_with_both_--to_and_--from_/comment_6_959f6081cb3cb777ea4fad70bad07da3._comment @@ -24,7 +24,12 @@ mostly less. Although there are two small races that I have not been able to resolve yet, which would result in `git-annex move --from --to`, when run concurrently with a `git-annex get` type process, result in the local copy not being present at the end (see [[!commit a46c385aec2584419330c5dbb571c19ceb92f6fb]]). -That would be surprising behavior, but also unlikely to happen. This approach -also has the problem that, when the file is unlocked, the unlocked file would -get populated after downloading the content, which would be unnecessary work. +That would be surprising behavior, but also unlikely to happen. 
+(And perhaps not too surprising, since running `git-annex move --to` +concurrently with `git-annex get` can of course result in the local copy +not being present at the end.) + +The latter approach also has the problem that, when the file is unlocked, the +unlocked file would get populated after downloading the content, which would be +unnecessary work. """]]