Merge branch 'master' of ssh://git-annex.branchable.com

Joey Hess 2023-01-20 11:23:24 -04:00
commit 45c338204f
6 changed files with 125 additions and 0 deletions

@ -0,0 +1,30 @@
### Please describe the problem.
When I attempt to create an S3 remote against my garage[1] cluster, it fails with the following errors:
```
$ git annex initremote garage type=S3 encryption=none host=my-s3-endpoint.domain.com protocol=https bucket=git-annex requeststyle=path datacenter=garage signature=v4
initremote garage (checking bucket...) (creating bucket in garage...)
git-annex: S3Error {s3StatusCode = Status {statusCode = 400, statusMessage = "Bad Request"}, s3ErrorCode = "AuthorizationHeaderMalformed", s3ErrorMessage = "Authorization header malformed, expected scope: 20230118/my-s3-endpoint.domain.com/s3/aws4_request", s3ErrorResource = Just "/git-annex/", s3ErrorHostId = Nothing, s3ErrorAccessKeyId = Nothing, s3ErrorStringToSign = Nothing, s3ErrorBucket = Nothing, s3ErrorEndpointRaw = Nothing, s3ErrorEndpoint = Nothing}
failed
initremote: 1 failed
$ git annex initremote garage type=S3 encryption=none host=my-s3-endpoint.domain.com protocol=https bucket=git-annex requeststyle=path datacenter=garage
initremote garage (checking bucket...) (creating bucket in garage...)
git-annex: S3Error {s3StatusCode = Status {statusCode = 400, statusMessage = "Bad Request"}, s3ErrorCode = "InvalidRequest", s3ErrorMessage = "Bad request: Unsupported authorization method", s3ErrorResource = Just "/git-annex/", s3ErrorHostId = Nothing, s3ErrorAccessKeyId = Nothing, s3ErrorStringToSign = Nothing, s3ErrorBucket = Nothing, s3ErrorEndpointRaw = Nothing, s3ErrorEndpoint = Nothing}
failed
initremote: 1 failed
```
Garage appears to support v4 signatures: https://garagehq.deuxfleurs.fr/documentation/reference-manual/s3-compatibility/#high-level-features - and other S3 tooling works against the endpoint.
### What version of git-annex are you using? On what operating system?
Fedora Silverblue 37 / git-annex-10.20221212-1.fc37.x86_64
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
Yes, many years ago - now trying to get it up and running with my self-hosted S3 endpoint.
[1]: https://garagehq.deuxfleurs.fr/

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="jpds"
avatar="http://cdn.libravatar.org/avatar/24d746ec6a7726b162c12ecceb3ee267"
subject="comment 1"
date="2023-01-18T22:57:58Z"
content="""
Error on Garage's side is triggered here: https://git.deuxfleurs.fr/Deuxfleurs/garage/src/commit/fcc5033466e58e3beec05ee7748d33522b6b32b0/src/api/signature/payload.rs#L297
"""]]

@ -0,0 +1,10 @@
[[!comment format=mdwn
username="jpds"
avatar="http://cdn.libravatar.org/avatar/24d746ec6a7726b162c12ecceb3ee267"
subject="comment 2"
date="2023-01-19T15:09:01Z"
content="""
I took a look at the credentialv4 structure at https://github.com/aristidb/aws/blob/9bdc4ee018d0d9047c0434eeb21e2383afaa9ccf/Aws/Core.hs#L621 and found it curious that it has the region inside the scope (as the garage code does). However, in my error message from git-annex, the hostname of the S3 service is what's inside the scope instead of the 'garage' region name.
I therefore adjusted the garage API's configuration to have the FQDN as the region and then... git-annex Just Worked.
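
For reference, a SigV4 credential scope has the form `<date>/<region>/<service>/aws4_request`. A minimal shell sketch of the mismatch follows; the `credential_scope` helper is hypothetical and only for illustration, it is not part of git-annex, the aws library, or Garage:

```shell
# Hypothetical helper: build a SigV4 credential scope string.
# $1 = date (YYYYMMDD), $2 = value placed in the region slot,
# $3 = service (defaults to s3).
credential_scope() {
    printf '%s/%s/%s/aws4_request\n' "$1" "$2" "${3:-s3}"
}

# Scope with the region name in the region slot:
credential_scope 20230118 garage
# Scope with the hostname in the region slot, as seen in the error message:
credential_scope 20230118 my-s3-endpoint.domain.com
```

The two strings differ only in the second component, which would explain why setting the region to the FQDN on the garage side made the scopes match.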
"""]]

@ -0,0 +1,43 @@
[[!comment format=mdwn
username="jpds"
avatar="http://cdn.libravatar.org/avatar/24d746ec6a7726b162c12ecceb3ee267"
subject="comment 3"
date="2023-01-19T16:28:19Z"
content="""
I believe the fix for this is:
```
diff --git a/Remote/S3.hs b/Remote/S3.hs
index f5014202e..49f2ebd58 100644
--- a/Remote/S3.hs
+++ b/Remote/S3.hs
@@ -948,8 +948,8 @@ s3Configuration c = cfg
| otherwise -> AWS.HTTP
cfg = case getRemoteConfigValue signatureField c of
Just (SignatureVersion 4) ->
- S3.s3v4 proto endpoint False S3.SignWithEffort
- _ -> S3.s3 proto endpoint False
+ S3.s3v4 proto datacenter False S3.SignWithEffort
+ _ -> S3.s3 proto datacenter False
data S3Info = S3Info
{ bucket :: S3.Bucket
```
...however I cannot test it myself right now as it's failing to compile on another bit of code:
```
[452 of 679] Compiling Remote.S3
git/joeyh/git-annex.branchable.com/Remote/S3.hs:922:68: error:
• Couldn't match type B8.ByteString with [Char]
Expected type: String
Actual type: B8.ByteString
• In the first argument of T.pack, namely datacenter
In the second argument of ($), namely T.pack datacenter
In the expression: AWS.s3HostName $ T.pack datacenter
|
922 | | h == AWS.s3DefaultHost = AWS.s3HostName $ T.pack datacenter
| ^^^^^^^^^^
```
"""]]

@ -0,0 +1,26 @@
Hey Joey,
If I understand correctly, the default content expression (when it's empty, e.g. after a `git annex init` or `git clone ...;git annex sync`) currently appears to be `anything`. This means that a `git annex sync --content` (or just `git annex sync` after `git config annex.synccontent true`) will fetch all files.
It would be very handy if there was something like:
[[!format bash """
git annex config --set annex.defaultwanted ...
git annex config --set annex.defaultgroup ...
git annex config --set annex.defaultgroupwanted ...
git annex config --set annex.defaultrequired ...
# and the corresponding git variant for user-overriding
git config [--global|--system] annex.defaultwanted ...
git config [--global|--system] annex.defaultgroup ...
git config [--global|--system] annex.defaultgroupwanted ...
git config [--global|--system] annex.defaultrequired ...
"""]]
These defaults would be applied when `git annex` initializes a repository (i.e. gives it an `annex.uuid`, e.g. `git annex init` or `git annex sync` of a fresh clone of a repo with annex).
I like my annexed/datalad repos (mostly research data next to analysis code for collaboration) to have `annex.synccontent = true` so people can just do (`datalad save`/`git annex add`) `git annex sync` and be sure afterwards that everything is in order and safe. However, as the default `wanted` is `anything` (apparently), they also get all files, which they probably don't want, unless they run `git annex wanted . present` manually (and manual boilerplate config and extra steps are always nice to automate). Something like `git annex config --set annex.defaultwanted present` would solve this.
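
To make the boilerplate concrete, here is a rough sketch of the manual per-clone steps that such defaults would make unnecessary. The `run` and `apply_annex_defaults` helpers and the `DRY_RUN` variable are made up for this illustration; only the `git` and `git annex` commands themselves are real:

```shell
# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: steps every collaborator currently has to do by hand
# in a fresh clone. $1 = preferred content expression (defaults to "present").
apply_annex_defaults() {
    run git annex init &&
    run git config annex.synccontent true &&
    run git annex wanted . "${1:-present}"
}
```

Calling `apply_annex_defaults` with `DRY_RUN=1` set only prints the three commands, which is handy for checking what would be run in a fresh clone.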
Thanks again very much for git-annex, I love it! 💛
Yann

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="yarikoptic"
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
subject="comment 6"
date="2023-01-18T17:55:49Z"
content="""
FWIW: I also feel that the 2nd one (no effect on a possibly present local copy) would be preferable.
"""]]