Merge branch 'master' of ssh://git-annex.branchable.com

Joey Hess 2024-03-11 10:00:48 -04:00
commit 087e099e6a
GPG key ID: DB12DB0FF05F8F38
4 changed files with 170 additions and 0 deletions


@@ -0,0 +1,101 @@
### Please describe the problem.
On the current [macOS `HomeBrew` build of git-annex](https://formulae.brew.sh/formula/git-annex) (10.20240227), it appears that the build dependencies have dragged in the [latest Haskell `tls` package](https://hackage.haskell.org/package/tls-2.0.1/docs/Network-TLS.html), which now defaults `supportedExtendedMainSecret` to `RequireEMS` (previously it seems to have been `AllowEMS`; see eg [darcs bug report of similar error](https://bugs.darcs.net/issue2715)).
The result of this is that some podcast feeds, from webservers which do not support EMS, fail with an error, eg:
```
importfeed https://risky.biz/feeds/risky-business
download failed: HandshakeFailed (Error_Protocol "peer does not support Extended Main Secret" HandshakeFailure)
warning: downloading the feed failed (feed: https://risky.biz/feeds/risky-business)
ok
```
(And presumably this will also affect some non-podcast HTTPS downloads; I found it in a podcast download context.)
I believe this "Extended Main Secret" is also known as "Extended Master Secret", aka [RFC 7627](https://www.ietf.org/rfc/rfc7627.html), which was written up in 2015. So I can understand why ~9 years later the Haskell `tls` library is defaulting to insisting on EMS in a new major version. Unfortunately not all webservers, especially podcast feed webservers, have caught up with this.
As best I can tell git annex is getting this `tls` dependency via [`http-client`](https://hackage.haskell.org/package/http-client) which uses [`http-client-tls`](https://hackage.haskell.org/package/http-client-tls-0.3.6.3), and `http-client-tls` appears to just have a `tls (>=1.2)` dependency, which is presumably how `tls-2.0.0` / `tls-2.0.1` got dragged in, with these new defaults.
I'm unclear whether git-annex is in a position to pass `AllowEMS` to the TLS library (and thus restore the old default). But at least in the short term it might be worth considering doing that if possible.
### What steps will reproduce the problem?
Currently I have three podcast feeds (two from the same webserver) which fail:
```
git annex importfeed https://risky.biz/feeds/risky-business
```
```
git annex importfeed https://risky.biz/feeds/risky-business-news
```
```
git annex importfeed https://www.thecultureoftech.com/index.php/feed/podcast/
```
(Given the irony that the first two are an InfoSec podcast, I have also reported this missing EMS extension support to them, so it may get fixed before you try it.)
It looks like I've also had one media file download fail repeatedly for the same reason (but the podcast feed itself downloads okay):
```
git annex addurl https://traffic.omny.fm/d/clips/53b6fe2a-4ef6-4356-ae92-a61500df6da0/40b3f537-c161-4823-ae44-af3a007e121b/b2682900-b36c-447b-812d-b1290049fea8/audio.mp3
```
### What version of git-annex are you using? On what operating system?
git annex 10.20240227, on macOS Ventura (13.6.3), with git annex installed from HomeBrew.
```
ewen@basadi:~$ git annex version
git-annex version: 10.20240227
build flags: Assistant Webapp Pairing FsEvents TorrentParser MagicMime Benchmark Feeds Testsuite S3 WebDAV
dependency versions: aws-0.24.1 bloomfilter-2.0.1.2 crypton-0.34 DAV-1.3.4 feed-1.3.2.1 ghc-9.6.3 http-client-0.7.16 persistent-sqlite-2.13.3.0 torrent-10000.1.3 uuid-1.3.15 yesod-1.6.2.1
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL X*
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs httpalso borg hook external
operating system: darwin x86_64
supported repository versions: 8 9 10
upgrade supported from repository versions: 0 1 2 3 4 5 6 7 8 9 10
ewen@basadi:~$
```
### Please provide any additional information below.
[[!format sh """
ewen@basadi:~/Music/podcasts$ git annex importfeed https://www.thecultureoftech.com/index.php/feed/podcast/
importfeed gathering known urls ok
importfeed https://www.thecultureoftech.com/index.php/feed/podcast/
download failed: HandshakeFailed (Error_Protocol "peer does not support Extended Main Secret" HandshakeFailure)
warning: downloading the feed failed (feed: https://www.thecultureoftech.com/index.php/feed/podcast/)
ok
ewen@basadi:~/Music/podcasts$
ewen@basadi:~/Music/podcasts$ git annex addurl https://traffic.omny.fm/d/clips/53b6fe2a-4ef6-4356-ae92-a61500df6da0/40b3f537-c161-4823-ae44-af3a007e121b/b2682900-b36c-447b-812d-b1290049fea8/audio.mp3
addurl https://traffic.omny.fm/d/clips/53b6fe2a-4ef6-4356-ae92-a61500df6da0/40b3f537-c161-4823-ae44-af3a007e121b/b2682900-b36c-447b-812d-b1290049fea8/audio.mp3
git-annex: HttpExceptionRequest Request {
host = "traffic.omny.fm"
port = 443
secure = True
requestHeaders = [("Accept-Encoding",""),("User-Agent","git-annex/10.20240227")]
path = "/d/clips/53b6fe2a-4ef6-4356-ae92-a61500df6da0/40b3f537-c161-4823-ae44-af3a007e121b/b2682900-b36c-447b-812d-b1290049fea8/audio.mp3"
queryString = ""
method = "HEAD"
proxy = Nothing
rawBody = False
redirectCount = 10
responseTimeout = ResponseTimeoutDefault
requestVersion = HTTP/1.1
proxySecureMode = ProxySecureWithConnect
}
(InternalException (HandshakeFailed (Error_Protocol "peer does not support Extended Main Secret" HandshakeFailure)))
failed
addurl: 1 failed
ewen@basadi:~/Music/podcasts$
"""]]
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
Absolutely, I've been using git-annex as my podcatcher (among other reasons) for about a decade at this point. Thanks for developing it!


@@ -0,0 +1,26 @@
[[!comment format=mdwn
username="ewen"
avatar="http://cdn.libravatar.org/avatar/605b2981cb52b4af268455dee7a4f64e"
subject="TLS v1.2 EMS (Extended Master/Main Secret)"
date="2024-03-07T03:01:20Z"
content="""
From some more research it seems that Extended Master Secret (aka Extended Main Secret) is a TLS 1.2 only extension, to work around a problem with TLS 1.2 (eg, [2015 post about the problem](https://www.tripwire.com/state-of-security/tls-extended-master-secret-extension-fixing-a-hole-in-tls)).
TLS v1.3 doesn't have this problem, by design, AFAIK. Thus clients/servers supporting TLS v1.3 avoid the problem entirely (possibly why I have only found it on a few servers; the one I looked into in detail definitely won't connect with TLS v1.3 right now, but they're looking into it).
Webserver support can be confirmed with, eg, forced TLS v1.2:
```
echo \"\" | openssl s_client -tls1_2 -connect WEBSERVER:443 2>&1 | egrep \"Protocol|Extended master\"
```
and forced TLS v1.3 to check if that will work:
```
echo \"\" | openssl s_client -tls1_3 -connect WEBSERVER:443
```
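For checking several servers, the `Extended master secret` line can be pulled out of `s_client` output with a small helper. This is only a rough sketch: the function name `ems_status` is made up for illustration, and it assumes an OpenSSL version that prints an `Extended master secret: yes/no` line in its session summary (the same line the `egrep` above matches). Anything else is reported as `unknown`.

```sh
# Hypothetical helper: read openssl s_client output on stdin and print
# the value of its Extended master secret line (yes or no).
# Prints unknown if no such line is present (e.g. the handshake failed).
ems_status() {
    ems=$(sed -n 's/^ *Extended master secret: *//p' | head -n 1)
    echo ${ems:-unknown}
}

# Example use against a live server (WEBSERVER is a placeholder):
#   openssl s_client -tls1_2 -connect WEBSERVER:443 </dev/null 2>&1 | ems_status
```

A `no` from a server that also refuses the forced TLS v1.3 connection would be consistent with the `HandshakeFailed` error in the bug report.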
Hopefully that means the number of impacted sites is *relatively* small (eg, ones that haven't enabled TLS v1.3 support in the last 5+ years).
Ewen
"""]]


@@ -0,0 +1,30 @@
[[!comment format=mdwn
username="bbigras"
avatar="http://cdn.libravatar.org/avatar/f1c0201e3f1435eaab02c803a33c52ae"
subject="comment 2"
date="2024-03-06T17:03:50Z"
content="""
I'm not sure I understand what you mean.
Do you mean that I should clone the s3 bucket with a tool like git-remote-s3? I'm not sure how to do that.
I tried downloading the whole bucket from backblaze, but since I enabled encryption on the bucket, backblaze doesn't let me download the encrypted files. I'm not sure yet if I can get those files using their api.
Just in case I didn't explain correctly what I'm trying to do: I'm trying to do something like this, on a new computer with a fresh ~/Documents:
```bash
cd ~/Documents
git init
Initialized empty Git repository in /home/bbigras/Documents/.git/
git annex init
init ok
(recording state in git...)
git annex initremote backblaze type=S3 signature=v4 host=s3.us-west-000.backblazeb2.com bucket=my-bucket protocol=https encryption=hybrid keyid=my-key-id
initremote backblaze (encryption setup) (to gpg keys: my-key-id) (checking bucket...)
The bucket already exists, and its annex-uuid file indicates it is used by a different special remote.
git-annex: Cannot reuse this bucket.
failed
initremote: 1 failed
```
"""]]


@@ -0,0 +1,13 @@
[[!comment format=mdwn
username="imlew"
avatar="http://cdn.libravatar.org/avatar/23858c3eed3c3ea9e21522f4c999f1ed"
subject="Still experimental?"
date="2024-03-06T12:26:56Z"
content="""
`annex.tune.objecthash1=true` and `annex.tune.branchhash1=true` seem like they could be helpful in reducing git annex's inode usage, but the disclaimer about this feature being experimental is a little worrying.
Since it is over 10 years old, though, is it still considered experimental or has it graduated to being a stable feature? I.e. will using this meaningfully increase the chance of losing data?
Also, what is the (potential) benefit of using lowercase for the hashes?
"""]]