diff --git a/doc/bugs/Buggy_external_special_remote_stalls_after_7245a9e.mdwn b/doc/bugs/Buggy_external_special_remote_stalls_after_7245a9e.mdwn new file mode 100644 index 0000000000..a1d522aca8 --- /dev/null +++ b/doc/bugs/Buggy_external_special_remote_stalls_after_7245a9e.mdwn @@ -0,0 +1,64 @@ +In the datalad test suite, a test involving one of our special remotes +hangs after 7245a9ed5 (Improve shutdown process for external special +remotes and external backends, 2020-11-02). The hang depends on the +remote program creating an SSH socket but not cleaning it up on +failure. That's being addressed on our end +(). + +While that's clearly bad handling on the part of datalad's special +remote, I'm opening this bug report in case the above stall indicates +that 7245a9ed5 brings an increased susceptibility to hanging that's +worth worrying about. Based on 7245a9ed5's commit message, it sounds +like that resolved some hanging on Windows (and generally sounds like +a good change), so it probably makes sense keep that at the expense of +hanging that is restricted to buggy special remotes. But I don't have +a good enough understanding of the issue to know if it's limited to +misbehaving special remotes, or if perhaps there's a way to avoid both +hangs. + +In case it's helpful, here's a demo that tweaks the example special +remote to trigger a hang: + +[[!format sh """ +set -eu + +ga_srcdir="$(realpath "${1?requires git-annex source directory as first argument}")" +host="${2?requires ssh host as second argument}" +cd "$(mktemp -d "${TMPDIR:-/tmp}"/gx-testremote-XXXXXXX)" +top="$PWD" + +mkdir bin +git -C "$ga_srcdir" cat-file -p cfae74ae25a791a8dbf8b476fe17d5c6feacd8b0 \ + >bin/git-annex-remote-foo + +# Open a socket in setupcreds(). +cmd="ssh -fN -o ControlMaster=auto -o ControlPersist=15m -o ControlPath=$top/socket $host" +sed -i "s|setupcreds () {|&\\n\\t$cmd|" bin/git-annex-remote-foo + +chmod +x bin/git-annex-remote-foo +export PATH="$top/bin:$PATH" + +git annex version +git init a +( + cd a + git annex init main + # This will trigger an INITREMOTE-FAILURE failure because + # MYPASSWORD and MYLOGIN are not set. This hangs after 7245a9ed5. + git annex initremote d --debug \ + type=external externaltype=foo directory="$top/d" encryption=none +) +"""]] + +``` +$ sh /tmp/gx-failed-dir-hang.sh ~/src/haskell/git-annex smaug +git-annex version: 8.20201103 +build flags: Assistant Pairing Inotify MagicMime Feeds Testsuite S3 WebDAV +dependency versions: aws-0.20 bloomfilter-2.0.1.0 cryptonite-0.25 DAV-1.3.4 feed-1.2.0.1 ghc-8.6.5 http-client-0.6.4 persistent-sqlite-2.10.5.2 uuid-1.3.13 +[... 40 lines ...] +[2020-11-12 11:40:01.886013862] /tmp/gx-testremote-3MudMuQ/bin/git-annex-remote-foo[1] --> SETCONFIG directory /tmp/gx-testremote-3MudMuQ/d +[2020-11-12 11:40:02.451619481] /tmp/gx-testremote-3MudMuQ/bin/git-annex-remote-foo[1] --> INITREMOTE-FAILURE You need to set MYPASSWORD and MYLOGIN environment variables when running initremote. +``` + +[[!meta author=kyle]] +[[!tag projects/datalad]]