move old fixed datalad/dandi/repronim bugs to the project pages

As done previously in 2023 in commit bcc69f07e8

Commands used:

    for f in $(git grep -l '\[\[!tag projects/dandi\]\]'); do if grep -q 'done\]\]' "$f"; then git mv "$f" ../projects/dandi/bugs-done; g=$(echo "$f" | sed 's/.mdwn//'); if [ -d "$g" ]; then git mv "$g" ../projects/dandi/bugs-done; fi; fi; done
    for f in $(git grep -l '\[\[!tag projects/repronim\]\]'); do if grep -q 'done\]\]' "$f"; then git mv "$f" ../projects/repronim/bugs-done; g=$(echo "$f" | sed 's/.mdwn//'); if [ -d "$g" ]; then git mv "$g" ../projects/repronim/bugs-done; fi; fi; done
    for f in $(git grep -l '\[\[!tag projects/datalad\]\]'); do if grep -q 'done\]\]' "$f"; then git mv "$f" ../projects/datalad/bugs-done; g=$(echo "$f" | sed 's/.mdwn//'); if [ -d "$g" ]; then git mv "$g" ../projects/datalad/bugs-done; fi; fi; done
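
The three loops differ only in the project name, so the same steps can be sketched as one function. This is an illustrative sketch, not what was run for this commit: `page_dir` and `move_done_bugs` are hypothetical helper names, and it assumes the `../projects/<name>/bugs-done` directories already exist. It strips the `.mdwn` suffix with a shell suffix expansion, which only touches the end of the name.

```shell
# Sketch only; page_dir and move_done_bugs are hypothetical helper names.

# Print the name of the comment directory that belongs to a page,
# i.e. the page name with a trailing ".mdwn" removed.
page_dir() {
    printf '%s\n' "${1%.mdwn}"
}

# Move every page tagged projects/$1 that is marked done, plus its
# comment directory if there is one, to ../projects/$1/bugs-done.
move_done_bugs() {
    project=$1
    dest="../projects/$project/bugs-done"
    # Collect the file list first, so nothing is moved while git grep runs.
    files=$(git grep -l '\[\[!tag projects/'"$project"'\]\]') || return 0
    printf '%s\n' "$files" | while IFS= read -r f; do
        if grep -q 'done\]\]' "$f"; then
            git mv "$f" "$dest"
            g=$(page_dir "$f")
            if [ -d "$g" ]; then
                git mv "$g" "$dest"
            fi
        fi
    done
}

# Usage, from the bugs directory:
#   for p in dandi repronim datalad; do move_done_bugs "$p"; done
```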
Joey Hess 2025-01-01 13:12:56 -04:00
parent 2fe36b35a2
commit 292acd3c28
GPG key ID: DB12DB0FF05F8F38
108 changed files with 0 additions and 0 deletions

@ -0,0 +1,68 @@
### Please describe the problem.
Unable to `addurl` a `file:///` URL on Windows:
1. it doesn't understand `file:///C:/`
2. with `file://C:/` it blows up with permission denied:
[[!format sh """
C:\...pData\Local\Temp\1\datalad_temp_testrepo_tmphjl88>git annex addurl --file buga file:///C:/123
addurl file:///C:/123
download failed: /C:/123: openBinaryFile: invalid argument (Invalid argument)
failed
git-annex: addurl: 1 failed
C:\...pData\Local\Temp\1\datalad_temp_testrepo_tmphjl88>git annex addurl --file buga file://C:/123
addurl file://C:/123
(to buga)
git-annex: .git\annex\tmp\URL-s6--file&c%%C&c%123: renameFile:renamePath:MoveFileEx "\\\\?\\C:\\Users\\appveyor\\
AppData\\Local\\Temp\\1\\datalad_temp_testrepo_tmphjl88\\.git\\annex\\tmp\\URL-s6--file&c%%C&c%123" Just "\\\\?\\
C:\\Users\\appveyor\\AppData\\Local\\Temp\\1\\datalad_temp_testrepo_tmphjl88\\buga": permission denied (The proce
ss cannot access the file because it is being used by another process.)
failed
git-annex: addurl: 1 failed
"""]]
Here are some relevant details (also showing curl handling both `file://` and `file:///`):
[[!format sh """
C:\...pData\Local\Temp\1\datalad_temp_testrepo_tmphjl88>git status
On branch adjusted/master(unlocked)
nothing to commit, working tree clean
C:\...pData\Local\Temp\1\datalad_temp_testrepo_tmphjl88>git annex version
git-annex version: 7.20181205-g51d6f38b1
build flags: Assistant Webapp Pairing S3(multipartupload)(storageclasses) WebDAV TorrentParser Feeds Testsuite
dependency versions: aws-0.17.1 bloomfilter-2.0.1.0 cryptonite-0.23 DAV-1.3.1 feed-0.3.12.0 ghc-8.0.2 http-client
-0.5.7.1 persistent-sqlite-2.6.2 torrent-10000.1.1 uuid-1.3.13 yesod-1.4.5
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3
_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B51
2E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2S256E BLAKE2S256 BLAKE2S
160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM
URL
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar hook external
operating system: mingw32 i386
supported repository versions: 5 7
upgrade supported from repository versions: 2 3 4 5 6
local repository version: 7
C:\...pData\Local\Temp\1\datalad_temp_testrepo_tmphjl88>git status
On branch adjusted/master(unlocked)
nothing to commit, working tree clean
C:\...pData\Local\Temp\1\datalad_temp_testrepo_tmphjl88>curl file://C:/123
124
C:\...pData\Local\Temp\1\datalad_temp_testrepo_tmphjl88>curl file:///C:/123
124
"""]]
More information about this appveyor server can be obtained from the [datalad wtf](http://paste.debian.net/1055359/) output.
A while back we [had a related discussion](https://git-annex.branchable.com/bugs/git-annex_drop_fails_to_access_file__58____47____47____47___target_URL_on_Windows/), but at least `addurl` seemed to work then.
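
To make the difference between the two URL spellings concrete, here is a minimal sketch of how they decompose once the scheme is stripped. It is only an illustration, not how git-annex parses URLs; `file_url_path` and `drive_path` are hypothetical helper names. The first form leaves a path with a leading slash, which matches the `/C:/123` seen in the first error message.

```shell
# Sketch only; file_url_path and drive_path are hypothetical helpers.

# Remove the "file://" scheme prefix; what remains is "<host><path>".
file_url_path() {
    printf '%s\n' "${1#file://}"
}

# Turn the remainder into a Windows-style drive path by dropping the
# leading slash when it is directly followed by a drive letter.
drive_path() {
    p=$(file_url_path "$1")
    case $p in
        /[A-Za-z]:/*) printf '%s\n' "${p#/}" ;;
        *) printf '%s\n' "$p" ;;
    esac
}

file_url_path 'file:///C:/123'   # prints /C:/123
file_url_path 'file://C:/123'    # prints C:/123
drive_path 'file:///C:/123'      # prints C:/123
```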
[[!meta author=yoh]]
[[!tag projects/repronim]]
> [[fixed|done]] --[[Joey]]

@ -0,0 +1,12 @@
[[!comment format=mdwn
username="joey"
subject="""comment 1"""
date="2023-03-27T16:27:00Z"
content="""
I tried this on windows, and the second command succeeds now.
The first command still fails as shown.
At this point, what's left of this bug seems to be the same as
<https://git-annex.branchable.com/bugs/git-annex_drop_fails_to_access_file__58____47____47____47___target_URL_on_Windows/>
"""]]

@ -0,0 +1,7 @@
[[!comment format=mdwn
username="joey"
subject="""comment 2"""
date="2023-03-27T17:57:12Z"
content="""
Ok, put in an ugly hack to fix this.
"""]]

@ -0,0 +1,26 @@
### Please describe the problem.
I am familiarizing myself more with adjusted branches mode and might be doing something wrong. But in this case, http://www.oneukrainian.com/tmp/case-20230630.tgz, I observe that `annex sync` simply updates `master` to some prior state, thus possibly silently causing data loss for me if I don't spot it:
```
tar -xzf case-20230630.tgz
cd case
content.html@ datasets.datalad.org/ subfolder/
( source ~/git-annexes/10.20230626+git13-g029d12815c.env; git annex version | head -n 1; git describe master; git checkout 'adjusted/master(unlocked)'; git annex sync ; git describe master; )
git-annex version: 10.20230626+git13-g029d12815c-1~ndall+1
0.0.0-2-gf34191a
Switched to branch 'adjusted/master(unlocked)'
git-annex sync will change default behavior to operate on --content in a future version of git-annex. Recommend you explicitly use --no-content (or -g) to prepare for that change. (Or you can configure annex.synccontent)
commit
On branch adjusted/master(unlocked)
nothing to commit, working tree clean
ok
0.0.0-1-gde710c5
```
PS: The investigation of adjusted/unlocked came up in the ReproNim context, where people wanted a "hard copy" of the fmriprep results without symlinks, to simplify navigating the results in a browser. Otherwise, because the browser resolves symlinks, navigation is hard and requires a workaround such as starting a webserver, [as we documented in the dbic handbook](https://dbic-handbook.readthedocs.io/en/latest/datalad.html#how-to-view-mriqcfmriprepetc-dataladified-results-in-a-browser).
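
The symptom shown in the transcript, `git describe master` reporting an older commit after the sync, can also be checked mechanically with plain git. The following is a rough sketch under the assumption that "losing commits" means the old tip of the branch is no longer reachable from the new tip; `ref_keeps_history` is a hypothetical helper name.

```shell
# Sketch only; ref_keeps_history is a hypothetical helper.
# Run a command and succeed only if the ref's previous tip is still
# an ancestor of (or equal to) its tip afterwards.
ref_keeps_history() {
    ref=$1; shift
    before=$(git rev-parse "$ref")
    "$@" >/dev/null
    git merge-base --is-ancestor "$before" "$ref"
}

# Usage, inside the repository:
#   ref_keeps_history master git annex sync || echo "master lost commits"
```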
[[!meta author=yoh]]
[[!tag projects/repronim]]
> [[fixed|done]] --[[Joey]]

@ -0,0 +1,33 @@
[[!comment format=mdwn
username="joey"
subject="""comment 1"""
date="2023-07-05T19:49:19Z"
content="""
Simplified test case:

    git init tc
    cd tc
    git-annex init
    echo 1 > foo
    git-annex add
    git commit -m add
    git annex adjust --unlock
    git checkout master
    rm foo
    echo 2 > foo
    git-annex add
    git commit -m "this commit will be lost"
    git checkout 'adjusted/master(unlocked)'
    git annex adjust --unlock # or git-annex sync
    git log master

What an unfortunate oversight! And it's not a reversion, it's been there
since the beginning of adjusted branches.

git-annex adjust should display a warning message in that situation,
since the original branch has diverged from the adjusted branch.

And git-annex sync should be able to resolve the divergence by
auto-merging the changes from the original branch into the adjusted
branch.
"""]]

@ -0,0 +1,11 @@
[[!comment format=mdwn
username="joey"
subject="""comment 2"""
date="2023-07-05T21:01:53Z"
content="""
I've fixed the data loss part of this bug.
`git-annex sync` is able to resolve the divergence too. But for some
reason, the first time it's run after the divergence, it leaves it
diverged, and the second time it resolves it. That needs to be fixed.
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="joey"
subject="""comment 3"""
date="2023-07-06T16:16:36Z"
content="""
Ok, fixed git-annex sync to immediately merge the changes from the original
branch into the adjusted branch.
"""]]

@ -0,0 +1,190 @@
### Please describe the problem.
Our DataLad test, which explicitly checks that we are not breeding commits in the git-annex branch while adding files/URLs that point to the datalad-archives special remote, started to fail when going from git-annex 10.20240532-gf9ce7a452cc0fd5cdd2d58739741f7264fdbc598 to 10.20240532-g28f5c47b5a0daf96e5ed9aa719ff1e2763d3cc8b
(invocation: `python -m pytest -s -v datalad/local/tests/test_add_archive_content.py::TestAddArchiveOptions::test_add_delete_after_and_drop_subdir`)
Whereas before we had a single commit
<details>
<summary></summary>
```shell
git log -p git-annex^..git-annex
commit b42433cab9f671d206fe937ee7b68b53f11a0c54 (git-annex)
Author: DataLad Tester <test@example.com>
Date: Sun Jun 30 10:48:16 2024 -0400
update
diff --git a/d77/a0b/MD5E-s4--ec4d1eb36b22d19728e9d1d23ca84d1c.txt.log b/d77/a0b/MD5E-s4--ec4d1eb36b22d19728e9d1d23ca84d1c.txt.log
new file mode 100644
index 0000000..cc638db
--- /dev/null
+++ b/d77/a0b/MD5E-s4--ec4d1eb36b22d19728e9d1d23ca84d1c.txt.log
@@ -0,0 +1,2 @@
+1719758896s 1 c04eb54b-4b4e-5755-8436-866b043170fa
+1719758897s 0 d53ab0e3-21a9-4084-806f-bf9f5812f34e
diff --git a/d77/a0b/MD5E-s4--ec4d1eb36b22d19728e9d1d23ca84d1c.txt.log.web b/d77/a0b/MD5E-s4--ec4d1eb36b22d19728e9d1d23ca84d1c.txt.log.web
new file mode 100644
index 0000000..8ef0f1f
--- /dev/null
+++ b/d77/a0b/MD5E-s4--ec4d1eb36b22d19728e9d1d23ca84d1c.txt.log.web
@@ -0,0 +1 @@
+1719758896s 1 :dl+archive:MD5E-s3584--2f350c3650d5e3a21785d55f5a94ce70.tar#path=1/file.txt&size=4
diff --git a/f45/7f1/MD5E-s5--db87ebcba59a8c9f34b68e713c08a718.dat.log b/f45/7f1/MD5E-s5--db87ebcba59a8c9f34b68e713c08a718.dat.log
new file mode 100644
index 0000000..cc638db
--- /dev/null
+++ b/f45/7f1/MD5E-s5--db87ebcba59a8c9f34b68e713c08a718.dat.log
@@ -0,0 +1,2 @@
+1719758896s 1 c04eb54b-4b4e-5755-8436-866b043170fa
+1719758897s 0 d53ab0e3-21a9-4084-806f-bf9f5812f34e
diff --git a/f45/7f1/MD5E-s5--db87ebcba59a8c9f34b68e713c08a718.dat.log.web b/f45/7f1/MD5E-s5--db87ebcba59a8c9f34b68e713c08a718.dat.log.web
new file mode 100644
index 0000000..30bb5e9
--- /dev/null
+++ b/f45/7f1/MD5E-s5--db87ebcba59a8c9f34b68e713c08a718.dat.log.web
@@ -0,0 +1 @@
+1719758896s 1 :dl+archive:MD5E-s3584--2f350c3650d5e3a21785d55f5a94ce70.tar#path=1/1.dat&size=5
```
</details>
<details>
<summary>now we got two</summary>
```shell
Author: DataLad Tester <test@example.com>
Date: Sun Jun 30 10:45:12 2024 -0400
update
diff --git a/d77/a0b/MD5E-s4--ec4d1eb36b22d19728e9d1d23ca84d1c.txt.log b/d77/a0b/MD5E-s4--ec4d1eb36b22d19728e9d1d23ca84d1c.txt.log
new file mode 100644
index 0000000..97acf53
--- /dev/null
+++ b/d77/a0b/MD5E-s4--ec4d1eb36b22d19728e9d1d23ca84d1c.txt.log
@@ -0,0 +1,2 @@
+1719758713s 0 86661c7b-0604-49e7-8d65-1baf4ca9f469
+1719758712s 1 c04eb54b-4b4e-5755-8436-866b043170fa
diff --git a/d77/a0b/MD5E-s4--ec4d1eb36b22d19728e9d1d23ca84d1c.txt.log.web b/d77/a0b/MD5E-s4--ec4d1eb36b22d19728e9d1d23ca84d1c.txt.log.web
new file mode 100644
index 0000000..e5bafba
--- /dev/null
+++ b/d77/a0b/MD5E-s4--ec4d1eb36b22d19728e9d1d23ca84d1c.txt.log.web
@@ -0,0 +1 @@
+1719758712s 1 :dl+archive:MD5E-s3584--de6498c9ca26fee011f289f5f5972ed0.tar#path=1/file.txt&size=4
diff --git a/f45/7f1/MD5E-s5--db87ebcba59a8c9f34b68e713c08a718.dat.log b/f45/7f1/MD5E-s5--db87ebcba59a8c9f34b68e713c08a718.dat.log
index 11934b6..97acf53 100644
--- a/f45/7f1/MD5E-s5--db87ebcba59a8c9f34b68e713c08a718.dat.log
+++ b/f45/7f1/MD5E-s5--db87ebcba59a8c9f34b68e713c08a718.dat.log
@@ -1,2 +1,2 @@
-1719758712s 1 86661c7b-0604-49e7-8d65-1baf4ca9f469
+1719758713s 0 86661c7b-0604-49e7-8d65-1baf4ca9f469
1719758712s 1 c04eb54b-4b4e-5755-8436-866b043170fa
commit 8c4fdbadb4b1735cbb47f833ef99235790b8bcbf
Author: DataLad Tester <test@example.com>
Date: Sun Jun 30 10:45:12 2024 -0400
update
diff --git a/f45/7f1/MD5E-s5--db87ebcba59a8c9f34b68e713c08a718.dat.log b/f45/7f1/MD5E-s5--db87ebcba59a8c9f34b68e713c08a718.dat.log
new file mode 100644
index 0000000..11934b6
--- /dev/null
+++ b/f45/7f1/MD5E-s5--db87ebcba59a8c9f34b68e713c08a718.dat.log
@@ -0,0 +1,2 @@
+1719758712s 1 86661c7b-0604-49e7-8d65-1baf4ca9f469
+1719758712s 1 c04eb54b-4b4e-5755-8436-866b043170fa
diff --git a/f45/7f1/MD5E-s5--db87ebcba59a8c9f34b68e713c08a718.dat.log.web b/f45/7f1/MD5E-s5--db87ebcba59a8c9f34b68e713c08a718.dat.log.web
new file mode 100644
index 0000000..107c66f
--- /dev/null
+++ b/f45/7f1/MD5E-s5--db87ebcba59a8c9f34b68e713c08a718.dat.log.web
@@ -0,0 +1 @@
+1719758712s 1 :dl+archive:MD5E-s3584--de6498c9ca26fee011f289f5f5972ed0.tar#path=1/1.dat&size=5
```
</details>
for the same effect. I believe the command that triggers them is `['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'annex', 'addurl', '--with-files', '--json', '--json-error-messages', '--batch']`, which before (for years?!) resulted in the expected single commit.
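
For reference, the kind of check involved can be approximated with plain git by counting commits on a ref before and after a command. This is only a sketch and not the DataLad test itself; `commits_added_to` is a hypothetical helper name, and it assumes the ref already exists when the helper is called.

```shell
# Sketch only; commits_added_to is a hypothetical helper.
# Run a command and print how many commits it added to the given ref.
commits_added_to() {
    ref=$1; shift
    before=$(git rev-list --count "$ref")
    "$@" >/dev/null
    after=$(git rev-list --count "$ref")
    echo $((after - before))
}

# Usage, inside the repository, wrapping whatever command is under test:
#   commits_added_to git-annex some-command args...
```

Standard input is passed through to the wrapped command, so a `--batch` invocation can still be fed its requests through a pipe.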
<details>
<summary>Here is the full set of datalad logs for the steps triggering that </summary>
```shell
[DEBUG ] Determined class of decorated function: <class 'datalad.local.add_archive_content.AddArchiveContent'>
[DEBUG ] Resolved dataset to add-archive-content: /home/yoh/.tmp/datalad_temp_tree_rsua9kmg
[DEBUG ] Determined class of decorated function: <class 'datalad.core.local.status.Status'>
[DEBUG ] Resolved dataset to report status: /home/yoh/.tmp/datalad_temp_tree_rsua9kmg
[DEBUG ] Querying AnnexRepo(/home/yoh/.tmp/datalad_temp_tree_rsua9kmg).diffstatus() for paths: [PosixPath('/home/yoh/.tmp/datalad_temp_tree_rsua9kmg/subdir/1.tar')]
[DEBUG ] Run ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'rev-parse', '--quiet', '--verify', 'HEAD^{commit}'] (protocol_class=GeneratorStdOutErrCapture) (cwd=/home/yoh/.tmp/datalad_temp_tree_rsua9kmg)
[DEBUG ] AnnexRepo(/home/yoh/.tmp/datalad_temp_tree_rsua9kmg).get_content_info(...)
[DEBUG ] Query repo: ['ls-files', '--stage', '-z', '--exclude-standard', '-o', '--directory', '--no-empty-directory']
[DEBUG ] Run ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'ls-files', '--stage', '-z', '--exclude-standard', '-o', '--directory', '--no-empty-directory', '--', 'subdir/1.tar'] (protocol_class=GeneratorStdOutErrCapture) (cwd=/home/yoh/.tmp/datalad_temp_tree_rsua9kmg)
[DEBUG ] Done query repo: ['ls-files', '--stage', '-z', '--exclude-standard', '-o', '--directory', '--no-empty-directory']
[DEBUG ] Done AnnexRepo(/home/yoh/.tmp/datalad_temp_tree_rsua9kmg).get_content_info(...)
[DEBUG ] Run ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'ls-files', '-z', '-m', '-d', '--', 'subdir/1.tar'] (protocol_class=GeneratorStdOutErrCapture) (cwd=/home/yoh/.tmp/datalad_temp_tree_rsua9kmg)
[DEBUG ] AnnexRepo(/home/yoh/.tmp/datalad_temp_tree_rsua9kmg).get_content_info(...)
[DEBUG ] Query repo: ['ls-tree', 'HEAD', '-z', '-r', '--full-tree', '-l']
[DEBUG ] Run ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'ls-tree', 'HEAD', '-z', '-r', '--full-tree', '-l', '--', 'subdir/1.tar'] (protocol_class=GeneratorStdOutErrCapture) (cwd=/home/yoh/.tmp/datalad_temp_tree_rsua9kmg)
[DEBUG ] Done query repo: ['ls-tree', 'HEAD', '-z', '-r', '--full-tree', '-l']
[DEBUG ] Done AnnexRepo(/home/yoh/.tmp/datalad_temp_tree_rsua9kmg).get_content_info(...)
[DEBUG ] Run ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'status', '--porcelain', '--untracked-files=normal', '--ignore-submodules=none'] (protocol_class=GeneratorStdOutErrCapture) (cwd=/home/yoh/.tmp/datalad_temp_tree_rsua9kmg)
[DEBUG ] Run ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'annex', 'find', '--anything', '--json', '--json-error-messages', '-c', 'annex.dotfiles=true', '--', 'subdir/1.tar'] (protocol_class=AnnexJsonProtocol) (cwd=/home/yoh/.tmp/datalad_temp_tree_rsua9kmg)
[DEBUG ] Finished ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'annex', 'find', '--anything', '--json', '--json-error-messages', '-c', 'annex.dotfiles=true', '--', 'subdir/1.tar'] with status 0
[DEBUG ] Run ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'annex', 'contentlocation', 'MD5E-s3584--bb87b72d411b7415410da27950d2a165.tar', '-c', 'annex.dotfiles=true'] (protocol_class=GeneratorStdOutErrCapture) (cwd=/home/yoh/.tmp/datalad_temp_tree_rsua9kmg)
[INFO ] Adding content of the archive subdir/1.tar into annex AnnexRepo(/home/yoh/.tmp/datalad_temp_tree_rsua9kmg)
[DEBUG ] Initiating clean cache for the archives under /home/yoh/.tmp/datalad_temp_tree_rsua9kmg/.git/datalad/tmp/archives
[DEBUG ] Cache initialized
[DEBUG ] Not initiating existing cache for the archives under /home/yoh/.tmp/datalad_temp_tree_rsua9kmg/.git/datalad/tmp/archives
[DEBUG ] Cached directory for archive /home/yoh/.tmp/datalad_temp_tree_rsua9kmg/.git/annex/objects/gg/zf/MD5E-s3584--bb87b72d411b7415410da27950d2a165.tar/MD5E-s3584--bb87b72d411b7415410da27950d2a165.tar is fbab09b98e
[DEBUG ] Run ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'cat-file', 'blob', 'git-annex:remote.log'] (protocol_class=GeneratorStdOutErrCapture) (cwd=/home/yoh/.tmp/datalad_temp_tree_rsua9kmg)
[Level 11] CommandError: 'git -c diff.ignoreSubmodules=none -c core.quotepath=false cat-file blob git-annex:remote.log' failed with exitcode 128 [err: 'fatal: path 'remote.log' does not exist in 'git-annex'']
[DEBUG ] Run ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'cat-file', 'blob', 'git-annex:trust.log'] (protocol_class=GeneratorStdOutErrCapture) (cwd=/home/yoh/.tmp/datalad_temp_tree_rsua9kmg)
[Level 11] CommandError: 'git -c diff.ignoreSubmodules=none -c core.quotepath=false cat-file blob git-annex:trust.log' failed with exitcode 128 [err: 'fatal: path 'trust.log' does not exist in 'git-annex'']
[INFO ] Initializing special remote datalad-archives
[DEBUG ] Run ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'annex', 'initremote', 'datalad-archives', 'encryption=none', 'type=external', 'autoenable=true', 'externaltype=datalad-archives', 'uuid=c04eb54b-4b4e-5755-8436-866b043170fa', '-c', 'annex.dotfiles=true'] (protocol_class=StdOutErrCapture) (cwd=/home/yoh/.tmp/datalad_temp_tree_rsua9kmg)
[DEBUG ] Finished ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'annex', 'initremote', 'datalad-archives', 'encryption=none', 'type=external', 'autoenable=true', 'externaltype=datalad-archives', 'uuid=c04eb54b-4b4e-5755-8436-866b043170fa', '-c', 'annex.dotfiles=true'] with status 0
[DEBUG ] Run ['git', 'config', '-z', '-l', '--show-origin'] (protocol_class=StdOutErrCapture) (cwd=/home/yoh/.tmp/datalad_temp_tree_rsua9kmg)
[DEBUG ] Finished ['git', 'config', '-z', '-l', '--show-origin'] with status 0
[DEBUG ] Acquiring a lock /home/yoh/.tmp/datalad_temp_tree_rsua9kmg/.git/datalad/tmp/archives/fbab09b98e.extract-lck
[DEBUG ] Acquired? lock /home/yoh/.tmp/datalad_temp_tree_rsua9kmg/.git/datalad/tmp/archives/fbab09b98e.extract-lck: True
[DEBUG ] Extracting /home/yoh/.tmp/datalad_temp_tree_rsua9kmg/.git/annex/objects/gg/zf/MD5E-s3584--bb87b72d411b7415410da27950d2a165.tar/MD5E-s3584--bb87b72d411b7415410da27950d2a165.tar under /home/yoh/.tmp/datalad_temp_tree_rsua9kmg/.git/datalad/tmp/archives/fbab09b98e
[DEBUG ] Run ['7z', 'x', '/home/yoh/.tmp/datalad_temp_tree_rsua9kmg/.git/annex/objects/gg/zf/MD5E-s3584--bb87b72d411b7415410da27950d2a165.tar/MD5E-s3584--bb87b72d411b7415410da27950d2a165.tar'] (protocol_class=KillOutput) (cwd=/home/yoh/.tmp/datalad_temp_tree_rsua9kmg/.git/datalad/tmp/archives/fbab09b98e)
[DEBUG ] Finished ['7z', 'x', '/home/yoh/.tmp/datalad_temp_tree_rsua9kmg/.git/annex/objects/gg/zf/MD5E-s3584--bb87b72d411b7415410da27950d2a165.tar/MD5E-s3584--bb87b72d411b7415410da27950d2a165.tar'] with status 0
[DEBUG ] Releasing lock /home/yoh/.tmp/datalad_temp_tree_rsua9kmg/.git/datalad/tmp/archives/fbab09b98e.extract-lck
[INFO ] Start Extracting archive
[DEBUG ] Adding /home/yoh/.tmp/datalad_temp_tree_rsua9kmg/subdir/.dataladiwgxvqzi/1/1.dat to annex pointing to dl+archive:MD5E-s3584--bb87b72d411b7415410da27950d2a165.tar#path=1/1.dat&size=5 and with options None
[DEBUG ] Starting new runner for BatchedAnnex(command=['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'annex', 'addurl', '--with-files', '--json', '--json-error-messages', '--batch'], encoding=None, exception_on_timeout=False, last_request=None, output_proc=<function readline_json at 0x7f165f5adf80>, path=/home/yoh/.tmp/datalad_temp_tree_rsua9kmg, return_code=None, runner=None, stderr_output=b'', timeout=None, wait_timed_out=None)
[DEBUG ] Run ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'annex', 'addurl', '--with-files', '--json', '--json-error-messages', '--batch'] (protocol_class=BatchedCommandProtocol) (cwd=/home/yoh/.tmp/datalad_temp_tree_rsua9kmg)
[DEBUG ] Starting new runner for BatchedAnnex(command=['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'annex', 'dropkey', '--force', '--json', '--json-error-messages', '--batch'], encoding=None, exception_on_timeout=False, last_request=None, output_proc=<function readline_json at 0x7f165f5adf80>, path=/home/yoh/.tmp/datalad_temp_tree_rsua9kmg, return_code=None, runner=None, stderr_output=b'', timeout=None, wait_timed_out=None)
[DEBUG ] Run ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'annex', 'dropkey', '--force', '--json', '--json-error-messages', '--batch'] (protocol_class=BatchedCommandProtocol) (cwd=/home/yoh/.tmp/datalad_temp_tree_rsua9kmg)
[DEBUG ] Adding /home/yoh/.tmp/datalad_temp_tree_rsua9kmg/subdir/.dataladiwgxvqzi/1/file.txt to annex pointing to dl+archive:MD5E-s3584--bb87b72d411b7415410da27950d2a165.tar#path=1/file.txt&size=4 and with options None
[INFO ] Finished adding subdir/1.tar: Files processed: 2, renamed: 2, removed: 2, +annex: 2
[DEBUG ] Removing extracted and annexed files under /home/yoh/.tmp/datalad_temp_tree_rsua9kmg/subdir/.dataladiwgxvqzi
[DEBUG ] Run ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'rm', '--force', '-r', '--', 'subdir/.dataladiwgxvqzi'] (protocol_class=GeneratorStdOutErrCapture) (cwd=/home/yoh/.tmp/datalad_temp_tree_rsua9kmg)
[DEBUG ] Query status of AnnexRepo('/home/yoh/.tmp/datalad_temp_tree_rsua9kmg') for all paths
[DEBUG ] Run ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'rev-parse', '--quiet', '--verify', 'HEAD^{commit}'] (protocol_class=GeneratorStdOutErrCapture) (cwd=/home/yoh/.tmp/datalad_temp_tree_rsua9kmg)
[DEBUG ] AnnexRepo(/home/yoh/.tmp/datalad_temp_tree_rsua9kmg).get_content_info(...)
[DEBUG ] Query repo: ['ls-files', '--stage', '-z']
[DEBUG ] Run ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'ls-files', '--stage', '-z'] (protocol_class=GeneratorStdOutErrCapture) (cwd=/home/yoh/.tmp/datalad_temp_tree_rsua9kmg)
[DEBUG ] Done query repo: ['ls-files', '--stage', '-z']
[DEBUG ] Done AnnexRepo(/home/yoh/.tmp/datalad_temp_tree_rsua9kmg).get_content_info(...)
[DEBUG ] Run ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'ls-files', '-z', '-m', '-d'] (protocol_class=GeneratorStdOutErrCapture) (cwd=/home/yoh/.tmp/datalad_temp_tree_rsua9kmg)
[DEBUG ] AnnexRepo(/home/yoh/.tmp/datalad_temp_tree_rsua9kmg).get_content_info(...)
[DEBUG ] Query repo: ['ls-tree', 'HEAD', '-z', '-r', '--full-tree', '-l']
[DEBUG ] Run ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'ls-tree', 'HEAD', '-z', '-r', '--full-tree', '-l'] (protocol_class=GeneratorStdOutErrCapture) (cwd=/home/yoh/.tmp/datalad_temp_tree_rsua9kmg)
[DEBUG ] Done query repo: ['ls-tree', 'HEAD', '-z', '-r', '--full-tree', '-l']
[DEBUG ] Done AnnexRepo(/home/yoh/.tmp/datalad_temp_tree_rsua9kmg).get_content_info(...)
[INFO ] Extracting archive 2 Files done in 0.872975 sec at 2.29102 Files/sec
[DEBUG ] Cleaning up the cache for /home/yoh/.tmp/datalad_temp_tree_rsua9kmg/.git/annex/objects/gg/zf/MD5E-s3584--bb87b72d411b7415410da27950d2a165.tar/MD5E-s3584--bb87b72d411b7415410da27950d2a165.tar under /home/yoh/.tmp/datalad_temp_tree_rsua9kmg/.git/datalad/tmp/archives/fbab09b98e
[DEBUG ] Cleaning up the stamp file for /home/yoh/.tmp/datalad_temp_tree_rsua9kmg/.git/annex/objects/gg/zf/MD5E-s3584--bb87b72d411b7415410da27950d2a165.tar/MD5E-s3584--bb87b72d411b7415410da27950d2a165.tar under /home/yoh/.tmp/datalad_temp_tree_rsua9kmg/.git/datalad/tmp/archives/fbab09b98e.stamp
add-archive-content(ok): /home/yoh/.tmp/datalad_temp_tree_rsua9kmg (dataset)
```
</details>
[[!meta author=yoh]]
[[!tag projects/repronim]]
> [[fixed|done]] --[[Joey]]

@ -0,0 +1,19 @@
[[!comment format=mdwn
username="joey"
subject="""comment 1"""
date="2024-07-31T14:20:51Z"
content="""
Note that this does not affect the number of commits made by `addurl` generally,
e.g. when adding multiple urls with --batch from the web.

Also, I don't think that the commits you picked out and showed necessarily
correspond to one another. The state being recorded in the commit in the 1st
run is not the same as the state that gets recorded by the two commits in the
2nd run. Unless there is an actual behavior change that, e.g., leaves the file
present in a repository that it was not present in before.

In the first run the commit shows key
MD5E-s5--db87ebcba59a8c9f34b68e713c08a718.dat ends up recorded as present in
datalad-archives but not in the local repository. In the second run, the
commits show that the same key ends up recorded present in both repositories.
"""]]

@ -0,0 +1,23 @@
[[!comment format=mdwn
username="joey"
subject="""comment 2"""
date="2024-07-31T16:06:38Z"
content="""
Bisected to [[!commit 780367200b14d532f745079dfa09ffaa214d0a84]],
"remove dead nodes when loading the cluster log".

Replacing `loadClusters` with a noop on top of that commit gets the test
suite passing again.

Since nothing in `loadClusters` involves the location log at all, I think
this must come down to a difference in when/if git-annex starts reading
from the git-annex branch. There could be git-annex commands that didn't
use to read from the branch before, that now do. Which might mean merging
in other git-annex branches at different points in time than happened
before, which I suppose can result in an additional commit.

Unfortunately, I can't avoid the early `loadClusters` for reasons explained
in that commit.

Anyway, I doubt this will result in a lot of additional commits.
"""]]

@ -0,0 +1,9 @@
[[!comment format=mdwn
username="joey"
subject="""comment 3"""
date="2024-07-31T19:50:38Z"
content="""
Aha! I found a way around the dependency loop.
This is fixed.
"""]]

@ -0,0 +1,33 @@
### Please describe the problem.
See e.g. on [https://github.com/datalad/git-annex/actions/runs/6680765679/job/18154374923](https://github.com/datalad/git-annex/actions/runs/6680765679/job/18154374923)
```
Repo Tests v10 unlocked
Init Tests
init: OK (0.17s)
add: OK (0.73s)
addurl: OK (0.57s)
crypto: FAIL (3.07s)
./Test/Framework.hs:86:
initremote failed with unexpected exit code (transcript follows)
initremote foo (encryption setup) (to gpg keys: 129D6E0AC537B9C7)
git-annex: .git/annex/othertmp/remote.log: hPut: invalid argument (invalid character)
failed
(recording state in git...)
initremote: 1 failed
```
This started only recently, but happens consistently:
```
(git)smaug:/mnt/datasets/datalad/ci/git-annex/builds/2023/10[master]git
$> git grep -l 'hPut: invalid argument'
cron-20231027/build-ubuntu.yaml-1289-1c03c8fd-failed/0_test-annex (normal, ubuntu-latest).txt
...
```
[[!meta author=yoh]]
[[!tag projects/repronim]]
> [[fixed|done]] --[[Joey]]

@ -0,0 +1,20 @@
[[!comment format=mdwn
username="joey"
subject="""comment 1"""
date="2023-11-01T16:07:15Z"
content="""
Reproduced with LANG=C:

    ./Test/Framework.hs:86:
    initremote failed with unexpected exit code (transcript follows)
    initremote foo (encryption setup) (to gpg keys: 129D6E0AC537B9C7)
    git-annex: .git/annex/othertmp/remote.log: withFile: invalid argument (cannot encode character '\132')
    failed
    (recording state in git...)
    initremote: 1 failed

Not quite the same error but almost certainly the same problem.

I've confirmed this is caused by
[[!commit 3742263c99180d1391e4fd51724aae52d6d02137]]
"""]]

@ -0,0 +1,25 @@
[[!comment format=mdwn
username="joey"
subject="""comment 2"""
date="2023-11-01T16:53:48Z"
content="""
Will probably need to revert the Remote/Helper/Encryptable.hs part of that
commit.

What is happening here is, encodeBS is failing when run on the String from
a SharedPubKeyCipher. That String comes from Utility.Gpg.genRandom and is
literally a bunch of random bytes. So it's not encoded with the filesystem
encoding. And it really ought to be a ByteString of course, but since it's
not, anything involving encoding it fails.

That's why the old code had this comment:

    {- Not using Utility.Base64 because these "Strings" are really
     - bags of bytes and that would convert to unicode and not round-trip
     - cleanly. -}

And converted that String to a ByteString via `B.pack . s2w8`, which avoids this problem.

What an ugly thing. Really ought to be fixed to use ByteString throughout.
But for now, let's revert.
"""]]

@ -0,0 +1,106 @@
### Please describe the problem.
Reference: issue/discovery in [repronim/containers while adding neurodesk images](https://github.com/ReproNim/containers/issues/64#issuecomment-1492256561)
- apparently no URLs were registered for the images despite running `registerurl KEY ANNEX`
- some images do have urls
It took a while to grasp what was going on, and then I found an unfinished reproducer from `Mar 15 2021 annex-claimurl.sh`, without any recollection of why I had not finished it. It might somehow be an "operator error", but that seems unlikely... Might it be a datalad special remote bug?
Summary of the problem: if there is an external git-annex-remote which claims the URL (CLAIMURL), `git-annex registerurl` does **not** associate that URL with any remote (neither that external one nor web), and thus does not make the key available to the user despite knowing the url.
Should it, by the way, default to `web` if no remote is associated with it?
Filed a complementary [registerurl --remote REMOTE](https://git-annex.branchable.com/todo/registerurl_--remote_REMOTE/) TODO, since in this case I would have preferred to just register against the web remote.
### What steps will reproduce the problem?
Here is a new "quick" reproducer, but you need datalad installed to get `git-annex-remote-datalad`.
```
#!/bin/bash
export PS4='> '
set -eu
set -x
cd "$(mktemp -d ${TMPDIR:-/tmp}/dl-XXXXXXX)"
git init
git annex init
# It works fine if we do not enable datalad special remote!
# so it is something about interaction there
git annex initremote datalad externaltype=datalad type=external encryption=none autoenable=true uuid=65b6c36b-debd-4a23-8fa3-675cbd200496
git annex enableremote datalad
git annex info
# so it seems that addurl does it right
git annex addurl --debug --file 123.dat http://www.oneukrainian.com/tmp/123.dat
# but if I do via registerurl -- not quite so
echo 124 > 124.dat
git annex add 124.dat
key=$(readlink -f 124.dat | xargs basename)
git annex registerurl --debug "$key" http://www.oneukrainian.com/tmp/124.dat
git commit -m 'added those two files with urls'
git annex whereis --debug 123.dat
git annex whereis --debug 124.dat
git checkout git-annex
: # URLs are known for both
git grep oneukrainian
: # but only 123.dat would be associated with datalad remote
git grep 65b6c36b-debd-4a23-8fa3-675cbd200496
```
The [full log is here](http://www.oneukrainian.com/tmp/annex-claimurl-2023.sh.log); without the `--debug` output it ends like this:
```
grep -v '^\[' annex-claimurl-2023.sh.log | tail -n 29
(recording state in git...)
> git commit -m 'added those two files with urls'
2 files changed, 2 insertions(+)
create mode 120000 123.dat
create mode 120000 124.dat
> git annex whereis --debug 123.dat
whereis 123.dat [2023-03-31 18:29:27.56573965] (Utility.Process) process [1429290] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","cat-file","--batch"]
(2 copies)
62c53770-5274-40d4-a45a-de308c234ea9 -- yoh@bilena:~/.tmp/dl-FbOrptq [here]
65b6c36b-debd-4a23-8fa3-675cbd200496 -- [datalad]
datalad: http://www.oneukrainian.com/tmp/123.dat
ok
> git annex whereis --debug 124.dat
whereis 124.dat [2023-03-31 18:29:27.857735575] (Utility.Process) process [1429322] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","cat-file","--batch"]
(1 copy)
62c53770-5274-40d4-a45a-de308c234ea9 -- yoh@bilena:~/.tmp/dl-FbOrptq [here]
ok
> git checkout git-annex
Switched to branch 'git-annex'
> :
> git grep oneukrainian
060/68b/SHA256E-s4--ca2ebdf97d7469496b1f4b78958f9dc8447efdcb623953fee7b6996b762f6fff.dat.log.web:1680301767.477711756s 1 :http://www.oneukrainian.com/tmp/124.dat
ae1/21c/SHA256E-s4--181210f8f9c779c26da1d9b2075bde0127302ee0e3fca38c9a83f5b1dd8e5d3b.dat.log.web:1680301767.037966322s 1 :http://www.oneukrainian.com/tmp/123.dat
> :
> git grep 65b6c36b-debd-4a23-8fa3-675cbd200496
ae1/21c/SHA256E-s4--181210f8f9c779c26da1d9b2075bde0127302ee0e3fca38c9a83f5b1dd8e5d3b.dat.log:1680301767.038748415s 1 65b6c36b-debd-4a23-8fa3-675cbd200496
remote.log:65b6c36b-debd-4a23-8fa3-675cbd200496 autoenable=true encryption=none externaltype=datalad name=datalad type=external timestamp=1680301766.517251391s
uuid.log:65b6c36b-debd-4a23-8fa3-675cbd200496 datalad timestamp=1680301765.789226249s
```
So both keys have urls, but only the 123.dat one is associated with the datalad special remote, and only it has its url reported by whereis.
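The manual comparison above can be condensed into a small filter. This is only a sketch: the function name `list_url_keys` is made up for illustration, and it assumes the `git grep` output format shown above (path, timestamp, status, url), run while the `git-annex` branch is checked out.

```sh
# Hypothetical helper, not part of git-annex.
# Reads `git grep` style lines for *.log.web files on stdin and prints
# "KEY URL" for every url recorded with status 1.
list_url_keys() {
    awk '
        /\.log\.web:/ {
            split($1, parts, ":")            # parts[1] = path to the .log.web file
            n = split(parts[1], dirs, "/")   # last component is KEY.log.web
            key = dirs[n]
            sub(/\.log\.web$/, "", key)
            url = $3
            sub(/^:/, "", url)               # drop the leading ":" seen in the output above
            if ($2 == "1") print key, url
        }'
}
```

For example, `git grep oneukrainian | list_url_keys` would list both keys from the log above with their urls, which can then be compared against the keys that have a location log entry for the datalad remote.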
### What version of git-annex are you using? On what operating system?
10.20230126, but I also tried the older 8.20210803 since I thought it must be a regression -- same result.
[[!meta author=yoh]]
[[!tag projects/repronim]]
> [[fixed|done]] --[[Joey]]


@ -0,0 +1,29 @@
[[!comment format=mdwn
username="joey"
subject="""comment 1"""
date="2023-04-04T17:07:37Z"
content="""
This is intentional, see [[!commit 451171b7c1eaccfd0f39d4ec1d64c6964613f55a]]
which changed setUrlPresent to only update presence info when the url
belongs to the web but not when it's claimed by other special remotes.
It makes sense for registerurl to be symmetric with rmurl, and rmurl only
updates presence info when the url is a web url.
To the extent I've been able to follow the complex reasoning there for why,
part of it is clear: The web special remote is different from other special
remotes in that content cannot be dropped from it by git-annex, and the url is
the only pointer to content. So when rmurl removes the last web url, it makes
sense to treat the content as no longer present on the web. But if the url is
claimed by another special remote, which does support dropping content, the
content would still be present on it after removing its url, and would be
accessible w/o using that url, and `git-annex fsck --fast --from` would notice
it was present and fix up the location log if it didn't show it as content.
Also note that the rmurl man page documents this when it says:
Removing the last web url will make git-annex no longer treat content as being
present in the web special remote.
All you need to do is use `git-annex setpresentkey` along with registerurl.
"""]]


@ -0,0 +1,13 @@
[[!comment format=mdwn
username="yarikoptic"
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
subject="comment 2"
date="2023-04-04T20:15:59Z"
content="""
I have yet to re-review that reasoning, but does it mean that to merely register a URL a client needs to
- call `annex registerurl`
- inspect to which remote URL was added/was claimed (is there a way? `whois` is silent)
- if it was claimed by some special remote other than web -- use `annex setpresentkey`?
Sounds like too much / too fragile, and somewhat different from how `addurl` behaves, which does it all just fine regardless of whether it is web or some claimurl'ed remote.
"""]]


@ -0,0 +1,35 @@
[[!comment format=mdwn
username="yarikoptic"
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
subject="comment 3"
date="2023-04-05T00:30:00Z"
content="""
So to some degree it is a regression / broken behavior which initially worked just fine with registerurl -- tried the 6.20180913+git149-g23bd27773 version and it performed \"as expected\". Eh, never enough tests ;)
I have looked at that commit changelog and [detailed description](http://source.git-annex.branchable.com/?p=source.git;a=blob;f=doc/bugs/suggests_to_enable_web_remote_even_when_there_is_no_web_urls_for_the_file/comment_4_6dff7befbaacbff573c5f72688966af5._comment;h=c636b09291a23bbce52b0367a767717137f99a21;hb=451171b7c1eaccfd0f39d4ec1d64c6964613f55a). I am not fully grasping yet why `registerurl` should not behave symmetrically with `addurl` in being sufficient by itself to add a url to content so it becomes usable for `get` right away, without some other dances like `setpresentkey`. I think I do get the `rmurl` \"ambiguity\", but more on that below.
Rereading your comment [above](https://git-annex.branchable.com/bugs/registerurl_does_not_register_if_external_remote/#comment-ba9d6517d8f8c10167da95b122a022b3):
> part of it is clear: The web special remote is different from other special remotes in that content cannot be dropped from it by git-annex, and the url is the only pointer to content.
This is just an assumption on some \"special nature of web remote\", e.g. the `datalad` remote also doesn't support dropping, and URL is also just the pointer to content. And CLAIMURL functionality came IIRC exactly for that use case and before adding some kind of duality for having content accessible directly from special remote and via url.
> But if the url is claimed by another special remote, which does support dropping content, the content would still be present on it after removing its url, and would be accessible w/o using that url,
that is yet another assumption, since e.g. in the case of datalad remote `rmurl` effect would be identical to `web` remote, and there is no other way to get content from that remote. (so there is no duality mentioned above)
> All you need to do is use git-annex setpresentkey along with registerurl.
this somewhat contradicts the above \"the content would still be present on it after removing its url\", which suggests that the presence of a URL for the remote is already a sufficient indication of being present on the remote.
Overall, there seem to be some assumptions about URLs and external remotes which ideally should be avoided. Maybe it should somehow be reflected in the external remote protocol that CLAIMing a URL indicates that the content is present at that URL, and that there is no other way to access that content from the remote besides via the URL.
As a workaround I will of course now either use `setpresentkey` or just reassign all urls to be handled directly by the web remote somehow. But in the long run I think it is a problematic design, since `registerurl` doesn't even report which remote the URL was registered to:
```
> git annex registerurl --json SHA256E-s4--ca2ebdf97d7469496b1f4b78958f9dc8447efdcb623953fee7b6996b762f6fff.dat http://www.oneukrainian.com/tmp/124.dat
{\"command\":\"registerurl\",\"error-messages\":[],\"file\":null,\"input\":[\"SHA256E-s4--ca2ebdf97d7469496b1f4b78958f9dc8447efdcb623953fee7b6996b762f6fff.dat\",\"http://www.oneukrainian.com/tmp/124.dat\"],\"success\":true}
```
so how could I know, in general, the proper `setpresentkey` invocation to follow it up with?
"""]]


@ -0,0 +1,64 @@
[[!comment format=mdwn
username="joey"
subject="""comment 4"""
date="2023-04-05T17:25:48Z"
content="""
Whups, I forgot about the newish unregisterurl! That's the true inverse of
registerurl. So rmurl is really more the inverse of addurl.
I think I've fully understood the situation that led to this regression now.
I do think it was a regression. That change was all about SETURLPRESENT and
SETURLMISSING in the external special remote protocol, as well as rmurl;
I think that the effect on registerurl was not considered.
So while I'd like to simplify registerurl to as basic a plumbing command as
possible, and would prefer it not to update location tracking, there's the
matter of backward compatibility. Especially for simple cases like adding
regular web urls with it. It would be ok to change it back to update location
tracking for remotes that claim an url. As long as unregisterurl can be
symmetric with it --- can it?
rmurl also has its own wacky behavior in this area:
# git-annex addurl --fast https://cdimage.debian.org/debian-cd/current/i386/bt-cd/debian-11.6.0-i386-netinst.iso.torrent
(downloading torrent file...) addurl https://cdimage.debian.org/debian-cd/current/i386/bt-cd/debian-11.6.0-i386-netinst.iso.torrent (from bittorrent) (to debian-11.6.0-i386-netinst.iso) ok
(recording state in git...)
# git-annex rmurl debian-11.6.0-i386-netinst.iso https://cdimage.debian.org/debian-cd/current/i386/bt-cd/debian-11.6.0-i386-netinst.iso.torrent
rmurl debian-11.6.0-i386-netinst.iso ok
(recording state in git...)
# git-annex whereis debian-11.6.0-i386-netinst.iso
whereis debian-11.6.0-i386-netinst.iso (1 copy)
00000000-0000-0000-0000-000000000002 -- bittorrent
ok
# git-annex get debian-11.6.0-i386-netinst.iso
(fails)
Is that a bug? It's certainly not ideal for the bittorrent special
remote, which can't download the file once the url is removed. (It is
documented behavior though.)
While thinking about those questions, I thought of this situation:
# git-annex initremote s3 type=S3 ..
# git-annex copy --key $key --to s3
# git-annex registerurl $key $url
# git-annex unregisterurl $key $url
# git-annex drop --key $key --from s3
At the end there, it's still able to drop the content from s3.
Now, consider hypothetically, if I decide to make the S3 remote CLAIMURL
urls that are in the S3 bucket. As things stand, that won't change the
above scenario. (Although the key won't be recorded as located in the web
after registerurl.)
But... If unregisterurl is changed to update remote tracking for other remotes
than web, after the S3 CLAIMURL change, the behavior of that scenario will not
be the same! After unregisterurl, it will no longer consider the content to be
present in S3. Now you're racking up S3 charges with content that git-annex
stored in S3, but that it refuses to delete. That seems bad.
So, that scenario is leading me to think that I should not change
unregisterurl (or rmurl) to update location tracking of remotes other than web.
And so changing registerurl is also looking like a bad idea.
"""]]


@ -0,0 +1,18 @@
[[!comment format=mdwn
username="joey"
subject="""comment 5"""
date="2023-04-05T18:47:51Z"
content="""
What I'm inclined to do is add a --remote= parameter to registerurl and
unregisterurl. If the specified remote does not claim the url, have it fail
to add it. (See also [[todo/registerurl_--remote_REMOTE]])
So, you can then use registerurl with --remote=$uuid, check that it
succeeded, and then use setpresentkey to mark it present on that uuid.
Without the fragility you complained of.
Update: The --remote parameter is implemented now.
(Could registerurl with --remote update location tracking itself? Maybe,
but I'd worry about a scenario like in the previous comment.)
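The sequence described above could look roughly like this. It is only a sketch: the function name and the `GIT_ANNEX` override are made up for illustration, and the caller is assumed to already know the uuid of the remote that should claim the url.

```sh
# Hypothetical helper, not part of git-annex.
# GIT_ANNEX can be overridden, e.g. with `echo` for a dry run.
: "${GIT_ANNEX:=git-annex}"

# register_url_on_remote UUID KEY URL
# Registers URL for KEY only if the remote UUID claims it, and only then
# marks KEY as present on that remote. Stops at the first failing step.
register_url_on_remote() {
    uuid=$1 key=$2 url=$3
    $GIT_ANNEX registerurl --remote="$uuid" "$key" "$url" || return 1
    $GIT_ANNEX setpresentkey "$key" "$uuid" 1
}
```

Since registerurl fails when the remote does not claim the url, setpresentkey is never run against the wrong remote.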
"""]]


@ -0,0 +1,10 @@
[[!comment format=mdwn
username="yarikoptic"
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
subject="comment 6"
date="2023-04-05T19:36:40Z"
content="""
Obviously, as the author of the referenced wishlist, I would welcome addition of `--remote` option to both those commands.
But IMHO the addition of the option doesn't solve the initial/naive/programmable-user-oriented use case, where the user doesn't know which remote could or should handle the URL and just wants, analogously or complementary to `addurl`, to extend the list of urls available for some key. There is not even a user-level interface to ask \"what remotes can handle this url\" in order to erect some tandem of commands to register extra URLs for a key. So I don't see how the addition of the option would solve the problem.
"""]]


@ -0,0 +1,19 @@
[[!comment format=mdwn
username="joey"
subject="""comment 7"""
date="2023-04-05T19:57:37Z"
content="""
Well, unregisterurl and rmurl can't safely update location tracking for remotes
other than the web. Unless there were some way to know that simply removing an
url was *sufficient*, like it is for the web, and unlike how it would be
with my S3 remote scenario above.
But, the only issue with registerurl updating location tracking is that it's
not symmetric with unregisterurl.
So is that symmetry more important than comment 6? I don't know. In both
cases, some users are going to be surprised by inconsistent behavior.
The only way to avoid all user surprise would be to go back in time and
make these plumbing commands not update location tracking from the start.
"""]]


@ -0,0 +1,14 @@
[[!comment format=mdwn
username="joey"
subject="""comment 8"""
date="2023-04-05T21:00:04Z"
content="""
Guess I'll come down on the side of restoring old behavior which was
changed w/o warning (and without the new behavior ever being documented).
And on the side of user experience showing the current behavior is surprising.
The future users who get surprised by the resulting inconsistency
of unregisterurl not unsetting location tracking will just have to
live with it.. Sigh.
"""]]


@ -0,0 +1,23 @@
### Please describe the problem.
```
reprostim@reproiner:/data/reprostim/Videos$ ps auxw | grep webapp
reprost+ 25249 0.0 0.0 9892 2100 pts/5 S+ Jan05 0:00 git annex --debug webapp --listen 0.0.0.0:8888
reprost+ 25250 5.4 0.2 1074346616 65556 ? Ssl Jan05 60:39 /usr/bin/git-annex --debug webapp --listen 0.0.0.0:8888
reprost+ 224039 0.0 0.0 6332 2116 pts/6 S+ 10:55 0:00 grep webapp
reprostim@reproiner:/data/reprostim/Videos$ lsof -i :8888
reprostim@reproiner:/data/reprostim/Videos$ lsof -i | grep git-annex
git-annex 25250 reprostim 14u IPv4 129033 0t0 UDP *:55556
git-annex 221230 reprostim 14u IPv4 129033 0t0 UDP *:55556
reprostim@reproiner:/data/reprostim/Videos$ git annex version
git-annex version: 10.20230126
```
[[!meta author=yoh]]
[[!tag projects/repronim]]
> [[done]] --[[Joey]]


@ -0,0 +1,25 @@
[[!comment format=mdwn
username="joey"
subject="""comment 1"""
date="2024-01-18T16:47:15Z"
content="""
--listen takes an IP address (or hostname),
it does not let you specify the port. I've clarified the documentation
about this.
I don't reproduce the behavior you show, when I try that the process
runs but does not bind to any port, and in .git/annex/daemon.log, I see:
WebApp crashed: Network.Socket.getAddrInfo (called with preferred socket type/protocol: AddrInfo {addrFlags = [], addrFamily = AF_UNSPEC, addrSocketType = Stream, addrProtocol = 0, addrAddress = 0.0.0.0:0, addrCanonName = Nothing}, host name: Just "0.0.0.0:8888", service name: Nothing): does not exist (Name or service not known)
This may be an OS or resolver difference. If "0.0.0.0:8888" somehow
resolves to an IP address on your system, then the webapp will listen
on that IP address I suppose. But it's not expecting that to specify a
port.
The webapp outputs the full url, including the port it chose.
Since that url also includes an auth token that is required to use the
webapp, specifying the port for it to listen on does not seem very useful.
What's your use case for wanting to specify a port?
"""]]


@ -0,0 +1,12 @@
[[!comment format=mdwn
username="yarikoptic"
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
subject="comment 2"
date="2024-01-18T19:00:50Z"
content="""
> What's your use case for wanting to specify a port?
it was to have some static port I could keep an ssh redirect pointing to so I could investigate behavior of the remote running annex webapp.
My 1c: since AFAIK no ADDRESS could contain `:` (unless it is some `ssh` based \"server:/path/to/socket\", which I do not think is supported), it might be better to just crash (not just make it a matter of documentation) if a `:\d+` part is found to be specified?
"""]]


@ -0,0 +1,15 @@
[[!comment format=mdwn
username="joey"
subject="""comment 3"""
date="2024-01-18T21:13:42Z"
content="""
IPv6 would beg to differ about `:\d+` ;-)
Actually, it may be that your address:port was treated as some IPv6 mixed
IPv4, iirc something like that is a thing.
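To illustrate the problem with a plain `:\d+` check, here is a small sketch. The function name is made up and this is not how git-annex handles `--listen`; it only tests whether an argument ends in a colon followed by digits.

```sh
# Hypothetical check, not git-annex code: does the argument end in :DIGITS?
ends_with_colon_digits() {
    printf '%s\n' "$1" | grep -Eq ':[0-9]+$'
}

for addr in 0.0.0.0:8888 0.0.0.0 2001:db8::8888 ::1; do
    if ends_with_colon_digits "$addr"; then
        echo "$addr: would be rejected as address:port"
    else
        echo "$addr: accepted"
    fi
done
```

Both `2001:db8::8888` and `::1` are valid IPv6 addresses, yet such a check would reject them along with `0.0.0.0:8888`.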
> it was to have some static port I could keep an ssh redirect pointing to so I could investigate behavior of the remote running annex webapp.
You would still need to copy over the url though to get the access key
for the webapp..
"""]]


@ -0,0 +1,22 @@
[[!comment format=mdwn
username="joey"
subject="""comment 4"""
date="2024-01-25T17:29:08Z"
content="""
I found an old todo about the same thing,
[[todo/Make_webapp_port_configurable]].
The idea there was, they were using docker and wanted to open only a
specific port selected for the webapp. So basically the same kind of thing.
I think that this should be a separate --port option, to avoid needing to
try to parse something that may be an ipv6 address or hostname, or
whatever.
I don't think that using --port should prevent the webapp from needing
the `?auth=` part of the url, as output when using --listen.
Probably it does not make sense to use --port without also using --listen,
but if the user does use it, I don't think --port needs to output the url
the way --listen does.
"""]]


@ -0,0 +1,7 @@
[[!comment format=mdwn
username="joey"
subject="""comment 5"""
date="2024-01-25T18:06:15Z"
content="""
Implemented --port.
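
With that, the static-port setup from the earlier comments might be sketched as follows. The helper only prints the two commands and runs nothing; its name, the host name and the port number are placeholders for illustration. The `?auth=` part of the url still has to be copied from what the webapp outputs.

```sh
# Hypothetical helper: print the commands for a fixed-port webapp setup.
# HOST and PORT are chosen by the caller; nothing is executed here.
webapp_forward_commands() {
    host=$1 port=$2
    # On the remote machine: listen on localhost, on a fixed port.
    printf 'git annex webapp --listen 127.0.0.1 --port %s\n' "$port"
    # On the local machine: forward the same port over ssh.
    printf 'ssh -N -L %s:127.0.0.1:%s %s\n' "$port" "$port" "$host"
}

webapp_forward_commands remote.example.org 8888
```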
"""]]