This commit is contained in:
parent
dbd2f0e7bf
commit
554b73bb39
1 changed files with 142 additions and 0 deletions
142
doc/bugs/migrate_removes_associated_URLs_with_custom_scheme.mdwn
Normal file
142
doc/bugs/migrate_removes_associated_URLs_with_custom_scheme.mdwn
Normal file
|
@ -0,0 +1,142 @@
|
|||
### Please describe the problem.
|
||||
|
||||
With URLs that are handled by the web remote the URL will be kept through a migration, i.e.
|
||||
|
||||
```
|
||||
git annex addurl --relaxed <some-http-url>
|
||||
git annex get <the-added-file>
|
||||
git annex migrate
|
||||
```
|
||||
|
||||
will migrate the key of the file to be hash based, and keep the URL associated to that key.
|
||||
If I do the same with a URL whose scheme is handled by a custom special remote (this one specifically is what I got the issue with: <https://github.com/matrss/datalad-cds/blob/main/src/datalad_cds/cds_remote.py>, it registers itself for the `cds:` scheme), the URL seems to be dropped from the key (i.e. whereis no longer shows it and git annex can no longer fetch it from the special remote).
|
||||
|
||||
### What steps will reproduce the problem?
|
||||
|
||||
The steps mentioned above for a URL whose scheme is handled by an (external?) special remote. Specifically, I saw it with datalad-cds, it might happen for other special remotes as well.
|
||||
|
||||
### What version of git-annex are you using? On what operating system?
|
||||
|
||||
```
|
||||
git-annex version: 10.20231129
|
||||
build flags: Assistant Webapp Pairing Inotify DBus DesktopNotify TorrentParser MagicMime Feeds Testsuite S3 WebDAV
|
||||
dependency versions: aws-0.24.1 bloomfilter-2.0.1.2 crypton-0.32 DAV-1.3.4 feed-1.3.2.1 ghc-9.4.8 http-client-0.7.15 persistent-sqlite-2.13.3.0 torrent-10000.1.3 uuid-1.3.15 yesod-1.6.2.1
|
||||
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL X*
|
||||
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs httpalso borg hook external
|
||||
operating system: linux x86_64
|
||||
supported repository versions: 8 9 10
|
||||
upgrade supported from repository versions: 0 1 2 3 4 5 6 7 8 9 10
|
||||
local repository version: 10
|
||||
```
|
||||
|
||||
Installed with nix from nixpkgs on an ubuntu system.
|
||||
|
||||
### Please provide any additional information below.
|
||||
|
||||
It works with the web remote:
|
||||
[[!format sh """
|
||||
# If you can, paste a complete transcript of the problem occurring here.
|
||||
# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log
|
||||
$ datalad create test-ds
|
||||
create(ok): [...] (dataset)
|
||||
$ cd test-ds/
|
||||
$ git annex addurl --relaxed "https://git-annex.branchable.com/"
|
||||
addurl https://git-annex.branchable.com/ (to git-annex.branchable.com_) ok
|
||||
(recording state in git...)
|
||||
$ git annex whereis git-annex.branchable.com_
|
||||
whereis git-annex.branchable.com_ (1 copy)
|
||||
00000000-0000-0000-0000-000000000001 -- web
|
||||
|
||||
web: https://git-annex.branchable.com/
|
||||
ok
|
||||
$ ls -l
|
||||
[...] git-annex.branchable.com_ -> '.git/annex/objects/pJ/v4/URL--https&c%%git-annex.branchable.com%/URL--https&c%%git-annex.branchable.com%'
|
||||
$ git annex get git-annex.branchable.com_
|
||||
get git-annex.branchable.com_ (from web...)
|
||||
ok
|
||||
(recording state in git...)
|
||||
$ git annex migrate
|
||||
migrate git-annex.branchable.com_ (checksum...) ok
|
||||
(recording state in git...)
|
||||
$ ls -l
|
||||
[...] git-annex.branchable.com_ -> .git/annex/objects/31/qv/MD5E-s12462--6b956d66f8352205df79936ada326ec3/MD5E-s12462--6b956d66f8352205df79936ada326ec3
|
||||
$ git annex whereis git-annex.branchable.com_
|
||||
whereis git-annex.branchable.com_ (2 copies)
|
||||
00000000-0000-0000-0000-000000000001 -- web
|
||||
60079e0e-42e4-492e-a7b1-dde764d069eb -- [here]
|
||||
|
||||
web: https://git-annex.branchable.com/
|
||||
ok
|
||||
# End of transcript or log.
|
||||
"""]]
|
||||
|
||||
It doesn't work with the custom special remote:
|
||||
[[!format sh """
|
||||
$ datalad create test-ds
|
||||
create(ok): [...] (dataset)
|
||||
$ cd test-ds/
|
||||
$ datalad download-cds --lazy --nosave --path "2022-01-01.grib" "$(cat <<EOF
|
||||
{
|
||||
"dataset": "reanalysis-era5-complete",
|
||||
"sub-selection": {
|
||||
"class": "ea",
|
||||
"date": "2022-01-01",
|
||||
"expver": "1",
|
||||
"levelist": "1",
|
||||
"levtype": "ml",
|
||||
"param": "130",
|
||||
"stream": "oper",
|
||||
"time": "00:00:00/06:00:00/12:00:00/18:00:00",
|
||||
"type": "an",
|
||||
"grid": ".3/.3"
|
||||
}
|
||||
}
|
||||
EOF
|
||||
)"
|
||||
cds(ok): [...] (dataset)
|
||||
$ ls -l
|
||||
[...] 2022-01-01.grib -> '.git/annex/objects/Mx/4G/URL--cds&cv1-eyJkYXRhc2V0IjoicmVhbmFs-162a71d794c333f5e04b13283421a49a/URL--cds&cv1-eyJkYXRhc2V0IjoicmVhbmFs-162a71d794c333f5e04b13283421a49a'
|
||||
$ git annex whereis 2022-01-01.grib
|
||||
whereis 2022-01-01.grib (1 copy)
|
||||
923e2755-e747-42f4-890a-9c921068fb82 -- [cds]
|
||||
|
||||
cds: {"dataset":"reanalysis-era5-complete","sub-selection":{"class":"ea","date":"2022-01-01","expver":"1","levelist":"1","levtype":"ml","param":"130","stream":"oper","time":"00:00:00/06:00:00/12:00:00/18:00:00","type":"an","grid":".3/.3"}}
|
||||
cds: cds:v1-eyJkYXRhc2V0IjoicmVhbmFseXNpcy1lcmE1LWNvbXBsZXRlIiwic3ViLXNlbGVjdGlvbiI6eyJjbGFzcyI6ImVhIiwiZGF0ZSI6IjIwMjItMDEtMDEiLCJleHB2ZXIiOiIxIiwibGV2ZWxpc3QiOiIxIiwibGV2dHlwZSI6Im1sIiwicGFyYW0iOiIxMzAiLCJzdHJlYW0iOiJvcGVyIiwidGltZSI6IjAwOjAwOjAwLzA2OjAwOjAwLzEyOjAwOjAwLzE4OjAwOjAwIiwidHlwZSI6ImFuIiwiZ3JpZCI6Ii4zLy4zIn19
|
||||
ok
|
||||
$ git config --local remote.cds.annex-security-allow-unverified-downloads ACKTHPPT
|
||||
$ git annex get 2022-01-01.grib
|
||||
get 2022-01-01.grib (from cds...)
|
||||
2024-04-06 11:37:05,250 INFO Welcome to the CDS
|
||||
2024-04-06 11:37:05,251 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-complete
|
||||
2024-04-06 11:37:05,340 INFO Request is queued
|
||||
2024-04-06 11:37:06,400 INFO Request is running
|
||||
2024-04-06 11:37:26,399 INFO Request is completed
|
||||
2024-04-06 11:37:26,399 INFO Downloading https://download-0017.copernicus-climate.eu/cache-compute-0017/cache/data9/adaptor.mars.external-1712396225.5545986-18258-18-822e5b91-cf60-4dbd-a808-a1253d4fe109.grib to .git/annex/tmp/URL--cds&cv1-eyJkYXRhc2V0IjoicmVhbmFs-162a71d794c333f5e04b13283421a49a (5.5M)
|
||||
0%| | 0.00/5.51M [00:00<?, ?B/s] 2%|▏ | 104k/5.51M [00:00<00:06, 866kB/s] 17%|█▋ | 942k/5.51M [00:00<00:00, 5.00MB/s] 57%|█████▋ | 3.15M/5.51M [00:00<00:00, 13.0MB/s] 99%|█████████▉| 5.48M/5.51M [00:00<00:00, 17.3MB/s] 2024-04-06 11:37:27,322 INFO Download rate 6M/s
|
||||
ok
|
||||
(recording state in git...)
|
||||
$ git annex migrate
|
||||
migrate 2022-01-01.grib (checksum...) ok
|
||||
(recording state in git...)
|
||||
$ ls -l
|
||||
[...] 2022-01-01.grib -> .git/annex/objects/KJ/6K/MD5E-s5774880--94a848eefd02d72952c8541c52a93550.grib/MD5E-s5774880--94a848eefd02d72952c8541c52a93550.grib
|
||||
$ git annex whereis 2022-01-01.grib
|
||||
whereis 2022-01-01.grib (1 copy)
|
||||
5dfef0c9-8e18-4ea2-9ee1-646830b5749b -- [here]
|
||||
ok
|
||||
$ git annex drop 2022-01-01.grib
|
||||
drop 2022-01-01.grib (unsafe)
|
||||
Could only verify the existence of 0 out of 1 necessary copy
|
||||
|
||||
Rather than dropping this file, try using: git annex move
|
||||
|
||||
(Use --force to override this check, or adjust numcopies.)
|
||||
failed
|
||||
drop: 1 failed
|
||||
"""]]
|
||||
|
||||
I know that this is sort of abusing the URL handling in git-annex, but it was super easy to implement. You recommended me to use SETSTATE/GETSTATE from the external special remote protocol instead already at some point, but I didn't get around to reworking it for that yet.
|
||||
|
||||
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
|
||||
|
||||
Yes! It is absolutely great, thank you for it.
|
Loading…
Add table
Add a link
Reference in a new issue