Merge branch 'master' of ssh://git-annex.branchable.com

Joey Hess 2023-01-07 13:56:29 -04:00
commit 5e8070bc3c
GPG key ID: DB12DB0FF05F8F38
4 changed files with 70 additions and 0 deletions


@ -0,0 +1,22 @@
[[!comment format=mdwn
username="sb-beryllium@6e2c477eac63b823bd315ef8aaf5f93173c1f15b"
nickname="sb-beryllium"
avatar="http://cdn.libravatar.org/avatar/ef62105380b73ef91d760ec327c14e22"
subject="comment 4"
date="2023-01-07T13:05:23Z"
content="""
Apologies, this is beryllium (by alias)... due to unforeseen circumstances, I have had to register with a different email address. I am hoping this is a temporary situation.
Further apologies... I acknowledge that I am vacillating somewhat, but I've reverted to thinking that the hooks should be generated with unix line-endings (LF).
The reason is that the pseudo-standard for files under the .git directory appears to be unix line-endings only (perhaps an actual, documented standard... I'm not sure where to look/ask to confirm it).
This is the case for files such as .git/config, and .git/refs/** etc... under Git for Windows, and even JGit running with Windows native Java.
So to me, it seems to make sense that hook files should have unix line-endings exclusively.
There are other possible mitigations... but that's where I finally stand on this. I'm still not fussed if no action is determined to be the best course of action.
Thanks regardless.
"""]]


@ -0,0 +1,32 @@
Per our brief discussion: ATM git-annex allows prioritizing URLs only by assigning them to different special remotes and giving those remotes different costs.
This doesn't allow for e.g. prioritization within the built-in special "web" remote, which is the most frequent use case. Our use case:
```
(base) dandi@drogon:/mnt/backup/dandi/dandizarrs/ea8c43c7-757e-4653-8e4a-a6d356120836$ git annex whereis 0/0/0/3/6/169
(recording state in git...)
whereis 0/0/0/3/6/169 (2 copies)
00000000-0000-0000-0000-000000000001 -- web
86da9d10-da54-4371-8d6f-7559c6a236f5 -- dandi@drogon:/mnt/backup/dandi/dandizarrs/ea8c43c7-757e-4653-8e4a-a6d356120836 [here]
web: https://api.dandiarchive.org/api/zarr/ea8c43c7-757e-4653-8e4a-a6d356120836.zarr/0/0/0/3/6/169
web: https://dandiarchive.s3.amazonaws.com/zarr/ea8c43c7-757e-4653-8e4a-a6d356120836/0/0/0/3/6/169?versionId=h3qb0rOswsssHxEdfN8QAWUMoVhddQrY
ok
```
where we have an API-server based URL, which we do not want to access unless really needed (it would be the slowest, would put load on the server, etc.), and a direct URL to the public bucket, which is the fastest (unless some other local remote is even better).
Joey envisioned potentially being able to assign priorities via e.g.
git-annex enableremote web url-priority-1=s3.amazonaws.com/ url-priority-2=/api.dandiarchive.org/
but I also wondered if there could just be some way to provide costs (or adjustments to costs) for different URLs, so that they are all taken into account when comparing costs across remotes?
E.g. maybe I have a URL which is fast (S3 bucket), then a bunch of average regular remotes with decent speed (e.g. dropbox etc), and then a URL to some slow archive (API server). Both URLs are served by the `web` remote, and there would be no way to "order" all data access schemes/remotes in the optimal sequence of costs unless different URLs could have different costs, considered along with the costs of the different remotes.
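As a rough illustration of that idea (not an existing git-annex feature), here is a minimal Python sketch in which per-URL costs override the cost of the remote serving the URL, so that URLs and regular remotes end up in one ordering. The `URL_COSTS` table, the `Source` type, the function names, and all cost numbers are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical per-URL cost overrides (URL substring -> cost).
# Lower cost means "try first". These numbers are made up for illustration.
URL_COSTS: Dict[str, float] = {
    "s3.amazonaws.com/": 50.0,        # fast public bucket
    "api.dandiarchive.org/": 500.0,   # slow API server, last resort
}


@dataclass(frozen=True)
class Source:
    """One way to obtain a key: a remote, optionally via a specific URL."""
    remote: str
    cost: float                 # cost configured for the remote itself
    url: Optional[str] = None   # set only for URL-based sources


def effective_cost(src: Source, url_costs: Dict[str, float] = URL_COSTS) -> float:
    """Return the per-URL cost if a pattern matches, else the remote's cost."""
    if src.url is not None:
        for pattern, cost in url_costs.items():
            if pattern in src.url:
                return cost
    return src.cost


def order_sources(sources: List[Source],
                  url_costs: Dict[str, float] = URL_COSTS) -> List[Source]:
    """Sort all sources, URLs and regular remotes alike, by effective cost."""
    return sorted(sources, key=lambda s: effective_cost(s, url_costs))
```

With the two `web` URLs from the `whereis` output above plus a hypothetical `dropbox` remote at an in-between cost, `order_sources` would put the S3 URL first, then `dropbox`, then the API URL, which is the ordering that per-remote costs alone cannot express.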
PS somehow I have some odd memory of seeing some config option to provide git-annex a script which would output cost given a URL... I disliked that approach since it would require me to code the script, and thus didn't use it. Did I dream it up?
[[!meta author=yoh]]
[[!tag projects/dandi]]


@ -0,0 +1,8 @@
[[!comment format=mdwn
username="satra"
avatar="http://cdn.libravatar.org/avatar/d3afc58453bc273dc015254e1d9581b3"
subject="use list order for cost"
date="2023-01-06T19:36:07Z"
content="""
As a starting point, is order an implicit notion of cost? If so, could listing s3 before dandi achieve the expected outcome: try s3 first, then try the API?
"""]]


@ -0,0 +1,8 @@
[[!comment format=mdwn
username="yarikoptic"
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
subject="importtree readonly remote folder?"
date="2023-01-06T14:21:35Z"
content="""
I have run into the same \"desire\", I think: to be able to `import` a read-only (to me) tree of DICOMs from a remote server reachable over ssh, so that later on those who have a clone of the repository and access to that server could easily `annex get` the necessary load. Am I right in my importtree-ignorant thinking that if the `rsync` special remote supported `importtree` mode, it would also work fine when I have only read-only access to the remote folder?
"""]]