Merge branch 'master' of ssh://git-annex.branchable.com
commit e7ff1c6762
8 changed files with 212 additions and 0 deletions
@@ -0,0 +1,8 @@
[[!comment format=mdwn
username="Ilya_Shlyakhter"
avatar="http://cdn.libravatar.org/avatar/1647044369aa7747829c38b9dcc84df0"
subject="comment 4"
date="2018-10-05T19:56:24Z"
content="""
If the s3 remote claims s3:// URLs, does the bucket name have to be a DNS domain? I thought that when a special remote claims a URL, it can interpret it however it wants?
"""]]
@@ -0,0 +1,18 @@
[[!comment format=mdwn
username="yarikoptic"
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
subject="comment 5"
date="2018-10-05T20:20:42Z"
content="""
yes, for `s3://` URLs there is only a bucket name, not a domain. But since bucket names are allowed to contain `.`, some folks started using their project domain name as the bucket name (e.g. `openneuro.org`, `images.cocodataset.org`). If you then access such a bucket directly via URL, the full domain name becomes e.g. http://images.cocodataset.org.s3.amazonaws.com, which starts causing trouble as soon as you try to access it via https:
[[!format sh \"\"\"
$> wget -S https://images.cocodataset.org.s3.amazonaws.com
--2018-10-05 16:19:48-- https://images.cocodataset.org.s3.amazonaws.com/
Resolving images.cocodataset.org.s3.amazonaws.com (images.cocodataset.org.s3.amazonaws.com)... 52.216.18.32
Connecting to images.cocodataset.org.s3.amazonaws.com (images.cocodataset.org.s3.amazonaws.com)|52.216.18.32|:443... connected.
The certificate's owner does not match hostname ‘images.cocodataset.org.s3.amazonaws.com’
\"\"\"]]
for which we started to provide workarounds in datalad.
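One possible workaround (a sketch below; I am not claiming this is exactly what datalad does internally) is to rewrite such URLs into path-style addressing, so the hostname is plain s3.amazonaws.com and the certificate matches:

[[!format sh \"\"\"
# path-style form of the same bucket: the bucket name moves from the
# hostname into the path, so the s3.amazonaws.com certificate matches
$> wget -S https://s3.amazonaws.com/images.cocodataset.org/
\"\"\"]]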
"""]]
@@ -0,0 +1,10 @@
[[!comment format=mdwn
username="Ilya_Shlyakhter"
avatar="http://cdn.libravatar.org/avatar/1647044369aa7747829c38b9dcc84df0"
subject="comment 6"
date="2018-10-05T21:31:33Z"
content="""
My current planned solution is to write an external special remote that claims s3:// URLs and downloads them. Then I can use addurl --fast. My use case is that, if I run a batch job that reads inputs from s3 and writes outputs to s3, what I get at the end are pointers to s3, and I want to check these results into git-annex. Ideally there'd be a way for me to tell the batch system to use git-annex to send things to s3, but currently that's not possible.
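Roughly what I have in mind, as a minimal sketch based on my reading of the external special remote protocol docs (only the URL-claiming side is fleshed out; transfers are stubbed, and a real remote would fetch with something like the aws CLI):

[[!format sh \"\"\"
#!/bin/sh
# sketch of an external special remote that claims s3:// URLs
echo VERSION 1
while read -r cmd a1 a2 a3; do
    case $cmd in
        INITREMOTE) echo INITREMOTE-SUCCESS ;;
        PREPARE)    echo PREPARE-SUCCESS ;;
        CLAIMURL)
            # a1 is the URL; once claimed, the remote is free to
            # interpret an s3:// URL however it wants
            case $a1 in
                s3://*) echo CLAIMURL-SUCCESS ;;
                *)      echo CLAIMURL-FAILURE ;;
            esac ;;
        CHECKURL)
            # a real remote could HEAD the object to report its size
            echo CHECKURL-CONTENTS UNKNOWN ;;
        TRANSFER)
            # a1=RETRIEVE/STORE, a2=key, a3=file; a real remote would look up
            # the claimed s3:// URL for the key and fetch it, e.g. via aws s3 cp
            echo TRANSFER-FAILURE $a1 $a2 not-implemented-in-this-sketch ;;
        *) echo UNSUPPORTED-REQUEST ;;
    esac
done
\"\"\"]]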
Question: if an external special remote claims a URL that a built-in special remote could handle, does the external special remote take priority?
"""]]
@@ -0,0 +1,8 @@
[[!comment format=mdwn
username="yarikoptic"
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
subject="comment 7"
date="2018-10-05T21:41:45Z"
content="""
FWIW, the datalad special remote already supports downloading from such s3:// URLs.
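If I remember correctly, it can be enabled in a repository with something like the following (the externaltype name is from memory, so double-check against the datalad docs):

[[!format sh \"\"\"
# enable datalad's external special remote in the current repository
git annex initremote datalad type=external externaltype=datalad encryption=none autoenable=true
\"\"\"]]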
"""]]
@@ -0,0 +1,12 @@
[[!comment format=mdwn
username="Ilya_Shlyakhter"
avatar="http://cdn.libravatar.org/avatar/1647044369aa7747829c38b9dcc84df0"
subject="comment 2"
date="2018-10-05T18:24:18Z"
content="""
Thanks for adding annex.jobs. Could you make it so that setting it to 0 means \"use all available processors\"? I use git-annex on AWS instances and reserve instances with different processor counts at different times.
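For now I can work around it by setting the config from the instance processor count, e.g. with coreutils nproc:

[[!format sh \"\"\"
# use however many processors this instance happens to have
git config annex.jobs $(nproc)
\"\"\"]]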
\"git-annex is rarely cpu-bound\" -- I though parallelization helps by parallelizing I/O operations such as file transfers?
"""]]