Merge branch 'master' of ssh://git-annex.branchable.com

Joey Hess 2018-10-13 12:19:09 -04:00
commit b62a43a800
GPG key ID: DB12DB0FF05F8F38
11 changed files with 136 additions and 1 deletions

View file

@@ -46,7 +46,12 @@ content of an annexed file remains unchanged.
files or slow systems.
* `URL` -- This is a key that is generated from the url to a file.
It's generated when using e.g. `git annex addurl --fast`, when the file
content is not available for hashing. The key may not contain the full
URL; for long URLs, part of the URL may be represented by a checksum.
The URL key may contain `&` characters; be sure to quote the key if
passing it to a shell script. The URL-backend key is distinct from URLs/URIs
that may be attached to a key (from any backend) indicating the key's location
on the web or in one of [[special_remotes]].
If you want to be able to prove that you're working with the same file
contents that were checked into a repository earlier, you should avoid
@@ -75,3 +80,5 @@ in `.gitattributes`:
* annex.backend=WORM
*.mp3 annex.backend=SHA256E
*.ogg annex.backend=SHA256E
See also: [[git-annex-examinekey]]
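
The note above about `&` in URL keys can be illustrated with a minimal sketch. The key below is hypothetical, made up for illustration only; real URL keys are generated by git-annex and may look different. The sketch only shows that single quotes around the literal and double quotes around the expansion keep the key intact as one argument.

```shell
# Hypothetical key, for illustration only (not produced by git-annex here).
key='URL--http://example.com/data?a=1&b=2'

# Single quotes above stop the shell from reading `&` as "run in background"
# and `?` as a glob character. Double quotes below pass the key on as exactly
# one argument to the child shell, which reads it as "$1".
out=$(sh -c 'printf "%s\n" "$1"' sh "$key")
printf '%s\n' "$out"

# With git-annex installed, a quoted key could be passed on the same way:
# git annex examinekey "$key"
```

Passing the key as a positional parameter, rather than pasting it into the `sh -c` command string, avoids having the child shell parse the `&` a second time.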

View file

@@ -0,0 +1,9 @@
[[!comment format=mdwn
username="pz63@5ea0a27986d782e467e1ebef6eb31ba440cc58d5"
nickname="pz63"
avatar="http://cdn.libravatar.org/avatar/6268f09b18a71aafa3ad68ecd8a20d50"
subject="comment 2"
date="2018-10-12T15:49:00Z"
content="""
Makes sense. Thank you very much!
"""]]

View file

@@ -0,0 +1,49 @@
I was excited to try https://github.com/OpenNeuroDatasets/ds001544/, which has a proper publicurl set, etc. Unfortunately, something prevents an immediate parallel get (`-J8` in the example below, but it is the same with `-J2`). It works on the second try, or when not using `-J`. The error message states "unknown export location":
[[!format sh """
$> git annex version | head -n 1; git clone https://github.com/OpenNeuroDatasets/ds001544/ ; cd ds001544; git annex enableremote s3-PUBLIC; git annex get -J8 . 2>&1 | head -n 20; echo "2nd run"; git annex get -J8 .
git-annex version: 6.20181011+git7-g373c2abc2-1~ndall+1
Cloning into 'ds001544'...
remote: Enumerating objects: 866, done.
remote: Counting objects: 100% (866/866), done.
remote: Compressing objects: 100% (483/483), done.
remote: Total 866 (delta 144), reused 863 (delta 141), pack-reused 0
Receiving objects: 100% (866/866), 75.07 KiB | 4.69 MiB/s, done.
Resolving deltas: 100% (144/144), done.
(merging origin/git-annex into git-annex...)
(recording state in git...)
Remote origin not usable by git-annex; setting annex-ignore
enableremote s3-PUBLIC ok
(recording state in git...)
get code/convert_sub01_ses01.R (from s3-PUBLIC...)
unknown export location
Unable to access these remotes: s3-PUBLIC
Try making some of these repositories available:
1c66b8f9-34c7-42d1-8e9f-d7bc1982311a -- root@460f24a504cc:/datalad/ds001544
837e28c7-9e4a-4792-b1b1-aa69d3430a42 -- [s3-PUBLIC]
a7294efc-f620-445d-8e9d-803b3ec748ef -- s3-PRIVATE
(Note that these git remotes have annex-ignore set: origin)
failed
get sub-01/ses-01/fmap/sub-01_ses-01_acq-cf0PA_run-03_epi.nii.gz (from s3-PUBLIC...)
unknown export location
Unable to access these remotes: s3-PUBLIC
Try making some of these repositories available:
1c66b8f9-34c7-42d1-8e9f-d7bc1982311a -- root@460f24a504cc:/datalad/ds001544
837e28c7-9e4a-4792-b1b1-aa69d3430a42 -- [s3-PUBLIC]
2nd run
get code/convert_sub01_ses02.R (from s3-PUBLIC...) (checksum...) ok
get code/convert_sub01_ses01.R (from s3-PUBLIC...) (checksum...) ok
get sub-01/ses-01/fmap/sub-01_ses-01_acq-cf1PA_run-02_epi.nii.gz (from s3-PUBLIC...) (checksum...) ok
get sub-01/ses-01/func/sub-01_ses-01_task-Stroop_acq-cf0AP_run-03_physio.tsv.gz (from s3-PUBLIC...) (checksum...) ok
...
"""]]
[[!meta author=yoh]]

View file

@@ -0,0 +1 @@
What, exactly, is the relationship between keys, URLs and URIs? As I understand it, for each key, git-annex keeps a list of zero or more URLs/URIs from which the key's contents may be downloaded. Is each entry in this list specific to one special remote, i.e. is this really a list of pairs (URL/URI, special-remote-uuid)? Is the record of which key is present in which remote completely independent of the record of which key may be downloaded from which remote through which URL/URI? Is the only difference between URLs and URIs that, when a URL is added for a key, git-annex records the key as present in the web special remote, while when a URI is added, it doesn't? (Sorry for too many questions.)

View file

@@ -28,6 +28,10 @@ that can be determined purely by looking at the key.
Also, '\\n' is a newline, '\\000' is a NULL, etc.
Note that it is not possible to extract the URL from a key created by the
URL backend: parts of longer URLs may be represented in the key by a
checksum.
* `--json`
Enable JSON output. This is intended to be parsed by programs that use

View file

@@ -0,0 +1,10 @@
[[!comment format=mdwn
username="Mowgli"
avatar="http://cdn.libravatar.org/avatar/17ab194dddf7b7da59ec039cbb3ac252"
subject="comment 72"
date="2018-10-13T12:37:56Z"
content="""
@joey, what I have is just the builds I get from Gentoo, or rather from the Haskell overlay. The most current version in that overlay is 6.20180626, but it also does not compile.
I need a version more recent than 6.20170818, as that version still uses rsync for remote copy.
"""]]

View file

@@ -0,0 +1,8 @@
[[!comment format=mdwn
username="Ilya_Shlyakhter"
avatar="http://cdn.libravatar.org/avatar/1647044369aa7747829c38b9dcc84df0"
subject="comment 4"
date="2018-10-12T00:06:16Z"
content="""
a test failure: https://circleci.com/gh/conda-forge/git-annex-feedstock/116?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link
"""]]

View file

@@ -0,0 +1,20 @@
[[!comment format=mdwn
username="yarikoptic"
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
subject="arm64 possible CIs etc"
date="2018-10-11T22:22:50Z"
content="""
According to the detailed information @mmarmm provided on GitHub in the request to support [arm64 for neurodebian](https://github.com/neurodebian/dockerfiles/issues/10#issuecomment-406644418):
Shippable supports free Arm64 CI/CD and I believe Codefresh does too (both 64-bit and 32-bit for both providers):
https://blog.shippable.com/shippable-arm-packet-deliver-native-ci-cd-for-arm-architecture
http://docs.shippable.com/platform/tutorial/workflow/run-ci-builds-on-arm/
CodeFresh Arm Beta signup: https://goo.gl/forms/aDhlk56jZcblYokj1
If you need raw infrastructure the WorksOnArm project will supply full servers if you want to deal with metal: https://github.com/worksonarm/cluster/
I personally haven't looked into any of them yet
"""]]

View file

@@ -0,0 +1,8 @@
[[!comment format=mdwn
username="Ilya_Shlyakhter"
avatar="http://cdn.libravatar.org/avatar/1647044369aa7747829c38b9dcc84df0"
subject="comment 4"
date="2018-10-12T01:05:30Z"
content="""
\"I'm not convinced that git-annex should try to make the symlinks shorter just because some programs have UIs that don't work well with longer symlinks\" -- UI is just one advantage of shorter keys. Another is that some systems can't handle long paths; e.g. the backends page says not to use the 512 or 384 bit hashes on Windows. Another is that long keys and symlinks increase the amount of data git deals with, which can matter for large repos. Using base64 encoding for hashes would shorten key lengths by a third; not repeating the hash twice in symlinks would give another factor of 2.
"""]]
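
The "shorten by a third" estimate in the comment above can be checked with a little arithmetic. This is a minimal sketch assuming a 256-bit digest (as in SHA256) and standard padded base64; the variable names are only illustrative, and nothing here reflects how git-annex actually encodes keys.

```shell
bits=256                            # digest size of SHA-256, in bits
bytes=$((bits / 8))                 # 32 bytes of raw digest
hex_len=$((bytes * 2))              # hex: 2 characters per byte
b64_len=$(( (bytes + 2) / 3 * 4 ))  # base64: 4 characters per 3 bytes, padded
saved_pct=$(( (hex_len - b64_len) * 100 / hex_len ))  # integer percent

echo "hex: $hex_len chars, base64: $b64_len chars, saved: ${saved_pct}%"
# prints "hex: 64 chars, base64: 44 chars, saved: 31%"
```

Dropping the trailing `=` padding would bring the base64 form to 43 characters, which is very close to the one-third figure. Note that standard base64 includes `/`, so a file-name-safe alphabet would be needed in practice.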

View file

@@ -0,0 +1,11 @@
[[!comment format=mdwn
username="yarikoptic"
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
subject="comment 5"
date="2018-10-12T13:42:07Z"
content="""
I have been arguing for removal of the KEY-directory/ for a while ;) See e.g. the old issue https://github.com/datalad/datalad/issues/32 . There is an issue/discussion somewhere on this website too, but I couldn't find it quickly.
IMHO it is just a \"tech\" problem, i.e. no design principle forbids fixing it. It might lead to performance issues, though, since the containing directory then needs to be chmod'ed back and forth to introduce changes to the KEY-file under it, but that is probably very similar to what happens now anyway.
FWIW in DataLad we moved to the MD5E backend as the default to at least somewhat relieve the burden of long symlinks. I think we are \"secure\" enough for what we use DataLad for here ;)
"""]]

View file

@@ -0,0 +1,8 @@
[[!comment format=mdwn
username="Ilya_Shlyakhter"
avatar="http://cdn.libravatar.org/avatar/1647044369aa7747829c38b9dcc84df0"
subject="comment 6"
date="2018-10-12T13:58:05Z"
content="""
Removing KEY/directory could give more savings, but sometimes there is more than one file there (e.g. key metadata), so the dir makes sense. But the content filename in the dir needn't repeat the key. Changing that could be hard, though. Adding backend variants with base64-encoded checksums seems possible though?
"""]]