move old fixed datalad/dandi/repronim bugs to the project pages
This is to cut down on the number of files in bugs/, which makes it slow to file new bug reports or update active bug reports. These old bugs were about 1/3rd of the files in there.

These projects want lists of their old bugs to still be accessible, and have the lists on their project pages, which will still list the old bugs.

Commands used:

for f in $(git grep -l '\[\[!tag projects/dandi\]\]'); do if grep -q 'done\]\]' "$f"; then git mv "$f" ../projects/dandi/bugs-done; g=$(echo "$f" | sed 's/.mdwn//'); if [ -d "$g" ]; then git mv "$g" ../projects/dandi/bugs-done; fi; fi; done

for f in $(git grep -l '\[\[!tag projects/repronim\]\]'); do if grep -q 'done\]\]' "$f"; then git mv "$f" ../projects/repronim/bugs-done; g=$(echo "$f" | sed 's/.mdwn//'); if [ -d "$g" ]; then git mv "$g" ../projects/repronim/bugs-done; fi; fi; done

for f in $(git grep -l '\[\[!tag projects/datalad\]\]'); do if grep -q 'done\]\]' "$f"; then git mv "$f" ../projects/datalad/bugs-done; g=$(echo "$f" | sed 's/.mdwn//'); if [ -d "$g" ]; then git mv "$g" ../projects/datalad/bugs-done; fi; fi; done

That assumes that bugs are not tagged by multiple projects at the same time. Of the ones I moved, I've checked and none are.

Could do the same with todo/ but there are only 370 files in there, and less than 84 of them could be moved this way, which does not seem likely to produce a sizeable speedup.

Sponsored-by: Dartmouth College's Datalad project
parent: 946fc20165
commit: bcc69f07e8
1011 changed files with 4 additions and 4 deletions
|
@ -0,0 +1,49 @@
|
|||
### Please describe the problem.
|
||||
|
||||
The original complaints can be found in the comments of the [importfeed page](https://git-annex.branchable.com/git-annex-importfeed/): when using `addurl`, even when the server provides a Content-Disposition field with the filename, git-annex seems to take that filename value and obfuscate it (replacing '-' with '_' etc) instead of using what is supposed to be the original filename. (BTW: no Content-Disposition was mentioned in the --debug output.)
|
||||
|
||||
|
||||
[[!format sh """
|
||||
$> mkdir /tmp/testrepo; cd /tmp/testrepo; git init; git annex init;
|
||||
mkdir: cannot create directory ‘/tmp/testrepo’: File exists
|
||||
E: could not determine git repository root
|
||||
Initialized empty Git repository in /tmp/testrepo/.git/
|
||||
init ok
|
||||
(recording state in git...)
|
||||
|
||||
$> git annex addurl --fast https://girder.dandiarchive.org/api/v1/item/5e9f9588b5c9745bad9f58ff/download
|
||||
addurl https://girder.dandiarchive.org/api/v1/item/5e9f9588b5c9745bad9f58ff/download (to sub_mouse_AAYYT_ses_20180420_sample_2_slice_20180420_slice_2_cell_20180420_sample_2.nwb) ok
|
||||
(recording state in git...)
|
||||
|
||||
$> ls -l
|
||||
total 4
|
||||
lrwxrwxrwx 1 yoh yoh 184 May 7 17:02 sub_mouse_AAYYT_ses_20180420_sample_2_slice_20180420_slice_2_cell_20180420_sample_2.nwb -> .git/annex/objects/Gj/9z/URL-s9335000--https&c%%girder.dandiarchive.org-48163bc503cb7181516be86ef215f923/URL-s9335000--https&c%%girder.dandiarchive.org-48163bc503cb7181516be86ef215f923
|
"""]]
|
||||
|
||||
This happens whenever the original Content-Disposition has "-" in the filename, which is a perfectly safe character in a filename AFAIK:
|
||||
|
||||
[[!format sh """
|
||||
$> wget -S https://girder.dandiarchive.org/api/v1/item/5e9f9588b5c9745bad9f58ff/download
|
||||
... bunch of forwards to the final one with the content disposition field
|
||||
Resolving dandiarchive.s3.amazonaws.com (dandiarchive.s3.amazonaws.com)... 52.219.101.51
|
||||
Connecting to dandiarchive.s3.amazonaws.com (dandiarchive.s3.amazonaws.com)|52.219.101.51|:443... connected.
|
||||
HTTP request sent, awaiting response...
|
||||
HTTP/1.1 200 OK
|
||||
x-amz-id-2: VgJE1jV5XUkBQXZDWgR5WEDfmHJp4Fj6fGo6z2tYkLfyTsxDWC+m92B2qOSVppCuiRFu2QpNV5M=
|
||||
x-amz-request-id: 1221CAC30E3931CF
|
||||
Date: Thu, 07 May 2020 21:02:52 GMT
|
||||
Last-Modified: Wed, 22 Apr 2020 00:54:32 GMT
|
||||
ETag: "acf3b4f5951435245a0efcd4a518e77d"
|
||||
Content-Disposition: attachment; filename="sub-mouse-AAYYT_ses-20180420-sample-2_slice-20180420-slice-2_cell-20180420-sample-2.nwb"
|
||||
...
|
||||
|
||||
$> git annex version
|
||||
git-annex version: 7.20190708+git9-gfa3524b95-1~ndall+1
|
||||
|
||||
"""]]
|
||||
|
||||
|
||||
[[!meta author=yoh]]
|
||||
[[!tag projects/dandi]]
|
||||
|
||||
> [[done]] --[[Joey]]
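To make the report above easier to follow, here is a minimal Python sketch (not git-annex code) that extracts the filename from the Content-Disposition header shown in the `wget -S` output and reproduces the '-' to '_' substitution seen in the `addurl` output. The helper name `filename_from_content_disposition` is hypothetical, and the single `replace` call only mimics the symptom described in this report; it is not a description of how git-annex sanitizes names.

```python
from email.message import Message


def filename_from_content_disposition(header_value):
    """Return the filename parameter of a Content-Disposition header, or None."""
    msg = Message()
    msg["Content-Disposition"] = header_value
    return msg.get_filename()


# Header value copied from the wget -S transcript in this report.
header = ('attachment; filename="sub-mouse-AAYYT_ses-20180420-sample-2_'
          'slice-20180420-slice-2_cell-20180420-sample-2.nwb"')

suggested = filename_from_content_disposition(header)
# Mimic only the symptom from the report: every '-' became '_'.
as_reported = suggested.replace("-", "_")

print(suggested)
print(as_reported)
```

The second printed name matches the file that `addurl --fast` created in the transcript, which suggests the server-provided name was used as the starting point and then altered.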
|
|
@ -0,0 +1,42 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 1"""
|
||||
date="2020-05-08T16:50:14Z"
|
||||
content="""
|
||||
This is due to the filename being passed through sanitizeFilePath.
|
||||
|
||||
There are security concerns here. If the filename contains "../"
|
||||
it absolutely has to be modified, or the command would have to fail and
|
||||
refuse to import it.
|
||||
|
||||
If the filename contains an ANSI escape sequence, it could potentially
|
||||
lead to a security hole. Or if the filename starts with "-" it could be
|
||||
somewhere between a possible security hole and just very annoying to work
|
||||
with. As could a filename that contains a newline, which will
|
||||
break large quantities of shell pipelines. While generally git repos can
|
||||
have these problems with files in them too, the exposure seems larger when
|
||||
talking to some random web server than when pulling from a repo.
|
||||
|
||||
Also, cross filesystem compatibility is a concern. It used to allow "|" in
|
||||
the filename, but a bug pointed out that cannot be used on fat filesystems.
|
||||
And "\\" means different things on linux and windows, so probably best to avoid
|
||||
filenames containing it on linux too.
|
||||
|
||||
Finally, it's somewhat opinionated, since it replaces spaces with
|
||||
underscores. That's certainly the least defensible thing.
|
||||
|
||||
(git-annex may also truncate the filename if it's longer than what the
|
||||
filesystem supports.)
|
||||
|
||||
So, it's clearly wrong that it should be taken as-is without obfuscation,
|
||||
IMHO. Maybe there's a way to improve it to meet some use case though.
|
||||
|
||||
I could see having a config that avoids sanitizing the filename, but
|
||||
makes addurl fail if the filename looks like a security problem.
|
||||
|
||||
Though that has the downside that git-annex would then need to
|
||||
comprehensively track, going forward, all the ways that people find to make
|
||||
filenames be a security problem; the current method, by being strict in
|
||||
what it lets through, probably limits exploits to ones involving a) unicode
|
||||
or b) the user's wetware.
|
||||
"""]]
|
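The "strict in what it lets through" approach from the comment above can be sketched as an allow-list. This is an illustrative Python sketch under stated assumptions, not the actual `sanitizeFilePath` from git-annex: the function name `sanitize_strict` and the ASCII-only allow-list are made up for the example.

```python
import re

# Assumption for this sketch: only ASCII letters, digits, '.' and '_' pass.
_ALLOWED = re.compile(r"[A-Za-z0-9._]")


def sanitize_strict(name):
    """Replace every character outside a small allow-list with '_'."""
    cleaned = "".join(c if _ALLOWED.fullmatch(c) else "_" for c in name)
    # A result made only of dots ("", ".", "..") would still be unusable
    # as a filename, so replace it with underscores.
    if set(cleaned) <= {"."}:
        cleaned = "_" * max(len(cleaned), 1)
    return cleaned


for risky in ["../etc/passwd", "-rf", "a b\nc", "a|b\\c", "\x1b[31mred"]:
    print(repr(risky), "->", repr(sanitize_strict(risky)))
```

An allow-list does not need to enumerate each dangerous input (path separators, leading dashes, newlines, escape sequences, `|`, `\`); anything not explicitly permitted is replaced. The cost is the one raised in this bug: harmless characters such as a non-leading '-' are replaced too.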
|
@ -0,0 +1,26 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 2"""
|
||||
date="2020-05-08T18:19:20Z"
|
||||
content="""
|
||||
`git-annex import` does not do any sanitization, and that could be
|
||||
considered inconsistent, particularly when importing from a remote like S3.
|
||||
|
||||
A difference with that is, it creates a remote tracking branch for the
|
||||
imported files. (That happens to avoid "../" path traversal because git
|
||||
generally avoids it.) Maybe the real difference is, import from a special
|
||||
remote is completely analogous to fetching from a git remote. So it feels
|
||||
different to me than adding an url does.
|
||||
|
||||
If I sync with an S3 bucket and it turns out it imported an escape sequence
|
||||
file, well I could have looked at the bucket first, or imported and
|
||||
reviewed the branch before merging it. And if I was syncing with a git
|
||||
remote the same thing could happen. So it feels like I should have no
|
||||
expectation git-annex would protect me. Whereas, if I add an url and the
|
||||
web server uses an obscure-ish http header to surprise me with a similar
|
||||
malicious filename, I had no way beforehand to know that would happen, and
|
||||
so it does feel like git-annex should protect me.
|
||||
|
||||
(Although if git did prevent that, git-annex should too, and I'd be
|
||||
fine with git preventing that.)
|
||||
"""]]
|
|
@ -0,0 +1,18 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 3"""
|
||||
date="2020-05-08T19:56:27Z"
|
||||
content="""
|
||||
Implemented git-annex addurl --preserve-filename, which will do what you
|
||||
want.
|
||||
|
||||
Leaving this bug open because I only implemented it for web urls, not yet
|
||||
for torrents and other special remotes that have their own url scheme.
|
||||
The sanitization for those is currently done at a lower level than addurl,
|
||||
and so that will take a bit more work to implement.
|
||||
|
||||
(importfeed does not, I think, need to implement this option, because
|
||||
the filenames are based on information from the rss feed, and it's
|
||||
perfectly fine to sanitize eg a podcast episode title to get a reasonable
|
||||
filename.)
|
||||
"""]]
|
|
@ -0,0 +1,20 @@
|
|||
[[!comment format=mdwn
|
||||
username="yarikoptic"
|
||||
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
|
||||
subject="comment 4"
|
||||
date="2020-05-09T22:10:43Z"
|
||||
content="""
|
||||
> If the filename contains an ANSI escape sequence, it could potentially lead to a security hole.
|
||||
> ... As could a filename that contains a newline, which will break large quantities of shell pipelines.
|
||||
|
||||
IMHO those indeed are ok to target for sanitization
|
||||
|
||||
> Or if the filename starts with \"-\" it could be somewhere between a possible security hole and just very annoying to work with.
|
||||
|
||||
So why not sanitize it only at the beginning of the filename?
|
||||
`-` is a very common and a safe character to use within filename. For that matter we VERY frequently use `-` in filenames. It even became part of our BIDS standard in neuroimaging: https://bids-specification.readthedocs.io where we separate `_key` from `value`, e.g.in ` . I really do not see why git-annex should so aggressively sanitize filenames as replacing \"-\" within filenames -- it makes nothing more secure or convenient.
|
||||
|
||||
> While generally git repos can have these problems with files in them too, the exposure seems larger when talking to some random web server than when pulling from a repo.
|
||||
|
||||
Well, not sure about ansi characters and new line symbols, but typically files are saved by the browsers with the name suggested by the server.
|
||||
"""]]
|
|
@ -0,0 +1,15 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 5"""
|
||||
date="2020-05-11T17:20:07Z"
|
||||
content="""
|
||||
I agree that it may as well allow non-leading '-'. But, if you are relying
|
||||
on getting the unsanitized filename generally, you should use
|
||||
--preserve-filename
|
||||
|
||||
Web browsers do do some sanitization, particularly of '/'.
|
||||
Chrome removes leading "." as well. Often files are downloaded
|
||||
without the user confirming it. I suspect there is enough insecurity
|
||||
in that area that someone could make a living injecting bitcoin miners into
|
||||
dotfiles.
|
||||
"""]]
|
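The relaxation agreed on above, allowing '-' everywhere except at the start of the name, can be sketched as follows. This is a hypothetical Python illustration, not the code that went into git-annex; `sanitize_allow_inner_dash` and its ASCII-only allow-list are assumptions made for the example.

```python
def sanitize_allow_inner_dash(name):
    """Keep ASCII letters, digits, '.', '_' and any '-' that is not leading."""
    out = []
    for index, char in enumerate(name):
        if char.isascii() and (char.isalnum() or char in "._"):
            out.append(char)
        elif char == "-" and index > 0:
            # A dash inside the name cannot be mistaken for a command option.
            out.append(char)
        else:
            out.append("_")
    return "".join(out)


print(sanitize_allow_inner_dash("sub-mouse-AAYYT_ses-20180420.nwb"))
print(sanitize_allow_inner_dash("-rf"))
```

Only a leading '-' can be misread as a command-line option, so this keeps BIDS-style names intact while still neutralizing names like `-rf`.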
|
@ -0,0 +1,6 @@
|
|||
While running `git-annex addurl --batch --with-files --jobs 10 --json --json-error-messages --json-progress --raw`, I occasionally run into files that fail to download for no discernable reason, and the `"error-messages"` key in the output from the command is an empty list. This makes it hard to figure out exactly why the download is failing.
|
||||
|
||||
[[!meta author=jwodder]]
|
||||
[[!tag projects/dandi]]
|
||||
|
||||
> [[fixed|done]] --[[Joey]]
|
|
@ -0,0 +1,16 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 1"""
|
||||
date="2021-10-27T16:23:52Z"
|
||||
content="""
|
||||
Is it reproducible with a particular url? Does it only happen with -J?
|
||||
|
||||
Version would also be good to know. There were recent relevant
|
||||
changes eg [[!commit 4f42292b13dc5a6664eeb19b5c9d48991eaef292]].
|
||||
|
||||
I've spent a while hunting for a code path where it fails without
|
||||
displaying a warning, and have not found one. Since the code in addurl
|
||||
is structured as return Nothing and hopefully display a warning
|
||||
beforehand, rather than as throw an error, it's certainly possible that
|
||||
happens.
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="jwodder"
|
||||
avatar="http://cdn.libravatar.org/avatar/b06e01332c949b895c681cc92934f36a"
|
||||
subject="comment 2"
|
||||
date="2021-10-27T18:16:43Z"
|
||||
content="""
|
||||
It appears that the problem occurs whenever one tries to download the same URL to two different paths at the same time. When this occurs, one of the downloads fails, and though its \"error-messages\" is empty, its \"notes\" field reads, \"transfer already in progress, or unable to take transfer lock\".
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="jwodder"
|
||||
avatar="http://cdn.libravatar.org/avatar/b06e01332c949b895c681cc92934f36a"
|
||||
subject="comment 3"
|
||||
date="2021-10-27T18:19:23Z"
|
||||
content="""
|
||||
As to your questions, I am using git-annex 8.20211011 on macOS 11.6. The problem does not occur when the `--jobs` option is omitted, but that's not viable for the current project we're using git-annex for.
|
||||
"""]]
|
|
@ -0,0 +1,14 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 4"""
|
||||
date="2021-10-27T18:40:48Z"
|
||||
content="""
|
||||
Aha, that makes sense! addurl constructs a url-based Key to use while
|
||||
downloading, and the key transfer machinery prevents redundant downloads
|
||||
of the same Key at the same time.
|
||||
|
||||
Arguably, the problem is not where the message gets put, but that
|
||||
it fails when adding an url to two different paths at the same time.
|
||||
|
||||
I have, though, moved that message so it will appear in error-messages.
|
||||
"""]]
|
|
@ -0,0 +1,26 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 5"""
|
||||
date="2021-10-27T18:56:23Z"
|
||||
content="""
|
||||
The best solution I can find is for it to notice when another thread is
|
||||
downloading the same url, and wait until it finishes. Then proceed
|
||||
with downloading the url for a second time.
|
||||
|
||||
It's not very satisfying to re-download. But once the url Key is downloaded,
|
||||
it does not keep that url Key populated, but hashes the content and moves
|
||||
the content to the final Key. It would be a real complication to
|
||||
communicate, across threads, what Key the content ended up at, and have the
|
||||
waiting thread use that. And addurl is already complicated well beyond a
|
||||
point I am comfortable with.
|
||||
|
||||
Also, the content of an url can of course change over time. If I feed
|
||||
"$url foo" into git-annex addurl --batch -J10 and then some time
|
||||
later, I feed "$url bar", I might expect that file bar gets whatever
|
||||
content the url has now, not the content that the url had back when I added
|
||||
the same url to file foo. And if I cared about avoiding re-downloading,
|
||||
I could add the url to the first file, and then copy the annex link to the
|
||||
second file myself.
|
||||
|
||||
Implemented this approach.
|
||||
"""]]
|
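The approach described in the comment above (a second request for the same url waits for the first to finish, then downloads the url again) can be sketched with one lock per url. This is a Python illustration of the idea only; git-annex is written in Haskell and its transfer locking works differently. `PerUrlDownloader`, `fake_fetch` and the example url are hypothetical names invented for the sketch.

```python
import threading
import time
from collections import defaultdict


class PerUrlDownloader:
    """Serialize downloads of the same url; different urls may run concurrently."""

    def __init__(self, fetch):
        self._fetch = fetch                    # hypothetical callable: url -> bytes
        self._guard = threading.Lock()         # protects the table of per-url locks
        self._locks = defaultdict(threading.Lock)

    def download(self, url, dest, results):
        with self._guard:
            url_lock = self._locks[url]
        # A second thread asking for the same url blocks here until the first
        # one is done, then performs its own download instead of failing.
        with url_lock:
            results[dest] = self._fetch(url)


stats = {"active": 0, "max_active": 0, "calls": 0}
stats_lock = threading.Lock()


def fake_fetch(url):
    """Stand-in for a real download that records how many run at once."""
    with stats_lock:
        stats["active"] += 1
        stats["calls"] += 1
        stats["max_active"] = max(stats["max_active"], stats["active"])
    time.sleep(0.05)
    with stats_lock:
        stats["active"] -= 1
    return b"content of " + url.encode()


downloader = PerUrlDownloader(fake_fetch)
results = {}
url = "https://example.invalid/file"
threads = [threading.Thread(target=downloader.download, args=(url, dest, results))
           for dest in ("foo", "bar")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results), stats["calls"], stats["max_active"])
```

Both destinations end up populated, the url is fetched twice, and the two fetches never overlap. That is the trade-off the comment accepts: a redundant download in exchange for not having to pass the final Key between threads.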
|
@ -0,0 +1,21 @@
|
|||
### Please describe the problem.
|
||||
|
||||
This is a continuation to the [prior report/discussion](https://git-annex.branchable.com/bugs/leaks_git_config_error_message_upon_inability_to_read_downloaded___34__config__34___file/#comment-424548e59fc41618ffeeb65f418694b3) to facilitate access to private repositories on public hosting portals.
|
||||
|
||||
Setting aside the more odd/custom behavior of gitlab etc installations, which forward to a login screen (thus no 401 or 404 response) upon an attempt to access something that might be within a private repo, the situation with github and gogs (a github clone, which powers gin, which I had [mentioned](https://git-annex.branchable.com/bugs/leaks_git_config_error_message_upon_inability_to_read_downloaded___34__config__34___file/#comment-ec2193d97bb19945ad74cee13f747b35) in that prior discussion) is different: they return a 404 response. And I think (I didn't check the git code, this is just based on its behavior) `git` then asks for credentials as the "next way to try". I think git-annex should do the same: if a 404 is received, ask `git credential` to fill in credentials for that domain (as it does now in the case of 401).
|
||||
|
||||
### What steps will reproduce the problem?
|
||||
|
||||
Try to clone and get data from a private repository on [https://gin.g-node.org/](https://gin.g-node.org/) (repo could be created, or let me know and I would create one, but you would still need to register there). I am not yet 100% certain that upon authentication you would be able to fetch that `/config` (haven't tried). Satellite issue/discussion I just initiated on gin is [here](https://github.com/G-Node/gogs/issues/111)
|
||||
|
||||
### What version of git-annex are you using? On what operating system?
|
||||
|
||||
8.20201127+git54-ga1b227171-1~ndall+1
|
||||
|
||||
|
||||
edit 1: although a deeper look into how/why git decides to ask for credentials for private repos is probably due. Maybe a similar check should be done by git-annex first, since otherwise there might be no way to tell it apart from a "proper" 404 for inability to get `/config` from github
|
||||
|
||||
[[!meta author=yoh]]
|
||||
[[!tag projects/dandi]]
|
||||
|
||||
> [[notabug|done]] --[[Joey]]
|
|
@ -0,0 +1,17 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 1"""
|
||||
date="2021-01-21T16:57:06Z"
|
||||
content="""
|
||||
The git source code does not appear to behave
|
||||
like that, see http.c `normalize_curl_result`, which reauths on 401, but
|
||||
not on 404. If you think git behaves like this, you need to show an example
|
||||
where it clearly accesses an url that is 404 and goes on to authenticate.
|
||||
|
||||
Seems to me that these hosting sites may simply not be exposing foo.git/config
|
||||
to http. Git does not request that file over http. Such a hosting site would
|
||||
probably also not expose foo.git/annex/ over http, so git-annex would not be
|
||||
able to use it anyway. To support git-annex, it would need to
|
||||
expose both, and then git-annex's handling of 401 should work fine for
|
||||
authentication.
|
||||
"""]]
|
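The behavior described in the comment above (re-authenticate on 401, give up on 404) can be sketched as a small retry wrapper. This is a Python illustration of that policy, not a translation of git's `http.c`. The names `fetch_with_reauth`, `make_fake_server` and `ask_credentials`, and the example url, are hypothetical.

```python
def fetch_with_reauth(get, ask_credentials, url):
    """Fetch url; retry once with credentials only if the server answers 401.

    get(url, credentials) returns (status, body);
    ask_credentials(url) returns whatever credentials object get() accepts.
    """
    status, body = get(url, None)
    if status == 401:
        # Only an explicit "Unauthorized" leads to a credential prompt.
        status, body = get(url, ask_credentials(url))
    return status, body


def make_fake_server(unauthenticated_status):
    """Build a fake get() that hides a private file behind the given status."""
    def get(url, credentials):
        if credentials == ("user", "secret"):
            return 200, "config contents"
        return unauthenticated_status, ""
    return get


prompts = []


def ask_credentials(url):
    prompts.append(url)
    return ("user", "secret")


url = "https://example.invalid/foo.git/config"
print(fetch_with_reauth(make_fake_server(401), ask_credentials, url))
print(fetch_with_reauth(make_fake_server(404), ask_credentials, url))
print(len(prompts))
```

A server that answers 404 for private content never triggers the credential prompt, so a client following this policy cannot reach the file even with valid credentials.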
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="yarikoptic"
|
||||
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
|
||||
subject="comment 2"
|
||||
date="2021-01-21T18:36:50Z"
|
||||
content="""
|
||||
a quick one: https://gin.g-node.org/ does expose `foo.git/annex/` -- that is what gin has extended original borg with. Example repo to try on https://gin.g-node.org/ljchang/Sherlock . The problem/difficulty is only in access to \"private\" repositories -- access to config and annexed files is working fine through http
|
||||
"""]]
|
|
@ -0,0 +1,22 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 3"""
|
||||
date="2021-01-21T19:20:00Z"
|
||||
content="""
|
||||
It still seems easy to demonstrate that git does not ask for creds on 404:
|
||||
|
||||
joey@darkstar:~> git clone http://google.com/this-url-does-not-exist
|
||||
Cloning into 'this-url-does-not-exist'...
|
||||
fatal: repository 'http://google.com/this-url-does-not-exist/' not found
|
||||
|
||||
So I need you to show me what makes you think that git does such a strange
|
||||
thing, before I can take seriously a request to replicate that behavior in
|
||||
git-annex. Because the only possible reason I would implement such an
|
||||
insane thing is if git has lost its collective mind and so I needed to
|
||||
follow into the abyss.
|
||||
|
||||
If the actual issue is that gogs has implemented support for git-annex,
|
||||
but that it sends 404 when git-annex requests config from a
|
||||
private repo, rather than 401, it seems to me the place to fix that is in
|
||||
gogs.
|
||||
"""]]
|
|
@ -0,0 +1,112 @@
|
|||
[[!comment format=mdwn
|
||||
username="yarikoptic"
|
||||
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
|
||||
subject="comment 4"
|
||||
date="2021-01-22T01:47:46Z"
|
||||
content="""
|
||||
yeap, it is not about 404 ...
|
||||
|
||||
<details>
|
||||
<summary>with gogs/gin situation is obscure but \"easyish\" - 401 is returned upon access to `/info/refs` but not above:</summary>
|
||||
|
||||
```shell
|
||||
$> wget -S \"https://gin.g-node.org/SakshamSharda/ophys_testing1.git/info/refs\"
|
||||
--2021-01-21 20:37:22-- https://gin.g-node.org/SakshamSharda/ophys_testing1.git/info/refs
|
||||
Resolving gin.g-node.org (gin.g-node.org)... 141.84.41.219
|
||||
Connecting to gin.g-node.org (gin.g-node.org)|141.84.41.219|:443... connected.
|
||||
HTTP request sent, awaiting response...
|
||||
HTTP/1.1 401 Unauthorized
|
||||
Date: Fri, 22 Jan 2021 01:37:23 GMT
|
||||
Server: Apache/2.4.38 (Debian)
|
||||
content-type: text/plain
|
||||
www-authenticate: Basic realm=\".\"
|
||||
content-length: 0
|
||||
set-cookie: lang=en-US; Path=/; Max-Age=2147483647
|
||||
set-cookie: gnode_gin=823b677f19feb8ef; Path=/; HttpOnly
|
||||
set-cookie: _csrf=GrekbiqDJleLLNcVyax5z77buGY6MTYxMTI3OTQ0MzYwMTMyMzE4NQ; Path=/; Expires=Sat, 23 Jan 2021 01:37:23 GMT
|
||||
Keep-Alive: timeout=5, max=100
|
||||
Connection: Keep-Alive
|
||||
|
||||
Username/Password Authentication Failed.
|
||||
1 51975 ->6 [2].....................................:Thu 21 Jan 2021 08:37:23 PM EST:.
|
||||
(git)lena:~/proj/misc/git[master]git
|
||||
$> wget -S \"https://gin.g-node.org/SakshamSharda/ophys_testing1.git/info\"
|
||||
--2021-01-21 20:37:52-- https://gin.g-node.org/SakshamSharda/ophys_testing1.git/info
|
||||
Resolving gin.g-node.org (gin.g-node.org)... 141.84.41.219
|
||||
Connecting to gin.g-node.org (gin.g-node.org)|141.84.41.219|:443... connected.
|
||||
HTTP request sent, awaiting response...
|
||||
HTTP/1.1 404 Not Found
|
||||
Date: Fri, 22 Jan 2021 01:37:53 GMT
|
||||
Server: Apache/2.4.38 (Debian)
|
||||
content-type: text/html; charset=UTF-8
|
||||
set-cookie: lang=en-US; Path=/; Max-Age=2147483647
|
||||
set-cookie: gnode_gin=26d42c5108c8715d; Path=/; HttpOnly
|
||||
set-cookie: _csrf=SAKUL4rdspufTb_lxEWIijnzYBU6MTYxMTI3OTQ3Mjk5MDczODgzMA; Path=/; Expires=Sat, 23 Jan 2021 01:37:52 GMT
|
||||
Keep-Alive: timeout=5, max=100
|
||||
Connection: Keep-Alive
|
||||
Transfer-Encoding: chunked
|
||||
2021-01-21 20:37:53 ERROR 404: Not Found.
|
||||
|
||||
|
||||
```
|
||||
</details>
|
||||
|
||||
|
||||
github is ... trickier, or to say -- my C/gdb/whatever foo is not good enough, since
|
||||
|
||||
<details>
|
||||
<summary>it is still 404 with simple wget but git remote-https seems to get 401:</summary>
|
||||
|
||||
```shell
|
||||
(gdb) p results
|
||||
$15 = {curl_result = CURLE_HTTP_RETURNED_ERROR, http_code = 401, auth_avail = 1, http_connectcode = 0}
|
||||
(gdb) p rl
|
||||
No symbol \"rl\" in current context.
|
||||
(gdb) p url
|
||||
$16 = 0x5555557a4450 \"https://github.com/yarikoptic/abcd-testds2/info/refs?service=git-upload-pack\"
|
||||
(gdb) bt
|
||||
#0 http_request (url=0x5555557a4450 \"https://github.com/yarikoptic/abcd-testds2/info/refs?service=git-upload-pack\",
|
||||
result=<optimized out>, target=<optimized out>, options=0x7fffffffd920) at http.c:1981
|
||||
#1 0x00005555555665bf in http_request_reauth (
|
||||
url=0x5555557a4450 \"https://github.com/yarikoptic/abcd-testds2/info/refs?service=git-upload-pack\", result=0x7fffffffd880,
|
||||
target=0, options=0x7fffffffd920) at http.c:2040
|
||||
#2 0x000055555555f7f3 in discover_refs (service=<optimized out>, service@entry=0x5555556b622c \"git-upload-pack\",
|
||||
for_push=for_push@entry=0) at remote-curl.c:493
|
||||
#3 0x000055555556137e in get_refs (for_push=<optimized out>) at remote-curl.c:548
|
||||
#4 cmd_main (argc=argc@entry=3, argv=argv@entry=0x7fffffffdcd8) at remote-curl.c:1523
|
||||
#5 0x000055555555ee94 in main (argc=3, argv=0x7fffffffdcd8) at common-main.c:52
|
||||
|
||||
```
|
||||
|
||||
```
|
||||
$> wget --header \"Git-Protocol: version=2\" --header \"Pragma: no-cache\" -S 'https://github.com/yarikoptic/abcd-testds2/info/refs?service=git-upload-pack'
|
||||
--2021-01-21 20:41:21-- https://github.com/yarikoptic/abcd-testds2/info/refs?service=git-upload-pack
|
||||
Resolving github.com (github.com)... 140.82.114.3
|
||||
Connecting to github.com (github.com)|140.82.114.3|:443... connected.
|
||||
HTTP request sent, awaiting response...
|
||||
HTTP/1.1 404 Not Found
|
||||
Server: GitHub.com
|
||||
Date: Fri, 22 Jan 2021 01:41:21 GMT
|
||||
Content-Type: text/plain; charset=utf-8
|
||||
Status: 404 Not Found
|
||||
Vary: X-PJAX, Accept-Encoding, Accept, X-Requested-With
|
||||
Cache-Control: no-cache
|
||||
Strict-Transport-Security: max-age=31536000; includeSubdomains; preload
|
||||
X-Frame-Options: deny
|
||||
X-Content-Type-Options: nosniff
|
||||
X-XSS-Protection: 1; mode=block
|
||||
Referrer-Policy: origin-when-cross-origin, strict-origin-when-cross-origin
|
||||
Expect-CT: max-age=2592000, report-uri=\"https://api.github.com/_private/browser/errors\"
|
||||
Content-Security-Policy: default-src 'none'; base-uri 'self'; connect-src 'self'; form-action 'self'; img-src 'self' data:; script-src 'self'; style-src 'unsafe-inline'
|
||||
Set-Cookie: _gh_sess=UoF3mYOvfYf5mFbK1tr7aWOuYpQbNoJVhajA5nr2ANUvg%2FekQjtgh0h3xLva0EcwHnLNNsl7VMEdVLXNGi9Yn4AbjrBxX0sdo51DL1XQYR%2Bm3ZeS71I7keexEnrZspp%2FQxaT7cJpceXr7ZrKg2HwJu8dMo%2Bcz13Vr%2F9p7MtZ6cIjUMMF3ql8GX%2BYO949RdgS31KNBb1Ln917v7GlLaZhbejgGAYJOFI2YMuWhs3WkZxOZCMy1JnW%2Bbp3OcdyffBt0ToaKaLcUx1mt6kzzOb4Ow%3D%3D--FD5dTEIs8HUBjIdH--P%2B86pTRJ%2FwWUndICVXAaNA%3D%3D; Path=/; HttpOnly; Secure; SameSite=Lax
|
||||
Set-Cookie: _octo=GH1.1.1513753117.1611279681; Path=/; Domain=github.com; Expires=Sat, 22 Jan 2022 01:41:21 GMT; Secure; SameSite=Lax
|
||||
Set-Cookie: logged_in=no; Path=/; Domain=github.com; Expires=Sat, 22 Jan 2022 01:41:21 GMT; HttpOnly; Secure; SameSite=Lax
|
||||
Content-Length: 9
|
||||
X-GitHub-Request-Id: 8F40:2881:CD3AD3:1222997:600A2D41
|
||||
2021-01-21 20:41:21 ERROR 404: Not Found.
|
||||
```
|
||||
|
||||
</details>
|
||||
|
||||
but overall the point is that git does seem to get 401 with auth availability (although I failed to dig out how exactly it gets it). So I will leave it to the experts to figure out how
|
||||
"""]]
|
|
@ -0,0 +1,29 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 5"""
|
||||
date="2021-01-22T18:36:50Z"
|
||||
content="""
|
||||
These possibilities seem about equally likely to me:
|
||||
|
||||
1. gogs has not implemented authed access to the files git-annex needs
|
||||
for private repositories
|
||||
2. gogs has a bug where it returns 404 rather than 401 when not authed,
|
||||
but serves the files up when authed.
|
||||
|
||||
So why try to work around it in git-annex when it's a coin flip whether
|
||||
git-annex can at all, when in either case there's clearly a bug in gogs,
|
||||
and is specifically in code in gogs that is intended to support git-annex?
|
||||
|
||||
github has a bad habit of using user-agent to make urls do different
|
||||
things when git accesses them than when other http clients do. That is the
|
||||
case in your example; use wget -U git/1 and it will 401. But I don't
|
||||
see how that's relevant, since git-annex does not talk to github except for
|
||||
a) via git and b) via its git-lfs implementation (which supports http basic
|
||||
auth although I can't remember if I tested it against github's server or only
|
||||
other servers like gitlab).
|
||||
|
||||
If github's lfs endpoint did do user-agent sniffing, IMHO that would
|
||||
violate their spec, but also yeah, I'd probably put in some appropriately
|
||||
snarky fake user-agent in git-annex there. But not in general, and none of
|
||||
this says git-annex should be treating 404 like 401.
|
||||
"""]]
|
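The `wget -U git/1` check mentioned in the comment above can also be scripted. The following Python sketch uses only the standard library. `build_request` and `probe_status` are hypothetical helper names, and `probe_status` makes a real network request, so it is defined here but not called. The statuses to expect are the ones reported in this thread, not something this sketch verifies.

```python
import urllib.error
import urllib.request


def build_request(url, user_agent):
    """Create a request that presents the given User-Agent string."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def probe_status(url, user_agent):
    """Return the HTTP status code for url (performs a network request)."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent)) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is still available.
        return err.code


# Url taken from the gdb session earlier in this thread.
url = ("https://github.com/yarikoptic/abcd-testds2/info/refs"
       "?service=git-upload-pack")
request = build_request(url, "git/1")
print(request.get_header("User-agent"))
```

Calling `probe_status(url, "git/1")` and `probe_status(url, "Wget/1.20")` side by side would show whether a server varies its answer by user-agent.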
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="yarikoptic"
|
||||
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
|
||||
subject="comment 6"
|
||||
date="2021-01-25T15:18:39Z"
|
||||
content="""
|
||||
THANK YOU Joey. That is indeed quite odd (\"security through obscurity\") behavior from github (note: github returns 401 even if that repo does not exist, so it is at least consistent in not revealing presence/absence of private repos at a url). Feel welcome to close this issue since I guess nothing should indeed be done on git-annex side, and ideally `gin` portal just returns 401 in such cases
|
||||
"""]]
|
|
@ -0,0 +1,11 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 7"""
|
||||
date="2021-01-28T16:37:59Z"
|
||||
content="""
|
||||
github's rationale for the sniffing, such as it is, is that an url to a
|
||||
git repository lets you view it in the web ui, and the same url can be
|
||||
cloned by git.
|
||||
|
||||
Agreed, I'll close this in git-annex, and they can fix it in gin.
|
||||
"""]]
|
|
@ -0,0 +1,88 @@
|
|||
### Please describe the problem.
|
||||
|
||||
decided to test annex on a new to me file system -- beegfs
|
||||
|
||||
```
|
||||
$> mount | grep beegfs
|
||||
beegfs_nodev on /mnt/beegfs type beegfs (rw,relatime,cfgFile=/etc/beegfs/beegfs-client.conf,_netdev)
|
||||
|
||||
```
|
||||
|
||||
```
|
||||
$> modinfo beegfs
|
||||
filename: /lib/modules/5.4.0-77-generic/updates/fs/beegfs_autobuild/beegfs.ko
|
||||
version: 7.2.2
|
||||
alias: fs-beegfs
|
||||
author: Fraunhofer ITWM, CC-HPC
|
||||
description: BeeGFS parallel file system client (http://www.beegfs.com)
|
||||
license: GPL v2
|
||||
srcversion: 533BB7E5866E52F63B9ACCB
|
||||
depends: ib_core,rdma_cm
|
||||
retpoline: Y
|
||||
name: beegfs
|
||||
vermagic: 5.4.0-77-generic SMP mod_unload modversions
|
||||
|
||||
```
|
||||
|
||||
### What steps will reproduce the problem?
|
||||
|
||||
1. get beegfs
|
||||
|
||||
2.
|
||||
```
|
||||
leviathan:/mnt/beegfs/yoh/tmp
|
||||
$> TMPDIR=$PWD/annex-tmp git annex test
|
||||
```
|
||||
|
||||
|
||||
### What version of git-annex are you using? On what operating system?
|
||||
|
||||
```
|
||||
leviathan:/mnt/beegfs/yoh/tmp
|
||||
$> git annex version
|
||||
git-annex version: 8.20210621-g91f9aac
|
||||
build flags: Assistant Webapp Pairing Inotify DBus DesktopNotify TorrentParser MagicMime Feeds Testsuite S3 WebDAV
|
||||
dependency versions: aws-0.22 bloomfilter-2.0.1.0 cryptonite-0.26 DAV-1.3.4 feed-1.3.0.1 ghc-8.8.4 http-client-0.6.4.1 persistent-sqlite-2.10.6.2 torrent-10000.1.1 uuid-1.3.13 yesod-1.6.1.0
|
||||
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL X*
|
||||
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs httpalso borg hook external
|
||||
operating system: linux x86_64
|
||||
supported repository versions: 8
|
||||
upgrade supported from repository versions: 0 1 2 3 4 5 6 7
|
||||
```
|
||||
|
||||
### Please provide any additional information below.
|
||||
|
||||
looking in detail: it seems it is not init but addurl that fails (the subject is set in stone now, can't edit). I guess I got misled by the interleaved stdout/stderr:
|
||||
|
||||
[[!format sh """
|
||||
addurl: FAIL (2.79s)
|
||||
Init Tests
|
||||
init: ./Test/Framework.hs:57:
|
||||
addurl on file:///mnt/beegfs/yoh/tmp/.t/tmprepo96/myurl failed (transcript follows)
|
||||
(to _mnt_beegfs_yoh_tmp_.t_tmprepo96_myurl) git-annex: .git/annex/tmp/URL-s3--file&c%%%mnt%beegfs%yoh%tmp%.t%tmprepo96%myurl: renameFile:renamePath:rename: resource busy (Device or resource busy)failedaddurl: 1 failed
|
||||
|
||||
...
|
||||
addurl: FAIL (1.86s)
|
||||
./Test/Framework.hs:57:
|
||||
addurl on file:///mnt/beegfs/yoh/tmp/.t/tmprepo193/myurl failed (transcript follows)
|
||||
(to _mnt_beegfs_yoh_tmp_.t_tmprepo193_myurl) git-annex: .git/annex/tmp/URL-s3--file&c%%%mnt%beegfs%yoh%tmp%.t%tmprepo193%myurl: renameFile:renamePath:rename: resource busy (Device or resource busy)failedaddurl: 1 failed
|
||||
Init Tests
|
||||
...
|
||||
addurl: FAIL (2.29s)
|
||||
./Test/Framework.hs:57:
|
||||
addurl on file:///mnt/beegfs/yoh/tmp/.t/tmprepo293/myurl failed (transcript follows)
|
||||
(to _mnt_beegfs_yoh_tmp_.t_tmprepo293_myurl) git-annex: .git/annex/tmp/URL-s3--file&c%%%mnt%beegfs%yoh%tmp%.t%tmprepo293%myurl: renameFile:renamePath:rename: resource busy (Device or resource busy)failedaddurl: 1 failed
|
||||
|
||||
3 out of 984 tests failed (1776.96s)
|
||||
|
||||
"""]]
|
||||
|
||||
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
|
||||
|
||||
on days ending with `y` it seems to work quite nicely.
|
||||
|
||||
[[!meta author=yoh]]
|
||||
[[!tag projects/dandi]]
|
||||
|
||||
> [[fixed|done]], I think, though have not installed beegfs to test.
|
||||
> --[[Joey]]
|
|
@ -0,0 +1,23 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 1"""
|
||||
date="2021-07-02T14:26:34Z"
|
||||
content="""
|
||||
EBUSY: The rename fails because oldpath or newpath is a directory that is
in use by some process (perhaps as current working directory, or as root
directory, or because it was open for reading) or is in use by the system
(for example as mount point), while the system considers this an error.
(Note that there is no requirement to return EBUSY in such cases -- there
is nothing wrong with doing the rename anyway -- but it is allowed to
return EBUSY if the system cannot otherwise handle such situations.)
|
||||
|
||||
".git/annex/tmp/URL-s3--file&c%%%mnt%beegfs%yoh%tmp%.t%tmprepo193%myurl"
|
||||
is not a directory, it is a file. So, rename seems to have no business failing
|
||||
in this way. Probably the FS is buggy.
|
||||
"""]]
|
|
@ -0,0 +1,24 @@
|
|||
[[!comment format=mdwn
|
||||
username="yarikoptic"
|
||||
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
|
||||
subject="comment 2"
|
||||
date="2021-07-04T03:27:20Z"
|
||||
content="""
|
||||
Thank you Joey! Indeed, most likely a \"too fancy\" file system.
|
||||
|
||||
On [https://www.beegfs.io/release/beegfs_6/Changelog.txt](https://www.beegfs.io/release/beegfs_6/Changelog.txt) I found
|
||||
|
||||
|
||||
```
|
||||
== Changes in 6.11 (release date: 2017-05-26) ==
|
||||
|
||||
General Changes:
|
||||
|
||||
* client: Add option sysRenameEbusyAsXdev to return EXDEV instead of EBUSY if
|
||||
rename() is called on open files. (Tools like \"mv\" can handle EXDEV as return
|
||||
value.)
|
||||
```
|
||||
|
||||
Do you think EXDEV would be handled OK if that is the culprit? (Meanwhile I will let the beegfs users know as well; maybe they could try it.)
|
||||
|
||||
"""]]
|
|
@ -0,0 +1,50 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 3"""
|
||||
date="2021-07-05T16:18:39Z"
|
||||
content="""
|
||||
I've checked with strace, to see if the file was open while it was being
|
||||
renamed. Not that there is anything generally wrong with renaming an open
|
||||
file on a POSIX file system, but it would possibly be a problem on windows,
|
||||
where some forms of opening a file locks it in place. And apparently
|
||||
this filesystem is not trying to be very POSIX either.
|
||||
|
||||
413026 openat(AT_FDCWD, ".git/annex/tmp/URL-s3--file&c%%%tmp%foo", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = 17
|
||||
413026 write(17, "hi\n", 3) = 3
|
||||
413026 close(17) = 0
|
||||
...
|
||||
413026 openat(AT_FDCWD, ".git/annex/tmp/URL-s3--file&c%%%tmp%foo", O_RDONLY|O_NOCTTY|O_NONBLOCK) = 11
|
||||
413026 read(11, "hi\n", 8192) = 3
|
||||
...
|
||||
413026 openat(AT_FDCWD, ".git/annex/tmp/URL-s3--file&c%%%tmp%foo", O_RDONLY|O_NOCTTY|O_NONBLOCK <unfinished ...>
|
||||
413028 <... futex resumed>) = 0
|
||||
413026 <... openat resumed>) = 16
|
||||
...
|
||||
413026 read(16, "hi\n", 32752) = 3
|
||||
...
|
||||
413026 close(16) = 0
|
||||
...
|
||||
413026 rename(".git/annex/tmp/URL-s3--file&c%%%tmp%foo", "_tmp_foo") = 0
|
||||
...
|
||||
413028 close(11) = 0
|
||||
|
||||
So the file is left open across the rename, which ought to be able to be
|
||||
changed and would presumably fix the problem.
|
||||
|
||||
It's also a bit odd that the file gets read twice after being copied,
|
||||
once for checksum makes sense, but what's the other one?
|
||||
(Copying while checksumming should be able to avoid one of the reads,
|
||||
but there is an open todo tracking progress on that.)
|
||||
|
||||
Aah, the other read is when it's probing if the file is html in case it ought
|
||||
to be passed off to youtube-dl. That is the read that lingers for a while,
|
||||
because it's done with a lazy readFile and probing if the file is html doesn't
|
||||
read to the end and close it, so the file handle lingers until the GC gets
|
||||
around to closing it. Of course youtube-dl won't be able to do anything with a
|
||||
file url, but git-annex doesn't know that. And anyway the failure on this
|
||||
filesystem would also happen when adding a http url.
|
||||
|
||||
Ok, fixed it to close the handle promptly. That should fix the test suite.
|
||||
It does not seem unlikely that something else will break due to this
|
||||
filesystem's unusual behavior though.
|
||||
"""]]
|
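The POSIX behavior that comment 3 relies on (renaming a file while a read handle on it is still open) can be checked with a small shell sketch. It is only an illustration: the temp directory and the `src`/`dst` names are made up, and it does not reproduce git-annex's own code path. On a POSIX filesystem the rename is expected to succeed; the failures reported above suggest that beegfs would refuse it with EBUSY, which has not been verified here.

```shell
# Sketch: rename a file while a read descriptor on it is still open.
# All paths are throwaway names under a fresh temp directory.
dir=$(mktemp -d)
printf 'hi\n' > "$dir/src"

exec 3< "$dir/src"                                  # open for reading, keep fd 3
rename_status=ok
mv "$dir/src" "$dir/dst" || rename_status=failed    # rename with fd 3 still open

read -r line <&3                                    # the open handle still reads the content
exec 3<&-                                           # close the descriptor explicitly

echo "rename: $rename_status, read back: $line"
rm -rf "$dir"
```

Running the same snippet with `dir` pointed at a directory on the affected filesystem would be one way to confirm whether an open read handle alone is enough to trigger the error.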
|
@ -0,0 +1,10 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 4"""
|
||||
date="2021-07-05T17:17:59Z"
|
||||
content="""
|
||||
Also looked over other uses of readFile. While there are a couple that
|
||||
don't read the whole file and so may have a lag closing, none of them are
|
||||
files that are used in ways that seem likely to trigger this kind of
|
||||
problem.
|
||||
"""]]
|
|
@ -0,0 +1,28 @@
|
|||
### Please describe the problem.
|
||||
|
||||
Probably it is more of a todo than a bug.
|
||||
|
||||
### What steps will reproduce the problem?
|
||||
|
||||
This is a use-case where I am trying to establish a special remote to be shared by multiple unrelated repositories.
|
||||
|
||||
So I had an original repo1 in which I
|
||||
|
||||
- created an external special remote with chunking, it got UUID1
|
||||
- uploaded some data (all got chunked)
|
||||
|
||||
created repo2 in which I
|
||||
|
||||
- initialized special remote with identical settings and provided `uuid=UUID1`
|
||||
- decided to test if annex would be able to get a key from the shared special remote
|
||||
|
||||
but `annex fsck --key KEY --from remote --fast`, since it doesn't have an exact chunking list, just provides the special remote backend with the original full key only, which is obviously not found, and it reports failure. But I wondered -- couldn't `git-annex` just use the chunk size and "mint" possible chunked keys to test on the special remote, since it has all the information? After all, chunk keys AFAIK are deterministically minted and are pretty much just the original key "augmented" with `-S<chunksize>-C<chunkindex>`.
|
||||
|
||||
### What version of git-annex are you using? On what operating system?
|
||||
|
||||
8.20200908+git175-g95d02d6e2-1~ndall+1
|
||||
|
||||
[[!meta author=yoh]]
|
||||
[[!tag projects/dandi]]
|
||||
|
||||
> [[done]] --[[Joey]]
|
|
@ -0,0 +1,26 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 1"""
|
||||
date="2020-10-22T16:09:17Z"
|
||||
content="""
|
||||
Note that what you are trying to do will only work if the special remote
|
||||
is not encrypted.
|
||||
|
||||
As well as your use case, which seems very unusual, I think one other use
|
||||
case would be if a clone uploaded to the special remote, but never synced
|
||||
out its git-annex branch before being lost, and fsck --from
|
||||
remote is being run in another clone to reconstruct it. Currently it
|
||||
won't try chunks as none are recorded.
|
||||
|
||||
Speculatively trying the current remote's chunk config would handle the
|
||||
majority of cases, though wouldn't help if the other clone had adjusted the
|
||||
special remote's chunk size too.
|
||||
|
||||
There's some overhead, but it can check it last, and not check it if
|
||||
it's in the list of known chunks, so the overhead would only usually
|
||||
be paid if the content git-annex expected to be present had gone missing,
|
||||
which I think is rare enough to not care about.
|
||||
|
||||
(Also, this can only be done when the size of the key is known, so not
|
||||
eg addurl --relaxed keys.)
|
||||
"""]]
|
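The speculative "minting" discussed in the report and in comment 1 can be sketched in shell. This is not git-annex's implementation. It assumes that the `-S<chunksize>-C<chunkindex>` fields go right before the `--` separator of the key, that chunk numbering starts at 1, and that the key carries a `-s<size>` field (comment 1 notes the size must be known). The function name `candidate_chunk_keys` is made up for illustration.

```shell
# candidate_chunk_keys KEY CHUNKSIZE
# Prints one speculative chunk key per line (assumed layout, see above).
candidate_chunk_keys() (
    key=$1
    chunksize=$2
    fields=${key%%--*}      # everything before the first "--", e.g. SHA256E-s2500
    name=${key#*--}         # everything after it, e.g. abc.txt
    case $fields in
        *-s[0-9]*) ;;
        *) echo "key has no size field: $key" >&2; exit 1 ;;
    esac
    size=${fields#*-s}      # drop the backend name and "-s"
    size=${size%%-*}        # drop any later fields
    count=$(( (size + chunksize - 1) / chunksize ))   # round up
    i=1
    while [ "$i" -le "$count" ]; do
        printf '%s-S%s-C%s--%s\n' "$fields" "$chunksize" "$i" "$name"
        i=$((i + 1))
    done
)

candidate_chunk_keys SHA256E-s2500--abc.txt 1000
```

For the made-up key above this prints three candidates, `SHA256E-s2500-S1000-C1--abc.txt` through `...-C3--abc.txt`. A key without a size field is rejected, matching the caveat about `addurl --relaxed` keys.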
|
@ -0,0 +1,29 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 2"""
|
||||
date="2020-10-22T17:00:25Z"
|
||||
content="""
|
||||
Implemented that. But..
|
||||
|
||||
As implemented, there's nothing to make the chunk size get stored in the
|
||||
chunk log for a key, after it accesses its content using the configured
|
||||
chunk size.
|
||||
|
||||
So, changing the chunk= of the remote can prevent accessing content that
|
||||
was accessible before. Of course, avoiding that is why chunk sizes are
|
||||
logged in the first place.
|
||||
|
||||
Seems like maybe fsck --from should fix the chunk log? I think
|
||||
fsck would always need to be used, to fix up the location log, before any
|
||||
other commands rely on the data being in the special remote, so it seems
|
||||
fine to only fix the chunk log there.
|
||||
|
||||
But, also a bit unclear how fsck would find out when it needs to do this.
|
||||
It only needs to when the remote's configured chunk size is not
|
||||
listed in the chunk log. But that's also common after changing the chunk
|
||||
size of a remote. So it would have to mess around with checking the
|
||||
presence of chunk keys itself, which would be extra work and also ugly
|
||||
to implement.
|
||||
|
||||
I'm leaving this todo^Wbug open for now due to this.
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 3"""
|
||||
date="2020-10-22T17:36:12Z"
|
||||
content="""
|
||||
Ok, made it update the chunk log as needed while checking if chunks are
|
||||
present. So this is done.
|
||||
"""]]
|
|
@ -0,0 +1,119 @@
|
|||
### Please describe the problem.
|
||||
|
||||
I was trying to follow https://git-annex.branchable.com/special_remotes/git-lfs/ (only without any encryption), to store at least some data on github via LFS (e.g., for https://github.com/dandi-datasets/nwb_test_data).
|
||||
|
||||
Even though I do provide the URL in the `annex initremote` call, it is not stored within `remote.log`:
|
||||
|
||||
|
||||
[[!format sh """
|
||||
$> sudo rm -rf /tmp/testds2 && ( mkdir /tmp/testds2 && cd /tmp/testds2 && git init && git annex init && git annex initremote gh-lfs autoenable=true type=git-lfs url=git@github.com:yarikoptic/testds2.git encryption=none && git show git-annex:remote.log; )
|
||||
Initialized empty Git repository in /tmp/testds2/.git/
|
||||
init (scanning for unlocked files...)
|
||||
ok
|
||||
(recording state in git...)
|
||||
initremote gh-lfs ok
|
||||
(recording state in git...)
|
||||
c9132e68-e9d8-40b5-ba34-5d60a8b9c844 autoenable=true encryption=none name=gh-lfs type=git-lfs timestamp=1570642576.06742667s
|
||||
|
||||
"""]]
|
||||
|
||||
git annex 7.20190912-1~ndall+1
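A quick way to see which settings made it into the log is to scan the `remote.log` line for `name=value` pairs. The sketch below does this for the line shown above. The helper name `has_field` is made up, and in a real repository the line would come from `git show git-annex:remote.log` rather than being pasted in.

```shell
# has_field LINE NAME: succeed if the remote.log line contains a NAME=... setting.
has_field() {
    case " $1 " in
        *" $2="*) return 0 ;;
        *) return 1 ;;
    esac
}

# The remote.log line from the transcript above.
line='c9132e68-e9d8-40b5-ba34-5d60a8b9c844 autoenable=true encryption=none name=gh-lfs type=git-lfs timestamp=1570642576.06742667s'

for field in type encryption url; do
    if has_field "$line" "$field"; then
        echo "$field: recorded"
    else
        echo "$field: missing"
    fi
done
```

For that line, `type` and `encryption` are reported as recorded and `url` as missing.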
|
||||
|
||||
|
||||
If I just proceed, populate and copy some data via lfs (example uses datalad's `create-sibling-github` to create a new repo):
|
||||
|
||||
[[!format sh """
|
||||
$> ( cd /tmp/testds2 && touch 123 && git annex add 123 && git commit -m 'add 123' && datalad create-sibling-github -s origin testds2 && git push -u origin master && git annex copy --to=gh-lfs 123; git push origin git-annex; )
|
||||
add 123
|
||||
ok
|
||||
(recording state in git...)
|
||||
[master (root-commit) d2b2f52] add 123
|
||||
1 file changed, 1 insertion(+)
|
||||
create mode 120000 123
|
||||
[WARNING] Authentication failed using a token.
|
||||
.: origin(-) [https://github.com/yarikoptic/testds2.git (git)]
|
||||
'https://github.com/yarikoptic/testds2.git' configured as sibling 'origin' for <Dataset path=/tmp/testds2>
|
||||
Enumerating objects: 3, done.
|
||||
Counting objects: 100% (3/3), done.
|
||||
Delta compression using up to 4 threads
|
||||
Compressing objects: 100% (2/2), done.
|
||||
Writing objects: 100% (3/3), 307 bytes | 307.00 KiB/s, done.
|
||||
Total 3 (delta 0), reused 0 (delta 0)
|
||||
To github.com:yarikoptic/testds2.git
|
||||
* [new branch] master -> master
|
||||
Branch 'master' set up to track remote branch 'master' from 'origin'.
|
||||
copy 123 (to gh-lfs...)
|
||||
ok
|
||||
(recording state in git...)
|
||||
Enumerating objects: 19, done.
|
||||
Counting objects: 100% (19/19), done.
|
||||
Delta compression using up to 4 threads
|
||||
Compressing objects: 100% (15/15), done.
|
||||
Writing objects: 100% (19/19), 1.66 KiB | 567.00 KiB/s, done.
|
||||
Total 19 (delta 4), reused 0 (delta 0)
|
||||
remote: Resolving deltas: 100% (4/4), done.
|
||||
remote:
|
||||
remote: Create a pull request for 'git-annex' on GitHub by visiting:
|
||||
remote: https://github.com/yarikoptic/testds2/pull/new/git-annex
|
||||
remote:
|
||||
To github.com:yarikoptic/testds2.git
|
||||
* [new branch] git-annex -> git-annex
|
||||
|
||||
"""]]
|
||||
|
||||
On a new clone I get a complaint that `url=` is missing, and no data is fetched:
|
||||
|
||||
[[!format sh """
|
||||
$> sudo rm -rf testds2-clone && git clone git@github.com:yarikoptic/testds2.git testds2-clone && ( cd testds2-clone && git annex init && git annex get 123; )
|
||||
Cloning into 'testds2-clone'...
|
||||
remote: Enumerating objects: 22, done.
|
||||
remote: Counting objects: 100% (22/22), done.
|
||||
remote: Compressing objects: 100% (13/13), done.
|
||||
remote: Total 22 (delta 5), reused 21 (delta 4), pack-reused 0
|
||||
Receiving objects: 100% (22/22), done.
|
||||
Resolving deltas: 100% (5/5), done.
|
||||
123@
|
||||
init (merging origin/git-annex into git-annex...)
|
||||
(recording state in git...)
|
||||
(scanning for unlocked files...)
|
||||
Invalid command: 'git-annex-shell 'configlist' '/~/yarikoptic/testds2.git''
|
||||
You appear to be using ssh to clone a git:// URL.
|
||||
Make sure your core.gitProxy config option and the
|
||||
GIT_PROXY_COMMAND environment variable are NOT set.
|
||||
|
||||
Remote origin does not have git-annex installed; setting annex-ignore
|
||||
|
||||
This could be a problem with the git-annex installation on the remote. Please make sure that git-annex-shell is available in PATH when you ssh into the remote. Once you have fixed the git-annex installation, run: git annex enableremote origin
|
||||
(Auto enabling special remote gh-lfs...)
|
||||
|
||||
Specify url=
|
||||
ok
|
||||
(recording state in git...)
|
||||
get 123 (not available)
|
||||
Try making some of these repositories available:
|
||||
92ce3cfc-8c58-42db-8aa3-ea4d4b3a6011 -- yoh@hopa:/tmp/testds2
|
||||
c9132e68-e9d8-40b5-ba34-5d60a8b9c844 -- gh-lfs
|
||||
|
||||
(Note that these git remotes have annex-ignore set: origin)
|
||||
failed
|
||||
git-annex: get: 1 failed
|
||||
"""]]
|
||||
|
||||
So I had to `enableremote` it while providing the URL; after that I was able to `get` the file:
|
||||
|
||||
[[!format sh """
|
||||
$> git annex enableremote gh-lfs autoenable=true type=git-lfs url=git@github.com:yarikoptic/testds2.git encryption=none && git annex get 123
|
||||
enableremote gh-lfs ok
|
||||
(recording state in git...)
|
||||
get 123 (from gh-lfs...)
|
||||
(checksum...) ok
|
||||
(recording state in git...)
|
||||
"""]]
|
||||
|
||||
|
||||
Shouldn't that URL be recorded in remote.log? (similarly to `type=git` remotes)
|
||||
|
||||
[[!meta author=yoh]]
|
||||
[[!tag projects/dandi]]
|
||||
|
||||
> [[done]]; see my comment --[[Joey]]
|
|
@ -0,0 +1,24 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 1"""
|
||||
date="2019-10-21T19:07:42Z"
|
||||
content="""
|
||||
That is intentional, because a git-lfs remote can have multiple urls that
|
||||
can access it, and different users of the remote might want to use
|
||||
different urls.
|
||||
|
||||
It's also documented to work that way, the same as the directory
|
||||
special remote documents that you have to provide directory= each time it's
|
||||
enabled.
|
||||
|
||||
But, now that git-annex supports sameas remotes, it would be possible to
|
||||
have one special remote for each different url to a given git-lfs remote,
|
||||
and have git-annex know they're the same repository. The user can then
|
||||
enableremote whichever one they want.
|
||||
|
||||
See [[todo/git-lfs_special_remote_simpler_setup]] for where I hope this
|
||||
will lead.
|
||||
|
||||
Closing this bug report as redundant with that todo item, and not actually a
|
||||
bug since it is documented to behave the way it currently behaves.
|
||||
"""]]
|
|
@ -0,0 +1,49 @@
|
|||
### Please describe the problem.
|
||||
|
||||
I am trying to import (and then reimport) a directory into which I sync a box.com folder that was shared with me.
|
||||
I have used the `--duplicate` option to not delete the original files upon `import`. But then, upon rerunning the `import` command, git-annex errors out if the file already exists. `--reinject-duplicates` seems to be the option to use, but all those modes are "exclusive", so I cannot use `--duplicate --reinject-duplicates`, and using `--reinject-duplicates` alone would result in removing the original files (as without `--duplicate`).
|
||||
|
||||
### What version of git-annex are you using? On what operating system?
|
||||
|
||||
7.20190819+git2-g908476a9b-1~ndall+1
|
||||
|
||||
### Please provide any additional information below.
|
||||
|
||||
My little demo snippet for import using `--duplicate`, and then both options at the same time:
|
||||
|
||||
[[!format sh """
|
||||
$> mkdir /tmp/d-in /tmp/d-repo && touch /tmp/d-in/file && ( cd /tmp/d-repo && git init && git annex init && for r in 1 2; do echo "Run $r"; ls -l ../d-in && git annex import --duplicate ../d-in/.; done )
|
||||
Initialized empty Git repository in /tmp/d-repo/.git/
|
||||
init ok
|
||||
(recording state in git...)
|
||||
Run 1
|
||||
total 0
|
||||
-rw------- 1 yoh yoh 0 Oct 14 10:51 file
|
||||
import ./file ok
|
||||
(recording state in git...)
|
||||
Run 2
|
||||
total 0
|
||||
-rw------- 1 yoh yoh 0 Oct 14 10:51 file
|
||||
import ./file
|
||||
not overwriting existing ./file (is a symlink)
|
||||
failed
|
||||
git-annex: import: 1 failed
|
||||
|
||||
|
||||
$> cd d-repo
|
||||
$> git annex import ../d-in/. --reinject-duplicates --duplicate 2>&1 | head -n 3
|
||||
Invalid option `--duplicate'
|
||||
|
||||
Usage: git-annex COMMAND
|
||||
|
||||
"""]]
|
||||
|
||||
|
||||
Or maybe there is a better way to establish a re-runnable import-from-a-directory workflow?
|
||||
|
||||
[[!meta author=yoh]]
|
||||
[[!tag projects/dandi]]
|
||||
|
||||
[[!tag moreinfo]]
|
||||
|
||||
> [[done]] --[[Joey]]
|
|
@ -0,0 +1,13 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 1"""
|
||||
date="2019-11-19T17:12:41Z"
|
||||
content="""
|
||||
I think that you can accomplish what you want by making the directory
|
||||
you're importing from be a directory special remote with exporttree=yes
|
||||
importtree=yes and use the new `git annex import master --from remote`
|
||||
|
||||
If that does not do what you want, I'd prefer to look at making it be able
|
||||
to do so. I hope to eventually remove the legacy git-annex import from
|
||||
directory, since we have this new more general interface.
|
||||
"""]]
|
|
@ -0,0 +1,7 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 2"""
|
||||
date="2020-03-30T15:50:17Z"
|
||||
content="""
|
||||
Tagged moreinfo since I'm waiting on a reply to my suggestion.
|
||||
"""]]
|
|
@ -0,0 +1,59 @@
|
|||
[[!comment format=mdwn
|
||||
username="yarikoptic"
|
||||
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
|
||||
subject="comment 3"
|
||||
date="2020-10-06T01:26:59Z"
|
||||
content="""
|
||||
I think it worked wonderfully
|
||||
|
||||
<details>
|
||||
<summary>here is my script I have tried</summary>
|
||||
|
||||
```shell
|
||||
#!/bin/bash
|
||||
|
||||
export PS4='> '
|
||||
set -x
|
||||
set -eu
|
||||
cd \"$(mktemp -d ${TMPDIR:-/tmp}/dl-XXXXXXX)\"
|
||||
|
||||
mkdir d-in d-repo
|
||||
echo content >| d-in/file
|
||||
|
||||
function dance() {
|
||||
git annex import master --from d-in
|
||||
# but we need to merge it
|
||||
git merge d-in/master
|
||||
ls -l
|
||||
grep -e . *
|
||||
}
|
||||
|
||||
(
|
||||
cd d-repo
|
||||
git init
|
||||
git annex init
|
||||
git annex initremote d-in type=directory directory=../d-in exporttree=yes importtree=yes encryption=none
|
||||
|
||||
ls -l ../d-in
|
||||
|
||||
for r in 1 2; do
|
||||
echo \"Run $r\";
|
||||
dance
|
||||
done
|
||||
|
||||
echo \"more\" >> ../d-in/file
|
||||
echo \"new\" > ../d-in/newfile
|
||||
dance
|
||||
|
||||
rm ../d-in/file
|
||||
dance
|
||||
|
||||
)
|
||||
|
||||
```
|
||||
</details>
|
||||
|
||||
and it seemed to do the right job! I have not tried to add some `.gitattributes` into that branch it imports into to tell some files to go to git, but I hope it would just work, and if not -- I will come back! feel welcome to close this issue.
|
||||
|
||||
Cheers
|
||||
"""]]
|
|
@ -0,0 +1,70 @@
|
|||
### Please describe the problem.
|
||||
|
||||
|
||||
[original question raised by John](https://github.com/dandi/dandisets/issues/139#issuecomment-1149948239) which led me on a goose chase.
|
||||
|
||||
The following reproducer
|
||||
|
||||
```
|
||||
#!/bin/bash
|
||||
|
||||
cd "$(mktemp -d ${TMPDIR:-/tmp}/dl-XXXXXXX)"
|
||||
set -eux
|
||||
|
||||
git init --bare remote
|
||||
( cd remote; git annex init; cat config )
|
||||
rpath=$PWD/remote
|
||||
|
||||
git init repo
|
||||
cd repo
|
||||
git annex init
|
||||
echo 'This is test text.' > file.txt
|
||||
git add file.txt
|
||||
git commit -m Init file.txt
|
||||
|
||||
git remote add --fetch remote-git $rpath
|
||||
|
||||
# without this -- there is no annex-uuid for remote -- git-annex branch is not getting merged
|
||||
git annex info
|
||||
|
||||
cat .git/config
|
||||
|
||||
# but this still fails
|
||||
git annex initremote testremote type=git location=$rpath autoenable=true
|
||||
|
||||
```
|
||||
|
||||
ends with
|
||||
|
||||
```
|
||||
[remote "remote-git"]
|
||||
url = /home/yoh/.tmp/dl-VjO0aSF/remote
|
||||
fetch = +refs/heads/*:refs/remotes/remote-git/*
|
||||
annex-uuid = afdc6d54-cd6d-4a20-b639-a639f9c7ef09
|
||||
+ git annex initremote testremote type=git location=/home/yoh/.tmp/dl-VjO0aSF/remote autoenable=true
|
||||
initremote testremote
|
||||
git-annex: could not find existing git remote with specified location
|
||||
failed
|
||||
initremote: 1 failed
|
||||
|
||||
```
|
||||
|
||||
so
|
||||
|
||||
- the error "could not find existing git remote with specified location" does not describe the underlying problem, since the location matches the url. It is still not clear why we can't initremote
|
||||
- as you can see in the script, `annex info` is needed to have annex-uuid populated, and judging by the [code](https://git.kitenet.net/index.cgi/git-annex.git/tree/Remote/Git.hs?id=af0d854460c28230dc682faa7c6daf3d96698cb6#n110) comment, it requires the UUID to be known. If it is not known, ideally there should be a dedicated error message ("remote blah found but lacks uuid, check if remote is annex")
|
||||
- IMHO should not need manual `annex info` to merge git-annex branch
|
||||
|
||||
|
||||
### What steps will reproduce the problem?
|
||||
|
||||
above
|
||||
|
||||
### What version of git-annex are you using? On what operating system?
|
||||
|
||||
10.20220504
|
||||
|
||||
[[!meta author=yoh]]
|
||||
[[!tag projects/dandi]]
|
||||
|
||||
> [[fixed|done]] --[[Joey]]
|
|
@ -0,0 +1,26 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 1"""
|
||||
date="2022-06-08T16:55:50Z"
|
||||
content="""
|
||||
Hmm, I think this only works for ssh:// urls currently.
|
||||
|
||||
Even the ssh url form host:/path does not work, because it gets
|
||||
normalized to a ssh:// url.
|
||||
|
||||
The implementation does not support non-url's at all; the provided location
|
||||
is treated as an url (`Git.Url location`). And even if it were treated as a
|
||||
path, the path gets normalized to a relative path and an absolute path (or
|
||||
differently relativized path) would not work.
|
||||
|
||||
Using paths with this is rather problematic too, because if the repo is
|
||||
cloned to another machine, it would not find the repo at the recorded path.
|
||||
Similarly, relative paths are also problematic. But it may as well support
|
||||
them to the extent it can.
|
||||
|
||||
I think this needs changes to the core Git data structure, to store the
|
||||
original, unmodified git.remote.path. Or a different interface than the
|
||||
current, one that accepts any repo location and probes it to find the uuid.
|
||||
The latter idea seems better because it simplifies the UI rather than
|
||||
complicating the internal representation.
|
||||
"""]]
|
|
@ -0,0 +1,10 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 2"""
|
||||
date="2022-06-09T17:04:23Z"
|
||||
content="""
|
||||
Implemented probing of the uuid of the repo location. Which may change
|
||||
how you use this feature. Although the old roundabout method of having an
|
||||
existing git remote and running initremote with the same location will
|
||||
work too, it's not necessary to do that anymore.
|
||||
"""]]
|
|
@ -0,0 +1,13 @@
|
|||
[[!comment format=mdwn
|
||||
username="jkniiv"
|
||||
avatar="http://cdn.libravatar.org/avatar/05fd8b33af7183342153e8013aa3713d"
|
||||
subject="comment 2"
|
||||
date="2022-06-09T02:32:18Z"
|
||||
content="""
|
||||
Wouldn't it be possible to support (absolute) file:// urls, eg. something similar to
|
||||
`file:///home/jkniiv/test-VEfBrTZ/remote2`? In my mind they feel like a reasonable approximation
|
||||
of ssh:// urls and could be useful for getting a feel for git special remotes before setting
|
||||
up a bare git-repo/annex on an ssh-server. I know they are not the same thing implementation wise
|
||||
but I feel that being able to try this feature out on a least-effort basis would be useful
|
||||
from a pedagogical standpoint.
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 4"""
|
||||
date="2022-06-09T17:28:19Z"
|
||||
content="""
|
||||
Re file:// urls, it does now work to use them in location=. I don't know if
|
||||
I'd consider using them any better than absolute paths though. YMMV.
|
||||
"""]]
|
|
@ -0,0 +1,150 @@
|
|||
[[!meta title="http remotes that require authentication are not yet supported"]]
|
||||
|
||||
It is not a ground shaking issue, but probably would be best to handle it more gracefully.
|
||||
|
||||
Initially noticed while doing an install using datalad. An account/permission is required to access this particular repo; ask the Canadians for access if you don't have it yet, Joey. I guess the credentials got asked for and cached by git upon the initial invocation, so subsequent calls didn't ask for any:
|
||||
|
||||
[[!format sh """
|
||||
$> datalad install https://git.bic.mni.mcgill.ca/bic/Coffey-mri-bids
|
||||
[INFO ] Cloning https://git.bic.mni.mcgill.ca/bic/Coffey-mri-bids [1 other candidates] into '/tmp/Coffey-mri-bids'
|
||||
[INFO ] fatal: bad config line 1 in file /home/yoh/.tmp/git-annex96493-5.tmp
|
||||
[INFO ] Remote origin not usable by git-annex; setting annex-ignore
|
||||
install(ok): /tmp/Coffey-mri-bids (dataset)
|
||||
"""]]
|
||||
|
||||
which boiled down to that message being spat out during `git annex init`, which samples the remote but fails to download the config and instead gets a redirected html page:
|
||||
|
||||
[[!format sh """
|
||||
$> git clone https://git.bic.mni.mcgill.ca/bic/Coffey-mri-bids
|
||||
Cloning into 'Coffey-mri-bids'...
|
||||
warning: redirecting to https://git.bic.mni.mcgill.ca/bic/Coffey-mri-bids.git/
|
||||
remote: Enumerating objects: 398, done.
|
||||
remote: Counting objects: 100% (398/398), done.
|
||||
remote: Compressing objects: 100% (282/282), done.
|
||||
remote: Total 398 (delta 53), reused 393 (delta 48)
|
||||
Receiving objects: 100% (398/398), 34.97 KiB | 795.00 KiB/s, done.
|
||||
Resolving deltas: 100% (53/53), done.
|
||||
|
||||
|
||||
$> git -C Coffey-mri-bids annex init --debug
|
||||
...
|
||||
[2019-11-27 19:27:01.341315979] Request {
|
||||
host = "git.bic.mni.mcgill.ca"
|
||||
port = 443
|
||||
secure = True
|
||||
requestHeaders = [("Accept-Encoding","identity"),("User-Agent","git-annex/7.20190819+git2-g908476a9b-1~ndall+1")]
|
||||
path = "/bic/Coffey-mri-bids/config"
|
||||
queryString = ""
|
||||
method = "GET"
|
||||
proxy = Nothing
|
||||
rawBody = False
|
||||
redirectCount = 10
|
||||
responseTimeout = ResponseTimeoutDefault
|
||||
requestVersion = HTTP/1.1
|
||||
}
|
||||
|
||||
[2019-11-27 19:27:01.90016181] read: git ["config","--null","--list","--file","/home/yoh/.tmp/git-annex228094-5.tmp"]
|
||||
fatal: bad config line 1 in file /home/yoh/.tmp/git-annex228094-5.tmp
|
||||
[2019-11-27 19:27:01.913302324] process done ExitFailure 128
|
||||
|
||||
Remote origin not usable by git-annex; setting annex-ignore
|
||||
|
||||
$> wget -S https://git.bic.mni.mcgill.ca/bic/Coffey-mri-bids/config
|
||||
--2019-11-27 19:29:25-- https://git.bic.mni.mcgill.ca/bic/Coffey-mri-bids/config
|
||||
Resolving git.bic.mni.mcgill.ca (git.bic.mni.mcgill.ca)... 132.216.133.92
|
||||
Connecting to git.bic.mni.mcgill.ca (git.bic.mni.mcgill.ca)|132.216.133.92|:443... connected.
|
||||
HTTP request sent, awaiting response...
|
||||
HTTP/1.1 302 Found
|
||||
Server: nginx
|
||||
Date: Thu, 28 Nov 2019 00:29:26 GMT
|
||||
Content-Type: text/html; charset=utf-8
|
||||
Content-Length: 109
|
||||
Connection: keep-alive
|
||||
Cache-Control: no-cache
|
||||
Location: https://git.bic.mni.mcgill.ca/users/sign_in
|
||||
Set-Cookie: _gitlab_session=8a4f8d5569636004aaebfb73588a2d53; path=/; secure; HttpOnly
|
||||
X-Request-Id: xTcSyu4H36
|
||||
X-Runtime: 0.071681
|
||||
Strict-Transport-Security: max-age=31536000
|
||||
Referrer-Policy: strict-origin-when-cross-origin
|
||||
Location: https://git.bic.mni.mcgill.ca/users/sign_in [following]
|
||||
--2019-11-27 19:29:26-- https://git.bic.mni.mcgill.ca/users/sign_in
|
||||
Reusing existing connection to git.bic.mni.mcgill.ca:443.
|
||||
HTTP request sent, awaiting response...
|
||||
HTTP/1.1 200 OK
|
||||
Server: nginx
|
||||
Date: Thu, 28 Nov 2019 00:29:26 GMT
|
||||
Content-Type: text/html; charset=utf-8
|
||||
Transfer-Encoding: chunked
|
||||
Connection: keep-alive
|
||||
Vary: Accept-Encoding
|
||||
Cache-Control: max-age=0, private, must-revalidate
|
||||
Etag: W/"305857ff0ba591a1e4ee7fec83b5687c"
|
||||
Referrer-Policy: strict-origin-when-cross-origin
|
||||
Set-Cookie: _gitlab_session=8a4f8d5569636004aaebfb73588a2d53; path=/; expires=Thu, 28 Nov 2019 02:29:26 -0000; secure; HttpOnly
|
||||
X-Content-Type-Options: nosniff
|
||||
X-Download-Options: noopen
|
||||
X-Frame-Options: DENY
|
||||
X-Permitted-Cross-Domain-Policies: none
|
||||
X-Request-Id: MHFi7Yjxe82
|
||||
X-Runtime: 0.063359
|
||||
X-Ua-Compatible: IE=edge
|
||||
X-Xss-Protection: 1; mode=block
|
||||
Strict-Transport-Security: max-age=31536000
|
||||
Referrer-Policy: strict-origin-when-cross-origin
|
||||
Length: unspecified [text/html]
|
||||
Saving to: ‘config’
|
||||
|
||||
config [ <=> ] 13.19K --.-KB/s in 0s
|
||||
|
||||
2019-11-27 19:29:26 (89.1 MB/s) - ‘config’ saved [13505]
|
||||
|
||||
$> cat config
|
||||
<!DOCTYPE html>
|
||||
<html class="devise-layout-html">
|
||||
<head prefix="og: http://ogp.me/ns#">
|
||||
<meta charset="utf-8">
|
||||
<meta content="IE=edge" http-equiv="X-UA-Compatible">
|
||||
<meta content="object" property="og:type">
|
||||
<meta content="GitLab" property="og:site_name">
|
||||
<meta content="Sign in" property="og:title">
|
||||
...
|
||||
"""]]
|
||||
|
||||
I guess the problem is multi-faceted:
|
||||
|
||||
1. in case of an authenticated http remote, `git` caches credentials, but `git annex` then tries to download files directly (instead of somehow via git), so it cannot "sense" that the remote is a valid annex and/or get files from it.
|
||||
|
||||
You can try with this simple one -- user "demo", password "demo":
|
||||
|
||||
[[!format sh """
|
||||
$> git clone http://www.onerussian.com/tmp/secret-repo/.git
|
||||
Cloning into 'secret-repo'...
|
||||
Username for 'http://www.onerussian.com': demo
|
||||
Password for 'http://demo@www.onerussian.com':
|
||||
|
||||
$> git -C secret-repo annex init
|
||||
init (merging origin/git-annex into git-annex...)
|
||||
(recording state in git...)
|
||||
|
||||
Remote origin not usable by git-annex; setting annex-ignore
|
||||
ok
|
||||
(recording state in git...)
|
||||
|
||||
"""]]
|
||||
|
||||
Although the remote is a proper annex, `git annex` indeed cannot use it, since it does not authenticate as git does.
|
||||
So even though the error message is not incorrect, I would say the situation is suboptimal.
|
||||
|
||||
2. if the remote server, instead of just returning a 404 or 403 error code (as e.g. github seems to do in similar cases of non-authenticated access), redirects to some login page, annex feeds that page as a config to git, ignores the error message and just marks that remote as ignored for annex, while leaking that obscure "fatal" error message from git.
|
||||
|
||||
IMHO, ideally 1. should be addressed properly (authentication), and for 2. annex should spit out some more sensible message ("git failed to parse a config file fetched from the remote X. Please inspect it at this /path/config") and keep that file around for debugging. As it is now, I had to dig quite deep to figure out WTF is going on.
|
||||
|
||||
git annex 7.20190819+git2-g908476a9b-1~ndall+1 and the same with bleeding edge 7.20191114+git43-ge29663773-1~ndall+1 (probably that commit is the one with my patch for stricter git versioning, so use the count of 42 ;))
|
||||
|
||||
[[!meta author=yoh]]
|
||||
[[!tag projects/dandi]]
|
||||
|
||||
> [[done]]; the error message is improved and also git remotes that need
|
||||
> http basic auth to access will get password from `git credential`.
|
||||
> --[[Joey]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="yarikoptic"
|
||||
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
|
||||
subject="related: shouldn't git annex try external remotes to download config?"
|
||||
date="2019-11-28T01:22:53Z"
|
||||
content="""
|
||||
I haven't tested, but I can see a situation where a specific repository URL could be handled by an external special remote (such as datalad, the downloaders of which do handle obscure setups such as this one, with no 403/404 but rather forwarding to a login page) which would provide authenticated access to the URL. Would annex even try that config URL via external special remotes?
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="yarikoptic"
|
||||
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
|
||||
subject="comment 2"
|
||||
date="2019-11-29T18:09:45Z"
|
||||
content="""
|
||||
One of the use-cases (will be) https://gin.g-node.org/ -- an archive of (primarily) electrophys data. The platform is based on gogs, but uses git-annex underneath. It \"will be\" because currently access to git-annex is provided only via ssh, but as of today it is already possible to `git clone` datasets via https (tried on public, didn't try private), and the developers are looking into exposing git-annex via http as well. To access private datasets, authentication will need to be handled.
|
||||
"""]]
|
|
@ -0,0 +1,31 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 3"""
|
||||
date="2020-01-22T16:04:37Z"
|
||||
content="""
|
||||
git-annex could use `git credential` if the config download fails with
|
||||
401 unauthorized and then retry with the credentials. (The git-lfs special
|
||||
remote already does this.) And it would also need to do the same thing
|
||||
when getting a key from the remote.
|
||||
|
||||
But that would not help with the https://git.bic.mni.mcgill.ca example,
|
||||
apparently, because there's no 401, but a 302 redirect to a 200,
|
||||
that is indistinguishable from a successful download.
|
||||
|
||||
Yeah, when git-annex expects a git config, if it doesn't parse as one,
|
||||
it could retry, asking for credentials.
|
||||
But that seems like asking for trouble: what if it fails to parse for
|
||||
another reason, maybe the web server served up something other than the
|
||||
expected config, maybe a captive portal got in the way. There would be a
|
||||
username/password prompt that doesn't make sense to the user at all.
|
||||
|
||||
And if this happens in a key download, git-annex certainly has no way to
|
||||
tell that what it downloaded is not intended as the content of a key,
|
||||
short of verifying the content, and failure to verify certainly doesn't
|
||||
justify prompting for a username/password.
|
||||
|
||||
So, I am not comfortable with falling back to ask for credentials unless
|
||||
I've seen a http status code that indicates they are necessary.
|
||||
And IMHO gitlab's use of a 302 redirect to a login page is a bug in
|
||||
gitlab, and will need to be fixed there, or a better http server used.
|
||||
"""]]
|
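The reasoning in this comment can be sketched as two small shell checks: ask for credentials only when the final HTTP status is 401, and separately verify that a downloaded file parses as a git config before trusting it. Both function names are made up for this illustration. This is not how git-annex implements it (git-annex is written in Haskell); the sketch only assumes that `git config --file` rejects a file that is not a config.

```shell
#!/bin/sh
# Hypothetical helper: given the final HTTP status code of a download,
# decide whether prompting via `git credential` is justified.
wants_credentials() {
    case "$1" in
        401) return 0 ;;  # the server explicitly asked for authentication
        *)   return 1 ;;  # e.g. a 200 reached via a 302 to a login page
    esac
}

# Hypothetical helper: succeed only if the file parses as a git config.
looks_like_git_config() {
    git config --file "$1" --list >/dev/null 2>&1
}
```

In the GitLab transcript above the final status is 200, so `wants_credentials 200` fails and no confusing prompt appears. `looks_like_git_config config` would be expected to fail on the sign-in HTML, which is enough to print a clearer error message.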
|
@ -0,0 +1,10 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""re: related: shouldn't git annex try external remotes to download config?"""
|
||||
date="2020-01-22T16:31:16Z"
|
||||
content="""
|
||||
No, the external special remote protocol is not aimed at downloading git
|
||||
config files. Anyway, this code path is never involved with using
|
||||
special remotes; the uuid of a special remote is known and so there is no
|
||||
need to ever download a git config file to discover it.
|
||||
"""]]
|
|
@ -0,0 +1,48 @@
|
|||
### Please describe the problem.
|
||||
|
||||
Maybe not a problem per se, but I decided to check whether this is expected. Following [this advice](http://git-annex.branchable.com/todo/git_smudge_clean_interface_suboptiomal/#comment-65f848510d8684bf65c6698f68b700dd) I have `git config filter.annex.process "git-annex filter-process"` in that git-annex repo and now observe the following tree (in htop) of processes:
|
||||
|
||||
```
|
||||
3799768 dandi 20 0 1025G 191M 40616 S 6.6 0.3 0:31.87 │ │ ├─ git-annex addurl --batch --with-files --jobs 5 --json --json-error-messages --json-progress --raw
|
||||
3799796 dandi 20 0 191M 5088 4680 S 0.0 0.0 0:00.01 │ │ │ ├─ git --git-dir=.git --work-tree=. --literal-pathspecs cat-file --batch
|
||||
3805272 dandi 20 0 6892 3420 2992 S 0.0 0.0 0:00.27 │ │ │ ├─ /bin/bash /usr/bin/git-annex-remote-rclone
|
||||
3805640 dandi 20 0 20432 13032 4024 S 0.0 0.0 0:02.82 │ │ │ ├─ git --git-dir=.git --work-tree=. check-ignore -z --stdin --verbose --non-matching
|
||||
3805646 dandi 20 0 20432 13044 4036 S 0.0 0.0 0:00.00 │ │ │ ├─ git --git-dir=.git --work-tree=. --literal-pathspecs check-attr -z --stdin annex.backend annex.largefiles annex.numcopies annex.mincopies --
|
||||
3805650 dandi 20 0 31900 4064 3816 S 0.0 0.0 0:00.00 │ │ │ ├─ git --git-dir=.git --work-tree=. --literal-pathspecs hash-object -w --stdin-paths --no-filters
|
||||
3805685 dandi 20 0 30144 4000 3752 S 0.0 0.0 0:00.00 │ │ │ ├─ git --git-dir=.git --work-tree=. --literal-pathspecs hash-object -w --stdin-paths --no-filters
|
||||
3805704 dandi 20 0 30144 16076 15792 S 0.0 0.0 0:00.01 │ │ │ ├─ git --git-dir=.git --work-tree=. --literal-pathspecs hash-object -w --stdin-paths --no-filters
|
||||
3805705 dandi 20 0 30144 3976 3728 S 0.0 0.0 0:00.00 │ │ │ ├─ git --git-dir=.git --work-tree=. --literal-pathspecs hash-object -w --stdin-paths --no-filters
|
||||
3805717 dandi 20 0 30144 15968 15680 S 0.0 0.0 0:00.01 │ │ │ ├─ git --git-dir=.git --work-tree=. --literal-pathspecs hash-object -w --stdin-paths --no-filters
|
||||
3805781 dandi 20 0 30144 3980 3724 S 0.0 0.0 0:00.00 │ │ │ ├─ git --git-dir=.git --work-tree=. --literal-pathspecs hash-object -w --stdin-paths --no-filters
|
||||
3805786 dandi 20 0 30144 4068 3820 S 0.0 0.0 0:00.00 │ │ │ ├─ git --git-dir=.git --work-tree=. --literal-pathspecs hash-object -w --stdin-paths --no-filters
|
||||
3805807 dandi 20 0 30144 16028 15744 S 0.0 0.0 0:00.02 │ │ │ ├─ git --git-dir=.git --work-tree=. --literal-pathspecs hash-object -w --stdin-paths --no-filters
|
||||
3805808 dandi 20 0 30144 3884 3636 S 0.0 0.0 0:00.00 │ │ │ ├─ git --git-dir=.git --work-tree=. --literal-pathspecs hash-object -w --stdin-paths --no-filters
|
||||
3805828 dandi 20 0 30144 4008 3764 S 0.0 0.0 0:00.00 │ │ │ ├─ git --git-dir=.git --work-tree=. --literal-pathspecs hash-object -w --stdin-paths --no-filters
|
||||
3805848 dandi 20 0 20432 13104 4092 S 0.0 0.0 0:00.04 │ │ │ ├─ git --git-dir=.git --work-tree=. check-ignore -z --stdin --verbose --non-matching
|
||||
3805852 dandi 20 0 20432 12948 3940 S 0.0 0.0 0:00.02 │ │ │ ├─ git --git-dir=.git --work-tree=. check-ignore -z --stdin --verbose --non-matching
|
||||
3805865 dandi 20 0 20432 13032 4024 S 0.0 0.0 0:00.02 │ │ │ ├─ git --git-dir=.git --work-tree=. --literal-pathspecs check-attr -z --stdin annex.backend annex.largefiles annex.numcopies annex.mincopies --
|
||||
3806054 dandi 20 0 30144 4004 3752 S 0.0 0.0 0:00.00 │ │ │ ├─ git --git-dir=.git --work-tree=. --literal-pathspecs hash-object -w --stdin-paths --no-filters
|
||||
3806066 dandi 20 0 45216 5108 4700 S 0.0 0.0 0:00.00 │ │ │ ├─ git --git-dir=.git --work-tree=. --literal-pathspecs cat-file --batch
|
||||
3806067 dandi 20 0 30144 3888 3640 S 0.0 0.0 0:00.00 │ │ │ ├─ git --git-dir=.git --work-tree=. --literal-pathspecs hash-object -w --stdin-paths --no-filters
|
||||
3806068 dandi 20 0 30144 16032 15748 S 0.0 0.0 0:00.01 │ │ │ ├─ git --git-dir=.git --work-tree=. --literal-pathspecs hash-object -w --stdin-paths --no-filters
|
||||
3806095 dandi 20 0 30144 4060 3816 S 0.0 0.0 0:00.00 │ │ │ ├─ git --git-dir=.git --work-tree=. --literal-pathspecs hash-object -w --stdin-paths --no-filters
|
||||
3806104 dandi 20 0 20432 12928 3916 S 0.0 0.0 0:00.06 │ │ │ ├─ git --git-dir=.git --work-tree=. --literal-pathspecs check-attr -z --stdin annex.backend annex.largefiles annex.numcopies annex.mincopies --
|
||||
3806110 dandi 20 0 30144 15944 15660 S 0.0 0.0 0:00.02 │ │ │ └─ git --git-dir=.git --work-tree=. --literal-pathspecs hash-object -w --stdin-paths --no-filters
|
||||
3804258 dandi 20 0 1024G 44336 37772 S 0.0 0.1 0:00.04 │ │ ├─ git-annex addurl --batch --with-files --jobs 5 --json --json-error-messages --json-progress --raw
|
||||
3804277 dandi 20 0 40844 5124 4740 S 0.0 0.0 0:00.00 │ │ │ └─ git --git-dir=.git --work-tree=. --literal-pathspecs cat-file --batch
|
||||
3805399 dandi 20 0 1024G 23508 20844 S 0.0 0.0 0:00.61 │ │ ├─ git-annex examinekey --batch --migrate-to-backend=SHA256E
|
||||
3805493 dandi 20 0 1024G 36516 26184 S 0.0 0.1 0:01.51 │ │ ├─ git-annex fromkey --force --batch --json --json-error-messages
|
||||
3805503 dandi 20 0 25788 5120 4712 S 0.0 0.0 0:00.00 │ │ │ ├─ git --git-dir=.git --work-tree=. --literal-pathspecs cat-file --batch
|
||||
3805510 dandi 20 0 12472 3984 3732 S 0.0 0.0 0:00.05 │ │ │ └─ git --git-dir=.git --work-tree=. --literal-pathspecs hash-object -w --stdin-paths --no-filters
|
||||
```
|
||||
|
||||
which might be ok, but I still wonder why so many of them are just sleeping there, more than one per `--jobs` count. git annex 10.20220624-g769be12
|
||||
|
||||
|
||||
[[!meta author=yoh]]
|
||||
[[!tag projects/dandi]]
|
||||
|
||||
|
||||
> [[done]]; this is now handled like other git helper processes
|
||||
> and will be capped to the maximum of the number of jobs or cpu cores,
|
||||
> and in practice usually fewer than that will be started. --[[Joey]]
|
|
@ -0,0 +1,16 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 1"""
|
||||
date="2022-07-25T20:37:55Z"
|
||||
content="""
|
||||
I was able to reproduce this by feeding 10 urls into git-annex addurl
|
||||
-J5 and got 7 hash-object processes running.
|
||||
|
||||
filter.annex.process has nothing to do with this. I reproduced the behavior
|
||||
without it set.
|
||||
|
||||
Seems like a simple concurrency issue, where each thread potentially starts
|
||||
its own hash-object handle, and there can be around 2x as many threads
|
||||
started as the -J number due to job stages. Annex.Concurrent sets up pools of
|
||||
handles for other similar git processes, but not hash-object.
|
||||
"""]]
|
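As rough arithmetic for the numbers in this thread, here is a sketch with made-up function names. It assumes that the closing note's "capped to the maximum of the number of jobs or cpu cores" means max(jobs, cores), and that job stages roughly double the thread count as described in the comment above.

```shell
#!/bin/sh
# Hypothetical: approximate worst case before the fix, when every thread
# could start its own hash-object handle (about 2x the -J number).
uncapped_helpers() {
    echo $(( $1 * 2 ))
}

# Hypothetical: upper bound after the fix, max(jobs, cores).
helper_cap() {
    if [ "$1" -gt "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}

uncapped_helpers 5   # worst case for -J5
helper_cap 5 8       # bound for -J5 on an 8-core machine
```

The 7 hash-object processes seen with `-J5` fit under the first estimate. The second is only an upper bound, since the note says fewer are usually started in practice.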
|
@ -0,0 +1,8 @@
|
|||
(Sorry about the title; I was trying to work within the character limit.)
|
||||
|
||||
When invoking `git-annex metadata --batch --json --json-error-messages`, if an error occurs in response to some input — say, because the name of a nonexistent file was supplied (or, in my case, because the name of a file downloaded milliseconds ago in a parallel addurl process was supplied) — then `git-annex metadata` will output "git-annex: not an annexed file: {filepath}" to standard error and immediately exit. Not only is this in contrast to what it seems `--json-error-messages` should do, but the "exiting immediately" bit is in contrast to my understanding of how batch mode is supposed to work. Surely this should be fixed?
|
||||
|
||||
[[!meta author=jwodder]]
|
||||
[[!tag projects/dandi]]
|
||||
|
||||
> [[fixed|done]] --[[Joey]]
|
|
@ -0,0 +1,13 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 1"""
|
||||
date="2021-11-01T16:27:48Z"
|
||||
content="""
|
||||
For consistency with other --batch, I've made it reply with a blank line
|
||||
when the input is not an annexed file.
|
||||
|
||||
Do note that --json-error-messages cannot cram every possible kind of error
|
||||
message into a json object. In particular, errors that occur at startup,
|
||||
and not when acting on a particular file or key, do not fit into the json
|
||||
schema.
|
||||
"""]]
|
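A consumer of `--batch` output can lean on that one-reply-per-input convention. The sketch below (function name hypothetical) reads replies line by line and treats a blank line as "not an annexed file". It does not run git-annex itself, and the JSON object fed to it is a simulated placeholder, not the real output shape.

```shell
#!/bin/sh
# Hypothetical reader for batch replies arriving on stdin, one per line.
# A blank reply means the corresponding input was not an annexed file.
classify_batch_replies() {
    while IFS= read -r reply; do
        if [ -z "$reply" ]; then
            echo "skipped"
        else
            echo "ok"
        fi
    done
}

# Simulated replies: one placeholder JSON object, then a blank line.
printf '%s\n' '{"file":"a.dat"}' '' | classify_batch_replies
```

Because every input yields exactly one output line, the caller can pair requests with replies by position and keep the batch process running after a bad input.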
|
@ -0,0 +1,44 @@
|
|||
### Please describe the problem.
|
||||
|
||||
From [https://github.com/DanielDent/git-annex-remote-rclone/pull/57](https://github.com/DanielDent/git-annex-remote-rclone/pull/57), where we use that rclone special remote for backup of DANDI data to dropbox
|
||||
|
||||
Seems like a test sometimes fails on Mac OS with:
|
||||
|
||||
```
|
||||
+ git-annex copy -J5 --quiet . --to GA-rclone-CI
|
||||
git-annex: .git/annex/move.log: openFile: resource busy (file is locked)
|
||||
copy: 1 failed
|
||||
Error: Process completed with exit code 1.
|
||||
```
|
||||
|
||||
Indeed, so far it seems to have happened only on Mac:
|
||||
|
||||
```
|
||||
(git)smaug:/mnt/datasets/datalad/ci/git-annex-remote-rclone[master]2022
|
||||
$> datalad foreach-dataset git grep 'file is locked'
|
||||
foreach-dataset(error): /mnt/datasets/datalad/ci/git-annex-remote-rclone (dataset) [CommandError: 'git grep 'file is locked'' failed with exitcode 1 under /mnt/datasets/datalad/ci/git-annex-remote-rclone]
|
||||
03/cron/20221003T064418/da57e9a/github-Tests-144-failed/9_test (macos-latest, v1.53.3).txt:2022-10-03T06:47:44.4978580Z git-annex: .git/annex/move.log: openFile: resource busy (file is locked)
|
||||
03/cron/20221003T064418/da57e9a/github-Tests-144-failed/test (macos-latest, v1.53.3)/9_tests.txt:2022-10-03T06:47:44.4978530Z git-annex: .git/annex/move.log: openFile: resource busy (file is locked)
|
||||
03/push/master/1d0d3ce/github-Tests-146-failed/10_test (macos-latest, v1.33).txt:2022-10-03T23:35:41.8464390Z git-annex: .git/annex/move.log: openFile: resource busy (file is locked)
|
||||
03/push/master/1d0d3ce/github-Tests-146-failed/9_test (macos-latest, v1.53.3).txt:2022-10-03T23:37:44.0652500Z git-annex: .git/annex/move.log: openFile: resource busy (file is locked)
|
||||
03/push/master/1d0d3ce/github-Tests-146-failed/test (macos-latest, v1.33)/9_tests.txt:2022-10-03T23:35:41.8463970Z git-annex: .git/annex/move.log: openFile: resource busy (file is locked)
|
||||
03/push/master/1d0d3ce/github-Tests-146-failed/test (macos-latest, v1.53.3)/9_tests.txt:2022-10-03T23:37:44.0652360Z git-annex: .git/annex/move.log: openFile: resource busy (file is locked)
|
||||
foreach-dataset(ok): /mnt/datasets/datalad/ci/git-annex-remote-rclone/2022/10 (dataset)
|
||||
foreach-dataset(error): /mnt/datasets/datalad/ci/git-annex-remote-rclone/2022/06 (dataset) [CommandError: 'git grep 'file is locked'' failed with exitcode 1 under /mnt/datasets/datalad/ci/git-annex-remote-rclone/2022/06]
|
||||
foreach-dataset(error): /mnt/datasets/datalad/ci/git-annex-remote-rclone/2022/07 (dataset) [CommandError: 'git grep 'file is locked'' failed with exitcode 1 under /mnt/datasets/datalad/ci/git-annex-remote-rclone/2022/07]
|
||||
foreach-dataset(error): /mnt/datasets/datalad/ci/git-annex-remote-rclone/2022/09 (dataset) [CommandError: 'git grep 'file is locked'' failed with exitcode 1 under /mnt/datasets/datalad/ci/git-annex-remote-rclone/2022/09]
|
||||
foreach-dataset(error): /mnt/datasets/datalad/ci/git-annex-remote-rclone/2022/08 (dataset) [CommandError: 'git grep 'file is locked'' failed with exitcode 1 under /mnt/datasets/datalad/ci/git-annex-remote-rclone/2022/08]
|
||||
```
|
||||
|
||||
### What steps will reproduce the problem?
|
||||
|
||||
No minimal reproducer yet, but it happens as part of [this test "script"](https://github.com/DanielDent/git-annex-remote-rclone/blob/master/tests/all-in-one.sh)
|
||||
|
||||
### What version of git-annex are you using? On what operating system?
|
||||
|
||||
git-annex version: 10.20220927
|
||||
|
||||
[[!meta author=yoh]]
|
||||
[[!tag projects/dandi]]
|
||||
|
||||
> Presumed [[fixed|done]]; please followup if I'm wrong. --[[Joey]]
|
|
@ -0,0 +1,22 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 1"""
|
||||
date="2022-10-07T16:44:04Z"
|
||||
content="""
|
||||
I doubt this is really OSX specific. This must be two threads running logMove
|
||||
at the same time, that end up trying to both write or one write and one
|
||||
read at the same time. That causes the haskell RTS to fail this way.
|
||||
|
||||
Since it does use a lock file when writing and appending to the log file,
|
||||
I think it must be the call to checkLogFile that is failing. That avoids
|
||||
taking the lock, for performance reasons. The performance gain is pretty
|
||||
minimal though, taking the lock is not much. Only when modifyLogFile
|
||||
is called at the same time might it need to block on the file being
|
||||
rewritten, but the file only ever has 100 items, so that never takes long
|
||||
either.
|
||||
|
||||
So, I have added locking to checkLogFile (and to calcLogFile though it's
|
||||
not used here, just because it has the same problem). That should fix it,
|
||||
though we'll need to wait on the test to know for sure. I'm going to close
|
||||
this, as I'm pretty sure though..
|
||||
"""]]
|
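The fix described here, making the reader take the same lock as the writer, can be illustrated with a shell analogy. This is only a sketch: the function names and the `.lck` suffix are invented, and it uses an atomic `mkdir` as the lock, whereas git-annex does this in Haskell with its own lock files.

```shell
#!/bin/sh
# Run a command while holding a lock directory (mkdir is atomic).
with_lock() {
    lockdir=$1
    shift
    until mkdir "$lockdir" 2>/dev/null; do
        sleep 1   # another process holds the lock; wait and retry
    done
    "$@"
    status=$?
    rmdir "$lockdir"
    return "$status"
}

# Writer: append one entry to the log under the lock.
append_log() {
    with_lock "$1.lck" sh -c 'printf "%s\n" "$2" >> "$1"' sh "$1" "$2"
}

# Reader: take the same lock before looking for an entry, so a read
# never overlaps a write.
check_log() {
    with_lock "$1.lck" grep -qxF -- "$2" "$1"
}
```

For example, `append_log move.log somekey` followed by `check_log move.log somekey` succeeds. The cost of the extra lock on the read side is small, which matches the comment's point that the log only ever holds about 100 items.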
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="yarikoptic"
|
||||
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
|
||||
subject="comment 2"
|
||||
date="2022-11-04T12:41:47Z"
|
||||
content="""
|
||||
OK, I did an archaeological expedition to figure out when this was fixed: it was fixed in [10.20221003-19-g4a42c6909 AKA 10.20221103~28](https://git.kitenet.net/index.cgi/git-annex.git/commit/?id=4a42c69092a03cce7b31b79b862e59c9842ced77). brew still (well -- we are just 1 day post release! ;)) has 10.20221003, so in testing git-annex-remote-rclone we keep getting hit, but hopefully that will go away soon with an update of git-annex in brew.
|
||||
"""]]
|
|
@ -0,0 +1,96 @@
|
|||
### Please describe the problem.
|
||||
|
||||
git status reports unstaged changes, even though nothing differs from the index:
|
||||
|
||||
```shell
|
||||
(git-annex) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ git status
|
||||
On branch draft
|
||||
Your branch is up to date with 'github/draft'.
|
||||
|
||||
Changes not staged for commit:
|
||||
(use "git add <file>..." to update what will be committed)
|
||||
(use "git restore <file>..." to discard changes in working directory)
|
||||
modified: .dandi/assets.json
|
||||
|
||||
no changes added to commit (use "git add" and/or "git commit -a")
|
||||
|
||||
(git-annex) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ git annex status
|
||||
M ./.dandi/assets.json
|
||||
```
|
||||
|
||||
although git shows no diff and the sha256 checksum corresponds to the key:
|
||||
|
||||
```shell
|
||||
(git-annex) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ git diff --cached
|
||||
(git-annex) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ git show -- .dandi/assets.json
|
||||
commit b859efed7ddb2ff31cc26168f40676c572d2798f (HEAD -> draft, github/draft, github/HEAD)
|
||||
Author: DANDI User <info@dandiarchive.org>
|
||||
Date: Fri Sep 16 22:22:29 2022 +0000
|
||||
|
||||
[backups2datalad] 66 files added
|
||||
|
||||
diff --git a/.dandi/assets.json b/.dandi/assets.json
|
||||
index d3ef95e1ee..62fe372810 100644
|
||||
--- a/.dandi/assets.json
|
||||
+++ b/.dandi/assets.json
|
||||
@@ -1 +1 @@
|
||||
-/annex/objects/SHA256E-s69400783--8b576786d3926ab0e84809b4131cdc5a8f631674d378afa343e7dcd84f011c90.json
|
||||
+/annex/objects/SHA256E-s69507227--6a0a91c4158d316ab8ad9bd8ebf7579b9c3c579e1035c48134246b6a5d2f6f14.json
|
||||
(git-annex) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ sha256sum .dandi/assets.json
|
||||
6a0a91c4158d316ab8ad9bd8ebf7579b9c3c579e1035c48134246b6a5d2f6f14 .dandi/assets.json
|
||||
```
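The comparison above can be scripted. This is a minimal sketch, assuming the key layout seen in the diff (`SHA256E-s<size>--<hash>.<ext>`). The function names are made up here, and `sha256sum` is assumed to be available, as it is in the session above.

```shell
#!/bin/sh
# Extract the hash part of a SHA256E key: the text after "--" and
# before the first "." of the extension.
key_hash() {
    rest=${1##*--}
    echo "${rest%%.*}"
}

# Extract the size recorded in the key: the digits between "-s" and "--".
key_size() {
    rest=${1#*-s}
    echo "${rest%%--*}"
}

# Succeed if the file's SHA-256 matches the hash embedded in the key.
file_matches_key() {
    actual=$(sha256sum "$1" | cut -d' ' -f1)
    [ "$actual" = "$(key_hash "$2")" ]
}
```

With the values shown above, `file_matches_key .dandi/assets.json SHA256E-s69507227--6a0a91c4158d316ab8ad9bd8ebf7579b9c3c579e1035c48134246b6a5d2f6f14.json` would succeed, confirming that the content is right and only git's view of the file is stale.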
|
||||
|
||||
I think maybe the tricky part is that I have it at
|
||||
|
||||
```
|
||||
(git-annex) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ git config annex.version
|
||||
10
|
||||
```
|
||||
|
||||
although I thought that we kept it at 8, but I have a user-wide config setting
|
||||
|
||||
```
|
||||
(git-annex) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ git config filter.annex.process
|
||||
git-annex filter-process
|
||||
```
|
||||
|
||||
which I was recommended in order to speed up operations while avoiding an upgrade to 10, but I guess running the most recent version once led to the upgrade, since all the other repos are still at 8, as I thought this one would be
|
||||
|
||||
```
|
||||
(git-annex) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ grep -h '\<version =' ../*/.git/config | sort | uniq -c
|
||||
1 version = 10
|
||||
186 version = 8
|
||||
```
|
||||
|
||||
Having it reported as modified causes our script, which does a sanity check to operate only on a clean repo, to fail.
|
||||
|
||||
`git reset --hard` seems to have mitigated that
|
||||
|
||||
```
|
||||
(git-annex) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ git reset --hard
|
||||
HEAD is now at b859efed7d [backups2datalad] 66 files added
|
||||
(git-annex) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ git status
|
||||
On branch draft
|
||||
Your branch is up to date with 'github/draft'.
|
||||
|
||||
nothing to commit, working tree clean
|
||||
```
|
||||
|
||||
all. I will now rerun our script and see in what state I would end up (although, once again, I ended up in version 10 of the repo already, so may be behavior would be different).
|
||||
|
||||
### What steps will reproduce the problem?
|
||||
|
||||
I think I get it after I `annex move` and then `annex get` that file back. Just for my own reference -- the git-annex repo is the result of https://github.com/dandi/dandisets/blob/draft/tools/backups2datalad-update-cron
|
||||
|
||||
|
||||
### What version of git-annex are you using? On what operating system?
|
||||
|
||||
10.20220822-g84f1875 (conda build), originally observed on earlier 10.20220724-ge30d846
|
||||
|
||||
|
||||
[[!meta author=yoh]]
|
||||
[[!tag projects/dandi]]
|
||||
|
||||
[[!meta title="annex.stalldetection prevents git-annex get from restaging unlocked files"]]
|
||||
|
||||
> [[fixed|done]] --[[Joey]]
|
|
@ -0,0 +1,15 @@
|
|||
[[!comment format=mdwn
|
||||
username="yarikoptic"
|
||||
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
|
||||
subject="comment 10"
|
||||
date="2022-09-22T17:34:35Z"
|
||||
content="""
|
||||
damn, I should have shared my config! I also do have `annex.stalldetection` set!
|
||||
|
||||
```
|
||||
[annex]
|
||||
stalldetection = 1KB/120s
|
||||
```
|
||||
|
||||
never thought it might be related. We should look into having some matrix test run with such config set.
|
||||
"""]]
|
|
@ -0,0 +1,11 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 11"""
|
||||
date="2022-09-22T17:38:45Z"
|
||||
content="""
|
||||
Yeah, a whole git-annex test run with stalldetection set would have found
|
||||
this bug. Which seems a bit heavy-weight for the test suite to try as a
|
||||
separate pass by default. But then again, stalldetection does significantly
|
||||
change how git-annex operates since it has to fork off child processes that
|
||||
it can kill when they stall.
|
||||
"""]]
|
|
@ -0,0 +1,8 @@
|
|||
[[!comment format=mdwn
|
||||
username="yarikoptic"
|
||||
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
|
||||
subject="comment 12"
|
||||
date="2022-09-22T18:14:15Z"
|
||||
content="""
|
||||
Adding a matrix run, where I set custom config settings, to our [datalad/git-annex](https://github.com/datalad/git-annex/pull/133) CI run. Let's see how that goes. Maybe there are some other interesting config settings to add there, e.g. retries etc.? Or is the global `~/.gitconfig` not used/mocked away during tests? (e.g. we do that in datalad, so I had to trick that in a [PR against datalad](https://github.com/datalad/datalad/pull/7056) to test against this setting being set)
|
||||
"""]]
|
|
@ -0,0 +1,32 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 12"""
|
||||
date="2022-09-22T17:40:57Z"
|
||||
content="""
|
||||
So, `git-annex transferrer`, after downloading the content, does handle
|
||||
populating pointer files. So it calls restagePointerFile to register a cleanup
|
||||
action.
|
||||
|
||||
Whatever is making that process exit 1 must be preventing the cleanup
|
||||
action from being run. And I think what that is, is that its stdout handle
|
||||
gets closed at the same time its stdin handle is closed. I tried running
|
||||
`git-annex transferrer` manually and feeding it a transfer request on
|
||||
stdin. After its stdin was closed, it proceeded to send
|
||||
`"om (recording state in git...)\n"` to stdout, and that would fail
|
||||
with stdout already closed.
|
||||
|
||||
Worse, I suspect there's another problem.. When a stall actually
|
||||
is detected, git-annex kills the `git-annex transferrer` process that has
|
||||
stalled. But suppose that process has already successfully downloaded some
|
||||
content and populated pointer files. Killing it would prevent it from
|
||||
running restagePointerFile on those. It seems that to solve this,
|
||||
it would need to communicate back to the parent what pointer files need to
|
||||
be restaged. (Which would also solve the exit 1 problem, although not
|
||||
necessarily in the best way.)
|
||||
|
||||
Also, I think that multiple processes running the restagePointerFile
|
||||
cleanup action at the same time can be a problem, because one will
|
||||
lock the index and the rest will fail to restage. Not what's happening
|
||||
here, but with -J, there would be multiple `git-annex transferrer`
|
||||
processes doing that at the same time at the end.
|
||||
"""]]
|
|
@ -0,0 +1,30 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 13"""
|
||||
date="2022-09-22T18:16:22Z"
|
||||
content="""
|
||||
Avoided the early stdout handle close, and that did fix this bug as
|
||||
reported.
|
||||
|
||||
The related problems I identified in comment #12 are still unfixed, so
|
||||
leaving this open for now.
|
||||
|
||||
I think what ought to be done to wrap this up is make restagePointerFile
|
||||
record the files that need to be restaged in a log file. Then at shutdown,
|
||||
git-annex can read the log file, and restage everything listed in it.
|
||||
This will solve multiple problems:
|
||||
|
||||
* When a previous git-annex process was interrupted after a get/drop of an
|
||||
unlocked file, the file will be in the log, so git-annex can notice
|
||||
that and handle the restaging.
|
||||
* When a stalled `git-annex transferrer` is killed, the parent git-annex
|
||||
will read the log and handle the restaging that it was not able to do.
|
||||
* When multiple processes are trying to restage files at the same time,
|
||||
an exclusive lock can be used to make only one of them run, and it can
|
||||
handle restaging the files that the others have recorded in the log too.
|
||||
* As a bonus, in the situations where git-annex is legitimately unable to
|
||||
restage files, it can still record them to be restaged later. And the
|
||||
"only a cosmetic problem" message can tell the user to run a single
|
||||
simple git-annex command, rather than a complicated
|
||||
`git update-index` command per file.
|
||||
"""]]
|
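The log-then-restage design proposed in this comment can be sketched in shell. Everything here is hypothetical: the function names, the `.lck` and `.work` suffixes, and the `sort -u` that stands in for the actual restaging step. It also simplifies one point a real implementation would have to handle, namely appends that race with the drain.

```shell
#!/bin/sh
# Hypothetical: note that a file needs restaging by appending to a log.
record_restage() {
    printf '%s\n' "$2" >> "$1"
}

# Hypothetical: at shutdown, one process takes an exclusive lock and
# handles everything logged so far, including entries from other processes.
drain_restage_log() {
    log=$1
    mkdir "$log.lck" 2>/dev/null || return 1  # someone else is restaging
    if [ -s "$log" ]; then
        mv "$log" "$log.work"
        sort -u "$log.work"   # stand-in for the actual restaging step
        rm -f "$log.work"
    fi
    rmdir "$log.lck"
}
```

A killed or interrupted process leaves its entries in the log, so whichever process drains it next picks them up. A process that loses the lock race simply returns non-zero and leaves the work to the lock holder.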
|
@ -0,0 +1,12 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 15"""
|
||||
date="2022-09-22T18:42:06Z"
|
||||
content="""
|
||||
@yarikoptic oh, `git-annex test` does prevent global gitconfig from
|
||||
influencing the tests. So your matrix test won't work if you're
|
||||
running `git-annex test` in it. If you're running other git-annex commands
|
||||
in datalad's test suite, it would work though.
|
||||
|
||||
I've opened [[todo/specify_gitconfig_for_test_suite]].
|
||||
"""]]
|
|
@ -0,0 +1,33 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""status update"""
|
||||
date="2022-09-23T19:57:38Z"
|
||||
content="""
|
||||
I've implemented the log file. The stalled transferrer case is now handled.
|
||||
This bug is fixed.
|
||||
|
||||
As to a few other cases I considered in comments upthread:
|
||||
|
||||
When a get/drop was interrupted before it could restage,
|
||||
the next get/drop will cause the necessary restaging for the
|
||||
interrupted process to happen. However, this doesn't help if there's
|
||||
nothing left to get/drop. Should git-annex always run restagePointerFiles
|
||||
on shutdown? That would make any git-annex command handle the restaging.
|
||||
But it doesn't seem right for query commands to do potentially a lot of
|
||||
work to handle this case. Anyway, I don't think this needs to be dealt
|
||||
with in this bug report.
|
||||
|
||||
When multiple processes try to restage at the same time, one will
|
||||
restage everything that all of them logged. The others will still display a
|
||||
warning to the user that they couldn't restage. It would be hard to avoid
|
||||
displaying that warning, since it does need to warn when it was
|
||||
unable to restage because git has the index locked at the time. Anyway,
|
||||
I think it's ok to display the message despite the files having been
|
||||
restaged, because it's the same as a later git-annex process handling the
|
||||
restaging. (It does seem like two transferrers belonging to the same parent
|
||||
could collide in this way, and one display the warning, which isn't great..)
|
||||
|
||||
I also implemented a "git-annex restage" command that
|
||||
is an easier way to restage in the cases where git-annex is not able
|
||||
to do it itself.
|
||||
"""]]
|
|
@ -0,0 +1,12 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 1"""
|
||||
date="2022-09-21T17:05:51Z"
|
||||
content="""
|
||||
Is .dandi/assets.json an unlocked file?
|
||||
|
||||
`git diff --cached` seems like the wrong thing to run, because
|
||||
that would show changes that you have staged for commit.
|
||||
This change is one that has not been staged for commit.
|
||||
So `git diff` should show it.
|
||||
"""]]
|
|
@ -0,0 +1,46 @@
|
|||
[[!comment format=mdwn
|
||||
username="yarikoptic"
|
||||
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
|
||||
subject="comment 2"
|
||||
date="2022-09-21T18:46:50Z"
|
||||
content="""
|
||||
d'oh forgot to show that I have tried that one too. Here is everything at once again with `git diff` and again doing checksums (that should have been different in my prev examples as well if different only in tree but not in index):
|
||||
|
||||
```shell
|
||||
(dandisets) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ git status
|
||||
On branch draft
|
||||
Your branch is up to date with 'github/draft'.
|
||||
|
||||
Changes not staged for commit:
|
||||
(use \"git add <file>...\" to update what will be committed)
|
||||
(use \"git restore <file>...\" to discard changes in working directory)
|
||||
modified: .dandi/assets.json
|
||||
|
||||
|
||||
It took 3.19 seconds to enumerate untracked files. 'status -uno'
|
||||
may speed it up, but you have to be careful not to forget to add
|
||||
new files yourself (see 'git help status').
|
||||
no changes added to commit (use \"git add\" and/or \"git commit -a\")
|
||||
(dandisets) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ git diff
|
||||
(dandisets) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ git diff --cached
|
||||
(dandisets) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ sha256sum .dandi/assets.json
|
||||
6a0a91c4158d316ab8ad9bd8ebf7579b9c3c579e1035c48134246b6a5d2f6f14 .dandi/assets.json
|
||||
(dandisets) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ git show -- .dandi/assets.json
|
||||
commit b859efed7ddb2ff31cc26168f40676c572d2798f (HEAD -> draft, github/draft, github/HEAD)
|
||||
Author: DANDI User <info@dandiarchive.org>
|
||||
Date: Fri Sep 16 22:22:29 2022 +0000
|
||||
|
||||
[backups2datalad] 66 files added
|
||||
|
||||
diff --git a/.dandi/assets.json b/.dandi/assets.json
|
||||
index d3ef95e1ee..62fe372810 100644
|
||||
--- a/.dandi/assets.json
|
||||
+++ b/.dandi/assets.json
|
||||
@@ -1 +1 @@
|
||||
-/annex/objects/SHA256E-s69400783--8b576786d3926ab0e84809b4131cdc5a8f631674d378afa343e7dcd84f011c90.json
|
||||
+/annex/objects/SHA256E-s69507227--6a0a91c4158d316ab8ad9bd8ebf7579b9c3c579e1035c48134246b6a5d2f6f14.json
|
||||
(dandisets) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ git annex status
|
||||
M ./.dandi/assets.json
|
||||
|
||||
```
|
||||
"""]]
|
|
@ -0,0 +1,30 @@
|
|||
[[!comment format=mdwn
|
||||
username="yarikoptic"
|
||||
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
|
||||
subject="comment 3"
|
||||
date="2022-09-21T18:49:06Z"
|
||||
content="""
|
||||
the workaround you suggest elsewhere for the \"cosmetic\" problem works here too
|
||||
|
||||
```
|
||||
(dandisets) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ git status
|
||||
On branch draft
|
||||
Your branch is up to date with 'github/draft'.
|
||||
|
||||
Changes not staged for commit:
|
||||
(use \"git add <file>...\" to update what will be committed)
|
||||
(use \"git restore <file>...\" to discard changes in working directory)
|
||||
modified: .dandi/assets.json
|
||||
|
||||
no changes added to commit (use \"git add\" and/or \"git commit -a\")
|
||||
(dandisets) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ git update-index -q --refresh .dandi/assets.json
|
||||
(dandisets) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ git status
|
||||
On branch draft
|
||||
Your branch is up to date with 'github/draft'.
|
||||
|
||||
nothing to commit, working tree clean
|
||||
|
||||
```
|
||||
|
||||
but since we are relying on output from `status`, it is not just a \"cosmetic\" issue. IMHO if such `update-index` is needed, it should have been done by git-annex automagically somehow/sometime.
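
For scripts that depend on `status` output, the refresh shown above can be folded into a small wrapper. This is a minimal sketch, not something git-annex does itself: `status_after_refresh` is a hypothetical helper name, and it assumes plain `git` is on `PATH`.

```shell
#!/bin/sh
# Hypothetical helper (not part of git or git-annex): refresh the index's
# cached stat information, then print the machine-readable status.
status_after_refresh() {
    repo=$1
    # -q keeps update-index quiet when entries need updating; the exit
    # status is ignored because only the status output below matters.
    git -C "$repo" update-index -q --refresh || true
    # Report tracked files only, in the stable porcelain format.
    git -C "$repo" status --porcelain --untracked-files=no
}
```

Called as `status_after_refresh /path/to/repo`, it prints nothing when no tracked file really differs from the index.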
|
||||
"""]]
|
|
@ -0,0 +1,29 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 4"""
|
||||
date="2022-09-21T19:19:08Z"
|
||||
content="""
|
||||
So you can reproduce this? I am pretty sure it's not as simple as a drop
|
||||
followed by a get, so more information about reproducing it seems crucial.
|
||||
|
||||
I assume you are *not* seeing the "This is only a cosmetic problem affecting git status"
|
||||
message?
|
||||
|
||||
I expect that running `git update-index --refresh .dandi/assets.json`
|
||||
will fix git status. Can you confirm?
|
||||
|
||||
The only way I know of that this can happen without the message is if a
|
||||
drop or a get is still running, or gets interrupted. One of the last things
|
||||
git-annex does before exiting is restage all the unlocked files that it has
|
||||
updated.
|
||||
|
||||
Short of that, it seems like it would have to be a bug that prevents
|
||||
restagePointerFile from working. Which might not be a bug in git-annex,
|
||||
if the problem involves git's handling of timestamps in the index, for
|
||||
example. (Which is known to have some odd behaviors.)
|
||||
|
||||
(git-annex could be improved to do the
|
||||
restaging later when interrupted and possibly after such a bug.
|
||||
But there's no way to make it recover in `git status`, because
|
||||
git doesn't run it in this situation.)
|
||||
"""]]
|
|
@ -0,0 +1,10 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 5"""
|
||||
date="2022-09-21T22:06:49Z"
|
||||
content="""
|
||||
Seems likely that the --time-limit option, when combined with -J,
|
||||
could result in git-annex exiting before a worker thread gets a chance to
|
||||
call stagePointerFile. I have not verified this, and it would be unlikely
|
||||
to result in the same file being affected reproducibly.
|
||||
"""]]
|
|
@ -0,0 +1,33 @@
|
|||
[[!comment format=mdwn
|
||||
username="yarikoptic"
|
||||
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
|
||||
subject="comment 6"
|
||||
date="2022-09-22T01:03:18Z"
|
||||
content="""
|
||||
Maybe it is one of those options, but in my case it is just a straight `get` on that single unlocked file:
|
||||
|
||||
```
|
||||
(dandisets) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ git status
|
||||
On branch draft
|
||||
Your branch is up to date with 'github/draft'.
|
||||
|
||||
nothing to commit, working tree clean
|
||||
(dandisets) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ cat .dandi/assets.json
|
||||
/annex/objects/SHA256E-s69507227--6a0a91c4158d316ab8ad9bd8ebf7579b9c3c579e1035c48134246b6a5d2f6f14.json
|
||||
(dandisets) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ git annex get .dandi/assets.json
|
||||
get .dandi/assets.json (from dandi-dandisets-dropbox...)
|
||||
(checksum...) ok
|
||||
(recording state in git...)
|
||||
(dandisets) dandi@drogon:/mnt/backup/dandi/dandisets/000026$ git status
|
||||
On branch draft
|
||||
Your branch is up to date with 'github/draft'.
|
||||
|
||||
Changes not staged for commit:
|
||||
(use \"git add <file>...\" to update what will be committed)
|
||||
(use \"git restore <file>...\" to discard changes in working directory)
|
||||
modified: .dandi/assets.json
|
||||
|
||||
no changes added to commit (use \"git add\" and/or \"git commit -a\")
|
||||
|
||||
```
|
||||
"""]]
|
|
@ -0,0 +1,58 @@
|
|||
[[!comment format=mdwn
|
||||
username="yarikoptic"
|
||||
avatar="http://cdn.libravatar.org/avatar/f11e9c84cb18d26a1748c33b48c924b4"
|
||||
subject="comment 7"
|
||||
date="2022-09-22T01:33:24Z"
|
||||
content="""
|
||||
sorry I have not mentioned your [earlier comment 4](http://git-annex.branchable.com/bugs/reports_file___34__modified__34___whenever_it_is_not/#comment-ca0281ff580c91c40e429fbbb71a3791) but my clarification above I think gives the answers to your questions ;)
|
||||
|
||||
<details>
|
||||
<summary>FWIW here is the get --debug output </summary>
|
||||
|
||||
```shell
|
||||
[2022-09-21 21:29:59.904218] (Utility.Process) process [3968193] read: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"-c\",\"annex.debug=true\",\"ls-files\",\"--stage\",\"-z\",\"--error-unmatch\",\"--\",\".dandi/assets.json\"]
|
||||
[2022-09-21 21:29:59.904725] (Utility.Process) process [3968194] chat: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"-c\",\"annex.debug=true\",\"cat-file\",\"--batch-check=%(objectname) %(objecttype) %(objectsize)\",\"--buffer\"]
|
||||
[2022-09-21 21:29:59.905645] (Utility.Process) process [3968195] chat: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"-c\",\"annex.debug=true\",\"cat-file\",\"--batch=%(objectname) %(objecttype) %(objectsize)\",\"--buffer\"]
|
||||
[2022-09-21 21:29:59.906012] (Utility.Process) process [3968196] read: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"-c\",\"annex.debug=true\",\"show-ref\",\"git-annex\"]
|
||||
[2022-09-21 21:29:59.907578] (Utility.Process) process [3968196] done ExitSuccess
|
||||
[2022-09-21 21:29:59.907891] (Utility.Process) process [3968197] read: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"-c\",\"annex.debug=true\",\"show-ref\",\"--hash\",\"refs/heads/git-annex\"]
|
||||
[2022-09-21 21:29:59.913611] (Utility.Process) process [3968197] done ExitSuccess
|
||||
[2022-09-21 21:29:59.914676] (Utility.Process) process [3968198] read: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"-c\",\"annex.debug=true\",\"log\",\"refs/heads/git-annex..5f5efa8544ff02c9261dd1590425dcea37a55526\",\"--pretty=%H\",\"-n1\"]
|
||||
[2022-09-21 21:29:59.916707] (Utility.Process) process [3968198] done ExitSuccess
|
||||
[2022-09-21 21:29:59.916968] (Utility.Process) process [3968199] read: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"-c\",\"annex.debug=true\",\"log\",\"refs/heads/git-annex..18497e6e9cab7754a85256416c361fee36ba65b2\",\"--pretty=%H\",\"-n1\"]
|
||||
[2022-09-21 21:29:59.918722] (Utility.Process) process [3968199] done ExitSuccess
|
||||
[2022-09-21 21:29:59.919069] (Utility.Process) process [3968200] chat: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"-c\",\"annex.debug=true\",\"cat-file\",\"--batch=%(objectname) %(objecttype) %(objectsize)\",\"--buffer\"]
|
||||
get .dandi/assets.json [2022-09-21 21:29:59.921463] (Utility.Process) process [3968202] chat: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"-c\",\"annex.debug=true\",\"cat-file\",\"--batch\"]
|
||||
(from dandi-dandisets-dropbox...) [2022-09-21 21:29:59.931525] (Utility.Process) process [3968203] chat: /home/dandi/miniconda3/envs/dandisets/bin/git-annex [\"transferrer\",\"-c\",\"annex.debug=true\"]
|
||||
[2022-09-21 21:29:59.93162] (Annex.TransferrerPool) > d rdandi-dandisets-dropbox SHA256E-s69507227--6a0a91c4158d316ab8ad9bd8ebf7579b9c3c579e1035c48134246b6a5d2f6f14.json .dandi/assets.json
|
||||
[2022-09-21 21:29:59.942599] (Annex.TransferrerPool) < opb
|
||||
|
||||
[2022-09-21 21:29:59.942718] (Annex.TransferrerPool) < ops 69507227
|
||||
[2022-09-21 21:30:03.103409] (Annex.TransferrerPool) < ope
|
||||
[2022-09-21 21:30:03.103539] (Annex.TransferrerPool) < om (checksum...)
|
||||
(checksum...) [2022-09-21 21:30:03.768599] (Annex.TransferrerPool) < t
|
||||
[2022-09-21 21:30:03.768843] (Annex.Branch) read 6e0/a70/SHA256E-s69507227--6a0a91c4158d316ab8ad9bd8ebf7579b9c3c579e1035c48134246b6a5d2f6f14.json.log
|
||||
[2022-09-21 21:30:03.770259] (Annex.Branch) set 6e0/a70/SHA256E-s69507227--6a0a91c4158d316ab8ad9bd8ebf7579b9c3c579e1035c48134246b6a5d2f6f14.json.log
|
||||
ok
|
||||
[2022-09-21 21:30:03.770361] (Utility.Process) process [3968200] done ExitSuccess
|
||||
[2022-09-21 21:30:03.770425] (Utility.Process) process [3968195] done ExitSuccess
|
||||
[2022-09-21 21:30:03.770484] (Utility.Process) process [3968194] done ExitSuccess
|
||||
[2022-09-21 21:30:03.770531] (Utility.Process) process [3968193] done ExitSuccess
|
||||
[2022-09-21 21:30:03.771187] (Utility.Process) process [3968452] chat: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"-c\",\"annex.debug=true\",\"hash-object\",\"-w\",\"--stdin-paths\",\"--no-filters\"]
|
||||
[2022-09-21 21:30:03.77319] (Utility.Process) process [3968453] feed: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"-c\",\"annex.debug=true\",\"update-index\",\"-z\",\"--index-info\"]
|
||||
[2022-09-21 21:30:04.063182] (Utility.Process) process [3968453] done ExitSuccess
|
||||
[2022-09-21 21:30:04.063779] (Utility.Process) process [3968463] read: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"-c\",\"annex.debug=true\",\"show-ref\",\"--hash\",\"refs/heads/git-annex\"]
|
||||
[2022-09-21 21:30:04.065352] (Utility.Process) process [3968463] done ExitSuccess
|
||||
(recording state in git...)
|
||||
[2022-09-21 21:30:04.06587] (Utility.Process) process [3968464] read: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"-c\",\"annex.debug=true\",\"write-tree\"]
|
||||
[2022-09-21 21:30:04.407935] (Utility.Process) process [3968464] done ExitSuccess
|
||||
[2022-09-21 21:30:04.408528] (Utility.Process) process [3968468] chat: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"-c\",\"annex.debug=true\",\"commit-tree\",\"56c62dcc21145201f9454a2dd6e75cc37f072ee4\",\"--no-gpg-sign\",\"-p\",\"refs/heads/git-annex\"]
|
||||
[2022-09-21 21:30:04.410591] (Utility.Process) process [3968468] done ExitSuccess
|
||||
[2022-09-21 21:30:04.413623] (Utility.Process) process [3968469] call: git [\"--git-dir=.git\",\"--work-tree=.\",\"--literal-pathspecs\",\"-c\",\"annex.debug=true\",\"update-ref\",\"refs/heads/git-annex\",\"c3a1f9208649b47621b1424b055bd9871aa2fc79\"]
|
||||
[2022-09-21 21:30:04.415318] (Utility.Process) process [3968469] done ExitSuccess
|
||||
[2022-09-21 21:30:04.416301] (Utility.Process) process [3968202] done ExitSuccess
|
||||
[2022-09-21 21:30:04.416574] (Utility.Process) process [3968452] done ExitSuccess
|
||||
[2022-09-21 21:30:06.373343] (Utility.Process) process [3968203] done ExitFailure 1
|
||||
```
|
||||
</details>
|
||||
"""]]
|
|
@ -0,0 +1,9 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 8"""
|
||||
date="2022-09-22T17:02:04Z"
|
||||
content="""
|
||||
I've fixed the issue I found with --time-limit combined with -J. Which I do
|
||||
think could have resulted in the same kind of problem. But you've shown
|
||||
that is not the cause in your case..
|
||||
"""]]
|
|
@ -0,0 +1,19 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 9"""
|
||||
date="2022-09-22T17:04:35Z"
|
||||
content="""
|
||||
Thanks for the --debug. It shows that git-annex is not running
|
||||
`git update-index --refresh` at all.
|
||||
|
||||
And it shows that the transfer happens in a `git-annex transferrer` process.
|
||||
So, I think you have annex.stalldetection set.
|
||||
|
||||
[2022-09-21 21:29:59.931525] (Utility.Process) process [3968203] chat: /home/dandi/miniconda3/envs/dandisets/bin/git-annex [\"transferrer\",\"-c\",\"annex.debug=true\"]
|
||||
|
||||
And interestingly, that transferrer process fails at the end:
|
||||
|
||||
[2022-09-21 21:30:06.373343] (Utility.Process) process [3968203] done ExitFailure 1
|
||||
|
||||
Aha! I can reproduce it by setting annex.stalldetection.
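
To check whether a repository is in this situation, the setting can be queried with plain `git config`. A minimal sketch, where `stalldetection_setting` is a hypothetical helper name and nothing is assumed about which values git-annex accepts:

```shell
#!/bin/sh
# Hypothetical helper: report whether annex.stalldetection is configured
# for the given repository (local, global, or system config).
stalldetection_setting() {
    repo=$1
    # "git config --get" exits non-zero when the key is unset.
    value=$(git -C "$repo" config --get annex.stalldetection) || value=""
    if [ -n "$value" ]; then
        echo "annex.stalldetection=$value"
    else
        echo "annex.stalldetection is unset"
    fi
}
```

If it reports a value, transfers are likely to run in a separate `git-annex transferrer` process, as in the debug output quoted above.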
|
||||
"""]]
|
|
@ -0,0 +1,72 @@
|
|||
### Please describe the problem.
|
||||
|
||||
NB: I can't change the title, but this is not about Depends, since libgcc-s1 is essential... so most likely some LD_LIBRARY_PATH manipulation is in place, or something like that.
|
||||
|
||||
[Testing of git-annex-remote-rclone on ubuntu-20.04 crashed](https://github.com/DanielDent/git-annex-remote-rclone/actions/runs/3750292044/jobs/6370225718) with
|
||||
|
||||
```
|
||||
+ git-annex copy -J5 --quiet . --to GA-rclone-CI
|
||||
libgcc_s.so.1 must be installed for pthread_cancel to work
|
||||
/home/runner/work/git-annex-remote-rclone/git-annex-remote-rclone/tests/all-in-one.sh: line 124: 3066 Aborted (core dumped) git-annex copy -J5 --quiet . --to GA-rclone-CI
|
||||
Error: Process completed with exit code 134.
|
||||
```
|
||||
|
||||
Installation of git-annex:
|
||||
|
||||
```
|
||||
Run datalad-installer --sudo ok git-annex -m datalad/git-annex:release
|
||||
2022-12-21T15:10:30+0000 [INFO ] datalad_installer Writing environment modifications to /tmp/dl-env-j8s29if7.sh
|
||||
2022-12-21T15:10:30+0000 [INFO ] datalad_installer Installing git-annex via datalad/git-annex:release
|
||||
2022-12-21T15:10:30+0000 [INFO ] datalad_installer Version: None
|
||||
2022-12-21T15:10:30+0000 [INFO ] datalad_installer Downloading https://github.com/datalad/git-annex/releases/download/10.20221212/git-annex-standalone_10.20221212-1.ndall%2B1_amd64.deb
|
||||
2022-12-21T15:10:33+0000 [INFO ] datalad_installer Running: sudo dpkg -i /tmp/tmpah14ch03/git-annex-standalone_10.20221212-1.ndall+1_amd64.deb
|
||||
Selecting previously unselected package git-annex-standalone.
|
||||
(Reading database ... 236921 files and directories currently installed.)
|
||||
Preparing to unpack .../git-annex-standalone_10.20221212-1.ndall+1_amd64.deb ...
|
||||
Unpacking git-annex-standalone (10.20221212-1~ndall+1) ...
|
||||
Setting up git-annex-standalone (10.20221212-1~ndall+1) ...
|
||||
Processing triggers for mailcap (3.70+nmu1ubuntu1) ...
|
||||
Processing triggers for hicolor-icon-theme (0.17-2) ...
|
||||
Processing triggers for man-db (2.10.2-1) ...
|
||||
2022-12-21T15:10:35+0000 [INFO ] datalad_installer git-annex is now installed at /usr/bin/git-annex
|
||||
```
|
||||
|
||||
Or maybe that is an issue with `rclone`? In this case it was
|
||||
|
||||
```
|
||||
Run datalad-installer --sudo ok rclone=v1.59.2 -m downloads.rclone.org
|
||||
2022-12-21T15:10:35+0000 [INFO ] datalad_installer Writing environment modifications to /tmp/dl-env-aon5z6_f.sh
|
||||
2022-12-21T15:10:35+0000 [INFO ] datalad_installer Installing rclone from downloads.rclone.org
|
||||
2022-12-21T15:10:35+0000 [INFO ] datalad_installer Version: v1.59.2
|
||||
2022-12-21T15:10:35+0000 [INFO ] datalad_installer Bin dir: /usr/local/bin
|
||||
2022-12-21T15:10:35+0000 [INFO ] datalad_installer Man dir: None
|
||||
2022-12-21T15:10:35+0000 [INFO ] datalad_installer Downloading https://downloads.rclone.org/v1.59.2/rclone-v1.59.2-linux-amd64.zip
|
||||
2022-12-21T15:10:38+0000 [INFO ] datalad_installer Moving /tmp/tmp75sde__c/rclone-v1.59.2-linux-amd64/rclone to /usr/local/bin/rclone
|
||||
2022-12-21T15:10:38+0000 [INFO ] datalad_installer rclone is now installed at /usr/local/bin/rclone
|
||||
```
|
||||
|
||||
I have tried to reproduce locally with exactly those installations of rclone and git-annex, but I am not getting the same problem :-/
|
||||
|
||||
I have also run with `--debug` and got
|
||||
```
|
||||
[2022-12-21 17:20:10.056928113] (Utility.Process) process [11603] chat: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","commit-tree","c95a5c849daca7183eefc28c360942104d01e900","--no-gpg-sign","-p","refs/heads/git-annex"]
|
||||
[2022-12-21 17:20:10.060448661] (Utility.Process) process [11603] done ExitSuccess
|
||||
[2022-12-21 17:20:10.060806165] (Utility.Process) process [11604] call: git ["--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","annex.debug=true","update-ref","refs/heads/git-annex","248cef615747c4aba64fbb475b0a03c8d2a78b27"]
|
||||
[2022-12-21 17:20:10.063957208] (Utility.Process) process [11604] done ExitSuccess
|
||||
[2022-12-21 17:20:10.066005436] (Utility.Process) process [11127] done ExitSuccess
|
||||
[2022-12-21 17:20:10.066266539] (Utility.Process) process [11114] done ExitSuccess
|
||||
[2022-12-21 17:20:10.066702845] (Utility.Process) process [11126] done ExitSuccess
|
||||
[2022-12-21 17:20:10.067107151] (Utility.Process) process [11125] done ExitSuccess
|
||||
[2022-12-21 17:20:10.067357854] (Utility.Process) process [11599] done ExitSuccess
|
||||
libgcc_s.so.1 must be installed for pthread_cancel to work
|
||||
/home/runner/work/git-annex-remote-rclone/git-annex-remote-rclone/tests/all-in-one.sh: line 125: 11083 Aborted (core dumped) git-annex drop -J5 --debug .
|
||||
Error: Process completed with exit code 134.
|
||||
```
|
||||
in https://github.com/DanielDent/git-annex-remote-rclone/actions/runs/3751417971/jobs/6372374929 .
|
||||
|
||||
Any ideas Joey?
|
||||
|
||||
[[!meta author=yoh]]
|
||||
[[!tag projects/dandi]]
|
||||
|
||||
> [[fixed|done]] --[[Joey]]
|
|
@ -0,0 +1,23 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 1"""
|
||||
date="2022-12-22T18:38:32Z"
|
||||
content="""
|
||||
I'm a bit surprised git-annex is using `pthread_cancel`, since `strings`
|
||||
does not show it contains that symbol. Perhaps one of the other pthread
|
||||
symbols it uses ends up calling that.
|
||||
|
||||
It does seem though from the message that it's git-annex and not a program
|
||||
it runs that is core dumping on this. Also I checked, and the rclone you
|
||||
installed is a statically linked binary so I would not expect it to use
|
||||
`libgcc_s.so`. And git-annex-remote-rclone is a bash script, and bash
|
||||
doesn't use pthreads.
|
||||
|
||||
(I do think that, in general, using the git-annex standalone tarball and
|
||||
then trying to run additional programs besides git-annex inside it is not
|
||||
going to always work well. Standalone interposes its own versions of libraries,
|
||||
which may not work with the other programs. There is already a todo about that,
|
||||
[[todo/restore_original_environment_when_running_external_special_remotes_from_standalone_git-annex__63__]].)
|
||||
|
||||
I've added `libgcc_s.so.1` to the standalone build.
|
||||
"""]]
|