reproduced and some analysis
This commit is contained in:
parent
f9f4f29cc2
commit
e4a626018f
2 changed files with 75 additions and 0 deletions
|
@ -0,0 +1,35 @@
|
||||||
|
[[!comment format=mdwn
|
||||||
|
username="joey"
|
||||||
|
subject="""comment 1"""
|
||||||
|
date="2020-05-01T17:53:57Z"
|
||||||
|
content="""
|
||||||
|
I was able to reproduce the progress bar sticking at 100%. I made two local
|
||||||
|
repos in the assistant, each a remote of the other (not using ssh), and
|
||||||
|
added a file to one. It reached 100% fairly quickly, and then sat there
|
||||||
|
for a minute or so, before getting unstuck.
|
||||||
|
|
||||||
|
(The bug submitter's log includes "hPutStr: illegal operation (handle is
|
||||||
|
closed)" and I did not always see that in my test. May be unrelated.)
|
||||||
|
|
||||||
|
While stuck like that, I noticed that git-annex info also said the transfer
|
||||||
|
was still in progress. So apparently transferkeys is being delayed from
|
||||||
|
finishing the transfer for some reason.
|
||||||
|
|
||||||
|
Looking at ps, I noticed that "git-annex smudge --clean -- thefile" was a
|
||||||
|
child process of transferkeys. (The file was unlocked, and was the file I
|
||||||
|
had just added and that it was stuck on.) This may be a red herring; I
|
||||||
|
don't always see that when it's stuck.
|
||||||
|
|
||||||
|
On a hunch, I stopped the assistant process that was running in the sibling
|
||||||
|
repo of the one where I was adding files. So there was only a single
|
||||||
|
assistant running, not two. That seems to have gotten rid of the delay, or
|
||||||
|
at least most of it. It still takes around 6s longer than I would have
|
||||||
|
expected at the end.
|
||||||
|
|
||||||
|
Running transferkeys manually, and feeding it requests in its internal
|
||||||
|
protocol, transfers happened in well under a second. It did not do any
|
||||||
|
smudging then.
|
||||||
|
|
||||||
|
My guess so far is that something about the two assistants causes a race,
|
||||||
|
which it has to delay for a while to recover from.
|
||||||
|
"""]]
|
|
@ -0,0 +1,40 @@
|
||||||
|
[[!comment format=mdwn
|
||||||
|
username="joey"
|
||||||
|
subject="""comment 2"""
|
||||||
|
date="2020-05-01T18:57:22Z"
|
||||||
|
content="""
|
||||||
|
stracing transferkeys after it got stuck, it was stuck until
|
||||||
|
this completed:
|
||||||
|
|
||||||
|
wait4(461218, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 461218
|
||||||
|
|
||||||
|
That pid was "git --git-dir=../annex2/.git --work-tree=../annex2
|
||||||
|
--literal-pathspecs --literal-pathspecs -c core.safecrlf=false update-index
|
||||||
|
-q --refresh -z --stdin", and it did have a git-annex smudge --clean that
|
||||||
|
seemed stuck for a while.
|
||||||
|
|
||||||
|
Then it proceeded doing this:
|
||||||
|
|
||||||
|
rename("../annex2/.git/annexindex.0/index", "../annex2/.git/index") = 0
|
||||||
|
|
||||||
|
annex2 is the remote. That temp directory and the git update-index process
|
||||||
|
says restagePointerFile is what got stuck.
|
||||||
|
|
||||||
|
I managed to strace the stuck git-annex smudge --clean, and it's in a loop:
|
||||||
|
|
||||||
|
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
|
||||||
|
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
|
||||||
|
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
|
||||||
|
|
||||||
|
(Just before that loop, it seemed to be waiting for some
|
||||||
|
child processes, but I didn't catch which ones they were.)
|
||||||
|
|
||||||
|
After many iterations:
|
||||||
|
|
||||||
|
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
|
||||||
|
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
|
||||||
|
munmap(0x7f4438549000, 41947136) = 0
|
||||||
|
munmap(0x7f443ad4a000, 41947136) = 0
|
||||||
|
exit_group(0) = ?
|
||||||
|
+++ exited with 0 +++
|
||||||
|
"""]]
|
Loading…
Add table
Add a link
Reference in a new issue