reproduced and some analysis
This commit is contained in:
parent
f9f4f29cc2
commit
e4a626018f
2 changed files with 75 additions and 0 deletions
|
@ -0,0 +1,35 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 1"""
|
||||
date="2020-05-01T17:53:57Z"
|
||||
content="""
|
||||
I was able to reproduce the progress bar sticking at 100%. I made two local
|
||||
repos in the assistant, each a remote of the other (not using ssh), and
|
||||
added a file to one. It reached 100% fairly quickly, and then sat there
|
||||
for a minute or so, before getting unstuck.
|
||||
|
||||
(The bug submitter's log includes "hPutStr: illegal operation (handle is
|
||||
closed)" and I did not always see that in my test. May be unrelated.)
|
||||
|
||||
While stuck like that, I noticed that git-annex info also said the transfer
|
||||
was still in progress. So apparently transferkeys is being delayed from
|
||||
finishing the transfer for some reason.
|
||||
|
||||
Looking at ps, I noticed that "git-annex smudge --clean -- thefile" was a
|
||||
child process of transferkeys. (The file was unlocked, and was the file I
|
||||
had just added and that it was stuck on.) This may be a red herring; I
|
||||
don't always see that when it's stuck.
|
||||
|
||||
On a hunch, I stopped the assistant process that was running in the sibling
|
||||
repo of the one where I was adding files. So there was only a single
|
||||
assistant running, not two. That seems to have gotten rid of the delay, or
|
||||
at least most of it. It still takes around 6s longer than I would have
|
||||
expected at the end.
|
||||
|
||||
Running transferkeys manually, and feeding it requests in its internal
|
||||
protocol, transfers happened in well under a second. It did not do any
|
||||
smudging then.
|
||||
|
||||
My guess so far is that something about the two assistants causes a race,
|
||||
which it has to delay for a while to recover from.
|
||||
"""]]
|
|
@ -0,0 +1,40 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 2"""
|
||||
date="2020-05-01T18:57:22Z"
|
||||
content="""
|
||||
stracing transferkeys after it got stuck, it was stuck until
|
||||
this completed:
|
||||
|
||||
wait4(461218, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 461218
|
||||
|
||||
That pid was "git --git-dir=../annex2/.git --work-tree=../annex2
|
||||
--literal-pathspecs --literal-pathspecs -c core.safecrlf=false update-index
|
||||
-q --refresh -z --stdin", and it did have a git-annex smudge --clean that
|
||||
seemed stuck for a while.
|
||||
|
||||
Then it proceeded doing this:
|
||||
|
||||
rename("../annex2/.git/annexindex.0/index", "../annex2/.git/index") = 0
|
||||
|
||||
annex2 is the remote. That temp directory and the git update-index process
|
||||
says restagePointerFile is what got stuck.
|
||||
|
||||
I managed to strace the stuck git-annex smudge --clean, and it's in a loop:
|
||||
|
||||
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
|
||||
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
|
||||
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
|
||||
|
||||
(Just before that loop, it seemed to be waiting for some
|
||||
child processes, but I didn't catch which ones they were.)
|
||||
|
||||
After many iterations:
|
||||
|
||||
rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [], 8) = 0
|
||||
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
|
||||
munmap(0x7f4438549000, 41947136) = 0
|
||||
munmap(0x7f443ad4a000, 41947136) = 0
|
||||
exit_group(0) = ?
|
||||
+++ exited with 0 +++
|
||||
"""]]
|
Loading…
Add table
Reference in a new issue