update
This commit is contained in:
parent
c2ad84b423
commit
b312b2a30b
2 changed files with 48 additions and 0 deletions
|
@ -19,4 +19,10 @@ an ever-growing amount of memory, and be slowed down by the write attempts.
|
|||
Still, it does give it something better to do while the write is failing
|
||||
than sleeping and retrying, eg to do the rest of the work it's been asked
|
||||
to do.
|
||||
|
||||
(Update: Reads from a database first call flushDbQueue, and it would
|
||||
not be safe for that to return without actually writing to the database,
|
||||
since the read would then see possible stale information. It turns
|
||||
out that `git-annex get` does do a database read per file (getAssociatedFiles).
|
||||
So it seems this approach will not work.)
|
||||
"""]]
|
||||
|
|
|
@ -0,0 +1,42 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 25"""
|
||||
date="2022-10-11T17:17:13Z"
|
||||
content="""
|
||||
Revisiting this, I don't understand why the htop at the top of this page
|
||||
lists so many `git-annex get` processes. They all seem to be children
|
||||
of a single parent `git-annex get` process. But `git-annex get` does
|
||||
not fork children like those. I wonder if perhaps those child
|
||||
processes were forked() to do something else, but somehow hung before
|
||||
they could exec?!
|
||||
|
||||
By comment #2, annex.stalldetection is enabled, so `git-annex get`
|
||||
runs 5 `git-annex transferrer` processes. Each of those can write to the
|
||||
database, so concurrent sqlite writes can happen. So, my "re: comment 16"
|
||||
comment was off-base in thinking there was a single git-annex process.
|
||||
And so, I don't think the debug info requested in that comment is needed.
|
||||
|
||||
Also, it turns out that the database queue is being flushed after every
|
||||
file it gets, which is causing a sqlite write per file. So there are a
|
||||
lot of sqlite writes happening, which probably makes this issue much more
|
||||
likely to occur, on systems with slow enough disk IO that it does occur.
|
||||
Especially if the files are relatively small.
|
||||
|
||||
The reason for the queue flush is partly that Annex.run forces a queue
|
||||
flush after every action. That could, I think be avoided. That was only
|
||||
done to make sure the queue is flushed before the program exits, which
|
||||
should be able to be handled in a different way. But also,
|
||||
the queue has to be flushed before reading from the database in order
|
||||
for the read to see current information. In the `git-annex get` case,
|
||||
it queues a change to the inode cache, and then reads the associated
|
||||
files. To avoid that, it would need to keep track of the two different
|
||||
tables in the keys db, and flush the queue only when querying a table
|
||||
that a write had been queued to. That would be worth doing
|
||||
just to generally speed up `git-annex get`. A quick benchmark shows
|
||||
a get of 1000 small files that takes 17s will only take 12s once that's
|
||||
done. And that's on a fast SSD, probably much more on a hard drive!
|
||||
|
||||
So I don't have a full solution, but speeding git-annex up significantly and
|
||||
also making whatever the problem in this bug is probably much less likely
|
||||
to occur is a good next step..
|
||||
"""]]
|
Loading…
Add table
Reference in a new issue