profiling and performance

parent 9313bf0eac, commit 814bb3a270
2 changed files with 65 additions and 0 deletions

@@ -0,0 +1,50 @@
[[!comment format=mdwn
 username="joey"
 subject="""comment 12"""
 date="2022-11-17T17:43:27Z"
 content="""
With --debug, it shows when it enters and leaves commitDb.

commitDb is run around 70 times in this repository. Making it queue more
changes into each commitDb, so that it runs an order of magnitude fewer
times, sped it up by around 1 second (from 24 seconds).
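
For illustration, the batching amounts to something like the sketch below.
(The Change type, queueChange, and batchLimit are hypothetical names for
this sketch, not git-annex's actual Database.Handle internals.)

    import Control.Monad (unless, when)
    import Data.IORef

    -- A stand-in for one queued keys-db change.
    data Change = AddAssociatedFile FilePath String

    batchLimit :: Int
    batchLimit = 1000

    -- Accumulate changes and only run commitDb once enough have queued,
    -- so it runs an order of magnitude fewer times.
    queueChange :: IORef [Change] -> ([Change] -> IO ()) -> Change -> IO ()
    queueChange pending commitDb c = do
      n <- atomicModifyIORef' pending (\cs -> (c : cs, length cs + 1))
      when (n >= batchLimit) $ flushQueue pending commitDb

    -- Flush whatever is still queued (also called once at the end of a scan).
    flushQueue :: IORef [Change] -> ([Change] -> IO ()) -> IO ()
    flushQueue pending commitDb = do
      cs <- atomicModifyIORef' pending (\cs -> ([], reverse cs))
      unless (null cs) $ commitDb cs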

Summing up the time spent in commitDb according to --debug, it is around
6.5 seconds out of the 24 second run. That is close to the actual time
sqlite is taking.
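
(For reference, that kind of sum can also be cross-checked directly in code
by wrapping the action with a timer; a sketch, not how the --debug numbers
are actually produced:)

    import Data.IORef
    import GHC.Clock (getMonotonicTime)

    -- Run an action and add its wall-clock time to a running total.
    timed :: IORef Double -> IO a -> IO a
    timed total action = do
      t0 <- getMonotonicTime
      r  <- action
      t1 <- getMonotonicTime
      modifyIORef' total (+ (t1 - t0))
      return r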

But per my comment above, with commitDb stubbed out, the whole run takes
only 6 seconds. (I've re-verified that to be sure.) Why is it so much faster
then? After doing some profiling and looking at flame graphs, I think it
comes down to allocations:

(For future reference, here is a flame graph measuring runtimes of
unmodified git-annex: <https://tmp.joeyh.name/git-annex-flame-1.svg>
It shows the majority of time in commitDb as expected, and that it's due to
addAssociatedFile. The resulting upsert spends about 1/4 of its time reading
and 3/4 writing. All unsurprising. But a sizable amount of time is also
spent in GC, and in IDLE, whatever that is.)
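
(Also for reference, the numbers below come from GHC's cost-centre
profiler. A sketch of pinning an explicit cost centre on the suspect code
path, so it shows up by name when built with -prof -fprof-auto and run
with +RTS -p; not necessarily how the graphs above were produced:)

    -- commitDbWork is a stand-in for the real work of flushing queued
    -- changes to sqlite; the SCC makes it show up by name in the profile.
    commitDb :: IO ()
    commitDb = {-# SCC "commitDb" #-} commitDbWork
      where
        commitDbWork = return ()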

Without it stubbed out:

    total alloc = 6,385,408,240 bytes (excludes profiling overheads)

With it stubbed out:

    total alloc = 1,714,898,200 bytes (excludes profiling overheads)

Here is a flame graph measuring allocations, not runtime, of unmodified
git-annex: <https://tmp.joeyh.name/git-annex-flame-3.svg>

By far the main allocation is by mkSqlBackend, which is surprising because
that represents a database handle. Why would it need to spend tons of memory
making database handles?

Starting to think this could be a bug in persistent or persistent-sqlite.
I've tried disabling the lowlevel persistent parts of Database.Handle in
case there was a bug in there, and using persistent more typically, but
that didn't change the mkSqlBackend allocations. Next step might be a
simpler test case.
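
A minimal test case could be along these lines: hammer persistent-sqlite
with upserts shaped like addAssociatedFile, and profile that on its own.
(This is only a sketch; the Associated schema, the database name, and the
sizes are invented for the test and are not git-annex's keys db schema.)

    {-# LANGUAGE DataKinds, DerivingStrategies, EmptyDataDecls,
                 FlexibleContexts, FlexibleInstances, GADTs,
                 GeneralizedNewtypeDeriving, MultiParamTypeClasses,
                 OverloadedStrings, QuasiQuotes, StandaloneDeriving,
                 TemplateHaskell, TypeFamilies, UndecidableInstances #-}
    import Control.Monad (forM_)
    import Data.Text (Text)
    import qualified Data.Text as T
    import Database.Persist
    import Database.Persist.Sqlite
    import Database.Persist.TH

    -- An invented schema, vaguely shaped like a key/file association.
    share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
    Associated
      key Text
      file Text
      UniqueKeyFile key file
    |]

    main :: IO ()
    main = runSqlite "upsert-test.db" $ do
      runMigration migrateAll
      forM_ [(1 :: Int) .. 50000] $ \i -> do
        let k = T.pack ("key" ++ show (i `mod` 1000))
            f = T.pack ("file" ++ show i)
        -- an upsert per association, roughly like addAssociatedFile
        _ <- upsert (Associated k f) [AssociatedFile =. f]
        return ()

Built with -prof -fprof-auto and run with +RTS -p, that would show whether
mkSqlBackend still dominates the allocations outside of git-annex.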

OTOH, it may be that the way persistent uses the SqlBackend as a dictionary
of functions means that calls to the individual functions in it confuse the
profiler, and allocation done to build SQL statements is being misattributed?
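
To illustrate that hypothesis with a toy example (Ops and mkOps are made-up
names standing in for SqlBackend and mkSqlBackend): a record of functions is
built in one place and called somewhere else, and with GHC's lexical
cost-centre scoping the allocation done inside those functions can end up
charged to where the closures were built rather than to the call site.

    -- The list allocation happens when runQuery is called from main, but
    -- the profiler can attribute it to the cost-centre stack in effect
    -- where the closure was created, i.e. under mkOps.
    newtype Ops = Ops { runQuery :: Int -> [Int] }

    mkOps :: Ops
    mkOps = Ops { runQuery = \n -> [1 .. n] }

    main :: IO ()
    main = print (sum (runQuery mkOps 1000000))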

"""]]

@@ -0,0 +1,15 @@
[[!comment format=mdwn
 username="joey"
 subject="""comment 13"""
 date="2022-11-17T20:20:16Z"
 content="""
An idea from looking at the flame graphs is that if reconcileStaged knew it
was run on an empty keys db (or was doing a full scan of the tree, not an
incremental update), it could avoid doing the upsert and do a blind insert
instead. (See the sketch below.)

This will need some care to be implemented safely...

I benchmarked it, and using insertUnique is no faster, but using insert is.
This would be a 15% speed up.
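
Continuing the invented Associated schema from the test-case sketch in
comment 12 (so the names here are illustrative too, not the actual
reconcileStaged code), the idea is roughly:

    data DbState = KnownEmpty | MayHaveEntries

    addAssociated :: DbState -> Text -> Text -> SqlPersistM ()
    addAssociated KnownEmpty k f =
      -- blind insert: safe only because nothing can already match
      insert_ (Associated k f)
    addAssociated MayHaveEntries k f = do
      -- the usual read-modify-write path via the uniqueness constraint
      _ <- upsert (Associated k f) [AssociatedFile =. f]
      return ()

(insertUnique would slot into the KnownEmpty case the same way, but per the
benchmark above it doesn't buy anything; only the blind insert does.)
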
"""]]