[[!comment format=mdwn
username="joey"
subject="""more profiling"""
date="2016-09-26T19:59:43Z"
content="""
Instead of profiling `git annex copy --to remote`, I profiled `git annex
find --not --in web`, which needs to do the same kind of location log lookup.

    total time  =       12.41 secs   (12413 ticks @ 1000 us, 1 processor)
    total alloc = 8,645,057,104 bytes  (excludes profiling overheads)

    COST CENTRE                MODULE             %time %alloc

    adjustGitEnv               Git.Env             21.4   37.0
    catchIO                    Utility.Exception   13.2    2.8
    spanList                   Data.List.Utils     12.6   17.9
    parsePOSIXTime             Logs.TimeStamp       6.1    5.0
    catObjectDetails.receive   Git.CatFile          5.9    2.1
    startswith                 Data.List.Utils      5.7    3.8
    md5                        Data.Hash.MD5        5.1    7.9
    join                       Data.List.Utils      2.4    6.0
    readFileStrictAnyEncoding  Utility.Misc         2.2    0.5

The adjustGitEnv overhead is a surprise! It seems it is getting called once
per file, and allocating a new copy of the environment each time. Call stack:
withIndex calls withIndexFile calls addGitEnv calls adjustGitEnv.
Looks like simply making gitEnv be cached at startup would avoid most of
the adjustGitEnv slowdown.
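
A rough sketch of that caching idea, not the actual git-annex code: the
`Repo` type, `cacheGitEnv`, `setEnvEntry` and `adjustCachedEnv` here are
hypothetical stand-ins. The only point is that the process environment gets
read once, and each later adjustment is a pure edit of the cached list.

```haskell
import System.Environment (getEnvironment)

-- Hypothetical stand-in for the real repo state; only the cached
-- environment is modelled here.
newtype Repo = Repo { gitEnv :: Maybe [(String, String)] }

-- Read the process environment once (for example at startup) and keep
-- it in the Repo, so later adjustments need not call getEnvironment.
cacheGitEnv :: Repo -> IO Repo
cacheGitEnv r = case gitEnv r of
    Just _ -> return r
    Nothing -> do
        e <- getEnvironment
        return r { gitEnv = Just e }

-- Pure helper: set one variable, replacing any existing entry.
setEnvEntry :: String -> String -> [(String, String)] -> [(String, String)]
setEnvEntry k v env = (k, v) : filter ((/= k) . fst) env

-- Adjust the already cached environment without any IO.
-- If nothing has been cached yet, the Repo is left unchanged.
adjustCachedEnv :: ([(String, String)] -> [(String, String)]) -> Repo -> Repo
adjustCachedEnv f r = r { gitEnv = fmap f (gitEnv r) }
```

With something like this, a per-file `adjustCachedEnv (setEnvEntry "GIT_INDEX_FILE" path)`
still rebuilds the list, but it no longer copies the environment out of the
process each time. Whether that recovers most of the 21% would need another
profiling run.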

(The catchIO overhead is a false reading; the detailed profile shows
that all its time and allocations are inherited. getAnnexLinkTarget
is running catchIO in the expensive case, so readSymbolicLink is
the actual expensive bit.)

The parsePOSIXTime cost comes from reading location logs. It's implemented
using the generic Data.Time.Format.parseTime with the format string
"%s%Qs". A custom parser that splits the timestamp into seconds and
picoseconds and simply reads both numbers might be more efficient.
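
Such a custom parser could look roughly like this. It is only a sketch:
`parsePOSIXTimeSketch` is a made-up name, and I am assuming the input is
what the "%s%Qs" format describes, that is, decimal seconds, an optional
fraction, and a trailing `s`. Negative timestamps are not handled.

```haskell
import Data.Char (isDigit)
import Data.Ratio ((%))
import Data.Time.Clock.POSIX (POSIXTime)

-- Parse e.g. "1474919983.25s" by splitting at the decimal point and
-- reading the two numbers directly, instead of using a format string.
parsePOSIXTimeSketch :: String -> Maybe POSIXTime
parsePOSIXTimeSketch str =
    case span isDigit str of
        ("", _) -> Nothing
        -- No fractional part, just seconds and the trailing 's'.
        (whole, "s") -> Just (fromInteger (read whole))
        (whole, '.' : rest) ->
            case span isDigit rest of
                (frac, "s") | not (null frac) ->
                    let digits = take 12 frac -- picosecond precision
                        picos = read digits * 10 ^ (12 - length digits) :: Integer
                    in Just (fromInteger (read whole)
                        + fromRational (picos % 10 ^ (12 :: Int)))
                _ -> Nothing
        _ -> Nothing
```

The `read` calls are safe here because they only ever see non-empty runs of
digits. It would still need benchmarking against parseTime before assuming
it is actually faster.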

catObjectDetails.receive is implemented using mostly String and could
probably be sped up by being converted to use ByteString.
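
To give an idea of what the ByteString version might involve, here is a
sketch of parsing the `<sha> <type> <size>` header line that
`git cat-file --batch` prints before each object. That `receive` parses
such a header is my assumption here; `ObjectHeader` and `parseBatchHeader`
are hypothetical names, not the existing Git.CatFile code.

```haskell
import qualified Data.ByteString.Char8 as B

-- Hypothetical result type for one batch header line.
data ObjectHeader = ObjectHeader
    { objSha :: B.ByteString
    , objType :: B.ByteString
    , objSize :: Int
    } deriving (Eq, Show)

-- Parse "<sha> <type> <size>" without converting to String.
-- A "<sha> missing" line has only two words and yields Nothing.
parseBatchHeader :: B.ByteString -> Maybe ObjectHeader
parseBatchHeader line = case B.words line of
    [sha, ty, sz] -> case B.readInt sz of
        Just (n, rest) | B.null rest && n >= 0 ->
            Just (ObjectHeader sha ty n)
        _ -> Nothing
    _ -> Nothing
```

The object content following the header could then be read as a strict
ByteString of `objSize` bytes, avoiding the per-character cost of String.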
"""]]