Merge branch 'bs'
commit
37db1fa5a0
230 changed files with 2045 additions and 1413 deletions
|
@ -0,0 +1,40 @@
[[!comment format=mdwn
 username="joey"
 subject="""comment 7"""
 date="2019-12-18T19:18:04Z"
 content="""
Updated profiling. git-annex find is now ByteString end-to-end!
Note the massive reduction in alloc, and improved runtime.

    Wed Dec 11 14:41 2019 Time and Allocation Profiling Report (Final)

    git-annex +RTS -p -RTS find

    total time = 1.51 secs (1515 ticks @ 1000 us, 1 processor)
    total alloc = 608,475,328 bytes (excludes profiling overheads)

    COST CENTRE                       MODULE                            SRC                                                    %time  %alloc

    keyFile'                          Annex.Locations                   Annex/Locations.hs:(590,1)-(600,30)                      8.2    16.6
    >>=.\.succ'                       Data.Attoparsec.Internal.Types    Data/Attoparsec/Internal/Types.hs:146:13-76              4.7     0.7
    getAnnexLinkTarget'.probesymlink  Annex.Link                        Annex/Link.hs:79:9-46                                    4.2     7.6
    >>=.\                             Data.Attoparsec.Internal.Types    Data/Attoparsec/Internal/Types.hs:(146,9)-(147,44)       3.9     2.3
    parseLinkTarget                   Annex.Link                        Annex/Link.hs:(255,1)-(263,25)                           3.9    11.8
    doesPathExist                     Utility.RawFilePath               Utility/RawFilePath.hs:30:1-25                           3.4     0.6
    keyFile'.esc                      Annex.Locations                   Annex/Locations.hs:(596,9)-(600,30)                      3.2    14.7
    fileKey'                          Annex.Locations                   Annex/Locations.hs:(609,1)-(619,41)                      3.0     4.7
    parseLinkTargetOrPointer          Annex.Link                        Annex/Link.hs:(240,1)-(244,25)                           2.8     0.2
    hashUpdates.\.\.\                 Crypto.Hash                       Crypto/Hash.hs:85:48-99                                  2.5     0.1
    combineAlways                     System.FilePath.Posix.ByteString  System/FilePath/Posix/../Internal.hs:(698,1)-(704,67)    2.0     3.3
    getState                          Annex                             Annex.hs:(251,1)-(254,27)                                2.0     1.1
    withPtr.makeTrampoline            Basement.Block.Base               Basement/Block/Base.hs:(401,5)-(404,31)                  1.9     1.7
    withMutablePtrHint                Basement.Block.Base               Basement/Block/Base.hs:(468,1)-(482,50)                  1.8     1.2
    parseKeyVariety                   Types.Key                         Types/Key.hs:(323,1)-(371,42)                            1.8     0.0
    fileKey'.go                       Annex.Locations                   Annex/Locations.hs:611:9-55                              1.7     2.2
    isLinkToAnnex                     Annex.Link                        Annex/Link.hs:(299,1)-(305,47)                           1.7     1.0
    hashDirMixed                      Annex.DirHashes                   Annex/DirHashes.hs:(82,1)-(90,27)                        1.7     1.3
    primitive                         Basement.Monad                    Basement/Monad.hs:72:5-18                                1.6     0.1
    withPtr                           Basement.Block.Base               Basement/Block/Base.hs:(395,1)-(404,31)                  1.5     1.6
    mkKeySerialization                Types.Key                         Types/Key.hs:(115,1)-(117,22)                            1.1     2.8
    decimal.step                      Data.Attoparsec.ByteString.Char8  Data/Attoparsec/ByteString/Char8.hs:448:9-49             0.8     1.2
"""]]
|
3
doc/todo/optimise_by_converting_Ref_to_ByteString.mdwn
Normal file
|
@ -0,0 +1,3 @@
Profiling of `git annex find --not --in web` suggests that converting Ref
to contain a ByteString, rather than a String, would eliminate a
fromRawFilePath that uses about 1% of runtime.
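
To illustrate the idea, here is a minimal sketch of the two representations. The names `RefS`, `RefB`, `refFromPathS` and `refFromPathB` are hypothetical and are not taken from the git-annex source, and `B8.unpack` only stands in for fromRawFilePath. With a String inside, a path that arrives as a ByteString has to be decoded first; with a ByteString inside, it is wrapped as-is.

```haskell
import qualified Data.ByteString.Char8 as B8

-- Hypothetical stand-ins; the real Ref type in git-annex may differ.
newtype RefS = RefS String deriving (Eq, Show)
newtype RefB = RefB B8.ByteString deriving (Eq, Show)

-- String-backed Ref: the bytes must be decoded into a String first.
-- B8.unpack is only a stand-in for fromRawFilePath here.
refFromPathS :: B8.ByteString -> RefS
refFromPathS = RefS . B8.unpack

-- ByteString-backed Ref: no conversion, the bytes are wrapped directly.
refFromPathB :: B8.ByteString -> RefB
refFromPathB = RefB

main :: IO ()
main = do
    let p = B8.pack "some/file"
    print (refFromPathS p)
    print (refFromPathB p)
```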
|
21
doc/todo/optimise_journal_access.mdwn
Normal file
|
@ -0,0 +1,21 @@
Often a command will need to read a number of files from the git-annex
branch, and it uses getJournalFile for each to check for any journalled
change that has not reached the branch. But typically, the journal is empty
and in such a case, that's a lot of time spent trying to open journal files
that do not exist.

Profiling e.g. `git annex find --in web` shows that things called by
getJournalFile use around 5% of runtime.

What if, once at startup, it checked if the journal was entirely empty?
If so, it can remember that, and avoid reading journal files.
Perhaps paired with staging the journal if it's not empty.

This could lead to behavior changes in some cases where one command is
writing changes and another command used to read them from the journal and
may no longer do so. But any such behavior change is of a behavior that
used to involve a race; the reader could just as well be ahead of the
writer and it would have already behaved as it would after the change.

But: When a process writes to the journal, it will need to update its state
to remember it's no longer empty. --[[Joey]]
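
A rough sketch of how that could look, assuming an in-memory list as a stand-in for the journal directory. All names here (`JournalState`, `startup`, `getJournalFileSketch`, `setJournalFileSketch`) are hypothetical and are not the git-annex API. The emptiness check runs once at startup, reads skip the journal while it is known to be empty, and a write clears the flag.

```haskell
import Data.IORef

-- In-memory stand-in for the journal directory (hypothetical).
type Journal = IORef [(FilePath, String)]

data JournalState = JournalState
    { journal    :: Journal
    , knownEmpty :: IORef Bool  -- result of the one-time startup check
    , lookups    :: IORef Int   -- counts simulated journal file opens
    }

-- Check once, at startup, whether the journal is entirely empty.
startup :: Journal -> IO JournalState
startup j = do
    e <- null <$> readIORef j
    JournalState j <$> newIORef e <*> newIORef 0

-- Skip the journal entirely while it is known to be empty.
getJournalFileSketch :: JournalState -> FilePath -> IO (Maybe String)
getJournalFileSketch st f = do
    e <- readIORef (knownEmpty st)
    if e
        then return Nothing
        else do
            modifyIORef' (lookups st) (+ 1)
            lookup f <$> readIORef (journal st)

-- A write must also record that the journal is no longer empty.
setJournalFileSketch :: JournalState -> FilePath -> String -> IO ()
setJournalFileSketch st f v = do
    modifyIORef' (journal st) ((f, v) :)
    writeIORef (knownEmpty st) False

main :: IO ()
main = do
    st <- startup =<< newIORef []
    r1 <- getJournalFileSketch st "a.log"
    n1 <- readIORef (lookups st)
    print (r1, n1)
    setJournalFileSketch st "a.log" "x"
    r2 <- getJournalFileSketch st "a.log"
    n2 <- readIORef (lookups st)
    print (r2, n2)
```

This only models a single process. A second process writing to the journal would not clear the flag held by the first one, which is the behavior change discussed above.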
|
|
@ -9,29 +9,9 @@ Benchmarking `git-annex find`, speedups range from 28-66%. The files fly by
much more snappily. Other commands likely also speed up, but do more work
than find so the improvement is not as large.

The `bs` branch is in a mergeable state now, but still needs work:
The `bs` branch is in a mergeable state now. [[done]]

* Eliminate all the fromRawFilePath, toRawFilePath, encodeBS,
decodeBS conversions. Or at least most of them. There are likely
quite a few places where a value is converted back and forth several times.
Stuff not entirely finished:

As a first step, profile and look for the hot spots. Known hot spots:

* keyFile uses fromRawFilePath and that adds around 3% overhead in `git-annex find`.
Converting it to a RawFilePath needs a version of `</>` for RawFilePaths.
* getJournalFileStale uses fromRawFilePath, and adds 3-5% overhead in
`git-annex whereis`. Converting it to RawFilePath needs a version
of `</>` for RawFilePaths. It also needs a ByteString.readFile
for RawFilePath.

* System.FilePath is not available for RawFilePath, and many of the
conversions are to get a FilePath in order to use that library.

It should be entirely straightforward to make a version of System.FilePath
that can operate on RawFilePath, except possibly there could be some
complications due to Windows.

* Use versions of IO actions like getFileStatus that take a RawFilePath,
avoiding a conversion. Note that these are only available on unix, not
windows, so a compatibility shim will be needed.
(I can't seem to find any library that provides one.)
* Profile various commands and look for hot spots involving conversion
between RawFilePath and FilePath.
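
For the `</>` on RawFilePaths mentioned in the hot spot notes, a simplified POSIX-only sketch follows. `combineRaw` is a hypothetical name; this is not the implementation in System.FilePath.Posix.ByteString, and it makes no attempt to handle Windows or to match every edge case of the FilePath version.

```haskell
import qualified Data.ByteString.Char8 as B

-- Same shape as the RawFilePath alias: a path as raw bytes.
type RawFilePath = B.ByteString

-- Join two paths with a single '/' without leaving ByteString.
combineRaw :: RawFilePath -> RawFilePath -> RawFilePath
combineRaw a b
    | B.null a = b
    | B.null b = a
    | B.head b == '/' = b               -- an absolute second path replaces the first
    | B.last a == '/' = B.append a b    -- avoid doubling the separator
    | otherwise = B.append a (B.cons '/' b)

main :: IO ()
main = do
    B.putStrLn (combineRaw (B.pack ".git/annex") (B.pack "journal"))
    B.putStrLn (combineRaw (B.pack ".git/annex/") (B.pack "journal"))
```

Since both arguments and the result stay as ByteStrings, a caller such as keyFile would not need fromRawFilePath just to build a path.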
|
||||
|
|
|
@ -0,0 +1,40 @@
[[!comment format=mdwn
 username="joey"
 subject="""comment 3"""
 date="2019-12-11T18:16:13Z"
 content="""
Updated profiling. git-annex find is now ByteString end-to-end!
Note the massive reduction in alloc, and improved runtime.

    Wed Dec 11 14:41 2019 Time and Allocation Profiling Report (Final)

    git-annex +RTS -p -RTS find

    total time = 1.51 secs (1515 ticks @ 1000 us, 1 processor)
    total alloc = 608,475,328 bytes (excludes profiling overheads)

    COST CENTRE                       MODULE                            SRC                                                    %time  %alloc

    keyFile'                          Annex.Locations                   Annex/Locations.hs:(590,1)-(600,30)                      8.2    16.6
    >>=.\.succ'                       Data.Attoparsec.Internal.Types    Data/Attoparsec/Internal/Types.hs:146:13-76              4.7     0.7
    getAnnexLinkTarget'.probesymlink  Annex.Link                        Annex/Link.hs:79:9-46                                    4.2     7.6
    >>=.\                             Data.Attoparsec.Internal.Types    Data/Attoparsec/Internal/Types.hs:(146,9)-(147,44)       3.9     2.3
    parseLinkTarget                   Annex.Link                        Annex/Link.hs:(255,1)-(263,25)                           3.9    11.8
    doesPathExist                     Utility.RawFilePath               Utility/RawFilePath.hs:30:1-25                           3.4     0.6
    keyFile'.esc                      Annex.Locations                   Annex/Locations.hs:(596,9)-(600,30)                      3.2    14.7
    fileKey'                          Annex.Locations                   Annex/Locations.hs:(609,1)-(619,41)                      3.0     4.7
    parseLinkTargetOrPointer          Annex.Link                        Annex/Link.hs:(240,1)-(244,25)                           2.8     0.2
    hashUpdates.\.\.\                 Crypto.Hash                       Crypto/Hash.hs:85:48-99                                  2.5     0.1
    combineAlways                     System.FilePath.Posix.ByteString  System/FilePath/Posix/../Internal.hs:(698,1)-(704,67)    2.0     3.3
    getState                          Annex                             Annex.hs:(251,1)-(254,27)                                2.0     1.1
    withPtr.makeTrampoline            Basement.Block.Base               Basement/Block/Base.hs:(401,5)-(404,31)                  1.9     1.7
    withMutablePtrHint                Basement.Block.Base               Basement/Block/Base.hs:(468,1)-(482,50)                  1.8     1.2
    parseKeyVariety                   Types.Key                         Types/Key.hs:(323,1)-(371,42)                            1.8     0.0
    fileKey'.go                       Annex.Locations                   Annex/Locations.hs:611:9-55                              1.7     2.2
    isLinkToAnnex                     Annex.Link                        Annex/Link.hs:(299,1)-(305,47)                           1.7     1.0
    hashDirMixed                      Annex.DirHashes                   Annex/DirHashes.hs:(82,1)-(90,27)                        1.7     1.3
    primitive                         Basement.Monad                    Basement/Monad.hs:72:5-18                                1.6     0.1
    withPtr                           Basement.Block.Base               Basement/Block/Base.hs:(395,1)-(404,31)                  1.5     1.6
    mkKeySerialization                Types.Key                         Types/Key.hs:(115,1)-(117,22)                            1.1     2.8
    decimal.step                      Data.Attoparsec.ByteString.Char8  Data/Attoparsec/ByteString/Char8.hs:448:9-49             0.8     1.2
"""]]
|