diff --git a/doc/bugs/Windows__58___substantial_per-file_cost_for___96__add__96__/comment_10_4e3cec5a4ce33dadfd0ce7a8d17bfaa0._comment b/doc/bugs/Windows__58___substantial_per-file_cost_for___96__add__96__/comment_10_4e3cec5a4ce33dadfd0ce7a8d17bfaa0._comment new file mode 100644 index 0000000000..f6535207db --- /dev/null +++ b/doc/bugs/Windows__58___substantial_per-file_cost_for___96__add__96__/comment_10_4e3cec5a4ce33dadfd0ce7a8d17bfaa0._comment @@ -0,0 +1,12 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 10""" + date="2021-09-28T17:57:50Z" + content=""" +Since my profiling above shows windows is as fast as linux (though +probably on faster hardware), one thing I am wondering is if antivirus +could be slowing it down. I know AV on windows can slow down things like +writing to files, because it blocks closing the file until it finishes +scanning. Maybe the github windows CI does not run AV, but your windows +does, mih? +"""]] diff --git a/doc/bugs/Windows__58___substantial_per-file_cost_for___96__add__96__/comment_8_b46545c78628379dd50d306c87550b9d._comment b/doc/bugs/Windows__58___substantial_per-file_cost_for___96__add__96__/comment_8_b46545c78628379dd50d306c87550b9d._comment new file mode 100644 index 0000000000..d69bb45762 --- /dev/null +++ b/doc/bugs/Windows__58___substantial_per-file_cost_for___96__add__96__/comment_8_b46545c78628379dd50d306c87550b9d._comment @@ -0,0 +1,58 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 8""" + date="2021-09-28T16:22:36Z" + content=""" +Corresponding time profile on Windows. This was run on a github CI +instance, so I don't know if the CPU was busy with other tasks. + + Mon Sep 27 22:45 2021 Time and Allocation Profiling Report (Final) + + git-annex.exe +RTS -p -RTS smudge --clean x + + total time = 0.02 secs (22 ticks @ 1000 us, 1 processor) + total alloc = 6,753,624 bytes (excludes profiling overheads) + + COST CENTRE MODULE SRC %time %alloc + + createProcess Utility.Process Utility\Process.hs:(177,1)-(180,16) 50.0 18.7 + openLock.\ Utility.LockFile.Windows Utility\LockFile\Windows.hs:(64,17)-(65,76) 4.5 0.0 + toInodeCache' Utility.InodeCache Utility\InodeCache.hs:(196,1)-(206,34) 4.5 3.0 + checkSentinalFile.loadoldcache Utility.InodeCache Utility\InodeCache.hs:(260,9)-(261,83) 4.5 0.3 + adjustGitEnv Git.Env Git\Env.hs:(18,1)-(22,7) 4.5 41.2 + pipeReadStrict'.go Git.Command Git\Command.hs:(80,9)-(85,37) 4.5 0.1 + mergeGitConfig Types.GitConfig Types\GitConfig.hs:(284,1)-(299,54) 4.5 0.0 + fixupRepo Annex.Fixup Annex\Fixup.hs:(37,1)-(42,31) 4.5 0.0 + noCommit Command Command.hs:56:1-37 4.5 0.0 + getFileSize System.PosixCompat.Files src\System\PosixCompat\Files.hsc:(358,1)-(359,76) 4.5 1.5 + searchParser Options.Applicative.Common src\Options\Applicative\Common.hs:(136,1)-(151,40) 4.5 0.2 + catch Control.Monad.Catch src\Control\Monad\Catch.hs:333:3-32 4.5 0.2 + getFileSize Utility.FileSize Utility\FileSize.hs:37:1-80 0.0 1.5 + getFileType System.PosixCompat.Files src\System\PosixCompat\Files.hsc:(350,1)-(355,62) 0.0 2.9 + getFileStatus System.PosixCompat.Files src\System\PosixCompat\Files.hsc:(312,1)-(337,24) 0.0 6.0 + unconsB Codec.Binary.UTF8.Generic Codec\Binary\UTF8\Generic.hs:297:1-18 0.0 1.2 + +(Whole profile at ) + +What stands out is that createProcess is twice as expensive as linux, and +fully half the runtime is apparently spent just forking a few processes. + +Here are the git processes started and percent of time to start them, +from the more detailed profiling: + + git config 9% + git cat-file 9% + git check-attr 4.5% + git diff (reconcileStaged) 9% + git show-ref (reconcileStaged) 9% + git write-tree (reconcileStaged) 9% + +(The reconcileStaged stuff only happened because `git init` +didn't do it -- I forgot to put git-annex in the path the way I ran it +on the CI builder, and so init didn't do everything it usually would. +A second run with that fixed had a createProcess percent reduced to 30%, +though still at around2 0.2s total runtime.) + +This is not super slow on the Windows CI, it's competative with Linux, +though my Linux laptop probably has a slower CPU (1.5ghz). +"""]] diff --git a/doc/bugs/Windows__58___substantial_per-file_cost_for___96__add__96__/comment_9_409a95cb01a4b26d46d7aaff042026a0._comment b/doc/bugs/Windows__58___substantial_per-file_cost_for___96__add__96__/comment_9_409a95cb01a4b26d46d7aaff042026a0._comment new file mode 100644 index 0000000000..51d56994ce --- /dev/null +++ b/doc/bugs/Windows__58___substantial_per-file_cost_for___96__add__96__/comment_9_409a95cb01a4b26d46d7aaff042026a0._comment @@ -0,0 +1,44 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 9""" + date="2021-09-28T17:09:21Z" + content=""" +So most of the time is spent in createProcess. Forking is not slow +(on linux anyway) so why are 4 createProcess 23% of runtime on linux? + +Here is a strace --relative-timestamps on linux, showing a single createProcess +call, for reading `git config`. + + 0.000196 pipe([11, 12]) = 0 + 0.000134 pipe([13, 14]) = 0 + 0.000067 rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0 + 0.000074 vfork() = 1412528 + 0.000552 close(12) = 0 + 0.000058 fcntl(11, F_SETFD, FD_CLOEXEC) = 0 + 0.000095 close(14) = 0 + 0.000050 fcntl(13, F_SETFD, FD_CLOEXEC) = 0 + 0.000047 read(13, "", 4) = 0 + 0.000050 close(13) = 0 + 0.000061 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 + 0.000062 futex(0x98336c8, FUTEX_WAKE_PRIVATE, 1) = 1 + 0.000061 futex(0x9833660, FUTEX_WAKE_PRIVATE, 1) = 1 + 0.000066 fcntl(11, F_GETFL) = 0 (flags O_RDONLY) + 0.000047 fcntl(11, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 + 0.000073 ioctl(11, TCGETS, 0x7ffc091b0d80) = -1 ENOTTY (Inappropriate ioctl for device) + +Total runtime of createProcess above is 1.69 ms. +And from the profile on windows, it's taking around 1.7 ms per createProcess. + +I'd say that at least 1.32 ms of that is necessary, leaving out the `futex` +and `rt_sigprocmask` that are probably GHC runtime stuff, and the `ioctl` +and `read` and `fcntl F_GETFL` which seem unnecessary. +If those were optimised out, the total `git-annex smudge --clean` +runtime would speed up by only 10% or so. + +Feels like I've reached the end of profiling. Most of the time is being spent +starting git processes, and it can't be sped up significantly without +starting fewer git processes. + +(I do wish that `git check-attr` could be removed, but it's needed for the +annex.largefiles .gitattributes support.) +"""]]