diff --git a/doc/todo/split_off_clean__47__smudge_filter__63__/comment_3_568ca39690bef083189ad158553683d3._comment b/doc/todo/split_off_clean__47__smudge_filter__63__/comment_3_568ca39690bef083189ad158553683d3._comment
new file mode 100644
index 0000000000..6564e5291c
--- /dev/null
+++ b/doc/todo/split_off_clean__47__smudge_filter__63__/comment_3_568ca39690bef083189ad158553683d3._comment
@@ -0,0 +1,18 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2019-09-20T16:13:39Z"
+ content="""
+What is your evidence that the size of the git-annex
+executable has anything to do with that?
+
+I hope you're aware that linux does not load entire programs into memory
+before running them. It mmaps them and loads pages on demand. (I assume
+most other modern OSes do similar things.)
+
+It's easy to build a significantly smaller git-annex: just disable
+the assistant build flag, and you should get a binary around 30 MB rather
+than 60 MB. I'll wager you do not find any statistically significant
+difference in performance between such a build of git-annex and one
+containing the assistant.
+"""]]
diff --git a/doc/todo/split_off_clean__47__smudge_filter__63__/comment_4_dbfb6deef8956c4b71cc30c0019d0bb5._comment b/doc/todo/split_off_clean__47__smudge_filter__63__/comment_4_dbfb6deef8956c4b71cc30c0019d0bb5._comment
new file mode 100644
index 0000000000..3d8be7f785
--- /dev/null
+++ b/doc/todo/split_off_clean__47__smudge_filter__63__/comment_4_dbfb6deef8956c4b71cc30c0019d0bb5._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 4"""
+ date="2019-09-20T16:50:24Z"
+ content="""
+git-annex isolates all runtime state in its Annex monad. This
+makes it easy to have multiple threads each with their own isolated state,
+which is how all its concurrency features work.
+"""]]
diff --git a/doc/todo/split_off_clean__47__smudge_filter__63__/comment_5_0d22b900fad406c0bc144de146e2668d._comment b/doc/todo/split_off_clean__47__smudge_filter__63__/comment_5_0d22b900fad406c0bc144de146e2668d._comment
new file mode 100644
index 0000000000..098a023f9a
--- /dev/null
+++ b/doc/todo/split_off_clean__47__smudge_filter__63__/comment_5_0d22b900fad406c0bc144de146e2668d._comment
@@ -0,0 +1,20 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 5"""
+ date="2019-09-20T16:51:56Z"
+ content="""
+This is not to say that having a longer-running git-annex process
+that responds to all of git's smudge/clean requests is a bad idea. Each new
+invocation of git-annex has to re-open databases, start up git cat-file
+to query from, link the executable, read git config, etc. That takes a
+few hundred milliseconds.
+
+The best way to get a longer-running git-annex process for smudge/clean
+would be to use git's long-running filter process interface. But that
+interface currently feeds the entire content of large files through a pipe
+to the git-annex process, which is *very* inefficient. So git-annex doesn't
+use that interface. Improving the interface to let the clean filter read
+the content of the file itself, rather than it being piped through,
+would be the best way to improve `git add` performance.
+[[todo/git_smudge_clean_interface_suboptiomal]] does discuss this.
+"""]]
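
To make the piping cost from comment 5 concrete, here is a minimal Python sketch of how a clean request is framed in git's long-running filter process protocol. It is an illustration only, not git-annex code: the helper names `pkt_line` and `send_clean_request` are hypothetical, the initial version/capability handshake is omitted, and the request is simplified to the `command` and `pathname` keys. The point it shows is that every byte of the file is wrapped in pkt-lines and written to the filter's stdin, even when the filter could have opened the file itself.

```python
import io

# Largest payload in one pkt-line: a 65520-byte packet minus the 4-byte length header.
MAX_PAYLOAD = 65516

# A flush packet marks the end of a list of pkt-lines.
FLUSH = b"0000"


def pkt_line(payload: bytes) -> bytes:
    """Frame a payload as a pkt-line: 4 hex digits of total length, then the data."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too large for one pkt-line")
    return b"%04x" % (len(payload) + 4) + payload


def send_clean_request(out, pathname: str, content) -> int:
    """Write a simplified clean request to `out` (hypothetical helper).

    Returns the number of file content bytes that were pushed through the pipe.
    """
    out.write(pkt_line(b"command=clean\n"))
    out.write(pkt_line(b"pathname=" + pathname.encode() + b"\n"))
    out.write(FLUSH)
    sent = 0
    while True:
        chunk = content.read(MAX_PAYLOAD)
        if not chunk:
            break
        # Each chunk of file content is copied into the pipe.
        out.write(pkt_line(chunk))
        sent += len(chunk)
    out.write(FLUSH)
    return sent


if __name__ == "__main__":
    pipe = io.BytesIO()
    size = send_clean_request(pipe, "big.bin", io.BytesIO(b"x" * 200000))
    print(size, len(pipe.getvalue()))
```

In this sketch the pipe carries slightly more than the file size, so the cost grows linearly with the file, which is the inefficiency the comment describes for large annexed files.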