speed up initial scanning for annexed files

Streaming through git this way speeds it up by around 25%. This is similar to the optimisations of seeking annexed files.

Sponsored-by: Dartmouth College's Datalad project
2021-05-31 13:40:42 -04:00
commit 0f54e5e0ae
parent aa00e171cb
2 changed files with 46 additions and 18 deletions
--- a/doc/todo/Avoid_lengthy_34Scanning_for_unlocked_files_...34/comment_12_70c4c9f6c35acd7ca1134ac74356e5be._comment
+++ b/doc/todo/Avoid_lengthy_34Scanning_for_unlocked_files_...34/comment_12_70c4c9f6c35acd7ca1134ac74356e5be._comment
@@ -0,0 +1,16 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 12"""
+ date="2021-05-31T16:30:11Z"
+ content="""
+Implemented streaming through git. In a repo with 100000 unlocked files,
+version 8.20210429 took 46 seconds, now reduced to 36 seconds.
+
+When the files are locked, the old version was of course faster, taking only
+2 seconds, because it could skip all symlinks. The new version takes
+slightly less time than it does for unlocked files, 35 seconds.
+
+Now the git query and processing take only a few seconds of the total run time;
+writing information about all the files to sqlite accounts for most of the rest,
+and it may also be possible to speed that up.
+"""]]
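
For reference, the "around 25%" in the commit message can be checked against the timings quoted in the comment (46 seconds before, 36 seconds after, for 100000 unlocked files). The sketch below is only arithmetic on those two numbers; the function names are made up for illustration and are not part of git-annex. It shows that the figure falls between the two usual ways of stating a speedup.

```python
# Illustrative arithmetic only; function names are hypothetical helpers,
# and the timings are the ones quoted in the comment above.

def time_reduction(old_s: float, new_s: float) -> float:
    """Fraction of the old run time that was saved."""
    return (old_s - new_s) / old_s


def throughput_gain(old_s: float, new_s: float) -> float:
    """Relative increase in files scanned per second."""
    return old_s / new_s - 1.0


if __name__ == "__main__":
    old_s, new_s = 46.0, 36.0  # unlocked files, version 8.20210429 vs. this commit
    print(f"run time reduced by {time_reduction(old_s, new_s):.1%}")
    print(f"throughput increased by {throughput_gain(old_s, new_s):.1%}")
```

Measured as saved run time the improvement is a little under 22%, and measured as throughput it is a little under 28%, so "around 25%" is a fair summary of either.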