From 221b47162d89dac96acb9f8a4ded887f0b7421c4 Mon Sep 17 00:00:00 2001
From: Lukey
Date: Thu, 24 Sep 2020 16:36:12 +0000
Subject: [PATCH]

---
 .../skip_first_pass_in_git_annex_sync.mdwn    | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)
 create mode 100644 doc/todo/skip_first_pass_in_git_annex_sync.mdwn

diff --git a/doc/todo/skip_first_pass_in_git_annex_sync.mdwn b/doc/todo/skip_first_pass_in_git_annex_sync.mdwn
new file mode 100644
index 0000000000..e9e2a0eba8
--- /dev/null
+++ b/doc/todo/skip_first_pass_in_git_annex_sync.mdwn
@@ -0,0 +1,21 @@
+Hello Joey,
+So I have another idea to speed up git annex sync --content --all. As far as I understand, the first pass with --all (walking the worktree) exists only to satisfy preferred content expressions that include paths (and lackingcopies=). So if neither the local repo's preferred content expression nor that of any remote contains paths, that first pass can be skipped.
+
+I did some benchmarking by replacing the following in seekSyncContent:
+
+    Just WantAllKeys -> Just <$> genBloomFilter (seekworktree mvar (WorkTreeItems []))
+
+with:
+
+    Just WantAllKeys -> pure Nothing
+
+and it led to a nearly 2x speedup (with warm cache):
+
+    before:
+    real 0m41.186s
+
+    after:
+    real 0m22.182s
+
+
+This repo has 25641 keys, and all of them are in the worktree too.
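
The skip condition described in the post could be checked roughly as follows. This is a minimal, standalone Haskell sketch and not git-annex code: `PreferredContent`, `pathDependentTerms`, `needsWorkTreePass` and `canSkipFirstPass` are hypothetical names, expressions are treated as plain strings, and the list of path-dependent terms is an illustrative assumption. It also does not expand terms such as `standard`, which stand for other expressions and would have to be resolved first.

```haskell
module Main where

import Data.List (isPrefixOf)

-- Hypothetical, simplified stand-in for a preferred content expression:
-- the raw text, e.g. "include=*.mp3 or copies=2".
type PreferredContent = String

-- Terms assumed (for this sketch only) to need a file path, and hence a
-- worktree pass, to evaluate. Not git-annex's authoritative list.
pathDependentTerms :: [String]
pathDependentTerms = ["include=", "exclude=", "lackingcopies="]

-- True when any token of the expression starts with a path-dependent term.
-- Parentheses are turned into spaces so "(include=*.mp3)" still matches.
needsWorkTreePass :: PreferredContent -> Bool
needsWorkTreePass expr = any isPathTerm (words (map unparen expr))
  where
    isPathTerm tok = any (`isPrefixOf` tok) pathDependentTerms
    unparen c = if c `elem` "()" then ' ' else c

-- The first pass can be skipped only if neither the local repository nor
-- any remote has an expression that needs the worktree pass.
canSkipFirstPass :: PreferredContent -> [PreferredContent] -> Bool
canSkipFirstPass local remotes =
  not (any needsWorkTreePass (local : remotes))

main :: IO ()
main = do
  -- No path-dependent terms anywhere.
  print (canSkipFirstPass "copies=2" ["present", "not copies=backup:1"])
  -- One remote uses include=, so the worktree pass is still needed.
  print (canSkipFirstPass "copies=2" ["include=*.mp3 or present"])
```

In a real change, the result of such a check would decide between the two `Just WantAllKeys` branches quoted in the post, rather than always returning `pure Nothing` as the benchmark did.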