From 9e9def2dc0a0b84a2f98805fd1c170fdc486cf16 Mon Sep 17 00:00:00 2001
From: Joey Hess
Date: Wed, 18 Dec 2019 16:11:14 -0400
Subject: [PATCH] todo

---
 doc/todo/optimise_journal_access.mdwn | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)
 create mode 100644 doc/todo/optimise_journal_access.mdwn

diff --git a/doc/todo/optimise_journal_access.mdwn b/doc/todo/optimise_journal_access.mdwn
new file mode 100644
index 0000000000..a49441cf5e
--- /dev/null
+++ b/doc/todo/optimise_journal_access.mdwn
@@ -0,0 +1,21 @@
+Often a command will need to read a number of files from the git-annex
+branch, and it uses getJournalFile for each to check for any journalled
+change that has not reached the branch. But typically, the journal is empty
+and in such a case, that's a lot of time spent trying to open journal files
+that DNE.
+
+Profiling eg, `git annex find --in web` shows things called by getJournalFile
+use around 5% of runtime.
+
+What if, once at startup, it checked if the journal was entirely empty.
+If so, it can remember that, and avoid reading journal files.
+Perhaps paired with staging the journal if it's not empty.
+
+This could lead to behavior changes in some cases where one command is
+writing changes and another command used to read them from the journal and
+may no longer do so. But any such behavior change is of a behavior that
+used to involve a race; the reader could just as well be ahead of the
+writer and it would have already behaved as it would after the change.
+
+But: When a process writes to the journal, it will need to update its state
+to remember it's no longer empty. --[[Joey]]
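
The idea in the todo above could be sketched roughly as follows. This is only an illustration, not git-annex's actual code: the `Journal` type and the `openJournal`, `readJournalFile`, and `writeJournalFile` names are hypothetical, the journal is assumed to be a flat directory of files, and the cached flag lives in a plain `IORef` rather than in git-annex's own state. The emptiness check happens once at startup, reads skip the filesystem while the journal is known to be empty, and a write clears the flag first so later reads in the same process look at the journal again.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.Directory (createDirectoryIfMissing, doesFileExist, listDirectory)
import System.FilePath ((</>))

-- Hypothetical handle: the journal directory plus a cached
-- "was empty when we last knew" flag.
data Journal = Journal
  { journalDir :: FilePath
  , knownEmpty :: IORef Bool
  }

-- Check once, at startup, whether the journal has any entries at all.
openJournal :: FilePath -> IO Journal
openJournal dir = do
  createDirectoryIfMissing True dir
  entries <- listDirectory dir
  flag <- newIORef (null entries)
  return (Journal dir flag)

-- Read a journalled file. While the journal is known to be empty,
-- return Nothing without touching the filesystem.
readJournalFile :: Journal -> FilePath -> IO (Maybe String)
readJournalFile j name = do
  isEmpty <- readIORef (knownEmpty j)
  if isEmpty
    then return Nothing
    else do
      let path = journalDir j </> name
      exists <- doesFileExist path
      if exists
        then do
          content <- readFile path
          -- Force the whole file so the lazy handle is closed promptly.
          length content `seq` return (Just content)
        else return Nothing

-- Write a journal file, first recording that the journal is no longer empty.
writeJournalFile :: Journal -> FilePath -> String -> IO ()
writeJournalFile j name content = do
  writeIORef (knownEmpty j) False
  writeFile (journalDir j </> name) content
```

As the note says, a process using this sketch will not see files that another process adds to the journal after startup; that is the behavior change the todo argues is acceptable, because the old behavior already raced with the writer.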