git-annex/doc/todo/optimise_journal_access.mdwn

2019-12-18 20:11:14 +00:00
Often a command will need to read a number of files from the git-annex
branch, and it uses getJournalFile for each to check for any journalled
change that has not reached the branch. But typically the journal is empty,
and in that case a lot of time is spent trying to open journal files
that do not exist.

Profiling, e.g., `git annex find --in web` shows that things called by
getJournalFile use around 5% of runtime.

What if, once at startup, it checked whether the journal was entirely empty?
If so, it can remember that, and avoid reading journal files.
Perhaps paired with staging the journal if it's not empty.

This could lead to behavior changes in some cases where one command is
writing changes and another command used to read them from the journal and
may no longer do so. But any such change only affects behavior that
already involved a race; the reader could just as well have been ahead of
the writer, and it would then already have behaved as it will after the change.

But: When a process writes to the journal, it will need to update its state
to remember it's no longer empty. --[[Joey]]
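
As a rough sketch of the idea, assuming the journal is a plain directory of files, the check-once-and-remember approach might look like the Haskell below. `Journal`, `openJournal`, `readJournalFile`, and `writeJournalFile` are hypothetical names for illustration only, not git-annex's actual API, and journal locking and writes made by other processes are ignored here.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.Directory
  ( createDirectoryIfMissing, doesDirectoryExist, doesFileExist, listDirectory )
import System.FilePath ((</>))

-- | Hypothetical per-process state: the journal directory, plus a flag
-- recording whether the journal was seen to be empty.
data Journal = Journal
  { journalDir :: FilePath
  , knownEmpty :: IORef Bool
  }

-- | Run once at startup. A single directory listing stands in for many
-- later attempts to open journal files that do not exist.
openJournal :: FilePath -> IO Journal
openJournal dir = do
  exists <- doesDirectoryExist dir
  entries <- if exists then listDirectory dir else pure []
  Journal dir <$> newIORef (null entries)

-- | Stand-in for getJournalFile (an assumption, not the real function):
-- skips the filesystem entirely while the journal is known to be empty.
readJournalFile :: Journal -> FilePath -> IO (Maybe String)
readJournalFile j name = do
  isEmpty <- readIORef (knownEmpty j)
  if isEmpty
    then pure Nothing
    else do
      let path = journalDir j </> name
      present <- doesFileExist path
      if present
        then do
          content <- readFile path
          -- Force the lazy read so the file handle is closed promptly.
          length content `seq` pure (Just content)
        else pure Nothing

-- | Writing clears the flag, so later reads in this process consult the
-- journal again.
writeJournalFile :: Journal -> FilePath -> String -> IO ()
writeJournalFile j name content = do
  createDirectoryIfMissing True (journalDir j)
  writeFile (journalDir j </> name) content
  writeIORef (knownEmpty j) False
```

In this sketch, a process that calls `openJournal` on an empty journal and only reads never touches the journal directory again, while its first `writeJournalFile` switches it back to checking the journal on every read.
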
[[!tag confirmed]]