The only remaining vestiage of backends is different types of keys. These
are still called "backends", mostly to avoid needing to change user interface
and configuration. But everything to do with storing keys in different
backends was gone; instead different types of remotes are used.
In the refactoring, lots of code was moved out of odd corners like
Backend.File, to closer to where it's used, like Command.Drop and
Command.Fsck. Quite a lot of dead code was removed. Several data structures
became simpler, which may result in better runtime efficiency. There should
be no user-visible changes.
All commands that often have to read a lot of information from
the git-annex branch should now be nearly as fast as before
the branch was introduced.
Before fsck was taking approximatly 3 hours, now it's running in 8 minutes.
The code is very nasty. It should be rewritten to read the header line
from git cat-file, and then read the specified number of bytes of content.
git is slow when the index file is large and has to be rewritten each time
a file is changed. To speed this up, added a journal where changes are
recorded before being fed into the index file and committed to the
git-annex branch. The entire journal can be fed into git with just 2
commands, and only one write of the index file.
This will speed up typical cases like git-annex get, which currently
has to read the location log once, then read it a second time in order to
add a line to it. Since these reads now involve more than just reading
in a file, it seemed good to add a cache layer.
Only the most recent thing needs to be cached, because git-annex has
good locality; it operates on one file at a time, and only cares
about one item from the branch per file.