"git annex adjust" may be a temporary interface, but works for a proof of
concept.
It is pretty fast at creating the adjusted branch. The main overhead is
injecting pointer files. It might be worth optimising that by reusing the
symlink target as the pointer file content. When I tried to do that,
the problem was that the clean filter doesn't use that same format, and so
git thought files had changed. Could be dealt with, perhaps make the clean
filter use symlink format for pointer files when on an adjusted branch?
But the real overhead is in checking out the branch, when git runs the
smudge filter once per file. That is perhaps too slow to be usable,
although it may only affect initial checkout of the branch, and not
updates. TBD.
extractTree has to parse the whole input list in order to generate a tree,
so convert interface to non-streaming.
Some quick memory benchmarks in a repo with 60k files
don't look too bad despite not streaming.
To stream, without building up a whole tree object, one way would
be a new interface:
adjustTree :: MonadIO m :: (TreeItem -> m (Maybe TreeItem)) -> Ref -> Repo -> m Sha
This would only need to buffer tree objects from the current one down
to the root, in order to update trees when a TreeItem is changed.
But, while it supports changing items in the tree, and removing items,
it does not support adding new items, or moving items from one directory to
another.