This was unexpectedly difficult because of a depdenency cycle. To parse a
preferred content expression involves several things that need to operate
on the list of remotes. Which needs Remote.External. The only way to avoid
this cycle (I tried breaking it at several points) was to skip parsing the
expression in SETWANTED.
That's sorta ok, because git-annex already has to deal with unparsable
preferred content expressions being stored, in order to handle eg,
upgrades. But I'm still not very happy that I cannot check it.
I feel this is a strong indication that I need to beware of further
bloating the special remote protocol interface.
Batch detection is heuristic, so can sometimes fail. I observed one such
failure while starting up in a repository with 87000 files. After the first
several batches of ~5000 files, it fell out of batch mode, and never
re-entered it, and so made many more commits of a few files at a time
than necessary.
So, let's always use batch mode when in the startup scan. This avoids the
heuristic there, at least.
There is clearly also room to improve the heuristic. Possibly 10 files is
too high a bar to be found during a commit, on a system that can commit
quickly.
Fixes a test case I received where a corrupted repo was repaired, but the
git-annex branch was not. The root of the problem was that the
MissingObject returned by the repair code was not necessarily a complete
set of all objects that might have been deleted during the repair.
So, stop trying to return that at all, and instead make the index file
checking code explicitly verify that each object the index uses is present.
Previous test did not notice if there is a dangling symlink.
Also, if a directory exists with the same name as the imported file, that
cannot work, so don't let --force have an effect.
Note that the hash backends were made to stop printing a (checksum..)
message as part of this, since it showed up without a file when deciding
whether to act on a file. Should have probably removed that message a while
ago anyway, I suppose.
The assistant's commit code also always avoids git commit, for simplicity.
Indirect mode sync still does a git commit -a to catch unstaged changes.
Note that this means that direct mode sync no longer runs the pre-commit
hook or any other hooks git commit might call. The git annex pre-commit
hook action for direct mode is however explicitly run. (The assistant
already ran git commit with hooks disabled, so no change there.)