This yields a second or so speedup in unused, find, etc. Seems that even
when the ByteString is immediately split and then converted to Strings,
it's faster.
I may try to push ByteStrings out into more of git-annex gradually,
although I suspect most of the time-critical parts are already covered
now, and many of the rest rely on libraries that only support Strings.
Fixed the laziness space leak, so it runs in 60 mb or so again. Slightly
faster due to using Data.Set.difference now, although this also makes it
use slightly more memory.
Also added display of the refs being checked, and made unused --from
also check all refs for things in the remote.
Using Sets is the right thing; they have constant size lookup like my
SizeList, and logn insertation, which beats nub to death.
Runs faster than --fast mode did before, and gives accurate counts.
13 seconds total runtime with a warm cache in a repository with 40 thousand
keys.
find: Rather than only showing files whose contents are present, when used
with --exclude --copies or --in, displays all files that match the
specified conditions.
Note that this is a behavior change for find --exclude! Old behavior
can be gotten with find --in . --exclude=...
These were a mistake, they make the type signatures harder to read and
less flexible. The CommandSeek, CommandStart, CommandPerform, and
CommandCleanup types were a good idea, but composing them with the
parameters expected is going too far.
It probably does not make sense to enable auto mode for move. I cannot
think of a situation where it would make sense to try to use it.
A hypothetical auto mode for move would only differ from a normal
move in one case -- when both repositories have a file, move deletes it
from one, and this reduces the number of copies. So an auto mode would
either only let move work in that situation, or avoid removing the file
in that situation, depending on the number of copies. This would be
complex to implement, and is perhaps not a very obvious behavior.
The error is a good thing to have, so users don't expect it to do something
it does not.
get, drop: Added --auto option, which decides whether to get/drop content
as needed to work toward the configured numcopies.
The problem with bundling it up in optimize was that I then found I wanted
to run an optmize that did not drop files, only got them. Considered adding
a --only-get switch to it, but that seemed wrong. Instead, let's make
existing subcommands optionally smarter.
Note that the only actual difference between drop and drop --auto is that
the latter does not even try to drop a file if it knows of not enough
copies, and does not print any error messages about files it was unable to
drop.
It might be nice to make get avoid asking git for attributes when not in
auto mode. For now it always asks for attributes.
Adds a missing newline when a longnote is followed by a endresult.
Multiple longnotes in a row will now be separated by a blank line, which
could be a bug or a feature depending on taste.
Removed several places where newlines were explicitly displayed after
longnotes.
First, this ensures that git annex addurl, when run repeatedly with the
same url, doesn't create duplicate files, which it did before when it
fell back to the longer filename.
Secondly, the file part of an url is frequently not very descriptive on its
own.
The uri scheme, auth, and port is intentionally left out, as clutter.
Using a single strictness annotation, in just the right place.
Tried several others, none of which helped and some of which potentially
hurt. This is only the second time I've really had to deal with this in
a year of using haskell, which is, I suppose not that bad.
when a git repository is first being created. Clones will automatically
notice that git-annex is in use and automatically perform a basic
initalization. It's still recommended to run "git annex init" in any
clones, to describe them.
The tricky part about this is that to generate a key, the file must be
present already. Worked around by adding (back) an URL key type, which
is used for addurl --fast.