In git, a Ref can be a Sha, or a Branch, or a Tag. I added type aliases for
those. Note that this does not prevent mixing up of eg, refs and branches
at the type level. Since git really doesn't care, except rare cases like
git update-ref, or git tag -d, that seems ok for now.
There's also a tree-ish, but let's just use Ref for it. A given Sha or Ref
may or may not be a tree-ish, depending on the object type, so there seems
no point in trying to represent it at the type level.
Many functions took the repo as their first parameter. Changing it
consistently to be the last parameter allows doing some useful things with
currying, that reduce boilerplate.
In particular, g <- gitRepo is almost never needed now, instead
use inRepo to run an IO action in the repo, and fromRepo to get
a value from the repo.
This also provides more opportunities to use monadic and applicative
combinators.
Other code is currently depending on checkAttr forcing absolute filenames
to relative, so keep it doing so.
This is a quick fix, and should sometime be moved elsewhere.
* This version of git-annex only works with git 1.7.7 and newer.
The breakage with old versions is subtle, and affects
annex.numcopies .gitattributes settings, so be sure to upgrade git
to 1.7.7. (Debian package now depends on that version.)
* Don't pass absolute paths to git show-attr, as it started following
symlinks when that's done in 1.7.7. Instead, use relative paths,
which show-attr only handles 100% correctly in 1.7.7. Closes: #645046
Unfortunatly I can find no way to work with the old and new gits, as
the old had bugs that require absolute paths, while the new doesn't like
them at all. And the behavior of git show-attr in 1.7.7. is the same as
eg, git add of an absolute path to a symlink, so seems entirely
intentional and not likely to change.
This yields a second or so speedup in unused, find, etc. Seems that even
when the ByteString is immediately split and then converted to Strings,
it's faster.
I may try to push ByteStrings out into more of git-annex gradually,
although I suspect most of the time-critical parts are already covered
now, and many of the rest rely on libraries that only support Strings.
Added Git.ByteString which replaces Git IO methods with ones using lazy
ByteStrings. This can be more efficient when large quantities of data are
being read from git.
In Git.LsTree, parse git ls-tree output more efficiently, thanks
to ByteString. This benchmarks 25% faster, in a benchmark that includes
(probably predominately) the run time for git ls-tree itself.
In real world numbers, this makes git annex unused 2 seconds faster for
each branch it needs to check, in my usual large repo.
Fixed the laziness space leak, so it runs in 60 mb or so again. Slightly
faster due to using Data.Set.difference now, although this also makes it
use slightly more memory.
Also added display of the refs being checked, and made unused --from
also check all refs for things in the remote.