Note that this changes the default behavior of git add in a newly
initialized repository; it will add files to the annex.
Don't like that this could break workflows, but it's necessary in order for
any pointer files in the repo to be handled by git-annex.
This allows the git repository to be moved while git-annex is running in
it, with fewer problems.
On Windows, this avoids some of the problems with the absurdly small
MAX_PATH of 260 bytes. In particular, git-annex repositories should
work in deeper/longer directory structures than before. See
http://git-annex.branchable.com/bugs/__34__git-annex:_direct:_1_failed__34___on_Windows/
There are several possible ways this change could break git-annex:
1. If it changes its working directory while it's running, that would
be Bad News. Good news everyone! git-annex never does so. It would also
break thread safety, so all such things were stomped out long ago.
2. parentDir "." -> "" which is not a valid path. I had to fix one
instace of this, and I should probably wipe all calls to parentDir out
of the git-annex code base; it was never a good idea.
3. Things like relPathDirToFile require absolute input paths,
and code assumes that the git repo path is absolute and passes it to it
as-is. In the case of relPathDirToFile, I converted it to not make
this assumption.
Currently, the test suite has 16 failures.
Removed instance, got it all to build using fromRef. (With a few things
that really need to show something using a ref for debugging stubbed out.)
Then added back Read instance, and made Logs.View use it for serialization.
This changes the view log format.
That's needed in files used to build the configure program.
For the other files, I'm keeping my __WINDOWS__ define, as I find that much easier to type.
I may search and replace it to use the mingw32_HOST_OS thing later.
Prelude.undefined error message was introduced by
bb4f31a0ee.
It seems best to filter out local repositories that cannot be accessed
from the list of remotes, rather than keeping them in and making every
thing that uses the list have to deal with remotes that may have an unknown
location.
Besides fixing the error message, this also makes unavailable local
remotes' names not be shown in various messages, including in git annex
status output.
Also, move --to an unavailable local repository now avoids some ugly
errors like "changeWorkingDirectory: does not exist".
Baked into the code was an assumption that a repository's git directory
could be determined by adding ".git" to its work tree (or nothing for bare
repos). That fails when core.worktree, or GIT_DIR and GIT_WORK_TREE are
used to separate the two.
This was attacked at the type level, by storing the gitdir and worktree
separately, so Nothing for the worktree means a bare repo.
A complication arose because we don't learn where a repository is bare
until its configuration is read. So another Location type handles
repositories that have not had their config read yet. I am not entirely
happy with this being a Location type, rather than representing them
entirely separate from the Git type. The new code is not worse than the
old, but better types could enforce more safety.
Added support for core.worktree. Overriding it with -c isn't supported
because it's not really clear what to do if a git repo's config is read, is
not bare, and is then overridden to bare. What is the right git directory
in this case? I will worry about this if/when someone has a use case for
overriding core.worktree with -c. (See Git.Config.updateLocation)
Also removed and renamed some functions like gitDir and workTree that
misused git's terminology.
One minor regression is known: git annex add in a bare repository does not
print a nice error message, but runs git ls-files in a way that fails
earlier with a less nice error message. This is because before --work-tree
was always passed to git commands, even in a bare repo, while now it's not.
Turns out that git will accept a .git/config containing an url with eg,
spaces in its name. Handle this by escaping the url if it's not valid.
This also fixes support for urls containing escaped characters like %20
for space. Before, the path from the url was not unescaped properly.
The last branch ref that the index was updated to is stored in
.git/annex/index.lck, and the index only updated when the current
branch ref differs.
(The .lck file should later be used for locking too.)
Some more optimization is still needed, since there is some redundancy in
calls to git show-ref.
In git, a Ref can be a Sha, or a Branch, or a Tag. I added type aliases for
those. Note that this does not prevent mixing up of eg, refs and branches
at the type level. Since git really doesn't care, except rare cases like
git update-ref, or git tag -d, that seems ok for now.
There's also a tree-ish, but let's just use Ref for it. A given Sha or Ref
may or may not be a tree-ish, depending on the object type, so there seems
no point in trying to represent it at the type level.
Many functions took the repo as their first parameter. Changing it
consistently to be the last parameter allows doing some useful things with
currying, that reduce boilerplate.
In particular, g <- gitRepo is almost never needed now, instead
use inRepo to run an IO action in the repo, and fromRepo to get
a value from the repo.
This also provides more opportunities to use monadic and applicative
combinators.
Other code is currently depending on checkAttr forcing absolute filenames
to relative, so keep it doing so.
This is a quick fix, and should sometime be moved elsewhere.
* This version of git-annex only works with git 1.7.7 and newer.
The breakage with old versions is subtle, and affects
annex.numcopies .gitattributes settings, so be sure to upgrade git
to 1.7.7. (Debian package now depends on that version.)
* Don't pass absolute paths to git show-attr, as it started following
symlinks when that's done in 1.7.7. Instead, use relative paths,
which show-attr only handles 100% correctly in 1.7.7. Closes: #645046
Unfortunatly I can find no way to work with the old and new gits, as
the old had bugs that require absolute paths, while the new doesn't like
them at all. And the behavior of git show-attr in 1.7.7. is the same as
eg, git add of an absolute path to a symlink, so seems entirely
intentional and not likely to change.
This yields a second or so speedup in unused, find, etc. Seems that even
when the ByteString is immediately split and then converted to Strings,
it's faster.
I may try to push ByteStrings out into more of git-annex gradually,
although I suspect most of the time-critical parts are already covered
now, and many of the rest rely on libraries that only support Strings.
Added Git.ByteString which replaces Git IO methods with ones using lazy
ByteStrings. This can be more efficient when large quantities of data are
being read from git.
In Git.LsTree, parse git ls-tree output more efficiently, thanks
to ByteString. This benchmarks 25% faster, in a benchmark that includes
(probably predominately) the run time for git ls-tree itself.
In real world numbers, this makes git annex unused 2 seconds faster for
each branch it needs to check, in my usual large repo.
Fixed the laziness space leak, so it runs in 60 mb or so again. Slightly
faster due to using Data.Set.difference now, although this also makes it
use slightly more memory.
Also added display of the refs being checked, and made unused --from
also check all refs for things in the remote.