This allows bypassing the direct mode guard in a safe way to do all sorts
of things including git revert, git mv, git checkout ...
This commit was sponsored by the WikiMedia Foundation.
This fixes all instances of " \t" in the code base. Most common case
seems to be after a "where" line; probably vim copied the two space layout
of that line.
Done as a background task while listening to episode 2 of the Type Theory
podcast.
This avoids cp -a overriding the default mode acls that the user might have
set in a git repository.
With GNU cp, this behavior change should not be a breaking change, because
git-anex also uses rsync sometimes in the same situation, and has only ever
preserved timestamps when using rsync.
Systems without GNU cp will no longer use cp -a, but instead just cp.
So, timestamps will no longer be preserved. Preserving timestamps when
copying between repos is not guaranteed anyway.
Closes: #729757
Removed old extensible-exceptions, only needed for very old ghc.
Made webdav use Utility.Exception, to work after some changes in DAV's
exception handling.
Removed Annex.Exception. Mostly this was trivial, but note that
tryAnnex is replaced with tryNonAsync and catchAnnex replaced with
catchNonAsync. In theory that could be a behavior change, since the former
caught all exceptions, and the latter don't catch async exceptions.
However, in practice, nothing in the Annex monad uses async exceptions.
Grepping for throwTo and killThread only find stuff in the assistant,
which does not seem related.
Command.Add.undo is changed to accept a SomeException, and things
that use it for rollback now catch non-async exceptions, rather than
only IOExceptions.
Running `git annex direct` would cause loss of data, because the object
was moved to a temp file, which it then tried to replace the work tree file
with, and on failure, the temp file got deleted. Now it's instead moved
back into the annex object location.
Support users who have set commit.gpgsign, by disabling gpg signatures for
git-annex branch commits and commits made by the assistant.
The thinking here is that a user sets commit.gpgsign intending the commits
that they manually initiate to be gpg signed. But not commits made in the
background, whether by a deamon or implicitly to the git-annex branch.
gpg signing those would be at best a waste of CPU and at worst would fail,
or flood the user with gpg passphrase prompts, or put their signature on
changes they did not directly do.
See Debian bug #753720.
Also makes all commits done by git-annex go through a few central control
points, to make such changes easier in future.
Also disables commit.gpgsign in the test suite.
This commit was sponsored by Antoine Boegli.
http://marc.info/?l=git&m=140262402204212&w=2
This git bug manifested on FAT and Windows as the test suite failing in 3
places. All involved merge conflict resolution. It turned out that the
associated file mappings were getting messed up, and that happened because
this git bug lost track of what files were supposed to be symlinks.
This commit was sponsored by Eric Kidd.
Rather than calculating the TSDelta once, and caching it, this now
reads the inode sential file's InodeCache file once, and then each time a
new InodeCache is generated, looks at the sentinal file to get the current
delta.
This way, if the time zone changes while git-annex is running, it will
adapt.
This adds some inneffiency, but only on Windows, and only 1 stat per new
file added. The worst innefficiency is that `git annex status` and
`git annex sync` will now (on Windows) stat the inode sentinal file once per
file in the repo.
It would be more efficient to use getCurrentTimeZone, rather than needing
to stat the sentinal file. This should be easy to do, once the time
package gets my bugfix patch.
This commit was sponsored by Jürgen Lüters.
On Windows, changing the time zone causes the apparent mtime of files to
change. This confuses git-annex, which natually thinks this means the files
have actually been modified (since THAT'S WHAT A MTIME IS FOR, BILL <sheesh>).
Work around this stupidity, by using the inode sentinal file to detect if
the timezone has changed, and calculate a TSDelta, which will be applied
when generating InodeCaches.
This should add no overhead at all on unix. Indeed, I sped up a few
things slightly in the refactoring.
Seems to basically work! But it has a big known problem:
If the timezone changes while the assistant (or a long-running command)
runs, it won't notice, since it only checks the inode cache once, and
so will use the old delta for all new inode caches it generates for new
files it's added. Which will result in them seeming changed the next time
it runs.
This commit was sponsored by Vincent Demeester.
It was possible for a interrupted sync or merge in direct mode to
leave the work tree out of sync with the last recorded commit.
This would result in the next commit seeing files missing from the work
tree, and committing their removal.
Now, a direct mode merge happens not only in a throwaway work tree, but using
a temporary index file, and without any commits or index changes
being made until the real work tree has been updated. If the merge is
interrupted, the work tree may have some updated files, but worst case a
commit will redundantly commit changes that come from the merge.
This commit was sponsored by Tony Cantor.
Added test cases for both ways this can happen, with a conflict involving a
file, or a directory.
Cleaned up resolveMerge to not touch the work tree in direct mode, which
turned out to be the only way to handle things.. And makes it much nicer.
Still need to run test suite on windows.
In the case of a conflicted merge where the remote adds a directory, and we
have a file (which is checked in), resolveMerge' will create the link,
and so the fix for 1192d98721 looked at that,
thought it was an unannexed file (it's not in the oldref), and preserved
it.
This is a hacky fix. It would be better for resolveMerge' to not update the
work tree, at least in direct mode, and only stage the changes, which
mergeDirectCleanUp could then move into tree. I want to make that change,
but this is not the time to do it.
Removed instance, got it all to build using fromRef. (With a few things
that really need to show something using a ref for debugging stubbed out.)
Then added back Read instance, and made Logs.View use it for serialization.
This changes the view log format.
The assistant's commit code also always avoids git commit, for simplicity.
Indirect mode sync still does a git commit -a to catch unstaged changes.
Note that this means that direct mode sync no longer runs the pre-commit
hook or any other hooks git commit might call. The git annex pre-commit
hook action for direct mode is however explicitly run. (The assistant
already ran git commit with hooks disabled, so no change there.)
Because that allowed writing to symlinks of files that are not present,
which followed the link and put bad content in an object location.
fsck: Fix up .git/annex/object directory permissions.
This commit was sponsored by an anonymous bitcoin donor.
Now that direct mode sets core.bare=true, git's normal prohibition about
pushing into the currently checked out branch doesn't work.
A simple fix for this would be an update hook which blocks the pushes..
but git hooks must be executable, and git-annex needs to be usable on eg,
FAT, which lacks x bits.
Instead, enabling direct mode switches the branch (eg master) to a special
purpose branch (eg annex/direct/master). This branch is not pushed when
syncing; instead any changes that git annex sync commits get written to
master, and it's pushed (along with synced/master) to the remote.
Note that initialization has been changed to always call setDirect,
even if it's just setDirect False for indirect mode. This is needed because
if the user has just cloned a direct mode repo, that nothing has synced
with before, it may have no master branch, and only a annex/direct/master.
Resulting in that branch being checked out locally too. Calling setDirect False
for indirect mode moves back out of this branch, to a new master branch,
and ensures that a manual "git push" doesn't push changes directly to
the annex/direct/master of the remote. (It's possible that the user
makes a commit w/o using git-annex and pushes it, but nothing I can do
about that really.)
This commit was sponsored by Jonathan Harrington.
Done using a mode witness, which ensures it's fixed everywhere.
Fixing catFileKey was a bear, because git cat-file does not provide a
nice way to query for the mode of a file and there is no other efficient
way to do it. Oh, for libgit2..
Note that I am looking at tree objects from HEAD, rather than the index.
Because I cat-file cannot show a tree object for the index.
So this fix is technically incomplete. The only cases where it matters
are:
1. A new large file has been directly staged in git, but not committed.
2. A file that was committed to HEAD as a symlink has been staged
directly in the index.
This could be fixed a lot better using libgit2.
If the cleanup of a single file fails for some reason, continue
to clean up other files.
This could happen because of a race. The merge pulls in a change to a file,
which gets changed locally at the same time.
Made fromDirect check that a file in the tree has good content (and is not
a broken symlink either) before copying it to another file that has the
same key.
Made replaceFile clean up the temp file if the action that creates it, or
the file replacement action fails.
This was also tripped by the test suite's automatic conflict resolution
test. Which also shows BTW that an unnecessary copy of content is done
sometimes when merging in direct mode. Not going to try to speed that up
now.
The bug was in movein, which just replaceFile'd the file with a symlink,
even if it already had the desired content, before trying to pull the
content out of the annex and replace the symlink with it.
That was ok-ish for non conflicted merges, where if the file existed it would
be an old version of the content. But for conflicted merges, the automatic
merge resolver has already run, and will have already put the desired
content into the file for the local variant.
Also, made removeDirect not trust that the associated files map is correct.
Only if it can verify that another file has the content will it not move it
into .git/annex/objects.
This fixes a bug with git annex add in direct mode. If some files already
existed in the tree pointing at the same key as a file that was just added,
and their content was not present, add neglected to copy the content to
those files.
I also changed the behavior of moveAnnex slightly: When content is moved
into the annex in direct mode, it does not overwrite any content already
present in direct mode files. That content may be modified after all.
It's possible for files in indirect mode to have a direct mode mapping
file. Probably from when they were in direct mode. In this case,
toDirectGen tried to copy the content from the direct mode file that the
mapping said had it. But, being in indirect mode, it didn't really have the
content. So it did nothing. This fix makes it always move the content from
.git/annex/objects/ when it's there.
A content directory can be frozen in direct mode. One way this can happen
is if the content is transferred before direct mode has a mapping for it,
so it's stored in the content directory.
So, we need to thaw the content directory before doing things with it.
The root of the problem is that toInodeCache sees a non-symlink, and so
goes on and generates a new inode cache for the dummy symlink.
Any place that toInodeCache, or sameFileStatus, or genInodeCache are called
may need to deal with this case. Although many of them are ok. For example,
prepSendAnnex calls sameInodeCache, which calls genInodeCache.. but if
the file content is not present, the InodeCache generated for its standin
file is appropriately not the same, and so it returns Nothing.
I've audited some, but have to say I'm not happy with this; it should be
handled at the type level somehow, or a toInodeCache wrapper be used that
is aware of dummy symlinks.
(The Watcher already dealt with it, via the guardSymlinkStandin function.)
Fixed by storing a list of cached inodes for a key, instead of just one.
Backwards compatability note: An old git-annex version will fail to parse
an inode cache file that has been written by a new version, and has
multiple items. It will succees if just one. So old git-annexes will have
even worse behavior when there are duplicated files, if that is possible.
I don't think it will be a problem. (Famous last words.)
Also, note that it doesn't expire old and unused inode caches for a key.
It would be possible to add this if needed; just look through the
associated files for a key and if there are more cached inodes, throw out
any not corresponding to associated files. Unless a file is being copied
repeatedly and the old copy deleted, this lack of expiry should not be a
problem.
* since this is a crippled filesystem anyway, git-annex doesn't use
symlinks on it
* so there's no reason to use the mixed case hash directories that we're
stuck using to avoid breaking everyone's symlinks to the content
* so we can do what is already done for all bare repos, and make non-bare
repos on crippled filesystems use the all-lower case hash directories
* which are, happily, all 3 letters long, so they cannot conflict with
mixed case hash directories
* so I was able to 100% fix this and even resuming `git annex add` in the
test case will recover and it will all just work.
This avoids commit churn by the assistant when eg,
replacing a file with a symlink.
But, just as importantly, it prevents the working tree being left with a
deleted file if git-annex, or perhaps the whole system, crashes at the
wrong time.
(It also probably avoids confusing displays in file managers.)
There are two types of equality here, and which one is right varies,
so this forces me to consider and choose between them.
Based on this, I learned that the commit in git anex sync was
always doing a strong comparison, even when in a repository where
the inodes had changed. Fixed that.