Commit graph

580 commits

Author SHA1 Message Date
Joey Hess
3c4ad3eeca
indent 2016-03-11 14:46:54 -04:00
Joey Hess
ec8eba18ad
fix warning 2016-03-11 14:33:38 -04:00
Joey Hess
fed8fcb99f
allow adding new items via adjustTree 2016-03-11 14:08:06 -04:00
Joey Hess
8d124beba8
add commitDiff, and clean up partial function 2016-03-11 13:15:49 -04:00
Joey Hess
fbf4d89e82
extract commit parent(s) 2016-03-11 12:47:14 -04:00
Joey Hess
cf24e9b892
working toward adjusted commit propagation 2016-03-03 16:19:09 -04:00
Joey Hess
730b249477
replicate git's message about an existing lock file 2016-03-03 13:06:39 -04:00
Joey Hess
de4bd97c9d
support for git-style lock files, on unix and windows 2016-03-03 12:49:54 -04:00
Joey Hess
7c20bf6e7a
make sync aware of adjusted branches
So, it will pull and push the original branch, not the adjusted one.

And, for merging, it will use updateAdjustedBranch (not implemented yet).

Note that remaining uses of Git.Branch.current need to be checked too;
for things that should act on the original branch, and not the adjusted
branch.
2016-02-29 15:23:08 -04:00
Joey Hess
955ab3a973
fix android build 2016-02-29 11:43:22 -04:00
Joey Hess
facc50d965
forgot to use sfile 2016-02-26 16:12:40 -04:00
Joey Hess
3b74dc8be8
add fromBlobType 2016-02-25 15:34:22 -04:00
Joey Hess
7b2496508f
factor out commitTree 2016-02-25 15:33:50 -04:00
Joey Hess
1f91d1d0b7
add catCommit, with commit object parser 2016-02-25 15:14:47 -04:00
Joey Hess
be2e9427ad
refactor 2016-02-25 13:46:31 -04:00
Joey Hess
804aeca5d2
parse strictly
This reduces memory use, because it avoids thunks that buffer parts of the
ls-tree output that are not needed.
2016-02-23 23:08:41 -04:00
Joey Hess
e5dd91b189
better encapsulation 2016-02-23 22:22:22 -04:00
Joey Hess
4ea36b8c63
few strictness improvements 2016-02-23 22:03:47 -04:00
Joey Hess
85b05a29df
refactor 2016-02-23 21:56:08 -04:00
Joey Hess
e08bebf0eb
add adjustTree (low-level) interface that avoids buffering much in memory
Using getTree and recordTree in my big repo takes 594 mb ram.
Using adjustTree takes 73 mb.
2016-02-23 21:35:16 -04:00
Joey Hess
9519af25f3
remove support for network older than 2.4
debian stable has 2.4
2016-02-23 20:35:32 -04:00
Joey Hess
123f823ef7
no streaming
extractTree has to parse the whole input list in order to generate a tree,
so convert interface to non-streaming.

Some quick memory benchmarks in a repo with 60k files
don't look too bad despite not streaming.

To stream, without building up a whole tree object, one way would
be a new interface:

adjustTree :: MonadIO m => (TreeItem -> m (Maybe TreeItem)) -> Ref -> Repo -> m Sha

This would only need to buffer tree objects from the current one down
to the root, in order to update trees when a TreeItem is changed.

But, while it supports changing items in the tree, and removing items,
it does not support adding new items, or moving items from one directory to
another.
2016-02-23 20:25:31 -04:00
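The callback-driven interface proposed in 123f823ef7 can be illustrated with a small language-neutral sketch. This is Python, not git-annex's Haskell, and every name in it (`adjust_tree`, the nested-dict tree, the string "shas") is a hypothetical stand-in. It only models the callback contract: return a replacement to change an item, or `None` to remove it. It does not write git tree objects or return a Sha, and, as the message notes for this style of interface, it cannot add new items or move items between directories. Dropping emptied subtrees is a choice made for this sketch only.

```python
from typing import Callable, Dict, Optional, Union

# Hypothetical in-memory tree: a name maps to a blob sha (str) or a subtree (dict).
Tree = Dict[str, Union[str, "Tree"]]
# Callback: given (path, sha), return a replacement sha, or None to remove the item.
Adjust = Callable[[str, str], Optional[str]]


def adjust_tree(adjust: Adjust, tree: Tree, prefix: str = "") -> Tree:
    """Apply `adjust` to every blob in the tree, rebuilding only along the way back up."""
    result: Tree = {}
    for name, entry in tree.items():
        path = prefix + name
        if isinstance(entry, dict):
            subtree = adjust_tree(adjust, entry, path + "/")
            if subtree:  # sketch-only choice: drop subtrees that became empty
                result[name] = subtree
        else:
            adjusted = adjust(path, entry)
            if adjusted is not None:
                result[name] = adjusted
    return result


tree: Tree = {
    "README": "sha-readme",
    "doc": {"big.iso": "sha-iso", "notes.txt": "sha-notes"},
}
# Remove every .iso file and leave everything else unchanged.
adjusted = adjust_tree(lambda path, sha: None if path.endswith(".iso") else sha, tree)
print(adjusted)
```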
Joey Hess
e266a6ec78
use getSha 2016-02-23 18:30:11 -04:00
Joey Hess
fc072699b7
minor improvements 2016-02-23 17:21:42 -04:00
Joey Hess
ae76cfde7d
add mktree interface 2016-02-23 16:36:38 -04:00
Joey Hess
a49d5d30fe
fix handling of unspecified attributes (particularly for annex.largefiles) 2016-02-05 18:41:23 -04:00
Joey Hess
d37fe6a547
annex.largefiles can be configured in .gitattributes too
This is particularly useful for v6 repositories, since the .gitattributes
configuration will apply in all clones of the repository.
2016-02-02 15:18:17 -04:00
Joey Hess
b52cf5697b
immediate queue flushing when annex.queuesize=1
Previously, it only flushed when the queue got larger than 1.

Also, make the queue auto-flush when items are added, rather than needing
to be flushed as a separate step. This simplifies the code and makes it more
efficient too, as it avoids needing to read the queue out of the state to
check if it should be flushed.
2016-01-13 14:55:01 -04:00
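The auto-flush behavior described in b52cf5697b can be sketched roughly as follows. This is an illustrative Python model, not the git-annex implementation; `AutoFlushQueue`, `max_size`, and `run` are hypothetical names. The point it shows is that checking the limit inside `add` makes a limit of 1 flush immediately and removes the need for a separate flush step after each addition.

```python
from typing import Callable, List


class AutoFlushQueue:
    """Hypothetical command queue that flushes itself as items are added."""

    def __init__(self, max_size: int, run: Callable[[List[str]], None]) -> None:
        self.max_size = max_size
        self.run = run  # called with the queued items on each flush
        self.items: List[str] = []

    def add(self, item: str) -> None:
        self.items.append(item)
        # Flush as soon as the limit is reached, so max_size=1 flushes at once.
        if len(self.items) >= self.max_size:
            self.flush()

    def flush(self) -> None:
        if self.items:
            batch, self.items = self.items, []
            self.run(batch)


flushed: List[List[str]] = []
queue = AutoFlushQueue(max_size=1, run=flushed.append)
queue.add("git add foo")
queue.add("git add bar")
print(flushed)  # [['git add foo'], ['git add bar']]
```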
Joey Hess
3320870bad
optimise
03cb2c8ece put a cat-file into the fast
bloomfilter generation path. Instead, add another bloom filter which diffs
from the work tree to the index.

Also, pull the sha of the changed object out of the diffs, and cat that
object directly, rather than indirecting through the filename.

Finally, removed some hacks that are unnecessary thanks to the worktree to
index diff.
2016-01-06 20:38:02 -04:00
Joey Hess
aa4f353e5d
clarify absPathFrom
The repo path is typically relative, not absolute, so
providing it to absPathFrom doesn't yield an absolute path.
This is not a bug, just unclear documentation.

Indeed, there seems to be no reason to simplifyPath here, which absPathFrom
does, so instead just combine the repo path and the TopFilePath.

Also, removed an export of the TopFilePath constructor; asTopFilePath
is provided to construct one as-is.
2016-01-05 17:33:48 -04:00
Joey Hess
b3d60ca285
use TopFilePath for associated files
Fixes several bugs with updates of pointer files. When eg, running
git annex drop --from localremote
it was updating the pointer file in the local repository, not the remote.
Also, fixes drop ../foo when run in a subdir, and probably lots of other
problems. Test suite drops from ~30 to 11 failures now.

TopFilePath is used to force thinking about what the filepath is relative
to.

The data stored in the sqlite db is still just a plain string, and
TopFilePath is a newtype, so there's no overhead involved in using it in
DataBase.Keys.
2016-01-05 17:22:19 -04:00
Joey Hess
ec28151722
improve data type 2016-01-01 15:56:24 -04:00
Joey Hess
f7256842cc
wait for git lstree to exit 2016-01-01 15:51:29 -04:00
Joey Hess
70fee8208c
remove old TODO 2016-01-01 15:43:13 -04:00
Joey Hess
b0626230b7
fix use of hifalutin terminology 2015-11-16 14:37:31 -04:00
Joey Hess
53db9d0b5c
work around git check-ignore --batch bad exit status bug, and bring back import -J 2015-11-06 15:39:51 -04:00
Joey Hess
31472161e4
merge git command queue when joining with concurrent thread 2015-11-05 18:21:48 -04:00
Joey Hess
ef5496b8de
Catch up with current git behavior when both repo and repo.git exist; it seems it now prefers repo in this case, although historically it may have preferred repo.git. 2015-10-26 15:35:55 -04:00
Joey Hess
b0e5c09408
fix various build warnings, mostly on Windows
And some when S3 is disabled
2015-10-13 13:24:44 -04:00
Joey Hess
f2b6ebd502 status: Show added but not yet committed files.
Seems easy, but git ls-files can't list the right subset of files.
So, I wrote a whole new parser for git status output, and converted the
status command to use that.

There are a few other small behavior changes. The order changed. Unlocked
files show as T. In indirect mode, deleted files were not shown before, and
that's fixed. Regular files checked directly into git and modified
were not shown before, and are now.
2015-09-22 17:32:28 -04:00
Joey Hess
6158036e23 Switched to using git for Windows, rather than msysgit.
Using msysgit with git-annex is no longer supported.

At the same time, I'm updating the rsync.exe in my downloads repository
with the one from msys2.

Note that rsync is currently still being ldded and installed in Git/cmd/
like the other cygwin programs. The ldd fails and this failure is ignored.
It would be better to special case it to go in Git/usr/bin/, so that the
user can't run rsync in a dos prompt window, which doesn't work, as it needs
additional libs. However, as far as git-annex running rsync running ssh,
it works ok in this location.

Removed the ssh.cmd and ssh-keygen.cmd; these are not needed with git for
windows. Keeping them would let ssh be run manually from a dos prompt
window, but that's not really a goal.
2015-09-10 19:16:30 -04:00
Joey Hess
06dac3bfed avoid nul-truncation
This might be a little slower, but it's safer, in the event that a
union-merged file contains a NUL.

AFAIK, no files in the git-annex branch do.
2015-08-11 18:46:10 -04:00
Joey Hess
24800b1bf1 Only look at reflogs for relevant branches, not for git-annex branches
This speeds it up quite a bit.. May still be too slow in large repos.
2015-07-07 17:36:30 -04:00
Joey Hess
b11d2f5a8a unused: --used-refspec can now be configured to look at refs in the reflog. This provides a way to not consider old versions of files to be unused after they have reached a specified age, when the old refs in the reflog expire.
May be slow.
2015-07-07 17:13:50 -04:00
Joey Hess
dbd41159e0 Support git's undocumented core.sharedRepository=2 value, which is equivalent to "world". 2015-07-06 15:33:44 -04:00
Joey Hess
f7dc20595e refactor ls-tree params
All in one place to avoid bugs like 174da80ddc
2015-07-06 14:21:43 -04:00
Joey Hess
eb33569f9d remove Params constructor from Utility.SafeCommand
This removes a bit of complexity, and should make things faster
(avoids tokenizing Params string), and probably involve less garbage
collection.

In a few places, it was useful to use Params to avoid needing a list,
but that is easily avoided.

Problems noticed while doing this conversion:

	* Some uses of Params "oneword" which was entirely unnecessary
	  overhead.
	* A few places that built up a list of parameters with ++
	  and then used Params to split it!

Test suite passes.
2015-06-01 13:52:23 -04:00
Joey Hess
03667a162a couple of AMP warnings I missed before 2015-05-10 16:51:03 -04:00
Joey Hess
5c7cdbae46 more {-# OPTIONS_GHC -fno-warn-tabs #-} ... Forcing people who have what is merely a difference of opinion to you to do this is a bit of an asshole move. Just saying. 2015-05-10 16:38:49 -04:00
Joey Hess
c3dd133257 desc 2015-04-19 08:18:17 -04:00
Joey Hess
addc82dab7 removed all uses of undefined from code base
It's a code smell, can lead to hard to diagnose error messages.
2015-04-19 00:38:29 -04:00
Joey Hess
d3d92abf95 spotted a few more places where diff-tree needed --
None of these are very likely at all to ever be ambiguous, since tree
refs almost never have symbolic names and the sha is very unlikely
to be in the work tree.. But, let's get it right!
2015-04-09 21:22:35 -04:00
Joey Hess
2879adc551 fix union merge to call diff-index with -- after the ref
Otherwise, if there's a file in the repo with a name matching the ref,
git could get confused and the merge not work.
2015-04-09 21:13:28 -04:00
Joey Hess
0f740fd198 This fixes a bug in the assistant introduced by the literal pathspec changes in version 5.20150406.
git-checkignore refuses to work if any pathspec options are set. Urgh.

I audited the rest of git, and no other commands used by git-annex have
such limitations. Indeed, AFAICS, *all* other commands support
--literal-pathspecs. So, worked around this where git-checkignore is
called.
2015-04-09 13:37:06 -04:00
Joey Hess
15d45186cc use --literal-pathspecs globally, as a better way to avoid globbing
This might be overkill; I only know I need it in ls-files, but other git
commands can also do their own globbing, it turns out, and I am pretty sure
I never want them to when git-annex is using them as plumbing.

Test suite still passes and it looks ok.
2015-03-30 19:44:13 -04:00
Joey Hess
f933898cd0 workaround git ls-files bug in handling slash-escaped wildcards
There's no good solution for git-annex here; I can't escape or un-escape
and avoid breaking in some cases, so I've chosen the combo least likely
to result in breakage.

Git really needs to fix its behavior here.

The only other thing git-annex could do is treat this as a feature,
and don't try to escape at all. Ugh.
2015-03-30 19:00:04 -04:00
Joey Hess
f35d0bf4b2 Prevent git-ls-files from double-expanding wildcards when an unexpanded wildcard is passed to a git-annex command like add or find.
Note that previously, `git annex find *.jpg` would find eg, foo/bar.jpg.
That was never intended or documented behavior, so I'm going to change it.
But this is potentially a behavior change if someone discovered that
behavior and relied on it despite it being accidental. Oh well.. can't make
an omelette w/o breaking some eggs.
2015-03-27 16:45:50 -04:00
Joey Hess
cedca095b9 promote forum request to todo item so it is not lost 2015-03-27 16:38:51 -04:00
Joey Hess
3af4691978 Improve error message when --in @date is used and there is no reflog for the git-annex branch. 2015-03-26 11:15:15 -04:00
Joey Hess
a6db10d565 sync: Fix committing when in a direct mode repo that has no HEAD ref.
Seen for example, a newly checked out git submodule. In this case,
.git/HEAD is a raw sha, rather than the usual reference to a ref.

Removed currentSha in passing, since it was a more roundabout way of
doing what headSha does, and headSha is more robust.
2015-03-04 15:25:35 -04:00
Joey Hess
e322826e33 Submodules are now supported by git-annex!
Seems to work, but still experimental until it's been tested more.

When repositories are on filesystems not supporting symlinks, the .git dir
symlink trick cannot be used. Since we're going to be in direct mode
anyway, the .git dir symlink is not strictly needed.

However, I have not fixed the code that creates new annex symlinks to
handle this case -- the committed symlinks will be wrong.

git annex sync happens to currently fail in a submodule using direct mode,
because there's no HEAD ref. That also needs to be dealt with to get
this fully working in crippled filesystems.

Leaving http://github.com/datalad/datalad/issues/44 open until these issues
are dealt with.
2015-03-02 16:43:44 -04:00
Joey Hess
5169999b07 add -q to git symbolic-ref call
Avoids a warning message from git when HEAD doesn't exist. Which it won't
when eg, git-annex is used in a submodule just cloned with
git clone --recursive. In this case, a specific ref is checked out and
there's no HEAD yet.

The code already returned Nothing in this case, so no behavior change other
than not showing the warning. And git-annex operates fine in this
situation.
2015-03-02 15:56:37 -04:00
Joey Hess
52e40970c8 avoid unnecessary IO 2015-02-12 15:33:44 -04:00
Joey Hess
afc5153157 update my email address and homepage url 2015-01-21 12:50:09 -04:00
Joey Hess
6035f94666 Windows: Fix running of the pre-commit-annex hook. 2015-01-20 14:48:16 -04:00
Joey Hess
f4de021a54 convert parentDir to be based on takeDirectory, but fixed for trailing / 2015-01-09 14:26:52 -04:00
Joey Hess
3bab5dfb1d revert parentDir change
Reverts 965e106f24

Unfortunately, this caused breakage on Windows, and possibly elsewhere,
because parentDir and takeDirectory do not behave the same when there is a
trailing directory separator.
2015-01-09 13:11:56 -04:00
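The trailing-separator pitfall behind 3bab5dfb1d can be illustrated outside Haskell. Python's `posixpath.dirname` treats a trailing `/` the same way the `System.FilePath` documentation describes for `takeDirectory`: it only strips the final separator instead of going up a level. The `parent_dir` helper below is a hypothetical sketch of a parent-directory function that ignores trailing separators, which is the direction f4de021a54 describes; it is not the git-annex code.

```python
import posixpath


def parent_dir(path: str) -> str:
    """Hypothetical helper: parent of `path`, ignoring any trailing '/'."""
    trimmed = path.rstrip("/") or "/"  # keep the root itself intact
    return posixpath.dirname(trimmed)


print(posixpath.dirname("foo/bar"))   # foo
print(posixpath.dirname("foo/bar/"))  # foo/bar  (trailing separator only stripped)
print(parent_dir("foo/bar/"))         # foo
```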
Joey Hess
676ef32547 Merge branch 'master' into relativepaths 2015-01-06 21:41:25 -04:00
Joey Hess
adefcf189a Bugfix: A file named HEAD in the work tree could confuse some git commands run by git-annex. 2015-01-06 21:41:21 -04:00
Joey Hess
858d776352 Merge branch 'master' into relativepaths
Conflicts:
	Locations.hs
	debian/changelog
2015-01-06 19:00:01 -04:00
Joey Hess
965e106f24 made parentDir return a Maybe FilePath; removed most uses of it
parentDir is less safe than takeDirectory, especially when working
with relative FilePaths. It's really only useful in loops that
want to terminate at /

This commit was sponsored by Audric SCHILTKNECHT.
2015-01-06 18:55:56 -04:00
Joey Hess
d44b28437d git-hash-object needs absolute files (git bug)
A relative path to a file makes it fail. I am pretty sure this is a git
bug; work around it.
2015-01-06 17:33:29 -04:00
Joey Hess
82f667e7f2 git repo path may be relative, so don't assume absolute any more
Fixes 6 test failures.
2015-01-06 16:32:44 -04:00
Joey Hess
cd865c3b8f Switch to using relative paths to the git repository.
This allows the git repository to be moved while git-annex is running in
it, with fewer problems.

On Windows, this avoids some of the problems with the absurdly small
MAX_PATH of 260 bytes. In particular, git-annex repositories should
work in deeper/longer directory structures than before. See
http://git-annex.branchable.com/bugs/__34__git-annex:_direct:_1_failed__34___on_Windows/

There are several possible ways this change could break git-annex:

1. If it changes its working directory while it's running, that would
   be Bad News. Good news everyone! git-annex never does so. It would also
   break thread safety, so all such things were stomped out long ago.

2. parentDir "." -> "" which is not a valid path. I had to fix one
   instance of this, and I should probably wipe all calls to parentDir out
   of the git-annex code base; it was never a good idea.

3. Things like relPathDirToFile require absolute input paths,
   and code assumes that the git repo path is absolute and passes it to it
   as-is. In the case of relPathDirToFile, I converted it to not make
   this assumption.

Currently, the test suite has 16 failures.
2015-01-06 16:19:41 -04:00
Joey Hess
4d786ebe4a Check git version at runtime, rather than assuming it will be the same as the git version used at build time when running git-checkattr and git-branch remove.
It's ok to probe every time for git-branch remove because that's
run quite rarely. For git-checkattr, it's run only once, when
starting the --batch mode, and so again the overhead is pretty minimal.

This leaves 2 places where the build version is still used.
git merge might be interactive or fail if one skews, and --no-gpg-sign
might not be passed, or might be passed to a git that doesn't understand it
if the other skews. It seems a little expensive to check the git version
each time these are used.

This doesn't seem likely to cause many problems, at least compared with
check-attr hanging on skew.
2015-01-05 15:54:52 -04:00
Joey Hess
db27ad26bf split out DiffTreeItem
This makes github-backup happier when it reuses this library.
2014-12-22 15:32:51 -04:00
Joey Hess
c64ede23cd Use wget -q --show-progress for less verbose wget output, when built with wget 1.16. 2014-12-16 14:04:40 -04:00
Joey Hess
13260ccc3a undo command
This commit was sponsored by Andrew Cant.
2014-11-14 14:41:07 -04:00
Joey Hess
c5ca0dc543 simplify 2014-11-12 15:57:38 -04:00
Joey Hess
864086a956 proxy: for all your direct mode repository munging needs
This allows bypassing the direct mode guard in a safe way to do all sorts
of things including git revert, git mv, git checkout ...

This commit was sponsored by the WikiMedia Foundation.
2014-11-12 15:51:46 -04:00
Joey Hess
bf2b029c49 comment typo 2014-11-10 15:38:31 -04:00
Joey Hess
adc5ca70a8 pre-commit: Block partial commit of unlocked annexed file, since that left a typechange staged in index
I had hoped that the git devs could change git's handling of partial
commits to not use a false index file, but seems not.

So, this relies on some git internals to detect that case. The test suite
has a test case added to catch it if changes to git break it.

This commit was sponsored by Paul Tagliamonte.
2014-11-10 15:36:24 -04:00
Joey Hess
20a497b181 move remote removal into separate module
This allows using Git.Remote w/o needing to have Git.BuildVersion, which
requires configure. It will simplify github-backup when these libraries are
used there.
2014-10-27 11:28:58 -04:00
Joey Hess
1e59df083d Use haskell setenv library to clean up several ugly workarounds for inability to manipulate the environment on windows.
Didn't know that this library existed!

This includes making git-annex not re-exec itself on start on windows, and
making the test suite on Windows run tests without forking.
2014-10-15 20:33:52 -04:00
Joey Hess
c6e9125c61 repair: Prevent auto gc from happening when fetching from a remote. 2014-10-12 14:27:46 -04:00
Joey Hess
9fd95d9025 indent with tabs not spaces
Found these with:
git grep "^  " $(find -type  f -name \*.hs) |grep -v ':  where'

Unfortunately there is some inline hamlet that cannot use tabs for
indentation.

Also, Assistant/WebApp/Bootstrap3.hs is a copy of a module and so I'm
leaving it as-is.
2014-10-09 15:09:26 -04:00
Joey Hess
7b50b3c057 fix some mixed space+tab indentation
This fixes all instances of " \t" in the code base. Most common case
seems to be after a "where" line; probably vim copied the two space layout
of that line.

Done as a background task while listening to episode 2 of the Type Theory
podcast.
2014-10-09 15:09:11 -04:00
Joey Hess
11f111bf1a Fix parsing of ipv6 address in git remote address when it was not formatted as an url. 2014-09-10 14:17:02 -04:00
Joey Hess
b874f84086 New annex.hardlink setting. Closes: #758593
* New annex.hardlink setting. Closes: #758593
* init: Automatically detect when a repository was cloned with --shared,
  and set annex.hardlink=true, as well as marking the repository as
  untrusted.

Had to reorganize Logs.Trust a bit to avoid a cycle between it and
Annex.Init.
2014-09-05 13:44:09 -04:00
Joey Hess
4405650828 Fix handling of autocorrection when running outside a git repository.
Old behavior was to take the first fuzzy match. Now, it checks the global
git config, and runs the normal fuzzy handling, including failing to run a
semi-random command by default.
2014-08-23 16:51:33 -07:00
Joey Hess
c784ef4586 unify exception handling into Utility.Exception
Removed old extensible-exceptions, only needed for very old ghc.

Made webdav use Utility.Exception, to work after some changes in DAV's
exception handling.

Removed Annex.Exception. Mostly this was trivial, but note that
tryAnnex is replaced with tryNonAsync and catchAnnex replaced with
catchNonAsync. In theory that could be a behavior change, since the former
caught all exceptions, and the latter don't catch async exceptions.

However, in practice, nothing in the Annex monad uses async exceptions.
Grepping for throwTo and killThread only find stuff in the assistant,
which does not seem related.

Command.Add.undo is changed to accept a SomeException, and things
that use it for rollback now catch non-async exceptions, rather than
only IOExceptions.
2014-08-07 22:03:29 -04:00
Joey Hess
000dd42ac4 improve repair of bad branches
The repair code assumed that if fsck found no broken objects, after
removing bad objects and possibly pulling replacements from remote, all was
well.. but this is not really true. Removing bad objects could leave some
branches broken. fsck doesn't report any missing objects in this case,
and its messages about broken branches are ignored by the fsck output
parser.

To deal with this, added a separate scan of all refs to find broken ones
and remove them when --forced. This will also let anyone who ran into this
bug run repair again to fix up the incomplete repair done before.

This commit was sponsored by Aaron Whitehouse.
2014-07-21 18:42:58 -04:00
Joey Hess
ec5ed2af9d Set gcrypt-publish-participants when setting up a gcrypt repository, to avoid unnecessary passphrase prompts.
This is a security/usability tradeoff. To avoid exposing the gpg key ids
who can decrypt the repository, users can unset
gcrypt-publish-participants.

The gcrypt-publish-participants option is available in my fork of
git-remote-gcrypt.

This commit was sponsored by Christopher Kernahan.
2014-07-15 17:33:14 -04:00
Joey Hess
eef8e8c51a Fix git version that supported --no-gpg-sign.
This is weird, git describe said the commit landed in 1.8.5, but 1.9.3 does
not have it on OSX. Assume 2.0.0.
2014-07-08 12:46:15 -04:00
Joey Hess
1c1f463c3a
avoid using --no-gpg-sign with old versions of git
and refactor some
2014-07-04 13:49:12 -04:00
Joey Hess
fc67925fd7
reorg
avoid Git.Command needing Utility.Batch which needs async

For github-backup etc
2014-07-04 12:18:49 -04:00
Joey Hess
d41849bc23
support commit.gpgsign
Support users who have set commit.gpgsign, by disabling gpg signatures for
git-annex branch commits and commits made by the assistant.

The thinking here is that a user sets commit.gpgsign intending the commits
that they manually initiate to be gpg signed. But not commits made in the
background, whether by a daemon or implicitly to the git-annex branch.
gpg signing those would be at best a waste of CPU and at worst would fail,
or flood the user with gpg passphrase prompts, or put their signature on
changes they did not directly do.

See Debian bug #753720.

Also makes all commits done by git-annex go through a few central control
points, to make such changes easier in future.

Also disables commit.gpgsign in the test suite.

This commit was sponsored by Antoine Boegli.
2014-07-04 11:53:51 -04:00
Joey Hess
986bf1d6f6 Fix bug in annex.queuesize calculation that caused much more queue flushing than necessary.
The bug caused the size of the queue to be miscalculated; it was doubled
each time an item was added. Commands run after approx 140 items rather
than the intended 10240!
2014-06-18 17:23:36 -04:00
Joey Hess
fbd5a67cba fix a test suite reversion on Windows
Forgot to pass gitEnv when running commands in the git queue on windows.
2014-06-12 18:37:12 -04:00
Joey Hess
a44fd2c019 export CreateProcess fields from Utility.Process
update code to avoid cwd and env redefinition warnings
2014-06-10 19:20:14 -04:00
Joey Hess
d6711800ad avoid bad commits after interrupted direct mode sync (or merge)
It was possible for an interrupted sync or merge in direct mode to
leave the work tree out of sync with the last recorded commit.
This would result in the next commit seeing files missing from the work
tree, and committing their removal.

Now, a direct mode merge happens not only in a throwaway work tree, but using
a temporary index file, and without any commits or index changes
being made until the real work tree has been updated. If the merge is
interrupted, the work tree may have some updated files, but worst case a
commit will redundantly commit changes that come from the merge.

This commit was sponsored by Tony Cantor.
2014-06-09 19:40:28 -04:00
Joey Hess
138d25518d Merge branch 'master' into remotecontrol
Conflicts:
	doc/devblog/day_152__more_ssh_connection_caching.mdwn
2014-04-14 13:38:35 -04:00
Joey Hess
e53a85743e
adjust to not use cpp in modules used by configure 2014-04-14 13:37:12 -04:00
Joey Hess
f67d5abc41 support gcrypt remotes (assuming them to be over ssh transport) 2014-04-08 16:16:46 -04:00
Joey Hess
43909723b3 added git-annex remotedaemon
So far, handling connecting to git-annex-shell notifychanges, and
pulling immediately when a change is pushed to a remote.

A little bit buggy (crashes after the first pull), but it already works!

This commit was sponsored by Mark Sheppard.
2014-04-06 19:10:23 -04:00
Joey Hess
1052eeface Windows: Fix some filename encoding bugs.
http://git-annex.branchable.com/bugs/Unicode_file_names_ignored_on_Windows/

Not a complete fix yet.
2014-03-19 15:57:56 -04:00
Joey Hess
67f09bca6d fully fix fsck memory use by iterative fscking
Not very well tested, but I'm sure it doesn't eg, loop forever.
2014-03-12 15:18:43 -04:00
Joey Hess
475bf70af6 read stdout and stderr concurrently
Avoids any buffering-related blocking.
2014-03-12 13:54:29 -04:00
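The blocking that 475bf70af6 avoids is a general pipe problem: if a parent reads one stream to the end while the child fills the other stream's pipe buffer, both sides wait forever. A minimal Python sketch of the concurrent approach, assuming one reader thread per stream, follows; `run_capturing` is a hypothetical helper and is not how git-annex's Haskell code is written.

```python
import subprocess
import sys
import threading
from typing import Dict, List, Tuple


def run_capturing(argv: List[str]) -> Tuple[bytes, bytes, int]:
    """Run a command, draining stdout and stderr at the same time."""
    chunks: Dict[str, bytes] = {}

    def drain(name: str, stream) -> None:
        chunks[name] = stream.read()  # reads until the child closes the pipe

    with subprocess.Popen(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE) as proc:
        threads = [
            threading.Thread(target=drain, args=("out", proc.stdout)),
            threading.Thread(target=drain, args=("err", proc.stderr)),
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    # Leaving the with-block closes the pipes and waits for the child.
    return chunks["out"], chunks["err"], proc.returncode


# The child writes far more than a typical pipe buffer to stderr before stdout,
# which would stall a parent that read stdout to the end first.
child = "import sys; sys.stderr.write('e' * 200000); sys.stdout.write('o' * 200000)"
out, err, code = run_capturing([sys.executable, "-c", child])
print(len(out), len(err), code)  # 200000 200000 0
```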
Joey Hess
85d13b4302 better streaming when cleaning up corrupt objects
A repo with a lot of objects will now stream them through, rather than
buffering a list of them all in memory.
2014-03-10 16:36:18 -04:00
Joey Hess
0e0d396b27 Improve memory usage when git fsck finds a great many broken objects.
From 1.7 gb to 900 mb on 300 thousand unique reported shas.

When shas are not unique, this streams much better than before, so won't
buffer the full list before putting them into the Set and throwing away
dups. And when fsck output includes ignorable lines, especially
dangling object lines, they won't be buffered in memory at all.
2014-03-10 15:14:09 -04:00
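The streaming idea in 0e0d396b27 can be sketched as a generator feeding a set, so that duplicate shas and ignorable lines are discarded as they arrive instead of being held in a list first. This is a Python illustration under stated assumptions: the function names are hypothetical, the sample lines only imitate the general shape of `git fsck` output, and matching any 40-hex-digit token is a simplification chosen for the sketch.

```python
import re
from typing import Iterable, Iterator, Set

# Assumption for this sketch: a sha is any standalone run of 40 hex digits.
SHA_RE = re.compile(r"\b[0-9a-f]{40}\b")


def shas_in(lines: Iterable[str]) -> Iterator[str]:
    """Yield shas one line at a time, skipping dangling-object lines."""
    for line in lines:
        if line.startswith("dangling "):
            continue  # ignorable line: never stored anywhere
        yield from SHA_RE.findall(line)


def unique_shas(lines: Iterable[str]) -> Set[str]:
    """Collect shas straight into a set, so duplicates cost no extra memory."""
    seen: Set[str] = set()
    for sha in shas_in(lines):
        seen.add(sha)
    return seen


# Illustrative input only; real fsck output varies.
sample_lines = [
    "missing blob " + "a" * 40,
    "dangling commit " + "c" * 40,
    "broken link from tree " + "b" * 40,
    "missing blob " + "a" * 40,
]
print(sorted(unique_shas(sample_lines)))
```

Because `shas_in` is a generator, `lines` can be a file object or a process pipe, and only the set of distinct shas stays in memory.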
Joey Hess
8496d8aa63
improved direct mode dir/file conflicted merge resolution, using tree grafting 2014-03-04 15:00:19 -04:00
Joey Hess
1192d98721 sync: Fix bug in direct mode that caused a file not checked into git to be deleted when merging with a remote that added a file by the same name. (Thanks, jkt) 2014-03-03 14:57:16 -04:00
Joey Hess
d0fce426c4 pre-commit-annex hook script to automatically extract metadata from lots of types of files
Using the extract(1) program to do the heavy lifting.

Decided to make git-annex run pre-commit-annex when committing. Since
git-annex pre-commit also runs it, it'll be run when git commit is run too,
via the pre-commit hook. This basically gives back the pre-commit hook
that git-annex took away. The implementation avoids repeatedly looking
for the hook script when the assistant is running and committing
repeatedly; only checks if the hook is available once.

To make the script simpler, made git-annex metadata -s field?=value
only set a field when it's not already got a value.

This commit was sponsored by bak.
2014-03-02 20:11:58 -04:00
Joey Hess
f8cfcd4e44 couple more warning fixes 2014-02-25 14:53:43 -04:00
Joey Hess
3f6e4b8c7c fix all remaining -Wall warnings on Windows 2014-02-25 14:48:50 -04:00
Joey Hess
46cc39f1a4 repair: Optimise unpacking of pack files, and avoid repeated error messages about corrupt pack files. 2014-02-24 19:36:58 -04:00
Joey Hess
4e0be2792b remove Read instance for Ref
Removed instance, got it all to build using fromRef. (With a few things
that really need to show something using a ref for debugging stubbed out.)

Then added back Read instance, and made Logs.View use it for serialization.
This changes the view log format.
2014-02-19 01:19:57 -04:00
Joey Hess
67fd06af76 add git annex view command
(And a vpop command, which is still a bit buggy.)

Still need to do vadd and vrm, though this also adds their documentation.

Currently not very happy with the view log data serialization. I had to
lose the TDFA regexps temporarily, so I can have Read/Show instances of
View. I expect the view log format will change in some incompatible way
later, probably adding last known refs for the parent branch to View
or something like that.

Anyway, it basically works, although it's a bit slow looking up the
metadata. The actual git branch construction is about as fast as it can be
using the current git plumbing.

This commit was sponsored by Peter Hogg.
2014-02-18 18:22:20 -04:00
Joey Hess
9633c67842 filter branches (incomplete)
Promising work toward metadata driven filter branches. A few methods
to construct them are stubbed out; all the data types and pure code
seems good.

This commit was sponsored by Walter Somerville.
2014-02-16 17:39:54 -04:00
Joey Hess
61ecf76644 unbreak the build 2014-02-12 14:34:01 -04:00
Joey Hess
029a1c431a
remove windows --git-dir unix style path hack
This is no longer necessary, at least with msysgit 1.8.5.2.msysgit.0.
Its root cause may have been fixed by other recent git path fixes.
It was causing the webapp to fail to make repos on other drives.
2014-02-11 16:12:22 -04:00
Joey Hess
c95d0cf7a8 Windows: Fix handling of absolute unix-style git repository paths.
Note that on Windows a remote with a path like /home/foo/bar
is interpreted by git as being some screwy relative path (relative to what
exactly seems ill-defined -- it seemed relative to C:\Program Files\Git\ in
my tests!) So no attempt has been made to handle such a path sanely, just not
to crash when encountering it.

Note that "C:\\foo" </> "/home/foo/bar" yields /home/foo/bar even though
that is not absolute! I don't know what to make of all this,
except that I will be very happy when this crock of **** vanishes from
the face of the earth.
2014-02-08 15:39:04 -04:00
Joey Hess
92edee0b04 remove workaround
This was needed when absNormPath was not being used on Windows, since path
normalization includes removing ./
2014-02-08 14:47:57 -04:00
Joey Hess
a44e01c29c --in can now refer to files that were located in a repository at some past date. For example, --in="here@{yesterday}" 2014-02-06 12:43:56 -04:00
Joey Hess
ed7c61914c assistant: Run the periodic git gc in batch mode. 2014-01-22 17:11:41 -04:00
Joey Hess
78ead70ea4 repair: Check git version at run time. 2014-01-21 13:22:48 -04:00
Joey Hess
4e19e87921 repair: Fix bug in packed refs file exploding code that caused a .gitrefs directory to be created instead of .git/refs 2014-01-15 16:34:18 -04:00
Joey Hess
5e6e89f423 repair: Support old git versions from before git fsck --no-dangling was implemented. 2014-01-13 18:10:45 -04:00
Joey Hess
858eb26303 Avoid looping if long-running git cat-file or git hash-object crashes and keeps crashing when restarted. 2014-01-01 21:42:25 -04:00
Joey Hess
49aad120b9 Windows: Fix bug in direct mode merge code that could cause files in subdirectories to go missing. 2013-12-31 16:39:11 -04:00
Richard Hartmann
974fe009bf Another round of s/amoung/among/ 2013-12-19 12:30:53 -04:00
Joey Hess
c99d6a8151 assistant: Fix OSX-specific bug that caused the startup scan to try to follow symlinks to other directories, and add their contents to the annex. 2013-12-18 15:05:29 -04:00
Joey Hess
625076f9a5 status: Ignore new files that are gitignored. 2013-12-12 14:01:24 -04:00
Joey Hess
e6c4f550d8 repair: Remove damaged git-annex sync branches. 2013-12-10 16:17:49 -04:00
Joey Hess
b37323d857 update 2013-12-10 15:48:24 -04:00
Joey Hess
c0ce3269e9 accidentally committed wrong version of file 2013-12-10 15:45:22 -04:00
Joey Hess
ce045a51af Improve repair of git-annex index file.
Fixes a test case I received where a corrupted repo was repaired, but the
git-annex branch was not. The root of the problem was that the
MissingObject returned by the repair code was not necessarily a complete
set of all objects that might have been deleted during the repair.

So, stop trying to return that at all, and instead make the index file
checking code explicitly verify that each object the index uses is present.
2013-12-10 15:40:01 -04:00
Joey Hess
c717905d15 work around msysgit very strange behavior on ./ or .\ at start of path
Seems that verify_path() rejects such a path on Windows, but I cannot see
why. Git bug?
2013-12-04 23:49:18 -04:00
Joey Hess
4882a611e5 assistant: Batch jobs are now run with ionice and nocache, when those commands are available. 2013-12-01 14:53:15 -04:00
Joey Hess
03932212ec Avoid using git commit in direct mode, since in some situations it will read the full contents of files in the tree.
The assistant's commit code also always avoids git commit, for simplicity.
Indirect mode sync still does a git commit -a to catch unstaged changes.

Note that this means that direct mode sync no longer runs the pre-commit
hook or any other hooks git commit might call. The git annex pre-commit
hook action for direct mode is however explicitly run. (The assistant
already ran git commit with hooks disabled, so no change there.)
2013-12-01 13:59:45 -04:00
Joey Hess
6edac746f0 merge improved fsck types from git-repair and some associated changes 2013-11-30 14:29:11 -04:00
Joey Hess
0980f3dae6 Fix bug that broke switching between local repositories in the webapp when they use the new guarded direct mode.
git treats eg ~/annex as a bare git repository located in ~/.annex/.git
if ~/annex/.git/config has core.bare=true.
2013-11-22 23:27:15 -04:00
Joey Hess
d490bbb891 make runRepairOf run preRepair
This may be a little late, since a fsck has already been done,
but it can't hurt.
2013-11-21 20:13:55 -04:00
Joey Hess
7d682dd844 merge from git-repair 2013-11-21 20:07:44 -04:00
Joey Hess
ff2b0a9df6 merge from git-repair 2013-11-21 00:43:30 -04:00
Joey Hess
8217e97d88 merge from git-repair 2013-11-20 19:34:30 -04:00
Joey Hess
e80d935b53 merge from git-repair 2013-11-20 19:16:42 -04:00
Joey Hess
8a466247ed merge from git-repair 2013-11-20 18:45:22 -04:00
Joey Hess
7dbb702edd merge from git-repair 2013-11-20 18:31:00 -04:00
Joey Hess
ef34316c45 fix repair failure that occurred when index was corrupted, and other objects too
In this case, the index problem prevented fsck from finding the other
problems.
2013-11-19 17:16:33 -04:00
Joey Hess
b1ed98636b merge with git-repair 2013-11-19 17:08:57 -04:00
Joey Hess
b245aa40df moving git-repair to its own package 2013-11-18 13:24:55 -04:00
Joey Hess
eab4470440 better handling of missing index file 2013-11-13 14:39:26 -04:00
Joey Hess
13108b7196 assistant: Notice on startup when the index file is corrupt, and auto-repair. 2013-11-13 14:27:17 -04:00
Joey Hess
5e7e0c7dc0 repair: Handle case where index file is corrupt, but all objects are ok. 2013-11-13 13:41:02 -04:00
Joey Hess
958312885f webapp: Improve UI around remotes that have no annex.uuid set, either because setup of them is incomplete, or because the remote git repository is not a git-annex repository.
Complicated by such repositories potentially being repos that should have
an annex.uuid, but it failed to be gotten, perhaps due to the past ssh repo
setup bugs. This is handled now by an Upgrade Repository button.
2013-11-07 18:02:00 -04:00
Joey Hess
59ecc804cd add new status command
This works for both direct and indirect mode.

It may need some performance tuning.

Note that unlike git status, it only shows the status of the work tree, not
the status of the index. So only one status letter, not two .. and since
files that have been added and not yet committed do not differ between the
work tree and the index, they are not shown. Might want to add display of
the index vs the last commit eventually.

This commit was sponsored by an unknown bitcoin contributor, whose
contribution has been going up lately! ;)
2013-11-07 14:07:25 -04:00
Joey Hess
3802f2f270 work around lack of receive.denyCurrentBranch in direct mode
Now that direct mode sets core.bare=true, git's normal prohibition about
pushing into the currently checked out branch doesn't work.

A simple fix for this would be an update hook which blocks the pushes..
but git hooks must be executable, and git-annex needs to be usable on eg,
FAT, which lacks x bits.

Instead, enabling direct mode switches the branch (eg master) to a special
purpose branch (eg annex/direct/master). This branch is not pushed when
syncing; instead any changes that git annex sync commits get written to
master, and it's pushed (along with synced/master) to the remote.

Note that initialization has been changed to always call setDirect,
even if it's just setDirect False for indirect mode. This is needed because
if the user has just cloned a direct mode repo, that nothing has synced
with before, it may have no master branch, and only an annex/direct/master,
resulting in that branch being checked out locally too. Calling setDirect False
for indirect mode moves back out of this branch, to a new master branch,
and ensures that a manual "git push" doesn't push changes directly to
the annex/direct/master of the remote. (It's possible that the user
makes a commit w/o using git-annex and pushes it, but nothing I can do
about that really.)

This commit was sponsored by Jonathan Harrington.
2013-11-05 21:08:31 -04:00
Joey Hess
cf34e59c8c factor out update 2013-11-05 18:20:52 -04:00
Joey Hess
4510819215 v5 for direct mode, with automatic upgrade
This includes storing the current state of the HEAD ref, which git annex
sync is going to need, but does not make sync use it.
2013-11-05 17:05:03 -04:00
Joey Hess
04768e44b2 automatically set and unset core.bare when switching to/from direct mode 2013-11-05 15:41:24 -04:00
Joey Hess
0edd9ec03a refactored hook setup 2013-11-05 15:29:56 -04:00
Joey Hess
c2862d9585 pass -c option on to all git commands run
The -c option now not only modifies the git configuration seen by
git-annex, but it is passed along to every git command git-annex runs.

This was easy to plumb through because gitCommandLine is already used to
construct every git command line, to add --git-dir and --work-tree
2013-11-05 13:38:37 -04:00
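The plumbing described in c2862d9585 can be pictured roughly as follows. This is a sketch under assumptions, not git-annex's actual code: the `Repo` record, its fields, and `gitArgs` are hypothetical simplifications; only the git options `-c name=value`, `--git-dir` and `--work-tree` are real.

```haskell
-- Hypothetical, simplified stand-in for a repository handle.
data Repo = Repo
    { gitDir :: FilePath
    , workTree :: Maybe FilePath
    , configOverrides :: [String] -- each of the form "name=value"
    }

-- Build the full argument list for one git invocation. Because every
-- command line goes through this one function, the -c overrides reach
-- every git process that gets started.
gitArgs :: Repo -> [String] -> [String]
gitArgs r params = overrides ++ location ++ params
  where
    overrides = concatMap (\o -> ["-c", o]) (configOverrides r)
    location = ("--git-dir=" ++ gitDir r)
        : maybe [] (\w -> ["--work-tree=" ++ w]) (workTree r)

main :: IO ()
main = print $ gitArgs
    (Repo ".git" (Just ".") ["core.bare=false"])
    ["status", "--short"]
```

Routing every invocation through a single builder is what makes a change like this cheap: the overrides are added in one place rather than at each call site.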
Joey Hess
58db042033 map: Work when there are gcrypt remotes. 2013-11-04 14:14:44 -04:00
Joey Hess
7ed8e87a34 assistant: Support repairing git remotes that are locally accessible
(eg, on removable drives)

gcrypt remotes are not yet handled.

This commit was sponsored by Sören Brunk.
2013-10-27 15:38:59 -04:00
Joey Hess
0036139b33 wire git repair into webapp 2013-10-23 14:43:58 -04:00
Joey Hess
1ab2ad86c7 minor 2013-10-23 13:19:37 -04:00
Joey Hess
435ea52f3c repair command: add handling of git-annex branch and index 2013-10-23 13:00:45 -04:00
Joey Hess
d5eb85acf4 add repair command 2013-10-23 12:21:59 -04:00
Joey Hess
d345e5b52f add git fsck to cronner, and UI for repository repair (not yet wired up) 2013-10-22 16:02:52 -04:00
Joey Hess
44bb9a808f clean warnings 2013-10-22 14:52:17 -04:00
Joey Hess
ff3f654cbe make git fsck batch-capable 2013-10-22 14:49:41 -04:00
Joey Hess
3e61749d08 index file recovery 2013-10-22 12:58:04 -04:00
Joey Hess
2fb08acda5 add reflog 2013-10-21 16:41:46 -04:00
Joey Hess
18487c779f corrupt branch resetting (but not yet reflog walking) 2013-10-21 16:20:54 -04:00
Joey Hess
fcd91be6f0 implemented removal of corrupt tracking branches
Oh, git, you made this so hard. Not determining if a branch pointed to some
corrupt object, that was easy, but dealing with corrupt branches using git
plumbing is a PITA.
2013-10-21 15:28:06 -04:00
Joey Hess
6d8250c255 avoid redundant fsck when no changes are made 2013-10-20 19:42:17 -04:00
Joey Hess
4f871f89ba git-recover-repository 1/2 done 2013-10-20 17:50:51 -04:00
Joey Hess
f482de1b76 remove workaround for bug in git 1.8.4r0 2013-10-20 15:23:06 -04:00
Joey Hess
edbf177628 fix lsTreeFiles to use --full-tree
This makes it show the full tree, not just the current directory,
and enables --full-name, which yields TopFilePaths.
2013-10-18 15:50:26 -04:00
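As a rough illustration of edbf177628, the sketch below builds an `ls-tree` argument list with `--full-tree` and parses the NUL-separated output. `TreeItem`, `lsTreeArgs`, `entries` and `parseEntry` are hypothetical names rather than git-annex's own, and the object id in `main` is a placeholder.

```haskell
import Data.Maybe (mapMaybe)

-- One line of ls-tree output: "<mode> <type> <object>\t<file>".
data TreeItem = TreeItem
    { itemMode :: String
    , itemType :: String
    , itemSha :: String
    , itemFile :: FilePath -- relative to the top of the repository
    } deriving (Eq, Show)

-- --full-tree lists the whole tree regardless of the current
-- directory, and implies --full-name.
lsTreeArgs :: String -> [String]
lsTreeArgs ref = ["ls-tree", "--full-tree", "-z", "-r", ref]

-- Split the -z output into its NUL-terminated entries.
entries :: String -> [String]
entries s = case break (== '\0') s of
    ([], []) -> []
    (e, rest) -> e : entries (drop 1 rest)

-- The metadata is space separated; a tab precedes the file name.
parseEntry :: String -> Maybe TreeItem
parseEntry l = case break (== '\t') l of
    (meta, '\t' : f) -> case words meta of
        [m, t, s] -> Just (TreeItem m t s f)
        _ -> Nothing
    _ -> Nothing

parseLsTree :: String -> [TreeItem]
parseLsTree = mapMaybe parseEntry . entries

main :: IO ()
main = do
    print (lsTreeArgs "HEAD")
    print (parseLsTree "100644 blob 0123abc\tdir/file\0")
```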
Joey Hess
c979e0ea62 fix 2013-10-17 19:51:16 -04:00
Joey Hess
c116383b5d fix 2013-10-17 19:49:44 -04:00
Joey Hess
81c4259a0d fix 2013-10-17 19:41:00 -04:00
Joey Hess
16243b9972 missing import 2013-10-17 19:39:22 -04:00
Joey Hess
e93206e294 Windows: Deal with strange msysgit 1.8.4 behavior of not understanding DOS formatted paths for --git-dir and --work-tree. 2013-10-17 19:35:57 -04:00
Joey Hess
aff125ddab try working around windows xargs problem 2013-10-17 15:56:56 -04:00
Joey Hess
d785432f78 use TopFilePath for DiffTree and LsTree 2013-10-17 14:51:19 -04:00
Joey Hess
82ff37520f fix off-by-one 2013-10-16 12:14:14 -04:00
Joey Hess
bac078742d Deal with git check-attr -z output format change in git 1.8.5.
I have not actually tested with 1.8.5, which is not yet released, but
git.git commit f7cd8c50b9ab83e084e8f52653ecc8d90665eef2 changes -z
to also apply to output, without regards to back-compat. (But with pretty
good reasons.)

New code should work with both versions, by fingerprinting for NULs and
newlines.
2013-10-15 16:05:27 -04:00
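The fingerprinting idea in bac078742d could look something like this. It is a simplified sketch, not the code from the commit: all function names are hypothetical, and it assumes the older output is `path: attr: value` lines while the newer `-z` output is `path NUL attr NUL value NUL`.

```haskell
-- One (path, attribute, value) result from git check-attr.
type AttrResult = (String, String, String)

-- Pick the parser by looking at the output itself: NULs mean the
-- newer -z output format, otherwise fall back to line parsing.
parseCheckAttr :: String -> [AttrResult]
parseCheckAttr out
    | '\0' `elem` out = triples (fields out)
    | otherwise = concatMap parseLine (lines out)

-- Split on NUL, dropping the empty field after the last terminator.
fields :: String -> [String]
fields s = case break (== '\0') s of
    (f, []) -> [f | not (null f)]
    (f, _ : rest) -> f : fields rest

triples :: [String] -> [AttrResult]
triples (p : a : v : rest) = (p, a, v) : triples rest
triples _ = []

-- Split at the last ": " separator, if there is one.
splitLast :: String -> Maybe (String, String)
splitLast s = go (length s - 2)
  where
    go i
        | i < 0 = Nothing
        | take 2 (drop i s) == ": " = Just (take i s, drop (i + 2) s)
        | otherwise = go (i - 1)

-- Work from the right, so a path containing ": " still parses.
parseLine :: String -> [AttrResult]
parseLine l = case splitLast l of
    Just (rest, v) -> case splitLast rest of
        Just (p, a) -> [(p, a, v)]
        Nothing -> []
    Nothing -> []

main :: IO ()
main = do
    print (parseCheckAttr "foo: annex.backend: SHA256\n")
    print (parseCheckAttr "foo\0annex.backend\0SHA256\0")
```

Both calls in `main` produce the same list, which is the point of detecting the format from the data rather than from the git version.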
Joey Hess
f1295b5141 fix windows build 2013-10-02 20:26:00 -04:00
Joey Hess
1536ebfe47 Disable receive.denyNonFastForwards when setting up a gcrypt special remote
gcrypt needs to be able to fast-forward the master branch. If a git
repository is set up with git init --shared --bare, it gets that set, and
pushing to it will then fail, even when it's up-to-date.
2013-10-01 15:23:48 -04:00
Joey Hess
57d49a6d04 remove *>=> and >=*> ; use <$$> instead
I forgot I had <$$> hidden away in Utility.Applicative.
It allows doing the same kind of currying as does >=*>
and I found using it made the code more readable for me.

(*>=> was not used)
2013-09-27 19:58:48 -04:00
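For readers unfamiliar with the operator mentioned in 57d49a6d04, here is a sketch of how `<$$>` can be defined and used. The definition is an assumption about what lives in Utility.Applicative, and `lineCount` and `safeIncr` are made-up examples.

```haskell
infixr 4 <$$>

-- Like <$>, but the action on the right still expects one argument.
-- (Assumed definition; the real one may differ in detail.)
(<$$>) :: Functor f => (a -> b) -> (c -> f a) -> c -> f b
f <$$> v = fmap f . v

-- Without <$$>:  lineCount file = length . lines <$> readFile file
lineCount :: FilePath -> IO Int
lineCount = length . lines <$$> readFile

-- The same shape with a pure functor.
safeIncr :: Int -> Maybe Int
safeIncr = (+ 1) <$$> Just

main :: IO ()
main = print (safeIncr 1)
```

The gain is purely notational: the argument that would otherwise be named and threaded through by hand disappears from both sides of the definition.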
Joey Hess
e864c8d033 blind enabling gcrypt repos on rsync.net
This pulls off quite a nice trick: When given a path on rsync.net, it
determines if it is an encrypted git repository that the user has
the key to decrypt, and merges with it. This works even when
the local repository had no idea that the gcrypt remote exists!

(As previously done with local drives.)

This commit sponsored by Pedro Côrte-Real
2013-09-27 16:21:56 -04:00
Joey Hess
1550759220 enabling rsync.net gcrypt repos
Still need to detect when the user is trying to create a repo
that already exists, and jump to the enabling code.
2013-09-26 23:47:30 -04:00
Joey Hess
735ed3b822 prep for enabling remote gcrypt repos in webapp 2013-09-26 17:26:13 -04:00
Joey Hess
3192b059b5 add back lost check that git-annex-shell supports gcrypt 2013-09-24 17:51:12 -04:00
Joey Hess
7390f08ef9 Use cryptohash rather than SHA for hashing.
This is a massive win on OSX, which doesn't have a sha256sum normally.

Only use external hash commands when the file is > 1 mb,
since cryptohash is quite close to them in speed.

SHA is still used to calculate HMACs. I don't quite understand
cryptohash's API for those.

Used the following benchmark to arrive at the 1 mb number.

1 mb file:

benchmarking sha256/internal
mean: 13.86696 ms, lb 13.83010 ms, ub 13.93453 ms, ci 0.950
std dev: 249.3235 us, lb 162.0448 us, ub 458.1744 us, ci 0.950
found 5 outliers among 100 samples (5.0%)
  4 (4.0%) high mild
  1 (1.0%) high severe
variance introduced by outliers: 10.415%
variance is moderately inflated by outliers

benchmarking sha256/external
mean: 14.20670 ms, lb 14.17237 ms, ub 14.27004 ms, ci 0.950
std dev: 230.5448 us, lb 150.7310 us, ub 427.6068 us, ci 0.950
found 3 outliers among 100 samples (3.0%)
  2 (2.0%) high mild
  1 (1.0%) high severe

2 mb file:

benchmarking sha256/internal
mean: 26.44270 ms, lb 26.23701 ms, ub 26.63414 ms, ci 0.950
std dev: 1.012303 ms, lb 925.8921 us, ub 1.122267 ms, ci 0.950
variance introduced by outliers: 35.540%
variance is moderately inflated by outliers

benchmarking sha256/external
mean: 26.84521 ms, lb 26.77644 ms, ub 26.91433 ms, ci 0.950
std dev: 347.7867 us, lb 210.6283 us, ub 571.3351 us, ci 0.950
found 6 outliers among 100 samples (6.0%)

import Crypto.Hash
import qualified Data.ByteString.Lazy as L
import Criterion.Main
import Common

testfile :: FilePath
testfile = "/run/shm/data" -- on ram disk

main = defaultMain
        [ bgroup "sha256"
                [ bench "internal" $ whnfIO internal
                , bench "external" $ whnfIO external
                ]
        ]

sha256 :: L.ByteString -> Digest SHA256
sha256 = hashlazy

internal :: IO String
internal = show . sha256 <$> L.readFile testfile

external :: IO String
external = do
        s <- readProcess "sha256sum" [testfile]
        return $ fst $ separate (== ' ') s
2013-09-22 20:06:02 -04:00
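The size cutoff that 7390f08ef9 arrives at amounts to a small decision function. The sketch below is illustrative only: `Hasher`, `threshold` and `chooseHasher` are hypothetical names, and reading "1 mb" as 1024 * 1024 bytes is an assumption.

```haskell
data Hasher = Internal | External
    deriving (Eq, Show)

-- Assumption: "1 mb" is taken to mean 1 MiB.
threshold :: Integer
threshold = 1024 * 1024

-- Use the external command only when it exists and the file is
-- larger than the threshold; otherwise hash in-process.
chooseHasher :: Bool -> Integer -> Hasher
chooseHasher externalAvailable size
    | externalAvailable && size > threshold = External
    | otherwise = Internal

main :: IO ()
main = do
    print (chooseHasher True (512 * 1024))
    print (chooseHasher True (2 * 1024 * 1024))
    print (chooseHasher False (2 * 1024 * 1024))
```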
Joey Hess
006cf7976f more completely solve catKey memory leak
Done using a mode witness, which ensures it's fixed everywhere.

Fixing catFileKey was a bear, because git cat-file does not provide a
nice way to query for the mode of a file and there is no other efficient
way to do it. Oh, for libgit2..

Note that I am looking at tree objects from HEAD, rather than the index,
because cat-file cannot show a tree object for the index.
So this fix is technically incomplete. The only cases where it matters
are:

1. A new large file has been directly staged in git, but not committed.
2. A file that was committed to HEAD as a symlink has been staged
   directly in the index.

This could be fixed a lot better using libgit2.
2013-09-19 16:41:21 -04:00
Joey Hess
f26c996dc6 interface to parse git tree objects 2013-09-19 15:58:35 -04:00
Joey Hess
eb42bde19a sync, pre-commit, indirect: Avoid unnecessarily catting non-symlink files from git, which can be so large it runs out of memory. 2013-09-19 14:48:42 -04:00