Commit graph

2596 commits

Author SHA1 Message Date
Joey Hess
0b3e2bed78 add a pid file
Writes pid to a file. Is supposed to take an exclusive lock, but that's not
working, and it's too late for me to understand why.
2012-06-11 01:20:19 -04:00
Joey Hess
d5884388b0 daemonize git annex watch 2012-06-11 00:39:09 -04:00
Joey Hess
ca9ee21bd7 crazy optimisation
Crazy like a fox..
2012-06-10 19:58:34 -04:00
Joey Hess
c1b432ee54 run git add --update after inotify is started
This way, there's no window where deleted files won't be noticed.
2012-06-10 19:10:18 -04:00
Joey Hess
aae0ba1995 fixed the double commits problem 2012-06-10 18:41:05 -04:00
Joey Hess
fc0dd79774 avoid running pre-commit hook from watch commits 2012-06-10 17:53:17 -04:00
Joey Hess
cda6c4dff5 tweak 2012-06-10 17:40:35 -04:00
Joey Hess
2de50f733a smart commit thread
The commit thread now has access to a channel containing the times of
all uncommitted changes. This lets it be smart about detecting busy times
when a batch job is running (such as rm -rf, or untarring something, etc),
and avoid committing until it's done. While at the same time, instantly
committing one-off changes that the user is going to expect to see
immediately.

I had to use STM to implement the channel, because of
http://hackage.haskell.org/trac/ghc/ticket/4154
While this adds a dependency, I always wanted to use STM, so this actually
makes me happy. ;)

Also happy that shouldCommit is a pure function, so other commit smartness
strategies can easily be played with. Although the current one seems pretty
good.

There is one bug, for some reason it does double commits, every time.
2012-06-10 16:07:48 -04:00
Joey Hess
6e54907e35 add a thread to commit changes
Currently the stupidest possible version, just wakes up every second,
and may make empty commits sometimes.
2012-06-10 13:56:39 -04:00
Joey Hess
e5f855b7f8 generalize and improve state MVar code 2012-06-10 13:23:10 -04:00
Joey Hess
5308b51ec0 stage deletions directly using update-index
no need to run git-rm separately
2012-06-10 13:05:58 -04:00
Joey Hess
7f823b56af fix non-linux build 2012-06-09 14:06:56 -04:00
Joey Hess
d45a9a7831 refactor and function name cleanup
(oops, I had a calcMerge and a calc_merge!)
2012-06-08 00:29:39 -04:00
Joey Hess
7d78cbf97c use git queue for rm too 2012-06-07 21:17:10 -04:00
Joey Hess
20f425be19 make watch use the queue
May not work. Certianly needs to flush the queue from time to time
when only symlink changes are being made.
2012-06-07 15:40:44 -04:00
Joey Hess
0a11b35d89 extend Git.Queue to be able to queue more than simple git commands
While I was in there, I noticed and fixed a bug in the queue size
calculations. It was never encountered only because Queue.add was
only ever run with 1 file in the list.
2012-06-07 15:19:44 -04:00
Joey Hess
727158ff55 Merge branch 'master' into watch 2012-06-07 13:48:55 -04:00
Joey Hess
4d1c114e4d initremote: Automatically describe a remote when creating it.
This ensures that all special remotes show up in git annex status.
Before, a special remote that was not manually described, and was not
a current git remote, did not show up there, although initremote did list
it.
2012-06-07 11:16:48 -04:00
Joey Hess
d5de27ff40 tweak 2012-06-06 23:30:38 -04:00
Joey Hess
b8ae9528ab refactor 2012-06-06 23:20:09 -04:00
Joey Hess
b8f85f7a82 build watch on non-linux, just don't do anything 2012-06-06 22:49:32 -04:00
Joey Hess
c5b11561f0 handle running out of watch descriptors 2012-06-06 16:50:28 -04:00
Joey Hess
db8effb8f3 ignore .gitignore and .gitattributes 2012-06-06 15:50:12 -04:00
Joey Hess
b819f644ad close the git add race
There's a race adding a new file to the annex: The file is moved to the
annex and replaced with a symlink, and then we git add the symlink. If
someone comes along in the meantime and replaces the symlink with
something else, such as a new large file, we add that instead. Which could
be bad..

This race is fixed by avoiding using git add, instead the symlink is
directly staged into the index.

It would be nice to make `git annex add` use this same technique.
I have not done so yet because it currently runs git update-index once per
file, which would slow does `git annex add`. A future enhancement would be
to extend the Git.Queue to include the ability to run update-index with
a list of Streamers.
2012-06-06 14:29:10 -04:00
Joey Hess
993e6459a3 factor out nukeFile 2012-06-06 13:13:13 -04:00
Joey Hess
723eb19bbf split out utility functions 2012-06-06 13:07:30 -04:00
Joey Hess
a7a729bce4 Merge branch 'master' into watch 2012-06-05 20:30:37 -04:00
Joey Hess
c981ccc077 add: Prevent (most) modifications from being made to a file while it is being added to the annex.
Anything that tries to open the file for write, or delete the file,
or replace it with something else, will not affect the add.

Only if a process has the file open for write before add starts
can it still change it while (or after) it's added to the annex.
(fsck will catch this later of course)
2012-06-05 20:28:34 -04:00
Joey Hess
5809f33f8b use createAnnexDirectory when setting up tmp dir 2012-06-05 20:25:32 -04:00
Joey Hess
d3cee987ca separate source of content from the filename associated with the key when generating a key
This already made migrate's code a lot simpler.
2012-06-05 19:51:03 -04:00
Joey Hess
cbdaccd44a run event handlers all in the same Annex monad
Uses a MVar again, as there seems no other way to thread the state through
inotify events.

This is a rather unsatisfactory result. I had wanted to run them in
the same monad so that the git queue could be used to coleasce git commands
and speed things up. But, that led to fragility: If several files are
added, and one is removed before queue flush, git add will fail to add
any of them. So, the queue is still explicitly flushed after each add for
now.

TODO: Investigate using git add --ignore-errors. This would need to be done
in Command.Add. And, git add still exits nonzero with it, so would need
to avoid crashing on queue flush.
2012-06-04 21:21:52 -04:00
Joey Hess
48efa2d2d3 avoid explicit queue flush
The queue is still flushed on add, because each add event is handled by a
separate Annex monad. That needs to be fixed to speed up add a lot.
2012-06-04 20:44:15 -04:00
Joey Hess
bd7857d903 ignore-unmatch when removing a staged file
When a file is added, and then deleted before the add action runs,
the delete event was unhappy that the file never did get staged.
2012-06-04 20:13:25 -04:00
Joey Hess
cbf16f1967 refactor 2012-06-04 19:43:29 -04:00
Joey Hess
ec98581112 notice deleted files on startup 2012-06-04 18:14:42 -04:00
Joey Hess
5b4e5ce7e5 deletion
When a new file is annexed, a deletion event occurs when it's moved away
to be replaced by a symlink. Most of the time, there is no problimatic
race, because the same thread runs the add event as the deletion event.
So, once the symlink is in place, the deletion code won't run at all,
due to existing checks that a deleted file is really gone.

But there is a race at startup, as then the inotify thread is running
at the same time as the main thread, which does the initial tree walking
and annexing. It would be possible for the deletion inotify to run
in a perfect race with the addition, and remove the newly added symlink
from the git cache.

To solve this race, added event serialization via a MVar. We putMVar
before running each event, which blocks if an event is already running.
And when an event finishes (or crashes!), we takeMVar to free the lock.

Also, make rm -rf not spew warnings by passing --ignore-unmatch when
deleting directories.
2012-06-04 18:09:18 -04:00
Joey Hess
659e6b1324 suppress "recording state in git" message during add 2012-06-04 17:18:54 -04:00
Joey Hess
677ad74687 add handling of symlink addition events
And just like that, annexed files can be moved and copies around within
the tree, and are automatically fixed to point to the content, and staged
in git. Huzzah!

Delete still remains TODO, with its troublesome race during add..
2012-06-04 15:10:43 -04:00
Joey Hess
7053f5f947 handle directory deletion
When a directory is deleted, or moved away, git rm -r it to stage
the deletion.
2012-06-04 13:30:30 -04:00
Joey Hess
23dbff4b43 add events for symlink creation and directory removal
Improved the inotify code, so it will also notice directory removal
and symlink creation.

In the watch code, optimised away a stat of a file that's being added,
that's done by Command.Add.start. This is the reason symlink creation is
handled separately from file creation, since during initial tree walk
at startup, a stat was already done, and can be reused.
2012-06-04 13:22:56 -04:00
Joey Hess
eab3872d91 Merge branch 'master' into watch 2012-06-04 12:07:59 -04:00
Joey Hess
3a10095d40 import: New subcommand, pulls files from a directory outside the annex and adds them
Use case for this was developed somewhere on the Transiberian Railroad.
2012-05-31 19:47:18 -04:00
Joey Hess
65977a5584 lock: Reset unlocked file to index, rather than to branch head.
Resetting an unlocked file to the branch head failed if it had just been
added, not committed, and unlocked, since the branch didbn't have it.

The code was concerned about dropping any changes that might be staged in the
index, but I cannot see why.
2012-05-30 17:01:22 -04:00
Joey Hess
6e213d04f1 sync: Show a nicer message if a user tries to sync to a special remote. 2012-05-27 20:55:56 -04:00
Joey Hess
bb4f31a0ee Clean up handling of git directory and git worktree.
Baked into the code was an assumption that a repository's git directory
could be determined by adding ".git" to its work tree (or nothing for bare
repos). That fails when core.worktree, or GIT_DIR and GIT_WORK_TREE are
used to separate the two.

This was attacked at the type level, by storing the gitdir and worktree
separately, so Nothing for the worktree means a bare repo.

A complication arose because we don't learn where a repository is bare
until its configuration is read. So another Location type handles
repositories that have not had their config read yet. I am not entirely
happy with this being a Location type, rather than representing them
entirely separate from the Git type. The new code is not worse than the
old, but better types could enforce more safety.

Added support for core.worktree. Overriding it with -c isn't supported
because it's not really clear what to do if a git repo's config is read, is
not bare, and is then overridden to bare. What is the right git directory
in this case? I will worry about this if/when someone has a use case for
overriding core.worktree with -c. (See Git.Config.updateLocation)

Also removed and renamed some functions like gitDir and workTree that
misused git's terminology.

One minor regression is known: git annex add in a bare repository does not
print a nice error message, but runs git ls-files in a way that fails
earlier with a less nice error message. This is because before --work-tree
was always passed to git commands, even in a bare repo, while now it's not.
2012-05-18 17:03:12 -04:00
Joey Hess
f7d8982672 Fix use of several config settings
annex.ssh-options, annex.rsync-options, annex.bup-split-options.

And adjust types to avoid the bugs that broke several config settings
recently. Now "annex." prefixing is enforced at the type level.
2012-05-05 20:16:56 -04:00
Joey Hess
392931eca9 addunused: New command, the opposite of dropunused, it relinks unused content into the git repository. 2012-05-02 14:59:05 -04:00
Joey Hess
8f45300479 dropunused: Allow specifying ranges to drop.
Sort of by popular demand, but the last straw for not using seq
was that it can run into command line length limits.
2012-05-02 13:15:19 -04:00
Joey Hess
0c9c14b52f percentage library 2012-04-29 17:48:07 -04:00
Joey Hess
d2bfba6324 show percent the bloom filter is full 2012-04-29 16:10:47 -04:00