watch subcommand
So far this only handles auto-annexing new files that are created inside
the repository while it's running. To make this really useful,
it needs to at least:
- notice deleted files and stage the deletion
(tricky; there's a race with add..)
- notice renamed files, auto-fix the symlink, and stage the new file location
- periodically auto-commit staged changes
- honor .gitignore, not adding files it excludes
Also nice to have would be:
- Somehow sync remotes, possibly using a push sync like dvcs-autosync
does, so they are immediately updated.
- Somehow get content that is unavilable. This is problimatic with inotify,
since we only get an event once the user has tried (and failed) to read
from the file. Perhaps instead, automatically copy content that is added
out to remotes, with the goal of all repos eventually getting a copy,
if df allows.
- Drop files that have not been used lately, or meet some other criteria
(as long as there's a copy elsewhere).
- Perhaps automatically dropunused files that have been deleted,
although I cannot see a way to do that, since by the time the inotify
deletion event arrives, the file is deleted, and we cannot see what
its symlink pointed to! Alternatievely, perhaps automatically
do an expensive unused/dropunused cleanup process.
Some of this probably needs the currently stateless threads to maintain
a common state.
2012-04-12 21:29:43 +00:00
|
|
|
{- git-annex command
|
|
|
|
-
|
|
|
|
- Copyright 2012 Joey Hess <joey@kitenet.net>
|
|
|
|
-
|
|
|
|
- Licensed under the GNU GPL version 3 or higher.
|
|
|
|
-}
|
|
|
|
|
2012-06-07 02:49:32 +00:00
|
|
|
{-# LANGUAGE CPP #-}
|
2012-06-05 01:21:52 +00:00
|
|
|
{-# LANGUAGE BangPatterns #-}
|
|
|
|
|
watch subcommand
So far this only handles auto-annexing new files that are created inside
the repository while it's running. To make this really useful,
it needs to at least:
- notice deleted files and stage the deletion
(tricky; there's a race with add..)
- notice renamed files, auto-fix the symlink, and stage the new file location
- periodically auto-commit staged changes
- honor .gitignore, not adding files it excludes
Also nice to have would be:
- Somehow sync remotes, possibly using a push sync like dvcs-autosync
does, so they are immediately updated.
- Somehow get content that is unavilable. This is problimatic with inotify,
since we only get an event once the user has tried (and failed) to read
from the file. Perhaps instead, automatically copy content that is added
out to remotes, with the goal of all repos eventually getting a copy,
if df allows.
- Drop files that have not been used lately, or meet some other criteria
(as long as there's a copy elsewhere).
- Perhaps automatically dropunused files that have been deleted,
although I cannot see a way to do that, since by the time the inotify
deletion event arrives, the file is deleted, and we cannot see what
its symlink pointed to! Alternatievely, perhaps automatically
do an expensive unused/dropunused cleanup process.
Some of this probably needs the currently stateless threads to maintain
a common state.
2012-04-12 21:29:43 +00:00
|
|
|
module Command.Watch where
|
|
|
|
|
|
|
|
import Common.Annex
|
|
|
|
import Command
|
2012-06-04 23:43:29 +00:00
|
|
|
import Utility.ThreadLock
|
2012-06-04 19:10:43 +00:00
|
|
|
import qualified Annex
|
2012-06-07 19:40:44 +00:00
|
|
|
import qualified Annex.Queue
|
2012-06-06 18:29:10 +00:00
|
|
|
import qualified Command.Add
|
2012-06-04 17:30:30 +00:00
|
|
|
import qualified Git.Command
|
2012-06-06 18:29:10 +00:00
|
|
|
import qualified Git.UpdateIndex
|
2012-06-04 19:10:43 +00:00
|
|
|
import qualified Backend
|
|
|
|
import Annex.Content
|
watch subcommand
So far this only handles auto-annexing new files that are created inside
the repository while it's running. To make this really useful,
it needs to at least:
- notice deleted files and stage the deletion
(tricky; there's a race with add..)
- notice renamed files, auto-fix the symlink, and stage the new file location
- periodically auto-commit staged changes
- honor .gitignore, not adding files it excludes
Also nice to have would be:
- Somehow sync remotes, possibly using a push sync like dvcs-autosync
does, so they are immediately updated.
- Somehow get content that is unavilable. This is problimatic with inotify,
since we only get an event once the user has tried (and failed) to read
from the file. Perhaps instead, automatically copy content that is added
out to remotes, with the goal of all repos eventually getting a copy,
if df allows.
- Drop files that have not been used lately, or meet some other criteria
(as long as there's a copy elsewhere).
- Perhaps automatically dropunused files that have been deleted,
although I cannot see a way to do that, since by the time the inotify
deletion event arrives, the file is deleted, and we cannot see what
its symlink pointed to! Alternatievely, perhaps automatically
do an expensive unused/dropunused cleanup process.
Some of this probably needs the currently stateless threads to maintain
a common state.
2012-04-12 21:29:43 +00:00
|
|
|
|
2012-06-10 17:56:39 +00:00
|
|
|
import Control.Concurrent
|
smart commit thread
The commit thread now has access to a channel containing the times of
all uncommitted changes. This lets it be smart about detecting busy times
when a batch job is running (such as rm -rf, or untarring something, etc),
and avoid committing until it's done. While at the same time, instantly
committing one-off changes that the user is going to expect to see
immediately.
I had to use STM to implement the channel, because of
http://hackage.haskell.org/trac/ghc/ticket/4154
While this adds a dependency, I always wanted to use STM, so this actually
makes me happy. ;)
Also happy that shouldCommit is a pure function, so other commit smartness
strategies can easily be played with. Although the current one seems pretty
good.
There is one bug, for some reason it does double commits, every time.
2012-06-10 20:07:48 +00:00
|
|
|
import Control.Concurrent.STM
|
|
|
|
import Data.Time.Clock
|
watch subcommand
So far this only handles auto-annexing new files that are created inside
the repository while it's running. To make this really useful,
it needs to at least:
- notice deleted files and stage the deletion
(tricky; there's a race with add..)
- notice renamed files, auto-fix the symlink, and stage the new file location
- periodically auto-commit staged changes
- honor .gitignore, not adding files it excludes
Also nice to have would be:
- Somehow sync remotes, possibly using a push sync like dvcs-autosync
does, so they are immediately updated.
- Somehow get content that is unavilable. This is problimatic with inotify,
since we only get an event once the user has tried (and failed) to read
from the file. Perhaps instead, automatically copy content that is added
out to remotes, with the goal of all repos eventually getting a copy,
if df allows.
- Drop files that have not been used lately, or meet some other criteria
(as long as there's a copy elsewhere).
- Perhaps automatically dropunused files that have been deleted,
although I cannot see a way to do that, since by the time the inotify
deletion event arrives, the file is deleted, and we cannot see what
its symlink pointed to! Alternatievely, perhaps automatically
do an expensive unused/dropunused cleanup process.
Some of this probably needs the currently stateless threads to maintain
a common state.
2012-04-12 21:29:43 +00:00
|
|
|
|
2012-06-07 02:49:32 +00:00
|
|
|
#if defined linux_HOST_OS
|
smart commit thread
The commit thread now has access to a channel containing the times of
all uncommitted changes. This lets it be smart about detecting busy times
when a batch job is running (such as rm -rf, or untarring something, etc),
and avoid committing until it's done. While at the same time, instantly
committing one-off changes that the user is going to expect to see
immediately.
I had to use STM to implement the channel, because of
http://hackage.haskell.org/trac/ghc/ticket/4154
While this adds a dependency, I always wanted to use STM, so this actually
makes me happy. ;)
Also happy that shouldCommit is a pure function, so other commit smartness
strategies can easily be played with. Although the current one seems pretty
good.
There is one bug, for some reason it does double commits, every time.
2012-06-10 20:07:48 +00:00
|
|
|
import Utility.Inotify
|
2012-06-07 02:49:32 +00:00
|
|
|
import System.INotify
|
|
|
|
#endif
|
|
|
|
|
2012-06-10 22:29:05 +00:00
|
|
|
type ChangeChan = TChan Change
|
|
|
|
|
|
|
|
type Handler = FilePath -> Annex (Maybe Change)
|
|
|
|
|
|
|
|
data Change = Change
|
|
|
|
{ changeTime :: UTCTime
|
|
|
|
, changeFile :: FilePath
|
|
|
|
, changeDesc :: String
|
|
|
|
}
|
|
|
|
deriving (Show)
|
smart commit thread
The commit thread now has access to a channel containing the times of
all uncommitted changes. This lets it be smart about detecting busy times
when a batch job is running (such as rm -rf, or untarring something, etc),
and avoid committing until it's done. While at the same time, instantly
committing one-off changes that the user is going to expect to see
immediately.
I had to use STM to implement the channel, because of
http://hackage.haskell.org/trac/ghc/ticket/4154
While this adds a dependency, I always wanted to use STM, so this actually
makes me happy. ;)
Also happy that shouldCommit is a pure function, so other commit smartness
strategies can easily be played with. Although the current one seems pretty
good.
There is one bug, for some reason it does double commits, every time.
2012-06-10 20:07:48 +00:00
|
|
|
|
watch subcommand
So far this only handles auto-annexing new files that are created inside
the repository while it's running. To make this really useful,
it needs to at least:
- notice deleted files and stage the deletion
(tricky; there's a race with add..)
- notice renamed files, auto-fix the symlink, and stage the new file location
- periodically auto-commit staged changes
- honor .gitignore, not adding files it excludes
Also nice to have would be:
- Somehow sync remotes, possibly using a push sync like dvcs-autosync
does, so they are immediately updated.
- Somehow get content that is unavilable. This is problimatic with inotify,
since we only get an event once the user has tried (and failed) to read
from the file. Perhaps instead, automatically copy content that is added
out to remotes, with the goal of all repos eventually getting a copy,
if df allows.
- Drop files that have not been used lately, or meet some other criteria
(as long as there's a copy elsewhere).
- Perhaps automatically dropunused files that have been deleted,
although I cannot see a way to do that, since by the time the inotify
deletion event arrives, the file is deleted, and we cannot see what
its symlink pointed to! Alternatievely, perhaps automatically
do an expensive unused/dropunused cleanup process.
Some of this probably needs the currently stateless threads to maintain
a common state.
2012-04-12 21:29:43 +00:00
|
|
|
def :: [Command]
|
|
|
|
def = [command "watch" paramPaths seek "watch for changes"]
|
|
|
|
|
|
|
|
seek :: [CommandSeek]
|
|
|
|
seek = [withNothing start]
|
|
|
|
|
|
|
|
start :: CommandStart
|
|
|
|
start = notBareRepo $ do
|
|
|
|
showStart "watch" "."
|
2012-06-10 21:40:35 +00:00
|
|
|
watch
|
|
|
|
stop
|
|
|
|
|
|
|
|
watch :: Annex ()
|
|
|
|
#if defined linux_HOST_OS
|
|
|
|
watch = do
|
watch subcommand
So far this only handles auto-annexing new files that are created inside
the repository while it's running. To make this really useful,
it needs to at least:
- notice deleted files and stage the deletion
(tricky; there's a race with add..)
- notice renamed files, auto-fix the symlink, and stage the new file location
- periodically auto-commit staged changes
- honor .gitignore, not adding files it excludes
Also nice to have would be:
- Somehow sync remotes, possibly using a push sync like dvcs-autosync
does, so they are immediately updated.
- Somehow get content that is unavilable. This is problimatic with inotify,
since we only get an event once the user has tried (and failed) to read
from the file. Perhaps instead, automatically copy content that is added
out to remotes, with the goal of all repos eventually getting a copy,
if df allows.
- Drop files that have not been used lately, or meet some other criteria
(as long as there's a copy elsewhere).
- Perhaps automatically dropunused files that have been deleted,
although I cannot see a way to do that, since by the time the inotify
deletion event arrives, the file is deleted, and we cannot see what
its symlink pointed to! Alternatievely, perhaps automatically
do an expensive unused/dropunused cleanup process.
Some of this probably needs the currently stateless threads to maintain
a common state.
2012-04-12 21:29:43 +00:00
|
|
|
showAction "scanning"
|
2012-06-10 21:40:35 +00:00
|
|
|
withStateMVar $ \st -> liftIO $ withINotify $ \i -> do
|
smart commit thread
The commit thread now has access to a channel containing the times of
all uncommitted changes. This lets it be smart about detecting busy times
when a batch job is running (such as rm -rf, or untarring something, etc),
and avoid committing until it's done. While at the same time, instantly
committing one-off changes that the user is going to expect to see
immediately.
I had to use STM to implement the channel, because of
http://hackage.haskell.org/trac/ghc/ticket/4154
While this adds a dependency, I always wanted to use STM, so this actually
makes me happy. ;)
Also happy that shouldCommit is a pure function, so other commit smartness
strategies can easily be played with. Although the current one seems pretty
good.
There is one bug, for some reason it does double commits, every time.
2012-06-10 20:07:48 +00:00
|
|
|
changechan <- atomically newTChan
|
2012-06-10 22:29:05 +00:00
|
|
|
let hook a = Just $ runHandler st changechan a
|
2012-06-07 03:20:09 +00:00
|
|
|
let hooks = WatchHooks
|
|
|
|
{ addHook = hook onAdd
|
|
|
|
, delHook = hook onDel
|
|
|
|
, addSymlinkHook = hook onAddSymlink
|
|
|
|
, delDirHook = hook onDelDir
|
|
|
|
, errHook = hook onErr
|
|
|
|
}
|
2012-06-10 23:08:03 +00:00
|
|
|
-- The commit thread is started early, so that the user
|
|
|
|
-- can immediately begin adding files and having them
|
|
|
|
-- committed, even while the inotify scan is taking place.
|
|
|
|
_ <- forkIO $ commitThread st changechan
|
|
|
|
-- This does not return until the inotify scan is done.
|
|
|
|
-- That can take some time for large trees.
|
2012-06-07 03:20:09 +00:00
|
|
|
watchDir i "." (ignored . takeFileName) hooks
|
2012-06-10 23:08:03 +00:00
|
|
|
-- Notice any files that were deleted before inotify
|
|
|
|
-- was started.
|
|
|
|
runStateMVar st $
|
|
|
|
inRepo $ Git.Command.run "add" [Param "--update"]
|
watch subcommand
So far this only handles auto-annexing new files that are created inside
the repository while it's running. To make this really useful,
it needs to at least:
- notice deleted files and stage the deletion
(tricky; there's a race with add..)
- notice renamed files, auto-fix the symlink, and stage the new file location
- periodically auto-commit staged changes
- honor .gitignore, not adding files it excludes
Also nice to have would be:
- Somehow sync remotes, possibly using a push sync like dvcs-autosync
does, so they are immediately updated.
- Somehow get content that is unavilable. This is problimatic with inotify,
since we only get an event once the user has tried (and failed) to read
from the file. Perhaps instead, automatically copy content that is added
out to remotes, with the goal of all repos eventually getting a copy,
if df allows.
- Drop files that have not been used lately, or meet some other criteria
(as long as there's a copy elsewhere).
- Perhaps automatically dropunused files that have been deleted,
although I cannot see a way to do that, since by the time the inotify
deletion event arrives, the file is deleted, and we cannot see what
its symlink pointed to! Alternatievely, perhaps automatically
do an expensive unused/dropunused cleanup process.
Some of this probably needs the currently stateless threads to maintain
a common state.
2012-04-12 21:29:43 +00:00
|
|
|
putStrLn "(started)"
|
|
|
|
waitForTermination
|
2012-06-07 02:49:32 +00:00
|
|
|
#else
|
2012-06-10 21:40:35 +00:00
|
|
|
watch = error "watch mode is so far only available on Linux"
|
2012-06-07 02:49:32 +00:00
|
|
|
#endif
|
|
|
|
|
2012-06-07 03:30:38 +00:00
|
|
|
ignored :: FilePath -> Bool
|
|
|
|
ignored ".git" = True
|
|
|
|
ignored ".gitignore" = True
|
|
|
|
ignored ".gitattributes" = True
|
|
|
|
ignored _ = False
|
|
|
|
|
2012-06-10 17:23:10 +00:00
|
|
|
{- Stores the Annex state in a MVar, so that threaded actions can access
|
|
|
|
- it.
|
deletion
When a new file is annexed, a deletion event occurs when it's moved away
to be replaced by a symlink. Most of the time, there is no problimatic
race, because the same thread runs the add event as the deletion event.
So, once the symlink is in place, the deletion code won't run at all,
due to existing checks that a deleted file is really gone.
But there is a race at startup, as then the inotify thread is running
at the same time as the main thread, which does the initial tree walking
and annexing. It would be possible for the deletion inotify to run
in a perfect race with the addition, and remove the newly added symlink
from the git cache.
To solve this race, added event serialization via a MVar. We putMVar
before running each event, which blocks if an event is already running.
And when an event finishes (or crashes!), we takeMVar to free the lock.
Also, make rm -rf not spew warnings by passing --ignore-unmatch when
deleting directories.
2012-06-04 21:32:46 +00:00
|
|
|
-
|
2012-06-10 17:23:10 +00:00
|
|
|
- Once the action is finished, retrieves the state from the MVar.
|
deletion
When a new file is annexed, a deletion event occurs when it's moved away
to be replaced by a symlink. Most of the time, there is no problimatic
race, because the same thread runs the add event as the deletion event.
So, once the symlink is in place, the deletion code won't run at all,
due to existing checks that a deleted file is really gone.
But there is a race at startup, as then the inotify thread is running
at the same time as the main thread, which does the initial tree walking
and annexing. It would be possible for the deletion inotify to run
in a perfect race with the addition, and remove the newly added symlink
from the git cache.
To solve this race, added event serialization via a MVar. We putMVar
before running each event, which blocks if an event is already running.
And when an event finishes (or crashes!), we takeMVar to free the lock.
Also, make rm -rf not spew warnings by passing --ignore-unmatch when
deleting directories.
2012-06-04 21:32:46 +00:00
|
|
|
-}
|
2012-06-10 17:23:10 +00:00
|
|
|
withStateMVar :: (MVar Annex.AnnexState -> Annex a) -> Annex a
|
|
|
|
withStateMVar a = do
|
|
|
|
state <- Annex.getState id
|
|
|
|
mvar <- liftIO $ newMVar state
|
|
|
|
r <- a mvar
|
|
|
|
newstate <- liftIO $ takeMVar mvar
|
|
|
|
Annex.changeState (const newstate)
|
|
|
|
return r
|
|
|
|
|
|
|
|
{- Runs an Annex action, using the state from the MVar. -}
|
|
|
|
runStateMVar :: MVar Annex.AnnexState -> Annex () -> IO ()
|
|
|
|
runStateMVar mvar a = do
|
2012-06-05 01:21:52 +00:00
|
|
|
startstate <- takeMVar mvar
|
2012-06-10 17:23:10 +00:00
|
|
|
!newstate <- Annex.exec startstate a
|
|
|
|
putMVar mvar newstate
|
|
|
|
|
2012-06-10 22:29:05 +00:00
|
|
|
{- Runs an action handler, inside the Annex monad.
|
2012-06-10 17:23:10 +00:00
|
|
|
-
|
|
|
|
- Exceptions are ignored, otherwise a whole watcher thread could be crashed.
|
|
|
|
-}
|
2012-06-10 22:29:05 +00:00
|
|
|
runHandler :: MVar Annex.AnnexState -> ChangeChan -> Handler -> FilePath -> IO ()
|
|
|
|
runHandler st changechan hook file = handle =<< tryIO (runStateMVar st go)
|
watch subcommand
So far this only handles auto-annexing new files that are created inside
the repository while it's running. To make this really useful,
it needs to at least:
- notice deleted files and stage the deletion
(tricky; there's a race with add..)
- notice renamed files, auto-fix the symlink, and stage the new file location
- periodically auto-commit staged changes
- honor .gitignore, not adding files it excludes
Also nice to have would be:
- Somehow sync remotes, possibly using a push sync like dvcs-autosync
does, so they are immediately updated.
- Somehow get content that is unavilable. This is problimatic with inotify,
since we only get an event once the user has tried (and failed) to read
from the file. Perhaps instead, automatically copy content that is added
out to remotes, with the goal of all repos eventually getting a copy,
if df allows.
- Drop files that have not been used lately, or meet some other criteria
(as long as there's a copy elsewhere).
- Perhaps automatically dropunused files that have been deleted,
although I cannot see a way to do that, since by the time the inotify
deletion event arrives, the file is deleted, and we cannot see what
its symlink pointed to! Alternatievely, perhaps automatically
do an expensive unused/dropunused cleanup process.
Some of this probably needs the currently stateless threads to maintain
a common state.
2012-04-12 21:29:43 +00:00
|
|
|
where
|
2012-06-10 22:29:05 +00:00
|
|
|
go = maybe noop (signalChange changechan) =<< hook file
|
2012-06-10 17:23:10 +00:00
|
|
|
handle (Right ()) = return ()
|
|
|
|
handle (Left e) = putStrLn $ show e
|
watch subcommand
So far this only handles auto-annexing new files that are created inside
the repository while it's running. To make this really useful,
it needs to at least:
- notice deleted files and stage the deletion
(tricky; there's a race with add..)
- notice renamed files, auto-fix the symlink, and stage the new file location
- periodically auto-commit staged changes
- honor .gitignore, not adding files it excludes
Also nice to have would be:
- Somehow sync remotes, possibly using a push sync like dvcs-autosync
does, so they are immediately updated.
- Somehow get content that is unavilable. This is problimatic with inotify,
since we only get an event once the user has tried (and failed) to read
from the file. Perhaps instead, automatically copy content that is added
out to remotes, with the goal of all repos eventually getting a copy,
if df allows.
- Drop files that have not been used lately, or meet some other criteria
(as long as there's a copy elsewhere).
- Perhaps automatically dropunused files that have been deleted,
although I cannot see a way to do that, since by the time the inotify
deletion event arrives, the file is deleted, and we cannot see what
its symlink pointed to! Alternatievely, perhaps automatically
do an expensive unused/dropunused cleanup process.
Some of this probably needs the currently stateless threads to maintain
a common state.
2012-04-12 21:29:43 +00:00
|
|
|
|
2012-06-10 22:29:05 +00:00
|
|
|
{- Handlers call this when they made a change that needs to get committed. -}
|
|
|
|
madeChange :: FilePath -> String -> Annex (Maybe Change)
|
|
|
|
madeChange file desc = liftIO $
|
|
|
|
Just <$> (Change <$> getCurrentTime <*> pure file <*> pure desc)
|
|
|
|
|
2012-06-06 18:29:10 +00:00
|
|
|
{- Adding a file is tricky; the file has to be replaced with a symlink
|
|
|
|
- but this is race prone, as the symlink could be changed immediately
|
|
|
|
- after creation. To avoid that race, git add is not used to stage the
|
2012-06-10 22:29:05 +00:00
|
|
|
- symlink.
|
|
|
|
-
|
|
|
|
- Inotify will notice the new symlink, so this Handler does not stage it
|
|
|
|
- or return a Change, leaving that to onAddSymlink.
|
|
|
|
-}
|
|
|
|
onAdd :: Handler
|
2012-06-05 01:21:52 +00:00
|
|
|
onAdd file = do
|
2012-06-06 18:29:10 +00:00
|
|
|
showStart "add" file
|
2012-06-10 22:29:05 +00:00
|
|
|
handle =<< Command.Add.ingest file
|
|
|
|
return Nothing
|
2012-06-06 18:29:10 +00:00
|
|
|
where
|
2012-06-10 22:29:05 +00:00
|
|
|
handle Nothing = showEndFail
|
|
|
|
handle (Just key) = do
|
|
|
|
Command.Add.link file key True
|
2012-06-06 18:29:10 +00:00
|
|
|
showEndOk
|
2012-06-04 17:22:56 +00:00
|
|
|
|
2012-06-04 19:10:43 +00:00
|
|
|
{- A symlink might be an arbitrary symlink, which is just added.
|
|
|
|
- Or, if it is a git-annex symlink, ensure it points to the content
|
|
|
|
- before adding it.
|
|
|
|
-}
|
2012-06-10 22:29:05 +00:00
|
|
|
onAddSymlink :: Handler
|
2012-06-04 19:10:43 +00:00
|
|
|
onAddSymlink file = go =<< Backend.lookupFile file
|
|
|
|
where
|
2012-06-10 22:29:05 +00:00
|
|
|
go Nothing = do
|
|
|
|
addlink =<< liftIO (readSymbolicLink file)
|
|
|
|
madeChange file "add"
|
2012-06-04 19:10:43 +00:00
|
|
|
go (Just (key, _)) = do
|
|
|
|
link <- calcGitLink file key
|
|
|
|
ifM ((==) link <$> liftIO (readSymbolicLink file))
|
2012-06-10 22:29:05 +00:00
|
|
|
( do
|
|
|
|
addlink link
|
|
|
|
madeChange file "add"
|
2012-06-04 19:10:43 +00:00
|
|
|
, do
|
|
|
|
liftIO $ removeFile file
|
|
|
|
liftIO $ createSymbolicLink link file
|
2012-06-06 18:29:10 +00:00
|
|
|
addlink link
|
2012-06-10 22:29:05 +00:00
|
|
|
madeChange file "fix"
|
2012-06-04 19:10:43 +00:00
|
|
|
)
|
2012-06-07 19:40:44 +00:00
|
|
|
addlink link = stageSymlink file link
|
2012-06-04 17:22:56 +00:00
|
|
|
|
2012-06-10 22:29:05 +00:00
|
|
|
onDel :: Handler
|
|
|
|
onDel file = do
|
|
|
|
Annex.Queue.addUpdateIndex =<<
|
|
|
|
inRepo (Git.UpdateIndex.unstageFile file)
|
|
|
|
madeChange file "rm"
|
2012-06-04 17:22:56 +00:00
|
|
|
|
deletion
When a new file is annexed, a deletion event occurs when it's moved away
to be replaced by a symlink. Most of the time, there is no problimatic
race, because the same thread runs the add event as the deletion event.
So, once the symlink is in place, the deletion code won't run at all,
due to existing checks that a deleted file is really gone.
But there is a race at startup, as then the inotify thread is running
at the same time as the main thread, which does the initial tree walking
and annexing. It would be possible for the deletion inotify to run
in a perfect race with the addition, and remove the newly added symlink
from the git cache.
To solve this race, added event serialization via a MVar. We putMVar
before running each event, which blocks if an event is already running.
And when an event finishes (or crashes!), we takeMVar to free the lock.
Also, make rm -rf not spew warnings by passing --ignore-unmatch when
deleting directories.
2012-06-04 21:32:46 +00:00
|
|
|
{- A directory has been deleted, or moved, so tell git to remove anything
|
2012-06-10 17:05:58 +00:00
|
|
|
- that was inside it from its cache. Since it could reappear at any time,
|
|
|
|
- use --cached to only delete it from the index.
|
|
|
|
-
|
|
|
|
- Note: This could use unstageFile, but would need to run another git
|
|
|
|
- command to get the recursive list of files in the directory, so rm is
|
|
|
|
- just as good. -}
|
2012-06-10 22:29:05 +00:00
|
|
|
onDelDir :: Handler
|
|
|
|
onDelDir dir = do
|
|
|
|
Annex.Queue.addCommand "rm"
|
|
|
|
[Params "--quiet -r --cached --ignore-unmatch --"] [dir]
|
|
|
|
madeChange dir "rmdir"
|
2012-06-06 18:29:10 +00:00
|
|
|
|
2012-06-07 03:20:09 +00:00
|
|
|
{- Called when there's an error with inotify. -}
|
2012-06-10 22:29:05 +00:00
|
|
|
onErr :: Handler
|
|
|
|
onErr msg = do
|
|
|
|
warning msg
|
|
|
|
return Nothing
|
2012-06-06 20:50:28 +00:00
|
|
|
|
2012-06-06 18:29:10 +00:00
|
|
|
{- Adds a symlink to the index, without ever accessing the actual symlink
|
|
|
|
- on disk. -}
|
2012-06-07 19:40:44 +00:00
|
|
|
stageSymlink :: FilePath -> String -> Annex ()
|
2012-06-08 04:29:39 +00:00
|
|
|
stageSymlink file linktext =
|
|
|
|
Annex.Queue.addUpdateIndex =<<
|
|
|
|
inRepo (Git.UpdateIndex.stageSymlink file linktext)
|
2012-06-10 17:56:39 +00:00
|
|
|
|
smart commit thread
The commit thread now has access to a channel containing the times of
all uncommitted changes. This lets it be smart about detecting busy times
when a batch job is running (such as rm -rf, or untarring something, etc),
and avoid committing until it's done. While at the same time, instantly
committing one-off changes that the user is going to expect to see
immediately.
I had to use STM to implement the channel, because of
http://hackage.haskell.org/trac/ghc/ticket/4154
While this adds a dependency, I always wanted to use STM, so this actually
makes me happy. ;)
Also happy that shouldCommit is a pure function, so other commit smartness
strategies can easily be played with. Although the current one seems pretty
good.
There is one bug, for some reason it does double commits, every time.
2012-06-10 20:07:48 +00:00
|
|
|
{- Signals that a change has been made, that needs to get committed. -}
|
2012-06-10 22:29:05 +00:00
|
|
|
signalChange :: ChangeChan -> Change -> Annex ()
|
|
|
|
signalChange chan change = do
|
|
|
|
liftIO $ atomically $ writeTChan chan change
|
|
|
|
|
smart commit thread
The commit thread now has access to a channel containing the times of
all uncommitted changes. This lets it be smart about detecting busy times
when a batch job is running (such as rm -rf, or untarring something, etc),
and avoid committing until it's done. While at the same time, instantly
committing one-off changes that the user is going to expect to see
immediately.
I had to use STM to implement the channel, because of
http://hackage.haskell.org/trac/ghc/ticket/4154
While this adds a dependency, I always wanted to use STM, so this actually
makes me happy. ;)
Also happy that shouldCommit is a pure function, so other commit smartness
strategies can easily be played with. Although the current one seems pretty
good.
There is one bug, for some reason it does double commits, every time.
2012-06-10 20:07:48 +00:00
|
|
|
-- Just in case the commit thread is not flushing
|
|
|
|
-- the queue fast enough.
|
|
|
|
Annex.Queue.flushWhenFull
|
|
|
|
|
2012-06-10 22:29:05 +00:00
|
|
|
{- Gets all unhandled changes.
|
smart commit thread
The commit thread now has access to a channel containing the times of
all uncommitted changes. This lets it be smart about detecting busy times
when a batch job is running (such as rm -rf, or untarring something, etc),
and avoid committing until it's done. While at the same time, instantly
committing one-off changes that the user is going to expect to see
immediately.
I had to use STM to implement the channel, because of
http://hackage.haskell.org/trac/ghc/ticket/4154
While this adds a dependency, I always wanted to use STM, so this actually
makes me happy. ;)
Also happy that shouldCommit is a pure function, so other commit smartness
strategies can easily be played with. Although the current one seems pretty
good.
There is one bug, for some reason it does double commits, every time.
2012-06-10 20:07:48 +00:00
|
|
|
- Blocks until at least one change is made. -}
|
2012-06-10 22:29:05 +00:00
|
|
|
getChanges :: ChangeChan -> IO [Change]
|
smart commit thread
The commit thread now has access to a channel containing the times of
all uncommitted changes. This lets it be smart about detecting busy times
when a batch job is running (such as rm -rf, or untarring something, etc),
and avoid committing until it's done. While at the same time, instantly
committing one-off changes that the user is going to expect to see
immediately.
I had to use STM to implement the channel, because of
http://hackage.haskell.org/trac/ghc/ticket/4154
While this adds a dependency, I always wanted to use STM, so this actually
makes me happy. ;)
Also happy that shouldCommit is a pure function, so other commit smartness
strategies can easily be played with. Although the current one seems pretty
good.
There is one bug, for some reason it does double commits, every time.
2012-06-10 20:07:48 +00:00
|
|
|
getChanges chan = atomically $ do
|
|
|
|
c <- readTChan chan
|
|
|
|
go [c]
|
|
|
|
where
|
|
|
|
go l = do
|
|
|
|
v <- tryReadTChan chan
|
|
|
|
case v of
|
|
|
|
Nothing -> return l
|
|
|
|
Just c -> go (c:l)
|
|
|
|
|
|
|
|
{- Puts unhandled changes back into the channel.
|
|
|
|
- Note: Original order is not preserved. -}
|
2012-06-10 22:29:05 +00:00
|
|
|
refillChanges :: ChangeChan -> [Change] -> IO ()
|
smart commit thread
The commit thread now has access to a channel containing the times of
all uncommitted changes. This lets it be smart about detecting busy times
when a batch job is running (such as rm -rf, or untarring something, etc),
and avoid committing until it's done. While at the same time, instantly
committing one-off changes that the user is going to expect to see
immediately.
I had to use STM to implement the channel, because of
http://hackage.haskell.org/trac/ghc/ticket/4154
While this adds a dependency, I always wanted to use STM, so this actually
makes me happy. ;)
Also happy that shouldCommit is a pure function, so other commit smartness
strategies can easily be played with. Although the current one seems pretty
good.
There is one bug, for some reason it does double commits, every time.
2012-06-10 20:07:48 +00:00
|
|
|
refillChanges chan cs = atomically $ mapM_ (writeTChan chan) cs
|
|
|
|
|
2012-06-10 22:29:05 +00:00
|
|
|
{- This thread makes git commits at appropriate times. -}
|
smart commit thread
The commit thread now has access to a channel containing the times of
all uncommitted changes. This lets it be smart about detecting busy times
when a batch job is running (such as rm -rf, or untarring something, etc),
and avoid committing until it's done. While at the same time, instantly
committing one-off changes that the user is going to expect to see
immediately.
I had to use STM to implement the channel, because of
http://hackage.haskell.org/trac/ghc/ticket/4154
While this adds a dependency, I always wanted to use STM, so this actually
makes me happy. ;)
Also happy that shouldCommit is a pure function, so other commit smartness
strategies can easily be played with. Although the current one seems pretty
good.
There is one bug, for some reason it does double commits, every time.
2012-06-10 20:07:48 +00:00
|
|
|
commitThread :: MVar Annex.AnnexState -> ChangeChan -> IO ()
|
|
|
|
commitThread st changechan = forever $ do
|
|
|
|
-- First, a simple rate limiter.
|
2012-06-10 21:53:17 +00:00
|
|
|
threadDelay oneSecond
|
smart commit thread
The commit thread now has access to a channel containing the times of
all uncommitted changes. This lets it be smart about detecting busy times
when a batch job is running (such as rm -rf, or untarring something, etc),
and avoid committing until it's done. While at the same time, instantly
committing one-off changes that the user is going to expect to see
immediately.
I had to use STM to implement the channel, because of
http://hackage.haskell.org/trac/ghc/ticket/4154
While this adds a dependency, I always wanted to use STM, so this actually
makes me happy. ;)
Also happy that shouldCommit is a pure function, so other commit smartness
strategies can easily be played with. Although the current one seems pretty
good.
There is one bug, for some reason it does double commits, every time.
2012-06-10 20:07:48 +00:00
|
|
|
-- Next, wait until at least one change has been made.
|
|
|
|
cs <- getChanges changechan
|
|
|
|
-- Now see if now's a good time to commit.
|
2012-06-10 21:53:17 +00:00
|
|
|
time <- getCurrentTime
|
|
|
|
if shouldCommit time cs
|
2012-06-10 22:29:05 +00:00
|
|
|
then void $ tryIO $ runStateMVar st $ commitStaged
|
2012-06-10 21:53:17 +00:00
|
|
|
else refillChanges changechan cs
|
smart commit thread
The commit thread now has access to a channel containing the times of
all uncommitted changes. This lets it be smart about detecting busy times
when a batch job is running (such as rm -rf, or untarring something, etc),
and avoid committing until it's done. While at the same time, instantly
committing one-off changes that the user is going to expect to see
immediately.
I had to use STM to implement the channel, because of
http://hackage.haskell.org/trac/ghc/ticket/4154
While this adds a dependency, I always wanted to use STM, so this actually
makes me happy. ;)
Also happy that shouldCommit is a pure function, so other commit smartness
strategies can easily be played with. Although the current one seems pretty
good.
There is one bug, for some reason it does double commits, every time.
2012-06-10 20:07:48 +00:00
|
|
|
where
|
|
|
|
oneSecond = 1000000 -- microseconds
|
|
|
|
|
2012-06-10 22:29:05 +00:00
|
|
|
commitStaged :: Annex ()
|
|
|
|
commitStaged = do
|
|
|
|
Annex.Queue.flush
|
|
|
|
inRepo $ Git.Command.run "commit"
|
|
|
|
[ Param "--allow-empty-message"
|
|
|
|
, Param "-m", Param ""
|
|
|
|
-- Empty commits may be made if tree changes cancel
|
|
|
|
-- each other out, etc
|
|
|
|
, Param "--allow-empty"
|
|
|
|
-- Avoid running the usual git-annex pre-commit hook;
|
|
|
|
-- watch does the same symlink fixing, and we don't want
|
|
|
|
-- to deal with unlocked files in these commits.
|
|
|
|
, Param "--quiet"
|
|
|
|
]
|
|
|
|
|
smart commit thread
The commit thread now has access to a channel containing the times of
all uncommitted changes. This lets it be smart about detecting busy times
when a batch job is running (such as rm -rf, or untarring something, etc),
and avoid committing until it's done. While at the same time, instantly
committing one-off changes that the user is going to expect to see
immediately.
I had to use STM to implement the channel, because of
http://hackage.haskell.org/trac/ghc/ticket/4154
While this adds a dependency, I always wanted to use STM, so this actually
makes me happy. ;)
Also happy that shouldCommit is a pure function, so other commit smartness
strategies can easily be played with. Although the current one seems pretty
good.
There is one bug, for some reason it does double commits, every time.
2012-06-10 20:07:48 +00:00
|
|
|
{- Decide if now is a good time to make a commit.
|
|
|
|
- Note that the list of change times has an undefined order.
|
|
|
|
-
|
|
|
|
- Current strategy: If there have been 10 commits within the past second,
|
|
|
|
- a batch activity is taking place, so wait for later.
|
|
|
|
-}
|
2012-06-10 22:29:05 +00:00
|
|
|
shouldCommit :: UTCTime -> [Change] -> Bool
|
|
|
|
shouldCommit now changes
|
smart commit thread
The commit thread now has access to a channel containing the times of
all uncommitted changes. This lets it be smart about detecting busy times
when a batch job is running (such as rm -rf, or untarring something, etc),
and avoid committing until it's done. While at the same time, instantly
committing one-off changes that the user is going to expect to see
immediately.
I had to use STM to implement the channel, because of
http://hackage.haskell.org/trac/ghc/ticket/4154
While this adds a dependency, I always wanted to use STM, so this actually
makes me happy. ;)
Also happy that shouldCommit is a pure function, so other commit smartness
strategies can easily be played with. Although the current one seems pretty
good.
There is one bug, for some reason it does double commits, every time.
2012-06-10 20:07:48 +00:00
|
|
|
| len == 0 = False
|
|
|
|
| len > 4096 = True -- avoid bloating queue too much
|
2012-06-10 22:29:05 +00:00
|
|
|
| length (filter thisSecond changes) < 10 = True
|
smart commit thread
The commit thread now has access to a channel containing the times of
all uncommitted changes. This lets it be smart about detecting busy times
when a batch job is running (such as rm -rf, or untarring something, etc),
and avoid committing until it's done. While at the same time, instantly
committing one-off changes that the user is going to expect to see
immediately.
I had to use STM to implement the channel, because of
http://hackage.haskell.org/trac/ghc/ticket/4154
While this adds a dependency, I always wanted to use STM, so this actually
makes me happy. ;)
Also happy that shouldCommit is a pure function, so other commit smartness
strategies can easily be played with. Although the current one seems pretty
good.
There is one bug, for some reason it does double commits, every time.
2012-06-10 20:07:48 +00:00
|
|
|
| otherwise = False -- batch activity
|
2012-06-10 17:56:39 +00:00
|
|
|
where
|
2012-06-10 22:29:05 +00:00
|
|
|
len = length changes
|
|
|
|
thisSecond c = now `diffUTCTime` changeTime c <= 1
|