Merge branch 'master' into watch

Joey Hess 2012-06-07 13:48:55 -04:00
commit 727158ff55
11 changed files with 93 additions and 22 deletions

@ -15,6 +15,7 @@ import qualified Remote
import qualified Logs.Remote
import qualified Types.Remote as R
import Annex.UUID
import Logs.UUID
def :: [Command]
def = [command "initremote"
@ -60,6 +61,7 @@ findByName name = do
where
generate = do
uuid <- liftIO genUUID
describeUUID uuid name
return (uuid, M.insert nameKey name M.empty)
findByName' :: String -> M.Map UUID R.RemoteConfig -> Maybe (UUID, R.RemoteConfig)

@ -54,9 +54,9 @@ remoteMap :: (Remote -> a) -> Annex (M.Map UUID a)
remoteMap c = M.fromList . map (\r -> (uuid r, c r)) .
filter (\r -> uuid r /= NoUUID) <$> remoteList
{- Map of UUIDs and their descriptions.
{- Map of UUIDs of remotes and their descriptions.
- The names of Remotes are added to supplement any description that has
- been set for a repository. -}
- been set for a repository. -}
uuidDescriptions :: Annex (M.Map UUID String)
uuidDescriptions = M.unionWith addName <$> uuidMap <*> remoteMap name
@ -101,9 +101,6 @@ nameToUUID n = byName' n >>= go
double (a, _) = (a, a)
{- Pretty-prints a list of UUIDs of remotes, for human display.
-
- Shows descriptions from the uuid log, falling back to remote names,
- as some remotes may not be in the uuid log.
-
- When JSON is enabled, also generates a machine-readable description
- of the UUIDs. -}

debian/changelog

@ -5,6 +5,7 @@ git-annex (3.20120606) UNRELEASED; urgency=low
to manually run git commands when manipulating files.
* add: Prevent (most) modifications from being made to a file while it
is being added to the annex.
* initremote: Automatically describe a remote when creating it.
-- Joey Hess <joeyh@debian.org> Tue, 05 Jun 2012 20:25:51 -0400

@ -0,0 +1,26 @@
Today I worked on the race conditions, and fixed two of them. Both
were fixed by avoiding using `git add`, which looks at the files currently
on disk. Instead, `git annex watch` injects symlinks directly into git's
index, using `git update-index`.
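As a rough illustration of that approach (a sketch, not the actual `git annex watch` code), a symlink can be staged without git ever looking at the work tree: the link target is written as a blob, and the index entry is then created with the symlink mode. The function name `stage_symlink` and the example target path are assumptions made for this sketch.

```shell
# Hypothetical helper: stage a symlink at path $1 pointing to target $2,
# without reading whatever file currently exists at $1 on disk.
stage_symlink() {
    path=$1
    target=$2
    # Git stores a symlink as a blob whose content is the link target.
    sha=$(printf '%s' "$target" | git hash-object -w --stdin) || return 1
    # Mode 120000 marks the index entry as a symlink.
    git update-index --add --cacheinfo 120000 "$sha" "$path"
}
```

Run inside a git repository, `stage_symlink foo .git/annex/objects/example-key` leaves `foo` staged as a symlink even if `foo` does not exist on disk or has since been replaced by something else.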
There is one bad race condition remaining. If multiple processes have a
file open for write, one can close it, and it will be added to the annex.
But then the other can still write to it.
----
Getting away from race conditions for a while, I made `git annex watch`
not annex `.gitignore` and `.gitattributes` files.
And, I made it handle running out of inotify descriptors. By default,
`/proc/sys/fs/inotify/max_user_watches` is 8192, and that's how many
directories inotify can watch. Now when it needs more, it will print
a nice message showing how to increase it with `sysctl`.
FWIW, DropBox also uses inotify and has the same limit. It seems to not
tell the user how to fix it when it goes over. Here's what `git annex
watch` will say:
Too many directories to watch! (Not watching ./dir4299)
Increase the limit by running:
echo fs.inotify.max_user_watches=81920 | sudo tee -a /etc/sysctl.conf; sudo sysctl -p
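For a manual check before starting the watcher, the number of directories in a tree can be compared against the current limit. This is only a sketch and is not taken from git-annex; the function name `check_watch_limit` and its optional second argument (an alternate file to read the limit from) are assumptions made for illustration.

```shell
# Hypothetical helper: compare the number of directories under $1
# with the inotify watch limit read from $2.
check_watch_limit() {
    dir=${1:-.}
    limit_file=${2:-/proc/sys/fs/inotify/max_user_watches}
    limit=$(cat "$limit_file") || return 2
    # One watch is needed per directory; strip padding some wc versions add.
    dirs=$(find "$dir" -type d | wc -l | tr -d ' ')
    if [ "$dirs" -gt "$limit" ]; then
        echo "over: $dirs directories, limit $limit"
        return 1
    fi
    echo "ok: $dirs directories, limit $limit"
}
```

Calling `check_watch_limit ~/annex` on Linux reads the live limit from `/proc`; a nonzero exit status means the tree has more directories than the limit allows.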

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="https://www.google.com/accounts/o8/id?id=AItOawkmtR6oVColYKoU0SjBORLDGrwR10G-mKo"
nickname="Jo-Herman"
subject="Dropbox Inotify"
date="2012-06-06T22:03:29Z"
content="""
Actually, Dropbox gives you a warning via libnotify inotify. It tends to go away too quickly to properly read though, much less actually copy down the command...
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="4.252.8.36"
subject="comment 2"
date="2012-06-06T23:25:57Z"
content="""
When I work on the [[webapp]], I'm planning to make it display this warning, and any other similar warning messages that might come up.
"""]]

@ -0,0 +1,14 @@
[[!comment format=mdwn
username="https://www.google.com/accounts/o8/id?id=AItOawnBJ6Dv1glxzzi4qIzGFNa6F-mfHIvv9Ck"
nickname="Jim"
subject="Wording"
date="2012-06-07T03:43:19Z"
content="""
For the unfamiliar, it's hard to tell if a command like that would persist. I'd suggest being as clear as possible, e.g.:
Increase the limit for now by running:
sudo sysctl fs.inotify.max_user_watches=81920
Increase the limit now and automatically at every boot by running:
echo fs.inotify.max_user_watches=81920 | sudo tee -a /etc/sysctl.conf; sudo sysctl -p
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="4.252.8.36"
subject="comment 4"
date="2012-06-07T04:48:15Z"
content="""
Good thought Jim. I've done something like that.
"""]]

@ -19,23 +19,22 @@ really useful, it needs to:
- notice deleted files and stage the deletion
(tricky; there's a race with add since it replaces the file with a symlink..)
**done**
- Gracefully handle when the default limit of 8192 inotified directories
is exceeded. This can be tuned by root, so help the user fix it.
**done**
- periodically auto-commit staged changes (avoid autocommitting when
lots of changes are coming in)
- tunable delays before adding new files, etc
- Coalesce related add/rm events. See commit
cbdaccd44aa8f0ca30afba23fc06dd244c242075 for some details of the problems
with doing this.
- don't annex `.gitignore` and `.gitattributes` files, but do auto-stage
changes to them
- coalesce related add/rm events for speed and less disk IO
- don't annex `.gitignore` and `.gitattributes` files **done**
- configurable option to only annex files meeting certain size or
filename criteria
- option to check files not meeting annex criteria into git directly
- honor .gitignore, not adding files it excludes (difficult, probably
needs my own .gitignore parser to avoid excessive running of git commands
to check for ignored files)
- Possibly, when a directory is moved out of the annex location,
unannex its contents.
- Gracefully handle when the default limit of 8192 inotified directories
is exceeded. This can be tuned by root, so help the user fix it.
- Support OSes other than Linux; it only uses inotify currently.
OSX and FreeBSD use the same mechanism, and there is a Haskell interface
for it,
@ -67,10 +66,18 @@ Many races need to be dealt with by this code. Here are some of them.
**Currently unfixed**; This changes content in the annex, and fsck will
later catch the inconsistency.
Possible fixes: Somehow track or detect if a file is open for write
by any processes. Or, when possible, making a copy on write copy
before adding the file would avoid this. Or, as a last resort, make
an expensive copy of the file and add that.
Possible fixes:
* Somehow track or detect if a file is open for write by any processes.
* Or, when possible, making a copy on write copy before adding the file
would avoid this.
* Or, as a last resort, make an expensive copy of the file and add that.
* Tracking file opens and closes with inotify could tell if any other
processes have the file open. But there are problems.. It doesn't
seem to differentiate between files opened for read and for write.
And there would still be a race after the last close and before it's
injected into the annex, where it could be opened for write again.
Would need to detect that and undo the annex injection or something.
* File is added and then replaced with another file before the annex add
makes its symlink.
@ -82,16 +89,14 @@ Many races need to be dealt with by this code. Here are some of them.
* File is added and then replaced with another file before the annex add
stages the symlink in git.
**Currently unfixed**; `git add` will be run on the new file, which is
not at all good when it's big. Could be dealt with by using `git
update-index` to manually put the symlink into the index without git
Now fixed; `git annex watch` avoids running `git add` because of this
race. Instead, it stages symlinks directly into the index, without
looking at what's currently on disk.
* Link is moved, fixed link is written by fix event, but then that is
removed by the user and replaced with a file before the event finishes.
**Currently unfixed**: `git add` will be run on the file. Basically same
effect as previous race above.
Now fixed; same fix as previous race above.
* File is removed and then re-added before the removal event starts.

@ -23,6 +23,8 @@ The webapp is a web server that displays a shiny interface.
* there could be a UI to export a file, which would make it be served up
over http by the web app
* Display any relevant warning messages. One is the `inotify max_user_watches`
exceeded message.
## implementation

@ -24,7 +24,7 @@ With a little setup, git-annex can use Box as a
* Create `~/.davfs2/davfs2.conf` with some important settings:
mkdir ~/.davfs2/
echo use_locks 0 >> ~/.davfs2/davfs2.conf
echo use_locks 0 > ~/.davfs2/davfs2.conf
echo cache_size 1 >> ~/.davfs2/davfs2.conf
echo delay_upload 0 >> ~/.davfs2/davfs2.conf