git-annex/doc/design/assistant/inotify.mdwn
2012-06-05 15:10:04 -04:00

Finish the `git annex watch` command, which runs in the background, watches
for changes via inotify, and automatically annexes new files, etc.

There is a `watch` branch in git that adds such a command. To make this
really useful, it needs to:

- on startup, add any files that have appeared since last run **done**
- on startup, fix the symlinks for any renamed links **done**
- on startup, stage any files that have been deleted since last run
(seems to require a `git commit -a` on startup, or at least a
`git add --update`, which will notice deleted files) **done**
- notice new files, and git annex add **done**
- notice renamed files, auto-fix the symlink, and stage the new file location
**done**
- handle cases where directories are moved outside the repo, and stop
watching them **done**
- when a whole directory is deleted or moved, stage removal of its
contents from the index **done**
- notice deleted files and stage the deletion
(tricky; there's a race with add since it replaces the file with a symlink..)
**done**
- periodically auto-commit staged changes (avoid autocommitting when
lots of changes are coming in)
- tunable delays before adding new files, etc
- Coalesce related add/rm events. See commit
cbdaccd44aa8f0ca30afba23fc06dd244c242075 for some details of the problems
with doing this.
- don't annex `.gitignore` and `.gitattributes` files, but do auto-stage
changes to them
- configurable option to only annex files meeting certain size or
filename criteria
- honor .gitignore, not adding files it excludes (difficult, probably
needs my own .gitignore parser to avoid excessive running of git commands
to check for ignored files)
- Possibly, when a directory is moved out of the annex location,
unannex its contents.
- Gracefully handle the case where the default limit of 8192 inotify
watches (one per watched directory) is exceeded. Root can raise this limit
(the `fs.inotify.max_user_watches` sysctl), so help the user fix it.
- Support OSes other than Linux; it only uses inotify currently.
OSX and FreeBSD use the same mechanism as each other, and there is a
Haskell interface for it.
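
The auto-commit item above (commit periodically, but not while lots of changes are still coming in) can be illustrated with a small Python sketch. It is not taken from git-annex; `CommitScheduler`, `quiet_seconds` and `max_delay_seconds` are hypothetical names, and the upper bound on how long a change may wait is an added assumption, not something this page specifies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CommitScheduler:
    """Hypothetical helper that decides when staged changes may be committed."""

    quiet_seconds: float = 1.0       # assumed: gap with no new changes before committing
    max_delay_seconds: float = 60.0  # assumed: longest a staged change may wait
    first_change: Optional[float] = None
    last_change: Optional[float] = None

    def record_change(self, now: float) -> None:
        """Note that a change was staged at time `now` (in seconds)."""
        if self.first_change is None:
            self.first_change = now
        self.last_change = now

    def should_commit(self, now: float) -> bool:
        """True once changes have gone quiet, or the oldest one has waited too long."""
        if self.first_change is None or self.last_change is None:
            return False  # nothing staged, nothing to commit
        quiet = now - self.last_change >= self.quiet_seconds
        overdue = now - self.first_change >= self.max_delay_seconds
        return quiet or overdue

    def committed(self) -> None:
        """Reset after the caller has made the commit."""
        self.first_change = None
        self.last_change = None
```

A watcher loop would call `record_change` from its event handlers and poll `should_commit` on a timer. The quiet period keeps a burst of events from producing one commit per file, while the assumed upper bound keeps a long-running burst from postponing the commit forever.
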
## the races

Many races need to be dealt with by this code. Here are some of them.

* File is added and then removed before the add event starts.
Not a problem; The add event does nothing since the file is not present.
* File is added and then removed before the add event has finished
processing it.
Minor problem; When the add's processing of the file (checksum and so
on) fails due to it going away, there is an ugly error message, but
things are otherwise ok.
* File is added and then removed before the add event finishes.
Currently unfixed; The annex add re-adds the file as a symlink and then
the remove event does nothing since the symlink exists, so the deleted
file reappears in the working tree.
* File is added and then replaced with another file before the annex add
makes its symlink.
Minor problem; The annex add will fail creating its symlink since
the file exists. There is an ugly error message, but the second add
event will add the new file.
* File is added and then replaced with another file before the annex add
moves its content into the annex.
Currently unfixed; The new content will be moved to the annex under the
old checksum, and fsck will later catch this inconsistency.
Possible fix: Move content someplace before doing checksumming.
* File is removed and then re-added before the removal event starts.
Not a problem; The removal event does nothing since the file exists,
and the add event replaces it in git with the new one.
* File is removed and then re-added before the removal event finishes.
Not a problem; The removal event removes the old file from the index, and
the add event adds the new one.
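
The possible fix mentioned above for the replaced-before-move race (move the content someplace before checksumming) could look roughly like the following Python sketch. It is an illustration under stated assumptions, not git-annex's implementation: `ingest`, the `incoming-` staging prefix, and the `SHA256-<hex>` key format are all hypothetical, and the staging file is assumed to live on the same filesystem as the original so that the rename is atomic.

```python
import hashlib
import os
import tempfile


def ingest(path: str, annex_dir: str) -> str:
    """Move `path` into a staging file first, then checksum the staged copy.

    Hypothetical sketch: the checksum is computed from content that can no
    longer be replaced through the original path.
    """
    os.makedirs(annex_dir, exist_ok=True)
    fd, staging = tempfile.mkstemp(dir=annex_dir, prefix="incoming-")
    os.close(fd)
    # Atomic rename (same filesystem assumed); replaces the empty staging file.
    os.replace(path, staging)

    # Checksum the staged content, not whatever now lives at `path`.
    digest = hashlib.sha256()
    with open(staging, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    key = "SHA256-" + digest.hexdigest()  # hypothetical key format

    dest = os.path.join(annex_dir, key)
    os.replace(staging, dest)
    # Raises FileExistsError if a new file appeared at `path` in the meantime.
    os.symlink(dest, path)
    return key
```

With this ordering, a file written to `path` after the move cannot end up stored under the old checksum. It instead triggers the symlink-creation failure already listed above as a minor problem. A process that still holds the original file open for writing can keep modifying the staged copy, so this narrows the race rather than eliminating it.
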