From 9bcfbc40480c2a6a675ff5b9198e64287c6e6881 Mon Sep 17 00:00:00 2001
From: Joey Hess
Date: Fri, 6 Jul 2012 21:17:21 -0600
Subject: [PATCH 1/4] todo

---
 doc/design/assistant/syncing.mdwn | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/doc/design/assistant/syncing.mdwn b/doc/design/assistant/syncing.mdwn
index 9c607f992d..94699aae05 100644
--- a/doc/design/assistant/syncing.mdwn
+++ b/doc/design/assistant/syncing.mdwn
@@ -31,6 +31,10 @@ all the other git clones, at both the git level and the key/value level.
   only uploading new files but not downloading, and only downloading
   files in some directories and not others. See for use cases:
   [[forum/Wishlist:_options_for_syncing_meta-data_and_data]]
+* Running external commands from one thread blocks all of them until
+  it completes. Try to switch to haskell's threaded runtime, which I
+  think fixes this. Failing that, make sure all network accessing
+  commands are run by separate processes or something.

 ## data syncing

From 583cfb5667ae2ec9e01e9fe4f91b32d758de6aa4 Mon Sep 17 00:00:00 2001
From: "https://www.google.com/accounts/o8/id?id=AItOawmWQTrnPloMWiPFg8Y2Y5g-2IYe26D0KKw"
Date: Sat, 7 Jul 2012 16:18:08 +0000
Subject: [PATCH 2/4] Added a comment

---
 .../comment_6_1d4fbbd212fa92967abda346323031f4._comment | 8 ++++++++
 1 file changed, 8 insertions(+)
 create mode 100644 doc/forum/fsck_gives_false_positives/comment_6_1d4fbbd212fa92967abda346323031f4._comment

diff --git a/doc/forum/fsck_gives_false_positives/comment_6_1d4fbbd212fa92967abda346323031f4._comment b/doc/forum/fsck_gives_false_positives/comment_6_1d4fbbd212fa92967abda346323031f4._comment
new file mode 100644
index 0000000000..294778cbfa
--- /dev/null
+++ b/doc/forum/fsck_gives_false_positives/comment_6_1d4fbbd212fa92967abda346323031f4._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="https://www.google.com/accounts/o8/id?id=AItOawmWQTrnPloMWiPFg8Y2Y5g-2IYe26D0KKw"
+ nickname="Jim"
+ subject="comment 6"
+ date="2012-07-07T16:18:06Z"
+ content="""
+It's also possible you got a one-time DRAM corruption. You have to expect those to happen every so often unless you're using ECC memory.
+"""]]

From eb9063c0d1bf34a339516c8ecbfbff8dd2f83e25 Mon Sep 17 00:00:00 2001
From: Joey Hess
Date: Sat, 7 Jul 2012 10:56:09 -0600
Subject: [PATCH 3/4] update

---
 doc/design/assistant/syncing.mdwn | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/doc/design/assistant/syncing.mdwn b/doc/design/assistant/syncing.mdwn
index 94699aae05..2871ec2164 100644
--- a/doc/design/assistant/syncing.mdwn
+++ b/doc/design/assistant/syncing.mdwn
@@ -1,23 +1,24 @@
 Once files are added (or removed or moved), need to send those changes to
 all the other git clones, at both the git level and the key/value level.

-## action items
+## immediate action items

 * Check that download transfer triggering code works (when a symlink appears
   and the remote does *not* upload to us.
-* Investigate why transfers seem to block other git-annex assistant work.
 * At startup, and possibly periodically, look for files we have that
   location tracking indicates remotes do not, and enqueue Uploads for
   them. Also, enqueue Downloads for any files we're missing.
-* Find a way to probe available outgoing bandwidth, to throttle so
-  we don't bufferbloat the network to death.
-* git-annex needs a simple speed control knob, which can be plumbed
-  through to, at least, rsync. A good job for an hour in an
-  airport somewhere.
-* file transfer processes are not waited for, contain the zombies.
+* The TransferWatcher does not notice ongoing transfers, because inotify is
+  waiting for the info file to be closed, but that never happens, it's left
+  open to keep it locked.

 ## longer-term TODO

+* git-annex needs a simple speed control knob, which can be plumbed
+  through to, at least, rsync. A good job for an hour in an
+  airport somewhere.
+* Find a way to probe available outgoing bandwidth, to throttle so
+  we don't bufferbloat the network to death.
 * Investigate the XMPP approach like dvcs-autosync does, or other ways of
   signaling a change out of band.
 * Add a hook, so when there's a change to sync, a program can be run

From 3247415c56fcb3b7ed7914e44cae640c6443abc0 Mon Sep 17 00:00:00 2001
From: Joey Hess
Date: Sat, 7 Jul 2012 11:12:11 -0600
Subject: [PATCH 4/4] update; split out hard todo

---
 doc/design/assistant/inotify.mdwn | 22 +++++++++++-----------
 doc/design/assistant/syncing.mdwn | 7 ++-----
 doc/todo/threaded_runtime.mdwn | 29 +++++++++++++++++++++++++++++
 3 files changed, 42 insertions(+), 16 deletions(-)
 create mode 100644 doc/todo/threaded_runtime.mdwn

diff --git a/doc/design/assistant/inotify.mdwn b/doc/design/assistant/inotify.mdwn
index 7b600090ad..fd81366d44 100644
--- a/doc/design/assistant/inotify.mdwn
+++ b/doc/design/assistant/inotify.mdwn
@@ -18,6 +18,17 @@ available!
   I may need to fork off multiple watcher processes to handle this.
   See [[bugs/Issue_on_OSX_with_some_system_limits]].

+## todo
+
+* Run niced and ioniced? Seems to make sense, this is a background job.
+* configurable option to only annex files meeting certian size or
+  filename criteria
+* option to check files not meeting annex criteria into git directly,
+  automatically
+* honor .gitignore, not adding files it excludes (difficult, probably
+  needs my own .gitignore parser to avoid excessive running of git commands
+  to check for ignored files)
+
 ## beyond Linux

 I'd also like to support OSX and if possible the BSDs.
@@ -65,17 +76,6 @@ I'd also like to support OSX and if possible the BSDs.

 * Windows has a Win32 ReadDirectoryChangesW, and perhaps other things.

-## todo
-
-- Run niced and ioniced? Seems to make sense, this is a background job.
-- configurable option to only annex files meeting certian size or
-  filename criteria
-- option to check files not meeting annex criteria into git directly,
-  automatically
-- honor .gitignore, not adding files it excludes (difficult, probably
-  needs my own .gitignore parser to avoid excessive running of git commands
-  to check for ignored files)
-
 ## the races

 Many races need to be dealt with by this code. Here are some of them.
diff --git a/doc/design/assistant/syncing.mdwn b/doc/design/assistant/syncing.mdwn
index 2871ec2164..98749d2442 100644
--- a/doc/design/assistant/syncing.mdwn
+++ b/doc/design/assistant/syncing.mdwn
@@ -10,7 +10,8 @@ all the other git clones, at both the git level and the key/value level.
   them. Also, enqueue Downloads for any files we're missing.
 * The TransferWatcher does not notice ongoing transfers, because inotify is
   waiting for the info file to be closed, but that never happens, it's left
-  open to keep it locked.
+  open to keep it locked. May need to separate the transfer info files
+  into an info file, and a lock file.

 ## longer-term TODO

@@ -32,10 +33,6 @@ all the other git clones, at both the git level and the key/value level.
   only uploading new files but not downloading, and only downloading
   files in some directories and not others. See for use cases:
   [[forum/Wishlist:_options_for_syncing_meta-data_and_data]]
-* Running external commands from one thread blocks all of them until
-  it completes. Try to switch to haskell's threaded runtime, which I
-  think fixes this. Failing that, make sure all network accessing
-  commands are run by separate processes or something.

 ## data syncing

diff --git a/doc/todo/threaded_runtime.mdwn b/doc/todo/threaded_runtime.mdwn
new file mode 100644
index 0000000000..095ffa4359
--- /dev/null
+++ b/doc/todo/threaded_runtime.mdwn
@@ -0,0 +1,29 @@
+The [[design/assistant]] would be better if git-annex used ghc's threaded
+runtime (`ghc -threaded`).
+
+Currently, whenever the assistant code runs some external command, all
+threads are blocked waiting for it to finish.
+
+For transfers, the assistant works around this problem by forking separate
+upload processes, and not waiting on them until it sees an indication that
+they have finished the transfer. While this works, it's messy.. threaded
+would be better.
+
+When pulling, pushing, and merging, the assistant runs external git
+commands, and this does block all other threads. The threaded runtime would
+really help here.
+
+---
+
+Currently, git-annex seems unstable when built with the threaded runtime.
+The test suite tends to hang when testing add. `git-annex` occasionally
+hangs, apparently in a futex lock. This is not the assistant hanging, and
+git-annex does not otherwise use threads, so this is surprising. --[[Joey]]
+
+---
+
+It would be possible to not use the threaded runtime. Instead, we could
+have a child process pool, with associated continuations to run after a
+child process finishes. Then periodically do a nonblocking waitpid on each
+process in the pool in turn (waiting for any child could break anything not
+using the pool!). This is probably a last resort...
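
Review note on the last paragraph of `doc/todo/threaded_runtime.mdwn`: the child process pool fallback could look roughly like the sketch below. It is a minimal illustration, not code from git-annex. The names `Child`, `Pool`, `newPool`, `runInPool` and `reapPool` are hypothetical, and the `true` command in `main` assumes a POSIX system. It relies on `getProcessExitCode` from the `process` package, which checks a single process without blocking and returns `Nothing` while that process is still running.

```haskell
import Control.Concurrent (threadDelay)
import Control.Monad (unless)
import Data.IORef (IORef, modifyIORef, newIORef, readIORef, writeIORef)
import System.Exit (ExitCode)
import System.Process (ProcessHandle, getProcessExitCode, spawnProcess)

-- A pooled child: its handle plus the continuation to run once it exits.
-- (Hypothetical type; nothing here is taken from git-annex.)
data Child = Child ProcessHandle (ExitCode -> IO ())

type Pool = IORef [Child]

newPool :: IO Pool
newPool = newIORef []

-- Start a command without waiting for it, and register its continuation.
runInPool :: Pool -> FilePath -> [String] -> (ExitCode -> IO ()) -> IO ()
runInPool pool cmd args k = do
    h <- spawnProcess cmd args
    modifyIORef pool (Child h k :)

-- Poll each pooled child in turn, never blocking. Only processes in the
-- pool are checked, so children started elsewhere are not reaped here.
reapPool :: Pool -> IO ()
reapPool pool = do
    children <- readIORef pool
    -- Empty the pool first, so a continuation may safely add new children.
    writeIORef pool []
    still <- go children
    modifyIORef pool (still ++)
  where
    go [] = return []
    go (c@(Child h k) : cs) = do
        status <- getProcessExitCode h
        case status of
            Nothing   -> (c :) <$> go cs   -- still running, keep it
            Just code -> k code >> go cs   -- finished, run continuation

main :: IO ()
main = do
    pool <- newPool
    -- "true" is assumed to exist on the PATH (POSIX).
    runInPool pool "true" [] (\code -> putStrLn ("child finished: " ++ show code))
    let loop = do
            reapPool pool
            remaining <- readIORef pool
            unless (null remaining) (threadDelay 100000 >> loop)
    loop
```

Polling per process rather than waiting for any child is what keeps this from stealing exit statuses that code outside the pool expects to collect, which is the hazard the paragraph calls out. The cost is a periodic wakeup, so the polling interval here (100 ms) is an arbitrary choice.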