This cannot completely guard against a runaway log event, and only runs
every hour anyway, but it should avoid most problems with very
long-running, active assistants using up too much space.
It used to not log to daemon.log when a repository was first created, and
when starting the webapp. Now both do. Redirecting stdout and stderr to the
log is tricky when starting the webapp, because the web browser may want to
communicate with the user. (Either a console web browser, or web.browser = echo)
This is handled by restoring the original fds when running the browser.
It might even work, although nothing yet triggers XMPP pushes.
Also added a set of deferred push messages. Only one push can run at a
time, and unrelated push messages get deferred. The set will never grow
very large, because it only puts two types of messages in there, that
can only vary in the client doing the push.
This is a nice win; much less code runs in Annex, so other threads have
more chances to run concurrently.
I do notice that renaming a file has gone from 1 to 2 commits. I think this
is due to the above improvement letting the committer run more frequently,
so it commits the rm first.
Converted several threads to run in the monad.
Added a lot of useful combinators for working with the monad.
Now the monad includes the name of the thread.
Some debugging messages are disabled pending converting other threads.
I now have this topology working:
assistant ---> {bare repo, special remote} <--- assistant
And, I think, also this one:
+----------- bare repo --------+
v v
assistant ---> special remote <--- assistant
While before with assistant <---> assistant connections, both sides got
location info updated after a transfer, in this topology, the bare repo
*might* get its location info updated, but the other assistant has no way to
know that it did. And a special remote doesn't record location info,
so transfers to it won't propigate out location log changes at all.
So, for these to work, after a transfer succeeds, the git-annex branch
needs to be pushed. This is done by recording a synthetic commit has
occurred, which lets the pusher handle pushing out the change (which will
include actually committing any still journalled changes to the git-annex
branch).
Of course, this means rather a lot more syncing action than happened
before. At least the pusher bundles together very close together pushes,
somewhat. Currently it just waits 2 seconds between each push.
Adjust build deps to ensure that only a fixed version of the library will
be used.
Also, removed the bound thread stuff, which I now think was (probably)
a red herring.
This *may* solve the segfault I was seeing when the XMPP library called
startTLS. My hypothesis is as follows:
* TLS is documented
(http://www.gnu.org/software/gnutls/manual/gnutls.html#Thread-safety)
thread safe, but only when a single thread accesses it.
* forkIO threads are not bound to an OS thread, so it was possible for
the threaded runtime to run part of the XMPP code on one thread, and
then switch to another thread later.
So, forkOS, with its bound threads, should be used for the XMPP thread.
Since the crash doesn't happen reliably, I am not yet sure about this fix.
Note that I kept all the other threads in the assistant unbound, because
bound threads have significantly higher overhead.
Lacking error handling, reconnection, credentials configuration,
and doesn't actually do anything when it receives an incoming notification.
Other than that, it might work! :)
Hooked up everything that needs to notify on pushes. Note that
syncNewRemote does not notify. This is probably ok, and I'd need to thread
more state through to make it do so.
This is only set up to support a single push notification method; I didn't
use a NotificationBroadcaster. Partly because I don't yet know what info
about pushes needs to be communicated, so my data types are only
preliminary.
Monitors git-annex branch for changes, which are noticed by the Merger
thread whenever the branch ref is changed (either due to an incoming push,
or a local change), and refreshes cached config values for modified config
files.
Rate limited to run no more often than once per minute. This is important
because frequent git-annex branch changes happen when files are being
added, or transferred, etc.
A primary use case is that, when preferred content changes are made,
and get pushed to remotes, the remotes start honoring those settings.
Other use cases include propigating repository description and trust
changes to remotes, and learning when a remote has added a new special
remote, so the webapp can present the GUI to enable that special remote
locally.
Also added a uuid.log cache. All other config files already had caches.
This ensures file propigate takes place in situations such as: Usb drive A
is connected to B. A's master branch is already in sync with B, but it is
being used to sneakernet some files around, so B downloads those. There is no
master branch change, so C does not request these files. B needs to upload
the files it just downloaded on to C, etc.
My first try at this, I saw loops happen. B uploaded to C, which then
tried to upload back to B (because it had not received the updated
git-annex branch from B yet). B already had the file, but it still created
a transfer info file from the incoming transfer, and its watcher saw
that be removed, and tried to upload back to C.
These loops should have been fixed by my previous commit. (They never
affected ssh remotes, only local ones, it seemed.) While C might still try
to upload to B, or to some other remote that already has the file, the
extra work dies out there.
Now when a download is queued and there's no known remote to get it from,
it's added to a deferred download list, which will be retried later.
The Merger thread tries to queue any deferred downloads when it receives
a push to the git-annex branch.
Note that the Merger thread now also forces an update of the git-annex
branch. The assistant was not updating this branch before, and it saw a
(mostly) correct view of state, but now that incoming pushes go to
synced/git-annex, it needs to be merged in.
They work fine. But I had to go to a lot of trouble to get Yesod to render
routes in a pure function. It may instead make more sense to have each
alert have an assocated IO action, and a single route that runs the IO
action of a given alert id. I just wish I'd realized that before the past
several hours of struggling with something Yesod really doesn't want to
allow.
This deals with interruptions in network connectevity, by listening
for a new network interface coming up (using dbus to see when
network-manager or wicd do it), and forcing a rescan of
This prevents multiple runs of the assistant in the foreground, and lets
--stop stop foregrounded runs too.
The webapp firstrun case also now writes a pid file, once it's made the git
repo to put it in.
The fun part was making it move things from TransferQueue to currentTransfers
entirely atomically. Which will avoid inconsistent display if the WebApp
renders the current status at just the wrong time. STM to the rescue!
I've convinced myself that nothing in DaemonStatus can deadlock,
as it always keepts the TMVar full. That was the only reason it was in the
Annex monad.
This avoids forking another process, avoids polling, fixes a race,
and avoids a rare forkProcess thread hang that I saw once time
when starting the webapp.
This should fix OSX/BSD issues with not noticing transfer information
files with kqueue. Now that threads are used, the thread can manage the
transfer slot allocation and deallocation by itself; much cleaner.
Added knownRemotes to DaemonStatus. This list is not entirely trivial to
calculate, and having it here should make it easier to add/remove remotes
on the fly later on. It did require plumbing the daemonstatus through to
some more threads.
The reason the DirWatcher had to wait for program termination was because
it used withINotify, so when it finished, its watcher threads were killed.
But since I have two DirWatcher threads now, that was not good, and could
perhaps explain the MVar problem I saw yesterday. In any case, fixed this
part of the code by making the DirWatcher return a handle that can be used
to stop it, and now the main Assistant thread is the only one calling
waitForTermination.
SampleMVar won't work; between getting the current value and changing
it, another thread could made a change, which would get lost.
TMVar works well; this update situation is handled by atomic transactions.
Rethought how to keep track of pending adds that need to be retried later.
The commit thread already run up every second when there are changes,
so let's keep pending adds queued as changes until they're safe to add.
Also, the committer is now smarter about avoiding empty commits when
all the adds are currently unsafe, or in the rare case that an add event
for a symlink is not received in time. It may avoid them entirely.
This seems to work as before for inotify, and is untested for kqueue.
(Actually commit batching seems to be improved for inotify, although I'm
not sure why. I'm seeing only two commits made during large batch
operations, and the first of those is the non-batch mode commit.)
Kqueue needs to remember which files failed to be added due to being open,
and retry them. This commit gets the data in place for such a retry thread.
Broke KeySource out into its own file, and added Eq and Ord instances
so it can be stored in a Set.