So that remotes that use a persistent network connection are restarted.
A remote might keep open a long duration network connection, and could
fail to deal well with losing the connection. This is particularly a
concern now that we have external special reotes. An external
special remote that is implemented naively might open the connection only
when PREPARE is sent, and if it loses connection, throw errors on each
request that is made.
(Note that the ssh connection caching should not have this problem; if the
long-duration ssh process loses connection, the named pipe is disconnected
and the next ssh attempt will reconnect. Also, XMPP already deals with
disconnection robustly in its own way.)
There's no way for git-annex to know if a lost network connection actually
affects a given remote, which might have a transfer in process. It does not
make sense to force kill the transferkeys process every time the NetWatcher
detects a change. (Especially because the NetWatcher sometimes polls 1
change per hour.)
In any case, the NetWatcher only detects connection to a network, not
disconnection. So if a transfer is in progress over the network, and the
network goes down, that will need to time out on its own.
An alternate approch that was considered is to use a separate transferkeys
process for each remote, and detect when a request fails, and assume that
means that process is in a failing state and restart it. The problem with
that approach is that if a resource is not available and a remote fails
every time, it degrades to starting a new transferkeys process for every
file transfer, which is too expensive.
Instead, this commit only handles the network reconnection case, and restarts
transferkeys only once the network has reconnected and another transfer needs
to be made. So, a transferkeys process will be reused for 1 hour, or until the
next network connection.
----
The NotificationBroadcaster was rewritten to use TMVars rather than MSampleVars,
to allow checking without blocking if a notification has been received.
----
This commit was sponsored by Tobias Brunner.
When a page is loaded, the javascript requests an notification url, and
does long polling on the url to be informed of changes. But if a change
occured before the notification url was requested, it would not be notified
of that change, and so the page display would not update.
I fixed this by *always* updating the page display after it gets
the notification url. This is extra work, but the overhead is not noticable
in the other overhead of loading a page.
(A nicer way would be to somehow record the version of a page initially
loaded, and then compare it with the current version when getting the
notification url, and only force an update if it's changed. But getting
the "version" of the different parts of the page that use long polling
is difficult.)
This is a way to send a notification to a set of clients, any of which can be
blocked waiting for a new notification to arrive. A complication is that any
number of clients may be be dead, and we don't want stale notifications for
those clients to pile up and leak memory.
It took me 3 tries to find the solution, which turns out to be simple: An array
of SampleVars, one per client. Using SampleVars means that clients only see the
most recent notification, but when the notification is just "the assistant's
state changed somehow; display a refreshed rendering of it", that's sufficient.