So that remotes that use a persistent network connection are restarted.
A remote might keep open a long duration network connection, and could
fail to deal well with losing the connection. This is particularly a
concern now that we have external special reotes. An external
special remote that is implemented naively might open the connection only
when PREPARE is sent, and if it loses connection, throw errors on each
request that is made.
(Note that the ssh connection caching should not have this problem; if the
long-duration ssh process loses connection, the named pipe is disconnected
and the next ssh attempt will reconnect. Also, XMPP already deals with
disconnection robustly in its own way.)
There's no way for git-annex to know if a lost network connection actually
affects a given remote, which might have a transfer in process. It does not
make sense to force kill the transferkeys process every time the NetWatcher
detects a change. (Especially because the NetWatcher sometimes polls 1
change per hour.)
In any case, the NetWatcher only detects connection to a network, not
disconnection. So if a transfer is in progress over the network, and the
network goes down, that will need to time out on its own.
An alternate approch that was considered is to use a separate transferkeys
process for each remote, and detect when a request fails, and assume that
means that process is in a failing state and restart it. The problem with
that approach is that if a resource is not available and a remote fails
every time, it degrades to starting a new transferkeys process for every
file transfer, which is too expensive.
Instead, this commit only handles the network reconnection case, and restarts
transferkeys only once the network has reconnected and another transfer needs
to be made. So, a transferkeys process will be reused for 1 hour, or until the
next network connection.
----
The NotificationBroadcaster was rewritten to use TMVars rather than MSampleVars,
to allow checking without blocking if a notification has been received.
----
This commit was sponsored by Tobias Brunner.
This has not been tested at all. It compiles!
The only known missing things are support for encryption, and for get/set
of special remote configuration, and of key state. (The latter needs
separate work to add a new per-key log file to store that state.)
Only thing I don't much like is that initremote needs to be passed both
type=external and externaltype=foo. It would be better to have just
type=foo
Most of this is quite straightforward code, that largely wrote itself given
the types. The only tricky parts were:
* Need to lock the remote when using it to eg make a request, because
in theory git-annex could have multiple threads that each try to use
a remote at the same time. I don't think that git-annex ever does
that currently, but better safe than sorry.
* Rather than starting up every external special remote program when
git-annex starts, they are started only on demand, when first used.
This will avoid slowdown, especially when running fast git-annex query
commands. Once started, they keep running until git-annex stops, currently,
which may not be ideal, but it's hard to know a better time to stop them.
* Bit of a chicken and egg problem with caching the cost of the remote,
because setting annex-cost in the git config needs the remote to already
be set up. Managed to finesse that.
This commit was sponsored by Lukas Anzinger.
Batch detection is heuristic, so can sometimes fail. I observed one such
failure while starting up in a repository with 87000 files. After the first
several batches of ~5000 files, it fell out of batch mode, and never
re-entered it, and so made many more commits of a few files at a time
than necessary.
So, let's always use batch mode when in the startup scan. This avoids the
heuristic there, at least.
There is clearly also room to improve the heuristic. Possibly 10 files is
too high a bar to be found during a commit, on a system that can commit
quickly.
Fixed up a number of things that had worked around there not being a way to
get that.
Most notably, transfer info files on windows now include the process id,
since no locking is currently done. This means the file format varies
between windows and unix.
Windows has a larger (unsigned) PID space, so cannot use the unix CInt
there.
Note that TransferInfo does not yet ever get the TransferPid populated,
as there is missing locking.
transferkeys had used special FDs for communication, but that would be
quite annoying to do in Windows.
Instead, use stdin and stdout. But, to avoid commands like rsync stomping
on them and messing up the communications channel, they're duplicated to a
different handle; stdin is replaced with a null handle, and stdout is
replaced with a copy of stderr. This should all work in windows too.
Stopping in progress transfers may work on windows.. if the types unify
anyway. ;) May need some more porting.
Fixes a test case I received where a corrupted repo was repaired, but the
git-annex branch was not. The root of the problem was that the
MissingObject returned by the repair code was not necessarily a complete
set of all objects that might have been deleted during the repair.
So, stop trying to return that at all, and instead make the index file
checking code explicitly verify that each object the index uses is present.
The assistant's commit code also always avoids git commit, for simplicity.
Indirect mode sync still does a git commit -a to catch unstaged changes.
Note that this means that direct mode sync no longer runs the pre-commit
hook or any other hooks git commit might call. The git annex pre-commit
hook action for direct mode is however explicitly run. (The assistant
already ran git commit with hooks disabled, so no change there.)
This works around horribleness in the Mavericks cpp, which falls over on
the #if when configure is running. Moving it avoids the file being built at
that point.
But it's also a location that makes sense..
The Upgrader avoids checking for upgrades on startup when it was just
upgraded. This avoids an upgrade loop if something goes wrong. One example
of something going wrong would be if the upgrade info file and the
distribution file get out of sync (or the distribution file is cached in
a proxy), so it thinks it has upgraded to a new version, but has really
not.
When an automatic upgrade completes, or when the user clicks on the upgrade
button in one webapp, but also has it open in another browser window/tab,
we have a problem: The current web server is going to stop running in
minutes, but there is no way to send a redirect to the web browser to the
new url.
To solve this, used long polling, so the webapp is always listening for
urls it should redirect to. This allows globally redirecting every open
webapp. Works great! Tested with 2 web browsers with 2 tabs each.
May be useful for other purposes later too, dunno.
The overhead is 2 http requests per page load in the webapp. Due to yesod's
speed, this does not seem to noticibly delay it. Only 1 of the requests
could possibly block the page load, the other is async.
Made alerts be able to have multiple buttons, so the alerts about upgrading
can have a button that enables automatic upgrades.
Implemented automatic upgrading when the program file has changed.
Note that when an automatic upgrade happens, the webapp displays an alert
about it for a few minutes, and then closes. This still needs work.
Not yet wired up to restart the assistant on upgrade; that needs careful
sanity checking to wait until the upgrade is done before restarting.
Used the DirWatcher here, so it gets events for any changes to the
directory containing the program file. (But not subdirs.) This is necessary
in order to detect when the file is renamed as part of the upgrade, which
an inotify on a single file would not detect. (Also, I have DirWatcher code,
but not FileWatcher code.)
Note that upgrades that remove or rename a whole directory tree containing
the executable will *not* trigger this code. So eg, deleting and replacing
the whole standalone tarball dir tree won't work -- but untarring it
over top will. So should dpkg package upgrades.
Added programPath, using a new GHC feature to find the full path to the
executable. The fallback code for old GHC or unsupported OS is less good;
its worst failure mode would be either failing to find the program, and so
not checking for upgrades, or finding a git-annex that's in PATH, but is
not the one running.
This commit was sponsored by John Roepke.