As seen in this bug report, the lifted exception handling using the StateT
monad throws away state changes when an action throws an exception:
http://git-annex.branchable.com/bugs/git_annex_fork_bombs_on_gpg_file/
This can result in cached values being redundantly calculated, or other,
possibly worse, bugs when the annex state gets out of sync with reality.
This switches from a StateT AnnexState to a ReaderT (MVar AnnexState).
All changes to the state go via the MVar. So when an Annex action is
running inside an exception handler, and it makes some changes, they
immediately go into effect in the MVar. If it then throws an exception
(or even crashes its thread!), the state changes are still in effect.
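To illustrate the difference, here is a minimal sketch, not the actual
git-annex code. It assumes a hypothetical Counter type standing in for
AnnexState, and uses a plain try around the runner to stand in for the
lifted exception handling. All the names in it are made up for the example.

```haskell
import Control.Concurrent.MVar (MVar, modifyMVar_, newMVar, readMVar)
import Control.Exception (SomeException, throwIO, try)
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Reader (ReaderT, ask, runReaderT)
import Control.Monad.Trans.State.Strict (StateT, modify, runStateT)

-- Hypothetical stand-in for AnnexState: just a counter.
type Counter = Int

-- StateT version: change the state, then throw.
bumpThenFailS :: StateT Counter IO ()
bumpThenFailS = do
  modify (+ 1)
  liftIO (throwIO (userError "boom"))

-- The handler only has the state from before the action started,
-- so the (+ 1) is thrown away along with the failed action.
stateTResult :: IO Counter
stateTResult = do
  let s0 = 0
  r <- try (runStateT bumpThenFailS s0)
         :: IO (Either SomeException ((), Counter))
  return $ case r of
    Left _ -> s0
    Right (_, s) -> s

-- ReaderT (MVar Counter) version: the change goes via the MVar.
bumpThenFailR :: ReaderT (MVar Counter) IO ()
bumpThenFailR = do
  mv <- ask
  liftIO (modifyMVar_ mv (return . (+ 1)))
  liftIO (throwIO (userError "boom"))

-- The change is already in the MVar when the exception is thrown,
-- so it is still there after the exception has been caught.
readerTResult :: IO Counter
readerTResult = do
  mv <- newMVar 0
  _ <- try (runReaderT bumpThenFailR mv) :: IO (Either SomeException ())
  readMVar mv
```

In this sketch, stateTResult ends up with the original counter value,
while readerTResult sees the incremented one.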
The MonadCatchIO-transformers change is actually only incidental.
I could have kept on using lifted-base for the exception handling.
However, I'd have needed to write a new instance of MonadBaseControl
for the new monad.. and I didn't write the old instance.. I begged Bas
and he kindly sent it to me. Happily, MonadCatchIO-transformers is
able to derive a MonadCatchIO instance for my monad.
This is a deep level change. It passes the test suite! What could it break?
Well.. The most likely breakage would be to code that runs an Annex action
in an exception handler, and *wants* state changes to be thrown away.
Perhaps the state changes leave the state inconsistent, or wrong. Since
there are relatively few places in git-annex that catch exceptions in the
Annex monad, and the AnnexState is generally just used to cache calculated
data, this is unlikely to be a problem.
Oh yeah, this change also makes Assistant.Types.ThreadedMonad a bit
redundant. It's now entirely possible to run concurrent Annex actions in
different threads, all sharing access to the same state! The ThreadedMonad
just adds some extra work on top of that, with its own MVar, and avoids
such actions possibly stepping on one another's toes. I have not gotten
rid of it, but might try that later. Being able to run concurrent Annex
actions would simplify parts of the Assistant code.
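A rough sketch of what sharing state between threads via an MVar looks
like, using only base. The sharedCount function and its Int state are
assumptions for illustration, not anything in git-annex or the Assistant.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
  (modifyMVar_, newEmptyMVar, newMVar, putMVar, readMVar, takeMVar)
import Control.Monad (forM_, replicateM_)

-- Hypothetical example: several threads all update one shared state.
-- modifyMVar_ takes the MVar for the duration of each update, so
-- concurrent updates do not overwrite each other.
sharedCount :: Int -> Int -> IO Int
sharedCount threads perThread = do
  st <- newMVar (0 :: Int)
  done <- newEmptyMVar
  forM_ [1 .. threads] $ \_ -> forkIO $ do
    replicateM_ perThread (modifyMVar_ st (\n -> return $! n + 1))
    putMVar done ()
  -- Wait for every worker thread to signal that it has finished.
  replicateM_ threads (takeMVar done)
  readMVar st
```

Each individual update is atomic here, but a sequence of several updates
is not, which is presumably the kind of toe-stepping the ThreadedMonad
avoids.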
This fixes a bug with git annex add in direct mode. If some files already
existed in the tree pointing at the same key as a file that was just added,
and their content was not present, add neglected to copy the content to
those files.
I also changed the behavior of moveAnnex slightly: When content is moved
into the annex in direct mode, it does not overwrite any content already
present in direct mode files. That content may have been modified, after all.
Unless the request is for a repo uuid we already know. This way, if A1 pairs
with friend B1, and B1 pairs with device B2, then B1 can request A1 pair
with it and no confirmation is needed. (In future, may want to try to do
that automatically, to make a more robust network.)
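The rule could be sketched like this. The UUID and PairDecision types and
the decidePairRequest function are hypothetical, made up for illustration,
and are not the Assistant's real pairing code.

```haskell
import qualified Data.Set as S

-- Hypothetical types, for illustration only.
newtype UUID = UUID String deriving (Eq, Ord, Show)

data PairDecision = AutoAccept | AskUser deriving (Eq, Show)

-- A pair request involving a repo uuid we already know is accepted
-- without confirmation; anything else still goes to the user.
decidePairRequest :: S.Set UUID -> UUID -> PairDecision
decidePairRequest known requested
  | requested `S.member` known = AutoAccept
  | otherwise = AskUser
```

So in the example, A1 has B1's uuid in its known set after pairing with
it, and a later request for B1's uuid gets AutoAccept, while a request
for B2's uuid, which A1 has not seen yet, gets AskUser.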
Observed that the pushed refs were received, but not merged into master.
The merger never saw an add event for these refs. Either git is not writing
to a new file and renaming it into place, or the inotify code didn't notice
that. Changed it to also watch for modify events and that seems to have
fixed it!
(Except for the actual streaming of receive-pack through XMPP, which
can only run once we've gotten an appropriate uuid in a push initiation
message.)
Pushes are now only initiated when the initiation message comes from a
known uuid. This allows multiple distinct repositories to use the same xmpp
address.
Note: This probably breaks initial push after xmpp pairing, because at that
point we may not know about the paired uuid, and so reject the push from
it. It won't break in simple cases, because the annex-uuid of the remote
is checked. However, when there are multiple clients behind a single xmpp
address, only the uuid of the first is recorded in annex-uuid, and so any
pushes from the others will be rejected (unless the first remote pushes their
uuids to us beforehand).
Without this, a very large batch add has commits of sizes approx
5000, 2500, 1250, etc down to 10, and then starts over at 5000.
This fixes it so it's 5000+ every time.
That hook updates associated file bookkeeping info for direct mode.
But, everything already called addAssociatedFile when adding/changing a
file. It only needed to also call removeAssociatedFile when deleting a file,
or a directory.
This should make bulk adds faster, by some possibly significant amount.
Bulk removals may be a little slower, since it has to use catKeyFile now
on each removed file, but will still be faster than adds.
There's a tradeoff between making less frequent commits, and
needing to use memory to store all the changes that are coming
in. At 10 thousand, it needs 150 mb of memory. 5 thousand drops
that down to 90 mb or so.
This also turns out to have significant impact on total run time.
I benchmarked 10k changes taking 27 minutes. But two 5k batches
took only 21 minutes.
If an add failed, we should lose the KeySource, since it, presumably,
differs due to a change that was made to the file.
(The locked down file is already deleted.)
Turns out that a lot of the time spent in a bulk add was just updating the
add alert to rotate through each file that was added. Showing one alert
makes for a significant speedup.
Also, when the webapp is open, this makes it take quite a lot less cpu
during bulk adds.
Also, it lets the user know when a bulk add happened, which is sorta
nice..
This better handles error messages formatted for console display, by
adding a <br> after each line.
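Roughly, the idea is something like the following sketch. The
consoleToHtml name is made up, and this is not the webapp's actual code;
in particular it does no HTML escaping, which a real implementation would
need to handle.

```haskell
-- Hypothetical helper: follow each line of a console-formatted
-- message with a <br>, so line breaks survive in HTML.
consoleToHtml :: String -> String
consoleToHtml = concatMap (++ "<br>") . lines
```

For example, consoleToHtml "a\nb\n" gives "a<br>b<br>".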
Hmm, I wonder if it'd be worth pulling in a markdown formatter, and running
the messages through it?
In the case of the inotify limit warning, particularly, if it happens once
it will be happening repeatedly, and so combining alerts resulted in a
much too large alert message that took up a lot of memory and was too
large for the webapp to display.