I'm down to 9 places in the code that can produce unwaited for zombies.
Most of these are pretty innocuous, at least for now, are only
used in short-running commands, or commands that run a set of
actions and explicitly reap zombies after each one.
The one from Annex.Branch.files could be trouble later,
since both Command.Fsck and Command.Unused can trigger it,
and the assistant will be doing those eventally. Ditto the one in
Git.LsTree.lsTree, which Command.Unused uses.
The only ones currently affecting the assistant though, are
in Git.LsFiles. Several threads use several of those.
(And yeah, using pipes or ResourceT would be a less ad-hoc approach,
but I don't really feel like ripping my entire code base apart right
now to change a foundation monad. Maybe one of these days..)
Nearly everything that's reading from git is operating on a small
amount of output and has been switched to use that. Only pipeNullSplit
stuff continues using the lazy version that yields zombies.
This includes a full parser for the boolean expressions in the log,
that compiles them into Matchers. Those matchers are not used yet.
A complication is that matching against an expression should never
crash git-annex with an error. Instead, vicfg checks that the expressions
parse. If a bad expression (or an expression understood by some future
git-annex version) gets into the log, it'll be ignored.
Most of the code in Limit couldn't fail anyway, but I did have to make
limitCopies check its parameter first, and return an error if it's bad,
rather than erroring at runtime.
One note: Deleted lines are not currently parsed as config changes.
That makes sense for trust settings. It may make sense to support deleted
lines as a way to clear group settings.
Incomplete; I need to finish parsing and saving. This will also be used
for editing transfer control expresssions.
Removed the group display from the status output, I didn't really
like that format, and vicfg can be used to see as well as edit rempository
group membership.
Simplified it using existing functions.
I doubt setSticky needs to return the FileMode; if it does for some
reason, it can be changed to use modifyFileMode'
Converted isSticky to a pure function for consistency with isSymlink.
Note that the sticky bit of a file can be tested thus:
isSticky . fileMode <$> getFileStatus file
Fix resuming of downloads, which do not have a transfer info file to read.
When checking upload progress, use the MVar, rather than re-reading
the info file.
Catch exceptions in the transfer action. Required a tryAnnex.
When a transfer fails, the progress info can be used to intelligently
retry it. If the transfer managed to make some progress, but did not
fully complete, then there's a good chance that a retry will finish it
(or at least make more progress).
Transfer info files are updated when the callback is called, updating
the number of bytes transferred.
Left unused p variables at every place the callback should be used.
Which is rather a lot..
Don't expose these as branches in refs/heads/. Instead hide them away in
refs/synced/ where only show-ref will find them.
Make unused only look at branches and tags, not these other things,
so it won't care if some stale sync ref used to use a file.
This means they don't need to be deleted, which could have
led to an incoming sync being missed.
This fixes a problem I was seeing in the assistant where two remotes would
attempt to sync with one another at the same time, and both failed pushing
the diverged git-annex branch. Then when both tried to resolve the failed
push, they each modified their git-annex branch, which again each blocked
the other from pushing into it. The result was that the git-annex
branches were perpetually diverged (despite having the same content!) and
once the assistant fell into this trap, it couldn't get out and always
had to do the slow push/fail/pull/merge/push/fail cycle.
They work fine. But I had to go to a lot of trouble to get Yesod to render
routes in a pure function. It may instead make more sense to have each
alert have an assocated IO action, and a single route that runs the IO
action of a given alert id. I just wish I'd realized that before the past
several hours of struggling with something Yesod really doesn't want to
allow.
Used by the assistant, rather than copy, this is faster because it avoids
using git ls-files, avoids checking the location log redundantly, and
runs in oneshot mode, avoiding making a commit to the git-annex branch
for every file transferred.
Found a very cheap way to determine when a disconnected remote has
diverged, and has new content that needs to be transferred: Piggyback on
the git-annex branch update, which already checks for divergence.
However, this does not check if new content has appeared locally while
disconnected, that should be transferred to the remote.
Also, this does not handle cases where the two git repos are in sync,
but their content syncing has not caught up yet.
This code could have its efficiency improved:
* When multiple remotes are synced, if any one has diverged, they're
all queued for transfer scans.
* The transfer scanner could be told whether the remote has new content,
the local repo has new content, or both, and could optimise its scan
accordingly.
This commit includes a paydown on technical debt incurred two years ago,
when I didn't know that it was bad to make custom Read and Show instances
for types. As the routes need Read and Show for Transfer, which includes a
Key, and deriving my own Read instance of key was not practical,
I had to finally clean that up.
So the compact Key read and show functions are now file2key and key2file,
and Read and Show are now derived instances.
Changed all code that used the old instances, compiler checked.
(There were a few places, particularly in Command.Unused, and the test
suite where the Show instance continue to be used for legitimate
comparisons; ie show key_x == show key_y (though really in a bloom filter))
Avoid crashing when "git annex get" fails to download from one location,
and falls back to downloading from a second location.
The problem is that git annex get calls download recursively from within
itself if the first download attempt fails. So the first time through, it
writes a transfer info file, which is then overwritten on the second,
recursive call. Then on cleanup, it tries to delete the file twice, which
of course doesn't work.
Fixed both by not crashing if the transfer file is removed, and by
changing Get to not run download recursively like that. It's the only
thing that did so, and it just seems like a bad idea.
git annex assistant --autostart will start separate daemons in each
listed autostart repo
running the webapp outside any git-annex repo will open it on the
first listed autostart repo
This prevents multiple runs of the assistant in the foreground, and lets
--stop stop foregrounded runs too.
The webapp firstrun case also now writes a pid file, once it's made the git
repo to put it in.
This avoids forking another process, avoids polling, fixes a race,
and avoids a rare forkProcess thread hang that I saw once time
when starting the webapp.
Make Utility.Process wrap the parts of System.Process that I use,
and add debug logging to them.
Also wrote some higher-level code that allows running an action
with handles to a processes stdin or stdout (or both), and checking
its exit status, all in a single function call.
As a bonus, the debug logging now indicates whether the process
is being run to read from it, feed it data, chat with it (writing and
reading), or just call it for its side effect.
Test suite now passes with -threaded!
I traced back all the hangs with -threaded to System.Cmd.Utils. It seems
it's just crappy/unsafe/outdated, and should not be used. System.Process
seems to be the cool new thing, so converted all the code to use it
instead.
In the process, --debug stopped printing commands it runs. I may try to
bring that back later.
Note that even SafeSystem was switched to use System.Process. Since that
was a modified version of code from System.Cmd.Utils, it needed to be
converted too. I also got rid of nearly all calls to forkProcess,
and all calls to executeFile, which I'm also doubtful about working
well with -threaded.
This *almost* works.
Along the way, I noticed that the --uuid parameter was being accidentially
passed after the --, so that has never been actually used by
git-annex-shell to verify it's running in the expected repository. Oops. Fixed.
Not yet tested and places git-annex-shell is run need to be modified to
pass the new field settings.
Note that rsyncServerSend was changed to fork, rather than directly exec
rsync, because it needs to keep the transfer lock held, and clean up the
transfer log when done.
In order to record a semi-useful filename associated with the key,
this required plumbing the filename all the way through to the remotes'
storeKey and retrieveKeyFile.
Note that there is potential for deadlock here, narrowly avoided.
Suppose the repos are A and B. A sends file foo to B, and at the same
time, B gets file foo from A. So, A locks its upload transfer info file,
and then locks B's download transfer info file. At the same time,
B is taking the two locks in the opposite order. This is only not a
deadlock because the lock code does not wait, and aborts. So one of A or
B's transfers will be aborted and the other transfer will continue.
Whew!
Note this is per-remote, so trying to get the same file from multiple
remotes can still let duplicate downloads run. (And uploading the same file
to multiple remotes is not duplicate at all of course.)
get, move, and copy are the only git-annex subcommands that transfer
files, but there's still git-annex-shell recvkey and sendkey to deal with too.
I considered modifying retrieveKeyFile or getViaTmp, but they are called
by other code that does not involve expensive file transfers (migrate)
or that does file transfers that should not be checked by this (fsck --from).