This avoids forking another process, avoids polling, fixes a race,
and avoids a rare forkProcess thread hang that I saw once time
when starting the webapp.
Make Utility.Process wrap the parts of System.Process that I use,
and add debug logging to them.
Also wrote some higher-level code that allows running an action
with handles to a processes stdin or stdout (or both), and checking
its exit status, all in a single function call.
As a bonus, the debug logging now indicates whether the process
is being run to read from it, feed it data, chat with it (writing and
reading), or just call it for its side effect.
Test suite now passes with -threaded!
I traced back all the hangs with -threaded to System.Cmd.Utils. It seems
it's just crappy/unsafe/outdated, and should not be used. System.Process
seems to be the cool new thing, so converted all the code to use it
instead.
In the process, --debug stopped printing commands it runs. I may try to
bring that back later.
Note that even SafeSystem was switched to use System.Process. Since that
was a modified version of code from System.Cmd.Utils, it needed to be
converted too. I also got rid of nearly all calls to forkProcess,
and all calls to executeFile, which I'm also doubtful about working
well with -threaded.
This *almost* works.
Along the way, I noticed that the --uuid parameter was being accidentially
passed after the --, so that has never been actually used by
git-annex-shell to verify it's running in the expected repository. Oops. Fixed.
Not yet tested and places git-annex-shell is run need to be modified to
pass the new field settings.
Note that rsyncServerSend was changed to fork, rather than directly exec
rsync, because it needs to keep the transfer lock held, and clean up the
transfer log when done.
In order to record a semi-useful filename associated with the key,
this required plumbing the filename all the way through to the remotes'
storeKey and retrieveKeyFile.
Note that there is potential for deadlock here, narrowly avoided.
Suppose the repos are A and B. A sends file foo to B, and at the same
time, B gets file foo from A. So, A locks its upload transfer info file,
and then locks B's download transfer info file. At the same time,
B is taking the two locks in the opposite order. This is only not a
deadlock because the lock code does not wait, and aborts. So one of A or
B's transfers will be aborted and the other transfer will continue.
Whew!
Note this is per-remote, so trying to get the same file from multiple
remotes can still let duplicate downloads run. (And uploading the same file
to multiple remotes is not duplicate at all of course.)
get, move, and copy are the only git-annex subcommands that transfer
files, but there's still git-annex-shell recvkey and sendkey to deal with too.
I considered modifying retrieveKeyFile or getViaTmp, but they are called
by other code that does not involve expensive file transfers (migrate)
or that does file transfers that should not be checked by this (fsck --from).
While this word may be less familiar to some users, it avoids the
connotation that version 2 is better than version 1, which is wrong
when the two variants were conflicting.
Note that, since this always pushes branch synced/master to the remote, it
assumes that master has already gotten all the commits that are on the
remote merged in. Otherwise, fast-forward prevention may prevent the push.
That's probably ok, because the next stage is to automatically detect
incoming pushes and merge.
Kqueue needs to remember which files failed to be added due to being open,
and retry them. This commit gets the data in place for such a retry thread.
Broke KeySource out into its own file, and added Eq and Ord instances
so it can be stored in a Set.
Was decoding the git-cat-file of the symlink target as utf8, but that can't
do, unix filenames are from the 70's and need this shiny disco
fileSystemEncoding.