However, I don't yet have a reliable way to deal with files being modified
while they're being transferred. I have code that detects it on the sending
side, but the receiver is still free to move the wrong content into its
annex, and record that it has the content. So that's not acceptable, and
I'll need to work on it some more.
However, at this point I can use a direct mode repository as a remote and
transfer files from and to it.
Aka solve the github problem.
Note that it's possible the initial configlist will fail for some network
reason etc, and then the fetch succeeds. In this case, a usable remote gets
disabled. But it does print a message, and this only happens once per
remote, so that seems ok.
When a transfer fails, the progress info can be used to intelligently
retry it. If the transfer managed to make some progress, but did not
fully complete, then there's a good chance that a retry will finish it
(or at least make more progress).
cp is used here, but we can just watch the size of the destination file
This commit made from within the ruins of an old mill, overlooking a
beautiful waterfall.
Current implementation parses rsync's output a character a time, which
is hardly efficient. It could be sped up a lot by using hGetBufSome,
but that would require going really lowlevel, down to raw C style buffers
(good example of that here: http://users.aber.ac.uk/afc/stricthaskell.html)
But rsync doesn't output very much, so currently it seems ok.
Transfer info files are updated when the callback is called, updating
the number of bytes transferred.
Left unused p variables at every place the callback should be used.
Which is rather a lot..
Turns out that recvkey already does this same check. This avoids a transfer
file being created for the download that never happened, which in turn
will avoid the assistant seeing that the download has finished, when no
transfer actually took place.
Currently only the web special remote is readonly, but it'd be possible to
also have readonly drives, or other remotes. These are handled in the
assistant by only downloading from them, and never trying to upload to
them.
This commit includes a paydown on technical debt incurred two years ago,
when I didn't know that it was bad to make custom Read and Show instances
for types. As the routes need Read and Show for Transfer, which includes a
Key, and deriving my own Read instance of key was not practical,
I had to finally clean that up.
So the compact Key read and show functions are now file2key and key2file,
and Read and Show are now derived instances.
Changed all code that used the old instances, compiler checked.
(There were a few places, particularly in Command.Unused, and the test
suite where the Show instance continue to be used for legitimate
comparisons; ie show key_x == show key_y (though really in a bloom filter))
Make Utility.Process wrap the parts of System.Process that I use,
and add debug logging to them.
Also wrote some higher-level code that allows running an action
with handles to a processes stdin or stdout (or both), and checking
its exit status, all in a single function call.
As a bonus, the debug logging now indicates whether the process
is being run to read from it, feed it data, chat with it (writing and
reading), or just call it for its side effect.
Test suite now passes with -threaded!
I traced back all the hangs with -threaded to System.Cmd.Utils. It seems
it's just crappy/unsafe/outdated, and should not be used. System.Process
seems to be the cool new thing, so converted all the code to use it
instead.
In the process, --debug stopped printing commands it runs. I may try to
bring that back later.
Note that even SafeSystem was switched to use System.Process. Since that
was a modified version of code from System.Cmd.Utils, it needed to be
converted too. I also got rid of nearly all calls to forkProcess,
and all calls to executeFile, which I'm also doubtful about working
well with -threaded.
This *almost* works.
Along the way, I noticed that the --uuid parameter was being accidentially
passed after the --, so that has never been actually used by
git-annex-shell to verify it's running in the expected repository. Oops. Fixed.
In order to record a semi-useful filename associated with the key,
this required plumbing the filename all the way through to the remotes'
storeKey and retrieveKeyFile.
Note that there is potential for deadlock here, narrowly avoided.
Suppose the repos are A and B. A sends file foo to B, and at the same
time, B gets file foo from A. So, A locks its upload transfer info file,
and then locks B's download transfer info file. At the same time,
B is taking the two locks in the opposite order. This is only not a
deadlock because the lock code does not wait, and aborts. So one of A or
B's transfers will be aborted and the other transfer will continue.
Whew!
Not including such remotes turned out to have other consequences,
including annex-truselevel git config being ignored. Instead, add guards
before each operation that might try to operate on such a repo.
Prelude.undefined error message was introduced by
bb4f31a0ee.
It seems best to filter out local repositories that cannot be accessed
from the list of remotes, rather than keeping them in and making every
thing that uses the list have to deal with remotes that may have an unknown
location.
Besides fixing the error message, this also makes unavailable local
remotes' names not be shown in various messages, including in git annex
status output.
Also, move --to an unavailable local repository now avoids some ugly
errors like "changeWorkingDirectory: does not exist".
This was shown redundantly for a tricky reason -- while it runs
inside a doSideAction block that would appear to supress it,
the action being run is in a different state monad; for the remote,
and so the suppression doesn't work.
Always suppressing the message when committing to a local remote is
ok do to though -- it mirrors the /dev/nulling of the git annex shell commit
output. And it turns out that any time there is a git-annex branch state
change to commit on the remote, the local repo has also had a similar
change made, and so the message has been shown already.
The environment needs to override git-config. Changed when git config is
read, and avoid rereading it once it's been read.
chdir for both worktree settings.
getConfig got a remote-specific config, and this confusing name caused it
to be used a couple of places that only were interested in global configs.
Rename to getRemoteConfig and make getConfig only get global configs.
There are no behavior changes here, but remote.<name>.annex-web-options
never actually worked (and per-remote web options is a very unlikely to be
useful case so I didn't make it work), so fix the documentation for it.
openSUSE patches rsync with a patch adding SIP protocol support.
https://gist.github.com/2026167
With this patch, running rsync with no hostname parameter is apparently
supposed to list SIP hosts on the network. Practically, it does nothing
and exits 0.
git-annex uses rsync in a very special way to allow git-annex-shell to be
run on the remote host, and so did not need to specify a hostname, or a
file to transfer as a rsync parameter. So it sent ":", a degenerate case of
"host:file".
But the patch cannot differentiate ":" with no host parameter
(a bug in the SIP patch surely).
Results were that getting files failed, as rsync seemed to succeed, but the
requested file failed to arrive. Also I think that sending files will
make git-annex think a file has been transferred to the remote when
really rsync does nothing.
The workaround for this buggy rsync patch is to use "dummy:" as the
hostname.
Added Annex.cleanup, which is a general purpose interface for adding
actions to run at the end.
Remotes with the old git-annex-shell will commit every time, and have no
commit command, so hide stderr when running the commit command.
If there's no Content-Length, or the key has no size, this check is not
done, but it should happen most of the time, and protect against web
content that has changed.
Done by adding a oneshot mode, in which location log changes are written to
the journal, but not committed. Taking advantage of git-annex's existing
ability to recover in this situation.
This is used by git-annex-shell and other places where changes are made to
a remote's location log.
For a local git remote, can symlink the file.
For a git remote using rsync, can preseed any local content.
There are a few reasons to use fsck --from on a normal git remote.
One is if it's using gitosis or similar, and you don't have shell access
to run git annex locally. Another reason could be if you just want to
fsck certian files of a bare remote.