* Bugfix: Remove leading \ from checksums output by sha*sum commands,
when the filename contains \ or a newline. Closes: #696384
* fsck: Still accept checksums with a leading \ as valid, now that
above bug is fixed.
* migrate: Remove leading \ in checksums
However, I don't yet have a reliable way to deal with files being modified
while they're being transferred. I have code that detects it on the sending
side, but the receiver is still free to move the wrong content into its
annex, and record that it has the content. So that's not acceptable, and
I'll need to work on it some more.
However, at this point I can use a direct mode repository as a remote and
transfer files from and to it.
* get/copy --auto: Transfer data even if it would exceed numcopies,
when preferred content settings want it.
* drop --auto: Fix dropping content when there are no preferred content
settings.
Files are now written to a tmp directory in the remote, and once all
chunks are written, etc, it's moved into the final place atomically.
For now, checkpresent still checks every single chunk of a file, because
the old method could leave partially transferred files with some chunks
present and others not.
Inject the required git-remote-xmpp into PATH when running xmpp git push.
Rest of the time it will not be in PATH, and git won't be able to talk to
xmpp remotes.
I now have this topology working:
assistant ---> {bare repo, special remote} <--- assistant
And, I think, also this one:
+----------- bare repo --------+
v v
assistant ---> special remote <--- assistant
While before with assistant <---> assistant connections, both sides got
location info updated after a transfer, in this topology, the bare repo
*might* get its location info updated, but the other assistant has no way to
know that it did. And a special remote doesn't record location info,
so transfers to it won't propigate out location log changes at all.
So, for these to work, after a transfer succeeds, the git-annex branch
needs to be pushed. This is done by recording a synthetic commit has
occurred, which lets the pusher handle pushing out the change (which will
include actually committing any still journalled changes to the git-annex
branch).
Of course, this means rather a lot more syncing action than happened
before. At least the pusher bundles together very close together pushes,
somewhat. Currently it just waits 2 seconds between each push.
I had been using -ignore-package monads-tf to deal with this, but
the XMPP library uses monads-tf, so that also ignores it. Instead,
use PackageImports to force use of mtl in my own code.
This was complicated quite a bit by needing to check numcopies. I optimised
that, so it only looks up numcopies once per file, no matter how many
remotes it checks to drop from. Although it did just occur to me that
it might be better to first check if it wants to drop content, and only
then check numcopies..
When rsyncProgress pipes rsync's stdout, this turns out to cause a ssh
process started by rsync to be left behind as a zombie. I don't know why,
but my recent zombie reaping cleanup was correct, it's just that this other
zombie, that's not directly started by git-annex, was no longer reaped
due to changes in the cleanup. Make rsyncProgress reap the zombie started
by rsync, as a workaround.
FWIW, the process tree looks like this. It seems like the rsync child
is for some reason starting but not waiting on this extra ssh process.
Ssh connection caching may be involved -- disabling it seemed to change
the shape of the tree, but did not eliminate the zombie.
9378 pts/14 S+ 0:00 | \_ rsync -p --progress --inplace -4 -e 'ssh' '-S' ...
9379 pts/14 S+ 0:00 | | \_ ssh ...
9380 pts/14 S+ 0:00 | | \_ rsync -p --progress --inplace -4 -e 'ssh' '-S' ...
9381 pts/14 Z+ 0:00 | \_ [ssh] <defunct>
If the autostart file lists a repository, for which a directory exists,
but there's not actually a valid git repo in there, the web app used to
try to use it, and see it wasn't valid, and then try to autostart again.
The ensuing runaway loop also ate memory, although not as fast as I was led
to belive was happening to someone on IRC yesterday. So that guy may have
had a different problem. But this seems otherwise a reasonable fit for the
circumstances described, if git-annex was started before something that
occurred during desktop login that made the repository available.
Both when queueing downloads, and uploads, consults the preferred content
settings.
I didn't make it check yet when requeing failed transfers or queuing
deferred downloads; dealing with the preferred content settings (or indeed,
other settings) changing while the assistant is running still needs work.