Fixed the assistant to wait on all the zombie processes that would sometimes
pile up. I didn't realize this was as bad as it was.

Zombies and git-annex have been a problem since I started developing it,
because back then I made some rather poor choices, due to barely knowing
how to write Haskell. So parts of the code that stream input from git commands
don't clean up after them properly. Not normally a problem, because
git-annex reaps the zombies after each file it processes. But this reaping
is not thread-safe; it cannot be used in the assistant.

If I were starting git-annex today, I'd use one of the new Haskell things like
Conduits, that allow for very clean control over finalization of resources.
But switching it to Conduits now would probably take weeks of work; I've not
yet felt it was worthwhile. (Also it's not clear Conduits are the last,
best thing.)

For now, git-annex keeps track of the pids it needs to wait on, and all the code
run by the assistant is zombie-free. However, some code for fsck and unused
that I anticipate the assistant using eventually still has some lurking
zombies.

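The pid-tracking idea can be sketched roughly as follows. This is only an illustration in Python, not git-annex's Haskell code: `ChildTracker`, `spawn`, and `reap` are hypothetical names, and the child command is a stand-in for a git command whose output gets streamed. The point is that each child is recorded under a lock and waited on exactly once, so reaping is safe to call while several threads spawn processes.

```python
import subprocess
import sys
import threading


class ChildTracker:
    """Remembers spawned children so each one is waited on exactly once."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = []

    def spawn(self, argv):
        """Start a child process and record it for later reaping."""
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
        with self._lock:
            self._pending.append(proc)
        return proc

    def reap(self):
        """Wait on every recorded child; return how many were reaped."""
        # Swap the list out under the lock so two threads can never
        # end up waiting on the same child.
        with self._lock:
            children, self._pending = self._pending, []
        for proc in children:
            proc.stdout.close()
            proc.wait()
        return len(children)


tracker = ChildTracker()


def worker():
    # Stand-in for code that streams part of a command's output and
    # then walks away without cleaning up.
    proc = tracker.spawn([sys.executable, "-c", "print('hello')"])
    proc.stdout.readline()


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

reaped = tracker.reap()
```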
----

Solved the issue with preferred content expressions and dropping that
I mentioned yesterday. My solution was to add a parameter to specify a set
of repositories where content should be assumed not to be present. When
deciding whether to drop, the check can put the current repository in that
set, and then if the expression fails to match, the content can be dropped.

Using yesterday's example "(not copies=trusted:2) and (not in=usbdrive)",
when the local repo is one of the 2 trusted copies, the drop check will
see only 1 trusted copy, so the expression matches, and so the content will
not be dropped.

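A toy model of that drop check, as a minimal sketch in Python rather than the real Haskell implementation. The `Context` type, the combinators, `can_drop`, and the repository names `laptop`, `server`, and `backup` are all made up for illustration; `copies=trusted:2` is read here as "at least 2 trusted copies".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    locations: frozenset                    # repositories believed to hold the content
    trusted: frozenset                      # repositories counted as trusted
    assume_absent: frozenset = frozenset()  # repositories to treat as lacking it

    def present(self):
        return self.locations - self.assume_absent


def copies_trusted(n):
    # copies=trusted:N, read here as "at least N trusted copies"
    return lambda ctx: len(ctx.present() & ctx.trusted) >= n


def in_repo(name):
    return lambda ctx: name in ctx.present()


def not_(expr):
    return lambda ctx: not expr(ctx)


def and_(*exprs):
    return lambda ctx: all(e(ctx) for e in exprs)


# "(not copies=trusted:2) and (not in=usbdrive)"
wanted = and_(not_(copies_trusted(2)), not_(in_repo("usbdrive")))


def can_drop(expr, here, locations, trusted):
    # Pretend the current repository does not have the content; if the
    # expression then fails to match, the content is not wanted here.
    ctx = Context(frozenset(locations), frozenset(trusted), frozenset({here}))
    return not expr(ctx)


trusted = {"laptop", "server", "backup"}
two_copies = can_drop(wanted, "laptop", {"laptop", "server"}, trusted)
three_copies = can_drop(wanted, "laptop", {"laptop", "server", "backup"}, trusted)
```

With two trusted copies, one of them local, `two_copies` is `False`: the check sees only 1 trusted copy, the expression matches, and the content stays. With a third trusted copy elsewhere, `three_copies` is `True`.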
I've not tested my solution, but it type checks. :P I'll wire it up to
`get/drop/move --auto` tomorrow and see how it performs.

----

Would preferred content expressions be more readable if they were inverted
(becoming content filtering expressions)?

1. "(not copies=trusted:2) and (not in=usbdrive)" becomes
   "copies=trusted:2 or in=usbdrive"
2. "smallerthan=10mb and include=*.mp3 and exclude=junk/*" becomes
   "largerthan=10mb or exclude=*.mp3 or include=junk/*"
3. "(not group=archival) and (not copies=archival:1)" becomes
   "group=archival or copies=archival:1"

1 and 3 are improved, but 2, less so. It's a trifle weird for "include"
to mean "include in excluded content".

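The inversion in examples 1 and 3 is De Morgan's law applied mechanically: negate the whole expression and push the "not" inward. A small Python sketch of that rewrite, using a made-up tuple representation (`("term", ...)`, `("not", ...)`, `("and", ...)`, `("or", ...)`) rather than anything from git-annex itself:

```python
def invert(expr):
    """Negate an expression, pushing the 'not' inward with De Morgan's laws."""
    op = expr[0]
    if op == "not":
        return expr[1]                       # not (not x)  ==  x
    if op == "and":
        return ("or",) + tuple(invert(e) for e in expr[1:])
    if op == "or":
        return ("and",) + tuple(invert(e) for e in expr[1:])
    return ("not", expr)                     # a bare term


def show(expr, top=True):
    """Render an expression as text, parenthesizing nested and/or groups."""
    op = expr[0]
    if op == "term":
        return expr[1]
    if op == "not":
        return "not " + show(expr[1], top=False)
    text = (" " + op + " ").join(show(e, top=False) for e in expr[1:])
    return text if top else "(" + text + ")"


example1 = ("and",
            ("not", ("term", "copies=trusted:2")),
            ("not", ("term", "in=usbdrive")))
example3 = ("and",
            ("not", ("term", "group=archival")),
            ("not", ("term", "copies=archival:1")))

inverted1 = show(invert(example1))
inverted3 = show(invert(example3))
```

Here `inverted1` is `copies=trusted:2 or in=usbdrive` and `inverted3` is `group=archival or copies=archival:1`. Example 2 needs more than this: its terms are not negated to begin with, so the mechanical rewrite only yields a string of "not" terms, and getting to the form shown above also means swapping each term for its opposite (`include` for `exclude`, `smallerthan` for `largerthan`).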
The other reason not to do this is that currently the expressions
can be fed into `git annex find` on the command line, and it'll come
back with the files that would be kept.

Perhaps a middle ground is to make "dontwant" be an alias for "not".
Then we can write "dontwant (copies=trusted:2 or in=usbdrive)".

----

A user told me this:

> I can confirm that the assistant does what it is supposed to do really well. I
> just hooked up my notebook to the network and it starts syncing from notebook to
> fileserver and the assistant on the fileserver also immediately starts syncing
> to the [..] backup

That makes me happy; it's the first quite so real-world success report I've
heard.