It used to not log to daemon.log when a repository was first created, and
when starting the webapp. Now both do. Redirecting stdout and stderr to the
log is tricky when starting the webapp, because the web browser may want to
communicate with the user. (Either a console web browser, or web.browser = echo)
This is handled by restoring the original fds when running the browser.
since some systems may have configuration problems or other issues that
prevent web browsers from connecting to the right localhost IP for the
webapp.
Tested on both ipv4 and ipv6 localhost. Url for the latter looks like:
http://[::1]:50676
The expensive scan uses lookupFile, but in direct mode, that doesn't work
for files that are present. So the scan was not finding things that are
present that need to be uploaded. (It did find things not present that
needed to be downloaded.)
Now lookupFile also works in direct mode. Note that it still prefers
symlinks on disk to info committed to git, in direct mode. This is
necessary to make things like Assistant.Threads.Watcher.onAddSymlink
work correctly, when given a new symlink not yet checked into git (or
replacing a file checked into git).
This way, once it switches to the new repo, the user can switch back to the
old one, and its menu will allow switching to the new again.
However, if there are multiple repos, the others don't yet learn about the
new repo.
Would like to also have restart UI, but that's rather harder to do,
seems it'd need to start another copy of the webapp, and redirect the
browser to its new url, but running two assistants in the same repo at
the same time isn't good.
Now there's a Config type, that's extracted from the git config at startup.
Note that laziness means that individual config values are only looked up
and parsed on demand, and so we get implicit memoization for all of them.
So this is not only prettier and more type safe, it optimises several
places that didn't have explicit memoization before. As well as getting rid
of the ugly explicit memoization code.
Not yet done for annex.<remote>.* configuration settings.
When a file is changed in direct mode, the old content is probably lost
(at least from the local repo), and bookeeping needs to be updated to
reflect this.
Also, synthetic add events are generated at assistant startup, so
make it detect when the file has not really changed, and avoid re-adding
it.
This does add the overhead of querying the runing git cat-file for the
key that's recorded in git for the file, each time a file is added or
modified in direct mode.
git add --update cannot be used, because it'll stage typechanged direct
mode files. Intead, use ls-files to find deleted files, and stage them
ourselves.
It seems that no commit was made before when the scan staged deleted files.
(Probably masked since if files were added, a commit happened then..)
Now that I'm doing the staging, I was also able to fix that bug.
This allows it to use Build.SysConfig to always install the programs
configure detected. Amoung other fixes, this ensures the right uuid
generator and checksum programs are installed.
I also cleaned up the handling of lsof's path; configure now checks for
it in PATH, but falls back to looking for it in sbin directories.
* get/copy --auto: Transfer data even if it would exceed numcopies,
when preferred content settings want it.
* drop --auto: Fix dropping content when there are no preferred content
settings.
It was doubly broken; both missing a slash, and containing
"runshell git-annex", while some parts of the code expected it to be a
simple path to a program. This appears to include the transfer queue
runner, and the code that starts a new assistant process when switching to
another repository in the webapp.
For no apparent reason, this version removes all useful instances of
ToJavaScript, leavind behind only an instance for Aeson.Value. Argh. Pissed
off at this arbitrary breaking change, and seriously considering dropping
this library.
Noticed that when pairing, sometimes both sides start to push, and the other
side sends a PushRequest, and the two deadlock, neither doing anything.
(Timeout eventually breaks this.) So, let both run at the same time.
This should help prevent git-annex clients receiving messages that
were intended for normal clients they're sharing the account with.
Changed XMPP protocol use to always send chat messages directed at the
specific client, as the negative priority blocks less directed messages.
I decided to use the fallback push mode from the beginning for XMPP, since
while it uses some ugly branches, it avoids the possibility of a normal
push failing, and needing to pull and re-push. Due to the overhead of XMPP,
and the difficulty of building such a chain of actions due to the async
implementation, this seemed reasonable.
It seems to work great!
My reasoning is that StartingPush could be received after another push
starts being received, and it would be better to respond to it afterwards
than not.
XMPP has no defined message size limits, but some servers will have ad-hoc
limits. However, 4k seems safe, even after the additional bloat of base64.
That should not exceed 8k.
Inject the required git-remote-xmpp into PATH when running xmpp git push.
Rest of the time it will not be in PATH, and git won't be able to talk to
xmpp remotes.
It might even work, although nothing yet triggers XMPP pushes.
Also added a set of deferred push messages. Only one push can run at a
time, and unrelated push messages get deferred. The set will never grow
very large, because it only puts two types of messages in there, that
can only vary in the client doing the push.
Maybe the spec allows it, but broadcasting self-directed presence info to
all buddies is just insane.
I had to bring back the IQ messages for self-pairing, while still using
directed presence for other pairing. Ugly.
Testing between Google Talk and prosody, the directed IQ messages
were not received. Google Talk probably only relays them between
clients using the same account.
I first tried even more directed presence, with each client JID being sent
a separate presence, but that didn't work on Google Talk, particularly
it was ignored when one client sent it to another client using the same
account.
So, presence directed at the user@host of the client to pair with. Tested
working between Google Talk and prosody (in both directions), as well
as between two clients with the same account on Google Talk, and
two clients with the same account on prosody.
Only problem with this form of directed presence is that if I also use it
for git pushes, more clients than are interested in a push's data will
receive it. So I may need some better approach, or a hybrid between
directed IQ and directed presence.
Amusingly, I am not really using xmpp ping for pairing. I forgot to put in
the ping tag! And when I did, it stopped working, on Google Talk. Seems
it handles client to client pings, at least using the same JID, without
actually sending them to the end client. My mistake avoided this,
and seems to work, so I've left it as-is for now, with just the git-annex
tag in an IQ message. Also tested on prosody.
Wrote a better git remote name sanitizer. Git blows up on lots of weird
stuff, especially if it starts the remote name, but I managed to get
some common punctuation working.
Still wait 1 minute after a change before waiting on the next change, but don't
wait at the start, when we might get a pull that contains config changes
right away.
Currently have three old versions of functions that more reworking is
needed to remove: getDaemonStatusOld, modifyDaemonStatusOld_, and
modifyDaemonStatusOld
This is a nice win; much less code runs in Annex, so other threads have
more chances to run concurrently.
I do notice that renaming a file has gone from 1 to 2 commits. I think this
is due to the above improvement letting the committer run more frequently,
so it commits the rm first.
Converted several threads to run in the monad.
Added a lot of useful combinators for working with the monad.
Now the monad includes the name of the thread.
Some debugging messages are disabled pending converting other threads.
I now have this topology working:
assistant ---> {bare repo, special remote} <--- assistant
And, I think, also this one:
+----------- bare repo --------+
v v
assistant ---> special remote <--- assistant
While before with assistant <---> assistant connections, both sides got
location info updated after a transfer, in this topology, the bare repo
*might* get its location info updated, but the other assistant has no way to
know that it did. And a special remote doesn't record location info,
so transfers to it won't propigate out location log changes at all.
So, for these to work, after a transfer succeeds, the git-annex branch
needs to be pushed. This is done by recording a synthetic commit has
occurred, which lets the pusher handle pushing out the change (which will
include actually committing any still journalled changes to the git-annex
branch).
Of course, this means rather a lot more syncing action than happened
before. At least the pusher bundles together very close together pushes,
somewhat. Currently it just waits 2 seconds between each push.
I am befuddled that Twitter Bootstrap has no built-in Icon for The Cloud,
and also that Chromium's depiction of CLOUD (U+2601) has an uncanny
resemblance to PILE OF POO (U+1F4A9) when rendered small, and looks like a
looming Frankenstorm when rendered large, and not a sweet, sunny, nothing
can go wrong The Cloud.
<http://www.fileformat.info/info/unicode/char/2601/browsertest.htm>
So, I must resort to irony in my choice of icons.
Adjust build deps to ensure that only a fixed version of the library will
be used.
Also, removed the bound thread stuff, which I now think was (probably)
a red herring.
MountWatcher can't do this, because it uses the session dbus,
and won't have access to the new DBUS_SESSION_BUS_ADDRESS if a new session
is started.
Bumped dbus library version, FD leak in it is fixed.
Currently relies on SRV being set, or the JID's hostname being the server
hostname and the port being default. Future work: Allow manual
configuration of user name, hostname, and port.
Now when the dbus connection is dropped, it'll fall back to polling.
I could make it try to reconnect, but there's a FD leak in the dbus
library, so not yet.
This *may* solve the segfault I was seeing when the XMPP library called
startTLS. My hypothesis is as follows:
* TLS is documented
(http://www.gnu.org/software/gnutls/manual/gnutls.html#Thread-safety)
thread safe, but only when a single thread accesses it.
* forkIO threads are not bound to an OS thread, so it was possible for
the threaded runtime to run part of the XMPP code on one thread, and
then switch to another thread later.
So, forkOS, with its bound threads, should be used for the XMPP thread.
Since the crash doesn't happen reliably, I am not yet sure about this fix.
Note that I kept all the other threads in the assistant unbound, because
bound threads have significantly higher overhead.
Seems presence notifications are not sent to clients that have marked
themselves unavailable. (Testing with google talk.)
This is the death knell for the presence hack, because it has to stay
available, and even the toggle to unavailable and back could cause it to
miss a notification. Still, flipped it so it basically works, for now.
Lacking error handling, reconnection, credentials configuration,
and doesn't actually do anything when it receives an incoming notification.
Other than that, it might work! :)
Hooked up everything that needs to notify on pushes. Note that
syncNewRemote does not notify. This is probably ok, and I'd need to thread
more state through to make it do so.
This is only set up to support a single push notification method; I didn't
use a NotificationBroadcaster. Partly because I don't yet know what info
about pushes needs to be communicated, so my data types are only
preliminary.
Monitors git-annex branch for changes, which are noticed by the Merger
thread whenever the branch ref is changed (either due to an incoming push,
or a local change), and refreshes cached config values for modified config
files.
Rate limited to run no more often than once per minute. This is important
because frequent git-annex branch changes happen when files are being
added, or transferred, etc.
A primary use case is that, when preferred content changes are made,
and get pushed to remotes, the remotes start honoring those settings.
Other use cases include propigating repository description and trust
changes to remotes, and learning when a remote has added a new special
remote, so the webapp can present the GUI to enable that special remote
locally.
Also added a uuid.log cache. All other config files already had caches.
This can result in the file being dropped, or being downloaded, or even
being dropped from some other repo.
It's even possible to create a file in a directory where content is not
wanted, which will make the assistant immediately send it elsewhere, and
then drop it.
This was complicated quite a bit by needing to check numcopies. I optimised
that, so it only looks up numcopies once per file, no matter how many
remotes it checks to drop from. Although it did just occur to me that
it might be better to first check if it wants to drop content, and only
then check numcopies..
None-bare removable drive repos don't have the assistant running in them,
so don't get their master branch updated as syncs come in. This will
probably change later, but for now, set up something that works.
Also, set the description of a newly added drive's repo locally. This
ensures that the repo edit form has the description in it.
This avoids the expensive transfer scan relying on its list of remotes
to scan being accurate throughout, which it will not be when the user
pauses syncing to a remote.
I feel it's ok to queue transfers to *any* known remote, not just the ones
being scanned.
Note that there are still small races where after syncing to a remote is
paused, a transfer can be queued for it. Not just in the expensive transfer
scan, but in the cheap failed transfer scan, and elsewhere.
Although I observe that these toggles don't always prevent syncing.
When a transfer scan is active, it will still queue items from the disabled
remote.
Also, transfers from a disabled remote show up as from "unknown", which is
not ideal.