xmpp

2012-10-23 15:47:36 -04:00 · 2012-10-23 15:47:36 -04:00 · e66bcec263
commit e66bcec263
parent e5a08bef51
2 changed files with 77 additions and 0 deletions
--- a/doc/design/assistant/blog/day_112__and_now_for_something_completely_different.mdwn
+++ b/doc/design/assistant/blog/day_112__and_now_for_something_completely_different.mdwn
@@ -0,0 +1,50 @@
+Time to solve the assistant's [[cloud]] notification problem. This is
+really the last big road-bump to making it able to sync computers
+across the big, bad internet.
+
+So, IRC still seems a possibility, but I'm going to try XMPP first. Since
+Google Talk uses XMPP, it should be easy for users to get an account, and
+it's also easy to run your own XMPP server.
+
+Played around with the Haskell XMPP library. Clint helpfully showed me an
+example of a simple client, which helped cut through that library's thicket
+of data types. In short order I had some clients that were able to see each
+other connecting to my personal XMPP server. On to some design.
+
+1. I want to avoid user-visible messages.
+   (dvcs-autosync also uses XMPP, but I checked the code and it
+   seems to send user-visible messages, so I diverge from its lead here.)
+   This seems very possible, only a matter of finding the right
+   way to use the protocol, or an appropriate and widely deployed extension.
+   The only message I need to send with XMPP, really, is "I have pushed to our
+   git repo". One bit of data would do; being able to send a UUID of the repo
+   that did the update would be nice.
+
+2. I'd also like to broadcast my notification to a user's buddies.
+   dvcs-autosync sends only self-messages, but that requires every node
+   have the same XMPP account configured. While I want to be able to run
+   in that mode, I also want to support pairs of users who have their own XMPP
+   accounts, that are buddied up in XMPP.
+
+3. To add to the fun, the assistant's use of XMPP should not make that XMPP
+   account appear active to its buddies. Users should not need a dedicated
+   XMPP account for git-annex, and always seeming to be available when
+   git-annex is running would not be nice.
+
+The first method I'm trying out is to encode the notification
+data inside an XMPP presence broadcast. This should meet all three
+requirements. The plan is to send two
+presence messages: the first marks the client as available, and the second
+marks it as unavailable again.
+The "id" attribute will be set to some
+value generated by the assistant. That attribute is allowed on presence
+messages, and servers are [required to preserve it](http://xmpp.org/rfcs/rfc6121.html#presence-probe-inbound-id)
+while the client is connected.
+(I'd rather send only unavailable messages. That worked when I tested
+it with the Prosody server, but Google Talk suppressed repeated
+unavailable messages. Also, Google Talk does not
+preserve the "id" attribute of unavailable presence messages.)
+
+If this presence hackery doesn't work out, I could try
+[XEP-0163: Personal Eventing Protocol](http://xmpp.org/extensions/xep-0163.html).
+But I like not relying on any extensions.
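The presence approach described in the post above can be illustrated with a minimal sketch. It is not the assistant's code and does not use the Haskell XMPP library the post mentions; it only builds and parses the two stanzas with Python's standard library, and leaves out the connection, authentication, and roster handling. The function names `make_notification_pair` and `extract_token` are hypothetical, and the sketch assumes the token is read only from the available stanza, since the post notes that Google Talk does not preserve the "id" attribute of unavailable presence.

```python
import xml.etree.ElementTree as ET


def make_notification_pair(token):
    """Build the (available, unavailable) presence stanzas as XML strings.

    Both carry the assistant-generated token in the "id" attribute.
    In XMPP, available presence has no "type" attribute, while
    unavailable presence has type="unavailable".
    """
    available = ET.Element("presence", {"id": token})
    unavailable = ET.Element("presence", {"id": token, "type": "unavailable"})
    return (
        ET.tostring(available, encoding="unicode"),
        ET.tostring(unavailable, encoding="unicode"),
    )


def extract_token(stanza_xml):
    """Return the "id" of an available presence stanza, or None otherwise."""
    elem = ET.fromstring(stanza_xml)
    # Drop any namespace prefix such as "{jabber:client}" from the tag name.
    local_name = elem.tag.rsplit("}", 1)[-1]
    if local_name != "presence":
        return None
    # Assumption: ignore anything with a "type" (unavailable, probe, ...),
    # because only the available stanza is relied on to carry the id.
    if elem.get("type") is not None:
        return None
    return elem.get("id")
```

A sender would pass a freshly generated token to `make_notification_pair` and emit both stanzas in order; a receiver would call `extract_token` on each incoming stanza and treat any non-`None` result as "a push happened".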
--- a/doc/design/assistant/cloud.mdwn
+++ b/doc/design/assistant/cloud.mdwn
@@ -44,6 +44,33 @@ the assistant will transfer the file from the cloud to Bob.
 * pubsubhubbub does not seem like an option; its hubs want to pull down
  a feed over http.

+### jabber TODO
+
+* Test with big servers, e.g. Google Talk.
+* Prevent idle disconnection. Probably means sending or receiving pings,
+  but would prefer to avoid e.g. pinging every 60 seconds as some clients do.
+* Make the git-annex clients invisible, so a user can use their regular
+  account without always seeming to be present when git-annex is logged in.
+  See <http://xmpp.org/extensions/xep-0126.html>
+
+### jabber security
+
+Any data git-annex sends over XMPP will be visible to the XMPP
+account's buddies, to the XMPP server, and quite likely to other interested
+parties. So it's important to consider the security exposure of using it.
+
+If git-annex sends only a single-bit notification, this lets attackers know
+when the user is active and changing files, although the assistant's other
+syncing activities can somewhat mask this.
+
+As soon as git-annex does anything unlike any other client, an attacker can
+see how many clients are connected for a user, fingerprint the ones
+running git-annex, and so determine how many clients are running git-annex.
+
+If git-annex sent the UUID of the remote it pushed to, this would let
+attackers determine how many different remotes are being used,
+and map some of the connections between clients and remotes.
+
 ## storing git repos in the cloud

 Of course, one option is to just use github etc to store the git repo.