Joey Hess 2012-10-23 15:47:36 -04:00
parent e5a08bef51
commit e66bcec263
2 changed files with 77 additions and 0 deletions


@@ -0,0 +1,50 @@
Time to solve the assistant's [[cloud]] notification problem. This is
really the last big road-bump to making it able to sync computers
across the big, bad internet.
So, IRC still seems a possibility, but I'm going to try XMPP first. Since
Google Talk uses XMPP, it should be easy for users to get an account, and
it's also easy to run your own XMPP server.
Played around with the Haskell XMPP library. Clint helpfully showed me an
example of a simple client, which helped cut through that library's thicket
of data types. In short order I had some clients that were able to see each
other connecting to my personal XMPP server. On to some design..
1. I want to avoid user-visible messages.
(dvcs-autosync also uses XMPP, but I checked the code and it
seems to send user-visible messages, so I diverge from its lead here.)
This seems very possible, only a matter of finding the right
way to use the protocol, or an appropriate and widely deployed extension.
The only message I need to send with XMPP, really, is "I have pushed to our
git repo". One bit of data would do; being able to send a UUID of the repo
that did the update would be nice.
2. I'd also like to broadcast my notification to a user's buddies.
dvcs-autosync sends only self-messages, but that requires every node
have the same XMPP account configured. While I want to be able to run
in that mode, I also want to support pairs of users who have their own XMPP
accounts, that are buddied up in XMPP.
3. To add to the fun, the assistant's use of XMPP should not make that XMPP
account appear active to its buddies. Users should not need a dedicated
XMPP account for git-annex, and always seeming to be available when
git-annex is running would not be nice.
The first method I'm trying out is to encode the notification
data inside a XMPP presence broadcast. This should meet all three
requirements. The plan is to send two
presence messages: the first marks the client as available, and the second
marks it as unavailable again.
The "id" attribute will be set to some
value generated by the assistant. That attribute is allowed on presence
messages, and servers are [required to preserve it](http://xmpp.org/rfcs/rfc6121.html#presence-probe-inbound-id)
while the client is connected.
(I'd only send unavailable messages, but while
that worked when I tested it using the Prosody server, with Google Talk,
repeated unavailable messages were suppressed. Also, Google Talk does not
preserve the "id" attribute of unavailable presence messages.)
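
As a rough illustration of the presence pair described above, here is a minimal Python sketch that only builds and parses the stanzas; it does not talk to a server. The function names and the choice of a random hex token for the "id" value are assumptions made for this example, not what the assistant (which is written in Haskell) actually does.

```python
import uuid
import xml.etree.ElementTree as ET


def make_notification_pair(push_id=None):
    """Build the available/unavailable presence pair sharing one id.

    push_id is a hypothetical opaque token; a random one is generated
    when none is given.
    """
    if push_id is None:
        push_id = uuid.uuid4().hex
    # An available presence stanza carries no "type" attribute.
    available = ET.Element("presence", {"id": push_id})
    unavailable = ET.Element("presence", {"id": push_id, "type": "unavailable"})
    return (
        ET.tostring(available, encoding="unicode"),
        ET.tostring(unavailable, encoding="unicode"),
    )


def extract_push_id(stanza_xml):
    """Return the id of a presence stanza, or None for anything else."""
    elem = ET.fromstring(stanza_xml)
    # Drop a namespace prefix such as "{jabber:client}" if one is present.
    tag = elem.tag.rsplit("}", 1)[-1]
    if tag != "presence":
        return None
    return elem.get("id")
```

A receiving client would call `extract_push_id` on each incoming presence stanza and treat a previously unseen id as "someone pushed". Since Google Talk drops the id on unavailable presence, only the available half of the pair can be relied on to carry it.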
If this presence hackery doesn't work out, I could try
[XEP-0163: Personal Eventing Protocol](http://xmpp.org/extensions/xep-0163.html).
But I like not relying on any extensions.


@@ -44,6 +44,33 @@ the assistant will transfer the file from the cloud to Bob.
* pubsubhubbub does not seem like an option; its hubs want to pull down
a feed over http.
### jabber TODO
* test with big servers, eg google chat
* Prevent idle disconnection. Probably means sending or receiving pings,
but would prefer to avoid eg pinging every 60 seconds as some clients do.
* Make the git-annex clients invisible, so a user can use their regular
account without always seeming to be present when git-annex is logged in.
See <http://xmpp.org/extensions/xep-0126.html>
### jabber security
Any data git-annex sends over XMPP will be visible to the XMPP
account's buddies, to the XMPP server, and quite likely to other interested
parties. So it's important to consider the security exposure of using it.
If git-annex sends only a single bit notification, this lets attackers know
when the user is active and changing files, although the assistant's other
syncing activities can somewhat mask this.
As soon as git-annex does anything unlike any other client, an attacker can
see how many clients are connected for a user, fingerprint the ones
running git-annex, and so determine how many of them are running git-annex.
If git-annex sent the UUID of the remote it pushed to, this would let
attackers determine how many different remotes are being used,
and map some of the connections between clients and remotes.
## storing git repos in the cloud
Of course, one option is to just use github etc to store the git repo.