diff --git a/doc/clusters.mdwn b/doc/clusters.mdwn
new file mode 100644
index 0000000000..adcbea9cb1
--- /dev/null
+++ b/doc/clusters.mdwn
@@ -0,0 +1,98 @@
+A git-annex repository can provide access to its remotes as nodes of a
+cluster. This allows other repositories to access the cluster as a single
+logical repository.
+
+[[!toc ]]
+
+## using a cluster
+
+For example, a remote "bigserver" that is configured as a cluster will
+make available an additional remote "bigserver-mycluster", as well as a
+remote for each node, e.g. "bigserver-node1", "bigserver-node2", etc.
+
+The user can get files from the cluster without caring which node they come
+from:
+
+    $ git-annex get foo --from bigserver-mycluster
+    get foo (from bigserver-mycluster...) ok
+
+And the user can send files to the cluster, without caring which nodes
+they are stored on:
+
+    $ git-annex move bar --to bigserver-mycluster
+    move bar (to bigserver-mycluster...) ok
+
+In fact, a single upload can be sent to every node of the cluster at once.
+
+    $ git-annex whereis bar
+    whereis bar (3 copies)
+        acae2ff6-6c1e-8bec-b8b9-397a3755f397 -- my cluster [bigserver-mycluster]
+        9f514001-6dc0-4d83-9af3-c64c96626892 -- node 1 [bigserver-node1]
+        d81e0b28-612e-4d73-a4e6-6dabbb03aba1 -- node 2 [bigserver-node2]
+        5657baca-2f11-11ef-ae1a-5b68c6321dd9 -- node 3 [bigserver-node3]
+
+Notice that the file is shown as present in the cluster, as well as on
+individual nodes. But the cluster itself does not count as a copy of the file,
+so the 3 copies are the copies on individual nodes.
+
+Most other git-annex commands that operate on repositories can also operate on
+clusters.
+
+Clusters can only be accessed via ssh.
+
+## configuring a cluster
+
+A new cluster first needs to be initialized. Run [[git-annex-initcluster]] in
+the repository that will serve the cluster to clients. In the example above,
+this was the "bigserver" repository.
+
+    $ git-annex initcluster mycluster
+
+Once a cluster is initialized, the next step is to add nodes to it.
+To make a remote a node of the cluster, use `git config` to set
+`remote.name.annex-cluster-node` (where `name` is the remote's name)
+to the name of the cluster.
+
+In the example above, the three cluster nodes were configured like this:
+
+    $ git remote add node1 /media/disk1/repo
+    $ git remote add node2 /media/disk2/repo
+    $ git remote add node3 /media/disk3/repo
+    $ git config remote.node1.annex-cluster-node mycluster
+    $ git config remote.node2.annex-cluster-node mycluster
+    $ git config remote.node3.annex-cluster-node mycluster
+
+Finally, run `git-annex updatecluster` to record the cluster configuration
+in the git-annex branch. That tells other repositories about the cluster.
+
+    $ git-annex updatecluster
+    Added node node1 to cluster: mycluster
+    Added node node2 to cluster: mycluster
+    Added node node3 to cluster: mycluster
+    Started proxying for node1
+    Started proxying for node2
+    Started proxying for node3
+
+## preferred content of clusters
+
+The preferred content of the cluster can be configured. This tells
+users what files the cluster as a whole should contain.
+
+To configure the preferred content of a cluster, as well as other related
+things like [[groups|git-annex-group]] and [[required_content]], it's easiest
+to do the configuration in a repository that has the cluster as a remote.
+
+For example:
+
+    $ git-annex wanted bigserver-mycluster standard
+    $ git-annex group bigserver-mycluster archive
+
+By default, when a file is uploaded to a cluster, it is stored on every node of
+the cluster. To control which nodes to store to, the [[preferred_content]] of
+each node can be configured.
+
+If the preferred content configuration of the nodes makes none of them
+want a copy of a file, the upload to the cluster will fail. That is done to
+avoid git-annex picking an arbitrary node. However, the user can bypass the
+cluster and send content to any individual node, even if it's not preferred
+content of that node.
diff --git a/doc/git-annex-initcluster.mdwn b/doc/git-annex-initcluster.mdwn
index 7b9c9cb7f4..c8916564c1 100644
--- a/doc/git-annex-initcluster.mdwn
+++ b/doc/git-annex-initcluster.mdwn
@@ -8,37 +8,11 @@ git-annex initcluster name [description]
# DESCRIPTION
-A git-annex repository can provide access to its remotes as a unified
-cluster. This allows other repositories to access the cluster as a remote,
-with uploads and downloads distributed amoung the nodes of the cluster,
-according to their preferred content settings.
-
This command initializes a new cluster with the specified name. If no
description is provided, one will be set automatically.
-Once a cluster is initialized, the next step is to add nodes to it.
-To make a remote be a node of the cluster, configure
-`git config remote.name.annex-cluster-node`, setting it to the
-name of the cluster.
-
-Finally, run `git-annex updatecluster` to record the cluster configuration
-in the git-annex branch. That tells other repositories about the cluster.
-
-Example:
-
- git-annex initcluster mycluster
- git config remote.foo.annex-cluster-node mycluster
- git config remote.bar.annex-cluster-node mycluster
- git config remote.baz.annex-cluster-node mycluster
- git-annex updatecluster
-
-Suppose, for example, that remote "bigserver" has had those commands run in
-it. Then after pulling from "bigserver", git-annex will know about an
-additional remote, "bigserver-mycluster", which can be used like any other
-remote but is an interface to the cluster as a whole. The individual cluster
-nodes will also be proxied as remotes, eg "bigserver-foo".
-
-Clusters can only be accessed via ssh.
+The next step after running this command is to configure
+the cluster, then run [[git-annex-updatecluster]].
# OPTIONS
@@ -51,6 +25,8 @@ Clusters can only be accessed via ssh.
[[git-annex-preferred-content]](1)
[[git-annex-updateproxy]](1)
+
+
# AUTHOR
Joey Hess
diff --git a/doc/git-annex-updatecluster.mdwn b/doc/git-annex-updatecluster.mdwn
index 75bc6f41cf..ddbc968586 100644
--- a/doc/git-annex-updatecluster.mdwn
+++ b/doc/git-annex-updatecluster.mdwn
@@ -29,6 +29,8 @@ and run this command.
[[git-annex-initcluster]](1)
[[git-annex-updateproxy]](1)
+
+
# AUTHOR
Joey Hess
diff --git a/doc/links/key_concepts.mdwn b/doc/links/key_concepts.mdwn
index b1b037789c..0c2e1ddf74 100644
--- a/doc/links/key_concepts.mdwn
+++ b/doc/links/key_concepts.mdwn
@@ -5,3 +5,5 @@
* [[special_remotes]]
* [[workflows|workflow]]
* [[sync]]
+* [[preferred_content]]
+* [[clusters]]