update for exporttree=yes

Joey Hess 2024-08-08 15:51:36 -04:00
parent 727b6a0b6d
commit 0959bfe5d3
GPG key ID: DB12DB0FF05F8F38


@ -12,7 +12,7 @@ special remotes.
## using a cluster ## using a cluster
To use a cluster, your repository needs to have its gateway configured as a To use a cluster, your repository needs to have its gateway configured as a
remote. Clusters can currently only be accessed via ssh or by a annex+http remote. Clusters can currently only be accessed via ssh or by an annex+http
url. This gateway remote is added the same as any other git remote: url. This gateway remote is added the same as any other git remote:
$ git remote add bigserver me@bigserver:annex $ git remote add bigserver me@bigserver:annex
@ -105,7 +105,7 @@ In the example above, the three cluster nodes were configured like this:
$ git remote add node1 /media/disk1/repo $ git remote add node1 /media/disk1/repo
$ git remote add node2 /media/disk2/repo $ git remote add node2 /media/disk2/repo
$ git remote add node3 /media/disk3/repo $ git remote add node3 /media/disk2/repo
$ git config remote.node1.annex-cluster-node mycluster $ git config remote.node1.annex-cluster-node mycluster
$ git config remote.node2.annex-cluster-node mycluster $ git config remote.node2.annex-cluster-node mycluster
$ git config remote.node3.annex-cluster-node mycluster $ git config remote.node3.annex-cluster-node mycluster
@ -131,6 +131,26 @@ on more than one at a time will likely be faster.
$ git config annex.jobs cpus $ git config annex.jobs cpus
## special remotes as cluster nodes
Cluster nodes don't have to be regular git remotes. They can
also be special remotes.
Even special remotes with `exporttree=yes` can be
used as cluster nodes. Those also need to be configured with
`annexobjects=yes`, though. You will also need to set
`remote.name.annex-tracking-branch` to the branch that will
trigger an update of the exported tree when it is pushed to the
cluster gateway.
Let's set up a directory special remote as a cluster node,
with the "master" branch exported as a tree:
$ git-annex initremote node4 type=directory directory=/media/disk3/repo exporttree=yes annexobjects=yes
$ git config remote.node4.annex-tracking-branch master
$ git config remote.node4.annex-cluster-node mycluster
$ git-annex updatecluster
## adding additional gateways to a cluster ## adding additional gateways to a cluster
A cluster can have more than one gateway. One way to use this is to A cluster can have more than one gateway. One way to use this is to
@ -211,9 +231,16 @@ be pulled from another one. And gateways only learn about the locations of
keys that are uploaded to the cluster via them. So in the example above, keys that are uploaded to the cluster via them. So in the example above,
after an upload to AMS-mycluster, NYC-mycluster will only know that the after an upload to AMS-mycluster, NYC-mycluster will only know that the
key is stored in its nodes, but won't know that it's stored in nodes key is stored in its nodes, but won't know that it's stored in nodes
behind AMS. So, it's best to have a single git repository that is synced behind AMS.
with, or perhaps run [[git-annex-remotedaemon]] on each gateway to keep
its git repository in sync with the other gateways. So, it's best to have a single git repository that is synced with, or
perhaps run [[git-annex-remotedaemon]] on each gateway to keep its git
repository in sync with the other gateways.
When using special remotes with `exporttree=yes` as nodes, it's
particularly important that pushes reach all the gateways, since the
exported tree will only get updated when the annex-tracking-branch is
pushed.
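As a minimal sketch of what "pushes reach all the gateways" could look like in practice, the loop below pushes the tracking branch to each gateway in turn. The remote names `AMS` and `NYC` are hypothetical (this section does not name the gateway remotes), and `master` is assumed to be the branch set in `annex-tracking-branch`. By default the sketch only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: push the tracking branch to every cluster gateway.
# GATEWAYS holds hypothetical remote names; substitute your own gateway remotes.
GATEWAYS="${GATEWAYS:-AMS NYC}"
# BRANCH is assumed to match remote.<name>.annex-tracking-branch on the nodes.
BRANCH="${BRANCH:-master}"
# GIT defaults to "echo git", so this is a dry run that prints each command.
# Set GIT=git to really push.
GIT="${GIT:-echo git}"

push_to_gateways() {
    for gateway in $GATEWAYS; do
        # Stop at the first failed push so a missed gateway is noticed.
        $GIT push "$gateway" "$BRANCH" || return 1
    done
}

push_to_gateways
```

Running it with `GIT=git` performs the pushes; stopping at the first failure is a design choice made here so that a gateway that did not receive the branch does not go unnoticed.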
Clusters can be constructed with any number of gateways, and any internal Clusters can be constructed with any number of gateways, and any internal
topology of connections between gateways. But there must always be a path topology of connections between gateways. But there must always be a path
@ -226,4 +253,5 @@ uploading a key to the cluster.
A breakdown in communication between gateways will temporarily split the A breakdown in communication between gateways will temporarily split the
cluster. When communication resumes, some keys may need to be copied to cluster. When communication resumes, some keys may need to be copied to
additional nodes. additional nodes, and of course the git repositories will need to be pushed
as well to get things back in sync.