design for simulating clusters w/o simulating cluster gateways
parent b9214d4162
commit 61c95f4d29
2 changed files with 73 additions and 1 deletion
@ -368,6 +368,47 @@ as passed to "git annex sim" while a simulation is running.
      step 100
      rebalance off

* `clusternode name repo`

  Simulate a repository being a node of a cluster, which can be referred to
  using the specified name.

  Rather than a cluster gateway being simulated as a separate entity, any
  connection to a cluster node with that name is treated as accessing that
  repository via the same cluster gateway.

  Since a cluster gateway knows about all changes that are made to nodes
  via it, every repository that has a connection to a cluster node will
  immediately know about changes that are made via that node, without
  needing a simulated git pull.

  To simulate a repository being a node of more than one cluster, or behind
  multiple gateways in the same cluster, use this command to give it
  multiple names.

  For example:

      init foo
      init bar
      init node1
      init node2
      clusternode cluster-node1 node1
      clusternode cluster-node2 node2
      group node1 cluster
      group node2 cluster
      wanted node1 sizebalanced=cluster
      wanted node2 sizebalanced=cluster
      connect cluster-node2 <- foo -> cluster-node1
      connect cluster-node2 <- bar -> cluster-node1
      addmulti 10 foo 1gb 2gb foo
      addmulti 10 bar 1gb 2gb bar
      action foo sendwanted cluster-node1 while action foo sendwanted cluster-node2 while action bar sendwanted cluster-node1 while action bar sendwanted cluster-node2

  In the above example, while foo and bar are both concurrently sending
  wanted files to both nodes, each will know immediately which files have
  been sent by the other, and so the files will be optimally sizebalanced
  between the two nodes.

# OPTIONS

* The [[git-annex-common-options]](1) can be used.

@ -92,7 +92,38 @@ Planned schedule of work:
      clusternode mycluster-foo foo
      clusternode othercluster-foo foo

  Implementation plan for this:

  * clusternode initializes a new cluster node UUID, and adds it to
    simRepos.
  * add `simClusterNodes :: M.Map UUID (UUID, RemoteName)`,
    which maps from the cluster node UUID to the UUID of the underlying
    repo, and its node name.
  * clusternode also adds to simClusterNodes.
  * setPresentKey checks if the UUID is in simClusterNodes.
    * If it is, it makes the key present/missing in the underlying repo
      UUID as well.
    * And, it looks through simConnections to find any other repos that
      also have a connection to the cluster node with that name.
      Each of those repos also gets its simLocations updated.
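
The plan above can be sketched roughly as follows. This is only a toy model under stated assumptions, not the git-annex implementation: the `SimState` record, the shapes of `simConnections` and `simLocations`, the `recordLocation` helper, and the argument order of `setPresentKey` are all hypothetical. Only the `simClusterNodes` type comes from the plan.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S
import Data.Maybe (fromMaybe)

-- Hypothetical stand-ins; git-annex's real types are richer than this.
type UUID = String
type Key = String
type RemoteName = String

data SimState = SimState
  { simClusterNodes :: M.Map UUID (UUID, RemoteName)
    -- cluster node UUID -> (underlying repo UUID, node name), as in the plan
  , simConnections :: M.Map UUID (S.Set RemoteName)
    -- repo UUID -> names of remotes it is connected to (assumed shape)
  , simLocations :: M.Map UUID (M.Map Key (S.Set UUID))
    -- each repo's view of where every key is present (assumed shape)
  }

-- Update one repo's (the viewer's) record of whether a key is at a location.
recordLocation :: Bool -> Key -> UUID -> UUID -> SimState -> SimState
recordLocation present k loc viewer st =
  st { simLocations = M.alter updateView viewer (simLocations st) }
  where
    updateView = Just . M.alter updateKey k . fromMaybe M.empty
    updateKey = Just . change . fromMaybe S.empty
    change = if present then S.insert loc else S.delete loc

-- The repo `actor` makes key k present/missing at UUID u.
setPresentKey :: Bool -> UUID -> Key -> UUID -> SimState -> SimState
setPresentKey present u k actor st =
  case M.lookup u (simClusterNodes st) of
    -- Not a cluster node: only the acting repo learns of the change.
    Nothing -> recordLocation present k u actor st
    -- A cluster node: also update the underlying repo UUID, and do so
    -- in the view of every repo connected to that node name.
    Just (underlying, nodename) ->
      let viewers = actor : connectedTo nodename
          update v = recordLocation present k underlying v
                   . recordLocation present k u v
      in foldr update st viewers
  where
    connectedTo name =
      [ r | (r, remotes) <- M.toList (simConnections st)
          , name `S.member` remotes, r /= actor ]
```

In this sketch a key sent to a cluster node ends up recorded at two UUIDs (the node's and the underlying repo's) in each connected repo's view.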

  But: the cluster node UUID would need to have the same preferred content
  etc. as the underlying repo, and it would need to be in the same groups.
  It would also be counted as another copy. A cluster UUID could be used to
  avoid the numcopies count. But can adding a separate UUID be avoided?

  Implementation plan for this without separate UUID:

  * add `simClusterNodes :: M.Map RepoName UUID`, which maps from the
    cluster node name to the UUID of the underlying repo.
  * clusternode adds to simClusterNodes.
  * checkKnownRemote needs to check simClusterNodes as well as
    simRepos so that cluster nodes can be used as remotes.
  * Plumb repo name through to setPresentKey.
  * setPresentKey checks if repo name is in simClusterNodes.
    * If it is, it looks through simConnections to find any other
      repos that also have a connection to the cluster node with
      that name. Each of those repos also gets its simLocations updated
      for the change being logged.
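
A minimal sketch of this second plan, again as a toy model rather than the actual code. The `simClusterNodes` type is the one from the plan; the `SimState` record, the shapes of `simRepos`, `simConnections` and `simLocations`, the helper `recordLocation`, the name `lookupRemote` (standing in for what checkKnownRemote would consult), and the argument order of `setPresentKey` are all assumptions made for illustration.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S
import Data.Maybe (fromMaybe)

-- Hypothetical stand-ins; git-annex's real types are richer than this.
type UUID = String
type Key = String
type RepoName = String

data SimState = SimState
  { simRepos :: M.Map RepoName UUID
    -- simulated repo name -> its UUID (assumed shape)
  , simClusterNodes :: M.Map RepoName UUID
    -- cluster node name -> UUID of the underlying repo, as in the plan
  , simConnections :: M.Map UUID (S.Set RepoName)
    -- repo UUID -> names of remotes it is connected to (assumed shape)
  , simLocations :: M.Map UUID (M.Map Key (S.Set UUID))
    -- each repo's view of where every key is present (assumed shape)
  }

-- A remote name is usable if it names a simulated repo or a cluster node.
lookupRemote :: RepoName -> SimState -> Maybe UUID
lookupRemote name st =
  case M.lookup name (simRepos st) of
    Just u -> Just u
    Nothing -> M.lookup name (simClusterNodes st)

-- Update one repo's (the viewer's) record of whether a key is at a location.
recordLocation :: Bool -> Key -> UUID -> UUID -> SimState -> SimState
recordLocation present k loc viewer st =
  st { simLocations = M.alter updateView viewer (simLocations st) }
  where
    updateView = Just . M.alter updateKey k . fromMaybe M.empty
    updateKey = Just . change . fromMaybe S.empty
    change = if present then S.insert loc else S.delete loc

-- The repo `actor` makes key k present/missing at UUID u, which it
-- reached through the remote called `remotename`.
setPresentKey :: Bool -> RepoName -> UUID -> Key -> UUID -> SimState -> SimState
setPresentKey present remotename u k actor st =
  foldr (recordLocation present k u) st viewers
  where
    viewers
      | remotename `M.member` simClusterNodes st = actor : others
      | otherwise = [actor]
    -- Every other repo connected to the cluster node with that name.
    others =
      [ r | (r, remotes) <- M.toList (simConnections st)
          , remotename `S.member` remotes, r /= actor ]
```

Because the change is logged only against the underlying repo's UUID, this version does not introduce a second location for the key; the node name is used purely to decide which repos learn of the change immediately.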

* sim: Add support for metadata, so preferred content that matches on it
  will work
