git-annex/doc/design
Joey Hess 202ea3ff2a
don't sync with cluster nodes by default
Avoid `git-annex sync --content` etc from operating on cluster nodes by default,
since syncing with a cluster implicitly syncs with its nodes. This avoids a
lot of unnecessary work when a cluster has a lot of nodes just in checking
if each node's preferred content is satisfied. And it avoids content
being sent to nodes individually, so instead syncing with clusters always
fans out uploads to nodes.

The downside is that there are situations where a cluster's preferred content
settings can be met, but those of its nodes are not. Or where a node does not
contain a key, but the cluster does, and there are not enough copies of the key
yet, so it would be desirable to send it there. I think that's an acceptable
tradeoff. These kinds of situations are ones where the cluster itself should
probably be responsible for copying content to the node. Which it can do much
less expensively than a client can. Part of the balanced preferred content
design that I will be working on in a couple of months involves rebalancing
clusters, so I expect to revisit this.

The use of annex-sync config does allow running git-annex sync with a specific
node, or nodes, and it will sync with it. And it's also possible to set
annex-sync git configs to make it sync with a node by default. (Although that
will require setting up an explicit git remote for the node rather than relying
on the proxied remote.)
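
As a rough illustration of the second case, an explicit remote for a node
might be configured along these lines in .git/config. The remote name
"node1" and its url are hypothetical placeholders, not taken from this
commit; only the annex-sync setting itself is what the text above describes.

```
# Hypothetical explicit git remote for one cluster node.
# The name and url are made up for illustration.
[remote "node1"]
	url = ssh://example.com/~/node1
	# Opt this node back in to being synced with by default.
	annex-sync = true
```

Without such a setting, naming the remote on the command line, as in
`git-annex sync node1`, should still sync with that node only.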

Logs.Cluster.Basic is needed because Remote.Git cannot import Logs.Cluster
due to a cycle. And the Annex.Startup load of clusters happens
too late for Remote.Git to use that. This does mean one redundant load
of the cluster log, though only when there is a proxy.
2024-06-25 10:24:38 -04:00
adjusted_branches Added a comment: adjusted branch to "focus" on a specific subtree 2016-08-22 14:19:57 +00:00
assistant Typo fix unncessary -> unnecessary. 2022-08-20 09:40:19 -04:00
balanced_preferred_content Added a comment 2023-07-24 13:10:09 +00:00
encryption
exporting_trees_to_special_remotes Added a comment 2018-02-07 20:01:53 +00:00
external_backend_protocol Added a comment: xxHash as the backend 2022-12-12 08:21:35 +00:00
external_special_remote_protocol Added a comment: support for bulk write/read/test remote 2024-04-02 06:41:25 +00:00
git-remote-daemon
iabackup Added a comment: 14 of 21PB, actually 2015-04-30 02:58:05 +00:00
metadata followup 2015-04-09 14:33:11 -04:00
new_repo_versions devblog 2016-05-04 14:39:53 -04:00
p2p_protocol comment 2019-04-03 13:11:34 -04:00
requests_routing Added a comment: Friendly bump to keep on the radar 2019-10-24 09:26:23 +00:00
adjusted_branches.mdwn link to the adjust manpage 2016-06-23 14:39:49 +00:00
assistant.mdwn
balanced_preferred_content.mdwn don't sync with cluster nodes by default 2024-06-25 10:24:38 -04:00
caching_database.mdwn sqlite database for importfeed 2023-10-23 16:46:22 -04:00
encryption.mdwn Fix typos "=yet" -> "=yes" 2023-03-10 18:07:20 +01:00
exporting_trees_to_special_remotes.mdwn comment 2022-05-02 14:45:45 -04:00
external_backend_protocol.mdwn this protocol is not draft for some time 2020-10-22 19:55:29 -04:00
external_special_remote_protocol.mdwn let Remote.availability return Unavailable 2023-08-16 14:31:31 -04:00
gcrypt.mdwn
git-remote-daemon.mdwn update 2015-01-15 15:58:56 -04:00
iabackup.mdwn Fix spelling in doc/design/iabackup.mdwn 2018-06-03 12:28:26 +00:00
importing_trees_from_special_remotes.mdwn improve docs about removeExportDirectory 2019-05-28 11:16:01 -04:00
metadata.mdwn update for v6 unlocked files 2015-12-26 14:59:06 -04:00
new_repo_versions.mdwn 2023-11-20 02:09:42 +00:00
p2p_protocol.mdwn dropping from clusters 2024-06-23 09:43:40 -04:00
p2p_protocol_over_http.mdwn correct AUTH-SUCCESS and AUTH-FAILURE 2024-06-10 15:06:27 -04:00
passthrough_proxy.mdwn update 2024-06-23 05:26:45 -04:00
preferred_content.mdwn
requests_routing.mdwn
roadmap.mdwn avoid truncating the list of confirmed items 2023-06-23 16:20:00 -04:00