From ca3f05fd6cf0f12dc29193a5c5b3fc01c31097dc Mon Sep 17 00:00:00 2001
From: "https://www.google.com/accounts/o8/id?id=AItOawmBUR4O9mofxVbpb8JV9mEbVfIYv670uJo"
Date: Fri, 22 Apr 2011 14:24:17 +0000
Subject: [PATCH 1/7]

---
 doc/forum/wishlist:_git-annex_replicate.mdwn | 12 ++++++++++++
 1 file changed, 12 insertions(+)
 create mode 100644 doc/forum/wishlist:_git-annex_replicate.mdwn

diff --git a/doc/forum/wishlist:_git-annex_replicate.mdwn b/doc/forum/wishlist:_git-annex_replicate.mdwn
new file mode 100644
index 0000000000..0d926b3375
--- /dev/null
+++ b/doc/forum/wishlist:_git-annex_replicate.mdwn
@@ -0,0 +1,12 @@
+I'd like to be able to do something like the following:
+
+ * Create encrypted git-annex remotes on a couple of semi-trusted machines - ones that have good connectivity, but non-redundant hardware
+ * Set numcopies=3
+ * Run `git-annex replicate` and have git-annex run the appropriate copy commands to make sure every file is on at least 3 machines
+
+There would also likely be a `git annex rebalance` command which could be used if remotes were added or removed. If possible, it should copy files between servers directly, rather than proxy through a potentially slow client.
+
+There might be the need to have a 'replication_priority' option for each remote that configures which machines would be preferred. That way you could set your local server to a high priority to ensure that it is always 1 of the 3 machines used and files are distributed across 2 of the remaining remotes. Other than priority, other options that might help:
+
+ * maxspace - A self-imposed quota per remote machine. git-annex replicate should try to replicate files first to machines with more free space. maxspace would change the free space calculation to be `min(actual_free_space, maxspace - space_used_by_git_annex)`
+ * bandwidth - when replicating files, copies should be done between machines with the highest available bandwidth. (I think this option could be useful for git-annex get in general)
From 028b338c29fa26a3a821fd79172bc8a42a23387a Mon Sep 17 00:00:00 2001
From: "https://www.google.com/accounts/o8/id?id=AItOawl9sYlePmv1xK-VvjBdN-5doOa_Xw-jH4U"
Date: Fri, 22 Apr 2011 18:27:01 +0000
Subject: [PATCH 2/7] Added a comment

---
 ...comment_1_9926132ec6052760cdf28518a24e2358._comment | 10 ++++++++++
 1 file changed, 10 insertions(+)
 create mode 100644 doc/forum/wishlist:_git-annex_replicate/comment_1_9926132ec6052760cdf28518a24e2358._comment

diff --git a/doc/forum/wishlist:_git-annex_replicate/comment_1_9926132ec6052760cdf28518a24e2358._comment b/doc/forum/wishlist:_git-annex_replicate/comment_1_9926132ec6052760cdf28518a24e2358._comment
new file mode 100644
index 0000000000..cec971ee3b
--- /dev/null
+++ b/doc/forum/wishlist:_git-annex_replicate/comment_1_9926132ec6052760cdf28518a24e2358._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="https://www.google.com/accounts/o8/id?id=AItOawl9sYlePmv1xK-VvjBdN-5doOa_Xw-jH4U"
+ nickname="Richard"
+ subject="comment 1"
+ date="2011-04-22T18:27:00Z"
+ content="""
+While having remotes redistribute introduces some obvious security concerns, I might use it.
+
+As remotes support a cost factor already, you can basically implement bandwidth through that.
+"""]]
From a03dc49bb21190bd2823df9d6b99b04158a955ee Mon Sep 17 00:00:00 2001
From: gernot
Date: Sat, 23 Apr 2011 16:02:42 +0000
Subject: [PATCH 3/7]

---
 ...efine_remotes_that_must_have_all_files.mdwn | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)
 create mode 100644 doc/forum/wishlist:_define_remotes_that_must_have_all_files.mdwn

diff --git a/doc/forum/wishlist:_define_remotes_that_must_have_all_files.mdwn b/doc/forum/wishlist:_define_remotes_that_must_have_all_files.mdwn
new file mode 100644
index 0000000000..156cfb0090
--- /dev/null
+++ b/doc/forum/wishlist:_define_remotes_that_must_have_all_files.mdwn
@@ -0,0 +1,18 @@
+I would like to be able to name a few remotes that must retain *all* annexed
+files. `git-annex fsck` should warn me if any files are missing from those
+remotes, even if `annex.numcopies` has been satisfied by other remotes.
+
+I imagine this could also be useful for bup remotes, but I haven't actually
+looked at those yet.
+
+Based on existing output, this is what a warning message could look like:
+
+    fsck FILE
+      3 of 3 trustworthy copies of FILE exist.
+      FILE is, however, still missing from these required remotes:
+        UUID -- Backup Drive 1
+        UUID -- Backup Drive 2
+      Back it up with git-annex copy.
+    Warning
+
+What do you think?
From 65adb9240f45c265dc19d28976823186a8c4b7b0 Mon Sep 17 00:00:00 2001
From: "http://joey.kitenet.net/"
Date: Sat, 23 Apr 2011 16:22:07 +0000
Subject: [PATCH 4/7] Added a comment

---
 ...mment_2_c43932f4194aba8fb2470b18e0817599._comment | 12 ++++++++++++
 1 file changed, 12 insertions(+)
 create mode 100644 doc/forum/wishlist:_git-annex_replicate/comment_2_c43932f4194aba8fb2470b18e0817599._comment

diff --git a/doc/forum/wishlist:_git-annex_replicate/comment_2_c43932f4194aba8fb2470b18e0817599._comment b/doc/forum/wishlist:_git-annex_replicate/comment_2_c43932f4194aba8fb2470b18e0817599._comment
new file mode 100644
index 0000000000..9d50d15310
--- /dev/null
+++ b/doc/forum/wishlist:_git-annex_replicate/comment_2_c43932f4194aba8fb2470b18e0817599._comment
@@ -0,0 +1,12 @@
+[[!comment format=mdwn
+ username="http://joey.kitenet.net/"
+ nickname="joey"
+ subject="comment 2"
+ date="2011-04-23T16:22:07Z"
+ content="""
+Besides the cost values, annex.diskreserve was recently added. (But is not available for special remotes.)
+
+I have held off on adding high-level management stuff like this to git-annex, as it's hard to make it generic enough to cover use cases.
+
+A low-level way to accomplish this would be to have a way for `git annex get` and/or `copy` to skip files when `numcopies` is already satisfied. Then cron jobs could be used.
+"""]]
From 96a7b7926ed09aa264207907ea4e0a5e31a031cb Mon Sep 17 00:00:00 2001
From: "http://joey.kitenet.net/"
Date: Sat, 23 Apr 2011 16:27:13 +0000
Subject: [PATCH 5/7] Added a comment

---
 ...comment_1_cceccc1a1730ac688d712b81a44e31c3._comment | 10 ++++++++++
 1 file changed, 10 insertions(+)
 create mode 100644 doc/forum/wishlist:_define_remotes_that_must_have_all_files/comment_1_cceccc1a1730ac688d712b81a44e31c3._comment

diff --git a/doc/forum/wishlist:_define_remotes_that_must_have_all_files/comment_1_cceccc1a1730ac688d712b81a44e31c3._comment b/doc/forum/wishlist:_define_remotes_that_must_have_all_files/comment_1_cceccc1a1730ac688d712b81a44e31c3._comment
new file mode 100644
index 0000000000..1f65fd982f
--- /dev/null
+++ b/doc/forum/wishlist:_define_remotes_that_must_have_all_files/comment_1_cceccc1a1730ac688d712b81a44e31c3._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="http://joey.kitenet.net/"
+ nickname="joey"
+ subject="comment 1"
+ date="2011-04-23T16:27:13Z"
+ content="""
+Seems to have a scalability problem: what happens when such a repository becomes full?
+
+Another way to accomplish, I think, the same thing is to pick the repositories that you would include in such a set, and make all other repositories untrusted. And set numcopies as desired. Then git-annex will never remove files from the set of non-untrusted repositories, and fsck will warn if a file is present on only an untrusted repository.
+"""]]
From aa820623dc9ea648fb9fa8e9263557529155a7a9 Mon Sep 17 00:00:00 2001
From: "https://www.google.com/accounts/o8/id?id=AItOawmBUR4O9mofxVbpb8JV9mEbVfIYv670uJo"
Date: Sat, 23 Apr 2011 17:54:43 +0000
Subject: [PATCH 6/7] Added a comment

---
 ...comment_3_c13f4f9c3d5884fc6255fd04feadc2b1._comment | 10 ++++++++++
 1 file changed, 10 insertions(+)
 create mode 100644 doc/forum/wishlist:_git-annex_replicate/comment_3_c13f4f9c3d5884fc6255fd04feadc2b1._comment

diff --git a/doc/forum/wishlist:_git-annex_replicate/comment_3_c13f4f9c3d5884fc6255fd04feadc2b1._comment b/doc/forum/wishlist:_git-annex_replicate/comment_3_c13f4f9c3d5884fc6255fd04feadc2b1._comment
new file mode 100644
index 0000000000..e7eb06b3b1
--- /dev/null
+++ b/doc/forum/wishlist:_git-annex_replicate/comment_3_c13f4f9c3d5884fc6255fd04feadc2b1._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="https://www.google.com/accounts/o8/id?id=AItOawmBUR4O9mofxVbpb8JV9mEbVfIYv670uJo"
+ nickname="Justin"
+ subject="comment 3"
+ date="2011-04-23T17:54:42Z"
+ content="""
+Hmm, so it seems there is almost a way to do this already.
+
+I think the one thing that isn't currently possible is to have 'plain' ssh remotes: basically something just like the directory remote, but able to take an ssh user@host/path URL. Something like sshfs could be used to fake this, but for things like fsck you would want to do the SHA1 calculations on the remote host.
+"""]]
From 9715f3132c5fa69e8edf2bc7c41c1a4e9c0602be Mon Sep 17 00:00:00 2001
From: gernot
Date: Sun, 24 Apr 2011 11:20:06 +0000
Subject: [PATCH 7/7] Added a comment

---
 ...t_2_eec848fcf3979c03cbff2b7407c75a7a._comment | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)
 create mode 100644 doc/forum/wishlist:_define_remotes_that_must_have_all_files/comment_2_eec848fcf3979c03cbff2b7407c75a7a._comment

diff --git a/doc/forum/wishlist:_define_remotes_that_must_have_all_files/comment_2_eec848fcf3979c03cbff2b7407c75a7a._comment b/doc/forum/wishlist:_define_remotes_that_must_have_all_files/comment_2_eec848fcf3979c03cbff2b7407c75a7a._comment
new file mode 100644
index 0000000000..1855cdda01
--- /dev/null
+++ b/doc/forum/wishlist:_define_remotes_that_must_have_all_files/comment_2_eec848fcf3979c03cbff2b7407c75a7a._comment
@@ -0,0 +1,16 @@
+[[!comment format=mdwn
+ username="gernot"
+ ip="87.79.209.169"
+ subject="comment 2"
+ date="2011-04-24T11:20:05Z"
+ content="""
+Right, I have thought about untrusting all but a few remotes to achieve
+something similar before and I'm sure it would kind of work. It would be more
+of an ugly workaround, however, because I would have to untrust remotes that
+are, in reality, at least semi-trusted. That's why an extra option/attribute
+for that kind of purpose/remote would be nice.
+
+Obviously I didn't see the scalability problem though. Good point. Maybe I can
+achieve the same thing by writing a log-parsing script for myself?
+
+"""]]