From 8f69f5d9aa70934e5243965b7f03e8dd8e92d447 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Mon, 4 Jan 2021 13:34:32 -0400 Subject: [PATCH] comment --- ..._d88cedb3d70342071621c1695c8aeb05._comment | 34 +++++++++++++++++++ 1 file changed, 34 insertions(+) create mode 100644 doc/forum/Import_existing_files/comment_5_d88cedb3d70342071621c1695c8aeb05._comment diff --git a/doc/forum/Import_existing_files/comment_5_d88cedb3d70342071621c1695c8aeb05._comment b/doc/forum/Import_existing_files/comment_5_d88cedb3d70342071621c1695c8aeb05._comment new file mode 100644 index 0000000000..a1062d4244 --- /dev/null +++ b/doc/forum/Import_existing_files/comment_5_d88cedb3d70342071621c1695c8aeb05._comment @@ -0,0 +1,34 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 5""" + date="2021-01-04T17:17:08Z" + content=""" +I think you're really overcomplicating things. Some really basic use of +git-annex as described in the [[walkthrough]] will work fine in the +situation you describe. Ie, initialize a git-annex repository in +~/Pictures. If you have some other servers or hard drives that also have +pictures, initialize git-annex repositories on those as well. Connect these +repositories that all hold pictures together, by adding git remotes +pointing to the other pictures repositories. + +Then when you `git push` (or `git-annex sync`), git-annex will automatically +learn if some picture is stored in multiple of the repositories. You'll be +able to run commands like `git-annex find --copies 2` or `git-annex drop` +to operate on that information. Similarly, if Picture/BestPics2020/a.jpg +and Picture/2020/01/a.jpg were the same content, git-annex will notice that +when you add them to the annex, and will automatically deduplicate. + +If you have readonly DVDs or whatever, yes those can be handled in ways +like Lukey describes, but why bother trying to deal with all those edge +cases before you're using git-annex at all? + +As far as too many files, git has issues with the index file becoming +slower with more files, but you need huge numbers of files for this to be a +significant problem -- think millions. git-annex commands that need to +operate on all files necessarily take longer when there are more files, +but git-annex always lets you only operate on a subset of files, such as +the ones in the current directory, so this is not a significant scalability +problem. Worrying about speed before something is slow is a kind of +premature optimisation; git-annex has actually been optimised in cases where +it was slow. +"""]]