From 0df94132d997a492e0ab43f16acc9bc52fd4d82d Mon Sep 17 00:00:00 2001
From: Joey Hess
Date: Thu, 13 Jul 2023 19:57:34 -0400
Subject: [PATCH] add a warning and a related todo arising from a conversation at FOSSY

---
 doc/tips/splitting_a_repository.mdwn    |  3 +++
 doc/todo/filter-branch_for_objects.mdwn | 14 ++++++++++++++
 2 files changed, 17 insertions(+)
 create mode 100644 doc/todo/filter-branch_for_objects.mdwn

diff --git a/doc/tips/splitting_a_repository.mdwn b/doc/tips/splitting_a_repository.mdwn
index cd94785760..d20497c74d 100644
--- a/doc/tips/splitting_a_repository.mdwn
+++ b/doc/tips/splitting_a_repository.mdwn
@@ -70,6 +70,9 @@ Finally the annexed file contents need to be copied to the new repository:
     # Fix up annex links to content and make sure it's all ok.
     git annex fsck
 
+Warning: This method of copying the annexed file contents and dropping
+the unused ones causes the git-annex branch to log information.
+
 # alternative older method
 
 Here is another way to do it. Suppose the old big repo is at `~/oldrepo`:
diff --git a/doc/todo/filter-branch_for_objects.mdwn b/doc/todo/filter-branch_for_objects.mdwn
new file mode 100644
index 0000000000..3483470895
--- /dev/null
+++ b/doc/todo/filter-branch_for_objects.mdwn
@@ -0,0 +1,14 @@
+`git-annex filter-branch` can be used to split a git-annex repository.
+However, the approach in [[tips/splitting_a_repository]] then copies all
+objects into the new repository and drops unused objects. And dropunused
+updates location log in that situation, even when the location log didn't
+exist in that repository before. So, that approach leaks information about
+objects that were in the original repository into the split repository.
+
+Splitting a git-annex repository is something that, when you need to do it,
+you may have good reasons to want to avoid any such leakage of
+information.
+
+So perhaps add a feature that copies only the needed objects over to the
+split repository? Or update the tip with a better method that avoids this
+problem. --[[Joey]]