From 13a8706cda856815c283d82768f1582c120b343d Mon Sep 17 00:00:00 2001
From: Joey Hess
Date: Thu, 13 May 2021 14:09:06 -0400
Subject: [PATCH] almost have a plan

---
 ..._53a7a0d5ba6be411bdae10a6f8ba16fc._comment | 46 +++++++++++++++++++
 ..._d82ee205451cf55eb283952c4774d32a._comment | 17 +++++++
 ..._3565680846a8d547d0912d1ef31430b2._comment | 40 ++++++++++++++++
 3 files changed, 103 insertions(+)
 create mode 100644 doc/todo/copy-key___40__--batch__41___to_copy__47__merge_availability_info/comment_3_53a7a0d5ba6be411bdae10a6f8ba16fc._comment
 create mode 100644 doc/todo/copy-key___40__--batch__41___to_copy__47__merge_availability_info/comment_4_d82ee205451cf55eb283952c4774d32a._comment
 create mode 100644 doc/todo/copy-key___40__--batch__41___to_copy__47__merge_availability_info/comment_5_3565680846a8d547d0912d1ef31430b2._comment

diff --git a/doc/todo/copy-key___40__--batch__41___to_copy__47__merge_availability_info/comment_3_53a7a0d5ba6be411bdae10a6f8ba16fc._comment b/doc/todo/copy-key___40__--batch__41___to_copy__47__merge_availability_info/comment_3_53a7a0d5ba6be411bdae10a6f8ba16fc._comment
new file mode 100644
index 0000000000..6c685c597d
--- /dev/null
+++ b/doc/todo/copy-key___40__--batch__41___to_copy__47__merge_availability_info/comment_3_53a7a0d5ba6be411bdae10a6f8ba16fc._comment
@@ -0,0 +1,46 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2021-05-13T16:10:39Z"
+ content="""
+Hmm, it seems possible that two repos could use the same uuid for a
+remote, but have different configurations for it. Eg, an internal use repo
+that might even embed creds for the remote, and a public use repo that
+relies on public http urls to download from the remote.
+
+So there would then be 3 things that need to be able to be specified:
+
+* keys to copy
+* uuids whose per-key information should be copied (or ones to skip)
+* uuids whose non-per-key information should be copied (or ones to skip)
+  (remote description, special remote config, trust, group, preferred
+  content, etc)
+
+Might as well add, for completeness:
+
+* whether to copy global config settings, or not (numcopies, mincopies,
+  git-annex-config, group-preferred-content, difference.log)
+
+Could get more granular than this, eg only copying some metadata fields and
+not others, or description but not trust log, but I'd want to see a use
+case. A line has to be drawn somewhere or it just gets ridiculous, and the
+user might as well pull up [[internals]] and git-filter-branch and
+post-process the tree generated by this command.
+
+So a UI for these 3 or 4 things..
+
+    git-annex copy-branch --keys-from=path
+        --include-key-information-for=repo
+        --exclude-key-information-for=repo
+        --include-config-for=repo
+        --exclude-config-for=repo
+        --include-global-config
+        --exclude-global-config
+
+Eg:
+
+    git-annex copy-branch --keys-from=.
+        --exclude-key-information-for=privateremote
+        --exclude-config-for=privateremote
+        --include-global-config
+"""]]
diff --git a/doc/todo/copy-key___40__--batch__41___to_copy__47__merge_availability_info/comment_4_d82ee205451cf55eb283952c4774d32a._comment b/doc/todo/copy-key___40__--batch__41___to_copy__47__merge_availability_info/comment_4_d82ee205451cf55eb283952c4774d32a._comment
new file mode 100644
index 0000000000..8af41e1564
--- /dev/null
+++ b/doc/todo/copy-key___40__--batch__41___to_copy__47__merge_availability_info/comment_4_d82ee205451cf55eb283952c4774d32a._comment
@@ -0,0 +1,17 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 4"""
+ date="2021-05-13T16:29:41Z"
+ content="""
+The other axis is, I guess, should it include past commits to the git-annex
+branch, or only the current data? I'm inclined toward only the current
+data. The only thing that uses past data really is `git-annex log` and it's just
+not worth the added time expense. And also `git annex forget` already
+throws away the past data.
+
+There is the added wart of exported treeishes being grafted into the
+git-annex branch (to avoid them being lost in GC in some edge cases).
+It would need to do like `git annex forget` was recently fixed to, and
+include those grafts when throwing away the rest of the history.
+(See [[!commit 8e7dc958d20861a91562918e24e071f70d34cf5b]])
+"""]]
diff --git a/doc/todo/copy-key___40__--batch__41___to_copy__47__merge_availability_info/comment_5_3565680846a8d547d0912d1ef31430b2._comment b/doc/todo/copy-key___40__--batch__41___to_copy__47__merge_availability_info/comment_5_3565680846a8d547d0912d1ef31430b2._comment
new file mode 100644
index 0000000000..444918d789
--- /dev/null
+++ b/doc/todo/copy-key___40__--batch__41___to_copy__47__merge_availability_info/comment_5_3565680846a8d547d0912d1ef31430b2._comment
@@ -0,0 +1,40 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 5"""
+ date="2021-05-13T16:55:36Z"
+ content="""
+The filtering of uuids from logs this command needs is very closely
+related to how the git-annex branch is filtered when dropping dead uuids
+and keys.
+
+Annex.Branch.Transitions.dropDead could almost be used as-is, just
+providing it a trustmap that has the excluded uuids marked as dead.
+
+But, it does not currently modify the trustLog, which makes sense for
+transitions, but for this the trust log needs to include only the desired
+uuids.
+
+And, providing a trustmap does have the problem that,
+if a uuid is mentioned in the branch without being in uuid.log,
+it would not be in the trustmap, and so it would not be excluded. One way
+for that to happen is, well, using this command to copy only per-key info
+for a remote, but not config for a remote. Hmm. Using a filtering
+function, rather than a trustmap, would avoid this problem. But,
+dropDead does some processing to handle sameas-uuid pointing to a dead
+uuid, including a special case involving remoteLog.
+
+Implementation plan:
+
+* Address above problems with dropDead, somehow, so it can be reused.
+* Add a function (in Logs) from a key to all possible git-annex branch log
+  files for that key.
+* For each key seeked, run that function, query the branch to see which
+  log files exist, and pass through dropDead to filter and populate
+  the temporary index. This way, the command does not need to buffer
+  the whole set of keys in memory.
+* Get a list of all non-key logs
+  `(topLevelNewUUIDBasedLogs++topLevelOldUUIDBasedLogs++otherLogs)`,
+  and pass them all through dropDead as well.
+* Refactor regraftexports from Annex.Branch, and call it after
+  constructing the filtered index.
+"""]]
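
The trustmap concern raised in comment 5 can be illustrated with a small sketch. It is written in Python purely for illustration (git-annex itself is Haskell) and is not how dropDead is implemented. It assumes a simplified log line of the form `<timestamp> <status> <uuid>`, and it models the trustmap as containing only uuids listed in uuid.log. The names `uuid_of`, `build_trustmap`, `filter_by_trustmap` and `filter_by_predicate` are hypothetical, and the sameas-uuid handling is not modeled at all.

```python
from typing import Callable, Dict, List, Set

DEAD = "dead"
SEMITRUSTED = "semitrusted"


def uuid_of(line: str) -> str:
    """Return the uuid field of a simplified log line (assumed to be the last field)."""
    return line.split()[-1]


def build_trustmap(known_uuids: Set[str], excluded: Set[str]) -> Dict[str, str]:
    """Assumption: the trustmap only has entries for uuids that appear in uuid.log."""
    return {u: (DEAD if u in excluded else SEMITRUSTED) for u in known_uuids}


def filter_by_trustmap(lines: List[str], trustmap: Dict[str, str]) -> List[str]:
    """Drop lines whose uuid is marked dead; uuids missing from the map are kept."""
    return [l for l in lines if trustmap.get(uuid_of(l)) != DEAD]


def filter_by_predicate(lines: List[str], is_excluded: Callable[[str], bool]) -> List[str]:
    """Drop lines whose uuid the predicate rejects, whether or not uuid.log lists it."""
    return [l for l in lines if not is_excluded(uuid_of(l))]


# A per-key log mentioning two uuids; only one of them is listed in uuid.log.
log = ["1620922239s 1 uuid-public", "1620922240s 1 uuid-private"]
excluded = {"uuid-private"}
known = {"uuid-public"}

via_map = filter_by_trustmap(log, build_trustmap(known, excluded))
via_pred = filter_by_predicate(log, lambda u: u in excluded)
```

In this model the trustmap variant leaves the `uuid-private` line in place, because that uuid never made it into the map, while the predicate variant removes it.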