diff --git a/doc/todo/more_efficient_resolution_of_trivial_export_conflicts.mdwn b/doc/todo/more_efficient_resolution_of_trivial_export_conflicts.mdwn index 4ebdad5c4c..a7230bc593 100644 --- a/doc/todo/more_efficient_resolution_of_trivial_export_conflicts.mdwn +++ b/doc/todo/more_efficient_resolution_of_trivial_export_conflicts.mdwn @@ -16,3 +16,5 @@ unexport files added by the other tree. It's sufficient to check that files are present in the export and upload any that are missing. --[[Joey]] [[!tag confirmed]] + +> I proved this is not a safe optimisation, so [[wontfix|done]] --[[Joey]] diff --git a/doc/todo/more_efficient_resolution_of_trivial_export_conflicts/comment_2_70c3cfa9b95f3aa6c2e2e2f6dbbc50d0._comment b/doc/todo/more_efficient_resolution_of_trivial_export_conflicts/comment_2_70c3cfa9b95f3aa6c2e2e2f6dbbc50d0._comment new file mode 100644 index 0000000000..6f8c923df9 --- /dev/null +++ b/doc/todo/more_efficient_resolution_of_trivial_export_conflicts/comment_2_70c3cfa9b95f3aa6c2e2e2f6dbbc50d0._comment @@ -0,0 +1,66 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 2""" + date="2020-05-04T17:30:59Z" + content=""" +I took a look at this, the relevant code is here: + + warning "Resolving export conflict.." + forM_ ts $ \oldtreesha -> do + -- Unexport both the srcsha and the dstsha, + -- because the wrong content may have + -- been renamed to the dstsha due to the + -- export conflict. + let unexportboth d = + [ Git.DiffTree.srcsha d + , Git.DiffTree.dstsha d + ] + -- Don't rename to temp, because the + -- content is unknown; delete instead. + mapdiff + (\diff -> commandAction $ startUnexport r db (Git.DiffTree.file diff) (unexportboth diff)) + oldtreesha new + +So, it diffs from the tree in the export conflict to the new, resolved tree. +Any file that differs -- eg, any file that was involved in the conflict -- +gets removed from the export. + +> In the todo, I said: +> +> For example, if A exports a tree containing `[foo]`, and B exports a tree +> containing `[foo, bar]`, bar gets unexported when resolving the conflict. + +Let's be more clear about the content of the trees, and say A exports +`[(foo, 1)]` and B exports `[(foo, 1), (bar, 1)]`. + +If the export is then resolved to `[(foo, 1), (bar, 1)]`, +we can see nothing needs to be done. But what it currently does is +diff from `[(foo, 1)]` to the resolution and so unexports bar. + +If B had instead exported `[(foo, 1), (bar, 2)]`, then +it would still need to diff from that the the resolution, and so would +unexport bar, and so it should. + +But.. but.. ugh. Consider an export conflict that started with `[(bar, 1)]` +exported. A exported `[(bar, 2)]` and B exported `[(baz, 1)]` (by renaming bar +to baz). So the export remote might contain `[(baz, 2)]` (A uploaded 2 to bar, +and then B renamed bar to baz) or it might contain `[(bar, 2), (baz, 1)]`; +we do not know ordering between A and B. + +If the export conflict resolution is `[(bar, 2), (baz 1)]` then the tree +exported by B is a subset, so it skips that one. And the tree +exported by A is a subset, so ummm... it skips that one. And so nothing gets +unexported. Then, it proceeds to try to upload any missing files +to the export. If the export remote contains `[(bar, 2), (baz, 1)]` nothing is +missing, nothing gets uploaded, all is well. But if the export remote +contains `[(baz, 2)]`, it will upload `(bar, 2)`, resulting in +`[(bar, 2), (baz 2)]`. That is not what it's supposed to contain. + +So, no, this optimisation will not work! + +The only way to make this optimisation work, I think, is to not do renaming +when updating export remotes. But file renaming is more common than export +conflicts; you can always adjust your workflow to avoid export conflicts, +by pulling from the remote you tend to conflict with, before performing an +export. +"""]] diff --git a/doc/todo/more_efficient_resolution_of_trivial_export_conflicts/comment_3_d0549f8a07032ce61e88b4ccb2c6ef3b._comment b/doc/todo/more_efficient_resolution_of_trivial_export_conflicts/comment_3_d0549f8a07032ce61e88b4ccb2c6ef3b._comment new file mode 100644 index 0000000000..2b2b8314fc --- /dev/null +++ b/doc/todo/more_efficient_resolution_of_trivial_export_conflicts/comment_3_d0549f8a07032ce61e88b4ccb2c6ef3b._comment @@ -0,0 +1,31 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 3""" + date="2020-05-04T18:36:30Z" + content=""" +Hmm, but.. What if both A and B's trees are subsets +of the resolved tree? Safe then? + +A exports `[(foo, 1)]`, while B exports `[(bar, 2)]` +the resolved tree is `[(foo, 1), (bar, 2)]`. + +Well, what was in the export before? Suppose it was `[(foo, 2)]`.. +Then B would have renamed foo to bar, and A exported 1 to foo. +Order is unknown, so the export has either of `[(foo, 1), (bar, 2)]` +or `[(bar, 1)]` + +Yeah, still not safe even when both trees are subsets. + +---- + +An optimisation like this needs some way to detect if there's been a rename +like B keeps doing in these examples. If there has not been any rename, +the optimisation is safe. + +export.log contains only the sha of the tree that has been exported +by each repo to the export remote. It might contains some trees that +were exported before, but when it gets compacted, that information is +lost, and anyway there's no way to know if B exported some old tree before +or after A exported its most recently exported tree. So, I don't think +retrospective rename detection is possible. +"""]]