prove this optimisation would not be safe, so close
This commit is contained in:
parent
2ab2b1f9e2
commit
d2e78dfc0d
3 changed files with 99 additions and 0 deletions
|
@ -16,3 +16,5 @@ unexport files added by the other tree. It's sufficient to check that files
|
|||
are present in the export and upload any that are missing. --[[Joey]]
|
||||
|
||||
[[!tag confirmed]]
|
||||
|
||||
> I proved this is not a safe optimisation, so [[wontfix|done]] --[[Joey]]
|
||||
|
|
|
@ -0,0 +1,66 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 2"""
|
||||
date="2020-05-04T17:30:59Z"
|
||||
content="""
|
||||
I took a look at this, the relevant code is here:
|
||||
|
||||
warning "Resolving export conflict.."
|
||||
forM_ ts $ \oldtreesha -> do
|
||||
-- Unexport both the srcsha and the dstsha,
|
||||
-- because the wrong content may have
|
||||
-- been renamed to the dstsha due to the
|
||||
-- export conflict.
|
||||
let unexportboth d =
|
||||
[ Git.DiffTree.srcsha d
|
||||
, Git.DiffTree.dstsha d
|
||||
]
|
||||
-- Don't rename to temp, because the
|
||||
-- content is unknown; delete instead.
|
||||
mapdiff
|
||||
(\diff -> commandAction $ startUnexport r db (Git.DiffTree.file diff) (unexportboth diff))
|
||||
oldtreesha new
|
||||
|
||||
So, it diffs from the tree in the export conflict to the new, resolved tree.
|
||||
Any file that differs -- eg, any file that was involved in the conflict --
|
||||
gets removed from the export.
|
||||
|
||||
> In the todo, I said:
|
||||
>
|
||||
> For example, if A exports a tree containing `[foo]`, and B exports a tree
|
||||
> containing `[foo, bar]`, bar gets unexported when resolving the conflict.
|
||||
|
||||
Let's be more clear about the content of the trees, and say A exports
|
||||
`[(foo, 1)]` and B exports `[(foo, 1), (bar, 1)]`.
|
||||
|
||||
If the export is then resolved to `[(foo, 1), (bar, 1)]`,
|
||||
we can see nothing needs to be done. But what it currently does is
|
||||
diff from `[(foo, 1)]` to the resolution and so unexports bar.
|
||||
|
||||
If B had instead exported `[(foo, 1), (bar, 2)]`, then
|
||||
it would still need to diff from that the the resolution, and so would
|
||||
unexport bar, and so it should.
|
||||
|
||||
But.. but.. ugh. Consider an export conflict that started with `[(bar, 1)]`
|
||||
exported. A exported `[(bar, 2)]` and B exported `[(baz, 1)]` (by renaming bar
|
||||
to baz). So the export remote might contain `[(baz, 2)]` (A uploaded 2 to bar,
|
||||
and then B renamed bar to baz) or it might contain `[(bar, 2), (baz, 1)]`;
|
||||
we do not know ordering between A and B.
|
||||
|
||||
If the export conflict resolution is `[(bar, 2), (baz 1)]` then the tree
|
||||
exported by B is a subset, so it skips that one. And the tree
|
||||
exported by A is a subset, so ummm... it skips that one. And so nothing gets
|
||||
unexported. Then, it proceeds to try to upload any missing files
|
||||
to the export. If the export remote contains `[(bar, 2), (baz, 1)]` nothing is
|
||||
missing, nothing gets uploaded, all is well. But if the export remote
|
||||
contains `[(baz, 2)]`, it will upload `(bar, 2)`, resulting in
|
||||
`[(bar, 2), (baz 2)]`. That is not what it's supposed to contain.
|
||||
|
||||
So, no, this optimisation will not work!
|
||||
|
||||
The only way to make this optimisation work, I think, is to not do renaming
|
||||
when updating export remotes. But file renaming is more common than export
|
||||
conflicts; you can always adjust your workflow to avoid export conflicts,
|
||||
by pulling from the remote you tend to conflict with, before performing an
|
||||
export.
|
||||
"""]]
|
|
@ -0,0 +1,31 @@
|
|||
[[!comment format=mdwn
|
||||
username="joey"
|
||||
subject="""comment 3"""
|
||||
date="2020-05-04T18:36:30Z"
|
||||
content="""
|
||||
Hmm, but.. What if both A and B's trees are subsets
|
||||
of the resolved tree? Safe then?
|
||||
|
||||
A exports `[(foo, 1)]`, while B exports `[(bar, 2)]`
|
||||
the resolved tree is `[(foo, 1), (bar, 2)]`.
|
||||
|
||||
Well, what was in the export before? Suppose it was `[(foo, 2)]`..
|
||||
Then B would have renamed foo to bar, and A exported 1 to foo.
|
||||
Order is unknown, so the export has either of `[(foo, 1), (bar, 2)]`
|
||||
or `[(bar, 1)]`
|
||||
|
||||
Yeah, still not safe even when both trees are subsets.
|
||||
|
||||
----
|
||||
|
||||
An optimisation like this needs some way to detect if there's been a rename
|
||||
like B keeps doing in these examples. If there has not been any rename,
|
||||
the optimisation is safe.
|
||||
|
||||
export.log contains only the sha of the tree that has been exported
|
||||
by each repo to the export remote. It might contains some trees that
|
||||
were exported before, but when it gets compacted, that information is
|
||||
lost, and anyway there's no way to know if B exported some old tree before
|
||||
or after A exported its most recently exported tree. So, I don't think
|
||||
retrospective rename detection is possible.
|
||||
"""]]
|
Loading…
Reference in a new issue