When a key has no known size (e.g. from addurl --relaxed), I think data loss
could occur in this situation:

* repo A has an object for the key with size X
* repo B has an object for the same key with size Y (!= X)
* repo A transfers to the special remote
* then B transfers to the special remote
* B transfers one more chunk than A, because of the different size
* B actually "resumes" after the last chunk A uploaded. So now the remote
contains A's chunks, followed by B's extra chunk.
* A and B sync up, which merges the chunk logs. Since that log
uses "key:chunksize" as the log key, and the two logs have two different
entries, one will win or come first in the merged log. Suppose it's
the entry for B. So the merged log will be interpreted as the number of
chunks being B's.
* Now when the object is retrieved from the special remote, it will
retrieve and concatenate A's chunks, followed by B's extra chunk
(see the sketch after this list).

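To make this concrete, here is a minimal sketch of the failure. This is
not git-annex code; the chunk size, object sizes, resume behavior, and
retrieval by concatenation are modeled on the scenario above, with
made-up names.

    import qualified Data.ByteString.Char8 as B

    chunkSize :: Int
    chunkSize = 4

    -- Split an object into fixed-size chunks (the last may be short).
    chunksOf :: B.ByteString -> [B.ByteString]
    chunksOf b
        | B.null b = []
        | otherwise = c : chunksOf rest
      where
        (c, rest) = B.splitAt chunkSize b

    main :: IO ()
    main = do
        let objA = B.pack "AAAAAAAAAA"    -- A's object, size X = 10 (3 chunks)
            objB = B.pack "BBBBBBBBBBBBB" -- B's object, size Y = 13 (4 chunks)
            -- A uploads all of its chunks.
            stored = chunksOf objA
            -- B "resumes" after the last chunk A stored, so it only
            -- uploads its one extra chunk.
            stored' = stored ++ drop (length stored) (chunksOf objB)
            -- Suppose B's entry won the chunk log merge, so retrieval
            -- fetches and concatenates all four stored chunks.
            retrieved = B.concat stored'
        print retrieved -- "AAAAAAAAAAB": 11 bytes; neither A's object nor B's

In this example A's object could be recovered by truncating the result
back to X bytes, which is why recovery needs A's original length.
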
So this is corruption at least; it can be recovered from, but to do so
you have to know the original length of A's object. Note that most keys
with unknown size also have no checksum to use to verify them, so it would
be easy for this to happen and not be caught.

(Alternatively, after B transfers, it can sync with A, drop, and get
the content back from the special remote. Same result by another route,
and without needing any particular git-annex branch merge behavior to
happen, so easier to reproduce. (I have not tried either yet.))

A simultaneous upload by A and B might cause unrecoverable data loss
if they e.g. alternate chunks. Unsure if that can really happen.

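For the alternating case, here is a sketch under the (unverified)
assumption that concurrent uploads race per chunk slot, with the last
write to each slot winning; the names are made up.

    -- Each stored slot holds whichever uploader's write landed last.
    -- With A's and B's chunks interleaved, neither original object can
    -- be reassembled, even if both sizes are known.
    racedChunks :: [a] -> [a] -> [Bool] -> [a]
    racedChunks as bs aWon = zipWith3 pick as bs aWon
      where
        pick a _ True  = a
        pick _ b False = b

    -- eg racedChunks [a1,a2,a3] [b1,b2,b3] [True,False,True] == [a1,b2,a3]
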
If A starts to transfer, sends some chunks, but is interrupted, and B
then transfers, resuming after the last chunk A stored, that would be data
loss. (The remote would hold A's leading chunks followed by the rest of
B's, the same kind of mix as above.)

It might be best to just disable storing in chunks for keys of unknown size,
since it can fail so badly with them, and they're kind of a side thing?

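A minimal sketch of that guard, assuming a key type whose size field is
optional (as it is for addurl --relaxed keys); the names are
illustrative, not the actual git-annex API.

    import Data.Maybe (isJust)

    data Key = Key
        { keyName :: String
        , keySize :: Maybe Integer -- Nothing when the size is unknown
        }

    -- Only use chunked storage when the key's size is known; store
    -- sizeless keys as a single object, so a "resume" can never mix
    -- chunks from two different objects.
    shouldChunk :: Key -> Bool
    shouldChunk = isJust . keySize
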
(Could continue to support retrieval, for whatever is already stored,
hopefully w/o it having been corrupted already.)

--[[Joey]]