update
This commit is contained in:
parent ca1d80d708
commit 937197842e
1 changed file with 22 additions and 12 deletions
@@ -17,11 +17,11 @@ file, that similarly leaks information.
 It is not currently possible to enable chunking on a non-chunked remote.
 
 Problem: Two uploads of the same key from repos with different chunk sizes
-could lead to data loss. For example, suppose A is 10 mb, and B is 20 mb,
-and the upload speed is the same. If B starts first, when A will overwrite
-the file it is uploading for the 1st chunk. Then A uploads the second
-chunk, and once A is done, B finishes the 1st chunk and uploads its second.
-We now have [chunk 1(from A), chunk 2(from B)].
+could lead to data loss. For example, suppose A is 10 mb chunksize, and B
+is 20 mb, and the upload speed is the same. If B starts first, when A will
+overwrite the file it is uploading for the 1st chunk. Then A uploads the
+second chunk, and once A is done, B finishes the 1st chunk and uploads its
+second. We now have [chunk 1(from A), chunk 2(from B)].
 
 # new requirements
 
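The race in that hunk comes down to chunk files being named only by key and index, so a 10 mb chunk 1 and a 20 mb chunk 1 collide. A minimal sketch (not git-annex code; the naming scheme and the last-write-wins remote are assumptions for illustration) reproducing the example's outcome:

```python
# Hypothetical naming scheme: chunk size is not part of the file name,
# so uploads with different chunk sizes write to the same files.
def chunk_name(key, index):
    return f"{key}/chunk{index}"

remote = {}  # toy remote: file name -> (uploader, chunk size in mb)
key = "SHA256-s12345--xxxxxxx"

# Write-completion order from the example: B (20 mb chunks) starts first,
# A (10 mb chunks) overwrites B's chunk 1, then B's later write lands on
# top of A's chunk 2.  Last completed write wins per file.
for uploader, index, size_mb in [("B", 1, 20), ("A", 1, 10),
                                 ("A", 2, 10), ("B", 2, 20)]:
    remote[chunk_name(key, index)] = (uploader, size_mb)

# The remote now holds [chunk 1(from A), chunk 2(from B)]: data loss.
print(remote[chunk_name(key, 1)])  # ('A', 10)
print(remote[chunk_name(key, 2)])  # ('B', 20)
```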
@@ -95,7 +95,8 @@ all the chunks are present, if the key size is not known?
 Problem: Also, this makes it difficult to download encrypted keys, because
 we only know the decrypted size, not the encrypted size, so we can't
 be sure how many chunks to get, and all chunks need to be downloaded before
-we can decrypt any of them.
+we can decrypt any of them. (Assuming we encrypt first; chunking first
+avoids this problem.)
 
 Problem: Does not solve concurrent uploads with different chunk sizes.
 
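The parenthetical added in that hunk is worth spelling out: if chunking happens before encryption, the chunk count follows directly from the known decrypted key size, so a downloader knows how many chunks to fetch. A sketch of that arithmetic (the formula is an assumption, not quoted from git-annex):

```python
import math

def chunk_count(key_size, chunk_size):
    # Chunk-then-encrypt: the number of chunks is determined by the
    # known (decrypted) key size, e.g. the s12345 field of the key,
    # even though each encrypted chunk's size is unpredictable.
    return math.ceil(key_size / chunk_size)

# A 12345-byte key with a 4096-byte chunk size needs 4 chunks.
print(chunk_count(12345, 4096))  # 4
```

With encrypt-then-chunk this calculation is impossible, since only the encrypted size (which gpg compression and padding make unknowable in advance) determines the chunk count.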
@@ -155,7 +156,12 @@ the git-annex branch.
 Look at git-annex:aaa/bbb/SHA256-s12345--xxxxxxx.log.cnk to get the
 chunk count and size. File format would be:
 
-ts uuid chunksize chunkcount
+ts uuid chunksize chunkcount 0|1
 
+Where a trailing 0 means that chunk size is no longer present on the
+remote, and a trailing 1 means it is. For future expansion, any other
+value /= "0" is also accepted, meaning the chunk is present. For example,
+this could be used for [[deltas]], storing the checksums of the chunks.
+
 Note that a given remote uuid might have multiple lines, if a key was
 stored on it twice using different chunk sizes. Also note that even when
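A parser for the proposed `.log.cnk` line format is a one-liner; this sketch (field types and the sample values are assumptions for illustration) also captures the rule that any trailing value other than `"0"` counts as present:

```python
def parse_cnk_line(line):
    """Parse one line of the proposed .log.cnk format:
    ts uuid chunksize chunkcount 0|1
    Any trailing value /= "0" means the chunks are present."""
    ts, uuid, chunksize, chunkcount, flag = line.split(None, 4)
    return {
        "ts": float(ts),
        "uuid": uuid,
        "chunksize": int(chunksize),
        "chunkcount": int(chunkcount),
        "present": flag != "0",
    }

# Made-up example line: 12 chunks of 1 mb, currently present.
rec = parse_cnk_line("1405112988.48 e605dca6-446a-11e0 1048576 12 1")
print(rec["chunkcount"], rec["present"])  # 12 True
```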
@@ -164,12 +170,12 @@ remote too.
 
 `hasKey` would check if any one (chunksize, chunkcount) is satisfied by
 the files on the remote. It would also check if the non-chunked key is
-present.
+present, as a fallback.
 
 When dropping a key from the remote, drop all logged chunk sizes.
 (Also drop any non-chunked key.)
 
-As long as the location log and the new log are committed atomically,
+As long as the location log and the chunk log are committed atomically,
 this guarantees that no orphaned chunks end up on a remote
 (except any that might be left by interrupted uploads).
 
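The `hasKey` logic from that hunk can be sketched as follows (the record shape, chunk naming, and helper names are assumptions carried over from the earlier sketches, not git-annex's actual API):

```python
def has_key(log_records, files_on_remote, key):
    """Sketch of the hasKey check: the key is present if any logged
    (chunksize, chunkcount) has all its chunk files on the remote,
    or, as a fallback, if the whole non-chunked key is there."""
    for rec in log_records:
        if rec["present"]:
            needed = [f"{key}/chunk{i}"
                      for i in range(1, rec["chunkcount"] + 1)]
            if all(f in files_on_remote for f in needed):
                return True
    return key in files_on_remote  # non-chunked fallback

# Made-up example: one logged upload of 2 chunks, both on the remote.
log = [{"chunksize": 10 * 1024 * 1024, "chunkcount": 2, "present": True}]
files = {"SHA256-s12345--xxxxxxx/chunk1", "SHA256-s12345--xxxxxxx/chunk2"}
print(has_key(log, files, "SHA256-s12345--xxxxxxx"))  # True
```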
@@ -189,9 +195,13 @@ Reasons:
    this allows some chunks to come from one and some from another,
    and be reassembled without problems.
 
-2. Prevents an attacker from re-assembling the chunked file using details
-   of the gpg output. Which would expose file size if padding is being used
-   to obscure it.
+2. Also allows chunks of the same object to be downloaded from different
+   remotes, perhaps concurrently, and again be reassembled without
+   problems.
+
+3. Prevents an attacker from re-assembling the chunked file using details
+   of the gpg output. Which would expose approximate
+   file size even if padding is being used to obscure it.
 
 Note that this means that the chunks won't exactly match the configured
 chunk size. gpg does compression, which might make them a
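The chunk-then-encrypt ordering those reasons argue for, and the resulting size drift the final note mentions, can be sketched in a few lines (using zlib compression as a stand-in for gpg, purely to show that per-chunk output sizes stop matching the configured chunk size):

```python
import zlib

def store_chunks(data, chunk_size, encrypt):
    # Chunk first, then encrypt each chunk independently.  Each chunk
    # decrypts on its own, so chunks can come from different remotes;
    # the stored sizes vary and won't exactly match chunk_size.
    return [encrypt(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]

# Stand-in "encryption" (really just compression) to show the effect:
# 25 bytes with a 10-byte chunk size yields 3 independently stored chunks.
chunks = store_chunks(b"x" * 25, 10, zlib.compress)
print(len(chunks))               # 3
print([len(c) for c in chunks])  # stored sizes differ from 10
```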