split out todo for special remotes and close the main todo
parent 76e11e4458
commit 362a2808a5
2 changed files with 49 additions and 32 deletions
|
@ -20,29 +20,10 @@ just hard link object files from the old to new key, and update the location
|
||||||
log for the new key to indicate the content is present in the repo.
|
log for the new key to indicate the content is present in the repo.
|
||||||
This command could be something like `git-annex migrate --update`.
|
This command could be something like `git-annex migrate --update`.
|
||||||
|
|
||||||
That wouldn't be entirely sufficient though, because special remotes from
|
|
||||||
pre-migration will be populated with the old keys. A similar command could
|
|
||||||
upload the new content to special remotes, but that would double the data
|
|
||||||
stored in a special remote (or drop the old keys from them),
|
|
||||||
and use a lot of bandwidth. Probably not a good idea.
|
|
||||||
|
|
||||||
Alternatively, the old key could be left on a special remote, but update
|
|
||||||
the location log for the special remote to say it has the new key,
|
|
||||||
and have git-annex request the old key when it wants to get (or checkpresent)
|
|
||||||
the new key from the special remote.
|
|
||||||
This would need the mapping to be cheap enough to query that it won't
|
|
||||||
significantly slow down accessing a special remote.
|
|
||||||
|
|
||||||
Dropping the new key from the special remote would then need to drop the
|
|
||||||
old key. But that could violate numcopies for the old key. Perhaps it could
|
|
||||||
check numcopies for the old key and drop it, otherwise leave the old key on
|
|
||||||
the special remote.
|
|
||||||
|
|
||||||
Rather than a dedicated command that users need to remember to run,
|
Rather than a dedicated command that users need to remember to run,
|
||||||
distributed migration could be done automatically when merging a git-annex
|
distributed migration could be done automatically when merging a git-annex
|
||||||
branch that adds migration information. Just hardlink object files and
|
branch that adds migration information. Just hardlink object files and
|
||||||
update the location log for the local repo and for available special
|
update the location log.
|
||||||
remotes.
|
|
||||||
|
|
||||||
It would be possible to avoid updating the location log, but then all
|
It would be possible to avoid updating the location log, but then all
|
||||||
location log queries would have to check the migration mapping. It would be
|
location log queries would have to check the migration mapping. It would be
|
||||||
|
@ -51,6 +32,8 @@ queries the location log for each file.
|
||||||
|
|
||||||
--[[Joey]]
|
--[[Joey]]
|
||||||
|
|
||||||
|
> [[done]] --[[Joey]]
|
||||||
|
|
||||||
# security
|
# security
|
||||||
|
|
||||||
It is possible for bad migration information to be recorded in the
|
It is possible for bad migration information to be recorded in the
|
||||||
|
@ -59,20 +42,10 @@ when bad migration information is recorded:
|
||||||
|
|
||||||
* When updating the local repository with a migration, verify that
|
* When updating the local repository with a migration, verify that
|
||||||
the object file hashes to the new key before hardlinking.
|
the object file hashes to the new key before hardlinking.
|
||||||
* When downloading content from a special remote by getting the old
|
|
||||||
pre-migration key, verify that download hashes to the new key.
|
|
||||||
|
|
||||||
That leaves at least two possible security problems:
|
> This was done.
|
||||||
|
|
||||||
* checkpresent against the special remote has to trust that the content
|
That leaves these possible security problems:
|
||||||
stored on it for the old key will hash to the new key. This could result
|
|
||||||
in data loss when a bad migration is provided, and the special remote is
|
|
||||||
trusted.
|
|
||||||
|
|
||||||
Eg, if key A is locally present, and B is present on the special
|
|
||||||
remote, and then wrong migration is recorded from B to A,
|
|
||||||
the special remote will be treated as containing a copy of A,
|
|
||||||
allowing dropping the local copy of A, which was the only copy.
|
|
||||||
|
|
||||||
* DOS by flooding the git-annex branch with migrations, resulting in
|
* DOS by flooding the git-annex branch with migrations, resulting in
|
||||||
lots of hard links (or copies on filesystems not supporting hard links)
|
lots of hard links (or copies on filesystems not supporting hard links)
|
||||||
|
@ -93,3 +66,6 @@ remote to contain the only copy.
|
||||||
If we pull a git-annex branch from someone, they can already DOS disk space
|
If we pull a git-annex branch from someone, they can already DOS disk space
|
||||||
and CPU by checking a lot of junk into git. So maybe a DOS by migration is
|
and CPU by checking a lot of junk into git. So maybe a DOS by migration is
|
||||||
not really a concern.
|
not really a concern.
|
||||||
|
|
||||||
|
> If people are worried about this kind of thing, they can avoid using the
|
||||||
|
> feature. --[[Joey]]
|
||||||
|
|
41
doc/todo/distributed_migration_for_special_remotes.mdwn
Normal file
|
@ -0,0 +1,41 @@
|
||||||
|
[[distributed_migration]] is implemented for local repositories via
|
||||||
|
`git-annex migrate --update`.
|
||||||
|
|
||||||
|
That leaves updating special remotes after a migration as the main pain
|
||||||
|
point in doing migrations.
|
||||||
|
|
||||||
|
One approach would be a command like `git-annex migrate
|
||||||
|
--update-remote=foo` that uploads new keys and drops old keys.
|
||||||
|
But that would double the data stored in the special remote and use a lot
|
||||||
|
of bandwidth.
|
||||||
|
|
||||||
|
Alternatively, the old key could be left on a special remote, but update
|
||||||
|
the location log for the special remote to say it has the new key,
|
||||||
|
and have git-annex request the old key when it wants to get (or checkpresent)
|
||||||
|
the new key from the special remote.
|
||||||
|
This would need the mapping to be cheap enough to query that it won't
|
||||||
|
significantly slow down accessing a special remote.
|
||||||
|
|
||||||
|
Dropping the new key from the special remote would then need to drop the
|
||||||
|
old key. But that could violate numcopies for the old key. Perhaps it could
|
||||||
|
check numcopies for the old key and drop it, otherwise leave the old key on
|
||||||
|
the special remote.
|
||||||
|
|
||||||
|
--[[Joey]]
|
||||||
|
|
||||||
|
# security
|
||||||
|
|
||||||
|
|
||||||
|
When downloading content from a special remote by getting the old
|
||||||
|
pre-migration key, it's important to verify that the download hashes to the new key.
|
||||||
|
See [[distributed_migration]]'s security section for relevant background.
|
||||||
|
|
||||||
|
Another problem to consider: checkpresent against the special remote has to
|
||||||
|
trust that the content stored on it for the old key will hash to the new
|
||||||
|
key. This could result in data loss when a bad migration is provided, and
|
||||||
|
the special remote is trusted.
|
||||||
|
|
||||||
|
Eg, if key A is locally present, and B is present on the special
|
||||||
|
remote, and then wrong migration is recorded from B to A,
|
||||||
|
the special remote will be treated as containing a copy of A,
|
||||||
|
allowing dropping the local copy of A, which was the only copy.
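
The difference between the two cases above can be illustrated with a toy model. This is only a sketch under stated assumptions, not git-annex's implementation: `SpecialRemote`, `MigrationAwareRemote`, the `new_to_old` mapping and the simplified `BACKEND--hash` key format are all hypothetical names made up for illustration. It shows that a download through the old key can be verified against the new key, while checkpresent has nothing to hash and so has to trust the recorded migration.

```python
import hashlib


def old_key(content: bytes) -> str:
    # Hypothetical pre-migration key (simplified; real keys carry more fields).
    return "SHA256--" + hashlib.sha256(content).hexdigest()


def new_key(content: bytes) -> str:
    # Hypothetical post-migration key using a different hash.
    return "SHA512--" + hashlib.sha512(content).hexdigest()


class SpecialRemote:
    """Toy special remote that still stores content under old keys."""

    def __init__(self):
        self.store = {}  # key -> content

    def checkpresent(self, key: str) -> bool:
        return key in self.store

    def retrieve(self, key: str) -> bytes:
        return self.store[key]


class MigrationAwareRemote:
    """Requests the old key whenever the new key is asked for."""

    def __init__(self, remote: SpecialRemote, new_to_old: dict):
        self.remote = remote
        self.new_to_old = new_to_old  # recorded migrations, possibly bad

    def _stored_key(self, key: str) -> str:
        return self.new_to_old.get(key, key)

    def checkpresent(self, key: str) -> bool:
        # No content is downloaded here, so nothing can be verified:
        # the recorded migration has to be trusted.
        return self.remote.checkpresent(self._stored_key(key))

    def get(self, key: str) -> bytes:
        content = self.remote.retrieve(self._stored_key(key))
        # Verify that the download hashes to the new key.
        if new_key(content) != key:
            raise ValueError("content does not hash to the new key")
        return content


content_a = b"content of A"
content_b = b"content of B"

remote = SpecialRemote()
remote.store[old_key(content_b)] = content_b  # only B is on the remote

# Correct migration: new key of B maps to old key of B.
good = MigrationAwareRemote(remote, {new_key(content_b): old_key(content_b)})
# Wrong migration: new key of A maps to old key of B.
bad = MigrationAwareRemote(remote, {new_key(content_a): old_key(content_b)})
```

In this model `good.get(new_key(content_b))` returns B's content, and `bad.get(new_key(content_a))` raises `ValueError`. However, `bad.checkpresent(new_key(content_a))` still reports the key as present, which is the situation that could allow dropping the only real copy of A.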
|