starting api design
parent b7991248db
commit 87987c78cf
1 changed files with 50 additions and 21 deletions
@@ -11,7 +11,7 @@ that has the modifications in it.
 Updating the working copy is then done by merging the import treeish.
 This way, conflicts will be detected and handled as normal by git.
 
-----
+## content identifiers
 
 The remote is responsible for collecting a list of
 files currently in it, along with some content identifier. That data is
@@ -53,7 +53,28 @@ the same remote? In that case, perhaps different trees would be imported,
 and merged into master. So the two repositories then have differing
 masters, which can be reconciled in merge as usual.
 
-----
+Since exporttree remotes don't have content identifier information yet, it
+needs to be collected the first time import tree is used. (Or import
+everything, but that is probably too expensive). Any modifications made to
+exported files before the first import tree would not be noticed. Seems
+acceptable as long as this only affects exporttree remotes created before
+this feature was added.
+
+What if repo A is being used to import tree from R for a while, and the
+user gets used to editing files on R and importing them. Then they stop
+using A and switch to clone B. It would not have the content identifier
+information that A did. It seems that in this case, B needs to re-download
+everything, to build up the map of content identifiers.
+(Anything could have changed since the last time A imported).
+That seems too expensive!
+
+Would storing content identifiers in the git-annex branch be too
+expensive? Probably not.. For S3 with versioning a content identifier is
+already stored. When the content identifier is (mtime, size, inode),
+that's a small amount of data. The maximum size of a content identifier
+could be limited to the size of a typical hash, and if a remote for some
+reason gets something larger, it could simply hash it to generate
+the content identifier.
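The size cap described above could look roughly like this. This is only a sketch: `mkContentIdentifier`, `maxContentIdentifierSize`, the 32 byte limit, and the `ContentIdentifier` newtype are all assumptions made for illustration, and FNV-1a is a dependency-free placeholder where a real remote would presumably use a cryptographic hash.

```haskell
import Data.Bits (xor)
import Data.Char (ord)
import Data.List (foldl')
import Data.Word (Word64)
import Numeric (showHex)

-- Stand-in for the real ContentIdentifier type (an assumption).
newtype ContentIdentifier = ContentIdentifier String
  deriving (Eq, Show)

-- Assumed cap: 32 bytes, roughly the size of a SHA-256 digest.
maxContentIdentifierSize :: Int
maxContentIdentifierSize = 32

-- FNV-1a (64 bit), folding over code points rather than bytes.
-- Placeholder only; it is not a cryptographic hash.
fnv1a64 :: String -> Word64
fnv1a64 = foldl' step 14695981039346656037
  where
    step h c = (h `xor` fromIntegral (ord c)) * 1099511628211

-- Keep small identifiers verbatim; replace oversized ones by their hash.
mkContentIdentifier :: String -> ContentIdentifier
mkContentIdentifier raw
  | length raw <= maxContentIdentifierSize = ContentIdentifier raw
  | otherwise = ContentIdentifier (showHex (fnv1a64 raw) "")
```

With this shape, an (mtime, size, inode) triple rendered as a short string passes through unchanged, and only an unusually large identifier is reduced to a fixed-size digest.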
 
 ## race conditions TODO
 
@@ -152,25 +173,6 @@ Since this is acceptable in git, I suppose we can accept it here too..
 
 ----
 
-Since exporttree remotes don't have content identifier information yet, it
-needs to be collected the first time import tree is used. (Or import
-everything, but that is probably too expensive). Any modifications made to
-exported files before the first import tree would not be noticed. Seems
-acceptable as long as this only affects exporttree remotes created before
-this feature was added.
-
-What if repo A is being used to import tree from R for a while, and the
-user gets used to editing files on R and importing them. Then they stop
-using A and switch to clone B. It would not have the content identifier
-information that A did (unless it's stored in git-annex branch rather than
-locally). It seems that in this case, B needs to re-download everything,
-since anything could have changed since the last time A imported.
-That seems too expensive!
-
-Would storing content identifiers in the git-annex branch be too expensive?
-
-----
-
 If multiple repos can access the remote at the same time, then there's a
 potential problem when one is exporting a new tree, and the other one is
 importing from the remote.
@ -187,6 +189,33 @@ importing from the remote.
|
||||||
> to be on the remote. (May need to reword that prompt.)
|
> to be on the remote. (May need to reword that prompt.)
|
||||||
> --[[Joey]]
|
> --[[Joey]]
|
||||||
|
|
||||||
|
## api design
|
||||||
|
|
||||||
|
Pulling all of the above together, this is an extension to the
|
||||||
|
ExportActions api.
|
||||||
|
|
||||||
|
listContents :: Annex [(ExportLocation, ContentIdentifier)]
|
||||||
|
|
||||||
|
getContentIdentifier :: ExportLocation -> Annex (Maybe ContentIdentifier)
|
||||||
|
|
||||||
|
retrieveExportWithContentIdentifier :: ExportLocation -> ContentIdentifier -> FilePath -> MeterUpdate -> Annex Bool
|
||||||
|
|
||||||
|
storeExportWithContentIdentifier :: FilePath -> Key -> ExportLocation -> MeterUpdate -> Annex (Maybe ContentIdentifier)
|
||||||
|
|
||||||
|
retrieveExportWithContentIdentifier is used when downloading a new file from
|
||||||
|
the remote that listContents found. retrieveExport can't be used because
|
||||||
|
it has a Key parameter and the key is not yet known in this case.
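One way the result of listContents might drive those downloads, as a rough pure sketch. It assumes (this is not stated in the design) that the importer keeps a map from content identifier to key, and that anything in the listing without a known key has to be fetched with retrieveExportWithContentIdentifier. `needsDownload` and the simplified type stand-ins are hypothetical.

```haskell
import Data.Maybe (isNothing)

-- Simplified stand-ins for git-annex's types (assumptions).
type ExportLocation = FilePath
type Key = String
newtype ContentIdentifier = ContentIdentifier String
  deriving (Eq, Show)

-- Given the identifiers whose keys are already known and the current
-- listing of the remote, return the entries whose key is not yet known.
-- Those are the ones that would need to be downloaded.
needsDownload
  :: [(ContentIdentifier, Key)]               -- known from earlier imports
  -> [(ExportLocation, ContentIdentifier)]    -- as returned by listContents
  -> [(ExportLocation, ContentIdentifier)]
needsDownload known listed =
  [ entry | entry@(_, cid) <- listed, isNothing (lookup cid known) ]
```

A file that was merely renamed on the remote keeps its content identifier, so under this assumption it would not be downloaded again.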
+
+storeExportWithContentIdentifier is used to get the content identifier
+corresponding to what was just stored. It can either get the content
+identifier in reply to the store (as S3 does with versioning), or it can
+store to a temp location, get the content identifier of that, and then
+rename the content into place. When there's a race with a concurrent
+writer, it needs to avoid getting the ContentIdentifier for data written by
+the other writer.
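The temp location variant could be sketched like this. `RemoteOps`, `storeViaTemp`, `memoryRemote` and `demo` are hypothetical names, the types are simplified stand-ins, and the sketch assumes the content identifier survives a rename (true of (mtime, size, inode) on a typical filesystem, but an assumption in general).

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Simplified stand-ins for git-annex's types (assumptions).
type ExportLocation = FilePath
newtype ContentIdentifier = ContentIdentifier String
  deriving (Eq, Show)

-- Hypothetical primitives a remote would have to provide.
data RemoteOps = RemoteOps
  { putFile  :: ExportLocation -> String -> IO ()
  , identify :: ExportLocation -> IO (Maybe ContentIdentifier)
  , rename   :: ExportLocation -> ExportLocation -> IO ()
  }

-- Store under a temporary name, read the identifier of exactly that
-- data, then rename into place. A concurrent writer to the final
-- location cannot influence the identifier that is returned.
storeViaTemp :: RemoteOps -> String -> ExportLocation -> IO (Maybe ContentIdentifier)
storeViaTemp r content dest = do
  let tmp = dest ++ ".tmp"  -- real code would need a unique temp name
  putFile r tmp content
  mcid <- identify r tmp
  case mcid of
    Nothing -> return Nothing
    Just cid -> rename r tmp dest >> return (Just cid)

-- Toy in-memory remote whose identifier is just the content length.
memoryRemote :: IORef [(ExportLocation, String)] -> RemoteOps
memoryRemote ref = RemoteOps
  { putFile = \loc c ->
      modifyIORef' ref (((loc, c) :) . filter ((/= loc) . fst))
  , identify = \loc ->
      fmap (ContentIdentifier . show . length) . lookup loc <$> readIORef ref
  , rename = \from to -> modifyIORef' ref $ \fs ->
      case lookup from fs of
        Nothing -> fs
        Just c -> (to, c) : filter (\(l, _) -> l /= from && l /= to) fs
  }

-- Example use against the toy remote.
demo :: IO (Maybe ContentIdentifier)
demo = do
  ref <- newIORef []
  storeViaTemp (memoryRemote ref) "hello" "dir/file"
```

Note that this only keeps the returned identifier honest. The other writer can still overwrite the final location right after the rename, which is the separate race the TODO below refers to.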
+
+TODO what's needed to work around the other race condition discussed above?
+
 ----
 
 See also, [[adb_special_remote]]