idea for making more special remotes support importtree
Sponsored-by: Jack Hill on Patreon
parent e853ef3095
commit 8208daaf17
3 changed files with 91 additions and 0 deletions
@ -5,3 +5,7 @@ My main concern about this is, will external special remotes pick good
ContentIdentifiers and will they manage the race conditions documented in
[[import_tree]]? Mistakes in these things can result in data loss, and it's
rather subtle stuff. --[[Joey]]

> It may be better to implement [[importtree_only_remotes]] and make
> a simpler protocol extension that supports that, rather than supporting
> both export and import tree together. --[[Joey]]

@ -0,0 +1,15 @@
[[!comment format=mdwn
username="joey"
subject="""comment 1"""
date="2021-08-30T18:04:21Z"
content="""
This seems more tractable if an rsync remote supports only importtree=yes
but not also exporttree=yes.

That would prevent needing to worry about git-annex making changes
to the remote at the same time it's getting content from it. Any changes
would be made by something else, and git-annex would only import them.

store/remove would not do anything. checkpresent would perhaps always
fail.
"""]]
72
doc/todo/importtree_only_remotes.mdwn
Normal file
@ -0,0 +1,72 @@
Currently for a special remote to support being configured
with exporttree=no importtree=yes, it needs to implement the
ImportActions interface, which uses ContentIdentifiers
for safety and includes some methods that are only needed
for exporttree=yes.

Few special remotes support that interface, and probably a lot of them
just can't; they don't have something that can be used as a ContentIdentifier,
or lack the necessary atomicity properties to implement it safely.

The external special remote protocol does not support that interface
yet, due to its complexity and also because no one has requested it.
(There is a draft protocol extension for export and import, see
<https://git-annex.branchable.com/design/external_special_remote_protocol/export_and_import_appendix/#index2h2>)
(See also [[todo/add_import_tree_to_external_special_remote_protocol]])

A simpler interface that supports only importtree=yes, without needing to
worry about exporttree=yes, could let a lot more special remotes support
tree import. (For example [[todo/import_tree_from_rsync_special_remote]].)

Such a special remote could be populated in any way by something outside
git-annex, and `git annex import --from remote` would download the content
and generate a remote tracking branch. Once imported, other clones could
use `git annex get` to download files from the special remote.

Bear in mind that, since something else is writing to the special remote, any
file on it could be overwritten at any point, so such a get may download
the wrong content. (So the remote should have retrievalSecurityPolicy =
RetrievalVerifiableKeysSecure to make downloads be verified well enough.)

I said this would not use a ContentIdentifier, but it seems it needs some
simple form of ContentIdentifier, which could be just an mtime.
Without any ContentIdentifier, it seems that each time
`git annex import --from remote` is run, it would need to re-download
all files from the remote, because it would have no way of knowing
if it had seen a version of a file before. This ContentIdentifier would
be used only to avoid re-downloading when importing. It would not be used
by any other methods. It could even be a dummy value if re-downloading
every file on import is acceptable.
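
As a rough illustration of how an mtime-based ContentIdentifier could avoid re-downloading, here is a minimal sketch. It is not git-annex code: the types are simplified stand-ins, `mtimeIdentifier` and `needsDownload` are hypothetical names, and recording identifiers per location is an assumption made to keep the example small.

```haskell
-- Hypothetical stand-ins for git-annex's real types (assumptions).
type ImportLocation = FilePath

newtype ContentIdentifier = ContentIdentifier String
    deriving (Eq, Show)

-- | Build a ContentIdentifier from an mtime given in seconds since the epoch.
mtimeIdentifier :: Integer -> ContentIdentifier
mtimeIdentifier = ContentIdentifier . show

-- | Given the identifiers recorded by an earlier import and the current
-- listing of the remote, return the locations that need to be downloaded:
-- those not seen before, or seen with a different identifier.
needsDownload
    :: [(ImportLocation, ContentIdentifier)]  -- recorded by the last import
    -> [(ImportLocation, ContentIdentifier)]  -- current listing of the remote
    -> [ImportLocation]
needsDownload seen listing =
    [ loc | (loc, cid) <- listing, lookup loc seen /= Just cid ]

main :: IO ()
main = do
    let seen =
            [ ("a.txt", mtimeIdentifier 100)
            , ("b.txt", mtimeIdentifier 200)
            ]
        listing =
            [ ("a.txt", mtimeIdentifier 100)  -- unchanged, skipped
            , ("b.txt", mtimeIdentifier 250)  -- mtime changed
            , ("c.txt", mtimeIdentifier 300)  -- new file
            ]
    print (needsDownload seen listing)  -- prints ["b.txt","c.txt"]
```

With a dummy identifier that never matches a recorded one, every location would be returned, which corresponds to the re-download-everything case.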

What is needed in such an interface?

    listImportableContents :: Annex (Maybe (ContentIdentifier, ImportableContents ByteSize))

    -- Retrieves content from an import location to a file.
    -- The content retrieved could be anything; it needs to be
    -- strongly verified if this is used to download a particular Key
    -- that was at one point stored on the remote, since the content
    -- of the remote could change at any time.
    -- (The MeterUpdate does not need to be used if it writes
    -- sequentially to the file.)
    -- Throws exception on failure.
    retrieveImport :: ImportLocation -> FilePath -> MeterUpdate -> Annex ()

    -- Checks if anything is present on the remote at the specified
    -- ImportLocation. It may check the size or other characteristics
    -- of the Key, but does not need to guarantee that the content on
    -- the remote is the same as the Key's content.
    -- Throws an exception if the remote cannot be accessed.
    checkPresentImport :: Key -> ImportLocation -> Annex Bool

listImportableContents is unchanged, and checkPresentImport above
is identical to checkPresentExport. retrieveImport is very similar
to retrieveExport, except that the content retrieved is not guaranteed
to be the same as the content of any key. Actually, it may be identical;
the only thing that uses retrieveExport forces verification of the content
retrieved since it could have been changed by another writer.

The similarity with the interface that we already have suggests that
perhaps this does not need changes to Types.Remote to implement.
It could be done as a Remote.Helper.SimpleImport that takes those
3 methods and translates them to the current interface.
Or by complicating Remote.Helper.ExportImport further..

--[[Joey]]
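
The idea of a helper that takes the three methods and translates them to a fuller interface might look roughly like the sketch below. Everything in it is an assumption for illustration: the `SimpleImport` and `FullInterface` records, their field names, and `fromSimpleImport` are hypothetical. The actions run in plain `IO` rather than `Annex`, the MeterUpdate argument is left out, and the store and remove actions are no-ops, following the rsync comment in this commit that "store/remove would not do anything".

```haskell
-- Simplified stand-ins; git-annex's real types differ (assumptions).
type ImportLocation = FilePath
type Key = String
type ByteSize = Integer

newtype ContentIdentifier = ContentIdentifier String
    deriving (Eq, Show)

type Listing = [(ImportLocation, (ContentIdentifier, ByteSize))]

-- | The three methods an import-only remote would supply (hypothetical).
data SimpleImport = SimpleImport
    { listContents :: IO (Maybe Listing)
    , retrieveImport :: ImportLocation -> FilePath -> IO ()
    , checkPresentImport :: Key -> ImportLocation -> IO Bool
    }

-- | A cut-down model of a fuller import/export interface (hypothetical).
data FullInterface = FullInterface
    { listImportable :: IO (Maybe Listing)
    , retrieve :: ImportLocation -> FilePath -> IO ()
    , checkPresent :: Key -> ImportLocation -> IO Bool
    , store :: FilePath -> Key -> ImportLocation -> IO ()
    , remove :: Key -> ImportLocation -> IO ()
    }

-- | Translate the three import-only methods to the fuller interface.
-- The export-side actions do nothing, since git-annex never writes
-- to an import-only remote.
fromSimpleImport :: SimpleImport -> FullInterface
fromSimpleImport s = FullInterface
    { listImportable = listContents s
    , retrieve = retrieveImport s
    , checkPresent = checkPresentImport s
    , store = \_ _ _ -> pure ()
    , remove = \_ _ -> pure ()
    }

-- | An in-memory remote used only for this demonstration.
demoRemote :: SimpleImport
demoRemote = SimpleImport
    { listContents = pure (Just contents)
    , retrieveImport = \_ _ -> pure ()
    , checkPresentImport = \_ loc -> pure (loc `elem` map fst contents)
    }
  where
    contents =
        [ ("photos/a.jpg", (ContentIdentifier "1630346661", 1024))
        , ("photos/b.jpg", (ContentIdentifier "1630346700", 2048))
        ]

main :: IO ()
main = do
    let remote = fromSimpleImport demoRemote
    -- The key string is a made-up placeholder.
    checkPresent remote "example-key" "photos/a.jpg" >>= print
    checkPresent remote "example-key" "photos/missing.jpg" >>= print
```

A real helper might prefer to make store and remove fail loudly instead of silently doing nothing; which is safer is a design choice this sketch does not settle.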