git-annex/doc/todo/export/comment_4_da8b5b8c9fc371fe138590744b2adce3._comment
2017-07-10 14:06:49 -04:00

51 lines
2.2 KiB
Text

[[!comment format=mdwn
username="joey"
subject="""comment 4"""
date="2017-07-10T17:30:23Z"
content="""
Yoh suggested storing a treeish associated with the export to a special
remote. Pointed out that the diff from that treeish to the next one could
be used to update the export.
(That does seem very close to [[todo/dumb, unsafe, human-readable_backend]].)
Would need to somehow deal with partial uploads. There are two ways
an upload can be partial:
* Some of the files in the treeish have been uploaded; others have not.
* A file has been partially uploaded.
These two cases need to be disentangled somehow in order to handle
them. It could use the location log, so once a key gets uploaded
to the special remote, its content is marked present. However, using the
location log does not seem sufficient to handle all cases (eg two files
swapping names between two treeishes, where one of the files has been
uploaded only partially to the special remote).
It seems promosing to keep track of two separate treeishes:
1. The treeish that is the current goal to have exported to the special
remote.
2. The treeish for the current state of the special remote. Note that
after even an interrupted transfer, this treeish needs to be updated to
contain the current state of the special remote, which would mean
constructing a new treeish. (Perhaps a per-remote index file could be
used.)
Having these two treeishes allows:
* Detecting renames of exported files, which some special remotes can do
more efficiently.
* Determining the key that a given file on the special remote is
storing the content of.
* Resuming an interrupted export, without re-uploading all the files.
* Detecting a partially uploaded file, because the current state treeish
for the remote should not contain that file.
* Knowing what key was in the process of being sent to a partially uploaded
file, and so resuming that upload. Look at the goal treeish and see what
key it has for the file; as long as the goal treeish is the same goal
that was used for the interrupted export, that's the key. (This needs a
way to track if the goal has changed.)
* Optionally, making `git annex sync` and the assistant upload as needed
to satisfy goal treeishes for special remotes.
"""]]