implement updating the ContentIdentifier db with info from the git-annex branch
Untested.

This won't be super slow, but it does need to diff two likely large trees, and since the git-annex branch rarely sits still, it will most likely be run at the beginning of every import.

A possible speed improvement would be to only run this when the database did not contain a ContentIdentifier. But that would only speed up imports when there is no new version of a file on the special remote, at most renames of existing files being imported.

A better speed improvement would be to record something in the git-annex branch that indicates when an import has been run, and only do the diff if the git-annex branch has a record of a newer import than we've seen before. Then, it would only run when there is in fact new ContentIdentifier information available from a remote. Certainly doable, but didn't want to complicate things yet.
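The "better speed improvement" described in the message is not implemented by this commit. As a rough illustration only, the check could look something like the sketch below. Everything in it is hypothetical: the `ImportState` class, the `needs_update` function, and the idea of representing "when an import has been run" as an integer counter are assumptions made for the example, not anything git-annex records.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImportState:
    """Hypothetical local state: the newest import record already processed."""
    last_seen_import: Optional[int] = None


def needs_update(state: ImportState, newest_on_branch: Optional[int]) -> bool:
    """Decide whether the expensive tree diff is worth running.

    newest_on_branch is an assumed marker (here an integer counter) that
    would be recorded in the git-annex branch each time an import runs.
    """
    if newest_on_branch is None:
        # No import has ever been recorded, so there is nothing new to learn.
        return False
    if state.last_seen_import is None:
        # An import exists on the branch and we have never looked at one.
        return True
    return newest_on_branch > state.last_seen_import


def mark_seen(state: ImportState, newest_on_branch: Optional[int]) -> None:
    """Remember the newest import record after the diff has been processed."""
    if newest_on_branch is not None:
        state.last_seen_import = newest_on_branch
```

With a check like this, branch changes unrelated to imports would no longer trigger the diff, which is the saving the message is after.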
parent 12e4906657
commit ee251b2e2e
4 changed files with 72 additions and 24 deletions
@@ -17,19 +17,9 @@ this.
 * Need to support annex-tracking-branch configuration, which documentation
   says makes git-annex sync and assistant do imports.
 
-* Database.ContentIdentifier needs a way to update the database with
-  information coming from the git-annex branch. This will allow multiple
-  clones to import from the same remote, and share content identifier
-  information amoung them.
-
-  It will only need to be updated when listContents returns a
-  ContentIdentifier that is not already known in the database.
-
-  How to do the update: Stash the ref of the last git-annex branch it's
-  updated from in the database. Diff between that ref and the current
-  git-annex branch. For each file in the diff that's a .cid file, read
-  the file from the branch, and store into the database.
-  Update the stashed ref.
+* Test behavior when multiple repos import from same special remote;
+  the second importer should not re-download as long as it has pulled
+  from the first importer.
 
 * When on an adjusted unlocked branch, need to import the files unlocked.
   Also, the tracking branch code needs to know about such branches,
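The update procedure described in the todo item (stash the last ref, diff against the current git-annex branch, read each changed .cid file, store it, update the stashed ref) can be modelled in a few lines. This is a minimal sketch under stated assumptions, not the code from this commit: `CidDb`, `changed_paths`, and `update_from_branch` are hypothetical names, git trees are stood in for by plain dictionaries, entries are keyed by file path, and the content of a .cid file is treated as whitespace-separated identifiers purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

# Stand-in for a git tree: path -> file content.
Tree = Dict[str, str]


@dataclass
class CidDb:
    """Hypothetical stand-in for the ContentIdentifier database."""
    last_ref: Optional[str] = None  # the stashed git-annex branch ref
    cids: Dict[str, Set[str]] = field(default_factory=dict)


def changed_paths(old: Tree, new: Tree) -> List[str]:
    """Paths added or modified between two trees (deletions are ignored)."""
    return sorted(p for p, content in new.items() if old.get(p) != content)


def update_from_branch(db: CidDb, trees: Dict[str, Tree], current_ref: str) -> int:
    """Bring db up to date with current_ref; return how many .cid files were read."""
    if db.last_ref == current_ref:
        return 0  # the branch has not moved since the last update
    # With no stashed ref yet, diff against an empty tree.
    old = trees.get(db.last_ref, {}) if db.last_ref is not None else {}
    new = trees[current_ref]
    read = 0
    for path in changed_paths(old, new):
        if not path.endswith(".cid"):
            continue  # only .cid files carry content identifier information
        # Assumed format: whitespace-separated identifiers.
        db.cids.setdefault(path, set()).update(new[path].split())
        read += 1
    db.last_ref = current_ref  # update the stashed ref
    return read
```

In this model a second update against an unchanged ref does no work, and a later ref only costs as much as the .cid files that changed in between.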