Merge branch 'master' into importtree
commit
4e0d08b66b
8 changed files with 110 additions and 6 deletions
|
@ -6,7 +6,7 @@ locally paired systems, and remote servers with rsync.
|
||||||
Help me prioritize my work: What special remote would you most like
|
Help me prioritize my work: What special remote would you most like
|
||||||
to use with the git-annex assistant?
|
to use with the git-annex assistant?
|
||||||
|
|
||||||
[[!poll open=yes 18 "Amazon S3 (done)" 13 "Amazon Glacier (done)" 10 "Box.com (done)" 77 "My phone (or MP3 player)" 29 "Tahoe-LAFS" 17 "OpenStack SWIFT" 37 "Google Drive"]]
|
[[!poll open=yes 18 "Amazon S3 (done)" 13 "Amazon Glacier (done)" 10 "Box.com (done)" 79 "My phone (or MP3 player)" 29 "Tahoe-LAFS" 17 "OpenStack SWIFT" 37 "Google Drive"]]
|
||||||
|
|
||||||
This poll is ordered with the options I consider easiest to build
|
This poll is ordered with the options I consider easiest to build
|
||||||
listed first. Mostly because git-annex already supports them and they
|
listed first. Mostly because git-annex already supports them and they
|
||||||
|
|
|
@ -9,12 +9,57 @@ their changes into the local repository's version control.
|
||||||
The basic idea is to have a `git annex import --from remote` command.
|
The basic idea is to have a `git annex import --from remote` command.
|
||||||
|
|
||||||
It would find changed/new/deleted files on the remote.
|
It would find changed/new/deleted files on the remote.
|
||||||
Download the changed/new files and inject into the annex.
|
Download the changed/new files and inject into the annex.
|
||||||
Generate a new treeish, with parent the treeish that was exported earlier,
|
And then generate a commit that can be merged (by the command or later by
|
||||||
that has the modifications in it.
|
the user) to make their branch reflect changes made on the remote.
|
||||||
|
|
||||||
Updating the local working copy is then done by merging the import treeish.
|
## generating commits and merging
|
||||||
This way, conflicts will be detected and handled as normal by git.
|
|
||||||
|
For the merge to work correctly, the parent of the generated commit
|
||||||
|
needs to be, when possible, a commit whose tree corresponds to the last
|
||||||
|
tree that was exported to the remote. This way, git merge will treat the
|
||||||
|
remote the same as a normal git remote where changes were made.
|
||||||
|
|
||||||
|
The export log does not record the last exported commit though, only the
|
||||||
|
tree. And the exported tree may not be the tree of any commit in the
|
||||||
|
history; it's often a subtree.
|
||||||
|
|
||||||
|
So, the export log needs to get a commit sha added to it. And it's possible
|
||||||
|
that commit will get garbage collected or not pushed, and so not be
|
||||||
|
available. It could be linked into the git-annex branch as is done for the
|
||||||
|
exported tree, but doing that for a commit is pretty strange. It's also
|
||||||
|
possible for the user to export a tree by sha, so there's no commit.
|
||||||
|
And of course, if no export has been done yet, there would be no commit.
|
||||||
|
|
||||||
|
If the last exported commit is not accessible or not recorded, it seems it
|
||||||
|
would be ok to make a commit with no parent. git merge would then need
|
||||||
|
--allow-unrelated-histories, and it would be more likely for the merge to
|
||||||
|
have conflicts.
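
As a rough illustration of the parent-selection rule described in this section, here is a minimal Python sketch. It is not git-annex code: `plan_import_commit`, `ImportCommitPlan` and the `commit_available` callback are hypothetical names, and it assumes the export log has been extended to record a commit sha, as proposed above.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ImportCommitPlan:
    parents: List[str]      # parent shas for the generated import commit
    merge_flags: List[str]  # extra flags a later `git merge` would need


def plan_import_commit(
    last_exported_commit: Optional[str],
    commit_available: Callable[[str], bool],
) -> ImportCommitPlan:
    """Pick the parent of the generated import commit.

    last_exported_commit: sha recorded in the export log (assumed), or None
    when nothing was recorded (no export yet, or a tree was exported by sha).
    commit_available: hypothetical check that the commit still exists
    locally; it may have been garbage collected or never pushed.
    """
    if last_exported_commit is not None and commit_available(last_exported_commit):
        # Normal case: git merge sees the remote like an ordinary git remote.
        return ImportCommitPlan(parents=[last_exported_commit], merge_flags=[])
    # Fallback: a parentless commit, which needs an unrelated-histories merge.
    return ImportCommitPlan(parents=[], merge_flags=["--allow-unrelated-histories"])
```

The fallback branch is the case where conflicts are more likely, since git has no common ancestor to work from.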
|
||||||
|
|
||||||
|
## command line interface
|
||||||
|
|
||||||
|
`git annex import --from remote` would import files from the remote to the
|
||||||
|
top of the working tree. Sometimes users will want to import into a
|
||||||
|
subdirectory, so there should be a way to do that.
|
||||||
|
|
||||||
|
`git annex export` has its own way to specify a subdirectory to export,
|
||||||
|
eg "master:subdir" (which is one way of referring to a git tree in git).
|
||||||
|
So it seems it would make sense to make importing use a similar syntax.
|
||||||
|
When importing, "master:subdir" would mean to import into a tree at subdir,
|
||||||
|
and merge it into master. So any branch ref not containing a colon, eg
|
||||||
|
"master" naturally means import not in a subdir, and merge it into the
|
||||||
|
branch.
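
The proposed "branch:subdir" syntax could be interpreted along these lines. This is only a sketch in Python under the assumptions stated in the text; `ImportTarget` and `parse_import_target` are hypothetical names, not part of git-annex.

```python
from typing import NamedTuple, Optional


class ImportTarget(NamedTuple):
    branch: str            # branch the import would be merged into
    subdir: Optional[str]  # None means import at the top of the tree


def parse_import_target(ref: str) -> ImportTarget:
    """Split "master:subdir" into a branch and an optional subdirectory."""
    branch, _sep, subdir = ref.partition(":")
    if not branch:
        raise ValueError("a branch name is required")
    # Treat "master:" and "master:/" the same as plain "master".
    subdir = subdir.strip("/")
    return ImportTarget(branch, subdir or None)
```

For example, `parse_import_target("master:subdir")` yields the branch `master` and the subdirectory `subdir`, while `parse_import_target("master")` yields no subdirectory.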
|
||||||
|
|
||||||
|
Note that while export can have a particular commit or tree sha specified,
|
||||||
|
it does not make sense to import *to* a particular sha.
|
||||||
|
|
||||||
|
Also, there should be a way to configure it so `git annex sync --content`
|
||||||
|
first imports from a remote and then exports to it. Currently `git annex
|
||||||
|
export` has `--tracking` to configure the latter. It seems to only make
|
||||||
|
sense to import and export the same tracking branch. So, should `git annex
|
||||||
|
export --tracking` set the same thing, or perhaps it would be better to
|
||||||
|
move the tracking branch configuration out of `git annex export` and into
|
||||||
|
an interface that explicitly configures both import and export?
|
||||||
|
|
||||||
## content identifiers
|
## content identifiers
|
||||||
|
|
||||||
|
|
|
@ -0,0 +1,12 @@
|
||||||
|
Started building [[todo/import_tree]] (in the `importtree` branch). So far
|
||||||
|
the content identifier storage in the git-annex branch is done. Since the
|
||||||
|
API tells me it will need to both map from a key to content identifiers,
|
||||||
|
and from content identifier to the key, I also added a sqlite database to
|
||||||
|
handle the latter.
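
To make the two lookup directions concrete, here is a small Python sketch using `sqlite3`. The table layout, the function names, and the assumption that a content identifier maps to a single key per remote are all illustrative guesses; the post does not describe the actual schema.

```python
import sqlite3
from typing import List, Optional


def open_cid_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database with a hypothetical content identifier table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS content_identifiers ("
        " remote TEXT NOT NULL,"
        " cid TEXT NOT NULL,"
        " key TEXT NOT NULL,"
        " PRIMARY KEY (remote, cid))"
    )
    # Secondary index so the key -> content identifiers direction is cheap too.
    db.execute(
        "CREATE INDEX IF NOT EXISTS cid_by_key"
        " ON content_identifiers (remote, key)"
    )
    return db


def record(db: sqlite3.Connection, remote: str, cid: str, key: str) -> None:
    """Remember that `cid` on `remote` holds the content of `key`."""
    db.execute(
        "INSERT OR REPLACE INTO content_identifiers VALUES (?, ?, ?)",
        (remote, cid, key),
    )


def key_for_cid(db: sqlite3.Connection, remote: str, cid: str) -> Optional[str]:
    """Content identifier -> key (the direction the database was added for)."""
    row = db.execute(
        "SELECT key FROM content_identifiers WHERE remote = ? AND cid = ?",
        (remote, cid),
    ).fetchone()
    return row[0] if row else None


def cids_for_key(db: sqlite3.Connection, remote: str, key: str) -> List[str]:
    """Key -> all content identifiers recorded for it on `remote`."""
    rows = db.execute(
        "SELECT cid FROM content_identifiers"
        " WHERE remote = ? AND key = ? ORDER BY cid",
        (remote, key),
    )
    return [r[0] for r in rows]
```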
|
||||||
|
|
||||||
|
While implementing that, I happened to notice a bug in storage of metadata
|
||||||
|
that contains newlines; [[internals]] said that would be base64'd, but it
|
||||||
|
was not. That bug turns out to have been introduced by the ByteString
|
||||||
|
conversion in January, and it's the second bug caused by that conversion.
|
||||||
|
The other one broke git-annex on Windows, which was fixed by a release
|
||||||
|
yesterday.
|
19
doc/devblog/day_574__weeds.mdwn
Normal file
|
@ -0,0 +1,19 @@
|
||||||
|
Not a lot of progress on [[todo/import_tree]] today I feel..
|
||||||
|
|
||||||
|
Started off by adding a QuickCheck test of the content
|
||||||
|
identifier log, which did find one bug in that code.
|
||||||
|
|
||||||
|
Then started roughing out the core of the importing operation, which involves
|
||||||
|
building up git trees for the files that are imported. But that needs a
|
||||||
|
way to graft an imported tree into a subdirectory of another tree,
|
||||||
|
and the only way I had available to do it needed to read in the entire
|
||||||
|
recursive tree of the current branch, which would be slower and use
|
||||||
|
more memory than I like.
|
||||||
|
|
||||||
|
So, got sidetracked building a git tree grafter. It turns out that
|
||||||
|
the export tree code also needs to graft a tree (into the git-annex
|
||||||
|
branch), and did so using the same inefficient method that I want to
|
||||||
|
avoid, so it will also be able to be improved using the grafter.
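
The idea behind a grafter can be sketched as follows. This is a toy Python model, not the actual grafter: `Store` stands in for git's object database, and all names here are hypothetical. The point it illustrates is that only the trees along the graft path are read and rewritten, while sibling subtrees are reused by id.

```python
import hashlib
import json
from typing import Dict, List, Tuple

Entry = Tuple[str, str]  # (kind, object id); kind is "tree" or "blob"
Tree = Dict[str, Entry]  # one directory level: name -> entry


class Store:
    """Toy content-addressed store standing in for git's object database."""

    def __init__(self) -> None:
        self.trees: Dict[str, Tree] = {}
        self.reads = 0  # counts tree reads, to show how little is touched

    def write_tree(self, tree: Tree) -> str:
        data = json.dumps(tree, sort_keys=True).encode()
        oid = hashlib.sha1(data).hexdigest()
        self.trees[oid] = dict(tree)
        return oid

    def read_tree(self, oid: str) -> Tree:
        self.reads += 1
        return dict(self.trees[oid])


def graft(store: Store, base: str, path: List[str], subtree: str) -> str:
    """Return the id of a tree like `base` with `subtree` placed at `path`."""
    if not path:
        return subtree
    tree = store.read_tree(base)
    name, rest = path[0], path[1:]
    entry = tree.get(name)
    if entry is not None and entry[0] == "tree":
        child = entry[1]
    else:
        # Missing entry, or a blob in the way: start from an empty directory.
        child = store.write_tree({})
    tree[name] = ("tree", graft(store, child, rest, subtree))
    return store.write_tree(tree)
```

Grafting into `docs/import` of a root tree reads only the root and `docs`; any other subdirectory of the root is carried over unchanged without being read.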
|
||||||
|
|
||||||
|
Unfortunately, I had to stop for the day with the grafter not quite working
|
||||||
|
properly.
|
|
@ -0,0 +1,9 @@
|
||||||
|
[[!comment format=mdwn
|
||||||
|
username="gan"
|
||||||
|
avatar="http://cdn.libravatar.org/avatar/564f55f9fc3773e521bafdbb6f23efc9"
|
||||||
|
subject="Provide flags to youtube-dl?"
|
||||||
|
date="2019-02-22T18:01:25Z"
|
||||||
|
content="""
|
||||||
|
Is there already a way to specify flags to youtube-dl on a per-file basis? I think it would be OK to do it either during addurl (modifying the resulting reference that is stored in the annex somehow), or during git-annex get. This is so that the preferred format can be specified. Primarily this would make it possible to download audio-only formats for some files. (Apologies if I missed some documentation on how to achieve this.)
|
||||||
|
|
||||||
|
"""]]
|
|
@ -0,0 +1,8 @@
|
||||||
|
[[!comment format=mdwn
|
||||||
|
username="gan"
|
||||||
|
avatar="http://cdn.libravatar.org/avatar/564f55f9fc3773e521bafdbb6f23efc9"
|
||||||
|
subject="Clarification"
|
||||||
|
date="2019-02-22T18:03:16Z"
|
||||||
|
content="""
|
||||||
|
So, to clarify - I read your first answer. But if this could be done during get, then perhaps it's OK because it is an explicit request for the potentially unsafe operation?
|
||||||
|
"""]]
|
|
@ -0,0 +1,9 @@
|
||||||
|
[[!comment format=mdwn
|
||||||
|
username="joey"
|
||||||
|
subject="""Re: provide flags to youtube-dl"""
|
||||||
|
date="2019-02-22T20:01:37Z"
|
||||||
|
content="""
|
||||||
|
@gan, there's not much point in providing flags that are only used in the
|
||||||
|
initial download; the main point in adding the url to git-annex is so you
|
||||||
|
can download the same content from it again later.
|
||||||
|
"""]]
|
|
@ -3,6 +3,8 @@ and the remote allows files to somehow be edited on it, then there ought
|
||||||
to be a way to import the changes back from the remote into the git repository.
|
to be a way to import the changes back from the remote into the git repository.
|
||||||
The command could be `git annex import --from remote`
|
The command could be `git annex import --from remote`
|
||||||
|
|
||||||
|
There also ought to be a way to make `git annex sync` automatically import.
|
||||||
|
|
||||||
See [[design/importing_trees_from_special_remotes]] for current design for
|
See [[design/importing_trees_from_special_remotes]] for current design for
|
||||||
this.
|
this.
|
||||||
|
|
||||||
|
|