Merge branch 'master' of ssh://git-annex.branchable.com

commit 8a2105c90a
Joey Hess, 2011-12-23 11:34:10 -04:00
6 changed files with 168 additions and 0 deletions

@@ -0,0 +1,37 @@
[[!comment format=mdwn
username="http://adamspiers.myopenid.com/"
nickname="Adam"
subject="I think Matt is right."
date="2011-12-23T14:04:44Z"
content="""
I got bitten by this too. It seems that the user is expected to fetch
remote git-annex branches themselves, but this is not documented
anywhere.
The man page says of \"git annex merge\":

    Automatically merges any changes from remotes into the git-annex
    branch.
I am not a git newbie, but even so I had incorrectly assumed that git
annex merge would take care of pulling the git-annex branch from the
remote prior to merging, thereby ensuring all versions of the
git-annex branch would be merged, and that the location tracking data
would be synced across all peer repositories.
My master branches do not track any specific upstream branch, because
I am operating in a decentralized fashion. Therefore the error
message caused by `git pull $remote` succeeded in encouraging me to
instead use `git pull $remote master`, and this excludes the git-annex
branch from the fetch. Even worse, a git newbie might realise this
and be tempted to do `git pull $remote git-annex`, which would merge
the git-annex branch directly into their current branch.
Therefore I think it needs to be explicitly documented that

    git fetch $remote
    git merge $remote/master

is required when the local branch doesn't track an upstream branch.
Or maybe a `--fetch` option could be added to `git annex merge` to
perform the fetch from all remotes before running the merge(s).
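In the meantime, a rough equivalent of that `--fetch` option can be
scripted; this is a minimal sketch which simply fetches every
configured remote before merging:

    for remote in $(git remote); do
        git fetch \"$remote\"   # the default refspec also fetches git-annex
    done
    git annex merge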
"""]]

@@ -0,0 +1,11 @@
I used to save movies with the .srt subtitle files next to them.
Usually vlc finds them because they're in the same directory as the
movie file; with git-annex, however, the link points at a file in
another folder, so after adding movies to the annex the subtitles
don't load anymore.

I couldn't find a quick fix. I'm thinking of a bash script, but wanted
to discuss it here with all annex users.

I know it's out of annex's scope, but I think a movie archive is a
great scenario for git-annex: most of my HD is filled up with movies
from the camcorder, screencasts, etc., and we usually don't modify
those files.
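For instance, something along these lines might work (an untested
sketch: it assumes vlc resolves the symlink and then looks for
subtitles next to the target, and that the object directory is
writable):

    #!/bin/sh
    # untested sketch: for each annexed movie, place a matching .srt
    # symlink next to the object file the movie's symlink points at
    for movie in *.mp4; do
        srt="${movie%.mp4}.srt"
        [ -e "$srt" ] || continue
        target=$(readlink -f "$movie")   # resolve the annex symlink
        ln -sf "$PWD/$srt" "$target.srt"
    done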

@@ -0,0 +1,10 @@
[[!comment format=mdwn
username="http://adamspiers.myopenid.com/"
nickname="Adam"
subject="comment 1"
date="2011-12-23T13:31:33Z"
content="""
ControlPersist is awesome - thanks!
Here's [an alternative, git-specific approach](http://thread.gmane.org/gmane.comp.version-control.home-dir/502).
"""]]

@@ -0,0 +1,48 @@
I have two repos, both using the SHA1 backend and both using git.
The first one is on a laptop, the second one is a usb drive.
When I drop a file in the laptop repo, the file is not available in
that repo until I run *git annex get*, but when the usb drive is
plugged in, the file is actually available.
How about adding a feature to link some/all files to the remote repo?
For example, we have the *railscasts/196-nested-model-form-part-1.mp4*
file added to git, and only available on the usb drive:
    $ git annex whereis 196-nested-model-form-part-1.mp4
    whereis 196-nested-model-form-part-1.mp4 (1 copy)
        a7b7d7a4-2a8a-11e1-aebc-d3c589296e81 -- origin (Portable usb drive)
I can see the link with:
    $ cd railscasts
    $ ls -ls 196*
    8 lrwxr-xr-x 1 framallo staff 193 Dec 20 05:49 196-nested-model-form-part-1.mp4 -> ../.git/annex/objects/Wz/6P/SHA256-s16898930--43679c67cd968243f58f8f7fb30690b5f3f067574e318d609a01613a2a14351e/SHA256-s16898930--43679c67cd968243f58f8f7fb30690b5f3f067574e318d609a01613a2a14351e
I save this in a variable just to make the example clearer:

    ID=".git/annex/objects/Wz/6P/SHA256-s16898930--43679c67cd968243f58f8f7fb30690b5f3f067574e318d609a01613a2a14351e/SHA256-s16898930--43679c67cd968243f58f8f7fb30690b5f3f067574e318d609a01613a2a14351e"
The file doesn't exist in the local repo:

    $ ls ../$ID
    ls: ../$ID: No such file or directory
However, I can create a link to access that file on the remote repo.
First I create the needed directory:

    $ mkdir ../.git/annex/objects/Wz/6P/SHA256-s16898930--43679c67cd968243f58f8f7fb30690b5f3f067574e318d609a01613a2a14351e/
Then I link to the remote file:

    $ ln -s /mnt/usb_drive/repo_folder/$ID ../$ID
Now I can open the file in the laptop repo.
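Something along these lines could automate it (a rough sketch; it
assumes the usb drive is mounted at the path shown, and that the
annexed file lives one directory below the repo root, as in this
example):

    #!/bin/sh
    # sketch: make a missing annexed file openable by linking its object
    # to the copy in a remote repo mounted at $REMOTE
    REMOTE=/mnt/usb_drive/repo_folder
    file=$1
    obj=$(readlink "$file" | sed 's|^\.\./||')   # .git/annex/objects/...
    if [ ! -e "../$obj" ] && [ -e "$REMOTE/$obj" ]; then
        mkdir -p "$(dirname "../$obj")"
        ln -s "$REMOTE/$obj" "../$obj"
    fi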
I think it could be easy to implement. Maybe it's a naive approach, but
it looks appealing, and checking whether something is a regular file or
a link shouldn't impact performance.
The limitation is that it would only work with remote repos on local
dirs.
It would also let you have a single directory structure, like AFS or
other distributed filesystems: if the file is not local, I go to the
remote server.
That's great for apps like Picasa, iTunes, and friends that depend on
the file location.

@@ -0,0 +1,54 @@
[[!comment format=mdwn
username="http://adamspiers.myopenid.com/"
nickname="Adam"
subject="comment 7"
date="2011-12-22T20:04:14Z"
content="""
> My main concern with putting this in git-annex is that finding
> duplicates necessarily involves storing a list of every key and file
> in the repository
Only if you want to search the *whole* repository for duplicates, and if
you do, then you're necessarily going to have to chew up memory in
some process anyway, so what difference does it make whether it's
git-annex or
(say) a Perl wrapper?
> and git-annex is very carefully built to avoid things that require
> non-constant memory use, so that it can scale to very big
> repositories.
That's a worthy goal, but if everything could be implemented with an
O(1) memory footprint then we'd be in a much more pleasant world :-)
Even O(n) isn't that bad ...
That aside, I like your `--format=\"%f %k\n\"` idea a lot. That opens
up the \"black box\" of `.git/annex/objects` and makes nice things
possible, as your pipeline already demonstrates. However, I'm not
sure why you think `git annex find | sort | uniq` would be more
efficient. Not only does the sort require the very thing you were
trying to avoid (i.e. the whole list in memory), but it's also
O(n log n) which is significantly slower than my O(n) Perl script
linked above.
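For the record, the same pipeline can be done in a single O(n) pass
with a hash instead of a sort (a sketch, assuming the proposed
`--format` option existed and that filenames contain no spaces):

    git annex find --format=\"%f %k\n\" |
        awk '{ files[$2] = files[$2] \" \" $1; seen[$2]++ }
             END { for (k in seen) if (seen[k] > 1) print k \":\" files[k] }'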
More considerations about this pipeline:
* Doesn't it only include locally available files? Ideally it should
spot duplicates even when the backing blob is not available locally.
* What's the point of `--include '*'` ? Doesn't `git annex find`
with no arguments already include all files, modulo the requirement
above that they're locally available?
* Any user using this `git annex find | ...` approach is likely to
run up against its limitations sooner rather than later, because
they're already used to the plethora of options `find(1)` provides.
Rather than reinventing the wheel, is there some way `git annex find`
could harness the power of `find(1)` ?
Those considerations aside, a combined approach would be to implement

    git annex find --format=...
and then alter my Perl wrapper to `popen(2)` from that rather than using
`File::Find`. But I doubt you would want to ship Perl wrappers in the
distribution, so if you don't provide a Haskell equivalent then users
who can't code are left high and dry.
"""]]

@@ -0,0 +1,8 @@
[[!comment format=mdwn
username="http://adamspiers.myopenid.com/"
nickname="Adam"
subject="How much memory would it actually use anyway?"
date="2011-12-22T20:15:22Z"
content="""
Another thought: an SHA1 digest is 20 bytes, so you can fit over 50 million keys into 1GB of RAM. Granted, you also need memory to store the values (pathnames), which in many cases will be longer, and some users may also choose more expensive backends than SHA1 ... but even so, it seems to me that you are at risk of throwing the baby out with the bath water.
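Back of the envelope:

    $ echo $((1024 * 1024 * 1024 / 20))
    53687091

i.e. roughly 53 million 20-byte digests per GiB, before counting the
pathnames.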
"""]]