Merge remote-tracking branch 'origin/master' into reorg
commit 539083b847
15 changed files with 162 additions and 0 deletions
@@ -0,0 +1,19 @@
[[!comment format=mdwn
 username="http://dieter-be.myopenid.com/"
 nickname="dieter"
 subject="comment 2"
 date="2011-02-16T21:32:04Z"
 content="""
thanks Joey,

is it possible to run some git annex command that tells me, for a specific directory, which files are available in another remote? (and which remote, and which filenames?)
I guess I could run that, do my own policy thingie, and run `git annex get` for the files I want.
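
Something like this rough sketch is what I have in mind, assuming some `git annex whereis` style command exists and that its output can naively be grepped for the name of the remote (the output format and the `want` policy function are just guesses on my part):

    import os
    import subprocess

    def available_in(path, remote):
        # ask git-annex where the content of this file lives; matching the
        # remote name as a substring is an assumption, not a documented interface
        out = subprocess.run(['git', 'annex', 'whereis', path],
                             capture_output=True, text=True).stdout
        return remote in out

    def get_wanted(directory, remote, want):
        for root, _, files in os.walk(directory):
            for name in files:
                path = os.path.join(root, name)
                if want(name) and available_in(path, remote):
                    subprocess.run(['git', 'annex', 'get', path])

    # e.g. fetch every .ogg under podcasts/ that the 'usbdrive' remote has
    get_wanted('podcasts', 'usbdrive', lambda name: name.endswith('.ogg'))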

For your podcast use case (and some of my own use cases), don't you think git annex might actually be overkill? For your podcasts, for example, what value does git annex give over a simple rsync/rm script?
Such a script wouldn't even need a data store to keep its state, unlike git. It seems simpler and cleaner to me.

For the mpd thing, check http://alip.github.com/mpdcron/ (bad project name; it's a plugin-based \"event handler\").
You should be able to write a simple mpdcron plugin that does what you want (or even interface with mpd yourself from perl/python/.. to use its idle mode to get events).
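
For the do-it-yourself route, a minimal sketch using the python-mpd module (the module is an assumption, and the policy hook is left up to you):

    from mpd import MPDClient

    client = MPDClient()
    client.connect('localhost', 6600)

    while True:
        changed = client.idle()          # blocks until mpd reports an event
        if 'player' in changed:
            song = client.currentsong()  # dict describing the current track
            print('now playing:', song.get('file'))
            # ... hook your own policy in here (rate it, queue the next episode, ...)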

Dieter
"""]]

@@ -0,0 +1,12 @@
[[!comment format=mdwn
 username="http://joey.kitenet.net/"
 nickname="joey"
 subject="comment 3"
 date="2011-03-16T03:01:17Z"
 content="""
Whups, the comment above got stuck in the moderation queue for 27 days. I will try to check that more frequently.

In the meantime, I've implemented \"git annex whereis\" -- enjoy!

I find keeping my podcasts in the annex useful because it allows me to download individual episodes or podcasts easily when only low bandwidth is available (ie, dialup), or over sneakernet. And it generally keeps everything organised.
"""]]

@@ -0,0 +1,14 @@
[[!comment format=mdwn
|
||||
username="https://www.google.com/accounts/o8/id?id=AItOawl9sYlePmv1xK-VvjBdN-5doOa_Xw-jH4U"
|
||||
nickname="Richard"
|
||||
subject="comment 2"
|
||||
date="2011-03-15T13:52:16Z"
|
||||
content="""
|
||||
Can't you just use an underscore instead of a colon?
|
||||
|
||||
Would it be feasible to split directories dynamically? I.e. start with SHA1_123456789abcdef0123456789abcdef012345678/SHA1_123456789abcdef0123456789abcdef012345678 and, at a certain cut-off point, switch to shorter directory names? This could even be done per subdirectory and based purely on a locally-configured number. Different annexes on different file systems or with different file subsets might even have different thresholds. This would ensure scale while not forcing you to segment from the start. Also, while segmenting with longer directory names means a flatter tree, segments longer than four characters might not make too much sense. Segmenting too often could lead to some directories becoming too populated, bringing us back to the dynamic segmentation.
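
A toy sketch of what I mean (made-up names and numbers, and it glosses over looking up keys that were stored before a directory switched layouts):

    import os

    CUTOFF = 1024     # locally configured entries-per-directory threshold
    SEGMENT = 4       # length of one directory-name segment

    def key_directory(objects_dir, key):
        # flat layout until the top level gets crowded, then segment new keys
        if len(os.listdir(objects_dir)) < CUTOFF:
            return os.path.join(objects_dir, key)
        return os.path.join(objects_dir, key[:SEGMENT], key)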

All of the above would make merging annexes by hand a _lot_ harder, but I don't know if that is a valid use case. And if all else fails, one could merge everything with the unsegmented directory names and start again from there.

-- RichiH
"""]]

@@ -0,0 +1,12 @@
[[!comment format=mdwn
 username="http://joey.kitenet.net/"
 nickname="joey"
 subject="comment 3"
 date="2011-03-16T03:13:39Z"
 content="""
It is unfortunately not possible to do system-dependent hashing, so long as git-annex stores symlinks to the content in git.

It might be possible to start without hashing, and add hashing for new files after a cutoff point, but it would add complexity.

I'm currently looking at a 2 character hash directory segment, based on an md5sum of the key, which splits it into 1024 buckets. git uses just 256 buckets for its object directory, but then its objects tend to get packed away. I sort of hope that one level is enough, but I guess I could go to 2 levels (objects/ab/cd/key), which would provide 1048576 buckets. That is probably plenty, since if you are storing more than a million files, you are probably on a modern enough system to have a filesystem that doesn't need hashing.
"""]]