From a4d37c4550b911567d0dfb29ed5ba89f619065a7 Mon Sep 17 00:00:00 2001
From: "https://www.google.com/accounts/o8/id?id=AItOawkptNW1PzrVjYlJWP_9e499uH0mjnBV6GQ"
Date: Thu, 7 Apr 2011 08:03:12 +0000
Subject: [PATCH 1/4]

---
 .../sparse_git_checkouts_with_annex.mdwn | 28 +++++++++++++++++++
 1 file changed, 28 insertions(+)
 create mode 100644 doc/forum/sparse_git_checkouts_with_annex.mdwn

diff --git a/doc/forum/sparse_git_checkouts_with_annex.mdwn b/doc/forum/sparse_git_checkouts_with_annex.mdwn
new file mode 100644
index 0000000000..32e6e4bbc4
--- /dev/null
+++ b/doc/forum/sparse_git_checkouts_with_annex.mdwn
@@ -0,0 +1,28 @@
+I checked in my music collection into git annex (about 25000 files) and i'm really impressed by the performance of git annex (after i've done an git-repack). Now i'm also moving my movies into the same git-annex, but i have the following layout of my disk drives:
+
+* small raid-1 for important stuff (music, documents), which is also backupped (aka: raid)
+* big bulk data store (aka: media)
+
+In the git-annex the following layout of files is used:
+
+* documents/ <- on raid
+* music/ <- on raid
+* videos/ <- on media
+
+Now i didn't simply clone the raid-annex to media, but did an sparse-checkout (possible since version 1.7.0)
+
+* raid: .git-annex/, documents/ and music
+* media: .git-annex/, videos/
+
+As you can see i have to checkout the .git-annex directory with the file-logs twice which slows down git operations. Everything else works fine until now. git-annex doesn't have any problem, that only a part of the symlinks are present, which is really great. Is there a possibility to sparse checkout the .git-annex directory also? Perhaps splitting the log files in .git-annex/ into N subfolders, corresponding to the toplevel subfolders, like this?
+
+* Before:
+ $ ls .git-annex
+ 00 01 02....
+* After:
+ $ ls .git-annex
+ documents/ music/ videos/
+ $ ls .git-annex/documents
+ 00 01 02....
+
+This would make it possible to checkout only the part of the log files which i'm interested in.

From eca9914be1213f8110afef35e647ce14ae22bfb7 Mon Sep 17 00:00:00 2001
From: "https://www.google.com/accounts/o8/id?id=AItOawkptNW1PzrVjYlJWP_9e499uH0mjnBV6GQ"
Date: Thu, 7 Apr 2011 08:04:39 +0000
Subject: [PATCH 2/4] (sorry for noise, had to format the code blocks)

---
 .../sparse_git_checkouts_with_annex.mdwn | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/doc/forum/sparse_git_checkouts_with_annex.mdwn b/doc/forum/sparse_git_checkouts_with_annex.mdwn
index 32e6e4bbc4..97d2f445d3 100644
--- a/doc/forum/sparse_git_checkouts_with_annex.mdwn
+++ b/doc/forum/sparse_git_checkouts_with_annex.mdwn
@@ -16,13 +16,16 @@ Now i didn't simply clone the raid-annex to media, but did an sparse-checkout (p
 
 As you can see i have to checkout the .git-annex directory with the file-logs twice which slows down git operations. Everything else works fine until now. git-annex doesn't have any problem, that only a part of the symlinks are present, which is really great. Is there a possibility to sparse checkout the .git-annex directory also? Perhaps splitting the log files in .git-annex/ into N subfolders, corresponding to the toplevel subfolders, like this?
 
-* Before:
- $ ls .git-annex
- 00 01 02....
-* After:
- $ ls .git-annex
- documents/ music/ videos/
- $ ls .git-annex/documents
- 00 01 02....
+Before:
+
+    $ ls .git-annex
+    00 01 02....
+
+After:
+
+    $ ls .git-annex
+    documents/ music/ videos/
+    $ ls .git-annex/documents
+    00 01 02....
 
 This would make it possible to checkout only the part of the log files which i'm interested in.

From 7634f92e83f8a789df1d660ed95a08d5c136f70f Mon Sep 17 00:00:00 2001
From: "http://fraggod.pip.verisignlabs.com.pip.verisignlabs.com/"
Date: Thu, 7 Apr 2011 13:44:38 +0000
Subject: [PATCH 3/4] Added a comment: Reported the issue to GHC

---
 ...ment_5_67406dd8d9bd4944202353508468c907._comment | 13 +++++++++++++
 1 file changed, 13 insertions(+)
 create mode 100644 doc/bugs/git-annex-shell:_internal_error:_evacuate__40__static__41__:_strange_closure_type_30799/comment_5_67406dd8d9bd4944202353508468c907._comment

diff --git a/doc/bugs/git-annex-shell:_internal_error:_evacuate__40__static__41__:_strange_closure_type_30799/comment_5_67406dd8d9bd4944202353508468c907._comment b/doc/bugs/git-annex-shell:_internal_error:_evacuate__40__static__41__:_strange_closure_type_30799/comment_5_67406dd8d9bd4944202353508468c907._comment
new file mode 100644
index 0000000000..bffa9bb868
--- /dev/null
+++ b/doc/bugs/git-annex-shell:_internal_error:_evacuate__40__static__41__:_strange_closure_type_30799/comment_5_67406dd8d9bd4944202353508468c907._comment
@@ -0,0 +1,13 @@
+[[!comment format=mdwn
+ username="http://fraggod.pip.verisignlabs.com.pip.verisignlabs.com/"
+ subject="Reported the issue to GHC"
+ date="2011-04-07T13:44:36Z"
+ content="""
+Finally got around to [report the issue to GHC tracker](http://hackage.haskell.org/trac/ghc/ticket/5085#comment:7).
+
+Looks quite alike (at least to the haskell-illiterate person like me) to a highest-priority issue that's hanging right at the top of the list.
+There are other similar reports, but they seem to be either related to PowerPC Macs, closed as invalid or due to needinfo inactivity.
+
+Guess any further discussion belongs there, unless ghc developers will bounce it back.
+Thanks a lot for your help, Joey, and for sharing a great thing that git-annex is.
+"""]]

From 00f1c720eddf95ce21e6d2c35623a82e97ed604c Mon Sep 17 00:00:00 2001
From: "http://joey.kitenet.net/"
Date: Thu, 7 Apr 2011 16:32:04 +0000
Subject: [PATCH 4/4] Added a comment

---
 ...mment_1_c7dc199c5740a0e7ba606dfb5e3e579a._comment | 12 ++++++++++++
 1 file changed, 12 insertions(+)
 create mode 100644 doc/forum/sparse_git_checkouts_with_annex/comment_1_c7dc199c5740a0e7ba606dfb5e3e579a._comment

diff --git a/doc/forum/sparse_git_checkouts_with_annex/comment_1_c7dc199c5740a0e7ba606dfb5e3e579a._comment b/doc/forum/sparse_git_checkouts_with_annex/comment_1_c7dc199c5740a0e7ba606dfb5e3e579a._comment
new file mode 100644
index 0000000000..7adf4fc4d6
--- /dev/null
+++ b/doc/forum/sparse_git_checkouts_with_annex/comment_1_c7dc199c5740a0e7ba606dfb5e3e579a._comment
@@ -0,0 +1,12 @@
+[[!comment format=mdwn
+ username="http://joey.kitenet.net/"
+ nickname="joey"
+ subject="comment 1"
+ date="2011-04-07T16:32:04Z"
+ content="""
+That's awesome, I had not heard of git sparse checkouts before.
+
+It does not make sense to tie the log files to the directory of the corresponding files, as then the logs would have to move when the files are moved, which would be a PITA and likely make merging log file changes very complex. Also, of course, multiple files in different locations can point at the same content, which has the same log file. And, to cap it off, git-annex can need to access the log file for a given key without having the slightest idea what file in the repository might point to it, and it would be very expensive to scan the whole repository to find out what that file is in order to lookup the filename of the log file.
+
+The most likely change in git-annex that will make this better is in [[this_todo_item|todo/branching]] -- but it's unknown how to do it yet.
+"""]]