From 4d0a4bbee73b87428d57405310ceaa6be629d72a Mon Sep 17 00:00:00 2001 From: sachinkumar83 Date: Fri, 14 Apr 2017 03:38:35 +0000 Subject: [PATCH 1/6] --- doc/forum/deploy.mdwn | 5 +++++ 1 file changed, 5 insertions(+) create mode 100644 doc/forum/deploy.mdwn diff --git a/doc/forum/deploy.mdwn b/doc/forum/deploy.mdwn new file mode 100644 index 0000000000..b9e0974122 --- /dev/null +++ b/doc/forum/deploy.mdwn @@ -0,0 +1,5 @@ +Greetings, + +I use the push-to-deploy pattern (as described in 4.1 http://gitolite.com/deploy.html). However, my git repo has large binary files that I'd like to annex. Is there an example of using git annex with a bare remote repository with the appropriate post-receive hook to accomplish the deploy? + +Thanks From a9dd72bb976a957a82335f8accb57b4e74bd9c8f Mon Sep 17 00:00:00 2001 From: memeplex Date: Fri, 14 Apr 2017 20:19:31 +0000 Subject: [PATCH 2/6] --- ...h_autocompletion_with_big_annex_repos.mdwn | 50 +++++++++++++++++++ 1 file changed, 50 insertions(+) create mode 100644 doc/tips/Faster_bash_autocompletion_with_big_annex_repos.mdwn diff --git a/doc/tips/Faster_bash_autocompletion_with_big_annex_repos.mdwn b/doc/tips/Faster_bash_autocompletion_with_big_annex_repos.mdwn new file mode 100644 index 0000000000..ded86df72d --- /dev/null +++ b/doc/tips/Faster_bash_autocompletion_with_big_annex_repos.mdwn @@ -0,0 +1,50 @@ +I'm currently using git annex to manage my entire file collection +(including tons of music and books) and I noticed how slow +autocompletion has become for files in the index (say for git add). +The main offender is a while-read-case-echo bash loop in +`__git_index_files` that can be readily substituted with a much faster +sed invocation. 
Here is my benchmark: + +``` +__git_index_files () +{ + local dir="$(__gitdir)" root="${2-.}" file; + if [ -d "$dir" ]; then + __git_ls_files_helper "$root" "$1" | while read -r file; do + case "$file" in + ?*/*) + echo "${file%%/*}" + ;; + *) + echo "$file" + ;; + esac; + done | sort | uniq; + fi +} + +time __git_index_files > /dev/null + + +__git_index_files () +{ + local dir="$(__gitdir)" root="${2-.}" file; + if [ -d "$dir" ]; then + __git_ls_files_helper "$root" "$1" | \ + sed -r 's@^"?([^/]+)/.*$@\1@' | sort | uniq + fi +} + +time __git_index_files > /dev/null + +real 0m0.830s +user 0m0.597s +sys 0m0.310s + +real 0m0.345s +user 0m0.357s +sys 0m0.000s +``` + +So you might redefine `__git_index_files` as above in your .bashrc after sourcing the git autocomplete script. + From db9d6e0ea0b0f575e164f6e07174c2839dc976f3 Mon Sep 17 00:00:00 2001 From: memeplex Date: Fri, 14 Apr 2017 22:11:55 +0000 Subject: [PATCH 3/6] --- ...sh_autocompletion_with_big_annex_repos.mdwn | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/doc/tips/Faster_bash_autocompletion_with_big_annex_repos.mdwn b/doc/tips/Faster_bash_autocompletion_with_big_annex_repos.mdwn index ded86df72d..61d49192ca 100644 --- a/doc/tips/Faster_bash_autocompletion_with_big_annex_repos.mdwn +++ b/doc/tips/Faster_bash_autocompletion_with_big_annex_repos.mdwn @@ -25,26 +25,26 @@ __git_index_files () time __git_index_files > /dev/null +real 0m0.830s +user 0m0.597s +sys 0m0.310s __git_index_files () { local dir="$(__gitdir)" root="${2-.}" file; if [ -d "$dir" ]; then __git_ls_files_helper "$root" "$1" | \ - sed -r 's@^"?([^/]+)/.*$@\1@' | sort | uniq + sed -r 's@/.*@@' | uniq | sort | uniq fi } + time __git_index_files > /dev/null -real 0m0.830s -user 0m0.597s -sys 0m0.310s +real 0m0.075s +user 0m0.083s +sys 0m0.010s -real 0m0.345s -user 0m0.357s -sys 0m0.000s ``` -So you might redefine `__git_index_files` as above in your .bashrc after sourcing the git autocomplete script. 
+10 times faster! So you might redefine `__git_index_files` as above in your .bashrc after sourcing the git autocomplete script. From 072d981db1d51e8a97d194c5a343325efbe38b33 Mon Sep 17 00:00:00 2001 From: memeplex Date: Fri, 14 Apr 2017 22:35:49 +0000 Subject: [PATCH 4/6] Added a comment --- ...comment_3_88e37911a0280d817559b2d51ddb1d4e._comment | 10 ++++++++++ 1 file changed, 10 insertions(+) create mode 100644 doc/forum/Git_repos_in_git_annex__63__/comment_3_88e37911a0280d817559b2d51ddb1d4e._comment diff --git a/doc/forum/Git_repos_in_git_annex__63__/comment_3_88e37911a0280d817559b2d51ddb1d4e._comment b/doc/forum/Git_repos_in_git_annex__63__/comment_3_88e37911a0280d817559b2d51ddb1d4e._comment new file mode 100644 index 0000000000..efd86d8f8f --- /dev/null +++ b/doc/forum/Git_repos_in_git_annex__63__/comment_3_88e37911a0280d817559b2d51ddb1d4e._comment @@ -0,0 +1,10 @@ +[[!comment format=mdwn + username="memeplex" + avatar="http://cdn.libravatar.org/avatar/84a611000e819ef825421de06c9bca90" + subject="comment 3" + date="2017-04-14T22:35:49Z" + content=""" +Another use case is when you use annex for backup. For example, I keep most of my files (dotfiles, scripts, music, books, etc.) in a \"home\" git annex repo. Some of them are stored in Google Drive, some on a pendrive, some on my home computer, some on my work computer. Every once in a while I sync everything to my home computer and to an additional external hard drive remote that I keep as a backup. Sadly, my archived git projects can't be managed like that. Nevertheless, it's not a big deal since they are already in the cloud (GitLab or GitHub) and in my local filesystem. Besides, since they are archived, I can just create tarballs and add them to the annex (and even if annex allowed storing .git directories, I would not be really comfortable with the huge number of symlinks that would produce). 
+ +That said, it's kind of a bathos that the illusion of a distributed, decentralized, redundant, versioned, annotated, etc. filesystem over git is broken by git itself. +"""]] From 5d26133bcea14d585329dfaafafb5419b8b24f4f Mon Sep 17 00:00:00 2001 From: memeplex Date: Mon, 17 Apr 2017 17:53:23 +0000 Subject: [PATCH 5/6] --- doc/forum/Lots_of_4k_symlinks.mdwn | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) create mode 100644 doc/forum/Lots_of_4k_symlinks.mdwn diff --git a/doc/forum/Lots_of_4k_symlinks.mdwn b/doc/forum/Lots_of_4k_symlinks.mdwn new file mode 100644 index 0000000000..c9113d9193 --- /dev/null +++ b/doc/forum/Lots_of_4k_symlinks.mdwn @@ -0,0 +1,24 @@ +Hi, + +this is a minor issue and there is probably no better solution, but I would nevertheless like to point it out and maybe discuss it a little. + +Given that the symlinks generated by annex are pretty long (they point to a file named by a long hash), ext4 uses an entire block (4K) of storage for each one instead of [embedding the symlink into the inode][inode] itself. For the "archivist use case" of annex, this might lead to tens or hundreds of MBs of disk occupied by symlinks which actually don't add up to more than a few MBs. 
+ +Here is a real world example: + +``` +(ins)carlos@carlos home$ du -hs music/ +56M music/ +(ins)carlos@carlos home$ du -bhs music/ +3.3M music/ +(ins)carlos@carlos home$ ln -s /tmp/x x +(ins)carlos@carlos home$ du x +0 x +(ins)carlos@carlos home$ ln -s /tmp/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx xx +(ins)carlos@carlos home$ du xx +4 xx +``` + +Cheers, Carlos + +[inode]: https://kernelnewbies.org/Linux_3.8#head-372b38979138cf2006bd0114ae97f889f67ef46a From bd974974fbf2ad17c2a15af82759176542a0b38d Mon Sep 17 00:00:00 2001 From: "https://me.yahoo.com/a/hVbIabkhqO11.DpKUWBoztFSLD5q#8cbe8" Date: Tue, 18 Apr 2017 16:26:08 +0000 Subject: [PATCH 6/6] Added a comment: Why is it takins too long? --- ...ment_2_bbd98d0b5d77dc7efc55ef8c2a18d612._comment | 13 +++++++++++++ 1 file changed, 13 insertions(+) create mode 100644 doc/forum/Synchronize_large_files___40__VM_images__41__/comment_2_bbd98d0b5d77dc7efc55ef8c2a18d612._comment diff --git a/doc/forum/Synchronize_large_files___40__VM_images__41__/comment_2_bbd98d0b5d77dc7efc55ef8c2a18d612._comment b/doc/forum/Synchronize_large_files___40__VM_images__41__/comment_2_bbd98d0b5d77dc7efc55ef8c2a18d612._comment new file mode 100644 index 0000000000..bc04122df2 --- /dev/null +++ b/doc/forum/Synchronize_large_files___40__VM_images__41__/comment_2_bbd98d0b5d77dc7efc55ef8c2a18d612._comment @@ -0,0 +1,13 @@ +[[!comment format=mdwn + username="https://me.yahoo.com/a/hVbIabkhqO11.DpKUWBoztFSLD5q#8cbe8" + nickname="Murat" + avatar="http://cdn.libravatar.org/avatar/52d95e40aca820c1993077ef9aa676c75700a072511c143f6db6b78be6b1b212" + subject="Why is it takins too long?" 
+ date="2017-04-18T16:26:06Z" + content=""" +Hi, +because of my requirements I need to revert the VM image via \"git reset --hard\" every time before running it, which is really fast. \"git annex unlock\", on the other hand, takes a really long time. I run git-annex on CentOS 6, version git-annex-3.20120522-2.1.el6.x86_64. If I update git-annex, would that help speed up \"unlock\"? + +Thanks a lot + +"""]]