Merge branch 'master' into database

Conflicts: debian/changelog
2015-02-17 16:15:00 -04:00
commit bd6e41f8e6
parent afb3e3e472 8346ce18a8
28 changed files with 428 additions and 21 deletions
--- a/Build/LinuxMkLibs.hs
+++ b/Build/LinuxMkLibs.hs
@ -44,31 +44,40 @@ mklibs top = do
 	-- Various files used by runshell to set up env vars used by the
 	-- linker shims.
 	writeFile (top </> "libdirs") (unlines libdirs)
 	writeFile (top </> "linker")
 		(Prelude.head $ filter ("ld-linux" `isInfixOf`) libs')
 	writeFile (top </> "gconvdir")
 		(parentDir $ Prelude.head $ filter ("/gconv/" `isInfixOf`) glibclibs)
-	mapM_ (installLinkerShim top) exes
+	let linker = Prelude.head $ filter ("ld-linux" `isInfixOf`) libs'
 	mapM_ (installLinkerShim top linker) exes
 {- Installs a linker shim script around a binary.
 -
 - Note that each binary is put into its own separate directory,
 - to avoid eg git looking for binaries in its directory rather
- - than in PATH.-}
+ - than in PATH.
-installLinkerShim :: FilePath -> FilePath -> IO ()
+ -
-installLinkerShim top exe = do
+ - The linker is symlinked to a file with the same basename as the binary,
-	createDirectoryIfMissing True shimdir
+ - since that looks better in ps than "ld-linux.so".
 -}
 installLinkerShim :: FilePath -> FilePath -> FilePath -> IO ()
 installLinkerShim top linker exe = do
 	createDirectoryIfMissing True (top </> shimdir)
 	createDirectoryIfMissing True (top </> exedir)
 	renameFile exe exedest
 	link <- relPathDirToFile (top </> exedir) (top ++ linker)
 	unlessM (doesFileExist (top </> exelink)) $
 		createSymbolicLink link (top </> exelink)
 	writeFile exe $ unlines
 		[ "#!/bin/sh"
-		, "exec \"$GIT_ANNEX_LINKER\" --library-path \"$GIT_ANNEX_LD_LIBRARY_PATH\" \"$GIT_ANNEX_SHIMMED/" ++ base ++ "/" ++ base ++ "\" \"$@\""
+		, "exec \"$GIT_ANNEX_DIR/" ++ exelink ++ "\" --library-path \"$GIT_ANNEX_LD_LIBRARY_PATH\" \"$GIT_ANNEX_DIR/shimmed/" ++ base ++ "/" ++ base ++ "\" \"$@\""
 		]
 	modifyFileMode exe $ addModes executeModes
  where
 	base = takeFileName exe
-	shimdir = top </> "shimmed" </> base
+	shimdir = "shimmed" </> base
-	exedest = shimdir </> base
+	exedir = "exe"
 	exedest = top </> shimdir </> base
 	exelink = exedir </> base
 {- Converting symlinks to hard links simplifies the binary shimming
 - process. -}
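
The layout that the new `installLinkerShim` sets up can be sketched in plain POSIX sh. This is an illustration only, not the build code: `install_linker_shim` is a hypothetical helper, and it assumes the linker path is given relative to the bundle root (for example `lib64/ld-linux-x86-64.so.2`), whereas the Haskell code computes the relative link with `relPathDirToFile`.

```shell
# Hypothetical sh re-creation of the shim layout (not the real build code).
# install_linker_shim TOP LINKER EXE
#   TOP    - root of the standalone bundle
#   LINKER - dynamic linker path relative to TOP (assumption of this sketch)
#   EXE    - path of the binary to wrap
install_linker_shim() {
	top=$1 linker=$2 exe=$3
	base=$(basename "$exe")

	# Each binary gets its own directory under shimmed/, plus a shared exe/ dir.
	mkdir -p "$top/shimmed/$base" "$top/exe"
	mv "$exe" "$top/shimmed/$base/$base"

	# exe/ sits one level below TOP, so ../LINKER reaches the linker.
	# The symlink carries the binary's name, which is what ps will show.
	[ -L "$top/exe/$base" ] || ln -s "../$linker" "$top/exe/$base"

	# Replace the original binary with a shim that execs the renamed linker.
	cat > "$exe" <<EOF
#!/bin/sh
exec "\$GIT_ANNEX_DIR/exe/$base" --library-path "\$GIT_ANNEX_LD_LIBRARY_PATH" "\$GIT_ANNEX_DIR/shimmed/$base/$base" "\$@"
EOF
	chmod +x "$exe"
}
```

With `GIT_ANNEX_DIR` exported by runshell, the process then appears as `exe/git-annex` rather than `ld-linux-x86-64.so.2`, although its arguments still show the shimming.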
--- a/debian/changelog
+++ b/debian/changelog
@ -31,6 +31,7 @@ git-annex (5.20150206) UNRELEASED; urgency=medium
  * sync, assistant: Use the ssh-options git config when doing git pull
    and push.
  * remotedaemon: Use the ssh-options git config.
  * Linux standalone: Improved process names of linker shimmed programs.
  * fsck: Incremental fsck uses sqlite to store its records, instead
    of abusing the sticky bit. Existing sticky bits are ignored,
    incremental fscks started by old versions won't be resumed by
--- a/doc/bare_repositories/comment_2_c88216da0588562c851c2ceabbfebc0a._comment
+++ b/doc/bare_repositories/comment_2_c88216da0588562c851c2ceabbfebc0a._comment
@ -0,0 +1,28 @@
 [[!comment format=mdwn
 username="https://openid.stackexchange.com/user/814e4910-8e9b-4fe5-83ef-ff863c1a7314"
 nickname="BehemothTheCat"
 subject="push fails"
 date="2015-02-14T00:11:40Z"
 content="""
 These instructions don't work for me, unfortunately.
 This step:
    git push origin master git-annex
 results in:
    To ssh://my.server.com/home/itz/git/annex.git
     ! [rejected]        git-annex -> git-annex (non-fast-forward)
    error: failed to push some refs to 'ssh://my.server.com/home/itz/git/annex.git'
    hint: Updates were rejected because a pushed branch tip is behind its remote
    hint: counterpart. Check out this branch and integrate the remote changes
    hint: (e.g. 'git pull ...') before pushing again.
    hint: See the 'Note about fast-forwards' in 'git push --help' for details.
 Versions: git 1:1.9.1-1~bpo70+2 , git-annex 5.20141024~bpo70+1 (both packaged by Debian, same on local and remote)
 And yes, I did a pull on the master branch first.  Afraid to do anything
 with the git-annex branch without explicit instruction.
 """]]
--- a/doc/bugs/git_annex_direct_-62_rename:_does_not_exist/comment_8_e8890ec39415140d9d9448e5dd67a7ff._comment
+++ b/doc/bugs/git_annex_direct_-62_rename:_does_not_exist/comment_8_e8890ec39415140d9d9448e5dd67a7ff._comment
@ -0,0 +1,8 @@
 [[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawnwNDA50ZupMvOgpgDqzDRyu5B-mYlVwa4"
 nickname="Andreas"
 subject="comment 8"
 date="2015-02-16T17:32:32Z"
 content="""
 Sorry, missed the link. The recent version from the tarball fixes the issue for me.
 """]]
--- a/doc/bugs/weird_entry_in_process_list.mdwn
+++ b/doc/bugs/weird_entry_in_process_list.mdwn
@ -0,0 +1,41 @@
 ### Please describe the problem.
 The standalone linux binaries do not show up as `git-annex` in the process list, but as `ld-linux-x86-64` - it's pretty confusing!
 ### What steps will reproduce the problem?
 Install the standalone binaries from downloads.kitenet.net, run git-annex.
 ### What version of git-annex are you using? On what operating system?
 Today's snapshot from downloads.k.n.
 ### Please provide any additional information below.
 [[!format sh """
 # If you can, paste a complete transcript of the problem occurring here.
 # If the problem is with the git-annex assistant, paste in .git/annex/daemon.log
 root@koumbit-mp-test:/var/isuma/media/video# top -b  -n 1 | head -10
 top - 14:00:09 up 15 days, 23:25,  4 users,  load average: 1.18, 1.26, 1.34
 Tasks: 216 total,   1 running, 213 sleeping,   0 stopped,   2 zombie
 Cpu(s):  0.4%us,  0.1%sy,  0.0%ni, 99.3%id,  0.2%wa,  0.0%hi,  0.0%si,  0.0%st
 Mem:   6122044k total,  5469364k used,   652680k free,   321080k buffers
 Swap:  2928632k total,        0k used,  2928632k free,  4009592k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 28261 root      20   0  4528  652  528 D   79  0.0   0:01.28 ld-linux-x86-64
 1381 root      20   0  126m  13m 4060 S    2  0.2 190:25.64 Xorg
    1 root      20   0  8356  812  684 S    0  0.0   0:05.50 init
 root@koumbit-mp-test:/var/isuma/media/video# ps axf | grep annex
 9861 pts/2    S+     0:00                  |   \_ git annex add hd high high~ ipod ipod~ large low mp4_sd raw small wc xlarge
 9862 pts/2    Sl+    3:50                  |       \_ /opt/git-annex.linux//lib64/ld-linux-x86-64.so.2 --library-path /opt/git-annex.linux//etc/ld.so.conf.d:/opt/git-annex.linux//usr/lib/x86_64-linux-gnu/audit:/opt/git-annex.linux//usr/lib/x86_64-linux-gnu/gconv:/opt/git-annex.linux//usr/lib:/opt/git-annex.linux//usr/lib/x86_64-linux-gnu:/opt/git-annex.linux//lib64:/opt/git-annex.linux//lib/x86_64-linux-gnu: /opt/git-annex.linux/shimmed/git-annex/git-annex add hd high high~ ipod ipod~ large low mp4_sd raw small wc xlarge
 9878 pts/2    S+     0:00                  |           \_ /opt/git-annex.linux//lib64/ld-linux-x86-64.so.2 --library-path /opt/git-annex.linux//etc/ld.so.conf.d:/opt/git-annex.linux//usr/lib/x86_64-linux-gnu/audit:/opt/git-annex.linux//usr/lib/x86_64-linux-gnu/gconv:/opt/git-annex.linux//usr/lib:/opt/git-annex.linux//usr/lib/x86_64-linux-gnu:/opt/git-annex.linux//lib64:/opt/git-annex.linux//lib/x86_64-linux-gnu: /opt/git-annex.linux/shimmed/git/git --git-dir=.git --work-tree=. check-attr -z --stdin annex.backend annex.numcopies --
 9881 pts/2    S+     0:01                  |           \_ /opt/git-annex.linux//lib64/ld-linux-x86-64.so.2 --library-path /opt/git-annex.linux//etc/ld.so.conf.d:/opt/git-annex.linux//usr/lib/x86_64-linux-gnu/audit:/opt/git-annex.linux//usr/lib/x86_64-linux-gnu/gconv:/opt/git-annex.linux//usr/lib:/opt/git-annex.linux//usr/lib/x86_64-linux-gnu:/opt/git-annex.linux//lib64:/opt/git-annex.linux//lib/x86_64-linux-gnu: /opt/git-annex.linux/shimmed/git/git --git-dir=.git --work-tree=. cat-file --batch
 9882 pts/2    S+     0:00                  |           \_ /opt/git-annex.linux//lib64/ld-linux-x86-64.so.2 --library-path /opt/git-annex.linux//etc/ld.so.conf.d:/opt/git-annex.linux//usr/lib/x86_64-linux-gnu/audit:/opt/git-annex.linux//usr/lib/x86_64-linux-gnu/gconv:/opt/git-annex.linux//usr/lib:/opt/git-annex.linux//usr/lib/x86_64-linux-gnu:/opt/git-annex.linux//lib64:/opt/git-annex.linux//lib/x86_64-linux-gnu: /opt/git-annex.linux/shimmed/git/git --git-dir=.git --work-tree=. cat-file --batch
 28293 pts/2    R+     0:00                  |           \_ /opt/git-annex.linux//lib64/ld-linux-x86-64.so.2 --library-path /opt/git-annex.linux//etc/ld.so.conf.d:/opt/git-annex.linux//usr/lib/x86_64-linux-gnu/audit:/opt/git-annex.linux//usr/lib/x86_64-linux-gnu/gconv:/opt/git-annex.linux//usr/lib:/opt/git-annex.linux//usr/lib/x86_64-linux-gnu:/opt/git-annex.linux//lib64:/opt/git-annex.linux//lib/x86_64-linux-gnu: /opt/git-annex.linux/shimmed/sha256sum/sha256sum .git/annex/misctmp/videonew9862
 # End of transcript or log.
 """]]
 couldn't it alter its process name to make this a little more intuitive? This is especially problematic because i am trying to hook git-annex into Puppet and Facter, which require me to guess where the various git-annex repos are on the server. The way i was doing that so far was with `lsof -c 'git-annex' -F0tn`, which is obviously failing under those circumstances.... Unless there's a better way to find those repos across the system? I assume there's a git-annex assistant running here... --[[anarcat]]
 > [[fixed|done]] --[[Joey]]
--- a/doc/bugs/weird_entry_in_process_list/comment_1_61e1fc604b49964ef97f31c9d5546afc._comment
+++ b/doc/bugs/weird_entry_in_process_list/comment_1_61e1fc604b49964ef97f31c9d5546afc._comment
@ -0,0 +1,13 @@
 [[!comment format=mdwn
 username="joey"
 subject="""comment 1"""
 date="2015-02-16T23:35:08Z"
 content="""
 Haskell programs actually cannot alter their process name. I've had a bug
 open on ghc for a year about that.
 However, I can make a nicer symlink name than ld-linux.so, and use that,
 and it will then be clear what program is being run, although the
 parameters to it will still be unusual due to the shimming used in the
 standalone build.
 """]]
--- a/doc/design/assistant/polls/prioritizing_special_remotes.mdwn
+++ b/doc/design/assistant/polls/prioritizing_special_remotes.mdwn
@ -6,7 +6,7 @@ locally paired systems, and remote servers with rsync.
 Help me prioritize my work: What special remote would you most like
 to use with the git-annex assistant?
-[[!poll open=yes 18 "Amazon S3 (done)" 12 "Amazon Glacier (done)" 10 "Box.com (done)" 74 "My phone (or MP3 player)" 25 "Tahoe-LAFS" 13 "OpenStack SWIFT" 35 "Google Drive"]]
+[[!poll open=yes 18 "Amazon S3 (done)" 12 "Amazon Glacier (done)" 10 "Box.com (done)" 74 "My phone (or MP3 player)" 25 "Tahoe-LAFS" 13 "OpenStack SWIFT" 36 "Google Drive"]]
 This poll is ordered with the options I consider easiest to build
 listed first. Mostly because git-annex already supports them and they
--- a/doc/design/caching_database.mdwn
+++ b/doc/design/caching_database.mdwn
@ -25,11 +25,11 @@ Store in the database the Ref of the branch that was used to construct it.
 ## implementation plan
-1. Implement for metadata, on a branch, with sqlite.
+1. Store incremental fsck info in db, on a branch, with sqlite.
 2. Make sure that builds on all platforms.
-3. Add associated file mappings support. This is needed to fully
+3. Implement for metadata, on a branch, with sqlite.
 4. Add associated file mappings support. This is needed to fully
   use the caching database to construct views.
 4. Store incremental fsck info in db.
 5. Replace .map files with 3. for direct mode.
 ## sqlite or not?
@ -39,12 +39,21 @@ SQL. And even if that's hidden by a layer like persistent, it's still going
 to involve some technical debt (eg, database migrations).
 It would be great if there were some haskell thing like acid-state
-that I could use instead. But, acid-sate needs to load the whole
+that I could use instead. But, acid-state needs to load the whole
 DB into memory. In the comments of
 [[bugs/incremental_fsck_should_not_use_sticky_bit]] I examined several
 other haskell database-like things, and found them all wanting, except for
 possibly TCache. (And TCache is backed by persistent/sqlite anyway.)
 ## one db or multiple?
 Using a single database will use less space. Eg, each Key will only need to
 appear in it once, with proper normalization.
 OTOH, it's more complicated, and harder to recover from problems.
 Currently leaning toward one database per purpose.
 ## case study: persistent with sqlite
 Here's a non-normalized database schema in persistent's syntax.
@ -123,6 +132,11 @@ eg, esqueleto.
 Update2: Using esqueleto to do a join got this down to 0.109s.
 See `database` branch for code.
 Update3: Converting to a single un-normalized table for AssociatedFiles
 avoids the join, and increased lookup speed to 0.087s. Of course, when
 a key has multiple associated files, this will use more disk space, due
 to not normalizing the key.
 Compare the above with 1000 calls to `associatedFiles`, which is approximately
 as fast as just opening and reading 1000 files, so will take well under
 0.05s with a **cold** cache.
--- a/doc/devblog/day_253__sqlite_for_incremental_fsck.mdwn
+++ b/doc/devblog/day_253__sqlite_for_incremental_fsck.mdwn
@ -0,0 +1,56 @@
 Yesterday I did a little more investigation of key/value stores.
 I'd love a pure haskell key/value store that didn't buffer everything in
 memory, and that allowed concurrent readers, and was ACID, and production
 quality. But so far, I have not found anything that meets all those
 criteria. It seems that sqlite is the best choice for now.
 Started working on the `database` branch today. The plan is to use
 sqlite for incremental fsck first, and if that works well, do the rest
 of what's planned in [[design/caching_database]].
 At least for now, I'm going to use a dedicated database file for each
 different thing. (This may not be as space-efficient due to lacking
 normalization, but it keeps things simple.) 
 So, .git/annex/fsck.db will be used by incremental fsck, and it has
 a super simple Persistent database schema:
 [[!format haskell """
 Fscked
  key SKey
  UniqueKey key
 """]]
 It was pretty easy to implement this and make incremental fsck use it. The
 hard part is making it both fast and robust.
 At first, I was doing everything inside a single `runSqlite` action.
 Including creating the table. But, it turns out that runs as a single
 transaction, and if it was interrupted, this left the database in a
 state where it exists, but has no tables. Hard to recover from.
 So, I separated out creating the database, made that be done in a separate
 transaction and fully atomically. Now `fsck --incremental` could be ctrl-c'd
 and resumed with `fsck --more`, but it would lose the transaction and so
 not remember anything had been checked.
 To fix that, I tried making a separate transaction per file fscked. That
 worked, and it resumes nicely where it left off, but all those transactions
 made it much slower.
 To fix the speed, I made it commit just one transaction per minute. This
 seems like an ok balance. Having fsck re-do one minute's work when restarting
 an interrupted incremental fsck is perfectly reasonable, and now the speed,
 using the sqlite database, is nearly as fast as the old sticky bit hack was.
 (Specifically, 6m7s old vs 6m27s new, fscking 37000 files from cold cache
 in --fast mode.)
 There is still a problem with multiple concurrent `fsck --more`
 failing. Probably a concurrent writer problem? And, some porting will be
 required to get sqlite and persistent working on Windows and Android.
 So the branch isn't ready to merge yet, but it seems promising.
 In retrospect, while incremental fsck has the simplest database schema, it
 might be one of the harder things listed in [[design/caching_database]], 
 just because it involves so many writes to the database. The other use
 cases are more read heavy.
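
The commit-once-per-minute trade-off from this devblog entry can be illustrated with a small sh sketch. It is not the code on the `database` branch: a plain text file stands in for `.git/annex/fsck.db`, and `record_fscked`, `commit_pending` and `COMMIT_INTERVAL` are names made up for the example.

```shell
# Time-batched commits, with a plain file standing in for the sqlite database.
COMMIT_INTERVAL=${COMMIT_INTERVAL:-60}   # seconds between commits (assumed default)
pending=""                               # keys checked since the last commit
last_commit=$(date +%s)

# commit_pending DB: write the whole batch in one append, then reset it.
commit_pending() {
	if [ -n "$pending" ]; then
		printf '%s' "$pending" >> "$1"
		pending=""
	fi
	last_commit=$(date +%s)
}

# record_fscked DB KEY: remember KEY, committing only once the interval has elapsed.
record_fscked() {
	pending="$pending$2
"
	now=$(date +%s)
	if [ $((now - last_commit)) -ge "$COMMIT_INTERVAL" ]; then
		commit_pending "$1"
	fi
}
```

An interrupted run loses at most one interval's worth of records, which matches the "re-do one minute's work" cost described above. A caller would also run `commit_pending` once at normal exit.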
--- a/doc/devblog/day_253__sqlite_for_incremental_fsck/comment_1_683d669ac6af8e314585609f75cfdaf3._comment
+++ b/doc/devblog/day_253__sqlite_for_incremental_fsck/comment_1_683d669ac6af8e314585609f75cfdaf3._comment
@ -0,0 +1,7 @@
 [[!comment format=mdwn
 username="https://id.koumbit.net/anarcat"
 subject="comment 1"
 date="2015-02-16T21:26:22Z"
 content="""
 i am curious: why separate database files while you can have multiple tables in the same database file? --[[anarcat]]
 """]]
--- a/doc/direct_mode.mdwn
+++ b/doc/direct_mode.mdwn
@ -104,6 +104,14 @@ command run on that work tree, and then updating the real work
 tree to reflect any changes staged or committed by the git command,
 with appropriate handling of the direct mode files.
 ## undoing changes in direct mode
 There is also the `undo` command, which does the equivalent of the above revert in a simpler way. If you made a change in direct mode, the assistant dutifully committed it, and you then realise your mistake, you can try:
    git annex undo file
 to revert the last change to `file`. Note that you can use the `--depth` flag to revert earlier versions of the file.
 ## forcing git to use the work tree in direct mode
 This is for experts only. You can lose data doing this, or check enormous
--- a/doc/forum/Using_standard_groups_and_sync_to_preserve_history:_--all_not_recognised.mdwn
+++ b/doc/forum/Using_standard_groups_and_sync_to_preserve_history:_--all_not_recognised.mdwn
@ -0,0 +1,33 @@
 My copy of *git-annex* refuses to sync all, namely when I try it I get the following error
    $ git annex sync --content --all
    git-annex: unrecognized option `--all'
    Usage: git-annex sync [REMOTE ...] [option ...]
        --content  also transfer file contents
    To see additional options common to all commands, run: git annex help options
 This contradicts the advice on [preferred content](http://git-annex.branchable.com/preferred_content/) set out under **difference: unused**, 
 and I cannot see any other options in my man page that would address the lack of this option.
 The problem I am trying to solve is that I wish to preserve all history on the backup drives.  Namely, if I do the following
    touch test-of-annex-backup.txt
    git annex add test-of-annex-backup.txt
    git commit --message='test: Create empty test-of-annex-backup.txt file'
    git annex edit test-of-annex-backup.txt
    echo "This line creates version 2 of this file" > test-of-annex-backup.txt
    git annex add test-of-annex-backup.txt
    git commit --message='test: Create version 2 of test-of-annex-backup.txt'
    git annex sync --content --all
 I expect to see 2 copies of `test-of-annex-backup.txt` be copied to each accessible annex repository in the `backup` group
 I tried googling for `"git annex sync --content --all"`, but I only find pages telling me that this is what I should use, and none saying the option has been deprecated.
 I am very confused, as this seems to me an almost stereotypical use of *git-annex*, and yet I cannot see how to do it
 thanks
 Andrew
--- a/doc/forum/What_happens_after_terminated_add_of_huge_picture_folder63.mdwn
+++ b/doc/forum/What_happens_after_terminated_add_of_huge_picture_folder63.mdwn
@ -0,0 +1,77 @@
 Step by step:
 git annex add ./hugePictureFolder<br>
 // no it's too big and taking too long, let's not do this<br>
 CTRL+D<br>
 git annex --force drop ./hugePictureFolder<br>
 git status<br>
 fatal: bad default revision 'HEAD'<br>
 git reset --hard git-annex<br>
 git status // ok<br>
 ls <br>
 drwxr-xr-x 10 rolas rolas      4096 Vas 16 11:40 000<br>
 drwxr-xr-x  9 rolas rolas      4096 Vas 16 11:40 001<br>
 drwxr-xr-x 11 rolas rolas      4096 Vas 16 11:40 002<br>
 drwxr-xr-x 12 rolas rolas      4096 Vas 16 11:40 003<br>
 drwxr-xr-x  6 rolas rolas      4096 Vas 16 11:40 004<br>
 drwxr-xr-x 12 rolas rolas      4096 Vas 16 11:40 005<br>
 drwxr-xr-x 11 rolas rolas      4096 Vas 16 11:40 006<br>
 drwxr-xr-x 13 rolas rolas      4096 Vas 16 11:40 007<br>
 drwxr-xr-x  6 rolas rolas      4096 Vas 16 11:40 008<br>
 drwxr-xr-x 13 rolas rolas      4096 Vas 16 11:40 009<br>
 drwxr-xr-x 14 rolas rolas      4096 Vas 16 11:40 00a<br>
 drwxr-xr-x 16 rolas rolas      4096 Vas 16 11:40 00b<br>
 drwxr-xr-x 11 rolas rolas      4096 Vas 16 11:40 00c<br>
 drwxr-xr-x  9 rolas rolas      4096 Vas 16 11:40 00d<br>
 drwxr-xr-x 20 rolas rolas      4096 Vas 16 11:40 00e<br>
 drwxr-xr-x 18 rolas rolas      4096 Vas 16 11:40 00f<br>
 drwxr-xr-x 14 rolas rolas      4096 Vas 16 11:40 010<br>
 drwxr-xr-x 11 rolas rolas      4096 Vas 16 11:40 011<br>
 drwxr-xr-x 10 rolas rolas      4096 Vas 16 11:40 012<br>
 drwxr-xr-x 12 rolas rolas      4096 Vas 16 11:40 013<br>
 drwxr-xr-x  7 rolas rolas      4096 Vas 16 11:40 014<br>
 drwxr-xr-x 16 rolas rolas      4096 Vas 16 11:40 015<br>
 drwxr-xr-x  7 rolas rolas      4096 Vas 16 11:40 016<br>
 drwxr-xr-x 10 rolas rolas      4096 Vas 16 11:40 017<br>
 drwxr-xr-x  9 rolas rolas      4096 Vas 16 11:40 018<br>
 drwxr-xr-x 10 rolas rolas      4096 Vas 16 11:40 019<br>
 drwxr-xr-x  8 rolas rolas      4096 Vas 16 11:40 01a<br>
 drwxr-xr-x 10 rolas rolas      4096 Vas 16 11:40 01b<br>
 drwxr-xr-x 10 rolas rolas      4096 Vas 16 11:40 01c<br>
 drwxr-xr-x  8 rolas rolas      4096 Vas 16 11:40 01d<br>
 drwxr-xr-x 10 rolas rolas      4096 Vas 16 11:40 01e<br>
 drwxr-xr-x 11 rolas rolas      4096 Vas 16 11:40 01f<br>
 drwxr-xr-x 15 rolas rolas      4096 Vas 16 11:40 020<br>
 drwxr-xr-x 13 rolas rolas      4096 Vas 16 11:40 021<br>
 drwxr-xr-x  5 rolas rolas      4096 Vas 16 11:40 022<br>
 drwxr-xr-x 10 rolas rolas      4096 Vas 16 11:40 023<br>
 drwxr-xr-x  9 rolas rolas      4096 Vas 16 11:40 024<br>
 drwxr-xr-x 12 rolas rolas      4096 Vas 16 11:40 025<br>
 drwxr-xr-x  8 rolas rolas      4096 Vas 16 11:40 026<br>
 drwxr-xr-x 12 rolas rolas      4096 Vas 16 11:40 027<br>
 drwxr-xr-x 10 rolas rolas      4096 Vas 16 11:40 028<br>
 drwxr-xr-x 10 rolas rolas      4096 Vas 16 11:40 029<br>
 drwxr-xr-x  9 rolas rolas      4096 Vas 16 11:40 02a<br>
 drwxr-xr-x  9 rolas rolas      4096 Vas 16 11:40 02b<br>
 drwxr-xr-x  6 rolas rolas      4096 Vas 16 11:40 02c<br>
 drwxr-xr-x  7 rolas rolas      4096 Vas 16 11:40 02d<br>
 drwxr-xr-x 10 rolas rolas      4096 Vas 16 11:40 02e<br>
 drwxr-xr-x  5 rolas rolas      4096 Vas 16 11:40 02f<br>
 drwxr-xr-x 13 rolas rolas      4096 Vas 16 11:40 030<br>
 drwxr-xr-x 10 rolas rolas      4096 Vas 16 11:40 031<br>
 drwxr-xr-x 10 rolas rolas      4096 Vas 16 11:40 032<br>
 ...<br>
 ...<br>
 <br>
 What did I Do? Can I do CTRL+D? If yes, what should I do to recover?<br>
 <br>
 Thanks<br>
 Rolandas<br>
 <br>
 $ git --version<br>
 git version 2.3.0<br>
 <br>
 $ git annex version<br>
 git-annex version: 5.20140412ubuntu1<br>
 <br>
--- a/doc/forum/canceling_wrong_repository_merge/comment_2_87aabff41a1c6aec773b8f52ead51105._comment
+++ b/doc/forum/canceling_wrong_repository_merge/comment_2_87aabff41a1c6aec773b8f52ead51105._comment
@ -0,0 +1,11 @@
 [[!comment format=mdwn
 username="https://id.koumbit.net/anarcat"
 subject="watch out for direct mode"
 date="2015-02-16T23:22:19Z"
 content="""
 so while `git rebase` can do magic, it will not work out of the box on direct mode repositories, unless you use `-c core.bare=false`, in which case you will totally shoot yourself in the foot because git will happily remove all those real files sitting in the checkout. you will need to `git annex indirect` before you do any of that magic. working on a clone of the git repo is also a good idea, if only for testing.
 i personally destroyed my whole music collection doing such a cleanup of the history. fortunately, i had a recent archived clone of the repo, so things weren't so bad.
 but watch out for direct mode, as always.
 """]]
--- a/doc/forum/multiple_urls_for_the_same_UUID/comment_8_c44144c677d54aaea6e900d0d7e000a3._comment
+++ b/doc/forum/multiple_urls_for_the_same_UUID/comment_8_c44144c677d54aaea6e900d0d7e000a3._comment
@ -0,0 +1,8 @@
 [[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawkMTPqZZWoz396ABpx6nh3osxKQCFaSW6M"
 nickname="Mark"
 subject="comment 8"
 date="2015-02-13T13:55:21Z"
 content="""
 Ah, ok. So it doesn't cause any issues that the host/* remote branches will also keep getting swapped from one repository to another? The operation of `git annex sync` is sufficiently (and happily) opaque to me, so I was concerned that this might break some of its basic assumptions.
 """]]
--- a/doc/forum/optimising_lookupkey.mdwn
+++ b/doc/forum/optimising_lookupkey.mdwn
@ -0,0 +1,13 @@
 to work around [[forum/original_filename_on_s3/]], i need to get the key from a file, and i'm not within the git-annex process. i know there's `git annex lookupkey $FILE`, but that incurs significant overhead because the whole git annex runtime needs to fire up. in my tests, this takes around 25ms on average.
 could i optimise this by simply doing a `readlink` call on the git checkout? it sure looks like `readlink | basename` is all I really need, and that can probably be done below 10ms (4ms in my tests). how reliable are those links anyways, and is that what lookupkey does?
 similarly, i wonder if it's safe to bypass git-annex and talk straight with git to extract location tracking? i can jump from 90ms to below 10ms for such requests if I turn `git annex find <file>` into the convoluted:
 <pre>
 git annex lookupkey $file
 printf $key | md5sum
 git cat-file -p refs/heads/git-annex:$hash/${key}.log
 </pre>
 thanks. --[[anarcat]]
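
The two shortcuts asked about in this post can be sketched in sh as follows. Both helpers are hypothetical and rest on assumptions taken from the post itself: that the file is an indirect-mode symlink whose target ends in the key, and that the location log lives under a two-level directory built from the first six hex digits of the key's md5sum.

```shell
# key_of_link FILE: the key is the basename of the annex symlink's target.
# Assumes an indirect-mode checkout where FILE is still a symlink.
key_of_link() {
	basename "$(readlink "$1")"
}

# hashdir_lower KEY: first three and next three hex digits of md5(KEY),
# joined as "xxx/yyy" (assumed layout of the git-annex branch).
hashdir_lower() {
	h=$(printf '%s' "$1" | md5sum | cut -c1-32)
	printf '%s/%s\n' "$(printf '%s' "$h" | cut -c1-3)" "$(printf '%s' "$h" | cut -c4-6)"
}
```

Under those assumptions, the location log for a file could be read with `key=$(key_of_link "$file")` followed by `git cat-file -p "refs/heads/git-annex:$(hashdir_lower "$key")/$key.log"`. Whether this matches what `lookupkey` does in every repository mode is the open question of the post.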
--- a/doc/forum/root_assistant63.mdwn
+++ b/doc/forum/root_assistant63.mdwn
@ -0,0 +1,3 @@
 How safe (or not) is it to run the assistant as root?
 If not safe, what would be a good way to sync directories like /usr/local ?
--- a/doc/internals/hashing/comment_5_b0cb207a85cda5a0ff2ea71caca22c0d._comment
+++ b/doc/internals/hashing/comment_5_b0cb207a85cda5a0ff2ea71caca22c0d._comment
@ -0,0 +1,11 @@
 [[!comment format=mdwn
 username="https://id.koumbit.net/anarcat"
 subject="why md5sum?"
 date="2015-02-13T15:59:46Z"
 content="""
 why the extra processing to generate the hashing directories?
 we already have a hash here: for example, `SHA256E-s8242375--5f82490990812ad3feabb02355750710a9d94283ab256d1c691c3bf8d7d9fbe3.ogg` has a long `5f82490990812ad3feabb02355750710a9d94283ab256d1c691c3bf8d7d9fbe3` hash. Why not use the first characters of that? This will not change for a given file, and has a higher chance of generating collisions (which is a good thing here, because we can reuse directories).
 In other words, why aren't the hashes of `SHA256E-s8242375--5f82490990812ad3feabb02355750710a9d94283ab256d1c691c3bf8d7d9fbe3.ogg` simply `5f8/249`? --[[anarcat]]
 """]]
--- a/doc/special_remotes/S3.mdwn
+++ b/doc/special_remotes/S3.mdwn
@ -68,4 +68,4 @@ the S3 remote.
  then use the same bucket.
 * `x-amz-meta-*` are passed through as http headers when storing keys
-  in S3.
+  in S3. See [the Internet Archive S3 interface documentation](https://archive.org/help/abouts3.txt) for example headers.
--- a/doc/special_remotes/directory.mdwn
+++ b/doc/special_remotes/directory.mdwn
@ -6,7 +6,7 @@ you want to use to sneakernet files between systems (possibly with
 the drive's mountpoint as a directory remote.
 Note that directory remotes have a special directory structure
-(by design, the same as the \[[rsync|rsync]] remote).
+(by design, the same as the [[rsync|rsync]] remote).
 If you just want two copies of your repository with the files "visible"
 in the tree in both, the directory special remote is not what you want.
 Instead, you should use a regular `git clone` of your git-annex repository.
--- a/doc/tips/Git_annex_and_Calibre/comment_1_b0ef346eaab9ff616aa1ba6b5f4530bc._comment
+++ b/doc/tips/Git_annex_and_Calibre/comment_1_b0ef346eaab9ff616aa1ba6b5f4530bc._comment
@ -0,0 +1,7 @@
 [[!comment format=mdwn
 username="https://id.koumbit.net/anarcat"
 subject="spurious changes in metadata.db"
 date="2015-02-15T05:03:20Z"
 content="""
 note that metadata.db seems to change even though no change was performed on the library. i filed a [bug report upstream](https://bugs.launchpad.net/calibre/+bug/1422058) to try and figure out what is going on here. -- [[anarcat]]
 """]]
--- a/doc/tips/Git_annex_and_Calibre/comment_2_9e8122ea81bbd0a86bd6c5173db801f8._comment
+++ b/doc/tips/Git_annex_and_Calibre/comment_2_9e8122ea81bbd0a86bd6c5173db801f8._comment
@ -0,0 +1,7 @@
 [[!comment format=mdwn
 username="https://id.koumbit.net/anarcat"
 subject="undo to the rescue"
 date="2015-02-15T05:52:58Z"
 content="""
 note: to avoid having too many such changes, i end up using [[todo/direct_mode_undo]] quite often.
 """]]
--- a/doc/tips/publishing_your_files_to_the_public/comment_2_27a40806d009d617b3ad56873197bf87._comment
+++ b/doc/tips/publishing_your_files_to_the_public/comment_2_27a40806d009d617b3ad56873197bf87._comment
@ -0,0 +1,7 @@
 [[!comment format=mdwn
 username="BojanNikolic"
 subject="Publishing using rsync/directory layout"
 date="2015-02-16T10:04:41Z"
 content="""
 Is it possible to easily do the same with rsync/directory layout of the special remote? These have prefixes which are not shown when doing git annex lookupkey
 """]]
--- a/doc/todo/Display_fingerprint_to_WebApps_GPG_34create_encrypted_new_repo34.mdwn
+++ b/doc/todo/Display_fingerprint_to_WebApps_GPG_34create_encrypted_new_repo34.mdwn
@ -0,0 +1,14 @@
 I'm using git-annex 5.20150205-gbf9058a and just used the WebApp to create a new remote SSH repo, and thought I'd try the encrypted option.
 It gave me three GPG keys to choose from (all valid keys) but only displayed the email addresses, which were all identical, so I couldn't tell which was which.
 I then clicked the first key selection button, hoping it would display more info but it seemed to start doing things immediately. It requested the GPG passphrase which I cancelled but it was still doing things, and worse it wasn't clear what state the repo was in (encrypted or not), so I deleted it and started again (it's fine now).
 The passphrase dialog box does display the key fingerprint, but it's then too late to alter the key selection.
 Request: Could the WebApp always display the fingerprint after the email address?
 Some clarity on what happens when you cancel would be nice too.
 Thanks
 Giovanni
--- a/doc/todo/direct_mode_undo.mdwn
+++ b/doc/todo/direct_mode_undo.mdwn
@ -84,3 +84,5 @@ that touched files in that directory, and undo the changes to those files.
 Also, --depth could make undo look for an older commit than the most
 recent one to affect the specified file.
 See [[direct_mode]] for documentation about this feature.
--- a/doc/todo/direct_mode_undo/comment_1_bd7e9f152805a57cce97bef64e4891dd._comment
+++ b/doc/todo/direct_mode_undo/comment_1_bd7e9f152805a57cce97bef64e4891dd._comment
@ -0,0 +1,14 @@
 [[!comment format=mdwn
 username="https://id.koumbit.net/anarcat"
 subject="comment 1"
 date="2015-02-15T05:46:01Z"
 content="""
 > This way, if a file has a staged change, it gets committed, and then that commit is reverted, resulting in another commit. Which a later run of undo can in turn revert. If it didn't commit, the history about the staged change that was reverted would be lost.
 so far, my experience with this is that unstaged changes get dropped and the change that gets undone is the last committed change. In other words, if i have:
    $ git annex status
    M file
 `git annex undo` is going to drop that modification and `git revert HEAD`. but maybe i got confused, in which case some of the documentation i just did in [[direct mode]] needs to be corrected. --[[anarcat]]
 """]]
--- a/doc/todo/direct_mode_undo/comment_2_b826160420e0f343aadc5353e50aed2b._comment
+++ b/doc/todo/direct_mode_undo/comment_2_b826160420e0f343aadc5353e50aed2b._comment
@ -0,0 +1,17 @@
 [[!comment format=mdwn
 username="https://id.koumbit.net/anarcat"
 subject="sigh... nevermind"
 date="2015-02-15T05:52:02Z"
 content="""
 seems like i was wrong. i could have sworn i saw a committed file get unstaged. what i saw was:
    $ git annex status
    M file
    $ git annex undo file
    $ git annex status
    ? file
 the thing is: the file was *removed* in a previous version, so i thought this was what it reverted to. i'm unsure as to why the file was marked as missing there - i ended up reverting from a backup (from another remote, by hand). after trying to reproduce this, i failed, so there may have been some PEBKAC in action again.
 this feature is so useful though, thanks for this. --[[anarcat]]
 """]]
--- a/standalone/linux/skel/runshell
+++ b/standalone/linux/skel/runshell
@ -66,10 +66,8 @@ for lib in $(cat $base/libdirs); do
 	GIT_ANNEX_LD_LIBRARY_PATH="$base/$lib:$GIT_ANNEX_LD_LIBRARY_PATH"
 done
 export GIT_ANNEX_LD_LIBRARY_PATH
-GIT_ANNEX_LINKER="$base/$(cat $base/linker)"
+GIT_ANNEX_DIR="$base"
-export GIT_ANNEX_LINKER
+export GIT_ANNEX_DIR
 GIT_ANNEX_SHIMMED="$base/shimmed"
 export GIT_ANNEX_SHIMMED
 ORIG_GCONV_PATH="$GCONV_PATH"
 export ORIG_GCONV_PATH