Merge branch 'master' of ssh://git-annex.branchable.com

Joey Hess 2014-09-10 15:15:26 -04:00
commit 286021cebc
26 changed files with 1748 additions and 1 deletions

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="https://www.google.com/accounts/o8/id?id=AItOawk9nck8WX8-ADF3Fdh5vFo4Qrw1I_bJcR8"
nickname="Jon Ander"
subject="comment 14"
date="2014-09-08T07:27:46Z"
content="""
I am still experiencing this bug in Debian testing (5.20140717) and Debian sid (5.20140831).
"""]]

File diff suppressed because it is too large.

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="CandyAngel"
ip="81.111.193.130"
subject="comment 10"
date="2014-09-08T08:08:50Z"
content="""
Removing .git/annex/index is safe; it is a step in getting git-annex to [forget a commit entirely](http://git-annex.branchable.com/forum/How_to_get_git-annex_to_forget_a_commit__63__).
"""]]

@ -0,0 +1,22 @@
[[!comment format=mdwn
username="zardoz"
ip="78.48.163.229"
subject="comment 9"
date="2014-09-07T14:04:51Z"
content="""
Any ideas? I noticed one alternative way (cf. the reset workaround
above) to make «git annex add» work again is by deleting
.git/annex/index*. Is this safe?
In both repos, I had not even staged annex additions before the index
was corrupted; the corruption must somehow have been left-over from
earlier actions, although all previous additions succeeded at the time,
before both repositories mysteriously stopped working (in the context
of backend-migration).
I still have the original snapshots around if you'd like to debug
this. As noted, «git fsck» succeeds, and all the block-level checksums
check out, so the problem can't be on the block device or file-system
level.
"""]]

@ -0,0 +1,26 @@
### Please describe the problem.
When attempting to 'git annex get' a file that does not exist in the git repository, git-annex correctly reports "not found". But it still returns exit code 0, incorrectly indicating success. This is problematic for scripting.
### What steps will reproduce the problem?
See the transcript below.
### What version of git-annex are you using? On what operating system?
git-annex 5.20140517.4 as supplied by 'git-annex' aptitude package on Ubuntu 12.04.4 LTS (32-bit)
### Please provide any additional information below.
[[!format sh """
henry@commsbox:~/work/tmp$ git init test
Initialized empty Git repository in /home/henry/work/tmp/test/.git/
henry@commsbox:~/work/tmp$ cd test
henry@commsbox:~/work/tmp/test$ git annex init
init ok
(Recording state in git...)
henry@commsbox:~/work/tmp/test$ git annex get nonexistent.file
git-annex: nonexistent.file not found
henry@commsbox:~/work/tmp/test$ echo $?
0
"""]]
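Until the exit code is fixed, a script can guard the call itself. The following is a minimal sketch, not part of git-annex: `annex_get_checked` is a hypothetical helper name, and it assumes that asking git whether the path is tracked (`git ls-files --error-unmatch`) is an acceptable stand-in for git-annex's own "not found" check.

```sh
# Hypothetical wrapper (not part of git-annex): fail early when the
# path is not tracked by git, instead of trusting the exit code of
# 'git annex get' for missing files.
annex_get_checked() {
    # 'git ls-files --error-unmatch' exits non-zero if the path is
    # not in the index (or if we are not inside a git repository).
    if ! git ls-files --error-unmatch -- "$1" >/dev/null 2>&1; then
        echo "annex_get_checked: $1 is not tracked by git" >&2
        return 1
    fi
    git annex get "$1"
}
```

Called as `annex_get_checked nonexistent.file` in the test repository above, the wrapper returns a non-zero status without ever invoking `git annex get`.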

@ -0,0 +1,45 @@
### Please describe the problem.
See the logs below. git-annex-shell tries to exec a runshell script that no longer exists.
### What steps will reproduce the problem?
I am on Debian testing and, some months ago, tried the tarball distribution.
I later returned to the deb packages and deleted the tarball installation.
It seems that some traces were left behind.
I have tried to find the runshell configuration, but failed to do so.
I have also destroyed the repo completely, but that has not helped.
### What version of git-annex are you using? On what operating system?
ii git-annex 5.20140831 amd64
### Please provide any additional information below.
[[!format sh """
# If you can, paste a complete transcript of the problem occurring here.
# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log
[2014-09-07 17:15:04 CEST] main: starting assistant version 5.20140831
[2014-09-07 17:15:04 CEST] Cronner: Consistency check in progress
/home/<user>/.ssh/git-annex-shell: 4: exec: /home/<user>/git-annex.linux.5.20131213/runshell: not found
/home/<user>/.ssh/git-annex-shell: 4: exec: /home/<user>/git-annex.linux.5.20131213/runshell: not found
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
/home/<user>/.ssh/git-annex-shell: 4: exec: /home/<user>/git-annex.linux.5.20131213/runshell: not found
/home/<user>/.ssh/git-annex-shell: 4: exec: /home/<user>/git-annex.linux.5.20131213/runshell: not found
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
/home/<user>/.ssh/git-annex-shell: 4: exec: /home/<user>/git-annex.linux.5.20131213/runshell: not found
(scanning...) [2014-09-07 17:16:47 CEST] Watcher: Performing startup scan
/home/<user>/.ssh/git-annex-shell: 4: exec: /home/<user>/git-annex.linux.5.20131213/runshell: not found
/home/<user>/.ssh/git-annex-shell: 4: exec: /home/<user>/git-annex.linux.5.2013121/
/home/<user>/.ssh/git-annex-shell: 4: exec: /home/<user>/git-annex.linux.5.20131213/runshell: not found
/home/<user>/.ssh/git-annex-shell: 4: exec: /home/<user>/git-annex.linux.5.20131213/runshell: not found
# End of transcript or log.
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="https://andrew.aylett.co.uk/"
nickname="andrew"
subject="comment 4"
date="2014-09-07T18:41:28Z"
content="""
I, too, have seen this issue -- took me a while to recover from it. I do (now, at least) have a pre-commit hook that calls git annex pre-commit; I didn't set that up myself.
"""]]

@ -0,0 +1,38 @@
### Please describe the problem.
error message:
copy somefile.jpg (checking myserver...) (to myserver...)
git-annex: runInteractiveProcess: pipe: Too many open files
rsync failed -- run git annex again to resume file transfer
failed
### What steps will reproduce the problem?
1. Start a `git annex copy` with lots of files in the queue.
2. Start a second `git annex copy` on the same set of files.
The intention is to minimize the amount of silent time on the wire due to administrative work between actual file transfers. These two processes will trip over each other and see that transfer X is already going, and skip to the next file Y, so in the end they upload about half of the files each.
3. Expect all files to be uploaded. Actually observe the above error message for at least one of the processes.
### What version of git-annex are you using? On what operating system?
git-annex version: 5.20140420-ga25b8bb
build flags: Assistant Webapp Webapp-secure Pairing Testsuite S3 WebDAV FsEvents XMPP DNS Feeds Quvi TDFA CryptoHash
key/value backends: SHA256E SHA1E SHA512E SHA224E SHA384E SKEIN256E SKEIN512E SHA256 SHA1 SHA512 SHA224 SHA384 SKEIN256 SKEIN512 WORM URL
remote types: git gcrypt S3 bup directory rsync web webdav tahoe glacier hook external
Darwin mymacbook 13.3.0 Darwin Kernel Version 13.3.0: Tue Jun 3 21:27:35 PDT 2014; root:xnu-2422.110.17~1/RELEASE_X86_64 x86_64
### Please provide any additional information below.
[[!format sh """
lsof -p <my annex process>
... some .app/** files, tty etc ...
... some unnamed pipes ...
.../.git/annex/ssh/myserver.lock
.../.git/annex/transfer/upload/b4d67c4f-8cca-423c-9363-f3063b7fe3e4/lck.SHA256E-s10448418--4f61fab4... ~200 different files.
"""]]
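To compare the number of descriptors a running process holds against the limit, something like the following can help. It is a rough sketch: `count_open_files` is a hypothetical helper, it relies on `lsof` being installed, and raising the limit with `ulimit -n` would only be a possible workaround, not a fix for whatever keeps the lock files open.

```sh
# Hypothetical helper: count the open files lsof reports for a PID.
# Prints 0 if lsof is missing or the process cannot be inspected.
count_open_files() {
    # tail skips lsof's header line; tr strips the padding some
    # wc implementations add around the number.
    lsof -p "$1" 2>/dev/null | tail -n +2 | wc -l | tr -d ' '
}

# Soft limit on open files for processes started from this shell.
ulimit -n

# Example: how many files does the current shell hold open?
count_open_files "$$"
```

If the count for the `git annex copy` process approaches the value printed by `ulimit -n`, that would be consistent with the "Too many open files" error above.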

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="http://id.clacke.se/"
nickname="Claes"
subject="5.20140830"
date="2014-09-07T19:24:49Z"
content="""
Will verify if this is still valid for 5.20140830.
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="http://id.clacke.se/"
nickname="Claes"
subject="yep"
date="2014-09-07T19:42:04Z"
content="""
Still valid for `git-annex version: 5.20140830-g3c96b79`
"""]]

@ -4,4 +4,4 @@ Same as the desktop webapp, users will be able to enter a directory they
want the first time they run it, but to save typing on android, anything
that gets enough votes will be included in a list of choices as well.
[[!poll open=yes expandable=yes 68 "/sdcard/annex" 6 "Whole /sdcard" 7 "DCIM directory (photos and videos only)" 2 "Same as for regular git-annex. ~/annex/"]]
[[!poll open=yes expandable=yes 69 "/sdcard/annex" 6 "Whole /sdcard" 7 "DCIM directory (photos and videos only)" 2 "Same as for regular git-annex. ~/annex/"]]

@ -0,0 +1,3 @@
When I run git-annex-webapp, a browser opens and I am redirected to the assistant. How can I run the webapp so that it only starts the server process, without opening a browser, and then navigate to the page from a remote computer?
Thanks!

@ -0,0 +1,10 @@
[[!comment format=mdwn
username="Xyem"
ip="81.111.193.130"
subject="comment 1"
date="2014-09-10T14:32:21Z"
content="""
[Running the assistant headless](http://git-annex.branchable.com/design/assistant/blog/day_232__headless_webapp/).
Hope this helps.
"""]]

@ -0,0 +1,146 @@
[[!toc]]
# Introduction
Hello.
I've installed git-annex and git-annex assistant on Windows 7 in a corp environment (hello gotchas!).
In this post I'll describe issues I encountered, how I fixed them, recommendations I have for the installer, and some results from a couple `git annex test` runs.
# Background
My regular domain user doesn't have permissions to write to `C:\Program Files (x86)`, so I use a secondary domain user which is in the Administrators group. I use "Run as different user" to run installers, etc. (cf. "Run as Administrator")
During msysgit installation I checked "only bash, don't add to path, don't integrate with Explorer" etc, since I like my third-party applications isolated.
# The installer
## Where to install `git-annex.exe`
The nightly build of git-annex/assistant from NEST (20140908) only prompts for the base path of the msysgit location and it installs files in `$BASE/bin` and `$BASE/cmd`... I'll try manually copying files post-install to mitigate the path issues described in other posts on this forum.
The msysgit installer (1.9.4-preview20140815) presents a certain screen with three radio options:
1. git bash only
2. just git in `cmd.exe`
3. git + unix tools in `cmd.exe`.
I *think* this is the meaning of each:
1. cmd.exe's PATH is not touched.
2. `$GITBASE/cmd` is added to PATH
3. `$GITBASE/bin` is added to PATH
Therefore, I think that if you do something so that `git-annex.exe` is added to both $GITBASE/cmd and $GITBASE/bin (perhaps a symlink or even a .lnk file) then all three user preference options will be covered.
All I did was copy `$BASE/cmd/git-annex.exe` to `$BASE/bin/git-annex.exe` and now both `git annex` and `git-annex` work in my msysgit "git bash" console. I didn't test `cmd.exe` since I selected option 1 in the msysgit installer.
## Installer locations: user profile or system-wide?
I found a shortcut for the webapp in Start Menu/Startup ... for the wrong user. Please prompt the user during the installation: "Install startup link system-wide or for current user?"
# git annex test results
## `$HOME` defaulted to some mapped drive, whoops!
The test suite has been running since before I started this post. Is that normal? :)
I notice that it emits "Detected a crippled filesystem", "Enabling direct mode." and other messages again and again. If those checks are expensive, maybe the result should be memoized/cached.
Oh goodness, the test is reading and writing to my "home directory": a remote filesystem I never use. It's slow. I'll have to configure msysgit to use a different, more local `$HOME`. This is a common problem on this workstation. I'll let the test finish in case it reveals something useful to you, but this will not be how I use it going forward...
I am unable to attach `testWithMappedDriveHomeDirConsoleOutput.txt` to this post. 1 out of 84 tests failed. Here is the only case-sensitive occurrence of FAIL in the console output, with some lines of context.
OK
info: Detected a filesystem without fifo support.
Disabling ssh connection caching.
Detected a crippled filesystem.
Enabling direct mode.
git-annex: Data.BloomFilter.Util.suggestSizing: capacity too large to represent
FAIL
Exception: user error (git-annex ["info","--json"] exited 1)
version: Detected a filesystem without fifo support.
Disabling ssh connection caching.
Detected a crippled filesystem.
Enabling direct mode.
git-annex version: 5.20140908-g378fbb1
build flags: Assistant Webapp Webapp-secure Pairing Testsuite S3 WebDAV DNS Feeds Quvi TDFA CryptoHash
## test results with local NTFS `$HOME`
...The console output is scrolling by much more quickly.
2 out of 84 tests failed.
prop_past_sane: OK
+++ OK, passed 1000 tests.
prop_duration_roundtrips: OK
+++ OK, passed 1000 tests.
prop_metadata_sane: OK
+++ OK, passed 1000 tests.
prop_metadata_serialize: OK
+++ OK, passed 1000 tests.
prop_branchView_legal: OK
+++ OK, passed 1000 tests.
prop_view_roundtrips: OK
+++I nOiKt, Tpeasstsse
d 1 0i0n0i tt:e sts.
prop_viewedFile_rountrips: FAIL
*** Failed! Falsifiable (after 51 tests and 1 shrink):
"a:"
Use --quickcheck-replay '50 592211036 1831676953' to reproduce.
Unit Tests
add sha1dup: init test repo
Detected a filesystem without fifo support.
Disabling ssh connection caching.
and
OK
info: Detected a filesystem without fifo support.
Disabling ssh connection caching.
Detected a crippled filesystem.
Enabling direct mode.
git-annex: Data.BloomFilter.Util.suggestSizing: capacity too large to represent
FAIL
Exception: user error (git-annex ["info","--json"] exited 1)
version: Detected a filesystem without fifo support.
Disabling ssh connection caching.
Detected a crippled filesystem.
Enabling direct mode.
git-annex version: 5.20140908-g378fbb1
build flags: Assistant Webapp Webapp-secure Pairing Testsuite S3 WebDAV DNS Feeds Quvi TDFA CryptoHash
...Note the corruption. I think this happens when I drag the scroll bar while console output is being emitted. (msysgit's problem?) I would presume and hope that this is a "display only" issue. UPDATE: see section Corruption below.
# .vbs failure
I copied the `git-annex.lnk` out of my admin user's start menu onto my desktop and double clicked it. `wscript.exe` got stuck in a loop where new copies were being spawned over and over again (and old copies dying at the same rate).
I think I know why. `git-annex.exe` isn't on the path... but `git-annex.lnk` is in the CWD (Desktop in this case). Yeah, that is the problem. The vbs attempts to run "git-annex webapp", and this .lnk points to a valid "executable": `git-annex-webapp.vbs`... So it just calls itself with an argument over and over again.
Workaround: invoke `git annex webapp` from the normal git bash console.
# Corruption?
In some section above I speculated that the "jittery" corruption I was seeing in my console was a "display only" problem caused by scrolling around while new characters were being printed to the console. Now, I don't think so.
The corruption can be seen in the Log in the webapp. Here's an example from the top of the log:
[2014-09-08 13:37:45 Central Daylight Time] main: starting assistant version 5.20140908-g378fbb1
Launching web browser on file://d:\annex\.git\annex\webapp.html
[2014-09-08 13:37:45 Central Daylight Time] Cronner: You should enable consistency checking to protect your data.
(scanning...) [2014-09-08 13:37:45 Central Daylight Time] Watcher: Performing startup scan
(started...) rreerrcceevvcc::vv ::ff aaffiiaalliieellddee dd(( NN((ooNN ooee rreerrrroorrrroo))rr
))
I have no clue about this! (Well... "I think it's trying to communicate!")
# Conclusion
I hope this information is helpful. I've enabled the 'email comments to me' option on this post and I'd be happy to perform further tests upon request.
Cheers!

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="justinl"
ip="184.17.213.135"
subject="works"
date="2014-09-10T17:59:48Z"
content="""
Yep, the standalone armel build worked perfectly. Thanks!
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="https://www.google.com/accounts/o8/id?id=AItOawm8wY171R5c4u_jPmB6LU6n6Px2xePM4sE"
nickname="Efraim"
subject="comment 1"
date="2014-09-07T14:09:05Z"
content="""
Have you tried running `git annex unused` from the command line to see if you have unused files in your repo? In the assistant, Configuration -> Unused files gives you an option to expire old files after a period of time so they get deleted from your repo.
"""]]

@ -0,0 +1,10 @@
[[!comment format=mdwn
username="rasmus"
ip="109.201.154.150"
subject="comment 2"
date="2014-09-07T15:17:25Z"
content="""
Thanks for the help, Efraim.
I'm not sure this is it. On my other laptop, where the above statistics were not calculated, the `.git` folder of `doc.annex` is 26Gb (content is 8.6Gb). Meanwhile, unused files are 0.6Gb. In `conf.annex` the `.git` folder is 3.2Gb, content is 70Mb and unused files are 2.2Mb. I used the web interface to find the size of unused files.
"""]]

@ -0,0 +1,10 @@
[[!comment format=mdwn
username="http://id.clacke.se/"
nickname="Claes"
subject="repack parameters"
date="2014-09-07T21:53:10Z"
content="""
Because git-annex tracks all the events of an annexed file for each repo -- added, dropped, copied etc -- and it tracks these in one object per file in the git-annex branch, it does indeed create a lot of objects. To improve both space and performance I made sure to add `git gc --auto` as a post-commit hook, as the objects in my case can quickly reach the tens or even hundreds of thousands.
To further improve performance and space, you can choose to set `pack.window` and `pack.depth` to vastly higher values than the defaults (10 and 50, respectively), because there is a large number of objects with very similar content. I did a `git repack --window 2500 --depth 1000 -f -a -d` and brought down my repo from 3 GiB (packed!) to 300 MiB. Make sure to have a lot of memory and CPU available when doing this, or it will take forever. You can set `pack.window` ridiculously high if you like, as long as you limit it with `pack.windowMemory`, so that it makes use of all your available memory for comparing objects and finding the optimal delta.
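As a rough sketch of the above, the settings can be stored in the repository config so that later repacks pick them up too. The function name and the `1g` memory cap are illustrative assumptions, not recommendations; the window and depth values are the ones quoted above.

```sh
# Sketch only: store the pack settings in the repository config,
# then repack everything from scratch.
repack_annex_branch_history() {
    git config pack.window 2500 &&
    git config pack.depth 1000 &&
    # Assumed cap; pick a value that fits the memory you have.
    git config pack.windowMemory 1g &&
    # -a: pack all objects, -d: drop redundant packs,
    # -f: recompute deltas instead of reusing existing ones.
    git repack -a -d -f
}
```

Run `repack_annex_branch_history` from inside the repository; expect it to take a long time on a large git-annex branch.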
"""]]

@ -0,0 +1,16 @@
[[!comment format=mdwn
username="rasmus"
ip="109.201.154.209"
subject="Re: repack parameters"
date="2014-09-08T13:20:36Z"
content="""
Thanks for your tips, Claes. I wasn't really aware of `git repack` and that set of parameters.
I didn't mention it, but sadly I had already run `git gc` on the repos just before collecting the above numbers.
I tried to repack two repositories -- `doc.annex` and `config.annex` -- using the values you suggested. However, it did not have any measurable effect (less than 100 MB in both cases).
The number of unused files seems to be (much) less than 500 in the repos.
BTW: All of the extra size is in the `.git/objects/` folder. `.git/annex/` is quite small (always much less than 1GB). Would that indicate that large files are checked in with git sans annex somehow?
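One way to check that, as a sketch: list the largest blobs reachable from any ref. `largest_git_objects` is a hypothetical helper name, and it only sees objects reachable from refs, so unreachable loose objects would not show up here.

```sh
# Hypothetical helper: print the ten largest blobs reachable from any
# ref as 'blob <size-in-bytes> <path>', smallest first.
largest_git_objects() {
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        grep '^blob ' |
        sort -k2,2n |
        tail -n 10
}
```

If files are annexed, git itself should only hold small pointer blobs for them, so multi-megabyte entries in this list would suggest content that was added with plain `git add`.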
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="rasmus"
ip="109.201.154.183"
subject="comment 5"
date="2014-09-08T13:48:03Z"
content="""
So `git prune` worked wonders on my repos, getting rid of GBs of stuff in the `.git/objects` folders. I don't know why they weren't picked up by `git gc`. In retrospect, it was perhaps a bit careless of me to run `git prune` directly, but hopefully I will be OK. . .
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="rasmus"
ip="193.145.48.43"
subject="comment 6"
date="2014-09-10T12:07:53Z"
content="""
Seems `git prune` only worked as a temporary fix. My `doc.annex/.git/objects` is 3.6Gb after two days. I don't get why `git` sans `annex` is checking in stuff -- which I assume is the reason it's stored in `.git/objects`.
"""]]

@ -0,0 +1 @@
What sort of options can we use in the expression field? The [git annex bible](http://git-annex.branchable.com/git-annex/) suggests it is for incremental fscks, but I'm wondering if it can run shell scripts or `git annex importurl` lines too.

@ -0,0 +1,10 @@
[[!comment format=mdwn
username="https://www.google.com/accounts/o8/id?id=AItOawkzwmw_zyMpZC9_J7ey--woeYPoZkAOgGw"
nickname="dxtrish"
subject="comment 2"
date="2014-09-07T21:16:09Z"
content="""
Because megaannex apparently isn't working nowadays, I have created a compatible program in Go. I think it should handle everything except REMOVE right now.
You can find it on github: https://github.com/dxtr/megaannex-go
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="https://www.google.com/accounts/o8/id?id=AItOawkmN2ZZIdYNiFKmEH7rz4jMb6sYsx_dptA"
nickname="Jack William"
subject="What if you do not want to encrypt?"
date="2014-09-07T18:35:19Z"
content="""
One use case for git with Amazon S3 is maintaining a web site on S3 that you can easily update from a local machine. In that case you would not want to encrypt. Is encryption optional? This isn't clear from the instructions.
"""]]

@ -0,0 +1,10 @@
[[!comment format=mdwn
username="http://schnouki.net/"
nickname="Schnouki"
subject="comment 6"
date="2014-09-09T12:48:59Z"
content="""
Jack, if you don't want to use encryption you can use `encryption=none` as documented [here](http://git-annex.branchable.com/special_remotes/S3/).
I'm not sure exactly what you're trying to do, but please note that your files won't be easily available on S3: they will be named as git-annex keys, with long and unreadable names such as \"SHA256E-s6311--c7533fdd259d872793b7298cbb56a1912e80c52a845661b0b9ff391c65ee2abc.html\" instead of \"index.html\".
"""]]

@ -0,0 +1,18 @@
[[!comment format=mdwn
username="https://www.google.com/accounts/o8/id?id=AItOawkQOUUx4LVAk6EnstSLvdv7gZc0NsRlHXw"
nickname="Dave"
subject="windows port volunteer tester"
date="2014-09-08T15:39:03Z"
content="""
Hello.
I volunteer to test the Windows port. I can only do so on the work computer. :)
I know how to produce a Minimal Reproducible Example. Are there some areas that need attention?
I've enabled the 'email replies to me' ikiwiki feature. (Nice plugin.)
Cheers,
--Dave
"""]]