Merge branch 'master' of ssh://git-annex.branchable.com

Joey Hess 2013-04-22 15:52:39 -04:00
commit 0138c0bbcd
4 changed files with 124 additions and 0 deletions

@@ -0,0 +1,10 @@
[[!comment format=mdwn
username="http://joeyh.name/"
nickname="joey"
subject="comment 1"
date="2013-04-22T18:58:08Z"
content="""
The assistant can handle this syncing via a central bare remote. The only problem is that, since your server does not have git-annex installed, that remote will have annex-ignore set on it. That made the assistant not use it for syncing at all.
I've just committed a change to git that makes the assistant still use such a git remote for syncing the git repository, even though it cannot store file contents there.
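
To check whether a remote is in this state, one can look at its `annex-ignore` setting in the git config. A minimal sketch, assuming the remote is called `origin` (a placeholder name) and using a made-up helper `annex_ignore_status`:

```shell
# Report whether a remote is marked annex-ignore.
# The helper name and the remote name "origin" are placeholders.
annex_ignore_status() {
    remote="$1"
    # The setting lives under remote.<name>.annex-ignore in the git config.
    if [ "$(git config --get "remote.$remote.annex-ignore" 2>/dev/null)" = "true" ]; then
        echo "$remote: annex-ignore is set"
    else
        echo "$remote: annex-ignore is not set"
    fi
}

annex_ignore_status origin
```

If the setting is reported as set, file contents will not be stored on that remote, which matches the situation described above.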
"""]]

@@ -0,0 +1,57 @@
This post is a personal story, a bug report, and a feature request.
I discovered `git-annex assistant` by accident a couple of weeks (months?) ago. I was very impressed by the technical smartness and decided it was worth a try. My current configuration (everything done through the assistant web GUI) is:
* Two machines that act as client repositories and are usually in the same LAN. Both of them run OS X Mountain Lion.
* One transfer repository which was Box.com, but then I switched to Amazon S3 (today, I'll explain why).
* The repository consists of mixed files and is ~4 GB big (very small plain text files, PDFs, some MP3 files; the biggest files are about 20-50 MB). The Finder reports that the repository is ~40 GB big while `du -sh` reports 4.4 GB. It's probably just a silly Finder bug, but it would be great to know why this happens.
* I have version 4.20130324 on both clients. The OS X version always seems to be a little bit behind. It would be great if it could match the other platforms.
Here are some issues which I think are worth mentioning:
1. I had some issues with the assistant, because I initially launched it from the mounted DMG. `git-annex assistant` set the environment up, but it configured the wrong paths (to `/Volumes` instead of `/Applications`, where the bundle later went). This caused issues with authentication and some ssh keys. `git-annex` was confused about which keys to use, and I had to manually remove the first ones from the `.ssh/` directory. This could be very confusing for less technically savvy users, because the error messages in the GUI were unclear.
2. I'm still very confused about the behavior of `git-annex assistant` when it launches. The application icon bounces in the Dock for quite a long time, then it looks as if the process has crashed (I can only *Force Quit* it from the Dock). Then the icon disappears after anything from a few minutes to half an hour. I was unsure whether the assistant would keep running if I used *Force Quit* on it (yes, it does). The behavior is very strange on OS X. It would be great to have at least a normal icon which bounces a couple of times, stays in the Dock without appearing like a non-responding app, and launches the web GUI when a user clicks on it. Ideal would be just a menu bar icon which shows the current progress and lets me launch the web GUI, check for updates, pause & resume all transfers, and restart & quit `git-annex assistant`.
3. The initial sync went great and pretty fast. I was also impressed by the speed with which `git-annex` picks up new changes and synchronizes them to other repositories. After adding the Box.com transfer repository everything got worse. Files that were added to one of the client repositories were never synced to the other machine, even though I made sure that the client repositories appeared above the Box.com repository in `git-annex assistant`. The transfer was very slow (that's probably due to the API rate limits of Box.com) and I got the impression that the files were uploaded multiple times (probably true, because the assistant couldn't upload them in the first place and then just tried again, giving me the impression that it was stuck in some kind of loop). I tried manually clicking a couple of hundred play buttons in the queue in the hope that the item which should've been synced to the other machine in the first place would appear somewhere. I also attached some log messages to the end of this post which could be helpful. At this point `git-annex assistant` basically stopped working for me. I lived with it somehow in the hope that it would upload everything to Box.com in the following days, but it never got better, even after a couple of weeks (in the meantime we copied files manually via an attached network storage which lived outside of the repository). Today I bit the bullet and switched to Amazon S3. All files were uploaded in a couple of minutes and it looks like this setup will be more reliable. It'll cost us a couple of $ each month, but that's OK, because it actually works.
4. I read somewhere that transfer repositories were meant to keep only files that are not present on all client repositories. With my configuration `git-annex` just uploads everything to the transfer repository (no matter if it's Box.com or Amazon S3). Is this the correct behavior? Did I get something wrong or is this just not implemented yet? It's OK in the current version, but it would be great to store only the missing files in the future.
5. After I switched to Amazon S3 some encrypted file names appear in my queue. They are gone after I restart `git-annex assistant` or when I manually remove them with the `X` button. Is this a bug?
6. It is confusing that the web GUI shows that the transfer failed when the second machine is offline. While technically this is true, it would be much more reassuring to know that the client is just offline. If something fails I get the impression that it is broken, but the client was only unreachable, and that's just a temporary state that'll get fixed automatically when the client goes online again. Adding green / red lights to the repository with a mouse-over description would probably be a good idea. Reporting that the client is unreachable instead of showing the error message would also be more appropriate.
7. The biggest issue I have now is that `git-annex assistant` randomly crashes on both machines. I don't know if this is a known issue or if this is an OS X only issue, but it is slightly annoying. If you want to make sure that files really synchronized, you have to start `git-annex assistant` just to check if it still runs. I hope that this will get fixed as soon as possible. I would love to help you with debug logs, but I don't know where to look. There's nothing meaningful in the system log or the debug log.
8. If the issues with Box.com are because of their API rate limits, I would make this clearer when adding new Box.com repositories or I would remove the feature altogether, because they actually make `git-annex assistant` unusable.
To sum everything up: my biggest issues so far are the constant crashes and the strange behavior of the Dock icon. I'm still pretty impressed by the technical smartness and that the software is available at all. Thank you very much, Joey, for letting us use this software on a daily basis. I'm glad that you launched the Kickstarter campaign. I don't know if it's much of a hassle, but a second Kickstarter to make sure that development will continue would probably be a good idea. I would like to help with logs, but I don't know how. Please let me know if there's something I can do.
**Box.com logs:**
87% 16.0KB/s 2s[2013-04-22 15:14:35 CEST] Pusher: Syncing with 10.0.1.4_jcshared
To ssh://rafael@git-annex-10.0.1.4-rafael/~/jc-shared/
7428090..f0d2f9e git-annex -> synced/git-annex
9d2cb35..1f7b678 master -> synced/master
Already up-to-date.
Already up-to-date.
gpg: [stdout]: write error: Broken pipe
gpg: DBG: deflate: iobuf_write failed
gpg: build_packet failed: file write error
gpg: [stdout]: write error: Broken pipe
gpg: iobuf_flush failed on close: file write error
gpg: symmetric encryption of `[stdin]' failed: file write error
ResponseTimeout
git-annex: fd:41: hPutBuf: resource vanished (Broken pipe)
ResponseTimeout
(gpg)
87% 16.0KB/s 2sgpg: [stdout]: write error: Broken pipe
gpg: DBG: deflate: iobuf_write failed
gpg: build_packet failed: file write error
gpg: [stdout]: write error: Broken pipe
gpg: iobuf_flush failed on close: file write error
gpg: symmetric encryption of `[stdin]' failed: file write error
ResponseTimeout
git-annex: fd:54: hPutBuf: resource vanished (Broken pipe)
send: resource vanished (Connection reset by peer)
ResponseTimeout

@@ -0,0 +1,31 @@
Hi,
First, sorry for my poor English; it's not my native language.
I have one repository on my laptop and two repositories on a USB disk, made by following the walkthrough (creating a repository and adding a remote).
Yesterday I made a backup of my repository on the USB disk before adding some files with
cd /media/usb/annex;git fetch laptop; git merge laptop/master&&git annex get .&&git annex sync
Now the repository on my usb disk is a mess.
Every file from before the commit is lost.
For example:
After the sync: file Z.7z
Z.7z: broken symbolic link to `../../../../.git/annex/objects/2K/49/SHA256-s0--e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855/SHA256-s0--e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'
The file ../../../../.git/annex/objects/2K/49/SHA256-s0--e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855/SHA256-s0--e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 doesn't exist.
In the repository before the sync (inside the backup):
file Z.7z
Z.7z: symbolic link to `../../../../.git/annex/objects/J1/f4/SHA256-s696365035--d2dcc67bf2f05fcfc7f42723b2d415d4f057a2eeadc282b40f5bc3724534f2f4/SHA256-s696365035--d2dcc67bf2f05fcfc7f42723b2d415d4f057a2eeadc282b40f5bc3724534f2f4'
The file ../../../../.git/annex/objects/J1/f4/SHA256-s696365035--d2dcc67bf2f05fcfc7f42723b2d415d4f057a2eeadc282b40f5bc3724534f2f4/SHA256-s696365035--d2dcc67bf2f05fcfc7f42723b2d415d4f057a2eeadc282b40f5bc3724534f2f4 still exists in the repository after the sync.
Three questions, if somebody could help me:
- What did I do wrong?
- Why did the symlink for every file change after "cd /media/usb/annex;git fetch laptop; git merge laptop/master&&git annex get .&&git annex sync"?
- How can I fix my repository? Can I recover the files from the backup, and how? Or should I copy every file and start my repository again from a clean state?
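
A note on reading the two symlink targets above: git-annex key names have the form `BACKEND-sSIZE--HASH`, so the size recorded for each version of the file can be read straight from the key. A small sketch (the helper name `annex_key_size` is made up for illustration, and `sha256sum` is assumed to be available, as on most Linux systems):

```shell
# Print the size field of a git-annex key (BACKEND-sSIZE--HASH).
# The helper name is hypothetical, not part of git-annex.
annex_key_size() {
    rest=${1#*-s}                  # drop the backend name and the "-s" marker
    printf '%s\n' "${rest%%--*}"   # keep everything before the "--"
}

# The two keys quoted in the symlink targets above.
after='SHA256-s0--e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'
before='SHA256-s696365035--d2dcc67bf2f05fcfc7f42723b2d415d4f057a2eeadc282b40f5bc3724534f2f4'

echo "after sync: $(annex_key_size "$after") bytes"
echo "in backup:  $(annex_key_size "$before") bytes"

# Compare the hash part of the first key with the SHA-256 of empty input.
empty=$(printf '' | sha256sum | cut -d' ' -f1)
if [ "${after##*--}" = "$empty" ]; then
    echo "the key after the sync is the key of an empty file"
fi
```

The key seen after the sync records a size of 0 and its hash is the SHA-256 of empty input, while the key in the backup records 696365035 bytes, so the two symlinks refer to different content.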

@@ -0,0 +1,26 @@
[[!comment format=mdwn
username="http://joeyh.name/"
nickname="joey"
subject="comment 1"
date="2013-04-22T19:48:55Z"
content="""
git-annex stores the contents of files inside `.git/annex/objects`. The `git annex add` is failing because it cannot `rename()` the file into that directory, because it is on a different filesystem. Even if it did a more expensive move of the file, it would not do what you want, because all the files would be moved to the `.git/annex/objects` directory, which is stored on your smaller drive.
The way git-annex is intended to be used with multiple drives is this:
* Make a separate git repository on each drive.
* Set up git remotes connecting these repositories together. You don't have to connect them all up, but at least make
the git repository on your main filesystem have a remote for each git repository on other drives.
* Use `git annex sync` to keep the git repositories in sync. (Or do it manually with `git pull`)
* When you want a file to be available in the local repository, use `git annex get $file` to get it.
* When your local repository is getting too full, use `git annex drop` or `git annex move` to flush files
out to the other drive(s).
The [[walkthrough]] goes through an example of adding a removable USB drive this way, but you can do the same thing for
non-removable drives.
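
The steps above can be sketched roughly as follows. The paths, the remote name `bigdrive`, and the helper names are assumptions for illustration only; setting `DRY_RUN=1` makes the sketch print the commands instead of running them.

```shell
# Sketch of the multi-drive workflow. All names and paths are placeholders.
# With DRY_RUN=1 the commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

multi_drive_setup() {
    main="$1"; other="$2"
    # One git repository per drive.
    run git init "$main"
    run git -C "$main" annex init "main"
    run git init "$other"
    run git -C "$other" annex init "bigdrive"
    # The repository on the main filesystem gets a remote for the other drive.
    run git -C "$main" remote add bigdrive "$other"
    # Keep the git repositories in sync.
    run git -C "$main" annex sync
}

# Fetch a file's content into the local repository.
multi_drive_get() { run git -C "$1" annex get "$2"; }

# Flush a file's content out to the other drive when space runs low.
multi_drive_flush() { run git -C "$1" annex move --to bigdrive "$2"; }
```

For example, `DRY_RUN=1 multi_drive_setup ~/annex /mnt/big/annex` lists the commands that would be run for a main repository in `~/annex` and a second one on a drive mounted at `/mnt/big`.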
> Having two repositories also has the disadvantage that I need two repositories on all other nodes am I right?
No -- you combine the two repositories, so any clone of either one contains all the files in both. Other nodes then only need one
repository. However, for another node to be able to get files from both repositories on this node, it will need to have two git remotes configured, one for each repository.
"""]]