Merge branch 'master' of ssh://git-annex.branchable.com

Joey Hess 2013-07-20 19:42:29 -04:00
commit 7979526043
12 changed files with 144 additions and 0 deletions

@ -0,0 +1,13 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="4.154.0.140"
subject="comment 6"
date="2013-07-20T22:31:02Z"
content="""
Isolated the bug to a problem with the upstream inotify library.
<https://github.com/kolmodin/hinotify/issues/5> I've sent in a patch to that library that fixes the problem.
Unfortunately, I cannot work around it in git-annex more than I already have. It'll no longer crash, but will skip over files or directories that contain characters not valid in the current locale.
I have applied my patch to the haskell-hinotify package in Debian unstable, and have deployed fixed versions to all my Linux autobuilds, including Android. (An Android user had mentioned also seeing this bug.)
"""]]

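The skip-instead-of-crash behaviour described in the comment above can be illustrated with a small Python sketch. It is only an illustration of the idea, not code from git-annex or hinotify (both are written in Haskell); the function name `split_by_decodability` and the sample filenames are assumptions made for the example.

```python
def split_by_decodability(raw_names, encoding):
    # Partition raw byte filenames into names that decode cleanly in the
    # given encoding and raw names that do not (these would be skipped).
    usable, skipped = [], []
    for raw in raw_names:
        try:
            usable.append(raw.decode(encoding, errors="strict"))
        except UnicodeDecodeError:
            skipped.append(raw)
    return usable, skipped


# Hypothetical directory listing: the second name is Latin-1 encoded,
# so it is not valid when the locale encoding is UTF-8.
names = [
    b"notes.txt",
    "caf\u00e9.txt".encode("latin-1"),
    "caf\u00e9.txt".encode("utf-8"),
]
usable, skipped = split_by_decodability(names, "utf-8")
print(usable)
print(skipped)
```

A watcher built this way keeps running when it meets an undecodable name, at the cost of silently ignoring that file, which matches the trade-off the comment describes.
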
@ -0,0 +1,12 @@
[[!comment format=mdwn
username="andy"
ip="108.202.17.204"
subject="comment 5"
date="2013-07-20T21:34:08Z"
content="""
Yes, the symlink was still broken. Or rather, `tar -tf filename.tar` didn't like the file.
The file's about 193 M bzipped. I'll be e-mailing you a link to an S3 copy with DojOsEf2 in the subject line of the email--once the upload finishes. :)
I'm not sure if this is relevant, but the file is a tar of an annex and another git repo. Perhaps that is influencing something (although I presume it shouldn't)?
"""]]

@ -0,0 +1,13 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="4.154.0.140"
subject="comment 6"
date="2013-07-20T22:43:27Z"
content="""
Got the file. Verified checksum.
Reproduced bug!
Wow, it really seems to be a bug specific to this one particular
file content. That's crazy.
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="4.154.0.140"
subject="comment 7"
date="2013-07-20T23:00:59Z"
content="""
Can be reproduced with the first 500 KB of the file. I have deleted all the rest of the file, without looking at it. (Scout's honor!)
"""]]

@ -0,0 +1,10 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="4.154.0.140"
subject="comment 8"
date="2013-07-20T23:07:17Z"
content="""
Ok, it is in fact relevant that the file is a tarball of a git-annex repository, because git-annex add turns out to be looking at the beginning of the file, and seeing that it contains a git-annex link.
I thought that code was only supposed to fire in repos on FAT filesystems that don't have symlinks. So, there are several issues to fix here, it seems.
"""]]

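To see why a tarball of an annex can be mistaken for a link, here is a minimal Python sketch that contrasts a check which only inspects the start of a file with a stricter one. This is not git-annex's actual detection code, nor the fix that was applied; the marker string, the `MAX_LINK_LEN` limit, and both function names are assumptions chosen for illustration.

```python
# Assumed marker: git-annex symlink targets point into this directory.
ANNEX_MARKER = b".git/annex/objects/"
# Hypothetical upper bound on how long a genuine link target could be.
MAX_LINK_LEN = 8192


def naive_looks_like_link(content: bytes) -> bool:
    # Only inspects the beginning of the content, so a large file that
    # merely starts with link-like text is misclassified as a link.
    return ANNEX_MARKER in content[:MAX_LINK_LEN]


def strict_looks_like_link(content: bytes) -> bool:
    # Requires the whole content to be short, single-line, and free of
    # NUL bytes before accepting it as a link.
    if len(content) > MAX_LINK_LEN:
        return False
    if b"\0" in content or b"\n" in content.rstrip(b"\n"):
        return False
    return ANNEX_MARKER in content


# Hypothetical link text, and a large file that happens to begin with it.
link = b"../.git/annex/objects/xx/yy/KEY/KEY\n"
tarball_like = link + b"\0" * 600000

print(naive_looks_like_link(tarball_like))
print(strict_looks_like_link(tarball_like))
print(strict_looks_like_link(link))
```

The stricter variant rejects the large file while still accepting the short link text, which is the kind of guard that would have avoided treating real content as a pointer.
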
@ -0,0 +1,10 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="4.154.0.140"
subject="comment 9"
date="2013-07-20T23:26:09Z"
content="""
Ok, I've put in two separate bug fixes, either of which would have been sufficient to prevent this data loss. I am working on a third fix to detect this kind of problem at a higher level and avoid losing content even if it gets all confused.
The fix for this bug may be a candidate for backporting to Debian stable, since the bug causes data loss.
"""]]

@ -0,0 +1,21 @@
### Please describe the problem.
I am unable to restore a git-annex dir to its pre-init state.
### What steps will reproduce the problem?
Init a git-annex dir on Android on a filesystem without symlinks.
Use it for a while.
Run: "git-annex uninit" -> You cannot run this command in a direct mode repository.
Run: "git-annex indirect" -> Git is configured to not use symlinks, so you must use direct mode.
### What version of git-annex are you using? On what operating system?
git-annex version: 4.20130601-g7483ca4
### Please provide any additional information below.
[[!format sh """
# If you can, paste a complete transcript of the problem occurring here.
# If the problem is with the git-annex assistant, paste in .git/annex/daemon.log
# End of transcript or log.
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="4.154.0.140"
subject="comment 1"
date="2013-07-20T23:20:48Z"
content="""
There's no way to make indirect mode work on a filesystem without symlinks, but it should be possible to make unannex (required for uninit) work in direct mode. It just has not been done yet.
"""]]

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="https://www.google.com/accounts/o8/id?id=AItOawm_cen0223TLcWCTPwCPecCQC5JxGnPO04"
nickname="Eric"
subject="comment 2"
date="2013-07-20T23:26:23Z"
content="""
unannex and uninit are the main things, so that people have an exit strategy for git-annex on Android.
"""]]

@ -1,3 +1,5 @@
**EDIT: Mistakenly posted this thread in the forum. I created a new post in todo. Link: http://git-annex.branchable.com/todo/Bittorrent-like_features/?updated**
Do you think it would be possible to have bittorrent-like transfers between remotes, so that no single remote gets hit too hard with transfers? It would be great if you distribute your files across multiple bandwidth-capped remotes and want fast download speeds. Obviously, this isn't a simple task, but the protocol already exists; it just needs to be adapted for the purpose (and rewritten in Haskell...). Maybe some day in the future, after the more important stuff gets taken care of? It could be an enticing stretch goal.
PS: still working on getting BTC, will be donating soon!

@ -0,0 +1,8 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="4.154.0.140"
subject="comment 3"
date="2013-07-20T20:33:37Z"
content="""
Indeed, that was a bug with encrypted directory special remotes. Fixed it.
"""]]

@ -0,0 +1,31 @@
I made an oops and created a wishlist thread in the forum regarding bittorrent-like behaviour. Sorry, my bad.
Here's the original thread:
http://git-annex.branchable.com/forum/Wishlist:_Bittorrent-like_transfers/
I think I summed up pretty well what bittorrent-like features could be added to git-annex in one of the posts, so I'll copy and paste some of it here (with slight clarifications added in).
>Disclaimer: I'm thinking out loud of what could make git-annex even more awesome. I don't expect this to be implemented any time soon. Please pardon any dumbassery.
>Having your remotes (optionally!) act like a swarm would be an awesome feature to have because you bring in a lot of new features that optimize storage, bandwidth, and overall traffic usage. This would be made a lot easier if parts of it were implemented in small steps that added a nifty feature. The best part is, each of these could be implemented by themselves, and they're all features that would be really useful.
>
>Step 1. Concurrent downloads of a file from remotes.
>
>This would make sense to have: it saves upload traffic on your remotes, and you also get faster download speeds on the receiving end.
>
>Step 2. Implementing part of the super-seeding capabilities.
>
>You upload pieces of a file to different remotes from your laptop, and on your desktop you can download all those pieces and put them together again to get a complete file. If you really wanted to get fancy, you could build in redundancy (à la RAID) so if a remote or two gets lost, you don't lose the entire file. This would be a very efficient use of storage if you have a bunch of free cloud storage accounts (~1GB each) and some big files you want to back up.
>
>Step 3. Setting it up so that those remotes could talk to one another and share those pieces.
>
>This is where it gets more like bittorrent. Useful because you upload 1 copy and, in a few hours, have, say, 5 complete copies on 5 different remotes. You could add or remove remotes from a swarm locally, and push those changes to those remotes, which then adapt themselves to suit the new rules and share those with other remotes in the swarm (rules should be GPG-signed as a safety precaution). Also, if/when deltas get implemented, you could push that delta to the swarm and have all the remotes adopt it. This is cooler than regular bittorrent because the shared file can be updated. As a safety precaution, the delta could be GPG-signed so a corrupt file doesn't contaminate the entire swarm. Each remote could have bandwidth/storage limits set in a dotfile.
>
>This is a high-level idea of how it might work, and it's also a HUGE set of features to add, but if implemented, you'd be saving a ton of resources, adding new use cases, and making git-annex more flexible.
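
The piece-splitting idea in Step 2 can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions, not a proposed git-annex design: it uses a single XOR parity piece (so it survives the loss of exactly one piece, whereas the post imagines surviving "a remote or two"), and the function names `split_with_parity` and `reassemble` are made up for the example.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    # Byte-wise XOR of two equal-length byte strings.
    return bytes(x ^ y for x, y in zip(a, b))


def split_with_parity(data: bytes, n: int):
    # Split data into n equal-length pieces (zero-padded) plus one XOR
    # parity piece; each piece could be stored on a different remote.
    piece_len = -(-len(data) // n)  # ceiling division
    padded = data.ljust(piece_len * n, b"\0")
    pieces = [padded[i * piece_len:(i + 1) * piece_len] for i in range(n)]
    parity = reduce(xor_bytes, pieces)
    return pieces, parity, len(data)


def reassemble(pieces, parity, original_len):
    # Rebuild the data; at most one piece may be missing (given as None).
    missing = [i for i, p in enumerate(pieces) if p is None]
    if len(missing) > 1:
        raise ValueError("a single parity piece can only recover one lost piece")
    pieces = list(pieces)
    if missing:
        present = [p for p in pieces if p is not None]
        # XOR of the parity with all surviving pieces yields the lost one.
        pieces[missing[0]] = reduce(xor_bytes, present, parity)
    return b"".join(pieces)[:original_len]


data = b"hello git-annex swarm"
pieces, parity, original_len = split_with_parity(data, 4)
pieces[2] = None  # simulate one remote being lost
restored = reassemble(pieces, parity, original_len)
print(restored == data)
```

With four data pieces and one parity piece, the storage overhead is 25%, compared with 100% for keeping a second full copy; tolerating more than one lost remote would need a stronger erasure code than plain XOR.
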
And this:
>Obviously, Step 3 would only work on remotes where you control the running processes, but if given login credentials to cloud storage remotes (potentially dangerous!) they could read/write to something like Dropbox or rsync.
>
>Another thing, this would be completely trackerless. You just use remote groups (or create swarm definitions) and share those with your remotes. **It's completely decentralized!**