Merge branch 'master' into s3-aws
commit e535ff8fa4
27 changed files with 205 additions and 67 deletions
@ -0,0 +1,14 @@
[[!comment format=mdwn
username="https://www.google.com/accounts/o8/id?id=AItOawmUJBh1lYmvfCCiGr3yrdx-QhuLCSRnU5c"
nickname="Justin"
subject="comment 2"
date="2014-10-24T04:49:20Z"
content="""
Oh jeez, I screwed that up wrt HEAD and GET. Sorry. The cost per HEAD on Google is 1/10 the price of GET, so we're talking $.13 to HEAD my 130k-file annex, which is totally reasonable.

One can GET a bucket, which is what I was looking at. This returns up to 1000 elements of its contents (and there's a way to iterate over larger buckets). Of course this would only be useful if the majority of files in the bucket were of interest to git-annex, and it sounds like more trouble than it's worth at the prices I'm seeing.

There might be a throughput improvement to be had by keeping the connection alive, although in my brief investigation, I think there may be a larger gain to be had by pipelining the various steps. Based on the fact that git-annex oomed when trying to upload a large file from my rpi, it seems like maybe the whole file is encrypted in memory before it's uploaded? And certainly the HEAD(s) appear not to be done in parallel with the upload.

Sorry again for that HEAD/GET fail.
"""]]
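
The cost figure in the comment above can be checked with a few lines of arithmetic. This is only a rough sketch: the per-request price is an assumption backed out of the comment's own numbers ($.13 for 130k HEADs, i.e. $0.01 per 10,000 requests), not a quoted price list, and `request_cost` is a hypothetical helper.

```python
# Rough cost check for one HEAD request per annexed file.
# Assumption: the HEAD price is the one implied by the comment
# ($0.13 for 130,000 requests); actual provider pricing may differ.
from decimal import Decimal

HEAD_PRICE_PER_10K = Decimal("0.01")          # implied by the comment, not an official quote
GET_PRICE_PER_10K = HEAD_PRICE_PER_10K * 10   # "HEAD is 1/10 the price of GET"


def request_cost(n_requests, price_per_10k):
    # Cost in dollars of n_requests at a given price per 10,000 requests.
    return (Decimal(n_requests) / Decimal(10_000)) * price_per_10k


head_cost = request_cost(130_000, HEAD_PRICE_PER_10K)
get_cost = request_cost(130_000, GET_PRICE_PER_10K)
print(f"HEAD: ${head_cost:.2f}, GET: ${get_cost:.2f}")
# prints "HEAD: $0.13, GET: $1.30"
```

Under that assumed rate, checking every file with GET instead of HEAD would cost about ten times as much, which is the mix-up the comment apologizes for.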
@ -0,0 +1,12 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="209.250.56.96"
subject="comment 3"
date="2014-10-24T16:02:23Z"
content="""
The OOM is [[S3_memory_leaks]]; fixed in the s3-aws branch.

Yeah, GET of a bucket is doable. Another problem with it, though, is that if the bucket has a lot of contents, such as many files, or large files split into many chunks, the whole listing has to be buffered in memory or processed as a stream. It would make sense in operations where git-annex knows it wants to check every key in a bucket. `git annex unused --from $s3remote` is the case that springs to mind where it could be quite useful to do that. Integrating it with `get`, not so much.

I'd be inclined to demote this to a wishlist todo item to try to use bucket GET for `unused`. And/or rethink whether it makes sense for `copy --to` to run in --fast mode by default. I've been back and forth on that question before, but just from a runtime perspective, not from a 13 cents perspective. ;)
"""]]
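
The "processed as a stream" option for a bucket GET could look roughly like the sketch below. It is not git-annex code and does not call any real S3 library: `make_fake_bucket` is a hypothetical in-memory stand-in for a paged listing (up to 1000 keys per request, as mentioned in the thread), and `keys_not_referenced` is an illustrative name for the kind of check `unused` would need.

```python
# Sketch: walk a paged bucket listing as a stream instead of buffering it all.
# Everything here is hypothetical; the "bucket" is an in-memory fake.
from typing import Callable, Iterator, List, Optional, Set, Tuple

# One page of a listing: (keys in this page, marker for the next page or None).
Page = Tuple[List[str], Optional[str]]


def iter_bucket_keys(list_page: Callable[[Optional[str]], Page]) -> Iterator[str]:
    # Yield keys one page at a time, so only a single page is held in memory.
    marker = None
    while True:
        keys, marker = list_page(marker)
        yield from keys
        if marker is None:
            return


def make_fake_bucket(all_keys, page_size=1000):
    # Stand-in for a remote bucket listing that returns at most page_size keys.
    ordered = sorted(all_keys)

    def list_page(marker):
        start = 0 if marker is None else ordered.index(marker) + 1
        page = ordered[start:start + page_size]
        more = start + page_size < len(ordered)
        return page, (page[-1] if more else None)

    return list_page


def keys_not_referenced(list_page, referenced: Set[str]) -> List[str]:
    # Keys present in the bucket that the repository no longer refers to.
    return [k for k in iter_bucket_keys(list_page) if k not in referenced]


bucket = make_fake_bucket([f"KEY-{i:05d}" for i in range(2500)])
referenced = {f"KEY-{i:05d}" for i in range(2500)} - {"KEY-00007", "KEY-02400"}
print(keys_not_referenced(bucket, referenced))
# prints ['KEY-00007', 'KEY-02400']
```

In this sketch 2500 keys take three listing requests rather than 2500 HEADs, which is why the approach only pays off when most of the bucket is of interest, as with `unused`.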
@ -0,0 +1,11 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="209.250.56.96"
subject="comment 11"
date="2014-10-23T21:05:15Z"
content="""
When it resumes, it will start at 0% but jump forward to the resume point pretty quickly, after verifying which chunks have already been sent.
If any full chunk gets transferred, I'd expect it to resume. It may not be very obvious that this is happening for smaller files.

I have been running `git annex testremote` against S3 special remotes today, and have not managed to reproduce this problem (using either the old S3 or the new AWS libraries). It could be anything, including a problem with your network or the network between you and the S3 endpoint. Have you tried using a different S3 region?
"""]]
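
The resume behaviour described above can be illustrated with a small sketch. It is an assumption-laden model, not git-annex's implementation: it assumes chunks are checked in order and that the transfer restarts at the first chunk the remote does not have, and both `resume_point` and the `chunk_present` callback are hypothetical names.

```python
# Sketch of chunk-based resume: find the first chunk the remote lacks.
# Assumption: chunks are verified in order; a partly sent chunk does not count.
def resume_point(total_size, chunk_size, chunk_present):
    # Returns (index of first missing chunk, bytes that need not be resent).
    n_chunks = -(-total_size // chunk_size)  # ceiling division
    for i in range(n_chunks):
        if not chunk_present(i):
            return i, min(i * chunk_size, total_size)
    return n_chunks, total_size


MIB = 1024 * 1024

# A 10 MiB file in 1 MiB chunks, with chunks 0..3 already on the remote.
already_sent = {0, 1, 2, 3}
index, done = resume_point(10 * MIB, MIB, lambda i: i in already_sent)
print(index, done // MIB)
# prints "4 4"

# A file smaller than one chunk has no full chunk to resume from.
print(resume_point(512 * 1024, MIB, lambda i: False))
# prints "(0, 0)"
```

Under these assumptions the progress display starts at 0%, the presence checks run quickly, and the transfer then jumps to the first missing chunk; a file smaller than one chunk always starts over.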
@ -0,0 +1,14 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="209.250.56.96"
subject="comment 1"
date="2014-10-23T18:52:48Z"
content="""
The S3 library that git-annex is using does not support the authentication method that this region uses.

It is supported by the aws library that git-annex uses in the `s3-aws` branch in git, and I already added the region there this morning.

I can't merge `s3-aws` yet; the necessary version of the aws library is not yet available in, e.g., Debian. And even upgrading aws from cabal seems to result in dependency hell, due to its needing a newer version of scientific. This should all sort itself out in time.

If you need this region, you'll need to try to build git-annex's s3-aws branch, for now.
"""]]
@ -0,0 +1,8 @@
[[!comment format=mdwn
username="http://joeyh.name/"
ip="209.250.56.96"
subject="comment 2"
date="2014-10-23T19:51:46Z"
content="""
Looks like the cabal dependency hell is manageable; if done on a system without anything installed, cabal manages to install the new aws, and everything else, except for the dbus library. Still not ready to be merged though.
"""]]