Commit graph

152 commits

Author SHA1 Message Date
Joey Hess
e97fce35a6
Display progress meter in -J mode when downloading from the web.
Including in addurl, and get --from web, but also in S3 and External
special remotes when a web url is known for content in those remotes.
2015-11-16 21:00:54 -04:00
Joey Hess
4153507864
Fix failure to build with aws-0.13.0 and finish nearline support.
* Fix failure to build with aws-0.13.0.
* When built with aws-0.13.0, the S3 special remote can be used to create
  google nearline buckets, by setting storageclass=NEARLINE.
2015-11-02 11:14:03 -04:00
Joey Hess
c32a2429ed
S3: Fix support for using https.
Was using the http-only Manager before, not the tls-capable one.
2015-10-15 10:37:06 -04:00
Joey Hess
b1abe59193
add removeKey action to Remote
Not implemented for any remotes yet; probably the git remote is the only
one that will ever implement it.
2015-10-08 15:01:38 -04:00
Joey Hess
9e3ac97608 avoid deprecation warnings when built with http-client >= 0.4.18
Since I want git-annex to keep building on debian stable, I need to still
support the old http-client, which required explicit calls to
closeManager, or use of withManager to get Managers to close at appropriate
times. This is not needed in the new version, and so they added a
deprecation warning. IMHO much too early, because look at the mess I had to
go through to avoid that deprecation warning while supporting both
versions..
2015-10-01 13:48:56 -04:00
Joey Hess
20205b6073 avoid hard dependency on new version of aws 2015-09-22 11:04:26 -04:00
Joey Hess
26d6566307 S3 storage classes expansion
Added support for storageclass=STANDARD_IA to use Amazon's
new Infrequently Accessed storage.

Also allows using storageclass=NEARLINE to use Google's NearLine storage.

The necessary changes to aws to support this are in
https://github.com/aristidb/aws/pull/176
2015-09-17 17:20:01 -04:00
Joey Hess
1cd3b7ddf0 refactor 2015-08-17 10:42:14 -04:00
Joey Hess
c5b8484c2e Simplify setup process for an ssh remote.
Now it suffices to run git remote add, followed by git-annex sync. Now the
remote is automatically initialized for use by git-annex, where before the
git-annex branch had to manually be pushed before using git-annex sync.
Note that this involved changes to git-annex-shell, so if the remote is
using an old version, the manual push is still needed.

Implementation required git-annex-shell be changed, so configlist can
autoinit a repository even when no git-annex branch has been pushed yet.
Unfortunate because we'll have to wait for it to get deployed to servers
before being able to rely on this change in the documentation.

Did consider making git-annex sync push the git-annex branch to repos that
didn't have a uuid, but this seemed difficult to do without complicating it
in messy ways.

It would be cleaner to split a command out from configlist to handle
the initialization. But this is difficult without sacrificing backwards
compatibility, for users of old git-annex versions which would not use the
new command.
2015-08-05 13:49:58 -04:00
Joey Hess
1eb4b47c79 layout 2015-06-15 14:48:38 -04:00
Joey Hess
f2486b21dd show S3 urls for public repos in whereis
Note that it's possible for an S3 bucket to be configured to allow public
access, but for git-annex to not know that it is. I chose to not show the
url unless public=yes.
2015-06-05 16:52:38 -04:00
Joey Hess
5f0f063a7a S3: Publicly accessible buckets can be used without creds. 2015-06-05 16:23:35 -04:00
Joey Hess
4acd28bf21 public=yes config to send AclPublicRead
In my tests, this has to be set when uploading a file to the bucket
and then the file can be accessed using the bucketname.s3.amazonaws.com
url.

Setting it when creating the bucket didn't seem to make the whole bucket
public, or allow accessing files stored in it. But I have gone ahead and
also sent it when creating the bucket just in case that is needed in some
case.
2015-06-05 14:38:01 -04:00
Joey Hess
334fd6d598 groundwork for readonly access
Split S3Info out of S3Handle and added some stubs
2015-06-05 13:12:45 -04:00
Joey Hess
e27b97d364 Merge branch 'master' into concurrentprogress
Conflicts:
	Command/Fsck.hs
	Messages.hs
	Remote/Directory.hs
	Remote/Git.hs
	Remote/Helper/Special.hs
	Types/Remote.hs
	debian/changelog
	git-annex.cabal
2015-05-12 13:23:22 -04:00
Joey Hess
9f14f51d63 generalized elem/notElem in ghc 7.10 require some additional type signatures when using OverloadedStrings 2015-05-10 15:41:41 -04:00
Joey Hess
ee78958798 S3: Fix incompatibility with bucket names used by hS3; the aws library cannot handle upper-case bucket names. git-annex now converts them to lower case automatically.
For example, it failed to get files from a bucket named S3.

Also fixes `git annex initremote UPPERCASE type=S3`, which failed with the
new aws library, with a signing error message.
2015-04-27 18:00:58 -04:00
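The lower-casing described in ee78958798 can be sketched as follows. This is a minimal illustration, not the code from the commit; the name `normalizeBucket` is hypothetical.

```haskell
import Data.Char (toLower)

-- | Hypothetical helper (not git-annex's actual function): lower-case a
-- bucket name before handing it to a library that rejects upper-case names.
normalizeBucket :: String -> String
normalizeBucket = map toLower

main :: IO ()
main = mapM_ (putStrLn . normalizeBucket) ["S3", "UPPERCASE", "already-lower"]
```

Because the conversion is idempotent, names that are already lower case pass through unchanged.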
Joey Hess
22a4e92df7 S3: git annex enableremote will no longer try to create the bucket, which failed since the bucket already exists. 2015-04-23 14:16:53 -04:00
Joey Hess
b3eccec68c S3: git annex info will show additional information about an S3 remote (endpoint, port, storage class) 2015-04-23 14:12:25 -04:00
Joey Hess
ae9bbf25a0 convert all log priorities, not just debug
In particular, error should go to stderr
2015-04-21 15:59:30 -04:00
Joey Hess
3b3aaf0d56 S3: Enable debug logging when annex.debug or --debug is set.
To debug a bug report, but generally useful.
2015-04-21 15:55:42 -04:00
Joey Hess
a2902cdaaf add filename to progress bar, and display ok/failed at end
This needed plumbing an AssociatedFile through retrieveKeyFileCheap.
2015-04-14 16:35:10 -04:00
Joey Hess
afc5153157 update my email address and homepage url 2015-01-21 12:50:09 -04:00
Joey Hess
4f657aa14e add getFileSize, which can get the real size of a large file on Windows
Avoid using fileSize which maxes out at just 2 gb on Windows.
Instead, use hFileSize, which doesn't have a bounded size.
Fixes support for files > 2 gb on Windows.

Note that the InodeCache code only needs to compare a file size,
so it doesn't matter if the file size wraps. So it has been
left as-is. This was necessary both to avoid invalidating existing inode
caches, and because the code passed FileStatus around and would have become
more expensive if it called getFileSize.

This commit was sponsored by Christian Dietrich.
2015-01-20 17:09:24 -04:00
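The approach described in 4f657aa14e, asking an open handle for its size instead of reading a bounded field from the file status, might look roughly like this. It is a sketch only: `fileSizeViaHandle` and the demo file name are hypothetical, and the real `getFileSize` in git-annex may differ.

```haskell
import System.IO (IOMode (ReadMode), hFileSize, withBinaryFile)

-- | Hypothetical stand-in for the getFileSize named in the commit.
-- hFileSize returns an unbounded Integer, so the result cannot wrap
-- the way a fixed-width file offset can.
fileSizeViaHandle :: FilePath -> IO Integer
fileSizeViaHandle path = withBinaryFile path ReadMode hFileSize

main :: IO ()
main = do
    let path = "size-demo.tmp" -- scratch file created for the demo
    writeFile path (replicate 1000 'x')
    size <- fileSizeViaHandle path
    print size
```

Opening the file costs more than inspecting an already-fetched status record, which fits the commit's note that the InodeCache code was left as-is.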
Joey Hess
27fb7e514d Fix build with -f-S3. 2014-12-19 16:53:25 -04:00
Joey Hess
65bce2c80d reformat 2014-12-16 15:26:13 -04:00
Joey Hess
2cd84fcc8b Expand checkurl to support recommended filename, and multi-file-urls
This commit was sponsored by an anonymous bitcoiner.
2014-12-11 15:33:42 -04:00
Joey Hess
30bf112185 Urls can now be claimed by remotes. This will allow creating, for example, an external special remote that handles magnet: and *.torrent urls. 2014-12-08 19:15:07 -04:00
Joey Hess
cb6e16947d add stub claimUrl 2014-12-08 13:40:15 -04:00
Joey Hess
0a891fcfc5 support S3 front-end used by globalways.net
This threw an unusual exception w/o an error message when probing to see if
the bucket exists yet. So rather than relying on tryS3, catch all
exceptions.

This does mean that it might get an exception for some transient network
error, conclude that the bucket does not exist yet, try to create it, and
then fail because it already exists.
2014-11-05 12:42:12 -04:00
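The "catch all exceptions" probe from 0a891fcfc5 could be sketched as below. This is an assumption-laden illustration: `probeSucceeds` is a hypothetical name, and the probe actions are placeholders rather than real S3 requests.

```haskell
import Control.Exception (SomeException, throwIO, try)

-- | Hypothetical helper: run a probe and report whether it completed
-- without throwing. Any exception at all is treated as "bucket not
-- there", which is exactly the trade-off the commit message describes.
probeSucceeds :: IO a -> IO Bool
probeSucceeds probe = either failed (const True) <$> try probe
  where
    failed :: SomeException -> Bool
    failed _ = False

main :: IO ()
main = do
    -- Placeholder probes; a real one would issue a request to the S3 endpoint.
    probeSucceeds (return ()) >>= print
    probeSucceeds (throwIO (userError "unusual exception") :: IO ()) >>= print
```

Matching on `SomeException` also swallows transient failures, so a caller cannot distinguish "bucket missing" from "network hiccup".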
Joey Hess
93feefae05 Revert "work around minimum part size problem"
This reverts commit a42022d8ff.

I misunderstood the cause of the problem.
2014-11-04 16:21:55 -04:00
Joey Hess
a42022d8ff work around minimum part size problem
When uploading the last part of a file, which was 640229 bytes, S3 rejected
that part: "Your proposed upload is smaller than the minimum allowed size"

I don't know what the minimum is, but the fix is just to include the last
part into the previous part. Since this can result in a part that's
double-sized, use half-sized parts normally.
2014-11-04 16:06:13 -04:00
Joey Hess
ad2125e24a fix a couple type errors and the progress bar 2014-11-04 15:39:48 -04:00
Joey Hess
fccdd61eec fix memory leak
Unfortunately, I don't fully understand why it was leaking using the old
method of a lazy bytestring. I just know that it was leaking, despite
neither hGetUntilMetered nor byteStringPopper seeming to leak by
themselves.

The new method avoids the lazy bytestring, and simply reads chunks from the
handle and streams them out to the http socket.
2014-11-04 15:22:08 -04:00
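The replacement strategy in fccdd61eec, reading strict chunks from a handle and passing each one on immediately, can be illustrated with the sketch below. The name `streamChunks`, the chunk size, and the byte-counting sink are all assumptions for the demo; the actual code streams to an http socket.

```haskell
import qualified Data.ByteString as B
import Data.IORef (modifyIORef', newIORef, readIORef)
import System.IO (Handle, IOMode (ReadMode), withBinaryFile)

-- | Hypothetical helper: read strict chunks of at most chunkSize bytes
-- and hand each to the sink before reading the next one, so only one
-- chunk is held in memory at a time.
streamChunks :: Int -> Handle -> (B.ByteString -> IO ()) -> IO ()
streamChunks chunkSize h sink = loop
  where
    loop = do
        chunk <- B.hGetSome h chunkSize
        if B.null chunk
            then return () -- empty chunk means end of file
            else sink chunk >> loop

main :: IO ()
main = do
    let path = "stream-demo.tmp" -- scratch file created for the demo
    B.writeFile path (B.replicate 10000 120)
    total <- newIORef (0 :: Int)
    -- The sink only counts bytes here; a real sink would write to a socket.
    withBinaryFile path ReadMode $ \h ->
        streamChunks 4096 h (\c -> modifyIORef' total (+ B.length c))
    readIORef total >>= print
```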
Joey Hess
29871e320c combine 2 checks 2014-11-04 14:47:18 -04:00
Joey Hess
0f78f197eb casts; now fully working.. but still leaking
Still seems to buffer the whole partsize in memory, but I'm pretty sure my
code is not what's doing it. See https://github.com/aristidb/aws/issues/142
2014-11-03 21:12:15 -04:00
Joey Hess
f0551578d6 this should avoid leaking memory 2014-11-03 20:49:30 -04:00
Joey Hess
4230b56b79 logic error 2014-11-03 20:15:33 -04:00
Joey Hess
62de9a39bf WIP 3 2014-11-03 20:04:42 -04:00
Joey Hess
d16382e99f WIP 2 2014-11-03 19:50:33 -04:00
Joey Hess
5360417436 WIP try sending using RequestBodyStreamChunked
May not work; if it does this is gonna be the simplest way to get good
memory size and progress reporting.
2014-11-03 19:18:46 -04:00
Joey Hess
8f61bfad51 link to memory leak bug 2014-11-03 17:55:05 -04:00
Joey Hess
711b18a6eb improve info display for multipart 2014-11-03 17:24:53 -04:00
Joey Hess
2c53f331bd fix build 2014-11-03 17:23:46 -04:00
Joey Hess
6a965cf8d7 adjust version check
I assume 0.10.6 will have the fix for the bug I reported, which got fixed
in master already..
2014-11-03 16:23:00 -04:00
Joey Hess
5c3d9d6caa show multipart configuration in git annex info s3remote 2014-11-03 16:07:41 -04:00
Joey Hess
8faeb25076 finish multipart support using unreleased update to aws lib to yield etags
Untested and not even compiled yet.

Testing should include checks that file content streams through without
buffering in memory.

Note that CL.consume causes all the etags to be buffered in memory.
This is probably nearly unavoidable, since a request has to be constructed
that contains the list of etags in its body. (While it might be possible to
stream generation of the body, that would entail making an http request that
dribbles out parts of the body as the multipart uploads complete, which is
not likely to work well.)

To limit this being a problem, it's best for partsize to be set to some
suitably large value, like 1gb. Then a full terabyte file will need only
1024 etags to be stored, which will probably use around 1 mb of memory.
2014-11-03 16:04:55 -04:00
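The part-count arithmetic in 8faeb25076 can be checked with a short calculation. The function name `partCount` is hypothetical, and the sketch assumes "1gb" and "terabyte" mean binary units (1024^3 and 1024^4 bytes), which is what makes the figure come out to exactly 1024.

```haskell
-- | Hypothetical helper: number of parts needed to upload fileSize bytes
-- in parts of partSize bytes, rounding up. Assumes partSize > 0.
partCount :: Integer -> Integer -> Integer
partCount fileSize partSize = (fileSize + partSize - 1) `div` partSize

main :: IO ()
main = do
    let gib = 1024 ^ (3 :: Int) -- assumed meaning of "1gb" in the commit
        tib = 1024 ^ (4 :: Int) -- assumed meaning of "a full terabyte"
    print (partCount tib gib) -- prints 1024
```

Each part yields one etag, so the number of etags buffered by `CL.consume` grows linearly with file size and inversely with partsize.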
Joey Hess
6e89d070bc WIP multipart S3 upload
I'm a little stuck on getting the list of etags of the parts.
This seems to require taking the md5 of each part locally,
which doesn't get along well with lazily streaming in the part from the
file. It would need to read the file twice, or lose laziness and buffer a
whole part -- but parts might be quite large.

This seems to be a problem with the API provided; S3 is supposed to return
an etag, but that is not exposed. I have filed a bug:
https://github.com/aristidb/aws/issues/141
2014-10-28 14:17:30 -04:00
Joey Hess
8ed1a0afee fix build 2014-10-23 16:52:05 -04:00
Joey Hess
8edf7a0fc3 fix build 2014-10-23 16:51:10 -04:00