git-annex

Author	SHA1	Message	Date
Joey Hess	ee78958798	S3: Fix incompatibility with bucket names used by hS3; the aws library cannot handle upper-case bucket names. git-annex now converts them to lower case automatically. For example, it failed to get files from a bucket named S3. Also fixes `git annex initremote UPPERCASE type=S3`, which failed with the new aws library, with a signing error message.	2015-04-27 18:00:58 -04:00
Joey Hess	22a4e92df7	S3: git annex enableremote will no longer try to create the bucket, which failed since the bucket already exists.	2015-04-23 14:16:53 -04:00
Joey Hess	b3eccec68c	S3: git annex info will show additional information about an S3 remote (endpoint, port, storage class)	2015-04-23 14:12:25 -04:00
Joey Hess	ae9bbf25a0	convert all log priorities, not just debug In particular, error should go to stderr	2015-04-21 15:59:30 -04:00
Joey Hess	3b3aaf0d56	S3: Enable debug logging when annex.debug or --debug is set. To debug a bug report, but generally useful.	2015-04-21 15:55:42 -04:00
Joey Hess	afc5153157	update my email address and homepage url	2015-01-21 12:50:09 -04:00
Joey Hess	4f657aa14e	add getFileSize, which can get the real size of a large file on Windows Avoid using fileSize which maxes out at just 2 gb on Windows. Instead, use hFileSize, which doesn't have a bounded size. Fixes support for files > 2 gb on Windows. Note that the InodeCache code only needs to compare a file size, so it doesn't matter if the file size wraps. So it has been left as-is. This was necessary both to avoid invalidating existing inode caches, and because the code passed FileStatus around and would have become more expensive if it called getFileSize. This commit was sponsored by Christian Dietrich.	2015-01-20 17:09:24 -04:00
Joey Hess	27fb7e514d	Fix build with -f-S3.	2014-12-19 16:53:25 -04:00
Joey Hess	65bce2c80d	reformat	2014-12-16 15:26:13 -04:00
Joey Hess	2cd84fcc8b	Expand checkurl to support recommended filename, and multi-file-urls This commit was sponsored by an anonymous bitcoiner.	2014-12-11 15:33:42 -04:00
Joey Hess	30bf112185	Urls can now be claimed by remotes. This will allow creating, for example, an external special remote that handles magnet: and *.torrent urls.	2014-12-08 19:15:07 -04:00
Joey Hess	cb6e16947d	add stub claimUrl	2014-12-08 13:40:15 -04:00
Joey Hess	0a891fcfc5	support S3 front-end used by globalways.net This threw an unusual exception w/o an error message when probing to see if the bucket exists yet. So rather than relying on tryS3, catch all exceptions. This does mean that it might get an exception for some transient network error, think this means the bucket DNE yet, and try to create it, and then fail when it already exists.	2014-11-05 12:42:12 -04:00
Joey Hess	93feefae05	Revert "work around minimum part size problem" This reverts commit `a42022d8ff`. I misunderstood the cause of the problem.	2014-11-04 16:21:55 -04:00
Joey Hess	a42022d8ff	work around minimum part size problem When uploading the last part of a file, which was 640229 bytes, S3 rejected that part: "Your proposed upload is smaller than the minimum allowed size" I don't know what the minimum is, but the fix is just to include the last part into the previous part. Since this can result in a part that's double-sized, use half-sized parts normally.	2014-11-04 16:06:13 -04:00
Joey Hess	ad2125e24a	fix a couple type errors and the progress bar	2014-11-04 15:39:48 -04:00
Joey Hess	fccdd61eec	fix memory leak Unfortunately, I don't fully understand why it was leaking using the old method of a lazy bytestring. I just know that it was leaking, despite neither hGetUntilMetered nor byteStringPopper seeming to leak by themselves. The new method avoids the lazy bytestring, and simply reads chunks from the handle and streams them out to the http socket.	2014-11-04 15:22:08 -04:00
Joey Hess	29871e320c	combine 2 checks	2014-11-04 14:47:18 -04:00
Joey Hess	0f78f197eb	casts; now fully working.. but still leaking Still seems to buffer the whole partsize in memory, but I'm pretty sure my code is not what's doing it. See https://github.com/aristidb/aws/issues/142	2014-11-03 21:12:15 -04:00
Joey Hess	f0551578d6	this should avoid leaking memory	2014-11-03 20:49:30 -04:00
Joey Hess	4230b56b79	logic error	2014-11-03 20:15:33 -04:00
Joey Hess	62de9a39bf	WIP 3	2014-11-03 20:04:42 -04:00
Joey Hess	d16382e99f	WIP 2	2014-11-03 19:50:33 -04:00
Joey Hess	5360417436	WIP try sending using RequestBodyStreamChunked May not work; if it does this is gonna be the simplest way to get good memory size and progress reporting.	2014-11-03 19:18:46 -04:00
Joey Hess	8f61bfad51	link to memory leak bug	2014-11-03 17:55:05 -04:00
Joey Hess	711b18a6eb	improve info display for multipart	2014-11-03 17:24:53 -04:00
Joey Hess	2c53f331bd	fix build	2014-11-03 17:23:46 -04:00
Joey Hess	6a965cf8d7	adjust version check I assume 0.10.6 will have the fix for the bug I reported, which got fixed in master already..	2014-11-03 16:23:00 -04:00
Joey Hess	5c3d9d6caa	show multipart configuration in git annex info s3remote	2014-11-03 16:07:41 -04:00
Joey Hess	8faeb25076	finish multipart support using unreleased update to aws lib to yield etags Untested and not even compiled yet. Testing should include checks that file content streams through without buffering in memory. Note that CL.consume causes all the etags to be buffered in memory. This is probably nearly unavoidable, since a request has to be constructed that contains the list of etags in its body. (While it might be possible to stream generation of the body, that would entail making a http request that dribbles out parts of the body as the multipart uploads complete, which is not likely to work well.) To limit this being a problem, it's best for partsize to be set to some suitably large value, like 1gb. Then a full terabyte file will need only 1024 etags to be stored, which will probably use around 1 mb of memory.	2014-11-03 16:04:55 -04:00
Joey Hess	6e89d070bc	WIP multipart S3 upload I'm a little stuck on getting the list of etags of the parts. This seems to require taking the md5 of each part locally, which doesn't get along well with lazily streaming in the part from the file. It would need to read the file twice, or lose laziness and buffer a whole part -- but parts might be quite large. This seems to be a problem with the API provided; S3 is supposed to return an etag, but that is not exposed. I have filed a bug: https://github.com/aristidb/aws/issues/141	2014-10-28 14:17:30 -04:00
Joey Hess	8ed1a0afee	fix build	2014-10-23 16:52:05 -04:00
Joey Hess	8edf7a0fc3	fix build	2014-10-23 16:51:10 -04:00
Joey Hess	171e677a3c	update for aws 0.10's better handling of DNE for HEAD Kept support for older aws, since Debian has 0.9.2 still.	2014-10-23 16:32:18 -04:00
Joey Hess	6acc6863c5	fix build	2014-10-23 15:54:00 -04:00
Joey Hess	7489f516bc	one last build fix, yes it builds now	2014-10-23 15:50:41 -04:00
Joey Hess	76ee815e89	needs type families	2014-10-23 15:48:37 -04:00
Joey Hess	f0989cf0bd	fix build	2014-10-23 15:41:57 -04:00
Joey Hess	35551d0ed0	Merge branch 'master' into s3-aws Conflicts: Remote/S3.hs	2014-10-22 17:14:38 -04:00
Joey Hess	1b90838bbd	add internet archive item url to info	2014-10-21 15:34:32 -04:00
Joey Hess	9280fe4cbe	include creds location in info This is intended to let the user easily tell if a remote's creds are coming from info embedded in the repository, or instead from the environment, or perhaps are locally stored in a creds file. This commit was sponsored by Frédéric Schütz.	2014-10-21 15:09:40 -04:00
Joey Hess	a0297915c1	add per-remote-type info Now `git annex info $remote` shows info specific to the type of the remote, for example, it shows the rsync url. Remote types that support encryption or chunking also include that in their info. This commit was sponsored by Ævar Arnfjörð Bjarmason.	2014-10-21 14:36:09 -04:00
Joey Hess	ef3804bdb3	S3: Fix embedcreds=yes handling for the Internet Archive. Before, embedcreds=yes did not cause the creds to be stored in remote.log, but also prevented them being locally cached.	2014-10-12 13:15:52 -04:00
Joey Hess	2f3c3aa01f	glacier, S3: Fix bug that caused embedded creds to not be encrypted using the remote's key. encryptionSetup must be called before setRemoteCredPair. Otherwise, the RemoteConfig doesn't have the cipher in it, and so no cipher is used to encrypt the embedded creds. This is a security fix for non-shared encryption methods! For encryption=shared, there's no security problem, just an inconsistency in whether the embedded creds are encrypted. This is very important to get right, so used some types to help ensure that setRemoteCredPair is only run after encryptionSetup. Note that the external special remote bypasses the type safety, since creds can be set after the initial remote config, if the external special remote program requests it. Also note that IA remotes never use encryption, so encryptionSetup is not run for them at all, and again the type safety is bypassed. This leaves two open questions: 1. What to do about S3 and glacier remotes that were set up using encryption=pubkey/hybrid with embedcreds? Such a git repo has a security hole embedded in it, and this needs to be communicated to the user. Is the changelog enough? 2. enableremote won't work in such a repo, because git-annex will try to decrypt the embedded creds, which are not encrypted, so fails. This needs to be dealt with, especially for encryption=shared repos, which are not really broken, just inconsistently configured. Noticing that problem for encryption=shared is what led to commit `fbdeeeed5f`, which tried to fix the problem by not decrypting the embedded creds. This commit was sponsored by Josh Taylor.	2014-09-18 17:26:12 -04:00
Joey Hess	ef01ff1e77	Merge branch 'master' into s3-aws Conflicts: git-annex.cabal	2014-08-15 17:30:40 -04:00
Joey Hess	6adbd50cd9	testremote: Add testing of behavior when remote is not available Added a mkUnavailable method, which a Remote can use to generate a version of itself that is not available. Implemented for several, but not yet all remotes. This allows testing that checkPresent properly throws an exception when it cannot check if a key is present or not. It also allows testing that the other methods don't throw exceptions in these circumstances. This immediately found several bugs, which this commit also fixes! * git remotes using ssh accidentally had checkPresent return an exception, rather than throwing it * The chunking code accidentally returned False rather than propagating an exception when there were no chunks and checkPresent threw an exception for the non-chunked key. This commit was sponsored by Carlo Matteo Capocasa.	2014-08-10 15:02:59 -04:00
Joey Hess	5fc54cb182	auto-create IA buckets Needs my patch to aws which will hopefully be accepted soon.	2014-08-09 22:17:40 -04:00
Joey Hess	445f04472c	better memoization	2014-08-09 22:13:03 -04:00
Joey Hess	5ee72b1bae	fix meter update	2014-08-09 16:49:31 -04:00
Joey Hess	3659cb9efb	S3: finish converting to aws library Implemented the Retriever. Unfortunately, it is a fileRetriever and not a byteRetriever. It should be possible to convert this to a byteRetriever, but I got stuck: The conduit sink needs to process individual chunks, but a byteRetriever needs to pass a single L.ByteString to its callback for processing. I looked into using unsafeInterleaveIO to build up the bytestring lazily, but the sink is already operating under conduit's inversion of control, and does not run directly in IO anyway. On the plus side, no more memory leak..	2014-08-09 15:58:01 -04:00


135 commits