git-annex

Author	SHA1	Message	Date
Joey Hess	4be94c67c7	make removeKey throw exceptions	2020-05-14 14:11:05 -04:00
Joey Hess	d9c7f81ba4	make retrieveKeyFile and retrieveKeyFileCheap throw exceptions Converted retrieveKeyFileCheap to a Maybe, to avoid needing to throw an exception when a remote doesn't support it.	2020-05-13 17:07:07 -04:00
Joey Hess	c1cd402081	make storeKey throw exceptions When storing content on remote fails, always display a reason why. Since the Storer used by special remotes already did, this mostly affects git remotes, but not entirely. For example, if git-lfs failed to connect to the endpoint, it used to silently return False.	2020-05-13 14:03:00 -04:00
Joey Hess	b50ee9cd0c	remove Preparer abstraction That had almost no benefit at all, and complicated things quite a lot. What I probably wanted this to be was something like ResourceT, but it was not. The few remotes that actually need some preparation done only once and reused used a MVar and not Preparer.	2020-05-13 11:56:21 -04:00
Joey Hess	1532d67c3e	S3: Support signature=v4 To use S3 Signature Version 4. Some S3 services seem to require v4, while others may only support v2, which remains the default. I'm also not sure if v4 works correctly in all cases, there is this upstream bug report: https://github.com/aristidb/aws/issues/262 I've only tested it against the default S3 endpoint.	2020-05-07 13:18:11 -04:00
Joey Hess	81e3faf810	Merge branch 'v7'	2020-02-26 18:15:18 -04:00
Joey Hess	8af6d2c3c5	fix encryption of content to gcrypt and git-lfs Fix serious regression in gcrypt and encrypted git-lfs remotes. Since version 7.20200202.7, git-annex incorrectly stored content on those remotes without encrypting it. Problem was, Remote.Git enumerates all git remotes, including git-lfs and gcrypt. It then dispatches to those. So, Remote.List used the RemoteConfigParser from Remote.Git, instead of from git-lfs or gcrypt, and that parser does not know about encryption fields, so did not include them in the ParsedRemoteConfig. (Also didn't include other fields specific to those remotes, perhaps chunking etc also didn't get through.) To fix, had to move RemoteConfig parsing down into the generate methods of each remote, rather than doing it in Remote.List. And a consequence of that was that ParsedRemoteConfig had to change to include the RemoteConfig that got parsed, so that testremote can generate a new remote based on an existing remote. (I would have rather fixed this just inside Remote.Git, but that was not practical, at least not w/o re-doing work that Remote.List already did. Big ugly mostly mechanical patch seemed preferable to making git-annex slower.)	2020-02-26 18:05:36 -04:00
Joey Hess	67476fbc54	minor code simplification	2020-02-25 13:06:09 -04:00
Joey Hess	1883f7ef8f	support git remotes that need http basic auth using git credential to get the password One thing this doesn't do is wrap the password prompting inside the prompt action. So with -J, the output can be a bit garbled.	2020-01-22 16:16:19 -04:00
Joey Hess	2be4122bfc	include passthrough params in --describe-other-params	2020-01-20 16:53:27 -04:00
Joey Hess	7038acf96c	add descriptions for all remote config fields not yet used	2020-01-20 15:20:04 -04:00
Joey Hess	99cb3e75f1	add LISTCONFIGS to external special remote protocol Special remote programs that use GETCONFIG/SETCONFIG are recommended to implement it. The description is not yet used, but will be useful later when adding a way to make initremote list all accepted configs. configParser now takes a RemoteConfig parameter. Normally, that's not needed, because configParser returns a parser, it does not parse it itself. But, it's needed to look at externaltype and work out what external remote program to run for LISTCONFIGS. Note that, while externalUUID is changed to a Maybe UUID, checkExportSupported used to use NoUUID. The code that now checks for Nothing used to behave in some undefined way if the external program made requests that triggered it. Also, note that in externalSetup, once it generates external, it parses the RemoteConfig strictly. That generates a ParsedRemoteConfig, which is thrown away. The reason it's ok to throw that away, is that, if the strict parse succeeded, the result must be the same as the earlier, lenient parse. initremote of an external special remote now runs the program three times. First for LISTCONFIGS, then EXPORTSUPPORTED, and again LISTCONFIGS+INITREMOTE. It would not be hard to eliminate at least one of those, and it should be possible to only run the program once.	2020-01-17 16:07:17 -04:00
Joey Hess	907ca937ab	use more field functions Using field functions consistently avoids possibility of typos and also helps ensure that all fields are added to RemoteConfigParsers (as long as I have remembered to add them when writing the functions).	2020-01-15 11:15:07 -04:00
Joey Hess	7f2bfd41d7	include credPairRemoteFields in RemoteConfigParsers Avoids parse error when the fields are added to RemoteConfig at setup time and it then gets parsed, also at setup time. After setup time, such internally added fields are not a problem, because they're Accepted. So it may not be necessary in all cases to list such internally added fields, but I think it's a good idea to always do so.	2020-01-15 10:57:45 -04:00
Joey Hess	0706d9d093	finish porting S3	2020-01-15 10:52:28 -04:00
Joey Hess	c4ea3ca40a	ported almost all remotes, until my brain melted external is not started yet, and S3 is part way through and not compiling yet	2020-01-14 15:41:34 -04:00
Joey Hess	71ecfbfccf	be stricter about rejecting invalid configurations for remotes This is a first step toward that goal, using the ProposedAccepted type in RemoteConfig lets initremote/enableremote reject bad parameters that were passed in a remote's configuration, while avoiding enableremote rejecting bad parameters that have already been stored in remote.log This does not eliminate every place where a remote config is parsed and a default value is used if the parse fails. But, I did fix several things that expected foo=yes/no and so confusingly accepted foo=true but treated it like foo=no. There are still some fields that are parsed with yesNo but not checked when initializing a remote, and there are other fields that are parsed in other ways and not checked when initializing a remote. This also lays groundwork for rejecting unknown/typoed config keys.	2020-01-10 14:52:48 -04:00
Joey Hess	650a631ef8	include all remotes back in	2019-12-02 12:26:33 -04:00
Joey Hess	81d402216d	cache the serialization of a Key This will speed up the common case where a Key is deserialized from disk, but is then serialized to build eg, the path to the annex object. Previously attempted in `4536c93bb2` and reverted in `96aba8eff7`. The problems mentioned in the latter commit are addressed now: Read/Show of KeyData is backwards-compatible with Read/Show of Key from before this change, so Types.Distribution will keep working. The Eq instance is fixed. Also, Key has smart constructors, avoiding needing to remember to update the cached serialization. Used git-annex benchmark: find is 7% faster whereis is 3% faster get when all files are already present is 5% faster Generally, the benchmarks are running 0.1 seconds faster per 2000 files, on a ram disk in my laptop.	2019-11-22 17:49:16 -04:00
Joey Hess	890330f0fe	make --json-error-messages capture url download errors Convert Utility.Url to return Either String so the error message can be displayed in the annex monad and so captured. (When curl is used, its errors are still not caught.)	2019-11-12 13:52:38 -04:00
Joey Hess	9828f45d85	add RemoteStateHandle This solves the problem of sameas remotes trampling over per-remote state. Used for: * per-remote state, of course * per-remote metadata, also of course * per-remote content identifiers, because two remote implementations could in theory generate the same content identifier for two different pieces of content While chunk logs are per-remote data, they don't use this, because the number and size of chunks stored is a common property across sameas remotes. External special remote had a complication, where it was theoretically possible for a remote to send SETSTATE or GETSTATE during INITREMOTE or EXPORTSUPPORTED. Since the uuid of the remote is typically generated in Remote.setup, it would only be possible to pass a Maybe RemoteStateHandle into it, and it would otherwise have to construct its own. Rather than go that route, I decided to send an ERROR in this case. It seems unlikely that any existing external special remote will be affected. They would have to make up a git-annex key, and set state for some reason during INITREMOTE. I can imagine such a hack, but it doesn't seem worth complicating the code in such an ugly way to support it. Unfortunately, both TestRemote and Annex.Import needed the Remote to have a new field added that holds its RemoteStateHandle.	2019-10-14 13:51:42 -04:00
Joey Hess	c3975ff3b4	sameas RemoteConfig inheritance I found a way to avoid inheritance complicating anything outside of Logs.Remote. It seems fine to require all inherited values to be inherited and not set in the sameas remote's config. Since inherited values will be used for stuff like encryption and perhaps chunking, which control the actual content stored on the remote, it seems likely that there will not be any reason to need them to vary between two remotes that access the same underlying data store. The newer version of containers is free; the minimum ghc version is bundled with a newer version than that.	2019-10-10 15:58:22 -04:00
Joey Hess	d1130ea04a	get rid of hardcoded "name" lookups Support "sameas-name" being set instead. In RenameRemote, rename which ever of the two is set.	2019-10-10 13:25:10 -04:00
Joey Hess	708fc6567f	S3: Fix encoding when generating public urls of S3 objects. This code feels worryingly stringily typed, but using URI does not help because the uriPath still has to be constructed with the right uri-encoding.	2019-08-15 12:56:46 -04:00
Joey Hess	5004381dd9	improve error display when storing to an export/import remote fails Prompted by the test suite on windows failing with "export foo failed" and no information about what went wrong. Note that only storeExportWithContentIdentifier has been converted. storeExport still returns a Bool and so exceptions may be hidden. However, storeExportWithContentIdentifier has many more failure modes, since it needs to avoid overwriting modified files. So it's more important it have better error display.	2019-08-13 12:05:00 -04:00
Joey Hess	9a5ddda511	remove many old version ifdefs Drop support for building with ghc older than 8.4.4, and with older versions of several haskell libraries than will be included in Debian 10. The only remaining version ifdefs in the entire code base are now a couple for aws! This commit should only be merged after the Debian 10 release. And perhaps it will need to wait longer than that; it would hinder backporting new versions of git-annex to Debian 9 (stretch), which has been actively happening as recently as this year. This commit was sponsored by Ilya Shlyakhter.	2019-07-05 15:09:37 -04:00
Joey Hess	700a3f2787	Merge branch 'master' into import-from-s3	2019-05-01 14:30:52 -04:00
Joey Hess	9dd764e6f7	Added mimeencoding= term to annex.largefiles expressions. * Added mimeencoding= term to annex.largefiles expressions. This is probably mostly useful to match non-text files with eg "mimeencoding=binary" * git-annex matchexpression: Added --mimeencoding option.	2019-04-30 12:17:22 -04:00
Joey Hess	f08cd6a4ac	set S3 version id in retrieveExportWithContentIdentifierS3 This is necessary because of checks for a S3 version id being set done when deleting the export or overwriting or renaming it.	2019-04-24 15:13:07 -04:00
Joey Hess	a42e7a012a	refuse unsafe store to unversioned exporttree with old aws version I've developed a patch to aws, once it gets merged, the real version number of aws can be filled in.	2019-04-23 14:39:30 -04:00
Joey Hess	a7db925f59	typo	2019-04-23 13:19:48 -04:00
Joey Hess	710c2cdbdc	implement rest of missing methods for import from S3	2019-04-23 13:09:27 -04:00
Joey Hess	2f79cb4b45	versioned import from S3 is working Still some bugs and two stubbed methods to implement though.	2019-04-19 15:13:49 -04:00
Joey Hess	9dc7a10448	Drop support for building with aws older than 0.14. debian stable has 0.14 so lose the complexity for old versions	2019-04-19 14:27:59 -04:00
Joey Hess	55a5d9679a	implemented mkImportableContentsVersioned	2019-04-19 13:39:33 -04:00
Joey Hess	bf6c7ea6b6	starting work on import from S3 Not in a usable state yet.	2019-04-18 15:20:09 -04:00
Joey Hess	06cbaa4233	fix back-compat with old git-annex Unfortunately, "port" has to be set by default, or the old git-annex will crash when trying to enable the S3 remote. So, when protocol=https is specified, it needs to override port=80, since it may be a default setting.	2019-03-22 12:27:41 -04:00
Joey Hess	2a99d7ffc0	improve error message	2019-03-22 12:23:59 -04:00
Joey Hess	7d37011a11	S3: Added protocol= initremote setting, to allow https to be used on a non-standard port protocol=https implies port=443 and port=443 implies protocol=https -- this was necessary because the existing configs set port=443, but with a protocol setting, users will naturally want to use it, and then there's no need for them to supply the default https port. So we keep back-compat, add a nicer way to enable https, and also add support for non-standard https ports.	2019-03-22 12:17:05 -04:00
Joey Hess	40ecf58d4b	update licenses from GPL to AGPL This does not change the overall license of the git-annex program, which was already AGPL due to a number of sources files being AGPL already. Legally speaking, I'm adding a new license under which these files are now available; I already released their current contents under the GPL license. Now they're dual licensed GPL and AGPL. However, I intend for all my future changes to these files to only be released under the AGPL license, and I won't be tracking the dual licensing status, so I'm simply changing the license statement to say it's AGPL. (In some cases, others wrote parts of the code of a file and released it under the GPL; but in all cases I have contributed a significant portion of the code in each file and it's that code that is getting the AGPL license; the GPL license of other contributors allows combining with AGPL code.)	2019-03-13 15:48:14 -04:00
Joey Hess	2912429640	better indicate when special remotes do not support renameExport Avoid a warning message when renameExport is not supported, and just fallback to deleting with a subsequent re-upload. Especially needed for importtree remotes, where renameExport needs to be disabled. This changes the external special remote protocol, but in a backwards-compatible way. A reply of UNSUPPORTED-REQUEST to an older version of git-annex will cause it to make renameExport return False.	2019-03-11 12:53:24 -04:00
Joey Hess	e412129523	concurrency and status messages when downloading from import	2019-03-08 12:33:44 -04:00
Joey Hess	760f26ebc6	Merge branch 'master' into importtree	2019-02-26 11:36:36 -04:00
Joey Hess	19f833b0b1	aws-0.21.1 * S3: Support enabling bucket versioning when built with aws-0.21.1. * stack.yaml: Build with aws-0.21.1	2019-02-24 12:45:09 -04:00
Joey Hess	ccc0684d21	no remotes support import yet	2019-02-20 16:59:04 -04:00
Joey Hess	ab689cf0cd	Improved speed of S3 remote by only loading S3 creds once This gets back any speed lost in commit `9cebfd7002`, and speeds up all uses of S3 remotes that operate on them more than once. This commit was sponsored by Brett Eisenberg on Patreon.	2019-01-30 16:20:14 -04:00
Joey Hess	8eb66a5c40	avoid potentially unsafe use of runResourceT Pushed the ResourceT out into larger code blocks, and made sure that the http result from a sendS3Handle is processed inside the same ResourceT block. I don't think this fixes any bugs, but it allows getting rid of a scary comment. This commit was sponsored by Eric Drechsel on Patreon.	2019-01-30 15:40:13 -04:00
Joey Hess	9cebfd7002	purify exportActions Purifying exportActions will allow introspecting and modifying it, which is needed to add progress bar display to it. Only S3 and WebDAV ran an Annex action while constructing ExportActions. There was a small performance gain from them doing that, since a resource was able to be prepared and reused for multiple actions by Command.Export. As seen in commit `809cfbbd8a` and `5d394023eb` S3 and WebDAV actually create a new handle for each access in normal, non-export use. It doesn't seem worth making export use of them marginally more efficient than normal use. It would be better to do that work upfront when constructing the remote. Or perhaps use a MVar to cache a handle. This commit was sponsored by Nick Piper on Patreon.	2019-01-30 15:11:40 -04:00
Joey Hess	809cfbbd8a	prepareS3Handle didn't give any benefits, so remove I seem to have thought that a Preparer was only run once when a remote is accessed multiple times, but that is not in fact the case. prepareS3Handle is run once per access. So, there is no point to it. That there is some duplicate work done on each access is now apparent. Luckily, the http manager is reused, so only one http connection is made. But the S3 creds are loaded repeatedly. Room for improvement here. This commit was sponsored by Jack Hill on Patreon.	2019-01-30 14:23:39 -04:00
Joey Hess	8fc6c11cf1	couple fixes	2019-01-29 15:20:22 -04:00
Joey Hess	a8f1add4d1	S3: Detect when version=yes but an exported file lacks versioning, and refuse to delete it, to avoid data loss. This commit was sponsored by Denis Dzyubenko on Patreon.	2019-01-29 15:07:27 -04:00
Joey Hess	bb9817ceae	enableremote S3: Do not let versioning=yes be set on existing remote Because when git-annex lacks S3 version IDs for files stored in the bucket, deleting them would cause data loss. Also because git-annex is not able to download unversioned objects from a bucket when versioning=yes. This also prevents setting versioning=no. While that would perhaps be possible to do safely, it would add complexity, and would mean that if the user accidentally did enableremote versioning=no, they would not be able to undo it. This commit was sponsored by Trenton Cronholm on Patreon.	2019-01-29 14:09:50 -04:00
Joey Hess	ee011b3cbb	initremote S3: Automatically enable versioning in S3 buckets when configured with versioning=yes. Needs not yet released version 0.22 of aws library; with older versions asks the user to configure the bucket versioning themselves. Note that S3 endpoints that don't support versioning will cause putBucketVersioning to throw an exception, so initremote will fail. This commit was sponsored by Jake Vosloo on Patreon.	2019-01-29 13:46:04 -04:00
Joey Hess	669b305de2	S3: Send a Content-Type header when storing objects in S3 So exports to public buckets can be linked to from web pages. (When git-annex is built with MagicMime support.) Thanks to Jared Cosulich for the idea.	2019-01-23 13:08:47 -04:00
Joey Hess	d3ab5e626b	rename key2file and file2key What these generate is not really suitable to be used as a filename, which is why keyFile and fileKey further escape it. These are just serializing Keys. Also removed a quickcheck test that was very unlikely to test anything useful, since it relied on random chance creating something that looks like a serialized key. The other test is sufficient for testing what that was intended to test anyway.	2019-01-14 13:03:35 -04:00
Joey Hess	cb375977a6	follow-on changes from MetaData type changes Including writing and parsing the metadata log files with bytestring-builder and attoparsec.	2019-01-07 15:51:05 -04:00
Joey Hess	7d51b0c109	import Utility.FileSystemEncoding in Common	2019-01-03 11:37:02 -04:00
Joey Hess	2e069eb9f6	use putBucket to future-proof New fields can be added to PutBucket in the future.	2018-12-31 13:09:20 -04:00
Joey Hess	4579dd6201	S3: Improve diagnostics when a remote is configured with exporttree and versioning, but no S3 version id has been recorded for a key. When public access is used for the remote, it complained that the user needed to set creds to use it, which was just wrong. When creds were being used, it fell back from trying to use the version ID to just accessing the key in the bucket, which was ok for non-export remotes, but wrong for buckets. In both cases, display a hopefully useful warning. This should only come up when an existing S3 remote has been exported to, and then later versioning was enabled. Note that it would perhaps be possible to fall back from trying to use retrieveKeyFile when it fails and instead use retrieveKeyFileFromExport, which may work when S3 version ID is missing. But there are problems with that approach; how to tell when retrieveKeyFile has failed due to this rather than a network problem etc? Anyway, that approach would only work until the file in the export got overwritten, and then it would no longer be accessible. And with versioning enabled, the user wants old versions of objects to remain accessible, so it seems better to warn about the problem as soon as possible, so they can go back and add S3 version IDs. This work is supported by the NIH-funded NICEMAN (ReproNim TR&D3) project.	2018-12-06 13:44:37 -04:00
Joey Hess	a9dd087074	centralized "yes"/"no" parsing This commit was sponsored by Jack Hill on Patreon.	2018-10-10 11:14:27 -04:00
Joey Hess	451171b7c1	clean up url removal presence update * rmurl: Fix a case where removing the last url left git-annex thinking content was still present in the web special remote. * SETURLPRESENT, SETURIPRESENT, SETURLMISSING, and SETURIMISSING used to update the presence information of the external special remote that called them; this was not documented behavior and is no longer done. Done by making setUrlPresent and setUrlMissing only update presence info for the web, and only when the url is a web url. See the comment for reasoning about why that's the right thing to do. In AddUrl, had to make it update location tracking, to handle the non-web-url case. This commit was sponsored by Ewen McNeill on Patreon.	2018-10-04 17:35:49 -04:00
Joey Hess	773084c49b	S3: Fix url construction bug When the publicurl has been set to an url that does not end with a slash, we need to add one in between it and the rest of the url. As far as I can see, git-annex does not default to such publicurls; it's careful to end them with slashes. But this was observed in the wild, and there may be documentation that doesn't include the slash. And it's an easy mistake to make in any case. This commit was sponsored by Eric Drechsel on Patreon.	2018-09-14 12:25:23 -04:00
Joey Hess	677038199c	fix build with older aws S3: Multipart uploads are now only supported when git-annex is built with aws-0.16.0 or later, as earlier versions of the library don't support versioning with multipart uploads. This will affect the android build, and debian stable also has a too old aws to support both features at the same time. This commit was sponsored by Nick Piper on Patreon.	2018-09-13 09:58:39 -04:00
Joey Hess	445ea66732	simplify	2018-09-06 16:07:16 -04:00
Joey Hess	b7daf2685f	support public versioned S3 access Makes git annex whereis display the versionId urls. And, when a s3 remote is enabled without creds, git-annex will use the versionId urls to access its contents. This commit was sponsored by Fernando Jimenez on Patreon.	2018-09-06 14:31:41 -04:00
Joey Hess	7407a80c27	S3: Support AWS_SESSION_TOKEN This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2018-09-05 15:53:57 -04:00
Joey Hess	53d839d543	more efficient encoding	2018-08-31 13:49:08 -04:00
Joey Hess	b3d42283ad	use per-remote metadata storage for S3 version ID Since the same key can be stored in a versioned S3 bucket multiple times with different version IDs, this allows tracking them all. Not currently needed, but if we ever want to drop from a versioned S3 bucket, we'll need to know them all. This commit was supported by the NSF-funded DataLad project.	2018-08-31 13:27:29 -04:00
Joey Hess	6b75b9c448	turn on appendonly when versioning is enabled	2018-08-31 10:53:07 -04:00
Joey Hess	19dcff2b71	use S3 version ID for retrieval Have to store the S3 object along with the version ID, so retrieval can use the same object. This commit was supported by the NSF-funded DataLad project.	2018-08-30 15:37:08 -04:00
Joey Hess	794e9a7a44	store S3 version IDs Only done when versioning=yes is configured. It could always do it when S3 sends back a version id, but there may be buckets that have versioning enabled by accident, so it seemed better to honor the configuration. S3's docs say version IDs are "randomly generated", so presumably storing the same content twice gets two different ones not the same one. So I considered storing a list of version IDs for a key. That would allow removing the key completely. But.. The way Logs.RemoteState works, when there are multiple writers, the last writer wins. So storing a list would need a different log format that merges, which seemed overkill to support removing a key from an append-only remote. Note that Logs.RemoteState for S3 is now dedicated to version IDs. If something else needs to be stored, a new log will be needed to do it. This commit was supported by the NSF-funded DataLad project.	2018-08-30 14:30:56 -04:00
Joey Hess	0ff5a41311	S3 versioning=yes config Not yet used. This commit was supported by the NSF-funded DataLad project.	2018-08-30 13:45:28 -04:00
Joey Hess	02630b39ee	add Remote.readonly Does nothing yet. Considered making bup readonly, but while the content can't be removed, it is able to delete a branch, so didn't. This commit was supported by the NSF-funded DataLad project.	2018-08-30 11:12:18 -04:00
Joey Hess	2884637cab	S3: Support credential-less download from remotes configured with public=yes exporttree=yes. This commit was supported by the NSF-funded DataLad project.	2018-07-31 16:32:43 -04:00
Joey Hess	4315bb9e42	add retrievalSecurityPolicy This will be used to protect against CVE-2018-10859, where an encrypted special remote is fed the wrong encrypted data, and so tricked into decrypting something that the user encrypted with their gpg key and did not store in git-annex. It also protects against CVE-2018-10857, where a remote follows a http redirect to a file:// url or to a local private web server. While that's already been prevented in git-annex's own use of http, external special remotes, hooks, etc use other http implementations and could still be vulnerable. The policy is not yet enforced, this commit only adds the appropriate metadata to remotes. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2018-06-21 11:36:36 -04:00
Joey Hess	67e46229a5	change Remote.repo to Remote.getRepo This is groundwork for letting a repo be instantiated the first time it's actually used, instead of at startup. The only behavior change is that some old special cases for xmpp remotes were removed. Where before git-annex silently did nothing with those no-longer supported remotes, it may now fail in some way. The additional IO action should have no performance impact as long as it's simply return. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon	2018-06-04 15:30:26 -04:00
Joey Hess	197b1510fa	remove unused import	2018-04-09 13:09:40 -04:00
Joey Hess	0f6775f1ff	refactor sinkResponseFile and add downloadC Remote.S3 and Remote.Helper.Http both had similar code to sink a http-conduit Response to a file; refactor out sinkResponseFile. downloadC downloads an url to a file using http-conduit, and supports resuming. Falls back to curl to handle urls that http-conduit does not support. This is not used yet, but the goal is to replace download with it. git-annex.cabal: conduit-extra was not actually used for a long time, remove the dep. conduit moves into the main dependency list, but since http-conduit was already in there, and it depends on conduit, that's not really adding a new build dep. This commit was supported by the NSF-funded DataLad project.	2018-04-06 16:07:08 -04:00
Joey Hess	9b98d3f630	better HTTP connection reuse Enable HTTP connection reuse across multiple files, when git-annex uses http-conduit. Before, a new Manager was created each time Utility.Url used it. Now, a single Manager gets created the first time, so connections are reused. Doesn't help when external programs are used for url download, but does speed up addurl --fast, fsck --from web, etc. Testing fsck --fast --from web with 3 files, over high-latency satellite internet, it sped up from 19.37s to 14.96s. This commit was supported by the NSF-funded DataLad project.	2018-04-04 15:39:40 -04:00
Joey Hess	2927618d35	Added adb special remote which allows exporting files to Android devices. git annex testremote passes. exporttree not implemented yet, although the documentation talks about it, since it will be the main way this remote will be used. The adb push/pull progress is displayed for now; it would be better to consume it and use it to update the git-annex progress bar. This commit was sponsored by andrea rota.	2018-03-27 14:54:41 -04:00
Joey Hess	a01b0680e3	fix version number	2017-10-11 11:43:03 -04:00
Joey Hess	6679705116	typo	2017-10-11 11:24:51 -04:00
Joey Hess	61dccecad7	Fix build with aws-0.17. This commit was sponsored by Denis Dzyubenko on Patreon.	2017-10-11 10:57:20 -04:00
Joey Hess	2e69efea8d	git annex sync --content to exports Assistant still todo. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon	2017-09-19 14:20:47 -04:00
Joey Hess	b03d77c211	add ExportTree table to export db New table needed to look up what filenames are used in the currently exported tree, for reasons explained in export.mdwn. Also, added smart constructors for ExportLocation and ExportDirectory to make sure they contain filepaths with the right direction slashes. And some code refactoring. This commit was sponsored by Francois Marier on Patreon.	2017-09-18 13:59:59 -04:00
Joey Hess	e1f5c90c92	split out Types.Export	2017-09-15 16:46:03 -04:00
Joey Hess	9f4ffe65e9	implement removeExportDirectory Not yet called by Command.Export. WebDAV needs this to clean up empty collections. Also, example.sh turned out to not be cleaning up directories when removing content from them, so it made sense for it to use this. Remote.Directory did not need it, and since its cleanup method for empty directories is more efficient than what Command.Export will need to do to find empty directories, it uses Nothing so that extra work can be avoided. This commit was sponsored by Thom May on Patreon.	2017-09-15 13:18:21 -04:00
Joey Hess	9c3622882b	export: cache connections for S3 and webdav	2017-09-12 16:59:04 -04:00
Joey Hess	7ef9b7ef46	update copyright year	2017-09-12 13:53:03 -04:00
Joey Hess	088d819cd8	propagate exception in checkPresentExportS3 checkPresentExport is supposed to throw exceptions	2017-09-12 13:46:33 -04:00
Joey Hess	1332e6cec0	stop warning about removals from IA In a test, I uploaded a pdf, and several files were derived from it. After removing the pdf, the derived files went away after approximately half an hour. This window does not seem worth warning about every time. Documented it in the tip.	2017-09-12 12:47:43 -04:00
Joey Hess	da23dec7d3	avoid showing error when copy fails Since renameExport is allowed to fail for any reason, and its failure is always recovered from by doing a new upload and deleting the old content, this avoids unnecessary noise. Copying a file on the IA failed, apparently something wrong with their emulation of S3: S3Error {s3StatusCode = Status {statusCode = 400, statusMessage = "Bad Request"}, s3ErrorCode = "InvalidArgument", s3ErrorMessage = "Invalid Argument", s3ErrorResource = Just "x-(amz\|archive)-copy-source header is bad: 'joeyh-public-test2/foo'", s3ErrorHostId = Nothing, s3ErrorAccessKeyId = Nothing, s3ErrorStringToSign = Nothing, s3ErrorBucket = Nothing, s3ErrorEndpointRaw = Nothing, s3ErrorEndpoint = Nothing} This commit was sponsored by Jake Vosloo on Patreon.	2017-09-12 12:42:44 -04:00
Joey Hess	267f47c473	S3: Allow removing files from IA, but warn about derived versions potentially still existing there. Removal works, only derives are a potential issue, so allow removing with a warning. This way, unexporting a file works, and behavior is consistent with IA remotes whether or not exporttree=yes. Also tested exporting filenames containing unicode, spaces, underscores. All worked, despite the IA's faq saying it doesn't. This commit was sponsored by Trenton Cronholm on Patreon.	2017-09-12 12:35:58 -04:00
Joey Hess	afdff226fb	don't show key urls in whereis for S3 with public=yes and exporttree=yes	2017-09-08 16:44:00 -04:00
Joey Hess	650d0955a0	S3 export finalization Fixed ACL issue, and updated some documentation.	2017-09-08 16:28:28 -04:00
Joey Hess	44cd5ae313	S3 export (untested) It opens a http connection per file exported, but then so does git annex copy --to s3. Decided not to munge exported filenames for IA. Too large a chance of the munging having confusing results. Instead, export of files not supported by IA, eg with spaces in their name, will fail. This commit was supported by the NSF-funded DataLad project.	2017-09-08 15:46:24 -04:00
Joey Hess	16eb2f976c	prevent exporttree=yes on remotes that don't support exports Don't allow "exporttree=yes" to be set when the special remote does not support exports. That would be confusing since the user would set up a special remote for exports, but `git annex export` to it would later fail. This commit was supported by the NSF-funded DataLad project.	2017-09-07 13:48:44 -04:00
Joey Hess	28e2cad849	implement exporttree=yes configuration * Only export to remotes that were initialized to support it. * Prevent storing key/value on export remotes. * Prevent enabling exporttree=yes and encryption in the same remote. SetupStage Enable was changed to take the old RemoteConfig. This allowed only setting exporttree when initially setting up a remote, and not configuring it later after stuff might already be stored in the remote. Went with =yes rather than =true for consistency with other parts of git-annex. Changed docs accordingly. This commit was supported by the NSF-funded DataLad project.	2017-09-04 13:09:38 -04:00
Joey Hess	a4328b49d2	refactor ExportActions This will allow disabling exports for remotes that are not configured to allow them. Also, exportSupported will be useful for the external special remote to probe. This commit was supported by the NSF-funded DataLad project	2017-09-01 13:05:09 -04:00
Joey Hess	e55e445a36	add API for exporting Implemented so far for the directory special remote. Several remotes don't make sense to export to. Regular Git remotes, obviously, do not. Bup remotes almost certainly do not, since bup would need to be used to extract the export; same story for Ddar. Web and Bittorrent are download-only. GCrypt is always encrypted so exporting to it would be pointless. There's probably no point complicating the Hook remotes with exporting at this point. External, S3, Glacier, WebDAV, Rsync, and possibly Tahoe should be modified to support export. Thought about trying to reuse the storeKey/retrieveKeyFile/removeKey interface, rather than adding a new interface. But, it seemed better to keep it separate, to avoid a complicated interface that sometimes encrypts/chunks key/value storage and sometimes uses non-key/value storage. Any common parts can be factored out. Note that storeExport is not atomic. doc/design/exporting_trees_to_special_remotes.mdwn has some things in the "resuming exports" section that bear on this decision. Basically, I don't think, at this time, that an atomic storeExport would help with resuming, because exports are not key/value storage, and we can't be sure that a partially uploaded file is the same content we're currently trying to export. Also, note that ExportLocation will always use unix path separators. This is important, because users may export from a mix of windows and unix, and it avoids complicating the API with path conversions, and ensures that in such a mix, they always use the same locations for exports. This commit was sponsored by Bruno BEAUFILS on Patreon.	2017-08-29 13:00:41 -04:00

318 commits