git-annex

Author	SHA1	Message	Date
Joey Hess	5f5170b22b	remove SafeFilePath Move sanitizeFilePath call to where fromSafeFilePath had been.	2020-05-11 14:04:56 -04:00
Joey Hess	69e2e4763e	only check --force at init time, not enable time git-lfs repos that encrypt the annexed content but not the git repo only need --force passed to initremote, allow enableremote and autoenable of such remotes without forcing again. Needing --force again particularly made autoenable of such a repo not work. And once such a repo has been set up, it seems a second --force when enabling it elsewhere has little added value. It does tell the user about the possibly insecure configuration, but if the git repo has already been pushed to that remote in the clear, data has already been exposed. The goal of that --force was not to prevent every situation where such an exposure can happen -- anyone who sets up a public git repo and pushes to it will expose things similarly and git-annex is not involved. Instead, the purpose of the --force is to point out to the user that they're asking for a configuration where encryption is inconsistently applied.	2020-05-07 15:59:29 -04:00
Joey Hess	1532d67c3e	S3: Support signature=v4 To use S3 Signature Version 4. Some S3 services seem to require v4, while others may only support v2, which remains the default. I'm also not sure if v4 works correctly in all cases, there is this upstream bug report: https://github.com/aristidb/aws/issues/262 I've only tested it against the default S3 endpoint.	2020-05-07 13:18:11 -04:00
Joey Hess	f9ed30de3b	avoid beware of the leopard situation * Display a warning message when a remote uses a protocol, such as git://, that git-annex does not support. Silently skipping such a remote was confusing behavior. It sets annex-ignore, so the warning is only displayed once. * Also display a warning message when a remote, without a known uuid, is located in a directory that does not currently exist, to avoid silently skipping such a remote. This is a bit more debatable, since git-annex get will say, try making repository available. And since it does not set annex-ignore, the warning will be displayed repeatedly. It's also an extreme edge case, I don't think I've ever seen it happen in real life.	2020-05-04 13:01:11 -04:00
Joey Hess	2aeb79249b	external: stop storing readonly=true in remote.log readonly=true is used to make an external special remote that does not need the external program to be installed. It was stored in the remote.log by default, and so every time it was specified in an enableremote or initremote, whatever value was used became the new default for subsequent enableremotes of that remote. That was surprising, and I consider it to be a bug. It does not make much sense to pass it to initremote because then how would you populate that remote with anything? You would have to enableremote elsewhere, and store content there. I'm assuming nobody used it that way. Someone might rely on passing it to enableremote once, and then that being inherited in other clones. But that is not how it's documented to be used. It is barely documented in git-annex at all, only in the external special remote protocol, and the documentation there says to "Document that this external special remote can be used in readonly mode." (by the user of it passing readonly=true to enableremote). The one external special remote that I know of that does document that is <https://github.com/bgilbert/gcsannex> (the one that motivated adding it). That one's docs do say to pass it to enableremote. So, it seemed safe to make this behavior change. If someone was in fact relying on one of those behaviors, all their current repos will still work as they configured them (although they will need to deal with the related change in `9f3c2dfeda`). In new clones, they will find enableremote fails, complaining the external program is not in path. An easy enough problem to recover from.	2020-04-23 15:21:26 -04:00
Joey Hess	9f3c2dfeda	stop using remote.name.annex-readonly for two distinct things	2020-04-23 14:56:03 -04:00
Joey Hess	cd1676d604	fix bug involving local git remote and out of date location log get --from, move --from: When used with a local git remote, these used to silently skip files that the location log thought were present on the remote, when the remote actually no longer contained them. Since that behavior could be surprising, now instead display a warning. I got very confused when I encountered this behavior, since it was silently skipping a file I needed that whereis said was on the remote. get without --from already displayed an "unable to access these remotes" message, which while a bit misleading in that the remote is likely accessible, but just doesn't contain the file, at least indicated something went wrong. Having get --from display a warning makes it in line with get w/o --from, so seems certainly ok. It might be there are situations where move --from is used, on eg a whole directory, and the user only wants to move whatever is present in the remote, and is perfectly ok with files that are not present being skipped. So I'm less sure about the new warning being ok there. OTOH, only local git remotes avoided displaying a warning in that case, so this just brings them into line with other remotes. (Also note that this makes it a little bit faster when dealing with a lot of files, since it avoids a redundant stat of the file.)	2020-04-21 12:36:58 -04:00
Joey Hess	529f488ec4	fix a thundering herd problem Avoid repeatedly opening keys db when accessing a local git remote and -J is used. What was happening was that Remote.Git.onLocal created a new annex state as each thread started up. The way the MVar was used did not prevent that. And that, in turn, led to repeated opening of the keys db, as well as probably other extra work or resource use. Also managed to get rid of Annex.remoteannexstate, and it turned out there was an unnecessary Maybe in the keysdbhandle, since the handle starts out closed.	2020-04-17 17:09:29 -04:00
Joey Hess	f85ca7dc80	fix all remaining -Wincomplete-uni-patterns warnings A couple of these were probably actual bugs in edge cases. Most of the changes I'm fine with. The fact that aeson's object returns something that we know will be an Object, but the type checker does not know is kind of annoying.	2020-04-15 13:55:08 -04:00
Joey Hess	ca9c6c5f60	Fix a potential failure to parse git config Git has an obnoxious special case in git config, a line "foo" is the same as "foo = true". That means there is no way to examine the output of git config and tell if it was run with --null or not, since a "foo" in the first line could be such a boolean, or could be followed by its value on the next line if --null were used. So, rather than trying to do such a detection, track the style of config at all the points where it's generated.	2020-04-13 13:05:41 -04:00
Joey Hess	7ebc118776	adb: Better messages when the adb command is not installed After a user completely ignored the display of the exception probably because it didn't make sense.. This does make it a little bit slower since it checks adb is in path each time before running it. Also, it might display a lot of warnings about it not being installed. This commit was sponsored by Ilya Shlyakhter on Patreon.	2020-04-02 10:48:28 -04:00
Joey Hess	4b92bbe8d7	webdav: Made exporttree remotes faster by caching connection to the server Followed example of Remote.S3.	2020-03-20 12:48:43 -04:00
Joey Hess	a9d56a1abd	fix builds build	2020-03-10 13:50:46 -04:00
Joey Hess	6a91471923	GETCONFIG name fix Fix regression that prevented external special remotes from using GETCONFIG to query values like "name". (Introduced in version 7.20200202.7.)	2020-03-09 12:38:04 -04:00
Joey Hess	7f992ef59c	mostly finished with createDirectoryUnder conversion Remaining things needing converted are in the assistant, and Annex.Ssh. Every other remaining call to createDirectoryIfMissing True has been audited and is not relevant. The ones in Build/ of course don't get included in the program. Others included eg, Remote.Tahoe and Config.Files which both write to dotfiles under the home directory.	2020-03-06 11:57:15 -04:00
Joey Hess	6d58ca94d6	some easy createDirectoryUnder conversions	2020-03-05 15:20:10 -04:00
Joey Hess	ccd8c43dc8	git-annex config: guard against non-repo-global configs git-annex config: Only allow setting configs that git-annex actually supports reading from repo-global config, to avoid confused users trying to set other configs with this.	2020-03-02 15:54:18 -04:00
Joey Hess	2366e7fb84	catch whereisKey exception and provide error messages when external programs neglect to * whereis: If a remote fails to report on urls where a key is located, display a warning, rather than giving up and not displaying any information. * When external special remotes fail but neglect to provide an error message, say what request failed, which is better than displaying an empty error message to the user.	2020-02-27 14:09:18 -04:00
Joey Hess	81e3faf810	Merge branch 'v7'	2020-02-26 18:15:18 -04:00
Joey Hess	e535da621c	Bugfix to getting content from an export remote with -J, when the export database was not yet populated. (cherry picked from commit `e520341500`)	2020-02-26 18:07:20 -04:00
Joey Hess	8af6d2c3c5	fix encryption of content to gcrypt and git-lfs Fix serious regression in gcrypt and encrypted git-lfs remotes. Since version 7.20200202.7, git-annex incorrectly stored content on those remotes without encrypting it. Problem was, Remote.Git enumerates all git remotes, including git-lfs and gcrypt. It then dispatches to those. So, Remote.List used the RemoteConfigParser from Remote.Git, instead of from git-lfs or gcrypt, and that parser does not know about encryption fields, so did not include them in the ParsedRemoteConfig. (Also didn't include other fields specific to those remotes, perhaps chunking etc also didn't get through.) To fix, had to move RemoteConfig parsing down into the generate methods of each remote, rather than doing it in Remote.List. And a consequence of that was that ParsedRemoteConfig had to change to include the RemoteConfig that got parsed, so that testremote can generate a new remote based on an existing remote. (I would have rather fixed this just inside Remote.Git, but that was not practical, at least not w/o re-doing work that Remote.List already did. Big ugly mostly mechanical patch seemed preferable to making git-annex slower.)	2020-02-26 18:05:36 -04:00
Joey Hess	9050788b66	info: Fix display of the encryption value. (Some debugging junk had crept in.)	2020-02-26 15:02:23 -04:00
Joey Hess	e520341500	Bugfix to getting content from an export remote with -J, when the export database was not yet populated.	2020-02-26 14:57:29 -04:00
Joey Hess	67476fbc54	minor code simplification	2020-02-25 13:06:09 -04:00
Joey Hess	79a0435b77	automate remote.name.skipFetchAll initremote, enableremote: Set remote.name.skipFetchAll when the remote cannot be fetched from by git, so git fetch --all will not try to use it.	2020-02-19 13:58:26 -04:00
Joey Hess	69f2d1dd43	remoteConfig rework remoteAnnexConfig will avoid bugs like `a3a674d15b` Use now more generic remoteConfig in a couple places that built non-annex config settings manually before.	2020-02-19 13:45:11 -04:00
Joey Hess	399319ccbc	Avoid throwing fatal errors when asked to write to a readonly git remote on http Test suite found one of them, looking for giveup turned up several more.	2020-02-14 14:38:13 -04:00
Joey Hess	1883f7ef8f	support git remotes that need http basic auth using git credential to get the password One thing this doesn't do is wrap the password prompting inside the prompt action. So with -J, the output can be a bit garbled.	2020-01-22 16:16:19 -04:00
Joey Hess	75059c9f3b	better error message when git config fails to parse remote config Rather than leaking the name of the temp file, just say the config parse failed, and where the config was downloaded from. Not closing the bug report because two issues were reported in the same bug report, because the universe wants me to continually re-read old unclosed bug reports to waste my time determining what still needs to be done.	2020-01-22 13:35:54 -04:00
Joey Hess	d227093002	avoid ugly error message Http remotes that do expose a git config file, but are not initialized resulted in an ugly and unnecessary error message, now squelched. When git-annex-shell configlist is run w/o the autoinit field, it may not generate a uuid for the repository. So in that case, it's not unexpected for the config it does list to not include a UUID, and dumping out the config in a warning message is not needed. If configlist is asked to autoinit and we don't get back a config with a UUID in it, that suggests some problem, and what we got back may not be a config at all but some diagnostic message, so it does make sense to output it then.	2020-01-22 11:57:20 -04:00
Joey Hess	830c30001b	fix --describe-other-params of external when encryption is not specified Encryption not being specified makes lenientRemoteConfigParser fail to parse, and so it was not able to start the external up to get LISTCONFIGS.	2020-01-20 16:56:34 -04:00
Joey Hess	2be4122bfc	include passthrough params in --describe-other-params	2020-01-20 16:53:27 -04:00
Joey Hess	7038acf96c	add descriptions for all remote config fields not yet used	2020-01-20 15:20:04 -04:00
Joey Hess	201049cf93	gcrypt inherits shellescape setting from rsync, allow it	2020-01-20 15:13:49 -04:00
Joey Hess	923230ea30	convert RemoteConfigFieldParser to data type	2020-01-20 13:49:30 -04:00
Joey Hess	8406ff8861	speed hack Avoids the external program being started just to use LISTCONFIGS on an already accepted config. So initremote/enableremote will still run the external program an extra time to use LISTCONFIGS, but everything that uses the special remote after it's initialized will not any longer.	2020-01-17 17:26:36 -04:00
Joey Hess	5c58f86790	always add the specialRemoteConfigParsers Was not being added in some places, resulting in error messages about encryption not being a valid field.	2020-01-17 17:13:44 -04:00
Joey Hess	99cb3e75f1	add LISTCONFIGS to external special remote protocol Special remote programs that use GETCONFIG/SETCONFIG are recommended to implement it. The description is not yet used, but will be useful later when adding a way to make initremote list all accepted configs. configParser now takes a RemoteConfig parameter. Normally, that's not needed, because configParser returns a parser, it does not parse it itself. But, it's needed to look at externaltype and work out what external remote program to run for LISTCONFIGS. Note that, while externalUUID is changed to a Maybe UUID, checkExportSupported used to use NoUUID. The code that now checks for Nothing used to behave in some undefined way if the external program made requests that triggered it. Also, note that in externalSetup, once it generates external, it parses the RemoteConfig strictly. That generates a ParsedRemoteConfig, which is thrown away. The reason it's ok to throw that away, is that, if the strict parse succeeded, the result must be the same as the earlier, lenient parse. initremote of an external special remote now runs the program three times. First for LISTCONFIGS, then EXPORTSUPPORTED, and again LISTCONFIGS+INITREMOTE. It would not be hard to eliminate at least one of those, and it should be possible to only run the program once.	2020-01-17 16:07:17 -04:00
Joey Hess	1ce722d86f	avoid relying on crazy monoid instance This code worked as intended, but only by accident, because of this instance: instance Monoid b => Monoid (x -> b) where mempty = const (mempty :: b) Let's be explicit that we throw away the error message.	2020-01-17 13:49:12 -04:00
Joey Hess	e78bf29725	avoid getting config parser when there is no config to parse The benefit here is that external special remotes will need a LISTCONFIGS request and response to generate their config parser, and this avoids it being done for all the ones that don't have any configs. Note that, a config parser could in theory fail to parse if there are no configs (none currently do), but a parse failure is already thrown away when generating the remote list because it's too late. Such problems have to be caught at initremote/enableremote time, not here.	2020-01-17 13:32:48 -04:00
Joey Hess	465ec9dcd7	ported Remote.External Not yet added anything to the protocol to get a list of remote config fields; any fields will be accepted and are available for the external remote to use as before. There is one minor behavior change.. Before, GETCONFIG could be passed a field such as type, externaltype, encryption, etc, and would get the value of that. Now, GETCONFIG only works on fields that don't have a defined meaning to git-annex, so are passed through to the external remote. This seems unlikely to affect any external special remotes in practice.	2020-01-15 13:01:22 -04:00
Joey Hess	6a982e38eb	a few more field functions	2020-01-15 12:57:56 -04:00
Joey Hess	907ca937ab	use more field functions Using field functions consistently avoids possibility of typos and also helps ensure that all fields are added to RemoteConfigParsers (as long as I have remembered to add them when writing the functions).	2020-01-15 11:15:07 -04:00
Joey Hess	7f2bfd41d7	include credPairRemoteFields in RemoteConfigParsers Avoids parse error when the fields are added to RemoteConfig at setup time and it then gets parsed, also at setup time. After setup time, such internally added fields are not a problem, because they're Accepted. So it may not be necessary in all cases to list such internally added fields, but I think it's a good idea to always do so.	2020-01-15 10:57:45 -04:00
Joey Hess	0706d9d093	finish porting S3	2020-01-15 10:52:28 -04:00
Joey Hess	c4ea3ca40a	ported almost all remotes, until my brain melted external is not started yet, and S3 is part way through and not compiling yet	2020-01-14 15:41:34 -04:00
Joey Hess	c498269a88	convert configParser to Annex action and add passthrough option Needed so Remote.External can query the external program for its configs. When the external program does not support the query, the passthrough option will make all input fields be available.	2020-01-14 13:52:03 -04:00
Joey Hess	8f142a9279	fix wrong type Use of Typeable means the type checker can't catch this kind of mistake, the error is deferred to runtime. testremote now passes on a directory special remote	2020-01-14 13:05:38 -04:00
Joey Hess	963239da5c	separate RemoteConfig parsing basically working Many special remotes are not updated yet and are commented out.	2020-01-14 12:35:08 -04:00
Joey Hess	71f78fe45d	wip separate RemoteConfig parsing Remote now contains a ParsedRemoteConfig. The parsing happens when the Remote is constructed, rather than when individual configs are used. This is more efficient, and it lets initremote/enableremote reject configs that have unknown fields or unparsable values. It also allows for improved type safety, as shown in Remote.Helper.Encryptable where things that used to match on string configs now match on data types. This is a work in progress, it does not build yet. The main risk in this conversion is forgetting to add a field to RemoteConfigParser. That will prevent using that field with initremote/enableremote, and will prevent remotes that already are set up from seeing that configuration. So will need to check carefully that every field that getRemoteConfigValue is called on has been added to RemoteConfigParser. (One such case I need to remember is that credPairRemoteField needs to be included in the RemoteConfigParser.)	2020-01-13 12:39:21 -04:00
Joey Hess	71ecfbfccf	be stricter about rejecting invalid configurations for remotes This is a first step toward that goal, using the ProposedAccepted type in RemoteConfig lets initremote/enableremote reject bad parameters that were passed in a remote's configuration, while avoiding enableremote rejecting bad parameters that have already been stored in remote.log This does not eliminate every place where a remote config is parsed and a default value is used if the parse fails. But, I did fix several things that expected foo=yes/no and so confusingly accepted foo=true but treated it like foo=no. There are still some fields that are parsed with yesNo but not checked when initializing a remote, and there are other fields that are parsed in other ways and not checked when initializing a remote. This also lays groundwork for rejecting unknown/typoed config keys.	2020-01-10 14:52:48 -04:00
Joey Hess	2000e9a4b8	avoid build warning on windows	2020-01-01 14:40:35 -04:00
Joey Hess	fb04cfd0e6	fix windows build	2020-01-01 14:27:03 -04:00
Joey Hess	37467a008f	annex.addunlocked expressions * annex.addunlocked can be set to an expression with the same format used by annex.largefiles, in case you want to default to unlocking some files but not others. * annex.addunlocked can be configured by git-annex config. Added a git-annex-matching-expression man page, broken out from tips/largefiles. A tricky consequence of this is that git-annex add --relaxed honors annex.addunlocked, but an expression might want to know the size or content of an url, which it's not going to download. I decided it was better not to fail, and just dummy up some plausible data in that case. Performance impact should be negligible. The global config is already loaded for annex.largefiles. The expression only has to be parsed once, and in the simple true/false case, it should not do any additional work matching it.	2019-12-20 15:56:25 -04:00
Joey Hess	4acbb40112	git-annex config annex.largefiles annex.largefiles can be configured by git-annex config, to more easily set a default that will also be used by clones, without needing to shoehorn the expression into the gitattributes file. The git config and gitattributes override that. Whenever something is added to git-annex config, we have to consider what happens if a user puts a purposefully bad value in there. Or, if a new git-annex adds some new value that an old git-annex can't parse. In this case, a global annex.largefiles that can't be parsed currently makes an error be thrown. That might not be ideal, but the gitattribute behaves the same, and is almost equally repo-global. Performance notes: git-annex add and addurl construct a matcher once and use it for every file, so the added time penalty for reading the global config log is minor. If the gitattributes annex.largefiles were deprecated, git-annex add would get around 2% faster (excluding hashing), because looking that up for each file is not fast. So this new way of setting it is progress toward speeding up add. git-annex smudge does need to load the log every time. As well as checking the git attribute. Not ideal. Setting annex.gitaddtoannex=false avoids both overheads.	2019-12-20 13:01:41 -04:00
Joey Hess	686791c4ed	more RawFilePath Remove dup definitions and just use the RawFilePath one. </> etc are enough faster that it's probably faster than building a String directly, although I have not benchmarked.	2019-12-18 17:10:28 -04:00
Joey Hess	c19211774f	use filepath-bytestring for annex object manipulations git-annex find is now RawFilePath end to end, no string conversions. So is git-annex get when it does not need to get anything. So this is a major milestone on optimisation. Benchmarks indicate around 30% speedup in both commands. Probably many other performance improvements. All or nearly all places where a file is statted use RawFilePath now.	2019-12-11 15:25:07 -04:00
Joey Hess	bdec7fed9c	convert TopFilePath to use RawFilePath Adds a dependency on filepath-bytestring, an as yet unreleased fork of filepath that operates on RawFilePath. Git.Repo also changed to use RawFilePath for the path to the repo. This does eliminate some RawFilePath -> FilePath -> RawFilePath conversions. And filepath-bytestring's </> is probably faster. But I don't expect a major performance improvement from this. This is mostly groundwork for making Annex.Location use RawFilePath, which will allow for a conversion-free pipeline.	2019-12-09 15:07:21 -04:00
Joey Hess	c20f4704a7	all commands building except for assistant also, changed ConfigValue to a newtype, and moved it into Git.Config.	2019-12-05 14:41:18 -04:00
Joey Hess	650a631ef8	include all remotes back in	2019-12-02 12:26:33 -04:00
Joey Hess	f3047d7186	include git-annex-shell back in Also pushed ConfigKey down into the Git modules, which is the bulk of the changes.	2019-12-02 11:51:52 -04:00
Joey Hess	d7833def66	use ByteString for git config The parser and looking up config keys in the map should both be faster due to using ByteString. I had hoped this would speed up startup time, but any improvement to that was too small to measure. Seems worth keeping though. Note that the parser breaks up the ByteString, but a config map ends up pointing to the config as read, which is retained in memory until every value from it is no longer used. This can change memory usage patterns marginally, but won't affect git-annex.	2019-11-27 17:40:09 -04:00
Joey Hess	067aabdd48	wip RawFilePath 2x git-annex find speedup Finally builds (oh the agony of making it build), but still very unmergeable, only Command.Find is included and lots of stuff is badly hacked to make it compile. Benchmarking vs master, this git-annex find is significantly faster! Specifically (num files, old, new, speedup): 48500, 4.77, 3.73, 28%; 12500, 1.36, 1.02, 66%; 20, 0.075, 0.074, 0% (so startup time is unchanged). That's without really finishing the optimization. Things still to do: * Eliminate all the fromRawFilePath, toRawFilePath, encodeBS, decodeBS conversions. * Use versions of IO actions like getFileStatus that take a RawFilePath. * Eliminate some Data.ByteString.Lazy.toStrict, which is a slow copy. * Use ByteString for parsing git config to speed up startup. It's likely several of those will speed up git-annex find further. And other commands will certainly benefit even more.	2019-11-26 16:01:58 -04:00
Joey Hess	81d402216d	cache the serialization of a Key This will speed up the common case where a Key is deserialized from disk, but is then serialized to build eg, the path to the annex object. Previously attempted in `4536c93bb2` and reverted in `96aba8eff7`. The problems mentioned in the latter commit are addressed now: Read/Show of KeyData is backwards-compatible with Read/Show of Key from before this change, so Types.Distribution will keep working. The Eq instance is fixed. Also, Key has smart constructors, avoiding needing to remember to update the cached serialization. Used git-annex benchmark: find is 7% faster whereis is 3% faster get when all files are already present is 5% faster Generally, the benchmarks are running 0.1 seconds faster per 2000 files, on a ram disk in my laptop.	2019-11-22 17:49:16 -04:00
Joey Hess	b207d944f3	sync, assistant: Pull and push from git-lfs remotes. Oversight, forgot to add it to gitSyncableRemote	2019-11-18 16:13:21 -04:00
Joey Hess	5877de5e80	git-lfs: remember urls, and autoenable remotes using known urls * git-lfs: The url provided to initremote/enableremote will now be stored in the git-annex branch, allowing enableremote to be used without an url. initremote --sameas can be used to add additional urls. * git-lfs: When there's a git remote with an url that's known to be used for git-lfs, automatically enable the special remote.	2019-11-18 16:09:09 -04:00
Joey Hess	cee14f147a	stop displaying rsync progress, and use git-annex's own progress display for local-to-local repo transfers Reasons to do this include: 1. I've gotten pretty used to git-annex's own progress display, which is used for all transfers over ssh (except to old git-annex-shell), and for most special remote transfers. It's getting to seem weird to see the rsync progress display instead. 2. When -J was used, the rsync output could not be shown, and so there was no progress display. Now there will be. Progress will also be displayed now when cp CoW is used. But I'd expect a CoW copy to typically run so fast that the progress display will barely be noticeable. This commit was sponsored by Peter on Patreon.	2019-11-15 13:21:06 -04:00
Joey Hess	890330f0fe	make --json-error-messages capture url download errors Convert Utility.Url to return Either String so the error message can be displayed in the annex monad and so captured. (When curl is used, its errors are still not caught.)	2019-11-12 13:52:38 -04:00
Joey Hess	9e8d40181f	remove some unnecessary uses of warningIO warningIO is not concurrent output safe, and it doesn't go to --json-error-messages There are a few more that would be too hard to remove, and there are also several dozen direct prints to stderr still.	2019-11-12 10:07:27 -04:00
Joey Hess	9828f45d85	add RemoteStateHandle This solves the problem of sameas remotes trampling over per-remote state. Used for: * per-remote state, of course * per-remote metadata, also of course * per-remote content identifiers, because two remote implementations could in theory generate the same content identifier for two different pieces of content While chunk logs are per-remote data, they don't use this, because the number and size of chunks stored is a common property across sameas remotes. External special remote had a complication, where it was theoretically possible for a remote to send SETSTATE or GETSTATE during INITREMOTE or EXPORTSUPPORTED. Since the uuid of the remote is typically generated in Remote.setup, it would only be possible to pass a Maybe RemoteStateHandle into it, and it would otherwise have to construct its own. Rather than go that route, I decided to send an ERROR in this case. It seems unlikely that any existing external special remote will be affected. They would have to make up a git-annex key, and set state for some reason during INITREMOTE. I can imagine such a hack, but it doesn't seem worth complicating the code in such an ugly way to support it. Unfortunately, both TestRemote and Annex.Import needed the Remote to have a new field added that holds its RemoteStateHandle.	2019-10-14 13:51:42 -04:00
Joey Hess	35d7ffe128	initremote --sameas fully working And using sameas remotes is working. Moved annex-config-uuid setting out of Remote.Helper.Special. EnableRemote will also have to set it.	2019-10-11 14:19:10 -04:00
Joey Hess	2bd6e81bb0	support annex-config-uuid when generating remote This is used by a special remote with sameas-uuid= The remote's uuid is the sameas-uuid, but it needs to get its RemoteConfig from the annex-config-uuid.	2019-10-11 12:34:11 -04:00
Joey Hess	df5b0ffab3	inherit other fields I think this is all that need to be inherited.	2019-10-10 16:11:21 -04:00
Joey Hess	c3975ff3b4	sameas RemoteConfig inheritance I found a way to avoid inheritance complicating anything outside of Logs.Remote. It seems fine to require all inherited values to be inherited and not set in the sameas remote's config. Since inherited values will be used for stuff like encryption and perhaps chunking, which control the actual content stored on the remote, it seems likely that there will not be any reason to need them to vary between two remotes that access the same underlying data store. The newer version of containers is free; the minimum ghc version is bundled with a newer version than that.	2019-10-10 15:58:22 -04:00
Joey Hess	59908586f4	rename RemoteConfigKey to RemoteConfigField And some associated renames. I was going to have some values named fooKeyKey otherwise..	2019-10-10 15:44:05 -04:00
Joey Hess	d1130ea04a	get rid of hardcoded "name" lookups Support "sameas-name" being set instead. In RenameRemote, rename which ever of the two is set.	2019-10-10 13:25:10 -04:00
Joey Hess	92ff30df70	set annex-config-uuid when RemoteConfig contains a sameas-uuid Initremote sets that, so after both initremote and enableremote, the git config will be set. Any remote that does not use Annex.SpecialRemote won't set annex-config-uuid. But that's only Remote.Git, which doesn't use RemoteConfig anyway.	2019-10-10 12:58:59 -04:00
Joey Hess	46071a2435	use storeUUIDIn	2019-10-10 12:38:17 -04:00
Joey Hess	06c04ffe29	use storeUUIDIn	2019-10-10 12:12:09 -04:00
Joey Hess	a6c3d1cb6d	avoid unnecessary extra blank line before git-credentials prompt	2019-09-24 18:06:10 -04:00
Joey Hess	bc1b9a2c0a	improved GitLFS api	2019-09-24 18:05:11 -04:00
Joey Hess	6ae0a44c64	git-lfs: Added support for http basic auth	2019-09-24 14:46:20 -04:00
Joey Hess	de564df8b3	git-lfs: Only do endpoint discovery once when concurrency is enabled This avoids some extra work, but I don't think it was possible for two ssh endpoint discoveries run concurrently to both prompt for the ssh password; Annex.Ssh itself deals with concurrency. This is mostly groundwork for http password prompting.	2019-09-24 13:01:51 -04:00
Joey Hess	53fd746705	avoid some build warnings on windows	2019-09-12 14:11:19 -04:00
Joey Hess	3f0eef4baa	v7 for all repositories * Default to v7 for new repositories. * Automatically upgrade v5 repositories to v7.	2019-08-30 14:09:14 -04:00
Joey Hess	e804f48f82	remove a few more isDirect tests	2019-08-28 11:53:10 -04:00
Joey Hess	16f646c9a6	don't hide message when ensureInitialized fails	2019-08-27 12:38:47 -04:00
Joey Hess	bb18bbd426	consolidate calls to ensureInitialized tryGitConfigRead may run ensureInitialized first, but when checkuuid = false, that is skipped. So, make sure it's run before all onLocal actions. ensureInitialized is inexpensive, so the extra call by tryGitConfigRead is not a big deal. But since it was easy to do, I made it only be run once by all calls to onLocal. A few calls to onLocal didn't call ensureInitialized before. Notably, the checkPresent action didn't, and does now. That means that there's a guarantee that any necessary repo upgrades will be run before the checkPresent action runs in the repo. Which is important especially for the direct mode conversion, because without that upgrade, the checkPresent action would need to support direct mode still. Now I can remove the last bits of direct mode support in Annex.Content without worrying that it will break accessing remotes that have not been upgraded. This does necessarily mean that checkPresent needs to write to the disk when performing such a repo upgrade. The other remote actions already did, so retrieval from a readonly remote that needed to be upgraded would fail. Having checkPresent also fail doesn't seem like a large reversion, especially since it already failed in the default case when checkuuid = true.	2019-08-27 12:18:01 -04:00
Joey Hess	708fc6567f	S3: Fix encoding when generating public urls of S3 objects. This code feels worryingly stringily typed, but using URI does not help because the uriPath still has to be constructed with the right uri-encoding.	2019-08-15 12:56:46 -04:00
Joey Hess	386c0ce90a	close handle so windows can stat the file windows cannot stat a file that another process has open, which caused this to crash with an exception	2019-08-13 13:26:25 -04:00
Joey Hess	cfd0b4108e	avoid windows build warning	2019-08-13 13:10:33 -04:00
Joey Hess	5004381dd9	improve error display when storing to an export/import remote fails Prompted by the test suite on windows failing with "export foo failed" and no information about what went wrong. Note that only storeExportWithContentIdentifier has been converted. storeExport still returns a Bool and so exceptions may be hidden. However, storeExportWithContentIdentifier has many more failure modes, since it needs to avoid overwriting modified files. So it's more important it have better error display.	2019-08-13 12:05:00 -04:00
Joey Hess	f27c5db5c5	avoid rsync failing with a permissions error The test suite was intermittently failing with rsync complaining it could not write to dest. get foo (from origin...) SHA256E-s20--e394a389d787383843decc5d3d99b6d184ffa5fddeec23b911f9ee7fc8b9ea77 20 100% 0.00kB/s 0:00:00 ^M 20 100% 0.00kB/s 0:00:00 (xfr#1, to-chk=0/1) (from origin...) SHA256E-s20--e394a389d787383843decc5d3d99b6d184ffa5fddeec23b911f9ee7fc8b9ea77 20 100% 0.00kB/s 0:00:00 ^M 20 100% 0.00kB/s 0:00:00 (xfr#1, to-chk=0/1) rsync: open "/home/joey/src/git-annex/.t/tmprepo1103/.git/annex/tmp/SHA256E-s20--e394a389d787383843decc5d3d99b6d184ffa5fddeec23b911f9ee7fc8b9ea77" failed: Permission denied (13) rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1207) [sender=3.1.3] It seems that the first rsync actually transferred the file, but then for some reason git-annex thinks it failed, so it retries. The second rsync then fails because the first rsync copied the file mode over and so the file is not writable now. So, this fixes that problem, but leaves open the question of why git-annex would think rsync failed when it wrote the file and didn't output any error message. Possibly a bug in rsyncProgress that either hides an error message, or somehow makes rsync unhappy?	2019-08-09 15:26:58 -04:00
Joey Hess	fb7d92457f	support using gcrypt with git-lfs special remote	2019-08-05 13:43:45 -04:00
Joey Hess	8401b09e32	Allow setting up a gcrypt special remote with encryption=shared It was documented to work, but seems it has been broken for a while/forever.	2019-08-05 12:41:05 -04:00
Joey Hess	3f450f0f4a	add encryption warning	2019-08-05 11:35:26 -04:00
Joey Hess	ecf7f34c23	remember sha256 and size when necessary Using Logs.RemoteState for this means that if the same key gets uploaded twice to a git-lfs remote, but somehow has different content the two times (eg it's an URL key with non-stable content), the sha256/size of the newer content uploaded will overwrite what was remembered before. That seems ok; it just means that git-annex will request the newer version of the content when downloading from git-lfs. It will remember the sha256 and size if both are not known, or if only the sha256 is not known but the size is known, it only remembers the sha256, to avoid wasting space on the size. I did not add special case for when the sha256 is known and the size is not, because it's been a long time since git-annex created SHA256 keys without a size. (See doc/upgrades/SHA_size.mdwn)	2019-08-05 11:05:59 -04:00
Joey Hess	f5eb28682a	expand	2019-08-04 13:59:24 -04:00
Joey Hess	408cb0af39	remove unused imports	2019-08-04 12:43:53 -04:00
Joey Hess	9aab851a55	fix reversion lost check of resp_actions in `b82ecf7076`	2019-08-04 12:43:16 -04:00
Joey Hess	7269851550	download from LFS working including resuming	2019-08-04 12:32:36 -04:00
Joey Hess	b82ecf7076	verify that LFS server responds with requested object The protocol design allows the server to respond with some other object; if for some reason a server did that, it would not be right for git-annex to download its content. I don't think it would be a security hole, since git-annex is downloading a specific key and will verify the key's content. Seems like a good idea to belt-and-suspenders test for such a misuse of the protocol.	2019-08-03 16:23:47 -04:00
Joey Hess	28c0395d61	start at retrieval from LFS Doesn't yet download the content, which will need to support resuming.	2019-08-03 12:51:16 -04:00
Joey Hess	5be0a35dae	implemented checkPresent for git-lfs	2019-08-03 12:21:28 -04:00
Joey Hess	a16e83eec8	also debug http response status code	2019-08-03 11:30:06 -04:00
Joey Hess	fc09a41ed1	storing objects in git-lfs is working Still need to record the sha256 and size when they cannot be determined by inspecting the key.	2019-08-02 13:56:55 -04:00
Joey Hess	6c1130a3bb	lfs endpoint discovery and caching in git-lfs special remote	2019-08-02 12:38:14 -04:00
Joey Hess	1cef791cf3	skeleton git-lfs special remote This is a special remote and a git remote at the same time; git can pull and push to it and git-annex can use it as a special remote. Remote.Git has to check if it's configured as a git-lfs special remote and sets it up as one if so. Object methods not implemented yet.	2019-08-01 15:30:12 -04:00
Joey Hess	426053cb6c	Corrected some license statements In `40ecf58d4b` I changed the license of code I wrote from GPL to AGPL. But, two files containing code I wrote combined with code by others were updated to say their license is AGPL, while in fact part of it was (the code I wrote) but part remained under the original license (the code written by others). Remote/Ddar.hs is now changed entirely back to GPL 3. Annex/DirHashes.hs stays AGPL, but I broke out Utility/MD5.hs with the code not written by me, and corrected its license statement to GPL-2, which is the actual version of the GPL included with the code in its original distribution at http://www.cs.ox.ac.uk/people/ian.lynagh/md5/	2019-07-28 14:27:33 -04:00
Joey Hess	21ff5e1e5a	CoW probing Improved probing when CoW copies can be made between files on the same drive. Now supports CoW between BTRFS subvolumes. And, falls back to rsync instead of using cp when CoW won't work, eg copies between repos on the same EXT4 filesystem. Rather than trying cp --reflink=always for each file copied to a remote, it's tried once and if it fails it falls back to using rsync thereafter for the lifetime of the Remote object. That avoids overhead of calling cp which while small, will add up over a large number of files. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2019-07-17 14:19:08 -04:00
Joey Hess	9a5ddda511	remove many old version ifdefs Drop support for building with ghc older than 8.4.4, and with older versions of several haskell libraries than will be included in Debian 10. The only remaining version ifdefs in the entire code base are now a couple for aws! This commit should only be merged after the Debian 10 release. And perhaps it will need to wait longer than that; it would complicate backporting new versions of git-annex to Debian 9 (stretch), which has been actively happening as recently as this year. This commit was sponsored by Ilya Shlyakhter.	2019-07-05 15:09:37 -04:00
Joey Hess	26c54d6ea3	make metered more generic Allow it to be used when the Key is not known.	2019-06-25 12:33:36 -04:00
Joey Hess	44de3fff0b	avoid rsync/gcrypt ssh startup delay with -J Avoid a delay at startup when concurrency is enabled and there are rsync or gcrypt special remotes, which was caused by git-annex opening a ssh connection to the remote too early. sshOptions makes a connection to the ssh server if one is not already open, when concurrency is enabled. Avoid doing that at startup, when the remote list is being built, but the remote may not be used at all. Instead, rsync/gcrypt now runs sshOptions once per ssh connection to the server. This should not be significant overhead since Remote.Git already has the same overhead (as do Bup and Ddar).	2019-06-13 11:16:38 -04:00
Joey Hess	94cba37f68	fix build	2019-05-28 11:18:05 -04:00
Joey Hess	c1ed0293b0	improve docs about removeExportDirectory	2019-05-28 11:16:01 -04:00
Joey Hess	8960f259b8	make readonly export remotes really be readonly When a remote is configured to be readonly, don't allow changing what's exported to it. This was missed in the original export remote implementation, but it makes sense for a readonly export remote to not be allowed to change.	2019-05-28 11:04:28 -04:00
Joey Hess	700a3f2787	Merge branch 'master' into import-from-s3	2019-05-01 14:30:52 -04:00
Joey Hess	9dd764e6f7	Added mimeencoding= term to annex.largefiles expressions. * Added mimeencoding= term to annex.largefiles expressions. This is probably mostly useful to match non-text files with eg "mimeencoding=binary" * git-annex matchexpression: Added --mimeencoding option.	2019-04-30 12:17:22 -04:00
Joey Hess	f08cd6a4ac	set S3 version id in retrieveExportWithContentIdentifierS3 This is necessary because of checks for a S3 version id being set done when deleting the export or overwriting or renaming it.	2019-04-24 15:13:07 -04:00
Joey Hess	a42e7a012a	refuse unsafe store to unversioned exporttree with old aws version I've developed a patch to aws, once it gets merged, the real version number of aws can be filled in.	2019-04-23 14:39:30 -04:00
Joey Hess	15bd7d57ca	info: Show when a remote is configured with importtree	2019-04-23 14:27:43 -04:00
Joey Hess	a7db925f59	typo	2019-04-23 13:19:48 -04:00
Joey Hess	710c2cdbdc	implement rest of missing methods for import from S3	2019-04-23 13:09:27 -04:00
Joey Hess	2f79cb4b45	versioned import from S3 is working Still some bugs and two stubbed methods to implement though.	2019-04-19 15:13:49 -04:00
Joey Hess	9dc7a10448	Drop support for building with aws older than 0.14. debian stable has 0.14 so lose the complexity for old versions	2019-04-19 14:27:59 -04:00
Joey Hess	55a5d9679a	implemented mkImportableContentsVersioned	2019-04-19 13:39:33 -04:00
Joey Hess	bf6c7ea6b6	starting work on import from S3 Not in a usable state yet.	2019-04-18 15:20:09 -04:00
Joey Hess	cd86692c95	fix storeExportWithContentIdentifier	2019-04-09 19:15:20 -04:00
Joey Hess	7b6d0da9b8	adb import As well as adding the necessary methods, a few other changes to the adb remote: * Use ".annextmp" extension for temp files, to avoid conflict with other temp files. * Stop using "echo $?" to get exit status of command inside adb. There were two problems; first the "echo" just before it meant it was always 0! And secondly, it seems kind of random on my phone whether it's 1 or 0, not dependent on whether the command seems to have succeeded.	2019-04-09 17:52:41 -04:00
Joey Hess	2dc20e3fa4	update design doc with final design choices	2019-04-09 13:05:22 -04:00
Joey Hess	06cbaa4233	fix back-compat with old git-annex Unfortunately, "port" has to be set by default, or the old git-annex will crash when trying to enable the S3 remote. So, when protocol=https is specified, it needs to override port=80, since it may be a default setting.	2019-03-22 12:27:41 -04:00
Joey Hess	2a99d7ffc0	improve error message	2019-03-22 12:23:59 -04:00
Joey Hess	7d37011a11	S3: Added protocol= initremote setting, to allow https to be used on a non-standard port protocol=https implies port=443 and port=443 implies protocol=https -- this was necessary because the existing configs set port=443, but with a protocol setting, users will naturally want to use it, and then there's no need for them to supply the default https port. So we keep back-compat, add a nicer way to enable https, and also add support for non-standard https ports.	2019-03-22 12:17:05 -04:00
Joey Hess	40ecf58d4b	update licenses from GPL to AGPL This does not change the overall license of the git-annex program, which was already AGPL due to a number of sources files being AGPL already. Legally speaking, I'm adding a new license under which these files are now available; I already released their current contents under the GPL license. Now they're dual licensed GPL and AGPL. However, I intend for all my future changes to these files to only be released under the AGPL license, and I won't be tracking the dual licensing status, so I'm simply changing the license statement to say it's AGPL. (In some cases, others wrote parts of the code of a file and released it under the GPL; but in all cases I have contributed a significant portion of the code in each file and it's that code that is getting the AGPL license; the GPL license of other contributors allows combining with AGPL code.)	2019-03-13 15:48:14 -04:00
Joey Hess	2912429640	better indicate when special remotes do not support renameExport Avoid a warning message when renameExport is not supported, and just fallback to deleting with a subsequent re-upload. Especially needed for importtree remotes, where renameExport needs to be disabled. This changes the external special remote protocol, but in a backwards-compatible way. A reply of UNSUPPORTED-REQUEST to an older version of git-annex will cause it to make renameExport return False.	2019-03-11 12:53:24 -04:00
Joey Hess	e412129523	concurrency and status messages when downloading from import	2019-03-08 12:33:44 -04:00
Joey Hess	ee5f1422df	remove debug print	2019-03-07 16:08:58 -04:00
Joey Hess	9a72785307	fixes to export db lookup when accessing importtree=yes Now in a fresh clone with a importtree=yes remote enabled, git annex fsck --from the remote works.	2019-03-07 14:10:56 -04:00
Joey Hess	93025dd59f	add missing locking of ContentIdentifier database when writing This is not super efficient; it would be better to lock the database once and build up a queue of changes and flush once. But, storeExportWithContentIdentifier is likely going to be the really expensive part, so let's do the simple thing and only optimise later if needed.	2019-03-07 13:32:33 -04:00
Joey Hess	3f449f845e	update	2019-03-07 13:28:18 -04:00
Joey Hess	b3d30e7d70	remove unnecessary locking of ContentIdentifier db Remote.Helper.ExportImport only reads from it, and locking is only needed when writing.	2019-03-06 14:36:57 -04:00
Joey Hess	b23c301820	fix false positive from checkPresentExportWithContentIdentifierM when file does not exist	2019-03-05 17:04:00 -04:00
Joey Hess	dc278c059c	fix STM crash git-annex: thread blocked indefinitely in an STM transaction failed git-annex: sqlite query crashed CallStack (from HasCallStack): error, called at ./Database/Handle.hs:98:42 in main:Database.Handle failed This needs further investigation.	2019-03-05 16:37:40 -04:00
Joey Hess	46d33e804a	added checkPresentExportWithContentIdentifier Ugh, don't like needing to add this, but I can't see a way around it.	2019-03-05 16:03:03 -04:00
Joey Hess	354aafce1a	refactor database handle code Use same, simpler method to make only one thread open the export db as is used for the ContentIdentifier db. And, always update the export db once before using.	2019-03-05 15:42:39 -04:00
Joey Hess	fd2a1aaa17	avoid using renameExport on import remotes	2019-03-05 14:57:48 -04:00
Joey Hess	bec66258a8	minor	2019-03-05 14:50:39 -04:00
Joey Hess	8c54604e67	import+export from directory special remote fully working Had to add two more API calls to override export APIs that are not safe for use in combination with import. It's unfortunate that removeExportDirectory is documented to be allowed to remove non-empty directories. I'm not entirely sure why it's that way, my best guess is it was intended to make it easy to implement with just rm -rf.	2019-03-05 14:20:14 -04:00
Joey Hess	554b7b7f3e	fix todo	2019-03-04 18:20:12 -04:00
Joey Hess	bc509143e5	avoid opening export db until needed Before, it was opened when constructing the export Remote, even if it never got used.	2019-03-04 18:11:32 -04:00
Joey Hess	cd3a2b023a	initial try at using storeExportWithContentIdentifier Untested, and I'm not sure about the locking of the ContentIdentifier db.	2019-03-04 17:50:41 -04:00
Joey Hess	aaacf431d8	handle importtree=yes config For now, it's only allowed when exporttree=yes is also set. That simplified the implementation, but could later be changed if there's a remote that makes sense to be an import but not an export. However, it may work just as well to make a remote be readonly to prevent export to it while still allowing import.	2019-03-04 16:07:35 -04:00
Joey Hess	88ccfaa78c	storeExportWithContentIdentifierM for directory special remote Not sure if my reasoning about the races really holds. It would certainly be possible to better guard against races by using Linux-specific renameat2 with RENAME_EXCHANGE or RENAME_NOREPLACE. Or by using link and relying on it not overwriting existing files -- but that would need a filesystem that supports hard links and directory can be used in filesystems that don't.	2019-03-04 14:46:25 -04:00
Joey Hess	3cd19fb4d0	use InodeCache to avoid races in import from directory special remote This does not avoid all possible races, but it does avoid all likely ones, and is demonstrably better than git's own handling of races where files get modified at the same time as it's updating the working tree. The main thing this won't detect are not unlikely races where part of a file gets changed while it's being copied and then the file is restored to its original condition before the modification check. No, it's more likely that the limitations of checking inode, size, and mtime won't detect certain modifications, involving eg mmapped files.	2019-03-04 13:57:23 -04:00
Joey Hess	e2e57f8556	initial export support for directory special remote This does not guard against race condition yet, it's only for testing purposes.	2019-02-27 13:42:34 -04:00
Joey Hess	45aacd888b	import downloader complete (untested) Made some api changes. listImportableContents needs to provide the size of the data, so the downloader can check disk free space. retrieveExportWithContentIdentifier is passed the filepath to write to Use temporary "CID" key during download of a ContentIdentifier from a remote, so withTmp can be used and then move the content to the real key once it's known.	2019-02-27 13:15:02 -04:00
Joey Hess	760f26ebc6	Merge branch 'master' into importtree	2019-02-26 11:36:36 -04:00
Joey Hess	19f833b0b1	aws-0.21.1 * S3: Support enabling bucket versioning when built with aws-0.21.1. * stack.yaml: Build with aws-0.21.1	2019-02-24 12:45:09 -04:00
Joey Hess	fd304dce60	split out Types.Import and some changes to the types in it	2019-02-21 13:39:09 -04:00
Joey Hess	ccc0684d21	no remotes support import yet	2019-02-20 16:59:04 -04:00
Joey Hess	e49f3139b5	fix windows build some more	2019-02-18 17:46:21 -04:00
Joey Hess	4c3178aadf	fix windows build	2019-02-18 17:38:21 -04:00
Joey Hess	9f6b7d6258	On Windows, avoid using rsync for file-to-file copies, since rsync is not always available there. When git-annex is installed with stack, rsync won't be available. Also, using the git-annex installer with 64 bit git installs a non-working rsync binary because it's linked with libraries provided by 32 bit git.	2019-02-18 17:27:34 -04:00
Joey Hess	60c1b5c994	deal with attempt to export filename with # or ? to webdav Exporting files with '#' or '?' in their name won't work because urls get truncated on those. Fail in a better way in this case, and avoid failing when removing such files from the export, so after the user has renamed the problem files the export will succeed.	2019-02-07 13:47:57 -04:00
Joey Hess	7b9701675e	Display progress bar when getting files from export remotes And moved the progress bar display into storeExport as well. This commit was sponsored by John Pellman on Patreon.	2019-01-31 13:34:12 -04:00
Joey Hess	ab689cf0cd	Improved speed of S3 remote by only loading S3 creds once This gets back any speed lost in commit `9cebfd7002`, and speeds up all uses of S3 remotes that operate on them more than once. This commit was sponsored by Brett Eisenberg on Patreon.	2019-01-30 16:20:14 -04:00
Joey Hess	8eb66a5c40	avoid potentially unsafe use of runResourceT Pushed the ResourceT out into larger code blocks, and made sure that the http result from a sendS3Handle is processed inside the same ResourceT block. I don't think this fixes any bugs, but it allows getting rid of a scary comment. This commit was sponsored by Eric Drechsel on Patreon.	2019-01-30 15:40:13 -04:00
Joey Hess	9cebfd7002	purify exportActions Purifying exportActions will allow introspecting and modifying it, which is needed to add progress bar display to it. Only S3 and WebDAV ran an Annex action while constructing ExportActions. There was a small performance gain from them doing that, since a resource was able to be prepared and reused for multiple actions by Command.Export. As seen in commit `809cfbbd8a` and `5d394023eb` S3 and WebDAV actually create a new handle for each access in normal, non-export use. It doesn't seem worth making export use of them marginally more efficient than normal use. It would be better to do that work upfront when constructing the remote. Or perhaps use a MVar to cache a handle. This commit was sponsored by Nick Piper on Patreon.	2019-01-30 15:11:40 -04:00
Joey Hess	5d394023eb	remove incorrect comment resourcePrepare does not cause the resource to only be prepared once. The http manager should be reused, which does avoid http connection overhead, but not because of the use of resourcePrepare.	2019-01-30 14:38:35 -04:00
Joey Hess	809cfbbd8a	prepareS3Handle didn't give any benefits, so remove I seem to have thought that a Preparer was only run once when a remote is accessed multiple times, but that is not in fact the case. prepareS3Handle is run once per access. So, there is no point to it. That there is some duplicate work done on each access is now apparent. Luckily, the http manager is reused, so only one http connection is made. But the S3 creds are loaded repeatedly. Room for improvement here. This commit was sponsored by Jack Hill on Patreon.	2019-01-30 14:23:39 -04:00
Joey Hess	720e5fda5c	export retrieval fallback to handle S3 remote with partially missing version IDs When key-based retrieval from a S3 remote with exporttree=yes appendonly=yes fails, fall back to trying to retrieve from the exported tree. This allows downloads of files that were exported to such a remote before versioning was enabled on it. This is useful at least for a transition for users who got into that situation, so they can download content from their S3 remote. May want to remove this in the future though, since normally trying to download the second time is only extra work. This commit was sponsored by Brock Spratlen on Patreon.	2019-01-30 13:23:03 -04:00
Joey Hess	ad1d422dd7	fix false positive in export conflict detection Like the earlier fixed one in Command.Export, it occurred when the same tree was exported by multiple clones. Previous fix was incomplete since several other places looked at the list of exported trees to detect when there was an export conflict. Added a single unified function to avoid missing any places it needed to be fixed. This commit was sponsored by mo on Patreon.	2019-01-30 12:36:30 -04:00
Joey Hess	8fc6c11cf1	couple fixes	2019-01-29 15:20:22 -04:00
Joey Hess	a8f1add4d1	S3: Detect when version=yes but an exported file lacks versioning, and refuse to delete it, to avoid data loss. This commit was sponsored by Denis Dzyubenko on Patreon.	2019-01-29 15:07:27 -04:00
Joey Hess	bb9817ceae	enableremote S3: Do not let versioning=yes be set on existing remote Because when git-annex lacks S3 version IDs for files stored in the bucket, deleting them would cause data loss. Also because git-annex is not able to download unversioned objects from a bucket when versioning=yes. This also prevents setting versioning=no. While that would perhaps be possible to do safely, it would add complexity, and would mean that if the user accidentally did enableremote versioning=no, they would not be able to undo it. This commit was sponsored by Trenton Cronholm on Patreon.	2019-01-29 14:09:50 -04:00
Joey Hess	ee011b3cbb	initremote S3: Automatically enable versioning in S3 buckets when configured with versioning=yes. Needs not yet released version 0.22 of aws library; with older versions asks the user to configure the bucket versioning themselves. Note that S3 endpoints that don't support versioning will cause putBucketVersioning to throw an exception, so initremote will fail. This commit was sponsored by Jake Vosloo on Patreon.	2019-01-29 13:46:04 -04:00
Joey Hess	c4977ec1ff	refactoring	2019-01-29 13:42:32 -04:00
Joey Hess	669b305de2	S3: Send a Content-Type header when storing objects in S3 So exports to public buckets can be linked to from web pages. (When git-annex is built with MagicMime support.) Thanks to Jared Cosulich for the idea.	2019-01-23 13:08:47 -04:00
Joey Hess	d5f2463702	misctmp cleanup * Switch to using .git/annex/othertmp for tmp files other than partial downloads, and make stale files left in that directory when git-annex is interrupted be cleaned up promptly by subsequent git-annex processes. * The .git/annex/misctmp directory is no longer used and git-annex will delete anything lingering in there after it's 1 week old. Also, in Annex.Ingest, made the filename it uses in the tmp dir be prefixed with "ingest-" to avoid potentially using a filename used by some other code.	2019-01-17 16:02:22 -04:00
Joey Hess	96aba8eff7	Revert "cache the serialization of a Key" This reverts commit `4536c93bb2`. That broke Read/Show of a Key, and unfortunately Key is read in at least one place; the GitAnnexDistribution data type. It would be worth bringing this optimisation back, but it would need either a custom Read/Show instance that preserves back-compat, or wrapping Key in a data type that contains the serialization, or changing how GitAnnexDistribution is serialized. Also, the Eq instance would need to compare keys with and without a cached serialization the same.	2019-01-16 16:21:59 -04:00
Joey Hess	4536c93bb2	cache the serialization of a Key This will speed up the common case where a Key is deserialized from disk, but is then serialized to build eg, the path to the annex object. It means that every place a Key has any of its fields changed, the cache has to be dropped. I've grepped and found them all. But, it would be better to avoid that gotcha somehow..	2019-01-14 16:37:28 -04:00
Joey Hess	d3ab5e626b	rename key2file and file2key What these generate is not really suitable to be used as a filename, which is why keyFile and fileKey further escape it. These are just serializing Keys. Also removed a quickcheck test that was very unlikely to test anything useful, since it relied on random chance creating something that looks like a serialized key. The other test is sufficient for testing what that was intended to test anyway.	2019-01-14 13:03:35 -04:00
Joey Hess	727767e1e2	make everything build again after ByteString Key changes	2019-01-11 16:39:46 -04:00
Joey Hess	cb375977a6	follow-on changes from MetaData type changes Including writing and parsing the metadata log files with bytestring-builder and attoparsec.	2019-01-07 15:51:05 -04:00
Joey Hess	7d51b0c109	import Utility.FileSystemEncoding in Common	2019-01-03 11:37:02 -04:00
Joey Hess	b3c69eaaf8	strict bytestring encoders and decoders Only had lazy ones before. Already sped up a few parts of the code.	2019-01-01 14:55:15 -04:00
Joey Hess	9cc6d5549b	convert UUID from String to ByteString This should make == comparison of UUIDs somewhat faster, and perhaps a few other operations around maps of UUIDs etc. FromUUID/ToUUID are used to convert String, which is still used for all IO of UUIDs. Eventually the hope is those instances can be removed, and all git-annex branch log files etc use ByteString throughout, for a real speed improvement. Note the use of fromRawFilePath / toRawFilePath -- while a UUID usually contains only alphanumerics and so could be treated as ascii, it's conceivable that some git-annex repository has been initialized using a UUID that is not only not a canonical UUID, but contains high unicode or invalid unicode. Using the filesystem encoding avoids any problems with such a thing. However, a NUL in a UUID seems extremely unlikely, so I didn't use encodeBS / decodeBS to avoid their extra overhead in handling NULs. The Read/Show instance for UUID luckily serializes the same way for ByteString as it did for String.	2019-01-01 14:45:33 -04:00
Joey Hess	2e069eb9f6	use putBucket to future-proof New fields can be added to PutBucket in the future.	2018-12-31 13:09:20 -04:00
Joey Hess	3fdc6fdfa9	remove unused import	2018-12-30 15:18:49 -04:00
Joey Hess	a26514d67e	Fix doubled progress display when downloading an url when -J is used. downloadUrl uses meteredFile, which sets up one progress meter, and Remote.Web also uses metered, so two progress meters are displayed for the same download. Reversion introduced with the http-conduit switch in `c34152777b` -- I don't know why the extra call to metered was added there. When -J is not used, the extra progress meter didn't display, but an extra blank line did get output, which is also fixed. This commit was sponsored by John Pellman on Patreon.	2018-12-30 12:29:49 -04:00
Joey Hess	3f587d447a	fix webdav reversion webdav: When initializing, avoid trying to make a directory at the top of the webdav server, which could never accomplish anything and failed on nextcloud servers. (Reversion introduced in version 6.20170925.) This commit was sponsored by mo on patreon.	2018-12-10 12:49:51 -04:00
Joey Hess	4579dd6201	S3: Improve diagnostics when a remote is configured with exporttree and versioning, but no S3 version id has been recorded for a key. When public access is used for the remote, it complained that the user needed to set creds to use it, which was just wrong. When creds were being used, it fell back from trying to use the version ID to just accessing the key in the bucket, which was ok for non-export remotes, but wrong for buckets. In both cases, display a hopefully useful warning. This should only come up when an existing S3 remote has been exported to, and then later versioning was enabled. Note that it would perhaps be possible to fall back from trying to use retrieveKeyFile when it fails and instead use retrieveKeyFileFromExport, which may work when S3 version ID is missing. But there are problems with that approach; how to tell when retrieveKeyFile has failed due to this rather than a network problem etc? Anyway, that approach would only work until the file in the export got overwritten, and then it would no longer be accessible. And with versioning enabled, the user wants old versions of objects to remain accessible, so it seems better to warn about the problem as soon as possible, so they can go back and add S3 version IDs. This work is supported by the NIH-funded NICEMAN (ReproNim TR&D3) project.	2018-12-06 13:44:37 -04:00
Joey Hess	1308a76bf1	deMaybe credPairRemoteKey It's always Just	2018-12-04 13:37:43 -04:00
Joey Hess	a25fef36ad	fix json for exportedtrees in conflict Repeating the same json field with multiple values tends to not be supported well by json parsers, so list the trees separated by spaces.	2018-12-03 14:43:59 -04:00
Joey Hess	b8f9dea27d	add exportedtree to info info: When used with an exporttree remote, includes an "exportedtree" info, which is the tree last exported to the remote. During an export conflict, multiple values will be listed. This commit was sponsored by John Pellman on Patreon.	2018-12-03 14:36:00 -04:00
Robert Schütz	32017f8082	replace TORRENT by WITH_TORRENTPARSER	2018-11-27 12:29:25 -04:00
Joey Hess	370757087d	catch lockContentForRemoval exception removeKey should not throw exceptions, so catch exception there. In Assistant.Unused, keep trying to drop other keys if one drop fails	2018-11-15 15:39:57 -04:00
Joey Hess	d65df7ab21	improve messages around export conflicts When an export conflict prevents accessing a special remote, be clearer about what the problem is and how to resolve it. This commit was sponsored by Trenton Cronholm on Patreon.	2018-11-13 15:50:06 -04:00
Joey Hess	c472c268c4	webapp: Fixed a crash when adding a git remote. Reversion introduced in `2b66492d6e` which added a new cache that needs to be cleared.	2018-10-29 16:01:08 -04:00
Joey Hess	a622488758	remove CHECKURL-MULTI single url response special case Removed undocumented special case in handling of a CHECKURL-MULTI response with only a single file listed. Rather than ignoring the url that was in the response, use it. This allows external special remotes that want to provide some better url to do so, although I don't entirely agree with using CHECKURL-MULTI to accomplish that. I'm more of the feeling that an undocumented special case that throws data away is just not a good idea. This could in theory break some external special remote program that relied on the current behavior, but it seems unlikely that it would because such a program must already handle the multiple url case, unless it only ever provides a single url response to CHECKURL-MULTI. Make addurl --file work with a single item CHECKURL-MULTI response. It already did for external special remotes due to the special case, but now it also will for builtin ones like the BitTorrent special remote. This commit was sponsored by Ilya Shlyakhter on Patreon.	2018-10-29 14:52:12 -04:00

1371 commits