git-annex

Author	SHA1	Message	Date
Joey Hess	283d2f85d1	importfeed: Fix reversion that caused some '.' in filenames to be replaced with '_' sanitizeFilePath was changed to sanitize leading '.', but ImportFeed was running it on parts of the template. So eg the leading '.' in the extension got sanitized. Note the added case for sanitizeLeadingFilePathCharacter ('/':_) -- this was added because, if the template is title/episode and the title is not set, it would expand to "/episode". So this is another potential security fix.	2020-08-05 11:35:00 -04:00
Joey Hess	f75be32166	external backends wip It's able to start them up, the only thing not implemented is generating and verifying keys. And, the key translation for HasExt.	2020-07-29 15:23:18 -04:00
Joey Hess	555fe669e1	refactoring in preparation for external backends	2020-07-29 12:00:27 -04:00
Joey Hess	f5e65d680b	add back inAnnex check for drop here Needed again after last commit removed it from startLocal again.	2020-07-25 18:17:33 -04:00
Joey Hess	2a45b5ae9a	avoid failure to lock content of removed file causing drop etc to fail This was already prevented in other ways, but as seen in commit `c30fd24d91`, those were a bit fragile. And I'm not sure races were avoided in every case before. At least a race between two separate git-annex processes, dropping the same content, seemed possible. This way, if locking fails, and the content is not present, it will always do the right thing. Also, it avoids the overhead of an unnecessary inAnnex check for every file. This commit was sponsored by Denis Dzyubenko on Patreon.	2020-07-25 11:59:33 -04:00
Joey Hess	c30fd24d91	add back inAnnex check after seeking The test suite noticed this case, where two files with the same key are dropped, and the seek stage sees both have content due to the way files stream through it. But then locking the content to drop fails on the second file, because the first file has already been dropped. So, add back otherwise redundant inAnnex check.	2020-07-25 11:18:50 -04:00
Joey Hess	18f1fb5841	drop performance improvements Sped up seeking files to drop by 2x, and also some performance improvements to checking numcopies. Interestingly, the seek speedup is not due to precaching, but I think is due to calling getParsed earlier. Annex.Drop had to be changed to check inAnnex there, since it was removed from Command.Drop. All other users of Command.Drop already checked inAnnex themselves. This commit was sponsored by Ryan Newton on Patreon.	2020-07-24 13:27:46 -04:00
Joey Hess	c4cc2cdf4c	rename getKey to genKey for consistency with external backend protocol	2020-07-20 14:06:05 -04:00
Joey Hess	172743728e	move cryptographicallySecure into Backend type This is groundwork for external backends, but also makes sense to keep this information with the rest of a Backend's implementation. Also, removed isVerifiable. I noticed that the same information is encoded by whether a Backend implements verifyKeyContent or not.	2020-07-20 12:17:42 -04:00
Joey Hess	2634a5ed99	avoid inflating error counter when forking and merging annex state	2020-07-19 18:31:25 -04:00
Joey Hess	7a42a47902	renaming	2020-07-10 14:17:35 -04:00
Joey Hess	9f6bd6cc05	add inRepoDetails planned to use for an optimisation most things using stagedDetails were not expecting to get dup files in a conflicted merge and deal with them, so converted them to use inRepoDetails.	2020-07-08 15:36:35 -04:00
Joey Hess	7347e50123	add stage number to stagedDetails parser And convert parser to attoparsec, probably faster. Before, a parse failure threw the whole --stage output line into the filename, which was certainly a bad idea, so fixed that.	2020-07-08 15:05:12 -04:00
Joey Hess	9483b10469	cache one more log file for metadata My worry was that a preferred content expression that matches on metadata would have removed the location log from cache, causing an expensive re-read when a Seek action later checked the location log. Especially when the --all optimisation in the previous commit pre-cached the location log. This also means that the --all optimisation could cache the metadata log too, if it wanted to, but not currently done. The cache is a list, with the most recently accessed file first. That optimises it for the common case of reading the same file twice, eg a get, examine, followed by set reads it twice. And sync --content reads the location log 3 times in a row commonly. But, as a list, it should not be made to be too long. I thought about expanding it to 5 items, but that seemed unlikely to be a win commonly enough to outweigh the extra time spent checking the cache. Clearly there could be some further benchmarking and tuning here.	2020-07-07 14:18:55 -04:00
Joey Hess	e72ec8b9b2	add back git-annex branch read cache The cache was removed way back in 2012, commit `3417c55189` Then I forgot I had removed it! I remember clearly multiple times when I thought, "this reads the same data twice, but the cache will avoid that being very expensive". The reason it was removed was it messed up the assistant noticing when other processes made changes. That same kind of problem has recently been addressed when adding the optimisation to avoid reading the journal unnecessarily. Indeed, enableInteractiveJournalAccess is run in just the right places, so can just piggyback on it to know when it's not safe to use the cache.	2020-07-06 12:22:33 -04:00
Joey Hess	57cceac569	simplify interface by removing size Add size to the returned key after the fact, unless the remote happened to add it itself.	2020-07-03 14:22:22 -04:00
Joey Hess	85506a7015	import: Added --no-content option, which avoids downloading files from a special remote Only supported by some special remotes: directory I need to check the rest and they're currently missing methods until I do. git-annex sync --no-content does not yet use this to do imports	2020-07-03 13:41:57 -04:00
Joey Hess	b2f4b84d27	clean up some build warnings on windows	2020-07-02 11:34:18 -04:00
Joey Hess	087b7ee66a	Revert "data type that starts off using a set but converts to a bloom filter when large" This reverts commit `7e2c4ed216`. I was not able to use this in the end.. See comment in the previous commit.	2020-07-01 20:12:19 -04:00
Joey Hess	a09937580e	more windows build fixes	2020-07-01 15:22:56 -04:00
Joey Hess	7e2c4ed216	data type that starts off using a set but converts to a bloom filter when large This adds a dep on hashable, but it's a free dependency, since unordered-containers already pulled it in. Using unordered-containers for the set seems to make sense, since it hashes and bloom filter hashes too. (Though different hashes.) I dunno, never quite know if I should use unordered-containers or containers.	2020-07-01 14:06:12 -04:00
Joey Hess	d3d187c869	fix build on windows Annex.GitOverlay was using a module that needs posix to build.	2020-07-01 11:22:15 -04:00
Joey Hess	a59e95a82d	improve "unable to lock down 1 copy" message This is a fairly hard to understand situation for the user. Listing the remotes should help them understand it a bit better. This commit was sponsored by Ethan Aubin.	2020-06-26 13:00:40 -04:00
Joey Hess	b651d3ede0	test: Fix some test cases that assumed git's default branch name git is making that configurable, and configuring it globally would break the test suite in a few places. No other part of git-annex assumes any branch name. Renamed a few placeholders to make that clearer. This commit was sponsored by Jake Vosloo on Patreon.	2020-06-23 16:40:51 -04:00
Joey Hess	7757c0e900	Honor annex.largefiles when importing a tree from a special remote. This commit was sponsored by Martin D on Patreon.	2020-06-23 16:07:18 -04:00
Joey Hess	104b3a9c6a	Build with the http-client-restricted library when available Otherwise use the vendored copy as before. The library is in Debian testing but not stable. Once it reaches stable, the vendored copy can be removed. Did not add it to debian/control because IIRC that's used to build git-annex on stable too, possibly. However, the Debian maintainer will probably want to make the package depend on libghc-http-client-restricted-dev This commit was sponsored by Ilya Shlyakhter on Patreon.	2020-06-22 11:31:31 -04:00
Joey Hess	aa1ad0b7ca	remove redundant imports Clean build under ghc 8.8.3, which seems to do better at finding cases where two imports both provide the same symbol, and warns about one of them. This commit was sponsored by Ilya Shlyakhter on Patreon.	2020-06-22 11:05:34 -04:00
Joey Hess	d5451afc8f	fix deadlock Fix a deadlock that could occur after git-annex got an unlocked file, causing the command to hang indefinitely. Known to happen on vfat filesystems, possibly others. Note that a deadlock is still theoretically possible, if anything smudge --clean does causes it to run the git queue for some other reason. Apparently that doesn't happen, but will need to keep an eye on it.	2020-06-18 12:56:29 -04:00
Joey Hess	96f6aa39dd	add runsGitAnnexChildProcess calls This is all the calls to git-annex that seem capable of possibly locking the same pidlock as their parent. Except possibly for some in the assistant.	2020-06-17 15:31:03 -04:00
Joey Hess	82448bdf39	fix an annex.pidlock issue That made eg git-annex get of an unlocked file hang until the annex.pidlocktimeout and then fail. This fix should be fully thread safe no matter what else git-annex is doing. Only using runsGitAnnexChildProcess in the one place it's known to be a problem. Could audit for all places where git-annex runs itself as a child and add it to all of them, later.	2020-06-17 15:30:59 -04:00
Joey Hess	ad81feb053	fix implicit embedcreds regression Fix bug that made creds not be stored in git when a special remote was initialized with gpg encryption, but without an explicit embedcreds=yes. (Yet another regression introduced in version 7.20200202.7. 5th so far.)	2020-06-16 18:00:19 -04:00
Joey Hess	a76b1ba3d6	local git remote autoinit improvements * Improve display of problems auto-initializing or upgrading local git remotes. * When a local git remote cannot be initialized because it has no git-annex branch or a .noannex file, avoid displaying a message about it.	2020-06-16 13:24:00 -04:00
Joey Hess	8a7c615a8f	import: Avoid using some strange names for temporary keys The ContentIdentifier can contain almost anything, so could have characters that are not fit for the filesystem, or might be longer than a key usually is, or contain a newline, or .... genKeyName deals with those problems. This should not present a back-compat issue, because this is a temporary key used while downloading the imported file, before the real key for it can be generated.	2020-06-11 16:07:36 -04:00
Joey Hess	6b0cb2d732	defer cleaning keys db of old data Avoid creating the keys database during init when there are no unlocked files, to prevent init failing when sqlite does not work in the filesystem.	2020-06-11 15:40:13 -04:00
Joey Hess	24ff5e2b29	use uninterruptibleMask Some recent changes to use mask missed that async exceptions can still be thrown inside it. The goal is to make sure a block of cleanup code runs entirely, w/o being interrupted by an async exception, so use uninterruptibleMask. Also, converted a few to bracket, which is nicer.	2020-06-09 15:02:56 -04:00
Joey Hess	0210e81d83	async exception safety for openFd Audited for openFile and openFd, and this fixes all the ones I found where an async exception could prevent the file getting closed. Except for the lock pool, which is a whole other can of worms.	2020-06-05 15:48:00 -04:00
Joey Hess	319f2a4afc	audit all uses of SomeException to avoid catching async exceptions Except for the assistant, which I think may use them between threads? Most of the uses of SomeException were already catching only async exceptions. But I did find a few places that were accidentally catching them.	2020-06-05 15:16:57 -04:00
Joey Hess	2bff3b7c49	init: When annex.pidlock is set, skip lock probing.	2020-06-05 11:12:16 -04:00
Joey Hess	1d41ae5d2a	init warning on stalled lock probe init: If lock probing stalls for a long time (eg a broken NFS server), display a message to let the user know what's taking so long.	2020-06-05 11:06:19 -04:00
Joey Hess	2670890b17	convert to withCreateProcess for async exception safety This handles all createProcessSuccess callers, and aside from process pools, the complete conversion of all process running to async exception safety should be complete now. Also, was able to remove from Utility.Process the old API that I now know was not a good idea. And proof it was bad: The code size went down, despite there being a fair bit of boilerplate for some future API to reduce.	2020-06-04 15:45:52 -04:00
Joey Hess	438dbe3b66	convert to withCreateProcess for async exception safety This handles all sites where checkSuccessProcess/ignoreFailureProcess is used, except for one: Git.Command.pipeReadLazy That one will be significantly more work to convert to bracketing. (Also skipped Command.Assistant.autoStart, but it does not need to shut down the processes it started on exception because they are git-annex assistant daemons..) forceSuccessProcess is done, except for createProcessSuccess. All call sites of createProcessSuccess will need to be converted to bracketing. (process pools still todo also)	2020-06-04 12:44:09 -04:00
Joey Hess	2dc7b5186a	convert to withCreateProcess for async exception safety	2020-06-04 12:05:25 -04:00
Joey Hess	92f775eba0	convert to withCreateProcess for async exception safety Not yet 100% done, so far I've grepped for waitForProcess and converted everything that uses that to start the process with withCreateProcess. Except for some things like P2P.IO and Assistant.TransferrerPool, and Utility.CoProcess, that manage a pool of processes. See #2 in https://git-annex.branchable.com/todo/more_extensive_retries_to_mask_transient_failures/#comment-209f8a8c38e63fb3a704e1282cb269c7 for how those will need to be dealt with. checkSuccessProcess, ignoreFailureProcess, and forceSuccessProcess call waitForProcess, so callers of them will also need to be dealt with, and have not been yet.	2020-06-03 15:48:09 -04:00
Joey Hess	89b2542d3c	annex.skipunknown with transition plan Added annex.skipunknown git config, that can be set to false to change the behavior of commands like `git annex get foo*`, to not skip over files/dirs that are not checked into git and are explicitly listed in the command line. Significant complexity was needed to handle git-annex add, which uses some git ls-files calls, but needs to not use --error-unmatch because of course the files are not known to git. annex.skipunknown is planned to change to default to false in a git-annex release in early 2022. There's a todo for that.	2020-05-28 15:55:17 -04:00
Joey Hess	484a74f073	auto-init autoenable=yes Try to enable special remotes configured with autoenable=yes when git-annex auto-initialization happens in a new clone of an existing repo. Previously, git-annex init had to be explicitly run to enable them. That was a bit of a wart of a special case for users to need to keep in mind. Special remotes cannot display anything when autoenabled this way, to avoid interfering with the output of git-annex query commands. Any error messages will be hidden, and if it fails, nothing is displayed. The user will realize the remote isn't enabled when they try to use it, and can run git-annex init manually then to try the autoenable again and see what failed. That seems like a reasonable approach, and it's less complicated than communicating something across a pipe in order to display it as a side message. Other reason not to do that is that, if the first command the user runs is one like git-annex find that has machine readable output, any message about autoenable failing would need to not be displayed anyway. So better to not display a failure message ever, for consistency. (Had to split out Remote.List.Util to avoid an import cycle.)	2020-05-27 12:40:35 -04:00
Joey Hess	0a9a3ed1c3	left an unhandled case in previous commit	2020-05-15 14:31:50 -04:00
Joey Hess	3334d3831b	change retrieveExport and getKey to throw exception retrieveExport is part of ongoing transition to make remote methods throw exceptions, rather than silently hide them. getKey very rarely fails, and when it does it's always for the same reason (user configured annex.backend to url for some reason). So, this will avoid dealing with Nothing everywhere it's used. This commit was sponsored by Ilya Shlyakhter on Patreon.	2020-05-15 13:45:53 -04:00
Joey Hess	c1cd402081	make storeKey throw exceptions When storing content on remote fails, always display a reason why. Since the Storer used by special remotes already did, this mostly affects git remotes, but not entirely. For example, if git-lfs failed to connect to the endpoint, it used to silently return False.	2020-05-13 14:03:00 -04:00
Joey Hess	5f5170b22b	remove SafeFilePath Move sanitizeFilePath call to where fromSafeFilePath had been.	2020-05-11 14:04:56 -04:00
Joey Hess	cabbc91b18	addurl, importfeed: Allow '-' in filenames, as long as it's not the first character	2020-05-11 13:50:49 -04:00


1473 commits