git-annex

Author	SHA1	Message	Date
Joey Hess	f45ad178cb	more RawFilePath conversion At 318/645 after 4k lines of changes This commit was sponsored by Jake Vosloo on Patreon.	2020-10-29 12:03:50 -04:00
Joey Hess	e505c03bcc	more RawFilePath conversion nukeFile replaced with removeWhenExistsWith removeLink, which allows using RawFilePath. Utility.Directory cannot use RawFilePath since setup does not depend on posix. This commit was sponsored by Graham Spencer on Patreon.	2020-10-29 10:50:29 -04:00
Joey Hess	8d66f7ba0f	more RawFilePath conversion Added a RawFilePath createDirectory and kept making stuff build. Up to 296/645 This commit was sponsored by Mark Reidenbach on Patreon.	2020-10-28 17:25:59 -04:00
Joey Hess	6c29817748	RawFilePath version of getCurrentDirectory This commit was sponsored by Jochen Bartl on Patreon	2020-10-28 16:03:45 -04:00
Joey Hess	08cbaee1f8	more RawFilePath conversion Most of Git/ builds now. Notable win is toTopFilePath no longer double converts This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2020-10-28 15:55:30 -04:00
Joey Hess	d6e94a6b2e	got configure working after Utility.Path ByteString conversion Had to split out some modules because getWorkingDirectory needs unix, which is not a build-dep of configure. This commit was sponsored by Brock Spratlen on Patreon.	2020-10-28 15:01:19 -04:00
Joey Hess	e219aadbab	convert to RawByteString This will break a lot of stuff that uses it, but once fixed should lead to better performance. Mostly mechanical. Changes of note: * upFrom now uses isPathSeparator, which is better on Windows where there is not just one * splitShortExtensions used to take the length of a string, which would count wide unicode characters as a single character. Changing to B.length changes that. Note that, git-annex's annexMaxExtensionLength already changed to the length in bytes before this change. This function is only used in generating views, and the small behavior change should not be a problem. * relHome still uses FilePath because it didn't seem worth changing(?) This commit was sponsored by Jack Hill on Patreon.	2020-10-28 14:32:45 -04:00
Joey Hess	9a5cd96f0d	Fix a memory leak introduced in the last release The problem was this line: cleanup = and <$> sequence (map snd v) That caused all of v to be held onto until the end, when the cleanup action was run. I could not seem to find a bang pattern that avoided the leak, so I resorted to an IORef, rather clunky, but not a performance problem because it will only be written once per git ls-files, so typically just 1 time. This commit was sponsored by Mark Reidenbach on Patreon.	2020-10-13 16:31:01 -04:00
Joey Hess	d54dd0ef9c	Fix build on Windows with network-3 inet_addr was removed, but all this needs is localhost, so hardcoding it should work fine. It may be that this windows ifdef is no longer needed. It was added in 2013 with a note that getAddrInfo didn't work on windows, but it seems likely such a problem would have been fixed since.	2020-10-08 10:50:39 -04:00
Joey Hess	fd81dd912b	change from deprecated and removed aNY_PORT to defaultPort (Both are just 0 internally.)	2020-10-08 10:35:16 -04:00
Joey Hess	30e3a2e4c4	remove unused define	2020-10-08 10:31:03 -04:00
Joey Hess	4c32499e82	Parse youtube-dl progress output Which lets progress be displayed when doing concurrent downloads. Among other things, like --json-progress etc. The youtube-dl output is no longer displayed, except for any errors. This commit was sponsored by Denis Dzyubenko on Patreon.	2020-09-29 17:53:48 -04:00
Joey Hess	15c1ee16d9	import --no-content: Check annex.largefiles Import small files into git, the same as is done when importing with content. Which means, for small files, --no-content does download them. If the largefiles expression needs the file content available (due to mimetype or mimeencoding being used), the import will fail. This commit was sponsored by Jake Vosloo on Patreon.	2020-09-28 13:28:57 -04:00
Joey Hess	f624876dc2	remove zombie process in file seeking This was the last one marked as a zombie. There might be others I don't know about, but except for in the hypothetical case of a thread dying due to an async exception before it can wait on a process it started, I don't know of any. It would probably be safe to remove the reapZombies now, but let's wait and do that in its own commit in case it turns out to cause problems. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2020-09-25 11:38:42 -04:00
Joey Hess	5117ae8aec	fix build warning	2020-09-25 11:07:41 -04:00
Joey Hess	c1b4d76e6b	make MatchFiles introspectable matchNeedsFileContent is not used yet, but shows how to add information about terminals. That one would be needed for https://git-annex.branchable.com/todo/sync_fast_import/ Note the tricky bit in Annex.FileMatcher.call where it folds over the included matcher to propagate the information. This commit was sponsored by Svenne Krap on Patreon.	2020-09-24 14:01:53 -04:00
Joey Hess	68f9766544	Improve --debug output to show pid of processes that are started and stopped getPid returns Nothing if the process has already been stopped, and in that case, the pid will not be displayed. I think that would only happen if waitForProcess or similar gets called more than once on the same process handle though. getPid on unix has an overhead of only a MVar read. On Windows it needs to make a syscall, so will be probably more expensive. While the added expense happens even when debug logging is disabled, it should be small enough compared with the overhead of starting a process that it's not a problem. (It does occur to me that a debugM that took an IO String could only run it when debugging is really enabled, which would improve performance. It does not seem possible to use the current hslogger interface to do that though; it does not expose the information that would be needed.)	2020-09-24 12:39:57 -04:00
Joey Hess	83df401d93	Merge branch 'batchasync' into master	2020-09-16 13:02:58 -04:00
Joey Hess	3a05d53761	add SeekInput (not yet used) No behavior changes (hopefully), just adding SeekInput and plumbing it through to the JSON display code for later use. Over the course of 2 grueling days. withFilesNotInGit reimplemented in terms of seekHelper should be the only possible behavior change. It seems to test as behaving the same. Note that seekHelper dummies up the SeekInput in the case where segmentPaths' gives up on sorting the expanded paths because there are too many input paths. When SeekInput later gets exposed as a json field, that will result in it being a little bit wrong in the case where 100 or more paths are passed to a git-annex command. I think this is a subtle enough problem to not matter. If it does turn out to be a problem, fixing it would require splitting up the input parameters into groups of < 100, which would make git ls-files run perhaps more than is necessary. May want to revisit this, because that fix seems fairly low-impact.	2020-09-15 15:41:13 -04:00
Joey Hess	ddf963d019	deepseq all things returned from ResourceT http Potentially fixes https://git-annex.branchable.com/bugs/concurrent_git-annex-copy_to_s3_special_remote_fails/ although I don't know if it does. My thinking is, ResourceT may allocate a resource and then free it, and an unforced thunk to that resource could result in reading memory that has since been overwritten by something else, or in a SEGV, depending. While that seems kind of like a bug in ResourceT to me, if it is what's happening, this will avoid it. If it's not, this doesn't really hurt much since the values are all smallish. This commit was sponsored by Graham Spencer on Patreon.	2020-09-14 18:30:06 -04:00
Joey Hess	efd2f1a918	avoid failure when gpgconf is not in path	2020-09-08 12:21:24 -04:00
Joey Hess	e88ab0c09d	improve comment	2020-09-07 15:39:07 -04:00
Joey Hess	5d1b28c79c	avoid build warning	2020-09-07 15:10:09 -04:00
Joey Hess	820d4368b3	remove unused isSymLink That made Utility.FileMode depend on unix, but Utility.Tmp now depends on it, and is used by Setup, which does not. So it was easiest to remove this, especially since it's not used.	2020-09-02 14:59:35 -04:00
Joey Hess	376de6fcce	fix build	2020-09-02 14:59:22 -04:00
Joey Hess	22a7f14e7b	better dummy value no need for empty string when it's going to be thrown away..	2020-09-02 14:55:43 -04:00
Joey Hess	6e9a4f50f3	make viaTmp honor umask Fixed several cases where files were created without file mode bits that the umask would usually set. This included exports to the directory special remote, torrent files used by the bittorrent special remote, hooks written by git-annex init, and some log files in .git/annex/ Audited all calls, looking for ones that didn't want the umask bits to be set. All such turned out to already set the specific restrictive file mode they wanted.	2020-09-02 14:54:07 -04:00
Joey Hess	6361f7c310	make removeAuthorizedKeys robust if the file DNE Noticed this could potentially crash, although the only thing using it would normally create the file first, if something then deleted it..	2020-09-02 14:37:42 -04:00
Joey Hess	eed20fe3b7	fix some file modes in calls to withTmpFileIn to honor umask Also audited for other calls to openTempFile, and all are ok, except for viaTmp which will need further work. Remote.Directory fixed to set umask mode when writing to an export, although it has another one using viaTmp that's not fixed. Will make exports that are published via a http server running as another user work, for example. Remote.BitTorrent fixed to set umask mode when downloading the torrent file. Normally this does not matter as that file does not hang around after the download, but if a bittorrent download were started by one user, got interrupted and then another user ran it, this will let them access the torrent file created by the first user.	2020-09-02 14:36:08 -04:00
Joey Hess	854cd2ad47	httpalso: support exporttree=yes Also tested what happens if the other special remote has importtree=yes and exporttree=yes, and in that case, download via httpalso works too, without needing to implement any importtree methods here. It might be possible to make it automatically set exporttree=yes if the --sameas does. Didn't try, will probably be layering issues. Or perhaps it should be inherited by sameas like some other configs? But then, wouldn't it also make sense to inherit importtree=yes? But as shown here, it's not needed by this kind of remote.	2020-09-02 11:26:00 -04:00
Joey Hess	cde3e5eb0c	test: Stop gpg-agent daemons that are started for the test framework's gpg key They normally shutdown when the GNUPGHOME directory is deleted, but on NFS they keep the directory from being deleted. And also, this avoids a number of them piling up while the test suite is running.	2020-08-28 14:28:42 -04:00
Joey Hess	cb3916ae8a	remove build warning about old process version The timeout features never materialized.	2020-08-28 11:17:30 -04:00
Joey Hess	b68f214312	Display a message when git-annex has to wait for a pid lock file held by another process	2020-08-26 13:05:34 -04:00
Joey Hess	7bdb0cdc0d	add gitAnnexChildProcess and use instead of incorrect use of runsGitAnnexChildProcess Fixes reversion in 8.20200617 that made annex.pidlock being enabled result in some commands stalling, particularly those needing to autoinit. Renamed runsGitAnnexChildProcess to make clearer where it should be used. Arguably, it would be better to have a way to make any process git-annex runs have the env var set. But then it would need to take the pid lock when running any and all processes, and that would be a problem when git-annex runs two processes concurrently. So, I'm left doing it ad-hoc in places where git-annex really does run a child process, directly or indirectly via a particular git command.	2020-08-25 14:57:49 -04:00
Joey Hess	4c58433c48	avoid using MonadFail in ParseDuration There's no instance for Either String, so that makes it not as useful as it could be, so instead just return an Either String.	2020-08-15 15:53:35 -04:00
Joey Hess	4466c1001d	improve slightly This probably avoids the situation that caused the exception to be thrown. It also makes sure that both threads end up canceled in the end, while before the exception from wait outt could have caused errt to never be waited on.	2020-08-10 16:33:58 -04:00
Joey Hess	c59a51a065	discard any exception thrown while trying to kill worker threads Since there's a race here, and since Kyle saw an exception leak out, which I have not been able to reproduce. See my comment for what I think might be going on. Note that, I used tryNonAsync, because it seems a later tryNonAsync caught the exception. I don't actually understand how it did, as I understand exception classification, it's the data type, not the way it was thrown. One possibility is that the async exception may have been wrapped in some other, non-async exception, and Show displayed it the same way.	2020-08-10 16:24:51 -04:00
Joey Hess	a6af887a19	couple more exports needed	2020-08-10 14:18:00 -04:00
Joey Hess	6661eeda96	one more export needed	2020-08-10 13:46:20 -04:00
Joey Hess	3fcf478c19	fix build with old versions of process	2020-08-10 13:21:40 -04:00
Joey Hess	9994c5882c	further change to support dlist-1.0 Its tail no longer yields a DList.	2020-08-05 10:37:14 -04:00
Joey Hess	c4ec52b9ae	Slightly sped up the linux standalone bundle Reduce the number of directories listed in libdirs, which makes the linker check a lot less dead ends looking for directories. Eliminated some directories that didn't really contain shared libraries, or only contained the linker. That left only 2, one in lib and one in usr/lib, so consolidate those two. Doing it this way, rather than just consolidating all libs that might exist into a single directory means that, if there are optimised versions of some libs, eg in lib/subarch/foo.so, and lib/subarch2/foo.so, they don't get moved around in a way that would make the linker pick the wrong one.	2020-07-31 14:42:03 -04:00
Joey Hess	f75be32166	external backends wip It's able to start them up, the only thing not implemented is generating and verifying keys. And, the key translation for HasExt.	2020-07-29 15:23:18 -04:00
Joey Hess	aa492bc659	Fix a hang when using git-annex with an old openssh 7.2p2 This does mean a 2 second delay after transfers when using that ssh, but it's an old and apparently quite weirdly broken version of ssh.	2020-07-22 11:04:33 -04:00
Joey Hess	ac56a5c2a0	Fix a lock file descriptor leak that could occur when running commands like git-annex add with -J Bug was introduced as part of a different FD leak fix in version 6.20160318.	2020-07-21 15:30:47 -04:00
Joey Hess	798fdad660	fix build with dlist-1.0 That removed the list function. This new implementation appears to actually be more efficient anyway, since it avoids toList.	2020-07-21 12:58:51 -04:00
Joey Hess	2234a1d64a	document	2020-07-19 21:31:06 -04:00
Joey Hess	4c9ad1de46	optimisation: stream keys through git cat-file --buffer This is only implemented for git-annex get so far. It makes git-annex get nearly twice as fast in a repo with 10k files, all of them present! But, see the TODO for some caveats.	2020-07-10 13:54:52 -04:00
Joey Hess	f63a7aa0e7	fix headTList to drop the head item	2020-07-10 13:02:32 -04:00
Joey Hess	de3d7d044d	make catObjectStream support newline and carriage return in filenames Turns out the %(rest) trick was not needed. Instead, just maintain a list of files we've asked for, and each cat-file response is for the next file in the list. This actually benchmarks 25% faster than before! Very surprising, but it must be due to needing to shove less data through the pipe, and parse less.	2020-07-08 13:49:03 -04:00
Joey Hess	d66fc1a464	Revert "async exception safety for coprocesses" This reverts commit `7013798df5`.	2020-07-06 15:11:28 -04:00
Joey Hess	38c0057cc6	fix build on windows	2020-07-01 16:53:50 -04:00
Joey Hess	83df96d1f5	fix build on windows	2020-06-30 11:01:28 -04:00
Joey Hess	104b3a9c6a	Build with the http-client-restricted library when available Otherwise use the vendored copy as before. The library is in Debian testing but not stable. Once it reaches stable, the vendored copy can be removed. Did not add it to debian/control because IIRC that's used to build git-annex on stable too, possibly. However, the Debian maintainer will probably want to make the package depend on libghc-http-client-restricted-dev This commit was sponsored by Ilya Shlyakhter on Patreon.	2020-06-22 11:31:31 -04:00
Joey Hess	01eb863a14	Build with the git-lfs library when available Otherwise use the vendored copy as before. The library is in Debian testing but not stable. Once it reaches stable, the vendored copy can be removed. Did not add it to debian/control because IIRC that's used to build git-annex on stable too, possibly. However, the Debian maintainer will probably want to make the package depend on libghc-git-lfs-dev. This commit was sponsored by Ilya Shlyakhter on Patreon.	2020-06-22 11:21:25 -04:00
Joey Hess	aa1ad0b7ca	remove redundant imports Clean build under ghc 8.8.3, which seems to do better at finding cases where two imports both provide the same symbol, and warns about one of them. This commit was sponsored by Ilya Shlyakhter on Patreon.	2020-06-22 11:05:34 -04:00
Joey Hess	6ef62cb3c7	fix unused import warning Network.HTTP.Client exports makeConnection since 0.5.3. Debian stable has a newer version than 0.5.3, so bumping the min version seems better than adding an ifdef.	2020-06-22 10:55:37 -04:00
Joey Hess	82448bdf39	fix a annex.pidlock issue That made eg git-annex get of an unlocked file hang until the annex.pidlocktimeout and then fail. This fix should be fully thread safe no matter what else git-annex is doing. Only using runsGitAnnexChildProcess in the one place it's known to be a problem. Could audit for all places where git-annex runs itself as a child and add it to all of them, later.	2020-06-17 15:30:59 -04:00
Joey Hess	9fb549b3f1	fix strictness issue Recent changes to Utility.Gpg exposed a strictness bug in how Creds uses it.	2020-06-16 17:09:34 -04:00
Joey Hess	c8ff3e082e	add debug logging wrapper for withCreateProcess	2020-06-11 16:43:24 -04:00
Joey Hess	24ff5e2b29	use uninterruptibleMask Some recent changes to use mask missed that async exceptions can still be thrown inside it. The goal is to make sure a block of cleanup code runs entirely, w/o being interrupted by an async exception, so use uninterruptibleMask. Also, converted a few to bracket, which is nicer.	2020-06-09 15:02:56 -04:00
Joey Hess	7013798df5	async exception safety for coprocesses Tested the forcerestart code path and it works. The hairy part is, what if an async exception is caught when it's in restart? If it's in the part that stops the old process, the old process is left in the handle. The next attempt to use the CoProcessHandle will then throw an IO exception, which will result in restart getting run again. So I think this will work, but have not actually tested it. The use of withMVarMasked lets it start the new process and fill the mvar with it, even if there's an async exception at that point. Note that exceptions are masked while running forcerestart, so do not need to worry about an async exception being thrown while it's recovering from an async exception.	2020-06-09 13:44:23 -04:00
Joey Hess	0210e81d83	async exception safety for openFd Audited for openFile and openFd, and this fixes all the ones I found where an async exception could prevent the file getting closed. Except for the lock pool, which is a whole other can of worms.	2020-06-05 15:48:00 -04:00
Joey Hess	660d8d3a87	simpler way to do this Remove old code that can be trivially implemented using async in a much nicer way (that is async exception safe). I've audited all forkOS calls (except for ones in the assistant), and this was the last remaining one that is not async exception safe. The rest look ok to me.	2020-06-05 14:18:06 -04:00
Joey Hess	074260f036	async exception safety Use async/cancel so helper threads are not left running. Bracket createPipe to ensure the handles get closed.	2020-06-05 14:08:46 -04:00
Joey Hess	a0d09f1d9e	bracket createPipe for async exception safety	2020-06-05 13:58:04 -04:00
Joey Hess	b7619414bf	support building with process-1.6.3 again	2020-06-05 11:40:18 -04:00
Joey Hess	b329cf32ec	fix reversion Accidentally lost the handle setup in processTranscript.	2020-06-05 10:40:41 -04:00
Joey Hess	2670890b17	convert to withCreateProcess for async exception safety This handles all createProcessSuccess callers, and aside from process pools, the complete conversion of all process running to async exception safety should be complete now. Also, was able to remove from Utility.Process the old API that I now know was not a good idea. And proof it was bad: The code size went down, despite there being a fair bit of boilerplate for some future API to reduce.	2020-06-04 15:45:52 -04:00
Joey Hess	e4993b4456	async exception safety Convert to withCreateProcess (missed this one a couple commits ago) and also make sure that the child thread gets canceled on exception.	2020-06-04 12:57:22 -04:00
Joey Hess	bd3074643b	remove unused createBackgroundProcess	2020-06-04 12:48:42 -04:00
Joey Hess	20557cf0ef	stop exporting createProcessChecked Yay, that had an ugly comment associated with it.	2020-06-04 12:46:55 -04:00
Joey Hess	438dbe3b66	convert to withCreateProcess for async exception safety This handles all sites where checkSuccessProcess/ignoreFailureProcess is used, except for one: Git.Command.pipeReadLazy That one will be significantly more work to convert to bracketing. (Also skipped Command.Assistant.autoStart, but it does not need to shut down the processes it started on exception because they are git-annex assistant daemons..) forceSuccessProcess is done, except for createProcessSuccess. All call sites of createProcessSuccess will need to be converted to bracketing. (process pools still todo also)	2020-06-04 12:44:09 -04:00
Joey Hess	92f775eba0	convert to withCreateProcess for async exception safety Not yet 100% done, so far I've grepped for waitForProcess and converted everything that uses that to start the process with withCreateProcess. Except for some things like P2P.IO and Assistant.TransferrerPool, and Utility.CoProcess, that manage a pool of processes. See #2 in https://git-annex.branchable.com/todo/more_extensive_retries_to_mask_transient_failures/#comment-209f8a8c38e63fb3a704e1282cb269c7 for how those will need to be dealt with. checkSuccessProcess, ignoreFailureProcess, and forceSuccessProcess call waitForProcess, so callers of them will also need to be dealt with, and have not been yet.	2020-06-03 15:48:09 -04:00
Joey Hess	31d53587d5	generalize withNullHandle to MonadIO	2020-06-03 15:18:48 -04:00
Joey Hess	1f2e2d15e8	async exception safety Convert to withCreateProcess and concurrently, both of which handle cleaning up when there's an async exception thrown to the thread running this.	2020-06-03 13:19:28 -04:00
Joey Hess	94986fb228	convert to withCreateProcess Makes it stop the command if the consumer gets killed. Also, it seems that the old version expected bracketOnError to return the False from the error handler, but it does not, it would have thrown the exception and ignored the False. That's fixed, it will now return False when there is an exception.	2020-06-03 13:15:01 -04:00
Joey Hess	53263efe4b	simplify This was a pre-withCreateProcess attempt at doing the same thing, so can just call boolSystem now that it uses withCreateProcess. There's a slight behavior change, since it used to wait, after an async exception, for the command to finish, before re-throwing the exception. Now, it rethrows the exception right away. I don't think that impacts any of the users of this.	2020-06-03 13:01:18 -04:00
Joey Hess	e1fc4f7594	make safeCommand stop the process if the thread gets killed And a comment on a todo item that this commit is perhaps the start of solving.	2020-06-03 12:52:11 -04:00
Joey Hess	30ac015b79	add a formatContainsVar function Also, the format function gets faster because it checks for "escaped_" at gen time instead of every time format is called.	2020-05-19 15:35:00 -04:00
Joey Hess	49bf7c8403	typo	2020-05-12 13:59:15 -04:00
Joey Hess	2a8fdfc7d8	Display a warning message when asked to operate on a file inside a directory that's a symbolic link to elsewhere This replicates git's behavior. It adds a few stat calls for the command line parameters, so there is some minor slowdown, but even with thousands of parameters it will not be very noticeable, and git does the same statting in similar circumstances. Note that this does not prevent eg "git annex add symlink"; the symlink will be added to git as usual. And "git annex find symlink" will silently list nothing as well. It's only "symlink/foo" or "subdir/symlink/foo" that triggers the warning.	2020-05-11 15:03:35 -04:00
Joey Hess	6952060665	addurl --preserve-filename and a few related changes * addurl --preserve-filename: New option, uses server-provided filename without any sanitization, but with some security checking. Not yet implemented for remotes other than the web. * addurl, importfeed: Avoid adding filenames with leading '.', instead it will be replaced with '_'. This might be considered a security fix, but a CVE seems unwarranted. It was possible for addurl to create a dotfile, which could change behavior of some program. It was also possible for a web server to say the file name was ".git" or "foo/.git". That would not overwrite the .git directory, but would cause addurl to fail; of course git won't add "foo/.git". sanitizeFilePath is too opinionated to remain in Utility, so moved it. The changes to mkSafeFilePath are because it used sanitizeFilePath. In particular: isDrive will never succeed, because "c:" gets munged to "c_" ".." gets sanitized now ".git" gets sanitized now It will never be null, because sanitizeFilePath keeps the length the same, and splitDirectories never returns a null path. Also, on the off chance a web server suggests a filename of "", ignore that, rather than trying to save to such a filename, which would fail in some way.	2020-05-08 16:22:55 -04:00
Joey Hess	021ed4f1b9	fix some build warnings	2020-05-04 12:44:26 -04:00
Joey Hess	4a6d328ae9	Avoid a test suite failure when the environment does not let gpg be tested Due to eg, too long a path to the agent socket, caused by running gpg in a container where /run is not mounted, and/or some other gpg behavior like unnecessarily making relative paths to its home directory absolute.	2020-04-28 15:47:23 -04:00
Joey Hess	19b5137227	addurl --fast error message improvement addurl: When run with --fast on an url that annex.security.allowed-ip-addresses prevents accessing, display a more useful message. (Also importfeed --fast potentially.)	2020-04-27 13:48:14 -04:00
Joey Hess	45fb7af21c	check-attr resource pool Limited to min of -JN or number of CPU cores, because it will often be CPU bound, once it's read the gitignore file for a directory. In some situations it's more disk bound, but in any case it's unlikely to be the main bottleneck that -J is used to avoid. Eg, when dropping, this is used for numcopies checks, but the main bottleneck will be accessing the remotes to verify presence. So the user might decide to -J32 that, but having 32 check-attr processes would just waste however many filehandles they open, and probably worsen their performance due to CPU contention. Note that, I first tried just letting up to the -JN be started. However, even when it's no bottleneck at all, that still results in all of them being started. Why? Well, all the worker threads start up nearly simultaneously, so there's a thundering herd..	2020-04-21 11:05:57 -04:00
Joey Hess	cee6b344b4	cat-file resource pool Avoid running a large number of git cat-file child processes when run with a large -J value. This implementation takes care to avoid adding any overhead to git-annex when run without -J. When run with -J, there is a small bit of added overhead, to manipulate the resource pool. That optimisation added a fair bit of complexity.	2020-04-20 15:19:31 -04:00
Joey Hess	d5d8259937	ByteString Ref continued Attoparsec parser for diff-tree. Changed fromRef back to producing a String, to avoid needing to convert every use of it. However, this does mean I'm going to miss some opportunities where fromRef is used and the result converted back to a ByteString. Would be worth revisiting that at some point maybe.	2020-04-07 11:54:27 -04:00
Joey Hess	279991604d	started converting Ref from String to ByteString This should make code that reads shas and refs from git faster. Does not compile yet, a lot needs to be done still.	2020-04-06 17:14:49 -04:00
Joey Hess	f6d19b18f6	remove unused imports	2020-03-30 12:11:52 -04:00
Joey Hess	cd5658d972	fix windows build	2020-03-10 13:24:37 -04:00
Joey Hess	6d58ca94d6	some easy createDirectoryUnder conversions	2020-03-05 15:20:10 -04:00
Joey Hess	ebbc5004fa	convert createAnnexDirectory to use createDirectoryUnder It will create foo/.git/annex/, but not foo/.git/ and not foo/. This will avoid it creating an empty path to a repo when a drive is yanked out and the mount point goes away, for example.	2020-03-05 14:33:04 -04:00
Joey Hess	5b022eea87	implemented createDirectoryUnder	2020-03-05 14:10:34 -04:00
Joey Hess	662e5a5db9	small opt to absPath Noticed that it gets the CWD unnecessarily when the path is absolute. I have not benchmarked this, but I guess that the small overhead of isAbsolute is so tiny compared to the system call that it's worth it even if most of the time relative paths are passed to absPath.	2020-03-05 13:52:30 -04:00
Joey Hess	716e573514	split up quickcheck tests for hashes and macs So when one fails, it's clear which one is the problem.	2020-03-02 14:34:48 -04:00
Joey Hess	f6d629e483	changelog and minor style	2020-02-28 12:57:55 -04:00
Peter Simons	73cf523a4b	Fix build with ghc-8.8.x. The 'fail' method has been moved to the 'MonadFail' class. I made the changes so that the code still compiles with previous versions of 'base' that don't have the new MonadFail class exported by Prelude yet.	2020-02-28 12:54:20 -04:00
Joey Hess	9659f1c30f	annex.security.allowed-ip-addresses ports syntax Extended annex.security.allowed-ip-addresses to let specific ports of an IP address to be used, while denying use of other ports.	2020-02-25 15:45:52 -04:00
Joey Hess	029c883713	Merge branch 'master' into v8	2020-02-19 14:32:11 -04:00
Joey Hess	6f90bb7738	handle git-credential prompt in -J mode If git-credential has it cached and does not prompt, this will unfortunately result in a brief flicker, as the displayed console regions are hidden while running it and then re-displayed. Better than a corrupted display. Actually, I tried it and don't see a visible flicker, so probably only over a slow ssh will it be apparent.	2020-01-22 16:42:15 -04:00
Joey Hess	1883f7ef8f	support git remotes that need http basic auth using git credential to get the password One thing this doesn't do is wrap the password prompting inside the prompt action. So with -J, the output can be a bit garbled.	2020-01-22 16:16:19 -04:00
Joey Hess	b68a8d8968	use conversion functions from filepath-bytestring (again) This reverts commit `3a04af7927`.	2020-01-04 20:18:40 -04:00
Joey Hess	2cea674d1e	Merge branch 'master' into v8	2020-01-01 14:26:43 -04:00
Joey Hess	999a6f0541	windows build fix	2020-01-01 13:05:23 -04:00
Joey Hess	39c91f91a9	windows build fix	2020-01-01 12:24:31 -04:00
Joey Hess	e006acc8e3	fix quickcheck failure prop_encode_decode_roundtrip failed on "\175" in C locale. This may be a new problem after the switch to RawFilePath, but it already had filtering for high chars, so changed to only test ascii chars.	2019-12-30 13:54:46 -04:00
Joey Hess	3a04af7927	temporary revert "use conversion functions from filepath-bytestring" This reverts commit `75c40279c1`. Debian unstable is one version too old, so this can be de-reverted in a bit.	2019-12-27 19:29:09 -04:00
Joey Hess	2b821eb225	Merge branch 'master' into sqlite	2019-12-26 15:15:42 -04:00
Joey Hess	444d5591ee	Improve file ordering behavior when one parameter is "." and other parameters are other directories eg, `git-annex get . ..` used to order the files strangely, because it did not realize that when git ls-files outputs eg "foo", that should be grouped with the first set of files and not the second set. Fixed by making dirContains "." "./foo" = True which makes sense, because dirContains ".." "../foo" = True	2019-12-20 18:01:29 -04:00
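The grouping rule described in the commit above can be illustrated with a minimal sketch. This is not the git-annex implementation: `dirContainsSketch` and `segments` are hypothetical names, the code assumes relative POSIX-style paths, and it does not resolve ".." segments that appear in the middle of a path.

```haskell
import Data.List (stripPrefix)

-- Split a relative POSIX-style path on '/', dropping empty and "."
-- segments, so "./foo", "foo" and "foo/" all become ["foo"].
segments :: FilePath -> [String]
segments = filter (\s -> not (null s) && s /= ".") . splitOn '/'
  where
    splitOn c s = case break (== c) s of
      (a, [])       -> [a]
      (a, _ : rest) -> a : splitOn c rest

-- Hypothetical, simplified stand-in for dirContains: True when path b is
-- directory a itself or lies below it. A remainder that starts with ".."
-- escapes a, so it is rejected.
dirContainsSketch :: FilePath -> FilePath -> Bool
dirContainsSketch a b = case stripPrefix (segments a) (segments b) of
  Just rest -> take 1 rest /= [".."]
  Nothing   -> False
```

With this sketch, both `dirContainsSketch "." "./foo"` and `dirContainsSketch "." "foo"` hold, so a bare "foo" from git ls-files is grouped with the "." parameter, while `dirContainsSketch "." "../foo"` does not hold.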
Joey Hess	d5628a16b8	Merge branch 'bs' into sqlite-bs	2019-12-18 14:51:03 -04:00
Joey Hess	75c40279c1	use conversion functions from filepath-bytestring Behavior should be the same, but I'd hope to eventually get rid of most of Utility.FileSystemEncoding and this is a first step.	2019-12-18 13:42:43 -04:00
Joey Hess	322c542b5c	fix ByteString conversion on windows the encode' and decode' functions on Windows should not apply the filesystem encoding, which does not work there. Instead, convert to and from UTF-8. Also, avoid exporting encodeW8 and decodeW8. Both use the filesystem encoding, so won't work as expected on windows.	2019-12-18 13:32:56 -04:00
Joey Hess	c19211774f	use filepath-bytestring for annex object manipulations git-annex find is now RawFilePath end to end, no string conversions. So is git-annex get when it does not need to get anything. So this is a major milestone on optimisation. Benchmarks indicate around 30% speedup in both commands. Probably many other performance improvements. All or nearly all places where a file is statted use RawFilePath now.	2019-12-11 15:25:07 -04:00
Joey Hess	a0168cd9a2	use RawFilePath getSymbolicLinkStatus for speed	2019-12-06 15:42:54 -04:00
Joey Hess	2f9a80d803	merging sqlite and bs branches Since the sqlite branch uses blobs extensively, there are some performance benefits, ByteStrings now get stored and retrieved w/o conversion in some cases like in Database.Export.	2019-12-06 15:30:45 -04:00
Joey Hess	5f391179f1	use RawFilePath getFileStatus for speed Only done on those calls to getFileStatus that had a RawFilePath, not a FilePath. The others would probably be just as fast if converted to use it with toRawFilePath, but I'm not 100% sure. Note that genInodeCache' uses fromRawFilePath, but that value only gets used on Windows, so on unix the thunk will never be evaluated.	2019-12-06 14:44:42 -04:00
Joey Hess	360942ba12	RawFilePath will need to support Windows too Of course, readSymbolicLink always fails on Windows, but now it's ready for other things that don't fail there.	2019-12-06 14:17:48 -04:00
Joey Hess	f39f018ee0	fix git ls-tree parser File mode is octal not decimal. This broke in the conversion to attoparsec. (I've submitted the content of Utility.Attoparsec to the attoparsec developers.) Test suite passes 100% now.	2019-12-06 14:05:48 -04:00
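To illustrate the octal versus decimal distinction in the commit above, here is a small sketch using `readOct` from base rather than attoparsec. The name `parseTreeMode` is hypothetical and this is not the parser git-annex uses.

```haskell
import Numeric (readOct)

-- Parse the file mode field of a `git ls-tree` line. The digits are
-- octal, so "100644" means 0o100644, not one hundred thousand
-- six hundred forty-four.
parseTreeMode :: String -> Maybe Int
parseTreeMode s = case readOct s of
  [(n, "")] -> Just n   -- the whole field must be consumed
  _         -> Nothing  -- empty input or a non-octal digit
```

Reading the same field as decimal would still produce a number, which is why such a bug can pass unnoticed until the mode bits are inspected.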
Joey Hess	37d0f73e66	reword comment	2019-11-27 16:38:18 -04:00
Joey Hess	067aabdd48	wip RawFilePath 2x git-annex find speedup Finally builds (oh the agony of making it build), but still very unmergeable, only Command.Find is included and lots of stuff is badly hacked to make it compile. Benchmarking vs master, this git-annex find is significantly faster! Specifically: num files old new speedup 48500 4.77 3.73 28% 12500 1.36 1.02 66% 20 0.075 0.074 0% (so startup time is unchanged) That's without really finishing the optimization. Things still to do: * Eliminate all the fromRawFilePath, toRawFilePath, encodeBS, decodeBS conversions. * Use versions of IO actions like getFileStatus that take a RawFilePath. * Eliminate some Data.ByteString.Lazy.toStrict, which is a slow copy. * Use ByteString for parsing git config to speed up startup. It's likely several of those will speed up git-annex find further. And other commands will certainly benefit even more.	2019-11-26 16:01:58 -04:00
Joey Hess	6a97ff6b3a	wip RawFilePath Goal is to make git-annex faster by using ByteString for all the worktree traversal. For now, this is focusing on Command.Find, in order to benchmark how much it helps. (All other commands are temporarily disabled) Currently in a very bad unbuildable in-between state.	2019-11-25 16:18:19 -04:00
Joey Hess	1ff889e456	explicit export lists A small amount of dead code removed. All of Utility/ done now. This commit was sponsored by Brock Spratlen on Patreon.	2019-11-23 11:24:10 -04:00
Joey Hess	81d402216d	cache the serialization of a Key This will speed up the common case where a Key is deserialized from disk, but is then serialized to build eg, the path to the annex object. Previously attempted in `4536c93bb2` and reverted in `96aba8eff7`. The problems mentioned in the latter commit are addressed now: Read/Show of KeyData is backwards-compatible with Read/Show of Key from before this change, so Types.Distribution will keep working. The Eq instance is fixed. Also, Key has smart constructors, avoiding needing to remember to update the cached serialization. Used git-annex benchmark: find is 7% faster whereis is 3% faster get when all files are already present is 5% faster Generally, the benchmarks are running 0.1 seconds faster per 2000 files, on a ram disk in my laptop.	2019-11-22 17:49:16 -04:00
Joey Hess	1d0dbdf201	squelch tab warnings	2019-11-22 12:49:41 -04:00
Joey Hess	7263aafd2b	Merge branch 'master' into sqlite	2019-11-22 12:49:35 -04:00
Joey Hess	b82ab21468	missed an export	2019-11-22 12:35:57 -04:00
Joey Hess	a9888f6151	Windows: Fix handling of changes to time zone. Used to work but was broken in version 7.20181031, specifically commit `5ab0f48ffb`. That this was not noticed over at least 1 daylight savings time zone changes makes me wonder if the TSDelta stuff is still needed. Perhaps the mtime on Windows no longer changes when the time zone is changed? (cherry picked from commit `09ee6b0ccb`)	2019-11-21 17:28:18 -04:00
Joey Hess	d4661959de	Merge branch 'master' into sqlite	2019-11-21 17:26:50 -04:00
Joey Hess	8ea5f3ff99	explicit export lists Eliminated some dead code. In other cases, exported a currently unused function, since it was a logical part of the API. Of course this improves the API documentation. It may also sometimes let ghc optimize code better, since it can know a function is internal to a module. 364 modules still to go, according to git grep -E 'module [A-Za-z.]+ where'	2019-11-21 16:08:37 -04:00
Joey Hess	890330f0fe	make --json-error-messages capture url download errors Convert Utility.Url to return Either String so the error message can be displayed in the annex monad and so captured. (When curl is used, its errors are still not caught.)	2019-11-12 13:52:38 -04:00
Joey Hess	09ee6b0ccb	Windows: Fix handling of changes to time zone. Used to work but was broken in version 7.20181031, specifically commit `5ab0f48ffb`. That this was not noticed over at least 1 daylight savings time zone changes makes me wonder if the TSDelta stuff is still needed. Perhaps the mtime on Windows no longer changes when the time zone is changed?	2019-11-06 14:36:49 -04:00
Joey Hess	89bdcffdfa	found a way to extract InodeCache from git index This will allow a race-free database transition. It is somewhat hairy in that it depends on an unspecified git output format.	2019-11-06 14:23:00 -04:00
Joey Hess	4940a135af	eliminate raw sql LIKE query	2019-10-30 15:19:52 -04:00
Joey Hess	94efc400e9	horrible implementation of isInodeKnown The only good thing about it is it does not require a major version bump to improve the database. That will need to happen at some point though. Potentially very very slow in a large repository. Ugly use of raw sql.	2019-10-23 14:37:29 -04:00
Joey Hess	eebf080b33	comment typo	2019-10-23 12:32:46 -04:00
Joey Hess	9a5d9019ba	Deal with pkexec changing to root's home directory when running a command. Wow, that's not documented anywhere, and seems like a major gotcha in pkexec. Broke enable-tor.	2019-10-21 12:39:19 -04:00
Joey Hess	b90ddbc383	enable-tor: Use pkexec to run command as root when gksu and kdesu are not available. gksu is no longer in debian, even stable kdesu in debian is not installed in PATH any longer, though the executable is still present under /usr/lib pkexec is packagekit's replacement for those older commands.	2019-09-30 15:19:01 -04:00
Joey Hess	f2737a5fbe	enable-tor: Run kdesu with -c option.	2019-09-30 15:14:05 -04:00
Joey Hess	ab8a6a82e1	remove unused	2019-09-24 18:16:01 -04:00
Joey Hess	a4750fa537	move haddock block so haddock will build	2019-09-24 18:14:47 -04:00
Joey Hess	71f30d2f07	improve haddock	2019-09-24 18:10:34 -04:00
Joey Hess	bc1b9a2c0a	improved GitLFS api	2019-09-24 18:05:11 -04:00
Joey Hess	6ae0a44c64	git-lfs: Added support for http basic auth	2019-09-24 14:46:20 -04:00
Joey Hess	53fd746705	avoid some build warnings on windows	2019-09-12 14:11:19 -04:00
Joey Hess	9624fe4c37	improve comment	2019-09-01 12:33:19 -04:00
Joey Hess	1558e03014	Refuse to upgrade direct mode repositories when git is older than 2.22 That git fixed a memory leak that could cause an OOM during the upgrade. Most git-annex builds have a new enough git already. OSX git was upgraded with brew. Linux i386ancient build's git was too old. Upgrading it to a fixed git didn't work (due to the newer git not working with the old ssh, https://bugs.chromium.org/p/git/issues/detail?id=7 ) Choices to deal with that were: * Somehow make direct mode upgrade work with the old git, avoiding its OOM problem. One way would be to switch the repo to indirect mode first, and so upgrade to a repo with locked files. Not good when the filesystem does not support symlinks. * backport the OOM fix from git 2.22 (And do what about the version number so git-annex knows it's fixed?) * backport openssh (and possibly more stuff) * move the i386ancient build to at least Debian stretch (still backporting git) But this will make it no longer work with some of the ancient kernels it targets. Of those, backporting the OOM fix seemed the best approach. Put "oomfix" in the git version number to indicate it. 
I have not automated building the git backport, so here's the patch I used: diff -ur orig/git-2.1.4/convert.c git-2.1.4/convert.c --- orig/git-2.1.4/convert.c 2014-12-18 18:42:18.000000000 +0000 +++ git-2.1.4/convert.c 2019-08-29 20:05:04.371872338 +0100 @@ -404,7 +404,7 @@ if (start_async(&async)) return 0; /* error was already reported */ - if (strbuf_read(&nbuf, async.out, len) < 0) { + if (strbuf_read(&nbuf, async.out, 0) < 0) { error("read from external filter %s failed", cmd); ret = 0; } diff -ur orig/git-2.1.4/GIT-VERSION-GEN git-2.1.4/GIT-VERSION-GEN --- orig/git-2.1.4/GIT-VERSION-GEN 2014-12-18 18:42:18.000000000 +0000 +++ git-2.1.4/GIT-VERSION-GEN 2019-08-29 20:06:39.132743228 +0100 @@ -1,7 +1,7 @@ #!/bin/sh GVF=GIT-VERSION-FILE -DEF_VER=v2.1.4 +DEF_VER=v2.1.4.oomfix LF=' ' diff -ur orig/git-2.1.4/configure git-2.1.4/configure --- orig/git-2.1.4/configure 2014-12-18 18:42:19.000000000 +0000 +++ git-2.1.4/configure 2019-08-29 20:27:45.896380015 +0100 @@ -580,8 +580,8 @@ # Identity of this package. PACKAGE_NAME='git' PACKAGE_TARNAME='git' -PACKAGE_VERSION='2.1.4' -PACKAGE_STRING='git 2.1.4' +PACKAGE_VERSION='2.1.4.oomfix' +PACKAGE_STRING='git 2.1.4.oomfix' PACKAGE_BUGREPORT='git@vger.kernel.org' PACKAGE_URL='' diff -ur orig/git-2.1.4/version git-2.1.4/version --- orig/git-2.1.4/version 2014-12-18 18:42:19.000000000 +0000 +++ git-2.1.4/version 2019-08-29 20:06:17.572545210 +0100 @@ -1 +1 @@ -2.1.4 +2.1.4.oomfix	2019-08-29 15:24:41 -04:00
Joey Hess	69cefe8190	followup and display rsync exit status	2019-08-15 14:47:22 -04:00
Joey Hess	05d52f9699	fix display of http exceptions	2019-08-10 11:09:25 -04:00
Joey Hess	868942e19b	fix unused module import warnings when building on windows	2019-08-08 12:18:53 -04:00
Joey Hess	ee72fd2f7d	add exports useful if using this module to write a git-lfs server	2019-08-05 15:40:36 -04:00
Joey Hess	c527ae5887	Merge branch 'master' into git-lfs	2019-08-05 11:48:45 -04:00
Joey Hess	19defc7932	fix reversion `4af55c42bf` reordered the exception catching, preventing following ftp redirect	2019-08-04 14:32:06 -04:00
Joey Hess	4af55c42bf	factored out downloadConduit from download useful when an API provides a Request to download	2019-08-04 12:31:54 -04:00
Joey Hess	5be0a35dae	implemented checkPresent for git-lfs	2019-08-03 12:21:28 -04:00
Joey Hess	f536a0b264	weaken comment I'm seeing the github lfs server request an upload of an object that has already been uploaded to it before. Probably because they offload storage to S3 and so skipped the overhead of checking for an unnecessary upload.	2019-08-03 11:31:02 -04:00
Joey Hess	74e9e3ccf0	add to request headers, don't overwrite	2019-08-03 11:15:08 -04:00
Joey Hess	fc09a41ed1	storing objects in git-lfs is working Still need to record the sha256 and size when they cannot be determined by inspecting the key.	2019-08-02 13:56:55 -04:00
Joey Hess	6c1130a3bb	lfs endpoint discovery and caching in git-lfs special remote	2019-08-02 12:38:14 -04:00
Joey Hess	03a765909c	move IO code out Let's keep this entirely pure. git-annex has its own facilities for running a ssh command, that make it respect various config settings, and cache connections, etc. So better not to have the library run ssh itself.	2019-08-02 10:57:40 -04:00
Joey Hess	2533acc7a2	note about ssh hostname sanitization	2019-08-02 10:40:55 -04:00
Joey Hess	bd6c508334	finalizing lfs module It may eventually move to its own package.	2019-08-01 14:04:56 -04:00
Joey Hess	018b5b8173	Support building with socks-0.6 and persistent-template-2.7 persistent-template now needs UndecidableInstances. socks changed defaultSocksConf to take a SockAddr.	2019-07-30 12:50:48 -04:00
Joey Hess	426053cb6c	Corrected some license statements In `40ecf58d4b` I changed the license of code I wrote from GPL to AGPL. But, two files containing code I wrote combined with code by others were updated to say their license is AGPL, while in fact part of it was (the code I wrote) but part remained under the original license (the code written by others). Remote/Ddar.hs is now changed entirely back to GPL 3. Annex/DirHashes.hs stays AGPL, but I broke out Utility/MD5.hs with the code not written by me, and corrected its license statement to GPL-2, which is the actual version of the GPL included with the code in its original distribution at http://www.cs.ox.ac.uk/people/ian.lynagh/md5/	2019-07-28 14:27:33 -04:00
Joey Hess	7fd650355e	merge from http-client-restricted I made some improvements to its API after splitting it out of git-annex, so merge those back in. This is groundwork for removing the embedded copy of it and depending on it. Also moved the managerResponseTimeout disabling to Annex.Url as it's git-annex specific. This commit was sponsored by Ethan Aubin on Patreon.	2019-07-17 16:48:50 -04:00
Joey Hess	7234b1f9a7	small optimisation to file copying Avoid statting file, just try to remove it. Also a comment to explain why it tries to remove it, which was puzzling me when I revisited this code until I saw that cp fails to overwrite a mode 444 file, including perhaps one left by a previous interrupted cp. This commit was sponsored by Fernando Jimenez on Patreon.	2019-07-17 14:22:21 -04:00
Joey Hess	21ff5e1e5a	CoW probing Improved probing when CoW copies can be made between files on the same drive. Now supports CoW between BTRFS subvolumes. And, falls back to rsync instead of using cp when CoW won't work, eg copies between repos on the same EXT4 filesystem. Rather than trying cp --reflink=always for each file copied to a remote, it's tried once and if it fails it falls back to using rsync thereafter for the lifetime of the Remote object. That avoids overhead of calling cp which while small, will add up over a large number of files. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2019-07-17 14:19:08 -04:00
Joey Hess	0c6b7e288d	Add BLAKE2BP512 and BLAKE2BP512E backends using a blake2 variant optimised for 4-way CPUs This had been deferred because the Debian package of cryptonite, and possibly other builds, was broken for blake2bp, but I've confirmed #892855 is fixed. This commit was sponsored by Brett Eisenberg on Patreon.	2019-07-05 15:30:03 -04:00
Joey Hess	9a5ddda511	remove many old version ifdefs Drop support for building with ghc older than 8.4.4, and with older versions of several haskell libraries than will be included in Debian 10. The only remaining version ifdefs in the entire code base are now a couple for aws! This commit should only be merged after the Debian 10 release. And perhaps it will need to wait longer than that; it would make backporting new versions of git-annex to Debian 9 (stretch) which has been actively happening as recently as this year. This commit was sponsored by Ilya Shlyakhter.	2019-07-05 15:09:37 -04:00
Joey Hess	42c386fc47	add: Display progress meter when hashing files. * add: Display progress meter when hashing files. * add: Support --json-progress option.	2019-06-25 13:12:47 -04:00
Joey Hess	759fd9ea68	avoid url resume from 0 When downloading an url and the destination file exists but is empty, avoid using http range to resume, since a range "bytes=0-" is an unusual edge case that it's best to avoid relying on working. This is known to fix a case where importfeed downloaded a partial feed from such a server. Since importfeed uses withTmpFile, the destination always exists empty, so it would particularly tickle such problem servers. Resuming from 0 is otherwise possible, but unlikely.	2019-06-20 12:26:17 -04:00
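The decision described in the commit above can be sketched as a pure function. The name `resumeRange` is hypothetical and the sketch only covers choosing the header value, not the download itself.

```haskell
-- Given the size in bytes of an existing destination file, decide
-- whether to send an HTTP Range header value to resume the download.
-- An empty file yields Nothing, so "bytes=0-" is never requested and
-- the download simply starts from the beginning.
resumeRange :: Integer -> Maybe String
resumeRange existingSize
  | existingSize > 0 = Just ("bytes=" ++ show existingSize ++ "-")
  | otherwise        = Nothing
```

A caller would add the Range header only when the result is `Just`, which covers the withTmpFile case where the destination always exists but is empty.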
Joey Hess	fe49747fc8	add missing case and fix name shadowing warning	2019-06-04 11:24:32 -04:00
Joey Hess	6136e299a2	add back support for following http to ftp redirects Did not test build with http-client < 0.5 and while I tried to support it, the ifdefed parts may need some fixes.	2019-05-30 16:04:59 -04:00
Joey Hess	67c06f5121	add back support for ftp urls Add back support for ftp urls, which was disabled as part of the fix for security hole CVE-2018-10857 (except for configurations which enabled curl and bypassed public IP address restrictions). Now it will work if allowed by annex.security.allowed-ip-addresses.	2019-05-30 14:51:34 -04:00
Joey Hess	aa7710982b	avoid list lookup by parseToken Minor optimisation to parsing of a preferred content expression.	2019-05-14 13:11:29 -04:00
Joey Hess	00e9e15c70	squelch build warning with old version of quickcheck	2019-05-03 11:02:12 -04:00
Joey Hess	2b52dbe905	fix build with older QuickCheck The NonEmpty instance was moved out of QuickCheck and into a package with more deps than I want to drag in, so I'm providing my own instance, but with older QuickCheck, use theirs to avoid overlapping.	2019-03-22 10:07:16 -04:00
Joey Hess	40ecf58d4b	update licenses from GPL to AGPL This does not change the overall license of the git-annex program, which was already AGPL due to a number of sources files being AGPL already. Legally speaking, I'm adding a new license under which these files are now available; I already released their current contents under the GPL license. Now they're dual licensed GPL and AGPL. However, I intend for all my future changes to these files to only be released under the AGPL license, and I won't be tracking the dual licensing status, so I'm simply changing the license statement to say it's AGPL. (In some cases, others wrote parts of the code of a file and released it under the GPL; but in all cases I have contributed a significant portion of the code in each file and it's that code that is getting the AGPL license; the GPL license of other contributors allows combining with AGPL code.)	2019-03-13 15:48:14 -04:00
Joey Hess	c0bd202147	fix failing test case An empty list of [ContentIdentifier] serialized to the same thing as a single ContentIdentifier "". Avoid this ambiguity by requiring the list be non-empty.	2019-03-06 14:27:15 -04:00
Joey Hess	1ec9e1494c	use relatedTemplate in viaTmp	2019-03-04 14:12:00 -04:00
Joey Hess	4603713b4e	avoid using htonl It got removed from network-3.0.0.0 and nothing in the haskell ecosystem currently provides it (which seems it ought to be fixed). Tested new code on both little-endian and big-endian with: ghci> hostAddressToTuple $ fromJust $ embeddedIpv4 (0,0,0,0,0,0xffff,0x7f00,1) (127,0,0,1)	2019-02-19 12:17:20 -04:00
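One endian-independent way to pull the IPv4 octets out of an IPv4-mapped IPv6 address, without htonl, is to work on the 16-bit groups with shifts and masks. The sketch below is an illustration only: `embeddedIpv4Octets` and `IPv6Groups` are hypothetical names, it does not use the network package, and it assumes only the `::ffff:a.b.c.d` form needs handling.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Word (Word16, Word8)

-- An IPv6 address as eight 16-bit groups, most significant first.
type IPv6Groups =
  (Word16, Word16, Word16, Word16, Word16, Word16, Word16, Word16)

-- Extract the four IPv4 octets from an IPv4-mapped IPv6 address.
-- Shifting and masking operate on values, not on memory layout, so the
-- result is the same on little-endian and big-endian hosts.
embeddedIpv4Octets :: IPv6Groups -> Maybe (Word8, Word8, Word8, Word8)
embeddedIpv4Octets (0, 0, 0, 0, 0, 0xffff, hi, lo) =
  Just (highByte hi, lowByte hi, highByte lo, lowByte lo)
  where
    highByte w = fromIntegral (w `shiftR` 8)
    lowByte w  = fromIntegral (w .&. 0xff)
embeddedIpv4Octets _ = Nothing
```

Applied to the address from the commit message, `(0,0,0,0,0,0xffff,0x7f00,1)`, the sketch yields `Just (127,0,0,1)`.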
Joey Hess	f5f059e288	relocate gpg test framework temp dir to outside repo The gitAnnexTmpOtherDir cleanup made it be deleted too early sometimes, and so the test suite failed. Also there was a report of a similar failure which likely had a similar cause and hopefully this fixes that too.	2019-01-21 14:16:00 -04:00
Joey Hess	e38b654096	Estimated time to completion display shortened from eg "1h1m1s" to "1h1m" Because seconds accuracy over such a time is unlikely to be accurate. Also, it was possible to get a ridiculous "1y1d1h1m1s" if stalled or very slow.	2019-01-21 00:04:35 -04:00
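A shortened duration display of this kind can be sketched as follows. The rule used here, keeping only the two most significant non-zero units, is an assumption that happens to reproduce the "1h1m" example; the commit does not spell out the exact rule, and `renderShort` is a hypothetical name.

```haskell
-- Unit sizes in seconds, largest first (a year is taken as 365 days).
units :: [(Integer, String)]
units = [(31536000, "y"), (86400, "d"), (3600, "h"), (60, "m"), (1, "s")]

-- Render a duration in seconds, keeping at most the two most
-- significant non-zero units (assumed rule, see the lead-in).
renderShort :: Integer -> String
renderShort secs
  | null nonzero = "0s"
  | otherwise    = concat (take 2 nonzero)
  where
    nonzero = [show n ++ u | (n, u) <- go secs units, n > 0]
    go _ [] = []
    go remaining ((size, u) : rest) =
      (remaining `div` size, u) : go (remaining `mod` size) rest
```

Under this rule 3661 seconds renders as "1h1m" instead of "1h1m1s", and a stalled transfer estimated at over a year renders as "1y1d" rather than a five-unit string.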
Joey Hess	96aba8eff7	Revert "cache the serialization of a Key" This reverts commit `4536c93bb2`. That broke Read/Show of a Key, and unfortunately Key is read in at least one place; the GitAnnexDistribution data type. It would be worth bringing this optimisation back, but it would need either a custom Read/Show instance that preserves back-compat, or wrapping Key in a data type that contains the serialization, or changing how GitAnnexDistribution is serialized. Also, the Eq instance would need to compare keys with and without a cached serialization the same.	2019-01-16 16:21:59 -04:00
Joey Hess	0e44985210	remove duplicate import	2019-01-14 18:26:38 -04:00
Joey Hess	e0c4ac99b5	convert serializeKey' to strict ByteString The builder produces a lazy ByteString, and L.toStrict has to copy it, but needing to use the builder is no longer the common case; the serialization will normally be cached already as a strict ByteString, and this avoids keyFile' needing to use L.toStrict . serializeKey'	2019-01-14 17:03:46 -04:00
Joey Hess	5d98cba923	use ByteStrings when reading annex symlinks and pointers Now there's a ByteString used all the way from disk to Key. The main complication in this conversion was the use of fromInternalGitPath in several places to munge things on Windows. The things that used that were changed to parse the ByteString using either path separator. Also some code that had read from files to a String lazily was changed to read a minimal strict ByteString.	2019-01-14 15:37:08 -04:00
Joey Hess	fc21cccf1c	slight optimisation more	2019-01-11 19:56:31 -04:00
Joey Hess	16c798b5ef	switch MetaValue to ByteString and MetaField to Text MetaField was already limited to alphanumerics, so it makes sense to use Text for it. Note that technically a UUID can contain invalid UTF-8, and so remoteMetaDataPrefix's use of T.pack . fromUUID could replace non-UTF8 values with '?' or whatever. In practice, a UUID is usually also text, I only kept open the possibility of it containing invalid UTF-8 to avoid breaking parsing of strange UUIDs in git-annex branch files. So, I decided to let this edge case slip by. Have not updated the rest of the code base yet for this change, as the change took 2.5 hours longer than I expected to get working properly.	2019-01-07 14:18:24 -04:00
Joey Hess	a80922a594	support for ByteStrings	2019-01-07 12:29:25 -04:00
Joey Hess	7d51b0c109	import Utility.FileSystemEncoding in Common	2019-01-03 11:37:02 -04:00
Joey Hess	f574d8af10	comment typo	2019-01-03 00:22:05 -04:00
Joey Hess	3ba6e9bb96	use attoparsec parser for String parsing, 10x speedup This is not as efficient as using ByteStrings throughout, but converting the String to ByteString is actually significantly faster than the old parser. benchmarking parse/old time 9.657 μs (9.600 μs .. 9.732 μs) 1.000 R² (0.999 R² .. 1.000 R²) mean 9.703 μs (9.645 μs .. 9.785 μs) std dev 231.6 ns (161.5 ns .. 323.7 ns) variance introduced by outliers: 25% (moderately inflated) benchmarking parse/new time 834.6 ns (797.1 ns .. 886.9 ns) 0.987 R² (0.976 R² .. 0.999 R²) mean 816.4 ns (802.7 ns .. 845.1 ns) std dev 62.39 ns (37.66 ns .. 108.4 ns) variance introduced by outliers: 82% (severely inflated) There is a small behavior change from the old parsePOSIXTime, which accepted any amount of trailing whitespace after the timestamp. That behavior was not documented, and it doesn't seem anything relied on it.	2019-01-02 13:28:44 -04:00
Joey Hess	3c74dcd4e1	attoparsec parser for POSIXTime (Not yet used anywhere.) Benchmarking {-# LANGUAGE OverloadedStrings #-} import Criterion.Main import Utility.TimeStamp import Data.Attoparsec.ByteString main = defaultMain [ bgroup "parse" [ bench "new" $ whnf (parseOnly (parserPOSIXTime <* endOfInput)) "1431286201.113452s" , bench "old" $ whnf parsePOSIXTime "1431286201.113452s" ] ] benchmarking parse/new time 643.6 ns (640.2 ns .. 646.7 ns) 1.000 R² (0.999 R² .. 1.000 R²) mean 645.3 ns (642.1 ns .. 650.9 ns) std dev 14.59 ns (9.194 ns .. 22.07 ns) variance introduced by outliers: 29% (moderately inflated) benchmarking parse/old time 9.657 μs (9.600 μs .. 9.732 μs) 1.000 R² (0.999 R² .. 1.000 R²) mean 9.703 μs (9.645 μs .. 9.785 μs) std dev 231.6 ns (161.5 ns .. 323.7 ns) variance introduced by outliers: 25% (moderately inflated) So old took 9703 ns to parse, and new 643 ns.	2019-01-02 12:48:53 -04:00
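For readers without attoparsec at hand, the shape of such a timestamp parser can be sketched with `ReadP` from base. This is a stand-in, not the `parserPOSIXTime` from the commit: the names `posixTimeP` and `parsePosixTimeSketch` are hypothetical, and the accepted syntax (digits, an optional fractional part, an optional trailing "s", then end of input) is an assumption inferred from the benchmark input "1431286201.113452s".

```haskell
import Data.Char (isDigit)
import Data.Ratio ((%))
import Text.ParserCombinators.ReadP

-- Parse seconds since the epoch with an optional fractional part and an
-- optional trailing 's'. Requiring eof means trailing whitespace is
-- rejected, matching the behavior change noted for the new parser.
posixTimeP :: ReadP Rational
posixTimeP = do
  whole <- munch1 isDigit
  frac <- option "" (char '.' *> munch1 isDigit)
  _ <- optional (char 's')
  eof
  pure (read whole % 1 + fracValue frac)
  where
    fracValue "" = 0
    fracValue ds = read ds % (10 ^ length ds)

-- Run the parser, accepting only a single complete parse.
parsePosixTimeSketch :: String -> Maybe Rational
parsePosixTimeSketch s = case readP_to_S posixTimeP s of
  [(t, "")] -> Just t
  _         -> Nothing
```

Using `Rational` keeps the fractional digits exact; converting to a fixed-precision time type would be a separate step.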
Joey Hess	ba2c0663f9	comments	2019-01-01 22:48:14 -04:00
Joey Hess	ec1b9da72f	avoid abusing from/toRawFilePath for non-FilePaths	2019-01-01 22:44:04 -04:00
Joey Hess	b3c69eaaf8	strict bytestring encoders and decoders Only had lazy ones before. Already sped up a few parts of the code.	2019-01-01 14:55:15 -04:00
Joey Hess	1b44426805	avoid conflicting definitions of Template type When both modules are imported and then re-exported.	2018-12-30 15:03:31 -04:00
Joey Hess	5480b3a9af	fix bogus ghc 8.6.3 build warning ghc warned that the guards did not cover all values of h, but they clearly do, and when rewritten as a case statement the warning goes away. Probably a ghc bug, but I kind of prefer the case statement over the guards anyway.	2018-12-30 14:43:27 -04:00

1602 commits