git-annex

Author	SHA1	Message	Date
Joey Hess	b68a8d8968	use conversion functions from filepath-bytestring (again) This reverts commit `3a04af7927`.	2020-01-04 20:18:40 -04:00
Joey Hess	2cea674d1e	Merge branch 'master' into v8	2020-01-01 14:26:43 -04:00
Joey Hess	999a6f0541	windows build fix	2020-01-01 13:05:23 -04:00
Joey Hess	39c91f91a9	windows build fix	2020-01-01 12:24:31 -04:00
Joey Hess	e006acc8e3	fix quickcheck failure prop_encode_decode_roundtrip failed on "\175" in C locale. This may be a new problem after the switch to RawFilePath, but it already had filtering for high chars, so changed to only test ascii chars.	2019-12-30 13:54:46 -04:00
Joey Hess	3a04af7927	temporary revert "use conversion functions from filepath-bytestring" This reverts commit `75c40279c1`. Debian unstable is one version too old, so this can be de-reverted in a bit.	2019-12-27 19:29:09 -04:00
Joey Hess	2b821eb225	Merge branch 'master' into sqlite	2019-12-26 15:15:42 -04:00
Joey Hess	444d5591ee	Improve file ordering behavior when one parameter is "." and other parameters are other directories eg, `git-annex get . ..` used to order the files strangely, because it did not realize that when git ls-files output eg "foo", that should be grouped with the first set of files and not the second set. Fixed by making dirContains "." "./foo" = True which makes sense, because dirContains ".." "../foo" = True	2019-12-20 18:01:29 -04:00
Joey Hess	d5628a16b8	Merge branch 'bs' into sqlite-bs	2019-12-18 14:51:03 -04:00
Joey Hess	75c40279c1	use conversion functions from filepath-bytestring Behavior should be the same, but I'd hope to eventually get rid of most of Utility.FileSystemEncoding and this is a first step.	2019-12-18 13:42:43 -04:00
Joey Hess	322c542b5c	fix ByteString conversion on windows the encode' and decode' functions on Windows should not apply the filesystem encoding, which does not work there. Instead, convert to and from UTF-8. Also, avoid exporting encodeW8 and decodeW8. Both use the filesystem encoding, so won't work as expected on windows.	2019-12-18 13:32:56 -04:00
Joey Hess	c19211774f	use filepath-bytestring for annex object manipulations git-annex find is now RawFilePath end to end, no string conversions. So is git-annex get when it does not need to get anything. So this is a major milestone on optimisation. Benchmarks indicate around 30% speedup in both commands. Probably many other performance improvements. All or nearly all places where a file is statted use RawFilePath now.	2019-12-11 15:25:07 -04:00
Joey Hess	a0168cd9a2	use RawFilePath getSymbolicLinkStatus for speed	2019-12-06 15:42:54 -04:00
Joey Hess	2f9a80d803	merging sqlite and bs branches Since the sqlite branch uses blobs extensively, there are some performance benefits, ByteStrings now get stored and retrieved w/o conversion in some cases like in Database.Export.	2019-12-06 15:30:45 -04:00
Joey Hess	5f391179f1	use RawFilePath getFileStatus for speed Only done on those calls to getFileStatus that had a RawFilePath, not a FilePath. The others would probably be just as fast if converted to use it with toRawFilePath, but I'm not 100% sure. Note that genInodeCache' uses fromRawFilePath, but that value only gets used on Windows, so on unix the thunk will never be evaluated.	2019-12-06 14:44:42 -04:00
Joey Hess	360942ba12	RawFilePath will need to support Windows too Of course, readSymbolicLink always fails on Windows, but now it's ready for other things that don't fail there.	2019-12-06 14:17:48 -04:00
Joey Hess	f39f018ee0	fix git ls-tree parser File mode is octal not decimal. This broke in the conversion to attoparsec. (I've submitted the content of Utility.Attoparsec to the attoparsec developers.) Test suite passes 100% now.	2019-12-06 14:05:48 -04:00
Joey Hess	37d0f73e66	reword comment	2019-11-27 16:38:18 -04:00
Joey Hess	067aabdd48	wip RawFilePath 2x git-annex find speedup Finally builds (oh the agony of making it build), but still very unmergable, only Command.Find is included and lots of stuff is badly hacked to make it compile. Benchmarking vs master, this git-annex find is significantly faster! Specifically: num files old new speedup 48500 4.77 3.73 28% 12500 1.36 1.02 66% 20 0.075 0.074 0% (so startup time is unchanged) That's without really finishing the optimization. Things still to do: * Eliminate all the fromRawFilePath, toRawFilePath, encodeBS, decodeBS conversions. * Use versions of IO actions like getFileStatus that take a RawFilePath. * Eliminate some Data.ByteString.Lazy.toStrict, which is a slow copy. * Use ByteString for parsing git config to speed up startup. It's likely several of those will speed up git-annex find further. And other commands will certainly benefit even more.	2019-11-26 16:01:58 -04:00
Joey Hess	6a97ff6b3a	wip RawFilePath Goal is to make git-annex faster by using ByteString for all the worktree traversal. For now, this is focusing on Command.Find, in order to benchmark how much it helps. (All other commands are temporarily disabled) Currently in a very bad unbuildable in-between state.	2019-11-25 16:18:19 -04:00
Joey Hess	1ff889e456	explicit export lists A small amount of dead code removed. All of Utility/ done now. This commit was sponsored by Brock Spratlen on Patreon.	2019-11-23 11:24:10 -04:00
Joey Hess	81d402216d	cache the serialization of a Key This will speed up the common case where a Key is deserialized from disk, but is then serialized to build eg, the path to the annex object. Previously attempted in `4536c93bb2` and reverted in `96aba8eff7`. The problems mentioned in the latter commit are addressed now: Read/Show of KeyData is backwards-compatible with Read/Show of Key from before this change, so Types.Distribution will keep working. The Eq instance is fixed. Also, Key has smart constructors, avoiding needing to remember to update the cached serialization. Used git-annex benchmark: find is 7% faster whereis is 3% faster get when all files are already present is 5% faster Generally, the benchmarks are running 0.1 seconds faster per 2000 files, on a ram disk in my laptop.	2019-11-22 17:49:16 -04:00
Joey Hess	1d0dbdf201	squelch tab warnings	2019-11-22 12:49:41 -04:00
Joey Hess	7263aafd2b	Merge branch 'master' into sqlite	2019-11-22 12:49:35 -04:00
Joey Hess	b82ab21468	missed an export	2019-11-22 12:35:57 -04:00
Joey Hess	a9888f6151	Windows: Fix handling of changes to time zone. Used to work but was broken in version 7.20181031, specifically commit `5ab0f48ffb`. That this was not noticed over at least 1 daylight savings time zone changes makes me wonder if the TSDelta stuff is still needed. Perhaps the mtime on Windows no longer changes when the time zone is changed? (cherry picked from commit `09ee6b0ccb`)	2019-11-21 17:28:18 -04:00
Joey Hess	d4661959de	Merge branch 'master' into sqlite	2019-11-21 17:26:50 -04:00
Joey Hess	8ea5f3ff99	explicit export lists Eliminated some dead code. In other cases, exported a currently unused function, since it was a logical part of the API. Of course this improves the API documentation. It may also sometimes let ghc optimize code better, since it can know a function is internal to a module. 364 modules still to go, according to git grep -E 'module [A-Za-z.]+ where'	2019-11-21 16:08:37 -04:00
Joey Hess	890330f0fe	make --json-error-messages capture url download errors Convert Utility.Url to return Either String so the error message can be displayed in the annex monad and so captured. (When curl is used, its errors are still not caught.)	2019-11-12 13:52:38 -04:00
Joey Hess	09ee6b0ccb	Windows: Fix handling of changes to time zone. Used to work but was broken in version 7.20181031, specifically commit `5ab0f48ffb`. That this was not noticed over at least 1 daylight savings time zone changes makes me wonder if the TSDelta stuff is still needed. Perhaps the mtime on Windows no longer changes when the time zone is changed?	2019-11-06 14:36:49 -04:00
Joey Hess	89bdcffdfa	found a way to extract InodeCache from git index This will allow a race-free database transition. It is somewhat hairy in that it depends on an unspecified git output format.	2019-11-06 14:23:00 -04:00
Joey Hess	4940a135af	eliminate raw sql LIKE query	2019-10-30 15:19:52 -04:00
Joey Hess	94efc400e9	horrible implementation of isInodeKnown The only good thing about it is it does not require a major version bump to improve the database. That will need to happen at some point though. Potentially very very slow in a large repository. Ugly use of raw sql.	2019-10-23 14:37:29 -04:00
Joey Hess	eebf080b33	comment typo	2019-10-23 12:32:46 -04:00
Joey Hess	9a5d9019ba	Deal with pkexec changing to root's home directory when running a command. Wow, that's not documented anywhere, and seems like a major gotcha in pkexec. Broke enable-tor.	2019-10-21 12:39:19 -04:00
Joey Hess	b90ddbc383	enable-tor: Use pkexec to run command as root when gksu and kdesu are not available. gksu is no longer in debian, even stable kdesu in debian is not installed in PATH any longer, though the executable is still present under /usr/lib pkexec is packagekit's replacement for those older commands.	2019-09-30 15:19:01 -04:00
Joey Hess	f2737a5fbe	enable-tor: Run kdesu with -c option.	2019-09-30 15:14:05 -04:00
Joey Hess	ab8a6a82e1	remove unused	2019-09-24 18:16:01 -04:00
Joey Hess	a4750fa537	move haddock block so haddock will build	2019-09-24 18:14:47 -04:00
Joey Hess	71f30d2f07	improve haddock	2019-09-24 18:10:34 -04:00
Joey Hess	bc1b9a2c0a	improved GitLFS api	2019-09-24 18:05:11 -04:00
Joey Hess	6ae0a44c64	git-lfs: Added support for http basic auth	2019-09-24 14:46:20 -04:00
Joey Hess	53fd746705	avoid some build warnings on windows	2019-09-12 14:11:19 -04:00
Joey Hess	9624fe4c37	improve comment	2019-09-01 12:33:19 -04:00
Joey Hess	1558e03014	Refuse to upgrade direct mode repositories when git is older than 2.22 That git fixed a memory leak that could cause an OOM during the upgrade. Most git-annex builds have a new enough git already. OSX git was upgraded with brew. Linux i386ancient build's git was too old. Upgrading it to a fixed git didn't work (due to the newer git not working with the old ssh, https://bugs.chromium.org/p/git/issues/detail?id=7 ) Choices to deal with that were: * Somehow make direct mode upgrade work with the old git, avoiding its OOM problem. One way would be to switch the repo to indirect mode first, and so upgrade to a repo with locked files. Not good when the filesystem does not support symlinks. * backport the OOM fix from git 2.22 (And do what about the version number so git-annex knows it's fixed?) * backport openssh (and possibly more stuff) * move the i386ancient build to at least Debian stretch (still backporting git) But this will make it no longer work with some of the ancient kernels it targets. Of those, backporting the OOM fix seemed the best approach. Put "oomfix" in the git version number to indicate it. 
I have not automated building the git backport, so here's the patch I used: diff -ur orig/git-2.1.4/convert.c git-2.1.4/convert.c --- orig/git-2.1.4/convert.c 2014-12-18 18:42:18.000000000 +0000 +++ git-2.1.4/convert.c 2019-08-29 20:05:04.371872338 +0100 @@ -404,7 +404,7 @@ if (start_async(&async)) return 0; /* error was already reported */ - if (strbuf_read(&nbuf, async.out, len) < 0) { + if (strbuf_read(&nbuf, async.out, 0) < 0) { error("read from external filter %s failed", cmd); ret = 0; } diff -ur orig/git-2.1.4/GIT-VERSION-GEN git-2.1.4/GIT-VERSION-GEN --- orig/git-2.1.4/GIT-VERSION-GEN 2014-12-18 18:42:18.000000000 +0000 +++ git-2.1.4/GIT-VERSION-GEN 2019-08-29 20:06:39.132743228 +0100 @@ -1,7 +1,7 @@ #!/bin/sh GVF=GIT-VERSION-FILE -DEF_VER=v2.1.4 +DEF_VER=v2.1.4.oomfix LF=' ' diff -ur orig/git-2.1.4/configure git-2.1.4/configure --- orig/git-2.1.4/configure 2014-12-18 18:42:19.000000000 +0000 +++ git-2.1.4/configure 2019-08-29 20:27:45.896380015 +0100 @@ -580,8 +580,8 @@ # Identity of this package. PACKAGE_NAME='git' PACKAGE_TARNAME='git' -PACKAGE_VERSION='2.1.4' -PACKAGE_STRING='git 2.1.4' +PACKAGE_VERSION='2.1.4.oomfix' +PACKAGE_STRING='git 2.1.4.oomfix' PACKAGE_BUGREPORT='git@vger.kernel.org' PACKAGE_URL='' diff -ur orig/git-2.1.4/version git-2.1.4/version --- orig/git-2.1.4/version 2014-12-18 18:42:19.000000000 +0000 +++ git-2.1.4/version 2019-08-29 20:06:17.572545210 +0100 @@ -1 +1 @@ -2.1.4 +2.1.4.oomfix	2019-08-29 15:24:41 -04:00
Joey Hess	69cefe8190	followup and display rsync exit status	2019-08-15 14:47:22 -04:00
Joey Hess	05d52f9699	fix display of http exceptions	2019-08-10 11:09:25 -04:00
Joey Hess	868942e19b	fix unused module import warnings when building on windows	2019-08-08 12:18:53 -04:00
Joey Hess	ee72fd2f7d	add exports useful if using this module to write a git-lfs server	2019-08-05 15:40:36 -04:00
Joey Hess	c527ae5887	Merge branch 'master' into git-lfs	2019-08-05 11:48:45 -04:00
Joey Hess	19defc7932	fix reversion `4af55c42bf` reordered the exception catching, preventing following ftp redirect	2019-08-04 14:32:06 -04:00
Joey Hess	4af55c42bf	factored out downloadConduit from download useful when an API provides a Request to download	2019-08-04 12:31:54 -04:00
Joey Hess	5be0a35dae	implemented checkPresent for git-lfs	2019-08-03 12:21:28 -04:00
Joey Hess	f536a0b264	weaken comment I'm seeing the github lfs server request an upload of an object that has already been uploaded to it before. Probably because they offload storage to S3 and so skipped the overhead of checking for an unnecessary upload.	2019-08-03 11:31:02 -04:00
Joey Hess	74e9e3ccf0	add to request headers, don't overwrite	2019-08-03 11:15:08 -04:00
Joey Hess	fc09a41ed1	storing objects in git-lfs is working Still need to record the sha256 and size when they cannot be determined by inspecting the key.	2019-08-02 13:56:55 -04:00
Joey Hess	6c1130a3bb	lfs endpoint discovery and caching in git-lfs special remote	2019-08-02 12:38:14 -04:00
Joey Hess	03a765909c	move IO code out Let's keep this entirely pure. git-annex has its own facilities for running a ssh command, that make it respect various config settings, and cache connections, etc. So better not to have the library run ssh itself.	2019-08-02 10:57:40 -04:00
Joey Hess	2533acc7a2	note about ssh hostname sanitization	2019-08-02 10:40:55 -04:00
Joey Hess	bd6c508334	finalizing lfs module It may eventually move to its own package.	2019-08-01 14:04:56 -04:00
Joey Hess	018b5b8173	Support building with socks-0.6 and persistent-template-2.7 persistent-template now needs UndecidableInstances. socks changed defaultSocksConf to take a SockAddr.	2019-07-30 12:50:48 -04:00
Joey Hess	426053cb6c	Corrected some license statements In `40ecf58d4b` I changed the license of code I wrote from GPL to AGPL. But, two files containing code I wrote combined with code by others were updated to say their license is AGPL, while in fact part of it was (the code I wrote) but part remained under the original license (the code written by others). Remote/Ddar.hs is now changed entirely back to GPL 3. Annex/DirHashes.hs stays AGPL, but I broke out Utility/MD5.hs with the code not written by me, and corrected its license statement to GPL-2, which is the actual version of the GPL included with the code in its original distribution at http://www.cs.ox.ac.uk/people/ian.lynagh/md5/	2019-07-28 14:27:33 -04:00
Joey Hess	7fd650355e	merge from http-client-restricted I made some improvements to its API after splitting it out of git-annex, so merge those back in. This is groundwork for removing the embedded copy of it and depending on it. Also moved the managerResponseTimeout disabling to Annex.Url as it's git-annex specific. This commit was sponsored by Ethan Aubin on Patreon.	2019-07-17 16:48:50 -04:00
Joey Hess	7234b1f9a7	small optimisation to file copying Avoid statting file, just try to remove it. Also a comment to explain why it tries to remove it, which was puzzling me when I revisited this code until I saw that cp fails to overwrite a mode 444 file, including perhaps one left by a previous interrupted cp. This commit was sponsored by Fernando Jimenez on Patreon.	2019-07-17 14:22:21 -04:00
Joey Hess	21ff5e1e5a	CoW probing Improved probing when CoW copies can be made between files on the same drive. Now supports CoW between BTRFS subvolumes. And, falls back to rsync instead of using cp when CoW won't work, eg copies between repos on the same EXT4 filesystem. Rather than trying cp --reflink=always for each file copied to a remote, it's tried once and if it fails it falls back to using rsync thereafter for the lifetime of the Remote object. That avoids overhead of calling cp which while small, will add up over a large number of files. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2019-07-17 14:19:08 -04:00
Joey Hess	0c6b7e288d	Add BLAKE2BP512 and BLAKE2BP512E backends using a blake2 variant optimised for 4-way CPUs This had been deferred because the Debian package of cryptonite, and possibly other builds, was broken for blake2bp, but I've confirmed #892855 is fixed. This commit was sponsored by Brett Eisenberg on Patreon.	2019-07-05 15:30:03 -04:00
Joey Hess	9a5ddda511	remove many old version ifdefs Drop support for building with ghc older than 8.4.4, and with older versions of several haskell libraries than will be included in Debian 10. The only remaining version ifdefs in the entire code base are now a couple for aws! This commit should only be merged after the Debian 10 release. And perhaps it will need to wait longer than that; it would make backporting new versions of git-annex to Debian 9 (stretch) which has been actively happening as recently as this year. This commit was sponsored by Ilya Shlyakhter.	2019-07-05 15:09:37 -04:00
Joey Hess	42c386fc47	add: Display progress meter when hashing files. * add: Display progress meter when hashing files. * add: Support --json-progress option.	2019-06-25 13:12:47 -04:00
Joey Hess	759fd9ea68	avoid url resume from 0 When downloading an url and the destination file exists but is empty, avoid using http range to resume, since a range "bytes=0-" is an unusual edge case that it's best to avoid relying on working. This is known to fix a case where importfeed downloaded a partial feed from such a server. Since importfeed uses withTmpFile, the destination always exists empty, so it would particularly tickle such problem servers. Resuming from 0 is otherwise possible, but unlikely.	2019-06-20 12:26:17 -04:00
Joey Hess	fe49747fc8	add missing case and fix name shadowing warning	2019-06-04 11:24:32 -04:00
Joey Hess	6136e299a2	add back support for following http to ftp redirects Did not test build with http-client < 0.5 and while I tried to support it, the ifdefed parts may need some fixes.	2019-05-30 16:04:59 -04:00
Joey Hess	67c06f5121	add back support for ftp urls Add back support for ftp urls, which was disabled as part of the fix for security hole CVE-2018-10857 (except for configurations which enabled curl and bypassed public IP address restrictions). Now it will work if allowed by annex.security.allowed-ip-addresses.	2019-05-30 14:51:34 -04:00
Joey Hess	aa7710982b	avoid list lookup by parseToken Minor optimisation to parsing of a preferred content expression.	2019-05-14 13:11:29 -04:00
Joey Hess	00e9e15c70	squelch build warning with old version of quickcheck	2019-05-03 11:02:12 -04:00
Joey Hess	2b52dbe905	fix build with older QuickCheck The NonEmpty instance was moved out of QuickCheck and into a package with more deps than I want to drag in, so I'm providing my own instance, but with older QuickCheck, use theirs to avoid overlapping.	2019-03-22 10:07:16 -04:00
Joey Hess	40ecf58d4b	update licenses from GPL to AGPL This does not change the overall license of the git-annex program, which was already AGPL due to a number of sources files being AGPL already. Legally speaking, I'm adding a new license under which these files are now available; I already released their current contents under the GPL license. Now they're dual licensed GPL and AGPL. However, I intend for all my future changes to these files to only be released under the AGPL license, and I won't be tracking the dual licensing status, so I'm simply changing the license statement to say it's AGPL. (In some cases, others wrote parts of the code of a file and released it under the GPL; but in all cases I have contributed a significant portion of the code in each file and it's that code that is getting the AGPL license; the GPL license of other contributors allows combining with AGPL code.)	2019-03-13 15:48:14 -04:00
Joey Hess	c0bd202147	fix failing test case An empty list of [ContentIdentifier] serialized to the same thing as a single ContentIdentifier "". Avoid this ambiguity by requiring the list be non-empty.	2019-03-06 14:27:15 -04:00
Joey Hess	1ec9e1494c	use relatedTemplate in viaTmp	2019-03-04 14:12:00 -04:00
Joey Hess	4603713b4e	avoid using htonl It got removed from network-3.0.0.0 and nothing in the haskell ecosystem currently provides it (which seems it ought to be fixed). Tested new code on both little-endian and big-endian with: ghci> hostAddressToTuple $ fromJust $ embeddedIpv4 (0,0,0,0,0,0xffff,0x7f00,1) (127,0,0,1)	2019-02-19 12:17:20 -04:00
Joey Hess	f5f059e288	relocate gpg test framework temp dir to outside repo The gitAnnexTmpOtherDir cleanup made it be deleted too early sometimes, and so the test suite failed. Also there was a report of a similar failure which likely had a similar cause and hopefully this fixes that too.	2019-01-21 14:16:00 -04:00
Joey Hess	e38b654096	Estimated time to completion display shortened from eg "1h1m1s" to "1h1m" Because seconds accuracy over such a time is unlikely to be accurate. Also, it was possible to get a ridiculous "1y1d1h1m1s" if stalled or very slow.	2019-01-21 00:04:35 -04:00
Joey Hess	96aba8eff7	Revert "cache the serialization of a Key" This reverts commit `4536c93bb2`. That broke Read/Show of a Key, and unfortunately Key is read in at least one place; the GitAnnexDistribution data type. It would be worth bringing this optimisation back, but it would need either a custom Read/Show instance that preserves back-compat, or wrapping Key in a data type that contains the serialization, or changing how GitAnnexDistribution is serialized. Also, the Eq instance would need to compare keys with and without a cached serialization the same.	2019-01-16 16:21:59 -04:00
Joey Hess	0e44985210	remove duplicate import	2019-01-14 18:26:38 -04:00
Joey Hess	e0c4ac99b5	convert serializeKey' to strict ByteString The builder produces a lazy ByteString, and L.toStrict has to copy it, but needing to use the builder is no longer the common case; the serialization will normally be cached already as a strict ByteString, and this avoids keyFile' needing to use L.toStrict . serializeKey'	2019-01-14 17:03:46 -04:00
Joey Hess	5d98cba923	use ByteStrings when reading annex symlinks and pointers Now there's a ByteString used all the way from disk to Key. The main complication in this conversion was the use of fromInternalGitPath in several places to munge things on Windows. The things that used that were changed to parse the ByteString using either path separator. Also some code that had read from files to a String lazily was changed to read a minimal strict ByteString.	2019-01-14 15:37:08 -04:00
Joey Hess	fc21cccf1c	slight optimisation more	2019-01-11 19:56:31 -04:00
Joey Hess	16c798b5ef	switch MetaValue to ByteString and MetaField to Text MetaField was already limited to alphanumerics, so it makes sense to use Text for it. Note that technically a UUID can contain invalid UTF-8, and so remoteMetaDataPrefix's use of T.pack . fromUUID could replace non-UTF8 values with '?' or whatever. In practice, a UUID is usually also text, I only kept open the possibility of it containing invalid UTF-8 to avoid breaking parsing of strange UUIDs in git-annex branch files. So, I decided to let this edge case slip by. Have not updated the rest of the code base yet for this change, as the change took 2.5 hours longer than I expected to get working properly.	2019-01-07 14:18:24 -04:00
Joey Hess	a80922a594	support for ByteStrings	2019-01-07 12:29:25 -04:00
Joey Hess	7d51b0c109	import Utility.FileSystemEncoding in Common	2019-01-03 11:37:02 -04:00
Joey Hess	f574d8af10	comment typo	2019-01-03 00:22:05 -04:00
Joey Hess	3ba6e9bb96	use attoparsec parser for String parsing, 10x speedup This is not as efficient as using ByteStrings throughout, but converting the String to ByteString is actually significantly faster than the old parser. benchmarking parse/old time 9.657 μs (9.600 μs .. 9.732 μs) 1.000 R² (0.999 R² .. 1.000 R²) mean 9.703 μs (9.645 μs .. 9.785 μs) std dev 231.6 ns (161.5 ns .. 323.7 ns) variance introduced by outliers: 25% (moderately inflated) benchmarking parse/new time 834.6 ns (797.1 ns .. 886.9 ns) 0.987 R² (0.976 R² .. 0.999 R²) mean 816.4 ns (802.7 ns .. 845.1 ns) std dev 62.39 ns (37.66 ns .. 108.4 ns) variance introduced by outliers: 82% (severely inflated) There is a small behavior change from the old parsePOSIXTime, which accepted any amount of trailing whitespace after the timestamp. That behavior was not documented, and it doesn't seem anything relied on it.	2019-01-02 13:28:44 -04:00
Joey Hess	3c74dcd4e1	attoparsec parser for POSIXTime (Not yet used anywhere.) Benchmarking {-# LANGUAGE OverloadedStrings #-} import Criterion.Main import Utility.TimeStamp import Data.Attoparsec.ByteString main = defaultMain [ bgroup "parse" [ bench "new" $ whnf (parseOnly (parserPOSIXTime <* endOfInput)) "1431286201.113452s" , bench "old" $ whnf parsePOSIXTime "1431286201.113452s" ] ] benchmarking parse/new time 643.6 ns (640.2 ns .. 646.7 ns) 1.000 R² (0.999 R² .. 1.000 R²) mean 645.3 ns (642.1 ns .. 650.9 ns) std dev 14.59 ns (9.194 ns .. 22.07 ns) variance introduced by outliers: 29% (moderately inflated) benchmarking parse/old time 9.657 μs (9.600 μs .. 9.732 μs) 1.000 R² (0.999 R² .. 1.000 R²) mean 9.703 μs (9.645 μs .. 9.785 μs) std dev 231.6 ns (161.5 ns .. 323.7 ns) variance introduced by outliers: 25% (moderately inflated) So old took 9703 ns to parse, and new 643 ns.	2019-01-02 12:48:53 -04:00
Joey Hess	ba2c0663f9	comments	2019-01-01 22:48:14 -04:00
Joey Hess	ec1b9da72f	avoid abusing from/toRawFilePath for non-FilePaths	2019-01-01 22:44:04 -04:00
Joey Hess	b3c69eaaf8	strict bytestring encoders and decoders Only had lazy ones before. Already sped up a few parts of the code.	2019-01-01 14:55:15 -04:00
Joey Hess	1b44426805	avoid conflicting definitions of Template type When both modules are imported and then re-exported.	2018-12-30 15:03:31 -04:00
Joey Hess	5480b3a9af	fix bogus ghc 8.6.3 build warning ghc warned that the guards did not cover all values of h, but they clearly do, and when rewritten as a case statement the warning goes away. Probably a ghc bug, but I kind of prefer the case statement over the guards anyway.	2018-12-30 14:43:27 -04:00
Joey Hess	14971414dc	Make test suite work better when the temp directory is on NFS. Deleting directories is one of the great unsolved problems of CS, thanks to abominations like NFS lock files and Windows and races with other processes cleaning up after themselves in the background. The gpg test harness sometimes failed to delete its temp directory on NFS. Avoid the problem class by not deleting it at all, and putting it inside the tmp repo being tested. The test suite's more robust (and/or nonsensical) workarounds for deleting its test dir will thus be used, hopefully avoiding the problem until an OS finds a new way to violate POSIX and the laws of nature. Note that this means that the .gnupg directory will be on whatever filesystem the test suite is being run on, which may be a lesser quality filesystem than gpg is really expecting. Gpg does not seem to need to write sockets etc to there so this seems ok. The only known problem is that if the filesystem forces a directory mode like 777, gpg will warn about unsafe home directory perms, but it still works.	2018-12-19 12:44:56 -04:00
Joey Hess	850d19d038	add dropFromEnd	2018-11-23 11:24:05 -04:00
Joey Hess	9127fe4821	add DebugLocks build flag Using the method described in https://www.fpcomplete.com/blog/2018/05/pinpointing-deadlocks-in-haskell but my own code to implement it, and with callstacks added. This work is supported by the NIH-funded NICEMAN (ReproNim TR&D3) project.	2018-11-19 15:02:43 -04:00
Joey Hess	ff9bd9620e	Fix resume of download of url when the whole file content is already actually downloaded Don't much like that there's no way to distinguish between having the whole content and having an old version of the file that's bigger, but of course resuming a http transfer can always yield the wrong result if the file on the http server is changing, and git-annex will detect that when it verifies the downloaded content. This work is supported by the NIH-funded NICEMAN (ReproNim TR&D3) project.	2018-11-12 16:08:47 -04:00
Joey Hess	051dfcb3be	Revert "fix comment" This reverts commit `bac7d34e71`. The comment was right; ARG_MAX is the total length of all arguments.	2018-11-06 17:26:20 -04:00
Joey Hess	bac7d34e71	fix comment	2018-11-06 11:42:31 -04:00
Joey Hess	5ad5d45d4c	make Arbitrary POSIXTime include decimal half the time	2018-10-31 16:27:55 -04:00
Joey Hess	2ca408dc33	Increase minimum QuickCheck version.	2018-10-31 15:53:22 -04:00
Joey Hess	f00b329e0c	remove unused import	2018-10-30 13:38:29 -04:00
Joey Hess	86df2a08fe	fix windows build	2018-10-30 11:09:45 -04:00
Joey Hess	5ab0f48ffb	high-res mtimes Cache high-resolution mtimes for improved detection of modified files in v7 (and direct mode). Including on Windows. With back-compat support so old low-res mtimes won't break anything, and so the new information also won't break old versions of git-annex.	2018-10-30 00:41:26 -04:00
Joey Hess	48af284872	fix parse of negative posix time Should never happen, but..	2018-10-29 23:40:34 -04:00
Joey Hess	a8ad577d1d	fix parsing of timestamp w/o trailing 's' Luckily, this did not affect any git-annex log files, since they all include the trailing 's' for backwards compatibility reasons. But, if I later want to drop that, this is the first commit where git-annex can be trusted to parse that right. The misparse caused it to be off by up to 10 seconds.	2018-10-29 23:36:47 -04:00
Joey Hess	3d1b22dc8e	factor out another function	2018-10-29 23:33:56 -04:00
Joey Hess	2e9f128dea	moved module and relicensed	2018-10-29 23:13:36 -04:00
Joey Hess	5d97898a7c	touch files with high-resolution timestamp Needs unix 2.7.2, but that was included in ghc 8.0.1 (and much older) so not really a new dep.	2018-10-29 22:25:21 -04:00
Joey Hess	94b7968f1f	forgot to remove this when dropping support for old ghc	2018-10-29 22:01:06 -04:00
Joey Hess	595fb98473	add small delay to avoid problems on systems with low-resolution mtime I've seen intermittent failures of the test suite with v6 for a long time, it seems to have possibly gotten worse with the changes around v7. Or just being unlucky; all tests failed today. Seen on amd64 and i386 builders, repeatedly but intermittently: unused: FAIL (4.86s) Test.hs:928: git diff did not show changes to unlocked file And I think other such failures, all involving v7/v6 mode tests. I managed to reproduce the unused failure with --keep-failures, and inside the repo, git diff was indeed not showing any changes for the modified unlocked file. The two stats will be the same other than mtime; the old and new files have the same size and inode, since the test case writes to the file and then overwrites it. Indeed, notice the identical timestamps: builder@orca:~/gitbuilder/build/.t/tmprepo335$ echo 1 > foo; stat foo; echo 2 > foo; stat foo File: foo Size: 2 Blocks: 8 IO Block: 4096 regular file Device: 801h/2049d Inode: 3546179 Links: 1 Access: (0644/-rw-r--r--) Uid: ( 1000/ builder) Gid: ( 1000/ builder) Access: 2018-10-29 22:14:10.894942036 +0000 Modify: 2018-10-29 22:14:10.894942036 +0000 Change: 2018-10-29 22:14:10.894942036 +0000 Birth: - File: foo Size: 2 Blocks: 8 IO Block: 4096 regular file Device: 801h/2049d Inode: 3546179 Links: 1 Access: (0644/-rw-r--r--) Uid: ( 1000/ builder) Gid: ( 1000/ builder) Access: 2018-10-29 22:14:10.894942036 +0000 Modify: 2018-10-29 22:14:10.898942036 +0000 Change: 2018-10-29 22:14:10.898942036 +0000 Birth: - I'm seeing this in Linux VMs; it doesn't happen on my laptop. I've also not experienced the intermittent test suite failures on my laptop. So, I hope that this small delay will avoid the problem. Update: I didn't, indeed I then reproduced the same failure on my laptop, so it must be due to something else. 
But keeping this change anyway since not needing to worry about lowish-resolution mtime in the test suite seems worthwhile.	2018-10-29 19:31:26 -04:00
Joey Hess	234842a347	v7 Install new git hooks in this version. This does beg the question of what to do if git later gets eg a post-smudge hook, that could run git-annex smudge --update. I think the thing to do in that case would be to make git-annex smudge --update install the new hooks. That way, as the user uses git-annex, the hook would be created pretty quickly and without needing any extra syscalls except for when git-annex smudge --update is called. I considered doing something like that for installation of the post-checkout and post-merge hooks, which would have avoided the need for v7. But the only place it was cheap to do it would be in git-annex smudge which could cheaply notice that smudge.log didn't exist yet and so know the hooks needed to be installed. But since smudge used to populate pointer files, it would be quite surprising if a single git checkout/merge failed to update the work tree, and so that idea didn't work out. The other reason for v7 is psychological -- users don't need to worry about whether they might be running an old version of git-annex that doesn't support their v7 repository very well. And bug reports about "v6" have gotten a bit of a bad association in my head since they often hit one of the known limitations and didn't realize it was experimental. newtyped RepoVersion Int to avoid needing 2 comparisons in versionSupportsUnlockedPointers etc. Also it's just nicer. This commit was sponsored by John Pellman on Patreon.	2018-10-25 18:24:23 -04:00
Joey Hess	38d691a10f	removed the old Android app Running git-annex linux builds in termux seems to work well enough that the only reason to keep the Android app would be to support Android 4-5, which the old Android app supported, and which I don't know if the termux method works on (although I see no reason why it would not). According to [1], Android 4-5 remains on around 29% of devices, down from 51% one year ago. [1] https://www.statista.com/statistics/271774/share-of-android-platforms-on-mobile-devices-with-android-os/ This is a rather large commit, but mostly very straightforward removal of android ifdefs and patches and associated cruft. Also, removed support for building with very old ghc < 8.0.1, and with yesod < 1.4.3, and without concurrent-output, which were only being used by the cross build. Some documentation specific to the Android app (screenshots etc) needs to be updated still. This commit was sponsored by Brett Eisenberg on Patreon.	2018-10-13 01:41:11 -04:00
Joey Hess	45e09ea7f3	debug the full adjusted Request So that the user-agent etc are included in the debug.	2018-10-04 13:45:27 -04:00
Joey Hess	303d10cee6	Improve display when git config download from a http remote fails. The error message displayed used to only come from curl/wget and perhaps was clearer than the one displayed now that http-client is used. In any case, it does make sense to hide it because git-annex prints its own warning message. This commit was sponsored by Jake Vosloo on Patreon.	2018-10-03 12:31:09 -04:00
Joey Hess	502c5a4917	remove support for old http-client version git-annex already bumped to a newer version for the http security fix. This commit was sponsored by mo on Patreon.	2018-10-03 12:00:07 -04:00
Joey Hess	c88e8c8249	unify error display	2018-10-03 11:56:52 -04:00
Joey Hess	26a02cb386	display error when an invalid url is downloaded download is documented as displaying an error when download fails, but it didn't when the url was not valid at all. That leads to confusing behavior. Also, display the url with --debug	2018-09-25 13:38:20 -04:00
Joey Hess	cc82f81227	More FreeBSD build fixes. Untested on FreeBSD, but enough to fix the listed build errors. Seems that System.Posix.Files must have used to export this stuff and it was split. This commit was sponsored by Peter on Patreon.	2018-09-24 11:25:56 -04:00
Joey Hess	ceee7758a5	fix \ escaping	2018-09-22 11:33:08 -04:00
Joey Hess	d2c351f547	update windows NUL for ghc 8.6.1 This should also work with older ghc, since the path is a windows device namespace path.	2018-09-22 11:31:55 -04:00
Joey Hess	2aae6e84af	Support newlines in filenames. Work around git cat-file --batch's protocol not supporting newlines by running git cat-file not batched and passing the filename as a parameter. Of course this is quite a lot less efficient, especially because it currently runs it multiple times to query for different pieces of information. Also, it has subtly different behavior when the batch process was started and then some changes were made, in which case the batch process sees the old index but this workaround sees the current index. Since that batch behavior is mostly a problem that affects the assistant and has to be worked around in it, I think I can get away with this difference. I don't know of any other problems with newlines in filenames, everything else in git I can think of supports -z. And git-annex's json output supports newlines in filenames so downstream parsers from git-annex will be ok. git-annex commands that use --batch themselves don't support newlines in input filenames; using --json --batch is currently a way around that problem. This commit was sponsored by Ewen McNeill on Patreon.	2018-09-20 13:45:44 -04:00
Yaroslav Halchenko	b976eb5353	BF(minor): missing space after "Unsupported url scheme" msg before the scheme	2018-09-18 18:19:20 -04:00
Joey Hess	b3c9c59d3d	--debug urls When git-annex used wget and curl, --debug would show urls. So there can't be any new security problem with doing so. This commit was sponsored by John Pellman on Patreon.	2018-09-14 12:46:39 -04:00
Joey Hess	b18fb1e343	clean P2P protocol shutdown on EOF Avoids "git-annex-shell: <stdin>: hGetChar: end of file" being displayed by the test suite, due to the way it runs git-annex-shell without using ssh. git-annex-shell over ssh was not affected because git-annex hangs up the ssh connection and so never sees the error message that git-annex-shell probably did emit. This commit was sponsored by Ryan Newton on Patreon.	2018-09-13 10:46:37 -04:00
Joey Hess	872640549b	comment typo	2018-09-05 13:57:06 -04:00
Joey Hess	f4788f3853	clarify comment haskell-mountpoints contains android specific code, but it's not used when git-annex was built for linux and is running on android.	2018-09-05 11:22:27 -04:00
Joey Hess	55f8d90dee	remove Utility.SRV, no longer used	2018-09-05 11:15:33 -04:00
Joey Hess	f54c72d2e1	Fix build on FreeBSD This must have been broken for years.. This commit was sponsored by Jack Hill on Patreon.	2018-08-29 12:09:03 -04:00
Joey Hess	c565340adc	stop using external hash programs, since cryptonite is faster In 2013, I wrote "Cryptohash benchmarks 90 to 101% faster than external hashers". Re-benchmarking today, I found cryptonite's sha256 consistently outperformed coreutils by 10% for large files. Tested 10 mb, 100 mb, 1 gb files with both sha256 and sha512. And for smaller files, the external process startup time swamps the hash time. Perhaps cryptonite has improved. Or it could just do better on my current CPU (Intel(R) Pentium(R) CPU 4410Y @ 1.50GHz). Anyway, even if cryptonite is slower in some situations, seems likely it would only be marginally slower; it's got the same class of highly optimised C code under the hood as coreutils. The main difference between the two sha256 implementations seems to be how much of the inner loop they unroll. This commit was sponsored by Henrik Riomar on Patreon.	2018-08-28 18:10:58 -04:00
Joey Hess	6a445dc086	support conditionally excluding queued files Switched code to use a for loop to avoid a filterM that would have doubled the memory used. This commit was supported by the NSF-funded DataLad project.	2018-08-16 14:38:37 -04:00
Joey Hess	218c76b789	avoid unused imports warning on non-linux	2018-08-07 15:06:33 -04:00
Joey Hess	e1ab01f94d	Fix reversion in display of http 404 errors. Switch to using http-client for large file downloads caused the reversion; the code for displaying a 404 response was instead displaying the raw html document, which is not useful. This commit was sponsored by Ryan Newton on Patreon.	2018-07-31 12:15:26 -04:00
Joey Hess	50609da787	fix User-Agent reversion Send User-Agent and any configured annex.http-headers when downloading with http, fixes reversion introduced when switching to http-client. This commit was sponsored by mo on Patreon.	2018-07-16 11:56:47 -04:00
Joey Hess	ac228fa723	don't import all of System.Posix.Files This avoids a build problem when different versions of posix and posixcompat are used. Does not normally happen as cabal prevents that, but this is sometimes used with ghc --make which can get into that situation.	2018-07-10 12:04:49 -04:00
Joey Hess	3dd7f450c1	fix p2p --pair p2p --pair: Fix interception of the magic-wormhole pairing code, which since 0.8.2 it has sent to stderr rather than stdout. This is highly annoying because I had asked the magic wormhole developers for a machine-readable way to get the data, and instead they changed how the data was output, and didn't even mention this in my issue, or in the changelog. Seems this needs to be tested periodically to make sure it's still working. This commit was sponsored by Ethan Aubin.	2018-07-04 15:14:03 -04:00
Joey Hess	3976b89116	fix license date I wrote this this year	2018-06-22 10:25:53 -04:00
Joey Hess	22f49f216e	get android building the security fix Had to update http-client and network, with follow-on dep changes. This commit was sponsored by Brock Spratlen on Patreon.	2018-06-21 10:23:04 -04:00
Joey Hess	923578ad78	improve error message This commit was sponsored by Jack Hill on Patreon.	2018-06-19 14:21:41 -04:00
Joey Hess	47cd8001bc	call base ManagerSetting's exception wrapper This commit was sponsored by Henrik Riomar on Patreon.	2018-06-19 14:17:05 -04:00
Joey Hess	fc79f68404	support building on debian stable Specifically, http-client-0.4.31 This commit was supported by the NSF-funded DataLad project.	2018-06-19 11:25:10 -04:00
Joey Hess	3c0a538335	allow ftp urls by default They're no worse than http certainly. And, the backport of these security fixes has to deal with wget, which supports http https and ftp and has no way to turn off individual schemes, so this will make that easier.	2018-06-18 15:37:17 -04:00
Joey Hess	cc08135e65	prevent using local http proxies per annex.security.allowed-http-addresses A local http proxy would bypass the security configuration. So, the security configuration has to be applied when choosing whether to use the proxy. While http rebinding attacks against the dns lookup of the proxy IP address seem very unlikely, this implementation does prevent them, since it resolves the IP address once, checks it, and then reconfigures http-client's proxy using the resolved address. This commit was sponsored by Ole-Morten Duesund on Patreon.	2018-06-18 13:32:20 -04:00
Joey Hess	b54b2cdc0e	prevent http connections to localhost and private ips by default Security fix! * git-annex will refuse to download content from http servers on localhost, or any private IP addresses, to prevent accidental exposure of internal data. This can be overridden with the annex.security.allowed-http-addresses setting. * Since curl's interface does not have a way to prevent it from accessing localhost or private IP addresses, curl defaults to not being used for url downloads, even if annex.web-options enabled it before. Only when annex.security.allowed-http-addresses=all will curl be used. Since S3 and WebDav use the Manager, the same policies apply to them too. youtube-dl is not handled yet, and a http proxy configuration can bypass these checks too. Those cases are still TBD. This commit was sponsored by Jeff Goeke-Smith on Patreon.	2018-06-17 13:30:28 -04:00
Joey Hess	43bf219a3c	added makeAddressMatcher Would be nice to add CIDR notation to this, but this is the minimal thing needed for the security fix. This commit was sponsored by Ewen McNeill on Patreon.	2018-06-17 13:29:15 -04:00
Joey Hess	014a3fef34	added isPrivateAddress and isLoopbackAddress For use in a security boundary enforcement. Based on https://en.wikipedia.org/wiki/Reserved_IP_addresses Including supporting IPv4 addresses embedded in IPv6 addresses. Because while RFC6052 3.1 says "Address translators MUST NOT translate packets in which an address is composed of the Well-Known Prefix and a non- global IPv4 address; they MUST drop these packets", I don't want to trust that implementations get that right when enforcing a security boundary. This commit was sponsored by John Pellman on Patreon.	2018-06-17 13:28:25 -04:00
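
The address checks described in commits 014a3fef34 and b54b2cdc0e can be illustrated with a minimal sketch. This is not git-annex's Haskell implementation: it is a Python approximation using the standard `ipaddress` module, and the names `embedded_ipv4` and `is_blocked` are hypothetical. It assumes that "embedded" means either an IPv4-mapped address (`::ffff:a.b.c.d`) or one under the RFC 6052 well-known prefix (`64:ff9b::/96`); the real code may cover more cases.

```python
import ipaddress

# RFC 6052 well-known prefix; the last 32 bits carry an IPv4 address.
NAT64_PREFIX = ipaddress.ip_network("64:ff9b::/96")


def embedded_ipv4(addr):
    """Return the IPv4 address embedded in an IPv6 address, or None."""
    if addr.version != 6:
        return None
    if addr.ipv4_mapped is not None:  # ::ffff:a.b.c.d
        return addr.ipv4_mapped
    if addr in NAT64_PREFIX:  # 64:ff9b::a.b.c.d
        return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)
    return None


def is_blocked(text):
    """True if the address is loopback, private, or link-local.

    Embedded IPv4 addresses are unwrapped first, so the decision does
    not depend on a translator dropping such packets (hypothetical
    policy, chosen for illustration).
    """
    addr = ipaddress.ip_address(text)
    inner = embedded_ipv4(addr)
    if inner is not None:
        addr = inner
    return addr.is_loopback or addr.is_private or addr.is_link_local
```

For example, `is_blocked("::ffff:127.0.0.1")` and `is_blocked("64:ff9b::10.0.0.1")` are both true under this sketch, while `is_blocked("8.8.8.8")` is false. Unwrapping before checking mirrors the reasoning in the commit message: the check should not trust that other implementations refuse to translate non-global embedded addresses.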

1449 commits