git-annex

Author	SHA1	Message	Date
Joey Hess	efae085272	fixed reconcileStaged crash when index is locked or in conflict Eg, when git commit runs the smudge filter. Commit `428c91606b` introduced the crash, as write-tree fails in those situations. Now it will work, and git-annex always gets up-to-date information even in those situations. It does need to do a bit more work, each time git-annex is run with the index locked. Although if the index is unmodified from the last time write-tree succeeded, that work is avoided.	2021-05-24 11:33:23 -04:00
Joey Hess	984034f335	filter-branch working aside from some edge cases Added a note to man page about what happens to information that is recorded in the private journal. Since it uses Branch.get, that information will be copied when options allow. It seemed better to allow it and document it than not allow it, since the options allow excluding repositories and so can be used to exclude private repos if desired.	2021-05-17 13:24:58 -04:00
Joey Hess	1d16654a22	convert formatLsTree to ByteString for speed	2021-05-17 10:46:24 -04:00
Joey Hess	4bf7940d6b	fileRef: make paths relative and simplified Fix behavior of several commands, including reinject, addurl, and rmurl when given an absolute path to an unlocked file, or a relative path that leaves and re-enters the repository. To avoid slowing down all the cases where the paths are already ok with an unnecessary call to getCurrentDirectory, put in an optimisation in relPathCwdToFile. That will probably also speed up other parts of git-annex by some small amount, but I have not benchmarked. Note that I did not convert branchFileRef, because it seems likely that it will be used with a file that is not provided by the user, so is already in a sane format. This is certainly true for the way git-annex uses it, though maybe arguable to the extent Git.Ref is a reusable library.	2021-05-07 13:25:59 -04:00
Joey Hess	32138b8cd8	implement annex.privateremote and remote.name.private configs The slightly unusual parsing in Types.GitConfig avoids the need to look at the remote list to get configs of remotes. annexPrivateRepos combines all the configs, and will only be calculated once, so it's nice and fast. privateUUIDsKnown and regardingPrivateUUID now need to read from the annex mvar, so are not entirely free. But that overhead can be optimised away, as seen in getJournalFileStale. The other call sites didn't seem worth optimising to save a single MVar access. The feature should have imperceptible speed overhead when not being used.	2021-04-23 14:21:57 -04:00
Joey Hess	0e830b6bb5	make remoteKeyToRemoteName safer If it's passed a ConfigKey such as annex.version, avoid returning an empty remote name and return Nothing instead. Also, foo.bar.baz is not treated as a remote named "bar".	2021-04-23 13:29:21 -04:00
Joey Hess	5712a7ef93	fix incomplete pattern match warning There was not really a bug here, because the 2 lists are always the same length, but the compiler does not know that.	2021-03-30 12:59:53 -04:00
Joey Hess	4611813ef1	Fix bug importing from a special remote into a subdirectory more than one level deep Which generated unusual git trees that could confuse git merge, since they incorrectly had 2 subtrees with the same name. Root of the bug was a) not testing that at all! but also b) confusing graftdirs, which contains eg "foo/bar" with non-recursively read trees, which would contain eg "bar" when reading a subtree of "foo". It's worth noting that Annex.Import uses graftTree, but it really shouldn't have needed to. Eg, when importing into foo/bar from a remote, it's enough to generate a tree of foo/bar/x, foo/bar/y, and does not include other files that are at the top of the master branch. It uses graftTree, so it does include the other files, as well as the foo/bar tree. git merge will do the same thing for both trees. With that said, switching it away from graftTree would result in another import generating a new commit that seems to delete files that were there in a previous commit, so it probably has to keep using graftTree since it used it before. This commit was sponsored by Kevin Mueller on Patreon.	2021-03-26 16:04:36 -04:00
Joey Hess	5d78cd9d08	Sped up git-annex init in a clone of an existing repository Seems that hasOrigin was never finding origin's git-annex branch, so a new one got created each time. And so then it later needed to merge the two branches, which is expensive. Added --no-track to git branch to avoid it displaying a message about setting up tracking branches. Of course there's no reason to make the git-annex branch a tracking branch since git-annex auto-merges it.	2021-03-23 15:23:13 -04:00
Joey Hess	a8b837aaef	add git ls-tree --long parser Not yet used, but allows getting the size of items in the tree fairly cheaply. I noticed that CmdLine.Seek uses ls-tree and then feeds the files into another long-running process to check their size. That would be an example of a place that might be sped up by using this. Although in that particular case, it only needs to know the size of unlocked files, not locked. And since enabling --long probably doubles the ls-tree runtime or more, the overhead of using it there may outweigh the benefit.	2021-03-23 12:47:00 -04:00
Joey Hess	ed717cf646	fix handling of subtree I don't think this actually fixes any buggy behavior in git-annex, I just noticed that using treeItemToLsTreeItem and then serializing it resulted in something starting with "160000 blob" rather than "160000 commit"	2021-03-12 13:24:19 -04:00
Joey Hess	4b57e1c0ad	allow adjusttreeitem to remove submodules	2021-03-12 13:19:23 -04:00
Joey Hess	e07eabbf7f	Fix support for local gcrypt repositories with a space in their URI Git.Remote.parseRemoteLocation had a hack to handle URIs that contained characters like spaces, which is something git unfortunately allows despite not being a valid URI. However, that hack looked for "//" to guess something was an URI, and these gcrypt URIs, being to a local path, don't contain that. So instead escape all illegal characters and check if the resulting thing is an URI. And that was already done by Git.Construct.fromUrl, so internally the gcrypt URI with a space looks like "gcrypt::foo%20bar" and that needs to be de-escaped when converting back from URI to local repo path. This change might also allow a few other almost-valid URIs to be handled as URIs by git-annex. None that contain "//" will change, and any behavior change should result in git-annex doing closer to a right thing than it did before, probably. This commit was sponsored by Noam Kremen on Patreon.	2021-03-09 12:49:51 -04:00
Joey Hess	3a66cd715f	avoid making absolute git remote path relative When a git remote is configured with an absolute path, use that path, rather than making it relative. If it's configured with a relative path, use that. Git.Construct.fromPath changed to preserve the path as-is, rather than making it absolute. And Annex.new changed to not convert the path to relative. Instead, Git.CurrentRepo.get generates a relative path. A few things that used fromAbsPath unnecessarily were changed in passing to use fromPath instead. I'm seeing fromAbsPath as a security check, while before it was being used in some cases when the path was known absolute already. It may be that fromAbsPath is not really needed, but only git-annex-shell uses it now, and I'm not 100% sure that there's not some input that would cause a relative path to be used, opening a security hole, without the security check. So left it as-is. Test suite passes and strace shows the configured remote url is used unchanged in the path into it. I can't be 100% sure there's not some code somewhere that takes an absolute path to the repo and converts it to relative and uses it, but it seems pretty unlikely that the code paths used for a git remote would call such code. One place I know of is gitAnnexLink, but I'm pretty sure that git remotes never deal with annex symlinks. If that did get called, it generates a path relative to cwd, which would have been wrong before this change as well, when operating on a remote.	2021-02-08 13:18:01 -04:00
Joey Hess	e3224ff77d	formatLsTree did not use a tab where git does Fixed that, and made parserLsTree accept the space as well as tab. Fixes a reversion that made import of a tree from a special remote result in a merge that deleted files that were not preferred content of that special remote.	2021-01-28 12:36:37 -04:00
Joey Hess	e7134ca1eb	avoid partial functions in Git.Url After the last commit, it was able to throw errors just due to an unparseable url. This avoids needing to worry about that, as long as the call site has already checked that it has a parseable url.	2021-01-18 15:07:23 -04:00
Joey Hess	2aa4fab62a	avoid crashing when there are remotes using unparseable urls Including the non-standard URI form that git-remote-gcrypt uses for rsync. Eg, "ook://foo:bar" cannot be parsed because "bar" is not a valid port number. But git could have a remote with that, it would try to run git-remote-ook to handle it. So, git-annex has to allow for such things, rather than crashing. This commit was sponsored by Luke Shumaker on Patreon.	2021-01-18 14:59:08 -04:00
Joey Hess	5193aae385	Bug fix: Fix tilde expansion in ssh urls when the tilde is the last character in the url. Thanks, Grond for the patch.	2021-01-18 12:22:48 -04:00
Joey Hess	dc0caef297	merge from git-repair	2021-01-11 21:57:35 -04:00
Joey Hess	33bcee86f1	avoid using wildcard near bug kyle fixed	2021-01-07 13:44:23 -04:00
Kyle Meyer	fd161da2c2	adjustTree: Consider submodule deletions In addition to regular file deletions, the removefiles argument passed to adjustTree may contain removed submodules. When making the new tree, filter these out in the same way that is done for regular files so that the deletion is propagated.	2021-01-07 13:43:09 -04:00
Joey Hess	cc89699457	mincopies This is conceptually very simple, just making a 1 that was hard coded be exposed as a config option. The hard part was plumbing all that, and dealing with complexities like reading it from git attributes at the same time that numcopies is read. Behavior change: When numcopies is set to 0, git-annex used to drop content without requiring any copies. Now to get that (highly unsafe) behavior, mincopies also needs to be set to 0. It seemed better to remove that edge case, than complicate mincopies by ignoring it when numcopies is 0. This commit was sponsored by Denis Dzyubenko on Patreon.	2021-01-06 14:15:19 -04:00
Joey Hess	1c5fc8f047	Git.Queue: allow providing git common options like -c	2021-01-04 12:51:55 -04:00
Joey Hess	cd776ecb2e	avoid combining queued commands with different params I don't think this affected git-annex currently, but if the same command was queued twice with different params, one set of params was thrown away, and the files going with those were run with the other set of params.	2021-01-04 12:41:19 -04:00
Joey Hess	5d8e4a7c74	avoid borg list of archives that have been listed before This makes sync a lot faster in the common case where there's no new backup. There's still room for it to be faster. Currently the old imported tree has to be traversed, to generate the ImportableContents. Which then gets turned around to generate the new imported tree, which is identical. So, it would be possible to just return a "no new imports", or an ImportableContents that has a way to graft in a tree. The latter is probably too far to go to optimise this, unless other things need it. The former might be worth it, but it's already pretty fast, since git ls-tree is pretty fast.	2020-12-22 14:06:40 -04:00
Joey Hess	a3b714ddd9	finish fixing removeLink on windows `9cb250f7be` got the ones in RawFilePath, but there were others that used the one from unix-compat, which fails at runtime on windows. To avoid this, import System.PosixCompat.Files hiding removeLink This commit was sponsored by Ethan Aubin.	2020-11-24 13:20:44 -04:00
Joey Hess	804808d569	squash build warnings on windows	2020-11-23 14:00:17 -04:00
Joey Hess	fcb1d67b41	fix build on windows	2020-11-20 12:53:25 -04:00
Joey Hess	ff0927bde9	converted reads from stderr to use hGetLineUntilExitOrEOF These are all unlikely to suffer from the inherited stderr fd problem, but who knows, it could happen.	2020-11-19 16:21:17 -04:00
Joey Hess	66497d39b3	convert git config reading to use hGetLineUntilExitOrEOF Much nicer than the old hack of waiting for a few seconds for stderr to be read.	2020-11-19 15:38:43 -04:00
Joey Hess	6b63278f31	init: When writing hook scripts, set all execute bits, not only the user execute bit	2020-11-17 13:31:12 -04:00
Joey Hess	0121f5f6d3	support parsing numeric git configs as bool I'm not sure if git documents it aside from 0 and 1, but any integer can be interpreted as a bool by it. Doing the same in git-annex is good for consistency. Also, I am planning a config that starts out as a numeric range, but will later transition to a simple bool (hopefully), which this interpretation supports well.	2020-11-16 10:09:25 -04:00
Joey Hess	885974be99	add newtypes for QuickCheck to avoid LANG=C issues All properties changed to use them, except for prop_encode_c_decode_c_roundtrip, which already filtered to ascii for other reasons. A few modules had to be split out, because Setup does not build-depend on QuickCheck.	2020-11-09 20:21:18 -04:00
Joey Hess	2c8cf06e75	more RawFilePath conversion Converted file mode setting to it, and follow-on changes. Compiles up through 369/646. This commit was sponsored by Ethan Aubin.	2020-11-05 18:45:37 -04:00
Joey Hess	5a1e73617d	finished this stage of the RawFilePath conversion Finally compiles again, and test suite passes. This commit was sponsored by Brock Spratlen on Patreon.	2020-11-04 14:20:37 -04:00
Joey Hess	eb42cd4d46	more RawFilePath conversion 535/645 This commit was sponsored by Brett Eisenberg on Patreon.	2020-11-03 10:11:04 -04:00
Joey Hess	55400a03d3	more RawFilePath conversion This commit was sponsored by Luke Shumaker on Patreon.	2020-11-02 16:31:28 -04:00
Joey Hess	87f91ce563	more RawFilePath conversion 451/645	2020-10-30 15:55:59 -04:00
Joey Hess	681b44236a	more RawFilePath conversion at 377/645 This commit was sponsored by Svenne Krap on Patreon.	2020-10-29 14:20:57 -04:00
Joey Hess	f45ad178cb	more RawFilePath conversion At 318/645 after 4k lines of changes This commit was sponsored by Jake Vosloo on Patreon.	2020-10-29 12:03:50 -04:00
Joey Hess	e505c03bcc	more RawFilePath conversion nukeFile replaced with removeWhenExistsWith removeLink, which allows using RawFilePath. Utility.Directory cannot use RawFilePath since setup does not depend on posix. This commit was sponsored by Graham Spencer on Patreon.	2020-10-29 10:50:29 -04:00
Joey Hess	8d66f7ba0f	more RawFilePath conversion Added a RawFilePath createDirectory and kept making stuff build. Up to 296/645 This commit was sponsored by Mark Reidenbach on Patreon.	2020-10-28 17:25:59 -04:00
Joey Hess	b8bd2e45e3	more RawFilePath conversion Notable wins in Annex.Locations which was sometimes doing 6 conversions in a single function call. This commit was sponsored by Denis Dzyubenko on Patreon.	2020-10-28 16:24:14 -04:00
Joey Hess	6c29817748	RawFilePath version of getCurrentDirectory This commit was sponsored by Jochen Bartl on Patreon	2020-10-28 16:03:45 -04:00
Joey Hess	08cbaee1f8	more RawFilePath conversion Most of Git/ builds now. Notable win is toTopFilePath no longer double converts This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2020-10-28 15:55:30 -04:00
Joey Hess	f167851628	Revert "pass --git-dir, rather than changing cwd" This reverts commit `c142696c58`. It turns out it was not needed; `681313dfd4` fixed up the git dir, so setting cwd to it works ok. But worst, this commit broke the test suite massively. I don't understand how. git-annex get was failing. Very weirdly, git-annex find in a fresh clone of an annex repo, during autoinit, was displaying a side message -- but side messages are disabled when running find.	2020-10-23 16:09:50 -04:00
Joey Hess	681313dfd4	deal with .git pointer file in Git.CurrentRepo This fixes the bug. Note, it's only done when GIT_DIR is set. When it's not set, Git.Construct already handled it. This is why it was only noticed with this git submodule command. This commit was sponsored by Brett Eisenberg on Patreon.	2020-10-23 14:56:12 -04:00
Joey Hess	c142696c58	pass --git-dir, rather than changing cwd If .git is a gitlink file, setting cwd to it will fail, but --git-dir will succeed. And this is the only place where it sets cwd when running git, everywhere else already uses --git-dir. Note that git-annex's submodule fixup code usually converts gitlink files to symlinks, so this wasn't usually a problem. Still, worth fixing. This commit was sponsored by Svenne Krap on Patreon.	2020-10-23 13:36:56 -04:00
Joey Hess	f624876dc2	remove zombie process in file seeking This was the last one marked as a zombie. There might be others I don't know about, but except for in the hypothetical case of a thread dying due to an async exception before it can wait on a process it started, I don't know of any. It would probably be safe to remove the reapZombies now, but let's wait and do that in its own commit in case it turns out to cause problems. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2020-09-25 11:38:42 -04:00
Joey Hess	ca454c47f2	explicitly wait for a git process Eliminate a zombie that was only cleaned up by the later zombie cleanup code. This is still not ideal, it would be cleaner if it used conduit or something, and if the thread gets killed before waiting, it won't stop the process. Only remaining zombies are in CmdLine.Seek	2020-09-25 11:03:12 -04:00
Joey Hess	5abb0f86c4	update comments	2020-09-07 13:03:51 -04:00
Joey Hess	41ebed3941	Support git remotes where .git is a file, not a directory Eg when --separate-git-dir was used, and core.symlinks=false. This commit was sponsored by Brock Spratlen on Patreon.	2020-08-28 15:08:14 -04:00
Joey Hess	cb74cefde7	Fix a hang when using git-annex with an old openssh 7.2p2 Which had some weird inheriting of ssh FDs by sshd. Bug was introduced in git-annex version 7.20200202.7.	2020-07-21 16:14:25 -04:00
Joey Hess	b88ecb36dd	remove unused code	2020-07-15 11:16:36 -04:00
Joey Hess	992fe446ad	unused import	2020-07-15 11:16:15 -04:00
Joey Hess	8cd0e351ba	avoid using cat-file --buffer if git is too old This is only needed for the i386ancient build, so build in the git version git-annex is built with, assuming git won't be upgraded, or if it is, they just won't get the speedup of --buffer	2020-07-15 10:56:32 -04:00
Joey Hess	535cdc8d48	importfeed: Made checking known urls step around 10% faster. This was a bit disappointing, I was hoping for a 2x speedup. But, I think the metadata lookup is wasting a lot of time and also needs to be made to stream. The changes to catObjectStreamLsTree were benchmarked to not also speed up --all around 3% more. Seems I managed to make it polymorphic after all.	2020-07-14 12:47:51 -04:00
Joey Hess	15c0207a23	reword comment better	2020-07-13 13:03:09 -04:00
Joey Hess	24550e010b	update comment to match behavior	2020-07-13 12:52:57 -04:00
Joey Hess	5387b95dcd	add catObjectMetaDataStream	2020-07-10 14:36:18 -04:00
Joey Hess	bf72316b08	add function split out from CatFile	2020-07-10 13:28:16 -04:00
Joey Hess	bd2d304064	better catObjectStream' and use Chan The catObjectStream' is generic enough to let it be nicely used from inside Annex monad. Chan will be faster than DList here. Bearing in mind, it is unbounded, but in reality will be bounded by the size of the stdio buffer through git cat-file. This speeds up --all by about 10% although I think only getting back to the previous performance before I introduced that DList.	2020-07-10 13:15:14 -04:00
Joey Hess	cb6e19f4c5	work around catObjectStream polymorphism perf Breaking it up like this doesn't change perf, and lets another version be written in just a couple lines.	2020-07-09 14:27:07 -04:00
Joey Hess	9f6bd6cc05	add inRepoDetails planned to use for an optimisation most things using stagedDetails were not expecting to get dup files in a conflicted merge and deal with them, so converted them to use inRepoDetails.	2020-07-08 15:36:35 -04:00
Joey Hess	7347e50123	add stage number to stagedDetails parser And convert parser to attoparsec, probably faster. Before, a parse failure threw the whole --stage output line into the filename, which was certainly a bad idea, so fixed that.	2020-07-08 15:05:12 -04:00
Joey Hess	c1eaf5b930	note	2020-07-08 14:21:37 -04:00
Joey Hess	d08c178f97	avoid catObjectStream skipping over unavailable shas Not needed as it's used for --all, but will be needed later.	2020-07-08 13:57:17 -04:00
Joey Hess	de3d7d044d	make catObjectStream support newline and carriage return in filenames Turns out the %(rest) trick was not needed. Instead, just maintain a list of files we've asked for, and each cat-file response is for the next file in the list. This actually benchmarks 25% faster than before! Very surprising, but it must be due to needing to shove less data through the pipe, and parse less.	2020-07-08 13:49:03 -04:00
Joey Hess	d010ab04be	sped up the --all option by 2x to 16x by using git cat-file --buffer This assumes that no location log files will have a newline or carriage return in their name. catObjectStream skips any such files due to cat-file not supporting them. Keys have been prevented from containing newlines since 2011, commit `480495beb4`. If some old repo had a key with a newline in it, --all will just skip processing that key. Other things, like .git/annex/unused files certainly assume no newlines in keys too, and AFAICR, such keys never actually worked. Carriage return is escaped by preSanitizeKeyName since 2013. WORM keys generated before that point could perhaps contain a CR. (URL probably not, http probably doesn't support an URL with a raw CR in it.) So, added a warning in fsck about such keys. Although, fsck --all will naturally skip them, so won't be able to warn about them. Not entirely satisfactory, but I'll bet there are not really any such keys in existence. Thanks to Lukey for finding this optimisation.	2020-07-07 13:54:04 -04:00
Joey Hess	e41f8c83f3	close stdin handles before waiting on commands Fixes reversion in recent conversions, the old code relied on the GC apparently, but the new code explicitly waits on the process, so must close stdin handle first or the command will never exit.	2020-06-05 17:27:49 -04:00
Joey Hess	05703893af	use right handle	2020-06-05 16:38:11 -04:00
Joey Hess	319f2a4afc	audit all uses of SomeException to avoid catching async exceptions Except for the assistant, which I think may use them between threads? Most of the uses of SomeException were already catching only async exceptions. But I did find a few places that were accidentally catching them.	2020-06-05 15:16:57 -04:00
Joey Hess	2670890b17	convert to withCreateProcess for async exception safety This handles all createProcessSuccess callers, and aside from process pools, the complete conversion of all process running to async exception safety should be complete now. Also, was able to remove from Utility.Process the old API that I now know was not a good idea. And proof it was bad: The code size went down, despite there being a fair bit of boilerplate for some future API to reduce.	2020-06-04 15:45:52 -04:00
Joey Hess	438dbe3b66	convert to withCreateProcess for async exception safety This handles all sites where checkSuccessProcess/ignoreFailureProcess is used, except for one: Git.Command.pipeReadLazy That one will be significantly more work to convert to bracketing. (Also skipped Command.Assistant.autoStart, but it does not need to shut down the processes it started on exception because they are git-annex assistant daemons..) forceSuccessProcess is done, except for createProcessSuccess. All call sites of createProcessSuccess will need to be converted to bracketing. (process pools still todo also)	2020-06-04 12:44:09 -04:00
Joey Hess	2dc7b5186a	convert to withCreateProcess for async exception safety	2020-06-04 12:05:25 -04:00
Joey Hess	92f775eba0	convert to withCreateProcess for async exception safety Not yet 100% done, so far I've grepped for waitForProcess and converted everything that uses that to start the process with withCreateProcess. Except for some things like P2P.IO and Assistant.TransferrerPool, and Utility.CoProcess, that manage a pool of processes. See #2 in https://git-annex.branchable.com/todo/more_extensive_retries_to_mask_transient_failures/#comment-209f8a8c38e63fb3a704e1282cb269c7 for how those will need to be dealt with. checkSuccessProcess, ignoreFailureProcess, and forceSuccessProcess call waitForProcess, so callers of them will also need to be dealt with, and have not been yet.	2020-06-03 15:48:09 -04:00
Joey Hess	89b2542d3c	annex.skipunknown with transition plan Added annex.skipunknown git config, that can be set to false to change the behavior of commands like `git annex get foo*`, to not skip over files/dirs that are not checked into git and are explicitly listed in the command line. Significant complexity was needed to handle git-annex add, which uses some git ls-files calls, but needs to not use --error-unmatch because of course the files are not known to git. annex.skipunknown is planned to change to default to false in a git-annex release in early 2022. There's a todo for that.	2020-05-28 15:55:17 -04:00
Joey Hess	dfc4e641b5	repair: Improve fetching from a remote with an url in host:path format. User reported git@my.gitlab.foo:username/myrepo.git didn't work with git-repair, because it rewrites it to an url ssh://git@my.gitlab.foo/~/username/myrepo.git and the /~/ was not something the hosting site supported. Since git-annex still generally needs the repo url to be well, an url, did not change the conversion code. But in this case, we're running git fetch, so we might as well pass it the remote name rather than the url. Did a quick audit of repoLocation uses to see if there was anything else like this problem elsewhere, and didn't see any. But this is not the first time this special case in git and git-annex's attempt to de-special-case it has caused a problem..	2020-05-04 15:32:06 -04:00
Joey Hess	f85ca7dc80	fix all remaining -Wincomplete-uni-patterns warnings A couple of these were probably actual bugs in edge cases. Most of the changes I'm fine with. The fact that aeson's object returns something that we know will be an Object, but the type checker does not know is kind of annoying.	2020-04-15 13:55:08 -04:00
Joey Hess	bcc0ec5b99	fix runtime crash on incomplete pattern match in lambda This was very surprising to me that it was not caught by -Wall, so I enabled -Wincomplete-uni-patterns to catch such things. It found a second one just lines above, but no others anywhere.	2020-04-13 16:03:21 -04:00
Joey Hess	5a62e8132d	When parsing git configs, support all the documented ways to write true and false, including "yes", "on", "1", etc. This change does impact git-annex config eg "git annex config --set annex.addunlocked on" will store "on" and new git-annex will understand that value, while old git-annex will error: git-annex: bad annex.addunlocked configuration in git annex config: Parse failure: near "on" That seems acceptable. Not special remote configs that are only documented as =true or =false however. Having git-annex support other values for those would break backwards compatibility when used with old versions of git-annex. And older versions ignore invalid special remote configs. That would not be a good combination.	2020-04-13 14:05:30 -04:00
Joey Hess	9cb69dbb76	support boolean git configs that are represented by the name of the setting with no value Eg "core.bare" is the same as "core.bare = true". Note that git treats "core.bare =" the same as "core.bare = false", so the code had to become more complicated in order to treat the absence of a value differently than an empty value. Ugh.	2020-04-13 13:35:22 -04:00
Joey Hess	ca9c6c5f60	Fix a potential failure to parse git config Git has an obnoxious special case in git config, a line "foo" is the same as "foo = true". That means there is no way to examine the output of git config and tell if it was run with --null or not, since a "foo" in the first line could be such a boolean, or could be followed by its value on the next line if --null were used. So, rather than trying to do such a detection, track the style of config at all the points where it's generated.	2020-04-13 13:05:41 -04:00
Joey Hess	86426036a0	optimise catfile interface with ByteString and Attoparsec Around 3% total speedup. Profiling git annex find --not --in web, it's now bytestring end-to-end, and there is only a little added overhead in eg accessing the Annex state MVar (3%). The rest of the runtime is spent reading symlinks, and in attoparsec. This feels like the end of the optimisation road, without a major change like caching information for faster queries.	2020-04-10 14:18:52 -04:00
Joey Hess	3c369997fc	remove unused import	2020-04-08 14:04:58 -04:00
Joey Hess	c0cd07c36b	Ref ByteString conversion done Test suite passes.	2020-04-07 17:41:09 -04:00
Joey Hess	6c81e0c8f1	ByteString Ref continued Several nice speed wins I think. At 340/633 files converted.	2020-04-07 13:27:11 -04:00
Joey Hess	d5d8259937	ByteString Ref continued Attoparsec parser for diff-tree. Changed fromRef back to producing a String, to avoid needing to convert every use of it. However, this does mean I'm going to miss some opportunities where fromRef is used and the result converted back to a ByteString. Would be worth revisiting that at some point maybe.	2020-04-07 11:54:27 -04:00
Joey Hess	279991604d	started converting Ref from String to ByteString This should make code that reads shas and refs from git faster. Does not compile yet, a lot needs to be done still.	2020-04-06 17:14:49 -04:00
Kyle Meyer	376e69ec65	adjust: Propagate submodule changes back to original branch When the recorded submodule commit changes on an adjusted branch, the change is carried in the function that reverseAdjustedCommit passes for adjustTree's adjusttreeitem parameter. Update the CommitObject handling in adjustTree to consider adjusttreeitem so that a submodule change is synced back.	2020-03-26 15:16:08 -04:00
Joey Hess	3440b77d1e	guard against unsafe git ls-files uses This breaks several parts of the upgrade code, when upgrading remotes of the current repo, but those parts were buggy, and will need to be fixed somehow anyway.	2020-03-09 15:55:34 -04:00
Joey Hess	9bbb73469e	foo	2020-03-09 15:55:00 -04:00
Joey Hess	70d24c0302	add a comment about CWD While git ls-files can actually be used on a repo that is not in the cwd, it works inconsistently. For example, this fails: git --git-dir=../foo/.git --work-tree=../foo ls-files ../foo But change some of the paths to absolute and it will succeed. That seems like a bug in git. OTOH, this succeeds: git --git-dir=../foo/.git --work-tree=../foo ls-files But, that lists paths relative to the top of the --work-tree, rather than the usual listing them relative to the cwd. Because the cwd is not in the repo. And so anything parsing the ls-files output of that is likely to operate on files in the wrong location. Indeed, there is code in Upgrade/ that has this problem!	2020-03-09 13:37:01 -04:00
Joey Hess	7f992ef59c	mostly finished with createDirectoryUnder conversion Remaining things needing converted are in the assistant, and Annex.Ssh. Every other remaining call to createDirectoryIfMissing True has been audited and is not relevant. The ones in Build/ of course don't get included in the program. Others included eg, Remote.Tahoe and Config.Files which both write to dotfiles under the home directory.	2020-03-06 11:57:15 -04:00
Joey Hess	029c883713	Merge branch 'master' into v8	2020-02-19 14:32:11 -04:00
Joey Hess	1f0fc9ff5f	remove unused import	2020-02-19 13:13:13 -04:00
Joey Hess	1883f7ef8f	support git remotes that need http basic auth using git credential to get the password One thing this doesn't do is wrap the password prompting inside the prompt action. So with -J, the output can be a bit garbled.	2020-01-22 16:16:19 -04:00
Joey Hess	75059c9f3b	better error message when git config fails to parse remote config Rather than leaking the name of the temp file, just say the config parse failed, and where the config was downloaded from. Not closing the bug report because two issues were reported in the same bug report, because the universe wants me to continually re-read old unclosed bug reports to waste my time determining what still needs to be done.	2020-01-22 13:35:54 -04:00
Joey Hess	6db4aee7df	use --no-abbrev instead of --abbrev=40 This avoids hardcoding the sha size, so when git uses sha256, it will output the full sha256 and not a truncation to 40 characters. I reviewed git's history, and while there have been some bugs with commands not supporting --no-abbrev (eg git diff --no-index --no-abbrev was broken in git 2.1), none of the commands git-annex uses will be impacted by those old bugs.	2020-01-07 12:29:37 -04:00
Joey Hess	5e4deb3620	support sha256 git repos Git will eventually switch to sha2 and there will not be one single shaSize anymore, but two (40 and 64). Changed all parsers for git plumbing output to support both sizes of shas. One potential problem this does not deal with is, if somewhere in git-annex it reads two shas from different sources, and compares them to see if they're the same sha, it would fail if they're sha1 and sha256 of the same value. I don't know if that will really be a concern.	2020-01-07 12:22:19 -04:00
Joey Hess	2cea674d1e	Merge branch 'master' into v8	2020-01-01 14:26:43 -04:00
Joey Hess	022dead40a	windows build fix	2020-01-01 13:46:03 -04:00
Joey Hess	f0b53d8465	windows build fix	2020-01-01 13:12:33 -04:00
Joey Hess	e006acc8e3	fix quickcheck failure prop_encode_decode_roundtrip failed on "\175" in C locale. This may be a new problem after the switch to RawFilePath, but it already had filtering for high chars, so changed to only test ascii chars.	2019-12-30 13:54:46 -04:00
Joey Hess	ea3cb7d277	fix a case where file tracked by git unexpectedly becomes annex pointer file smudge: When annex.largefiles=anything, files that were already stored in git, and have not been modified could sometimes be converted to being stored in the annex. Changes in 7.20191024 made this more of a problem. This case is now detected and prevented.	2019-12-27 15:08:03 -04:00
Joey Hess	2b821eb225	Merge branch 'master' into sqlite	2019-12-26 15:15:42 -04:00
Joey Hess	37467a008f	annex.addunlocked expressions * annex.addunlocked can be set to an expression with the same format used by annex.largefiles, in case you want to default to unlocking some files but not others. * annex.addunlocked can be configured by git-annex config. Added a git-annex-matching-expression man page, broken out from tips/largefiles. A tricky consequence of this is that git-annex add --relaxed honors annex.addunlocked, but an expression might want to know the size or content of an url, which it's not going to download. I decided it was better not to fail, and just dummy up some plausible data in that case. Performance impact should be negligible. The global config is already loaded for annex.largefiles. The expression only has to be parsed once, and in the simple true/false case, it should not do any additional work matching it.	2019-12-20 15:56:25 -04:00
Joey Hess	02e00fd7ab	Merge branch 'master' into sqlite	2019-12-19 16:33:42 -04:00
Joey Hess	16125694eb	keep filename ByteString Minor optimisation, since it still has to be copied from lazy to strict, but it will add up when doing a big merge.	2019-12-18 15:57:40 -04:00
Joey Hess	d5628a16b8	Merge branch 'bs' into sqlite-bs	2019-12-18 14:51:03 -04:00
Joey Hess	bdec7fed9c	convert TopFilePath to use RawFilePath Adds a dependency on filepath-bytestring, an as yet unreleased fork of filepath that operates on RawFilePath. Git.Repo also changed to use RawFilePath for the path to the repo. This does eliminate some RawFilePath -> FilePath -> RawFilePath conversions. And filepath-bytestring's </> is probably faster. But I don't expect a major performance improvement from this. This is mostly groundwork for making Annex.Location use RawFilePath, which will allow for a conversion-free pipeline.	2019-12-09 15:07:21 -04:00
Joey Hess	2f9a80d803	merging sqlite and bs branches Since the sqlite branch uses blobs extensively, there are some performance benefits, ByteStrings now get stored and retrieved w/o conversion in some cases like in Database.Export.	2019-12-06 15:30:45 -04:00
Joey Hess	f39f018ee0	fix git ls-tree parser File mode is octal not decimal. This broke in the conversion to attoparsec. (I've submitted the content of Utility.Attoparsec to the attoparsec developers.) Test suite passes 100% now.	2019-12-06 14:05:48 -04:00
Joey Hess	4aaef14c61	fix another quickcheck property broken by NUL in Arbitrary String	2019-12-06 13:13:08 -04:00
Joey Hess	faf5415163	add back lost filtering of multibyte chars in prop_encode_decode_roundtrip I had thought using ByteString would avoid the problem, but the quickcheck property is still taking Arbitrary String input, so the use of ByteString internally doesn't matter.	2019-12-06 12:14:55 -04:00
Joey Hess	c20f4704a7	all commands building except for assistant also, changed ConfigValue to a newtype, and moved it into Git.Config.	2019-12-05 14:41:18 -04:00
Joey Hess	3c7fd09ec8	get many more commands building again about half are building now	2019-12-05 11:40:10 -04:00
Joey Hess	f3047d7186	include git-annex-shell back in Also pushed ConfigKey down into the Git modules, which is the bulk of the changes.	2019-12-02 11:51:52 -04:00
Joey Hess	d7833def66	use ByteString for git config The parser and looking up config keys in the map should both be faster due to using ByteString. I had hoped this would speed up startup time, but any improvement to that was too small to measure. Seems worth keeping though. Note that the parser breaks up the ByteString, but a config map ends up pointing to the config as read, which is retained in memory until every value from it is no longer used. This can change memory usage patterns marginally, but won't affect git-annex.	2019-11-27 17:40:09 -04:00
Joey Hess	d830386ab2	update based on profiling While L.toStrict copies, profiling showed it was only around 0.3% of git-annex find runtime. Does not seem worth optimising that, which would probably involve either a major refactoring, or a use of UnsafeInterleaveIO. Also, it seems to me that the latter would need to read chunks, and prepend the leftover part to the next chunk. But a strict ByteString append itself is a copy, so I'm not convinced that would be faster than L.toStrict.	2019-11-27 14:09:11 -04:00
Joey Hess	067aabdd48	wip RawFilePath 2x git-annex find speedup Finally builds (oh the agony of making it build), but still very unmergable, only Command.Find is included and lots of stuff is badly hacked to make it compile. Benchmarking vs master, this git-annex find is significantly faster! Specifically (num files / old / new / speedup): 48500 / 4.77 / 3.73 / 28%; 12500 / 1.36 / 1.02 / 66%; 20 / 0.075 / 0.074 / 0% (so startup time is unchanged). That's without really finishing the optimization. Things still to do: * Eliminate all the fromRawFilePath, toRawFilePath, encodeBS, decodeBS conversions. * Use versions of IO actions like getFileStatus that take a RawFilePath. * Eliminate some Data.ByteString.Lazy.toStrict, which is a slow copy. * Use ByteString for parsing git config to speed up startup. It's likely several of those will speed up git-annex find further. And other commands will certainly benefit even more.	2019-11-26 16:01:58 -04:00
Joey Hess	6a97ff6b3a	wip RawFilePath Goal is to make git-annex faster by using ByteString for all the worktree traversal. For now, this is focusing on Command.Find, in order to benchmark how much it helps. (All other commands are temporarily disabled) Currently in a very bad unbuildable in-between state.	2019-11-25 16:18:19 -04:00
Joey Hess	d4661959de	Merge branch 'master' into sqlite	2019-11-21 17:26:50 -04:00
Joey Hess	43f19ef00a	Fix bug that made bare repos be treated as non-bare when --git-dir was used. Eg: git clone url --bare r git --git-dir r annex init This resulted in worktree = Just "." and so several things that check worktree to determine when the repo is bare ran code paths intended for non-bare. One such code path[1] ran git checkout with --worktree=. which actually makes it ignore core.bare config, and so the current directory got populated with a checkout of the master branch in this example. There was probably also other breakage. The fix is a bit complicated because whether the repo is bare is not known until after Git.Config reads the config, but Git.Config handles setting the RepoLocations's worktree when core.worktree is set. So have to assume the worktree is the cwd, let core.worktree override that, and then if the repo turns out to be bare, it's set back to Nothing. (And then GIT_WORK_TREE can still override all of that.) [1] switchHEADBack, which runs even when the clone is not from a bare repo.	2019-11-21 13:26:02 -04:00
Joey Hess	5877de5e80	git-lfs: remember urls, and autoenable remotes using known urls * git-lfs: The url provided to initremote/enableremote will now be stored in the git-annex branch, allowing enableremote to be used without an url. initremote --sameas can be used to add additional urls. * git-lfs: When there's a git remote with an url that's known to be used for git-lfs, automatically enable the special remote.	2019-11-18 16:09:09 -04:00
Joey Hess	99536e3a0b	remove one more warningIO Had to generalize Git.Queue so it can run an Annex action, yipes. Only remaining warningIO are in the legacy chunk code.	2019-11-12 10:45:52 -04:00
Joey Hess	dc9295017f	v8 upgrade of keys db Renamed the database to .git/annex/keysdb; the old .git/annex/keys gets deleted during the upgrade. It is possible that an old git-annex process is running during the upgrade. If so, it will be able to continue using the old keys db until the upgrade is complete, and then will presumably fail in some ugly way. Or perhaps the upgrade will be unable to delete the open files on some systems, and so fail with an ugly error message. It's also possible for multiple processes to be running the upgrade concurrently. That should be fine; they will both write the same information into the keys db. Other databases still need to be upgraded.	2019-11-06 16:16:00 -04:00
Joey Hess	89bdcffdfa	found a way to extract InodeCache from git index This will allow a race-free database transition. It is somewhat hairy in that it depends on an unspecified git output format.	2019-11-06 14:23:00 -04:00
Joey Hess	25f912de5b	benchmark: Add --databases to benchmark sqlite databases Rescued from commit `11d6e2e260` which removed db benchmarks in favor of benchmarking arbitrary git-annex commands. Which is nice and general, but microbenchmarks are useful too.	2019-10-29 16:59:27 -04:00
Joey Hess	bbdeb1a1a8	sync: Fix crash when there are submodules and an adjusted branch is checked out Reverse adjusting the branch uses treeItemToTreeContent, which was missed when adding submodule support earlier.	2019-10-23 11:52:56 -04:00
Joey Hess	f4dd7d5191	work around windows having infected git's plumbing Work around git cat-file --batch's odd stripping of carriage return from the end of the line (some windows infection), avoiding crashing when the repo contains a filename ending in a carriage return.	2019-10-08 15:27:05 -04:00
Joey Hess	45e5cc63b5	typo	2019-09-24 14:34:15 -04:00
Joey Hess	9418b516ac	git-credential interface	2019-09-24 12:39:54 -04:00
Joey Hess	92ea93ee21	update wording to match current wording in git	2019-09-15 19:01:05 -04:00
Joey Hess	fef3cd055d	Removed support for git versions older than 2.1 debian oldoldstable has 2.1, and that's what i386ancient uses. It would be better to require git 2.2, which is needed to use adjusted branches, but can't do that w/o losing support for some old linux kernels or a complicated git backport.	2019-09-11 16:14:43 -04:00
Joey Hess	3f0eef4baa	v7 for all repositories * Default to v7 for new repositories. * Automatically upgrade v5 repositories to v7.	2019-08-30 14:09:14 -04:00
Joey Hess	dc672863c3	init: Install working hook scripts when run on a crippled filesystem and on Windows	2019-08-13 15:14:17 -04:00
Joey Hess	bf5dd723d3	Fix querying git for object type when operating on a file containing newlines This typo would make "git cat-file cat-file" fail, and the way it's used, I think it broke querying all info from filenames containing newlines, because the other queries are only run when it succeeds.	2019-08-07 13:35:42 -04:00
Joey Hess	6c1130a3bb	lfs endpoint discovery and caching in git-lfs special remote	2019-08-02 12:38:14 -04:00
Joey Hess	8028f14957	correct the comment to match the implementation	2019-07-16 12:28:44 -04:00
Joey Hess	9a5ddda511	remove many old version ifdefs Drop support for building with ghc older than 8.4.4, and with older versions of several haskell libraries than will be included in Debian 10. The only remaining version ifdefs in the entire code base are now a couple for aws! This commit should only be merged after the Debian 10 release. And perhaps it will need to wait longer than that; it would make backporting new versions of git-annex to Debian 9 (stretch), which has been actively happening as recently as this year, harder. This commit was sponsored by Ilya Shlyakhter.	2019-07-05 15:09:37 -04:00
Joey Hess	8e5ea28c26	finish CommandStart transition The hoped for optimisation of CommandStart with -J did not materialize. In fact, not running CommandStart in parallel is slower than -J3. So, CommandStart are still run in parallel. (The actual bad performance I've been seeing with -J in my big repo has to do with building the remoteList.) But, this is still progress toward making -J faster, because it gets rid of the onlyActionOn roadblock in the way of making CommandCleanup jobs run separate from CommandPerform jobs. Added OnlyActionOn constructor for ActionItem which fixes the onlyActionOn breakage in the last commit. Made CustomOutput include an ActionItem, so even things using it can specify OnlyActionOn. In Command.Move and Command.Sync, there were CommandStarts that used includeCommandAction, so output messages, which is no longer allowed. Fixed by using startingCustomOutput, but that's still not quite right, since it prevents message display for the includeCommandAction run inside it too.	2019-06-12 13:24:01 -04:00
Joey Hess	97fd9da6e7	add back non-preferred files to imported tree Prevents merging the import from deleting the non-preferred files from the branch it's merged into. adjustTree previously appended the new list of items to the old, which could result in it generating a tree with multiple files with the same name. That is not good and confuses some parts of git. Gave it a function to resolve such conflicts. That allowed dealing with the problem of what happens when the import contains some files (or subtrees) with the same name as files that were filtered out of the export. The files from the import win.	2019-05-20 16:43:52 -04:00
Joey Hess	ecdbdf6180	add --verify Needed for the --quiet to actually shut it up. The extra verification this makes it do should be fine, as this is supposed to really return a single tree's sha.	2019-05-06 16:41:01 -04:00
Joey Hess	d08c19defb	avoid git warning on first import of subdir from a remote git rev-parse --quiet avoids "fatal: Invalid object name" when the branch does not exist. Git.Ref.tree already returned a Maybe, so callers already handle those cases themselves.	2019-05-06 16:29:34 -04:00
Joey Hess	4a8f02e939	don't empty historyCommitParents	2019-05-01 13:38:40 -04:00
Joey Hess	b69d11ec42	wip	2019-04-30 14:00:27 -04:00
Joey Hess	b9b3567747	added Git.History	2019-04-24 14:55:49 -04:00
Joey Hess	c8f7cb8558	fix incorrect comment The process will typically block until all input is read.	2019-04-24 14:29:46 -04:00
Joey Hess	b6a3d0ae10	fix test suite when git is too old to understand --allow-unrelated-histories	2019-03-22 13:49:22 -04:00

784 commits