git-annex

Author	SHA1	Message	Date
Joey Hess	561c036664	split out generic git log parser Sponsored-By: Jack Hill on Patreon	2023-11-10 15:40:03 -04:00
Joey Hess	8bde6101e3	sqlite database for importfeed importfeed: Use caching database to avoid needing to list urls on every run, and avoid using too much memory. Benchmarking in my podcasts repo, importfeed got 1.42 seconds faster, and memory use dropped from 203000k to 59408k. Database.ImportFeed is Database.ContentIdentifier with the serial number filed off. There is a bit of code duplication I would like to avoid, particularly recordAnnexBranchTree, and getAnnexBranchTree. But these use the persistent sqlite tables, so despite the code being the same, they cannot be factored out. Since this database includes the contentidentifier metadata, it will be slightly redundant if a sqlite database is ever added for metadata. I did consider making such a generic database and using it for this. But, that would then need importfeed to update both the url database and the metadata database, which is twice as much work diffing the git-annex branch trees. Or would entangle updating two databases in a complex way. So instead it seems better to optimise the database that importfeed needs, and if the metadata database is used by another command, use a little more disk space and do a little bit of redundant work to update it. Sponsored-by: unqueued on Patreon	2023-10-23 16:46:22 -04:00
Joey Hess	c2e60dd7a6	enable parallel ghc for building git-annex Via a build flag this time, that's off by default because hackage demands it be so, but that gets turned on by the Makefile and by stack.	2023-09-26 13:46:44 -04:00
Joey Hess	4ac2758ba5	Revert "enable parallel ghc for building git-annex" This reverts commit `3f6aff89b1`. Sadly hackage rejects cabal files using -j unless hidden behind an option that is disabled by default.	2023-09-26 13:34:28 -04:00
Joey Hess	b9240d2c5d	releasing package git-annex version 10.20230926	2023-09-26 13:29:49 -04:00
Joey Hess	3f6aff89b1	enable parallel ghc for building git-annex This drops a full recompile on my new 12 core laptop from 4:00 to 2:47. It would be possible for me to use: cabal configure --ghc-options=-j But that also makes cabal parallelize ghc for each package it installs to satisfy git-annex's dependencies. Since cabal is already configured to parallelize installing dependencies, that would use N^2 cpu cores, which seems like a bad idea. And also I'd have to remember to do it. So I'm thinking it's better to do it by default. If a system that is building git-annex is also busy with other things, let the scheduler sort it out. If this impacts someone particularly badly, they can of course avoid it with: cabal configure --ghc-options=-j1	2023-09-21 13:00:31 -04:00
Joey Hess	54da44d42a	Support being built with crypton rather than cryptonite crypton is a fork of cryptonite, and cryptonite's github repo has been archived. Some deps are already using crypton so it's clearly the way forward. Added a build flag without a default, so cabal configure will select on its own which to use. stack files pin to cryptonite for now. Sponsored-by: Nicholas Golder-Manning on Patreon	2023-09-21 12:43:42 -04:00
Joey Hess	50300a47fe	Removed the vendored git-lfs and the GitLfs build flag AFAICS all git-annex builds are using the git-lfs library not the vendored copy. Debian stable now includes a new enough haskell-git-lfs package as well. Last time this was tried it did not.	2023-08-28 13:12:31 -04:00
Joey Hess	5e818e4903	remove man pages from cabal file Since `393275c105` Setup.hs no longer installs the man pages. Since the cabal package is only used to install git-annex with cabal, it doesn't need to include files like these that are not used when installing with cabal.	2023-08-28 12:57:50 -04:00
Joey Hess	cf8b30c914	oldkeys: New command that lists the keys used by old versions of a file The tricky thing about this turned out to be handling renames and reverts. For that, it has to make two passes over the git log, and to avoid buffering a possibly huge amount of logs in memory (ie the whole git log of an entire repository!), runs git log twice. (It might be possible to speed this up by asking git log to show a diff, and so avoid needing to use catKey.) Sponsored-By: Brock Spratlen on Patreon	2023-08-22 14:51:06 -04:00
Joey Hess	977403d338	implement Unavailable for borg bup ddar directory rsync Only gcrypt remains to add support for. (Well, possibly also adb?) Sponsored-by: Luke T. Shumaker on Patreon	2023-08-16 15:48:09 -04:00
Joey Hess	be028f10e5	split out Utility.Url.Parse This is mostly for git-repair which can't include all of Utility.Url without adding many dependencies that are not really necessary.	2023-08-14 12:28:10 -04:00
Joey Hess	d19139a10d	releasing package git-annex version 10.20230802	2023-08-02 16:09:14 -04:00
Joey Hess	85aadcfa1e	windows back to lts-18.13 temporarily I can't seem to get stack to resolve dependencies with Win32-2.13.4.0, no matter what I try. Why it blows up, I don't know. And allow-newer: true actually causes it to downgrade Win32 to the one version that won't build. Unbelievable that it allows downgrades. So just gonna have to wait for that to get into stackage nightly, and then stack.yaml can be updated to use that, and the changes in this commit reverted.	2023-08-02 12:49:38 -04:00
Joey Hess	f1842b616a	fix stack build on windows For whatever reason, putting Win32-2.13.4.0 in stack.yaml results in stack blowing up with many unrelated dependency problems. But making git-annex depend on that version lets stack resolve deps.	2023-08-02 11:50:12 -04:00
Joey Hess	28864f0bb2	add back utf8-string to setup build deps Needed on Windows since Utility.FileSystemEncoding uses it	2023-08-02 09:29:23 -04:00
Joey Hess	6da6449fff	stack.yaml: Update to build with ghc-9.6.2 and aws-0.24 This enables some new features that need the new aws. Use http-client-restricted-0.1.0 because it uses the crypton side of the cryptonite/crypton fork, which seems to be needed for ghc-9.6.2. Dependency on connection removed because of the cryptonite/crypton fork. This avoids needing a build flag. It was only used to throw a typed exception in Utility.Url, which nothing depended on. Used a fork of bloomfilter because it's not being maintained and no longer builds as-of this ghc version. (I have been trying to contact its maintainer about it, and emailed him today suggesting I take over the package.) Sponsored-by: Brock Spratlen on Patreon	2023-08-01 18:53:26 -04:00
Joey Hess	68c9b08faf	fix build with unix-2.8.0 Changed the parameters to openFd. So needed to add a small wrapper library to keep supporting older versions as well.	2023-08-01 18:41:27 -04:00
Joey Hess	3b825eb7a6	rewrap	2023-08-01 15:47:05 -04:00
Joey Hess	fb640bc2f4	support building with unix-compat 0.7 It removed System.PosixCompat.User.	2023-08-01 15:17:43 -04:00
Joey Hess	393275c105	Setup.hs: Stop installing man pages, desktop files, and the git-annex-shell and git-remote-tor-annex symlinks Anything still relying on that, eg via cabal v1-install will need to change to using make install-home. Which was added back in 2019 in `6491b62614` because cabal new-build (now the default) already didn't use Setup in a way that let its installation of those things work. Notably this means Setup does not need to depend on unix-compat, which is useful because in 0.7 it removed System.PosixCompat.User, which Setup needed to determine where to install the desktop files. See https://github.com/haskell-pkg-janitors/unix-compat/issues/3	2023-08-01 15:08:56 -04:00
Joey Hess	e1fc9e204e	added git-annex satisfy This ended up having an interface like sync, rather than like get/copy/drop. That let it be implemented in terms of sync, which took a lot less code. Also, it lets it handle many of the edge cases that sync does, such as getting files that are not visible in a --hide-missing branch, and sending files to exporttree remotes. As well as being easier to implement, `git-annex satisfy myremote` makes sense as it satisfies the preferred content settings of the remote. `git-annex satisfy somefile` does not form a sentence that makes sense. So while -C can be a little bit annoying, it still makes sense to have this syntax. Note that, while I initially thought this would also satisfy numcopies, it does not. Arguably it ought to. But, sync does not send files in order to satisfy numcopies, it only sends files to satisfy preferred content. And it's important that this transfer the same files as sync does, because it will probably be used in a workflow where the user sometimes syncs and sometimes satisfies, and does not expect satisfy to do things that sync would not do. (Also opened a new bug that also affects sync et al, not only this command.) Sponsored-by: Nicholas Golder-Manning on Patreon	2023-06-29 15:34:53 -04:00
Joey Hess	1b9958f4fd	document git-annex satisfy	2023-06-29 14:15:01 -04:00
Joey Hess	a8779f4c2a	prep release	2023-06-26 10:41:36 -04:00
Joey Hess	2b2ec8fa63	lower optparse-applicative bounds after recent bump 0.14.2 included H.pretty. I tested with 0.16.1 and it displays ok using it.	2023-06-21 12:51:45 -04:00
Peter Simons	ffb708be09	Adapt code to optparse-applicative 0.18.1 and later. optparse-applicative switched to the 'prettyprinter' library in its latest release, which means the 'H.text' function has disappeared. Instead, 'H.pretty' can be used to convert all 'Pretty a' types into a renderable document.	2023-06-21 11:51:04 -04:00
Joey Hess	6821ba8dab	sync: use log to track adjusted branch needs updating Speeds up sync in an adjusted branch by avoiding re-adjusting the branch unnecessarily, particularly when it is adjusted with --hide-missing or --unlock-present. When there are a lot of files, that was the majority of the time of a --no-content sync. Uses a log file, which is updated when content presence changes. This adds a little bit of overhead to every file get/drop when on such an adjusted branch. The overhead is minimal for get of any size of file, but might be noticeable for drop in some cases. It seems like a reasonable trade-off. It would be possible to update the log file only at the end, but then it would not happen if the command is interrupted. When not in an adjusted branch, there should be no additional overhead. (getCurrentBranch is an MVar read, and it avoids the MVar read of getGitConfig.) Note that this does not deal with situations such as: git checkout master, git-annex get, git checkout adjusted branch, git-annex sync. The sync won't know that the adjusted branch needs to be updated. Dealing with that would add overhead to operation in non-adjusted branches, which I don't like. Also, there are other situations like having two adjusted branches that both need to be updated like this, and switching between them and sync not updating. This does mean a behavior change to sync, since it did previously deal with those situations. But, the documentation did not say that it did. The man pages only talk about sync updating the adjusted branch after it transfers content. I did consider making sync keep track of content it transferred (and dropped) and only update the adjusted branch then, not to catch up to other changes made previously. That would perform better. But it seemed rather hard to implement, and also it would have problems with races with a concurrent get/drop, which this implementation avoids. And it seemed pretty likely someone had gotten used to get/drop followed by sync updating the branch. It seems much less likely someone is switching branches, doing get/drop, and then switching back and expecting sync to update the branch. Re-running git-annex adjust still does a full re-adjusting of the branch, for anyone who needs that. Sponsored-by: Leon Schuermann on Patreon	2023-06-08 14:35:41 -04:00
Joey Hess	c6acf574c7	implement importChanges optimisation (not used yet) For simplicity, I've not tried to make it handle History yet, so when there is a history, a full import will still be done. Probably the right way to handle history is to first diff from the current tree to the last imported tree. Then, diff from the current tree to each of the historical trees, and recurse through the history diffing from child tree to parent tree. I don't think that will need a record of the previously imported historical trees, and so Logs.Import doesn't store them. Although I did leave room for future expansion in that log just in case. Next step will be to change importTree to importChanges and modify recordImportTree et al to handle it, by using adjustTree. Sponsored-by: Brett Eisenberg on Patreon	2023-05-31 16:01:34 -04:00
Joey Hess	e955912ad0	git-annex assist assist: New command, which is the same as git-annex sync but with new files added and content transferred by default. (Also this fixes another reversion in git-annex sync, --commit --no-commit, and --message were not enabled, oops.) See added comment for why git-annex assist does commit staged changes elsewhere in the work tree, but only adds files under the cwd. Note that it does not support --no-commit, --no-push, --no-pull like sync does. My thinking is, why should it? If you want that level of control, use git commit, git annex push, git annex pull. Sync only got those options because pull and push were not split out. Sponsored-by: k0ld on Patreon	2023-05-18 14:37:43 -04:00
Joey Hess	80e9a655f8	add man pages for pull and push to cabal file	2023-05-18 12:54:15 -04:00
Joey Hess	5df89d58c7	git-annex pull and push Split out two new commands, git-annex pull and git-annex push. Those plus a git commit are equivalent to git-annex sync. In a sense, git-annex sync conflates 3 things, and it would have been better to have push and pull from the beginning and not sync. Although note that git-annex sync --content is faster than a pull followed by a push, because it only has to walk the tree once, look at preferred content once, etc. So there is some value in git-annex sync in speed, as well as user convenience. And it would be hard to split out pull and push from sync, as far as the implementation goes. The implementation inside sync was easy, just adjust SyncOptions so it does the right thing. Note that the new commands default to syncing content, unless annex.synccontent is explicitly set to false. I'd like sync to also do that, but that's a hard transition to make. As a start to that transition, I added a note to git-annex-sync.mdwn that it may start to do so in a future version of git-annex. But a real transition would necessarily involve displaying warnings when sync is used without --content, and time. Sponsored-by: Kevin Mueller on Patreon	2023-05-16 16:51:07 -04:00
Joey Hess	9155ed1072	configremote New command, currently limited to changing autoenable= setting of a special remote. It will probably never be used for more than that given the limitations on it. Sponsored-by: Brock Spratlen on Patreon	2023-04-18 15:30:49 -04:00
Joey Hess	fe5e586b72	rename Git.Filename to Git.Quote	2023-04-12 17:22:03 -04:00
Joey Hess	a576fc3b12	fix mojibake reversion in display of utf8 When displaying a ByteString like "💕", safeOutput operates on individual bytes like "\240\159\146\149" and isControl '\146' = True, so it got truncated to just "\240". So, only treat the low control characters, and DEL, as control characters. Also split Utility.Terminal out of Utility.SafeOutput. The latter needs win32, but Utility.SafeOutput is used by Control.Exception, which is used by Setup. Sponsored-by: Nicholas Golder-Manning on Patreon	2023-04-12 13:53:30 -04:00
Joey Hess	cd544e548b	filter out control characters in error messages giveup changed to filter out control characters. (It is too low level to make it use StringContainingQuotedPath.) error still does not, but it should only be used for internal errors, where the message is not attacker-controlled. Changed a lot of existing error to giveup when it is not strictly an internal error. Of course, other exceptions can still be thrown, either by code in git-annex, or a library, that include some attacker-controlled value. This does not guard against those. Sponsored-by: Noam Kremen on Patreon	2023-04-10 13:50:51 -04:00
Joey Hess	9c242af171	releasing package git-annex version 10.20230407	2023-04-07 13:37:03 -04:00
Joey Hess	cc36c8516a	Sped up sqlite inserts 2x when built with persistent 2.14.5.0 https://github.com/yesodweb/persistent/issues/1457 Sponsored-by: Dartmouth College's DANDI project	2023-03-31 14:38:25 -04:00
Joey Hess	2b40fa51d3	git-annex.cabal: Prevent building with unix-compat 0.7 Which removed System.PosixCompat.User. See https://github.com/haskell-pkg-janitors/unix-compat/issues/3 Sponsored-by: Noam Kremen on Patreon	2023-03-31 12:52:23 -04:00
Joey Hess	40a5c645cf	prep for release tomorrow	2023-03-28 17:02:34 -04:00
Joey Hess	b624394c72	releasing package git-annex version 10.20230321	2023-03-21 16:14:10 -04:00
Yaroslav Halchenko	84b0a3707a	Apply codespell -w throughout	2023-03-17 15:14:58 -04:00
Joey Hess	3c08af0da1	factor out convertToWindowsNativeNamespace into its own module Gonna use this more widely. Sponsored-by: Dartmouth College's Datalad project	2023-03-01 13:28:39 -04:00
Joey Hess	a206cdddb4	releasing package git-annex version 10.20230227	2023-02-27 12:23:43 -04:00
Joey Hess	f24f96e018	move webapp build deps under Assistant build flag git-annex.cabal: Move webapp build deps under the Assistant build flag so git-annex can be built again without yesod etc installed. Commit `78440ca37d` got rid of the webapp build flag to work around what was apparently a cabal bug. It moved the webapp build deps to the main build-depends list. But that prevents building git-annex when yesod etc are not installed. Putting them under the Assistant build flag seems to not tickle that cabal bug, and lets git-annex build automatically with the assistant disabled when the webapp build deps are not installed. I hypothesize that the problem may have involved build-depends nested behind two build flags. Also, cabal clean may need to be run in order for cabal to find the right solution after this change, when building in a directory where cabal configure had been run before. Also moved 3 modules that are needed to build git-annex w/o the assistant out from under the Assistant build flag. Sponsored-by: Brock Spratlen on Patreon	2023-02-23 12:25:22 -04:00
Joey Hess	f3019d7e22	releasing package git-annex version 10.20230214	2023-02-14 14:09:10 -04:00
Joey Hess	5df95a5879	add upper bounds on base version hackage now rejects packages without this	2023-01-26 15:33:52 -04:00
Joey Hess	e726800dda	add upper bounds on Cabal version hackage now rejects packages without this. My bet is any version of cabal is going to work, I'm using the public API. Annoying.	2023-01-26 15:31:29 -04:00
Joey Hess	65167463aa	releasing package git-annex version 10.20230126	2023-01-26 15:27:32 -04:00
Joey Hess	579d9b60c1	improve concurrency of move/copy --from --to Use separate stages for download and upload. In the common case where it downloads the file from one remote and then uploads to the other, those are by far the most expensive operations, and there's a decent chance the two remotes bottleneck on different resources. Suppose it's being run with -J2 and a bunch of 10 mb files. Two threads will be started both downloading from the src remote. They will probably finish at the same time. Then two threads will be started uploading to the dst remote. They will probably take the same time as well. Before this change, it would alternate back and forth, bottlenecking on src and dst. With this change, as soon as the two threads start uploading to dst, two more threads are able to start, downloading from src. So bandwidth to both remotes is saturated more often. Other commands that use transferStages only send in one direction at a time. So the worker threads for the other direction will sit idle, and there will be no change in their behavior. Sponsored-by: Dartmouth College's DANDI project	2023-01-24 13:59:39 -04:00
Joey Hess	f8bc208e89	findkeys: New command, very similar to git-annex find but operating on keys I've long been asked for `git-annex find --all` or something like that, but pushed back on it because I feel that the command is analogous to find(1) and so it would be surprising for it to list keys rather than files. So instead, add a new findkeys subcommand. Note that the use of withKeyOptions is rather strange because usually that is used to fall back to --all rather than listing files, but here it's made to default to --all like behavior and never list files. A performance thing that could be improved is that withKeyOptions always reads and caches location logs. But findkeys with no options does not need them, so it could be made faster. That caching does speed up options like --in though. This is really just a subset of a more general performance thing that --all reads location logs sometimes unnecessarily. Anyway, it needs to read the location log in order to checkDead, and it seems good that findkeys does skip dead keys. Also, cleaned up comments on git-annex-find man page asking for --all option. Sponsored-by: Dartmouth College's DANDI project	2023-01-17 14:51:57 -04:00


1031 commits