git-annex

Author	SHA1	Message	Date
jgoerzen	56d6b622d7	Added a comment	2023-06-11 02:18:23 +00:00
Joey Hess	c33c226abd	fixed	2023-06-09 16:13:52 -04:00
Joey Hess	a0ab425c95	add ContentIndentifiersCidRemoteKeyIndex Optimise database to further speed up importing large trees from special remotes. See comment for details of why the other index didn't help cid queries. It would probably be better to manually create an index on only cid, rather than adding a second uniqueness constraint that is a larger index. But persistent does not support creating indexes, and an attempt to manually add it to the migration failed. Sponsored-by: Nicholas Golder-Manning on Patreon	2023-06-09 15:12:33 -04:00
Joey Hess	532b227086	update exportdb tree in getImportableContents This avoids bottlenecking on git check-ignore in a particular situation. Also, there may have been a correctness issue with it not having updated it. When the exportdb is already up-to-date, this is not expensive. And the exportdb is updated elsewhere, so usually it is up-to-date. Sponsored-by: Joshua Antonishen on Patreon	2023-06-08 18:36:24 -04:00
Joey Hess	7888702955	update	2023-06-07 11:32:53 -04:00
Joey Hess	5bc37c2de2	comment	2023-06-06 15:17:09 -04:00
Joey Hess	d63af3f52e	comment	2023-06-06 14:45:48 -04:00
Joey Hess	3c15e0f7a0	cache negative lookups of global numcopies and mincopies Speeds up eg git-annex sync --content by up to 50%. When it does not need to transfer or drop anything, it now noops a lot more quickly. I didn't see anything else in sync --content noop loop that could really be sped up. It has to cat git objects to keys, stat object files, etc. Sponsored-by: unqueued on Patreon	2023-06-06 14:43:25 -04:00
jgoerzen	432e7cd9f3	Added a comment	2023-06-05 19:32:29 +00:00
Joey Hess	528882a6df	comment	2023-06-05 14:08:12 -04:00
jgoerzen	2c2a84caac	Added a comment	2023-06-02 21:44:54 +00:00
Joey Hess	b43fb4923f	comment	2023-06-02 13:11:24 -04:00
Joey Hess	b8750bcb17	Merge branch 'master' of ssh://git-annex.branchable.com	2023-06-02 12:14:03 -04:00
Joey Hess	b40b368857	comment	2023-06-02 12:13:50 -04:00
jgoerzen	5dcbf7d41e	Added a comment	2023-06-02 03:25:27 +00:00
Joey Hess	7178db5e06	Merge branch 'master' of ssh://git-annex.branchable.com	2023-06-01 18:43:29 -04:00
Joey Hess	2e92cef13f	comment	2023-06-01 18:43:17 -04:00
jgoerzen	53eeca40ae	Added a comment	2023-06-01 21:26:23 +00:00
Joey Hess	594110a6af	comment	2023-06-01 14:21:55 -04:00
Joey Hess	029b08f54b	Merge branch 'master' of ssh://git-annex.branchable.com	2023-05-31 16:34:03 -04:00
Joey Hess	f6aa097a39	avoid import writing to cidsdb initially Speed up importing trees from special remotes somewhat by avoiding redundant writes to sqlite database. Before, import would write to both the git-annex branch and also to the sqlite database. But then the next time it was run, needsUpdateFromLog would see the branch had changed, so run updateFromLog, which would make the same writes to the sqlite database a second time. Now import writes only to the git-annex branch. The next time it's run, needsUpdateFromLog sees that the branch has changed and so calls updateFromLog, which updates the sqlite database. Why defer the write to the sqlite database like this? It seems that it could write to the database as it goes, and at the end call recordAnnexBranchTree to indicate that the information in the git-annex branch has all been written to the cidsdb. That would avoid the second import doing extra work. But, there could be other processes running at the same time, and one of them may update the git-annex branch, eg merging a remote git-annex branch into it. Any cids logs on that merged git-annex branch would not be reflected in the cidsdb yet. If the import then called recordAnnexBranchTree, the cidsdb would never get updated with that merged information. I don't think there's a good way to prevent, or to detect that situation. So, it can't call recordAnnexBranchTree at the end. So it might as well wait until the next run and do updateFromLog then. It could instead do updateFromLog at the end, but it's going to check needsUpdateFromLog at the beginning anyway. Note that the database writes were queued, so there is already a cidmap that is used to remember changes that the current process has made. So, omitting database writes can't change the behavior of the current process. Also note that thirdpartypopulatedimport uses recordcidkeyindb, which reflects what it already did. That code path does not use the cidmap, but does not need to query it either. It might be possible to make that code path also only update the git-annex branch and not the db, but I haven't checked. Sponsored-by: Noam Kremen on Patreon	2023-05-30 17:05:28 -04:00
jgoerzen	f47e7abd57	Added a comment	2023-05-30 20:58:21 +00:00
Joey Hess	aaeae746f0	comment and a neat idea	2023-05-30 15:42:34 -04:00
Joey Hess	5da7f703b0	comment	2023-05-30 14:30:39 -04:00

24 commits