git-annex

Author	SHA1	Message	Date
Joey Hess	067aabdd48	wip RawFilePath 2x git-annex find speedup Finally builds (oh the agony of making it build), but still very unmergeable, only Command.Find is included and lots of stuff is badly hacked to make it compile. Benchmarking vs master, this git-annex find is significantly faster! Specifically: num files old new speedup 48500 4.77 3.73 28% 12500 1.36 1.02 66% 20 0.075 0.074 0% (so startup time is unchanged) That's without really finishing the optimization. Things still to do: * Eliminate all the fromRawFilePath, toRawFilePath, encodeBS, decodeBS conversions. * Use versions of IO actions like getFileStatus that take a RawFilePath. * Eliminate some Data.ByteString.Lazy.toStrict, which is a slow copy. * Use ByteString for parsing git config to speed up startup. It's likely several of those will speed up git-annex find further. And other commands will certainly benefit even more.	2019-11-26 16:01:58 -04:00
Joey Hess	018b5b8173	Support building with socks-0.6 and persistent-template-2.7 persistent-template now needs UndecidableInstances. socks changed defaultSocksConf to take a SockAddr.	2019-07-30 12:50:48 -04:00
Joey Hess	9a5ddda511	remove many old version ifdefs Drop support for building with ghc older than 8.4.4, and with older versions of several haskell libraries than will be included in Debian 10. The only remaining version ifdefs in the entire code base are now a couple for aws! This commit should only be merged after the Debian 10 release. And perhaps it will need to wait longer than that; it would make backporting new versions of git-annex to Debian 9 (stretch) which has been actively happening as recently as this year. This commit was sponsored by Ilya Shlyakhter.	2019-07-05 15:09:37 -04:00
Joey Hess	40ecf58d4b	update licenses from GPL to AGPL This does not change the overall license of the git-annex program, which was already AGPL due to a number of source files being AGPL already. Legally speaking, I'm adding a new license under which these files are now available; I already released their current contents under the GPL license. Now they're dual licensed GPL and AGPL. However, I intend for all my future changes to these files to only be released under the AGPL license, and I won't be tracking the dual licensing status, so I'm simply changing the license statement to say it's AGPL. (In some cases, others wrote parts of the code of a file and released it under the GPL; but in all cases I have contributed a significant portion of the code in each file and it's that code that is getting the AGPL license; the GPL license of other contributors allows combining with AGPL code.)	2019-03-13 15:48:14 -04:00
Joey Hess	e3a704224f	fix export db locking deadlock	2019-03-07 16:06:02 -04:00
Joey Hess	9a72785307	fixes to export db lookup when accessing importtree=yes Now in a fresh clone with a importtree=yes remote enabled, git annex fsck --from the remote works.	2019-03-07 14:10:56 -04:00
Joey Hess	cd3a2b023a	initial try at using storeExportWithContentIdentifier Untested, and I'm not sure about the locking of the ContentIdentifier db.	2019-03-04 17:50:41 -04:00
Joey Hess	b67fa2180e	add getExportedKey Not optimised because that would need transition code to be written for existing export databases.	2019-03-04 17:27:10 -04:00
Joey Hess	1c8793691a	import: update location log for removed files	2019-03-01 13:26:59 -04:00
Joey Hess	7acee61adf	rename	2019-03-01 12:50:33 -04:00
Joey Hess	d0066d9a87	fully update export db during import This makes exporting immediately after import and merge be a no-op.	2019-02-27 15:29:41 -04:00
Joey Hess	9216718fa0	fix one more export conflict false positive Somehow forgot about the case where the current export db tree is the same one in the export log, and it warned about an export conflict when getting a file in that situation. Of course it's no conflict at all! This commit was sponsored by Jochen Bartl on Patreon.	2019-01-30 13:18:42 -04:00
Joey Hess	ad1d422dd7	fix false positive in export conflict detection Like the earlier fixed one in Command.Export, it occurred when the same tree was exported by multiple clones. Previous fix was incomplete since several other places looked at the list of exported trees to detect when there was an export conflict. Added a single unified function to avoid missing any places it needed to be fixed. This commit was sponsored by mo on Patreon.	2019-01-30 12:36:30 -04:00
Joey Hess	f62114e5ad	Merge branch 'remove-esqueleto'	2018-11-20 11:50:04 -04:00
Joey Hess	d65df7ab21	improve messages around export conflicts When an export conflict prevents accessing a special remote, be clearer about what the problem is and how to resolve it. This commit was sponsored by Trenton Cronholm on Patreon.	2018-11-13 15:50:06 -04:00
Sean Parsons	42bdc9fa2f	Removed Esqueleto as a dependency.	2018-11-06 22:18:55 +00:00
Joey Hess	def5d8b02c	Fix potential crash in exporttree database due to failure to honor uniqueness constraint I don't know the circumstances, but have a report of this: git-annex: failed to commit changes to sqlite database: Just SQLite3 returned ErrorConstraint while attempting to perform step. All 3 tables in the export db have uniqueness constraints on them, insertUnique is used for all the rest, but this use of insertMany means it doesn't check the constraint. I guess that's what caused the crash, but I have not been able to test it yet. Use putMany when available, as it should be faster than mapM of insertMany. This commit was sponsored by Brock Spratlen on Patreon.	2018-10-09 16:56:33 -04:00
Joey Hess	b8ed97f5d8	add missing space	2018-10-09 16:32:59 -04:00
Joey Hess	710d6a35ed	fix build with old version of persistent	2017-09-25 09:57:41 -04:00
Joey Hess	129418615b	refactor	2017-09-20 16:22:32 -04:00
Joey Hess	f4be3c3f89	merge changes made on other repos into ExportTree Now when one repository has exported a tree, another repository can get files from the export, after syncing. There's a bug: While the database update works, somehow the database on disk does not get updated, and so the database update is run the next time, etc. Wasn't able to figure out why yet. This commit was sponsored by Ole-Morten Duesund on Patreon.	2017-09-18 19:21:41 -04:00
Joey Hess	0ad7e36dc1	update ExportTree table efficiently Use same diff and key lookup except when the whole tree has to be scanned. This commit was sponsored by Peter Hogg on Patreon.	2017-09-18 14:27:50 -04:00
Joey Hess	b03d77c211	add ExportTree table to export db New table needed to look up what filenames are used in the currently exported tree, for reasons explained in export.mdwn. Also, added smart constructors for ExportLocation and ExportDirectory to make sure they contain filepaths with the right direction slashes. And some code refactoring. This commit was sponsored by Francois Marier on Patreon.	2017-09-18 13:59:59 -04:00
Joey Hess	e1f5c90c92	split out Types.Export	2017-09-15 16:46:03 -04:00
Joey Hess	c633144d28	remove empty directories when removing from export The subtle part of this is what happens when the remote fails to remove an empty directory. The removal from the export needs to fail in that case, so the removal will be tried again later. However, removeExportLocation has already been run and changed the export db, so if the next run checks getExportLocation, it might decide nothing remains to be done, leaving the empty directory. Dealt with that by making removeEmptyDirectories handle a failure by calling addExportLocation, reverting the database changes so the next run will be guaranteed to try deleting the empty directory again. This commit was sponsored by Thomas Hochstein on Patreon.	2017-09-15 15:22:53 -04:00
Joey Hess	e223cf568f	add table to keep track of what subdirectories are populated in the export So empty subdirectories can be identified and removed. This commit was sponsored by Jochen Bartl on Patreon.	2017-09-15 14:35:22 -04:00
Joey Hess	6ab14710fc	fix consistency bug reading from export database The export database has writes made to it and then expects to read back the same data immediately. But, the way that Database.Handle does writes, in order to support multiple writers, makes that not work, due to caching issues. This resulted in export re-uploading files it had already successfully renamed into place. Fixed by allowing databases to be opened in MultiWriter or SingleWriter mode. The export database only needs to support a single writer; it does not make sense for multiple exports to run at the same time to the same special remote. All other databases still use MultiWriter mode. And by inspection, nothing else in git-annex seems to be relying on being able to immediately query for changes that were just written to the database. This commit was supported by the NSF-funded DataLad project.	2017-09-06 17:19:07 -04:00
Joey Hess	4da763439b	use export db to correctly handle duplicate files Removed incorrect UniqueKey key in db schema; a key can appear multiple times with different files. The database has to be flushed after each removal. But when adding files to the export, lots of changes are able to be queued up w/o flushing. So it's still fairly efficient. If large removals of files from exports are too slow, an alternative would be to make two passes over the diff, one pass queueing deletions from the database, then a flush and then a second pass updating the location log. But that would use more memory, and need to look up exportKey twice per removed file, so I've avoided such optimisation yet. This commit was supported by the NSF-funded DataLad project.	2017-09-04 14:39:32 -04:00
Joey Hess	2c90ed1fea	flush queued changes to export db on exit	2017-09-04 14:00:54 -04:00
Joey Hess	7eb9889bfd	track exported files in a sqlite database Went with a separate db per export remote, rather than a single export database. Mostly because there will probably not be a lot of separate export remotes, and it might be convenient to be able to delete a given remote's export database. This commit was supported by the NSF-funded DataLad project.	2017-09-04 13:53:08 -04:00

30 commits