git-annex

Author	SHA1	Message	Date
Joey Hess	780367200b	remove dead nodes when loading the cluster log This is to avoid inserting a cluster uuid into the location log when only dead nodes in the cluster contain the content of a key. One reason why this is necessary is Remote.keyLocations, which excludes dead repositories from the list. But there are probably many more. Implementing this was challenging, because Logs.Location importing Logs.Cluster which imports Logs.Trust which imports Remote.List resulted in an import cycle through several other modules. Resorted to making Logs.Location not import Logs.Cluster, and instead it assumes that Annex.clusters gets populated when necessary before it's called. That's done in Annex.Startup, which is run by the git-annex command (but not other commands) at early startup in initialized repos. Or, is run after initialization. Note that in Remote.Git, it is unable to import Annex.Startup, because Remote.Git importing Logs.Cluster leads to the same import cycle. So ensureInitialized is not passed annexStartup in there. Other commands, like git-annex-shell currently don't run annexStartup either. So there are cases where Logs.Location will not see clusters. So it won't add any cluster UUIDs when loading the log. That's ok, the only reason to do that is to make display of where objects are located include clusters, and to make commands like git-annex get --from treat keys as being located in a cluster. git-annex-shell certainly does not do anything like that, and I'm pretty sure Remote.Git (and callers to Remote.Git.onLocalRepo) don't either.	2024-06-16 14:39:44 -04:00
Joey Hess	570ceffe8d	broke out initcluster One benefit of this is that a typo in annex-cluster-node config won't init a new cluster. Also it gets the cluster description set and is consistent with initremote.	2024-06-14 17:23:11 -04:00
Joey Hess	bbf261487d	add git-annex updatecluster command Seems to work fine, making the right changes to the git-annex branch.	2024-06-14 15:02:01 -04:00
Joey Hess	aa56d433d5	implement cluster.log Not used yet. (Or tested.) I did consider making the log start with the uuid of the node, followed by the cluster uuid (or uuids). That would perhaps mean a smaller write to the git-annex branch when adding a node, but overall the log file would be larger, and it will be read and cached near to startup on most git-annex runs.	2024-06-13 16:00:58 -04:00
Joey Hess	501d65eeab	started implementing git-annex-shell proxy So far, it negotiates VERSION with both parties. This is a tricky dance. Untested.	2024-06-10 18:01:36 -04:00
Joey Hess	f97f4b8bdb	Added updateproxy command and remote.name.annex-proxy configuration So far this only records proxy information on the git-annex branch.	2024-06-04 14:52:03 -04:00
Joey Hess	aeedca70ca	prep release	2024-05-30 17:53:33 -04:00
Joey Hess	424afe46d7	fix incremental push to preserve existing bundle keys in manifest Also broke Manifest out to its own type with a smart constructor. Sponsored-by: mycroft on Patreon	2024-05-13 09:47:05 -04:00
Joey Hess	ff5193c6ad	Merge branch 'master' into git-remote-annex	2024-05-10 14:20:36 -04:00
Joey Hess	e1447dc2e2	add git bundle interface Sponsored-by: mycroft on Patreon	2024-05-07 14:22:41 -04:00
Joey Hess	c7731cdbd9	add Backend.GitRemoteAnnex Making GITBUNDLE be in the backend list allows those keys to be hashed to verify, both when git-remote-annex downloads them, and by other transfers and by git fsck. GITMANIFEST is not in the backend list, because those keys will never be stored in .git/annex/objects and can't be verified in any case. This does mean that git-annex version will include GITBUNDLE in the list of backends. Also documented these in backends.mdwn Sponsored-by: Kevin Mueller on Patreon	2024-05-07 13:54:08 -04:00
Joey Hess	a01d64a4ad	add git-remote-annex stub and build machinery Renamed git-remote-annex.sh, keeping it around for now for reference. Sponsored-by: Graham Spencer on Patreon	2024-05-06 13:05:58 -04:00
Joey Hess	d6ad5b9b50	releasing package git-annex version 10.20240430	2024-04-30 15:27:31 -04:00
Joey Hess	d372553540	rclone special remote Added rclone special remote, which can be used without needing to install the git-annex-remote-rclone program. This needs a new version of rclone, which supports "rclone gitannex". This is implemented as a variant of an external special remote, that runs "rclone gitannex" instead of the usual git-annex-remote- command. Parameterized Remote.External to support that. Sponsored-by: Luke T. Shumaker on Patreon	2024-04-17 15:20:37 -04:00
Joey Hess	016d1bee88	add reregisterurl command What this can currently be used for is only to change an url from being used by a special remote to being used by the web remote. This could have been a --move-from option to registerurl. But, that would have complicated its option and --batch processing, and also would have complicated unregisterurl, which is implemented on top of Command.Registerurl. So, a separate command was actually less complicated to implement. The generic description of the command is because I want to make this command a catch-all for other url updating kind of things, if there are ever any more. Also because it was hard to come up with a good name for the specific action. I considered `git-annex moveurl`, but that seems to indicate data is perhaps actually being moved, and seems to sit at the same level as addurl and rmurl, and this command is at the plumbing level of registerurl and unregisterurl. Sponsored-by: Dartmouth College's DANDI project	2024-03-05 15:06:14 -04:00
Joey Hess	e7652b0997	implement URL to VURL migration This needs the content to be present in order to hash it. But it's not possible for a module used by Backend.URL to call inAnnex because that would entail a dependency loop. So instead, rely on the fact that Command.Migrate calls inAnnex before performing a migration. But, Command.ExamineKey calls fastMigrate and the key may or may not exist, and it's not wanting to actually perform a migration in any case. To handle that, had to add an additional value to fastMigrate to indicate whether the content is inAnnex. Factored generateEquivilantKey out of Remote.Web. Note that migrateFromURLToVURL hardcodes use of the SHA256E backend. It would have been difficult not to, given all the dependency loop issues. But --backend and annex.backend are used to tell git-annex migrate to use VURL in any case, so there's no config knob that the user could expect to configure that. Sponsored-by: Brock Spratlen on Patreon	2024-03-01 16:42:02 -04:00
Joey Hess	cc17ac423b	implement isCryptographicallySecureKey for VURL Considerable difficulty to work around an import cycle. Had to move the list of backends (except for VURL) to Backend.Variety so VURL could use it. Sponsored-by: Kevin Mueller on Patreon	2024-02-29 17:26:35 -04:00
Joey Hess	55bf01b788	add equivilant key log for VURL keys When downloading a VURL from the web, make sure that the equivilant key log is populated. Unfortunately, this does not hash the content while it's being downloaded from the web. There is not an interface in Backend currently for incremental hash generation, only for incremental verification of an existing hash. So this might add a noticeable delay, and it has to show a "(checksum...)" message. This could stand to be improved. But, that separate hashing step only has to happen on the first download of new content from the web. Once the hash is known, the VURL key can have its hash verified incrementally while downloading except when the content in the web has changed. (Doesn't happen yet because verifyKeyContentIncrementally is not implemented yet for VURL keys.) Note that the equivilant key log file is formatted as a presence log. This adds a tiny bit of overhead (eg "1 ") per line over just listing the urls. The reason I chose to use that format is it seems possible that there will need to be a way to remove an equivilant key at some point in the future. I don't know why that would be necessary, but it seemed wise to allow for the possibility. Downloads of VURL keys from other special remotes that claim urls, like bittorrent for example, do not populate the equivilant key log. So for now, no checksum verification will be done for those. Sponsored-by: Nicholas Golder-Manning on Patreon	2024-02-29 16:01:49 -04:00
Joey Hess	c2d6c02c27	Added dependency on unbounded-delays And stop vendoring part of it. This is a free dependency because tasty depends on it. Sponsored-by: Leon Schuermann on Patreon	2024-02-27 13:11:59 -04:00
Joey Hess	bee3abab14	releasing package git-annex version 10.20240227	2024-02-27 13:02:17 -04:00
Joey Hess	d61633e183	releasing package git-annex version 10.20240129	2024-01-29 14:12:12 -04:00
Joey Hess	812cbf0e17	Stateless OpenPGP interface Implemented according to https://www.ietf.org/archive/id/draft-dkg-openpgp-stateless-cli-09.html#name-encrypt-encrypt-a-message Not yet used by git-annex. Sponsored-by: Leon Schuermann on Patreon	2024-01-10 15:59:35 -04:00
Joey Hess	f3fa9dc65f	releasing package git-annex version 10.20231227	2023-12-27 19:27:55 -04:00
Joey Hess	8a3beabf35	use RawFilePath for opening sqlite databases Fix a crash opening sqlite databases when run in a non-unicode locale, with a remote that uses a non-unicode filepath. In that situation converting to Text fails. The fix needs git-annex to be built with persistent-sqlite 2.13.3. Building against older versions still works, but that version is used when building with stack. Database.RawFilePath is a lot of code copied from persistent-sqlite and lightly modified, since only 1 function in persistent-sqlite was made to support RawFilePath. This is a bit of a pain, and I hope that persistent-sqlite will eventually switch to using OsPath, allowing this module to be removed from git-annex. Sponsored-by: k0ld on Patreon	2023-12-26 18:31:52 -04:00
Joey Hess	0bd8b17b59	log migration trees to git-annex branch This will allow distributed migration: Start a migration in one clone of a repo, and then update other clones. commitMigration is a bit of a bear.. There is some inversion of control that needs some TMVars. Also streamLogFile's finalizer does not handle recording the trees, so an interrupt at just the wrong time can cause migration.log to be emptied but the git-annex branch not updated. Sponsored-by: Graham Spencer on Patreon	2023-12-06 15:40:03 -04:00
Joey Hess	060259b750	add manual: true to ParallelBuild flag hackage demands that -j be gated behind a build flag with manual: true set.	2023-11-29 16:05:03 -04:00
Joey Hess	bacd781c4f	releasing package git-annex version 10.20231129	2023-11-29 16:01:01 -04:00
Joey Hess	cda3e85164	make my authorship explicit in the code This is intended to guard against LLM code theft, which is the current bubble technology du jour. Note that authorJoeyHess' with a year older than the year I began developing git-annex will behave badly, by intention. Eg, it will spin and eventually crash. This is not the first anti-LLM protection in git-annex. For example see `9562da790f`. That method, while much harder for an adversary to detect and remove, also complicates code somewhat significantly, and needs extensions to be enabled. There are also probably significantly fewer ways to implement that method in Haskell. This new approach, by contrast, will be easy to add throughout the code base, with very little effort, and without complicating reading or maintaining it any more than noticing that yes, I am the author of this code. An adversary could of course remove all calls to these functions before feeding code into their LLM-based laundry facility. I think this would need to be done manually, or with the help of some fairly advanced Haskell parsing though. In some cases, authorJoeyHess needs to be removed, while in other places it needs to be replaced with a value. Also a monadic use of authorJoeyHess' may involve other added monadic machinery which would need to be eliminated to keep the code compiling. Alternatively, an adversary could replace my name with something innocuous. This would be clear intent to remove author attribution from my code, even more than running it through an LLM laundry is. If you work for a large company that is laundering my code through an LLM, please do us a favor and use your immense privilege to quit and go do something socially beneficial. I will not explain further developments of this code in such detail, and you have better things to do than playing cat and mouse with me as I explore directions such as extending this approach to the type level. Sponsored-by: k0ld on Patreon	2023-11-20 12:29:12 -04:00
Joey Hess	561c036664	split out generic git log parser Sponsored-By: Jack Hill on Patreon	2023-11-10 15:40:03 -04:00
Joey Hess	8bde6101e3	sqlite database for importfeed importfeed: Use caching database to avoid needing to list urls on every run, and avoid using too much memory. Benchmarking in my podcasts repo, importfeed got 1.42 seconds faster, and memory use dropped from 203000k to 59408k. Database.ImportFeed is Database.ContentIdentifier with the serial number filed off. There is a bit of code duplication I would like to avoid, particularly recordAnnexBranchTree, and getAnnexBranchTree. But these use the persistent sqlite tables, so despite the code being the same, they cannot be factored out. Since this database includes the contentidentifier metadata, it will be slightly redundant if a sqlite database is ever added for metadata. I did consider making such a generic database and using it for this. But, that would then need importfeed to update both the url database and the metadata database, which is twice as much work diffing the git-annex branch trees. Or would entangle updating two databases in a complex way. So instead it seems better to optimise the database that importfeed needs, and if the metadata database is used by another command, use a little more disk space and do a little bit of redundant work to update it. Sponsored-by: unqueued on Patreon	2023-10-23 16:46:22 -04:00
Joey Hess	c2e60dd7a6	enable parallel ghc for building git-annex Via a build flag this time, that's off by default because hackage demands it be so, but that gets turned on by the Makefile and by stack.	2023-09-26 13:46:44 -04:00
Joey Hess	4ac2758ba5	Revert "enable parallel ghc for building git-annex" This reverts commit `3f6aff89b1`. Sadly hackage rejects cabal files using -j unless hidden behind an option that is disabled by default.	2023-09-26 13:34:28 -04:00
Joey Hess	b9240d2c5d	releasing package git-annex version 10.20230926	2023-09-26 13:29:49 -04:00
Joey Hess	3f6aff89b1	enable parallel ghc for building git-annex This drops a full recompile on my new 12 core laptop from 4:00 to 2:47. It would be possible for me to use: cabal configure --ghc-options=-j But that also makes cabal parallelize ghc for each package it installs to satisfy git-annex's dependencies. Since cabal is already configured to parallelize installing dependencies, that would use N^2 cpu cores, which seems like a bad idea. And also I'd have to remember to do it. So I'm thinking it's better to do it by default. If a system that is building git-annex is also busy with other things, let the scheduler sort it out. If this impacts someone particularly badly, they can of course avoid it with: cabal configure --ghc-options=-j1	2023-09-21 13:00:31 -04:00
Joey Hess	54da44d42a	Support being built with crypton rather than cryptonite crypton is a fork of cryptonite, and cryptonite's github repo has been archived. Some deps are already using crypton so it's clearly the way forward. Added a build flag without a default, so cabal configure will select on its own which to use. stack files pin to cryptonite for now. Sponsored-by: Nicholas Golder-Manning on Patreon	2023-09-21 12:43:42 -04:00
Joey Hess	50300a47fe	Removed the vendored git-lfs and the GitLfs build flag AFAICS all git-annex builds are using the git-lfs library not the vendored copy. Debian stable now includes a new enough haskell-git-lfs package as well. Last time this was tried it did not.	2023-08-28 13:12:31 -04:00
Joey Hess	5e818e4903	remove man pages from cabal file Since `393275c105` Setup.hs no longer installs the man pages. Since the cabal package is only used to install git-annex with cabal, it doesn't need to include files like these that are not used when installing with cabal.	2023-08-28 12:57:50 -04:00
Joey Hess	cf8b30c914	oldkeys: New command that lists the keys used by old versions of a file The tricky thing about this turned out to be handling renames and reverts. For that, it has to make two passes over the git log, and to avoid buffering a possibly huge amount of logs in memory (ie the whole git log of an entire repository!), runs git log twice. (It might be possible to speed this up by asking git log to show a diff, and so avoid needing to use catKey.) Sponsored-By: Brock Spratlen on Patreon	2023-08-22 14:51:06 -04:00
Joey Hess	977403d338	implement Unavailable for borg bup ddar directory rsync Only gcrypt remains to add support for. (Well, possibly also adb?) Sponsored-by: Luke T. Shumaker on Patreon	2023-08-16 15:48:09 -04:00
Joey Hess	be028f10e5	split out Utility.Url.Parse This is mostly for git-repair which can't include all of Utility.Url without adding many dependencies that are not really necessary.	2023-08-14 12:28:10 -04:00
Joey Hess	d19139a10d	releasing package git-annex version 10.20230802	2023-08-02 16:09:14 -04:00
Joey Hess	85aadcfa1e	windows back to lts-18.13 temporarily I can't seem to get stack to resolve dependencies with Win32-2.13.4.0, no matter what I try. Why it blows up, I don't know. And allow-newer: true actually causes it to downgrade Win32 to the one version that won't build. Unbelievable that it allows downgrades. So just gonna have to wait for that to get into stackage nightly, and then stack.yaml can be updated to use that, and the changes in this commit reverted.	2023-08-02 12:49:38 -04:00
Joey Hess	f1842b616a	fix stack build on windows For whatever reason, putting Win32-2.13.4.0 in stack.yaml results in stack blowing up with many unrelated dependency problems. But making git-annex depend on that version lets stack resolve deps.	2023-08-02 11:50:12 -04:00
Joey Hess	28864f0bb2	add back utf8-string to setup build deps Needed on Windows since Utility.FileSystemEncoding uses it	2023-08-02 09:29:23 -04:00
Joey Hess	6da6449fff	stack.yaml: Update to build with ghc-9.6.2 and aws-0.24 This enables some new features that need the new aws. Use http-client-restricted-0.1.0 because it uses the crypton side of the cryptonite/crypton fork, which seems to be needed for ghc-9.6.2. Dependency on connection removed because of the cryptonite/crypton fork. This avoids needing a build flag. It was only used to throw a typed exception in Utility.Url, which nothing depended on. Used a fork of bloomfilter because it's not being maintained and no longer builds as-of this ghc version. (I have been trying to contact its maintainer about it, and emailed him today suggesting I take over the package.) Sponsored-by: Brock Spratlen on Patreon	2023-08-01 18:53:26 -04:00
Joey Hess	68c9b08faf	fix build with unix-2.8.0 Changed the parameters to openFd. So needed to add a small wrapper library to keep supporting older versions as well.	2023-08-01 18:41:27 -04:00
Joey Hess	3b825eb7a6	rewrap	2023-08-01 15:47:05 -04:00
Joey Hess	fb640bc2f4	support building with unix-compat 0.7 It removed System.PosixCompat.User.	2023-08-01 15:17:43 -04:00
Joey Hess	393275c105	Setup.hs: Stop installing man pages, desktop files, and the git-annex-shell and git-remote-tor-annex symlinks Anything still relying on that, eg via cabal v1-install will need to change to using make install-home. Which was added back in 2019 in `6491b62614` because cabal new-build (now the default) already didn't use Setup in a way that let its installation of those things work. Notably this means Setup does not need to depend on unix-compat, which is useful because in 0.7 it removed System.PosixCompat.User, which Setup needed to determine where to install the desktop files. See https://github.com/haskell-pkg-janitors/unix-compat/issues/3	2023-08-01 15:08:56 -04:00
Joey Hess	e1fc9e204e	added git-annex satisfy This ended up having an interface like sync, rather than like get/copy/drop. That let it be implemented in terms of sync, which took a lot less code. Also, it lets it handle many of the edge cases that sync does, such as getting files that are not visible in a --hide-missing branch, and sending files to exporttree remotes. As well as being easier to implement, `git-annex satisfy myremote` makes sense as it satisfies the preferred content settings of the remote. `git-annex satisfy somefile` does not form a sentence that makes sense. So while -C can be a little bit annoying, it still makes sense to have this syntax. Note that, while I initially thought this would also satisfy numcopies, it does not. Arguably it ought to. But, sync does not send files in order to satisfy numcopies, it only sends files to satisfy preferred content. And it's important that this transfer the same files as sync does, because it will probably be used in a workflow where the user sometimes syncs and sometimes satisfies, and does not expect satisfy to do things that sync would not do. (Also opened a new bug that also affects sync et al, not only this command.) Sponsored-by: Nicholas Golder-Manning on Patreon	2023-06-29 15:34:53 -04:00
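
Commit 55bf01b788 in the log above notes that the equivilant key log is formatted as a presence log, so that an entry can later be marked as removed at the cost of a short status field (eg "1 ") per line. The following is a minimal sketch of that idea, not git-annex's actual parser. It assumes a line layout of `<timestamp> <status> <value>`, where status `1` means present and `0` means removed, and where the newest timestamp wins for each value. The function names `parse_presence_log` and `present_values` and the sample keys are hypothetical.

```python
from typing import Dict, List, Tuple


def parse_presence_log(text: str) -> Dict[str, Tuple[float, bool]]:
    """Return the newest (timestamp, present) pair for each logged value.

    Assumed line layout (an illustration, not a specification):
        <timestamp> <status> <value>
    """
    newest: Dict[str, Tuple[float, bool]] = {}
    for line in text.splitlines():
        parts = line.split(" ", 2)
        if len(parts) != 3:
            continue  # skip empty or malformed lines
        stamp, status, value = parts
        if status not in ("0", "1"):
            continue  # only present/removed are modelled in this sketch
        try:
            # Tolerate a trailing "s" unit suffix on the timestamp.
            ts = float(stamp.rstrip("s"))
        except ValueError:
            continue
        # The newest entry for a value overrides any older one.
        if value not in newest or ts >= newest[value][0]:
            newest[value] = (ts, status == "1")
    return newest


def present_values(text: str) -> List[str]:
    """List the values whose newest entry marks them as present."""
    return sorted(v for v, (_, present) in parse_presence_log(text).items() if present)
```

With this layout, removing an entry means appending a newer line with status `0` rather than rewriting the file, which is the flexibility the commit message wanted to keep available.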

1059 commits