git-annex

Author	SHA1	Message	Date
Joey Hess	73060eea51	annex.fastcopy Added annex.fastcopy and remote.name.annex-fastcopy config setting. When set, this allows the copy_file_range syscall to be used, which can eg allow for server-side copies on NFS. (For fastest copying, also disable annex.verify or remote.name.annex-verify.) This is a simple implementation, that does not handle resuming as well as it possibly could. It can be used with both local git remotes (including on NFS), and directory special remotes. Other types of remotes could in theory also support it, so I've left the config documented as a general thing.	2025-06-03 15:01:38 -04:00
Joey Hess	9024d8e2d1	fixes for enabling and autoenabling mask special remotes	2025-04-11 13:18:23 -04:00
Joey Hess	1313cc4d60	mask remotes, partial implementation Everything implemented except for passing through to the masked remote. Which should be trivial.	2025-04-10 13:10:07 -04:00
Joey Hess	e81fd72018	Added remote.name.annex-web-options config Which is a per-remote version of the annex.web-options config. Had to plumb RemoteGitConfig through to getUrlOptions. In cases where a special remote does not use curl, there was no need to do that and I used Nothing instead. In the case of the addurl and importfeed commands, it seemed best to say that running these commands is not using the web special remote per se, so the config is not used for those commands.	2025-04-01 10:17:38 -04:00
Joey Hess	83163ae08a	typo	2025-03-26 11:15:58 -04:00
Joey Hess	bcfd554a0f	findcomputed: New command, displays information about computed files.	2025-03-18 12:55:48 -04:00
Joey Hess	52f51d065a	rename config to annex.security.allowed-compute-programs And require for enable as well as autoenable. It seemed asking for trouble for `git-annex enable foo` to use whatever compute program is stored in the git config, without verifying that the user wants that program to be used. Note that it would be good to allow `git-annex enable foo program=...` to be used without the program being in the git config. Not implemented yet though.	2025-03-03 16:12:03 -04:00
Joey Hess	f32d2aecce	autoenable security for compute special remote Added annex.security.autoenable-compute-programs and only allow autoenabling special remotes that use compute programs on that list. The reason this is needed is a user might have some compute programs that are less safe to use than others. They might want to use an unsafe one only with one repository, where they are the only committer or other committers are trusted. They might be ok with others being used by any repository, and if so they can add them to the list. Another reason would be a user who has installed a compute program by accident. Eg, it might be included with git-annex at some point, or pulled in by some dependency. That user doesn't necessarily want that compute program to be used in an autoenabled special remote.	2025-03-03 15:52:56 -04:00
Joey Hess	c1b53dbbd0	wip	2025-02-20 13:27:47 -04:00
Joey Hess	b5319ec575	documentation for compute remote and associated commands None of this is implemented yet.	2025-02-19 14:29:18 -04:00
matrss	eab8aec4f0		2025-01-30 14:50:58 +00:00
Joey Hess	42d55bc57c	pre-init config and hook Added annex.pre-init-command git config and pre-init-annex hook that is run before git-annex repository initialization. This can block initialization. Or it can perform pre-initialization configuration or tweaking. I left stdio connected while it's running, so it could also be used for interactive prompting conceivably, although that would want to use /dev/tty anyway probably in order to not pollute the stdout of a command when automatic initialization is done. Sponsored-by: Dartmouth College's OpenNeuro project	2025-01-13 14:22:49 -04:00
Joey Hess	ce49caec60	document files	2025-01-13 13:14:12 -04:00
Joey Hess	a73fa77417	added hooks corresponding to annex.*-command Added freezecontent-annex and thawcontent-annex hooks that correspond to the git configs annex.freezecontent and annex.thawcontent. * Added secure-erase-annex hook that corresponds to the git config annex.secure-erase-command. * Added commitmessage-annex hook that corresponds to the git config annex.commitmessage-command. * Added http-headers-annex hook that corresponds to the git config annex.http-headers-command. that correspond to the post-update-annex and pre-commit-annex hooks. The use case for these is eg, setting up a git repository that is run in a container, where the easiest way to provide a script is by putting it in .git/hooks/, rather than copying it into the container in a way that puts it in PATH. This is all the ones that make sense to add for annex.*-config git configs. annex.youtube-dl-command is not a hook, it's telling git-annex what command to run. So is annex.shared-sop-command. So omitted those. May later also want to add hooks corresponding to `remote.<name>.annex-cost-command` etc. Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project	2025-01-10 14:54:42 -04:00
Joey Hess	5df1b2b36e	configs annex.post-update-command and annex.pre-commit-command Added git configs annex.post-update-command and annex.pre-commit-command that correspond to the git-annex hook scripts post-update-annex and pre-commit-annex. Note that the hook files take precedence over the git config, since the git config can include global config which should be overridden by local config. These new git configs are probably not super useful. Especially the pre-commit-annex hook is there to install scripts to instead of the pre-commit hook, since git-annex installs that hook itself. So why would someone want to use a git config for that? Only reason I can think of would be in a global git config. Or possibly because it's easier to set a git config than write a hook script, on an OS like Windows. The real reason I'm adding these is as groundwork for making other annex.*-command git configs also be available as hook scripts. I want to avoid having some things available as only git hooks and others as both gitconfigs and git hooks. (It seems that some annex.*-command configs don't translate to git hooks though.) In the man page, moved documentation of the hooks to be next to the documentation of the git configs. This is to avoid repetition.	2025-01-10 13:27:51 -04:00
Joey Hess	dd052dcba1	annexInsteadOf config Added config `url.<base>.annexInsteadOf` corresponding to git's `url.<base>.pushInsteadOf`, to configure the urls to use for accessing the git-annex repositories on a server without needing to configure remote.name.annexUrl in each repository. While one use case for this would be rewriting urls to use annex+http, I decided not to add any kind of special case for that. So while git-annex p2phttp, when serving multiple repositories, needs an url of eg "annex+http://example.com/git-annex/" for each of them, rewriting an url like "https://example.com/git/foo/bar" with this config set to "https://example.com/git/" will result in eg "annex+http://example.com/git-annex/foo/bar", which p2phttp does not support. That seems better dealt with in either git-annex p2phttp or a http middleware, rather than complicating the config with a special case for annex+http. Anyway, there are other use cases for this that don't involve annex+http.	2024-12-03 14:39:07 -04:00
Joey Hess	b8a717a617	reuse http url password for p2phttp url when on same host When remote.name.annexUrl is an annex+http(s) url, that uses the same hostname as remote.name.url, which is itself a http(s) url, they are assumed to share a username and password. This avoids unnecessary duplicate password prompts.	2024-11-19 15:27:26 -04:00
Joey Hess	b94221594b	add: When adding a dotfile as a non-large file, mention that it's a dotfile This is to reduce user confusion when their annex.largefiles matches it, or is not set. Note that, when annex.dotfiles is set, but a dotfile is not matched by annex.largefiles, the "non-large file" message will be displayed. That makes sense because whether the file is a dotfile does not matter with that configuration. Also, this slightly optimised the annex.dotfiles path in passing, by avoiding the slight slowdown caused by the check added in commit `876d5b6c6f` in that case.	2024-11-13 14:09:24 -04:00
Joey Hess	876d5b6c6f	add: Consistently treat files in a dotdir as dotfiles, even when ran inside that dotdir Assistant and smudge also updated. This does add a small amount of extra work, getting the TopFilePath. Not enough to be concerned by. Also improve documentation to make clear that files inside dotdirs are treated as dotfiles. Sponsored-by: Eve on Patreon	2024-11-13 13:43:01 -04:00
Joey Hess	84c781d924	documentation for git-annex sim command not implemented yet	2024-09-04 15:03:17 -04:00
Joey Hess	76ece2a699	make --rebalance of balanced use fullysizebalanced when useful When the specified number of copies is > 1, and some repositories are too full, it can be better to move content from them to other less full repositories, in order to make space for new content. annex.fullybalancedthreshhold is documented, but not implemented yet. This is not tested very well yet, and is known to sometimes take several runs to stabilize.	2024-08-21 17:59:08 -04:00
Joey Hess	1265d7e5df	implement maxsize log and command * maxsize: New command to tell git-annex how large the expected maximum size of a repository is. * vicfg: Include maxsize configuration.	2024-08-11 15:41:26 -04:00
Joey Hess	4750ffbd3b	finalized design for proxying to exporttree=yes annexobjects=yes special remotes	2024-08-06 11:45:45 -04:00
Joey Hess	bc9cc79e85	set remote's annexUrl automatically When the remote repository's git config file has annex.url set to an annex+http url.	2024-07-28 20:13:41 -04:00
Joey Hess	a6a03ca586	annex+http urls	2024-07-23 08:42:33 -04:00
Joey Hess	86ce3bf1e4	started servant implementation of HTTP P2P protocol	2024-07-07 12:08:10 -04:00
Joey Hess	542de0c0c4	document proxying to special remotes	2024-07-01 11:33:55 -04:00
Joey Hess	07e899c9d3	git-annex-shell: proxy nodes located beyond remote cluster gateways Walking a tightrope between security and convenience here, because git-annex-shell needs to only proxy for things when there has been an explicit, local action to configure them. In this case, the user has to have run `git-annex extendcluster`, which now sets annex-cluster-gateway on the remote. Note that any repositories that the gateway is recorded to proxy for will be proxied onward. This is not limited to cluster nodes, because checking the node log would not add any security; someone could add any uuid to it. The gateway of course then does its own checking to determine if it will allow proxying for the remote.	2024-06-26 12:56:16 -04:00
Joey Hess	0b72b85df5	added git-annex extendcluster This works, but updatecluster does not work yet in multi-gateway clusters, nor do gateways relay to other gateways.	2024-06-26 10:26:54 -04:00
Joey Hess	b8016eeb65	add annex-proxied This makes git-annex sync and similar not treat proxied remotes as git syncable remotes. Also, display in git-annex info remote when the remote is proxied.	2024-06-24 10:16:59 -04:00
Joey Hess	570ceffe8d	broke out initcluster One benefit of this is that a typo in annex-cluster-node config won't init a new cluster. Also it gets the cluster description set and is consistent with initremote.	2024-06-14 17:23:11 -04:00
Joey Hess	bbf261487d	add git-annex updatecluster command Seems to work fine, making the right changes to the git-annex branch.	2024-06-14 15:02:01 -04:00
Joey Hess	2844230dfe	add git configs for clusters	2024-06-14 12:20:17 -04:00
Joey Hess	f97f4b8bdb	Added updateproxy command and remote.name.annex-proxy configuration So far this only records proxy information on the git-annex branch.	2024-06-04 14:52:03 -04:00
Joey Hess	2ffe077cc2	git-remote-annex: brought back max-git-bundles config An incremental push that gets converted to a full push due to this config results in the inManifest having just one bundle in it, and the outManifest listing every other bundle. So it actually takes up more space on the special remote. But, it speeds up clone and fetch to not have to download a long series of bundles for incremental pushes.	2024-05-28 13:28:19 -04:00
Joey Hess	3e7324bbcb	only delete bundles on pushEmpty This avoids some apparently otherwise unsolvable problems involving races that resulted in the manifest listing bundles that were deleted. Removed the annex-max-git-bundles config because it can't actually result in deleting old bundles. It would still be possible to have a config that controls how often to do a full push, which would avoid needing to download too many bundles on clone, as well as needing to checkpresent too many bundles in verifyManifest. But it would need a different name and description.	2024-05-21 11:13:27 -04:00
Joey Hess	7dd2a67c41	fix names of new git configs	2024-05-14 15:33:47 -04:00
Joey Hess	23c4125ed4	mention other commands shipped with git-annex in SEE ALSO in man page	2024-05-14 15:23:45 -04:00
Joey Hess	0bf72ef103	max-git-bundles config for git-remote-annex	2024-05-14 14:23:40 -04:00
Joey Hess	6f1039900d	prevent using git-remote-annex with unsuitable special remote configs I hope to support importtree=yes eventually, but it does not currently work. Added remote.<name>.allow-encrypted-gitrepo that needs to be set to allow using it with encrypted git repos. Note that even encryption=pubkey uses a cipher stored in the git repo to encrypt the keys stored in the remote. While it would be possible to not encrypt the GITBUNDLE and GITMANIFEST keys, and then allow using encryption=pubkey, it doesn't currently work, and that would be a complication that I doubt is worth it.	2024-05-14 13:52:20 -04:00
Joey Hess	ff5193c6ad	Merge branch 'master' into git-remote-annex	2024-05-10 14:20:36 -04:00
Joey Hess	306ea42447	improve git-remote-annex docs renamed the git config to something shorter too	2024-05-06 13:06:22 -04:00
Joey Hess	a8cef2bf85	added man page for git-remote-annex And document remote.<name>.git-remote-annex-max-bundles which will configure it. datalad-annex uses a similar url format, but with some enhancements. See https://github.com/datalad/datalad-next/blob/main/datalad_next/gitremotes/datalad_annex.py I added the UUID to the URL, because it is needed in order to pick out which manifest file to use. The design allows for a single key/value store to have several special remotes all stored in it, and so the manifest includes the UUID in its name. While datalad-annex allows datalad-annex::<url>?, and allows referencing pieces of the url in the parameters, needing the UUID prevents git-remote-annex from supporting that syntax. And anyway, it is a complication and I want to keep things simple for now. Sponsored-by: unqueued on Patreon	2024-05-06 12:48:04 -04:00
Joey Hess	c410b2bb73	annex.maxextensions configuration Controls how many filename extensions to preserve. Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project	2024-04-18 14:23:38 -04:00
Joey Hess	d372553540	rclone special remote Added rclone special remote, which can be used without needing to install the git-annex-remote-rclone program. This needs a new version of rclone, which supports "rclone gitannex". This is implemented as a variant of an external special remote, that runs "rclone gitannex" instead of the usual git-annex-remote- command. Parameterized Remote.External to support that. Sponsored-by: Luke T. Shumaker on Patreon	2024-04-17 15:20:37 -04:00
Joey Hess	016d1bee88	add reregisterurl command What this can currently be used for is only to change an url from being used by a special remote to being used by the web remote. This could have been a --move-from option to registerurl. But, that would have complicated its option and --batch processing, and also would have complicated unregisterurl, which is implemented on top of Command.Registerurl. So, a separate command was actually less complicated to implement. The generic description of the command is because I want to make this command a catch-all for other url updating kind of things, if there are ever any more. Also because it was hard to come up with a good name for the specific action. I considered `git-annex moveurl`, but that seems to indicate data is perhaps actually being moved, and seems to sit at the same level as addurl and rmurl, and this command is at the plumbing level of registerurl and unregisterurl. Sponsored-by: Dartmouth College's DANDI project	2024-03-05 15:06:14 -04:00
Joey Hess	68e99513f0	added annex.commitmessage-command config Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project	2024-02-12 14:35:22 -04:00
Joey Hess	8e9ee31621	webapp: Added --port option, and annex.port config The getSocket comment that mentioned using ":port" in the hostname seems to have been incorrect or be out of date. After all, the bug report came when the user first tried doing that, and it didn't work. Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project	2024-01-25 14:08:36 -04:00
Joey Hess	20567e605a	add directional stalldetection and bwlimit configs Sponsored-by: Dartmouth College's DANDI project	2024-01-19 15:27:53 -04:00
Joey Hess	df35f70801	tweak stall detection scaling Refactored to allow offline experimentation, and ended up changing the allowedvariation (aka fudge factor) to 3. 10 seems too high, and 1.5 too low. Scale earlier, so even if the first chunk takes less than the configured time period, allowance is made that later chunks might transfer slower. Decided to use the same allowedvariation to decide when to start scaling. Smoothed the scaling out. Some examples: ghci> upscale (BwRate 10 (Duration 60)) 25 BwRate 13 (Duration {durationSeconds = 75}) -- A small scaling upwards after 1/3rd the time. Not noticeable. ghci> upscale (BwRate 10 (Duration 60)) 60 BwRate 30 (Duration {durationSeconds = 180}) -- At the configured time, 3x scaling. ghci> upscale (BwRate 10 (Duration 60)) 120 BwRate 60 (Duration {durationSeconds = 360}) -- A typical upscaling, here a 1 minute duration became 6 minutes -- due to the first chunk taking 2 minutes to transfer. ghci> upscale (BwRate 10 (Duration 60)) 600 BwRate 300 (Duration {durationSeconds = 1800}) -- Here the first chunk took 10 minutes to transfer, so it will -- take 30 minutes to detect a stall. Sponsored-by: Dartmouth College's DANDI project	2024-01-19 12:58:41 -04:00


725 commits