Fix bug that made changes to a special remote sometimes be missed when
importing a tree from it. The diff import would miss changes that were
exported and then manually undone on the special remote (eg, deleting a
newly exported file). A full import is needed to catch such changes.
After upgrading, any such missed changes will be included in the next
tree imported from a special remote. This happens because the previously
recorded content identifier tree does not have export information included,
so it is treated as invalid, and a full import is done.
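A minimal sketch of that decision, with illustrative types that are not
git-annex's actual ones:

    -- Hypothetical shapes, for illustration only.
    newtype Sha = Sha String

    data RecordedImport = RecordedImport
        { cidTree :: Sha
        , exportTree :: Maybe Sha  -- Nothing in trees recorded before this fix
        }

    data ImportKind = DiffImport | FullImport

    -- Without export information, the recorded tree cannot be trusted,
    -- so fall back to a full import.
    chooseImport :: Maybe RecordedImport -> ImportKind
    chooseImport (Just r) | Just _ <- exportTree r = DiffImport
    chooseImport _ = FullImport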
Fixes reversion introduced in version 10.20230626, commit
40017089f2
Unfortunately, this does mean that after each export, the next import
will be a full import, which can take significantly longer than a diff
import when there are a lot of files in the tree.
It would be better if exporting also updated the content identifier tree.
However, I don't know if that can be done inexpensively. It would be future
optimisation work, in any case.
(That could only be done for an export that is run in the same
repository as the import. When an export is run in a different repository,
the export.log gets updated, and that propagates to the repository where
import is later run. At that point, a full import is done.)
Sponsored-by: Luke T. Shumaker
Fix hang that could occur when using git-annex adjust on a branch with
more files than annex.queuesize, or potentially when running other
commands.
When reconcileStaged is running, the database is being opened. But
restagePointerFiles closes the database, and later writes to it. So it will
deadlock if called by reconcileStaged.
The deadlock occurred when the git queue happened to be full, so that
adding a call to restagePointerFiles to it flushed the queue, causing
restagePointerFiles to run at the wrong time.
Fixed by making reconcileStaged, when it populates or depopulates a pointer
file, arrange for restagePointerFiles to be run as a cleanup action, rather
than from the git queue.
But, what if restagePointerFiles is already in the git queue before
reconcileStaged is run? If it adds anything else to the git queue, causing
the queue to flush, it would still deadlock. To avoid this hypothetical
situation, added an Annex.inreconcilestaged flag, and made
restagePointerFiles check it and do nothing when it is set.
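A minimal sketch of that guard, using a plain IORef in place of the
actual Annex state field:

    import Control.Monad (unless)
    import Data.IORef

    -- While the flag is set, reconcileStaged has the database open, so
    -- restaging must not run; make it a no-op instead of deadlocking.
    restagePointerFilesGuarded :: IORef Bool -> IO () -> IO ()
    restagePointerFilesGuarded inreconcilestaged restage = do
        busy <- readIORef inreconcilestaged
        unless busy restage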
Note that I did consider the simpler approach of only running
restagePointerFiles as a cleanup action, rather than from the git queue.
But see commit 6a3bd283b8 for why it was made
to use the queue in the first place. I wanted to avoid tying this bug fix
to a behavior change.
Sponsored-by: mycroft
Changing the type out from under an existing special remote exposes the
existing config to something that may interpret it wildly differently. As
seen in the bug report, this can even result in behavior that makes
git-annex say it's buggy. So prevent the user from doing this. --sameas is
the better way.
Sponsored-by: Kevin Mueller
parseFeedFromFile does not set the bit, so open and read the file
ourselves.
The versioned dependency on utf8-string should not cause any issues;
that version is available in all versions of Debian that package it.
Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project
Made all uses of openFd and dup set the close-on-exec flag, with a few
exceptions when starting a git-annex daemon.
Made openFdWithMode be used everywhere, rather than openFd. Adding a new
parameter to it ensures I checked everything, and will help make sure
this gets considered in the future when opening fds.
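The general pattern is sketched below (not git-annex's openFdWithMode;
this uses the openFd signature from unix < 2.8):

    import System.Posix.IO
    import System.Posix.Types (Fd, FileMode)

    -- Open a fd and immediately mark it close-on-exec, so child
    -- processes do not inherit it. Note that in a threaded program
    -- there is a small window between openFd and setFdOption where a
    -- concurrent fork+exec could still inherit the fd.
    openFdCloseOnExec :: FilePath -> OpenMode -> Maybe FileMode -> OpenFileFlags -> IO Fd
    openFdCloseOnExec path mode fmode flags = do
        fd <- openFd path mode fmode flags
        setFdOption fd CloseOnExec True
        return fd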
In lockPidFile, the only thing that keeps the pid file locked, once
daemonize re-runs the command in a new session, is that the fd is
inherited.
In Utility.LogFile.redir, the new fd it dups to does not have the
close-on-exec flag set, because this is used to set up the stdout and
stderr fds, which need to be inherited by child processes.
Same in Assistant.startDaemon where the browser gets started with the
original stdout and stderr.
This does nothing about uses of openFile and similar!
Sponsored-By: mycroft
This is the same as --not --in $remote, but easier to type. And the
documentation of --fast also helps document that drop can do extra work
when used without --fast.
Sponsored-by: Nicholas Golder-Manning
The annex.youtube-dl-command git config is no longer used, git-annex always
runs the yt-dlp command, rather than the old youtube-dl command.
Sponsored-by: Leon Schuermann
This entirely removes Git.BuildVersion, which avoids the possibility that
git-annex will behave differently based on the version of git it was built
with, rather than the version it's used with.
Debian oldoldstable has the oldest version of git that git-annex needs to
support, since that's what is used in the amd64ancient build.
cabal configure will fail if the git version is too old.
Sponsored-by: Nicholas Golder-Manning
p2p: Added --enable option, which can be used to enable P2P networks
provided by external commands named git-annex-p2p-<netname>.
Made git-annex p2p --enable tor behave the same as git-annex enable-tor,
to make tor a bit less of a special case. However, it cannot be run as root,
since it cannot take the user id parameter.
When using the new generic P2P transport to open an outgoing connection
to a peer, this will hold the pid of the git-annex-p2p-<netname>
command.
closeConnection simply waits for it, rather than relying on garbage
collection of the closed handles to close it.
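With simplified types, the close amounts to something like this sketch:

    import Control.Monad (void)
    import System.IO (Handle, hClose)
    import System.Process (ProcessHandle, waitForProcess)

    -- Close both sides of the connection, then reap the transport
    -- command's pid rather than leaving it to garbage collection.
    closeConnection' :: Handle -> Handle -> Maybe ProcessHandle -> IO ()
    closeConnection' hin hout mpid = do
        hClose hin
        hClose hout
        maybe (return ()) (void . waitForProcess) mpid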
In Remote.Helper.Ssh, connProcess is set to Nothing, even though there
is a similar process being used there. That code stores the pid in
OpenConnection instead, and handles waiting for it itself. A bit ugly,
but not worth cleaning up at this point, maybe later.
This is for p2p-annex:: urls that will use the new generic P2P
transport.
In addressCredsFile, threw in a url encoding of any non-alphanumeric
characters that are in the address. This is to avoid any possible path
traversal attacks via a p2p-annex:: url, since the address part of it
could contain any characters. And, went ahead and did the same url
encoding of tor-annex:: urls, even though tor onion addresses are all
alphanumerics, on the off chance that might avoid a similar problem.
(It does not seem likely enough to treat it as a security hole.)
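A sketch of that encoding (the function name is hypothetical, not the
exact code in addressCredsFile):

    import Data.Char (isAlphaNum, isAscii, ord)
    import Text.Printf (printf)

    -- Percent-encode anything that is not an ASCII alphanumeric, so an
    -- address containing eg "../" cannot traverse out of the directory
    -- when used as a filename.
    escapeAddress :: String -> String
    escapeAddress = concatMap esc
      where
        esc c
            | isAscii c && isAlphaNum c = [c]
            | otherwise = printf "%%%02X" (ord c)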
This cleans up after the bug that was fixed in commit
6a9e923c74
Object files that were stored in the wrong location are rescued,
and after that any wrong location logs will be fixed by the usual fsck.
Used protectedOutput to set up a umask that makes the socket only
accessible by the current user.
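The effect is roughly this sketch (protectedOutput in git-annex is more
involved):

    import Control.Exception (bracket)
    import System.Posix.Files (setFileCreationMask)

    -- Run an action with a umask of 0o077, so anything it creates
    -- (here, the unix socket) is only accessible to the current user.
    withProtectedUmask :: IO a -> IO a
    withProtectedUmask action =
        bracket (setFileCreationMask 0o077)
                setFileCreationMask
                (const action)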
Authentication is still needed when using this option unless it is combined
with --wideopen. It was just simpler to keep authentication separate from
this.
This allows for eg dir/user/repo structure. But also other layouts. It
still does not look for repositories that are nested inside other
repositories.
The check for symlinks is mostly to avoid cycles that would prevent
findRepos from returning. Eg, foo/bar/baz being a symlink to foo/bar.
If the directory is writable by someone else, they can still race it and
get it to follow a symlink to some other directory. I don't think p2phttp
needs to worry about that kind of situation though, and I doubt it avoids
such problems when operating on files in a git-annex repository either.
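A sketch of that kind of traversal (not the actual findRepos code):

    import Control.Monad (forM)
    import System.Directory (doesDirectoryExist, listDirectory)
    import System.FilePath ((</>))
    import System.Posix.Files (getSymbolicLinkStatus, isSymbolicLink)

    -- List subdirectories recursively, skipping symlinks so a link like
    -- foo/bar/baz -> foo/bar cannot cause an endless loop.
    walkDirs :: FilePath -> IO [FilePath]
    walkDirs dir = do
        entries <- listDirectory dir
        fmap concat $ forM entries $ \e -> do
            let p = dir </> e
            st <- getSymbolicLinkStatus p
            isdir <- doesDirectoryExist p
            if isSymbolicLink st || not isdir
                then return []
                else (p :) <$> walkDirs p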
As groundwork for making git-annex p2p support other P2P networks than
tor hidden services, when an AuthToken is not a TorAnnex value, but
something else (that will be added later), store the P2PAddress that it
will be used with along with the AuthToken. And in loadP2PAuthTokens,
only return AuthTokens for the specified P2PAddress.
See commit 2de27751d6 for some design work
that led to this.
Also, git-annex p2p --gen-addresses is changed to generate a separate
AuthToken for every P2P address, rather than generating a single
AuthToken and using it for each one. When we have more than just tor,
this will be important for security, to avoid a compromise of one P2P
network exposing the AuthToken used for another network.
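The filtering idea, with simplified types (not git-annex's actual
AuthToken and P2PAddress representations):

    newtype AuthToken = AuthToken String
        deriving (Eq)

    newtype P2PAddress = P2PAddress String
        deriving (Eq)

    -- Only return the tokens recorded for the address being connected
    -- to, so a token for one P2P network is never offered to another.
    loadP2PAuthTokensFor :: P2PAddress -> [(P2PAddress, AuthToken)] -> [AuthToken]
    loadP2PAuthTokensFor addr = map snd . filter ((== addr) . fst)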
sync: Push the current branch first, rather than a synced branch, to better
support git forges (gitlab, gitea, forgejo, etc.) which use push-to-create
with the first pushed branch becoming the default branch.
With considerable complication to filter out the warning message about
receive.denyCurrentBranch when pushing to a non-bare repository. Localization
may break it in the future, but it seems like the best way to handle this. See
my comments for the gory details.
Also fixes it in the graphviz map in some cases, where there is no
description for a repository.
And in json, use the remote name, never the description, since the field
is "remote" which is intended to be the git remote name.
Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project
Avoid using "name" for what git-annex otherwise refers to as a
description.
(For the remotes in the map, the "remote" field should be the remote
name, but there is a bug preventing it from being that.)
Sponsored-by: the NIH-funded NICEMAN (ReproNim TR&D3) project
It was treating remote paths of a remote repo as if they were local paths,
and so trying to expand git directories and so forth on them. That led to
bad results, including a path like "foo.git" getting turned into
"foo.git.git"
Sponsored-by: Dartmouth College's OpenNeuro project
Which is a per-remote version of the annex.web-options config.
Had to plumb RemoteGitConfig through to getUrlOptions. In cases where a
special remote does not use curl, there was no need to do that and I used
Nothing instead.
In the case of the addurl and importfeed commands, it seemed best to say
that running these commands is not using the web special remote per se,
so the config is not used for those commands.
requiredContentMap does not exclude dead repos. Usually this is not a
problem because it is used when we are operating on a repository, and in
that case, the repository is not dead (or if it is, the required content
configurations should still be used). But in the case of fsck, this made an
old required content config for a dead repository be warned about in a
situation where it is not a problem.
updatecluster, updateproxy: When a remote that has no annex-uuid is
configured as annex-cluster-node, warn and avoid writing bad data to the
git-annex branch.
The proxy.log and cluster.log end up unparseable when a NoUUID gets written
to them.
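The guard amounts to something like this sketch (types simplified):

    data UUID = NoUUID | UUID String

    -- Refuse to record a cluster node whose uuid is unknown, since
    -- writing NoUUID would make proxy.log/cluster.log unparseable.
    checkClusterNode :: UUID -> IO () -> IO ()
    checkClusterNode NoUUID _ =
        putStrLn "warning: remote has no annex-uuid, skipping"
    checkClusterNode (UUID _) record = record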
When writing doc/tips/computing_annexed_files.mdwn, I noticed
that a recompute --reproducible followed by a drop and a re-get did not
actually test if the file could be reproducibly computed again.
Turns out that get and drop both operate on staged files. If there is an
unstaged modification in the work tree, that's ignored. Somewhat
surprisingly, other commands like info do operate on staged files. So
behavior is inconsistent, and fairly surprising really, when there are
unstaged modifications to files.
Probably this is rarely noticed because `git-annex add` is used to add a
new version of a file, and then it's staged. Or `git mv` is used to move
a file, rather than `mv` of a file over top of an existing file. So it's
uncommon to have an unstaged annexed file in a worktree.
It might be worth making things more consistent, but that's out of scope
for what I'm working on currently.
Also, I anticipate that supporting unlocked files with recompute will
require it to stage changes anyway.
So, make recompute stage the new version of the file.
I considered having recompute refuse to overwrite an existing staged
file. After all, whatever version was staged before will get lost when
the new version is staged over top of it. But, that's no different than
`git-annex addcomputed` being run with the name of an existing staged
file. Or `git-annex add` being run with a new file content when there is
an existing staged file. Or, for that matter, `git add` being run with
new content when there is an existing staged file.
Used by git-annex-compute-singularity to make addcomputed --fast work.
Also, simplified git-annex-compute-singularity; there is no need to hard
link the container into place. singularity does not care about the
extension of the container, so we can just pass it the annex object file.
In this case, the compute program is run the same as if addcomputed --fast
were used, so it should succeed, without outputting a computed file.
computeInputsUnavailable is in ComputeState for simplicity, but it is
not serialized with the rest of the ComputeState.