git-annex

Author	SHA1	Message	Date
Joey Hess	d905232842	use ResourcePool for hash-object handles Avoid starting an unncessary number of git hash-object processes when concurrency is enabled. Sponsored-by: Dartmouth College's DANDI project	2022-07-25 17:32:39 -04:00
Joey Hess	63cef2ae0b	v8 repositories automatically upgrade to v9 (And v9 later on to v10.) When v9/v10 were added, making v8 automatically upgrade was deferred "for a few months" to prevent interoperability problems if users also have an old version of git-annex. Of course that could still be the case, but there has been a good amount of time and this can't be put off forever. Allow setting annex.autoupgraderepository to false to avoid this upgrade. Previously, that only prevented upgrades from no longer supported git-annex versions, but v8 is still supported, and users may want to keep on v8 to interoperate with an old git-annex version. Sponsored-by: Boyd Stephen Smith Jr. on Patreon	2022-07-25 16:20:04 -04:00
Joey Hess	a0e788c94a	releasing package git-annex version 10.20220724	2022-07-25 14:07:20 -04:00
Joey Hess	4e88137a28	prevent appends except when annex.alwayscompact=false I would like for a new repo version to enable appends, but to do so safely would need a v11 followed by a 1 year delay followed by a v12 that does it. Since a similar v9 and v10 transition is currently happening, and is less than 6 months along in most repos, it does not feel wise to stack up another year-long transition behind that. What if I need to hurry up a new repo version for some other change? Added todo so I remember to make this change at some time when a v11 and probably v12 repo version do make sense. Sponsored-by: Dartmouth College's DANDI project	2022-07-20 13:23:55 -04:00
Joey Hess	36f0bdcd57	add annex.alwayscompact Added annex.alwayscompact setting which can be unset to speed up writes to the git-annex branch in some cases. Sponsored-by: Dartmouth College's DANDI project	2022-07-18 16:39:19 -04:00
Joey Hess	a2b1f369d1	disable journalIgnorable in enableInteractiveBranchAccess Fix a reversion that prevented --batch commands (and the assistant) from noticing data written to the journal by other commands. I have not identified which commit broke this for sure, but probably it was `aeca7c2207` --batch commands that wrote to the journal avoided the problem since journalIgnorable sets unset on write. It's a little bit surprising that nobody noticed that query --batch commands did not see data written by other commands. Sponsored-by: Dartmouth College's DANDI project	2022-07-15 13:48:41 -04:00
Joey Hess	093ad89ead	S3: Avoid writing or checking the uuid file in the S3 bucket when importtree=yes or exporttree=yes It does not make sense for either; importing from an existing bucket should not write to it. And the user may not have write access at all. And exporting to a bucket should not write other files. Also this prevents the uuid file being imported after being written. Sponsored-by: Dartmouth College's DANDI project	2022-07-14 15:05:51 -04:00
Joey Hess	50c2cac7e7	adb: Added configuration setting oldandroid=true To avoid using find -printf, which was first supported in Android around 2019-2020. Probing seems too fragile, and execing stat once per file is too slow to do when there's a faster way available, which brought me to an option... Sponsored-by: Brett Eisenberg on Patreon	2022-07-13 18:00:47 -04:00
Joey Hess	fbc3c223a6	filter-process: Fix protocol for empty files This caused git to complain that filter-process failed and kill it with signal 15. Because it wrote an extra flushPkt for an empty file, which git did not expect, and so git saw an unexpected response to the next request. Luckily, filter-process is only used by default in v9 and up, and v8 is still the default. Also, git had to be updating an empty file, followed by another file, which is a fairly unlikely situation. And git restarts filter-process after this happens and uses it to filter the rest of the files. So this isn't a crippling bug. Sponsored-by: Luke Shumaker on Patreon	2022-07-13 17:13:54 -04:00
Joey Hess	201e41cffd	add: Fix reversion when adding an annex link that has been moved to another directory Fixes commit `f259be7f39` Sponsored-by: Dartmouth College's Datalad project	2022-07-05 16:22:41 -04:00
Joey Hess	d01530ac21	Revert "lts-19.13 (ghc 9.0.2)" This reverts commit `d2bc268317`. That seemed to break building on windows, before it starts building git-annex at all, it tried to install ghc and something blew up: Processing archive: C:\Users\runneradmin\AppData\Local\Programs\stack\x86_64-windows\ghc-9.0.2.tar.xz Extracting ghc-9.0.2.tar ... Extracted total of 11790 files from ghc-9.0.2.tar C:\Users\runneradmin\AppData\Local\Programs\stack\x86_64-windows\ghc-9.0.2-tmp-6d0fbe7f3b29e56c\ghc-9.0.2\: renameDirectory:pathIsDirectory:CreateFile "\\\\?\\C:\\Users\\runneradmin\\AppData\\Local\\Programs\\stack\\x86_64-windows\\ghc-9.0.2-tmp-6d0fbe7f3b29e56c\\ghc-9.0.2\\": does not exist (The system cannot find the file specified.) Hopefully a newer ghc version or updated stackage version will fix this at some point, in the meantime revert it.	2022-07-05 13:13:25 -04:00
Joey Hess	02ef3d6a64	fix build with assistant disabled and webapp enabled The webapp modules cannot build with the assistant disabled, so make the webapp be under the assistant build flag. Sponsored-by: Jarkko Kniivilä on Patreon	2022-06-29 14:19:18 -04:00
Joey Hess	b223988e22	remove --backend from global options --backend is no longer a global option, and is only accepted by commands that actually need it. Three commands that used to support backend but don't any longer are watch, webapp, and assistant. It would be possible to make them support it, but I doubt anyone used the option with these. And in the case of webapp and assistant, the option was handled inconsistently, only taking affect when the command is run with an existing git-annex repo, not when it creates a new one. Also, renamed GlobalOption etc to AnnexOption. Because there are many options of this type that are not actually global (any more) and get added to commands that need them. Sponsored-by: Kevin Mueller on Patreon	2022-06-29 13:33:25 -04:00
Joey Hess	21c50c0f72	fix parallel copy from/to a local git repo Improve handling of parallelization with -J when copying content from/to a git remote that is a local path. Sponsored-by: Nicholas Golder-Manning on Patreon	2022-06-29 12:40:12 -04:00
Joey Hess	d2bc268317	lts-19.13 (ghc 9.0.2)	2022-06-28 14:49:33 -04:00
Joey Hess	c1b9ea2759	The 23 never happened release. It's 24 somewhere..	2022-06-23 13:55:54 -04:00
Joey Hess	57d088e9c2	fix release version	2022-06-23 13:35:14 -04:00
Joey Hess	86968a4047	releasing package git-annex version 10.20220526	2022-06-23 13:31:36 -04:00
Joey Hess	f259be7f39	fix overwrite race with small file that got large When adding a small file, it does not get locked down, so can be modified after git-annex checks that it's small. The use of queued git add made the race window nice and wide too. Fixed by checking if the file has changed, and by not using git add. Instead, have to recapitulate git add's handling of things like symlinks and executable files. Sponsored-by: Jochen Bartl on Patreon	2022-06-14 16:38:56 -04:00
Joey Hess	78a3d44ea0	get rid of racy addLink The remaining callers all did not rely on it checking gitignore, so were easy to convert. They were susceptable to the same overwrite race as add and fix, although less likely to have it and a narrower window than add's race. Command.Rekey in passing got an unncessary call to removeFile deleted. addSymlink handles deleting any existing worktree file.	2022-06-14 14:47:15 -04:00
Joey Hess	64c7f60f7a	fixed overwrite race with git-annex fix Similar to git-annex add, git-annex fix queued git add, so if a file got modified before git add ran, the wrong content would be staged, perhaps a large file content. Sponsored-by: Brock Spratlen on Patreon	2022-06-14 14:19:58 -04:00
Joey Hess	dd6dec4eb1	fix add overwrite race with git-annex add to annex This is not a complete fix for all such races, only the one where a large file gets changed while adding and gets added to git rather than to the annex. addLink needs to go away, any caller of it is probably subject to the same kind of race. (Also, addLink itself fails to check gitignore when symlinks are not supported.) ingestAdd no longer checks gitignore. (It didn't check it consistently before either, since there were cases where it did not run git add!) When git-annex import calls it, it's already checked gitignore itself earlier. When git-annex add calls it, it's usually on files found by withFilesNotInGit, which handles checking ignores. There was one other case, when git-annex add --batch calls it. In that case, old git-annex behaved rather badly, it would seem to add the file, but git add would later fail, leaving the file as an unstaged annex symlink. That behavior has also been fixed. Sponsored-by: Brett Eisenberg on Patreon	2022-06-14 13:37:19 -04:00
Joey Hess	6d0b243d9d	avoid cleaning up move log when drop from remote fails move: Improve resuming a move that succeeded in transferring the content, but where dropping failed due to eg a network problem, in cases where numcopies checks prevented the resumed move from dropping the object from the source repository. This was earlier done for moves that got interrupted during the drop stage. Sponsored-by: Svenne Krap on Patreon	2022-06-09 15:26:25 -04:00
Joey Hess	13fc6a9b6a	fix to use 1 chunk for empty file Fix retrival of an empty file that is stored in a special remote with chunking enabled. The speculative chunk stuff caused a reversion by adding an empty list for the empty file. Which is just wrong; the empty file is still stored on the remote, and should be retrieved like any other file. It uses 1 chunk, so `max 1` is the simple fix. Sponsored-by: Noam Kremen on Patreon	2022-06-09 14:24:56 -04:00
Joey Hess	14584e7a38	initremote type=git probe uuid rather than matching path of an existing remote to find the uuid. The main benefit of this is that locations not using ssh:// will work now, including both paths and host:/path The other benefit is that it's a simpler interface, no need to have an existing remote with the same url and some other name. Although that will still work of course. This does rely on tryGitConfigRead working when given a Git.Repo that is not a remote. Luckily, it works fine that way. Also, tryGitConfigRead will auto-init a local repo that has a git-annex branch. I did not enable auto-init of ssh repos though. The uuid discovery actually happens twice; initremote discovers it, and uses it to store the special remote config, but does not set it in the git remote it creates. So the next run of git-annex does uuid discovery again, and caches it that time. This could be improved for a tiny speedup, but I didn't want to complicate things for that in this commit. Sponsored-by: Dartmouth College's DANDI project	2022-06-09 13:16:50 -04:00
Joey Hess	c59ea5b1ca	info: Added --autoenable option Use cases include using git-annex init --no-autoenable and then going back and enabling the special remotes that have autoenable configured. As well as just querying to remember which ones have it enabled. It lists all special remotes that have autoenable=yes whether currently enabled or not. And it can be used with --json. I pondered making this "git-annex info autoenable", but that seemed wrong because then if the use has a directory named "autoenable", it's unclear what they are asking for. (Although "git-annex info remote" may be similarly unclear.) Making it an option does mean that it can't be provided via --batch though. Sponsored-by: Dartmouth College's Datalad project	2022-06-01 14:20:38 -04:00
Joey Hess	0d50c90794	init: Added --no-autoenable option Someone may disagree with what repositories are set to autoenable and it's good to have local overrides. See https://github.com/datalad/datalad/issues/6634 Sponsored-by: Dartmouth College's Datalad project	2022-06-01 13:27:49 -04:00
Joey Hess	b60d85c4c0	releasing package git-annex version 10.20220525	2022-05-25 14:01:31 -04:00
Joey Hess	85f9193167	fix git-annex test -p test: When limiting tests to run with -p, work around tasty limitation by automatically including dependent tests. This fixes a reversion because it didn't used to use dependencies and forced tasty to run the init tests first. That changed when parallelizing the test suite. It will sometimes do a little more work than strictly required, because it adds init tests deps when limited to eg quickcheck tests, which don't depend on them. But this only adds a few seconds work. Sponsored-by: Dartmouth College's Datalad project	2022-05-23 14:24:54 -04:00
Joey Hess	af0d854460	deal with git's changes for CVE-2022-24765 Deal with git's recent changes to fix CVE-2022-24765, which prevent using git in a repository owned by someone else. That makes git config --list not list the repo's configs, only global configs. So annex.uuid and annex.version are not visible to git-annex. It displayed a message about that, which is not right for this situation. Detect the situation and display a better message, similar to the one other git commands display. Also, git-annex init when run in that situation would overwrite annex.uuid with a new one, since it couldn't see the old one. Add a check to prevent it running too in this situation. It may be that this fix has security implications, if a config set by the malicious user who owns the repo causes git or git-annex to run code. I don't think any git-annex configs get run by git-annex init. It may be that some git config of a command does get run by one of the git commands that git-annex init runs. ("git status" is the command that prompted the CVE-2022-24765, since core.fsmonitor can cause it to run a command). Since I don't know how to exploit this, I'm not treating it as a security fix for now. Note that passing --git-dir makes git bypass the security check. git-annex does pass --git-dir to most calls to git, which it does to avoid needing chdir to the directory containing a git repository when accessing a remote. So, it's possible that somewhere in git-annex it gets as far as running git with --git-dir, and git reads some configs that are unsafe (what CVE-2022-24765 is about). This seems unlikely, it would have to be part of git-annex that runs in git repositories that have no (visible) annex.uuid, and git-annex init is the only one that I can think of that then goes on to run git, as discussed earlier. But I've not fully ruled out there being others.. The git developers seem mostly worried about "git status" or a similar command implicitly run by a shell prompt, not an explicit use of git in such a repository. For example, Ævar Arnfjörð Bjarma wrote: > * There are other bits of config that also point to executable things, > e.g. core.editor, aliases etc, but nothing has been found yet that > provides the "at a distance" effect that the core.fsmonitor vector > does. > > I.e. a user is unlikely to go to /tmp/some-crap/here and run "git > commit", but they (or their shell prompt) might run "git status", and > if you have a /tmp/.git ... Sponsored-by: Jarkko Kniivilä on Patreon	2022-05-20 14:38:27 -04:00
Joey Hess	aa414d97c9	make fsck normalize object locations The purpose of this is to fix situations where the annex object file is stored in a directory structure other than where annex symlinks point to. But it will also move object files from the hashdirmixed back to hashdirlower if the repo configuration makes that the normal location. It would have been more work to avoid that than to let it do it. Sponsored-by: Dartmouth College's Datalad project	2022-05-16 15:38:06 -04:00
Joey Hess	54809e9eb3	fix untrustworthiness of import/export remotes Commit `36133f27c0` had a boolean flip in it, aaargh. Special remotes with importtree=yes or exporttree=yes are once again treated as untrusted, since files stored in them can be deleted or modified at any time. Sponsored-by: Kevin Mueller on Patreon	2022-05-09 15:53:23 -04:00
Joey Hess	e8a601aa24	incremental verification for retrieval from import remotes Sponsored-by: Dartmouth College's Datalad project	2022-05-09 15:39:43 -04:00
Joey Hess	d1cce869ed	implement dataUnits finally Added support for "megabit" and related bandwidth units in annex.stalldetection and everywhere else that git-annex parses data units. Note that the short form is "Mbit" not "Mb" because that differs from "MB" only in case, and git-annex parses units case-insensitively. It would be horrible if two different versions of git-annex parsed the same value differently, so I don't think "Mb" can be supported. See comment for bonus sad story from my childhood. Sponsored-by: Nicholas Golder-Manning	2022-05-05 15:25:11 -04:00
Joey Hess	4e4c44ed8e	hah, I mean 0504 of course	2022-05-04 11:47:40 -04:00
Joey Hess	cb0e89bf77	releasing package git-annex version 10.20220404	2022-05-04 11:46:56 -04:00
Joey Hess	0406c33f58	fix git-annex repair false positive Avoid treating refs/annex/last-index or other refs that are not commit objects as evidence of repository corruption. The repair code checks to find bad refs by trying to run `git log` on them, and assumes that no output means something is broken. But git log on a tree object is empty. This was worth fixing generally, not as a special case, since it's certainly possible that other things store tree or other objects in refs. Sponsored-by: Max Thoursie on Patreon	2022-05-04 11:32:23 -04:00
Joey Hess	43701759a3	disable shellescape for rsync 3.2.4 rsync 3.2.4 broke backwards-compatability by preventing exposing filenames to the shell. Made the rsync and gcrypt special remotes detect this and disable shellescape. An alternative fix would have been to always set RSYNC_OLD_ARGS=1. Which would avoid the overhead of probing rsync --help for each affected remote. But that is really very fast to run, and it seemed better to switch to the modern code path rather than keeping on using the bad old code path. Sponsored-by: Tobias Ammann on Patreon	2022-05-03 12:12:41 -04:00
Joey Hess	280d41b58f	Fix a build failure with ghc 9.2.2 Thanks, gnezdo for the patch.	2022-05-02 14:21:48 -04:00
Joey Hess	17b20a2450	Fix test failure on NFS when cleaning up gpg temp directory Using removePathForcibly avoids concurrent removal problems. The i386ancient build still uses an old version of ghc and directory that do not include removePathForcibly though. Sponsored-by: Dartmouth College's Datalad project	2022-04-19 13:33:33 -04:00
Joey Hess	fd65de0eb9	multicast: Support uftp 5.0 by switching from aes256-cbc to aes256-gcm aes256-gcm is supported by both 4.x and 5.x, while 5.x dropped aes256-cbc. Sponsored-by: Graham Spencer on Patreon	2022-04-19 12:02:10 -04:00
Joey Hess	ff6b36c706	assistant prompt pushing of manual commits to remotes assistant: When annex.autocommit is set, notice commits that the user makes manually, and push them out to remotes promptly. Sponsored-by: Boyd Stephen Smith Jr. on Patreon	2022-03-31 13:02:16 -04:00
Joey Hess	d266a41f8d	prevent numcopies or mincopies being configured to 0 Ignore annex.numcopies set to 0 in gitattributes or git config, or by git-annex numcopies or by --numcopies, since that configuration would make git-annex easily lose data. Same for mincopies. This is a continuation of the work to make data only be able to be lost when --force is used. It earlier led to the --trust option being disabled, and similar reasoning applies here. Most numcopies configs had docs that strongly discouraged setting it to 0 anyway. And I can't imagine a use case for setting to 0. Not that there might not be one, but it's just so far from the intended use case of git-annex, of managing and storing your data, that it does not seem like it makes sense to cater to such a hypothetical use case, where any git-annex drop can lose your data at any time. Using a smart constructor makes sure every place avoids 0. Note that this does mean that NumCopies is for the configured desired values, and not the actual existing number of copies, which of course can be 0. The name configuredNumCopies is used to make that clear. Sponsored-by: Brock Spratlen on Patreon	2022-03-28 15:20:34 -04:00
Joey Hess	959beeea9f	releasing package git-annex version 10.20220322	2022-03-22 13:56:45 -04:00
Joey Hess	a460aa8b70	Removed the NetworkBSD build flag Debian stable and the i386ancient build both have a new enough network to not need this flag any longer. Sponsored-by: Svenne Krap on Patreon	2022-03-22 11:52:52 -04:00
Joey Hess	982eb7ed0d	remove vendored http-client-restricted Removed vendored copy of http-client-restricted, and removed the HttpClientRestricted build flag that avoided that dependency. http-client-restricted is in Debian stable, and the i386ancient build also uses it, so I think this vendored copy is no longer needed. Sponsored-by: Noam Kremen on Patreon	2022-03-22 11:50:06 -04:00
Joey Hess	42b6f24e67	reorder	2022-03-21 16:02:24 -04:00
Joey Hess	6079b0c72c	fix reversion add: Avoid unncessarily converting a newly unlocked file to be stored in git when it is not modified, even when annex.largefiles does not match it. This fixes a reversion in version 10.20220222, where git-annex unlock followed by git-annex add, followed by git commit file could result in git thinking the file was modified after the commit. I do have half a mind to remove the withUnmodifiedUnlockedPointers part of git-annex add. It seems weird, despite that old bug report arguing a case of consistency that it ought to behave that way. When git-annex add surpises me, it seems likely it's wrong.. But for now, this is the smallest possible fix. Sponsored-by: Dartmouth College's Datalad project	2022-03-21 15:54:04 -04:00
Joey Hess	3e2f1f73cb	add back inode to directory special remote ContentIdentifier Directory special remotes with importtree=yes have changed to once more take inodes into account. This will cause extra work when importing from a directory on a FAT filesystem that changes inodes on every mount. To avoid that extra work, set ignoreinodes=yes when initializing a new directory special remote, or change the configuration of your existing remote: git-annex enableremote foo ignoreinodes=yes This will mean a one-time re-import of all contents from every directory special remote due to the changed setting. `73df633a62` thought it was too unlikely that there would be modifications that the inode number was needed to notice. That was probably right; it's very unlikely that a file will get modified and end up with the same size and mtime as before. But, what was not considered is that a program like NextCloud might write two files with different content so closely together that they share the mtime. The inode is necessary to detect that situation. Sponsored-by: Max Thoursie on Patreon	2022-03-21 13:12:02 -04:00
Joey Hess	025c18128b	test: Added --jobs option Default to the number of CPU cores, which seems about optimal on my laptop. Using one more saves me 2 seconds actually. Better packing of workers improves speed significantly. In 2 tests runs, I saw segfaulting workers despite my attempt to work around that issue. So detect when a worker does, and re-run it. Removed installSignalHandlers again, because I was seeing an error "lost signal due to full pipe", which I guess was somehow caused by using it. Sponsored-by: Dartmouth College's Datalad project	2022-03-16 14:42:07 -04:00

1 2 3 4 5 ...

1426 commits