git-annex

Author	SHA1	Message	Date
Joey Hess	c33c226abd	fixed	2023-06-09 16:13:52 -04:00
Joey Hess	a0ab425c95	add ContentIndentifiersCidRemoteKeyIndex Optimise database to further speed up importing large trees from special remotes. See comment for details of why the other index didn't help cid queries. It would probably be better to manually create an index on only cid, rather than adding a second uniqueness constraint that is a larger index. But persistent does not support creating indexes, and an attempt to manually add it to the migration failed. Sponsored-by: Nicholas Golder-Manning on Patreon	2023-06-09 15:12:33 -04:00
Joey Hess	6821ba8dab	sync: use log to track adjusted branch needs updating Speeds up sync in an adjusted branch by avoiding re-adjusting the branch unnecessarily, particularly when it is adjusted with --hide-missing or --unlock-present. When there are a lot of files, that was the majority of the time of a --no-content sync. Uses a log file, which is updated when content presence changes. This adds a little bit of overhead to every file get/drop when on such an adjusted branch. The overhead is minimal for get of any size of file, but might be noticeable for drop in some cases. It seems like a reasonable trade-off. It would be possible to update the log file only at the end, but then it would not happen if the command is interrupted. When not in an adjusted branch, there should be no additional overhead. (getCurrentBranch is an MVar read, and it avoids the MVar read of getGitConfig.) Note that this does not deal with situations such as: git checkout master, git-annex get, git checkout adjusted branch, git-annex sync. The sync won't know that the adjusted branch needs to be updated. Dealing with that would add overhead to operation in non-adjusted branches, which I don't like. Also, there are other situations like having two adjusted branches that both need to be updated like this, and switching between them and sync not updating. This does mean a behavior change to sync, since it did previously deal with those situations. But, the documentation did not say that it did. The man pages only talk about sync updating the adjusted branch after it transfers content. I did consider making sync keep track of content it transferred (and dropped) and only update the adjusted branch then, not to catch up to other changes made previously. That would perform better. But it seemed rather hard to implement, and also it would have problems with races with a concurrent get/drop, which this implementation avoids.
And it seemed pretty likely someone had gotten used to get/drop followed by sync updating the branch. It seems much less likely someone is switching branches, doing get/drop, and then switching back and expecting sync to update the branch. Re-running git-annex adjust still does a full re-adjusting of the branch, for anyone who needs that. Sponsored-by: Leon Schuermann on Patreon	2023-06-08 14:35:41 -04:00
Joey Hess	3c15e0f7a0	cache negative lookups of global numcopies and mincopies Speeds up eg git-annex sync --content by up to 50%. When it does not need to transfer or drop anything, it now noops a lot more quickly. I didn't see anything else in sync --content noop loop that could really be sped up. It has to cat git objects to keys, stat object files, etc. Sponsored-by: unqueued on Patreon	2023-06-06 14:43:25 -04:00
Joey Hess	cfad0def18	wrap	2023-06-05 15:15:20 -04:00
Joey Hess	fe1b2dfb4b	speed up very first tree import by 25% Reading from the cidsdb is responsible for about 25% of the runtime of an import. Since the cidmap is used to store the same information in ram, the cidsdb is not written to during an import any longer. And so, if it started off empty (and updateFromLog wasn't needed), those reads can just be skipped. This is kind of a cheesy optimisation, since after any import from any special remote, the database will no longer be empty, so it's a single use optimisation. But it's probably not uncommon to start by importing a lot of files, and it can save a lot of time then. Sponsored-by: Brock Spratlen on Patreon	2023-06-02 13:30:30 -04:00
Joey Hess	40017089f2	use importChanges optimisation Large speed up to importing trees from special remotes that contain a lot of files, by only processing changed files. Benchmarks: Importing from a special remote that has 10000 files, that have all been imported before, and 1 new file sped up from 26.06 to 2.59 seconds. An import with no change and 10000 unchanged files sped up from 24.3 to 1.99 seconds. Going up to 20000 files, an import with no changes sped up from 125.95 to 3.84 seconds. Sponsored-by: k0ld on Patreon	2023-06-01 13:47:00 -04:00
Joey Hess	f6aa097a39	avoid import writing to cidsdb initially Speed up importing trees from special remotes somewhat by avoiding redundant writes to sqlite database. Before, import would write to both the git-annex branch and also to the sqlite database. But then the next time it was run, needsUpdateFromLog would see the branch had changed, so run updateFromLog, which would make the same writes to the sqlite database a second time. Now import writes only to the git-annex branch. The next time it's run, needsUpdateFromLog sees that the branch has changed and so calls updateFromLog, which updates the sqlite database. Why defer the write to the sqlite database like this? It seems that it could write to the database as it goes, and at the end call recordAnnexBranchTree to indicate that the information in the git-annex branch has all been written to the cidsdb. That would avoid the second import doing extra work. But, there could be other processes running at the same time, and one of them may update the git-annex branch, eg merging a remote git-annex branch into it. Any cids logs on that merged git-annex branch would not be reflected in the cidsdb yet. If the import then called recordAnnexBranchTree, the cidsdb would never get updated with that merged information. I don't think there's a good way to prevent, or to detect that situation. So, it can't call recordAnnexBranchTree at the end. So it might as well wait until the next run and do updateFromLog then. It could instead do updateFromLog at the end, but it's going to check needsUpdateFromLog at the beginning anyway. Note that the database writes were queued, so there is already a cidmap that is used to remember changes that the current process has made. So, omitting database writes can't change the behavior of the current process. Also note that thirdpartypopulatedimport uses recordcidkeyindb, which reflects what it already did. That code path does not use the cidmap, but does not need to query it either. 
It might be possible to make that code path also only update the git-annex branch and not the db, but I haven't checked. Sponsored-by: Noam Kremen on Patreon	2023-05-30 17:05:28 -04:00
Joey Hess	5070087a63	repair: Fix handling of git ref names on Windows Sponsored-by: Kevin Mueller on Patreon	2023-05-30 16:09:13 -04:00
Joey Hess	f2db6da938	default to yt-dlp and fix progress parsing bugs I noticed git-annex was using a lot of CPU when downloading from youtube, and was not displaying progress. Turns out that yt-dlp (and I think also youtube-dl) sometimes only knows an estimated size, not the actual size, and displays the progress output slightly differently for that. That broke the parser. And, the parser was feeding chunks that failed to parse back as a remainder, which caused it to try to re-parse the entire output each time, so it got slower and slower. Using --progress-template like this should avoid parsing problems as well as future proof against output changes. But it will work with only yt-dlp. So, this seemed like the right time to deprecate youtube-dl, and default to yt-dlp when available. git-annex will still use youtube-dl if that's all that's available. However, since the progress parser for youtube-dl was buggy, and I don't want to maintain two different progress parsers (especially since youtube-dl is no longer in debian unstable having been replaced by yt-dlp), made git-annex no longer try to parse youtube-dl's progress. Also, updated docs for yt-dlp being default. It did not seem worth renaming annex.youtube-dl-options and annex.youtube-dl-command. Note that yt-dlp does not seem to document the fields available in the progress template. I found them by reading the source and looking at the templates it uses internally. Also note that the use of "i" (rather than "s") in progressTemplate makes it display floats rounded to integers; particularly the estimated total size can be a float. That also does not seem to be documented but I assume is a python thing? Sponsored-by: Joshua Antonishen on Patreon	2023-05-27 13:04:53 -04:00
Joey Hess	0f89d221bd	version: Avoid error message when entire output is not read Sponsored-by: Dartmouth College's Datalad project	2023-05-19 15:00:57 -04:00
Joey Hess	c4ad9b1446	Fix bug in -z handling of trailing NUL in input The obvious way to fix this would be to adapt lines to split on null. However, it's actually nontrivial to rewrite lines. In particular it has a weird implementation to avoid a space leak. See: https://gitlab.haskell.org/ghc/ghc/-/issues/4334 Also, while that is a small amount of code, it's covered by a rather complex copyright and I'd have to include that copyright in git-annex. So, I opted to filter out the trailing empty string instead. Sponsored-by: Dartmouth College's Datalad project	2023-05-19 14:34:02 -04:00
Joey Hess	e955912ad0	git-annex assist assist: New command, which is the same as git-annex sync but with new files added and content transferred by default. (Also this fixes another reversion in git-annex sync, --commit --no-commit, and --message were not enabled, oops.) See added comment for why git-annex assist does commit staged changes elsewhere in the work tree, but only adds files under the cwd. Note that it does not support --no-commit, --no-push, --no-pull like sync does. My thinking is, why should it? If you want that level of control, use git commit, git annex push, git annex pull. Sync only got those options because pull and push were not split out. Sponsored-by: k0ld on Patreon	2023-05-18 14:37:43 -04:00
Joey Hess	f93a7fce1d	sync: Started transition to --content being enabled by default When used without --content or --no-content, warn about the upcoming transition, and suggest using one of the options, or setting annex.synccontent. Sponsored-by: Brett Eisenberg on Patreon	2023-05-17 13:23:42 -04:00
Joey Hess	40731ff9fd	sync: Added -g as a short option for --no-content I anticipate that if sync is transitioned to syncing content by default, people will want a short option. And in repositories where annex.synccontent = true, they already would. And pull and push sync content by default, so a short option is useful with them too. Mnemonic: -g makes only git data be synced Also, -a makes only annex data be synced. Would have preferred -c, which would complement -C, but it was already taken to set git configs. Sponsored-by: Noam Kremen on Patreon	2023-05-17 12:34:26 -04:00
Joey Hess	5df89d58c7	git-annex pull and push Split out two new commands, git-annex pull and git-annex push. Those plus a git commit are equivalent to git-annex sync. In a sense, git-annex sync conflates 3 things, and it would have been better to have push and pull from the beginning and not sync. Although note that git-annex sync --content is faster than a pull followed by a push, because it only has to walk the tree once, look at preferred content once, etc. So there is some value in git-annex sync in speed, as well as user convenience. And it would be hard to split out pull and push from sync, as far as the implementation goes. The implementation inside sync was easy, just adjust SyncOptions so it does the right thing. Note that the new commands default to syncing content, unless annex.synccontent is explicitly set to false. I'd like sync to also do that, but that's a hard transition to make. As a start to that transition, I added a note to git-annex-sync.mdwn that it may start to do so in a future version of git-annex. But a real transition would necessarily involve displaying warnings when sync is used without --content, and time. Sponsored-by: Kevin Mueller on Patreon	2023-05-16 16:51:07 -04:00
Joey Hess	2e984c51b6	sync --no-pull and --no-push affect download and upload of content The man page is somewhat vague about this, but I do think it was a bug that these options didn't already behave that way. The options are documented to disable imports and exports, which are the same operations just with a special remote that uses trees. The real motivation for this is that I'm adding git-annex pull and git-annex push, and I want these options to turn off the equivalent of those commands. And git-annex pull will certainly download and push upload. Sponsored-by: Nicholas Golder-Manning on Patreon	2023-05-16 16:25:23 -04:00
Joey Hess	212442dd9b	pullOption should be pushOption in seekExportContent sync: Fix bug that made --no-pull, rather than --no-push prevent exporting trees to special remotes. Sponsored-by: Joshua Antonishen on Patreon	2023-05-16 15:55:24 -04:00
Joey Hess	271f3b1ab4	uninit: Support --json and --json-error-messages Had to convert uninit to do everything that can error out inside a CommandStart. This was harder than feels nice. (Also, in passing, converted CommandCheck to use a data type, not a weird number that it was not clear how it managed to be unique.) Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-05-11 13:43:02 -04:00
Joey Hess	02cfef1f91	uninit: Avoid buffering the names of all annexed files in memory Oops, using the same list twice does prevent streaming in constant memory. Sponsored-by: unqueued on Patreon	2023-05-11 13:25:55 -04:00
Joey Hess	de84abb210	configremote: Support --json and --json-error-messages Seems unlikely to be too useful, but who knows. Moved the checkSafeConfig call to happen after an action is started, so it will be captured by --json-error-messages Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-05-10 14:21:42 -04:00
Joey Hess	a242eabc7a	enableremote: Support --json and --json-error-messages Seems unlikely to be too useful, but who knows. Was trivial anyway. Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-05-10 14:09:27 -04:00
Joey Hess	b3cc8dbacb	initremote: Support --json and --json-error-messages Including special --whatelse handling. Otherwise, it seems unlikely to be too useful, but who knows. Refactored code to call starting before displaying error messages. This makes the error messages be captured by --json-error-messages Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-05-10 14:03:40 -04:00
Joey Hess	8d8e044458	upgrade: Support --json and --json-error-messages and --json-progress Seems unlikely to be very useful, but trivial. Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-05-10 12:54:48 -04:00
Joey Hess	c98fb0b637	merge: Support --json and --json-error-messages and --json-progress Seems unlikely to be very useful, but trivial. And, this completes the story that git-annex sync does not need json, since every sub-operation is available in a command that does support json. (Well, except for committing, but that's not a git-annex command.) Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-05-10 12:34:19 -04:00
Joey Hess	7919349cee	importfeed: Support --json and --json-error-messages and --json-progress Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-05-09 16:51:16 -04:00
Joey Hess	04ee6c4c6b	importfeed: Support -J (and work toward supporting --json) Both -J and --json needed importfeed to be refactored to use commandAction. That was difficult, because of the interrelated nature of downloading feeds and then downloading files from feeds, both of which needed to use commandAction. And then checking for problems in feeds has to come after these actions, which may be run as background jobs. As for --json support, it's most of the way there, but still has some warts, so I didn't enable jsonOptions yet. The warts include: - An initial empty json record is displayed by getCache. - Input is not populated, should be feed url - feedProblem at end will not be captured by --json-error-messages (see FIXME) Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-05-09 16:13:56 -04:00
Joey Hess	a71c831949	renameremote: Support --json and --json-error-messages Seems unlikely to be useful, but it works so Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-05-08 16:25:40 -04:00
Joey Hess	3d8f93dc0a	reinject: Support --json and --json-error-messages Also fix support for operating on multiple pairs of files and keys. Moved notAnnexed to inside starting, so error message will get into the json. Cannot include the key in the starting as it's not known yet, so instead add it to the json later. Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-05-08 15:43:37 -04:00
Joey Hess	91b9915b09	reinit: Support --json and --json-error-messages Basically same concerns as init.. Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-05-08 15:07:40 -04:00
Joey Hess	f09a248fe2	init: Support --json and --json-error-messages Dunno how useful this will be, since about all that's accessible from the json is whether it succeeded or failed, and the error messages which were already on stderr. Note that, when autoenabling a special remote, it would be possible for one to stop and prompt or output not using Messages and so not output as part of the json. I don't think that happens, but I'm not 100% sure something doesn't manage to break it. Of course, the same could be the case for commands that transfer objects. Using Annex.Init.autoEnableSpecialRemotes in --json mode would avoid the problem, but I've chosen to wait until I know it's needed to use it. Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-05-08 14:58:08 -04:00
Joey Hess	c208442292	unused: Support --json and --json-error-messages Generalized AddJSONActionItemField to allow it to add several fields. Not entirely happy with that, since the names of the fields have to be carefully chosen to not conflict with other json fields. And fields added that way can't be parsed back in FromJSON, except for the "fields" field that is special cased for metadata. Still, I couldn't see another way to do it. Also, omit file:null from the json output. Which does affect other commands, eg git-annex whereis --all --json. Hopefully that won't break something that expects a null file. If it did, that could be reverted, but it would be ugly to have file:null in the unused --json Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-05-08 14:39:57 -04:00
Joey Hess	365dbc89dc	expire, trust et al, dead, describe: Support --json and --json-error-messages For expire, the normal output is unchanged, but the --json output includes the uuid in machine parseable form. Which could be very useful for this somewhat obscure command. That needed ActionItemUUID to be implemented, which seemed like a lot of work, but then --- I had been going to skip implementing them for trust, untrust, dead, semitrust, and describe, but putting the uuid in the json is useful information, it tells what uuid git-annex picked given the input. It was not hard to support these once ActionItemUUID was implemented. Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-05-05 15:33:30 -04:00
Joey Hess	1a9af823bc	addunused, dropunused: Support --json and --json-error-messages This also changes addunused to display the names of the files that it adds. That seems like a general usability improvement, and not displaying the input number does not seem likely to be a problem to a user, since the filename is based on the key. Displaying the filename was necessary to get it and the key included in the json. dropunused does not include the key in the json. It would be possible to add, but would need more changes. And I doubt that dropunused --json would be used in a situation where a program cared which keys were dropped. Note that drop --unused does have the key in its json, so such a program could just use it. Or could just dropkey --batch with the specific keys it wants to drop if it cares about specific keys. Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-05-05 14:01:40 -04:00
Joey Hess	1d4bd2dcb8	migrate, undo: Support --json and --json-error-messages Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-05-04 16:34:35 -04:00
Joey Hess	38fc5d3fc7	rekey, setpresentkey: Support --json and --json-error-messages Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-05-04 16:03:54 -04:00
Joey Hess	f20c8b087e	fix: Support --json and --json-error-messages And triaged out some commands that don't need to support these options. Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-05-04 14:28:21 -04:00
Joey Hess	46c7c30140	log: Support --json and --json-error-messages Also in passing the --all display was fixed up to not quote keys like filenames. Note that the check added to compareChanges was needed to avoid logging when nothing changed. Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-05-04 12:36:31 -04:00
Joey Hess	f56f6140fa	remove trailing 's' from log --raw-date log: When --raw-date is used, display only seconds from the epoch, as documented, omitting a trailing "s" that was included in the output before. Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-05-04 11:53:38 -04:00
Joey Hess	c235488e2d	rmurl: Support --json and --json-error-messages The json does not include an url field, but it does have an input field that is "file url" when using --batch and ["file", "url"] when using the command line. I chose not to change that because it would complicate batchInput. An url field could be added if it turns out to be useful. Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-05-04 11:28:27 -04:00
Joey Hess	6cbcba484c	unannex: Support --json and --json-error-messages Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-05-03 15:56:20 -04:00
Joey Hess	57c1b4f5e5	initremote: Avoid creating a remote that is not encrypted when gpg is broken checksize was applied lazily, so the exception didn't happen until the remote was set up. Sponsored-by: k0ld on Patreon	2023-05-01 13:00:05 -04:00
Joey Hess	aff37fc208	avoid annexFileMode special case This makes annexFileMode be just an application of setAnnexPerm', which avoids having 2 functions that do different versions of the same thing. Fixes some buggy behavior for some combinations of core.sharedRepository and umask. Sponsored-by: Jack Hill on Patreon	2023-04-27 15:58:37 -04:00
Joey Hess	67f8268b3f	Support core.sharedRepository=0xxx at long last Sponsored-by: Brett Eisenberg on Patreon	2023-04-26 17:03:29 -04:00
Joey Hess	0aa98aa09b	fix perms for core.sharedRepository These two missed setting it. It rarely matters that the journal gets the right perm. But, when using annex.alwayscommit=false, someone else may come along later and want to append to the journal file. It probably never matters what the sentinel perms are, but for completeness.. Sponsored-by: Luke Shumaker on Patreon	2023-04-26 16:29:11 -04:00
Joey Hess	f971b199ed	fix init .git/annex/ perms for core.sharedRepository init: Bug fix: Create .git/annex/ and .git/annex/fsckdb/ directories with permissions configured by core.sharedRepository. The fsckdb being created happens to create .git/annex/ and it was not using createAnnexDirectory. Probably a reversion partly, but maybe the database directory was always created not honoring core.sharedRepository? Sponsored-by: Noam Kremen on Patreon	2023-04-26 16:14:21 -04:00
Joey Hess	7af75a59be	Warn about unsupported core.sharedRepository=0xxx when set This spams the user with a lot of messages, but it seems like busywork to avoid that and only warn once, since this warning will go away when it gets implemented. Also fix parsing of the octal value. Sponsored-by: Kevin Mueller on Patreon	2023-04-26 13:25:29 -04:00
Joey Hess	4d6c918eff	avoid quoting spaces in git-annex find output to terminal That's too much quoting, the user expects the filename to be copy and pasteable. It would be ok to slash-escape space ('\ ') which is what gnu find does, but it doesn't seem necessary either. ${escaped_file} has always quoted spaces though, so keep on doing it there. Sponsored-by: Nicholas Golder-Manning on Patreon	2023-04-26 00:18:30 -04:00
Joey Hess	be36e208c2	json object for FileNotFound When a nonexistent file is passed to a command and --json-error-messages is enabled, output a JSON object indicating the problem. (But git ls-files --error-unmatch still displays errors about such files in some situations.) I don't like the duplication of the name of the command introduced by this, but I can't see a great way around it. One way would be to pass the Command instead. When json is not enabled, the stderr is unchanged. This is necessary because some commands like find have custom output. So displaying "find foo not found" would be wrong. So had to complicate things with toplevelFileProblem having different output with and without json. When not using --json-error-messages but still using --json, it displays the error to stderr, but does display a json object without the error. It does have an errorid though. Unsure how useful that behavior is. Sponsored-by: Dartmouth College's Datalad project	2023-04-25 19:26:20 -04:00
Joey Hess	91ba0cc7fd	Revert "--json-exceptions" This reverts commit `a325524454`. Turns out this was predicated on an incorrect belief that json output didn't already sometimes lack the "key" field. Since json output already can when `giveup` was used, it seems unnecessary to add a whole new option for this.	2023-04-25 17:37:34 -04:00
Joey Hess	a325524454	--json-exceptions Added a --json-exceptions option, which makes some exceptions be output in json. The distinction is that --json-error-messages is for messages relating to a particular ActionItem, while --json-exceptions is for messages that are not, eg ones for a file that does not exist. It's unfortunate that we need two switches with such a fine distinction between them, but I'm worried about maintaining backwards compatibility in the json output, to avoid breaking anything that parses it, and this was the way to make sure I didn't. toplevelWarning is generally used for the latter kind of message. And the other calls to toplevelWarning could be converted to showException. The only possible gotcha is that if toplevelWarning is ever called after starting acting on a file, it will add to the --json-error-messages of the json displayed for that file and converting to showException would be a behavior change. That seems unlikely, but I didn't convert everything to avoid needing to satisfy myself it was not a concern. Sponsored-by: Dartmouth College's Datalad project	2023-04-25 17:05:33 -04:00
Joey Hess	d11b3bc1af	Honor --force option when operating on a local git remote Propagate Annex.force into the remote's Annex state. Fixes this problem: joey@darkstar:~/tmp/xxxx>git-annex copy mmm --to origin --force copy mmm (to origin...) not enough free space, need 908.72 MB more (use --force to override this check or adjust annex.diskreserve) failed to send content to remote failed Does beg the question if anything else should be propagated. Some things like Annex.forcenumcopies certainly not; using --numcopies overrides the number of copies the current repo wants, not all of them. Sponsored-by: Graham Spencer on Patreon	2023-04-19 12:53:58 -04:00
Joey Hess	31e4b6dee1	catch chdir exception in --autostop assistant --autostop: Avoid crashing when ~/.config/git-annex/autostart lists a directory that it cannot chdir to. Sponsored-by: k0ld on Patreon	2023-04-19 12:42:02 -04:00
Joey Hess	9155ed1072	configremote New command, currently limited to changing autoenable= setting of a special remote. It will probably never be used for more than that given the limitations on it. Sponsored-by: Brock Spratlen on Patreon	2023-04-18 15:30:49 -04:00
Joey Hess	8728695b9c	support enableremote of git repo changing eg autoenable= enableremote: Support enableremote of a git remote (that was previously set up with initremote) when additional parameters such as autoenable= are passed. The enableremote special case for regular git repos is intended to handle ones that don't have a UUID probed, and the user wants git-annex to re-probe. So, that special case is still needed. But, in that special case, the user is not passing any extra parameters. So, when there are parameters, instead run the special remote setup code. That requires there to be a uuid known already, and it allows changing things like autoenable= Remote.Git.enableRemote changed to be a no-op if a git remote with the name already exists. Which it generally will in this case. Sponsored-by: Jack Hill on Patreon	2023-04-18 14:00:24 -04:00
Joey Hess	160d4c9254	whereused: Fix display of branch:file when run in a subdirectory The file needs to be relative to the top of the repository in that case, but it was relative to the subdir. Sponsored-by: Luke Shumaker on Patreon	2023-04-12 15:18:04 -04:00
Joey Hess	3346aa9659	safe output to terminal for calckey inprogress and lookupkey These are quite low-level, but still there is no point in displaying escape sequences that have been embedded in a key to the terminal. I think these are the only remaining commands that didn't use safe output, except for cases where git-annex is speaking a protocol to itself. Sponsored-by: Kevin Mueller on Patreon	2023-04-12 14:03:44 -04:00
Joey Hess	c50aa21d5f	init: Avoid autoenabling special remotes that have control characters in their names I'm on the fence about this. Notice that pulling from a git remote can pull branches that have escape sequences in their names. Git will display those as-is. Arguably git should try harder to avoid that. But, names of remotes are usually up to the local user, and autoenable changes that, and so it makes sense that git chooses to display control characters in names of remotes, and so autoenable needs to guard against it. Sponsored-by: Graham Spencer on Patreon	2023-04-12 12:37:12 -04:00
Joey Hess	afa5b883dc	find, findkeys, examinekey: escape output to terminal when --format is not used Note that filenames are not quoted, only escaped. This is to match the output of --format with escaping. Sponsored-by: Lawrence Brogan on Patreon	2023-04-11 15:27:07 -04:00
Joey Hess	df6f9f1ee8	filter out control characters and quote filenames Searched for uses of putStr and hPutStr and changed appropriate ones to filter out control characters and quote filenames. This notably does not make find and findkeys quote filenames in their default output. Because they should only do that when stdout is not a pipe. A few commands like calckey and lookupkey seem too low-level to make sense to filter output, so skipped those. Also when relaying output from other commands that is not progress output, have git-annex filter out control characters. Sponsored-by: k0ld on Patreon	2023-04-11 14:27:22 -04:00
Joey Hess	da83652c76	addurl --preserve-filename: reject control characters As well as escape sequences, control characters seem unlikely to be desired when doing addurl, and likely to trip someone up. So disallow them as well. I did consider going the other way and allowing filenames with control characters and escape sequences, since git-annex is in the process of escaping display of all filenames. Might still be a better idea? Also display the illegal filename git quoted when it rejects it. Sponsored-by: Nicholas Golder-Manning on Patreon	2023-04-10 12:18:25 -04:00
Joey Hess	d689a5b338	git style filename quoting controlled by core.quotePath This is by no means complete, but escaping filenames in actionItemDesc does cover most commands. Note that for ActionItemBranchFilePath, the value is branch:file, and I choose to only quote the file part (if necessary). I considered quoting the whole thing. But, branch names cannot contain control characters, and while they can contain unicode, git does not quote unicode when displaying branch names. So, it would be surprising for git-annex to quote unicode in a branch name. The find command is the most obvious command that still needs to be dealt with. There are probably other places that filenames also get displayed, eg embedded in error messages. Some other commands use ActionItemOther with a filename, I think that ActionItemOther should either be pre-sanitized, or should explicitly not be used for filenames, so that needs more work. When --json is used, unicode does not get escaped, but control characters were already escaped in json. (Key escaping may turn out to be needed, but I'm ignoring that for now.) Sponsored-by: unqueued on Patreon	2023-04-08 14:52:26 -04:00
Joey Hess	9c242af171	releasing package git-annex version 10.20230407	2023-04-07 13:37:03 -04:00
Joey Hess	98a3ba0ea5	restore old registerurl location tracking behavior registerurl: When an url is claimed by a special remote other than the web, update location tracking for that special remote. registerurl's behavior was changed in commit `451171b7c1`, apparently accidentally to not update location tracking except for the web. This makes registerurl followed by unregisterurl not be a no-op, when the url happens to be claimed by a remote other than the web. It is a noop when the url is unclaimed except by the web. I don't like the inconsistency, and wish that registerurl and unregisterurl never updated location tracking, which would be more in keeping with them being plumbing. But there is the fact that it used to behave this way, and also it was inconsistent that it updated location tracking for the web but not for other remotes, unlike addurl. And there's an argument that the user might not know what remote to expect to claim an url, so would be considerably in the dark when using registerurl. (Although they have to know what content gets downloaded, since they specify a key..) Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-04-05 17:06:44 -04:00
Joey Hess	2b940f7725	registerurl, unregisterurl: Added --remote option This serves two purposes. --remote=web bypasses other special remotes that claim the url, same as addurl --raw. And, specifying some other remote allows making sure that an url is claimed by the remote you expect, which makes then using setpresentkey not be fragile. Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-04-05 15:54:41 -04:00
Joey Hess	e3f5bd4ca6	Revert "override rather than setting user.name and user.email" This reverts commit `66eb63dd82`. git-annex init is the only thing that uses ensureCommit. So overriding there will make later commits to the git-annex branch or by git-annex sync fail. It's ugly that git-annex init sets user.name and user.email, but it only does it on systems that are badly configured.	2023-04-04 15:15:02 -04:00
Joey Hess	e91bf784cd	Support user.useConfigOnly git config When it's set and git cannot determine user.name or user.email, this will result in git-annex init failing when committing to create the git-annex branch. Other git-annex commands that commit can also fail. Sponsored-by: Jack Hill on Patreon	2023-04-04 15:12:52 -04:00
Joey Hess	66eb63dd82	override rather than setting user.name and user.email Avoid setting user.name and user.email in the git config when git is unable to detect them. git-annex has good reason to want to ensure git commit succeeds when eg committing to the git-annex branch. But it's not playing nice to set these values where other commands can see them. Sponsored-by: Brett Eisenberg on Patreon	2023-04-04 14:56:44 -04:00
Joey Hess	3eb51ee929	readFileStrict to avoid laziness bug Fix laziness bug introduced in last release that breaks use of --unlock-present and --hide-missing adjusted branches. Since there is a writeFile of the same file immediately after readFile, it may still have the file open for read (or may have happened to read it already and closed it). I was not able to reproduce the problem in brief testing, but this seems obvious. Sponsored-by: Luke Shumaker on Patreon	2023-04-04 14:25:01 -04:00
Joey Hess	cc36c8516a	Sped up sqlite inserts 2x when built with persistent 2.14.5.0 https://github.com/yesodweb/persistent/issues/1457 Sponsored-by: Dartmouth College's DANDI project	2023-03-31 14:38:25 -04:00
Joey Hess	2b40fa51d3	git-annex.cabal: Prevent building with unix-compat 0.7 Which removed System.PosixCompat.User. See https://github.com/haskell-pkg-janitors/unix-compat/issues/3 Sponsored-by: Noam Kremen on Patreon	2023-03-31 12:52:23 -04:00
Joey Hess	000723d96c	releasing package git-annex version 10.20230329	2023-03-29 16:09:05 -04:00
Joey Hess	40a5c645cf	prep for release tomorrow	2023-03-28 17:02:34 -04:00
Joey Hess	18d326cb6f	external protocol VERSION 2 Support VERSION 2 in the external special remote protocol, which is identical to VERSION 1, but avoids external remote programs needing to work around the above bug. External remote programs that support exporttree=yes are recommended to be updated to send VERSION 2. Sponsored-by: Kevin Mueller on Patreon	2023-03-28 17:00:08 -04:00
Joey Hess	02662f5292	fix concurrency bug causing EXPORT to be sent to the wrong external Fix bug that caused broken protocol to be used with external remotes that use exporttree=yes. In some cases this could result in the wrong content being exported to, or retrieved from the remote. Sponsored-by: Nicholas Golder-Manning on Patreon	2023-03-28 15:21:10 -04:00
Joey Hess	a5709dcc22	Copy with a reflink when exporting a tree to a directory special remote Remote.Directory makes a temp file, then calls this, and since the temp file exists, it prevented probing if CoW works. Note that deleting the empty file does mean there's a small window for a race. If another process is also exporting to the remote, that could let it make the same temp file. However, the temp filename actually has the process's pid in it, which avoids that being a problem. This may have been a reversion caused by commits around `63d508e885`, but I haven't gone back and tested to be sure. The directory special remote had supposedly supported CoW for this going back to about half a year before that. Sponsored-by: Graham Spencer on Patreon	2023-03-28 13:09:14 -04:00
Joey Hess	24ae4b291c	addurl, importfeed: Fix failure when annex.securehashesonly is set The temporary URL key used for the download, before the real key is generated, was blocked by annex.securehashesonly. Fixed by passing the Backend that will be used for the final key into runTransfer. When a Backend is provided, have preCheckSecureHashes check that, rather than the key being transferred. Sponsored-by: unqueued on Patreon	2023-03-27 15:10:46 -04:00
Joey Hess	cd076cd085	Windows: Support urls like "file:///c:/path" That is a legal url, but parseUrl parses it to "/c:/path" which is not a valid path on Windows. So as a workaround, use parseURIPortable everywhere, which removes the leading slash when run on windows. Note that if an url is parsed like this and then serialized back to a string, it will be different from the input. Which could potentially be a problem, but is probably not in practice. An alternative way to do it would be to have an uriPathPortable that fixes up the path after parsing. But it would be harder to make sure that is used everywhere, since uriPath is also used when constructing an URI. It's also worth noting that System.FilePath.normalize "/c:/path" yields "c:/path". The reason I didn't use it is that it also may change "/" to "\" in the path and I wanted to keep the url changes minimal. Also noticed that convertToWindowsNativeNamespace handles "/c:/path" the same as "c:/path". Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2023-03-27 13:38:02 -04:00
Joey Hess	2b5fa091e2	annex.maxextensionlength for view view: Support annex.maxextensionlength when generating filenames for the view branch. Note that refining an existing view will reuse the extension length that was configured when initially constructing the view. This is necessarily the case because it reuses the filenames. Also view files used to have all extensions at the end, no matter how many there were. Since annex.maxextensionlength's documentation includes that it's limited to 2 extensions, I made it consistent with that. Sponsored-by: k0ld on Patreon	2023-03-24 14:01:38 -04:00
Joey Hess	038a2600f4	Avoid leaving repo with a detached head when there is a failure checking out an updated adjusted branch I don't know of scenarios where that can happen (besides the bug fixed by the parent commit), but there probably are some. Sponsored-by: Boyd Stephen Smith Jr. on Patreon	2023-03-23 16:36:43 -04:00
Joey Hess	cb4d9f7b1f	run restagePointerFiles in adjustedBranchRefreshFull Avoid failure to update adjusted branch --unlock-present after git-annex drop when annex.adjustedbranchrefresh=1 At higher values, it did flush the queue, which ran restagePointerFiles. But at 1, adjustedBranchRefreshFull gets added to the queue, and while restagePointerFiles is also in the queue, it runs after that. Sponsored-by: Brock Spratlen on Patreon	2023-03-23 16:25:45 -04:00
Joey Hess	a0badc5069	sync: Fix parsing of gcrypt::rsync:// urls that use a relative path Such an url is not valid; parseURI will fail on it. But git-annex doesn't actually need to parse the url, because all it needs to do to support syncing with it is know that it's not a local path, and use git pull and push. (Note that there is no good reason for the user to use such an url. An absolute url is valid and I patched git-remote-gcrypt to support them years ago. Still, users gonna do anything that tools allow, and git-remote-gcrypt still supports them.) Sponsored-by: Jack Hill on Patreon	2023-03-23 15:20:00 -04:00
Joey Hess	b624394c72	releasing package git-annex version 10.20230321	2023-03-21 16:14:10 -04:00
Joey Hess	570035b3f6	credit	2023-03-17 15:22:12 -04:00
Yaroslav Halchenko	0ae5ff797f	Typo: sansative -> sensitive	2023-03-17 15:14:50 -04:00
Joey Hess	f1b678face	copy --from --to location tracking update copy: When --from and --to are combined and the content is already present on the destination remote, update location tracking as necessary. Sponsored-by: Dartmouth College's DANDI project	2023-03-13 14:51:09 -04:00
Joey Hess	38e9ea8497	one-way escaping of newlines in uuid.log A repository can have a newline in its description due to being in a directory containing a newline, or due to git-annex describe being passed a string with a newline in it for some reason. Putting that newline in uuid.log breaks its format. So, escape the newline when it enters uuid.log, to \n This is a one-way escaping, it is not converted back to a newline when reading the log. If it were, commands like git-annex info and whereis would display a multi-line description, which could be confusing to read. And, implementing roundtripping would necessarily cause problems if an old version of git-annex were used to set a description that contained whatever special character is used to escape the \n. Eg, a \ or if it used the ! prefix before base64 data that is used in some other logs, the ! character. Then the description set by the old git-annex would not roundtrip. There just doesn't seem to be any benefit of roundtripping newlines through, so why bother? And, git often displays \n for newline when a filename contains a newline, so git-annex doing it in this case seems sorta ok by analogy to git. (Some other git-annex logs can also have newlines put into them if the user really wants to break git-annex. For example: git-annex config annex.largefiles "foo bar" The full list is probably config.log, remote.log, group.log, preferred-content.log, required-content.log, group-preferred-content.log, schedule.log. Probably there is no good reason to use a newline in any of these, and the breakage is probably limited to the bad data the user put in not coming back out. And users can write any garbage to log files themselves manually in any case. So, I am not going to address all of those at this time. If a problem such as this one with the newline in the repository path comes up, it can be dealt with on a case by case basis.) Sponsored-by: Dartmouth College's Datalad project	2023-03-13 14:19:32 -04:00
Joey Hess	2323af3736	importfeed: Display feed title When importing a bunch of feeds, this makes it more clear what it's working on. Also, I sometimes want to delete a particular feed from a list of feeds but don't know which url belongs to the feed, and this solves that. Control characters are filtered out just to protect against some feed putting escape character stuff in the feed, which could be a security problem. (Control characters also get filtered out of importfeed filenames.) Sponsored-by: Luke Shumaker on Patreon	2023-03-11 13:52:45 -04:00
Joey Hess	d8feda7a2f	added arm64-ancient build Added arm64 build for ancient kernels, needed to support Android phones whose kernels are too old to support kernels used by the current arm64 build. Updated Android/git-annex-install to use it. (Also made it use i386-ancient because that seems like a good idea.) Sponsored-by: Noam Kremen on Patreon	2023-03-10 11:59:03 -04:00
Joey Hess	ff141c093e	include subdir when checking export branch is checked out sync: Fix a reversion that prevented sending files to exporttree=yes remotes when annex-tracking-branch was configured to branch:subdir (Introduced in version 10.20230214) Sponsored-by: Kevin Mueller on Patreon	2023-03-10 11:41:52 -04:00
Joey Hess	54ad1b4cfb	Windows: Support long filenames in more (possibly all) of the code Works around this bug in unix-compat: https://github.com/jacobstanley/unix-compat/issues/56 getFileStatus and other FilePath using functions in unix-compat do not do UNC conversion on Windows. Made Utility.RawFilePath use convertToWindowsNativeNamespace to do the necessary conversion on windows to support long filenames. Audited all imports of System.PosixCompat.Files to make sure that no functions that operate on FilePath were imported from it. Instead, use the equivalents from Utility.RawFilePath. In particular the re-export of that module in Common had to be removed, which led to lots of other changes throughout the code. The changes to Build.Configure, Build.DesktopFile, and Build.TestConfig make Utility.Directory not be needed to build setup. And so let it use Utility.RawFilePath, which depends on unix, which cannot be in setup-depends. Sponsored-by: Dartmouth College's Datalad project	2023-03-01 15:55:58 -04:00
Joey Hess	9c3c4c1712	deprecate git-annex status w/o runtime warning As far as I can see, git-annex status was added to support direct mode, and like other things added for that, it ought to be deprecated. Behavior is similar to git status --short, though not identical in a few cases eg renamed files. I think datalad does not use this command, although it might have in the past. Could not find any use of it in the current datalad code. A deprecation warning at runtime would be the next step, probably will wait and do that for all the deprecated commands together (except findref).	2023-02-28 16:34:31 -04:00
Joey Hess	9fcaf27cba	done with adjusted view branches! Well, perhaps it could be documented better, but it's a compositional feature so users who need it will probably try it and be happy to find that it works.	2023-02-27 15:55:31 -04:00
Joey Hess	1c4f4b449a	support --unlock-present adjustment of view branches When generating the view, check if the key is present. When syncing in a view branch with an adjustment, run adjustedBranchRefreshFull the same as is done when syncing in other adjusted branches. This is needed because the docs for git-annex adjust --unlock-present suggest using git-annex sync to update the branch when annex.adjustedbranchrefresh is not set. Note that, with annex.adjustedbranchrefresh set, it just works! The adjusted branch gets updated in the usual way and it doesn't matter that there's a view branch underneath. And of course, re-running git-annex adjust --unlock-present also works, as suggested in the docs. Sponsored-by: Erik Bjäreholt on Patreon	2023-02-27 15:37:57 -04:00
Joey Hess	a206cdddb4	releasing package git-annex version 10.20230227	2023-02-27 12:23:43 -04:00
Joey Hess	195508fc65	Improve error message when unable to read a sqlite database due to permissions problem Old message was: sqlite query crashed: thread blocked indefinitely in an MVar operation New message is eg: sqlite worker thread crashed: SQLite3 returned ErrorCan'tOpen while attempting to perform open ".git/annex/keysdb/db". The worker thread used to throw an exception. But before that exception was seen by anything waiting on the worker thread to finish, the takeMVar in queryDb would have crashed with BlockedIndefinitelyOnMVar. Sponsored-by: k0ld on Patreon	2023-02-23 15:28:22 -04:00
Joey Hess	f24f96e018	move webapp build deps under Assistant build flag git-annex.cabal: Move webapp build deps under the Assistant build flag so git-annex can be built again without yesod etc installed. Commit `78440ca37d` got rid of the webapp build flag to work around what was apparently a cabal bug. It moved the webapp build deps to the main build-depends list. But that prevents building git-annex when yesod etc are not installed. Putting them under the Assistant build flag seems to not tickle that cabal bug, and lets git-annex build automatically with the assistant disabled when the webapp build deps are not installed. I hypothesize that the problem may have involved build-depends nested behind two build flags. Also, cabal clean may need to be run in order for cabal to find the right solution after this change, when building in a directory where cabal configure had been run before. Also moved 3 modules that are needed to build git-annex w/o the assistant out from under the Assistant build flag. Sponsored-by: Brock Spratlen on Patreon	2023-02-23 12:25:22 -04:00
Joey Hess	8f2829e646	Revert "stack.yaml: Update to lts-19.33 and aws-0.24" This reverts commit `648e59cac2`. Failed to build on windows, because In the dependencies for haskeline-0.8.2: Win32-2.11.1.0 from Stack configuration does not match >=2.1 && <2.10 || >=2.12 (latest matching version is 2.13.4.0) jkniiv did find a solution that builds: -- Win32-2.11.1.0 +- Win32-2.9.0.0 +- Cabal-3.6.3.0 +- directory-1.3.7.1 +- process-1.6.17.0 +- time-1.11.1.2 But that is a quite old version of Win32 and risks bugs from it, and bumping Cabal and directory to newer than lts-19.33 has also seems likely to be risky. So, I've given up. aws-0.24 won't be able to be in the stack build until there's a stackage lts (or nightly) that has filepath (>=1.4.100.0), which will not happen until sometime after the next ghc release.	2023-02-20 15:15:06 -04:00
Joey Hess	16d3097a08	fix reversion in info, and add test case info: Fix reversion in last release involving handling of unsupported input by continuing to handle any other inputs, before exiting nonzero at the end. Sponsored-by: Dartmouth College's Datalad project	2023-02-20 14:31:37 -04:00
Joey Hess	da61d564f1	fix view reversion caused by optimisation view: Fix a reversion in 10.20230214 that omitted a file from a view when the file had no metadata set, but the view only used path fields. Sponsored-by: Jack Hill on Patreon	2023-02-16 15:18:17 -04:00
Joey Hess	648e59cac2	stack.yaml: Update to lts-19.33 and aws-0.24 This enables some new features that need the new aws. Win32 downgraded from the version in lts-19.33 because git-annex does not build with that version, and newer versions of Win32 need a newer filepath version, which can't be upgraded while using lts-19.33. Sponsored-By: Brett Eisenberg on Patreon	2023-02-15 14:51:41 -04:00
Joey Hess	672258c8f4	Revert "revert recent bug fix temporarily for release" This reverts commit `16f1e24665`.	2023-02-14 14:11:23 -04:00
Joey Hess	f3019d7e22	releasing package git-annex version 10.20230214	2023-02-14 14:09:10 -04:00
Joey Hess	16f1e24665	revert recent bug fix temporarily for release Decided this bug is not severe enough to delay the release until tomorrow, so this will be re-applied after the release.	2023-02-14 14:06:29 -04:00
Joey Hess	c1ef4a7481	Avoid Git.Config.updateLocation adding "/.git" to the end of the repo path to a bare repo when git config is not allowed to list the configs due to the CVE-2022-24765 fix. That resulted in a confusing error message, and prevented the nice message that explains how to mark the repo as safe to use. Made isBare a tristate so that the case where core.bare is not returned can be handled. The handling in updateLocation is to check if the directory contains config and objects and if so assume it's bare. Note that if that heuristic is somehow wrong, it would construct a repo that thinks it's bare but is not. That could cause follow-on problems, but since git-annex then checks checkRepoConfigInaccessible, and skips using the repo anyway, a wrong guess should not be a problem. Sponsored-by: Luke Shumaker on Patreon	2023-02-14 14:00:36 -04:00
Joey Hess	452b080dba	better handling of multiple repositories with the same name Used to fail with a bad error message, indicating there was no repository with the specified name, or something like that. Now, suggest they use the uuid to disambiguate. * info, enableremote, renameremote: Avoid a confusing message when more than one repository matches the user provided name. * info: Exit nonzero when the input is not supported. Sponsored-by: Kevin Mueller on Patreon	2023-02-13 14:31:09 -04:00
Joey Hess	826b225ca8	Sped up view branch construction by 50% A benchmark in my sound repository with `git-annex view feedtitle=*` took 2:52 wall clock time before and 1:58 after. Though it still only used 130% of CPU. This is the same kind of optimisation that is in seekFilteredKeys, though that precaches location logs while this streams the metadata logs direct to parsing them. seekFilteredKeys contains more streaming, to find the annexed files, and this could be further sped up with similar streaming. Sponsored-by: Nicholas Golder-Manning on Patreon	2023-02-13 13:29:57 -04:00
Joey Hess	e9b6efac5a	fix buggy sync to exporttree remote when annex-tracking-branch is not checked out sync: Fix a bug that caused files to be removed from an importtree=yes exporttree=yes special remote when the remote's annex-tracking-branch was not the currently checked out branch. Sponsored-by: Max Thoursie on Patreon	2023-02-10 15:49:15 -04:00
Joey Hess	c2b3e870df	finishing up sync in view branch sync: When run in a view branch, avoid updating synced/ branches, or trying to merge anything from remotes. Sponsored-by: Erik Bjäreholt on Patreon	2023-02-10 15:27:42 -04:00
Joey Hess	bb4550c7c1	sync: Warn when the adjusted basis ref cannot be found As happens eg when the user has renamed branches. Sponsored-by: Graham Spencer on Patreon	2023-02-10 14:33:21 -04:00
Joey Hess	96d46db2d5	Support http urls that contain ":" that is not followed by a port number The same as git does. Sponsored-by: Dartmouth College's DANDI project	2023-02-10 13:34:47 -04:00
Joey Hess	5f9bf51438	sync in view branch updates the view branch * sync: When run in a view branch, refresh the view branch to reflect any changes that have been made to the parent branch or metadata. This is basically working, but probably needs some more work to deal with all the edge cases of things sync does. Sponsored-by: Lawrence Brogan on Patreon	2023-02-08 15:37:28 -04:00
Joey Hess	a11d6e0baf	avoid sync pushing view branches to remotes, and better view branch names * sync: Avoid pushing view branches to remotes. * Changed the name of view branches to include the parent branch. Existing view branches checked out using an old name will still work. It does not seem useful for sync to push view branches around, because the information in a view branch can entirely be derived from other information in git. And sync doesn't push adjusted branches around either. The better view branch names make it more in line with adjusted branch names, but were also needed to make fromViewBranch be able to return the original branch name. Kept the old view branch names still working. But, when those branches exist in a repo, sync will still try to push them as before. Avoiding that would need more complicated and/or expensive changes to sync. Sponsored-By: Boyd Stephen Smith Jr. on Patreon	2023-02-08 13:57:48 -04:00
Joey Hess	aa0350ff49	add directory to views for files that lack specified metadata * view: New field?=glob and ?tag syntax that includes a directory "_" in the view for files that do not have the specified metadata set. * Added annex.viewunsetdirectory git config to change the name of the "_" directory in a view. When in a view using the new syntax, old git-annex will fail to parse the view log. It errors with "Not in a view.", which is not ideal. But that only affects view commands. annex.viewunsetdirectory is included in the View for a couple of reasons. One is to avoid needing to warn the user that it should not be changed when in a view, since that would confuse git-annex. Another reason is that it helped with plumbing the value through to some pure functions. annex.viewunsetdirectory is actually mangled the same as any other view directory. So if it's configured to something like "N/A", there won't be multiple levels of directories, which would also confuse git-annex. Sponsored-By: Jack Hill on Patreon	2023-02-07 16:28:46 -04:00
Joey Hess	04ec726d3b	S3 region= S3: Support a region= configuration useful for some non-Amazon S3 implementations. This feature needs git-annex to be built with aws-0.24. datacenter= sets both the AWS hostname and region in one setting, which is easy when using AWS, but not useful for other hosts. So kept datacenter as-is, but added this additional config. Sponsored-By: Brett Eisenberg on Patreon	2023-02-06 14:08:45 -04:00
Joey Hess	65167463aa	releasing package git-annex version 10.20230126	2023-01-26 15:27:32 -04:00
Joey Hess	3a08c80dd8	improve wording	2023-01-24 14:11:32 -04:00
Joey Hess	a6c1d9752b	move/copy: option parsing for --from with --to Allowing --from and --to as an alternative to --from or --to is hard to do with optparse-applicative! The obvious approach of (pfrom <|> pto <|> pfromandto) does not work when pfromandto uses the same option names as pfrom and pto do. It compiles but the generated parser does not work for all desired combinations. Instead, have to parse optionally from and optionally to. When neither is provided, the parser succeeds, but it's a result that can't be handled. So, have to giveup after option parsing. There does not seem to be a way to make an optparse-applicative Parser give up internally either. Also, need seek' because I first tried making fto be a where binding, but that resulted in a hang when git-annex move was run without --from or --to. I think because startConcurrency was not expecting the stages value to contain an exception and so ended up blocking. Sponsored-by: Dartmouth College's DANDI project	2023-01-18 14:42:39 -04:00
Joey Hess	f8bc208e89	findkeys: New command, very similar to git-annex find but operating on keys I've long been asked for `git-annex find --all` or something like that, but pushed back on it because I feel that the command is analogous to find(1) and so it would be surprising for it to list keys rather than files. So instead, add a new findkeys subcommand. Note that the use of withKeyOptions is rather strange because usually that is used to fall back to --all rather than listing files, but here it's made to default to --all like behavior and never list files. A performance thing that could be improved is that withKeyOptions always reads and caches location logs. But findkeys with no options does not need them, so it could be made faster. That caching does speed up options like --in though. This is really just a subset of a more general performance thing that --all reads location logs sometimes unnecessarily. Anyway, it needs to read the location log in order to checkDead, and it seems good that findkeys does skip dead keys. Also, cleaned up comments on git-annex-find man page asking for --all option. Sponsored-by: Dartmouth College's DANDI project	2023-01-17 14:51:57 -04:00
Joey Hess	cfaae7e931	added an optional cost= configuration to all special remotes Note that when this is specified and an older git-annex is used to enableremote such a special remote, it will simply ignore the cost= field and use whatever the default cost is. In passing, fixed adb to support the remote.name.cost and remote.name.cost-command configs. Sponsored-by: Dartmouth College's DANDI project	2023-01-12 13:42:28 -04:00
Joey Hess	6fa166e1fc	web: Add urlinclude and urlexclude configuration settings Sponsored-by: Dartmouth College's DANDI project	2023-01-09 17:16:53 -04:00
Joey Hess	8d06930c88	web special remote is no longer a singleton Allow initremote of additional special remotes with type=web, in addition to the default web special remote. When --sameas=web is used, these provide additional names for the web special remote, and may also have their own additional configuration (once there is any for the web special remote) and cost. Sponsored-by: Dartmouth College's DANDI project	2023-01-09 15:49:20 -04:00
Joey Hess	f316b7f105	Revert "Removed the vendored git-lfs and the GitLfs build flag" This reverts commit `efda811404`. Turns out that datalad is building git-annex against debian bullseye. https://github.com/datalad/git-annex/issues/149	2023-01-04 17:33:29 -04:00
Joey Hess	7919b9d57a	changelog for `cf892f4256`	2023-01-02 12:36:10 -04:00
Joey Hess	efda811404	Removed the vendored git-lfs and the GitLfs build flag AFAICS all git-annex builds are using the git-lfs library not the vendored copy. Debian stable does have a too old haskell-git-lfs package to be able to build git-annex from source, but there is not currently a backport of a recent git-annex to Debian stable. And if they update the backport at some point, they should be able to backport the library too. Sponsored-by: Svenne Krap on Patreon	2022-12-26 12:49:53 -04:00
Joey Hess	d475f82c62	Added libgcc_s.so.1 to the linux standalone build so pthread_cancel will work In Makefile, listed additional deps of Build/Standalone. Without that, it does not get updated for the change to Utility/LinuxMkLibs.hs when compiling incrementally. Sponsored-by: Dartmouth College's DANDI project	2022-12-22 15:15:25 -04:00
Joey Hess	eb8e0594bb	use status --ignore-submodules in configureSmudgeFilter Speed up git-annex upgrade (from v5) and init in a repository that has submodules. Setting the config does not affect the submodules, so avoid the work of getting status in them, which may involve using the smudge filter etc. Sponsored-By: the NIH-funded NICEMAN (ReproNim TR&D3) project	2022-12-20 16:02:42 -04:00
Joey Hess	0b2dd374d8	--anything and --nothing Added --anything (and --nothing). Eg, git-annex find --anything will list all annexed files whether or not the content is present. This is slightly faster and clearer than --include=* or --exclude=* While I can't imagine how --nothing will be used, preferred content expressions already had anything and nothing, so might as well support both as matching options as well. Sponsored-by: Dartmouth College's Datalad project	2022-12-20 15:44:09 -04:00
Joey Hess	9d60385001	convert renameFile to moveFile to support cross-device moves Improve handling of some .git/annex/ subdirectories being on other filesystems, in the bittorrent special remote, and youtube-dl integration, and git-annex addurl. The only one of these that I've confirmed to be a problem is in the bittorrent special remote when .git/annex/tmp and .git/annex/othertmp are on different filesystems. As well as auditing for renameFile, also audited for createLink, all of those are ok as are the other remaining renameFile calls. Also audited all code paths that use .git/annex/othertmp, and did not find any other cross-device problems. So, removing mention of othertmp needing to be on the same device. Sponsored-by: Dartmouth College's Datalad project	2022-12-20 15:17:50 -04:00
Joey Hess	aa6919737c	--metadata lexicographical comparisons Change --metadata comparisons < > <= and >= to fall back to lexicographical comparisons when one or both values being compared are not numbers. Sponsored-by: Erik Bjäreholt on Patreon	2022-12-12 13:33:24 -04:00
Joey Hess	ab11fd70e2	releasing package git-annex version 10.20221212	2022-12-12 12:51:59 -04:00
Joey Hess	65f9e7a3c7	fix deadlock in restagePointerFiles Fix a hang that occasionally occurred during commands such as move. (A bug introduced in 10.20220927, in commit `6a3bd283b8`) The restage.log was kept locked while running a complex index refresh action. In an unusual situation, that action could need to write to the restage log, which caused a deadlock. The solution is a two-stage process. First the restage.log is moved to a work file, which is done with the lock held. Then the content of the work file is read and processed, which happens without the lock being held. This is all done in a crash-safe manner. Note that streamRestageLog may not be fully safe to run concurrently with itself. That's ok, because restagePointerFiles uses it with the index lock held, so only one can be run at a time. streamRestageLog does delete the restage.old file at the end without locking. If a calcRestageLog is run concurrently, it will either see the file content before it was deleted, or will see it's missing. Either is ok, because at most this will cause calcRestageLog to report more work remains to be done than there is. Sponsored-by: Dartmouth College's Datalad project	2022-12-08 14:36:11 -04:00
Joey Hess	2b5e6ff20a	test: Add --test-debug option This work is supported by the NIH-funded NICEMAN (ReproNim TR&D3) project.	2022-11-28 15:12:53 -04:00
Joey Hess	43f681d4c1	Support parsing yt-dlp output to display download progress Before this fix, no progress was displayed when yt-dlp was used. Sponsored-by: Graham Spencer on Patreon	2022-11-21 15:04:36 -04:00
Joey Hess	5256be61c1	When youtube-dl is not available in PATH, use yt-dlp instead Debian is going to drop youtube-dl which is not active upstream, and yt-dlp is the replacement. This will make it be used if youtube-dl gets removed. If an old version of youtube-dl remains installed, git-annex will still use it. That might not be desirable, but changing git-annex to use yt-dlp in preference to youtube-dl when both are installed risks breaking when the user has annex.youtube-dl-options set to something that is supported by youtube-dl, but not by yt-dlp. Sponsored-by: Boyd Stephen Smith Jr. on Patreon	2022-11-21 14:40:33 -04:00
Joey Hess	2b014f1a8b	don't frontload reconcileStaged in git-annex init init: Avoid scanning for annexed files, which can be lengthy in a large repository. Instead that scan is done on demand. This lets git-annex init be run and some query commands be used in a repository without waiting. Note that autoinit already behaved this way, so while this will mean some commands like git-annex get/unlock/add will do the scan the first time run, that is not really a significant behavior change. And, it's really better to have a consistent behavior. The reason for the inconsistency was a strange bug discussed in `b3c4579c79`. Avoiding reconcileStaged in init will keep avoiding whatever that was. Sponsored-by: Dartmouth College's DANDI project	2022-11-18 13:58:47 -04:00
Joey Hess	c834d2025a	queue more changes to keys db Increasing the size of the queue 10x makes git-annex init 7% faster in a repository with 86000 annexed files. The memory use goes up, from 70876 kb to 85376 kb.	2022-11-18 13:29:34 -04:00
Joey Hess	8fcee4ac9d	Sped up the initial scanning for annexed files by 15% Avoids database querying overhead when the database is newly created. In the large repository where git-annex init took 24 seconds, this sped it up to 20.47 seconds, a speedup of around 15%. Sponsored-by: Dartmouth College's DANDI project	2022-11-18 13:16:57 -04:00
Joey Hess	a3e9a0ae27	update changelog	2022-11-18 12:58:13 -04:00
Joey Hess	def779b250	revert change to use lts-19.32 This reverts commit `15dd7fe84b`. aws 0.23 is not used any longer, so read-only S3 import won't be supported yet when building with stack. That commit broke the build on windows, because the new version of Win32 that was included (because the old one does not work with this lts version) needs a version of filepath that is newer than the one bundled with the ghc in that lts version. It is not possible to override that to a newer filepath. Seems that the only solution to get aws 0.23 will be to wait for a ghc that contains filepath 1.4.100.0. No ghc yet contains it. (Backporting the Win32 fix to a point release version that does not include this bleeding edge filepath would also resolve it, but seems unlikely to happen.) Sponsored-by: Jarkko Kniivilä on Patreon	2022-11-14 11:56:55 -04:00
Joey Hess	b2cc63d5bf	export: fix multi-file delete bug export: Fix a bug that left a file on a special remote when two files with the same content were both deleted in the exported tree. Case of the wrong data structure leading to the wrong result. The DiffMap now contains all the old filenames, and all the new filenames. Note that, when 2 files with the same content are both renamed, it only renames the first, but deletes and re-exports the second. Improving that is possible, but it would need to use a different temporary filename. Anyway, that is an unusual case, and there are known to be other unusual cases where export does not rename with maximum efficiency, IIRC. (Or maybe this is the case that I remember?) Sponsored-by: Dartmouth College's OpenNeuro project	2022-11-09 16:24:37 -04:00
Joey Hess	15dd7fe84b	stack.yaml: Updated to lts-19.32 This allows building with aws-0.23. Win32-2.13.4.0 contains a function that is not in lts-19.32 yet. Adding it to stack.yaml does not seem to cause problems when building on linux.	2022-11-09 14:29:00 -04:00
Joey Hess	e100993935	complete support for S3 signature=anonymous aws-0.23 has been released. When built with an older aws, initremote will error out when run with signature=anonymous. And when a remote has been initialized with that by a version of git-annex that does support it, older versions will fail when the remote is accessed, with a useful error message. Sponsored-by: Dartmouth College's DANDI project	2022-11-04 16:20:28 -04:00
Joey Hess	de1e8201a6	Merge branch 'master' into anons3	2022-11-04 15:08:29 -04:00
Joey Hess	d22bd53310	releasing package git-annex version 10.20221103	2022-11-03 14:07:53 -04:00
Joey Hess	14f7a386f0	Make git-annex enable-tor work when using the linux standalone build Clean the standalone environment before running the su command to run "sh". Otherwise, PATH leaked through, causing it to run git-annex.linux/bin/sh, but GIT_ANNEX_DIR was not set, which caused that script to not work: [2022-10-26 15:07:02.145466106] (Utility.Process) process [938146] call: pkexec ["sh","-c","cd '/home/joey/tmp/git-annex.linux/r' && '/home/joey/tmp/git-annex.linux/git-annex' 'enable-tor' '1000'"] /home/joey/tmp/git-annex.linux/bin/sh: 4: exec: /exe/sh: not found Changed programPath to not use GIT_ANNEX_PROGRAMPATH, but instead run the scripts at the top of GIT_ANNEX_DIR. That works both when the standalone environment is set up, and when it's not. Sponsored-by: Kevin Mueller on Patreon	2022-10-26 15:45:08 -04:00
Joey Hess	731e806c96	use lookupKeyStaged in --batch code paths Make --batch mode handle unstaged annexed files consistently whether the file is unlocked or not. Before this, an unstaged locked file would have the symlink on disk examined and operated on in --batch mode, while an unstaged unlocked file would be skipped. Note that, when not in batch mode, unstaged files are skipped over too. That is actually somewhat new behavior; as late as 7.20191114 a command like `git-annex whereis .` would operate on unstaged locked files and skip over unstaged unlocked files. That changed during optimisation of CmdLine.Seek with apparently little fanfare or notice. Turns out that rmurl still behaved that way when given an unstaged file on the command line. It was changed to use lookupKeyStaged to handle its --batch mode. That also affected its non-batch mode, but since that's just catching up to the change earlier made to most other commands, I have not mentioned that in the changelog. It may be that other uses of lookupKey should also change to lookupKeyStaged. But it may also be that would slow down some things, or lead to unwanted behavior changes, so I've kept the changes minimal for now. An example of a place where the use of lookupKey is better than lookupKeyStaged is in Command.AddUrl, where it looks to see if the file already exists, and adds the url to the file when so. It does not matter there whether the file is staged or not (when it's locked). The use of lookupKey in Command.Unused likewise seems good (and faster). Sponsored-by: Nicholas Golder-Manning on Patreon	2022-10-26 14:43:06 -04:00
Joey Hess	cde2e61105	improve sqlite retrying behavior Avoid hanging when a suspended git-annex process is keeping a sqlite database locked. Sponsored-by: Dartmouth College's Datalad project	2022-10-18 15:47:20 -04:00
Joey Hess	3149a1e2fe	More robust handling of ErrorBusy when writing to sqlite databases While ErrorBusy and other exceptions were caught and the write retried for up to 10 seconds, it was still possible for git-annex to eventually give up and error out without writing to the database. Now it will retry as long as necessary. This does mean that, if one git-annex process is suspended just as sqlite has locked the database for writing, another git-annex that tries to write to it might get stuck retrying forever. But, that could already happen when opening the sqlite database, which retries forever on ErrorBusy. This is an area where git-annex is known to not behave well, there's a todo about the general case of it. Sponsored-by: Dartmouth College's Datalad project	2022-10-17 15:56:19 -04:00
Joey Hess	6fbd337e34	avoid unnecessary keys db writes; doubled speed! When running eg git-annex get, for each file it has to read from and write to the keys database. But it's reading exclusively from one table, and writing to a different table. So, it is not necessary to flush the write to the database before reading. This avoids writing the database once per file, instead it will buffer 1000 changes before writing. Benchmarking getting 1000 small files from a local origin, git-annex get now takes 13.62s, down from 22.41s! git-annex drop now takes 9.07s, down from 18.63s! Wowowowowowowow! (It would perhaps have been better if there were separate databases for the two tables. At least it would have avoided this complexity. Ah well, this is better than splitting the table in an annex.version upgrade.) Sponsored-by: Dartmouth College's Datalad project	2022-10-12 15:33:16 -04:00
Joey Hess	c2ad84b423	all keys are still present on versioned remote after import of a tree When importing from versioned remotes, fix tracking of the content of deleted files. Only S3 supports versioning so far, so only it was affected. But, the draft import/export interface for external remotes also seemed to need a change, so that versionedExport could be set.	2022-10-11 13:05:40 -04:00
Joey Hess	b4305315b2	S3: pass fileprefix into getBucket calls S3: Speed up importing from a large bucket when fileprefix= is set by only asking for files under the prefix. getBucket still returns the files with the prefix included, so the rest of the fileprefix stripping still works unchanged. Sponsored-by: Dartmouth College's DANDI project	2022-10-10 17:37:26 -04:00
Joey Hess	ca91c3ba91	S3: Support signature=anonymous to access a S3 bucket anonymously This can be used, for example, with importtree=yes to import from a public bucket. This needs a patch that has not yet landed in the aws library, and will need to be adjusted to support compiling with old versions of the library, so is not yet suitable for merging. See https://github.com/aristidb/aws/pull/281 The stack.yaml changes are provided to show how to build against the aws fork and will need to be reverted as well. Sponsored-by: Dartmouth College's DANDI project	2022-10-10 17:02:45 -04:00
Joey Hess	4a42c69092	take lock in checkLogFile and calcLogFile move: Fix openFile crash with -J This does make them a bit slower, although usually the log file is not very big, so even when it's being rewritten, they will not block for long taking the lock. Still, little slowdowns may add up when moving a lot of files. A less expensive fix would be to use something lower level than openFile that does not check if the file is already open for write by another thread. But GHC does not seem to provide anything convenient; even mkFD checks for a writing thread. fullLines is no longer necessary since these functions no longer will read the file while it's being written. Sponsored-by: Dartmouth College's DANDI project	2022-10-07 13:19:17 -04:00
Joey Hess	15f9fcbcb1	avoid combining multiple words provided to trust/untrust/dead * trust, untrust, semitrust, dead: Fix behavior when provided with multiple repositories to operate on. * trust, untrust, semitrust, dead: When provided with no parameters, do not operate on a repository that has an empty name. The man page and usage already indicated that multiple repos could be provided to these commands, but they actually used unwords to combine everything into a single string, and found a repo matching that string. This was especially bad when no parameters resulted in the empty string and some repo happened to have an empty description. This does change the behavior, and it's possible someone relied on the current behavior to eg, trust a repo by name with the name not quoted into a single parameter. But fixing the empty string bug and matching the documentation are worth breaking that usage. Note that git-annex init/reinit do still unwords multiple parameters when provided to them. That is inconsistent behavior, but it certainly seems possible that something does run git-annex init with an unquoted description, and I don't think it's worth breaking that just to make it more consistent with these other commands. Sponsored-by: Boyd Stephen Smith Jr. on Patreon	2022-10-03 13:48:40 -04:00
Joey Hess	32a44c3813	releasing package git-annex version 10.20221003	2022-10-03 13:24:21 -04:00
Joey Hess	1328be2013	applied a patch	2022-09-30 14:04:10 -04:00
Joey Hess	49ee07f93d	fix flush of a closed file handle Avoids displaying warning about git-annex restage needing to be run in situations where it does not. Closing a handle flushes it anyway, so no need for an explicit flush. The handle does get closed twice, but that's fine, the second one does nothing. Sponsored-by: Dartmouth College's DANDI project	2022-09-30 14:02:31 -04:00
Joey Hess	02fca53cb2	releasing package git-annex version 10.20220927	2022-09-27 13:31:55 -04:00
Joey Hess	7059322a6c	Support "inbackend" in preferred content expressions Well, actually, fix a typo that has always been in the implementation of that. "inbacked" used to work, but let's not tell users about that; they might try to use it and expect git-annex to keep supporting the typo.. Sponsored-by: Jack Hill on Patreon	2022-09-26 16:06:49 -04:00
Joey Hess	17129fed66	fix wormhole --appid option position p2p: Pass wormhole the --appid option before the receive/send command, as it does not accept that option after the command. I'm left wondering, did I get this wrong from the beginning, or did wormhole change its option parser? I'm reminded of the change in 0.8.2 where it silently changed what FD the pairing code was output to. But, looking at the wormhole source, it was at least putting --appid before send in its test suite from the introduction of the option. So I think probably this has always been broken. On 2021-12-31 the --appid option was enabled, and it took until now for someone to try git-annex p2p --pair and notice that flag day broke it.. Sponsored-by: Svenne Krap on Patreon	2022-09-26 15:11:38 -04:00
Joey Hess	ce65f11de0	enable-tor: Fix breakage caused by git's fix for CVE-2022-24765 This relies on `bfa451fc4e` and is a bit of an ugly hack. Sponsored-by: Noam Kremen on Patreon	2022-09-26 14:48:58 -04:00
Joey Hess	bfa451fc4e	pass --git-dir when reading git config when it was specified explicitly Let GIT_DIR and --git-dir override git's protection against operating in a repository owned by another user. This is the same behavior other git commands have. Sponsored-by: Jarkko Kniivilä on Patreon	2022-09-26 14:38:34 -04:00
Joey Hess	79aadf63d4	changelog and close	2022-09-26 13:11:23 -04:00
Joey Hess	8230f4a6f1	changelog for problem fixed by earlier revert	2022-09-26 12:27:32 -04:00
Joey Hess	2478e9e03a	restage: New git-annex command, handles restaging unlocked files This is much easier and less failure-prone than having the user run git update-index --refresh themselves. Sponsored-by: Dartmouth College's DANDI project	2022-09-23 16:29:59 -04:00
Joey Hess	f64eff9355	test: Added --test-with-git-config option Sponsored-by: Dartmouth College's DANDI project	2022-09-22 15:58:45 -04:00
Joey Hess	6e3c9bea2e	drain transferrer read handle when shutting it down Fixes updating git index file after getting an unlocked file when annex.stalldetection is set. The transferrer may want to send additional protocol messages when it's shut down. Closing the read handle prevented it from doing that, and caused it to crash rather than cleanly shutting down. Draining the handle without processing the protocol seemed ok to do, because anything it outputs is going to be some side message displayed at shutdown. Displaying those once per transferrer process that is running seems unnecessary. Sponsored-by: Dartmouth College's DANDI project	2022-09-22 14:39:39 -04:00
Joey Hess	66bd4f80b3	Improved handling of --time-limit when combined with -J When concurrency is enabled, there can be worker threads still running when the time limit is checked. Exiting right there does not give those threads time to finish what they're doing. Instead, the seeking is wrapped up, and git-annex then shuts down cleanly. The whole point of --time-limit existing, rather than using timeout(1) when running git-annex is to let git-annex finish the action(s) it is working on when the time limit is reached, and shut down cleanly. I noticed this problem when investigating why restagePointerFile might not have run after get/drop of an unlocked file. With --time-limit -J, a worker thread may have finished updating a work tree file, and be killed by the time limit check before it can run restagePointerFile. So despite --time-limit running the shutdown actions, the work tree file didn't get restaged. Sponsored-by: Dartmouth College's DANDI project	2022-09-22 12:54:52 -04:00
Joey Hess	34e313f786	annex.diskreserve default increased from 1 mb to 100 mb It's hard to know what's a good default for this. But 1 mb seems way too small, because it's very easy for a git pull or some similar operation that we don't think of as using much space to use up 1 mb of space. Most people would want to free up some space if a filesystem only had 100 mb free. But on a small VPS, it's probably not uncommon to have only 1 gb free. So 1 gb is too large for annex.diskreserve. While old 1 gb USB keys are around, it's unlikely that anyone is relying on them to shuttle annex data around; it would be worth anyone's time to upgrade to a 32 gb or larger cheap modern USB key ($5). Sponsored-by: Kevin Mueller on Patreon	2022-09-21 15:00:13 -04:00
Joey Hess	24016090b3	wording	2022-09-20 14:55:34 -04:00
Joey Hess	8d26fdd670	skip checkRepoConfigInaccessible when git directory specified explicitly Fix a reversion that prevented git-annex from working in a repository when --git-dir or GIT_DIR is specified to relocate the git directory to somewhere else. (Introduced in version 10.20220525) checkRepoConfigInaccessible could still run git config --list, just passing --git-dir. It seems not necessary, because I know that passing --git-dir bypasses git's check for repo ownership. I suppose it might be that git eventually changes to check something about the ownership of the working tree, so passing --git-dir without --work-tree would still be worth doing. But for now this is the simple fix. Sponsored-by: Nicholas Golder-Manning on Patreon	2022-09-20 14:52:43 -04:00
Joey Hess	0756f4453d	try retrieval from more than one export location when the first fails Combined with commit `0ffc59d341`, this fixes the case where there are duplicate files on the special remote, and the first gets modified/deleted, while the second is still present. directory, adb: Fixed a bug when importtree=yes, and multiple files in the special remote have the same content, that caused it to refuse to get a file from the special remote, incorrectly complaining that it had changed, due to only accepting the inode+mtime of one file (that was since modified or deleted) and not accepting the inode+mtime of other duplicate files. Sponsored-by: Max Thoursie on Patreon	2022-09-20 13:33:57 -04:00
Joey Hess	1fe9cf7043	deal with ignoreinode config setting Improve handling of directory special remotes with importtree=yes whose ignoreinode setting has been changed. (By either enableremote or by upgrading to commit 3e2f1f73cbc5fc10475745b3c3133267bd1850a7.) When getting a file from such a remote, accept the content that would have been accepted with the previous ignoreinode setting. After a change to ignoreinode, importing a tree from the remote will re-import and generate new content identifiers using the new config. So when ignoreinode has changed to no, the inodes will be learned, and after that point, a change in an inode will be detected as a change. Before re-importing, a change in an inode will be ignored, as it was before the ignoreinode change. This seems acceptable, because the user can re-import immediately if they urgently need to add inodes. And if not, they'll do it sometime, presumably, and the change will take effect then. Sponsored-by: Erik Bjäreholt on Patreon	2022-09-16 14:11:25 -04:00
Joey Hess	1ed90cb75e	improve wording	2022-09-16 12:52:00 -04:00
Joey Hess	451a7ce77f	vicfg: Include mincopies configuration Sponsored-by: k0ld on Patreon	2022-09-15 15:11:59 -04:00
Joey Hess	eefc026370	fix reversion on skipping dead keys in --all/bare Fix a reversion that made dead keys not be skipped when operating on all keys via --all or in a bare repo. (Introduced in version 8.20200720) Also, improved the documentation of git-annex-dead, it does not only apply to fsck --all. Also, made git-annex fsck, when run on a file whose key is dead, display that. Before, it displayed that only when run with --all, but with this fix, it skips dead keys with --all. But it can still be run on a file that uses a dead key, and displaying "This key is dead" explains to the user why it does not consider missing content for it to be a problem. Sponsored-by: k0ld on Patreon	2022-09-13 14:38:13 -04:00
Joey Hess	d2c842e9a1	don't force use of conduit in withUrlOptionsPromptingCreds Use curl for downloads from git remotes when annex.url-options and other git configs are set. If the url needs a password, curl will fail, and git credential will not be used to prompt for it. But the user can set --netrc in url-options and put the password in the netrc file. This also means that url-options settings like -4 will take effect. That was the case before commit `1883f7ef8f` forced conduit to be used.	2022-09-09 16:07:32 -04:00
Joey Hess	9621beabc4	cache credentials in memory when doing http basic auth to a git remote When accessing a git remote over http needs a git credential prompt for a password, cache it for the lifetime of the git-annex process, rather than repeatedly prompting. The git-lfs special remote already caches the credential when discovering the endpoint. And presumably commands like git pull do as well, since they may download multiple urls from a remote. The TMVar CredentialCache is read, so two concurrent calls to getBasicAuthFromCredential will both prompt for a credential. There would already be two concurrent password prompts in such a case, and existing uses of `prompt` probably avoid it. Anyway, it's no worse than before.	2022-09-09 14:20:32 -04:00
Joey Hess	8a4cfd4f2d	use getSymbolicLinkStatus not getFileStatus to avoid crash on broken symlink Fix crash importing from a directory special remote that contains a broken symlink. The crash was in listImportableContentsM but some other places in Remote.Directory also seemed like they could have the same problem. Also audited for other places that have such a problem. Not all calls to getFileStatus are bad, in some cases it's better to crash on something unexpected. For example, `git-annex import path` when the path is a broken symlink should crash, the same as when it does not exist. Many of the getFileStatus calls are like that, particularly when they involve .git/annex/objects which should never have a broken symlink in it. Fixed a few other possible cases of the problem. Sponsored-by: Lawrence Brogan on Patreon	2022-09-05 13:46:32 -04:00
Joey Hess	a93163d6f7	optimise linker in linux standalone tarballs Trick the linker into not doing unnecessary work searching for optimised libraries that are not present, by symlinking the directories where optimised libs would be to the main lib dir. This reduces the ENOENT of git-annex init by about 1/2. The linker always finds the files where it looks first time now. I have not looked at what the wall clock speedup might be, it's probably rather small. If an x86-64-v5 comes to be, the list will need to be extended. And there may be other directories used on some machines that I have missed. Not done for arm64 yet, or any uncommon architectures. Sponsored-by: Dartmouth College's Datalad project	2022-08-30 15:20:04 -04:00
Joey Hess	78440ca37d	move assistant and webapp build-depends into main build-depends For some reason, cabal 3.4.1.0 builds w/o the assistant and webapp, even when the flag is explicitly turned on. Moving the build-depends from inside the if flag section to the main build-depends somehow fixes this. Since the webapp build deps are thus always available, there is no reason not to build the webapp when building the assistant. So, got rid of the webapp build flag. Kept the assistant build flag for now, since building without it does at least still speed up the build. Sponsored-by: Brock Spratlen on Patreon	2022-08-29 15:23:49 -04:00
Joey Hess	cbac6c680b	remove changelog entry about reverted stack.yaml change	2022-08-22 12:02:22 -04:00
Joey Hess	e801634875	prep release	2022-08-22 12:02:04 -04:00
Yaroslav Halchenko	0151976676	Typo fix unncessary -> unnecessary. Detected while reading recent CHANGELOG entry but then decided to apply to entire codebase and docs since why not?	2022-08-20 09:40:19 -04:00
Joey Hess	ed39979ac8	import: Avoid following symbolic links inside directories being imported Too big a footgun. This does not prevent attackers who can write to the directory being imported from racing the check. But they can cause anything to be imported anyway, so would be limited to getting the legacy import to follow into a directory they do not write to, and move files out of it into the annex. (The directory special remote does not have that problem since it does not move files.) Sponsored-by: Jack Hill on Patreon	2022-08-19 13:31:16 -04:00
Joey Hess	94029995fa	fix git-annex add regression on deleted file Fix a regression in 10.20220624 that caused git-annex add to crash when there was an unstaged deletion. Sponsored-by: Dartmouth College's Datalad project	2022-08-19 12:55:49 -04:00
Joey Hess	840bd50390	make it easier to use curl for unusual url schemes Use curl when annex.security.allowed-url-schemes includes an url scheme not supported by git-annex internally, as long as annex.security.allowed-ip-addresses is configured to allow using curl. Sponsored-by: Luke Shumaker on Patreon	2022-08-15 12:22:13 -04:00
Joey Hess	e60766543f	add annex.dbdir (WIP) WIP: This is mostly complete, but there is a problem: createDirectoryUnder throws an error when annex.dbdir is set to outside the git repo. annex.dbdir is a workaround for filesystems where sqlite does not work, due to eg, the filesystem not properly supporting locking. It's intended to be set before initializing the repository. Changing it in an existing repository can be done, but would be the same as making a new repository and moving all the annexed objects into it. While the databases get recreated from the git-annex branch in that situation, any information that is in the databases but not stored in the branch gets lost. It may be that no information ever gets stored in the databases that cannot be reconstructed from the branch, but I have not verified that. Sponsored-by: Dartmouth College's Datalad project	2022-08-11 16:58:53 -04:00
Joey Hess	2530012fa3	fix wording	2022-08-10 12:32:49 -04:00
Joey Hess	abd417d4fe	Avoid running multiple bup split processes concurrently Since bup split is not concurrency safe. Used a lock file so that 2 git-annex processes only run one bup split between them (per bup repo). (Concurrent writes from different git-annex repository clones to the same bup repo could still have concurrency problems.) Sponsored-by: Noam Kremen on Patreon	2022-08-08 18:54:06 -04:00
Joey Hess	5bc70e2da5	When bup split fails, display its stderr It seems worth noting here that I emailed bup's author about bup split being noisy on stderr even with -q in approximately 2011. That never got fixed. Its current repo on github only accepts pull requests, not bug reports. Needing to add such complexity to deal with such a longstanding unfixed issue is not fun. Sponsored-by: Kevin Mueller on Patreon	2022-08-05 13:57:20 -04:00
Joey Hess	f94908f2a6	improve output when storing to bup bup split outputs to stderr even with -q. This was discarded when using -J, but it was still outputting when not using -J, and so was git-annex. Sponsored-by: Nicholas Golder-Manning on Patreon	2022-08-05 12:29:33 -04:00
Joey Hess	a23fd7349f	work around git segfault Work around bug in git 2.37 that causes a segfault when core.untrackedCache is set, and broke git-annex init. Depending on when git gets fixed and how widely the buggy versions are used, this could be reverted quite soon, or need to linger for a long time. It only makes git-annex init a tiny bit slower in a new repo. Sponsored-by: Max Thoursie on Patreon	2022-08-04 14:20:57 -04:00
Joey Hess	3a513cfe73	add --dry-run: New option This is intended for users who want to see what it would output in order to eg, check if a file would be added to git or the annex. It is not intended as a way for scripts to get information. Sponsored-by: Dartmouth College's Datalad project	2022-08-03 11:16:04 -04:00
Joey Hess	570b1aa6a1	Allow find --branch to be used in a bare repository, the same as the deprecated findref can be This will allow later fully deprecating and removing findref. Sponsored-by: Erik Bjäreholt on Patreon	2022-07-29 12:52:12 -04:00
Joey Hess	be19a68276	new matching options --want-get-by and --want-drop-by Sponsored-by: Graham Spencer on Patreon	2022-07-28 13:26:03 -04:00
Joey Hess	b5dc04099e	stack.yaml: Updated to lts-19.16 Last try at this broke on windows with a problem installing ghc, but I wanted to try again. Also this has a version of aws that allows using aeson 2.0, which has a potential security fix.	2022-07-26 16:04:49 -04:00
Joey Hess	d905232842	use ResourcePool for hash-object handles Avoid starting an unnecessary number of git hash-object processes when concurrency is enabled. Sponsored-by: Dartmouth College's DANDI project	2022-07-25 17:32:39 -04:00
Joey Hess	63cef2ae0b	v8 repositories automatically upgrade to v9 (And v9 later on to v10.) When v9/v10 were added, making v8 automatically upgrade was deferred "for a few months" to prevent interoperability problems if users also have an old version of git-annex. Of course that could still be the case, but there has been a good amount of time and this can't be put off forever. Allow setting annex.autoupgraderepository to false to avoid this upgrade. Previously, that only prevented upgrades from no longer supported git-annex versions, but v8 is still supported, and users may want to keep on v8 to interoperate with an old git-annex version. Sponsored-by: Boyd Stephen Smith Jr. on Patreon	2022-07-25 16:20:04 -04:00
Joey Hess	a0e788c94a	releasing package git-annex version 10.20220724	2022-07-25 14:07:20 -04:00
Joey Hess	4e88137a28	prevent appends except when annex.alwayscompact=false I would like for a new repo version to enable appends, but to do so safely would need a v11 followed by a 1 year delay followed by a v12 that does it. Since a similar v9 and v10 transition is currently happening, and is less than 6 months along in most repos, it does not feel wise to stack up another year-long transition behind that. What if I need to hurry up a new repo version for some other change? Added todo so I remember to make this change at some time when a v11 and probably v12 repo version do make sense. Sponsored-by: Dartmouth College's DANDI project	2022-07-20 13:23:55 -04:00
Joey Hess	36f0bdcd57	add annex.alwayscompact Added annex.alwayscompact setting which can be unset to speed up writes to the git-annex branch in some cases. Sponsored-by: Dartmouth College's DANDI project	2022-07-18 16:39:19 -04:00
Joey Hess	a2b1f369d1	disable journalIgnorable in enableInteractiveBranchAccess Fix a reversion that prevented --batch commands (and the assistant) from noticing data written to the journal by other commands. I have not identified which commit broke this for sure, but probably it was `aeca7c2207`. --batch commands that wrote to the journal avoided the problem since journalIgnorable gets unset on write. It's a little bit surprising that nobody noticed that query --batch commands did not see data written by other commands. Sponsored-by: Dartmouth College's DANDI project	2022-07-15 13:48:41 -04:00
Joey Hess	093ad89ead	S3: Avoid writing or checking the uuid file in the S3 bucket when importtree=yes or exporttree=yes It does not make sense for either; importing from an existing bucket should not write to it. And the user may not have write access at all. And exporting to a bucket should not write other files. Also this prevents the uuid file being imported after being written. Sponsored-by: Dartmouth College's DANDI project	2022-07-14 15:05:51 -04:00
Joey Hess	50c2cac7e7	adb: Added configuration setting oldandroid=true To avoid using find -printf, which was first supported in Android around 2019-2020. Probing seems too fragile, and execing stat once per file is too slow to do when there's a faster way available, which brought me to an option... Sponsored-by: Brett Eisenberg on Patreon	2022-07-13 18:00:47 -04:00
Joey Hess	fbc3c223a6	filter-process: Fix protocol for empty files This caused git to complain that filter-process failed and kill it with signal 15. Because it wrote an extra flushPkt for an empty file, which git did not expect, and so git saw an unexpected response to the next request. Luckily, filter-process is only used by default in v9 and up, and v8 is still the default. Also, git had to be updating an empty file, followed by another file, which is a fairly unlikely situation. And git restarts filter-process after this happens and uses it to filter the rest of the files. So this isn't a crippling bug. Sponsored-by: Luke Shumaker on Patreon	2022-07-13 17:13:54 -04:00
Joey Hess	201e41cffd	add: Fix reversion when adding an annex link that has been moved to another directory Fixes commit `f259be7f39` Sponsored-by: Dartmouth College's Datalad project	2022-07-05 16:22:41 -04:00
Joey Hess	d01530ac21	Revert "lts-19.13 (ghc 9.0.2)" This reverts commit `d2bc268317`. That seemed to break building on windows, before it starts building git-annex at all, it tried to install ghc and something blew up: Processing archive: C:\Users\runneradmin\AppData\Local\Programs\stack\x86_64-windows\ghc-9.0.2.tar.xz Extracting ghc-9.0.2.tar ... Extracted total of 11790 files from ghc-9.0.2.tar C:\Users\runneradmin\AppData\Local\Programs\stack\x86_64-windows\ghc-9.0.2-tmp-6d0fbe7f3b29e56c\ghc-9.0.2\: renameDirectory:pathIsDirectory:CreateFile "\\\\?\\C:\\Users\\runneradmin\\AppData\\Local\\Programs\\stack\\x86_64-windows\\ghc-9.0.2-tmp-6d0fbe7f3b29e56c\\ghc-9.0.2\\": does not exist (The system cannot find the file specified.) Hopefully a newer ghc version or updated stackage version will fix this at some point, in the meantime revert it.	2022-07-05 13:13:25 -04:00
Joey Hess	02ef3d6a64	fix build with assistant disabled and webapp enabled The webapp modules cannot build with the assistant disabled, so make the webapp be under the assistant build flag. Sponsored-by: Jarkko Kniivilä on Patreon	2022-06-29 14:19:18 -04:00
Joey Hess	b223988e22	remove --backend from global options --backend is no longer a global option, and is only accepted by commands that actually need it. Three commands that used to support backend but don't any longer are watch, webapp, and assistant. It would be possible to make them support it, but I doubt anyone used the option with these. And in the case of webapp and assistant, the option was handled inconsistently, only taking effect when the command is run with an existing git-annex repo, not when it creates a new one. Also, renamed GlobalOption etc to AnnexOption. Because there are many options of this type that are not actually global (any more) and get added to commands that need them. Sponsored-by: Kevin Mueller on Patreon	2022-06-29 13:33:25 -04:00
Joey Hess	21c50c0f72	fix parallel copy from/to a local git repo Improve handling of parallelization with -J when copying content from/to a git remote that is a local path. Sponsored-by: Nicholas Golder-Manning on Patreon	2022-06-29 12:40:12 -04:00
Joey Hess	d2bc268317	lts-19.13 (ghc 9.0.2)	2022-06-28 14:49:33 -04:00
Joey Hess	c1b9ea2759	The 23 never happened release. It's 24 somewhere..	2022-06-23 13:55:54 -04:00
Joey Hess	57d088e9c2	fix release version	2022-06-23 13:35:14 -04:00
Joey Hess	86968a4047	releasing package git-annex version 10.20220526	2022-06-23 13:31:36 -04:00
Joey Hess	f259be7f39	fix overwrite race with small file that got large When adding a small file, it does not get locked down, so can be modified after git-annex checks that it's small. The use of queued git add made the race window nice and wide too. Fixed by checking if the file has changed, and by not using git add. Instead, have to recapitulate git add's handling of things like symlinks and executable files. Sponsored-by: Jochen Bartl on Patreon	2022-06-14 16:38:56 -04:00
Joey Hess	78a3d44ea0	get rid of racy addLink The remaining callers all did not rely on it checking gitignore, so were easy to convert. They were susceptible to the same overwrite race as add and fix, although less likely to have it and a narrower window than add's race. Command.Rekey in passing got an unnecessary call to removeFile deleted. addSymlink handles deleting any existing worktree file.	2022-06-14 14:47:15 -04:00
Joey Hess	64c7f60f7a	fixed overwrite race with git-annex fix Similar to git-annex add, git-annex fix queued git add, so if a file got modified before git add ran, the wrong content would be staged, perhaps a large file content. Sponsored-by: Brock Spratlen on Patreon	2022-06-14 14:19:58 -04:00
Joey Hess	dd6dec4eb1	fix add overwrite race with git-annex add to annex This is not a complete fix for all such races, only the one where a large file gets changed while adding and gets added to git rather than to the annex. addLink needs to go away, any caller of it is probably subject to the same kind of race. (Also, addLink itself fails to check gitignore when symlinks are not supported.) ingestAdd no longer checks gitignore. (It didn't check it consistently before either, since there were cases where it did not run git add!) When git-annex import calls it, it's already checked gitignore itself earlier. When git-annex add calls it, it's usually on files found by withFilesNotInGit, which handles checking ignores. There was one other case, when git-annex add --batch calls it. In that case, old git-annex behaved rather badly, it would seem to add the file, but git add would later fail, leaving the file as an unstaged annex symlink. That behavior has also been fixed. Sponsored-by: Brett Eisenberg on Patreon	2022-06-14 13:37:19 -04:00
Joey Hess	6d0b243d9d	avoid cleaning up move log when drop from remote fails move: Improve resuming a move that succeeded in transferring the content, but where dropping failed due to eg a network problem, in cases where numcopies checks prevented the resumed move from dropping the object from the source repository. This was earlier done for moves that got interrupted during the drop stage. Sponsored-by: Svenne Krap on Patreon	2022-06-09 15:26:25 -04:00
Joey Hess	13fc6a9b6a	fix to use 1 chunk for empty file Fix retrieval of an empty file that is stored in a special remote with chunking enabled. The speculative chunk stuff caused a reversion by adding an empty list for the empty file. Which is just wrong; the empty file is still stored on the remote, and should be retrieved like any other file. It uses 1 chunk, so `max 1` is the simple fix. Sponsored-by: Noam Kremen on Patreon	2022-06-09 14:24:56 -04:00
Joey Hess	14584e7a38	initremote type=git probe uuid rather than matching path of an existing remote to find the uuid. The main benefit of this is that locations not using ssh:// will work now, including both paths and host:/path The other benefit is that it's a simpler interface, no need to have an existing remote with the same url and some other name. Although that will still work of course. This does rely on tryGitConfigRead working when given a Git.Repo that is not a remote. Luckily, it works fine that way. Also, tryGitConfigRead will auto-init a local repo that has a git-annex branch. I did not enable auto-init of ssh repos though. The uuid discovery actually happens twice; initremote discovers it, and uses it to store the special remote config, but does not set it in the git remote it creates. So the next run of git-annex does uuid discovery again, and caches it that time. This could be improved for a tiny speedup, but I didn't want to complicate things for that in this commit. Sponsored-by: Dartmouth College's DANDI project	2022-06-09 13:16:50 -04:00
Joey Hess	c59ea5b1ca	info: Added --autoenable option Use cases include using git-annex init --no-autoenable and then going back and enabling the special remotes that have autoenable configured. As well as just querying to remember which ones have it enabled. It lists all special remotes that have autoenable=yes whether currently enabled or not. And it can be used with --json. I pondered making this "git-annex info autoenable", but that seemed wrong because then if the user has a directory named "autoenable", it's unclear what they are asking for. (Although "git-annex info remote" may be similarly unclear.) Making it an option does mean that it can't be provided via --batch though. Sponsored-by: Dartmouth College's Datalad project	2022-06-01 14:20:38 -04:00
Joey Hess	0d50c90794	init: Added --no-autoenable option Someone may disagree with what repositories are set to autoenable and it's good to have local overrides. See https://github.com/datalad/datalad/issues/6634 Sponsored-by: Dartmouth College's Datalad project	2022-06-01 13:27:49 -04:00
Joey Hess	b60d85c4c0	releasing package git-annex version 10.20220525	2022-05-25 14:01:31 -04:00
Joey Hess	85f9193167	fix git-annex test -p test: When limiting tests to run with -p, work around tasty limitation by automatically including dependent tests. This fixes a reversion because it didn't used to use dependencies and forced tasty to run the init tests first. That changed when parallelizing the test suite. It will sometimes do a little more work than strictly required, because it adds init tests deps when limited to eg quickcheck tests, which don't depend on them. But this only adds a few seconds work. Sponsored-by: Dartmouth College's Datalad project	2022-05-23 14:24:54 -04:00
Joey Hess	af0d854460	deal with git's changes for CVE-2022-24765 Deal with git's recent changes to fix CVE-2022-24765, which prevent using git in a repository owned by someone else. That makes git config --list not list the repo's configs, only global configs. So annex.uuid and annex.version are not visible to git-annex. It displayed a message about that, which is not right for this situation. Detect the situation and display a better message, similar to the one other git commands display. Also, git-annex init when run in that situation would overwrite annex.uuid with a new one, since it couldn't see the old one. Add a check to prevent it running too in this situation. It may be that this fix has security implications, if a config set by the malicious user who owns the repo causes git or git-annex to run code. I don't think any git-annex configs get run by git-annex init. It may be that some git config of a command does get run by one of the git commands that git-annex init runs. ("git status" is the command that prompted the CVE-2022-24765, since core.fsmonitor can cause it to run a command). Since I don't know how to exploit this, I'm not treating it as a security fix for now. Note that passing --git-dir makes git bypass the security check. git-annex does pass --git-dir to most calls to git, which it does to avoid needing chdir to the directory containing a git repository when accessing a remote. So, it's possible that somewhere in git-annex it gets as far as running git with --git-dir, and git reads some configs that are unsafe (what CVE-2022-24765 is about). This seems unlikely, it would have to be part of git-annex that runs in git repositories that have no (visible) annex.uuid, and git-annex init is the only one that I can think of that then goes on to run git, as discussed earlier. But I've not fully ruled out there being others.. 
The git developers seem mostly worried about "git status" or a similar command implicitly run by a shell prompt, not an explicit use of git in such a repository. For example, Ævar Arnfjörð Bjarmason wrote: > * There are other bits of config that also point to executable things, > e.g. core.editor, aliases etc, but nothing has been found yet that > provides the "at a distance" effect that the core.fsmonitor vector > does. > > I.e. a user is unlikely to go to /tmp/some-crap/here and run "git > commit", but they (or their shell prompt) might run "git status", and > if you have a /tmp/.git ... Sponsored-by: Jarkko Kniivilä on Patreon	2022-05-20 14:38:27 -04:00
Joey Hess	aa414d97c9	make fsck normalize object locations The purpose of this is to fix situations where the annex object file is stored in a directory structure other than where annex symlinks point to. But it will also move object files from the hashdirmixed back to hashdirlower if the repo configuration makes that the normal location. It would have been more work to avoid that than to let it do it. Sponsored-by: Dartmouth College's Datalad project	2022-05-16 15:38:06 -04:00
Joey Hess	54809e9eb3	fix untrustworthiness of import/export remotes Commit `36133f27c0` had a boolean flip in it, aaargh. Special remotes with importtree=yes or exporttree=yes are once again treated as untrusted, since files stored in them can be deleted or modified at any time. Sponsored-by: Kevin Mueller on Patreon	2022-05-09 15:53:23 -04:00
Joey Hess	e8a601aa24	incremental verification for retrieval from import remotes Sponsored-by: Dartmouth College's Datalad project	2022-05-09 15:39:43 -04:00
Joey Hess	d1cce869ed	implement dataUnits finally Added support for "megabit" and related bandwidth units in annex.stalldetection and everywhere else that git-annex parses data units. Note that the short form is "Mbit" not "Mb" because that differs from "MB" only in case, and git-annex parses units case-insensitively. It would be horrible if two different versions of git-annex parsed the same value differently, so I don't think "Mb" can be supported. See comment for bonus sad story from my childhood. Sponsored-by: Nicholas Golder-Manning	2022-05-05 15:25:11 -04:00
Joey Hess	4e4c44ed8e	hah, I mean 0504 of course	2022-05-04 11:47:40 -04:00
Joey Hess	cb0e89bf77	releasing package git-annex version 10.20220404	2022-05-04 11:46:56 -04:00
Joey Hess	0406c33f58	fix git-annex repair false positive Avoid treating refs/annex/last-index or other refs that are not commit objects as evidence of repository corruption. The repair code checks to find bad refs by trying to run `git log` on them, and assumes that no output means something is broken. But git log on a tree object is empty. This was worth fixing generally, not as a special case, since it's certainly possible that other things store tree or other objects in refs. Sponsored-by: Max Thoursie on Patreon	2022-05-04 11:32:23 -04:00
Joey Hess	43701759a3	disable shellescape for rsync 3.2.4 rsync 3.2.4 broke backwards-compatibility by preventing exposing filenames to the shell. Made the rsync and gcrypt special remotes detect this and disable shellescape. An alternative fix would have been to always set RSYNC_OLD_ARGS=1. Which would avoid the overhead of probing rsync --help for each affected remote. But that is really very fast to run, and it seemed better to switch to the modern code path rather than keeping on using the bad old code path. Sponsored-by: Tobias Ammann on Patreon	2022-05-03 12:12:41 -04:00
Joey Hess	280d41b58f	Fix a build failure with ghc 9.2.2 Thanks, gnezdo for the patch.	2022-05-02 14:21:48 -04:00
Joey Hess	17b20a2450	Fix test failure on NFS when cleaning up gpg temp directory Using removePathForcibly avoids concurrent removal problems. The i386ancient build still uses an old version of ghc and directory that do not include removePathForcibly though. Sponsored-by: Dartmouth College's Datalad project	2022-04-19 13:33:33 -04:00
Joey Hess	fd65de0eb9	multicast: Support uftp 5.0 by switching from aes256-cbc to aes256-gcm aes256-gcm is supported by both 4.x and 5.x, while 5.x dropped aes256-cbc. Sponsored-by: Graham Spencer on Patreon	2022-04-19 12:02:10 -04:00
Joey Hess	ff6b36c706	assistant prompt pushing of manual commits to remotes assistant: When annex.autocommit is set, notice commits that the user makes manually, and push them out to remotes promptly. Sponsored-by: Boyd Stephen Smith Jr. on Patreon	2022-03-31 13:02:16 -04:00
Joey Hess	d266a41f8d	prevent numcopies or mincopies being configured to 0 Ignore annex.numcopies set to 0 in gitattributes or git config, or by git-annex numcopies or by --numcopies, since that configuration would make git-annex easily lose data. Same for mincopies. This is a continuation of the work to make data only be able to be lost when --force is used. It earlier led to the --trust option being disabled, and similar reasoning applies here. Most numcopies configs had docs that strongly discouraged setting it to 0 anyway. And I can't imagine a use case for setting to 0. Not that there might not be one, but it's just so far from the intended use case of git-annex, of managing and storing your data, that it does not seem like it makes sense to cater to such a hypothetical use case, where any git-annex drop can lose your data at any time. Using a smart constructor makes sure every place avoids 0. Note that this does mean that NumCopies is for the configured desired values, and not the actual existing number of copies, which of course can be 0. The name configuredNumCopies is used to make that clear. Sponsored-by: Brock Spratlen on Patreon	2022-03-28 15:20:34 -04:00
Joey Hess	959beeea9f	releasing package git-annex version 10.20220322	2022-03-22 13:56:45 -04:00
Joey Hess	a460aa8b70	Removed the NetworkBSD build flag Debian stable and the i386ancient build both have a new enough network to not need this flag any longer. Sponsored-by: Svenne Krap on Patreon	2022-03-22 11:52:52 -04:00
Joey Hess	982eb7ed0d	remove vendored http-client-restricted Removed vendored copy of http-client-restricted, and removed the HttpClientRestricted build flag that avoided that dependency. http-client-restricted is in Debian stable, and the i386ancient build also uses it, so I think this vendored copy is no longer needed. Sponsored-by: Noam Kremen on Patreon	2022-03-22 11:50:06 -04:00
Joey Hess	42b6f24e67	reorder	2022-03-21 16:02:24 -04:00
Joey Hess	6079b0c72c	fix reversion add: Avoid unnecessarily converting a newly unlocked file to be stored in git when it is not modified, even when annex.largefiles does not match it. This fixes a reversion in version 10.20220222, where git-annex unlock followed by git-annex add, followed by git commit file could result in git thinking the file was modified after the commit. I do have half a mind to remove the withUnmodifiedUnlockedPointers part of git-annex add. It seems weird, despite that old bug report arguing a case of consistency that it ought to behave that way. When git-annex add surprises me, it seems likely it's wrong.. But for now, this is the smallest possible fix. Sponsored-by: Dartmouth College's Datalad project	2022-03-21 15:54:04 -04:00
Joey Hess	3e2f1f73cb	add back inode to directory special remote ContentIdentifier Directory special remotes with importtree=yes have changed to once more take inodes into account. This will cause extra work when importing from a directory on a FAT filesystem that changes inodes on every mount. To avoid that extra work, set ignoreinodes=yes when initializing a new directory special remote, or change the configuration of your existing remote: git-annex enableremote foo ignoreinodes=yes This will mean a one-time re-import of all contents from every directory special remote due to the changed setting. `73df633a62` thought it was too unlikely that there would be modifications that the inode number was needed to notice. That was probably right; it's very unlikely that a file will get modified and end up with the same size and mtime as before. But, what was not considered is that a program like NextCloud might write two files with different content so closely together that they share the mtime. The inode is necessary to detect that situation. Sponsored-by: Max Thoursie on Patreon	2022-03-21 13:12:02 -04:00
Joey Hess	025c18128b	test: Added --jobs option Default to the number of CPU cores, which seems about optimal on my laptop. Using one more saves me 2 seconds actually. Better packing of workers improves speed significantly. In 2 test runs, I saw segfaulting workers despite my attempt to work around that issue. So detect when a worker does, and re-run it. Removed installSignalHandlers again, because I was seeing an error "lost signal due to full pipe", which I guess was somehow caused by using it. Sponsored-by: Dartmouth College's Datalad project	2022-03-16 14:42:07 -04:00
Joey Hess	b1934cc794	changelog	2022-03-02 18:27:20 -04:00
Joey Hess	2fc46e1871	git-annex test from standalone speedup Avoid git-annex test being very slow when run from within the standalone linux tarball or OSX app. It may not really be necessary to add to PATH the directory where the git-annex binary resides, but it can't hurt. Most places where the test suite or git-annex run git-annex, they use programPath, so won't need a modified PATH. But I'm not sure if that's always the case. Sponsored-by: Dartmouth College's Datalad project	2022-03-01 16:08:55 -04:00

1774 commits