git-annex

Author	SHA1	Message	Date
Joey Hess	c1cd402081	make storeKey throw exceptions When storing content on remote fails, always display a reason why. Since the Storer used by special remotes already did, this mostly affects git remotes, but not entirely. For example, if git-lfs failed to connect to the endpoint, it used to silently return False.	2020-05-13 14:03:00 -04:00
Joey Hess	5f5170b22b	remove SafeFilePath Move sanitizeFilePath call to where fromSafeFilePath had been.	2020-05-11 14:04:56 -04:00
Joey Hess	cabbc91b18	addurl, importfeed: Allow '-' in filenames, as long as it's not the first character	2020-05-11 13:50:49 -04:00
Joey Hess	6952060665	addurl --preserve-filename and a few related changes * addurl --preserve-filename: New option, uses server-provided filename without any sanitization, but with some security checking. Not yet implemented for remotes other than the web. * addurl, importfeed: Avoid adding filenames with leading '.', instead it will be replaced with '_'. This might be considered a security fix, but a CVE seems unwarranted. It was possible for addurl to create a dotfile, which could change behavior of some program. It was also possible for a web server to say the file name was ".git" or "foo/.git". That would not overwrite the .git directory, but would cause addurl to fail; of course git won't add "foo/.git". sanitizeFilePath is too opinionated to remain in Utility, so moved it. The changes to mkSafeFilePath are because it used sanitizeFilePath. In particular: isDrive will never succeed, because "c:" gets munged to "c_"; ".." gets sanitized now; ".git" gets sanitized now. It will never be null, because sanitizeFilePath keeps the length the same, and splitDirectories never returns a null path. Also, on the off chance a web server suggests a filename of "", ignore that, rather than trying to save to such a filename, which would fail in some way.	2020-05-08 16:22:55 -04:00
Joey Hess	19b5137227	addurl --fast error message improvement addurl: When run with --fast on an url that annex.security.allowed-ip-addresses prevents accessing, display a more useful message. (Also importfeed --fast potentially.)	2020-04-27 13:48:14 -04:00
Joey Hess	04352ed9c5	check-ignore resource pool Much like check-attr before.	2020-04-21 11:25:28 -04:00
Joey Hess	45fb7af21c	check-attr resource pool Limited to min of -JN or number of CPU cores, because it will often be CPU bound, once it's read the gitignore file for a directory. In some situations it's more disk bound, but in any case it's unlikely to be the main bottleneck that -J is used to avoid. Eg, when dropping, this is used for numcopies checks, but the main bottleneck will be accessing the remotes to verify presence. So the user might decide to -J32 that, but having 32 check-attr processes would just waste however many filehandles they open, and probably worsen their performance due to CPU contention. Note that I first tried just letting up to the -JN be started. However, even when it's no bottleneck at all, that still results in all of them being started. Why? Well, all the worker threads start up nearly simultaneously, so there's a thundering herd.	2020-04-21 11:05:57 -04:00
Joey Hess	cee6b344b4	cat-file resource pool Avoid running a large number of git cat-file child processes when run with a large -J value. This implementation takes care to avoid adding any overhead to git-annex when run without -J. When run with -J, there is a small bit of added overhead, to manipulate the resource pool. That optimisation added a fair bit of complexity.	2020-04-20 15:19:31 -04:00
Joey Hess	fe9cf1256e	move remoteList into dupState This does mean that RemoteDaemon.Transport.Tor's call runs it, otherwise no change, but this is groundwork for doing more such expensive actions in dupState.	2020-04-17 14:36:45 -04:00
Joey Hess	a7840c0e04	improve programPath Fixes a failure mode where git-annex sync would try to run git-annex and complain that it failed to find it in ~/.config/git-annex/program or PATH, when there was a git-annex in /usr/bin/, but the original one was run from elsewhere (eg, ~/bin) and happened not to be present any longer. Now, it will fall back to using git-annex from PATH in such a case. Which might fail due to some version incompatibility, but still better than a misleading error message. Also made readProgramFile only read the file, not look for git-annex in PATH as a fallback. That fallback may have confused Assistant.Upgrade, which really wants the value from the file.	2020-04-15 16:46:34 -04:00
Joey Hess	43a9808292	disable journal read optimisation when alwayscommit=false The journal read optimisation in `aeca7c220` later got fixed in `eedd73b84` to stage and commit any files that were left in the journal by a previous git-annex run. That's necessary for the optimisation to work correctly. But it also meant that alwayscommit=false started committing the previous git-annex process's journalled changes, which defeated the purpose of the config setting entirely. So, disable the optimisation when alwayscommit=false, leaving the files in the journal and not committing them. See my comments on the bug report for why this seemed the best approach. Also fixes a problem when annex.merge-annex-branches=false and there are changes in the journal. That config indirectly prevents committing the journal. (Which seems a bit odd given its name, but it always has.) So, when there were changes in the journal, perhaps left there due to alwayscommit=false being set before, the optimisation would prevent git-annex from reading the journal files, and it would operate with out of date information.	2020-04-15 13:24:33 -04:00
Joey Hess	5a62e8132d	When parsing git configs, support all the documented ways to write true and false, including "yes", "on", "1", etc. This change does impact git-annex config eg "git annex config --set annex.addunlocked on" will store "on" and new git-annex will understand that value, while old git-annex will error: git-annex: bad annex.addunlocked configuration in git annex config: Parse failure: near "on" That seems acceptable. Not special remote configs that are only documented as =true or =false however. Having git-annex support other values for those would break backwards compatibility when used with old versions of git-annex. And older versions ignore invalid special remote configs. That would not be a good combination.	2020-04-13 14:05:30 -04:00
Joey Hess	ca9c6c5f60	Fix a potential failure to parse git config Git has an obnoxious special case in git config, a line "foo" is the same as "foo = true". That means there is no way to examine the output of git config and tell if it was run with --null or not, since a "foo" in the first line could be such a boolean, or could be followed by its value on the next line if --null were used. So, rather than trying to do such a detection, track the style of config at all the points where it's generated.	2020-04-13 13:05:41 -04:00
Joey Hess	eedd73b846	fix reversion caused by earlier optimisation to git-annex branch reads `aeca7c2207` was predicated on the assumption that updateTo would stage any journal files, but in one case it did not actually do so. The test suite happened to expose the bug.	2020-04-10 15:25:22 -04:00
Joey Hess	2caf579718	cache annex index filename for 1.5% speedup to queries	2020-04-10 13:37:04 -04:00
Joey Hess	aeca7c2207	Sped up query commands that read the git-annex branch by around 5% The only price paid is one additional MVar read per write to the journal. Presumably writing a journal file dominates over a MVar read time by several orders of magnitude. --batch does not get the speedup because then it needs to notice when another process has made a change. Also made the assistant and other daemon modes bypass the optimisation, which would not help them anyway.	2020-04-09 13:54:43 -04:00
Joey Hess	c0cd07c36b	Ref ByteString conversion done Test suite passes.	2020-04-07 17:41:09 -04:00
Joey Hess	6c81e0c8f1	ByteString Ref continued Several nice speed wins I think. At 340/633 files converted.	2020-04-07 13:27:11 -04:00
Joey Hess	87d5583a91	use programPath consistently, not readProgramFile Improve git-annex's ability to find the path to its program, especially when it needs to run itself in another repo to upgrade it. Some parts of the code used readProgramFile, probably because I forgot that programPath exists. I noticed this when a git-annex auto-upgrade failed because it was running git-annex upgrade --autoonly, but the code to run git-annex used readProgramFile, which happened to point to an older build of git-annex.	2020-03-30 16:06:27 -04:00
Joey Hess	f6d19b18f6	remove unused imports	2020-03-30 12:11:52 -04:00
Joey Hess	0e4d80d5c1	remove pre-commit hook This was originally added so that unannex could prevent the hook from running while files were in a state that the hook would interpret as old-style unlocked and so would lock. Now that's gone, so the only thing the hook was preventing was two pre-commit processes running simultaneously. But such concurrency is normal in git-annex and should not be a problem. Does mean that .git/hooks/pre-commit-annex might run more concurrently, that seems the only risk of it causing any problems.	2020-03-30 11:54:04 -04:00
Joey Hess	2e6e8aa60a	fix windows build some more	2020-03-20 11:47:09 -04:00
Joey Hess	d930a2035c	Avoid converting .git file in a worktree or submodule to a symlink when the repository is not a git-annex repository. This means it will still be a .git file when git-annex init runs. That's ok, the repo probably contains no annexed objects yet, and even if it does, git-annex init does not care if symlinks in the worktree don't point to the objects. I made init, at the end, run the conversion code. Not really necessary because the next git-annex command could do it just as well. But, this avoids commands that don't normally write to the repo needing to write to it, which might avoid some problem or other, and seems worth avoiding generally.	2020-03-09 14:54:14 -04:00
Joey Hess	c0a981cb0e	update comment	2020-03-09 14:31:28 -04:00
Joey Hess	093fde5abd	completed the createDirectoryIfMissing conversion Remaining calls in the assistant and Annex.Ssh have been audited and are ok.	2020-03-06 12:55:03 -04:00
Joey Hess	2f204b5d37	refactor	2020-03-06 11:43:07 -04:00
Joey Hess	eaa49ab53d	convert replaceFile to createDirectoryUnder Since it was used on both worktree and .git/annex files, split into multiple functions. In passing, this also improves permissions of created directories in .git/annex, using createAnnexDirectory on those.	2020-03-06 11:31:01 -04:00
Joey Hess	6d58ca94d6	some easy createDirectoryUnder conversions	2020-03-05 15:20:10 -04:00
Joey Hess	ebbc5004fa	convert createAnnexDirectory to use createDirectoryUnder It will create foo/.git/annex/, but not foo/.git/ and not foo/. This will avoid it creating an empty path to a repo when a drive is yanked out and the mount point goes away, for example.	2020-03-05 14:33:04 -04:00
Joey Hess	ccd8c43dc8	git-annex config: guard against non-repo-global configs git-annex config: Only allow setting configs that git-annex actually supports reading from repo-global config, to avoid confused users trying to set other configs with this.	2020-03-02 15:54:18 -04:00
Joey Hess	c78b9b55b6	rename changeGitConfig to overrideGitConfig and avoid unnecessary calls It's important that it be clear that it overrides a config, such that reloading the git config won't change it, and in particular, setConfig won't change it. Most of the calls to changeGitConfig were actually after setConfig, which was redundant and unnecessary. So removed those. The only remaining one, besides --debug, is in the handling of repository-global config values. That one's ok, because the way mergeGitConfig is implemented, it does not override any value that is set in git config. If a value with a repo-global setting was passed to setConfig, it would set it in the git config, reload the git config, re-apply mergeGitConfig, and use the newly set value, which is the right thing.	2020-02-27 01:11:53 -04:00
Joey Hess	81e3faf810	Merge branch 'v7'	2020-02-26 18:15:18 -04:00
Joey Hess	8af6d2c3c5	fix encryption of content to gcrypt and git-lfs Fix serious regression in gcrypt and encrypted git-lfs remotes. Since version 7.20200202.7, git-annex incorrectly stored content on those remotes without encrypting it. Problem was, Remote.Git enumerates all git remotes, including git-lfs and gcrypt. It then dispatches to those. So, Remote.List used the RemoteConfigParser from Remote.Git, instead of from git-lfs or gcrypt, and that parser does not know about encryption fields, so did not include them in the ParsedRemoteConfig. (Also didn't include other fields specific to those remotes, perhaps chunking etc also didn't get through.) To fix, had to move RemoteConfig parsing down into the generate methods of each remote, rather than doing it in Remote.List. And a consequence of that was that ParsedRemoteConfig had to change to include the RemoteConfig that got parsed, so that testremote can generate a new remote based on an existing remote. (I would have rather fixed this just inside Remote.Git, but that was not practical, at least not w/o re-doing work that Remote.List already did. Big ugly mostly mechanical patch seemed preferable to making git-annex slower.)	2020-02-26 18:05:36 -04:00
Joey Hess	9659f1c30f	annex.security.allowed-ip-addresses ports syntax Extended annex.security.allowed-ip-addresses to allow specific ports of an IP address to be used, while denying use of other ports.	2020-02-25 15:45:52 -04:00
Joey Hess	1bb32098d6	jump right to v8, don't stop part way * init --version: When the version given is one that automatically upgrades to a newer version, use the newer version instead. * Auto upgrades from older repo versions, like v5, now jump right to v8.	2020-02-24 13:21:00 -04:00
Joey Hess	c31e1be781	convert KeySource to RawFilePath	2020-02-21 10:04:44 -04:00
Joey Hess	029c883713	Merge branch 'master' into v8	2020-02-19 14:32:11 -04:00
Joey Hess	69f2d1dd43	remoteConfig rework remoteAnnexConfig will avoid bugs like `a3a674d15b` Use now more generic remoteConfig in a couple places that built non-annex config settings manually before.	2020-02-19 13:45:11 -04:00
Joey Hess	ae4177d456	fix warning	2020-02-17 15:06:28 -04:00
Joey Hess	da9945c013	silence build warning	2020-02-14 19:38:50 -04:00
Joey Hess	879f52a116	annex.tune.branchhash1=true bugfix Fix support for repositories tuned with annex.tune.branchhash1=true, including --all not working and git-annex log not displaying anything for annexed files.	2020-02-14 15:22:48 -04:00
Joey Hess	a490947068	annex.sshcaching warning improvement and allow overriding build time default * When git-annex is built with a ssh that does not support ssh connection caching, default annex.sshcaching to false, but let the user override it. * Improve warning messages further when ssh connection caching cannot be used, to clearly state why.	2020-02-14 14:21:03 -04:00
Joey Hess	5c3636037b	Display a warning when concurrency is enabled but ssh connection caching is not enabled or won't work due to a crippled filesystem A warning message is unsatisfying. But erroring out is too hard a failure, especially since it may well work fine if the user has enabled passwordless ssh. I did think about falling back to one ssh connection at a time in this case, but it would have needed a rework of every ssh call, which seems far overboard for such a niche problem. There's no single place where git-annex runs ssh, so no one place that it could block a concurrent call on a semaphore. And, even if it did fall back to one ssh connection at a time, it seems to me that doing so without warning the user about the problem just invites bug reports like "git-annex is ignoring my -J2 and only doing one download at a time". So a warning is needed, and I suppose is good enough.	2020-01-23 12:35:46 -04:00
Joey Hess	6f90bb7738	handle git-credential prompt in -J mode If git-credential has it cached and does not prompt, this will unfortunately result in a brief flicker, as the displayed console regions are hidden while running it and then re-displayed. Better than a corrupted display. Actually, I tried it and don't see a visible flicker, so probably only over a slow ssh will it be apparent.	2020-01-22 16:42:15 -04:00
Joey Hess	1883f7ef8f	support git remotes that need http basic auth using git credential to get the password One thing this doesn't do is wrap the password prompting inside the prompt action. So with -J, the output can be a bit garbled.	2020-01-22 16:16:19 -04:00
Joey Hess	2be4122bfc	include passthrough params in --describe-other-params	2020-01-20 16:53:27 -04:00
Joey Hess	aa949bbb7d	initremote --describe-other-params Does not yet include descriptions from external special remote programs.	2020-01-20 16:05:51 -04:00
Joey Hess	7038acf96c	add descriptions for all remote config fields not yet used	2020-01-20 15:20:04 -04:00
Joey Hess	923230ea30	convert RemoteConfigFieldParser to data type	2020-01-20 13:49:30 -04:00
Joey Hess	8b9b90c74a	bugfixes getRemoteConfigPassedThrough was never returning anything, Typeable prevented the type checker from noticing a dumb mistake. parseRemoteConfig was not adding Accepted values as PassedThrough	2020-01-17 17:09:56 -04:00
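
Commit 5a62e8132d above mentions accepting all the documented ways to write true and false in git configs. As a rough illustration only, here is a minimal Python sketch of that kind of parsing. It is not git-annex's Haskell implementation: the function name `parse_git_bool` is hypothetical, and the literal sets are taken from the git-config documentation (including the empty string as false), not from the commit itself.

```python
from typing import Optional

# Boolean literals as listed in the git-config documentation.
# Matching is case-insensitive. (Assumption: only these literals are handled.)
TRUE_LITERALS = {"yes", "on", "true", "1"}
FALSE_LITERALS = {"no", "off", "false", "0", ""}


def parse_git_bool(value: str) -> Optional[bool]:
    """Hypothetical helper: map a config value to True/False, or None if invalid."""
    normalized = value.lower()
    if normalized in TRUE_LITERALS:
        return True
    if normalized in FALSE_LITERALS:
        return False
    # Not a documented boolean literal; the caller decides how to report it.
    return None
```

Returning `None` for unrecognised values leaves the error handling to the caller, which loosely mirrors the commit's note that an older git-annex reports a parse failure for a value such as "on".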

1426 commits