git-annex

Author	SHA1	Message	Date
Joey Hess	d0b06c17c0	Added --no-check-gitignore option for finer grained control than using --force. add, addurl, importfeed, import: Added --no-check-gitignore option for finer grained control than using --force. (--force is used for too many different things, and at least one of these also uses it for something else. I would like to reduce --force's footprint until it only forces drops or a few other data losses. For now, --force still disables checking ignores too.) addunused: Don't check .gitignores when adding files. This is a behavior change, but I justify it by analogy with git add of a gitignored file adding it, asking to add all unused files back should add them all back, not skip some. The old behavior was surprising. In Command.Lock and Command.ReKey, CheckGitIgnore False does not change behavior, it only makes explicit what is done. Since these commands are run on annexed files, the file is already checked into git, so git add won't check ignores.	2020-09-18 13:19:13 -04:00
Joey Hess	3a05d53761	add SeekInput (not yet used) No behavior changes (hopefully), just adding SeekInput and plumbing it through to the JSON display code for later use. Over the course of 2 grueling days. withFilesNotInGit reimplemented in terms of seekHelper should be the only possible behavior change. It seems to test as behaving the same. Note that seekHelper dummies up the SeekInput in the case where segmentPaths' gives up on sorting the expanded paths because there are too many input paths. When SeekInput later gets exposed as a json field, that will result in it being a little bit wrong in the case where 100 or more paths are passed to a git-annex command. I think this is a subtle enough problem to not matter. If it does turn out to be a problem, fixing it would require splitting up the input parameters into groups of < 100, which would make git ls-files run perhaps more than is necessary. May want to revisit this, because that fix seems fairly low-impact.	2020-09-15 15:41:13 -04:00
Joey Hess	957a87b437	fix absolute filenames fed into --batch and git-annex info	2020-04-15 16:04:05 -04:00
Joey Hess	eaa49ab53d	convert replaceFile to createDirectoryUnder Since it was used on both worktree and .git/annex files, split into multiple functions. In passing, this also improves permissions of created directories in .git/annex, using createAnnexDirectory on those.	2020-03-06 11:31:01 -04:00
Joey Hess	c19211774f	use filepath-bytestring for annex object manipulations git-annex find is now RawFilePath end to end, no string conversions. So is git-annex get when it does not need to get anything. So this is a major milestone on optimisation. Benchmarks indicate around 30% speedup in both commands. Probably many other performance improvements. All or nearly all places where a file is statted use RawFilePath now.	2019-12-11 15:25:07 -04:00
Joey Hess	bdec7fed9c	convert TopFilePath to use RawFilePath Adds a dependency on filepath-bytestring, an as yet unreleased fork of filepath that operates on RawFilePath. Git.Repo also changed to use RawFilePath for the path to the repo. This does eliminate some RawFilePath -> FilePath -> RawFilePath conversions. And filepath-bytestring's </> is probably faster. But I don't expect a major performance improvement from this. This is mostly groundwork for making Annex.Location use RawFilePath, which will allow for a conversion-free pipleline.	2019-12-09 15:07:21 -04:00
Joey Hess	5f391179f1	use RawFilePath getFileStatus for speed Only done on those calls to getFileStatus that had a RawFilePath, not a FilePath. The others would probably be just as fast if converted to use it with toRawFilePath, but I'm not 100% sure. Note that genInodeCache' uses fromRawFilePath, but that value only gets used on Windows, so on unix the thunk will never be evaluated.	2019-12-06 14:44:42 -04:00
Joey Hess	3c7fd09ec8	get many more commands building again about half are building now	2019-12-05 11:40:10 -04:00
Joey Hess	689d1fcc92	remove most remnants of direct mode A few remain, as needed for upgrades, and for accessing objects from remotes that are direct mode repos that have not been converted yet.	2019-08-26 16:27:48 -04:00
Joey Hess	436f107715	make CommandStart return a StartMessage The goal is to be able to run CommandStart in the main thread when -J is used, rather than unncessarily passing it off to a worker thread, which incurs overhead that is signficant when the CommandStart is going to quickly decide to stop. To do that, the message it displays needs to be displayed in the worker thread, after the CommandStart has run. Also, the change will mean that CommandStart will no longer necessarily run with the same Annex state as CommandPerform. While its docs already said it should avoid modifying Annex state, I audited all the CommandStart code as part of the conversion. (Note that CommandSeek already sometimes runs with a different Annex state, and that has not been a source of any problems, so I am not too worried that this change will lead to breakage going forward.) The only modification of Annex state I found was it calling allowMessages in some Commands that default to noMessages. Dealt with that by adding a startCustomOutput and a startingUsualMessages. This lets a command start with noMessages and then select the output it wants for each CommandStart. One bit of breakage: onlyActionOn has been removed from commands that used it. The plan is that, since a StartMessage contains an ActionItem, when a Key can be extracted from that, the parallel job runner can run onlyActionOn' automatically. Then commands won't need to worry about this detail. Future work. Otherwise, this was a fairly straightforward process of making each CommandStart compile again. Hopefully other behavior changes were mostly avoided. In a few cases, a command had a CommandStart that called a CommandPerform that then called showStart multiple times. I have collapsed those down to a single start action. The main command to perhaps suffer from it is Command.Direct, which used to show a start for each file, and no longer does. Another minor behavior change is that some commands used showStart before, but had an associated file and a Key available, so were changed to ShowStart with an ActionItemAssociatedFile. That will not change the normal output or behavior, but --json output will now include the key. This should not break it for anyone using a real json parser.	2019-06-06 17:13:54 -04:00
Joey Hess	40ecf58d4b	update licenses from GPL to AGPL This does not change the overall license of the git-annex program, which was already AGPL due to a number of sources files being AGPL already. Legally speaking, I'm adding a new license under which these files are now available; I already released their current contents under the GPL license. Now they're dual licensed GPL and AGPL. However, I intend for all my future changes to these files to only be released under the AGPL license, and I won't be tracking the dual licensing status, so I'm simply changing the license statement to say it's AGPL. (In some cases, others wrote parts of the code of a file and released it under the GPL; but in all cases I have contributed a significant portion of the code in each file and it's that code that is getting the AGPL license; the GPL license of other contributors allows combining with AGPL code.)	2019-03-13 15:48:14 -04:00
Joey Hess	303e828b7c	rest of the deserializeKey renameing	2019-01-14 13:17:47 -04:00
Joey Hess	558520d27a	fix rekey/migrate bookkeeping in v6 After `220317df5a` the test suite still detected a problem; migrate of an unlocked file replaced it with a pointer file rather than a file with the content. This was a bookeeping problem; the worktree file was being copied to the object file and the inode cache updated, but if that database write didn't get flushed in time, later checks would think the content was not present. Fixed by copying the object file to the worktree file instead, which avoids needing to update the inode cache. Also, only copy when there's a hard link to break, not always. This commit was sponsored by Brock Spratlen on Patreon.	2018-10-16 17:18:21 -04:00
Joey Hess	220317df5a	v6: fix migrate of unlocked file After commit `b2bafdb2fc` the test suite threw up a failure migrating unlocked files. I'm not clear how that commit broke it (presumably by inAnnex reporting the right information now), but the actual problem is plain: The inodecache for the worktree file is generated, but then the file is replaced with a copy (unncessarily unless annex.link is set, but the code always does so) and so linkToAnnex/linkAnnex then fails because it notices the inode cache is not valid. This commit was sponsored by Jake Vosloo on Patreon.	2018-10-16 16:45:08 -04:00
Joey Hess	bdf6783b92	improve error message	2018-10-16 15:52:40 -04:00
Joey Hess	53526136e8	move commandAction out of CmdLine.Seek This is groundwork for nested seek loops, eg seeking over all files and then performing commandActions on a list of remotes, which can be done concurrently. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2018-10-01 14:12:06 -04:00
Joey Hess	1d1054faa6	added -z Added -z option to git-annex commands that use --batch, useful for supporting filenames containing newlines. It only controls input to --batch, the output will still be line delimited unless --json or etc is used to get some other output. While git often makes -z affect both input and output, I don't like trying them together, and making it affect output would have been a significant complication, and also git-annex output is generally not intended to be machine parsed, unless using --json or a format option. Commands that take pairs like "file key" still separate them with a space in --batch mode. All such commands take care to support filenames with spaces when parsing that, so there was no need to change it, and it would have needed significant changes to the batch machinery to separate tose with a null. To make fromkey and registerurl support -z, I had to give them a --batch option. The implicit batch mode they enter when not provided with input parameters does not support -z as that would have complicated option parsing. Seemed better to move these toward using the same --batch as everything else, though the implicit batch mode can still be used. This commit was sponsored by Ole-Morten Duesund on Patreon.	2018-09-20 16:11:47 -04:00
Joey Hess	fcff64f8bb	optimisation: avoid stat call This commit was sponsored by Paul Walmsley on Patreon.	2018-09-05 17:26:12 -04:00
Joey Hess	b657242f5d	enforce retrievalSecurityPolicy Leveraged the existing verification code by making it also check the retrievalSecurityPolicy. Also, prevented getViaTmp from running the download action at all when the retrievalSecurityPolicy is going to prevent verifying and so storing it. Added annex.security.allow-unverified-downloads. A per-remote version would be nice to have too, but would need more plumbing, so KISS. (Bill the Cat reference not too over the top I hope. The point is to make this something the user reads the documentation for before using.) A few calls to verifyKeyContent and getViaTmp, that don't involve downloads from remotes, have RetrievalAllKeysSecure hard-coded. It was also hard-coded for P2P.Annex and Command.RecvKey, to match the values of the corresponding remotes. A few things use retrieveKeyFile/retrieveKeyFileCheap without going through getViaTmp. * Command.Fsck when downloading content from a remote to verify it. That content does not get into the annex, so this is ok. * Command.AddUrl when using a remote to download an url; this is new content being added, so this is ok. This commit was sponsored by Fernando Jimenez on Patreon.	2018-06-21 13:37:01 -04:00
Joey Hess	31e1adc005	deal with unlocked files P2P protocol version 1 adds VALID\|INVALID after DATA; INVALID means the file was detected to change content while it was being sent and so we may not have received the valid content of the file. Added new MustVerify constructor for Verification, which forces verification even when annex.verify=false etc. This is used when INVALID and in protocol version 0. As well as changing git-annex-shell p2psdio, this makes git-annex tor remotes always force verification, since they don't yet use protocol version 1. Previously, annex.verify=false could skip verification when using tor remotes, and let bad data into the repository. This commit was sponsored by Jack Hill on Patreon.	2018-03-13 14:27:14 -04:00
Joey Hess	a171e576b2	rekey --force: Incorrectly marked the new key's content as being present in the local repo even when it was not.	2016-12-19 18:18:57 -04:00
Joey Hess	82d01f5619	rekey: Added --batch mode. Would have liked to make the Parser parse the file and key pairs, but it seems that optparse-applicative is unable to handle eg: many ((,) <$> argument <*> argument) This commit was sponsored by Thomas Hochstein on Patreon.	2016-12-05 12:55:50 -04:00
Joey Hess	48d8c175f8	avoid backtrace when rekey cntent verification fails	2016-11-22 01:16:18 -04:00
Joey Hess	0a4479b8ec	Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. ghc 8 added backtraces on uncaught errors. This is great, but git-annex was using error in many places for a error message targeted at the user, in some known problem case. A backtrace only confuses such a message, so omit it. Notably, commands like git annex drop that failed due to eg, numcopies, used to use error, so had a backtrace. This commit was sponsored by Ethan Aubin.	2016-11-15 21:29:54 -04:00
Joey Hess	b7c8bf5274	Preserve execute bits of unlocked files in v6 mode. When annex.thin is set, adding an object will add the execute bits to the work tree file, and this does mean that the annex object file ends up executable. This doesn't add any complexity that wasn't already present, because git annex add of an executable file has always ingested it so that the annex object ends up executable. But, since an annex object file can be executable or not, when populating an unlocked file from one, the executable bit is always added or removed to match the mode of the pointer file.	2016-04-14 14:47:08 -04:00
Joey Hess	737e45156e	remove 163 lines of code without changing anything except imports	2016-01-20 16:36:33 -04:00
Joey Hess	d6fe7fdd7d	rekey: No longer copies over urls from the old to the new key. It makes sense for migrate to do that, but not for this low-level (and little used) plumbing command to.	2016-01-07 18:06:20 -04:00
Joey Hess	3b960d1422	migrate and rekey v6 unlocked file support	2016-01-07 15:14:15 -04:00
Joey Hess	8e9608d7f0	refactoring no behavior changes	2015-12-22 13:42:58 -04:00
Joey Hess	2def1d0a23	other 80% of avoding verification when hard linking to objects in shared repo In `c6632ee5c8`, it actually only handled uploading objects to a shared repository. To avoid verification when downloading objects from a shared repository, was a lot harder. On the plus side, if the process of downloading a file from a remote is able to verify its content on the side, the remote can indicate this now, and avoid the extra post-download verification. As of yet, I don't have any remotes (except Git) using this ability. Some more work would be needed to support it in special remotes. It would make sense for tahoe to implicitly verify things downloaded from it; as long as you trust your tahoe server (which typically runs locally), there's cryptographic integrity. OTOH, despite bup being based on shas, a bup repo under an attacker's control could have the git ref used for an object changed, and so a bup repo shouldn't implicitly verify. Indeed, tahoe seems unique in being trustworthy enough to implicitly verify.	2015-10-02 14:35:12 -04:00
Joey Hess	2fb3722ce9	Do verification of checksums of annex objects downloaded from remotes. * When annex objects are received into git repositories, their checksums are verified then too. * To get the old, faster, behavior of not verifying checksums, set annex.verify=false, or remote.<name>.annex-verify=false. * setkey, rekey: These commands also now verify that the provided file matches the key, unless annex.verify=false. * reinject: Already verified content; this can now be disabled by setting annex.verify=false. recvkey and reinject already did verification, so removed now duplicate code from them. fsck still does its own verification, which is ok since it does not use getViaTmp, so verification doesn't happen twice when using fsck --from.	2015-10-01 15:56:39 -04:00
Joey Hess	b72d3fbeba	rename function	2015-10-01 14:18:57 -04:00
Joey Hess	6e5c1f8db3	convert all commands to work with optparse-applicative Still no options though.	2015-07-08 15:08:02 -04:00
Joey Hess	3125da54f6	display cmdparamdesc in optparse-applicative usage messages Since optparse-applicative display "FOO" as "[FOO]", the paramOptional modifier which wrapped it in square brackets was removed from most places.	2015-07-08 13:39:11 -04:00
Joey Hess	a2ba701056	started converting to use optparse-applicative This is a work in progress. It compiles and is able to do basic command dispatch, including git autocorrection, while using optparse-applicative for the core commandline parsing. * Many commands are temporarily disabled before conversion. * Options are not wired in yet. * cmdnorepo actions don't work yet. Also, removed the [Command] list, which was only used in one place.	2015-07-08 13:36:25 -04:00
Joey Hess	afc5153157	update my email address and homepage url	2015-01-21 12:50:09 -04:00
Joey Hess	7ae16bb6f7	Revert "let url claims optionally include a suggested filename" This reverts commit `85df9c30e9`. Putting filename in the claim was a bad idea.	2014-12-11 14:09:57 -04:00
Joey Hess	85df9c30e9	let url claims optionally include a suggested filename	2014-12-11 12:47:57 -04:00
Joey Hess	30bf112185	Urls can now be claimed by remotes. This will allow creating, for example, a external special remote that handles magnet: and *.torrent urls.	2014-12-08 19:15:07 -04:00
Joey Hess	59f88558d5	doh't use "def" for command definitions, it conflicts with Data.Default.def	2014-10-14 14:20:10 -04:00
Joey Hess	e880d0d22c	replace (Key, Backend) with Key Only fsck and reinject and the test suite used the Backend, and they can look it up as needed from the Key. This simplifies the code and also speeds it up. There is a small behavior change here. Before, all commands would warn when acting on an annexed file with an unknown backend. Now, only fsck and reinject show that warning.	2014-04-17 18:03:39 -04:00
Joey Hess	34c8af74ba	fix inversion of control in CommandSeek (no behavior changes) I've been disliking how the command seek actions were written for some time, with their inversion of control and ugly workarounds. The last straw to fix it was sync --content, which didn't fit the Annex [CommandStart] interface well at all. I have not yet made it take advantage of the changed interface though. The crucial change, and probably why I didn't do it this way from the beginning, is to make each CommandStart action be run with exceptions caught, and if it fails, increment a failure counter in annex state. So I finally remove the very first code I wrote for git-annex, which was before I had exception handling in the Annex monad, and so ran outside that monad, passing state explicitly as it ran each CommandStart action. This was a real slog from 1 to 5 am. Test suite passes. Memory usage is lower than before, sometimes by a couple of megabytes, and remains constant, even when running in a large repo, and even when repeatedly failing and incrementing the error counter. So no accidental laziness space leaks. Wall clock speed is identical, even in large repos. This commit was sponsored by an anonymous bitcoiner.	2014-01-20 04:57:36 -04:00
Joey Hess	98fc7e8a19	add, import, assistant: Better preserve the mtime of symlinks, when when adding content that gets deduplicated. Note that this turned out to remove a syscall, not add any expense. Otherwise, I would not have done it.	2013-09-25 16:07:11 -04:00
Joey Hess	a64106dcef	Supports indirect mode on encfs in paranoia mode, and other filesystems that do not support hard links, but do support symlinks and other POSIX filesystem features.	2013-06-10 13:11:33 -04:00
Joey Hess	abe8d549df	fix permission damage (thanks, Windows)	2013-05-11 23:54:25 -04:00
Joey Hess	3c7e30a295	git-annex now builds on Windows (doesn't work)	2013-05-11 15:03:00 -05:00
Joey Hess	7b4733f0e8	addurl: Bugfix: Did not properly add file in direct mode.	2013-04-11 13:35:52 -04:00
Joey Hess	f1b0a4b404	Use lower case hash directories for storing files on crippled filesystems, same as is already done for bare repositories. * since this is a crippled filesystem anyway, git-annex doesn't use symlinks on it * so there's no reason to use the mixed case hash directories that we're stuck using to avoid breaking everyone's symlinks to the content * so we can do what is already done for all bare repos, and make non-bare repos on crippled filesystems use the all-lower case hash directories * which are, happily, all 3 letters long, so they cannot conflict with mixed case hash directories * so I was able to 100% fix this and even resuming `git annex add` in the test case will recover and it will all just work.	2013-04-04 15:46:33 -04:00
Joey Hess	cfd3b16fe1	add section metadata to all commands Not yet used .. mindless train work.	2013-03-24 18:28:21 -04:00
Joey Hess	5e6a60c17d	migrate, rekey: copy rather than hard linking in crippled filesystem mode	2013-02-15 13:51:50 -04:00

1 2

57 commits