git-annex

Author	SHA1	Message	Date
Joey Hess	9c4650358c	add KeyVariety type Where before the "name" of a key and a backend was a string, this makes it a concrete data type. This is groundwork for allowing some varieties of keys to be disabled in file2key, so git-annex won't use them at all. Benchmarks ran in my big repo: old git-annex info: real 0m3.338s user 0m3.124s sys 0m0.244s new git-annex info: real 0m3.216s user 0m3.024s sys 0m0.220s new git-annex find: real 0m7.138s user 0m6.924s sys 0m0.252s old git-annex find: real 0m7.433s user 0m7.240s sys 0m0.232s Surprising result; I'd have expected it to be slower since it now parses all the key varieties. But, the parser is very simple and perhaps sharing KeyVarieties uses less memory or something like that. This commit was supported by the NSF-funded DataLad project.	2017-02-24 15:16:56 -04:00
Joey Hess	2577f1c0a2	fsck --all --from was checking the content of files in the local repository, rather than on the special remote. Straight up forgot to handle this case! This commit was sponsored by Fernando Jimenez on Patreon.	2016-11-16 15:33:57 -04:00
Joey Hess	0a4479b8ec	Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. ghc 8 added backtraces on uncaught errors. This is great, but git-annex was using error in many places for an error message targeted at the user, in some known problem case. A backtrace only confuses such a message, so omit it. Notably, commands like git annex drop that failed due to eg, numcopies, used to use error, so had a backtrace. This commit was sponsored by Ethan Aubin.	2016-11-15 21:29:54 -04:00
Joey Hess	1a0e2c9901	get, move, copy, mirror: Added --failed switch which retries failed copies/moves Note that get --from foo --failed will get things that a previous get --from bar tried and failed to get, etc. I considered making --failed only retry transfers from the same remote, but it was easier, and seems more useful, to not have the same remote requirement. Noisy due to some refactoring into Types/	2016-08-03 12:37:12 -04:00
Joey Hess	d13194b230	--branch, stage 2 Show branch:file that is being operated on. I had to make ActionItem a type and not a type class because withKeyOptions' passed two different types of values when using the type class, and I could not get the type checker to accept that.	2016-07-20 15:23:43 -04:00
Joey Hess	5642daa651	fsck: Fix a reversion in direct mode fsck of a file that is present when the location log thinks it is not. Reversion introduced in version 5.20151208.	2016-07-12 13:41:03 -04:00
Joey Hess	ae65aecb0b	fsck: When a key is not previously known in the location log, record something so that reinject --known will work.	2016-05-10 13:20:45 -04:00
Joey Hess	9169234c34	fix overindent	2016-05-10 13:08:24 -04:00
Joey Hess	bd516af734	fsck: Warn when core.sharedRepository is set and an annex object file's write bit is not set and cannot be set due to the file being owned by a different user. Made all Annex.Perms file mode changing functions ignore errors when core.sharedRepository is set, because the file might be owned by someone else. I don't fancy getting bug reports about crashes due to set modes in this configuration, which is a very foot-shooty configuration in the first place. The fsck warning is necessary because old repos kept files mode 444, which doesn't allow locking them, and so if the mode remains 444 due to the file being owned by someone else, the user should be told about it.	2016-04-14 15:36:53 -04:00
Joey Hess	b7c8bf5274	Preserve execute bits of unlocked files in v6 mode. When annex.thin is set, adding an object will add the execute bits to the work tree file, and this does mean that the annex object file ends up executable. This doesn't add any complexity that wasn't already present, because git annex add of an executable file has always ingested it so that the annex object ends up executable. But, since an annex object file can be executable or not, when populating an unlocked file from one, the executable bit is always added or removed to match the mode of the pointer file.	2016-04-14 14:47:08 -04:00
Joey Hess	ec198fec83	fsck: When the only copy of a file is in a dead repository, mention the repository.	2016-02-19 15:12:11 -04:00
Joey Hess	885e54df0a	fsck: Populate unlocked files in v6 repositories whose content is present in annex/objects but didn't reach the work tree. This also handles fixing up after `cf260d9a15`	2016-02-14 17:27:50 -04:00
Joey Hess	675321264f	fsck: Detect and fix missing associated file mappings in v6 repositories. This also handles fixing up after the bad data written by `cf260d9a15`.	2016-02-14 17:09:54 -04:00
Joey Hess	74bbdfa888	files with only 1 linkCount may still be unlocked When on crippled filesystem, or without annex.thin set.	2016-02-14 17:04:09 -04:00
Joey Hess	5b51db7645	clean up	2016-02-14 16:52:43 -04:00
Joey Hess	737e45156e	remove 163 lines of code without changing anything except imports	2016-01-20 16:36:33 -04:00
Joey Hess	87f0708f88	persistent-sqlite is now a hard build dependency, since v6 repository mode needs it.	2015-12-26 13:00:52 -04:00
Joey Hess	99f1d7991d	recent fsck changes caused ugly message when object was not present	2015-12-15 16:10:48 -04:00
Joey Hess	0ddcaae9c1	changes for v6 broke fsck in direct mode	2015-12-15 14:27:20 -04:00
Joey Hess	e7183d83d3	fsck for v6 unlocked files This only adds 1 stat to each file fscked for locked files, so added overhead is minimal. For unlocked files it has to access the database to see if a file is modified.	2015-12-11 16:07:54 -04:00
Joey Hess	abd66c7089	fsck: Failed to honor annex.diskreserve when checking a remote.	2015-12-11 13:50:27 -04:00
Joey Hess	4b02af57b6	display a message in the unlikely scenario of fscking a dead repository	2015-11-10 14:44:58 -04:00
Joey Hess	cd7929034a	fsck: When fscking a dead repo, avoid incorrect "fixing location log" message. keyLocations doesn't return locations in dead repos, but if we're fscking a dead repo, we want to look at what locations are actually logged for it.	2015-11-10 13:59:04 -04:00
Joey Hess	3d0f41518d	parallel fsck (yes, these changes are all it takes now!)	2015-11-04 16:28:14 -04:00
Joey Hess	2def1d0a23	other 80% of avoiding verification when hard linking to objects in shared repo In `c6632ee5c8`, it actually only handled uploading objects to a shared repository. To avoid verification when downloading objects from a shared repository, was a lot harder. On the plus side, if the process of downloading a file from a remote is able to verify its content on the side, the remote can indicate this now, and avoid the extra post-download verification. As of yet, I don't have any remotes (except Git) using this ability. Some more work would be needed to support it in special remotes. It would make sense for tahoe to implicitly verify things downloaded from it; as long as you trust your tahoe server (which typically runs locally), there's cryptographic integrity. OTOH, despite bup being based on shas, a bup repo under an attacker's control could have the git ref used for an object changed, and so a bup repo shouldn't implicitly verify. Indeed, tahoe seems unique in being trustworthy enough to implicitly verify.	2015-10-02 14:35:12 -04:00
Joey Hess	cad3349001	rename fsckKey to verifyKeyContent No behavior changes.	2015-10-01 13:29:17 -04:00
Joey Hess	3f47d1b351	Improve bash completion, so it completes names of remotes and backends in appropriate places. Not necessarily everywhere, but a lot of the most often used places. Re the use of .Internal, see https://github.com/pcapriotti/optparse-applicative/issues/155	2015-09-14 13:19:04 -04:00
Joey Hess	0b7a8b72bb	Fix building without database. Ben Boeckel had a patch, but.. Actually, that was not the only place that used ScheduleIncremental when built w/o database. Since the data type doesn't need database stuff, I've instead fixed this build problem by exposing the ScheduleIncremental constructor to database-less builds.	2015-08-23 15:39:29 -07:00
Joey Hess	9dfe03dbcd	Improve shutdown due to --time-limit, especially for fsck * Perform a clean shutdown when --time-limit is reached. This includes running queued git commands, and cleanup actions normally run when a command is finished. * fsck: Commit incremental fsck database when --time-limit is reached. Previously, some of the last files fscked did not make it into the database when using --time-limit. Note that this changes Annex.addCleanup hooks, to run after --time-limit expires. Fsck was using such a hook to clean up after a --incremental-schedule, and that shouldn't run when --time-limit expires. So, instead, moved that cleanup code to be run by cleanupIncremental. Resulted in some data type juggling.	2015-07-31 16:01:54 -04:00
Joey Hess	1fb9ab342b	Support building without persistent database for systems that lack TH. This removes support for incremental fsck.	2015-07-25 17:37:09 -04:00
Joey Hess	6a4f2087be	finished converting all the main options	2015-07-10 13:23:06 -04:00
Joey Hess	a7f58634b8	wip	2015-07-09 16:05:45 -04:00
Joey Hess	8ad927dbc6	converted copy and move Got a little tricky..	2015-07-09 15:23:14 -04:00
Joey Hess	032e6485fa	use Alternative for parsing KeyOptions	2015-07-09 12:44:03 -04:00
Joey Hess	94e703e8b8	use Alternative when parsing mutually exclusive fsck options	2015-07-09 12:26:25 -04:00
Joey Hess	c1c64ec76c	formatting	2015-07-09 10:42:28 -04:00
Joey Hess	d8d1499229	finalOpt is the same as optional	2015-07-09 01:02:27 -04:00
Joey Hess	60806dd191	wip	2015-07-08 17:59:06 -04:00
Joey Hess	6a88c7c101	converted fsck's options to optparse-applicative Global options and seeking and key options are still to be done.	2015-07-08 16:58:54 -04:00
Joey Hess	6e5c1f8db3	convert all commands to work with optparse-applicative Still no options though.	2015-07-08 15:08:02 -04:00
Joey Hess	a2ba701056	started converting to use optparse-applicative This is a work in progress. It compiles and is able to do basic command dispatch, including git autocorrection, while using optparse-applicative for the core commandline parsing. * Many commands are temporarily disabled before conversion. * Options are not wired in yet. * cmdnorepo actions don't work yet. Also, removed the [Command] list, which was only used in one place.	2015-07-08 13:36:25 -04:00
Joey Hess	5123a512d6	add a hint about marking a key dead	2015-06-09 15:12:40 -04:00
Joey Hess	6eefc5db65	fsck: Ignore keys that are known to be dead when running in --all mode or in a bare repo. Otherwise, still reports files with lost contents, even if the content is dead.	2015-06-09 14:08:57 -04:00
Joey Hess	a812d598ef	Take space that will be used by running downloads into account when checking annex.diskreserve.	2015-05-12 15:20:22 -04:00
Joey Hess	e27b97d364	Merge branch 'master' into concurrentprogress Conflicts: Command/Fsck.hs Messages.hs Remote/Directory.hs Remote/Git.hs Remote/Helper/Special.hs Types/Remote.hs debian/changelog git-annex.cabal	2015-05-12 13:23:22 -04:00
Joey Hess	6cf62a9bde	support time-1.5.0 This no longer uses old-locale's defaultTimeLocale, but provides one of its own. Factored out a Logs.TimeStamp.	2015-05-10 15:21:35 -04:00
Joey Hess	469242ac4d	fsck: Ignore error recording the fsck in the activity log, which can happen when running fsck in a read-only repository. Closes: #698559 (fsck can still need to write to the repository if it finds problems, but a successful fsck can be done read-only)	2015-05-06 14:45:20 -04:00
Joey Hess	38c458b407	refactor	2015-04-30 14:02:56 -04:00
Joey Hess	cfbeb1e7b7	Fix bogus failure of fsck --fast.	2015-04-27 17:40:21 -04:00
Joey Hess	8d685768d3	fsck --from remote: Avoid downloading a key if it would go over the annex.diskreserve limit.	2015-04-18 14:23:42 -04:00
Joey Hess	8489057e8d	fsck --from remote: When bad content is found in the remote, and the local repo does not have a copy of the content, preserve the bad content in .git/annex/bad/ to avoid further data loss.	2015-04-18 14:13:07 -04:00
Joey Hess	a2902cdaaf	add filename to progress bar, and display ok/failed at end This needed plumbing an AssociatedFile through retrieveKeyFileCheap.	2015-04-14 16:35:10 -04:00
Joey Hess	9445556c97	rethought distributed fsck; instead add activity.log and expire command This is much more space efficient!	2015-04-05 12:50:02 -04:00
Joey Hess	656fc1c881	fsck: Added --distributed and --expire options, for distributed fsck.	2015-04-01 17:53:16 -04:00
Joey Hess	cd6b62f35e	--auto is no longer a global option; only get, drop, and copy accept it. Not a behavior change unless you were passing it to a command that ignored it.	2015-03-25 17:06:14 -04:00
Joey Hess	3414229354	fsck: Multiple incremental fscks of different repos (some remote) can now be in progress at the same time in the same repo without it getting confused about which files have been checked for which remotes.	2015-02-17 17:08:11 -04:00
Joey Hess	afb3e3e472	avoid crash when starting fsck --incremental when one is already running Turns out sqlite does not like having its database deleted out from underneath it. It might suffice to empty the table, but I would rather start each fsck over with a new database, so I added a lock file, and running incremental fscks use a shared lock. This leaves one concurrency bug left; running two concurrent fsck --more will lead to: "SQLite3 returned ErrorBusy while attempting to perform step." and one or both will fail. This is a concurrent writers problem.	2015-02-17 13:30:24 -04:00
Joey Hess	7d36e7d18d	commit new transaction after 60 seconds Database.Handle can now be given a CommitPolicy, making it easy to specify transaction granularity. Benchmarking the old git-annex incremental fsck that flips sticky bits to the new that uses sqlite, running in a repo with 37000 annexed files, both from cold cache: old: 6m6.906s new: 6m26.913s This commit was sponsored by TasLUG.	2015-02-16 17:05:42 -04:00
Joey Hess	d2766df914	commit more transactions when fscking This makes interrupt and resume work, robustly. But, incremental fsck is slowed down by all those transactions..	2015-02-16 16:07:36 -04:00
Joey Hess	91e9146d1b	convert incremental fsck to using sqlite database Did not keep backwards compat for sticky bit records. An incremental fsck that is already in progress will start over on upgrade to this version. This is not yet ready for merging. The autobuilders need to have sqlite installed. Also, interrupting a fsck --incremental does not commit the database. So, resuming with fsck --more restarts from beginning. Memory: Constant during a fsck of tens of thousands of files. (But, it does seem to buffer whole transaction in memory, so may really scale with number of files.) CPU: ?	2015-02-16 15:35:26 -04:00
Joey Hess	3f5c9ddc05	fix compile warning	2015-02-12 16:03:59 -04:00
Joey Hess	4794ef083a	fsck --from: If a download from a remote fails, propagate the failure.	2015-02-10 13:10:58 -04:00
Joey Hess	8066a1c3cc	The file matching options are now only accepted by commands that can actually use them.	2015-02-06 17:16:41 -04:00
Joey Hess	70736d2b41	Repository tuning parameters can now be passed when initializing a repository for the first time. * init: Repository tuning parameters can now be passed when initializing a repository for the first time. For details, see http://git-annex.branchable.com/tuning/ * merge: Refuse to merge changes from a git-annex branch of a repo that has been tuned in incompatible ways.	2015-01-27 17:38:06 -04:00
Joey Hess	afc5153157	update my email address and homepage url	2015-01-21 12:50:09 -04:00
Joey Hess	4f657aa14e	add getFileSize, which can get the real size of a large file on Windows Avoid using fileSize which maxes out at just 2 gb on Windows. Instead, use hFileSize, which doesn't have a bounded size. Fixes support for files > 2 gb on Windows. Note that the InodeCache code only needs to compare a file size, so it doesn't matter if the file size wraps. So it has been left as-is. This was necessary both to avoid invalidating existing inode caches, and because the code passed FileStatus around and would have become more expensive if it called getFileSize. This commit was sponsored by Christian Dietrich.	2015-01-20 17:09:24 -04:00
Joey Hess	3bab5dfb1d	revert parentDir change Reverts `965e106f24` Unfortunately, this caused breakage on Windows, and possibly elsewhere, because parentDir and takeDirectory do not behave the same when there is a trailing directory separator.	2015-01-09 13:11:56 -04:00
Joey Hess	965e106f24	made parentDir return a Maybe FilePath; removed most uses of it parentDir is less safe than takeDirectory, especially when working with relative FilePaths. It's really only useful in loops that want to terminate at / This commit was sponsored by Audric SCHILTKNECHT.	2015-01-06 18:55:56 -04:00
Joey Hess	59f88558d5	don't use "def" for command definitions, it conflicts with Data.Default.def	2014-10-14 14:20:10 -04:00
Joey Hess	9fd95d9025	indent with tabs not spaces Found these with: git grep "^ " $(find -type f -name \*.hs) \|grep -v ': where' Unfortunately there is some inline hamlet that cannot use tabs for indentation. Also, Assistant/WebApp/Bootstrap3.hs is a copy of a module and so I'm leaving it as-is.	2014-10-09 15:09:26 -04:00
Joey Hess	7b50b3c057	fix some mixed space+tab indentation This fixes all instances of " \t" in the code base. Most common case seems to be after a "where" line; probably vim copied the two space layout of that line. Done as a background task while listening to episode 2 of the Type Theory podcast.	2014-10-09 15:09:11 -04:00
Joey Hess	e880d0d22c	replace (Key, Backend) with Key Only fsck and reinject and the test suite used the Backend, and they can look it up as needed from the Key. This simplifies the code and also speeds it up. There is a small behavior change here. Before, all commands would warn when acting on an annexed file with an unknown backend. Now, only fsck and reinject show that warning.	2014-04-17 18:03:39 -04:00
Joey Hess	b63276309e	clean up cleanup action enumeration	2014-03-13 19:06:26 -04:00
Joey Hess	a1432bce2f	Put non-object tmp files in .git/annex/misctmp, leaving .git/annex/tmp for only partially transferred objects. This allows eg, putting .git/annex/tmp on a ram disk, if the disk IO of temp object files is too annoying (and if you don't want to keep partially transferred objects across reboots). .git/annex/misctmp must be on the same filesystem as the git work tree, since files are moved to there in a way that will not work cross-device, as well as symlinked into there. I first wanted to put the tmp objects in .git/annex/objects/tmp, but that would pose transition problems on upgrade when partially transferred objects existed. git annex info does not currently show the size of .git/annex/misctmp, since it should stay small. It would also be ok to make something clean it out, periodically.	2014-02-26 16:52:56 -04:00
Joey Hess	3f6e4b8c7c	fix all remaining -Wall warnings on Windows	2014-02-25 14:48:50 -04:00
Joey Hess	1428390300	tweak wording	2014-02-20 16:00:41 -04:00
Joey Hess	9edc3a735d	fsck: Refuse to do anything if more than one of --incremental, --more, and --incremental-schedule are given, since it's not clear which option should win.	2014-02-20 15:56:45 -04:00
Joey Hess	134fdefb8c	fsck: When run with --all or --unused, while .gitattributes annex.numcopies cannot be honored since it's operating on keys instead of files, make it honor the global numcopies setting, and the annex.numcopies git config setting.	2014-02-20 14:45:17 -04:00
Joey Hess	8952ccec1b	windows: fix fsck --incremental to not crash Although it is still not incremental.	2014-02-13 12:40:10 -04:00
Joey Hess	7b19c7d25b	cleanup thanks to Utility.PID	2014-02-11 15:39:51 -04:00
Joey Hess	1572c460e8	avoid using openFile when withFile can be used Potentially fixes some FD leak if an action on an opened file handle fails for some reason. There have been some hard to reproduce reports of git-annex leaking FDs, and this may solve them.	2014-02-03 10:19:06 -04:00
Joey Hess	1669e80e85	Windows: Avoid using unix-compat's rename, which refuses to rename directories. Opened a bug about this: https://github.com/jystic/unix-compat/issues/10	2014-01-29 15:19:03 -04:00
Joey Hess	86ffeb73d1	reorganize some files and imports	2014-01-26 16:25:55 -04:00
Joey Hess	f7cdc40f7b	reorg	2014-01-21 18:08:56 -04:00
Joey Hess	0ef282a116	numcopies cleanup, part 2 This includes several bug fixes.	2014-01-21 17:25:39 -04:00
Joey Hess	b40df4f0d0	reorganize numcopies code (no behavior changes) Move stuff into Logs.NumCopies. Add a NumCopies newtype. Better names for various serialization classes that are specific to one thing or another.	2014-01-21 16:08:59 -04:00
Joey Hess	34c8af74ba	fix inversion of control in CommandSeek (no behavior changes) I've been disliking how the command seek actions were written for some time, with their inversion of control and ugly workarounds. The last straw to fix it was sync --content, which didn't fit the Annex [CommandStart] interface well at all. I have not yet made it take advantage of the changed interface though. The crucial change, and probably why I didn't do it this way from the beginning, is to make each CommandStart action be run with exceptions caught, and if it fails, increment a failure counter in annex state. So I finally remove the very first code I wrote for git-annex, which was before I had exception handling in the Annex monad, and so ran outside that monad, passing state explicitly as it ran each CommandStart action. This was a real slog from 1 to 5 am. Test suite passes. Memory usage is lower than before, sometimes by a couple of megabytes, and remains constant, even when running in a large repo, and even when repeatedly failing and incrementing the error counter. So no accidental laziness space leaks. Wall clock speed is identical, even in large repos. This commit was sponsored by an anonymous bitcoiner.	2014-01-20 04:57:36 -04:00
Joey Hess	011b8bc7ec	pull in Win32-extras, to be able to get current process id in Windows Fixed up a number of things that had worked around there not being a way to get that. Most notably, transfer info files on windows now include the process id, since no locking is currently done. This means the file format varies between windows and unix.	2013-12-11 00:15:10 -04:00
Joey Hess	12bc989d2d	better name for continuation	2013-12-01 15:52:30 -04:00
Joey Hess	d48b00ebed	Direct mode .git/annex/objects directories are no longer left writable Because that allowed writing to symlinks of files that are not present, which followed the link and put bad content in an object location. fsck: Fix up .git/annex/object directory permissions. This commit was sponsored by an anonymous bitcoin donor.	2013-11-15 14:52:03 -04:00
Joey Hess	2b6747b6a2	update for Duration type change	2013-10-08 17:36:55 -04:00
Joey Hess	b405295aee	hlint test suite still passes	2013-09-25 03:09:06 -04:00
Joey Hess	65fe2314be	fsck: Fix detection and fixing of present direct mode files that are wrongly represented as standin symlinks on crippled filesystems.	2013-09-13 12:50:29 -04:00
Joey Hess	0f921307e7	mirror: New command, makes two repositories contain the same set of files. This is a simple approach for setting up a mirroring repository. It will work with any type of remotes. Mirror --from is more expensive than mirror --to in general. OTOH, mirror --from will get the file from any remote that has it, not only the named mirror remote. And if the named mirror remote is not the fastest available remote with a file, that can speed things up. It would be possible to make the assistant or watch command do a more dynamic mirroring, that didn't need to scan every time.	2013-08-20 15:46:35 -04:00
Joey Hess	93f2371e09	get rid of __WINDOWS__, use mingw32_HOST_OS The latter is harder for me to remember, but avoids build failures in code used by the configure program.	2013-08-02 12:27:32 -04:00
Joey Hess	04d07f2c1f	--unused: New switch that makes git-annex operate on all data found by the last run of git annex unused. Supported by fsck, get, move, copy.	2013-07-03 15:26:59 -04:00
Joey Hess	def7cb706f	Add --all option, and support it for fsck	2013-07-03 13:12:53 -04:00
Joey Hess	a35bdcb3f2	fsck: Ensures that direct mode is used for files when it's enabled. A common failure mode for direct mode has been for files to end up still stored in indirect mode. While I hope that doesn't happen anymore, fsck should deal with it.	2013-06-24 16:26:00 -04:00
Joey Hess	64f8819ae4	fix build	2013-06-17 21:30:52 -04:00
Joey Hess	9ef09587dc	fsck: Avoid getting confused by Windows path separators	2013-06-17 21:18:43 -04:00

264 commits