git-annex

Author	SHA1	Message	Date
Joey Hess	b61c6bc2ff	hlint	2014-10-09 15:46:05 -04:00
Joey Hess	aebcc395ff	use types to enforce that removeAnnex can only be called inside lockContent This fixed one bug where it needed to be and wasn't (in Assistant.Unused). And also found one place where lockContent was used unnecessarily (by drop --from remote). A few other places like uninit probably don't really need to lockContent, but it doesn't hurt to do call it anyway. This commit was sponsored by David Wagner.	2014-08-20 20:13:47 -04:00
Joey Hess	e386e26ef2	avoid trying to create a content file in order to lock it The nice refactoring in `ec7dd0446a` highlighted a bug in lockContent -- when the content is not present, this incorrectly created an empty lock file, using the same filename as the content file. This seems like it could result in empty objects, which fsck would detect and complain about. Both drop and move --to call lockContent, as does Remote.Git.dropKey -- I think we got lucky and this bug didn't show up because both all of those only operate on files that are present. So this bug could only manifest if there was a race, and a file's content was dropped at just the wrong time, just as another process was about to drop it. (And then only if the other process's dropping failed, otherwise it'd delete the empty object file.) Hmm, move --from also called lockContent. Unnecessarily, since the content is not being removed from the local annex. In this case, the combination of the 2 bugs could result in an empty lock file being written, and then if the download of the content failed, left in the object directory as the content. This commit also optimises lockContent, avoiding an unncessary doesFileExist test and instead just catching the exception that's thrown when the file doesn't exist. This commit was sponsored by Justine Lam.	2014-08-20 17:25:30 -04:00
Joey Hess	c784ef4586	unify exception handling into Utility.Exception Removed old extensible-exceptions, only needed for very old ghc. Made webdav use Utility.Exception, to work after some changes in DAV's exception handling. Removed Annex.Exception. Mostly this was trivial, but note that tryAnnex is replaced with tryNonAsync and catchAnnex replaced with catchNonAsync. In theory that could be a behavior change, since the former caught all exceptions, and the latter don't catch async exceptions. However, in practice, nothing in the Annex monad uses async exceptions. Grepping for throwTo and killThread only find stuff in the assistant, which does not seem related. Command.Add.undo is changed to accept a SomeException, and things that use it for rollback now catch non-async exceptions, rather than only IOExceptions.	2014-08-07 22:03:29 -04:00
Joey Hess	e880d0d22c	replace (Key, Backend) with Key Only fsck and reinject and the test suite used the Backend, and they can look it up as needed from the Key. This simplifies the code and also speeds it up. There is a small behavior change here. Before, all commands would warn when acting on an annexed file with an unknown backend. Now, only fsck and reinject show that warning.	2014-04-17 18:03:39 -04:00
Joey Hess	e426fac273	add desktop notifications Motivation: Hook scripts for nautilus or other file managers need to provide the user with feedback that a file is being downloaded. This commit was sponsored by THM Schoemaker.	2014-03-22 14:12:19 -04:00
Joey Hess	4cc34973d9	copy --fast --to remote: Avoid printing anything for files that are already believed to be present on the remote.	2014-03-13 14:51:22 -04:00
Joey Hess	86ffeb73d1	reorganize some files and imports	2014-01-26 16:25:55 -04:00
Joey Hess	3149a62a35	refactor	2014-01-26 15:53:01 -04:00
Joey Hess	d66535f065	global numcopies setting * numcopies: New command, sets global numcopies value that is seen by all clones of a repository. * The annex.numcopies git config setting is deprecated. Once the numcopies command is used to set the global number of copies, any annex.numcopies git configs will be ignored. * assistant: Make the prefs page set the global numcopies. This global numcopies setting is needed to let preferred content expressions operate on numcopies. It's also convenient, because typically if you want git-annex to preserve N copies of files in a repo, you want it to do that no matter which repo it's running in. Making it global avoids needing to warn the user about gotchas involving inconsistent annex.numcopies settings. (See changes to doc/numcopies.mdwn.) Added a new variety of git-annex branch log file, that holds only 1 value. Will probably be useful for other stuff later. This commit was sponsored by Nicolas Pouillard.	2014-01-20 16:47:56 -04:00
Joey Hess	34c8af74ba	fix inversion of control in CommandSeek (no behavior changes) I've been disliking how the command seek actions were written for some time, with their inversion of control and ugly workarounds. The last straw to fix it was sync --content, which didn't fit the Annex [CommandStart] interface well at all. I have not yet made it take advantage of the changed interface though. The crucial change, and probably why I didn't do it this way from the beginning, is to make each CommandStart action be run with exceptions caught, and if it fails, increment a failure counter in annex state. So I finally remove the very first code I wrote for git-annex, which was before I had exception handling in the Annex monad, and so ran outside that monad, passing state explicitly as it ran each CommandStart action. This was a real slog from 1 to 5 am. Test suite passes. Memory usage is lower than before, sometimes by a couple of megabytes, and remains constant, even when running in a large repo, and even when repeatedly failing and incrementing the error counter. So no accidental laziness space leaks. Wall clock speed is identical, even in large repos. This commit was sponsored by an anonymous bitcoiner.	2014-01-20 04:57:36 -04:00
Joey Hess	66285ca3d1	copy --from, get --from: When --force is used, ignore the location log and always try to get the file from the remote.	2013-12-02 15:41:20 -04:00
Joey Hess	b405295aee	hlint test suite still passes	2013-09-25 03:09:06 -04:00
Joey Hess	0f921307e7	mirror: New command, makes two repositories contain the same set of files. This is a simple approach for setting up a mirroring repository. It will work with any type of remotes. Mirror --from is more expensive than mirror --to in general. OTOH, mirror --from will get the file from any remote that has it, not only the named mirror remote. And if the named mirror remote is not the fastest available remote with a file, that can speed things up. It would be possible to make the assistant or watch command do a more dynamic mirroring, that didn't need to scan every time.	2013-08-20 15:46:35 -04:00
Joey Hess	7a7e426352	moved AssociatedFile definition	2013-07-04 02:36:02 -04:00
Joey Hess	04d07f2c1f	--unused: New switch that makes git-annex operate on all data found by the last run of git annex unused. Supported by fsck, get, move, copy.	2013-07-03 15:26:59 -04:00
Joey Hess	b337a8b4c7	--all for get, move, and copy	2013-07-03 13:55:50 -04:00
Joey Hess	9e11699c76	connect existing meters to the transfer log for downloads Most remotes have meters in their implementations of retrieveKeyFile already. Simply hooking these up to the transfer log makes that information available. Easy peasy. This is particularly valuable information for encrypted remotes, which otherwise bypass the assistant's polling of temp files, and so don't have good progress bars yet. Still some work to do here (see progressbars.mdwn changes), but this is entirely an improvement from the lack of progress bars for encrypted downloads.	2013-04-11 17:32:31 -04:00
Joey Hess	cfd3b16fe1	add section metadata to all commands Not yet used .. mindless train work.	2013-03-24 18:28:21 -04:00
Joey Hess	921f29c004	two types of byName Clean up from `9769235d6b`. In some cases, looking up a remote by name even though it has no UUID is desirable. This includes git annex sync, which can operate on remotes without an annex, and XMPP pairing, which runs addRemote (with calls byName) before the UUID of the XMPP remote has been configured in git.	2013-03-05 15:43:56 -04:00
Joey Hess	3b92c279e8	copy: Update location log when no copy was performed, if the location log was out of date.	2013-02-26 14:39:37 -04:00
Joey Hess	397082013a	proper fix for dropunused Now getKeysPresent checks that the key's content, not only its directory, exists. In direct mode, the inode cache file is used as a standin for the content. removeAnnex always removes the inode cache file, and drop and move --from always call removeAnnex, even if the object does not seem to be inAnnex, to ensure it's always deleted.	2013-02-15 17:58:49 -04:00
Joey Hess	b68eee625f	More commands work in direct mode repositories: find, whereis, move, copy, drop, log. These started working, for free, once lookupFile supported direct mode. yay!!	2013-01-05 17:17:04 -04:00
Joey Hess	2ce736ac50	block all commands that don't work in direct mode I left status working in direct mode, although it doesn't show correct stats for known annex keys.	2012-12-29 14:28:19 -04:00
Joey Hess	ebd576ebcb	where indentation	2012-11-12 01:05:04 -04:00
Joey Hess	40df26757a	copy: avoid updating location log when no copy is performed git annex copy --to remote often does not need to copy a file, but it was still updating the location log in this case.	2012-09-24 19:58:34 -04:00
Joey Hess	df07ccf404	make the assistant retry failed transfers When a transfer fails, the progress info can be used to intelligently retry it. If the transfer managed to make some progress, but did not fully complete, then there's a good chance that a retry will finish it (or at least make more progress).	2012-09-23 13:27:13 -04:00
Joey Hess	715a9a2f8e	keep logs of failed transfers, and requeue them when doing a non-full scan of a remote	2012-08-23 15:24:15 -04:00
Joey Hess	7225c2bfc0	record transfer information on local git remotes In order to record a semi-useful filename associated with the key, this required plumbing the filename all the way through to the remotes' storeKey and retrieveKeyFile. Note that there is potential for deadlock here, narrowly avoided. Suppose the repos are A and B. A sends file foo to B, and at the same time, B gets file foo from A. So, A locks its upload transfer info file, and then locks B's download transfer info file. At the same time, B is taking the two locks in the opposite order. This is only not a deadlock because the lock code does not wait, and aborts. So one of A or B's transfers will be aborted and the other transfer will continue. Whew!	2012-07-01 17:15:11 -04:00
Joey Hess	e5fd8b67b7	get, move, copy: Now refuse to do anything when the requested file transfer is already in progress by another process. Note this is per-remote, so trying to get the same file from multiple remotes can still let duplicate downloads run. (And uploading the same file to multiple remotes is not duplicate at all of course.) get, move, and copy are the only git-annex subcommands that transfer files, but there's still git-annex-shell recvkey and sendkey to deal with too. I considered modifying retrieveKeyFile or getViaTmp, but they are called by other code that does not involve expensive file transfers (migrate) or that does file transfers that should not be checked by this (fsck --from).	2012-07-01 17:15:11 -04:00
Joey Hess	942d8f7298	hlint	2012-06-12 11:32:06 -04:00
Joey Hess	60ab3d84e1	added ifM and nuked 11 lines of code no behavior changes	2012-03-14 17:43:34 -04:00
Joey Hess	2fd294d06f	move --from, copy --from: 10 times faster scanning remote on local disk Rather than go through the location log to see which files are present on the remote, it simply looks at the disk contents directly. I benchmarked this speeding up scanning 834 files, from an annex on my phone's SSD, from 11.39 seconds to 1.31 seconds. (No files actually moved.) Also benchmarked 8139 files, from an annex on spinning storage, speeding up from 103.17 to 13.39 seconds. Note that benchmarking with an encrypted annex on flash actually showed a minor slowdown with this optimisation -- from 13.93 to 14.50 seconds. Seems the overhead of doing the crypto needed to get the filenames to directly check can be higher than the overhead of looking up data in the location log. (Which says good things about how well the location log and git have been optimised!) It may make sense to make encrypted local remotes not have hasKeyCheap set; further benchmarking is called for.	2012-02-26 14:59:48 -04:00
Joey Hess	61dbad505d	fsck --from remote --fast Avoids expensive file transfers, at the expense of checking file size and/or contents. Required some reworking of the remote code.	2012-01-20 13:23:11 -04:00
Joey Hess	06b0cb6224	add tmp flag parameter to retrieveKeyFile	2012-01-19 16:07:36 -04:00
Joey Hess	90319afa41	fsck --from Fscking a remote is now supported. It's done by retrieving the contents of the specified files from the remote, and checking them, so can be an expensive operation. (Several optimisations are possible, to speed it up, of course.. This is the slow and stupid remote fsck to start with.) Still, if the remote is a special remote, or a git repository that you cannot run fsck in locally, it's nice to have the ability to fsck it. If you have any directory special remotes, now would be a good time to fsck them, in case you were hit by the data loss bug fixed in the previous release!	2012-01-19 15:24:05 -04:00
Joey Hess	1f8a1058c9	tweak	2012-01-06 10:57:57 -04:00
Joey Hess	df21cbfdd2	look up --to and --from remote names only once This will speed up commands like move and drop.	2012-01-06 04:06:13 -04:00
Joey Hess	0a36f92a31	more command-specific options Made --from and --to command-specific options. Added generic storage for values of command-specific options, which allows removing some of the special case fields in AnnexState. (Also added generic storage for command-specific flags, although there are not yet any.) Note that this storage uses a Map, so repeatedly looking up the same value is slightly more expensive than looking up an AnnexState field. But, the value can be looked up once in the seek stage, transformed as necessary, and passed in a closure to the start stage, and this avoids that overhead. Still, I'm hesitant to use this for things like force or fast flags. It's probably best to reserve it for flags that are only used by a few commands, or options like --from and --to that it's important only be allowed to be used with commands that implement them, to avoid user confusion.	2012-01-06 03:16:42 -04:00
Joey Hess	4a02c2ea62	type alias cleanup	2011-12-31 04:11:58 -04:00
Joey Hess	95e748cbd4	inverted logic	2011-12-09 13:38:28 -04:00
Joey Hess	3f5f28b487	factor out a stopUnless code melt for lunch	2011-12-09 12:23:45 -04:00
Joey Hess	0f0169fa99	comment update	2011-11-20 22:49:53 -04:00
Joey Hess	1b90918cec	avoid error message when doing get --from on file not present on remote	2011-11-18 17:26:37 -04:00
Joey Hess	637b5feb45	lint	2011-11-11 01:52:58 -04:00
Joey Hess	b327227ba5	better limiting of start actions to only run whenAnnexed Mostly only refactoring, but this does remove one redundant stat of the symlink by copy.	2011-11-10 23:45:14 -04:00
Joey Hess	4389782628	tweak	2011-11-10 22:37:52 -04:00
Joey Hess	2de1e2c2ce	Optimized copy --from and get --from to avoid checking the location log for files that are already present. This can be a significant speedup when running in large trees that are only missing a few files; it makes copy --from just as fast as get.	2011-11-10 21:32:42 -04:00
Joey Hess	d3e1a3619f	safer inannex checking git-annex-shell inannex now returns always 0, 1, or 100 (the last when it's unclear if content is currently in the index due to it currently being moved or dropped). (Actual locking code still not yet written.)	2011-11-09 18:33:15 -04:00
Joey Hess	8ce7e73f74	reorg to allow taking content lock The lock will only persist during the perform stage, so the content must be removed from the annex then, rather than in the cleanup stage. (No lock is actually taken yet.)	2011-11-09 16:54:18 -04:00

1 2 3

102 commits