git-annex

Author	SHA1	Message	Date
Joey Hess	3da0064657	assistant unused file handling Make sanity checker run git annex unused daily, and queue up transfers of unused files to any remotes that will have them. The transfer retrying code works for us here, so eg when a backup disk remote is plugged in, any transfers to it are done. Once the unused files reach a remote, they'll be removed locally as unwanted. If the setup does not cause unused files to go to a remote, they'll pile up, and the sanity checker detects this using some heuristics that are pretty good -- 1000 unused files, or 10% of disk used by unused files, or more disk wasted by unused files than is left free. Once it detects this, it pops up an alert in the webapp, with a button to take action. TODO: Webapp UI to configure this, and also the ability to launch an immediate cleanup of all unused files. This commit was sponsored by Simon Michael.	2014-01-22 22:53:18 -04:00
Joey Hess	4b55afe9e9	add "unused" preferred content expression With a really nice optimisation that keeps it from having any overhead in normal operation! This commit was sponsored by Ulises Vitulli.	2014-01-22 16:35:32 -04:00
Joey Hess	ae3cd632bd	add timestamps to unused log files This will be used in expiring old unused objects. The timestamp is when it was first noticed it was unused. Backwards compatibility: It supports reading old format unused log files. The old version of git-annex will ignore lines in log files written by the new version, so the worst interop problem would be git annex dropunused not knowing some numbers that git-annex unused reported.	2014-01-22 15:33:02 -04:00
Joey Hess	f7cdc40f7b	reorg	2014-01-21 18:08:56 -04:00
Joey Hess	0ef282a116	numcopies cleanup, part 2 This includes several bug fixes.	2014-01-21 17:25:39 -04:00
Joey Hess	b40df4f0d0	reorganize numcopies code (no behavior changes) Move stuff into Logs.NumCopies. Add a NumCopies newtype. Better names for various serialization classes that are specific to one thing or another.	2014-01-21 16:08:59 -04:00
Joey Hess	d66535f065	global numcopies setting * numcopies: New command, sets global numcopies value that is seen by all clones of a repository. * The annex.numcopies git config setting is deprecated. Once the numcopies command is used to set the global number of copies, any annex.numcopies git configs will be ignored. * assistant: Make the prefs page set the global numcopies. This global numcopies setting is needed to let preferred content expressions operate on numcopies. It's also convenient, because typically if you want git-annex to preserve N copies of files in a repo, you want it to do that no matter which repo it's running in. Making it global avoids needing to warn the user about gotchas involving inconsistent annex.numcopies settings. (See changes to doc/numcopies.mdwn.) Added a new variety of git-annex branch log file, that holds only 1 value. Will probably be useful for other stuff later. This commit was sponsored by Nicolas Pouillard.	2014-01-20 16:47:56 -04:00
Joey Hess	93161d0dea	copyright year	2014-01-08 16:29:15 -04:00
Joey Hess	3e68c1c2fd	add remote state logs This allows a remote to store a piece of arbitrary state associated with a key. This is needed to support Tahoe, where the file-cap is calculated from the data stored in it, and used to retrieve a key later. Glacier also would be much improved by using this. GETSTATE and SETSTATE are added to the external special remote protocol. Note that the state is left as-is even when a key is removed from a remote. It's up to the remote to decide when it wants to clear the state. The remote state log, $KEY.log.rmt, is a UUID-based log. However, rather than using the old UUID-based log format, I created a new variant of that format. The new variant is more space efficient (since it lacks the "timestamp=" hack), easier to parse (and the parser doesn't mess with whitespace in the value), and avoids compatibility cruft in the old one. This seemed worth cleaning up for these new files, since there could be a lot of them, while before UUID-based logs were only used for a few log files at the top of the git-annex branch. The transition code has also been updated to handle these new UUID-based logs. This commit was sponsored by Daniel Hofer.	2014-01-03 16:35:57 -04:00
Joey Hess	8e3032df2d	added GETWANTED, SETWANTED for Tobias's flickr remote This was unexpectedly difficult because of a dependency cycle. To parse a preferred content expression involves several things that need to operate on the list of remotes. Which needs Remote.External. The only way to avoid this cycle (I tried breaking it at several points) was to skip parsing the expression in SETWANTED. That's sorta ok, because git-annex already has to deal with unparsable preferred content expressions being stored, in order to handle eg, upgrades. But I'm still not very happy that I cannot check it. I feel this is a strong indication that I need to beware of further bloating the special remote protocol interface.	2014-01-01 20:12:20 -04:00
Joey Hess	f0a6de1ca2	add PreferredContentExpression type	2014-01-01 19:58:02 -04:00
Richard Hartmann	974fe009bf	Another round of s/amoung/among/	2013-12-19 12:30:53 -04:00
Joey Hess	f931272681	syntax	2013-12-11 00:18:58 -04:00
Joey Hess	011b8bc7ec	pull in Win32-extras, to be able to get current process id in Windows Fixed up a number of things that had worked around there not being a way to get that. Most notably, transfer info files on windows now include the process id, since no locking is currently done. This means the file format varies between windows and unix.	2013-12-11 00:15:10 -04:00
Joey Hess	ecd42aef8e	different PID types for Unix and Windows Windows has a larger (unsigned) PID space, so cannot use the unix CInt there. Note that TransferInfo does not yet ever get the TransferPid populated, as there is missing locking.	2013-12-10 23:48:42 -04:00
Joey Hess	6edac746f0	merge improved fsck types from git-repair and some associated changes	2013-11-30 14:29:11 -04:00
Joey Hess	53ab737723	clean up cruft left in log by bug	2013-11-09 14:30:26 -04:00
Joey Hess	8e1b8af6e7	fix crash on empty description Caused by bug fixed in `46cf00ffd8`	2013-11-09 13:50:44 -04:00
Joey Hess	049e80e865	refactor	2013-10-28 14:05:55 -04:00
Joey Hess	d345e5b52f	add git fsck to cronner, and UI for repository repair (not yet wired up)	2013-10-22 16:02:52 -04:00
Joey Hess	92d5452a19	write via temp file	2013-10-14 16:15:38 -04:00
Joey Hess	296e21b381	add schedule command Mostly because it gives me an excuse and a hook to document the schedule expression format.	2013-10-13 15:40:38 -04:00
Joey Hess	88ec6eff15	add/remove/edit schedule UI working Once I built the basic widget, it turned out to be rather easy to replicate it once per scheduled activity and wire it all up to a fully working UI. This does abuse yesod's form handling a bit, but I think it's ok. And it would be nice to have it all ajax-y, so that saving one modified form won't lose any modifications to other forms. But for now, a nice simple 115 line of code implementation is a win. This late night hack session commit was sponsored by Andrea Rota.	2013-10-11 03:04:11 -04:00
Joey Hess	af5e1d0494	half way complete cronner thread to run scheduled activities	2013-10-08 11:48:28 -04:00
Joey Hess	b9375acb18	add schedule to vicfg	2013-10-07 17:11:13 -04:00
Joey Hess	29ca49dad4	add a log file for scheduled activities	2013-10-07 16:06:34 -04:00
Joey Hess	57d49a6d04	remove >=> and >=> ; use <$$> instead I forgot I had <$$> hidden away in Utility.Applicative. It allows doing the same kind of currying as does >=> and I found using it made the code more readable for me. (>=> was not used)	2013-09-27 19:58:48 -04:00
Joey Hess	c1990702e9	hlint	2013-09-25 23:19:01 -04:00
Joey Hess	4dc4a9a385	assistant: Clear the list of failed transfers when doing a full transfer scan. This prevents repeated retries to download files that are not available, or are not referenced by the current git tree. This is motivated by a user report that the assistant was repeatedly retrying transfers of files that had been deleted (in direct mode, so removing the only copy). Note that the glacier code retries failed transfers after a while to retry downloads that have aged long enough to be available. This is ok; if we're doing a full transfer scan we'll retry on every file that is still in the git tree. Also note that this makes the assistant less likely to get every file referenced by old revs of the git tree. Not something the assistant tries to ensure anyway, so I feel this is acceptable.	2013-09-25 11:46:17 -04:00
Joey Hess	eb42bde19a	sync, pre-commit, indirect: Avoid unnecessarily catting non-symlink files from git, which can be so large it runs out of memory.	2013-09-19 14:48:42 -04:00
Joey Hess	51ce7fcaf1	fix warning	2013-09-04 21:37:13 -04:00
Joey Hess	0831e18372	forget --drop-dead: Completely removes mentions of repositories that have been marked as dead from the git-annex branch. Wrote nice pure transition calculator, and ugly code to stage its results into the git-annex branch. Also had to split up several Log modules that Annex.Branch needed to use, but that themselves used Annex.Branch. The transition calculator is limited to looking at and changing one file at a time. While this made the implementation relatively easy, it precludes transitions that do stuff like deleting old url log files for keys that are being removed because they are no longer present anywhere.	2013-08-31 17:51:13 -04:00
Joey Hess	62beaa1a86	refactor git-annex branch log filename code into central location Having one module that knows about all the filenames used on the branch allows working back from an arbitrary filename to enough information about it to implement dropping dead remotes and doing other log file compacting as part of a forget transition.	2013-08-29 19:13:00 -04:00
Joey Hess	4a915cd3cd	add forget command Works, more or less. --dead is not implemented, and so far a new branch is made, but keys no longer present anywhere are not scrubbed. git annex sync fails to push the synced/git-annex branch after a forget, because it's not a fast-forward of the existing synced branch. Could be fixed by making git-annex sync use assistant-style sync branches.	2013-08-28 16:41:13 -04:00
Joey Hess	fcd5c167ef	untested transition detection on merging, and transition running code	2013-08-28 15:57:42 -04:00
Joey Hess	511cf77b6d	add transition log	2013-08-28 13:54:51 -04:00
Joey Hess	824241b6fb	better cases	2013-08-22 23:44:13 -04:00
Joey Hess	46b6d75274	Youtube support! (And 53 other video hosts) When quvi is installed, git-annex addurl automatically uses it to detect when a page is a video, and downloads the video file. web special remote: Also support using quvi, for getting files, or checking if files exist in the web. This commit was sponsored by Mark Hepburn. Thanks!	2013-08-22 18:50:43 -04:00
Joey Hess	a3224ce35b	avoid more build warnings on Windows	2013-08-04 14:05:36 -04:00
Joey Hess	93f2371e09	get rid of __WINDOWS__, use mingw32_HOST_OS The latter is harder for me to remember, but avoids build failures in code used by the configure program.	2013-08-02 12:27:32 -04:00
Joey Hess	7e66d260ea	importfeed: git-annex becomes a podcatcher in 150 LOC	2013-07-28 16:55:42 -04:00
Joey Hess	ec8cf85fcc	display "transfer already in progress" as a note	2013-07-17 16:16:17 -04:00
Joey Hess	7afd92d083	When a transfer is already being run by another process, proceed on to the next file, rather than dying.	2013-07-17 15:54:01 -04:00
Joey Hess	7a7e426352	moved AssociatedFile definition	2013-07-04 02:36:02 -04:00
Joey Hess	04d07f2c1f	--unused: New switch that makes git-annex operate on all data found by the last run of git annex unused. Supported by fsck, get, move, copy.	2013-07-03 15:26:59 -04:00
Joey Hess	bf86b5ca16	improve robustness of fromDirect and replaceFile Made fromDirect check that a file in the tree has good content (and is not a broken symlink either) before copying it to another file that has the same key. Made replaceFile clean up the temp file if the action that creates it, or the file replacement action fails.	2013-05-25 15:06:02 -04:00
Joey Hess	25a8d4b11c	rename module	2013-05-12 19:19:28 -04:00
Joey Hess	03e8594369	fix the day's windows permissions damage	2013-05-12 19:09:48 -04:00
Joey Hess	73d2f8b280	deal with git using / internally, even on DOS	2013-05-12 17:29:49 -05:00
Joey Hess	abe8d549df	fix permission damage (thanks, Windows)	2013-05-11 23:54:25 -04:00
Joey Hess	18bdff3fae	clean up from windows porting	2013-05-11 18:23:41 -04:00
Joey Hess	3c7e30a295	git-annex now builds on Windows (doesn't work)	2013-05-11 15:03:00 -05:00
Joey Hess	0ae8c82c53	per-IA-item content directories	2013-04-25 23:44:55 -04:00
Joey Hess	49547ad32d	initremote: If two existing remotes have the same name, prefer the one with a higher trust level.	2013-04-24 21:53:58 -04:00
Joey Hess	6be815a30c	rmurl: New command, removes one of the recorded urls for a file.	2013-04-22 17:18:53 -04:00
Joey Hess	9e11699c76	connect existing meters to the transfer log for downloads Most remotes have meters in their implementations of retrieveKeyFile already. Simply hooking these up to the transfer log makes that information available. Easy peasy. This is particularly valuable information for encrypted remotes, which otherwise bypass the assistant's polling of temp files, and so don't have good progress bars yet. Still some work to do here (see progressbars.mdwn changes), but this is entirely an improvement from the lack of progress bars for encrypted downloads.	2013-04-11 17:32:31 -04:00
Joey Hess	c9e4c218a6	fix invalidating the preferred content cache when changing a group The ConfigMonitor already did this, but groups can also be changed by eg, the webapp UI, so need to do it at this deeper level.	2013-04-08 16:43:06 -04:00
Joey Hess	9a5f421768	detect when unwanted remote is empty and remove it Needs fixes to build when the webapp is disabled.	2013-04-03 17:01:40 -04:00
Joey Hess	8a5b397ac4	hlint	2013-04-03 03:52:41 -04:00
Joey Hess	7b6cf1981f	show bytesComplete	2013-04-02 16:38:47 -04:00
Joey Hess	91b7de97e8	invalidated the wrong cache when setting preferred content	2013-03-31 19:00:14 -04:00
Joey Hess	67e817c6a1	New annex.largefiles setting, which configures which files `git annex add` and the assistant add to the annex. I would have sort of liked to put this in .gitattributes, but it seems it does not support multi-word attribute values. Also, making this a single config setting makes it easy to only parse the expression once. A natural next step would be to make the assistant `git add` files that are not annex.largefiles. OTOH, I don't think `git annex add` should `git add` such files, because git-annex command line tools are not in the business of wrapping git command line tools.	2013-03-29 16:17:13 -04:00
Joey Hess	cf07a2c412	webapp: Progress bar fixes for many types of special remotes. There was confusion in different parts of the progress bar code about whether an update contained the total number of bytes transferred, or the number of bytes transferred since the last update. One way this bug showed up was progress bars that seemed to stick at zero for a long time. In order to fix it comprehensively, I add a new BytesProcessed data type, that is explicitly a total quantity of bytes, not a delta. Note that this doesn't necessarily fix every problem with progress bars. Particularly, buffering can now cause progress bars to seem to run ahead of transfers, reaching 100% when data is still being uploaded.	2013-03-28 17:04:37 -04:00
Joey Hess	e9048ecec8	get, copy, move: Display an error message when an identical transfer is already in progress, rather than failing with no indication why.	2013-03-19 13:56:20 -04:00
Joey Hess	b543842a7f	optimisation for transfers to drives that are not plugged in Rather than forking a git-annex transferkey only to have it fail, just immediately record the failed transfer (so when the drive is plugged in, the scan will retry it).	2013-03-18 20:40:24 -04:00
Joey Hess	a1b6d2e057	show an error message if garbage is provided to dropunused	2013-03-03 20:04:24 -04:00
Joey Hess	46c9cbeb1e	add additional debug info about reasons for transfers	2013-03-01 15:23:59 -04:00
Joey Hess	24316f6562	improve imports	2013-02-27 21:48:46 -04:00
Joey Hess	a2f17146fa	move Arbitrary instances out of Test and into modules that define the types This is possible now that we build-depend on QuickCheck.	2013-02-27 21:42:07 -04:00
Joey Hess	4008590c68	type based git config handling for remotes Still a couple of places that use git config ad-hoc, but this is most of it done.	2013-01-01 13:58:14 -04:00
Joey Hess	1702409f00	check	2012-12-20 00:08:30 -04:00
Joey Hess	df90a2acd5	another quickcheck	2012-12-20 00:02:33 -04:00
Joey Hess	8491917d04	more quickcheck fun and the code gets better..	2012-12-19 22:14:12 -04:00
Joey Hess	bf71d42681	quickcheck test for transfer info read/write code Fixed a bug the quickcheck turned up.	2012-12-19 16:15:39 -04:00
Joey Hess	7da2e27293	Bugfix: Fixed bug parsing transfer info files The newline after the filename was included in it. This was generally benign -- mostly these filenames are just displayed, and the newline didn't matter. But in the assistant, it caused unexpected dropping of preferred content. A characteristic of this bug is that the drop was displayed like this: drop some_file ok	2012-12-19 14:17:01 -04:00
Joey Hess	ffdd08fd2e	Merge branch 'master' into desymlink	2012-12-13 00:46:10 -04:00
Joey Hess	0d50a6105b	whitespace fixes	2012-12-13 00:45:27 -04:00
Joey Hess	e7b8cb0063	direct mode committing	2012-12-12 19:20:38 -04:00
Joey Hess	99a8a5297c	--auto fixes * get/copy --auto: Transfer data even if it would exceed numcopies, when preferred content settings want it. * drop --auto: Fix dropping content when there are no preferred content settings.	2012-12-06 13:22:16 -04:00
Joey Hess	ea5d7292e6	dropping from web	2012-11-29 17:01:07 -04:00
Joey Hess	2172cc586e	where indenting	2012-11-11 00:51:07 -04:00
Joey Hess	ec337baaee	add trustExclude	2012-11-11 00:24:32 -04:00
Joey Hess	c6fbed48a1	bugfix: Don't fail transferring content from read-only repos. Closes: #691341 This used to work, but got broken when the transfer info files were added, as it failed writing them on the readonly filesystem.	2012-10-24 10:59:25 -04:00
Joey Hess	452e6819d0	!! removal	2012-10-21 00:51:42 -04:00
Joey Hess	c7c2015435	add ConfigMonitor thread Monitors git-annex branch for changes, which are noticed by the Merger thread whenever the branch ref is changed (either due to an incoming push, or a local change), and refreshes cached config values for modified config files. Rate limited to run no more often than once per minute. This is important because frequent git-annex branch changes happen when files are being added, or transferred, etc. A primary use case is that, when preferred content changes are made, and get pushed to remotes, the remotes start honoring those settings. Other use cases include propagating repository description and trust changes to remotes, and learning when a remote has added a new special remote, so the webapp can present the GUI to enable that special remote locally. Also added a uuid.log cache. All other config files already had caches.	2012-10-20 16:43:35 -04:00
Joey Hess	40aab719df	Replace "in=" with "present" in preferred content expressions in= was problematic in two ways. First, it referred to a remote by name, but preferred content expressions can be evaluated elsewhere, where that remote doesn't exist, or a different remote has the same name. This name lookup code could error out at runtime. Secondly, in= seemed pretty useless. in=here did not cause content to be gotten, but it did let present content be dropped. present is more useful, although "not present" is unstable and should be avoided.	2012-10-19 16:09:21 -04:00
Joey Hess	e7780a39f5	Preferred content path matching bugfix. When in a subdir, both the normal filepath, and the filepath relative to the top of the git repo are needed for matching. The former for key lookup, and the latter for include/exclude to match against. Previously, key lookup didn't work in this situation.	2012-10-17 16:01:09 -04:00
Joey Hess	c78975babb	avoid duplicate code with a more generic monadic matcher Interesting type signature ghc derived for this: forall o (m :: * -> *). Monad m => Matcher o -> (o -> m Bool) -> m Bool	2012-10-13 15:17:15 -04:00
Joey Hess	7aef34f501	implement saving of repository settings	2012-10-10 19:13:49 -04:00
Joey Hess	4e2e08b45a	ui for selecting a repository group	2012-10-10 16:23:41 -04:00
Joey Hess	39be7eea40	add standard group selector to repo edit form	2012-10-10 16:04:28 -04:00
Joey Hess	9da7dd8874	webapp: configure new repos to use the standard preferred content settings	2012-10-10 15:35:10 -04:00
Joey Hess	3490977d97	webapp: put new repos in standard groups I'm using transfer for most things, both removable drives and cloud storage, because it's the safest choice. We'll see if it makes sense to prompt for the group when setting this up, or let the user pick something else after the fact.	2012-10-10 15:27:25 -04:00
Joey Hess	f9b81c7a75	refactor	2012-10-10 15:15:56 -04:00
Joey Hess	0c88d9395d	standard preferred content settings for client, transfer, backup, and archive repositories I've designed these to work well together, I hope. If I get it wrong, I can just change the code in one place, since these expressions won't be stored in the git-annex branch.	2012-10-10 13:54:40 -04:00
Joey Hess	b6ce003843	rename --ingroup to --inallgroup	2012-10-10 12:59:45 -04:00
Joey Hess	e375b931c0	add --ingroup limit	2012-10-08 15:18:58 -04:00
Joey Hess	7cd81bd978	Added --smallerthan and --largerthan limits	2012-10-08 13:39:18 -04:00
Joey Hess	71fd18a97f	wired preferred content up to get, copy, and drop --auto	2012-10-08 13:16:53 -04:00
Joey Hess	7bb4d507ba	add AssumeNotPresent parameter to limits Solves the issue with preferred content expressions and dropping that I mentioned yesterday. My solution was to add a parameter to specify a set of repositories where content should be assumed not to be present. When deciding whether to drop, it can put the current repository in, and then if the expression fails to match, the content can be dropped. Using yesterday's example "(not copies=trusted:2) and (not in=usbdrive)", when the local repo is one of the 2 trusted copies, the drop check will see only 1 trusted copy, so the expression matches, and so the content will not be dropped.	2012-10-05 16:52:44 -04:00
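
The drop check described in commit 7bb4d507ba can be sketched roughly as follows. This is an illustrative Python model, not git-annex's Haskell code: `Repo`, `wants_content`, and `can_drop` are hypothetical names, and the example expression from the commit message is hard-coded rather than parsed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repo:
    """Hypothetical stand-in for a repository that may hold a copy."""
    name: str
    trusted: bool


def wants_content(locations, assume_not_present=frozenset()):
    """Evaluate "(not copies=trusted:2) and (not in=usbdrive)".

    Repositories listed in assume_not_present are treated as if they
    did not hold the content, whatever the location log says.
    """
    present = set(locations) - set(assume_not_present)
    trusted_copies = sum(1 for repo in present if repo.trusted)
    in_usbdrive = any(repo.name == "usbdrive" for repo in present)
    return trusted_copies < 2 and not in_usbdrive


def can_drop(here, locations):
    """Content may be dropped only if the expression fails to match
    once the local copy is assumed to be gone."""
    return not wants_content(locations, assume_not_present={here})


here = Repo("here", trusted=True)
server = Repo("server", trusted=True)
backup = Repo("backup", trusted=True)

# The local repo holds one of only two trusted copies: keep it.
print(can_drop(here, {here, server}))
# A third trusted copy exists elsewhere: the local copy can go.
print(can_drop(here, {here, server, backup}))
```

Without the assumed-absent set, the first case would count two trusted copies, the expression would fail to match, and the local copy would be dropped, leaving a single trusted copy behind.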


239 commits
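
The pile-up heuristics mentioned in commit 3da0064657 (1000 unused files, 10% of disk used by unused files, or more disk wasted by unused files than is left free) can be sketched as a single predicate. This is an illustrative Python sketch with a hypothetical function name and parameters. It assumes that "10% of disk" means 10% of total disk capacity and that the thresholds are inclusive, neither of which the commit message states.

```python
def unused_files_piling_up(unused_count, unused_bytes,
                           disk_total_bytes, disk_free_bytes):
    """Return True if any of the three heuristics is triggered.

    Assumptions (not confirmed by the commit message): the 10% figure
    is relative to total disk capacity, and reaching a threshold
    exactly counts as exceeding it.
    """
    too_many_files = unused_count >= 1000
    # Integer arithmetic avoids floating point rounding at the boundary.
    too_much_of_disk = unused_bytes * 10 >= disk_total_bytes
    more_than_free = unused_bytes > disk_free_bytes
    return too_many_files or too_much_of_disk or more_than_free


GIB = 1024 ** 3
# 5 GiB of unused files on a 500 GiB disk with only 2 GiB free:
# the "more wasted than free" rule fires.
print(unused_files_piling_up(40, 5 * GIB, 500 * GIB, 2 * GIB))
```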