git-annex

Author	SHA1	Message	Date
Joey Hess	2cecc8d2a3	Added GIT_ANNEX_VECTOR_CLOCK environment variable Can be used to override the default timestamps used in log files in the git-annex branch. This is a dangerous environment variable; use with caution. Note that this only affects writing to the logs on the git-annex branch. It is not used for metadata in git commits (other env vars can be set for that). There are many other places where timestamps are still used, that don't get committed to git, but do touch disk. Including regular timestamps of files, and timestamps embedded in some files in .git/annex/, including the last fsck timestamp and timestamps in transfer log files. A good way to find such things in git-annex is to get for getPOSIXTime and getCurrentTime, although some of the results are of course false positives that never hit disk (unless git-annex gets swapped out..) So this commit does NOT necessarily make git-annex comply with some HIPPA privacy regulations; it's up to the user to determine if they can use it in a way compliant with such regulations. Benchmarking: It takes 0.00114 milliseconds to call getEnv "GIT_ANNEX_VECTOR_CLOCK" when that env var is not set. So, 100 thousand log files can be written with an added overhead of only 0.114 seconds. That should be by far swamped by the actual overhead of writing the log files and making the commit containing them. This commit was supported by the NSF-funded DataLad project.	2017-08-14 14:19:58 -04:00
Joey Hess	bcf276655c	Keys marked as dead are now skipped by --all. fsck already special-cased dead keys to make --all not report errors with them, and it makes sense to also expand that to whereis. I think it makes sense for dead keys to be skipped by all uses of --all, so mistakes can be completely forgotten about and not come back to haunt us. The speed impact of testing if the key is dead is negligible for fsck and whereis, since they use the location log anyway and it gets cached. This does slow down a few commands that support --all, in particular metadata --all runs around 2x as slow. I don't think metadata --all is often used though. It might slow down copy/move/mirror --all and get --all. log --all is not affected (does not use the normal --all machinery). Dead keys will still be processed by --incomplete, --branch, --failed, and --key. Although it would be unlikely for a dead key to ave in incomplete or failed transfer. It seems to make perfect sense for --branch to process keys on the branch, even if dead. (fsck's special-casing of dead keys was left in, so if one of these options causes a dead key to be fscked, there will be a nice message.) This commit was supported by the NSF-funded DataLad project.	2017-05-09 12:55:21 -04:00
Joey Hess	c3970f6c1a	multicast: New command, uses uftp to multicast annexed files, for eg a classroom setting. This commit was supported by the NSF-funded DataLad project.	2017-03-30 19:35:30 -04:00
Joey Hess	1c4e5f65fc	Drop support for building with old versions of directory, feed, and http-types.	2017-03-10 15:57:41 -04:00
Joey Hess	c8e1e3dada	AssociatedFile newtype To prevent any further mistakes like `301aff34c4` This commit was sponsored by Francois Marier on Patreon.	2017-03-10 13:35:31 -04:00
Joey Hess	27eca014be	fix up Read instance incompatability caused by recent commit `9c4650358c` changed the Read instance for Key. I've checked all uses of that instance (by removing it and seeing what breaks), and they're all limited to the webapp, except one. That is GitAnnexDistribution's Read instance. So, `9c4650358c` would have broken upgrades of git-annex from downloads.kitenet.net. Once the .info files there got updated for a new release, old releases would have failed to parse them and never upgraded. To fix this, I found a way to make the .info files that contain GitAnnexDistribution values be readable by the old version of git-annex. This commit was sponsored by Ewen McNeill.	2017-02-24 18:59:12 -04:00
Joey Hess	ed56dba868	annex.autocommit can be configured via git-annex config ... to control the default behavior in all clones of a repository. This includes a new Configurable data type, so the GitConfig type indicates which values can be configured this way. The implementation should be quite efficient; the config log is only read once, and only when a Configurable value has not already been set by git-config. Indeed, it would be nice in the future to extend this, so that git-config is itself only read on demand. Some commands may not need to look at the git configuration at all. This commit was sponsored by Trenton Cronholm on Patreon.	2017-02-03 13:58:53 -04:00
Joey Hess	9eb10caa27	Some optimisations to string splitting code. Turns out that Data.List.Utils.split is slow and makes a lot of allocations. Here's a much simpler single character splitter that behaves the same (even in wacky corner cases) while running in half the time and 75% the allocations. As well as being an optimisation, this helps move toward eliminating use of missingh. (Data.List.Split.splitOn is nearly as slow as Data.List.Utils.split and allocates even more.) I have not benchmarked the effect on git-annex, but would not be surprised to see some parsing of eg, large streams from git commands run twice as fast, and possibly in less memory. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2017-01-31 19:06:22 -04:00
Joey Hess	5676d267b5	forgot to add this new source file	2017-01-30 17:36:45 -04:00
Joey Hess	26d23e38f1	vicfg: Include the numcopies configuation. Docs say vicfg can configure everything from git-annex branch, so it ought to configure numcopies. Note that commenting out existing numcopies does not unset it. This commit was sponsored by Thom May on Patreon.	2017-01-30 15:27:25 -04:00
Joey Hess	8484c0c197	Always use filesystem encoding for all file and handle reads and writes. This is a big scary change. I have convinced myself it should be safe. I hope!	2016-12-24 14:46:31 -04:00
Joey Hess	0a4479b8ec	Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. ghc 8 added backtraces on uncaught errors. This is great, but git-annex was using error in many places for a error message targeted at the user, in some known problem case. A backtrace only confuses such a message, so omit it. Notably, commands like git annex drop that failed due to eg, numcopies, used to use error, so had a backtrace. This commit was sponsored by Ethan Aubin.	2016-11-15 21:29:54 -04:00
Joey Hess	7cae6c746c	Optimised git-annex branch log file timestamp parsing. 10% speedup This sped up git annex find --not --in web from 6.64s to 5.69s. The optimised parser is probably more like 50% faster than the general one it replaced.	2016-09-29 14:04:53 -04:00
Joey Hess	c9082cf0e4	move Arbitrary instance to new Types.Transfer module Avoid orphan instance warning	2016-09-05 14:52:06 -04:00
Joey Hess	d17f08afdc	avoid warning about orphan Arbirary instance	2016-09-05 14:51:07 -04:00
Joey Hess	1a0e2c9901	get, move, copy, mirror: Added --failed switch which retries failed copies/moves Note that get --from foo --failed will get things that a previous get --from bar tried and failed to get, etc. I considered making --failed only retry transfers from the same remote, but it was easier, and seems more useful, to not have the same remote requirement. Noisy due to some refactoring into Types/	2016-08-03 12:37:12 -04:00
Joey Hess	c4d011bf3e	log: Added --all option.	2016-07-17 15:15:08 -04:00
Joey Hess	176cd98293	remove \r from Arbitrary for log tests	2016-05-27 12:04:49 -04:00
Joey Hess	eba68572dc	Split lines in the git-annex branch on \r as well as \n, to deal with \r\n terminated lines written by some versions of git-annex on Windows. This fixes strange displays in some cases, including whereis showing many duplicate locations, and showing more total copies than actually exist. It's unknown if that lead to data loss when eg, dropping. At the moment, it seems unlikely it could, since the UUID with \r's appended is not the same as a UUID without, and so no remote matches it. It's also unknown if \r's can leak in on windows, perhaps when merging the git-annex branch.	2016-05-27 11:45:13 -04:00
Joey Hess	823c28d2dc	nub transitionList to avoid ugly message after repeated transitions, and avoid redundant work for repeated ForgetDeadRemotes transitions	2016-05-18 12:26:38 -04:00
Joey Hess	8ab27235ea	reinject: Added new mode which can reinject known files into the annex. For example: git-annex reinject --known /mnt/backup/*	2016-04-22 13:49:32 -04:00
Joey Hess	403b56fb91	Limit annex.largefiles parsing to the subset of preferred content expressions that make sense in its context. So, not "standard" or "lackingcopies", etc.	2016-02-03 15:04:42 -04:00
Joey Hess	cdf5977053	simplify	2016-02-03 13:23:34 -04:00
Joey Hess	737e45156e	remove 163 lines of code without changing anything except imports	2016-01-20 16:36:33 -04:00
Joey Hess	1f3358512a	refactor	2016-01-19 15:55:32 -04:00
Joey Hess	983c1894eb	avoid unnecessary reading of git-annex branch data when matching on annex.largefiles This makes git annex clean not look at the git-annex branch at all, and so speeds it up by 50% or more.	2015-12-04 15:06:41 -04:00
Joey Hess	b0626230b7	fix use of hifalutin terminology	2015-11-16 14:37:31 -04:00
Joey Hess	aaf1ef268d	convert from Utility.LockPool to Annex.LockPool everywhere	2015-11-12 18:13:37 -04:00
Joey Hess	f9adb905fc	Avoid unncessary write to the location log when a file is unlocked and then added back with unchanged content. Implemented with no additional overhead of compares etc. This is safe to do for presence logs because of their locality of change; a given repo's presence logs are only ever changed in that repo, or in a repo that has just been actively changing the content of that repo. So, we don't need to worry about a split-brain situation where there'd be disagreement about the location of a key in a repo. And so, it's ok to not update the timestamp when that's the only change that would be made due to logging presence info.	2015-10-12 14:46:47 -04:00
Joey Hess	6fbabfcf16	oops, didn't mean to commit this debug	2015-10-06 17:28:20 -04:00
Joey Hess	ba7ecf68c0	analysis	2015-10-06 17:11:52 -04:00
Joey Hess	16947ef654	Fix bug in combination of preferred and required content settings. When one was set to the empty string and the other set to some expression, this bug caused all files to be wanted, instead of only files matching the expression. Avoid: MAny `MOr` otherexpression Which matches anything.	2015-09-15 12:50:14 -04:00
Joey Hess	6e829939e9	add test case that all standard group preferred content expressions parse	2015-06-17 13:44:19 -04:00
Joey Hess	5c960601aa	4 ns optimisation of repeated calls to hasDifference on the same Differences I want this as fast as possible, so it can be added to code paths without slowing them down. Avoid the set lookup, and rely on laziness, drops runtime from 14.37 ns to 11.03 ns according to this criterion benchmark: import Criterion.Main import qualified Types.Difference as New import qualified Types.DifferenceOld as Old main :: IO () main = defaultMain [ bgroup "hasDifference" [ bench "new" $ whnf (New.hasDifference New.OneLevelObjectHash) new , bench "old" $ whnf (Old.hasDifference Old.OneLevelObjectHash) old ] ] where s = "fromList [ObjectHashLower, OneLevelObjectHash, OneLevelBranchHash]" new = New.readDifferences s old = Old.readDifferences s A little bit of added boilerplate, but I suppose it's worth it to not need to worry about set lookup overhead. Note that adding more differences would slow down the old implementation; the new implementation will run the same speed.	2015-06-11 16:34:35 -04:00
Joey Hess	f8ab3bc449	dead --key: Can be used to mark a key as dead.	2015-06-09 14:52:05 -04:00
Joey Hess	6eefc5db65	fsck: Ignore keys that are known to be dead when running in --all mode or a in a bare repo. Otherwise, still reports files with lost contents, even if the content is dead.	2015-06-09 14:08:57 -04:00
Joey Hess	53ede1a10e	parse X in location log file as indicating a dead key A dead key is both not present at the location that thinks it has a copy, and also is assumed to probably not be present anywhere else. Although there may be lurking disconnected repos that somehow still have a copy. Suprisingly few changes needed for this! This is because the presence log code only really concerns itself with keys that are present, and dead keys are not present. Note that both the location and web log can be parsed as having a dead key. I don't see any value to having keys listed as dead in the web log, but since it doesn't change any behavior, there was no point in not parsing it.	2015-06-09 13:28:30 -04:00
Joey Hess	6383d22ffa	remove back-compat code for old version of containers Already b-d on a newer version.	2015-06-06 15:23:53 -04:00
Joey Hess	87f28bb2ea	ignore failure to clean up stale transfer lock file Perhaps due to permissions problem, or perhaps a race with another process also cleaning up.	2015-05-19 23:46:42 -04:00
Joey Hess	9de5cd2966	fix crash in stale transfer lockfile cleanup code Need to differentiate between the lockfile not being locked, and it not existing.	2015-05-19 23:35:24 -04:00
Joey Hess	ecb0d5c087	use lock pools throughout git-annex The one exception is in Utility.Daemon. As long as a process only daemonizes once, which seems reasonable, and as long as it avoids calling checkDaemon once it's already running as a daemon, the fcntl locking gotchas won't be a problem there. Annex.LockFile has it's own separate lock pool layer, which has been renamed to LockCache. This is a persistent cache of locks that persist until closed. This is not quite done; lockContent stil needs to be converted.	2015-05-19 14:09:52 -04:00
Joey Hess	6915b71c57	lock pools to work around non-concurrency/composition safety of POSIX fcntl	2015-05-18 15:57:17 -04:00
Joey Hess	7ebf234616	Stale transfer lock and info files will be cleaned up automatically when get/unused/info commands are run. Deleting lock files is tricky, tricky stuff. I think I got it right!	2015-05-12 20:11:23 -04:00
Joey Hess	643b233860	an optimization that also fixes a reversion This is a little optimisation; avoid loading the info file for the download of the current key when checking for other downloads. The reversion it fixes is sorta strange. `a812d598ef` broke checking for transfers that were already in progress. Indeed, the transfer lock was not held after getTransfers was called. Why? I think it's magic in ghc's handling of getLock and setLock, although it's hard to tell since those functions are almost entirely undocumented as to their semantics. Something, either the RTS (or maybe it's linux?) notices that the same process has taken a lock and is now calling getLock on a FD attached to the same file. So, it drops the lock. So, this optimisation avoids that problematic behavior.	2015-05-12 18:34:49 -04:00
Joey Hess	a812d598ef	Take space that will be used by running downloads into account when checking annex.diskreserve.	2015-05-12 15:20:22 -04:00
Joey Hess	03667a162a	couple of AMP warnings I missed before	2015-05-10 16:51:03 -04:00
Joey Hess	ec267aa1ea	rejigger imports for clean build with ghc 7.10's AMP changes The explict import Prelude after import Control.Applicative is a trick to avoid a warning.	2015-05-10 16:20:30 -04:00
Joey Hess	6c2d5b5e41	more time-1.5 fixes	2015-05-10 15:36:58 -04:00
Joey Hess	33a2264546	fix build warning with time 1.5	2015-05-10 15:28:23 -04:00
Joey Hess	a5a53ca011	forgot to add new module	2015-05-10 15:23:38 -04:00
Joey Hess	6cf62a9bde	support time-1.5.0 This no longer uses old-locale's defaultTimeLocale, but provides one of its own. Factored out a Logs.TimeStamp.	2015-05-10 15:21:35 -04:00
Joey Hess	06211738c1	Fix activity log parsing. I had some cargo culting in there that used the wrong type, so it failed to parse old logs, and overwrote them with the new log.	2015-04-09 21:02:38 -04:00
Joey Hess	9445556c97	rethought distributed fsck; instead add activity.log and expire command This is much more space efficient!	2015-04-05 12:50:02 -04:00
Joey Hess	656fc1c881	fsck: Added --distributed and --expire options, for distributed fsck.	2015-04-01 17:53:16 -04:00
Joey Hess	9e25cbde20	importfeed: Avoid downloading a redundant item from a feed whose guid has been downloaded before, even when the url has changed. To support this, always store itemid in metadata; before this was only done when annex.genmetadata was set.	2015-03-31 13:30:13 -04:00
Joey Hess	f0195b2a43	Fix GETURLS in external special remote protocol to strip downloader prefix from logged url info before checking for the specified prefix. This doesn't change what GETURLS returns, but only whether it matches any prefix that the external special remote asked for.	2015-03-27 18:49:03 -04:00
Joey Hess	6045406deb	Added SETURIPRESENT and SETURIMISSING to external special remote protocol Useful for things like ipfs that don't use regular urls. An external special remote can add a regular url to a key, and then git-annex get will download it from the web. But for ipfs, we want to instead tell git-annex that the uri uses OtherDownloader. Before this change, the external special remote protocol lacked a way to do that.	2015-03-05 13:50:15 -04:00
Joey Hess	b0575c621f	implement annex.tune.branchhash1 I hope this doesn't impact speed much -- it does have to pull out a value from Annex state every time it accesses the branch now. The test case I dropped has never caught any problems that I can remember, and would have been rather difficult to convert.	2015-01-28 17:17:26 -04:00
Joey Hess	e8c376e0ad	import Data.Default in Common	2015-01-28 16:11:28 -04:00
Joey Hess	037d86e046	refactor	2015-01-28 13:56:38 -04:00
Joey Hess	ba3825441c	rework Differences data type Eliminated complexity and future proofed. The most important change is that all functions over Difference are now total; any Difference that can be expressed should be handled. Avoids needs for sanity checking of inputs, and version skew with the future. Also, the difference.log now serializes a [Difference], not a Differences. This saves space and keeps it simpler. Note that [Difference] might contain conflicting differences (eg, [Version5, Version6]. In this case, one of them needs to consistently win over the others, probably based on Ord.	2015-01-28 13:50:02 -04:00
Joey Hess	70736d2b41	Repository tuning parameters can now be passed when initializing a repository for the first time. * init: Repository tuning parameters can now be passed when initializing a repository for the first time. For details, see http://git-annex.branchable.com/tuning/ * merge: Refuse to merge changes from a git-annex branch of a repo that has been tuned in incompatable ways.	2015-01-27 17:38:06 -04:00
Joey Hess	afc5153157	update my email address and homepage url	2015-01-21 12:50:09 -04:00
Joey Hess	3bab5dfb1d	revert parentDir change Reverts `965e106f24` Unfortunately, this caused breakage on Windows, and possibly elsewhere, because parentDir and takeDirectory do not behave the same when there is a trailing directory separator.	2015-01-09 13:11:56 -04:00
Joey Hess	965e106f24	made parentDir return a Maybe FilePath; removed most uses of it parentDir is less safe than takeDirectory, especially when working with relative FilePaths. It's really only useful in loops that want to terminate at / This commit was sponsored by Audric SCHILTKNECHT.	2015-01-06 18:55:56 -04:00
Joey Hess	43dc7f678f	setpresentkey: A new plumbing-level command.	2014-12-29 15:16:40 -04:00
Joey Hess	589a048a7d	fix addurl behavior when location and url logs are inconsistent The url log could have an url for a key, while the location log thinks it's not present in the web. In this case, addurl --file url would not do anything. Fixed it to re-add the web as a location. I don't know how this situation could arise, but I saw it in the wild in the conference_proceedings repo, affecting key URL-s17806003--http://mirror.linux.org.au/pub/linux.conf.au/2014/Wednesday/53-Building_Effective_Alliances_around_the_Trans-Pacific_Partnershi-c0505b631127ccc67e38e637344d988e Investigating the presence log, it looked like that key was originally listed as present in the web, then in commit 56abf9e9f3e691ed9d83513037d4019313321ca3 someone else's git-annex set it and some other things to not present in the web. It would be interesting to know what that user did, but I doubt I'll be able to find out. All I can tell from this investigation is that the inconsistency was not introduced when originally addurl-ing the url.	2014-12-29 14:22:47 -04:00
Joey Hess	7e422269a6	move dummy uuids to Annex.UUID	2014-12-17 13:57:52 -04:00
Joey Hess	a7690de016	Added bittorrent special remote addurl behavior change: When downloading an url ending in .torrent, it will download files from bittorrent, instead of the old behavior of adding the torrent file to the repository. Added Recommends on aria2 and bittornado \| bittorrent. This commit was sponsored by Asbjørn Sloth Tønnesen.	2014-12-16 23:22:46 -04:00
Joey Hess	30bf112185	Urls can now be claimed by remotes. This will allow creating, for example, a external special remote that handles magnet: and *.torrent urls.	2014-12-08 19:15:07 -04:00
Joey Hess	cb6e16947d	add stub claimUrl	2014-12-08 13:40:15 -04:00
Joey Hess	8093008ef4	External special remote protocol now includes commands for setting and getting the urls associated with a key.	2014-12-08 13:32:46 -04:00
Joey Hess	db9121ecee	vicfg: Deleting configurations now resets to the default, where before it has no effect. Added a Default instance for TrustLevel, and was able to use that to clear up several other parts of the code too. This commit was sponsored by Stephan Schulz	2014-10-14 14:15:07 -04:00
Joey Hess	9fd95d9025	indent with tabs not spaces Found these with: git grep "^ " $(find -type f -name \*.hs) \|grep -v ': where' Unfortunately there is some inline hamlet that cannot use tabs for indentation. Also, Assistant/WebApp/Bootstrap3.hs is a copy of a module and so I'm leaving it as-is.	2014-10-09 15:09:26 -04:00
Joey Hess	7b50b3c057	fix some mixed space+tab indentation This fixes all instances of " \t" in the code base. Most common case seems to be after a "where" line; probably vim copied the two space layout of that line. Done as a background task while listening to episode 2 of the Type Theory podcast.	2014-10-09 15:09:11 -04:00
Joey Hess	b874f84086	New annex.hardlink setting. Closes: #758593 * New annex.hardlink setting. Closes: #758593 * init: Automatically detect when a repository was cloned with --shared, and set annex.hardlink=true, as well as marking the repository as untrusted. Had to reorganize Logs.Trust a bit to avoid a cycle between it and Annex.Init.	2014-09-05 13:44:09 -04:00
Joey Hess	59eae904b1	final scary locking refactoring (for now) Note that while before checkTransfer this called getLock with WriteLock, getLockStatus's use of ReadLock will also notice any exclusive locks. Since transfer info files are only locked exclusively, never shared, there is no behavior change. Also, fixes checkLocked to actually return Just False when the file exists, but is not locked.	2014-08-20 19:30:40 -04:00
Joey Hess	ec7dd0446a	more lock file refactoring	2014-08-20 17:03:04 -04:00
Joey Hess	c784ef4586	unify exception handling into Utility.Exception Removed old extensible-exceptions, only needed for very old ghc. Made webdav use Utility.Exception, to work after some changes in DAV's exception handling. Removed Annex.Exception. Mostly this was trivial, but note that tryAnnex is replaced with tryNonAsync and catchAnnex replaced with catchNonAsync. In theory that could be a behavior change, since the former caught all exceptions, and the latter don't catch async exceptions. However, in practice, nothing in the Annex monad uses async exceptions. Grepping for throwTo and killThread only find stuff in the assistant, which does not seem related. Command.Add.undo is changed to accept a SomeException, and things that use it for rollback now catch non-async exceptions, rather than only IOExceptions.	2014-08-07 22:03:29 -04:00
Joey Hess	f4f82e2741	deriving Show	2014-08-01 16:30:33 -04:00
Joey Hess	80cc554c82	add ChunkMethod type and make Logs.Chunk use it, rather than assuming fixed size chunks (so eg, rolling hash chunks can be supported later) If a newer git-annex starts logging something else in the chunk log, it won't be used by this version, but it will be preserved when updating the log.	2014-07-28 13:19:08 -04:00
Joey Hess	014794f4ed	improve a bit	2014-07-24 17:18:14 -04:00
Joey Hess	e2c44bf656	implement chunk logs Slightly tricky as they are not normal UUIDBased logs, but are instead maps from (uuid, chunksize) to chunkcount. This commit was sponsored by Frank Thomas.	2014-07-24 16:23:36 -04:00
Joey Hess	d0c1a22e7c	import metadata from feeds When annex.genmetadata is set, metadata from the feed is added to files that are imported from it. Reused the same feedtitle and itemtitle, feedauthor, itemauthor, etc names that are used in --template. Also added title and author, which are the item title/author if available, falling back to the feed title/author. These are more likely to be common metadata fields. (There is a small bit of dupication here, but once git gets around to packing the object, it will compress it away.) The itempubdate field is not included in the metadata as a string; instead it is used to generate year and month fields, same as is done when adding files with annex.genmetadata set. This commit was sponsored by Amitai Schlair, who cooincidentially is responsible for ikiwiki generating nice feed metadata!	2014-07-03 14:15:00 -04:00
Joey Hess	c00b459f3a	unused: Avoid checking view branches for unused files. This avoids a potential slowdown when using lots of views. I think that it makes sense for unused to ignore (local) view branches, since these are by definition supposed to be views of an existing branch, so looking at the branch should be sufficient (and if the view is out of date and has files that have since been deleted from the branch, the user's intent is not to preserve those from unused reaping).	2014-06-04 14:03:41 -04:00
Joey Hess	9eaabf0382	webapp: avoid overwriting remote configs when enabling it Avoid stomping on existing group and preferred content settings when enabling or combining with an already existing remote. Two level fix. First, use defaultStandardGroup rather than setStandardGroup, so if there is an existing configuration in the git-annex branch, it's not overwritten. To handle pre-existing ssh remotes (including gcrypt), a second level is needed, because before syncing with the remote, it's configuration won't be available locally. (And syncing could take a long time.) So, in this case, keep track of whether the remote is being created or enabled, and only set configs when creating it. This commit was sponsored by Anders Lannerback.	2014-05-30 14:03:04 -04:00
Joey Hess	065248f3d2	Added required content configuration. This includes checking when dropping files that any required content configuration is satisfied. However, it does not yet include an active check on the required content; the location log is trusted when checking the required content expression.	2014-03-29 16:03:33 -04:00
Joey Hess	fe19e15040	reorg matcher types; no non-type code changes	2014-03-29 14:43:34 -04:00
Joey Hess	e426fac273	add desktop notifications Motivation: Hook scripts for nautilus or other file managers need to provide the user with feedback that a file is being downloaded. This commit was sponsored by THM Schoemaker.	2014-03-22 14:12:19 -04:00
Joey Hess	ed30b81e2c	Improve behavior when unable to parse a preferred content expression (thanks, ion). Fall back to "present" as the preferred conent expression, which will not result in any content movement.	2014-03-20 00:10:12 -04:00
Joey Hess	f64c2d6138	toplevel lastchanged field	2014-03-19 19:10:55 -04:00
Joey Hess	6848f09a12	better timestamp format	2014-03-18 19:01:50 -04:00
Joey Hess	caa97d1271	Each for each metadata field, there's now an automatically maintained "$field-lastchanged" that gives the timestamp of the last change to that field. Note that this is a nearly entirely free feature. The data was already stored in the metadata log in an easily accessible way, and already was parsed to a time when parsing the log. The generation of the metadata fields may even be done lazily, although probably not entirely (the map has to be evaulated to when queried).	2014-03-18 18:55:43 -04:00
Joey Hess	6a4dd42328	finish wiring up groupwanted	2014-03-15 17:08:55 -04:00
Joey Hess	417aea25be	vicfg: Allows editing preferred content expressions for groups. This is stored in the git-annex branch, but not yet actually hooked up and used.	2014-03-15 16:17:01 -04:00
Joey Hess	431d805a96	factored out a generic MapLog from uuid-based logs UUIDBased is just a MapLog with a UUID for the field.	2014-03-15 13:45:25 -04:00
Joey Hess	b7eb1d834a	Avoid encoding errors when using the unused log file.	2014-03-15 11:57:27 -04:00
Joey Hess	3551d40b05	"standard" can now be used as a first-class keyword in preferred content expressions. For example "standard or (include=otherdir/*)" or even "not standard" Note that the implementation avoids any potential for loops (if a standard preferred content expression itself mentioned standard). This commit was sponsored by Jochen Bartl.	2014-03-14 15:04:33 -04:00
Joey Hess	67f09bca6d	fully fix fsck memory use by iterative fscking Not very well tested, but I'm sure it doesn't eg, loop forever.	2014-03-12 15:18:43 -04:00
Joey Hess	9f27339e80	remove uninofrmative warning dateUnusedLog is only used to show a timestamp in the webapp, so not worth a warning	2014-03-12 12:42:51 -04:00
Joey Hess	c2e8c21ca6	view, vfilter: Add support for filtering tags and values out of a view, using !tag and field!=value. Note that negated globs are not supported. Would have complicated the code to add them, without changing the data type serialization in a non-backwards-compatable way. This commit was sponsored by Denver Gingerich.	2014-03-02 14:53:19 -04:00
Joey Hess	a1432bce2f	Put non-object tmp files in .git/annex/misctmp, leaving .git/annex/tmp for only partially transferred objects. This allows eg, putting .git/annex/tmp on a ram disk, if the disk IO of temp object files is too annoying (and if you don't want to keep partially transferred objects across reboots). .git/annex/misctmp must be on the same filesystem as the git work tree, since files are moved to there in a way that will not work cross-device, as well as symlinked into there. I first wanted to put the tmp objects in .git/annex/objects/tmp, but that would pose transition problems on upgrade when partially transferred objects existed. git annex info does not currently show the size of .git/annex/misctemp, since it should stay small. It would also be ok to make something clean it out, periodically.	2014-02-26 16:52:56 -04:00
Joey Hess	8d5158fa31	Preserve metadata when staging a new version of an annexed file. Performance impact: When adding a large tree of new files, this needs to do some git cat-file queries to check if any of the files already existed and might need a metadata copy. I tried a benchmark in a copy of my sound repository (so there was already a significant git tree to check against. Adding 10000 small files, with a cold cache: before: 1m48.539s after: 1m52.791s So, impact is 0.0004 seconds per file added. Which seems acceptable, so did not add some kind of configuration to enable/disable this. This commit was sponsored by Lisa Feilen.	2014-02-24 14:41:33 -04:00
Joey Hess	7498c5dd96	annex.genmetadata can be set to make git-annex automatically set metadata (year and month) when adding files	2014-02-23 00:08:29 -04:00
Joey Hess	bdfc8e1f44	fix build with old version of Data.Set that lacks toDescList	2014-02-21 11:30:31 -04:00
Joey Hess	cfed7f6a5d	remove special case for tags in view branch names Just having "_" for tags=* turned out to be too hard to understand. Note that this invalidaes all current views.	2014-02-19 17:38:45 -04:00
Joey Hess	c85a482136	improve view branch name when there are a list of values	2014-02-19 16:35:00 -04:00
Joey Hess	dd7b99c860	add tip about metadata driven views (and more flexible view filtering) While writing this documentation, I realized that there needed to be a way to stay in a view like tag=* while adding a filter like tag=work that applies to the same field. So, there are really two ways a view can be refined. It can have a new "field=explicitvalue" filter added to it, which does not change the "shape" of the view, but narrows the files it shows. Or, it can have a new view added, which adds another level of subdirectories. So, added a vfilter command, which takes explicit values to add to the filter, and rejects changes that would change the shape of the view. And, made vadd only accept changes that change the shape of the view. And, changed the View data type slightly; now components that can match multiple metadata values can be visible, or not visible. This commit was sponsored by Stelian Iancu.	2014-02-19 16:29:56 -04:00
Joey Hess	39ebfa1a2e	pre-commit: Update metadata when committing changes to annexed files within a view. So the user can now switch to a view and then move files around within it to manage metadata. For example, moving a file into a new directory when in the tags=* view adds a tag to it. Implementation is fairly efficient. One diff-index, which is no more expensive than the first stage of a git commit, followed by possibly some cat-file --batch traffic to find the key (when deleting a file). Very similar to what's done in direct mode when committing. And like direct mode when updating the WC after a merge, it has to buffer the diff-tree values in order to make 2 passes over them. When not in a view, pre-commit now does one extra git symbolic-ref, which is tiny overhead. This commit was sponsored by Andrew Eskridge.	2014-02-19 14:17:58 -04:00
Joey Hess	02259d2a55	speed up currentView when not in a view Avoid reading the view log when the branch is clearly not a view branch.	2014-02-19 12:52:47 -04:00
Joey Hess	4e0be2792b	remove Read instance for Ref Removed instance, got it all to build using fromRef. (With a few things that really need to show something using a ref for debugging stubbed out.) Then added back Read instance, and made Logs.View use it for serialization. This changes the view log format.	2014-02-19 01:19:57 -04:00
Joey Hess	2bf338f443	fixed vpop	2014-02-18 21:09:25 -04:00
Joey Hess	67fd06af76	add git annex view command (And a vpop command, which is still a bit buggy.) Still need to do vadd and vrm, though this also adds their documentation. Currently not very happy with the view log data serialization. I had to lose the TDFA regexps temporarily, so I can have Read/Show instances of View. I expect the view log format will change in some incompatable way later, probably adding last known refs for the parent branch to View or something like that. Anyway, it basically works, although it's a bit slow looking up the metadata. The actual git branch construction is about as fast as it can be using the current git plumbing. This commit was sponsored by Peter Hogg.	2014-02-18 18:22:20 -04:00
Joey Hess	a18eae9a0f	nice git ack space optimisation when setting the same metadata value for multiple files	2014-02-13 01:57:43 -04:00
Joey Hess	361aee0470	avoid churning in git to no benefit when optimising metadata log I think this is now optimal.	2014-02-12 23:24:04 -04:00
Joey Hess	8076530284	improve simplifier	2014-02-12 22:50:41 -04:00
Joey Hess	a05ac13e92	fix metadata log simplifier and additional quickcheck tests	2014-02-12 22:27:55 -04:00
Joey Hess	9f7e76130e	add metadata command to get/set metadata Adds metadata log, and command. Note that unsetting field values seems to currently be broken. And in general this has had all of 2 minutes worth of testing. This commit was sponsored by Julien Lefrique.	2014-02-12 21:30:33 -04:00
Joey Hess	c390e896d1	fix windows build (and make --stop work on windows, incidentially) The Utility.PID will clean up other code soon.	2014-02-11 15:25:59 -04:00
Joey Hess	4f7e72b51a	fix parsing of unused log; keys can contain spaces	2014-02-08 15:27:11 -04:00
Joey Hess	a44e01c29c	--in can now refer to files that were located in a repository at some past date. For example, --in="here@{yesterday}"	2014-02-06 12:43:56 -04:00
Joey Hess	1572c460e8	avoid using openFile when withFile can be used Potentially fixes some FD leak if an action on an opened file handle fails for some reason. There have been some hard to reproduce reports of git-annex leaking FDs, and this may solve them.	2014-02-03 10:19:06 -04:00
Joey Hess	32f1f68dc9	typo	2014-01-28 17:17:21 -04:00
Joey Hess	f0dfac4d96	fix build with old ghc that used old-time type	2014-01-28 17:14:43 -04:00
Joey Hess	eefda291c6	fix warning	2014-01-28 14:43:20 -04:00
Joey Hess	891c85cd88	use locking on Windows This is all the easy cases, where there was already a separate lock file.	2014-01-28 14:42:03 -04:00
Joey Hess	3518c586cf	fix transfers of key with no associated file Several places assumed this would not happen, and when the AssociatedFile was Nothing, did nothing. As part of this, preferred content checks pass the Key around. Note that checkMatcher is sometimes now called with Just Key and Just File. It currently constructs a FileMatcher, ignoring the Key. However, if it constructed a FileKeyMatcher, which contained both, then it might be possible to speed up parts of Limit, which currently call the somewhat expensive lookupFileKey to get the Key. I have not made this optimisation yet, because I am not sure if the key is always the same. Will need some significant checking to satisfy myself that's the case..	2014-01-23 16:44:02 -04:00
Joey Hess	e0bd088f08	add webapp UI to manage unused files	2014-01-23 15:09:43 -04:00
Joey Hess	3da0064657	assistant unused file handling Make sanity checker run git annex unused daily, and queue up transfers of unused files to any remotes that will have them. The transfer retrying code works for us here, so eg when a backup disk remote is plugged in, any transfers to it are done. Once the unused files reach a remote, they'll be removed locally as unwanted. If the setup does not cause unused files to go to a remote, they'll pile up, and the sanity checker detects this using some heuristics that are pretty good -- 1000 unused files, or 10% of disk used by unused files, or more disk wasted by unused files than is left free. Once it detects this, it pops up an alert in the webapp, with a button to take action. TODO: Webapp UI to configure this, and also the ability to launch an immediate cleanup of all unused files. This commit was sponsored by Simon Michael.	2014-01-22 22:53:18 -04:00
Joey Hess	4b55afe9e9	add "unused" preferred content expression With a really nice optimisation that keeps it from having any overhead in normal operation! This commit was sponsored by Ulises Vitulli.	2014-01-22 16:35:32 -04:00
Joey Hess	ae3cd632bd	add timestamps to unused log files This will be used in expiring old unused objects. The timestamp is when it was first noticed it was unused. Backwards compatability: It supports reading old format unused log files. The old version of git-annex will ignore lines in log files written by the new version, so the worst interop problem would be git annex dropunused not knowing some numbers that git-annex unused reported.	2014-01-22 15:33:02 -04:00
Joey Hess	f7cdc40f7b	reorg	2014-01-21 18:08:56 -04:00
Joey Hess	0ef282a116	numcopies cleanup, part 2 This includes several bug fixes.	2014-01-21 17:25:39 -04:00
Joey Hess	b40df4f0d0	reorganize numcopies code (no behavior changes) Move stuff into Logs.NumCopies. Add a NumCopies newtype. Better names for various serialization classes that are specific to one thing or another.	2014-01-21 16:08:59 -04:00
Joey Hess	d66535f065	global numcopies setting * numcopies: New command, sets global numcopies value that is seen by all clones of a repository. * The annex.numcopies git config setting is deprecated. Once the numcopies command is used to set the global number of copies, any annex.numcopies git configs will be ignored. * assistant: Make the prefs page set the global numcopies. This global numcopies setting is needed to let preferred content expressions operate on numcopies. It's also convenient, because typically if you want git-annex to preserve N copies of files in a repo, you want it to do that no matter which repo it's running in. Making it global avoids needing to warn the user about gotchas involving inconsistent annex.numcopies settings. (See changes to doc/numcopies.mdwn.) Added a new variety of git-annex branch log file, that holds only 1 value. Will probably be useful for other stuff later. This commit was sponsored by Nicolas Pouillard.	2014-01-20 16:47:56 -04:00
Joey Hess	93161d0dea	copyright year	2014-01-08 16:29:15 -04:00
Joey Hess	3e68c1c2fd	add remote state logs This allows a remote to store a piece of arbitrary state associated with a key. This is needed to support Tahoe, where the file-cap is calculated from the data stored in it, and used to retrieve a key later. Glacier also would be much improved by using this. GETSTATE and SETSTATE are added to the external special remote protocol. Note that the state is left as-is even when a key is removed from a remote. It's up to the remote to decide when it wants to clear the state. The remote state log, $KEY.log.rmt, is a UUID-based log. However, rather than using the old UUID-based log format, I created a new variant of that format. The new varient is more space efficient (since it lacks the "timestamp=" hack, and easier to parse (and the parser doesn't mess with whitespace in the value), and avoids compatability cruft in the old one. This seemed worth cleaning up for these new files, since there could be a lot of them, while before UUID-based logs were only used for a few log files at the top of the git-annex branch. The transition code has also been updated to handle these new UUID-based logs. This commit was sponsored by Daniel Hofer.	2014-01-03 16:35:57 -04:00
Joey Hess	8e3032df2d	added GETWANTED, SETWANTED for Tobias's flickr remote This was unexpectedly difficult because of a depdenency cycle. To parse a preferred content expression involves several things that need to operate on the list of remotes. Which needs Remote.External. The only way to avoid this cycle (I tried breaking it at several points) was to skip parsing the expression in SETWANTED. That's sorta ok, because git-annex already has to deal with unparsable preferred content expressions being stored, in order to handle eg, upgrades. But I'm still not very happy that I cannot check it. I feel this is a strong indication that I need to beware of further bloating the special remote protocol interface.	2014-01-01 20:12:20 -04:00
Joey Hess	f0a6de1ca2	add PreferredContentExpression type	2014-01-01 19:58:02 -04:00
Richard Hartmann	974fe009bf	Another round of s/amoung/among/	2013-12-19 12:30:53 -04:00
Joey Hess	f931272681	syntax	2013-12-11 00:18:58 -04:00
Joey Hess	011b8bc7ec	pull in Win32-extras, to be able to get current process id in Windows Fixed up a number of things that had worked around there not being a way to get that. Most notably, transfer info files on windows now include the process id, since no locking is currently done. This means the file format varies between windows and unix.	2013-12-11 00:15:10 -04:00
Joey Hess	ecd42aef8e	different PID types for Unix and Windows Windows has a larger (unsigned) PID space, so cannot use the unix CInt there. Note that TransferInfo does not yet ever get the TransferPid populated, as there is missing locking.	2013-12-10 23:48:42 -04:00
Joey Hess	6edac746f0	merge improved fsck types from git-repair and some associated changes	2013-11-30 14:29:11 -04:00
Joey Hess	53ab737723	clean up cruft left in log by bug	2013-11-09 14:30:26 -04:00
Joey Hess	8e1b8af6e7	fix crash on empty description Caused by bug fixed in `46cf00ffd8`	2013-11-09 13:50:44 -04:00
Joey Hess	049e80e865	refactor	2013-10-28 14:05:55 -04:00
Joey Hess	d345e5b52f	add git fsck to cronner, and UI for repository repair (not yet wired up)	2013-10-22 16:02:52 -04:00
Joey Hess	92d5452a19	write via temp file	2013-10-14 16:15:38 -04:00
Joey Hess	296e21b381	add schedule command Mostly because it gives me an excuse and a hook to document the schedule expression format.	2013-10-13 15:40:38 -04:00

1 2 3 4 5 ...

417 commits