git-annex

Author	SHA1	Message	Date
Joey Hess	d17f08afdc	avoid warning about orphan Arbitrary instance	2016-09-05 14:51:07 -04:00
Joey Hess	1a0e2c9901	get, move, copy, mirror: Added --failed switch which retries failed copies/moves Note that get --from foo --failed will get things that a previous get --from bar tried and failed to get, etc. I considered making --failed only retry transfers from the same remote, but it was easier, and seems more useful, to not have the same remote requirement. Noisy due to some refactoring into Types/	2016-08-03 12:37:12 -04:00
Joey Hess	c4d011bf3e	log: Added --all option.	2016-07-17 15:15:08 -04:00
Joey Hess	176cd98293	remove \r from Arbitrary for log tests	2016-05-27 12:04:49 -04:00
Joey Hess	eba68572dc	Split lines in the git-annex branch on \r as well as \n, to deal with \r\n terminated lines written by some versions of git-annex on Windows. This fixes strange displays in some cases, including whereis showing many duplicate locations, and showing more total copies than actually exist. It's unknown if that led to data loss when eg, dropping. At the moment, it seems unlikely it could, since the UUID with \r's appended is not the same as a UUID without, and so no remote matches it. It's also unknown if \r's can leak in on Windows, perhaps when merging the git-annex branch.	2016-05-27 11:45:13 -04:00
Joey Hess	823c28d2dc	nub transitionList to avoid ugly message after repeated transitions, and avoid redundant work for repeated ForgetDeadRemotes transitions	2016-05-18 12:26:38 -04:00
Joey Hess	8ab27235ea	reinject: Added new mode which can reinject known files into the annex. For example: git-annex reinject --known /mnt/backup/*	2016-04-22 13:49:32 -04:00
Joey Hess	403b56fb91	Limit annex.largefiles parsing to the subset of preferred content expressions that make sense in its context. So, not "standard" or "lackingcopies", etc.	2016-02-03 15:04:42 -04:00
Joey Hess	cdf5977053	simplify	2016-02-03 13:23:34 -04:00
Joey Hess	737e45156e	remove 163 lines of code without changing anything except imports	2016-01-20 16:36:33 -04:00
Joey Hess	1f3358512a	refactor	2016-01-19 15:55:32 -04:00
Joey Hess	983c1894eb	avoid unnecessary reading of git-annex branch data when matching on annex.largefiles This makes git annex clean not look at the git-annex branch at all, and so speeds it up by 50% or more.	2015-12-04 15:06:41 -04:00
Joey Hess	b0626230b7	fix use of hifalutin terminology	2015-11-16 14:37:31 -04:00
Joey Hess	aaf1ef268d	convert from Utility.LockPool to Annex.LockPool everywhere	2015-11-12 18:13:37 -04:00
Joey Hess	f9adb905fc	Avoid unnecessary write to the location log when a file is unlocked and then added back with unchanged content. Implemented with no additional overhead of compares etc. This is safe to do for presence logs because of their locality of change; a given repo's presence logs are only ever changed in that repo, or in a repo that has just been actively changing the content of that repo. So, we don't need to worry about a split-brain situation where there'd be disagreement about the location of a key in a repo. And so, it's ok to not update the timestamp when that's the only change that would be made due to logging presence info.	2015-10-12 14:46:47 -04:00
Joey Hess	6fbabfcf16	oops, didn't mean to commit this debug	2015-10-06 17:28:20 -04:00
Joey Hess	ba7ecf68c0	analysis	2015-10-06 17:11:52 -04:00
Joey Hess	16947ef654	Fix bug in combination of preferred and required content settings. When one was set to the empty string and the other set to some expression, this bug caused all files to be wanted, instead of only files matching the expression. Avoid: MAny `MOr` otherexpression Which matches anything.	2015-09-15 12:50:14 -04:00
Joey Hess	6e829939e9	add test case that all standard group preferred content expressions parse	2015-06-17 13:44:19 -04:00
Joey Hess	5c960601aa	4 ns optimisation of repeated calls to hasDifference on the same Differences I want this as fast as possible, so it can be added to code paths without slowing them down. Avoid the set lookup, and rely on laziness, drops runtime from 14.37 ns to 11.03 ns according to this criterion benchmark: import Criterion.Main import qualified Types.Difference as New import qualified Types.DifferenceOld as Old main :: IO () main = defaultMain [ bgroup "hasDifference" [ bench "new" $ whnf (New.hasDifference New.OneLevelObjectHash) new , bench "old" $ whnf (Old.hasDifference Old.OneLevelObjectHash) old ] ] where s = "fromList [ObjectHashLower, OneLevelObjectHash, OneLevelBranchHash]" new = New.readDifferences s old = Old.readDifferences s A little bit of added boilerplate, but I suppose it's worth it to not need to worry about set lookup overhead. Note that adding more differences would slow down the old implementation; the new implementation will run the same speed.	2015-06-11 16:34:35 -04:00
Joey Hess	f8ab3bc449	dead --key: Can be used to mark a key as dead.	2015-06-09 14:52:05 -04:00
Joey Hess	6eefc5db65	fsck: Ignore keys that are known to be dead when running in --all mode or in a bare repo. Otherwise, still reports files with lost contents, even if the content is dead.	2015-06-09 14:08:57 -04:00
Joey Hess	53ede1a10e	parse X in location log file as indicating a dead key A dead key is both not present at the location that thinks it has a copy, and also is assumed to probably not be present anywhere else. Although there may be lurking disconnected repos that somehow still have a copy. Surprisingly few changes needed for this! This is because the presence log code only really concerns itself with keys that are present, and dead keys are not present. Note that both the location and web log can be parsed as having a dead key. I don't see any value to having keys listed as dead in the web log, but since it doesn't change any behavior, there was no point in not parsing it.	2015-06-09 13:28:30 -04:00
Joey Hess	6383d22ffa	remove back-compat code for old version of containers Already b-d on a newer version.	2015-06-06 15:23:53 -04:00
Joey Hess	87f28bb2ea	ignore failure to clean up stale transfer lock file Perhaps due to permissions problem, or perhaps a race with another process also cleaning up.	2015-05-19 23:46:42 -04:00
Joey Hess	9de5cd2966	fix crash in stale transfer lockfile cleanup code Need to differentiate between the lockfile not being locked, and it not existing.	2015-05-19 23:35:24 -04:00
Joey Hess	ecb0d5c087	use lock pools throughout git-annex The one exception is in Utility.Daemon. As long as a process only daemonizes once, which seems reasonable, and as long as it avoids calling checkDaemon once it's already running as a daemon, the fcntl locking gotchas won't be a problem there. Annex.LockFile has its own separate lock pool layer, which has been renamed to LockCache. This is a persistent cache of locks that persist until closed. This is not quite done; lockContent still needs to be converted.	2015-05-19 14:09:52 -04:00
Joey Hess	6915b71c57	lock pools to work around non-concurrency/composition safety of POSIX fcntl	2015-05-18 15:57:17 -04:00
Joey Hess	7ebf234616	Stale transfer lock and info files will be cleaned up automatically when get/unused/info commands are run. Deleting lock files is tricky, tricky stuff. I think I got it right!	2015-05-12 20:11:23 -04:00
Joey Hess	643b233860	an optimization that also fixes a reversion This is a little optimisation; avoid loading the info file for the download of the current key when checking for other downloads. The reversion it fixes is sorta strange. `a812d598ef` broke checking for transfers that were already in progress. Indeed, the transfer lock was not held after getTransfers was called. Why? I think it's magic in ghc's handling of getLock and setLock, although it's hard to tell since those functions are almost entirely undocumented as to their semantics. Something, either the RTS (or maybe it's linux?) notices that the same process has taken a lock and is now calling getLock on a FD attached to the same file. So, it drops the lock. So, this optimisation avoids that problematic behavior.	2015-05-12 18:34:49 -04:00
Joey Hess	a812d598ef	Take space that will be used by running downloads into account when checking annex.diskreserve.	2015-05-12 15:20:22 -04:00
Joey Hess	03667a162a	couple of AMP warnings I missed before	2015-05-10 16:51:03 -04:00
Joey Hess	ec267aa1ea	rejigger imports for clean build with ghc 7.10's AMP changes The explicit import Prelude after import Control.Applicative is a trick to avoid a warning.	2015-05-10 16:20:30 -04:00
Joey Hess	6c2d5b5e41	more time-1.5 fixes	2015-05-10 15:36:58 -04:00
Joey Hess	33a2264546	fix build warning with time 1.5	2015-05-10 15:28:23 -04:00
Joey Hess	a5a53ca011	forgot to add new module	2015-05-10 15:23:38 -04:00
Joey Hess	6cf62a9bde	support time-1.5.0 This no longer uses old-locale's defaultTimeLocale, but provides one of its own. Factored out a Logs.TimeStamp.	2015-05-10 15:21:35 -04:00
Joey Hess	06211738c1	Fix activity log parsing. I had some cargo culting in there that used the wrong type, so it failed to parse old logs, and overwrote them with the new log.	2015-04-09 21:02:38 -04:00
Joey Hess	9445556c97	rethought distributed fsck; instead add activity.log and expire command This is much more space efficient!	2015-04-05 12:50:02 -04:00
Joey Hess	656fc1c881	fsck: Added --distributed and --expire options, for distributed fsck.	2015-04-01 17:53:16 -04:00
Joey Hess	9e25cbde20	importfeed: Avoid downloading a redundant item from a feed whose guid has been downloaded before, even when the url has changed. To support this, always store itemid in metadata; before this was only done when annex.genmetadata was set.	2015-03-31 13:30:13 -04:00
Joey Hess	f0195b2a43	Fix GETURLS in external special remote protocol to strip downloader prefix from logged url info before checking for the specified prefix. This doesn't change what GETURLS returns, but only whether it matches any prefix that the external special remote asked for.	2015-03-27 18:49:03 -04:00
Joey Hess	6045406deb	Added SETURIPRESENT and SETURIMISSING to external special remote protocol Useful for things like ipfs that don't use regular urls. An external special remote can add a regular url to a key, and then git-annex get will download it from the web. But for ipfs, we want to instead tell git-annex that the uri uses OtherDownloader. Before this change, the external special remote protocol lacked a way to do that.	2015-03-05 13:50:15 -04:00
Joey Hess	b0575c621f	implement annex.tune.branchhash1 I hope this doesn't impact speed much -- it does have to pull out a value from Annex state every time it accesses the branch now. The test case I dropped has never caught any problems that I can remember, and would have been rather difficult to convert.	2015-01-28 17:17:26 -04:00
Joey Hess	e8c376e0ad	import Data.Default in Common	2015-01-28 16:11:28 -04:00
Joey Hess	037d86e046	refactor	2015-01-28 13:56:38 -04:00
Joey Hess	ba3825441c	rework Differences data type Eliminated complexity and future proofed. The most important change is that all functions over Difference are now total; any Difference that can be expressed should be handled. Avoids needs for sanity checking of inputs, and version skew with the future. Also, the difference.log now serializes a [Difference], not a Differences. This saves space and keeps it simpler. Note that [Difference] might contain conflicting differences (eg, [Version5, Version6]). In this case, one of them needs to consistently win over the others, probably based on Ord.	2015-01-28 13:50:02 -04:00
Joey Hess	70736d2b41	Repository tuning parameters can now be passed when initializing a repository for the first time. * init: Repository tuning parameters can now be passed when initializing a repository for the first time. For details, see http://git-annex.branchable.com/tuning/ * merge: Refuse to merge changes from a git-annex branch of a repo that has been tuned in incompatible ways.	2015-01-27 17:38:06 -04:00
Joey Hess	afc5153157	update my email address and homepage url	2015-01-21 12:50:09 -04:00
Joey Hess	3bab5dfb1d	revert parentDir change Reverts `965e106f24` Unfortunately, this caused breakage on Windows, and possibly elsewhere, because parentDir and takeDirectory do not behave the same when there is a trailing directory separator.	2015-01-09 13:11:56 -04:00
Joey Hess	965e106f24	made parentDir return a Maybe FilePath; removed most uses of it parentDir is less safe than takeDirectory, especially when working with relative FilePaths. It's really only useful in loops that want to terminate at / This commit was sponsored by Audric SCHILTKNECHT.	2015-01-06 18:55:56 -04:00
Joey Hess	43dc7f678f	setpresentkey: A new plumbing-level command.	2014-12-29 15:16:40 -04:00
Joey Hess	589a048a7d	fix addurl behavior when location and url logs are inconsistent The url log could have an url for a key, while the location log thinks it's not present in the web. In this case, addurl --file url would not do anything. Fixed it to re-add the web as a location. I don't know how this situation could arise, but I saw it in the wild in the conference_proceedings repo, affecting key URL-s17806003--http://mirror.linux.org.au/pub/linux.conf.au/2014/Wednesday/53-Building_Effective_Alliances_around_the_Trans-Pacific_Partnershi-c0505b631127ccc67e38e637344d988e Investigating the presence log, it looked like that key was originally listed as present in the web, then in commit 56abf9e9f3e691ed9d83513037d4019313321ca3 someone else's git-annex set it and some other things to not present in the web. It would be interesting to know what that user did, but I doubt I'll be able to find out. All I can tell from this investigation is that the inconsistency was not introduced when originally addurl-ing the url.	2014-12-29 14:22:47 -04:00
Joey Hess	7e422269a6	move dummy uuids to Annex.UUID	2014-12-17 13:57:52 -04:00
Joey Hess	a7690de016	Added bittorrent special remote addurl behavior change: When downloading an url ending in .torrent, it will download files from bittorrent, instead of the old behavior of adding the torrent file to the repository. Added Recommends on aria2 and bittornado | bittorrent. This commit was sponsored by Asbjørn Sloth Tønnesen.	2014-12-16 23:22:46 -04:00
Joey Hess	30bf112185	Urls can now be claimed by remotes. This will allow creating, for example, a external special remote that handles magnet: and *.torrent urls.	2014-12-08 19:15:07 -04:00
Joey Hess	cb6e16947d	add stub claimUrl	2014-12-08 13:40:15 -04:00
Joey Hess	8093008ef4	External special remote protocol now includes commands for setting and getting the urls associated with a key.	2014-12-08 13:32:46 -04:00
Joey Hess	db9121ecee	vicfg: Deleting configurations now resets to the default, where before it has no effect. Added a Default instance for TrustLevel, and was able to use that to clear up several other parts of the code too. This commit was sponsored by Stephan Schulz	2014-10-14 14:15:07 -04:00
Joey Hess	9fd95d9025	indent with tabs not spaces Found these with: git grep "^ " $(find -type f -name \*.hs) |grep -v ': where' Unfortunately there is some inline hamlet that cannot use tabs for indentation. Also, Assistant/WebApp/Bootstrap3.hs is a copy of a module and so I'm leaving it as-is.	2014-10-09 15:09:26 -04:00
Joey Hess	7b50b3c057	fix some mixed space+tab indentation This fixes all instances of " \t" in the code base. Most common case seems to be after a "where" line; probably vim copied the two space layout of that line. Done as a background task while listening to episode 2 of the Type Theory podcast.	2014-10-09 15:09:11 -04:00
Joey Hess	b874f84086	New annex.hardlink setting. Closes: #758593 * New annex.hardlink setting. Closes: #758593 * init: Automatically detect when a repository was cloned with --shared, and set annex.hardlink=true, as well as marking the repository as untrusted. Had to reorganize Logs.Trust a bit to avoid a cycle between it and Annex.Init.	2014-09-05 13:44:09 -04:00
Joey Hess	59eae904b1	final scary locking refactoring (for now) Note that while before checkTransfer this called getLock with WriteLock, getLockStatus's use of ReadLock will also notice any exclusive locks. Since transfer info files are only locked exclusively, never shared, there is no behavior change. Also, fixes checkLocked to actually return Just False when the file exists, but is not locked.	2014-08-20 19:30:40 -04:00
Joey Hess	ec7dd0446a	more lock file refactoring	2014-08-20 17:03:04 -04:00
Joey Hess	c784ef4586	unify exception handling into Utility.Exception Removed old extensible-exceptions, only needed for very old ghc. Made webdav use Utility.Exception, to work after some changes in DAV's exception handling. Removed Annex.Exception. Mostly this was trivial, but note that tryAnnex is replaced with tryNonAsync and catchAnnex replaced with catchNonAsync. In theory that could be a behavior change, since the former caught all exceptions, and the latter don't catch async exceptions. However, in practice, nothing in the Annex monad uses async exceptions. Grepping for throwTo and killThread only find stuff in the assistant, which does not seem related. Command.Add.undo is changed to accept a SomeException, and things that use it for rollback now catch non-async exceptions, rather than only IOExceptions.	2014-08-07 22:03:29 -04:00
Joey Hess	f4f82e2741	deriving Show	2014-08-01 16:30:33 -04:00
Joey Hess	80cc554c82	add ChunkMethod type and make Logs.Chunk use it, rather than assuming fixed size chunks (so eg, rolling hash chunks can be supported later) If a newer git-annex starts logging something else in the chunk log, it won't be used by this version, but it will be preserved when updating the log.	2014-07-28 13:19:08 -04:00
Joey Hess	014794f4ed	improve a bit	2014-07-24 17:18:14 -04:00
Joey Hess	e2c44bf656	implement chunk logs Slightly tricky as they are not normal UUIDBased logs, but are instead maps from (uuid, chunksize) to chunkcount. This commit was sponsored by Frank Thomas.	2014-07-24 16:23:36 -04:00
Joey Hess	d0c1a22e7c	import metadata from feeds When annex.genmetadata is set, metadata from the feed is added to files that are imported from it. Reused the same feedtitle and itemtitle, feedauthor, itemauthor, etc names that are used in --template. Also added title and author, which are the item title/author if available, falling back to the feed title/author. These are more likely to be common metadata fields. (There is a small bit of duplication here, but once git gets around to packing the object, it will compress it away.) The itempubdate field is not included in the metadata as a string; instead it is used to generate year and month fields, same as is done when adding files with annex.genmetadata set. This commit was sponsored by Amitai Schlair, who coincidentally is responsible for ikiwiki generating nice feed metadata!	2014-07-03 14:15:00 -04:00
Joey Hess	c00b459f3a	unused: Avoid checking view branches for unused files. This avoids a potential slowdown when using lots of views. I think that it makes sense for unused to ignore (local) view branches, since these are by definition supposed to be views of an existing branch, so looking at the branch should be sufficient (and if the view is out of date and has files that have since been deleted from the branch, the user's intent is not to preserve those from unused reaping).	2014-06-04 14:03:41 -04:00
Joey Hess	9eaabf0382	webapp: avoid overwriting remote configs when enabling it Avoid stomping on existing group and preferred content settings when enabling or combining with an already existing remote. Two level fix. First, use defaultStandardGroup rather than setStandardGroup, so if there is an existing configuration in the git-annex branch, it's not overwritten. To handle pre-existing ssh remotes (including gcrypt), a second level is needed, because before syncing with the remote, its configuration won't be available locally. (And syncing could take a long time.) So, in this case, keep track of whether the remote is being created or enabled, and only set configs when creating it. This commit was sponsored by Anders Lannerback.	2014-05-30 14:03:04 -04:00
Joey Hess	065248f3d2	Added required content configuration. This includes checking when dropping files that any required content configuration is satisfied. However, it does not yet include an active check on the required content; the location log is trusted when checking the required content expression.	2014-03-29 16:03:33 -04:00
Joey Hess	fe19e15040	reorg matcher types; no non-type code changes	2014-03-29 14:43:34 -04:00
Joey Hess	e426fac273	add desktop notifications Motivation: Hook scripts for nautilus or other file managers need to provide the user with feedback that a file is being downloaded. This commit was sponsored by THM Schoemaker.	2014-03-22 14:12:19 -04:00
Joey Hess	ed30b81e2c	Improve behavior when unable to parse a preferred content expression (thanks, ion). Fall back to "present" as the preferred content expression, which will not result in any content movement.	2014-03-20 00:10:12 -04:00
Joey Hess	f64c2d6138	toplevel lastchanged field	2014-03-19 19:10:55 -04:00
Joey Hess	6848f09a12	better timestamp format	2014-03-18 19:01:50 -04:00
Joey Hess	caa97d1271	For each metadata field, there's now an automatically maintained "$field-lastchanged" that gives the timestamp of the last change to that field. Note that this is a nearly entirely free feature. The data was already stored in the metadata log in an easily accessible way, and already was parsed to a time when parsing the log. The generation of the metadata fields may even be done lazily, although probably not entirely (the map has to be evaluated to when queried).	2014-03-18 18:55:43 -04:00
Joey Hess	6a4dd42328	finish wiring up groupwanted	2014-03-15 17:08:55 -04:00
Joey Hess	417aea25be	vicfg: Allows editing preferred content expressions for groups. This is stored in the git-annex branch, but not yet actually hooked up and used.	2014-03-15 16:17:01 -04:00
Joey Hess	431d805a96	factored out a generic MapLog from uuid-based logs UUIDBased is just a MapLog with a UUID for the field.	2014-03-15 13:45:25 -04:00
Joey Hess	b7eb1d834a	Avoid encoding errors when using the unused log file.	2014-03-15 11:57:27 -04:00
Joey Hess	3551d40b05	"standard" can now be used as a first-class keyword in preferred content expressions. For example "standard or (include=otherdir/*)" or even "not standard" Note that the implementation avoids any potential for loops (if a standard preferred content expression itself mentioned standard). This commit was sponsored by Jochen Bartl.	2014-03-14 15:04:33 -04:00
Joey Hess	67f09bca6d	fully fix fsck memory use by iterative fscking Not very well tested, but I'm sure it doesn't eg, loop forever.	2014-03-12 15:18:43 -04:00
Joey Hess	9f27339e80	remove uninformative warning dateUnusedLog is only used to show a timestamp in the webapp, so not worth a warning	2014-03-12 12:42:51 -04:00
Joey Hess	c2e8c21ca6	view, vfilter: Add support for filtering tags and values out of a view, using !tag and field!=value. Note that negated globs are not supported. Would have complicated the code to add them, without changing the data type serialization in a non-backwards-compatible way. This commit was sponsored by Denver Gingerich.	2014-03-02 14:53:19 -04:00
Joey Hess	a1432bce2f	Put non-object tmp files in .git/annex/misctmp, leaving .git/annex/tmp for only partially transferred objects. This allows eg, putting .git/annex/tmp on a ram disk, if the disk IO of temp object files is too annoying (and if you don't want to keep partially transferred objects across reboots). .git/annex/misctmp must be on the same filesystem as the git work tree, since files are moved to there in a way that will not work cross-device, as well as symlinked into there. I first wanted to put the tmp objects in .git/annex/objects/tmp, but that would pose transition problems on upgrade when partially transferred objects existed. git annex info does not currently show the size of .git/annex/misctmp, since it should stay small. It would also be ok to make something clean it out, periodically.	2014-02-26 16:52:56 -04:00
Joey Hess	8d5158fa31	Preserve metadata when staging a new version of an annexed file. Performance impact: When adding a large tree of new files, this needs to do some git cat-file queries to check if any of the files already existed and might need a metadata copy. I tried a benchmark in a copy of my sound repository (so there was already a significant git tree to check against). Adding 10000 small files, with a cold cache: before: 1m48.539s after: 1m52.791s So, impact is 0.0004 seconds per file added. Which seems acceptable, so did not add some kind of configuration to enable/disable this. This commit was sponsored by Lisa Feilen.	2014-02-24 14:41:33 -04:00
Joey Hess	7498c5dd96	annex.genmetadata can be set to make git-annex automatically set metadata (year and month) when adding files	2014-02-23 00:08:29 -04:00
Joey Hess	bdfc8e1f44	fix build with old version of Data.Set that lacks toDescList	2014-02-21 11:30:31 -04:00
Joey Hess	cfed7f6a5d	remove special case for tags in view branch names Just having "_" for tags=* turned out to be too hard to understand. Note that this invalidates all current views.	2014-02-19 17:38:45 -04:00
Joey Hess	c85a482136	improve view branch name when there are a list of values	2014-02-19 16:35:00 -04:00
Joey Hess	dd7b99c860	add tip about metadata driven views (and more flexible view filtering) While writing this documentation, I realized that there needed to be a way to stay in a view like tag=* while adding a filter like tag=work that applies to the same field. So, there are really two ways a view can be refined. It can have a new "field=explicitvalue" filter added to it, which does not change the "shape" of the view, but narrows the files it shows. Or, it can have a new view added, which adds another level of subdirectories. So, added a vfilter command, which takes explicit values to add to the filter, and rejects changes that would change the shape of the view. And, made vadd only accept changes that change the shape of the view. And, changed the View data type slightly; now components that can match multiple metadata values can be visible, or not visible. This commit was sponsored by Stelian Iancu.	2014-02-19 16:29:56 -04:00
Joey Hess	39ebfa1a2e	pre-commit: Update metadata when committing changes to annexed files within a view. So the user can now switch to a view and then move files around within it to manage metadata. For example, moving a file into a new directory when in the tags=* view adds a tag to it. Implementation is fairly efficient. One diff-index, which is no more expensive than the first stage of a git commit, followed by possibly some cat-file --batch traffic to find the key (when deleting a file). Very similar to what's done in direct mode when committing. And like direct mode when updating the WC after a merge, it has to buffer the diff-tree values in order to make 2 passes over them. When not in a view, pre-commit now does one extra git symbolic-ref, which is tiny overhead. This commit was sponsored by Andrew Eskridge.	2014-02-19 14:17:58 -04:00
Joey Hess	02259d2a55	speed up currentView when not in a view Avoid reading the view log when the branch is clearly not a view branch.	2014-02-19 12:52:47 -04:00
Joey Hess	4e0be2792b	remove Read instance for Ref Removed instance, got it all to build using fromRef. (With a few things that really need to show something using a ref for debugging stubbed out.) Then added back Read instance, and made Logs.View use it for serialization. This changes the view log format.	2014-02-19 01:19:57 -04:00
Joey Hess	2bf338f443	fixed vpop	2014-02-18 21:09:25 -04:00
Joey Hess	67fd06af76	add git annex view command (And a vpop command, which is still a bit buggy.) Still need to do vadd and vrm, though this also adds their documentation. Currently not very happy with the view log data serialization. I had to lose the TDFA regexps temporarily, so I can have Read/Show instances of View. I expect the view log format will change in some incompatible way later, probably adding last known refs for the parent branch to View or something like that. Anyway, it basically works, although it's a bit slow looking up the metadata. The actual git branch construction is about as fast as it can be using the current git plumbing. This commit was sponsored by Peter Hogg.	2014-02-18 18:22:20 -04:00
Joey Hess	a18eae9a0f	nice git ack space optimisation when setting the same metadata value for multiple files	2014-02-13 01:57:43 -04:00
Joey Hess	361aee0470	avoid churning in git to no benefit when optimising metadata log I think this is now optimal.	2014-02-12 23:24:04 -04:00
Joey Hess	8076530284	improve simplifier	2014-02-12 22:50:41 -04:00
Joey Hess	a05ac13e92	fix metadata log simplifier and additional quickcheck tests	2014-02-12 22:27:55 -04:00
Joey Hess	9f7e76130e	add metadata command to get/set metadata Adds metadata log, and command. Note that unsetting field values seems to currently be broken. And in general this has had all of 2 minutes worth of testing. This commit was sponsored by Julien Lefrique.	2014-02-12 21:30:33 -04:00
Joey Hess	c390e896d1	fix windows build (and make --stop work on windows, incidentally) The Utility.PID will clean up other code soon.	2014-02-11 15:25:59 -04:00
Joey Hess	4f7e72b51a	fix parsing of unused log; keys can contain spaces	2014-02-08 15:27:11 -04:00
Joey Hess	a44e01c29c	--in can now refer to files that were located in a repository at some past date. For example, --in="here@{yesterday}"	2014-02-06 12:43:56 -04:00
Joey Hess	1572c460e8	avoid using openFile when withFile can be used Potentially fixes some FD leak if an action on an opened file handle fails for some reason. There have been some hard to reproduce reports of git-annex leaking FDs, and this may solve them.	2014-02-03 10:19:06 -04:00
Joey Hess	32f1f68dc9	typo	2014-01-28 17:17:21 -04:00
Joey Hess	f0dfac4d96	fix build with old ghc that used old-time type	2014-01-28 17:14:43 -04:00
Joey Hess	eefda291c6	fix warning	2014-01-28 14:43:20 -04:00
Joey Hess	891c85cd88	use locking on Windows This is all the easy cases, where there was already a separate lock file.	2014-01-28 14:42:03 -04:00
Joey Hess	3518c586cf	fix transfers of key with no associated file Several places assumed this would not happen, and when the AssociatedFile was Nothing, did nothing. As part of this, preferred content checks pass the Key around. Note that checkMatcher is sometimes now called with Just Key and Just File. It currently constructs a FileMatcher, ignoring the Key. However, if it constructed a FileKeyMatcher, which contained both, then it might be possible to speed up parts of Limit, which currently call the somewhat expensive lookupFileKey to get the Key. I have not made this optimisation yet, because I am not sure if the key is always the same. Will need some significant checking to satisfy myself that's the case.	2014-01-23 16:44:02 -04:00
Joey Hess	e0bd088f08	add webapp UI to manage unused files	2014-01-23 15:09:43 -04:00
Joey Hess	3da0064657	assistant unused file handling Make sanity checker run git annex unused daily, and queue up transfers of unused files to any remotes that will have them. The transfer retrying code works for us here, so eg when a backup disk remote is plugged in, any transfers to it are done. Once the unused files reach a remote, they'll be removed locally as unwanted. If the setup does not cause unused files to go to a remote, they'll pile up, and the sanity checker detects this using some heuristics that are pretty good -- 1000 unused files, or 10% of disk used by unused files, or more disk wasted by unused files than is left free. Once it detects this, it pops up an alert in the webapp, with a button to take action. TODO: Webapp UI to configure this, and also the ability to launch an immediate cleanup of all unused files. This commit was sponsored by Simon Michael.	2014-01-22 22:53:18 -04:00
Joey Hess	4b55afe9e9	add "unused" preferred content expression With a really nice optimisation that keeps it from having any overhead in normal operation! This commit was sponsored by Ulises Vitulli.	2014-01-22 16:35:32 -04:00
Joey Hess	ae3cd632bd	add timestamps to unused log files This will be used in expiring old unused objects. The timestamp is when it was first noticed it was unused. Backwards compatibility: It supports reading old format unused log files. The old version of git-annex will ignore lines in log files written by the new version, so the worst interop problem would be git annex dropunused not knowing some numbers that git-annex unused reported.	2014-01-22 15:33:02 -04:00
Joey Hess	f7cdc40f7b	reorg	2014-01-21 18:08:56 -04:00
Joey Hess	0ef282a116	numcopies cleanup, part 2 This includes several bug fixes.	2014-01-21 17:25:39 -04:00
Joey Hess	b40df4f0d0	reorganize numcopies code (no behavior changes) Move stuff into Logs.NumCopies. Add a NumCopies newtype. Better names for various serialization classes that are specific to one thing or another.	2014-01-21 16:08:59 -04:00
Joey Hess	d66535f065	global numcopies setting * numcopies: New command, sets global numcopies value that is seen by all clones of a repository. * The annex.numcopies git config setting is deprecated. Once the numcopies command is used to set the global number of copies, any annex.numcopies git configs will be ignored. * assistant: Make the prefs page set the global numcopies. This global numcopies setting is needed to let preferred content expressions operate on numcopies. It's also convenient, because typically if you want git-annex to preserve N copies of files in a repo, you want it to do that no matter which repo it's running in. Making it global avoids needing to warn the user about gotchas involving inconsistent annex.numcopies settings. (See changes to doc/numcopies.mdwn.) Added a new variety of git-annex branch log file, that holds only 1 value. Will probably be useful for other stuff later. This commit was sponsored by Nicolas Pouillard.	2014-01-20 16:47:56 -04:00
Joey Hess	93161d0dea	copyright year	2014-01-08 16:29:15 -04:00
Joey Hess	3e68c1c2fd	add remote state logs This allows a remote to store a piece of arbitrary state associated with a key. This is needed to support Tahoe, where the file-cap is calculated from the data stored in it, and used to retrieve a key later. Glacier also would be much improved by using this. GETSTATE and SETSTATE are added to the external special remote protocol. Note that the state is left as-is even when a key is removed from a remote. It's up to the remote to decide when it wants to clear the state. The remote state log, $KEY.log.rmt, is a UUID-based log. However, rather than using the old UUID-based log format, I created a new variant of that format. The new variant is more space efficient (since it lacks the "timestamp=" hack), easier to parse (and the parser doesn't mess with whitespace in the value), and avoids compatibility cruft in the old one. This seemed worth cleaning up for these new files, since there could be a lot of them, while before UUID-based logs were only used for a few log files at the top of the git-annex branch. The transition code has also been updated to handle these new UUID-based logs. This commit was sponsored by Daniel Hofer.	2014-01-03 16:35:57 -04:00
Joey Hess	8e3032df2d	added GETWANTED, SETWANTED for Tobias's flickr remote This was unexpectedly difficult because of a dependency cycle. To parse a preferred content expression involves several things that need to operate on the list of remotes. Which needs Remote.External. The only way to avoid this cycle (I tried breaking it at several points) was to skip parsing the expression in SETWANTED. That's sorta ok, because git-annex already has to deal with unparsable preferred content expressions being stored, in order to handle eg, upgrades. But I'm still not very happy that I cannot check it. I feel this is a strong indication that I need to beware of further bloating the special remote protocol interface.	2014-01-01 20:12:20 -04:00
Joey Hess	f0a6de1ca2	add PreferredContentExpression type	2014-01-01 19:58:02 -04:00
Richard Hartmann	974fe009bf	Another round of s/amoung/among/	2013-12-19 12:30:53 -04:00
Joey Hess	f931272681	syntax	2013-12-11 00:18:58 -04:00
Joey Hess	011b8bc7ec	pull in Win32-extras, to be able to get current process id in Windows Fixed up a number of things that had worked around there not being a way to get that. Most notably, transfer info files on windows now include the process id, since no locking is currently done. This means the file format varies between windows and unix.	2013-12-11 00:15:10 -04:00
Joey Hess	ecd42aef8e	different PID types for Unix and Windows Windows has a larger (unsigned) PID space, so cannot use the unix CInt there. Note that TransferInfo does not yet ever get the TransferPid populated, as there is missing locking.	2013-12-10 23:48:42 -04:00
Joey Hess	6edac746f0	merge improved fsck types from git-repair and some associated changes	2013-11-30 14:29:11 -04:00
Joey Hess	53ab737723	clean up cruft left in log by bug	2013-11-09 14:30:26 -04:00
Joey Hess	8e1b8af6e7	fix crash on empty description Caused by bug fixed in `46cf00ffd8`	2013-11-09 13:50:44 -04:00
Joey Hess	049e80e865	refactor	2013-10-28 14:05:55 -04:00
Joey Hess	d345e5b52f	add git fsck to cronner, and UI for repository repair (not yet wired up)	2013-10-22 16:02:52 -04:00
Joey Hess	92d5452a19	write via temp file	2013-10-14 16:15:38 -04:00
Joey Hess	296e21b381	add schedule command Mostly because it gives me an excuse and a hook to document the schedule expression format.	2013-10-13 15:40:38 -04:00
Joey Hess	88ec6eff15	add/remove/edit schedule UI working Once I built the basic widget, it turned out to be rather easy to replicate it once per scheduled activity and wire it all up to a fully working UI. This does abuse yesod's form handling a bit, but I think it's ok. And it would be nice to have it all ajax-y, so that saving one modified form won't lose any modifications to other forms. But for now, a nice simple 115 line of code implementation is a win. This late night hack session commit was sponsored by Andrea Rota.	2013-10-11 03:04:11 -04:00
Joey Hess	af5e1d0494	half way complete cronner thread to run scheduled activities	2013-10-08 11:48:28 -04:00
Joey Hess	b9375acb18	add schedule to vicfg	2013-10-07 17:11:13 -04:00
Joey Hess	29ca49dad4	add a log file for scheduled activities	2013-10-07 16:06:34 -04:00
Joey Hess	57d49a6d04	remove >=> and >=> ; use <$$> instead I forgot I had <$$> hidden away in Utility.Applicative. It allows doing the same kind of currying as does >=> and I found using it made the code more readable for me. (>=> was not used)	2013-09-27 19:58:48 -04:00
Joey Hess	c1990702e9	hlint	2013-09-25 23:19:01 -04:00
Joey Hess	4dc4a9a385	assistant: Clear the list of failed transfers when doing a full transfer scan. This prevents repeated retries to download files that are not available, or are not referenced by the current git tree. This is motivated by a user report that the assistant was repeatedly retrying transfers of files that had been deleted (in direct mode, so removing the only copy). Note that the glacier code retries failed transfers after a while to retry downloads that have aged long enough to be available. This is ok; if we're doing a full transfer scan we'll retry on every file that is still in the git tree. Also note that this makes the assistant less likely to get every file referenced by old revs of the git tree. Not something the assistant tries to ensure anyway, so I feel this is acceptable.	2013-09-25 11:46:17 -04:00
Joey Hess	eb42bde19a	sync, pre-commit, indirect: Avoid unnecessarily catting non-symlink files from git, which can be so large it runs out of memory.	2013-09-19 14:48:42 -04:00
Joey Hess	51ce7fcaf1	fix warning	2013-09-04 21:37:13 -04:00
Joey Hess	0831e18372	forget --drop-dead: Completely removes mentions of repositories that have been marked as dead from the git-annex branch. Wrote nice pure transition calculator, and ugly code to stage its results into the git-annex branch. Also had to split up several Log modules that Annex.Branch needed to use, but that themselves used Annex.Branch. The transition calculator is limited to looking at and changing one file at a time. While this made the implementation relatively easy, it precludes transitions that do stuff like deleting old url log files for keys that are being removed because they are no longer present anywhere.	2013-08-31 17:51:13 -04:00
Joey Hess	62beaa1a86	refactor git-annex branch log filename code into central location Having one module that knows about all the filenames used on the branch allows working back from an arbitrary filename to enough information about it to implement dropping dead remotes and doing other log file compacting as part of a forget transition.	2013-08-29 19:13:00 -04:00
Joey Hess	4a915cd3cd	add forget command Works, more or less. --dead is not implemented, and so far a new branch is made, but keys no longer present anywhere are not scrubbed. git annex sync fails to push the synced/git-annex branch after a forget, because it's not a fast-forward of the existing synced branch. Could be fixed by making git-annex sync use assistant-style sync branches.	2013-08-28 16:41:13 -04:00
Joey Hess	fcd5c167ef	untested transition detection on merging, and transition running code	2013-08-28 15:57:42 -04:00
Joey Hess	511cf77b6d	add transition log	2013-08-28 13:54:51 -04:00
Joey Hess	824241b6fb	better cases	2013-08-22 23:44:13 -04:00
Joey Hess	46b6d75274	Youtube support! (And 53 other video hosts) When quvi is installed, git-annex addurl automatically uses it to detect when a page is a video, and downloads the video file. web special remote: Also support using quvi, for getting files, or checking if files exist in the web. This commit was sponsored by Mark Hepburn. Thanks!	2013-08-22 18:50:43 -04:00
Joey Hess	a3224ce35b	avoid more build warnings on Windows	2013-08-04 14:05:36 -04:00
Joey Hess	93f2371e09	get rid of __WINDOWS__, use mingw32_HOST_OS The latter is harder for me to remember, but avoids build failures in code used by the configure program.	2013-08-02 12:27:32 -04:00
Joey Hess	7e66d260ea	importfeed: git-annex becomes a podcatcher in 150 LOC	2013-07-28 16:55:42 -04:00
Joey Hess	ec8cf85fcc	display "transfer already in progress" as a note	2013-07-17 16:16:17 -04:00
Joey Hess	7afd92d083	When a transfer is already being run by another process, proceed on to the next file, rather than dying.	2013-07-17 15:54:01 -04:00
Joey Hess	7a7e426352	moved AssociatedFile definition	2013-07-04 02:36:02 -04:00
Joey Hess	04d07f2c1f	--unused: New switch that makes git-annex operate on all data found by the last run of git annex unused. Supported by fsck, get, move, copy.	2013-07-03 15:26:59 -04:00
Joey Hess	bf86b5ca16	improve robustness of fromDirect and replaceFile Made fromDirect check that a file in the tree has good content (and is not a broken symlink either) before copying it to another file that has the same key. Made replaceFile clean up the temp file if the action that creates it, or the file replacement action fails.	2013-05-25 15:06:02 -04:00
Joey Hess	25a8d4b11c	rename module	2013-05-12 19:19:28 -04:00
Joey Hess	03e8594369	fix the day's windows permissions damage	2013-05-12 19:09:48 -04:00
Joey Hess	73d2f8b280	deal with git using / internally, even on DOS	2013-05-12 17:29:49 -05:00
Joey Hess	abe8d549df	fix permission damage (thanks, Windows)	2013-05-11 23:54:25 -04:00
Joey Hess	18bdff3fae	clean up from windows porting	2013-05-11 18:23:41 -04:00
Joey Hess	3c7e30a295	git-annex now builds on Windows (doesn't work)	2013-05-11 15:03:00 -05:00
Joey Hess	0ae8c82c53	per-IA-item content directories	2013-04-25 23:44:55 -04:00
Joey Hess	49547ad32d	initremote: If two existing remotes have the same name, prefer the one with a higher trust level.	2013-04-24 21:53:58 -04:00
Joey Hess	6be815a30c	rmurl: New command, removes one of the recorded urls for a file.	2013-04-22 17:18:53 -04:00
Joey Hess	9e11699c76	connect existing meters to the transfer log for downloads Most remotes have meters in their implementations of retrieveKeyFile already. Simply hooking these up to the transfer log makes that information available. Easy peasy. This is particularly valuable information for encrypted remotes, which otherwise bypass the assistant's polling of temp files, and so don't have good progress bars yet. Still some work to do here (see progressbars.mdwn changes), but this is entirely an improvement from the lack of progress bars for encrypted downloads.	2013-04-11 17:32:31 -04:00
Joey Hess	c9e4c218a6	fix invalidating the preferred content cache when changing a group The ConfigMonitor already did this, but groups can also be changed by eg, the webapp UI, so need to do it at this deeper level.	2013-04-08 16:43:06 -04:00
Joey Hess	9a5f421768	detect when unwanted remote is empty and remove it Needs fixes to build when the webapp is disabled.	2013-04-03 17:01:40 -04:00
Joey Hess	8a5b397ac4	hlint	2013-04-03 03:52:41 -04:00
Joey Hess	7b6cf1981f	show bytesComplete	2013-04-02 16:38:47 -04:00
Joey Hess	91b7de97e8	invalidated the wrong cache when setting preferred content	2013-03-31 19:00:14 -04:00
Joey Hess	67e817c6a1	New annex.largefiles setting, which configures which files `git annex add` and the assistant add to the annex. I would have sort of liked to put this in .gitattributes, but it seems it does not support multi-word attribute values. Also, making this a single config setting makes it easy to only parse the expression once. A natural next step would be to make the assistant `git add` files that are not annex.largefiles. OTOH, I don't think `git annex add` should `git add` such files, because git-annex command line tools are not in the business of wrapping git command line tools.	2013-03-29 16:17:13 -04:00
Joey Hess	cf07a2c412	webapp: Progress bar fixes for many types of special remotes. There was confusion in different parts of the progress bar code about whether an update contained the total number of bytes transferred, or the number of bytes transferred since the last update. One way this bug showed up was progress bars that seemed to stick at zero for a long time. In order to fix it comprehensively, I add a new BytesProcessed data type, that is explicitly a total quantity of bytes, not a delta. Note that this doesn't necessarily fix every problem with progress bars. Particularly, buffering can now cause progress bars to seem to run ahead of transfers, reaching 100% when data is still being uploaded.	2013-03-28 17:04:37 -04:00
Joey Hess	e9048ecec8	get, copy, move: Display an error message when an identical transfer is already in progress, rather than failing with no indication why.	2013-03-19 13:56:20 -04:00
Joey Hess	b543842a7f	optimisation for transfers to drives that are not plugged in Rather than forking a git-annex transferkey only to have it fail, just immediately record the failed transfer (so when the drive is plugged in, the scan will retry it).	2013-03-18 20:40:24 -04:00
Joey Hess	a1b6d2e057	show an error message if garbage is provided to dropunused	2013-03-03 20:04:24 -04:00
Joey Hess	46c9cbeb1e	add additional debug info about reasons for transfers	2013-03-01 15:23:59 -04:00
Joey Hess	24316f6562	improve imports	2013-02-27 21:48:46 -04:00
Joey Hess	a2f17146fa	move Arbitrary instances out of Test and into modules that define the types This is possible now that we build-depend on QuickCheck.	2013-02-27 21:42:07 -04:00
Joey Hess	4008590c68	type based git config handling for remotes Still a couple of places that use git config ad-hoc, but this is most of it done.	2013-01-01 13:58:14 -04:00
Joey Hess	1702409f00	check	2012-12-20 00:08:30 -04:00
Joey Hess	df90a2acd5	another quickcheck	2012-12-20 00:02:33 -04:00
Joey Hess	8491917d04	more quickcheck fun and the code gets better..	2012-12-19 22:14:12 -04:00
Joey Hess	bf71d42681	quickcheck test for transfer info read/write code Fixed a bug the quickcheck turned up.	2012-12-19 16:15:39 -04:00
Joey Hess	7da2e27293	Bugfix: Fixed bug parsing transfer info files The newline after the filename was included in it. This was generally benign -- mostly these filenames are just displayed, and the newline didn't matter. But in the assistant, it caused unexpected dropping of preferred content. A characteristic of this bug is that the drop was displayed like this: drop some_file ok	2012-12-19 14:17:01 -04:00
Joey Hess	ffdd08fd2e	Merge branch 'master' into desymlink	2012-12-13 00:46:10 -04:00
Joey Hess	0d50a6105b	whitespace fixes	2012-12-13 00:45:27 -04:00
Joey Hess	e7b8cb0063	direct mode committing	2012-12-12 19:20:38 -04:00
Joey Hess	99a8a5297c	--auto fixes * get/copy --auto: Transfer data even if it would exceed numcopies, when preferred content settings want it. * drop --auto: Fix dropping content when there are no preferred content settings.	2012-12-06 13:22:16 -04:00
Joey Hess	ea5d7292e6	dropping from web	2012-11-29 17:01:07 -04:00
Joey Hess	2172cc586e	where indenting	2012-11-11 00:51:07 -04:00
Joey Hess	ec337baaee	add trustExclude	2012-11-11 00:24:32 -04:00
Joey Hess	c6fbed48a1	bugfix: Don't fail transferring content from read-only repos. Closes: #691341 This used to work, but got broken when the transfer info files were added, as it failed writing them on the readonly filesystem.	2012-10-24 10:59:25 -04:00
Joey Hess	452e6819d0	!! removal	2012-10-21 00:51:42 -04:00
Joey Hess	c7c2015435	add ConfigMonitor thread Monitors git-annex branch for changes, which are noticed by the Merger thread whenever the branch ref is changed (either due to an incoming push, or a local change), and refreshes cached config values for modified config files. Rate limited to run no more often than once per minute. This is important because frequent git-annex branch changes happen when files are being added, or transferred, etc. A primary use case is that, when preferred content changes are made, and get pushed to remotes, the remotes start honoring those settings. Other use cases include propagating repository description and trust changes to remotes, and learning when a remote has added a new special remote, so the webapp can present the GUI to enable that special remote locally. Also added a uuid.log cache. All other config files already had caches.	2012-10-20 16:43:35 -04:00
Joey Hess	40aab719df	Replace "in=" with "present" in preferred content expressions in= was problematic in two ways. First, it referred to a remote by name, but preferred content expressions can be evaluated elsewhere, where that remote doesn't exist, or a different remote has the same name. This name lookup code could error out at runtime. Secondly, in= seemed pretty useless. in=here did not cause content to be gotten, but it did let present content be dropped. present is more useful, although "not present" is unstable and should be avoided.	2012-10-19 16:09:21 -04:00

453 commits