git-annex

Author	SHA1	Message	Date
Joey Hess	efb37e7c78	Improve behavior when a git-annex command is told to operate on a file that doesn't exist. It will now continue to other files specified after that on the command line, and only error out at the end.	2015-04-30 15:28:17 -04:00
Joey Hess	3c2cb25698	wording	2015-04-08 16:16:42 -04:00
Joey Hess	4da371af1e	add: If annex.largefiles is set and does not match a file that's being added, the file will be checked into git rather than being added to the annex. Previously, git annex add skipped over such files; this new behavior is more useful in direct mode.	2015-04-08 16:14:23 -04:00
Joey Hess	8066a1c3cc	The file matching options are now only accepted by commands that can actually use them.	2015-02-06 17:16:41 -04:00
Joey Hess	70736d2b41	Repository tuning parameters can now be passed when initializing a repository for the first time. * init: Repository tuning parameters can now be passed when initializing a repository for the first time. For details, see http://git-annex.branchable.com/tuning/ * merge: Refuse to merge changes from a git-annex branch of a repo that has been tuned in incompatable ways.	2015-01-27 17:38:06 -04:00
Joey Hess	afc5153157	update my email address and homepage url	2015-01-21 12:50:09 -04:00
Joey Hess	068aaf943b	on second thought, InodeCache should use getFileSize This is necessary for interop between inode caches created on unix and windows. Which is more important than supporting inodecaches for large keys with the wrong size, which are broken anyway. There should be no slowdown from this change, except on Windows.	2015-01-20 19:35:50 -04:00
Joey Hess	59f88558d5	doh't use "def" for command definitions, it conflicts with Data.Default.def	2014-10-14 14:20:10 -04:00
Joey Hess	b61c6bc2ff	hlint	2014-10-09 15:46:05 -04:00
Joey Hess	7b50b3c057	fix some mixed space+tab indentation This fixes all instances of " \t" in the code base. Most common case seems to be after a "where" line; probably vim copied the two space layout of that line. Done as a background task while listening to episode 2 of the Type Theory podcast.	2014-10-09 15:09:11 -04:00
Joey Hess	44e7d6e1fe	add: In direct mode, adding an annex symlink will check it into git, as was already done in indirect mode.	2014-09-18 14:24:47 -04:00
Joey Hess	c784ef4586	unify exception handling into Utility.Exception Removed old extensible-exceptions, only needed for very old ghc. Made webdav use Utility.Exception, to work after some changes in DAV's exception handling. Removed Annex.Exception. Mostly this was trivial, but note that tryAnnex is replaced with tryNonAsync and catchAnnex replaced with catchNonAsync. In theory that could be a behavior change, since the former caught all exceptions, and the latter don't catch async exceptions. However, in practice, nothing in the Annex monad uses async exceptions. Grepping for throwTo and killThread only find stuff in the assistant, which does not seem related. Command.Add.undo is changed to accept a SomeException, and things that use it for rollback now catch non-async exceptions, rather than only IOExceptions.	2014-08-07 22:03:29 -04:00
Joey Hess	4fe2e53f5b	finish fixing windows timezone madness Rather than calculating the TSDelta once, and caching it, this now reads the inode sential file's InodeCache file once, and then each time a new InodeCache is generated, looks at the sentinal file to get the current delta. This way, if the time zone changes while git-annex is running, it will adapt. This adds some inneffiency, but only on Windows, and only 1 stat per new file added. The worst innefficiency is that `git annex status` and `git annex sync` will now (on Windows) stat the inode sentinal file once per file in the repo. It would be more efficient to use getCurrentTimeZone, rather than needing to stat the sentinal file. This should be easy to do, once the time package gets my bugfix patch. This commit was sponsored by Jürgen Lüters.	2014-06-12 13:54:08 -04:00
Joey Hess	e4d7e2ebde	fix for Windows file timestamp timezone madness On Windows, changing the time zone causes the apparent mtime of files to change. This confuses git-annex, which natually thinks this means the files have actually been modified (since THAT'S WHAT A MTIME IS FOR, BILL <sheesh>). Work around this stupidity, by using the inode sentinal file to detect if the timezone has changed, and calculate a TSDelta, which will be applied when generating InodeCaches. This should add no overhead at all on unix. Indeed, I sped up a few things slightly in the refactoring. Seems to basically work! But it has a big known problem: If the timezone changes while the assistant (or a long-running command) runs, it won't notice, since it only checks the inode cache once, and so will use the old delta for all new inode caches it generates for new files it's added. Which will result in them seeming changed the next time it runs. This commit was sponsored by Vincent Demeester.	2014-06-12 13:42:21 -04:00
Joey Hess	e880d0d22c	replace (Key, Backend) with Key Only fsck and reinject and the test suite used the Backend, and they can look it up as needed from the Key. This simplifies the code and also speeds it up. There is a small behavior change here. Before, all commands would warn when acting on an annexed file with an unknown backend. Now, only fsck and reinject show that warning.	2014-04-17 18:03:39 -04:00
Joey Hess	2f538dd65c	add --include-dotfiles: New option, perhaps useful for backups.	2014-03-26 14:52:07 -04:00
Joey Hess	9aa31b71f3	add: display exception when lockdown fails (for RichiH)	2014-03-19 21:08:46 -04:00
Joey Hess	a1432bce2f	Put non-object tmp files in .git/annex/misctmp, leaving .git/annex/tmp for only partially transferred objects. This allows eg, putting .git/annex/tmp on a ram disk, if the disk IO of temp object files is too annoying (and if you don't want to keep partially transferred objects across reboots). .git/annex/misctmp must be on the same filesystem as the git work tree, since files are moved to there in a way that will not work cross-device, as well as symlinked into there. I first wanted to put the tmp objects in .git/annex/objects/tmp, but that would pose transition problems on upgrade when partially transferred objects existed. git annex info does not currently show the size of .git/annex/misctemp, since it should stay small. It would also be ok to make something clean it out, periodically.	2014-02-26 16:52:56 -04:00
Joey Hess	3f6e4b8c7c	fix all remaining -Wall warnings on Windows	2014-02-25 14:48:50 -04:00
Joey Hess	8d5158fa31	Preserve metadata when staging a new version of an annexed file. Performance impact: When adding a large tree of new files, this needs to do some git cat-file queries to check if any of the files already existed and might need a metadata copy. I tried a benchmark in a copy of my sound repository (so there was already a significant git tree to check against. Adding 10000 small files, with a cold cache: before: 1m48.539s after: 1m52.791s So, impact is 0.0004 seconds per file added. Which seems acceptable, so did not add some kind of configuration to enable/disable this. This commit was sponsored by Lisa Feilen.	2014-02-24 14:41:33 -04:00
Joey Hess	7498c5dd96	annex.genmetadata can be set to make git-annex automatically set metadata (year and month) when adding files	2014-02-23 00:08:29 -04:00
Joey Hess	1669e80e85	Windows: Avoid using unix-compat's rename, which refuses to rename directories. Opened a bug about this: https://github.com/jystic/unix-compat/issues/10	2014-01-29 15:19:03 -04:00
Joey Hess	34c8af74ba	fix inversion of control in CommandSeek (no behavior changes) I've been disliking how the command seek actions were written for some time, with their inversion of control and ugly workarounds. The last straw to fix it was sync --content, which didn't fit the Annex [CommandStart] interface well at all. I have not yet made it take advantage of the changed interface though. The crucial change, and probably why I didn't do it this way from the beginning, is to make each CommandStart action be run with exceptions caught, and if it fails, increment a failure counter in annex state. So I finally remove the very first code I wrote for git-annex, which was before I had exception handling in the Annex monad, and so ran outside that monad, passing state explicitly as it ran each CommandStart action. This was a real slog from 1 to 5 am. Test suite passes. Memory usage is lower than before, sometimes by a couple of megabytes, and remains constant, even when running in a large repo, and even when repeatedly failing and incrementing the error counter. So no accidental laziness space leaks. Wall clock speed is identical, even in large repos. This commit was sponsored by an anonymous bitcoiner.	2014-01-20 04:57:36 -04:00
Joey Hess	0cc1bd7e53	add: Fix rollback when disk is completely full. Noticed that it was possible for add to move a file to .git/annex/objects and not make the link if the disk was full. This happened because the location log update failed, and so addLink never got a chance to run. Running addLink first fixes it; on error it will unwind by moving the file back to where it was originally.	2014-01-05 14:09:57 -04:00
Joey Hess	3e9419b088	avoid using Utility.Touch without WITH_CLIBS	2013-11-12 21:05:04 -04:00
Joey Hess	9ff229a798	watcher: Avoid loop when adding a file owned by someone else fails in indirect mode because its permissions cannot be modified. Adding the file moved it to the annex, and then tried to set the mode. Error unwind then moved the file back, and so the watcher saw the file get deleted and then added back, and so tried again..	2013-11-07 15:18:54 -04:00
Joey Hess	ad86926f09	Revert "avoid hsc files on Windows" This reverts commit `158ba9d332`. My windows build environment was broken; reverted to backup.	2013-10-17 17:53:50 -04:00
Joey Hess	bf11eac772	typo	2013-10-17 17:02:24 -04:00
Joey Hess	158ba9d332	avoid hsc files on Windows This used to work, but now hsc2hs is failing with a usage message. Since I have not changed my windows build environment at all, it must be some change due to a change in the cabal file. Perhaps too make flags are causing it to hit a windows command line length limit? Anyway, these hsc files did nothing on Windows, so can be omitted and not built to work around yet another epic windows weirdness.	2013-10-17 16:35:14 -04:00
Joey Hess	98fc7e8a19	add, import, assistant: Better preserve the mtime of symlinks, when when adding content that gets deduplicated. Note that this turned out to remove a syscall, not add any expense. Otherwise, I would not have done it.	2013-09-25 16:07:11 -04:00
Joey Hess	b405295aee	hlint test suite still passes	2013-09-25 03:09:06 -04:00
Joey Hess	ddd46db09a	Fix a few bugs involving filenames that are at or near the filesystem's maximum filename length limit. Started with a problem when running addurl on a really long url, because the whole url is munged into the filename. Ended up doing a fairly extensive review for places where filenames could get too large, although it's hard to say I'm not missed any.. Backend.Url had a 128 character limit, which is fine when the limit is 255, but not if it's a lot shorter on some systems. So check the pathconf() limit. Note that this could result in fromUrl creating different keys for the same url, if run on systems with different limits. I don't see this is likely to cause any problems. That can already happen when using addurl --fast, or if the content of an url changes. Both Command.AddUrl and Backend.Url assumed that urls don't contain a lot of multi-byte unicode, and would fail to truncate an url that did properly. A few places use a filename as the template to make a temp file. While that's nice in that the temp file name can be easily related back to the original filename, it could lead to `git annex add` failing to add a filename that was at or close to the maximum length. Note that in Command.Add.lockdown, the template is still derived from the filename, just with enough space left to turn it into a temp file. This is an important optimisation, because the assistant may lock down a bunch of files all at once, and using the same template for all of them would cause openTempFile to iterate through the same set of names, looking for an unused temp file. I'm not very happy with the relatedTemplate hack, but it avoids that slowdown. Backend.WORM does not limit the filename stored in the key. I have not tried to change that; so git annex add will fail on really long filenames when using the WORM backend. It seems better to preserve the invariant that a WORM key always contains the complete filename, since the filename is the only unique material in the key, other than mtime and size. Since nobody has complained about add failing (I think I saw it once?) on WORM, probably it's ok, or nobody but me uses it. There may be compatability problems if using git annex addurl --fast or the WORM backend on a system with the 255 limit and then trying to use that repo in a system with a smaller limit. I have not tried to deal with those. This commit was sponsored by Alexander Brem. Thanks!	2013-07-30 19:18:29 -04:00
Joey Hess	6dcf21db93	Direct mode: No longer temporarily remove write permission bit of files when adding them. This write permission frobbing is very appropriate in indirect mode, since annexed objects are stored as immutably as can be managed. But not in direct mode, where files should be able to be modified at any time. There are already sufficient guards that there's no need to prevent a file being written to while it's being ingested, in direct mode. The inode cache will detect (most) types of modifications, and the add will fail. Then a re-add should be done. The assistant should get another inotify change event, and automatically add the new version of the file.	2013-06-12 14:02:31 -04:00
Joey Hess	a64106dcef	Supports indirect mode on encfs in paranoia mode, and other filesystems that do not support hard links, but do support symlinks and other POSIX filesystem features.	2013-06-10 13:11:33 -04:00
Joey Hess	92f036fcb4	avoid warnings when built with ghc 7.6	2013-06-02 15:01:58 -04:00
Joey Hess	345ee4f37c	Switch to MonadCatchIO-transformers for better handling of state while catching exceptions. As seen in this bug report, the lifted exception handling using the StateT monad throws away state changes when an action throws an exception. http://git-annex.branchable.com/bugs/git_annex_fork_bombs_on_gpg_file/ .. Which can result in cached values being redundantly calculated, or other possibly worse bugs when the annex state gets out of sync with reality. This switches from a StateT AnnexState to a ReaderT (MVar AnnexState). All changes to the state go via the MVar. So when an Annex action is running inside an exception handler, and it makes some changes, they immediately go into affect in the MVar. If it then throws an exception (or even crashes its thread!), the state changes are still in effect. The MonadCatchIO-transformers change is actually only incidental. I could have kept on using lifted-base for the exception handling. However, I'd have needed to write a new instance of MonadBaseControl for the new monad.. and I didn't write the old instance.. I begged Bas and he kindly sent it to me. Happily, MonadCatchIO-transformers is able to derive a MonadCatchIO instance for my monad. This is a deep level change. It passes the test suite! What could it break? Well.. The most likely breakage would be to code that runs an Annex action in an exception handler, and wants state changes to be thrown away. Perhaps the state changes leaves the state inconsistent, or wrong. Since there are relatively few places in git-annex that catch exceptions in the Annex monad, and the AnnexState is generally just used to cache calculated data, this is unlikely to be a problem. Oh yeah, this change also makes Assistant.Types.ThreadedMonad a bit redundant. It's now entirely possible to run concurrent Annex actions in different threads, all sharing access to the same state! The ThreadedMonad just adds some extra work on top of that, with its own MVar, and avoids such actions possibly stepping on one-another's toes. I have not gotten rid of it, but might try that later. Being able to run concurrent Annex actions would simplify parts of the Assistant code.	2013-05-19 14:16:36 -04:00
Joey Hess	b8e5b9c645	test suite passes in direct mode This fixes a bug with git annex add in direct mode. If some files already existed in the tree pointing at the same key as a file that was just added, and their content was not present, add neglected to copy the content to those files. I also changed the behavior of moveAnnex slightly: When content is moved into the annex in direct mode, it does not overwrite any content already present in direct mode files. That content may be modified after all.	2013-05-17 15:59:37 -04:00
Joey Hess	abe8d549df	fix permission damage (thanks, Windows)	2013-05-11 23:54:25 -04:00
Joey Hess	3c7e30a295	git-annex now builds on Windows (doesn't work)	2013-05-11 15:03:00 -05:00
Joey Hess	68f38a7ae6	show a message to tell why adding a file failed	2013-04-23 18:09:00 -04:00
Joey Hess	8b0dcb3136	add: avoid ugly error message when adding a deleted file in direct mode Due to add using withFilesMaybeModified, it will get files that have been deleted but are still in the index. So catch the IO error that results when trying to stat such a file.	2013-04-23 17:22:56 -04:00
Joey Hess	7b4733f0e8	addurl: Bugfix: Did not properly add file in direct mode.	2013-04-11 13:35:52 -04:00
Joey Hess	602baae12e	Bugfix: Direct mode no longer repeatedly checksums duplicated files. Fixed by storing a list of cached inodes for a key, instead of just one. Backwards compatability note: An old git-annex version will fail to parse an inode cache file that has been written by a new version, and has multiple items. It will succees if just one. So old git-annexes will have even worse behavior when there are duplicated files, if that is possible. I don't think it will be a problem. (Famous last words.) Also, note that it doesn't expire old and unused inode caches for a key. It would be possible to add this if needed; just look through the associated files for a key and if there are more cached inodes, throw out any not corresponding to associated files. Unless a file is being copied repeatedly and the old copy deleted, this lack of expiry should not be a problem.	2013-04-06 16:07:25 -04:00
Joey Hess	f1b0a4b404	Use lower case hash directories for storing files on crippled filesystems, same as is already done for bare repositories. * since this is a crippled filesystem anyway, git-annex doesn't use symlinks on it * so there's no reason to use the mixed case hash directories that we're stuck using to avoid breaking everyone's symlinks to the content * so we can do what is already done for all bare repos, and make non-bare repos on crippled filesystems use the all-lower case hash directories * which are, happily, all 3 letters long, so they cannot conflict with mixed case hash directories * so I was able to 100% fix this and even resuming `git annex add` in the test case will recover and it will all just work.	2013-04-04 15:46:33 -04:00
Joey Hess	38d61f934d	Update working tree files fully atomically This avoids commit churn by the assistant when eg, replacing a file with a symlink. But, just as importantly, it prevents the working tree being left with a deleted file if git-annex, or perhaps the whole system, crashes at the wrong time. (It also probably avoids confusing displays in file managers.)	2013-04-02 15:02:00 -04:00
Joey Hess	b89efc79f6	add --force overrides annex.largefiles	2013-03-29 16:20:15 -04:00
Joey Hess	67e817c6a1	New annex.largefiles setting, which configures which files `git annex add` and the assistant add to the annex. I would have sort of liked to put this in .gitattributes, but it seems it does not support multi-word attribute values. Also, making this a single config setting makes it easy to only parse the expression once. A natural next step would be to make the assistant `git add` files that are not annex.largefiles. OTOH, I don't think `git annex add` should `git add` such files, because git-annex command line tools are not in the business of wrapping git command line tools.	2013-03-29 16:17:13 -04:00
Joey Hess	cfd3b16fe1	add section metadata to all commands Not yet used .. mindless train work.	2013-03-24 18:28:21 -04:00
Joey Hess	06046a0d2b	finish fast direct mode rename handling. wow, it's fast	2013-03-11 14:14:45 -04:00
Joey Hess	40df015d90	remove Eq instance for InodeCache There are two types of equality here, and which one is right varies, so this forces me to consider and choose between them. Based on this, I learned that the commit in git anex sync was always doing a strong comparison, even when in a repository where the inodes had changed. Fixed that.	2013-03-11 02:57:48 -04:00

1 2 3

131 commits