git-annex

Author	SHA1	Message	Date
Joey Hess	5d98cba923	use ByteStrings when reading annex symlinks and pointers Now there's a ByteString used all the way from disk to Key. The main complication in this conversion was the use of fromInternalGitPath in several places to munge things on Windows. The things that used that were changed to parse the ByteString using either path separator. Also some code that had read from files to a String lazily was changed to read a minimal strict ByteString.	2019-01-14 15:37:08 -04:00
Joey Hess	7d51b0c109	import Utility.FileSystemEncoding in Common	2019-01-03 11:37:02 -04:00
Joey Hess	a1730cd6af	adeiu, MissingH Removed dependency on MissingH, instead depending on the split library. After laying groundwork for this since 2015, it was mostly straightforward. Added Utility.Tuple and Utility.Split. Eyeballed System.Path.WildMatch while implementing the same thing. Since MissingH's progress meter display was being used, I re-implemented my own. Bonus: Now progress is displayed for transfers of files of unknown size. This commit was sponsored by Shane-o on Patreon.	2017-05-16 01:03:52 -04:00
Joey Hess	ed56dba868	annex.autocommit can be configured via git-annex config ... to control the default behavior in all clones of a repository. This includes a new Configurable data type, so the GitConfig type indicates which values can be configured this way. The implementation should be quite efficient; the config log is only read once, and only when a Configurable value has not already been set by git-config. Indeed, it would be nice in the future to extend this, so that git-config is itself only read on demand. Some commands may not need to look at the git configuration at all. This commit was sponsored by Trenton Cronholm on Patreon.	2017-02-03 13:58:53 -04:00
Joey Hess	0a4479b8ec	Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. ghc 8 added backtraces on uncaught errors. This is great, but git-annex was using error in many places for a error message targeted at the user, in some known problem case. A backtrace only confuses such a message, so omit it. Notably, commands like git annex drop that failed due to eg, numcopies, used to use error, so had a backtrace. This commit was sponsored by Ethan Aubin.	2016-11-15 21:29:54 -04:00
Joey Hess	fb8ab2469d	assistant: Fix bug that caused v6 pointer files to be annexed by the assistant.	2016-05-16 13:36:36 -04:00
Joey Hess	c61f038fc9	stage pointer file not annex link	2016-05-16 13:19:02 -04:00
Joey Hess	d37fe6a547	annex.largefiles can be configured in .gitattributes too This is particulary useful for v6 repositories, since the .gitattributes configuration will apply in all clones of the repository.	2016-02-02 15:18:17 -04:00
Joey Hess	b52cf5697b	immediate queue flushing when annex.queuesize=1 Previously, it only flushed when the queue got larger than 1. Also, make the queue auto-flush when items are added, rather than needing to be flushed as a separate step. This simplifies the code and make it more efficient too, as it avoids needing to read the queue out of the state to check if it should be flushed.	2016-01-13 14:55:01 -04:00
Joey Hess	b3d60ca285	use TopFilePath for associated files Fixes several bugs with updates of pointer files. When eg, running git annex drop --from localremote it was updating the pointer file in the local repository, not the remote. Also, fixes drop ../foo when run in a subdir, and probably lots of other problems. Test suite drops from ~30 to 11 failures now. TopFilePath is used to force thinking about what the filepath is relative to. The data stored in the sqlite db is still just a plain string, and TopFilePath is a newtype, so there's no overhead involved in using it in DataBase.Keys.	2016-01-05 17:22:19 -04:00
Joey Hess	ca2c977704	wip v6 support for assistant Files are not yet added to v6 repos in unlocked mode.	2015-12-21 18:41:15 -04:00
Joey Hess	cdd27b8920	reorg	2015-12-15 15:34:28 -04:00
Joey Hess	eb33569f9d	remove Params constructor from Utility.SafeCommand This removes a bit of complexity, and should make things faster (avoids tokenizing Params string), and probably involve less garbage collection. In a few places, it was useful to use Params to avoid needing a list, but that is easily avoided. Problems noticed while doing this conversion: * Some uses of Params "oneword" which was entirely unnecessary overhead. * A few places that built up a list of parameters with ++ and then used Params to split it! Test suite passes.	2015-06-01 13:52:23 -04:00
Joey Hess	70736d2b41	Repository tuning parameters can now be passed when initializing a repository for the first time. * init: Repository tuning parameters can now be passed when initializing a repository for the first time. For details, see http://git-annex.branchable.com/tuning/ * merge: Refuse to merge changes from a git-annex branch of a repo that has been tuned in incompatable ways.	2015-01-27 17:38:06 -04:00
Joey Hess	afc5153157	update my email address and homepage url	2015-01-21 12:50:09 -04:00
Joey Hess	068aaf943b	on second thought, InodeCache should use getFileSize This is necessary for interop between inode caches created on unix and windows. Which is more important than supporting inodecaches for large keys with the wrong size, which are broken anyway. There should be no slowdown from this change, except on Windows.	2015-01-20 19:35:50 -04:00
Joey Hess	9fd95d9025	indent with tabs not spaces Found these with: git grep "^ " $(find -type f -name \*.hs) \|grep -v ': where' Unfortunately there is some inline hamlet that cannot use tabs for indentation. Also, Assistant/WebApp/Bootstrap3.hs is a copy of a module and so I'm leaving it as-is.	2014-10-09 15:09:26 -04:00
Joey Hess	7b50b3c057	fix some mixed space+tab indentation This fixes all instances of " \t" in the code base. Most common case seems to be after a "where" line; probably vim copied the two space layout of that line. Done as a background task while listening to episode 2 of the Type Theory podcast.	2014-10-09 15:09:11 -04:00
Joey Hess	c784ef4586	unify exception handling into Utility.Exception Removed old extensible-exceptions, only needed for very old ghc. Made webdav use Utility.Exception, to work after some changes in DAV's exception handling. Removed Annex.Exception. Mostly this was trivial, but note that tryAnnex is replaced with tryNonAsync and catchAnnex replaced with catchNonAsync. In theory that could be a behavior change, since the former caught all exceptions, and the latter don't catch async exceptions. However, in practice, nothing in the Annex monad uses async exceptions. Grepping for throwTo and killThread only find stuff in the assistant, which does not seem related. Command.Add.undo is changed to accept a SomeException, and things that use it for rollback now catch non-async exceptions, rather than only IOExceptions.	2014-08-07 22:03:29 -04:00
Joey Hess	867fd116a7	better exception display	2014-07-26 23:01:44 -04:00
Joey Hess	e880d0d22c	replace (Key, Backend) with Key Only fsck and reinject and the test suite used the Backend, and they can look it up as needed from the Key. This simplifies the code and also speeds it up. There is a small behavior change here. Before, all commands would warn when acting on an annexed file with an unknown backend. Now, only fsck and reinject show that warning.	2014-04-17 18:03:39 -04:00
Joey Hess	fe19e15040	reorg matcher types; no non-type code changes	2014-03-29 14:43:34 -04:00
Joey Hess	a3fe8270ca	annex.startupscan can be set to false to disable the assistant's startup scan.	2014-03-05 17:44:14 -04:00
Joey Hess	3f6e4b8c7c	fix all remaining -Wall warnings on Windows	2014-02-25 14:48:50 -04:00
Joey Hess	58c7b0a56d	assistant: Always batch changes found in startup scan. Batch detection is heuristic, so can sometimes fail. I observed one such failure while starting up in a repository with 87000 files. After the first several batches of ~5000 files, it fell out of batch mode, and never re-entered it, and so made many more commits of a few files at a time than necessary. So, let's always use batch mode when in the startup scan. This avoids the heuristic there, at least. There is clearly also room to improve the heuristic. Possibly 10 files is too high a bar to be found during a commit, on a system that can commit quickly.	2013-12-16 16:16:19 -04:00
Joey Hess	2066e90421	avoid needing --force on windows despite no lsof Note that I still need to think this through and make sure handling of open files is safe. This is just for testing purposes.	2013-12-09 16:56:15 -04:00
Joey Hess	13108b7196	assistant: Notice on startup when the index file is corrupt, and auto-repair.	2013-11-13 14:27:17 -04:00
Joey Hess	a1b1b5ef52	moved code out of webapp No code changes, aside from some changes to lifting in code that turned out to be able to run in Assistant rather than Handler.	2013-10-26 16:58:16 -04:00
Joey Hess	4f871f89ba	git-recover-repository 1/2 done	2013-10-20 17:50:51 -04:00
Joey Hess	635c9a1549	assistant: Detect stale git lock files at startup time, and remove them. Extends the index.lock handling to other git lock files. I surveyed all lock files used by git, and found more than I expected. All are handled the same in git; it leaves them open while doing the operation, possibly writing the new file content to the lock file, and then closes them when done. The gc.pid file is excluded because it won't affect the normal operation of the assistant, and waiting for a gc to finish on startup wouldn't be good. All threads except the webapp thread wait on the new startup sanity checker thread to complete, so they won't try to do things with git that fail due to stale lock files. The webapp thread mostly avoids doing that kind of thing itself. A few configurators might fail on lock files, but only if the user is explicitly trying to run them. The webapp needs to start immediately when the user has opened it, even if there are stale lock files. Arranging for the threads to wait on the startup sanity checker was a bit of a bear. Have to get all the NotificationHandles set up before the startup sanity checker runs, or they won't see its signal. Perhaps the NotificationBroadcaster is not the best interface to have used for this. Oh well, it works. This commit was sponsored by Michael Jakl	2013-10-05 17:04:21 -04:00
Joey Hess	93dbb7842e	watcher: Detect at startup time when there is a stale .git/lock, and remove it so it does not interfere with the automatic commits of changed files.	2013-10-03 16:57:21 -04:00
Joey Hess	3ac9c4e672	hlint	2013-10-02 22:59:07 -04:00
Joey Hess	b191d5c595	gitignore support for the assistant and watcher Requires git 1.8.4 or newer. When it's installed, a background git check-ignore process is run, and used to efficiently check ignores whenever a new file is added. Thanks to Adam Spiers, for getting the necessary support into git for this. A complication is what to do about files that are gitignored but have been checked into git anyway. git commands assume the ignore has been overridden in this case, and not need any more overriding to commit a changed version. However, for the assistant to do the same, it would have to run git ls-files to check if the ignored file is in git. This is somewhat expensive. Or it could use the running git-cat-file process to query the file that way, but that requires transferring the whole file content over a pipe, so it can be quite expensive too, for files that are not git-annex symlinks. Now imagine if the user knows that a file or directory tree will be getting frequent changes, and doesn't want the assistant to sync it, so gitignores it. The assistant could overload the system with repeated ls-files checks! So, I've decided that the assistant will not automatically commit changes to files that are gitignored. This is a tradeoff. Hopefully it won't be a problem to adjust .gitignore settings to not ignore files you want the assistant to autocommit, or to manually git annex add files that are listed in .gitignore. (This could be revisited if git-annex gets access to an interface to check the content of the index w/o forking a git command. This could be libgit2, or perhaps a separate git cat-file --batch-check process, so it wouldn't need to ship over the whole file content.) This commit was sponsored by Francois Marier. Thanks!	2013-08-02 20:37:03 -04:00
Joey Hess	ed4febb170	remove debug print	2013-05-25 00:02:39 -04:00
Joey Hess	b8e5b9c645	test suite passes in direct mode This fixes a bug with git annex add in direct mode. If some files already existed in the tree pointing at the same key as a file that was just added, and their content was not present, add neglected to copy the content to those files. I also changed the behavior of moveAnnex slightly: When content is moved into the annex in direct mode, it does not overwrite any content already present in direct mode files. That content may be modified after all.	2013-05-17 15:59:37 -04:00
Joey Hess	a9081ae473	optimise direct mode startup scan A recent change made existing symlinks be re-staged. That does not need to be done during the startup scan though.	2013-04-24 21:20:29 -04:00
Joey Hess	ebee93a837	get rid of need to run pre-commit hook when assistant commits in direct mode That hook updates associated file bookkeeping info for direct mode. But, everything already called addAssociatedFile when adding/changing a file. It only needed to also call removeAssociatedFile when deleting a file, or a directory. This should make bulk adds faster, by some possibly significant amount. Bulk removals may be a little slower, since it has to use catKeyFile now on each removed file, but will still be faster than adds.	2013-04-24 18:04:59 -04:00
Joey Hess	b8e45ec9d7	refactoring and minor performance tweak	2013-04-24 17:46:46 -04:00
Joey Hess	04a27ad926	assistant: Bug fix to avoid annexing the files that git uses to stand in for symlinks on FAT and other filesystem not supporting symlinks. also, blog for the day..	2013-04-10 19:57:26 -04:00
Joey Hess	f1b0a4b404	Use lower case hash directories for storing files on crippled filesystems, same as is already done for bare repositories. * since this is a crippled filesystem anyway, git-annex doesn't use symlinks on it * so there's no reason to use the mixed case hash directories that we're stuck using to avoid breaking everyone's symlinks to the content * so we can do what is already done for all bare repos, and make non-bare repos on crippled filesystems use the all-lower case hash directories * which are, happily, all 3 letters long, so they cannot conflict with mixed case hash directories * so I was able to 100% fix this and even resuming `git annex add` in the test case will recover and it will all just work.	2013-04-04 15:46:33 -04:00
Joey Hess	6e7842475b	convert "./file" from inotify to just "file" This just prettifies some display.	2013-04-02 16:20:23 -04:00
Joey Hess	38d61f934d	Update working tree files fully atomically This avoids commit churn by the assistant when eg, replacing a file with a symlink. But, just as importantly, it prevents the working tree being left with a deleted file if git-annex, or perhaps the whole system, crashes at the wrong time. (It also probably avoids confusing displays in file managers.)	2013-04-02 15:02:00 -04:00
Joey Hess	8c52b20cc7	optimise last commit Rather than re-adding a direct mode file unnecessarily when it's not changed, just re-stage the symlink.	2013-04-02 12:58:56 -04:00
Joey Hess	31cbde8190	assistant: Fix bug that could cause direct mode files to be unstaged from git. My test case for this bug is to have the assistant running and syncing to a remote, and create a file in the annex. Then at the command line run git annex drop. The assistant sees that the file is gone, sees it's a wanted file, and downloads it from the remote. With a directory special remote and a small file, I was seeing around 1 time in 3, a race where the file got unstaged from git after it got downloaded. Looking at what direct mode content managing code does in this case, it deletes the symlink, and then adds the file content back. It would be possible, sometimes, to avoid removing the symlink and do this atomically. And I probably should.. but in some cases, particularly where the file needs to be run through `cp` (multiple direct mode files with same content), there's no way to atomically replace the symlink with the content. Anyway, the bug turns out to be something that the watcher does right for indirect mode, but not for direct mode. When it got an add event, it checked to see if this was a new file, or one we've already added. In the latter case, no add event was queued. But that means that only the rm event is queued, and so it unstages the file. Fixed by queueing an add event even when the file is already in git. Tested by running hundreds of drops in a loop; file remained staged.	2013-04-02 12:45:31 -04:00
Joey Hess	5771cfce02	assistant: Check small files into git directly.	2013-03-29 16:54:59 -04:00
Joey Hess	67e817c6a1	New annex.largefiles setting, which configures which files `git annex add` and the assistant add to the annex. I would have sort of liked to put this in .gitattributes, but it seems it does not support multi-word attribute values. Also, making this a single config setting makes it easy to only parse the expression once. A natural next step would be to make the assistant `git add` files that are not annex.largefiles. OTOH, I don't think `git annex add` should `git add` such files, because git-annex command line tools are not in the business of wrapping git command line tools.	2013-03-29 16:17:13 -04:00
Joey Hess	f340fd324c	synthesize RmChange when a directory is deleted This gets directory renames closer to being fully detected. There's close to no extra overhead to doing it this way.	2013-03-11 15:14:42 -04:00
Joey Hess	74f723bb50	let's put type modules under the parent module, not in a Types directory	2013-03-10 22:24:13 -04:00
Joey Hess	65a4c7966f	moved transfer queueing out of watcher and into committer This cleaned up the code quite a bit; now the committer just looks at the Change to see if it's a change that needs to have a transfer queued for it. If I later want to add dropping keys for files that were removed, or something like that, this should make it straightforward. This also fixes a bug. In direct mode, moving a file out of an archive directory failed to start a transfer to get its content. The problem was that the file had not been committed to git yet, and so the transfer code didn't want to touch it, since fileKey failed to get its key. Only starting transfers after a commit avoids this problem.	2013-03-10 18:16:03 -04:00
Joey Hess	c908672f3d	fix another potential race with the watcher and direct mode Watcher wants to rewrite symlink to fix it. But in direct mode, the symlink could be replaced at any time with file content that has finished being transferred by some other process. So, just don't touch it. FWIW, I audited the rest of the assistant for places where it removes files, and the rest is ok. I have not audited the rest of git-annex.	2013-03-04 15:09:32 -04:00

1 2 3

104 commits