git-annex

Author	SHA1	Message	Date
Joey Hess	a3224ce35b	avoid more build warnings on Windows	2013-08-04 14:05:36 -04:00
Joey Hess	06db8e0bd9	squash compiler warnings on Windows	2013-08-04 13:18:05 -04:00
Joey Hess	b191d5c595	gitignore support for the assistant and watcher Requires git 1.8.4 or newer. When it's installed, a background git check-ignore process is run, and used to efficiently check ignores whenever a new file is added. Thanks to Adam Spiers, for getting the necessary support into git for this. A complication is what to do about files that are gitignored but have been checked into git anyway. git commands assume the ignore has been overridden in this case, and do not need any more overriding to commit a changed version. However, for the assistant to do the same, it would have to run git ls-files to check if the ignored file is in git. This is somewhat expensive. Or it could use the running git-cat-file process to query the file that way, but that requires transferring the whole file content over a pipe, so it can be quite expensive too, for files that are not git-annex symlinks. Now imagine if the user knows that a file or directory tree will be getting frequent changes, and doesn't want the assistant to sync it, so gitignores it. The assistant could overload the system with repeated ls-files checks! So, I've decided that the assistant will not automatically commit changes to files that are gitignored. This is a tradeoff. Hopefully it won't be a problem to adjust .gitignore settings to not ignore files you want the assistant to autocommit, or to manually git annex add files that are listed in .gitignore. (This could be revisited if git-annex gets access to an interface to check the content of the index w/o forking a git command. This could be libgit2, or perhaps a separate git cat-file --batch-check process, so it wouldn't need to ship over the whole file content.) This commit was sponsored by Francois Marier. Thanks!	2013-08-02 20:37:03 -04:00
Joey Hess	93f2371e09	get rid of __WINDOWS__, use mingw32_HOST_OS The latter is harder for me to remember, but avoids build failures in code used by the configure program.	2013-08-02 12:27:32 -04:00
Joey Hess	ddd46db09a	Fix a few bugs involving filenames that are at or near the filesystem's maximum filename length limit. Started with a problem when running addurl on a really long url, because the whole url is munged into the filename. Ended up doing a fairly extensive review for places where filenames could get too large, although it's hard to say I've not missed any. Backend.Url had a 128 character limit, which is fine when the limit is 255, but not if it's a lot shorter on some systems. So check the pathconf() limit. Note that this could result in fromUrl creating different keys for the same url, if run on systems with different limits. I don't see that this is likely to cause any problems. That can already happen when using addurl --fast, or if the content of an url changes. Both Command.AddUrl and Backend.Url assumed that urls don't contain a lot of multi-byte unicode, and would fail to properly truncate an url that did. A few places use a filename as the template to make a temp file. While that's nice in that the temp file name can be easily related back to the original filename, it could lead to `git annex add` failing to add a filename that was at or close to the maximum length. Note that in Command.Add.lockdown, the template is still derived from the filename, just with enough space left to turn it into a temp file. This is an important optimisation, because the assistant may lock down a bunch of files all at once, and using the same template for all of them would cause openTempFile to iterate through the same set of names, looking for an unused temp file. I'm not very happy with the relatedTemplate hack, but it avoids that slowdown. Backend.WORM does not limit the filename stored in the key. I have not tried to change that; so git annex add will fail on really long filenames when using the WORM backend. It seems better to preserve the invariant that a WORM key always contains the complete filename, since the filename is the only unique material in the key, other than mtime and size. Since nobody has complained about add failing (I think I saw it once?) on WORM, probably it's ok, or nobody but me uses it. There may be compatibility problems if using git annex addurl --fast or the WORM backend on a system with the 255 limit and then trying to use that repo in a system with a smaller limit. I have not tried to deal with those. This commit was sponsored by Alexander Brem. Thanks!	2013-07-30 19:18:29 -04:00
Joey Hess	7b0970b340	Fix inverted logic in last release's fix for data loss bug, that caused git-annex sync on FAT or other crippled filesystems to add symlink standin files to the annex.	2013-07-30 16:08:09 -04:00
Joey Hess	7e66d260ea	importfeed: git-annex becomes a podcatcher in 150 LOC	2013-07-28 16:55:42 -04:00
Joey Hess	6ae2637eb1	For long hostnames, use a hash of the hostname to generate the socket file for ssh connection caching. This is ok to do now that the socket filename never needs to be mapped back to a hostname. Short hostnames will still appear in the clear, which is less obfuscated. So this cannot possibly make ssh connection caching fail for a hostname it used to work for.	2013-07-22 15:09:41 -04:00
Joey Hess	c6a020ad1f	stop cached ssh connection w/o needing to look up host and port Turns out that with -O stop -S socketfile, ssh does not need the real hostname, or port to be specified. This is because it simply talks to the ssh behind the socket and tells it to stop. So, can eliminate the conversion back from a socketfile to host and port. Which will allow using shorter filenames for sockets in the future.	2013-07-21 14:14:54 -04:00
Joey Hess	ecdfa40cbe	avoid false positives when detecting core.symlinks=false symlink standin files If the file is > 8192 bytes, it's certainly not a symlink file. And if it contains nuls or newlines or whitespace, it's certainly not a link to annexed content. But it might be a tarball containing a git-annex repo.	2013-07-20 19:28:02 -04:00
Joey Hess	ae341c1a37	avoid reading files that are not symlinks when core.symlinks=false This hack is only needed on FAT filesystems, so there's no point in doing it the rest of the time. And it's possible for there to be a false positive, so it's best to avoid the hack when possible.	2013-07-20 19:14:29 -04:00
Joey Hess	3e422cb5fa	fix uninit to delete content from annex when it ended up hard linked back to the work tree	2013-07-18 13:30:12 -04:00
Joey Hess	c1307b1388	fsck: Don't claim to fix direct mode when run on a symlink whose content is not present.	2013-07-08 17:29:42 -04:00
Joey Hess	d84a000e92	detect system with no dot in FQDN, where git commit will fail, and workaround Sigh, git is so fragile. Or rather, across the set of systems that use git-annex, there are so many horribly broken systems.	2013-07-05 12:24:28 -04:00
Joey Hess	7a7e426352	moved AssociatedFile definition	2013-07-04 02:36:02 -04:00
Joey Hess	72ab02ca48	avoid failure creating inode sentinal file Test suite on windows failed running git annex init in a bare clone of an annexed repo. The annex directory didn't exist when it tried to write the inode sentinal file.	2013-06-18 15:38:17 -04:00
Joey Hess	1312cffad0	Revert "Windows: Ssh connection caching is now supported." Yeah, that didn't actually work. Got error messages like it couldn't read from the control socket, so probably ssh doesn't really support that on Windows, at least the cygwin ssh build I'm using.	2013-06-17 22:13:28 -04:00
Joey Hess	07a17f58b7	Windows: Ssh connection caching is now supported. Turns out the socket stuff just works on windows.	2013-06-17 22:05:49 -04:00
Joey Hess	d80a0f62a4	avoid lazy read of file contents On Windows, that means the file could still be open when later code wants to delete it, which fails. Since we're only reading 8k anyway, just read it, strictly. However, avoid reading the whole file strictly, so no getContentsStrict here.	2013-06-17 21:12:09 -04:00
Joey Hess	b7674b464b	typo in comment	2013-06-17 20:45:04 -04:00
Joey Hess	0527c74c0f	assistant: In direct mode, objects are now only dropped when all associated files are unwanted. This avoids a repeated drop/get loop of a file that has a copy in an archive directory, and a copy not in an archive directory. (Indirect mode still has some buggy behavior in this area, since it does not keep track of associated files.) Closes: #712060	2013-06-15 14:44:43 -04:00
Joey Hess	92f036fcb4	avoid warnings when built with ghc 7.6	2013-06-02 15:01:58 -04:00
Joey Hess	eba9ee5bc6	remove debug print	2013-05-27 11:18:18 -04:00
Joey Hess	3b1aedea3d	Merge branch 'robustness'	2013-05-25 15:22:18 -04:00
Joey Hess	5eeea0fac9	make direct mode merge cleanup more robust If the cleanup of a single file fails for some reason, continue to clean up other files. This could happen because of a race. The merge pulls in a change to a file, which gets changed locally at the same time.	2013-05-25 15:22:16 -04:00
Joey Hess	bf86b5ca16	improve robustness of fromDirect and replaceFile Made fromDirect check that a file in the tree has good content (and is not a broken symlink either) before copying it to another file that has the same key. Made replaceFile clean up the temp file if the action that creates it, or the file replacement action fails.	2013-05-25 15:06:02 -04:00
Joey Hess	729eab1f89	assistant: Work around git-cat-file's not reloading the index after files are staged. Argh.	2013-05-25 00:37:41 -04:00
Joey Hess	2b14fe2c98	refactor	2013-05-24 23:07:26 -04:00
Joey Hess	08c03b2af3	XMPP: Avoid redundant and unnecessary pushes. Note that this breaks compatibility with previous versions of git-annex, which will refuse to accept any XMPP pushes from this version.	2013-05-21 18:24:29 -04:00
Joey Hess	0cb34f3caa	update inode cache after copying content This was also tripped by the test suite's automatic conflict resolution test. Which also shows BTW that an unnecessary copy of content is done sometimes when merging in direct mode. Not going to try to speed that up now.	2013-05-20 17:11:40 -04:00
Joey Hess	d88be65495	didn't quite get removeDirect right before, this passes test suite	2013-05-20 16:28:33 -04:00
Joey Hess	3d8355d984	Fix a bug in the git-annex branch handling code that could cause info from a remote to not be merged and take effect immediately. This bug was turned up by the test suite, running fsck in direct mode. A repository was cloned, was put into direct mode, was fscked, and fsck incorrectly said that no copy existed of a file, that was actually present in origin. This turned out to occur because fsck first did an Annex.Branch.change, recording that it did not locally have the file. That was recorded in the journal. Since neither the git annex direct nor the fsck had yet needed to read any info from the branch, but had only made changes to it, the origin/git-annex branch was not yet merged in. So the journal got a location log entry written to it, but this did not include the location log info for the origin. When fsck then did an Annex.Branch.get, it trusted the journal was consistent, and returned it, again w/o merging from origin/git-annex. This latter behavior is the actual bug. Refer to commit `e9bfa8eaed` for the thinking behind it being ok to make a change to a file on the branch, without first merging the branch. That thinking still stands. However, it means that files in the journal cannot be trusted to be consistent if the branch has not been merged. So, to fix, just ensure the branch gets merged, even when reading from the journal. In tests, this does not seem to cause any extra merging. Except, of course, in the one case described above. But git annex add, etc, are able to make changes w/o first merging the branch.	2013-05-20 15:14:59 -04:00
Joey Hess	4c22c2261f	minor optimisation and warning fix	2013-05-20 13:58:41 -04:00
Joey Hess	f4ba19f2b8	direct mode bug fix: After a conflicted merge was automatically resolved, the content of a file that was already present could incorrectly be replaced with a symlink. The bug was in movein, which just replaceFile'd the file with a symlink, even if it already had the desired content, before trying to pull the content out of the annex and replace the symlink with it. That was ok-ish for non conflicted merges, where if the file existed it would be an old version of the content. But for conflicted merges, the automatic merge resolver has already run, and will have already put the desired content into the file for the local variant. Also, made removeDirect not trust that the associated files map is correct. Only if it can verify that another file has the content will it not move it into .git/annex/objects.	2013-05-20 13:41:09 -04:00
Joey Hess	345ee4f37c	Switch to MonadCatchIO-transformers for better handling of state while catching exceptions. As seen in this bug report, the lifted exception handling using the StateT monad throws away state changes when an action throws an exception. http://git-annex.branchable.com/bugs/git_annex_fork_bombs_on_gpg_file/ .. Which can result in cached values being redundantly calculated, or other possibly worse bugs when the annex state gets out of sync with reality. This switches from a StateT AnnexState to a ReaderT (MVar AnnexState). All changes to the state go via the MVar. So when an Annex action is running inside an exception handler, and it makes some changes, they immediately go into effect in the MVar. If it then throws an exception (or even crashes its thread!), the state changes are still in effect. The MonadCatchIO-transformers change is actually only incidental. I could have kept on using lifted-base for the exception handling. However, I'd have needed to write a new instance of MonadBaseControl for the new monad.. and I didn't write the old instance.. I begged Bas and he kindly sent it to me. Happily, MonadCatchIO-transformers is able to derive a MonadCatchIO instance for my monad. This is a deep level change. It passes the test suite! What could it break? Well.. The most likely breakage would be to code that runs an Annex action in an exception handler, and wants state changes to be thrown away. Perhaps the state changes leave the state inconsistent, or wrong. Since there are relatively few places in git-annex that catch exceptions in the Annex monad, and the AnnexState is generally just used to cache calculated data, this is unlikely to be a problem. Oh yeah, this change also makes Assistant.Types.ThreadedMonad a bit redundant. It's now entirely possible to run concurrent Annex actions in different threads, all sharing access to the same state! The ThreadedMonad just adds some extra work on top of that, with its own MVar, and avoids such actions possibly stepping on one-another's toes. I have not gotten rid of it, but might try that later. Being able to run concurrent Annex actions would simplify parts of the Assistant code.	2013-05-19 14:16:36 -04:00
Joey Hess	630a8b9ad2	warning	2013-05-19 12:43:44 -04:00
Joey Hess	1b616c5d37	improve handling of receiving object in direct mode when associated files are modified Before, if a direct mode repo had one or more associated files that were modified, moving the object into it would overwrite the associated files with the pristine object. Now, modified associated files are left unchanged. To ensure that, when an object is moved into a direct mode repo, it's not thrown away, it gets stored in indirect mode.	2013-05-17 16:25:18 -04:00
Joey Hess	94cb037aa3	store copy in inode cache too	2013-05-17 16:16:10 -04:00
Joey Hess	b8e5b9c645	test suite passes in direct mode This fixes a bug with git annex add in direct mode. If some files already existed in the tree pointing at the same key as a file that was just added, and their content was not present, add neglected to copy the content to those files. I also changed the behavior of moveAnnex slightly: When content is moved into the annex in direct mode, it does not overwrite any content already present in direct mode files. That content may be modified after all.	2013-05-17 15:59:37 -04:00
Joey Hess	3240006c56	fix android build, broken by changes for windows port	2013-05-16 11:52:48 -04:00
Joey Hess	aba49995b6	Merge branch 'master' into windows	2013-05-15 19:18:04 -04:00
Joey Hess	4829eae883	fix toDirectGen bug introduced in `247b7e9e58`	2013-05-15 19:15:40 -04:00
Joey Hess	c62b54d80d	start one git-cat-file per index file This reverts `1c83b6c439` and properly fixes the issue discussed there. This makes git-annex behave much nicer in direct mode.	2013-05-15 18:46:38 -04:00
Joey Hess	25cb9a48da	fix the day's Windows permissions damage	2013-05-14 20:15:14 -04:00
Joey Hess	8a2ff023a3	convert from internal git path when checking symlink standin file	2013-05-14 15:08:40 -05:00
Joey Hess	15af92291f	Merge remote-tracking branch 'gnu/windows' into windows	2013-05-14 14:21:49 -05:00
Joey Hess	fee6cd4635	fix imports	2013-05-14 14:21:35 -05:00
Joey Hess	e7936b1a34	always try to read symlink; only fall back to looking inside file On Windows with Cygwin, checking out a git-annex repo will create symlinks on disk, so we need to always try to read the symlink, even when core.symlinks says they're not supported.	2013-05-14 14:18:47 -04:00
Joey Hess	17952a893e	fix imports	2013-05-14 13:53:29 -04:00
Joey Hess	43f2de8522	Merge branch 'windows' of git://git-annex.branchable.com into windows	2013-05-13 20:11:30 -05:00

321 commits
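
The check described in commit ecdfa40cbe (ruling out files that cannot be core.symlinks=false symlink standins) can be sketched roughly as follows. This is only an illustration in Python, not git-annex's actual Haskell code: the function name, the constant name, the exact set of whitespace bytes, and the handling of empty files are all assumptions made for the example.

```python
# Hypothetical sketch of the heuristic from commit ecdfa40cbe.
# Names and the empty-file rule are assumptions, not git-annex's real API.

MAX_STANDIN_SIZE = 8192  # per the commit: larger files are certainly not symlink files

# Bytes that rule a file out: NUL, newlines, and other ASCII whitespace.
DISQUALIFYING_BYTES = b"\x00 \t\n\r\x0b\x0c"


def could_be_symlink_standin(content: bytes) -> bool:
    """Return False when the content is certainly not a link to annexed content."""
    if not content:
        # Assumption: an empty file carries no link target at all.
        return False
    if len(content) > MAX_STANDIN_SIZE:
        return False
    # Iterating over bytes yields ints; `int in bytes` tests for that byte value.
    return not any(byte in DISQUALIFYING_BYTES for byte in content)
```

A True result only means the file survives these cheap checks; the caller would still have to parse the content as a link target. As the commit notes, larger or whitespace-containing files (for example a tarball containing a git-annex repo) are rejected without further inspection.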