git-annex

Author	SHA1	Message	Date
Joey Hess	a48a4e2f8a	automatically derive an annex-uuid from a gcrypt-uuid	2013-09-05 16:02:39 -04:00
Joey Hess	4079f9cfe8	avoid double commit during transition The second commit had some bad refs which resulted in the race detection code running. But that commit was unnecessary anyway, it only was there to merge in the other refs.	2013-09-03 16:33:15 -04:00
Joey Hess	db83cc82d6	Merge branch 'forget' Conflicts: debian/changelog	2013-09-03 14:36:00 -04:00
Joey Hess	67fda9e669	Honor core.sharedrepository when receiving and adding files in direct mode.	2013-09-03 13:35:49 -04:00
Joey Hess	0831e18372	forget --drop-dead: Completely removes mentions of repositories that have been marked as dead from the git-annex branch. Wrote nice pure transition calculator, and ugly code to stage its results into the git-annex branch. Also had to split up several Log modules that Annex.Branch needed to use, but that themselves used Annex.Branch. The transition calculator is limited to looking at and changing one file at a time. While this made the implementation relatively easy, it precludes transitions that do stuff like deleting old url log files for keys that are being removed because they are no longer present anywhere.	2013-08-31 17:51:13 -04:00
Joey Hess	2f57d74534	remove print	2013-08-29 20:28:45 -04:00
Joey Hess	6147652cc6	wording	2013-08-29 16:41:59 -04:00
Joey Hess	6cdac3a003	sync, assistant: Force push of the git-annex branch. Necessary to ensure it gets pushed to remotes after being rewritten by forget. See inline rationales for why I think this is safe!	2013-08-29 14:27:53 -04:00
Joey Hess	c181efe437	use --force in taggedPush This should make the assistant force update its tagged push branch after a transition like git annex forget.	2013-08-29 13:31:29 -04:00
Joey Hess	336d5ec349	Merge branch 'master' into forget	2013-08-29 13:23:02 -04:00
Joey Hess	d3af414568	typo	2013-08-28 17:05:07 -04:00
Joey Hess	4a915cd3cd	add forget command Works, more or less. --dead is not implemented, and so far a new branch is made, but keys no longer present anywhere are not scrubbed. git annex sync fails to push the synced/git-annex branch after a forget, because it's not a fast-forward of the existing synced branch. Could be fixed by making git-annex sync use assistant-style sync branches.	2013-08-28 16:41:13 -04:00
Joey Hess	fcd5c167ef	untested transition detection on merging, and transition running code	2013-08-28 15:57:42 -04:00
Joey Hess	46b6d75274	Youtube support! (And 53 other video hosts) When quvi is installed, git-annex addurl automatically uses it to detect when a page is a video, and downloads the video file. web special remote: Also support using quvi, for getting files, or checking if files exist in the web. This commit was sponsored by Mark Hepburn. Thanks!	2013-08-22 18:50:43 -04:00
Joey Hess	412dcb8017	Fix bug that caused typechanged symlinks to be assumed to be unlocked files, so they were added to the annex by the pre-commit hook.	2013-08-22 13:57:07 -04:00
Joey Hess	a3224ce35b	avoid more build warnings on Windows	2013-08-04 14:05:36 -04:00
Joey Hess	06db8e0bd9	squash compiler warnings on Windows	2013-08-04 13:18:05 -04:00
Joey Hess	b191d5c595	gitignore support for the assistant and watcher Requires git 1.8.4 or newer. When it's installed, a background git check-ignore process is run, and used to efficiently check ignores whenever a new file is added. Thanks to Adam Spiers, for getting the necessary support into git for this. A complication is what to do about files that are gitignored but have been checked into git anyway. git commands assume the ignore has been overridden in this case, and do not need any more overriding to commit a changed version. However, for the assistant to do the same, it would have to run git ls-files to check if the ignored file is in git. This is somewhat expensive. Or it could use the running git-cat-file process to query the file that way, but that requires transferring the whole file content over a pipe, so it can be quite expensive too, for files that are not git-annex symlinks. Now imagine if the user knows that a file or directory tree will be getting frequent changes, and doesn't want the assistant to sync it, so gitignores it. The assistant could overload the system with repeated ls-files checks! So, I've decided that the assistant will not automatically commit changes to files that are gitignored. This is a tradeoff. Hopefully it won't be a problem to adjust .gitignore settings to not ignore files you want the assistant to autocommit, or to manually git annex add files that are listed in .gitignore. (This could be revisited if git-annex gets access to an interface to check the content of the index w/o forking a git command. This could be libgit2, or perhaps a separate git cat-file --batch-check process, so it wouldn't need to ship over the whole file content.) This commit was sponsored by Francois Marier. Thanks!	2013-08-02 20:37:03 -04:00
Joey Hess	93f2371e09	get rid of __WINDOWS__, use mingw32_HOST_OS The latter is harder for me to remember, but avoids build failures in code used by the configure program.	2013-08-02 12:27:32 -04:00
Joey Hess	ddd46db09a	Fix a few bugs involving filenames that are at or near the filesystem's maximum filename length limit. Started with a problem when running addurl on a really long url, because the whole url is munged into the filename. Ended up doing a fairly extensive review for places where filenames could get too large, although it's hard to say I've not missed any.. Backend.Url had a 128 character limit, which is fine when the limit is 255, but not if it's a lot shorter on some systems. So check the pathconf() limit. Note that this could result in fromUrl creating different keys for the same url, if run on systems with different limits. I don't see this as likely to cause any problems. That can already happen when using addurl --fast, or if the content of an url changes. Both Command.AddUrl and Backend.Url assumed that urls don't contain a lot of multi-byte unicode, and would fail to properly truncate an url that did. A few places use a filename as the template to make a temp file. While that's nice in that the temp file name can be easily related back to the original filename, it could lead to `git annex add` failing to add a filename that was at or close to the maximum length. Note that in Command.Add.lockdown, the template is still derived from the filename, just with enough space left to turn it into a temp file. This is an important optimisation, because the assistant may lock down a bunch of files all at once, and using the same template for all of them would cause openTempFile to iterate through the same set of names, looking for an unused temp file. I'm not very happy with the relatedTemplate hack, but it avoids that slowdown. Backend.WORM does not limit the filename stored in the key. I have not tried to change that; so git annex add will fail on really long filenames when using the WORM backend. It seems better to preserve the invariant that a WORM key always contains the complete filename, since the filename is the only unique material in the key, other than mtime and size. Since nobody has complained about add failing (I think I saw it once?) on WORM, probably it's ok, or nobody but me uses it. There may be compatibility problems if using git annex addurl --fast or the WORM backend on a system with the 255 limit and then trying to use that repo in a system with a smaller limit. I have not tried to deal with those. This commit was sponsored by Alexander Brem. Thanks!	2013-07-30 19:18:29 -04:00
Joey Hess	7b0970b340	Fix inverted logic in last release's fix for data loss bug, that caused git-annex sync on FAT or other crippled filesystems to add symlink standin files to the annex.	2013-07-30 16:08:09 -04:00
Joey Hess	7e66d260ea	importfeed: git-annex becomes a podcatcher in 150 LOC	2013-07-28 16:55:42 -04:00
Joey Hess	6ae2637eb1	For long hostnames, use a hash of the hostname to generate the socket file for ssh connection caching. This is ok to do now that the socket filename never needs to be mapped back to a hostname. Short hostnames will still appear in the clear, which is less obfuscated. So this cannot possibly make ssh connection caching fail for a hostname it used to work for.	2013-07-22 15:09:41 -04:00
Joey Hess	c6a020ad1f	stop cached ssh connection w/o needing to look up host and port Turns out that with -O stop -S socketfile, ssh does not need the real hostname, or port to be specified. This is because it simply talks to the ssh behind the socket and tells it to stop. So, can eliminate the conversion back from a socketfile to host and port. Which will allow using shorter filenames for sockets in the future.	2013-07-21 14:14:54 -04:00
Joey Hess	ecdfa40cbe	avoid false positives when detecting core.symlinks=false symlink standin files If the file is > 8192 bytes, it's certainly not a symlink file. And if it contains nuls or newlines or whitespace, it's certainly not a link to annexed content. But it might be a tarball containing a git-annex repo.	2013-07-20 19:28:02 -04:00
Joey Hess	ae341c1a37	avoid reading files that are not symlinks when core.symlinks=false This hack is only needed on FAT filesystems, so there's no point in doing it the rest of the time. And it's possible for there to be a false positive, so it's best to avoid the hack when possible.	2013-07-20 19:14:29 -04:00
Joey Hess	3e422cb5fa	fix uninit to delete content from annex when it ended up hard linked back to the work tree	2013-07-18 13:30:12 -04:00
Joey Hess	c1307b1388	fsck: Don't claim to fix direct mode when run on a symlink whose content is not present.	2013-07-08 17:29:42 -04:00
Joey Hess	d84a000e92	detect system with no dot in FQDN, where git commit will fail, and work around Sigh, git is so fragile. Or rather, across the set of systems that use git-annex, there are so many horribly broken systems..	2013-07-05 12:24:28 -04:00
Joey Hess	7a7e426352	moved AssociatedFile definition	2013-07-04 02:36:02 -04:00
Joey Hess	72ab02ca48	avoid failure creating inode sentinal file Test suite on windows failed running git annex init in a bare clone of an annexed repo. The annex directory didn't exist when it tried to write the inode sentinal file.	2013-06-18 15:38:17 -04:00
Joey Hess	1312cffad0	Revert "Windows: Ssh connection caching is now supported." Yeah, that didn't actually work. Got error messages like it couldn't read from the control socket, so probably ssh doesn't really support that on Windows, at least the cygwin ssh build I'm using.	2013-06-17 22:13:28 -04:00
Joey Hess	07a17f58b7	Windows: Ssh connection caching is now supported. Turns out the socket stuff just works on windows.	2013-06-17 22:05:49 -04:00
Joey Hess	d80a0f62a4	avoid lazy read of file contents On Windows, that means the file could still be open when later code wants to delete it, which fails. Since we're only reading 8k anyway, just read it, strictly. However, avoid reading the whole file strictly, so no getContentsStrict here.	2013-06-17 21:12:09 -04:00
Joey Hess	b7674b464b	typo in comment	2013-06-17 20:45:04 -04:00
Joey Hess	0527c74c0f	assistant: In direct mode, objects are now only dropped when all associated files are unwanted. This avoids a repeated drop/get loop of a file that has a copy in an archive directory, and a copy not in an archive directory. (Indirect mode still has some buggy behavior in this area, since it does not keep track of associated files.) Closes: #712060	2013-06-15 14:44:43 -04:00
Joey Hess	92f036fcb4	avoid warnings when built with ghc 7.6	2013-06-02 15:01:58 -04:00
Joey Hess	eba9ee5bc6	remove debug print	2013-05-27 11:18:18 -04:00
Joey Hess	3b1aedea3d	Merge branch 'robustness'	2013-05-25 15:22:18 -04:00
Joey Hess	5eeea0fac9	make direct mode merge cleanup more robust If the cleanup of a single file fails for some reason, continue to clean up other files. This could happen because of a race. The merge pulls in a change to a file, which gets changed locally at the same time.	2013-05-25 15:22:16 -04:00
Joey Hess	bf86b5ca16	improve robustness of fromDirect and replaceFile Made fromDirect check that a file in the tree has good content (and is not a broken symlink either) before copying it to another file that has the same key. Made replaceFile clean up the temp file if the action that creates it, or the file replacement action fails.	2013-05-25 15:06:02 -04:00
Joey Hess	729eab1f89	assistant: Work around git-cat-file's not reloading the index after files are staged. Argh.	2013-05-25 00:37:41 -04:00
Joey Hess	2b14fe2c98	refactor	2013-05-24 23:07:26 -04:00
Joey Hess	08c03b2af3	XMPP: Avoid redundant and unnecessary pushes. Note that this breaks compatibility with previous versions of git-annex, which will refuse to accept any XMPP pushes from this version.	2013-05-21 18:24:29 -04:00
Joey Hess	0cb34f3caa	update inode cache after copying content This was also tripped by the test suite's automatic conflict resolution test. Which also shows BTW that an unnecessary copy of content is done sometimes when merging in direct mode. Not going to try to speed that up now.	2013-05-20 17:11:40 -04:00
Joey Hess	d88be65495	didn't quite get removeDirect right before, this passes test suite	2013-05-20 16:28:33 -04:00
Joey Hess	3d8355d984	Fix a bug in the git-annex branch handling code that could cause info from a remote to not be merged and take effect immediately. This bug was turned up by the test suite, running fsck in direct mode. A repository was cloned, was put into direct mode, was fscked, and fsck incorrectly said that no copy existed of a file, that was actually present in origin. This turned out to occur because fsck first did an Annex.Branch.change, recording that it did not locally have the file. That was recorded in the journal. Since neither the git annex direct nor the fsck had yet needed to read any info from the branch, but had only made changes to it, the origin/git-annex branch was not yet merged in. So the journal got a location log entry written to it, but this did not include the location log info for the origin. When fsck then did an Annex.Branch.get, it trusted the journal was consistent, and returned it, again w/o merging from origin/git-annex. This latter behavior is the actual bug. Refer to commit `e9bfa8eaed` for the thinking behind it being ok to make a change to a file on the branch, without first merging the branch. That thinking still stands. However, it means that files in the journal cannot be trusted to be consistent if the branch has not been merged. So, to fix, just ensure the branch gets merged, even when reading from the journal. In tests, this does not seem to cause any extra merging. Except, of course, in the one case described above. But git annex add, etc, are able to make changes w/o first merging the branch.	2013-05-20 15:14:59 -04:00
Joey Hess	4c22c2261f	minor optimisation and warning fix	2013-05-20 13:58:41 -04:00
Joey Hess	f4ba19f2b8	direct mode bug fix: After a conflicted merge was automatically resolved, the content of a file that was already present could incorrectly be replaced with a symlink. The bug was in movein, which just replaceFile'd the file with a symlink, even if it already had the desired content, before trying to pull the content out of the annex and replace the symlink with it. That was ok-ish for non conflicted merges, where if the file existed it would be an old version of the content. But for conflicted merges, the automatic merge resolver has already run, and will have already put the desired content into the file for the local variant. Also, made removeDirect not trust that the associated files map is correct. Only if it can verify that another file has the content will it not move it into .git/annex/objects.	2013-05-20 13:41:09 -04:00
Joey Hess	345ee4f37c	Switch to MonadCatchIO-transformers for better handling of state while catching exceptions. As seen in this bug report, the lifted exception handling using the StateT monad throws away state changes when an action throws an exception. http://git-annex.branchable.com/bugs/git_annex_fork_bombs_on_gpg_file/ .. Which can result in cached values being redundantly calculated, or other possibly worse bugs when the annex state gets out of sync with reality. This switches from a StateT AnnexState to a ReaderT (MVar AnnexState). All changes to the state go via the MVar. So when an Annex action is running inside an exception handler, and it makes some changes, they immediately go into effect in the MVar. If it then throws an exception (or even crashes its thread!), the state changes are still in effect. The MonadCatchIO-transformers change is actually only incidental. I could have kept on using lifted-base for the exception handling. However, I'd have needed to write a new instance of MonadBaseControl for the new monad.. and I didn't write the old instance.. I begged Bas and he kindly sent it to me. Happily, MonadCatchIO-transformers is able to derive a MonadCatchIO instance for my monad. This is a deep level change. It passes the test suite! What could it break? Well.. The most likely breakage would be to code that runs an Annex action in an exception handler, and wants state changes to be thrown away. Perhaps the state changes leave the state inconsistent, or wrong. Since there are relatively few places in git-annex that catch exceptions in the Annex monad, and the AnnexState is generally just used to cache calculated data, this is unlikely to be a problem. Oh yeah, this change also makes Assistant.Types.ThreadedMonad a bit redundant. It's now entirely possible to run concurrent Annex actions in different threads, all sharing access to the same state! The ThreadedMonad just adds some extra work on top of that, with its own MVar, and avoids such actions possibly stepping on one-another's toes. I have not gotten rid of it, but might try that later. Being able to run concurrent Annex actions would simplify parts of the Assistant code.	2013-05-19 14:16:36 -04:00
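
Commit ddd46db09a above mentions two related ideas: asking pathconf() for the filesystem's filename length limit, and truncating a url-derived name correctly when it contains multi-byte unicode. The following is a minimal Python sketch of both, not git-annex's Haskell implementation. The names `filename_limit` and `truncate_to_bytes` and the 255-byte fallback are assumptions made for illustration.

```python
import os


def filename_limit(directory=".", fallback=255):
    """Return the maximum filename length, in bytes, for `directory`.

    Hypothetical helper: falls back to `fallback` where pathconf is
    unavailable (e.g. Windows) or the limit cannot be determined.
    """
    try:
        limit = os.pathconf(directory, "PC_NAME_MAX")
    except (AttributeError, OSError, ValueError):
        return fallback
    return limit if limit > 0 else fallback


def truncate_to_bytes(name, limit):
    """Truncate `name` so that its UTF-8 encoding fits in `limit` bytes.

    Counting characters instead of bytes is the mistake described in the
    commit message. Here the encoded bytes are cut, and decoding with
    errors="ignore" drops any partial multi-byte sequence at the cut.
    """
    encoded = name.encode("utf-8")
    if len(encoded) <= limit:
        return name
    return encoded[:limit].decode("utf-8", errors="ignore")
```

As the commit message notes, a limit taken from the local filesystem means the same url can yield different truncated names on systems with different limits.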

336 commits
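
Commit ecdfa40cbe in the log above describes a heuristic for ruling out files that cannot be core.symlinks=false symlink standins: anything over 8192 bytes, or anything containing nuls, newlines or whitespace. A rough Python sketch of that check follows. The function name `could_be_symlink_standin` is hypothetical, and the real code is Haskell and may differ in detail.

```python
import os
import re

# Size threshold taken from the commit message.
MAX_STANDIN_SIZE = 8192

# Any nul byte or ASCII whitespace (including newlines) rules a file out.
_DISQUALIFYING = re.compile(rb"[\x00\s]")


def could_be_symlink_standin(path, max_size=MAX_STANDIN_SIZE):
    """Return False when `path` is certainly not a symlink standin file.

    A True result only means the file was not ruled out; the caller would
    still need to check that the content looks like a link to annexed
    content.
    """
    if os.path.getsize(path) > max_size:
        return False
    with open(path, "rb") as handle:
        # Read strictly and close the handle straight away.
        content = handle.read(max_size)
    return _DISQUALIFYING.search(content) is None
```

Checking the size first avoids reading large files at all, which fits the intent of commit ae341c1a37 to skip the hack whenever possible.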