git-annex

Author	SHA1	Message	Date
Joey Hess	74b101d1dd	reorg	2014-01-26 16:36:31 -04:00
Joey Hess	b93e485ef1	added annex.secure-erase-command config option.	2014-01-24 12:58:52 -04:00
Joey Hess	3518c586cf	fix transfers of key with no associated file Several places assumed this would not happen, and when the AssociatedFile was Nothing, did nothing. As part of this, preferred content checks pass the Key around. Note that checkMatcher is sometimes now called with Just Key and Just File. It currently constructs a FileMatcher, ignoring the Key. However, if it constructed a FileKeyMatcher, which contained both, then it might be possible to speed up parts of Limit, which currently call the somewhat expensive lookupFileKey to get the Key. I have not made this optimisation yet, because I am not sure if the key is always the same. Will need some significant checking to satisfy myself that's the case..	2014-01-23 16:44:02 -04:00
Joey Hess	4b55afe9e9	add "unused" preferred content expression With a really nice optimisation that keeps it from having any overhead in normal operation! This commit was sponsored by Ulises Vitulli.	2014-01-22 16:35:32 -04:00
Joey Hess	f2713a3bb9	benchmarked numcopies .gitattributes in preferred content Checking .gitattributes adds a full minute to a git annex find looking for files that don't have enough copies. 2:25 increasts to 3:27. I feel this is too much of a slowdown to justify making it the default. So, exposed two versions of the preferred content expression, a slow one and a fast but approximate one. I'm using the approximate one in the default preferred content expressions to avoid slowing down the assistant.	2014-01-21 18:49:25 -04:00
Joey Hess	f7cdc40f7b	reorg	2014-01-21 18:08:56 -04:00
Joey Hess	0ef282a116	numcopies cleanup, part 2 This includes several bug fixes.	2014-01-21 17:25:39 -04:00
Joey Hess	b40df4f0d0	reorganize numcopies code (no behavior changes) Move stuff into Logs.NumCopies. Add a NumCopies newtype. Better names for various serialization classes that are specific to one thing or another.	2014-01-21 16:08:59 -04:00
Joey Hess	3159da2693	Add and use numcopiesneeded preferred content expression. * Add numcopiesneeded preferred content expression. * Client, transfer, incremental backup, and archive repositories now want to get content that does not yet have enough copies. This means the asssistant will make copies of files that don't yet meet the configured numcopies, even to places that would not normally want the file. For example, if numcopies is 4, and there are 2 client repos and 2 transfer repos, and 2 removable backup drives, the file will be sent to both transfer repos in order to make 4 copies. Once a removable drive get a copy of the file, it will be dropped from one transfer repo or the other (but not both). Another example, numcopies is 3 and there is a client that has a backup removable drive and two small archive repos. Normally once one of the small archives has a file, it will not be put into the other one. But, to satisfy numcopies, the assistant will duplicate it into the other small archive too, if the backup repo is not available to receive the file. I notice that these examples are fairly unlikely setups .. the old behavior was not too bad, but it's nice to finally have it really correct. .. Almost. I have skipped checking the annex.numcopies .gitattributes out of fear it will be too slow. This commit was sponsored by Florian Schlegel.	2014-01-20 17:35:29 -04:00
Joey Hess	d66535f065	global numcopies setting * numcopies: New command, sets global numcopies value that is seen by all clones of a repository. * The annex.numcopies git config setting is deprecated. Once the numcopies command is used to set the global number of copies, any annex.numcopies git configs will be ignored. * assistant: Make the prefs page set the global numcopies. This global numcopies setting is needed to let preferred content expressions operate on numcopies. It's also convenient, because typically if you want git-annex to preserve N copies of files in a repo, you want it to do that no matter which repo it's running in. Making it global avoids needing to warn the user about gotchas involving inconsistent annex.numcopies settings. (See changes to doc/numcopies.mdwn.) Added a new variety of git-annex branch log file, that holds only 1 value. Will probably be useful for other stuff later. This commit was sponsored by Nicolas Pouillard.	2014-01-20 16:47:56 -04:00
Joey Hess	73c420ffcf	much better command action handling for sync --content	2014-01-20 13:31:03 -04:00
Joey Hess	34c8af74ba	fix inversion of control in CommandSeek (no behavior changes) I've been disliking how the command seek actions were written for some time, with their inversion of control and ugly workarounds. The last straw to fix it was sync --content, which didn't fit the Annex [CommandStart] interface well at all. I have not yet made it take advantage of the changed interface though. The crucial change, and probably why I didn't do it this way from the beginning, is to make each CommandStart action be run with exceptions caught, and if it fails, increment a failure counter in annex state. So I finally remove the very first code I wrote for git-annex, which was before I had exception handling in the Annex monad, and so ran outside that monad, passing state explicitly as it ran each CommandStart action. This was a real slog from 1 to 5 am. Test suite passes. Memory usage is lower than before, sometimes by a couple of megabytes, and remains constant, even when running in a large repo, and even when repeatedly failing and incrementing the error counter. So no accidental laziness space leaks. Wall clock speed is identical, even in large repos. This commit was sponsored by an anonymous bitcoiner.	2014-01-20 04:57:36 -04:00
Joey Hess	b6ba0bd556	sync --content: New option that makes the content of annexed files be transferred. Similar to the assistant, this honors any configured preferred content expressions. I am not entirely happpy with the implementation. It would be nicer if the seek function returned a list of actions which included the individual file gets and copies and drops, rather than the current list of calls to syncContent. This would allow getting rid of the somewhat reundant display of "sync file [ok\|failed]" after the get/put display. But, do that, withFilesInGit would need to somehow be able to construct such a mixed action list. And it would be less efficient than the current implementation, which is able to reuse several values between eg get and drop. Note that currently this does not try to satisfy numcopies when getting/putting files (numcopies are of course checked when dropping files!) This makes it like the assistant, and unlike get --auto and copy --auto, which do duplicate files when numcopies is not yet satisfied. I don't know if this is the right decision; it only seemed to make sense to have this parallel the assistant as far as possible to start with, since I know the assistant works. This commit was sponsored by Øyvind Andersen Holm.	2014-01-19 17:49:54 -04:00
Joey Hess	8ce515ffe4	improve matcher data type to allow matching Keys, instead of just files (no behavior changes)	2014-01-18 14:51:55 -04:00
Joey Hess	207ac67aaa	avoid needing a build-dep on hxt for Data.AssocList	2014-01-14 16:42:10 -04:00
Joey Hess	d07f2d7865	Fix a long-standing bug that could cause the wrong index file to be used when committing to the git-annex branch, if GIT_INDEX_FILE is set in the environment. This typically resulted in git-annex branch log files being committed to the master branch and later showing up in the work tree. (These log files can be safely removed.)	2014-01-14 15:36:33 -04:00
Joey Hess	78c7c54fdb	also check diskreserve for quvi downloads	2014-01-04 15:38:59 -04:00
Joey Hess	f9e7b6cf61	addurl, importfeed: Honor annex.diskreserve as long as the size of the url can be checked. This adds a http HEAD before the download is done. That was already the case when the assistant was running, and it seems worth it to avoid filling up the whole disk, like happened to my server today.	2014-01-04 15:08:06 -04:00
Joey Hess	3e68c1c2fd	add remote state logs This allows a remote to store a piece of arbitrary state associated with a key. This is needed to support Tahoe, where the file-cap is calculated from the data stored in it, and used to retrieve a key later. Glacier also would be much improved by using this. GETSTATE and SETSTATE are added to the external special remote protocol. Note that the state is left as-is even when a key is removed from a remote. It's up to the remote to decide when it wants to clear the state. The remote state log, $KEY.log.rmt, is a UUID-based log. However, rather than using the old UUID-based log format, I created a new variant of that format. The new varient is more space efficient (since it lacks the "timestamp=" hack, and easier to parse (and the parser doesn't mess with whitespace in the value), and avoids compatability cruft in the old one. This seemed worth cleaning up for these new files, since there could be a lot of them, while before UUID-based logs were only used for a few log files at the top of the git-annex branch. The transition code has also been updated to handle these new UUID-based logs. This commit was sponsored by Daniel Hofer.	2014-01-03 16:35:57 -04:00
Joey Hess	b1d7474c1d	Auto-upgrade v3 indirect repos to v5 with no changes. This also fixes a problem when a direct mode repo was somehow set to v3 rather than v4, and so the automatic direct mode upgrade to v5 was not done.	2013-12-29 13:06:23 -04:00
Joey Hess	7d5b25515c	Add plumbing-level lookupkey command.	2013-12-15 14:02:23 -04:00
Joey Hess	bef567c31f	Fix direct mode's handling when modifications to non-annexed files are pulled from a remote. A bug prevented the files from being updated in the work tree, and this caused the modification to be reverted.	2013-12-12 15:57:09 -04:00
Joey Hess	c160bf9d88	format comment	2013-12-12 15:16:44 -04:00
Joey Hess	03932212ec	Avoid using git commit in direct mode, since in some situations it will read the full contents of files in the tree. The assistant's commit code also always avoids git commit, for simplicity. Indirect mode sync still does a git commit -a to catch unstaged changes. Note that this means that direct mode sync no longer runs the pre-commit hook or any other hooks git commit might call. The git annex pre-commit hook action for direct mode is however explicitly run. (The assistant already ran git commit with hooks disabled, so no change there.)	2013-12-01 13:59:45 -04:00
Joey Hess	b25abdb3e6	fix reversion in relative paths to local remotes of direct mode repos `0980f3dae6` broke support for local remotes from direct mode repos, because the relative path was taken to be from the gitdir, rather than from the work tree.	2013-11-26 19:33:26 -04:00
Joey Hess	f913deab78	move programPath out of Config.Files to Annex.Path This works around horribleness in the Mavericks cpp, which falls over on the #if when configure is running. Moving it avoids the file being built at that point. But it's also a location that makes sense..	2013-11-24 16:03:03 -04:00
Joey Hess	e563c7e6f4	fsck distribution key	2013-11-23 21:58:39 -04:00
Joey Hess	b8e74bf489	fix standalone build of this module	2013-11-22 12:21:37 -04:00
Joey Hess	b876df6fdb	Ensure that core.sharedrepository is honored when creating the .git/annex directory.	2013-11-18 18:20:20 -04:00
Joey Hess	310c549b5a	Ensure execute bit is set on directories when core.sharedrepsitory is set.	2013-11-18 18:13:09 -04:00
Joey Hess	5561b46416	fix windows build	2013-11-18 11:05:16 -04:00
Joey Hess	d48b00ebed	Direct mode .git/annex/objects directories are no longer left writable Because that allowed writing to symlinks of files that are not present, which followed the link and put bad content in an object location. fsck: Fix up .git/annex/object directory permissions. This commit was sponsored by an anonymous bitcoin donor.	2013-11-15 14:52:03 -04:00
Joey Hess	b0f85b3e22	Fix direct mode merge bug when a direct mode file was deleted and replaced with a directory. An ordering problem caused the directory to not get created in this case. Thanks to Tim for the test cases.	2013-11-15 13:40:12 -04:00
Joey Hess	59ecc804cd	add new status command This works for both direct and indirect mode. It may need some performance tuning. Note that unlike git status, it only shows the status of the work tree, not the status of the index. So only one status letter, not two .. and since files that have been added and not yet committed do not differ between the work tree and the index, they are not shown. Might want to add display of the index vs the last commit eventually. This commit was sponsored by an unknown bitcoin contributor, whose contribution as been going up lately! ;)	2013-11-07 14:07:25 -04:00
Joey Hess	00c91816fb	Merge branch 'master' into directguard	2013-11-06 13:02:35 -04:00
Joey Hess	81117e8a9d	typo	2013-11-06 12:39:14 -04:00
Joey Hess	ee23be55fd	Fix exception handling bug that could cause .git/annex/index to be used for git commits outside the git-annex branch. Known to affect git-annex when used with the git shipped with Ubuntu 13.10.	2013-11-06 12:21:50 -04:00
Joey Hess	3802f2f270	work around lack of receive.denyCurrentBranch in direct mode Now that direct mode sets core.bare=true, git's normal prohibition about pushing into the currently checked out branch doesn't work. A simple fix for this would be an update hook which blocks the pushes.. but git hooks must be executable, and git-annex needs to be usable on eg, FAT, which lacks x bits. Instead, enabling direct mode switches the branch (eg master) to a special purpose branch (eg annex/direct/master). This branch is not pushed when syncing; instead any changes that git annex sync commits get written to master, and it's pushed (along with synced/master) to the remote. Note that initialization has been changed to always call setDirect, even if it's just setDirect False for indirect mode. This is needed because if the user has just cloned a direct mode repo, that nothing has synced with before, it may have no master branch, and only a annex/direct/master. Resulting in that branch being checked out locally too. Calling setDirect False for indirect mode moves back out of this branch, to a new master branch, and ensures that a manual "git push" doesn't push changes directly to the annex/direct/master of the remote. (It's possible that the user makes a commit w/o using git-annex and pushes it, but nothing I can do about that really.) This commit was sponsored by Jonathan Harrington.	2013-11-05 21:08:31 -04:00
Joey Hess	4510819215	v5 for direct mode, with automatic upgrade This includes storing the current state of the HEAD ref, which git annex sync is going to need, but does not make sync use it.	2013-11-05 17:05:03 -04:00
Joey Hess	0edd9ec03a	refactored hook setup	2013-11-05 15:29:56 -04:00
Joey Hess	230bfa9688	add --want-get and --want-drop options New --want-get and --want-drop options which can be used to test preferred content settings. For example, "git annex find --in . --want-drop"	2013-10-28 14:50:17 -04:00
Joey Hess	049e80e865	refactor	2013-10-28 14:05:55 -04:00
Joey Hess	435ea52f3c	repair command: add handling of git-annex branch and index	2013-10-23 13:00:45 -04:00
Joey Hess	4f871f89ba	git-recover-repository 1/2 done	2013-10-20 17:50:51 -04:00
Joey Hess	19816bca41	update for DiffTree type change (which fixes assistant in subdir confusion bug)	2013-10-17 15:11:21 -04:00
Joey Hess	78acbfeb6a	ensure merge directory is empty before starting merge Don't want some past failed merge to lead to bad results, potentially.	2013-10-16 14:57:58 -04:00
Joey Hess	18f4d1b400	queue downloads of keys that fsck finds with bad content	2013-10-10 17:27:00 -04:00
Joey Hess	267c124f67	run ssh in the directory with its socket when stopping This guarantees that stopping an existing socket never fails. This might be the route out of the mess of needing to worry about socket lengths in general. However, it would need quite a lot of refactoring to make every place in git-annex that runs ssh run it with a cwd that was determined by the location of its connection caching socket. If this wasn't already such a mess, I'd consider even the thought of that API a bad idea..	2013-10-06 21:11:39 -04:00
Joey Hess	6f38426cb8	work around ssh brain-damange The control socket path passed to ssh needs to be 17 characters shorter than the maximum unix domain socket length, because ssh appends stuff to it to make a temporary filename. Closes: #725512 Also, take the shorter of the relative and the absolute paths to the socket. Typically the relative path will be a lot shorter (unless deep inside a subdirectory of the repository), and so using it will avoid flirting with the maximum safe socket lenghts in more situations, and so lead to less breakage if all my attempts at fixing this are still buggy.	2013-10-06 20:59:36 -04:00
Joey Hess	f8880c4fe4	Automatically and safely detect and recover from dangling .git/annex/index.lock files, which would prevent git from committing to the git-annex branch, eg after a crash.	2013-10-03 15:43:08 -04:00
Joey Hess	83b4b8d589	rename confusing function The index.lck file is not a lock file. Kept the historical name for now as changing it would be work.	2013-10-03 15:06:58 -04:00
Joey Hess	f2ee4ef86d	ensure that commitBranch is only called when the journal is locked This is not strictly a requirement, since it does not actually update the journal. But it's a nice invariant to enforce.	2013-10-03 14:48:46 -04:00
Joey Hess	56c3f68a53	use types to partially prove correctness of journal locking code My implementation does not guard against double locking of the journal. But it does ensure that the journal is always locked when operated on, by using a type that is only produced by lockJournal, and which is required as a parameter of all functions that operate on the journal. Note that I had to add the fooStale functions for cases where it does not make sense to lock the journal when querying it. I was more concerned about ensuring that anything that modifies the journal is locked. setJournalFile's implementation ensures that any query of the journal will get one value or the other atomically, even if the journal is being changed at the time.	2013-10-03 14:41:57 -04:00
Joey Hess	7a9a16b337	lockJournal when running performTransitions This may not strictly be needed -- the transition code bypasses the journal. However, this ensures that the git-annex branch is only committed with the journal locked. This will allow for further improvements.	2013-10-03 14:37:46 -04:00
Joey Hess	12f6b9693a	Send a git-annex user-agent when downloading urls. Overridable with --user-agent option. Not yet done for S3 or WebDAV due to limitations of libraries used -- nether allows a user-agent header to be specified. This commit sponsored by Michael Zehrer.	2013-09-28 14:35:21 -04:00
Joey Hess	c45f5fbdb3	indirect: Better behavior when a file in direct mode is not owned by the user running the conversion.	2013-09-25 15:29:56 -04:00
Joey Hess	b405295aee	hlint test suite still passes	2013-09-25 03:09:06 -04:00
Joey Hess	3588729f0d	completely solve catKey memory leak Since `006cf7976f` was incomplete, not being able to get the right mode of the file when the index differs from HEAD, this is a final workaround. Only buffering the start of the file in this case avoids leaking memory. This does not prevent git-cat-file being asked to output the whole file, which needs to be consumed, and can be slow. But this only happens in a rare edge case.	2013-09-19 20:09:03 -04:00
Joey Hess	006cf7976f	more completely solve catKey memory leak Done using a mode witness, which ensures it's fixed everywhere. Fixing catFileKey was a bear, because git cat-file does not provide a nice way to query for the mode of a file and there is no other efficient way to do it. Oh, for libgit2.. Note that I am looking at tree objects from HEAD, rather than the index. Because I cat-file cannot show a tree object for the index. So this fix is technically incomplete. The only cases where it matters are: 1. A new large file has been directly staged in git, but not committed. 2. A file that was committed to HEAD as a symlink has been staged directly in the index. This could be fixed a lot better using libgit2.	2013-09-19 16:41:21 -04:00
Joey Hess	eb42bde19a	sync, pre-commit, indirect: Avoid unnecessarily catting non-symlink files from git, which can be so large it runs out of memory.	2013-09-19 14:48:42 -04:00
Joey Hess	ab9dd6d8a0	sync: Fix bug that caused direct mode mappings to not be updated when merging files into the tree on Windows.	2013-09-13 13:49:28 -04:00
Joey Hess	a48a4e2f8a	automatically derive an annex-uuid from a gcrypt-uuids	2013-09-05 16:02:39 -04:00
Joey Hess	4079f9cfe8	avoid double commit during transition The second commit had some bad refs which resulted in the race detection code running. But that commit was unnecessary anyway, it only was there to merge in the other refs.	2013-09-03 16:33:15 -04:00
Joey Hess	db83cc82d6	Merge branch 'forget' Conflicts: debian/changelog	2013-09-03 14:36:00 -04:00
Joey Hess	67fda9e669	Honor core.sharedrepository when receiving and adding files in direct mode.	2013-09-03 13:35:49 -04:00
Joey Hess	0831e18372	forget --drop-dead: Completely removes mentions of repositories that have been marked as dead from the git-annex branch. Wrote nice pure transition calculator, and ugly code to stage its results into the git-annex branch. Also had to split up several Log modules that Annex.Branch needed to use, but that themselves used Annex.Branch. The transition calculator is limited to looking at and changing one file at a time. While this made the implementation relatively easy, it precludes transitions that do stuff like deleting old url log files for keys that are being removed because they are no longer present anywhere.	2013-08-31 17:51:13 -04:00
Joey Hess	2f57d74534	remove print	2013-08-29 20:28:45 -04:00
Joey Hess	6147652cc6	wording	2013-08-29 16:41:59 -04:00
Joey Hess	6cdac3a003	sync, assistant: Force push of the git-annex branch. Necessary to ensure it gets pushed to remotes after being rewritten by forget. See inline rationalles for why I think this is safe!	2013-08-29 14:27:53 -04:00
Joey Hess	c181efe437	use --force in taggedPush This should make the assistant force update its tagged push branch after a transition like git annex forget.	2013-08-29 13:31:29 -04:00
Joey Hess	336d5ec349	Merge branch 'master' into forget	2013-08-29 13:23:02 -04:00
Joey Hess	d3af414568	typo	2013-08-28 17:05:07 -04:00
Joey Hess	4a915cd3cd	add forget command Works, more or less. --dead is not implemented, and so far a new branch is made, but keys no longer present anywhere are not scrubbed. git annex sync fails to push the synced/git-annex branch after a forget, because it's not a fast-forward of the existing synced branch. Could be fixed by making git-annex sync use assistant-style sync branches.	2013-08-28 16:41:13 -04:00
Joey Hess	fcd5c167ef	untested transition detection on merging, and transition running code	2013-08-28 15:57:42 -04:00
Joey Hess	46b6d75274	Youtube support! (And 53 other video hosts) When quvi is installed, git-annex addurl automatically uses it to detect when an page is a video, and downloads the video file. web special remote: Also support using quvi, for getting files, or checking if files exist in the web. This commit was sponsored by Mark Hepburn. Thanks!	2013-08-22 18:50:43 -04:00
Joey Hess	412dcb8017	Fix bug that caused typechanged symlinks to be assumed to be unlocked files, so they were added to the annex by the pre-commit hook.	2013-08-22 13:57:07 -04:00
Joey Hess	a3224ce35b	avoid more build warnings on Windows	2013-08-04 14:05:36 -04:00
Joey Hess	06db8e0bd9	squash compiler warnings on Windows	2013-08-04 13:18:05 -04:00
Joey Hess	b191d5c595	gitignore support for the assistant and watcher Requires git 1.8.4 or newer. When it's installed, a background git check-ignore process is run, and used to efficiently check ignores whenever a new file is added. Thanks to Adam Spiers, for getting the necessary support into git for this. A complication is what to do about files that are gitignored but have been checked into git anyway. git commands assume the ignore has been overridden in this case, and not need any more overriding to commit a changed version. However, for the assistant to do the same, it would have to run git ls-files to check if the ignored file is in git. This is somewhat expensive. Or it could use the running git-cat-file process to query the file that way, but that requires transferring the whole file content over a pipe, so it can be quite expensive too, for files that are not git-annex symlinks. Now imagine if the user knows that a file or directory tree will be getting frequent changes, and doesn't want the assistant to sync it, so gitignores it. The assistant could overload the system with repeated ls-files checks! So, I've decided that the assistant will not automatically commit changes to files that are gitignored. This is a tradeoff. Hopefully it won't be a problem to adjust .gitignore settings to not ignore files you want the assistant to autocommit, or to manually git annex add files that are listed in .gitignore. (This could be revisited if git-annex gets access to an interface to check the content of the index w/o forking a git command. This could be libgit2, or perhaps a separate git cat-file --batch-check process, so it wouldn't need to ship over the whole file content.) This commit was sponsored by Francois Marier. Thanks!	2013-08-02 20:37:03 -04:00
Joey Hess	93f2371e09	get rid of __WINDOWS__, use mingw32_HOST_OS The latter is harder for me to remember, but avoids build failures in code used by the configure program.	2013-08-02 12:27:32 -04:00
Joey Hess	ddd46db09a	Fix a few bugs involving filenames that are at or near the filesystem's maximum filename length limit. Started with a problem when running addurl on a really long url, because the whole url is munged into the filename. Ended up doing a fairly extensive review for places where filenames could get too large, although it's hard to say I'm not missed any.. Backend.Url had a 128 character limit, which is fine when the limit is 255, but not if it's a lot shorter on some systems. So check the pathconf() limit. Note that this could result in fromUrl creating different keys for the same url, if run on systems with different limits. I don't see this is likely to cause any problems. That can already happen when using addurl --fast, or if the content of an url changes. Both Command.AddUrl and Backend.Url assumed that urls don't contain a lot of multi-byte unicode, and would fail to truncate an url that did properly. A few places use a filename as the template to make a temp file. While that's nice in that the temp file name can be easily related back to the original filename, it could lead to `git annex add` failing to add a filename that was at or close to the maximum length. Note that in Command.Add.lockdown, the template is still derived from the filename, just with enough space left to turn it into a temp file. This is an important optimisation, because the assistant may lock down a bunch of files all at once, and using the same template for all of them would cause openTempFile to iterate through the same set of names, looking for an unused temp file. I'm not very happy with the relatedTemplate hack, but it avoids that slowdown. Backend.WORM does not limit the filename stored in the key. I have not tried to change that; so git annex add will fail on really long filenames when using the WORM backend. It seems better to preserve the invariant that a WORM key always contains the complete filename, since the filename is the only unique material in the key, other than mtime and size. Since nobody has complained about add failing (I think I saw it once?) on WORM, probably it's ok, or nobody but me uses it. There may be compatability problems if using git annex addurl --fast or the WORM backend on a system with the 255 limit and then trying to use that repo in a system with a smaller limit. I have not tried to deal with those. This commit was sponsored by Alexander Brem. Thanks!	2013-07-30 19:18:29 -04:00
Joey Hess	7b0970b340	Fix inverted logic in last release's fix for data loss bug, that caused git-annex sync on FAT or other crippled filesystems to add symlink standin files to the annex.	2013-07-30 16:08:09 -04:00
Joey Hess	7e66d260ea	importfeed: git-annex becomes a podcatcher in 150 LOC	2013-07-28 16:55:42 -04:00
Joey Hess	6ae2637eb1	For long hostnames, use a hash of the hostname to generate the socket file for ssh connection caching. This is ok to do now that the socket filename never needs to be mapped back to a hostname. Short hostnames will still appear in the clear, which is less obfuscated. So this cannot possibly make ssh connection caching fail for a hostname it used to work for.	2013-07-22 15:09:41 -04:00
Joey Hess	c6a020ad1f	stop cached ssh connection w/o needing to look up host and port Turns out that with -O stop -S socketfile, ssh does not need the real hostname, or port to be specificed. This is because it simply talks to the ssh behind the socket and tells it to stop. So, can eliminate the conversion back from a socketfile to host and port. Which will allow using shorter filenames for sockets in the future.	2013-07-21 14:14:54 -04:00
Joey Hess	ecdfa40cbe	avoid false positives when detecting core.symlinks=false symlink standin files If the file is > 8192 bytes, it's certianly not a symlink file. And if it contains nuls or newlines or whitespace, it's certianly not a link to annexed content. But it might be a tarball containing a git-annex repo.	2013-07-20 19:28:02 -04:00
Joey Hess	ae341c1a37	avoid reading files that are not symlinks when core.symlinks=false This hack is only needed on FAT filesystems, so there's no point in doing it the rest of the time. And it's possible for there to be a false positive, so it's best to avoid the hack when possible.	2013-07-20 19:14:29 -04:00
Joey Hess	3e422cb5fa	fix uninit to delete content from annex when it ended up hard linked back to the work tree	2013-07-18 13:30:12 -04:00
Joey Hess	c1307b1388	fsck: Don't claim to fix direct mode when run on a symlink whose content is not present.	2013-07-08 17:29:42 -04:00
Joey Hess	d84a000e92	detect system with no dot in FQDN, where git commit will fail, and workaround Sigh, git is so fragile. Or rather, across the set of systems that use git-annex, where are no many horribly broken systems..	2013-07-05 12:24:28 -04:00
Joey Hess	7a7e426352	moved AssociatedFile definition	2013-07-04 02:36:02 -04:00
Joey Hess	72ab02ca48	avoid failure creating inode sentinal file Test suite on windows failed running git annex init in a bare clone of an annexed repo. The annex directory didn't exist when it tried to write the inode sentinal file.	2013-06-18 15:38:17 -04:00
Joey Hess	1312cffad0	Revert "Windows: Ssh connection caching is now supported." Yeah, that didn't actually work. Got error messages like it couldn't read from the control socket, so probably ssh doesn't really support that on Windows, at least the cygwin ssh build I'm using.	2013-06-17 22:13:28 -04:00
Joey Hess	07a17f58b7	Windows: Ssh connection caching is now supported. Turns out the socket stuff just works on windows.	2013-06-17 22:05:49 -04:00
Joey Hess	d80a0f62a4	avoid lazy read of file contents On Windows, that means the file could still be open when later code wants to delete it, which fails. Since we're only reading 8k anyway, just read it, strictly. However, avoid reading the whole file strictly, so no getContentsStrict here.	2013-06-17 21:12:09 -04:00
Joey Hess	b7674b464b	typo in comment	2013-06-17 20:45:04 -04:00
Joey Hess	0527c74c0f	assistant: In direct mode, objects are now only dropped when all associated files are unwanted. This avoids a repreated drop/get loop of a file that has a copy in an archive directory, and a copy not in an archive directory. (Indirect mode still has some buggy behavior in this area, since it does not keep track of associated files.) Closes: #712060	2013-06-15 14:44:43 -04:00
Joey Hess	92f036fcb4	avoid warnings when built with ghc 7.6	2013-06-02 15:01:58 -04:00
Joey Hess	eba9ee5bc6	remove debug print	2013-05-27 11:18:18 -04:00
Joey Hess	3b1aedea3d	Merge branch 'robustness'	2013-05-25 15:22:18 -04:00
Joey Hess	5eeea0fac9	make direct mode merge cleanup more robust If the cleanup of a single file fails for some reason, continue to clean up other files. This could happen because of a race. The merge pulls in a change to a file, which gets changed locally at the same time.	2013-05-25 15:22:16 -04:00
Joey Hess	bf86b5ca16	improve robustness of fromDirect and replaceFile Made fromDirect check that a file in the tree has good content (and is not a broken symlink either) before copying it to another file that has the same key. Made replaceFile clean up the temp file if the action that creates it, or the file replacement action fails.	2013-05-25 15:06:02 -04:00
Joey Hess	729eab1f89	assistant: Work around git-cat-file's not reloading the index after files are staged. Argh.	2013-05-25 00:37:41 -04:00
Joey Hess	2b14fe2c98	refactor	2013-05-24 23:07:26 -04:00
Joey Hess	08c03b2af3	XMPP: Avoid redundant and unncessary pushes. Note that this breaks compatibility with previous versions of git-annex, which will refuse to accept any XMPP pushes from this version.	2013-05-21 18:24:29 -04:00
Joey Hess	0cb34f3caa	update inode cache after copying content This was also tripped by the test suite's automatic conflict resolution test. Which also shows BTW that an unnecessary copy of content is done sometimes when merging in direct mode. Not going to try to speed that up now.	2013-05-20 17:11:40 -04:00
Joey Hess	d88be65495	didn't quite get removeDirect right before, this passes test suite	2013-05-20 16:28:33 -04:00
Joey Hess	3d8355d984	Fix a bug in the git-annex branch handling code that could cause info from a remote to not be merged and take effect immediately. This bug was turned up by the test suite, running fsck in direct mode. A repository was cloned, was put into direct mode, was fscked, and fsck incorrectly said that no copy existed of a file, that was actually present in origin. This turned out to occur because fsck first did a Annex.Branch.change, recording that it did not locally have the file. That was recorded in the journal. Since neither the git annex direct not the fsck had yet needed to read any info from the branch, but had only made changes to it, the origin/git-annex branch was not yet merged in. So the journal got a location log entry written to it, but this did not include the location log info for the origin. When fsck then did a Annex.Branch.get, it trusted the journal was cosnsitent, and returned it, again w/o merging from origin/git-annex. This latter behavior is the actual bug. Refer to commit `e9bfa8eaed` for the thinking behind it being ok to make a change to a file on the branch, without first merging the branch. That thinking still stands. However, it means that files in the journal cannot be trusted to be consistent if the branch has not been merged. So, to fix, just enure the branch gets merged, even when reading from the journal. In tests, this does not seem to cause any extra merging. Except, of course, in the one case described above. But git annex add, etc, are able to make changes w/o first merging the branch.	2013-05-20 15:14:59 -04:00
Joey Hess	4c22c2261f	minor optimisation and warning fix	2013-05-20 13:58:41 -04:00
Joey Hess	f4ba19f2b8	direct mode bug fix: After a conflicted merge was automatically resolved, the content of a file that was already present could incorrectly be replaced with a symlink. The bug was in movein, which just replaceFile'd the file with a symlink, even if it already had the desired content, before trying to pull the content out of the annex and replace the symlink with it. That was ok-ish for non conflicted merges, where if the file existed it would be an old version of the content. But for conflicted merges, the automatic merge resolver has already run, and will have already put the desired content into the file for the local variant. Also, made removeDirect not trust that the associated files map is correct. Only if it can verify that another file has the content will it not move it into .git/annex/objects.	2013-05-20 13:41:09 -04:00
Joey Hess	345ee4f37c	Switch to MonadCatchIO-transformers for better handling of state while catching exceptions. As seen in this bug report, the lifted exception handling using the StateT monad throws away state changes when an action throws an exception. http://git-annex.branchable.com/bugs/git_annex_fork_bombs_on_gpg_file/ .. Which can result in cached values being redundantly calculated, or other possibly worse bugs when the annex state gets out of sync with reality. This switches from a StateT AnnexState to a ReaderT (MVar AnnexState). All changes to the state go via the MVar. So when an Annex action is running inside an exception handler, and it makes some changes, they immediately go into affect in the MVar. If it then throws an exception (or even crashes its thread!), the state changes are still in effect. The MonadCatchIO-transformers change is actually only incidental. I could have kept on using lifted-base for the exception handling. However, I'd have needed to write a new instance of MonadBaseControl for the new monad.. and I didn't write the old instance.. I begged Bas and he kindly sent it to me. Happily, MonadCatchIO-transformers is able to derive a MonadCatchIO instance for my monad. This is a deep level change. It passes the test suite! What could it break? Well.. The most likely breakage would be to code that runs an Annex action in an exception handler, and wants state changes to be thrown away. Perhaps the state changes leaves the state inconsistent, or wrong. Since there are relatively few places in git-annex that catch exceptions in the Annex monad, and the AnnexState is generally just used to cache calculated data, this is unlikely to be a problem. Oh yeah, this change also makes Assistant.Types.ThreadedMonad a bit redundant. It's now entirely possible to run concurrent Annex actions in different threads, all sharing access to the same state! The ThreadedMonad just adds some extra work on top of that, with its own MVar, and avoids such actions possibly stepping on one-another's toes. I have not gotten rid of it, but might try that later. Being able to run concurrent Annex actions would simplify parts of the Assistant code.	2013-05-19 14:16:36 -04:00
Joey Hess	630a8b9ad2	warning	2013-05-19 12:43:44 -04:00
Joey Hess	1b616c5d37	improve handling of receiving object in direct mode when associated files are modified Before, if a direct mode repo had one or more associated files that were modifed, moving the object into it would overwrite the associated files with the pristine object. Now, modified associated files are left unchanged. To ensure that, when an object is moved into a direct mode repo, it's not thrown away, it gets stored in indirect mode.	2013-05-17 16:25:18 -04:00
Joey Hess	94cb037aa3	store copy in inode cache too	2013-05-17 16:16:10 -04:00
Joey Hess	b8e5b9c645	test suite passes in direct mode This fixes a bug with git annex add in direct mode. If some files already existed in the tree pointing at the same key as a file that was just added, and their content was not present, add neglected to copy the content to those files. I also changed the behavior of moveAnnex slightly: When content is moved into the annex in direct mode, it does not overwrite any content already present in direct mode files. That content may be modified after all.	2013-05-17 15:59:37 -04:00
Joey Hess	3240006c56	fix android build, broken by changes for windows port	2013-05-16 11:52:48 -04:00
Joey Hess	aba49995b6	Merge branch 'master' into windows	2013-05-15 19:18:04 -04:00
Joey Hess	4829eae883	fix toDirectGen bug introduced in `247b7e9e58`	2013-05-15 19:15:40 -04:00
Joey Hess	c62b54d80d	start one git-cat-file per index file This reverts `1c83b6c439` and properly fixes the issue discussed there. This makes git-annex behave much nicer in direct mode.	2013-05-15 18:46:38 -04:00
Joey Hess	25cb9a48da	fix the day's Windows permissions damage	2013-05-14 20:15:14 -04:00
Joey Hess	8a2ff023a3	convert from internal git path when checking symlink standin file	2013-05-14 15:08:40 -05:00
Joey Hess	15af92291f	Merge remote-tracking branch 'gnu/windows' into windows	2013-05-14 14:21:49 -05:00
Joey Hess	fee6cd4635	fix imports	2013-05-14 14:21:35 -05:00
Joey Hess	e7936b1a34	always try to read symlink; only fall back to looking inside file On Windows with Cygwin, checking out a git-annex repo will create symlinks on disk, so we need to always try to read the symlink, even when core.symlinks says they're not supported.	2013-05-14 14:18:47 -04:00
Joey Hess	17952a893e	fix imports	2013-05-14 13:53:29 -04:00
Joey Hess	43f2de8522	Merge branch 'windows' of git://git-annex.branchable.com into windows	2013-05-13 20:11:30 -05:00
Joey Hess	1093302eba	read inode cache file strictly to avoid failure to drop on windows Seems that Windows doesn't allow deleting a file that the same process has open. Here the inode cache file was read and a the value from it gets used later. But due to laziness, the old file is still open when it gets deleted. Adding strictness avoids this problem. Of course, the file is small, so it's no problem to read it all strictly, so this is probably an improvement even outside of Windows.	2013-05-13 19:29:52 -05:00
Joey Hess	13b629c208	fix warnings	2013-05-13 15:30:18 -04:00
Joey Hess	25a8d4b11c	rename module	2013-05-12 19:19:28 -04:00
Joey Hess	03e8594369	fix the day's windows permissions damage	2013-05-12 19:09:48 -04:00
Joey Hess	73d2f8b280	deal with git using / internally, even on DOS	2013-05-12 17:29:49 -05:00
Joey Hess	2f3ce4c02f	fix	2013-05-12 15:43:59 -05:00
Joey Hess	838b984797	deal with dos path separators	2013-05-12 15:37:32 -05:00
Joey Hess	abe8d549df	fix permission damage (thanks, Windows)	2013-05-11 23:54:25 -04:00
Joey Hess	18bdff3fae	clean up from windows porting	2013-05-11 18:23:41 -04:00
Joey Hess	3c7e30a295	git-annex now builds on Windows (doesn't work)	2013-05-11 15:03:00 -05:00
Joey Hess	763cbda14f	fixup #if 0 stubs to use #ifndef mingw32_HOST_OS That's needed in files used to build the configure program. For the other files, I'm keeping my __WINDOWS__ define, as I find that much easier to type. I may search and replace it to use the mingw32_HOST_OS thing later.	2013-05-10 16:57:21 -05:00
Joey Hess	6c74a42cc6	stub out POSIX stuff	2013-05-10 16:29:59 -05:00
Joey Hess	adde00f4f3	git-annex-shell: Ensure that received files can be read. Files transferred from some Android devices may have very broken permissions as received.	2013-05-06 17:30:57 -04:00
Joey Hess	247b7e9e58	direct: Fix a bug that could cause some files to be left in indirect mode. It's possible for files in indirect mode to have a direct mode mapping file. Probably from when they were in direct mode. In this case, toDirectGen tried to copy the content from the direct mode file that the mapping said had it. But, being in indirect mode, it didn't really have the content. So it did nothing. This fix makes it always move the content from .git/annex/objects/ when it's there.	2013-05-06 12:43:03 -04:00
Joey Hess	543ffa5b9f	work around git/environment/gecos/android suck I don't know why, but I can't seem to set the environment variables inside git-annex to work around the git error caused by android's crappy username and hostname settings. This workaround works, and that's all that's good about it.	2013-05-03 14:08:26 -04:00
Joey Hess	e23a7598e2	set EMAIL when GECOS workaround is needed Git fails on Android, because it gets some weird domain for local host like "localhost.(none)". This works around that. I made it always set EMAIL when GECOS workaround was needed (unless EMAIL is already set). It might be nicer to try to get the hostname.domain as git does, and only set it if that fails. But I don't want to be stuck trying to exactly duplicate whatever git is doing.	2013-05-03 11:52:04 -04:00
Joey Hess	0807211a67	thaw content directory in direct mode too A content directory can be frozen in direct mode. One way this can happen is if the content is transferred before direct mode has a mapping for it, so it's stored in the content directory. So, we need to thaw the content directory before doing things with it.	2013-04-30 19:33:43 -04:00
Joey Hess	11ca4cee34	refactor	2013-04-30 19:09:36 -04:00
Joey Hess	0ae8c82c53	per-IA-item content directories	2013-04-25 23:44:55 -04:00
Joey Hess	07580dc3df	sync: Bug fix, avoid adding to the annex the dummy symlinks used on crippled filesystems. The root of the problem is that toInodeCache sees a non-symlink, and so goes on and generates a new inode cache for the dummy symlink. Any place that toInodeCache, or sameFileStatus, or genInodeCache are called may need to deal with this case. Although many of them are ok. For example, prepSendAnnex calls sameInodeCache, which calls genInodeCache.. but if the file content is not present, the InodeCache generated for its standin file is appropriately not the same, and so it returns Nothing. I've audited some, but have to say I'm not happy with this; it should be handled at the type level somehow, or a toInodeCache wrapper be used that is aware of dummy symlinks. (The Watcher already dealt with it, via the guardSymlinkStandin function.)	2013-04-23 17:14:28 -04:00
Joey Hess	8a2d1988d3	expose Control.Monad.join I think I've been looking for that function for some time. Ie, I remember wanting to collapse Just Nothing to Nothing.	2013-04-22 20:24:53 -04:00
Joey Hess	9cb223a8b3	Detect systems that have no user name set in GECOS, and also don't have user.name set in git config, and put in a workaround so that commits to the git-annex branch (and the assistant) will still succeed despite git not liking the system configuration.	2013-04-22 15:36:34 -04:00
guilhem	a1eded8641	Allow rsync to use other remote shells. Introduced a new per-remote option 'annex-rsync-transport' to specify the remote shell that it to be used with rsync. In case the value is 'ssh', connections are cached unless 'sshcaching' is unset.	2013-04-13 19:26:24 -04:00
Joey Hess	4f5ceffead	implement massReplace This looks at the string one char at a time, which is hardly efficient.. but more than good enough for expanding variables in relatively short command lines.	2013-04-08 23:56:37 -04:00

1 2 3 4 5 ...

497 commits