git-annex

Author	SHA1	Message	Date
Joey Hess	aba49995b6	Merge branch 'master' into windows	2013-05-15 19:18:04 -04:00
Joey Hess	4829eae883	fix toDirectGen bug introduced in `247b7e9e58`	2013-05-15 19:15:40 -04:00
Joey Hess	c62b54d80d	start one git-cat-file per index file This reverts `1c83b6c439` and properly fixes the issue discussed there. This makes git-annex behave much nicer in direct mode.	2013-05-15 18:46:38 -04:00
Joey Hess	25cb9a48da	fix the day's Windows permissions damage	2013-05-14 20:15:14 -04:00
Joey Hess	8a2ff023a3	convert from internal git path when checking symlink standin file	2013-05-14 15:08:40 -05:00
Joey Hess	15af92291f	Merge remote-tracking branch 'gnu/windows' into windows	2013-05-14 14:21:49 -05:00
Joey Hess	fee6cd4635	fix imports	2013-05-14 14:21:35 -05:00
Joey Hess	e7936b1a34	always try to read symlink; only fall back to looking inside file On Windows with Cygwin, checking out a git-annex repo will create symlinks on disk, so we need to always try to read the symlink, even when core.symlinks says they're not supported.	2013-05-14 14:18:47 -04:00
Joey Hess	17952a893e	fix imports	2013-05-14 13:53:29 -04:00
Joey Hess	43f2de8522	Merge branch 'windows' of git://git-annex.branchable.com into windows	2013-05-13 20:11:30 -05:00
Joey Hess	1093302eba	read inode cache file strictly to avoid failure to drop on windows Seems that Windows doesn't allow deleting a file that the same process has open. Here the inode cache file was read and the value from it gets used later. But due to laziness, the old file is still open when it gets deleted. Adding strictness avoids this problem. Of course, the file is small, so it's no problem to read it all strictly, so this is probably an improvement even outside of Windows.	2013-05-13 19:29:52 -05:00
Joey Hess	13b629c208	fix warnings	2013-05-13 15:30:18 -04:00
Joey Hess	25a8d4b11c	rename module	2013-05-12 19:19:28 -04:00
Joey Hess	03e8594369	fix the day's windows permissions damage	2013-05-12 19:09:48 -04:00
Joey Hess	73d2f8b280	deal with git using / internally, even on DOS	2013-05-12 17:29:49 -05:00
Joey Hess	2f3ce4c02f	fix	2013-05-12 15:43:59 -05:00
Joey Hess	838b984797	deal with dos path separators	2013-05-12 15:37:32 -05:00
Joey Hess	abe8d549df	fix permission damage (thanks, Windows)	2013-05-11 23:54:25 -04:00
Joey Hess	18bdff3fae	clean up from windows porting	2013-05-11 18:23:41 -04:00
Joey Hess	3c7e30a295	git-annex now builds on Windows (doesn't work)	2013-05-11 15:03:00 -05:00
Joey Hess	763cbda14f	fixup #if 0 stubs to use #ifndef mingw32_HOST_OS That's needed in files used to build the configure program. For the other files, I'm keeping my __WINDOWS__ define, as I find that much easier to type. I may search and replace it to use the mingw32_HOST_OS thing later.	2013-05-10 16:57:21 -05:00
Joey Hess	6c74a42cc6	stub out POSIX stuff	2013-05-10 16:29:59 -05:00
Joey Hess	adde00f4f3	git-annex-shell: Ensure that received files can be read. Files transferred from some Android devices may have very broken permissions as received.	2013-05-06 17:30:57 -04:00
Joey Hess	247b7e9e58	direct: Fix a bug that could cause some files to be left in indirect mode. It's possible for files in indirect mode to have a direct mode mapping file. Probably from when they were in direct mode. In this case, toDirectGen tried to copy the content from the direct mode file that the mapping said had it. But, being in indirect mode, it didn't really have the content. So it did nothing. This fix makes it always move the content from .git/annex/objects/ when it's there.	2013-05-06 12:43:03 -04:00
Joey Hess	543ffa5b9f	work around git/environment/gecos/android suck I don't know why, but I can't seem to set the environment variables inside git-annex to work around the git error caused by android's crappy username and hostname settings. This workaround works, and that's all that's good about it.	2013-05-03 14:08:26 -04:00
Joey Hess	e23a7598e2	set EMAIL when GECOS workaround is needed Git fails on Android, because it gets some weird domain for local host like "localhost.(none)". This works around that. I made it always set EMAIL when GECOS workaround was needed (unless EMAIL is already set). It might be nicer to try to get the hostname.domain as git does, and only set it if that fails. But I don't want to be stuck trying to exactly duplicate whatever git is doing.	2013-05-03 11:52:04 -04:00
Joey Hess	0807211a67	thaw content directory in direct mode too A content directory can be frozen in direct mode. One way this can happen is if the content is transferred before direct mode has a mapping for it, so it's stored in the content directory. So, we need to thaw the content directory before doing things with it.	2013-04-30 19:33:43 -04:00
Joey Hess	11ca4cee34	refactor	2013-04-30 19:09:36 -04:00
Joey Hess	0ae8c82c53	per-IA-item content directories	2013-04-25 23:44:55 -04:00
Joey Hess	07580dc3df	sync: Bug fix, avoid adding to the annex the dummy symlinks used on crippled filesystems. The root of the problem is that toInodeCache sees a non-symlink, and so goes on and generates a new inode cache for the dummy symlink. Any place that toInodeCache, or sameFileStatus, or genInodeCache are called may need to deal with this case. Although many of them are ok. For example, prepSendAnnex calls sameInodeCache, which calls genInodeCache.. but if the file content is not present, the InodeCache generated for its standin file is appropriately not the same, and so it returns Nothing. I've audited some, but have to say I'm not happy with this; it should be handled at the type level somehow, or a toInodeCache wrapper be used that is aware of dummy symlinks. (The Watcher already dealt with it, via the guardSymlinkStandin function.)	2013-04-23 17:14:28 -04:00
Joey Hess	8a2d1988d3	expose Control.Monad.join I think I've been looking for that function for some time. Ie, I remember wanting to collapse Just Nothing to Nothing.	2013-04-22 20:24:53 -04:00
Joey Hess	9cb223a8b3	Detect systems that have no user name set in GECOS, and also don't have user.name set in git config, and put in a workaround so that commits to the git-annex branch (and the assistant) will still succeed despite git not liking the system configuration.	2013-04-22 15:36:34 -04:00
guilhem	a1eded8641	Allow rsync to use other remote shells. Introduced a new per-remote option 'annex-rsync-transport' to specify the remote shell that is to be used with rsync. In case the value is 'ssh', connections are cached unless 'sshcaching' is unset.	2013-04-13 19:26:24 -04:00
Joey Hess	4f5ceffead	implement massReplace This looks at the string one char at a time, which is hardly efficient.. but more than good enough for expanding variables in relatively short command lines.	2013-04-08 23:56:37 -04:00
Joey Hess	d440b6047b	Added annex.web-download-command setting.	2013-04-08 23:34:05 -04:00
Joey Hess	602baae12e	Bugfix: Direct mode no longer repeatedly checksums duplicated files. Fixed by storing a list of cached inodes for a key, instead of just one. Backwards compatibility note: An old git-annex version will fail to parse an inode cache file that has been written by a new version, and has multiple items. It will succeed if there is just one. So old git-annexes will have even worse behavior when there are duplicated files, if that is possible. I don't think it will be a problem. (Famous last words.) Also, note that it doesn't expire old and unused inode caches for a key. It would be possible to add this if needed; just look through the associated files for a key and if there are more cached inodes, throw out any not corresponding to associated files. Unless a file is being copied repeatedly and the old copy deleted, this lack of expiry should not be a problem.	2013-04-06 16:07:25 -04:00
Joey Hess	f1b0a4b404	Use lower case hash directories for storing files on crippled filesystems, same as is already done for bare repositories. * since this is a crippled filesystem anyway, git-annex doesn't use symlinks on it * so there's no reason to use the mixed case hash directories that we're stuck using to avoid breaking everyone's symlinks to the content * so we can do what is already done for all bare repos, and make non-bare repos on crippled filesystems use the all-lower case hash directories * which are, happily, all 3 letters long, so they cannot conflict with mixed case hash directories * so I was able to 100% fix this and even resuming `git annex add` in the test case will recover and it will all just work.	2013-04-04 15:46:33 -04:00
Joey Hess	8a5b397ac4	hlint	2013-04-03 03:52:41 -04:00
Joey Hess	0b57113c42	cleanup	2013-04-02 19:45:52 -04:00
Joey Hess	38d61f934d	Update working tree files fully atomically This avoids commit churn by the assistant when eg, replacing a file with a symlink. But, just as importantly, it prevents the working tree being left with a deleted file if git-annex, or perhaps the whole system, crashes at the wrong time. (It also probably avoids confusing displays in file managers.)	2013-04-02 15:02:00 -04:00
Joey Hess	67e817c6a1	New annex.largefiles setting, which configures which files `git annex add` and the assistant add to the annex. I would have sort of liked to put this in .gitattributes, but it seems it does not support multi-word attribute values. Also, making this a single config setting makes it easy to only parse the expression once. A natural next step would be to make the assistant `git add` files that are not annex.largefiles. OTOH, I don't think `git annex add` should `git add` such files, because git-annex command line tools are not in the business of wrapping git command line tools.	2013-03-29 16:17:13 -04:00
Joey Hess	75a1c2f91a	cleanup debug print	2013-03-28 14:18:26 -04:00
Joey Hess	80c8c0e62a	comment typo	2013-03-18 13:17:43 -04:00
Joey Hess	7a77f98576	move comment to right place	2013-03-18 11:18:04 -04:00
Joey Hess	b3d3ece2ab	remove old debug print	2013-03-16 17:04:48 -04:00
Joey Hess	f7de51e8b6	Bugfix: Fix bug in inode cache sentinel check, which broke copying to local repos if the repo being copied from had moved to a different filesystem or otherwise changed all its inodes	2013-03-12 16:41:54 -04:00
Joey Hess	61c5e8736c	detect renames during commit, and .. um, do nothing special because it's lunch time But I'm well set up to fast-track direct mode adds for renames now.	2013-03-11 12:56:47 -04:00
Joey Hess	40df015d90	remove Eq instance for InodeCache There are two types of equality here, and which one is right varies, so this forces me to consider and choose between them. Based on this, I learned that the commit in git annex sync was always doing a strong comparison, even when in a repository where the inodes had changed. Fixed that.	2013-03-11 02:57:48 -04:00
Joey Hess	cbb6e1fae4	tag xmpp pushes with jid This fixes the issue mentioned in the last commit. Turns out just collecting UUID of clients behind a XMPP remote is insufficient (although I should probably still do it for other reasons), because a single remote repo might be connected via both XMPP and local pairing. So a way is needed to know when a push was received from any client using a given XMPP remote over XMPP, as opposed to via ssh.	2013-03-06 16:29:19 -04:00
Joey Hess	c23ea9e311	assistant: Get back in sync with XMPP remotes after network reconnection, and on startup. Make manualPull send push requests over XMPP. When reconnecting with remotes, those that are XMPP remotes cannot immediately be pulled from and scanned, so instead maintain a set of (probably) desynced remotes, and put XMPP remotes on it. (This set could be used in other ways later, if we can detect we're out of sync with other types of remotes.) The merger handles detecting when a XMPP push is received from a desynced remote, and triggers a scan then, if they have in fact diverged. This has one known bug: A single XMPP remote can have multiple clients behind it. When this happens, only the UUID of one client is recorded as the UUID of the XMPP remote. Pushes from the other XMPP clients will not trigger a scan. If the client whose UUID is expected responds to the push request, it'll work, but when that client is offline, we're SOL.	2013-03-06 15:09:31 -04:00
Joey Hess	974d075108	Run ssh with -T to avoid tty allocation and any login scripts that may do undesired things with it.	2013-03-04 23:36:07 -04:00
Joey Hess	0c13d3065e	git subcommand cleanup Pass subcommand as a regular param, which allows passing git parameters like -c before it. This was already done in the piping set of functions, but not the command running set.	2013-03-03 13:39:07 -04:00
Joey Hess	cbd53b4a8c	Makefile now builds using cabal, taking advantage of cabal's automatic detection of appropriate build flags. The only thing lost is ./ghci Speed: make fast used to take 20 seconds here, when rebuilding from touching Command/Unused.hs. With cabal, it's 29 seconds.	2013-02-27 02:39:22 -04:00
Joey Hess	2d9c046dea	annex.version is now set to 4 for direct mode repositories To avoid old versions of git-annex getting confused. There is no upgrade required though. We switch back to 3 when going from direct to indirect.	2013-02-26 15:13:10 -04:00
Joey Hess	e423190b11	fix	2013-02-24 17:40:14 -04:00
Joey Hess	6ff1ce76b7	hopefully fix a bug	2013-02-24 17:21:04 -04:00
Joey Hess	afb21353c8	remove debug print	2013-02-23 14:34:02 -04:00
Joey Hess	051476c2a9	squelch warning	2013-02-22 18:22:12 -04:00
Joey Hess	08854afa10	fix inverted logic	2013-02-22 17:01:48 -04:00
Joey Hess	4689fbde35	fix sameInodeCache to check the inode change sentinel This should fix the problem where the assistant, on Android, re-adds every file on startup.	2013-02-22 15:19:28 -04:00
Joey Hess	faa9d3c22b	work around broken getEnvironment on Android in the most important place: git annex init This resulted in a lot of user complaints that git annex init had git telling them they needed to run git config --global user.email .. which didn't work because even HOME was not passed into git.	2013-02-22 14:47:29 -04:00
Joey Hess	6f9be431e6	only create inode sentinel file when initializing a new repo	2013-02-20 13:55:53 -04:00
Joey Hess	00b465e213	shorter directory to external ssh socket Before it was too long to be used.	2013-02-19 17:31:08 -04:00
Joey Hess	624e34649f	Direct mode: Support filesystems like FAT which can change their inodes each time they are mounted.	2013-02-19 17:31:03 -04:00
Joey Hess	0f4cc559a7	Android: Support ssh connection caching.	2013-02-19 14:57:45 -04:00
Joey Hess	d799ef3182	set fileSystemEncoding when reading files that might be binary	2013-02-18 17:19:37 -04:00
Joey Hess	422dd28f0b	hlint	2013-02-18 02:39:40 -04:00
Joey Hess	9aa979edbd	types	2013-02-18 02:35:38 -04:00
Joey Hess	d7c93b8913	fully support core.symlinks=false in all relevant symlink handling code Refactored annex link code into nice clean new library. Audited and dealt with calls to createSymbolicLink. Remaining calls are all safe, because: Annex/Link.hs: ( liftIO $ createSymbolicLink linktarget file only when core.symlinks=true Assistant/WebApp/Configurators/Local.hs: createSymbolicLink link link test if symlinks can be made Command/Fix.hs: liftIO $ createSymbolicLink link file command only works in indirect mode Command/FromKey.hs: liftIO $ createSymbolicLink link file command only works in indirect mode Command/Indirect.hs: liftIO $ createSymbolicLink l f refuses to run if core.symlinks=false Init.hs: createSymbolicLink f f2 test if symlinks can be made Remote/Directory.hs: go [file] = catchBoolIO $ createSymbolicLink file f >> return True fast key linking; catches failure to make symlink and falls back to copy Remote/Git.hs: liftIO $ catchBoolIO $ createSymbolicLink loc file >> return True ditto Upgrade/V1.hs: liftIO $ createSymbolicLink link f v1 repos could not be on a filesystem w/o symlinks Audited and dealt with calls to readSymbolicLink. Remaining calls are all safe, because: Annex/Link.hs: ( liftIO $ catchMaybeIO $ readSymbolicLink file only when core.symlinks=true Assistant/Threads/Watcher.hs: ifM ((==) (Just link) <$> liftIO (catchMaybeIO $ readSymbolicLink file)) code that fixes real symlinks when inotify sees them It's ok to not fix pseudo-symlinks. Assistant/Threads/Watcher.hs: mlink <- liftIO (catchMaybeIO $ readSymbolicLink file) ditto Command/Fix.hs: stopUnless ((/=) (Just link) <$> liftIO (catchMaybeIO $ readSymbolicLink file)) $ do command only works in indirect mode Upgrade/V1.hs: getsymlink = takeFileName <$> readSymbolicLink file v1 repos could not be on a filesystem w/o symlinks Audited and dealt with calls to isSymbolicLink. (Typically used with getSymbolicLinkStatus, but that is just used because getFileStatus is not as robust; it also works on pseudolinks.) Remaining calls are all safe, because: Assistant/Threads/SanityChecker.hs: | isSymbolicLink s -> addsymlink file ms only handles staging of symlinks that were somehow not staged (might need to be updated to support pseudolinks, but this is only a belt-and-suspenders check anyway, and I've never seen the code run) Command/Add.hs: if isSymbolicLink s || not (isRegularFile s) avoids adding symlinks to the annex, so not relevant Command/Indirect.hs: | isSymbolicLink s -> void $ flip whenAnnexed f $ only allowed on systems that support symlinks Command/Indirect.hs: whenM (liftIO $ not . isSymbolicLink <$> getSymbolicLinkStatus f) $ do ditto Seek.hs:notSymlink f = liftIO $ not . isSymbolicLink <$> getSymbolicLinkStatus f used to find unlocked files, only relevant in indirect mode Utility/FSEvents.hs: | Files.isSymbolicLink s = runhook addSymlinkHook $ Just s Utility/FSEvents.hs: | Files.isSymbolicLink s -> Utility/INotify.hs: | Files.isSymbolicLink s -> Utility/INotify.hs: checkfiletype Files.isSymbolicLink addSymlinkHook f Utility/Kqueue.hs: | Files.isSymbolicLink s = callhook addSymlinkHook (Just s) change all above are lower-level, not relevant Audited and dealt with calls to isSymLink. Remaining calls are all safe, because: Annex/Direct.hs: | isSymLink (getmode item) = This is looking at git diff-tree objects, not files on disk Command/Unused.hs: | isSymLink (LsTree.mode l) = do This is looking at git ls-tree, not file on disk Utility/FileMode.hs:isSymLink :: FileMode -> Bool Utility/FileMode.hs:isSymLink = checkMode symbolicLinkMode low-level Done!!	2013-02-17 16:43:14 -04:00
Joey Hess	397082013a	proper fix for dropunused Now getKeysPresent checks that the key's content, not only its directory, exists. In direct mode, the inode cache file is used as a standin for the content. removeAnnex always removes the inode cache file, and drop and move --from always call removeAnnex, even if the object does not seem to be inAnnex, to ensure it's always deleted.	2013-02-15 17:58:49 -04:00
Joey Hess	5a8fb26d0a	Revert "Clean up direct mode cache and mapping info when dropping keys." This reverts commit `57780cb3a4`. This was buggy, it caused the direct mode cache to be lost when dropping keys, so when the file is gotten back, it's stored in indirect mode. Note to self: Do not attempt bug fixes at 6 am!	2013-02-15 16:37:57 -04:00
Joey Hess	5ea4b91fb4	start to support core.symlinks=false Utility functions to handle no symlink mode, and converted Annex.Content to use them; still many other places to convert.	2013-02-15 16:03:11 -04:00
Joey Hess	7ce30b534f	add: Improved detection of files that are modified while being added. In indirect mode, now checks the inode cache to detect changes to a file. Note that a file can still be changed if a process has it open for write, after landing in the annex. In direct mode, some checking of the inode cache was done before, but from a much later point, so fewer modifications could be detected. Now it's as good as indirect mode. On crippled filesystems, no lock down is done before starting to add a file, so checking the inode cache is the only protection we have.	2013-02-14 16:54:36 -04:00
Joey Hess	a52f8f382b	split out Utility.InodeCache	2013-02-14 16:17:40 -04:00
Joey Hess	47477b2807	crippled filesystem support, probing and initial support git annex init probes for crippled filesystems, and sets direct mode, as well as `annex.crippledfilesystem`. Avoid manipulating permissions of files on crippled filesystems. That would likely cause an exception to be thrown. Very basic support in Command.Add for crippled filesystems; avoids the lock down entirely since doing it needs both permissions and hard links. Will make this better soon.	2013-02-14 14:15:26 -04:00
Joey Hess	f202d997f4	Now uses the Haskell uuid library, rather than needing a uuid program. Been meaning to do this for some time; Android port was last straw. Note that newer versions of the uuid library have a Data.UUID.V4 that generates random UUIDs slightly more cleanly, but Debian has an old version of the library, so I do it slightly round-about.	2013-02-10 14:52:54 -04:00
Joey Hess	57780cb3a4	Clean up direct mode cache and mapping info when dropping keys. These files were left behind, and made getKeysPresent find keys that were not present. It would be expensive to make getKeysPresent check that the actual key files are present (it just lists the directories). But that's not needed if we just clean up the stale cache and mapping files. To handle systems that were in direct mode and got switched back with stale direct mode files, made cleanObjectLoc remove all files in the key's directory. git annex unused will still list keys that are gone but for which the stale direct mode files exist. To deal with that, made dropunused remove the key's directory even if the key does not seem to be present.	2013-02-07 08:28:40 -04:00
Joey Hess	af3a25ee03	Deal with stale mappings for deleted file in direct mode. The most common way for a mapping to be stale is when a file was deleted, or renamed. Nothing updates the mappings for deletions yet. But they can also become stale in other ways. For example a file can be modified. So, the mapping is not trusted to be consistent. When we get a key, only replace symlinks that still point to that key with its content. When we drop a key, only put back symlinks for files that still have the direct mode content.	2013-02-05 16:48:00 -04:00
Joey Hess	0e3f931f37	add another setting to GitConfig	2013-01-28 00:33:19 +11:00
Joey Hess	103b572d8e	ensure that content directory is thawed when writing direct mode mapping and cache files	2013-01-26 20:09:15 +11:00
Joey Hess	f86462b475	allow lazy reading of map contents Don't explicitly close; hGetContents will close when read is done.	2013-01-18 13:16:16 -04:00
Joey Hess	e481ca7658	some more direct mode fixes Avoid a crash if a mapping contains files that no longer exist. This could happen because eg, one was deleted and a commit has not yet been done to update the mapping. Fix path calculation.	2013-01-18 12:39:26 -04:00
Joey Hess	5c58e9c101	Avoid filename encoding errors when writing direct mode mappings.	2013-01-18 12:26:45 -04:00
Joey Hess	bbf0e74f72	Fix direct mode mapping code to always store direct mode filenames relative to the top of the repository, even when operating inside a subdirectory.	2013-01-18 12:20:08 -04:00
Joey Hess	85c564ea94	In direct mode, files with the same key are no longer hardlinked, as that would cause a surprising behavior if modifying one, where the other would also change.	2013-01-14 11:56:37 -04:00
Joey Hess	a6a5ed8121	check for direct mode file change when copying to a local git remote	2013-01-10 11:45:44 -04:00
Joey Hess	1bc49b7158	Special remotes now all rollback storage of keys that get modified during the transfer, which can happen in direct mode.	2013-01-09 18:42:29 -04:00
Joey Hess	858ad6783b	add works in direct mode Also, changed sync to no longer automatically add files in direct mode. That was only necessary before because add didn't work.	2013-01-06 17:24:22 -04:00
Joey Hess	909f67443f	Fix transferring files to special remotes in direct mode.	2013-01-06 14:29:01 -04:00
Joey Hess	e457be7631	direct: Avoid hardlinking symlinks that point to the same content when the content is not present.	2013-01-06 13:57:53 -04:00
Joey Hess	1c83b6c439	work around a very strange git-cat-file behavior Sometimes it seems that git-cat-file --batch stops getting info for files in the current repo, when ":file" is fed to it. I have not reproduced this at the command line, but only when using git annex whereis and git annex move inside a direct mode repo. Those failed, because cat-file returned "file missing". OTOH, git annex find works fine, despite passing the same file to cat-file. It seems that the failing commands first asked cat-file to show a file on the git-annex branch. Perhaps it got "stuck" on that branch? But I cannot reproduce it running cat-file by hand. Most strange. HEAD is a workaround for this extreme weirdness, since I spent a good 2 hours struggling with it already.	2013-01-05 17:06:24 -04:00
Joey Hess	1cdf2b923d	assistant: Make expensive transfer scan work fully in direct mode. The expensive scan uses lookupFile, but in direct mode, that doesn't work for files that are present. So the scan was not finding things that are present that need to be uploaded. (It did find things not present that needed to be downloaded.) Now lookupFile also works in direct mode. Note that it still prefers symlinks on disk to info committed to git, in direct mode. This is necessary to make things like Assistant.Threads.Watcher.onAddSymlink work correctly, when given a new symlink not yet checked into git (or replacing a file checked into git).	2013-01-05 15:57:53 -04:00
Joey Hess	4008590c68	type based git config handling for remotes Still a couple of places that use git config ad-hoc, but this is most of it done.	2013-01-01 13:58:14 -04:00
Joey Hess	7f7c31df1c	type based git config handling Now there's a Config type, that's extracted from the git config at startup. Note that laziness means that individual config values are only looked up and parsed on demand, and so we get implicit memoization for all of them. So this is not only prettier and more type safe, it optimises several places that didn't have explicit memoization before. As well as getting rid of the ugly explicit memoization code. Not yet done for annex.<remote>.* configuration settings.	2012-12-29 23:10:18 -04:00
Joey Hess	2fdefc656b	fix logic error breaking direct mode assistant autocommit of modified files	2012-12-28 16:00:19 -04:00
Joey Hess	eb40227d15	assistant direct mode file add/change bookkeeping When a file is changed in direct mode, the old content is probably lost (at least from the local repo), and bookkeeping needs to be updated to reflect this. Also, synthetic add events are generated at assistant startup, so make it detect when the file has not really changed, and avoid re-adding it. This does add the overhead of querying the running git cat-file for the key that's recorded in git for the file, each time a file is added or modified in direct mode.	2012-12-25 15:48:15 -04:00
Joey Hess	ddb0adb998	more quickcheck fun	2012-12-19 16:36:19 -04:00
Joey Hess	93c430c2a4	comment	2012-12-19 12:46:35 -04:00
Joey Hess	97d670b0d5	normalise associated files Sometimes ./file will be passed in, and sometimes file; need to treat these the same.	2012-12-19 12:44:24 -04:00
Joey Hess	05ec4587dd	partial and incomplete automatic merging in direct mode Handles our file right, but not theirs.	2012-12-18 17:15:16 -04:00
Joey Hess	53dbcce645	direct mode merging works! Automatic merge resolution code needs to be fixed to preserve objects from direct mode files.	2012-12-18 15:04:44 -04:00
Joey Hess	5df3c66a85	added direct and indirect commands	2012-12-13 15:44:56 -04:00
Joey Hess	cfe354eccd	whitespace fix	2012-12-13 00:46:30 -04:00
Joey Hess	ffdd08fd2e	Merge branch 'master' into desymlink	2012-12-13 00:46:10 -04:00
Joey Hess	0d50a6105b	whitespace fixes	2012-12-13 00:45:27 -04:00
Joey Hess	b080a58b76	Merge branch 'master' into desymlink Conflicts: Annex/CatFile.hs Annex/Content.hs Git/LsFiles.hs Git/LsTree.hs	2012-12-13 00:29:06 -04:00
Joey Hess	f87a781aa6	finished where indentation changes	2012-12-13 00:24:19 -04:00
Joey Hess	e7b8cb0063	direct mode committing	2012-12-12 19:20:38 -04:00
Joey Hess	f2ed0f9659	fix associated files to not fall back to object location	2012-12-12 13:11:59 -04:00
Joey Hess	752b5354ab	make parent directory	2012-12-12 13:05:50 -04:00
Joey Hess	9d133270c2	update	2012-12-10 15:02:44 -04:00
Joey Hess	514957914d	direct mode mappings now updated by git annex sync Still lots to do to make sync handle direct mode, but this is a good first step.	2012-12-10 14:37:24 -04:00
Joey Hess	b4c6da9cbd	Got object sending working in direct mode. However, I don't yet have a reliable way to deal with files being modified while they're being transferred. I have code that detects it on the sending side, but the receiver is still free to move the wrong content into its annex, and record that it has the content. So that's not acceptable, and I'll need to work on it some more. However, at this point I can use a direct mode repository as a remote and transfer files from and to it.	2012-12-08 17:03:39 -04:00
Joey Hess	664765e757	update the cache automatically when moving objects in or out	2012-12-08 13:13:36 -04:00
Joey Hess	ef24751922	support for checking presence of objects in direct mode Also for dropping objects in direct mode. Checking presence reliably needs a cache of mtime, size, and inode. This way, if a file is modified, keys that point to it are no longer present. Also, the code for restoring the symlink when removing objects is unnecessarily messy. calcGitLink was generating links starting with "../../remote/.git/", when running "git annex move --from remote". I put in a workaround, but calcGitLink should probably be fixed. There is not yet support for getting objects from repositories in direct mode; it still looks for content in .git/annex/objects, and there's no one place I can change to fix that. Also, getting objects from direct mode repositories is problematic since they can be changed while the object is being transferred. It probably needs to quarantine it first.	2012-12-07 17:29:55 -04:00
Joey Hess	3898d8c091	support for storing files in direct mode	2012-12-07 14:53:02 -04:00
Joey Hess	99a8a5297c	--auto fixes * get/copy --auto: Transfer data even if it would exceed numcopies, when preferred content settings want it. * drop --auto: Fix dropping content when there are no preferred content settings.	2012-12-06 13:22:16 -04:00
Joey Hess	b5a9560a1b	squelch warning	2012-11-26 16:30:46 -04:00
Joey Hess	da6fb44446	finished XMPP pairing! This includes keeping track of which buddies we're pairing with, to know which PairAck are legitimate.	2012-11-05 17:43:17 -04:00
Joey Hess	9767562f65	rsync special remote: Include annex-rsync-options when running rsync to test a key's presence. Also, use the new withQuietOutput function to avoid running the shell to /dev/null stderr in two other places.	2012-10-28 13:51:14 -04:00
Joey Hess	3417c55189	remove git-annex branch read cache This cache prevented noticing changes made by another process. The case I just ran into involved the assistant dropping a file, which cached its presence info. Then the same file was downloaded again, but the assistant didn't know its presence info had changed. I don't see a way to keep this cache. Will instead rely on the OS level file cache, for files in the journal. May need to add more higher-level caching of info that it's ok to have a potentially stale copy of, although much of git-annex already does so.	2012-10-19 14:25:15 -04:00
Joey Hess	e7780a39f5	Preferred content path matching bugfix. When in a subdir, both the normal filepath, and the filepath relative to the top of the git repo are needed for matching. The former for key lookup, and the latter for include/exclude to match against. Previously, key lookup didn't work in this situation.	2012-10-17 16:01:09 -04:00
Joey Hess	3156febec8	disable ssh connection caching for standalone builds The standalone build does not bundle its own ssh, so should be built to support as wide an array of ssh versions as possible, so turn off connection caching. Unfortunately, as implemented this forces a full rebuild when building the standalone binary, and of course it makes it somewhat slower. This is not ideal, but neither is probing the ssh version every time it's run (slow), or once when initializing a repo (fragile).	2012-10-15 14:49:40 -04:00
Joey Hess	97ea08e2d1	Avoid unsetting HOME when running certian git commands. Closes: #690193 Setting GIT_INDEX_FILE clobbers the rest of the environment, making git not read ~/.gitconfig, and blow up if GECOS didn't have a name for the user. I'm not entirely happy with getEnvironment being run every time now, that's somewhat expensive. It may make sense to just set GIT_COMMITTER_* and GIT_AUTHOR_*, but I worry that clobbering the rest could break PATH, or GIT_PATH, or something else that might be used by a command run in here. And caching the environment is not a good idea either; it can change..	2012-10-11 12:58:24 -04:00
Joey Hess	39be7eea40	add standard group selector to repo edit form	2012-10-10 16:04:28 -04:00
Joey Hess	9da7dd8874	webapp: configure new repos to use the standard preferred content settings	2012-10-10 15:35:10 -04:00
Joey Hess	3490977d97	webapp: put new repos in standard groups I'm using transfer for most things, both removable drives and cloud storage, because it's the safest choice. We'll see if it makes sense to prompt for the group when setting this up, or let the user pick something else after the fact.	2012-10-10 15:27:25 -04:00
Joey Hess	f9b81c7a75	refactor	2012-10-10 15:15:56 -04:00
Joey Hess	5ac15149cc	assistant: Now honors preferred content settings when deciding what to transfer. Both when queueing downloads, and uploads, consults the preferred content settings. I didn't make it check yet when requeuing failed transfers or queuing deferred downloads; dealing with the preferred content settings (or indeed, other settings) changing while the assistant is running still needs work.	2012-10-09 12:18:41 -04:00
Joey Hess	fee40dd374	generalized Annex.Wanted this should make it easy to use from inside the assistant, where everything is an AssociatedFile.	2012-10-08 17:14:01 -04:00
Joey Hess	836561e057	fix inverted logic for shouldDrop	2012-10-08 16:12:02 -04:00
Joey Hess	1eedf495c3	make copy --to check preferred content of the remote	2012-10-08 16:06:56 -04:00
Joey Hess	34e7faf71a	uninit: Unset annex.version. Closes: #689852	2012-10-07 16:04:03 -04:00
Joey Hess	47314c0fad	fix last zombies in the assistant Made Git.LsFiles return cleanup actions, and everything waits on processes now, except of course for Seek.	2012-10-04 19:56:32 -04:00
Joey Hess	5594bf0643	more zombie fighting I'm down to 9 places in the code that can produce unwaited for zombies. Most of these are pretty innocuous, at least for now, are only used in short-running commands, or commands that run a set of actions and explicitly reap zombies after each one. The one from Annex.Branch.files could be trouble later, since both Command.Fsck and Command.Unused can trigger it, and the assistant will be doing those eventually. Ditto the one in Git.LsTree.lsTree, which Command.Unused uses. The only ones currently affecting the assistant though, are in Git.LsFiles. Several threads use several of those. (And yeah, using pipes or ResourceT would be a less ad-hoc approach, but I don't really feel like ripping my entire code base apart right now to change a foundation monad. Maybe one of these days..)	2012-10-04 18:47:31 -04:00
Joey Hess	bc83179a76	Test that uuid -m works, falling back to plain uuid if not.	2012-09-25 10:48:20 -04:00
Joey Hess	3887432c54	fixes for transfer resume Fix resuming of downloads, which do not have a transfer info file to read. When checking upload progress, use the MVar, rather than re-reading the info file. Catch exceptions in the transfer action. Required a tryAnnex.	2012-09-24 13:18:16 -04:00
Joey Hess	e8188ea611	flip catchDefaultIO	2012-09-17 00:18:07 -04:00
Joey Hess	ba0334116c	more descriptive name for oneshot	2012-09-15 20:46:38 -04:00
Joey Hess	750c4ac6c2	bugfix: avoid staging but not committing changes to git-annex branch Branch.get is not able to see changes that have been staged to the index but not committed. This is a limitation of git cat-file --batch; when reading from the index, as opposed to from a branch, it does not notice changes made after the first time it reads the index. So, had to revert the changes made in `1f73db3469` to make annex.alwayscommit=false stage changes. Also, ensure that Branch.change and Branch.get always see changes at all points during a commit, by not deleting journal files when staging to the index. Delete them only after committing the branch. Before, there was a race during commits where a different git-annex could see out-of-date info from the branch while a commit was in progress. That's also done when updating the branch to merge in remote branches. In the case where the local git-annex branch has had changes pushed into it that are not yet reflected in the index, and there are journalled changes as well, a merge commit has to be done.	2012-09-15 20:15:16 -04:00
Joey Hess	a1f93f06fd	eliminate some commits to the git-annex branch Commits used to be made to the git-annex branch whenever there were journalled changes from a previous command, and the current command looked up the value of a file. This no longer happens. This means that transferkey, which is a oneshot command that stages changes, can be run multiple times by the assistant, without each of them committing the changes made by the command before. Which will be a lot faster and use less space by batching up the commits. Commits still happen if a remote git-annex branch has been changed and is merged in.	2012-09-15 18:36:42 -04:00
Joey Hess	ca45cea113	Revert "add catFileIndex" This interface is not a good idea, because a running git cat-file --batch does not notice when existing files in the index are changed.	2012-09-15 18:30:53 -04:00
Joey Hess	e1baf48d88	add catFileIndex	2012-09-15 17:06:10 -04:00
Joey Hess	87fb9c690e	remove withIndexUpdate helper	2012-09-15 15:48:21 -04:00
Joey Hess	5573911d25	Disable ssh connection caching if the path to the control socket would be too long (and use relative path to minimise path to the control socket).	2012-09-13 19:26:39 -04:00
Joey Hess	c9b3b8829d	thread safe git-annex index file use	2012-08-24 20:50:39 -04:00
Joey Hess	5c3e14649e	avoid unnecessary transfer scans when syncing a disconnected remote Found a very cheap way to determine when a disconnected remote has diverged, and has new content that needs to be transferred: Piggyback on the git-annex branch update, which already checks for divergence. However, this does not check if new content has appeared locally while disconnected, that should be transferred to the remote. Also, this does not handle cases where the two git repos are in sync, but their content syncing has not caught up yet. This code could have its efficiency improved: * When multiple remotes are synced, if any one has diverged, they're all queued for transfer scans. * The transfer scanner could be told whether the remote has new content, the local repo has new content, or both, and could optimise its scan accordingly.	2012-08-22 15:05:57 -04:00
Joey Hess	9fc94d780b	better readProcess	2012-07-19 00:57:40 -04:00
Joey Hess	1db7d27a45	add back debug logging Make Utility.Process wrap the parts of System.Process that I use, and add debug logging to them. Also wrote some higher-level code that allows running an action with handles to a processes stdin or stdout (or both), and checking its exit status, all in a single function call. As a bonus, the debug logging now indicates whether the process is being run to read from it, feed it data, chat with it (writing and reading), or just call it for its side effect.	2012-07-19 00:46:52 -04:00
Joey Hess	d1da9cf221	switch from System.Cmd.Utils to System.Process Test suite now passes with -threaded! I traced back all the hangs with -threaded to System.Cmd.Utils. It seems it's just crappy/unsafe/outdated, and should not be used. System.Process seems to be the cool new thing, so converted all the code to use it instead. In the process, --debug stopped printing commands it runs. I may try to bring that back later. Note that even SafeSystem was switched to use System.Process. Since that was a modified version of code from System.Cmd.Utils, it needed to be converted too. I also got rid of nearly all calls to forkProcess, and all calls to executeFile, which I'm also doubtful about working well with -threaded.	2012-07-18 18:00:24 -04:00
Joey Hess	05310538ef	more debugging	2012-07-18 13:31:00 -04:00
Joey Hess	75b6ee81f9	avoid ByteString.Char8 where not needed Its truncation behavior is a red flag, so avoid using it in these places where only raw ByteStrings are used, without looking at the data inside.	2012-06-20 13:13:40 -04:00
Joey Hess	e0095b0bdc	fishy commit	2012-06-14 00:01:48 -04:00
Joey Hess	942d8f7298	hlint	2012-06-12 11:32:06 -04:00
Joey Hess	ca9ee21bd7	crazy optimisation Crazy like a fox..	2012-06-10 19:58:34 -04:00
Joey Hess	c5707c84d3	queue size fix Increase queue size for update-index actions, because otherwise they'll never be flushed.	2012-06-10 13:56:04 -04:00
Joey Hess	d45a9a7831	refactor and function name cleanup (oops, I had a calcMerge and a calc_merge!)	2012-06-08 00:29:39 -04:00
Joey Hess	20f425be19	make watch use the queue May not work. Certainly needs to flush the queue from time to time when only symlink changes are being made.	2012-06-07 15:40:44 -04:00
Joey Hess	0a11b35d89	extend Git.Queue to be able to queue more than simple git commands While I was in there, I noticed and fixed a bug in the queue size calculations. It was never encountered only because Queue.add was only ever run with 1 file in the list.	2012-06-07 15:19:44 -04:00
Joey Hess	b819f644ad	close the git add race There's a race adding a new file to the annex: The file is moved to the annex and replaced with a symlink, and then we git add the symlink. If someone comes along in the meantime and replaces the symlink with something else, such as a new large file, we add that instead. Which could be bad.. This race is fixed by avoiding using git add, instead the symlink is directly staged into the index. It would be nice to make `git annex add` use this same technique. I have not done so yet because it currently runs git update-index once per file, which would slow down `git annex add`. A future enhancement would be to extend the Git.Queue to include the ability to run update-index with a list of Streamers.	2012-06-06 14:29:10 -04:00
Joey Hess	993e6459a3	factor out nukeFile	2012-06-06 13:13:13 -04:00
Joey Hess	27cfeca4ea	Merge branch 'master' into watch	2012-06-06 02:16:21 -04:00
Joey Hess	f1bd72ea54	factor out generic update-index code from unionmerge code	2012-06-06 00:10:34 -04:00
Joey Hess	7a6fb8ae4e	flush the git queue when a new type of action is being added to it This allows the queue to be used in a single process for multiple possibly conflicting commands, like add and rm, without running them out of order. This assumes that running the same git subcommand with different parameters cannot itself conflict.	2012-06-04 20:41:22 -04:00
Joey Hess	bb4f31a0ee	Clean up handling of git directory and git worktree. Baked into the code was an assumption that a repository's git directory could be determined by adding ".git" to its work tree (or nothing for bare repos). That fails when core.worktree, or GIT_DIR and GIT_WORK_TREE are used to separate the two. This was attacked at the type level, by storing the gitdir and worktree separately, so Nothing for the worktree means a bare repo. A complication arose because we don't learn where a repository is bare until its configuration is read. So another Location type handles repositories that have not had their config read yet. I am not entirely happy with this being a Location type, rather than representing them entirely separate from the Git type. The new code is not worse than the old, but better types could enforce more safety. Added support for core.worktree. Overriding it with -c isn't supported because it's not really clear what to do if a git repo's config is read, is not bare, and is then overridden to bare. What is the right git directory in this case? I will worry about this if/when someone has a use case for overriding core.worktree with -c. (See Git.Config.updateLocation) Also removed and renamed some functions like gitDir and workTree that misused git's terminology. One minor regression is known: git annex add in a bare repository does not print a nice error message, but runs git ls-files in a way that fails earlier with a less nice error message. This is because before --work-tree was always passed to git commands, even in a bare repo, while now it's not.	2012-05-18 17:03:12 -04:00
Joey Hess	f7d8982672	Fix use of several config settings annex.ssh-options, annex.rsync-options, annex.bup-split-options. And adjust types to avoid the bugs that broke several config settings recently. Now "annex." prefixing is enforced at the type level.	2012-05-05 20:16:56 -04:00
Joey Hess	76102c1c75	display "Recording state in git..." when staging the journal A bit tricky to avoid printing it twice in a row when there are queued git commands to run and journal to stage. Added a generic way to run an action that may output multiple side messages, with only the first displayed.	2012-04-27 13:54:33 -04:00
Joey Hess	e0b7012ccc	uninit: Clear annex.uuid from .git/config. Closes: #670639	2012-04-27 12:21:38 -04:00
Joey Hess	84ac8c58db	Add annex.httpheaders and annex.httpheader-command config settings Allow custom headers to be sent with all HTTP requests. (Requested by the Internet Archive)	2012-04-22 01:13:09 -04:00
Joey Hess	ed79596b75	noop	2012-04-21 23:32:33 -04:00
Joey Hess	bee420bd2d	in which I discover void void :: Functor f => f a -> f () -- ah, of course that's useful :)	2012-04-21 23:06:19 -04:00
Joey Hess	cab63b89f2	cache parsed core.sharedrepository	2012-04-21 19:42:49 -04:00
Joey Hess	b98b69e8c6	honor core.sharedRepository when making all the other files in the annex Lock files, directories, etc.	2012-04-21 19:36:03 -04:00
Joey Hess	7e45712d19	better file mode setting code	2012-04-21 16:01:56 -04:00
Joey Hess	b4a5e39ee6	Support git's core.sharedRepository configuration This is incomplete, it does not honor it yet for hash directories and other annex bookkeeping files. Some of that is not needed for a bare repo; some of it may be.	2012-04-21 15:36:52 -04:00
Joey Hess	b65e257b13	inverted logic	2012-04-20 16:16:13 -04:00
Joey Hess	262017e17d	export a more generalized checkDiskSpace	2012-04-20 16:06:10 -04:00
Joey Hess	e38a839a80	Rewrote free disk space checking code Moving the portability handling into a small C library cleans up things a lot, avoiding the pain of unpacking structs from inside haskell code.	2012-03-22 17:32:47 -04:00
Joey Hess	f1398b5583	use new getConfig	2012-03-22 17:32:47 -04:00
Joey Hess	4eb5112681	rationalize getConfig getConfig got a remote-specific config, and this confusing name caused it to be used a couple of places that only were interested in global configs. Rename to getRemoteConfig and make getConfig only get global configs. There are no behavior changes here, but remote.<name>.annex-web-options never actually worked (and per-remote web options is a very unlikely to be useful case so I didn't make it work), so fix the documentation for it.	2012-03-22 17:32:47 -04:00
Joey Hess	188e2edc41	status: Prints available local disk space, or shows if git-annex doesn't know.	2012-03-21 21:55:02 -04:00
Joey Hess	181d2ccd20	Improve detection of inability to check free disk space. Don't check if configure indicated checks won't work. This should fix a FTBFS on mipsel, where configure correctly detects the checks won't work, while garbage is returned for disk space info at git-annex runtime. It also means that, when built via cabal, disk space checks are not enabled, unfortunately.	2012-03-21 21:21:20 -04:00
Joey Hess	60ab3d84e1	added ifM and nuked 11 lines of code no behavior changes	2012-03-14 17:43:34 -04:00
Joey Hess	b325694645	getKeysPresent is now fully lazy .. Allowing it to be used by things in constant space! Random statistics: git annex status has gone from taking 239 mb of memory and 26 seconds in a repo, to 8 mb and 13 seconds. The trick here is the unsafeInterleaveIO, and the form of the function's recursion, which I cribbed heavily from System.IO.HVFS.Utils.recurseDirStat. The difference is, this one goes to a limited depth and avoids statting everything.	2012-03-11 18:04:58 -04:00
Joey Hess	ff3644ad38	status: Fixed to run in nearly constant space. Before, it leaked space due to caching lists of keys. Now all necessary data about keys is calculated as they stream in. The "nearly constant" is due to getKeysPresent, which builds up a lot of [] thunks as it traverses .git/annex/objects/. Will deal with it later.	2012-03-11 17:15:58 -04:00
Joey Hess	d08ee1a9d2	syscall optimisation	2012-03-06 13:56:20 -04:00
Joey Hess	12b89a3eb8	configure: Check if ssh connection caching is supported by the installed version of ssh and default annex.sshcaching accordingly.	2012-02-25 19:15:29 -04:00
Joey Hess	1f73db3469	improve alwayscommit=false mode Now changes are staged into the branch's index, but not committed, which avoids growing a large journal. And sync and merge always explicitly commit, ensuring that even when they do nothing else, they commit the staged changes. Added a flag file to indicate that the branch's journal contains uncommitted changes. (Could use git ls-files, but don't want to run that every time.) In the future, this ability to have uncommitted changes staged in the journal might be used on remotes after a series of oneshot commands.	2012-02-25 16:18:55 -04:00
Joey Hess	b49c0c2633	add annex.alwayscommit option To avoid commits of data to the git-annex branch after each command is run, set annex.alwayscommit=false. Its data will then be committed less frequently, when a merge or sync is done.	2012-02-25 15:31:42 -04:00
Joey Hess	bd66f962d3	Deal with NFS problem that caused a failure to remove a directory when removing content from the annex. I was able to reproduce this on linux using the kernel's nfs server and mounting localhost:/. Determined that removing the directory fails when the just-deleted file in it was locked. Considered dropping the lock before removing the directory, but this would complicate parts of the code that should not need to worry about locking. So instead, ignore the failure to remove the directory in this case. While I was at it, made it attempt to remove both levels of hash directories, in case they're empty.	2012-02-24 16:30:47 -04:00
Joey Hess	a1e52f0ce5	hlint	2012-02-16 00:44:51 -04:00
Joey Hess	52c5b164d8	Added an annex.queuesize setting useful when adding hundreds of thousands of files on a system with plenty of memory. git add gets quite slow in such a large repository, so if the system has more than the ~32 mb of memory the queue can use by default, it's a useful optimisation to increase the queue size, in order to decrease the number of times git add is run.	2012-02-15 11:14:19 -04:00
Joey Hess	03c559f8d6	tweak	2012-02-14 14:51:26 -04:00
Joey Hess	7ebd98d8d8	fix memory leak when staging the journal The list of files had to be retained until the end so it could be deleted. Also, a list of update-index lines was generated and only then fed into it. Now everything streams in constant space.	2012-02-14 14:37:59 -04:00
Joey Hess	a40ec5e03e	Fixed a memory leak due to excessive strictness when committing journal files. When hashing the files, the entire list of shas was read strictly. That was entirely unnecessary, since there's a cleanup action run after they're consumed.	2012-02-14 11:20:34 -04:00
Joey Hess	cbaebf538a	rework git check-attr interface Now gitattributes are looked up, efficiently, in only the places that really need them, using the same approach used for cat-file. The old CheckAttr code seemed very fragile, in the way it streamed files through git check-attr. I actually found that `cad8824852` was still deadlocking with ghc 7.4, at the end of adding a lot of files. This should fix that problem, and avoid future ones. The best part is that this removes withAttrFilesInGit and withNumCopies, which were complicated Seek methods, as well as simplifying the types for several other Seek methods that had a Backend tupled in.	2012-02-13 23:52:21 -04:00
Joey Hess	d55f3c0716	Fix teardown of stale cached ssh connections.	2012-02-09 21:49:46 -04:00
Joey Hess	146c36ca54	IO exception rework ghc 7.4 complains about use of System.IO.Error to catch exceptions. Ok, use Control.Exception, with variants specialized to only catch IO exceptions.	2012-02-03 16:47:24 -04:00
Joey Hess	b81d662cbf	Avoid repeated location log commits when a remote is receiving files. Done by adding a oneshot mode, in which location log changes are written to the journal, but not committed. Taking advantage of git-annex's existing ability to recover in this situation. This is used by git-annex-shell and other places where changes are made to a remote's location log.	2012-01-28 15:41:52 -04:00
Joey Hess	ba6088b249	rename readMaybe to readish a stricter (but also partial) readMaybe is getting added to base	2012-01-23 17:00:10 -04:00
Joey Hess	eb9001044f	order user provided params after connection caching params So the user can override them.	2012-01-20 17:32:32 -04:00
Joey Hess	6ef82665de	add annex.sshcaching config setting	2012-01-20 17:15:46 -04:00
Joey Hess	47250a153a	ssh connection caching Ssh connection caching is now enabled automatically by git-annex. Only one ssh connection is made to each host per git-annex run, which can speed some things up a lot, as well as avoiding repeated password prompts. Concurrent git-annex processes also share ssh connections. Cached ssh connections are shut down when git-annex exits. Note: The rsync special remote does not yet participate in the ssh connection caching.	2012-01-20 17:14:56 -04:00
Joey Hess	61dbad505d	fsck --from remote --fast Avoids expensive file transfers, at the expense of checking file size and/or contents. Required some reworking of the remote code.	2012-01-20 13:23:11 -04:00
Joey Hess	effaa298fa	optimise fsck --from normal git remotes For a local git remote, can symlink the file. For a git remote using rsync, can preseed any local content. There are a few reasons to use fsck --from on a normal git remote. One is if it's using gitosis or similar, and you don't have shell access to run git annex locally. Another reason could be if you just want to fsck certain files of a bare remote.	2012-01-19 17:10:44 -04:00
Joey Hess	81856c3175	add a configure check for StatFS This way, the build log will indicate whether StatFS can be relied on. I've tested all the failing architectures now, and on all of them, the StatFS code now returns Nothing, rather than Just nonsense. Also, if annex.diskreserve is set on a platform where StatFS is not working, git-annex will complain. Also, the Makefile was missing the sources target used when building with cabal.	2012-01-15 13:49:32 -04:00
Joey Hess	a3d97e0c85	tweak	2012-01-14 14:31:16 -04:00
Joey Hess	5e2b4e16ba	avoid multiple unnecessary stats of the index file Up to one per file processed.	2012-01-14 12:07:36 -04:00
Joey Hess	abdacf58ed	tweaks	2012-01-11 00:06:54 -04:00
Joey Hess	16e7178f20	reorg	2012-01-10 15:29:10 -04:00
Joey Hess	a3a9f87047	log: New command that displays the location log for file, showing each repository they were added to and removed from. This needs to run git log on the location log files to get at all past versions of the file, which tends to be a bit slow. It would be possible to make a version optimised for showing the location logs for every key. That would only need to run git log once, so would be faster, but it would need to process an enormous amount of data, so would not speed up the individual file case. In the future it would be nice to support log --format. log --json also doesn't work right yet.	2012-01-06 15:40:07 -04:00
Joey Hess	aa0882691b	Added remote.name.annex-web-options configuration setting, which can be used to provide parameters to whichever of wget or curl git-annex uses (depends on which is available, but most of their important options suitable for use here are the same).	2012-01-02 14:20:20 -04:00
Joey Hess	252376d639	Merge branch 'master' into autosync	2011-12-30 20:38:59 -04:00
Joey Hess	52104dae6f	refactor	2011-12-30 18:36:40 -04:00
Joey Hess	925b6390aa	add forceUpdate This code is picked from my tweak-fetch branch, which already did the needed refactoring.	2011-12-30 15:57:28 -04:00
Joey Hess	6d4382a89e	Merge branch 'new-monad-control'	2011-12-24 23:02:42 -04:00
Joey Hess	ee3b5b2a42	use Common in a few more modules	2011-12-20 14:37:53 -04:00
Joey Hess	ef28b3fef7	split out Git/Command.hs	2011-12-14 15:56:11 -04:00
Joey Hess	02f1bd2bf4	split more stuff out of Git.hs	2011-12-14 15:43:13 -04:00
Joey Hess	25b2cc4148	move commit to Git.Branch	2011-12-13 15:08:44 -04:00
Joey Hess	13fff71f20	split out three modules from Git Constructors and configuration make sense in separate modules. A separate Git.Types is needed to avoid cycles.	2011-12-13 15:06:49 -04:00
Joey Hess	46588674b0	avoid closing pipe before all the shas are read from it Could have just used hGetContentsStrict here, but that would require storing all the shas in memory. Since this is called at the end of a git-annex run, it may have created a lot of shas, so I avoid that memory use and stream them out like before.	2011-12-12 21:41:37 -04:00
Joey Hess	0e45b762a0	broke out Git/HashObject.hs	2011-12-12 21:24:55 -04:00
Joey Hess	31a0c07ee9	broke out Git/Branch.hs and reorganized	2011-12-12 21:12:51 -04:00
Joey Hess	543d0d2501	split out Git/Ref.hs	2011-12-12 18:30:33 -04:00
Joey Hess	da95cbadca	split out Annex/Journal.hs	2011-12-12 18:03:28 -04:00
Joey Hess	98dfc0c9b0	split out Annex/BranchState.hs	2011-12-12 17:38:46 -04:00
Joey Hess	b2f934e07a	update comment	2011-12-12 17:24:12 -04:00
Joey Hess	79345ad5fc	optimisation avoids a redundant call to git show-ref	2011-12-12 03:30:47 -04:00
Joey Hess	f9cd3f6ad1	optimisation avoids a useless diff from git-annex..refs/heads/git-annex	2011-12-12 02:31:07 -04:00
Joey Hess	2332afb4bc	cleanup	2011-12-12 02:04:48 -04:00
Joey Hess	29b88ad657	avoid redundant call to updateIndex commitBranch calls updateIndex	2011-12-11 21:46:21 -04:00
Joey Hess	c4c965d602	detect and recover from branch push/commit race Dealing with a race without using locking is exceedingly difficult and tricky. Fully tested, I hope. There are three places left where the branch can be updated, that are not covered by the race recovery code. Let's prove they're all immune to the race: 1. tryFastForwardTo checks to see if a fast-forward can be done, and then does git-update-ref on the branch to fast-forward it. If a push comes in before the check, then either no fast-forward will be done (ok), or the push set the branch to a ref that can still be fast-forwarded (also ok). If a push comes in after the check, the git-update-ref will undo the ref change made by the push. It's as if the push did not come in, and the next git-push will see this, and try to re-do it. (acceptable) 2. When creating the branch for the very first time, an empty index is created, and a commit of it made to the branch. The commit's ref is recorded as the current state of the index. If a push came in during that, it will be noticed the next time a commit is made to the branch, since the branch will have changed. (ok) 3. Creating the branch from an existing remote branch involves making the branch, and then getting its ref, and recording that the index reflects that ref. If a push creates the branch first, git-branch will fail (ok). If the branch is created and a racing push is then able to change it (highly unlikely!) we're still ok, because it first records the ref into the index.lck, and then updates the index. The race can cause the index.lck to have the old branch ref, while the index has the newly pushed branch merged into it, but that only results in an unnecessary update of the index file later on.	2011-12-11 20:41:35 -04:00
Joey Hess	e04852c8af	Merge branch 'master' into new-monad-control Conflicts: git-annex.cabal	2011-12-11 16:55:36 -04:00
Joey Hess	cfbbda99f4	optimize index updating The last branch ref that the index was updated to is stored in .git/annex/index.lck, and the index only updated when the current branch ref differs. (The .lck file should later be used for locking too.) Some more optimization is still needed, since there is some redundancy in calls to git show-ref.	2011-12-11 16:14:59 -04:00
Joey Hess	8680c415de	slow, stupid, and safe index updating Always merge the git-annex branch into .git/annex/index before making a commit from the index. This ensures that, when the branch has been changed in any way (by a push being received, or changes pulled directly into it, or even by the user checking it out, and committing a change), the index reflects those changes. This is much too slow; it needs to be optimised to only update the index when the branch has really changed, not every time. Also, there is an unhandled race, when a change is made to the branch right after the index gets updated. I left it in for now because it's unlikely and I didn't want to complicate things with additional locking yet.	2011-12-11 15:05:53 -04:00
Joey Hess	0ba4b1de18	move a file location to Locations.hs	2011-12-11 14:14:28 -04:00
Joey Hess	eecaf42485	no need to show, it's a string	2011-12-10 12:30:31 -04:00
Joey Hess	d64132a43a	hslint	2011-12-09 01:57:13 -04:00
Joey Hess	f3a2f60abc	adjust to build with monad-control-0.3 I had to, I hope temporarily, lose my nice Annex newtype, and use a type synonym. This because I cannot find a way to derive a MonadBaseControl instance of the Annex newtype. I've emailed Bas van Dijk in hope he can help get the newtype back. Otherwise appears to build & work.	2011-12-05 22:51:37 -04:00
Joey Hess	598eb2e2da	cleanup	2011-11-30 12:01:15 -04:00
Joey Hess	da9cd315be	add support for using hashDirLower in addition to hashDirMixed Supporting multiple directory hash types will allow converting to a different one, without a flag day. gitAnnexLocation now checks which of the possible locations have a file. This means more statting of files. Several places currently use gitAnnexLocation and immediately check if the returned file exists; those need to be optimised.	2011-11-28 22:43:51 -04:00
Joey Hess	6869e6023e	support .git/annex on a different disk than the rest of the repo The only fully supported thing is to have the main repository on one disk, and .git/annex on another. Only commands that move data in/out of the annex will need to copy it across devices. There is only partial support for putting arbitrary subdirectories of .git/annex on different devices. For one thing, this can require more copies to be done. For example, when .git/annex/tmp is on one device, and .git/annex/journal on another, every journal write involves a call to mv(1). Also, there are a few places that make hard links between various subdirectories of .git/annex with createLink, that are not handled. In the common case without cross-device, the new moveFile is actually faster than renameFile, avoiding an unnecessary stat to check that a file (not a directory) is being moved. Of course if a cross-device move is needed, it is as slow as mv(1) of the data.	2011-11-28 16:17:55 -04:00
Joey Hess	128b4bd015	tweaks	2011-11-19 15:57:08 -04:00
Joey Hess	0fa1d136dc	tweak	2011-11-19 15:40:40 -04:00
Joey Hess	1ffd54ef78	ensure branch exists before trying to update it The branch may not exist, if .git/annex has been copied over from another repo (or a corrupted repo). I suppose it could also have gotten deleted somehow. Without this, there is a confusing failure.	2011-11-16 18:56:06 -04:00
Joey Hess	9290095fc2	improve type signatures with a Ref newtype In git, a Ref can be a Sha, or a Branch, or a Tag. I added type aliases for those. Note that this does not prevent mixing up of eg, refs and branches at the type level. Since git really doesn't care, except rare cases like git update-ref, or git tag -d, that seems ok for now. There's also a tree-ish, but let's just use Ref for it. A given Sha or Ref may or may not be a tree-ish, depending on the object type, so there seems no point in trying to represent it at the type level.	2011-11-16 02:41:46 -04:00
Joey Hess	272a67921c	better name	2011-11-16 01:46:46 -04:00
Joey Hess	21a925dcf1	merge: Now runs in constant space. Before, a merge was first calculated, by running various actions that called git and built up a list of lines, which were at the end sent to git update-index. This necessarily used space proportional to the size of the diff between the trees being merged. Now, lines are streamed into git update-index from each of the actions in turn. Runtime size of git-annex merge when merging 50000 location log files drops from around 100 mb to a constant 4 mb. Presumably it runs quite a lot faster, too.	2011-11-15 23:28:01 -04:00
Joey Hess	04edae6791	Optimised union merging; now only runs git cat-file once.	2011-11-12 17:45:12 -04:00
Joey Hess	e9bfa8eaed	avoid unnecessary auto-merge when only changing a file in the branch. Avoids doing auto-merging in commands that don't need fully current information from the git-annex branch. In particular, git annex add no longer needs to auto-merge. Affected commands: Anything that doesn't look up data from the branch, but does write a change to it. It might seem counterintuitive that we can change a value without first making sure we have the current value. This optimisation works because these two sequences are equivalent: 1. pull from remote 2. union merge 3. read file from branch 4. modify file and write to branch vs. 1. read file from branch 2. modify file and write to branch 3. pull from remote 4. union merge After either sequence, the git-annex branch contains the same logical content for the modified file. (Possibly with lines in a different order or additional old lines of course).	2011-11-12 15:15:57 -04:00
Joey Hess	897bf938f6	merge: Improve commit messages to mention what was merged.	2011-11-12 14:51:19 -04:00
Joey Hess	637b5feb45	lint	2011-11-11 01:52:58 -04:00
Joey Hess	49d2177d51	factored out some useful error catching methods	2011-11-10 20:57:28 -04:00
Joey Hess	9570421251	better message when content is locked	2011-11-10 02:59:13 -04:00
Joey Hess	a218ce41cf	exclusive locks, ugh	2011-11-09 22:15:33 -04:00
Joey Hess	cf0174c922	content locking I've tested that this solves the cyclic drop problem. Have not looked at cyclic move, etc.	2011-11-09 21:54:42 -04:00
Joey Hess	d3e1a3619f	safer inannex checking git-annex-shell inannex now always returns 0, 1, or 100 (the last when it's unclear if content is currently in the index due to it currently being moved or dropped). (Actual locking code still not yet written.)	2011-11-09 18:33:15 -04:00
Joey Hess	8ce7e73f74	reorg to allow taking content lock The lock will only persist during the perform stage, so the content must be removed from the annex then, rather than in the cleanup stage. (No lock is actually taken yet.)	2011-11-09 16:54:18 -04:00
Joey Hess	56b8194470	cleanup	2011-11-09 01:33:20 -04:00
Joey Hess	bf460a0a98	reorder repo parameters last Many functions took the repo as their first parameter. Changing it consistently to be the last parameter allows doing some useful things with currying, that reduce boilerplate. In particular, g <- gitRepo is almost never needed now, instead use inRepo to run an IO action in the repo, and fromRepo to get a value from the repo. This also provides more opportunities to use monadic and applicative combinators.	2011-11-08 16:27:20 -04:00
Joey Hess	b11a63a860	clean up read/show abuse Avoid ever using read to parse a non-haskell formatted input string. show :: Key is arguably still show abuse, but displaying Keys as filenames is just too useful to give up.	2011-11-08 00:17:54 -04:00
Joey Hess	63a292324d	add a UUID type Should have done this a long time ago.	2011-11-07 15:59:16 -04:00
Joey Hess	f229911715	optimization The last commit added some git-log calls to a merge. This removes some, by only merging branches that have unique refs.	2011-11-06 15:33:15 -04:00
Joey Hess	c99fb58909	merge: Use fast-forward merges when possible. Thanks Valentin Haenel for a test case showing how non-fast-forward merges could result in an ongoing pull/merge/push cycle. While the git-annex branch is fast-forwarded, git-annex's index file is still updated using the union merge strategy as before. There's no other way to update the index that would be any faster. It is possible that a union merge and a fast-forward result in different file contents: Files should have the same lines, but a union merge may change their order. If this happens, the next commit made to the git-annex branch will have some unnecessary changes to line orders, but the consistency of data should be preserved. Note that when the journal contains changes, a fast-forward is never attempted, which is fine, because committing those changes would be vanishingly unlikely to leave the git-annex branch at a commit that already exists in one of the remotes. The real difficulty is handling the case where multiple remotes have all changed. git-annex does find the best (ie, newest) one and fast-forwards to it. If the remotes are diverged, no fast-forward is done at all. It would be possible to pick one, fast-forward to it, and make a merge commit to the rest, but I see no benefit to adding that complexity. Determining the best of N changed remotes requires N*2+1 calls to git-log, but these are fast git-log calls, and N is typically small. Also, typically some or all of the remote refs will be the same, and git-log is not called to compare those. In the real world I expect this will almost always add only 1 git-log call to the merge process. (Which already makes N anyway.)	2011-11-06 15:22:40 -04:00
Joey Hess	5f3dd3d246	ensure directory exists when locking journal Fixes git annex init in a bare repository that already has a git-annex branch.	2011-11-02 15:09:19 -04:00
Joey Hess	1826b3bd67	cleanup	2011-10-27 18:01:52 -04:00
Joey Hess	373cad993d	Sped up some operations on remotes that are on the same host. Specifically, disabled trying to update the git-annex branch on the remote, since that data is never used by operations that act on such remotes. Also, when copying content to such a remote, skip committing the presence information changes to its git-annex branch. Leaving it in the journal there is ok: Any command run on the remote that needs the info will flush the journal. This may partially solve this bug: http://git-annex.branchable.com/bugs/fails_to_handle_lot_of_files/ Although I still see unreaped git processes piling up when doing a copy --to.	2011-10-27 14:55:06 -04:00
Joey Hess	91366c896d	clean Annex stuff out of Utility/	2011-10-16 00:04:26 -04:00
Joey Hess	ee9af605bc	break out non-log stuff to separate module	2011-10-15 17:47:03 -04:00
Joey Hess	1a29b5b52e	reorganize log modules no code changes	2011-10-15 16:21:08 -04:00
Joey Hess	b505ba83e8	minor syntax changes	2011-10-11 14:43:45 -04:00
Joey Hess	025ded4a2d	tweaks	2011-10-10 17:37:44 -04:00
Joey Hess	f0153f9fd7	fix a race Another process may stage journalled files before the lock is taken, so need to get the list of journalled files afterwards. It's unfortunate this means getting the directory contents twice, but it seems better to do that than sometimes take the lock unnecessarily.	2011-10-09 16:19:09 -04:00
Joey Hess	dfee6e1ed6	better layout And a theoretical fix to branchstate cache invalidation, but not a bug that could actually happen.	2011-10-07 13:59:34 -04:00
Joey Hess	82e655efd0	performance fix It was checking if it needed to merge on every branch access, fix it to only check once.	2011-10-07 13:38:56 -04:00
Joey Hess	44fc358885	avoid merging multiple branches that point to the same tree avoids git warning "error: duplicate parent xxx ignored"	2011-10-07 13:37:01 -04:00
Joey Hess	3acdba3995	faster union merge of multiple branches into index only write index once	2011-10-07 13:36:48 -04:00
Joey Hess	6a6ea06cee	rename	2011-10-05 16:02:51 -04:00
Joey Hess	cfe21e85e7	rename	2011-10-04 00:59:08 -04:00
Joey Hess	ff21fd4a65	factor out Annex exception handling module	2011-10-04 00:34:04 -04:00
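
The equivalence claimed in commit e9bfa8eaed (merge then modify vs. modify then merge) can be illustrated with a small sketch. This is not git-annex's code: `union_merge`, `modify`, and the log line format below are hypothetical stand-ins, assuming only that a union merge keeps every distinct line from each version of a file.

```python
def union_merge(*versions):
    """Hypothetical union merge: keep every distinct line, in first-seen order."""
    seen = set()
    merged = []
    for lines in versions:
        for line in lines:
            if line not in seen:
                seen.add(line)
                merged.append(line)
    return merged


def modify(log, new_line):
    """Append a change to a log file's lines (illustrative only)."""
    return log + [new_line]


# Made-up log lines; the real location log format is not reproduced here.
local = ["1 uuid-a"]
remote = ["1 uuid-a", "1 uuid-b"]
change = "0 uuid-a"

# Sequence 1: pull and union merge first, then modify.
merge_first = modify(union_merge(local, remote), change)

# Sequence 2: modify first, then pull and union merge.
modify_first = union_merge(modify(local, change), remote)

# Same set of lines either way; only the line order may differ.
print(set(merge_first) == set(modify_first))
print(merge_first == modify_first)
```

In this sketch the two results hold the same lines in a different order, which matches the commit message's caveat about line order.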


781 commits