git-annex

Author	SHA1	Message	Date
Joey Hess	8cae4115a8	releasing version 3.20120227	2012-02-27 13:07:04 -04:00
Joey Hess	2fd294d06f	move --from, copy --from: 10 times faster scanning remote on local disk Rather than go through the location log to see which files are present on the remote, it simply looks at the disk contents directly. I benchmarked this speeding up scanning 834 files, from an annex on my phone's SSD, from 11.39 seconds to 1.31 seconds. (No files actually moved.) Also benchmarked 8139 files, from an annex on spinning storage, speeding up from 103.17 to 13.39 seconds. Note that benchmarking with an encrypted annex on flash actually showed a minor slowdown with this optimisation -- from 13.93 to 14.50 seconds. Seems the overhead of doing the crypto needed to get the filenames to directly check can be higher than the overhead of looking up data in the location log. (Which says good things about how well the location log and git have been optimised!) It may make sense to make encrypted local remotes not have hasKeyCheap set; further benchmarking is called for.	2012-02-26 14:59:48 -04:00
Joey Hess	b889581945	version dependency on openssh-client This is only to ensure that it's as new a version as it was built with, so partial upgrades work.	2012-02-25 19:31:46 -04:00
Joey Hess	12b89a3eb8	configure: Check if ssh connection caching is supported by the installed version of ssh and default annex.sshcaching accordingly.	2012-02-25 19:15:29 -04:00
Joey Hess	c3fbe07d7a	do a cleanup commit after moving data from or to a git remote Added Annex.cleanup, which is a general purpose interface for adding actions to run at the end. Remotes with the old git-annex-shell will commit every time, and have no commit command, so hide stderr when running the commit command.	2012-02-25 18:02:49 -04:00
Joey Hess	1f73db3469	improve alwayscommit=false mode Now changes are staged into the branch's index, but not committed, which avoids growing a large journal. And sync and merge always explicitly commit, ensuring that even when they do nothing else, they commit the staged changes. Added a flag file to indicate that the branch's journal contains uncommitted changes. (Could use git ls-files, but don't want to run that every time.) In the future, this ability to have uncommitted changes staged in the journal might be used on remotes after a series of oneshot commands.	2012-02-25 16:18:55 -04:00
Joey Hess	b49c0c2633	add annex.alwayscommit option To avoid commits of data to the git-annex branch after each command is run, set annex.alwayscommit=false. Its data will then be committed less frequently, when a merge or sync is done.	2012-02-25 15:31:42 -04:00
Joey Hess	df3a310b83	update copyright format url	2012-02-25 10:40:05 -04:00
Joey Hess	bd66f962d3	Deal with NFS problem that caused a failure to remove a directory when removing content from the annex. I was able to reproduce this on linux using the kernel's nfs server and mounting localhost:/. Determined that removing the directory fails when the just-deleted file in it was locked. Considered dropping the lock before removing the directory, but this would complicate parts of the code that should not need to worry about locking. So instead, ignore the failure to remove the directory in this case. While I was at it, made it attempt to remove both levels of hash directories, in case they're empty.	2012-02-24 16:30:47 -04:00
Joey Hess	5bf07b3b5c	Store web special remote url info in a more efficient location. storing it in remotes/web/xx/yy/foo.log meant lots of extra directory objects in git. Now I use xx/yy/foo.log.web, which is just as unique, but more efficient since foo.log is there anyway. Of course, it still looks in the old location too.	2012-02-17 23:15:29 -04:00
Joey Hess	db6b4cdfcf	rekey: New plumbing level command, can be used to change the keys used for files en masse.	2012-02-16 16:36:35 -04:00
Joey Hess	aeaaa0ff87	reorder	2012-02-16 15:07:59 -04:00
Joey Hess	39c3f56b33	addurl: Add --pathdepth option.	2012-02-16 12:25:19 -04:00
Joey Hess	4d8afc1713	tweak wording	2012-02-15 19:43:15 -04:00
Joey Hess	63152428e9	changelog	2012-02-15 17:33:21 -04:00
Joey Hess	52c5b164d8	Added an annex.queuesize setting useful when adding hundreds of thousands of files on a system with plenty of memory. git add gets quite slow in such a large repository, so if the system has more than the ~32 mb of memory the queue can use by default, it's a useful optimisation to increase the queue size, in order to decrease the number of times git add is run.	2012-02-15 11:14:19 -04:00
Joey Hess	7ebd98d8d8	fix memory leak when staging the journal The list of files had to be retained until the end so it could be deleted. Also, a list of update-index lines was generated and only then fed into it. Now everything streams in constant space.	2012-02-14 14:37:59 -04:00
Joey Hess	a40ec5e03e	Fixed a memory leak due to excessive strictness when committing journal files. When hashing the files, the entire list of shas was read strictly. That was entirely unnecessary, since there's a cleanup action run after they're consumed.	2012-02-14 11:20:34 -04:00
Joey Hess	cb631ce518	whereis: Prints the urls of files that the web special remote knows about.	2012-02-14 03:49:48 -04:00
Joey Hess	59b2adea4f	changelog for `a964012fc3` Turns out that commit really made some serious improvements to memory use. With the lazy state monad, git-annex add in a huge tree grew seemingly without bound until it overflowed the stack. With the strict monad, it uses 42 mb max. It's possible another change since the 3.20120123 release fixed that, but `a964012fc3` seems most likely.	2012-02-13 16:58:58 -04:00
Joey Hess	17fed709c8	addurl --fast: Verifies that the url can be downloaded (only getting its head), and records the size in the key.	2012-02-10 19:23:46 -04:00
Joey Hess	9030f68452	When checking that an url has a key, verify that the Content-Length, if available, matches the size of the key. If there's no Content-Length, or the key has no size, this check is not done, but it should happen most of the time, and protect against web content that has changed.	2012-02-10 19:23:41 -04:00
Joey Hess	d55f3c0716	Fix teardown of stale cached ssh connections.	2012-02-09 21:49:46 -04:00
Joey Hess	1c0bd81ba6	addurl: Normalize badly encoded urls.	2012-02-09 14:19:58 -04:00
Joey Hess	ef013506cb	addurl: Added a --file option Can be used to specify what file the url is added to. This can be used to override the default filename that is used when adding an url, which is based on the url. Or, when the file already exists, the url is recorded as another location of the file.	2012-02-08 15:35:29 -04:00
Joey Hess	57a747d081	S3: Fix irrefutable pattern failure when accessing encrypted S3 credentials.	2012-02-08 11:41:15 -04:00
Joey Hess	995bf51e10	correction	2012-02-07 16:52:39 -04:00
Joey Hess	3f4f96228e	changelog	2012-02-06 20:42:49 -04:00
Joey Hess	91fc975964	note 7.4 needed	2012-02-04 14:51:52 -04:00
Joey Hess	ed64bd8a4b	remove; unused	2012-01-30 13:20:36 -04:00
Joey Hess	b81d662cbf	Avoid repeated location log commits when a remote is receiving files. Done by adding a oneshot mode, in which location log changes are written to the journal, but not committed. Taking advantage of git-annex's existing ability to recover in this situation. This is used by git-annex-shell and other places where changes are made to a remote's location log.	2012-01-28 15:41:52 -04:00
Joey Hess	ce5637498f	remove Utility.Conditional and use IfElse This drops the >>! and >>? with the nice low fixity. IfElse does have undocumented >>=>>! and >>=>>? operators, but I deem that too fishy. Anyway, using whenM and unlessM is easier; I sometimes mixed the operators up.	2012-01-24 16:22:07 -04:00
Joey Hess	20d0288802	releasing version 3.20120123	2012-01-23 15:09:50 -04:00
Joey Hess	47250a153a	ssh connection caching Ssh connection caching is now enabled automatically by git-annex. Only one ssh connection is made to each host per git-annex run, which can speed some things up a lot, as well as avoiding repeated password prompts. Concurrent git-annex processes also share ssh connections. Cached ssh connections are shut down when git-annex exits. Note: The rsync special remote does not yet participate in the ssh connection caching.	2012-01-20 17:14:56 -04:00
Joey Hess	61dbad505d	fsck --from remote --fast Avoids expensive file transfers, at the expense of checking file size and/or contents. Required some reworking of the remote code.	2012-01-20 13:23:11 -04:00
Joey Hess	711c154561	update NEWS Add news item recommending fscking directory special remotes. Remove news item about URL backend being removed; it was later added back to be used by git annex addurl --fast. Link NEWS into top level.	2012-01-19 15:27:39 -04:00
Joey Hess	90319afa41	fsck --from Fscking a remote is now supported. It's done by retrieving the contents of the specified files from the remote, and checking them, so can be an expensive operation. (Several optimisations are possible, to speed it up, of course. This is the slow and stupid remote fsck to start with.) Still, if the remote is a special remote, or a git repository that you cannot run fsck in locally, it's nice to have the ability to fsck it. If you have any directory special remotes, now would be a good time to fsck them, in case you were hit by the data loss bug fixed in the previous release!	2012-01-19 15:24:05 -04:00
Joey Hess	2837e8fef1	releasing version 3.20120116	2012-01-16 16:52:26 -04:00
Joey Hess	f161b5eb59	Fix data loss bug in directory special remote When moving a file to the remote failed, and partially transferred content was left behind in the directory, re-running the same move would think it succeeded and delete the local copy. I reproduced data loss when moving files to a partition that was almost full. Interrupting a transfer could have similar results. Easily fixed by using a temp file which is then moved atomically into place once the transfer completes. I've audited other calls to copyFileExternal, and other special remote file transfer code; everything else seems to use temp files correctly (rsync, git), or otherwise use atomic transfers (bup, S3).	2012-01-16 16:28:15 -04:00
Joey Hess	e3ea5fe938	debhelper v9 kills that ugly python message during build	2012-01-15 14:53:38 -04:00
Joey Hess	ce608303a3	releasing version 3.20120115	2012-01-15 14:02:32 -04:00
Joey Hess	37b5b1bf0d	Fix QuickCheck dependency in cabal file.	2012-01-15 13:53:51 -04:00
Joey Hess	81856c3175	add a configure check for StatFS This way, the build log will indicate whether StatFS can be relied on. I've tested all the failing architectures now, and on all of them, the StatFS code now returns Nothing, rather than Just nonsense. Also, if annex.diskreserve is set on a platform where StatFS is not working, git-annex will complain. Also, the Makefile was missing the sources target used when building with cabal.	2012-01-15 13:49:32 -04:00
Joey Hess	0eed604446	Add a sanity check for bad StatFS results. git-annex FTBFS on s390, mips, powerpc, sparc. That StatFS code is failing on all of them. At least on s390, the failure appears as: Just (FileSystemStats {fsStatBlockSize = 4096, fsStatBlockCount = 0, fsStatByteCount = 0, fsStatBytesFree = 0, fsStatBytesAvailable = 0, fsStatBytesUsed = 0}) While I don't understand why this is happening, or how to fix it, bandaid over it by checking for obviously bad values and returning Nothing. That disables disk free space checking, but at least git-annex will work. Upstream bug: http://code.google.com/p/xmobar/issues/detail?id=70	2012-01-14 17:17:20 -04:00
Joey Hess	b88ecbdc1b	Add libghc-testpack-dev to build depends on all arches.	2012-01-13 15:50:56 -04:00
Joey Hess	1ae780ee79	git-annex, git-union-merge: Support GIT_DIR and GIT_WORK_TREE. Note that GIT_WORK_TREE cannot influence GIT_DIR; that is necessary for git-fake-bare and vcsh type things to work.	2012-01-13 12:52:09 -04:00
Joey Hess	0d5c402210	Add annex-trustlevel configuration settings, which can be used to override the trust level of a remote. This overrides the trust.log, and is overridden by the command-line trust parameters. It would have been nicer to have Logs.Trust.trustMap just look up the configuration for all remotes, but a dependency loop prevented that (Remotes depends on Logs.Trust in several ways). So instead, look up the configuration when building remotes, storing it in the same forcetrust field used for the command-line trust parameters.	2012-01-09 23:31:44 -04:00
Joey Hess	7675b83efa	map: Fix display of remote repos A change to break local cycles made remote repos be dropped entirely.	2012-01-08 16:05:57 -04:00
Joey Hess	a35278430a	log: Add --gource mode, which generates output usable by gource. As part of this, I fixed up how log was getting the descriptions of remotes.	2012-01-07 18:18:09 -04:00
Joey Hess	3da28cad07	releasing version 3.20120106	2012-01-07 13:50:35 -04:00
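
The data loss fix in f161b5eb59 describes writing to a temp file that is moved atomically into place once the transfer completes, so that partially transferred content can never be mistaken for a finished copy. A minimal Python sketch of that general pattern is shown here. It is not git-annex's Haskell code; the function name store_atomically and its parameters are hypothetical, chosen only for illustration.

```python
import os
import tempfile


def store_atomically(src_path, dest_path, chunk_size=1 << 20):
    """Copy src_path to dest_path so dest_path never holds partial content.

    Hypothetical helper illustrating the temp-file-then-rename pattern;
    it does not reproduce any git-annex function.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # Create the temp file inside the destination directory so the final
    # rename stays on a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp, open(src_path, "rb") as src:
            while True:
                chunk = src.read(chunk_size)
                if not chunk:
                    break
                tmp.write(chunk)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace renames over dest_path in one step.
        os.replace(tmp_path, dest_path)
    except BaseException:
        # A failed or interrupted transfer leaves nothing at dest_path,
        # so a later retry cannot mistake a partial file for success.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With this shape, a transfer that fails partway (for example on a nearly full partition) removes its temp file and leaves no file under the final name, so re-running the same move starts over rather than assuming it already succeeded.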

541 commits