git-annex

Author	SHA1	Message	Date
Joey Hess	3436aba6de	Directory special remotes now support chunking files written to them. Avoiding writing files larger than a specified size is useful on certain things. For example, box.com has a file size limit of 100 mb. Could also be useful on really crappy removable media.	2012-03-03 18:05:55 -04:00
Joey Hess	1098bc37ab	"here" can be used to refer to the current repository, which can read better than the old "." (which still works too).	2012-03-01 22:35:10 -04:00
Joey Hess	6571831b92	releasing version 3.20120229	2012-02-29 02:39:44 -04:00
Joey Hess	e5fee3f352	Fix test suite to not require a unicode locale. Without a unicode locale, it will fail to print a unicode filename to console, and fails.	2012-02-29 02:32:05 -04:00
Joey Hess	8cae4115a8	releasing version 3.20120227	2012-02-27 13:07:04 -04:00
Joey Hess	2fd294d06f	move --from, copy --from: 10 times faster scanning remote on local disk Rather than go through the location log to see which files are present on the remote, it simply looks at the disk contents directly. I benchmarked this speeding up scanning 834 files, from an annex on my phone's SSD, from 11.39 seconds to 1.31 seconds. (No files actually moved.) Also benchmarked 8139 files, from an annex on spinning storage, speeding up from 103.17 to 13.39 seconds. Note that benchmarking with an encrypted annex on flash actually showed a minor slowdown with this optimisation -- from 13.93 to 14.50 seconds. Seems the overhead of doing the crypto needed to get the filenames to directly check can be higher than the overhead of looking up data in the location log. (Which says good things about how well the location log and git have been optimised!) It may make sense to make encrypted local remotes not have hasKeyCheap set; further benchmarking is called for.	2012-02-26 14:59:48 -04:00
Joey Hess	b889581945	version dependency on openssh-client This is only to ensure that it's as new a version as it was built with, so partial upgrades work.	2012-02-25 19:31:46 -04:00
Joey Hess	12b89a3eb8	configure: Check if ssh connection caching is supported by the installed version of ssh and default annex.sshcaching accordingly.	2012-02-25 19:15:29 -04:00
Joey Hess	c3fbe07d7a	do a cleanup commit after moving data from or to a git remote Added Annex.cleanup, which is a general purpose interface for adding actions to run at the end. Remotes with the old git-annex-shell will commit every time, and have no commit command, so hide stderr when running the commit command.	2012-02-25 18:02:49 -04:00
Joey Hess	1f73db3469	improve alwayscommit=false mode Now changes are staged into the branch's index, but not committed, which avoids growing a large journal. And sync and merge always explicitly commit, ensuring that even when they do nothing else, they commit the staged changes. Added a flag file to indicate that the branch's journal contains uncommitted changes. (Could use git ls-files, but don't want to run that every time.) In the future, this ability to have uncommitted changes staged in the journal might be used on remotes after a series of oneshot commands.	2012-02-25 16:18:55 -04:00
Joey Hess	b49c0c2633	add annex.alwayscommit option To avoid commits of data to the git-annex branch after each command is run, set annex.alwayscommit=false. Its data will then be committed less frequently, when a merge or sync is done.	2012-02-25 15:31:42 -04:00
Joey Hess	df3a310b83	update copyright format url	2012-02-25 10:40:05 -04:00
Joey Hess	bd66f962d3	Deal with NFS problem that caused a failure to remove a directory when removing content from the annex. I was able to reproduce this on linux using the kernel's nfs server and mounting localhost:/. Determined that removing the directory fails when the just-deleted file in it was locked. Considered dropping the lock before removing the directory, but this would complicate parts of the code that should not need to worry about locking. So instead, ignore the failure to remove the directory in this case. While I was at it, made it attempt to remove both levels of hash directories, in case they're empty.	2012-02-24 16:30:47 -04:00
Joey Hess	5bf07b3b5c	Store web special remote url info in a more efficient location. storing it in remotes/web/xx/yy/foo.log meant lots of extra directory objects in git. Now I use xx/yy/foo.log.web, which is just as unique, but more efficient since foo.log is there anyway. Of course, it still looks in the old location too.	2012-02-17 23:15:29 -04:00
Joey Hess	db6b4cdfcf	rekey: New plumbing level command, can be used to change the keys used for files en masse.	2012-02-16 16:36:35 -04:00
Joey Hess	aeaaa0ff87	reorder	2012-02-16 15:07:59 -04:00
Joey Hess	39c3f56b33	addurl: Add --pathdepth option.	2012-02-16 12:25:19 -04:00
Joey Hess	4d8afc1713	tweak wording	2012-02-15 19:43:15 -04:00
Joey Hess	63152428e9	changelog	2012-02-15 17:33:21 -04:00
Joey Hess	52c5b164d8	Added an annex.queuesize setting, useful when adding hundreds of thousands of files on a system with plenty of memory. git add gets quite slow in such a large repository, so if the system has more than the ~32 mb of memory the queue can use by default, it's a useful optimisation to increase the queue size, in order to decrease the number of times git add is run.	2012-02-15 11:14:19 -04:00
Joey Hess	7ebd98d8d8	fix memory leak when staging the journal The list of files had to be retained until the end so it could be deleted. Also, a list of update-index lines was generated and only then fed into it. Now everything streams in constant space.	2012-02-14 14:37:59 -04:00
Joey Hess	a40ec5e03e	Fixed a memory leak due to excessive strictness when committing journal files. When hashing the files, the entire list of shas was read strictly. That was entirely unnecessary, since there's a cleanup action run after they're consumed.	2012-02-14 11:20:34 -04:00
Joey Hess	cb631ce518	whereis: Prints the urls of files that the web special remote knows about.	2012-02-14 03:49:48 -04:00
Joey Hess	59b2adea4f	changelog for `a964012fc3` Turns out that commit really made some serious improvements to memory use. With the lazy state monad, git-annex add in a huge tree grew seemingly without bound until it overflowed the stack. With the strict monad, it uses 42 mb max. It's possible another change since the 3.20120123 release fixed that, but `a964012fc3` seems most likely.	2012-02-13 16:58:58 -04:00
Joey Hess	17fed709c8	addurl --fast: Verifies that the url can be downloaded (only getting its head), and records the size in the key.	2012-02-10 19:23:46 -04:00
Joey Hess	9030f68452	When checking that an url has a key, verify that the Content-Length, if available, matches the size of the key. If there's no Content-Length, or the key has no size, this check is not done, but it should happen most of the time, and protect against web content that has changed.	2012-02-10 19:23:41 -04:00
Joey Hess	d55f3c0716	Fix teardown of stale cached ssh connections.	2012-02-09 21:49:46 -04:00
Joey Hess	1c0bd81ba6	addurl: Normalize badly encoded urls.	2012-02-09 14:19:58 -04:00
Joey Hess	ef013506cb	addurl: Added a --file option Can be used to specify what file the url is added to. This can be used to override the default filename that is used when adding an url, which is based on the url. Or, when the file already exists, the url is recorded as another location of the file.	2012-02-08 15:35:29 -04:00
Joey Hess	57a747d081	S3: Fix irrefutable pattern failure when accessing encrypted S3 credentials.	2012-02-08 11:41:15 -04:00
Joey Hess	995bf51e10	correction	2012-02-07 16:52:39 -04:00
Joey Hess	3f4f96228e	changelog	2012-02-06 20:42:49 -04:00
Joey Hess	91fc975964	note 7.4 needed	2012-02-04 14:51:52 -04:00
Joey Hess	ed64bd8a4b	remove; unused	2012-01-30 13:20:36 -04:00
Joey Hess	b81d662cbf	Avoid repeated location log commits when a remote is receiving files. Done by adding a oneshot mode, in which location log changes are written to the journal, but not committed. Taking advantage of git-annex's existing ability to recover in this situation. This is used by git-annex-shell and other places where changes are made to a remote's location log.	2012-01-28 15:41:52 -04:00
Joey Hess	ce5637498f	remove Utility.Conditional and use IfElse This drops the >>! and >>? with the nice low fixity. IfElse does have undocumented >>=>>! and >>=>>? operators, but I deem that too fishy. Anyway, using whenM and unlessM is easier; I sometimes mixed the operators up.	2012-01-24 16:22:07 -04:00
Joey Hess	20d0288802	releasing version 3.20120123	2012-01-23 15:09:50 -04:00
Joey Hess	47250a153a	ssh connection caching Ssh connection caching is now enabled automatically by git-annex. Only one ssh connection is made to each host per git-annex run, which can speed some things up a lot, as well as avoiding repeated password prompts. Concurrent git-annex processes also share ssh connections. Cached ssh connections are shut down when git-annex exits. Note: The rsync special remote does not yet participate in the ssh connection caching.	2012-01-20 17:14:56 -04:00
Joey Hess	61dbad505d	fsck --from remote --fast Avoids expensive file transfers, at the expense of checking file size and/or contents. Required some reworking of the remote code.	2012-01-20 13:23:11 -04:00
Joey Hess	711c154561	update NEWS: Add news item recommending fscking directory special remotes. Remove news item about URL backend being removed; it was later added back to be used by git annex addurl --fast. Link NEWS into top level.	2012-01-19 15:27:39 -04:00
Joey Hess	90319afa41	fsck --from: Fscking a remote is now supported. It's done by retrieving the contents of the specified files from the remote, and checking them, so can be an expensive operation. (Several optimisations are possible, to speed it up, of course. This is the slow and stupid remote fsck to start with.) Still, if the remote is a special remote, or a git repository that you cannot run fsck in locally, it's nice to have the ability to fsck it. If you have any directory special remotes, now would be a good time to fsck them, in case you were hit by the data loss bug fixed in the previous release!	2012-01-19 15:24:05 -04:00
Joey Hess	2837e8fef1	releasing version 3.20120116	2012-01-16 16:52:26 -04:00
Joey Hess	f161b5eb59	Fix data loss bug in directory special remote When moving a file to the remote failed, and partially transferred content was left behind in the directory, re-running the same move would think it succeeded and delete the local copy. I reproduced data loss when moving files to a partition that was almost full. Interrupting a transfer could have similar results. Easily fixed by using a temp file which is then moved atomically into place once the transfer completes. I've audited other calls to copyFileExternal, and other special remote file transfer code; everything else seems to use temp files correctly (rsync, git), or otherwise use atomic transfers (bup, S3).	2012-01-16 16:28:15 -04:00
Joey Hess	e3ea5fe938	debhelper v9 kills that ugly python message during build	2012-01-15 14:53:38 -04:00
Joey Hess	ce608303a3	releasing version 3.20120115	2012-01-15 14:02:32 -04:00
Joey Hess	37b5b1bf0d	Fix QuickCheck dependency in cabal file.	2012-01-15 13:53:51 -04:00
Joey Hess	81856c3175	add a configure check for StatFS This way, the build log will indicate whether StatFS can be relied on. I've tested all the failing architectures now, and on all of them, the StatFS code now returns Nothing, rather than Just nonsense. Also, if annex.diskreserve is set on a platform where StatFS is not working, git-annex will complain. Also, the Makefile was missing the sources target used when building with cabal.	2012-01-15 13:49:32 -04:00
Joey Hess	0eed604446	Add a sanity check for bad StatFS results. git-annex FTBFS on s390, mips, powerpc, sparc. That StatFS code is failing on all of them. At least on s390, the failure appears as: Just (FileSystemStats {fsStatBlockSize = 4096, fsStatBlockCount = 0, fsStatByteCount = 0, fsStatBytesFree = 0, fsStatBytesAvailable = 0, fsStatBytesUsed = 0}) While I don't understand why this is happening, or how to fix it, bandaid over it by checking for obviously bad values and returning Nothing. That disables disk free space checking, but at least git-annex will work. Upstream bug: http://code.google.com/p/xmobar/issues/detail?id=70	2012-01-14 17:17:20 -04:00
Joey Hess	b88ecbdc1b	Add libghc-testpack-dev to build depends on all arches.	2012-01-13 15:50:56 -04:00
Joey Hess	1ae780ee79	git-annex, git-union-merge: Support GIT_DIR and GIT_WORK_TREE. Note that GIT_WORK_TREE cannot influence GIT_DIR; that is necessary for git-fake-bare and vcsh type things to work.	2012-01-13 12:52:09 -04:00
Joey Hess	0d5c402210	Add annex-trustlevel configuration settings, which can be used to override the trust level of a remote. This overrides the trust.log, and is overridden by the command-line trust parameters. It would have been nicer to have Logs.Trust.trustMap just look up the configuration for all remotes, but a dependency loop prevented that (Remotes depends on Logs.Trust in several ways). So instead, look up the configuration when building remotes, storing it in the same forcetrust field used for the command-line trust parameters.	2012-01-09 23:31:44 -04:00
Joey Hess	7675b83efa	map: Fix display of remote repos A change to break local cycles made remote repos be dropped entirely.	2012-01-08 16:05:57 -04:00
Joey Hess	a35278430a	log: Add --gource mode, which generates output usable by gource. As part of this, I fixed up how log was getting the descriptions of remotes.	2012-01-07 18:18:09 -04:00
Joey Hess	3da28cad07	releasing version 3.20120106	2012-01-07 13:50:35 -04:00
Joey Hess	60c1aeeb6f	Fix overbroad gpg --no-tty fix from last release. Only set --no-tty when GPG_AGENT_INFO is set and batch mode is used. In the test suite, set GPG_AGENT_INFO to /dev/null to avoid the test suite relying on /dev/tty.	2012-01-07 12:38:08 -04:00
Joey Hess	b59759e33c	typo	2012-01-06 17:52:16 -04:00
Joey Hess	a3a9f87047	log: New command that displays the location log for files, showing each repository they were added to and removed from. This needs to run git log on the location log files to get at all past versions of the file, which tends to be a bit slow. It would be possible to make a version optimised for showing the location logs for every key. That would only need to run git log once, so would be faster, but it would need to process an enormous amount of data, so would not speed up the individual file case. In the future it would be nice to support log --format. log --json also doesn't work right yet.	2012-01-06 15:40:07 -04:00
Joey Hess	f534fcc7b1	remove S3stub stuff Let's keep that in a no-s3 branch, which can be merged into eg, debian-stable.	2012-01-05 23:14:10 -04:00
Joey Hess	c371c40a88	Don't list S3 as a remote type when built without S3 support.	2012-01-05 23:11:07 -04:00
Joey Hess	0b27e6baa0	Support unescaped repository urls, like git does. Turns out that git will accept a .git/config containing an url with eg, spaces in its name. Handle this by escaping the url if it's not valid. This also fixes support for urls containing escaped characters like %20 for space. Before, the path from the url was not unescaped properly.	2012-01-05 14:32:20 -04:00
Joey Hess	338d472ca2	releasing version 3.20120105	2012-01-05 13:51:13 -04:00
Joey Hess	769edd6b08	Run gpg with --no-tty. Closes: #654721	2012-01-05 13:44:09 -04:00
Joey Hess	a1aea174d7	fsck: Do backend-specific check before checking numcopies is satisfied. This way, when a checksum check fails and the content is moved aside, the numcopies check also warns if there are not enough copies.	2012-01-03 18:40:47 -04:00
Joey Hess	7e6a54f984	Added quickcheck to build dependencies, and fail if test suite cannot be built.	2012-01-03 14:52:20 -04:00
Joey Hess	34abd7bca8	no implicit dotfiles in add Dotfiles, and files inside dotdirs are not added by "git annex add" unless the dotfile or directory is explicitly listed. So "git annex add ." will add all untracked files in the current directory except for those in dotdirs. One reason for this is that it will make git-annex more usable with vcsh, where you don't want "vcsh big annex add" to check in all the dotfiles that are already versioned in other repositories. (If you're using vcsh for repos that contain non-dotfiles, this won't help, and you'll need to .gitignore such things, but this will cover the common case.) A more general reason why this seems like a good idea is the same reason ls ignores dotfiles, just the unix convention that they are cruft that is kept out of the way most of the time. All the other git-annex commands still do deal with any dotfiles that do get into the annex. This seemed right because if I've gone to the trouble to add a dotfile, I will want "git annex get ." to get it along with everything else.	2012-01-03 00:11:00 -04:00
Joey Hess	f0c4a1c770	annex.web-options also works	2012-01-02 14:22:50 -04:00
Joey Hess	aa0882691b	Added remote.name.annex-web-options configuration setting, which can be used to provide parameters to whichever of wget or curl git-annex uses (depends on which is available, but most of their important options suitable for use here are the same).	2012-01-02 14:20:20 -04:00
Joey Hess	9b12701b9e	releasing version 3.20111231	2011-12-31 15:07:45 -04:00
Joey Hess	e7d3e546c2	sync --fast: Selects some of the remotes with the lowest annex.cost and syncs those, in addition to any specified at the command line.	2011-12-30 21:17:36 -04:00
Joey Hess	dd8451f0f8	update	2011-12-30 20:40:59 -04:00
Joey Hess	8f4fdb3f97	Merge branch 'new-monad-control' Conflicts: debian/changelog	2011-12-30 20:08:01 -04:00
Joey Hess	5287d1dc3f	fixed behavior when multiple insteadOf configs are provided for the same url base Consider this git config --list case: url.git+ssh://git@example.com/.insteadOf=gl url.git+ssh://git@example.com/.insteadOf=shared Since config is stored in a Map, only the last of the values for this key was stored and available for use by the insteadOf code. But that is wrong; git allows either "gl" or "shared" to be used in an url and the insteadOf value to be substituted in. To support this, it seems best to keep the existing config map as-is, and add a second map that accumulates a list of multiple values for config keys. This new fullconfig map can be used in the rare places where multiple values for a key make sense, without needing to complicate everything else. Haskell's laziness and data sharing keep the overhead of adding this second map low.	2011-12-30 14:07:46 -04:00
Joey Hess	85f1f3a63a	Updated to build with monad-control 0.3.	2011-12-24 23:05:23 -04:00
Joey Hess	fdf02986cf	find --json	2011-12-23 01:08:19 -04:00
Joey Hess	06bafae9e0	Format strings can be specified using the new --find option, to control what is output by git annex find.	2011-12-22 18:31:44 -04:00
Joey Hess	5a275a3f5d	Can now be built with older git versions (before 1.7.7); the resulting binary should only be used with old git. Remove git old version check from configure, and use the git version it was built against in the git check-attr code.	2011-12-22 15:01:13 -04:00
Joey Hess	6bffe509d7	Add --include, which is the same as --not --exclude.	2011-12-22 14:00:17 -04:00
Joey Hess	20482712d0	Improve deletion of files from rsync special remotes. Closes: #652849 Rsync is only run once, with include / exclude rules used to specify exactly what to delete. This is faster, and avoids ugly error messages from rsync, and doesn't fail if the content already got deleted somehow.	2011-12-21 16:57:03 -04:00
Joey Hess	a76b13b848	test fsck in bare repos (75%)	2011-12-21 14:20:41 -04:00
Joey Hess	8cdcd78b21	test bup special remote (74% coverage)	2011-12-21 13:50:33 -04:00
Joey Hess	c61f3d7b7b	test coverage improvements	2011-12-21 12:46:14 -04:00
Joey Hess	82a145df91	test encrypted special remote This involved adding a test harness to run gpg with a dummy key, and lots of fun.	2011-12-20 23:24:06 -04:00
Joey Hess	cc88abd0ad	Test suite improvements. Current top-level test coverage: 68% Been higher before, but a lot of new code has been added.	2011-12-20 17:31:25 -04:00
Joey Hess	1c28237e0c	map: --fast disables use of dot to display map Generally useful, and allows the test suite to test it.	2011-12-20 16:42:35 -04:00
Joey Hess	da0bdc1a57	Fix the hook special remote, which bitrotted a while ago.	2011-12-20 12:23:49 -04:00
Joey Hess	09cd042775	Properly handle multiline git config values. A crash on parsing was fixed a while ago. This adds support for fully correctly parsing multiline git config values, using git config --null. Since git-annex-shell configlist uses normal git config output, I left in support for that too; the two forms of config output can be easily identified by the parser. Since configlist only prints the annex.uuid config, there's no risk of multiline values there, so no need to change it.	2011-12-15 12:48:27 -04:00
Joey Hess	6edaabd040	reinject: Add a sanity check for using an annexed file as the source file.	2011-12-12 13:43:52 -04:00
Joey Hess	acd7a52dfd	always find optimal merge Testing `b9ac585454`, it didn't find the optimal union merge, the second sha was the one to use, at least in the case I tried. Let's just try all shas to see if any can be reused. I stopped using the expensive nub, so despite the use of sets to sort/uniq file contents, this is probably as fast or faster than it was before.	2011-12-12 01:59:29 -04:00
Joey Hess	acb2d5a5a6	releasing version 3.20111211	2011-12-11 21:55:51 -04:00
Joey Hess	8680c415de	slow, stupid, and safe index updating Always merge the git-annex branch into .git/annex/index before making a commit from the index. This ensures that, when the branch has been changed in any way (by a push being received, or changes pulled directly into it, or even by the user checking it out, and committing a change), the index reflects those changes. This is much too slow; it needs to be optimised to only update the index when the branch has really changed, not every time. Also, there is an unhandled race, when a change is made to the branch right after the index gets updated. I left it in for now because it's unlikely and I didn't want to complicate things with additional locking yet.	2011-12-11 15:05:53 -04:00
Joey Hess	10e8028a42	Fix bug in last version in getting contents from bare repositories.	2011-12-10 18:45:55 -04:00
Joey Hess	c5267802f3	version dependency on old monad-control This should let cabal build it with the right version.	2011-12-10 12:56:02 -04:00
Joey Hess	fb8231f3a1	sync: New command that synchronises the local repository and default remote, by running git commit, pull, and push for you.	2011-12-09 20:27:22 -04:00
Joey Hess	14e9b87d44	unannex improvements: Added files don't have to be committed before they can be unannexed. unannex no longer commits existing staged changes. unannex of the last file in a directory now works; before, it failed because git rm deleted the directory out from under it.	2011-12-09 13:07:31 -04:00
Joey Hess	e3f1568e0f	Fix caching of decrypted ciphers, which failed when drop had to check multiple different encrypted special remotes.	2011-12-08 16:01:46 -04:00
Joey Hess	8047bba5b9	add: If interrupted, add can leave files converted to symlinks but not yet added to git. Running the add again will now clean up this situation.	2011-12-07 16:53:53 -04:00
Joey Hess	480495beb4	Prevent key names from containing newlines. There are several places where it's assumed a key can be written on one line. One is in the format of the .git/annex/unused files. The difficult one is that filenames derived from keys are fed into git cat-file --batch, which has a line based input. (And no -z option.) So, for now it's best to block such keys being created.	2011-12-06 13:03:09 -04:00
Joey Hess	b6c8a0119a	map: Fix a failure to detect a loop when both repositories are local and refer to each other with relative paths.	2011-12-04 12:23:10 -04:00
Joey Hess	ff5df842ea	releasing version 3.20111203	2011-12-03 21:13:21 -04:00
Joey Hess	251c01d51e	dead: A command which says that a repository is gone for good and you don't want git-annex to mention it again.	2011-12-02 16:59:55 -04:00
Joey Hess	fb68a7881f	convert rsync special backend to using both hash directory types	2011-12-02 15:50:27 -04:00
Joey Hess	97f809c006	wording	2011-12-02 14:18:55 -04:00
Joey Hess	998d8f7968	clarify	2011-11-28 23:23:14 -04:00
Joey Hess	f4bf444ae0	store content in hashDirLower directories in bare repositories When storing content in bare repositories, use the hashDirLower directories. Bare repositories can be on USB drives, which might use the FAT filesystem, and fall afoul of recent bugs in linux's handling of mixed case on FAT. Using hashDirLower avoids that.	2011-11-28 22:55:40 -04:00
Joey Hess	e32ab766b0	--inbackend can be used to make git-annex only operate on files whose content is stored using a specified key-value backend.	2011-11-28 17:45:47 -04:00
Joey Hess	6869e6023e	support .git/annex on a different disk than the rest of the repo. The only fully supported thing is to have the main repository on one disk, and .git/annex on another. Only commands that move data in/out of the annex will need to copy it across devices. There is only partial support for putting arbitrary subdirectories of .git/annex on different devices. For one thing, this can require more copies to be done. For example, when .git/annex/tmp is on one device, and .git/annex/journal on another, every journal write involves a call to mv(1). Also, there are a few places that make hard links between various subdirectories of .git/annex with createLink, that are not handled. In the common case without cross-device, the new moveFile is actually faster than renameFile, avoiding an unnecessary stat to check that a file (not a directory) is being moved. Of course if a cross-device move is needed, it is as slow as mv(1) of the data.	2011-11-28 16:17:55 -04:00
Joey Hess	2bf3addf49	Bugfix: dropunused did not drop keys with two spaces in their name.	2011-11-27 13:50:05 -04:00
Joey Hess	a72f0ecc27	changelog	2011-11-26 12:06:03 -04:00
Joey Hess	12243d2279	Flush json output, avoiding a buffering problem that could result in doubled output. The bug was that with --json, output lines were sometimes doubled. For example, git annex init --json would output two lines, despite only running one thing. Adding to the weirdness, this only occurred when the output was redirected to a pipe or a file. Strace showed two processes outputting the same buffered output. The second process was this writer process (only needed to work around bug #624389): _ <- forkProcess $ do { hPutStr toh $ unlines paths; hClose toh; exitSuccess } The doubled output occurs when this process exits, and ghc flushes the inherited stdout buffer. Why only when piping? I don't know, but ghc may be behaving differently when stdout is not a terminal. While this is quite possibly a ghc bug, there is a nice fix in git-annex. Explicitly flushing after each chunk of json is output works around the problem, and as a side effect, json is streamed rather than being output all at the end when performing an expensive operation. However, note that this means all uses of putStr in git-annex must be explicitly flushed. The others were, already.	2011-11-25 11:51:06 -04:00
Joey Hess	75a590bdd8	Put a workaround in the directory special remote for strange behavior with VFAT filesystems on Linux (mounted with shortname=mixed)	2011-11-22 18:21:28 -04:00
Joey Hess	322d9b1cc0	releasing version 3.20111122	2011-11-22 14:40:11 -04:00
Joey Hess	7f7ae7a3b1	find: Support --print0 It would be nice if command-specific options were supported. The first difficulty is that which command is being called is not known until after getopt; but that could be worked around by finding the first non-dashed parameter. Storing the settings without putting them in the annex monad is the next difficulty; it could perhaps be handled by making the seek stage pass applicable settings into the start stage (and from there on to perform as needed). But that still leaves a problem, what data type to use to represent the options between getopt and seek?	2011-11-22 14:06:31 -04:00
Joey Hess	d675f1c82e	status --json now shows most things Left out the backend usage graph for now, and bad/temp directory sizes are only displayed when present. Also, disk usage is returned as a string with units, which I can see changing later.	2011-11-20 14:12:48 -04:00
Joey Hess	c50a5fbeb4	status: Include all special remotes in the list of repositories. Special remotes do not always have a description listed in uuid.log, and such ones were not listed before.	2011-11-18 13:22:48 -04:00
Joey Hess	1326bb8635	Avoid excessive escaping for rsync special remotes that are not accessed over ssh. This is actually tricky, `45bbf210a1` added the escaping because it's needed for rsync that does go over ssh. So I had to detect whether the remote's rsync url will use ssh or not, and vary the escaping.	2011-11-18 12:53:48 -04:00
Joey Hess	c70b78d40a	migrate: Don't fall over a stale temp file.	2011-11-17 18:29:28 -04:00
Joey Hess	2bb6b02948	When not run in a git repository, git-annex can still display a usage message, and "git annex version" even works. Things that sound simple, but are made hard by the Annex monad being built with the assumption that there will always be a git repo.	2011-11-16 00:49:09 -04:00
Joey Hess	84784e2ca1	cleanup	2011-11-16 00:07:06 -04:00
Joey Hess	21a925dcf1	merge: Now runs in constant space. Before, a merge was first calculated, by running various actions that called git and built up a list of lines, which were at the end sent to git update-index. This necessarily used space proportional to the size of the diff between the trees being merged. Now, lines are streamed into git update-index from each of the actions in turn. Runtime size of git-annex merge when merging 50000 location log files drops from around 100 mb to a constant 4 mb. Presumably it runs quite a lot faster, too.	2011-11-15 23:28:01 -04:00
Joey Hess	7d05ca1d6d	Fix support for insteadOf url remapping. Closes: #644278	2011-11-15 14:06:38 -04:00
Joey Hess	bfe38f8ff1	status --json --fast for esc * status: Fix --json mode (only the repository lists are currently displayed) * status: --fast is back	2011-11-14 19:27:22 -04:00
Joey Hess	aa4fbbdd33	status: Now displays trusted, untrusted, and semitrusted repositories separately.	2011-11-14 16:14:17 -04:00
Joey Hess	04edae6791	Optimised union merging; now only runs git cat-file once.	2011-11-12 17:45:12 -04:00
Joey Hess	cea65b9e5b	init: When run in an already initialized repository, and without a description specified, don't delete the old description.	2011-11-12 15:42:52 -04:00
Joey Hess	e9bfa8eaed	avoid unnecessary auto-merge when only changing a file in the branch. Avoids doing auto-merging in commands that don't need fully current information from the git-annex branch. In particular, git annex add no longer needs to auto-merge. Affected commands: Anything that doesn't look up data from the branch, but does write a change to it. It might seem counterintuitive that we can change a value without first making sure we have the current value. This optimisation works because these two sequences are equivalent: 1. pull from remote; 2. union merge; 3. read file from branch; 4. modify file and write to branch; vs. 1. read file from branch; 2. modify file and write to branch; 3. pull from remote; 4. union merge. After either sequence, the git-annex branch contains the same logical content for the modified file. (Possibly with lines in a different order or additional old lines of course).	2011-11-12 15:15:57 -04:00
Joey Hess	897bf938f6	merge: Improve commit messages to mention what was merged.	2011-11-12 14:51:19 -04:00
Joey Hess	71b216d1fb	map: Support remotes with /~/ and /~user/ More accurately, it was supported already when map uses git-annex-shell, but not when it does not. Note that the user name cannot be shell escaped using git-annex's current approach for shell escaping. I tried and some shells like dash cannot cd ~'joey'. Rest of directory is still shell escaped, not for security but in case a directory has a space or other weird character.	2011-11-11 16:18:53 -04:00
Joey Hess	826d5887b2	Automatically fix up badly formatted uuid.log entries produced by 3.20111105, whenever the uuid.log is changed (ie, by init or describe).	2011-11-11 13:42:31 -04:00
Joey Hess	2de1e2c2ce	Optimized copy --from and get --from to avoid checking the location log for files that are already present. This can be a significant speedup when running in large trees that are only missing a few files; it makes copy --from just as fast as get.	2011-11-10 21:32:42 -04:00
Joey Hess	cf0174c922	content locking I've tested that this solves the cyclic drop problem. Have not looked at cyclic move, etc.	2011-11-09 21:54:42 -04:00
Joey Hess	faa4935047	Handle a case where an annexed file is moved into a gitignored directory, by having fix --force add its change.	2011-11-07 18:10:31 -04:00
Joey Hess	f8911cc69d	releasing version 3.20111107	2011-11-07 13:06:58 -04:00
Joey Hess	41eecb4601	Bugfix: In the past two releases, git-annex init has written the uuid.log in the wrong format, with the UUID and description flipped. This is my own damn fault for not making UUID a real type, and then relying on the type checker to ensure my refactoring was correct -- which it wasn't! I should probably add code to clean up bogus entries in the uuid.log, but right now I want to get the fix out there to prevent people experiencing this bug. I should also make UUID a real data type.	2011-11-07 12:47:41 -04:00
Joey Hess	aae0417d94	Don't try to read config from repos with annex-ignore set.	2011-11-07 11:50:30 -04:00
Joey Hess	c99fb58909	merge: Use fast-forward merges when possible. Thanks Valentin Haenel for a test case showing how non-fast-forward merges could result in an ongoing pull/merge/push cycle. While the git-annex branch is fast-forwarded, git-annex's index file is still updated using the union merge strategy as before. There's no other way to update the index that would be any faster. It is possible that a union merge and a fast-forward result in different file contents: Files should have the same lines, but a union merge may change their order. If this happens, the next commit made to the git-annex branch will have some unnecessary changes to line orders, but the consistency of data should be preserved. Note that when the journal contains changes, a fast-forward is never attempted, which is fine, because committing those changes would be vanishingly unlikely to leave the git-annex branch at a commit that already exists in one of the remotes. The real difficulty is handling the case where multiple remotes have all changed. git-annex does find the best (ie, newest) one and fast forwards to it. If the remotes are diverged, no fast-forward is done at all. It would be possible to pick one, fast forward to it, and make a merge commit to the rest, but I see no benefit to adding that complexity. Determining the best of N changed remotes requires N*2+1 calls to git-log, but these are fast git-log calls, and N is typically small. Also, typically some or all of the remote refs will be the same, and git-log is not called to compare those. In the real world I expect this will almost always add only 1 git-log call to the merge process. (Which already makes N anyway.)	2011-11-06 15:22:40 -04:00
Joey Hess	0556dc812e	releasing version 3.20111105	2011-11-05 15:55:19 -04:00
Joey Hess	0bb798e351	Pass -t to rsync to preserve timestamps.	2011-11-04 19:41:11 -04:00
Joey Hess	ef3457196a	use SHA256 by default To get old behavior, add a .gitattributes containing: * annex.backend=WORM I feel that SHA256 is a better default for most people, as long as their systems are fast enough that checksumming their files isn't a problem. git-annex should default to preserving the integrity of data as well as git does. Checksum backends also work better with editing files via unlock/lock. I considered just using SHA1, but since that hash is believed to be somewhat near to being broken, and git-annex deals with large files which would be a perfect exploit medium, I decided to go to a SHA-2 hash. SHA512 is annoyingly long when displayed, and git-annex displays it in a few places (and notably it is shown in ls -l), so I picked the shorter hash. Considered SHA224 as it's even shorter, but feel it's a bit weird. I expect git-annex will use SHA-3 at some point in the future, but probably not soon! Note that systems without a sha256sum (or sha256) program will fall back to defaulting to SHA1.	2011-11-04 15:51:01 -04:00
Joey Hess	1089e85d48	add changelog for bugfix	2011-11-04 15:51:01 -04:00
Joey Hess	eec137f33a	Record uuid when auto-initializing a remote so it shows in status.	2011-11-02 14:18:21 -04:00
Joey Hess	00988bcf36	fixed my build environment	2011-10-31 15:40:57 -04:00
Joey Hess	3d3e1c4c25	better command name	2011-10-31 15:18:41 -04:00
Joey Hess	380839299e	The fromkey command now takes the key as its first parameter. The --key option is no longer used.	2011-10-31 12:56:07 -04:00
Joey Hess	cc1ea8f844	Removed the setkey command, and added a setcontent command with a more useful interface.	2011-10-31 12:33:41 -04:00
Joey Hess	22e9f445ab	unused, dropunused: Now work in bare repositories. Turned out I had already done all the work needed to support this when unused started checking all branches.	2011-10-29 19:16:45 -04:00
Joey Hess	2566eb85fe	fsck: Now works in bare repositories. Checks location log information, and file contents. Does not check that numcopies is satisfied, as .gitattributes information about numcopies is not available in a bare repository. In practice, that should not be a problem, since fsck is also run in a checkout and will check numcopies there.	2011-10-29 18:03:28 -04:00
Joey Hess	ab738a403a	status: Now always shows the current repository, even when it does not appear in uuid.log.	2011-10-28 19:49:01 -04:00
Joey Hess	6c31e3a8c3	drop --from is now supported to remove file content from a remote.	2011-10-28 17:26:38 -04:00
Joey Hess	b955238ec7	Fail if --from or --to is passed to commands that do not support them.	2011-10-27 18:56:54 -04:00
Joey Hess	66194684ac	uninit: Add guard against being run with the git-annex branch checked out.	2011-10-27 15:47:11 -04:00
Joey Hess	83d11c03c4	wording	2011-10-27 15:24:58 -04:00
Joey Hess	f84d66fa15	reap in onLocal Each onLocal call involves a new Annex state, so needs to clean up after it.	2011-10-27 14:55:07 -04:00
Joey Hess	373cad993d	Sped up some operations on remotes that are on the same host. Specifically, disabled trying to update the git-annex branch on the remote, since that data is never used by operations that act on such remotes. Also, when copying content to such a remote, skip committing the presence information changes to its git-annex branch. Leaving it in the journal there is ok: Any command run on the remote that needs the info will flush the journal. This may partially solve this bug: http://git-annex.branchable.com/bugs/fails_to_handle_lot_of_files/ Although I still see unreaped git processes piling up when doing a copy --to.	2011-10-27 14:55:06 -04:00
Joey Hess	270c1af087	releasing version 3.20111025	2011-10-25 13:46:01 -07:00
Joey Hess	e2853b3fec	update	2011-10-25 11:39:15 -07:00
Joey Hess	52c8244219	git-annex-shell: GIT_ANNEX_SHELL_READONLY and GIT_ANNEX_SHELL_LIMITED environment variables can be set to limit what commands can be run. This could be used by eg, gitolite.	2011-10-15 19:06:35 -04:00
Joey Hess	ec169f84b1	migrate: Copy url logs for keys when migrating.	2011-10-15 16:36:56 -04:00
Joey Hess	9fa9214106	A remote can have an annexUrl configured, that is used by git-annex instead of its usual url. (Similar to pushUrl.)	2011-10-14 18:18:28 -04:00
Joey Hess	205a5b2aaa	typo	2011-10-12 00:29:49 -04:00
Joey Hess	11b154e811	prep release	2011-10-11 23:03:19 -04:00
Joey Hess	402d9c7c5f	oops	2011-10-11 22:54:38 -04:00
Joey Hess	9c04d1e523	fix git 1.7.7 breakage * This version of git-annex only works with git 1.7.7 and newer. The breakage with old versions is subtle, and affects annex.numcopies .gitattributes settings, so be sure to upgrade git to 1.7.7. (Debian package now depends on that version.) * Don't pass absolute paths to git show-attr, as it started following symlinks when that's done in 1.7.7. Instead, use relative paths, which show-attr only handles 100% correctly in 1.7.7. Closes: #645046 Unfortunately I can find no way to work with the old and new gits, as the old had bugs that require absolute paths, while the new doesn't like them at all. And the behavior of git show-attr in 1.7.7 is the same as eg, git add of an absolute path to a symlink, so seems entirely intentional and not likely to change.	2011-10-11 22:53:32 -04:00
Joey Hess	10edaf6dc9	reorder	2011-10-10 16:03:32 -04:00
Joey Hess	81ed7b203d	Now supports git's insteadOf configuration, to modify the url used to access a remote. Note that pushInsteadOf is not used; that and pushurl are reserved for actual git pushes. Closes: #644278	2011-10-09 14:58:32 -04:00
Joey Hess	5414bbce58	git-annex-shell uuid verification * git-annex now asks git-annex-shell to verify that it's operating in the expected repository. * Note that this git-annex will not interoperate with remotes using older versions of git-annex-shell. The reason for this check is to avoid git-annex getting confused about what remote repository actually contains a value. It's a prerequisite for supporting git insteadOf aliases.	2011-10-06 19:24:11 -04:00
Joey Hess	f011033869	add timestamps to remote.log	2011-10-06 16:07:58 -04:00
Joey Hess	f929d0229c	Add timestamps to trust.log.	2011-10-06 15:55:50 -04:00
Joey Hess	3e0d2a0803	add timestamp to uuid.log * New or changed repository descriptions in uuid.log now have a timestamp, which is used to ensure the newest description is used when the uuid.log has been merged. * Note that older versions of git-annex will display the timestamp as part of the repository description, which is ugly but otherwise harmless.	2011-10-06 15:31:25 -04:00
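The "newest description wins" rule from 3e0d2a0803 can be sketched as follows. The line layout used here (`<uuid> <description> timestamp=<seconds>s`) is an assumption made for illustration, and `parse_line` and `newest_descriptions` are hypothetical names, not git-annex functions. Lines without a timestamp are treated as oldest.

```python
def parse_line(line):
    """Split an assumed '<uuid> <description> timestamp=<secs>s' line."""
    words = line.split()
    uuid = words[0]
    timestamp = 0.0  # entries with no timestamp sort as oldest
    if len(words) > 1 and words[-1].startswith("timestamp="):
        timestamp = float(words[-1][len("timestamp="):].rstrip("s"))
        words = words[:-1]
    return uuid, " ".join(words[1:]), timestamp


def newest_descriptions(lines):
    """For each uuid, keep the description with the largest timestamp."""
    best = {}
    for line in lines:
        uuid, description, timestamp = parse_line(line)
        if uuid not in best or timestamp >= best[uuid][1]:
            best[uuid] = (description, timestamp)
    return {uuid: description for uuid, (description, _) in best.items()}


# A merged log may hold several lines for the same repository.
merged_log = [
    "u1 old laptop timestamp=100.0s",
    "u1 new laptop timestamp=200.0s",
    "u2 server",
]
print(newest_descriptions(merged_log))
```

Because the choice depends only on the timestamps, the result does not change if a union merge reorders the lines.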
Joey Hess	d357556141	Add locking to avoid races when changing the git-annex branch.	2011-10-03 16:32:36 -04:00
Joey Hess	49f21dd9ba	Contain the zombie hordes. Specifically, when using gpg, a zombie is forked for each file, so waiting until shutdown to reap won't do.	2011-10-02 11:16:34 -04:00
Joey Hess	29032cb70e	When displaying a list of repositories, show git remote names in addition to their descriptions.	2011-09-30 15:02:29 -04:00
Joey Hess	828f3f1b0c	status: List all known repositories.	2011-09-30 03:20:24 -04:00
Joey Hess	a7e7dda55a	Fix referring to remotes by uuid. I think that I broke this in some fairly recent refactoring.	2011-09-30 02:23:24 -04:00
Joey Hess	7ff89ccfee	convert all git read/write functions to use ByteStrings This yields a second or so speedup in unused, find, etc. Seems that even when the ByteString is immediately split and then converted to Strings, it's faster. I may try to push ByteStrings out into more of git-annex gradually, although I suspect most of the time-critical parts are already covered now, and many of the rest rely on libraries that only support Strings.	2011-09-29 23:48:57 -04:00
Joey Hess	a91c8a15d5	Sped up unused. Added Git.ByteString which replaces Git IO methods with ones using lazy ByteStrings. This can be more efficient when large quantities of data are being read from git. In Git.LsTree, parse git ls-tree output more efficiently, thanks to ByteString. This benchmarks 25% faster, in a benchmark that includes (probably predominately) the run time for git ls-tree itself. In real world numbers, this makes git annex unused 2 seconds faster for each branch it needs to check, in my usual large repo.	2011-09-29 19:04:24 -04:00
Joey Hess	7dddb803a0	releasing version 3.20110928	2011-09-28 19:17:12 -04:00
Joey Hess	d75da353b9	documentation/warning message update for future feature	2011-09-23 18:04:38 -04:00
Joey Hess	9f5c7a246b	status: Massively sped up; remove --fast mode. Using Sets is the right thing; they have constant size lookup like my SizeList, and logn insertion, which beats nub to death. Runs faster than --fast mode did before, and gives accurate counts. 13 seconds total runtime with a warm cache in a repository with 40 thousand keys.	2011-09-20 18:57:05 -04:00
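The difference between nub-style deduplication and a set, as mentioned in 9f5c7a246b, can be sketched roughly like this. The sketch is in Python rather than Haskell, so it uses the built-in hash-based `set` in place of a balanced-tree set; the key names are made up. The underlying point is the same: a list-scanning dedup is quadratic, while a set is not.

```python
def nub(items):
    """Quadratic dedup: each membership test scans the list built so far."""
    seen = []
    for item in items:
        if item not in seen:  # linear scan of everything kept so far
            seen.append(item)
    return seen


def count_distinct(items):
    """Dedup with a set: membership tests do not scan the whole collection."""
    return len(set(items))


# Hypothetical key names, with every key listed twice.
keys = ["KEY-%d" % (n % 500) for n in range(1000)]

print(len(nub(keys)), count_distinct(keys))
```

Both functions give the same count; only the cost of getting there differs as the number of keys grows.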
Joey Hess	cabbefd9d2	status: In --fast mode, all status info is displayed now; but some of it is only approximate, and is marked as such.	2011-09-20 18:13:08 -04:00
Joey Hess	a4aef6f115	clarify wording	2011-09-19 01:54:20 -04:00
Joey Hess	33cd1ffbfe	make find show files meeting limits, even when not present find: Rather than only showing files whose contents are present, when used with --exclude --copies or --in, displays all files that match the specified conditions. Note that this is a behavior change for find --exclude! Old behavior can be gotten with find --in . --exclude=...	2011-09-18 20:42:15 -04:00
Joey Hess	9da23dff78	--copies=N can be used to make git-annex only operate on files with the specified number of copies. (And --not --copies=N for the inverse.)	2011-09-18 20:23:08 -04:00
Joey Hess	1fc3ee2423	add --in limit	2011-09-18 20:14:18 -04:00
Joey Hess	3e73de4054	releasing version 3.20110915	2011-09-17 09:21:09 -04:00
Joey Hess	d036cd590f	bugfix: drop and fsck did not honor --exclude	2011-09-15 15:44:32 -04:00
Joey Hess	a0d3a343b5	copy --auto Only does copy when numcopies is not yet satisfied.	2011-09-15 15:28:58 -04:00
Joey Hess	984c9fc052	remove optimize subcommand; use --auto instead get, drop: Added --auto option, which decides whether to get/drop content as needed to work toward the configured numcopies. The problem with bundling it up in optimize was that I then found I wanted to run an optimize that did not drop files, only got them. Considered adding a --only-get switch to it, but that seemed wrong. Instead, let's make existing subcommands optionally smarter. Note that the only actual difference between drop and drop --auto is that the latter does not even try to drop a file if it knows of not enough copies, and does not print any error messages about files it was unable to drop. It might be nice to make get avoid asking git for attributes when not in auto mode. For now it always asks for attributes.	2011-09-15 13:30:04 -04:00
Joey Hess	949b3f69d0	optimize: A new subcommand that either gets or drops file content as needed to work toward meeting the configured numcopies setting. This is currently rather simplistic, though still useful. In the future, it could become smarter about what content is stored where, etc.	2011-09-14 13:47:22 -04:00
Joey Hess	03d6209e1c	addurl: Always use whole url as destination filename, rather than only its file component. First, this ensures that git annex addurl, when run repeatedly with the same url, doesn't create duplicate files, which it did before when it fell back to the longer filename. Secondly, the file part of an url is frequently not very descriptive on its own. The uri scheme, auth, and port is intentionally left out, as clutter.	2011-09-07 19:04:51 -04:00
Joey Hess	72b54d6170	Fix build without S3.	2011-09-07 10:21:19 -04:00
Joey Hess	6f98fd5391	whereis: Show untrusted locations separately and do not include in location count.	2011-09-06 16:59:53 -04:00
Joey Hess	6fd0df7c2f	releasing version 3.20110906	2011-09-06 15:54:21 -04:00
Joey Hess	ebb92221fd	Fix Makefile to work with cabal again.	2011-09-06 15:35:13 -04:00
Joey Hess	07125dca53	Improve display of newlines around error and warning messages.	2011-09-06 13:46:08 -04:00
Joey Hess	d238bbd9d9	releasing version 3.20110902	2011-09-02 21:32:05 -04:00
Joey Hess	2f4d4d1c45	basic json support This includes a generic JSONStream library built on top of Text.JSON (somewhat hackishly). It would be possible to stream out a single json document describing all actions, but it's probably better for consumers if they can expect one json document per line, so I did it that way instead. Output from external programs used for transferring files is not currently hidden when outputting json, which probably makes it not very useful there. This may be dealt with if there is demand for json output for --get or --move to be parsable. The version, status, and find subcommands have hand-crafted output and don't do json. The whereis subcommand needs to be modified to produce useful json.	2011-09-01 15:22:06 -04:00
Joey Hess	f600444ab6	unused --remote: Reduced memory use to 1/4th what was used before. Using a single strictness annotation, in just the right place. Tried several others, none of which helped and some of which potentially hurt. This is only the second time I've really had to deal with this in a year of using haskell, which is, I suppose not that bad.	2011-08-31 19:13:02 -04:00
Joey Hess	ea7b1828d4	unused, status: Sped up by avoiding unnecessary stats of annexed files. Statting files returned by dirContents to see if they exist and are regular files seems pretty useless. This code was originally part of fsck, and perhaps the idea then was to avoid things returned by dirContents that were not files. But it's certainly not needed in the current use cases for getKeysPresent.	2011-08-30 15:16:34 -04:00
Joey Hess	d1154d0837	init: Make description an optional parameter.	2011-08-29 14:13:38 -04:00
Joey Hess	6e750764b7	The wget command will now be used in preference to curl, if available. Got tired of curl's various ugly progress bars.	2011-08-27 12:31:50 -04:00

695 commits