git-annex

Author	SHA1	Message	Date
Joey Hess	e32ab766b0	--inbackend can be used to make git-annex only operate on files whose content is stored using a specified key-value backend.	2011-11-28 17:45:47 -04:00
Joey Hess	6869e6023e	support .git/annex on a different disk than the rest of the repo The only fully supported thing is to have the main repository on one disk, and .git/annex on another. Only commands that move data in/out of the annex will need to copy it across devices. There is only partial support for putting arbitrary subdirectories of .git/annex on different devices. For one thing, this can require more copies to be done. For example, when .git/annex/tmp is on one device, and .git/annex/journal on another, every journal write involves a call to mv(1). Also, there are a few places that make hard links between various subdirectories of .git/annex with createLink, that are not handled. In the common case without cross-device, the new moveFile is actually faster than renameFile, avoiding an unnecessary stat to check that a file (not a directory) is being moved. Of course if a cross-device move is needed, it is as slow as mv(1) of the data.	2011-11-28 16:17:55 -04:00
Joey Hess	2bf3addf49	Bugfix: dropunused did not drop keys with two spaces in their name.	2011-11-27 13:50:05 -04:00
Joey Hess	a72f0ecc27	changelog	2011-11-26 12:06:03 -04:00
Joey Hess	12243d2279	Flush json output, avoiding a buffering problem that could result in doubled output. The bug was that with --json, output lines were sometimes doubled. For example, git annex init --json would output two lines, despite only running one thing. Adding to the weirdness, this only occurred when the output was redirected to a pipe or a file. Strace showed two processes outputting the same buffered output. The second process was this writer process (only needed to work around bug #624389): _ <- forkProcess $ do hPutStr toh $ unlines paths hClose toh exitSuccess The doubled output occurs when this process exits, and ghc flushes the inherited stdout buffer. Why only when piping? I don't know, but ghc may be behaving differently when stdout is not a terminal. While this is quite possibly a ghc bug, there is a nice fix in git-annex. Explicitly flushing after each chunk of json is output works around the problem, and as a side effect, json is streamed rather than being output all at the end when performing an expensive operation. However, note that this means all uses of putStr in git-annex must be explicitly flushed. The others were, already.	2011-11-25 11:51:06 -04:00
Joey Hess	75a590bdd8	Put a workaround in the directory special remote for strange behavior with VFAT filesystems on Linux (mounted with shortname=mixed)	2011-11-22 18:21:28 -04:00
Joey Hess	322d9b1cc0	releasing version 3.20111122	2011-11-22 14:40:11 -04:00
Joey Hess	7f7ae7a3b1	find: Support --print0 It would be nice if command-specific options were supported. The first difficulty is that which command is being called is not known until after getopt; but that could be worked around by finding the first non-dashed parameter. Storing the settings without putting them in the annex monad is the next difficulty; it could perhaps be handled by making the seek stage pass applicable settings into the start stage (and from there on to perform as needed). But that still leaves a problem, what data type to use to represent the options between getopt and seek?	2011-11-22 14:06:31 -04:00
Joey Hess	d675f1c82e	status --json now shows most things Left out the backend usage graph for now, and bad/temp directory sizes are only displayed when present. Also, disk usage is returned as a string with units, which I can see changing later.	2011-11-20 14:12:48 -04:00
Joey Hess	c50a5fbeb4	status: Include all special remotes in the list of repositories. Special remotes do not always have a description listed in uuid.log, and such ones were not listed before.	2011-11-18 13:22:48 -04:00
Joey Hess	1326bb8635	Avoid excessive escaping for rsync special remotes that are not accessed over ssh. This is actually tricky, `45bbf210a1` added the escaping because it's needed for rsync that does go over ssh. So I had to detect whether the remote's rsync url will use ssh or not, and vary the escaping.	2011-11-18 12:53:48 -04:00
Joey Hess	c70b78d40a	migrate: Don't fall over a stale temp file.	2011-11-17 18:29:28 -04:00
Joey Hess	2bb6b02948	When not run in a git repository, git-annex can still display a usage message, and "git annex version" even works. Things that sound simple, but are made hard by the Annex monad being built with the assumption that there will always be a git repo.	2011-11-16 00:49:09 -04:00
Joey Hess	84784e2ca1	cleanup	2011-11-16 00:07:06 -04:00
Joey Hess	21a925dcf1	merge: Now runs in constant space. Before, a merge was first calculated, by running various actions that called git and built up a list of lines, which were at the end sent to git update-index. This necessarily used space proportional to the size of the diff between the trees being merged. Now, lines are streamed into git update-index from each of the actions in turn. Runtime size of git-annex merge when merging 50000 location log files drops from around 100 mb to a constant 4 mb. Presumably it runs quite a lot faster, too.	2011-11-15 23:28:01 -04:00
Joey Hess	7d05ca1d6d	Fix support for insteadOf url remapping. Closes: #644278	2011-11-15 14:06:38 -04:00
Joey Hess	bfe38f8ff1	status --json --fast for esc * status: Fix --json mode (only the repository lists are currently displayed) * status: --fast is back	2011-11-14 19:27:22 -04:00
Joey Hess	aa4fbbdd33	status: Now displays trusted, untrusted, and semitrusted repositories separately.	2011-11-14 16:14:17 -04:00
Joey Hess	04edae6791	Optimised union merging; now only runs git cat-file once.	2011-11-12 17:45:12 -04:00
Joey Hess	cea65b9e5b	init: When run in an already initialized repository, and without a description specified, don't delete the old description.	2011-11-12 15:42:52 -04:00
Joey Hess	e9bfa8eaed	avoid unnecessary auto-merge when only changing a file in the branch. Avoids doing auto-merging in commands that don't need fully current information from the git-annex branch. In particular, git annex add no longer needs to auto-merge. Affected commands: Anything that doesn't look up data from the branch, but does write a change to it. It might seem counterintuitive that we can change a value without first making sure we have the current value. This optimisation works because these two sequences are equivalent: 1. pull from remote 2. union merge 3. read file from branch 4. modify file and write to branch vs. 1. read file from branch 2. modify file and write to branch 3. pull from remote 4. union merge After either sequence, the git-annex branch contains the same logical content for the modified file. (Possibly with lines in a different order or additional old lines of course).	2011-11-12 15:15:57 -04:00
Joey Hess	897bf938f6	merge: Improve commit messages to mention what was merged.	2011-11-12 14:51:19 -04:00
Joey Hess	71b216d1fb	map: Support remotes with /~/ and /~user/ More accurately, it was supported already when map uses git-annex-shell, but not when it does not. Note that the user name cannot be shell escaped using git-annex's current approach for shell escaping. I tried and some shells like dash cannot cd ~'joey'. Rest of directory is still shell escaped, not for security but in case a directory has a space or other weird character.	2011-11-11 16:18:53 -04:00
Joey Hess	826d5887b2	Automatically fix up badly formatted uuid.log entries produced by 3.20111105, whenever the uuid.log is changed (ie, by init or describe).	2011-11-11 13:42:31 -04:00
Joey Hess	2de1e2c2ce	Optimized copy --from and get --from to avoid checking the location log for files that are already present. This can be a significant speedup when running in large trees that are only missing a few files; it makes copy --from just as fast as get.	2011-11-10 21:32:42 -04:00
Joey Hess	cf0174c922	content locking I've tested that this solves the cyclic drop problem. Have not looked at cyclic move, etc.	2011-11-09 21:54:42 -04:00
Joey Hess	faa4935047	Handle a case where an annexed file is moved into a gitignored directory, by having fix --force add its change.	2011-11-07 18:10:31 -04:00
Joey Hess	f8911cc69d	releasing version 3.20111107	2011-11-07 13:06:58 -04:00
Joey Hess	41eecb4601	Bugfix: In the past two releases, git-annex init has written the uuid.log in the wrong format, with the UUID and description flipped. This is my own damn fault for not making UUID a real type, and then relying on the type checker to ensure my refactoring was correct -- which it wasn't! I should probably add code to clean up bogus entries in the uuid.log, but right now I want to get the fix out there to prevent people experiencing this bug. I should also make UUID a real data type.	2011-11-07 12:47:41 -04:00
Joey Hess	aae0417d94	Don't try to read config from repos with annex-ignore set.	2011-11-07 11:50:30 -04:00
Joey Hess	c99fb58909	merge: Use fast-forward merges when possible. Thanks Valentin Haenel for a test case showing how non-fast-forward merges could result in an ongoing pull/merge/push cycle. While the git-annex branch is fast-forwarded, git-annex's index file is still updated using the union merge strategy as before. There's no other way to update the index that would be any faster. It is possible that a union merge and a fast-forward result in different file contents: Files should have the same lines, but a union merge may change their order. If this happens, the next commit made to the git-annex branch will have some unnecessary changes to line orders, but the consistency of data should be preserved. Note that when the journal contains changes, a fast-forward is never attempted, which is fine, because committing those changes would be vanishingly unlikely to leave the git-annex branch at a commit that already exists in one of the remotes. The real difficulty is handling the case where multiple remotes have all changed. git-annex does find the best (ie, newest) one and fast forwards to it. If the remotes are diverged, no fast-forward is done at all. It would be possible to pick one, fast forward to it, and make a merge commit to the rest, I see no benefit to adding that complexity. Determining the best of N changed remotes requires N*2+1 calls to git-log, but these are fast git-log calls, and N is typically small. Also, typically some or all of the remote refs will be the same, and git-log is not called to compare those. In the real world I expect this will almost always add only 1 git-log call to the merge process. (Which already makes N anyway.)	2011-11-06 15:22:40 -04:00
Joey Hess	0556dc812e	releasing version 3.20111105	2011-11-05 15:55:19 -04:00
Joey Hess	0bb798e351	Pass -t to rsync to preserve timestamps.	2011-11-04 19:41:11 -04:00
Joey Hess	ef3457196a	use SHA256 by default To get old behavior, add a .gitattributes containing: * annex.backend=WORM I feel that SHA256 is a better default for most people, as long as their systems are fast enough that checksumming their files isn't a problem. git-annex should default to preserving the integrity of data as well as git does. Checksum backends also work better with editing files via unlock/lock. I considered just using SHA1, but since that hash is believed to be somewhat near to being broken, and git-annex deals with large files which would be a perfect exploit medium, I decided to go to a SHA-2 hash. SHA512 is annoyingly long when displayed, and git-annex displays it in a few places (and notably it is shown in ls -l), so I picked the shorter hash. Considered SHA224 as it's even shorter, but feel it's a bit weird. I expect git-annex will use SHA-3 at some point in the future, but probably not soon! Note that systems without a sha256sum (or sha256) program will fall back to defaulting to SHA1.	2011-11-04 15:51:01 -04:00
Joey Hess	1089e85d48	add changelog for bugfix	2011-11-04 15:51:01 -04:00
Joey Hess	eec137f33a	Record uuid when auto-initializing a remote so it shows in status.	2011-11-02 14:18:21 -04:00
Joey Hess	00988bcf36	fixed my build environment	2011-10-31 15:40:57 -04:00
Joey Hess	3d3e1c4c25	better command name	2011-10-31 15:18:41 -04:00
Joey Hess	380839299e	The fromkey command now takes the key as its first parameter. The --key option is no longer used.	2011-10-31 12:56:07 -04:00
Joey Hess	cc1ea8f844	Removed the setkey command, and added a setcontent command with a more useful interface.	2011-10-31 12:33:41 -04:00
Joey Hess	22e9f445ab	unused, dropunused: Now work in bare repositories. Turned out I had already done all the work needed to support this when unused started checking all branches.	2011-10-29 19:16:45 -04:00
Joey Hess	2566eb85fe	fsck: Now works in bare repositories. Checks location log information, and file contents. Does not check that numcopies is satisfied, as .gitattributes information about numcopies is not available in a bare repository. In practice, that should not be a problem, since fsck is also run in a checkout and will check numcopies there.	2011-10-29 18:03:28 -04:00
Joey Hess	ab738a403a	status: Now always shows the current repository, even when it does not appear in uuid.log.	2011-10-28 19:49:01 -04:00
Joey Hess	6c31e3a8c3	drop --from is now supported to remove file content from a remote.	2011-10-28 17:26:38 -04:00
Joey Hess	b955238ec7	Fail if --from or --to is passed to commands that do not support them.	2011-10-27 18:56:54 -04:00
Joey Hess	66194684ac	uninit: Add guard against being run with the git-annex branch checked out.	2011-10-27 15:47:11 -04:00
Joey Hess	83d11c03c4	wording	2011-10-27 15:24:58 -04:00
Joey Hess	f84d66fa15	reap in onLocal Each onLocal call involves a new Annex state, so needs to clean up after it.	2011-10-27 14:55:07 -04:00
Joey Hess	373cad993d	Sped up some operations on remotes that are on the same host. Specifically, disabled trying to update the git-annex branch on the remote, since that data is never used by operations that act on such remotes. Also, when copying content to such a remote, skip committing the presence information changes to its git-annex branch. Leaving it in the journal there is ok: Any command run on the remote that needs the info will flush the journal. This may partially solve this bug: http://git-annex.branchable.com/bugs/fails_to_handle_lot_of_files/ Although I still see unreaped git processes piling up when doing a copy --to.	2011-10-27 14:55:06 -04:00
Joey Hess	270c1af087	releasing version 3.20111025	2011-10-25 13:46:01 -07:00
Joey Hess	e2853b3fec	update	2011-10-25 11:39:15 -07:00
Joey Hess	52c8244219	git-annex-shell: GIT_ANNEX_SHELL_READONLY and GIT_ANNEX_SHELL_LIMITED environment variables can be set to limit what commands can be run. This could be used by eg, gitolite.	2011-10-15 19:06:35 -04:00
Joey Hess	ec169f84b1	migrate: Copy url logs for keys when migrating.	2011-10-15 16:36:56 -04:00
Joey Hess	9fa9214106	A remote can have an annexUrl configured, that is used by git-annex instead of its usual url. (Similar to pushUrl.)	2011-10-14 18:18:28 -04:00
Joey Hess	205a5b2aaa	typo	2011-10-12 00:29:49 -04:00
Joey Hess	11b154e811	prep release	2011-10-11 23:03:19 -04:00
Joey Hess	402d9c7c5f	oops	2011-10-11 22:54:38 -04:00
Joey Hess	9c04d1e523	fix git 1.7.7 breakage * This version of git-annex only works with git 1.7.7 and newer. The breakage with old versions is subtle, and affects annex.numcopies .gitattributes settings, so be sure to upgrade git to 1.7.7. (Debian package now depends on that version.) * Don't pass absolute paths to git show-attr, as it started following symlinks when that's done in 1.7.7. Instead, use relative paths, which show-attr only handles 100% correctly in 1.7.7. Closes: #645046 Unfortunately, I can find no way to work with the old and new gits, as the old had bugs that require absolute paths, while the new doesn't like them at all. And the behavior of git show-attr in 1.7.7 is the same as eg, git add of an absolute path to a symlink, so seems entirely intentional and not likely to change.	2011-10-11 22:53:32 -04:00
Joey Hess	10edaf6dc9	reorder	2011-10-10 16:03:32 -04:00
Joey Hess	81ed7b203d	Now supports git's insteadOf configuration, to modify the url used to access a remote. Note that pushInsteadOf is not used; that and pushurl are reserved for actual git pushes. Closes: #644278	2011-10-09 14:58:32 -04:00
Joey Hess	5414bbce58	git-annex-shell uuid verification * git-annex now asks git-annex-shell to verify that it's operating in the expected repository. * Note that this git-annex will not interoperate with remotes using older versions of git-annex-shell. The reason for this check is to avoid git-annex getting confused about what remote repository actually contains a value. It's a prerequisite for supporting git insteadOf aliases.	2011-10-06 19:24:11 -04:00
Joey Hess	f011033869	add timestamps to remote.log	2011-10-06 16:07:58 -04:00
Joey Hess	f929d0229c	Add timestamps to trust.log.	2011-10-06 15:55:50 -04:00
Joey Hess	3e0d2a0803	add timestamp to uuid.log * New or changed repository descriptions in uuid.log now have a timestamp, which is used to ensure the newest description is used when the uuid.log has been merged. * Note that older versions of git-annex will display the timestamp as part of the repository description, which is ugly but otherwise harmless.	2011-10-06 15:31:25 -04:00
Joey Hess	d357556141	Add locking to avoid races when changing the git-annex branch.	2011-10-03 16:32:36 -04:00
Joey Hess	49f21dd9ba	Contain the zombie hordes. Specifically, when using gpg, a zombie is forked for each file, so waiting until shutdown to reap won't do.	2011-10-02 11:16:34 -04:00
Joey Hess	29032cb70e	When displaying a list of repositories, show git remote names in addition to their descriptions.	2011-09-30 15:02:29 -04:00
Joey Hess	828f3f1b0c	status: List all known repositories.	2011-09-30 03:20:24 -04:00
Joey Hess	a7e7dda55a	Fix referring to remotes by uuid. I think that I broke this in some fairly recent refactoring.	2011-09-30 02:23:24 -04:00
Joey Hess	7ff89ccfee	convert all git read/write functions to use ByteStrings This yields a second or so speedup in unused, find, etc. Seems that even when the ByteString is immediately split and then converted to Strings, it's faster. I may try to push ByteStrings out into more of git-annex gradually, although I suspect most of the time-critical parts are already covered now, and many of the rest rely on libraries that only support Strings.	2011-09-29 23:48:57 -04:00
Joey Hess	a91c8a15d5	Sped up unused. Added Git.ByteString which replaces Git IO methods with ones using lazy ByteStrings. This can be more efficient when large quantities of data are being read from git. In Git.LsTree, parse git ls-tree output more efficiently, thanks to ByteString. This benchmarks 25% faster, in a benchmark that includes (probably predominately) the run time for git ls-tree itself. In real world numbers, this makes git annex unused 2 seconds faster for each branch it needs to check, in my usual large repo.	2011-09-29 19:04:24 -04:00
Joey Hess	7dddb803a0	releasing version 3.20110928	2011-09-28 19:17:12 -04:00
Joey Hess	d75da353b9	documentation/warning message update for future feature	2011-09-23 18:04:38 -04:00
Joey Hess	9f5c7a246b	status: Massively sped up; remove --fast mode. Using Sets is the right thing; they have constant size lookup like my SizeList, and log n insertion, which beats nub to death. Runs faster than --fast mode did before, and gives accurate counts. 13 seconds total runtime with a warm cache in a repository with 40 thousand keys.	2011-09-20 18:57:05 -04:00
Joey Hess	cabbefd9d2	status: In --fast mode, all status info is displayed now; but some of it is only approximate, and is marked as such.	2011-09-20 18:13:08 -04:00
Joey Hess	a4aef6f115	clarify wording	2011-09-19 01:54:20 -04:00
Joey Hess	33cd1ffbfe	make find show files meeting limits, even when not present find: Rather than only showing files whose contents are present, when used with --exclude --copies or --in, displays all files that match the specified conditions. Note that this is a behavior change for find --exclude! Old behavior can be gotten with find --in . --exclude=...	2011-09-18 20:42:15 -04:00
Joey Hess	9da23dff78	--copies=N can be used to make git-annex only operate on files with the specified number of copies. (And --not --copies=N for the inverse.)	2011-09-18 20:23:08 -04:00
Joey Hess	1fc3ee2423	add --in limit	2011-09-18 20:14:18 -04:00
Joey Hess	3e73de4054	releasing version 3.20110915	2011-09-17 09:21:09 -04:00
Joey Hess	d036cd590f	bugfix: drop and fsck did not honor --exclude	2011-09-15 15:44:32 -04:00
Joey Hess	a0d3a343b5	copy --auto Only does copy when numcopies is not yet satisfied.	2011-09-15 15:28:58 -04:00
Joey Hess	984c9fc052	remove optimize subcommand; use --auto instead get, drop: Added --auto option, which decides whether to get/drop content as needed to work toward the configured numcopies. The problem with bundling it up in optimize was that I then found I wanted to run an optimize that did not drop files, only got them. Considered adding a --only-get switch to it, but that seemed wrong. Instead, let's make existing subcommands optionally smarter. Note that the only actual difference between drop and drop --auto is that the latter does not even try to drop a file if it knows of not enough copies, and does not print any error messages about files it was unable to drop. It might be nice to make get avoid asking git for attributes when not in auto mode. For now it always asks for attributes.	2011-09-15 13:30:04 -04:00
Joey Hess	949b3f69d0	optimize: A new subcommand that either gets or drops file content as needed to work toward meeting the configured numcopies setting. This is currently rather simplistic, though still useful. In the future, it could become smarter about what content is stored where, etc.	2011-09-14 13:47:22 -04:00
Joey Hess	03d6209e1c	addurl: Always use whole url as destination filename, rather than only its file component. First, this ensures that git annex addurl, when run repeatedly with the same url, doesn't create duplicate files, which it did before when it fell back to the longer filename. Secondly, the file part of an url is frequently not very descriptive on its own. The uri scheme, auth, and port is intentionally left out, as clutter.	2011-09-07 19:04:51 -04:00
Joey Hess	72b54d6170	Fix build without S3.	2011-09-07 10:21:19 -04:00
Joey Hess	6f98fd5391	whereis: Show untrusted locations separately and do not include in location count.	2011-09-06 16:59:53 -04:00
Joey Hess	6fd0df7c2f	releasing version 3.20110906	2011-09-06 15:54:21 -04:00
Joey Hess	ebb92221fd	Fix Makefile to work with cabal again.	2011-09-06 15:35:13 -04:00
Joey Hess	07125dca53	Improve display of newlines around error and warning messages.	2011-09-06 13:46:08 -04:00
Joey Hess	d238bbd9d9	releasing version 3.20110902	2011-09-02 21:32:05 -04:00
Joey Hess	2f4d4d1c45	basic json support This includes a generic JSONStream library built on top of Text.JSON (somewhat hackishly). It would be possible to stream out a single json document describing all actions, but it's probably better for consumers if they can expect one json document per line, so I did it that way instead. Output from external programs used for transferring files is not currently hidden when outputting json, which probably makes it not very useful there. This may be dealt with if there is demand for json output for --get or --move to be parsable. The version, status, and find subcommands have hand-crafted output and don't do json. The whereis subcommand needs to be modified to produce useful json.	2011-09-01 15:22:06 -04:00
Joey Hess	f600444ab6	unused --remote: Reduced memory use to 1/4th what was used before. Using a single strictness annotation, in just the right place. Tried several others, none of which helped and some of which potentially hurt. This is only the second time I've really had to deal with this in a year of using haskell, which is, I suppose not that bad.	2011-08-31 19:13:02 -04:00
Joey Hess	ea7b1828d4	unused, status: Sped up by avoiding unnecessary stats of annexed files. Statting files returned by dirContents to see if they exist and are regular files seems pretty useless. This code was originally part of fsck, and perhaps the idea then was to avoid things returned by dirContents that were not files. But it's certainly not needed in the current use cases for getKeysPresent.	2011-08-30 15:16:34 -04:00
Joey Hess	d1154d0837	init: Make description an optional parameter.	2011-08-29 14:13:38 -04:00
Joey Hess	6e750764b7	The wget command will now be used in preference to curl, if available. Got tired of curl's various ugly progress bars.	2011-08-27 12:31:50 -04:00
Joey Hess	20259c2955	Set EMAIL when running test suite so that git does not need to be configured first. Closes: #638998	2011-08-23 13:41:32 -04:00
Joey Hess	06f509854a	file moved	2011-08-21 13:19:33 -04:00
Joey Hess	3786f8d348	releasing version 3.20110819	2011-08-19 20:38:36 -04:00
Joey Hess	01cd775d92	Fix broken upgrade from V1 repository. Closes: #638584 Had forgotten to keep several old versions of functions needed during this upgrade.	2011-08-19 20:32:18 -04:00
Joey Hess	8a2197adfa	Added annex-cost-command configuration, which can be used to vary the cost of a remote based on the output of a shell command. Also avoided crashing if the user specified cost value cannot be parsed.	2011-08-18 12:20:47 -04:00
Joey Hess	56f6923ccb	Now "git annex init" only has to be run once when a git repository is first being created. Clones will automatically notice that git-annex is in use and automatically perform a basic initialization. It's still recommended to run "git annex init" in any clones, to describe them.	2011-08-17 14:44:31 -04:00
Joey Hess	f0c2130700	releasing version 3.20110817	2011-08-17 01:34:15 -04:00
Joey Hess	4a023dd1aa	Added curl to Debian package dependencies.	2011-08-16 22:22:00 -04:00
Joey Hess	e6752cc064	Added support for getting content from git remotes using http (and https).	2011-08-16 21:12:48 -04:00
Joey Hess	dede05171b	addurl: --fast can be used to avoid immediately downloading the url. The tricky part about this is that to generate a key, the file must be present already. Worked around by adding (back) an URL key type, which is used for addurl --fast.	2011-08-06 14:57:22 -04:00
Joey Hess	45bbf210a1	Fix shell escaping in rsync special remote.	2011-07-29 15:28:21 +02:00
Joey Hess	a8a71b9d91	releasing version 3.20110719	2011-07-19 23:52:09 -04:00
Joey Hess	ec9e9343d9	add closure for new bug that I already fixed	2011-07-17 19:05:50 -04:00
Joey Hess	ded2591124	unannex: Clean up use of git commit -a. This was more complex than would be expected. unannex has to use git commit -a since it's removing files from git; git commit filelist won't do. Allow commands to be added to the Git queue that have no associated files, and run such commands once.	2011-07-14 17:15:37 -04:00
Joey Hess	0c46cbab09	Support the standard git -c name=value This allows eg, `git-annex -c annex.rsync-options=-6 get file` The overridden git configs are not passed on to git plumbing commands that are run. Perhaps someone will find a need to do that, but I don't yet and it would require storing more state to know what config settings have been overridden and need to be passed on.	2011-07-14 16:51:20 -04:00
Joey Hess	7919de73af	Bugfix: Make add ../ work. The complication of check-attr returning absolute paths that have to be converted back to relative paths..	2011-07-10 13:52:53 -04:00
Joey Hess	40c6ba99f5	add: Be even more robust to avoid ever leaving the file seemingly deleted. A failure at any point after the file is annexed will result in an undo that puts the original file back into place and wipes the location log.	2011-07-07 21:30:51 -04:00
Joey Hess	2a108982ad	add monad-control to build depends Will use this to handle exceptions in the Annex monad, yay.	2011-07-07 20:53:57 -04:00
Joey Hess	4d4f297c96	releasing version 3.20110707	2011-07-07 19:37:49 -04:00
Joey Hess	67dcc1f171	add: Avoid a failure mode that resulted in the file seemingly being deleted (content put in the annex but no symlink present).	2011-07-07 19:29:36 -04:00
Joey Hess	2fb771f135	Bugfix: Forgot to de-escape keys when upgrading. Could result in bad location log data for keys that contain [&:%] in their names. (A workaround for this problem is to run git annex fsck.) `git annex unused --from remote` could also run into the broken code.	2011-07-07 17:04:21 -04:00
Joey Hess	497b1e6092	Fix sign bug in disk free space checking. Giulio Eulisse reported that on OSX, bad free space numbers were being shown. It thought he had negative free space. While the documentation is not clear, especially across OS's, it seems likely that statfs uses unsigned long. It doesn't make sense for any numbers to be negative.	2011-07-05 20:53:58 -04:00
Joey Hess	d583e04d23	releasing version 3.20110705	2011-07-05 15:21:38 -04:00
Joey Hess	5c69ac14eb	Drop the dependency on the haskell curl bindings, use regular haskell HTTP.	2011-07-04 19:33:11 -04:00
Joey Hess	71c783bf24	uninit: Use unannex in --fast mode, to support unannexing multiple files that link to the same content.	2011-07-04 16:20:50 -04:00
Joey Hess	22a4f5b348	unannex: In --fast mode, file content is left in the annex, and a hard link made to it.	2011-07-04 16:06:28 -04:00
Joey Hess	5beb6bc76f	uninit: delete .git/annex/	2011-07-04 15:55:03 -04:00
Joey Hess	5c63b409d4	uninit: Delete the git-annex branch.	2011-07-04 15:50:30 -04:00
Joey Hess	48db40857c	releasing version 3.20110702	2011-07-02 15:08:05 -04:00
Joey Hess	457d28c676	wording	2011-07-01 17:24:11 -04:00
Joey Hess	a140f7148f	documentation for using the web	2011-07-01 16:05:06 -04:00
Joey Hess	2cdacfbae6	remove URL backend	2011-07-01 16:01:04 -04:00
Joey Hess	6ba866ca73	updates for web remote and removing URL backend	2011-07-01 15:39:30 -04:00
Joey Hess	cdbcd6f495	add web special remote Generalized LocationLog to PresenceLog, and use a presence log to record urls for the web special remote.	2011-07-01 15:30:42 -04:00
Joey Hess	ee3a0551a7	Merge branch 'master' into v3 Conflicts: debian/changelog	2011-06-30 15:01:08 -04:00
Joey Hess	56aeeb4565	cabal can now be used to build git-annex. This is substantially slower than using make, does not build or install documentation, does not run the test suite, and is not particularly recommended, but could be useful to some.	2011-06-30 14:55:03 -04:00
Joey Hess	8562e6096c	v3 is now faster than v2 Rebenchmarked v2 vs v3, and v3 is now actually faster. Yes, storing data in git, using git as a filesystem is actually faster than just using the filesystem. If you do it just right. :)	2011-06-30 01:16:53 -04:00
Joey Hess	d72fb5acc2	Fix encoding of utf-8 etc when storing the description of repository and other content. Write files in raw mode, to avoid mangling the encoding of content provided. Note: This was a longstanding problem, it was not introduced in v3.	2011-06-30 00:35:51 -04:00
Joey Hess	e1c18ddec4	Sped back up fsck, copy --from etc All commands that often have to read a lot of information from the git-annex branch should now be nearly as fast as before the branch was introduced. Before fsck was taking approximately 3 hours, now it's running in 8 minutes. The code is very nasty. It should be rewritten to read the header line from git cat-file, and then read the specified number of bytes of content.	2011-06-29 21:47:31 -04:00
Joey Hess	af45d42224	Merge branch 'master' into v3 Conflicts: debian/changelog	2011-06-29 11:42:35 -04:00
Joey Hess	b3aaf980e4	--force will cause add, etc, to operate on ignored files.	2011-06-29 11:42:00 -04:00
Joey Hess	5034d8c298	Modify location log parser to allow future expansion. Since the logs have just been moved into the git-annex branch, don't need to worry about backwards compatibility with old versions of git-annex that would fail to parse location logs with extra fields tacked on.	2011-06-28 16:15:50 -04:00
Joey Hess	c90652f015	Always ensure git-annex branch exists.	2011-06-26 22:43:48 -04:00
Joey Hess	874fc044c1	releasing version 3.20110624	2011-06-24 14:58:07 -04:00
Joey Hess	068703c405	improve post-upgrade push instructions	2011-06-23 14:51:04 -04:00
Joey Hess	7ee636f6dd	avoid unnecessary read of trust.log	2011-06-23 13:39:04 -04:00
Joey Hess	66ceb92702	docs	2011-06-22 23:37:46 -04:00
Joey Hess	68783fd5e0	let's have the major version number be annex.version	2011-06-22 23:02:58 -04:00
Joey Hess	ad3770e0b2	add merge subcommand	2011-06-22 18:46:56 -04:00
Joey Hess	80302d0b46	improve bare repo handling Many more commands can work in bare repos now, thanks to the git-annex branch.	2011-06-22 18:32:41 -04:00
Joey Hess	818ae0c6da	docs for v3	2011-06-21 20:21:33 -04:00
Joey Hess	9f9e17aa0f	unlock: Made atomic.	2011-06-20 22:38:18 -04:00
Joey Hess	c835166a7c	add git-union-merge This is a new git subcommand, that does a generic union merge operation between two refs, storing the result in a branch. It operates efficiently without touching the working tree. It does need to write out a temporary index file, and may need to write out some other temp files as well. This could be useful for anything that stores data in a branch, and needs to merge changes into that branch without actually checking the branch out. Since conflict handling can't be done without a working copy, the merge type is always a union merge, which is fine for data stored in log format (as git-annex does), or in non-conflicting files (as pristine-tar does). This probably belongs in git proper, but it will live in git-annex for now. --- Plan is to move .git-annex/ to a git-annex branch, and use git-union-merge to handle merging changes when pulling from remotes. Some preliminary benchmarking using real .git-annex/ data indicates that it's quite fast, except for the "git add" call, which is as slow as "git add" tends to be with a big index.	2011-06-20 21:37:18 -04:00
Joey Hess	f547277b75	Allow --trust etc to specify a repository by name, for temporarily trusting repositories that are not configured remotes.	2011-06-13 22:19:44 -04:00
Joey Hess	30d7cce7ec	rsync is now used when copying files from repos on other filesystems cp is still used when copying files from repos on the same filesystem, since --reflink=auto can make it significantly faster on filesystems such as btrfs. Directory special remotes still use cp, not rsync. It's not clear what tmp file should be used when rsyncing to such a remote.	2011-06-13 20:33:52 -04:00
Joey Hess	38e0100a69	releasing version 0.20110610	2011-06-10 11:58:21 -04:00
Joey Hess	9a272815dd	Bugfix: Fix fsck to not think all SHAnE keys are bad.	2011-06-10 11:43:28 -04:00
Joey Hess	90dd245522	get --from is the same as copy --from get not honoring --from has surprised me a few times, so least surprise suggests it should just behave like copy --from. This leaves the difference between get and copy being that copy always requires the remote to copy from, while get will decide whether to get a file from a key/value store or a remote.	2011-06-09 18:54:49 -04:00
Joey Hess	a8fb97d2ce	Add --trust, --untrust, and --semitrust options.	2011-06-01 17:57:31 -04:00
Joey Hess	3d567aa64f	Add --numcopies option.	2011-06-01 16:49:17 -04:00
Joey Hess	dc92a788c7	releasing version 0.20110601	2011-06-01 12:00:25 -04:00
Joey Hess	038da52bdd	Somewhat sped up `git commit` of modifications to unlocked files. Avoid git reset here too, so I no longer need to care that it's much more expensive than seems wise (but I asked the git list about that anyway). It's not necessary to reset the staged file content from the index, as the `git add` of the symlink will replace it anyway. `git commit` of unlocked files is still slow, since git still has to shove their entire content into the index, only to have it be thrown away. So it's still better to use `git annex add`	2011-05-31 16:08:37 -04:00
Joey Hess	fb259033d4	Fix locking of files with staged changes. Previously, lock would skip files that had staged changes, but that is counterintuitive, I think.	2011-05-31 15:00:56 -04:00
Joey Hess	fafe60768f	Massively sped up `git annex lock` by avoiding use of the uber-slow `git reset`, and only running `git checkout` once, even when many files are being locked.	2011-05-31 14:50:41 -04:00
Joey Hess	14ffb5d47b	bugfix: fix unused list numbering Introduced in `43f0a666f0`	2011-05-28 22:30:06 -04:00
Joey Hess	7ea54e1c6e	releasing version 0.20110522	2011-05-27 20:28:01 -04:00
Joey Hess	82b88d0676	typo	2011-05-27 20:21:13 -04:00
Joey Hess	001edb008a	Fix bug in --exclude introduced in 0.20110516.	2011-05-27 20:20:20 -04:00
Joey Hess	5b941980aa	Closer emulation of git's behavior when told to use "foo/.git" as a git repository instead of just "foo". Closes: #627563	2011-05-22 14:12:16 -04:00
Joey Hess	8ed27db18f	add explicit build dep on hslogger pulled in by missingh, but now used directly by git-annex	2011-05-21 13:03:13 -04:00
Joey Hess	944b1207dc	releasing version 0.20110521	2011-05-21 11:58:35 -04:00
Joey Hess	93a4f3d4e6	Add --debug option. Closes: #627499 This takes advantage of the debug logging done by missingh, and I added my own debug messages for executeFile calls. There are still some other low-level ways git-annex runs stuff that are not shown by debugging, but this gets most of it easily.	2011-05-21 11:52:13 -04:00
Joey Hess	cd83541872	--backend now overrides any backend configured in .gitattributes files.	2011-05-18 19:34:46 -04:00
Joey Hess	a8816efc14	status: New subcommand to show info about an annex, including its size.	2011-05-16 21:18:34 -04:00
Joey Hess	3ab15b9f4f	releasing version 0.20110516	2011-05-16 15:01:05 -04:00
Joey Hess	5256a6b011	migrate: Use current filename when generating new key, for backends where the filename affects the key name.	2011-05-16 12:10:08 -04:00
Joey Hess	e7b309ce02	clarify	2011-05-16 11:49:52 -04:00
Joey Hess	2a8efc7af1	Added filename extension preserving variant backends SHA1E, SHA256E, etc.	2011-05-16 11:46:34 -04:00
Joey Hess	1d2984441c	add a few tweaks to make it easy to use the Internet Archive's variant of S3 In particular, munge key filenames to comply with the IA's filename limits, disable encryption, support their nonstandard way of creating buckets, and allow x-amz-* headers to be specified in initremote to set item metadata. Still TODO: initremote does not handle multiword metadata headers right.	2011-05-16 11:20:35 -04:00
Joey Hess	078a6fbd76	Work around a bug in Network.URI's handling of bracketed ipv6 addresses.	2011-05-06 15:21:30 -04:00
Joey Hess	86d3205061	releasing version 0.20110503	2011-05-03 21:49:20 -04:00
Joey Hess	1f84c7a964	S3: When encryption is enabled, the Amazon S3 login credentials are stored, encrypted, in .git-annex/remotes.log, so environment variables need not be set after the remote is initialized.	2011-05-01 14:05:10 -04:00
Joey Hess	43f0a666f0	unused: Now also lists files fsck places in .git/annex/bad/	2011-04-29 13:59:00 -04:00
Joey Hess	eef3f634e9	Avoid crashing when an existing key is readded to the annex.	2011-04-28 20:41:40 -04:00
Joey Hess	07576f2a2c	documentation for hook special remotes Releasing before I have quite finished the code. Got a little caught up in Anathem references. Time for a walk and then a tiny bit more coding and possibly testing.	2011-04-28 15:26:21 -04:00
Joey Hess	d7b330b33b	Fix hasKeyCheap setting for bup and rsync special remotes.	2011-04-28 14:39:51 -04:00
Joey Hess	84e1ebfb0e	erm, thought I committed this release?	2011-04-28 14:38:01 -04:00
Joey Hess	7a33803193	Avoid pipeline stall when running git annex drop or fsck on a lot of files. When it's stalled, there are 3 processes: git annex git ls-files git check-attr git-annex stalls trying to write to git check-attr, which stalls trying to write to stdout (read by git-annex). git ls-files does not seem to be involved directly; I've seen the stall when it was still streaming out the file list, and after it had exited and zombified. The read and write are supposed to be handled by two different threads, which pipeBoth forks off, thus avoiding deadlock. But it does deadlock. (Certain signals unblock the deadlock for a while, then it stalls again.) So, this is another case of WTF is the ghc IO manager doing today? I avoid the issue by converting the writer to a separate process. Possibly this was caused by some change in ghc 7 -- I'm offline and cannot verify now, but I'm sure I used to be able to run git annex drop w/o it hanging! And the code does not seem to have changed, except for commit `c1dc407941`, which I tried reverting without success. In fact, I reverted all the way back to 0.20110316 and still saw the stall. Update: Minimal test case: import System.Cmd.Utils main = do as <- checkAttr "blah" $ map show [1..100000] sequence $ map (putStrLn . show) as checkAttr attr files = do (_, s) <- pipeBoth "git" params $ unlines files return $ lines s where params = ["check-attr", attr, "--stdin"] Bug filed on ghc in debian, #624389	2011-04-27 23:18:35 -04:00
Joey Hess	39966ba4ee	filter out --delete rsync option rsync does not have a --no-delete, so do it this way instead	2011-04-27 20:31:56 -04:00
Joey Hess	e68f128a9b	rsync special remote Fully tested and working, including resuming and encryption. (Though not resuming when sending with encryption; gpg doesn't produce identical output each time.) Uses same layout as the directory special remote and the .git/annex/objects/ directory.	2011-04-27 20:23:09 -04:00
Joey Hess	27774bdd56	Revert "Use haskell Crypto library instead of haskell SHA library.a" This reverts commit `892593c5ef`. Conflicts: Crypto.hs debian/control	2011-04-26 11:24:23 -04:00
Joey Hess	7d71f8770b	releasing version 0.20110425	2011-04-25 16:02:57 -04:00
Joey Hess	76911a446a	Avoid using absolute paths when staging location log, as that can confuse git when a remote's path contains a symlink. Closes: #621386 This was a real PITA to fix, since location logs can be staged in both the current repo, as well as in local remote's repos, in which case the cwd will not be in the repo. And git add needs different params in both cases, when absolute paths are not used. In passing, git annex fsck now stages location log fixes.	2011-04-25 14:54:24 -04:00
Joey Hess	8512a4a1a1	Remove testpack from build depends, as it is not available on all architectures. The test suite will not be run if it cannot be compiled. It may be possible later to split off the quickcheck using tests into a separate program and keep most of the tests using just hunit.	2011-04-25 12:43:22 -04:00
Joey Hess	892593c5ef	Use haskell Crypto library instead of haskell SHA library.a Since hS3 needs Crypto anyway, this actually reduces dependencies.	2011-04-21 16:37:14 -04:00
Joey Hess	24feee25c9	releasing version 0.20110420	2011-04-21 15:11:51 -04:00
Joey Hess	6668a061a8	typo	2011-04-21 14:53:07 -04:00
Joey Hess	2467c56771	update on S3 memory leaks The remaining leaks are in hS3. The leak with encryption was worked around by the use of the temp file. (And was probably originally caused by gpgCipherHandle sparking a thread which kept a reference to the start of the byte string.)	2011-04-21 11:06:29 -04:00
Joey Hess	6fcd3e1ef7	fix S3 upload buffering problem Provide file size to new version of hS3.	2011-04-21 10:33:17 -04:00
Joey Hess	d8329731c6	missing build dep	2011-04-21 09:58:32 -04:00
Joey Hess	43639f69f6	ghc7 * Update Debian build dependencies for ghc 7. * Debian package is now built with S3 support. Thanks Joachim Breitner for making this possible, also thanks Greg Heartsfield for working to improve the hS3 library for git-annex. Also hid a conflicting new symbol from Control.Monad.State	2011-04-21 02:22:40 -04:00
Joey Hess	143fc7b692	finalize release	2011-04-19 21:40:21 -04:00
Joey Hess	5985acdfad	bup: Avoid memory leak when transferring encrypted data. This was a most surprising leak. It occurred in the process that is forked off to feed data to gpg. That process was passed a lazy ByteString of input, and ghc seemed to not GC the ByteString as it was lazily read and consumed, so memory slowly leaked as the file was read and passed through gpg to bup. To fix it, I simply changed the feeder to take an IO action that returns the lazy bytestring, and fed the result directly to hPut. AFAICS, this should change nothing WRT buffering. But somehow it makes ghc's GC do the right thing. Probably I triggered some weakness in ghc's GC (version 6.12.1). (Note that S3 still has this leak, and others too. Fixing it will involve another dance with the type system.) Update: One theory I have is that this has something to do with the forking of the feeder process. Perhaps, when the ByteString is produced before the fork, ghc decides it needs to hold a pointer to the start of it, for some reason -- maybe it doesn't realize that it is only used in the forked process.	2011-04-19 15:27:03 -04:00
Joey Hess	a441e08da1	Fix stalls in S3 when transferring encrypted data. Stalls were caused by code that did approximately: content' <- liftIO $ withEncryptedContent cipher content return store content' The return evaluated without actually reading content from S3, and so the cleanup code began waiting on gpg to exit before gpg could send all its data. Fixing it involved moving the `store` type action into the IO monad: liftIO $ withEncryptedContent cipher content store Which was a bit of a pain to do, thank you type system, but avoids the problem as now the whole content is consumed, and stored, before cleanup.	2011-04-19 14:45:19 -04:00

591 commits