git-annex

Author	SHA1	Message	Date
Joey Hess	1a8ba7eab4	Improve ssh socket cleanup code to skip over the cruft that NFS sometimes puts in a directory when a file is being deleted.	2016-10-26 13:16:41 -04:00
Joey Hess	6e4fee1faf	test: Deal with gpg-agent behavior change that broke the test suite. gpg-agent started deleting its socket file on shutdown, and this tickled an ugly behavior in removeDirectoryRecursive, https://github.com/haskell/directory/issues/60 Running removeDirectoryRecursive again on exception avoids the problem.	2016-10-18 16:56:38 -04:00
Joey Hess	090a922a98	Assistant, repair: Improved filtering out of git fsck lines about duplicate file entries in tree objects.	2016-10-18 11:19:41 -04:00
Joey Hess	0b1c061382	importfeed: Drop URL parameters from file extension. Thanks, James MacMahon.	2016-10-17 16:02:05 -04:00
Joey Hess	10ca4b9788	Improve style of offline html build of website.	2016-10-17 15:55:49 -04:00
Joey Hess	8e22114735	upgrade: Handle upgrade to v6 when the repository already contains v6 unlocked files whose content is already present. Closes https://github.com/datalad/datalad/issues/1020 The use of runWriter in scanUnlockedFiles broke due to this change; it failed with blocked indefinitely in mvar, because the database write handle was taken while linkFromAnnex needed to also write to it (to update the inode cache). So, switched to using a separate runWriter for each call to addAssociatedFileFast. A little less efficient, but not greatly; the writes should all still be cached.	2016-10-17 15:19:47 -04:00
Joey Hess	ee309d6941	lock: Fix edge cases where data loss could occur in v6 mode. In the case where the pointer file is in place, and not the content of the object, lock's performNew was called with filemodified=True, which caused it to try to repopulate the object from an unmodified associated file, of which there were none. So, the content of the object got thrown away incorrectly. This was the cause (although not the root cause) of data loss in https://github.com/datalad/datalad/issues/1020 The same problem could also occur when the work tree file is modified, but the object is not, and lock is called with --force. Added a test case for this, since it's exercising the same code path and is easier to set up than the problem above. Note that this only occurred when the keys database did not have an inode cache recorded for the annex object. Normally, the annex object would be in there, but there are of course circumstances where the inode cache is out of sync with reality, since it's only a cache. Fixed by checking if the object is unmodified; if so we don't need to try to repopulate it. This does add an additional checksum to the unlock path, but it's already checksumming the worktree file in another case, so it doesn't slow it down overall. Further investigation found a similar problem occurred when smudge --clean is called on a file and the inode cache is not populated. cleanOldKeys deleted the unmodified old object file in this case. This was also fixed by checking if the object is unmodified. In general, use of getInodeCaches and sameInodeCache is potentially dangerous if the inode cache has not gotten populated for some reason. Better to use isUnmodified. I briefly audited other places that check the inode cache, and did not see any immediate problems, but it would be easy to miss this kind of problem.	2016-10-17 13:58:43 -04:00
Joey Hess	c0cdac5c4a	releasing package git-annex version 6.20161012	2016-10-12 09:38:03 -04:00
Joey Hess	b82c3e0783	sync: Fix bug in adjusted branch merging that could cause recently added files to be lost when updating the adjusted branch. The modification flag was not being set when making modifications deep in a tree, so parent trees were not updated to contain the modified tree. Seems to have exposed another bug where the wrong filename gets grafted in. This commit was sponsored by Brock Spratlen on Patreon.	2016-10-10 15:00:45 -04:00
Joey Hess	933bc5c917	Support using v3 repositories without upgrading them to v5. An easy change now that supportedVersions is a list. Since v3 and v5 are identical other than version number, just add v3 to the list. This commit was sponsored by andrea rota.	2016-10-05 16:53:09 -04:00
Joey Hess	f867fc157f	When auto-upgrading a v3 remote, avoid upgrading to version 6, instead keep it at version 5. Fixes a bug introduced with v6 mode that I didn't notice until now. Probably not many v3 repos left out there, and upgrading them to v6 mode is not disastrous, only a little premature. This commit was sponsored by Riku Voipio	2016-10-05 16:23:09 -04:00
Joey Hess	34530e59d9	Avoid using a lot of memory when large objects are present in the git repository .. and have to be checked to see if they are a pointer to an annexed file. Cases where such memory use could occur included, but were not limited to: - git commit -a of a large unlocked file (in v5 mode) - git-annex adjust when a large file was checked into git directly Generally, any use of catKey was a potential problem. Fix by using git cat-file --batch-check to check size before catting. This adds another git batch process, which is included in the CatFileHandle for simplicity. There could be performance impact, anywhere catKey is used. Particularly likely to affect adjusted branch generation speed, and operations on unlocked files in v6 mode. Hopefully since the --batch-check and --batch read the same data, disk buffering will avoid most overhead. Leaving only the overhead of talking to the process over the pipe and whatever computation --batch-check needs to do. This commit was sponsored by Bruno BEAUFILS on Patreon.	2016-10-05 15:24:13 -04:00
Joey Hess	aacd9b190d	Linux standalone: Include locale files in the bundle, and generate locale definition files for the locales in use when starting runshell. Currently only done for utf-8 locales because the charset can easily be told for those. Other locales don't include the charset in their name. The locale definition is generated under git-annex.linux/locales. So, this only works if the user can write there. If locale generation fails for any reason, it's silently skipped. The git-annex-standalone.deb installs the bundle under /usr, so this locale generation won't work for non-root users.	2016-10-04 16:37:43 -04:00
Joey Hess	c079811226	Linux standalone: Add back the LOCPATH=/dev/null hack to avoid the system locale-archive being read. Version mismatches between the system locale-archive and the glibc in the bundle have been observed to cause git crashes. Unfortunately, this causes locales to not be used in the linux standalone bundle, as was the case until version 6.20160419. glibc hardcodes the path to /usr/lib/locale/locale-archive and does not let an environment variable cause a different locale-archive file to be used. The only other option to include locales in the bundle would be to include exploded locale definition directories in the bundle for a number of locales, generated by localedef. But these take at least 300 kb per locale, and there are a great many locales; it would be hundreds of megabytes to include them all. (Hmm, we could include localedef in the bundle, and check LANG in runshell and compile the locale directories on the fly. This would need /usr/share/i18n/ and /usr/lib/locale-archive to be included in the bundle. It's.. doable.) I know this is going to once again cause users of the bundle to complain that eg, ls doesn't show their unicode filenames right. Better than strange crashes though.	2016-10-04 12:53:09 -04:00
Joey Hess	5bf4623a1d	allow multiple concurrent external special remote processes Multiple external special remote processes for the same remote will be started as needed when using -J. This should not break any existing external special remotes, because running multiple git-annex commands at the same time could already start multiple processes for the same external special remotes.	2016-09-30 14:29:02 -04:00
Joey Hess	28c6209f55	Make --json-progress output be shown even when the size of a object is not known.	2016-09-29 16:59:48 -04:00
Joey Hess	161d891508	Add "total-size" field to --json-progress output.	2016-09-29 16:29:54 -04:00
Joey Hess	7cae6c746c	Optimised git-annex branch log file timestamp parsing. 10% speedup This sped up git annex find --not --in web from 6.64s to 5.69s. The optimised parser is probably more like 50% faster than the general one it replaced.	2016-09-29 14:04:53 -04:00
Joey Hess	1cd02762bf	Optimisations to git-annex branch query and setting, avoiding repeated copies of the environment. Speeds up commands like "git-annex find --in remote" by over 50%. Profiling showed that adjustGitEnv was 21% of the time and 37% of the allocations of that command. It copied the environment each time with getEnvironment. The only repeated use of adjustGitEnv is in withIndexFile, which tends to be run at least once per file. So, it was optimised by keeping a cache of the environment, which can be reused. There could be other better ways to optimise this. Maybe get the whole environment once at startup. But, then it would have to be serialized back out each time running a child process, so I doubt that would be a net win. It might be better to cache a version of the environment that is pre-modified to use .git-annex/index. But, profiling doesn't show that modifying the environment is taking any significant time.	2016-09-29 13:36:48 -04:00
Joey Hess	8794dcf27b	Optimisations to time it takes git-annex to walk working tree and find files to work on. Sped up by around 18%. key2file and file2key were top cost centers according to profiling. The repeated use of replace was not efficient. This new approach is quite a lot more efficient. This commit was sponsored by Denis Dzyubenko on Patreon.	2016-09-26 16:48:57 -04:00
Joey Hess	1678510680	prep release	2016-09-23 09:45:46 -04:00
Joey Hess	4b26aee92c	Revert "stack.yaml: Update to lts-7.0 (ghc 8)" This reverts commit `e181603103`. This broke the i386ancient autobuilder due to its use of --flag git-annex:XMPP --flag=git-annex:dbus -- Failure when adding dependencies: fdo-notify: needed ((>=0.3)), stack configuration has no specified version (latest applicable is 0.3.1) gnutls: needed ((>=0.1.4)), stack configuration has no specified version (latest applicable is 0.2) network-protocol-xmpp: needed (-any), stack configuration has no specified version (latest applicable is 0.4.8) OSX autobuilder also seems hosed by it, so too soon. De-revert later..	2016-09-21 18:01:23 -04:00
Joey Hess	c910004d50	addurl, importfeed: Improve behavior when file being added is gitignored.	2016-09-21 17:21:48 -04:00
Joey Hess	a569f195b7	fix bugs in handling of deep branches with sync and adjusted branches * sync: Previously, when run in a branch with a slash in its name, such as "foo/bar", the sync branch was "synced/bar". That conflicted with the sync branch used for branch "bar", so has been changed to "synced/foo/bar". * adjust: Previously, when adjusting a branch with a slash in its name, such as "foo/bar", the adjusted branch was "adjusted/bar(unlocked)". That conflicted with the adjusted branch used for branch "bar", so has been changed to "adjusted/foo/bar(unlocked)" * Also, running sync in an adjusted branch did not correctly sync changes back to the parent branch when it had a slash in its name. This bug has been fixed. Eliminate use of Git.Ref.under and Git.Ref.basename; using Git.Ref.underBase and Git.Ref.base make everything handle deep branches correctly. Probably no one was adjusting deep branches, and v6 is still experimental anyway, so I'm not going to worry about the mess that was left by that bug. In the case of git-annex sync, using a fixed git-annex with an old unfixed one will mean they use different sync branches for a deep branch, and so they may stop syncing until the old one is upgraded. However, that's only a problem when syncing between repositories without going via a central bare repository. Added a warning about this to the CHANGELOG, but it's probably not going to affect many people at all. This commit was sponsored by Riku Voipio.	2016-09-21 15:23:47 -04:00
Joey Hess	0e30e71e9c	info: Support being passed a treeish, and show info about the annexed files in it similar to how a directory is handled.	2016-09-15 12:51:00 -04:00
Joey Hess	e181603103	stack.yaml: Update to lts-7.0 (ghc 8) A few of these extra-deps are setting versions to work around various library dep issues with ghc 8.	2016-09-15 00:37:05 -04:00
Joey Hess	ec3558fb79	Improve gpg secret key list parser to deal with changes in gpg 2.1.15. Fixes key name display in webapp. gpg 2.1.15 (or so) seems to have added some new fields to the --with-colons --list-secret-keys output. These include "fpr" and "grp", and come before the "uid" line. So, the parser was giving up before it saw the name. Fix by continuing to look for the uid line until the next "sec" line. This commit was sponsored by Ole-Morten,Duesund on Patreon.	2016-09-14 13:31:00 -04:00
Joey Hess	3e22d60549	copy, move, mirror: Support --json and --json-progress.	2016-09-09 16:24:26 -04:00
Joey Hess	05d4438383	addurl, get: Added --json-progress option, which adds progress objects to the json output. This doesn't work right when used with -J yet, and there is some really ugly hand-crafting of part of the json output.	2016-09-09 15:06:54 -04:00
Joey Hess	f421a7f001	Remove key:null from git-annex add --json output.	2016-09-09 14:26:34 -04:00
Joey Hess	089c592977	buffer json output until done when in concurrent mode	2016-09-09 13:21:38 -04:00
Joey Hess	e0fae28c72	Rate limit console progress display updates to 10 per second. Was updating as frequently as changes were reported, up to hundreds of times per second, which used unnecessary bandwidth when running git-annex over ssh etc.	2016-09-08 13:17:43 -04:00
Joey Hess	ad0a7f6cb3	prep release	2016-09-07 11:12:33 -04:00
Joey Hess	31289da691	get -J: Download different files from different remotes when the remotes have the same costs. Only done in -J mode because only if there's concurrency can downloading from two remotes be faster. Without concurrency, it's likely the case that sequential downloads from the same remote are faster than switching back and forth between two remotes. There is some hairy MVar code here, but basically it just keeps the activeremotes MVar full except when deciding which remote to assign to a thread. Also affects gets by sync --content -J This commit was sponsored by Jochen Bartl.	2016-09-06 12:45:21 -04:00
Joey Hess	dd0dff9dc4	Assistant, repair: Filter out git fsck lines about duplicate file entries in tree objects.	2016-09-05 16:08:49 -04:00
Joey Hess	219e2fa157	Make --json and --quiet suppress automatic init messages And any other messages that might be output before a command starts. Fixes a reversion introduced in version 5.20150727. During the optparse-applicative conversion, I needed a place to run per-command global option setters, and I made it get run during the seek stage. But that is too late to have --json and --quiet disable output produced in the check stage. Fix is just to run those per-command global option setters at the same time as the all-command global option setters. This commit was sponsored by Thom May.	2016-09-05 15:34:38 -04:00
Joey Hess	49b3ef88f7	Android: Fix disabling use of cp --reflink=auto, curl, sha224, and sha384. This was originally done in `a7ef05a9`, but got lost in some change to the Makefile. Use CROSS_COMPILE=Android to tell configure that it's configuring for android instead of passing it a parameter.	2016-09-05 14:11:35 -04:00
Joey Hess	5d70eaacaf	examinekey: Allow being run in a git repo that is not initialized by git-annex yet. No reason not to; indeed there's no real reason to need a git repository at all except the implementation uses the Annex monad.	2016-09-05 12:26:59 -04:00
Joey Hess	f1a9c5f248	Fix formatting of git-annex-smudge man page, and improve mdwn2man. Thanks, Jim Paris.	2016-09-05 12:16:03 -04:00
Joey Hess	f292f78366	Windows: Handle shebang in external special remote program.	2016-09-05 12:09:23 -04:00
Joey Hess	3752426ca1	releasing package git-annex version 6.20160808	2016-08-08 11:57:09 -04:00
Joey Hess	f461bcae4b	Re-enable accumulating transfer failure log files for command-line actions This was disabled in commit `61ccf95004`, because only the assistant used them, and they were clutter. But, now --failed also uses them. Remove the failure log files after successful transfers. Should avoid most of the clutter problems. Commit `61ccf95004` mentions a subtle behavior change, which has now been reverted: There is one behavior change from this. If glacier is being used, and a manual git annex get --from glacier fails because the file isn't available yet, the assistant will no longer later see that failed transfer file and retry the get.	2016-08-03 13:41:07 -04:00
Joey Hess	1a0e2c9901	get, move, copy, mirror: Added --failed switch which retries failed copies/moves Note that get --from foo --failed will get things that a previous get --from bar tried and failed to get, etc. I considered making --failed only retry transfers from the same remote, but it was easier, and seems more useful, to not have the same remote requirement. Noisy due to some refactoring into Types/	2016-08-03 12:37:12 -04:00
Joey Hess	f0886a1bdd	info: When run on a file now includes an indication of whether the content is present locally.	2016-07-30 12:29:59 -04:00
Joey Hess	bf3327ff25	Added metadata --batch option, which allows getting, setting, deleting, and modifying metadata for multiple files/keys.	2016-07-27 10:46:25 -04:00
Joey Hess	e5225f08fc	When built with uuid-1.3.12, generate more random UUIDs than before Use nextRandom to generate the random UUID, rather than using randomIO. This gets fixes for the following two bugs in the uuid library. However, this did not impact git-annex much, so a hard dependency has not been added on uuid-1.3.12. https://github.com/aslatter/uuid/issues/15 "v4 UUIDs are not that random" This doesn't greatly affect git-annex, because even with only 2^64 possible UUIDs, the chance that two git-annex repositories that are clones of the same git repo get the same UUID is minuscule. And, git-annex generates only one UUID per run, so predicting subsequent UUIDs is not a problem. https://github.com/aslatter/uuid/issues/16 "Remove Random instance for UUID, or mark it as deprecated" git-annex was using that instance; let's stop before it gets deprecated or removed.	2016-07-27 07:46:08 -04:00
Joey Hess	870873bdaa	Removed dependency on json library; all JSON is now handled by aeson. I've eyeballed all --json commands, and the only difference should be that some fields are re-ordered.	2016-07-26 19:15:34 -04:00
Joey Hess	8bc8469c38	saner format for metadata --json metadata --json output format has changed, adding a inner json object named "fields" which contains only the fields and their values. This should be easier to parse than the old format, which mixed up metadata fields with other keys in the json object. Any consumers of the old format will need to be updated. This adds a dependency on unordered-containers for parsing MetaData from JSON, but it's a free dependency; aeson pulls in that library.	2016-07-26 15:41:04 -04:00
Joey Hess	d344f04d09	cabal constraints for aws and esqueleto closes https://github.com/joeyh/git-annex/pull/55 * git-annex.cabal: Temporarily limit to http-conduit <2.2.0 since aws 0.14.0 is not compatible with the newer version. * git-annex.cabal: Temporarily limit to persistent <2.5 since esqueleto 2.4.3 is not compatible with the newer version.	2016-07-22 12:41:28 -04:00
Joey Hess	bf8bf14e8e	--branch, stage 1 Added --branch option to copy, drop, fsck, get, metadata, mirror, move, and whereis commands. This option makes git-annex operate on files that are included in a specified branch (or other treeish). The names of the files from the branch that are being operated on are not displayed yet; only the keys. Displaying the filenames will need changes to every affected command. Also, note that --branch can be specified repeatedly. This is not really documented, but seemed worth supporting, especially since we may later want the ability to operate on all branches matching a refspec. However, when operating on two branches that contain the same key, that key will be operated on twice.	2016-07-20 12:05:26 -04:00
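
The deep-branch naming change described in commit a569f195b7 can be modelled with a short sketch. This is Python written only for illustration, not git-annex's Haskell code; the function names `old_sync_branch` and `new_sync_branch` are hypothetical, and they reproduce only the naming behavior stated in the commit message.

```python
def old_sync_branch(branch: str) -> str:
    """Hypothetical model of the pre-fix naming: keep only the last component."""
    return "synced/" + branch.rsplit("/", 1)[-1]


def new_sync_branch(branch: str) -> str:
    """Hypothetical model of the post-fix naming: keep the whole branch name."""
    return "synced/" + branch


if __name__ == "__main__":
    # "bar" and "foo/bar" collide under the old scheme but not the new one.
    for name in ("bar", "foo/bar"):
        print(name, old_sync_branch(name), new_sync_branch(name))
```

Under the old scheme both "bar" and "foo/bar" map to "synced/bar", which is the conflict the commit message reports; keeping the full name gives each branch its own sync branch.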

94 commits
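
The parser fix described in commit ec3558fb79 (keep looking for the "uid" line until the next "sec" line) can be sketched as follows. This is an illustrative Python model, not the actual git-annex parser; the function name `secret_key_names` and the sample records are made up. It assumes the documented colon-separated layout in which field 5 of a "sec" record is the key ID and field 10 of a "uid" record is the user ID, and it does not unescape characters in the user ID.

```python
from typing import Dict, Optional


def secret_key_names(colon_output: str) -> Dict[str, str]:
    """Map each secret key ID to the first user ID that follows its sec line."""
    names: Dict[str, str] = {}
    current: Optional[str] = None
    for line in colon_output.splitlines():
        fields = line.split(":")
        record = fields[0]
        if record == "sec":
            # A new key starts; remember its ID (field 5, index 4).
            current = fields[4] if len(fields) > 4 else None
        elif record == "uid" and current is not None and current not in names:
            # Records such as fpr and grp may sit between sec and uid;
            # they are skipped instead of ending the search.
            if len(fields) > 9:
                names[current] = fields[9]
    return names


if __name__ == "__main__":
    # Synthetic sample, not real gpg output.
    sample = "\n".join([
        ":".join(["sec", "u", "4096", "1", "AAAAAAAAAAAAAAAA"]),
        ":".join(["fpr"] + [""] * 8 + ["FINGERPRINT"]),
        ":".join(["grp"] + [""] * 8 + ["KEYGRIP"]),
        ":".join(["uid"] + [""] * 8 + ["Alice Example"]),
    ])
    print(secret_key_names(sample))
```

A parser that gave up at the first record after "sec" that was not "uid" would return no name for this sample, which matches the symptom the commit message describes.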