git-annex

Author	SHA1	Message	Date
Joey Hess	e53070c1ff	inheritable annex.securehashesonly * init: When annex.securehashesonly has been set with git-annex config, copy that value to the annex.securehashesonly git config. * config --set: As well as setting value in git-annex branch, set local gitconfig. This is needed especially for annex.securehashesonly, which is read only from local gitconfig and not the git-annex branch. doc/todo/sha1_collision_embedding_in_git-annex_keys.mdwn has the rationale for doing it this way. There's no perfect solution; this seems to be the least-bad one. This commit was supported by the NSF-funded DataLad project.	2017-02-27 16:08:23 -04:00
Joey Hess	9db064f50c	reorg	2017-02-27 15:04:03 -04:00
Joey Hess	49114cf4ea	securehash matching Added --securehash option to match files using a secure hash function, and corresponding securehash preferred content expression. This commit was sponsored by Ethan Aubin.	2017-02-27 15:02:44 -04:00
Joey Hess	942e0174b3	make fsck check annex.securehashesonly, and new tip for working around SHA1 collisions with git-annex This commit was sponsored by andrea rota.	2017-02-27 13:55:15 -04:00
Joey Hess	07f1e638ee	annex.securehashesonly Cryptographically secure hashes can be forced to be used in a repository, by setting annex.securehashesonly. This does not prevent the git repository from containing files with insecure hashes, but it does prevent the content of such files from being pulled into .git/annex/objects from another repository. We want to make sure that at no point does git-annex accept content into .git/annex/objects that is hashed with an insecure key. Here's how it was done: * .git/annex/objects/xx/yy/KEY/ is kept frozen, so nothing can be written to it normally * So every place that writes content must call thawContent or modifyContent. We can audit for these, and be sure we've considered all cases. * The main functions are moveAnnex and linkToAnnex; these were made to check annex.securehashesonly, and are the main security boundary for annex.securehashesonly. * Most other calls to modifyContent deal with other files in the KEY directory (inode cache etc). The other ones that mess with the content are: - Annex.Direct.toDirectGen, in which content already in the annex directory is moved to the direct mode file, so not relevant. - fix and lock, which don't add new content - Command.ReKey.linkKey, which manually unlocks it to make a copy. * All other calls to thawContent appear safe. Made moveAnnex return a Bool, so checked all callsites and made them deal with a failure in appropriate ways. linkToAnnex simply returns LinkAnnexFailed; all callsites already deal with it failing in appropriate ways. This commit was sponsored by Riku Voipio.	2017-02-27 13:33:59 -04:00
Joey Hess	40327cab6e	Removed support for building with the old cryptohash library. Building with that library made git-annex not support SHA3; it's time for that to always be supported in case SHA2 dominoes.	2017-02-24 20:56:26 -04:00
Joey Hess	6b52fcbb7e	SHA1 collisions in key names was more exploitable than I thought Yesterday's SHA1 collision attack could be used to generate eg: SHA256-sfoo--whatever.good SHA256-sfoo--whatever.bad Such that they collide. A repository with the good one could have the bad one swapped in and signed commits would still verify. I've already mitigated this.	2017-02-24 19:54:36 -04:00
Joey Hess	9de0767d0e	update	2017-02-24 12:31:23 -04:00
Joey Hess	35739a74c2	make file2key reject E* backend keys with a long extension I am not happy that I had to put backend-specific code in file2key. But it would be very difficult to avoid this layering violation. Most of the time, when parsing a Key from a symlink target, git-annex never looks up its Backend at all, so adding this check to a method of the Backend object would not work. The Key could be made to contain the appropriate Backend, but since Backend is parameterized on an "a" that is fixed to the Annex monad later, that would need Key to change to "Key a". The only way to clean this up that I can see would be to have the Key contain a LowlevelBackend, and put the validation in LowlevelBackend. Perhaps later, but that would be an extensive change, so let's not do it in this commit which may want to cherry-pick to backports. This commit was sponsored by Ethan Aubin.	2017-02-24 11:22:15 -04:00
Joey Hess	102e04b30c	typo	2017-02-24 00:29:37 -04:00
Joey Hess	60d99a80a6	Tighten key parser to not accept keys containing non-numeric fields, which could be used to embed data useful for a SHA1 attack against git. Also todo about why this is important, and with some further hardening to add. This commit was sponsored by Ignacio on Patreon.	2017-02-24 00:17:25 -04:00
Joey Hess	75a15e1ad7	status: Pass --ignore-submodules=when option on to git status. Didn't make --ignore-submodules without a value be handled because I can't see a way to make optparse-applicative parse that. I've opened a bug requesting a way to do that: https://github.com/pcapriotti/optparse-applicative/issues/243	2017-02-20 17:01:24 -04:00
Joey Hess	7a0d6d81a0	make curl show http errors to stderr * Run curl with -S, so HTTP errors are displayed, even when it's otherwise silent. * When downloading in --json or --quiet mode, use curl in preference to wget, since curl is able to display only errors to stderr, unlike wget. This does mean that downloadQuiet is only silent on stdout, not necessarily on stderr, which affects a couple other calls of it. For example, downloading the .git/config of a http remote may show an error message now, perhaps with slightly suboptimal formatting due to other output.	2017-02-20 16:09:32 -04:00
Joey Hess	4a397b5313	Run wget with -nv instead of -q, so it will display HTTP errors. This adds one extra line of output when a download is successful, after the progress bar. I don't much like that, but wget does not provide a way to show HTTP errors without it.	2017-02-20 15:25:02 -04:00
Joey Hess	a13c0ce66c	adjust: Fix behavior when used in a repository that contains submodules. Also fixed the LsFiles parser to not assume its output has a fixed width type field.	2017-02-20 13:44:55 -04:00
Joey Hess	c5cf5cf03a	git-annex.cabal: Make crypto-api a dependency even when built w/o webapp and test suite. The p2p code made it always be needed. This commit was sponsored by Anthony DeRobertis on Patreon.	2017-02-20 12:21:35 -04:00
Joey Hess	e6857e75a6	sync hack to make updateInstead work on eg FAT sync: When syncing with a local repository located on a crippled filesystem, run the post-receive hook there, since it wouldn't get run otherwise. This makes pushing to repos on FAT-formatted removable drives update them when receive.denyCurrentBranch=updateInstead. Made Remote.Git export onLocal, which was cleaned up to not have so many caveats about its use. This commit was sponsored by Jeff Goeke-Smith on Patreon.	2017-02-17 15:21:52 -04:00
Joey Hess	d074532aff	post-receive hook to make updateInstead work in direct mode and adjusted branches * Added post-receive hook, which makes updateInstead work with direct mode and adjusted branches. * init: Set up the post-receive hook. This commit was sponsored by Fernando Jimenez on Patreon.	2017-02-17 14:04:43 -04:00
Joey Hess	d0651bb567	make query commands not output extraneous messages config group groupwanted numcopies schedule wanted required: Avoid displaying extraneous messages about repository auto-init, git-annex branch merging, etc, when being used to get information.	2017-02-16 13:24:35 -04:00
Joey Hess	a73c8ce4a1	sync: Improve integration with receive.denyCurrentBranch=updateInstead By displaying error messages from the remote when it fails to update its checked out branch. Error messages in the default receive.denyCurrentBranch are still suppressed, which matches user expectations. This commit was sponsored by Nick Daly on Patreon.	2017-02-15 16:13:30 -04:00
Joey Hess	f07af03018	Run ssh with -n whenever input is not being piped into it ... to avoid it consuming stdin that it shouldn't. This fixes git-annex-checkpresentkey --batch remote, which didn't output results for all keys passed into it. Other git-annex commands that communicate with a remote over ssh may also have been consuming stdin that they shouldn't have, which could have impacted using them in eg, shell scripts. For example, a shell script reading files from stdin and passing them to git annex drop would be impacted by this bug, whenever git annex drop ran git-annex-shell checkpresent, it would consume part/all of the stdin that the shell script was supposed to consume. Fixed by adding a ConsumeStdin parameter to Annex.Ssh.sshOptions, which is used throughout git-annex to run ssh (in order for ssh connection caching to work). Every call site was checked to see if it used CreatePipe for stdin, and if not was marked NoConsumeStdin.	2017-02-15 15:08:46 -04:00
Joey Hess	69baa45f14	sync, merge: Fail when the current branch has no commits yet, instead of not merging in anything from remotes and appearing to succeed. At first I wanted to make it go ahead and merge into the newborn branch, so made it use Git.Branch.currentUnsafe to get the current branch. But that failed: fatal: ambiguous argument 'refs/heads/master..refs/heads/synced/master': unknown revision or path not in the working tree. A whole nother code path to handle merging into newborn branches seemed excessive, so went with displaying a warning and propagating failure status. This commit was sponsored by Brock Spratlen on Patreon.	2017-02-14 16:09:55 -04:00
Joey Hess	95390f0c27	releasing package git-annex version 6.20170214	2017-02-14 14:56:11 -04:00
Joey Hess	3b22ad9f47	Work around sqlite's incorrect handling of umask when creating databases. Refactored some common code into initDb. This only deals with the problem when creating new databases. If a repo got bad permissions into it, it's up to the user to deal with it. This commit was sponsored by Ole-Morten Duesund on Patreon.	2017-02-13 17:39:16 -04:00
Joey Hess	976676a7b0	S3: Fix check of uuid file stored in bucket, which was not working. The check was broken in two ways. First, nowhere did it error out when checkUUIDFile found a different UUID already in the file. Instead, it overwrote the uuid file. And, checkUUIDFile's implementation was for some reason always failing with a ConnectionClosed exception. Apparently something to do with using two different runResourceT's and a response getting GCed in between. I'm pretty sure that used to work, but changed to a more obviously correct implementation. This commit was sponsored by Peter Hogg on Patreon.	2017-02-13 15:35:24 -04:00
Edward Betts	0750913136	correct spelling mistakes	2017-02-12 17:30:23 -04:00
Joey Hess	5e6ced7d0f	Improve pid locking code to work on filesystems that don't support hard links. Probing for hard link support in the pid locking code is redundant since git-annex init already probes that. But, it didn't seem worth threading that data through; the pid locking code runs at most once per git-annex process, and only on unusual filesystems. Optimising a single hard link and unlink isn't worth it. This commit was sponsored by Francois Marier on Patreon.	2017-02-10 15:22:28 -04:00
Joey Hess	e2c98f5788	Added git template directory to Linux standalone tarball and OSX app bundle. Git does not provide a switch to find out where this directory is, and while the git-init man page says it will always be in /usr/share/git-core/templates, that's not the case on OSX with git installed from homebrew. So, I used a hack taking the --man-path and constructing a path from that. Works on both Debian and OSX at least.	2017-02-10 13:55:54 -04:00
Joey Hess	c1ece47ea0	import --reinject-duplicates This is the same as running git annex reinject --known, followed by git-annex import. The advantage to having it in one command is that it only has to hash each file once; the two commands have to hash the imported files a second time. This commit was sponsored by Shane-o on Patreon.	2017-02-09 15:41:00 -04:00
Joey Hess	f617988a29	Make import --deduplicate and --skip-duplicates only hash once, not twice import: --deduplicate and --skip-duplicates were implemented inefficiently; they unnecessarily hashed each file twice. They have been improved to only hash once. The new approach is to lock down (minimally) and hash files, and then reuse that information when importing them. This was rather tricky, especially in detecting changes to files while they are being imported. The output of import changed slightly. While before it silently skipped over files with eg --skip-duplicates, now it shows each file as it starts to act on it. Since every file is hashed first thing, it would otherwise not be clear what file import is chewing on. (Actually, it wasn't clear before when any of the duplicates switches were used.) This commit was sponsored by Alexander Thompson on Patreon.	2017-02-09 15:32:22 -04:00
Joey Hess	e7e36b6e72	import: Changed how --deduplicate, --skip-duplicates, and --clean-duplicates determine if a file is a duplicate Before, only content known to be present somewhere was considered a duplicate. Now, any content that has been annexed before will be considered a duplicate, even if all annexed copies of the data have been lost. Note that --clean-duplicates and --deduplicate still check numcopies, so won't delete duplicate files unless there's an annexed copy. This makes import use the same method as reinject --known. The man page already said that duplicate meant "its content is either present in the local repository already, or git-annex knows of another repository that contains it, or it was present in the annex before but has been removed now". So, this is really only bringing the implementation into line with the man page. This commit was sponsored by Jochen Bartl on Patreon.	2017-02-07 17:41:58 -04:00
Joey Hess	27e89aeffc	initremote: When a uuid= parameter is passed, use the specified UUID for the new special remote, instead of generating a UUID. This can be useful in some situations, eg when the same data can be accessed via two different special remote backends.	2017-02-07 15:10:41 -04:00
Joey Hess	3439f3cc87	assistant: Make --autostart --foreground wait for the children it starts. Before, the --foreground was ignored when autostarting. This commit was sponsored by Denis Dzyubenko on Patreon.	2017-02-07 13:31:45 -04:00
Joey Hess	655f707990	Fix build with aws 0.16. Thanks, aristidb.	2017-02-07 13:01:57 -04:00
Joey Hess	3fe9d99f24	wormhole pairing appid flag day 2021-12-31 Wormhole pairing will start to provide an appid to wormhole on 2021-12-31. An appid can't be provided now because Debian stable is going to ship an older version of git-annex that does not provide an appid. Assumption is that by 2021-12-31, this version of git-annex will be shipped in a Debian stable release. If that turns out to not be the case, this change will need to be cherry-picked into the git-annex in Debian stable, or its wormhole pairing will break. This commit was sponsored by Thomas Hochstein on Patreon.	2017-02-03 15:06:40 -04:00
Joey Hess	06f307ad13	lost a changelog entry; put back	2017-02-03 14:40:53 -04:00
Joey Hess	b77903af48	New annex.synccontent config setting .. which can be set to true to make git annex sync default to --content. This may become the default at some point in the future. As well as being configurable by git config, it can be configured by git-annex config to control the default behavior in all clones of a repository. Had to add a separate --no-content switch so we can tell if it's been explicitly set, and should override annex.synccontent. If --content was the default, this complication would not be necessary. This commit was sponsored by Jake Vosloo on Patreon.	2017-02-03 14:31:17 -04:00
Joey Hess	ed56dba868	annex.autocommit can be configured via git-annex config ... to control the default behavior in all clones of a repository. This includes a new Configurable data type, so the GitConfig type indicates which values can be configured this way. The implementation should be quite efficient; the config log is only read once, and only when a Configurable value has not already been set by git-config. Indeed, it would be nice in the future to extend this, so that git-config is itself only read on demand. Some commands may not need to look at the git configuration at all. This commit was sponsored by Trenton Cronholm on Patreon.	2017-02-03 13:58:53 -04:00
Joey Hess	ed60f60e9b	unused: Improved memory use significantly when there are a lot of differences between branches. Argh, didn't need an accumulator here! I think I use accumulators a lot more than I need to when recursively processing lists. This commit was sponsored by Jeff Goeke-Smith on Patreon.	2017-01-31 19:42:00 -04:00
Joey Hess	062286135c	unused: When large files are checked right into git, avoid buffering their contents in memory. This makes it a little bit slower since it has to check file size, but worth it to fix a potential memory use problem. This commit was sponsored by Fernando Jimenez on Patreon.	2017-01-31 19:09:37 -04:00
Joey Hess	9eb10caa27	Some optimisations to string splitting code. Turns out that Data.List.Utils.split is slow and makes a lot of allocations. Here's a much simpler single character splitter that behaves the same (even in wacky corner cases) while running in half the time and 75% the allocations. As well as being an optimisation, this helps move toward eliminating use of missingh. (Data.List.Split.splitOn is nearly as slow as Data.List.Utils.split and allocates even more.) I have not benchmarked the effect on git-annex, but would not be surprised to see some parsing of eg, large streams from git commands run twice as fast, and possibly in less memory. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2017-01-31 19:06:22 -04:00
Joey Hess	3300911b14	lts-7.18 finally! esqueleto finally got fixed, thanks to @bitemyapp Since XMPP was removed, the previous build failures related to it should no longer be a problem either. Meanwhile, lts-5.18 fails to build anymore on Debian due to linker hardening breaking the version of ghc stack uses with that version. This commit was sponsored by Francois Marier on Patreon.	2017-01-31 12:27:08 -04:00
Joey Hess	339464e847	config: New command for storing configuration in the git-annex branch. Any config names can be set using this; git-annex commands will only look at specific ones that make sense and are worth the overhead of querying the branch. This might also be useful for storing whatever other config-type stuff the user might want to shove into the git-annex branch. This commit was sponsored by Jochen Bartl on Patreon.	2017-01-30 16:46:38 -04:00
Joey Hess	26d23e38f1	vicfg: Include the numcopies configuration. Docs say vicfg can configure everything from git-annex branch, so it ought to configure numcopies. Note that commenting out existing numcopies does not unset it. This commit was sponsored by Thom May on Patreon.	2017-01-30 15:27:25 -04:00
Joey Hess	280442ca2c	Remove -j short option for --json-progress; that option was already taken for --json. This commit was sponsored by Trenton Cronholm.	2017-01-30 12:46:42 -04:00
Joey Hess	f275caf732	Increase default cost for p2p remotes from 200 to 1000. This makes git-annex prefer transferring data from special remotes when possible.	2017-01-06 15:23:30 -04:00
Joey Hess	8740cd9716	releasing package git-annex version 6.20170101	2016-12-31 23:59:56 -04:00
Joey Hess	10e4d93212	Support all common locations of the torrc file.	2016-12-28 15:12:31 -04:00
Joey Hess	b68d2a4b68	webapp: full wormhole pairing UI (untested) This commit was sponsored by Riku Voipio.	2016-12-27 16:41:35 -04:00
Joey Hess	8484c0c197	Always use filesystem encoding for all file and handle reads and writes. This is a big scary change. I have convinced myself it should be safe. I hope!	2016-12-24 14:46:31 -04:00

202 commits