git-annex

Author	SHA1	Message	Date
Joey Hess	7eb9889bfd	track exported files in a sqlite database Went with a separate db per export remote, rather than a single export database. Mostly because there will probably not be a lot of separate export remotes, and it might be convenient to be able to delete a given remote's export database. This commit was supported by the NSF-funded DataLad project.	2017-09-04 13:53:08 -04:00
Joey Hess	28e2cad849	implement exporttree=yes configuration * Only export to remotes that were initialized to support it. * Prevent storing key/value on export remotes. * Prevent enabling exporttree=yes and encryption in the same remote. SetupStage Enable was changed to take the old RemoteConfig. This allowed only setting exporttree when initially setting up a remote, and not configuring it later after stuff might already be stored in the remote. Went with =yes rather than =true for consistency with other parts of git-annex. Changed docs accordingly. This commit was supported by the NSF-funded DataLad project.	2017-09-04 13:09:38 -04:00
Joey Hess	a4328b49d2	refactor ExportActions This will allow disabling exports for remotes that are not configured to allow them. Also, exportSupported will be useful for the external special remote to probe. This commit was supported by the NSF-funded DataLad project	2017-09-01 13:05:09 -04:00
Joey Hess	5483ea90ec	graft exported tree into git-annex branch So it will be available later and elsewhere, even after GC. I first thought to use git update-index to do this, but feeding it a line with a tree object seems to always cause it to generate a git subtree merge. So, fell back to using the Git.Tree interface to manipulate the trees, and not involving the git-annex branch index file at all. This commit was sponsored by Andreas Karlsson.	2017-08-31 18:06:49 -04:00
Joey Hess	978885247e	implement export.log and resolve export conflicts Incremental export updates work now too. This commit was sponsored by Anthony DeRobertis on Patreon.	2017-08-31 15:47:23 -04:00
Joey Hess	7c7af82578	resuming exports Make a pass over the whole exported tree, and upload anything that has not yet reached the export. Update location log when exporting. Note that the synthesized keys for non-annexed files are stored in the location log too. Some cases involving files in the tree with the same content are not handled correctly yet. This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.	2017-08-31 13:33:50 -04:00
Joey Hess	e662aceeac	improve type	2017-08-31 12:47:08 -04:00
Joey Hess	4694e49158	fix error message when content to export is not locally available	2017-08-31 12:39:10 -04:00
Joey Hess	9f3630f4e0	initial export command Very basic operation works, but of course this is only the beginning. This commit was sponsored by Nick Daly on Patreon.	2017-08-29 15:10:01 -04:00
Joey Hess	5f732717d8	toFeed was unused so remove	2017-08-28 12:51:25 -04:00
Joey Hess	ee2f096e3b	Support building with feed-1.0, while still supporting older versions. This commit was sponsored by Jeff Goeke-Smith on Patreon.	2017-08-28 12:29:28 -04:00
Joey Hess	d39c120afa	add annex-ignore-command and annex-sync-command configs Added remote configuration settings annex-ignore-command and annex-sync-command, which are dynamic equivalents of the annex-ignore and annex-sync configurations. For this I needed a new DynamicConfig infrastructure. Its implementation should be as fast as before when there is no dynamic config, and it caches so shell commands are only run once. Note that annex-ignore-command exits nonzero when the remote should be ignored. While that may seem backwards, it allows using the same command for it as for annex-sync-command when you want to disable both. This commit was sponsored by Trenton Cronholm on Patreon.	2017-08-17 13:54:14 -04:00
Joey Hess	2eb6309d3e	move, copy: Support --batch.	2017-08-15 12:39:10 -04:00
Joey Hess	2cecc8d2a3	Added GIT_ANNEX_VECTOR_CLOCK environment variable Can be used to override the default timestamps used in log files in the git-annex branch. This is a dangerous environment variable; use with caution. Note that this only affects writing to the logs on the git-annex branch. It is not used for metadata in git commits (other env vars can be set for that). There are many other places where timestamps are still used, that don't get committed to git, but do touch disk. Including regular timestamps of files, and timestamps embedded in some files in .git/annex/, including the last fsck timestamp and timestamps in transfer log files. A good way to find such things in git-annex is to grep for getPOSIXTime and getCurrentTime, although some of the results are of course false positives that never hit disk (unless git-annex gets swapped out..) So this commit does NOT necessarily make git-annex comply with some HIPAA privacy regulations; it's up to the user to determine if they can use it in a way compliant with such regulations. Benchmarking: It takes 0.00114 milliseconds to call getEnv "GIT_ANNEX_VECTOR_CLOCK" when that env var is not set. So, 100 thousand log files can be written with an added overhead of only 0.114 seconds. That should be by far swamped by the actual overhead of writing the log files and making the commit containing them. This commit was supported by the NSF-funded DataLad project.	2017-08-14 14:19:58 -04:00
Joey Hess	81a861326d	fsck: Support --json. One use case is to get a list of files that fsck fails on, in order to eg, drop them from a remote. This commit was sponsored by Nick Daly on Patreon.	2017-06-26 13:40:57 -04:00
Joey Hess	02df5c5932	support --to=. as shorthand for --to=here	2017-06-01 13:12:42 -04:00
Joey Hess	94351daba6	configuration to disable automatic merge conflict resolution * Added annex.resolvemerge configuration, which can be set to false to disable the usual automatic merge conflict resolution done by git-annex sync and the assistant. * sync: Added --no-resolvemerge option. Note that disabling merge conflict resolution is probably not a good idea in a direct mode repo or adjusted branch. Since updates to both are done outside the usual work tree, if it fails the tree is not left in a conflicted state, and it would be hard to manually resolve the conflict. Still, made annex.resolvemerge be supported in those cases for consistency. This commit was sponsored by Riku Voipio.	2017-06-01 12:51:01 -04:00
Joey Hess	bb18026b2c	move --to=here * move --to=here moves from all reachable remotes to the local repository. The output of move --from remote is changed slightly, when the remote and local both have the content. It used to say: move foo ok Now: move foo (from theremote...) ok That was done so that, when move --to=here is used and the content is locally present and also in several remotes, it's clear which remotes the content gets dropped from. Note that move --to=here will report an error if a non-reachable remote contains the file, even if the local repository also contains the file. I think that's reasonable; the user may be intending to move all other copies of the file from remotes. OTOH, if a copy of the file is believed to be present in some repository that is not a configured remote, move --to=here does not report an error. So a little bit inconsistent, but erroring in this case feels wrong. copy --to=here came along for free, but it's basically the same behavior as git-annex get, and probably with not as good messages in edge cases (especially on failure), so I've not documented it. This commit was sponsored by Anthony DeRobertis on Patreon.	2017-05-31 17:00:18 -04:00
Joey Hess	5ee6912cf3	support parsing options like --to=here Reworked remote name parsing to allow things like that. Command.Move uses it for --to=here, although there's not yet an implementation of that option. This commit was sponsored by Ignacio on Patreon.	2017-05-31 16:49:28 -04:00
Joey Hess	a1730cd6af	adieu, MissingH Removed dependency on MissingH, instead depending on the split library. After laying groundwork for this since 2015, it was mostly straightforward. Added Utility.Tuple and Utility.Split. Eyeballed System.Path.WildMatch while implementing the same thing. Since MissingH's progress meter display was being used, I re-implemented my own. Bonus: Now progress is displayed for transfers of files of unknown size. This commit was sponsored by Shane-o on Patreon.	2017-05-16 01:03:52 -04:00
Joey Hess	0ec2f3b20f	rename to avoid name conflict	2017-05-11 18:31:14 -04:00
Joey Hess	db1600b2de	de-Maybe remoteGitConfig It's always set, so does not need to be a Maybe.	2017-05-11 16:05:01 -04:00
Joey Hess	4c1e3210fa	annex.backend is the new name for what was annex.backends It takes a single key-value backend, rather than the unnecessary and confusing list. The old option still works if set. Simplified some old old code too. This commit was sponsored by Thomas Hochstein on Patreon.	2017-05-09 15:04:07 -04:00
Joey Hess	e3184e54c9	version: Added "dependency versions" line. This commit was sponsored by Anthony DeRobertis on Patreon.	2017-04-07 18:16:11 -04:00
Joey Hess	6896ac06e8	git annex add -u now supported, analogous to git add -u Unlike git add -u, git annex add -u does not update the index for files removed from the working tree. But then, "git add ." stages removals, and "git annex add ." does not, so that's an existing divergence. Seems that --update --batch would need to run git ls-files once per line of batch input, which would surely be too slow, so just throw an error for that. This commit was supported by the NSF-funded DataLad project.	2017-04-07 15:55:45 -04:00
Joey Hess	99984967eb	enableremote: Fix re-enabling of existing gcrypt remotes, so that eg, encryption key changes take effect. They were silently ignored, a reversion introduced in 6.20160527. I don't like this regular git remote special case in enableremote, but I can't see a way to get rid of it. So, check if the existing remote is a Remote.Git. This commit was sponsored by Trenton Cronholm on Patreon.	2017-04-07 13:51:09 -04:00
Joey Hess	f406d16525	enableremote: When enabling a non-special remote, param=value parameters can't be used, so error out if any are provided. This commit was sponsored by Riku Voipio.	2017-04-07 13:14:53 -04:00
Joey Hess	29e73f76ef	Added remote.<name>.annex-push and remote.<name>.annex-pull The former can be useful to make remotes that don't get fully synced with local changes, which comes up in a lot of situations. The latter was mostly added for symmetry, but could be useful (though less likely to be). Implementing `remote.<name>.annex-pull` was a bit tricky, as there's no one place where git-annex pulls/fetches from remotes. I audited all instances of "fetch" and "pull". A few cases were left not checking this config: * Git.Repair can try to pull missing refs from a remote, and if the local repo is corrupted, that seems a reasonable thing to do even though the config would normally prevent it. * Assistant.WebApp.Gpg and Remote.Gcrypt and Remote.Git do fetches as part of the setup process of a remote. The config would probably not be set then, and having the setup fail seems worse than honoring it if it is already set. I have not prevented all the code that does a "merge" from merging branches from remotes with remote.<name>.annex-pull=false. That could perhaps be done, but it would need a way to map from branch name to remote name, and the way refspecs work makes that hard to get really correct. So if the user fetches manually, the git-annex branch will get merged, for example. Another way of looking at/justifying this is that the setting is called "annex-pull", not "annex-merge". This commit was supported by the NSF-funded DataLad project.	2017-04-05 13:22:35 -04:00
Joey Hess	c6d5d8f9bf	fix windows build	2017-04-05 11:19:29 -04:00
Joey Hess	256eb4807e	add missing "do" Unsure how it got committed in an uncompilable state before..	2017-04-03 14:52:54 -04:00
Joey Hess	c3970f6c1a	multicast: New command, uses uftp to multicast annexed files, for eg a classroom setting. This commit was supported by the NSF-funded DataLad project.	2017-03-30 19:35:30 -04:00
Joey Hess	64f924dc93	sync --content-of=path For when you want to sync only some files' contents, not the whole working tree. This commit was sponsored by Anthony DeRobertis on Patreon.	2017-03-20 16:00:48 -04:00
Joey Hess	faecd73f32	Support GIT_SSH and GIT_SSH_COMMAND They are handled close to the same as they are by git. However, unlike git, git-annex sometimes needs to pass the -n parameter when using these. So, this has the potential for breaking some setup, and perhaps there ought to be an ANNEX_USE_GIT_SSH=1 needed to use these. But I'd rather avoid that if possible, so let's see if anyone complains. Almost all places where "ssh" was run have been changed to support the env vars. Anything still calling sshOptions does not support them. In particular, rsync special remotes don't. Seems that annex-rsync-transport already gives sufficient control there. (Fixed in passing: Remote.Helper.Ssh.toRepo used to extract remoteAnnexSshOptions and pass them to sshOptions, which was redundant since sshOptions also extracts those.) This commit was sponsored by Jeff Goeke-Smith on Patreon.	2017-03-17 16:20:37 -04:00
Joey Hess	1c4e5f65fc	Drop support for building with old versions of directory, feed, and http-types.	2017-03-10 15:57:41 -04:00
Joey Hess	55b178a6ba	minor cleanup	2017-03-10 15:03:33 -04:00
Joey Hess	71a05b0d25	use ActionItem rather than String This changes fsck -A warnings to include the name of the key, which is a bit redundant in one case, but was missing in another case.	2017-03-10 14:13:10 -04:00
Joey Hess	c8e1e3dada	AssociatedFile newtype To prevent any further mistakes like `301aff34c4` This commit was sponsored by Francois Marier on Patreon.	2017-03-10 13:35:31 -04:00
Joey Hess	f90e2d0893	fix fsck bug introduced in `301aff34c4` Got two Maybe FilePaths crossed. Test suite caught it. Slightly improved types to avoid this mistake.	2017-03-10 12:11:00 -04:00
Joey Hess	301aff34c4	fsck -q: When a file has bad content, include the name of the file in the warning message. This commit was sponsored by Alexander Thompson on Patreon.	2017-03-08 15:15:20 -04:00
Joey Hess	874232f1a6	status: Propagate nonzero exit code from git status.	2017-03-02 14:09:42 -04:00
Joey Hess	ddf68b7c48	improve display of checking known urls Display it as a separate action, so it ends with a newline	2017-02-28 14:41:08 -04:00
Joey Hess	a62802af08	remove old debug print	2017-02-28 14:41:00 -04:00
Joey Hess	75029536e5	squelch a couple of warnings about moveAnnex return code	2017-02-28 12:49:17 -04:00
Joey Hess	e53070c1ff	inheritable annex.securehashesonly * init: When annex.securehashesonly has been set with git-annex config, copy that value to the annex.securehashesonly git config. * config --set: As well as setting value in git-annex branch, set local gitconfig. This is needed especially for annex.securehashesonly, which is read only from local gitconfig and not the git-annex branch. doc/todo/sha1_collision_embedding_in_git-annex_keys.mdwn has the rationale for doing it this way. There's no perfect solution; this seems to be the least-bad one. This commit was supported by the NSF-funded DataLad project.	2017-02-27 16:08:23 -04:00
Joey Hess	942e0174b3	make fsck check annex.securehashesonly, and new tip for working around SHA1 collisions with git-annex This commit was sponsored by andrea rota.	2017-02-27 13:55:15 -04:00
Joey Hess	07f1e638ee	annex.securehashesonly Cryptographically secure hashes can be forced to be used in a repository, by setting annex.securehashesonly. This does not prevent the git repository from containing files with insecure hashes, but it does prevent the content of such files from being pulled into .git/annex/objects from another repository. We want to make sure that at no point does git-annex accept content into .git/annex/objects that is hashed with an insecure key. Here's how it was done: * .git/annex/objects/xx/yy/KEY/ is kept frozen, so nothing can be written to it normally * So every place that writes content must call thawContent or modifyContent. We can audit for these, and be sure we've considered all cases. * The main functions are moveAnnex, and linkToAnnex; these were made to check annex.securehashesonly, and are the main security boundary for annex.securehashesonly. * Most other calls to modifyContent deal with other files in the KEY directory (inode cache etc). The other ones that mess with the content are: - Annex.Direct.toDirectGen, in which content already in the annex directory is moved to the direct mode file, so not relevant. - fix and lock, which don't add new content - Command.ReKey.linkKey, which manually unlocks it to make a copy. * All other calls to thawContent appear safe. Made moveAnnex return a Bool, so checked all callsites and made them deal with a failure in appropriate ways. linkToAnnex simply returns LinkAnnexFailed; all callsites already deal with it failing in appropriate ways. This commit was sponsored by Riku Voipio.	2017-02-27 13:33:59 -04:00
Joey Hess	27eca014be	fix up Read instance incompatibility caused by recent commit `9c4650358c` changed the Read instance for Key. I've checked all uses of that instance (by removing it and seeing what breaks), and they're all limited to the webapp, except one. That is GitAnnexDistribution's Read instance. So, `9c4650358c` would have broken upgrades of git-annex from downloads.kitenet.net. Once the .info files there got updated for a new release, old releases would have failed to parse them and never upgraded. To fix this, I found a way to make the .info files that contain GitAnnexDistribution values be readable by the old version of git-annex. This commit was sponsored by Ewen McNeill.	2017-02-24 18:59:12 -04:00
Joey Hess	9c4650358c	add KeyVariety type Where before the "name" of a key and a backend was a string, this makes it a concrete data type. This is groundwork for allowing some varieties of keys to be disabled in file2key, so git-annex won't use them at all. Benchmarks ran in my big repo: old git-annex info: real 0m3.338s user 0m3.124s sys 0m0.244s new git-annex info: real 0m3.216s user 0m3.024s sys 0m0.220s new git-annex find: real 0m7.138s user 0m6.924s sys 0m0.252s old git-annex find: real 0m7.433s user 0m7.240s sys 0m0.232s Surprising result; I'd have expected it to be slower since it now parses all the key varieties. But, the parser is very simple and perhaps sharing KeyVarieties uses less memory or something like that. This commit was supported by the NSF-funded DataLad project.	2017-02-24 15:16:56 -04:00
Joey Hess	3afc7d83f2	noCommit for PostReceive This was noticed because it broke the datalad test suite, which pushed to the remote and then fetched to check if it had received the expected branches. Auto-init caused the git-annex branch on the remote to diverge, breaking that test. https://github.com/datalad/datalad/issues/1319#issuecomment-281649518 The auto-init still happens, it's staged in the journal, and will be committed by some later git-annex command when it runs. Which is fine, it's the same as that later command doing the auto-init. This commit was supported by the NSF-funded DataLad project.	2017-02-23 18:37:02 -04:00
Joey Hess	75a15e1ad7	status: Pass --ignore-submodules=when option on to git status. Didn't make --ignore-submodules without a value be handled because I can't see a way to make optparse-applicative parse that. I've opened a bug requesting a way to do that: https://github.com/pcapriotti/optparse-applicative/issues/243	2017-02-20 17:01:24 -04:00
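
Commit d39c120afa above describes settings whose value comes from a command's exit status, with the command run only once and the result cached, and with no extra cost when no command is configured. The following is a minimal Python sketch of that idea, not git-annex's Haskell DynamicConfig. The class `DynamicFlag`, its parameters, and the stand-in failing command are all hypothetical names introduced for illustration.

```python
import subprocess
import sys
from typing import List, Optional


class DynamicFlag:
    """Boolean setting that is either static or decided by a command's exit status.

    Hypothetical illustration only; not git-annex's DynamicConfig.
    """

    def __init__(self, static: bool, command: Optional[List[str]] = None,
                 true_on_success: bool = True) -> None:
        self._static = static
        self._command = command
        self._true_on_success = true_on_success
        self._cached: Optional[bool] = None
        self.runs = 0  # how many times the command was actually executed

    def get(self) -> bool:
        # No command configured: use the static value, no process is spawned.
        if self._command is None:
            return self._static
        # Run the command at most once and remember the answer.
        if self._cached is None:
            self.runs += 1
            succeeded = subprocess.run(self._command).returncode == 0
            self._cached = succeeded == self._true_on_success
        return self._cached


# Stand-in for a user-supplied check that exits nonzero.
failing = [sys.executable, "-c", "raise SystemExit(1)"]

# annex-ignore-command: a nonzero exit means "ignore this remote".
ignore = DynamicFlag(static=False, command=failing, true_on_success=False)
# annex-sync-command: a nonzero exit means "do not sync this remote".
sync = DynamicFlag(static=True, command=failing, true_on_success=True)

print(ignore.get(), sync.get())
```

Because the two settings interpret the exit status in opposite directions, a single command that exits nonzero both ignores the remote and disables syncing with it, which is the convenience the commit message points out.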

1863 commits