git-annex

Author	SHA1	Message	Date
Joey Hess	3e68c1c2fd	add remote state logs This allows a remote to store a piece of arbitrary state associated with a key. This is needed to support Tahoe, where the file-cap is calculated from the data stored in it, and used to retrieve a key later. Glacier also would be much improved by using this. GETSTATE and SETSTATE are added to the external special remote protocol. Note that the state is left as-is even when a key is removed from a remote. It's up to the remote to decide when it wants to clear the state. The remote state log, $KEY.log.rmt, is a UUID-based log. However, rather than using the old UUID-based log format, I created a new variant of that format. The new variant is more space efficient (since it lacks the "timestamp=" hack), is easier to parse (and the parser doesn't mess with whitespace in the value), and avoids compatibility cruft in the old one. This seemed worth cleaning up for these new files, since there could be a lot of them, while before UUID-based logs were only used for a few log files at the top of the git-annex branch. The transition code has also been updated to handle these new UUID-based logs. This commit was sponsored by Daniel Hofer.	2014-01-03 16:35:57 -04:00
Joey Hess	f7727d2df1	Remotes can now be made read-only, by setting remote.<name>.annex-readonly	2014-01-02 13:12:32 -04:00
Joey Hess	8e3032df2d	added GETWANTED, SETWANTED for Tobias's flickr remote This was unexpectedly difficult because of a dependency cycle. To parse a preferred content expression involves several things that need to operate on the list of remotes. Which needs Remote.External. The only way to avoid this cycle (I tried breaking it at several points) was to skip parsing the expression in SETWANTED. That's sorta ok, because git-annex already has to deal with unparsable preferred content expressions being stored, in order to handle eg, upgrades. But I'm still not very happy that I cannot check it. I feel this is a strong indication that I need to beware of further bloating the special remote protocol interface.	2014-01-01 20:12:20 -04:00
Joey Hess	ed1fcab6d7	external special remote protocol: Added GETUUID.	2013-12-31 13:50:18 -04:00
Joey Hess	054e4f17e2	implement PREPARE-FAILURE for Tobias	2013-12-29 13:39:25 -04:00
Joey Hess	aa97a33dde	better error messages when external special remote exits unexpectedly or is not in PATH	2013-12-27 17:14:44 -04:00
Joey Hess	445b7b41b9	add credential storage support for external special remotes & update example	2013-12-27 16:01:43 -04:00
Joey Hess	551573570f	better protocol error message, indicate if the command was able to be parsed or was misplaced	2013-12-27 14:03:35 -04:00
Joey Hess	21342cae63	flush handle after writing message	2013-12-27 13:22:06 -04:00
Joey Hess	fa6f404a5f	fix deadlock when state TMVar is empty	2013-12-27 13:17:22 -04:00
Joey Hess	9125a25738	defer SETSTATE and GETSTATE for now TAHOE-LAFS may use these eventually, but that's TBD and none of git-annex's own special remotes need that, except for the web special remote's urls.	2013-12-27 13:07:56 -04:00
Joey Hess	a7f3724e21	implement GETCONFIG and SETCONFIG Changed protocol spec to make SETCONFIG only store it persistently when run during INITREMOTE. I see no reason to support storing it persistently at other times, and doing so would unnecessarily complicate the code. Also, letting that be done would probably result in use for storing data that doesn't really belong there, and special remote authors who don't understand how the union merging works would probably be surprised by the results.	2013-12-27 12:37:23 -04:00
Joey Hess	91c9e98168	support encryption	2013-12-27 12:21:55 -04:00
Joey Hess	5d8ff64dc1	make --debug show transcript of special remote protocol messages	2013-12-27 03:10:00 -04:00
Joey Hess	3289155e28	don't send PREPARE before INITREMOTE That complicated special remote programs, because they had to avoid making PREPARE fail if some configuration is missing, because the remote might not be initialized yet. Instead, complicate git-annex slightly by only sending PREPARE immediately before some other request other than INITREMOTE (or PREPARE of course).	2013-12-27 02:49:10 -04:00
Joey Hess	6d504b57e7	make some requests optional, simplify and future-proof protocol more	2013-12-27 02:11:06 -04:00
Joey Hess	6c565ec905	external special remotes mostly implemented (untested) This has not been tested at all. It compiles! The only known missing things are support for encryption, and for get/set of special remote configuration, and of key state. (The latter needs separate work to add a new per-key log file to store that state.) Only thing I don't much like is that initremote needs to be passed both type=external and externaltype=foo. It would be better to have just type=foo. Most of this is quite straightforward code, that largely wrote itself given the types. The only tricky parts were: * Need to lock the remote when using it to eg make a request, because in theory git-annex could have multiple threads that each try to use a remote at the same time. I don't think that git-annex ever does that currently, but better safe than sorry. * Rather than starting up every external special remote program when git-annex starts, they are started only on demand, when first used. This will avoid slowdown, especially when running fast git-annex query commands. Once started, they keep running until git-annex stops, currently, which may not be ideal, but it's hard to know a better time to stop them. * Bit of a chicken and egg problem with caching the cost of the remote, because setting annex-cost in the git config needs the remote to already be set up. Managed to finesse that. This commit was sponsored by Lukas Anzinger.	2013-12-26 18:23:13 -04:00
Joey Hess	8803e36814	future-proofing	2013-12-25 20:04:31 -04:00
Joey Hess	1dc930063a	basic data types and serialization for external special remote protocol This is mostly straightforward, but did turn out quite nicely strongly typed, and with a quite nice automatic tokenization and parsing of received messages. Made a few minor changes to the protocol to clear up ambiguities and make it easier to parse. Note particularly that setting remote configuration is moved to a separate command, which allows a remote to set arbitrary data.	2013-12-25 17:54:57 -04:00
Joey Hess	011b8bc7ec	pull in Win32-extras, to be able to get current process id in Windows Fixed up a number of things that had worked around there not being a way to get that. Most notably, transfer info files on windows now include the process id, since no locking is currently done. This means the file format varies between windows and unix.	2013-12-11 00:15:10 -04:00
Joey Hess	e425a966ed	Deal with box.com changing the url of their webdav endpoint. Use new url when making new remotes. Transparently rewrite old url to new for existing remotes.	2013-12-02 16:01:20 -04:00
Joey Hess	0a63ed563f	rsync special remote: Fix fallback mode for rsync remotes that use hashDirMixed. Closes: #731142	2013-12-02 12:53:39 -04:00
Joey Hess	58db042033	map: Work when there are gcrypt remotes.	2013-11-04 14:14:44 -04:00
Joey Hess	2203690822	really fix gcrypt for `7be69a2491` Fixed all the other ones, but forgot to fix gcrypt!	2013-11-02 20:10:54 -04:00
Joey Hess	b2cca95d1c	clean import list	2013-11-02 19:55:18 -04:00
Joey Hess	a04fe350b8	fix build	2013-11-02 19:54:59 -04:00
Joey Hess	7be69a2491	gcrypt, bup: Fix bug that prevented using these special remotes with encryption=pubkey. I think both of these are all that's affected, but I went ahead and fixed all the remotes that set their config to M.empty to instead store the actual config. Who knows what will expect it to be actually present in future, the Remote instance of getGpgEncParams came to..	2013-11-02 16:37:28 -04:00
Joey Hess	7ed8e87a34	assistant: Support repairing git remotes that are locally accessible (eg, on removable drives) gcrypt remotes are not yet handled. This commit was sponsored by Sören Brunk.	2013-10-27 15:38:59 -04:00
Joey Hess	5756636486	directory, webdav: Fix bug introduced in version 4.20131002 that caused the chunkcount file to not be written. Work around repositories without such a file, so files can still be retrieved from them.	2013-10-26 15:03:12 -04:00
Joey Hess	06ea92282f	fix inverted logic when determining whether to write a chunkcount file late-night hlint bit me on this one.. Reviewed `c1990702e9` and the rest of it seems ok	2013-10-26 14:08:29 -04:00
Joey Hess	c76c94a0da	S3: Try to ensure bucket name is valid for archive.org.	2013-10-16 16:35:47 -04:00
Joey Hess	a6e9386d39	fix remote fsck to run in remote	2013-10-14 15:05:29 -04:00
Joey Hess	c78aaed317	ye olde inverted logic	2013-10-14 12:26:46 -04:00
Joey Hess	1ffb3bb0ba	add remote fsck interface Currently only implemented for local git remotes. May try to add support to git-annex-shell for ssh remotes later. Could conceivably also be supported by some special remote, although that seems unlikely. Cronner uses this when available, and when not falls back to fsck --fast --from remote. git annex fsck --from does not itself use this interface. To do so, I would need to pass --fast and all other options that influence fsck on to the git annex fsck that it runs inside the remote. And that seems like a lot of work for a result that would be no better than cd remote; git annex fsck. This may need to be revisited if git-annex-shell gets support, since it may be the case that the user cannot ssh to the server to run git-annex fsck there, but can run git-annex-shell there. This commit was sponsored by Damien Diederen.	2013-10-11 16:03:18 -04:00
Joey Hess	747f5b123c	url size fixes addurl: Improve message when adding url with wrong size to existing file. Before the message suggested the url didn't exist. Fixed handling of URL keys that have no recorded size. Before, if the key has no size, the url also had to not declare any size, which was unlikely and wrong, or it was taken to not exist. This probably would mostly affect keys that were added to the annex with addurl --relaxed.	2013-10-11 13:05:00 -04:00
Joey Hess	571fe4999b	remove __WINDOWS__ ifdef	2013-10-06 17:23:30 -04:00
Joey Hess	0ede6b7def	typoe and debug info	2013-10-01 19:10:45 -04:00
Joey Hess	bddfbef8be	git-annex-shell gcryptsetup command This was the least-bad alternative to get dedicated key gcrypt repos working in the assistant.	2013-10-01 17:20:51 -04:00
Joey Hess	1536ebfe47	Disable receive.denyNonFastForwards when setting up a gcrypt special remote gcrypt needs to be able to fast-forward the master branch. If a git repository is set up with git init --shared --bare, it gets that set, and pushing to it will then fail, even when it's up-to-date.	2013-10-01 15:23:48 -04:00
Joey Hess	101099f7b5	fix probing for local gcrypt repos	2013-10-01 14:38:20 -04:00
Joey Hess	995e1e3c5d	fix transferring to gcrypt repo from direct mode repo recvkey was told it was receiving a HMAC key from a direct mode repo, and that confused it into rejecting the transfer, since it has no way to verify a key using that backend, since there is no HMAC backend. I considered making recvkey skip verification in the case of an unknown backend. However, that could lead to bad results; a key can legitimately be in the annex with a backend that the remote git-annex-shell doesn't know about. Better to keep it rejecting if it cannot verify. Instead, made the gcrypt special remote not set the direct mode flag when sending (and receiving) files. Also, added some recvkey messages when its checks fail, since otherwise all that is shown is a confusing error message from rsync when the remote git-annex-shell exits nonzero.	2013-10-01 14:19:24 -04:00
Joey Hess	12f6b9693a	Send a git-annex user-agent when downloading urls. Overridable with --user-agent option. Not yet done for S3 or WebDAV due to limitations of libraries used -- neither allows a user-agent header to be specified. This commit sponsored by Michael Zehrer.	2013-09-28 14:35:21 -04:00
Joey Hess	c6032b0dab	clean up some ugly code	2013-09-27 19:52:36 -04:00
Joey Hess	e864c8d033	blind enabling gcrypt repos on rsync.net This pulls off quite a nice trick: When given a path on rsync.net, it determines if it is an encrypted git repository that the user has the key to decrypt, and merges with it. This works even when the local repository had no idea that the gcrypt remote exists! (As previously done with local drives.) This commit sponsored by Pedro Côrte-Real	2013-09-27 16:21:56 -04:00
Joey Hess	e0b99f3960	support ssh://host/~/dir When generating the path for rsync, /~/ is not valid, so change to just host:dir Note that git remotes specified in host:dir form are internally converted to the ssh:// url form, so this was especially needed..	2013-09-26 15:02:27 -04:00
Joey Hess	c1990702e9	hlint	2013-09-25 23:19:01 -04:00
Joey Hess	3192b059b5	add back lost check that git-annex-shell supports gcrypt	2013-09-24 17:51:12 -04:00
Joey Hess	4c954661a1	git-annex-shell: Added support for operating inside gcrypt repositories. * Note that the layout of gcrypt repositories has changed, and if you created one you must manually upgrade it. See http://git-annex.branchable.com/upgrades/gcrypt/	2013-09-24 17:25:47 -04:00
Joey Hess	f9e438c1bc	factor out more ssh stuff from git remote This has the dual benefits of making Remote.Git shorter, and letting Remote.GCrypt use these utilities.	2013-09-24 13:37:41 -04:00
Joey Hess	7390f08ef9	Use cryptohash rather than SHA for hashing. This is a massive win on OSX, which doesn't have a sha256sum normally. Only use external hash commands when the file is > 1 mb, since cryptohash is quite close to them in speed. SHA is still used to calculate HMACs. I don't quite understand cryptohash's API for those. Used the following benchmark to arrive at the 1 mb number.

1 mb file:

benchmarking sha256/internal
mean: 13.86696 ms, lb 13.83010 ms, ub 13.93453 ms, ci 0.950
std dev: 249.3235 us, lb 162.0448 us, ub 458.1744 us, ci 0.950
found 5 outliers among 100 samples (5.0%)
  4 (4.0%) high mild
  1 (1.0%) high severe
variance introduced by outliers: 10.415%
variance is moderately inflated by outliers

benchmarking sha256/external
mean: 14.20670 ms, lb 14.17237 ms, ub 14.27004 ms, ci 0.950
std dev: 230.5448 us, lb 150.7310 us, ub 427.6068 us, ci 0.950
found 3 outliers among 100 samples (3.0%)
  2 (2.0%) high mild
  1 (1.0%) high severe

2 mb file:

benchmarking sha256/internal
mean: 26.44270 ms, lb 26.23701 ms, ub 26.63414 ms, ci 0.950
std dev: 1.012303 ms, lb 925.8921 us, ub 1.122267 ms, ci 0.950
variance introduced by outliers: 35.540%
variance is moderately inflated by outliers

benchmarking sha256/external
mean: 26.84521 ms, lb 26.77644 ms, ub 26.91433 ms, ci 0.950
std dev: 347.7867 us, lb 210.6283 us, ub 571.3351 us, ci 0.950
found 6 outliers among 100 samples (6.0%)

import Crypto.Hash
import Data.ByteString.Lazy as L
import Criterion.Main
import Common

testfile :: FilePath
testfile = "/run/shm/data" -- on ram disk

main = defaultMain
    [ bgroup "sha256"
        [ bench "internal" $ whnfIO internal
        , bench "external" $ whnfIO external
        ]
    ]

sha256 :: L.ByteString -> Digest SHA256
sha256 = hashlazy

internal :: IO String
internal = show . sha256 <$> L.readFile testfile

external :: IO String
external = do
    s <- readProcess "sha256sum" [testfile]
    return $ fst $ separate (== ' ') s
	2013-09-22 20:06:02 -04:00

447 commits