Commit graph

279 commits

Author SHA1 Message Date
Joey Hess
6fcca2f13e WIP converting S3 special remote from hS3 to aws library
Currently, initremote works, but not the other operations. They should be
fairly easy to add from this base.

Also, https://github.com/aristidb/aws/issues/119 blocks internet archive
support.

Note that since http-conduit is used, this also adds https support to S3.
Although git-annex encrypts everything anyway, so that may not be extremely
useful. It is not enabled by default, because existing S3 special remotes
have port=80 in their config. Setting port=443 will enable it.

This commit was sponsored by Daniel Brockman.
2014-08-08 19:00:53 -04:00
Joey Hess
8025decc7f run Preparer to get Remover and CheckPresent actions
This will allow special remotes to eg, open a http connection and reuse it,
while checking if chunks are present, or removing chunks.

S3 and WebDAV both need this to support chunks with reasonable speed.

Note that a special remote might want to cache a http connection across
multiple requests. A simple case of this is that CheckPresent is typically
called before Store or Remove. A remote using this interface can certianly
use a Preparer that eg, uses a MVar to cache a http connection.

However, it's up to the remote to then deal with things like stale or
stalled http connections when eg, doing a series of downloads from a remote
and other places. There could be long delays between calls to a remote,
which could lead to eg, http connection stalls; the machine might even
move to a new network, etc.

It might be nice to improve this interface later to allow
the simple case without needing to handle the full complex case.
One way to do it would be to have a `Transaction SpecialRemote cache`,
where SpecialRemote contains methods for Storer, Retriever, Remover, and
CheckPresent, that all expect to be passed a `cache`.
2014-08-06 14:28:36 -04:00
Joey Hess
b4cf22a388 pushed checkPresent exception handling out of Remote implementations
I tend to prefer moving toward explicit exception handling, not away from
it, but in this case, I think there are good reasons to let checkPresent
throw exceptions:

1. They can all be caught in one place (Remote.hasKey), and we know
   every possible exception is caught there now, which we didn't before.
2. It simplified the code of the Remotes. I think it makes sense for
   Remotes to be able to be implemented without needing to worry about
   catching exceptions inside them. (Mostly.)
3. Types.StoreRetrieve.Preparer can only work on things that return a
   Bool, which all the other relevant remote methods already did.
   I do not see a good way to generalize that type; my previous attempts
   failed miserably.
2014-08-06 13:45:19 -04:00
Joey Hess
4b16989e98 roll ChunkedEncryptable into Special and improve interface
Allow disabling progress displays, for eg, rsync.
2014-08-03 15:40:01 -04:00
Joey Hess
d05b7b9182 better byteRetriever
Make the byteRetriever be passed the callback that consumes the bytestring.

This way, there's no worries about the lazy bytestring not all being read
when the resource that's creating it is closed.

Which in turn lets bup, ddar, and S3 each switch from using an unncessary
fileRetriver to a byteRetriever. So, more efficient on chunks and encrypted
files.

The only remaining fileRetrievers are hook and external, which really do
retrieve to files.
2014-08-03 01:12:24 -04:00
Joey Hess
32e4368377 S3: support chunking
The assistant defaults to 1MiB chunk size for new S3 special remotes.
Which will work around a couple of bugs:
  http://git-annex.branchable.com/bugs/S3_memory_leaks/
  http://git-annex.branchable.com/bugs/S3_upload_not_using_multipart/
2014-08-02 15:51:58 -04:00
Joey Hess
604740b720 S3: Deal with AWS ACL configurations that do not allow creating or checking the location of a bucket, but only reading and writing content to it. 2014-07-11 15:21:43 -04:00
Joey Hess
2f84659d51 fix build with old versions of bytestring 2014-06-06 14:04:35 -04:00
Joey Hess
0c2a14e4aa fix dodgy use of Char8
I don't know if this was a bug, but I don't know if it was not a bug
either.

See also,
http://git-annex.branchable.com/bugs/Truncated_file_transferred_via_S3/
where the file is not truncated, but mangled..
2014-05-27 20:31:25 -04:00
Joey Hess
45e7040142 webapp: Fix creation of box.com, S3, and Glacier repositories, broken in 5.20140221. 2014-02-24 15:29:17 -04:00
Joey Hess
fa24ba2520 plumb creds from webapp to initremote
Avoids abusing setting environment variables, which was always a hack
and won't work on windows.
2014-02-11 14:07:56 -04:00
Joey Hess
c20f31a1ad add GETAVAILABILITY to external special remote protocol
And some reworking of types, and added an annex-availability git config
setting.
2014-01-13 14:41:10 -04:00
Joey Hess
7ed8e87a34 assistant: Support repairing git remotes that are locally accessible
(eg, on removable drives)

gcrypt remotes are not yet handled.

This commit was sponsored by Sören Brunk.
2013-10-27 15:38:59 -04:00
Joey Hess
c76c94a0da S3: Try to ensure bucket name is valid for archive.org. 2013-10-16 16:35:47 -04:00
Joey Hess
1ffb3bb0ba add remote fsck interface
Currently only implemented for local git remotes. May try to add support
to git-annex-shell for ssh remotes later. Could concevably also be
supported by some special remote, although that seems unlikely.

Cronner user this when available, and when not falls back to
fsck --fast --from remote

git annex fsck --from does not itself use this interface.
To do so, I would need to pass --fast and all other options that influence
fsck on to the git annex fsck that it runs inside the remote. And that
seems like a lot of work for a result that would be no better than
cd remote; git annex fsck
This may need to be revisited if git-annex-shell gets support, since it
may be the case that the user cannot ssh to the server to run git-annex
fsck there, but can run git-annex-shell there.

This commit was sponsored by Damien Diederen.
2013-10-11 16:03:18 -04:00
Joey Hess
5fe49b98f8 Support hot-swapping of removable drives containing gcrypt repositories.
To support this, a core.gcrypt-id is stored by git-annex inside the git
config of a local gcrypt repository, when setting it up.

That is compared with the remote's cached gcrypt-id. When different, a
drive has been changed. git-annex then looks up the remote config for
the uuid mapped from the core.gcrypt-id, and tweaks the configuration
appropriately. When there is no known config for the uuid, it will refuse to
use the remote.
2013-09-12 15:54:35 -04:00
Joey Hess
7c1a9cdeb9 partially complete gcrypt remote (local send done; rest not)
This is a git-remote-gcrypt encrypted special remote. Only sending files
in to the remote works, and only for local repositories.

Most of the work so far has involved making initremote work. A particular
problem is that remote setup in this case needs to generate its own uuid,
derivied from the gcrypt-id. That required some larger changes in the code
to support.

For ssh remotes, this will probably just reuse Remote.Rsync's code, so
should be easy enough. And for downloading from a web remote, I will need
to factor out the part of Remote.Git that does that.

One particular thing that will need work is supporting hot-swapping a local
gcrypt remote. I think it needs to store the gcrypt-id in the git config of the
local remote, so that it can check it every time, and compare with the
cached annex-uuid for the remote. If there is a mismatch, it can change
both the cached annex-uuid and the gcrypt-id. That should work, and I laid
some groundwork for it by already reading the remote's config when it's
local. (Also needed for other reasons.)

This commit was sponsored by Daniel Callahan.
2013-09-07 18:38:00 -04:00
Joey Hess
1587fd42a3 fix build (seems getGpgEncOpts got renamed to getGpgEncParams) 2013-09-04 18:00:02 -04:00
guilhem
8293ed619f Allow public-key encryption of file content.
With the initremote parameters "encryption=pubkey keyid=788A3F4C".

/!\ Adding or removing a key has NO effect on files that have already
been copied to the remote. Hence using keyid+= and keyid-= with such
remotes should be used with care, and make little sense unless the point
is to replace a (sub-)key by another. /!\

Also, a test case has been added to ensure that the cipher and file
contents are encrypted as specified by the chosen encryption scheme.
2013-09-03 14:34:16 -04:00
Joey Hess
883b17af01 Store an annex-uuid file in the bucket when setting up a new S3 remote. 2013-04-27 17:01:24 -04:00
Joey Hess
3c7f4d2bd1 Automatically register public urls for files uploaded to the Internet Archive. 2013-04-25 17:28:25 -04:00
Joey Hess
e3ea36174b webapp: Display some additional information about a repository on its edit page. 2013-04-25 16:42:17 -04:00
Joey Hess
3e396a3b89 S3: Dropping content from the Internet Archive doesn't work, but their API indicates it does. Always refuse to drop from there. 2013-04-25 15:20:31 -04:00
Joey Hess
8284b310a7 support enabling IA repositories 2013-04-25 13:14:49 -04:00
Joey Hess
9e11699c76 connect existing meters to the transfer log for downloads
Most remotes have meters in their implementations of retrieveKeyFile
already. Simply hooking these up to the transfer log makes that information
available. Easy peasy.

This is particularly valuable information for encrypted remotes, which
otherwise bypass the assistant's polling of temp files, and so don't have
good progress bars yet.

Still some work to do here (see progressbars.mdwn changes), but this
is entirely an improvement from the lack of progress bars for encrypted
downloads.
2013-04-11 17:32:31 -04:00
Joey Hess
cf07a2c412 webapp: Progess bar fixes for many types of special remotes.
There was confusion in different parts of the progress bar code about
whether an update contained the total number of bytes transferred, or the
number of bytes transferred since the last update. One way this bug
showed up was progress bars that seemed to stick at zero for a long time.
In order to fix it comprehensively, I add a new BytesProcessed data type,
that is explicitly a total quantity of bytes, not a delta.

Note that this doesn't necessarily fix every problem with progress bars.
Particularly, buffering can now cause progress bars to seem to run ahead
of transfers, reaching 100% when data is still being uploaded.
2013-03-28 17:04:37 -04:00
Joey Hess
449520a573 add globallyAvailable to remotes 2013-03-15 19:16:13 -04:00
Joey Hess
19c0a0d5b1 split cost out into its own module
Added a function to insert a new cost into a list, which could be used to
asjust costs after a drag and drop.
2013-03-13 16:30:34 -04:00
guilhem
d2bc0e9f3e GnuPG options for symmetric encryption. 2013-03-11 09:48:38 -04:00
Joey Hess
1bc49b7158 Special remotes now all rollback storage of keys that get modified during the transfer, which can happen in direct mode. 2013-01-09 18:42:29 -04:00
Joey Hess
909f67443f Fix transferring files to special remotes in direct mode. 2013-01-06 14:29:01 -04:00
Joey Hess
4008590c68 type based git config handling for remotes
Still a couple of places that use git config ad-hoc, but this is most of it
done.
2013-01-01 13:58:14 -04:00
Joey Hess
0d50a6105b whitespace fixes 2012-12-13 00:45:27 -04:00
Joey Hess
0b6c889012 webapp: S3 and Glacier forms now have a select list of all currently-supported AWS regions. 2012-12-01 14:11:37 -04:00
Joey Hess
020a25abe1 avoid unnecessary Maybe 2012-11-30 00:55:59 -04:00
Joey Hess
8dd1d9aaf9 webapp: Defaults to sharing box.com account info with friends, allowing one-click enabling of the repository. 2012-11-28 13:31:49 -04:00
Joey Hess
a5111a6d85 Amazon Glacier special remote; 100% working 2012-11-20 16:43:58 -04:00
Joey Hess
7df1e71fe3 S3: Added progress display for uploading and downloading. 2012-11-18 22:49:07 -04:00
Joey Hess
b0e08ae457 S3: upload progress display 2012-11-18 22:20:43 -04:00
Joey Hess
81379bb29c better streaming while encrypting/decrypting
Both the directory and webdav special remotes used to have to buffer
the whole file contents before it could be decrypted, as they read
from chunks. Now the chunks are streamed through gpg with no buffering.
2012-11-18 15:27:44 -04:00
Joey Hess
7addb89dc1 webapp: support box.com 2012-11-17 15:30:11 -04:00
Joey Hess
0f782bd028 encrypted webdav working 2012-11-16 13:57:32 -04:00
Joey Hess
3c039d329c update to dav 0.1, and basic uploading is working! 2012-11-15 13:46:16 -04:00
Joey Hess
e250f6f11f factor out Creds 2012-11-14 19:32:27 -04:00
Joey Hess
2172cc586e where indenting 2012-11-11 00:51:07 -04:00
Joey Hess
f18a53eec0 change s3 creds caching
Rather than store decrypted creds in the environment, store them in the
creds cache file.

This way, a single git-annex can have multiple S3 remotes using different
creds.
2012-09-26 14:42:51 -04:00
Joey Hess
e4bf74a965 store S3 creds in a 600 mode file inside the local git repo 2012-09-26 14:42:32 -04:00
Joey Hess
226781c047 unify types 2012-09-21 14:50:14 -04:00
Joey Hess
aff09a1f33 add a progress callback to storeKey, and threaded it all the way through
Transfer info files are updated when the callback is called, updating
the number of bytes transferred.

Left unused p variables at every place the callback should be used.
Which is rather a lot..
2012-09-19 16:08:37 -04:00
Joey Hess
271ea49978 add support for readonly remotes
Currently only the web special remote is readonly, but it'd be possible to
also have readonly drives, or other remotes. These are handled in the
assistant by only downloading from them, and never trying to upload to
them.
2012-08-26 15:39:02 -04:00
Joey Hess
78d3add86b tweak field name 2012-08-26 14:26:43 -04:00
Joey Hess
b818337054 fix build warning 2012-08-16 16:48:27 -07:00
Joey Hess
cbca93cf7c Merge branch 'master' into assistant
Conflicts:
	debian/changelog
2012-08-16 16:36:32 -07:00
Joey Hess
ad4e152fd6 S3: Add fileprefix setting. 2012-08-09 13:54:54 -04:00
Joey Hess
94fcd0cf59 add routes to pause/start/cancel transfers
This commit includes a paydown on technical debt incurred two years ago,
when I didn't know that it was bad to make custom Read and Show instances
for types. As the routes need Read and Show for Transfer, which includes a
Key, and deriving my own Read instance of key was not practical,
I had to finally clean that up.

So the compact Key read and show functions are now file2key and key2file,
and Read and Show are now derived instances.

Changed all code that used the old instances, compiler checked.
(There were a few places, particularly in Command.Unused, and the test
suite where the Show instance continue to be used for legitimate
comparisons; ie show key_x == show key_y (though really in a bloom filter))
2012-08-08 16:20:24 -04:00
Joey Hess
4ec9244f1a add a path field to remotes
Also broke out some helper functions around constructing remotes,
to be used later.
2012-07-22 14:30:43 -04:00
Joey Hess
7225c2bfc0 record transfer information on local git remotes
In order to record a semi-useful filename associated with the key,
this required plumbing the filename all the way through to the remotes'
storeKey and retrieveKeyFile.

Note that there is potential for deadlock here, narrowly avoided.
Suppose the repos are A and B. A sends file foo to B, and at the same
time, B gets file foo from A. So, A locks its upload transfer info file,
and then locks B's download transfer info file. At the same time,
B is taking the two locks in the opposite order. This is only not a
deadlock because the lock code does not wait, and aborts. So one of A or
B's transfers will be aborted and the other transfer will continue.
Whew!
2012-07-01 17:15:11 -04:00
Joey Hess
ed79596b75 noop 2012-04-21 23:32:33 -04:00
Joey Hess
7ba79cfb8c thread through original key to retrieveEnctypted
Allows showing progress bar for this last case of the directory special
remote.
2012-03-04 03:36:39 -04:00
Joey Hess
9856c24a59 Add progress bar display to the directory special remote.
So far I've only written progress bars for sending files, not yet
receiving.

No longer uses external cp at all. ByteString IO is fast enough.
2012-03-04 03:17:25 -04:00
Joey Hess
cb631ce518 whereis: Prints the urls of files that the web special remote knows about. 2012-02-14 03:49:48 -04:00
Joey Hess
57a747d081 S3: Fix irrefutable pattern failure when accessing encrypted S3 credentials. 2012-02-08 11:41:15 -04:00
Joey Hess
b9b72d22a9 refactor
Wow, triple monadic lift!
2012-02-07 01:40:14 -04:00
Joey Hess
61dbad505d fsck --from remote --fast
Avoids expensive file transfers, at the expense of checking file size
and/or contents.

Required some reworking of the remote code.
2012-01-20 13:23:11 -04:00
Joey Hess
06b0cb6224 add tmp flag parameter to retrieveKeyFile 2012-01-19 16:07:36 -04:00
Joey Hess
f534fcc7b1 remove S3stub stuff
Let's keep that in a no-s3 branch, which can be merged into eg,
debian-stable.
2012-01-05 23:14:10 -04:00
Joey Hess
9c96d86502 nasty hack to build when hS3 is not available
So, it would be nicer to just use Cabal and take advantage
of its conditional compilation support. But, Cabal seems to
lack good support for a package with an internal library that is used by
multiple executables. It wants to build everything twice or more.
That's too slow for me.

Anyway, fairly soon, I expect to upgrade hS3 to a requirment, and I
can just revert this.
2011-03-30 01:32:05 -04:00
Joey Hess
8f9951369d refactor 2011-03-29 18:28:37 -04:00
Joey Hess
3adb48f46a more S3 docs 2011-03-29 18:21:05 -04:00
Joey Hess
d8154eaad3 transfering content back from s3 works! 2011-03-29 18:09:22 -04:00
Joey Hess
0782d70063 copy --to S3 works 2011-03-29 17:57:20 -04:00
Joey Hess
72f94cc42e progress 2011-03-29 17:20:22 -04:00
Joey Hess
475f707361 initremote now creates buckets 2011-03-29 16:21:21 -04:00
Joey Hess
0a4c610b4f initremote works 2011-03-29 14:55:59 -04:00
Joey Hess
05751d55cd clean up remote.log handling 2011-03-29 14:10:12 -04:00
Joey Hess
a3b6586902 update 2011-03-28 23:51:07 -04:00
Joey Hess
a7bd63eb01 basic s3 remote start
But bucket name is not handled right; it needs to be globally unique.
2011-03-28 01:32:47 -04:00
Joey Hess
c0fd38bfa9 document S3 remotes 2011-03-27 22:52:13 -04:00
Joey Hess
65b72604d7 skeleton of S3 remote 2011-03-27 22:00:44 -04:00