What these generate is not really suitable to be used as a filename,
which is why keyFile and fileKey further escape it. These are just
serializing Keys.
Also removed a quickcheck test that was very unlikely to test anything
useful, since it relied on random chance creating something that looks
like a serialized key. The other test is sufficient for testing what
that was intended to test anyway.
Where before the "name" of a key and a backend was a string, this makes
it a concrete data type.
This is groundwork for allowing some varieties of keys to be disabled
in file2key, so git-annex won't use them at all.
Benchmarks ran in my big repo:
old git-annex info:
real 0m3.338s
user 0m3.124s
sys 0m0.244s
new git-annex info:
real 0m3.216s
user 0m3.024s
sys 0m0.220s
new git-annex find:
real 0m7.138s
user 0m6.924s
sys 0m0.252s
old git-annex find:
real 0m7.433s
user 0m7.240s
sys 0m0.232s
Surprising result; I'd have expected it to be slower since it now parses
all the key varieties. But, the parser is very simple and perhaps
sharing KeyVarieties uses less memory or something like that.
This commit was supported by the NSF-funded DataLad project.
Turns out that Data.List.Utils.split is slow and makes a lot of
allocations. Here's a much simpler single character splitter that behaves
the same (even in wacky corner cases) while running in half the time and
75% the allocations.
As well as being an optimisation, this helps move toward eliminating use of
missingh.
(Data.List.Split.splitOn is nearly as slow as Data.List.Utils.split and
allocates even more.)
I have not benchmarked the effect on git-annex, but would not be surprised
to see some parsing of eg, large streams from git commands run twice as
fast, and possibly in less memory.
This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
ghc 8 added backtraces on uncaught errors. This is great, but git-annex was
using error in many places for a error message targeted at the user, in
some known problem case. A backtrace only confuses such a message, so omit it.
Notably, commands like git annex drop that failed due to eg, numcopies,
used to use error, so had a backtrace.
This commit was sponsored by Ethan Aubin.
Removed the instance LensGpgEncParams RemoteConfig because it encouraged
code that does not take the RemoteGitConfig into account.
RemoteType's setup was changed to take a RemoteGitConfig,
although the only place that is able to provide a non-empty one is
enableremote, when it's changing an existing remote. This led to several
folow-on changes, and got RemoteGitConfig plumbed through.
This is useful for makking a special remote that anyone with a clone of the
repo and your public keys can upload files to, but only you can decrypt the
files stored in it.
The naming is unofrtunately not consistent, but the gnupg-options
were only used for encrypting, and it's too late to change that.
It would be nice to have a third setting that is always passed to gnupg,
but ~/.gnupg/options can be used to specify such global options when really
needed.
When gpg.program is configured, it's used to get the command to run for
gpg. Useful on systems that have only a gpg2 command or want to use it
instead of the gpg command.
Removed old extensible-exceptions, only needed for very old ghc.
Made webdav use Utility.Exception, to work after some changes in DAV's
exception handling.
Removed Annex.Exception. Mostly this was trivial, but note that
tryAnnex is replaced with tryNonAsync and catchAnnex replaced with
catchNonAsync. In theory that could be a behavior change, since the former
caught all exceptions, and the latter don't catch async exceptions.
However, in practice, nothing in the Annex monad uses async exceptions.
Grepping for throwTo and killThread only find stuff in the assistant,
which does not seem related.
Command.Add.undo is changed to accept a SomeException, and things
that use it for rollback now catch non-async exceptions, rather than
only IOExceptions.
Some remotes like External need to run store and retrieve actions in Annex,
not IO. In order to do that lift, I had to dive pretty deep into the
utilities, making Utility.Gpg and Utility.Tmp be partly converted to using
MonadIO, and Control.Monad.Catch for exception handling.
There should be no behavior changes in this commit.
This commit was sponsored by Michael Barabanov.
I'd have liked to keep these two concepts entirely separate,
but that are entagled: Storing a key in an encrypted and chunked remote
need to generate chunk keys, encrypt the keys, chunk the data, encrypt the
chunks, and send them to the remote. Similar for retrieval, etc.
So, here's an implemnetation of all of that.
The total win here is that every remote was implementing encrypted storage
and retrival, and now it can move into this single place. I expect this
to result in several hundred lines of code being removed from git-annex
eventually!
This commit was sponsored by Henrik Ahlgren.
Added new fields for chunk number, and chunk size. These will not appear
in normal keys ever, but will be used for chunked data stored on special
remotes.
This commit was sponsored by Jouni K Seppanen.
I forgot I had <$$> hidden away in Utility.Applicative.
It allows doing the same kind of currying as does >=*>
and I found using it made the code more readable for me.
(*>=> was not used)
Cipher is now a datatype
data Cipher = Cipher String | MacOnlyCipher String
which makes more precise its interpretation MAC-only vs. MAC + used to
derive a key for symmetric crypto.
With the initremote parameters "encryption=pubkey keyid=788A3F4C".
/!\ Adding or removing a key has NO effect on files that have already
been copied to the remote. Hence using keyid+= and keyid-= with such
remotes should be used with care, and make little sense unless the point
is to replace a (sub-)key by another. /!\
Also, a test case has been added to ensure that the cipher and file
contents are encrypted as specified by the chosen encryption scheme.
/!\ It is to be noted that revoking a key does NOT necessarily prevent
the owner of its private part from accessing data on the remote /!\
The only sound use of `keyid-=` is probably to replace a (sub-)key by
another, where the private part of both is owned by the same
person/entity:
git annex enableremote myremote keyid-=2512E3C7 keyid+=788A3F4C
Reference: http://git-annex.branchable.com/bugs/Using_a_revoked_GPG_key/
* Other change introduced by this patch:
New keys now need to be added with option `keyid+=`, and the scheme
specified (upon initremote only) with `encryption=`. The motivation for
this change is to open for new schemes, e.g., strict asymmetric
encryption.
git annex initremote myremote encryption=hybrid keyid=2512E3C7
git annex enableremote myremote keyid+=788A3F4C
Unless highRandomQuality=false (or --fast) is set, use Libgcypt's
'GCRY_VERY_STRONG_RANDOM' level by default for cipher generation, like
it's done for OpenPGP key generation.
On the assistant side, the random quality is left to the old (lower)
level, in order not to scare the user with an enless page load due to
the blocking PRNG waiting for IO actions.
Both the directory and webdav special remotes used to have to buffer
the whole file contents before it could be decrypted, as they read
from chunks. Now the chunks are streamed through gpg with no buffering.
This commit includes a paydown on technical debt incurred two years ago,
when I didn't know that it was bad to make custom Read and Show instances
for types. As the routes need Read and Show for Transfer, which includes a
Key, and deriving my own Read instance of key was not practical,
I had to finally clean that up.
So the compact Key read and show functions are now file2key and key2file,
and Read and Show are now derived instances.
Changed all code that used the old instances, compiler checked.
(There were a few places, particularly in Command.Unused, and the test
suite where the Show instance continue to be used for legitimate
comparisons; ie show key_x == show key_y (though really in a bloom filter))
This option avoids gpg key distribution, at the expense of flexability, and
with the requirement that all clones of the git repository be equally
trusted.