From 0be6cad7a82cb66bee6406bc51f27353eefe8e20 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Tue, 27 Dec 2022 13:48:53 -0400 Subject: [PATCH] idea --- .../encrypted_keys_in_git_repository.mdwn | 59 +++++++++++++++++++ 1 file changed, 59 insertions(+) create mode 100644 doc/todo/encrypted_keys_in_git_repository.mdwn diff --git a/doc/todo/encrypted_keys_in_git_repository.mdwn b/doc/todo/encrypted_keys_in_git_repository.mdwn new file mode 100644 index 0000000000..580d4346b8 --- /dev/null +++ b/doc/todo/encrypted_keys_in_git_repository.mdwn @@ -0,0 +1,59 @@ +The idea is to have a type of key that is based on a hash of the annexed +file, but with some form of encryption or password protection. + +So someone who can access the git repository is prevented from checking +known hashes of files, and learning about content in the repository that is +not available to them. (However, they would be able to see filenames +and commit metadata, which might also expose information about what the +files are. This is not a fully encrypted git repository like can be made +with gcrypt.) + +This might be an added layer of security, or the git repository might be +made public, including perhaps some of the annexed files, while this is +used to obscure the hashes of some other annexed files. For example, a +scientific dataset might make public files that are derived from patient +data, and use this for the patient data files. + +---- + +If two such repositories were merged, git-annex would need to somehow +be able to tell how to decrypt a key, which could have come from either +repository. So it seems that the key needs to include within it some +identifier of the secret that is used to encrypt it. For example: + + ENC--ident--foo + +Where ident would be something like a UUID of the secret, and foo is the +file's hash (and perhaps size, or indeed a whole regular key) that is +encrypted by the secret. + +---- + +Should the key's size be included only in encrypted form, or in plaintext? + +Since git-annex constructs a Key without using the associated Backend, +it currently parses the size field the same way for each type of key. +So, it does not seem to be possible to encrypt the size field and +use that encrypted size to populate keySize. What would be +doable though, is to replace uses of keySize with a new Backend method that +returns the size. + +---- + +Should the encryption method be reversible? + +If the encryption method is not reversible, the key's size would need to be +included in plaintext, or left out entirely. + +But it does not seem necessary for the encryption method to be reversible +otherwise. Consider if scrypt was used. When adding a file, git-annex would +first hash it, and then run it through scrypt. That is not reversible, +so when fscking a file, just repeat the same process and compare the +resulting scrypt keys. + +Not being reversible is a nice benefit, because it makes it much harder +for an attacker to brute-force. If it's reversible the attacker can +brute-force the user's password, looking for a password that decrypts +to something that looks right. + +--[[Joey]]