From 0be6cad7a82cb66bee6406bc51f27353eefe8e20 Mon Sep 17 00:00:00 2001
From: Joey Hess <joeyh@joeyh.name>
Date: Tue, 27 Dec 2022 13:48:53 -0400
Subject: [PATCH] idea

---
 .../encrypted_keys_in_git_repository.mdwn     | 59 +++++++++++++++++++
 1 file changed, 59 insertions(+)
 create mode 100644 doc/todo/encrypted_keys_in_git_repository.mdwn

diff --git a/doc/todo/encrypted_keys_in_git_repository.mdwn b/doc/todo/encrypted_keys_in_git_repository.mdwn
new file mode 100644
index 0000000000..580d4346b8
--- /dev/null
+++ b/doc/todo/encrypted_keys_in_git_repository.mdwn
@@ -0,0 +1,59 @@
+The idea is to have a type of key that is based on a hash of the annexed
+file, but with some form of encryption or password protection. 
+
+So someone who can access the git repository is prevented from checking
+known hashes of files, and learning about content in the repository that is
+not available to them. (However, they would be able to see filenames
+and commit metadata, which might also expose information about what the
+files are. This is not a fully encrypted git repository like can be made
+with gcrypt.)
+
+This might be an added layer of security, or the git repository might be
+made public, including perhaps some of the annexed files, while this is
+used to obscure the hashes of some other annexed files. For example, a
+scientific dataset might make public files that are derived from patient
+data, and use this for the patient data files.
+
+----
+
+If two such repositories were merged, git-annex would need to somehow
+be able to tell how to decrypt a key, which could have come from either
+repository. So it seems that the key needs to include within it some
+identifier of the secret that is used to encrypt it. For example:
+
+    ENC--ident--foo
+
+Where ident would be something like a UUID of the secret, and foo is the
+file's hash (and perhaps size, or indeed a whole regular key) that is
+encrypted by the secret.
+
+----
+
+Should the key's size be included only in encrypted form, or in plaintext?
+
+Since git-annex constructs a Key without using the associated Backend,
+it currently parses the size field the same way for each type of key.
+So, it does not seem to be possible to encrypt the size field and 
+use that encrypted size to populate keySize. What would be
+doable though, is to replace uses of keySize with a new Backend method that
+returns the size.
+
+----
+
+Should the encryption method be reversible?
+
+If the encryption method is not reversible, the key's size would need to be
+included in plaintext, or left out entirely.
+
+But it does not seem necessary for the encryption method to be reversible
+otherwise. Consider if scrypt was used. When adding a file, git-annex would
+first hash it, and then run it through scrypt. That is not reversible,
+so when fscking a file, just repeat the same process and compare the
+resulting scrypt keys.
+
+Not being reversible is a nice benefit, because it makes it much harder
+for an attacker to brute-force. If it's reversible the attacker can
+brute-force the user's password, looking for a password that decrypts
+to something that looks right.
+
+--[[Joey]]