diff --git a/doc/design/external_backend_protocol.mdwn b/doc/design/external_backend_protocol.mdwn
new file mode 100644
index 0000000000..2fbdb176ce
--- /dev/null
+++ b/doc/design/external_backend_protocol.mdwn
@@ -0,0 +1,178 @@
+**Draft**
+
+Communication between git-annex and a program implementing an external
+[[backend|backends]] uses this protocol.
+
+[[!toc ]]
+
+## starting the program
+
+The external backend program has a name like `git-annex-backend-XFOO`.
+When git-annex is configured to use a backend whose name starts with "X",
+or encounters a key in a repository whose backend name starts with "X", it
+looks for the corresponding external backend program in PATH.
+
+The program is started by git-annex when it needs to use it, and may be
+left running for a long period of time. Note that git-annex may choose to
+run multiple instances of the program.
+
+## protocol overview
+
+Communication is via stdin and stdout. While stderr is connected to the
+console and so visible to the user, the program should avoid using it
+except in the most exceptional circumstances.
+
+The protocol is line based. git-annex sends a request, and the program
+responds with a reply.
+
+Each protocol line starts with a command, which is followed by the
+command's parameters (a fixed number per command), each separated by a
+single space. The last parameter may contain spaces. Parameters may be
+empty, but the separating spaces are still required in that case.
+
+## example session
+
+git-annex always starts by sending a message asking the program what protocol
+version it uses.
+
+	GETVERSION
+
+The program responds.
+
+	VERSION 1
+
+git-annex will next query the program about the properties of the keys it
+uses (CANVERIFY, ISSTABLE, ISCRYPTOGRAPHICALLYSECURE), and the program will
+respond to each query.
+
+Then git-annex may ask the program to generate a key.
+
+	GENKEY somefile
+
+The program will respond with the key it generated, but if it needs to do
+an expensive operation, such as hashing the file, it can first send
+progress messages, indicating the position in the file it has processed.
+
+	PROGRESS 1024
+	PROGRESS 2048
+	GENKEY-SUCCESS XFOO-s2048--dbd009
+
+git-annex can also ask the program to verify if the content of a file
+matches a key.
+
+	VERIFYKEYCONTENT XFOO-s2048--dbd009 somefile
+
+Again the program can send progress messages as it works, finishing
+with the result of the verification.
+
+	PROGRESS 1024
+	PROGRESS 2048
+	VERIFYKEYCONTENT-SUCCESS
+
+## startup messages and replies
+
+These messages are sent to the program soon after starting it, and it should
+reply with one of the listed replies.
+
+* `GETVERSION`  
+  Always the first message sent.  
+  Currently the only version of this protocol is version 1.
+  * `VERSION 1`  
+* `CANVERIFY`  
+  Asks if the program can verify that the content of a file matches a key it generated.
+  The verification does not need to be cryptographically secure, but should
+  catch data corruption.
+  * `CANVERIFY-YES`
+  * `CANVERIFY-NO`
+* `ISSTABLE`  
+  Asks the program if a key it has generated will always have the same
+  content. The answer to this is almost always yes; URL keys are an example
+  of a type of key that may have different content at different times.
+  * `ISSTABLE-YES`
+  * `ISSTABLE-NO`
+* `ISCRYPTOGRAPHICALLYSECURE`  
+  Asks the program if keys it generates are verified using a cryptographically
+  secure hash. Note that sha1 is *not* a cryptographically secure hash any
+  longer. A program can change its answer to this question as the state of the
+  art advances, and should aim to stay ahead of the state of the art by a
+  reasonable amount of time.
+  * `ISCRYPTOGRAPHICALLYSECURE-YES`
+  * `ISCRYPTOGRAPHICALLYSECURE-NO`
+
+## main messages and replies
+
+This is where work happens.
+
+* `GENKEY ContentFile`  
+  The program should examine the ContentFile and from it generate a
+  key. While it is doing this, it can send any number of `PROGRESS`
+  messages indicating the position in the file that it's gotten to.
+  * `GENKEY-SUCCESS Key`
+  * `GENKEY-FAILURE ErrorMsg`
+* `VERIFYKEYCONTENT Key ContentFile`  
+  The program should examine the ContentFile and verify that it has the
+  content it would expect for the Key. While it is doing this, it can
+  send any number of `PROGRESS` messages indicating the position in the
+  file that it's gotten to. (If the program earlier sent CANVERIFY-NO,
+  it will not be asked to do this.)
+  * `VERIFYKEYCONTENT-SUCCESS`
+  * `VERIFYKEYCONTENT-FAILURE`
+
+## general messages
+
+These messages can be sent at any time by either git-annex or the program.
+
+* `ERROR ErrorMsg`  
+  Generic error. Can be sent at any time if things get too messed up to
+  continue. When possible, use a more specific reply.  
+  The program should exit after sending this, as git-annex will not talk to
+  it any further. If the program receives an ERROR from git-annex, it can
+  exit with its own ERROR.
+
+## considerations for generating keys
+
+See [[doc/internals/key_format]] for how to format a key.
+
+The backend name should match the name of the program, eg if the program
+is git-annex-backend-XFOO, it should generate a key starting with "XFOO-".
+
+The backend name (and program name) has to be all uppercase, and should be
+reasonably short (max 10 bytes or so), and should be entirely ascii
+alphanumerics. Eg, use similar names to other [[backends]].
+
+git-annex will automatically also support an "E" variant of the backend,
+which adds a filename extension to the end of the key. It does this
+entirely transparently to the program, so while the repository may be using
+XFOOE keys, the program will always generate and verify XFOO keys.
+
+The key name is typically some kind of hash, but is not limited to a hash.
+The length of it needs to be similar to the lengths of other git-annex
+keys. Too long a key name will make it annoying to work with repositories
+using them, or even cause problems due to filename length limits. 128 bytes
+maximum, but shorter is better.
+
+It's important that, if the program responds with
+ISCRYPTOGRAPHICALLYSECURE-YES, the key name contains only a hash, and not
+other data from some other source. That other data could be used to try to
+mount a sha1 collision attack against git, by embedding colliding material
+in the key name, where users are unlikely to notice it. While git has
+several things that make sha1 collision attacks difficult, we don't want
+this chink in the armor.
+
+## program names must be unique
+
+It's important that two different programs don't use the same name, because
+that would result in bad behavior if the wrong program were used with a
+repository with keys generated by the other program.
+
+Here is a list of programs, to avoid picking the same name. Edit this page
+to add yours to the list.
+
+* [[git-annex-backend-XFOO]] is a demo program implementing this protocol
+  with a shell script.
+
+## signals
+
+The program should not block SIGINT or SIGTERM. Doing so may cause
+git-annex to hang waiting on it to exit. Of course it's ok to catch those
+signals and do some necessary cleanup before exiting.
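
The naming rules from "considerations for generating keys" can be checked mechanically before publishing a program. A minimal sketch in POSIX shell follows; `valid_backend_name` is a hypothetical helper, not something git-annex provides, and it only checks the hard rules stated above (starts with "X", uppercase ascii alphanumerics only). The soft length recommendation is left to the author's judgement.

```shell
#!/bin/sh
# Hypothetical helper, not part of git-annex: checks a candidate
# external backend name against the naming rules in the design page.
valid_backend_name () {
	name="$1"
	case "$name" in
		# reject empty names and any character outside A-Z and 0-9
		""|*[!ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789]*) return 1 ;;
	esac
	case "$name" in
		X*) return 0 ;;   # external backend names start with "X"
		*) return 1 ;;
	esac
}
```

For example, `valid_backend_name XFOO && echo ok` prints `ok`, while a lowercase or hyphenated name makes the function return non-zero. The character class is spelled out rather than written as `A-Z` so that the result does not depend on the locale.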
diff --git a/doc/design/external_backend_protocol/git-annex-backend-XFOO b/doc/design/external_backend_protocol/git-annex-backend-XFOO
new file mode 100755
index 0000000000..0282fab4ae
--- /dev/null
+++ b/doc/design/external_backend_protocol/git-annex-backend-XFOO
@@ -0,0 +1,57 @@
+#!/bin/sh
+# Demo git-annex external backend program.
+# 
+# Install in PATH as git-annex-backend-XFOO
+#
+# Copyright 2020 Joey Hess; licenced under the GNU GPL version 3 or higher.
+
+set -e
+
+hashfile () {
+	local contentfile="$1"
+	# could send PROGRESS while doing this, but it's
+	# hard to implement that in shell
+	md5sum "$contentfile" | cut -d ' ' -f 1 || echo ''
+}
+
+while IFS= read -r line; do
+	cmd="${line%% *}" params="${line#* }"  # params keeps spaces in the last parameter
+	case "$cmd" in
+		GETVERSION)
+			echo VERSION 1
+		;;
+		CANVERIFY)
+			echo CANVERIFY-YES
+		;;
+		ISSTABLE)
+			echo ISSTABLE-YES
+		;;
+		ISCRYPTOGRAPHICALLYSECURE)
+			# md5 is not cryptographically secure
+			echo ISCRYPTOGRAPHICALLYSECURE-NO
+		;;
+		GENKEY)
+			contentfile="$params"
+			hash=$(hashfile "$contentfile")
+			if [ -n "$hash" ]; then
+				echo "GENKEY-SUCCESS" "XFOO--$hash"
+			else
+				echo "GENKEY-FAILURE" "md5sum failed"
+			fi
+		;;
+		VERIFYKEYCONTENT)
+			key="${params%% *}"
+			contentfile="${params#* }"
+			hash=$(hashfile "$contentfile")
+			khash=$(echo "$key" | sed 's/.*--//')
+			if [ -n "$hash" ] && [ "$hash" = "$khash" ]; then
+				echo "VERIFYKEYCONTENT-SUCCESS"
+			else
+				echo "VERIFYKEYCONTENT-FAILURE"
+			fi
+		;;
+		*)
+			echo ERROR protocol error; exit 1
+		;;
+	esac
+done
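
When trying a backend program by hand, the caller has to skip any `PROGRESS` lines that precede the final reply, as in the example session. A minimal sketch of that step follows; `read_reply` is a hypothetical test helper, not part of git-annex or of the demo program.

```shell
#!/bin/sh
# Hypothetical test helper, not part of git-annex or the demo program.
# Reads protocol lines from stdin, discards PROGRESS messages, and
# prints the first other line (the final reply to a request).
read_reply () {
	while IFS= read -r reply; do
		case "$reply" in
			"PROGRESS "*) ;;   # intermediate progress report, skip it
			*) printf '%s\n' "$reply"; return 0 ;;
		esac
	done
	return 1   # the stream ended without a final reply
}
```

Assuming the demo is installed in PATH and `somefile` exists, `printf 'GENKEY somefile\n' | git-annex-backend-XFOO | read_reply` shows the program's final reply to the request. A non-zero exit status means the program produced no reply at all.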
diff --git a/doc/design/external_special_remote_protocol.mdwn b/doc/design/external_special_remote_protocol.mdwn
index aee77cf8eb..e58264a5d8 100644
--- a/doc/design/external_special_remote_protocol.mdwn
+++ b/doc/design/external_special_remote_protocol.mdwn
@@ -194,7 +194,7 @@ the special remote can reply with `UNSUPPORTED-REQUEST`.
   (See Config/Cost.hs for some standard costs.)
   * `COST Int`  
     Indicates the cost of the remote.
-* `GETAVAILABILITY`
+* `GETAVAILABILITY`  
   Asks the remote if it is locally or globally available.
   (Ie stored in the cloud vs on a local disk.)  
   If the remote replies with `UNSUPPORTED-REQUEST`, its availability
@@ -227,7 +227,7 @@ the special remote can reply with `UNSUPPORTED-REQUEST`.
     can contain spaces.
   * `CHECKURL-FAILURE`  
     Indicates that the requested url could not be accessed.
-* `WHEREIS Key`
+* `WHEREIS Key`  
   Asks the remote to provide additional information about ways to access
   the content of a key stored in it, such as eg, public urls.
   This will be displayed to the user by eg, `git annex whereis`.
diff --git a/doc/todo/external_backends/comment_11_56224638fde7b46ee6f52211474cd047._comment b/doc/todo/external_backends/comment_11_56224638fde7b46ee6f52211474cd047._comment
new file mode 100644
index 0000000000..7a78ef77b5
--- /dev/null
+++ b/doc/todo/external_backends/comment_11_56224638fde7b46ee6f52211474cd047._comment
@@ -0,0 +1,15 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 11"""
+ date="2020-07-20T18:01:27Z"
+ content="""
+Wrote a draft [[design/external_backend_protocol]].
+
+I wonder if it makes sense to require the programs to format and parse
+their own keys; git-annex could break up the key and send the pieces in.
+The advantage though is that this lets a program decide whether or not to
+include information like the size and mtime fields in the key.
+And if more fields ever got added it would not need changes to the
+protocol. I guess it's simple enough to format and parse, as shown by the
+example shell program that does it.
+"""]]
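
To give a feel for how much parsing the comment above asks of a program, here is a rough sketch in POSIX shell that splits a key such as `XFOO-s2048--dbd009` into its parts. `parse_key` and the `key_*` variables are hypothetical names. The layout assumed is the one in the examples above: backend name, optional fields such as `s` (size) and `m` (mtime), then `--` and the key name. The key format page remains the authority.

```shell
#!/bin/sh
# Hypothetical helper: splits a key into backend, size, mtime and name.
# Results are left in key_backend, key_size, key_mtime and key_name.
parse_key () {
	key="$1"
	case "$key" in
		*--*) ;;
		*) return 1 ;;         # no "--" separator, not a key
	esac
	head="${key%%--*}"         # everything before the first "--"
	key_name="${key#*--}"      # everything after the first "--"
	key_backend="${head%%-*}"
	key_size=""
	key_mtime=""
	rest="${head#"$key_backend"}"   # remaining fields, eg "-s2048-m123"
	while [ -n "$rest" ]; do
		rest="${rest#-}"
		field="${rest%%-*}"
		case "$field" in
			s*) key_size="${field#s}" ;;
			m*) key_mtime="${field#m}" ;;
		esac
		rest="${rest#"$field"}"
	done
}
```

After `parse_key XFOO-s2048--dbd009`, `key_backend` is `XFOO`, `key_size` is `2048`, `key_mtime` is empty and `key_name` is `dbd009`. Only parameter expansion is used, so unusual characters in a key are never subject to word splitting or globbing.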