From 44b3136fdf77e88dadcdfc2ccd03691f18310c5d Mon Sep 17 00:00:00 2001
From: Joey Hess
Date: Wed, 3 Jul 2024 15:53:25 -0400
Subject: [PATCH] update

---
 .../P2P_locking_connection_drop_safety.mdwn        | 54 +++++++++++++++++--
 1 file changed, 51 insertions(+), 3 deletions(-)

diff --git a/doc/todo/P2P_locking_connection_drop_safety.mdwn b/doc/todo/P2P_locking_connection_drop_safety.mdwn
index 209ee18379..a45002030e 100644
--- a/doc/todo/P2P_locking_connection_drop_safety.mdwn
+++ b/doc/todo/P2P_locking_connection_drop_safety.mdwn
@@ -17,6 +17,7 @@ I'm inclined to agree with past me. While the P2P protocol could be
 extended with a way to verify that the connection is still open, there is
 a point where git-annex has told the remote to drop, and is relying on
 the locks remaining locked until the drop finishes.
+--[[Joey]]
 
 Worst case, I can imagine that the local git-annex process takes the remote
 locks. Then it's put to sleep for a day. Then it wakes up and drops from
@@ -24,11 +25,11 @@ the other remote. The P2P connections for the locks have long since closed.
 Consider for example, a ssh password prompt on connection to the remote
 to drop the content, and the user taking a long time to respond.
 
-It seems that lockContentWhile needs to guarantee that the content remains
+It seems that LOCKCONTENT needs to guarantee that the content remains
 locked for some amount of time. Then local git-annex would know it has
 at most that long to drop the content. But it's the remote that's
 dropping that really needs to know. So, extend the P2P protocol with a
-PRE-REMOVE step. After receiving PRE-REMOVE N, a REMOVE of that key is only
+PRE-REMOVE step. After receiving PRE-REMOVE N Key, a REMOVE of that key is only
 allowed until N seconds later. Sending PRE-REMOVE first, followed by
 LOCKCONTENT will guarantee the content remains locked for the full amount
 of time.
@@ -62,4 +63,51 @@ git-annex gets installed, a user is likely to have been using git-annex
 
 OTOH putting the timestamp in the lock file may be hard (eg on Windows).
 
---[[Joey]]
+> Status: Content retention files implemented on `p2p_locking` branch.
+> P2P LOCKCONTENT uses a 10 minute retention in case it gets killed,
+> but other values can be used in the future safely.
+
+----
+
+Extending the P2P protocol is a bit tricky, because the same P2P
+protocol connection could be used for several different things at
+the same time. A PRE-REMOVE N Key might be followed by removals of other
+keys, and eventually a removal of the requested key. There are
+sometimes pools of P2P connections that get used like this.
+So the server would need to cache some number of PRE-REMOVE timestamps.
+How many?
+
+Certainly care would need to be taken to send PRE-REMOVE to the same
+connection as REMOVE. How?
+
+Could this be done without extending the REMOVE side of the P2P protocol?
+
+1. check start time
+2. LOCKCONTENT
+3. prepare to remove
+4. in checkVerifiedCopy, check current time..
+   fail if more than 10 minutes from start
+5. REMOVE
+
+The issue with this is that git-annex could be paused for any amount of
+time between steps 4 and 5. Usually it won't pause..
+mkSafeDropProof calls checkVerifiedCopy and constructs the proof,
+and then it immediately sends REMOVE. But of course sending REMOVE
+could take arbitrarily long. Or git-annex could be paused at just the wrong
+point.
+
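+To make the race concrete, here is a minimal sketch of the client-side
+check in steps 1-5 (hypothetical Haskell, not git-annex's actual code;
+`retentionWindow` and `withinRetention` are made-up names):
+
+    import GHC.Clock (getMonotonicTime)
+
+    -- Assumed retention window, in seconds, matching the 10 minute
+    -- retention mentioned above.
+    retentionWindow :: Double
+    retentionWindow = 10 * 60
+
+    -- Run an action only while still inside the window that started
+    -- when LOCKCONTENT succeeded (lockTime comes from the same
+    -- monotonic clock).
+    withinRetention :: Double -> IO a -> IO (Maybe a)
+    withinRetention lockTime action = do
+        now <- getMonotonicTime
+        if now - lockTime < retentionWindow
+            then Just <$> action
+            else pure Nothing
+
+Even here, nothing stops the process from being suspended after the
+elapsed-time check succeeds but before the action (sending REMOVE)
+reaches the remote, which is exactly the gap described above.
+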
+Ok, let's reconsider... Add GETTIMESTAMP which causes the server to
+return its current timestamp. The same timestamp must be returned on any
+connection to the server, eg the server must have a single clock.
+That can be called before LOCKCONTENT.
+Then REMOVE Key Timestamp can fail if the current time is past the
+specified timestamp.
+
+How to handle this when proxying to a cluster? In a cluster, each node
+has a different clock. So GETTIMESTAMP will return a bunch of times.
+The cluster can get its own current time, and return that to the client.
+Then REMOVE Key Timestamp can have the timestamp adjusted when it's sent
+out to each node, by calling GETTIMESTAMP again and applying the offsets
+between the cluster's clock and each node's clock.
+
+This approach would need to use a monotonic clock!
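+
+As a rough sketch of that idea (hypothetical Haskell, not the actual
+protocol implementation; all names here are made up), assuming the
+server answers GETTIMESTAMP from a single monotonic clock:
+
+    import GHC.Clock (getMonotonicTime)
+
+    -- Seconds on the server's single monotonic clock.
+    type Timestamp = Double
+
+    -- GETTIMESTAMP: the server reports its monotonic clock.
+    getTimestamp :: IO Timestamp
+    getTimestamp = getMonotonicTime
+
+    -- REMOVE Key Timestamp: only allowed while the server's clock has
+    -- not yet reached the client-supplied timestamp.
+    removeAllowed :: Timestamp -> IO Bool
+    removeAllowed deadline = do
+        now <- getMonotonicTime
+        pure (now < deadline)
+
+    -- Proxying to a cluster node: re-express a timestamp given in the
+    -- cluster's clock in terms of a node's clock, using the offset
+    -- between the two clocks sampled at the same moment (eg by calling
+    -- GETTIMESTAMP on the node again).
+    adjustForNode :: Timestamp -> Timestamp -> Timestamp -> Timestamp
+    adjustForNode clusterNow nodeNow deadline =
+        deadline + (nodeNow - clusterNow)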