comments

2020-02-20 12:23:41 -04:00 · 2020-02-20 12:23:41 -04:00 · 0f9bf2d434
commit 0f9bf2d434
parent 79a9475007
2 changed files with 100 additions and 0 deletions
--- a/doc/todo/more_extensive_retries_to_mask_transient_failures/comment_7_53c2820a0a60ab1efe4560a58ecd4f0b._comment
+++ b/doc/todo/more_extensive_retries_to_mask_transient_failures/comment_7_53c2820a0a60ab1efe4560a58ecd4f0b._comment
@@ -0,0 +1,17 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""Re: retries due to locked index file"""
+ date="2020-02-20T15:32:11Z"
+ content="""
+If you have a bug where concurrent adds fail due to some locking problem,
+please file a bug report. This is not the correct place to discuss that
+problem. 
+
+[So far all you've shown is that a git index file lock,
+which could be a stale lock or a lock due to some git operation not under
+the control of git-annex, causes git annex add to fail. Since git-annex
+takes its own lock before any operation that touches the index file,
+I'm quite confident that concurrent git-annex adds do not interfere with
+one another. See Annex.Queue which uses withExclusiveLock when flushing
+the queue.]
+"""]]
--- a/doc/todo/more_extensive_retries_to_mask_transient_failures/comment_8_b302f24d3234b1c46ed1566cfdf696fb._comment
+++ b/doc/todo/more_extensive_retries_to_mask_transient_failures/comment_8_b302f24d3234b1c46ed1566cfdf696fb._comment
@@ -0,0 +1,83 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 8"""
+ date="2020-02-20T15:43:59Z"
+ content="""
+While separate processes would of course work, it would add quite a
+lot of overhead. Especially in places like Annex.Queue where multiple
+threads can currently cooperate in building up something that gets flushed
+to disk together, and separate processes would need to do more work.
+
+Of course, it's also entirely possible to write your own program that runs
+concurrent git-annex processes and kills them if they seem stuck or
+whatever.
+
+----
+
+Thinking some more about what would be necessary to make worker threads
+cancellable, they would first of all need to use async whenever they fork
+any threads of their own. That would be fairly easy to arrange for in the
+git-annex code, although it's currently not the case. (Remote.Git at least
+forks a thread w/o using async.)
+
+If a special remote uses a library that itself uses worker threads, that
+library would also need to use async. But I am pretty sure that the
+libraries in question (S3 and DAV) don't spawn off threads. Also, it would
+be an easy sell in the haskell community that any such library that
+spawns off its own threads use async or something similar that causes
+cancelation of an API call to cancel the threads.
+
+That leaves processes, and it occurs to me that there are at least 3
+different types of processes a remote might run.
+
+1. Interactive processes. Eg, ssh prompting for a password, or
+   git-credential. Killing such a process in the middle of user input
+   or after it's output a prompt would not be good. Also, being blocked
+   by a prompt is not the same as having stalled a download.
+
+   Locking already prevents more than one thread from running
+   such an interactive process; the actions are run inside `prompt`.
+   So, something would need to be done to prevent killing threads that
+   are in `prompt`.
+
+2. Processes in worker pools shared among threads of the remote.
+   The ExternalState pool is an example of this, the P2PSshConnectionPool
+   is another.
+
+   Killing a thread needs to kill whatever external process
+   it's currently using, but on the other hand, it could have started an
+   external process that's idle, or that is now being used by some other
+   thread, and that process should not be killed.
+
+   I think these all work the same: A process is removed from the pool
+   while it's being used, and then gets put back into the pool once
+   it's idle again. So, register the pid as belonging to a thread when
+   the thread removes it from the pool, and deregister it when the thread
+   returns it.
+
+3. All the rest. For these what's needed is some way to register
+   the pid of the process that a thread starts as belonging to the thread,
+   so that on killing the thread that pid can also be killed.
+   
+   Starting the process and registering it needs to be done 
+   in an exception-safe way; if a cancelation exception is thrown
+   as the process is started and before registering its pid, 
+   the process needs to be killed.
+
+   This seems like it would need wrappers for everything in
+   Utility.Process, to gather and register the pids.
+
+   It might be that a remote runs a child thread using withAsync, and
+   then that thread starts a process. So the process would be
+   registered as belonging to the child thread. Then, if the parent
+   thread gets killed, the signal would propagate to kill the child thread
+   due to async being used, resulting in the process being left running,
+   because it was not registered as belonging to the parent thread.
+   This is difficult to solve, because the child thread does not know what
+   thread is its parent.
+
+Wow, this looks like a lot of work, and it would be fragile -- 
+any mistake would not be noticed until git-annex tried to kill a worker
+thread, and left a process behind -- and the consequence of a mistake
+could potentially be a (slowish) fork bomb.
+"""]]
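
The exception-safe registration described in item 3 of comment 8 could be sketched roughly as follows. This is only an illustration under stated assumptions, not git-annex code: `Registry`, `newRegistry`, `withRegisteredProcess`, `killProcessOf` and `demo` are hypothetical names, the registry tracks at most one process per thread, and it relies on `bracket` masking asynchronous exceptions while the process is started and recorded.

```haskell
import Control.Concurrent (ThreadId, myThreadId)
import Control.Exception (bracket)
import Control.Monad (void)
import Data.IORef (IORef, newIORef, atomicModifyIORef', readIORef)
import qualified Data.Map.Strict as M
import System.Exit (ExitCode(..))
import System.Process
    ( CreateProcess, ProcessHandle, createProcess, shell
    , terminateProcess, waitForProcess )

-- Hypothetical registry: the process each thread currently owns.
-- Simplifying assumption: at most one process per thread.
type Registry = IORef (M.Map ThreadId ProcessHandle)

newRegistry :: IO Registry
newRegistry = newIORef M.empty

-- Start a process, register it to the calling thread, run the action,
-- then deregister and reap it. bracket runs `start` with asynchronous
-- exceptions masked, and atomicModifyIORef' does not block, so a
-- cancellation should not arrive between starting the process and
-- registering it.
withRegisteredProcess
    :: Registry -> CreateProcess -> (ProcessHandle -> IO a) -> IO a
withRegisteredProcess reg cp act = bracket start stop (act . snd)
  where
    start = do
        tid <- myThreadId
        (_, _, _, ph) <- createProcess cp
        atomicModifyIORef' reg (\m -> (M.insert tid ph m, ()))
        return (tid, ph)
    -- Runs on normal exit and on cancellation alike.
    stop (tid, ph) = do
        atomicModifyIORef' reg (\m -> (M.delete tid m, ()))
        terminateProcess ph -- does nothing if it has already exited
        void (waitForProcess ph)

-- Terminate whatever process is registered to the given thread, if any.
killProcessOf :: Registry -> ThreadId -> IO ()
killProcessOf reg tid = do
    m <- readIORef reg
    mapM_ terminateProcess (M.lookup tid m)

-- Usage: run a short-lived shell command under the registry.
demo :: IO ExitCode
demo = do
    reg <- newRegistry
    withRegisteredProcess reg (shell "exit 0") waitForProcess
```

Note that this sketch does nothing about the child-thread problem raised at the end of item 3: a process started by a thread forked with withAsync would still be registered to the child thread, not to its parent.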