sped up sim step by about 200%

Noticed that it was quite slow compared with things like action sendwanted. Guessed that the slowdown is largely due to every step doing a simulated git pull/push. So, rather than always doing a pull/push, only do one when no other action can be found without it. This does mean that step will sometimes experience a split brain situation, but that seems like a good thing, because step ought to explore as many possible scenarios as it reasonably can.
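
The strategy described above can be sketched with a toy model. This is only an illustration under stated assumptions: `World`, `Action`, `step` and `runSteps` are hypothetical names that do not appear in Annex/Sim.hs, and the simulated sync is reduced to "reveal the hidden actions".

```haskell
module Main where

-- Hypothetical stand-in for a simulated file transfer or drop.
newtype Action = Transfer String
  deriving (Show, Eq)

-- Toy world state (an assumption, not the real SimState):
-- 'visible' holds actions that can be found without syncing,
-- 'hidden' holds actions that only show up after a pull/push.
data World = World
  { visible :: [Action]
  , hidden  :: [Action]
  } deriving (Show, Eq)

-- One step: prefer a cheap action; sync only when none is found.
-- The Bool is True when nothing is left to do even after syncing.
step :: World -> (World, Bool)
step w = case visible w of
  (_:rest) -> (w { visible = rest }, False)   -- cheap path, no sync
  [] -> case hidden w of
    [] -> (w, True)                           -- stable
    hs -> (World { visible = hs, hidden = [] }, False) -- sync reveals work

-- Run at most n steps; return the final world and the steps taken.
runSteps :: Int -> World -> (World, Int)
runSteps n = go 0
  where
    go k w
      | k >= n = (w, k)
      | otherwise = case step w of
          (w', True)  -> (w', k)
          (w', False) -> go (k + 1) w'

main :: IO ()
main = print (runSteps 10 (World [Transfer "a"] [Transfer "b"]))
```

In this sketch a sync counts as a step of its own and is only paid for when the cheap search comes up empty, which is where the saving would come from if most steps find something to do without syncing.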
2024-09-23 15:27:45 -04:00 · 969e6c2747
commit 969e6c2747
parent 6df101f8b4
4 changed files with 33 additions and 30 deletions
--- a/Annex/Sim.hs
+++ b/Annex/Sim.hs
@@ -490,22 +490,39 @@ applySimCommand' (CommandNotPresent _ _) _ _ = error "applySimCommand' CommandNo
 handleStep :: Int -> Int -> SimState SimRepo -> Annex (SimState SimRepo)
 handleStep startn n st
 	| n > 0 = do
-		let (st', actions) = getcomponents [] st $
-			getactions [] (M.toList (simRepos st))
-		(st'', stable) <- runoneaction actions st'
+		let (st', actions) = getactions unsyncactions st
+		(st'', nothingtodo) <- runoneaction actions st'
+		if nothingtodo
+			then do
+				let (st''', actions') = getactions [ActionSync] st''
+				(st'''', stable) <- runoneaction actions' st'''
 				if stable
-			then return st''
+					then do
+						showLongNote $ UnquotedString $ 
+							"Simulation has stabilized after "
+							++ show (startn - n)
+							++ " steps."
+						return st''''
+					else handleStep startn (pred n) st''''
 			else handleStep startn (pred n) st''
 	| otherwise = return st
  where
-	getactions c [] = c
-	getactions c ((repo, u):repos) = 
+	unsyncactions = 
+		[ ActionGetWanted
+		, ActionSendWanted
+		, \repo remote -> ActionDropUnwanted repo (Just remote)
+		]
+
+	getactions mks st' = getcomponents [] st' $
+		getactions' mks [] (M.toList (simRepos st'))
+
+	getactions' _ c [] = concat c
+	getactions' mks c ((repo, u):repos) = 
 		case M.lookup u (simConnections st) of
-			Nothing -> getactions c repos
+			Nothing -> getactions' mks c repos
 			Just remotes ->
-				let c' = map (ActionSync repo)
-					(S.toList remotes)
-				in getactions (c'++c) repos
+				let l = [mk repo remote | remote <- S.toList remotes, mk <- mks]
+				in getactions' mks (l:c) repos
 	
 	getcomponents c st' [] = (st', concat c)
 	getcomponents c st' (a:as) = case getSimActionComponents a st' of
@@ -513,12 +530,7 @@ handleStep startn n st
 		Right (Right st'') -> getcomponents c st'' as
 		Right (Left (st'', cs)) -> getcomponents (cs:c) st'' as
 	
-	runoneaction [] st' = do
-		showLongNote $ UnquotedString $ 
-			"Simulation has stabilized after "
-				++ show (startn - n)
-				++ " steps."
-		return (st', True)
+	runoneaction [] st' = return (st', True)
 	runoneaction actions st' = do
 		let (idx, st'') = simRandom st'
                  	(randomR (0, length actions - 1))
--- a/doc/git-annex-sim.mdwn
+++ b/doc/git-annex-sim.mdwn
@@ -193,8 +193,8 @@ as passed to "git annex sim" while a simulation is running.
  according to the current configuration, a message will be displayed
  to indicate that the simulation has stabilized.

-  (A step also simulates git pull and git push being run in each repository,
-  to all of its remotes. That happens before the file transfer or drop.)
+  This also simulates git pull and git push being run in each repository,
+  as needed in order to find additional things to do.

 * `action repo getwanted remote`

--- a/doc/todo/git-annex_proxies.mdwn
+++ b/doc/todo/git-annex_proxies.mdwn
@@ -120,11 +120,6 @@ present bar 9testfile
  clients having direct connections to the nodes, but not the same when
  there are more than 2 clients connected to the 2 gateways.

-* sim: Detect instability. This can be done by examining the history,
-  if a file is added or removed from the same repository repeatedly,
-  there is probably instability, although it may be an instability that
-  dampens out later.
-
 * sim: Set a random preferred content expression. Rather than generating a
  fully random expression, it would probably be most useful to take a set
  of terms and build an expression that randomly combines them with
@@ -140,12 +135,6 @@ present bar 9testfile

 ## items deferred until later for balanced preferred content and maxsize tracking

-* `git-annex assist --rebalance` of `balanced=foo:2` 
-  sometimes needs several runs to stabalize.
-
-  May not be a bug, needs reproducing and analysis.
-  Deferred for proving behavior of balanced preferred content stage.
-
 * The assistant is using NoLiveUpdate, but it should be posssible to plumb
  a LiveUpdate through it from preferred content checking to location log
  updating.
--- a/doc/todo/proving_preferred_content_behavior.mdwn
+++ b/doc/todo/proving_preferred_content_behavior.mdwn
@@ -113,3 +113,5 @@ The location log history could be examined at the end of the simulation
 to find problems like instability.

 [[!tag projects/openneuro]]
+
+> [[done]], see `git-annex sim` command. --[[Joey]]
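
The way the new `getactions'` in Annex/Sim.hs pairs each action constructor with each remote can be tried in isolation. The following is a minimal sketch, assuming simplified stand-in types: the `Action` type here uses plain strings for repositories and remotes, and `mkActions` is a hypothetical helper, not a function from the commit.

```haskell
module Main where

-- Simplified stand-ins for the sim actions (assumed shapes only).
data Action
  = GetWanted String String
  | SendWanted String String
  | DropUnwanted String (Maybe String)
  deriving (Show, Eq)

-- Apply every constructor to the repo and each of its remotes.
-- The remote is the outer generator, so all actions for one remote
-- are listed before moving on to the next remote.
mkActions :: [String -> String -> Action] -> String -> [String] -> [Action]
mkActions mks repo remotes =
  [ mk repo remote | remote <- remotes, mk <- mks ]

-- Mirrors the shape of the unsyncactions list in the diff.
unsyncActions :: [String -> String -> Action]
unsyncActions =
  [ GetWanted
  , SendWanted
  , \repo remote -> DropUnwanted repo (Just remote)
  ]

main :: IO ()
main = mapM_ print (mkActions unsyncActions "foo" ["bar", "baz"])
```

Passing the constructors as a list is what lets the same enumeration serve both the non-sync pass and the sync-only pass, by swapping the list that is handed in.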