remove stale live changes from reposize database
Reorganized the reposize database directory, and split up a column.

checkStaleSizeChanges needs to run before needLiveUpdate, otherwise the process won't be holding a lock on its pid file, and another process could go in and expire the live update it records. It just so happens that they do get called in the correct order, since checking balanced preferred content calls getLiveRepoSizes before needLiveUpdate.

The 1 minute delay between checks is arbitrary, but will avoid excess work. The downside of it is that, if a process is dropping a file and gets interrupted, for 1 minute another process can expect that a repository will soon be smaller than it is. And so a process might send data to a repository when a file is not really going to be dropped from it. But note that this can already happen if a drop takes some time in, e.g., locking and then fails. So it seems possible that live updates should only be allowed to increase, rather than decrease, the size of a repository.
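The delay between stale checks described above could be sketched roughly as follows. This is only an in-process illustration in Python, not git-annex's Haskell code: `StaleCheckThrottle`, `due`, and `STALE_CHECK_INTERVAL` are hypothetical names, and how the real implementation records the time of the last check is not stated in this commit message.

```python
import time

# Arbitrary interval, mirroring the "1 minute delay" in the commit message.
STALE_CHECK_INTERVAL = 60.0


class StaleCheckThrottle:
    """Hypothetical helper: allow a stale check at most once per interval."""

    def __init__(self, interval=STALE_CHECK_INTERVAL, clock=time.monotonic):
        self.interval = interval
        self.clock = clock        # injectable so the logic can be tested
        self.last_check = None    # no check has run yet

    def due(self):
        """Return True (and record the time) if a check should run now."""
        now = self.clock()
        if self.last_check is None or now - self.last_check >= self.interval:
            self.last_check = now
            return True
        return False
```

A longer interval means less work per command, but also a longer window in which a stale live update from an interrupted process is still trusted.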
parent 278adbb726
commit f89a1b8216
7 changed files with 199 additions and 83 deletions
@@ -37,27 +37,6 @@ Planned schedule of work:
 
 * Test that live repo size data is correct and really works.
 
-* When loading the live update table, check if PIDs in it are still
-  running (and are still git-annex), and if not, remove stale entries
-  from it, which can accumulate when processes are interrupted.
-  Note that it will be ok for the wrong git-annex process, running again
-  at a pid to keep a stale item in the live update table, because that
-  is unlikely and exponentially unlikely to happen repeatedly, so stale
-  information will only be used for a short time.
-
-  But then, how to check if a PID is git-annex or not? /proc of course,
-  but what about other OS's? Windows?
-
-  A plan: Have git-annex lock a per-pid file at startup. Then before
-  loading the live updates table, check each other per-pid file, by
-  try to take a shared lock. If able to, that process is no longer running,
-  and its live updates should be considered stale, and can be removed
-  while loading the live updates table.
-
-  Might be better to not lock at startup, but only once live updates are
-  used. annex.pidlock might otherwise prevent running more than one
-  git-annex at a time.
-
 * The assistant is using NoLiveUpdate, but it should be posssible to plumb
   a LiveUpdate through it from preferred content checking to location log
   updating.
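The per-pid lock file plan in the removed todo item can be illustrated with a minimal sketch. It is written in Python with POSIX `flock` for brevity and is not git-annex's actual implementation. The directory layout and the function names `hold_pid_lock` and `stale_pids` are assumptions made for this example, and the sketch does not address Windows or `annex.pidlock`.

```python
import fcntl
import os


def hold_pid_lock(lock_dir, pid=None):
    """Create this process's pid file and take an exclusive lock on it.

    The returned file descriptor must stay open for as long as the
    process has live updates recorded; closing it releases the lock.
    """
    pid = os.getpid() if pid is None else pid
    path = os.path.join(lock_dir, str(pid))
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    fcntl.flock(fd, fcntl.LOCK_EX)
    return fd


def stale_pids(lock_dir):
    """Return names of pid files whose owning process no longer holds a lock.

    A successful non-blocking shared lock means nobody holds the
    exclusive lock, so that process's live updates can be treated as stale.
    """
    stale = []
    for name in sorted(os.listdir(lock_dir)):
        fd = os.open(os.path.join(lock_dir, name), os.O_RDONLY)
        try:
            fcntl.flock(fd, fcntl.LOCK_SH | fcntl.LOCK_NB)
        except BlockingIOError:
            continue  # still exclusively locked: the owner is running
        else:
            stale.append(name)
        finally:
            os.close(fd)  # also drops the shared lock, if it was taken
    return stale
```

This also shows why the ordering in the commit message matters: a process that recorded a live update before calling something like `hold_pid_lock` would look stale to any other process running the check in between.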