closing in on finishing live reposizes

Fixed successfullyFinishedLiveSizeChange to not update the rolling total
when a redundant change is in RecentChanges.

Made setRepoSizes clear RecentChanges that are no longer needed.
It might be possible to clear those earlier, this is only a convenient
point to do it.

The reason it's safe to clear RecentChanges here is that, in order for a
live update to call successfullyFinishedLiveSizeChange, a change must be
made to a location log. If a RecentChange gets cleared, and just after
that a new live update is started, making the same change, the location
log has already been changed (since the RecentChange exists), and
so when the live update succeeds, it won't call
successfullyFinishedLiveSizeChange. The reason it doesn't
clear RecentChanges when there is a reduntant live update is because
I didn't want to think through whether or not all races are avoided in
that case.

The rolling total in SizeChanges is never cleared. Instead,
calcJournalledRepoSizes gets the initial value of it, and then
getLiveRepoSizes subtracts that initial value from the current value.
Since the rolling total can only be updated by updateRepoSize,
which is called with the journal locked, locking the journal in
calcJournalledRepoSizes ensures that the database does not change while
reading the journal.
This commit is contained in:
Joey Hess 2024-08-27 11:04:27 -04:00
parent 23d44aa4aa
commit 4d2f95853d
No known key found for this signature in database
GPG key ID: DB12DB0FF05F8F38
6 changed files with 165 additions and 248 deletions

View file

@ -133,7 +133,7 @@ data AnnexRead = AnnexRead
, forcenumcopies :: Maybe NumCopies
, forcemincopies :: Maybe MinCopies
, forcebackend :: Maybe String
, reposizes :: MVar (Maybe (M.Map UUID RepoSize))
, reposizes :: MVar (Maybe (M.Map UUID (RepoSize, SizeOffset)))
, rebalance :: Bool
, useragent :: Maybe String
, desktopnotify :: DesktopNotify

View file

@ -13,10 +13,10 @@ module Annex.RepoSize (
) where
import Annex.Common
import Annex.RepoSize.LiveUpdate
import qualified Annex
import Annex.Branch (UnmergedBranches(..), getBranch)
import qualified Database.RepoSize as Db
import Annex.Journal
import Logs
import Logs.Location
import Logs.UUID
@ -36,7 +36,10 @@ import qualified Data.Set as S
- was called. It does not update while git-annex is running.
-}
getRepoSizes :: Bool -> Annex (M.Map UUID RepoSize)
getRepoSizes quiet = do
getRepoSizes quiet = M.map fst <$> getRepoSizes' quiet
getRepoSizes' :: Bool -> Annex (M.Map UUID (RepoSize, SizeOffset))
getRepoSizes' quiet = do
rsv <- Annex.getRead Annex.reposizes
liftIO (takeMVar rsv) >>= \case
Just sizemap -> do
@ -47,22 +50,24 @@ getRepoSizes quiet = do
{- Like getRepoSizes, but with live updates. -}
getLiveRepoSizes :: Bool -> Annex (M.Map UUID RepoSize)
getLiveRepoSizes quiet = do
h <- Db.getRepoSizeHandle
liftIO (Db.estimateLiveRepoSizes h) >>= \case
Just (m, annexbranchsha) -> return m
Nothing -> do
-- Db.estimateLiveRepoSizes needs the
-- reposizes to be calculated first.
m <- getRepoSizes quiet
liftIO (Db.estimateLiveRepoSizes h) >>= \case
Just (m', annexbranchsha) -> return m'
Nothing -> return m
sizemap <- getRepoSizes' quiet
go sizemap `onException` return (M.map fst sizemap)
where
go sizemap = do
h <- Db.getRepoSizeHandle
liveoffsets <- liftIO $ Db.liveRepoOffsets h
let calc u (RepoSize size, SizeOffset startoffset) =
case M.lookup u liveoffsets of
Nothing -> RepoSize size
Just (SizeOffset offset) -> RepoSize $
size + (offset - startoffset)
return $ M.mapWithKey calc sizemap
{- Fills an empty Annex.reposizes MVar with current information
- from the git-annex branch, supplimented with journalled but
- not yet committed information.
-}
calcRepoSizes :: Bool -> MVar (Maybe (M.Map UUID RepoSize)) -> Annex (M.Map UUID RepoSize)
calcRepoSizes :: Bool -> MVar (Maybe (M.Map UUID (RepoSize, SizeOffset))) -> Annex (M.Map UUID (RepoSize, SizeOffset))
calcRepoSizes quiet rsv = go `onException` failed
where
go = do
@ -73,7 +78,7 @@ calcRepoSizes quiet rsv = go `onException` failed
Just oldbranchsha -> do
currbranchsha <- getBranch
if oldbranchsha == currbranchsha
then calcJournalledRepoSizes oldsizemap oldbranchsha
then calcJournalledRepoSizes h oldsizemap oldbranchsha
else incrementalupdate h oldsizemap oldbranchsha currbranchsha
liftIO $ putMVar rsv (Just sizemap)
return sizemap
@ -83,12 +88,12 @@ calcRepoSizes quiet rsv = go `onException` failed
showSideAction "calculating repository sizes"
(sizemap, branchsha) <- calcBranchRepoSizes
liftIO $ Db.setRepoSizes h sizemap branchsha
calcJournalledRepoSizes sizemap branchsha
calcJournalledRepoSizes h sizemap branchsha
incrementalupdate h oldsizemap oldbranchsha currbranchsha = do
(sizemap, branchsha) <- diffBranchRepoSizes quiet oldsizemap oldbranchsha currbranchsha
liftIO $ Db.setRepoSizes h sizemap branchsha
calcJournalledRepoSizes sizemap branchsha
calcJournalledRepoSizes h sizemap branchsha
failed = do
liftIO $ putMVar rsv (Just M.empty)
@ -120,13 +125,21 @@ calcBranchRepoSizes = do
- data from journalled location logs.
-}
calcJournalledRepoSizes
:: M.Map UUID RepoSize
:: Db.RepoSizeHandle
-> M.Map UUID RepoSize
-> Sha
-> Annex (M.Map UUID RepoSize)
calcJournalledRepoSizes startmap branchsha =
overLocationLogsJournal startmap branchsha
(\k v m -> pure (accumRepoSizes k v m))
Nothing
-> Annex (M.Map UUID (RepoSize, SizeOffset))
calcJournalledRepoSizes h startmap branchsha =
-- Lock the journal to prevent updates to the size offsets
-- in the repository size database while this is processing
-- the journal files.
lockJournal $ \_jl -> do
sizemap <- overLocationLogsJournal startmap branchsha
(\k v m' -> pure (accumRepoSizes k v m'))
Nothing
offsets <- liftIO $ Db.recordedRepoOffsets h
let getoffset u = fromMaybe (SizeOffset 0) $ M.lookup u offsets
return $ M.mapWithKey (\u sz -> (sz, getoffset u)) sizemap
{- Incremental update by diffing. -}
diffBranchRepoSizes :: Bool -> M.Map UUID RepoSize -> Sha -> Sha -> Annex (M.Map UUID RepoSize, Sha)
@ -180,3 +193,22 @@ diffBranchRepoSizes quiet oldsizemap oldbranchsha newbranchsha = do
(\m u -> M.insertWith (flip const) u (RepoSize 0) m)
newsizemap
knownuuids
addKeyRepoSize :: Key -> Maybe RepoSize -> Maybe RepoSize
addKeyRepoSize k mrs = case mrs of
Just (RepoSize sz) -> Just $ RepoSize $ sz + ksz
Nothing -> Just $ RepoSize ksz
where
ksz = fromMaybe 0 $ fromKey keySize k
removeKeyRepoSize :: Key -> Maybe RepoSize -> Maybe RepoSize
removeKeyRepoSize k mrs = case mrs of
Just (RepoSize sz) -> Just $ RepoSize $ sz - ksz
Nothing -> Nothing
where
ksz = fromMaybe 0 $ fromKey keySize k
accumRepoSizes :: Key -> (S.Set UUID, S.Set UUID) -> M.Map UUID RepoSize -> M.Map UUID RepoSize
accumRepoSizes k (newlocs, removedlocs) sizemap =
let !sizemap' = foldl' (flip $ M.alter $ addKeyRepoSize k) sizemap newlocs
in foldl' (flip $ M.alter $ removeKeyRepoSize k) sizemap' removedlocs

View file

@ -31,25 +31,6 @@ updateRepoSize lu u k s = liftIO $ finishedLiveUpdate lu u k sc
InfoMissing -> RemovingKey
InfoDead -> RemovingKey
addKeyRepoSize :: Key -> Maybe RepoSize -> Maybe RepoSize
addKeyRepoSize k mrs = case mrs of
Just (RepoSize sz) -> Just $ RepoSize $ sz + ksz
Nothing -> Just $ RepoSize ksz
where
ksz = fromMaybe 0 $ fromKey keySize k
removeKeyRepoSize :: Key -> Maybe RepoSize -> Maybe RepoSize
removeKeyRepoSize k mrs = case mrs of
Just (RepoSize sz) -> Just $ RepoSize $ sz - ksz
Nothing -> Nothing
where
ksz = fromMaybe 0 $ fromKey keySize k
accumRepoSizes :: Key -> (S.Set UUID, S.Set UUID) -> M.Map UUID RepoSize -> M.Map UUID RepoSize
accumRepoSizes k (newlocs, removedlocs) sizemap =
let !sizemap' = foldl' (flip $ M.alter $ addKeyRepoSize k) sizemap newlocs
in foldl' (flip $ M.alter $ removeKeyRepoSize k) sizemap' removedlocs
-- When the UUID is Nothing, it's a live update of the local repository.
prepareLiveUpdate :: Maybe UUID -> Key -> SizeChange -> Annex LiveUpdate
prepareLiveUpdate mu k sc = do

View file

@ -25,10 +25,11 @@ module Database.RepoSize (
closeDb,
getRepoSizes,
setRepoSizes,
estimateLiveRepoSizes,
startingLiveSizeChange,
successfullyFinishedLiveSizeChange,
removeStaleLiveSizeChange,
recordedRepoOffsets,
liveRepoOffsets,
) where
import Annex.Common
@ -164,6 +165,7 @@ setRepoSizes (RepoSizeHandle (Just h)) sizemap branchcommitsha =
unsetRepoSize u
forM_ (M.toList sizemap) $
uncurry setRepoSize
clearRecentChanges
recordAnnexBranchCommit branchcommitsha
setRepoSizes (RepoSizeHandle Nothing) _ _ = noop
@ -192,15 +194,29 @@ startingLiveSizeChange (RepoSizeHandle (Just h)) u k sc sid =
]
startingLiveSizeChange (RepoSizeHandle Nothing) _ _ _ _ = noop
{- A live size change has successfully finished.
-
- Update the rolling total, add as a recent change,
- and remove the live change in the same transaction.
-
- But, it's possible that the same change has been done by two
- different processes or threads. If there is a matching recent change,
- then this one is redundant, so remove it without updating the rolling
- total.
-}
successfullyFinishedLiveSizeChange :: RepoSizeHandle -> UUID -> Key -> SizeChange -> SizeChangeId -> IO ()
successfullyFinishedLiveSizeChange (RepoSizeHandle (Just h)) u k sc sid =
H.commitDb h $ do
-- Update the rolling total, add as a recent change,
-- and remove the live change in the same transaction.
getRecentChange u k >>= \case
Just sc' | sc == sc' -> remove
_ -> go
where
go = do
rollingtotal <- getSizeChangeFor u
setSizeChangeFor u (updateRollingTotal rollingtotal sc k)
addRecentChange u k sc
removeLiveSizeChange u k sc sid
remove
remove = removeLiveSizeChange u k sc sid
successfullyFinishedLiveSizeChange (RepoSizeHandle Nothing) _ _ _ _ = noop
updateRollingTotal :: FileSize -> SizeChange -> Key -> FileSize
@ -231,6 +247,13 @@ getLiveSizeChanges = M.fromListWith (++) . map conv <$> selectList [] []
let LiveSizeChanges u k sid sc = entityVal entity
in (u, [(k, (sc, sid))])
getLiveSizeChanges' :: SqlPersistM [(UUID, Key, SizeChange)]
getLiveSizeChanges' = map conv <$> selectList [] []
where
conv entity =
let LiveSizeChanges u k _sid sc = entityVal entity
in (u, k, sc)
getSizeChanges :: SqlPersistM (M.Map UUID FileSize)
getSizeChanges = M.fromList . map conv <$> selectList [] []
where
@ -251,7 +274,7 @@ setSizeChangeFor u sz =
(UniqueRepoRollingTotal u)
(SizeChanges u sz)
[SizeChangesRollingtotal =. sz]
addRecentChange :: UUID -> Key -> SizeChange -> SqlPersistM ()
addRecentChange u k sc =
void $ upsertBy
@ -269,19 +292,49 @@ getRecentChange u k = do
(s:_) -> Just $ recentChangesChange $ entityVal s
[] -> Nothing
{- Gets the sizes of Repos as of a commit to the git-annex branch
- (which is not necessarily the current commit), adjusted with all
- live changes that have happened since then or are happening now.
getRecentChanges :: SqlPersistM [(UUID, Key, SizeChange)]
getRecentChanges = map conv <$> selectList [] []
where
conv entity =
let RecentChanges u k sc = entityVal entity
in (u, k, sc)
{- Clears recent changes, except when there is a live change that is
- redundant with a recent change. -}
clearRecentChanges :: SqlPersistM ()
clearRecentChanges = do
live <- getLiveSizeChanges'
if null live
then deleteWhere ([] :: [Filter RecentChanges])
else do
let liveset = S.fromList live
rcs <- getRecentChanges
forM_ rcs $ \rc@(u, k, sc) ->
when (S.notMember rc liveset) $
deleteWhere
[ RecentChangesRepo ==. u
, RecentChangesKey ==. k
, RecentChangesChange ==. sc
]
{- Gets the recorded offsets to sizes of Repos, not including live
- changes. -}
recordedRepoOffsets :: RepoSizeHandle -> IO (M.Map UUID SizeOffset)
recordedRepoOffsets (RepoSizeHandle (Just h)) =
M.map SizeOffset <$> H.queryDb h getSizeChanges
recordedRepoOffsets (RepoSizeHandle Nothing) = pure mempty
{- Gets the offsets to sizes of Repos, including all live changes that
- are happening now.
-
- This does not necessarily include all changes that have been journalled,
- This does not necessarily include all changes that have been made,
- only ones that had startingLiveSizeChange called for them will be
- included. Also live changes or recent changes that were to a UUID not in
- the RepoSizes map are not included.
- included.
-
- In the unlikely case where two live changes are occurring, one
- adding a key and the other removing the same key, the one
- adding the key is used, in order to err on the side of a larger
- RepoSize.
- repository size.
-
- In the case where the same live change is recorded by two different
- processes or threads, the first to complete will record it as a recent
@ -291,24 +344,14 @@ getRecentChange u k = do
- This is only expensive when there are a lot of live changes happening at
- the same time.
-}
estimateLiveRepoSizes :: RepoSizeHandle -> IO (Maybe (M.Map UUID RepoSize, Sha))
estimateLiveRepoSizes (RepoSizeHandle (Just h)) = H.queryDb h $ do
getAnnexBranchCommit >>= \case
Just annexbranchsha -> do
sizechanges <- getSizeChanges
livechanges <- getLiveSizeChanges
reposizes <- getRepoSizes'
m <- M.fromList <$> forM reposizes
(go sizechanges livechanges)
return (Just (m, annexbranchsha))
Nothing -> return Nothing
liveRepoOffsets :: RepoSizeHandle -> IO (M.Map UUID SizeOffset)
liveRepoOffsets (RepoSizeHandle (Just h)) = H.queryDb h $ do
sizechanges <- getSizeChanges
livechanges <- getLiveSizeChanges
let us = nub (M.keys sizechanges ++ M.keys livechanges)
M.fromList <$> forM us (go sizechanges livechanges)
where
go
:: M.Map UUID FileSize
-> M.Map UUID [(Key, (SizeChange, SizeChangeId))]
-> (UUID, RepoSize)
-> SqlPersistM (UUID, RepoSize)
go sizechanges livechanges (u, RepoSize startsize) = do
go sizechanges livechanges u = do
let livechangesbykey =
M.fromListWith (++) $
map (\(k, v) -> (k, [v])) $
@ -321,18 +364,13 @@ estimateLiveRepoSizes (RepoSizeHandle (Just h)) = H.queryDb h $ do
(\t (k, sc) -> updateRollingTotal t sc k)
(fromMaybe 0 (M.lookup u sizechanges))
livechanges'
return (u, RepoSize (startsize + sizechange))
return (u, SizeOffset sizechange)
combinelikelivechanges =
S.elems
. S.fromList
. map (\(k, (sc, _)) -> (k, sc))
nonredundantlivechange
:: M.Map Key [(SizeChange, SizeChangeId)]
-> UUID
-> (Key, (SizeChange, SizeChangeId))
-> SqlPersistM Bool
nonredundantlivechange livechangesbykey u (k, (sc, cid))
| null (competinglivechanges livechangesbykey k sc cid) =
getRecentChange u k >>= pure . \case
@ -340,14 +378,8 @@ estimateLiveRepoSizes (RepoSizeHandle (Just h)) = H.queryDb h $ do
Just sc' -> sc /= sc'
| otherwise = pure False
competinglivechanges
:: M.Map Key [(SizeChange, SizeChangeId)]
-> Key
-> SizeChange
-> SizeChangeId
-> [(SizeChange, SizeChangeId)]
competinglivechanges livechangesbykey k RemovingKey cid =
filter (\(sc', cid') -> cid /= cid' && sc' == AddingKey)
(fromMaybe [] $ M.lookup k livechangesbykey)
competinglivechanges _ _ AddingKey _ = []
estimateLiveRepoSizes (RepoSizeHandle Nothing) = return Nothing
liveRepoOffsets (RepoSizeHandle Nothing) = pure mempty

View file

@ -28,6 +28,10 @@ newtype RepoSize = RepoSize { fromRepoSize :: Integer }
newtype MaxSize = MaxSize { fromMaxSize :: Integer }
deriving (Show, Eq, Ord)
-- An offset to the size of a repo.
newtype SizeOffset = SizeOffset { fromSizeChange :: Integer }
deriving (Show, Eq, Ord, Num)
-- Used when an action is in progress that will change the current size of
-- a repository.
--

View file

@ -35,173 +35,40 @@ Planned schedule of work:
May not be a bug, needs reproducing and analysis.
* Concurrency issues with RepoSizes calculation and balanced content:
* Make sure that two threads don't check balanced preferred content at the
same time, so each thread always sees a consistent picture of what is
happening. Use locking as necessary.
* What if 2 concurrent threads are considering sending two different
keys to a repo at the same time. It can hold either but not both.
It should avoid sending both in this situation.
* When loading the live update table, check if PIDs in it are still
running (and are still git-annex), and if not, remove stale entries
from it, which can accumulate when processes are interrupted.
Note that it will be ok for the wrong git-annex process, running again
at a pid to keep a stale item in the live update table, because that
is unlikely and exponentially unlikely to happen repeatedly, so stale
information will only be used for a short time.
* There can also be a race with 2 concurrent threads where one just
finished sending to a repo, but has not yet updated the location log.
So the other one won't see an updated repo size.
The fact that location log changes happen in CommandCleanup makes
this difficult to fix.
But then, how to check if a PID is git-annex or not? /proc of course,
but what about other OS's? Windows?
Could provisionally update Annex.reposizes before starting to send a
key, and roll it back if the send fails. But then Logs.Location
would update Annex.reposizes redundantly. So would need to remember
the provisional update was made until that is called.... But what if it
is never called for some reason?
Also, in a race between two threads at the checking preferred content
stage, neither would have started sending yet, and so both would think
it was ok for them to.
This race only really matters when the repo becomes full,
then the second thread will fail to send because it's full. Or will
send more than the configured maxsize. Still this would be good to
fix.
* If all the above thread concurrency problems are fixed, separate
processes will still have concurrency problems. One case where that is
bad is a cluster accessed via ssh. Each connection to the cluster is
a separate process. So each will be unaware of changes made by others.
When `git-annex copy --to cluster -Jn` is used, this makes a single
command behave non-ideally, the same as the thread concurrency
problems.
* Possible solution:
Add to reposizes db a table for live updates.
Listing process ID, thread ID, UUID, key, addition or removal
(done)
Add to reposizes db a table for sizechanges. This has for each UUID
a rolling total which is the total size changes that have accumulated
since the last update of the reposizes table.
So adding the reposizes table to sizechanges gives the current
size.
Make checking the balanced preferred content limit record a
live update in the table (done)
... and use other live updates and sizechanges in making its decision
Note: This will only work when preferred content is being checked.
If a git-annex copy without --auto is run, for example, it won't
tell other processes that it is in the process of filling up a remote.
That seems ok though, because if the user is running a command like
that, they are ok with a remote filling up.
Make sure that two threads don't check balanced preferred content at the
same time, so each thread always sees a consistent picture of what is
happening. Use locking as necessary.
When updating location log for a key, when there is actually a change,
update the db, remove the live update (done) and update the sizechanges
table in the same transaction (done).
Two concurrent processes might both start the same action, eg dropping
a key, and both succeed, and so both update the location log. One needs
to update the log and the sizechanges table. The other needs to see
that it has no actual change to report, and so avoid updating the
location log (already the case) and avoid updating the sizechanges
table. (done)
Detect when an upload (or drop) fails, and remove from the live
update table. (done)
When loading the live update table, check if PIDs in it are still
running (and are still git-annex), and if not, remove stale entries
from it, which can accumulate when processes are interrupted.
Note that it will be ok for the wrong git-annex process, running again
at a pid to keep a stale item in the live update table, because that
is unlikely and exponentially unlikely to happen repeatedly, so stale
information will only be used for a short time.
But then, how to check if a PID is git-annex or not? /proc of course,
but what about other OS's? Windows?
How? Possibly have a thread that
waits on an empty MVar. Thread MVar through somehow to location log
update. (Seems this would need checking preferred content to return
the MVar? Or alternatively, the MVar could be passed into it, which
seems better..) Fill MVar on location log update. If MVar gets
GCed without being filled, the thread will get an exception and can
remove from table and cache then. This does rely on GC behavior, but if
the GC takes some time, it will just cause a failed upload to take
longer to get removed from the table and cache, which will just prevent
another upload of a different key from running immediately.
(Need to check if MVar GC behavior operates like this.
See https://stackoverflow.com/questions/10871303/killing-a-thread-when-mvar-is-garbage-collected )
Perhaps stale entries can be found in a different way. Require the live
update table to be updated with a timestamp every 5 minutes. The thread
that waits on the MVar can do that, as long as the transfer is running. If
interrupted, it will become stale in 5 minutes, which is probably good
enough? Could do it every minute, depending on overhead. This could
also be done by just repeatedly touching a file named with the processes's
pid in it, to avoid sqlite overhead.
* Still implementing LiveUpdate. Check for TODO XXX markers
* Concurrency issue noted in commit db89e39df606b6ec292e0f1c3a7a60e317ac60f1
But: There will be a window where the redundant LiveUpdate is still
visible in the db, and processes can see it, combine it with the
rollingtotal, and arrive at the wrong size. This is a small window, but
it still ought to be addressed. Unsure if it would always be safe to
remove the redundant LiveUpdate? Consider the case where two drops and a
get are all running concurrently somehow, and the order they finish is
[drop, get, drop]. The second drop seems redundant to the first, but
it would not be safe to remove it. While this seems unlikely, it's hard
to rule out that a get and drop at different stages can both be running
at the same time.
It also is possible for a redundant LiveUpdate to get added to the db
just after the rollingtotal was updated. In this case, combining the LiveUpdate
with the rollingtotal again yields the wrong reposize.
So is the rollingtotal doomed to not be accurate?
A separate table could be kept of recent updates. When combining a LiveUpdate
with the rollingtotal to get a reposize, first check if the LiveUpdate is
redundant given a recent update. When updating the RepoSizes table, clear the
recent updates table and the rolling totals table (in the same transaction).
This recent updates table could get fairly large, but only needs to be queried
for each current LiveUpdate, of which there are not ususally many running.
When does a recent update mean a LiveUpdate is redundant? In the case of two drops,
the second is clearly redundant. But what about two gets and a drop? In this
case, after the first get, we don't know what order operations will
happen in. So the fact that the first get is in the recent updates table
should not make the second get be treated as redundant.
So, look up each LiveUpdate in the recent updates table. When the same
operation is found there, look to see if there is any other LiveUpdate of
the same key and uuid, but with a different SizeChange. Only when there is
not is the LiveUpdate redundant.
What if the recent updates table contains a get and a drop of the same
key. Now a get is running. Is it redundant? Perhaps the recent updates
table needs timestamps. More simply, when adding a drop to the recent
updates table, any existing get of the same key should be removed.
* In the case where a copy to a remote fails (due eg to annex.diskreserve),
the LiveUpdate thread can not get a chance to catch its exception when
the LiveUpdate is gced, before git-annex exits. In this case, the
database is left with some stale entries in the live update table.
This is not a big problem because the same can happen when the process is
interrupted. Still it would be cleaner for this not to happen. Is there
any way to prevent it? Waiting 1 GC tick before exiting would do it,
I'd think, but I tried manually doing a performGC at git-annex shutdown
and it didn't help.
getLiveRepoSizes is an unfinished try at implementing the above.
* Something needs to empty SizeChanges and RecentChanges when
setRepoSizes is called. While avoiding races.
How? Possibly have a thread that
waits on an empty MVar. Thread MVar through somehow to location log
update. (Seems this would need checking preferred content to return
the MVar? Or alternatively, the MVar could be passed into it, which
seems better..) Fill MVar on location log update. If MVar gets
GCed without being filled, the thread will get an exception and can
remove from table and cache then. This does rely on GC behavior, but if
the GC takes some time, it will just cause a failed upload to take
longer to get removed from the table and cache, which will just prevent
another upload of a different key from running immediately.
(Need to check if MVar GC behavior operates like this.
See https://stackoverflow.com/questions/10871303/killing-a-thread-when-mvar-is-garbage-collected )
Perhaps stale entries can be found in a different way. Require the live
update table to be updated with a timestamp every 5 minutes. The thread
that waits on the MVar can do that, as long as the transfer is running. If
interrupted, it will become stale in 5 minutes, which is probably good
enough? Could do it every minute, depending on overhead. This could
also be done by just repeatedly touching a file named with the processes's
pid in it, to avoid sqlite overhead.
* The assistant is using NoLiveUpdate, but it should be posssible to plumb
a LiveUpdate through it from preferred content checking to location log
@ -222,6 +89,7 @@ Planned schedule of work:
* Balanced preferred content basic implementation, including --rebalance
option.
* Implemented [[track_free_space_in_repos_via_git-annex_branch]]
* Implemented tracking of live changes to repository sizes.
* `git-annex maxsize`
* annex.fullybalancedthreshhold