make --rebalance of balanced use fullysizebalanced when useful
When the specified number of copies is > 1, and some repositories are too full, it can be better to move content from them to other less full repositories, in order to make space for new content.

annex.fullybalancedthreshhold is documented, but not implemented yet.

This is not tested very well yet, and is known to sometimes take several runs to stabilize.
parent 9e87061de2
commit 76ece2a699
4 changed files with 70 additions and 28 deletions
35
Limit.hs
@@ -599,11 +599,24 @@ limitFullyBalanced :: Maybe UUID -> Annex GroupMap -> MkLimit Annex
 limitFullyBalanced = limitFullyBalanced' "fullybalanced"
 
 limitFullyBalanced' :: String -> Maybe UUID -> Annex GroupMap -> MkLimit Annex
-limitFullyBalanced' = limitFullyBalanced'' filtercandidates
-  where
-	filtercandidates _ key candidates = do
+limitFullyBalanced' = limitFullyBalanced'' $ \n key candidates -> do
 	maxsizes <- getMaxSizes
 	sizemap <- getRepoSizes False
+	let threshhold = 0.9 :: Double
+	let toofull u =
+		case (M.lookup u maxsizes, M.lookup u sizemap) of
+			(Just (MaxSize maxsize), Just (RepoSize reposize)) ->
+				fromIntegral reposize >= fromIntegral maxsize * threshhold
+			_ -> False
+	needsizebalance <- ifM (Annex.getRead Annex.rebalance)
+		( return $ n > 1 &&
+			n > S.size candidates
+				- S.size (S.filter toofull candidates)
+		, return False
+		)
+	if needsizebalance
+		then filterCandidatesFullySizeBalanced maxsizes sizemap n key candidates
+		else do
 			currentlocs <- S.fromList <$> loggedLocations key
 			let keysize = fromMaybe 0 (fromKey keySize key)
 			let hasspace u = case (M.lookup u maxsizes, M.lookup u sizemap) of
@@ -673,11 +686,19 @@ limitFullySizeBalanced :: Maybe UUID -> Annex GroupMap -> MkLimit Annex
 limitFullySizeBalanced = limitFullySizeBalanced' "fullysizebalanced"
 
 limitFullySizeBalanced' :: String -> Maybe UUID -> Annex GroupMap -> MkLimit Annex
-limitFullySizeBalanced' = limitFullyBalanced'' filtercandidates
-  where
-	filtercandidates n key candidates = do
+limitFullySizeBalanced' = limitFullyBalanced'' $ \n key candidates -> do
 	maxsizes <- getMaxSizes
 	sizemap <- getRepoSizes False
+	filterCandidatesFullySizeBalanced maxsizes sizemap n key candidates
+
+filterCandidatesFullySizeBalanced
+	:: M.Map UUID MaxSize
+	-> M.Map UUID RepoSize
+	-> Int
+	-> Key
+	-> S.Set UUID
+	-> Annex (S.Set UUID)
+filterCandidatesFullySizeBalanced maxsizes sizemap n key candidates = do
 	currentlocs <- S.fromList <$> loggedLocations key
 	let keysize = fromMaybe 0 (fromKey keySize key)
 	let go u = case (M.lookup u maxsizes, M.lookup u sizemap, u `S.member` currentlocs) of
@@ -689,7 +710,7 @@ limitFullySizeBalanced' = limitFullyBalanced'' filtercandidates
 	return $ S.fromList $
 		map fst $ take n $ reverse $ sortOn snd $
 			mapMaybe go $ S.toList candidates
-
+  where
 	proportionfree keysize inrepo u (RepoSize reposize) (MaxSize maxsize)
 		| maxsize > 0 = Just
 			( u
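The selection step at the end of `filterCandidatesFullySizeBalanced` (score each candidate, sort, keep the top `n`) can be sketched in isolation. This is a minimal, self-contained sketch and not the git-annex implementation: `Repo`, `proportionFree`, `pickEmptiest`, and `exampleSelection` are hypothetical names, sizes are plain `Integer`s rather than `MaxSize`/`RepoSize`, and the scoring formula is an assumption, since the body of `proportionfree` is cut off in the hunk above. It also assumes that candidates without a recorded size or maximum size are skipped.

```haskell
import Data.List (sortOn)
import Data.Maybe (mapMaybe)
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- Hypothetical stand-in for git-annex's UUID type.
type Repo = String

-- ASSUMPTION: the score is the fraction of the maximum size that would
-- remain free once the key is stored. A repository that already holds
-- the key needs no extra space for it.
proportionFree :: Integer -> Bool -> Repo -> Integer -> Integer -> Maybe (Repo, Double)
proportionFree keysize inrepo u reposize maxsize
    | maxsize > 0 = Just (u, fromIntegral free / fromIntegral maxsize)
    | otherwise = Nothing
  where
    free = maxsize - reposize - (if inrepo then 0 else keysize)

-- Keep the n candidates with the highest score, using the same
-- "take n $ reverse $ sortOn snd" pipeline that appears in the diff.
pickEmptiest
    :: M.Map Repo Integer  -- maximum sizes
    -> M.Map Repo Integer  -- current sizes
    -> Int                 -- number of copies wanted
    -> Integer             -- size of the key
    -> S.Set Repo          -- repositories that already hold the key
    -> S.Set Repo          -- candidates
    -> S.Set Repo
pickEmptiest maxsizes sizemap n keysize currentlocs candidates =
    S.fromList $ map fst $ take n $ reverse $ sortOn snd $
        mapMaybe go $ S.toList candidates
  where
    -- Candidates missing from either map are dropped (an assumption).
    go u = do
        maxsize <- M.lookup u maxsizes
        reposize <- M.lookup u sizemap
        proportionFree keysize (u `S.member` currentlocs) u reposize maxsize

-- Five repositories of size 100: two half full, three 99% full.
-- The key (size 1) currently lives on the three full ones.
exampleSelection :: S.Set Repo
exampleSelection =
    pickEmptiest maxsizes sizemap 3 1 (S.fromList ["c", "d", "e"]) (S.fromList repos)
  where
    repos = ["a", "b", "c", "d", "e"]
    maxsizes = M.fromList [(r, 100) | r <- repos]
    sizemap = M.fromList (zip repos [50, 50, 99, 99, 99])
```

Under these assumptions the half-full repositories score 0.49 and the nearly full ones 0.01, so `exampleSelection` contains both half-full repositories plus one of the full ones.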
@@ -318,6 +318,16 @@ elsewhere to allow removing it).
   When the `--rebalance` option is used, `balanced` is the same as
   `fullybalanced`.
 
+  When the specified number is greater than 1, and too many repositories
+  in the group are more than 90% full (as configured by
+  annex.fullybalancedthreshhold), this behaves like `fullysizebalanced`.
+
+  For example, `fullybalanced=foo:3`, when group foo has 5 repositories,
+  two 50% full and three 99% full, will make some content move from the
+  full repositories to the others. Moving content like that is expensive,
+  but it allows new files to continue to be stored on the specified number
+  of repositories.
+
 * `sizebalanced=groupname:number`
 
   Distributes content amoung repositories in the group, keeping
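The documented example (`fullybalanced=foo:3` with five repositories, two 50% full and three 99% full) can be checked against the `needsizebalance` condition from the Limit.hs change. The following is a rough sketch of that condition as pure functions, not the code from the commit: `Repo`, `tooFull`, `needSizeBalance`, and the example values are hypothetical, sizes are plain `Integer`s, and the check that `--rebalance` is in effect is left out.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- Hypothetical stand-in for git-annex's UUID type.
type Repo = String

-- A repository counts as too full once its size reaches the threshold
-- fraction of its maximum size. Repositories without both a recorded
-- maximum and a recorded size are never treated as too full.
tooFull :: Double -> M.Map Repo Integer -> M.Map Repo Integer -> Repo -> Bool
tooFull threshold maxsizes sizemap u =
    case (M.lookup u maxsizes, M.lookup u sizemap) of
        (Just maxsize, Just reposize) ->
            fromIntegral reposize >= fromIntegral maxsize * threshold
        _ -> False

-- Size balancing is wanted when more than one copy is requested and
-- fewer than n candidates remain below the threshold.
needSizeBalance :: (Repo -> Bool) -> Int -> S.Set Repo -> Bool
needSizeBalance isfull n candidates =
    n > 1 && n > S.size candidates - S.size (S.filter isfull candidates)

-- Illustrative data for the documented example (names are made up).
exampleRepos :: [Repo]
exampleRepos = ["a", "b", "c", "d", "e"]

exampleMaxSizes, exampleSizes :: M.Map Repo Integer
exampleMaxSizes = M.fromList [(r, 100) | r <- exampleRepos]
exampleSizes = M.fromList (zip exampleRepos [50, 50, 99, 99, 99])

-- 3 copies wanted, 5 candidates, 3 of them too full: 3 > 5 - 3.
exampleNeedsRebalance :: Bool
exampleNeedsRebalance =
    needSizeBalance (tooFull 0.9 exampleMaxSizes exampleSizes) 3 (S.fromList exampleRepos)
```

With three copies wanted and only two repositories below the 90% threshold, the condition holds, which matches the documentation's statement that some content would move off the full repositories.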
@@ -928,6 +928,12 @@ repository, using [[git-annex-config]]. See its man page for a list.)
 
   The default reserve is 100 megabytes.
 
+* `annex.fullybalancedthreshhold`
+
+  Configures the percent full a repository must be in order for
+  the "fullybalanced" preferred content expression to consider it
+  to be full. The default is 90.
+
 * `annex.skipunknown`
 
   Set to true to make commands like "git-annex get" silently skip over
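The setting is documented as a percentage, while the Limit.hs change in this commit hard-codes the threshold as the fraction `0.9`. One way the two could be connected is sketched below. This is purely illustrative: `thresholdFraction` is a hypothetical helper, the commit message states that the setting is not implemented yet, and both the accepted range and the fallback for unparseable values are assumptions.

```haskell
import Text.Read (readMaybe)

-- Hypothetical helper, not part of git-annex: turn the configured
-- percentage (if any) into the fraction used by the size comparison.
-- Missing, unparseable, or out-of-range values fall back to the
-- documented default of 90 percent.
thresholdFraction :: Maybe String -> Double
thresholdFraction configured =
    case configured >>= readMaybe of
        Just percent
            | percent > 0 && percent <= 100 -> percent / 100
        _ -> 0.9
```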
@@ -30,6 +30,11 @@ Planned schedule of work:
 
 ## work notes
 
+* Implement annex.fullybalancedthreshhold
+
+* `git-annex assist --rebalance` of `balanced=foo:2`
+  sometimes needs several runs to stabalize.
+
 * Bug:
 
 		git init foo