improved bwrate limiting implementation

New method is much better. Avoids unrestrained transfer at the beginning
(except for the first block). Keeps right at or a few kb/s below the
configured limit, with very little variation in the actual reported bandwidth.

Removed the /s part of the config as it's not needed.

Ready to merge.

Sponsored-by: Luke Shumaker on Patreon
Joey Hess 2021-09-22 15:14:28 -04:00
parent 44d3d50785
commit e8496d62e4
5 changed files with 46 additions and 55 deletions


@@ -1,7 +1,8 @@
git-annex (8.20210904) UNRELEASED; urgency=medium
* Added annex.bwlimit and remote.name.annex-bwlimit config that works
for git remotes and many but not all special remotes.
* Added annex.bwlimit and remote.name.annex-bwlimit config to limit
the bandwidth of transfers. It works for git remotes and many
but not all special remotes.
* Bug fix: Git configs such as annex.verify were incorrectly overriding
per-remote git configs such as remote.name.annex-verify.
(Reversion in version 4.20130323)


@@ -407,9 +407,9 @@ extractRemoteGitConfig r remotename = do
, remoteAnnexStallDetection =
either (const Nothing) Just . parseStallDetection
=<< getmaybe "stalldetection"
, remoteAnnexBwLimit =
either (const Nothing) Just . parseBwRate
=<< getmaybe "bwlimit"
, remoteAnnexBwLimit = do
sz <- readSize dataUnits =<< getmaybe "bwlimit"
return (BwRate sz (Duration 1))
, remoteAnnexAllowUnverifiedDownloads = (== Just "ACKTHPPT") $
getmaybe ("security-allow-unverified-downloads")
, remoteAnnexConfigUUID = toUUID <$> getmaybe "config-uuid"
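As a rough illustration of why a bare size such as "1MiB" is enough once the limit is always taken per second, here is a simplified sketch. `parseSize` and its unit table are hypothetical stand-ins, not git-annex's `readSize` and `dataUnits`, which handle more units and formats.

```haskell
module Main where

import Data.Char (isDigit, isSpace)

-- | Hypothetical, simplified size parser: a whole number followed by an
-- optional unit. This is NOT git-annex's readSize; it only illustrates
-- turning a configured value into a number of bytes (per second).
parseSize :: String -> Maybe Integer
parseSize s = case span isDigit (filter (not . isSpace) s) of
    ("", _) -> Nothing
    (digits, unit) -> (read digits *) <$> lookup unit units
  where
    -- Assumed unit table, decimal and binary multiples only.
    units :: [(String, Integer)]
    units =
        [ ("", 1), ("B", 1)
        , ("kB", 1000), ("MB", 1000 ^ (2 :: Int)), ("GB", 1000 ^ (3 :: Int))
        , ("KiB", 1024), ("MiB", 1024 ^ (2 :: Int)), ("GiB", 1024 ^ (3 :: Int))
        ]

main :: IO ()
main = do
    print (parseSize "1MiB")   -- Just 1048576
    print (parseSize "1GB/1s") -- Nothing: this sketch knows no "/1s" suffix
```

In this sketch a value with a time suffix simply fails to parse; how git-annex itself treats such a value is not shown here.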


@@ -389,39 +389,34 @@ rateLimitMeterUpdate delta (Meter totalsizev _ _ _) meterupdate = do
-- same process and thread as the call to the MeterUpdate.
--
-- For example, if the desired bandwidth is 100kb/s, and over the past
-- second, 200kb was sent, then pausing for half a second, and then
-- running for half a second should result in the desired bandwidth.
-- But, if after that pause, only 75kb is sent over the next half a
-- second, then the next pause should be 2/3rds of a second.
-- 1/10th of a second, 30kb was sent, then the current bandwidth is
-- 300kb/s, 3x as fast as desired. So, after getting the next chunk,
-- pause for twice as long as it took to get it.
bwLimitMeterUpdate :: ByteSize -> Duration -> MeterUpdate -> IO MeterUpdate
bwLimitMeterUpdate sz duration meterupdate = do
nowtime <- getPOSIXTime
lastpause <- newMVar (nowtime, toEnum 0 :: POSIXTime, 0)
return $ mu lastpause
where
mu lastpause n@(BytesProcessed i) = do
bwLimitMeterUpdate bwlimit duration meterupdate
| bwlimit <= 0 = return meterupdate
| otherwise = do
nowtime <- getPOSIXTime
meterupdate n
lastv@(prevtime, prevpauselength, previ) <- takeMVar lastpause
let timedelta = nowtime - prevtime
if timedelta >= durationsecs
then do
let sz' = i - previ
let runtime = timedelta - prevpauselength
let pauselength = calcpauselength sz' runtime
if pauselength > 0
then do
unboundDelay (floor (pauselength * fromIntegral oneSecond))
putMVar lastpause (nowtime, pauselength, i)
else putMVar lastpause lastv
else putMVar lastpause lastv
mv <- newMVar (nowtime, 0)
return (mu mv)
where
mu mv n@(BytesProcessed i) = do
endtime <- getPOSIXTime
(starttime, previ) <- takeMVar mv
calcpauselength sz' runtime
| sz' > sz && sz' > 0 && runtime > 0 =
durationsecs - (fromIntegral sz / fromIntegral sz') * runtime
| otherwise = 0
durationsecs = fromIntegral (durationSeconds duration)
let runtime = endtime - starttime
let currbw = fromIntegral (i - previ) / runtime
let pausescale = if currbw > bwlimit'
then (currbw / bwlimit') - 1
else 0
unboundDelay (floor (runtime * pausescale * msecs))
meterupdate n
nowtime <- getPOSIXTime
putMVar mv (nowtime, i)
bwlimit' = fromIntegral (bwlimit * durationSeconds duration)
msecs = fromIntegral oneSecond
data Meter = Meter (MVar (Maybe TotalSize)) (MVar MeterState) (MVar String) DisplayMeter
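The pause calculation in the new `bwLimitMeterUpdate` can be looked at in isolation. The following is a minimal sketch, not the code from this commit: `pauseSeconds` is a hypothetical pure helper, and it assumes the limit is already expressed in bytes per second and that times are plain `Double` seconds rather than `POSIXTime`.

```haskell
module Main where

-- | Hypothetical helper (not part of git-annex). Given the bandwidth limit
-- in bytes per second, the time in seconds the last chunk took, and the
-- size of that chunk in bytes, return how many seconds to pause so that
-- the average rate over (runtime + pause) does not exceed the limit.
pauseSeconds :: Double -> Double -> Double -> Double
pauseSeconds limit runtime chunkBytes
    | limit <= 0 || runtime <= 0 = 0  -- no limit set, or nothing measurable
    | currbw > limit = runtime * (currbw / limit - 1)
    | otherwise = 0
  where
    -- Bandwidth observed while transferring the last chunk.
    currbw = chunkBytes / runtime

main :: IO ()
main = do
    -- 30 kB in 0.1 s is 300 kB/s, 3x a 100 kB/s limit:
    -- pause for about twice the runtime, roughly 0.2 s.
    print (pauseSeconds 100000 0.1 30000)
    -- 5 kB in 0.1 s is 50 kB/s, under the limit: no pause.
    print (pauseSeconds 100000 0.1 5000)
```

Scaling the pause by `currbw / limit - 1` means the chunk's bytes are spread over `runtime * currbw / limit` seconds in total, which is exactly the time the chunk would have taken at the configured limit.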


@@ -1389,21 +1389,12 @@ Remotes are configured using these settings in `.git/config`.
This can be used to limit how much bandwidth is used for a transfer
from or to a remote.
For example, to limit transfers to 1 gigabyte per second:
`git config annex.bwlimit "1GB/1s"`
For example, to limit transfers to 1 mebibyte per second:
`git config annex.bwlimit "1MiB"`
This will work with many remotes, including git remotes, but not
for remotes where the transfer is run by a separate program than
git-annex.
The bandwidth limiting is implemented by pausing when
the transfer is running too fast, so it may use more bandwidth
than configured before being slowed down, either at the beginning
or if the available bandwidth changes while it is running.
It is different to use "1GB/1s" than "10GB/10s". git-annex will
track how much data was transferred over the time period, and then
pausing. So usually 1s is the best time period to use.
git-annex.
* `remote.<name>.annex-stalldetection`, `annex.stalldetection`


@@ -10,17 +10,17 @@ works, it will probably work to put the delay in there. --[[Joey]]
[[confirmed]]
> Implementation in progress in the `bwlimit` branch. Seems to work, but see
> commit message for what still needs to be done. --[[Joey]]
> The directory special remote, when resuming an interrupted
> Implemented and works well.
>
> A local git remote, when resuming an interrupted
> transfer, has to hash the file (with default annex.verify settings),
> and that hashing updates the progress bar, and so the bwlimit can kick
> in and slow down that initial hashing, before any data copying begins.
> This seems perhaps ok; if you've bwlimited a directory special
> This seems perhaps ok; if you've bwlimited a local git remote,
> remote you're wanting to limit disk IO. Only reason it might not be ok
> is if the intent is to limit IO to the disk containing the directory
> special remote, but not the one containing the annex repo.
> is if the intent is to limit IO to the disk containing the remote
> but not the one containing the annex repo. (This also probably
> holds for the directory special remote.)
>
> Other remotes, including git over ssh, when resuming don't have that
> problem. Looks like chunked special remotes narrowly avoid it, just
@@ -28,4 +28,8 @@ works, it will probably work to put the delay in there. --[[Joey]]
> when resuming. It might be worthwhile to differentiate between progress
> updates for incremental verification setup and for actual transfers, and
> only rate limit the latter, just to avoid fragility in the code.
> I have not done so yet though, and am closing this.
> --[[Joey]]
[[done]]