git-annex/doc/todo/dynamic_stall_detection.mdwn
Joey Hess dd39e9e255
suggest when user may want annex.stalldetection
When annex.stalldetection is not enabled, and a likely stall is detected,
display a suggestion to enable it.

Note that the progress meter display is not taken down when displaying
the message, so it will display like this:

	0%    8 B                 0 B/s
	  Transfer seems to have stalled. To handle stalling transfers, configure annex.stalldetection
	0%    10 B                0 B/s

Of course, if the transfer is really stalled, the display will never
update again after the message. Taking down the progress meter and starting
a new one doesn't seem necessary given how unusual this is, and leaving
it up helps show the state the transfer was in when it stalled.

Use of uninterruptibleCancel here is ok, the thread it's canceling
only does STM transactions and sleeps. The annex thread that gets
forked off is separate to avoid it being canceled, so that it
can be joined back at the end.

A module cycle required moving the precaching of the remote list out of
dupState. Doing it at startConcurrency should cover all the cases
where the remote list is used in concurrent actions.

This commit was sponsored by Kevin Mueller on Patreon.
2021-02-03 15:57:19 -04:00


annex.stalldetection lets remotes be configured with a minimum throughput
to detect and retry stalls. But most users are not going to configure this.

Could something be done to dynamically detect a stall, without configuration?

Eg, wait until data starts to flow, and then check if there's at least some
data being sent each half minute. If so, the progress display is being updated
at least every minute. So then if 1 minute goes by without more data
flowing, it's almost certainly stalled. And if the progress display is
updated less frequently, see if it's updated every 2 minutes, etc. Although
realistically, progress displays are updated every chunk, and there's
typically more than 1 chunk per minute. So longer durations than 1 minute
may be an unnecessary complication. And a minute to detect a stall is fine.
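
The heuristic above can be sketched roughly as follows. This is only an illustration of the idea, not git-annex's implementation (which is written in Haskell). The names `StallDetector`, `update`, `stalled` and `STALL_WINDOW_SECONDS` are hypothetical, and the sketch assumes the caller feeds it each progress meter update along with a timestamp in seconds.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: a fixed one-minute window, per "a minute to detect a stall is fine".
STALL_WINDOW_SECONDS = 60.0


@dataclass
class StallDetector:
    """Hypothetical helper: flags a stall once data has flowed and then stopped."""

    window: float = STALL_WINDOW_SECONDS
    last_bytes: int = 0
    last_progress_at: Optional[float] = None  # None until data starts to flow

    def update(self, now: float, bytes_done: int) -> None:
        """Record a progress update; only an increase in bytes counts as progress."""
        if bytes_done > self.last_bytes:
            self.last_bytes = bytes_done
            self.last_progress_at = now

    def stalled(self, now: float) -> bool:
        """True if data has flowed before and none has arrived for a full window."""
        if self.last_progress_at is None:
            # The transfer has not started moving yet, so do not judge it.
            return False
        return now - self.last_progress_at >= self.window
```

A caller would invoke `update` from the progress callback and poll `stalled` periodically. A transfer that has not yet moved any data is never reported as stalled, which matches the "wait until data starts to flow" step.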
> Implemented this, annex.stalldetection = true enables automatic
> stall detection.

It may still need a config to turn it on, because running
transfers in separate processes can lead to more resource use, or even
password prompting, which could be annoying to existing users. Also, if the
detection gets it wrong and the remote does not support resuming transfers,
defaulting to on could waste a lot of resources. It could
detect stalls even when not turned on, but only display a message
suggesting enabling the config. --[[Joey]]

> [[done]] --[[Joey]]