add --size-limit option

When this option is not used, there should be effectively no added
overhead, thanks to the optimisation in
b3cd0cc6ba.

When an action fails on a file, the size of the file still counts toward
the size limit. This was necessary to support concurrency, but also
generally seems like the right choice.

Most commands that operate on annexed files support the option.
export and import do not, and I don't know if it would make sense for
export to. Why would you want an incomplete export? sync doesn't, and
while it would be easy to make it support it for transferring files,
it's not clear if dropping files should also take the size limit into
account. Commands like add that don't operate on annexed files don't
support the option either.

Exiting with 101 when the size limit is reached is not yet implemented.

Sponsored-by: Denis Dzyubenko on Patreon
Joey Hess 2021-06-04 16:08:42 -04:00
parent b3cd0cc6ba
commit 771a122c9e
6 changed files with 118 additions and 19 deletions


@@ -0,0 +1,34 @@
[[!comment format=mdwn
username="joey"
subject="""comment 1"""
date="2021-06-04T18:07:44Z"
content="""
I agree this could be useful.

Implementation is complicated by it needing to only count the size when a
file is acted on. Eg `git annex get` shouldn't stop when it's seen enough
files that already have content present.

So it seems it would need to be implemented next to where showStartMessage
is used in commandAction, looking at the size of the key in the
StartMessage (or possibly file when there's no key?) and when it would go
over the limit, rather than proceeding to perform the action it could skip
doing anything and go on to the next file.

I don't think there is a good way to make it immediately exit
when it reaches the limit, so if there were subsequent smaller files
after a skipped file that could be processed still, it still would.

It would probably also make sense to make it later exit with 101 like
--time-limit does, or another special exit code, to indicate it didn't
process everything.

Hmm, if an action fails, should the size of the file be counted or not?
If failures are not counted, incomplete transfers could result in a
lot more work/disk space than desired. But if failures are counted
after failing to drop a bunch of files, or failing early on to get a bunch
of files, it could stop seemingly prematurely. Also there's a problem with
concurrency, if it needs to know the result of running jobs before deciding
whether to start a new job. Seems no entirely good answer here, but the
concurrency problem seems only solvable by updating the count at start time.
"""]]
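
The accounting described in the comment above (reserve a file's size when its action starts, skip files that would exceed the limit, and never refund on failure) can be sketched roughly as follows. This is only an illustration under stated assumptions, not git-annex's actual code: `SizeBudget`, `newBudget`, `reserve`, `runWithLimit` and `demo` are hypothetical names, and a plain `IORef` stands in for whatever shared state the real implementation uses.

```haskell
import Control.Monad (forM)
import Data.IORef (IORef, newIORef, atomicModifyIORef')

-- | Remaining byte budget. Nothing models "no --size-limit given".
-- (Hypothetical type for illustration; not git-annex's own.)
type SizeBudget = Maybe (IORef Integer)

newBudget :: Maybe Integer -> IO SizeBudget
newBudget = traverse newIORef

-- | Try to reserve room for a file of the given size before acting on it.
-- The reservation happens atomically at start time and is never refunded,
-- so concurrent jobs need not wait for each other's results, and a failed
-- action still counts toward the limit.
reserve :: SizeBudget -> Integer -> IO Bool
reserve Nothing _ = return True
reserve (Just ref) size = atomicModifyIORef' ref $ \remaining ->
    if size <= remaining
        then (remaining - size, True)
        else (remaining, False)

-- | Act on each (name, size) pair that still fits; skip the others and
-- carry on, so a smaller file after a skipped one can still be processed.
-- Returns the names whose action was started, whatever its outcome.
runWithLimit :: SizeBudget -> (String -> IO Bool) -> [(String, Integer)] -> IO [String]
runWithLimit budget act files = fmap concat $ forM files $ \(name, size) -> do
    ok <- reserve budget size
    if ok
        then act name >> return [name]
        else return []

-- | Every action fails here, yet each started file still uses up budget.
demo :: IO [String]
demo = do
    budget <- newBudget (Just 100)
    runWithLimit budget (\_ -> return False)
        [("a", 60), ("b", 50), ("c", 30), ("d", 20)]
```

With a limit of 100, `demo` starts "a" (leaving 40), skips "b", starts "c" (leaving 10), and skips "d", so it returns `["a","c"]`. Passing `Nothing` to `newBudget` makes `reserve` return `True` without touching any shared state, which is the cheap path one would want when the option is not used.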