Without actually simulating cluster implementation at all. Instead, only
the essential fact that cluster gateways know what changes they have
made to each node of a cluster. That is enough for sims like
sizebalanced_cluster.
Noticed that it was quite slow compared with things like action
sendwanted. Guessed that the slowdown is largely due to every step
doing a simulated git pull/push.
So, rather than always doing a pull/push, only do those when no actions
are found without doing a pull/push.
This does mean that step will sometimes experience a split brain
situation, but that seems like a good thing? Because step ought to
explore as many possible scenarios as it reasonably can.
This demonstrates concurrent behavior that looks right. And with a
random seed, the results are deterministic.
init foo
init bar
init backup
connect foo <-> bar
connect foo <-> backup
addmulti 10 testfiles 1mb 1gb foo backup
action foo gitpull backup
wanted foo nothing
wanted bar anything
wanted backup anything
action bar gitpull foo
action foo dropunwanted while action bar getwanted foo
When getting from a remote, have to check that the repo doing the
getting thinks the remote contains the key, but also that the remote
actually does. Before this bug fix, it would get from a repo that used
to have the key, but that had dropped it since the last git pull.
Had to add Read instances to Key and NumCopies and some other similar
types. I only expect to use those in serializing a sim. Of course, this
risks that implementation changes break reading old data. For a sim,
that would not be a big problem.
Not tested yet. This emulates the same checking that is done when
dropping. Note that when dropping from a special remote it is not able
to make a locked copy.
The generator doesn't emit the best possible connect commands,
but it does output something valid. Eg, an input like:
connect A <-> B <-> C <-> D
becomes:
connect A <-> B <-> C
connect C <-> D
Also:
connect A -> B <- C
becomes:
connect A -> B
connect C -> B
Which could be improved.
Also disconnect commands are not prettified at all, but probably there's
no reason to.
Detect when a preferred content expression contains "not present", which
would lead to repeatedly getting and then dropping files, and make it never
match. This also applies to "not balanced" and "not sizebalanced".
--explain will tell the user when this happens
Note that getMatcher calls matchMrun' and does not check for unstable
negated limits. While there is no --present anyway, if there was,
it would not make sense for --not --present to complain about
instability and fail to match.
Reorganized the reposize database directory, and split up a column.
checkStaleSizeChanges needs to run before needLiveUpdate,
otherwise the process won't be holding a lock on its pid file, and
another process could go in and expire the live update it records. It
just so happens that they do get called in the correct order, since
checking balanced preferred content calls getLiveRepoSizes before
needLiveUpdate.
The 1 minute delay between checks is arbitrary, but will avoid excess
work. The downside of it is that, if a process is dropping a file and
gets interrupted, for 1 minute another process can expect a repository
will soon be smaller than it is. And so a process might send data to a
repository when a file is not really going to be dropped from it. But
note that can already happen if a drop takes some time in eg locking and
then fails. So it seems possible that live updates should only be
allowed to increase, rather than decrease the size of a repository.
Only when the preferred content expression being matched uses balanced
preferred content is this overhead needed.
It might be possible to eliminate the locking entirely. Eg, check the
live changes before and after the action and re-run if they are not
stable. For now, this is good enough, it avoids existing preferred
content getting slow. If balanced preferred content turns out to be too
slow to check, that could be tried later.