Commit graph

34742 commits

Author SHA1 Message Date
Joey Hess
ad6b8c5f77
fix STM deadlock in finishCommandActions
Happened every time, because it was taking the pool TMVar while threads
were still running, and then the thread would try to switch state.
2019-06-19 18:34:26 -04:00
Joey Hess
19321e6892
devblog 2019-06-19 18:18:37 -04:00
Joey Hess
37d505dd6b
avoid STM deadlock
When all worker threads are running and enteringStage is called,
it waits for an idle slot. If all of the other threads then call it in
turn, a deadlock occurs.

This is the same problem I didn't actually fix in
5a9842d7ed.

Fixed by doing two separate STM transactions, the first replaces its
active thread with an idle thread, and the second waits for another idle
thread. That guarantees there will eventually be an idle thread to find.

The changes to WorkerPool were necessary because it can't add an idle
thread containing the Annex state and go on to run an action using that
same state, so I had to remove the Annex state from IdleWorker.
2019-06-19 18:15:25 -04:00
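
The two-transaction fix described in 37d505dd6b can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not git-annex's actual code: `Stage`, `Worker`, `Pool`, `releaseSlot`, `claimSlot`, `enterStage` and `replaceFirst` are hypothetical names, and the pool is modelled as a plain list held in a `TVar`.

```haskell
import Control.Concurrent.STM

-- Hypothetical, simplified stand-ins for the real WorkerPool types.
data Stage = TransferStage | VerifyStage deriving (Eq, Show)

data Worker = IdleWorker Stage | ActiveWorker Stage deriving (Eq, Show)

type Pool = TVar [Worker]

-- Transaction 1: give up our active slot, leaving an idle one behind.
-- This never blocks.
releaseSlot :: Pool -> Stage -> STM ()
releaseSlot pool old =
    modifyTVar' pool (replaceFirst (ActiveWorker old) (IdleWorker old))

-- Transaction 2: wait (via retry) until an idle slot for the new stage
-- exists, then claim it.
claimSlot :: Pool -> Stage -> STM ()
claimSlot pool new = do
    ws <- readTVar pool
    if IdleWorker new `elem` ws
        then writeTVar pool (replaceFirst (IdleWorker new) (ActiveWorker new) ws)
        else retry

-- Running the two steps as separate transactions means other threads can
-- see and use the slot we released while we are still waiting.
enterStage :: Pool -> Stage -> Stage -> IO ()
enterStage pool old new = do
    atomically (releaseSlot pool old)
    atomically (claimSlot pool new)

-- Replace the first element equal to 'from' with 'to'.
replaceFirst :: Eq a => a -> a -> [a] -> [a]
replaceFirst _ _ [] = []
replaceFirst from to (x:xs)
    | x == from = to : xs
    | otherwise = x : replaceFirst from to xs
```

If both steps were combined into a single `atomically`, a thread blocked in `retry` would still hold its active slot, which is the situation the commit message describes.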
Joey Hess
a0d3a699e2
fix concurrency
Broken by recent commits, because before dupState is called, the Annex
state needs to have concurrent output enabled, and the thread pool
populated.
2019-06-19 16:12:39 -04:00
Joey Hess
9671248fff
speed up enteringStage in non-concurrent mode
Avoid a STM transaction.

Also got rid of UnallocatedWorkerPool.
2019-06-19 15:47:54 -04:00
Joey Hess
05a908c3c9
fix oops 2019-06-19 14:52:44 -04:00
Joey Hess
9d36c826c0
use fine-grained WorkerStages when transferring and verifying
This means that Command.Move and Command.Get don't need to
manually set the stage, and is a lot cleaner conceptually.

Also, this makes Command.Sync.syncFile use the worker pool better.
In the scenario where it first downloads content and then uploads it to
some other remotes, it will start in TransferStage, then enter VerifyStage
and then go back to TransferStage for each transfer to the remotes.
Before, it entered CleanupStage after the download, and stayed in it for
the upload, so too many transfer jobs could run at the same time.

Note that, in Remote.Git, it uses runTransfer and also verifyKeyContent
inside onLocal. That has an Annex state for the remote, with no worker pool.
So the resulting calls to enteringStage won't block in there.

While Remote.Git.copyToRemote does do checksum verification, I
realized that should not use a verification slot in the WorkerPool
to do it, because it's reading back from e.g. a removable disk to checksum.
That will contend with other writes to that disk. It's best to treat
that checksum verification as just part of the transfer. So, removed the todo
item about that, as there's nothing needing to be done.
2019-06-19 13:24:20 -04:00
Joey Hess
53882ab4a7
make WorkerStage an open type
Rather than limiting it to PerformStage and CleanupStage, this opens it
up so any number of stages can be added as needed by commands.

Each concurrent command has a set of stages that it uses, and only
transitions between those can block waiting for a free slot in the
worker pool. Calling enteringStage for some other stage does not block,
and has very little overhead.

Note that while before the Annex state was duplicated on the first call
to commandAction, this now happens earlier, in startConcurrency.
That means that seek stage actions that use startConcurrency
and then modify Annex state won't modify the state of worker threads
they then start. I audited all of them, and only Command.Seek
did so; prepMerge changes the working directory and so has to come
before startConcurrency.

Also, the remote list is built before duplicating the state, which means
that it gets built earlier now than it used to. This would only have the
effect of making commands that end up not needing to perform any actions
unnecessarily build the remote list (only when they're run with concurrency
enabled), but that's a minor overhead compared to commands seeking
through the work tree and determining they don't need to do anything.
2019-06-19 13:05:03 -04:00
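
The rule in 53882ab4a7 that only transitions between a command's own stages can block might be modelled like this. It is a hedged sketch: the constructor list, `UsedStages`, `transferStages` and `needsSlot` are assumptions made for illustration, and treating a same-stage transition as a no-op is also an assumption.

```haskell
import qualified Data.Set as Set

-- Hypothetical stage names; the point of an open type is that more can
-- be added as commands need them.
data WorkerStage = PerformStage | CleanupStage | TransferStage | VerifyStage
    deriving (Eq, Ord, Show)

-- The set of stages a concurrent command actually uses (assumed shape).
newtype UsedStages = UsedStages (Set.Set WorkerStage)

-- Example: a command that only transfers and verifies.
transferStages :: UsedStages
transferStages = UsedStages (Set.fromList [TransferStage, VerifyStage])

-- True when moving from the current stage to the new one has to wait for
-- a free slot in the worker pool. Entering a stage the command does not
-- use, or re-entering the current stage, is a cheap no-op.
needsSlot :: UsedStages -> WorkerStage -> WorkerStage -> Bool
needsSlot (UsedStages used) current new =
    new /= current && new `Set.member` used
```

With `transferStages`, `needsSlot transferStages TransferStage VerifyStage` is `True`, while entering `CleanupStage` is `False` and so would not block.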
Joey Hess
e19408ed9d
Merge branch 'master' of ssh://git-annex.branchable.com 2019-06-17 15:26:57 -04:00
Joey Hess
c31f4c0e66
devblog 2019-06-17 15:26:46 -04:00
Joey Hess
04cc470201
run download checksum verification in separate job pool
get, move, copy, sync: When -J or annex.jobs has enabled concurrency,
checksum verification uses a separate job pool than is used for
downloads, to keep bandwidth saturated.

Not yet done for upload checksum verification, but that only affects
remotes on local disks.
2019-06-17 14:58:02 -04:00
Joey Hess
5a9842d7ed
avoid STM deadlock on redundant call to changeStageTo
I couldn't find a way to avoid the deadlock w/o rewriting it to clearly
not have one. I'm not quite sure what was the actual cause of the
deadlock.

This makes me unsure how I now know it clearly doesn't have a
deadlock. But, it was easy to reproduce before (just call it twice in a
row) and doesn't happen now.
2019-06-17 14:51:30 -04:00
Joey Hess
ecbd456312
fix restoring worker pool bug
The bug might have led to a STM deadlock, if this case could ever
actually fire.
2019-06-17 12:52:57 -04:00
Joey Hess
1a8d06d251
thought 2019-06-17 11:50:18 -04:00
jsag@f84637fe752e0235291a118b1cd007bafad0997e
ae9f2d5e6a 2019-06-17 12:43:17 +00:00
Joey Hess
502ce3f243
Merge branch 'starting' 2019-06-15 12:42:10 -04:00
Joey Hess
76c0a38025
add news item for git-annex 7.20190615 2019-06-15 12:39:48 -04:00
Joey Hess
0bd9e8c0e2
releasing package git-annex version 7.20190615 2019-06-15 12:39:16 -04:00
artem
9c4744c3c2 2019-06-14 04:24:47 +00:00
Joey Hess
44de3fff0b
avoid rsync/gcrypt ssh startup delay with -J
Avoid a delay at startup when concurrency is enabled and there are
rsync or gcrypt special remotes, which was caused by git-annex
opening a ssh connection to the remote too early.

sshOptions makes a connection to the ssh server if one is not already open,
when concurrency is enabled. Avoid doing that at startup, when the remote
list is being built, but the remote may not be used at all.

Instead, rsync/gcrypt now runs sshOptions once per ssh connection to the
server. This should not be significant overhead since Remote.Git already
has the same overhead (as do Bup and Ddar).
2019-06-13 11:16:38 -04:00
Joey Hess
43805a0be9
devblog 2019-06-12 15:10:17 -04:00
Joey Hess
157f41427f
bug 2019-06-12 15:00:28 -04:00
Joey Hess
e589a9b3fc
moving this to a bug 2019-06-12 15:00:14 -04:00
Joey Hess
44c971b9ec
Merge branch 'master' into starting
This reverts commit e07003ab73, adding back
the separate queue for cleanup actions.
2019-06-12 14:51:57 -04:00
Joey Hess
7b5aad2452
Merge commit '6f8322b8f72f3399d4c28426749db5d01742001d' into starting 2019-06-12 14:50:59 -04:00
Joey Hess
e07003ab73
Revert "separate queue for cleanup actions"
This reverts commit 659640e224
and 4932972487

Too early to include these in a release; they'll be de-reverted after
the release.
2019-06-12 14:47:40 -04:00
Joey Hess
6f8322b8f7
close; not a bug 2019-06-12 14:45:25 -04:00
Joey Hess
e1c48509d7
remove incorrect changelog entry
I didn't speed up -J seek yet
2019-06-12 14:13:45 -04:00
Joey Hess
ba2551da6f
add startingNoMessage
Fixes the last wart in the StartMessage transition. A few commands
include other CommandStart actions that generate output, and
do not themselves need to display a start/end message.
2019-06-12 14:11:23 -04:00
Joey Hess
70bc30acb1
get rid of implicitMessages state
Oh joyous day, this is probably git-annex's oldest implementation wart,
source of much unnecessary bother.

Now that we have a StartMessage, showEndResult' can look at it to know
if it needs to display an end message or not.

This is also going to be faster, because it avoids an unnecessary state
lookup for each file processed.
2019-06-12 14:01:41 -04:00
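
The idea in 70bc30acb1 of deciding on the end message from the StartMessage itself, rather than from a separate state flag, could look something like this. This is a simplified sketch: this `StartMessage` has only two constructors with plain `String` fields, and `endMessage` is a hypothetical helper, not the real showEndResult'.

```haskell
-- Hypothetical, simplified version of the start message type.
data StartMessage
    = StartMessage String String  -- command name and the item acted on
    | CustomOutput                -- the command prints its own output
    deriving (Eq, Show)

-- Decide what end message (if any) to show, using only the StartMessage.
-- No lookup in shared state is needed.
endMessage :: StartMessage -> Bool -> Maybe String
endMessage (StartMessage _ _) True  = Just "ok"
endMessage (StartMessage _ _) False = Just "failed"
endMessage CustomOutput       _     = Nothing
```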
Joey Hess
8e5ea28c26
finish CommandStart transition
The hoped for optimisation of CommandStart with -J did not materialize.
In fact, not running CommandStart in parallel is slower than -J3.
So, CommandStart are still run in parallel.

(The actual bad performance I've been seeing with -J in my big repo
has to do with building the remoteList.)

But, this is still progress toward making -J faster, because it gets rid
of the onlyActionOn roadblock in the way of making CommandCleanup jobs
run separate from CommandPerform jobs.

Added OnlyActionOn constructor for ActionItem which fixes the
onlyActionOn breakage in the last commit.

Made CustomOutput include an ActionItem, so even things using it can
specify OnlyActionOn.

In Command.Move and Command.Sync, there were CommandStarts that used
includeCommandAction, so output messages, which is no longer allowed.
Fixed by using startingCustomOutput, but that's still not quite right,
since it prevents message display for the includeCommandAction run
inside it too.
2019-06-12 13:24:01 -04:00
Ilya_Shlyakhter
3820829407 re: status of v7 2019-06-12 15:53:12 +00:00
Ilya_Shlyakhter
d2658d9537 an issue involving repos cloned with --single-branch 2019-06-11 23:30:26 +00:00
anthony@ad39673d230d75cbfd19d2757d754030049c7673
fd7f316482 Added a comment 2019-06-10 16:52:23 +00:00
anthony@ad39673d230d75cbfd19d2757d754030049c7673
d5a0bf3ae9 clarify it's the new android installer 2019-06-10 16:38:44 +00:00
anthony@ad39673d230d75cbfd19d2757d754030049c7673
856affe859 initial report 2019-06-10 16:37:31 +00:00
kirelagin@6d93475882c55a329fedae6be1971868a775ec7e
c0f4788cdb Added a comment: Workaround? 2019-06-08 13:03:50 +00:00
Joey Hess
8d573a653b
Merge branch 'master' of ssh://git-annex.branchable.com 2019-06-07 19:34:46 -04:00
Joey Hess
1d92846e54
bug report from MacGyver.mdwn 2019-06-07 19:34:21 -04:00
jamie@b5676b90eec0401ca8faac7c972eaf5676891601
0424549599 removed 2019-06-07 15:19:15 +00:00
Joey Hess
132ec9e005
devblog 2019-06-06 17:16:30 -04:00
Joey Hess
436f107715
make CommandStart return a StartMessage
The goal is to be able to run CommandStart in the main thread when -J is
used, rather than unnecessarily passing it off to a worker thread, which
incurs overhead that is significant when the CommandStart is going to
quickly decide to stop.

To do that, the message it displays needs to be displayed in the worker
thread, after the CommandStart has run.

Also, the change will mean that CommandStart will no longer necessarily
run with the same Annex state as CommandPerform. While its docs already
said it should avoid modifying Annex state, I audited all the
CommandStart code as part of the conversion. (Note that CommandSeek
already sometimes runs with a different Annex state, and that has not been
a source of any problems, so I am not too worried that this change will
lead to breakage going forward.)

The only modification of Annex state I found was it calling
allowMessages in some Commands that default to noMessages. Dealt with
that by adding a startCustomOutput and a startingUsualMessages.
This lets a command start with noMessages and then select the output it
wants for each CommandStart.

One bit of breakage: onlyActionOn has been removed from commands that used it.
The plan is that, since a StartMessage contains an ActionItem,
when a Key can be extracted from that, the parallel job runner can
run onlyActionOn' automatically. Then commands won't need to worry about
this detail. Future work.

Otherwise, this was a fairly straightforward process of making each
CommandStart compile again. Hopefully other behavior changes were mostly
avoided.

In a few cases, a command had a CommandStart that called a CommandPerform
that then called showStart multiple times. I have collapsed those
down to a single start action. The main command to perhaps suffer from it
is Command.Direct, which used to show a start for each file, and no
longer does.

Another minor behavior change is that some commands used showStart
before, but had an associated file and a Key available, so were changed
to ShowStart with an ActionItemAssociatedFile. That will not change the
normal output or behavior, but --json output will now include the key.
This should not break it for anyone using a real json parser.
2019-06-06 17:13:54 -04:00
Joey Hess
258a7c5cd1
add Key to all ActionItem constructors 2019-06-06 12:53:24 -04:00
Joey Hess
3893d84764
todo 2019-06-06 12:02:27 -04:00
Joey Hess
6cefe071b1
Merge branch 'master' of ssh://git-annex.branchable.com 2019-06-05 20:19:27 -04:00
Joey Hess
4088ce49ce
devblog 2019-06-05 20:18:59 -04:00
Joey Hess
4932972487
fix STM deadlock
659640e224 was buggy, it had a STM
deadlock because two actions both wanted to takeTMVar the WorkerPool
and so blocked one-another.

Fixed by completely reworking how the pool is maintained. Maintenance
threads now wait for the Async actions and update the WorkerPool. This
means twice as many threads as before, but they are green threads, so
will only use a few extra bytes of RAM per thread.
2019-06-05 20:07:35 -04:00
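
The maintenance-thread arrangement from 4932972487 can be illustrated with a small sketch. Several things here are assumptions: the pool is reduced to a counter of idle slots in a `TVar`, `runWorker` is a hypothetical name, and `forkIO` with an `MVar` stands in for the Async actions mentioned in the commit message.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Concurrent.STM
import Control.Exception (finally)

-- Run an action in a worker thread. A second, "maintenance" thread waits
-- for the worker to finish and only then returns the slot to the pool,
-- so the worker itself never has to touch the pool.
-- The returned MVar is filled once the slot has been given back.
runWorker :: TVar Int -> IO () -> IO (MVar ())
runWorker idleSlots action = do
    -- Claim a slot, blocking while none is idle.
    atomically $ do
        n <- readTVar idleSlots
        if n > 0 then writeTVar idleSlots (n - 1) else retry
    finished <- newEmptyMVar
    released <- newEmptyMVar
    -- Worker thread: runs the action and signals completion, even if the
    -- action throws.
    _ <- forkIO (action `finally` putMVar finished ())
    -- Maintenance thread: waits for the worker, then updates the pool.
    _ <- forkIO $ do
        readMVar finished
        atomically (modifyTVar' idleSlots (+ 1))
        putMVar released ()
    return released
```

Each job costs two threads instead of one, which matches the trade-off the commit message accepts.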
Joey Hess
3eac4e01a4
idea 2019-06-05 19:43:01 -04:00
Joey Hess
659640e224
separate queue for cleanup actions
When running multiple concurrent actions, the cleanup phase is run in a
separate queue than the main action queue. This can make some commands
faster, because less time is spent on bookkeeping in between each file
transfer.

But as far as I can see, nothing will be sped up much by this yet, because
all the existing cleanup actions are very light-weight. This is just groundwork
for deferring checksum verification to cleanup time.

This change does mean that if the user expects -J2 will mean that they see no
more than 2 jobs running at a time, they may be surprised to see 4 in some
cases (if the cleanup actions are slow enough to notice).

It might also make sense to enable background cleanup without the -J,
for at least one cleanup action. Indeed, that's the behavior that -J1
has now. At some point in the future, it may make sense to make the
behavior with no -J the same as -J1. The only reason it's not currently
is that git-annex can build w/o concurrent-output, and also any bugs
in concurrent-output (such as perhaps misbehaving on non-VT100 compatible
terminals) are avoided by default by only using it when -J is used.
2019-06-05 17:54:35 -04:00
Joey Hess
c04b2af3e1
improved WorkerPool abstraction
No behavior changes.
2019-06-05 14:26:48 -04:00