As we only ever keep the first error state around, we can avoid some
work that can be quite intrusive if we don't record the error the second
time around. This does move the race whereby the user could discard one
error state as the second is being captured, but that race exists in the
current code and we hope that recapturing error state is only done for
debugging.
Note that as we discard the error state for simulated errors, igt that
exercise error capture continue to function.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1467618513-4966-3-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Remove some redundant kernel messages as we deduce a hung GPU and
capture the error state.
v2: Fix "hang" vs "no progress" message whilst I was there
v3: s/snprintf/scnprintf/
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467618513-4966-2-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
After Joonas complained about using READ_ONCE() on the only use of the
variable in the function, where the intent was to simply document that
the read was intentionally racy and unlocked, I switched the READ_ONCE()
over to lockless_dereference(). However, in linux-next that has a
stronger type-check to only allow pointers and is no longer
interchangeable with READ_ONCE(), see commit 331b6d8c7a
("locking/barriers: Validate lockless_dereference() is used on a pointer
type")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 67d97da349 ("drm/i915: Only start retire worker when idle")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1467705276-707-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
As vendor document indicate, when REF_CLK bit set 0, then DP
phy's REF_CLK should switch to 24M source clock.
But due to IC PHY layout mistaken, some chips need to flip this
bit(like RK3288), and unfortunately they didn't indicate in the
DP version register. That's why we have to make this little hack.
Signed-off-by: Yakir Yang <ykk@rock-chips.com>
Reviewed-by: Tomasz Figa <tomasz.figa@chromium.com>
Tested-by: Javier Martinez Canillas <javier@osg.samsung.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
There're an register define error in ANALOGIX_DP_PLL_REG_1 which introduced
by commit bcec20fd5a ("drm: bridge: analogix/dp: add some rk3288 special
registers setting").
The PHY PLL input clock source is selected by ANALOGIX_DP_PLL_REG_1
BIT 0, not BIT 1.
Signed-off-by: Yakir Yang <ykk@rock-chips.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Tomasz Figa <tomasz.figa@chromium.com>
Tested-by: Javier Martinez Canillas <javier@osg.samsung.com>
eDP controller need to declare which vop provide the video source,
and it's defined in GRF registers.
But different chips have different GRF register address, so we need to
create a device data to declare the GRF messages for each chips.
Signed-off-by: Yakir Yang <ykk@rock-chips.com>
Acked-by: Mark Yao <mark.yao@rock-chips.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Allow the HDMI CODECs to get private data passed in in callbacks.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJXd3kSAAoJECTWi3JdVIfQuVcH/A3UN9SCUMjPfRvpSOr5fEV7
MRn4AHxnP0DWucN041rdTvuxJVpIcLpkj15qfJLFOYLtN0ub2YVNlic2a3pFoRtx
ajbDC//5TFvrWjEEkMAPgdALymVYwCOKLuUgSI6xBKe+RW58JIYLzz+T+NUKHor4
3tEBNSvEQL7ljZWwr9657EO/KX9PIO7juWYk2wFHyLN7YBZEz8PZDI9y37bjnpce
PjK7X5uWXss/DRwcSvKZX6MVmErn6egxzhlFon3YpSWEzqalGsEUsJxW7Jhwk8G5
XA4EntLWAQnwW2R4Jf/5+TI1DTHXsne0iRNZbVyDf//LX5GIPJp6Ck5AYI0QVTw=
=JcYF
-----END PGP SIGNATURE-----
Merge tag 'asoc-hdmi-codec-pdata' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into drm-next
ASoC: Add private data for HDMI CODEC callbacks
Allow the HDMI CODECs to get private data passed in in callbacks.
[airlied:
Add STI/mediatek patches from Arnd for drivers merged later in drm tree.]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
* tag 'asoc-hdmi-codec-pdata' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound:
ASoC: hdmi-codec: callback function will be called with private data
Some IS_ and HAS_ macros can return any non-zero value for true.
One potential problem with that is that someone could assign
them to integers and be surprised with the result. Therefore it
is probably safer to do the conversion to 0/1 in the macros
themselves.
Luckily this does not seem to have an effect on code size.
Only one call site was getting bit by this and a patch for
that has been sent as "drm/i915/guc: Protect against HAS_GUC_*
returning true values other than one".
v2: Added some extra braces as suggested by checkpatch.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1467643823-9798-1-git-send-email-tvrtko.ursulin@linux.intel.com
Since we now subclass struct drm_device, we can save pointer dances by
noting the equivalence of struct drm_device and struct drm_i915_private,
i.e. by using to_i915().
text data bss dec hex filename
1073824 4562 416 1078802 107612 drivers/gpu/drm/i915/i915.ko
1068976 4562 416 1073954 106322 drivers/gpu/drm/i915/i915.ko
Created by the coccinelle script:
@@
expression E;
identifier p;
@@
- struct drm_i915_private *p = E->dev_private;
+ struct drm_i915_private *p = to_i915(E);
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467628477-25379-1-git-send-email-chris@chris-wilson.co.uk
Vblank turn on should be called in crtc's enable callback.
And turn off called in crtc's disable callback.
Thanks to Daniel Vetter, this bug is reported by him.
Reported-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
Add select HISI_KIRIN_DW_DSI to Kconfig.
The DRM driver depends on dsi sub-driver.
Signed-off-by: Zoltan Kuscsik <zoltan.kuscsik@linaro.org>
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
In case of error, the function devm_clk_get() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check
should be replaced with IS_ERR().
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Reviewed-by: Chen Feng <puck.chen@hisilicon.com>
This patch added the loading of the GuC for Kabylake.
It loads a 9.14 firmware.
v2: Fix commit message
v3: Fix major/minor var names to match -nightly. (Rodrigo)
Cc: Christophe Prigent <christophe.prigent@intel.com>
Signed-off-by: Peter Antoine <peter.antoine@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com> (v3)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467304672-2106-1-git-send-email-rodrigo.vivi@intel.com
For simplicity in testing, only report known rings in the mask. This
allows userspace to try and trigger a missed irq on every ring and do a
comparison between i915_ring_test_irq and i915_ring_missed_irq to see if
any rings failed.
v2: Move the debug message to after the rings are selected (so that the
message accurately reflects reality)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466170505-8048-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Acquiring the forcewake domain asserts that it is in an atomic section
(as we always expect to be under the uncore.lock). This is true except for
initialising the domains on Ivybridge, and so we generate a warning.
Wrap the manual usage of fw_domains inside the spin_lock.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467566973-13596-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
igt likes to inject GPU hangs into its command streams. However, as we
expect these hangs, we don't actually want them recorded in the dmesg
output or stored in the i915_error_state (usually). To accommodate this
allow userspace to set a flag on the context that any hang emanating
from that context will not be recorded. We still do the error capture
(otherwise how do we find the guilty context and know its intent?) as
part of the reason for random GPU hang injection is to exercise the race
conditions between the error capture and normal execution.
v2: Split out the request->ringbuf error capture changes.
v3: Move the flag defines next to the intel_context->flags definition
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-9-git-send-email-chris@chris-wilson.co.uk
The request tells us where to read the ringbuf from, so use that
information to simplify the error capture. If no request was active at
the time of the hang, the ring is idle and there is no information
inside the ring pertaining to the hang.
Note carefully that this will reduce the amount of information stored in
the error state - any ring without an active request will not be
recorded.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-8-git-send-email-chris@chris-wilson.co.uk
Now that we have (near) universal GPU recovery code, we can inject a
real hang from userspace and not need any fakery. Not only does this
mean that the testing is far more realistic, but we can simplify the
kernel in the process.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-7-git-send-email-chris@chris-wilson.co.uk
Make sure that the RPS bottom-half is flushed before we set the idle
frequency when we decide the GPU is idle. This should prevent any races
with the bottom-half and setting the idle frequency, and ensures that
the bottom-half is bounded by the GPU's rpm reference taken for when it
is active (i.e. between gen6_rps_busy() and gen6_rps_idle()).
v2: Avoid recursively using the i915->wq - RPS does not touch the
struct_mutex so has no place being on the ordered i915->wq.
v3: Enable/disable interrupts for RPS busy/idle in order to prevent
further HW access from RPS outside of the wakeref.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
References: https://bugs.freedesktop.org/show_bug.cgi?id=89728
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-6-git-send-email-chris@chris-wilson.co.uk
Describe the intent of boosting the GPU frequency to maximum before
waiting on the GPU.
RPS waitboosting was introduced with commit b29c19b645 ("drm/i915:
Boost RPS frequency for CPU stalls") but lacked a concise comment in the
code to explain itself.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-5-git-send-email-chris@chris-wilson.co.uk
Ideally, we want to automagically have the GPU respond to the
instantaneous load by reclocking itself. However, reclocking occurs
relatively slowly, and to the client waiting for a result from the GPU,
too late. To compensate and reduce the client latency, we allow the
first wait from a client to boost the GPU clocks to maximum. This
overcomes the lag in autoreclocking, at the expense of forcing the GPU
clocks too high. So to offset the excessive power usage, we currently
allow a client to only boost the clocks once before we detect the GPU
is idle again. This works reasonably for say the first frame in a
benchmark, but for many more synchronous workloads (like OpenCL) we find
the GPU clocks remain too low. By noting a wait which would idle the GPU
(i.e. we just waited upon the last known request), we can give that
client the idle boost credit (for their next wait) without the 100ms
delay required for us to detect the GPU idle state. The intention is to
boost clients that are stalling in the process of feeding the GPU more
work (and who in doing so let the GPU idle), without granting boost
credits to clients that are throttling themselves (such as compositors).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Zou, Nanhai" <nanhai.zou@intel.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-4-git-send-email-chris@chris-wilson.co.uk
We know, by design, that whilst the GPU is active (and thus we are
throttling) the retire_worker is queued. Therefore attempting to requeue
it with queue_delayed_work() is a no-op and we can safely remove it.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-3-git-send-email-chris@chris-wilson.co.uk
Rather than persistently postponing the idle-work everytime somebody
calls i915_gem_retire_requests() (potentially ensuring that we never
reach the idle state), queue the work the first time we detect all
requests are complete. Then if in 100ms, more requests have been queued,
we will abort the idle-worker and wait again until all the new requests
have been completed.
Of course, this does depend upon the idle worker cancelling itself
gracefully from the previous patch.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-2-git-send-email-chris@chris-wilson.co.uk
The retire worker is a low frequency task that makes sure we retire
outstanding requests if userspace is being lax. We only need to start it
once as it remains active until the GPU is idle, so do a cheap test
before the more expensive queue_work(). A consequence of this is that we
need correct locking in the worker to make the hot path of request
submission cheap. To keep the symmetry and keep hangcheck strictly bound
by the GPU's wakelock, we move the cancel_sync(hangcheck) to the idle
worker before dropping the wakelock.
v2: Guard against RCU fouling the breadcrumbs bottom-half whilst we kick
the waiter.
v3: Remove the wakeref assertion squelching (now we hold a wakeref for
the hangcheck, any rpm error there is genuine).
v4: To prevent excess work when retiring requests, we split the busy
flag into two, a boolean to denote whether we hold the wakeref and a
bitmask of active engines.
v5: Reorder cancelling hangcheck upon idling to avoid a race where we
might cancel a hangcheck after being preempted by a new task
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
References: https://bugs.freedesktop.org/show_bug.cgi?id=88437
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-1-git-send-email-chris@chris-wilson.co.uk
alloc_workqueue replaces deprecated create_workqueue().
create_workqueue has been replaced with alloc_workqueue with max_active
as 0 since there is no need for throttling the number of active work items.
WQ_MEM_RECLAIM has not been set to because kfd_process_wq will not be
used in memory reclaim path.
kfd_process_wq is used for delay destruction. A work item embedded in
kfd_process gets queued to kfd_process_wq and when it executes it
destroys and frees the containing kfd_process and thus itself.
This requires a dedicated workqueue because a work item once queued, may
get freed at any point of time and any external entity cannot
flush the work item. So, in order to wait for such a work item,
it needs to be put on a dedicated workqueue.
kfd_module_exit() calls kfd_process_destroy_wq which ensures that all
pending work items are finished before the module is removed.
flush_workqueue is unnecessary since destroy_workqueue() itself calls
drain_workqueue() which flushes repeatedly till the workqueue
becomes empty.
Hence flush_workqueue has been removed.
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
smatch complains of:
drivers/gpu/drm/i915/intel_fbdev.c:403
intel_fb_initial_config() warn: should '1 << i' be a 64 bit type?
drivers/gpu/drm/i915/intel_fbdev.c:422 intel_fb_initial_config() warn:
should '1 << i' be a 64 bit type?
drivers/gpu/drm/i915/intel_fbdev.c:501 intel_fb_initial_config() warn:
should '1 << i' be a 64 bit type?
We are prepared to iterate over a u64 but don't limit the number of
connectors we try to configure to a maximum of 64.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1467470166-31717-5-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
smatch complains:
drivers/gpu/drm/i915/i915_debugfs.c:1390 i915_frequency_info() Function
too hairy. Giving up.
drivers/gpu/drm/i915/i915_debugfs.c:1985 i915_gem_framebuffer_info()
warn: inconsistent indenting
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1467470166-31717-3-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
The patchset contains a new helper in drm_fb_cma_helper.c for suspend/
resume when using cma backed framebuffers.
* 'for-next' of http://git.agner.ch/git/linux-drm-fsl-dcu:
drm/fsl-dcu: disable vblank events on CRTC disable
drm/fsl-dcu: implement suspend/resume using atomic helpers
drm/fsl-dcu: use clk helpers
drm/fsl-dcu: move layer initialization to plane file
drm/fsl-dcu: store layer registers in soc_data
drm/fb_cma_helper: add suspend helper
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJXcHi9AAoJEHm+PkMAQRiGSJ0H/2o4t9VWYmhyPC1sdIHoCExJ
P4tBrcZYBmKcsOmIfnJDa5g/+IdhouEUM0v0fHPogS2UUWT9eRuJWYD3sY+HpEQ+
heKTli8X73gsFB25odeIbIt0jAoSiiMYWDrWqLNsuUV1tjEYVA8rH0SM94FiOC/5
7WVWXLTuH+Rm7JHP18BnKxmMMbzrTFmwisLMqFKyfZRRSlS+/ix7iLUNO9AFa39B
YHxNPihLrZ0oONyCOAQoHTIXXrw0cQbxV2utg3vnMcCZdme2xOn+iXMntTSKfZ39
iC9/T0vsO3R6OrRo2aDZAnCPUAniXnMEIhrKG37WMyXpj6cucZ/2QiNXcXviGV4=
=iLte
-----END PGP SIGNATURE-----
Back-merge tag 'v4.7-rc5' into drm-next
Linux 4.7-rc5
The fsl-dcu pull needs -rc3 so go to -rc5 for now.
This pull request include 3 minors fix and one feature evolution
around ASoC hdmi codec support.
* 'sti-drm-next-2016-06-30' of http://git.linaro.org/people/benjamin.gaignard/kernel:
drm: sti: Add ASoC generic hdmi codec support.
drm/sti: adjust delay for AWG
drm: sti: fix clocking issues in crtc
drm/sti: Use 64-bit timestamps
Since the tests can and do explicitly check debugfs/i915_ring_missed_irqs
for the handling of a "missed interrupt", adding it to the dmesg at INFO
is just noise. When it happens for real, we still class it as an ERROR.
Note that I have chose to remove it entirely because when we detect the
"missed interrupt" is irrelevant and the message contains no more
information than we glean from looking in debugfs.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-20-git-send-email-chris@chris-wilson.co.uk
With only a single callsite for intel_engine_cs->irq_get and ->irq_put,
we can reduce the code size by moving the common preamble into the
caller, and we can also eliminate the reference counting.
For completeness, as we are no longer doing reference counting on irq,
rename the get/put vfunctions to enable/disable respectively and are
able to review the use of posting reads. We only require the
serialisation with hardware when enabling the interrupt (i.e. so we
cannot miss an interrupt by going to sleep before the hardware truly
enables it).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-18-git-send-email-chris@chris-wilson.co.uk
Under the assumption that enabling signaling will be a frequent
operation, lets preallocate our attachments for signaling inside the
(rather large) request struct (and so benefiting from the slab cache).
v2: Convert from void * to more meaningful names and types.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-17-git-send-email-chris@chris-wilson.co.uk
If we convert the tracing over from direct use of ring->irq_get() and
over to the breadcrumb infrastructure, we only have a single user of the
ring->irq_get and so we will be able to simplify the driver routines
(eliminating the redundant validation and irq refcounting).
Process context is preferred over softirq (or even hardirq) for a couple
of reasons:
- we already utilize process context to have fast wakeup of a single
client (i.e. the client waiting for the GPU inspects the seqno for
itself following an interrupt to avoid the overhead of a context
switch before it returns to userspace)
- engine->irq_seqno() is not suitable for use from an softirq/hardirq
context as we may require long waits (100-250us) to ensure the seqno
write is posted before we read it from the CPU
A signaling framework is a requirement for enabling dma-fences.
v2: Move to a signaling framework based upon the waiter.
v3: Track the first-signal to avoid having to walk the rbtree everytime.
v4: Mark the signaler thread as RT priority to reduce latency in the
indirect wakeups.
v5: Make failure to allocate the thread fatal.
v6: Rename kthreads to i915/signal:%u
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-16-git-send-email-chris@chris-wilson.co.uk
We have testcases to ensure that seqno wraparound works fine, so we can
forgo forcing everyone to encounter seqno wraparound during early
uptime. seqno wraparound incurs a full GPU stall so not forcing it
will eliminate one jitter from the early system. Using the testcases, we
have very deterministic testing which given how difficult it would be to
debug an issue (GPU hang) stemming from a wraparound using pure
postmortem analysis I see no value in forcing a wrap during boot.
Advancing the global next_seqno after a GPU reset is equally pointless.
References? https://bugs.freedesktop.org/show_bug.cgi?id=95023
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-15-git-send-email-chris@chris-wilson.co.uk
If we flag the seqno as potentially stale upon receiving an interrupt,
we can use that information to reduce the frequency that we apply the
heavyweight coherent seqno read (i.e. if we wake up a chain of waiters).
v2: Use cmpxchg to replace READ_ONCE/WRITE_ONCE for more explicit
control of the ordering wrt to interrupt generation and interrupt
checking in the bottom-half.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-14-git-send-email-chris@chris-wilson.co.uk
If we have multiple waiters, we may find that many complete on the same
wake up. If we first inspect the seqno from the CPU cache, we may reduce
the number of heavyweight coherent seqno reads we require.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467390209-3576-13-git-send-email-chris@chris-wilson.co.uk