So apparently under ridiculous amounts of memory pressure we can get
into trouble in do_switch when we try to move the old hw context
backing storage object onto the active lists.
With list debugging enabled that usually results in us chasing a
poisoned pointer - which means we've hit upon a vma that has been
removed from all lrus with list_del (and then deallocated, so it's a
real use-after free).
Ian Lister has done some great callchain chasing and noticed that we
can reenter do_switch:
i915_gem_do_execbuffer()
i915_switch_context()
do_switch()
from = ring->last_context;
i915_gem_object_pin()
i915_gem_object_bind_to_gtt()
ret = drm_mm_insert_node_in_range_generic();
// If the above call fails then it will try i915_gem_evict_something()
// If that fails it will call i915_gem_evict_everything() ...
i915_gem_evict_everything()
i915_gpu_idle()
i915_switch_context(DEFAULT_CONTEXT)
Like with everything else where the shrinker or eviction code can
invalidate pointers we need to reload relevant state.
Note that there's no need to recheck whether a context switch is still
required because:
- Doing a switch to the same context is harmless (besides wasting a
bit of energy).
- This can only happen with the default context. But since that one's
pinned we'll never call down into evict_everything under normal
circumstances. Note that there's a little driver bringup fun
involved namely that we could recourse into do_switch for the
initial switch. Atm we're fine since we assign the context pointer
only after the call to do_switch at driver load or resume time. And
in the gpu reset case we skip the entire setup sequence (which might
be a bug on its own, but definitely not this one here).
Cc'ing stable since apparently ChromeOS guys are seeing this in the
wild (and not just on artificial stress tests), see the reference.
Note that in upstream code doesn't calle evict_everything directly
from evict_something, that's an extension in this product branch. But
we can still hit upon this bug (and apparently we do, see the linked
backtraces). I've noticed this while trying to construct a testcase
for this bug and utterly failed to provoke it. It looks like we need
to driver the system squarly into the lowmem wall and provoke the
shrinker to evict the context object by doing the last-ditch
evict_everything call.
Aside: There's currently no means to get a badly-fragmenting hw
context object away from a bad spot in the upstream code. We should
fix this by at least adding some code to evict_something to handle hw
contexts.
References: https://code.google.com/p/chromium/issues/detail?id=248191
Reported-by: Ian Lister <ian.lister@intel.com>
Cc: Ian Lister <ian.lister@intel.com>
Cc: stable@vger.kernel.org
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Stéphane Marchesin <marcheu@chromium.org>
Cc: Bloomfield, Jon <jon.bloomfield@intel.com>
Tested-by: Rafael Barbalho <rafael.barbalho@intel.com>
Reviewed-by: Ian Lister <ian.lister@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Inorder to serialise the closing of the file descriptor and its
subsequent release of client requests with i915_gem_free_request(), we
need to hold the struct_mutex in i915_gem_release(). Failing to do so
has the potential to trigger an OOPS, later with a use-after-free.
Testcase: igt/gem_close_race
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70874
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71029
Reported-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
BDW context sizes varies a bit.
v2: Squash in fixup for the hw context size from Ben.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I had this lying around from he original PPGTT series, and thought we
might try to get it in by itself.
With the introduction of context refcounting we never explicitly
ref/unref the backing object. As such, the previous fix was a bit wonky.
Aside from fixing the above, this patch also puts us in good shape for
an upcoming patch which allows a failure to occur in between
context_init and the first do_switch.
CC: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Even though we track object activity and not VMA, because we have the
active_list be based on the VM, it makes the most sense to use VMAs in
the APIs.
NOTE: Daniel intends to eventually rip out active/inactive LRUs, but for
now, leave them be.
v2: Remove leftover hunk from the previous patch which didn't keep
i915_gem_object_move_to_active. That patch had to rely on the ring to
get the dev instead of the obj. (Chris)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On both Ivybridge and Haswell, row remapping information is saved and
restored with context. This means, we never actually properly supported
the l3 remapping because our sysfs interface is asynchronous (and not
tied to any context), and the known faulty HW would be reused by the
next context to run.
Not that due to the asynchronous nature of the sysfs entry, there is no
point modifying the registers for the existing context. Instead we set a
flag for all contexts to load the correct remapping information on the
next run. Interested clients can use debugfs to determine whether or not
the row has been remapped.
One could propose at this point that we just do the remapping in the
kernel. I guess since we have to maintain the sysfs interface anyway,
I'm not sure how useful it is, and I do like keeping the policy in
userspace; (it wasn't my original decision to make the
interface the way it is, so I'm not attached).
v2: Force a context switch when we have a remap on the next switch.
(Ville)
Don't let userspace use the interface with disabled contexts.
v3: Don't force a context switch, just let it nop
Improper context slice remap initialization, 1<<1 instead of 1<<i, but I
rewrote it to avoid a second round of confusion.
Error print moved to error path (All Ville)
Added a comment on why the slice remap initialization happens.
CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I have implemented this patch before without creating a separate list
(I'm having trouble finding the links, but the messages ids are:
<1364942743-6041-2-git-send-email-ben@bwidawsk.net>
<1365118914-15753-9-git-send-email-ben@bwidawsk.net>)
However, the code is much simpler to just use a list and it makes the
code from the next patch a lot more pretty.
As you'll see in the next patch, the reason for this is to be able to
specify when a context needs to get L3 remapping. More details there.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We use the request to ensure we hold a reference to the context for the
duration that it remains in use by the ring. Each request only holds a
reference to the current context, hence we emit a request after
switching contexts with the final reference to the old context. However,
the extra interrupt caused by that request is not useful (no timing
critical function will wait for the context object), instead the overhead
of servicing the IRQ shows up in some (lightweight) benchmarks. In order
to keep the useful property of using the request to manage the context
lifetime, we want to add a dummy request that is associated with the
interrupt from the subsequent real request following the batch.
The extra interrupt was added as a side-effect of using
i915_add_request() in
commit 112522f678
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu May 2 16:48:07 2013 +0300
drm/i915: put context upon switching
v2: Daniel convinced me that the request here was solely for context
lifetime tracking and that we have the active ref to keep the object
alive whilst the MI_SET_CONTEXT. So the only concern then is which
context should get the blame for MI_SET_CONTEXT failing. The old scheme
added a request for the old context so that any hang upto and including
the switch away would mark the old context as guilty. Now any hang here
implicates the new context. However since we have already gone through a
complete flush with the last context in its last request, and all that
lies in no-man's-land is an invalidate flush and the MI_SET_CONTEXT, we
should be safe in not unduly placing blame on the new context.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
formerly: "drm/i915: Create VMAs (part 5) - move mm_list"
The mm_list is used for the active/inactive LRUs. Since those LRUs are
per address space, the link should be per VMx .
Because we'll only ever have 1 VMA before this point, it's not incorrect
to defer this change until this point in the patch series, and doing it
here makes the change much easier to understand.
Shamelessly manipulated out of Daniel:
"active/inactive stuff is used by eviction when we run out of address
space, so needs to be per-vma and per-address space. Bound/unbound otoh
is used by the shrinker which only cares about the amount of memory used
and not one bit about in which address space this memory is all used in.
Of course to actual kick out an object we need to unbind it from every
address space, but for that we have the per-object list of vmas."
v2: only bump GGTT LRU in i915_gem_object_set_to_gtt_domain (Chris)
v3: Moved earlier in the series
v4: Add dropped message from v3
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Frob patch to apply and use vma->node.size directly as
discused with Ben. Also drop a needles BUG_ON before move_to_inactive,
the function itself has the same check.]
[danvet 2nd: Rebase on top of the lost "drm/i915: Cleanup more of VMA
in destroy", specifically unlink the vma from the mm_list in
vma_unbind (to keep it symmetric with bind_to_vm) instead of
vma_destroy.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
MLC_LLC was never validated for Sandybridge and was superseded by a new
level of cacheing for the GPU in Ivybridge. Update our names to be
consistent with usage, and in the process stop setting the unwanted bit
on Sandybridge.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[danvet: s/BUG/WARN_ON(1) bikeshed.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
To verbalize it, one can say, "pin an object into the given address
space." The semantics of pinning remain the same otherwise.
Certain objects will always have to be bound into the global GTT.
Therefore, global GTT is a special case, and keep a special interface
around for it (i915_gem_obj_ggtt_pin).
v2: s/i915_gem_ggtt_pin/i915_gem_obj_ggtt_pin
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The default context is always supported (as it contains the global
hangcheck stats) and the contexts for hangcheck are not limited
to any ring.
References: https://bugs.freedesktop.org/show_bug.cgi?id=65845
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Soon we want to gut a lot of our existing assumptions how many address
spaces an object can live in, and in doing so, embed the drm_mm_node in
the object (and later the VMA).
It's possible in the future we'll want to add more getter/setter
methods, but for now this is enough to enable the VMAs.
v2: Reworked commit message (Ben)
Added comments to the main functions (Ben)
sed -i "s/i915_gem_obj_set_color/i915_gem_obj_ggtt_set_color/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_bound/i915_gem_obj_ggtt_bound/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_size/i915_gem_obj_ggtt_size/" drivers/gpu/drm/i915/*.[ch]
sed -i "s/i915_gem_obj_offset/i915_gem_obj_ggtt_offset/" drivers/gpu/drm/i915/*.[ch]
(Daniel)
v3: Rebased on new reserve_node patch
Changed DRM_DEBUG_KMS to actually work (will need fixing later)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
With updates to the spec, we can actually see the context layout, and
how many dwords are allocated. That table suggests we need 70720 bytes
per HW context. Rounded up, this is 18 pages. Looking at what lives
after the current 4 pages we use, I can't see too much important (mostly
it's d3d related), but there are a couple of things which look scary. I
am hopeful this can explain some of our odd HSW failures.
v2: Make the context only 17 pages. The power context space isn't used
ever, and execlists aren't used in our driver, making the actual total
66944 bytes.
v3: Add a comment to the code. (Jesse & Paulo)
Reported-by: "Azad, Vinit" <vinit.azad@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Only execbuffer needed all the parameters on i915_add_request().
By putting __i915_add_request behind macro, all current callsites
become cleaner. Following patch will introduce a new parameter
for __i915_add_request. With this patch, only the relevant callsite
will reflect the change making commit smaller and easier to understand.
v2: _i915_add_request as function name (Chris Wilson)
v3: change name __i915_add_request and fix ordering of params (Ben Widawsky)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
To get context hang statistics for specified context,
add i915_gem_context_get_hang_stats().
For arb-robustness, every context needs to have its own
hang statistics tracking. Added function will return
the user specified context statistics or in case of
default context, statistics from drm_i915_file_private.
v2: handle default context inside get_reset_state
v3: return struct pointer instead of passing it in as param
(Chris Wilson)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Add some debug messages to help figure out what goes wrong on context
initialization.
Later in the PPGTT series, I ended up having a lot of failures after
reset. In many cases it was extra difficult to debug because I hadn't
even realized that contexts failed to reinitialize after reset (again an
artifact of some later patches).
This fairly benign patch does help debug some potential issues which
arise later.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We did not mention the workaround name when implementing those. This
should help us track what we already implement.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Because our context refcounting doesn't grab a ref at lookup time, it is
unsafe to do so without the lock.
NOTE: We don't have an easy way to put the assertion in the lookup
function which is where this really belongs. Context switching is good
enough because it actually asserts even more correctness by protecting
the default_context.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: s/BUG/WARN/]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In order to be notified of when the context and all of its associated
objects is idle (for if the context maps to a ppgtt) we need a callback
from the retire handler. We can arrange this by using the kref_get/put
of the context for request tracking and by inserting a request to
demarque the switch away from the old context.
[Ben: fixed minor error to patch compile, AND s/last_context/from/]
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Before module unload is called, gpu_idle() will switch
to default context. This will increment ref count of base
object as the default context is 'running' on module unload
time. Unreference the drm object so that when context
is freed, base object is freed as well.
v2: added comment to explain the refcounts (Ben Widawsky)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Enabling PPGTT and also the need to track which context was guilty of
gpu hang (arb robustness enabling) have put pressure for struct i915_hw_context
to be more than just a placeholder for hw context state.
In order to track object lifetime properly in a multi peer usage, add reference
counting for i915_hw_context.
v2: track i915_hw_context pointers instead of using ctx_ids
(from Chris Wilson)
v3 (Ben): Get rid of do_release() and handle refcounting more compactly.
(recommended by Chis)
v4: kref_* put inside static inlines (Daniel Vetter)
remove code duplication on freeing context (Chris Wilson)
v5: idr_remove and ctx->file_priv = NULL in destroy ioctl (Chris)
This actually will cause a problem if one destroys a context and later
refers to the idea of the context (multiple contexts may have the same
id, but only 1 will exist in the idr).
v6: Strip out the request related stuff. Reworded commit message.
Got rid of do_destroy and introduced i915_gem_context_release_handle,
suggested by Chris Wilson.
v7: idr_remove can't be called inside idr_for_each (Chris Wilson)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v5)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> (v7)
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Squash sob lines, the patch ping-ponged between Ben and Mika
a bit ...]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Enabling context support increases SwapBuffers latency by about 20%
(measured on an i7-3720qm). We can offset that loss slightly by enabling
faster caching for the contexts. As they are not backed by any
particular cache (such as the sampler or render caches) our only option
is to select the generic mid-level cache. This reduces the latency of
the swap by about 5%.
Oddly this effect can be observed running smokin-guns on IVB at
1280x1024:
Using BLT copies for swaps: 151.67 fps
Using Render copies for swaps (unpatched): 141.70 fps
With contexts disabled: 150.23 fps
With contexts in L3$: 150.77 fps
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Convert to the much saner new idr interface.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: David Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It's not that the assertion is incorrect, but rather that we can call
do_destroy early in loading, and we will falsely BUG().
Since contexts have been in for a while now, and in the internal APIs
are pretty stable, it should be fairly safe to remove this.
v2: Remove unused dev_priv, and dev
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This was a rebase error from when the patches originally landed. Since
the context size is unsigned, there is also no use in checking if it's
less than 0.
The existing code is not really wrong, but it's not simple as it should
be.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Based on the work by Mika Kuoppala, we realised that we need to handle
seqno wraparound prior to committing our changes to the ring. The most
obvious point then is to grab the seqno inside intel_ring_begin(), and
then to reuse that seqno for all ring operations until the next request.
As intel_ring_begin() can fail, the callers must already be prepared to
handle such failure and so we can safely add further checks.
This patch looks like it should be split up into the interface
changes and the tweaks to move seqno wrapping from the execbuffer into
the core seqno increment. However, I found no easy way to break it into
incremental steps without introducing further broken behaviour.
v2: Mika found a silly mistake and a subtle error in the existing code;
inside i915_gem_retire_requests() we were resetting the sync_seqno of
the target ring based on the seqno from this ring - which are only
related by the order of their allocation, not retirement. Hence we were
applying the optimisation that the rings were synchronised too early,
fortunately the only real casualty there is the handling of seqno
wrapping.
v3: Do not forget to reset the sync_seqno upon module reinitialisation,
ala resume.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=863861
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> [v2]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Whoops. This was fixed previously, but not sure how it got lost. It's
not needed for -fixes or stable because at the moment
drm_i915_file_private is way bigger than i915_hw_context (by 120 bytes
on my 64b build).
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel writes:
Bigger -fixes pile, mostly because I've included Ajax' DP dongle stuff,
as discussed on irc. Otherwise just small things:
- regression fix to finally make 6bpc auto-dither on dp work (Jani)
- reinstate an snb ctx w/a that accidentally got lost in a rework (Chris)
- fixup the DP train sequence, logic-goof-up uncovered by Coverty (Chris)
- fix set_caching locking (Ben)
- fix spurious segfault on con-current gtt mmap faulting (Dimitry and Mika)
- some pageflip correctness fixes (still hunting down some issues, but
these are the worst offenders of confused code that we've tracked down
thus far) from Chris and me
- fixup swizzling settings on vlv (Jesse)
- gt_mode w/a from Ben added, fixes snb gt1 rc6+hw ctx hangs.
* 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel:
drm/i915: Fix GT_MODE default value
drm/i915: don't frob the vblank ts in finish_page_flip
drm/i915: call drm_handle_vblank before finish_page_flip
drm/i915: print warning if vmi915_gem_fault error is not handled
drm/i915: EBUSY status handling added to i915_gem_fault().
drm/i915: Try harder to complete DP training pattern 1
drm/i915: set swizzling to none on VLV
drm/dp: Make sink count DP 1.2 aware
drm/dp: Document DP spec versions for various DPCD registers
drm/i915/dp: Be smarter about connection sense for branch devices
drm/i915/dp: Fetch downstream port info if needed during DPCD fetch
drm/dp: Update DPCD defines
drm: Export drm_probe_ddc()
drm/i915: Flush the pending flips on the CRTC before modification
drm/i915: Actually invalidate the TLB for the SandyBridge HW contexts w/a
drm/i915: Fix set_caching locking
drm/i915: use adjusted_mode instead of mode for checking the 6bpc force flag
Pull drm merge (part 1) from Dave Airlie:
"So first of all my tree and uapi stuff has a conflict mess, its my
fault as the nouveau stuff didn't hit -next as were trying to rebase
regressions out of it before we merged.
Highlights:
- SH mobile modesetting driver and associated helpers
- some DRM core documentation
- i915 modesetting rework, haswell hdmi, haswell and vlv fixes, write
combined pte writing, ilk rc6 support,
- nouveau: major driver rework into a hw core driver, makes features
like SLI a lot saner to implement,
- psb: add eDP/DP support for Cedarview
- radeon: 2 layer page tables, async VM pte updates, better PLL
selection for > 2 screens, better ACPI interactions
The rest is general grab bag of fixes.
So why part 1? well I have the exynos pull req which came in a bit
late but was waiting for me to do something they shouldn't have and it
looks fairly safe, and David Howells has some more header cleanups
he'd like me to pull, that seem like a good idea, but I'd like to get
this merge out of the way so -next dosen't get blocked."
Tons of conflicts mostly due to silly include line changes, but mostly
mindless. A few other small semantic conflicts too, noted from Dave's
pre-merged branch.
* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (447 commits)
drm/nv98/crypt: fix fuc build with latest envyas
drm/nouveau/devinit: fixup various issues with subdev ctor/init ordering
drm/nv41/vm: fix and enable use of "real" pciegart
drm/nv44/vm: fix and enable use of "real" pciegart
drm/nv04/dmaobj: fixup vm target handling in preparation for nv4x pcie
drm/nouveau: store supported dma mask in vmmgr
drm/nvc0/ibus: initial implementation of subdev
drm/nouveau/therm: add support for fan-control modes
drm/nouveau/hwmon: rename pwm0* to pmw1* to follow hwmon's rules
drm/nouveau/therm: calculate the pwm divisor on nv50+
drm/nouveau/fan: rewrite the fan tachometer driver to get more precision, faster
drm/nouveau/therm: move thermal-related functions to the therm subdev
drm/nouveau/bios: parse the pwm divisor from the perf table
drm/nouveau/therm: use the EXTDEV table to detect i2c monitoring devices
drm/nouveau/therm: rework thermal table parsing
drm/nouveau/gpio: expose the PWM/TOGGLE parameter found in the gpio vbios table
drm/nouveau: fix pm initialization order
drm/nouveau/bios: check that fixed tvdac gpio data is valid before using it
drm/nouveau: log channel debug/error messages from client object rather than drm client
drm/nouveau: have drm debugging macros build on top of core macros
...
Convert #include "..." to #include <path/...> in drivers/gpu/.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>
A side-effect of commit 7d54a90428
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Fri Aug 10 10:18:10 2012 +0100
drm/i915: Apply post-sync write for pipe control invalidates
was that only a request to emit invalidate flush would result in the
TLB being invalidated (since it requires synchronisation and so incurs a
performance penalty). However, the stated w/a for hardware contexts is
that the TLBs must be invalidated prior to a MI_SET_CONTEXT, yet the w/a
itself did not request the TLBs to be invalidated...
Note this w/a does not prevent the hard system hang I experience when
using hw contexts (with rc6 enabled) on SNB GT1.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Avoid stalling and waiting for the GPU by checking to see if there is
sufficient inactive space in the aperture for us to bind the buffer
prior to writing through the GTT. If there is inadequate space we will
have to stall waiting for the GPU, and incur overheads moving objects
about. Instead, only incur the clflush overhead on the target object by
writing through shmem.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
the following warning was produced,
drivers/gpu/drm/i915/i915_gem_context.c: In function ‘i915_switch_context’:
drivers/gpu/drm/i915/i915_gem_context.c:454:6: warning: unused variable ‘ret’ [-Wunused-variable]
fix up by removing it
Signed-off-by: Devendra Naga <devendra.aaru@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Basic context support on HSW is no different than previous generations.
The size of the context object changes, but that's about it.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When bug hunting, I found the interface to do_switch() overly
complicated and I believe festered the earlier bug. This aims to make
the code a little clearer.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We need to check that "ctx" is a valid pointer before dereferencing it.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Otherwise we end up trying to unpin a freed object and BUG.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The issue is that we stale data in the CPU caches, when we come to
swap-out the object, the CPU may short-circuit the reads from those
cacheline and so corrupt the context object.
Secondary, leaving the context object as being marked in the CPU write
domain whilst on the GPU active list is a bad idea and will throw
warnings later.
Note: Thanks to calling set_to_gtt_domain with write = false and not
setting any gpu write domain when putting a context object onto the
active list (when we switch away from it) the set_to_gtt_domain call
won't block.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Added a note to the commit message and a comment in the code
to explain the clever non-blocking trick.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
*sigh* the docs had it spelled wrong, corrected it, and then proceeded
to re-do the original error. The original code preserved this history,
and this patch attempts to keep in sync with the current docs.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel complained about this on initial review, but he graciously moved
the patches forward. As promised, I am delivering the desired cleanup
now.
Hopefully I didn't screw the trivial patch up ;-)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The idr code already passes us the pointer associated with that id, so
no need to look it up again. Also, we'll kill the idr right away, so
there's no issue with leaving these dangling pointers behind - the
current code does the same.
v2: Also drop the file argument, spotted by Ben Widawsky.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This is our customary "no such object" errno, not -EINVAL.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It doesn't hurt and it at least prevents us from OOPSing left and
right at quite a few places. This also allows us to simplify the code
a bit by folding the only line of context_open into the callsite.
We obviuosly also need to run the cleanup code unconditionally, too.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
commit 8e96d9c4d9
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Mon Jun 4 14:42:56 2012 -0700
drm/i915: reset the GPU on context fini
broke module unload because it reset the gpu before we've stopped
touching it. Later on in the unload sequence the ringbuffer code
complained that the gpu would idle properly (because intel_gpu_reset
only resets the hw and not our sw state).
v2: Reorder things so that we reset the gpu _before_ we release the
backing storage of the default context.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51183
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This got dropped as a result of the last round of comments. I didn't
test it on unsupported HW (which this is likely the case).
Note that this prevents hw context from blowing up on any pre-gen6 hw.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51142
[danvet: Added note and buglink.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>