On a machine with bit17 swizzling, we need to store the bit17 of the
physical page address in put-pages. This requires a memory allocation,
on average less than a page, which may be difficult to satisfy is the
request to put-pages is on behalf of the shrinker. We could allow that
allocation to pull from the reserved memory pools, but it seems much
safer to preallocate the array for tiled objects on affected machines.
v2: Export i915_gem_object_needs_bit17_swizzle() for reuse.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Both the dp and fdi code use the exact same computations (ignore minor
differences in conversion between bits and bytes).
This makes it even more apparent that we have a _massive_ mess between
cpu transcoder/fdi link/pch transcoder and pch link settings. And also
that we have hilarious amounts of confusion between edp and dp
(despite that they're identical at a link level).
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This has originally been added in
commit 8db9d77b1b
Author: Zhenyu Wang <zhenyuw@linux.intel.com>
Date: Wed Apr 7 16:15:54 2010 +0800
drm/i915: Support for Cougarpoint PCH display pipeline
probably to combat issues with hw state left behind by the BIOS. And
indeed, I've checked out that specific revision, and there is no DP
support yet. So the pch dp transcoder won't be correctly disabled, and
that's important since it requires a rether special disable dance:
Just writing 0 to TRANS_DP_CTL won't cut it, since we need to select
the NONE port when disabling, too.
And indeed, things seem to still work, so let's just remove this.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This could have happened with the old crtc helper based modeset code,
but can't happen any longer with the new code.
Hence put in a WARN and adjust the comment. If no one hits this, we
can eventually remove it (like a few other such cases across our
code).
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
17 ms is eerily close to 60 Hz ^-1
Unfortunately this goes back to the original DP enabling for ilk, and
unfortunately does not come with a reason for it's existance attached.
Some closer inspection of the code and DP specs shows that we set the
idle link pattern before we disable the port. And it seems like that
the DP spec (or at least our hw) only switch to the idle pattern on
the next vblank. Hence a vblank wait at this spot makes _much_ more
sense than a really long wait.
v2: Rebase fixup.
v3: Add comment requested by Paulo Zanoni saying that we don't really
know what this wait is for.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
While reading docs I've noticed that this special workaround to select
the 1.6 GHz DP clock only applies to pre-production ilk machines.
Since the registers we're touching here are rather undocumented and
might be harmful on later chips, rip it out.
For the Bspec reference of this w/a look in "vol4g CPU Display
Registers [DevILK]", Section 4.1.7.1 "DP_A—DisplayPort A
Control Register", "DP_PLL_Frequency_Select".
v2: Keep a debug message as a hint in case something regresses.
Requested by Chris Wilson.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now that we enable the cpu edp pll in intel_dp->pre_enable and no
longer in crtc_mode_set, we can also move the modeset part to the
intel_dp->mode_set callback. Previously this was not possible because
the encoder ->mode_set callbacks are called after the crtc mode set
callback.
v2: Rebase on top of copy&pasted hsw crtc_mode_set.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Especially getting rid of all things lvds is ... great!
v2: Drop the two additional pre-hsw hunks noticed by Paulo Zanoni.
v3:
- handle DP ports correctly (spoted by Paulo)
- don't leave {} behind for a single-line block (again spotted by
Paulo)
- kill another if (IBX || CPT) block
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Before queuing the flip but crucially after attaching the unpin-work to
the crtc, we continue to setup the unpin-work. However, should the
hardware fire early, we see the connected unpin-work and queue the task.
The task then promptly runs and unpins the fb before we finish taking
the required references or even pinning it... Havoc.
To close the race, we use the flip-pending atomic to indicate when the
flip is finally setup and enqueued. So during the flip-done processing,
we can check more accurately whether the flip was expected.
v2: Add the appropriate mb() to ensure that the writes to the page-flip
worker are complete prior to marking it active and emitting the MI_FLIP.
On the read side, the mb should be enforced by the spinlocks.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
[danvet: Review the barriers a bit, we need a write barrier both
before and after updating ->pending. Similarly we need a read barrier
in the interrupt handler both before and after reading ->pending. With
well-ordered irqs only one barrier in each place should be required,
but since this patch explicitly sets out to combat spurious interrupts
with is staged activation of the unpin work we need to go full-bore on
the barriers, too. Discussed with Chris Wilson on irc and changes
acked by him.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Having 9500 lines repeated on dmesg does not help me at all.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
TRANS_DP_VIDEO_AUDIO is not used at all.
The other 3 has duplicated #defines.
Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Less clutter in the traces. And in both cases we yell rather loud
into the logs if we time out. Patch suggested by Chris Wilson.
v2: Annotate another I915_READ in dp_aux to be consistent - we filter
out all register io in wait_for and similar loops. Chris also
suggested to mark all dp_aux register access as _NOTRACE, but I think
we should keep all functionally relevant access around, and filter
unneeded bits in userspace after the trace is captured.
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
At least on the platforms that have a dp aux irq and also have it
enabled - vlvhsw should have one, too. But I don't have a machine to
test this on. Judging from docs there's no dp aux interrupt for gm45.
Also, I only have an ivb cpu edp machine, so the dp aux A code for
snb/ilk is untested.
For dpcd probing when nothing is connected it slashes about 5ms of cpu
time (cpu time is now negligible), which agrees with 3 * 5 400 usec
timeouts.
A previous version of this patch increases the time required to go
through the dp_detect cycle (which includes reading the edid) from
around 33 ms to around 40 ms. Experiments indicated that this is
purely due to the irq latency - the hw doesn't allow us to queue up
dp aux transactions and hence irq latency directly affects throughput.
gmbus is much better, there we have a 8 byte buffer, and we get the
irq once another 4 bytes can be queued up.
But by using the pm_qos interface to request the lowest possible cpu
wake-up latency this slowdown completely disappeared.
Since all our output detection logic is single-threaded with the
mode_config mutex right now anyway, I've decide not ot play fancy and
to just reuse the gmbus wait queue. But this would definitely prep the
way to run dp detection on different ports in parallel
v2: Add a timeout for dp aux transfers when using interrupts - the hw
_does_ prevent this with the hw-based 400 usec timeout, but if the
irq somehow doesn't arrive we're screwed. Lesson learned while
developing this ;-)
v3: While at it also convert the busy-loop to wait_for_atomic, so that
we don't run the risk of an infinite loop any more.
v4: Ensure we have the smallest possible irq latency by using the
pm_qos interface.
v5: Add a comment to the code to explain why we frob pm_qos. Suggested
by Chris Wilson.
v6: Disable dp irq for vlv, that's easier than trying to get at docs
and hw.
v7: Squash in a fix for Haswell that Paulo Zanoni tracked down - the
dp aux registers aren't at a fixed offset any more, but can be on the
PCH while the DP port is on the cpu die.
Reviewed-by: Imre Deak <imre.deak@intel.com> (v6)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Doesn't do anything yet than call dp_aux_irq_handler.
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
GMBUS_ACTIVE has inverted sense and so doesn't fit into the
wait_hw_status helper, hence create a new gmbus_wait_idle functions.
Also, we only care about the idle irq event and nothing else, which
allows us to use the wait_event_timeout helper directly without
jumping through hoops to catch NAKs.
Since gen2/3 don't have gmbus interrupts, handle them separately with
the old wait_for macro.
This shaves another few ms off reading EDID from a hdmi screen on my
testbox here. EDID reading with interrupt driven gmbus is now as fast
as with busy-looping gmbus at 28 ms here (with negligible cpu
overhead).
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We need two special things to properly wire this up:
- Add another argument to gmbus_wait_hw_status to pass in the
correct interrupt bit in gmbus4.
- Since we can only get an irq for one of the two events we want,
hand-roll the wait_event_timeout code so that we wake up every
jiffie and can check for NAKs. This way we also subsume gmbus
support for platforms without interrupts (or where those are not
yet enabled).
The important bit really is to only enable one gmbus interrupt source
at the same time - with that piece of lore figured out, this seems to
work flawlessly.
Ben Widawsky rightfully complained the lack of measurements for the
claimed benefits (especially since the first version was actually
broken and fell back to bit-banging). Previously reading the 256 byte
hdmi EDID takes about 72 ms here. With this patch it's down to 33 ms.
Given that transfering the 256 bytes over i2c at wire speed takes
20.5ms alone, the reduction in additional overhead is rather nice.
v2: Chris Wilson wondered whether GMBUS4 might contain some set bits
when booting up an hence result in some spurious interrupts. Since we
clear GMBUS4 after every wait and we do gmbus transfer really early in
the setup sequence to detect displays the window is small, but still
be paranoid and clear it properly.
v3: Clarify the comment that gmbus irq generation can only support one
kind of event, why it bothers us and how we work around that limit.
Cc: Daniel Kurtz <djkurtz@chromium.org>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Only enables the interrupt and puts a irq handler into place, doesn't
do anything yet.
Unfortunately there's no gmbus interrupt support for gen2/3 (safe for
pnv, but there the irq is marked as "Test mode").
v2: Wire up the irq handler for vlv and gen4 properly.
v3: i915_enable_pipestat expects the mask bit, not the status bits ... and
for added hilarity those are rather inconsistently named.
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The gmbus interrupt generation is rather fiddly: We can only ever
enable one interrupt source (but we always want to check for NAK
in addition to the real bit). And the bits in the gmbus status
register don't map at all to the bis in the irq register.
To prepare for this mess, start by extracting the hw status wait
loop into it's own function, consolidate the NAK error handling a
bit. To keep things flexible, pass in the status bit we care about
(in addition to any NAK signalling).
v2: I've failed to notice that the sense of GMBUS_ACTIVE is inverted,
Chris Wilson gladly pointed that out for me. To keep things simple,
ignore that case for now (we only need to idle the gmbus controller
at the end of an entire i2c transaction, not after every message).
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Otherwise the new&shiny irq-driven gmbus and dp aux code won't work that
well. Noticed since the dp aux code doesn't have an automatic fallback
with a timeout (since the hw provides for that already).
v2: Simple move drm_irq_install before intel_modeset_gem_init, as
suggested by Ben Widawsky.
v3: Now that interrupts are enabled before all connectors are fully
set up, we might fall over serving a HPD interrupt while things are
still being set up. Instead of jumping through massive hoops and
complicating the code with a separate hpd irq enable step, simply
block out the hotplug work item from doing anything until things are
in place.
v4: Actually, we can enable hotplug processing only after the fbdev is
fully set up, since we call down into the fbdev from the hotplug work
functions. So stick the hpd enabling right next to the poll helper
initialization.
v5: We need to enable irqs before intel_modeset_init, since that
function sets up the outputs.
v6: Fixup cleanup sequence, too.
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
... together with all the other irq related resources in
intel_irq_init. I've managed to oops in the notify_ring function on my
ilk, presumably because of the powerctx setup call to i915_gpu_idle.
Note that this is only a problem with the reorder irq setup sequence
for irq-driver gmbus/dp aux.
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This is for legacy legacy stuff, and checking with the leftover
pipe from the previous loop is propably not what we want. Since
pipe == 2 after the loop ... Then we only assing a variable and do
nothing with it.
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If there are pre-wrap values in semaphore-mbox registers after wrap,
syncing against some after-wrap request will complete immediately.
Fix this by emitting ring commands to set mbox registers to zero
when the wrap happens.
v2: Use __intel_ring_begin to emit ring commands, from
Chris Wilson.
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Add a small comment to handle_seqno_wrap.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In preparation for handling ring seqno wrapping, split
intel_ring_begin into helper part which doesn't allocate
seqno.
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When l3 parity support for Haswell was enabled in
commit f27b92651d
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date: Tue Jul 24 20:47:32 2012 -0700
drm/i915: Expand DPF support to Haswell
no one noticed that the patch which introduced this macro
commit e1ef7cc299
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date: Tue Jul 24 20:47:31 2012 -0700
drm/i915: Macro to determine DPF support
missed one spot. Fix this.
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57441
Reviewed-by: Ben Widawsky <benjamin.widawsky@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
seqno's are u32 so print accordingly
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This is needed for testing seqno wrapping. Be careful to not bump next_seqno
more than 0x7FFFFFFF at a time (between some handled requests) as
i915_seqno_passed() can't handle bigger difference in between.
v2: Address review comments from Chris Wilson.
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Squash in fixup to properly remove the debugfs file on driver
unload again.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This patch adds code for composing AVI and AUI info frames
and send them every VSYNC.
This patch is important for hdmi certification.
v3:
- Moved enums, macros to exynos_hdmi.c.
- Corrected hex format.
- Added static to hdmi_reg_infoframe.
v2:
- Added few blank lines.
- Corrected comments format.
- Added comments for 2's Complement calculation for check sum.
v1:
- Remove un-necessary blank lines.
- Change the case of hex constants.
Signed-off-by: Rahul Sharma <rahul.sharma@samsung.com>
Signed-off-by: Fahad Kunnathadi <fahad.k@samsung.com>
Signed-off-by: Shirish S <s.shirish@samsung.com>
Acked-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
devm_clk_get is device managed and makes error handling and exit code
simpler.
Also fixes an error related to returning 'ret' without initialising
with error code.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
devm_* functions are device managed and make error handling and exit code
simpler.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
devm_clk_get is device managed and makes error handling and exit code
simpler.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Pointer was being dereferenced after freeing.
Fixes the following error:
drivers/gpu/drm/exynos/exynos_drm_g2d.c:323 g2d_userptr_put_dma_addr() error:
dereferencing freed memory 'g2d_userptr'
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
devm_clk_get is device managed and makes error handling and exit code
simpler.
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
The 'pages' structure in the exynos gem buffer has been
removed. So we get the fix.smem_start from the first sgl
of the scatter gather table.
Signed-off-by: Prathyush K <prathyush.k@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
This patch is to preserve the display mode header during the mode adjustment.
Display mode header is overwritten with the adjusted mode header which is
throwing the stack dump.
Signed-off-by: Rahul Sharma <rahul.sharma@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
drm_get_edid() returns a pointer to an EDID block. The caller
is responsible to free this pointer itself.
Here the pointer gets assigned to the local variable raw_edid.
Therefore it should be freed before the variable goes out of
scope.
Signed-off-by: Egbert Eich <eich@suse.de>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Changelog v2:
Removed redundant check for invalid sgl.
Added check for valid page_offset in the beginning of exynos_drm_gem_map_buf.
Changelog v1:
The 'pages' structure is not required since we can use the 'sgt'. Even for
CONTIG buffers, a SGT is created (which will have just one sgl). This SGT
can be used during mmap instead of 'pages'. The 'page_size' element of the
structure is also not used anywhere and is removed.
This patch also fixes a memory leak where the 'pages' structure was being
allocated during gem buffer allocation but not being freed during deallocate.
Signed-off-by: Prathyush K <prathyush.k@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Changelog v3:
Passing the actual buffer size instead of vm_size to dma_mmap_attrs.
Changelog v2:
Extracting the private data from fb_info structure to obtain the exynos
gem buffer structure. Now, dma address is obtained from the exynos gem
buffer structure and not from smem_start. Also calling dma_mmap_attrs
(instead of dma_mmap_writecombine) with the same attributes used
during allocation.
Changelog v1:
This patch adds a exynos drm specific implementation of fb_mmap
which supports mapping a non-contiguous buffer to user space.
This new function does not assume that the frame buffer is contiguous
and calls dma_mmap_writecombine for mapping the buffer to user space.
dma_mmap_writecombine will be able to map a contiguous buffer as well
as non-contig buffer depending on whether an IOMMU mapping is created
for drm or not.
Signed-off-by: Prathyush K <prathyush.k@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Changelog v2:
fix a little bit performance issue to previous patch.
- When drm framebuffer is destroyed, make sure that overlay
data are updated to real hardwrae for all encoders
instead of waiting for vblank every page flip request.
For this, it adds a new function,
exynos_drm_encoder_complete_scanout function.
Changelog v1:
This patch removes wait_for_vblank call from
exynos_drm_encoder_plane_disable function and move it to
exynos_drm_encoder_plane_commit function.
Disabling dma channel to each plane doens't need vblank
signal to update data to real hardware. But updating
overlay data to real hardware does need vblank signal.
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
In a couple of places we attempt to adjust the existing watermark
registers to update them for the new cursor watermarks. This goes
horribly wrong as instead of clearing the cursor bits prior to or'ing in
the new values, we clear the rest of the register with the result that
the watermark registers contain bogus values.
References: https://bugs.freedesktop.org/show_bug.cgi?id=47034
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The BLC_PWM_CTL2 register does not exist before gen4. While at it, do a
slight drive by cleanup of the code.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Changelog v3:
use drm_file's file object instead of gem object's
- gem object's file represents the shmem storage so
process-unique file object should be used instead.
Changelog v2:
call mutex_lock before drm_vm_open_locked is called.
Changelog v1:
This patch makes it takes a reference to gem object when
specific gem mmap is requested. For this, it sets
dev->driver->gem_vm_ops to vma->vm_ops.
And this patch is based on exynos-drm-next-iommu branch of
git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
This patch adds userptr feautre for G2D module.
The userptr means user space address allocated by malloc().
And the purpose of this feature is to make G2D's dma able
to access the user space region.
To user this feature, user should flag G2D_BUF_USRPTR to
offset variable of struct drm_exynos_g2d_cmd and fill
struct drm_exynos_g2d_userptr with user space address
and size for it and then should set a pointer to
drm_exynos_g2d_userptr object to data variable of struct
drm_exynos_g2d_cmd. The last bit of offset variable is used
to check if the cmdlist's buffer type is userptr or not.
If userptr, the g2d driver gets user space address and size
and then gets pages through get_user_pages().
(another case is counted as gem handle)
Below is sample codes:
static void set_cmd(struct drm_exynos_g2d_cmd *cmd,
unsigned long offset, unsigned long data)
{
cmd->offset = offset;
cmd->data = data;
}
static int solid_fill_test(int x, int y, unsigned long userptr)
{
struct drm_exynos_g2d_cmd cmd_gem[5];
struct drm_exynos_g2d_userptr g2d_userptr;
unsigned int gem_nr = 0;
...
g2d_userptr.userptr = userptr;
g2d_userptr.size = x * y * 4;
set_cmd(&cmd_gem[gem_nr++], DST_BASE_ADDR_REG |
G2D_BUF_USERPTR,
(unsigned long)&g2d_userptr);
...
}
int main(int argc, char **argv)
{
unsigned long addr;
...
addr = malloc(x * y * 4);
...
solid_fill_test(x, y, addr);
...
}
And next, the pages are mapped with iommu table and the device
address is set to cmdlist so that G2D's dma can access it.
As you may know, the pages from get_user_pages() are pinned.
In other words, they CAN NOT be migrated and also swapped out.
So the dma access would be safe.
But the use of userptr feature has performance overhead so
this patch also has memory pool to the userptr feature.
Please, assume that user sends cmdlist filled with userptr
and size every time to g2d driver, and the get_user_pages
funcion will be called every time.
The memory pool has maximum 64MB size and the userptr that
user had ever sent, is holded in the memory pool.
This meaning is that if the userptr from user is same as one
in the memory pool, device address to the userptr in the memory
pool is set to cmdlist.
And last, the pages from get_user_pages() will be freed once
user calls free() and the dma access is completed. Actually,
get_user_pages() takes 2 reference counts if the user process
has never accessed user region allocated by malloc(). Then, if
the user calls free(), the page reference count becomes 1 and
becomes 0 with put_page() call. And the reverse holds as well.
This means how the pages backed are used by dma and freed.
This patch is based on "drm/exynos: add iommu support for g2d",
https://patchwork.kernel.org/patch/1629481/
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
The function dma_get_sgtable will allocate a sg table internally so
it is not necessary to allocate a sg table before it. The unnecessary
'sg_alloc_table' call is removed.
Signed-off-by: Prathyush K <prathyush.k@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
This patch fixes the problem of mapping contigous and non contigous dma buffers.
Currently page struct is calculated from the buf->dma_addr which is not the
physical address. It is replaced by buf->pages which points to the page struct
of the first page of contigous memory chunk. This gives the correct page frame
number for mapping.
Non-contigous dma buffers are described using SG table and SG lists. Each
valid SG List is pointing to a single page or group of pages which are
physically contigous. Current implementation just maps the first page of each
SG List and leave the other pages unmapped, leading to a crash. Given solution
finds the page struct for the faulting page through parsing SG table and map it.
Signed-off-by: Rahul Sharma <rahul.sharma@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
With iommu support, non-continuous buffer also is supported so
this patch removes these checking from exynos_drm_gem_get/put_dma_addr
funciton.
This patch is based on the below patch set, "drm/exynos: add
iommu support for -next".
http://www.spinics.net/lists/dri-devel/msg29041.html
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Chagelog v2:
removed unnecessary structure, struct g2d_gem_node.
Chagelog v1:
This patch adds iommu support for g2d driver. For this, it
adds subdrv_probe/remove callback to enable or disable
g2d iommu. And with this patch, in case of using g2d iommu,
we can get or put device address to a gem handle from user
through exynos_drm_gem_get/put_dma_addr(). Actually, these
functions take a reference to a gem handle so that the gem
object used by g2d dma is released properly.
And runqueue_node has a pointer to drm_file object of current
process to manage gem handles to owner.
This patch is based on the below patch set, "drm/exynos: add
iommu support for -next".
http://www.spinics.net/lists/dri-devel/msg29041.html
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Changelog v2:
move iommu support feature to mixer side.
And below is Prathyush's comment.
According to the new IOMMU framework for exynos sysmmus,
the owner of the sysmmu-tv is mixer (which is the actual
device that does DMA) and not hdmi.
The mmu-master in sysmmu-tv node is set as below in exynos5250.dtsi
sysmmu-tv {
-
mmu-master = <&mixer>;
};
Changelog v1:
The iommu will be enabled when hdmi sub driver is probed and
will be disabled when removed.
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
- __iomem where there is none (I love how we mix these things up).
- Use gfp_t instead of an other plain type.
- Unconfuse one place about enum pipe vs enum transcoder - for the pch
transcoder we actually use the pipe enum. Fixup the other cases
where we assign the pipe to the cpu transcoder with explicit casts.
- Declare the mch_lock properly in a header.
There is still a decent mess in intel_bios.c about __iomem, but heck,
this is x86 and we're allowed to do that.
Makes-sparse-happy: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Use a space after the cast consistently and fix up the
newly-added cast in i915_irq.c to properly use __iomem.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>