The DMA engine supports memory copy, RAID5 XOR, RAID6 PQ, and other
computations. But the bandwidth of the entire DMA engine is shared
among all channels. This patch re-configures operations availability
such that one can achieve maximum performance for XOR and PQ
computation by removing the memory offload operations.
Signed-off-by: Rameshwar Prasad Sahu <rsahu@apm.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Introduce __vmx_flush_tlb() to handle specific vpid.
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Adjust allocate/free_vid so that they can be reused for the nested vpid.
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
async_pf_execute() seems to be missing a memory barrier which might
cause the waker to not notice the waiter and miss sending a wake_up as
in the following figure.
async_pf_execute kvm_vcpu_block
------------------------------------------------------------------------
spin_lock(&vcpu->async_pf.lock);
if (waitqueue_active(&vcpu->wq))
/* The CPU might reorder the test for
the waitqueue up here, before
prior writes complete */
prepare_to_wait(&vcpu->wq, &wait,
TASK_INTERRUPTIBLE);
/*if (kvm_vcpu_check_block(vcpu) < 0) */
/*if (kvm_arch_vcpu_runnable(vcpu)) { */
...
return (vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE &&
!vcpu->arch.apf.halted)
|| !list_empty_careful(&vcpu->async_pf.done)
...
return 0;
list_add_tail(&apf->link,
&vcpu->async_pf.done);
spin_unlock(&vcpu->async_pf.lock);
waited = true;
schedule();
------------------------------------------------------------------------
The attached patch adds the missing memory barrier.
I found this issue when I was looking through the linux source code
for places calling waitqueue_active() before wake_up*(), but without
preceding memory barriers, after sending a patch to fix a similar
issue in drivers/tty/n_tty.c (Details about the original issue can be
found here: https://lkml.org/lkml/2015/9/28/849).
Signed-off-by: Kosuke Tatsukawa <tatsu@ab.jp.nec.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
On real hardware, edge-triggered interrupts don't set a bit in TMR,
which means that IOAPIC isn't notified on EOI. Do the same here.
Staying in guest/kernel mode after edge EOI is what we want for most
devices. If some bugs could be nicely worked around with edge EOI
notifications, we should invest in a better interface.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM uses eoi_exit_bitmap to track vectors that need an action on EOI.
The problem is that IOAPIC can be reconfigured while an interrupt with
old configuration is pending and eoi_exit_bitmap only remembers the
newest configuration; thus EOI from the pending interrupt is not
recognized.
(Reconfiguration is not a problem for level interrupts, because IOAPIC
sends interrupt with the new configuration.)
For an edge interrupt with ACK notifiers, like i8254 timer; things can
happen in this order
1) IOAPIC inject a vector from i8254
2) guest reconfigures that vector's VCPU and therefore eoi_exit_bitmap
on original VCPU gets cleared
3) guest's handler for the vector does EOI
4) KVM's EOI handler doesn't pass that vector to IOAPIC because it is
not in that VCPU's eoi_exit_bitmap
5) i8254 stops working
A simple solution is to set the IOAPIC vector in eoi_exit_bitmap if the
vector is in PIR/IRR/ISR.
This creates an unwanted situation if the vector is reused by a
non-IOAPIC source, but I think it is so rare that we don't want to make
the solution more sophisticated. The simple solution also doesn't work
if we are reconfiguring the vector. (Shouldn't happen in the wild and
I'd rather fix users of ACK notifiers instead of working around that.)
The are no races because ioapic injection and reconfig are locked.
Fixes: b053b2aef2 ("KVM: x86: Add EOI exit bitmap inference")
[Before b053b2aef2, this bug happened only with APICv.]
Fixes: c7c9c56ca2 ("x86, apicv: add virtual interrupt delivery support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
After moving PIR to IRR, the interrupt needs to be delivered manually.
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In order to get into 64-bit protected mode, you need to enable
paging while EFER.LMA=1. For this to work, CS.L must be 0.
Currently, we load the segments before CR0 and CR4, which means
that if RSM returns into 64-bit protected mode CS.L is already 1
and everything breaks.
Luckily, CS.L=0 is always the case when executing RSM, because it
is forbidden to execute RSM from 64-bit protected mode. Hence it
is enough to load CR0 and CR4 first, and only then the segments.
Fixes: 660a5d517a
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If the eDMA3 has support for channel paRAM slot mapping we can utilize it
to allocate slots on demand and save precious slots for real transfers.
On am335x the eDMA has 64 channels which means we can unlock 64 paRAM
slots out from the available 256.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The names chosen for the bitfields were quite confusing and given no real
information on what they are used for...
edma_inuse -> slot_inuse: tracks the slot usage/availability
edma_unused -> channel_unused: tracks the channel usage/availability
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Instead of directly reading it from CCCFG register take the information out
once when we set up the configuration from the HW.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
No need to run through the bits in QEMR and CCERR events since they will
not trigger any action, so just clearing the errors there is fine.
In case of the missed event the loop can be optimized so we spend less time
to handle the event.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
In the ccerr interrupt handler the code checks for pending errors in the
error status registers in two different places.
Move the check out to a helper function.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
With the merger of the arch/arm/common/edma.c code into the dmaengine
driver, there is no longer need to have per channel callback/data storage
for interrupt events.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Remove or rewrite the comments for the internal functions.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Warning message in case of linking between paRAM slots in different eDMA
controllers.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
edma_write_slot() is for writing an entire paRAM slot.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
We have access to dev, so it is better to use the dev_dbg for debug prints.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Be consistent and do not mix the use of dev, &pdev->dev, etc in the
functions.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
When allocating a memory for number of items it is better (looks better)
to use devm_kcalloc.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Instead of using defines to specify the size of different arrays and
bitmaps, allocate the memory for them based on the information we get from
the HW itself.
Since these defines are set based on the worst case, there are devices
where they are not valid.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Move the code out from arch/arm/common and merge it inside of the dmaengine
driver.
This change is done with as minimal (if eny) functional change to the code
as possible to avoid introducing regression.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The upcoming change to merge the arch/arm/common/edma.c into
drivers/dma/edma.c will need this change when booting daVinci devices in
no DT mode.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Convert the eDMA platform device creation to use
struct platform_device_info XXXXXX __initconst and
platform_device_register_full()
This will allow us to cleanly specify the dma_mask for the devices in an
upcoming patch.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Since the driver stack no longer depends on lookup with id number in a
global array of pointers, the limitation for the number of eDMAs are no
longer needed. We can handle as many eDMAs in legacy and DT boot as we have
memory for them to allocate the needed structures.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Instead of relying on indexes pointing to edma private date in the global
pointer array, pass the private data pointer via the public API.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Merge the iomem into the 'struct edma' and change the internal (static)
functions to use pointer to the edma_cc instead of the ctlr number.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
If the of_dma_controller is registered in the non dmaengine driver we could
have race condition:
the of_dma_controller has been registered, but the dmaengine driver is not
yet probed. Drivers requesting DMA channels during this window will fail
since we do not yet have dmaengine drivers registered.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Currently we have one device created to handle all (maximum 2) eDMAs in the
system.
With this change all eDMA instance will have it's own device/driver.
This change is needed for further cleanups in the eDMA driver stack since
the one device/driver to handle all eDMAs in the system was not flexible
enough and prevents the upcoming work.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The code path in edma_execute() and edma_callback() can be simplified
and make it more optimal.
There is not need to call in to edma_execute() when the transfer
has been finished for example.
Also the handling of missed/first or next batch of paRAMs can
be done in a more optimal way.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
We no longer have users for these functions so they can be removed.
Remove also unused enums from the header file.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
In case when the interrupt happened for the second eDMA the channel
number was incorrectly passed to the client driver.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
On sparc, we see unaligned access messages on each modprobe[-r]:
Kernel unaligned access at TPC[6ad9b4] pkcs7_verify [..]
Kernel unaligned access at TPC[6a5484] crypto_shash_finup [..]
Kernel unaligned access at TPC[6a5390] crypto_shash_update [..]
Kernel unaligned access at TPC[10150308] sha1_sparc64_update [..]
Kernel unaligned access at TPC[101501ac] __sha1_sparc64_update [..]
These ware triggered by mod_verify_sig() invocations of pkcs_verify(), and
are are being caused by an unaligned desc at (sha1, digest_size is 0x14)
desc = digest + digest_size;
To fix this, pkcs7_verify needs to make sure that desc is pointing
at an aligned value past the digest_size, and kzalloc appropriately,
taking alignment values into consideration.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Using the devm_xxx() managed function to stripdown the error
and remove code.
In the same time, we replace request_mem_region/ioremap by the unified
devm_ioremap_resource() function.
Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Using the devm_xxx() managed function to stripdown the error and remove
code.
Signed-off-by: LABBE Corentin <clabbe.montjoie@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The mxs-dcp driver relies on the stmp_reset_block() helper function, which
is provided by CONFIG_STMP_DEVICE. This symbol is always set on MXS,
but the driver can now also be built for MXC (i.MX6), which results
in a built error if no other driver selects STMP_DEVICE:
drivers/built-in.o: In function `mxs_dcp_probe':
vf610-ocotp.c:(.text+0x3df302): undefined reference to `stmp_reset_block'
This adds the 'select', like all other stmp drivers have it.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: a2712e6c75 ("crypto: mxs-dcp - Allow MXS_DCP to be used on MX6SL")
Acked-by: Marek Vasut <marex@denx.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
New bindings and driver have been created for STM32 series parts. This
patch integrates this changes.
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Acked-by: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add support for STMicroelectronics STM32 random number generator.
The config value defaults to N, reflecting the fact that STM32 is a
very low resource microcontroller platform and unlikely to be targeted
by any "grown up" defconfigs.
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This adds documentation of device tree bindings for the STM32 hardware
random number generator.
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Acked-by: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As all the import functions and export functions are virtually
identical, factor out their common parts into a generic
mv_cesa_ahash_import() and mv_cesa_ahash_export() respectively. This
performs the actual import or export, and we pass the data pointers and
length into these functions.
We have to switch a % const operation to do_div() in the common import
function to avoid provoking gcc to use the expensive 64-bit by 64-bit
modulus operation.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Attempting to use the sha1 digest for openssh via openssl reveals that
the result from the hash is wrong: this happens when we export the
state from one socket and import it into another via calling accept().
The reason for this is because the operation is reset to "initial block"
state, whereas we may be past the first fragment of data to be hashed.
Arrange for the operation code to avoid the initialisation of the state,
thereby preserving the imported state.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When a AF_ALG fd is accepted a second time (hence hash_accept() is
used), hash_accept_parent() allocates a new private context using
sock_kmalloc(). This context is uninitialised. After use of the new
fd, we eventually end up with the kernel complaining:
marvell-cesa f1090000.crypto: dma_pool_free cesa_padding, c0627770/0 (bad dma)
where c0627770 is a random address. Poisoning the memory allocated by
the above sock_kmalloc() produces kernel oopses within the marvell hash
code, particularly the interrupt handling.
The following simplfied call sequence occurs:
hash_accept()
crypto_ahash_export()
marvell hash export function
af_alg_accept()
hash_accept_parent() <== allocates uninitialised struct hash_ctx
crypto_ahash_import()
marvell hash import function
hash_ctx contains the struct mv_cesa_ahash_req in its req.__ctx member,
and, as the marvell hash import function only partially initialises
this structure, we end up with a lot of members which are left with
whatever data was in memory prior to sock_kmalloc().
Add zero-initialisation of this structure.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Boris Brezillon <boris.brezillon@free-electronc.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Several of the algorithms in marvell/hash.c have a statesize of zero.
When an AF_ALG accept() on an already-accepted file descriptor to
calls into hash_accept(), this causes:
char state[crypto_ahash_statesize(crypto_ahash_reqtfm(req))];
to be zero-sized, but we still pass this to:
err = crypto_ahash_export(req, state);
which proceeds to write to 'state' as if it was a "struct md5_state",
"struct sha1_state" etc. Add the necessary initialisers for the
.statesize member.
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
A new crypto driver for Marvell ARM platforms was added in
drivers/crypto/marvell/ as part of commit f63601fd61 ("crypto:
marvell/cesa - add a new driver for Marvell's CESA"). This commit adds
the relevant developers to the list of maintainers.
Cc: Boris Brezillon <boris.brezillon@free-electrons.com>
Cc: Arnaud Ebalard <arno@natisbad.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Acked-by: Arnaud Ebalard <arno@natisbad.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds CRC generation and validation support for nx-842.
Add CRC flag so that nx842 coprocessor includes CRC during compression
and validates during decompression.
Also changes in 842 SW compression to append CRC value at the end
of template and checks during decompression.
Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The clocksource does not provide clocksource_register() function since
f893598 commit (clocksource: Mostly kill clocksource_register()), so
let's remove unnecessary information about this function from a comment.
Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com>
Suggested-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Setkey function has been split into set_priv_key and set_pub_key.
Akcipher requests takes sgl for src and dst instead of void *.
Users of the API i.e. two existing RSA implementation and
test mgr code have been updated accordingly.
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add mpi_read_raw_from_sgl and mpi_write_to_sgl helpers.
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>