Commit graph

561537 commits

Author SHA1 Message Date
Mark Rutland
fbb4574ce9 arm64: kvm: report original PAR_EL1 upon panic
If we call __kvm_hyp_panic while a guest context is active, we call
__restore_sysregs before acquiring the system register values for the
panic, in the process throwing away the PAR_EL1 value at the point of
the panic.

This patch modifies __kvm_hyp_panic to stash the PAR_EL1 value prior to
restoring host register values, enabling us to report the original
values at the point of the panic.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-11-24 18:20:58 +01:00
Mark Rutland
1d7a4e313a arm64: kvm: avoid %p in __kvm_hyp_panic
Currently __kvm_hyp_panic uses %p for values which are not pointers,
such as the ESR value. This can confusingly lead to "(null)" being
printed for the value.

Use %x instead, and only use %p for host pointers.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-11-24 18:18:13 +01:00
Christoffer Dall
9f958c11b7 KVM: arm/arm64: vgic: Trust the LR state for HW IRQs
We were probing the physial distributor state for the active state of a
HW virtual IRQ, because we had seen evidence that the LR state was not
cleared when the guest deactivated a virtual interrupted.

However, this issue turned out to be a software bug in the GIC, which
was solved by: 84aab5e68c2a5e1e18d81ae8308c3ce25d501b29
(KVM: arm/arm64: arch_timer: Preserve physical dist. active
state on LR.active, 2015-11-24)

Therefore, get rid of the complexities and just look at the LR.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-11-24 18:08:37 +01:00
Christoffer Dall
0e3dfda91d KVM: arm/arm64: arch_timer: Preserve physical dist. active state on LR.active
We were incorrectly removing the active state from the physical
distributor on the timer interrupt when the timer output level was
deasserted.  We shouldn't be doing this without considering the virtual
interrupt's active state, because the architecture requires that when an
LR has the HW bit set and the pending or active bits set, then the
physical interrupt must also have the corresponding bits set.

This addresses an issue where we have been observing an inconsistency
between the LR state and the physical distributor state where the LR
state was active and the physical distributor was not active, which
shouldn't happen.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-11-24 18:07:40 +01:00
Christoffer Dall
7e16aa81f9 KVM: arm/arm64: Fix preemptible timer active state crazyness
We were setting the physical active state on the GIC distributor in a
preemptible section, which could cause us to set the active state on
different physical CPU from the one we were actually going to run on,
hacoc ensues.

Since we are no longer descheduling/scheduling soft timers in the
flush/sync timer functions, simply moving the timer flush into a
non-preemptible section.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-11-24 18:04:00 +01:00
Marc Zyngier
498cd5c32b arm64: KVM: Add workaround for Cortex-A57 erratum 834220
Cortex-A57 parts up to r1p2 can misreport Stage 2 translation faults
when a Stage 1 permission fault or device alignment fault should
have been reported.

This patch implements the workaround (which is to validate that the
Stage-1 translation actually succeeds) by using code patching.

Cc: stable@vger.kernel.org
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-11-24 17:58:14 +01:00
Marc Zyngier
c0f0963464 arm64: KVM: Fix AArch32 to AArch64 register mapping
When running a 32bit guest under a 64bit hypervisor, the ARMv8
architecture defines a mapping of the 32bit registers in the 64bit
space. This includes banked registers that are being demultiplexed
over the 64bit ones.

On exceptions caused by an operation involving a 32bit register, the
HW exposes the register number in the ESR_EL2 register. It was so
far understood that SW had to distinguish between AArch32 and AArch64
accesses (based on the current AArch32 mode and register number).

It turns out that I misinterpreted the ARM ARM, and the clue is in
D1.20.1: "For some exceptions, the exception syndrome given in the
ESR_ELx identifies one or more register numbers from the issued
instruction that generated the exception. Where the exception is
taken from an Exception level using AArch32 these register numbers
give the AArch64 view of the register."

Which means that the HW is already giving us the translated version,
and that we shouldn't try to interpret it at all (for example, doing
an MMIO operation from the IRQ mode using the LR register leads to
very unexpected behaviours).

The fix is thus not to perform a call to vcpu_reg32() at all from
vcpu_reg(), and use whatever register number is supplied directly.
The only case we need to find out about the mapping is when we
actively generate a register access, which only occurs when injecting
a fault in a guest.

Cc: stable@vger.kernel.org
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-11-24 17:58:08 +01:00
Ard Biesheuvel
e6fab54423 ARM/arm64: KVM: test properly for a PTE's uncachedness
The open coded tests for checking whether a PTE maps a page as
uncached use a flawed '(pte_val(xxx) & CONST) != CONST' pattern,
which is not guaranteed to work since the type of a mapping is
not a set of mutually exclusive bits

For HYP mappings, the type is an index into the MAIR table (i.e, the
index itself does not contain any information whatsoever about the
type of the mapping), and for stage-2 mappings it is a bit field where
normal memory and device types are defined as follows:

    #define MT_S2_NORMAL            0xf
    #define MT_S2_DEVICE_nGnRE      0x1

I.e., masking *and* comparing with the latter matches on the former,
and we have been getting lucky merely because the S2 device mappings
also have the PTE_UXN bit set, or we would misidentify memory mappings
as device mappings.

Since the unmap_range() code path (which contains one instance of the
flawed test) is used both for HYP mappings and stage-2 mappings, and
considering the difference between the two, it is non-trivial to fix
this by rewriting the tests in place, as it would involve passing
down the type of mapping through all the functions.

However, since HYP mappings and stage-2 mappings both deal with host
physical addresses, we can simply check whether the mapping is backed
by memory that is managed by the host kernel, and only perform the
D-cache maintenance if this is the case.

Cc: stable@vger.kernel.org
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2015-11-24 17:58:00 +01:00
Ying Xue
7098356bac tipc: fix error handling of expanding buffer headroom
Coverity says:

*** CID 1338065:  Error handling issues  (CHECKED_RETURN)
/net/tipc/udp_media.c: 162 in tipc_udp_send_msg()
156     	struct udp_media_addr *dst = (struct udp_media_addr *)&dest->value;
157     	struct udp_media_addr *src = (struct udp_media_addr *)&b->addr.value;
158     	struct sk_buff *clone;
159     	struct rtable *rt;
160
161     	if (skb_headroom(skb) < UDP_MIN_HEADROOM)
>>>     CID 1338065:  Error handling issues  (CHECKED_RETURN)
>>>     Calling "pskb_expand_head" without checking return value (as is done elsewhere 51 out of 56 times).
162     		pskb_expand_head(skb, UDP_MIN_HEADROOM, 0, GFP_ATOMIC);
163
164     	clone = skb_clone(skb, GFP_ATOMIC);
165     	skb_set_inner_protocol(clone, htons(ETH_P_TIPC));
166     	ub = rcu_dereference_rtnl(b->media_ptr);
167     	if (!ub) {

When expanding buffer headroom over udp tunnel with pskb_expand_head(),
it's unfortunate that we don't check its return value. As a result, if
the function returns an error code due to the lack of memory, it may
cause unpredictable consequence as we unconditionally consider that
it's always successful.

Fixes: e53567948f ("tipc: conditionally expand buffer headroom over udp tunnel")
Reported-by: <scan-admin@coverity.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-24 11:26:19 -05:00
Steven Rostedt (Red Hat)
bd1b7cd360 ring-buffer: Put back the length if crossed page with add_timestamp
Commit fcc742eaad "ring-buffer: Add event descriptor to simplify passing
data" added a descriptor that holds various data instead of passing around
several variables through parameters. The problem was that one of the
parameters was modified in a function and the code was designed not to have
an effect on that modified  parameter. Now that the parameter is a
descriptor and any modifications to it are non-volatile, the size of the
data could be unnecessarily expanded.

Remove the extra space added if a timestamp was added and the event went
across the page.

Cc: stable@vger.kernel.org # 4.3+
Fixes: fcc742eaad "ring-buffer: Add event descriptor to simplify passing data"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-11-24 09:27:25 -05:00
Steven Rostedt (Red Hat)
b81f472a20 ring-buffer: Update read stamp with first real commit on page
Do not update the read stamp after swapping out the reader page from the
write buffer. If the reader page is swapped out of the buffer before an
event is written to it, then the read_stamp may get an out of date
timestamp, as the page timestamp is updated on the first commit to that
page.

rb_get_reader_page() only returns a page if it has an event on it, otherwise
it will return NULL. At that point, check if the page being returned has
events and has not been read yet. Then at that point update the read_stamp
to match the time stamp of the reader page.

Cc: stable@vger.kernel.org # 2.6.30+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-11-24 09:23:17 -05:00
Markus Elfring
3f3a7280d4 GPU-DRM-IMX: Delete an unnecessary check before drm_fbdev_cma_restore_mode()
The drm_fbdev_cma_restore_mode() function tests whether its argument
is NULL and then returns immediately.
Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2015-11-24 11:30:20 +01:00
Philipp Zabel
407c9eba78 drm/imx: Remove of_node assignment from ipuv3-crtc driver probe
The crtc child device driver shouldn't modify the of_node of its platform
device in the probe function. Instead, since the previous patch, the IPU
core driver sets the of_node when the platform device is created.

Drop the now unused custom imx_drm_get_port_by_id function.

Suggested-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-11-24 11:30:18 +01:00
Philipp Zabel
304e6be652 gpu: ipu-v3: Assign of_node of child platform devices to corresponding ports
The crtc child device driver shouldn't have to modify the of_node of its
platform device in the probe function. Instead, let the IPU core driver
set the of_node when the platform device is created.

Also reorder the client_reg array so the elements are in port id order
(CSIs first, then DIs).

Suggested-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-11-24 11:30:17 +01:00
Philipp Zabel
99ae78c373 gpu: ipu-v3: Remove reg_offset field
This is not used, so remove it.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2015-11-24 11:30:16 +01:00
Philipp Zabel
c3ede03c88 gpu: ipu-v3: drop unused dmfc field from client platform data
This field is never used, drop it.

Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2015-11-24 11:30:15 +01:00
Philipp Zabel
9832e81102 drm/imx: parallel-display: allow to determine bus format from the connected panel
Similarly to commit 5e501ed725 ("drm/imx: imx-ldb: allow to determine
bus format from the connected panel"), if a panel is connected to the ldb
output port via the of_graph bindings, the data mapping is determined from
the display_info.bus_format field provided by the panel instead of from the
optional interface_pix_fmt device tree property.

Reported-by: Ulrich Ölmann <u.oelmann@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Tested-by: Gary Bisson <gary.bisson@boundarydevices.com>
2015-11-24 11:29:21 +01:00
Cory Tusar
897ed0ca59 ARM: dts: vfxxx: Fix dspi[01] spi-num-chipselects.
Per the Vybrid Reference Manual (section 3.8.6.1), dspi0 has 6 chip
select signals associated with it, while dspi1 has only 4.

Signed-off-by: Cory Tusar <cory.tusar@pid1solutions.com>
Acked-by: Stefan Agner <stefan@agner.ch>
Cc: <stable@vger.kernel.org>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
2015-11-24 17:38:42 +08:00
Andy Lutomirski
f10750536f x86/entry/64: Fix irqflag tracing wrt context tracking
Paolo pointed out that enter_from_user_mode could be called
while irqflags were traced as though IRQs were on.

In principle, this could confuse lockdep.  It doesn't cause any
problems that I've seen in any configuration, but if I build
with CONFIG_DEBUG_LOCKDEP=y, enable a nohz_full CPU, and add
code like:

	if (irqs_disabled()) {
		spin_lock(&something);
		spin_unlock(&something);
	}

to the top of enter_from_user_mode, then lockdep will complain
without this fix.  It seems that lockdep's irqflags sanity
checks are too weak to detect this bug without forcing the
issue.

This patch adds one byte to normal kernels, and it's IMO a bit
ugly. I haven't spotted a better way to do this yet, though.
The issue is that we can't do TRACE_IRQS_OFF until after SWAPGS
(if needed), but we're also supposed to do it before calling C
code.

An alternative approach would be to call trace_hardirqs_off in
enter_from_user_mode.  That would be less code and would not
bloat normal kernels at all, but it would be harder to see how
the code worked.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/86237e362390dfa6fec12de4d75a238acb0ae787.1447361906.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-11-24 09:55:02 +01:00
Hui Wang
8c69729b44 ALSA: hda - Fix headphone noise after Dell XPS 13 resume back from S3
We have a machine Dell XPS 13 with the codec alc256, after resume back
from S3, the headphone has noise when play sound.

Through comparing with the coeff vaule before and after S3, we found
restoring a coeff register will help remove noise.

BugLink: https://bugs.launchpad.net/bugs/1519168
Cc: Kailang Yang <kailang@realtek.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2015-11-24 07:33:43 +01:00
Ying Xue
f4195d1eac tipc: avoid packets leaking on socket receive queue
Even if we drain receive queue thoroughly in tipc_release() after tipc
socket is removed from rhashtable, it is possible that some packets
are in flight because some CPU runs receiver and did rhashtable lookup
before we removed socket. They will achieve receive queue, but nobody
delete them at all. To avoid this leak, we register a private socket
destructor to purge receive queue, meaning releasing packets pending
on receive queue will be delayed until the last reference of tipc
socket will be released.

Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-23 23:45:15 -05:00
Aaro Koskinen
3c25a860d1 broadcom: fix PHY_ID_BCM5481 entry in the id table
Commit fcb26ec5b1 ("broadcom: move all PHY_ID's to header")
updated broadcom_tbl to use PHY_IDs, but incorrectly replaced 0x0143bca0
with PHY_ID_BCM5482 (making a duplicate entry, and completely omitting
the original). Fix that.

Fixes: fcb26ec5b1 ("broadcom: move all PHY_ID's to header")
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-23 23:29:27 -05:00
Ming Lei
12e57f59ca blk-merge: warn if figured out segment number is bigger than nr_phys_segments
We had seen lots of reports of this kind issue, so add one
warnning in blk-merge, then it can be triggered easily and
avoid to depend on warning/bug from drivers.

Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-23 20:16:55 -07:00
Ming Lei
02e707424c blk-merge: fix blk_bio_segment_split
Commit bdced438acd83a(block: setup bi_phys_segments after
splitting) introduces function of computing bio->bi_phys_segments
during bio splitting.

Unfortunately both bio->bi_seg_front_size and bio->bi_seg_back_size
arn't computed, so too many physical segments may be obtained
for one request since both the two are used to check if one segment
across two bios can be possible.

This patch fixes the issue by computing the two variables in
blk_bio_segment_split().

Fixes: bdced438acd83a(block: setup bi_phys_segments after splitting)
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Reported-by: Mark Salter <msalter@redhat.com>
Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Tested-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-23 20:16:53 -07:00
Ming Lei
578270bfbd block: fix segment split
Inside blk_bio_segment_split(), previous bvec pointer(bvprvp)
always points to the iterator local variable, which is obviously
wrong, so fix it by pointing to the local variable of 'bvprv'.

Fixes: 5014c311baa2b(block: fix bogus compiler warnings in blk-merge.c)
Cc: stable@kernel.org #4.3
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Reported-by: Mark Salter <msalter@redhat.com>
Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Tested-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-23 20:16:51 -07:00
Benjamin Coddington
38b7631fbe nfs4: limit callback decoding to received bytes
A truncated cb_compound request will cause the client to decode null or
data from a previous callback for nfs4.1 backchannel case, or uninitialized
data for the nfs4.0 case. This is because the path through
svc_process_common() advances the request's iov_base and decrements iov_len
without adjusting the overall xdr_buf's len field.  That causes
xdr_init_decode() to set up the xdr_stream with an incorrect length in
nfs4_callback_compound().

Fixing this for the nfs4.1 backchannel case first requires setting the
correct iov_len and page_len based on the length of received data in the
same manner as the nfs4.0 case.

Then the request's xdr_buf length can be adjusted for both cases based upon
the remaining iov_len and page_len.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-23 22:03:15 -05:00
Benjamin Coddington
c68a027c05 nfs4: start callback_ident at idr 1
If clp->cl_cb_ident is zero, then nfs_cb_idr_remove_locked() skips removing
it when the nfs_client is freed.  A decoding or server bug can then find
and try to put that first nfs_client which would lead to a crash.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Fixes: d687031265 ("nfs4client: convert to idr_alloc()")
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-23 21:59:42 -05:00
Jeff Layton
91ab4b4d16 nfs: use sliding delay when LAYOUTGET gets NFS4ERR_DELAY
When LAYOUTGET gets NFS4ERR_DELAY, we currently will wait 15s before
retrying the call. That is a _very_ long time, so add a timeout value to
struct nfs4_layoutget and pass nfs4_async_handle_error a pointer to it.
This allows the RPC engine to use a sliding delay window, instead of a
15s delay.

Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-23 21:57:44 -05:00
Kinglong Mee
f54423a1f8 NFS4: Cleanup FATTR4_WORD0_FS_LOCATIONS after decoding success
Commit 1ca843a2d2 "nfs: Fix GETATTR bitmap verification" has check
the bitmap after decoding success, but decode_attr_fs_locations forgets
cleanup the FATTR4_WORD0_FS_LOCATIONS bits.

decode_getfattr_attrs always return -EIO when meeting FS_LOCATIONS now.

ls: cannot access /mnt/referal: Input/output error
ls: cannot access /mnt/replicas: Input/output error
total 32
drwxr-xr-x. 7 root root 8192 Nov 16 20:36 pnfs
??????????? ? ?    ?       ?            ? referal
??????????? ? ?    ?       ?            ? replicas

v2: clear the bit earlier

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-23 21:56:53 -05:00
Anna Schumaker
291e1b9459 NFS: Properly set NFS v4.2 NFSDBG_FACILITY
NFS v4.2 operations can work outside of pNFS, so dprintk() output
shouldn't be placed under NFSDBG_PNFS.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-23 21:53:59 -05:00
Christoph Hellwig
6b7153da2c nfs: reduce the amount of ifdefs for v4.2 in nfs4file.c
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-23 21:53:14 -05:00
Christoph Hellwig
0f42a6a9b8 nfs: use btrfs ioctl defintions for clone
The NFS CLONE_RANGE defintion was wrong and thus never worked.  Fix this
by simply using the btrfs ioctl defintion.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-23 21:53:08 -05:00
Christoph Hellwig
21fad313d5 nfs: allow intra-file CLONE
Originally CLONE didn't allow for intra-file clones, but we recently
updated the spec to support this feature which is also supported by
local Linux file systems.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-23 21:52:51 -05:00
Christoph Hellwig
3a2e176905 nfs: offer native ioctls even if CONFIG_COMPAT is set
Without this for example 64-bit binaries on typical amd64 distributions
would not be able to use ioctls on NFS.  For now this only affects clones.
Additionally ->compat_ioctl is defined even for non-compat builds, so
get rid of the pointless ifdef.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-23 21:52:28 -05:00
Christoph Hellwig
9494b2ce4b nfs: pass on count for CLONE operations
Currently we pass uninitialized stack garbage in the count parameter.
The value is usually large enought to clone whole files and thus let
simple tests pass, but it makes the tests for range clones very unhappy.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
2015-11-23 21:52:22 -05:00
Jan Kara
c2489e07c0 vfs: Avoid softlockups with sendfile(2)
The following test program from Dmitry can cause softlockups or RCU
stalls as it copies 1GB from tmpfs into eventfd and we don't have any
scheduling point at that path in sendfile(2) implementation:

        int r1 = eventfd(0, 0);
        int r2 = memfd_create("", 0);
        unsigned long n = 1<<30;
        fallocate(r2, 0, 0, n);
        sendfile(r1, r2, 0, n);

Add cond_resched() into __splice_from_pipe() to fix the problem.

CC: Dmitry Vyukov <dvyukov@google.com>
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-23 21:15:30 -05:00
Jan Kara
c725bfce79 vfs: Make sendfile(2) killable even better
Commit 296291cdd1 (mm: make sendfile(2) killable) fixed an issue where
sendfile(2) was doing a lot of tiny writes into a filesystem and thus
was unkillable for a long time. However sendfile(2) can be (mis)used to
issue lots of writes into arbitrary file descriptor such as evenfd or
similar special file descriptors which never hit the standard filesystem
write path and thus are still unkillable. E.g. the following example
from Dmitry burns CPU for ~16s on my test system without possibility to
be killed:

        int r1 = eventfd(0, 0);
        int r2 = memfd_create("", 0);
        unsigned long n = 1<<30;
        fallocate(r2, 0, 0, n);
        sendfile(r1, r2, 0, n);

There are actually quite a few tests for pending signals in sendfile
code however we data to write is always available none of them seems to
trigger. So fix the problem by adding a test for pending signal into
splice_from_pipe_next() also before the loop waiting for pipe buffers to
be available. This should fix all the lockup issues with sendfile of the
do-ton-of-tiny-writes nature.

CC: stable@vger.kernel.org
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-23 21:15:30 -05:00
Al Viro
0ebf7f10d6 fix sysvfs symlinks
The thing got broken back in 2002 - sysvfs does *not* have inline
symlinks; even short ones have bodies stored in the first block
of file.  sysv_symlink() handles that correctly; unfortunately,
attempting to look an existing symlink up will end up confusing
them for inline symlinks, and interpret the block number containing
the body as the body itself.

Nobody has noticed until now, which says something about the level
of testing sysvfs gets ;-/

Cc: stable@vger.kernel.org # all of them, not that anyone cared
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-23 21:11:08 -05:00
Tim Harvey
a2291badc3 imx: thermal: use CPU temperature grade info for thresholds
The IMX6Q/IMX6DL SoC's have a 2-bit temperature grade stored in OTP which
is valid for all IMX6 SoC's (despite the fact that the IMXSDLRM and
IMXSXRM do not document this - this has been proven via tests as well as
verified by Freescale FAE).

Instead of assuming a fixed 85C for passive cooling threshold and 105C for
critical use the thermal grade for these configurations.

We will set the critical to maxT - 5C and passive to maxT - 10C.

Cc: Anson Huang <b20788@freescale.com>
Cc: Fabio Estevam <fabio.estevam@freescale.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Acked-by: Jon Nettleton <jon@solid-run.com>
Signed-off-by: Tim Harvey <tharvey@gateworks.com>
----
v3:
 - rebase against linux-soc-thermal.git
 - added ack's from Shawn and Jon
v2:
 - remove check for IMX6Q and update comments: The OTP values have been tested
   on IMX6SOLO, IMX6DUALLITE, and IMX6SX and Freescale FAE has shared data with
   me that the OTP settings are the same and that the reference manuals will
   reflect this in their next updates.
 - set critical to max - 5C
 - set passive to max - 10C
 - display max temp in info
 - do not allow passive to be set above critical
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-11-23 16:38:40 -08:00
Arnd Bergmann
c86b3de8c8 thermal: fix thermal_zone_bind_cooling_device prototype
When the prototype for thermal_zone_bind_cooling_device
changed, the static inline wrapper function was left alone,
which in theory can cause build warnings:

I have seen this error in the past:
drivers/thermal/db8500_thermal.c: In function 'db8500_cdev_bind':
drivers/thermal/db8500_thermal.c:78:9: error: too many arguments to function 'thermal_zone_bind_cooling_device'
   ret = thermal_zone_bind_cooling_device(thermal, i, cdev,

while this one no longer shows up, there is no doubt that
the prototype is still wrong, so let's just fix it anyway.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 6cd9e9f629 ("thermal: of: fix cooling device weights in device tree")
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-11-23 15:34:34 -08:00
Arnd Bergmann
e4217468ae Revert "thermal: qcom_spmi: allow compile test"
This just caused build errors:

warning: (QCOM_SPMI_TEMP_ALARM) selects REGMAP_SPMI which has unmet direct dependencies (SPMI)
drivers/built-in.o: In function `regmap_spmi_ext_gather_write':
:(.text+0x609b0): undefined reference to `spmi_ext_register_write'
:(.text+0x609f0): undefined reference to `spmi_ext_register_writel'

While it's generally a good idea to allow compile testing, in this
case, it just doesn't work, so reverting the patch that
introduced the compile-test variant seems the most appropriate
solution.

Note that SPMI also has a 'depends on ARCH_QCOM || COMPILE_TEST'
statement, so we should be able to enable SPMI on all architectures
for compile testing already.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: cb7fb4d342 ("thermal: qcom_spmi: allow compile test")
Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2015-11-23 15:33:56 -08:00
Nikolay Aleksandrov
7f109f7cc3 vrf: fix double free and memory corruption on register_netdevice failure
When vrf's ->newlink is called, if register_netdevice() fails then it
does free_netdev(), but that's also done by rtnl_newlink() so a second
free happens and memory gets corrupted, to reproduce execute the
following line a couple of times (1 - 5 usually is enough):
$ for i in `seq 1 5`; do ip link add vrf: type vrf table 1; done;
This works because we fail in register_netdevice() because of the wrong
name "vrf:".

And here's a trace of one crash:
[   28.792157] ------------[ cut here ]------------
[   28.792407] kernel BUG at fs/namei.c:246!
[   28.792608] invalid opcode: 0000 [#1] SMP
[   28.793240] Modules linked in: vrf nfsd auth_rpcgss oid_registry
nfs_acl nfs lockd grace sunrpc crct10dif_pclmul crc32_pclmul
crc32c_intel qxl drm_kms_helper ttm drm aesni_intel aes_x86_64 psmouse
glue_helper lrw evdev gf128mul i2c_piix4 ablk_helper cryptd ppdev
parport_pc parport serio_raw pcspkr virtio_balloon virtio_console
i2c_core acpi_cpufreq button 9pnet_virtio 9p 9pnet fscache ipv6 autofs4
ext4 crc16 mbcache jbd2 virtio_blk virtio_net sg sr_mod cdrom
ata_generic ehci_pci uhci_hcd ehci_hcd e1000 usbcore usb_common ata_piix
libata virtio_pci virtio_ring virtio scsi_mod floppy
[   28.796016] CPU: 0 PID: 1148 Comm: ld-linux-x86-64 Not tainted
4.4.0-rc1+ #24
[   28.796016] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.8.1-20150318_183358- 04/01/2014
[   28.796016] task: ffff8800352561c0 ti: ffff88003592c000 task.ti:
ffff88003592c000
[   28.796016] RIP: 0010:[<ffffffff812187b3>]  [<ffffffff812187b3>]
putname+0x43/0x60
[   28.796016] RSP: 0018:ffff88003592fe88  EFLAGS: 00010246
[   28.796016] RAX: 0000000000000000 RBX: ffff8800352561c0 RCX:
0000000000000001
[   28.796016] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
ffff88003784f000
[   28.796016] RBP: ffff88003592ff08 R08: 0000000000000001 R09:
0000000000000000
[   28.796016] R10: 0000000000000000 R11: 0000000000000001 R12:
0000000000000000
[   28.796016] R13: 000000000000047c R14: ffff88003784f000 R15:
ffff8800358c4a00
[   28.796016] FS:  0000000000000000(0000) GS:ffff88003fc00000(0000)
knlGS:0000000000000000
[   28.796016] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   28.796016] CR2: 00007ffd583bc2d9 CR3: 0000000035a99000 CR4:
00000000000406f0
[   28.796016] Stack:
[   28.796016]  ffffffff8121045d ffffffff812102d3 ffff8800352561c0
ffff880035a91660
[   28.796016]  ffff8800008a9880 0000000000000000 ffffffff81a49940
00ffffff81218684
[   28.796016]  ffff8800352561c0 000000000000047c 0000000000000000
ffff880035b36d80
[   28.796016] Call Trace:
[   28.796016]  [<ffffffff8121045d>] ?
do_execveat_common.isra.34+0x74d/0x930
[   28.796016]  [<ffffffff812102d3>] ?
do_execveat_common.isra.34+0x5c3/0x930
[   28.796016]  [<ffffffff8121066c>] do_execve+0x2c/0x30
[   28.796016]  [<ffffffff810939a0>]
call_usermodehelper_exec_async+0xf0/0x140
[   28.796016]  [<ffffffff810938b0>] ? umh_complete+0x40/0x40
[   28.796016]  [<ffffffff815cb1af>] ret_from_fork+0x3f/0x70
[   28.796016] Code: 48 8d 47 1c 48 89 e5 53 48 8b 37 48 89 fb 48 39 c6
74 1a 48 8b 3d 7e e9 8f 00 e8 49 fa fc ff 48 89 df e8 f1 01 fd ff 5b 5d
f3 c3 <0f> 0b 48 89 fe 48 8b 3d 61 e9 8f 00 e8 2c fa fc ff 5b 5d eb e9
[   28.796016] RIP  [<ffffffff812187b3>] putname+0x43/0x60
[   28.796016]  RSP <ffff88003592fe88>

Fixes: 193125dbd8 ("net: Introduce VRF device driver")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-23 17:52:46 -05:00
Punit Agrawal
73124ced9c cpufreq: SCPI: Depend on SCPI clk driver
The SCPI clk driver registers the virtual cpufreq device that kicks off
initialisation of the SCPI cpufreq driver. Also, clk_get() will fail for
the cpufreq driver if the SCPI clk driver is missing.

Fix this by making the SCPI cpufreq driver explicitly depend on the SCPI
clk driver.

Fixes: 8def31034d (cpufreq: arm_big_little: add SCPI interface driver)
Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-11-23 23:50:27 +01:00
Rafael J. Wysocki
b0ceed0685 Merge back earlier cpufreq fixes for v4.4. 2015-11-23 23:49:57 +01:00
Prarit Bhargava
785ee27881 cpufreq: intel_pstate: Fix limits->max_perf rounding error
A rounding error was found in the calculation of limits->max_perf
in intel_pstate_set_policy(), which is used to calculate the max and min
pstate values in intel_pstate_get_min_max().  In that code,
limits->max_perf is truncated to 2 hex digits such that, for example,
0x169 was incorrectly calculated to 0x16 instead of 0x17.  This resulted in
the pstate being set one level too low.  This patch rounds the value of
limits->max_perf up instead of down so that the correct max pstate can
be reached.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-11-23 23:15:34 +01:00
Prarit Bhargava
8478f53946 cpufreq: intel_pstate: Fix limits->max_policy_pct rounding error
I have a Intel (6,63) processor with a "marketing" frequency (from
/proc/cpuinfo) of 2100MHz, and a max turbo frequency of 2600MHz.  I
can execute

cpupower frequency-set -g powersave --min 1200MHz --max 2100MHz

and the max_freq_pct is set to 80.  When adding load to the system I noticed
that the cpu frequency only reached 2000MHZ and not 2100MHz as expected.

This is because limits->max_policy_pct is calculated as 2100 * 100 /2600 = 80.7
and is rounded down to 80 when it should be rounded up to 81.  This patch
adds a DIV_ROUND_UP() which will return the correct value.

Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-11-23 23:14:10 +01:00
Viresh Kumar
f344dae0fe cpufreq: Always remove sysfs cpuX/cpufreq link on ->remove_dev()
Subsys interface's ->remove_dev() is called when the cpufreq driver is
unregistering or the CPU is getting physically removed. We keep removing
the cpuX/cpufreq link for all CPUs except the last one, which is a
mistake as all CPUs contain a link now.

Because of this, one CPU from each policy will still contain a link (to
an already removed policyX directory), after the cpufreq driver is
unregistered.

Fix that by removing the link first and then only see if the policy is
required to be freed. That will make sure that no links are left out.

Fixes: 96bdda61f5 ("cpufreq: create cpu/cpufreq/policyX directories")
Reported-and-tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-11-23 22:49:42 +01:00
Ashwin Chaugule
9dc1791773 cpufreq: CPPC: Initialize and check CPUFreq CPU co-ord type correctly
The CPU policy struct indicates the co-ordination type
for all CPUs of a common freq domain. Initialize it
correctly using the CPU specific data gathered from
CPPC ACPI lib via acpi_get_psd_map().

The PSD object is optional, so the cpu->shared_type
can also be 0. So instead of assuming any value other
than SW_ANY(0xFD) is unsupported, explictly check
if shared_type is SW_ALL and then bail.

Signed-off-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-11-23 22:21:18 +01:00
Linus Torvalds
a2931547ee linux-kselftest-4.4-rc3
This update consists of one minor documentation fix and a fix
 to an existing test.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWU1euAAoJEAsCRMQNDUMcC78P/3mIOPtVMRMHR0YwGA/MCavO
 +JhVbEJCsrVtg5aPRod1Psz3QU3ubqr37yAeDe7vCniJK1zDx0QBGXATv91dVGLz
 Fqjm6DZ1zJXrsSgoFhWZXtjicEI2khdMlzDsRD0vXNSDJATpWHRVa9eLMeeZnIVA
 DXMH/RRlo7b4lK8/Kf2YV190mqemMsJRF2PfUAiZ1ZqBd8hCnqsk0hYdkJNaIDfJ
 PydtUCDLbXuvjg3AfGaBndifudzRFzb/lYyQ9K3KPHj2cE5TMHCPn2jTZwJ5V3cZ
 IX+LtYtxEZu+gCz/3l9kN9QDzy0EVeozvPGgg8gY/YLmKinQVENBuVXV4+vR696y
 h/LtJm7NdVyy4fopI6YBTEvaq7TKeNQWKjnQ7p5clqMCchY1/9aSgbAVIMgw5OFb
 DPNnclcfWmVEMpzbmeyMTmfAbcqmttmQXAaklXH6WrcQ/C9KEWfMzexvY4ho/eur
 daIl7A3MyB83Z5bjUsryhVeNunPecklshE1wMwrmutnDIH8Wj+eJM6yHBJf/cgbO
 AnhKRcsqzkti0QXdlzEMRWfDWAfkzCXSbdjcORnRFV4Dw2X7RgizFXtfI6xccVxS
 AO4dtkNKbXUOt184XZlwrES+IXhtnlqBTO1HX/clQ2F7FVeT6Sq1eYuAlVugDH8H
 65mZzXyxAAfcjctk4U/r
 =8sYr
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-4.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fixes from Shuah Khan:
 "This update consists of one minor documentation fix and a fix to an
  existing test"

* tag 'linux-kselftest-4.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/seccomp: Get page size from sysconf
  tools:testing/selftests: fix typo in futex/README
2015-11-23 13:19:27 -08:00
Dan Carpenter
3d1a54e801 net/hsr: fix a warning message
WARN_ON_ONCE() takes a condition, it doesn't take an error message.  I
have converted this to WARN() instead.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-11-23 14:56:15 -05:00