I noticed the following bug/misbehavior on certain Intel systems: with a
single task running on a NOHZ CPU on an Intel Haswell, I recognized
that I did not only get the one expected local_timer APIC interrupt, but
two per second at minimum. (!)
Further tracing showed that the first one precedes the programmed deadline
by up to ~50us and hence, it did nothing except for reprogramming the TSC
deadline clockevent device to trigger shortly thereafter again.
The reason for this is imprecise calibration, the timeout we program into
the APIC results in 'too short' timer interrupts. The core (hr)timer code
notices this (because it has a precise ktime source and sees the short
interrupt) and fixes it up by programming an additional very short
interrupt period.
This is obviously suboptimal.
The reason for the imprecise calibration is twofold, and this patch
fixes the first reason:
In setup_APIC_timer(), the registered clockevent device's frequency
is calculated by first dividing tsc_khz by TSC_DIVISOR and multiplying
it with 1000 afterwards:
(tsc_khz / TSC_DIVISOR) * 1000
The multiplication with 1000 is done for converting from kHz to Hz and the
division by TSC_DIVISOR is carried out in order to make sure that the final
result fits into an u32.
However, with the order given in this calculation, the roundoff error
introduced by the division gets magnified by a factor of 1000 by the
following multiplication.
To fix it, reversing the order of the division and the multiplication a la:
(tsc_khz * 1000) / TSC_DIVISOR
... reduces the roundoff error already.
Furthermore, if TSC_DIVISOR divides 1000, associativity holds:
(tsc_khz * 1000) / TSC_DIVISOR = tsc_khz * (1000 / TSC_DIVISOR)
and thus, the roundoff error even vanishes and the whole operation can be
carried out within 32 bits.
The powers of two that divide 1000 are 2, 4 and 8. A value of 8 for
TSC_DIVISOR still allows for TSC frequencies up to
2^32 / 10^9ns * 8 = 34.4GHz which is way larger than anything to expect
in the next years.
Thus we also replace the current TSC_DIVISOR value of 32 by 8. Reverse
the order of the divison and the multiplication in the calculation of
the registered clockevent device's frequency.
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Christopher S. Hall <christopher.s.hall@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Link: http://lkml.kernel.org/r/20160714152255.18295-2-nicstange@gmail.com
[ Improved changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Where a device driver has set a 64-bit DMA mask to indicate the absence
of addressing limitations, we still need to ensure that we don't
allocate IOVAs beyond the actual input size of the IOMMU. The reported
aperture is the most reliable way we have of inferring that input
address size, so use that to enforce a hard upper limit where available.
Fixes: 0db2e5d18f ("iommu: Implement common IOMMU ops for DMA mapping")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
We cannot do those initializations from apply_feature_fixups() as
this function runs in a very restricted environment on 32-bit where
the kernel isn't running at its linked address and the PTRRELOC()
macro must be used for any global accesss.
Instead, split them into a separtate steup_feature_keys() function
which is called in a more suitable spot on ppc32.
Fixes: 309b315b6e ("powerpc: Call jump_label_init() in apply_feature_fixups()")
Reported-and-tested-by: Christian Kujau <lists@nerdbynature.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Right now the following sequence of events can happen:
1. Thread X calls vgic_put_irq
2. Thread Y calls vgic_add_lpi
3. Thread Y gets lpi_list_lock
4. Thread X drops the ref count to 0 and blocks on lpi_list_lock
5. Thread Y finds the irq via the lpi_list_lock, raises the ref
count to 1, and release the lpi_list_lock.
6. Thread X proceeds and frees the irq.
Avoid this by holding the spinlock around the kref_put.
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
During low memory conditions, we could be dereferencing a NULL pointer
when vgic_add_lpi fails to allocate memory.
Consider for example this call sequence:
vgic_its_cmd_handle_mapi
itte->irq = vgic_add_lpi(kvm, lpi_nr);
update_lpi_config(kvm, itte->irq, NULL);
ret = kvm_read_guest(kvm, propbase + irq->intid
^^^^
kaboom?
Instead, return an error pointer from vgic_add_lpi and check the return
value from its single caller.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
We don't identify the machine type anymore...
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This makes it easier to debug crashes that happen very early before
the kernel takes over Open Firmware by allowing us to relate the OF
reported crashing addresses to offsets within the kernel.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Both set_memory_ro() and set_memory_rw() will modify the page
attributes of at least one page, even if the numpages parameter is
zero.
The author expected that calling these functions with numpages == zero
would never happen. However with the new 444d13ff10 ("modules: add
ro_after_init support") feature this happens frequently.
Therefore do the right thing and make these two functions return
gracefully if nothing should be done.
Fixes crashes on module load like this one:
Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 000003ff80008000 TEID: 000003ff80008407
Fault in home space mode while using kernel ASCE.
AS:0000000000d18007 R3:00000001e6aa4007 S:00000001e6a10800 P:00000001e34ee21d
Oops: 0004 ilc:3 [#1] SMP
Modules linked in: x_tables
CPU: 10 PID: 1 Comm: systemd Not tainted 4.7.0-11895-g3fa9045 #4
Hardware name: IBM 2964 N96 703 (LPAR)
task: 00000001e9118000 task.stack: 00000001e9120000
Krnl PSW : 0704e00180000000 00000000005677f8 (rb_erase+0xf0/0x4d0)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 000003ff80008b20 000003ff80008b20 000003ff80008b70 0000000000b9d608
000003ff80008b20 0000000000000000 00000001e9123e88 000003ff80008950
00000001e485ab40 000003ff00000000 000003ff80008b00 00000001e4858480
0000000100000000 000003ff80008b68 00000000001d5998 00000001e9123c28
Krnl Code: 00000000005677e8: ec1801c3007c cgij %r1,0,8,567b6e
00000000005677ee: e32010100020 cg %r2,16(%r1)
#00000000005677f4: a78401c2 brc 8,567b78
>00000000005677f8: e35010080024 stg %r5,8(%r1)
00000000005677fe: ec5801af007c cgij %r5,0,8,567b5c
0000000000567804: e30050000024 stg %r0,0(%r5)
000000000056780a: ebacf0680004 lmg %r10,%r12,104(%r15)
0000000000567810: 07fe bcr 15,%r14
Call Trace:
([<000003ff80008900>] __this_module+0x0/0xffffffffffffd700 [x_tables])
([<0000000000264fd4>] do_init_module+0x12c/0x220)
([<00000000001da14a>] load_module+0x24e2/0x2b10)
([<00000000001da976>] SyS_finit_module+0xbe/0xd8)
([<0000000000803b26>] system_call+0xd6/0x264)
Last Breaking-Event-Address:
[<000000000056771a>] rb_erase+0x12/0x4d0
Kernel panic - not syncing: Fatal exception: panic_on_oops
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reported-and-tested-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Fixes: e8a97e42dc ("s390/pageattr: allow kernel page table splitting")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
When a device is in a status where CIO has killed all I/O by itself the
interrupt for a clear request may not contain an irb to determine the
clear function. Instead it contains an error pointer -EIO.
This was ignored by the DASD int_handler leading to a hanging device
waiting for a clear interrupt.
Handle -EIO error pointer correctly for requests that are clear pending and
treat the clear as successful.
Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com>
Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Commit 8d460f6156 ("powerpc/process: Add the function
flush_tmregs_to_thread") added flush_tmregs_to_thread() and included
the assumption that it would only be called for a task which is not
current.
Although this is correct for ptrace, when generating a core dump, some
of the routines which call flush_tmregs_to_thread() are called. This
leads to a WARNing such as:
Not expecting ptrace on self: TM regs may be incorrect
------------[ cut here ]------------
WARNING: CPU: 123 PID: 7727 at arch/powerpc/kernel/process.c:1088 flush_tmregs_to_thread+0x78/0x80
CPU: 123 PID: 7727 Comm: libvirtd Not tainted 4.8.0-rc1-gcc6x-g61e8a0d #1
task: c000000fe631b600 task.stack: c000000fe63b0000
NIP: c00000000001a1a8 LR: c00000000001a1a4 CTR: c000000000717780
REGS: c000000fe63b3420 TRAP: 0700 Not tainted (4.8.0-rc1-gcc6x-g61e8a0d)
MSR: 900000010282b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]> CR: 28004222 XER: 20000000
...
NIP [c00000000001a1a8] flush_tmregs_to_thread+0x78/0x80
LR [c00000000001a1a4] flush_tmregs_to_thread+0x74/0x80
Call Trace:
flush_tmregs_to_thread+0x74/0x80 (unreliable)
vsr_get+0x64/0x1a0
elf_core_dump+0x604/0x1430
do_coredump+0x5fc/0x1200
get_signal+0x398/0x740
do_signal+0x54/0x2b0
do_notify_resume+0x98/0xb0
ret_from_except_lite+0x70/0x74
So fix flush_tmregs_to_thread() to detect the case where it is called on
current, and a transaction is active, and in that case flush the TM regs
to the thread_struct.
This patch also moves flush_tmregs_to_thread() into ptrace.c as it is
only called from that file.
Fixes: 8d460f6156 ("powerpc/process: Add the function flush_tmregs_to_thread")
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
[mpe: Flesh out change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Commit 7aef413656 ("powerpc32: rewrite csum_partial_copy_generic()
based on copy_tofrom_user()") introduced a bug when destination
address is odd and initial csum is not null
In that (rare) case the initial csum value has to be rotated one byte
as well as the resulting value is
This patch also fixes related comments
Fixes: 7aef413656 ("powerpc32: rewrite csum_partial_copy_generic() based on copy_tofrom_user()")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Adding fdb entries pointing to the bridge device uses fdb_insert(),
which lacks various checks and does not respect added_by_user flag.
As a result, some inconsistent behavior can happen:
* Adding temporary entries succeeds but results in permanent entries.
* Same goes for "dynamic" and "use".
* Changing mac address of the bridge device causes deletion of
user-added entries.
* Replacing existing entries looks successful from userspace but actually
not, regardless of NLM_F_EXCL flag.
Use the same logic as other entries and fix them.
Fixes: 3741873b4f ("bridge: allow adding of fdb entries pointing to the bridge device")
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Switch the setting of psl_fir_cntl from debug to production
environment recommended value. It mostly affects the PSL behavior when
an error is raised in psl_fir1/2.
Tested with cxlflash.
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
I've found funny live-lock between raid10 barriers during resync and
memory controller hard limits. Inside mpage_readpages() task holds on to
its plug bio which blocks the barrier in raid10. Its memory cgroup have
no free memory thus the task goes into reclaimer but all reclaimable
pages are dirty and cannot be written because raid10 is rebuilding and
stuck on the barrier.
Common flush of such IO in schedule() never happens, because the caller
doesn't go to sleep.
Lock is 'live' because changing memory limit or killing tasks which
holds that stuck bio unblock whole progress.
That was what happened in 3.18.x but I see no difference in upstream
logic. Theoretically this might happen even without memory cgroup.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Jens Axboe <axboe@fb.com>
Disable all interrupts when suspend, they will be enabled
when resume. Otherwise, the suspend/resume process will be
blocked occasionally.
Signed-off-by: Wenyou Yang <wenyou.yang@atmel.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit b5a099c67a "net: ethernet: davicom: fix devicetree irq
resource" causes an interrupt storm after the ethernet interface
is activated on S3C24XX platform (ARM non-dt), due to the interrupt
trigger type not being set properly.
It seems, after adding parsing of IRQ flags in commit 7085a7401b
"drivers: platform: parse IRQ flags from resources", there is no path
for non-dt platforms where irq_set_type callback could be invoked when
we don't pass the trigger type flags to the request_irq() call.
In case of a board where the regression is seen the interrupt trigger
type flags are passed through a platform device's resource and it is
not currently handled properly without passing the irq trigger type
flags to the request_irq() call. In case of OF an of_irq_get() call
within platform_get_irq() function seems to be ensuring required irq_chip
setup, but there is no equivalent code for non OF/ACPI platforms.
This patch mostly restores irq trigger type setting code which has been
removed in commit ("net: ethernet: davicom: fix devicetree irq resource").
Fixes: b5a099c67a ("net: ethernet: davicom: fix devicetree irq resource")
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Acked-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
During boot, sometimes the kernel will test to see if an instruction
causes an undefined instruction exception. Unfortunately, the exit
path for these exceptions did not restore the address limit, which
causes the rootfs mount code to fail. Fix the missing address limit
restoration.
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
The late_alloc() PTE allocation function used by create_mapping_late()
does not call pgtable_page_ctor() on PTE pages it allocates, leaving
the per-page spinlock uninitialized.
Since generic page table manipulation code may assume that translation
table pages that are not owned by init_mm are covered by fully
constructed struct pages, the following crash may occur with the new
UEFI memory attributes table code.
efi: memattr: Processing EFI Memory Attributes table:
efi: memattr: 0x0000ffa16000-0x0000ffa82fff [Runtime Code |RUN| | |XP| | | | | | | | ]
Unable to handle kernel NULL pointer dereference at virtual address 00000010
pgd = c0204000
[00000010] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc4-00063-g3882aa7b340b #361
Hardware name: Generic DT based system
task: ed858000 ti: ed842000 task.ti: ed842000
PC is at __lock_acquire+0xa0/0x19a8
...
[<c038c830>] (__lock_acquire) from [<c038e4f8>] (lock_acquire+0x6c/0x88)
[<c038e4f8>] (lock_acquire) from [<c0c06134>] (_raw_spin_lock+0x2c/0x3c)
[<c0c06134>] (_raw_spin_lock) from [<c0410384>] (apply_to_page_range+0xe8/0x238)
[<c0410384>] (apply_to_page_range) from [<c1205f34>] (efi_set_mapping_permissions+0x54/0x5c)
[<c1205f34>] (efi_set_mapping_permissions) from [<c1247474>] (efi_memattr_apply_permissions+0x2b8/0x378)
[<c1247474>] (efi_memattr_apply_permissions) from [<c1248258>] (arm_enable_runtime_services+0x1f0/0x22c)
[<c1248258>] (arm_enable_runtime_services) from [<c0301f0c>] (do_one_initcall+0x44/0x174)
[<c0301f0c>] (do_one_initcall) from [<c1200d10>] (kernel_init_freeable+0x90/0x1e8)
[<c1200d10>] (kernel_init_freeable) from [<c0bff690>] (kernel_init+0x8/0x114)
[<c0bff690>] (kernel_init) from [<c0307ed0>] (ret_from_fork+0x14/0x24)
The crash is due to the fact that the UEFI page tables are not owned by
init_mm, but are not covered by fully constructed struct pages.
Given that the UEFI subsystem is currently the only user of
create_mapping_late(), add an unconditional call to pgtable_page_ctor() to
late_alloc().
Fixes: 9fc68b717c ("ARM/efi: Apply strict permissions for UEFI Runtime Services regions")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
To limit the amount of mapped low memory, we determine a physical address
boundary based on the start of the vmalloc area using __pa().
Strictly speaking, the vmalloc area location is arbitrary and does not
necessarily corresponds to a valid physical address. For example, if
PAGE_OFFSET = 0x80000000
PHYS_OFFSET = 0x90000000
vmalloc_min = 0xf0000000
then __pa(vmalloc_min) overflows and returns a wrapped 0 when phys_addr_t
is a 32-bit type. Then the code that follows determines that the entire
physical memory is above that boundary and no low memory gets mapped at
all:
|[...]
|Machine model: Freescale i.MX51 NA04 Board
|Ignoring RAM at 0x90000000-0xb0000000 (!CONFIG_HIGHMEM)
|Consider using a HIGHMEM enabled kernel.
To avoid this problem let's make vmalloc_limit a 64-bit value all the
time and determine that boundary explicitly without using __pa().
Reported-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Emil Renner Berthing <kernel@esmil.dk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When executing the script included below, the netns delete operation
hangs with the following message (repeated at 10 second intervals):
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1
This occurs because a reference to the lo interface in the "secure" netns
is still held by a dst entry in the xfrm bundle cache in the init netns.
Address this problem by garbage collecting the tunnel netns flow cache
when a cross-namespace vti interface receives a NETDEV_DOWN notification.
A more detailed description of the problem scenario (referencing commands
in the script below):
(1) ip link add vti_test type vti local 1.1.1.1 remote 1.1.1.2 key 1
The vti_test interface is created in the init namespace. vti_tunnel_init()
attaches a struct ip_tunnel to the vti interface's netdev_priv(dev),
setting the tunnel net to &init_net.
(2) ip link set vti_test netns secure
The vti_test interface is moved to the "secure" netns. Note that
the associated struct ip_tunnel still has tunnel->net set to &init_net.
(3) ip netns exec secure ping -c 4 -i 0.02 -I 192.168.100.1 192.168.200.1
The first packet sent using the vti device causes xfrm_lookup() to be
called as follows:
dst = xfrm_lookup(tunnel->net, skb_dst(skb), fl, NULL, 0);
Note that tunnel->net is the init namespace, while skb_dst(skb) references
the vti_test interface in the "secure" namespace. The returned dst
references an interface in the init namespace.
Also note that the first parameter to xfrm_lookup() determines which flow
cache is used to store the computed xfrm bundle, so after xfrm_lookup()
returns there will be a cached bundle in the init namespace flow cache
with a dst referencing a device in the "secure" namespace.
(4) ip netns del secure
Kernel begins to delete the "secure" namespace. At some point the
vti_test interface is deleted, at which point dst_ifdown() changes
the dst->dev in the cached xfrm bundle flow from vti_test to lo (still
in the "secure" namespace however).
Since nothing has happened to cause the init namespace's flow cache
to be garbage collected, this dst remains attached to the flow cache,
so the kernel loops waiting for the last reference to lo to go away.
<Begin script>
ip link add br1 type bridge
ip link set dev br1 up
ip addr add dev br1 1.1.1.1/8
ip netns add secure
ip link add vti_test type vti local 1.1.1.1 remote 1.1.1.2 key 1
ip link set vti_test netns secure
ip netns exec secure ip link set vti_test up
ip netns exec secure ip link s lo up
ip netns exec secure ip addr add dev lo 192.168.100.1/24
ip netns exec secure ip route add 192.168.200.0/24 dev vti_test
ip xfrm policy flush
ip xfrm state flush
ip xfrm policy add dir out tmpl src 1.1.1.1 dst 1.1.1.2 \
proto esp mode tunnel mark 1
ip xfrm policy add dir in tmpl src 1.1.1.2 dst 1.1.1.1 \
proto esp mode tunnel mark 1
ip xfrm state add src 1.1.1.1 dst 1.1.1.2 proto esp spi 1 \
mode tunnel enc des3_ede 0x112233445566778811223344556677881122334455667788
ip xfrm state add src 1.1.1.2 dst 1.1.1.1 proto esp spi 1 \
mode tunnel enc des3_ede 0x112233445566778811223344556677881122334455667788
ip netns exec secure ping -c 4 -i 0.02 -I 192.168.100.1 192.168.200.1
ip netns del secure
<End script>
Reported-by: Hangbin Liu <haliu@redhat.com>
Reported-by: Jan Tluka <jtluka@redhat.com>
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIVAwUAV6oCE/Sw1s6N8H32AQK1bxAAj2FoASombJwHomUjnvKb4RWdFkwA2Km3
A2R6qdPYCZLbdiH7gw8TqSYGBp//8EL6qlcMKKVvSciAFlS0oQb2G99Tni4cVB4W
/xmRatAeU2/k1YBndPsOFHkhdDto6BlsUJbWaW3WIH2vqIfFsW1nh4oIR0N9yvt0
awF34s2yXfp77hr2t7Ca/PMibEcyCmg2kF83Kxilx5z/Ubpl/TtyqswLiPDO7EJq
lfyoplgxoG0kvpN1wh2z8i4dKAh8+e3PMIMBdodG/2aA/srWVlECRDx0IqQ4hzQg
iSaJDBp+E9ZhKExkPIDpEx49iH1kawerFub+udBDCR6n9JPSlPOWdHiYxXaVE7yA
WGW2JGyv2Wo7vkEM76/3dzHBzn0w203T6C+Sa25I82oOxvVbAZfQRxHTJVmCdcOc
lHyeGcEuk4h2lYc7U/ty3uGEwm7dL+tyoEw0MGcHQEDpnainKjFPdXiKSOOMb/Vd
kiqXrsK9nPZ9JjDtRA1oK7uLNYbllFH13aDR5kU1VAH4s2zwD1498/tegcZ+Hds5
UKhyWfqHBr3uBrxEwRVLFnJ5yi6LQy+O4jNHg1gTakaY6B2j34sLzd5fuc1idhGe
yUuCU+XC7mcdQQWoXAc96/orsqmtEiY6yABuTTijeZTSbpKBT3uJpelzAnuxCePZ
2u0lddGIl7M=
=c47N
-----END PGP SIGNATURE-----
Merge tag 'rxrpc-fixes-20160809' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Miscellaneous fixes
Here are a bunch of miscellaneous fixes to AF_RXRPC:
(*) Fix an uninitialised pointer.
(*) Fix error handling when we fail to connect a call.
(*) Fix a NULL pointer dereference.
(*) Fix two occasions where a packet is accessed again after being queued
for someone else to deal with.
(*) Fix a missing skb free.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
User visible fixes:
- Fix the lookup for a kernel module in 'perf probe', fixing for instance, the
erroneous return of "[raid10]" when looking for "[raid1]" (Konstantin Khlebnikov)
- Disable counters in a group before reading them in 'perf stat', to avoid skew (Mark Rutland)
- Fix adding probes to function aliases in systems using kaslr (Masami Hiramatsu)
- Trip libtraceevent trace_seq buffers, removing unnecessary memory usage that could
bring a system using tracepoint events with 'perf top' to a crawl, as the trace_seq
buffers start at a whooping 4 KB, which is very rarely used in perf's usecases,
so realloc it to the really used space as a last measure after using libtraceevent
functions to format the fields of tracepoint events (Arnaldo Carvalho de Melo)
- Fix 'perf probe' location when using DWARF on ppc64le (Ravi Bangoria)
Improvement:
- Allow specifying signedness casts to a 'perf probe' variable, to shorten
the number of steps to see signed values that otherwise would always appear
as hex values (Naohiro Aota)
Documentation fixes:
- Add 'bpf-output' field to 'perf script' usage message (Brendan Gregg)
Infrastructure fixes:
- Sync kernel header files: cpufeatures.h, {disabled,required}-features.h,
bpf.h and vmx.h, so that we get a clean build, without warnings about files
being different from the kernel counterparts.
A verification of the need or desirability of changes in tools/ based on what
was done in the kernel changesets was made and documented in the respective
file sync changesets (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJXqfpSAAoJENZQFvNTUqpAkJUQALGxbMIZmCOE2O/lok7t6PDZ
VUsq0vs3V1mPqdin1UnNsfMCvWQExeJZQ4P8EUTSLoMbpiifq2BY5696xs35LUtl
+UTTtVgtN+/W5gLiQ78U8kEkG8Q/PiMeWKyLLKgBSAEtibC4pOLnQCu/g4DP3e3c
oYQTaoaSq4eBQMQaDKF2Y6EVQFEdQs4PI1JUIsGn9zTfR5qtRiKwZxrNkAfmNAVO
opDN42JR3HewwXiOKWuoAVHDi5QVsHgDUnPuYlFujbx306WV+EiypRpzA4Rbr0Cu
AZrtkqQdSkKYVhEop3Az5kW9m3qZ6DRcZfJNVmD0Cax637gQNbOyIVVxo2KiS5JQ
8kZknTuQoR8GTARUwlUlQVydwKaRXsox4M1o71FVAOuvEKzBFpIUF46c+Ljgj4O0
zo1q9I2GnmxjakP0528oLrtT4UyndWTxjK0bwPcr+AwFGVfICT5OteUoTkLSbTAO
WdqfIhtS1PL+yJmy9iQkaPtTWVtgHmJbQ8PtBVBZZjwvDbx2sDv84/83picwUM+u
4g0WaqEzKMhRznDHXjc6EkSWRcP20AFqAotRNn8Uxhinx6OvLMs2C6TLzt1lHuw/
Zsz6wJnM3qaOs9Oxe1U8J04Zm47C8y6iVx5zpJkgFaAf49mJ475me89NratFrAfq
T1OsSy9dejfB4xrBvOjk
=/B7T
-----END PGP SIGNATURE-----
Merge tag 'perf-urgent-for-mingo-20160809' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
User visible fixes:
- Fix the lookup for a kernel module in 'perf probe', fixing for instance, the
erroneous return of "[raid10]" when looking for "[raid1]" (Konstantin Khlebnikov)
- Disable counters in a group before reading them in 'perf stat', to avoid skew (Mark Rutland)
- Fix adding probes to function aliases in systems using kaslr (Masami Hiramatsu)
- Trip libtraceevent trace_seq buffers, removing unnecessary memory usage that could
bring a system using tracepoint events with 'perf top' to a crawl, as the trace_seq
buffers start at a whooping 4 KB, which is very rarely used in perf's usecases,
so realloc it to the really used space as a last measure after using libtraceevent
functions to format the fields of tracepoint events (Arnaldo Carvalho de Melo)
- Fix 'perf probe' location when using DWARF on ppc64le (Ravi Bangoria)
- Allow specifying signedness casts to a 'perf probe' variable, to shorten
the number of steps to see signed values that otherwise would always appear
as hex values (Naohiro Aota)
Documentation fixes:
- Add 'bpf-output' field to 'perf script' usage message (Brendan Gregg)
Infrastructure fixes:
- Sync kernel header files: cpufeatures.h, {disabled,required}-features.h,
bpf.h and vmx.h, so that we get a clean build, without warnings about files
being different from the kernel counterparts.
A verification of the need or desirability of changes in tools/ based on what
was done in the kernel changesets was made and documented in the respective
file sync changesets (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This reverts commit 874f9c7da9.
Geert Uytterhoeven reports:
"This change seems to have an (unintendent?) side-effect.
Before, pr_*() calls without a trailing newline characters would be
printed with a newline character appended, both on the console and in
the output of the dmesg command.
After this commit, no new line character is appended, and the output
of the next pr_*() call of the same type may be appended, like in:
- Truncating RAM at 0x0000000040000000-0x00000000c0000000 to -0x0000000070000000
- Ignoring RAM at 0x0000000200000000-0x0000000240000000 (!CONFIG_HIGHMEM)
+ Truncating RAM at 0x0000000040000000-0x00000000c0000000 to -0x0000000070000000Ignoring RAM at 0x0000000200000000-0x0000000240000000 (!CONFIG_HIGHMEM)"
Joe Perches says:
"No, that is not intentional.
The newline handling code inside vprintk_emit is a bit involved and
for now I suggest a revert until this has all the same behavior as
earlier"
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Requested-by: Joe Perches <joe@perches.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since IRQCHIP_DECLARE now flags the GPC node as already populated, the
GPC power domain driver is never probed unless we clear the flag again.
Fixes: 15cc2ed6dc ("of/irq: Mark initialised interrupt controllers as populated")
Suggested-by: Rob Herring <robh@kernel.org>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Rob Herring <robh@kernel.org>
That way the init callback may clear the flag again, in case of drivers
split between early irq chip and a normal platform driver.
Fixes: 15cc2ed6dc ("of/irq: Mark initialised interrupt controllers as populated")
Suggested-by: Rob Herring <robh@kernel.org>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Rob Herring <robh@kernel.org>
@mynodes is set to NULL when __unflatten_device_tree() is called
to unflatten device sub-tree in PCI hot add scenario on PowerPC
PowerNV platform. Marking @mynodes detached unconditionally causes
kernel crash as below backtrace shows:
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc000000000b26f64
cpu 0x0: Vector: 300 (Data Access) at [c000003fcc7cf740]
pc: c000000000b26f64: __unflatten_device_tree+0xf4/0x190
lr: c000000000b26f40: __unflatten_device_tree+0xd0/0x190
sp: c000003fcc7cf9c0
msr: 900000000280b033
dar: 0
dsisr: 40000000
current = 0xc000003fcc281680
paca = 0xc00000000ff00000 softe: 0 irq_happened: 0x01
pid = 2724, comm = sh
Linux version 4.7.0-gavin-07754-g92a6836 (gwshan@gwshan) (gcc version \
4.9.3 (Buildroot 2016.02-rc2-00093-g5ea3bce) ) #539 SMP Mon Aug 1 \
12:40:29 AEST 2016
enter ? for help
[c000003fcc7cfa50] c000000000b27060 of_fdt_unflatten_tree+0x60/0x90
[c000003fcc7cfaa0] c0000000004c6288 pnv_php_set_slot_power_state+0x118/0x440
[c000003fcc7cfb80] c0000000004c6a10 pnv_php_enable+0xc0/0x170
[c000003fcc7cfbd0] c0000000004c4d80 power_write_file+0xa0/0x190
[c000003fcc7cfc50] c0000000004be93c pci_slot_attr_store+0x3c/0x60
[c000003fcc7cfc70] c0000000002d3fd4 sysfs_kf_write+0x94/0xc0
[c000003fcc7cfcb0] c0000000002d2c30 kernfs_fop_write+0x180/0x260
[c000003fcc7cfd00] c000000000230fe0 __vfs_write+0x40/0x190
[c000003fcc7cfd90] c000000000232278 vfs_write+0xc8/0x240
[c000003fcc7cfde0] c000000000233d90 SyS_write+0x60/0x110
[c000003fcc7cfe30] c000000000009524 system_call+0x38/0x108
This avoids the kernel crash by marking @mynodes detached only when
@mynodes is dereferencing valid device node in __unflatten_device_tree().
Fixes: 1d1bde550e ("of: fdt: mark unflattened tree as detached")
Reported-by: Meng Li <shlimeng@cn.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
The of_node_put() function tests whether its argument is NULL
and then returns immediately.
Thus the test around the call is not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Rob Herring <robh@kernel.org>
properly by the tracing user space tools. This was due to the
TRACE_DEFINE_ENUM() being set to a define, when it should have been set
to the enum itself. The define was of the MASK that used the BIT to shift.
The BIT was the enum and by adding that, everything gets converted nicely.
The MASK is still kept just in case it gets converted to an enum in the
future.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJXqeA/AAoJEKKk/i67LK/8wLAH/0nD8L5pxtn+pZi3mQnWNwbn
qEBtfKK8cvnt0IWH2HlKmRKLAiIJfp8UrkGoPiLT7Nb83PlKaw3UT868t7eDmknu
b29SsZroMgvJ1MeeNH9Yzk7cK3/K213VO02P4ce8EWSELYCqFlxJDE3dhl52K9Lj
clJSoZbIGTLlx4pk6zZnPksTt3Z9WZXcVJITwxEiz/Cr+CKAWZpLoPPUaqJOmA4j
9oS6d++0DYZdz3cWCmYBBkphmc5IQkBNZWGMYLcAR+M+m5fsN+baWlP3Dhq4j7he
WknHwu4WDFMk6a2Kh0Ggi6yUWVUIkLNY2Z4QUF2gNmJT1g/FH5lZka4/kIxjvKw=
=rgBA
-----END PGP SIGNATURE-----
Merge tag 'trace-v4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"Fix tick_stop tracepoint symbols for user export.
Luiz Capitulino noticed that the tick_stop tracepoint wasn't being
parsed properly by the tracing user space tools.
This was due to the TRACE_DEFINE_ENUM() being set to a define, when it
should have been set to the enum itself. The define was of the MASK
that used the BIT to shift. The BIT was the enum and by adding that,
everything gets converted nicely. The MASK is still kept just in case
it gets converted to an enum in the future"
* tag 'trace-v4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix tick_stop tracepoint symbols for user export
- Fixes a problem with gcc plugins interfering with cc-option tests.
- Aborts more gracefully when gcc plugin headers or compiler support is
missing.
- Improves the gcc plugin rule generation to be more dynamic, pass arguments,
and build from subdirectories.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Kees Cook <kees@outflux.net>
iQIbBAABCgAGBQJXqSsEAAoJEIly9N/cbcAmf+gP+LkBnio5aJrhQwsw2pJ44xC+
XTYeypDJswjA2EZsCLJXRFkRr46/SDUXF9sE428o/i47udglOPDyvRtb4/yekBTC
ao9INrXqmpfkwM2QAZRQH6bTDmxoJF4/u1Z+KgF0e2CX+23ZEUjNVuQwsGcBWGJ8
ktvF3agyT/Of97scbiKmxmbyvGF4lqCdEUWr2Aq0kYd2XKRzKvDO5JvqB+Sz7uWQ
ipy4GdcVgEJG6XyjirEyneqcH4Kp0XLf7pYV4V8cdBm1ORBx7igLVgm1mnPwIiAH
xzlRLD/CHG8bLttvd/6iJ3RvTUYhoBFzfiijH4CTKEd/M7wWqzXhdQEGChvD1OCS
DJIrZZZfWP+9IfPMdFBuq3eZ7O5wyroQf8n8jbF2Nql3qZfWy8CeNTwINxCivEHs
PJnXMiug8yNuBBiXagKNlAVUb91ij/VzsQ0Zikh7wQB6odOi680p16xEUrvoNC5V
H+zST1Ep+BI8O7WiYPH3YyoDZtIJcPGYLT4j0ZGY7U3UrPAgF5Wnu6pcCHDD6Azl
zisde72+DWcUXTsIdFLvK1lvy4aC2lkgyqBwTvUQ8s4O0IDymywo+WacuCk1JURL
5dkGsRT8Q31XOvJjp7rf4NZvB8blwHFkwTR4yB5qjiTVCl8x2d+xfq2tDxLw60MT
d/R0RGahDEBddJwmN04=
=Jwmy
-----END PGP SIGNATURE-----
Merge tag 'gcc-plugin-infrastructure-v4.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull gcc plugin improvements from Kees Cook:
"Several fixes/improvements for the gcc plugin infrastructure:
- fix a problem with gcc plugins interfering with cc-option tests.
- abort more gracefully when gcc plugin headers or compiler support
is missing.
- improve the gcc plugin rule generation to be more dynamic, pass
arguments, and build from subdirectories"
* tag 'gcc-plugin-infrastructure-v4.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
gcc-plugins: Add support for plugin subdirectories
gcc-plugins: Automate make rule generation
gcc-plugins: Add support for passing plugin arguments
gcc-plugins: abort builds cleanly when not supported
kbuild: no gcc-plugins during cc-option tests
Pull drm fixes from Dave Airlie:
"This contains a bunch of amdgpu fixes, and some i915 regression fixes.
It also contains some fixes for an older regression with some EDID
changes and some 6bpc panels.
Then there are the lockdep, cirrus and rcar-du regression fixes from
this window"
* tag 'drm-fixes-for-4.8-rc2' of git://people.freedesktop.org/~airlied/linux:
drm/cirrus: Fix NULL pointer dereference when registering the fbdev
drm/edid: Set 8 bpc color depth for displays with "DFP 1.x compliant TMDS".
drm/i915/dp: Revert "drm/i915/dp: fall back to 18 bpp when sink capability is unknown"
drm/edid: Add 6 bpc quirk for display AEO model 0.
drm: Paper over locking inversion after registration rework
drm: rcar-du: Link HDMI encoder with bridge
drm/ttm: Wait for a BO to become idle before unbinding it from GTT
drm/i915/fbdev: Check for the framebuffer before use
drm/amdgpu: update golden setting of polaris10
drm/amdgpu: update golden setting of stoney
drm/amdgpu: update golden setting of polaris11
drm/amdgpu: update golden setting of carrizo
drm/amdgpu: update golden setting of iceland
drm/amd/amdgpu: change pptable output format from ASCII to binary
drm/amdgpu/ci: add mullins to default case for smc ucode
drm/amdgpu/gmc7: add missing mullins case
drm/i915: Never fully mask the the EI up rps interrupt on SNB/IVB
drm/i915: Wait up to 3ms for the pcu to ack the cdclk change request on SKL
Commit b195d5e2bf ("ipr: Wait to do async scan until scsi host is
initialized") fixed async scan for ipr, but broke sync scan for ipr.
This fixes sync scan back up.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Reported-and-tested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
To distinguish non-slab pages charged to kmemcg we mark them PageKmemcg,
which sets page->_mapcount to -512. Currently, we set/clear PageKmemcg
in __alloc_pages_nodemask()/free_pages_prepare() for any page allocated
with __GFP_ACCOUNT, including those that aren't actually charged to any
cgroup, i.e. allocated from the root cgroup context. To avoid overhead
in case cgroups are not used, we only do that if memcg_kmem_enabled() is
true. The latter is set iff there are kmem-enabled memory cgroups
(online or offline). The root cgroup is not considered kmem-enabled.
As a result, if a page is allocated with __GFP_ACCOUNT for the root
cgroup when there are kmem-enabled memory cgroups and is freed after all
kmem-enabled memory cgroups were removed, e.g.
# no memory cgroups has been created yet, create one
mkdir /sys/fs/cgroup/memory/test
# run something allocating pages with __GFP_ACCOUNT, e.g.
# a program using pipe
dmesg | tail
# remove the memory cgroup
rmdir /sys/fs/cgroup/memory/test
we'll get bad page state bug complaining about page->_mapcount != -1:
BUG: Bad page state in process swapper/0 pfn:1fd945c
page:ffffea007f651700 count:0 mapcount:-511 mapping: (null) index:0x0
flags: 0x1000000000000000()
To avoid that, let's mark with PageKmemcg only those pages that are
actually charged to and hence pin a non-root memory cgroup.
Fixes: 4949148ad4 ("mm: charge/uncharge kmemcg from generic page allocator paths")
Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some uio based PCI drivers, e.g., uio_cif, do not work if the assigned PCI
memory resources are not page aligned. By using the kernel option
"pci=resource_alignment=<align>@<bus>:<slot>.<func>" it is possible to
request page alignment for memory resources of devices.
However, this is cumbersome when using several devices, and the
bus/slot/func addresses may change if devices are added to or removed from
the system.
Extend the "pci=resource_alignment" option so we can specify the relevant
devices via PCI vendor, device, subvendor, and subdevice IDs. The
specification of the devices via IDs is indicated by a leading string
"pci:" as argument to "pci=resource_alignment".
The format of the specification is
pci:<vendor>:<device>[:<subvendor>:<subdevice>]
Examples:
pci=resource_alignment=4096@pci:8086:9c22:103c:198f
pci=resource_alignment=pci:8086:9c22 # defaults to PAGE_SIZE align
[bhelgaas: changelog, use actual vendor/device IDs in examples]
Signed-off-by: Mathias Koehrer <mathias.koehrer@etas.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Patch 19a469a587 ("drivers/perf: arm-pmu: Handle per-interrupt
affinity mask") added support for partitionned PPI setups, but
inadvertently broke setups using SPIs without the "interrupt-affinity"
property (which is the case for UP platforms).
This patch restore the broken functionnality by testing whether the
interrupt is percpu or not instead of relying on the using_spi flag
that really means "SPI *and* interrupt-affinity property".
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: 19a469a587 ("drivers/perf: arm-pmu: Handle per-interrupt affinity mask")
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
arm_pmu_mutex is never held long and we don't want to sleep while the
lock is being held as it's executed in the context of hotplug notifiers.
So it can be converted to a simple spinlock instead.
Without this patch we get the following warning:
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper/2
no locks held by swapper/2/0.
irq event stamp: 381314
hardirqs last enabled at (381313): _raw_spin_unlock_irqrestore+0x7c/0x88
hardirqs last disabled at (381314): cpu_die+0x28/0x48
softirqs last enabled at (381294): _local_bh_enable+0x28/0x50
softirqs last disabled at (381293): irq_enter+0x58/0x78
CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.7.0 #12
Call trace:
dump_backtrace+0x0/0x220
show_stack+0x24/0x30
dump_stack+0xb4/0xf0
___might_sleep+0x1d8/0x1f0
__might_sleep+0x5c/0x98
mutex_lock_nested+0x54/0x400
arm_perf_starting_cpu+0x34/0xb0
cpuhp_invoke_callback+0x88/0x3d8
notify_cpu_starting+0x78/0x98
secondary_start_kernel+0x108/0x1a8
This patch converts the mutex to spinlock to eliminate the above
warnings. This constraints pmu->reset to be non-blocking call which is
the case with all the ARM PMU backends.
Cc: Stephen Boyd <sboyd@codeaurora.org>
Fixes: 37b502f121 ("arm/perf: Fix hotplug state machine conversion")
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Once a packet has been posted to a connection in the data_ready handler, we
mustn't try reposting if we then find that the connection is dying as the
refcount has been given over to the dying connection and the packet might
no longer exist.
Losing the packet isn't a problem as the peer will retransmit.
Signed-off-by: David Howells <dhowells@redhat.com>
The call state machine processor sets up the message parameters for a UDP
message that it might need to transmit in advance on the basis that there's
a very good chance it's going to have to transmit either an ACK or an
ABORT. This requires it to look in the connection struct to retrieve some
of the parameters.
However, if the call is complete, the call connection pointer may be NULL
to dissuade the processor from transmitting a message. However, there are
some situations where the processor is still going to be called - and it's
still going to set up message parameters whether it needs them or not.
This results in a NULL pointer dereference at:
net/rxrpc/call_event.c:837
To fix this, skip the message pre-initialisation if there's no connection
attached.
Signed-off-by: David Howells <dhowells@redhat.com>
If rxrpc_new_client_call() fails to make a connection, the call record that
it allocated needs to be marked as RXRPC_CALL_RELEASED before it is passed
to rxrpc_put_call() to indicate that it no longer has any attachment to the
AF_RXRPC socket.
Without this, an assertion failure may occur at:
net/rxrpc/call_object:635
Signed-off-by: David Howells <dhowells@redhat.com>
Due to the limitations of having to wait until we see a device's DMA
restrictions before we know how we want an IOVA domain initialised,
there is a window for error if a DMA ops domain is allocated but later
freed without ever being used. In that case, init_iova_domain() was
never called, so calling put_iova_domain() from iommu_put_dma_cookie()
ends up trying to take an uninitialised lock and crashing.
Make things robust by skipping the call unless the IOVA domain actually
has been initialised, as we probably should have done from the start.
Fixes: 0db2e5d18f ("iommu: Implement common IOMMU ops for DMA mapping")
Cc: stable@vger.kernel.org
Reported-by: Nate Watterson <nwatters@codeaurora.org>
Reviewed-by: Nate Watterson <nwatters@codeaurora.org>
Tested-by: Nate Watterson <nwatters@codeaurora.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
pathbase is the base inode; set it to 0 if we've got no path.
Coverity-id: 146348
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
ceph_file_layout::pool_id is now s64. rbd_add_get_pool_id() and
ceph_pg_poolid_by_name() both return an int, so it's bogus anyway.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Powerpc has Global Entry Point and Local Entry Point for functions. LEP
catches call from both the GEP and the LEP. Symbol table of ELF contains
GEP and Offset from which we can calculate LEP, but debuginfo does not
have LEP info.
Currently, perf prioritize symbol table over dwarf to probe on LEP for
ppc64le. But when user tries to probe with function parameter, we fall
back to using dwarf(i.e. GEP) and when function called via LEP, probe
will never hit.
For example:
$ objdump -d vmlinux
...
do_sys_open():
c0000000002eb4a0: e8 00 4c 3c addis r2,r12,232
c0000000002eb4a4: 60 00 42 38 addi r2,r2,96
c0000000002eb4a8: a6 02 08 7c mflr r0
c0000000002eb4ac: d0 ff 41 fb std r26,-48(r1)
$ sudo ./perf probe do_sys_open
$ sudo cat /sys/kernel/debug/tracing/kprobe_events
p:probe/do_sys_open _text+3060904
$ sudo ./perf probe 'do_sys_open filename:string'
$ sudo cat /sys/kernel/debug/tracing/kprobe_events
p:probe/do_sys_open _text+3060896 filename_string=+0(%gpr4):string
For second case, perf probed on GEP. So when function will be called via
LEP, probe won't hit.
$ sudo ./perf record -a -e probe:do_sys_open ls
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.195 MB perf.data ]
To resolve this issue, let's not prioritize symbol table, let perf
decide what it wants to use. Perf is already converting GEP to LEP when
it uses symbol table. When perf uses debuginfo, let it find LEP offset
form symbol table. This way we fall back to probe on LEP for all cases.
After patch:
$ sudo ./perf probe 'do_sys_open filename:string'
$ sudo cat /sys/kernel/debug/tracing/kprobe_events
p:probe/do_sys_open _text+3060904 filename_string=+0(%gpr4):string
$ sudo ./perf record -a -e probe:do_sys_open ls
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.197 MB perf.data (11 samples) ]
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1470723805-5081-2-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Instead of inline code, introduce function to post process kernel
probe trace events.
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1470723805-5081-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Due to:
1e61f78baf ("x86/cpufeature: Make sure DISABLED/REQUIRED macros are updated")
No changes to tools using those headers (tools/arch/x86/lib/mem{set,cpu}_64.S)
seems necessary.
Detected by the tools build header drift checker:
$ make -C tools/perf O=/tmp/build/perf
make: Entering directory '/home/acme/git/linux/tools/perf'
BUILD: Doing 'make -j4' parallel build
GEN /tmp/build/perf/common-cmds.h
Warning: tools/arch/x86/include/asm/disabled-features.h differs from kernel
Warning: tools/arch/x86/include/asm/required-features.h differs from kernel
Warning: tools/arch/x86/include/asm/cpufeatures.h differs from kernel
CC /tmp/build/perf/util/probe-finder.o
CC /tmp/build/perf/builtin-help.o
<SNIP>
^C$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ja75m7zk8j0jkzmrv16i5ehw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>