Commit graph

29,002 commits

Author SHA1 Message Date
Joel Fernandes (Google)
8af21ac176 BACKPORT: perf_event: Add support for LSM and SELinux checks
In current mainline, the degree of access to perf_event_open(2) system
call depends on the perf_event_paranoid sysctl.  This has a number of
limitations:

1. The sysctl is only a single value. Many types of accesses are controlled
   based on the single value thus making the control very limited and
   coarse grained.
2. The sysctl is global, so if the sysctl is changed, then that means
   all processes get access to perf_event_open(2) opening the door to
   security issues.

This patch adds LSM and SELinux access checking which will be used in
Android to access perf_event_open(2) for the purposes of attaching BPF
programs to tracepoints, perf profiling and other operations from
userspace. These operations are intended for production systems.

5 new LSM hooks are added:
1. perf_event_open: This controls access during the perf_event_open(2)
   syscall itself. The hook is called from all the places that the
   perf_event_paranoid sysctl is checked to keep it consistent with the
   systctl. The hook gets passed a 'type' argument which controls CPU,
   kernel and tracepoint accesses (in this context, CPU, kernel and
   tracepoint have the same semantics as the perf_event_paranoid sysctl).
   Additionally, I added an 'open' type which is similar to
   perf_event_paranoid sysctl == 3 patch carried in Android and several other
   distros but was rejected in mainline [1] in 2016.

2. perf_event_alloc: This allocates a new security object for the event
   which stores the current SID within the event. It will be useful when
   the perf event's FD is passed through IPC to another process which may
   try to read the FD. Appropriate security checks will limit access.

3. perf_event_free: Called when the event is closed.

4. perf_event_read: Called from the read(2) and mmap(2) syscalls for the event.

5. perf_event_write: Called from the ioctl(2) syscalls for the event.

[1] https://lwn.net/Articles/696240/

Since Peter had suggest LSM hooks in 2016 [1], I am adding his
Suggested-by tag below.

To use this patch, we set the perf_event_paranoid sysctl to -1 and then
apply selinux checking as appropriate (default deny everything, and then
add policy rules to give access to domains that need it). In the future
we can remove the perf_event_paranoid sysctl altogether.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Co-developed-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: James Morris <jmorris@namei.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: rostedt@goodmis.org
Cc: Yonghong Song <yhs@fb.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: jeffv@google.com
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: primiano@google.com
Cc: Song Liu <songliubraving@fb.com>
Cc: rsavitski@google.com
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Matthew Garrett <matthewgarrett@google.com>
Link: https://lkml.kernel.org/r/20191014170308.70668-1-joel@joelfernandes.org

Bug: 137092007
Change-Id: I591c6ad6c82ab9133409e51383d2c9b9f6ae4545
(cherry picked from commit da97e18458)
[ Ryan Savitski:
  Resolved conflicts with existing code, no new functionality ]
Signed-off-by: Ryan Savitski <rsavitski@google.com>
2020-01-07 22:30:02 +00:00
Greg Kroah-Hartman
287ec341d6 This is the 4.19.93 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl4Q1jAACgkQONu9yGCS
 aT7vqg/9FEBVO/NARJYQ/R7Z6L4fQUNgHmFI0y9iaTP2nlHuVuBvMHJdF7BmidHF
 9iwe/lctPobgoknUoA3nmt8WmPmCaKbFhABsS03sz1Q5Z+IC1g218s4SUppER3fB
 YlgqRDjKY0wwk2MPAOgIPaRQCNSiaVZFo+bH1Mxrj77m8D7NHKXiZPlrbDunVlEB
 NA0DWOyb04JehRoRNbKTHzLBs/VfZ0LhxEO5sS17M2hhOauYAKAFmSzdMPJwv4ka
 qiCR+4zWYR5LF64mG5jxmerhUjIOrhRUc+334//WH4jCuo9xjKrCmxLIjqR7wwHC
 dK4Apu128Ujl4boHxLrFKIG3f2K19gZz6h+sWrcxjTzZ/YWPYjPI4atuWrZEJIG5
 nhhcz4fZfLAxMNm51kM9i4WAcP2k+CX1ynD0AuzXIZXs+t+xOoaUtYeFHc/tpmig
 P/AA4eAYjojQHPUwNeR+8GmjOGPfwSuTNkd6PqAaaI1cvGtHK0y5M38FNrut+I1k
 pvYvWOvtvWOsR6YaJviU2HF7uNFX0saNqJ4Ahmm/nxdlxOKRcKDIzDI7ibwcwEOQ
 E20SZdPQG/oiaXq0itSstpDuYJ9hKr5YehPS7uAXvy0RT/H7J5cpSZuCUK74J4Zr
 rC2D5M99rW9aztpfEQxU6CTluIGLZ+eBp2pKTU420jkySxmOo6o=
 =qgtu
 -----END PGP SIGNATURE-----

Merge 4.19.93 into android-4.19

Changes in 4.19.93
	scsi: lpfc: Fix discovery failures when target device connectivity bounces
	scsi: mpt3sas: Fix clear pending bit in ioctl status
	scsi: lpfc: Fix locking on mailbox command completion
	Input: atmel_mxt_ts - disable IRQ across suspend
	f2fs: fix to update time in lazytime mode
	iommu: rockchip: Free domain on .domain_free
	iommu/tegra-smmu: Fix page tables in > 4 GiB memory
	dmaengine: xilinx_dma: Clear desc_pendingcount in xilinx_dma_reset
	scsi: target: compare full CHAP_A Algorithm strings
	scsi: lpfc: Fix SLI3 hba in loop mode not discovering devices
	scsi: csiostor: Don't enable IRQs too early
	scsi: hisi_sas: Replace in_softirq() check in hisi_sas_task_exec()
	powerpc/pseries: Mark accumulate_stolen_time() as notrace
	powerpc/pseries: Don't fail hash page table insert for bolted mapping
	powerpc/tools: Don't quote $objdump in scripts
	dma-debug: add a schedule point in debug_dma_dump_mappings()
	leds: lm3692x: Handle failure to probe the regulator
	clocksource/drivers/asm9260: Add a check for of_clk_get
	clocksource/drivers/timer-of: Use unique device name instead of timer
	powerpc/security/book3s64: Report L1TF status in sysfs
	powerpc/book3s64/hash: Add cond_resched to avoid soft lockup warning
	ext4: update direct I/O read lock pattern for IOCB_NOWAIT
	ext4: iomap that extends beyond EOF should be marked dirty
	jbd2: Fix statistics for the number of logged blocks
	scsi: tracing: Fix handling of TRANSFER LENGTH == 0 for READ(6) and WRITE(6)
	scsi: lpfc: Fix duplicate unreg_rpi error in port offline flow
	f2fs: fix to update dir's i_pino during cross_rename
	clk: qcom: Allow constant ratio freq tables for rcg
	clk: clk-gpio: propagate rate change to parent
	irqchip/irq-bcm7038-l1: Enable parent IRQ if necessary
	irqchip: ingenic: Error out if IRQ domain creation failed
	fs/quota: handle overflows of sysctl fs.quota.* and report as unsigned long
	scsi: lpfc: fix: Coverity: lpfc_cmpl_els_rsp(): Null pointer dereferences
	PCI: rpaphp: Fix up pointer to first drc-info entry
	scsi: ufs: fix potential bug which ends in system hang
	powerpc/pseries/cmm: Implement release() function for sysfs device
	PCI: rpaphp: Don't rely on firmware feature to imply drc-info support
	PCI: rpaphp: Annotate and correctly byte swap DRC properties
	PCI: rpaphp: Correctly match ibm, my-drc-index to drc-name when using drc-info
	powerpc/security: Fix wrong message when RFI Flush is disable
	scsi: atari_scsi: sun3_scsi: Set sg_tablesize to 1 instead of SG_NONE
	clk: pxa: fix one of the pxa RTC clocks
	bcache: at least try to shrink 1 node in bch_mca_scan()
	HID: quirks: Add quirk for HP MSU1465 PIXART OEM mouse
	HID: logitech-hidpp: Silence intermittent get_battery_capacity errors
	ARM: 8937/1: spectre-v2: remove Brahma-B53 from hardening
	libnvdimm/btt: fix variable 'rc' set but not used
	HID: Improve Windows Precision Touchpad detection.
	HID: rmi: Check that the RMI_STARTED bit is set before unregistering the RMI transport device
	watchdog: Fix the race between the release of watchdog_core_data and cdev
	scsi: pm80xx: Fix for SATA device discovery
	scsi: ufs: Fix error handing during hibern8 enter
	scsi: scsi_debug: num_tgts must be >= 0
	scsi: NCR5380: Add disconnect_mask module parameter
	scsi: iscsi: Don't send data to unbound connection
	scsi: target: iscsi: Wait for all commands to finish before freeing a session
	gpio: mpc8xxx: Don't overwrite default irq_set_type callback
	apparmor: fix unsigned len comparison with less than zero
	scripts/kallsyms: fix definitely-lost memory leak
	powerpc: Don't add -mabi= flags when building with Clang
	cdrom: respect device capabilities during opening action
	perf script: Fix brstackinsn for AUXTRACE
	perf regs: Make perf_reg_name() return "unknown" instead of NULL
	s390/zcrypt: handle new reply code FILTERED_BY_HYPERVISOR
	libfdt: define INT32_MAX and UINT32_MAX in libfdt_env.h
	s390/cpum_sf: Check for SDBT and SDB consistency
	ocfs2: fix passing zero to 'PTR_ERR' warning
	mailbox: imx: Fix Tx doorbell shutdown path
	kernel: sysctl: make drop_caches write-only
	userfaultfd: require CAP_SYS_PTRACE for UFFD_FEATURE_EVENT_FORK
	Revert "powerpc/vcpu: Assume dedicated processors as non-preempt"
	x86/mce: Fix possibly incorrect severity calculation on AMD
	net, sysctl: Fix compiler warning when only cBPF is present
	netfilter: nf_queue: enqueue skbs with NULL dst
	ALSA: hda - Downgrade error message for single-cmd fallback
	bonding: fix active-backup transition after link failure
	perf strbuf: Remove redundant va_end() in strbuf_addv()
	Make filldir[64]() verify the directory entry filename is valid
	filldir[64]: remove WARN_ON_ONCE() for bad directory entries
	netfilter: ebtables: compat: reject all padding in matches/watchers
	6pack,mkiss: fix possible deadlock
	netfilter: bridge: make sure to pull arp header in br_nf_forward_arp()
	inetpeer: fix data-race in inet_putpeer / inet_putpeer
	net: add a READ_ONCE() in skb_peek_tail()
	net: icmp: fix data-race in cmp_global_allow()
	hrtimer: Annotate lockless access to timer->state
	net: ena: fix napi handler misbehavior when the napi budget is zero
	net/mlxfw: Fix out-of-memory error in mfa2 flash burning
	net: stmmac: dwmac-meson8b: Fix the RGMII TX delay on Meson8b/8m2 SoCs
	ptp: fix the race between the release of ptp_clock and cdev
	tcp: Fix highest_sack and highest_sack_seq
	udp: fix integer overflow while computing available space in sk_rcvbuf
	vhost/vsock: accept only packets with the right dst_cid
	net: add bool confirm_neigh parameter for dst_ops.update_pmtu
	ip6_gre: do not confirm neighbor when do pmtu update
	gtp: do not confirm neighbor when do pmtu update
	net/dst: add new function skb_dst_update_pmtu_no_confirm
	tunnel: do not confirm neighbor when do pmtu update
	vti: do not confirm neighbor when do pmtu update
	sit: do not confirm neighbor when do pmtu update
	net/dst: do not confirm neighbor for vxlan and geneve pmtu update
	gtp: do not allow adding duplicate tid and ms_addr pdp context
	net: marvell: mvpp2: phylink requires the link interrupt
	tcp/dccp: fix possible race __inet_lookup_established()
	tcp: do not send empty skb from tcp_write_xmit()
	gtp: fix wrong condition in gtp_genl_dump_pdp()
	gtp: fix an use-after-free in ipv4_pdp_find()
	gtp: avoid zero size hashtable
	spi: fsl: don't map irq during probe
	tty/serial: atmel: fix out of range clock divider handling
	pinctrl: baytrail: Really serialize all register accesses
	spi: fsl: use platform_get_irq() instead of of_irq_to_resource()
	Linux 4.19.93

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ie31b3fba19c5a45be0b85f272bc50cb8b67ea3c0
2020-01-04 19:29:03 +01:00
Vladis Dronov
0393b87201 ptp: fix the race between the release of ptp_clock and cdev
[ Upstream commit a33121e548 ]

In a case when a ptp chardev (like /dev/ptp0) is open but an underlying
device is removed, closing this file leads to a race. This reproduces
easily in a kvm virtual machine:

ts# cat openptp0.c
int main() { ... fp = fopen("/dev/ptp0", "r"); ... sleep(10); }
ts# uname -r
5.5.0-rc3-46cf053e
ts# cat /proc/cmdline
... slub_debug=FZP
ts# modprobe ptp_kvm
ts# ./openptp0 &
[1] 670
opened /dev/ptp0, sleeping 10s...
ts# rmmod ptp_kvm
ts# ls /dev/ptp*
ls: cannot access '/dev/ptp*': No such file or directory
ts# ...woken up
[   48.010809] general protection fault: 0000 [#1] SMP
[   48.012502] CPU: 6 PID: 658 Comm: openptp0 Not tainted 5.5.0-rc3-46cf053e #25
[   48.014624] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ...
[   48.016270] RIP: 0010:module_put.part.0+0x7/0x80
[   48.017939] RSP: 0018:ffffb3850073be00 EFLAGS: 00010202
[   48.018339] RAX: 000000006b6b6b6b RBX: 6b6b6b6b6b6b6b6b RCX: ffff89a476c00ad0
[   48.018936] RDX: fffff65a08d3ea08 RSI: 0000000000000247 RDI: 6b6b6b6b6b6b6b6b
[   48.019470] ...                                              ^^^ a slub poison
[   48.023854] Call Trace:
[   48.024050]  __fput+0x21f/0x240
[   48.024288]  task_work_run+0x79/0x90
[   48.024555]  do_exit+0x2af/0xab0
[   48.024799]  ? vfs_write+0x16a/0x190
[   48.025082]  do_group_exit+0x35/0x90
[   48.025387]  __x64_sys_exit_group+0xf/0x10
[   48.025737]  do_syscall_64+0x3d/0x130
[   48.026056]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   48.026479] RIP: 0033:0x7f53b12082f6
[   48.026792] ...
[   48.030945] Modules linked in: ptp i6300esb watchdog [last unloaded: ptp_kvm]
[   48.045001] Fixing recursive fault but reboot is needed!

This happens in:

static void __fput(struct file *file)
{   ...
    if (file->f_op->release)
        file->f_op->release(inode, file); <<< cdev is kfree'd here
    if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL &&
             !(mode & FMODE_PATH))) {
        cdev_put(inode->i_cdev); <<< cdev fields are accessed here

Namely:

__fput()
  posix_clock_release()
    kref_put(&clk->kref, delete_clock) <<< the last reference
      delete_clock()
        delete_ptp_clock()
          kfree(ptp) <<< cdev is embedded in ptp
  cdev_put
    module_put(p->owner) <<< *p is kfree'd, bang!

Here cdev is embedded in posix_clock which is embedded in ptp_clock.
The race happens because ptp_clock's lifetime is controlled by two
refcounts: kref and cdev.kobj in posix_clock. This is wrong.

Make ptp_clock's sysfs device a parent of cdev with cdev_device_add()
created especially for such cases. This way the parent device with its
ptp_clock is not released until all references to the cdev are released.
This adds a requirement that an initialized but not exposed struct
device should be provided to posix_clock_register() by a caller instead
of a simple dev_t.

This approach was adopted from the commit 72139dfa24 ("watchdog: Fix
the race between the release of watchdog_core_data and cdev"). See
details of the implementation in the commit 233ed09d7f ("chardev: add
helper function to register char devs with a struct device").

Link: https://lore.kernel.org/linux-fsdevel/20191125125342.6189-1-vdronov@redhat.com/T/#u
Analyzed-by: Stephen Johnston <sjohnsto@redhat.com>
Analyzed-by: Vern Lovejoy <vlovejoy@redhat.com>
Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-04 19:13:35 +01:00
Eric Dumazet
779807c74a hrtimer: Annotate lockless access to timer->state
commit 56144737e6 upstream.

syzbot reported various data-race caused by hrtimer_is_queued() reading
timer->state. A READ_ONCE() is required there to silence the warning.

Also add the corresponding WRITE_ONCE() when timer->state is set.

In remove_hrtimer() the hrtimer_is_queued() helper is open coded to avoid
loading timer->state twice.

KCSAN reported these cases:

BUG: KCSAN: data-race in __remove_hrtimer / tcp_pacing_check

write to 0xffff8880b2a7d388 of 1 bytes by interrupt on cpu 0:
 __remove_hrtimer+0x52/0x130 kernel/time/hrtimer.c:991
 __run_hrtimer kernel/time/hrtimer.c:1496 [inline]
 __hrtimer_run_queues+0x250/0x600 kernel/time/hrtimer.c:1576
 hrtimer_run_softirq+0x10e/0x150 kernel/time/hrtimer.c:1593
 __do_softirq+0x115/0x33f kernel/softirq.c:292
 run_ksoftirqd+0x46/0x60 kernel/softirq.c:603
 smpboot_thread_fn+0x37d/0x4a0 kernel/smpboot.c:165
 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352

read to 0xffff8880b2a7d388 of 1 bytes by task 24652 on cpu 1:
 tcp_pacing_check net/ipv4/tcp_output.c:2235 [inline]
 tcp_pacing_check+0xba/0x130 net/ipv4/tcp_output.c:2225
 tcp_xmit_retransmit_queue+0x32c/0x5a0 net/ipv4/tcp_output.c:3044
 tcp_xmit_recovery+0x7c/0x120 net/ipv4/tcp_input.c:3558
 tcp_ack+0x17b6/0x3170 net/ipv4/tcp_input.c:3717
 tcp_rcv_established+0x37e/0xf50 net/ipv4/tcp_input.c:5696
 tcp_v4_do_rcv+0x381/0x4e0 net/ipv4/tcp_ipv4.c:1561
 sk_backlog_rcv include/net/sock.h:945 [inline]
 __release_sock+0x135/0x1e0 net/core/sock.c:2435
 release_sock+0x61/0x160 net/core/sock.c:2951
 sk_stream_wait_memory+0x3d7/0x7c0 net/core/stream.c:145
 tcp_sendmsg_locked+0xb47/0x1f30 net/ipv4/tcp.c:1393
 tcp_sendmsg+0x39/0x60 net/ipv4/tcp.c:1434
 inet_sendmsg+0x6d/0x90 net/ipv4/af_inet.c:807
 sock_sendmsg_nosec net/socket.c:637 [inline]
 sock_sendmsg+0x9f/0xc0 net/socket.c:657

BUG: KCSAN: data-race in __remove_hrtimer / __tcp_ack_snd_check

write to 0xffff8880a3a65588 of 1 bytes by interrupt on cpu 0:
 __remove_hrtimer+0x52/0x130 kernel/time/hrtimer.c:991
 __run_hrtimer kernel/time/hrtimer.c:1496 [inline]
 __hrtimer_run_queues+0x250/0x600 kernel/time/hrtimer.c:1576
 hrtimer_run_softirq+0x10e/0x150 kernel/time/hrtimer.c:1593
 __do_softirq+0x115/0x33f kernel/softirq.c:292
 invoke_softirq kernel/softirq.c:373 [inline]
 irq_exit+0xbb/0xe0 kernel/softirq.c:413
 exiting_irq arch/x86/include/asm/apic.h:536 [inline]
 smp_apic_timer_interrupt+0xe6/0x280 arch/x86/kernel/apic/apic.c:1137
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830

read to 0xffff8880a3a65588 of 1 bytes by task 22891 on cpu 1:
 __tcp_ack_snd_check+0x415/0x4f0 net/ipv4/tcp_input.c:5265
 tcp_ack_snd_check net/ipv4/tcp_input.c:5287 [inline]
 tcp_rcv_established+0x750/0xf50 net/ipv4/tcp_input.c:5708
 tcp_v4_do_rcv+0x381/0x4e0 net/ipv4/tcp_ipv4.c:1561
 sk_backlog_rcv include/net/sock.h:945 [inline]
 __release_sock+0x135/0x1e0 net/core/sock.c:2435
 release_sock+0x61/0x160 net/core/sock.c:2951
 sk_stream_wait_memory+0x3d7/0x7c0 net/core/stream.c:145
 tcp_sendmsg_locked+0xb47/0x1f30 net/ipv4/tcp.c:1393
 tcp_sendmsg+0x39/0x60 net/ipv4/tcp.c:1434
 inet_sendmsg+0x6d/0x90 net/ipv4/af_inet.c:807
 sock_sendmsg_nosec net/socket.c:637 [inline]
 sock_sendmsg+0x9f/0xc0 net/socket.c:657
 __sys_sendto+0x21f/0x320 net/socket.c:1952
 __do_sys_sendto net/socket.c:1964 [inline]
 __se_sys_sendto net/socket.c:1960 [inline]
 __x64_sys_sendto+0x89/0xb0 net/socket.c:1960
 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 24652 Comm: syz-executor.3 Not tainted 5.4.0-rc3+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

[ tglx: Added comments ]

Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191106174804.74723-1-edumazet@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-04 19:13:32 +01:00
Johannes Weiner
acb265a5cc kernel: sysctl: make drop_caches write-only
[ Upstream commit 204cb79ad4 ]

Currently, the drop_caches proc file and sysctl read back the last value
written, suggesting this is somehow a stateful setting instead of a
one-time command.  Make it write-only, like e.g.  compact_memory.

While mitigating a VM problem at scale in our fleet, there was confusion
about whether writing to this file will permanently switch the kernel into
a non-caching mode.  This influences the decision making in a tense
situation, where tens of people are trying to fix tens of thousands of
affected machines: Do we need a rollback strategy?  What are the
performance implications of operating in a non-caching state for several
days?  It also caused confusion when the kernel team said we may need to
write the file several times to make sure it's effective ("But it already
reads back 3?").

Link: http://lkml.kernel.org/r/20191031221602.9375-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Chris Down <chris@chrisdown.name>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:13:17 +01:00
Eric Dumazet
ea44828c41 dma-debug: add a schedule point in debug_dma_dump_mappings()
[ Upstream commit 9ff6aa027d ]

debug_dma_dump_mappings() can take a lot of cpu cycles :

lpk43:/# time wc -l /sys/kernel/debug/dma-api/dump
163435 /sys/kernel/debug/dma-api/dump

real	0m0.463s
user	0m0.003s
sys	0m0.459s

Let's add a cond_resched() to avoid holding cpu for too long.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Corentin Labbe <clabbe@baylibre.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04 19:12:43 +01:00
chenqiwu
146c22ecde UPSTREAM: exit: panic before exit_mm() on global init exit
Currently, when global init and all threads in its thread-group have exited
we panic via:
do_exit()
-> exit_notify()
   -> forget_original_parent()
      -> find_child_reaper()
This makes it hard to extract a useable coredump for global init from a
kernel crashdump because by the time we panic exit_mm() will have already
released global init's mm.
This patch moves the panic futher up before exit_mm() is called. As was the
case previously, we only panic when global init and all its threads in the
thread-group have exited.

Signed-off-by: chenqiwu <chenqiwu@xiaomi.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
[christian.brauner@ubuntu.com: fix typo, rewrite commit message]
Link: https://lore.kernel.org/r/1576736993-10121-1-git-send-email-qiwuchen55@gmail.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
(cherry picked from commit 43cf75d964)
Bug: 146789558
Change-Id: Icff81267e8c49bf1d332773351d1b47cb8cbac4a
Signed-off-by: Alistair Delva <adelva@google.com>
2020-01-03 23:18:17 +00:00
Greg Kroah-Hartman
c8a465e614 This is the 4.19.92 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl4La9gACgkQONu9yGCS
 aT6hlA//TDpj9rdEwkaKyg/Ge4TCOJSOiwlp2/5lg2Sroiuizz527hVybGOOYAHl
 gMA2Syt73PWStyfgl5B3AimcBvPADX8h/b1KiSoIdHFkq5rPFyneB6aEj+5jSK1V
 63UnnTV0T49wt0Jvs6nN0FxI4ZCXbfjzaSVz4BGIflz6h9UUkPAu91CJTKtPmrAp
 pliH20cMOykxyS/KfKa6zDcpIfU0k+DxL5U0Y5F1YRDKc1iPg8e6I3cNLgwKSja6
 21BgdoTyZdvbC85HxSY7V6Dswp4YQPBY3y8crp8npZ9apbYV7eNU3L1+WVQvxpFg
 kahhyjalqwqkKq+cTEsIFj7cjPksSlH/qytTS+lnN3BScXbFPp8GdzIazhQNSCv3
 S/7T51CcvNoVcs9Qeu+nwyvx+H1LH4MYO4C7RYWZhPnMcA+/MxvT5WXNKfjf2ekM
 N5h8xNATllzDuDkX+zVwW8i80SCyhVqQIKbXLn8ugGYW3G5TNdy8Ysh0kdrq26Y+
 LAELsbQhK/Kt8WF+XNBpb9LLbeUGn1GTwhnbEuD7IKI+bVxnmsGk8QUu3h+a9xFh
 lI7bsj8Ku9T+59/9xqAnoStEto+0tdTPB9Cx1jNdWlLiVdkewiDKiUbloFpDFS1n
 L3SvqB68DC/IznQcK970g3aIx9zbkb2KZRdj2Fu7apaY5D9q85I=
 =W+5k
 -----END PGP SIGNATURE-----

Merge 4.19.92 into android-4.19

Changes in 4.19.92
	af_packet: set defaule value for tmo
	fjes: fix missed check in fjes_acpi_add
	mod_devicetable: fix PHY module format
	net: dst: Force 4-byte alignment of dst_metrics
	net: gemini: Fix memory leak in gmac_setup_txqs
	net: hisilicon: Fix a BUG trigered by wrong bytes_compl
	net: nfc: nci: fix a possible sleep-in-atomic-context bug in nci_uart_tty_receive()
	net: qlogic: Fix error paths in ql_alloc_large_buffers()
	net: usb: lan78xx: Fix suspend/resume PHY register access error
	qede: Disable hardware gro when xdp prog is installed
	qede: Fix multicast mac configuration
	sctp: fully initialize v4 addr in some functions
	selftests: forwarding: Delete IPv6 address at the end
	btrfs: don't double lock the subvol_sem for rename exchange
	btrfs: do not call synchronize_srcu() in inode_tree_del
	Btrfs: fix missing data checksums after replaying a log tree
	btrfs: send: remove WARN_ON for readonly mount
	btrfs: abort transaction after failed inode updates in create_subvol
	btrfs: skip log replay on orphaned roots
	btrfs: do not leak reloc root if we fail to read the fs root
	btrfs: handle ENOENT in btrfs_uuid_tree_iterate
	Btrfs: fix removal logic of the tree mod log that leads to use-after-free issues
	ALSA: pcm: Avoid possible info leaks from PCM stream buffers
	ALSA: hda/ca0132 - Keep power on during processing DSP response
	ALSA: hda/ca0132 - Avoid endless loop
	ALSA: hda/ca0132 - Fix work handling in delayed HP detection
	drm: mst: Fix query_payload ack reply struct
	drm/panel: Add missing drm_panel_init() in panel drivers
	drm/bridge: analogix-anx78xx: silence -EPROBE_DEFER warnings
	iio: light: bh1750: Resolve compiler warning and make code more readable
	drm/amdgpu: grab the id mgr lock while accessing passid_mapping
	spi: Add call to spi_slave_abort() function when spidev driver is released
	staging: rtl8192u: fix multiple memory leaks on error path
	staging: rtl8188eu: fix possible null dereference
	rtlwifi: prevent memory leak in rtl_usb_probe
	libertas: fix a potential NULL pointer dereference
	ath10k: fix backtrace on coredump
	IB/iser: bound protection_sg size by data_sg size
	media: am437x-vpfe: Setting STD to current value is not an error
	media: i2c: ov2659: fix s_stream return value
	media: ov6650: Fix crop rectangle alignment not passed back
	media: i2c: ov2659: Fix missing 720p register config
	media: ov6650: Fix stored frame format not in sync with hardware
	media: ov6650: Fix stored crop rectangle not in sync with hardware
	tools/power/cpupower: Fix initializer override in hsw_ext_cstates
	media: venus: core: Fix msm8996 frequency table
	ath10k: fix offchannel tx failure when no ath10k_mac_tx_frm_has_freq
	pinctrl: devicetree: Avoid taking direct reference to device name string
	drm/amdkfd: fix a potential NULL pointer dereference (v2)
	selftests/bpf: Correct path to include msg + path
	media: venus: Fix occasionally failures to suspend
	usb: renesas_usbhs: add suspend event support in gadget mode
	hwrng: omap3-rom - Call clk_disable_unprepare() on exit only if not idled
	regulator: max8907: Fix the usage of uninitialized variable in max8907_regulator_probe()
	media: flexcop-usb: fix NULL-ptr deref in flexcop_usb_transfer_init()
	media: cec-funcs.h: add status_req checks
	drm/bridge: dw-hdmi: Refuse DDC/CI transfers on the internal I2C controller
	samples: pktgen: fix proc_cmd command result check logic
	block: Fix writeback throttling W=1 compiler warnings
	mwifiex: pcie: Fix memory leak in mwifiex_pcie_init_evt_ring
	drm/drm_vblank: Change EINVAL by the correct errno
	media: cx88: Fix some error handling path in 'cx8800_initdev()'
	media: ti-vpe: vpe: Fix Motion Vector vpdma stride
	media: ti-vpe: vpe: fix a v4l2-compliance warning about invalid pixel format
	media: ti-vpe: vpe: fix a v4l2-compliance failure about frame sequence number
	media: ti-vpe: vpe: Make sure YUYV is set as default format
	media: ti-vpe: vpe: fix a v4l2-compliance failure causing a kernel panic
	media: ti-vpe: vpe: ensure buffers are cleaned up properly in abort cases
	media: ti-vpe: vpe: fix a v4l2-compliance failure about invalid sizeimage
	syscalls/x86: Use the correct function type in SYSCALL_DEFINE0
	drm/amd/display: Fix dongle_caps containing stale information.
	extcon: sm5502: Reset registers during initialization
	x86/mm: Use the correct function type for native_set_fixmap()
	ath10k: Correct error handling of dma_map_single()
	drm/bridge: dw-hdmi: Restore audio when setting a mode
	perf test: Report failure for mmap events
	perf report: Add warning when libunwind not compiled in
	usb: usbfs: Suppress problematic bind and unbind uevents.
	iio: adc: max1027: Reset the device at probe time
	Bluetooth: missed cpu_to_le16 conversion in hci_init4_req
	Bluetooth: Workaround directed advertising bug in Broadcom controllers
	Bluetooth: hci_core: fix init for HCI_USER_CHANNEL
	bpf/stackmap: Fix deadlock with rq_lock in bpf_get_stack()
	x86/mce: Lower throttling MCE messages' priority to warning
	perf tests: Disable bp_signal testing for arm64
	drm/gma500: fix memory disclosures due to uninitialized bytes
	rtl8xxxu: fix RTL8723BU connection failure issue after warm reboot
	ipmi: Don't allow device module unload when in use
	x86/ioapic: Prevent inconsistent state when moving an interrupt
	media: smiapp: Register sensor after enabling runtime PM on the device
	md/bitmap: avoid race window between md_bitmap_resize and bitmap_file_clear_bit
	arm64: psci: Reduce the waiting time for cpu_psci_cpu_kill()
	i40e: initialize ITRN registers with correct values
	net: phy: dp83867: enable robust auto-mdix
	drm/tegra: sor: Use correct SOR index on Tegra210
	spi: sprd: adi: Add missing lock protection when rebooting
	ACPI: button: Add DMI quirk for Medion Akoya E2215T
	RDMA/qedr: Fix memory leak in user qp and mr
	gpu: host1x: Allocate gather copy for host1x
	net: dsa: LAN9303: select REGMAP when LAN9303 enable
	phy: qcom-usb-hs: Fix extcon double register after power cycle
	s390/time: ensure get_clock_monotonic() returns monotonic values
	s390/mm: add mm_pxd_folded() checks to pxd_free()
	net: hns3: add struct netdev_queue debug info for TX timeout
	libata: Ensure ata_port probe has completed before detach
	loop: fix no-unmap write-zeroes request behavior
	pinctrl: sh-pfc: sh7734: Fix duplicate TCLK1_B
	iio: dln2-adc: fix iio_triggered_buffer_postenable() position
	libbpf: Fix error handling in bpf_map__reuse_fd()
	Bluetooth: Fix advertising duplicated flags
	pinctrl: amd: fix __iomem annotation in amd_gpio_irq_handler()
	ixgbe: protect TX timestamping from API misuse
	media: rcar_drif: fix a memory disclosure
	media: v4l2-core: fix touch support in v4l_g_fmt
	nvmem: imx-ocotp: reset error status on probe
	rfkill: allocate static minor
	bnx2x: Fix PF-VF communication over multi-cos queues.
	spi: img-spfi: fix potential double release
	ALSA: timer: Limit max amount of slave instances
	rtlwifi: fix memory leak in rtl92c_set_fw_rsvdpagepkt()
	perf probe: Fix to find range-only function instance
	perf probe: Fix to list probe event with correct line number
	perf jevents: Fix resource leak in process_mapfile() and main()
	perf probe: Walk function lines in lexical blocks
	perf probe: Fix to probe an inline function which has no entry pc
	perf probe: Fix to show ranges of variables in functions without entry_pc
	perf probe: Fix to show inlined function callsite without entry_pc
	libsubcmd: Use -O0 with DEBUG=1
	perf probe: Fix to probe a function which has no entry pc
	perf tools: Splice events onto evlist even on error
	drm/amdgpu: disallow direct upload save restore list from gfx driver
	drm/amdgpu: fix potential double drop fence reference
	xen/gntdev: Use select for DMA_SHARED_BUFFER
	perf parse: If pmu configuration fails free terms
	perf probe: Skip overlapped location on searching variables
	perf probe: Return a better scope DIE if there is no best scope
	perf probe: Fix to show calling lines of inlined functions
	perf probe: Skip end-of-sequence and non statement lines
	perf probe: Filter out instances except for inlined subroutine and subprogram
	ath10k: fix get invalid tx rate for Mesh metric
	fsi: core: Fix small accesses and unaligned offsets via sysfs
	media: pvrusb2: Fix oops on tear-down when radio support is not present
	soundwire: intel: fix PDI/stream mapping for Bulk
	crypto: atmel - Fix authenc support when it is set to m
	ice: delay less
	media: si470x-i2c: add missed operations in remove
	EDAC/ghes: Fix grain calculation
	spi: pxa2xx: Add missed security checks
	ASoC: rt5677: Mark reg RT5677_PWR_ANLG2 as volatile
	iio: dac: ad5446: Add support for new AD5600 DAC
	ASoC: Intel: kbl_rt5663_rt5514_max98927: Add dmic format constraint
	s390/disassembler: don't hide instruction addresses
	nvme: Discard workaround for non-conformant devices
	parport: load lowlevel driver if ports not found
	bcache: fix static checker warning in bcache_device_free()
	cpufreq: Register drivers only after CPU devices have been registered
	x86/crash: Add a forward declaration of struct kimage
	tracing: use kvcalloc for tgid_map array allocation
	tracing/kprobe: Check whether the non-suffixed symbol is notrace
	bcache: fix deadlock in bcache_allocator
	iwlwifi: mvm: fix unaligned read of rx_pkt_status
	ASoC: wm8904: fix regcache handling
	spi: tegra20-slink: add missed clk_unprepare
	tun: fix data-race in gro_normal_list()
	crypto: virtio - deal with unsupported input sizes
	mmc: tmio: Add MMC_CAP_ERASE to allow erase/discard/trim requests
	btrfs: don't prematurely free work in end_workqueue_fn()
	btrfs: don't prematurely free work in run_ordered_work()
	ASoC: wm2200: add missed operations in remove and probe failure
	spi: st-ssc4: add missed pm_runtime_disable
	ASoC: wm5100: add missed pm_runtime_disable
	ASoC: Intel: bytcr_rt5640: Update quirk for Acer Switch 10 SW5-012 2-in-1
	x86/insn: Add some Intel instructions to the opcode map
	brcmfmac: remove monitor interface when detaching
	iwlwifi: check kasprintf() return value
	fbtft: Make sure string is NULL terminated
	net: ethernet: ti: ale: clean ale tbl on init and intf restart
	crypto: sun4i-ss - Fix 64-bit size_t warnings
	crypto: sun4i-ss - Fix 64-bit size_t warnings on sun4i-ss-hash.c
	mac80211: consider QoS Null frames for STA_NULLFUNC_ACKED
	crypto: vmx - Avoid weird build failures
	libtraceevent: Fix memory leakage in copy_filter_type
	mips: fix build when "48 bits virtual memory" is enabled
	drm/amdgpu: fix bad DMA from INTERRUPT_CNTL2
	net: phy: initialise phydev speed and duplex sanely
	btrfs: don't prematurely free work in reada_start_machine_worker()
	btrfs: don't prematurely free work in scrub_missing_raid56_worker()
	Revert "mmc: sdhci: Fix incorrect switch to HS mode"
	mmc: mediatek: fix CMD_TA to 2 for MT8173 HS200/HS400 mode
	can: kvaser_usb: kvaser_usb_leaf: Fix some info-leaks to USB devices
	usb: xhci: Fix build warning seen with CONFIG_PM=n
	drm/amdgpu: fix uninitialized variable pasid_mapping_needed
	s390/ftrace: fix endless recursion in function_graph tracer
	btrfs: return error pointer from alloc_test_extent_buffer
	usbip: Fix receive error in vhci-hcd when using scatter-gather
	usbip: Fix error path of vhci_recv_ret_submit()
	cpufreq: Avoid leaving stale IRQ work items during CPU offline
	USB: EHCI: Do not return -EPIPE when hub is disconnected
	intel_th: pci: Add Comet Lake PCH-V support
	intel_th: pci: Add Elkhart Lake SOC support
	platform/x86: hp-wmi: Make buffer for HPWMI_FEATURE2_QUERY 128 bytes
	staging: comedi: gsc_hpdi: check dma_alloc_coherent() return value
	ext4: fix ext4_empty_dir() for directories with holes
	ext4: check for directory entries too close to block end
	ext4: unlock on error in ext4_expand_extra_isize()
	KVM: arm64: Ensure 'params' is initialised when looking up sys register
	x86/MCE/AMD: Do not use rdmsr_safe_on_cpu() in smca_configure()
	x86/MCE/AMD: Allow Reserved types to be overwritten in smca_banks[]
	powerpc/vcpu: Assume dedicated processors as non-preempt
	powerpc/irq: fix stack overflow verification
	mmc: sdhci-msm: Correct the offset and value for DDR_CONFIG register
	mmc: sdhci-of-esdhc: Revert "mmc: sdhci-of-esdhc: add erratum A-009204 support"
	mmc: sdhci: Update the tuning failed messages to pr_debug level
	mmc: sdhci-of-esdhc: fix P2020 errata handling
	mmc: sdhci: Workaround broken command queuing on Intel GLK
	mmc: sdhci: Add a quirk for broken command queuing
	nbd: fix shutdown and recv work deadlock v2
	perf probe: Fix to show function entry line as probe-able
	Linux 4.19.92

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ic4c7f9c713549ebb3319cd0275e88678bfa0e53d
2019-12-31 17:11:54 +01:00
Rafael J. Wysocki
e82f0540a0 cpufreq: Avoid leaving stale IRQ work items during CPU offline
commit 85572c2c4a upstream.

The scheduler code calling cpufreq_update_util() may run during CPU
offline on the target CPU after the IRQ work lists have been flushed
for it, so the target CPU should be prevented from running code that
may queue up an IRQ work item on it at that point.

Unfortunately, that may not be the case if dvfs_possible_from_any_cpu
is set for at least one cpufreq policy in the system, because that
allows the CPU going offline to run the utilization update callback
of the cpufreq governor on behalf of another (online) CPU in some
cases.

If that happens, the cpufreq governor callback may queue up an IRQ
work on the CPU running it, which is going offline, and the IRQ work
may not be flushed after that point.  Moreover, that IRQ work cannot
be flushed until the "offlining" CPU goes back online, so if any
other CPU calls irq_work_sync() to wait for the completion of that
IRQ work, it will have to wait until the "offlining" CPU is back
online and that may not happen forever.  In particular, a system-wide
deadlock may occur during CPU online as a result of that.

The failing scenario is as follows.  CPU0 is the boot CPU, so it
creates a cpufreq policy and becomes the "leader" of it
(policy->cpu).  It cannot go offline, because it is the boot CPU.
Next, other CPUs join the cpufreq policy as they go online and they
leave it when they go offline.  The last CPU to go offline, say CPU3,
may queue up an IRQ work while running the governor callback on
behalf of CPU0 after leaving the cpufreq policy because of the
dvfs_possible_from_any_cpu effect described above.  Then, CPU0 is
the only online CPU in the system and the stale IRQ work is still
queued on CPU3.  When, say, CPU1 goes back online, it will run
irq_work_sync() to wait for that IRQ work to complete and so it
will wait for CPU3 to go back online (which may never happen even
in principle), but (worse yet) CPU0 is waiting for CPU1 at that
point too and a system-wide deadlock occurs.

To address this problem notice that CPUs which cannot run cpufreq
utilization update code for themselves (for example, because they
have left the cpufreq policies that they belonged to), should also
be prevented from running that code on behalf of the other CPUs that
belong to a cpufreq policy with dvfs_possible_from_any_cpu set and so
in that case the cpufreq_update_util_data pointer of the CPU running
the code must not be NULL as well as for the CPU which is the target
of the cpufreq utilization update in progress.

Accordingly, change cpufreq_this_cpu_can_update() into a regular
function in kernel/sched/cpufreq.c (instead of a static inline in a
header file) and make it check the cpufreq_update_util_data pointer
of the local CPU if dvfs_possible_from_any_cpu is set for the target
cpufreq policy.

Also update the schedutil governor to do the
cpufreq_this_cpu_can_update() check in the non-fast-switch
case too to avoid the stale IRQ work issues.

Fixes: 99d14d0e16 ("cpufreq: Process remote callbacks from any CPU if the platform permits")
Link: https://lore.kernel.org/linux-pm/20191121093557.bycvdo4xyinbc5cb@vireshk-i7/
Reported-by: Anson Huang <anson.huang@nxp.com>
Tested-by: Anson Huang <anson.huang@nxp.com>
Cc: 4.14+ <stable@vger.kernel.org> # 4.14+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Tested-by: Peng Fan <peng.fan@nxp.com> (i.MX8QXP-MEK)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-31 16:36:22 +01:00
Masami Hiramatsu
49e9a5395c tracing/kprobe: Check whether the non-suffixed symbol is notrace
[ Upstream commit c7411a1a12 ]

Check whether the non-suffixed symbol is notrace, since suffixed
symbols are generated by the compilers for optimization. Based on
these suffixed symbols, notrace check might not work because
some of them are just a partial code of the original function.
(e.g. cold-cache (unlikely) code is separated from original
 function as FUNCTION.cold.XX)

For example, without this fix,
  # echo p device_add.cold.67 > /sys/kernel/debug/tracing/kprobe_events
  sh: write error: Invalid argument

  # cat /sys/kernel/debug/tracing/error_log
  [  135.491035] trace_kprobe: error: Failed to register probe event
    Command: p device_add.cold.67
               ^
  # dmesg | tail -n 1
  [  135.488599] trace_kprobe: Could not probe notrace function device_add.cold.67

With this,
  # echo p device_add.cold.66 > /sys/kernel/debug/tracing/kprobe_events
  # cat /sys/kernel/debug/kprobes/list
  ffffffff81599de9  k  device_add.cold.66+0x0    [DISABLED]

Actually, kprobe blacklist already did similar thing,
see within_kprobe_blacklist().

Link: http://lkml.kernel.org/r/157233790394.6706.18243942030937189679.stgit@devnote2

Fixes: 45408c4f92 ("tracing: kprobes: Prohibit probing on notrace function")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-31 16:36:02 +01:00
Yuming Han
22508003fc tracing: use kvcalloc for tgid_map array allocation
[ Upstream commit 6ee40511cb ]

Fail to allocate memory for tgid_map, because it requires order-6 page.
detail as:

c3 sh: page allocation failure: order:6,
   mode:0x140c0c0(GFP_KERNEL), nodemask=(null)
c3 sh cpuset=/ mems_allowed=0
c3 CPU: 3 PID: 5632 Comm: sh Tainted: G        W  O    4.14.133+ #10
c3 Hardware name: Generic DT based system
c3 Backtrace:
c3 [<c010bdbc>] (dump_backtrace) from [<c010c08c>](show_stack+0x18/0x1c)
c3 [<c010c074>] (show_stack) from [<c0993c54>](dump_stack+0x84/0xa4)
c3 [<c0993bd0>] (dump_stack) from [<c0229858>](warn_alloc+0xc4/0x19c)
c3 [<c0229798>] (warn_alloc) from [<c022a6e4>](__alloc_pages_nodemask+0xd18/0xf28)
c3 [<c02299cc>] (__alloc_pages_nodemask) from [<c0248344>](kmalloc_order+0x20/0x38)
c3 [<c0248324>] (kmalloc_order) from [<c0248380>](kmalloc_order_trace+0x24/0x108)
c3 [<c024835c>] (kmalloc_order_trace) from [<c01e6078>](set_tracer_flag+0xb0/0x158)
c3 [<c01e5fc8>] (set_tracer_flag) from [<c01e6404>](trace_options_core_write+0x7c/0xcc)
c3 [<c01e6388>] (trace_options_core_write) from [<c0278b1c>](__vfs_write+0x40/0x14c)
c3 [<c0278adc>] (__vfs_write) from [<c0278e10>](vfs_write+0xc4/0x198)
c3 [<c0278d4c>] (vfs_write) from [<c027906c>](SyS_write+0x6c/0xd0)
c3 [<c0279000>] (SyS_write) from [<c01079a0>](ret_fast_syscall+0x0/0x54)

Switch to use kvcalloc to avoid unexpected allocation failures.

Link: http://lkml.kernel.org/r/1571888070-24425-1-git-send-email-chunyan.zhang@unisoc.com

Signed-off-by: Yuming Han <yuming.han@unisoc.com>
Signed-off-by: Chunyan Zhang <chunyan.zhang@unisoc.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-31 16:36:02 +01:00
Song Liu
a7f4da875c bpf/stackmap: Fix deadlock with rq_lock in bpf_get_stack()
[ Upstream commit eac9153f2b ]

bpf stackmap with build-id lookup (BPF_F_STACK_BUILD_ID) can trigger A-A
deadlock on rq_lock():

rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[...]
Call Trace:
 try_to_wake_up+0x1ad/0x590
 wake_up_q+0x54/0x80
 rwsem_wake+0x8a/0xb0
 bpf_get_stack+0x13c/0x150
 bpf_prog_fbdaf42eded9fe46_on_event+0x5e3/0x1000
 bpf_overflow_handler+0x60/0x100
 __perf_event_overflow+0x4f/0xf0
 perf_swevent_overflow+0x99/0xc0
 ___perf_sw_event+0xe7/0x120
 __schedule+0x47d/0x620
 schedule+0x29/0x90
 futex_wait_queue_me+0xb9/0x110
 futex_wait+0x139/0x230
 do_futex+0x2ac/0xa50
 __x64_sys_futex+0x13c/0x180
 do_syscall_64+0x42/0x100
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

This can be reproduced by:
1. Start a multi-thread program that does parallel mmap() and malloc();
2. taskset the program to 2 CPUs;
3. Attach bpf program to trace_sched_switch and gather stackmap with
   build-id, e.g. with trace.py from bcc tools:
   trace.py -U -p <pid> -s <some-bin,some-lib> t:sched:sched_switch

A sample reproducer is attached at the end.

This could also trigger deadlock with other locks that are nested with
rq_lock.

Fix this by checking whether irqs are disabled. Since rq_lock and all
other nested locks are irq safe, it is safe to do up_read() when irqs are
not disable. If the irqs are disabled, postpone up_read() in irq_work.

Fixes: 615755a77b ("bpf: extend stackmap to save binary_build_id+offset instead of address")
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191014171223.357174-1-songliubraving@fb.com

Reproducer:
============================ 8< ============================

char *filename;

void *worker(void *p)
{
        void *ptr;
        int fd;
        char *pptr;

        fd = open(filename, O_RDONLY);
        if (fd < 0)
                return NULL;
        while (1) {
                struct timespec ts = {0, 1000 + rand() % 2000};

                ptr = mmap(NULL, 4096 * 64, PROT_READ, MAP_PRIVATE, fd, 0);
                usleep(1);
                if (ptr == MAP_FAILED) {
                        printf("failed to mmap\n");
                        break;
                }
                munmap(ptr, 4096 * 64);
                usleep(1);
                pptr = malloc(1);
                usleep(1);
                pptr[0] = 1;
                usleep(1);
                free(pptr);
                usleep(1);
                nanosleep(&ts, NULL);
        }
        close(fd);
        return NULL;
}

int main(int argc, char *argv[])
{
        void *ptr;
        int i;
        pthread_t threads[THREAD_COUNT];

        if (argc < 2)
                return 0;

        filename = argv[1];

        for (i = 0; i < THREAD_COUNT; i++) {
                if (pthread_create(threads + i, NULL, worker, NULL)) {
                        fprintf(stderr, "Error creating thread\n");
                        return 0;
                }
        }

        for (i = 0; i < THREAD_COUNT; i++)
                pthread_join(threads[i], NULL);
        return 0;
}
============================ 8< ============================

Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-31 16:35:20 +01:00
Greg Kroah-Hartman
d902dae13d This is the 4.19.90 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl35M30ACgkQONu9yGCS
 aT7MxQ/+P2k2knFpbzGfqn7Ug4fyrWJ8T0cvmcQYLxcddJdM+tQWuFfXR6rhg2U6
 cCEkIAKVxihEA51PT6LYiynIMQ1UDAEYENfwYK4inVX2HbMsqDC4D0qnAkABzH27
 sLXwhKOOGB/z1F7oKjjsX/cCwP3V2E0PL1P7owHZis6tB24pZrMEss24x/4+dDm9
 zBDDxpR++mJypRvG3fA8oP5dhZZJNacIvLW+48wrxZWkIcVNnRV+QnyHZe68af1R
 SH4+I12AAeEDyEsQI8yX8PmGAnj1RZrzRQhibxooyBH4642RbX2qCYJkutPjI5rG
 pUl4970MdSHYMyEUwxh77b0jSO/9w7k02yatyp0DVA0PQ7p0lLBFZ96GEG9ytXJm
 Csuc6HEXSSTvuX8pf/KAf18L6kgnUhlxywkDcrcAVLQofMDhODul3fJALmGSVJXW
 jbp6AFoqT84I8Gm+je+vyuQciLnuH5C9wwxrOrWZzr+hLzZk60iG+OpRohn/g+Bx
 PjDjvnump0JGjF89hfNc+v9F+ihz7GBwOxspGrgb27ViRIhcxf0GuYFxyJtEuDiW
 6+gYNzWUaVC4RR1l1jXGWtGUPBsNV50sxFHK/Hx09UMIu/uJPMtF+TW9QDhJT1jr
 kL1kKeCsRV54nWjiWKwTTI2I37xJCPuidW5hvLqf2+ZHYTfQzsE=
 =Op5F
 -----END PGP SIGNATURE-----

Merge 4.19.90 into android-4.19

Changes in 4.19.90
	usb: gadget: configfs: Fix missing spin_lock_init()
	usb: gadget: pch_udc: fix use after free
	scsi: qla2xxx: Fix driver unload hang
	media: venus: remove invalid compat_ioctl32 handler
	USB: uas: honor flag to avoid CAPACITY16
	USB: uas: heed CAPACITY_HEURISTICS
	USB: documentation: flags on usb-storage versus UAS
	usb: Allow USB device to be warm reset in suspended state
	staging: rtl8188eu: fix interface sanity check
	staging: rtl8712: fix interface sanity check
	staging: gigaset: fix general protection fault on probe
	staging: gigaset: fix illegal free on probe errors
	staging: gigaset: add endpoint-type sanity check
	usb: xhci: only set D3hot for pci device
	xhci: Fix memory leak in xhci_add_in_port()
	xhci: Increase STS_HALT timeout in xhci_suspend()
	xhci: handle some XHCI_TRUST_TX_LENGTH quirks cases as default behaviour.
	ARM: dts: pandora-common: define wl1251 as child node of mmc3
	iio: adis16480: Add debugfs_reg_access entry
	iio: humidity: hdc100x: fix IIO_HUMIDITYRELATIVE channel reporting
	iio: imu: inv_mpu6050: fix temperature reporting using bad unit
	USB: atm: ueagle-atm: add missing endpoint check
	USB: idmouse: fix interface sanity checks
	USB: serial: io_edgeport: fix epic endpoint lookup
	usb: roles: fix a potential use after free
	USB: adutux: fix interface sanity check
	usb: core: urb: fix URB structure initialization function
	usb: mon: Fix a deadlock in usbmon between mmap and read
	tpm: add check after commands attribs tab allocation
	mtd: spear_smi: Fix Write Burst mode
	virtio-balloon: fix managed page counts when migrating pages between zones
	usb: dwc3: pci: add ID for the Intel Comet Lake -H variant
	usb: dwc3: gadget: Fix logical condition
	usb: dwc3: ep0: Clear started flag on completion
	phy: renesas: rcar-gen3-usb2: Fix sysfs interface of "role"
	btrfs: check page->mapping when loading free space cache
	btrfs: use refcount_inc_not_zero in kill_all_nodes
	Btrfs: fix metadata space leak on fixup worker failure to set range as delalloc
	Btrfs: fix negative subv_writers counter and data space leak after buffered write
	btrfs: Avoid getting stuck during cyclic writebacks
	btrfs: Remove btrfs_bio::flags member
	Btrfs: send, skip backreference walking for extents with many references
	btrfs: record all roots for rename exchange on a subvol
	rtlwifi: rtl8192de: Fix missing code to retrieve RX buffer address
	rtlwifi: rtl8192de: Fix missing callback that tests for hw release of buffer
	rtlwifi: rtl8192de: Fix missing enable interrupt flag
	lib: raid6: fix awk build warnings
	ovl: fix corner case of non-unique st_dev;st_ino
	ovl: relax WARN_ON() on rename to self
	hwrng: omap - Fix RNG wait loop timeout
	dm writecache: handle REQ_FUA
	dm zoned: reduce overhead of backing device checks
	workqueue: Fix spurious sanity check failures in destroy_workqueue()
	workqueue: Fix pwq ref leak in rescuer_thread()
	ASoC: rt5645: Fixed buddy jack support.
	ASoC: rt5645: Fixed typo for buddy jack support.
	ASoC: Jack: Fix NULL pointer dereference in snd_soc_jack_report
	md: improve handling of bio with REQ_PREFLUSH in md_flush_request()
	blk-mq: avoid sysfs buffer overflow with too many CPU cores
	cgroup: pids: use atomic64_t for pids->limit
	ar5523: check NULL before memcpy() in ar5523_cmd()
	s390/mm: properly clear _PAGE_NOEXEC bit when it is not supported
	media: bdisp: fix memleak on release
	media: radio: wl1273: fix interrupt masking on release
	media: cec.h: CEC_OP_REC_FLAG_ values were swapped
	cpuidle: Do not unset the driver if it is there already
	erofs: zero out when listxattr is called with no xattr
	intel_th: Fix a double put_device() in error path
	intel_th: pci: Add Ice Lake CPU support
	intel_th: pci: Add Tiger Lake CPU support
	PM / devfreq: Lock devfreq in trans_stat_show
	cpufreq: powernv: fix stack bloat and hard limit on number of CPUs
	ACPI / hotplug / PCI: Allocate resources directly under the non-hotplug bridge
	ACPI: OSL: only free map once in osl.c
	ACPI: bus: Fix NULL pointer check in acpi_bus_get_private_data()
	ACPI: PM: Avoid attaching ACPI PM domain to certain devices
	pinctrl: armada-37xx: Fix irq mask access in armada_37xx_irq_set_type()
	pinctrl: samsung: Add of_node_put() before return in error path
	pinctrl: samsung: Fix device node refcount leaks in Exynos wakeup controller init
	pinctrl: samsung: Fix device node refcount leaks in S3C24xx wakeup controller init
	pinctrl: samsung: Fix device node refcount leaks in init code
	pinctrl: samsung: Fix device node refcount leaks in S3C64xx wakeup controller init
	mmc: host: omap_hsmmc: add code for special init of wl1251 to get rid of pandora_wl1251_init_card
	ARM: dts: omap3-tao3530: Fix incorrect MMC card detection GPIO polarity
	ppdev: fix PPGETTIME/PPSETTIME ioctls
	powerpc: Allow 64bit VDSO __kernel_sync_dicache to work across ranges >4GB
	powerpc/xive: Prevent page fault issues in the machine crash handler
	powerpc: Allow flush_icache_range to work across ranges >4GB
	powerpc/xive: Skip ioremap() of ESB pages for LSI interrupts
	video/hdmi: Fix AVI bar unpack
	quota: Check that quota is not dirty before release
	ext2: check err when partial != NULL
	quota: fix livelock in dquot_writeback_dquots
	ext4: Fix credit estimate for final inode freeing
	reiserfs: fix extended attributes on the root directory
	block: fix single range discard merge
	scsi: zfcp: trace channel log even for FCP command responses
	scsi: qla2xxx: Fix DMA unmap leak
	scsi: qla2xxx: Fix hang in fcport delete path
	scsi: qla2xxx: Fix session lookup in qlt_abort_work()
	scsi: qla2xxx: Fix qla24xx_process_bidir_cmd()
	scsi: qla2xxx: Always check the qla2x00_wait_for_hba_online() return value
	scsi: qla2xxx: Fix message indicating vectors used by driver
	scsi: qla2xxx: Fix SRB leak on switch command timeout
	xhci: make sure interrupts are restored to correct state
	usb: typec: fix use after free in typec_register_port()
	omap: pdata-quirks: remove openpandora quirks for mmc3 and wl1251
	scsi: lpfc: Cap NPIV vports to 256
	scsi: lpfc: Correct code setting non existent bits in sli4 ABORT WQE
	scsi: lpfc: Correct topology type reporting on G7 adapters
	drbd: Change drbd_request_detach_interruptible's return type to int
	e100: Fix passing zero to 'PTR_ERR' warning in e100_load_ucode_wait
	pvcalls-front: don't return error when the ring is full
	sch_cake: Correctly update parent qlen when splitting GSO packets
	net/smc: do not wait under send_lock
	net: hns3: clear pci private data when unload hns3 driver
	net: hns3: change hnae3_register_ae_dev() to int
	net: hns3: Check variable is valid before assigning it to another
	scsi: hisi_sas: send primitive NOTIFY to SSP situation only
	scsi: hisi_sas: Reject setting programmed minimum linkrate > 1.5G
	x86/MCE/AMD: Turn off MC4_MISC thresholding on all family 0x15 models
	x86/MCE/AMD: Carve out the MC4_MISC thresholding quirk
	power: supply: cpcap-battery: Fix signed counter sample register
	mlxsw: spectrum_router: Refresh nexthop neighbour when it becomes dead
	media: vimc: fix component match compare
	ath10k: fix fw crash by moving chip reset after napi disabled
	regulator: 88pm800: fix warning same module names
	powerpc: Avoid clang warnings around setjmp and longjmp
	powerpc: Fix vDSO clock_getres()
	ext4: work around deleting a file with i_nlink == 0 safely
	firmware: qcom: scm: Ensure 'a0' status code is treated as signed
	mm/shmem.c: cast the type of unmap_start to u64
	rtc: disable uie before setting time and enable after
	splice: only read in as much information as there is pipe buffer space
	ext4: fix a bug in ext4_wait_for_tail_page_commit
	mfd: rk808: Fix RK818 ID template
	mm, thp, proc: report THP eligibility for each vma
	s390/smp,vdso: fix ASCE handling
	blk-mq: make sure that line break can be printed
	workqueue: Fix missing kfree(rescuer) in destroy_workqueue()
	perf callchain: Fix segfault in thread__resolve_callchain_sample()
	gre: refetch erspan header from skb->data after pskb_may_pull()
	firmware: arm_scmi: Avoid double free in error flow
	sunrpc: fix crash when cache_head become valid before update
	net/mlx5e: Fix SFF 8472 eeprom length
	leds: trigger: netdev: fix handling on interface rename
	PCI: rcar: Fix missing MACCTLR register setting in initialization sequence
	gfs2: fix glock reference problem in gfs2_trans_remove_revoke
	of: overlay: add_changeset_property() memory leak
	kernel/module.c: wakeup processes in module_wq on module unload
	cifs: Fix potential softlockups while refreshing DFS cache
	gpiolib: acpi: Add Terra Pad 1061 to the run_edge_events_on_boot_blacklist
	raid5: need to set STRIPE_HANDLE for batch head
	scsi: qla2xxx: Change discovery state before PLOGI
	iio: imu: mpu6050: add missing available scan masks
	idr: Fix idr_get_next_ul race with idr_remove
	scsi: zorro_esp: Limit DMA transfers to 65536 bytes (except on Fastlane)
	of: unittest: fix memory leak in attach_node_and_children
	Linux 4.19.90

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I790291e9f3d3c8dd3f53e4387de25ff272ad4f39
2019-12-18 09:03:30 +01:00
Konstantin Khorenko
12c88d91a8 kernel/module.c: wakeup processes in module_wq on module unload
[ Upstream commit 5d60331161 ]

Fix the race between load and unload a kernel module.

sys_delete_module()
 try_stop_module()
  mod->state = _GOING
					add_unformed_module()
					 old = find_module_all()
					 (old->state == _GOING =>
					  wait_event_interruptible())

					 During pre-condition
					 finished_loading() rets 0
					 schedule()
					 (never gets waken up later)
 free_module()
  mod->state = _UNFORMED
   list_del_rcu(&mod->list)
   (dels mod from "modules" list)

return

The race above leads to modprobe hanging forever on loading
a module.

Error paths on loading module call wake_up_all(&module_wq) after
freeing module, so let's do the same on straight module unload.

Fixes: 6e6de3dee5 ("kernel/module.c: Only return -EEXIST for modules that have finished loading")
Reviewed-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-17 20:35:57 +01:00
Tejun Heo
1b83d5756a workqueue: Fix missing kfree(rescuer) in destroy_workqueue()
commit 8efe1223d7 upstream.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Qian Cai <cai@lca.pw>
Fixes: def98c84b6 ("workqueue: Fix spurious sanity check failures in destroy_workqueue()")
Cc: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17 20:35:50 +01:00
Aleksa Sarai
a1de70aa86 cgroup: pids: use atomic64_t for pids->limit
commit a713af394c upstream.

Because pids->limit can be changed concurrently (but we don't want to
take a lock because it would be needlessly expensive), use atomic64_ts
instead.

Fixes: commit 49b786ea14 ("cgroup: implement the PIDs subsystem")
Cc: stable@vger.kernel.org # v4.3+
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17 20:34:56 +01:00
Tejun Heo
ebd9fbf9e7 workqueue: Fix pwq ref leak in rescuer_thread()
commit e66b39af00 upstream.

008847f66c ("workqueue: allow rescuer thread to do more work.") made
the rescuer worker requeue the pwq immediately if there may be more
work items which need rescuing instead of waiting for the next mayday
timer expiration.  Unfortunately, it doesn't check whether the pwq is
already on the mayday list and unconditionally gets the ref and moves
it onto the list.  This doesn't corrupt the list but creates an
additional reference to the pwq.  It got queued twice but will only be
removed once.

This leak later can trigger pwq refcnt warning on workqueue
destruction and prevent freeing of the workqueue.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: "Williams, Gerald S" <gerald.s.williams@intel.com>
Cc: NeilBrown <neilb@suse.de>
Cc: stable@vger.kernel.org # v3.19+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17 20:34:54 +01:00
Tejun Heo
7c43540e88 workqueue: Fix spurious sanity check failures in destroy_workqueue()
commit def98c84b6 upstream.

Before actually destrying a workqueue, destroy_workqueue() checks
whether it's actually idle.  If it isn't, it prints out a bunch of
warning messages and leaves the workqueue dangling.  It unfortunately
has a couple issues.

* Mayday list queueing increments pwq's refcnts which gets detected as
  busy and fails the sanity checks.  However, because mayday list
  queueing is asynchronous, this condition can happen without any
  actual work items left in the workqueue.

* Sanity check failure leaves the sysfs interface behind too which can
  lead to init failure of newer instances of the workqueue.

This patch fixes the above two by

* If a workqueue has a rescuer, disable and kill the rescuer before
  sanity checks.  Disabling and killing is guaranteed to flush the
  existing mayday list.

* Remove sysfs interface before sanity checks.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Marcin Pawlowski <mpawlowski@fb.com>
Reported-by: "Williams, Gerald S" <gerald.s.williams@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17 20:34:53 +01:00
Greg Kroah-Hartman
e5312e5d68 This is the 4.19.89 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl3zQ1wACgkQONu9yGCS
 aT6ZDA/+JQyM+mgrU2t5mkq9lXCwL87Jiooy0kKT9b/2EWmW5Gdxp/On9PXfqtfs
 uZ+v0A1g1H+582uwuqG1wB2jr3I2AhNnRNbvSypGtk1Kitx9HqVJD/wWRRVCULww
 cr3uA/ZOX+deRjOVYP3dhFp7ycn6u5+GxgmFQTLmKAYN8uUqq4/dpWy01iB0nr2A
 GcoLm9P96o8P/wIWaykqOvshDrocbFcBL4VuxLeZCbFsAMTiX+jJnyIL8W7gfBJl
 M2626S/hESk5DvGcMN3zwOw/nTJlvySUtfqXSvPk0sT90UMx/YZ9QdpS9GkvRb9t
 OA1G+iHguEU+Fq/DawUyxwk/kt3nA6cg0q7RSxHo7QP6SGo7OaHHS1myzGDhL8oc
 LDKXO2iSSzvXJDlqrU45N+1YhpeiIHCxmDctbUIM9dP4u6wWmQIyYXLrcpupTsm9
 StiDBguXFHWSBFhG0+MlTUU5cypVNoN+56wBAUTR6+qoDASTzGvjNbrBsQihODV0
 RMFJF17Zvn+UoEohe860EMswUBsJ+F+VSZO5yGuZgsaC/2Ih6M1dxsiNU7RF02gX
 fRis6huj1+642ZsEbd2tueYGUaDN1HpMsVkN3AAkD3pJF5lX7AJRwhvRyC8N1jhc
 G90KMSk2pR/ItjmUpkKaAhAKhN+oKSzuCPpHj2iGotfWdd4slXQ=
 =Ekyt
 -----END PGP SIGNATURE-----

Merge 4.19.89 into android-4.19

Changes in 4.19.89
	rsi: release skb if rsi_prepare_beacon fails
	arm64: tegra: Fix 'active-low' warning for Jetson TX1 regulator
	sparc64: implement ioremap_uc
	lp: fix sparc64 LPSETTIMEOUT ioctl
	usb: gadget: u_serial: add missing port entry locking
	tty: serial: fsl_lpuart: use the sg count from dma_map_sg
	tty: serial: msm_serial: Fix flow control
	serial: pl011: Fix DMA ->flush_buffer()
	serial: serial_core: Perform NULL checks for break_ctl ops
	serial: ifx6x60: add missed pm_runtime_disable
	autofs: fix a leak in autofs_expire_indirect()
	RDMA/hns: Correct the value of HNS_ROCE_HEM_CHUNK_LEN
	iwlwifi: pcie: don't consider IV len in A-MSDU
	exportfs_decode_fh(): negative pinned may become positive without the parent locked
	audit_get_nd(): don't unlock parent too early
	NFC: nxp-nci: Fix NULL pointer dereference after I2C communication error
	xfrm: release device reference for invalid state
	Input: cyttsp4_core - fix use after free bug
	sched/core: Avoid spurious lock dependencies
	perf/core: Consistently fail fork on allocation failures
	ALSA: pcm: Fix stream lock usage in snd_pcm_period_elapsed()
	drm/sun4i: tcon: Set min division of TCON0_DCLK to 1.
	selftests: kvm: fix build with glibc >= 2.30
	rsxx: add missed destroy_workqueue calls in remove
	net: ep93xx_eth: fix mismatch of request_mem_region in remove
	i2c: core: fix use after free in of_i2c_notify
	serial: core: Allow processing sysrq at port unlock time
	cxgb4vf: fix memleak in mac_hlist initialization
	iwlwifi: mvm: synchronize TID queue removal
	iwlwifi: trans: Clear persistence bit when starting the FW
	iwlwifi: mvm: Send non offchannel traffic via AP sta
	ARM: 8813/1: Make aligned 2-byte getuser()/putuser() atomic on ARMv6+
	audit: Embed key into chunk
	netfilter: nf_tables: don't use position attribute on rule replacement
	ARC: IOC: panic if kernel was started with previously enabled IOC
	net/mlx5: Release resource on error flow
	clk: sunxi-ng: a64: Fix gate bit of DSI DPHY
	ice: Fix NVM mask defines
	dlm: fix possible call to kfree() for non-initialized pointer
	ARM: dts: exynos: Fix LDO13 min values on Odroid XU3/XU4/HC1
	extcon: max8997: Fix lack of path setting in USB device mode
	net: ethernet: ti: cpts: correct debug for expired txq skb
	rtc: s3c-rtc: Avoid using broken ALMYEAR register
	rtc: max77686: Fix the returned value in case of error in 'max77686_rtc_read_time()'
	i40e: don't restart nway if autoneg not supported
	virtchnl: Fix off by one error
	clk: rockchip: fix rk3188 sclk_smc gate data
	clk: rockchip: fix rk3188 sclk_mac_lbtest parameter ordering
	ARM: dts: rockchip: Fix rk3288-rock2 vcc_flash name
	dlm: fix missing idr_destroy for recover_idr
	MIPS: SiByte: Enable ZONE_DMA32 for LittleSur
	net: dsa: mv88e6xxx: Work around mv886e6161 SERDES missing MII_PHYSID2
	scsi: zfcp: update kernel message for invalid FCP_CMND length, it's not the CDB
	scsi: zfcp: drop default switch case which might paper over missing case
	drivers: soc: Allow building the amlogic drivers without ARCH_MESON
	bus: ti-sysc: Fix getting optional clocks in clock_roles
	ARM: dts: imx6: RDU2: fix eGalax touchscreen node
	crypto: ecc - check for invalid values in the key verification test
	crypto: bcm - fix normal/non key hash algorithm failure
	arm64: dts: zynqmp: Fix node names which contain "_"
	pinctrl: qcom: ssbi-gpio: fix gpio-hog related boot issues
	Staging: iio: adt7316: Fix i2c data reading, set the data field
	firmware: raspberrypi: Fix firmware calls with large buffers
	mm/vmstat.c: fix NUMA statistics updates
	clk: rockchip: fix I2S1 clock gate register for rk3328
	clk: rockchip: fix ID of 8ch clock of I2S1 for rk3328
	sctp: count sk_wmem_alloc by skb truesize in sctp_packet_transmit
	regulator: Fix return value of _set_load() stub
	USB: serial: f81534: fix reading old/new IC config
	xfs: extent shifting doesn't fully invalidate page cache
	net-next/hinic:fix a bug in set mac address
	net-next/hinic: fix a bug in rx data flow
	ice: Fix return value from NAPI poll
	ice: Fix possible NULL pointer de-reference
	iomap: FUA is wrong for DIO O_DSYNC writes into unwritten extents
	iomap: sub-block dio needs to zeroout beyond EOF
	iomap: dio data corruption and spurious errors when pipes fill
	iomap: readpages doesn't zero page tail beyond EOF
	iw_cxgb4: only reconnect with MPAv1 if the peer aborts
	MIPS: OCTEON: octeon-platform: fix typing
	net/smc: use after free fix in smc_wr_tx_put_slot()
	math-emu/soft-fp.h: (_FP_ROUND_ZERO) cast 0 to void to fix warning
	nds32: Fix the items of hwcap_str ordering issue.
	rtc: max8997: Fix the returned value in case of error in 'max8997_rtc_read_alarm()'
	rtc: dt-binding: abx80x: fix resistance scale
	ARM: dts: exynos: Use Samsung SoC specific compatible for DWC2 module
	media: coda: fix memory corruption in case more than 32 instances are opened
	media: pulse8-cec: return 0 when invalidating the logical address
	media: cec: report Vendor ID after initialization
	iwlwifi: fix cfg structs for 22000 with different RF modules
	ravb: Clean up duplex handling
	net/ipv6: re-do dad when interface has IFF_NOARP flag change
	dmaengine: coh901318: Fix a double-lock bug
	dmaengine: coh901318: Remove unused variable
	dmaengine: dw-dmac: implement dma protection control setting
	net: qualcomm: rmnet: move null check on dev before dereferecing it
	selftests/powerpc: Allocate base registers
	selftests/powerpc: Skip test instead of failing
	usb: dwc3: debugfs: Properly print/set link state for HS
	usb: dwc3: don't log probe deferrals; but do log other error codes
	ACPI: fix acpi_find_child_device() invocation in acpi_preset_companion()
	f2fs: fix to account preflush command for noflush_merge mode
	f2fs: fix count of seg_freed to make sec_freed correct
	f2fs: change segment to section in f2fs_ioc_gc_range
	ARM: dts: rockchip: Fix the PMU interrupt number for rv1108
	ARM: dts: rockchip: Assign the proper GPIO clocks for rv1108
	f2fs: fix to allow node segment for GC by ioctl path
	sparc: Fix JIT fused branch convergance.
	sparc: Correct ctx->saw_frame_pointer logic.
	nvme: Free ctrl device name on init failure
	dma-mapping: fix return type of dma_set_max_seg_size()
	slimbus: ngd: Fix build error on x86
	altera-stapl: check for a null key before strcasecmp'ing it
	serial: imx: fix error handling in console_setup
	i2c: imx: don't print error message on probe defer
	clk: meson: Fix GXL HDMI PLL fractional bits width
	gpu: host1x: Fix syncpoint ID field size on Tegra186
	lockd: fix decoding of TEST results
	sctp: increase sk_wmem_alloc when head->truesize is increased
	iommu/amd: Fix line-break in error log reporting
	ASoC: rsnd: tidyup registering method for rsnd_kctrl_new()
	ARM: dts: sun4i: Fix gpio-keys warning
	ARM: dts: sun4i: Fix HDMI output DTC warning
	ARM: dts: sun5i: a10s: Fix HDMI output DTC warning
	ARM: dts: r8a779[01]: Disable unconnected LVDS encoders
	ARM: dts: sun7i: Fix HDMI output DTC warning
	ARM: dts: sun8i: a23/a33: Fix OPP DTC warnings
	ARM: dts: sun8i: v3s: Change pinctrl nodes to avoid warning
	dlm: NULL check before kmem_cache_destroy is not needed
	ARM: debug: enable UART1 for socfpga Cyclone5
	can: xilinx: fix return type of ndo_start_xmit function
	nfsd: fix a warning in __cld_pipe_upcall()
	bpf: btf: implement btf_name_valid_identifier()
	bpf: btf: check name validity for various types
	tools: bpftool: fix a bitfield pretty print issue
	ASoC: au8540: use 64-bit arithmetic instead of 32-bit
	ARM: OMAP1/2: fix SoC name printing
	arm64: dts: meson-gxl-libretech-cc: fix GPIO lines names
	arm64: dts: meson-gxbb-nanopi-k2: fix GPIO lines names
	arm64: dts: meson-gxbb-odroidc2: fix GPIO lines names
	arm64: dts: meson-gxl-khadas-vim: fix GPIO lines names
	net/x25: fix called/calling length calculation in x25_parse_address_block
	net/x25: fix null_x25_address handling
	tools/bpf: make libbpf _GNU_SOURCE friendly
	clk: mediatek: Drop __init from mtk_clk_register_cpumuxes()
	clk: mediatek: Drop more __init markings for driver probe
	soc: renesas: r8a77970-sysc: Correct names of A2DP/A2CN power domains
	soc: renesas: r8a77980-sysc: Correct names of A2DP[01] power domains
	soc: renesas: r8a77980-sysc: Correct A3VIP[012] power domain hierarchy
	kbuild: disable dtc simple_bus_reg warnings by default
	tcp: make tcp_space() aware of socket backlog
	ARM: dts: mmp2: fix the gpio interrupt cell number
	ARM: dts: realview-pbx: Fix duplicate regulator nodes
	tcp: fix off-by-one bug on aborting window-probing socket
	tcp: fix SNMP under-estimation on failed retransmission
	tcp: fix SNMP TCP timeout under-estimation
	modpost: skip ELF local symbols during section mismatch check
	kbuild: fix single target build for external module
	mtd: fix mtd_oobavail() incoherent returned value
	ARM: dts: pxa: clean up USB controller nodes
	clk: meson: meson8b: fix the offset of vid_pll_dco's N value
	clk: sunxi-ng: h3/h5: Fix CSI_MCLK parent
	clk: qcom: Fix MSM8998 resets
	media: cxd2880-spi: fix probe when dvb_attach fails
	ARM: dts: realview: Fix some more duplicate regulator nodes
	dlm: fix invalid cluster name warning
	net/mlx4_core: Fix return codes of unsupported operations
	pstore/ram: Avoid NULL deref in ftrace merging failure path
	powerpc/math-emu: Update macros from GCC
	clk: renesas: r8a77990: Correct parent clock of DU
	clk: renesas: r8a77995: Correct parent clock of DU
	MIPS: OCTEON: cvmx_pko_mem_debug8: use oldest forward compatible definition
	nfsd: Return EPERM, not EACCES, in some SETATTR cases
	media: uvcvideo: Abstract streaming object lifetime
	tty: serial: qcom_geni_serial: Fix softlock
	ARM: dts: sun8i: h3: Fix the system-control register range
	tty: Don't block on IO when ldisc change is pending
	media: stkwebcam: Bugfix for wrong return values
	firmware: qcom: scm: fix compilation error when disabled
	clk: qcom: gcc-msm8998: Disable halt check of UFS clocks
	sctp: frag_point sanity check
	soc: renesas: r8a77990-sysc: Fix initialization order of 3DG-{A,B}
	mlxsw: spectrum_router: Relax GRE decap matching check
	IB/hfi1: Ignore LNI errors before DC8051 transitions to Polling state
	IB/hfi1: Close VNIC sdma_progress sleep window
	mlx4: Use snprintf instead of complicated strcpy
	usb: mtu3: fix dbginfo in qmu_tx_zlp_error_handler
	clk: renesas: rcar-gen3: Set state when registering SD clocks
	ASoC: max9867: Fix power management
	ARM: dts: sunxi: Fix PMU compatible strings
	ARM: dts: am335x-pdu001: Fix polarity of card detection input
	media: vimc: fix start stream when link is disabled
	net: aquantia: fix RSS table and key sizes
	sched/fair: Scale bandwidth quota and period without losing quota/period ratio precision
	fuse: verify nlink
	fuse: verify attributes
	ALSA: hda/realtek - Enable internal speaker of ASUS UX431FLC
	ALSA: hda/realtek - Enable the headset-mic on a Xiaomi's laptop
	ALSA: hda/realtek - Dell headphone has noise on unmute for ALC236
	ALSA: pcm: oss: Avoid potential buffer overflows
	ALSA: hda - Add mute led support for HP ProBook 645 G4
	Input: synaptics - switch another X1 Carbon 6 to RMI/SMbus
	Input: synaptics-rmi4 - re-enable IRQs in f34v7_do_reflash
	Input: synaptics-rmi4 - don't increment rmiaddr for SMBus transfers
	Input: goodix - add upside-down quirk for Teclast X89 tablet
	coresight: etm4x: Fix input validation for sysfs.
	Input: Fix memory leak in psxpad_spi_probe
	x86/mm/32: Sync only to VMALLOC_END in vmalloc_sync_all()
	x86/PCI: Avoid AMD FCH XHCI USB PME# from D0 defect
	xfrm interface: fix memory leak on creation
	xfrm interface: avoid corruption on changelink
	xfrm interface: fix list corruption for x-netns
	xfrm interface: fix management of phydev
	CIFS: Fix NULL-pointer dereference in smb2_push_mandatory_locks
	CIFS: Fix SMB2 oplock break processing
	tty: vt: keyboard: reject invalid keycodes
	can: slcan: Fix use-after-free Read in slcan_open
	kernfs: fix ino wrap-around detection
	jbd2: Fix possible overflow in jbd2_log_space_left()
	drm/msm: fix memleak on release
	drm/i810: Prevent underflow in ioctl
	arm64: dts: exynos: Revert "Remove unneeded address space mapping for soc node"
	KVM: arm/arm64: vgic: Don't rely on the wrong pending table
	KVM: x86: do not modify masked bits of shared MSRs
	KVM: x86: fix presentation of TSX feature in ARCH_CAPABILITIES
	KVM: x86: Grab KVM's srcu lock when setting nested state
	crypto: crypto4xx - fix double-free in crypto4xx_destroy_sdr
	crypto: atmel-aes - Fix IV handling when req->nbytes < ivsize
	crypto: af_alg - cast ki_complete ternary op to int
	crypto: ccp - fix uninitialized list head
	crypto: ecdh - fix big endian bug in ECC library
	crypto: user - fix memory leak in crypto_report
	spi: atmel: Fix CS high support
	mwifiex: update set_mac_address logic
	can: ucan: fix non-atomic allocation in completion handler
	RDMA/qib: Validate ->show()/store() callbacks before calling them
	iomap: Fix pipe page leakage during splicing
	thermal: Fix deadlock in thermal thermal_zone_device_check
	vcs: prevent write access to vcsu devices
	binder: Fix race between mmap() and binder_alloc_print_pages()
	binder: Handle start==NULL in binder_update_page_range()
	ALSA: hda - Fix pending unsol events at shutdown
	md/raid0: Fix an error message in raid0_make_request()
	watchdog: aspeed: Fix clock behaviour for ast2600
	perf script: Fix invalid LBR/binary mismatch error
	splice: don't read more than available pipe space
	iomap: partially revert 4721a60109 (simulated directio short read on EFAULT)
	xfs: add missing error check in xfs_prepare_shift()
	ASoC: rsnd: fixup MIX kctrl registration
	KVM: x86: fix out-of-bounds write in KVM_GET_EMULATED_CPUID (CVE-2019-19332)
	net: qrtr: fix memort leak in qrtr_tun_write_iter
	appletalk: Fix potential NULL pointer dereference in unregister_snap_client
	appletalk: Set error code if register_snap_client failed
	Linux 4.19.89

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ie3fa59adde9a7e9a6d4684de0e95de14a8b83d0b
2019-12-13 10:01:10 +01:00
Xuewei Zhang
742f2319cb sched/fair: Scale bandwidth quota and period without losing quota/period ratio precision
commit 4929a4e6fa upstream.

The quota/period ratio is used to ensure a child task group won't get
more bandwidth than the parent task group, and is calculated as:

  normalized_cfs_quota() = [(quota_us << 20) / period_us]

If the quota/period ratio was changed during this scaling due to
precision loss, it will cause inconsistency between parent and child
task groups.

See below example:

A userspace container manager (kubelet) does three operations:

 1) Create a parent cgroup, set quota to 1,000us and period to 10,000us.
 2) Create a few children cgroups.
 3) Set quota to 1,000us and period to 10,000us on a child cgroup.

These operations are expected to succeed. However, if the scaling of
147/128 happens before step 3, quota and period of the parent cgroup
will be changed:

  new_quota: 1148437ns,   1148us
 new_period: 11484375ns, 11484us

And when step 3 comes in, the ratio of the child cgroup will be
104857, which will be larger than the parent cgroup ratio (104821),
and will fail.

Scaling them by a factor of 2 will fix the problem.

Tested-by: Phil Auld <pauld@redhat.com>
Signed-off-by: Xuewei Zhang <xueweiz@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Phil Auld <pauld@redhat.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Fixes: 2e8e192263 ("sched/fair: Limit sched_cfs_period_timer() loop to avoid hard lockup")
Link: https://lkml.kernel.org/r/20191004001243.140897-1-xueweiz@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-13 08:52:35 +01:00
Yonghong Song
12c49ac4cf bpf: btf: check name validity for various types
[ Upstream commit eb04bbb608 ]

This patch added name checking for the following types:
 . BTF_KIND_PTR, BTF_KIND_ARRAY, BTF_KIND_VOLATILE,
   BTF_KIND_CONST, BTF_KIND_RESTRICT:
     the name must be null
 . BTF_KIND_STRUCT, BTF_KIND_UNION: the struct/member name
     is either null or a valid identifier
 . BTF_KIND_ENUM: the enum type name is either null or a valid
     identifier; the enumerator name must be a valid identifier.
 . BTF_KIND_FWD: the name must be a valid identifier
 . BTF_KIND_TYPEDEF: the name must be a valid identifier

For those places a valid name is required, the name must be
a valid C identifier. This can be relaxed later if we found
use cases for a different (non-C) frontend.

Fixes: 69b693f0ae ("bpf: btf: Introduce BPF Type Format (BTF)")
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-13 08:52:09 +01:00
Yonghong Song
2f3e380d49 bpf: btf: implement btf_name_valid_identifier()
[ Upstream commit cdbb096add ]

Function btf_name_valid_identifier() have been implemented in
bpf-next commit 2667a2626f ("bpf: btf: Add BTF_KIND_FUNC and
BTF_KIND_FUNC_PROTO"). Backport this function so later patch
can use it.

Fixes: 69b693f0ae ("bpf: btf: Introduce BPF Type Format (BTF)")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-13 08:52:09 +01:00
Jan Kara
6ce317fdc2 audit: Embed key into chunk
[ Upstream commit 8d20d6e930 ]

Currently chunk hash key (which is in fact pointer to the inode) is
derived as chunk->mark.conn->obj. It is tricky to make this dereference
reliable for hash table lookups only under RCU as mark can get detached
from the connector and connector gets freed independently of the
running lookup. Thus there is a possible use after free / NULL ptr
dereference issue:

CPU1					CPU2
					untag_chunk()
					  ...
audit_tree_lookup()
  list_for_each_entry_rcu(p, list, hash) {
					  list_del_rcu(&chunk->hash);
					  fsnotify_destroy_mark(entry);
					  fsnotify_put_mark(entry)
    chunk_to_key(p)
      if (!chunk->mark.connector)
					    ...
					    hlist_del_init_rcu(&mark->obj_list);
					    if (hlist_empty(&conn->list)) {
					      inode = fsnotify_detach_connector_from_object(conn);
					    mark->connector = NULL;
					    ...
					    frees connector from workqueue
      chunk->mark.connector->obj

This race is probably impossible to hit in practice as the race window
on CPU1 is very narrow and CPU2 has a lot of code to execute. Still it's
better to have this fixed. Since the inode the chunk is attached to is
constant during chunk's lifetime it is easy to cache the key in the
chunk itself and thus avoid these issues.

Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-13 08:51:11 +01:00
Alexander Shishkin
78a917bea6 perf/core: Consistently fail fork on allocation failures
[ Upstream commit 697d877849 ]

Commit:

  313ccb9615 ("perf: Allocate context task_ctx_data for child event")

makes the inherit path skip over the current event in case of task_ctx_data
allocation failure. This, however, is inconsistent with allocation failures
in perf_event_alloc(), which would abort the fork.

Correct this by returning an error code on task_ctx_data allocation
failure and failing the fork in that case.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/20191105075702.60319-1-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-13 08:51:04 +01:00
Peter Zijlstra
870083b6af sched/core: Avoid spurious lock dependencies
[ Upstream commit ff51ff84d8 ]

While seemingly harmless, __sched_fork() does hrtimer_init(), which,
when DEBUG_OBJETS, can end up doing allocations.

This then results in the following lock order:

  rq->lock
    zone->lock.rlock
      batched_entropy_u64.lock

Which in turn causes deadlocks when we do wakeups while holding that
batched_entropy lock -- as the random code does.

Solve this by moving __sched_fork() out from under rq->lock. This is
safe because nothing there relies on rq->lock, as also evident from the
other __sched_fork() callsite.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qian Cai <cai@lca.pw>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akpm@linux-foundation.org
Cc: bigeasy@linutronix.de
Cc: cl@linux.com
Cc: keescook@chromium.org
Cc: penberg@kernel.org
Cc: rientjes@google.com
Cc: thgarnie@google.com
Cc: tytso@mit.edu
Cc: will@kernel.org
Fixes: b7d5dc2107 ("random: add a spinlock_t to struct batched_entropy")
Link: https://lkml.kernel.org/r/20191001091837.GK4536@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-13 08:51:04 +01:00
Al Viro
7fb6ef16ef audit_get_nd(): don't unlock parent too early
[ Upstream commit 69924b8968 ]

if the child has been negative and just went positive
under us, we want coherent d_is_positive() and ->d_inode.
Don't unlock the parent until we'd done that work...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-13 08:51:02 +01:00
Greg Kroah-Hartman
291d853dff This is the 4.19.88 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl3owgEACgkQONu9yGCS
 aT43zw//SS1As83XXuHr4mdWIVDjXo6RMJ6Ib7YbRi/uhBmQuUuGVFcqGxUIA9Kl
 eSXu5Kt8TNmInzHq9AMYgegrELAEwPD2XfptALGDwiUHonQuiFaqOQn/bltJOm1L
 PsG15A7+/gFhuhPJDp2ZfNBmZGdpXdIwD27oUDqF1XD64dMa/HPbFUVgxWn3HHkd
 sm0J6Ez0eNA+BmLnHXYDiSaEYIiwvy1nN6XpyIfOyb2Tz6kPoe0vVWU00Cmy8KAU
 EIWB+TBRunspgMsShL5Cl1MSFOxf9QOmgnZxcrODAQfb1TbLMACB1FGMjK4nLm+3
 wPlSnC7L49ARl/pvmN5NOUrjHi8S8qq/Od9QW+UIckRI6KzOU832h99v4gFuHjSC
 KFiLi5K9+uTIMgNOETmINBiKKUcUzYXYVajvm4tuAUq3HO8wy6jeALtt34OiJZQZ
 DV8wyBdL9NDUFqBymFaMFA4Us/fGIREzvPgI0E0jth2ANuLFLtScrnStuWv8buwJ
 JT3V9xCxHZtZ3Ctevx/Jp6OaQtnbSnWjMjrO0UDzZ6N7+g5UKmh9/R3xL6sBpFVU
 Vu49J+qWU3VmbY3EIulel+yARNe7xS4ExK185JmNzpYFyOpXum14FHhhtQ6xNSeu
 dRqyITI0KYP7jWtBDKCgVAWF5jC9gHP1ksrHSZMhyGrv1dC1XZM=
 =KnJW
 -----END PGP SIGNATURE-----

Merge 4.19.88 into android-4.19

Changes in 4.19.88
	clk: meson: gxbb: let sar_adc_clk_div set the parent clock rate
	clocksource/drivers/mediatek: Fix error handling
	ASoC: msm8916-wcd-analog: Fix RX1 selection in RDAC2 MUX
	ASoC: compress: fix unsigned integer overflow check
	reset: Fix memory leak in reset_control_array_put()
	clk: samsung: exynos5433: Fix error paths
	ASoC: kirkwood: fix external clock probe defer
	ASoC: kirkwood: fix device remove ordering
	clk: samsung: exynos5420: Preserve PLL configuration during suspend/resume
	pinctrl: cherryview: Allocate IRQ chip dynamic
	ARM: dts: imx6qdl-sabreauto: Fix storm of accelerometer interrupts
	reset: fix reset_control_ops kerneldoc comment
	clk: at91: avoid sleeping early
	clk: sunxi: Fix operator precedence in sunxi_divs_clk_setup
	clk: sunxi-ng: a80: fix the zero'ing of bits 16 and 18
	ARM: dts: sun8i-a83t-tbs-a711: Fix WiFi resume from suspend
	samples/bpf: fix build by setting HAVE_ATTR_TEST to zero
	powerpc/bpf: Fix tail call implementation
	idr: Fix integer overflow in idr_for_each_entry
	idr: Fix idr_alloc_u32 on 32-bit systems
	x86/resctrl: Prevent NULL pointer dereference when reading mondata
	clk: ti: dra7-atl-clock: Remove ti_clk_add_alias call
	clk: ti: clkctrl: Fix failed to enable error with double udelay timeout
	net: fec: add missed clk_disable_unprepare in remove
	bridge: ebtables: don't crash when using dnat target in output chains
	can: peak_usb: report bus recovery as well
	can: c_can: D_CAN: c_can_chip_config(): perform a sofware reset on open
	can: rx-offload: can_rx_offload_queue_tail(): fix error handling, avoid skb mem leak
	can: rx-offload: can_rx_offload_offload_one(): do not increase the skb_queue beyond skb_queue_len_max
	can: rx-offload: can_rx_offload_offload_one(): increment rx_fifo_errors on queue overflow or OOM
	can: rx-offload: can_rx_offload_offload_one(): use ERR_PTR() to propagate error value in case of errors
	can: rx-offload: can_rx_offload_irq_offload_timestamp(): continue on error
	can: rx-offload: can_rx_offload_irq_offload_fifo(): continue on error
	can: flexcan: increase error counters if skb enqueueing via can_rx_offload_queue_sorted() fails
	can: mcp251x: mcp251x_restart_work_handler(): Fix potential force_quit race condition
	watchdog: meson: Fix the wrong value of left time
	ASoC: stm32: sai: add restriction on mmap support
	scripts/gdb: fix debugging modules compiled with hot/cold partitioning
	net: bcmgenet: use RGMII loopback for MAC reset
	net: bcmgenet: reapply manual settings to the PHY
	net: mscc: ocelot: fix __ocelot_rmw_ix prototype
	ceph: return -EINVAL if given fsc mount option on kernel w/o support
	net/fq_impl: Switch to kvmalloc() for memory allocation
	mac80211: fix station inactive_time shortly after boot
	block: drbd: remove a stray unlock in __drbd_send_protocol()
	pwm: bcm-iproc: Prevent unloading the driver module while in use
	scsi: target/tcmu: Fix queue_cmd_ring() declaration
	scsi: lpfc: Fix kernel Oops due to null pring pointers
	scsi: lpfc: Fix dif and first burst use in write commands
	ARM: dts: Fix up SQ201 flash access
	tracing: Lock event_mutex before synth_event_mutex
	ARM: debug-imx: only define DEBUG_IMX_UART_PORT if needed
	ARM: dts: imx51: Fix memory node duplication
	ARM: dts: imx53: Fix memory node duplication
	ARM: dts: imx31: Fix memory node duplication
	ARM: dts: imx35: Fix memory node duplication
	ARM: dts: imx7: Fix memory node duplication
	ARM: dts: imx6ul: Fix memory node duplication
	ARM: dts: imx6sx: Fix memory node duplication
	ARM: dts: imx6sl: Fix memory node duplication
	ARM: dts: imx50: Fix memory node duplication
	ARM: dts: imx23: Fix memory node duplication
	ARM: dts: imx1: Fix memory node duplication
	ARM: dts: imx27: Fix memory node duplication
	ARM: dts: imx25: Fix memory node duplication
	ARM: dts: imx53-voipac-dmm-668: Fix memory node duplication
	parisc: Fix serio address output
	parisc: Fix HP SDC hpa address output
	ARM: dts: Fix hsi gdd range for omap4
	arm64: mm: Prevent mismatched 52-bit VA support
	arm64: smp: Handle errors reported by the firmware
	bus: ti-sysc: Check for no-reset and no-idle flags at the child level
	platform/x86: mlx-platform: Fix LED configuration
	ARM: OMAP1: fix USB configuration for device-only setups
	RDMA/hns: Fix the bug while use multi-hop of pbl
	arm64: preempt: Fix big-endian when checking preempt count in assembly
	RDMA/vmw_pvrdma: Use atomic memory allocation in create AH
	PM / AVS: SmartReflex: NULL check before some freeing functions is not needed
	xfs: zero length symlinks are not valid
	ARM: ks8695: fix section mismatch warning
	ACPI / LPSS: Ignore acpi_device_fix_up_power() return value
	scsi: lpfc: Enable Management features for IF_TYPE=6
	scsi: qla2xxx: Fix NPIV handling for FC-NVMe
	scsi: qla2xxx: Fix for FC-NVMe discovery for NPIV port
	nvme: provide fallback for discard alloc failure
	s390/zcrypt: make sysfs reset attribute trigger queue reset
	crypto: user - support incremental algorithm dumps
	arm64: dts: renesas: draak: Fix CVBS input
	mwifiex: fix potential NULL dereference and use after free
	mwifiex: debugfs: correct histogram spacing, formatting
	brcmfmac: set F2 watermark to 256 for 4373
	brcmfmac: set SDIO F1 MesBusyCtrl for CYW4373
	rtl818x: fix potential use after free
	bcache: do not check if debug dentry is ERR or NULL explicitly on remove
	bcache: do not mark writeback_running too early
	xfs: require both realtime inodes to mount
	nvme: fix kernel paging oops
	ubifs: Fix default compression selection in ubifs
	ubi: Put MTD device after it is not used
	ubi: Do not drop UBI device reference before using
	microblaze: adjust the help to the real behavior
	microblaze: move "... is ready" messages to arch/microblaze/Makefile
	microblaze: fix multiple bugs in arch/microblaze/boot/Makefile
	iwlwifi: move iwl_nvm_check_version() into dvm
	iwlwifi: mvm: force TCM re-evaluation on TCM resume
	iwlwifi: pcie: fix erroneous print
	iwlwifi: pcie: set cmd_len in the correct place
	gpio: pca953x: Fix AI overflow on PCAL6524
	gpiolib: Fix return value of gpio_to_desc() stub if !GPIOLIB
	kvm: vmx: Set IA32_TSC_AUX for legacy mode guests
	Revert "KVM: nVMX: reset cache/shadows when switching loaded VMCS"
	Revert "KVM: nVMX: move check_vmentry_postreqs() call to nested_vmx_enter_non_root_mode()"
	crypto/chelsio/chtls: listen fails with multiadapt
	VSOCK: bind to random port for VMADDR_PORT_ANY
	mmc: meson-gx: make sure the descriptor is stopped on errors
	mtd: rawnand: sunxi: Write pageprog related opcodes to WCMD_SET
	usb: ehci-omap: Fix deferred probe for phy handling
	btrfs: Check for missing device before bio submission in btrfs_map_bio
	btrfs: fix ncopies raid_attr for RAID56
	btrfs: dev-replace: set result code of cancel by status of scrub
	Btrfs: allow clear_extent_dirty() to receive a cached extent state record
	btrfs: only track ref_heads in delayed_ref_updates
	serial: sh-sci: Fix crash in rx_timer_fn() on PIO fallback
	HID: intel-ish-hid: fixes incorrect error handling
	gpio: raspberrypi-exp: decrease refcount on firmware dt node
	serial: 8250: Rate limit serial port rx interrupts during input overruns
	kprobes/x86/xen: blacklist non-attachable xen interrupt functions
	xen/pciback: Check dev_data before using it
	kprobes: Blacklist symbols in arch-defined prohibited area
	kprobes/x86: Show x86-64 specific blacklisted symbols correctly
	vfio-mdev/samples: Use u8 instead of char for handle functions
	memory: omap-gpmc: Get the header of the enum
	pinctrl: xway: fix gpio-hog related boot issues
	net/mlx5: Continue driver initialization despite debugfs failure
	netfilter: nf_nat_sip: fix RTP/RTCP source port translations
	exofs_mount(): fix leaks on failure exits
	bnxt_en: Return linux standard errors in bnxt_ethtool.c
	bnxt_en: Save ring statistics before reset.
	bnxt_en: query force speeds before disabling autoneg mode.
	KVM: s390: unregister debug feature on failing arch init
	pinctrl: sh-pfc: r8a77990: Fix MOD_SEL0 SEL_I2C1 field width
	pinctrl: sh-pfc: sh7264: Fix PFCR3 and PFCR0 register configuration
	pinctrl: sh-pfc: sh7734: Fix shifted values in IPSR10
	HID: doc: fix wrong data structure reference for UHID_OUTPUT
	dm flakey: Properly corrupt multi-page bios.
	gfs2: take jdata unstuff into account in do_grow
	dm raid: fix false -EBUSY when handling check/repair message
	xfs: Align compat attrlist_by_handle with native implementation.
	xfs: Fix bulkstat compat ioctls on x32 userspace.
	IB/qib: Fix an error code in qib_sdma_verbs_send()
	clocksource/drivers/fttmr010: Fix invalid interrupt register access
	vxlan: Fix error path in __vxlan_dev_create()
	powerpc/book3s/32: fix number of bats in p/v_block_mapped()
	powerpc/xmon: fix dump_segments()
	drivers/regulator: fix a missing check of return value
	Bluetooth: hci_bcm: Handle specific unknown packets after firmware loading
	serial: max310x: Fix tx_empty() callback
	openrisc: Fix broken paths to arch/or32
	RDMA/srp: Propagate ib_post_send() failures to the SCSI mid-layer
	scsi: qla2xxx: deadlock by configfs_depend_item
	scsi: csiostor: fix incorrect dma device in case of vport
	brcmfmac: Fix access point mode
	ath6kl: Only use match sets when firmware supports it
	ath6kl: Fix off by one error in scan completion
	powerpc/perf: Fix unit_sel/cache_sel checks
	powerpc/32: Avoid unsupported flags with clang
	powerpc/prom: fix early DEBUG messages
	powerpc/mm: Make NULL pointer deferences explicit on bad page faults.
	powerpc/44x/bamboo: Fix PCI range
	vfio/spapr_tce: Get rid of possible infinite loop
	powerpc/powernv/eeh/npu: Fix uninitialized variables in opal_pci_eeh_freeze_status
	drbd: ignore "all zero" peer volume sizes in handshake
	drbd: reject attach of unsuitable uuids even if connected
	drbd: do not block when adjusting "disk-options" while IO is frozen
	drbd: fix print_st_err()'s prototype to match the definition
	IB/rxe: Make counters thread safe
	bpf/cpumap: make sure frame_size for build_skb is aligned if headroom isn't
	regulator: tps65910: fix a missing check of return value
	powerpc/83xx: handle machine check caused by watchdog timer
	powerpc/pseries: Fix node leak in update_lmb_associativity_index()
	powerpc: Fix HMIs on big-endian with CONFIG_RELOCATABLE=y
	crypto: mxc-scc - fix build warnings on ARM64
	pwm: clps711x: Fix period calculation
	net/netlink_compat: Fix a missing check of nla_parse_nested
	net/net_namespace: Check the return value of register_pernet_subsys()
	f2fs: fix block address for __check_sit_bitmap
	f2fs: fix to dirty inode synchronously
	um: Include sys/uio.h to have writev()
	um: Make GCOV depend on !KCOV
	net: (cpts) fix a missing check of clk_prepare
	net: stmicro: fix a missing check of clk_prepare
	net: dsa: bcm_sf2: Propagate error value from mdio_write
	atl1e: checking the status of atl1e_write_phy_reg
	tipc: fix a missing check of genlmsg_put
	net: marvell: fix a missing check of acpi_match_device
	net/wan/fsl_ucc_hdlc: Avoid double free in ucc_hdlc_probe()
	ocfs2: clear journal dirty flag after shutdown journal
	vmscan: return NODE_RECLAIM_NOSCAN in node_reclaim() when CONFIG_NUMA is n
	mm/page_alloc.c: free order-0 pages through PCP in page_frag_free()
	mm/page_alloc.c: use a single function to free page
	mm/page_alloc.c: deduplicate __memblock_free_early() and memblock_free()
	tools/vm/page-types.c: fix "kpagecount returned fewer pages than expected" failures
	netfilter: nf_tables: fix a missing check of nla_put_failure
	xprtrdma: Prevent leak of rpcrdma_rep objects
	infiniband: bnxt_re: qplib: Check the return value of send_message
	infiniband/qedr: Potential null ptr dereference of qp
	firmware: arm_sdei: fix wrong of_node_put() in init function
	firmware: arm_sdei: Fix DT platform device creation
	lib/genalloc.c: fix allocation of aligned buffer from non-aligned chunk
	lib/genalloc.c: use vzalloc_node() to allocate the bitmap
	fork: fix some -Wmissing-prototypes warnings
	drivers/base/platform.c: kmemleak ignore a known leak
	lib/genalloc.c: include vmalloc.h
	mtd: Check add_mtd_device() ret code
	tipc: fix memory leak in tipc_nl_compat_publ_dump
	net/core/neighbour: tell kmemleak about hash tables
	ata: ahci: mvebu: do Armada 38x configuration only on relevant SoCs
	PCI/MSI: Return -ENOSPC from pci_alloc_irq_vectors_affinity()
	net/core/neighbour: fix kmemleak minimal reference count for hash tables
	serial: 8250: Fix serial8250 initialization crash
	gpu: ipu-v3: pre: don't trigger update if buffer address doesn't change
	sfc: suppress duplicate nvmem partition types in efx_ef10_mtd_probe
	ip_tunnel: Make none-tunnel-dst tunnel port work with lwtunnel
	decnet: fix DN_IFREQ_SIZE
	net/smc: prevent races between smc_lgr_terminate() and smc_conn_free()
	net/smc: don't wait for send buffer space when data was already sent
	mm/hotplug: invalid PFNs from pfn_to_online_page()
	xfs: end sync buffer I/O properly on shutdown error
	net/smc: fix sender_free computation
	blktrace: Show requests without sector
	net/smc: fix byte_order for rx_curs_confirmed
	tipc: fix skb may be leaky in tipc_link_input
	ASoC: samsung: i2s: Fix prescaler setting for the secondary DAI
	sfc: initialise found bitmap in efx_ef10_mtd_probe
	geneve: change NET_UDP_TUNNEL dependency to select
	net: fix possible overflow in __sk_mem_raise_allocated()
	net: ip_gre: do not report erspan_ver for gre or gretap
	net: ip6_gre: do not report erspan_ver for ip6gre or ip6gretap
	sctp: don't compare hb_timer expire date before starting it
	bpf: decrease usercnt if bpf_map_new_fd() fails in bpf_map_get_fd_by_id()
	mmc: core: align max segment size with logical block size
	net: dev: Use unsigned integer as an argument to left-shift
	kvm: properly check debugfs dentry before using it
	bpf: drop refcount if bpf_map_new_fd() fails in map_create()
	net: hns3: Change fw error code NOT_EXEC to NOT_SUPPORTED
	net: hns3: fix PFC not setting problem for DCB module
	net: hns3: fix an issue for hclgevf_ae_get_hdev
	net: hns3: fix an issue for hns3_update_new_int_gl
	iommu/amd: Fix NULL dereference bug in match_hid_uid
	apparmor: delete the dentry in aafs_remove() to avoid a leak
	scsi: libsas: Support SATA PHY connection rate unmatch fixing during discovery
	ACPI / APEI: Don't wait to serialise with oops messages when panic()ing
	ACPI / APEI: Switch estatus pool to use vmalloc memory
	scsi: hisi_sas: shutdown axi bus to avoid exception CQ returned
	scsi: libsas: Check SMP PHY control function result
	RDMA/hns: Fix the bug with updating rq head pointer when flush cqe
	RDMA/hns: Bugfix for the scene without receiver queue
	RDMA/hns: Fix the state of rereg mr
	RDMA/hns: Use GFP_ATOMIC in hns_roce_v2_modify_qp
	ASoC: rt5645: Headphone Jack sense inverts on the LattePanda board
	powerpc/pseries/dlpar: Fix a missing check in dlpar_parse_cc_property()
	xdp: fix cpumap redirect SKB creation bug
	mtd: Remove a debug trace in mtdpart.c
	mm, gup: add missing refcount overflow checks on s390
	clk: at91: fix update bit maps on CFG_MOR write
	clk: at91: generated: set audio_pll_allowed in at91_clk_register_generated()
	usb: dwc2: use a longer core rest timeout in dwc2_core_reset()
	staging: rtl8192e: fix potential use after free
	staging: rtl8723bs: Drop ACPI device ids
	staging: rtl8723bs: Add 024c:0525 to the list of SDIO device-ids
	USB: serial: ftdi_sio: add device IDs for U-Blox C099-F9P
	mei: bus: prefix device names on bus with the bus name
	mei: me: add comet point V device id
	thunderbolt: Power cycle the router if NVM authentication fails
	xfrm: Fix memleak on xfrm state destroy
	media: v4l2-ctrl: fix flags for DO_WHITE_BALANCE
	net: macb: fix error format in dev_err()
	pwm: Clear chip_data in pwm_put()
	media: atmel: atmel-isc: fix asd memory allocation
	media: atmel: atmel-isc: fix INIT_WORK misplacement
	macvlan: schedule bc_work even if error
	net: psample: fix skb_over_panic
	openvswitch: fix flow command message size
	sctp: Fix memory leak in sctp_sf_do_5_2_4_dupcook
	slip: Fix use-after-free Read in slip_open
	openvswitch: drop unneeded BUG_ON() in ovs_flow_cmd_build_info()
	openvswitch: remove another BUG_ON()
	selftests: bpf: test_sockmap: handle file creation failures gracefully
	tipc: fix link name length check
	sctp: cache netns in sctp_ep_common
	net: sched: fix `tc -s class show` no bstats on class with nolock subqueues
	net: macb: add missed tasklet_kill
	ext4: add more paranoia checking in ext4_expand_extra_isize handling
	watchdog: sama5d4: fix WDD value to be always set to max
	net: macb: Fix SUBNS increment and increase resolution
	net: macb driver, check for SKBTX_HW_TSTAMP
	mtd: rawnand: atmel: Fix spelling mistake in error message
	mtd: rawnand: atmel: fix possible object reference leak
	mtd: spi-nor: cast to u64 to avoid uint overflows
	drm/atmel-hlcdc: revert shift by 8
	mailbox: stm32_ipcc: add spinlock to fix channels concurrent access
	tcp: exit if nothing to retransmit on RTO timeout
	HID: core: check whether Usage Page item is after Usage ID items
	crypto: stm32/hash - Fix hmac issue more than 256 bytes
	media: stm32-dcmi: fix DMA corruption when stopping streaming
	media: stm32-dcmi: fix check of pm_runtime_get_sync return value
	hwrng: stm32 - fix unbalanced pm_runtime_enable
	clk: stm32mp1: fix HSI divider flag
	clk: stm32mp1: fix mcu divider table
	clk: stm32mp1: add CLK_SET_RATE_NO_REPARENT to Kernel clocks
	clk: stm32mp1: parent clocks update
	mailbox: mailbox-test: fix null pointer if no mmio
	pinctrl: stm32: fix memory leak issue
	ASoC: stm32: i2s: fix dma configuration
	ASoC: stm32: i2s: fix 16 bit format support
	ASoC: stm32: i2s: fix IRQ clearing
	ASoC: stm32: sai: add missing put_device()
	dmaengine: stm32-dma: check whether length is aligned on FIFO threshold
	platform/x86: hp-wmi: Fix ACPI errors caused by too small buffer
	platform/x86: hp-wmi: Fix ACPI errors caused by passing 0 as input size
	net: fec: fix clock count mis-match
	Linux 4.19.88

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ifd3801a77cb551be72788031e7fcfc8a1d4fd197
2019-12-05 12:02:49 +01:00
Jesper Dangaard Brouer
7d8677ff10 xdp: fix cpumap redirect SKB creation bug
[ Upstream commit 676e4a6fe7 ]

We want to avoid leaking pointer info from xdp_frame (that is placed in
top of frame) like commit 6dfb970d3d ("xdp: avoid leaking info stored in
frame data on page reuse"), and followup commit 97e19cce05 ("bpf:
reserve xdp_frame size in xdp headroom") that reserve this headroom.

These changes also affected how cpumap constructed SKBs, as xdpf->headroom
size changed, the skb data starting point were in-effect shifted with 32
bytes (sizeof xdp_frame). This was still okay, as the cpumap frame_size
calculation also included xdpf->headroom which were reduced by same amount.

A bug was introduced in commit 77ea5f4cbe ("bpf/cpumap: make sure
frame_size for build_skb is aligned if headroom isn't"), where the
xdpf->headroom became part of the SKB_DATA_ALIGN rounding up. This
round-up to find the frame_size is in principle still correct as it does
not exceed the 2048 bytes frame_size (which is max for ixgbe and i40e),
but the 32 bytes offset of pkt_data_start puts this over the 2048 bytes
limit. This cause skb_shared_info to spill into next frame. It is a little
hard to trigger, as the SKB need to use above 15 skb_shinfo->frags[] as
far as I calculate. This does happen in practise for TCP streams when
skb_try_coalesce() kicks in.

KASAN can be used to detect these wrong memory accesses, I've seen:
 BUG: KASAN: use-after-free in skb_try_coalesce+0x3cb/0x760
 BUG: KASAN: wild-memory-access in skb_release_data+0xe2/0x250

Driver veth also construct a SKB from xdp_frame in this way, but is not
affected, as it doesn't reserve/deduct the room (used by xdp_frame) from
the SKB headroom. Instead is clears the pointers via xdp_scrub_frame(),
and allows SKB to use this area.

The fix in this patch is to do like veth and instead allow SKB to (re)use
the area occupied by xdp_frame, by clearing via xdp_scrub_frame().  (This
does kill the idea of the SKB being able to access (mem) info from this
area, but I guess it was a bad idea anyhow, and it was already killed by
the veth changes.)

Fixes: 77ea5f4cbe ("bpf/cpumap: make sure frame_size for build_skb is aligned if headroom isn't")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-05 09:21:24 +01:00
Peng Sun
1a5df073d0 bpf: drop refcount if bpf_map_new_fd() fails in map_create()
[ Upstream commit 352d20d611 ]

In bpf/syscall.c, map_create() first set map->usercnt to 1, a file
descriptor is supposed to return to userspace. When bpf_map_new_fd()
fails, drop the refcount.

Fixes: bd5f5f4ecb ("bpf: Add BPF_MAP_GET_FD_BY_ID")
Signed-off-by: Peng Sun <sironhide0null@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-05 09:21:15 +01:00
Peng Sun
944df7b4b9 bpf: decrease usercnt if bpf_map_new_fd() fails in bpf_map_get_fd_by_id()
[ Upstream commit 781e62823c ]

In bpf/syscall.c, bpf_map_get_fd_by_id() use bpf_map_inc_not_zero()
to increase the refcount, both map->refcnt and map->usercnt. Then, if
bpf_map_new_fd() fails, should handle map->usercnt too.

Fixes: bd5f5f4ecb ("bpf: Add BPF_MAP_GET_FD_BY_ID")
Signed-off-by: Peng Sun <sironhide0null@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-05 09:21:13 +01:00
Yi Wang
fc38279ec5 fork: fix some -Wmissing-prototypes warnings
[ Upstream commit fb5bf31722 ]

We get a warning when building kernel with W=1:

  kernel/fork.c:167:13: warning: no previous prototype for `arch_release_thread_stack' [-Wmissing-prototypes]
  kernel/fork.c:779:13: warning: no previous prototype for `fork_init' [-Wmissing-prototypes]

Add the missing declaration in head file to fix this.

Also, remove arch_release_thread_stack() completely because no arch
seems to implement it since bb9d81264 (arch: remove tile port).

Link: http://lkml.kernel.org/r/1542170087-23645-1-git-send-email-wang.yi59@zte.com.cn
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-05 09:21:04 +01:00
Jesper Dangaard Brouer
f0910af753 bpf/cpumap: make sure frame_size for build_skb is aligned if headroom isn't
[ Upstream commit 77ea5f4cbe ]

The frame_size passed to build_skb must be aligned, else it is
possible that the embedded struct skb_shared_info gets unaligned.

For correctness make sure that xdpf->headroom in included in the
alignment. No upstream drivers can hit this, as all XDP drivers provide
an aligned headroom.  This was discovered when playing with implementing
XDP support for mvneta, which have a 2 bytes DSA header, and this
Marvell ARM64 platform didn't like doing atomic operations on an
unaligned skb_shinfo(skb)->dataref addresses.

Fixes: 1c601d829a ("bpf: cpumap xdp_buff to skb conversion and allocation")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-05 09:20:46 +01:00
Masami Hiramatsu
98cf16d33c kprobes: Blacklist symbols in arch-defined prohibited area
[ Upstream commit fb1a59fae8 ]

Blacklist symbols in arch-defined probe-prohibited areas.
With this change, user can see all symbols which are prohibited
to probe in debugfs.

All archtectures which have custom prohibit areas should define
its own arch_populate_kprobe_blacklist() function, but unless that,
all symbols marked __kprobes are blacklisted.

Reported-by: Andrea Righi <righi.andrea@gmail.com>
Tested-by: Andrea Righi <righi.andrea@gmail.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yonghong Song <yhs@fb.com>
Link: http://lkml.kernel.org/r/154503485491.26176.15823229545155174796.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-05 09:20:26 +01:00
Masami Hiramatsu
dee3f77032 tracing: Lock event_mutex before synth_event_mutex
[ Upstream commit fc800a10be ]

synthetic event is using synth_event_mutex for protecting
synth_event_list, and event_trigger_write() path acquires
locks as below order.

event_trigger_write(event_mutex)
  ->trigger_process_regex(trigger_cmd_mutex)
    ->event_hist_trigger_func(synth_event_mutex)

On the other hand, synthetic event creation and deletion paths
call trace_add_event_call() and trace_remove_event_call()
which acquires event_mutex. In that case, if we keep the
synth_event_mutex locked while registering/unregistering synthetic
events, its dependency will be inversed.

To avoid this issue, current synthetic event is using a 2 phase
process to create/delete events. For example, it searches existing
events under synth_event_mutex to check for event-name conflicts, and
unlocks synth_event_mutex, then registers a new event under event_mutex
locked. Finally, it locks synth_event_mutex and tries to add the
new event to the list. But it can introduce complexity and a chance
for name conflicts.

To solve this simpler, this introduces trace_add_event_call_nolock()
and trace_remove_event_call_nolock() which don't acquire
event_mutex inside. synthetic event can lock event_mutex before
synth_event_mutex to solve the lock dependency issue simpler.

Link: http://lkml.kernel.org/r/154140844377.17322.13781091165954002713.stgit@devbox

Reviewed-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Tested-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-05 09:19:49 +01:00
Greg Kroah-Hartman
2700cf837e This is the 4.19.87 stable release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl3jdysACgkQONu9yGCS
 aT49mhAAxl50oRh0bSk/SikbVCn7kRaoRNlCBsPtvtvbDDIpyREIzMRmIbH2OZjS
 ji1umUu8iMj+HXW72dWNqPo2K337UzcNRsUXWAdJH8Et+ao+xOUV1jps/Zr3D9ca
 2Cw6iTn2QFhKQythMlCxb30sf+kN4cAU1XY3M8xEWXB+4nAqc/aFW7mRAP1jusRb
 1DAW+xqPiwCbaag1v5OzumAOGBhpmTcX8sfEYM+3DKcPgGL1jPyYeWlXA26nih8/
 LQR6r2tAb454pipV0uApJ2u7V5nNxprcrfUNDmAfap2q/eF1w5pBbZoS5sqpf1eZ
 2ycZ36w0ThE7lJKvNrfjq13Su+bGtpENxHlwesNPbsvz0F4xoEkelYSCE1gJBaHX
 CZvq1Dhrk5DBvkCCElV8+CJxxuhMUZwzOwrz2iBLdPnpCpSgj8uNPLXMJno6D9fH
 PZMCcBFf4WnCUBc06fB7qG+Z7y0TeAuLsNQmK3zYQoEc1gz4Yk5aFo7NgTyajqbD
 YKINVP2Wj11TDBssolIA58EYcc2J38As54wuOvYtwy+k/mVkvZVhCPOtI+h3UYm4
 lX2ROCHTzt+Av5qFlM8aSBcIlm1qihzSHEdnyqX2EZvUGC4C5Mc5/Eml3QJxAnVh
 SzUfLZGzzfntpnWn2cJZ/EA/p6hXujG5k95LEwtAnxxYBFpLTKc=
 =d8kA
 -----END PGP SIGNATURE-----

Merge 4.19.87 into android-4.19

Changes in 4.19.87
	mlxsw: spectrum_router: Fix determining underlay for a GRE tunnel
	net/mlx4_en: fix mlx4 ethtool -N insertion
	net/mlx4_en: Fix wrong limitation for number of TX rings
	net: rtnetlink: prevent underflows in do_setvfinfo()
	net/sched: act_pedit: fix WARN() in the traffic path
	net: sched: ensure opts_len <= IP_TUNNEL_OPTS_MAX in act_tunnel_key
	sfc: Only cancel the PPS workqueue if it exists
	net/mlx5e: Fix set vf link state error flow
	net/mlxfw: Verify FSM error code translation doesn't exceed array size
	net/mlx5: Fix auto group size calculation
	vhost/vsock: split packets to send using multiple buffers
	gpio: max77620: Fixup debounce delays
	tools: gpio: Correctly add make dependencies for gpio_utils
	nbd:fix memory leak in nbd_get_socket()
	virtio_console: allocate inbufs in add_port() only if it is needed
	Revert "fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()"
	mm/ksm.c: don't WARN if page is still mapped in remove_stable_node()
	drm/amd/powerplay: issue no PPSMC_MSG_GetCurrPkgPwr on unsupported ASICs
	drm/i915/pmu: "Frequency" is reported as accumulated cycles
	drm/i915/userptr: Try to acquire the page lock around set_page_dirty()
	mwifiex: Fix NL80211_TX_POWER_LIMITED
	ALSA: isight: fix leak of reference to firewire unit in error path of .probe callback
	crypto: testmgr - fix sizeof() on COMP_BUF_SIZE
	printk: lock/unlock console only for new logbuf entries
	printk: fix integer overflow in setup_log_buf()
	pinctrl: madera: Fix uninitialized variable bug in madera_mux_set_mux
	PCI: cadence: Write MSI data with 32bits
	gfs2: Fix marking bitmaps non-full
	pty: fix compat ioctls
	synclink_gt(): fix compat_ioctl()
	powerpc: Fix signedness bug in update_flash_db()
	powerpc/boot: Fix opal console in boot wrapper
	powerpc/boot: Disable vector instructions
	powerpc/eeh: Fix null deref for devices removed during EEH
	powerpc/eeh: Fix use of EEH_PE_KEEP on wrong field
	EDAC, thunderx: Fix memory leak in thunderx_l2c_threaded_isr()
	mt76: do not store aggregation sequence number for null-data frames
	mt76x0: phy: fix restore phase in mt76x0_phy_recalibrate_after_assoc
	brcmsmac: AP mode: update beacon when TIM changes
	ath10k: set probe request oui during driver start
	ath10k: allocate small size dma memory in ath10k_pci_diag_write_mem
	skd: fixup usage of legacy IO API
	cdrom: don't attempt to fiddle with cdo->capability
	spi: sh-msiof: fix deferred probing
	mmc: mediatek: fill the actual clock for mmc debugfs
	mmc: mediatek: fix cannot receive new request when msdc_cmd_is_ready fail
	PCI: mediatek: Fix class type for MT7622 to PCI_CLASS_BRIDGE_PCI
	btrfs: defrag: use btrfs_mod_outstanding_extents in cluster_pages_for_defrag
	btrfs: handle error of get_old_root
	gsmi: Fix bug in append_to_eventlog sysfs handler
	misc: mic: fix a DMA pool free failure
	w1: IAD Register is yet readable trough iad sys file. Fix snprintf (%u for unsigned, count for max size).
	m68k: fix command-line parsing when passed from u-boot
	scsi: hisi_sas: Feed back linkrate(max/min) when re-attached
	scsi: hisi_sas: Fix the race between IO completion and timeout for SMP/internal IO
	scsi: hisi_sas: Free slot later in slot_complete_vx_hw()
	RDMA/bnxt_re: Avoid NULL check after accessing the pointer
	RDMA/bnxt_re: Fix qp async event reporting
	RDMA/bnxt_re: Avoid resource leak in case the NQ registration fails
	pinctrl: sunxi: Fix a memory leak in 'sunxi_pinctrl_build_state()'
	pwm: lpss: Only set update bit if we are actually changing the settings
	amiflop: clean up on errors during setup
	qed: Align local and global PTT to propagate through the APIs.
	scsi: ips: fix missing break in switch
	nfp: bpf: protect against mis-initializing atomic counters
	KVM: nVMX: reset cache/shadows when switching loaded VMCS
	KVM: nVMX: move check_vmentry_postreqs() call to nested_vmx_enter_non_root_mode()
	KVM/x86: Fix invvpid and invept register operand size in 64-bit mode
	clk: tegra: Fixes for MBIST work around
	scsi: isci: Use proper enumerated type in atapi_d2h_reg_frame_handler
	scsi: isci: Change sci_controller_start_task's return type to sci_status
	scsi: bfa: Avoid implicit enum conversion in bfad_im_post_vendor_event
	scsi: iscsi_tcp: Explicitly cast param in iscsi_sw_tcp_host_get_param
	crypto: ccree - avoid implicit enum conversion
	nvmet: avoid integer overflow in the discard code
	nvmet-fcloop: suppress a compiler warning
	nvme-pci: fix hot removal during error handling
	PCI: mediatek: Fixup MSI enablement logic by enabling MSI before clocks
	clk: mmp2: fix the clock id for sdh2_clk and sdh3_clk
	clk: at91: audio-pll: fix audio pmc type
	ASoC: tegra_sgtl5000: fix device_node refcounting
	scsi: dc395x: fix dma API usage in srb_done
	scsi: dc395x: fix DMA API usage in sg_update_list
	scsi: zorro_esp: Limit DMA transfers to 65535 bytes
	net: dsa: mv88e6xxx: Fix 88E6141/6341 2500mbps SERDES speed
	net: fix warning in af_unix
	net: ena: Fix Kconfig dependency on X86
	xfs: fix use-after-free race in xfs_buf_rele
	xfs: clear ail delwri queued bufs on unmount of shutdown fs
	kprobes, x86/ptrace.h: Make regs_get_kernel_stack_nth() not fault on bad stack
	ACPI / scan: Create platform device for INT33FE ACPI nodes
	PM / Domains: Deal with multiple states but no governor in genpd
	ALSA: i2c/cs8427: Fix int to char conversion
	macintosh/windfarm_smu_sat: Fix debug output
	PCI: vmd: Detach resources after stopping root bus
	USB: misc: appledisplay: fix backlight update_status return code
	usbip: tools: fix atoi() on non-null terminated string
	sctp: use sk_wmem_queued to check for writable space
	dm raid: avoid bitmap with raid4/5/6 journal device
	selftests/bpf: fix file resource leak in load_kallsyms
	SUNRPC: Fix a compile warning for cmpxchg64()
	sunrpc: safely reallow resvport min/max inversion
	atm: zatm: Fix empty body Clang warnings
	s390/perf: Return error when debug_register fails
	swiotlb: do not panic on mapping failures
	spi: omap2-mcspi: Set FIFO DMA trigger level to word length
	x86/intel_rdt: Prevent pseudo-locking from using stale pointers
	sparc: Fix parport build warnings.
	scsi: hisi_sas: Fix NULL pointer dereference
	powerpc/pseries: Export raw per-CPU VPA data via debugfs
	powerpc/mm/radix: Fix off-by-one in split mapping logic
	powerpc/mm/radix: Fix overuse of small pages in splitting logic
	powerpc/mm/radix: Fix small page at boundary when splitting
	powerpc/64s/radix: Fix radix__flush_tlb_collapsed_pmd double flushing pmd
	selftests/bpf: fix return value comparison for tests in test_libbpf.sh
	tools: bpftool: fix completion for "bpftool map update"
	ceph: fix dentry leak in ceph_readdir_prepopulate
	ceph: only allow punch hole mode in fallocate
	rtc: s35390a: Change buf's type to u8 in s35390a_init
	RISC-V: Avoid corrupting the upper 32-bit of phys_addr_t in ioremap
	thermal: armada: fix a test in probe()
	f2fs: fix to spread clear_cold_data()
	f2fs: spread f2fs_set_inode_flags()
	mISDN: Fix type of switch control variable in ctrl_teimanager
	qlcnic: fix a return in qlcnic_dcb_get_capability()
	net: ethernet: ti: cpsw: unsync mcast entries while switch promisc mode
	mfd: arizona: Correct calling of runtime_put_sync
	mfd: mc13xxx-core: Fix PMIC shutdown when reading ADC values
	mfd: intel_soc_pmic_bxtwc: Chain power button IRQs as well
	mfd: max8997: Enale irq-wakeup unconditionally
	net: socionext: Stop PHY before resetting netsec
	fs/cifs: fix uninitialised variable warnings
	spi: uniphier: fix incorrect property items
	selftests/ftrace: Fix to test kprobe $comm arg only if available
	selftests: watchdog: fix message when /dev/watchdog open fails
	selftests: watchdog: Fix error message.
	selftests: kvm: Fix -Wformat warnings
	selftests: fix warning: "_GNU_SOURCE" redefined
	thermal: rcar_thermal: fix duplicate IRQ request
	thermal: rcar_thermal: Prevent hardware access during system suspend
	net: ethernet: cadence: fix socket buffer corruption problem
	bpf: devmap: fix wrong interface selection in notifier_call
	bpf, btf: fix a missing check bug in btf_parse
	powerpc/process: Fix flush_all_to_thread for SPE
	sparc64: Rework xchg() definition to avoid warnings.
	arm64: lib: use C string functions with KASAN enabled
	fs/ocfs2/dlm/dlmdebug.c: fix a sleep-in-atomic-context bug in dlm_print_one_mle()
	mm/page-writeback.c: fix range_cyclic writeback vs writepages deadlock
	tools/testing/selftests/vm/gup_benchmark.c: fix 'write' flag usage
	mm: thp: fix MADV_DONTNEED vs migrate_misplaced_transhuge_page race condition
	macsec: update operstate when lower device changes
	macsec: let the administrator set UP state even if lowerdev is down
	block: fix the DISCARD request merge
	i2c: uniphier-f: make driver robust against concurrency
	i2c: uniphier-f: fix occasional timeout error
	i2c: uniphier-f: fix race condition when IRQ is cleared
	um: Make line/tty semantics use true write IRQ
	vfs: avoid problematic remapping requests into partial EOF block
	ipv4/igmp: fix v1/v2 switchback timeout based on rfc3376, 8.12
	powerpc/xmon: Relax frame size for clang
	selftests/powerpc/ptrace: Fix out-of-tree build
	selftests/powerpc/signal: Fix out-of-tree build
	selftests/powerpc/switch_endian: Fix out-of-tree build
	selftests/powerpc/cache_shape: Fix out-of-tree build
	block: call rq_qos_exit() after queue is frozen
	mm/gup_benchmark.c: prevent integer overflow in ioctl
	linux/bitmap.h: handle constant zero-size bitmaps correctly
	linux/bitmap.h: fix type of nbits in bitmap_shift_right()
	lib/bitmap.c: fix remaining space computation in bitmap_print_to_pagebuf
	hfsplus: fix BUG on bnode parent update
	hfs: fix BUG on bnode parent update
	hfsplus: prevent btree data loss on ENOSPC
	hfs: prevent btree data loss on ENOSPC
	hfsplus: fix return value of hfsplus_get_block()
	hfs: fix return value of hfs_get_block()
	hfsplus: update timestamps on truncate()
	hfs: update timestamp on truncate()
	fs/hfs/extent.c: fix array out of bounds read of array extent
	kernel/panic.c: do not append newline to the stack protector panic string
	mm/memory_hotplug: make add_memory() take the device_hotplug_lock
	mm/memory_hotplug: fix online/offline_pages called w.o. mem_hotplug_lock
	powerpc/powernv: hold device_hotplug_lock when calling device_online()
	igb: shorten maximum PHC timecounter update interval
	fm10k: ensure completer aborts are marked as non-fatal after a resume
	net: hns3: bugfix for buffer not free problem during resetting
	net: hns3: bugfix for reporting unknown vector0 interrupt repeatly problem
	net: hns3: bugfix for is_valid_csq_clean_head()
	net: hns3: bugfix for hclge_mdio_write and hclge_mdio_read
	ntb_netdev: fix sleep time mismatch
	ntb: intel: fix return value for ndev_vec_mask()
	irq/matrix: Fix memory overallocation
	nvme-pci: fix conflicting p2p resource adds
	arm64: makefile fix build of .i file in external module case
	tools/power turbosat: fix AMD APIC-id output
	mm: handle no memcg case in memcg_kmem_charge() properly
	ocfs2: without quota support, avoid calling quota recovery
	ocfs2: don't use iocb when EIOCBQUEUED returns
	ocfs2: don't put and assigning null to bh allocated outside
	ocfs2: fix clusters leak in ocfs2_defrag_extent()
	net: do not abort bulk send on BQL status
	sched/topology: Fix off by one bug
	sched/fair: Don't increase sd->balance_interval on newidle balance
	openvswitch: fix linking without CONFIG_NF_CONNTRACK_LABELS
	ARM: dts: imx6sx-sdb: Fix enet phy regulator
	clk: sunxi-ng: enable so-said LDOs for A64 SoC's pll-mipi clock
	soc: bcm: brcmstb: Fix re-entry point with a THUMB2_KERNEL
	audit: print empty EXECVE args
	sock_diag: fix autoloading of the raw_diag module
	net: bpfilter: fix iptables failure if bpfilter_umh is disabled
	nds32: Fix bug in bitfield.h
	media: ov13858: Check for possible null pointer
	btrfs: avoid link error with CONFIG_NO_AUTO_INLINE
	wil6210: fix debugfs memory access alignment
	wil6210: fix L2 RX status handling
	wil6210: fix RGF_CAF_ICR address for Talyn-MB
	wil6210: fix locking in wmi_call
	ath10k: snoc: fix unbalanced clock error handling
	wlcore: Fix the return value in case of error in 'wlcore_vendor_cmd_smart_config_start()'
	rtl8xxxu: Fix missing break in switch
	brcmsmac: never log "tid x is not agg'able" by default
	wireless: airo: potential buffer overflow in sprintf()
	rtlwifi: rtl8192de: Fix misleading REG_MCUFWDL information
	net: dsa: bcm_sf2: Turn on PHY to allow successful registration
	scsi: mpt3sas: Fix Sync cache command failure during driver unload
	scsi: mpt3sas: Don't modify EEDPTagMode field setting on SAS3.5 HBA devices
	scsi: mpt3sas: Fix driver modifying persistent data in Manufacturing page11
	scsi: megaraid_sas: Fix msleep granularity
	scsi: megaraid_sas: Fix goto labels in error handling
	scsi: lpfc: fcoe: Fix link down issue after 1000+ link bounces
	scsi: lpfc: Fix odd recovery in duplicate FLOGIs in point-to-point
	scsi: lpfc: Correct loss of fc4 type on remote port address change
	usb: typec: tcpm: charge current handling for sink during hard reset
	dlm: fix invalid free
	dlm: don't leak kernel pointer to userspace
	vrf: mark skb for multicast or link-local as enslaved to VRF
	clk: tegra20: Turn EMC clock gate into divider
	ACPICA: Use %d for signed int print formatting instead of %u
	net: bcmgenet: return correct value 'ret' from bcmgenet_power_down
	of: unittest: allow base devicetree to have symbol metadata
	of: unittest: initialize args before calling of_*parse_*()
	tools: bpftool: pass an argument to silence open_obj_pinned()
	cfg80211: Prevent regulatory restore during STA disconnect in concurrent interfaces
	pinctrl: qcom: spmi-gpio: fix gpio-hog related boot issues
	pinctrl: bcm2835: Use define directive for BCM2835_PINCONF_PARAM_PULL
	pinctrl: lpc18xx: Use define directive for PIN_CONFIG_GPIO_PIN_INT
	pinctrl: zynq: Use define directive for PIN_CONFIG_IO_STANDARD
	PCI: keystone: Use quirk to limit MRRS for K2G
	nvme-pci: fix surprise removal
	spi: omap2-mcspi: Fix DMA and FIFO event trigger size mismatch
	i2c: uniphier-f: fix timeout error after reading 8 bytes
	mm/memory_hotplug: Do not unlock when fails to take the device_hotplug_lock
	ipv6: Fix handling of LLA with VRF and sockets bound to VRF
	cfg80211: call disconnect_wk when AP stops
	mm/page_io.c: do not free shared swap slots
	Bluetooth: Fix invalid-free in bcsp_close()
	KVM: MMU: Do not treat ZONE_DEVICE pages as being reserved
	ath10k: Fix a NULL-ptr-deref bug in ath10k_usb_alloc_urb_from_pipe
	ath9k_hw: fix uninitialized variable data
	md/raid10: prevent access of uninitialized resync_pages offset
	mm/memory_hotplug: don't access uninitialized memmaps in shrink_zone_span()
	net: phy: dp83867: fix speed 10 in sgmii mode
	net: phy: dp83867: increase SGMII autoneg timer duration
	ocfs2: remove ocfs2_is_o2cb_active()
	ARM: 8904/1: skip nomap memblocks while finding the lowmem/highmem boundary
	ARC: perf: Accommodate big-endian CPU
	x86/insn: Fix awk regexp warnings
	x86/speculation: Fix incorrect MDS/TAA mitigation status
	x86/speculation: Fix redundant MDS mitigation message
	nbd: prevent memory leak
	y2038: futex: Move compat implementation into futex.c
	futex: Prevent robust futex exit race
	ALSA: usb-audio: Fix NULL dereference at parsing BADD
	nfc: port100: handle command failure cleanly
	media: vivid: Set vid_cap_streaming and vid_out_streaming to true
	media: vivid: Fix wrong locking that causes race conditions on streaming stop
	media: usbvision: Fix races among open, close, and disconnect
	cpufreq: Add NULL checks to show() and store() methods of cpufreq
	media: uvcvideo: Fix error path in control parsing failure
	media: b2c2-flexcop-usb: add sanity checking
	media: cxusb: detect cxusb_ctrl_msg error in query
	media: imon: invalid dereference in imon_touch_event
	virtio_ring: fix return code on DMA mapping fails
	USBIP: add config dependency for SGL_ALLOC
	usbip: tools: fix fd leakage in the function of read_attr_usbip_status
	usbip: Fix uninitialized symbol 'nents' in stub_recv_cmd_submit()
	usb-serial: cp201x: support Mark-10 digital force gauge
	USB: chaoskey: fix error case of a timeout
	appledisplay: fix error handling in the scheduled work
	USB: serial: mos7840: add USB ID to support Moxa UPort 2210
	USB: serial: mos7720: fix remote wakeup
	USB: serial: mos7840: fix remote wakeup
	USB: serial: option: add support for DW5821e with eSIM support
	USB: serial: option: add support for Foxconn T77W968 LTE modules
	staging: comedi: usbduxfast: usbduxfast_ai_cmdtest rounding error
	powerpc/64s: support nospectre_v2 cmdline option
	powerpc/book3s64: Fix link stack flush on context switch
	KVM: PPC: Book3S HV: Flush link stack on guest exit to host kernel
	PM / devfreq: Fix kernel oops on governor module load
	Linux 4.19.87

Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Id8c8d4cd92227f8f46c48c05440e09da957fa687
2019-12-01 09:53:43 +01:00
Yang Tao
2819f4030f futex: Prevent robust futex exit race
commit ca16d5bee5 upstream.

Robust futexes utilize the robust_list mechanism to allow the kernel to
release futexes which are held when a task exits. The exit can be voluntary
or caused by a signal or fault. This prevents that waiters block forever.

The futex operations in user space store a pointer to the futex they are
either locking or unlocking in the op_pending member of the per task robust
list.

After a lock operation has succeeded the futex is queued in the robust list
linked list and the op_pending pointer is cleared.

After an unlock operation has succeeded the futex is removed from the
robust list linked list and the op_pending pointer is cleared.

The robust list exit code checks for the pending operation and any futex
which is queued in the linked list. It carefully checks whether the futex
value is the TID of the exiting task. If so, it sets the OWNER_DIED bit and
tries to wake up a potential waiter.

This is race free for the lock operation but unlock has two race scenarios
where waiters might not be woken up. These issues can be observed with
regular robust pthread mutexes. PI aware pthread mutexes are not affected.

(1) Unlocking task is killed after unlocking the futex value in user space
    before being able to wake a waiter.

        pthread_mutex_unlock()
                |
                V
        atomic_exchange_rel (&mutex->__data.__lock, 0)
                        <------------------------killed
            lll_futex_wake ()                   |
                                                |
                                                |(__lock = 0)
                                                |(enter kernel)
                                                |
                                                V
                                            do_exit()
                                            exit_mm()
                                          mm_release()
                                        exit_robust_list()
                                        handle_futex_death()
                                                |
                                                |(__lock = 0)
                                                |(uval = 0)
                                                |
                                                V
        if ((uval & FUTEX_TID_MASK) != task_pid_vnr(curr))
                return 0;

    The sanity check which ensures that the user space futex is owned by
    the exiting task prevents the wakeup of waiters which in consequence
    block infinitely.

(2) Waiting task is killed after a wakeup and before it can acquire the
    futex in user space.

        OWNER                         WAITER
				futex_wait()
   pthread_mutex_unlock()               |
                |                       |
                |(__lock = 0)           |
                |                       |
                V                       |
         futex_wake() ------------>  wakeup()
                                        |
                                        |(return to userspace)
                                        |(__lock = 0)
                                        |
                                        V
                        oldval = mutex->__data.__lock
                                          <-----------------killed
    atomic_compare_and_exchange_val_acq (&mutex->__data.__lock,  |
                        id | assume_other_futex_waiters, 0)      |
                                                                 |
                                                                 |
                                                   (enter kernel)|
                                                                 |
                                                                 V
                                                         do_exit()
                                                        |
                                                        |
                                                        V
                                        handle_futex_death()
                                        |
                                        |(__lock = 0)
                                        |(uval = 0)
                                        |
                                        V
        if ((uval & FUTEX_TID_MASK) != task_pid_vnr(curr))
                return 0;

    The sanity check which ensures that the user space futex is owned
    by the exiting task prevents the wakeup of waiters, which seems to
    be correct as the exiting task does not own the futex value, but
    the consequence is that other waiters wont be woken up and block
    infinitely.

In both scenarios the following conditions are true:

   - task->robust_list->list_op_pending != NULL
   - user space futex value == 0
   - Regular futex (not PI)

If these conditions are met then it is reasonably safe to wake up a
potential waiter in order to prevent the above problems.

As this might be a false positive it can cause spurious wakeups, but the
waiter side has to handle other types of unrelated wakeups, e.g. signals
gracefully anyway. So such a spurious wakeup will not affect the
correctness of these operations.

This workaround must not touch the user space futex value and cannot set
the OWNER_DIED bit because the lock value is 0, i.e. uncontended. Setting
OWNER_DIED in this case would result in inconsistent state and subsequently
in malfunction of the owner died handling in user space.

The rest of the user space state is still consistent as no other task can
observe the list_op_pending entry in the exiting tasks robust list.

The eventually woken up waiter will observe the uncontended lock value and
take it over.

[ tglx: Massaged changelog and comment. Made the return explicit and not
  	depend on the subsequent check and added constants to hand into
  	handle_futex_death() instead of plain numbers. Fixed a few coding
	style issues. ]

Fixes: 0771dfefc9 ("[PATCH] lightweight robust futexes: core")
Signed-off-by: Yang Tao <yang.tao172@zte.com.cn>
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1573010582-35297-1-git-send-email-wang.yi59@zte.com.cn
Link: https://lkml.kernel.org/r/20191106224555.943191378@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-01 09:17:38 +01:00
Arnd Bergmann
d3f8c58d70 y2038: futex: Move compat implementation into futex.c
commit 04e7712f44 upstream.

We are going to share the compat_sys_futex() handler between 64-bit
architectures and 32-bit architectures that need to deal with both 32-bit
and 64-bit time_t, and this is easier if both entry points are in the
same file.

In fact, most other system call handlers do the same thing these days, so
let's follow the trend here and merge all of futex_compat.c into futex.c.

In the process, a few minor changes have to be done to make sure everything
still makes sense: handle_futex_death() and futex_cmpxchg_enabled() become
local symbol, and the compat version of the fetch_robust_entry() function
gets renamed to compat_fetch_robust_entry() to avoid a symbol clash.

This is intended as a purely cosmetic patch, no behavior should
change.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-01 09:17:38 +01:00
Richard Guy Briggs
3c69a033b4 audit: print empty EXECVE args
[ Upstream commit ea956d8be9 ]

Empty executable arguments were being skipped when printing out the list
of arguments in an EXECVE record, making it appear they were somehow
lost.  Include empty arguments as an itemized empty string.

Reproducer:
	autrace /bin/ls "" "/etc"
	ausearch --start recent -m execve -i | grep EXECVE
	type=EXECVE msg=audit(10/03/2018 13:04:03.208:1391) : argc=3 a0=/bin/ls a2=/etc

With fix:
	type=EXECVE msg=audit(10/03/2018 21:51:38.290:194) : argc=3 a0=/bin/ls a1= a2=/etc
	type=EXECVE msg=audit(1538617898.290:194): argc=3 a0="/bin/ls" a1="" a2="/etc"

Passes audit-testsuite.  GH issue tracker at
https://github.com/linux-audit/audit-kernel/issues/99

Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
[PM: cleaned up the commit metadata]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-01 09:17:17 +01:00
Valentin Schneider
31bced01fe sched/fair: Don't increase sd->balance_interval on newidle balance
[ Upstream commit 3f130a37c4 ]

When load_balance() fails to move some load because of task affinity,
we end up increasing sd->balance_interval to delay the next periodic
balance in the hopes that next time we look, that annoying pinned
task(s) will be gone.

However, idle_balance() pays no attention to sd->balance_interval, yet
it will still lead to an increase in balance_interval in case of
pinned tasks.

If we're going through several newidle balances (e.g. we have a
periodic task), this can lead to a huge increase of the
balance_interval in a very small amount of time.

To prevent that, don't increase the balance interval when going
through a newidle balance.

This is a similar approach to what is done in commit 58b26c4c02
("sched: Increment cache_nice_tries only on periodic lb"), where we
disregard newidle balance and rely on periodic balance for more stable
results.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Dietmar.Eggemann@arm.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: patrick.bellasi@arm.com
Cc: vincent.guittot@linaro.org
Link: http://lkml.kernel.org/r/1537974727-30788-2-git-send-email-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-01 09:17:16 +01:00
Peter Zijlstra
ed023646c2 sched/topology: Fix off by one bug
[ Upstream commit 993f0b0510 ]

With the addition of the NUMA identity level, we increased @level by
one and will run off the end of the array in the distance sort loop.

Fixed: 051f3ca02e ("sched/topology: Introduce NUMA identity node sched domain")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-01 09:17:16 +01:00
Michael Kelley
0bbb8382db irq/matrix: Fix memory overallocation
[ Upstream commit 57f01796f1 ]

IRQ_MATRIX_SIZE is the number of longs needed for a bitmap, multiplied by
the size of a long, yielding a byte count. But it is used to size an array
of longs, which is way more memory than is needed.

Change IRQ_MATRIX_SIZE so it is just the number of longs needed and the
arrays come out the correct size.

Fixes: 2f75d9e1c9 ("genirq: Implement bitmap matrix allocator")
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: KY Srinivasan <kys@microsoft.com>
Link: https://lkml.kernel.org/r/1541032428-10392-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-01 09:17:13 +01:00
Borislav Petkov
023c071f10 kernel/panic.c: do not append newline to the stack protector panic string
[ Upstream commit 95c4fb78fb ]

... because panic() itself already does this. Otherwise you have
line-broken trailer:

  [    1.836965] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: pgd_alloc+0x29e/0x2a0
  [    1.836965]  ]---

Link: http://lkml.kernel.org/r/20181008202901.7894-1-bp@alien8.de
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-01 09:17:10 +01:00
Martin Lau
54299e1cf3 bpf, btf: fix a missing check bug in btf_parse
[ Upstream commit 4a6998aff8 ]

Wenwen Wang reported:

  In btf_parse(), the header of the user-space btf data 'btf_data'
  is firstly parsed and verified through btf_parse_hdr().
  In btf_parse_hdr(), the header is copied from user-space 'btf_data'
  to kernel-space 'btf->hdr' and then verified. If no error happens
  during the verification process, the whole data of 'btf_data',
  including the header, is then copied to 'data' in btf_parse(). It
  is obvious that the header is copied twice here. More importantly,
  no check is enforced after the second copy to make sure the headers
  obtained in these two copies are same. Given that 'btf_data' resides
  in the user space, a malicious user can race to modify the header
  between these two copies. By doing so, the user can inject
  inconsistent data, which can cause undefined behavior of the
  kernel and introduce potential security risk.

This issue is similar to the one fixed in commit 8af03d1ae2 ("bpf:
btf: Fix a missing check bug"). To fix it, this patch copies the user
'btf_data' *before* parsing / verifying the BTF header.

Fixes: 69b693f0ae ("bpf: btf: Introduce BPF Type Format (BTF)")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Co-developed-by: Wenwen Wang <wang6495@umn.edu>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-01 09:17:01 +01:00
Taehee Yoo
8044e741ee bpf: devmap: fix wrong interface selection in notifier_call
[ Upstream commit f592f80483 ]

The dev_map_notification() removes interface in devmap if
unregistering interface's ifindex is same.
But only checking ifindex is not enough because other netns can have
same ifindex. so that wrong interface selection could occurred.
Hence netdev pointer comparison code is added.

v2: compare netdev pointer instead of using net_eq() (Daniel Borkmann)
v1: Initial patch

Fixes: 2ddf71e23c ("net: add notifier hooks for devmap bpf map")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-01 09:17:01 +01:00
Christoph Hellwig
ad9a4e963c swiotlb: do not panic on mapping failures
[ Upstream commit 8088546832 ]

All properly written drivers now have error handling in the
dma_map_single / dma_map_page callers.  As swiotlb_tbl_map_single already
prints a useful warning when running out of swiotlb pool space we can
also remove swiotlb_full entirely as it serves no purpose now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-01 09:16:42 +01:00
Sergey Senozhatsky
4465a916ea printk: fix integer overflow in setup_log_buf()
[ Upstream commit d2130e82e9 ]

The way we calculate logbuf free space percentage overflows signed
integer:

	int free;

	free = __LOG_BUF_LEN - log_next_idx;
	pr_info("early log buf free: %u(%u%%)\n",
		free, (free * 100) / __LOG_BUF_LEN);

We support LOG_BUF_LEN of up to 1<<25 bytes. Since setup_log_buf() is
called during early init, logbuf is mostly empty, so

	__LOG_BUF_LEN - log_next_idx

is close to 1<<25. Thus when we multiply it by 100, we overflow signed
integer value range: 100 is 2^6 + 2^5 + 2^2.

Example, booting with LOG_BUF_LEN 1<<25 and log_buf_len=2G
boot param:

[    0.075317] log_buf_len: -2147483648 bytes
[    0.075319] early log buf free: 33549896(-28%)

Make "free" unsigned integer and use appropriate printk() specifier.

Link: http://lkml.kernel.org/r/20181010113308.9337-1-sergey.senozhatsky@gmail.com
To: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-01 09:16:14 +01:00
Sergey Senozhatsky
90d73768dd printk: lock/unlock console only for new logbuf entries
[ Upstream commit 3ac37a93fa ]

Prior to commit 5c2992ee7f ("printk: remove console flushing special
cases for partial buffered lines") we would do console_cont_flush()
for each pr_cont() to print cont fragments, so console_unlock() would
actually print data:

	pr_cont();
	 console_lock();
	 console_unlock()
	  console_cont_flush(); // print cont fragment
	...
	pr_cont();
	 console_lock();
	 console_unlock()
	  console_cont_flush(); // print cont fragment

We don't do console_cont_flush() anymore, so when we do pr_cont()
console_unlock() does nothing (unless we flushed the cont buffer):

	pr_cont();
	 console_lock();
	 console_unlock();      // noop
	...
	pr_cont();
	 console_lock();
	 console_unlock();      // noop
	...
	pr_cont();
	  cont_flush();
	    console_lock();
	    console_unlock();   // print data

We also wakeup klogd purposelessly for pr_cont() output - un-flushed
cont buffer is not stored in log_buf; there is nothing to pull.

Thus we can console_lock()/console_unlock()/wake_up_klogd() only when
we know that we log_store()-ed a message and there is something to
print to the consoles/syslog.

Link: http://lkml.kernel.org/r/20181002023836.4487-3-sergey.senozhatsky@gmail.com
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Tejun Heo <tj@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-01 09:16:13 +01:00
Sami Tolvanen
32518aee30 FROMLIST: scs: add support for stack usage debugging
Implements CONFIG_DEBUG_STACK_USAGE for shadow stacks. When enabled,
also prints out the highest shadow stack usage per process.

Bug: 145210207
Change-Id: I4c085b51e1432e8d52e54126ffd8bf7b6e35b529
(am from https://lore.kernel.org/patchwork/patch/1149056/)
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2019-11-27 12:37:25 -08:00
Sami Tolvanen
9c265248fa FROMLIST: scs: add accounting
This change adds accounting for the memory allocated for shadow stacks.

Bug: 145210207
Change-Id: I51157fe0b23b4cb28bb33c86a5dfe3ac911296a4
(am from https://lore.kernel.org/patchwork/patch/1149055/)
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2019-11-27 12:37:25 -08:00
Sami Tolvanen
16e13c60ff FROMLIST: add support for Clang's Shadow Call Stack (SCS)
This change adds generic support for Clang's Shadow Call Stack,
which uses a shadow stack to protect return addresses from being
overwritten by an attacker. Details are available here:

  https://clang.llvm.org/docs/ShadowCallStack.html

Note that security guarantees in the kernel differ from the
ones documented for user space. The kernel must store addresses
of shadow stacks used by other tasks and interrupt handlers in
memory, which means an attacker capable reading and writing
arbitrary memory may be able to locate them and hijack control
flow by modifying shadow stacks that are not currently in use.

Bug: 145210207
Change-Id: Ia5f1650593fa95da4efcf86f84830a20989f161c
(am from https://lore.kernel.org/patchwork/patch/1149054/)
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2019-11-27 12:37:25 -08:00