-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl2DKbUACgkQONu9yGCS
aT6YGg//abjWbd60zh1bErEl/ZlGfqjcKwM6GZ2k+Knhk14JzogJHGtBbuhIcaQB
c2KH44r1QiNg9nJ2Gtq+mQnH5/GnY1aGlMHrfqb2uO7sKxoOVUZsxKiNjevtrOwI
6LsiiKE6bBRAP3f8pCiFj84ubWiWzDnb+FA3p2sfh11F9wrxAcNPsvl8jsnwHisr
sAJn9KgXQRezJdwRk+JgSYdSR6WSnaf4m4rrDGe9a2qxsvH9ttCtiOmf63m184cb
iMoYs1ceBfUefyJjum077KVBb/ryRDr4VMMPhKDGqgcctXAlPVAUwcUY5HG3YWQg
HQaHK9AyoAiEDh+iyAMHCYZaNr/lUPNUFbsYU7nf4o058EX1fpLtFmn0T1Dh9hOn
N0TN1stNDQ8KAZ5iugYBMDKKmHznIo1umxiv68dMIsUSANdBSGENio+4Tkpvmfod
zagE5aOoYtsh9Qxytz9IExkGYhinfrLT5fpTLrALQwneCquZqynqBFbwYj/VOYC2
9MKGSeAKyRQqM02Bf4TrMptzO5jNsR+aWG7yyIR/L0fqis4h1fyCUTCdHd1fD1+0
hEvR+2lkKiRT+B6ArhVOWf61N0RZ4TdkCzST6WoVosonCPFImJmmSkoYQ0KmnqKP
DhnIp68n4zx8uutDoFuQ5HLJeiubmHOikjnO6F+pYPyzT0PW/wQ=
=VrIK
-----END PGP SIGNATURE-----
Merge 4.19.74 into android-4.19
Changes in 4.19.74
bridge/mdb: remove wrong use of NLM_F_MULTI
cdc_ether: fix rndis support for Mediatek based smartphones
ipv6: Fix the link time qualifier of 'ping_v6_proc_exit_net()'
isdn/capi: check message length in capi_write()
ixgbe: Fix secpath usage for IPsec TX offload.
net: Fix null de-reference of device refcount
net: gso: Fix skb_segment splat when splitting gso_size mangled skb having linear-headed frag_list
net: phylink: Fix flow control resolution
net: sched: fix reordering issues
sch_hhf: ensure quantum and hhf_non_hh_weight are non-zero
sctp: Fix the link time qualifier of 'sctp_ctrlsock_exit()'
sctp: use transport pf_retrans in sctp_do_8_2_transport_strike
tcp: fix tcp_ecn_withdraw_cwr() to clear TCP_ECN_QUEUE_CWR
tipc: add NULL pointer check before calling kfree_rcu
tun: fix use-after-free when register netdev failed
gpiolib: acpi: Add gpiolib_acpi_run_edge_events_on_boot option and blacklist
gpio: fix line flag validation in linehandle_create
Btrfs: fix assertion failure during fsync and use of stale transaction
ixgbe: Prevent u8 wrapping of ITR value to something less than 10us
genirq: Prevent NULL pointer dereference in resend_irqs()
KVM: s390: kvm_s390_vm_start_migration: check dirty_bitmap before using it as target for memset()
KVM: s390: Do not leak kernel stack data in the KVM_S390_INTERRUPT ioctl
KVM: x86: work around leak of uninitialized stack contents
KVM: nVMX: handle page fault in vmread
x86/purgatory: Change compiler flags from -mcmodel=kernel to -mcmodel=large to fix kexec relocation errors
powerpc: Add barrier_nospec to raw_copy_in_user()
drm/meson: Add support for XBGR8888 & ABGR8888 formats
clk: rockchip: Don't yell about bad mmc phases when getting
mtd: rawnand: mtk: Fix wrongly assigned OOB buffer pointer issue
PCI: Always allow probing with driver_override
gpio: fix line flag validation in lineevent_create
ubifs: Correctly use tnc_next() in search_dh_cookie()
driver core: Fix use-after-free and double free on glue directory
crypto: talitos - check AES key size
crypto: talitos - fix CTR alg blocksize
crypto: talitos - check data blocksize in ablkcipher.
crypto: talitos - fix ECB algs ivsize
crypto: talitos - Do not modify req->cryptlen on decryption.
crypto: talitos - HMAC SNOOP NO AFEU mode requires SW icv checking.
firmware: ti_sci: Always request response from firmware
drm: panel-orientation-quirks: Add extra quirk table entry for GPD MicroPC
drm/mediatek: mtk_drm_drv.c: Add of_node_put() before goto
Revert "Bluetooth: btusb: driver to enable the usb-wakeup feature"
iio: adc: stm32-dfsdm: fix data type
modules: fix BUG when load module with rodata=n
modules: fix compile error if don't have strict module rwx
platform/x86: pmc_atom: Add CB4063 Beckhoff Automation board to critclk_systems DMI table
rsi: fix a double free bug in rsi_91x_deinit()
nvmem: Use the same permissions for eeprom as for nvmem
x86/build: Add -Wnoaddress-of-packed-member to REALMODE_CFLAGS, to silence GCC9 build warning
Linux 4.19.74
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I6dc03ee31e33b10ad8f7e2a68af42fe102a7743d
commit 93651f80dc upstream.
If CONFIG_ARCH_HAS_STRICT_MODULE_RWX is not defined,
we need stub for module_enable_nx() and module_enable_x().
If CONFIG_ARCH_HAS_STRICT_MODULE_RWX is defined, but
CONFIG_STRICT_MODULE_RWX is disabled, we need stub for
module_enable_nx.
Move frob_text() outside of the CONFIG_STRICT_MODULE_RWX,
because it is needed anyway.
Fixes: 2eef1399a8 ("modules: fix BUG when load module with rodata=n")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eddf3e9c7c upstream.
The following crash was observed:
Unable to handle kernel NULL pointer dereference at 0000000000000158
Internal error: Oops: 96000004 [#1] SMP
pc : resend_irqs+0x68/0xb0
lr : resend_irqs+0x64/0xb0
...
Call trace:
resend_irqs+0x68/0xb0
tasklet_action_common.isra.6+0x84/0x138
tasklet_action+0x2c/0x38
__do_softirq+0x120/0x324
run_ksoftirqd+0x44/0x60
smpboot_thread_fn+0x1ac/0x1e8
kthread+0x134/0x138
ret_from_fork+0x10/0x18
The reason for this is that the interrupt resend mechanism happens in soft
interrupt context, which is a asynchronous mechanism versus other
operations on interrupts. free_irq() does not take resend handling into
account. Thus, the irq descriptor might be already freed before the resend
tasklet is executed. resend_irqs() does not check the return value of the
interrupt descriptor lookup and derefences the return value
unconditionally.
1):
__setup_irq
irq_startup
check_irq_resend // activate softirq to handle resend irq
2):
irq_domain_free_irqs
irq_free_descs
free_desc
call_rcu(&desc->rcu, delayed_free_desc)
3):
__do_softirq
tasklet_action
resend_irqs
desc = irq_to_desc(irq)
desc->handle_irq(desc) // desc is NULL --> Ooops
Fix this by adding a NULL pointer check in resend_irqs() before derefencing
the irq descriptor.
Fixes: a4633adcdb ("[PATCH] genirq: add genirq sw IRQ-retrigger")
Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1630ae13-5c8e-901e-de09-e740b6a426a7@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Problem background: In the process of suspend, maybe some device suspend
callback failed in dpm_suspend_start()/dpm_prepare()/dpm_suspend().
Because it's after suspend_console(), so printf() is disabled now.
Currently we can see "Some devices failed to suspend, or early wake
event detected" by log_suspend_abort_reason() in bugreport. There are
many devices but we don't know which exactly device failed. So we
want to do a little change to record which device failed.
Note: I checked upstream LTS kernel, then I found the
patch can not be sent upstream, because it uses function
log_suspend_abort_reason() in wakeup_reason.c, the initial
patch(https://patchwork.kernel.org/patch/3827331/) for adding
/kernel/power/wakeup_reason.c is not accepted by upstream which
was merged in AOSP common kernels. So maybe the patch could
only be sent to AOSP common kernels.
Test: manual - Use modified version for daily use two days, from
bugreport we can see which device failed.
Bug: 120445600
Change-Id: I326c87ca1263496db79d08ec615f12fc22452d7a
Signed-off-by: zhuguangqing <zhuguangqing@xiaomi.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1/KiEACgkQONu9yGCS
aT49JBAAy7b3wv1WXAtg9wsyS1JL4HbMXt3YjtokIX+UpkznoqII4B85QftPBbiD
9zDuTWPjhrqKv1GsMkFRCqBVp5wGVik1MIbjVuKdstFN5W8KQybpbYnSW4T52+wS
cs6oOPkLydAfWzKeq+ekEeU8yr5dua+Ui3huundZ49wseJWQP3fh9T+ToUx8V/cr
tsLiRRgI0djj7KQWVuM1j8YGKT/6qk/UL0HMVZyoIdLmsxpLap+LWe0+CRXn8rvs
eJJlVQTVtYf/ySoHkpnwR12VsjRYjx6pNkm/GrebMCkM7wF/4RMqxk7j9EU0PENH
VUdRrUd+j/YPp6QzjSFMK0+0eb7Gm3X0FEN0IGZshu1r/CDnoj/7hqnBmOlYIbhv
pdteYaLqWq7JjAHu7vF+S4aNQRGpAZb05LsbTJ39Eu3FbdVTLXsAuUveZ7Y4/y0X
ri2M3d/sF/cjc3C+V7Y7h422SM36jSAK6496VAoRyqqjX/3JyROhgfU9NAMzVr83
4uI904z9lH4TZGOd5YQgX2VuOtBcGwa7+g6fy97u1tp8UxSWFZRGDDLRysF/dIJO
Wi51UK0Q7EWnqBTe0TFF6TjE5tC7R3ZgzqEQ1MU4eLI5mqokg82DAK4Ub2Wk5Qch
CGs5/d16OOrLtG2RoaOGz9UdQR7IHUXLSqkKbaEdstc16MXNXns=
=cmGh
-----END PGP SIGNATURE-----
Merge 4.19.73 into android-4.19
Changes in 4.19.73
ALSA: hda - Fix potential endless loop at applying quirks
ALSA: hda/realtek - Fix overridden device-specific initialization
ALSA: hda/realtek - Add quirk for HP Pavilion 15
ALSA: hda/realtek - Enable internal speaker & headset mic of ASUS UX431FL
ALSA: hda/realtek - Fix the problem of two front mics on a ThinkCentre
sched/fair: Don't assign runtime for throttled cfs_rq
drm/vmwgfx: Fix double free in vmw_recv_msg()
vhost/test: fix build for vhost test
vhost/test: fix build for vhost test - again
powerpc/tm: Fix FP/VMX unavailable exceptions inside a transaction
batman-adv: fix uninit-value in batadv_netlink_get_ifindex()
batman-adv: Only read OGM tvlv_len after buffer len check
hv_sock: Fix hang when a connection is closed
Blk-iolatency: warn on negative inflight IO counter
blk-iolatency: fix STS_AGAIN handling
{nl,mac}80211: fix interface combinations on crypto controlled devices
timekeeping: Use proper ktime_add when adding nsecs in coarse offset
selftests: fib_rule_tests: use pre-defined DEV_ADDR
x86/ftrace: Fix warning and considate ftrace_jmp_replace() and ftrace_call_replace()
powerpc/64: mark start_here_multiplatform as __ref
media: stm32-dcmi: fix irq = 0 case
arm64: dts: rockchip: enable usb-host regulators at boot on rk3328-rock64
scripts/decode_stacktrace: match basepath using shell prefix operator, not regex
riscv: remove unused variable in ftrace
nvme-fc: use separate work queue to avoid warning
clk: s2mps11: Add used attribute to s2mps11_dt_match
remoteproc: qcom: q6v5: shore up resource probe handling
modules: always page-align module section allocations
kernel/module: Fix mem leak in module_add_modinfo_attrs
drm/i915: Re-apply "Perform link quality check, unconditionally during long pulse"
media: cec/v4l2: move V4L2 specific CEC functions to V4L2
media: cec: remove cec-edid.c
scsi: qla2xxx: Move log messages before issuing command to firmware
keys: Fix the use of the C++ keyword "private" in uapi/linux/keyctl.h
Drivers: hv: kvp: Fix two "this statement may fall through" warnings
x86, hibernate: Fix nosave_regions setup for hibernation
remoteproc: qcom: q6v5-mss: add SCM probe dependency
drm/amdgpu/gfx9: Update gfx9 golden settings.
drm/amdgpu: Update gc_9_0 golden settings.
KVM: x86: hyperv: enforce vp_index < KVM_MAX_VCPUS
KVM: x86: hyperv: consistently use 'hv_vcpu' for 'struct kvm_vcpu_hv' variables
KVM: x86: hyperv: keep track of mismatched VP indexes
KVM: hyperv: define VP assist page helpers
x86/kvm/lapic: preserve gfn_to_hva_cache len on cache reinit
drm/i915: Fix intel_dp_mst_best_encoder()
drm/i915: Rename PLANE_CTL_DECOMPRESSION_ENABLE
drm/i915/gen9+: Fix initial readout for Y tiled framebuffers
drm/atomic_helper: Disallow new modesets on unregistered connectors
Drivers: hv: kvp: Fix the indentation of some "break" statements
Drivers: hv: kvp: Fix the recent regression caused by incorrect clean-up
powerplay: Respect units on max dcfclk watermark
drm/amd/pp: Fix truncated clock value when set watermark
drm/amd/dm: Understand why attaching path/tile properties are needed
ARM: davinci: da8xx: define gpio interrupts as separate resources
ARM: davinci: dm365: define gpio interrupts as separate resources
ARM: davinci: dm646x: define gpio interrupts as separate resources
ARM: davinci: dm355: define gpio interrupts as separate resources
ARM: davinci: dm644x: define gpio interrupts as separate resources
s390/zcrypt: reinit ap queue state machine during device probe
media: vim2m: use workqueue
media: vim2m: use cancel_delayed_work_sync instead of flush_schedule_work
drm/i915: Restore sane defaults for KMS on GEM error load
drm/i915: Cleanup gt powerstate from gem
KVM: PPC: Book3S HV: Fix race between kvm_unmap_hva_range and MMU mode switch
Btrfs: clean up scrub is_dev_replace parameter
Btrfs: fix deadlock with memory reclaim during scrub
btrfs: Remove extent_io_ops::fill_delalloc
btrfs: Fix error handling in btrfs_cleanup_ordered_extents
scsi: megaraid_sas: Fix combined reply queue mode detection
scsi: megaraid_sas: Add check for reset adapter bit
scsi: megaraid_sas: Use 63-bit DMA addressing
powerpc/pkeys: Fix handling of pkey state across fork()
btrfs: volumes: Make sure no dev extent is beyond device boundary
btrfs: Use real device structure to verify dev extent
media: vim2m: only cancel work if it is for right context
ARC: show_regs: lockdep: re-enable preemption
ARC: mm: do_page_fault fixes#1: relinquish mmap_sem if signal arrives while handle_mm_fault
IB/uverbs: Fix OOPs upon device disassociation
crypto: ccree - fix resume race condition on init
crypto: ccree - add missing inline qualifier
drm/vblank: Allow dynamic per-crtc max_vblank_count
drm/i915/ilk: Fix warning when reading emon_status with no output
mfd: Kconfig: Fix I2C_DESIGNWARE_PLATFORM dependencies
tpm: Fix some name collisions with drivers/char/tpm.h
bcache: replace hard coded number with BUCKET_GC_GEN_MAX
bcache: treat stale && dirty keys as bad keys
KVM: VMX: Compare only a single byte for VMCS' "launched" in vCPU-run
iio: adc: exynos-adc: Add S5PV210 variant
dt-bindings: iio: adc: exynos-adc: Add S5PV210 variant
iio: adc: exynos-adc: Use proper number of channels for Exynos4x12
mt76: fix corrupted software generated tx CCMP PN
drm/nouveau: Don't WARN_ON VCPI allocation failures
iwlwifi: fix devices with PCI Device ID 0x34F0 and 11ac RF modules
iwlwifi: add new card for 9260 series
x86/kvmclock: set offset for kvm unstable clock
spi: spi-gpio: fix SPI_CS_HIGH capability
powerpc/kvm: Save and restore host AMR/IAMR/UAMOR
mmc: renesas_sdhi: Fix card initialization failure in high speed mode
btrfs: scrub: pass fs_info to scrub_setup_ctx
btrfs: scrub: move scrub_setup_ctx allocation out of device_list_mutex
btrfs: scrub: fix circular locking dependency warning
btrfs: init csum_list before possible free
PCI: qcom: Fix error handling in runtime PM support
PCI: qcom: Don't deassert reset GPIO during probe
drm: add __user attribute to ptr_to_compat()
CIFS: Fix error paths in writeback code
CIFS: Fix leaking locked VFS cache pages in writeback retry
drm/i915: Handle vm_mmap error during I915_GEM_MMAP ioctl with WC set
drm/i915: Sanity check mmap length against object size
usb: typec: tcpm: Try PD-2.0 if sink does not respond to 3.0 source-caps
arm64: dts: stratix10: add the sysmgr-syscon property from the gmac's
IB/mlx5: Reset access mask when looping inside page fault handler
kvm: mmu: Fix overflow on kvm mmu page limit calculation
x86/kvm: move kvm_load/put_guest_xcr0 into atomic context
KVM: x86: Always use 32-bit SMRAM save state for 32-bit kernels
cifs: Fix lease buffer length error
media: i2c: tda1997x: select V4L2_FWNODE
ext4: protect journal inode's blocks using block_validity
ARM: dts: qcom: ipq4019: fix PCI range
ARM: dts: qcom: ipq4019: Fix MSI IRQ type
ARM: dts: qcom: ipq4019: enlarge PCIe BAR range
dt-bindings: mmc: Add supports-cqe property
dt-bindings: mmc: Add disable-cqe-dcmd property.
PCI: Add macro for Switchtec quirk declarations
PCI: Reset Lenovo ThinkPad P50 nvgpu at boot if necessary
dm mpath: fix missing call of path selector type->end_io
blk-mq: free hw queue's resource in hctx's release handler
mmc: sdhci-pci: Add support for Intel CML
PCI: dwc: Use devm_pci_alloc_host_bridge() to simplify code
cifs: smbd: take an array of reqeusts when sending upper layer data
dm crypt: move detailed message into debug level
signal/arc: Use force_sig_fault where appropriate
ARC: mm: fix uninitialised signal code in do_page_fault
ARC: mm: SIGSEGV userspace trying to access kernel virtual memory
drm/amdkfd: Add missing Polaris10 ID
kvm: Check irqchip mode before assign irqfd
drm/amdgpu: fix ring test failure issue during s3 in vce 3.0 (V2)
drm/amdgpu/{uvd,vcn}: fetch ring's read_ptr after alloc
Btrfs: fix race between block group removal and block group allocation
cifs: add spinlock for the openFileList to cifsInodeInfo
clk: tegra: Fix maximum audio sync clock for Tegra124/210
clk: tegra210: Fix default rates for HDA clocks
IB/hfi1: Avoid hardlockup with flushlist_lock
apparmor: reset pos on failure to unpack for various functions
scsi: target/core: Use the SECTOR_SHIFT constant
scsi: target/iblock: Fix overrun in WRITE SAME emulation
staging: wilc1000: fix error path cleanup in wilc_wlan_initialize()
scsi: zfcp: fix request object use-after-free in send path causing wrong traces
cifs: Properly handle auto disabling of serverino option
ALSA: hda - Don't resume forcibly i915 HDMI/DP codec
ceph: use ceph_evict_inode to cleanup inode's resource
KVM: x86: optimize check for valid PAT value
KVM: VMX: Always signal #GP on WRMSR to MSR_IA32_CR_PAT with bad value
KVM: VMX: Fix handling of #MC that occurs during VM-Entry
KVM: VMX: check CPUID before allowing read/write of IA32_XSS
KVM: PPC: Use ccr field in pt_regs struct embedded in vcpu struct
KVM: PPC: Book3S HV: Fix CR0 setting in TM emulation
ARM: dts: gemini: Set DIR-685 SPI CS as active low
RDMA/srp: Document srp_parse_in() arguments
RDMA/srp: Accept again source addresses that do not have a port number
btrfs: correctly validate compression type
resource: Include resource end in walk_*() interfaces
resource: Fix find_next_iomem_res() iteration issue
resource: fix locking in find_next_iomem_res()
pstore: Fix double-free in pstore_mkfile() failure path
dm thin metadata: check if in fail_io mode when setting needs_check
drm/panel: Add support for Armadeus ST0700 Adapt
ALSA: hda - Fix intermittent CORB/RIRB stall on Intel chips
powerpc/mm: Limit rma_size to 1TB when running without HV mode
iommu/iova: Remove stale cached32_node
gpio: don't WARN() on NULL descs if gpiolib is disabled
i2c: at91: disable TXRDY interrupt after sending data
i2c: at91: fix clk_offset for sama5d2
mm/migrate.c: initialize pud_entry in migrate_vma()
iio: adc: gyroadc: fix uninitialized return code
NFSv4: Fix delegation state recovery
bcache: only clear BTREE_NODE_dirty bit when it is set
bcache: add comments for mutex_lock(&b->write_lock)
bcache: fix race in btree_flush_write()
drm/i915: Make sure cdclk is high enough for DP audio on VLV/CHV
virtio/s390: fix race on airq_areas[]
drm/atomic_helper: Allow DPMS On<->Off changes for unregistered connectors
ext4: don't perform block validity checks on the journal inode
ext4: fix block validity checks for journal inodes using indirect blocks
ext4: unsigned int compared against zero
PCI: Reset both NVIDIA GPU and HDA in ThinkPad P50 workaround
powerpc/tm: Remove msr_tm_active()
powerpc/tm: Fix restoring FP/VMX facility incorrectly on interrupts
vhost: make sure log_num < in_num
Linux 4.19.73
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I7bc57825aeb36759bb8e8726888da9af06392c09
[ Upstream commit 49f17c26c1 ]
Since resources can be removed, locking should ensure that the resource
is not removed while accessing it. However, find_next_iomem_res() does
not hold the lock while copying the data of the resource.
Keep holding the lock while the data is copied. While at it, change the
return value to a more informative value. It is disregarded by the
callers.
[akpm@linux-foundation.org: fix find_next_iomem_res() documentation]
Link: http://lkml.kernel.org/r/20190613045903.4922-2-namit@vmware.com
Fixes: ff3cc952d3 ("resource: Add remove_resource interface")
Signed-off-by: Nadav Amit <namit@vmware.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 010a93bf97 ]
Previously find_next_iomem_res() used "*res" as both an input parameter for
the range to search and the type of resource to search for, and an output
parameter for the resource we found, which makes the interface confusing.
The current callers use find_next_iomem_res() incorrectly because they
allocate a single struct resource and use it for repeated calls to
find_next_iomem_res(). When find_next_iomem_res() returns a resource, it
overwrites the start, end, flags, and desc members of the struct. If we
call find_next_iomem_res() again, we must update or restore these fields.
The previous code restored res.start and res.end, but not res.flags or
res.desc.
Since the callers did not restore res.flags, if they searched for flags
IORESOURCE_MEM | IORESOURCE_BUSY and found a resource with flags
IORESOURCE_MEM | IORESOURCE_BUSY | IORESOURCE_SYSRAM, the next search would
incorrectly skip resources unless they were also marked as
IORESOURCE_SYSRAM.
Fix this by restructuring the interface so it takes explicit "start, end,
flags" parameters and uses "*res" only as an output parameter.
Based on a patch by Lianbo Jiang <lijiang@redhat.com>.
[ bp: While at it:
- make comments kernel-doc style.
-
Originally-by: http://lore.kernel.org/lkml/20180921073211.20097-2-lijiang@redhat.com
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Brijesh Singh <brijesh.singh@amd.com>
CC: Dan Williams <dan.j.williams@intel.com>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Lianbo Jiang <lijiang@redhat.com>
CC: Takashi Iwai <tiwai@suse.de>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Tom Lendacky <thomas.lendacky@amd.com>
CC: Vivek Goyal <vgoyal@redhat.com>
CC: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
CC: bhe@redhat.com
CC: dan.j.williams@intel.com
CC: dyoung@redhat.com
CC: kexec@lists.infradead.org
CC: mingo@redhat.com
CC: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/153805812916.1157.177580438135143788.stgit@bhelgaas-glaptop.roam.corp.google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a98959fdbd ]
find_next_iomem_res() finds an iomem resource that covers part of a range
described by "start, end". All callers expect that range to be inclusive,
i.e., both start and end are included, but find_next_iomem_res() doesn't
handle the end address correctly.
If it finds an iomem resource that contains exactly the end address, it
skips it, e.g., if "start, end" is [0x0-0x10000] and there happens to be an
iomem resource [mem 0x10000-0x10000] (the single byte at 0x10000), we skip
it:
find_next_iomem_res(...)
{
start = 0x0;
end = 0x10000;
for (p = next_resource(...)) {
# p->start = 0x10000;
# p->end = 0x10000;
# we *should* return this resource, but this condition is false:
if ((p->end >= start) && (p->start < end))
break;
Adjust find_next_iomem_res() so it allows a resource that includes the
single byte at the end of the range. This is a corner case that we
probably don't see in practice.
Fixes: 58c1b5b079 ("[PATCH] memory hotadd fixes: find_next_system_ram catch range fix")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Brijesh Singh <brijesh.singh@amd.com>
CC: Dan Williams <dan.j.williams@intel.com>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Lianbo Jiang <lijiang@redhat.com>
CC: Takashi Iwai <tiwai@suse.de>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Tom Lendacky <thomas.lendacky@amd.com>
CC: Vivek Goyal <vgoyal@redhat.com>
CC: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
CC: bhe@redhat.com
CC: dan.j.williams@intel.com
CC: dyoung@redhat.com
CC: kexec@lists.infradead.org
CC: mingo@redhat.com
CC: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/153805812254.1157.16736368485811773752.stgit@bhelgaas-glaptop.roam.corp.google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 38f054d549 ]
Some arches (e.g., arm64, x86) have moved towards non-executable
module_alloc() allocations for security hardening reasons. That means
that the module loader will need to set the text section of a module to
executable, regardless of whether or not CONFIG_STRICT_MODULE_RWX is set.
When CONFIG_STRICT_MODULE_RWX=y, module section allocations are always
page-aligned to handle memory rwx permissions. On some arches with
CONFIG_STRICT_MODULE_RWX=n however, when setting the module text to
executable, the BUG_ON() in frob_text() gets triggered since module
section allocations are not page-aligned when CONFIG_STRICT_MODULE_RWX=n.
Since the set_memory_* API works with pages, and since we need to call
set_memory_x() regardless of whether CONFIG_STRICT_MODULE_RWX is set, we
might as well page-align all module section allocations for ease of
managing rwx permissions of module sections (text, rodata, etc).
Fixes: 2eef1399a8 ("modules: fix BUG when load module with rodata=n")
Reported-by: Martin Kaiser <lists@kaiser.cx>
Reported-by: Bartosz Golaszewski <brgl@bgdev.pl>
Tested-by: David Lechner <david@lechnology.com>
Tested-by: Martin Kaiser <martin@kaiser.cx>
Tested-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0354c1a3cd ]
While this doesn't actually amount to a real difference, since the macro
evaluates to the same thing, every place else operates on ktime_t using
these functions, so let's not break the pattern.
Fixes: e3ff9c3678 ("timekeeping: Repair ktime_get_coarse*() granularity")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lkml.kernel.org/r/20190621203249.3909-1-Jason@zx2c4.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 5e2d2cc258 upstream.
do_sched_cfs_period_timer() will refill cfs_b runtime and call
distribute_cfs_runtime to unthrottle cfs_rq, sometimes cfs_b->runtime
will allocate all quota to one cfs_rq incorrectly, then other cfs_rqs
attached to this cfs_b can't get runtime and will be throttled.
We find that one throttled cfs_rq has non-negative
cfs_rq->runtime_remaining and cause an unexpetced cast from s64 to u64
in snippet:
distribute_cfs_runtime() {
runtime = -cfs_rq->runtime_remaining + 1;
}
The runtime here will change to a large number and consume all
cfs_b->runtime in this cfs_b period.
According to Ben Segall, the throttled cfs_rq can have
account_cfs_rq_runtime called on it because it is throttled before
idle_balance, and the idle_balance calls update_rq_clock to add time
that is accounted to the task.
This commit prevents cfs_rq to be assgined new runtime if it has been
throttled until that distribute_cfs_runtime is called.
Signed-off-by: Liangyan <liangyan.peng@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Reviewed-by: Ben Segall <bsegall@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: shanpeic@linux.alibaba.com
Cc: stable@vger.kernel.org
Cc: xlpang@linux.alibaba.com
Fixes: d3d9dc3302 ("sched: Throttle entities exceeding their allowed bandwidth")
Link: https://lkml.kernel.org/r/20190826121633.6538-1-liangyan.peng@linux.alibaba.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
upstream commit 594cc251fd ("make 'user_access_begin()' do 'access_ok()'")
Originally, the rule used to be that you'd have to do access_ok()
separately, and then user_access_begin() before actually doing the
direct (optimized) user access.
But experience has shown that people then decide not to do access_ok()
at all, and instead rely on it being implied by other operations or
similar. Which makes it very hard to verify that the access has
actually been range-checked.
If you use the unsafe direct user accesses, hardware features (either
SMAP - Supervisor Mode Access Protection - on x86, or PAN - Privileged
Access Never - on ARM) do force you to use user_access_begin(). But
nothing really forces the range check.
By putting the range check into user_access_begin(), we actually force
people to do the right thing (tm), and the range check vill be visible
near the actual accesses. We have way too long a history of people
trying to avoid them.
Bug: 135368228
Change-Id: I4ca0e4566ea080fa148c5e768bb1a0b6f7201c01
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl13bgIACgkQONu9yGCS
aT5lvg/+KvPsSPQiccNrK/mvPan29KMk7R8Yjr3hzjPxLFPvRz7tR69joU6yj1nw
S4BNZYiVsT43d9U8FieniAz2ch2qDIQbIIrQTLqcED4vn7ih0cTt4783FNZPApsZ
I84u6kgbD3x5tPyB/RYE79JCq4JDgV+GvyDK+MXcU8Dh5OHTjJzLrwqghbI/LH6k
G9yoMSho23i1h+jl/JW+QfknarDaclZUCmL/BdZ3A6mILXpWcwF2j10KHBYTsPRd
DvrbW9S7e78c80TIWPtCHr1RYY79J4NXvPPThpytvpOjMZYsdT3s7jJLoiV4+3gq
G04Y2awbfyD8U3WB3q8qX9AMfF5B6uIOqnp3DRV/NX7LJHk3xBTTAY3/6d7xAZr7
xJ9WSd4gzzpBu6EgcS8F52qrrxol2S8jGRoNh7zGgQe8YvxvW5ktgsOCshzvGGSb
HgYBar4qiQepG4mwKpJ9FCzBY/H+uuIKOgJpgFGE/Hhmw4yw5REy0i62sf3Rkroa
db1j+5egc0kvqLUVGp8lS5RPDrhJQ9A4XIUlThZWU2bGVl6wDFHaJyZdM1oEXFZU
jgONQnM/ys9jAnzc9xzJ4Ck9u03qquDgMmC+bC1z2OlQ/tN+z5Sm4U86DjXd9qBD
jcb9IaSZ9HEgwIzhJypQHHNjr7OVINmrQW5v74LHNb3u+jIiYiM=
=59b4
-----END PGP SIGNATURE-----
Merge 4.19.72 into android-4.19
Changes in 4.19.72
mld: fix memory leak in mld_del_delrec()
net: fix skb use after free in netpoll
net: sched: act_sample: fix psample group handling on overwrite
net_sched: fix a NULL pointer deref in ipt action
net: stmmac: dwmac-rk: Don't fail if phy regulator is absent
tcp: inherit timestamp on mtu probe
tcp: remove empty skb from write queue in error cases
net/rds: Fix info leak in rds6_inc_info_copy()
x86/boot: Preserve boot_params.secure_boot from sanitizing
spi: bcm2835aux: unifying code between polling and interrupt driven code
spi: bcm2835aux: remove dangerous uncontrolled read of fifo
spi: bcm2835aux: fix corruptions for longer spi transfers
net: tundra: tsi108: use spin_lock_irqsave instead of spin_lock_irq in IRQ context
netfilter: nf_tables: use-after-free in failing rule with bound set
tools: bpftool: fix error message (prog -> object)
hv_netvsc: Fix a warning of suspicious RCU usage
net: tc35815: Explicitly check NET_IP_ALIGN is not zero in tc35815_rx
Bluetooth: btqca: Add a short delay before downloading the NVM
ibmveth: Convert multicast list size for little-endian system
gpio: Fix build error of function redefinition
netfilter: nft_flow_offload: skip tcp rst and fin packets
drm/mediatek: use correct device to import PRIME buffers
drm/mediatek: set DMA max segment size
scsi: qla2xxx: Fix gnl.l memory leak on adapter init failure
scsi: target: tcmu: avoid use-after-free after command timeout
cxgb4: fix a memory leak bug
liquidio: add cleanup in octeon_setup_iq()
net: myri10ge: fix memory leaks
lan78xx: Fix memory leaks
vfs: fix page locking deadlocks when deduping files
cx82310_eth: fix a memory leak bug
net: kalmia: fix memory leaks
ibmvnic: Unmap DMA address of TX descriptor buffers after use
net: cavium: fix driver name
wimax/i2400m: fix a memory leak bug
ravb: Fix use-after-free ravb_tstamp_skb
kprobes: Fix potential deadlock in kprobe_optimizer()
HID: cp2112: prevent sleeping function called from invalid context
x86/boot/compressed/64: Fix boot on machines with broken E820 table
Input: hyperv-keyboard: Use in-place iterator API in the channel callback
Tools: hv: kvp: eliminate 'may be used uninitialized' warning
nvme-multipath: fix possible I/O hang when paths are updated
IB/mlx4: Fix memory leaks
infiniband: hfi1: fix a memory leak bug
infiniband: hfi1: fix memory leaks
selftests: kvm: fix state save/load on processors without XSAVE
selftests/kvm: make platform_info_test pass on AMD
ceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr()
ceph: fix buffer free while holding i_ceph_lock in __ceph_build_xattrs_blob()
ceph: fix buffer free while holding i_ceph_lock in fill_inode()
KVM: arm/arm64: Only skip MMIO insn once
afs: Fix leak in afs_lookup_cell_rcu()
KVM: arm/arm64: VGIC: Properly initialise private IRQ affinity
x86/boot/compressed/64: Fix missing initialization in find_trampoline_placement()
libceph: allow ceph_buffer_put() to receive a NULL ceph_buffer
Revert "x86/apic: Include the LDR when clearing out APIC registers"
Linux 4.19.72
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I207e2ba522f98f6e17dacd3a2527fab615155007
[ Upstream commit f1c6ece237 ]
lockdep reports the following deadlock scenario:
WARNING: possible circular locking dependency detected
kworker/1:1/48 is trying to acquire lock:
000000008d7a62b2 (text_mutex){+.+.}, at: kprobe_optimizer+0x163/0x290
but task is already holding lock:
00000000850b5e2d (module_mutex){+.+.}, at: kprobe_optimizer+0x31/0x290
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (module_mutex){+.+.}:
__mutex_lock+0xac/0x9f0
mutex_lock_nested+0x1b/0x20
set_all_modules_text_rw+0x22/0x90
ftrace_arch_code_modify_prepare+0x1c/0x20
ftrace_run_update_code+0xe/0x30
ftrace_startup_enable+0x2e/0x50
ftrace_startup+0xa7/0x100
register_ftrace_function+0x27/0x70
arm_kprobe+0xb3/0x130
enable_kprobe+0x83/0xa0
enable_trace_kprobe.part.0+0x2e/0x80
kprobe_register+0x6f/0xc0
perf_trace_event_init+0x16b/0x270
perf_kprobe_init+0xa7/0xe0
perf_kprobe_event_init+0x3e/0x70
perf_try_init_event+0x4a/0x140
perf_event_alloc+0x93a/0xde0
__do_sys_perf_event_open+0x19f/0xf30
__x64_sys_perf_event_open+0x20/0x30
do_syscall_64+0x65/0x1d0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
-> #0 (text_mutex){+.+.}:
__lock_acquire+0xfcb/0x1b60
lock_acquire+0xca/0x1d0
__mutex_lock+0xac/0x9f0
mutex_lock_nested+0x1b/0x20
kprobe_optimizer+0x163/0x290
process_one_work+0x22b/0x560
worker_thread+0x50/0x3c0
kthread+0x112/0x150
ret_from_fork+0x3a/0x50
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(module_mutex);
lock(text_mutex);
lock(module_mutex);
lock(text_mutex);
*** DEADLOCK ***
As a reproducer I've been using bcc's funccount.py
(https://github.com/iovisor/bcc/blob/master/tools/funccount.py),
for example:
# ./funccount.py '*interrupt*'
That immediately triggers the lockdep splat.
Fix by acquiring text_mutex before module_mutex in kprobe_optimizer().
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: d5b844a2cf ("ftrace/x86: Remove possible deadlock between register_kprobe() and ftrace_run_update_code()")
Link: http://lkml.kernel.org/r/20190812184302.GA7010@xps-13
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1yF0AACgkQONu9yGCS
aT64ew/6AzJDRMcmnx1COeRP8tfQ5A8ghjnp6REEca1MYJWjDlqd0X+EMd/7zorZ
YkBzb1ND1c/9KeGQzdx8lJ5lcNRVJcD7tT6irT5zMnBcyR9uPamAaVgmVHXxuorK
el4nvX9g/MxBLsuqLHxGr5pNXi7mHu6zfXyQ86TJzlez7yHlPuiJb8bDpHnCRJ1P
n/MemPq/nQgC5jPBQRhT+IpqC1MTIHNhRaHHg/5Gdrrz+eVumnk+1zbWqtBRuJKS
qS7RL1pI0Su00i5bY1r76iSkoRkGw9SoeIgz3sycbtAvGo9TI16/hZPvWAsYOAjL
2DQS0rMPSnM4QV0odbUImFt86f0YJiAL7xYS8EYCc4GX/eLbNRtP8yLq+8rlo4Oa
36HbiNhGSEjRxenfVRUD/STgBYzfVeQOMEyFJNRtfNDP/l66sLk/pEOEd83j06H4
G87BJgKFC35dv5QrbCmJO8P1IXLs5QaChD3dL6R9/hbvCU2A/MOqhPL16JCXWA20
+hOWn8ryrtBa5Dt0avAkrrnUNC8cVWyD44uAm+Hu/49CZpkTasZMB7Z81VSIsuvO
xoT1Jx0J1W/LtHwCghSkui/fjQjVpkP3xnB7zVGem73Mpcm68g6mLajKaUrLnJHp
/sz2mppF7gZob43anTtUhSV9OvrzqftkR+iFg3rCKSJyQE1o48o=
=hMvg
-----END PGP SIGNATURE-----
Merge 4.19.70 into android-4.19
Changes in 4.19.70
dmaengine: ste_dma40: fix unneeded variable warning
nvme-multipath: revalidate nvme_ns_head gendisk in nvme_validate_ns
afs: Fix the CB.ProbeUuid service handler to reply correctly
afs: Fix loop index mixup in afs_deliver_vl_get_entry_by_name_u()
fs: afs: Fix a possible null-pointer dereference in afs_put_read()
afs: Only update d_fsdata if different in afs_d_revalidate()
nvmet-loop: Flush nvme_delete_wq when removing the port
nvme: fix a possible deadlock when passthru commands sent to a multipath device
nvme-pci: Fix async probe remove race
soundwire: cadence_master: fix register definition for SLAVE_STATE
soundwire: cadence_master: fix definitions for INTSTAT0/1
auxdisplay: panel: need to delete scan_timer when misc_register fails in panel_attach
dmaengine: stm32-mdma: Fix a possible null-pointer dereference in stm32_mdma_irq_handler()
omap-dma/omap_vout_vrfb: fix off-by-one fi value
iommu/dma: Handle SG length overflow better
usb: gadget: composite: Clear "suspended" on reset/disconnect
usb: gadget: mass_storage: Fix races between fsg_disable and fsg_set_alt
xen/blkback: fix memory leaks
arm64: cpufeature: Don't treat granule sizes as strict
i2c: rcar: avoid race when unregistering slave client
i2c: emev2: avoid race when unregistering slave client
drm/ast: Fixed reboot test may cause system hanged
usb: host: fotg2: restart hcd after port reset
tools: hv: fixed Python pep8/flake8 warnings for lsvmbus
tools: hv: fix KVP and VSS daemons exit code
drm/i915: fix broadwell EU computation
watchdog: bcm2835_wdt: Fix module autoload
drm/bridge: tfp410: fix memleak in get_modes()
scsi: ufs: Fix RX_TERMINATION_FORCE_ENABLE define value
drm/tilcdc: Register cpufreq notifier after we have initialized crtc
net/tls: Fixed return value when tls_complete_pending_work() fails
net/tls: swap sk_write_space on close
net: tls, fix sk_write_space NULL write when tx disabled
ipv6/addrconf: allow adding multicast addr if IFA_F_MCAUTOJOIN is set
ipv6: Default fib6_type to RTN_UNICAST when not set
net/smc: make sure EPOLLOUT is raised
tcp: make sure EPOLLOUT wont be missed
ipv4/icmp: fix rt dst dev null pointer dereference
mm/zsmalloc.c: fix build when CONFIG_COMPACTION=n
ALSA: usb-audio: Check mixer unit bitmap yet more strictly
ALSA: line6: Fix memory leak at line6_init_pcm() error path
ALSA: hda - Fixes inverted Conexant GPIO mic mute led
ALSA: seq: Fix potential concurrent access to the deleted pool
ALSA: usb-audio: Fix invalid NULL check in snd_emuusb_set_samplerate()
ALSA: usb-audio: Add implicit fb quirk for Behringer UFX1604
kvm: x86: skip populating logical dest map if apic is not sw enabled
KVM: x86: Don't update RIP or do single-step on faulting emulation
uprobes/x86: Fix detection of 32-bit user mode
x86/apic: Do not initialize LDR and DFR for bigsmp
x86/apic: Include the LDR when clearing out APIC registers
ftrace: Fix NULL pointer dereference in t_probe_next()
ftrace: Check for successful allocation of hash
ftrace: Check for empty hash and comment the race with registering probes
usb-storage: Add new JMS567 revision to unusual_devs
USB: cdc-wdm: fix race between write and disconnect due to flag abuse
usb: hcd: use managed device resources
usb: chipidea: udc: don't do hardware access if gadget has stopped
usb: host: ohci: fix a race condition between shutdown and irq
usb: host: xhci: rcar: Fix typo in compatible string matching
USB: storage: ums-realtek: Update module parameter description for auto_delink_en
USB: storage: ums-realtek: Whitelist auto-delink support
mei: me: add Tiger Lake point LP device ID
mmc: sdhci-of-at91: add quirk for broken HS200
mmc: core: Fix init of SD cards reporting an invalid VDD range
stm class: Fix a double free of stm_source_device
intel_th: pci: Add support for another Lewisburg PCH
intel_th: pci: Add Tiger Lake support
typec: tcpm: fix a typo in the comparison of pdo_max_voltage
fsi: scom: Don't abort operations for minor errors
lib: logic_pio: Fix RCU usage
lib: logic_pio: Avoid possible overlap for unregistering regions
lib: logic_pio: Add logic_pio_unregister_range()
drm/amdgpu: Add APTX quirk for Dell Latitude 5495
drm/i915: Don't deballoon unused ggtt drm_mm_node in linux guest
drm/i915: Call dma_set_max_seg_size() in i915_driver_hw_probe()
bus: hisi_lpc: Unregister logical PIO range to avoid potential use-after-free
bus: hisi_lpc: Add .remove method to avoid driver unbind crash
VMCI: Release resource if the work is already queued
crypto: ccp - Ignore unconfigured CCP device on suspend/resume
Revert "cfg80211: fix processing world regdomain when non modular"
mac80211: fix possible sta leak
mac80211: Don't memset RXCB prior to PAE intercept
mac80211: Correctly set noencrypt for PAE frames
KVM: PPC: Book3S: Fix incorrect guest-to-user-translation error handling
KVM: arm/arm64: vgic: Fix potential deadlock when ap_list is long
KVM: arm/arm64: vgic-v2: Handle SGI bits in GICD_I{S,C}PENDR0 as WI
NFS: Clean up list moves of struct nfs_page
NFSv4/pnfs: Fix a page lock leak in nfs_pageio_resend()
NFS: Pass error information to the pgio error cleanup routine
NFS: Ensure O_DIRECT reports an error if the bytes read/written is 0
i2c: piix4: Fix port selection for AMD Family 16h Model 30h
x86/ptrace: fix up botched merge of spectrev1 fix
mt76: mt76x0u: do not reset radio on resume
Revert "ASoC: Fail card instantiation if DAI format setup fails"
Linux 4.19.70
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I35ff8a403a05a8c66d87cb4b542997e63c422288
commit 372e0d01da upstream.
The race between adding a function probe and reading the probes that exist
is very subtle. It needs a comment. Also, the issue can also happen if the
probe has has the EMPTY_HASH as its func_hash.
Cc: stable@vger.kernel.org
Fixes: 7b60f3d876 ("ftrace: Dynamically create the probe ftrace_ops for the trace_array")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5b0022dd32 upstream.
In register_ftrace_function_probe(), we are not checking the return
value of alloc_and_copy_ftrace_hash(). The subsequent call to
ftrace_match_records() may end up dereferencing the same. Add a check to
ensure this doesn't happen.
Link: http://lkml.kernel.org/r/26e92574f25ad23e7cafa3cf5f7a819de1832cbe.1562249521.git.naveen.n.rao@linux.vnet.ibm.com
Cc: stable@vger.kernel.org
Fixes: 1ec3a81a0c ("ftrace: Have each function probe use its own ftrace_ops")
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7bd46644ea upstream.
LTP testsuite on powerpc results in the below crash:
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc00000000029d800
Oops: Kernel access of bad area, sig: 11 [#1]
LE SMP NR_CPUS=2048 NUMA PowerNV
...
CPU: 68 PID: 96584 Comm: cat Kdump: loaded Tainted: G W
NIP: c00000000029d800 LR: c00000000029dac4 CTR: c0000000001e6ad0
REGS: c0002017fae8ba10 TRAP: 0300 Tainted: G W
MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28022422 XER: 20040000
CFAR: c00000000029d90c DAR: 0000000000000000 DSISR: 40000000 IRQMASK: 0
...
NIP [c00000000029d800] t_probe_next+0x60/0x180
LR [c00000000029dac4] t_mod_start+0x1a4/0x1f0
Call Trace:
[c0002017fae8bc90] [c000000000cdbc40] _cond_resched+0x10/0xb0 (unreliable)
[c0002017fae8bce0] [c0000000002a15b0] t_start+0xf0/0x1c0
[c0002017fae8bd30] [c0000000004ec2b4] seq_read+0x184/0x640
[c0002017fae8bdd0] [c0000000004a57bc] sys_read+0x10c/0x300
[c0002017fae8be30] [c00000000000b388] system_call+0x5c/0x70
The test (ftrace_set_ftrace_filter.sh) is part of ftrace stress tests
and the crash happens when the test does 'cat
$TRACING_PATH/set_ftrace_filter'.
The address points to the second line below, in t_probe_next(), where
filter_hash is dereferenced:
hash = iter->probe->ops.func_hash->filter_hash;
size = 1 << hash->size_bits;
This happens due to a race with register_ftrace_function_probe(). A new
ftrace_func_probe is created and added into the func_probes list in
trace_array under ftrace_lock. However, before initializing the filter,
we drop ftrace_lock, and re-acquire it after acquiring regex_lock. If
another process is trying to read set_ftrace_filter, it will be able to
acquire ftrace_lock during this window and it will end up seeing a NULL
filter_hash.
Fix this by just checking for a NULL filter_hash in t_probe_next(). If
the filter_hash is NULL, then this probe is just being added and we can
simply return from here.
Link: http://lkml.kernel.org/r/05e021f757625cbbb006fad41380323dbe4e3b43.1562249521.git.naveen.n.rao@linux.vnet.ibm.com
Cc: stable@vger.kernel.org
Fixes: 7b60f3d876 ("ftrace: Dynamically create the probe ftrace_ops for the trace_array")
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1ncKwACgkQONu9yGCS
aT6FPg//RiiJo8O+CUzkP4MFohy8JUuGC1MnnSfSFJn9bzAljWhYtSoJlZ9PbfHq
qx9oWuQNNVZRn9nWuRbTRfRlz6ztc7whsjhAth4eNCtXvu+xAvFLvFhlbVt6xiZ0
Wg3jXDtIY3Y8km01uJdzVk/juUqvTU8nioM4s1OWTFRfOfakLMK9CkxOKfZMFnxP
mVILTcOxZAf0Js2tRMRPvm8c6OhegkXZjWUhGMlvmFKk/pqUouVXH8pKbBoTj8zR
VoHB6pWs3YG3S15WgkNfKiR9WpeXywC7XN9ilziczaQ8HbsH6Y+5wM9Ncx+3FVxd
mzygLCtlckYWjiabS/w3tQHrH+LV8MaPYuW/2tlL9sBljlDW5RBW+g7vaBQjIpCK
gco/z0qmeEIYt8ktLL08i9FQBPp00Fra9x3jZKLz8Tp+W//EBm4ENMm2cxHtRKtd
3fG70ngJmycksCK/e8N466/1f/aAOxfBZkog3R/4yqNvOX8rkQGRJlhv0AzIdsy8
RlTDotDwwQFbMWROVs/Jea+9Wwp71jlPOXyMqX2EqUR/WfDWuIZEyzSdbqKPNgHL
Q9OoUB7kIKGusqQC6ABYKtDn7T046o6ePEB8r2aIvPmw5AiWMefSubHNV6mP3KJb
0NbQlUelMTvxuwm5NLMY2EWbWi+jvg4OgJvajwcDretTPeBGMZI=
=945Y
-----END PGP SIGNATURE-----
Merge 4.19.69 into android-4.19
Changes in 4.19.69
HID: Add 044f:b320 ThrustMaster, Inc. 2 in 1 DT
MIPS: kernel: only use i8253 clocksource with periodic clockevent
mips: fix cacheinfo
netfilter: ebtables: fix a memory leak bug in compat
ASoC: dapm: Fix handling of custom_stop_condition on DAPM graph walks
selftests/bpf: fix sendmsg6_prog on s390
bonding: Force slave speed check after link state recovery for 802.3ad
net: mvpp2: Don't check for 3 consecutive Idle frames for 10G links
selftests: forwarding: gre_multipath: Enable IPv4 forwarding
selftests: forwarding: gre_multipath: Fix flower filters
can: dev: call netif_carrier_off() in register_candev()
can: mcp251x: add error check when wq alloc failed
can: gw: Fix error path of cgw_module_init
ASoC: Fail card instantiation if DAI format setup fails
st21nfca_connectivity_event_received: null check the allocation
st_nci_hci_connectivity_event_received: null check the allocation
ASoC: rockchip: Fix mono capture
ASoC: ti: davinci-mcasp: Correct slot_width posed constraint
net: usb: qmi_wwan: Add the BroadMobi BM818 card
qed: RDMA - Fix the hw_ver returned in device attributes
isdn: mISDN: hfcsusb: Fix possible null-pointer dereferences in start_isoc_chain()
mac80211_hwsim: Fix possible null-pointer dereferences in hwsim_dump_radio_nl()
netfilter: ipset: Actually allow destination MAC address for hash:ip,mac sets too
netfilter: ipset: Copy the right MAC address in bitmap:ip,mac and hash:ip,mac sets
netfilter: ipset: Fix rename concurrency with listing
rxrpc: Fix potential deadlock
rxrpc: Fix the lack of notification when sendmsg() fails on a DATA packet
isdn: hfcsusb: Fix mISDN driver crash caused by transfer buffer on the stack
net: phy: phy_led_triggers: Fix a possible null-pointer dereference in phy_led_trigger_change_speed()
perf bench numa: Fix cpu0 binding
can: sja1000: force the string buffer NULL-terminated
can: peak_usb: force the string buffer NULL-terminated
net/ethernet/qlogic/qed: force the string buffer NULL-terminated
NFSv4: Fix a potential sleep while atomic in nfs4_do_reclaim()
NFS: Fix regression whereby fscache errors are appearing on 'nofsc' mounts
HID: quirks: Set the INCREMENT_USAGE_ON_DUPLICATE quirk on Saitek X52
HID: input: fix a4tech horizontal wheel custom usage
drm/rockchip: Suspend DP late
SMB3: Fix potential memory leak when processing compound chain
SMB3: Kernel oops mounting a encryptData share with CONFIG_DEBUG_VIRTUAL
s390: put _stext and _etext into .text section
net: cxgb3_main: Fix a resource leak in a error path in 'init_one()'
net: stmmac: Fix issues when number of Queues >= 4
net: stmmac: tc: Do not return a fragment entry
net: hisilicon: make hip04_tx_reclaim non-reentrant
net: hisilicon: fix hip04-xmit never return TX_BUSY
net: hisilicon: Fix dma_map_single failed on arm64
libata: have ata_scsi_rw_xlat() fail invalid passthrough requests
libata: add SG safety checks in SFF pio transfers
x86/lib/cpu: Address missing prototypes warning
drm/vmwgfx: fix memory leak when too many retries have occurred
block, bfq: handle NULL return value by bfq_init_rq()
perf ftrace: Fix failure to set cpumask when only one cpu is present
perf cpumap: Fix writing to illegal memory in handling cpumap mask
perf pmu-events: Fix missing "cpu_clk_unhalted.core" event
KVM: arm64: Don't write junk to sysregs on reset
KVM: arm: Don't write junk to CP15 registers on reset
selftests: kvm: Adding config fragments
HID: wacom: correct misreported EKR ring values
HID: wacom: Correct distance scale for 2nd-gen Intuos devices
Revert "dm bufio: fix deadlock with loop device"
clk: socfpga: stratix10: fix rate caclulationg for cnt_clks
ceph: clear page dirty before invalidate page
ceph: don't try fill file_lock on unsuccessful GETFILELOCK reply
libceph: fix PG split vs OSD (re)connect race
drm/nouveau: Don't retry infinitely when receiving no data on i2c over AUX
gpiolib: never report open-drain/source lines as 'input' to user-space
Drivers: hv: vmbus: Fix virt_to_hvpfn() for X86_PAE
userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctx
x86/retpoline: Don't clobber RFLAGS during CALL_NOSPEC on i386
x86/apic: Handle missing global clockevent gracefully
x86/CPU/AMD: Clear RDRAND CPUID bit on AMD family 15h/16h
x86/boot: Save fields explicitly, zero out everything else
x86/boot: Fix boot regression caused by bootparam sanitizing
dm kcopyd: always complete failed jobs
dm btree: fix order of block initialization in btree_split_beneath
dm integrity: fix a crash due to BUG_ON in __journal_read_write()
dm raid: add missing cleanup in raid_ctr()
dm space map metadata: fix missing store of apply_bops() return value
dm table: fix invalid memory accesses with too high sector number
dm zoned: improve error handling in reclaim
dm zoned: improve error handling in i/o map code
dm zoned: properly handle backing device failure
genirq: Properly pair kobject_del() with kobject_add()
mm, page_owner: handle THP splits correctly
mm/zsmalloc.c: migration can leave pages in ZS_EMPTY indefinitely
mm/zsmalloc.c: fix race condition in zs_destroy_pool
xfs: fix missing ILOCK unlock when xfs_setattr_nonsize fails due to EDQUOT
xfs: don't trip over uninitialized buffer on extent read of corrupted inode
xfs: Move fs/xfs/xfs_attr.h to fs/xfs/libxfs/xfs_attr.h
xfs: Add helper function xfs_attr_try_sf_addname
xfs: Add attibute set and helper functions
xfs: Add attibute remove and helper functions
xfs: always rejoin held resources during defer roll
dm zoned: fix potential NULL dereference in dmz_do_reclaim()
powerpc: Allow flush_(inval_)dcache_range to work across ranges >4GB
rxrpc: Fix local endpoint refcounting
rxrpc: Fix read-after-free in rxrpc_queue_local()
rxrpc: Fix local endpoint replacement
rxrpc: Fix local refcounting
Linux 4.19.69
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I9824a29e0434a6a80e2f32fdb88c0ac1fe8e5af5
commit d0ff14fdc9 upstream.
If alloc_descs() fails before irq_sysfs_init() has run, free_desc() in the
cleanup path will call kobject_del() even though the kobject has not been
added with kobject_add().
Fix this by making the call to kobject_del() conditional on whether
irq_sysfs_init() has run.
This problem surfaced because commit aa30f47cf6 ("kobject: Add support
for default attribute groups to kobj_type") makes kobject_del() stricter
about pairing with kobject_add(). If the pairing is incorrrect, a WARNING
and backtrace occur in sysfs_remove_group() because there is no parent.
[ tglx: Add a comment to the code and make it work with CONFIG_SYSFS=n ]
Fixes: ecb3f394c5 ("genirq: Expose interrupt information through sysfs")
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1564703564-4116-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1iS0cACgkQONu9yGCS
aT7zbhAAyPU6KNVLa1Pj/xQf8puJ3v+FTws0X9Ii9zyWCNNuTXSi8mf4arP8oMjH
NsYvhCGBHIQO0l3kFJRnOMLp0pPZPPUgHLvFWQljKWRABUOzMLjCWolnjegIPo8E
cUwYkwx5T5oZtYH7ScxfQLiQUJB28L65+gi+/LBqANcEYL6WaSa2+UIBeVUzbMTt
tQw6TAmE6f/kGAbmiQFeIdHj6Q9MrQyGBwTdhVSe+OENlZnZq8pxbwy3GXwEBuOP
rtmUfZrSxdUmWBa7+/oW14TBe/c6j2LT0tVoyZdUFGZNbOUNv4vEImXghd28YSuv
fppeSkom5di+RDH+B+LCNm+rV5vbfQqGTMuwBT6do2EDuQQ5KqS5kR8tULI4GKVl
pNejWeK2qcNNloC7imH+4rHIy9AFewKz1ixbGolSaPXyCYfYEBx1xfpw4npng2+X
aWJuk7/DnEEdzeKu9msdycQpf0aT5vkJqquBZQTzd5HAMlgqAF/sjvKjIsaUPNu1
wDc+tyyAF0lObR7aswuhvttZ/8yczMW6pOJiM/XAfFn/a+Y7V2FRedGcIsxyeKOw
YMNoW6VskIunShpbKhxVjPViI27NM0o8vmwCKmRQ6FRpxOX6vViNj+uQOjGQxfez
N5g9jqxLsq9YGVQfoseS2Taao8JkBrTOW0wiJtE3/nwt/FaeHpU=
=M37u
-----END PGP SIGNATURE-----
Merge 4.19.68 into android-4.19
Changes in 4.19.68
sh: kernel: hw_breakpoint: Fix missing break in switch statement
seq_file: fix problem when seeking mid-record
mm/hmm: fix bad subpage pointer in try_to_unmap_one
mm: mempolicy: make the behavior consistent when MPOL_MF_MOVE* and MPOL_MF_STRICT were specified
mm: mempolicy: handle vma with unmovable pages mapped correctly in mbind
mm/memcontrol.c: fix use after free in mem_cgroup_iter()
mm/usercopy: use memory range to be accessed for wraparound check
Revert "pwm: Set class for exported channels in sysfs"
cpufreq: schedutil: Don't skip freq update when limits change
xtensa: add missing isync to the cpu_reset TLB code
ALSA: hda/realtek - Add quirk for HP Envy x360
ALSA: usb-audio: Fix a stack buffer overflow bug in check_input_term
ALSA: usb-audio: Fix an OOB bug in parse_audio_mixer_unit
ALSA: hda - Apply workaround for another AMD chip 1022:1487
ALSA: hda - Fix a memory leak bug
ALSA: hda - Add a generic reboot_notify
ALSA: hda - Let all conexant codec enter D3 when rebooting
HID: holtek: test for sanity of intfdata
HID: hiddev: avoid opening a disconnected device
HID: hiddev: do cleanup in failure of opening a device
Input: kbtab - sanity check for endpoint type
Input: iforce - add sanity checks
net: usb: pegasus: fix improper read if get_registers() fail
netfilter: ebtables: also count base chain policies
riscv: Make __fstate_clean() work correctly.
clk: at91: generated: Truncate divisor to GENERATED_MAX_DIV + 1
clk: sprd: Select REGMAP_MMIO to avoid compile errors
clk: renesas: cpg-mssr: Fix reset control race condition
xen/pciback: remove set but not used variable 'old_state'
irqchip/gic-v3-its: Free unused vpt_page when alloc vpe table fail
irqchip/irq-imx-gpcv2: Forward irq type to parent
perf header: Fix divide by zero error if f_header.attr_size==0
perf header: Fix use of unitialized value warning
libata: zpodd: Fix small read overflow in zpodd_get_mech_type()
drm/bridge: lvds-encoder: Fix build error while CONFIG_DRM_KMS_HELPER=m
Btrfs: fix deadlock between fiemap and transaction commits
scsi: hpsa: correct scsi command status issue after reset
scsi: qla2xxx: Fix possible fcport null-pointer dereferences
drm/amdgpu: fix a potential information leaking bug
ata: libahci: do not complain in case of deferred probe
kbuild: modpost: handle KBUILD_EXTRA_SYMBOLS only for external modules
kbuild: Check for unknown options with cc-option usage in Kconfig and clang
arm64/efi: fix variable 'si' set but not used
arm64: unwind: Prohibit probing on return_address()
arm64/mm: fix variable 'pud' set but not used
IB/core: Add mitigation for Spectre V1
IB/mlx5: Fix MR registration flow to use UMR properly
IB/mad: Fix use-after-free in ib mad completion handling
drm: msm: Fix add_gpu_components
drm/exynos: fix missing decrement of retry counter
Revert "kmemleak: allow to coexist with fault injection"
ocfs2: remove set but not used variable 'last_hash'
asm-generic: fix -Wtype-limits compiler warnings
arm64: KVM: regmap: Fix unexpected switch fall-through
KVM: arm/arm64: Sync ICH_VMCR_EL2 back when about to block
staging: comedi: dt3000: Fix signed integer overflow 'divider * base'
staging: comedi: dt3000: Fix rounding up of timer divisor
iio: adc: max9611: Fix temperature reading in probe
USB: core: Fix races in character device registration and deregistraion
usb: gadget: udc: renesas_usb3: Fix sysfs interface of "role"
usb: cdc-acm: make sure a refcount is taken early enough
USB: CDC: fix sanity checks in CDC union parser
USB: serial: option: add D-Link DWM-222 device ID
USB: serial: option: Add support for ZTE MF871A
USB: serial: option: add the BroadMobi BM818 card
USB: serial: option: Add Motorola modem UARTs
drm/i915/cfl: Add a new CFL PCI ID.
dm: disable DISCARD if the underlying storage no longer supports it
arm64: ftrace: Ensure module ftrace trampoline is coherent with I-side
netfilter: conntrack: Use consistent ct id hash calculation
Input: psmouse - fix build error of multiple definition
iommu/amd: Move iommu_init_pci() to .init section
bnx2x: Fix VF's VLAN reconfiguration in reload.
bonding: Add vlan tx offload to hw_enc_features
net: dsa: Check existence of .port_mdb_add callback before calling it
net/mlx4_en: fix a memory leak bug
net/packet: fix race in tpacket_snd()
sctp: fix memleak in sctp_send_reset_streams
sctp: fix the transport error_count check
team: Add vlan tx offload to hw_enc_features
tipc: initialise addr_trail_end when setting node addresses
xen/netback: Reset nr_frags before freeing skb
net/mlx5e: Only support tx/rx pause setting for port owner
net/mlx5e: Use flow keys dissector to parse packets for ARFS
mmc: sdhci-of-arasan: Do now show error message in case of deffered probe
Linux 4.19.68
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I8d9bb84a852deab3581f6cb37b4cb16e9bfe927c
commit 600f5badb7 upstream.
To avoid reducing the frequency of a CPU prematurely, we skip reducing
the frequency if the CPU had been busy recently.
This should not be done when the limits of the policy are changed, for
example due to thermal throttling. We should always get the frequency
within the new limits as soon as possible.
Trying to fix this by using only one flag, i.e. need_freq_update, can
lead to a race condition where the flag gets cleared without forcing us
to change the frequency at least once. And so this patch introduces
another flag to avoid that race condition.
Fixes: ecd2884291 ("cpufreq: schedutil: Don't set next_freq to UINT_MAX")
Cc: v4.18+ <stable@vger.kernel.org> # v4.18+
Reported-by: Doug Smythies <dsmythies@telus.net>
Tested-by: Doug Smythies <dsmythies@telus.net>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
boosted_cpu_util() sums the CFS and RT util signals before they are used
for frequency selection. While the util_avg signals are in sync,
util_est prevents cpu_util_cfs() from decreasing when a CFS task is
preempted by RT.
This util_est behaviour is beneficial in many scenarios, but it can
cause the sum of rt_util and cfs_util to go above SCHED_CAPACITY_SCALE.
Although benign, this transient value appears in the stune tracepoints,
and can cause confusion.
Work around the problem by capping the util sum to SCHED_CAPACITY_SCALE.
Bug: 120440300
Change-Id: I2be6eb157af86024e52ae11715f5637c77b201a3
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1WZYYACgkQONu9yGCS
aT5VjRAApdD6wuKcKhZ8j010Ni18w6W+3qs6IuIXv94eav0zFSRaO9Zp93lZq2p0
h+k+ssZ+P8a4EuDquzDydlagno9hojHFAYr+9loPZlZUw578Jzg9JbJK9Z1MyQCo
BCRElzZG67E+WjLP0wGHnS0oVhIoHlJaVWP3pEYkTJILY65ErLT/fYGs64YUAEKr
Ct1pKoIHPEC0606IKx12kmV645ME4z6pI7g4kLDhk2BozglbxGlwdHgVuIe/NzDP
PraR1gqMoOD2skjK673ozsZ65yuiVeqSTsbs49Xao1lAS6etUMbC/ACU/yrhL48H
mMM/EFTSKb5TjJSxQAXU1ANQrm4X6n1yPkNW/MdthnPAotDY3Nda4NNVE9X2toM7
XW0HfFdcVD7aJtpC/h6ckndGTaOGkHSPjhJtSlBEjF76BA+KhZ9hhcjNWng92bWL
d5Nws4b82wvgM6T99mkZfbMc2MOopPMf+I94W0JcMa49+rXhyhJdrC72GpxKLdSq
+XtZJupFWg0RrPlZfmc4Az8f/uY0UfR9gNSaHJokaZAkMzH2x4MzMnPxwRiXAw4W
qz1s+sgZlqUQcWvODzaNvG1l7QtjD5rbdJ+FAjN2+16F8rep52Yl/IQffYr04DDD
wikYmcUoMh8hCoj6Atj2LAAU9ulhl6ja8s0YpmHz/HQETufHAZc=
=gOG+
-----END PGP SIGNATURE-----
Merge 4.19.67 into android-4.19
Changes in 4.19.67
iio: cros_ec_accel_legacy: Fix incorrect channel setting
iio: adc: max9611: Fix misuse of GENMASK macro
staging: gasket: apex: fix copy-paste typo
staging: android: ion: Bail out upon SIGKILL when allocating memory.
crypto: ccp - Fix oops by properly managing allocated structures
crypto: ccp - Add support for valid authsize values less than 16
crypto: ccp - Ignore tag length when decrypting GCM ciphertext
usb: usbfs: fix double-free of usb memory upon submiturb error
usb: iowarrior: fix deadlock on disconnect
sound: fix a memory leak bug
mmc: cavium: Set the correct dma max segment size for mmc_host
mmc: cavium: Add the missing dma unmap when the dma has finished.
loop: set PF_MEMALLOC_NOIO for the worker thread
Input: usbtouchscreen - initialize PM mutex before using it
Input: elantech - enable SMBus on new (2018+) systems
Input: synaptics - enable RMI mode for HP Spectre X360
x86/mm: Check for pfn instead of page in vmalloc_sync_one()
x86/mm: Sync also unmappings in vmalloc_sync_all()
mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()
perf annotate: Fix s390 gap between kernel end and module start
perf db-export: Fix thread__exec_comm()
perf record: Fix module size on s390
x86/purgatory: Use CFLAGS_REMOVE rather than reset KBUILD_CFLAGS
gfs2: gfs2_walk_metadata fix
usb: host: xhci-rcar: Fix timeout in xhci_suspend()
usb: yurex: Fix use-after-free in yurex_delete
usb: typec: tcpm: free log buf memory when remove debug file
usb: typec: tcpm: remove tcpm dir if no children
usb: typec: tcpm: Add NULL check before dereferencing config
usb: typec: tcpm: Ignore unsupported/unknown alternate mode requests
can: rcar_canfd: fix possible IRQ storm on high load
can: peak_usb: fix potential double kfree_skb()
netfilter: nfnetlink: avoid deadlock due to synchronous request_module
vfio-ccw: Set pa_nr to 0 if memory allocation fails for pa_iova_pfn
netfilter: Fix rpfilter dropping vrf packets by mistake
netfilter: conntrack: always store window size un-scaled
netfilter: nft_hash: fix symhash with modulus one
scripts/sphinx-pre-install: fix script for RHEL/CentOS
drm/amd/display: Wait for backlight programming completion in set backlight level
drm/amd/display: use encoder's engine id to find matched free audio device
drm/amd/display: Fix dc_create failure handling and 666 color depths
drm/amd/display: Only enable audio if speaker allocation exists
drm/amd/display: Increase size of audios array
iscsi_ibft: make ISCSI_IBFT dependson ACPI instead of ISCSI_IBFT_FIND
nl80211: fix NL80211_HE_MAX_CAPABILITY_LEN
mac80211: don't warn about CW params when not using them
allocate_flower_entry: should check for null deref
hwmon: (nct6775) Fix register address and added missed tolerance for nct6106
drm: silence variable 'conn' set but not used
cpufreq/pasemi: fix use-after-free in pas_cpufreq_cpu_init()
s390/qdio: add sanity checks to the fast-requeue path
ALSA: compress: Fix regression on compressed capture streams
ALSA: compress: Prevent bypasses of set_params
ALSA: compress: Don't allow paritial drain operations on capture streams
ALSA: compress: Be more restrictive about when a drain is allowed
perf tools: Fix proper buffer size for feature processing
perf probe: Avoid calling freeing routine multiple times for same pointer
drbd: dynamically allocate shash descriptor
ACPI/IORT: Fix off-by-one check in iort_dev_find_its_id()
nvme: fix multipath crash when ANA is deactivated
ARM: davinci: fix sleep.S build error on ARMv4
ARM: dts: bcm: bcm47094: add missing #cells for mdio-bus-mux
scsi: megaraid_sas: fix panic on loading firmware crashdump
scsi: ibmvfc: fix WARN_ON during event pool release
scsi: scsi_dh_alua: always use a 2 second delay before retrying RTPG
test_firmware: fix a memory leak bug
tty/ldsem, locking/rwsem: Add missing ACQUIRE to read_failed sleep loop
perf/core: Fix creating kernel counters for PMUs that override event->cpu
s390/dma: provide proper ARCH_ZONE_DMA_BITS value
HID: sony: Fix race condition between rumble and device remove.
x86/purgatory: Do not use __builtin_memcpy and __builtin_memset
ALSA: usb-audio: fix a memory leak bug
can: peak_usb: pcan_usb_pro: Fix info-leaks to USB devices
can: peak_usb: pcan_usb_fd: Fix info-leaks to USB devices
hwmon: (nct7802) Fix wrong detection of in4 presence
drm/i915: Fix wrong escape clock divisor init for GLK
ALSA: firewire: fix a memory leak bug
ALSA: hiface: fix multiple memory leak bugs
ALSA: hda - Don't override global PCM hw info flag
ALSA: hda - Workaround for crackled sound on AMD controller (1022:1457)
mac80211: don't WARN on short WMM parameters from AP
dax: dax_layout_busy_page() should not unmap cow pages
SMB3: Fix deadlock in validate negotiate hits reconnect
smb3: send CAP_DFS capability during session setup
NFSv4: Fix an Oops in nfs4_do_setattr
KVM: Fix leak vCPU's VMCS value into other pCPU
mwifiex: fix 802.11n/WPA detection
iwlwifi: don't unmap as page memory that was mapped as single
iwlwifi: mvm: fix an out-of-bound access
iwlwifi: mvm: don't send GEO_TX_POWER_LIMIT on version < 41
iwlwifi: mvm: fix version check for GEO_TX_POWER_LIMIT support
Linux 4.19.67
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I5ea813ed5ba6d1eeda51eb4031395ee3e8ba54c3
[ Upstream commit 4ce54af8b3 ]
Some hardware PMU drivers will override perf_event.cpu inside their
event_init callback. This causes a lockdep splat when initialized through
the kernel API:
WARNING: CPU: 0 PID: 250 at kernel/events/core.c:2917 ctx_sched_out+0x78/0x208
pc : ctx_sched_out+0x78/0x208
Call trace:
ctx_sched_out+0x78/0x208
__perf_install_in_context+0x160/0x248
remote_function+0x58/0x68
generic_exec_single+0x100/0x180
smp_call_function_single+0x174/0x1b8
perf_install_in_context+0x178/0x188
perf_event_create_kernel_counter+0x118/0x160
Fix this by calling perf_install_in_context with event->cpu, just like
perf_event_open
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Frank Li <Frank.li@nxp.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/r/c4ebe0503623066896d7046def4d6b1e06e0eb2e.1563972056.git.leonard.crestez@nxp.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
There is a race between reading task->exit_state in pidfd_poll and
writing it after do_notify_parent calls do_notify_pidfd. Expected
sequence of events is:
CPU 0 CPU 1
------------------------------------------------
exit_notify
do_notify_parent
do_notify_pidfd
tsk->exit_state = EXIT_DEAD
pidfd_poll
if (tsk->exit_state)
However nothing prevents the following sequence:
CPU 0 CPU 1
------------------------------------------------
exit_notify
do_notify_parent
do_notify_pidfd
pidfd_poll
if (tsk->exit_state)
tsk->exit_state = EXIT_DEAD
This causes a polling task to wait forever, since poll blocks because
exit_state is 0 and the waiting task is not notified again. A stress
test continuously doing pidfd poll and process exits uncovered this bug.
To fix it, we make sure that the task's exit_state is always set before
calling do_notify_pidfd.
Fixes: b53b0b9d9a ("pidfd: add polling support")
Cc: kernel-team@android.com
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Link: https://lore.kernel.org/r/20190717172100.261204-1-joel@joelfernandes.org
[christian@brauner.io: adapt commit message and drop unneeded changes from wait_task_zombie]
Signed-off-by: Christian Brauner <christian@brauner.io>
(cherry picked from commit b191d6491b)
Bug: 135608568
Test: test program using syscall(__NR_sys_pidfd_open,..) and poll()
Change-Id: Ife81348ae3c3b6f5f8caac0c2f9bacd656582b31
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
This adds the pidfd_open() syscall. It allows a caller to retrieve pollable
pidfds for a process which did not get created via CLONE_PIDFD, i.e. for a
process that is created via traditional fork()/clone() calls that is only
referenced by a PID:
int pidfd = pidfd_open(1234, 0);
ret = pidfd_send_signal(pidfd, SIGSTOP, NULL, 0);
With the introduction of pidfds through CLONE_PIDFD it is possible to
created pidfds at process creation time.
However, a lot of processes get created with traditional PID-based calls
such as fork() or clone() (without CLONE_PIDFD). For these processes a
caller can currently not create a pollable pidfd. This is a problem for
Android's low memory killer (LMK) and service managers such as systemd.
Both are examples of tools that want to make use of pidfds to get reliable
notification of process exit for non-parents (pidfd polling) and race-free
signal sending (pidfd_send_signal()). They intend to switch to this API for
process supervision/management as soon as possible. Having no way to get
pollable pidfds from PID-only processes is one of the biggest blockers for
them in adopting this api. With pidfd_open() making it possible to retrieve
pidfds for PID-based processes we enable them to adopt this api.
In line with Arnd's recent changes to consolidate syscall numbers across
architectures, I have added the pidfd_open() syscall to all architectures
at the same time.
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: David Howells <dhowells@redhat.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jann Horn <jannh@google.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-api@vger.kernel.org
(cherry picked from commit 32fcb426ec)
Bug: 135608568
Test: test program using syscall(__NR_sys_pidfd_open,..) and poll()
Change-Id: I97583cfa441a08585cbedf9a44f982c0d0ee5583
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
This patch adds polling support to pidfd.
Android low memory killer (LMK) needs to know when a process dies once
it is sent the kill signal. It does so by checking for the existence of
/proc/pid which is both racy and slow. For example, if a PID is reused
between when LMK sends a kill signal and checks for existence of the
PID, since the wrong PID is now possibly checked for existence.
Using the polling support, LMK will be able to get notified when a process
exists in race-free and fast way, and allows the LMK to do other things
(such as by polling on other fds) while awaiting the process being killed
to die.
For notification to polling processes, we follow the same existing
mechanism in the kernel used when the parent of the task group is to be
notified of a child's death (do_notify_parent). This is precisely when the
tasks waiting on a poll of pidfd are also awakened in this patch.
We have decided to include the waitqueue in struct pid for the following
reasons:
1. The wait queue has to survive for the lifetime of the poll. Including
it in task_struct would not be option in this case because the task can
be reaped and destroyed before the poll returns.
2. By including the struct pid for the waitqueue means that during
de_thread(), the new thread group leader automatically gets the new
waitqueue/pid even though its task_struct is different.
Appropriate test cases are added in the second patch to provide coverage of
all the cases the patch is handling.
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Daniel Colascione <dancol@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Tim Murray <timmurray@google.com>
Cc: Jonathan Kowalski <bl0pbl33p@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Kees Cook <keescook@chromium.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: kernel-team@android.com
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Co-developed-by: Daniel Colascione <dancol@google.com>
Signed-off-by: Daniel Colascione <dancol@google.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Christian Brauner <christian@brauner.io>
(cherry picked from commit b53b0b9d9a)
Bug: 135608568
Test: test program using syscall(__NR_sys_pidfd_open,..) and poll()
Change-Id: I0391acb666b34e33ef60a778dffdf12ed3654393
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Improve the comments for pidfd_send_signal().
First, the comment still referred to a file descriptor for a process as a
"task file descriptor" which stems from way back at the beginning of the
discussion. Replace this with "pidfd" for consistency.
Second, the wording for the explanation of the arguments to the syscall
was a bit inconsistent, e.g. some used the past tense some used present
tense. Make the wording more consistent.
Signed-off-by: Christian Brauner <christian@brauner.io>
(cherry picked from commit c732327f04)
Bug: 135608568
Test: test program using syscall(__NR_sys_pidfd_open,..) and poll()
Change-Id: Iaa289e1bc2f96366102e51241accbb97099553d4
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Let pidfd_send_signal() use pidfds retrieved via CLONE_PIDFD. With this
patch pidfd_send_signal() becomes independent of procfs. This fullfils
the request made when we merged the pidfd_send_signal() patchset. The
pidfd_send_signal() syscall is now always available allowing for it to
be used by users without procfs mounted or even users without procfs
support compiled into the kernel.
Signed-off-by: Christian Brauner <christian@brauner.io>
Co-developed-by: Jann Horn <jannh@google.com>
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Howells <dhowells@redhat.com>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit 2151ad1b06)
Bug: 135608568
Test: test program using syscall(__NR_sys_pidfd_open,..) and poll()
Change-Id: If6ee59279f41e0ae3c6d9e24f7b5481f23aab469
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
This patchset makes it possible to retrieve pid file descriptors at
process creation time by introducing the new flag CLONE_PIDFD to the
clone() system call. Linus originally suggested to implement this as a
new flag to clone() instead of making it a separate system call. As
spotted by Linus, there is exactly one bit for clone() left.
CLONE_PIDFD creates file descriptors based on the anonymous inode
implementation in the kernel that will also be used to implement the new
mount api. They serve as a simple opaque handle on pids. Logically,
this makes it possible to interpret a pidfd differently, narrowing or
widening the scope of various operations (e.g. signal sending). Thus, a
pidfd cannot just refer to a tgid, but also a tid, or in theory - given
appropriate flag arguments in relevant syscalls - a process group or
session. A pidfd does not represent a privilege. This does not imply it
cannot ever be that way but for now this is not the case.
A pidfd comes with additional information in fdinfo if the kernel supports
procfs. The fdinfo file contains the pid of the process in the callers
pid namespace in the same format as the procfs status file, i.e. "Pid:\t%d".
As suggested by Oleg, with CLONE_PIDFD the pidfd is returned in the
parent_tidptr argument of clone. This has the advantage that we can
give back the associated pid and the pidfd at the same time.
To remove worries about missing metadata access this patchset comes with
a sample program that illustrates how a combination of CLONE_PIDFD, and
pidfd_send_signal() can be used to gain race-free access to process
metadata through /proc/<pid>. The sample program can easily be
translated into a helper that would be suitable for inclusion in libc so
that users don't have to worry about writing it themselves.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christian Brauner <christian@brauner.io>
Co-developed-by: Jann Horn <jannh@google.com>
Signed-off-by: Jann Horn <jannh@google.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: David Howells <dhowells@redhat.com>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
(cherry picked from commit b3e5838252)
Bug: 135608568
Test: test program using syscall(__NR_sys_pidfd_open,..) and poll()
Change-Id: I8a8f87e8fb23de0adb6d6acf2e622926b7a9f55c
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
As stated in the original commit for pidfd_send_signal() we don't allow
to signal processes through O_PATH file descriptors since it is
semantically equivalent to a write on the pidfd.
We already correctly error out right now and return EBADF if an O_PATH
fd is passed. This is because we use file->f_op to detect whether a
pidfd is passed and O_PATH fds have their file->f_op set to empty_fops
in do_dentry_open() and thus fail the test.
Thus, there is no regression. It's just semantically correct to use
fdget() and return an error right from there instead of taking a
reference and returning an error later.
Signed-off-by: Christian Brauner <christian@brauner.io>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jann Horn <jann@thejh.net>
Cc: David Howells <dhowells@redhat.com>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 738a7832d2)
Bug: 135608568
Test: test program using syscall(__NR_pidfd_send_signal,..) to send SIGKILL
Change-Id: Ib102922f9793e8610940d34ad5fb1256d4b07476
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
The current sys_pidfd_send_signal() silently turns signals with explicit
SI_USER context that are sent to non-current tasks into signals with
kernel-generated siginfo.
This is unlike do_rt_sigqueueinfo(), which returns -EPERM in this case.
If a user actually wants to send a signal with kernel-provided siginfo,
they can do that with pidfd_send_signal(pidfd, sig, NULL, 0); so allowing
this case is unnecessary.
Instead of silently replacing the siginfo, just bail out with an error;
this is consistent with other interfaces and avoids special-casing behavior
based on security checks.
Fixes: 3eb39f4793 ("signal: add pidfd_send_signal() syscall")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Christian Brauner <christian@brauner.io>
(cherry picked from commit 556a888a14)
Bug: 135608568
Test: test program using syscall(__NR_pidfd_send_signal,..) to send SIGKILL
Change-Id: I004452c19c50296730a2c6852a5ef47abd69d819
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
The kill() syscall operates on process identifiers (pid). After a process
has exited its pid can be reused by another process. If a caller sends a
signal to a reused pid it will end up signaling the wrong process. This
issue has often surfaced and there has been a push to address this problem [1].
This patch uses file descriptors (fd) from proc/<pid> as stable handles on
struct pid. Even if a pid is recycled the handle will not change. The fd
can be used to send signals to the process it refers to.
Thus, the new syscall pidfd_send_signal() is introduced to solve this
problem. Instead of pids it operates on process fds (pidfd).
/* prototype and argument /*
long pidfd_send_signal(int pidfd, int sig, siginfo_t *info, unsigned int flags);
/* syscall number 424 */
The syscall number was chosen to be 424 to align with Arnd's rework in his
y2038 to minimize merge conflicts (cf. [25]).
In addition to the pidfd and signal argument it takes an additional
siginfo_t and flags argument. If the siginfo_t argument is NULL then
pidfd_send_signal() is equivalent to kill(<positive-pid>, <signal>). If it
is not NULL pidfd_send_signal() is equivalent to rt_sigqueueinfo().
The flags argument is added to allow for future extensions of this syscall.
It currently needs to be passed as 0. Failing to do so will cause EINVAL.
/* pidfd_send_signal() replaces multiple pid-based syscalls */
The pidfd_send_signal() syscall currently takes on the job of
rt_sigqueueinfo(2) and parts of the functionality of kill(2), Namely, when a
positive pid is passed to kill(2). It will however be possible to also
replace tgkill(2) and rt_tgsigqueueinfo(2) if this syscall is extended.
/* sending signals to threads (tid) and process groups (pgid) */
Specifically, the pidfd_send_signal() syscall does currently not operate on
process groups or threads. This is left for future extensions.
In order to extend the syscall to allow sending signal to threads and
process groups appropriately named flags (e.g. PIDFD_TYPE_PGID, and
PIDFD_TYPE_TID) should be added. This implies that the flags argument will
determine what is signaled and not the file descriptor itself. Put in other
words, grouping in this api is a property of the flags argument not a
property of the file descriptor (cf. [13]). Clarification for this has been
requested by Eric (cf. [19]).
When appropriate extensions through the flags argument are added then
pidfd_send_signal() can additionally replace the part of kill(2) which
operates on process groups as well as the tgkill(2) and
rt_tgsigqueueinfo(2) syscalls.
How such an extension could be implemented has been very roughly sketched
in [14], [15], and [16]. However, this should not be taken as a commitment
to a particular implementation. There might be better ways to do it.
Right now this is intentionally left out to keep this patchset as simple as
possible (cf. [4]).
/* naming */
The syscall had various names throughout iterations of this patchset:
- procfd_signal()
- procfd_send_signal()
- taskfd_send_signal()
In the last round of reviews it was pointed out that given that if the
flags argument decides the scope of the signal instead of different types
of fds it might make sense to either settle for "procfd_" or "pidfd_" as
prefix. The community was willing to accept either (cf. [17] and [18]).
Given that one developer expressed strong preference for the "pidfd_"
prefix (cf. [13]) and with other developers less opinionated about the name
we should settle for "pidfd_" to avoid further bikeshedding.
The "_send_signal" suffix was chosen to reflect the fact that the syscall
takes on the job of multiple syscalls. It is therefore intentional that the
name is not reminiscent of neither kill(2) nor rt_sigqueueinfo(2). Not the
fomer because it might imply that pidfd_send_signal() is a replacement for
kill(2), and not the latter because it is a hassle to remember the correct
spelling - especially for non-native speakers - and because it is not
descriptive enough of what the syscall actually does. The name
"pidfd_send_signal" makes it very clear that its job is to send signals.
/* zombies */
Zombies can be signaled just as any other process. No special error will be
reported since a zombie state is an unreliable state (cf. [3]). However,
this can be added as an extension through the @flags argument if the need
ever arises.
/* cross-namespace signals */
The patch currently enforces that the signaler and signalee either are in
the same pid namespace or that the signaler's pid namespace is an ancestor
of the signalee's pid namespace. This is done for the sake of simplicity
and because it is unclear to what values certain members of struct
siginfo_t would need to be set to (cf. [5], [6]).
/* compat syscalls */
It became clear that we would like to avoid adding compat syscalls
(cf. [7]). The compat syscall handling is now done in kernel/signal.c
itself by adding __copy_siginfo_from_user_generic() which lets us avoid
compat syscalls (cf. [8]). It should be noted that the addition of
__copy_siginfo_from_user_any() is caused by a bug in the original
implementation of rt_sigqueueinfo(2) (cf. 12).
With upcoming rework for syscall handling things might improve
significantly (cf. [11]) and __copy_siginfo_from_user_any() will not gain
any additional callers.
/* testing */
This patch was tested on x64 and x86.
/* userspace usage */
An asciinema recording for the basic functionality can be found under [9].
With this patch a process can be killed via:
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
static inline int do_pidfd_send_signal(int pidfd, int sig, siginfo_t *info,
unsigned int flags)
{
#ifdef __NR_pidfd_send_signal
return syscall(__NR_pidfd_send_signal, pidfd, sig, info, flags);
#else
return -ENOSYS;
#endif
}
int main(int argc, char *argv[])
{
int fd, ret, saved_errno, sig;
if (argc < 3)
exit(EXIT_FAILURE);
fd = open(argv[1], O_DIRECTORY | O_CLOEXEC);
if (fd < 0) {
printf("%s - Failed to open \"%s\"\n", strerror(errno), argv[1]);
exit(EXIT_FAILURE);
}
sig = atoi(argv[2]);
printf("Sending signal %d to process %s\n", sig, argv[1]);
ret = do_pidfd_send_signal(fd, sig, NULL, 0);
saved_errno = errno;
close(fd);
errno = saved_errno;
if (ret < 0) {
printf("%s - Failed to send signal %d to process %s\n",
strerror(errno), sig, argv[1]);
exit(EXIT_FAILURE);
}
exit(EXIT_SUCCESS);
}
/* Q&A
* Given that it seems the same questions get asked again by people who are
* late to the party it makes sense to add a Q&A section to the commit
* message so it's hopefully easier to avoid duplicate threads.
*
* For the sake of progress please consider these arguments settled unless
* there is a new point that desperately needs to be addressed. Please make
* sure to check the links to the threads in this commit message whether
* this has not already been covered.
*/
Q-01: (Florian Weimer [20], Andrew Morton [21])
What happens when the target process has exited?
A-01: Sending the signal will fail with ESRCH (cf. [22]).
Q-02: (Andrew Morton [21])
Is the task_struct pinned by the fd?
A-02: No. A reference to struct pid is kept. struct pid - as far as I
understand - was created exactly for the reason to not require to
pin struct task_struct (cf. [22]).
Q-03: (Andrew Morton [21])
Does the entire procfs directory remain visible? Just one entry
within it?
A-03: The same thing that happens right now when you hold a file descriptor
to /proc/<pid> open (cf. [22]).
Q-04: (Andrew Morton [21])
Does the pid remain reserved?
A-04: No. This patchset guarantees a stable handle not that pids are not
recycled (cf. [22]).
Q-05: (Andrew Morton [21])
Do attempts to signal that fd return errors?
A-05: See {Q,A}-01.
Q-06: (Andrew Morton [22])
Is there a cleaner way of obtaining the fd? Another syscall perhaps.
A-06: Userspace can already trivially retrieve file descriptors from procfs
so this is something that we will need to support anyway. Hence,
there's no immediate need to add another syscalls just to make
pidfd_send_signal() not dependent on the presence of procfs. However,
adding a syscalls to get such file descriptors is planned for a
future patchset (cf. [22]).
Q-07: (Andrew Morton [21] and others)
This fd-for-a-process sounds like a handy thing and people may well
think up other uses for it in the future, probably unrelated to
signals. Are the code and the interface designed to permit such
future applications?
A-07: Yes (cf. [22]).
Q-08: (Andrew Morton [21] and others)
Now I think about it, why a new syscall? This thing is looking
rather like an ioctl?
A-08: This has been extensively discussed. It was agreed that a syscall is
preferred for a variety or reasons. Here are just a few taken from
prior threads. Syscalls are safer than ioctl()s especially when
signaling to fds. Processes are a core kernel concept so a syscall
seems more appropriate. The layout of the syscall with its four
arguments would require the addition of a custom struct for the
ioctl() thereby causing at least the same amount or even more
complexity for userspace than a simple syscall. The new syscall will
replace multiple other pid-based syscalls (see description above).
The file-descriptors-for-processes concept introduced with this
syscall will be extended with other syscalls in the future. See also
[22], [23] and various other threads already linked in here.
Q-09: (Florian Weimer [24])
What happens if you use the new interface with an O_PATH descriptor?
A-09:
pidfds opened as O_PATH fds cannot be used to send signals to a
process (cf. [2]). Signaling processes through pidfds is the
equivalent of writing to a file. Thus, this is not an operation that
operates "purely at the file descriptor level" as required by the
open(2) manpage. See also [4].
/* References */
[1]: https://lore.kernel.org/lkml/20181029221037.87724-1-dancol@google.com/
[2]: https://lore.kernel.org/lkml/874lbtjvtd.fsf@oldenburg2.str.redhat.com/
[3]: https://lore.kernel.org/lkml/20181204132604.aspfupwjgjx6fhva@brauner.io/
[4]: https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru2ku@brauner.io/
[5]: https://lore.kernel.org/lkml/20181121213946.GA10795@mail.hallyn.com/
[6]: https://lore.kernel.org/lkml/20181120103111.etlqp7zop34v6nv4@brauner.io/
[7]: https://lore.kernel.org/lkml/36323361-90BD-41AF-AB5B-EE0D7BA02C21@amacapital.net/
[8]: https://lore.kernel.org/lkml/87tvjxp8pc.fsf@xmission.com/
[9]: https://asciinema.org/a/IQjuCHew6bnq1cr78yuMv16cy
[11]: https://lore.kernel.org/lkml/F53D6D38-3521-4C20-9034-5AF447DF62FF@amacapital.net/
[12]: https://lore.kernel.org/lkml/87zhtjn8ck.fsf@xmission.com/
[13]: https://lore.kernel.org/lkml/871s6u9z6u.fsf@xmission.com/
[14]: https://lore.kernel.org/lkml/20181206231742.xxi4ghn24z4h2qki@brauner.io/
[15]: https://lore.kernel.org/lkml/20181207003124.GA11160@mail.hallyn.com/
[16]: https://lore.kernel.org/lkml/20181207015423.4miorx43l3qhppfz@brauner.io/
[17]: https://lore.kernel.org/lkml/CAGXu5jL8PciZAXvOvCeCU3wKUEB_dU-O3q0tDw4uB_ojMvDEew@mail.gmail.com/
[18]: https://lore.kernel.org/lkml/20181206222746.GB9224@mail.hallyn.com/
[19]: https://lore.kernel.org/lkml/20181208054059.19813-1-christian@brauner.io/
[20]: https://lore.kernel.org/lkml/8736rebl9s.fsf@oldenburg.str.redhat.com/
[21]: https://lore.kernel.org/lkml/20181228152012.dbf0508c2508138efc5f2bbe@linux-foundation.org/
[22]: https://lore.kernel.org/lkml/20181228233725.722tdfgijxcssg76@brauner.io/
[23]: https://lwn.net/Articles/773459/
[24]: https://lore.kernel.org/lkml/8736rebl9s.fsf@oldenburg.str.redhat.com/
[25]: https://lore.kernel.org/lkml/CAK8P3a0ej9NcJM8wXNPbcGUyOUZYX+VLoDFdbenW3s3114oQZw@mail.gmail.com/
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Jann Horn <jannh@google.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Christian Brauner <christian@brauner.io>
Reviewed-by: Tycho Andersen <tycho@tycho.ws>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Serge Hallyn <serge@hallyn.com>
Acked-by: Aleksa Sarai <cyphar@cyphar.com>
(cherry picked from commit 3eb39f4793)
Conflicts:
include/linux/proc_fs.h - trivial manual merge
include/uapi/asm-generic/unistd.h - trivial manual merge
kernel/signal.c
(1. manual merges because of 4.19 differences
2. change prepare_kill_siginfo() to use struct siginfo instead of
kernel_siginfo
3. change copy_siginfo_from_user_any() to use struct siginfo instead of
kernel_siginfo
4. change pidfd_send_signal() to use struct siginfo instead of
kernel_siginfo
5. use copy_from_user() instead of copy_siginfo_from_user() in
copy_siginfo_from_user_any())
Bug: 135608568
Test: test program using syscall(__NR_pidfd_send_signal,..) to send SIGKILL
Change-Id: I24e6298ecf036d1822f3fa6c5286984b4e195c16
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1NlswACgkQONu9yGCS
aT6bJBAAhlElptL5xPWWrkZdYBVyfcat3VaeEmtO7ilHxyvhFMFOE8zIYEg/NU5s
ySM1yqdYvZY1ANawzIEH3es7eXy++GBBIVcRUPBuEVG/yv4/XyeX6MQYlDn3LuzI
KsTAkkkqy4LLieqZTF5cqavu1EuUkQWkPxKW1ps9O5Dv24FRmKhj+EiIoEGt43Ln
6d8vChzS+ZWNhWMJd4zP8x5ZXjUUTYlBxveRQQ9yatTbzrQdfkxyD7k6+1K8se8T
m9koOVJC9LtbR7W2WbVVq4L6ik3l3LQTr592kZVKiwyVgJtlul3LvqS0hNXcQxSj
pwfbz+4b8UP4aRgRbMxzSRLOkb+hUOvYR+CuGLKx827b9FQcLNAbseRsA0MlPENF
jJJUQSDollhx4knbYU+8Y2V1WtDi7Dnjt5gRvCvQ6rdrJMp1mFdqRGxfsdBaR10N
kU+LaSfIheEvuRoTv535NpxvPQhmEqY6NTGwMYDP1xlvu4vQbWF+HAMhCLmTWrif
i5ITtxYxSZOHNy7lvR7dNTMWsLoYg0KvG/oqJo8kSLmlBPcga+VK1YlPsc6JrfEG
d4ft7rMz/Cx2oeUQfiURqi/XPwyIBaU6MbovLuQkVx9CJcwgYcmdFD0HRHZIEwgk
z0q965POxm3DUoJF8HjBFNUOmlaYn9JZBmrwg5aFgwCy5Kot8g0=
=aeth
-----END PGP SIGNATURE-----
Merge 4.19.66 into android-4.19
Changes in 4.19.66
scsi: fcoe: Embed fc_rport_priv in fcoe_rport structure
gcc-9: don't warn about uninitialized variable
driver core: Establish order of operations for device_add and device_del via bitflag
drivers/base: Introduce kill_device()
libnvdimm/bus: Prevent duplicate device_unregister() calls
libnvdimm/region: Register badblocks before namespaces
libnvdimm/bus: Prepare the nd_ioctl() path to be re-entrant
libnvdimm/bus: Fix wait_nvdimm_bus_probe_idle() ABBA deadlock
HID: wacom: fix bit shift for Cintiq Companion 2
HID: Add quirk for HP X1200 PIXART OEM mouse
IB: directly cast the sockaddr union to aockaddr
atm: iphase: Fix Spectre v1 vulnerability
bnx2x: Disable multi-cos feature.
ife: error out when nla attributes are empty
ip6_gre: reload ipv6h in prepare_ip6gre_xmit_ipv6
ip6_tunnel: fix possible use-after-free on xmit
ipip: validate header length in ipip_tunnel_xmit
mlxsw: spectrum: Fix error path in mlxsw_sp_module_init()
mvpp2: fix panic on module removal
mvpp2: refactor MTU change code
net: bridge: delete local fdb on device init failure
net: bridge: mcast: don't delete permanent entries when fast leave is enabled
net: fix ifindex collision during namespace removal
net/mlx5e: always initialize frag->last_in_page
net/mlx5: Use reversed order when unregister devices
net: phylink: Fix flow control for fixed-link
net: qualcomm: rmnet: Fix incorrect UL checksum offload logic
net: sched: Fix a possible null-pointer dereference in dequeue_func()
net sched: update vlan action for batched events operations
net: sched: use temporary variable for actions indexes
net/smc: do not schedule tx_work in SMC_CLOSED state
NFC: nfcmrvl: fix gpio-handling regression
ocelot: Cancel delayed work before wq destruction
tipc: compat: allow tipc commands without arguments
tun: mark small packets as owned by the tap sock
net/mlx5: Fix modify_cq_in alignment
net/mlx5e: Prevent encap flow counter update async to user query
r8169: don't use MSI before RTL8168d
compat_ioctl: pppoe: fix PPPOEIOCSFWD handling
cgroup: Call cgroup_release() before __exit_signal()
cgroup: Implement css_task_iter_skip()
cgroup: Include dying leaders with live threads in PROCS iterations
cgroup: css_task_iter_skip()'d iterators must be advanced before accessed
cgroup: Fix css_task_iter_advance_css_set() cset skip condition
spi: bcm2835: Fix 3-wire mode if DMA is enabled
Linux 4.19.66
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Id33ce169af8bf14a3791040b4cf923832ce84f6c
commit c596687a00 upstream.
While adding handling for dying task group leaders c03cd7738a
("cgroup: Include dying leaders with live threads in PROCS
iterations") added an inverted cset skip condition to
css_task_iter_advance_css_set(). It should skip cset if it's
completely empty but was incorrectly testing for the inverse condition
for the dying_tasks list. Fix it.
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: c03cd7738a ("cgroup: Include dying leaders with live threads in PROCS iterations")
Reported-by: syzbot+d4bba5ccd4f9a2a68681@syzkaller.appspotmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cee0c33c54 upstream.
b636fd38dc ("cgroup: Implement css_task_iter_skip()") introduced
css_task_iter_skip() which is used to fix task iterations skipping
dying threadgroup leaders with live threads. Skipping is implemented
as a subportion of full advancing but css_task_iter_next() forgot to
fully advance a skipped iterator before determining the next task to
visit causing it to return invalid task pointers.
Fix it by making css_task_iter_next() fully advance the iterator if it
has been skipped since the previous iteration.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: syzbot
Link: http://lkml.kernel.org/r/00000000000097025d058a7fd785@google.com
Fixes: b636fd38dc ("cgroup: Implement css_task_iter_skip()")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c03cd7738a upstream.
CSS_TASK_ITER_PROCS currently iterates live group leaders; however,
this means that a process with dying leader and live threads will be
skipped. IOW, cgroup.procs might be empty while cgroup.threads isn't,
which is confusing to say the least.
Fix it by making cset track dying tasks and include dying leaders with
live threads in PROCS iteration.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Topi Miettinen <toiwoton@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b636fd38dc upstream.
When a task is moved out of a cset, task iterators pointing to the
task are advanced using the normal css_task_iter_advance() call. This
is fine but we'll be tracking dying tasks on csets and thus moving
tasks from cset->tasks to (to be added) cset->dying_tasks. When we
remove a task from cset->tasks, if we advance the iterators, they may
move over to the next cset before we had the chance to add the task
back on the dying list, which can allow the task to escape iteration.
This patch separates out skipping from advancing. Skipping only moves
the affected iterators to the next pointer rather than fully advancing
it and the following advancing will recognize that the cursor has
already been moved forward and do the rest of advancing. This ensures
that when a task moves from one list to another in its cset, as long
as it moves in the right direction, it's always visible to iteration.
This doesn't cause any visible behavior changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6b115bf58e upstream.
cgroup_release() calls cgroup_subsys->release() which is used by the
pids controller to uncharge its pid. We want to use it to manage
iteration of dying tasks which requires putting it before
__unhash_process(). Move cgroup_release() above __exit_signal().
While this makes it uncharge before the pid is freed, pid is RCU freed
anyway and the window is very narrow.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1Js7MACgkQONu9yGCS
aT4PQxAAo7xa4kYvDxc1RjUY/yIlp6lQ3rpYAAfZB0t8vN+dqivnJZ7m6JHeWX1Y
CMcxg85zxLVFeuiXdP821Zj68AB5zqlWMhX0bXm2lhw/Eo9+XHzXtnrLZHhz0/Xd
M5cmfIPmoyPCUQQfzSfUMvch+ZpwzEt5op5pUfSjckSpjHQZ0HFj1WJ4D8Hn9jAJ
y4+DAKDZgtqhb3GvpS6MoVnBJgcPk9+mBiDkSb12L392+FvHqfeBi3tDRhvyiZAO
iJrk747SPds7NlNmuRnj7YyUSDhBzaceRCz0Jsv9FT5EKXoPErXdsL3Bkfa9TREM
pH0OaMgNr6WSXLO9qIMcfxMeaKVIvIbotqBTkBTzhEAGPkHA75dhi0lpixXXFExg
MaqhLfmHO0dOEr9FrvYGe7f2wUA1Rdw/qRTM3KPEKmHxMqBS7eufIWMHwie1n9Oe
cYoP6UkxUIvhUyFV2BlMRFdMfaDbtR0iqy8Dqh36NISD6PAYaUGSoVeSO1fEg4Jy
5GgrKPg6rcz2XNY2cVbsm2zLpqY4dY58SFK9ORfuULdKUQvScvFGrdSSW0CgX+uc
F/5NmPutUoboHVxFraDPx7yo46pHf1RW0Me4xZ0aJ3e9ituLAN4fmJ9u46nofb5M
thPelQlMVt30O41uViJ0ADkOjCsiBr3AxOFvc76Ct9Q/BJVxhLk=
=JVBv
-----END PGP SIGNATURE-----
Merge 4.19.65 into android-4.19
Changes in 4.19.65
ARM: riscpc: fix DMA
ARM: dts: rockchip: Make rk3288-veyron-minnie run at hs200
ARM: dts: rockchip: Make rk3288-veyron-mickey's emmc work again
ARM: dts: rockchip: Mark that the rk3288 timer might stop in suspend
ftrace: Enable trampoline when rec count returns back to one
dmaengine: tegra-apb: Error out if DMA_PREP_INTERRUPT flag is unset
arm64: dts: rockchip: fix isp iommu clocks and power domain
kernel/module.c: Only return -EEXIST for modules that have finished loading
firmware/psci: psci_checker: Park kthreads before stopping them
MIPS: lantiq: Fix bitfield masking
dmaengine: rcar-dmac: Reject zero-length slave DMA requests
clk: tegra210: fix PLLU and PLLU_OUT1
fs/adfs: super: fix use-after-free bug
clk: sprd: Add check for return value of sprd_clk_regmap_init()
btrfs: fix minimum number of chunk errors for DUP
btrfs: qgroup: Don't hold qgroup_ioctl_lock in btrfs_qgroup_inherit()
cifs: Fix a race condition with cifs_echo_request
ceph: fix improper use of smp_mb__before_atomic()
ceph: return -ERANGE if virtual xattr value didn't fit in buffer
ACPI: blacklist: fix clang warning for unused DMI table
scsi: zfcp: fix GCC compiler warning emitted with -Wmaybe-uninitialized
perf version: Fix segfault due to missing OPT_END()
x86: kvm: avoid constant-conversion warning
ACPI: fix false-positive -Wuninitialized warning
be2net: Signal that the device cannot transmit during reconfiguration
x86/apic: Silence -Wtype-limits compiler warnings
x86: math-emu: Hide clang warnings for 16-bit overflow
mm/cma.c: fail if fixed declaration can't be honored
lib/test_overflow.c: avoid tainting the kernel and fix wrap size
lib/test_string.c: avoid masking memset16/32/64 failures
coda: add error handling for fget
coda: fix build using bare-metal toolchain
uapi linux/coda_psdev.h: move upc_req definition from uapi to kernel side headers
drivers/rapidio/devices/rio_mport_cdev.c: NUL terminate some strings
ipc/mqueue.c: only perform resource calculation if user valid
mlxsw: spectrum_dcb: Configure DSCP map as the last rule is removed
xen/pv: Fix a boot up hang revealed by int3 self test
x86/kvm: Don't call kvm_spurious_fault() from .fixup
x86/paravirt: Fix callee-saved function ELF sizes
x86, boot: Remove multiple copy of static function sanitize_boot_params()
drm/nouveau: fix memory leak in nouveau_conn_reset()
kconfig: Clear "written" flag to avoid data loss
kbuild: initialize CLANG_FLAGS correctly in the top Makefile
Btrfs: fix incremental send failure after deduplication
Btrfs: fix race leading to fs corruption after transaction abort
mmc: dw_mmc: Fix occasional hang after tuning on eMMC
mmc: meson-mx-sdio: Fix misuse of GENMASK macro
gpiolib: fix incorrect IRQ requesting of an active-low lineevent
IB/hfi1: Fix Spectre v1 vulnerability
mtd: rawnand: micron: handle on-die "ECC-off" devices correctly
selinux: fix memory leak in policydb_init()
ALSA: hda: Fix 1-minute detection delay when i915 module is not available
mm: vmscan: check if mem cgroup is disabled or not before calling memcg slab shrinker
s390/dasd: fix endless loop after read unit address configuration
cgroup: kselftest: relax fs_spec checks
parisc: Fix build of compressed kernel even with debug enabled
drivers/perf: arm_pmu: Fix failure path in PM notifier
arm64: compat: Allow single-byte watchpoints on all addresses
arm64: cpufeature: Fix feature comparison for CTR_EL0.{CWG,ERG}
nbd: replace kill_bdev() with __invalidate_device() again
xen/swiotlb: fix condition for calling xen_destroy_contiguous_region()
IB/mlx5: Fix unreg_umr to ignore the mkey state
IB/mlx5: Use direct mkey destroy command upon UMR unreg failure
IB/mlx5: Move MRs to a kernel PD when freeing them to the MR cache
IB/mlx5: Fix clean_mr() to work in the expected order
IB/mlx5: Fix RSS Toeplitz setup to be aligned with the HW specification
IB/hfi1: Check for error on call to alloc_rsm_map_table
drm/i915/gvt: fix incorrect cache entry for guest page mapping
eeprom: at24: make spd world-readable again
ARC: enable uboot support unconditionally
objtool: Support GCC 9 cold subfunction naming scheme
gcc-9: properly declare the {pv,hv}clock_page storage
x86/vdso: Prevent segfaults due to hoisted vclock reads
scsi: mpt3sas: Use 63-bit DMA addressing on SAS35 HBA
x86/cpufeatures: Carve out CQM features retrieval
x86/cpufeatures: Combine word 11 and 12 into a new scattered features word
x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations
x86/speculation: Enable Spectre v1 swapgs mitigations
x86/entry/64: Use JMP instead of JMPQ
x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS
Documentation: Add swapgs description to the Spectre v1 documentation
Linux 4.19.65
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Iceeabdb164657e0a616db618e6aa8445d56b0dc1
[ Upstream commit 6e6de3dee5 ]
Microsoft HyperV disables the X86_FEATURE_SMCA bit on AMD systems, and
linux guests boot with repeated errors:
amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2)
amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2)
amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2)
amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2)
amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2)
amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2)
The warnings occur because the module code erroneously returns -EEXIST
for modules that have failed to load and are in the process of being
removed from the module list.
module amd64_edac_mod has a dependency on module edac_mce_amd. Using
modules.dep, systemd will load edac_mce_amd for every request of
amd64_edac_mod. When the edac_mce_amd module loads, the module has
state MODULE_STATE_UNFORMED and once the module load fails and the state
becomes MODULE_STATE_GOING. Another request for edac_mce_amd module
executes and add_unformed_module() will erroneously return -EEXIST even
though the previous instance of edac_mce_amd has MODULE_STATE_GOING.
Upon receiving -EEXIST, systemd attempts to load amd64_edac_mod, which
fails because of unknown symbols from edac_mce_amd.
add_unformed_module() must wait to return for any case other than
MODULE_STATE_LIVE to prevent a race between multiple loads of
dependent modules.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Barret Rhoden <brho@google.com>
Cc: David Arcari <darcari@redhat.com>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit a124692b69 ]
Custom trampolines can only be enabled if there is only a single ops
attached to it. If there's only a single callback registered to a function,
and the ops has a trampoline registered for it, then we can call the
trampoline directly. This is very useful for improving the performance of
ftrace and livepatch.
If more than one callback is registered to a function, the general
trampoline is used, and the custom trampoline is not restored back to the
direct call even if all the other callbacks were unregistered and we are
back to one callback for the function.
To fix this, set FTRACE_FL_TRAMP flag if rec count is decremented
to one, and the ops that left has a trampoline.
Testing After this patch :
insmod livepatch_unshare_files.ko
cat /sys/kernel/debug/tracing/enabled_functions
unshare_files (1) R I tramp: 0xffffffffc0000000(klp_ftrace_handler+0x0/0xa0) ->ftrace_ops_assist_func+0x0/0xf0
echo unshare_files > /sys/kernel/debug/tracing/set_ftrace_filter
echo function > /sys/kernel/debug/tracing/current_tracer
cat /sys/kernel/debug/tracing/enabled_functions
unshare_files (2) R I ->ftrace_ops_list_func+0x0/0x150
echo nop > /sys/kernel/debug/tracing/current_tracer
cat /sys/kernel/debug/tracing/enabled_functions
unshare_files (1) R I tramp: 0xffffffffc0000000(klp_ftrace_handler+0x0/0xa0) ->ftrace_ops_assist_func+0x0/0xf0
Link: http://lkml.kernel.org/r/1556969979-111047-1-git-send-email-cj.chengjian@huawei.com
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1GibIACgkQONu9yGCS
aT7z2hAAmv8AsH9IG43m7t6zLroJVswr/9594xk7yPBQgcY3/PW2aTFBCFbsdOL4
yXcj2PSwRiq9K6qAJULrvOvncR9fIILHqzWzyXnoaZ30lR/FxaaFmuHZX/5Ix1tB
e5EEE/EA49UAEjEDaMLq8g2IvibsReDxmSpnXyBJWoyRAdFIElVnMJ2+zvP/wRhF
NKzQj/bj/qecCbis2lUCaVWJFZ6+P/52UbD8lvIwqR3nk2TKsGDcLU6eY3yg4KrB
rEHl5T8KIPrkX3KNIEB8EcFREene+rdpZLLVe4fYwf+gOqfiFXSzZZvweauMkplq
ehlVHkykvQvlsVM2tjBD379z3C4aasZDuMVNMCbAy2FlruLeBQ7gEn77mCJB9VH5
/n/mlc2yizdoowtARCLWOUMfASpdSbqu2SQ7A/3kwG7l6GrpzKSIU2nQgm+41sUZ
QJVtZ3IYsPoYjnU4B3JZzgJnf3M9jcRz/3JegviqhSEbF1gaScJX0cqN8C1idN/v
ZAGCJK9S20/EEEsp5jn+bq2grUehvmD4TVDfot4P+5yRYyBIhMFpbM2RpjydOpwy
+x8D1Q34LYPFgZfQ0vF62vcSBhMBiJ/7j41rUeo44K+Lg00F3yCOyL6FxK6S8h6j
wsD0xLbllMrhV5KRYFizb3QbCHoHYiROIJk76uLvB+Tqq2Jg9VQ=
=qIi2
-----END PGP SIGNATURE-----
Merge 4.19.64 into android-4.19
Changes in 4.19.64
hv_sock: Add support for delayed close
vsock: correct removal of socket from the list
NFS: Fix dentry revalidation on NFSv4 lookup
NFS: Refactor nfs_lookup_revalidate()
NFSv4: Fix lookup revalidate of regular files
usb: dwc2: Disable all EP's on disconnect
usb: dwc2: Fix disable all EP's on disconnect
arm64: compat: Provide definition for COMPAT_SIGMINSTKSZ
binder: fix possible UAF when freeing buffer
ISDN: hfcsusb: checking idx of ep configuration
media: au0828: fix null dereference in error path
ath10k: Change the warning message string
media: cpia2_usb: first wake up, then free in disconnect
media: pvrusb2: use a different format for warnings
NFS: Cleanup if nfs_match_client is interrupted
media: radio-raremono: change devm_k*alloc to k*alloc
iommu/vt-d: Don't queue_iova() if there is no flush queue
iommu/iova: Fix compilation error with !CONFIG_IOMMU_IOVA
Bluetooth: hci_uart: check for missing tty operations
vhost: introduce vhost_exceeds_weight()
vhost_net: fix possible infinite loop
vhost: vsock: add weight support
vhost: scsi: add weight support
sched/fair: Don't free p->numa_faults with concurrent readers
sched/fair: Use RCU accessors consistently for ->numa_group
/proc/<pid>/cmdline: remove all the special cases
/proc/<pid>/cmdline: add back the setproctitle() special case
drivers/pps/pps.c: clear offset flags in PPS_SETPARAMS ioctl
Fix allyesconfig output.
ceph: hold i_ceph_lock when removing caps for freeing inode
block, scsi: Change the preempt-only flag into a counter
scsi: core: Avoid that a kernel warning appears during system resume
ip_tunnel: allow not to count pkts on tstats by setting skb's dev to NULL
Linux 4.19.64
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I3e9055b677bd8ad9d5070307fae0bc765d444e9d
commit cb361d8cde upstream.
The old code used RCU annotations and accessors inconsistently for
->numa_group, which can lead to use-after-frees and NULL dereferences.
Let all accesses to ->numa_group use proper RCU helpers to prevent such
issues.
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Fixes: 8c8a743c50 ("sched/numa: Use {cpu, pid} to create task groups for shared faults")
Link: https://lkml.kernel.org/r/20190716152047.14424-3-jannh@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 16d51a590a upstream.
When going through execve(), zero out the NUMA fault statistics instead of
freeing them.
During execve, the task is reachable through procfs and the scheduler. A
concurrent /proc/*/sched reader can read data from a freed ->numa_faults
allocation (confirmed by KASAN) and write it back to userspace.
I believe that it would also be possible for a use-after-free read to occur
through a race between a NUMA fault and execve(): task_numa_fault() can
lead to task_numa_compare(), which invokes task_weight() on the currently
running task of a different CPU.
Another way to fix this would be to make ->numa_faults RCU-managed or add
extra locking, but it seems easier to wipe the NUMA fault statistics on
execve.
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Fixes: 82727018b0 ("sched/numa: Call task_numa_free() from do_execve()")
Link: https://lkml.kernel.org/r/20190716152047.14424-1-jannh@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl1BJrAACgkQONu9yGCS
aT6MpRAAt2nuozm5Z/MshFAuGFzddAwPoYtkIPSy8BiPHYjf7x0+D5Ew4dz5OihS
ElfbA94hMOpvhhXlzBU3ZFJsWZIK78gzV6+LiHyb5R97Jdzj/zT4h40y0kxKw+pS
gghnZ6zx+pGSIXm/EsODW2gg98yTrmhBFpUhXpAGoC/71c1vxVlj3jHcuhd778YK
NRlj2tFWJGIBpmXrApo1Eg7qQRj4tzbOjthfgANWPq+EP68PgiOaxMDMMp7IUwju
KrXEFcXlk5aHvfaJ06FSBRBnn45XMyGXPYV/76HsaqkBNmg1r7o02dZKy/0S84Fn
YXoEFxHt6NFOJ52LiO95z7cQ0xeTSYygNCtYXJ70uyDMnrCPXCrKp7DRyka+vDPs
RCrcpB1QjcCb3xTL//SPkNWM3oZEW9CawpRFh9bmqbw/h7ZjaUEuwNIJnunISulu
2fvOjUmFWfUVIARiwKFVuIkXzgf3cSLYZTtiDFC5/yBpkGVNnXqyO7YtWZwnrMHq
L3DC3pOKuYXMa03KWGEzZoCZEXjtoRhRwCSgV7wbK5o90sZeRj/HC1zXEyDPPD/R
7A1rePTuwlAH3gHCJGhYkmYqULx62ZdvV6IC2N7xNxeTL1Y7OVNBT7ZUqexxY6WC
OG1vVxUKNJIvBLYmc6cmQATgR6XH5/B9H2p1YRBLuAQuqHk+jqo=
=ZQf3
-----END PGP SIGNATURE-----
Merge 4.19.63 into android-4.19
Changes in 4.19.63
hvsock: fix epollout hang from race condition
drm/panel: simple: Fix panel_simple_dsi_probe
iio: adc: stm32-dfsdm: manage the get_irq error case
iio: adc: stm32-dfsdm: missing error case during probe
staging: vt6656: use meaningful error code during buffer allocation
usb: core: hub: Disable hub-initiated U1/U2
tty: max310x: Fix invalid baudrate divisors calculator
pinctrl: rockchip: fix leaked of_node references
tty: serial: cpm_uart - fix init when SMC is relocated
drm/amd/display: Fill prescale_params->scale for RGB565
drm/amdgpu/sriov: Need to initialize the HDP_NONSURFACE_BAStE
drm/amd/display: Disable ABM before destroy ABM struct
drm/amdkfd: Fix a potential memory leak
drm/amdkfd: Fix sdma queue map issue
drm/edid: Fix a missing-check bug in drm_load_edid_firmware()
PCI: Return error if cannot probe VF
drm/bridge: tc358767: read display_props in get_modes()
drm/bridge: sii902x: pixel clock unit is 10kHz instead of 1kHz
gpu: host1x: Increase maximum DMA segment size
drm/crc-debugfs: User irqsafe spinlock in drm_crtc_add_crc_entry
drm/crc-debugfs: Also sprinkle irqrestore over early exits
memstick: Fix error cleanup path of memstick_init
tty/serial: digicolor: Fix digicolor-usart already registered warning
tty: serial: msm_serial: avoid system lockup condition
serial: 8250: Fix TX interrupt handling condition
drm/amd/display: Always allocate initial connector state state
drm/virtio: Add memory barriers for capset cache.
phy: renesas: rcar-gen2: Fix memory leak at error paths
drm/amd/display: fix compilation error
powerpc/pseries/mobility: prevent cpu hotplug during DT update
drm/rockchip: Properly adjust to a true clock in adjusted_mode
serial: imx: fix locking in set_termios()
tty: serial_core: Set port active bit in uart_port_activate
usb: gadget: Zero ffs_io_data
mmc: sdhci: sdhci-pci-o2micro: Check if controller supports 8-bit width
powerpc/pci/of: Fix OF flags parsing for 64bit BARs
drm/msm: Depopulate platform on probe failure
serial: mctrl_gpio: Check if GPIO property exisits before requesting it
PCI: sysfs: Ignore lockdep for remove attribute
i2c: stm32f7: fix the get_irq error cases
kbuild: Add -Werror=unknown-warning-option to CLANG_FLAGS
genksyms: Teach parser about 128-bit built-in types
PCI: xilinx-nwl: Fix Multi MSI data programming
iio: iio-utils: Fix possible incorrect mask calculation
powerpc/cacheflush: fix variable set but not used
powerpc/xmon: Fix disabling tracing while in xmon
recordmcount: Fix spurious mcount entries on powerpc
mfd: madera: Add missing of table registration
mfd: core: Set fwnode for created devices
mfd: arizona: Fix undefined behavior
mfd: hi655x-pmic: Fix missing return value check for devm_regmap_init_mmio_clk
mm/swap: fix release_pages() when releasing devmap pages
um: Silence lockdep complaint about mmap_sem
powerpc/4xx/uic: clear pending interrupt after irq type/pol change
RDMA/i40iw: Set queue pair state when being queried
serial: sh-sci: Terminate TX DMA during buffer flushing
serial: sh-sci: Fix TX DMA buffer flushing and workqueue races
IB/mlx5: Fixed reporting counters on 2nd port for Dual port RoCE
powerpc/mm: Handle page table allocation failures
IB/ipoib: Add child to parent list only if device initialized
arm64: assembler: Switch ESB-instruction with a vanilla nop if !ARM64_HAS_RAS
PCI: mobiveil: Fix PCI base address in MEM/IO outbound windows
PCI: mobiveil: Fix the Class Code field
kallsyms: exclude kasan local symbols on s390
PCI: mobiveil: Initialize Primary/Secondary/Subordinate bus numbers
PCI: mobiveil: Use the 1st inbound window for MEM inbound transactions
perf test mmap-thread-lookup: Initialize variable to suppress memory sanitizer warning
perf stat: Fix use-after-freed pointer detected by the smatch tool
perf top: Fix potential NULL pointer dereference detected by the smatch tool
perf session: Fix potential NULL pointer dereference found by the smatch tool
perf annotate: Fix dereferencing freed memory found by the smatch tool
perf hists browser: Fix potential NULL pointer dereference found by the smatch tool
RDMA/rxe: Fill in wc byte_len with IB_WC_RECV_RDMA_WITH_IMM
PCI: dwc: pci-dra7xx: Fix compilation when !CONFIG_GPIOLIB
powerpc/boot: add {get, put}_unaligned_be32 to xz_config.h
block: init flush rq ref count to 1
f2fs: avoid out-of-range memory access
mailbox: handle failed named mailbox channel request
dlm: check if workqueues are NULL before flushing/destroying
powerpc/eeh: Handle hugepages in ioremap space
block/bio-integrity: fix a memory leak bug
sh: prevent warnings when using iounmap
mm/kmemleak.c: fix check for softirq context
9p: pass the correct prototype to read_cache_page
mm/gup.c: mark undo_dev_pagemap as __maybe_unused
mm/gup.c: remove some BUG_ONs from get_gate_page()
memcg, fsnotify: no oom-kill for remote memcg charging
mm/mmu_notifier: use hlist_add_head_rcu()
proc: use down_read_killable mmap_sem for /proc/pid/smaps_rollup
proc: use down_read_killable mmap_sem for /proc/pid/pagemap
proc: use down_read_killable mmap_sem for /proc/pid/clear_refs
proc: use down_read_killable mmap_sem for /proc/pid/map_files
cxgb4: reduce kernel stack usage in cudbg_collect_mem_region()
proc: use down_read_killable mmap_sem for /proc/pid/maps
locking/lockdep: Fix lock used or unused stats error
mm: use down_read_killable for locking mmap_sem in access_remote_vm
locking/lockdep: Hide unused 'class' variable
usb: wusbcore: fix unbalanced get/put cluster_id
usb: pci-quirks: Correct AMD PLL quirk detection
btrfs: inode: Don't compress if NODATASUM or NODATACOW set
x86/sysfb_efi: Add quirks for some devices with swapped width and height
x86/speculation/mds: Apply more accurate check on hypervisor platform
binder: prevent transactions to context manager from its own process.
fpga-manager: altera-ps-spi: Fix build error
mei: me: add mule creek canyon (EHL) device ids
hpet: Fix division by zero in hpet_time_div()
ALSA: ac97: Fix double free of ac97_codec_device
ALSA: line6: Fix wrong altsetting for LINE6_PODHD500_1
ALSA: hda - Add a conexant codec entry to let mute led work
powerpc/xive: Fix loop exit-condition in xive_find_target_in_mask()
powerpc/tm: Fix oops on sigreturn on systems without TM
libnvdimm/bus: Stop holding nvdimm_bus_list_mutex over __nd_ioctl()
access: avoid the RCU grace period for the temporary subjective credentials
Linux 4.19.63
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ic31529aa6fd283d16d6bfb182187a9402a4db44f