Changes in 5.10.37
Bluetooth: verify AMP hci_chan before amp_destroy
bluetooth: eliminate the potential race condition when removing the HCI controller
net/nfc: fix use-after-free llcp_sock_bind/connect
io_uring: truncate lengths larger than MAX_RW_COUNT on provide buffers
Revert "USB: cdc-acm: fix rounding error in TIOCSSERIAL"
usb: roles: Call try_module_get() from usb_role_switch_find_by_fwnode()
tty: moxa: fix TIOCSSERIAL jiffies conversions
tty: amiserial: fix TIOCSSERIAL permission check
USB: serial: usb_wwan: fix TIOCSSERIAL jiffies conversions
staging: greybus: uart: fix TIOCSSERIAL jiffies conversions
USB: serial: ti_usb_3410_5052: fix TIOCSSERIAL permission check
staging: fwserial: fix TIOCSSERIAL jiffies conversions
tty: moxa: fix TIOCSSERIAL permission check
staging: fwserial: fix TIOCSSERIAL permission check
drm: bridge: fix LONTIUM use of mipi_dsi_() functions
usb: typec: tcpm: Address incorrect values of tcpm psy for fixed supply
usb: typec: tcpm: Address incorrect values of tcpm psy for pps supply
usb: typec: tcpm: update power supply once partner accepts
usb: xhci-mtk: remove or operator for setting schedule parameters
usb: xhci-mtk: improve bandwidth scheduling with TT
ASoC: samsung: tm2_wm5110: check of of_parse return value
ASoC: Intel: kbl_da7219_max98927: Fix kabylake_ssp_fixup function
ASoC: tlv320aic32x4: Register clocks before registering component
ASoC: tlv320aic32x4: Increase maximum register in regmap
MIPS: pci-mt7620: fix PLL lock check
MIPS: pci-rt2880: fix slot 0 configuration
FDDI: defxx: Bail out gracefully with unassigned PCI resource for CSR
PCI: Allow VPD access for QLogic ISP2722
KVM: x86: Defer the MMU unload to the normal path on an global INVPCID
PCI: xgene: Fix cfg resource mapping
PCI: keystone: Let AM65 use the pci_ops defined in pcie-designware-host.c
PM / devfreq: Unlock mutex and free devfreq struct in error path
soc/tegra: regulators: Fix locking up when voltage-spread is out of range
iio: inv_mpu6050: Fully validate gyro and accel scale writes
iio:accel:adis16201: Fix wrong axis assignment that prevents loading
iio:adc:ad7476: Fix remove handling
sc16is7xx: Defer probe if device read fails
phy: cadence: Sierra: Fix PHY power_on sequence
misc: lis3lv02d: Fix false-positive WARN on various HP models
phy: ti: j721e-wiz: Invoke wiz_init() before of_platform_device_create()
misc: vmw_vmci: explicitly initialize vmci_notify_bm_set_msg struct
misc: vmw_vmci: explicitly initialize vmci_datagram payload
selinux: add proper NULL termination to the secclass_map permissions
x86, sched: Treat Intel SNC topology as default, COD as exception
async_xor: increase src_offs when dropping destination page
md/bitmap: wait for external bitmap writes to complete during tear down
md-cluster: fix use-after-free issue when removing rdev
md: split mddev_find
md: factor out a mddev_find_locked helper from mddev_find
md: md_open returns -EBUSY when entering racing area
md: Fix missing unused status line of /proc/mdstat
mt76: mt7615: use ieee80211_free_txskb() in mt7615_tx_token_put()
ipw2x00: potential buffer overflow in libipw_wx_set_encodeext()
cfg80211: scan: drop entry from hidden_list on overflow
rtw88: Fix array overrun in rtw_get_tx_power_params()
mt76: fix potential DMA mapping leak
FDDI: defxx: Make MMIO the configuration default except for EISA
drm/i915/gvt: Fix virtual display setup for BXT/APL
drm/i915/gvt: Fix vfio_edid issue for BXT/APL
drm/qxl: use ttm bo priorities
drm/panfrost: Clear MMU irqs before handling the fault
drm/panfrost: Don't try to map pages that are already mapped
drm/radeon: fix copy of uninitialized variable back to userspace
drm/dp_mst: Revise broadcast msg lct & lcr
drm/dp_mst: Set CLEAR_PAYLOAD_ID_TABLE as broadcast
drm: bridge/panel: Cleanup connector on bridge detach
drm/amd/display: Reject non-zero src_y and src_x for video planes
drm/amdgpu: fix concurrent VM flushes on Vega/Navi v2
ALSA: hda/realtek: Re-order ALC882 Acer quirk table entries
ALSA: hda/realtek: Re-order ALC882 Sony quirk table entries
ALSA: hda/realtek: Re-order ALC882 Clevo quirk table entries
ALSA: hda/realtek: Re-order ALC269 HP quirk table entries
ALSA: hda/realtek: Re-order ALC269 Acer quirk table entries
ALSA: hda/realtek: Re-order ALC269 Dell quirk table entries
ALSA: hda/realtek: Re-order ALC269 ASUS quirk table entries
ALSA: hda/realtek: Re-order ALC269 Sony quirk table entries
ALSA: hda/realtek: Re-order ALC269 Lenovo quirk table entries
ALSA: hda/realtek: Re-order remaining ALC269 quirk table entries
ALSA: hda/realtek: Re-order ALC662 quirk table entries
ALSA: hda/realtek: Remove redundant entry for ALC861 Haier/Uniwill devices
ALSA: hda/realtek: ALC285 Thinkpad jack pin quirk is unreachable
ALSA: hda/realtek: Fix speaker amp on HP Envy AiO 32
KVM: s390: VSIE: correctly handle MVPG when in VSIE
KVM: s390: split kvm_s390_logical_to_effective
KVM: s390: fix guarded storage control register handling
s390: fix detection of vector enhancements facility 1 vs. vector packed decimal facility
KVM: s390: VSIE: fix MVPG handling for prefixing and MSO
KVM: s390: split kvm_s390_real_to_abs
KVM: s390: extend kvm_s390_shadow_fault to return entry pointer
KVM: x86/mmu: Alloc page for PDPTEs when shadowing 32-bit NPT with 64-bit
KVM: x86: Remove emulator's broken checks on CR0/CR3/CR4 loads
KVM: nSVM: Set the shadow root level to the TDP level for nested NPT
KVM: SVM: Don't strip the C-bit from CR2 on #PF interception
KVM: SVM: Do not allow SEV/SEV-ES initialization after vCPUs are created
KVM: SVM: Inject #GP on guest MSR_TSC_AUX accesses if RDTSCP unsupported
KVM: nVMX: Defer the MMU reload to the normal path on an EPTP switch
KVM: nVMX: Truncate bits 63:32 of VMCS field on nested check in !64-bit
KVM: nVMX: Truncate base/index GPR value on address calc in !64-bit
KVM: arm/arm64: Fix KVM_VGIC_V3_ADDR_TYPE_REDIST read
KVM: Destroy I/O bus devices on unregister failure _after_ sync'ing SRCU
KVM: Stop looking for coalesced MMIO zones if the bus is destroyed
KVM: arm64: Fully zero the vcpu state on reset
KVM: arm64: Fix KVM_VGIC_V3_ADDR_TYPE_REDIST_REGION read
Revert "drivers/net/wan/hdlc_fr: Fix a double free in pvc_xmit"
Revert "i3c master: fix missing destroy_workqueue() on error in i3c_master_register"
ovl: fix missing revert_creds() on error path
Revert "drm/qxl: do not run release if qxl failed to init"
usb: gadget: pch_udc: Revert d3cb25a121 completely
Revert "tools/power turbostat: adjust for temperature offset"
firmware: xilinx: Fix dereferencing freed memory
firmware: xilinx: Add a blank line after function declaration
firmware: xilinx: Remove zynqmp_pm_get_eemi_ops() in IS_REACHABLE(CONFIG_ZYNQMP_FIRMWARE)
fpga: fpga-mgr: xilinx-spi: fix error messages on -EPROBE_DEFER
crypto: sun8i-ss - fix result memory leak on error path
memory: gpmc: fix out of bounds read and dereference on gpmc_cs[]
ARM: dts: exynos: correct fuel gauge interrupt trigger level on GT-I9100
ARM: dts: exynos: correct fuel gauge interrupt trigger level on Midas family
ARM: dts: exynos: correct MUIC interrupt trigger level on Midas family
ARM: dts: exynos: correct PMIC interrupt trigger level on Midas family
ARM: dts: exynos: correct PMIC interrupt trigger level on Odroid X/U3 family
ARM: dts: exynos: correct PMIC interrupt trigger level on SMDK5250
ARM: dts: exynos: correct PMIC interrupt trigger level on Snow
ARM: dts: s5pv210: correct fuel gauge interrupt trigger level on Fascinate family
ARM: dts: renesas: Add mmc aliases into R-Car Gen2 board dts files
arm64: dts: renesas: Add mmc aliases into board dts files
x86/platform/uv: Set section block size for hubless architectures
serial: stm32: fix code cleaning warnings and checks
serial: stm32: add "_usart" prefix in functions name
serial: stm32: fix probe and remove order for dma
serial: stm32: Use of_device_get_match_data()
serial: stm32: fix startup by enabling usart for reception
serial: stm32: fix incorrect characters on console
serial: stm32: fix TX and RX FIFO thresholds
serial: stm32: fix a deadlock condition with wakeup event
serial: stm32: fix wake-up flag handling
serial: stm32: fix a deadlock in set_termios
serial: stm32: fix tx dma completion, release channel
serial: stm32: call stm32_transmit_chars locked
serial: stm32: fix FIFO flush in startup and set_termios
serial: stm32: add FIFO flush when port is closed
serial: stm32: fix tx_empty condition
usb: typec: tcpci: Check ROLE_CONTROL while interpreting CC_STATUS
usb: typec: tps6598x: Fix return value check in tps6598x_probe()
usb: typec: stusb160x: fix return value check in stusb160x_probe()
regmap: set debugfs_name to NULL after it is freed
spi: rockchip: avoid objtool warning
mtd: rawnand: fsmc: Fix error code in fsmc_nand_probe()
mtd: rawnand: brcmnand: fix OOB R/W with Hamming ECC
mtd: Handle possible -EPROBE_DEFER from parse_mtd_partitions()
mtd: rawnand: qcom: Return actual error code instead of -ENODEV
mtd: don't lock when recursively deleting partitions
mtd: maps: fix error return code of physmap_flash_remove()
ARM: dts: stm32: fix usart 2 & 3 pinconf to wake up with flow control
arm64: dts: qcom: sm8250: Fix level triggered PMU interrupt polarity
arm64: dts: qcom: sm8250: Fix timer interrupt to specify EL2 physical timer
arm64: dts: qcom: sdm845: fix number of pins in 'gpio-ranges'
arm64: dts: qcom: sm8150: fix number of pins in 'gpio-ranges'
arm64: dts: qcom: sm8250: fix number of pins in 'gpio-ranges'
arm64: dts: qcom: db845c: fix correct powerdown pin for WSA881x
crypto: sun8i-ss - Fix memory leak of object d when dma_iv fails to map
spi: stm32: drop devres version of spi_register_master
regulator: bd9576: Fix return from bd957x_probe()
arm64: dts: renesas: r8a77980: Fix vin4-7 endpoint binding
spi: stm32: Fix use-after-free on unbind
x86/microcode: Check for offline CPUs before requesting new microcode
devtmpfs: fix placement of complete() call
usb: gadget: pch_udc: Replace cpu_to_le32() by lower_32_bits()
usb: gadget: pch_udc: Check if driver is present before calling ->setup()
usb: gadget: pch_udc: Check for DMA mapping error
usb: gadget: pch_udc: Initialize device pointer before use
usb: gadget: pch_udc: Provide a GPIO line used on Intel Minnowboard (v1)
crypto: ccp - fix command queuing to TEE ring buffer
crypto: qat - don't release uninitialized resources
crypto: qat - ADF_STATUS_PF_RUNNING should be set after adf_dev_init
fotg210-udc: Fix DMA on EP0 for length > max packet size
fotg210-udc: Fix EP0 IN requests bigger than two packets
fotg210-udc: Remove a dubious condition leading to fotg210_done
fotg210-udc: Mask GRP2 interrupts we don't handle
fotg210-udc: Don't DMA more than the buffer can take
fotg210-udc: Complete OUT requests on short packets
usb: gadget: s3c: Fix incorrect resources releasing
usb: gadget: s3c: Fix the error handling path in 's3c2410_udc_probe()'
dt-bindings: serial: stm32: Use 'type: object' instead of false for 'additionalProperties'
mtd: require write permissions for locking and badblock ioctls
arm64: dts: renesas: r8a779a0: Fix PMU interrupt
bus: qcom: Put child node before return
soundwire: bus: Fix device found flag correctly
phy: ti: j721e-wiz: Delete "clk_div_sel" clk provider during cleanup
phy: marvell: ARMADA375_USBCLUSTER_PHY should not default to y, unconditionally
arm64: dts: mediatek: fix reset GPIO level on pumpkin
NFSD: Fix sparse warning in nfs4proc.c
NFSv4.2: fix copy stateid copying for the async copy
crypto: poly1305 - fix poly1305_core_setkey() declaration
crypto: qat - fix error path in adf_isr_resource_alloc()
usb: gadget: aspeed: fix dma map failure
USB: gadget: udc: fix wrong pointer passed to IS_ERR() and PTR_ERR()
drivers: nvmem: Fix voltage settings for QTI qfprom-efuse
driver core: platform: Declare early_platform_cleanup() prototype
memory: pl353: fix mask of ECC page_size config register
soundwire: stream: fix memory leak in stream config error path
m68k: mvme147,mvme16x: Don't wipe PCC timer config bits
firmware: qcom_scm: Make __qcom_scm_is_call_available() return bool
firmware: qcom_scm: Reduce locking section for __get_convention()
firmware: qcom_scm: Workaround lack of "is available" call on SC7180
iio: adc: Kconfig: make AD9467 depend on ADI_AXI_ADC symbol
mtd: rawnand: gpmi: Fix a double free in gpmi_nand_init
irqchip/gic-v3: Fix OF_BAD_ADDR error handling
staging: comedi: tests: ni_routes_test: Fix compilation error
staging: rtl8192u: Fix potential infinite loop
staging: fwserial: fix TIOCSSERIAL implementation
staging: fwserial: fix TIOCGSERIAL implementation
staging: greybus: uart: fix unprivileged TIOCCSERIAL
soc: qcom: pdr: Fix error return code in pdr_register_listener
PM / devfreq: Use more accurate returned new_freq as resume_freq
clocksource/drivers/timer-ti-dm: Fix posted mode status check order
clocksource/drivers/timer-ti-dm: Add missing set_state_oneshot_stopped
clocksource/drivers/ingenic_ost: Fix return value check in ingenic_ost_probe()
spi: Fix use-after-free with devm_spi_alloc_*
spi: fsl: add missing iounmap() on error in of_fsl_spi_probe()
soc: qcom: mdt_loader: Validate that p_filesz < p_memsz
soc: qcom: mdt_loader: Detect truncated read of segments
PM: runtime: Replace inline function pm_runtime_callbacks_present()
cpuidle: Fix ARM_QCOM_SPM_CPUIDLE configuration
ACPI: CPPC: Replace cppc_attr with kobj_attribute
crypto: allwinner - add missing CRYPTO_ prefix
crypto: sun8i-ss - Fix memory leak of pad
crypto: sa2ul - Fix memory leak of rxd
crypto: qat - Fix a double free in adf_create_ring
cpufreq: armada-37xx: Fix setting TBG parent for load levels
clk: mvebu: armada-37xx-periph: remove .set_parent method for CPU PM clock
cpufreq: armada-37xx: Fix the AVS value for load L1
clk: mvebu: armada-37xx-periph: Fix switching CPU freq from 250 Mhz to 1 GHz
clk: mvebu: armada-37xx-periph: Fix workaround for switching from L1 to L0
cpufreq: armada-37xx: Fix driver cleanup when registration failed
cpufreq: armada-37xx: Fix determining base CPU frequency
spi: spi-zynqmp-gqspi: use wait_for_completion_timeout to make zynqmp_qspi_exec_op not interruptible
spi: spi-zynqmp-gqspi: add mutex locking for exec_op
spi: spi-zynqmp-gqspi: transmit dummy circles by using the controller's internal functionality
spi: spi-zynqmp-gqspi: fix incorrect operating mode in zynqmp_qspi_read_op
spi: fsl-lpspi: Fix PM reference leak in lpspi_prepare_xfer_hardware()
usb: gadget: r8a66597: Add missing null check on return from platform_get_resource
USB: cdc-acm: fix unprivileged TIOCCSERIAL
USB: cdc-acm: fix TIOCGSERIAL implementation
tty: actually undefine superseded ASYNC flags
tty: fix return value for unsupported ioctls
tty: Remove dead termiox code
tty: fix return value for unsupported termiox ioctls
serial: core: return early on unsupported ioctls
firmware: qcom-scm: Fix QCOM_SCM configuration
node: fix device cleanups in error handling code
crypto: chelsio - Read rxchannel-id from firmware
usbip: vudc: fix missing unlock on error in usbip_sockfd_store()
m68k: Add missing mmap_read_lock() to sys_cacheflush()
spi: spi-zynqmp-gqspi: Fix missing unlock on error in zynqmp_qspi_exec_op()
memory: renesas-rpc-if: fix possible NULL pointer dereference of resource
memory: samsung: exynos5422-dmc: handle clk_set_parent() failure
security: keys: trusted: fix TPM2 authorizations
platform/x86: pmc_atom: Match all Beckhoff Automation baytrail boards with critclk_systems DMI table
ARM: dts: aspeed: Rainier: Fix humidity sensor bus address
Drivers: hv: vmbus: Use after free in __vmbus_open()
spi: spi-zynqmp-gqspi: fix clk_enable/disable imbalance issue
spi: spi-zynqmp-gqspi: fix hang issue when suspend/resume
spi: spi-zynqmp-gqspi: fix use-after-free in zynqmp_qspi_exec_op
spi: spi-zynqmp-gqspi: return -ENOMEM if dma_map_single fails
x86/platform/uv: Fix !KEXEC build failure
hwmon: (pmbus/pxe1610) don't bail out when not all pages are active
Drivers: hv: vmbus: Increase wait time for VMbus unload
PM: hibernate: x86: Use crc32 instead of md5 for hibernation e820 integrity check
usb: dwc2: Fix host mode hibernation exit with remote wakeup flow.
usb: dwc2: Fix hibernation between host and device modes.
ttyprintk: Add TTY hangup callback.
serial: omap: don't disable rs485 if rts gpio is missing
serial: omap: fix rs485 half-duplex filtering
xen-blkback: fix compatibility bug with single page rings
soc: aspeed: fix a ternary sign expansion bug
drm/tilcdc: send vblank event when disabling crtc
drm/stm: Fix bus_flags handling
drm/amd/display: Fix off by one in hdmi_14_process_transaction()
drm/mcde/panel: Inverse misunderstood flag
sched/fair: Fix shift-out-of-bounds in load_balance()
afs: Fix updating of i_mode due to 3rd party change
rcu: Remove spurious instrumentation_end() in rcu_nmi_enter()
media: vivid: fix assignment of dev->fbuf_out_flags
media: saa7134: use sg_dma_len when building pgtable
media: saa7146: use sg_dma_len when building pgtable
media: omap4iss: return error code when omap4iss_get() failed
media: rkisp1: rsz: crash fix when setting src format
media: aspeed: fix clock handling logic
drm/probe-helper: Check epoch counter in output_poll_execute()
media: venus: core: Fix some resource leaks in the error path of 'venus_probe()'
media: platform: sunxi: sun6i-csi: fix error return code of sun6i_video_start_streaming()
media: m88ds3103: fix return value check in m88ds3103_probe()
media: docs: Fix data organization of MEDIA_BUS_FMT_RGB101010_1X30
media: [next] staging: media: atomisp: fix memory leak of object flash
media: atomisp: Fixed error handling path
media: m88rs6000t: avoid potential out-of-bounds reads on arrays
media: atomisp: Fix use after free in atomisp_alloc_css_stat_bufs()
drm/amdkfd: fix build error with AMD_IOMMU_V2=m
of: overlay: fix for_each_child.cocci warnings
x86/kprobes: Fix to check non boostable prefixes correctly
selftests: fix prepending $(OUTPUT) to $(TEST_PROGS)
pata_arasan_cf: fix IRQ check
pata_ipx4xx_cf: fix IRQ check
sata_mv: add IRQ checks
ata: libahci_platform: fix IRQ check
seccomp: Fix CONFIG tests for Seccomp_filters
nvme-tcp: block BH in sk state_change sk callback
nvmet-tcp: fix incorrect locking in state_change sk callback
clk: imx: Fix reparenting of UARTs not associated with stdout
power: supply: bq25980: Move props from battery node
nvme: retrigger ANA log update if group descriptor isn't found
media: i2c: imx219: Move out locking/unlocking of vflip and hflip controls from imx219_set_stream
media: i2c: imx219: Balance runtime PM use-count
media: v4l2-ctrls.c: fix race condition in hdl->requests list
vfio/fsl-mc: Re-order vfio_fsl_mc_probe()
vfio/pci: Move VGA and VF initialization to functions
vfio/pci: Re-order vfio_pci_probe()
vfio/mdev: Do not allow a mdev_type to have a NULL parent pointer
clk: zynqmp: move zynqmp_pll_set_mode out of round_rate callback
clk: zynqmp: pll: add set_pll_mode to check condition in zynqmp_pll_enable
drm: xlnx: zynqmp: fix a memset in zynqmp_dp_train()
clk: qcom: a53-pll: Add missing MODULE_DEVICE_TABLE
clk: qcom: apss-ipq-pll: Add missing MODULE_DEVICE_TABLE
drm/amd/display: use GFP_ATOMIC in dcn20_resource_construct
drm/radeon: Fix a missing check bug in radeon_dp_mst_detect()
clk: uniphier: Fix potential infinite loop
scsi: pm80xx: Increase timeout for pm80xx mpi_uninit_check()
scsi: pm80xx: Fix potential infinite loop
scsi: ufs: ufshcd-pltfrm: Fix deferred probing
scsi: hisi_sas: Fix IRQ checks
scsi: jazz_esp: Add IRQ check
scsi: sun3x_esp: Add IRQ check
scsi: sni_53c710: Add IRQ check
scsi: ibmvfc: Fix invalid state machine BUG_ON()
mailbox: sprd: Introduce refcnt when clients requests/free channels
mfd: stm32-timers: Avoid clearing auto reload register
nvmet-tcp: fix a segmentation fault during io parsing error
nvme-pci: don't simple map sgl when sgls are disabled
media: cedrus: Fix H265 status definitions
HSI: core: fix resource leaks in hsi_add_client_from_dt()
x86/events/amd/iommu: Fix sysfs type mismatch
perf/amd/uncore: Fix sysfs type mismatch
io_uring: fix overflows checks in provide buffers
sched/debug: Fix cgroup_path[] serialization
drivers/block/null_blk/main: Fix a double free in null_init.
xsk: Respect device's headroom and tailroom on generic xmit path
HID: plantronics: Workaround for double volume key presses
perf symbols: Fix dso__fprintf_symbols_by_name() to return the number of printed chars
ASoC: Intel: boards: sof-wm8804: add check for PLL setting
ASoC: Intel: Skylake: Compile when any configuration is selected
RDMA/mlx5: Fix mlx5 rates to IB rates map
wilc1000: write value to WILC_INTR2_ENABLE register
KVM: x86/mmu: Retry page faults that hit an invalid memslot
Bluetooth: avoid deadlock between hci_dev->lock and socket lock
net: lapbether: Prevent racing when checking whether the netif is running
libbpf: Add explicit padding to bpf_xdp_set_link_opts
bpftool: Fix maybe-uninitialized warnings
iommu: Check dev->iommu in iommu_dev_xxx functions
iommu/vt-d: Reject unsupported page request modes
selftests/bpf: Re-generate vmlinux.h and BPF skeletons if bpftool changed
libbpf: Add explicit padding to btf_dump_emit_type_decl_opts
powerpc/fadump: Mark fadump_calculate_reserve_size as __init
powerpc/prom: Mark identical_pvr_fixup as __init
MIPS: fix local_irq_{disable,enable} in asmmacro.h
ima: Fix the error code for restoring the PCR value
inet: use bigger hash table for IP ID generation
pinctrl: pinctrl-single: remove unused parameter
pinctrl: pinctrl-single: fix pcs_pin_dbg_show() when bits_per_mux is not zero
MIPS: loongson64: fix bug when PAGE_SIZE > 16KB
ASoC: wm8960: Remove bitclk relax condition in wm8960_configure_sysclk
iommu/arm-smmu-v3: add bit field SFM into GERROR_ERR_MASK
RDMA/mlx5: Fix drop packet rule in egress table
IB/isert: Fix a use after free in isert_connect_request
powerpc: Fix HAVE_HARDLOCKUP_DETECTOR_ARCH build configuration
MIPS/bpf: Enable bpf_probe_read{, str}() on MIPS again
gpio: guard gpiochip_irqchip_add_domain() with GPIOLIB_IRQCHIP
ALSA: core: remove redundant spin_lock pair in snd_card_disconnect
net: phy: lan87xx: fix access to wrong register of LAN87xx
udp: never accept GSO_FRAGLIST packets
powerpc/pseries: Only register vio drivers if vio bus exists
net/tipc: fix missing destroy_workqueue() on error in tipc_crypto_start()
bug: Remove redundant condition check in report_bug
RDMA/core: Fix corrupted SL on passive side
nfc: pn533: prevent potential memory corruption
net: hns3: Limiting the scope of vector_ring_chain variable
mips: bmips: fix syscon-reboot nodes
iommu/vt-d: Don't set then clear private data in prq_event_thread()
iommu: Fix a boundary issue to avoid performance drop
iommu/vt-d: Report right snoop capability when using FL for IOVA
iommu/vt-d: Report the right page fault address
iommu/vt-d: Preset Access/Dirty bits for IOVA over FL
iommu/vt-d: Remove WO permissions on second-level paging entries
iommu/vt-d: Invalidate PASID cache when root/context entry changed
ALSA: usb-audio: Add error checks for usb_driver_claim_interface() calls
HID: lenovo: Use brightness_set_blocking callback for setting LEDs brightness
HID: lenovo: Fix lenovo_led_set_tp10ubkbd() error handling
HID: lenovo: Check hid_get_drvdata() returns non NULL in lenovo_event()
HID: lenovo: Map mic-mute button to KEY_F20 instead of KEY_MICMUTE
KVM: arm64: Initialize VCPU mdcr_el2 before loading it
ASoC: simple-card: fix possible uninitialized single_cpu local variable
liquidio: Fix unintented sign extension of a left shift of a u16
IB/hfi1: Use kzalloc() for mmu_rb_handler allocation
powerpc/64s: Fix pte update for kernel memory on radix
powerpc/perf: Fix PMU constraint check for EBB events
powerpc: iommu: fix build when neither PCI or IBMVIO is set
mac80211: bail out if cipher schemes are invalid
perf vendor events amd: Fix broken L2 Cache Hits from L2 HWPF metric
xfs: fix return of uninitialized value in variable error
rtw88: Fix an error code in rtw_debugfs_set_rsvd_page()
mt7601u: fix always true expression
mt76: mt7615: fix tx skb dma unmap
mt76: mt7915: fix tx skb dma unmap
mt76: mt7915: fix aggr len debugfs node
mt76: mt7615: fix mib stats counter reporting to mac80211
mt76: mt7915: fix mib stats counter reporting to mac80211
mt76: mt7663s: make all of packets 4-bytes aligned in sdio tx aggregation
mt76: mt7663s: fix the possible device hang in high traffic
KVM: PPC: Book3S HV P9: Restore host CTRL SPR after guest exit
ovl: invalidate readdir cache on changes to dir with origin
RDMA/qedr: Fix error return code in qedr_iw_connect()
IB/hfi1: Fix error return code in parse_platform_config()
RDMA/bnxt_re: Fix error return code in bnxt_qplib_cq_process_terminal()
cxgb4: Fix unintentional sign extension issues
net: thunderx: Fix unintentional sign extension issue
RDMA/srpt: Fix error return code in srpt_cm_req_recv()
RDMA/rtrs-clt: destroy sysfs after removing session from active list
i2c: cadence: fix reference leak when pm_runtime_get_sync fails
i2c: img-scb: fix reference leak when pm_runtime_get_sync fails
i2c: imx-lpi2c: fix reference leak when pm_runtime_get_sync fails
i2c: imx: fix reference leak when pm_runtime_get_sync fails
i2c: omap: fix reference leak when pm_runtime_get_sync fails
i2c: sprd: fix reference leak when pm_runtime_get_sync fails
i2c: stm32f7: fix reference leak when pm_runtime_get_sync fails
i2c: xiic: fix reference leak when pm_runtime_get_sync fails
i2c: cadence: add IRQ check
i2c: emev2: add IRQ check
i2c: jz4780: add IRQ check
i2c: mlxbf: add IRQ check
i2c: rcar: make sure irq is not threaded on Gen2 and earlier
i2c: rcar: protect against supurious interrupts on V3U
i2c: rcar: add IRQ check
i2c: sh7760: add IRQ check
powerpc/xive: Drop check on irq_data in xive_core_debug_show()
powerpc/xive: Fix xmon command "dxi"
ASoC: ak5558: correct reset polarity
net/mlx5: Fix bit-wise and with zero
net/packet: make packet_fanout.arr size configurable up to 64K
net/packet: remove data races in fanout operations
drm/i915/gvt: Fix error code in intel_gvt_init_device()
iommu/amd: Put newline after closing bracket in warning
perf beauty: Fix fsconfig generator
drm/amd/pm: fix error code in smu_set_power_limit()
MIPS: pci-legacy: stop using of_pci_range_to_resource
powerpc/pseries: extract host bridge from pci_bus prior to bus removal
powerpc/smp: Reintroduce cpu_core_mask
KVM: x86: dump_vmcs should not assume GUEST_IA32_EFER is valid
rtlwifi: 8821ae: upgrade PHY and RF parameters
wlcore: fix overlapping snprintf arguments in debugfs
i2c: sh7760: fix IRQ error path
i2c: mediatek: Fix wrong dma sync flag
mwl8k: Fix a double Free in mwl8k_probe_hw
netfilter: nft_payload: fix C-VLAN offload support
netfilter: nftables_offload: VLAN id needs host byteorder in flow dissector
netfilter: nftables_offload: special ethertype handling for VLAN
vsock/vmci: log once the failed queue pair allocation
libbpf: Initialize the bpf_seq_printf parameters array field by field
net: ethernet: ixp4xx: Set the DMA masks explicitly
gro: fix napi_gro_frags() Fast GRO breakage due to IP alignment check
RDMA/cxgb4: add missing qpid increment
RDMA/i40iw: Fix error unwinding when i40iw_hmc_sd_one fails
ALSA: usb: midi: don't return -ENOMEM when usb_urb_ep_type_check fails
sfc: ef10: fix TX queue lookup in TX event handling
vsock/virtio: free queued packets when closing socket
net: marvell: prestera: fix port event handling on init
net: davinci_emac: Fix incorrect masking of tx and rx error channel
mt76: mt7615: fix memleak when mt7615_unregister_device()
crypto: ccp: Detect and reject "invalid" addresses destined for PSP
nfp: devlink: initialize the devlink port attribute "lanes"
net: stmmac: fix TSO and TBS feature enabling during driver open
net: renesas: ravb: Fix a stuck issue when a lot of frames are received
net: phy: intel-xway: enable integrated led functions
RDMA/rxe: Fix a bug in rxe_fill_ip_info()
RDMA/core: Add CM to restrack after successful attachment to a device
powerpc/64: Fix the definition of the fixmap area
ath9k: Fix error check in ath9k_hw_read_revisions() for PCI devices
ath10k: Fix a use after free in ath10k_htc_send_bundle
ath10k: Fix ath10k_wmi_tlv_op_pull_peer_stats_info() unlock without lock
wlcore: Fix buffer overrun by snprintf due to incorrect buffer size
powerpc/perf: Fix the threshold event selection for memory events in power10
powerpc/52xx: Fix an invalid ASM expression ('addi' used instead of 'add')
net: phy: marvell: fix m88e1011_set_downshift
net: phy: marvell: fix m88e1111_set_downshift
net: enetc: fix link error again
bnxt_en: fix ternary sign extension bug in bnxt_show_temp()
ARM: dts: uniphier: Change phy-mode to RGMII-ID to enable delay pins for RTL8211E
arm64: dts: uniphier: Change phy-mode to RGMII-ID to enable delay pins for RTL8211E
net: geneve: modify IP header check in geneve6_xmit_skb and geneve_xmit_skb
selftests: net: mirror_gre_vlan_bridge_1q: Make an FDB entry static
selftests: mlxsw: Remove a redundant if statement in tc_flower_scale test
bnxt_en: Fix RX consumer index logic in the error path.
KVM: VMX: Intercept FS/GS_BASE MSR accesses for 32-bit KVM
net:emac/emac-mac: Fix a use after free in emac_mac_tx_buf_send
selftests/bpf: Fix BPF_CORE_READ_BITFIELD() macro
selftests/bpf: Fix field existence CO-RE reloc tests
selftests/bpf: Fix core_reloc test runner
bpf: Fix propagation of 32 bit unsigned bounds from 64 bit bounds
RDMA/siw: Fix a use after free in siw_alloc_mr
RDMA/bnxt_re: Fix a double free in bnxt_qplib_alloc_res
net: bridge: mcast: fix broken length + header check for MRDv6 Adv.
net:nfc:digital: Fix a double free in digital_tg_recv_dep_req
perf tools: Change fields type in perf_record_time_conv
perf jit: Let convert_timestamp() to be backwards-compatible
perf session: Add swap operation for event TIME_CONV
ia64: fix EFI_DEBUG build
kfifo: fix ternary sign extension bugs
mm/sl?b.c: remove ctor argument from kmem_cache_flags
mm: memcontrol: slab: fix obtain a reference to a freeing memcg
mm/sparse: add the missing sparse_buffer_fini() in error branch
mm/memory-failure: unnecessary amount of unmapping
afs: Fix speculative status fetches
bpf: Fix alu32 const subreg bound tracking on bitwise operations
bpf, ringbuf: Deny reserve of buffers larger than ringbuf
bpf: Prevent writable memory-mapping of read-only ringbuf pages
arm64: Remove arm64_dma32_phys_limit and its uses
net: Only allow init netns to set default tcp cong to a restricted algo
smp: Fix smp_call_function_single_async prototype
Revert "net/sctp: fix race condition in sctp_destroy_sock"
sctp: delay auto_asconf init until binding the first addr
Linux 5.10.37
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I5bee89c285d9dd72de967b0e70d96951ae4e06ae
[ Upstream commit 64bdc02440 ]
Strictly speaking, seccomp filters are only used
when CONFIG_SECCOMP_FILTER.
This patch fixes the condition to enable "Seccomp_filters"
in /proc/$pid/status.
Signed-off-by: Kenta Tada <Kenta.Tada@sony.com>
Fixes: c818c03b66 ("seccomp: Report number of loaded filters in /proc/$pid/status")
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/OSBPR01MB26772D245E2CF4F26B76A989F5669@OSBPR01MB2677.jpnprd01.prod.outlook.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmAjmYgACgkQONu9yGCS
aT6bohAAiRtT6WoMrzEFXvs44tWxB4JPXJR290RmmO+fZGj+HoujKbhnWCwNMD5q
s2PGELPybWaii7schjuXqUl0vhatUnVLs04A59hGKKHHlvpWTznQYT7Urt8C6c8I
sm2zAB6oeGwkwG8vQnujq2srxdgZVzffx09Tm1l+PtFGudROVixuthw21Q0QhRt0
2z2O5lcJC7utlUIudI/pe7WECgMIZHS2lRZyz8EKJ6ynMuNpQl2WuchPWuLgVhYB
hiHV0yRLIHRzi3wNDMy8GJzjjfS1dkMuwnDK+3rZNZH+IVV3wvs/8mdr8p5d/VYU
kfPDP1EMjnik0sBLvavl9d4ixnUcCn05HjGnFrqSbfpi8j88npqKriV3jkDrhQpO
cLlMcVTN3ufpcrr6BIDgL6uCVCul+txeaPkUuafP6y3yNtQXDWbLxlgfm7u+TwAB
nIr12H8cgMR6zK2iQFSZk4rGt1eXUIZkZYLcoINW88Zey9jsTImg9ZgAA+BJJMzs
4A3UKAOthCCyGJYju3hoTUVyS0QPJVuplAODIMSklUjnxQT4JfPJ9Jk2BRw5cY4K
Xecpb/+6vxKj7Zl6HFvcZCkU8+pul5wboDnCIgpIGOLUhq8OTyu5716Q7c9T6MbO
R6ftZosij+bmuapo4HZhnYzybzHb7LHLsa+B/5DzK0CwTTmV1JM=
=YVjW
-----END PGP SIGNATURE-----
Merge 5.10.15 into android12-5.10
Changes in 5.10.15
USB: serial: cp210x: add pid/vid for WSDA-200-USB
USB: serial: cp210x: add new VID/PID for supporting Teraoka AD2000
USB: serial: option: Adding support for Cinterion MV31
usb: host: xhci: mvebu: make USB 3.0 PHY optional for Armada 3720
USB: gadget: legacy: fix an error code in eth_bind()
usb: gadget: aspeed: add missing of_node_put
USB: usblp: don't call usb_set_interface if there's a single alt
usb: renesas_usbhs: Clear pipe running flag in usbhs_pkt_pop()
usb: dwc2: Fix endpoint direction check in ep_from_windex
usb: dwc3: fix clock issue during resume in OTG mode
usb: xhci-mtk: fix unreleased bandwidth data
usb: xhci-mtk: skip dropping bandwidth of unchecked endpoints
usb: xhci-mtk: break loop when find the endpoint to drop
ARM: OMAP1: OSK: fix ohci-omap breakage
arm64: dts: qcom: c630: keep both touchpad devices enabled
Input: i8042 - unbreak Pegatron C15B
arm64: dts: amlogic: meson-g12: Set FL-adj property value
arm64: dts: rockchip: fix vopl iommu irq on px30
arm64: dts: rockchip: Use only supported PCIe link speed on Pinebook Pro
ARM: dts: stm32: Fix polarity of the DH DRC02 uSD card detect
ARM: dts: stm32: Connect card-detect signal on DHCOM
ARM: dts: stm32: Disable WP on DHCOM uSD slot
ARM: dts: stm32: Disable optional TSC2004 on DRC02 board
ARM: dts: stm32: Fix GPIO hog flags on DHCOM DRC02
vdpa/mlx5: Fix memory key MTT population
bpf, cgroup: Fix optlen WARN_ON_ONCE toctou
bpf, cgroup: Fix problematic bounds check
bpf, inode_storage: Put file handler if no storage was found
um: virtio: free vu_dev only with the contained struct device
bpf, preload: Fix build when $(O) points to a relative path
arm64: dts: meson: switch TFLASH_VDD_EN pin to open drain on Odroid-C4
r8169: work around RTL8125 UDP hw bug
rxrpc: Fix deadlock around release of dst cached on udp tunnel
arm64: dts: ls1046a: fix dcfg address range
SUNRPC: Fix NFS READs that start at non-page-aligned offsets
igc: set the default return value to -IGC_ERR_NVM in igc_write_nvm_srwr
igc: check return value of ret_val in igc_config_fc_after_link_up
i40e: Revert "i40e: don't report link up for a VF who hasn't enabled queues"
ibmvnic: device remove has higher precedence over reset
net/mlx5: Fix function calculation for page trees
net/mlx5: Fix leak upon failure of rule creation
net/mlx5e: Update max_opened_tc also when channels are closed
net/mlx5e: Release skb in case of failure in tc update skb
net: lapb: Copy the skb before sending a packet
net: mvpp2: TCAM entry enable should be written after SRAM data
r8169: fix WoL on shutdown if CONFIG_DEBUG_SHIRQ is set
net: ipa: pass correct dma_handle to dma_free_coherent()
ARM: dts: sun7i: a20: bananapro: Fix ethernet phy-mode
nvmet-tcp: fix out-of-bounds access when receiving multiple h2cdata PDUs
vdpa/mlx5: Restore the hardware used index after change map
memblock: do not start bottom-up allocations with kernel_end
kbuild: fix duplicated flags in DEBUG_CFLAGS
thunderbolt: Fix possible NULL pointer dereference in tb_acpi_add_link()
ovl: fix dentry leak in ovl_get_redirect
ovl: avoid deadlock on directory ioctl
ovl: implement volatile-specific fsync error behaviour
mac80211: fix station rate table updates on assoc
gpiolib: free device name on error path to fix kmemleak
fgraph: Initialize tracing_graph_pause at task creation
tracing/kprobe: Fix to support kretprobe events on unloaded modules
kretprobe: Avoid re-registration of the same kretprobe earlier
tracing: Use pause-on-trace with the latency tracers
tracepoint: Fix race between tracing and removing tracepoint
libnvdimm/namespace: Fix visibility of namespace resource attribute
libnvdimm/dimm: Avoid race between probe and available_slots_show()
genirq: Prevent [devm_]irq_alloc_desc from returning irq 0
genirq/msi: Activate Multi-MSI early when MSI_FLAG_ACTIVATE_EARLY is set
scripts: use pkg-config to locate libcrypto
xhci: fix bounce buffer usage for non-sg list case
RISC-V: Define MAXPHYSMEM_1GB only for RV32
cifs: report error instead of invalid when revalidating a dentry fails
iommu: Check dev->iommu in dev_iommu_priv_get() before dereferencing it
smb3: Fix out-of-bounds bug in SMB2_negotiate()
smb3: fix crediting for compounding when only one request in flight
mmc: sdhci-pltfm: Fix linking err for sdhci-brcmstb
mmc: core: Limit retries when analyse of SDIO tuples fails
Fix unsynchronized access to sev members through svm_register_enc_region
drm/dp/mst: Export drm_dp_get_vc_payload_bw()
drm/i915: Fix the MST PBN divider calculation
drm/i915/gem: Drop lru bumping on display unpinning
drm/i915/gt: Close race between enable_breadcrumbs and cancel_breadcrumbs
drm/i915/display: Prevent double YUV range correction on HDR planes
drm/i915: Extract intel_ddi_power_up_lanes()
drm/i915: Power up combo PHY lanes for for HDMI as well
drm/amd/display: Revert "Fix EDID parsing after resume from suspend"
io_uring: don't modify identity's files uncess identity is cowed
nvme-pci: avoid the deepest sleep state on Kingston A2000 SSDs
KVM: SVM: Treat SVM as unsupported when running as an SEV guest
KVM: x86/mmu: Fix TDP MMU zap collapsible SPTEs
KVM: x86: Allow guests to see MSR_IA32_TSX_CTRL even if tsx=off
KVM: x86: fix CPUID entries returned by KVM_GET_CPUID2 ioctl
KVM: x86: Update emulator context mode if SYSENTER xfers to 64-bit mode
KVM: x86: Set so called 'reserved CR3 bits in LM mask' at vCPU reset
DTS: ARM: gta04: remove legacy spi-cs-high to make display work again
ARM: dts; gta04: SPI panel chip select is active low
ARM: footbridge: fix dc21285 PCI configuration accessors
ARM: 9043/1: tegra: Fix misplaced tegra_uart_config in decompressor
mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
mm: hugetlb: fix a race between freeing and dissolving the page
mm: hugetlb: fix a race between isolating and freeing page
mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
mm, compaction: move high_pfn to the for loop scope
mm/vmalloc: separate put pages and flush VM flags
mm: thp: fix MADV_REMOVE deadlock on shmem THP
mm/filemap: add missing mem_cgroup_uncharge() to __add_to_page_cache_locked()
x86/build: Disable CET instrumentation in the kernel
x86/debug: Fix DR6 handling
x86/debug: Prevent data breakpoints on __per_cpu_offset
x86/debug: Prevent data breakpoints on cpu_dr7
x86/apic: Add extra serialization for non-serializing MSRs
Input: goodix - add support for Goodix GT9286 chip
Input: xpad - sync supported devices with fork on GitHub
Input: ili210x - implement pressure reporting for ILI251x
md: Set prev_flush_start and flush_bio in an atomic way
igc: Report speed and duplex as unknown when device is runtime suspended
neighbour: Prevent a dead entry from updating gc_list
net: ip_tunnel: fix mtu calculation
udp: ipv4: manipulate network header of NATed UDP GRO fraglist
net: dsa: mv88e6xxx: override existent unicast portvec in port_fdb_add
net: sched: replaced invalid qdisc tree flush helper in qdisc_replace
Linux 5.10.15
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I15750357b4c30739515fdc0bbbd0e04b7c986171
commit 7e0a922046 upstream.
On some archs, the idle task can call into cpu_suspend(). The cpu_suspend()
will disable or pause function graph tracing, as there's some paths in
bringing down the CPU that can have issues with its return address being
modified. The task_struct structure has a "tracing_graph_pause" atomic
counter, that when set to something other than zero, the function graph
tracer will not modify the return address.
The problem is that the tracing_graph_pause counter is initialized when the
function graph tracer is enabled. This can corrupt the counter for the idle
task if it is suspended in these architectures.
CPU 1 CPU 2
----- -----
do_idle()
cpu_suspend()
pause_graph_tracing()
task_struct->tracing_graph_pause++ (0 -> 1)
start_graph_tracing()
for_each_online_cpu(cpu) {
ftrace_graph_init_idle_task(cpu)
task-struct->tracing_graph_pause = 0 (1 -> 0)
unpause_graph_tracing()
task_struct->tracing_graph_pause-- (0 -> -1)
The above should have gone from 1 to zero, and enabled function graph
tracing again. But instead, it is set to -1, which keeps it disabled.
There's no reason that the field tracing_graph_pause on the task_struct can
not be initialized at boot up.
Cc: stable@vger.kernel.org
Fixes: 380c4b1411 ("tracing/function-graph-tracer: append the tracing_graph_flag")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211339
Reported-by: pierre.gondois@arm.com
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d73b49365e ]
This is a preparatory commit for the upcoming addition of a new hardware
tag-based (MTE-based) KASAN mode.
Hardware tag-based KASAN won't use kasan_depth. Only define and use it
when one of the software KASAN modes are enabled.
No functional changes for software modes.
Link: https://lkml.kernel.org/r/e16f15aeda90bc7fb4dfc2e243a14b74cc5c8219.1606161801.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Branislav Rankov <Branislav.Rankov@arm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Marco Elver <elver@google.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bug: 172318110
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Change-Id: I553d5ca1fa50ae80cd2eb929e328ac3cb3ce0e9f
[ Upstream commit f7cfd871ae ]
Recently syzbot reported[0] that there is a deadlock amongst the users
of exec_update_mutex. The problematic lock ordering found by lockdep
was:
perf_event_open (exec_update_mutex -> ovl_i_mutex)
chown (ovl_i_mutex -> sb_writes)
sendfile (sb_writes -> p->lock)
by reading from a proc file and writing to overlayfs
proc_pid_syscall (p->lock -> exec_update_mutex)
While looking at possible solutions it occured to me that all of the
users and possible users involved only wanted to state of the given
process to remain the same. They are all readers. The only writer is
exec.
There is no reason for readers to block on each other. So fix
this deadlock by transforming exec_update_mutex into a rw_semaphore
named exec_update_lock that only exec takes for writing.
Cc: Jann Horn <jannh@google.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Bernd Edlinger <bernd.edlinger@hotmail.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Christopher Yeoh <cyeoh@au1.ibm.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Fixes: eea9673250 ("exec: Add exec_update_mutex to replace cred_guard_mutex")
[0] https://lkml.kernel.org/r/00000000000063640c05ade8e3de@google.com
Reported-by: syzbot+db9cdf3dd1f64252c6ef@syzkaller.appspotmail.com
Link: https://lkml.kernel.org/r/87ft4mbqen.fsf@x220.int.ebiederm.org
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Grab actual references to the files_struct. To avoid circular references
issues due to this, we add a per-task note that keeps track of what
io_uring contexts a task has used. When the tasks execs or exits its
assigned files, we cancel requests based on this tracking.
With that, we can grab proper references to the files table, and no
longer need to rely on stashing away ring_fd and ring_file to check
if the ring_fd may have been closed.
Cc: stable@vger.kernel.org # v5.5+
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
A few patches all over the place during this cycle, mostly bug and
sparse warning fixes for OpenRISC, but a few enhancements too. Note,
there are 2 non OpenRISC specific fixups.
Non OpenRISC fixes:
- In init we need to align the init_task correctly to fix an issue with
MUTEX_FLAGS, reviewed by Peter Z. No one picked this up so I kept it
on my tree.
- In asm-generic/io.h I fixed up some sparse warnings, OK'd by Arnd.
Arnd asked to merge it via my tree.
OpenRISC fixes:
- Many fixes for OpenRISC sprase warnings.
- Add support OpenRISC SMP tlb flushing rather than always flushing the
entire TLB on every CPU.
- Fix bug when dumping stack via /proc/xxx/stack of user threads.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE2cRzVK74bBA6Je/xw7McLV5mJ+QFAl829wgACgkQw7McLV5m
J+RvMg/+Ik9jmHiCoDilVzB5yqJ0Ea8oLjg75V9eBE3YtnYJMAbDHb8ye2OsYrlp
QhrAHFi8PB7nJQphod3XXt8Y5JWMYjKIgdazybVQtUlD5IAXgYAR6/IxJ1DVzxa0
AzJ7TGmYSxnhW7GzbRU5xjgdIi5cKQjBUcVM/blDQB6/GZ4wY3OBxK1pn0kNXMPU
gnS+0yPDlwXaZw67YmbF5kF34lvEe0knkOaxxP/S0t2ROb6Xu0PJCEDTbdcGApsB
2xdm0dJwK50ulS0/HWxC18vC/R7d1b0qjR+xvisjydHbZawEN2Kcf3mOCSAETSTk
ST1WFxuTAObqdyc4F15tdsqFvbchPtCH9UAjkkSbmRxGVOKQa88NmW1A+s0hj4BX
enf6I9SYzqiU/WkuFDwSnJ4NETOpPaUVqZbi3WTUfljyXmOdqXbT+416YxViOXpx
OtSyGVN18qs8wjsWlWiGyhM/eAnHwr9q0q1kJ8VZTh+nQSnQFmuWjHSfRan2PkmQ
nnbvXHXJcgWYVlk+JZLOnhOB3zrkH5xmlM2UakVUvP92ESnnSmBCC0RLA0k6kGZ3
PkFBbY4etbA7Ug8r1KueOaqHKwJpTpIb4tU75y3KXyi05FeLEln1doC5M4EQUPDy
eXzdWj6afuEKmAPILiEYlSVXO3t8iIncVBkK7isaR37dURNnJWE=
=0MlF
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://github.com/openrisc/linux
Pull OpenRISC updates from Stafford Horne:
"A few patches all over the place during this cycle, mostly bug and
sparse warning fixes for OpenRISC, but a few enhancements too. Note,
there are 2 non OpenRISC specific fixups.
Non OpenRISC fixes:
- In init we need to align the init_task correctly to fix an issue
with MUTEX_FLAGS, reviewed by Peter Z. No one picked this up so I
kept it on my tree.
- In asm-generic/io.h I fixed up some sparse warnings, OK'd by Arnd.
Arnd asked to merge it via my tree.
OpenRISC fixes:
- Many fixes for OpenRISC sprase warnings.
- Add support OpenRISC SMP tlb flushing rather than always flushing
the entire TLB on every CPU.
- Fix bug when dumping stack via /proc/xxx/stack of user threads"
* tag 'for-linus' of git://github.com/openrisc/linux:
openrisc: uaccess: Add user address space check to access_ok
openrisc: signal: Fix sparse address space warnings
openrisc: uaccess: Remove unused macro __addr_ok
openrisc: uaccess: Use static inline function in access_ok
openrisc: uaccess: Fix sparse address space warnings
openrisc: io: Fixup defines and move include to the end
asm-generic/io.h: Fix sparse warnings on big-endian architectures
openrisc: Implement proper SMP tlb flushing
openrisc: Fix oops caused when dumping stack
openrisc: Add support for external initrd images
init: Align init_task to avoid conflict with MUTEX_FLAGS
openrisc: fix __user in raw_copy_to_user()'s prototype
- Untangle the header spaghetti which causes build failures in various
situations caused by the lockdep additions to seqcount to validate that
the write side critical sections are non-preemptible.
- The seqcount associated lock debug addons which were blocked by the
above fallout.
seqcount writers contrary to seqlock writers must be externally
serialized, which usually happens via locking - except for strict per
CPU seqcounts. As the lock is not part of the seqcount, lockdep cannot
validate that the lock is held.
This new debug mechanism adds the concept of associated locks.
sequence count has now lock type variants and corresponding
initializers which take a pointer to the associated lock used for
writer serialization. If lockdep is enabled the pointer is stored and
write_seqcount_begin() has a lockdep assertion to validate that the
lock is held.
Aside of the type and the initializer no other code changes are
required at the seqcount usage sites. The rest of the seqcount API is
unchanged and determines the type at compile time with the help of
_Generic which is possible now that the minimal GCC version has been
moved up.
Adding this lockdep coverage unearthed a handful of seqcount bugs which
have been addressed already independent of this.
While generaly useful this comes with a Trojan Horse twist: On RT
kernels the write side critical section can become preemtible if the
writers are serialized by an associated lock, which leads to the well
known reader preempts writer livelock. RT prevents this by storing the
associated lock pointer independent of lockdep in the seqcount and
changing the reader side to block on the lock when a reader detects
that a writer is in the write side critical section.
- Conversion of seqcount usage sites to associated types and initializers.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl8xmPYTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoTuQEACyzQCjU8PgehPp9oMqWzaX2fcVyuZO
QU2yw6gmz2oTz3ZHUNwdW8UnzGh2OWosK3kDruoD9FtSS51lER1/ISfSPCGfyqxC
KTjOcB1Kvxwq/3LcCx7Zi3ZxWApat74qs3EhYhKtEiQ2Y9xv9rLq8VV1UWAwyxq0
eHpjlIJ6b6rbt+ARslaB7drnccOsdK+W/roNj4kfyt+gezjBfojGRdMGQNMFcpnv
shuTC+vYurAVIiVA/0IuizgHfwZiXOtVpjVoEWaxg6bBH6HNuYMYzdSa/YrlDkZs
n/aBI/Xkvx+Eacu8b1Zwmbzs5EnikUK/2dMqbzXKUZK61eV4hX5c2xrnr1yGWKTs
F/juh69Squ7X6VZyKVgJ9RIccVueqwR2EprXWgH3+RMice5kjnXH4zURp0GHALxa
DFPfB6fawcH3Ps87kcRFvjgm6FBo0hJ1AxmsW1dY4ACFB9azFa2euW+AARDzHOy2
VRsUdhL9CGwtPjXcZ/9Rhej6fZLGBXKr8uq5QiMuvttp4b6+j9FEfBgD4S6h8csl
AT2c2I9LcbWqyUM9P4S7zY/YgOZw88vHRuDH7tEBdIeoiHfrbSBU7EQ9jlAKq/59
f+Htu2Io281c005g7DEeuCYvpzSYnJnAitj5Lmp/kzk2Wn3utY1uIAVszqwf95Ul
81ppn2KlvzUK8g==
=7Gj+
-----END PGP SIGNATURE-----
Merge tag 'locking-urgent-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Thomas Gleixner:
"A set of locking fixes and updates:
- Untangle the header spaghetti which causes build failures in
various situations caused by the lockdep additions to seqcount to
validate that the write side critical sections are non-preemptible.
- The seqcount associated lock debug addons which were blocked by the
above fallout.
seqcount writers contrary to seqlock writers must be externally
serialized, which usually happens via locking - except for strict
per CPU seqcounts. As the lock is not part of the seqcount, lockdep
cannot validate that the lock is held.
This new debug mechanism adds the concept of associated locks.
sequence count has now lock type variants and corresponding
initializers which take a pointer to the associated lock used for
writer serialization. If lockdep is enabled the pointer is stored
and write_seqcount_begin() has a lockdep assertion to validate that
the lock is held.
Aside of the type and the initializer no other code changes are
required at the seqcount usage sites. The rest of the seqcount API
is unchanged and determines the type at compile time with the help
of _Generic which is possible now that the minimal GCC version has
been moved up.
Adding this lockdep coverage unearthed a handful of seqcount bugs
which have been addressed already independent of this.
While generally useful this comes with a Trojan Horse twist: On RT
kernels the write side critical section can become preemtible if
the writers are serialized by an associated lock, which leads to
the well known reader preempts writer livelock. RT prevents this by
storing the associated lock pointer independent of lockdep in the
seqcount and changing the reader side to block on the lock when a
reader detects that a writer is in the write side critical section.
- Conversion of seqcount usage sites to associated types and
initializers"
* tag 'locking-urgent-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
locking/seqlock, headers: Untangle the spaghetti monster
locking, arch/ia64: Reduce <asm/smp.h> header dependencies by moving XTP bits into the new <asm/xtp.h> header
x86/headers: Remove APIC headers from <asm/smp.h>
seqcount: More consistent seqprop names
seqcount: Compress SEQCNT_LOCKNAME_ZERO()
seqlock: Fold seqcount_LOCKNAME_init() definition
seqlock: Fold seqcount_LOCKNAME_t definition
seqlock: s/__SEQ_LOCKDEP/__SEQ_LOCK/g
hrtimer: Use sequence counter with associated raw spinlock
kvm/eventfd: Use sequence counter with associated spinlock
userfaultfd: Use sequence counter with associated spinlock
NFSv4: Use sequence counter with associated spinlock
iocost: Use sequence counter with associated spinlock
raid5: Use sequence counter with associated spinlock
vfs: Use sequence counter with associated spinlock
timekeeping: Use sequence counter with associated raw spinlock
xfrm: policy: Use sequence counters with associated lock
netfilter: nft_set_rbtree: Use sequence counter with associated rwlock
netfilter: conntrack: Use sequence counter with associated spinlock
sched: tasks: Use sequence counter with associated spinlock
...
When booting on 32-bit machines (seen on OpenRISC) I saw this warning
with CONFIG_DEBUG_MUTEXES turned on.
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at kernel/locking/mutex.c:1242 __mutex_unlock_slowpath+0x328/0x3ec
DEBUG_LOCKS_WARN_ON(__owner_task(owner) != current)
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.8.0-rc1-simple-smp-00005-g2864e2171db4-dirty #179
Call trace:
[<(ptrval)>] dump_stack+0x34/0x48
[<(ptrval)>] __warn+0x104/0x158
[<(ptrval)>] ? __mutex_unlock_slowpath+0x328/0x3ec
[<(ptrval)>] warn_slowpath_fmt+0x7c/0x94
[<(ptrval)>] __mutex_unlock_slowpath+0x328/0x3ec
[<(ptrval)>] mutex_unlock+0x18/0x28
[<(ptrval)>] __cpuhp_setup_state_cpuslocked.part.0+0x29c/0x2f4
[<(ptrval)>] ? page_alloc_cpu_dead+0x0/0x30
[<(ptrval)>] ? start_kernel+0x0/0x684
[<(ptrval)>] __cpuhp_setup_state+0x4c/0x5c
[<(ptrval)>] page_alloc_init+0x34/0x68
[<(ptrval)>] ? start_kernel+0x1a0/0x684
[<(ptrval)>] ? early_init_dt_scan_nodes+0x60/0x70
irq event stamp: 0
I traced this to kernel/locking/mutex.c storing 3 bits of MUTEX_FLAGS in
the task_struct pointer (mutex.owner). There is a comment saying that
task_structs are always aligned to L1_CACHE_BYTES. This is not true for
the init_task.
On 64-bit machines this is not a problem because symbol addresses are
naturally aligned to 64-bits providing 3 bits for MUTEX_FLAGS. Howerver,
for 32-bit machines the symbol address only has 2 bits available.
Fix this by setting init_task alignment to at least L1_CACHE_BYTES.
Signed-off-by: Stafford Horne <shorne@gmail.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
A sequence counter write side critical section must be protected by some
form of locking to serialize writers. A plain seqcount_t does not
contain the information of which lock must be held when entering a write
side critical section.
Use the new seqcount_spinlock_t data type, which allows to associate a
spinlock with the sequence counter. This enables lockdep to verify that
the spinlock used for writer serialization is held when the write side
critical section is entered.
If lockdep is disabled this lock association is compiled out and has
neither storage size nor runtime overhead.
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200720155530.1173732-14-a.darwish@linutronix.de
A common question asked when debugging seccomp filters is "how many
filters are attached to your process?" Provide a way to easily answer
this question through /proc/$pid/status with a "Seccomp_filters" line.
Signed-off-by: Kees Cook <keescook@chromium.org>
Merge the state of the locking kcsan branch before the read/write_once()
and the atomics modifications got merged.
Squash the fallout of the rebase on top of the read/write once and atomic
fallback work into the merge. The history of the original branch is
preserved in tag locking-kcsan-2020-06-02.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Patch series "mm: consolidate definitions of page table accessors", v2.
The low level page table accessors (pXY_index(), pXY_offset()) are
duplicated across all architectures and sometimes more than once. For
instance, we have 31 definition of pgd_offset() for 25 supported
architectures.
Most of these definitions are actually identical and typically it boils
down to, e.g.
static inline unsigned long pmd_index(unsigned long address)
{
return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
}
static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
{
return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address);
}
These definitions can be shared among 90% of the arches provided
XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined.
For architectures that really need a custom version there is always
possibility to override the generic version with the usual ifdefs magic.
These patches introduce include/linux/pgtable.h that replaces
include/asm-generic/pgtable.h and add the definitions of the page table
accessors to the new header.
This patch (of 12):
The linux/mm.h header includes <asm/pgtable.h> to allow inlining of the
functions involving page table manipulations, e.g. pte_alloc() and
pmd_alloc(). So, there is no point to explicitly include <asm/pgtable.h>
in the files that include <linux/mm.h>.
The include statements in such cases are remove with a simple loop:
for f in $(git grep -l "include <linux/mm.h>") ; do
sed -i -e '/include <asm\/pgtable.h>/ d' $f
done
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200514170327.31389-1-rppt@kernel.org
Link: http://lkml.kernel.org/r/20200514170327.31389-2-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Branch Target Identification (BTI)
* Support for ARMv8.5-BTI in both user- and kernel-space. This
allows branch targets to limit the types of branch from which
they can be called and additionally prevents branching to
arbitrary code, although kernel support requires a very recent
toolchain.
* Function annotation via SYM_FUNC_START() so that assembly
functions are wrapped with the relevant "landing pad"
instructions.
* BPF and vDSO updates to use the new instructions.
* Addition of a new HWCAP and exposure of BTI capability to
userspace via ID register emulation, along with ELF loader
support for the BTI feature in .note.gnu.property.
* Non-critical fixes to CFI unwind annotations in the sigreturn
trampoline.
- Shadow Call Stack (SCS)
* Support for Clang's Shadow Call Stack feature, which reserves
platform register x18 to point at a separate stack for each
task that holds only return addresses. This protects function
return control flow from buffer overruns on the main stack.
* Save/restore of x18 across problematic boundaries (user-mode,
hypervisor, EFI, suspend, etc).
* Core support for SCS, should other architectures want to use it
too.
* SCS overflow checking on context-switch as part of the existing
stack limit check if CONFIG_SCHED_STACK_END_CHECK=y.
- CPU feature detection
* Removed numerous "SANITY CHECK" errors when running on a system
with mismatched AArch32 support at EL1. This is primarily a
concern for KVM, which disabled support for 32-bit guests on
such a system.
* Addition of new ID registers and fields as the architecture has
been extended.
- Perf and PMU drivers
* Minor fixes and cleanups to system PMU drivers.
- Hardware errata
* Unify KVM workarounds for VHE and nVHE configurations.
* Sort vendor errata entries in Kconfig.
- Secure Monitor Call Calling Convention (SMCCC)
* Update to the latest specification from Arm (v1.2).
* Allow PSCI code to query the SMCCC version.
- Software Delegated Exception Interface (SDEI)
* Unexport a bunch of unused symbols.
* Minor fixes to handling of firmware data.
- Pointer authentication
* Add support for dumping the kernel PAC mask in vmcoreinfo so
that the stack can be unwound by tools such as kdump.
* Simplification of key initialisation during CPU bringup.
- BPF backend
* Improve immediate generation for logical and add/sub
instructions.
- vDSO
- Minor fixes to the linker flags for consistency with other
architectures and support for LLVM's unwinder.
- Clean up logic to initialise and map the vDSO into userspace.
- ACPI
- Work around for an ambiguity in the IORT specification relating
to the "num_ids" field.
- Support _DMA method for all named components rather than only
PCIe root complexes.
- Minor other IORT-related fixes.
- Miscellaneous
* Initialise debug traps early for KGDB and fix KDB cacheflushing
deadlock.
* Minor tweaks to early boot state (documentation update, set
TEXT_OFFSET to 0x0, increase alignment of PE/COFF sections).
* Refactoring and cleanup
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl7U9csQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNLBHCACs/YU4SM7Om5f+7QnxIKao5DBr2CnGGvdC
yTfDghFDTLQVv3MufLlfno3yBe5G8sQpcZfcc+hewfcGoMzVZXu8s7LzH6VSn9T9
jmT3KjDMrg0RjSHzyumJp2McyelTk0a4FiKArSIIKsJSXUyb1uPSgm7SvKVDwEwU
JGDzL9IGilmq59GiXfDzGhTZgmC37QdwRoRxDuqtqWQe5CHoRXYexg87HwBKOQxx
HgU9L7ehri4MRZfpyjaDrr6quJo3TVnAAKXNBh3mZAskVS9ZrfKpEH0kYWYuqybv
znKyHRecl/rrGePV8RTMtrwnSdU26zMXE/omsVVauDfG9hqzqm+Q
=w3qi
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"A sizeable pile of arm64 updates for 5.8.
Summary below, but the big two features are support for Branch Target
Identification and Clang's Shadow Call stack. The latter is currently
arm64-only, but the high-level parts are all in core code so it could
easily be adopted by other architectures pending toolchain support
Branch Target Identification (BTI):
- Support for ARMv8.5-BTI in both user- and kernel-space. This allows
branch targets to limit the types of branch from which they can be
called and additionally prevents branching to arbitrary code,
although kernel support requires a very recent toolchain.
- Function annotation via SYM_FUNC_START() so that assembly functions
are wrapped with the relevant "landing pad" instructions.
- BPF and vDSO updates to use the new instructions.
- Addition of a new HWCAP and exposure of BTI capability to userspace
via ID register emulation, along with ELF loader support for the
BTI feature in .note.gnu.property.
- Non-critical fixes to CFI unwind annotations in the sigreturn
trampoline.
Shadow Call Stack (SCS):
- Support for Clang's Shadow Call Stack feature, which reserves
platform register x18 to point at a separate stack for each task
that holds only return addresses. This protects function return
control flow from buffer overruns on the main stack.
- Save/restore of x18 across problematic boundaries (user-mode,
hypervisor, EFI, suspend, etc).
- Core support for SCS, should other architectures want to use it
too.
- SCS overflow checking on context-switch as part of the existing
stack limit check if CONFIG_SCHED_STACK_END_CHECK=y.
CPU feature detection:
- Removed numerous "SANITY CHECK" errors when running on a system
with mismatched AArch32 support at EL1. This is primarily a concern
for KVM, which disabled support for 32-bit guests on such a system.
- Addition of new ID registers and fields as the architecture has
been extended.
Perf and PMU drivers:
- Minor fixes and cleanups to system PMU drivers.
Hardware errata:
- Unify KVM workarounds for VHE and nVHE configurations.
- Sort vendor errata entries in Kconfig.
Secure Monitor Call Calling Convention (SMCCC):
- Update to the latest specification from Arm (v1.2).
- Allow PSCI code to query the SMCCC version.
Software Delegated Exception Interface (SDEI):
- Unexport a bunch of unused symbols.
- Minor fixes to handling of firmware data.
Pointer authentication:
- Add support for dumping the kernel PAC mask in vmcoreinfo so that
the stack can be unwound by tools such as kdump.
- Simplification of key initialisation during CPU bringup.
BPF backend:
- Improve immediate generation for logical and add/sub instructions.
vDSO:
- Minor fixes to the linker flags for consistency with other
architectures and support for LLVM's unwinder.
- Clean up logic to initialise and map the vDSO into userspace.
ACPI:
- Work around for an ambiguity in the IORT specification relating to
the "num_ids" field.
- Support _DMA method for all named components rather than only PCIe
root complexes.
- Minor other IORT-related fixes.
Miscellaneous:
- Initialise debug traps early for KGDB and fix KDB cacheflushing
deadlock.
- Minor tweaks to early boot state (documentation update, set
TEXT_OFFSET to 0x0, increase alignment of PE/COFF sections).
- Refactoring and cleanup"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (148 commits)
KVM: arm64: Move __load_guest_stage2 to kvm_mmu.h
KVM: arm64: Check advertised Stage-2 page size capability
arm64/cpufeature: Add get_arm64_ftr_reg_nowarn()
ACPI/IORT: Remove the unused __get_pci_rid()
arm64/cpuinfo: Add ID_MMFR4_EL1 into the cpuinfo_arm64 context
arm64/cpufeature: Add remaining feature bits in ID_AA64PFR1 register
arm64/cpufeature: Add remaining feature bits in ID_AA64PFR0 register
arm64/cpufeature: Add remaining feature bits in ID_AA64ISAR0 register
arm64/cpufeature: Add remaining feature bits in ID_MMFR4 register
arm64/cpufeature: Add remaining feature bits in ID_PFR0 register
arm64/cpufeature: Introduce ID_MMFR5 CPU register
arm64/cpufeature: Introduce ID_DFR1 CPU register
arm64/cpufeature: Introduce ID_PFR2 CPU register
arm64/cpufeature: Make doublelock a signed feature in ID_AA64DFR0
arm64/cpufeature: Drop TraceFilt feature exposure from ID_DFR0 register
arm64/cpufeature: Add explicit ftr_id_isar0[] for ID_ISAR0 register
arm64: mm: Add asid_gen_match() helper
firmware: smccc: Fix missing prototype warning for arm_smccc_version_init
arm64: vdso: Fix CFI directives in sigreturn trampoline
arm64: vdso: Don't prefix sigreturn trampoline with a BTI C instruction
...
This change adds generic support for Clang's Shadow Call Stack,
which uses a shadow stack to protect return addresses from being
overwritten by an attacker. Details are available here:
https://clang.llvm.org/docs/ShadowCallStack.html
Note that security guarantees in the kernel differ from the ones
documented for user space. The kernel must store addresses of
shadow stacks in memory, which means an attacker capable reading
and writing arbitrary memory may be able to locate them and hijack
control flow by modifying the stacks.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
[will: Numerous cosmetic changes]
Signed-off-by: Will Deacon <will@kernel.org>
This commit splits ->trc_reader_need_end by using the rcu_special union.
This change permits readers to check to see if a memory barrier is
required without any added overhead in the common case where no such
barrier is required. This commit also adds the read-side checking.
Later commits will add the machinery to properly set the new
->trc_reader_special.b.need_mb field.
This commit also makes rcu_read_unlock_trace_special() tolerate nested
read-side critical sections within interrupt and NMI handlers.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Because RCU does not watch exception early-entry/late-exit, idle-loop,
or CPU-hotplug execution, protection of tracing and BPF operations is
needlessly complicated. This commit therefore adds a variant of
Tasks RCU that:
o Has explicit read-side markers to allow finite grace periods in
the face of in-kernel loops for PREEMPT=n builds. These markers
are rcu_read_lock_trace() and rcu_read_unlock_trace().
o Protects code in the idle loop, exception entry/exit, and
CPU-hotplug code paths. In this respect, RCU-tasks trace is
similar to SRCU, but with lighter-weight readers.
o Avoids expensive read-side instruction, having overhead similar
to that of Preemptible RCU.
There are of course downsides:
o The grace-period code can send IPIs to CPUs, even when those
CPUs are in the idle loop or in nohz_full userspace. This is
mitigated by later commits.
o It is necessary to scan the full tasklist, much as for Tasks RCU.
o There is a single callback queue guarded by a single lock,
again, much as for Tasks RCU. However, those early use cases
that request multiple grace periods in quick succession are
expected to do so from a single task, which makes the single
lock almost irrelevant. If needed, multiple callback queues
can be provided using any number of schemes.
Perhaps most important, this variant of RCU does not affect the vanilla
flavors, rcu_preempt and rcu_sched. The fact that RCU Tasks Trace
readers can operate from idle, offline, and exception entry/exit in no
way enables rcu_preempt and rcu_sched readers to do so.
The memory ordering was outlined here:
https://lore.kernel.org/lkml/20200319034030.GX3199@paulmck-ThinkPad-P72/
This effort benefited greatly from off-list discussions of BPF
requirements with Alexei Starovoitov and Andrii Nakryiko. At least
some of the on-list discussions are captured in the Link: tags below.
In addition, KCSAN was quite helpful in finding some early bugs.
Link: https://lore.kernel.org/lkml/20200219150744.428764577@infradead.org/
Link: https://lore.kernel.org/lkml/87mu8p797b.fsf@nanos.tec.linutronix.de/
Link: https://lore.kernel.org/lkml/20200225221305.605144982@linutronix.de/
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andriin@fb.com>
[ paulmck: Apply feedback from Steve Rostedt and Joel Fernandes. ]
[ paulmck: Decrement trc_n_readers_need_end upon IPI failure. ]
[ paulmck: Fix locking issue reported by rcutorture. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This adds support for scoped accesses, where the memory range is checked
for the duration of the scope. The feature is implemented by inserting
the relevant access information into a list of scoped accesses for
the current execution context, which are then checked (until removed)
on every call (through instrumentation) into the KCSAN runtime.
An alternative, more complex, implementation could set up a watchpoint for
the scoped access, and keep the watchpoint set up. This, however, would
require first exposing a handle to the watchpoint, as well as dealing
with cases such as accesses by the same thread while the watchpoint is
still set up (and several more cases). It is also doubtful if this would
provide any benefit, since the majority of delay where the watchpoint
is set up is likely due to the injected delays by KCSAN. Therefore,
the implementation in this patch is simpler and avoids hurting KCSAN's
main use-case (normal data race detection); it also implicitly increases
scoped-access race-detection-ability due to increased probability of
setting up watchpoints by repeatedly calling __kcsan_check_access()
throughout the scope of the access.
The implementation required adding an additional conditional branch to
the fast-path. However, the microbenchmark showed a *speedup* of ~5%
on the fast-path. This appears to be due to subtly improved codegen by
GCC from moving get_ctx() and associated load of preempt_count earlier.
Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl6TbaUeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGhgkH/iWpiKvosA20HJjC
rBqYeJPxQsgZTuBieWJ+MeVxbpcF7RlM4c+glyvg3QJhHwIEG58dl6LBrQbAyBAR
aFHNojr1iAYOruVCGnU3pA008YZiwUIDv/ZQ4DF8fmIU2vI2mJ6qHBv3XDl4G2uR
Nwz8Eu9AgIwZM5coomVOSmoWyFy7Vxmb7W+3t5VmKsvOWx4ib9kyQtOIkvQDEl7j
XCbWfI0xDQr6LFOm4jnCi5R/LhJ2LIqqIvHHrunbpszM8IwK797jCXz4im+dmd5Y
+km46N7a8pDqri36xXz1gdBAU3eG7Pt1NyvfjwRVTdX4GquQ2MT0GoojxbLxUP3y
3pEsQuE=
=whbL
-----END PGP SIGNATURE-----
Merge tag 'v5.7-rc1' into locking/kcsan, to resolve conflicts and refresh
Resolve these conflicts:
arch/x86/Kconfig
arch/x86/kernel/Makefile
Do a minor "evil merge" to move the KCSAN entry up a bit by a few lines
in the Kconfig to reduce the probability of future conflicts.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The cred_guard_mutex is problematic as it is held over possibly
indefinite waits for userspace. The possible indefinite waits for
userspace that I have identified are: The cred_guard_mutex is held in
PTRACE_EVENT_EXIT waiting for the tracer. The cred_guard_mutex is
held over "put_user(0, tsk->clear_child_tid)" in exit_mm(). The
cred_guard_mutex is held over "get_user(futex_offset, ...") in
exit_robust_list. The cred_guard_mutex held over copy_strings.
The functions get_user and put_user can trigger a page fault which can
potentially wait indefinitely in the case of userfaultfd or if
userspace implements part of the page fault path.
In any of those cases the userspace process that the kernel is waiting
for might make a different system call that winds up taking the
cred_guard_mutex and result in deadlock.
Holding a mutex over any of those possibly indefinite waits for
userspace does not appear necessary. Add exec_update_mutex that will
just cover updating the process during exec where the permissions and
the objects pointed to by the task struct may be out of sync.
The plan is to switch the users of cred_guard_mutex to
exec_update_mutex one by one. This lets us move forward while still
being careful and not introducing any regressions.
Link: https://lore.kernel.org/lkml/20160921152946.GA24210@dhcp22.suse.cz/
Link: https://lore.kernel.org/lkml/AM6PR03MB5170B06F3A2B75EFB98D071AE4E60@AM6PR03MB5170.eurprd03.prod.outlook.com/
Link: https://lore.kernel.org/linux-fsdevel/20161102181806.GB1112@redhat.com/
Link: https://lore.kernel.org/lkml/20160923095031.GA14923@redhat.com/
Link: https://lore.kernel.org/lkml/20170213141452.GA30203@redhat.com/
Ref: 45c1a159b85b ("Add PTRACE_O_TRACEVFORKDONE and PTRACE_O_TRACEEXIT facilities.")
Ref: 456f17cd1a28 ("[PATCH] user-vm-unlock-2.5.31-A2")
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Bernd Edlinger <bernd.edlinger@hotmail.de>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
When setting up an access mask with kcsan_set_access_mask(), KCSAN will
only report races if concurrent changes to bits set in access_mask are
observed. Conveying access_mask via a separate call avoids introducing
overhead in the common-case fast-path.
Acked-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Kernel Concurrency Sanitizer (KCSAN) is a dynamic data-race detector for
kernel space. KCSAN is a sampling watchpoint-based data-race detector.
See the included Documentation/dev-tools/kcsan.rst for more details.
This patch adds basic infrastructure, but does not yet enable KCSAN for
any architecture.
Signed-off-by: Marco Elver <elver@google.com>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Pull core timer updates from Thomas Gleixner:
"Timers and timekeeping updates:
- A large overhaul of the posix CPU timer code which is a preparation
for moving the CPU timer expiry out into task work so it can be
properly accounted on the task/process.
An update to the bogus permission checks will come later during the
merge window as feedback was not complete before heading of for
travel.
- Switch the timerqueue code to use cached rbtrees and get rid of the
homebrewn caching of the leftmost node.
- Consolidate hrtimer_init() + hrtimer_init_sleeper() calls into a
single function
- Implement the separation of hrtimers to be forced to expire in hard
interrupt context even when PREEMPT_RT is enabled and mark the
affected timers accordingly.
- Implement a mechanism for hrtimers and the timer wheel to protect
RT against priority inversion and live lock issues when a (hr)timer
which should be canceled is currently executing the callback.
Instead of infinitely spinning, the task which tries to cancel the
timer blocks on a per cpu base expiry lock which is held and
released by the (hr)timer expiry code.
- Enable the Hyper-V TSC page based sched_clock for Hyper-V guests
resulting in faster access to timekeeping functions.
- Updates to various clocksource/clockevent drivers and their device
tree bindings.
- The usual small improvements all over the place"
* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (101 commits)
posix-cpu-timers: Fix permission check regression
posix-cpu-timers: Always clear head pointer on dequeue
hrtimer: Add a missing bracket and hide `migration_base' on !SMP
posix-cpu-timers: Make expiry_active check actually work correctly
posix-timers: Unbreak CONFIG_POSIX_TIMERS=n build
tick: Mark sched_timer to expire in hard interrupt context
hrtimer: Add kernel doc annotation for HRTIMER_MODE_HARD
x86/hyperv: Hide pv_ops access for CONFIG_PARAVIRT=n
posix-cpu-timers: Utilize timerqueue for storage
posix-cpu-timers: Move state tracking to struct posix_cputimers
posix-cpu-timers: Deduplicate rlimit handling
posix-cpu-timers: Remove pointless comparisons
posix-cpu-timers: Get rid of 64bit divisions
posix-cpu-timers: Consolidate timer expiry further
posix-cpu-timers: Get rid of zero checks
rlimit: Rewrite non-sensical RLIMIT_CPU comment
posix-cpu-timers: Respect INFINITY for hard RTTIME limit
posix-cpu-timers: Switch thread group sampling to array
posix-cpu-timers: Restructure expiry array
posix-cpu-timers: Remove cputime_expires
...
CONFIG_PREEMPTION is selected by CONFIG_PREEMPT and by
CONFIG_PREEMPT_RT. Both PREEMPT and PREEMPT_RT require the same
functionality which today depends on CONFIG_PREEMPT.
Switch the preemption code, scheduler and init task over to use
CONFIG_PREEMPTION.
That's the first step towards RT in that area. The more complex changes are
coming separately.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20190726212124.117528401@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull scheduler updates from Ingo Molnar:
- Remove the unused per rq load array and all its infrastructure, by
Dietmar Eggemann.
- Add utilization clamping support by Patrick Bellasi. This is a
refinement of the energy aware scheduling framework with support for
boosting of interactive and capping of background workloads: to make
sure critical GUI threads get maximum frequency ASAP, and to make
sure background processing doesn't unnecessarily move to cpufreq
governor to higher frequencies and less energy efficient CPU modes.
- Add the bare minimum of tracepoints required for LISA EAS regression
testing, by Qais Yousef - which allows automated testing of various
power management features, including energy aware scheduling.
- Restructure the former tsk_nr_cpus_allowed() facility that the -rt
kernel used to modify the scheduler's CPU affinity logic such as
migrate_disable() - introduce the task->cpus_ptr value instead of
taking the address of &task->cpus_allowed directly - by Sebastian
Andrzej Siewior.
- Misc optimizations, fixes, cleanups and small enhancements - see the
Git log for details.
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
sched/uclamp: Add uclamp support to energy_compute()
sched/uclamp: Add uclamp_util_with()
sched/cpufreq, sched/uclamp: Add clamps for FAIR and RT tasks
sched/uclamp: Set default clamps for RT tasks
sched/uclamp: Reset uclamp values on RESET_ON_FORK
sched/uclamp: Extend sched_setattr() to support utilization clamping
sched/core: Allow sched_setattr() to use the current policy
sched/uclamp: Add system default clamps
sched/uclamp: Enforce last task's UCLAMP_MAX
sched/uclamp: Add bucket local max tracking
sched/uclamp: Add CPU's clamp buckets refcounting
sched/fair: Rename weighted_cpuload() to cpu_runnable_load()
sched/debug: Export the newly added tracepoints
sched/debug: Add sched_overutilized tracepoint
sched/debug: Add new tracepoint to track PELT at se level
sched/debug: Add new tracepoints to track PELT at rq level
sched/debug: Add a new sched_trace_*() helper functions
sched/autogroup: Make autogroup_path() always available
sched/wait: Deduplicate code with do-while
sched/topology: Remove unused 'sd' parameter from arch_scale_cpu_capacity()
...
Chain keys are computed using Jenkins hash function, which needs an initial
hash to start with. Dedicate a macro to make this clear and configurable. A
later patch changes this initial chain key.
Signed-off-by: Yuyang Du <duyuyang@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bvanassche@acm.org
Cc: frederic@kernel.org
Cc: ming.lei@redhat.com
Cc: will.deacon@arm.com
Link: https://lkml.kernel.org/r/20190506081939.74287-9-duyuyang@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Despite that there is a lockdep_init_task() which does nothing, lockdep
initiates tasks by assigning lockdep fields and does so inconsistently. Fix
this by using lockdep_init_task().
Signed-off-by: Yuyang Du <duyuyang@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bvanassche@acm.org
Cc: frederic@kernel.org
Cc: ming.lei@redhat.com
Cc: will.deacon@arm.com
Link: https://lkml.kernel.org/r/20190506081939.74287-8-duyuyang@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In commit:
4b53a3412d ("sched/core: Remove the tsk_nr_cpus_allowed() wrapper")
the tsk_nr_cpus_allowed() wrapper was removed. There was not
much difference in !RT but in RT we used this to implement
migrate_disable(). Within a migrate_disable() section the CPU mask is
restricted to single CPU while the "normal" CPU mask remains untouched.
As an alternative implementation Ingo suggested to use:
struct task_struct {
const cpumask_t *cpus_ptr;
cpumask_t cpus_mask;
};
with
t->cpus_ptr = &t->cpus_mask;
In -RT we then can switch the cpus_ptr to:
t->cpus_ptr = &cpumask_of(task_cpu(p));
in a migration disabled region. The rules are simple:
- Code that 'uses' ->cpus_allowed would use the pointer.
- Code that 'modifies' ->cpus_allowed would use the direct mask.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20190423142636.14347-1-bigeasy@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAlx+8ZgUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXOlDhAAiGlirQ9syyG2fYzaARZZ2QoU/GGD
PSAeiNmP3jvJzXArCvugRCw+YSNDdQOBM3SrLQC+cM0MAIDRYXN0NdcrsbTchlMA
51Fx1egZ9Fyj+Ehgida3muh2lRUy7DQwMCL6tAVqwz7vYkSTGDUf+MlYqOqXDka5
74pEExOS3Jdi7560BsE8b6QoW9JIJqEJnirXGkG9o2qC0oFHCR6PKxIyQ7TJrLR1
F23aFTqLTH1nbPUQjnox2PTf13iQVh4j2gwzd+9c9KBfxoGSge3dmxId7BJHy2aG
M27fPdCYTNZAGWpPVujsCPAh1WPQ9NQqg3mA9+g14PEbiLqPcqU+kWmnDU7T7bEw
Qx0kt6Y8GiknwCqq8pDbKYclgRmOjSGdfutzd0z8uDpbaeunS4/NqnDb/FUaDVcr
jA4d6ep7qEgHpYbL8KgOeZCexfaTfz6mcwRWNq3Uu9cLZbZqSSQ7PXolMADHvoRs
LS7VH2jcP7q4p4GWmdfjv67xyUUo9HG5HHX74h5pLfQSYXiBWo4ht0UOAzX/6EcE
CJNHAFHv+OanI5Rg/6JQ8b3/bJYxzAJVyLZpCuMtlKk6lYBGNeADk9BezEDIYsm8
tSe4/GqqyR9+Qz8rSdpAZ0KKkfqS535IcHUPUJau7Bzg1xqSEP5gzZN6QsjdXg0+
5wFFfdFICTfJFXo=
=57/1
-----END PGP SIGNATURE-----
Merge tag 'audit-pr-20190305' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit
Pull audit updates from Paul Moore:
"A lucky 13 audit patches for v5.1.
Despite the rather large diffstat, most of the changes are from two
bug fix patches that move code from one Kconfig option to another.
Beyond that bit of churn, the remaining changes are largely cleanups
and bug-fixes as we slowly march towards container auditing. It isn't
all boring though, we do have a couple of new things: file
capabilities v3 support, and expanded support for filtering on
filesystems to solve problems with remote filesystems.
All changes pass the audit-testsuite. Please merge for v5.1"
* tag 'audit-pr-20190305' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
audit: mark expected switch fall-through
audit: hide auditsc_get_stamp and audit_serial prototypes
audit: join tty records to their syscall
audit: remove audit_context when CONFIG_ AUDIT and not AUDITSYSCALL
audit: remove unused actx param from audit_rule_match
audit: ignore fcaps on umount
audit: clean up AUDITSYSCALL prototypes and stubs
audit: more filter PATH records keyed on filesystem magic
audit: add support for fcaps v3
audit: move loginuid and sessionid from CONFIG_AUDITSYSCALL to CONFIG_AUDIT
audit: add syscall information to CONFIG_CHANGE records
audit: hand taken context to audit_kill_trees for syscall logging
audit: give a clue what CONFIG_CHANGE op was involved
Merge misc updates from Andrew Morton:
- a few misc things
- ocfs2 updates
- most of MM
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (159 commits)
tools/testing/selftests/proc/proc-self-syscall.c: remove duplicate include
proc: more robust bulk read test
proc: test /proc/*/maps, smaps, smaps_rollup, statm
proc: use seq_puts() everywhere
proc: read kernel cpu stat pointer once
proc: remove unused argument in proc_pid_lookup()
fs/proc/thread_self.c: code cleanup for proc_setup_thread_self()
fs/proc/self.c: code cleanup for proc_setup_self()
proc: return exit code 4 for skipped tests
mm,mremap: bail out earlier in mremap_to under map pressure
mm/sparse: fix a bad comparison
mm/memory.c: do_fault: avoid usage of stale vm_area_struct
writeback: fix inode cgroup switching comment
mm/huge_memory.c: fix "orig_pud" set but not used
mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC
mm/memcontrol.c: fix bad line in comment
mm/cma.c: cma_declare_contiguous: correct err handling
mm/page_ext.c: fix an imbalance with kmemleak
mm/compaction: pass pgdat to too_many_isolated() instead of zone
mm: remove zone_lru_lock() function, access ->lru_lock directly
...
Patch series "Replace all open encodings for NUMA_NO_NODE", v3.
All these places for replacement were found by running the following
grep patterns on the entire kernel code. Please let me know if this
might have missed some instances. This might also have replaced some
false positives. I will appreciate suggestions, inputs and review.
1. git grep "nid == -1"
2. git grep "node == -1"
3. git grep "nid = -1"
4. git grep "node = -1"
This patch (of 2):
At present there are multiple places where invalid node number is
encoded as -1. Even though implicitly understood it is always better to
have macros in there. Replace these open encodings for an invalid node
number with the global macro NUMA_NO_NODE. This helps remove NUMA
related assumptions like 'invalid node' from various places redirecting
them to a common definition.
Link: http://lkml.kernel.org/r/1545127933-10711-2-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> [ixgbe]
Acked-by: Jens Axboe <axboe@kernel.dk> [mtip32xx]
Acked-by: Vinod Koul <vkoul@kernel.org> [dmaengine.c]
Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc]
Acked-by: Doug Ledford <dledford@redhat.com> [drivers/infiniband]
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Hans Verkuil <hverkuil@xs4all.nl>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
atomic_t variables are currently used to implement reference
counters with the following properties:
- counter is initialized to 1 using atomic_set()
- a resource is freed upon counter reaching zero
- once counter reaches zero, its further
increments aren't allowed
- counter schema uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)
Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.
The variable task_struct.stack_refcount is used as pure reference counter.
Convert it to refcount_t and fix up the operations.
** Important note for maintainers:
Some functions from refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts.
The full comparison can be seen in
https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon
in state to be merged to the documentation tree.
Normally the differences should not matter since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases it might matter.
Please double check that you don't have some undocumented
memory guarantees for this variable usage.
For the task_struct.stack_refcount it might make a difference
in following places:
- try_get_task_stack(): increment in refcount_inc_not_zero() only
guarantees control dependency on success vs. fully ordered
atomic counterpart
- put_task_stack(): decrement in refcount_dec_and_test() only
provides RELEASE ordering and control dependency on success
vs. fully ordered atomic counterpart
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Reviewed-by: Andrea Parri <andrea.parri@amarulasolutions.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akpm@linux-foundation.org
Cc: viro@zeniv.linux.org.uk
Link: https://lkml.kernel.org/r/1547814450-18902-6-git-send-email-elena.reshetova@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
atomic_t variables are currently used to implement reference
counters with the following properties:
- counter is initialized to 1 using atomic_set()
- a resource is freed upon counter reaching zero
- once counter reaches zero, its further
increments aren't allowed
- counter schema uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)
Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.
The variable task_struct.usage is used as pure reference counter.
Convert it to refcount_t and fix up the operations.
** Important note for maintainers:
Some functions from refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts.
The full comparison can be seen in
https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon
in state to be merged to the documentation tree.
Normally the differences should not matter since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases it might matter.
Please double check that you don't have some undocumented
memory guarantees for this variable usage.
For the task_struct.usage it might make a difference
in following places:
- put_task_struct(): decrement in refcount_dec_and_test() only
provides RELEASE ordering and control dependency on success
vs. fully ordered atomic counterpart
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Reviewed-by: Andrea Parri <andrea.parri@amarulasolutions.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akpm@linux-foundation.org
Cc: viro@zeniv.linux.org.uk
Link: https://lkml.kernel.org/r/1547814450-18902-5-git-send-email-elena.reshetova@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
atomic_t variables are currently used to implement reference
counters with the following properties:
- counter is initialized to 1 using atomic_set()
- a resource is freed upon counter reaching zero
- once counter reaches zero, its further
increments aren't allowed
- counter schema uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)
Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.
The variable signal_struct.sigcnt is used as pure reference counter.
Convert it to refcount_t and fix up the operations.
** Important note for maintainers:
Some functions from refcount_t API defined in lib/refcount.c
have different memory ordering guarantees than their atomic
counterparts.
The full comparison can be seen in
https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon
in state to be merged to the documentation tree.
Normally the differences should not matter since refcount_t provides
enough guarantees to satisfy the refcounting use cases, but in
some rare cases it might matter.
Please double check that you don't have some undocumented
memory guarantees for this variable usage.
For the signal_struct.sigcnt it might make a difference
in following places:
- put_signal_struct(): decrement in refcount_dec_and_test() only
provides RELEASE ordering and control dependency on success
vs. fully ordered atomic counterpart
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Reviewed-by: Andrea Parri <andrea.parri@amarulasolutions.com>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: akpm@linux-foundation.org
Cc: viro@zeniv.linux.org.uk
Link: https://lkml.kernel.org/r/1547814450-18902-3-git-send-email-elena.reshetova@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
loginuid and sessionid (and audit_log_session_info) should be part of
CONFIG_AUDIT scope and not CONFIG_AUDITSYSCALL since it is used in
CONFIG_CHANGE, ANOM_LINK, FEATURE_CHANGE (and INTEGRITY_RULE), none of
which are otherwise dependent on AUDITSYSCALL.
Please see github issue
https://github.com/linux-audit/audit-kernel/issues/104
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
[PM: tweaked subject line for better grep'ing]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Wen Yang <wen.yang99@zte.com.cn> and majiang <ma.jiang@zte.com.cn>
report that a periodic signal received during fork can cause fork to
continually restart preventing an application from making progress.
The code was being overly pessimistic. Fork needs to guarantee that a
signal sent to multiple processes is logically delivered before the
fork and just to the forking process or logically delivered after the
fork to both the forking process and it's newly spawned child. For
signals like periodic timers that are always delivered to a single
process fork can safely complete and let them appear to logically
delivered after the fork().
While examining this issue I also discovered that fork today will miss
signals delivered to multiple processes during the fork and handled by
another thread. Similarly the current code will also miss blocked
signals that are delivered to multiple process, as those signals will
not appear pending during fork.
Add a list of each thread that is currently forking, and keep on that
list a signal set that records all of the signals sent to multiple
processes. When fork completes initialize the new processes
shared_pending signal set with it. The calculate_sigpending function
will see those signals and set TIF_SIGPENDING causing the new task to
take the slow path to userspace to handle those signals. Making it
appear as if those signals were received immediately after the fork.
It is not possible to send real time signals to multiple processes and
exceptions don't go to multiple processes, which means that that are
no signals sent to multiple processes that require siginfo. This
means it is safe to not bother collecting siginfo on signals sent
during fork.
The sigaction of a child of fork is initially the same as the
sigaction of the parent process. So a signal the parent ignores the
child will also initially ignore. Therefore it is safe to ignore
signals sent to multiple processes and ignored by the forking process.
Signals sent to only a single process or only a single thread and delivered
during fork are treated as if they are received after the fork, and generally
not dealt with. They won't cause any problems.
V2: Added removal from the multiprocess list on failure.
V3: Use -ERESTARTNOINTR directly
V4: - Don't queue both SIGCONT and SIGSTOP
- Initialize signal_struct.multiprocess in init_task
- Move setting of shared_pending to before the new task
is visible to signals. This prevents signals from comming
in before shared_pending.signal is set to delayed.signal
and being lost.
V5: - rework list add and delete to account for idle threads
v6: - Use sigdelsetmask when removing stop signals
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200447
Reported-by: Wen Yang <wen.yang99@zte.com.cn> and
Reported-by: majiang <ma.jiang@zte.com.cn>
Fixes: 4a2c7a7837 ("[PATCH] make fork() atomic wrt pgrp/session signals")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Everywhere except in the pid array we distinguish between a tasks pid and
a tasks tgid (thread group id). Even in the enumeration we want that
distinction sometimes so we have added __PIDTYPE_TGID. With leader_pid
we almost have an implementation of PIDTYPE_TGID in struct signal_struct.
Add PIDTYPE_TGID as a first class member of the pid_type enumeration and
into the pids array. Then remove the __PIDTYPE_TGID special case and the
leader_pid in signal_struct.
The net size increase is just an extra pointer added to struct pid and
an extra pair of pointers of an hlist_node added to task_struct.
The effect on code maintenance is the removal of a number of special
cases today and the potential to remove many more special cases as
PIDTYPE_TGID gets used to it's fullest. The long term potential
is allowing zombie thread group leaders to exit, which will remove
a lot more special cases in the code.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
To access these fields the code always has to go to group leader so
going to signal struct is no loss and is actually a fundamental simplification.
This saves a little bit of memory by only allocating the pid pointer array
once instead of once for every thread, and even better this removes a
few potential races caused by the fact that group_leader can be changed
by de_thread, while signal_struct can not.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Use a macro, "AUDIT_SID_UNSET", to replace each instance of
initialization and comparison to an audit session ID.
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
There doesn't seem to be any need to have the INIT_SIGNALS and INIT_SIGHAND
macros, so expand them in their single places of use and remove them.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Tested-by: Will Deacon <will.deacon@arm.com> (arm64)
Tested-by: Palmer Dabbelt <palmer@sifive.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Expand various INIT_* macros into the single places they're used in
init/init_task.c and remove them.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Tested-by: Will Deacon <will.deacon@arm.com> (arm64)
Tested-by: Palmer Dabbelt <palmer@sifive.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
It's no longer necessary to have an INIT_TASK() macro, and this can be
expanded into the one place it is now used and removed.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Tested-by: Will Deacon <will.deacon@arm.com> (arm64)
Tested-by: Palmer Dabbelt <palmer@sifive.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Construct the init thread stack in the linker script rather than doing it
by means of a union so that ia64's init_task.c can be got rid of.
The following symbols are then made available from INIT_TASK_DATA() linker
script macro:
init_thread_union
init_stack
INIT_TASK_DATA() also expands the region to THREAD_SIZE to accommodate the
size of the init stack. init_thread_union is given its own section so that
it can be placed into the stack space in the right order. I'm assuming
that the ia64 ordering is correct and that the task_struct is first and the
thread_info second.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Tested-by: Will Deacon <will.deacon@arm.com> (arm64)
Tested-by: Palmer Dabbelt <palmer@sifive.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This was entirely automated, using the script by Al:
PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
$(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)
to do the replacement at the end of the merge window.
Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>