We need to dump task->stack from a kernel module for debugging.
try_get_task_stack() pins task->stack, and try_get_task_stack() and
put_task_stack() must be called in pairs, but put_task_stack() is not
exported. Export it so that modules can drop the stack reference.
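A minimal userspace sketch of the get/put pairing described above. The
fake_task type and the fake_* functions are hypothetical stand-ins, not
the kernel implementation; they only model the reference counting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the parts of task_struct that matter here. */
struct fake_task {
	int stack_refcount;        /* 0 means the stack has been freed */
	unsigned long stack[4];
};

/* Models try_get_task_stack(): pin the stack unless it is already gone. */
static unsigned long *fake_try_get_task_stack(struct fake_task *t)
{
	if (t->stack_refcount == 0)
		return NULL;
	t->stack_refcount++;
	return t->stack;
}

/* Models put_task_stack(): drop the reference taken above. */
static void fake_put_task_stack(struct fake_task *t)
{
	assert(t->stack_refcount > 0);
	t->stack_refcount--;
}

/* Sums the stack words as a placeholder for dumping them. */
static bool fake_dump_stack(struct fake_task *t, unsigned long *sum)
{
	unsigned long *stack = fake_try_get_task_stack(t);
	size_t i;

	if (!stack)
		return false;      /* nothing was pinned, so no put is needed */
	*sum = 0;
	for (i = 0; i < 4; i++)
		*sum += stack[i];
	fake_put_task_stack(t);    /* every successful get needs a matching put */
	return true;
}
```

Without an exported put, a module could take the reference but never
release it, which is the imbalance this change avoids.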
Bug: 192990535
Change-Id: Ifb2f3d16f93039bffeb3e822bc066e42e2d21d13
Signed-off-by: chunhui.li <chunhui.li@mediatek.com>
Export cgroup_add_legacy_cftypes and a helper function to allow vendor
modules to expose additional files in the memory cgroup hierarchy.
Bug: 192052083
Signed-off-by: Liujie Xie <xieliujie@oppo.com>
Change-Id: Ie2b936b3e77c7ab6d740d1bb6d70e03c70a326a7
This vendor hook gives us a point at which to check the currently
running task and validate its credentials and bpf operations.
Bug: 191291287
Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Change-Id: Ie4ed8df7ad66df2486fc7e52a26d9191fc0c176e
android_rvh_sched_fork() and android_rvh_sched_fork_init()
already let us register probes during fork(), but those are
invoked *before* the new task is added to the tasklist, which
can lead to some undesired races when a module is trying to
initialize vendor-specific task_struct fields.
Export the task_newtask tracepoint to register probes to run
during fork() but *after* the task has been inserted into the
tasklist.
Bug: 192873984
Signed-off-by: Jing-Ting Wu <Jing-Ting.Wu@mediatek.com>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Change-Id: Ifef14819264385b5e955a5966b4e4f66d50da5e3
Fix warnings reported by kernelci due to incorrect indentation:
kernel/smp.c:982:3: warning: this ‘if’ clause does not guard
Fixes: f0b280c395 ("ANDROID: cpuidle: Update cpuidle_uninstall_idle_handler()
to wakeup all online CPUs")
Signed-off-by: Todd Kjos <tkjos@google.com>
Change-Id: Ide771342558de321154696f9fe1272750a773853
commit 5f89468e2f upstream.
If a driver wants to sync part of a range at an offset,
swiotlb_tbl_sync_single() copies from the orig_addr base to tlb_addr with
the offset and ends up with a data mismatch.
The offset handling was removed by
"swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single",
but that logic has to be added back in.
From Linus's email:
"That commit which the removed the offset calculation entirely, because the old
(unsigned long)tlb_addr & (IO_TLB_SIZE - 1)
was wrong, but instead of removing it, I think it should have just
fixed it to be
(tlb_addr - mem->start) & (IO_TLB_SIZE - 1);
instead. That way the slot offset always matches the slot index calculation."
(Unfortunately that broke NVMe).
The use case that drivers are hitting is as follows:
1. Get dma_addr_t from dma_map_single()
dma_addr_t tlb_addr = dma_map_single(dev, vaddr, vsize, DMA_TO_DEVICE);

    |<---------------vsize------------->|
    +-----------------------------------+
    |                                   | original buffer
    +-----------------------------------+
  vaddr

 swiotlb_align_offset
     |<----->|<---------------vsize------------->|
     +-------+-----------------------------------+
     |       |                                   | swiotlb buffer
     +-------+-----------------------------------+
          tlb_addr

2. Do something
3. Sync dma_addr_t through dma_sync_single_for_device(..)
dma_sync_single_for_device(dev, tlb_addr + offset, size, DMA_TO_DEVICE);
Error case.
Copy data to original buffer but it is from base addr (instead of
base addr + offset) in original buffer:

 swiotlb_align_offset
     |<----->|<- offset ->|<- size ->|
     +-------+-----------------------------------+
     |       |            |##########|           | swiotlb buffer
     +-------+-----------------------------------+
          tlb_addr

    |<- size ->|
    +-----------------------------------+
    |##########|                        | original buffer
    +-----------------------------------+
  vaddr

The fix is to copy the data to the original buffer and take into
account the offset, like so:

 swiotlb_align_offset
     |<----->|<- offset ->|<- size ->|
     +-------+-----------------------------------+
     |       |            |##########|           | swiotlb buffer
     +-------+-----------------------------------+
          tlb_addr

    |<- offset ->|<- size ->|
    +-----------------------------------+
    |            |##########|           | original buffer
    +-----------------------------------+
  vaddr

[One fix, which was Linus's and made more sense as it created a
symmetry, would break NVMe. The reason is that:
unsigned int offset = (tlb_addr - mem->start) & (IO_TLB_SIZE - 1);
would come up with the proper offset, but it would lose the
alignment (which this patch contains).]
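A minimal userspace sketch of the partial sync shown in the diagrams.
model_sync_partial() is a hypothetical name, and plain byte arrays stand
in for the swiotlb and original buffers; it is not the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * tlb and orig point at the start of their vsize-byte regions
 * (tlb_addr and vaddr in the diagrams). Copy size bytes that start
 * offset bytes into the region.
 */
static void model_sync_partial(unsigned char *orig, const unsigned char *tlb,
			       size_t vsize, size_t offset, size_t size)
{
	assert(offset <= vsize && size <= vsize - offset);
	/*
	 * The error case copied to orig + 0. Applying the same offset on
	 * both sides keeps the data at the position the driver asked for.
	 */
	memcpy(orig + offset, tlb + offset, size);
}
```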
Bug: 192521392
Fixes: 16fc3cef33 ("swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single")
Signed-off-by: Bumyong Lee <bumyong.lee@samsung.com>
Signed-off-by: Chanho Park <chanho61.park@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reported-by: Dominique MARTINET <dominique.martinet@atmark-techno.com>
Reported-by: Horia Geantă <horia.geanta@nxp.com>
Tested-by: Horia Geantă <horia.geanta@nxp.com>
CC: stable@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit e6108147dd of
linux-5.10.47)
Change-Id: Ib03e81080ab029d37e6ff54a3e2cb526d3a30e10
wake_up_all_idle_cpus() will not wake up paused CPUs since they are removed
from cpu_active_mask, but paused CPUs can be in deep cpu idle and hence must
be woken up when uninstalling the idle handler.
Fix this by introducing wake_up_all_online_idle_cpus(), which
unconditionally wakes up all online idle CPUs, and invoking it when
uninstalling the cpu idle handler.
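A rough userspace model of the difference, assuming one bit per CPU in a
32-bit word. The cpu_bits type and the model_* functions are
hypothetical and only illustrate which CPUs each variant would kick.

```c
#include <assert.h>
#include <stdint.h>

/* One bit per CPU; a hypothetical stand-in for struct cpumask. */
typedef uint32_t cpu_bits;

/* Old behaviour: paused CPUs are missing from active, so they are skipped. */
static cpu_bits model_wake_active_idle(cpu_bits online, cpu_bits active,
				       cpu_bits idle, unsigned int self)
{
	assert(self < 32);
	return online & active & idle & ~((cpu_bits)1 << self);
}

/* New behaviour: every online idle CPU except the caller is woken. */
static cpu_bits model_wake_online_idle(cpu_bits online, cpu_bits idle,
				       unsigned int self)
{
	assert(self < 32);
	return online & idle & ~((cpu_bits)1 << self);
}
```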
Bug: 192436062
Fixes: 683010f555 ("ANDROID: cpu/hotplug: add pause/resume_cpus interface")
Change-Id: I4afd4b7a17b87f9cc495e7009c9537888387f9ef
Signed-off-by: Maulik Shah <mkshah@codeaurora.org>
For vendor specific data in struct cfs_rq.
Bug: 188947181
Signed-off-by: Rick Yiu <rickyiu@google.com>
Change-Id: I7c322c6812829c19014426b5721cd1fb0c37a53f
As restricted hooks have been introduced, regular vendor hooks are no
longer necessary.
Bug: 187917024
Change-Id: Ia70e9dd1bd7373e19bdc82e90a2384201076bc0b
Signed-off-by: Shaleen Agrawal <shalagra@codeaurora.org>
select_fallback_rq() must return a cpu that is valid for the task.
However, when nid is not -1, it skips checking for
task_cpu_possible_mask().
This causes a problem when execve-ing 32 bit apps on an asymmetric
system where not all cpus are 32 bit capable. During execve-ing
the task is marked as 32 bit long before its affinity mask is
restricted.
If the cpu goes offline during this time, select_fallback_rq()
could return a 64 bit only cpu, which __migrate_tasks()/
is_cpu_allowed() rejects.
migrate_tasks() will therefore continue to pick the same task
repeatedly, where __migrate_tasks() rejects the cpu chosen
by select_fallback_rq() every time, leading to an infinite loop.
Correct the issue by updating select_fallback_rq() for the case
where nid is not -1, ensuring that the returned cpu is always
valid for this task.
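A simplified userspace model of the corrected node-local selection,
assuming 32-bit CPU masks. model_select_fallback() and its parameters
are hypothetical names, not the scheduler code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pick the first CPU on the task's node that is active and that the task
 * is able to run on. Returns -1 when the node has no such CPU, in which
 * case the caller would fall through to the wider search.
 */
static int model_select_fallback(uint32_t node_cpus, uint32_t active,
				 uint32_t task_possible)
{
	/* The missing piece was the task_possible term. */
	uint32_t ok = node_cpus & active & task_possible;
	int cpu;

	for (cpu = 0; cpu < 32; cpu++)
		if (ok & ((uint32_t)1 << cpu))
			return cpu;
	return -1;
}
```

With CPUs 0-3 on the node, CPU 0 offline and only CPUs 2-3 able to run
32 bit tasks, the model returns CPU 2 rather than the 64 bit only CPU 1.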
Bug: 192050156
Change-Id: Ia073a8395a02485f6d1c1daa0f3ce9e2029cb1f4
Signed-off-by: Stephen Dickey <dickey@codeaurora.org>
Vendor modules invoke cpufreq_update_util() to update cpufreq,
but building such modules fails with:
ERROR: modpost: "cpufreq_update_util_data" [xxx.ko] undefined!
Export cpufreq_update_util_data to fix this.
Bug: 192218676
Signed-off-by: Liujie Xie <xieliujie@oppo.com>
Change-Id: Ib1da70229f04b08d8d812d065021dec0bf891e0e
Pre and post tracepoints in force_compatible_cpus_allowed_ptr() need
to be restricted hooks so that they can sleep.
The old non-restricted versions need to stay in place temporarily for
KMI stability. They will be removed by aosp/1742588.
Bug: 187917024
Change-Id: If630554b1c8fa2e8ccb79c89945c55e17756e6a8
Signed-off-by: Shaleen Agrawal <shalagra@codeaurora.org>
PSI accounts stalls for each cgroup separately and aggregates it at each
level of the hierarchy. This causes additional overhead with psi_avgs_work
being called for each cgroup in the hierarchy. psi_avgs_work has been
highly optimized, however on systems with a large number of cgroups the
overhead becomes noticeable.
Systems which use PSI only at the system level could avoid this overhead
if PSI can be configured to skip per-cgroup stall accounting.
Add "cgroup_disable=pressure" kernel command-line option to allow
requesting system-wide only pressure stall accounting. When set, it
keeps system-wide accounting under /proc/pressure/ but skips accounting
for individual cgroups and does not expose PSI nodes in cgroup hierarchy.
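A minimal userspace sketch of recognizing the new token, assuming the
cgroup_disable= value is a comma-separated list. The function name
model_pressure_disabled() is hypothetical and this is not the kernel
parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if the option value contains the exact token "pressure". */
static bool model_pressure_disabled(const char *value)
{
	const size_t want = strlen("pressure");
	const char *p = value;

	assert(value != NULL);
	while (*p) {
		size_t len = strcspn(p, ",");   /* length of the current token */

		if (len == want && strncmp(p, "pressure", want) == 0)
			return true;
		p += len;
		if (*p == ',')
			p++;                    /* step over the separator */
	}
	return false;
}
```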
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/patchwork/patch/1435705
(cherry picked from commit 3958e2d0c3 https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git tj)
Bug: 178872719
Bug: 191734423
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Change-Id: Ifc8fbc52f9a1131d7c2668edbb44c525c76c3360
Some interrupts (such as the rescheduling IPI) rely on not going through
the irq_enter()/irq_exit() calls. To distinguish such interrupts, add
a new IRQ flag that allows the low-level handling code to sidestep the
enter()/exit() calls.
Only the architecture code is expected to use this. It will do the wrong
thing on normal interrupts. Note that this is a band-aid until we can
move to some more correct infrastructure (such as kernel/entry/common.c).
Bug: 191808738
Link: https://lore.kernel.org/lkml/20201124141449.572446-3-maz@kernel.org/
Change-Id: I0609a8b689219ba9e769c8b9f7fcf1e77a0ff1ca
Signed-off-by: Marc Zyngier <maz@kernel.org>
[minor port to 5.10]
Signed-off-by: Stephen Dickey <dickey@codeaurora.org>
Some arch-specific flags need to be set/cleared, but not exposed to
random device drivers. Introduce a new helper (__irq_modify_status())
that takes an arbitrary mask, and rewrite irq_modify_status() to use
this new helper.
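A small userspace sketch of the helper/wrapper split. The MODEL_* flag
values and the model_* function names are hypothetical; only the shape
(a masked helper plus a driver-facing wrapper) follows the text above.

```c
#include <assert.h>

/* Hypothetical layout: low bits are driver-visible, one bit is arch-only. */
#define MODEL_DRIVER_MASK 0x0fUL
#define MODEL_ARCH_FLAG   0x10UL

/* Helper taking an arbitrary mask: only bits inside mask may change. */
static unsigned long model_modify_status_masked(unsigned long status,
						unsigned long clr,
						unsigned long set,
						unsigned long mask)
{
	status &= ~(clr & mask);
	status |= (set & mask);
	return status;
}

/* Driver-facing wrapper: behaves as before, limited to the driver mask. */
static unsigned long model_modify_status(unsigned long status,
					 unsigned long clr,
					 unsigned long set)
{
	return model_modify_status_masked(status, clr, set, MODEL_DRIVER_MASK);
}
```

Arch code would call the masked helper with a wider mask, while drivers
keep using the wrapper and cannot touch the arch-only bit.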
No functional change.
Bug: 191808738
Link: https://lore.kernel.org/lkml/20201124141449.572446-5-maz@kernel.org/
Change-Id: I2c2c0d6599d0ab39fad22462bf4c87694362fba8
Signed-off-by: Marc Zyngier <maz@kernel.org>
[minor port to 5.10]
Signed-off-by: Stephen Dickey <dickey@codeaurora.org>
Add a vendor hook to qos.c to handle some special cases related to our
feature. The hook is placed in freq_qos_add_request() and
freq_qos_remove_request() so that we can divert to our own qos handling
logic.
Bug: 187458531
Signed-off-by: heshuai1 <heshuai1@xiaomi.com>
Change-Id: I1fb8fd6134432ecfb44ad242c66ccd8280ab9b43
Proactive compaction[1] is triggered every 500 msec and runs
compaction on the node for COMPACTION_HPAGE_ORDER (usually order-9)
pages, based on the value of sysctl.compaction_proactiveness.
Triggering compaction every 500 msec in search of
COMPACTION_HPAGE_ORDER pages is not needed for all applications,
especially in embedded use cases where the system may have only a few MB
of RAM. Enabling proactive compaction in its current state means it will
end up running almost all the time on such systems.
On the other hand, proactive compaction can still be very useful for
obtaining a set of higher-order pages in a controllable manner
(controlled through sysctl.compaction_proactiveness). So on systems
where keeping proactive compaction always enabled may prove
unnecessary, it can instead be triggered from user space by a write to
its sysctl interface. For example, an app launcher that is about to
launch a memory-heavy application, which starts faster when more
higher-order pages are available, can prepare the system in advance by
triggering proactive compaction from user space.
Proactive compaction is triggered on a user's write to
sysctl.compaction_proactiveness.
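A minimal sketch of how a launcher might issue that write from user
space. The helper name write_proactiveness() is hypothetical, and the
path /proc/sys/vm/compaction_proactiveness in the usage note is an
assumption about where the sysctl is exposed.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write value, followed by a newline, to the sysctl file at path.
 * Returns 0 on success and -1 on any error.
 */
static int write_proactiveness(const char *path, unsigned int value)
{
	FILE *f;
	int ok;

	assert(path != NULL);
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fprintf(f, "%u\n", value) > 0;
	if (fclose(f) != 0)         /* buffered write errors surface here */
		ok = 0;
	return ok ? 0 : -1;
}
```

A launcher could call
write_proactiveness("/proc/sys/vm/compaction_proactiveness", 20) shortly
before starting a memory-heavy application; the value 20 is only an
example.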
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=facdaa917c4d5a376d09d25865f5a863f906234a
Bug: 186387247
Link: https://lore.kernel.org/patchwork/patch/1438211/
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
Change-Id: Ie5208e274b9d7e7354471bb98ff1f10becf93595
Changes in 5.10.43
btrfs: tree-checker: do not error out if extent ref hash doesn't match
net: usb: cdc_ncm: don't spew notifications
hwmon: (dell-smm-hwmon) Fix index values
hwmon: (pmbus/isl68137) remove READ_TEMPERATURE_3 for RAA228228
netfilter: conntrack: unregister ipv4 sockopts on error unwind
efi/fdt: fix panic when no valid fdt found
efi: Allow EFI_MEMORY_XP and EFI_MEMORY_RO both to be cleared
efi/libstub: prevent read overflow in find_file_option()
efi: cper: fix snprintf() use in cper_dimm_err_location()
vfio/pci: Fix error return code in vfio_ecap_init()
vfio/pci: zap_vma_ptes() needs MMU
samples: vfio-mdev: fix error handing in mdpy_fb_probe()
vfio/platform: fix module_put call in error flow
ipvs: ignore IP_VS_SVC_F_HASHED flag when adding service
HID: logitech-hidpp: initialize level variable
HID: pidff: fix error return code in hid_pidff_init()
HID: i2c-hid: fix format string mismatch
devlink: Correct VIRTUAL port to not have phys_port attributes
net/sched: act_ct: Offload connections with commit action
net/sched: act_ct: Fix ct template allocation for zone 0
mptcp: always parse mptcp options for MPC reqsk
nvme-rdma: fix in-casule data send for chained sgls
ACPICA: Clean up context mutex during object deletion
perf probe: Fix NULL pointer dereference in convert_variable_location()
net: dsa: tag_8021q: fix the VLAN IDs used for encoding sub-VLANs
net: sock: fix in-kernel mark setting
net/tls: Replace TLS_RX_SYNC_RUNNING with RCU
net/tls: Fix use-after-free after the TLS device goes down and up
net/mlx5e: Fix incompatible casting
net/mlx5: Check firmware sync reset requested is set before trying to abort it
net/mlx5e: Check for needed capability for cvlan matching
net/mlx5: DR, Create multi-destination flow table with level less than 64
nvmet: fix freeing unallocated p2pmem
netfilter: nft_ct: skip expectations for confirmed conntrack
netfilter: nfnetlink_cthelper: hit EBUSY on updates if size mismatches
drm/i915/selftests: Fix return value check in live_breadcrumbs_smoketest()
bpf: Simplify cases in bpf_base_func_proto
bpf, lockdown, audit: Fix buggy SELinux lockdown permission checks
ieee802154: fix error return code in ieee802154_add_iface()
ieee802154: fix error return code in ieee802154_llsec_getparams()
igb: add correct exception tracing for XDP
ixgbevf: add correct exception tracing for XDP
cxgb4: fix regression with HASH tc prio value update
ipv6: Fix KASAN: slab-out-of-bounds Read in fib6_nh_flush_exceptions
ice: Fix allowing VF to request more/less queues via virtchnl
ice: Fix VFR issues for AVF drivers that expect ATQLEN cleared
ice: handle the VF VSI rebuild failure
ice: report supported and advertised autoneg using PHY capabilities
ice: Allow all LLDP packets from PF to Tx
i2c: qcom-geni: Add shutdown callback for i2c
cxgb4: avoid link re-train during TC-MQPRIO configuration
i40e: optimize for XDP_REDIRECT in xsk path
i40e: add correct exception tracing for XDP
ice: simplify ice_run_xdp
ice: optimize for XDP_REDIRECT in xsk path
ice: add correct exception tracing for XDP
ixgbe: optimize for XDP_REDIRECT in xsk path
ixgbe: add correct exception tracing for XDP
arm64: dts: ti: j7200-main: Mark Main NAVSS as dma-coherent
optee: use export_uuid() to copy client UUID
bus: ti-sysc: Fix am335x resume hang for usb otg module
arm64: dts: ls1028a: fix memory node
arm64: dts: zii-ultra: fix 12V_MAIN voltage
arm64: dts: freescale: sl28: var4: fix RGMII clock and voltage
ARM: dts: imx7d-meerkat96: Fix the 'tuning-step' property
ARM: dts: imx7d-pico: Fix the 'tuning-step' property
ARM: dts: imx: emcon-avari: Fix nxp,pca8574 #gpio-cells
bus: ti-sysc: Fix flakey idling of uarts and stop using swsup_sidle_act
tipc: add extack messages for bearer/media failure
tipc: fix unique bearer names sanity check
serial: stm32: fix threaded interrupt handling
riscv: vdso: fix and clean-up Makefile
io_uring: fix link timeout refs
io_uring: use better types for cflags
drm/amdgpu/vcn3: add cancel_delayed_work_sync before power gate
drm/amdgpu/jpeg2.5: add cancel_delayed_work_sync before power gate
drm/amdgpu/jpeg3: add cancel_delayed_work_sync before power gate
Bluetooth: fix the erroneous flush_work() order
Bluetooth: use correct lock to prevent UAF of hdev object
wireguard: do not use -O3
wireguard: peer: allocate in kmem_cache
wireguard: use synchronize_net rather than synchronize_rcu
wireguard: selftests: remove old conntrack kconfig value
wireguard: selftests: make sure rp_filter is disabled on vethc
wireguard: allowedips: initialize list head in selftest
wireguard: allowedips: remove nodes in O(1)
wireguard: allowedips: allocate nodes in kmem_cache
wireguard: allowedips: free empty intermediate nodes when removing single node
net: caif: added cfserl_release function
net: caif: add proper error handling
net: caif: fix memory leak in caif_device_notify
net: caif: fix memory leak in cfusbl_device_notify
HID: i2c-hid: Skip ELAN power-on command after reset
HID: magicmouse: fix NULL-deref on disconnect
HID: multitouch: require Finger field to mark Win8 reports as MT
gfs2: fix scheduling while atomic bug in glocks
ALSA: timer: Fix master timer notification
ALSA: hda: Fix for mute key LED for HP Pavilion 15-CK0xx
ALSA: hda: update the power_state during the direct-complete
ARM: dts: imx6dl-yapp4: Fix RGMII connection to QCA8334 switch
ARM: dts: imx6q-dhcom: Add PU,VDD1P1,VDD2P5 regulators
ext4: fix memory leak in ext4_fill_super
ext4: fix bug on in ext4_es_cache_extent as ext4_split_extent_at failed
ext4: fix fast commit alignment issues
ext4: fix memory leak in ext4_mb_init_backend on error path.
ext4: fix accessing uninit percpu counter variable with fast_commit
usb: dwc2: Fix build in periphal-only mode
pid: take a reference when initializing `cad_pid`
ocfs2: fix data corruption by fallocate
mm/debug_vm_pgtable: fix alignment for pmd/pud_advanced_tests()
mm/page_alloc: fix counting of free pages after take off from buddy
x86/cpufeatures: Force disable X86_FEATURE_ENQCMD and remove update_pasid()
x86/sev: Check SME/SEV support in CPUID first
nfc: fix NULL ptr dereference in llcp_sock_getname() after failed connect
drm/amdgpu: Don't query CE and UE errors
drm/amdgpu: make sure we unpin the UVD BO
x86/apic: Mark _all_ legacy interrupts when IO/APIC is missing
powerpc/kprobes: Fix validation of prefixed instructions across page boundary
btrfs: mark ordered extent and inode with error if we fail to finish
btrfs: fix error handling in btrfs_del_csums
btrfs: return errors from btrfs_del_csums in cleanup_ref_head
btrfs: fixup error handling in fixup_inode_link_counts
btrfs: abort in rename_exchange if we fail to insert the second ref
btrfs: fix deadlock when cloning inline extents and low on available space
mm, hugetlb: fix simple resv_huge_pages underflow on UFFDIO_COPY
drm/msm/dpu: always use mdp device to scale bandwidth
btrfs: fix unmountable seed device after fstrim
KVM: SVM: Truncate GPR value for DR and CR accesses in !64-bit mode
KVM: arm64: Fix debug register indexing
x86/kvm: Teardown PV features on boot CPU as well
x86/kvm: Disable kvmclock on all CPUs on shutdown
x86/kvm: Disable all PV features on crash
lib/lz4: explicitly support in-place decompression
i2c: qcom-geni: Suspend and resume the bus during SYSTEM_SLEEP_PM ops
netfilter: nf_tables: missing error reporting for not selected expressions
xen-netback: take a reference to the RX task thread
neighbour: allow NUD_NOARP entries to be forced GCed
Linux 5.10.43
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I8d7ec0878193e4e454076809b7fb71fcc4e3d810
Export the symbol cpuset_cpus_allowed() so that kernel modules can
perform cpuset operations in vendor hook related code.
Bug: 189725786
Signed-off-by: lijianzhong <lijianzhong@xiaomi.com>
Change-Id: I7919a893ab64bb441ab43cbb0b16825ed76d802d
[ Upstream commit ff40e51043 ]
Commit 59438b4647 ("security,lockdown,selinux: implement SELinux lockdown")
added an implementation of the locked_down LSM hook to SELinux, with the aim
to restrict which domains are allowed to perform operations that would breach
lockdown. This is indirectly also getting audit subsystem involved to report
events. The latter is problematic, as reported by Ondrej and Serhei, since it
can bring down the whole system via audit:
1) The audit events that are triggered due to calls to security_locked_down()
can OOM kill a machine, see below details [0].
2) It also seems to be causing a deadlock via avc_has_perm()/slow_avc_audit()
when trying to wake up kauditd, for example, when using trace_sched_switch()
tracepoint, see details in [1]. Triggering this was not via some hypothetical
corner case, but with existing tools like runqlat & runqslower from bcc, for
example, which make use of this tracepoint. Rough call sequence goes like:
rq_lock(rq) -> -------------------------+
trace_sched_switch() -> |
bpf_prog_xyz() -> +-> deadlock
selinux_lockdown() -> |
audit_log_end() -> |
wake_up_interruptible() -> |
try_to_wake_up() -> |
rq_lock(rq) --------------+
What's worse is that the intention of 59438b4647 to further restrict lockdown
settings for specific applications in respect to the global lockdown policy is
completely broken for BPF. The SELinux policy rule for the current lockdown check
looks something like this:
allow <who> <who> : lockdown { <reason> };
However, this doesn't match with the 'current' task where the security_locked_down()
is executed, example: httpd does a syscall. There is a tracing program attached
to the syscall which triggers a BPF program to run, which ends up doing a
bpf_probe_read_kernel{,_str}() helper call. The selinux_lockdown() hook does
the permission check against 'current', that is, httpd in this example. httpd
has literally zero relation to this tracing program, and it would be nonsensical
having to write an SELinux policy rule against httpd to let the tracing helper
pass. The policy in this case needs to be against the entity that is installing
the BPF program. For example, if bpftrace would generate a histogram of syscall
counts by user space application:
bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }'
bpftrace would then go and generate a BPF program from this internally. One way
of doing it [for the sake of the example] could be to call bpf_get_current_task()
helper and then access current->comm via one of bpf_probe_read_kernel{,_str}()
helpers. So the program itself has nothing to do with httpd or any other random
app doing a syscall here. The BPF program _explicitly initiated_ the lockdown
check. The allow/deny policy belongs in the context of bpftrace: meaning, you
want to grant bpftrace access to use these helpers, but other tracers on the
system like my_random_tracer _not_.
Therefore fix all three issues at the same time by taking a completely different
approach for the security_locked_down() hook, that is, move the check into the
program verification phase where we actually retrieve the BPF func proto. This
also reliably gets the task (current) that is trying to install the BPF tracing
program, e.g. bpftrace/bcc/perf/systemtap/etc, and it also fixes the OOM since
we're moving this out of the BPF helper's fast-path which can be called several
millions of times per second.
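A toy userspace model of that move, with hypothetical model_* names: the
policy check runs once when the helper is looked up, and the helper
itself no longer performs it on every call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static unsigned long model_lockdown_checks; /* how often the policy hook ran */
static bool model_locked_down;              /* hypothetical policy state */

/* Stands in for security_locked_down(); counts its invocations. */
static bool model_security_locked_down(void)
{
	model_lockdown_checks++;
	return model_locked_down;
}

typedef long (*model_helper_fn)(void);

/* Fast path: called for every event, performs no policy check. */
static long model_probe_read(void)
{
	return 0;
}

/* Load time: checked once, in the context of whoever installs the program. */
static model_helper_fn model_get_func_proto(void)
{
	return model_security_locked_down() ? NULL : model_probe_read;
}
```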
The check is then also in line with other security_locked_down() hooks in the
system where the enforcement is performed at open/load time, for example,
open_kcore() for /proc/kcore access or module_sig_check() for module signatures
just to pick few random ones. What's out of scope in the fix as well as in
other security_locked_down() hook locations /outside/ of BPF subsystem is that
if the lockdown policy changes on the fly there is no retrospective action.
This requires a different discussion, potentially complex infrastructure, and
it's also not clear whether this can be solved generically. Either way, it is
out of scope for a suitable stable fix which this one is targeting. Note that
the breakage is specifically on 59438b4647 where it started to rely on 'current'
as UAPI behavior, and _not_ earlier infrastructure such as 9d1f8be5cf ("bpf:
Restrict bpf when kernel lockdown is in confidentiality mode").
[0] https://bugzilla.redhat.com/show_bug.cgi?id=1955585, Jakub Hrozek says:
I starting seeing this with F-34. When I run a container that is traced with
BPF to record the syscalls it is doing, auditd is flooded with messages like:
type=AVC msg=audit(1619784520.593:282387): avc: denied { confidentiality }
for pid=476 comm="auditd" lockdown_reason="use of bpf to read kernel RAM"
scontext=system_u:system_r:auditd_t:s0 tcontext=system_u:system_r:auditd_t:s0
tclass=lockdown permissive=0
This seems to be leading to auditd running out of space in the backlog buffer
and eventually OOMs the machine.
[...]
auditd running at 99% CPU presumably processing all the messages, eventually I get:
Apr 30 12:20:42 fedora kernel: audit: backlog limit exceeded
Apr 30 12:20:42 fedora kernel: audit: backlog limit exceeded
Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152579 > audit_backlog_limit=64
Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152626 > audit_backlog_limit=64
Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152694 > audit_backlog_limit=64
Apr 30 12:20:42 fedora kernel: audit: audit_lost=6878426 audit_rate_limit=0 audit_backlog_limit=64
Apr 30 12:20:45 fedora kernel: oci-seccomp-bpf invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-1000
Apr 30 12:20:45 fedora kernel: CPU: 0 PID: 13284 Comm: oci-seccomp-bpf Not tainted 5.11.12-300.fc34.x86_64 #1
Apr 30 12:20:45 fedora kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
[...]
[1] https://lore.kernel.org/linux-audit/CANYvDQN7H5tVp47fbYcRasv4XF07eUbsDwT_eDCHXJUj43J7jQ@mail.gmail.com/,
Serhei Makarov says:
Upstream kernel 5.11.0-rc7 and later was found to deadlock during a
bpf_probe_read_compat() call within a sched_switch tracepoint. The problem
is reproducible with the reg_alloc3 testcase from SystemTap's BPF backend
testsuite on x86_64 as well as the runqlat, runqslower tools from bcc on
ppc64le. Example stack trace:
[...]
[ 730.868702] stack backtrace:
[ 730.869590] CPU: 1 PID: 701 Comm: in:imjournal Not tainted, 5.12.0-0.rc2.20210309git144c79ef3353.166.fc35.x86_64 #1
[ 730.871605] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014
[ 730.873278] Call Trace:
[ 730.873770] dump_stack+0x7f/0xa1
[ 730.874433] check_noncircular+0xdf/0x100
[ 730.875232] __lock_acquire+0x1202/0x1e10
[ 730.876031] ? __lock_acquire+0xfc0/0x1e10
[ 730.876844] lock_acquire+0xc2/0x3a0
[ 730.877551] ? __wake_up_common_lock+0x52/0x90
[ 730.878434] ? lock_acquire+0xc2/0x3a0
[ 730.879186] ? lock_is_held_type+0xa7/0x120
[ 730.880044] ? skb_queue_tail+0x1b/0x50
[ 730.880800] _raw_spin_lock_irqsave+0x4d/0x90
[ 730.881656] ? __wake_up_common_lock+0x52/0x90
[ 730.882532] __wake_up_common_lock+0x52/0x90
[ 730.883375] audit_log_end+0x5b/0x100
[ 730.884104] slow_avc_audit+0x69/0x90
[ 730.884836] avc_has_perm+0x8b/0xb0
[ 730.885532] selinux_lockdown+0xa5/0xd0
[ 730.886297] security_locked_down+0x20/0x40
[ 730.887133] bpf_probe_read_compat+0x66/0xd0
[ 730.887983] bpf_prog_250599c5469ac7b5+0x10f/0x820
[ 730.888917] trace_call_bpf+0xe9/0x240
[ 730.889672] perf_trace_run_bpf_submit+0x4d/0xc0
[ 730.890579] perf_trace_sched_switch+0x142/0x180
[ 730.891485] ? __schedule+0x6d8/0xb20
[ 730.892209] __schedule+0x6d8/0xb20
[ 730.892899] schedule+0x5b/0xc0
[ 730.893522] exit_to_user_mode_prepare+0x11d/0x240
[ 730.894457] syscall_exit_to_user_mode+0x27/0x70
[ 730.895361] entry_SYSCALL_64_after_hwframe+0x44/0xae
[...]
Fixes: 59438b4647 ("security,lockdown,selinux: implement SELinux lockdown")
Reported-by: Ondrej Mosnacek <omosnace@redhat.com>
Reported-by: Jakub Hrozek <jhrozek@redhat.com>
Reported-by: Serhei Makarov <smakarov@redhat.com>
Reported-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: James Morris <jamorris@linux.microsoft.com>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Frank Eigler <fche@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/bpf/01135120-8bf7-df2e-cff0-1d73f1f841c3@iogearbox.net
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 61ca36c8c4 ]
!perfmon_capable() is checked before the last switch(func_id) in
bpf_base_func_proto. Thus, the cases BPF_FUNC_trace_printk and
BPF_FUNC_snprintf_btf can be moved to that last switch(func_id) to omit
the inline !perfmon_capable() checks.
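A schematic userspace model of the resulting control flow. The enum
values, return codes and function name are hypothetical; only the
structure (one capability gate ahead of the last switch) mirrors the
description.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper ids, for illustration only. */
enum model_func_id {
	MODEL_FUNC_ktime_get_ns,
	MODEL_FUNC_trace_printk,
	MODEL_FUNC_snprintf_btf,
};

/* Returns a nonzero proto handle, or 0 when the helper is not available. */
static int model_base_func_proto(enum model_func_id id, bool perfmon_capable)
{
	switch (id) {
	case MODEL_FUNC_ktime_get_ns:
		return 1;            /* available without extra capability */
	default:
		break;
	}

	if (!perfmon_capable)        /* single gate for everything below */
		return 0;

	switch (id) {
	case MODEL_FUNC_trace_printk:
		return 2;            /* no per-case capability check needed */
	case MODEL_FUNC_snprintf_btf:
		return 3;
	default:
		return 0;
	}
}
```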
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210127174615.3038-1-tklauser@distanz.ch
Signed-off-by: Sasha Levin <sashal@kernel.org>
Add a vendor hook to user.c. Because of some special cases related to
our feature, we need to initialize the variables that we define in
user_struct, so we add the hook in alloc_uid() to make sure our own
logic runs when the user_struct is about to be initialized.
Bug: 187458531
Signed-off-by: heshuai1 <heshuai1@xiaomi.com>
Change-Id: I078484aac2c3d396aba5971d6d0f491652f3781c
and sched_waking to let module probe them
Get task info about sleep and waking
Bug: 190422437
Signed-off-by: Liujie Xie <xieliujie@oppo.com>
Change-Id: I828c93f531f84e6133c2c3a7f8faada51683afcf
Module code would like to hold some locks when affinity is being updated
for 32 bit task exec.
Create pre and post tracepoints in force_compatible_cpus_allowed_ptr().
Bug: 187917024
Change-Id: I95bff9f4d5b5d37c1d5440acbd6857d2855c2b43
Signed-off-by: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
Signed-off-by: Shaleen Agrawal <shalagra@codeaurora.org>
Add a vendor hook to freezer.c. Because of some special cases related to
our feature, we do not want the process to be frozen immediately, so we
add the hook in __refrigerator() to make sure our own freeze logic runs
when the process is about to be frozen.
Bug: 187458531
Signed-off-by: heshuai1 <heshuai1@xiaomi.com>
Change-Id: Iea42fd9604d6b33ccd6502425416f0dd28eecebb
Add android_rvh_find_new_ilb to let vendors select the next ilb cpu.
Bug: 190228983
Change-Id: Iba1a0cd9cdc22dcf628dd33f8d838fe513a4818f
Signed-off-by: Choonghoon Park <choong.park@samsung.com>
Add ANDROID_OEM_DATA to struct rq, which is used to implement OEM
scheduler tuning.
Bug: 188899490
Change-Id: I1904b4fd83effc4b309bfb98811e9718398504f4
Signed-off-by: Liangliang Li <liliangliang@vivo.com>
With the introduction of per-cpu wakeup devices that can be used in
preference to the broadcast timer, print the name of such devices when
they are available.
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210524221818.15850-6-will@kernel.org
(cherry picked from commit 245a057fee tip/tip.git timers/core)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 185092876
Change-Id: I39736cb43702430b722382c802603fdc4188a5c4
When configuring the broadcast timer on entry to and exit from deep idle
states, prefer a per-CPU wakeup timer if one exists.
On entry to idle, stop the tick device and transfer the next event into
the oneshot wakeup device, which will serve as the wakeup from idle. To
avoid the overhead of additional hardware accesses on exit from idle,
leave the timer armed and treat the inevitable interrupt as a (possibly
spurious) tick event.
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210524221818.15850-5-will@kernel.org
(cherry picked from commit ea5c7f1b9a tip/tip.git timers/core)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 185092876
Change-Id: I62a49231e213285f95e9f0cf6a07633984930b56
Some SoCs have two per-cpu timer implementations where the timer with the
higher rating stops in deep idle (i.e. suffers from CLOCK_EVT_FEAT_C3STOP)
but is otherwise preferable to the timer with the lower rating. In such a
design, selecting the higher rated devices relies on a global broadcast
timer and IPIs to wake up from deep idle states.
To avoid the reliance on a global broadcast timer and also to reduce the
overhead associated with the IPI wakeups, extend
tick_install_broadcast_device() to manage per-cpu wakeup timers separately
from the broadcast device.
For now, these timers remain unused.
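
As a rough illustration of what managing wakeup timers separately could
involve, the sketch below filters candidate devices in userspace. The
eligibility rules are assumptions drawn from the description above (a
timer that stops in deep idle cannot wake the CPU from it, and a
higher-rated eligible timer is preferred); the FEAT_* bits, struct
timer_dev and wakeup_candidate_ok() are hypothetical and do not
reproduce the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative feature bits for this model only. */
#define FEAT_ONESHOT 0x1u
#define FEAT_C3STOP  0x2u   /* timer stops in deep idle */
#define FEAT_PERCPU  0x4u

struct timer_dev {
    const char *name;
    unsigned int features;
    int rating;
};

/* Return true if dev may replace cur as this CPU's wakeup timer. */
bool wakeup_candidate_ok(const struct timer_dev *cur,
                         const struct timer_dev *dev)
{
    /* A timer that stops in deep idle cannot wake us from it. */
    if (dev->features & FEAT_C3STOP)
        return false;

    /* Assumed requirement: a per-CPU timer with oneshot support. */
    if (!(dev->features & FEAT_PERCPU) || !(dev->features & FEAT_ONESHOT))
        return false;

    /* Accept the first eligible device, then only higher-rated ones. */
    return cur == NULL || dev->rating > cur->rating;
}
```

The higher-rated C3STOP timer stays in use as the tick device; only the
wakeup role goes to the lower-rated timer that keeps running in deep
idle.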
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210524221818.15850-4-will@kernel.org
(cherry picked from commit c94a8537df tip/tip.git timers/core)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 185092876
Change-Id: I2d2b1bc6333d004846270d3e58dec0dca89a89d1
In preparation for adding support for per-cpu wakeup timers, split
_tick_broadcast_oneshot_control() into a helper function which deals
only with the broadcast timer management across idle transitions.
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210524221818.15850-3-will@kernel.org
(cherry picked from commit e5007c288e tip/tip.git timers/core)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 185092876
Change-Id: I14c5456ec1af0b8f73c85e9571f171ea1c3c564b
tick-broadcast.o is only built if CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
so remove the redundant #ifdef guards around the definition of
tick_receive_broadcast().
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210524221818.15850-2-will@kernel.org
(cherry picked from commit c2d4fee3f6 tip/tip.git timers/core)
Signed-off-by: Will Deacon <willdeacon@google.com>
Bug: 185092876
Change-Id: Ifb18a53bfad8d228f6f76c9b74c9bca8de27759a
Add a vendor hook to determine whether the memory of a process that has
received SIGKILL can be reaped.
Bug: 189803002
Change-Id: Ie6802b9bf93ddffb0ceef615d7cca40c23219e57
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
This reverts commit a7af91adc7 ("ANDROID: mm: oom_kill: reap memory of
a task that receives SIGKILL"), as this functionality has moved to a
vendor-hook-based approach.
Bug: 189803002
Signed-off-by: Charan Teja Reddy <charante@codeaurora.org>
Change-Id: Ica0d6df22fb81bf430e9b4c7d12b36ab7d44dab8
Add a wake flag specific to Android vendors.
Bug: 189858948
Signed-off-by: Namkyu Kim <namkyu78.kim@samsung.com>
Change-Id: Idc23c1c47f7d83b298c0b2560859f1ce2761fd85
Changes in 5.10.42
ALSA: hda/realtek: the bass speaker can't output sound on Yoga 9i
ALSA: hda/realtek: Headphone volume is controlled by Front mixer
ALSA: hda/realtek: Chain in pop reduction fixup for ThinkStation P340
ALSA: hda/realtek: fix mute/micmute LEDs for HP 855 G8
ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP Zbook G8
ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP Zbook Fury 15 G8
ALSA: hda/realtek: fix mute/micmute LEDs and speaker for HP Zbook Fury 17 G8
ALSA: usb-audio: scarlett2: Fix device hang with ehci-pci
ALSA: usb-audio: scarlett2: Improve driver startup messages
cifs: set server->cipher_type to AES-128-CCM for SMB3.0
NFSv4: Fix a NULL pointer dereference in pnfs_mark_matching_lsegs_return()
iommu/vt-d: Fix sysfs leak in alloc_iommu()
perf intel-pt: Fix sample instruction bytes
perf intel-pt: Fix transaction abort handling
perf scripts python: exported-sql-viewer.py: Fix copy to clipboard from Top Calls by elapsed Time report
perf scripts python: exported-sql-viewer.py: Fix Array TypeError
perf scripts python: exported-sql-viewer.py: Fix warning display
proc: Check /proc/$pid/attr/ writes against file opener
net: hso: fix control-request directions
net/sched: fq_pie: re-factor fix for fq_pie endless loop
net/sched: fq_pie: fix OOB access in the traffic path
netfilter: nft_set_pipapo_avx2: Add irq_fpu_usable() check, fallback to non-AVX2 version
mac80211: assure all fragments are encrypted
mac80211: prevent mixed key and fragment cache attacks
mac80211: properly handle A-MSDUs that start with an RFC 1042 header
cfg80211: mitigate A-MSDU aggregation attacks
mac80211: drop A-MSDUs on old ciphers
mac80211: add fragment cache to sta_info
mac80211: check defrag PN against current frame
mac80211: prevent attacks on TKIP/WEP as well
mac80211: do not accept/forward invalid EAPOL frames
mac80211: extend protection against mixed key and fragment cache attacks
ath10k: add CCMP PN replay protection for fragmented frames for PCIe
ath10k: drop fragments with multicast DA for PCIe
ath10k: drop fragments with multicast DA for SDIO
ath10k: drop MPDU which has discard flag set by firmware for SDIO
ath10k: Fix TKIP Michael MIC verification for PCIe
ath10k: Validate first subframe of A-MSDU before processing the list
ath11k: Clear the fragment cache during key install
dm snapshot: properly fix a crash when an origin has no snapshots
drm/amd/pm: correct MGpuFanBoost setting
drm/amdgpu/vcn1: add cancel_delayed_work_sync before power gate
drm/amdkfd: correct sienna_cichlid SDMA RLC register offset error
drm/amdgpu/vcn2.0: add cancel_delayed_work_sync before power gate
drm/amdgpu/vcn2.5: add cancel_delayed_work_sync before power gate
drm/amdgpu/jpeg2.0: add cancel_delayed_work_sync before power gate
selftests/gpio: Use TEST_GEN_PROGS_EXTENDED
selftests/gpio: Move include of lib.mk up
selftests/gpio: Fix build when source tree is read only
kgdb: fix gcc-11 warnings harder
Documentation: seccomp: Fix user notification documentation
seccomp: Refactor notification handler to prepare for new semantics
serial: core: fix suspicious security_locked_down() call
misc/uss720: fix memory leak in uss720_probe
thunderbolt: usb4: Fix NVM read buffer bounds and offset issue
thunderbolt: dma_port: Fix NVM read buffer bounds and offset issue
KVM: X86: Fix vCPU preempted state from guest's point of view
KVM: arm64: Prevent mixed-width VM creation
mei: request autosuspend after sending rx flow control
staging: iio: cdc: ad7746: avoid overwrite of num_channels
iio: gyro: fxas21002c: balance runtime power in error path
iio: dac: ad5770r: Put fwnode in error case during ->probe()
iio: adc: ad7768-1: Fix too small buffer passed to iio_push_to_buffers_with_timestamp()
iio: adc: ad7124: Fix missbalanced regulator enable / disable on error.
iio: adc: ad7124: Fix potential overflow due to non sequential channel numbers
iio: adc: ad7923: Fix undersized rx buffer.
iio: adc: ad7793: Add missing error code in ad7793_setup()
iio: adc: ad7192: Avoid disabling a clock that was never enabled.
iio: adc: ad7192: handle regulator voltage error first
serial: 8250: Add UART_BUG_TXRACE workaround for Aspeed VUART
serial: 8250_dw: Add device HID for new AMD UART controller
serial: 8250_pci: Add support for new HPE serial device
serial: 8250_pci: handle FL_NOIRQ board flag
USB: trancevibrator: fix control-request direction
Revert "irqbypass: do not start cons/prod when failed connect"
USB: usbfs: Don't WARN about excessively large memory allocations
drivers: base: Fix device link removal
serial: tegra: Fix a mask operation that is always true
serial: sh-sci: Fix off-by-one error in FIFO threshold register setting
serial: rp2: use 'request_firmware' instead of 'request_firmware_nowait'
USB: serial: ti_usb_3410_5052: add startech.com device id
USB: serial: option: add Telit LE910-S1 compositions 0x7010, 0x7011
USB: serial: ftdi_sio: add IDs for IDS GmbH Products
USB: serial: pl2303: add device id for ADLINK ND-6530 GC
thermal/drivers/intel: Initialize RW trip to THERMAL_TEMP_INVALID
usb: dwc3: gadget: Properly track pending and queued SG
usb: gadget: udc: renesas_usb3: Fix a race in usb3_start_pipen()
usb: typec: mux: Fix matching with typec_altmode_desc
net: usb: fix memory leak in smsc75xx_bind
Bluetooth: cmtp: fix file refcount when cmtp_attach_device fails
fs/nfs: Use fatal_signal_pending instead of signal_pending
NFS: fix an incorrect limit in filelayout_decode_layout()
NFS: Fix an Oopsable condition in __nfs_pageio_add_request()
NFS: Don't corrupt the value of pg_bytes_written in nfs_do_recoalesce()
NFSv4: Fix v4.0/v4.1 SEEK_DATA return -ENOTSUPP when set NFS_V4_2 config
drm/meson: fix shutdown crash when component not probed
net/mlx5e: reset XPS on error flow if netdev isn't registered yet
net/mlx5e: Fix multipath lag activation
net/mlx5e: Fix error path of updating netdev queues
{net,vdpa}/mlx5: Configure interface MAC into mpfs L2 table
net/mlx5e: Fix nullptr in add_vlan_push_action()
net/mlx5: Set reformat action when needed for termination rules
net/mlx5e: Fix null deref accessing lag dev
net/mlx4: Fix EEPROM dump support
net/mlx5: Set term table as an unmanaged flow table
SUNRPC in case of backlog, hand free slots directly to waiting task
Revert "net:tipc: Fix a double free in tipc_sk_mcast_rcv"
tipc: wait and exit until all work queues are done
tipc: skb_linearize the head skb when reassembling msgs
spi: spi-fsl-dspi: Fix a resource leak in an error handling path
netfilter: flowtable: Remove redundant hw refresh bit
net: dsa: mt7530: fix VLAN traffic leaks
net: dsa: fix a crash if ->get_sset_count() fails
net: dsa: sja1105: update existing VLANs from the bridge VLAN list
net: dsa: sja1105: use 4095 as the private VLAN for untagged traffic
net: dsa: sja1105: error out on unsupported PHY mode
net: dsa: sja1105: add error handling in sja1105_setup()
net: dsa: sja1105: call dsa_unregister_switch when allocating memory fails
net: dsa: sja1105: fix VL lookup command packing for P/Q/R/S
i2c: s3c2410: fix possible NULL pointer deref on read message after write
i2c: mediatek: Disable i2c start_en and clear intr_stat brfore reset
i2c: i801: Don't generate an interrupt on bus reset
i2c: sh_mobile: Use new clock calculation formulas for RZ/G2E
afs: Fix the nlink handling of dir-over-dir rename
perf jevents: Fix getting maximum number of fds
nvmet-tcp: fix inline data size comparison in nvmet_tcp_queue_response
mptcp: avoid error message on infinite mapping
mptcp: drop unconditional pr_warn on bad opt
mptcp: fix data stream corruption
platform/x86: hp_accel: Avoid invoking _INI to speed up resume
gpio: cadence: Add missing MODULE_DEVICE_TABLE
Revert "crypto: cavium/nitrox - add an error message to explain the failure of pci_request_mem_regions"
Revert "media: usb: gspca: add a missed check for goto_low_power"
Revert "ALSA: sb: fix a missing check of snd_ctl_add"
Revert "serial: max310x: pass return value of spi_register_driver"
serial: max310x: unregister uart driver in case of failure and abort
Revert "net: fujitsu: fix a potential NULL pointer dereference"
net: fujitsu: fix potential null-ptr-deref
Revert "net/smc: fix a NULL pointer dereference"
net/smc: properly handle workqueue allocation failure
Revert "net: caif: replace BUG_ON with recovery code"
net: caif: remove BUG_ON(dev == NULL) in caif_xmit
Revert "char: hpet: fix a missing check of ioremap"
char: hpet: add checks after calling ioremap
Revert "ALSA: gus: add a check of the status of snd_ctl_add"
Revert "ALSA: usx2y: Fix potential NULL pointer dereference"
Revert "isdn: mISDNinfineon: fix potential NULL pointer dereference"
isdn: mISDNinfineon: check/cleanup ioremap failure correctly in setup_io
Revert "ath6kl: return error code in ath6kl_wmi_set_roam_lrssi_cmd()"
ath6kl: return error code in ath6kl_wmi_set_roam_lrssi_cmd()
Revert "isdn: mISDN: Fix potential NULL pointer dereference of kzalloc"
isdn: mISDN: correctly handle ph_info allocation failure in hfcsusb_ph_info
Revert "dmaengine: qcom_hidma: Check for driver register failure"
dmaengine: qcom_hidma: comment platform_driver_register call
Revert "libertas: add checks for the return value of sysfs_create_group"
libertas: register sysfs groups properly
Revert "ASoC: cs43130: fix a NULL pointer dereference"
ASoC: cs43130: handle errors in cs43130_probe() properly
Revert "media: dvb: Add check on sp8870_readreg"
media: dvb: Add check on sp8870_readreg return
Revert "media: gspca: mt9m111: Check write_bridge for timeout"
media: gspca: mt9m111: Check write_bridge for timeout
Revert "media: gspca: Check the return value of write_bridge for timeout"
media: gspca: properly check for errors in po1030_probe()
Revert "net: liquidio: fix a NULL pointer dereference"
net: liquidio: Add missing null pointer checks
Revert "brcmfmac: add a check for the status of usb_register"
brcmfmac: properly check for bus register errors
btrfs: return whole extents in fiemap
scsi: ufs: ufs-mediatek: Fix power down spec violation
scsi: BusLogic: Fix 64-bit system enumeration error for Buslogic
openrisc: Define memory barrier mb
scsi: pm80xx: Fix drives missing during rmmod/insmod loop
btrfs: release path before starting transaction when cloning inline extent
btrfs: do not BUG_ON in link_to_fixup_dir
platform/x86: hp-wireless: add AMD's hardware id to the supported list
platform/x86: intel_punit_ipc: Append MODULE_DEVICE_TABLE for ACPI
platform/x86: touchscreen_dmi: Add info for the Mediacom Winpad 7.0 W700 tablet
SMB3: incorrect file id in requests compounded with open
drm/amd/display: Disconnect non-DP with no EDID
drm/amd/amdgpu: fix refcount leak
drm/amdgpu: Fix a use-after-free
drm/amd/amdgpu: fix a potential deadlock in gpu reset
drm/amdgpu: stop touching sched.ready in the backend
platform/x86: touchscreen_dmi: Add info for the Chuwi Hi10 Pro (CWI529) tablet
block: fix a race between del_gendisk and BLKRRPART
linux/bits.h: fix compilation error with GENMASK
net: netcp: Fix an error message
net: dsa: fix error code getting shifted with 4 in dsa_slave_get_sset_count
interconnect: qcom: bcm-voter: add a missing of_node_put()
interconnect: qcom: Add missing MODULE_DEVICE_TABLE
ASoC: cs42l42: Regmap must use_single_read/write
net: stmmac: Fix MAC WoL not working if PHY does not support WoL
net: ipa: memory region array is variable size
vfio-ccw: Check initialized flag in cp_init()
spi: Assume GPIO CS active high in ACPI case
net: really orphan skbs tied to closing sk
net: packetmmap: fix only tx timestamp on request
net: fec: fix the potential memory leak in fec_enet_init()
chelsio/chtls: unlock on error in chtls_pt_recvmsg()
net: mdio: thunder: Fix a double free issue in the .remove function
net: mdio: octeon: Fix some double free issues
cxgb4/ch_ktls: Clear resources when pf4 device is removed
openvswitch: meter: fix race when getting now_ms.
tls splice: check SPLICE_F_NONBLOCK instead of MSG_DONTWAIT
net: sched: fix packet stuck problem for lockless qdisc
net: sched: fix tx action rescheduling issue during deactivation
net: sched: fix tx action reschedule issue with stopped queue
net: hso: check for allocation failure in hso_create_bulk_serial_device()
net: bnx2: Fix error return code in bnx2_init_board()
bnxt_en: Include new P5 HV definition in VF check.
bnxt_en: Fix context memory setup for 64K page size.
mld: fix panic in mld_newpack()
net/smc: remove device from smcd_dev_list after failed device_add()
gve: Check TX QPL was actually assigned
gve: Update mgmt_msix_idx if num_ntfy changes
gve: Add NULL pointer checks when freeing irqs.
gve: Upgrade memory barrier in poll routine
gve: Correct SKB queue index validation.
iommu/virtio: Add missing MODULE_DEVICE_TABLE
net: hns3: fix incorrect resp_msg issue
net: hns3: put off calling register_netdev() until client initialize complete
iommu/vt-d: Use user privilege for RID2PASID translation
cxgb4: avoid accessing registers when clearing filters
staging: emxx_udc: fix loop in _nbu2ss_nuke()
ASoC: cs35l33: fix an error code in probe()
bpf, offload: Reorder offload callback 'prepare' in verifier
bpf: Set mac_len in bpf_skb_change_head
ixgbe: fix large MTU request from VF
ASoC: qcom: lpass-cpu: Use optional clk APIs
scsi: libsas: Use _safe() loop in sas_resume_port()
net: lantiq: fix memory corruption in RX ring
ipv6: record frag_max_size in atomic fragments in input path
ALSA: usb-audio: scarlett2: snd_scarlett_gen2_controls_create() can be static
net: ethernet: mtk_eth_soc: Fix packet statistics support for MT7628/88
sch_dsmark: fix a NULL deref in qdisc_reset()
net: hsr: fix mac_len checks
MIPS: alchemy: xxs1500: add gpio-au1000.h header file
MIPS: ralink: export rt_sysc_membase for rt2880_wdt.c
net: zero-initialize tc skb extension on allocation
net: mvpp2: add buffer header handling in RX
i915: fix build warning in intel_dp_get_link_status()
samples/bpf: Consider frame size in tx_only of xdpsock sample
net: hns3: check the return of skb_checksum_help()
bpftool: Add sock_release help info for cgroup attach/prog load command
SUNRPC: More fixes for backlog congestion
Revert "Revert "ALSA: usx2y: Fix potential NULL pointer dereference""
net: hso: bail out on interrupt URB allocation failure
scripts/clang-tools: switch explicitly to Python 3
neighbour: Prevent Race condition in neighbour subsytem
usb: core: reduce power-on-good delay time of root hub
Linux 5.10.42
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I05d98d1355a080e0951b4b2ae77f0a9ccb6dfc5d
task_cs(p) is protected by RCU, so ensure that we have entered an RCU
read-side critical section before accessing it in guarantee_online_cpus().
This issue was introduced by 4045a05f88 ("BACKPORT: FROMLIST: cpuset:
Honour task_cpu_possible_mask() in guarantee_online_cpus()") and spotted
during upstream review.
Reported-by: Qais Yousef <qais.yousef@arm.com>
Link: https://lore.kernel.org/r/20210521162524.22cwmrao3df7m4jb@e107158-lin.cambridge.arm.com
Fixes: 4045a05f88 ("BACKPORT: FROMLIST: cpuset: Honour task_cpu_possible_mask() in guarantee_online_cpus()")
Bug: 178507149
Change-Id: Ia8b8b89b5fcf72eefe9c2667951e24f315176ed5
Signed-off-by: Will Deacon <willdeacon@google.com>
[ Upstream commit ceb11679d9 ]
Commit 4976b718c3 ("bpf: Introduce pseudo_btf_id") switched the
order of resolve_pseudo_ldimm(), in which some pseudo instructions
are rewritten. Thus those rewritten instructions cannot be passed
to the driver via the 'prepare' offload callback.
Reorder the 'prepare' offload callback to fix it.
Fixes: 4976b718c3 ("bpf: Introduce pseudo_btf_id")
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20210520085834.15023-1-simon.horman@netronome.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit ddc4739169 upstream.
This refactors the user notification code to have a do / while loop around
the completion condition. This is a small change in semantics: previously
we ignored addfd calls upon wakeup if the notification had been responded
to, whereas with this change we check for any outstanding addfd calls
prior to returning to userspace.
Rodrigo Campos also identified a bug that can result in addfd causing
an early return, when the supervisor didn't actually handle the
syscall [1].
[1]: https://lore.kernel.org/lkml/20210413160151.3301-1-rodrigo@kinvolk.io/
Fixes: 7cf97b1254 ("seccomp: Introduce addfd ioctl to seccomp user notifier")
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Acked-by: Tycho Andersen <tycho@tycho.pizza>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Rodrigo Campos <rodrigo@kinvolk.io>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210517193908.3113-3-sargun@sargun.me
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When CONFIG_SCHEDSTATS is not set, the build breaks because
DEFINE_EVENT_SCHEDSTAT evaluates to DEFINE_EVENT_NOP, which only defines
trace_<name>, not __tracepoint_<name>, __traceiter_<name>, and
_SCK__tp_func_<name> like DEFINE_EVENT.
Gate these exports on CONFIG_SCHEDSTATS so all of the exported symbols
are defined.
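
The failure mode and the fix can be modelled in plain C. The macros
below are simplified, hypothetical stand-ins rather than the kernel's
tracepoint macros: tp_<name> plays the role of __tracepoint_<name>, and
exported_sched_stat_wait plays the role of the export.

```c
#include <assert.h>

#ifdef CONFIG_SCHEDSTATS
/* Full definition: emits the backing symbol and the trace_<name>() stub. */
#define DEFINE_EVENT_SCHEDSTAT(name) \
    int tp_##name = 1; \
    static inline void trace_##name(void) { }
#else
/* NOP definition: only trace_<name>() exists, no backing symbol. */
#define DEFINE_EVENT_SCHEDSTAT(name) \
    static inline void trace_##name(void) { }
#endif

DEFINE_EVENT_SCHEDSTAT(sched_stat_wait)

/*
 * Referencing tp_sched_stat_wait unconditionally would not compile when
 * CONFIG_SCHEDSTATS is unset, so the export gets the same guard.
 */
#ifdef CONFIG_SCHEDSTATS
int *exported_sched_stat_wait = &tp_sched_stat_wait;
#endif

/* Report whether the export was built in. */
int exports_enabled(void)
{
#ifdef CONFIG_SCHEDSTATS
    return 1;
#else
    return 0;
#endif
}
```

Built without -DCONFIG_SCHEDSTATS, this compiles with the export left
out. Removing the guard around exported_sched_stat_wait reproduces the
undefined-symbol build failure in miniature.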
Change-Id: I38056ee1446e6c149686ce1905c2ba6e4ea5e59e
Fixes: a6bb1af39d ("ANDROID: vendor_hooks: Export the tracepoints sched_stat_iowait, sched_stat_blocked, sched_stat_wait to let modules probe them")
Link: https://github.com/ClangBuiltLinux/continuous-integration2/runs/2724257445?check_suite_focus=true
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Asymmetric systems may not offer the same level of userspace ISA support
across all CPUs, meaning that some applications cannot be executed by
some CPUs. As a concrete example, upcoming arm64 big.LITTLE designs do
not feature support for 32-bit applications on both clusters.
Although we take care to prevent explicit hot-unplug of all 32-bit
capable CPUs on such a system, this is required when suspending on some
SoCs where the firmware mandates that the suspend/resume operation is
handled by CPU 0, which may not be capable of running 32-bit tasks.
Consequently, there is a window on the resume path where no 32-bit
capable CPUs are available for scheduling and waking up a 32-bit task
will result in a scheduler BUG() due to failure of select_fallback_rq():
| kernel BUG at kernel/sched/core.c:2858!
| Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
| ...
| Call trace:
| select_fallback_rq+0x4b0/0x4e4
| try_to_wake_up.llvm.4388853297126348405+0x460/0x5b0
| default_wake_function+0x1c/0x30
| autoremove_wake_function+0x1c/0x60
| __wake_up_common.llvm.11763074518265335900+0x100/0x1b8
| __wake_up+0x78/0xc4
| ep_poll_callback+0x20c/0x3fc
Prevent wakeups of unschedulable frozen tasks in ttwu() and instead
defer the wakeup to __thaw_tasks(), which runs only once all the
secondary CPUs are back online.
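
The deferral can be pictured with a toy userspace model. It is a sketch
under stated assumptions, not the scheduler change itself: struct task,
TASK_FROZEN_FLAG, try_wake() and thaw_task() are hypothetical names, and
CPU capability is reduced to a bitmask.

```c
#include <assert.h>
#include <stdbool.h>

#define TASK_FROZEN_FLAG 0x1u   /* illustrative bit, not the kernel's value */

struct task {
    unsigned int flags;
    unsigned int allowed_cpus;  /* bitmask of CPUs able to run this task */
    bool wake_deferred;
    bool runnable;
};

/* Wake p unless it is frozen and no online CPU is able to run it. */
bool try_wake(struct task *p, unsigned int online_cpus)
{
    bool schedulable = (p->allowed_cpus & online_cpus) != 0;

    if ((p->flags & TASK_FROZEN_FLAG) && !schedulable) {
        p->wake_deferred = true;    /* remember the wakeup for later */
        return false;
    }
    p->runnable = true;
    return true;
}

/* Thaw p and replay any wakeup that was deferred while it was frozen. */
void thaw_task(struct task *p, unsigned int online_cpus)
{
    p->flags &= ~TASK_FROZEN_FLAG;
    if (p->wake_deferred) {
        p->wake_deferred = false;
        try_wake(p, online_cpus);   /* no longer frozen: normal wake path */
    }
}
```

For example, a frozen task restricted to CPU 1 that is woken while only
CPU 0 is online stays asleep, and becomes runnable once thaw_task() runs
with both CPUs online.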
Signed-off-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/linux-arch/20210525151432.16875-17-will@kernel.org/
Bug: 186372082
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I5a0531b48d537a79e1926289b5a87edcd7dd78ad
Occasionally it is necessary to see if a task is either frozen or
sleeping in the PF_FREEZER_SKIP state. In preparation for adding
additional users of this check, introduce a frozen_or_skipped() helper
function and convert the hung task detector over to using it.
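
A helper of this kind amounts to a single flag test. The sketch below is
a userspace approximation: struct task is a hypothetical stand-in for
task_struct, and the flag values are illustrative, with the kernel's
real definitions living in <linux/sched.h>.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits for this model. */
#define PF_FROZEN       0x00010000u
#define PF_FREEZER_SKIP 0x40000000u

struct task {
    unsigned int flags;
};

/* True if the task is frozen or sleeping in the freezer-skip state. */
bool frozen_or_skipped(const struct task *p)
{
    return (p->flags & (PF_FROZEN | PF_FREEZER_SKIP)) != 0;
}
```

Folding both bits into one predicate lets callers such as the hung task
detector avoid open-coding the mask.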
Signed-off-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/linux-arch/20210525151432.16875-16-will@kernel.org/
Bug: 186372082
Signed-off-by: Will Deacon <willdeacon@google.com>
Change-Id: I138ffe2fae5a2da96df6f30d50d3a8a0dc61724c
Get task info about scheduling delay, iowait, and block time.
Bug: 189415303
Change-Id: Ib6b548f8a78de5b26d555e9a89e3cc79ea2d1024
Signed-off-by: Liujie Xie <xieliujie@oppo.com>
Merge 5.10.41 into android12-5.10
Changes in 5.10.41
bpf: Wrap aux data inside bpf_sanitize_info container
bpf: Fix mask direction swap upon off reg sign change
bpf: No need to simulate speculative domain for immediates
context_tracking: Move guest exit context tracking to separate helpers
context_tracking: Move guest exit vtime accounting to separate helpers
KVM: x86: Defer vtime accounting 'til after IRQ handling
perf unwind: Fix separate debug info files when using elfutils' libdw's unwinder
perf unwind: Set userdata for all __report_module() paths
NFC: nci: fix memory leak in nci_allocate_device
Linux 5.10.41
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ie9f14d4b9960fb923eb01303517012fe6274d5ef
commit a703619127 upstream.
In 801c6058d1 ("bpf: Fix leakage of uninitialized bpf stack under
speculation") we replaced masking logic with direct loads of immediates
if the register is a known constant. Given in this case we do not apply
any masking, there is also no reason for the operation to be truncated
under the speculative domain.
Therefore, there is also zero reason for the verifier to branch-off and
simulate this case, it only needs to do it for unknown but bounded scalars.
As a side-effect, this also enables a few test cases that were previously
rejected due to simulation under zero truncation.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bb01a1bba5 upstream.
Masking direction as indicated via mask_to_left is considered to be
calculated once and then used to derive pointer limits. Thus, this
needs to be placed into bpf_sanitize_info instead so we can pass it
to sanitize_ptr_alu() call after the pointer move. Piotr noticed a
corner case where the off reg causes masking direction change which
then results in an incorrect final aux->alu_limit.
Fixes: 7fedb63a83 ("bpf: Tighten speculative pointer arithmetic mask")
Reported-by: Piotr Krysiuk <piotras@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>