linux-uconsole

Author	SHA1	Message	Date
Sami Tolvanen	8da32d526d	ANDROID: cfi: explicitly clear diag in __cfi_slowpath When CONFIG_CFI_PERMISSIVE is not set, ensure the third argument passed to __cfi_check from __cfi_slowpath is NULL to avoid an invalid memory access in __cfi_check_fail. __cfi_check_fail always traps anyway, but the error message will be less confusing with this patch. Note that kernels built with full LTO aren't affected as they always clear the argument before a __cfi_slowpath call. Later kernel versions are also not affected as they use -fno-sanitize-trap=cfi. Bug: 196763360 Change-Id: Ifa5b4e324737a3069f7a772dd9b392042ec8407e Signed-off-by: Sami Tolvanen <samitolvanen@google.com>	2021-09-02 08:55:56 +00:00
Maulik Shah	4c9aa4c6f0	FROMGIT: irqdomain: Export irq_domain_disconnect_hierarchy() Export irq_domain_disconnect_hierarchy() so irqchip module drivers can use it. Signed-off-by: Maulik Shah <mkshah@codeaurora.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/1629705880-27877-2-git-send-email-mkshah@codeaurora.org Bug: 196928089 (cherry picked from commit `131d326ba9` https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git irq/irqchip-next) Change-Id: Ia38d7a23f60930970dde2edfad95a6924e807939 Signed-off-by: Maulik Shah <mkshah@codeaurora.org>	2021-08-25 00:41:42 +00:00
Elliot Berman	5cd4b1ce23	UPSTREAM: cfi: Use rcu_read_{un}lock_sched_notrace If rcu_read_lock_sched tracing is enabled, the tracing subsystem can perform a jump which needs to be checked by CFI. For example, stm_ftrace source is enabled as a module and hooks into enabled ftrace events. This can cause a recursive loop where find_shadow_check_fn -> rcu_read_lock_sched -> (call to stm_ftrace generates cfi slowpath) -> find_shadow_check_fn -> rcu_read_lock_sched -> ... To avoid the recursion, either the ftrace code needs to be marked with __no_cfi or CFI should not trace. Use the "_notrace" variants in CFI to avoid tracing so that CFI can guard ftrace. Signed-off-by: Elliot Berman <quic_eberman@quicinc.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Cc: stable@vger.kernel.org Fixes: `cf68fffb66` ("add support for Clang CFI") Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210811155914.19550-1-quic_eberman@quicinc.com Bug: 194223154 Change-Id: I7d112496c7f503f95ba69390f6454623cf6dfed2 (cherry picked from commit `14c4c8e415`) Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>	2021-08-19 21:20:03 +00:00
Yanfei Xu	205686b558	FROMGIT: rcu: Fix stall-warning deadlock due to non-release of rcu_node ->lock If rcu_print_task_stall() is invoked on an rcu_node structure that does not contain any tasks blocking the current grace period, it takes an early exit that fails to release that rcu_node structure's lock. This results in a self-deadlock, which is detected by lockdep. To reproduce this bug: tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 3 --trust-make --configs "TREE03" --kconfig "CONFIG_PROVE_LOCKING=y" --bootargs "rcutorture.stall_cpu=30 rcutorture.stall_cpu_block=1 rcutorture.fwd_progress=0 rcutorture.test_boost=0" This will also result in other complaints, including RCU's scheduler hook complaining about blocking rather than preemption and an rcutorture writer stall. Only a partial RCU CPU stall warning message will be printed because of the self-deadlock. This commit therefore releases the lock on the rcu_print_task_stall() function's early exit path. Fixes: `c583bcb8f5` ("rcu: Don't invoke try_invoke_on_locked_down_task() with irqs disabled") Tested-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Yanfei Xu <yanfei.xu@windriver.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> BUG: 196874644 (cherry picked from commit `dc87740c8a` https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev) Signed-off-by: Cheng Jui Wang <cheng-jui.wang@mediatek.com> Change-Id: I0942973e3fbac2d666d8eb9ed59b1701af13248a	2021-08-18 15:05:21 +00:00
Poting Chen	fdc8f778e2	ANDROID: scheduler: export task_sched_runtime Power and performance monitoring needs to know each task's runtime for load estimation, but other modules currently cannot access task_sched_runtime. Export task_sched_runtime so that other modules can use it. Bug: 195914330 Signed-off-by: Poting Chen <poting.chen@mediatek.com> Signed-off-by: Cheng Jui Wang <cheng-jui.wang@mediatek.com> Change-Id: Ida5caf8ed0a32954fc0b0ed950f163c7ca493fef	2021-08-16 20:48:25 +00:00
Quentin Perret	0ad91fe432	ANDROID: sched: Make uclamp changes depend on CAP_SYS_NICE There is currently nothing preventing tasks from changing their per-task clamp values in any way they like. The rationale is probably that system administrators are still able to limit those clamps thanks to the cgroup interface. However, this causes pain in a system where both per-task and per-cgroup clamp values are expected to be under the control of core system components (as is the case for Android). To fix this, let's require CAP_SYS_NICE to change per-task clamp values. There are ongoing discussions upstream about more flexible approaches than this using the RLIMIT API -- see [1]. But the upstream discussion has not converged yet, and this is way too late for UAPI changes in android12-5.10 anyway, so let's apply this change which provides the behaviour we want without actually impacting UAPIs. [1] https://lore.kernel.org/lkml/20210623123441.592348-4-qperret@google.com/ Bug: 187186685 Signed-off-by: Quentin Perret <qperret@google.com> Change-Id: I749312a77306460318ac5374cf243d00b78120dd	2021-08-13 18:27:21 +00:00
Quentin Perret	bfc334cc0b	FROMGIT: sched: Skip priority checks with SCHED_FLAG_KEEP_PARAMS SCHED_FLAG_KEEP_PARAMS can be passed to sched_setattr to specify that the call must not touch scheduling parameters (nice or priority). This is particularly handy for uclamp when used in conjunction with SCHED_FLAG_KEEP_POLICY as that allows to issue a syscall that only impacts uclamp values. However, sched_setattr always checks whether the priorities and nice values passed in sched_attr are valid first, even if those never get used down the line. This is useless at best since userspace can trivially bypass this check to set the uclamp values by specifying low priorities. However, it is cumbersome to do so as there is no single expression of this that skips both RT and CFS checks at once. As such, userspace needs to query the task policy first with e.g. sched_getattr and then set sched_attr.sched_priority accordingly. This is racy and slower than a single call. As the priority and nice checks are useless when SCHED_FLAG_KEEP_PARAMS is specified, simply inherit them in this case to match the policy inheritance of SCHED_FLAG_KEEP_POLICY. Reported-by: Wei Wang <wvw@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by: Qais Yousef <qais.yousef@arm.com> Link: https://lore.kernel.org/r/20210805102154.590709-3-qperret@google.com Bug: 190237315 (cherry picked from commit `f4dddf90d5` git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core) Signed-off-by: Quentin Perret <qperret@google.com> Change-Id: Ifdbc9262b82c7f5c0d34952ece07770a53e3f6a5	2021-08-10 10:32:13 +00:00
Quentin Perret	aaf62dc816	FROMGIT: sched: Don't report SCHED_FLAG_SUGOV in sched_getattr() SCHED_FLAG_SUGOV is supposed to be a kernel-only flag that userspace cannot interact with. However, sched_getattr() currently reports it in sched_flags if called on a sugov worker even though it is not actually defined in a UAPI header. To avoid this, make sure to clean-up the sched_flags field in sched_getattr() before returning to userspace. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210727101103.2729607-3-qperret@google.com Bug: 190237315 (cherry picked from commit `7ad721bf10` git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core) Signed-off-by: Quentin Perret <qperret@google.com> Change-Id: Ib998d497fc38a7f8e6ccb80119336c9ac30719b7	2021-08-10 10:32:05 +00:00
Quentin Perret	4bb5a5c55b	FROMGIT: sched/deadline: Fix reset_on_fork reporting of DL tasks It is possible for sched_getattr() to incorrectly report the state of the reset_on_fork flag when called on a deadline task. Indeed, if the flag was set on a deadline task using sched_setattr() with flags (SCHED_FLAG_RESET_ON_FORK \| SCHED_FLAG_KEEP_PARAMS), then p->sched_reset_on_fork will be set, but __setscheduler() will bail out early, which means that the dl_se->flags will not get updated by __setscheduler_params()->__setparam_dl(). Consequently, if sched_getattr() is then called on the task, __getparam_dl() will override kattr.sched_flags with the now out-of-date copy in dl_se->flags and report the stale value to userspace. To fix this, make sure to only copy the flags that are relevant to sched_deadline to and from the dl_se->flags field. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210727101103.2729607-2-qperret@google.com Bug: 190237315 (cherry picked from commit `f95091536f` git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core) Signed-off-by: Quentin Perret <qperret@google.com> Change-Id: I251a433e0ddde6b63881f92821bc0d47c1693a02	2021-08-10 10:32:00 +00:00
Quentin Perret	ac42699756	BACKPORT: FROMGIT: sched: Fix UCLAMP_FLAG_IDLE setting The UCLAMP_FLAG_IDLE flag is set on a runqueue when dequeueing the last uclamp active task (that is, when buckets.tasks reaches 0 for all buckets) to maintain the last uclamp.max and prevent blocked util from suddenly becoming visible. However, there is an asymmetry in how the flag is set and cleared which can lead to having the flag set whilst there are active tasks on the rq. Specifically, the flag is cleared in the uclamp_rq_inc() path, which is called at enqueue time, but set in uclamp_rq_dec_id() which is called both when dequeueing a task _and_ in the update_uclamp_active() path. As a result, when both uclamp_rq_{dec,ind}_id() are called from update_uclamp_active(), the flag ends up being set but not cleared, hence leaving the runqueue in a broken state. Fix this by clearing the flag in update_uclamp_active() as well. Fixes: `e496187da7` ("sched/uclamp: Enforce last task's UCLAMP_MAX") Reported-by: Rick Yiu <rickyiu@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Qais Yousef <qais.yousef@arm.com> Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Link: https://lore.kernel.org/r/20210805102154.590709-2-qperret@google.com [ qperret: BACKPORT due to trivial cherry-pick conflict caused by `0213b7083e` ("sched/uclamp: Fix uclamp_tg_restrict()") missing from 5.10. ] Bug: 192559209 (cherry picked from commit `ca4984a7dd` git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core) Signed-off-by: Quentin Perret <qperret@google.com> Change-Id: I7b3418e553ba0f06dd5ef6f0d38a99c3210ae897	2021-08-10 10:31:53 +00:00
JianMin Liu	1efc36b815	ANDROID: sched: add a helper function to change PELT half-life Add a new helper function and export it for vendor module to dynamically switch to an alternative half-life at runtime. Bug: 195474490 Signed-off-by: JianMin Liu <jian-min.liu@mediatek.com> Change-Id: Ife41997a032fe3384cfa126cbf7aee929c5c11cf	2021-08-07 00:03:23 +08:00
Jianqun Xu	2a8bfea53d	UPSTREAM: kernel/irq: export irq_gc_set_wake Module driver may use irq_gc_set_wake. Bug: 194515348 Change-Id: I52f43e1dff15d987532395e5151e65419b5904b2 Signed-off-by: Jianqun Xu <jay.xu@rock-chips.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210305080658.2422114-1-jay.xu@rock-chips.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> (cherry picked from commit `024c79520f`) Signed-off-by: Kever Yang <kever.yang@rock-chips.com>	2021-07-30 06:41:28 +00:00
Matthias Maennich	d0a88ae479	ANDROID: Enable GKI Dr. No Enforcement This effectively locks down OWNERS approval to a small group to guard the code base against unintentional breakages. Bug: 194314089 Signed-off-by: Matthias Maennich <maennich@google.com> Change-Id: Ifd1ea97639a622320ea83f901f6451e2e52b38d4	2021-07-21 20:51:47 +01:00
Elliot Berman	0bb433e014	ANDROID: debug_symbols: Add android_debug_for_each_module Allow vendors to obtain a list of the modules loaded at a given time. Vendor modules are able to register on the module notifier chain (register_module_notifier), but a vendor module would never see modules that were loaded before the one that registers on the notifier chain. The kernel doesn't offer load order control, so a hook is necessary to iterate through the currently loaded kernel modules. Bug: 193552324 Change-Id: I3b01cc1b90f8c0c7c21a37992cc7d607316efc7b Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>	2021-07-15 13:59:25 -07:00
Peter Collingbourne	50829b8901	BACKPORT: arm64: Introduce prctl(PR_PAC_{SET,GET}_ENABLED_KEYS) This change introduces a prctl that allows the user program to control which PAC keys are enabled in a particular task. The main reason why this is useful is to enable a userspace ABI that uses PAC to sign and authenticate function pointers and other pointers exposed outside of the function, while still allowing binaries conforming to the ABI to interoperate with legacy binaries that do not sign or authenticate pointers. The idea is that a dynamic loader or early startup code would issue this prctl very early after establishing that a process may load legacy binaries, but before executing any PAC instructions. This change adds a small amount of overhead to kernel entry and exit due to additional required instruction sequences. On a DragonBoard 845c (Cortex-A75) with the powersave governor, the overhead of similar instruction sequences was measured as 4.9ns when simulating the common case where IA is left enabled, or 43.7ns when simulating the uncommon case where IA is disabled. These numbers can be seen as the worst case scenario, since in more realistic scenarios a better performing governor would be used and a newer chip would be used that would support PAC unlike Cortex-A75 and would be expected to be faster than Cortex-A75. On an Apple M1 under a hypervisor, the overhead of the entry/exit instruction sequences introduced by this patch was measured as 0.3ns in the case where IA is left enabled, and 33.0ns in the case where IA is disabled. 
Signed-off-by: Peter Collingbourne <pcc@google.com> Reviewed-by: Dave Martin <Dave.Martin@arm.com> Link: https://linux-review.googlesource.com/id/Ibc41a5e6a76b275efbaa126b31119dc197b927a5 Link: https://lore.kernel.org/r/d6609065f8f40397a4124654eb68c9f490b4d477.1616123271.git.pcc@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Bug: 192536783 (cherry picked from commit `201698626f`) Change-Id: Ic0a21c92a22575f9ec3599fb67bd2931a50b9f04 [quic_eberman@quicinc.com: Resolved merge conflict in arch/arm64/kernel/process.c] Signed-off-by: Elliot Berman <quic_eberman@quicinc.com> Signed-off-by: Peter Collingbourne <pcc@google.com>	2021-07-14 20:52:05 -07:00
Suren Baghdasaryan	26cd2564e1	FROMLIST: psi: stop relying on timer_pending for poll_work rescheduling Psi polling mechanism is trying to minimize the number of wakeups to run psi_poll_work and is currently relying on timer_pending() to detect when this work is already scheduled. This provides a window of opportunity for psi_group_change to schedule an immediate psi_poll_work after poll_timer_fn got called but before psi_poll_work could reschedule itself. Below is the depiction of this entire window: poll_timer_fn wake_up_interruptible(&group->poll_wait); psi_poll_worker wait_event_interruptible(group->poll_wait, ...) psi_poll_work psi_schedule_poll_work if (timer_pending(&group->poll_timer)) return; ... mod_timer(&group->poll_timer, jiffies + delay); Prior to `461daba06b` we used to rely on poll_scheduled atomic which was reset and set back inside psi_poll_work and therefore this race window was much smaller. The larger window causes increased number of wakeups and our partners report visible power regression of ~10mA after applying `461daba06b`. Bring back the poll_scheduled atomic and make this race window even narrower by resetting poll_scheduled only when we reach polling expiration time. This does not completely eliminate the possibility of extra wakeups caused by a race with psi_group_change however it will limit it to the worst case scenario of one extra wakeup per every tracking window (0.5s in the worst case). This patch also ensures correct ordering between clearing poll_scheduled flag and obtaining changed_states using memory barrier. Correct ordering between updating changed_states and setting poll_scheduled is ensured by atomic_xchg operation. 
By tracing the number of immediate rescheduling attempts performed by psi_group_change and the number of these attempts being blocked due to psi monitor being already active, we can assess the effects of this change: Before the patch: Run#1 Run#2 Run#3 Immediate reschedules attempted: 684365 1385156 1261240 Immediate reschedules blocked: 682846 1381654 1258682 Immediate reschedules (delta): 1519 3502 2558 Immediate reschedules (% of attempted): 0.22% 0.25% 0.20% After the patch: Run#1 Run#2 Run#3 Immediate reschedules attempted: 882244 770298 426218 Immediate reschedules blocked: 881996 769796 426074 Immediate reschedules (delta): 248 502 144 Immediate reschedules (% of attempted): 0.03% 0.07% 0.03% The number of non-blocked immediate reschedules dropped from 0.22-0.25% to 0.03-0.07%. The drop is attributed to the decrease in the race window size and the fact that we allow this race only when psi monitors reach polling window expiration time. Fixes: `461daba06b` ("psi: eliminate kthread_worker from psi trigger scheduling mechanism") Reported-by: Kathleen Chang <yt.chang@mediatek.com> Reported-by: Wenju Xu <wenju.xu@mediatek.com> Reported-by: Jonathan Chen <jonathan.jmchen@mediatek.com> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Tested-by: SH Chen <show-hong.chen@mediatek.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Link: https://lore.kernel.org/patchwork/patch/1455172/ Bug: 191127654 Signed-off-by: Suren Baghdasaryan <surenb@google.com> Change-Id: Ie61547ca043e702442a9c6db1468cfb60ff2e729	2021-07-14 20:52:04 -07:00
chunhui.li	9c2958f454	ANDROID: GKI: Export put_task_stack symbol We need to dump task->stack from a kernel module for debugging. Doing so calls try_get_task_stack to lock task->stack, and try_get_task_stack/put_task_stack must be called in pairs, but put_task_stack is not exported. Bug: 192990535 Change-Id: Ifb2f3d16f93039bffeb3e822bc066e42e2d21d13 Signed-off-by: chunhui.li <chunhui.li@mediatek.com>	2021-07-14 09:14:16 +00:00
Liujie Xie	f41a95eadc	ANDROID: Export memcg functions to allow module to add new files Export cgroup_add_legacy_cftypes and a helper function to allow vendor module to expose additional files in the memory cgroup hierarchy. Bug: 192052083 Signed-off-by: Liujie Xie <xieliujie@oppo.com> Change-Id: Ie2b936b3e77c7ab6d740d1bb6d70e03c70a326a7	2021-07-12 18:53:29 +00:00
Kuan-Ying Lee	38abaebab7	ANDROID: syscall_check: add vendor hook for bpf syscall Through this vendor hook, we can get the timing to check current running task for the validation of its credential and bpf operations. Bug: 191291287 Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> Change-Id: Ie4ed8df7ad66df2486fc7e52a26d9191fc0c176e	2021-07-09 13:48:53 +00:00
Jing-Ting Wu	5c51579fde	ANDROID: fork: Export task_newtask tracepoint android_rvh_sched_fork() and android_rvh_sched_fork_init() already let us register probes during fork(), but those are invoked before the new task is added to the tasklist, which can lead to some undesired races when a module is trying to initialize vendor-specific task_struct fields. Export the task_newtask tracepoint to register probes to run during fork() but after the task has been inserted into the tasklist. Bug: 192873984 Signed-off-by: Jing-Ting Wu <Jing-Ting.Wu@mediatek.com> Cc: Valentin Schneider <valentin.schneider@arm.com> Change-Id: Ifef14819264385b5e955a5966b4e4f66d50da5e3	2021-07-06 21:24:20 +00:00
Todd Kjos	e2a90797e8	ANDROID: Fix kernelci warnings for indentation in smp.c Fix warnings reported by kernelci due to incorrect indentation: kernel/smp.c:982:3: warning: this ‘if’ clause does not guard Fixes: `f0b280c395` ("ANDROID: cpuidle: Update cpuidle_uninstall_idle_handler() to wakeup all online CPUs") Signed-off-by: Todd Kjos <tkjos@google.com> Change-Id: Ide771342558de321154696f9fe1272750a773853	2021-07-06 21:17:01 +00:00
Bumyong Lee	d4d02ab9b0	UPSTREAM: swiotlb: manipulate orig_addr when tlb_addr has offset commit `5f89468e2f` upstream. in case of driver wants to sync part of ranges with offset, swiotlb_tbl_sync_single() copies from orig_addr base to tlb_addr with offset and ends up with data mismatch. It was removed from "swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single", but said logic has to be added back in. From Linus's email: "That commit which the removed the offset calculation entirely, because the old (unsigned long)tlb_addr & (IO_TLB_SIZE - 1) was wrong, but instead of removing it, I think it should have just fixed it to be (tlb_addr - mem->start) & (IO_TLB_SIZE - 1); instead. That way the slot offset always matches the slot index calculation." (Unfortunatly that broke NVMe). The use-case that drivers are hitting is as follow: 1. Get dma_addr_t from dma_map_single() dma_addr_t tlb_addr = dma_map_single(dev, vaddr, vsize, DMA_TO_DEVICE); \|<---------------vsize------------->\| +-----------------------------------+ \| \| original buffer +-----------------------------------+ vaddr swiotlb_align_offset \|<----->\|<---------------vsize------------->\| +-------+-----------------------------------+ \| \| \| swiotlb buffer +-------+-----------------------------------+ tlb_addr 2. Do something 3. Sync dma_addr_t through dma_sync_single_for_device(..) dma_sync_single_for_device(dev, tlb_addr + offset, size, DMA_TO_DEVICE); Error case. 
Copy data to original buffer but it is from base addr (instead of base addr + offset) in original buffer: swiotlb_align_offset \|<----->\|<- offset ->\|<- size ->\| +-------+-----------------------------------+ \| \| \|##########\| \| swiotlb buffer +-------+-----------------------------------+ tlb_addr \|<- size ->\| +-----------------------------------+ \|##########\| \| original buffer +-----------------------------------+ vaddr The fix is to copy the data to the original buffer and take into account the offset, like so: swiotlb_align_offset \|<----->\|<- offset ->\|<- size ->\| +-------+-----------------------------------+ \| \| \|##########\| \| swiotlb buffer +-------+-----------------------------------+ tlb_addr \|<- offset ->\|<- size ->\| +-----------------------------------+ \| \|##########\| \| original buffer +-----------------------------------+ vaddr [One fix which was Linus's that made more sense to as it created a symmetry would break NVMe. The reason for that is the: unsigned int offset = (tlb_addr - mem->start) & (IO_TLB_SIZE - 1); would come up with the proper offset, but it would lose the alignment (which this patch contains).] Bug: 192521392 Fixes: `16fc3cef33` ("swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single") Signed-off-by: Bumyong Lee <bumyong.lee@samsung.com> Signed-off-by: Chanho Park <chanho61.park@samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reported-by: Dominique MARTINET <dominique.martinet@atmark-techno.com> Reported-by: Horia Geantă <horia.geanta@nxp.com> Tested-by: Horia Geantă <horia.geanta@nxp.com> CC: stable@vger.kernel.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit `e6108147dd` of linux-5.10.47) Change-Id: Ib03e81080ab029d37e6ff54a3e2cb526d3a30e10	2021-07-06 16:30:01 +00:00
Maulik Shah	f0b280c395	ANDROID: cpuidle: Update cpuidle_uninstall_idle_handler() to wakeup all online CPUs wake_up_all_idle_cpus() will not wakeup paused CPUs since they are removed from cpu_active_mask but paused CPUs can be in deep cpu idle and hence must wakeup when uninstalling idle handler. This change fixes this by introducing wake_up_all_online_idle_cpus() to unconditionally wakeup all online idle CPUs and invoking same when uninstalling cpu idle handler. Bug: 192436062 Fixes: `683010f555` ("ANDROID: cpu/hotplug: add pause/resume_cpus interface") Change-Id: I4afd4b7a17b87f9cc495e7009c9537888387f9ef Signed-off-by: Maulik Shah <mkshah@codeaurora.org>	2021-07-02 19:44:56 +00:00
Rick Yiu	df80ec7469	ANDROID: sched: Add vendor data in struct cfs_rq For vendor specific data in struct cfs_rq. Bug: 188947181 Signed-off-by: Rick Yiu <rickyiu@google.com> Change-Id: I7c322c6812829c19014426b5721cd1fb0c37a53f	2021-07-01 22:32:03 -07:00
Shaleen Agrawal	f9fcdaeab7	ANDROID: sched: remove regular vendor hooks for 32bit execve As restricted hooks have been introduced, regular vendor hooks are no longer necessary. Bug: 187917024 Change-Id: Ia70e9dd1bd7373e19bdc82e90a2384201076bc0b Signed-off-by: Shaleen Agrawal <shalagra@codeaurora.org>	2021-07-01 22:32:03 -07:00
Rick Yiu	22a57c542b	ANDROID: sched: Add trace for __setscheduler_uclamp To know per-task uclamp request. Bug: 191973176 Signed-off-by: Rick Yiu <rickyiu@google.com> Change-Id: Ibd40391f2228db5daa410198339237879e67a078	2021-07-01 12:31:33 +00:00
Mukesh Ojha	01f2392e13	ANDROID: logbuf: Add new logbuf vendor hook to support pr_cont() Add new logbuf vendor hook android_vh_logbuf_pr_cont() to capture pr_cont logs. Bug: 185182649 Change-Id: I76b310fc9caac71b344b6cc25ea36f7f81cb7148 Signed-off-by: Mukesh Ojha <mojha@codeaurora.org>	2021-06-29 17:25:52 +00:00
Stephen Dickey	1093a9bfdb	ANDROID: sched: select fallback rq must check for allowed cpus select_fallback_rq() must return a cpu that is valid for the task. However, when nid is not -1, it skips checking for task_cpu_possible_mask(). This causes a problem when execve-ing 32 bit apps on an asymmetric system where not all cpus are 32 bit capable. During execve-ing the task is marked as 32 bit long before its affinity mask is restricted. If the cpu goes offline during this time, select_fallback_rq() could return a 64 bit only cpu, which __migrate_tasks()/ is_cpu_allowed() rejects. migrate_tasks() will therefore continue to pick the same task repeatedly, where __migrate_tasks() rejects the cpu chosen by select_fallback_rq() every time, leading to an infinite loop. Correct the issue by updating select_fallback_rq() for the case where nid is not -1, ensuring that the returned cpu is always valid for this task. Bug: 192050156 Change-Id: Ia073a8395a02485f6d1c1daa0f3ce9e2029cb1f4 Signed-off-by: Stephen Dickey <dickey@codeaurora.org>	2021-06-29 10:57:22 +00:00
Liujie Xie	8943a2e7a3	ANDROID: android: Export symbols for invoking cpufreq_update_util() Vendor modules invoke cpufreq_update_util() to update cpufreq, but building our modules fails with: ERROR: modpost: "cpufreq_update_util_data" [xxx.ko] undefined! Bug: 192218676 Signed-off-by: Liujie Xie <xieliujie@oppo.com> Change-Id: Ib1da70229f04b08d8d812d065021dec0bf891e0e	2021-06-29 10:44:12 +00:00
Shaleen Agrawal	c7c351ab3f	ANDROID: sched: add restricted tracehooks for 32bit execve Pre and post tracepoints in force_compatible_cpus_allowed_ptr() need to be restricted hooks so that they can sleep. The old non-restricted versions need to stay in place temporarily for KMI stability. They will be removed by aosp/1742588. Bug: 187917024 Change-Id: If630554b1c8fa2e8ccb79c89945c55e17756e6a8 Signed-off-by: Shaleen Agrawal <shalagra@codeaurora.org>	2021-06-23 17:05:12 -07:00
Suren Baghdasaryan	c7186c2c46	FROMGIT: cgroup: make per-cgroup pressure stall tracking configurable PSI accounts stalls for each cgroup separately and aggregates it at each level of the hierarchy. This causes additional overhead with psi_avgs_work being called for each cgroup in the hierarchy. psi_avgs_work has been highly optimized, however on systems with large number of cgroups the overhead becomes noticeable. Systems which use PSI only at the system level could avoid this overhead if PSI can be configured to skip per-cgroup stall accounting. Add "cgroup_disable=pressure" kernel command-line option to allow requesting system-wide only pressure stall accounting. When set, it keeps system-wide accounting under /proc/pressure/ but skips accounting for individual cgroups and does not expose PSI nodes in cgroup hierarchy. Signed-off-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/patchwork/patch/1435705 (cherry picked from commit `3958e2d0c3` https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git tj) Bug: 178872719 Bug: 191734423 Signed-off-by: Suren Baghdasaryan <surenb@google.com> Change-Id: Ifc8fbc52f9a1131d7c2668edbb44c525c76c3360	2021-06-23 18:35:35 +00:00
Suren Baghdasaryan	0d054fc5d7	Revert "ANDROID: make per-cgroup PSI tracking configurable" This reverts commit `bd3983c8a8`. Bug: 178872719 Bug: 191734423 Signed-off-by: Suren Baghdasaryan <surenb@google.com> Change-Id: Iae7997d518693f09fcc0bf8a3ee5caac6145ada5	2021-06-23 18:35:27 +00:00
Marc Zyngier	3f9d45d802	FROMLIST: genirq: Allow an interrupt to be marked as 'raw' Some interrupts (such as the rescheduling IPI) rely on not going through the irq_enter()/irq_exit() calls. To distinguish such interrupts, add a new IRQ flag that allows the low-level handling code to sidestep the enter()/exit() calls. Only the architecture code is expected to use this. It will do the wrong thing on normal interrupts. Note that this is a band-aid until we can move to some more correct infrastructure (such as kernel/entry/common.c). Bug: 191808738 Link: https://lore.kernel.org/lkml/20201124141449.572446-3-maz@kernel.org/ Change-Id: I0609a8b689219ba9e769c8b9f7fcf1e77a0ff1ca Signed-off-by: Marc Zyngier <maz@kernel.org> [minor port to 5.10] Signed-off-by: Stephen Dickey <dickey@codeaurora.org>	2021-06-23 18:11:55 +00:00
Marc Zyngier	08327b9007	FROMLIST: genirq: Add __irq_modify_status() helper to clear/set special flags Some arch-specific flags need to be set/cleared, but not exposed to random device drivers. Introduce a new helper (__irq_modify_status()) that takes an arbitrary mask, and rewrite irq_modify_status() to use this new helper. No functional change. Bug: 191808738 Link: https://lore.kernel.org/lkml/20201124141449.572446-5-maz@kernel.org/ Change-Id: I2c2c0d6599d0ab39fad22462bf4c87694362fba8 Signed-off-by: Marc Zyngier <maz@kernel.org> [minor port to 5.10] Signed-off-by: Stephen Dickey <dickey@codeaurora.org>	2021-06-23 18:11:38 +00:00
heshuai1	728626cb04	ANDROID: power: Add vendor hook to qos for GKI purpose. Add the vendor hook to qos.c to handle some special cases related to our feature. The hook is added at freq_qos_add_request and remove_request so that we can divert to our own qos processing logic. Bug: 187458531 Signed-off-by: heshuai1 <heshuai1@xiaomi.com> Change-Id: I1fb8fd6134432ecfb44ad242c66ccd8280ab9b43	2021-06-23 14:36:23 +00:00
Charan Teja Reddy	71fdbce075	FROMLIST: mm: compaction: support triggering of proactive compaction by user Proactive compaction[1] is triggered every 500 msec and runs compaction on the node for COMPACTION_HPAGE_ORDER (usually order-9) pages, based on the value of sysctl.compaction_proactiveness. Triggering compaction every 500 msec in search of COMPACTION_HPAGE_ORDER pages is not needed for all applications, especially in embedded use cases that may have only a few MB of RAM. Enabling proactive compaction in its current state ends up running it almost constantly on such systems. On the other hand, proactive compaction can still be very useful for obtaining a set of higher-order pages in a controllable manner (controlled through sysctl.compaction_proactiveness). Thus, on systems where always-on proactive compaction is not required, it can instead be triggered from user space by a write to its sysctl interface. For example, an app launcher about to launch a memory-heavy application, which launches faster if it gets more higher-order pages, can prepare the system in advance by triggering proactive compaction from userspace. Proactive compaction is triggered on a user write to sysctl.compaction_proactiveness. [1]https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=facdaa917c4d5a376d09d25865f5a863f906234a Bug: 186387247 Link: https://lore.kernel.org/patchwork/patch/1438211/ Signed-off-by: Charan Teja Reddy <charante@codeaurora.org> Change-Id: Ie5208e274b9d7e7354471bb98ff1f10becf93595	2021-06-17 14:15:58 -07:00
Greg Kroah-Hartman	9e08e97ec6	Merge 5.10.43 into android12-5.10 Changes in 5.10.43 btrfs: tree-checker: do not error out if extent ref hash doesn't match net: usb: cdc_ncm: don't spew notifications hwmon: (dell-smm-hwmon) Fix index values hwmon: (pmbus/isl68137) remove READ_TEMPERATURE_3 for RAA228228 netfilter: conntrack: unregister ipv4 sockopts on error unwind efi/fdt: fix panic when no valid fdt found efi: Allow EFI_MEMORY_XP and EFI_MEMORY_RO both to be cleared efi/libstub: prevent read overflow in find_file_option() efi: cper: fix snprintf() use in cper_dimm_err_location() vfio/pci: Fix error return code in vfio_ecap_init() vfio/pci: zap_vma_ptes() needs MMU samples: vfio-mdev: fix error handing in mdpy_fb_probe() vfio/platform: fix module_put call in error flow ipvs: ignore IP_VS_SVC_F_HASHED flag when adding service HID: logitech-hidpp: initialize level variable HID: pidff: fix error return code in hid_pidff_init() HID: i2c-hid: fix format string mismatch devlink: Correct VIRTUAL port to not have phys_port attributes net/sched: act_ct: Offload connections with commit action net/sched: act_ct: Fix ct template allocation for zone 0 mptcp: always parse mptcp options for MPC reqsk nvme-rdma: fix in-casule data send for chained sgls ACPICA: Clean up context mutex during object deletion perf probe: Fix NULL pointer dereference in convert_variable_location() net: dsa: tag_8021q: fix the VLAN IDs used for encoding sub-VLANs net: sock: fix in-kernel mark setting net/tls: Replace TLS_RX_SYNC_RUNNING with RCU net/tls: Fix use-after-free after the TLS device goes down and up net/mlx5e: Fix incompatible casting net/mlx5: Check firmware sync reset requested is set before trying to abort it net/mlx5e: Check for needed capability for cvlan matching net/mlx5: DR, Create multi-destination flow table with level less than 64 nvmet: fix freeing unallocated p2pmem netfilter: nft_ct: skip expectations for confirmed conntrack netfilter: nfnetlink_cthelper: hit EBUSY on updates if 
size mismatches drm/i915/selftests: Fix return value check in live_breadcrumbs_smoketest() bpf: Simplify cases in bpf_base_func_proto bpf, lockdown, audit: Fix buggy SELinux lockdown permission checks ieee802154: fix error return code in ieee802154_add_iface() ieee802154: fix error return code in ieee802154_llsec_getparams() igb: add correct exception tracing for XDP ixgbevf: add correct exception tracing for XDP cxgb4: fix regression with HASH tc prio value update ipv6: Fix KASAN: slab-out-of-bounds Read in fib6_nh_flush_exceptions ice: Fix allowing VF to request more/less queues via virtchnl ice: Fix VFR issues for AVF drivers that expect ATQLEN cleared ice: handle the VF VSI rebuild failure ice: report supported and advertised autoneg using PHY capabilities ice: Allow all LLDP packets from PF to Tx i2c: qcom-geni: Add shutdown callback for i2c cxgb4: avoid link re-train during TC-MQPRIO configuration i40e: optimize for XDP_REDIRECT in xsk path i40e: add correct exception tracing for XDP ice: simplify ice_run_xdp ice: optimize for XDP_REDIRECT in xsk path ice: add correct exception tracing for XDP ixgbe: optimize for XDP_REDIRECT in xsk path ixgbe: add correct exception tracing for XDP arm64: dts: ti: j7200-main: Mark Main NAVSS as dma-coherent optee: use export_uuid() to copy client UUID bus: ti-sysc: Fix am335x resume hang for usb otg module arm64: dts: ls1028a: fix memory node arm64: dts: zii-ultra: fix 12V_MAIN voltage arm64: dts: freescale: sl28: var4: fix RGMII clock and voltage ARM: dts: imx7d-meerkat96: Fix the 'tuning-step' property ARM: dts: imx7d-pico: Fix the 'tuning-step' property ARM: dts: imx: emcon-avari: Fix nxp,pca8574 #gpio-cells bus: ti-sysc: Fix flakey idling of uarts and stop using swsup_sidle_act tipc: add extack messages for bearer/media failure tipc: fix unique bearer names sanity check serial: stm32: fix threaded interrupt handling riscv: vdso: fix and clean-up Makefile io_uring: fix link timeout refs io_uring: use better types for 
cflags drm/amdgpu/vcn3: add cancel_delayed_work_sync before power gate drm/amdgpu/jpeg2.5: add cancel_delayed_work_sync before power gate drm/amdgpu/jpeg3: add cancel_delayed_work_sync before power gate Bluetooth: fix the erroneous flush_work() order Bluetooth: use correct lock to prevent UAF of hdev object wireguard: do not use -O3 wireguard: peer: allocate in kmem_cache wireguard: use synchronize_net rather than synchronize_rcu wireguard: selftests: remove old conntrack kconfig value wireguard: selftests: make sure rp_filter is disabled on vethc wireguard: allowedips: initialize list head in selftest wireguard: allowedips: remove nodes in O(1) wireguard: allowedips: allocate nodes in kmem_cache wireguard: allowedips: free empty intermediate nodes when removing single node net: caif: added cfserl_release function net: caif: add proper error handling net: caif: fix memory leak in caif_device_notify net: caif: fix memory leak in cfusbl_device_notify HID: i2c-hid: Skip ELAN power-on command after reset HID: magicmouse: fix NULL-deref on disconnect HID: multitouch: require Finger field to mark Win8 reports as MT gfs2: fix scheduling while atomic bug in glocks ALSA: timer: Fix master timer notification ALSA: hda: Fix for mute key LED for HP Pavilion 15-CK0xx ALSA: hda: update the power_state during the direct-complete ARM: dts: imx6dl-yapp4: Fix RGMII connection to QCA8334 switch ARM: dts: imx6q-dhcom: Add PU,VDD1P1,VDD2P5 regulators ext4: fix memory leak in ext4_fill_super ext4: fix bug on in ext4_es_cache_extent as ext4_split_extent_at failed ext4: fix fast commit alignment issues ext4: fix memory leak in ext4_mb_init_backend on error path. 
ext4: fix accessing uninit percpu counter variable with fast_commit usb: dwc2: Fix build in periphal-only mode pid: take a reference when initializing `cad_pid` ocfs2: fix data corruption by fallocate mm/debug_vm_pgtable: fix alignment for pmd/pud_advanced_tests() mm/page_alloc: fix counting of free pages after take off from buddy x86/cpufeatures: Force disable X86_FEATURE_ENQCMD and remove update_pasid() x86/sev: Check SME/SEV support in CPUID first nfc: fix NULL ptr dereference in llcp_sock_getname() after failed connect drm/amdgpu: Don't query CE and UE errors drm/amdgpu: make sure we unpin the UVD BO x86/apic: Mark _all_ legacy interrupts when IO/APIC is missing powerpc/kprobes: Fix validation of prefixed instructions across page boundary btrfs: mark ordered extent and inode with error if we fail to finish btrfs: fix error handling in btrfs_del_csums btrfs: return errors from btrfs_del_csums in cleanup_ref_head btrfs: fixup error handling in fixup_inode_link_counts btrfs: abort in rename_exchange if we fail to insert the second ref btrfs: fix deadlock when cloning inline extents and low on available space mm, hugetlb: fix simple resv_huge_pages underflow on UFFDIO_COPY drm/msm/dpu: always use mdp device to scale bandwidth btrfs: fix unmountable seed device after fstrim KVM: SVM: Truncate GPR value for DR and CR accesses in !64-bit mode KVM: arm64: Fix debug register indexing x86/kvm: Teardown PV features on boot CPU as well x86/kvm: Disable kvmclock on all CPUs on shutdown x86/kvm: Disable all PV features on crash lib/lz4: explicitly support in-place decompression i2c: qcom-geni: Suspend and resume the bus during SYSTEM_SLEEP_PM ops netfilter: nf_tables: missing error reporting for not selected expressions xen-netback: take a reference to the RX task thread neighbour: allow NUD_NOARP entries to be forced GCed Linux 5.10.43 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I8d7ec0878193e4e454076809b7fb71fcc4e3d810	2021-06-12 14:48:14 +02:00
lijianzhong	a685bf3fce	ANDROID: export cpuset_cpus_allowed() for GKI purpose. Export the symbol cpuset_cpus_allowed() so that kernel modules (.ko) can perform cpuset operations in vendor-hook-related code. Bug: 189725786 Signed-off-by: lijianzhong <lijianzhong@xiaomi.com> Change-Id: I7919a893ab64bb441ab43cbb0b16825ed76d802d	2021-06-10 19:50:04 +00:00
Daniel Borkmann	ff5039ec75	bpf, lockdown, audit: Fix buggy SELinux lockdown permission checks [ Upstream commit `ff40e51043` ] Commit `59438b4647` ("security,lockdown,selinux: implement SELinux lockdown") added an implementation of the locked_down LSM hook to SELinux, with the aim to restrict which domains are allowed to perform operations that would breach lockdown. This is indirectly also getting audit subsystem involved to report events. The latter is problematic, as reported by Ondrej and Serhei, since it can bring down the whole system via audit: 1) The audit events that are triggered due to calls to security_locked_down() can OOM kill a machine, see below details [0]. 2) It also seems to be causing a deadlock via avc_has_perm()/slow_avc_audit() when trying to wake up kauditd, for example, when using trace_sched_switch() tracepoint, see details in [1]. Triggering this was not via some hypothetical corner case, but with existing tools like runqlat & runqslower from bcc, for example, which make use of this tracepoint. Rough call sequence goes like: rq_lock(rq) -> -------------------------+ trace_sched_switch() -> \| bpf_prog_xyz() -> +-> deadlock selinux_lockdown() -> \| audit_log_end() -> \| wake_up_interruptible() -> \| try_to_wake_up() -> \| rq_lock(rq) --------------+ What's worse is that the intention of `59438b4647` to further restrict lockdown settings for specific applications in respect to the global lockdown policy is completely broken for BPF. The SELinux policy rule for the current lockdown check looks something like this: allow <who> <who> : lockdown { <reason> }; However, this doesn't match with the 'current' task where the security_locked_down() is executed, example: httpd does a syscall. There is a tracing program attached to the syscall which triggers a BPF program to run, which ends up doing a bpf_probe_read_kernel{,_str}() helper call. The selinux_lockdown() hook does the permission check against 'current', that is, httpd in this example. 
httpd has literally zero relation to this tracing program, and it would be nonsensical having to write an SELinux policy rule against httpd to let the tracing helper pass. The policy in this case needs to be against the entity that is installing the BPF program. For example, if bpftrace would generate a histogram of syscall counts by user space application: bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }' bpftrace would then go and generate a BPF program from this internally. One way of doing it [for the sake of the example] could be to call bpf_get_current_task() helper and then access current->comm via one of bpf_probe_read_kernel{,_str}() helpers. So the program itself has nothing to do with httpd or any other random app doing a syscall here. The BPF program _explicitly initiated_ the lockdown check. The allow/deny policy belongs in the context of bpftrace: meaning, you want to grant bpftrace access to use these helpers, but other tracers on the system like my_random_tracer _not_. Therefore fix all three issues at the same time by taking a completely different approach for the security_locked_down() hook, that is, move the check into the program verification phase where we actually retrieve the BPF func proto. This also reliably gets the task (current) that is trying to install the BPF tracing program, e.g. bpftrace/bcc/perf/systemtap/etc, and it also fixes the OOM since we're moving this out of the BPF helper's fast-path which can be called several millions of times per second. The check is then also in line with other security_locked_down() hooks in the system where the enforcement is performed at open/load time, for example, open_kcore() for /proc/kcore access or module_sig_check() for module signatures just to pick few random ones. What's out of scope in the fix as well as in other security_locked_down() hook locations /outside/ of BPF subsystem is that if the lockdown policy changes on the fly there is no retrospective action. 
This requires a different discussion, potentially complex infrastructure, and it's also not clear whether this can be solved generically. Either way, it is out of scope for a suitable stable fix which this one is targeting. Note that the breakage is specifically on `59438b4647` where it started to rely on 'current' as UAPI behavior, and _not_ earlier infrastructure such as `9d1f8be5cf` ("bpf: Restrict bpf when kernel lockdown is in confidentiality mode"). [0] https://bugzilla.redhat.com/show_bug.cgi?id=1955585, Jakub Hrozek says: I starting seeing this with F-34. When I run a container that is traced with BPF to record the syscalls it is doing, auditd is flooded with messages like: type=AVC msg=audit(1619784520.593:282387): avc: denied { confidentiality } for pid=476 comm="auditd" lockdown_reason="use of bpf to read kernel RAM" scontext=system_u:system_r:auditd_t:s0 tcontext=system_u:system_r:auditd_t:s0 tclass=lockdown permissive=0 This seems to be leading to auditd running out of space in the backlog buffer and eventually OOMs the machine. [...] auditd running at 99% CPU presumably processing all the messages, eventually I get: Apr 30 12:20:42 fedora kernel: audit: backlog limit exceeded Apr 30 12:20:42 fedora kernel: audit: backlog limit exceeded Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152579 > audit_backlog_limit=64 Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152626 > audit_backlog_limit=64 Apr 30 12:20:42 fedora kernel: audit: audit_backlog=2152694 > audit_backlog_limit=64 Apr 30 12:20:42 fedora kernel: audit: audit_lost=6878426 audit_rate_limit=0 audit_backlog_limit=64 Apr 30 12:20:45 fedora kernel: oci-seccomp-bpf invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-1000 Apr 30 12:20:45 fedora kernel: CPU: 0 PID: 13284 Comm: oci-seccomp-bpf Not tainted 5.11.12-300.fc34.x86_64 #1 Apr 30 12:20:45 fedora kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014 [...] 
[1] https://lore.kernel.org/linux-audit/CANYvDQN7H5tVp47fbYcRasv4XF07eUbsDwT_eDCHXJUj43J7jQ@mail.gmail.com/, Serhei Makarov says: Upstream kernel 5.11.0-rc7 and later was found to deadlock during a bpf_probe_read_compat() call within a sched_switch tracepoint. The problem is reproducible with the reg_alloc3 testcase from SystemTap's BPF backend testsuite on x86_64 as well as the runqlat, runqslower tools from bcc on ppc64le. Example stack trace: [...] [ 730.868702] stack backtrace: [ 730.869590] CPU: 1 PID: 701 Comm: in:imjournal Not tainted, 5.12.0-0.rc2.20210309git144c79ef3353.166.fc35.x86_64 #1 [ 730.871605] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 [ 730.873278] Call Trace: [ 730.873770] dump_stack+0x7f/0xa1 [ 730.874433] check_noncircular+0xdf/0x100 [ 730.875232] __lock_acquire+0x1202/0x1e10 [ 730.876031] ? __lock_acquire+0xfc0/0x1e10 [ 730.876844] lock_acquire+0xc2/0x3a0 [ 730.877551] ? __wake_up_common_lock+0x52/0x90 [ 730.878434] ? lock_acquire+0xc2/0x3a0 [ 730.879186] ? lock_is_held_type+0xa7/0x120 [ 730.880044] ? skb_queue_tail+0x1b/0x50 [ 730.880800] _raw_spin_lock_irqsave+0x4d/0x90 [ 730.881656] ? __wake_up_common_lock+0x52/0x90 [ 730.882532] __wake_up_common_lock+0x52/0x90 [ 730.883375] audit_log_end+0x5b/0x100 [ 730.884104] slow_avc_audit+0x69/0x90 [ 730.884836] avc_has_perm+0x8b/0xb0 [ 730.885532] selinux_lockdown+0xa5/0xd0 [ 730.886297] security_locked_down+0x20/0x40 [ 730.887133] bpf_probe_read_compat+0x66/0xd0 [ 730.887983] bpf_prog_250599c5469ac7b5+0x10f/0x820 [ 730.888917] trace_call_bpf+0xe9/0x240 [ 730.889672] perf_trace_run_bpf_submit+0x4d/0xc0 [ 730.890579] perf_trace_sched_switch+0x142/0x180 [ 730.891485] ? __schedule+0x6d8/0xb20 [ 730.892209] __schedule+0x6d8/0xb20 [ 730.892899] schedule+0x5b/0xc0 [ 730.893522] exit_to_user_mode_prepare+0x11d/0x240 [ 730.894457] syscall_exit_to_user_mode+0x27/0x70 [ 730.895361] entry_SYSCALL_64_after_hwframe+0x44/0xae [...] 
Fixes: `59438b4647` ("security,lockdown,selinux: implement SELinux lockdown") Reported-by: Ondrej Mosnacek <omosnace@redhat.com> Reported-by: Jakub Hrozek <jhrozek@redhat.com> Reported-by: Serhei Makarov <smakarov@redhat.com> Reported-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Jiri Olsa <jolsa@redhat.com> Cc: Paul Moore <paul@paul-moore.com> Cc: James Morris <jamorris@linux.microsoft.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Frank Eigler <fche@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/bpf/01135120-8bf7-df2e-cff0-1d73f1f841c3@iogearbox.net Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-10 13:39:19 +02:00
Tobias Klauser	cdf3f6db1a	bpf: Simplify cases in bpf_base_func_proto [ Upstream commit `61ca36c8c4` ] !perfmon_capable() is checked before the last switch(func_id) in bpf_base_func_proto. Thus, the cases BPF_FUNC_trace_printk and BPF_FUNC_snprintf_btf can be moved to that last switch(func_id) to omit the inline !perfmon_capable() checks. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210127174615.3038-1-tklauser@distanz.ch Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-10 13:39:19 +02:00
heshuai1	c9b8fa644f	ANDROID: user: Add vendor hook to user for GKI purpose Add the vendor hook to user.c. Because of some special cases related to our feature, we need to initialize the variables defined by ourselves in user_struct, so we add the hook at alloc_uid to make sure we can go to our own logic when the user_struct is about to be initialized. Bug: 187458531 Signed-off-by: heshuai1 <heshuai1@xiaomi.com> Change-Id: I078484aac2c3d396aba5971d6d0f491652f3781c	2021-06-10 01:35:22 +00:00
Liujie Xie	13af062abf	ANDROID: vendor_hooks: Export the tracepoints sched_stat_sleep and sched_waking to let modules probe them. Get task info about sleep and waking. Bug: 190422437 Signed-off-by: Liujie Xie <xieliujie@oppo.com> Change-Id: I828c93f531f84e6133c2c3a7f8faada51683afcf	2021-06-09 16:49:43 +00:00
Shaleen Agrawal	2a1bc2387d	ANDROID: abi_gki_aarch64_qcom: Add symbols for 32bit execve Export cpu_maps_update_begin, cpu_maps_update_done to be used by vendor modules, particularly to hold locks when affinity is being updated for 32 bit task exec. Leaf changes summary: 6 artifacts changed Changed leaf types summary: 0 leaf type changed Removed/Changed/Added functions summary: 0 Removed, 0 Changed, 4 Added functions Removed/Changed/Added variables summary: 0 Removed, 0 Changed, 2 Added variables 4 Added functions: [A] 'function int __traceiter_android_vh_force_compatible_post(void, void)' [A] 'function int __traceiter_android_vh_force_compatible_pre(void, void)' [A] 'function void cpu_maps_update_begin()' [A] 'function void cpu_maps_update_done()' 2 Added variables: [A] 'tracepoint __tracepoint_android_vh_force_compatible_post' [A] 'tracepoint __tracepoint_android_vh_force_compatible_pre' Bug: 187917024 Change-Id: I02b28f7c34b21a1bfb309fcbd4e9afc306febdd6 Signed-off-by: Shaleen Agrawal <shalagra@codeaurora.org>	2021-06-07 21:59:49 +00:00
Abhijeet Dharmapurikar	3f5e8b830c	ANDROID: sched: create trace points for 32bit execve Module code would like to hold some locks when affinity is being updated for 32 bit task exec. Create pre and post tracepoints in force_compatible_cpus_allowed_ptr() Bug: 187917024 Change-Id: I95bff9f4d5b5d37c1d5440acbd6857d2855c2b43 Signed-off-by: Abhijeet Dharmapurikar <adharmap@codeaurora.org> Signed-off-by: Shaleen Agrawal <shalagra@codeaurora.org>	2021-06-07 21:59:40 +00:00
heshuai1	a1580311c3	ANDROID: freezer: Add vendor hook to freezer for GKI purpose. Add the vendor hook to freezer.c. Because of some special cases related to our feature, we do not want the process to be frozen immediately, so we add the hook at __refrigerator to make sure we can go to our own freeze logic when the process is about to be frozen. Bug: 187458531 Signed-off-by: heshuai1 <heshuai1@xiaomi.com> Change-Id: Iea42fd9604d6b33ccd6502425416f0dd28eecebb	2021-06-07 16:07:44 +00:00
Choonghoon Park	27c285003d	ANDROID: sched: Add vendor hook to select ilb cpu Add android_rvh_find_new_ilb to select a next ilb cpu for vendors. Bug: 190228983 Change-Id: Iba1a0cd9cdc22dcf628dd33f8d838fe513a4818f Signed-off-by: Choonghoon Park <choong.park@samsung.com>	2021-06-07 11:00:05 +00:00
Liangliang Li	0b76ef69f6	ANDROID: sched: Add oem data in struct rq Add ANDROID_OEM_DATA to struct rq, which is used to implement oem's scheduler tuning. Bug: 188899490 Change-Id: I1904b4fd83effc4b309bfb98811e9718398504f4 Signed-off-by: Liangliang Li <liliangliang@vivo.com>	2021-06-04 11:15:20 -07:00
Will Deacon	18eae90751	FROMGIT: timer_list: Print name of per-cpu wakeup device With the introduction of per-cpu wakeup devices that can be used in preference to the broadcast timer, print the name of such devices when they are available. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210524221818.15850-6-will@kernel.org (cherry picked from commit `245a057fee` tip/tip.git timers/core) Signed-off-by: Will Deacon <willdeacon@google.com> Bug: 185092876 Change-Id: I39736cb43702430b722382c802603fdc4188a5c4	2021-06-04 18:33:43 +01:00
Will Deacon	41b08205cb	FROMGIT: tick/broadcast: Program wakeup timer when entering idle if required When configuring the broadcast timer on entry to and exit from deep idle states, prefer a per-CPU wakeup timer if one exists. On entry to idle, stop the tick device and transfer the next event into the oneshot wakeup device, which will serve as the wakeup from idle. To avoid the overhead of additional hardware accesses on exit from idle, leave the timer armed and treat the inevitable interrupt as a (possibly spurious) tick event. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210524221818.15850-5-will@kernel.org (cherry picked from commit `ea5c7f1b9a` tip/tip.git timers/core) Signed-off-by: Will Deacon <willdeacon@google.com> Bug: 185092876 Change-Id: I62a49231e213285f95e9f0cf6a07633984930b56	2021-06-04 18:33:43 +01:00
Will Deacon	130cd0ecfa	FROMGIT: tick/broadcast: Prefer per-cpu oneshot wakeup timers to broadcast Some SoCs have two per-cpu timer implementations where the timer with the higher rating stops in deep idle (i.e. suffers from CLOCK_EVT_FEAT_C3STOP) but is otherwise preferable to the timer with the lower rating. In such a design, selecting the higher rated devices relies on a global broadcast timer and IPIs to wake up from deep idle states. To avoid the reliance on a global broadcast timer and also to reduce the overhead associated with the IPI wakeups, extend tick_install_broadcast_device() to manage per-cpu wakeup timers separately from the broadcast device. For now, these timers remain unused. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20210524221818.15850-4-will@kernel.org (cherry picked from commit `c94a8537df` tip/tip.git timers/core) Signed-off-by: Will Deacon <willdeacon@google.com> Bug: 185092876 Change-Id: I2d2b1bc6333d004846270d3e58dec0dca89a89d1	2021-06-04 18:33:43 +01:00

35,613 commits