linux-uconsole

Author	SHA1	Message	Date
Johannes Weiner	cf46cf40bc	UPSTREAM: psi: Fix a division error in psi poll() The psi window size is a u64 and can be up to 10 seconds right now, which exceeds the lower 32 bits of the variable. We currently use div_u64 for it, which is meant only for 32-bit divisors. The result is garbage pressure sampling values and even potential div0 crashes. Use div64_u64. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: Jingfeng Xie <xiejingfeng@linux.alibaba.com> Link: https://lkml.kernel.org/r/20191203183524.41378-3-hannes@cmpxchg.org Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit `c3466952ca`) Signed-off-by: Suren Baghdasaryan <surenb@google.com> Change-Id: I49fdfd55751d1a2cde19666624c9c5d76dc78dad	2020-02-28 15:30:04 +00:00
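The truncation described in the commit above can be reproduced in userspace. The following is a minimal sketch, not the kernel code: `sketch_div_u64` and `sketch_div64_u64` are hypothetical stand-ins that only mimic the argument widths of `div_u64` and `div64_u64`, and the window is assumed to be held in nanoseconds.

```c
#include <stdint.h>

/* Assumption: a 10 s window expressed in nanoseconds. */
static const uint64_t WINDOW_NS = 10000000000ULL;

/* Hypothetical stand-in for div_u64(): the divisor parameter is only 32 bits
 * wide, so a 64-bit argument is silently truncated at the call site. */
static uint64_t sketch_div_u64(uint64_t dividend, uint32_t divisor)
{
    return dividend / divisor;
}

/* Hypothetical stand-in for div64_u64(): the divisor keeps all 64 bits. */
static uint64_t sketch_div64_u64(uint64_t dividend, uint64_t divisor)
{
    return dividend / divisor;
}

/* What a 32-bit divisor parameter actually receives for the 10 s window. */
static uint32_t truncated_window(void)
{
    return (uint32_t)WINDOW_NS; /* 10000000000 mod 2^32 */
}
```

With these definitions, `truncated_window()` yields 1410065408 rather than 10000000000, so a stall of 5 s over a 10 s window comes out as 354 instead of 50 when scaled to a percentage. A window whose low 32 bits happen to be zero would divide by zero instead.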
Johannes Weiner	55013802e8	UPSTREAM: sched/psi: Fix sampling error and rare div0 crashes with cgroups and high uptime Jingfeng reports rare div0 crashes in psi on systems with some uptime: [58914.066423] divide error: 0000 [#1] SMP [58914.070416] Modules linked in: ipmi_poweroff ipmi_watchdog toa overlay fuse tcp_diag inet_diag binfmt_misc aisqos(O) aisqos_hotfixes(O) [58914.083158] CPU: 94 PID: 140364 Comm: kworker/94:2 Tainted: G W OE K 4.9.151-015.ali3000.alios7.x86_64 #1 [58914.093722] Hardware name: Alibaba Alibaba Cloud ECS/Alibaba Cloud ECS, BIOS 3.23.34 02/14/2019 [58914.102728] Workqueue: events psi_update_work [58914.107258] task: ffff8879da83c280 task.stack: ffffc90059dcc000 [58914.113336] RIP: 0010:[] [] psi_update_stats+0x1c1/0x330 [58914.122183] RSP: 0018:ffffc90059dcfd60 EFLAGS: 00010246 [58914.127650] RAX: 0000000000000000 RBX: ffff8858fe98be50 RCX: 000000007744d640 [58914.134947] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00003594f700648e [58914.142243] RBP: ffffc90059dcfdf8 R08: 0000359500000000 R09: 0000000000000000 [58914.149538] R10: 0000000000000000 R11: 0000000000000000 R12: 0000359500000000 [58914.156837] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8858fe98bd78 [58914.164136] FS: 0000000000000000(0000) GS:ffff887f7f380000(0000) knlGS:0000000000000000 [58914.172529] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [58914.178467] CR2: 00007f2240452090 CR3: 0000005d5d258000 CR4: 00000000007606f0 [58914.185765] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [58914.193061] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [58914.200360] PKRU: 55555554 [58914.203221] Stack: [58914.205383] ffff8858fe98bd48 00000000000002f0 0000002e81036d09 ffffc90059dcfde8 [58914.213168] ffff8858fe98bec8 0000000000000000 0000000000000000 0000000000000000 [58914.220951] 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [58914.228734] Call Trace: [58914.231337] [] psi_update_work+0x22/0x60 
[58914.237067] [] process_one_work+0x189/0x420 [58914.243063] [] worker_thread+0x4e/0x4b0 [58914.248701] [] ? process_one_work+0x420/0x420 [58914.254869] [] kthread+0xe6/0x100 [58914.259994] [] ? kthread_park+0x60/0x60 [58914.265640] [] ret_from_fork+0x39/0x50 [58914.271193] Code: 41 29 c3 4d 39 dc 4d 0f 42 dc <49> f7 f1 48 8b 13 48 89 c7 48 c1 [58914.279691] RIP [] psi_update_stats+0x1c1/0x330 The crashing instruction is trying to divide the observed stall time by the sampling period. The period, stored in R8, is not 0, but we are dividing by the lower 32 bits only, which are all 0 in this instance. We could switch to a 64-bit division, but the period shouldn't be that big in the first place. It's the time between the last update and the next scheduled one, and so should always be around 2s and comfortably fit into 32 bits. The bug is in the initialization of new cgroups: we schedule the first sampling event in a cgroup as an offset of sched_clock(), but fail to initialize the last_update timestamp, and it defaults to 0. That results in a bogusly large sampling period the first time we run the sampling code, and consequently we underreport pressure for the first 2s of a cgroup's life. But worse, if sched_clock() is sufficiently advanced on the system, and the user gets unlucky, the period's lower 32 bits can all be 0 and the sampling division will crash. Fix this by initializing the last update timestamp to the creation time of the cgroup, thus correctly marking the start of the first pressure sampling period in a new cgroup. 
Reported-by: Jingfeng Xie <xiejingfeng@linux.alibaba.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Link: https://lkml.kernel.org/r/20191203183524.41378-2-hannes@cmpxchg.org Signed-off-by: Sasha Levin <sashal@kernel.org> (cherry picked from commit `3dfbe25c27`) Signed-off-by: Suren Baghdasaryan <surenb@google.com> Change-Id: Iaada5c2f1a03cf38cbb053adde478f762ce40843	2020-02-28 15:29:50 +00:00
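The failure mode in this commit can be illustrated with the value the trace shows in R8. This is a simplified sketch under stated assumptions: `sampling_period` and `low32_is_zero` are hypothetical helpers, timestamps are assumed to be nanoseconds, and the real psi code computes the period from more state than shown here.

```c
#include <stdint.h>

/* Value of R8 in the reported oops: the bogus period. */
static const uint64_t REPORTED_PERIOD = 0x0000359500000000ULL;

/* Hypothetical helper: time between the last update and the next one. */
static uint64_t sampling_period(uint64_t next_update, uint64_t last_update)
{
    return next_update - last_update;
}

/* A 32-bit division only sees the low half of the period. */
static int low32_is_zero(uint64_t period)
{
    return (uint32_t)period == 0;
}
```

With `last_update` left at 0, the period equals the clock value itself, and for `0x0000359500000000` its low 32 bits are all zero. Once `last_update` is initialized to the cgroup creation time, the period is about 2 s (2000000000 ns), which fits in 32 bits and is non-zero.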
Miles Chen	88a47f1659	UPSTREAM: sched/psi: Correct overly pessimistic size calculation When a string of 32 bytes or more is passed to psi_write(), psi_write() copies 31 bytes to its buf and overwrites buf[30] with '\0', which makes the input string 1 byte shorter than it should be. Fix it by copying sizeof(buf) bytes when nbytes >= sizeof(buf). This does not cause problems in normal use cases like "some 500000 10000000" or "full 500000 10000000" because they are less than 32 bytes in length. /* assuming nbytes == 35 */ char buf[32]; buf_size = min(nbytes, (sizeof(buf) - 1)); /* buf_size = 31 */ if (copy_from_user(buf, user_buf, buf_size)) return -EFAULT; buf[buf_size - 1] = '\0'; /* buf[30] = '\0' */ Before: %cd /proc/pressure/ %echo "123456789|123456789|123456789|1234" > memory [ 22.473497] nbytes=35,buf_size=31 [ 22.473775] 123456789|123456789|123456789| (print 30 chars) %sh: write error: Invalid argument %echo "123456789|123456789|123456789|1" > memory [ 64.916162] nbytes=32,buf_size=31 [ 64.916331] 123456789|123456789|123456789| (print 30 chars) %sh: write error: Invalid argument After: %cd /proc/pressure/ %echo "123456789|123456789|123456789|1234" > memory [ 254.837863] nbytes=35,buf_size=32 [ 254.838541] 123456789|123456789|123456789|1 (print 31 chars) %sh: write error: Invalid argument %echo "123456789|123456789|123456789|1" > memory [ 9965.714935] nbytes=32,buf_size=32 [ 9965.715096] 123456789|123456789|123456789|1 (print 31 chars) %sh: write error: Invalid argument Also remove the superfluous parentheses. 
Signed-off-by: Miles Chen <miles.chen@mediatek.com> Cc: <linux-mediatek@lists.infradead.org> Cc: <wsd_upstream@mediatek.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190912103452.13281-1-miles.chen@mediatek.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `4adcdcea71`) Signed-off-by: Suren Baghdasaryan <surenb@google.com> Change-Id: I9371b4d5e465bb8b84ff7adf5f40f30696c6ff70	2020-02-28 15:28:23 +00:00
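The before/after behaviour in this commit can be checked with a small userspace sketch. It assumes `memcpy` as a stand-in for `copy_from_user`; `copy_old`, `copy_new` and `min_size` are hypothetical names, and the zero-length guard exists only to keep the sketch safe.

```c
#include <stddef.h>
#include <string.h>

#define BUF_LEN 32

static size_t min_size(size_t a, size_t b)
{
    return a < b ? a : b;
}

/* Old calculation: cap at sizeof(buf) - 1, then terminate at buf_size - 1.
 * Returns the length of the string left in buf. */
static size_t copy_old(char buf[BUF_LEN], const char *src, size_t nbytes)
{
    size_t buf_size;

    if (nbytes == 0)            /* guard added for the sketch */
        return 0;
    buf_size = min_size(nbytes, BUF_LEN - 1);
    memcpy(buf, src, buf_size); /* stand-in for copy_from_user() */
    buf[buf_size - 1] = '\0';
    return strlen(buf);
}

/* New calculation: cap at sizeof(buf), so a long input keeps 31 characters. */
static size_t copy_new(char buf[BUF_LEN], const char *src, size_t nbytes)
{
    size_t buf_size;

    if (nbytes == 0)            /* guard added for the sketch */
        return 0;
    buf_size = min_size(nbytes, BUF_LEN);
    memcpy(buf, src, buf_size);
    buf[buf_size - 1] = '\0';
    return strlen(buf);
}
```

For the 35-byte input from the log (34 characters plus the newline added by `echo`), `copy_old` leaves 30 characters and `copy_new` leaves 31, matching the "print 30 chars" and "print 31 chars" lines. A short trigger such as "some 500000 10000000" is unaffected: in both versions only its trailing newline is replaced by the terminator.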
Sami Tolvanen	8028f78053	ANDROID: Disable wq fp check in CFI builds With non-canonical CFI, LLVM generates jump table entries for external symbols in modules and as a result, a function pointer passed from a module to the core kernel will have a different address. Disable the warning for now. Bug: 145210207 Change-Id: Ifdcee3479280f7b97abdee6b4c746f447e0944e6 Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Alistair Delva <adelva@google.com>	2020-02-27 00:09:59 +00:00
Todd Kjos	08256862e0	ANDROID: increase limit on sched-tune boost groups Some devices need an additional sched-tune boost group to optimize performance for key tasks Bug: 150302001 Change-Id: I392c8cc05a8851f1d416c381b4a27242924c2c27 Signed-off-by: Todd Kjos <tkjos@google.com>	2020-02-26 21:55:29 +00:00
Greg Kroah-Hartman	4dc4199770	This is the 4.19.106 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl5TfLwACgkQONu9yGCS aT5wlRAAhZELK39c78NMCTZKHtKGLsGb2os2IiI7zIRbqNNwnvJi+jAc3kgbS9jP +W+wnhYFtFisDvqdCQ009I6A0NA1p3Nqy166JplW0iIg1e7rgUKKUfabCN9sJmjh HGK913cJlHwGmkSxq//sBucBwWhYYGaHec28pZ7uCFATjWrTaH3G4VrvLStuicYR YgS9MH261tWJKJm5+V2MxnOOI0103+Uey+xVqwSnLlV+qmasxwDCMU5ae+SK7e7f cXIkNZwvDph1zunekHg+jd64GN3GYswXVcRighWP0n7Lr+0tGPN7SY5pvZIjZLv/ sdroyrqAxytTYP32hypIUgsToVvJr7zXD09LGdsgOCKVwFVn8yl1e4zgGKH3L9Xu OK2krI90v1MVevibyaNndZ4UDKilF75oE2YYDOFW/BU1lorFAIzk4hh15CfKc8s1 KHRjePfcgQREs/SGK8k2BAmf/JwxFN1/Ro5dl7MvKn07ZYqx6QOwUoMhgxspIntN 9TlFw6elu1RSwu2BFts9wvoHO1tr7GZBa1cVkNF8qV1rzaGVY68aLDvvHGdffD6W JgX+BCfr6vcN7R4izak1RxzAoqDrRxS0vWoC1vVsPqeIIZydSxpYDquaFnbZm+Wc MRuh5gpQ2PzTXuMLeBB+ig6UnzsAO3x+3yIG/l5ZmmYxJbMFBKU= =zE/i -----END PGP SIGNATURE----- Merge 4.19.106 into android-4.19 Changes in 4.19.106 core: Don't skip generic XDP program execution for cloned SKBs enic: prevent waking up stopped tx queues over watchdog reset net/smc: fix leak of kernel memory to user space net: dsa: tag_qca: Make sure there is headroom for tag net/sched: matchall: add missing validation of TCA_MATCHALL_FLAGS net/sched: flower: add missing validation of TCA_FLOWER_FLAGS Revert "KVM: nVMX: Use correct root level for nested EPT shadow page tables" Revert "KVM: VMX: Add non-canonical check on writes to RTIT address MSRs" KVM: nVMX: Use correct root level for nested EPT shadow page tables drm/gma500: Fixup fbdev stolen size usage evaluation cpu/hotplug, stop_machine: Fix stop_machine vs hotplug order brcmfmac: Fix use after free in brcmf_sdio_readframes() leds: pca963x: Fix open-drain initialization ext4: fix ext4_dax_read/write inode locking sequence for IOCB_NOWAIT ALSA: ctl: allow TLV read operation for callback type of element in locked case gianfar: Fix TX timestamping with a stacked DSA driver pinctrl: sh-pfc: sh7264: Fix CAN function GPIOs 
pxa168fb: Fix the function used to release some memory in an error handling path media: i2c: mt9v032: fix enum mbus codes and frame sizes powerpc/powernv/iov: Ensure the pdn for VFs always contains a valid PE number gpio: gpio-grgpio: fix possible sleep-in-atomic-context bugs in grgpio_irq_map/unmap() iommu/vt-d: Fix off-by-one in PASID allocation char/random: silence a lockdep splat with printk() media: sti: bdisp: fix a possible sleep-in-atomic-context bug in bdisp_device_run() pinctrl: baytrail: Do not clear IRQ flags on direct-irq enabled pins efi/x86: Map the entire EFI vendor string before copying it MIPS: Loongson: Fix potential NULL dereference in loongson3_platform_init() sparc: Add .exit.data section. uio: fix a sleep-in-atomic-context bug in uio_dmem_genirq_irqcontrol() usb: gadget: udc: fix possible sleep-in-atomic-context bugs in gr_probe() usb: dwc2: Fix IN FIFO allocation clocksource/drivers/bcm2835_timer: Fix memory leak of timer kselftest: Minimise dependency of get_size on C library interfaces jbd2: clear JBD2_ABORT flag before journal_reset to update log tail info when load journal x86/sysfb: Fix check for bad VRAM size pwm: omap-dmtimer: Simplify error handling s390/pci: Fix possible deadlock in recover_store() powerpc/iov: Move VF pdev fixup into pcibios_fixup_iov() tracing: Fix tracing_stat return values in error handling paths tracing: Fix very unlikely race of registering two stat tracers ARM: 8952/1: Disable kmemleak on XIP kernels ext4, jbd2: ensure panic when aborting with zero errno ath10k: Correct the DMA direction for management tx buffers drm/amd/display: Retrain dongles when SINK_COUNT becomes non-zero nbd: add a flush_workqueue in nbd_start_device KVM: s390: ENOTSUPP -> EOPNOTSUPP fixups kconfig: fix broken dependency in randconfig-generated .config clk: qcom: rcg2: Don't crash if our parent can't be found; return an error drm/amdgpu: remove 4 set but not used variable in amdgpu_atombios_get_connector_info_from_object_table 
drm/amdgpu: Ensure ret is always initialized when using SOC15_WAIT_ON_RREG regulator: rk808: Lower log level on optional GPIOs being not available net/wan/fsl_ucc_hdlc: reject muram offsets above 64K NFC: port100: Convert cpu_to_le16(le16_to_cpu(E1) + E2) to use le16_add_cpu(). selinux: fall back to ref-walk if audit is required arm64: dts: allwinner: H6: Add PMU mode arm: dts: allwinner: H3: Add PMU node selinux: ensure we cleanup the internal AVC counters on error in avc_insert() arm64: dts: qcom: msm8996: Disable USB2 PHY suspend by core ARM: dts: imx6: rdu2: Disable WP for USDHC2 and USDHC3 ARM: dts: imx6: rdu2: Limit USBH1 to Full Speed PCI: iproc: Apply quirk_paxc_bridge() for module as well as built-in media: cx23885: Add support for AVerMedia CE310B PCI: Add generic quirk for increasing D3hot delay PCI: Increase D3 delay for AMD Ryzen5/7 XHCI controllers media: v4l2-device.h: Explicitly compare grp{id,mask} to zero in v4l2_device macros reiserfs: Fix spurious unlock in reiserfs_fill_super() error handling r8169: check that Realtek PHY driver module is loaded fore200e: Fix incorrect checks of NULL pointer dereference netfilter: nft_tunnel: add the missing ERSPAN_VERSION nla_policy ALSA: usx2y: Adjust indentation in snd_usX2Y_hwdep_dsp_status b43legacy: Fix -Wcast-function-type ipw2x00: Fix -Wcast-function-type iwlegacy: Fix -Wcast-function-type rtlwifi: rtl_pci: Fix -Wcast-function-type orinoco: avoid assertion in case of NULL pointer ACPICA: Disassembler: create buffer fields in ACPI_PARSE_LOAD_PASS1 scsi: ufs: Complete pending requests in host reset and restore path scsi: aic7xxx: Adjust indentation in ahc_find_syncrate drm/mediatek: handle events when enabling/disabling crtc ARM: dts: r8a7779: Add device node for ARM global timer selinux: ensure we cleanup the internal AVC counters on error in avc_update() dmaengine: Store module owner in dma_device struct dmaengine: imx-sdma: Fix memory leak crypto: chtls - Fixed memory leak x86/vdso: Provide missing 
include file PM / devfreq: rk3399_dmc: Add COMPILE_TEST and HAVE_ARM_SMCCC dependency pinctrl: sh-pfc: sh7269: Fix CAN function GPIOs reset: uniphier: Add SCSSI reset control for each channel RDMA/rxe: Fix error type of mmap_offset clk: sunxi-ng: add mux and pll notifiers for A64 CPU clock ALSA: sh: Fix unused variable warnings clk: uniphier: Add SCSSI clock gate for each channel ALSA: sh: Fix compile warning wrt const tools lib api fs: Fix gcc9 stringop-truncation compilation error ACPI: button: Add DMI quirk for Razer Blade Stealth 13 late 2019 lid switch mlx5: work around high stack usage with gcc drm: remove the newline for CRC source name. ARM: dts: stm32: Add power-supply for DSI panel on stm32f469-disco usbip: Fix unsafe unaligned pointer usage udf: Fix free space reporting for metadata and virtual partitions staging: rtl8188: avoid excessive stack usage IB/hfi1: Add software counter for ctxt0 seq drop soc/tegra: fuse: Correct straps' address for older Tegra124 device trees efi/x86: Don't panic or BUG() on non-critical error conditions rcu: Use WRITE_ONCE() for assignments to ->pprev for hlist_nulls Input: edt-ft5x06 - work around first register access error x86/nmi: Remove irq_work from the long duration NMI handler wan: ixp4xx_hss: fix compile-testing on 64-bit ASoC: atmel: fix build error with CONFIG_SND_ATMEL_SOC_DMA=m tty: synclinkmp: Adjust indentation in several functions tty: synclink_gt: Adjust indentation in several functions visorbus: fix uninitialized variable access driver core: platform: Prevent resouce overflow from causing infinite loops driver core: Print device when resources present in really_probe() bpf: Return -EBADRQC for invalid map type in __bpf_tx_xdp_map vme: bridges: reduce stack usage drm/nouveau/secboot/gm20b: initialize pointer in gm20b_secboot_new() drm/nouveau/gr/gk20a,gm200-: add terminators to method lists read from fw drm/nouveau: Fix copy-paste error in nouveau_fence_wait_uevent_handler drm/nouveau/drm/ttm: Remove set but 
not used variable 'mem' drm/nouveau/fault/gv100-: fix memory leak on module unload drm/vmwgfx: prevent memory leak in vmw_cmdbuf_res_add usb: musb: omap2430: Get rid of musb .set_vbus for omap2430 glue iommu/arm-smmu-v3: Use WRITE_ONCE() when changing validity of an STE f2fs: set I_LINKABLE early to avoid wrong access by vfs f2fs: free sysfs kobject scsi: iscsi: Don't destroy session if there are outstanding connections arm64: fix alternatives with LLVM's integrated assembler drm/amd/display: fixup DML dependencies watchdog/softlockup: Enforce that timestamp is valid on boot f2fs: fix memleak of kobject x86/mm: Fix NX bit clearing issue in kernel_map_pages_in_pgd pwm: omap-dmtimer: Remove PWM chip in .remove before making it unfunctional cmd64x: potential buffer overflow in cmd64x_program_timings() ide: serverworks: potential overflow in svwks_set_pio_mode() pwm: Remove set but not set variable 'pwm' btrfs: fix possible NULL-pointer dereference in integrity checks btrfs: safely advance counter when looking up bio csums btrfs: device stats, log when stats are zeroed module: avoid setting info->name early in case we can fall back to info->mod->name remoteproc: Initialize rproc_class before use irqchip/mbigen: Set driver .suppress_bind_attrs to avoid remove problems ALSA: hda/hdmi - add retry logic to parse_intel_hdmi() kbuild: use -S instead of -E for precise cc-option test in Kconfig x86/decoder: Add TEST opcode to Group3-2 s390: adjust -mpacked-stack support check for clang 10 s390/ftrace: generate traced function stack frame driver core: platform: fix u32 greater or equal to zero comparison ALSA: hda - Add docking station support for Lenovo Thinkpad T420s drm/nouveau/mmu: fix comptag memory leak powerpc/sriov: Remove VF eeh_dev state when disabling SR-IOV bcache: cached_dev_free needs to put the sb page iommu/vt-d: Remove unnecessary WARN_ON_ONCE() selftests: bpf: Reset global state between reuseport test runs jbd2: switch to use jbd2_journal_abort() when failed 
to submit the commit record jbd2: make sure ESHUTDOWN to be recorded in the journal superblock ARM: 8951/1: Fix Kexec compilation issue. hostap: Adjust indentation in prism2_hostapd_add_sta iwlegacy: ensure loop counter addr does not wrap and cause an infinite loop cifs: fix NULL dereference in match_prepath bpf: map_seq_next should always increase position index ceph: check availability of mds cluster on mount after wait timeout rbd: work around -Wuninitialized warning irqchip/gic-v3: Only provision redistributors that are enabled in ACPI drm/nouveau/disp/nv50-: prevent oops when no channel method map provided ftrace: fpid_next() should increase position index trigger_next should increase position index radeon: insert 10ms sleep in dce5_crtc_load_lut ocfs2: fix a NULL pointer dereference when call ocfs2_update_inode_fsync_trans() lib/scatterlist.c: adjust indentation in __sg_alloc_table reiserfs: prevent NULL pointer dereference in reiserfs_insert_item() bcache: explicity type cast in bset_bkey_last() irqchip/gic-v3-its: Reference to its_invall_cmd descriptor when building INVALL iwlwifi: mvm: Fix thermal zone registration microblaze: Prevent the overflow of the start brd: check and limit max_part par drm/amdgpu/smu10: fix smu10_get_clock_by_type_with_latency drm/amdgpu/smu10: fix smu10_get_clock_by_type_with_voltage NFS: Fix memory leaks help_next should increase position index cifs: log warning message (once) if out of disk space virtio_balloon: prevent pfn array overflow mlxsw: spectrum_dpipe: Add missing error path drm/amdgpu/display: handle multiple numbers of fclks in dcn_calcs.c (v2) Linux 4.19.106 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ia1032b50dd82b42e13973120dcbf94ae7b864648	2020-02-24 09:13:25 +01:00
Vasily Averin	9ed840b756	trigger_next should increase position index [ Upstream commit `6722b23e7a` ] If the seq_file .next function does not change the position index, a read after some lseek can generate unexpected output. Without patch: # dd bs=30 skip=1 if=/sys/kernel/tracing/events/sched/sched_switch/trigger dd: /sys/kernel/tracing/events/sched/sched_switch/trigger: cannot skip to specified offset n traceoff snapshot stacktrace enable_event disable_event enable_hist disable_hist hist # Available triggers: # traceon traceoff snapshot stacktrace enable_event disable_event enable_hist disable_hist hist 6+1 records in 6+1 records out 206 bytes copied, 0.00027916 s, 738 kB/s Notice the printing of "# Available triggers:..." after the line. With the patch: # dd bs=30 skip=1 if=/sys/kernel/tracing/events/sched/sched_switch/trigger dd: /sys/kernel/tracing/events/sched/sched_switch/trigger: cannot skip to specified offset n traceoff snapshot stacktrace enable_event disable_event enable_hist disable_hist hist 2+1 records in 2+1 records out 88 bytes copied, 0.000526867 s, 167 kB/s It only prints the end of the file, and does not restart. Link: http://lkml.kernel.org/r/3c35ee24-dd3a-8119-9c19-552ed253388a@virtuozzo.com https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-24 08:34:52 +01:00
Vasily Averin	ddb005d906	ftrace: fpid_next() should increase position index [ Upstream commit `e4075e8bdf` ] If the seq_file .next function does not change the position index, a read after some lseek can generate unexpected output. Without patch: # dd bs=4 skip=1 if=/sys/kernel/tracing/set_ftrace_pid dd: /sys/kernel/tracing/set_ftrace_pid: cannot skip to specified offset id no pid 2+1 records in 2+1 records out 10 bytes copied, 0.000213285 s, 46.9 kB/s Notice the "id" followed by "no pid". With the patch: # dd bs=4 skip=1 if=/sys/kernel/tracing/set_ftrace_pid dd: /sys/kernel/tracing/set_ftrace_pid: cannot skip to specified offset id 0+1 records in 0+1 records out 3 bytes copied, 0.000202112 s, 14.8 kB/s Notice that it only prints "id" and not the "no pid" afterward. Link: http://lkml.kernel.org/r/4f87c6ad-f114-30bb-8506-c32274ce2992@virtuozzo.com https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-24 08:34:52 +01:00
Vasily Averin	ca2b459365	bpf: map_seq_next should always increase position index [ Upstream commit `90435a7891` ] If the seq_file .next function does not change the position index, a read after some lseek can generate unexpected output. See also: https://bugzilla.kernel.org/show_bug.cgi?id=206283 v1 -> v2: removed missed increment in end of function Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/eca84fdd-c374-a154-d874-6c7b55fc3bc4@virtuozzo.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-24 08:34:51 +01:00
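The three commits above fix the same pattern: a .next callback that returns the end marker without advancing the position. The sketch below models that pattern in userspace. It is not the kernel's seq_file API; `sketch_start`, `next_buggy`, `next_fixed`, `walk` and the item list are all hypothetical.

```c
#include <stddef.h>

static const char *const triggers[] = { "traceon", "traceoff", "snapshot" };
#define N_TRIGGERS (sizeof(triggers) / sizeof(triggers[0]))

/* Return the item at pos, or NULL when pos is past the end. */
static const char *sketch_start(size_t pos)
{
    return pos < N_TRIGGERS ? triggers[pos] : NULL;
}

/* Buggy iterator step: at the end it returns NULL but leaves *pos alone. */
static const char *next_buggy(size_t *pos)
{
    if (*pos + 1 >= N_TRIGGERS)
        return NULL;            /* bug: position index not advanced */
    ++*pos;
    return triggers[*pos];
}

/* Fixed iterator step: the position index always moves forward. */
static const char *next_fixed(size_t *pos)
{
    ++*pos;
    return sketch_start(*pos);
}

/* Walk the whole list once and report where the position index ended up. */
static size_t walk(const char *(*next)(size_t *))
{
    size_t pos = 0;
    const char *item = sketch_start(pos);

    while (item)
        item = next(&pos);
    return pos;
}
```

After a full walk with `next_buggy`, the position still points at the last valid item, so a later `sketch_start` at that position yields "snapshot" again. That is the kind of repeated output the `dd` transcripts show. With `next_fixed`, the position ends past the list and the next start returns NULL.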
Jessica Yu	c371b1e41f	module: avoid setting info->name early in case we can fall back to info->mod->name [ Upstream commit `708e0ada19` ] In setup_load_info(), info->name (which contains the name of the module, mostly used for early logging purposes before the module gets set up) gets unconditionally assigned if .modinfo is missing despite the fact that there is an if (!info->name) check near the end of the function. Avoid assigning a placeholder string to info->name if .modinfo doesn't exist, so that we can fall back to info->mod->name later on. Fixes: `5fdc7db644` ("module: setup load info before module_sig_check()") Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-24 08:34:49 +01:00
Thomas Gleixner	c2913e2c50	watchdog/softlockup: Enforce that timestamp is valid on boot [ Upstream commit `11e31f608b` ] Robert reported that during boot the watchdog timestamp is set to 0 for one second which is the indicator for a watchdog reset. The reason for this is that the timestamp is in seconds and the time is taken from sched clock and divided by ~1e9. sched clock starts at 0 which means that for the first second during boot the watchdog timestamp is 0, i.e. reset. Use ULONG_MAX as the reset indicator value so the watchdog works correctly right from the start. ULONG_MAX would only conflict with a real timestamp if the system reaches an uptime of 136 years on 32bit and almost eternity on 64bit. Reported-by: Robert Richter <rrichter@marvell.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/87o8v3uuzl.fsf@nanos.tec.linutronix.de Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-24 08:34:49 +01:00
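The sentinel problem in this commit can be shown with a few lines of userspace C. This is a sketch under assumptions: the timestamp is modelled as nanoseconds shifted right by 30 bits (a division by roughly 1e9), and `ts_from_ns` and `looks_reset` are hypothetical names.

```c
#include <limits.h>
#include <stdint.h>

#define RESET_OLD 0UL        /* sentinel before the fix */
#define RESET_NEW ULONG_MAX  /* sentinel after the fix */

/* Assumed conversion: nanoseconds to (approximate) seconds. */
static unsigned long ts_from_ns(uint64_t ns)
{
    return (unsigned long)(ns >> 30);
}

/* Does this timestamp collide with the "watchdog was reset" marker? */
static int looks_reset(unsigned long ts, unsigned long sentinel)
{
    return ts == sentinel;
}
```

Half a second after boot, `ts_from_ns(500000000)` is 0, which is indistinguishable from `RESET_OLD` but not from `RESET_NEW`. Two seconds in, the timestamp is 1 and neither sentinel matches.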
Steven Rostedt (VMware)	56d3793229	tracing: Fix very unlikely race of registering two stat tracers [ Upstream commit `dfb6cd1e65` ] Looking through old emails in my INBOX, I came across a patch from Luis Henriques that attempted to fix a race of two stat tracers registering the same stat trace (extremely unlikely, as this is done in the kernel, and probably doesn't even exist). The submitted patch wasn't quite right as it needed to deal with clean up a bit better (if two stat tracers were the same, it would have the same files). But to make the code cleaner, all we needed to do is to keep the all_stat_sessions_mutex held for most of the registering function. Link: http://lkml.kernel.org/r/1410299375-20068-1-git-send-email-luis.henriques@canonical.com Fixes: `002bb86d8d` ("tracing/ftrace: separate events tracing and stats tracing engine") Reported-by: Luis Henriques <luis.henriques@canonical.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-24 08:34:39 +01:00
Luis Henriques	fb0085070a	tracing: Fix tracing_stat return values in error handling paths [ Upstream commit `afccc00f75` ] tracing_stat_init() was always returning '0', even on the error paths. It now returns -ENODEV if tracing_init_dentry() fails or -ENOMEM if it fails to create the 'trace_stat' debugfs directory. Link: http://lkml.kernel.org/r/1410299381-20108-1-git-send-email-luis.henriques@canonical.com Fixes: `ed6f1c996b` ("tracing: Check return value of tracing_init_dentry()") Signed-off-by: Luis Henriques <luis.henriques@canonical.com> [ Pulled from the archeological digging of my INBOX ] Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-24 08:34:39 +01:00
Peter Zijlstra	b9dc4d61b5	cpu/hotplug, stop_machine: Fix stop_machine vs hotplug order [ Upstream commit `45178ac0ce` ] Paul reported a very sporadic, rcutorture induced, workqueue failure. When the planets align, the workqueue rescuer's self-migrate fails and then triggers a WARN for running a work on the wrong CPU. Tejun then figured that set_cpus_allowed_ptr()'s stop_one_cpu() call could be ignored! When stopper->enabled is false, stop_machine will insta complete the work, without actually doing the work. Worse, it will not WARN about this (we really should fix this). It turns out there is a small window where a freshly online'ed CPU is marked 'online' but doesn't yet have the stopper task running: BP AP bringup_cpu() __cpu_up(cpu, idle) --> start_secondary() ... cpu_startup_entry() bringup_wait_for_ap() wait_for_ap_thread() <-- cpuhp_online_idle() while (1) do_idle() ... available to run kthreads ... stop_machine_unpark() stopper->enable = true; Close this by moving the stop_machine_unpark() into cpuhp_online_idle(), such that the stopper thread is ready before we start the idle loop and schedule. Reported-by: "Paul E. McKenney" <paulmck@kernel.org> Debugged-by: Tejun Heo <tj@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: "Paul E. McKenney" <paulmck@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-24 08:34:35 +01:00
Quentin Perret	2a557de670	UPSTREAM: sched/topology: Introduce a sysctl for Energy Aware Scheduling In its current state, Energy Aware Scheduling (EAS) starts automatically on asymmetric platforms having an Energy Model (EM). However, there are users who want to have an EM (for thermal management for example), but don't want EAS with it. In order to let users disable EAS explicitly, introduce a new sysctl called 'sched_energy_aware'. It is enabled by default so that EAS can start automatically on platforms where it makes sense. Flipping it to 0 rebuilds the scheduling domains and disables EAS. Bug: 120440300 Signed-off-by: Quentin Perret <quentin.perret@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: adharmap@codeaurora.org Cc: chris.redpath@arm.com Cc: currojerez@riseup.net Cc: dietmar.eggemann@arm.com Cc: edubezval@gmail.com Cc: gregkh@linuxfoundation.org Cc: javi.merino@kernel.org Cc: joel@joelfernandes.org Cc: juri.lelli@redhat.com Cc: morten.rasmussen@arm.com Cc: patrick.bellasi@arm.com Cc: pkondeti@codeaurora.org Cc: rjw@rjwysocki.net Cc: skannan@codeaurora.org Cc: smuckle@google.com Cc: srinivas.pandruvada@linux.intel.com Cc: thara.gopinath@linaro.org Cc: tkjos@google.com Cc: valentin.schneider@arm.com Cc: vincent.guittot@linaro.org Cc: viresh.kumar@linaro.org Link: https://lkml.kernel.org/r/20181203095628.11858-11-quentin.perret@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `8d5d0cfb63`) Signed-off-by: Quentin Perret <qperret@google.com> Change-Id: I4ca842d07b82869cfab7542c8c4351f631e1024d	2020-02-19 10:50:59 +00:00
Greg Kroah-Hartman	4eee97caec	This is the 4.19.104 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl5HEigACgkQONu9yGCS aT7Gcw/6AkDGkK5U/aDpKMqWiRmZUqDIg8U9xR+44Gl57Q71vicrzq8NGPHxxbsF slWoCyXLVSD7bMWGsTD0qJR8muROAraMxDl8dCxojEXHnXFMx4A4Cf0h1E0lY0mu Jq/O9m33ZMSppjio88sCcLpo0pbXF+cCX1CY87NI5QUitUzHgRh18W8BtyFpMMI8 eC0Fc+hMWax3+qqHt/hFVpufaTKm35zLCpGjGAJiHd7GFvqUJnuAzBYCs1Cf8NO1 KrrL3l/IWk8z3Z0Wc9PbBz309a9H6FVpjrXSXj6URkxjtqJ0F0mBMaIYxhaUF8PD CHY5xLyqKodC8/7O5zNOrP80oT9nqJvsmKwUwlG34IJuMVaq/o+hZu+88JVB02Yw v9XVcaQda5aZgWF9cBWzFQEcNwHFDCQ9VNidLDcHJLGPyFo/BogvMo8T4yPM9tI0 O0PSFm/yYu0airZSCzIbPzuF2Iv+iilVtq+o10VRDsGtEYAOzTL7nA01MkdXFhwy 4V+Q51C90TGo13BnnZ6xpEqjspuDWgeOD71/xkQ5cnyFgam0XQq/5R6JJghJIHOP 7p8NMMyNhK2FnOGrFUgqvwBCp6Dap1ISZyKvie1Z8vuCJsZcwMVIw8fxAzoZWOjj MlmmePjlbC7XTFxjdo0jrQTdvBwq+gFgNitD7UAlfHAdqKJKKA4= =8ktI -----END PGP SIGNATURE----- Merge 4.19.104 into android-4.19 Changes in 4.19.104 ASoC: pcm: update FE/BE trigger order based on the command hv_sock: Remove the accept port restriction IB/mlx4: Fix memory leak in add_gid error flow RDMA/netlink: Do not always generate an ACK for some netlink operations RDMA/core: Fix locking in ib_uverbs_event_read RDMA/uverbs: Verify MR access flags scsi: ufs: Fix ufshcd_probe_hba() reture value in case ufshcd_scsi_add_wlus() fails PCI/IOV: Fix memory leak in pci_iov_add_virtfn() ath10k: pci: Only dump ATH10K_MEM_REGION_TYPE_IOREG when safe PCI/switchtec: Fix vep_vector_number ioread width PCI: Don't disable bridge BARs when assigning bus resources nfs: NFS_SWAP should depend on SWAP NFS: Revalidate the file size on a fatal write error NFS/pnfs: Fix pnfs_generic_prepare_to_resend_writes() NFSv4: try lease recovery on NFS4ERR_EXPIRED serial: uartps: Add a timeout to the tx empty wait gpio: zynq: Report gpio direction at boot spi: spi-mem: Add extra sanity checks on the op param spi: spi-mem: Fix inverted logic in op sanity check rtc: hym8563: Return -EINVAL if the time is known 
to be invalid rtc: cmos: Stop using shared IRQ ARC: [plat-axs10x]: Add missing multicast filter number to GMAC node platform/x86: intel_mid_powerbtn: Take a copy of ddata ARM: dts: at91: Reenable UART TX pull-ups ARM: dts: am43xx: add support for clkout1 clock ARM: dts: at91: sama5d3: fix maximum peripheral clock rates ARM: dts: at91: sama5d3: define clock rate range for tcb1 tools/power/acpi: fix compilation error powerpc/pseries/vio: Fix iommu_table use-after-free refcount warning powerpc/pseries: Allow not having ibm, hypertas-functions::hcall-multi-tce for DDW iommu/arm-smmu-v3: Populate VMID field for CMDQ_OP_TLBI_NH_VA KVM: arm/arm64: vgic-its: Fix restoration of unmapped collections ARM: 8949/1: mm: mark free_memmap as __init arm64: cpufeature: Fix the type of no FP/SIMD capability arm64: ptrace: nofpsimd: Fail FP/SIMD regset operations KVM: arm/arm64: Fix young bit from mmu notifier KVM: arm: Fix DFSR setting for non-LPAE aarch32 guests KVM: arm: Make inject_abt32() inject an external abort instead KVM: arm64: pmu: Don't increment SW_INCR if PMCR.E is unset mtd: onenand_base: Adjust indentation in onenand_read_ops_nolock mtd: sharpslpart: Fix unsigned comparison to zero crypto: artpec6 - return correct error code for failed setkey() crypto: atmel-sha - fix error handling when setting hmac key media: i2c: adv748x: Fix unsafe macros pinctrl: sh-pfc: r8a7778: Fix duplicate SDSELF_B and SD1_CLK_B mwifiex: Fix possible buffer overflows in mwifiex_ret_wmm_get_status() mwifiex: Fix possible buffer overflows in mwifiex_cmd_append_vsie_tlv() libertas: don't exit from lbs_ibss_join_existing() with RCU read lock held libertas: make lbs_ibss_join_existing() return error code on rates overflow scsi: megaraid_sas: Do not initiate OCR if controller is not in ready state x86/stackframe: Move ENCODE_FRAME_POINTER to asm/frame.h x86/stackframe, x86/ftrace: Add pt_regs frame annotations serial: uartps: Move the spinlock after the read of the tx empty padata: fix null pointer 
deref of pd->pinst Linux 4.19.104 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I42a465b140183dcc8cf49e19903d0e8f4b688930	2020-02-19 08:31:05 +01:00
Daniel Jordan	cad926f70b	padata: fix null pointer deref of pd->pinst The 4.19 backport `dc34710a7a` ("padata: Remove broken queue flushing") removed padata_alloc_pd()'s assignment to pd->pinst, resulting in: Unable to handle kernel NULL pointer dereference ... ... pc : padata_reorder+0x144/0x2e0 ... Call trace: padata_reorder+0x144/0x2e0 padata_do_serial+0xc8/0x128 pcrypt_aead_enc+0x60/0x70 [pcrypt] padata_parallel_worker+0xd8/0x138 process_one_work+0x1bc/0x4b8 worker_thread+0x164/0x580 kthread+0x134/0x138 ret_from_fork+0x10/0x18 This happened because the backport was based on an enhancement that moved this assignment but isn't in 4.19: `bfde23ce20` ("padata: unbind parallel jobs from specific CPUs") Simply restore the assignment to fix the crash. Fixes: `dc34710a7a` ("padata: Remove broken queue flushing") Reported-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Sasha Levin <sashal@kernel.org> Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-14 16:33:28 -05:00
Greg Kroah-Hartman	3389e56d31	This is the 4.19.103 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl5Cn0wACgkQONu9yGCS aT584xAAtePSlzTxst/jukREoyrpAfTM1BeovMdsZEBpKh+/F3n1udqHeo+iNAAN qSOig012aW2qP7b5/4CrEU9ZRTvd0AM4fog7ABLJVahMYMqoJgod8TRaE4v0nVut eRans6w3NbZJCZwdw2aiu5gwFfjwJLSUckBNmj4XVYdyfh7q0BgnZV5OY0V+zhuG 1MWXaylbRqjguR/ZFk0UPAmRaqNKHbwfCJ1V0ygL9xQkJM0cUn7hX9/CqM4aYnm6 m1oux4ektLAmF1XK4NiQEuRBMeFO74XlKcsZqQHf/b4FZfcPergcPwIj8ugtCHzJ kx2QgURDjgH4Tnu+Q0ScPrjj2kjU8rWmjqlcv1PcUyOWm+MR0OK9bW7TLEntMSF8 HOEe9j6SsjQNIOoYh1YcMnuGjKNIZjl2L3VbDzpVN2GxZxwAutY6G68tV7sbA2pu wtsrAVOqdcjoo0ruRmwognBqQAdNdsbiBx7bgcNjVEXWL0N3Ddiv6CNYwnehA5Hq cvQwVQpFGP9ZGYUcCMbdwR+7kJzVy6V2S615M8GkE9FouOwTfV60zM/sZ1rFVt1J 70zxfRX5ys19aTAVkbi6pHHCUJ0ZAiTgWujp5Hp4kPt7gEz01Ur0s1kI3b7b6iWh cuycRFULvqeXCApQacs//lOVDoUV20uFcL/zqOFM33v/+YzkyjA= =3D8z -----END PGP SIGNATURE----- Merge 4.19.103 into android-4.19 Changes in 4.19.103 Revert "drm/sun4i: dsi: Change the start delay calculation" ovl: fix lseek overflow on 32bit kernel/module: Fix memleak in module_add_modinfo_attrs() media: iguanair: fix endpoint sanity check ocfs2: fix oops when writing cloned file x86/cpu: Update cached HLE state on write to TSX_CTRL_CPUID_CLEAR udf: Allow writing to 'Rewritable' partitions printk: fix exclusive_console replaying iwlwifi: mvm: fix NVM check for 3168 devices sparc32: fix struct ipc64_perm type definition cls_rsvp: fix rsvp_policy gtp: use __GFP_NOWARN to avoid memalloc warning l2tp: Allow duplicate session creation with UDP net: hsr: fix possible NULL deref in hsr_handle_frame() net_sched: fix an OOB access in cls_tcindex net: stmmac: Delete txtimer in suspend() bnxt_en: Fix TC queue mapping. 
tcp: clear tp->total_retrans in tcp_disconnect() tcp: clear tp->delivered in tcp_disconnect() tcp: clear tp->data_segs{in\|out} in tcp_disconnect() tcp: clear tp->segs_{in\|out} in tcp_disconnect() rxrpc: Fix use-after-free in rxrpc_put_local() rxrpc: Fix insufficient receive notification generation rxrpc: Fix missing active use pinning of rxrpc_local object rxrpc: Fix NULL pointer deref due to call->conn being cleared on disconnect media: uvcvideo: Avoid cyclic entity chains due to malformed USB descriptors mfd: dln2: More sanity checking for endpoints ipc/msg.c: consolidate all xxxctl_down() functions tracing: Fix sched switch start/stop refcount racy updates rcu: Avoid data-race in rcu_gp_fqs_check_wake() brcmfmac: Fix memory leak in brcmf_usbdev_qinit usb: typec: tcpci: mask event interrupts when remove driver usb: gadget: legacy: set max_speed to super-speed usb: gadget: f_ncm: Use atomic_t to track in-flight request usb: gadget: f_ecm: Use atomic_t to track in-flight request ALSA: usb-audio: Fix endianess in descriptor validation ALSA: dummy: Fix PCM format loop in proc output mm/memory_hotplug: fix remove_memory() lockdep splat mm: move_pages: report the number of non-attempted pages media/v4l2-core: set pages dirty upon releasing DMA buffers media: v4l2-core: compat: ignore native command codes media: v4l2-rect.h: fix v4l2_rect_map_inside() top/left adjustments lib/test_kasan.c: fix memory leak in kmalloc_oob_krealloc_more() irqdomain: Fix a memory leak in irq_domain_push_irq() platform/x86: intel_scu_ipc: Fix interrupt support ALSA: hda: Add Clevo W65_67SB the power_save blacklist KVM: arm64: Correct PSTATE on exception entry KVM: arm/arm64: Correct CPSR on exception entry KVM: arm/arm64: Correct AArch32 SPSR on exception entry KVM: arm64: Only sign-extend MMIO up to register width MIPS: fix indentation of the 'RELOCS' message MIPS: boot: fix typo in 'vmlinux.lzma.its' target s390/mm: fix dynamic pagetable upgrade for hugetlbfs powerpc/xmon: don't access 
ASDR in VMs powerpc/pseries: Advance pfn if section is not present in lmb_is_removable() smb3: fix signing verification of large reads PCI: tegra: Fix return value check of pm_runtime_get_sync() mmc: spi: Toggle SPI polarity, do not hardcode it ACPI: video: Do not export a non working backlight interface on MSI MS-7721 boards ACPI / battery: Deal with design or full capacity being reported as -1 ACPI / battery: Use design-cap for capacity calculations if full-cap is not available ACPI / battery: Deal better with neither design nor full capacity not being reported alarmtimer: Unregister wakeup source when module get fails ubifs: Reject unsupported ioctl flags explicitly ubifs: don't trigger assertion on invalid no-key filename ubifs: Fix FS_IOC_SETFLAGS unexpectedly clearing encrypt flag ubifs: Fix deadlock in concurrent bulk-read and writepage crypto: geode-aes - convert to skcipher API and make thread-safe PCI: keystone: Fix link training retries initiation mmc: sdhci-of-at91: fix memleak on clk_get failure hv_balloon: Balloon up according to request page number mfd: axp20x: Mark AXP20X_VBUS_IPSOUT_MGMT as volatile crypto: api - Check spawn->alg under lock in crypto_drop_spawn crypto: ccree - fix backlog memory leak crypto: ccree - fix pm wrongful error reporting crypto: ccree - fix PM race condition scripts/find-unused-docs: Fix massive false positives scsi: qla2xxx: Fix mtcp dump collection failure power: supply: ltc2941-battery-gauge: fix use-after-free ovl: fix wrong WARN_ON() in ovl_cache_update_ino() f2fs: choose hardlimit when softlimit is larger than hardlimit in f2fs_statfs_project() f2fs: fix miscounted block limit in f2fs_statfs_project() f2fs: code cleanup for f2fs_statfs_project() PM: core: Fix handling of devices deleted during system-wide resume of: Add OF_DMA_DEFAULT_COHERENT & select it on powerpc dm zoned: support zone sizes smaller than 128MiB dm space map common: fix to ensure new block isn't already in use dm crypt: fix benbi IV constructor 
crash if used in authenticated mode dm: fix potential for q->make_request_fn NULL pointer dm writecache: fix incorrect flush sequence when doing SSD mode commit padata: Remove broken queue flushing tracing: Annotate ftrace_graph_hash pointer with __rcu tracing: Annotate ftrace_graph_notrace_hash pointer with __rcu ftrace: Add comment to why rcu_dereference_sched() is open coded ftrace: Protect ftrace_graph_hash with ftrace_sync samples/bpf: Don't try to remove user's homedir on clean crypto: ccp - set max RSA modulus size for v3 platform devices as well crypto: pcrypt - Do not clear MAY_SLEEP flag in original request crypto: atmel-aes - Fix counter overflow in CTR mode crypto: api - Fix race condition in crypto_spawn_alg crypto: picoxcell - adjust the position of tasklet_init and fix missed tasklet_kill scsi: qla2xxx: Fix unbound NVME response length NFS: Fix memory leaks and corruption in readdir NFS: Directory page cache pages need to be locked when read jbd2_seq_info_next should increase position index Btrfs: fix missing hole after hole punching and fsync when using NO_HOLES btrfs: set trans->drity in btrfs_commit_transaction Btrfs: fix race between adding and putting tree mod seq elements and nodes ARM: tegra: Enable PLLP bypass during Tegra124 LP1 iwlwifi: don't throw error when trying to remove IGTK mwifiex: fix unbalanced locking in mwifiex_process_country_ie() sunrpc: expiry_time should be seconds not timeval gfs2: move setting current->backing_dev_info gfs2: fix O_SYNC write handling drm/rect: Avoid division by zero media: rc: ensure lirc is initialized before registering input device tools/kvm_stat: Fix kvm_exit filter name xen/balloon: Support xend-based toolstack take two watchdog: fix UAF in reboot notifier handling in watchdog core code bcache: add readahead cache policy options via sysfs interface eventfd: track eventfd_signal() recursion depth aio: prevent potential eventfd recursion on poll KVM: x86: Refactor picdev_write() to prevent 
Spectre-v1/L1TF attacks KVM: x86: Refactor prefix decoding to prevent Spectre-v1/L1TF attacks KVM: x86: Protect pmu_intel.c from Spectre-v1/L1TF attacks KVM: x86: Protect DR-based index computations from Spectre-v1/L1TF attacks KVM: x86: Protect kvm_lapic_reg_write() from Spectre-v1/L1TF attacks KVM: x86: Protect kvm_hv_msr_[get\|set]_crash_data() from Spectre-v1/L1TF attacks KVM: x86: Protect ioapic_write_indirect() from Spectre-v1/L1TF attacks KVM: x86: Protect MSR-based index computations in pmu.h from Spectre-v1/L1TF attacks KVM: x86: Protect ioapic_read_indirect() from Spectre-v1/L1TF attacks KVM: x86: Protect MSR-based index computations from Spectre-v1/L1TF attacks in x86.c KVM: x86: Protect x86_decode_insn from Spectre-v1/L1TF attacks KVM: x86: Protect MSR-based index computations in fixed_msr_to_seg_unit() from Spectre-v1/L1TF attacks KVM: x86: Fix potential put_fpu() w/o load_fpu() on MPX platform KVM: PPC: Book3S HV: Uninit vCPU if vcore creation fails KVM: PPC: Book3S PR: Free shared page if mmu initialization fails x86/kvm: Be careful not to clear KVM_VCPU_FLUSH_TLB bit KVM: x86: Don't let userspace set host-reserved cr4 bits KVM: x86: Free wbinvd_dirty_mask if vCPU creation fails KVM: s390: do not clobber registers during guest reset/store status clk: tegra: Mark fuse clock as critical drm/amd/dm/mst: Ignore payload update failures percpu: Separate decrypted varaibles anytime encryption can be enabled scsi: qla2xxx: Fix the endianness of the qla82xx_get_fw_size() return type scsi: csiostor: Adjust indentation in csio_device_reset scsi: qla4xxx: Adjust indentation in qla4xxx_mem_free scsi: ufs: Recheck bkops level if bkops is disabled phy: qualcomm: Adjust indentation in read_poll_timeout ext2: Adjust indentation in ext2_fill_super powerpc/44x: Adjust indentation in ibm4xx_denali_fixup_memsize drm: msm: mdp4: Adjust indentation in mdp4_dsi_encoder_enable NFC: pn544: Adjust indentation in pn544_hci_check_presence ppp: Adjust indentation into 
ppp_async_input net: smc911x: Adjust indentation in smc911x_phy_configure net: tulip: Adjust indentation in {dmfe, uli526x}_init_module IB/mlx5: Fix outstanding_pi index for GSI qps IB/core: Fix ODP get user pages flow nfsd: fix delay timer on 32-bit architectures nfsd: fix jiffies/time_t mixup in LRU list nfsd: Return the correct number of bytes written to the file ubi: fastmap: Fix inverted logic in seen selfcheck ubi: Fix an error pointer dereference in error handling code mfd: da9062: Fix watchdog compatible string mfd: rn5t618: Mark ADC control register volatile bonding/alb: properly access headers in bond_alb_xmit() net: dsa: bcm_sf2: Only 7278 supports 2Gb/sec IMP port net: mvneta: move rx_dropped and rx_errors in per-cpu stats net_sched: fix a resource leak in tcindex_set_parms() net: systemport: Avoid RBUF stuck in Wake-on-LAN mode net/mlx5: IPsec, Fix esp modify function attribute net/mlx5: IPsec, fix memory leak at mlx5_fpga_ipsec_delete_sa_ctx net: macb: Remove unnecessary alignment check for TSO net: macb: Limit maximum GEM TX length in TSO net: dsa: b53: Always use dev->vlan_enabled in b53_configure_vlan() ext4: fix deadlock allocating crypto bounce page from mempool btrfs: use bool argument in free_root_pointers() btrfs: free block groups after free'ing fs trees drm: atmel-hlcdc: enable clock before configuring timing engine drm/dp_mst: Remove VCPI while disabling topology mgr btrfs: flush write bio if we loop in extent_write_cache_pages KVM: x86/mmu: Apply max PA check for MMIO sptes to 32-bit KVM KVM: x86: Use gpa_t for cr2/gpa to fix TDP support on 32-bit KVM KVM: VMX: Add non-canonical check on writes to RTIT address MSRs KVM: nVMX: vmread should not set rflags to specify success in case of #PF KVM: Use vcpu-specific gva->hva translation when querying host page size KVM: Play nice with read-only memslots when querying host page size mm: zero remaining unavailable struct pages mm: return zero_resv_unavail optimization mm/page_alloc.c: fix 
uninitialized memmaps on a partially populated last section cifs: fail i/o on soft mounts if sessionsetup errors out x86/apic/msi: Plug non-maskable MSI affinity race clocksource: Prevent double add_timer_on() for watchdog_timer perf/core: Fix mlock accounting in perf_mmap() rxrpc: Fix service call disconnection Linux 4.19.103 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I0d7f09085c3541373e0fd6b2e3ffacc5e34f7d55	2020-02-11 15:05:03 -08:00
Song Liu	a3623db43a	perf/core: Fix mlock accounting in perf_mmap() commit `003461559e` upstream. Decreasing sysctl_perf_event_mlock between two consecutive perf_mmap()s of a perf ring buffer may lead to an integer underflow in locked memory accounting. This may lead to the undesired behaviors, such as failures in BPF map creation. Address this by adjusting the accounting logic to take into account the possibility that the amount of already locked memory may exceed the current limit. Fixes: `c4b7547974` ("perf/core: Make the mlock accounting simple again") Suggested-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: <stable@vger.kernel.org> Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Link: https://lkml.kernel.org/r/20200123181146.2238074-1-songliubraving@fb.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-11 04:34:19 -08:00
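The integer underflow described in this commit can be illustrated in userspace. The following is a minimal sketch, not the kernel's perf_mmap() code: the helper names `headroom_naive` and `headroom_checked` are hypothetical, and "pages" are plain counters. It assumes only that the accounting uses unsigned arithmetic, which is what makes a lowered limit wrap around instead of going negative.

```c
/*
 * Toy model of locked-memory accounting (all names are hypothetical).
 * "limit" plays the role of the current mlock limit, "locked" the amount
 * that was already accounted under an earlier, larger limit.
 */

/* Naive headroom: wraps to a huge value when locked > limit. */
unsigned long headroom_naive(unsigned long limit, unsigned long locked)
{
    return limit - locked; /* unsigned wrap-around if locked > limit */
}

/* Checked headroom: allows for locked memory already exceeding the limit. */
unsigned long headroom_checked(unsigned long limit, unsigned long locked)
{
    return locked >= limit ? 0 : limit - locked;
}
```

With a limit lowered to 100 pages and 128 pages already locked, the naive helper reports an enormous headroom while the checked helper reports zero.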
Konstantin Khlebnikov	6284d30e96	clocksource: Prevent double add_timer_on() for watchdog_timer commit `febac332a8` upstream. Kernel crashes inside QEMU/KVM are observed: kernel BUG at kernel/time/timer.c:1154! BUG_ON(timer_pending(timer) \|\| !timer->function) in add_timer_on(). At the same time another cpu got: general protection fault: 0000 [#1] SMP PTI of poison pointer 0xdead000000000200 in: __hlist_del at include/linux/list.h:681 (inlined by) detach_timer at kernel/time/timer.c:818 (inlined by) expire_timers at kernel/time/timer.c:1355 (inlined by) __run_timers at kernel/time/timer.c:1686 (inlined by) run_timer_softirq at kernel/time/timer.c:1699 Unfortunately kernel logs are badly scrambled, stacktraces are lost. Printing the timer->function before the BUG_ON() pointed to clocksource_watchdog(). The execution of clocksource_watchdog() can race with a sequence of clocksource_stop_watchdog() .. clocksource_start_watchdog(): expire_timers() detach_timer(timer, true); timer->entry.pprev = NULL; raw_spin_unlock_irq(&base->lock); call_timer_fn clocksource_watchdog() clocksource_watchdog_kthread() or clocksource_unbind() spin_lock_irqsave(&watchdog_lock, flags); clocksource_stop_watchdog(); del_timer(&watchdog_timer); watchdog_running = 0; spin_unlock_irqrestore(&watchdog_lock, flags); spin_lock_irqsave(&watchdog_lock, flags); clocksource_start_watchdog(); add_timer_on(&watchdog_timer, ...); watchdog_running = 1; spin_unlock_irqrestore(&watchdog_lock, flags); spin_lock(&watchdog_lock); add_timer_on(&watchdog_timer, ...); BUG_ON(timer_pending(timer) \|\| !timer->function); timer_pending() -> true BUG() I.e. inside clocksource_watchdog() watchdog_timer could be already armed. Check timer_pending() before calling add_timer_on(). This is sufficient as all operations are synchronized by watchdog_lock. 
Fixes: `75c5158f70` ("timekeeping: Update clocksource with stop_machine") Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/158048693917.4378.13823603769948933793.stgit@buzz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-11 04:34:18 -08:00
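The "check timer_pending() before calling add_timer_on()" rule from this commit can be sketched with a single-threaded toy timer. Everything here is hypothetical (`struct toy_timer`, `toy_add_timer`, `toy_rearm_guarded`); the real fix relies on watchdog_lock to serialize the callers, which this sketch does not model, and an error return stands in for the kernel's BUG_ON().

```c
#include <stdbool.h>

/* Hypothetical stand-in for a kernel timer: only the "pending" state. */
struct toy_timer {
    bool pending;   /* true while the timer is armed */
    int arm_count;  /* how many times it was successfully armed */
};

/* Models add_timer_on(): arming an already pending timer is a bug.
 * Returns -1 where the kernel would hit BUG_ON(timer_pending(timer)). */
int toy_add_timer(struct toy_timer *t)
{
    if (t->pending)
        return -1;
    t->pending = true;
    t->arm_count++;
    return 0;
}

/* Guarded re-arm: skip arming when a stop/start sequence already armed it. */
int toy_rearm_guarded(struct toy_timer *t)
{
    if (t->pending)
        return 0; /* already armed elsewhere, nothing to do */
    return toy_add_timer(t);
}
```

If the restart path has armed the timer first, the unguarded call fails, while the guarded variant leaves the timer armed exactly once.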
Thomas Gleixner	032a2bf978	x86/apic/msi: Plug non-maskable MSI affinity race commit `6f1a4891a5` upstream. Evan tracked down a subtle race between the update of the MSI message and the device raising an interrupt internally on PCI devices which do not support MSI masking. The update of the MSI message is non-atomic and consists of either 2 or 3 sequential 32bit wide writes to the PCI config space. - Write address low 32bits - Write address high 32bits (If supported by device) - Write data When an interrupt is migrated then both address and data might change, so the kernel attempts to mask the MSI interrupt first. But for MSI masking is optional, so there exist devices which do not provide it. That means that if the device raises an interrupt internally between the writes then an MSI message is sent built from half updated state. On x86 this can lead to spurious interrupts on the wrong interrupt vector when the affinity setting changes both address and data. As a consequence the device interrupt can be lost causing the device to become stuck or malfunctioning. Evan tried to handle that by disabling MSI across an MSI message update. That's not feasible because disabling MSI has issues on its own: If MSI is disabled the PCI device is routing an interrupt to the legacy INTx mechanism. The INTx delivery can be disabled, but the disablement is not working on all devices. Some devices lose interrupts when both MSI and INTx delivery are disabled. Another way to solve this would be to enforce the allocation of the same vector on all CPUs in the system for this kind of screwed devices. That could be done, but it would bring back the vector space exhaustion problems which got solved a few years ago. Fortunately the high address (if supported by the device) is only relevant when X2APIC is enabled which implies interrupt remapping. 
In the interrupt remapping case the affinity setting is happening at the interrupt remapping unit and the PCI MSI message is programmed only once when the PCI device is initialized. That makes it possible to solve it with a two step update: 1) Target the MSI msg to the new vector on the current target CPU 2) Target the MSI msg to the new vector on the new target CPU In both cases writing the MSI message is only changing a single 32bit word which prevents the issue of inconsistency. After writing the final destination it is necessary to check whether the device issued an interrupt while the intermediate state #1 (new vector, current CPU) was in effect. This is possible because the affinity change is always happening on the current target CPU. The code runs with interrupts disabled, so the interrupt can be detected by checking the IRR of the local APIC. If the vector is pending in the IRR then the interrupt is retriggered on the new target CPU by sending an IPI for the associated vector on the target CPU. This can cause spurious interrupts on both the local and the new target CPU. 1) If the new vector is not in use on the local CPU and the device affected by the affinity change raised an interrupt during the transitional state (step #1 above) then interrupt entry code will ignore that spurious interrupt. The vector is marked so that the 'No irq handler for vector' warning is suppressed once. 2) If the new vector is in use already on the local CPU then the IRR check might see a pending interrupt from the device which is using this vector. The IPI to the new target CPU will then invoke the handler of the device, which got the affinity change, even if that device did not issue an interrupt. 3) If the new vector is in use already on the local CPU and the device affected by the affinity change raised an interrupt during the transitional state (step #1 above) then the handler of the device which uses that vector on the local CPU will be invoked. 
expose issues in device driver interrupt handlers which are not prepared to handle a spurious interrupt correctly. This is not a regression, it's just exposing something which was already broken as spurious interrupts can happen for a lot of reasons and all driver handlers need to be able to deal with them. Reported-by: Evan Green <evgreen@chromium.org> Debugged-by: Evan Green <evgreen@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Evan Green <evgreen@chromium.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87imkr4s7n.fsf@nanos.tec.linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-11 04:34:18 -08:00
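The two-step update described in this commit can be sketched as a toy model in which each step changes exactly one 32-bit word of the message. This is a simplification under stated assumptions: `struct toy_msi_msg` and the helpers are hypothetical, the destination CPU stands in for the low address word and the vector for the data word, and the IRR check and IPI retrigger are not modeled.

```c
#include <stdint.h>

/* Hypothetical two-word MSI message: one word selects the CPU, one the vector. */
struct toy_msi_msg {
    uint32_t dest_cpu; /* stands in for the low address word */
    uint32_t vector;   /* stands in for the data word */
};

/* Number of 32-bit words that differ between two messages. */
int toy_words_changed(struct toy_msi_msg a, struct toy_msi_msg b)
{
    return (a.dest_cpu != b.dest_cpu) + (a.vector != b.vector);
}

/* Step 1: new vector, still on the current target CPU. */
struct toy_msi_msg toy_step1(struct toy_msi_msg cur, uint32_t new_vector)
{
    cur.vector = new_vector;
    return cur;
}

/* Step 2: same (new) vector, now on the new target CPU. */
struct toy_msi_msg toy_step2(struct toy_msi_msg mid, uint32_t new_cpu)
{
    mid.dest_cpu = new_cpu;
    return mid;
}
```

A direct update from the old to the final message would change two words non-atomically; going through the intermediate state changes one word per step, so the device never sees a half-updated message.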
Steven Rostedt (VMware)	0948d6294d	ftrace: Protect ftrace_graph_hash with ftrace_sync [ Upstream commit `54a16ff6f2` ] As the function_graph tracer can run when RCU is not "watching", its hash cannot be protected by synchronize_rcu(); it requires running a task on each CPU before it can be freed. schedule_on_each_cpu(ftrace_sync) needs to be used instead. Link: https://lore.kernel.org/r/20200205131110.GT2935@paulmck-ThinkPad-P72 Cc: stable@vger.kernel.org Fixes: `b9b0c831be` ("ftrace: Convert graph filter to use hash tables") Reported-by: "Paul E. McKenney" <paulmck@kernel.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-11 04:34:05 -08:00
Steven Rostedt (VMware)	c03d235980	ftrace: Add comment to why rcu_dereference_sched() is open coded [ Upstream commit `16052dd5bd` ] Because the function graph tracer can execute in sections where RCU is not "watching", the rcu_dereference_sched() for the hash needs to be open coded. This is fine because the RCU "flavor" of the ftrace hash is protected by its own RCU handling (it does its own little synchronization on every CPU and does not rely on RCU sched). Acked-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-11 04:34:04 -08:00
Amol Grover	30afa80b0f	tracing: Annotate ftrace_graph_notrace_hash pointer with __rcu [ Upstream commit `fd0e6852c4` ] Fix following instances of sparse error kernel/trace/ftrace.c:5667:29: error: incompatible types in comparison kernel/trace/ftrace.c:5813:21: error: incompatible types in comparison kernel/trace/ftrace.c:5868:36: error: incompatible types in comparison kernel/trace/ftrace.c:5870:25: error: incompatible types in comparison Use rcu_dereference_protected to dereference the newly annotated pointer. Link: http://lkml.kernel.org/r/20200205055701.30195-1-frextrite@gmail.com Signed-off-by: Amol Grover <frextrite@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-11 04:34:04 -08:00
Amol Grover	f144ad2e84	tracing: Annotate ftrace_graph_hash pointer with __rcu [ Upstream commit `24a9729f83` ] Fix following instances of sparse error kernel/trace/ftrace.c:5664:29: error: incompatible types in comparison kernel/trace/ftrace.c:5785:21: error: incompatible types in comparison kernel/trace/ftrace.c:5864:36: error: incompatible types in comparison kernel/trace/ftrace.c:5866:25: error: incompatible types in comparison Use rcu_dereference_protected to access the __rcu annotated pointer. Link: http://lkml.kernel.org/r/20200201072703.17330-1-frextrite@gmail.com Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Amol Grover <frextrite@gmail.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-11 04:34:04 -08:00
Herbert Xu	dc34710a7a	padata: Remove broken queue flushing [ Upstream commit `07928d9bfc` ] The function padata_flush_queues is fundamentally broken because it cannot force padata users to complete the request that is underway. IOW padata has to passively wait for the completion of any outstanding work. As it stands flushing is used in two places. Its use in padata_stop is simply unnecessary because nothing depends on the queues to be flushed afterwards. The other use in padata_replace is more substantial as we depend on it to free the old pd structure. This patch instead uses the pd->refcnt to dynamically free the pd structure once all requests are complete. Fixes: `2b73b07ab8` ("padata: Flush the padata queues actively") Cc: <stable@vger.kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-11 04:34:04 -08:00
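The idea of freeing the pd structure through its reference count, rather than by flushing queues, can be sketched as follows. This is a toy, single-threaded illustration with hypothetical names (`struct toy_pd`, `toy_pd_alloc`, `toy_pd_get`, `toy_pd_put`); it uses a plain int where concurrent code would need an atomic counter, and it is not the padata implementation.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted structure. */
struct toy_pd {
    int refcnt;      /* plain int: this toy is single-threaded */
    int *freed_flag; /* set to 1 when the structure is released */
};

/* Allocate with one initial reference held by the owner. */
struct toy_pd *toy_pd_alloc(int *freed_flag)
{
    struct toy_pd *pd = malloc(sizeof(*pd));

    if (!pd)
        return NULL;
    pd->refcnt = 1;
    pd->freed_flag = freed_flag;
    return pd;
}

/* Each in-flight request takes a reference. */
void toy_pd_get(struct toy_pd *pd)
{
    pd->refcnt++;
}

/* Dropping the last reference frees the structure, whoever drops it. */
void toy_pd_put(struct toy_pd *pd)
{
    if (--pd->refcnt == 0) {
        *pd->freed_flag = 1;
        free(pd);
    }
}
```

The owner can drop its reference as soon as the structure is replaced; the memory is released only when the last outstanding request completes and drops its own reference.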
Stephen Boyd	b522ff023e	alarmtimer: Unregister wakeup source when module get fails commit `6b6d188aae` upstream. The alarmtimer_rtc_add_device() function creates a wakeup source and then tries to grab a module reference. If that fails the function returns early with an error code, but fails to remove the wakeup source. Clean up this exit path so that no dangling wakeup source named 'alarmtime' is left allocated; it would conflict with another RTC device that may be registered later. Fixes: `51218298a2` ("alarmtimer: Ensure RTC module is not unloaded") Signed-off-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Douglas Anderson <dianders@chromium.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200109155910.907-2-swboyd@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-11 04:33:59 -08:00
Kevin Hao	4f7d834cec	irqdomain: Fix a memory leak in irq_domain_push_irq() commit `0f394daef8` upstream. Fix a memory leak reported by kmemleak: unreferenced object 0xffff000bc6f50e80 (size 128): comm "kworker/23:2", pid 201, jiffies 4294894947 (age 942.132s) hex dump (first 32 bytes): 00 00 00 00 41 00 00 00 86 c0 03 00 00 00 00 00 ....A........... 00 a0 b2 c6 0b 00 ff ff 40 51 fd 10 00 80 ff ff ........@Q...... backtrace: [<00000000e62d2240>] kmem_cache_alloc_trace+0x1a4/0x320 [<00000000279143c9>] irq_domain_push_irq+0x7c/0x188 [<00000000d9f4c154>] thunderx_gpio_probe+0x3ac/0x438 [<00000000fd09ec22>] pci_device_probe+0xe4/0x198 [<00000000d43eca75>] really_probe+0xdc/0x320 [<00000000d3ebab09>] driver_probe_device+0x5c/0xf0 [<000000005b3ecaa0>] __device_attach_driver+0x88/0xc0 [<000000004e5915f5>] bus_for_each_drv+0x7c/0xc8 [<0000000079d4db41>] __device_attach+0xe4/0x140 [<00000000883bbda9>] device_initial_probe+0x18/0x20 [<000000003be59ef6>] bus_probe_device+0x98/0xa0 [<0000000039b03d3f>] deferred_probe_work_func+0x74/0xa8 [<00000000870934ce>] process_one_work+0x1c8/0x470 [<00000000e3cce570>] worker_thread+0x1f8/0x428 [<000000005d64975e>] kthread+0xfc/0x128 [<00000000f0eaa764>] ret_from_fork+0x10/0x18 Fixes: `495c38d300` ("irqdomain: Add irq_domain_{push,pop}_irq() functions") Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200120043547.22271-1-haokexin@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-11 04:33:57 -08:00
Eric Dumazet	00b13445f9	rcu: Avoid data-race in rcu_gp_fqs_check_wake() commit `6935c3983b` upstream. The rcu_gp_fqs_check_wake() function uses rcu_preempt_blocked_readers_cgp() to read ->gp_tasks while other cpus might overwrite this field. We need READ_ONCE()/WRITE_ONCE() pairs to avoid compiler tricks and KCSAN splats like the following : BUG: KCSAN: data-race in rcu_gp_fqs_check_wake / rcu_preempt_deferred_qs_irqrestore write to 0xffffffff85a7f190 of 8 bytes by task 7317 on cpu 0: rcu_preempt_deferred_qs_irqrestore+0x43d/0x580 kernel/rcu/tree_plugin.h:507 rcu_read_unlock_special+0xec/0x370 kernel/rcu/tree_plugin.h:659 __rcu_read_unlock+0xcf/0xe0 kernel/rcu/tree_plugin.h:394 rcu_read_unlock include/linux/rcupdate.h:645 [inline] __ip_queue_xmit+0x3b0/0xa40 net/ipv4/ip_output.c:533 ip_queue_xmit+0x45/0x60 include/net/ip.h:236 __tcp_transmit_skb+0xdeb/0x1cd0 net/ipv4/tcp_output.c:1158 __tcp_send_ack+0x246/0x300 net/ipv4/tcp_output.c:3685 tcp_send_ack+0x34/0x40 net/ipv4/tcp_output.c:3691 tcp_cleanup_rbuf+0x130/0x360 net/ipv4/tcp.c:1575 tcp_recvmsg+0x633/0x1a30 net/ipv4/tcp.c:2179 inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838 sock_recvmsg_nosec net/socket.c:871 [inline] sock_recvmsg net/socket.c:889 [inline] sock_recvmsg+0x92/0xb0 net/socket.c:885 sock_read_iter+0x15f/0x1e0 net/socket.c:967 call_read_iter include/linux/fs.h:1864 [inline] new_sync_read+0x389/0x4f0 fs/read_write.c:414 read to 0xffffffff85a7f190 of 8 bytes by task 10 on cpu 1: rcu_gp_fqs_check_wake kernel/rcu/tree.c:1556 [inline] rcu_gp_fqs_check_wake+0x93/0xd0 kernel/rcu/tree.c:1546 rcu_gp_fqs_loop+0x36c/0x580 kernel/rcu/tree.c:1611 rcu_gp_kthread+0x143/0x220 kernel/rcu/tree.c:1768 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 10 Comm: rcu_preempt Not tainted 5.3.0+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Signed-off-by: Eric Dumazet 
<edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> [ paulmck: Added another READ_ONCE() for RCU CPU stall warnings. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-11 04:33:55 -08:00
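The READ_ONCE()/WRITE_ONCE() pairing mentioned in this commit can be approximated in userspace with volatile accesses. The macros below are a simplified sketch, not the kernel's definitions: `TOY_READ_ONCE`, `TOY_WRITE_ONCE`, `struct toy_node` and the helpers are hypothetical names, and `__typeof__` is a GCC/Clang extension. Volatile accesses only stop the compiler from tearing, fusing, or re-reading the access; they provide no memory ordering.

```c
#include <stddef.h>

/* Simplified approximations: force exactly one access via a volatile lvalue. */
#define TOY_READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define TOY_WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical node with a pointer field that another CPU may overwrite. */
struct toy_node {
    void *gp_tasks;
};

/* Reader side: load the pointer once, then test the loaded value. */
int toy_has_blocked_readers(struct toy_node *n)
{
    return TOY_READ_ONCE(n->gp_tasks) != NULL;
}

/* Writer side: store the pointer with a single marked write. */
void toy_set_gp_tasks(struct toy_node *n, void *p)
{
    TOY_WRITE_ONCE(n->gp_tasks, p);
}
```

Marking both sides documents that the access is intentionally lockless, which is also what lets a tool such as KCSAN distinguish it from an accidental data race.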
Mathieu Desnoyers	62bfa26e4d	tracing: Fix sched switch start/stop refcount racy updates commit `64ae572bc7` upstream. Reading the sched_cmdline_ref and sched_tgid_ref initial state within tracing_start_sched_switch without holding the sched_register_mutex is racy against concurrent updates, which can lead to tracepoint probes being registered more than once (and thus trigger warnings within tracepoint.c). [ May be the fix for this bug ] Link: https://lore.kernel.org/r/000000000000ab6f84056c786b93@google.com Link: http://lkml.kernel.org/r/20190817141208.15226-1-mathieu.desnoyers@efficios.com Cc: stable@vger.kernel.org CC: Steven Rostedt (VMware) <rostedt@goodmis.org> CC: Joel Fernandes (Google) <joel@joelfernandes.org> CC: Peter Zijlstra <peterz@infradead.org> CC: Thomas Gleixner <tglx@linutronix.de> CC: Paul E. McKenney <paulmck@linux.ibm.com> Reported-by: syzbot+774fddf07b7ab29a1e55@syzkaller.appspotmail.com Fixes: `d914ba37d7` ("tracing: Add support for recording tgid of tasks") Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-11 04:33:55 -08:00
John Ogness	8360063bfa	printk: fix exclusive_console replaying [ Upstream commit `def97da136` ] Commit `f92b070f2d` ("printk: Do not miss new messages when replaying the log") introduced a new variable @exclusive_console_stop_seq to store when an exclusive console should stop printing. It should be set to the @console_seq value at registration. However, @console_seq is previously set to @syslog_seq so that the exclusive console knows where to begin. This results in the exclusive console immediately reactivating all the other consoles and thus repeating the messages for those consoles. Set @console_seq after @exclusive_console_stop_seq has stored the current @console_seq value. Fixes: `f92b070f2d` ("printk: Do not miss new messages when replaying the log") Link: http://lkml.kernel.org/r/20191219115322.31160-1-john.ogness@linutronix.de Cc: Steven Rostedt <rostedt@goodmis.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: John Ogness <john.ogness@linutronix.de> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-11 04:33:51 -08:00
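The ordering problem described in this commit comes down to which of two assignments runs first. The sketch below models it with a hypothetical `struct toy_log` and two hypothetical registration helpers; the field names mirror the variables named in the commit message, but this is not the printk code.

```c
/* Hypothetical snapshot of the three sequence counters involved. */
struct toy_log {
    unsigned long console_seq;        /* next record for regular consoles */
    unsigned long syslog_seq;         /* where the replay should begin */
    unsigned long exclusive_stop_seq; /* where the exclusive console stops */
};

/* Buggy order: console_seq is rewound before its old value is saved,
 * so the stop point equals the start point. */
void toy_register_buggy(struct toy_log *l)
{
    l->console_seq = l->syslog_seq;
    l->exclusive_stop_seq = l->console_seq;
}

/* Fixed order: save the current console_seq first, then rewind it. */
void toy_register_fixed(struct toy_log *l)
{
    l->exclusive_stop_seq = l->console_seq;
    l->console_seq = l->syslog_seq;
}
```

With console_seq at 100 and syslog_seq at 10, the buggy order ends the exclusive replay at 10, immediately handing records 10 to 99 back to every console; the fixed order ends it at 100.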
YueHaibing	bdfaaf35ac	kernel/module: Fix memleak in module_add_modinfo_attrs() [ Upstream commit `f6d061d617` ] In module_add_modinfo_attrs() if sysfs_create_file() fails on the first iteration of the loop (so i = 0), we forget to free the modinfo_attrs. Fixes: `bc6f2a757d` ("kernel/module: Fix mem leak in module_add_modinfo_attrs") Reviewed-by: Miroslav Benes <mbenes@suse.cz> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Jessica Yu <jeyu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-11 04:33:50 -08:00
Greg Kroah-Hartman	83b584a64c	This is the 4.19.102 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl461NIACgkQONu9yGCS aT6Mqw//W5xIIcs0Ut+P+QYNN6lCTRJ0AvFUolz79M3pyK/rHUluwTvYJbDAeGE3 sckv96rE1pxj5ZSf6LegXIoALrA4RlYHS8xXkYnRrt6xfrb7UwpqsJtt4Mx+IrJ3 9uFfaWRSvuDfRCraZxLiE2Bl9xVYvaPfFJYBmH383VB+deYNfpwORFsqNDQT+gR6 PZLuV0x//Kerwmd4OvaaHR/fIl8YVKmIz5lu3+3WIuVKxTK6Bbd3YzVu13dhVaX2 mETflLEAO/sYsUQiS4SO22ejLAiWyD8LyMV8s9KeTFQXzML3JpibKnt3ySDfzsFE m8VRlaLcQwB0Ca2AVGHA5QV0+V+2+6qh/IcZl630feBueGQX59qLQkOurD4e/9lm Na6ZkLPTh9UipIfTu9fvA5HY5lPt2VcSWwG2nLluckfJIpKNFVQEB7vuk9zd7468 qkXmj/J1YDdJzt2YgD0WZuKu3f1/No7rXbNmT2Oj0+HNWWvIU9xFNFlIPAxo7pJy kwekd9+gHI0n1OhLRjzYUyf0pD+j0o75ZHsYYsUW0y6cGoWX/LmQ8JPFi+waHiov FOe8FJz/uDtfQnJ4+izAM5Jjbu1LE+L8uGoIExYAv4DuXgPZtI2wtHvP4HHM3Aov mDWLesMgizsroViv57aXC0C1ZPksPpGeHT+HcH7RnDQ0kQmpe3E= =2XGW -----END PGP SIGNATURE----- Merge 4.19.102 into android-4.19 Changes in 4.19.102 vfs: fix do_last() regression x86/resctrl: Fix use-after-free when deleting resource groups x86/resctrl: Fix use-after-free due to inaccurate refcount of rdtgroup x86/resctrl: Fix a deadlock due to inaccurate reference crypto: pcrypt - Fix user-after-free on module unload rsi: add hci detach for hibernation and poweroff rsi: fix use-after-free on failed probe and unbind perf c2c: Fix return type for histogram sorting comparision functions PM / devfreq: Add new name attribute for sysfs tools lib: Fix builds when glibc contains strlcpy() arm64: kbuild: remove compressed images on 'make ARCH=arm64 (dist)clean' ext4: validate the debug_want_extra_isize mount option at parse time mm/mempolicy.c: fix out of bounds write in mpol_parse_str() reiserfs: Fix memory leak of journal device string media: digitv: don't continue if remote control state can't be read media: af9005: uninitialized variable printked media: vp7045: do not read uninitialized values if usb transfer fails media: gspca: zero usb_buf media: dvb-usb/dvb-usb-urb.c: initialize 
actlen to 0 tomoyo: Use atomic_t for statistics counter ttyprintk: fix a potential deadlock in interrupt context issue Bluetooth: Fix race condition in hci_release_sock() cgroup: Prevent double killing of css when enabling threaded cgroup media: si470x-i2c: Move free() past last use of 'radio' ARM: dts: sun8i: a83t: Correct USB3503 GPIOs polarity ARM: dts: am57xx-beagle-x15/am57xx-idk: Remove "gpios" for endpoint dt nodes ARM: dts: beagle-x15-common: Model 5V0 regulator soc: ti: wkup_m3_ipc: Fix race condition with rproc_boot tools lib traceevent: Fix memory leakage in filter_event rseq: Unregister rseq for clone CLONE_VM clk: sunxi-ng: h6-r: Fix AR100/R_APB2 parent order mac80211: mesh: restrict airtime metric to peered established plinks clk: mmp2: Fix the order of timer mux parents ASoC: rt5640: Fix NULL dereference on module unload ixgbevf: Remove limit of 10 entries for unicast filter list ixgbe: Fix calculation of queue with VFs and flow director on interface flap igb: Fix SGMII SFP module discovery for 100FX/LX. platform/x86: GPD pocket fan: Allow somewhat lower/higher temperature limits ASoC: sti: fix possible sleep-in-atomic qmi_wwan: Add support for Quectel RM500Q parisc: Use proper printk format for resource_size_t wireless: fix enabling channel 12 for custom regulatory domain cfg80211: Fix radar event during another phy CAC mac80211: Fix TKIP replay protection immediately after key setup wireless: wext: avoid gcc -O3 warning netfilter: nft_tunnel: ERSPAN_VERSION must not be null net: dsa: bcm_sf2: Configure IMP port for 2Gb/sec bnxt_en: Fix ipv6 RFS filter matching logic. 
riscv: delete temporary files iwlwifi: Don't ignore the cap field upon mcc update ARM: dts: am335x-boneblack-common: fix memory size vti[6]: fix packet tx through bpf_redirect() xfrm interface: fix packet tx through bpf_redirect() xfrm: interface: do not confirm neighbor when do pmtu update scsi: fnic: do not queue commands during fwreset ARM: 8955/1: virt: Relax arch timer version check during early boot tee: optee: Fix compilation issue with nommu airo: Fix possible info leak in AIROOLDIOCTL/SIOCDEVPRIVATE airo: Add missing CAP_NET_ADMIN check in AIROOLDIOCTL/SIOCDEVPRIVATE r8152: get default setting of WOL before initializing ARM: dts: am43x-epos-evm: set data pin directions for spi0 and spi1 qlcnic: Fix CPU soft lockup while collecting firmware dump powerpc/fsl/dts: add fsl,erratum-a011043 net/fsl: treat fsl,erratum-a011043 net: fsl/fman: rename IF_MODE_XGMII to IF_MODE_10G seq_tab_next() should increase position index l2t_seq_next should increase position index net: Fix skb->csum update in inet_proto_csum_replace16(). btrfs: do not zero f_bavail if we have available space perf report: Fix no libunwind compiled warning break s390 issue mm/migrate.c: also overwrite error when it is bigger than zero Linux 4.19.102 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Ia9b63c7932b66f469ab0e88467e1e07741408f0b	2020-02-05 19:20:26 +00:00
Michal Koutný	6d26630912	cgroup: Prevent double killing of css when enabling threaded cgroup commit `3bc0bb36fa` upstream. The test_cgcore_no_internal_process_constraint_on_threads selftest when running with subsystem controlling noise triggers two warnings: > [ 597.443115] WARNING: CPU: 1 PID: 28167 at kernel/cgroup/cgroup.c:3131 cgroup_apply_control_enable+0xe0/0x3f0 > [ 597.443413] WARNING: CPU: 1 PID: 28167 at kernel/cgroup/cgroup.c:3177 cgroup_apply_control_disable+0xa6/0x160 Both stem from a call to cgroup_type_write. The first warning was also triggered by syzkaller. When we're switching cgroup to threaded mode shortly after a subsystem was disabled on it, we can see the respective subsystem css dying there. The warning in cgroup_apply_control_enable is harmless in this case since we're not adding new subsys anyway. The warning in cgroup_apply_control_disable indicates an attempt to kill css of recently disabled subsystem repeatedly. The commit prevents these situations by making cgroup_type_write wait for all dying csses to go away before re-applying subtree controls. When at it, the locations of WARN_ON_ONCE calls are moved so that warning is triggered only when we are about to misuse the dying css. Reported-by: syzbot+5493b2a54d31d6aea629@syzkaller.appspotmail.com Reported-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-05 14:43:39 +00:00
Greg Kroah-Hartman	1b44c9bd91	This is the 4.19.101 stable release -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAl41RsgACgkQONu9yGCS aT4P7A/+PZVt4c6phHZ9tj0OV4TjAWfu3IX9nLypzyBxjmBeJu8yt1pkNrfKj6fT +N3MjDlmAYss5CV6SOACPWXdhAQF3SsM6PR+CSrzwpS3+iAVTqNTaHpZqJFBgr3R cDe+MksbMLDpw3x+hXWV1E6WKcJZZJVeANuaD09HQDRVqKw1hRGxGEdyPChEjT71 Ml3o9a2TYzOvRClBtBHPRQNy/MP4cVv06kS7jefDNh1z9PMsD2w01W54ur44WFJb aujt6bLyJlcs0cPdSkU7D8pmgzs/0cxW8N+4gCpfW66P6FJL8SU4RDTujUARlyvC oP5d62XrARXAO0hh1NYdWyUqpQjOFJRTWfEqW+lFGo5s9yL9oPW8vcCBKBuZfg+u HlVCCTCyU/IJN0DMeqdneThDg8sxirlzHu/NllgGIf7rhyMRqRmruQZXc1W3/7e8 UgqqAEFkgVmJgq3mVWlHsV5Fmgb+PQlqj4rSB05wlAbXsQwF0nbSS/lsvwDR8qqE 8nO/PQoxpQyAOYJ+iyaCsq51IsJUCwWOto8L/RpdYSbFpLTn+BRmNdDr7jHOVnPl FshugoXijE6IrVGIJhJBGGy/E+eG8Dru7IZEsi2UZLsw+bBvucqv7raIHAJ2YRaL 8ZuwwmvpZpCOdYSWa7lIgqZb0qOTyR+b6UQ57X8hS5U3MZ2jMOE= =+bpt -----END PGP SIGNATURE----- Merge 4.19.101 into android-4.19 Changes in 4.19.101 orinoco_usb: fix interface sanity check rsi_91x_usb: fix interface sanity check usb: dwc3: pci: add ID for the Intel Comet Lake -V variant USB: serial: ir-usb: add missing endpoint sanity check USB: serial: ir-usb: fix link-speed handling USB: serial: ir-usb: fix IrLAP framing usb: dwc3: turn off VBUS when leaving host mode staging: most: net: fix buffer overflow staging: wlan-ng: ensure error return is actually returned staging: vt6656: correct packet types for CTS protect, mode. staging: vt6656: use NULLFUCTION stack on mac80211 staging: vt6656: Fix false Tx excessive retries reporting. 
serial: 8250_bcm2835aux: Fix line mismatch on driver unbind component: do not dereference opaque pointer in debugfs mei: me: add comet point (lake) H device ids iio: st_gyro: Correct data for LSM9DS0 gyro crypto: chelsio - fix writing tfm flags to wrong place cifs: Fix memory allocation in __smb2_handle_cancelled_cmd() ath9k: fix storage endpoint lookup brcmfmac: fix interface sanity check rtl8xxxu: fix interface sanity check zd1211rw: fix storage endpoint lookup net_sched: ematch: reject invalid TCF_EM_SIMPLE net_sched: fix ops->bind_class() implementations HID: multitouch: Add LG MELF0410 I2C touchscreen support arc: eznps: fix allmodconfig kconfig warning HID: Add quirk for Xin-Mo Dual Controller HID: ite: Add USB id match for Acer SW5-012 keyboard dock HID: Add quirk for incorrect input length on Lenovo Y720 drivers/hid/hid-multitouch.c: fix a possible null pointer access. phy: qcom-qmp: Increase PHY ready timeout phy: cpcap-usb: Prevent USB line glitches from waking up modem watchdog: max77620_wdt: fix potential build errors watchdog: rn5t618_wdt: fix module aliases spi: spi-dw: Add lock protect dw_spi rx/tx to prevent concurrent calls drivers/net/b44: Change to non-atomic bit operations on pwol_mask net: wan: sdla: Fix cast from pointer to integer of different size gpio: max77620: Add missing dependency on GPIOLIB_IRQCHIP atm: eni: fix uninitialized variable warning HID: steam: Fix input device disappearing platform/x86: dell-laptop: disable kbd backlight on Inspiron 10xx PCI: Add DMA alias quirk for Intel VCA NTB iommu/amd: Support multiple PCI DMA aliases in IRQ Remapping ARM: OMAP2+: SmartReflex: add omap_sr_pdata definition usb-storage: Disable UAS on JMicron SATA enclosure sched/fair: Add tmp_alone_branch assertion sched/fair: Fix insertion in rq->leaf_cfs_rq_list rsi: fix use-after-free on probe errors rsi: fix memory leak on failed URB submission rsi: fix non-atomic allocation in completion handler crypto: af_alg - Use bh_lock_sock in sk_destruct 
random: try to actively add entropy rather than passively wait for it block: cleanup __blkdev_issue_discard() block: fix 32 bit overflow in __blkdev_issue_discard() KVM: arm64: Write arch.mdcr_el2 changes since last vcpu_load on VHE Linux 4.19.101 Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I801cd8d04eea35b4b53957cc69c0987d88094992	2020-02-02 20:22:38 +00:00
Patrick Bellasi	8b2fbd9076	UPSTREAM: sched/fair/util_est: Implement faster ramp-up EWMA on utilization increases The estimated utilization for a task: util_est = max(util_avg, est.enqueued, est.ewma) is defined based on: - util_avg: the PELT defined utilization - est.enqueued: the util_avg at the end of the last activation - est.ewma: an exponential moving average on the est.enqueued samples According to this definition, when a task suddenly changes its bandwidth requirements from small to big, the EWMA will need to collect multiple samples before converging up to track the new big utilization. This slow convergence towards bigger utilization values is not aligned to the default scheduler behavior, which is to optimize for performance. Moreover, the est.ewma component fails to compensate for temporary utilization drops which span just a few est.enqueued samples. To let util_est do a better job in the scenario depicted above, change its definition by making util_est directly follow upward motion and only decay the est.ewma on downward. Bug: 120440300 Signed-off-by: Patrick Bellasi <patrick.bellasi@matbug.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Douglas Raillard <douglas.raillard@arm.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <qperret@google.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191023205630.14469-1-patrick.bellasi@matbug.net Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `b8c9636140`) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Quentin Perret <qperret@google.com> Change-Id: I5c0bdd401f3fe599a2b7b9215c9a3a621f91002d	2020-02-01 17:35:00 +00:00
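The update rule described in the message above can be sketched in plain C. This is a minimal userspace illustration rather than the kernel's code: the struct, the function names, and the 1/4 EWMA weight (`EWMA_SHIFT`) are assumptions made for the example.

```c
/* Hypothetical userspace sketch of the util_est update described above. */
#define EWMA_SHIFT 2 /* assumed weight of 1/4 for each new sample */

struct util_est_sketch {
	unsigned int enqueued; /* util_avg at the end of the last activation */
	unsigned int ewma;     /* moving average of the enqueued samples */
};

/* Fold one end-of-activation sample into the estimate. */
static void util_est_update_sketch(struct util_est_sketch *ue,
				   unsigned int sample)
{
	long diff;

	ue->enqueued = sample;

	/* Upward motion: follow the new sample directly, skip the averaging. */
	if (ue->ewma < sample) {
		ue->ewma = sample;
		return;
	}

	/* Downward motion: ewma += (sample - ewma) / 4, in fixed point. */
	diff = (long)sample - (long)ue->ewma;
	ue->ewma = (unsigned int)((((long)ue->ewma << EWMA_SHIFT) + diff)
				  >> EWMA_SHIFT);
}

/* util_est = max(util_avg, est.enqueued, est.ewma) */
static unsigned int util_est_sketch_value(unsigned int util_avg,
					  const struct util_est_sketch *ue)
{
	unsigned int est = ue->enqueued > ue->ewma ? ue->enqueued : ue->ewma;

	return util_avg > est ? util_avg : est;
}
```

Starting from an average of 100, a sample of 800 moves the average straight to 800, while a following sample of 400 only decays it to 700.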
Qais Yousef	f503db1178	ANDROID: Re-use SUGOV_RT_MAX_FREQ to control uclamp rt behavior By default uclamp RT tasks will use the max frequency, which is not the desired default behavior on mobile devices. Re-use the SUGOV_RT_MAX_FREQ sched_feat to control the default behavior. When SUGOV_RT_MAX_FREQ is NOT selected, the uclamp_min value of the RT tasks will be 0. Note, since now we use SUGOV_RT_MAX_FREQ to enforce the default max frequency for RT when uclamp is compiled in; the condition in schedutil_cpu_util() needs to be inverted so that max is no longer unconditionally applied when uclamp is compiled in && SUGOV_RT_MAX_FREQ is true. This unconditional application means uclamp values are always ignored, which is not what we want when uclamp is compiled in. Bug: 120440300 Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I3d36f1ebed6ef35a6299af32bbf4462d0353e783 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 16:14:12 +00:00
Valentin Schneider	ecce1cf84a	BACKPORT: sched/fair: Make EAS wakeup placement consider uclamp restrictions task_fits_capacity() has just been made uclamp-aware, and find_energy_efficient_cpu() needs to go through the same treatment. Things are somewhat different here however - using the task max clamp isn't sufficient. Consider the following setup: The target runqueue, rq: rq.cpu_capacity_orig = 512 rq.cfs.avg.util_avg = 200 rq.uclamp.max = 768 // the max p.uclamp.max of all enqueued p's is 768 The waking task, p (not yet enqueued on rq): p.util_est = 600 p.uclamp.max = 100 Now, consider the following code which doesn't use the rq clamps: util = uclamp_task_util(p); // Does the task fit in the spare CPU capacity? cpu = cpu_of(rq); fits_capacity(util, cpu_capacity(cpu) - cpu_util(cpu)) This would lead to: util = 100; fits_capacity(100, 512 - 200) fits_capacity() would return true. However, enqueuing p on that CPU will cause it to become overutilized since rq clamp values are max-aggregated, so we'd remain with rq.uclamp.max = 768 which comes from the other tasks already enqueued on rq. Thus, we could select a high enough frequency to reach beyond 0.8 * 512 utilization (== overutilized) after enqueuing p on rq. What find_energy_efficient_cpu() needs here is uclamp_rq_util_with() which lets us peek at the future utilization landscape, including rq-wide uclamp values. Make find_energy_efficient_cpu() use uclamp_rq_util_with() for its fits_capacity() check. This is in line with what compute_energy() ends up using for estimating utilization. 
[QP: moved changes to select_cpu_candidates(), which is the equivalent to the mainline path, and fix missing dependency on fits_capacity() by using the open coded version] Bug: 120440300 Tested-By: Dietmar Eggemann <dietmar.eggemann@arm.com> Suggested-by: Quentin Perret <qperret@google.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191211113851.24241-6-valentin.schneider@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `1d42509e47`) Signed-off-by: Quentin Perret <qperret@google.com> Change-Id: Ibe1643cd5e6c97daceceae9733344e54bf4a4857	2020-02-01 16:14:11 +00:00
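The arithmetic in the example above can be replayed with a small C sketch. All names here are hypothetical stand-ins, not the kernel helpers themselves; the 1280/1024 margin is assumed to model the 0.8 overutilization threshold the message mentions, and the uclamp minimums are assumed to be 0.

```c
#include <stdbool.h>

/* Assumed form of the capacity margin: util must stay below 80% of capacity. */
static bool fits_capacity_sketch(unsigned long util, unsigned long capacity)
{
	return util * 1280 < capacity * 1024;
}

/* Check that ignores rq clamps: task util capped by the task's own max. */
static bool naive_fits_sketch(unsigned long task_util,
			      unsigned long task_uclamp_max,
			      unsigned long cpu_cap, unsigned long cpu_util)
{
	unsigned long util = task_util < task_uclamp_max ?
			     task_util : task_uclamp_max;

	return fits_capacity_sketch(util, cpu_cap - cpu_util);
}

/* Check that peeks at the rq after enqueue, with max-aggregated clamps. */
static bool rq_aware_fits_sketch(unsigned long task_util,
				 unsigned long task_uclamp_max,
				 unsigned long rq_uclamp_max,
				 unsigned long cpu_cap, unsigned long cpu_util)
{
	/* Utilization of the CPU once the task is enqueued on it. */
	unsigned long util = cpu_util + task_util;
	/* rq clamps are max-aggregated over all enqueued tasks. */
	unsigned long max = rq_uclamp_max > task_uclamp_max ?
			    rq_uclamp_max : task_uclamp_max;

	if (util > max)
		util = max;

	return fits_capacity_sketch(util, cpu_cap);
}
```

With the values from the message (capacity 512, rq utilization 200, rq.uclamp.max 768, task util_est 600, p.uclamp.max 100), the naive check accepts the CPU while the rq-aware check rejects it, because the clamped utilization after enqueue is 768.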
Valentin Schneider	50262f741b	BACKPORT: sched/fair: Make task_fits_capacity() consider uclamp restrictions task_fits_capacity() drives CPU selection at wakeup time, and is also used to detect misfit tasks. Right now it does so by comparing task_util_est() with a CPU's capacity, but doesn't take into account uclamp restrictions. There's a few interesting uses that can come out of doing this. For instance, a low uclamp.max value could prevent certain tasks from being flagged as misfit tasks, so they could merrily remain on low-capacity CPUs. Similarly, a high uclamp.min value would steer tasks towards high capacity CPUs at wakeup (and, should that fail, later steered via misfit balancing), so such "boosted" tasks would favor CPUs of higher capacity. Introduce uclamp_task_util() and make task_fits_capacity() use it. [QP: fixed missing dependency on fits_capacity() by using the open coded alternative] Bug: 120440300 Tested-By: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Quentin Perret <qperret@google.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191211113851.24241-5-valentin.schneider@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `a7008c07a5`) Signed-off-by: Quentin Perret <qperret@google.com> Change-Id: Iabde2eda7252c3bcc273e61260a7a12a7de991b1	2020-02-01 16:14:11 +00:00
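A rough C sketch of the idea, assuming a simplified task descriptor and the same 1280/1024 capacity margin as the open-coded check; `task_sketch` and both function names are hypothetical and only mirror the roles of uclamp_task_util() and task_fits_capacity() described above.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the per-task values involved. */
struct task_sketch {
	unsigned long util_est;   /* estimated utilization */
	unsigned long uclamp_min; /* effective minimum clamp */
	unsigned long uclamp_max; /* effective maximum clamp */
};

/* Task utilization restricted to the [uclamp_min, uclamp_max] range. */
static unsigned long uclamp_task_util_sketch(const struct task_sketch *p)
{
	unsigned long util = p->util_est;

	if (util < p->uclamp_min)
		util = p->uclamp_min;
	if (util > p->uclamp_max)
		util = p->uclamp_max;

	return util;
}

/* Open-coded capacity check with an assumed 80% margin. */
static bool task_fits_capacity_sketch(const struct task_sketch *p,
				      unsigned long capacity)
{
	return uclamp_task_util_sketch(p) * 1280 < capacity * 1024;
}
```

Under these assumptions, a task with util_est 600 and uclamp.max 100 fits a CPU of capacity 512 and is not flagged as misfit, whereas a task with util_est 50 and uclamp.min 800 no longer fits that CPU and is steered towards one of capacity 1024.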
Patrick Bellasi	f609a2239f	ANDROID: sched/core: Move SchedTune task API into UtilClamp wrappers The main SchedTune API calls related to task tuning attributes are now wrapped by more generic and mainlinish UtilClamp calls. The new APIs are: - uclamp_task(p) <= boosted_task_util(p) - uclamp_boosted(p) <= schedtune_task_boost(p) > 0 - uclamp_latency_sensitive(p) <= schedtune_prefer_idle(p) Let's provide also an implementation of the same API based on the new uclamp.uclamp_latency_sensitive flag. Bug: 120440300 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> [Modified the patch to use uclamp.latency_sensitive instead of mainline attributes] Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: Ib1a6902e1c07a82a370e36bf1776d895b7528cbc Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 16:14:11 +00:00
Quentin Perret	752b47b84d	ANDROID: sched/core: Add a latency-sensitive flag to uclamp Add a 'latency_sensitive' flag to uclamp in order to express the need for some tasks to find a CPU where they can wake-up quickly. This is not expected to be used without cgroup support, so add solely a cgroup interface for it. As this flag represents a boolean attribute and not an amount of resources to be shared, it is not clear what the delegation logic should be. As such, it is kept simple: every new cgroup starts with latency_sensitive set to false, regardless of the parent. In essence, this is similar to SchedTune's prefer-idle flag which was used in android-4.19 and prior. Bug: 120440300 Change-Id: I722d8ecabb428bb7b95a5b54bc70a87f182dde2a Signed-off-by: Quentin Perret <quentin.perret@arm.com> (cherry picked from commit ad7dd648fc7dbe11f23673a3463af2468a274998) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:17 +00:00
Patrick Bellasi	9a05300da0	ANDROID: sched/tune: Move SchedTune cpu API into UtilClamp wrappers The SchedTune CPU boosting API is currently used from sugov_get_util() to get the boosted utilization and to pass it into schedutil_cpu_util(). When UtilClamp is in use instead we call schedutil_cpu_util() by passing in just the CFS utilization and the clamping is done internally on the aggregated CFS+RT utilization for FREQUENCY_UTIL calls. This asymmetry is not required moreover, schedutil code is polluted by non-mainline SchedTune code. Wrap SchedTune API call related to cpu utilization boosting with a more generic and mainlinish UtilClamp call: - uclamp_rq_util_with(cpu, util, p) <= boosted_cpu_util(cpu) This new API is already used in schedutil_cpu_util() to clamp the aggregated RT+CFS utilization on FREQUENCY_UTIL calls. Move the cpu boosting into uclamp_rq_util_with() so that we remove any SchedTune specific bit from kernel/sched/cpufreq_schedutil.c. Get rid of the no more required boosted_cpu_util(cpu) method and replace it with a stune_util(cpu, util) which signature is better aligned with its uclamp_rq_util_with(cpu, util, p) counterpart. Bug: 120440300 Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I45b0f0f54123fe0a2515fa9f1683842e6b99234f [Removed superfluous __maybe_unused for capacity_orig_of] Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:17 +00:00
Li Guanglei	7e1c333ed1	FROMGIT: sched/core: Fix size of rq::uclamp initialization rq::uclamp is an array of struct uclamp_rq, make sure we clear the whole thing. Bug: 120440300 Fixes: `69842cba9a` ("sched/uclamp: Add CPU's clamp buckets refcounting") Signed-off-by: Li Guanglei <guanglei.li@unisoc.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Qais Yousef <qais.yousef@arm.com> Link: https://lkml.kernel.org/r/1577259844-12677-1-git-send-email-guangleix.li@gmail.com (cherry picked from commit `dcd6dffb0a` https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: Id36a2b77c45e586535e8fadfb7d66868ca8fe8c7 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:16 +00:00
Qais Yousef	45b9d34bec	FROMGIT: sched/uclamp: Fix a bug in propagating uclamp value in new cgroups When a new cgroup is created, the effective uclamp value wasn't updated with a call to cpu_util_update_eff() that looks at the hierarchy and updates to the most restrictive values. Fix it by ensuring to call cpu_util_update_eff() when a new cgroup becomes online. Without this change, the newly created cgroup uses the default root_task_group uclamp values, which is 1024 for both uclamp_{min, max}, which will cause the rq to be clamped to max, hence cause the system to run at max frequency. The problem was observed on Ubuntu server and was reproduced on Debian and Buildroot rootfs. By default, Ubuntu and Debian create a cpu controller cgroup hierarchy and add all tasks to it - which creates enough noise to keep the rq uclamp value at max most of the time. Imitating this behavior makes the problem visible in Buildroot too which otherwise looks fine since it's a minimal userspace. Bug: 120440300 Fixes: `0b60ba2dd3` ("sched/uclamp: Propagate parent clamps") Reported-by: Doug Smythies <dsmythies@telus.net> Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Doug Smythies <dsmythies@telus.net> Link: https://lore.kernel.org/lkml/000701d5b965$361b6c60$a2524520$@net/ (cherry picked from commit `7226017ad3` https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I9636c60e04d58bbfc5041df1305b34a12b5a3f46 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:16 +00:00
Valentin Schneider	f59dfad8f9	FROMGIT: sched/uclamp: Rename uclamp_util_with() into uclamp_rq_util_with() The current helper returns (CPU) rq utilization with uclamp restrictions taken into account. A uclamp task utilization helper would be quite helpful, but this requires some renaming. Prepare the code for the introduction of a uclamp_task_util() by renaming the existing uclamp_util_with() to uclamp_rq_util_with(). Bug: 120440300 Tested-By: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Quentin Perret <qperret@google.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191211113851.24241-4-valentin.schneider@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `d2b58a286e` https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I3e7146b788e079e400167203df5e5dadee2fd232 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:16 +00:00
Valentin Schneider	254e090f3a	FROMGIT: sched/uclamp: Make uclamp util helpers use and return UL values Vincent pointed out recently that the canonical type for utilization values is 'unsigned long'. Internally uclamp uses 'unsigned int' values for cache optimization, but this doesn't have to be exported to its users. Make the uclamp helpers that deal with utilization use and return unsigned long values. Bug: 120440300 Tested-By: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Quentin Perret <qperret@google.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191211113851.24241-3-valentin.schneider@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `686516b55e` https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: Id3837f12237e5b77eb3a236bd32457dcd7de743e Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:16 +00:00
Valentin Schneider	6477d90135	FROMGIT: sched/uclamp: Remove uclamp_util() The sole user of uclamp_util(), schedutil_cpu_util(), was made to use uclamp_util_with() instead in commit: `af24bde8df` ("sched/uclamp: Add uclamp support to energy_compute()") From then on, uclamp_util() has remained unused. Being a simple wrapper around uclamp_util_with(), we can get rid of it and win back a few lines. Bug: 120440300 Tested-By: Dietmar Eggemann <dietmar.eggemann@arm.com> Suggested-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191211113851.24241-2-valentin.schneider@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `59fe675248` https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched/core) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I11dbff80c6c4be9666438800b2527aca8cd24025 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:16 +00:00
Qais Yousef	cdadd91444	BACKPORT: sched/rt: Make RT capacity-aware Capacity Awareness refers to the fact that on heterogeneous systems (like Arm big.LITTLE), the capacity of the CPUs is not uniform, hence when placing tasks we need to be aware of this difference of CPU capacities. In such scenarios we want to ensure that the selected CPU has enough capacity to meet the requirement of the running task. Enough capacity means here that capacity_orig_of(cpu) >= task.requirement. The definition of task.requirement is dependent on the scheduling class. For CFS, utilization is used to select a CPU that has >= capacity value than the cfs_task.util. capacity_orig_of(cpu) >= cfs_task.util DL isn't capacity aware at the moment but can make use of the bandwidth reservation to implement that in a similar manner CFS uses utilization. The following patchset implements that: https://lore.kernel.org/lkml/20190506044836.2914-1-luca.abeni@santannapisa.it/ capacity_orig_of(cpu)/SCHED_CAPACITY >= dl_deadline/dl_runtime For RT we don't have a per task utilization signal and we lack any information in general about what performance requirement the RT task needs. But with the introduction of uclamp, RT tasks can now control that by setting uclamp_min to guarantee a minimum performance point. ATM the uclamp value are only used for frequency selection; but on heterogeneous systems this is not enough and we need to ensure that the capacity of the CPU is >= uclamp_min. Which is what implemented here. capacity_orig_of(cpu) >= rt_task.uclamp_min Note that by default uclamp.min is 1024, which means that RT tasks will always be biased towards the big CPUs, which make for a better more predictable behavior for the default case. Must stress that the bias acts as a hint rather than a definite placement strategy. For example, if all big cores are busy executing other RT tasks we can't guarantee that a new RT task will be placed there. 
On non-heterogeneous systems the original behavior of RT should be retained. Similarly if uclamp is not selected in the config. [ mingo: Minor edits to comments. ] Bug: 120440300 Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20191009104611.15363-1-qais.yousef@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `804d402fb6` https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git) [Qais: resolved minor conflict in kernel/sched/cpupri.c] Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: Ifc9da1c47de1aec9b4d87be2614e4c8968366900 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:16 +00:00
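The fitness rule capacity_orig_of(cpu) >= rt_task.uclamp_min, and its role as a hint rather than a guarantee, can be illustrated with a short C sketch. The function names, the `busy` array, and the first-fit loop are assumptions made for illustration and do not reflect the actual selection path in the RT scheduling class.

```c
#include <stdbool.h>

/* An RT task fits a CPU when the CPU's original capacity covers uclamp_min. */
static bool rt_task_fits_capacity_sketch(unsigned long uclamp_min,
					 unsigned long capacity_orig)
{
	return capacity_orig >= uclamp_min;
}

/*
 * Hypothetical first-fit selection: prefer a free CPU that fits, otherwise
 * fall back to any free CPU. Returns -1 if every CPU is busy.
 */
static int pick_rt_cpu_sketch(const unsigned long *capacity_orig,
			      const bool *busy, int nr_cpus,
			      unsigned long uclamp_min)
{
	int fallback = -1;

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		if (busy[cpu])
			continue;
		if (rt_task_fits_capacity_sketch(uclamp_min, capacity_orig[cpu]))
			return cpu;
		if (fallback < 0)
			fallback = cpu;
	}

	return fallback;
}
```

On a hypothetical 2+2 system with capacities 512 and 1024, a task with uclamp_min 1024 lands on a big CPU when one is free and on a little CPU when both big CPUs are busy, which matches the "bias, not guarantee" behavior described in the message.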
Valentin Schneider	ea9ce42997	UPSTREAM: sched/uclamp: Fix overzealous type replacement Some uclamp helpers had their return type changed from 'unsigned int' to 'enum uclamp_id' by commit `0413d7f33e` ("sched/uclamp: Always use 'enum uclamp_id' for clamp_id values") but it happens that some do return a value in the [0, SCHED_CAPACITY_SCALE] range, which should really be unsigned int. The affected helpers are uclamp_none(), uclamp_rq_max_value() and uclamp_eff_value(). Fix those up. Note that this doesn't lead to any obj diff using a relatively recent aarch64 compiler (8.3-2019.03). The current code of e.g. uclamp_eff_value() properly returns an 11 bit value (bits_per(1024)) and doesn't seem to do anything funny. I'm still marking this as fixing the above commit to be on the safe side. Bug: 120440300 Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Qais Yousef <qais.yousef@arm.com> Acked-by: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar.Eggemann@arm.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: patrick.bellasi@matbug.net Cc: qperret@google.com Cc: surenb@google.com Cc: tj@kernel.org Fixes: `0413d7f33e` ("sched/uclamp: Always use 'enum uclamp_id' for clamp_id values") Link: https://lkml.kernel.org/r/20191115103908.27610-1-valentin.schneider@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `7763baace1`) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I924a99c125372a8fca81cb4bc0c82e6a7183fc8a Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:15 +00:00
Qais Yousef	7125c7cfca	UPSTREAM: sched/uclamp: Fix incorrect condition uclamp_update_active() should perform the update when p->uclamp[clamp_id].active is true. But when the logic was inverted in [1], the if condition wasn't inverted correctly too. [1] https://lore.kernel.org/lkml/20190902073836.GO2369@hirez.programming.kicks-ass.net/ Bug: 120440300 Reported-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Qais Yousef <qais.yousef@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Vincent Guittot <vincent.guittot@linaro.org> Cc: Ben Segall <bsegall@google.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Patrick Bellasi <patrick.bellasi@matbug.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `babbe170e0` ("sched/uclamp: Update CPU's refcount on TG's clamp changes") Link: https://lkml.kernel.org/r/20191114211052.15116-1-qais.yousef@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> (cherry picked from commit `6e1ff0773f`) Signed-off-by: Qais Yousef <qais.yousef@arm.com> Change-Id: I51b58a6089290277e08a0aaa72b86f852eec1512 Signed-off-by: Quentin Perret <qperret@google.com>	2020-02-01 15:03:15 +00:00


29,125 commits