linux-uconsole/arch
Thomas Gleixner e841240174 x86/speculation: Prevent stale SPEC_CTRL msr content
commit 6d991ba509 upstream

The seccomp speculation control operates on all tasks of a process, but
only the current task of a process can update the MSR immediately. For the
other threads the update is deferred to the next context switch.

This creates the following situation with Process A and B:

Process A task 2 and Process B task 1 are pinned on CPU1. Process A task 2
does not have the speculation control TIF bit set. Process B task 1 has the
speculation control TIF bit set.

CPU0					CPU1
					MSR bit is set
					ProcB.T1 schedules out
					ProcA.T2 schedules in
					MSR bit is cleared
ProcA.T1
  seccomp_update()
  set TIF bit on ProcA.T2
					ProcB.T1 schedules in
					MSR is not updated  <-- FAIL

This happens because the context switch code tries to avoid the MSR update
if the speculation control TIF bits of the incoming and the outgoing task
are the same. In the worst case ProcB.T1 and ProcA.T2 are the only tasks
scheduling back and forth on CPU1, which keeps the MSR stale forever.

In theory this could be remedied by IPIs, but chasing the remote task which
could be migrated is complex and full of races.

The straight forward solution is to avoid the asychronous update of the TIF
bit and defer it to the next context switch. The speculation control state
is stored in task_struct::atomic_flags by the prctl and seccomp updates
already.

Add a new TIF_SPEC_FORCE_UPDATE bit and set this after updating the
atomic_flags. Check the bit on context switch and force a synchronous
update of the speculation control if set. Use the same mechanism for
updating the current task.

Reported-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Casey Schaufler <casey.schaufler@intel.com>
Cc: Asit Mallick <asit.k.mallick@intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Waiman Long <longman9394@gmail.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Dave Stewart <david.c.stewart@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1811272247140.1875@nanos.tec.linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-12-05 19:32:04 +01:00
..
alpha arch/alpha, termios: implement BOTHER, IBSHIFT and termios2 2018-11-21 09:19:20 +01:00
arc ARC: clone syscall to setp r25 as thread pointer 2018-10-05 14:33:29 -07:00
arm ARM: dts: fsl: Fix improperly quoted stdout-path values 2018-11-27 16:13:04 +01:00
arm64 arm64: dts: renesas: condor: switch from EtherAVB to GEther 2018-11-27 16:13:04 +01:00
c6x kbuild: rename LDFLAGS to KBUILD_LDFLAGS 2018-08-24 08:22:08 +09:00
h8300 Kbuild updates for v4.19 (2nd) 2018-08-25 13:40:38 -07:00
hexagon hexagon: modify ffs() and fls() to return int 2018-09-10 19:42:15 -05:00
ia64 ia64: Fix allnoconfig section mismatch for ioc_init/ioc_iommu_info 2018-08-22 14:12:47 -07:00
m68k crypto: speck - remove Speck 2018-11-13 11:08:46 -08:00
microblaze kbuild: rename LDFLAGS to KBUILD_LDFLAGS 2018-08-24 08:22:08 +09:00
mips MIPS: OCTEON: cavium_octeon_defconfig: re-enable OCTEON USB driver 2018-11-27 16:13:08 +01:00
nds32 nds32: linker script: GCOV kernel may refers data in __exit 2018-09-05 10:16:26 +08:00
nios2 nios2: kconfig: remove duplicate DEBUG_STACK_USAGE symbol defintions 2018-08-27 09:47:20 +08:00
openrisc OpenRISC updates for 4.19 2018-08-23 14:09:37 -07:00
parisc parisc: Fix exported address of os_hpmc handler 2018-11-13 11:08:18 -08:00
powerpc powerpc/numa: Suppress "VPHN is not supported" messages 2018-12-01 09:37:33 +01:00
riscv RISC-V: Silence some module warnings on 32-bit 2018-12-01 09:37:33 +01:00
s390 s390/perf: Change CPUM_CF return code in event init function 2018-11-27 16:13:05 +01:00
sh kbuild: rename LDFLAGS to KBUILD_LDFLAGS 2018-08-24 08:22:08 +09:00
sparc sparc64: Wire up compat getpeername and getsockname. 2018-11-04 14:50:54 +01:00
um um: Give start_idle_thread() a return code 2018-11-27 16:12:59 +01:00
unicore32 mm: convert return type of handle_mm_fault() caller to vm_fault_t 2018-08-17 16:20:28 -07:00
x86 x86/speculation: Prevent stale SPEC_CTRL msr content 2018-12-05 19:32:04 +01:00
xtensa xtensa: fix boot parameters address translation 2018-11-21 09:19:16 +01:00
.gitignore
Kconfig Merge branch 'tlb-fixes' 2018-08-23 14:55:01 -07:00