KPTI/Spectre: add more fixes

* initial IBRS/IBPB/SPEC_CTRL support
* regression fixes for KPTI
* additional hardening against Spectre

based on Ubuntu-4.13.0-29.32 and mainline 4.14
This commit is contained in:
Fabian Grünbichler 2018-01-15 12:34:50 +01:00
parent 59d5af6732
commit 035dbe6708
59 changed files with 5150 additions and 0 deletions

View file

@ -0,0 +1,66 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx@linutronix.de>
Date: Wed, 3 Jan 2018 15:57:59 +0100
Subject: [PATCH] x86/pti: Make sure the user/kernel PTEs match
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5754
Meelis reported that his K8 Athlon64 emits MCE warnings when PTI is
enabled:
[Hardware Error]: Error Addr: 0x0000ffff81e000e0
[Hardware Error]: MC1 Error: L1 TLB multimatch.
[Hardware Error]: cache level: L1, tx: INSN
The address is in the entry area, which is mapped into kernel _AND_ user
space. That's special because we switch CR3 while we are executing
there.
User mapping:
0xffffffff81e00000-0xffffffff82000000 2M ro PSE GLB x pmd
Kernel mapping:
0xffffffff81000000-0xffffffff82000000 16M ro PSE x pmd
So the K8 is complaining that the TLB entries differ. They differ in the
GLB bit.
Drop the GLB bit when installing the user shared mapping.
Fixes: 6dc72c3cbca0 ("x86/mm/pti: Share entry text PMD")
Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Meelis Roos <mroos@linux.ee>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801031407180.1957@nanos
(cherry picked from commit 52994c256df36fda9a715697431cba9daecb6b11)
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 8a95d206afc447d8461815c67e618bd8b2c6457f)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/mm/pti.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c
index bce8aea65606..2da28ba97508 100644
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -367,7 +367,8 @@ static void __init pti_setup_espfix64(void)
static void __init pti_clone_entry_text(void)
{
pti_clone_pmds((unsigned long) __entry_text_start,
- (unsigned long) __irqentry_text_end, _PAGE_RW);
+ (unsigned long) __irqentry_text_end,
+ _PAGE_RW | _PAGE_GLOBAL);
}
/*
--
2.14.2

View file

@ -0,0 +1,172 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Josh Poimboeuf <jpoimboe@redhat.com>
Date: Sun, 31 Dec 2017 10:18:06 -0600
Subject: [PATCH] x86/dumpstack: Fix partial register dumps
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5754
The show_regs_safe() logic is wrong. When there's an iret stack frame,
it prints the entire pt_regs -- most of which is random stack data --
instead of just the five registers at the end.
show_regs_safe() is also poorly named: the on_stack() checks aren't for
safety. Rename the function to show_regs_if_on_stack() and add a
comment to explain why the checks are needed.
These issues were introduced with the "partial register dump" feature of
the following commit:
b02fcf9ba121 ("x86/unwinder: Handle stack overflows more gracefully")
That patch had gone through a few iterations of development, and the
above issues were artifacts from a previous iteration of the patch where
'regs' pointed directly to the iret frame rather than to the (partially
empty) pt_regs.
Tested-by: Alexander Tsoy <alexander@tsoy.me>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toralf Förster <toralf.foerster@gmx.de>
Cc: stable@vger.kernel.org
Fixes: b02fcf9ba121 ("x86/unwinder: Handle stack overflows more gracefully")
Link: http://lkml.kernel.org/r/5b05b8b344f59db2d3d50dbdeba92d60f2304c54.1514736742.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit a9cdbe72c4e8bf3b38781c317a79326e2e1a230d)
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 3f159d02ecca1ffe81dc467767833dd6d0345147)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/include/asm/unwind.h | 17 +++++++++++++----
arch/x86/kernel/dumpstack.c | 28 ++++++++++++++++++++--------
arch/x86/kernel/stacktrace.c | 2 +-
3 files changed, 34 insertions(+), 13 deletions(-)
diff --git a/arch/x86/include/asm/unwind.h b/arch/x86/include/asm/unwind.h
index 38fa6154e382..e1c1cb5019bc 100644
--- a/arch/x86/include/asm/unwind.h
+++ b/arch/x86/include/asm/unwind.h
@@ -55,18 +55,27 @@ void unwind_start(struct unwind_state *state, struct task_struct *task,
#if defined(CONFIG_UNWINDER_ORC) || defined(CONFIG_UNWINDER_FRAME_POINTER)
/*
- * WARNING: The entire pt_regs may not be safe to dereference. In some cases,
- * only the iret frame registers are accessible. Use with caution!
+ * If 'partial' returns true, only the iret frame registers are valid.
*/
-static inline struct pt_regs *unwind_get_entry_regs(struct unwind_state *state)
+static inline struct pt_regs *unwind_get_entry_regs(struct unwind_state *state,
+ bool *partial)
{
if (unwind_done(state))
return NULL;
+ if (partial) {
+#ifdef CONFIG_UNWINDER_ORC
+ *partial = !state->full_regs;
+#else
+ *partial = false;
+#endif
+ }
+
return state->regs;
}
#else
-static inline struct pt_regs *unwind_get_entry_regs(struct unwind_state *state)
+static inline struct pt_regs *unwind_get_entry_regs(struct unwind_state *state,
+ bool *partial)
{
return NULL;
}
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 19a936e9b259..8da5b487919f 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -76,12 +76,23 @@ void show_iret_regs(struct pt_regs *regs)
regs->sp, regs->flags);
}
-static void show_regs_safe(struct stack_info *info, struct pt_regs *regs)
+static void show_regs_if_on_stack(struct stack_info *info, struct pt_regs *regs,
+ bool partial)
{
- if (on_stack(info, regs, sizeof(*regs)))
+ /*
+ * These on_stack() checks aren't strictly necessary: the unwind code
+ * has already validated the 'regs' pointer. The checks are done for
+ * ordering reasons: if the registers are on the next stack, we don't
+ * want to print them out yet. Otherwise they'll be shown as part of
+ * the wrong stack. Later, when show_trace_log_lvl() switches to the
+ * next stack, this function will be called again with the same regs so
+ * they can be printed in the right context.
+ */
+ if (!partial && on_stack(info, regs, sizeof(*regs))) {
__show_regs(regs, 0);
- else if (on_stack(info, (void *)regs + IRET_FRAME_OFFSET,
- IRET_FRAME_SIZE)) {
+
+ } else if (partial && on_stack(info, (void *)regs + IRET_FRAME_OFFSET,
+ IRET_FRAME_SIZE)) {
/*
* When an interrupt or exception occurs in entry code, the
* full pt_regs might not have been saved yet. In that case
@@ -98,6 +109,7 @@ void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
struct stack_info stack_info = {0};
unsigned long visit_mask = 0;
int graph_idx = 0;
+ bool partial;
printk("%sCall Trace:\n", log_lvl);
@@ -140,7 +152,7 @@ void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
printk("%s <%s>\n", log_lvl, stack_name);
if (regs)
- show_regs_safe(&stack_info, regs);
+ show_regs_if_on_stack(&stack_info, regs, partial);
/*
* Scan the stack, printing any text addresses we find. At the
@@ -164,7 +176,7 @@ void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
/*
* Don't print regs->ip again if it was already printed
- * by show_regs_safe() below.
+ * by show_regs_if_on_stack().
*/
if (regs && stack == &regs->ip) {
unwind_next_frame(&state);
@@ -200,9 +212,9 @@ void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
unwind_next_frame(&state);
/* if the frame has entry regs, print them */
- regs = unwind_get_entry_regs(&state);
+ regs = unwind_get_entry_regs(&state, &partial);
if (regs)
- show_regs_safe(&stack_info, regs);
+ show_regs_if_on_stack(&stack_info, regs, partial);
}
if (stack_name)
diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c
index 8dabd7bf1673..60244bfaf88f 100644
--- a/arch/x86/kernel/stacktrace.c
+++ b/arch/x86/kernel/stacktrace.c
@@ -98,7 +98,7 @@ static int __save_stack_trace_reliable(struct stack_trace *trace,
for (unwind_start(&state, task, NULL, NULL); !unwind_done(&state);
unwind_next_frame(&state)) {
- regs = unwind_get_entry_regs(&state);
+ regs = unwind_get_entry_regs(&state, NULL);
if (regs) {
/*
* Kernel mode registers on the stack indicate an
--
2.14.2

View file

@ -0,0 +1,58 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Josh Poimboeuf <jpoimboe@redhat.com>
Date: Sun, 31 Dec 2017 10:18:07 -0600
Subject: [PATCH] x86/dumpstack: Print registers for first stack frame
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5754
In the stack dump code, if the frame after the starting pt_regs is also
a regs frame, the registers don't get printed. Fix that.
Reported-by: Andy Lutomirski <luto@amacapital.net>
Tested-by: Alexander Tsoy <alexander@tsoy.me>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toralf Förster <toralf.foerster@gmx.de>
Cc: stable@vger.kernel.org
Fixes: 3b3fa11bc700 ("x86/dumpstack: Print any pt_regs found on the stack")
Link: http://lkml.kernel.org/r/396f84491d2f0ef64eda4217a2165f5712f6a115.1514736742.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
(cherry picked from commit 3ffdeb1a02be3086f1411a15c5b9c481fa28e21f)
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 3aef1ce621ae2eb0bd58e07cf9e66a859faa17cd)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/kernel/dumpstack.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 8da5b487919f..042f80c50e3b 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -115,6 +115,7 @@ void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
unwind_start(&state, task, regs, stack);
stack = stack ? : get_stack_pointer(task, regs);
+ regs = unwind_get_entry_regs(&state, &partial);
/*
* Iterate through the stacks, starting with the current stack pointer.
@@ -132,7 +133,7 @@ void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
* - hardirq stack
* - entry stack
*/
- for (regs = NULL; stack; stack = PTR_ALIGN(stack_info.next_sp, sizeof(long))) {
+ for ( ; stack; stack = PTR_ALIGN(stack_info.next_sp, sizeof(long))) {
const char *stack_name;
if (get_stack_info(stack, task, &stack_info, &visit_mask)) {
--
2.14.2

View file

@ -0,0 +1,64 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Nick Desaulniers <ndesaulniers@google.com>
Date: Wed, 3 Jan 2018 12:39:52 -0800
Subject: [PATCH] x86/process: Define cpu_tss_rw in same section as declaration
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5754
cpu_tss_rw is declared with DECLARE_PER_CPU_PAGE_ALIGNED
but then defined with DEFINE_PER_CPU_SHARED_ALIGNED
leading to section mismatch warnings.
Use DEFINE_PER_CPU_PAGE_ALIGNED consistently. This is necessary because
it's mapped to the cpu entry area and must be page aligned.
[ tglx: Massaged changelog a bit ]
Fixes: 1a935bc3d4ea ("x86/entry: Move SYSENTER_stack to the beginning of struct tss_struct")
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: thomas.lendacky@amd.com
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: tklauser@distanz.ch
Cc: minipli@googlemail.com
Cc: me@kylehuey.com
Cc: namit@vmware.com
Cc: luto@kernel.org
Cc: jpoimboe@redhat.com
Cc: tj@kernel.org
Cc: cl@linux.com
Cc: bp@suse.de
Cc: thgarnie@google.com
Cc: kirill.shutemov@linux.intel.com
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180103203954.183360-1-ndesaulniers@google.com
(cherry picked from commit 2fd9c41aea47f4ad071accf94b94f94f2c4d31eb)
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit f45e574914ae47825d2eea46abc9d6fbabe55e56)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/kernel/process.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 3688a7b9d055..07e6218ad7d9 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -46,7 +46,7 @@
* section. Since TSS's are completely CPU-local, we want them
* on exact cacheline boundaries, to eliminate cacheline ping-pong.
*/
-__visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss_rw) = {
+__visible DEFINE_PER_CPU_PAGE_ALIGNED(struct tss_struct, cpu_tss_rw) = {
.x86_tss = {
/*
* .sp0 is only used when entering ring 0 from a lower
--
2.14.2

View file

@ -0,0 +1,98 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx@linutronix.de>
Date: Fri, 5 Jan 2018 15:27:34 +0100
Subject: [PATCH] x86/pti: Rename BUG_CPU_INSECURE to BUG_CPU_MELTDOWN
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5754
Use the name associated with the particular attack which needs page table
isolation for mitigation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Cc: Jiri Koshina <jikos@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Lutomirski <luto@amacapital.net>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Turner <pjt@google.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Greg KH <gregkh@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801051525300.1724@nanos
(cherry picked from commit de791821c295cc61419a06fe5562288417d1bc58)
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit aefb6725ee33758a2869c37e22dbc7ca80548007)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/include/asm/cpufeatures.h | 2 +-
arch/x86/kernel/cpu/common.c | 2 +-
arch/x86/mm/pti.c | 6 +++---
3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 9b0c283afcf0..b7900d26066c 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -340,6 +340,6 @@
#define X86_BUG_SWAPGS_FENCE X86_BUG(11) /* SWAPGS without input dep on GS */
#define X86_BUG_MONITOR X86_BUG(12) /* IPI required to wake up remote CPU */
#define X86_BUG_AMD_E400 X86_BUG(13) /* CPU is among the affected by Erratum 400 */
-#define X86_BUG_CPU_INSECURE X86_BUG(14) /* CPU is insecure and needs kernel page table isolation */
+#define X86_BUG_CPU_MELTDOWN X86_BUG(14) /* CPU is affected by meltdown attack and needs kernel page table isolation */
#endif /* _ASM_X86_CPUFEATURES_H */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 1854dd8071a6..142ab555dafa 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -900,7 +900,7 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
setup_force_cpu_cap(X86_FEATURE_ALWAYS);
if (c->x86_vendor != X86_VENDOR_AMD)
- setup_force_cpu_bug(X86_BUG_CPU_INSECURE);
+ setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN);
fpu__init_system(c);
}
diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c
index 2da28ba97508..43d4a4a29037 100644
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -56,13 +56,13 @@
static void __init pti_print_if_insecure(const char *reason)
{
- if (boot_cpu_has_bug(X86_BUG_CPU_INSECURE))
+ if (boot_cpu_has_bug(X86_BUG_CPU_MELTDOWN))
pr_info("%s\n", reason);
}
static void __init pti_print_if_secure(const char *reason)
{
- if (!boot_cpu_has_bug(X86_BUG_CPU_INSECURE))
+ if (!boot_cpu_has_bug(X86_BUG_CPU_MELTDOWN))
pr_info("%s\n", reason);
}
@@ -96,7 +96,7 @@ void __init pti_check_boottime_disable(void)
}
autosel:
- if (!boot_cpu_has_bug(X86_BUG_CPU_INSECURE))
+ if (!boot_cpu_has_bug(X86_BUG_CPU_MELTDOWN))
return;
enable:
setup_force_cpu_cap(X86_FEATURE_PTI);
--
2.14.2

View file

@ -0,0 +1,61 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Jiri Kosina <jkosina@suse.cz>
Date: Fri, 5 Jan 2018 22:35:41 +0100
Subject: [PATCH] x86/pti: Unbreak EFI old_memmap
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5754
EFI_OLD_MEMMAP's efi_call_phys_prolog() calls set_pgd() with swapper PGD that
has PAGE_USER set, which makes PTI set NX on it, and therefore EFI can't
execute it's code.
Fix that by forcefully clearing _PAGE_NX from the PGD (this can't be done
by the pgprot API).
_PAGE_NX will be automatically reintroduced in efi_call_phys_epilog(), as
_set_pgd() will again notice that this is _PAGE_USER, and set _PAGE_NX on
it.
Tested-by: Dimitri Sivanich <sivanich@hpe.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/nycvar.YFH.7.76.1801052215460.11852@cbobk.fhfr.pm
(cherry picked from commit de53c3786a3ce162a1c815d0c04c766c23ec9c0a)
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 31afacd8089f54061e718e5d491f11747755c503)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/platform/efi/efi_64.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index b104224d3d6c..987a38e82f73 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -133,7 +133,9 @@ pgd_t * __init efi_call_phys_prolog(void)
pud[j] = *pud_offset(p4d_k, vaddr);
}
}
+ pgd_offset_k(pgd * PGDIR_SIZE)->pgd &= ~_PAGE_NX;
}
+
out:
__flush_tlb_all();
--
2.14.2

View file

@ -0,0 +1,275 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Dave Hansen <dave.hansen@linux.intel.com>
Date: Fri, 5 Jan 2018 09:44:36 -0800
Subject: [PATCH] x86/Documentation: Add PTI description
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5754
Add some details about how PTI works, what some of the downsides
are, and how to debug it when things go wrong.
Also document the kernel parameter: 'pti/nopti'.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Moritz Lipp <moritz.lipp@iaik.tugraz.at>
Cc: Daniel Gruss <daniel.gruss@iaik.tugraz.at>
Cc: Michael Schwarz <michael.schwarz@iaik.tugraz.at>
Cc: Richard Fellner <richard.fellner@student.tugraz.at>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andi Lutomirsky <luto@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180105174436.1BC6FA2B@viggo.jf.intel.com
(cherry picked from commit 01c9b17bf673b05bb401b76ec763e9730ccf1376)
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 1acf87c45b0170e717fc1b06a2d6fef47e07f79b)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
Documentation/admin-guide/kernel-parameters.txt | 21 ++-
Documentation/x86/pti.txt | 186 ++++++++++++++++++++++++
2 files changed, 200 insertions(+), 7 deletions(-)
create mode 100644 Documentation/x86/pti.txt
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index b4d2edf316db..1a6ebc6cdf26 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2677,8 +2677,6 @@
steal time is computed, but won't influence scheduler
behaviour
- nopti [X86-64] Disable kernel page table isolation
-
nolapic [X86-32,APIC] Do not enable or use the local APIC.
nolapic_timer [X86-32,APIC] Do not use the local APIC timer.
@@ -3247,11 +3245,20 @@
pt. [PARIDE]
See Documentation/blockdev/paride.txt.
- pti= [X86_64]
- Control user/kernel address space isolation:
- on - enable
- off - disable
- auto - default setting
+ pti= [X86_64] Control Page Table Isolation of user and
+ kernel address spaces. Disabling this feature
+ removes hardening, but improves performance of
+ system calls and interrupts.
+
+ on - unconditionally enable
+ off - unconditionally disable
+ auto - kernel detects whether your CPU model is
+ vulnerable to issues that PTI mitigates
+
+ Not specifying this option is equivalent to pti=auto.
+
+ nopti [X86_64]
+ Equivalent to pti=off
pty.legacy_count=
[KNL] Number of legacy pty's. Overwrites compiled-in
diff --git a/Documentation/x86/pti.txt b/Documentation/x86/pti.txt
new file mode 100644
index 000000000000..d11eff61fc9a
--- /dev/null
+++ b/Documentation/x86/pti.txt
@@ -0,0 +1,186 @@
+Overview
+========
+
+Page Table Isolation (pti, previously known as KAISER[1]) is a
+countermeasure against attacks on the shared user/kernel address
+space such as the "Meltdown" approach[2].
+
+To mitigate this class of attacks, we create an independent set of
+page tables for use only when running userspace applications. When
+the kernel is entered via syscalls, interrupts or exceptions, the
+page tables are switched to the full "kernel" copy. When the system
+switches back to user mode, the user copy is used again.
+
+The userspace page tables contain only a minimal amount of kernel
+data: only what is needed to enter/exit the kernel such as the
+entry/exit functions themselves and the interrupt descriptor table
+(IDT). There are a few strictly unnecessary things that get mapped
+such as the first C function when entering an interrupt (see
+comments in pti.c).
+
+This approach helps to ensure that side-channel attacks leveraging
+the paging structures do not function when PTI is enabled. It can be
+enabled by setting CONFIG_PAGE_TABLE_ISOLATION=y at compile time.
+Once enabled at compile-time, it can be disabled at boot with the
+'nopti' or 'pti=' kernel parameters (see kernel-parameters.txt).
+
+Page Table Management
+=====================
+
+When PTI is enabled, the kernel manages two sets of page tables.
+The first set is very similar to the single set which is present in
+kernels without PTI. This includes a complete mapping of userspace
+that the kernel can use for things like copy_to_user().
+
+Although _complete_, the user portion of the kernel page tables is
+crippled by setting the NX bit in the top level. This ensures
+that any missed kernel->user CR3 switch will immediately crash
+userspace upon executing its first instruction.
+
+The userspace page tables map only the kernel data needed to enter
+and exit the kernel. This data is entirely contained in the 'struct
+cpu_entry_area' structure which is placed in the fixmap which gives
+each CPU's copy of the area a compile-time-fixed virtual address.
+
+For new userspace mappings, the kernel makes the entries in its
+page tables like normal. The only difference is when the kernel
+makes entries in the top (PGD) level. In addition to setting the
+entry in the main kernel PGD, a copy of the entry is made in the
+userspace page tables' PGD.
+
+This sharing at the PGD level also inherently shares all the lower
+layers of the page tables. This leaves a single, shared set of
+userspace page tables to manage. One PTE to lock, one set of
+accessed bits, dirty bits, etc...
+
+Overhead
+========
+
+Protection against side-channel attacks is important. But,
+this protection comes at a cost:
+
+1. Increased Memory Use
+ a. Each process now needs an order-1 PGD instead of order-0.
+ (Consumes an additional 4k per process).
+ b. The 'cpu_entry_area' structure must be 2MB in size and 2MB
+ aligned so that it can be mapped by setting a single PMD
+ entry. This consumes nearly 2MB of RAM once the kernel
+ is decompressed, but no space in the kernel image itself.
+
+2. Runtime Cost
+ a. CR3 manipulation to switch between the page table copies
+ must be done at interrupt, syscall, and exception entry
+ and exit (it can be skipped when the kernel is interrupted,
+ though.) Moves to CR3 are on the order of a hundred
+ cycles, and are required at every entry and exit.
+ b. A "trampoline" must be used for SYSCALL entry. This
+ trampoline depends on a smaller set of resources than the
+ non-PTI SYSCALL entry code, so requires mapping fewer
+ things into the userspace page tables. The downside is
+ that stacks must be switched at entry time.
+ d. Global pages are disabled for all kernel structures not
+ mapped into both kernel and userspace page tables. This
+ feature of the MMU allows different processes to share TLB
+ entries mapping the kernel. Losing the feature means more
+ TLB misses after a context switch. The actual loss of
+ performance is very small, however, never exceeding 1%.
+ d. Process Context IDentifiers (PCID) is a CPU feature that
+ allows us to skip flushing the entire TLB when switching page
+ tables by setting a special bit in CR3 when the page tables
+ are changed. This makes switching the page tables (at context
+ switch, or kernel entry/exit) cheaper. But, on systems with
+ PCID support, the context switch code must flush both the user
+ and kernel entries out of the TLB. The user PCID TLB flush is
+ deferred until the exit to userspace, minimizing the cost.
+ See intel.com/sdm for the gory PCID/INVPCID details.
+ e. The userspace page tables must be populated for each new
+ process. Even without PTI, the shared kernel mappings
+ are created by copying top-level (PGD) entries into each
+ new process. But, with PTI, there are now *two* kernel
+ mappings: one in the kernel page tables that maps everything
+ and one for the entry/exit structures. At fork(), we need to
+ copy both.
+ f. In addition to the fork()-time copying, there must also
+ be an update to the userspace PGD any time a set_pgd() is done
+ on a PGD used to map userspace. This ensures that the kernel
+ and userspace copies always map the same userspace
+ memory.
+ g. On systems without PCID support, each CR3 write flushes
+ the entire TLB. That means that each syscall, interrupt
+ or exception flushes the TLB.
+ h. INVPCID is a TLB-flushing instruction which allows flushing
+ of TLB entries for non-current PCIDs. Some systems support
+ PCIDs, but do not support INVPCID. On these systems, addresses
+ can only be flushed from the TLB for the current PCID. When
+ flushing a kernel address, we need to flush all PCIDs, so a
+ single kernel address flush will require a TLB-flushing CR3
+ write upon the next use of every PCID.
+
+Possible Future Work
+====================
+1. We can be more careful about not actually writing to CR3
+ unless its value is actually changed.
+2. Allow PTI to be enabled/disabled at runtime in addition to the
+ boot-time switching.
+
+Testing
+========
+
+To test stability of PTI, the following test procedure is recommended,
+ideally doing all of these in parallel:
+
+1. Set CONFIG_DEBUG_ENTRY=y
+2. Run several copies of all of the tools/testing/selftests/x86/ tests
+ (excluding MPX and protection_keys) in a loop on multiple CPUs for
+ several minutes. These tests frequently uncover corner cases in the
+ kernel entry code. In general, old kernels might cause these tests
+ themselves to crash, but they should never crash the kernel.
+3. Run the 'perf' tool in a mode (top or record) that generates many
+ frequent performance monitoring non-maskable interrupts (see "NMI"
+ in /proc/interrupts). This exercises the NMI entry/exit code which
+ is known to trigger bugs in code paths that did not expect to be
+ interrupted, including nested NMIs. Using "-c" boosts the rate of
+ NMIs, and using two -c with separate counters encourages nested NMIs
+ and less deterministic behavior.
+
+ while true; do perf record -c 10000 -e instructions,cycles -a sleep 10; done
+
+4. Launch a KVM virtual machine.
+5. Run 32-bit binaries on systems supporting the SYSCALL instruction.
+ This has been a lightly-tested code path and needs extra scrutiny.
+
+Debugging
+=========
+
+Bugs in PTI cause a few different signatures of crashes
+that are worth noting here.
+
+ * Failures of the selftests/x86 code. Usually a bug in one of the
+ more obscure corners of entry_64.S
+ * Crashes in early boot, especially around CPU bringup. Bugs
+ in the trampoline code or mappings cause these.
+ * Crashes at the first interrupt. Caused by bugs in entry_64.S,
+ like screwing up a page table switch. Also caused by
+ incorrectly mapping the IRQ handler entry code.
+ * Crashes at the first NMI. The NMI code is separate from main
+ interrupt handlers and can have bugs that do not affect
+ normal interrupts. Also caused by incorrectly mapping NMI
+ code. NMIs that interrupt the entry code must be very
+ careful and can be the cause of crashes that show up when
+ running perf.
+ * Kernel crashes at the first exit to userspace. entry_64.S
+ bugs, or failing to map some of the exit code.
+ * Crashes at first interrupt that interrupts userspace. The paths
+ in entry_64.S that return to userspace are sometimes separate
+ from the ones that return to the kernel.
+ * Double faults: overflowing the kernel stack because of page
+ faults upon page faults. Caused by touching non-pti-mapped
+ data in the entry code, or forgetting to switch to kernel
+ CR3 before calling into C functions which are not pti-mapped.
+ * Userspace segfaults early in boot, sometimes manifesting
+ as mount(8) failing to mount the rootfs. These have
+ tended to be TLB invalidation issues. Usually invalidating
+ the wrong PCID, or otherwise missing an invalidation.
+
+1. https://gruss.cc/files/kaiser.pdf
+2. https://meltdownattack.com/meltdown.pdf
--
2.14.2

View file

@ -0,0 +1,68 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: David Woodhouse <dwmw@amazon.co.uk>
Date: Sat, 6 Jan 2018 11:49:23 +0000
Subject: [PATCH] x86/cpufeatures: Add X86_BUG_SPECTRE_V[12]
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5754
Add the bug bits for spectre v1/2 and force them unconditionally for all
cpus.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kees Cook <keescook@google.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1515239374-23361-2-git-send-email-dwmw@amazon.co.uk
(cherry picked from commit 99c6fa2511d8a683e61468be91b83f85452115fa)
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit de861dbf4587b9dac9a1978e6349199755e8c1b1)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/include/asm/cpufeatures.h | 2 ++
arch/x86/kernel/cpu/common.c | 3 +++
2 files changed, 5 insertions(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index b7900d26066c..3928050b51b0 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -341,5 +341,7 @@
#define X86_BUG_MONITOR X86_BUG(12) /* IPI required to wake up remote CPU */
#define X86_BUG_AMD_E400 X86_BUG(13) /* CPU is among the affected by Erratum 400 */
#define X86_BUG_CPU_MELTDOWN X86_BUG(14) /* CPU is affected by meltdown attack and needs kernel page table isolation */
+#define X86_BUG_SPECTRE_V1 X86_BUG(15) /* CPU is affected by Spectre variant 1 attack with conditional branches */
+#define X86_BUG_SPECTRE_V2 X86_BUG(16) /* CPU is affected by Spectre variant 2 attack with indirect branches */
#endif /* _ASM_X86_CPUFEATURES_H */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 142ab555dafa..01abbf69d522 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -902,6 +902,9 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
if (c->x86_vendor != X86_VENDOR_AMD)
setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN);
+ setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
+ setup_force_cpu_bug(X86_BUG_SPECTRE_V2);
+
fpu__init_system(c);
}
--
2.14.2

View file

@ -0,0 +1,58 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Dave Hansen <dave.hansen@linux.intel.com>
Date: Sat, 6 Jan 2018 18:41:14 +0100
Subject: [PATCH] x86/tboot: Unbreak tboot with PTI enabled
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5754
This is another case similar to what EFI does: create a new set of
page tables, map some code at a low address, and jump to it. PTI
mistakes this low address for userspace and mistakenly marks it
non-executable in an effort to make it unusable for userspace.
Undo the poison to allow execution.
Fixes: 385ce0ea4c07 ("x86/mm/pti: Add Kconfig")
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jeff Law <law@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: David" <dwmw@amazon.co.uk>
Cc: Nick Clifton <nickc@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180108102805.GK25546@redhat.com
(cherry picked from commit 262b6b30087246abf09d6275eb0c0dc421bcbe38)
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit f03e9108405491791f0b883a2d95e2620ddfce64)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/kernel/tboot.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index a4eb27918ceb..75869a4b6c41 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -127,6 +127,7 @@ static int map_tboot_page(unsigned long vaddr, unsigned long pfn,
p4d = p4d_alloc(&tboot_mm, pgd, vaddr);
if (!p4d)
return -1;
+ pgd->pgd &= ~_PAGE_NX;
pud = pud_alloc(&tboot_mm, p4d, vaddr);
if (!pud)
return -1;
--
2.14.2

View file

@ -0,0 +1,151 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Jike Song <albcamus@gmail.com>
Date: Tue, 9 Jan 2018 00:03:41 +0800
Subject: [PATCH] x86/mm/pti: Remove dead logic in pti_user_pagetable_walk*()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5754
The following code contains dead logic:
162 if (pgd_none(*pgd)) {
163 unsigned long new_p4d_page = __get_free_page(gfp);
164 if (!new_p4d_page)
165 return NULL;
166
167 if (pgd_none(*pgd)) {
168 set_pgd(pgd, __pgd(_KERNPG_TABLE | __pa(new_p4d_page)));
169 new_p4d_page = 0;
170 }
171 if (new_p4d_page)
172 free_page(new_p4d_page);
173 }
There can't be any difference between two pgd_none(*pgd) at L162 and L167,
so it's always false at L171.
Dave Hansen explained:
Yes, the double-test was part of an optimization where we attempted to
avoid using a global spinlock in the fork() path. We would check for
unallocated mid-level page tables without the lock. The lock was only
taken when we needed to *make* an entry to avoid collisions.
Now that it is all single-threaded, there is no chance of a collision,
no need for a lock, and no need for the re-check.
As all these functions are only called during init, mark them __init as
well.
Fixes: 03f4424f348e ("x86/mm/pti: Add functions to clone kernel PMDs")
Signed-off-by: Jike Song <albcamus@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Jiri Koshina <jikos@kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Kees Cook <keescook@google.com>
Cc: Andi Lutomirski <luto@amacapital.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg KH <gregkh@linux-foundation.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Paul Turner <pjt@google.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180108160341.3461-1-albcamus@gmail.com
(cherry picked from commit 8d56eff266f3e41a6c39926269c4c3f58f881a8e)
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit de8ab6bea570e70d1478af2c1667714bc900ae70)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/mm/pti.c | 32 ++++++--------------------------
1 file changed, 6 insertions(+), 26 deletions(-)
diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c
index 43d4a4a29037..ce38f165489b 100644
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -149,7 +149,7 @@ pgd_t __pti_set_user_pgd(pgd_t *pgdp, pgd_t pgd)
*
* Returns a pointer to a P4D on success, or NULL on failure.
*/
-static p4d_t *pti_user_pagetable_walk_p4d(unsigned long address)
+static __init p4d_t *pti_user_pagetable_walk_p4d(unsigned long address)
{
pgd_t *pgd = kernel_to_user_pgdp(pgd_offset_k(address));
gfp_t gfp = (GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO);
@@ -164,12 +164,7 @@ static p4d_t *pti_user_pagetable_walk_p4d(unsigned long address)
if (!new_p4d_page)
return NULL;
- if (pgd_none(*pgd)) {
- set_pgd(pgd, __pgd(_KERNPG_TABLE | __pa(new_p4d_page)));
- new_p4d_page = 0;
- }
- if (new_p4d_page)
- free_page(new_p4d_page);
+ set_pgd(pgd, __pgd(_KERNPG_TABLE | __pa(new_p4d_page)));
}
BUILD_BUG_ON(pgd_large(*pgd) != 0);
@@ -182,7 +177,7 @@ static p4d_t *pti_user_pagetable_walk_p4d(unsigned long address)
*
* Returns a pointer to a PMD on success, or NULL on failure.
*/
-static pmd_t *pti_user_pagetable_walk_pmd(unsigned long address)
+static __init pmd_t *pti_user_pagetable_walk_pmd(unsigned long address)
{
gfp_t gfp = (GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO);
p4d_t *p4d = pti_user_pagetable_walk_p4d(address);
@@ -194,12 +189,7 @@ static pmd_t *pti_user_pagetable_walk_pmd(unsigned long address)
if (!new_pud_page)
return NULL;
- if (p4d_none(*p4d)) {
- set_p4d(p4d, __p4d(_KERNPG_TABLE | __pa(new_pud_page)));
- new_pud_page = 0;
- }
- if (new_pud_page)
- free_page(new_pud_page);
+ set_p4d(p4d, __p4d(_KERNPG_TABLE | __pa(new_pud_page)));
}
pud = pud_offset(p4d, address);
@@ -213,12 +203,7 @@ static pmd_t *pti_user_pagetable_walk_pmd(unsigned long address)
if (!new_pmd_page)
return NULL;
- if (pud_none(*pud)) {
- set_pud(pud, __pud(_KERNPG_TABLE | __pa(new_pmd_page)));
- new_pmd_page = 0;
- }
- if (new_pmd_page)
- free_page(new_pmd_page);
+ set_pud(pud, __pud(_KERNPG_TABLE | __pa(new_pmd_page)));
}
return pmd_offset(pud, address);
@@ -251,12 +236,7 @@ static __init pte_t *pti_user_pagetable_walk_pte(unsigned long address)
if (!new_pte_page)
return NULL;
- if (pmd_none(*pmd)) {
- set_pmd(pmd, __pmd(_KERNPG_TABLE | __pa(new_pte_page)));
- new_pte_page = 0;
- }
- if (new_pte_page)
- free_page(new_pte_page);
+ set_pmd(pmd, __pmd(_KERNPG_TABLE | __pa(new_pte_page)));
}
pte = pte_offset_kernel(pmd, address);
--
2.14.2

View file

@ -0,0 +1,77 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tom Lendacky <thomas.lendacky@amd.com>
Date: Mon, 8 Jan 2018 16:09:21 -0600
Subject: [PATCH] x86/cpu/AMD: Make LFENCE a serializing instruction
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5754
To aid in speculation control, make LFENCE a serializing instruction
since it has less overhead than MFENCE. This is done by setting bit 1
of MSR 0xc0011029 (DE_CFG). Some families that support LFENCE do not
have this MSR. For these families, the LFENCE instruction is already
serializing.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/20180108220921.12580.71694.stgit@tlendack-t1.amdoffice.net
(cherry picked from commit e4d0e84e490790798691aaa0f2e598637f1867ec)
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit bde943193168fe9a3814badaa0cae3422029dce5)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/include/asm/msr-index.h | 2 ++
arch/x86/kernel/cpu/amd.c | 10 ++++++++++
2 files changed, 12 insertions(+)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 5573c75f8e4c..25147df4acfc 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -351,6 +351,8 @@
#define FAM10H_MMIO_CONF_BASE_MASK 0xfffffffULL
#define FAM10H_MMIO_CONF_BASE_SHIFT 20
#define MSR_FAM10H_NODE_ID 0xc001100c
+#define MSR_F10H_DECFG 0xc0011029
+#define MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT 1
/* K8 MSRs */
#define MSR_K8_TOP_MEM1 0xc001001a
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 2a5328cc03a6..c9a4e4db7860 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -785,6 +785,16 @@ static void init_amd(struct cpuinfo_x86 *c)
set_cpu_cap(c, X86_FEATURE_K8);
if (cpu_has(c, X86_FEATURE_XMM2)) {
+ /*
+ * A serializing LFENCE has less overhead than MFENCE, so
+ * use it for execution serialization. On families which
+ * don't have that MSR, LFENCE is already serializing.
+ * msr_set_bit() uses the safe accessors, too, even if the MSR
+ * is not present.
+ */
+ msr_set_bit(MSR_F10H_DECFG,
+ MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT);
+
/* MFENCE stops RDTSC speculation */
set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC);
}
--
2.14.2

View file

@ -0,0 +1,92 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tom Lendacky <thomas.lendacky@amd.com>
Date: Mon, 8 Jan 2018 16:09:32 -0600
Subject: [PATCH] x86/cpu/AMD: Use LFENCE_RDTSC in preference to MFENCE_RDTSC
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5754
With LFENCE now a serializing instruction, use LFENCE_RDTSC in preference
to MFENCE_RDTSC. However, since the kernel could be running under a
hypervisor that does not support writing that MSR, read the MSR back and
verify that the bit has been set successfully. If the MSR can be read
and the bit is set, then set the LFENCE_RDTSC feature, otherwise set the
MFENCE_RDTSC feature.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/20180108220932.12580.52458.stgit@tlendack-t1.amdoffice.net
(cherry picked from commit 9c6a73c75864ad9fa49e5fa6513e4c4071c0e29f)
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit dc39f26bf11d270cb4cfd251919afb16d98d6c2b)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/kernel/cpu/amd.c | 18 ++++++++++++++++--
2 files changed, 17 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 25147df4acfc..db88b7f852b4 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -353,6 +353,7 @@
#define MSR_FAM10H_NODE_ID 0xc001100c
#define MSR_F10H_DECFG 0xc0011029
#define MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT 1
+#define MSR_F10H_DECFG_LFENCE_SERIALIZE BIT_ULL(MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT)
/* K8 MSRs */
#define MSR_K8_TOP_MEM1 0xc001001a
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index c9a4e4db7860..99eef4a09fd9 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -785,6 +785,9 @@ static void init_amd(struct cpuinfo_x86 *c)
set_cpu_cap(c, X86_FEATURE_K8);
if (cpu_has(c, X86_FEATURE_XMM2)) {
+ unsigned long long val;
+ int ret;
+
/*
* A serializing LFENCE has less overhead than MFENCE, so
* use it for execution serialization. On families which
@@ -795,8 +798,19 @@ static void init_amd(struct cpuinfo_x86 *c)
msr_set_bit(MSR_F10H_DECFG,
MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT);
- /* MFENCE stops RDTSC speculation */
- set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC);
+ /*
+ * Verify that the MSR write was successful (could be running
+ * under a hypervisor) and only then assume that LFENCE is
+ * serializing.
+ */
+ ret = rdmsrl_safe(MSR_F10H_DECFG, &val);
+ if (!ret && (val & MSR_F10H_DECFG_LFENCE_SERIALIZE)) {
+ /* A serializing LFENCE stops RDTSC speculation */
+ set_cpu_cap(c, X86_FEATURE_LFENCE_RDTSC);
+ } else {
+ /* MFENCE stops RDTSC speculation */
+ set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC);
+ }
}
/*
--
2.14.2

View file

@ -0,0 +1,63 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Borislav Petkov <bp@suse.de>
Date: Wed, 10 Jan 2018 12:28:16 +0100
Subject: [PATCH] x86/alternatives: Fix optimize_nops() checking
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5754
The alternatives code checks only the first byte whether it is a NOP, but
with NOPs in front of the payload and having actual instructions after it
breaks the "optimized' test.
Make sure to scan all bytes before deciding to optimize the NOPs in there.
Reported-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Andrew Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: Paul Turner <pjt@google.com>
Link: https://lkml.kernel.org/r/20180110112815.mgciyf5acwacphkq@pd.tnic
(cherry picked from commit 612e8e9350fd19cae6900cf36ea0c6892d1a0dca)
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit dc241f68557ee1929a92b9ec6f7a1294bbbd4f00)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/kernel/alternative.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 32e14d137416..5dc05755a044 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -344,9 +344,12 @@ recompute_jump(struct alt_instr *a, u8 *orig_insn, u8 *repl_insn, u8 *insnbuf)
static void __init_or_module noinline optimize_nops(struct alt_instr *a, u8 *instr)
{
unsigned long flags;
+ int i;
- if (instr[0] != 0x90)
- return;
+ for (i = 0; i < a->padlen; i++) {
+ if (instr[i] != 0x90)
+ return;
+ }
local_irq_save(flags);
add_nops(instr + (a->instrlen - a->padlen), a->padlen);
--
2.14.2

View file

@ -0,0 +1,83 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Dave Hansen <dave.hansen@linux.intel.com>
Date: Wed, 10 Jan 2018 14:49:39 -0800
Subject: [PATCH] x86/pti: Make unpoison of pgd for trusted boot work for real
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5754
The inital fix for trusted boot and PTI potentially misses the pgd clearing
if pud_alloc() sets a PGD. It probably works in *practice* because for two
adjacent calls to map_tboot_page() that share a PGD entry, the first will
clear NX, *then* allocate and set the PGD (without NX clear). The second
call will *not* allocate but will clear the NX bit.
Defer the NX clearing to a point after it is known that all top-level
allocations have occurred. Add a comment to clarify why.
[ tglx: Massaged changelog ]
Fixes: 262b6b30087 ("x86/tboot: Unbreak tboot with PTI enabled")
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: "Tim Chen" <tim.c.chen@linux.intel.com>
Cc: gnomes@lxorguk.ukuu.org.uk
Cc: peterz@infradead.org
Cc: ning.sun@intel.com
Cc: tboot-devel@lists.sourceforge.net
Cc: andi@firstfloor.org
Cc: luto@kernel.org
Cc: law@redhat.com
Cc: pbonzini@redhat.com
Cc: torvalds@linux-foundation.org
Cc: gregkh@linux-foundation.org
Cc: dwmw@amazon.co.uk
Cc: nickc@redhat.com
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180110224939.2695CD47@viggo.jf.intel.com
(cherry picked from commit 8a931d1e24bacf01f00a35d43bfe7917256c5c49)
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 9935124a5c771c004a578423275633232fb7a006)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/kernel/tboot.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index 75869a4b6c41..a2486f444073 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -127,7 +127,6 @@ static int map_tboot_page(unsigned long vaddr, unsigned long pfn,
p4d = p4d_alloc(&tboot_mm, pgd, vaddr);
if (!p4d)
return -1;
- pgd->pgd &= ~_PAGE_NX;
pud = pud_alloc(&tboot_mm, p4d, vaddr);
if (!pud)
return -1;
@@ -139,6 +138,17 @@ static int map_tboot_page(unsigned long vaddr, unsigned long pfn,
return -1;
set_pte_at(&tboot_mm, vaddr, pte, pfn_pte(pfn, prot));
pte_unmap(pte);
+
+ /*
+ * PTI poisons low addresses in the kernel page tables in the
+ * name of making them unusable for userspace. To execute
+ * code at such a low address, the poison must be cleared.
+ *
+ * Note: 'pgd' actually gets set in p4d_alloc() _or_
+ * pud_alloc() depending on 4/5-level paging.
+ */
+ pgd->pgd &= ~_PAGE_NX;
+
return 0;
}
--
2.14.2

View file

@ -0,0 +1,62 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Elena Reshetova <elena.reshetova@intel.com>
Date: Mon, 4 Sep 2017 13:11:43 +0300
Subject: [PATCH] locking/barriers: introduce new memory barrier gmb()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
In constrast to existing mb() and rmb() barriers,
gmb() barrier is arch-independent and can be used to
implement any type of memory barrier.
In x86 case, it is either lfence or mfence, based on
processor type. ARM and others can define it according
to their needs.
Suggested-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 6834bd7e6159da957a6c01deebf16132a694bc23)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/include/asm/barrier.h | 3 +++
include/asm-generic/barrier.h | 4 ++++
2 files changed, 7 insertions(+)
diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index bfb28caf97b1..aae78054cae2 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -23,6 +23,9 @@
#define wmb() asm volatile("sfence" ::: "memory")
#endif
+#define gmb() alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC, \
+ "lfence", X86_FEATURE_LFENCE_RDTSC);
+
#ifdef CONFIG_X86_PPRO_FENCE
#define dma_rmb() rmb()
#else
diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index fe297b599b0a..0ee1345c9222 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -42,6 +42,10 @@
#define wmb() mb()
#endif
+#ifndef gmb
+#define gmb() do { } while (0)
+#endif
+
#ifndef dma_rmb
#define dma_rmb() rmb()
#endif
--
2.14.2

View file

@ -0,0 +1,60 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Elena Reshetova <elena.reshetova@intel.com>
Date: Mon, 4 Sep 2017 13:11:44 +0300
Subject: [PATCH] bpf: prevent speculative execution in eBPF interpreter
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
This adds a generic memory barrier before LD_IMM_DW and
LDX_MEM_B/H/W/DW eBPF instructions during eBPF program
execution in order to prevent speculative execution on out
of bound BFP_MAP array indexes. This way an arbitary kernel
memory is not exposed through side channel attacks.
For more details, please see this Google Project Zero report: tbd
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit dd13f73106c260dea7a689d33d1457639af820aa)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
kernel/bpf/core.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 9a1bed1f3029..3f83c60e3e86 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -33,6 +33,7 @@
#include <linux/rcupdate.h>
#include <asm/unaligned.h>
+#include <asm/barrier.h>
/* Registers */
#define BPF_R0 regs[BPF_REG_0]
@@ -920,6 +921,7 @@ static unsigned int ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn,
DST = IMM;
CONT;
LD_IMM_DW:
+ gmb();
DST = (u64) (u32) insn[0].imm | ((u64) (u32) insn[1].imm) << 32;
insn++;
CONT;
@@ -1133,6 +1135,7 @@ static unsigned int ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn,
*(SIZE *)(unsigned long) (DST + insn->off) = IMM; \
CONT; \
LDX_MEM_##SIZEOP: \
+ gmb(); \
DST = *(SIZE *)(unsigned long) (SRC + insn->off); \
CONT;
--
2.14.2

View file

@ -0,0 +1,93 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Elena Reshetova <elena.reshetova@intel.com>
Date: Mon, 4 Sep 2017 13:11:45 +0300
Subject: [PATCH] x86, bpf, jit: prevent speculative execution when JIT is
enabled
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
When constant blinding is enabled (bpf_jit_harden = 1), this adds
a generic memory barrier (lfence for intel, mfence for AMD) before
emitting x86 jitted code for the BPF_ALU(64)_OR_X and BPF_ALU_LHS_X
(for BPF_REG_AX register) eBPF instructions. This is needed in order
to prevent speculative execution on out of bounds BPF_MAP array
indexes when JIT is enabled. This way an arbitary kernel memory is
not exposed through side-channel attacks.
For more details, please see this Google Project Zero report: tbd
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit cf9676859a05d0d784067072e8121e63888bacc7)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/net/bpf_jit_comp.c | 33 ++++++++++++++++++++++++++++++++-
1 file changed, 32 insertions(+), 1 deletion(-)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 4d50ced94686..879dbfefb66d 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -107,6 +107,27 @@ static void bpf_flush_icache(void *start, void *end)
set_fs(old_fs);
}
+static void emit_memory_barrier(u8 **pprog)
+{
+ u8 *prog = *pprog;
+ int cnt = 0;
+
+ if (bpf_jit_blinding_enabled()) {
+ if (boot_cpu_has(X86_FEATURE_LFENCE_RDTSC))
+ /* x86 LFENCE opcode 0F AE E8 */
+ EMIT3(0x0f, 0xae, 0xe8);
+ else if (boot_cpu_has(X86_FEATURE_MFENCE_RDTSC))
+ /* AMD MFENCE opcode 0F AE F0 */
+ EMIT3(0x0f, 0xae, 0xf0);
+ else
+ /* we should never end up here,
+ * but if we do, better not to emit anything*/
+ return;
+ }
+ *pprog = prog;
+ return;
+}
+
#define CHOOSE_LOAD_FUNC(K, func) \
((int)K < 0 ? ((int)K >= SKF_LL_OFF ? func##_negative_offset : func) : func##_positive_offset)
@@ -399,7 +420,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
case BPF_ADD: b2 = 0x01; break;
case BPF_SUB: b2 = 0x29; break;
case BPF_AND: b2 = 0x21; break;
- case BPF_OR: b2 = 0x09; break;
+ case BPF_OR: b2 = 0x09; emit_memory_barrier(&prog); break;
case BPF_XOR: b2 = 0x31; break;
}
if (BPF_CLASS(insn->code) == BPF_ALU64)
@@ -646,6 +667,16 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
case BPF_ALU64 | BPF_RSH | BPF_X:
case BPF_ALU64 | BPF_ARSH | BPF_X:
+ /* If blinding is enabled, each
+ * BPF_LD | BPF_IMM | BPF_DW instruction
+ * is converted to 4 eBPF instructions with
+ * BPF_ALU64_IMM(BPF_LSH, BPF_REG_AX, 32)
+ * always present(number 3). Detect such cases
+ * and insert memory barriers. */
+ if ((BPF_CLASS(insn->code) == BPF_ALU64)
+ && (BPF_OP(insn->code) == BPF_LSH)
+ && (src_reg == BPF_REG_AX))
+ emit_memory_barrier(&prog);
/* check for bad case when dst_reg == rcx */
if (dst_reg == BPF_REG_4) {
/* mov r11, dst_reg */
--
2.14.2

View file

@ -0,0 +1,38 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Elena Reshetova <elena.reshetova@intel.com>
Date: Mon, 4 Sep 2017 13:11:46 +0300
Subject: [PATCH] uvcvideo: prevent speculative execution
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
real commit text tbd
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 65d4588b16395360695525add0ca79fa6ba04fa5)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
drivers/media/usb/uvc/uvc_v4l2.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/media/usb/uvc/uvc_v4l2.c b/drivers/media/usb/uvc/uvc_v4l2.c
index 3e7e283a44a8..fcedd1798e9d 100644
--- a/drivers/media/usb/uvc/uvc_v4l2.c
+++ b/drivers/media/usb/uvc/uvc_v4l2.c
@@ -821,6 +821,7 @@ static int uvc_ioctl_enum_input(struct file *file, void *fh,
}
pin = iterm->id;
} else if (index < selector->bNrInPins) {
+ gmb();
pin = selector->baSourceID[index];
list_for_each_entry(iterm, &chain->entities, chain) {
if (!UVC_ENTITY_IS_ITERM(iterm))
--
2.14.2

View file

@ -0,0 +1,38 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Elena Reshetova <elena.reshetova@intel.com>
Date: Mon, 4 Sep 2017 13:11:47 +0300
Subject: [PATCH] carl9170: prevent speculative execution
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Real commit text tbd
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit dc218eba4fe8241ab073be41a068f6796450c6d0)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
drivers/net/wireless/ath/carl9170/main.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/wireless/ath/carl9170/main.c b/drivers/net/wireless/ath/carl9170/main.c
index 988c8857d78c..7e2c1c870a1d 100644
--- a/drivers/net/wireless/ath/carl9170/main.c
+++ b/drivers/net/wireless/ath/carl9170/main.c
@@ -1388,6 +1388,7 @@ static int carl9170_op_conf_tx(struct ieee80211_hw *hw,
mutex_lock(&ar->mutex);
if (queue < ar->hw->queues) {
+ gmb();
memcpy(&ar->edcf[ar9170_qmap[queue]], param, sizeof(*param));
ret = carl9170_set_qos(ar);
} else {
--
2.14.2

View file

@ -0,0 +1,38 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Elena Reshetova <elena.reshetova@intel.com>
Date: Mon, 4 Sep 2017 13:11:48 +0300
Subject: [PATCH] p54: prevent speculative execution
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Real commit text tbd
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 57b537e161bb9d44475a05b2b12d64bfb50319d3)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
drivers/net/wireless/intersil/p54/main.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/wireless/intersil/p54/main.c b/drivers/net/wireless/intersil/p54/main.c
index d5a3bf91a03e..7e6af1f67960 100644
--- a/drivers/net/wireless/intersil/p54/main.c
+++ b/drivers/net/wireless/intersil/p54/main.c
@@ -415,6 +415,7 @@ static int p54_conf_tx(struct ieee80211_hw *dev,
mutex_lock(&priv->conf_mutex);
if (queue < dev->queues) {
+ gmb();
P54_SET_QUEUE(priv->qos_params[queue], params->aifs,
params->cw_min, params->cw_max, params->txop);
ret = p54_set_edcf(priv);
--
2.14.2

View file

@ -0,0 +1,60 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Elena Reshetova <elena.reshetova@intel.com>
Date: Mon, 4 Sep 2017 13:11:49 +0300
Subject: [PATCH] qla2xxx: prevent speculative execution
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Real commit text tbd
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit d71318e5f16371dbc0e89a786336a521551f8946)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
drivers/scsi/qla2xxx/qla_mr.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_mr.c b/drivers/scsi/qla2xxx/qla_mr.c
index 10b742d27e16..ca923d8803f9 100644
--- a/drivers/scsi/qla2xxx/qla_mr.c
+++ b/drivers/scsi/qla2xxx/qla_mr.c
@@ -2304,10 +2304,12 @@ qlafx00_status_entry(scsi_qla_host_t *vha, struct rsp_que *rsp, void *pkt)
req = ha->req_q_map[que];
/* Validate handle. */
- if (handle < req->num_outstanding_cmds)
+ if (handle < req->num_outstanding_cmds) {
+ gmb();
sp = req->outstanding_cmds[handle];
- else
+ } else {
sp = NULL;
+ }
if (sp == NULL) {
ql_dbg(ql_dbg_io, vha, 0x3034,
@@ -2655,10 +2657,12 @@ qlafx00_multistatus_entry(struct scsi_qla_host *vha,
req = ha->req_q_map[que];
/* Validate handle. */
- if (handle < req->num_outstanding_cmds)
+ if (handle < req->num_outstanding_cmds) {
+ gmb();
sp = req->outstanding_cmds[handle];
- else
+ } else {
sp = NULL;
+ }
if (sp == NULL) {
ql_dbg(ql_dbg_io, vha, 0x3044,
--
2.14.2

View file

@ -0,0 +1,38 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Elena Reshetova <elena.reshetova@intel.com>
Date: Mon, 4 Sep 2017 13:11:50 +0300
Subject: [PATCH] cw1200: prevent speculative execution
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Real commit text tbd
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 30770297508b781f2c1e82c52f793bc4d2cb2356)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
drivers/net/wireless/st/cw1200/sta.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/net/wireless/st/cw1200/sta.c b/drivers/net/wireless/st/cw1200/sta.c
index a52224836a2b..bbff06a4263e 100644
--- a/drivers/net/wireless/st/cw1200/sta.c
+++ b/drivers/net/wireless/st/cw1200/sta.c
@@ -619,6 +619,7 @@ int cw1200_conf_tx(struct ieee80211_hw *dev, struct ieee80211_vif *vif,
mutex_lock(&priv->conf_mutex);
if (queue < dev->queues) {
+ gmb();
old_uapsd_flags = le16_to_cpu(priv->uapsd_info.uapsd_flags);
WSM_TX_QUEUE_SET(&priv->tx_queue_params, queue, 0, 0, 0);
--
2.14.2

View file

@ -0,0 +1,52 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Elena Reshetova <elena.reshetova@intel.com>
Date: Mon, 4 Sep 2017 13:11:51 +0300
Subject: [PATCH] Thermal/int340x: prevent speculative execution
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Real commit text tbd
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 3904f4cadeeaa9370f0635eb2f66194ca238325b)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
drivers/thermal/int340x_thermal/int340x_thermal_zone.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/thermal/int340x_thermal/int340x_thermal_zone.c b/drivers/thermal/int340x_thermal/int340x_thermal_zone.c
index 145a5c53ff5c..4f9917ef3c11 100644
--- a/drivers/thermal/int340x_thermal/int340x_thermal_zone.c
+++ b/drivers/thermal/int340x_thermal/int340x_thermal_zone.c
@@ -57,15 +57,16 @@ static int int340x_thermal_get_trip_temp(struct thermal_zone_device *zone,
if (d->override_ops && d->override_ops->get_trip_temp)
return d->override_ops->get_trip_temp(zone, trip, temp);
- if (trip < d->aux_trip_nr)
+ if (trip < d->aux_trip_nr) {
+ gmb();
*temp = d->aux_trips[trip];
- else if (trip == d->crt_trip_id)
+ } else if (trip == d->crt_trip_id) {
*temp = d->crt_temp;
- else if (trip == d->psv_trip_id)
+ } else if (trip == d->psv_trip_id) {
*temp = d->psv_temp;
- else if (trip == d->hot_trip_id)
+ } else if (trip == d->hot_trip_id) {
*temp = d->hot_temp;
- else {
+ } else {
for (i = 0; i < INT340X_THERMAL_MAX_ACT_TRIP_COUNT; i++) {
if (d->act_trips[i].valid &&
d->act_trips[i].id == trip) {
--
2.14.2

View file

@ -0,0 +1,42 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Elena Reshetova <elena.reshetova@intel.com>
Date: Mon, 4 Sep 2017 13:11:52 +0300
Subject: [PATCH] userns: prevent speculative execution
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Real commit text tbd
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 1410678db6238e625775f7108c68a9e5b8d439a1)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
kernel/user_namespace.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 4eacf186f5bc..684cc69d431c 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -549,8 +549,10 @@ static void *m_start(struct seq_file *seq, loff_t *ppos,
struct uid_gid_extent *extent = NULL;
loff_t pos = *ppos;
- if (pos < map->nr_extents)
+ if (pos < map->nr_extents) {
+ gmb();
extent = &map->extent[pos];
+ }
return extent;
}
--
2.14.2

View file

@ -0,0 +1,38 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Elena Reshetova <elena.reshetova@intel.com>
Date: Mon, 4 Sep 2017 13:11:53 +0300
Subject: [PATCH] ipv6: prevent speculative execution
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Real commit text tbd
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit fdb98114a31aa5c0083bd7cd5b42ea569b6f77dc)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
net/ipv6/raw.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 60be012fe708..1a0eae661512 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -726,6 +726,7 @@ static int raw6_getfrag(void *from, char *to, int offset, int len, int odd,
if (offset < rfv->hlen) {
int copy = min(rfv->hlen - offset, len);
+ gmb();
if (skb->ip_summed == CHECKSUM_PARTIAL)
memcpy(to, rfv->c + offset, copy);
else
--
2.14.2

View file

@ -0,0 +1,42 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Elena Reshetova <elena.reshetova@intel.com>
Date: Mon, 4 Sep 2017 13:11:54 +0300
Subject: [PATCH] fs: prevent speculative execution
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Real commit text tbd
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 1ca9e14b253a501f055c3ea29d992c028473676e)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
include/linux/fdtable.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h
index 6e84b2cae6ad..09b124542bb8 100644
--- a/include/linux/fdtable.h
+++ b/include/linux/fdtable.h
@@ -81,8 +81,10 @@ static inline struct file *__fcheck_files(struct files_struct *files, unsigned i
{
struct fdtable *fdt = rcu_dereference_raw(files->fdt);
- if (fd < fdt->max_fds)
+ if (fd < fdt->max_fds) {
+ gmb();
return rcu_dereference_raw(fdt->fd[fd]);
+ }
return NULL;
}
--
2.14.2

View file

@ -0,0 +1,39 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Elena Reshetova <elena.reshetova@intel.com>
Date: Mon, 4 Sep 2017 13:11:55 +0300
Subject: [PATCH] net: mpls: prevent speculative execution
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Real commit text tbd
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 98a9550398f87c5430d5e893104e21caa1e2e8d3)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
net/mpls/af_mpls.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
index ea4f481839dd..08dfb99e19f2 100644
--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -50,6 +50,8 @@ static struct mpls_route *mpls_route_input_rcu(struct net *net, unsigned index)
if (index < net->mpls.platform_labels) {
struct mpls_route __rcu **platform_label =
rcu_dereference(net->mpls.platform_label);
+
+ gmb();
rt = rcu_dereference(platform_label[index]);
}
return rt;
--
2.14.2

View file

@ -0,0 +1,57 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Elena Reshetova <elena.reshetova@intel.com>
Date: Mon, 4 Sep 2017 13:11:56 +0300
Subject: [PATCH] udf: prevent speculative execution
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Real commit text tbd
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit f575840dd363aa80a14faacddf90b95db1185e2c)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
fs/udf/misc.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/fs/udf/misc.c b/fs/udf/misc.c
index 3949c4bec3a3..4bd10b2e8540 100644
--- a/fs/udf/misc.c
+++ b/fs/udf/misc.c
@@ -104,6 +104,8 @@ struct genericFormat *udf_add_extendedattr(struct inode *inode, uint32_t size,
iinfo->i_lenEAttr) {
uint32_t aal =
le32_to_cpu(eahd->appAttrLocation);
+
+ gmb();
memmove(&ea[offset - aal + size],
&ea[aal], offset - aal);
offset -= aal;
@@ -114,6 +116,8 @@ struct genericFormat *udf_add_extendedattr(struct inode *inode, uint32_t size,
iinfo->i_lenEAttr) {
uint32_t ial =
le32_to_cpu(eahd->impAttrLocation);
+
+ gmb();
memmove(&ea[offset - ial + size],
&ea[ial], offset - ial);
offset -= ial;
@@ -125,6 +129,8 @@ struct genericFormat *udf_add_extendedattr(struct inode *inode, uint32_t size,
iinfo->i_lenEAttr) {
uint32_t aal =
le32_to_cpu(eahd->appAttrLocation);
+
+ gmb();
memmove(&ea[offset - aal + size],
&ea[aal], offset - aal);
offset -= aal;
--
2.14.2

View file

@ -0,0 +1,77 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tim Chen <tim.c.chen@linux.intel.com>
Date: Thu, 24 Aug 2017 09:34:41 -0700
Subject: [PATCH] x86/feature: Enable the x86 feature to control Speculation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
cpuid ax=0x7, return rdx bit 26 to indicate presence of this feature
IA32_SPEC_CTRL (0x48) and IA32_PRED_CMD (0x49)
IA32_SPEC_CTRL, bit0 Indirect Branch Restricted Speculation (IBRS)
IA32_PRED_CMD, bit0 Indirect Branch Prediction Barrier (IBPB)
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit f1f160a92b70c25d6e6e76788463bbec86a73313)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/msr-index.h | 5 +++++
arch/x86/kernel/cpu/scattered.c | 1 +
3 files changed, 7 insertions(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 3928050b51b0..44be8fd069bf 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -208,6 +208,7 @@
#define X86_FEATURE_AVX512_4FMAPS ( 7*32+17) /* AVX-512 Multiply Accumulation Single precision */
#define X86_FEATURE_MBA ( 7*32+18) /* Memory Bandwidth Allocation */
+#define X86_FEATURE_SPEC_CTRL ( 7*32+19) /* Control Speculation Control */
/* Virtualization flags: Linux defined, word 8 */
#define X86_FEATURE_TPR_SHADOW ( 8*32+ 0) /* Intel TPR Shadow */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index db88b7f852b4..4e3438a00a50 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -41,6 +41,9 @@
#define MSR_PPIN_CTL 0x0000004e
#define MSR_PPIN 0x0000004f
+#define MSR_IA32_SPEC_CTRL 0x00000048
+#define MSR_IA32_PRED_CMD 0x00000049
+
#define MSR_IA32_PERFCTR0 0x000000c1
#define MSR_IA32_PERFCTR1 0x000000c2
#define MSR_FSB_FREQ 0x000000cd
@@ -437,6 +440,8 @@
#define FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX (1<<1)
#define FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX (1<<2)
#define FEATURE_CONTROL_LMCE (1<<20)
+#define FEATURE_ENABLE_IBRS (1<<0)
+#define FEATURE_SET_IBPB (1<<0)
#define MSR_IA32_APICBASE 0x0000001b
#define MSR_IA32_APICBASE_BSP (1<<8)
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 23c23508c012..9651ea395812 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -24,6 +24,7 @@ static const struct cpuid_bit cpuid_bits[] = {
{ X86_FEATURE_INTEL_PT, CPUID_EBX, 25, 0x00000007, 0 },
{ X86_FEATURE_AVX512_4VNNIW, CPUID_EDX, 2, 0x00000007, 0 },
{ X86_FEATURE_AVX512_4FMAPS, CPUID_EDX, 3, 0x00000007, 0 },
+ { X86_FEATURE_SPEC_CTRL, CPUID_EDX, 26, 0x00000007, 0 },
{ X86_FEATURE_CAT_L3, CPUID_EBX, 1, 0x00000010, 0 },
{ X86_FEATURE_CAT_L2, CPUID_EBX, 2, 0x00000010, 0 },
{ X86_FEATURE_CDP_L3, CPUID_ECX, 2, 0x00000010, 1 },
--
2.14.2

View file

@ -0,0 +1,41 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tim Chen <tim.c.chen@linux.intel.com>
Date: Wed, 27 Sep 2017 12:09:14 -0700
Subject: [PATCH] x86/feature: Report presence of IBPB and IBRS control
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Report presence of IBPB and IBRS.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit c41156d893e7f48bebf8d71cfddd39d8fb2724f8)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/kernel/cpu/intel.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index dfa90a3a5145..f1d94c73625a 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -627,6 +627,11 @@ static void init_intel(struct cpuinfo_x86 *c)
init_intel_energy_perf(c);
init_intel_misc_features(c);
+
+ if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ printk_once(KERN_INFO "FEATURE SPEC_CTRL Present\n");
+ else
+ printk_once(KERN_INFO "FEATURE SPEC_CTRL Not Present\n");
}
#ifdef CONFIG_X86_32
--
2.14.2

View file

@ -0,0 +1,84 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tim Chen <tim.c.chen@linux.intel.com>
Date: Fri, 15 Sep 2017 18:04:53 -0700
Subject: [PATCH] x86/enter: MACROS to set/clear IBRS and set IBPB
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Setup macros to control IBRS and IBPB
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 171d754fe3b783d361555cf2569e68a7b0e0d54a)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/include/asm/spec_ctrl.h | 52 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 52 insertions(+)
create mode 100644 arch/x86/include/asm/spec_ctrl.h
diff --git a/arch/x86/include/asm/spec_ctrl.h b/arch/x86/include/asm/spec_ctrl.h
new file mode 100644
index 000000000000..7f8bb09b6acb
--- /dev/null
+++ b/arch/x86/include/asm/spec_ctrl.h
@@ -0,0 +1,52 @@
+#ifndef _ASM_X86_SPEC_CTRL_H
+#define _ASM_X86_SPEC_CTRL_H
+
+#include <linux/stringify.h>
+#include <asm/msr-index.h>
+#include <asm/cpufeatures.h>
+#include <asm/alternative-asm.h>
+
+#ifdef __ASSEMBLY__
+
+#define __ASM_ENABLE_IBRS \
+ pushq %rax; \
+ pushq %rcx; \
+ pushq %rdx; \
+ movl $MSR_IA32_SPEC_CTRL, %ecx; \
+ movl $0, %edx; \
+ movl $FEATURE_ENABLE_IBRS, %eax; \
+ wrmsr; \
+ popq %rdx; \
+ popq %rcx; \
+ popq %rax
+#define __ASM_ENABLE_IBRS_CLOBBER \
+ movl $MSR_IA32_SPEC_CTRL, %ecx; \
+ movl $0, %edx; \
+ movl $FEATURE_ENABLE_IBRS, %eax; \
+ wrmsr;
+#define __ASM_DISABLE_IBRS \
+ pushq %rax; \
+ pushq %rcx; \
+ pushq %rdx; \
+ movl $MSR_IA32_SPEC_CTRL, %ecx; \
+ movl $0, %edx; \
+ movl $0, %eax; \
+ wrmsr; \
+ popq %rdx; \
+ popq %rcx; \
+ popq %rax
+
+.macro ENABLE_IBRS
+ALTERNATIVE "", __stringify(__ASM_ENABLE_IBRS), X86_FEATURE_SPEC_CTRL
+.endm
+
+.macro ENABLE_IBRS_CLOBBER
+ALTERNATIVE "", __stringify(__ASM_ENABLE_IBRS_CLOBBER), X86_FEATURE_SPEC_CTRL
+.endm
+
+.macro DISABLE_IBRS
+ALTERNATIVE "", __stringify(__ASM_DISABLE_IBRS), X86_FEATURE_SPEC_CTRL
+.endm
+
+#endif /* __ASSEMBLY__ */
+#endif /* _ASM_X86_SPEC_CTRL_H */
--
2.14.2

View file

@ -0,0 +1,171 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tim Chen <tim.c.chen@linux.intel.com>
Date: Fri, 13 Oct 2017 14:25:00 -0700
Subject: [PATCH] x86/enter: Use IBRS on syscall and interrupts
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Set IBRS upon kernel entrance via syscall and interrupts. Clear it upon exit.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit d7eb5f9ed26dbdc39df793491bdcc9f80d41325e)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/entry/entry_64.S | 18 +++++++++++++++++-
arch/x86/entry/entry_64_compat.S | 7 +++++++
2 files changed, 24 insertions(+), 1 deletion(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index b48f2c78a9bf..5f898c3c1dad 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -36,6 +36,7 @@
#include <asm/pgtable_types.h>
#include <asm/export.h>
#include <asm/frame.h>
+#include <asm/spec_ctrl.h>
#include <linux/err.h>
#include "calling.h"
@@ -235,6 +236,8 @@ GLOBAL(entry_SYSCALL_64_after_hwframe)
sub $(6*8), %rsp /* pt_regs->bp, bx, r12-15 not saved */
UNWIND_HINT_REGS extra=0
+ ENABLE_IBRS
+
/*
* If we need to do entry work or if we guess we'll need to do
* exit work, go straight to the slow path.
@@ -286,6 +289,7 @@ entry_SYSCALL_64_fastpath:
TRACE_IRQS_ON /* user mode is traced as IRQs on */
movq RIP(%rsp), %rcx
movq EFLAGS(%rsp), %r11
+ DISABLE_IBRS
addq $6*8, %rsp /* skip extra regs -- they were preserved */
UNWIND_HINT_EMPTY
jmp .Lpop_c_regs_except_rcx_r11_and_sysret
@@ -379,6 +383,8 @@ return_from_SYSCALL_64:
* perf profiles. Nothing jumps here.
*/
syscall_return_via_sysret:
+ DISABLE_IBRS
+
/* rcx and r11 are already restored (see code above) */
UNWIND_HINT_EMPTY
POP_EXTRA_REGS
@@ -660,6 +666,10 @@ END(irq_entries_start)
/*
* IRQ from user mode.
*
+ */
+ ENABLE_IBRS
+
+ /*
* We need to tell lockdep that IRQs are off. We can't do this until
* we fix gsbase, and we should do it before enter_from_user_mode
* (which can take locks). Since TRACE_IRQS_OFF idempotent,
@@ -743,7 +753,7 @@ GLOBAL(swapgs_restore_regs_and_return_to_usermode)
* We are on the trampoline stack. All regs except RDI are live.
* We can do future final exit work right here.
*/
-
+ DISABLE_IBRS
SWITCH_TO_USER_CR3_STACK scratch_reg=%rdi
/* Restore RDI. */
@@ -1277,6 +1287,7 @@ ENTRY(paranoid_entry)
1:
SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg=%rax save_reg=%r14
+ ENABLE_IBRS_CLOBBER
ret
END(paranoid_entry)
@@ -1331,6 +1342,8 @@ ENTRY(error_entry)
/* We have user CR3. Change to kernel CR3. */
SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
+ ENABLE_IBRS
+
.Lerror_entry_from_usermode_after_swapgs:
/* Put us onto the real thread stack. */
popq %r12 /* save return addr in %12 */
@@ -1377,6 +1390,7 @@ ENTRY(error_entry)
*/
SWAPGS
SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
+ ENABLE_IBRS_CLOBBER
jmp .Lerror_entry_done
.Lbstep_iret:
@@ -1391,6 +1405,7 @@ ENTRY(error_entry)
*/
SWAPGS
SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
+ ENABLE_IBRS_CLOBBER
/*
* Pretend that the exception came from user mode: set up pt_regs
@@ -1518,6 +1533,7 @@ ENTRY(nmi)
UNWIND_HINT_REGS
ENCODE_FRAME_POINTER
+ ENABLE_IBRS
/*
* At this point we no longer need to worry about stack damage
* due to nesting -- we're on the normal thread stack and we're
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index 2b5e7685823c..ee4f3edb3c50 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -13,6 +13,7 @@
#include <asm/irqflags.h>
#include <asm/asm.h>
#include <asm/smap.h>
+#include <asm/spec_ctrl.h>
#include <linux/linkage.h>
#include <linux/err.h>
@@ -95,6 +96,8 @@ ENTRY(entry_SYSENTER_compat)
pushq $0 /* pt_regs->r15 = 0 */
cld
+ ENABLE_IBRS
+
/*
* SYSENTER doesn't filter flags, so we need to clear NT and AC
* ourselves. To save a few cycles, we can check whether
@@ -194,6 +197,7 @@ ENTRY(entry_SYSCALL_compat)
/* Use %rsp as scratch reg. User ESP is stashed in r8 */
SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp
+ ENABLE_IBRS
/* Switch to the kernel stack */
movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
@@ -249,6 +253,7 @@ sysret32_from_system_call:
popq %rsi /* pt_regs->si */
popq %rdi /* pt_regs->di */
+ DISABLE_IBRS
/*
* USERGS_SYSRET32 does:
* GSBASE = user's GS base
@@ -348,6 +353,8 @@ ENTRY(entry_INT80_compat)
pushq %r15 /* pt_regs->r15 */
cld
+ ENABLE_IBRS
+
/*
* User mode is traced as though IRQs are on, and the interrupt
* gate turned them off.
--
2.14.2

View file

@ -0,0 +1,117 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tim Chen <tim.c.chen@linux.intel.com>
Date: Mon, 6 Nov 2017 18:19:14 -0800
Subject: [PATCH] x86/idle: Disable IBRS entering idle and enable it on wakeup
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Clear IBRS on idle entry and set it on idle exit into kernel on mwait.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 5521b04afda1d683c1ebad6c25c2529a88e6f061)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/include/asm/mwait.h | 8 ++++++++
arch/x86/kernel/process.c | 12 ++++++++++--
arch/x86/lib/delay.c | 10 ++++++++++
3 files changed, 28 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index bda3c27f0da0..f15120ada161 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -5,6 +5,8 @@
#include <linux/sched/idle.h>
#include <asm/cpufeature.h>
+#include <asm/spec_ctrl.h>
+#include <asm/microcode.h>
#define MWAIT_SUBSTATE_MASK 0xf
#define MWAIT_CSTATE_MASK 0xf
@@ -105,9 +107,15 @@ static inline void mwait_idle_with_hints(unsigned long eax, unsigned long ecx)
mb();
}
+ if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+
__monitor((void *)&current_thread_info()->flags, 0, 0);
if (!need_resched())
__mwait(eax, ecx);
+
+ if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, FEATURE_ENABLE_IBRS);
}
current_clr_polling();
}
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 07e6218ad7d9..3adb3806a284 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -447,11 +447,19 @@ static __cpuidle void mwait_idle(void)
mb(); /* quirk */
}
+ if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+
__monitor((void *)&current_thread_info()->flags, 0, 0);
- if (!need_resched())
+ if (!need_resched()) {
__sti_mwait(0, 0);
- else
+ if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, FEATURE_ENABLE_IBRS);
+ } else {
+ if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, FEATURE_ENABLE_IBRS);
local_irq_enable();
+ }
trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id());
} else {
local_irq_enable();
diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
index cf2ac227c2ac..b088463973e4 100644
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -26,6 +26,8 @@
# include <asm/smp.h>
#endif
+#define IBRS_DISABLE_THRESHOLD 1000
+
/* simple loop based delay: */
static void delay_loop(unsigned long loops)
{
@@ -105,6 +107,10 @@ static void delay_mwaitx(unsigned long __loops)
for (;;) {
delay = min_t(u64, MWAITX_MAX_LOOPS, loops);
+ if (boot_cpu_has(X86_FEATURE_SPEC_CTRL) &&
+ (delay > IBRS_DISABLE_THRESHOLD))
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+
/*
* Use cpu_tss_rw as a cacheline-aligned, seldomly
* accessed per-cpu variable as the monitor target.
@@ -118,6 +124,10 @@ static void delay_mwaitx(unsigned long __loops)
*/
__mwaitx(MWAITX_DISABLE_CSTATES, delay, MWAITX_ECX_TIMER_ENABLE);
+ if (boot_cpu_has(X86_FEATURE_SPEC_CTRL) &&
+ (delay > IBRS_DISABLE_THRESHOLD))
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, FEATURE_ENABLE_IBRS);
+
end = rdtsc_ordered();
if (loops <= end - start)
--
2.14.2

View file

@ -0,0 +1,54 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tim Chen <tim.c.chen@linux.intel.com>
Date: Wed, 15 Nov 2017 12:24:19 -0800
Subject: [PATCH] x86/idle: Disable IBRS when offlining cpu and re-enable on
wakeup
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Clear IBRS when cpu is offlined and set it when brining it back online.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 9bcf662c1690880b2464fe99d0f58dce53c0d89f)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/kernel/smpboot.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 398e8324fea4..a652bff7add4 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -77,6 +77,7 @@
#include <asm/i8259.h>
#include <asm/realmode.h>
#include <asm/misc.h>
+#include <asm/microcode.h>
/* Number of siblings per CPU package */
int smp_num_siblings = 1;
@@ -1692,9 +1693,15 @@ void native_play_dead(void)
play_dead_common();
tboot_shutdown(TB_SHUTDOWN_WFS);
+ if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+
mwait_play_dead(); /* Only returns on failure */
if (cpuidle_play_dead())
hlt_play_dead();
+
+ if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ native_wrmsrl(MSR_IA32_SPEC_CTRL, FEATURE_ENABLE_IBRS);
}
#else /* ... !CONFIG_HOTPLUG_CPU */
--
2.14.2

View file

@ -0,0 +1,47 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tim Chen <tim.c.chen@linux.intel.com>
Date: Fri, 20 Oct 2017 12:56:29 -0700
Subject: [PATCH] x86/mm: Set IBPB upon context switch
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Set IBPB on context switch with changing of page table.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit a3320203792b633fb96df5d0bbfb7036129b78e2)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/mm/tlb.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 06f3854d0a4f..bb3ded3a4e5f 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -12,6 +12,7 @@
#include <asm/cache.h>
#include <asm/apic.h>
#include <asm/uv/uv.h>
+#include <asm/microcode.h>
#include <linux/debugfs.h>
/*
@@ -218,6 +219,9 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
u16 new_asid;
bool need_flush;
+ if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ native_wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
+
if (IS_ENABLED(CONFIG_VMAP_STACK)) {
/*
* If our current stack is in vmalloc space and isn't
--
2.14.2

View file

@ -0,0 +1,127 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tim Chen <tim.c.chen@linux.intel.com>
Date: Tue, 7 Nov 2017 13:52:42 -0800
Subject: [PATCH] x86/mm: Only set IBPB when the new thread cannot ptrace
current thread
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
To reduce overhead of setting IBPB, we only do that when
the new thread cannot ptrace the current one. If the new
thread has ptrace capability on current thread, it is safe.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 65941af723059ffeeca269b99ab51b3c9e320751)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
include/linux/ptrace.h | 6 ++++++
arch/x86/mm/tlb.c | 5 ++++-
kernel/ptrace.c | 18 ++++++++++++++----
3 files changed, 24 insertions(+), 5 deletions(-)
diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 0e5fcc11b1b8..d6afefd5465b 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -63,12 +63,15 @@ extern void exit_ptrace(struct task_struct *tracer, struct list_head *dead);
#define PTRACE_MODE_NOAUDIT 0x04
#define PTRACE_MODE_FSCREDS 0x08
#define PTRACE_MODE_REALCREDS 0x10
+#define PTRACE_MODE_NOACCESS_CHK 0x20
/* shorthands for READ/ATTACH and FSCREDS/REALCREDS combinations */
#define PTRACE_MODE_READ_FSCREDS (PTRACE_MODE_READ | PTRACE_MODE_FSCREDS)
#define PTRACE_MODE_READ_REALCREDS (PTRACE_MODE_READ | PTRACE_MODE_REALCREDS)
#define PTRACE_MODE_ATTACH_FSCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_FSCREDS)
#define PTRACE_MODE_ATTACH_REALCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_REALCREDS)
+#define PTRACE_MODE_IBPB (PTRACE_MODE_ATTACH | PTRACE_MODE_NOAUDIT \
+ | PTRACE_MODE_NOACCESS_CHK | PTRACE_MODE_REALCREDS)
/**
* ptrace_may_access - check whether the caller is permitted to access
@@ -86,6 +89,9 @@ extern void exit_ptrace(struct task_struct *tracer, struct list_head *dead);
*/
extern bool ptrace_may_access(struct task_struct *task, unsigned int mode);
+extern int ___ptrace_may_access(struct task_struct *cur, struct task_struct *task,
+ unsigned int mode);
+
static inline int ptrace_reparented(struct task_struct *child)
{
return !same_thread_group(child->real_parent, child->parent);
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index bb3ded3a4e5f..301e6efbc514 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -6,6 +6,7 @@
#include <linux/interrupt.h>
#include <linux/export.h>
#include <linux/cpu.h>
+#include <linux/ptrace.h>
#include <asm/tlbflush.h>
#include <asm/mmu_context.h>
@@ -219,7 +220,9 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
u16 new_asid;
bool need_flush;
- if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ /* Null tsk means switching to kernel, so that's safe */
+ if (boot_cpu_has(X86_FEATURE_SPEC_CTRL) && tsk &&
+ ___ptrace_may_access(tsk, current, PTRACE_MODE_IBPB))
native_wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
if (IS_ENABLED(CONFIG_VMAP_STACK)) {
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 60f356d91060..f2f0f1aeabaf 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -268,9 +268,10 @@ static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
}
/* Returns 0 on success, -errno on denial. */
-static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
+int ___ptrace_may_access(struct task_struct *cur, struct task_struct *task,
+ unsigned int mode)
{
- const struct cred *cred = current_cred(), *tcred;
+ const struct cred *cred = __task_cred(cur), *tcred;
struct mm_struct *mm;
kuid_t caller_uid;
kgid_t caller_gid;
@@ -290,7 +291,7 @@ static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
*/
/* Don't let security modules deny introspection */
- if (same_thread_group(task, current))
+ if (same_thread_group(task, cur))
return 0;
rcu_read_lock();
if (mode & PTRACE_MODE_FSCREDS) {
@@ -328,7 +329,16 @@ static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
!ptrace_has_cap(mm->user_ns, mode)))
return -EPERM;
- return security_ptrace_access_check(task, mode);
+ if (!(mode & PTRACE_MODE_NOACCESS_CHK))
+ return security_ptrace_access_check(task, mode);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(___ptrace_may_access);
+
+static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
+{
+ return ___ptrace_may_access(current, task, mode);
}
bool ptrace_may_access(struct task_struct *task, unsigned int mode)
--
2.14.2

View file

@ -0,0 +1,202 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tim Chen <tim.c.chen@linux.intel.com>
Date: Tue, 14 Nov 2017 17:16:30 -0800
Subject: [PATCH] x86/entry: Stuff RSB for entry to kernel for non-SMEP
platform
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Stuff RSB to prevent RSB underflow on non-SMEP platform.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit b82785ac1d33ce219c77d72b7bd80a21e1441ac8)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/include/asm/spec_ctrl.h | 71 ++++++++++++++++++++++++++++++++++++++++
arch/x86/entry/entry_64.S | 18 ++++++++--
arch/x86/entry/entry_64_compat.S | 4 +++
3 files changed, 91 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/spec_ctrl.h b/arch/x86/include/asm/spec_ctrl.h
index 7f8bb09b6acb..55ee1f36bda2 100644
--- a/arch/x86/include/asm/spec_ctrl.h
+++ b/arch/x86/include/asm/spec_ctrl.h
@@ -35,6 +35,73 @@
popq %rdx; \
popq %rcx; \
popq %rax
+#define __ASM_STUFF_RSB \
+ call 1f; \
+ pause; \
+1: call 2f; \
+ pause; \
+2: call 3f; \
+ pause; \
+3: call 4f; \
+ pause; \
+4: call 5f; \
+ pause; \
+5: call 6f; \
+ pause; \
+6: call 7f; \
+ pause; \
+7: call 8f; \
+ pause; \
+8: call 9f; \
+ pause; \
+9: call 10f; \
+ pause; \
+10: call 11f; \
+ pause; \
+11: call 12f; \
+ pause; \
+12: call 13f; \
+ pause; \
+13: call 14f; \
+ pause; \
+14: call 15f; \
+ pause; \
+15: call 16f; \
+ pause; \
+16: call 17f; \
+ pause; \
+17: call 18f; \
+ pause; \
+18: call 19f; \
+ pause; \
+19: call 20f; \
+ pause; \
+20: call 21f; \
+ pause; \
+21: call 22f; \
+ pause; \
+22: call 23f; \
+ pause; \
+23: call 24f; \
+ pause; \
+24: call 25f; \
+ pause; \
+25: call 26f; \
+ pause; \
+26: call 27f; \
+ pause; \
+27: call 28f; \
+ pause; \
+28: call 29f; \
+ pause; \
+29: call 30f; \
+ pause; \
+30: call 31f; \
+ pause; \
+31: call 32f; \
+ pause; \
+32: \
+ add $(32*8), %rsp;
.macro ENABLE_IBRS
ALTERNATIVE "", __stringify(__ASM_ENABLE_IBRS), X86_FEATURE_SPEC_CTRL
@@ -48,5 +115,9 @@ ALTERNATIVE "", __stringify(__ASM_ENABLE_IBRS_CLOBBER), X86_FEATURE_SPEC_CTRL
ALTERNATIVE "", __stringify(__ASM_DISABLE_IBRS), X86_FEATURE_SPEC_CTRL
.endm
+.macro STUFF_RSB
+ALTERNATIVE __stringify(__ASM_STUFF_RSB), "", X86_FEATURE_SMEP
+.endm
+
#endif /* __ASSEMBLY__ */
#endif /* _ASM_X86_SPEC_CTRL_H */
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 5f898c3c1dad..f6ec4ad5b114 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -214,8 +214,6 @@ ENTRY(entry_SYSCALL_64)
movq %rsp, PER_CPU_VAR(rsp_scratch)
movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
- TRACE_IRQS_OFF
-
/* Construct struct pt_regs on stack */
pushq $__USER_DS /* pt_regs->ss */
pushq PER_CPU_VAR(rsp_scratch) /* pt_regs->sp */
@@ -238,6 +236,10 @@ GLOBAL(entry_SYSCALL_64_after_hwframe)
ENABLE_IBRS
+ STUFF_RSB
+
+ TRACE_IRQS_OFF
+
/*
* If we need to do entry work or if we guess we'll need to do
* exit work, go straight to the slow path.
@@ -658,6 +660,13 @@ END(irq_entries_start)
ALLOC_PT_GPREGS_ON_STACK
SAVE_C_REGS
SAVE_EXTRA_REGS
+
+ /*
+ * Have to do stuffing before encoding frame pointer.
+ * Could add some unnecessary RSB clearing if coming
+ * from kernel for non-SMEP platform.
+ */
+ STUFF_RSB
ENCODE_FRAME_POINTER
testb $3, CS(%rsp)
@@ -1276,6 +1285,10 @@ ENTRY(paranoid_entry)
cld
SAVE_C_REGS 8
SAVE_EXTRA_REGS 8
+ /*
+ * Do the stuffing unconditionally from user/kernel to be safe
+ */
+ STUFF_RSB
ENCODE_FRAME_POINTER 8
movl $1, %ebx
movl $MSR_GS_BASE, %ecx
@@ -1329,6 +1342,7 @@ ENTRY(error_entry)
cld
SAVE_C_REGS 8
SAVE_EXTRA_REGS 8
+ STUFF_RSB
ENCODE_FRAME_POINTER 8
xorl %ebx, %ebx
testb $3, CS+8(%rsp)
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index ee4f3edb3c50..1480222bae02 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -97,6 +97,7 @@ ENTRY(entry_SYSENTER_compat)
cld
ENABLE_IBRS
+ STUFF_RSB
/*
* SYSENTER doesn't filter flags, so we need to clear NT and AC
@@ -227,6 +228,8 @@ GLOBAL(entry_SYSCALL_compat_after_hwframe)
pushq $0 /* pt_regs->r14 = 0 */
pushq $0 /* pt_regs->r15 = 0 */
+ STUFF_RSB
+
/*
* User mode is traced as though IRQs are on, and SYSENTER
* turned them off.
@@ -354,6 +357,7 @@ ENTRY(entry_INT80_compat)
cld
ENABLE_IBRS
+ STUFF_RSB
/*
* User mode is traced as though IRQs are on, and the interrupt
--
2.14.2

View file

@ -0,0 +1,103 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Wei Wang <wei.w.wang@intel.com>
Date: Tue, 7 Nov 2017 16:47:53 +0800
Subject: [PATCH] x86/kvm: add MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD to kvm
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Add field to access guest MSR_IA332_SPEC_CTRL and MSR_IA32_PRED_CMD state.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 15eb187f47ee2be44d34313bc89cfb719d82cb21)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/include/asm/kvm_host.h | 2 ++
arch/x86/kvm/vmx.c | 10 ++++++++++
arch/x86/kvm/x86.c | 2 +-
3 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index b69af3df978a..1953c0a5b972 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -628,6 +628,8 @@ struct kvm_vcpu_arch {
u64 mcg_ext_ctl;
u64 *mce_banks;
+ u64 spec_ctrl;
+
/* Cache MMIO info */
u64 mmio_gva;
unsigned access;
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 9b4256fd589a..daff9962c90a 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -50,6 +50,7 @@
#include <asm/apic.h>
#include <asm/irq_remapping.h>
#include <asm/mmu_context.h>
+#include <asm/microcode.h>
#include "trace.h"
#include "pmu.h"
@@ -3247,6 +3248,9 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_IA32_TSC:
msr_info->data = guest_read_tsc(vcpu);
break;
+ case MSR_IA32_SPEC_CTRL:
+ msr_info->data = vcpu->arch.spec_ctrl;
+ break;
case MSR_IA32_SYSENTER_CS:
msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
break;
@@ -3351,6 +3355,9 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_IA32_TSC:
kvm_write_tsc(vcpu, msr_info);
break;
+ case MSR_IA32_SPEC_CTRL:
+ vcpu->arch.spec_ctrl = msr_info->data;
+ break;
case MSR_IA32_CR_PAT:
if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
@@ -6146,6 +6153,7 @@ static int handle_rdmsr(struct kvm_vcpu *vcpu)
msr_info.index = ecx;
msr_info.host_initiated = false;
+
if (vmx_get_msr(vcpu, &msr_info)) {
trace_kvm_msr_read_ex(ecx);
kvm_inject_gp(vcpu, 0);
@@ -6699,6 +6707,8 @@ static __init int hardware_setup(void)
vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_CS, false);
vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_ESP, false);
vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_EIP, false);
+ vmx_disable_intercept_for_msr(MSR_IA32_SPEC_CTRL, false);
+ vmx_disable_intercept_for_msr(MSR_IA32_PRED_CMD, false);
memcpy(vmx_msr_bitmap_legacy_x2apic_apicv,
vmx_msr_bitmap_legacy, PAGE_SIZE);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 703cd4171921..eae4aecf3cfe 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -983,7 +983,7 @@ static u32 msrs_to_save[] = {
MSR_CSTAR, MSR_KERNEL_GS_BASE, MSR_SYSCALL_MASK, MSR_LSTAR,
#endif
MSR_IA32_TSC, MSR_IA32_CR_PAT, MSR_VM_HSAVE_PA,
- MSR_IA32_FEATURE_CONTROL, MSR_IA32_BNDCFGS, MSR_TSC_AUX,
+ MSR_IA32_FEATURE_CONTROL, MSR_IA32_BNDCFGS, MSR_TSC_AUX, MSR_IA32_SPEC_CTRL,
};
static unsigned num_msrs_to_save;
--
2.14.2

View file

@ -0,0 +1,46 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tim Chen <tim.c.chen@linux.intel.com>
Date: Fri, 13 Oct 2017 14:31:46 -0700
Subject: [PATCH] x86/kvm: Set IBPB when switching VM
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Set IBPB (Indirect branch prediction barrier) when switching VM.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 665076ad780e8620505c742cfcb4b0f3fb99324a)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/kvm/vmx.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index daff9962c90a..8df195bbb41d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1488,6 +1488,7 @@ static void vmcs_load(struct vmcs *vmcs)
if (error)
printk(KERN_ERR "kvm: vmptrld %p/%llx failed\n",
vmcs, phys_addr);
+
}
#ifdef CONFIG_KEXEC_CORE
@@ -2268,6 +2269,8 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
if (per_cpu(current_vmcs, cpu) != vmx->loaded_vmcs->vmcs) {
per_cpu(current_vmcs, cpu) = vmx->loaded_vmcs->vmcs;
vmcs_load(vmx->loaded_vmcs->vmcs);
+ if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ native_wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
}
if (!already_loaded) {
--
2.14.2

View file

@ -0,0 +1,42 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tim Chen <tim.c.chen@linux.intel.com>
Date: Fri, 20 Oct 2017 17:04:35 -0700
Subject: [PATCH] x86/kvm: Toggle IBRS on VM entry and exit
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Restore guest IBRS on VM entry and set it to 1 on VM exit
back to kernel.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 3dc28210342f174270bcefac74ef5d0b52ffd846)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/kvm/vmx.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 8df195bbb41d..57d538fc7c75 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -9101,6 +9101,11 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
__write_pkru(vcpu->arch.pkru);
atomic_switch_perf_msrs(vmx);
+
+ if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ add_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL,
+ vcpu->arch.spec_ctrl, FEATURE_ENABLE_IBRS);
+
debugctlmsr = get_debugctlmsr();
vmx_arm_hv_timer(vcpu);
--
2.14.2

View file

@ -0,0 +1,154 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tim Chen <tim.c.chen@linux.intel.com>
Date: Fri, 20 Oct 2017 17:05:54 -0700
Subject: [PATCH] x86/kvm: Pad RSB on VM transition
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Add code to pad the local CPU's RSB entries to protect
from previous less privilege mode.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 5369368d3520addb2ffb2413cfa7e8f3efe2e31d)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/include/asm/kvm_host.h | 103 ++++++++++++++++++++++++++++++++++++++++
arch/x86/kvm/vmx.c | 2 +
2 files changed, 105 insertions(+)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 1953c0a5b972..4117a97228a2 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -125,6 +125,109 @@ static inline gfn_t gfn_to_index(gfn_t gfn, gfn_t base_gfn, int level)
#define ASYNC_PF_PER_VCPU 64
+static inline void stuff_RSB(void)
+{
+ __asm__ __volatile__(" \n\
+ call .label1 \n\
+ pause \n\
+.label1: \n\
+ call .label2 \n\
+ pause \n\
+.label2: \n\
+ call .label3 \n\
+ pause \n\
+.label3: \n\
+ call .label4 \n\
+ pause \n\
+.label4: \n\
+ call .label5 \n\
+ pause \n\
+.label5: \n\
+ call .label6 \n\
+ pause \n\
+.label6: \n\
+ call .label7 \n\
+ pause \n\
+.label7: \n\
+ call .label8 \n\
+ pause \n\
+.label8: \n\
+ call .label9 \n\
+ pause \n\
+.label9: \n\
+ call .label10 \n\
+ pause \n\
+.label10: \n\
+ call .label11 \n\
+ pause \n\
+.label11: \n\
+ call .label12 \n\
+ pause \n\
+.label12: \n\
+ call .label13 \n\
+ pause \n\
+.label13: \n\
+ call .label14 \n\
+ pause \n\
+.label14: \n\
+ call .label15 \n\
+ pause \n\
+.label15: \n\
+ call .label16 \n\
+ pause \n\
+.label16: \n\
+ call .label17 \n\
+ pause \n\
+.label17: \n\
+ call .label18 \n\
+ pause \n\
+.label18: \n\
+ call .label19 \n\
+ pause \n\
+.label19: \n\
+ call .label20 \n\
+ pause \n\
+.label20: \n\
+ call .label21 \n\
+ pause \n\
+.label21: \n\
+ call .label22 \n\
+ pause \n\
+.label22: \n\
+ call .label23 \n\
+ pause \n\
+.label23: \n\
+ call .label24 \n\
+ pause \n\
+.label24: \n\
+ call .label25 \n\
+ pause \n\
+.label25: \n\
+ call .label26 \n\
+ pause \n\
+.label26: \n\
+ call .label27 \n\
+ pause \n\
+.label27: \n\
+ call .label28 \n\
+ pause \n\
+.label28: \n\
+ call .label29 \n\
+ pause \n\
+.label29: \n\
+ call .label30 \n\
+ pause \n\
+.label30: \n\
+ call .label31 \n\
+ pause \n\
+.label31: \n\
+ call .label32 \n\
+ pause \n\
+.label32: \n\
+ add $(32*8), %%rsp \n\
+": : :"memory");
+}
+
enum kvm_reg {
VCPU_REGS_RAX = 0,
VCPU_REGS_RCX = 1,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 57d538fc7c75..496884b6467f 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -9228,6 +9228,8 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
#endif
);
+ stuff_RSB();
+
/* MSR_IA32_DEBUGCTLMSR is zeroed on vmexit. Restore it if needed */
if (debugctlmsr)
update_debugctlmsr(debugctlmsr);
--
2.14.2

View file

@ -0,0 +1,613 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tim Chen <tim@otc-grantley-02.jf.intel.com>
Date: Thu, 16 Nov 2017 04:47:48 -0800
Subject: [PATCH] x86/spec_ctrl: Add sysctl knobs to enable/disable SPEC_CTRL
feature
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
There are 2 ways to control IBPB and IBRS
1. At boot time
noibrs kernel boot parameter will disable IBRS usage
noibpb kernel boot parameter will disable IBPB usage
Otherwise if the above parameters are not specified, the system
will enable ibrs and ibpb usage if the cpu supports it.
2. At run time
echo 0 > /proc/sys/kernel/ibrs_enabled will turn off IBRS
echo 1 > /proc/sys/kernel/ibrs_enabled will turn on IBRS in kernel
echo 2 > /proc/sys/kernel/ibrs_enabled will turn on IBRS in both userspace and kernel
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
[marcelo.cerri@canonical.com: add x86 guards to kernel/smp.c]
[marcelo.cerri@canonical.com: include asm/msr.h under x86 guard in kernel/sysctl.c]
Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
(cherry picked from commit 23225db7b02c7f8b94e5d5050987430089e6f7cc)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
Documentation/admin-guide/kernel-parameters.txt | 10 ++
arch/x86/include/asm/mwait.h | 4 +-
arch/x86/include/asm/spec_ctrl.h | 24 ++++-
include/linux/smp.h | 87 +++++++++++++++++
arch/x86/kernel/cpu/intel.c | 11 ++-
arch/x86/kernel/cpu/microcode/core.c | 11 +++
arch/x86/kernel/process.c | 6 +-
arch/x86/kernel/smpboot.c | 4 +-
arch/x86/kvm/vmx.c | 4 +-
arch/x86/lib/delay.c | 6 +-
arch/x86/mm/tlb.c | 2 +-
kernel/smp.c | 41 ++++++++
kernel/sysctl.c | 125 ++++++++++++++++++++++++
13 files changed, 316 insertions(+), 19 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 1a6ebc6cdf26..e7216bc05b3b 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2566,6 +2566,16 @@
noexec=on: enable non-executable mappings (default)
noexec=off: disable non-executable mappings
+ noibrs [X86]
+ Don't use indirect branch restricted speculation (IBRS)
+ feature when running in secure environment,
+ to avoid performance overhead.
+
+ noibpb [X86]
+ Don't use indirect branch prediction barrier (IBPB)
+ feature when running in secure environment,
+ to avoid performance overhead.
+
nosmap [X86]
Disable SMAP (Supervisor Mode Access Prevention)
even if it is supported by processor.
diff --git a/arch/x86/include/asm/mwait.h b/arch/x86/include/asm/mwait.h
index f15120ada161..d665daab3f84 100644
--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -107,14 +107,14 @@ static inline void mwait_idle_with_hints(unsigned long eax, unsigned long ecx)
mb();
}
- if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ if (ibrs_inuse)
native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
__monitor((void *)&current_thread_info()->flags, 0, 0);
if (!need_resched())
__mwait(eax, ecx);
- if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ if (ibrs_inuse)
native_wrmsrl(MSR_IA32_SPEC_CTRL, FEATURE_ENABLE_IBRS);
}
current_clr_polling();
diff --git a/arch/x86/include/asm/spec_ctrl.h b/arch/x86/include/asm/spec_ctrl.h
index 55ee1f36bda2..4c69e51261cc 100644
--- a/arch/x86/include/asm/spec_ctrl.h
+++ b/arch/x86/include/asm/spec_ctrl.h
@@ -8,6 +8,9 @@
#ifdef __ASSEMBLY__
+.extern use_ibrs
+.extern use_ibpb
+
#define __ASM_ENABLE_IBRS \
pushq %rax; \
pushq %rcx; \
@@ -104,15 +107,30 @@
add $(32*8), %rsp;
.macro ENABLE_IBRS
-ALTERNATIVE "", __stringify(__ASM_ENABLE_IBRS), X86_FEATURE_SPEC_CTRL
+ testl $1, use_ibrs
+ jz 10f
+ __ASM_ENABLE_IBRS
+ jmp 20f
+10:
+ lfence
+20:
.endm
.macro ENABLE_IBRS_CLOBBER
-ALTERNATIVE "", __stringify(__ASM_ENABLE_IBRS_CLOBBER), X86_FEATURE_SPEC_CTRL
+ testl $1, use_ibrs
+ jz 11f
+ __ASM_ENABLE_IBRS_CLOBBER
+ jmp 21f
+11:
+ lfence
+21:
.endm
.macro DISABLE_IBRS
-ALTERNATIVE "", __stringify(__ASM_DISABLE_IBRS), X86_FEATURE_SPEC_CTRL
+ testl $1, use_ibrs
+ jz 9f
+ __ASM_DISABLE_IBRS
+9:
.endm
.macro STUFF_RSB
diff --git a/include/linux/smp.h b/include/linux/smp.h
index 68123c1fe549..e2935c0a1bb4 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -50,6 +50,93 @@ void on_each_cpu_cond(bool (*cond_func)(int cpu, void *info),
int smp_call_function_single_async(int cpu, struct call_single_data *csd);
+#ifdef CONFIG_X86
+/* indicate usage of IBRS to control execution speculation */
+extern int use_ibrs;
+extern u32 sysctl_ibrs_enabled;
+extern struct mutex spec_ctrl_mutex;
+#define ibrs_supported (use_ibrs & 0x2)
+#define ibrs_disabled (use_ibrs & 0x4)
+static inline void set_ibrs_inuse(void)
+{
+ if (ibrs_supported)
+ use_ibrs |= 0x1;
+}
+static inline void clear_ibrs_inuse(void)
+{
+ use_ibrs &= ~0x1;
+}
+static inline int check_ibrs_inuse(void)
+{
+ if (use_ibrs & 0x1)
+ return 1;
+ else
+ /* rmb to prevent wrong speculation for security */
+ rmb();
+ return 0;
+}
+static inline void set_ibrs_supported(void)
+{
+ use_ibrs |= 0x2;
+ if (!ibrs_disabled)
+ set_ibrs_inuse();
+}
+static inline void set_ibrs_disabled(void)
+{
+ use_ibrs |= 0x4;
+ if (check_ibrs_inuse())
+ clear_ibrs_inuse();
+}
+static inline void clear_ibrs_disabled(void)
+{
+ use_ibrs &= ~0x4;
+ set_ibrs_inuse();
+}
+#define ibrs_inuse (check_ibrs_inuse())
+
+/* indicate usage of IBPB to control execution speculation */
+extern int use_ibpb;
+extern u32 sysctl_ibpb_enabled;
+#define ibpb_supported (use_ibpb & 0x2)
+#define ibpb_disabled (use_ibpb & 0x4)
+static inline void set_ibpb_inuse(void)
+{
+ if (ibpb_supported)
+ use_ibpb |= 0x1;
+}
+static inline void clear_ibpb_inuse(void)
+{
+ use_ibpb &= ~0x1;
+}
+static inline int check_ibpb_inuse(void)
+{
+ if (use_ibpb & 0x1)
+ return 1;
+ else
+ /* rmb to prevent wrong speculation for security */
+ rmb();
+ return 0;
+}
+static inline void set_ibpb_supported(void)
+{
+ use_ibpb |= 0x2;
+ if (!ibpb_disabled)
+ set_ibpb_inuse();
+}
+static inline void set_ibpb_disabled(void)
+{
+ use_ibpb |= 0x4;
+ if (check_ibpb_inuse())
+ clear_ibpb_inuse();
+}
+static inline void clear_ibpb_disabled(void)
+{
+ use_ibpb &= ~0x4;
+ set_ibpb_inuse();
+}
+#define ibpb_inuse (check_ibpb_inuse())
+#endif
+
#ifdef CONFIG_SMP
#include <linux/preempt.h>
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index f1d94c73625a..c69ea2efbed1 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -628,10 +628,17 @@ static void init_intel(struct cpuinfo_x86 *c)
init_intel_misc_features(c);
- if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ if (boot_cpu_has(X86_FEATURE_SPEC_CTRL)) {
printk_once(KERN_INFO "FEATURE SPEC_CTRL Present\n");
- else
+ set_ibrs_supported();
+ set_ibpb_supported();
+ if (ibrs_inuse)
+ sysctl_ibrs_enabled = 1;
+ if (ibpb_inuse)
+ sysctl_ibpb_enabled = 1;
+ } else {
printk_once(KERN_INFO "FEATURE SPEC_CTRL Not Present\n");
+ }
}
#ifdef CONFIG_X86_32
diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index c4fa4a85d4cb..6450aeda72fc 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -535,6 +535,17 @@ static ssize_t reload_store(struct device *dev,
}
if (!ret)
perf_check_microcode();
+
+ if (boot_cpu_has(X86_FEATURE_SPEC_CTRL)) {
+ printk_once(KERN_INFO "FEATURE SPEC_CTRL Present\n");
+ set_ibrs_supported();
+ set_ibpb_supported();
+ if (ibrs_inuse)
+ sysctl_ibrs_enabled = 1;
+ if (ibpb_inuse)
+ sysctl_ibpb_enabled = 1;
+ }
+
mutex_unlock(&microcode_mutex);
put_online_cpus();
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 3adb3806a284..3fdf5358998e 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -447,16 +447,16 @@ static __cpuidle void mwait_idle(void)
mb(); /* quirk */
}
- if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ if (ibrs_inuse)
native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
__monitor((void *)&current_thread_info()->flags, 0, 0);
if (!need_resched()) {
__sti_mwait(0, 0);
- if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ if (ibrs_inuse)
native_wrmsrl(MSR_IA32_SPEC_CTRL, FEATURE_ENABLE_IBRS);
} else {
- if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ if (ibrs_inuse)
native_wrmsrl(MSR_IA32_SPEC_CTRL, FEATURE_ENABLE_IBRS);
local_irq_enable();
}
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index a652bff7add4..9317aa4a7446 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1693,14 +1693,14 @@ void native_play_dead(void)
play_dead_common();
tboot_shutdown(TB_SHUTDOWN_WFS);
- if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ if (ibrs_inuse)
native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
mwait_play_dead(); /* Only returns on failure */
if (cpuidle_play_dead())
hlt_play_dead();
- if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ if (ibrs_inuse)
native_wrmsrl(MSR_IA32_SPEC_CTRL, FEATURE_ENABLE_IBRS);
}
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 496884b6467f..d2168203bddc 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2269,7 +2269,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
if (per_cpu(current_vmcs, cpu) != vmx->loaded_vmcs->vmcs) {
per_cpu(current_vmcs, cpu) = vmx->loaded_vmcs->vmcs;
vmcs_load(vmx->loaded_vmcs->vmcs);
- if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ if (ibpb_inuse)
native_wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
}
@@ -9102,7 +9102,7 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
atomic_switch_perf_msrs(vmx);
- if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ if (ibrs_inuse)
add_atomic_switch_msr(vmx, MSR_IA32_SPEC_CTRL,
vcpu->arch.spec_ctrl, FEATURE_ENABLE_IBRS);
diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
index b088463973e4..72a174642550 100644
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -107,8 +107,7 @@ static void delay_mwaitx(unsigned long __loops)
for (;;) {
delay = min_t(u64, MWAITX_MAX_LOOPS, loops);
- if (boot_cpu_has(X86_FEATURE_SPEC_CTRL) &&
- (delay > IBRS_DISABLE_THRESHOLD))
+ if (ibrs_inuse && (delay > IBRS_DISABLE_THRESHOLD))
native_wrmsrl(MSR_IA32_SPEC_CTRL, 0);
/*
@@ -124,8 +123,7 @@ static void delay_mwaitx(unsigned long __loops)
*/
__mwaitx(MWAITX_DISABLE_CSTATES, delay, MWAITX_ECX_TIMER_ENABLE);
- if (boot_cpu_has(X86_FEATURE_SPEC_CTRL) &&
- (delay > IBRS_DISABLE_THRESHOLD))
+ if (ibrs_inuse && (delay > IBRS_DISABLE_THRESHOLD))
native_wrmsrl(MSR_IA32_SPEC_CTRL, FEATURE_ENABLE_IBRS);
end = rdtsc_ordered();
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 301e6efbc514..6365f769de3d 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -221,7 +221,7 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
bool need_flush;
/* Null tsk means switching to kernel, so that's safe */
- if (boot_cpu_has(X86_FEATURE_SPEC_CTRL) && tsk &&
+ if (ibpb_inuse && tsk &&
___ptrace_may_access(tsk, current, PTRACE_MODE_IBPB))
native_wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
diff --git a/kernel/smp.c b/kernel/smp.c
index 3061483cb3ad..3bece045f4a4 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -498,6 +498,26 @@ EXPORT_SYMBOL(smp_call_function);
unsigned int setup_max_cpus = NR_CPUS;
EXPORT_SYMBOL(setup_max_cpus);
+#ifdef CONFIG_X86
+/*
+ * use IBRS
+ * bit 0 = indicate if ibrs is currently in use
+ * bit 1 = indicate if system supports ibrs
+ * bit 2 = indicate if admin disables ibrs
+*/
+
+int use_ibrs;
+EXPORT_SYMBOL(use_ibrs);
+
+/*
+ * use IBRS
+ * bit 0 = indicate if ibpb is currently in use
+ * bit 1 = indicate if system supports ibpb
+ * bit 2 = indicate if admin disables ibpb
+*/
+int use_ibpb;
+EXPORT_SYMBOL(use_ibpb);
+#endif
/*
* Setup routine for controlling SMP activation
@@ -522,6 +542,27 @@ static int __init nosmp(char *str)
early_param("nosmp", nosmp);
+#ifdef CONFIG_X86
+static int __init noibrs(char *str)
+{
+ set_ibrs_disabled();
+
+ return 0;
+}
+
+early_param("noibrs", noibrs);
+
+static int __init noibpb(char *str)
+{
+ set_ibpb_disabled();
+
+ return 0;
+}
+
+early_param("noibpb", noibpb);
+#endif
+
+
/* this is hard limit */
static int __init nrcpus(char *str)
{
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 7ab08d5728e6..69c37bd6251a 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -72,6 +72,7 @@
#include <asm/processor.h>
#ifdef CONFIG_X86
+#include <asm/msr.h>
#include <asm/nmi.h>
#include <asm/stacktrace.h>
#include <asm/io.h>
@@ -222,6 +223,15 @@ static int proc_dostring_coredump(struct ctl_table *table, int write,
void __user *buffer, size_t *lenp, loff_t *ppos);
#endif
+#ifdef CONFIG_X86
+int proc_dointvec_ibrs_ctrl(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos);
+int proc_dointvec_ibpb_ctrl(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos);
+int proc_dointvec_ibrs_dump(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos);
+#endif
+
#ifdef CONFIG_MAGIC_SYSRQ
/* Note: sysrq code uses it's own private copy */
static int __sysrq_enabled = CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE;
@@ -258,6 +268,12 @@ extern struct ctl_table epoll_table[];
int sysctl_legacy_va_layout;
#endif
+u32 sysctl_ibrs_dump = 0;
+u32 sysctl_ibrs_enabled = 0;
+EXPORT_SYMBOL(sysctl_ibrs_enabled);
+u32 sysctl_ibpb_enabled = 0;
+EXPORT_SYMBOL(sysctl_ibpb_enabled);
+
/* The default sysctl tables: */
static struct ctl_table sysctl_base_table[] = {
@@ -1241,6 +1257,35 @@ static struct ctl_table kern_table[] = {
.extra1 = &zero,
.extra2 = &one,
},
+#endif
+#ifdef CONFIG_X86
+ {
+ .procname = "ibrs_enabled",
+ .data = &sysctl_ibrs_enabled,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_ibrs_ctrl,
+ .extra1 = &zero,
+ .extra2 = &two,
+ },
+ {
+ .procname = "ibpb_enabled",
+ .data = &sysctl_ibpb_enabled,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_ibpb_ctrl,
+ .extra1 = &zero,
+ .extra2 = &one,
+ },
+ {
+ .procname = "ibrs_dump",
+ .data = &sysctl_ibrs_dump,
+ .maxlen = sizeof(unsigned int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_ibrs_dump,
+ .extra1 = &zero,
+ .extra2 = &one,
+ },
#endif
{ }
};
@@ -2585,6 +2630,86 @@ int proc_dointvec_minmax(struct ctl_table *table, int write,
do_proc_dointvec_minmax_conv, &param);
}
+#ifdef CONFIG_X86
+int proc_dointvec_ibrs_dump(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+ int ret;
+ unsigned int cpu;
+
+ ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+ printk("sysctl_ibrs_enabled = %u, sysctl_ibpb_enabled = %u\n", sysctl_ibrs_enabled, sysctl_ibpb_enabled);
+ printk("use_ibrs = %d, use_ibpb = %d\n", use_ibrs, use_ibpb);
+ for_each_online_cpu(cpu) {
+ u64 val;
+
+ if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
+ rdmsrl_on_cpu(cpu, MSR_IA32_SPEC_CTRL, &val);
+ else
+ val = 0;
+ printk("read cpu %d ibrs val %lu\n", cpu, (unsigned long) val);
+ }
+ return ret;
+}
+
+int proc_dointvec_ibrs_ctrl(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+ int ret;
+ unsigned int cpu;
+
+ ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+ pr_debug("sysctl_ibrs_enabled = %u, sysctl_ibpb_enabled = %u\n", sysctl_ibrs_enabled, sysctl_ibpb_enabled);
+ pr_debug("before:use_ibrs = %d, use_ibpb = %d\n", use_ibrs, use_ibpb);
+ if (sysctl_ibrs_enabled == 0) {
+ /* always set IBRS off */
+ set_ibrs_disabled();
+ if (ibrs_supported) {
+ for_each_online_cpu(cpu)
+ wrmsrl_on_cpu(cpu, MSR_IA32_SPEC_CTRL, 0x0);
+ }
+ } else if (sysctl_ibrs_enabled == 2) {
+ /* always set IBRS on, even in user space */
+ clear_ibrs_disabled();
+ if (ibrs_supported) {
+ for_each_online_cpu(cpu)
+ wrmsrl_on_cpu(cpu, MSR_IA32_SPEC_CTRL, FEATURE_ENABLE_IBRS);
+ } else {
+ sysctl_ibrs_enabled = 0;
+ }
+ } else if (sysctl_ibrs_enabled == 1) {
+ /* use IBRS in kernel */
+ clear_ibrs_disabled();
+ if (!ibrs_inuse)
+ /* platform don't support ibrs */
+ sysctl_ibrs_enabled = 0;
+ }
+ pr_debug("after:use_ibrs = %d, use_ibpb = %d\n", use_ibrs, use_ibpb);
+ return ret;
+}
+
+int proc_dointvec_ibpb_ctrl(struct ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+ int ret;
+
+ ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+ pr_debug("sysctl_ibrs_enabled = %u, sysctl_ibpb_enabled = %u\n", sysctl_ibrs_enabled, sysctl_ibpb_enabled);
+ pr_debug("before:use_ibrs = %d, use_ibpb = %d\n", use_ibrs, use_ibpb);
+ if (sysctl_ibpb_enabled == 0)
+ set_ibpb_disabled();
+ else if (sysctl_ibpb_enabled == 1) {
+ clear_ibpb_disabled();
+ if (!ibpb_inuse)
+ /* platform don't support ibpb */
+ sysctl_ibpb_enabled = 0;
+ }
+ pr_debug("after:use_ibrs = %d, use_ibpb = %d\n", use_ibrs, use_ibpb);
+ return ret;
+}
+#endif
+
+
struct do_proc_douintvec_minmax_conv_param {
unsigned int *min;
unsigned int *max;
--
2.14.2

View file

@ -0,0 +1,166 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tim Chen <tim.c.chen@linux.intel.com>
Date: Mon, 20 Nov 2017 13:47:54 -0800
Subject: [PATCH] x86/spec_ctrl: Add lock to serialize changes to ibrs and ibpb
control
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 85789933bc45a3e763823675bd0d80e3e617f234)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/kernel/cpu/intel.c | 22 ++++++++++++----------
arch/x86/kernel/cpu/microcode/core.c | 2 ++
kernel/smp.c | 4 ++++
kernel/sysctl.c | 14 +++++++++++++-
4 files changed, 31 insertions(+), 11 deletions(-)
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index c69ea2efbed1..8d558e24783c 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -628,16 +628,18 @@ static void init_intel(struct cpuinfo_x86 *c)
init_intel_misc_features(c);
- if (boot_cpu_has(X86_FEATURE_SPEC_CTRL)) {
- printk_once(KERN_INFO "FEATURE SPEC_CTRL Present\n");
- set_ibrs_supported();
- set_ibpb_supported();
- if (ibrs_inuse)
- sysctl_ibrs_enabled = 1;
- if (ibpb_inuse)
- sysctl_ibpb_enabled = 1;
- } else {
- printk_once(KERN_INFO "FEATURE SPEC_CTRL Not Present\n");
+ if (!c->cpu_index) {
+ if (boot_cpu_has(X86_FEATURE_SPEC_CTRL)) {
+ printk(KERN_INFO "FEATURE SPEC_CTRL Present\n");
+ set_ibrs_supported();
+ set_ibpb_supported();
+ if (ibrs_inuse)
+ sysctl_ibrs_enabled = 1;
+ if (ibpb_inuse)
+ sysctl_ibpb_enabled = 1;
+ } else {
+ printk(KERN_INFO "FEATURE SPEC_CTRL Not Present\n");
+ }
}
}
diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index 6450aeda72fc..55086921d29e 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -538,12 +538,14 @@ static ssize_t reload_store(struct device *dev,
if (boot_cpu_has(X86_FEATURE_SPEC_CTRL)) {
printk_once(KERN_INFO "FEATURE SPEC_CTRL Present\n");
+ mutex_lock(&spec_ctrl_mutex);
set_ibrs_supported();
set_ibpb_supported();
if (ibrs_inuse)
sysctl_ibrs_enabled = 1;
if (ibpb_inuse)
sysctl_ibpb_enabled = 1;
+ mutex_unlock(&spec_ctrl_mutex);
}
mutex_unlock(&microcode_mutex);
diff --git a/kernel/smp.c b/kernel/smp.c
index 3bece045f4a4..a224ec0c540c 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -519,6 +519,10 @@ int use_ibpb;
EXPORT_SYMBOL(use_ibpb);
#endif
+/* mutex to serialize IBRS & IBPB control changes */
+DEFINE_MUTEX(spec_ctrl_mutex);
+EXPORT_SYMBOL(spec_ctrl_mutex);
+
/*
* Setup routine for controlling SMP activation
*
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 69c37bd6251a..47a37792109d 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -69,6 +69,7 @@
#include <linux/mount.h>
#include <linux/uaccess.h>
+#include <linux/mutex.h>
#include <asm/processor.h>
#ifdef CONFIG_X86
@@ -2634,12 +2635,17 @@ int proc_dointvec_minmax(struct ctl_table *table, int write,
int proc_dointvec_ibrs_dump(struct ctl_table *table, int write,
void __user *buffer, size_t *lenp, loff_t *ppos)
{
- int ret;
+ int ret, orig_inuse;
unsigned int cpu;
+
ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
printk("sysctl_ibrs_enabled = %u, sysctl_ibpb_enabled = %u\n", sysctl_ibrs_enabled, sysctl_ibpb_enabled);
printk("use_ibrs = %d, use_ibpb = %d\n", use_ibrs, use_ibpb);
+ mutex_lock(&spec_ctrl_mutex);
+ orig_inuse = use_ibrs;
+ /* temporary halt to ibrs usage to dump ibrs values */
+ clear_ibrs_inuse();
for_each_online_cpu(cpu) {
u64 val;
@@ -2649,6 +2655,8 @@ int proc_dointvec_ibrs_dump(struct ctl_table *table, int write,
val = 0;
printk("read cpu %d ibrs val %lu\n", cpu, (unsigned long) val);
}
+ use_ibrs = orig_inuse;
+ mutex_unlock(&spec_ctrl_mutex);
return ret;
}
@@ -2661,6 +2669,7 @@ int proc_dointvec_ibrs_ctrl(struct ctl_table *table, int write,
ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
pr_debug("sysctl_ibrs_enabled = %u, sysctl_ibpb_enabled = %u\n", sysctl_ibrs_enabled, sysctl_ibpb_enabled);
pr_debug("before:use_ibrs = %d, use_ibpb = %d\n", use_ibrs, use_ibpb);
+ mutex_lock(&spec_ctrl_mutex);
if (sysctl_ibrs_enabled == 0) {
/* always set IBRS off */
set_ibrs_disabled();
@@ -2684,6 +2693,7 @@ int proc_dointvec_ibrs_ctrl(struct ctl_table *table, int write,
/* platform don't support ibrs */
sysctl_ibrs_enabled = 0;
}
+ mutex_unlock(&spec_ctrl_mutex);
pr_debug("after:use_ibrs = %d, use_ibpb = %d\n", use_ibrs, use_ibpb);
return ret;
}
@@ -2696,6 +2706,7 @@ int proc_dointvec_ibpb_ctrl(struct ctl_table *table, int write,
ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
pr_debug("sysctl_ibrs_enabled = %u, sysctl_ibpb_enabled = %u\n", sysctl_ibrs_enabled, sysctl_ibpb_enabled);
pr_debug("before:use_ibrs = %d, use_ibpb = %d\n", use_ibrs, use_ibpb);
+ mutex_lock(&spec_ctrl_mutex);
if (sysctl_ibpb_enabled == 0)
set_ibpb_disabled();
else if (sysctl_ibpb_enabled == 1) {
@@ -2704,6 +2715,7 @@ int proc_dointvec_ibpb_ctrl(struct ctl_table *table, int write,
/* platform don't support ibpb */
sysctl_ibpb_enabled = 0;
}
+ mutex_unlock(&spec_ctrl_mutex);
pr_debug("after:use_ibrs = %d, use_ibpb = %d\n", use_ibrs, use_ibpb);
return ret;
}
--
2.14.2

View file

@ -0,0 +1,94 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tim Chen <tim.c.chen@linux.intel.com>
Date: Tue, 19 Sep 2017 15:21:40 -0700
Subject: [PATCH] x86/syscall: Clear unused extra registers on syscall entrance
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
To prevent the unused registers %r12-%r15, %rbp and %rbx from
being used speculatively, we clear them upon syscall entrance
for code hygiene.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 7b5ea16f42b5e4860cf9033897bcdfa3e1209033)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/entry/calling.h | 9 +++++++++
arch/x86/entry/entry_64.S | 12 ++++++++----
2 files changed, 17 insertions(+), 4 deletions(-)
diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index 015e0a84bb99..d537818ad285 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -155,6 +155,15 @@ For 32-bit we have the following conventions - kernel is built with
popq %rbx
.endm
+ .macro CLEAR_EXTRA_REGS
+ xorq %r15, %r15
+ xorq %r14, %r14
+ xorq %r13, %r13
+ xorq %r12, %r12
+ xorq %rbp, %rbp
+ xorq %rbx, %rbx
+ .endm
+
.macro POP_C_REGS
popq %r11
popq %r10
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index f6ec4ad5b114..1118a6256c69 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -231,10 +231,16 @@ GLOBAL(entry_SYSCALL_64_after_hwframe)
pushq %r9 /* pt_regs->r9 */
pushq %r10 /* pt_regs->r10 */
pushq %r11 /* pt_regs->r11 */
- sub $(6*8), %rsp /* pt_regs->bp, bx, r12-15 not saved */
+ sub $(6*8), %rsp /* pt_regs->bp, bx, r12-15 not used */
UNWIND_HINT_REGS extra=0
ENABLE_IBRS
+ /*
+ * Clear the unused extra regs for code hygiene.
+ * Will restore the callee saved extra regs at end of syscall.
+ */
+ SAVE_EXTRA_REGS
+ CLEAR_EXTRA_REGS
STUFF_RSB
@@ -292,7 +298,7 @@ entry_SYSCALL_64_fastpath:
movq RIP(%rsp), %rcx
movq EFLAGS(%rsp), %r11
DISABLE_IBRS
- addq $6*8, %rsp /* skip extra regs -- they were preserved */
+ POP_EXTRA_REGS
UNWIND_HINT_EMPTY
jmp .Lpop_c_regs_except_rcx_r11_and_sysret
@@ -304,14 +310,12 @@ entry_SYSCALL_64_fastpath:
*/
TRACE_IRQS_ON
ENABLE_INTERRUPTS(CLBR_ANY)
- SAVE_EXTRA_REGS
movq %rsp, %rdi
call syscall_return_slowpath /* returns with IRQs disabled */
jmp return_from_SYSCALL_64
entry_SYSCALL64_slow_path:
/* IRQs are off. */
- SAVE_EXTRA_REGS
movq %rsp, %rdi
call do_syscall_64 /* returns with IRQs disabled */
--
2.14.2

View file

@ -0,0 +1,101 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tim Chen <tim.c.chen@linux.intel.com>
Date: Fri, 15 Sep 2017 19:41:24 -0700
Subject: [PATCH] x86/syscall: Clear unused extra registers on 32-bit
compatible syscall entrance
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
To prevent the unused registers %r8-%r15, from being used speculatively,
we clear them upon syscall entrance for code hygiene in 32 bit compatible
mode.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 725ad2ef81ccceb3e31a7263faae2059d05e2c48)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/entry/calling.h | 11 +++++++++++
arch/x86/entry/entry_64_compat.S | 18 ++++++++++++++----
2 files changed, 25 insertions(+), 4 deletions(-)
diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index d537818ad285..0e34002bc801 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -155,6 +155,17 @@ For 32-bit we have the following conventions - kernel is built with
popq %rbx
.endm
+ .macro CLEAR_R8_TO_R15
+ xorq %r15, %r15
+ xorq %r14, %r14
+ xorq %r13, %r13
+ xorq %r12, %r12
+ xorq %r11, %r11
+ xorq %r10, %r10
+ xorq %r9, %r9
+ xorq %r8, %r8
+ .endm
+
.macro CLEAR_EXTRA_REGS
xorq %r15, %r15
xorq %r14, %r14
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index 1480222bae02..8d7ae9657375 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -99,6 +99,8 @@ ENTRY(entry_SYSENTER_compat)
ENABLE_IBRS
STUFF_RSB
+ CLEAR_R8_TO_R15
+
/*
* SYSENTER doesn't filter flags, so we need to clear NT and AC
* ourselves. To save a few cycles, we can check whether
@@ -223,10 +225,12 @@ GLOBAL(entry_SYSCALL_compat_after_hwframe)
pushq $0 /* pt_regs->r11 = 0 */
pushq %rbx /* pt_regs->rbx */
pushq %rbp /* pt_regs->rbp (will be overwritten) */
- pushq $0 /* pt_regs->r12 = 0 */
- pushq $0 /* pt_regs->r13 = 0 */
- pushq $0 /* pt_regs->r14 = 0 */
- pushq $0 /* pt_regs->r15 = 0 */
+ pushq %r12 /* pt_regs->r12 */
+ pushq %r13 /* pt_regs->r13 */
+ pushq %r14 /* pt_regs->r14 */
+ pushq %r15 /* pt_regs->r15 */
+
+ CLEAR_R8_TO_R15
STUFF_RSB
@@ -245,6 +249,10 @@ GLOBAL(entry_SYSCALL_compat_after_hwframe)
/* Opportunistic SYSRET */
sysret32_from_system_call:
TRACE_IRQS_ON /* User mode traces as IRQs on. */
+ movq R15(%rsp), %r15 /* pt_regs->r15 */
+ movq R14(%rsp), %r14 /* pt_regs->r14 */
+ movq R13(%rsp), %r13 /* pt_regs->r13 */
+ movq R12(%rsp), %r12 /* pt_regs->r12 */
movq RBX(%rsp), %rbx /* pt_regs->rbx */
movq RBP(%rsp), %rbp /* pt_regs->rbp */
movq EFLAGS(%rsp), %r11 /* pt_regs->flags (in r11) */
@@ -359,6 +367,8 @@ ENTRY(entry_INT80_compat)
ENABLE_IBRS
STUFF_RSB
+ CLEAR_R8_TO_R15
+
/*
* User mode is traced as though IRQs are on, and the interrupt
* gate turned them off.
--
2.14.2

View file

@ -0,0 +1,44 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tim Chen <tim.c.chen@linux.intel.com>
Date: Wed, 8 Nov 2017 16:30:06 -0800
Subject: [PATCH] x86/entry: Use retpoline for syscall's indirect calls
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit d2e0236f395e876f5303fb5021e4fe6eea881402)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/entry/entry_64.S | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 1118a6256c69..be7196967f9f 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -276,7 +276,15 @@ entry_SYSCALL_64_fastpath:
* It might end up jumping to the slow path. If it jumps, RAX
* and all argument registers are clobbered.
*/
- call *sys_call_table(, %rax, 8)
+ movq sys_call_table(, %rax, 8), %r10
+ jmp 1f
+4: callq 2f
+3: nop
+ jmp 3b
+2: mov %r10, (%rsp)
+ retq
+1: callq 4b
+
.Lentry_SYSCALL_64_after_fastpath_call:
movq %rax, RAX(%rsp)
--
2.14.2

View file

@ -0,0 +1,112 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tom Lendacky <thomas.lendacky@amd.com>
Date: Wed, 20 Dec 2017 10:52:54 +0000
Subject: [PATCH] x86/cpu/AMD: Add speculative control support for AMD
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Add speculative control support for AMD processors. For AMD, speculative
control is indicated as follows:
CPUID EAX=0x00000007, ECX=0x00 return EDX[26] indicates support for
both IBRS and IBPB.
CPUID EAX=0x80000008, ECX=0x00 return EBX[12] indicates support for
just IBPB.
On AMD family 0x10, 0x12 and 0x16 processors where either of the above
features are not supported, IBPB can be achieved by disabling
indirect branch predictor support in MSR 0xc0011021[14] at boot.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 8c3fc9e98177daee2281ed40e3d61f9cf4eee576)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/kernel/cpu/amd.c | 39 ++++++++++++++++++++++++++++++++++++++
3 files changed, 41 insertions(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 44be8fd069bf..a97b327137aa 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -268,6 +268,7 @@
#define X86_FEATURE_CLZERO (13*32+ 0) /* CLZERO instruction */
#define X86_FEATURE_IRPERF (13*32+ 1) /* Instructions Retired Count */
#define X86_FEATURE_XSAVEERPTR (13*32+ 2) /* Always save/restore FP error pointers */
+#define X86_FEATURE_IBPB (13*32+12) /* Indirect Branch Prediction Barrier */
/* Thermal and Power Management Leaf, CPUID level 0x00000006 (EAX), word 14 */
#define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 4e3438a00a50..954aad6c32f4 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -345,6 +345,7 @@
#define MSR_F15H_NB_PERF_CTR 0xc0010241
#define MSR_F15H_PTSC 0xc0010280
#define MSR_F15H_IC_CFG 0xc0011021
+#define MSR_F15H_IC_CFG_DIS_IND BIT_ULL(14)
/* Fam 10h MSRs */
#define MSR_FAM10H_MMIO_CONF_BASE 0xc0010058
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 99eef4a09fd9..42871c1a8da8 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -830,6 +830,45 @@ static void init_amd(struct cpuinfo_x86 *c)
/* AMD CPUs don't reset SS attributes on SYSRET, Xen does. */
if (!cpu_has(c, X86_FEATURE_XENPV))
set_cpu_bug(c, X86_BUG_SYSRET_SS_ATTRS);
+
+ /* AMD speculative control support */
+ if (cpu_has(c, X86_FEATURE_SPEC_CTRL)) {
+ pr_info_once("FEATURE SPEC_CTRL Present\n");
+ set_ibrs_supported();
+ set_ibpb_supported();
+ if (ibrs_inuse)
+ sysctl_ibrs_enabled = 1;
+ if (ibpb_inuse)
+ sysctl_ibpb_enabled = 1;
+ } else if (cpu_has(c, X86_FEATURE_IBPB)) {
+ pr_info_once("FEATURE SPEC_CTRL Not Present\n");
+ pr_info_once("FEATURE IBPB Present\n");
+ set_ibpb_supported();
+ if (ibpb_inuse)
+ sysctl_ibpb_enabled = 1;
+ } else {
+ pr_info_once("FEATURE SPEC_CTRL Not Present\n");
+ pr_info_once("FEATURE IBPB Not Present\n");
+ /*
+ * On AMD processors that do not support the speculative
+ * control features, IBPB type support can be achieved by
+ * disabling indirect branch predictor support.
+ */
+ if (!ibpb_disabled) {
+ u64 val;
+
+ switch (c->x86) {
+ case 0x10:
+ case 0x12:
+ case 0x16:
+ pr_info_once("Disabling indirect branch predictor support\n");
+ rdmsrl(MSR_F15H_IC_CFG, val);
+ val |= MSR_F15H_IC_CFG_DIS_IND;
+ wrmsrl(MSR_F15H_IC_CFG, val);
+ break;
+ }
+ }
+ }
}
#ifdef CONFIG_X86_32
--
2.14.2

View file

@ -0,0 +1,45 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tom Lendacky <thomas.lendacky@amd.com>
Date: Wed, 20 Dec 2017 10:55:47 +0000
Subject: [PATCH] x86/microcode: Extend post microcode reload to support IBPB
feature
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Add an IBPB feature check to the speculative control update check after
a microcode reload.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 099878acd3738271fb2ade01f4649b1ed2fb72d5)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/kernel/cpu/microcode/core.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index 55086921d29e..638c08350d65 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -546,6 +546,13 @@ static ssize_t reload_store(struct device *dev,
if (ibpb_inuse)
sysctl_ibpb_enabled = 1;
mutex_unlock(&spec_ctrl_mutex);
+ } else if (boot_cpu_has(X86_FEATURE_IBPB)) {
+ printk_once(KERN_INFO "FEATURE IBPB Present\n");
+ mutex_lock(&spec_ctrl_mutex);
+ set_ibpb_supported();
+ if (ibpb_inuse)
+ sysctl_ibpb_enabled = 1;
+ mutex_unlock(&spec_ctrl_mutex);
}
mutex_unlock(&microcode_mutex);
--
2.14.2

View file

@ -0,0 +1,40 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tom Lendacky <thomas.lendacky@amd.com>
Date: Wed, 20 Dec 2017 10:55:47 +0000
Subject: [PATCH] KVM: SVM: Do not intercept new speculative control MSRs
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Allow guest access to the speculative control MSRs without being
intercepted.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit ccaa77a824fd3e21f0b8ae6b5a66fc1ee7e35b14)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/kvm/svm.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 92cd94d51e1f..94adf6becc2e 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -248,6 +248,8 @@ static const struct svm_direct_access_msrs {
{ .index = MSR_CSTAR, .always = true },
{ .index = MSR_SYSCALL_MASK, .always = true },
#endif
+ { .index = MSR_IA32_SPEC_CTRL, .always = true },
+ { .index = MSR_IA32_PRED_CMD, .always = true },
{ .index = MSR_IA32_LASTBRANCHFROMIP, .always = false },
{ .index = MSR_IA32_LASTBRANCHTOIP, .always = false },
{ .index = MSR_IA32_LASTINTFROMIP, .always = false },
--
2.14.2

View file

@ -0,0 +1,83 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tom Lendacky <thomas.lendacky@amd.com>
Date: Wed, 20 Dec 2017 10:55:47 +0000
Subject: [PATCH] x86/svm: Set IBRS value on VM entry and exit
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Set/restore the guests IBRS value on VM entry. On VM exit back to the
kernel save the guest IBRS value and then set IBRS to 1.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 72f71e6826fac9a656c3994fb6f979cd65a14c64)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/kvm/svm.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 94adf6becc2e..a1b19e810c49 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -175,6 +175,8 @@ struct vcpu_svm {
u64 next_rip;
+ u64 spec_ctrl;
+
u64 host_user_msrs[NR_HOST_SAVE_USER_MSRS];
struct {
u16 fs;
@@ -3547,6 +3549,9 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_VM_CR:
msr_info->data = svm->nested.vm_cr_msr;
break;
+ case MSR_IA32_SPEC_CTRL:
+ msr_info->data = svm->spec_ctrl;
+ break;
case MSR_IA32_UCODE_REV:
msr_info->data = 0x01000065;
break;
@@ -3702,6 +3707,9 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
case MSR_VM_IGNNE:
vcpu_unimpl(vcpu, "unimplemented wrmsr: 0x%x data 0x%llx\n", ecx, data);
break;
+ case MSR_IA32_SPEC_CTRL:
+ svm->spec_ctrl = data;
+ break;
case MSR_IA32_APICBASE:
if (kvm_vcpu_apicv_active(vcpu))
avic_update_vapic_bar(to_svm(vcpu), data);
@@ -4883,6 +4891,9 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
local_irq_enable();
+ if (ibrs_inuse && (svm->spec_ctrl != FEATURE_ENABLE_IBRS))
+ wrmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);
+
asm volatile (
"push %%" _ASM_BP "; \n\t"
"mov %c[rbx](%[svm]), %%" _ASM_BX " \n\t"
@@ -4975,6 +4986,12 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
#endif
);
+ if (ibrs_inuse) {
+ rdmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);
+ if (svm->spec_ctrl != FEATURE_ENABLE_IBRS)
+ wrmsrl(MSR_IA32_SPEC_CTRL, FEATURE_ENABLE_IBRS);
+ }
+
#ifdef CONFIG_X86_64
wrmsrl(MSR_GS_BASE, svm->host.gs_base);
#else
--
2.14.2

View file

@ -0,0 +1,73 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tom Lendacky <thomas.lendacky@amd.com>
Date: Wed, 20 Dec 2017 10:55:47 +0000
Subject: [PATCH] x86/svm: Set IBPB when running a different VCPU
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Set IBPB (Indirect Branch Prediction Barrier) when the current CPU is
going to run a VCPU different from what was previously run.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 0ba3eaabbb6666ebd344ee80534e58c375a00810)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/kvm/svm.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index a1b19e810c49..fade4869856a 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -518,6 +518,8 @@ struct svm_cpu_data {
struct kvm_ldttss_desc *tss_desc;
struct page *save_area;
+
+ struct vmcb *current_vmcb;
};
static DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
@@ -1685,11 +1687,19 @@ static void svm_free_vcpu(struct kvm_vcpu *vcpu)
__free_pages(virt_to_page(svm->nested.msrpm), MSRPM_ALLOC_ORDER);
kvm_vcpu_uninit(vcpu);
kmem_cache_free(kvm_vcpu_cache, svm);
+
+ /*
+ * The VMCB could be recycled, causing a false negative in svm_vcpu_load;
+ * block speculative execution.
+ */
+ if (ibpb_inuse)
+ wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
}
static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
+ struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
int i;
if (unlikely(cpu != vcpu->cpu)) {
@@ -1718,6 +1728,12 @@ static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
if (static_cpu_has(X86_FEATURE_RDTSCP))
wrmsrl(MSR_TSC_AUX, svm->tsc_aux);
+ if (sd->current_vmcb != svm->vmcb) {
+ sd->current_vmcb = svm->vmcb;
+ if (ibpb_inuse)
+ wrmsrl(MSR_IA32_PRED_CMD, FEATURE_SET_IBPB);
+ }
+
avic_vcpu_load(vcpu, cpu);
}
--
2.14.2

View file

@ -0,0 +1,63 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tom Lendacky <thomas.lendacky@amd.com>
Date: Wed, 20 Dec 2017 10:55:47 +0000
Subject: [PATCH] KVM: x86: Add speculative control CPUID support for guests
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Provide the guest with the speculative control CPUID related values.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit db7641e5f41cd517c4181ce90c4f9ecc93af4b2b)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/kvm/cpuid.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 19adbb418443..f64502d21a89 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -70,6 +70,7 @@ u64 kvm_supported_xcr0(void)
/* These are scattered features in cpufeatures.h. */
#define KVM_CPUID_BIT_AVX512_4VNNIW 2
#define KVM_CPUID_BIT_AVX512_4FMAPS 3
+#define KVM_CPUID_BIT_SPEC_CTRL 26
#define KF(x) bit(KVM_CPUID_BIT_##x)
int kvm_update_cpuid(struct kvm_vcpu *vcpu)
@@ -387,7 +388,12 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
/* cpuid 7.0.edx*/
const u32 kvm_cpuid_7_0_edx_x86_features =
- KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
+ KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS) |
+ KF(SPEC_CTRL);
+
+ /* cpuid 0x80000008.0.ebx */
+ const u32 kvm_cpuid_80000008_0_ebx_x86_features =
+ F(IBPB);
/* all calls to cpuid_count() should be made on the same cpu */
get_cpu();
@@ -622,7 +628,9 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
if (!g_phys_as)
g_phys_as = phys_as;
entry->eax = g_phys_as | (virt_as << 8);
- entry->ebx = entry->edx = 0;
+ entry->ebx &= kvm_cpuid_80000008_0_ebx_x86_features;
+ cpuid_mask(&entry->ebx, CPUID_8000_0008_EBX);
+ entry->edx = 0;
break;
}
case 0x80000019:
--
2.14.2

View file

@ -0,0 +1,39 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tom Lendacky <thomas.lendacky@amd.com>
Date: Wed, 20 Dec 2017 10:55:47 +0000
Subject: [PATCH] x86/svm: Add code to clobber the RSB on VM exit
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Add code to overwrite the local CPU RSB entries from the previous less
privileged mode.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 9392e24469b71ff665cdbc3d81db215f9383219d)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/kvm/svm.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index fade4869856a..e99bdfcc6b01 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5008,6 +5008,8 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
wrmsrl(MSR_IA32_SPEC_CTRL, FEATURE_ENABLE_IBRS);
}
+ stuff_RSB();
+
#ifdef CONFIG_X86_64
wrmsrl(MSR_GS_BASE, svm->host.gs_base);
#else
--
2.14.2

View file

@ -0,0 +1,71 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Tom Lendacky <thomas.lendacky@amd.com>
Date: Wed, 20 Dec 2017 10:55:48 +0000
Subject: [PATCH] x86/cpu/AMD: Remove now unused definition of MFENCE_RDTSC
feature
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
With the switch to using LFENCE_RDTSC on AMD platforms there is no longer
a need for the MFENCE_RDTSC feature. Remove it usage and definition.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 6e6c998937329e9d13d4b239233cd058e8a7730f)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/include/asm/barrier.h | 3 +--
arch/x86/include/asm/msr.h | 3 +--
arch/x86/net/bpf_jit_comp.c | 3 ---
3 files changed, 2 insertions(+), 7 deletions(-)
diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index aae78054cae2..d00432579444 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -23,8 +23,7 @@
#define wmb() asm volatile("sfence" ::: "memory")
#endif
-#define gmb() alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC, \
- "lfence", X86_FEATURE_LFENCE_RDTSC);
+#define gmb() alternative("", "lfence", X86_FEATURE_LFENCE_RDTSC);
#ifdef CONFIG_X86_PPRO_FENCE
#define dma_rmb() rmb()
diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 898dba2e2e2c..3139098269f6 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -213,8 +213,7 @@ static __always_inline unsigned long long rdtsc_ordered(void)
* that some other imaginary CPU is updating continuously with a
* time stamp.
*/
- alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC,
- "lfence", X86_FEATURE_LFENCE_RDTSC);
+ alternative("", "lfence", X86_FEATURE_LFENCE_RDTSC);
return rdtsc();
}
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 879dbfefb66d..e20e304320f9 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -116,9 +116,6 @@ static void emit_memory_barrier(u8 **pprog)
if (boot_cpu_has(X86_FEATURE_LFENCE_RDTSC))
/* x86 LFENCE opcode 0F AE E8 */
EMIT3(0x0f, 0xae, 0xe8);
- else if (boot_cpu_has(X86_FEATURE_MFENCE_RDTSC))
- /* AMD MFENCE opcode 0F AE F0 */
- EMIT3(0x0f, 0xae, 0xf0);
else
/* we should never end up here,
* but if we do, better not to emit anything*/
--
2.14.2

View file

@ -0,0 +1,44 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: William Grant <wgrant@ubuntu.com>
Date: Thu, 11 Jan 2018 17:05:42 -0600
Subject: [PATCH] UBUNTU: SAUCE: x86/kvm: Fix stuff_RSB() for 32-bit
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5753
CVE-2017-5715
Signed-off-by: William Grant <wgrant@ubuntu.com>
Acked-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
(cherry picked from commit 306dada4f850bf537dbd8ff06cf1522074b3f327)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/include/asm/kvm_host.h | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4117a97228a2..f39bc68efa56 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -223,9 +223,13 @@ static inline void stuff_RSB(void)
.label31: \n\
call .label32 \n\
pause \n\
-.label32: \n\
- add $(32*8), %%rsp \n\
-": : :"memory");
+.label32: \n"
+#ifdef CONFIG_X86_64
+" add $(32*8), %%rsp \n"
+#else
+" add $(32*4), %%esp \n"
+#endif
+: : :"memory");
}
enum kvm_reg {
--
2.14.2

View file

@ -0,0 +1,39 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx@linutronix.de>
Date: Wed, 3 Jan 2018 15:18:44 +0100
Subject: [PATCH] x86/pti: Enable PTI by default
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
CVE-2017-5754
This really want's to be enabled by default. Users who know what they are
doing can disable it either in the config or on the kernel command line.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
(cherry picked from commit 87faa0d9b43b4755ff6963a22d1fd1bee1aa3b39)
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
(cherry picked from commit 436cdbfed2112bea7943f4a0f6dfabf54088c8c6)
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
security/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/security/Kconfig b/security/Kconfig
index 91cb8f611a0d..529dccc22ce5 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -98,6 +98,7 @@ config SECURITY_NETWORK
config PAGE_TABLE_ISOLATION
bool "Remove the kernel mapping in user mode"
+ default y
depends on X86_64 && !UML
help
This feature reduces the number of hardware side channels by
--
2.14.2

View file

@ -0,0 +1,49 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Andrew Honig <ahonig@google.com>
Date: Wed, 10 Jan 2018 10:12:03 -0800
Subject: [PATCH] KVM: x86: Add memory barrier on vmcs field lookup
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
commit 75f139aaf896d6fdeec2e468ddfa4b2fe469bf40 upstream.
This adds a memory barrier when performing a lookup into
the vmcs_field_to_offset_table. This is related to
CVE-2017-5753.
Signed-off-by: Andrew Honig <ahonig@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/kvm/vmx.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index d2168203bddc..e6fa3df81fd8 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -882,8 +882,16 @@ static inline short vmcs_field_to_offset(unsigned long field)
{
BUILD_BUG_ON(ARRAY_SIZE(vmcs_field_to_offset_table) > SHRT_MAX);
- if (field >= ARRAY_SIZE(vmcs_field_to_offset_table) ||
- vmcs_field_to_offset_table[field] == 0)
+ if (field >= ARRAY_SIZE(vmcs_field_to_offset_table))
+ return -ENOENT;
+
+ /*
+ * FIXME: Mitigation for CVE-2017-5753. To be replaced with a
+ * generic mechanism.
+ */
+ asm("lfence");
+
+ if (vmcs_field_to_offset_table[field] == 0)
return -ENOENT;
return vmcs_field_to_offset_table[field];
--
2.14.2

View file

@ -0,0 +1,54 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Dave Hansen <dave.hansen@linux.intel.com>
Date: Sat, 6 Jan 2018 18:41:14 +0100
Subject: [PATCH] x86/tboot: Unbreak tboot with PTI enabled
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
commit 262b6b30087246abf09d6275eb0c0dc421bcbe38 upstream.
This is another case similar to what EFI does: create a new set of
page tables, map some code at a low address, and jump to it. PTI
mistakes this low address for userspace and mistakenly marks it
non-executable in an effort to make it unusable for userspace.
Undo the poison to allow execution.
Fixes: 385ce0ea4c07 ("x86/mm/pti: Add Kconfig")
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Jon Masters <jcm@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jeff Law <law@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
Cc: David" <dwmw@amazon.co.uk>
Cc: Nick Clifton <nickc@redhat.com>
Link: https://lkml.kernel.org/r/20180108102805.GK25546@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/kernel/tboot.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index a2486f444073..8337730f0956 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -127,6 +127,7 @@ static int map_tboot_page(unsigned long vaddr, unsigned long pfn,
p4d = p4d_alloc(&tboot_mm, pgd, vaddr);
if (!p4d)
return -1;
+ pgd->pgd &= ~_PAGE_NX;
pud = pud_alloc(&tboot_mm, p4d, vaddr);
if (!pud)
return -1;
--
2.14.2

View file

@ -0,0 +1,72 @@
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Peter Zijlstra <peterz@infradead.org>
Date: Sun, 14 Jan 2018 11:27:13 +0100
Subject: [PATCH] x86,perf: Disable intel_bts when PTI
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
commit 99a9dc98ba52267ce5e062b52de88ea1f1b2a7d8 upstream.
The intel_bts driver does not use the 'normal' BTS buffer which is exposed
through the cpu_entry_area but instead uses the memory allocated for the
perf AUX buffer.
This obviously comes apart when using PTI because then the kernel mapping;
which includes that AUX buffer memory; disappears. Fixing this requires to
expose a mapping which is visible in all context and that's not trivial.
As a quick fix disable this driver when PTI is enabled to prevent
malfunction.
Fixes: 385ce0ea4c07 ("x86/mm/pti: Add Kconfig")
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Reported-by: Robert Święcki <robert@swiecki.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: greg@kroah.com
Cc: hughd@google.com
Cc: luto@amacapital.net
Cc: Vince Weaver <vince@deater.net>
Cc: torvalds@linux-foundation.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180114102713.GB6166@worktop.programming.kicks-ass.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com>
---
arch/x86/events/intel/bts.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index ddd8d3516bfc..9a62e6fce0e0 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -582,6 +582,24 @@ static __init int bts_init(void)
if (!boot_cpu_has(X86_FEATURE_DTES64) || !x86_pmu.bts)
return -ENODEV;
+ if (boot_cpu_has(X86_FEATURE_PTI)) {
+ /*
+ * BTS hardware writes through a virtual memory map we must
+ * either use the kernel physical map, or the user mapping of
+ * the AUX buffer.
+ *
+ * However, since this driver supports per-CPU and per-task inherit
+ * we cannot use the user mapping since it will not be availble
+ * if we're not running the owning process.
+ *
+ * With PTI we can't use the kernal map either, because its not
+ * there when we run userspace.
+ *
+ * For now, disable this driver when using PTI.
+ */
+ return -ENODEV;
+ }
+
bts_pmu.capabilities = PERF_PMU_CAP_AUX_NO_SG | PERF_PMU_CAP_ITRACE |
PERF_PMU_CAP_EXCLUSIVE;
bts_pmu.task_ctx_nr = perf_sw_context;
--
2.14.2