  1. Jul 02, 2019
  2. Jun 22, 2019
  3. May 09, 2019
  4. Apr 17, 2019
    • x86/irq/64: Split the IRQ stack into its own pages · e6401c13
      Andy Lutomirski authored
      
      Currently, the IRQ stack is hardcoded as the first page of the percpu
      area, and the stack canary lives on the IRQ stack. The former gets in
      the way of adding an IRQ stack guard page, and the latter is a potential
      weakness in the stack canary mechanism.
      
      Split the IRQ stack into its own private percpu pages.
      
      [ tglx: Make 64 and 32 bit share struct irq_stack ]
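
      A minimal sketch of the resulting layout -- the struct and the
      page-aligned percpu backing store follow the series' naming, but treat
      the exact declarations as an illustration rather than the verbatim
      patch:

              /* One page-aligned percpu backing store per CPU, which can
               * now be preceded by a guard page: */
              struct irq_stack {
                      char stack[IRQ_STACK_SIZE];
              } __aligned(IRQ_STACK_SIZE);

              DECLARE_PER_CPU_PAGE_ALIGNED(struct irq_stack, irq_stack_backing_store);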
      
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Brijesh Singh <brijesh.singh@amd.com>
      Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Feng Tang <feng.tang@intel.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jan Beulich <JBeulich@suse.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Jordan Borgner <mail@jordan-borgner.de>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Maran Wilson <maran.wilson@oracle.com>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Nicolai Stange <nstange@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Pu Wen <puwen@hygon.cn>
      Cc: "Rafael Ávila de Espíndola" <rafael@espindo.la>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Stefano Stabellini <sstabellini@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: x86-ml <x86@kernel.org>
      Cc: xen-devel@lists.xenproject.org
      Link: https://lkml.kernel.org/r/20190414160146.267376656@linutronix.de
    • x86/irq/64: Rename irq_stack_ptr to hardirq_stack_ptr · 758a2e31
      Thomas Gleixner authored
      
      Preparatory patch to share code with 32-bit.
      
      No functional changes.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Nicolai Stange <nstange@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Pingfan Liu <kernelfans@gmail.com>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20190414160145.912584074@linutronix.de
    • x86/exceptions: Split debug IST stack · 2a594d4c
      Thomas Gleixner authored
      
      The debug IST stack is actually two separate debug stacks to handle #DB
      recursion. This is required because the CPU always starts at the top of
      the stack on exception entry, which means that on #DB recursion the
      second #DB would overwrite the stack of the first.
      
      The low level entry code therefore adjusts the top of stack on entry so a
      secondary #DB starts from a different stack page. But the stack pages are
      adjacent without a guard page between them.
      
      Split the debug stack into 3 stacks which are separated by guard pages. The
      3rd stack is never mapped into the cpu_entry_area and is only there to
      catch triple #DB nesting:
      
            --- top of DB_stack	<- Initial stack
            --- end of DB_stack
            	  guard page
      
            --- top of DB1_stack	<- Top of stack after entering first #DB
            --- end of DB1_stack
            	  guard page
      
            --- top of DB2_stack	<- Top of stack after entering second #DB
            --- end of DB2_stack
            	  guard page
      
      If DB2 did not act as the final guard hole, a second #DB would point the
      top of the #DB stack to the stack below DB1, which would be valid and
      would not catch the undesired triple nesting.
      
      The backing store does not allocate any memory for DB2 and its guard page
      as it is not going to be mapped into the cpu_entry_area.
      
       - Adjust the low level entry code so it adjusts top of #DB with the offset
         between the stacks instead of exception stack size.
      
       - Make the dumpstack code aware of the new stacks.
      
       - Adjust the in_debug_stack() implementation and move it into the NMI
         code where it belongs. As this is NMI hot path code, it just checks
         the full area between the top of DB_stack and the bottom of DB1_stack
         without checking for the guard page (see the sketch below). That's
         correct because the NMI cannot hit a stack pointer pointing to the
         guard page between the DB and DB1 stacks. Even if it did, the NMI
         operation itself would still be unaffected, but the resume of the
         debug exception on the topmost DB stack would crash by touching the
         guard page.
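
      A hedged sketch of that NMI-path range check, modelled on the
      cpu_entry_area stack accessors (identifier names are assumptions, not
      the verbatim code):

              static bool is_debug_stack(unsigned long addr)
              {
                      struct cea_exception_stacks *s = __this_cpu_read(cea_exception_stacks);
                      unsigned long top = CEA_ESTACK_TOP(s, DB);
                      unsigned long bot = CEA_ESTACK_BOT(s, DB1);

                      /*
                       * Treat the whole range from the bottom of DB1_stack
                       * to the top of DB_stack, guard page included, as
                       * debug stack. The NMI never observes a stack pointer
                       * inside that guard page.
                       */
                      return addr >= bot && addr < top;
              }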
      
        [ bp: Make exception_stack_names static const char * const ]
      
      Suggested-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: linux-doc@vger.kernel.org
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20190414160145.439944544@linutronix.de
    • x86/exceptions: Disconnect IST index and stack order · 32074269
      Thomas Gleixner authored
      
      The entry order of the TSS.IST array and the order of the stack
      storage/mapping are not required to be the same.
      
      With the upcoming split of the debug stack this is going to fall apart
      as the number of TSS.IST array entries stays the same while the number
      of actual stacks increases.
      
      Make them separate so that code like dumpstack can just utilize the mapping
      order. The IST index is solely required for the actual TSS.IST array
      initialization.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Nicolai Stange <nstange@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qian Cai <cai@lca.pw>
      Cc: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20190414160145.241588113@linutronix.de
    • x86/exceptions: Make IST index zero based · 8f34c5b5
      Thomas Gleixner authored
      
      The defines for the exception stack (IST) array in the TSS are using the
      SDM convention IST1 - IST7. That causes all sorts of code to subtract 1 for
      array indices related to IST. That's confusing at best and does not provide
      any value.
      
      Make the indices zero-based and fix up the usage sites. The only code
      which needs to adjust the zero-based index is the interrupt descriptor
      setup, which now needs to add 1.
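
      A sketch of the convention change; the IST_INDEX_* values match the
      series, while the helper wrapping the +1 conversion is purely
      hypothetical:

              /* Zero-based kernel indices into the TSS.IST array: */
              #define IST_INDEX_DF    0
              #define IST_INDEX_NMI   1
              #define IST_INDEX_DB    2
              #define IST_INDEX_MCE   3

              /* Only the interrupt descriptor setup converts back to the
               * SDM's one-based IST1..IST7 numbering: */
              static inline unsigned int ist_descriptor_field(unsigned int index)
              {
                      return index + 1;
              }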
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: linux-doc@vger.kernel.org
      Cc: Nicolai Stange <nstange@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Qian Cai <cai@lca.pw>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20190414160144.331772825@linutronix.de
  5. Apr 05, 2019
  6. Apr 03, 2019
    • sched/x86_64: Don't save flags on context switch · 64604d54
      Peter Zijlstra authored
      
      Now that we have objtool validating AC=1 state for all x86_64 code,
      we can once again guarantee clean flags on schedule.
      
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • sched/x86: Save [ER]FLAGS on context switch · 6690e86b
      Peter Zijlstra authored
      
      Effectively reverts commit:
      
        2c7577a7 ("sched/x86_64: Don't save flags on context switch")
      
      Specifically because SMAP uses FLAGS.AC which invalidates the claim
      that the kernel has clean flags.
      
      In particular, while preemption from interrupt return is fine (the
      IRET frame on the exception stack contains FLAGS), it breaks any code
      that does synchronous scheduling, including preempt_enable().
      
      This has become a significant issue ever since commit:
      
        5b24a7a2 ("Add 'unsafe' user access functions for batched accesses")
      
      provided a means of having 'normal' C code between STAC / CLAC,
      exposing the FLAGS.AC state. So far this hasn't led to trouble, but
      fix it before it comes apart.
      
      Reported-by: Julien Thierry <julien.thierry@arm.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Acked-by: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@kernel.org
      Fixes: 5b24a7a2 ("Add 'unsafe' user access functions for batched accesses")
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  7. Dec 06, 2018
    • kprobes/x86: Blacklist non-attachable interrupt functions · a50480cb
      Andrea Righi authored
      
      These interrupt functions are already non-attachable by kprobes.
      Blacklist them explicitly so that they can show up in
      /sys/kernel/debug/kprobes/blacklist and tools like BCC can use this
      additional information.
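
      The annotation used for this is the kernel's standard NOKPROBE_SYMBOL()
      macro; the handler named below is one illustrative example of the
      pattern, not a quote of the patch:

              /* Explicitly blacklist the handler so that it shows up in
               * /sys/kernel/debug/kprobes/blacklist: */
              NOKPROBE_SYMBOL(smp_apic_timer_interrupt);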
      
      Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Yonghong Song <yhs@fb.com>
      Link: http://lkml.kernel.org/r/20181206095648.GA8249@Dell
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  8. Oct 17, 2018
  9. Oct 14, 2018
  10. Sep 13, 2018
    • x86/pti/64: Remove the SYSCALL64 entry trampoline · bf904d27
      Andy Lutomirski authored
      
      The SYSCALL64 trampoline has a couple of nice properties:
      
       - The usual sequence of SWAPGS followed by two GS-relative accesses to
         set up RSP is somewhat slow because the GS-relative accesses need
         to wait for SWAPGS to finish.  The trampoline approach allows
         RIP-relative accesses to set up RSP, which avoids the stall.
      
       - The trampoline avoids any percpu access before CR3 is set up,
         which means that no percpu memory needs to be mapped in the user
         page tables.  This prevents using Meltdown to read any percpu memory
         outside the cpu_entry_area and prevents using timing leaks
         to directly locate the percpu areas.
      
      The downsides of using a trampoline may outweigh the upsides, however.
      It adds an extra non-contiguous I$ cache line to system calls, and it
      forces an indirect jump to transfer control back to the normal kernel
      text after CR3 is set up.  The latter is because x86 lacks a 64-bit
      direct jump instruction that could jump from the trampoline to the entry
      text.  With retpolines enabled, the indirect jump is extremely slow.
      
      Change the code to map the percpu TSS into the user page tables to allow
      the non-trampoline SYSCALL64 path to work under PTI.  This does not add a
      new direct information leak, since the TSS is readable by Meltdown from the
      cpu_entry_area alias regardless.  It does allow a timing attack to locate
      the percpu area, but KASLR is more or less a lost cause against local
      attack on CPUs vulnerable to Meltdown regardless.  As far as I'm concerned,
      on current hardware, KASLR is only useful to mitigate remote attacks that
      try to attack the kernel without first gaining RCE against a vulnerable
      user process.
      
      On Skylake, with CONFIG_RETPOLINE=y and KPTI on, this reduces syscall
      overhead from ~237ns to ~228ns.
      
      There is a possible alternative approach: move the trampoline within 2G
      of the entry text and make a separate copy for each CPU.  This would
      allow a direct jump to rejoin the normal entry path. There are pros and
      cons to this approach:
      
       + It avoids a pipeline stall
      
       - It executes from an extra page and reads from another extra page
         during the syscall. The latter is because it needs to use a relative
         addressing mode to find sp1 -- it's the same *cacheline*, but
         accessed using an alias, so it's an extra TLB entry.
      
       - Slightly more memory. This would be one page per CPU for a simple
         implementation and 64-ish bytes per CPU or one page per node for a more
         complex implementation.
      
       - More code complexity.
      
      The current approach was chosen for simplicity and because the
      alternative does not provide a significant enough benefit to be worth
      the added complexity.
      
      [ tglx: Added the alternative discussion to the changelog ]
      
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Borislav Petkov <bp@suse.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: https://lkml.kernel.org/r/8c7c6e483612c3e4e10ca89495dc160b1aa66878.1536015544.git.luto@kernel.org
  11. Sep 08, 2018
  12. Sep 05, 2018
    • x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls · afaef01c
      Alexander Popov authored
      The STACKLEAK feature (initially developed by PaX Team) has the following
      benefits:
      
      1. Reduces the information that can be revealed through kernel stack leak
         bugs. The idea of erasing the thread stack at the end of syscalls is
         similar to CONFIG_PAGE_POISONING and memzero_explicit() in kernel
         crypto, which all comply with FDP_RIP.2 (Full Residual Information
         Protection) of the Common Criteria standard.
      
      2. Blocks some uninitialized stack variable attacks (e.g. CVE-2017-17712,
         CVE-2010-2963). Those kinds of bugs should eventually be eliminated
         by improving C compilers, which might take a long time.
      
      This commit introduces the code that fills the used part of the kernel
      stack with a poison value before returning to userspace. The full
      STACKLEAK feature also contains the gcc plugin, which comes in a
      separate commit.
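
      A heavily simplified sketch of the erasing pass. The real
      stackleak_erase() tracks the deepest stack pointer reached (with help
      from the gcc plugin); everything here except STACKLEAK_POISON should
      be read as illustrative:

              #define STACKLEAK_POISON -0xBEEF

              static void erase_kstack(unsigned long lowest_sp, unsigned long top)
              {
                      unsigned long *p = (unsigned long *)lowest_sp;

                      /* Fill the used part of the thread stack with the
                       * poison value before returning to userspace. */
                      while ((unsigned long)p < top)
                              *p++ = STACKLEAK_POISON;
              }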
      
      The STACKLEAK feature is ported from grsecurity/PaX. More information at:
        https://grsecurity.net/
        https://pax.grsecurity.net/
      
      
      
      This code is modified from Brad Spengler/PaX Team's code in the last
      public patch of grsecurity/PaX based on our understanding of the code.
      Changes or omissions from the original code are ours and don't reflect
      the original grsecurity/PaX code.
      
      Performance impact:
      
      Hardware: Intel Core i7-4770, 16 GB RAM
      
      Test #1: building the Linux kernel on a single core
              0.91% slowdown
      
      Test #2: hackbench -s 4096 -l 2000 -g 15 -f 25 -P
              4.2% slowdown
      
      So the STACKLEAK description in Kconfig includes: "The tradeoff is the
      performance impact: on a single CPU system kernel compilation sees a 1%
      slowdown, other systems and workloads may vary and you are advised to
      test this feature on your expected workload before deploying it".
      
      Signed-off-by: Alexander Popov <alex.popov@linux.com>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Kees Cook <keescook@chromium.org>
  13. Sep 03, 2018
  14. Jul 24, 2018
    • x86/entry/64: Remove %ebx handling from error_entry/exit · b3681dd5
      Andy Lutomirski authored
      
      error_entry and error_exit communicate the user vs. kernel status of
      the frame using %ebx.  This is unnecessary -- the information is in
      regs->cs.  Just use regs->cs.
      
      This makes error_entry simpler and makes error_exit more robust.
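
      In C terms, the distinction the %ebx flag used to carry is already
      derivable from the saved CS; user_mode() is the kernel's existing
      helper, the wrapper is only for illustration:

              static inline bool frame_from_user(struct pt_regs *regs)
              {
                      /* The low two bits of the saved CS hold the RPL;
                       * 3 means the frame was pushed from user mode. */
                      return user_mode(regs);
              }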
      
      It also fixes a nasty bug.  Before all the Spectre nonsense, the
      xen_failsafe_callback entry point returned like this:
      
              ALLOC_PT_GPREGS_ON_STACK
              SAVE_C_REGS
              SAVE_EXTRA_REGS
              ENCODE_FRAME_POINTER
              jmp     error_exit
      
      And it did not go through error_entry.  This was bogus: RBX
      contained garbage, and error_exit expected a flag in RBX.
      
      Fortunately, it generally contained *nonzero* garbage, so the
      correct code path was used.  As part of the Spectre fixes, code was
      added to clear RBX to mitigate certain speculation attacks. Now,
      depending on the kernel configuration, RBX gets zeroed and, when
      running some Wine workloads, the kernel crashes. This was introduced by:
      
          commit 3ac6d8c7 ("x86/entry/64: Clear registers for exceptions/interrupts, to reduce speculation attack surface")
      
      With this patch applied, RBX is no longer needed as a flag, and the
      problem goes away.
      
      I suspect that malicious userspace could use this bug to crash the
      kernel even without the offending patch applied, though.
      
      [ Historical note: I wrote this patch as a cleanup before I was aware
        of the bug it fixed. ]
      
      [ Note to stable maintainers: this should probably get applied to all
        kernels.  If you're nervous about that, a more conservative fix to
        add xorl %ebx,%ebx; incl %ebx before the jump to error_exit should
        also fix the problem. ]
      
      Reported-and-tested-by: M. Vefa Bicakci <m.v.b@runbox.com>
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dominik Brodowski <linux@dominikbrodowski.net>
      Cc: Greg KH <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Cc: xen-devel@lists.xenproject.org
      Fixes: 3ac6d8c7 ("x86/entry/64: Clear registers for exceptions/interrupts, to reduce speculation attack surface")
      Link: http://lkml.kernel.org/r/b5010a090d3586b2d6e06c7ad3ec5542d1241c45.1532282627.git.luto@kernel.org
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  15. Jul 03, 2018
    • x86/entry/64: Add two more instruction suffixes · 6709812f
      Jan Beulich authored
      
      Sadly, contrary to what was claimed in:
      
        a368d7fd ("x86/entry/64: Add instruction suffix")
      
      ... there are two more instances which want to be adjusted.
      
      As said there, omitting suffixes from instructions in AT&T mode is bad
      practice when operand size cannot be determined by the assembler from
      register operands, and is likely going to be warned about by upstream
      gas in the future (mine does already).
      
      Add the other missing suffixes here as well.
      
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/5B3A02DD02000078001CFB78@prv1-mh.provo.novell.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  16. Jun 21, 2018
    • x86/unwind/orc: Detect the end of the stack · d31a5802
      Josh Poimboeuf authored
      
      The existing UNWIND_HINT_EMPTY annotations happen to be good indicators
      of where entry code calls into C code for the first time.  So also use
      them to mark the end of the stack for the ORC unwinder.
      
      Use that information to set unwind->error if the ORC unwinder doesn't
      unwind all the way to the end.  This will be needed for enabling
      HAVE_RELIABLE_STACKTRACE for the ORC unwinder so we can use it with the
      livepatch consistency model.
      
      Thanks to Jiri Slaby for teaching the ORCs about the unwind hints.
      
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http...
  17. Jun 14, 2018
    • Kbuild: rename CC_STACKPROTECTOR[_STRONG] config variables · 050e9baa
      Linus Torvalds authored
      
      The changes to automatically test for working stack protector compiler
      support in the Kconfig files removed the special STACKPROTECTOR_AUTO
      option that picked the strongest stack protector that the compiler
      supported.
      
      That was all a nice cleanup - it makes no sense to have the AUTO case
      now that the Kconfig phase can just determine the compiler support
      directly.
      
      HOWEVER.
      
      It also meant that doing "make oldconfig" would now _disable_ the strong
      stackprotector if you had AUTO enabled, because in a legacy config file,
      the sane stack protector configuration would look like
      
        CONFIG_HAVE_CC_STACKPROTECTOR=y
        # CONFIG_CC_STACKPROTECTOR_NONE is not set
        # CONFIG_CC_STACKPROTECTOR_REGULAR is not set
        # CONFIG_CC_STACKPROTECTOR_STRONG is not set
        CONFIG_CC_STACKPROTECTOR_AUTO=y
      
      and when you ran this through "make oldconfig" with the Kbuild changes,
      it would ask you about the regular CONFIG_CC_STACKPROTECTOR (that had
      been renamed from CONFIG_CC_STACKPROTECTOR_REGULAR to just
      CONFIG_CC_STACKPROTECTOR), but it would think that the STRONG version
      used to be disabled (because it was really enabled by AUTO), and would
      disable it in the new config, resulting in:
      
        CONFIG_HAVE_CC_STACKPROTECTOR=y
        CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
        CONFIG_CC_STACKPROTECTOR=y
        # CONFIG_CC_STACKPROTECTOR_STRONG is not set
        CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
      
      That's dangerously subtle - people could suddenly find themselves with
      the weaker stack protector setup without even realizing.
      
      The solution here is to rename not just the old REGULAR stack
      protector option, but also the strong one.  This does that by just
      removing the CC_ prefix entirely for the user choices, because it really
      is not about the compiler support (the compiler support now instead
      automatically impacts _visibility_ of the options to users).
      
      This results in "make oldconfig" actually asking the user for their
      choice, so that we don't have any silent subtle security model changes.
      The end result would generally look like this:
      
        CONFIG_HAVE_CC_STACKPROTECTOR=y
        CONFIG_CC_HAS_STACKPROTECTOR_NONE=y
        CONFIG_STACKPROTECTOR=y
        CONFIG_STACKPROTECTOR_STRONG=y
        CONFIG_CC_HAS_SANE_STACKPROTECTOR=y
      
      where the "CC_" versions really are about internal compiler
      infrastructure, not the user selections.
      
      Acked-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  18. Apr 10, 2018
  19. Apr 05, 2018
  20. Mar 24, 2018
  21. Mar 07, 2018
    • Drivers: hv: vmbus: Implement Direct Mode for stimer0 · 248e742a
      Michael Kelley authored
      
      The 2016 version of Hyper-V offers the option to operate the guest VM's
      per-vcpu stimers in Direct Mode, which means the timer interrupts on
      its own vector rather than queueing a VMbus message. Direct Mode
      reduces timer processing overhead in both the hypervisor and the guest,
      and avoids having timer interrupts pollute the VMbus interrupt stream
      for the synthetic NIC and storage.  This patch enables Direct Mode by
      default on stimer0 when running on a version of Hyper-V that supports
      it.
      
      In preparation for upcoming support of Hyper-V on ARM64, the
      arch-independent portion of the code contains calls to routines that
      will be populated on ARM64 but are not needed and do nothing on x86.
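
      A sketch of the resulting x86 no-op stubs (names modelled on the patch;
      treat them as assumptions):

              /* stimer0 Direct Mode needs no percpu IRQ plumbing on x86, so
               * the arch hooks invoked by the arch-independent code do
               * nothing: */
              static inline void hv_enable_stimer0_percpu_irq(int irq) { }
              static inline void hv_disable_stimer0_percpu_irq(int irq) { }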
      
      Signed-off-by: Michael Kelley <mikelley@microsoft.com>
      Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  22. Feb 28, 2018
  23. Feb 21, 2018
    • x86/entry/64: Open-code switch_to_thread_stack() · f3d415ea
      Dominik Brodowski authored
      
      Open-code the two instances which called switch_to_thread_stack(). This
      allows us to remove the wrapper around DO_SWITCH_TO_THREAD_STACK.
      
      While at it, update the UNWIND hint to reflect where the IRET frame is,
      and update the commentary to reflect what we are actually doing here.
      
      Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: dan.j.williams@intel.com
      Link: http://lkml.kernel.org/r/20180220210113.6725-7-linux@dominikbrodowski.net
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/entry/64: Move ASM_CLAC to interrupt_entry() · b2855d8d
      Dominik Brodowski authored
      
      Moving ASM_CLAC to interrupt_entry means two instructions (addq / pushq
      and call interrupt_entry) are not covered by it. However, it offers a
      noticeable size reduction (-.2k):
      
         text	   data	    bss	    dec	    hex	filename
        16882	      0	      0	  16882	   41f2	entry_64.o-orig
        16623	      0	      0	  16623	   40ef	entry_64.o
      
      Suggested-by: Brian Gerst <brgerst@gmail.com>
      Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: dan.j.williams@intel.com
      Link: http://lkml.kernel.org/r/20180220210113.6725-6-linux@dominikbrodowski.net
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/entry/64: Remove 'interrupt' macro · 3aa99fc3
      Dominik Brodowski authored
      
      It is now trivial to call interrupt_entry() and then the actual worker.
      Therefore, remove the interrupt macro and open-code it all.
      
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: dan.j.williams@intel.com
      Link: http://lkml.kernel.org/r/20180220210113.6725-5-linux@dominikbrodowski.net
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/entry/64: Move the switch_to_thread_stack() call to interrupt_entry() · 90a6acc4
      Dominik Brodowski authored
      
      We can also move the CLD, SWAPGS, and the switch_to_thread_stack() call
      to the interrupt_entry() helper function. As we do not want call depths
      of two, convert switch_to_thread_stack() to a macro.
      
      However, switch_to_thread_stack() has another user in entry_64_compat.S,
      which currently expects it to be a function. To keep the code changes
      in this patch minimal, create a wrapper function.
      
      The switch to a macro means that there is some binary code duplication
      if CONFIG_IA32_EMULATION=y is enabled. Therefore, the size reduction
      differs depending on whether CONFIG_IA32_EMULATION is enabled:
      
      CONFIG_IA32_EMULATION=y (-0.13k):
         text	   data	    bss	    dec	    hex	filename
        17158	      0	      0	  17158	   4306	entry_64.o-orig
        17028	      0	      0	  17028	   4284	entry_64.o
      
      CONFIG_IA32_EMULATION=n (-0.27k):
         text	   data	    bss	    dec	    hex	filename
        17158	      0	      0	  17158	   4306	entry_64.o-orig
        16882	      0	      0	  16882	   41f2	entry_64.o
      
      Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: dan.j.williams@intel.com
      Link: http://lkml.kernel.org/r/20180220210113.6725-4-linux@dominikbrodowski.net
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/entry/64: Move ENTER_IRQ_STACK from interrupt macro to interrupt_entry · 2ba64741
      Dominik Brodowski authored
      
      Moving the switch to IRQ stack from the interrupt macro to the helper
      function requires some trickery: All ENTER_IRQ_STACK really cares about
      is where the "original" stack -- meaning the GP registers etc. -- is
      stored. Therefore, we need to offset the stored RSP value by 8 whenever
      ENTER_IRQ_STACK is called from within a function. In such cases, and
      after switching to the IRQ stack, we need to push the "original" return
      address (i.e. the return address from the call to the interrupt entry
      function) to the IRQ stack.
      
      This trickery allows us to carve another .85k from the text size (it
      would be more except for the additional unwind hints):
      
         text	   data	    bss	    dec	    hex	filename
        18006	      0	      0	  18006	   4656	entry_64.o-orig
        17158	      0	      0	  17158	   4306	entry_64.o
      
      Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: dan.j.williams@intel.com
      Link: http://lkml.kernel.org/r/20180220210113.6725-3-linux@dominikbrodowski.net
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/entry/64: Move PUSH_AND_CLEAR_REGS from interrupt macro to helper function · 0e34d226
      Dominik Brodowski authored
      
      The PUSH_AND_CLEAR_REGS macro is able to insert the GP registers
      "above" the original return address. This allows us to move a sizeable
      part of the interrupt entry macro to an interrupt entry helper function:
      
         text	   data	    bss	    dec	    hex	filename
        21088	      0	      0	  21088	   5260	entry_64.o-orig
        18006	      0	      0	  18006	   4656	entry_64.o
      
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: dan.j.williams@intel.com
      Link: http://lkml.kernel.org/r/20180220210113.6725-2-linux@dominikbrodowski.net
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/mm: Optimize boot-time paging mode switching cost · 39b95522
      Kirill A. Shutemov authored
      
      By this point we have functioning boot-time switching between 4- and
      5-level paging mode. But the naive approach comes with a cost.
      
      Numbers below are for kernel build, allmodconfig, 5 times.
      
      CONFIG_X86_5LEVEL=n:
      
       Performance counter stats for 'sh -c make -j100 -B -k >/dev/null' (5 runs):
      
         17308719.892691      task-clock:u (msec)       #   26.772 CPUs utilized            ( +-  0.11% )
                       0      context-switches:u        #    0.000 K/sec
                       0      cpu-migrations:u          #    0.000 K/sec
             331,993,164      page-faults:u             #    0.019 M/sec                    ( +-  0.01% )
      43,614,978,867,455      cycles:u                  #    2.520 GHz                      ( +-  0.01% )
      39,371,534,575,126      stalled-cycles-frontend:u #   90.27% frontend cycles idle     ( +-  0.09% )
      28,363,350,152,428      instructions:u            #    0.65  insn per cycle
                                                        #    1.39  stalled cycles per insn  ( +-  0.00% )
       6,316,784,066,413      branches:u                #  364.948 M/sec                    ( +-  0.00% )
         250,808,144,781      branch-misses:u           #    3.97% of all branches          ( +-  0.01% )
      
           646.531974142 seconds time elapsed                                          ( +-  1.15% )
      
      CONFIG_X86_5LEVEL=y:
      
       Performance counter stats for 'sh -c make -j100 -B -k >/dev/null' (5 runs):
      
         17411536.780625      task-clock:u (msec)       #   26.426 CPUs utilized            ( +-  0.10% )
                       0      context-switches:u        #    0.000 K/sec
                       0      cpu-migrations:u          #    0.000 K/sec
             331,868,663      page-faults:u             #    0.019 M/sec                    ( +-  0.01% )
      43,865,909,056,301      cycles:u                  #    2.519 GHz                      ( +-  0.01% )
      39,740,130,365,581      stalled-cycles-frontend:u #   90.59% frontend cycles idle     ( +-  0.05% )
      28,363,358,997,959      instructions:u            #    0.65  insn per cycle
                                                        #    1.40  stalled cycles per insn  ( +-  0.00% )
       6,316,784,937,460      branches:u                #  362.793 M/sec                    ( +-  0.00% )
         251,531,919,485      branch-misses:u           #    3.98% of all branches          ( +-  0.00% )
      
           658.886307752 seconds time elapsed                                          ( +-  0.92% )
      
      The patch tries to fix the performance regression by using
      cpu_feature_enabled(X86_FEATURE_LA57) instead of pgtable_l5_enabled in
      all hot code paths. This statically patches the target code at boot for
      additional performance.
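
      A sketch of the hot-path accessor this enables; cpu_feature_enabled()
      and X86_FEATURE_LA57 are the real identifiers, the wrapper shape is an
      assumption:

              static __always_inline bool pgtable_l5_enabled(void)
              {
                      /* cpu_feature_enabled() compiles down to a boot-time
                       * patched branch, avoiding a memory load in hot
                       * paths. */
                      return cpu_feature_enabled(X86_FEATURE_LA57);
              }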
      
      CONFIG_X86_5LEVEL=y + the patch:
      
       Performance counter stats for 'sh -c make -j100 -B -k >/dev/null' (5 runs):
      
         17381990.268506      task-clock:u (msec)       #   26.907 CPUs utilized            ( +-  0.19% )
                       0      context-switches:u        #    0.000 K/sec
                       0      cpu-migrations:u          #    0.000 K/sec
             331,862,625      page-faults:u             #    0.019 M/sec                    ( +-  0.01% )
      43,697,726,320,051      cycles:u                  #    2.514 GHz                      ( +-  0.03% )
      39,480,408,690,401      stalled-cycles-frontend:u #   90.35% frontend cycles idle     ( +-  0.05% )
      28,363,394,221,388      instructions:u            #    0.65  insn per cycle
                                                        #    1.39  stalled cycles per insn  ( +-  0.00% )
       6,316,794,985,573      branches:u                #  363.410 M/sec                    ( +-  0.00% )
         251,013,232,547      branch-misses:u           #    3.97% of all branches          ( +-  0.01% )
      
           645.991174661 seconds time elapsed                                          ( +-  1.19% )
      
      Unfortunately, this approach doesn't help with text size:
      
        vmlinux.before .text size:	8190319
        vmlinux.after .text size:	8200623
      
      The .text section is increased by about 10k (8200623 - 8190319 = 10304
      bytes). Not sure if we can do anything about this.
      
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-mm@kvack.org
      Link: http://lkml.kernel.org/r/20180216114948.68868-4-kirill.shutemov@linux.intel.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  24. Feb 20, 2018
  25. Feb 17, 2018
  26. Feb 15, 2018
    • x86/entry/64: Fix CR3 restore in paranoid_exit() · e4865757
      Ingo Molnar authored
      
      Josh Poimboeuf noticed the following bug:
      
       "The paranoid exit code only restores the saved CR3 when it switches back
        to the user GS.  However, even in the kernel GS case, it's possible that
        it needs to restore a user CR3, if for example, the paranoid exception
        occurred in the syscall exit path between SWITCH_TO_USER_CR3_STACK and
        SWAPGS."
      
      Josh also confirmed via targeted testing that it's possible to hit this bug.
      
      Fix the bug by also restoring CR3 in the paranoid_exit_no_swapgs branch.
      
      The reason we haven't seen this bug reported by users yet is probably because
      "paranoid" entry points are limited to the following cases:
      
       idtentry double_fault       do_double_fault  has_error_code=1  paranoid=2
       idtentry debug              do_debug         has_error_code=0  paranoid=1 shift_ist=DEBUG_STACK
       idtentry int3               do_int3          has_error_code=0  paranoid=1 shift_ist=DEBUG_STACK
       idtentry machine_check      do_mce           has_error_code=0  paranoid=1
      
      Amongst those entry points, only machine_check will interrupt an
      IRQS-off critical section asynchronously - and machine check events are
      rare.
      
      The other main asynchronous entries are NMI entries, which can be very
      high-frequency with perf profiling, but they are special: they don't
      use the 'idtentry' macro, are open-coded, and restore user CR3
      unconditionally, so they don't have this bug.
      
      Reported-and-tested-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Reviewed-by: Andy Lutomirski <luto@kernel.org>
      Acked-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180214073910.boevmg65upbk3vqb@gmail.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  27. Feb 14, 2018