Skip to content
Snippets Groups Projects
  1. Jun 06, 2018
  2. May 28, 2018
  3. May 26, 2018
  4. May 25, 2018
  5. May 24, 2018
    • Wei Huang's avatar
      KVM: x86: Update cpuid properly when CR4.OSXAVE or CR4.PKE is changed · c4d21882
      Wei Huang authored
      
      The CPUID bits of OSXSAVE (function=0x1) and OSPKE (func=0x7, leaf=0x0)
      allows user apps to detect if OS has set CR4.OSXSAVE or CR4.PKE. KVM is
      supposed to update these CPUID bits when CR4 is updated. Current KVM
      code doesn't handle some special cases when updates come from emulator.
      Here is one example:
      
        Step 1: guest boots
        Step 2: guest OS enables XSAVE ==> CR4.OSXSAVE=1 and CPUID.OSXSAVE=1
        Step 3: guest hot reboot ==> QEMU reset CR4 to 0, but CPUID.OSXAVE==1
        Step 4: guest os checks CPUID.OSXAVE, detects 1, then executes xgetbv
      
      Step 4 above will cause an #UD and guest crash because guest OS hasn't
      turned on OSXAVE yet. This patch solves the problem by comparing the the
      old_cr4 with cr4. If the related bits have been changed,
      kvm_update_cpuid() needs to be called.
      
      Signed-off-by: default avatarWei Huang <wei@redhat.com>
      Reviewed-by: default avatarBandan Das <bsd@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      c4d21882
    • David Vrabel's avatar
      x86/kvm: fix LAPIC timer drift when guest uses periodic mode · d8f2f498
      David Vrabel authored
      
      Since 4.10, commit 8003c9ae (KVM: LAPIC: add APIC Timer
      periodic/oneshot mode VMX preemption timer support), guests using
      periodic LAPIC timers (such as FreeBSD 8.4) would see their timers
      drift significantly over time.
      
      Differences in the underlying clocks and numerical errors means the
      periods of the two timers (hv and sw) are not the same. This
      difference will accumulate with every expiry resulting in a large
      error between the hv and sw timer.
      
      This means the sw timer may be running slow when compared to the hv
      timer. When the timer is switched from hv to sw, the now active sw
      timer will expire late. The guest VCPU is reentered and it switches to
      using the hv timer. This timer catches up, injecting multiple IRQs
      into the guest (of which the guest only sees one as it does not get to
      run until the hv timer has caught up) and thus the guest's timer rate
      is low (and becomes increasing slower over time as the sw timer lags
      further and further behind).
      
      I believe a similar problem would occur if the hv timer is the slower
      one, but I have not observed this.
      
      Fix this by synchronizing the deadlines for both timers to the same
      time source on every tick. This prevents the errors from accumulating.
      
      Fixes: 8003c9ae
      Cc: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@nutanix.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Reviewed-by: default avatarWanpeng Li <wanpengli@tencent.com>
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      d8f2f498
  6. May 23, 2018
  7. May 19, 2018
    • Borislav Petkov's avatar
      x86/MCE/AMD: Cache SMCA MISC block addresses · 78ce2410
      Borislav Petkov authored
      
      ... into a global, two-dimensional array and service subsequent reads from
      that cache to avoid rdmsr_on_cpu() calls during CPU hotplug (IPIs with IRQs
      disabled).
      
      In addition, this fixes a KASAN slab-out-of-bounds read due to wrong usage
      of the bank->blocks pointer.
      
      Fixes: 27bd5950 ("x86/mce/AMD: Get address from already initialized block")
      Reported-by: default avatarJohannes Hirte <johannes.hirte@datenkhaos.de>
      Tested-by: default avatarJohannes Hirte <johannes.hirte@datenkhaos.de>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Yazen Ghannam <yazen.ghannam@amd.com>
      Link: http://lkml.kernel.org/r/20180414004230.GA2033@probook
      78ce2410
    • Dmitry Safonov's avatar
      x86/mm: Drop TS_COMPAT on 64-bit exec() syscall · acf46020
      Dmitry Safonov authored
      
      The x86 mmap() code selects the mmap base for an allocation depending on
      the bitness of the syscall. For 64bit sycalls it select mm->mmap_base and
      for 32bit mm->mmap_compat_base.
      
      exec() calls mmap() which in turn uses in_compat_syscall() to check whether
      the mapping is for a 32bit or a 64bit task. The decision is made on the
      following criteria:
      
        ia32    child->thread.status & TS_COMPAT
         x32    child->pt_regs.orig_ax & __X32_SYSCALL_BIT
        ia64    !ia32 && !x32
      
      __set_personality_x32() was dropping TS_COMPAT flag, but
      set_personality_64bit() has kept compat syscall flag making
      in_compat_syscall() return true during the first exec() syscall.
      
      Which in result has user-visible effects, mentioned by Alexey:
      1) It breaks ASAN
      $ gcc -fsanitize=address wrap.c -o wrap-asan
      $ ./wrap32 ./wrap-asan true
      ==1217==Shadow memory range interleaves with an existing memory mapping. ASan cannot proceed correctly. ABORTING.
      ==1217==ASan shadow was supposed to be located in the [0x00007fff7000-0x10007fff7fff] range.
      ==1217==Process memory map follows:
              0x000000400000-0x000000401000   /home/izbyshev/test/gcc/asan-exec-from-32bit/wrap-asan
              0x000000600000-0x000000601000   /home/izbyshev/test/gcc/asan-exec-from-32bit/wrap-asan
              0x000000601000-0x000000602000   /home/izbyshev/test/gcc/asan-exec-from-32bit/wrap-asan
              0x0000f7dbd000-0x0000f7de2000   /lib64/ld-2.27.so
              0x0000f7fe2000-0x0000f7fe3000   /lib64/ld-2.27.so
              0x0000f7fe3000-0x0000f7fe4000   /lib64/ld-2.27.so
              0x0000f7fe4000-0x0000f7fe5000
              0x7fed9abff000-0x7fed9af54000
              0x7fed9af54000-0x7fed9af6b000   /lib64/libgcc_s.so.1
      [snip]
      
      2) It doesn't seem to be great for security if an attacker always knows
      that ld.so is going to be mapped into the first 4GB in this case
      (the same thing happens for PIEs as well).
      
      The testcase:
      $ cat wrap.c
      
      int main(int argc, char *argv[]) {
        execvp(argv[1], &argv[1]);
        return 127;
      }
      
      $ gcc wrap.c -o wrap
      $ LD_SHOW_AUXV=1 ./wrap ./wrap true |& grep AT_BASE
      AT_BASE:         0x7f63b8309000
      AT_BASE:         0x7faec143c000
      AT_BASE:         0x7fbdb25fa000
      
      $ gcc -m32 wrap.c -o wrap32
      $ LD_SHOW_AUXV=1 ./wrap32 ./wrap true |& grep AT_BASE
      AT_BASE:         0xf7eff000
      AT_BASE:         0xf7cee000
      AT_BASE:         0x7f8b9774e000
      
      Fixes: 1b028f78 ("x86/mm: Introduce mmap_compat_base() for 32-bit mmap()")
      Fixes: ada26481 ("x86/mm: Make in_compat_syscall() work during exec")
      Reported-by: default avatarAlexey Izbyshev <izbyshev@ispras.ru>
      Bisected-by: default avatarAlexander Monakov <amonakov@ispras.ru>
      Investigated-by: default avatarAndy Lutomirski <luto@kernel.org>
      Signed-off-by: default avatarDmitry Safonov <dima@arista.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Alexander Monakov <amonakov@ispras.ru>
      Cc: Dmitry Safonov <0x7f454c46@gmail.com>
      Cc: stable@vger.kernel.org
      Cc: linux-mm@kvack.org
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Link: https://lkml.kernel.org/r/20180517233510.24996-1-dima@arista.com
      acf46020
  8. May 18, 2018
  9. May 17, 2018
  10. May 16, 2018