Commits · a9c92fd420fb83ac0060a89ffd24386b7ea96ccf · Summer2022 / 22b970497

Jul 19, 2021

x86/apic: Mark _all_ legacy interrupts when IO/APIC is missing · a9c92fd4

Thomas Gleixner authored 3 years ago


stable inclusion
from linux-4.19.194
commit 7e25cb1b22f81239ae3332e14a1d0cff7014bccd

--------------------------------

commit 7d65f9e80646c595e8c853640a9d0768a33e204c upstream.

PIC interrupts do not support affinity setting and they can end up on
any online CPU. Therefore, it's required to mark the associated vectors
as system-wide reserved. Otherwise, the corresponding irq descriptors
are copied to the secondary CPUs but the vectors are not marked as
assigned or reserved. This works correctly for the IO/APIC case.

When the IO/APIC is disabled via config, kernel command line or lack of
enumeration then all legacy interrupts are routed through the PIC, but
nothing marks them as system-wide reserved vectors.

As a consequence, a subsequent allocation on a secondary CPU can result in
allocating one of these vectors, which triggers the BUG() in
apic_update_vector() because the interrupt descriptor slot is not empty.

Imran tried to work around that by marking those interrupts as allocated
when a CPU comes online. But that's wrong in case that the IO/APIC is
available and one of the legacy interrupts, e.g. IRQ0, has been switched to
PIC mode because then marking them as allocated will fail as they are
already marked as system vectors.

Stay consistent and update the legacy vectors after attempting IO/APIC
initialization and mark them as system vectors in case that no IO/APIC is
available.

Fixes: 69cde000 ("x86/vector: Use matrix allocator for vector assignment")
Reported-by: Imran Khan <imran.f.khan@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20210519233928.2157496-1-imran.f.khan@oracle.com


Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

a9c92fd4

Jul 01, 2021

x86/uprobes: Do not use prefixes.nbytes when looping over prefixes.bytes · edf4a501

Masami Hiramatsu authored 3 years ago


stable inclusion
from linux-4.19.163
commit 6bb78b3fff90fcf76991f689822ae58a4feee36d

--------------------------------

commit 4e9a5ae8 upstream.

Since insn.prefixes.nbytes can be bigger than the size of
insn.prefixes.bytes[] when a prefix is repeated, the proper check must
be

  insn.prefixes.bytes[i] != 0 and i < 4

instead of using insn.prefixes.nbytes.

Introduce a for_each_insn_prefix() macro for this purpose. Debugged by
Kees Cook <keescook@chromium.org>.

 [ bp: Massage commit message, sync with the respective header in tools/
   and drop "we". ]

Fixes: 2b144498 ("uprobes, mm, x86: Add the ability to install and remove uprobes breakpoints")
Reported-by:  <syzbot+9b64b619f10f19d19a7c@syzkaller.appspotmail.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Li...

edf4a501

Jun 30, 2021

x86/kprobes: Fix to check non boostable prefixes correctly · 3a488a41

Masami Hiramatsu authored 3 years ago


stable inclusion
from linux-4.19.191
commit 609a2d6557e8a6d5dbb6bfcfc5b42185526a0c0b

--------------------------------

[ Upstream commit 6dd3b8c9f58816a1354be39559f630cd1bd12159 ]

There are 2 bugs in the can_boost() function because of using
x86 insn decoder. Since the insn->opcode never has a prefix byte,
it can not find CS override prefix in it. And the insn->attr is
the attribute of the opcode, thus inat_is_address_size_prefix(
insn->attr) always returns false.

Fix those by checking each prefix bytes with for_each_insn_prefix
loop and getting the correct attribute for each prefix byte.
Also, this removes unlikely, because this is a slow path.

Fixes: a8d11cd0 ("kprobes/x86: Consolidate insn decoder users for copying code")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/161666691162.1120877.2808435205294352583.stgit@devnote2


Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

3a488a41

x86/microcode: Check for offline CPUs before requesting new microcode · 400e8d37

Otavio Pontes authored 3 years ago


stable inclusion
from linux-4.19.191
commit af6d4954f9d43def013d3621d944ea2770bbd08b

--------------------------------

[ Upstream commit 7189b3c11903667808029ec9766a6e96de5012a5 ]

Currently, the late microcode loading mechanism checks whether any CPUs
are offlined, and, in such a case, aborts the load attempt.

However, this must be done before the kernel caches new microcode from
the filesystem. Otherwise, when offlined CPUs are onlined later, those
cores are going to be updated through the CPU hotplug notifier callback
with the new microcode, while CPUs previously onine will continue to run
with the older microcode.

For example:

Turn off one core (2 threads):

  echo 0 > /sys/devices/system/cpu/cpu3/online
  echo 0 > /sys/devices/system/cpu/cpu1/online

Install the ucode fails because a primary SMT thread is offline:

  cp intel-ucode/06-8e-09 /lib/firmware/intel-ucode/
  echo 1 > /sys/devices/system/cpu/microcode/reload
  bash: echo: write error: Invalid argument

Turn the core back on

  echo 1 > /sys/devices/system/cpu/cpu3/online
  echo 1 > /sys/devices/system/cpu/cpu1/online
  cat /proc/cpuinfo |grep microcode
  microcode : 0x30
  microcode : 0xde
  microcode : 0x30
  microcode : 0xde

The rationale for why the update is aborted when at least one primary
thread is offline is because even if that thread is soft-offlined
and idle, it will still have to participate in broadcasted MCE's
synchronization dance or enter SMM, and in both examples it will execute
instructions so it better have the same microcode revision as the other
cores.

 [ bp: Heavily edit and extend commit message with the reasoning behind all
   this. ]

Fixes: 30ec26da ("x86/microcode: Do not upload microcode if CPUs are offline")
Signed-off-by: Otavio Pontes <otavio.pontes@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Ashok Raj <ashok.raj@intel.com>
Link: https://lkml.kernel.org/r/20210319165515.9240-2-otavio.pontes@intel.com


Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

400e8d37

x86/sysfb: Fix check for bad VRAM size · 46325589

Arvind Sankar authored 3 years ago


mainline inclusion
from v5.5-rc1
commit dacc9092
category: bugfix
bugzilla: 28820
CVE: NA
---------------------------

When checking whether the reported lfb_size makes sense, the height
* stride result is page-aligned before seeing whether it exceeds the
reported size.

This doesn't work if height * stride is not an exact number of pages.
For example, as reported in the kernel bugzilla below, an 800x600x32 EFI
framebuffer gets skipped because of this.

Move the PAGE_ALIGN to after the check vs size.

Reported-by: Christopher Head <chead@chead.ca>
Tested-by: Christopher Head <chead@chead.ca>
Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206051
Link: https://lkml.kernel.org/r/20200107230410.2291947-1-nivedita@alum.mit.edu


Signed-off-by: Hongbo Yao <yaohongbo@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

46325589

x86/timer: Force PIT initialization when !X86_FEATURE_ARAT · c4f8b0ac

Jan Stancek authored 3 years ago


mainline inclusion
from mainline-v5.3
commit afa8b475
category: bugfix
bugzilla: 17251
CVE: NA

-------------------------------------------------

KVM guests with commit c8c40767 ("x86/timer: Skip PIT initialization on
modern chipsets") applied to guest kernel have been observed to have
unusually higher CPU usage with symptoms of increase in vm exits for HLT
and MSW_WRITE (MSR_IA32_TSCDEADLINE).

This is caused by older QEMUs lacking support for X86_FEATURE_ARAT.  lapic
clock retains CLOCK_EVT_FEAT_C3STOP and nohz stays inactive.  There's no
usable broadcast device either.

Do the PIT initialization if guest CPU lacks X86_FEATURE_ARAT.  On real
hardware it shouldn't matter as ARAT and DEADLINE come together.

Fixes: c8c40767 ("x86/timer: Skip PIT initialization on modern chipsets")
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

c4f8b0ac

x86/timer: Don't skip PIT setup when APIC is disabled or in legacy mode · 639b66f7

Thomas Gleixner authored 3 years ago


mainline inclusion
from mainline-v5.6-rc1
commit 97992387
category: bugfix
bugzilla: 17251
CVE: NA

-------------------------------------------------

Tony reported a boot regression caused by the recent workaround for systems
which have a disabled (clock gate off) PIT.

On his machine the kernel fails to initialize the PIT because
apic_needs_pit() does not take into account whether the local APIC
interrupt delivery mode will actually allow to setup and use the local
APIC timer. This should be easy to reproduce with acpi=off on the
command line which also disables HPET.

Due to the way the PIT/HPET and APIC setup ordering works (APIC setup can
require working PIT/HPET) the information is not available at the point
where apic_needs_pit() makes this decision.

To address this, split out the interrupt mode selection from
apic_intr_mode_init(), invoke the selection before making the decision
whether PIT is required or not, and add the missing checks into
apic_needs_pit().

Fixes: c8c40767 ("x86/timer: Skip PIT initialization on modern chipsets")
Reported-by: Anthony Buckley <tony.buckley000@gmail.com>
Tested-by: Anthony Buckley <tony.buckley000@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Daniel Drake <drake@endlessm.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206125
Link: https://lore.kernel.org/r/87sgk6tmk2.fsf@nanos.tec.linutronix.de



Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

639b66f7

x86/timer: Skip PIT initialization on modern chipsets · a0f31a54

Thomas Gleixner authored 3 years ago


mainline inclusion
from mainline-v5.3-rc1
commit c8c40767
category: bugfix
bugzilla: 17251
CVE: NA

-------------------------------------------------

Recent Intel chipsets including Skylake and ApolloLake have a special
ITSSPRC register which allows the 8254 PIT to be gated.  When gated, the
8254 registers can still be programmed as normal, but there are no IRQ0
timer interrupts.

Some products such as the Connex L1430 and exone go Rugged E11 use this
register to ship with the PIT gated by default. This causes Linux to fail
to boot:

  Kernel panic - not syncing: IO-APIC + timer doesn't work! Boot with
  apic=debug and send a report.

The panic happens before the framebuffer is initialized, so to the user, it
appears as an early boot hang on a black screen.

Affected products typically have a BIOS option that can be used to enable
the 8254 and make Linux work (Chipset -> South Cluster Configuration ->
Miscellaneous Configuration -> 8254 Clock Gating), however it would be best
to make Linux support the no-8254 case.

Modern sytems allow to discover the TSC and local APIC timer frequencies,
so the calibration against the PIT is not required. These systems have
always running timers and the local APIC timer works also in deep power
states.

So the setup of the PIT including the IO-APIC timer interrupt delivery
checks are a pointless exercise.

Skip the PIT setup and the IO-APIC timer interrupt checks on these systems,
which avoids the panic caused by non ticking PITs and also speeds up the
boot process.

Thanks to Daniel for providing the changelog, initial analysis of the
problem and testing against a variety of machines.

Reported-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Drake <drake@endlessm.com>
Cc: bp@alien8.de
Cc: hpa@zytor.com
Cc: linux@endlessm.com
Cc: rafael.j.wysocki@intel.com
Cc: hdegoede@redhat.com
Link: https://lkml.kernel.org/r/20190628072307.24678-1-drake@endlessm.com


Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

a0f31a54

x86/apic: Rename 'lapic_timer_frequency' to 'lapic_timer_period' · 78003899

Daniel Drake authored 3 years ago


mainline inclusion
from mainline-v5.3-rc1
commit 52ae346b
category: bugfix
bugzilla: 17251
CVE: NA

-------------------------------------------------

This variable is a period unit (number of clock cycles per jiffy),
not a frequency (which is number of cycles per second).

Give it a more appropriate name.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Daniel Drake <drake@endlessm.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: len.brown@intel.com
Cc: linux@endlessm.com
Cc: rafael.j.wysocki@intel.com
Link: http://lkml.kernel.org/r/20190509055417.13152-2-drake@endlessm.com


Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

78003899

Jun 11, 2021

x86/kvm: Add "nopvspin" parameter to disable PV spinlocks · c631feb1

Zhenzhong Duan authored 3 years ago

mainline inclusion
from mainline-5.1
commit 05eee619
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I3T22N


CVE: NA

There are cases where a guest tries to switch spinlocks to bare metal
behavior (e.g. by setting "xen_nopvspin" on XEN platform and
"hv_nopvspin" on HYPER_V).

That feature is missed on KVM, add a new parameter "nopvspin" to disable
PV spinlocks for KVM guest.

The new 'nopvspin' parameter will also replace Xen and Hyper-V specific
parameters in future patches.

Define variable nopvsin as global because it will be used in future
patches as above.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Cc: Jim Mattson <jmattson@google.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jiajun Chen <chenjiajun8@huawei.com>
Reviewed-by: Xiangyou Xie <xiexiangyou@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

c631feb1

Jun 02, 2021

x86/tsc: Respect tsc command line paraemeter for clocksource_tsc_early · a83b988c

Michael Zhivich authored 3 years ago

mainline inclusion
from mainline-v5.4-rc4
commit 63ec58b44fcc05efd1542045abd7faf056ac27d9
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I3T8ZP?from=project-issue


CVE: NA

--------------------------------

The introduction of clocksource_tsc_early broke the functionality of
"tsc=reliable" and "tsc=nowatchdog" command line parameters, since
clocksource_tsc_early is unconditionally registered with
CLOCK_SOURCE_MUST_VERIFY and thus put on the watchdog list.

This can cause the TSC to be declared unstable during boot:

  clocksource: timekeeping watchdog on CPU0: Marking clocksource
               'tsc-early' as unstable because the skew is too large:
  clocksource: 'refined-jiffies' wd_now: fffb7018 wd_last: fffb6e9d
               mask: ffffffff
  clocksource: 'tsc-early' cs_now: 68a6a7070f6a0 cs_last: 68a69ab6f74d6
               mask: ffffffffffffffff
  tsc: Marking TSC unstable due to clocksource watchdog

The corresponding elapsed times are cs_nsec=1224152026 and wd_nsec=378942392, so
the watchdog differs from TSC by 0.84 seconds.

This happens when HPET is not available and jiffies are used as the TSC
watchdog instead and the jiffies update is not happening due to lost timer
interrupts in periodic mode, which can happen e.g. with expensive debug
mechanisms enabled or under massive overload conditions in virtualized
environments.

Before the introduction of the early TSC clocksource the command line
parameters "tsc=reliable" and "tsc=nowatchdog" could be used to work around
this issue.

Restore the behaviour by disabling the watchdog if requested on the kernel
command line.

[ tglx: Clarify changelog ]

Fixes: aa83c457 ("x86/tsc: Introduce early tsc clocksource")
Signed-off-by: Michael Zhivich <mzhivich@akamai.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191024175945.14338-1-mzhivich@akamai.com


[wangxiongfeng: remove 'no_tsc_watchdog' in the origin patch.]
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Jian Cheng <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

a83b988c

Jun 01, 2021

cpuidle-haltpoll: Use arch_cpu_idle() to replace default_idle() · 0259804a

Xiangyou Xie authored 3 years ago

hulk inclusion
category: feature
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=34


CVE: NA

Use arch_cpu_idle() to replace default_idle() in default_enter_idle().
default_idle() is defined only in x86.

Signed-off-by: Xiangyou Xie <xiexiangyou@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Jiajun Chen <chenjiajun8@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

0259804a

cpuidle-haltpoll: Enable kvm guest polling when dedicated physical CPUs are available · 90a9fc3e

Wanpeng Li authored 3 years ago

mainline inclusion
from mainline-5.4
commit 1328edca
category: feature
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=34


CVE: NA

The downside of guest side polling is that polling is performed even
with other runnable tasks in the host. However, even if poll in kvm
can aware whether or not other runnable tasks in the same pCPU, it
can still incur extra overhead in over-subscribe scenario. Now we can
just enable guest polling when dedicated pCPUs are available.

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Yubo Miao <miaoyubo@huawei.com>
Signed-off-by: Xiangyou Xie <xiexiangyou@huawei.com>
Reviewed-by: Hailiang Zhang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Jiajun Chen <chenjiajun8@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

90a9fc3e

cpuidle-haltpoll: vcpu hotplug support · 16d1054a

Joao Martins authored 3 years ago

mainline inclusion
from mainline-5.4
commit 97d3eb9d
category: feature
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=34


CVE: NA

When cpus != maxcpus cpuidle-haltpoll will fail to register all vcpus
past the online ones and thus fail to register the idle driver.
This is because cpuidle_add_sysfs() will return with -ENODEV as a
consequence from get_cpu_device() return no device for a non-existing
CPU.

Instead switch to cpuidle_register_driver() and manually register each
of the present cpus through cpuhp_setup_state() callbacks and future
ones that get onlined or offlined. This mimmics similar logic that
intel_idle does.

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Yubo Miao <miaoyubo@huawei.com>
Signed-off-by: Xiangyou Xie <xie...

16d1054a

cpuidle-haltpoll: disable host side polling when kvm virtualized · ce9add05

Marcelo Tosatti authored 3 years ago

mainline inclusion
from mainline-5.4
commit a1c4423b
category: feature
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=34


CVE: NA

When performing guest side polling, it is not necessary to
also perform host side polling.

So disable host side polling, via the new MSR interface,
when loading cpuidle-haltpoll driver.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Yubo Miao <miaoyubo@huawei.com>
Signed-off-by: Xiangyou Xie <xiexiangyou@huawei.com>
Reviewed-by: Hailiang Zhang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Jiajun Chen <chenjiajun8@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

ce9add05

add cpuidle-haltpoll driver · 8104395b

Marcelo Tosatti authored 3 years ago

mainline inclusion
from mainline-5.4
commit fa86ee90
category: feature
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=34


CVE: NA

Add a cpuidle driver that calls the architecture default_idle routine.

To be used in conjunction with the haltpoll governor.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Yubo Miao <miaoyubo@huawei.com>
Signed-off-by: Xiangyou Xie <xiexiangyou@huawei.com>
Reviewed-by: Hailiang Zhang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Jiajun Chen <chenjiajun8@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

8104395b

May 26, 2021

x86/apic/vector: Force interupt handler invocation to irq context · 3e134563

Thomas Gleixner authored 3 years ago


mainline inclusion
from mainline-5.7
commit 008f1d60
category: bugfix
bugzilla: NA
CVE: NA

-------------------------------------------------

Sathyanarayanan reported that the PCI-E AER error injection mechanism
can result in a NULL pointer dereference in apic_ack_edge():

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
 RIP: 0010:apic_ack_edge+0x1e/0x40
 Call Trace:
   handle_edge_irq+0x7d/0x1e0
   generic_handle_irq+0x27/0x30
   aer_inject_write+0x53a/0x720

It crashes in irq_complete_move() which dereferences get_irq_regs() which
is obviously NULL when this is called from non interrupt context.

Of course the pointer could be checked, but that just papers over the real
issue. Invoking the low level interrupt handling mechanism from random code
can wreckage the fragile interrupt affinity mechanism of x86 as interrupts
can only be moved in interrupt context or with special care when a CPU goes
offline and the move has to be enforced.

In the best case this triggers the warning in the MSI affinity setter, but
if the call happens on the correct CPU it just corrupts state and might
prevent further interrupt delivery for the affected device.

Mark the APIC interrupts as unsuitable for being invoked in random contexts.

This prevents the AER injection from proliferating the wreckage, but that's
less broken than the current state of affairs and more correct than just
papering over the problem by sprinkling random checks all over the place
and silently corrupting state.

Reported-by:  <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200306130623.684591280@linutronix.de


Signed-off-by: Liao Chang <liaochang1@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

3e134563

x86/cpufeatures: Enumerate the new AVX512 BFLOAT16 instructions · 3094b42a

Fenghua Yu authored 3 years ago

mainline inclusion
from mainline-v5.3-rc1
commit b302e4b1
category: feature
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=44


CVE: NA

-----------------------------------------------

AVX512 BFLOAT16 instructions support 16-bit BFLOAT16 floating-point
format (BF16) for deep learning optimization.

BF16 is a short version of 32-bit single-precision floating-point
format (FP32) and has several advantages over 16-bit half-precision
floating-point format (FP16). BF16 keeps FP32 accumulation after
multiplication without loss of precision, offers more than enough
range for deep learning training tasks, and doesn't need to handle
hardware exception.

AVX512 BFLOAT16 instructions are enumerated in CPUID.7.1:EAX[bit 5]
AVX512_BF16.

CPUID.7.1:EAX contains only feature bits. Reuse the currently empty
word 12 as a pure features word to hold the feature bits including
AVX512_BF16.

Detailed information of the CPUID bit and AVX512 BFLOAT16 instructions
can be found in the latest Intel Architecture Instruction Set Extensions
and Future Features Programming Reference.

 [ bp: Check CPUID(7) subleaf validity before accessing subleaf 1. ]

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nadav Amit <namit@vmware.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Cc: Robert Hoo <robert.hu@linux.intel.com>
Cc: "Sean J Christopherson" <sean.j.christopherson@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
Cc: x86 <x86@kernel.org>
Link: https://lkml.kernel.org/r/1560794416-217638-3-git-send-email-fenghua.yu@intel.com


Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

3094b42a

May 22, 2021

x86/unwind/orc: Remove boot-time ORC unwind tables sorting · badd79c4

Shile Zhang authored 3 years ago


mainline inclusion
from mainline-v5.6-rc1
commit f14bf6a3
category: feature
bugzilla: NA
CVE: NA

-------------------------------------------------

Now that the orc_unwind and orc_unwind_ip tables are sorted at build time,
remove the boot time sorting pass.

No change in functionality.

[ mingo: Rewrote the changelog and code comments. ]

Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kbuild@vger.kernel.org
Link: https://lkml.kernel.org/r/20191204004633.88660-8-shile.zhang@linux.alibaba.com


Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Jian Cheng <cj.chengjian@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

badd79c4

May 11, 2021

ACPI: x86: Call acpi_boot_table_init() after acpi_table_upgrade() · 9c16025d

Rafael J. Wysocki authored 3 years ago


stable inclusion
from linux-4.19.190
commit 9a5ba778b50d6c6c50597febf3a30387a80ac05d

--------------------------------

commit 6998a8800d73116187aad542391ce3b2dd0f9e30 upstream.

Commit 1a1c130ab757 ("ACPI: tables: x86: Reserve memory occupied by
ACPI tables") attempted to address an issue with reserving the memory
occupied by ACPI tables, but it broke the initrd-based table override
mechanism relied on by multiple users.

To restore the initrd-based ACPI table override functionality, move
the acpi_boot_table_init() invocation in setup_arch() on x86 after
the acpi_table_upgrade() one.

Fixes: 1a1c130ab757 ("ACPI: tables: x86: Reserve memory occupied by ACPI tables")
Reported-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: George Kennedy <george.kennedy@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-...

9c16025d

ACPI: tables: x86: Reserve memory occupied by ACPI tables · f60914de

Rafael J. Wysocki authored 3 years ago

stable inclusion
from linux-4.19.190
commit 7d329dd0a1b3302e10f0f25ef08538c7598d118f

--------------------------------

commit 1a1c130ab7575498eed5bcf7220037ae09cd1f8a upstream.

The following problem has been reported by George Kennedy:

 Since commit 7fef431b ("mm/page_alloc: place pages to tail
 in __free_pages_core()") the following use after free occurs
 intermittently when ACPI tables are accessed.

 BUG: KASAN: use-after-free in ibft_init+0x134/0xc49
 Read of size 4 at addr ffff8880be453004 by task swapper/0/1
 CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.12.0-rc1-7a7fd0d #1
 Call Trace:
  dump_stack+0xf6/0x158
  print_address_description.constprop.9+0x41/0x60
  kasan_report.cold.14+0x7b/0xd4
  __asan_report_load_n_noabort+0xf/0x20
  ibft_init+0x134/0xc49
  do_one_initcall+0xc4/0x3e0
  kernel_init_freeable+0x5af/0x66b
  kernel_init+0x16/0x1d0
  ret_from_fork+0x22/0x30

 ACPI tables mapped via kmap() do not have their mapped pages
 reserved and the pages can be "stolen" by the buddy allocator.

Apparently, on the affected system, the ACPI table in question is
not located in "reserved" memory, like ACPI NVS or ACPI Data, that
will not be used by the buddy allocator, so the memory occupied by
that table has to be explicitly reserved to prevent the buddy
allocator from using it.

In order to address this problem, rearrange the initialization of the
ACPI tables on x86 to locate the initial tables earlier and reserve
the memory occupied by them.

The other architectures using ACPI should not be affected by this
change.

Link: https://lore.kernel.org/linux-acpi/1614802160-29362-1-git-send-email-george.kennedy@oracle.com/


Reported-by: George Kennedy <george.kennedy@oracle.com>
Tested-by: George Kennedy <george.kennedy@oracle.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: 5.10+ <stable@vger.kernel.org> # 5.10+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

f60914de

x86/crash: Fix crash_setup_memmap_entries() out-of-bounds access · 1cd844dd

Mike Galbraith authored 3 years ago

stable inclusion
from linux-4.19.189
commit f60194921e30e733ba94b4b3b2681d1cdc4ded55

--------------------------------

commit 5849cdf8c120e3979c57d34be55b92d90a77a47e upstream.

Commit in Fixes: added support for kexec-ing a kernel on panic using a
new system call. As part of it, it does prepare a memory map for the new
kernel.

However, while doing so, it wrongly accesses memory it has not
allocated: it accesses the first element of the cmem->ranges[] array in
memmap_exclude_ranges() but it has not allocated the memory for it in
crash_setup_memmap_entries(). As KASAN reports:

  BUG: KASAN: vmalloc-out-of-bounds in crash_setup_memmap_entries+0x17e/0x3a0
  Write of size 8 at addr ffffc90000426008 by task kexec/1187

  (gdb) list *crash_setup_memmap_entries+0x17e
  0xffffffff8107cafe is in crash_setup_memmap_entries (arch/x86/kernel/crash.c:322).
  317                                      unsigned long long mend)
  318     {
  319           ...

1cd844dd

Apr 07, 2021

x86/Kconfig: Rename UMIP config parameter · 2525e634

LeoLiu-oc authored 3 years ago

mainline inclusion
from mainline-5.5
commit b971880f
category: x86/Kconfig
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=19


CVE: NA

----------------------------------------------------------------

AMD 2nd generation EPYC processors support the UMIP (User-Mode
Instruction Prevention) feature. So, rename X86_INTEL_UMIP to
generic X86_UMIP and modify the text to cover both Intel and AMD.

 [ bp: take of the disabled-features.h copy in tools/ too. ]

Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "x86@kernel.org" <x86@kernel.org>
Link: https://lkml.kernel.org/r/157298912544.17462.2018334793891409521.stgit@naples-babu.amd.com


Signed-off-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com&gt;...>

2525e634

x86/apic: Mask IOAPIC entries when disabling the local APIC · ed7daf32

LeoLiu-oc authored 3 years ago

mainline inclusion
from mainline-5.6
commit 0f378d73
category: x86/apic
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=19
CVE: NA

----------------------------------------------------------------

When a system suspends, the local APIC is disabled in the suspend sequence,
but the IOAPIC is left in the current state. This means unmasked interrupt
lines stay unmasked. This is usually the case for IOAPIC pin 9 to which the
ACPI interrupt is connected.

That means that in suspended state the IOAPIC can respond to an external
interrupt, e.g. the wakeup via keyboard/RTC/ACPI, but the interrupt message
cannot be handled by the disabled local APIC. As a consequence the Remote
IRR bit is set, but the local APIC does not send an EOI to acknowledge
it. This causes the affected interrupt line to become stale and the stale
Remote IRR bit will cause a hang when __synchronize_hardirq() is invoked
for that interrupt line...

ed7daf32

x86/perf: Add hardware performance events support for Zhaoxin CPU. · 0ce7c3b6

LeoLiu-oc authored 3 years ago

mainline inclusion
from mainline-5.8
commit 3a4ac121
category: x86/perf
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=19


CVE: NA

----------------------------------------------------------------

Add the generic Zhaoxin uncore PMU support.

Zhaoxin CPU has provided facilities for monitoring performance
via PMU (Performance Monitor Unit), but the functionality is unused so far.
Therefore, add support for zhaoxin pmu to make performance related
hardware events available.

The PMU is mostly an Intel Architectural PerfMon-v2 with a novel
errata for the ZXC line. It supports the following events:

  -----------------------------------------------------------------------------------------------------------------------------------
  Event                      | Event  | Umask |          Description
			     | Select |       |
  -----------------------------------------------------------------------------------------------------------------------------------
  cpu-cycles                 |  82h   |  00h  | unhalt core clock
  instructions               |  00h   |  00h  | number of instructions at retirement.
  cache-references           |  15h   |  05h  | number of fillq pushs at the current cycle.
  cache-misses               |  1ah   |  05h  | number of l2 miss pushed by fillq.
  branch-instructions        |  28h   |  00h  | counts the number of branch instructions retired.
  branch-misses              |  29h   |  00h  | mispredicted branch instructions at retirement.
  bus-cycles                 |  83h   |  00h  | unhalt bus clock
  stalled-cycles-frontend    |  01h   |  01h  | Increments each cycle the # of Uops issued by the RAT to RS.
  stalled-cycles-backend     |  0fh   |  04h  | RS0/1/2/3/45 empty
  L1-dcache-loads            |  68h   |  05h  | number of retire/commit load.
  L1-dcache-load-misses      |  4bh   |  05h  | retired load uops whose data source followed an L1 miss.
  L1-dcache-stores           |  69h   |  06h  | number of retire/commit Store,no LEA
  L1-dcache-store-misses     |  62h   |  05h  | cache lines in M state evicted out of L1D due to Snoop HitM or dirty line replacement.
  L1-icache-loads            |  00h   |  03h  | number of l1i cache access for valid normal fetch,including un-cacheable access.
  L1-icache-load-misses      |  01h   |  03h  | number of l1i cache miss for valid normal fetch,including un-cacheable miss.
  L1-icache-prefetches       |  0ah   |  03h  | number of prefetch.
  L1-icache-prefetch-misses  |  0bh   |  03h  | number of prefetch miss.
  dTLB-loads                 |  68h   |  05h  | number of retire/commit load
  dTLB-load-misses           |  2ch   |  05h  | number of load operations miss all level tlbs and cause a tablewalk.
  dTLB-stores                |  69h   |  06h  | number of retire/commit Store,no LEA
  dTLB-store-misses          |  30h   |  05h  | number of store operations miss all level tlbs and cause a tablewalk.
  dTLB-prefetches            |  64h   |  05h  | number of hardware pte prefetch requests dispatched out of the prefetch FIFO.
  dTLB-prefetch-misses       |  65h   |  05h  | number of hardware pte prefetch requests miss the l1d data cache.
  iTLB-load                  |  00h   |  00h  | actually counter instructions.
  iTLB-load-misses           |  34h   |  05h  | number of code operations miss all level tlbs and cause a tablewalk.
  -----------------------------------------------------------------------------------------------------------------------------------

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: CodyYao-oc <CodyYao-oc@zhaoxin.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1586747669-4827-1-git-send-email-CodyYao-oc@zhaoxin.com


Signed-off-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>

0ce7c3b6

x86/speculation/swapgs: Exclude Zhaoxin CPUs from SWAPGS vulnerability · 43fcdf11

LeoLiu-oc authored 3 years ago

mainline inclusion
from mainline-5.6
commit a84de2fa
category: x86/speculation/swapgs
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=19


CVE: NA

----------------------------------------------------------------

New Zhaoxin family 7 CPUs are not affected by the SWAPGS vulnerability.
So
mark these CPUs in the cpu vulnerability whitelist accordingly.

Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/1579227872-26972-3-git-send-email-TonyWWang-oc@zhaoxin.com


Signed-off-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>

43fcdf11

x86/speculation/spectre_v2: Exclude Zhaoxin CPUs from SPECTRE_V2 · 4a1a1b50

LeoLiu-oc authored 3 years ago

mainline inclusion
from mainline-5.6
commit 1e41a766
category: x86/speculation/spectre_v2
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=19


CVE: NA

----------------------------------------------------------------

New Zhaoxin family 7 CPUs are not affected by SPECTRE_V2. So define a
separate cpu_vuln_whitelist bit NO_SPECTRE_V2 and add these CPUs to the
cpu
vulnerability whitelist.

Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/1579227872-26972-2-git-send-email-TonyWWang-oc@zhaoxin.com


Signed-off-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>

4a1a1b50

x86/mce: Add Zhaoxin LMCE support · 184f2294

LeoLiu-oc authored 3 years ago

mainline inclusion
from mainline-5.5
commit 70f0c230
category: x86/mce
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=19


CVE: NA

----------------------------------------------------------------

Add support for more Zhaoxin CPUs.

Newer Zhaoxin CPUs support LMCE compatible with Intel. Add support for
that.

 [ bp: Export functions and massage.  ]

Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: CooperYan@zhaoxin.com
Cc: DavidWang@zhaoxin.com
Cc: HerryYang@zhaoxin.com
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: QiyuanWang@zhaoxin.com
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/1568787573-1297-5-git-send-email-TonyWWang-oc@zhaoxin.com


Signed-off-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>

184f2294

x86/mce: Add Zhaoxin CMCI support · 8661568e

LeoLiu-oc authored 3 years ago

mainline inclusion
from mainline-5.5
commit 5a3d56a0
category: x86/mce
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=19


CVE: NA

----------------------------------------------------------------

Add support for more Zhaoxin CPUs.

All newer Zhaoxin CPUs support CMCI and are compatible with Intel's
Machine-Check Architecture. Add that support for Zhaoxin CPUs.

 [ bp: Massage comments and export intel_init_cmci().  ]

Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: CooperYan@zhaoxin.com
Cc: DavidWang@zhaoxin.com
Cc: HerryYang@zhaoxin.com
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: QiyuanWang@zhaoxin.com
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/1568787573-1297-4-git-send-email-TonyWWang-oc@zhaoxin.com


Signed-off-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>

8661568e

x86/mce: Add Zhaoxin MCE support · f74db966

LeoLiu-oc authored 3 years ago

mainline inclusion
from mainline-5.5
commit 6e898d2b
category: x86/mce
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=19


CVE: NA

----------------------------------------------------------------

Add support for more Zhaoxin CPUs.

All newer Zhaoxin CPUs are compatible with Intel's Machine-Check
Architecture, so add support for them.

 [ bp: Reflow comment in vendor_disable_error_reporting() and massage
    commit message. ]

Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: CooperYan@zhaoxin.com
Cc: DavidWang@zhaoxin.com
Cc: HerryYang@zhaoxin.com
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: QiyuanWang@zhaoxin.com
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/1568787573-1297-2-git-send-email-TonyWWang-oc@zhaoxin.com


Signed-off-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>

f74db966

x86/acpi/cstate: Add Zhaoxin processors support for cache flush policy in C3 · 6b68baaf

LeoLiu-oc authored 3 years ago

mainline inclusion
from mainline-5.2
commit f8c0e061
category: x86/acpi/cstate
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=19


CVE: NA

----------------------------------------------------------------

Same as Intel, Zhaoxin MP CPUs support C3 share cache and on all
recent Zhaoxin platforms ARB_DISABLE is a nop. So set related
flags correctly in the same way as Intel does.

Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "hpa@zytor.com" <hpa@zytor.com>
Cc: "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>
Cc: "rjw@rjwysocki.net" <rjw@rjwysocki.net>
Cc: "lenb@kernel.org" <lenb@kernel.org>
Cc: David Wang <DavidWang@zhaoxin.com>
Cc: "Cooper Yan(BJ-RD)" <CooperYan@zhaoxin.com>
Cc: "Qiyuan Wang(BJ-RD)" <QiyuanWang@zhaoxin.com>
Cc: "Herry Yang(BJ-RD)" <HerryYang@zhaoxin.com>
Link: https://lkml.kernel.org/r/a370503660994669991a7f7cda7c5e98@zhaoxin.com


Signed-off-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>

6b68baaf

x86/power: Optimize C3 entry on Centaur CPUs · c93213a5

LeoLiu-oc authored 3 years ago

mainline inclusion
from mainline-5.2
commit 987ddbe4
category: x86/power
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=19


CVE: NA

----------------------------------------------------------------

For new Centaur CPUs the ucode will take care of the preservation of
cache coherence
between CPU cores in C-states regardless of how deep the C-states are.
So, it is not
necessary to flush the caches in software befor entering C3. This
useless operation
will cause performance drop for the cores which share some caches with
the idling core.

Signed-off-by: David Wang <davidwang@zhaoxin.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: brucechang@via-alliance.com
Cc: cooperyan@zhaoxin.com
Cc: len.brown@intel.com
Cc: linux-pm@kernel.org
Cc: qiyuanwang@zhaoxin.com
Cc: rjw@rjwysocki.net
Cc: timguo@zhaoxin.com
Link: http://lkml.kernel.org/r/1545900110-2757-1-git-send-email-davidwang@zhaoxin.com


[ Tidy up the comment.  ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>

c93213a5

x86/cpu: Add detect extended topology for Zhaoxin CPUs · d8c7117f

LeoLiu-oc authored 3 years ago

zhaoxin inclusion
category: feature
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=19


CVE: NA

----------------------------------------------------------------

Detect the extended topology information of Zhaoxin CPUs if available.

The patch is scheduled to be submitted to the kernel mainline in 2021.

Signed-off-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>

d8c7117f

x86/cpu/centaur: Add Centaur family >=7 CPUs initialization support · e4195ce2

LeoLiu-oc authored 3 years ago

mainline inclusion
from mainline-5.9
commit 33b4711d
category: x86/cpu
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=19


CVE: NA

----------------------------------------------------------------

Add Centaur family >=7 CPUs specific initialization support.

Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link:https://lkml.kernel.org/r/1599562666-31351-3-git-send-email-TonyWWang-oc@zhaoxin.com


Signed-off-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>

e4195ce2

x86/cpu/centaur: Replace two-condition switch-case with an if statement · 91a09386

LeoLiu-oc authored 3 years ago

mainline inclusion
from mainline-5.9
commit 8687bdc0
category: x86/cpu
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=19


CVE: NA

----------------------------------------------------------------

Use a normal if statements instead of a two-condition switch-case.

[ bp: Massage commit message.   ]

Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/1599562666-31351-2-git-send-email-TonyWWang-oc@zhaoxin.com


Signed-off-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>

91a09386

x86/cpu: Remove redundant cpu_detect_cache_sizes() call · ccde3d27

LeoLiu-oc authored 3 years ago

mainline inclusion
from mainline-5.6
commit 283bab98
category: x86/cpu
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=19


CVE: NA

----------------------------------------------------------------

Both functions call init_intel_cacheinfo() which computes L2 and L3
cache
sizes from CPUID(4). But then they also call cpu_detect_cache_sizes() a
bit later which computes ->x86_tlbsize and L2 size from CPUID(80000006).

However, the latter call is not needed because

- on these CPUs, CPUID(80000006).EBX for ->x86_tlbsize is reserved

- CPUID(80000006).ECX for the L2 size has the same result as CPUID(4)

Therefore, remove the latter call to simplify the code.

[ bp: Rewrite commit message.   ]

Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/1579075257-6985-1-git-send-email-TonyWWang-oc@zhaoxin

.
Signed-off-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>

ccde3d27

x86/cpu: Create Zhaoxin processors architecture support file · 0a0a94b8

LeoLiu-oc authored 3 years ago

mainline inclusion
from mainline-5.2
commit 761fdd5e
category: x86/cpu
bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=19


CVE: NA

----------------------------------------------------------------

Add x86 architecture support for new Zhaoxin processors.
Carve out initialization code needed by Zhaoxin processors into
a separate compilation unit.

To identify Zhaoxin CPU, add a new vendor type X86_VENDOR_ZHAOXIN
for system recognition.

Signed-off-by: Tony W Wang-oc <TonyWWang-oc@zhaoxin.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "hpa@zytor.com" <hpa@zytor.com>
Cc: "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>
Cc: "rjw@rjwysocki.net" <rjw@rjwysocki.net>
Cc: "lenb@kernel.org" <lenb@kernel.org>
Cc: David Wang <DavidWang@zhaoxin.com>
Cc: "Cooper Yan(BJ-RD)" <CooperYan@zhaoxin.com>
Cc: "Qiyuan Wang(BJ-RD)" <QiyuanWang@zhaoxin.com>
Cc: "Herry Yang(BJ-RD)" <HerryYang@zhaoxin.com>
Link: https://lkml.kernel.org/r/01042674b2f741b2aed1f797359bdffb@zhaoxin.com


Signed-off-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: LeoLiu-oc <LeoLiu-oc@zhaoxin.com>
Signed-off-by: Cheng Jian <cj.chengjian@huawei.com>

0a0a94b8

Oct 29, 2020

x86/amd_nb: Make hygon_nb_misc_ids static · 9a9b7678

Pu Wen authored 4 years ago


commit 025e3204 upstream.

Fix the following sparse warning:

  arch/x86/kernel/amd_nb.c:74:28: warning:
    symbol 'hygon_nb_misc_ids' was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Brian Woods <Brian.Woods@amd.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Pu Wen <puwen@hygon.cn>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190614155441.22076-1-yuehaibing@huawei.com


Signed-off-by: Pu Wen <puwen@hygon.cn>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

9a9b7678

x86/CPU/hygon: Fix phys_proc_id calculation logic for multi-die processors · 22471d5a

Pu Wen authored 4 years ago

commit e0ceeae7 upstream.

The Hygon family 18h multi-die processor platform supports 1, 2 or
4-Dies per socket. The topology looks like this:

  System View (with 1-Die 2-Socket):
             |------------|
           ------       -----
   SOCKET0 | D0 |       | D1 |  SOCKET1
           ------       -----

  System View (with 2-Die 2-socket):
             --------------------
             |     -------------|------
             |     |            |     |
           ------------       ------------
   SOCKET0 | D1 -- D0 |       | D3 -- D2 | SOCKET1
           ------------       ------------

  System View (with 4-Die 2-Socket) :
             --------------------
             |     -------------|------
             |     |            |     |
           ------------       ------------
           | D1 -- D0 |       | D7 -- D6 |
           | |  \/ |  |       | |  \/ |  |
   SOCKET0 | |  /\ |  |       | |  /\ |  | SOCKET1
           | D2 -- D3 |       | D4 -- D5 |
           ------------       ------------
             |     |            |     |
             ------|------------|     |
                   --------------------

Currently

  phys_proc_id = initial_apicid >> bits

calculates the physical processor ID from the initial_apicid by shifting
*bits*.

However, this does not work for 1-Die and 2-Die 2-socket systems.

According to document [1] section 2.1.11.1, the bits is the value of
CPUID_Fn80000008_ECX[12:15]. The possible values are 4, 5 or 6 which
mean:

  4 - 1 die
  5 - 2 dies
  6 - 3/4 dies.

Hygon programs the initial ApicId the same way as AMD. The ApicId is
read from CPUID_Fn00000001_EBX (see section 2.1.11.1 of referrence [1])
and the definition is as below (see section 2.1.10.2.1.3 of [1]):

      -------------------------------------------------
  Bit |     6     |   5  4  |    3   |    2   1   0   |
      |-----------|---------|--------|----------------|
  IDs | Socket ID | Node ID | CCX ID | Core/Thread ID |
      -------------------------------------------------

So for 3/4-Die configurations, the bits variable is 6, which is the same
as the ApicID definition field.

For 1-Die and 2-Die configurations, bits is 4 or 5, which will cause the
right shifted result to not be exactly the value of socket ID.

However, the socket ID should be obtained from ApicId[6]. To fix the
problem and match the ApicID field definition, set the shift bits to 6
for all Hygon family 18h multi-die CPUs.

Because AMD doesn't have 2-Socket systems with 1-Die/2-Die processors
(see reference [2]), this doesn't need to be changed on the AMD side but
only for Hygon.

References:
[1] https://www.amd.com/system/files/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf
[2] https://www.amd.com/en/products/specifications/processors



 [bp: heavily massage commit message. ]

Signed-off-by: Pu Wen <puwen@hygon.cn>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
Cc: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/1553355740-19999-1-git-send-email-puwen@hygon.cn


Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

22471d5a

x86/mce: Add Hygon Dhyana support to the MCA infrastructure · f7867e20

Pu Wen authored 4 years ago


commit ac78bd72 upstream.

The machine check architecture for Hygon Dhyana CPU is similar to the
AMD family 17h one. Add vendor checking for Hygon Dhyana to share the
code path of AMD family 17h.

Signed-off-by: Pu Wen <puwen@hygon.cn>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: tglx@linutronix.de
Cc: mingo@redhat.com
Cc: hpa@zytor.com
Cc: tony.luck@intel.com
Cc: thomas.lendacky@amd.com
Cc: linux-edac@vger.kernel.org
Link: https://lkml.kernel.org/r/87d8a4f16bdea0bfe0c0cf2e4a8d2c2a99b1055c.1537533369.git.puwen@hygon.cn


Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

f7867e20