Commits · add8fd4e2012eb37594dfb055c11cff54aa30008 · Summer2022 / 22b970264

Feb 08, 2022

mm/memcg_memfs_info: show files that having pages charged in mem_cgroup · add8fd4e

Liu Shixin authored 3 years ago

hulk inclusion
category: feature
bugzilla: 186182, https://gitee.com/openeuler/kernel/issues/I4SBQX


CVE: NA

--------------------------------

Support to print rootfs files and tmpfs files that having pages charged
in given memory cgroup. The files infomations can be printed through
interface "memory.memfs_files_info" or printed when OOM is triggered.

In order not to flush memory logs, we limit the maximum number of files
to be printed when oom through interface "max_print_files_in_oom". And
in order to filter out small files, we limit the minimum size of files
that can be printed through interface "size_threshold".

Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

add8fd4e

ext4: fix e2fsprogs checksum failure for mounted filesystem · bc65936d

Jan Kara authored 3 years ago


mainline inclusion
from mainline-v5.15-rc1
commit b2bbb92f7042e8075fb036bf97043339576330c3
category: bugfix
bugzilla: 186203
CVE: NA

--------------------------------

Commit 81414b4dd48 ("ext4: remove redundant sb checksum
recomputation") removed checksum recalculation after updating
superblock free space / inode counters in ext4_fill_super() based on
the fact that we will recalculate the checksum on superblock
writeout.

That is correct assumption but until the writeout happens (which can
take a long time) the checksum is incorrect in the buffer cache and if
programs such as tune2fs or resize2fs is called shortly after a file
system is mounted can fail.  So return back the checksum recalculation
and add a comment explaining why.

Fixes: 81414b4dd48f ("ext4: remove redundant sb checksum recomputation")
Cc: stable@kernel.org
Reported-by: Boyang Xue <bxue@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20210812124737.21981-1-jack@suse.cz


Signed-off-by: Zheng Liang <zhengliang6@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

bc65936d

Feb 07, 2022

drm/vmwgfx: Fix stale file descriptors on failed usercopy · f6db3d87

Mathias Krause authored 3 years ago


stable inclusion
from linux-4.19.227
commit 0008a0c78fc33a84e2212a7c04e6b21a36ca6f4d
CVE: CVE-2022-22942

--------------------------------

commit a0f90c8815706981c483a652a6aefca51a5e191c upstream.

A failing usercopy of the fence_rep object will lead to a stale entry in
the file descriptor table as put_unused_fd() won't release it. This
enables userland to refer to a dangling 'file' object through that still
valid file descriptor, leading to all kinds of use-after-free
exploitation scenarios.

Fix this by deferring the call to fd_install() until after the usercopy
has succeeded.

Fixes: c906965d ("drm/vmwgfx: Add export fence to file descriptor support")
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
Signed-off-by: Zack Rusin <zackr@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

f6db3d87

Jan 30, 2022

perf vendor events amd: Fix broken L2 Cache Hits from L2 HWPF metric · b2e2d2c0

Smita Koralahalli authored 3 years ago

mainline inclusion
from mainline-v5.13-rc1
commit 86c2bc3da769124e3e856b6e9457be3667c30919
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4
CVE: NA

--------------------------------

Commit 08ed77e4 ("perf vendor events amd: Add recommended events")
added the hits event "L2 Cache Hits from L2 HWPF" with the same metric
expression as the accesses event "L2 Cache Accesses from L2 HWPF":

$ perf list --details
...
  l2_cache_accesses_from_l2_hwpf
     [L2 Cache Accesses from L2 HWPF]
     [l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3]
  l2_cache_hits_from_l2_hwpf
     [L2 Cache Hits from L2 HWPF]
     [l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3]
...

This was wrong and led to counting hits the same as accesses. Section
2.1.15.2 "Performance Measurement" of "PPR for AMD Family 17h Model 31h
B0 - 55803 Rev 0.54 - Sep 12, 2019", documents the hits event with
EventCode 0x70 which is the same as l2_pf_hit_l2.

Fix this, and massage the description for l2_pf_hit_l2 as the hits event
is now the duplicate of l2_pf_hit_l2. AMD recommends using the recommended
event over other events if the duplicate exists and maintain both for
consistency. Hence, l2_cache_hits_from_l2_hwpf should override
l2_pf_hit_l2.

Before:

 # perf stat -M l2_cache_accesses_from_l2_hwpf,l2_cache_hits_from_l2_hwpf sleep 1

 Performance counter stats for 'sleep 1':

             1,436      l2_pf_miss_l2_l3          # 11114.00 l2_cache_accesses_from_l2_hwpf
                                                  # 11114.00 l2_cache_hits_from_l2_hwpf
             4,482      l2_pf_hit_l2
             5,196      l2_pf_miss_l2_hit_l3

       1.001765339 seconds time elapsed

After:

 # perf stat -M l2_cache_accesses_from_l2_hwpf sleep 1

 Performance counter stats for 'sleep 1':

             1,477      l2_pf_miss_l2_l3          # 10442.00 l2_cache_accesses_from_l2_hwpf
             3,978      l2_pf_hit_l2
             4,987      l2_pf_miss_l2_hit_l3

       1.001491186 seconds time elapsed

 # perf stat -e l2_cache_hits_from_l2_hwpf sleep 1

 Performance counter stats for 'sleep 1':

             3,983      l2_cache_hits_from_l2_hwpf

       1.001329970 seconds time elapsed

Note the difference in performance counter values for the accesses
versus the hits after the fix, and the hits event now counting the same
as l2_pf_hit_l2.

Fixes: 08ed77e4 ("perf vendor events amd: Add recommended events")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206537


Reviewed-by: Robert Richter <rrichter@amd.com>
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org> # On a 3900X
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vijay Thakkar <vijaythakkar@me.com>
Cc: linux-perf-users@vger.kernel.org
Link: https://lore.kernel.org/r/20210406215944.113332-2-Smita.KoralahalliChannabasappa@amd.com


Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

b2e2d2c0

perf vendor events amd: Add recommended events · 5afca3af

Kim Phillips authored 3 years ago

mainline inclusion
from mainline-v5.10-rc1
commit 08ed77e4
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4
CVE: NA

--------------------------------

Add support for events listed in Section 2.1.15.2 "Performance
Measurement" of "PPR for AMD Family 17h Model 31h B0 - 55803
Rev 0.54 - Sep 12, 2019".

perf now supports these new events (-e):

  all_dc_accesses
  all_tlbs_flushed
  l1_dtlb_misses
  l2_cache_accesses_from_dc_misses
  l2_cache_accesses_from_ic_misses
  l2_cache_hits_from_dc_misses
  l2_cache_hits_from_ic_misses
  l2_cache_misses_from_dc_misses
  l2_cache_misses_from_ic_miss
  l2_dtlb_misses
  l2_itlb_misses
  sse_avx_stalls
  uops_dispatched
  uops_retired
  l3_accesses
  l3_misses

and these metrics (-M):

  branch_misprediction_ratio
  all_l2_cache_accesses
  all_l2_cache_hits
  all_l2_cache_misses
  ic_fetch_miss_ratio
  l2_cache_accesses_from_l2_hwpf
  l2_cache_hits_from_l2_hwpf
  l2_cache_misses_from_l2_hwpf
  l3_read_miss_latency
  l1_itlb_misses
  all_remote_links_outbound
  nps1_die_to_dram

The nps1_die_to_dram event may need perf stat's --metric-no-group
switch if the number of available data fabric counters is less
than the number it uses (8).

Committer testing:

On a AMD Ryzen 3900x system:

Before:

  # perf list all_dc_accesses   all_tlbs_flushed   l1_dtlb_misses   l2_cache_accesses_from_dc_misses   l2_cache_accesses_from_ic_misses   l2_cache_hits_from_dc_misses   l2_cache_hits_from_ic_misses   l2_cache_misses_from_dc_misses   l2_cache_misses_from_ic_miss   l2_dtlb_misses   l2_itlb_misses   sse_avx_stalls   uops_dispatched   uops_retired   l3_accesses   l3_misses | grep -v "^Metric Groups:$" | grep -v "^$"
  #

After:

  # perf list all_dc_accesses   all_tlbs_flushed   l1_dtlb_misses   l2_cache_accesses_from_dc_misses   l2_cache_accesses_from_ic_misses   l2_cache_hits_from_dc_misses   l2_cache_hits_from_ic_misses   l2_cache_misses_from_dc_misses   l2_cache_misses_from_ic_miss   l2_dtlb_misses   l2_itlb_misses   sse_avx_stalls   uops_dispatched   uops_retired   l3_accesses   l3_misses | grep -v "^Metric Groups:$" | grep -v "^$" | grep -v "^recommended:$"
  all_dc_accesses
       [All L1 Data Cache Accesses]
  all_tlbs_flushed
       [All TLBs Flushed]
  l1_dtlb_misses
       [L1 DTLB Misses]
  l2_cache_accesses_from_dc_misses
       [L2 Cache Accesses from L1 Data Cache Misses (including prefetch)]
  l2_cache_accesses_from_ic_misses
       [L2 Cache Accesses from L1 Instruction Cache Misses (including
        prefetch)]
  l2_cache_hits_from_dc_misses
       [L2 Cache Hits from L1 Data Cache Misses]
  l2_cache_hits_from_ic_misses
       [L2 Cache Hits from L1 Instruction Cache Misses]
  l2_cache_misses_from_dc_misses
       [L2 Cache Misses from L1 Data Cache Misses]
  l2_cache_misses_from_ic_miss
       [L2 Cache Misses from L1 Instruction Cache Misses]
  l2_dtlb_misses
       [L2 DTLB Misses & Data page walks]
  l2_itlb_misses
       [L2 ITLB Misses & Instruction page walks]
  sse_avx_stalls
       [Mixed SSE/AVX Stalls]
  uops_dispatched
       [Micro-ops Dispatched]
  uops_retired
       [Micro-ops Retired]
  l3_accesses
       [L3 Accesses. Unit: amd_l3]
  l3_misses
       [L3 Misses (includes Chg2X). Unit: amd_l3]
  #

  # perf stat -a -e all_dc_accesses,all_tlbs_flushed,l1_dtlb_misses,l2_cache_accesses_from_dc_misses,l2_cache_accesses_from_ic_misses,l2_cache_hits_from_dc_misses,l2_cache_hits_from_ic_misses,l2_cache_misses_from_dc_misses,l2_cache_misses_from_ic_miss,l2_dtlb_misses,l2_itlb_misses,sse_avx_stalls,uops_dispatched,uops_retired,l3_accesses,l3_misses sleep 2

   Performance counter stats for 'system wide':

       433,439,949      all_dc_accesses                                               (35.66%)
               443      all_tlbs_flushed                                              (35.66%)
         2,985,885      l1_dtlb_misses                                                (35.66%)
        18,318,019      l2_cache_accesses_from_dc_misses                                     (35.68%)
        50,114,810      l2_cache_accesses_from_ic_misses                                     (35.72%)
        12,423,978      l2_cache_hits_from_dc_misses                                     (35.74%)
        40,703,103      l2_cache_hits_from_ic_misses                                     (35.74%)
         6,698,673      l2_cache_misses_from_dc_misses                                     (35.74%)
        12,090,892      l2_cache_misses_from_ic_miss                                     (35.74%)
           614,267      l2_dtlb_misses                                                (35.74%)
           216,036      l2_itlb_misses                                                (35.74%)
            11,977      sse_avx_stalls                                                (35.74%)
       999,276,223      uops_dispatched                                               (35.73%)
     1,075,311,620      uops_retired                                                  (35.69%)
         1,420,763      l3_accesses
           540,164      l3_misses

       2.002344121 seconds time elapsed

  # perf stat -a -e all_dc_accesses,all_tlbs_flushed,l1_dtlb_misses,l2_cache_accesses_from_dc_misses,l2_cache_accesses_from_ic_misses sleep 2

   Performance counter stats for 'system wide':

       175,943,104      all_dc_accesses
               310      all_tlbs_flushed
         2,280,359      l1_dtlb_misses
        11,700,151      l2_cache_accesses_from_dc_misses
        25,414,963      l2_cache_accesses_from_ic_misses

       2.001957818 seconds time elapsed

  #

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537


Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Jon Grimm <jon.grimm@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Jambor <mjambor@suse.cz>
Cc: Martin Liška <mliska@suse.cz>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vijay Thakkar <vijaythakkar@me.com>
Cc: William Cohen <wcohen@redhat.com>
Cc: Yunfeng Ye <yeyunfeng@huawei.com>
Link: http://lore.kernel.org/lkml/20200901220944.277505-3-kim.phillips@amd.com


Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

5afca3af

perf vendor events amd: Add L2 Prefetch events for zen1 · 8f88e25b

Kim Phillips authored 3 years ago

mainline inclusion
from mainline-v5.10-rc1
commit 60d80452
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4
CVE: NA

--------------------------------

Later revisions of PPRs that post-date the original Family 17h events
submission patch add these events.

Specifically, they were not in this 2017 revision of the F17h PPR:

Processor Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1 Processors Rev 1.14 - April 15, 2017

But e.g., are included in this 2019 version of the PPR:

Processor Programming Reference (PPR) for AMD Family 17h Model 18h, Revision B1 Processors Rev. 3.14 - Sep 26, 2019

Fixes: 98c07a8f ("perf vendor events amd: perf PMU events for AMD Family 17h")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537


Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Jon Grimm <jon.grimm@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Jambor <mjambor@suse.cz>
Cc: Martin Liška <mliska@suse.cz>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Cc: Stephane Eranian <eranian@google.com>
Cc: Vijay Thakkar <vijaythakkar@me.com>
Cc: William Cohen <wcohen@redhat.com>
Cc: Yunfeng Ye <yeyunfeng@huawei.com>
Link: http://lore.kernel.org/lkml/20200901220944.277505-1-kim.phillips@amd.com


Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

8f88e25b

perf/amd/uncore: Fix sysfs type mismatch · e7957898

Nathan Chancellor authored 3 years ago

mainline inclusion
from mainline-v5.13-rc1
commit 5deac80d4571dffb51f452f0027979d72259a1b9
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

dev_attr_show() calls the __uncore_*_show() functions via an indirect
call but their type does not currently match the type of the show()
member in 'struct device_attribute', resulting in a Control Flow
Integrity violation.

$ cat /sys/devices/amd_l3/format/umask
config:8-15

$ dmesg | grep "CFI failure"
[ 1258.174653] CFI failure (target: __uncore_umask_show...):

Update the type in the DEFINE_UNCORE_FORMAT_ATTR macro to match
'struct device_attribute' so that there is no more CFI violation.

Fixes: 06f2c245 ("perf/amd/uncore: Prepare to scale for more attributes that vary per family")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210415001112.3024673-2-nathan@kernel.org


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

e7957898

perf/x86/amd: Don't touch the AMD64_EVENTSEL_HOSTONLY bit inside the guest · ddae1ff6

Like Xu authored 3 years ago

mainline inclusion
from mainline-v5.14-rc5
commit df51fe7ea1c1c2c3bfdb81279712fdd2e4ea6c27
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

If we use "perf record" in an AMD Milan guest, dmesg reports a #GP
warning from an unchecked MSR access error on MSR_F15H_PERF_CTLx:

  [] unchecked MSR access error: WRMSR to 0xc0010200 (tried to write 0x0000020000110076) at rIP: 0xffffffff8106ddb4 (native_write_msr+0x4/0x20)
  [] Call Trace:
  []  amd_pmu_disable_event+0x22/0x90
  []  x86_pmu_stop+0x4c/0xa0
  []  x86_pmu_del+0x3a/0x140

The AMD64_EVENTSEL_HOSTONLY bit is defined and used on the host,
while the guest perf driver should avoid such use.

Fixes: 1018faa6 ("perf/x86/kvm: Fix Host-Only/Guest-Only counting with SVM disabled")
Signed-off-by: Like Xu <likexu@tencent.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Tested-by: Kim Phillips <kim.phillips@amd.com>
Tested-by: Liam Merwick <liam.merwick@oracle.com>
Link: https://lkml.kernel.org/r/20210802070850.35295-1-likexu@tencent.com


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

ddae1ff6

tools/power turbostat: Support AMD Family 19h · 4cbf650e

Kim Phillips authored 3 years ago

mainline inclusion
from mainline-v5.10-rc4
commit 33eb8225
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

Family 19h processors have the same RAPL (Running average power limit)
hardware register interface as Family 17h processors.

Change the family checks to succeed for Family 17h and above to enable
core and package energy measurement on Family 19h machines.

Also update the TDP to the largest found at the bottom of the page at
amd.com->processors->servers->epyc->2nd-gen-epyc, i.e., the EPYC 7H12.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Len Brown <lenb@kernel.org>
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Acked-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

4cbf650e

perf/x86/amd/ibs: Support 27-bit extended Op/cycle counter · 069f01ef

Kim Phillips authored 3 years ago

mainline inclusion
from mainline-v5.10-rc1
commit 8b0bed7d
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

IBS hardware with the OpCntExt feature gets a 7-bit wider internal
counter.  Both the maximum and current count bitfields in the
IBS_OP_CTL register are extended to support reading and writing it.

No changes are necessary to the driver for handling the extra
contiguous current count bits (IbsOpCurCnt), as the driver already
passes through 32 bits of that field.  However, the driver has to do
some extra bit manipulation when converting from a period to the
non-contiguous (although conveniently aligned) extra bits in the
IbsOpMaxCnt bitfield.

This decreases IBS Op interrupt overhead when the period is over
1,048,560 (0xffff0), which would previously activate the driver's
software counter.  That threshold is now 134,217,712 (0x7fffff0).

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200908214740.18097-7-kim.phillips@amd.com


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

069f01ef

perf vendor events amd: Enable Family 19h users by matching Zen2 events · 07ee3394

Kim Phillips authored 3 years ago

mainline inclusion
from mainline-v5.10-rc1
commit 09b54b30
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

This enables zen3 users by reusing mostly-compatible zen2 events
until the official public list of zen3 events is published in a
future PPR.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Jon Grimm <jon.grimm@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Jambor <mjambor@suse.cz>
Cc: Martin Liška <mliska@suse.cz>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vijay Thakkar <vijaythakkar@me.com>
Cc: William Cohen <wcohen@redhat.com>
Cc: Yunfeng Ye <yeyunfeng@huawei.com>
Link: http://lore.kernel.org/lkml/20200901220944.277505-4-kim.phillips@amd.com


Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

07ee3394

perf vendor events amd: Update Zen1 events to V2 · 35890547

Vijay Thakkar authored 3 years ago

mainline inclusion
from mainline-v5.7-rc1
commit b5b8a7cf
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4
CVE: NA

--------------------------------

This patch updates the PMCs for AMD Zen1 core based processors (Family
17h; Models 0 through 2F) to be in accordance with PMCs as
documented in the latest versions of the AMD Processor Programming
Reference [1], [2] and [3]. Note that some events, such as FPU pipe
assignment are missing in [1], and therefore [3] is included for full
coverage of events.

PMCs added:

  fpu_pipe_assignment.dual{0|1|2|3}
  fpu_pipe_assignment.total{0|1|2|3}
  ls_mab_alloc.dc_prefetcher
  ls_mab_alloc.stores
  ls_mab_alloc.loads
  bp_dyn_ind_pred
  bp_de_redirect

PMC removed:

  ex_ret_cond_misp

Cumulative counts, fpu_pipe_assignment.total and
fpu_pipe_assignment.dual, existed in v1, but did expose port-level
counters.

ex_ret_cond_misp has been removed as it has been removed from the latest
versions of the PPR, and when tested, always seems to sample zero as
tested on a Ryzen 3400G system.

[1]: Processor Programming Reference (PPR) for AMD Family 17h Models
01h,08h, Revision B2 Processors, 54945 Rev 3.03 - Jun 14, 2019.

[2]: Processor Programming Reference (PPR) for AMD Family 17h Model 18h,
Revision B1 Processors, 55570-B1 Rev 3.14 - Sep 26, 2019.

[3]: OSRR for AMD Family 17h processors, Models 00h-2Fh, 56255 Rev 3.03 - July, 2018

All of the PPRs can be found at:
https://bugzilla.kernel.org/show_bug.cgi?id=206537



Signed-off-by: Vijay Thakkar <vijaythakkar@me.com>
Acked-by: Kim Phillips <kim.phillips@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jon Grimm <jon.grimm@amd.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: vijay thakkar <vijaythakkar@me.com>
Link: http://lore.kernel.org/lkml/20200318190002.307290-4-vijaythakkar@me.com


Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

35890547

perf vendor events amd: Add Zen2 events · fbfa535b

Vijay Thakkar authored 3 years ago

mainline inclusion
from mainline-v5.7-rc1
commit 2079f7aa
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4
CVE: NA

--------------------------------

This patch adds PMU events for AMD Zen2 core based processors, namely,
Matisse (model 71h), Castle Peak (model 31h) and Rome (model 2xh), as
documented in the AMD Processor Programming Reference for Matisse [1].
The model number regex has been set to detect all the models under
family 17 that do not match those of Zen1, as the range is larger for
zen2.

Zen2 adds some additional counters that are not present in Zen1 and
events for them have been added in this patch. Some counters have also
been removed for Zen2 thatwere previously present in Zen1 and have been
confirmed to always sample zero on zen2. These added/removed counters
have been omitted for brevity but can be found here:
https://gist.github.com/thakkarV/5b12ca5fd7488eb2c42e451e40bdd5f3

Note that PPR for Zen2 [1] does not include some counters that were
documented in the PPR for Zen1 based processors [2]. After having tested
these counters, some of them that still work for zen2 systems have been
preserved in the events for zen2. The counters that are omitted in [1]
but are still measurable and non-zero on zen2 (tested on a Ryzen 3900X
system) are the following:

  PMC 0x000 fpu_pipe_assignment.{total|total0|total1|total2|total3}
  PMC 0x004 fp_num_mov_elim_scal_op.*
  PMC 0x046 ls_tablewalker.*
  PMC 0x062 l2_latency.l2_cycles_waiting_on_fills
  PMC 0x063 l2_wcb_req.*
  PMC 0x06D l2_fill_pending.l2_fill_busy
  PMC 0x080 ic_fw32
  PMC 0x081 ic_fw32_miss
  PMC 0x086 bp_snp_re_sync
  PMC 0x087 ic_fetch_stall.*
  PMC 0x08C ic_cache_inval.*
  PMC 0x099 bp_tlb_rel
  PMC 0x0C7 ex_ret_brn_resync
  PMC 0x28A ic_oc_mode_switch.*
  L3PMC 0x001 l3_request_g1.*
  L3PMC 0x006 l3_comb_clstr_state.*

[1]: Processor Programming Reference (PPR) for AMD Family 17h Model 71h,
Revision B0 Processors, 56176 Rev 3.06 - Jul 17, 2019

[2]: Processor Programming Reference (PPR) for AMD Family 17h Models
01h,08h, Revision B2 Processors, 54945 Rev 3.03 - Jun 14, 2019

All of the PPRs can be found at:

https://bugzilla.kernel.org/show_bug.cgi?id=206537



Here are the results of running "fpu_pipe_assignment.total" events on my
Ryzen 3900X family 17h model 71h system:

Before this patch:

  $> perf list *fpu_pipe_assignment*

List of pre-defined events (to be used in -e):

After:

  $> perf list *fpu_pipe_assignment*

  floating point:
  fpu_pipe_assignment.total
      [Total number of fp uOps]
  fpu_pipe_assignment.total0
      [Total number uOps assigned to pipe 0]
  fpu_pipe_assignment.total1
      [Total number uOps assigned to pipe 1]
  fpu_pipe_assignment.total2
      [Total number uOps assigned to pipe 2]
  fpu_pipe_assignment.total3
      [Total number uOps assigned to pipe 3]

  Metric Groups:

  $> perf stat -e fpu_pipe_assignment.total sleep 1

  Performance counter stats for 'sleep 1':

              25,883      fpu_pipe_assignment.total

         1.004145868 seconds time elapsed

         0.001805000 seconds user
         0.000000000 seconds sys

Usage tests while running Linpackin the background:

  $> perf stat -I1000 -e fpu_pipe_assignment.total
       1.000266796     79,313,191,516      fpu_pipe_assignment.total
       2.000809630     68,091,474,430      fpu_pipe_assignment.total
       3.001028115     52,925,023,174      fpu_pipe_assignment.total

  $> perf record -e fpu_pipe_assignment.total,fpu_pipe_assignment.total0 -a sleep 1
  [ perf record: Woken up 9 times to write data ]
  [ perf record: Captured and wrote 4.031 MB perf.data (64764 samples) ]

  $> perf report --stdio --no-header | head -30
      98.33%  xhpl             xhpl                          [.] dgemm_kernel
       0.28%  xhpl             xhpl                          [.] dtrsm_kernel_LT
       0.10%  xhpl             [kernel.kallsyms]             [k] entry_SYSCALL_64
       0.08%  xhpl             xhpl                          [.] idamax_k
       0.07%  baloo_file_extr  liblmdb.so                    [.] mdb_mid2l_insert
       0.06%  xhpl             xhpl                          [.] dgemm_itcopy
       0.06%  xhpl             xhpl                          [.] dgemm_oncopy
       0.06%  xhpl             [kernel.kallsyms]             [k] __schedule
       0.06%  xhpl             [kernel.kallsyms]             [k] syscall_trace_enter
       0.06%  xhpl             [kernel.kallsyms]             [k] native_sched_clock
       0.06%  xhpl             [kernel.kallsyms]             [k] pick_next_task_fair
       0.05%  xhpl             xhpl                          [.] blas_thread_server.llvm.15009391670273914865
       0.04%  xhpl             [kernel.kallsyms]             [k] do_syscall_64
       0.04%  xhpl             [kernel.kallsyms]             [k] yield_task_fair
       0.04%  xhpl             libpthread-2.31.so            [.] __pthread_mutex_unlock_usercnt
       0.03%  xhpl             [kernel.kallsyms]             [k] cpuacct_charge
       0.03%  xhpl             [kernel.kallsyms]             [k] syscall_return_via_sysret
       0.03%  xhpl             libc-2.31.so                  [.] __sched_yield
       0.03%  xhpl             [kernel.kallsyms]             [k] __calc_delta

  $> perf annotate --stdio2 dgemm_kernel | egrep '^ {0,2}[0-9]+' -B2 -A2
                  sub          $0x60,%rsp
                  mov          %rbx,(%rsp)
    0.00          mov          %rbp,0x8(%rsp)
                  mov          %r12,0x10(%rsp)
    0.00          mov          %r13,0x18(%rsp)
                  mov          %r14,0x20(%rsp)
                  mov          %r15,0x28(%rsp)
  --
                  mov          %rdi,%r13
                  mov          %rsi,0x28(%rsp)
    0.00          mov          %rdx,%r12
                  vmovsd       %xmm0,0x30(%rsp)
                  shl          $0x3,%r10
                  mov          0x28(%rsp),%rax
    0.00          xor          %rdx,%rdx
                  mov          $0x18,%rdi
                  div          %rdi
  --
                  nop
            a0:   mov          %r12,%rax
    0.00          shl          $0x3,%rax
                  mov          %r8,%rdi
                  lea          (%r8,%rax,8),%r15
  --
                  mov          %r12,%rax
                  nop
    0.00    c0:   vmovups      (%rdi),%ymm1
    0.09          vmovups      0x20(%rdi),%ymm2
    0.02          vmovups      (%r15),%ymm3
    0.10          vmovups      %ymm1,(%rsi)
    0.07          vmovups      %ymm2,0x20(%rsi)
    0.07          vmovups      %ymm3,0x40(%rsi)
    0.06          add          $0x40,%rdi
                  add          $0x40,%r15
                  add          $0x60,%rsi
    0.00          dec          %rax
                ↑ jne          c0
                  mov          %r9,%r15
  --
                  nop
           110:   lea          0x80(%rsp),%rsi
    0.01          add          $0x60,%rsi
    0.03          mov          %r12,%rax
    0.00          sar          $0x3,%rax
                  cmp          $0x2,%rax
                ↓ jl           d26
                  prefetcht0   0x200(%rdi)
    0.01          vmovups      -0x60(%rsi),%ymm1
    0.02          prefetcht0   0xa0(%rsi)
    0.00          vbroadcastsd -0x80(%rdi),%ymm0
    0.00          prefetcht0   0xe0(%rsi)
    0.03          vmovups      -0x40(%rsi),%ymm2
    0.00          prefetcht0   0x120(%rsi)
                  vmovups      -0x20(%rsi),%ymm3
                  vmulpd       %ymm0,%ymm1,%ymm4
    0.01          prefetcht0   0x160(%rsi)
                  vmulpd       %ymm0,%ymm2,%ymm8
    0.01          vmulpd       %ymm0,%ymm3,%ymm12
    0.02          prefetcht0   0x1a0(%rsi)
    0.01          vbroadcastsd -0x78(%rdi),%ymm0
                  vmulpd       %ymm0,%ymm1,%ymm5
    0.01          vmulpd       %ymm0,%ymm2,%ymm9
                  vmulpd       %ymm0,%ymm3,%ymm13
    0.01          vbroadcastsd -0x70(%rdi),%ymm0
                  vmulpd       %ymm0,%ymm1,%ymm6
    0.00          vmulpd       %ymm0,%ymm2,%ymm10
    0.00          add          $0x60,%rsi

  ... snip ...

                  nop
          65e0:   vmovddup     -0x60(%rsi),%xmm2
    0.00          vmovups      -0x80(%rdi),%xmm0
                  vmovups      -0x70(%rdi),%xmm1
    0.00          vmovddup     -0x58(%rsi),%xmm3
                  vfmadd231pd  %xmm0,%xmm2,%xmm4
    0.00          vfmadd231pd  %xmm1,%xmm2,%xmm5
    0.00          vfmadd231pd  %xmm0,%xmm3,%xmm6
    0.00          vfmadd231pd  %xmm1,%xmm3,%xmm7
    0.00          add          $0x10,%rsi
                  add          $0x20,%rdi
    0.00          dec          %rax
                ↑ jne          65e0
                  nop
                  nop
          6620:   vmovddup     0x30(%rsp),%xmm0
    0.00          vmulpd       %xmm0,%xmm4,%xmm4
    0.00          vmulpd       %xmm0,%xmm5,%xmm5
                  vmulpd       %xmm0,%xmm6,%xmm6
                  vmulpd       %xmm0,%xmm7,%xmm7
                  vaddpd       (%r15),%xmm4,%xmm4
                  vaddpd       0x10(%r15),%xmm5,%xmm5
    0.00          vaddpd       (%r15,%r10,1),%xmm6,%xmm6
    0.00          vaddpd       0x10(%r15,%r10,1),%xmm7,%xmm7
    0.00          vmovups      %xmm4,(%r15)
                  vmovups      %xmm5,0x10(%r15)
    0.00          vmovups      %xmm6,(%r15,%r10,1)
                  vmovups      %xmm7,0x10(%r15,%r10,1)
                  add          $0x20,%r15
  --
                  lea          (%r8,%rax,8),%r8
          69d8:   mov          0x20(%rsp),%r14
    0.00          test         $0x1,%r14
                ↓ je           6d84
                  mov          %r9,%r15
  --
                  vbroadcastsd -0x28(%rsi),%ymm3
                  vfmadd231pd  (%rdi),%ymm0,%ymm4
    0.00          vfmadd231pd  0x20(%rdi),%ymm1,%ymm5
                  vfmadd231pd  0x40(%rdi),%ymm2,%ymm6
                  vfmadd231pd  0x60(%rdi),%ymm3,%ymm7
  --
                  vmulpd       %ymm0,%ymm4,%ymm4
                  vaddpd       (%r15),%ymm4,%ymm4
    0.00          vmovups      %ymm4,(%r15)
                  add          $0x20,%r15
                  dec          %r11
  --
                  mov          %rbx,%rsp
                  mov          (%rsp),%rbx
    0.01          mov          0x8(%rsp),%rbp
                  mov          0x10(%rsp),%r12
                  mov          0x18(%rsp),%r13

Signed-off-by: Vijay Thakkar <vijaythakkar@me.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Kim Phillips <kim.phillips@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jon Grimm <jon.grimm@amd.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200318190002.307290-3-vijaythakkar@me.com


Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

fbfa535b

perf vendor events amd: Restrict model detection for zen1 based processors · 99776eba

Vijay Thakkar authored 3 years ago

mainline inclusion
from mainline-v5.7-rc1
commit c5f18e9e
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

This patch changes the previous blanket detection of AMD Family 17h
processors to be more specific to Zen1 core based products only by
replacing model detection regex pattern [[:xdigit:]]+ with
([12][0-9A-F]|[0-9A-F]), restricting to models 0 though 2f only.

This change is required to allow for the addition of separate PMU events
for Zen2 core based models in the following patches as those belong to
family 17h but have different PMCs. Current PMU events directory has
also been renamed to "amdzen1" from "amdfam17h" to reflect this
specificity.

Note that although this change does not break PMU counters for existing
zen1 based systems, it does disable the current set of counters for zen2
based systems. Counters for zen2 have been added in the following
patches in this patchset.

Signed-off-by: Vijay Thakkar <vijaythakkar@me.com>
Acked-by: Kim Phillips <kim.phillips@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jon Grimm <jon.grimm@amd.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20200318190002.307290-2-vijaythakkar@me.com


Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

99776eba

perf vendor events amd: Remove redundant '[' · 0f2753af

Kim Phillips authored 3 years ago

mainline inclusion
from mainline-v5.4-rc1
commit 0c03d3aa
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

Remove the redundant '['.

'perf list' output before:

  ex_ret_brn
       [[Retired Branch Instructions]

'perf list' output after:

  ex_ret_brn
       [Retired Branch Instructions]

Fixes: 98c07a8f ("perf vendor events amd: perf PMU events for AMD Family 17h")
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Janakarajan Natarajan <janakarajan.natarajan@amd.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Luke Mujica <lukemujica@google.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20190919204306.12598-2-kim.phillips@amd.com


Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

0f2753af

perf vendor events intel: Add Tremontx event file v1.02 · 80b409d7

Haiyan Song authored 3 years ago

mainline inclusion
from mainline-v5.4-rc1
commit 11e54d35
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

Add a Intel event file for perf.

Signed-off-by: Haiyan Song <haiyanx.song@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190815035942.30602-1-haiyanx.song@intel.com


Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

80b409d7

perf vendor events intel: Add Icelake V1.00 event file · 78fa05c7

Haiyan Song authored 3 years ago

mainline inclusion
from mainline-v5.4-rc1
commit b115df07
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

Add a Intel event file for perf.

Signed-off-by: Haiyan Song <haiyanx.song@intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/8859095e-5b02-d6b7-fbdc-3f42b714bae0@intel.com


Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

78fa05c7

perf vendor events amd: Add L3 cache events for Family 17h · ad5c6ca2

Kim Phillips authored 3 years ago

mainline inclusion
from mainline-v5.4-rc1
commit faef8749
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4
CVE: NA

--------------------------------

Allow users to symbolically specify L3 events for Family 17h processors
using the existing AMD Uncore driver.

Source of events descriptions are from section 2.1.15.4.1 "L3 Cache PMC
Events" of the latest Family 17h PPR, available here:

  https://www.amd.com/system/files/TechDocs/55570-B1_PUB.zip



Opnly BriefDescriptions added, since they show with and without
the -v and --details flags.

Tested with:

 # perf stat -e l3_request_g1.caching_l3_cache_accesses,amd_l3/event=0x01,umask=0x80/,l3_comb_clstr_state.request_miss,amd_l3/event=0x06,umask=0x01/ perf bench mem memcpy -s 4mb -l 100 -f default
...
         7,006,831      l3_request_g1.caching_l3_cache_accesses
         7,006,830      amd_l3/event=0x01,umask=0x80/
           366,530      l3_comb_clstr_state.request_miss
           366,568      amd_l3/event=0x06,umask=0x01/

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Janakarajan Natarajan <janakarajan.natarajan@amd.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Luke Mujica <lukemujica@google.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20190919204306.12598-1-kim.phillips@amd.com


Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

ad5c6ca2

perf vendor events intel: Add uncore_upi JSON support · 9cea22ef

Kan Liang authored 3 years ago

mainline inclusion
from mainline-v5.2-rc1
commit bf6d18cf
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4
CVE: NA

--------------------------------

Perf cannot parse UPI (Intel's "Ultra Path Interconnect" [1]) events.

    # perf stat -e UPI_DATA_BANDWIDTH_TX
    event syntax error: 'UPI_DATA_BANDWIDTH_TX'
                     \___ parser error
    Run 'perf list' for a list of valid events

The JSON lists call the box UPI LL, while perf calls it upi.  Add
conversion support to JSON to convert the unit properly.

Committer notes:

[1] https://en.wikipedia.org/wiki/Intel_Ultra_Path_Interconnect



"The Intel Ultra Path Interconnect (UPI) is a point-to-point processor
interconnect developed by Intel which replaced the Intel QuickPath
Interconnect (QPI) in Xeon Skylake-SP platforms starting in 2017.

UPI is a low-latency coherent interconnect for scalable multiprocessor
systems with a shared address space. It uses a directory-based home
snoop coherency protocol with a transfer speed of up to 10.4 GT/s.
Supporting processors typically have two or three UPI links."

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1557234991-130456-1-git-send-email-kan.liang@linux.intel.com


Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

9cea22ef

perf vendor events amd: perf PMU events for AMD Family 17h · 5468408d

Martin Liška authored 3 years ago

mainline inclusion
from mainline-v5.1-rc2
commit 98c07a8f
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4
CVE: NA

--------------------------------

Thi patch adds PMC events for AMD Family 17 CPUs as defined in [1].  It
covers events described in section: 2.1.13. Regex pattern in mapfile.csv
covers all CPUs of the family.

[1] https://support.amd.com/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf



Signed-off-by: Martin Liška <mliska@suse.cz>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jon Grimm <jon.grimm@amd.com>
Cc: Martin Jambor <mjambor@suse.cz>
Cc: William Cohen <wcohen@redhat.com>
Link: https://lkml.kernel.org/r/d65873ca-e402-b198-4fe9-8c4af81258c8@suse.cz


Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

5468408d

perf/amd/uncore: Allow F19h user coreid, threadmask, and sliceid specification · 4c3c1c15

Kim Phillips authored 3 years ago

mainline inclusion
from mainline-v5.10-rc1
commit 87a54a1f
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

On Family 19h, the driver checks for a populated 2-bit threadmask in
order to establish that the user wants to measure individual slices,
individual cores (only one can be measured at a time), and lets
the user also directly specify enallcores and/or enallslices if
desired.

Example F19h invocation to measure L3 accesses (event 4, umask 0xff)
by the first thread (id 0 -> mask 0x1) of the first core (id 0) on the
first slice (id 0):

perf stat -a -e instructions,amd_l3/umask=0xff,event=0x4,coreid=0,threadmask=1,sliceid=0,enallcores=0,enallslices=0/ <workload>

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200921144330.6331-4-kim.phillips@amd.com


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

4c3c1c15

perf/amd/uncore: Allow F17h user threadmask and slicemask specification · ed705649

Kim Phillips authored 3 years ago

mainline inclusion
from mainline-v5.10-rc1
commit 8170f386
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

Continue to fully populate either one of threadmask or slicemask if the
user doesn't.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200921144330.6331-3-kim.phillips@amd.com


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

ed705649

perf/amd/uncore: Prepare to scale for more attributes that vary per family · 2651d616

Kim Phillips authored 3 years ago

mainline inclusion
from mainline-v5.10-rc1
commit 06f2c245
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

Replace AMD_FORMAT_ATTR with the more apropos DEFINE_UNCORE_FORMAT_ATTR
stolen from arch/x86/events/intel/uncore.h.  This way we can clearly
see the bit-variants of each of the attributes that want to have
the same name across families.

Also unroll AMD_ATTRIBUTE because we are going to separately add
new attributes that differ between DF and L3.

Also clean up the if-Family 17h-else logic in amd_uncore_init.

This is basically a rewrite of commit da6adaea
("perf/x86/amd/uncore: Update sysfs attributes for Family17h processors").

No functional changes.

Tested F17h+ /sys/bus/event_source/devices/amd_{l3,df}/format/*
content remains unchanged:

/sys/bus/event_source/devices/amd_l3/format/event:config:0-7
/sys/bus/event_source/devices/amd_l3/format/umask:config:8-15
/sys/bus/event_source/devices/amd_df/format/event:config:0-7,32-35,59-60
/sys/bus/event_source/devices/amd_df/format/umask:config:8-15

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200921144330.6331-2-kim.phillips@amd.com


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

2651d616

perf/x86/amd/ibs: Don't include randomized bits in get_ibs_op_count() · 26f8ef67

Kim Phillips authored 3 years ago

mainline inclusion
from mainline-v5.10-rc1
commit 680d6963
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

get_ibs_op_count() adds hardware's current count (IbsOpCurCnt) bits
to its count regardless of hardware's valid status.

According to the PPR for AMD Family 17h Model 31h B0 55803 Rev 0.54,
if the counter rolls over, valid status is set, and the lower 7 bits
of IbsOpCurCnt are randomized by hardware.

Don't include those bits in the driver's event count.

Fixes: 8b1e1363 ("perf/x86-ibs: Fix usage of IBS op current count")
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

26f8ef67

perf/amd/uncore: Set all slices and threads to restore perf stat -a behaviour · e3357c60

Kim Phillips authored 3 years ago

mainline inclusion
from mainline-v5.10-rc1
commit c8fe99d0
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

Commit 2f217d58 ("perf/x86/amd/uncore: Set the thread mask for
F17h L3 PMCs") inadvertently changed the uncore driver's behaviour
wrt perf tool invocations with or without a CPU list, specified with
-C / --cpu=.

Change the behaviour of the driver to assume the former all-cpu (-a)
case, which is the more commonly desired default.  This fixes
'-a -A' invocations without explicit cpu lists (-C) to not count
L3 events only on behalf of the first thread of the first core
in the L3 domain.

BEFORE:

Activity performed by the first thread of the last core (CPU#43) in
CPU#40's L3 domain is not reported by CPU#40:

sudo perf stat -a -A -e l3_request_g1.caching_l3_cache_accesses taskset -c 43 perf bench mem memcpy -s 32mb -l 100 -f default
...
CPU36                 21,835      l3_request_g1.caching_l3_cache_accesses
CPU40                 87,066      l3_request_g1.caching_l3_cache_accesses
CPU44                 17,360      l3_request_g1.caching_l3_cache_accesses
...

AFTER:

The L3 domain activity is now reported by CPU#40:

sudo perf stat -a -A -e l3_request_g1.caching_l3_cache_accesses taskset -c 43 perf bench mem memcpy -s 32mb -l 100 -f default
...
CPU36                354,891      l3_request_g1.caching_l3_cache_accesses
CPU40              1,780,870      l3_request_g1.caching_l3_cache_accesses
CPU44                315,062      l3_request_g1.caching_l3_cache_accesses
...

Fixes: 2f217d58 ("perf/x86/amd/uncore: Set the thread mask for F17h L3 PMCs")
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200908214740.18097-2-kim.phillips@amd.com


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

e3357c60

perf/x86/amd/ibs: Fix raw sample data accumulation · d5082542

Kim Phillips authored 3 years ago

mainline inclusion
from mainline-v5.10-rc1
commit 36e1be8a
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

Neither IbsBrTarget nor OPDATA4 are populated in IBS Fetch mode.
Don't accumulate them into raw sample user data in that case.

Also, in Fetch mode, add saving the IBS Fetch Control Extended MSR.

Technically, there is an ABI change here with respect to the IBS raw
sample data format, but I don't see any perf driver version information
being included in perf.data file headers, but, existing users can detect
whether the size of the sample record has reduced by 8 bytes to
determine whether the IBS driver has this fix.

Fixes: 904cb367 ("perf/x86/amd/ibs: Update IBS MSRs and feature definitions")
Reported-by: Stephane Eranian <stephane.eranian@google.com>
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200908214740.18097-6-kim.phillips@amd.com


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

d5082542

arch/x86/amd/ibs: Fix re-arming IBS Fetch · 1076809f

Kim Phillips authored 3 years ago

mainline inclusion
from mainline-v5.10-rc1
commit 221bfce5
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

Stephane Eranian found a bug in that IBS' current Fetch counter was not
being reset when the driver would write the new value to clear it along
with the enable bit set, and found that adding an MSR write that would
first disable IBS Fetch would make IBS Fetch reset its current count.

Indeed, the PPR for AMD Family 17h Model 31h B0 55803 Rev 0.54 - Sep 12,
2019 states "The periodic fetch counter is set to IbsFetchCnt [...] when
IbsFetchEn is changed from 0 to 1."

Explicitly set IbsFetchEn to 0 and then to 1 when re-enabling IBS Fetch,
so the driver properly resets the internal counter to 0 and IBS
Fetch starts counting again.

A family 15h machine tested does not have this problem, and the extra
wrmsr is also not needed on Family 19h, so only do the extra wrmsr on
families 16h through 18h.

Reported-by: Stephane Eranian <stephane.eranian@google.com>
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
[peterz: optimized]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

1076809f

perf/amd/uncore: Add support for Family 19h L3 PMU · 17e4aec8

Kim Phillips authored 3 years ago

mainline inclusion
from mainline-v5.7-rc1
commit e48667b8
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

Family 19h introduces change in slice, core and thread specification in
its L3 Performance Event Select (ChL3PmcCfg) h/w register. The change is
incompatible with Family 17h's version of the register.

Introduce a new path in l3_thread_slice_mask() to do things differently
for Family 19h vs. Family 17h, otherwise the new hardware doesn't get
programmed correctly.

Instead of a linear core--thread bitmask, Family 19h takes an encoded
core number, and a separate thread mask. There are new bits that are set
for all cores and all slices, of which only the latter is used, since
the driver counts events for all slices on behalf of the specified CPU.

Also update amd_uncore_init() to base its L2/NB vs. L3/Data Fabric mode
decision based on Family 17h or above, not just 17h and 18h: the Family
19h Data Fabric PMC is compatible with the Family 17h DF PMC.

 [ bp: Touchups. ]

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200313231024.17601-3-kim.phillips@amd.com


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

17e4aec8

perf/amd/uncore: Make L3 thread mask code more readable · 5cc910d8

Kim Phillips authored 3 years ago

mainline inclusion
from mainline-v5.7-rc1
commit 9689dbbe
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

Convert the l3_thread_slice_mask() function to use the more readable
topology_* helper functions, more intuitive variable names like shift
and thread_mask, and BIT_ULL().

No functional changes.

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200313231024.17601-2-kim.phillips@amd.com


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

5cc910d8

perf/amd/uncore: Prepare L3 thread mask code for Family 19h · 2ec2ad36

Kim Phillips authored 3 years ago

mainline inclusion
from mainline-v5.7-rc1
commit 4dcc3df8
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

In order to better accommodate the upcoming Family 19h, given
the 80-char line limit, move the existing code into a new
l3_thread_slice_mask() function.

No functional changes.

 [ bp: Touchups. ]

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200313231024.17601-1-kim.phillips@amd.com


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

2ec2ad36

EDAC/amd64: Handle three rank interleaving mode · e8adddcd

Yazen Ghannam authored 3 years ago

mainline inclusion
from mainline-v5.16-rc1
commit 9f4873fb6af7966de8fcbd95c36b61351c1c4b1f
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

AMD Rome systems and later support interleaving between three identical
ranks within a channel.

Check for this mode by counting the number of enabled chip selects and
comparing their masks. If there are exactly three enabled chip selects
and their masks are identical, then three rank interleaving is enabled.

The size of a rank is determined from its mask value. However, three
rank interleaving doesn't follow the method of swapping an interleave
bit with the most significant bit. Rather, the interleave bit is flipped
and the most significant bit remains the same. There is only a single
interleave bit in this case.

Account for this when determining the chip select size by keeping the
most significant bit at its original value and ignoring any zero bits.
This will return a full bitmask in [MSB:1].

Fixes: e53a3b26 ("EDAC/amd64: Find Chip Select memory size using Address Mask")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211005154419.2060504-1-yazen.ghannam@amd.com


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

e8adddcd

EDAC/amd64: Add family ops for Family 19h Models 00h-0Fh · db6707c1

Yazen Ghannam authored 3 years ago

mainline inclusion
from mainline-v5.6-rc1
commit 2eb61c91
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

Add family ops to support AMD Family 19h systems. Existing Family 17h
functions can be used. Also, add Family 19h to the list of families to
automatically load the module.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200110015651.14887-5-Yazen.Ghannam@amd.com


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

db6707c1

EDAC/amd64: Save max number of controllers to family type · 3c8cccf3

Yazen Ghannam authored 3 years ago

mainline inclusion
from mainline-v5.5-rc1
commit 5e4c5527
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

The maximum number of memory controllers is fixed within a family/model
group. In most cases, this has been fixed at 2, but some systems may
have up to 8.

The struct amd64_family_type already contains family/model-specific
information, and this can be used rather than adding model checks to
various functions.

Create a new field in struct amd64_family_type for max_mcs.
Set this when setting other family type information, and use this when
needing the maximum number of memory controllers possible for a system.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Robert Richter <rrichter@marvell.com>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20191106012448.243970-4-Yazen.Ghannam@amd.com


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

3c8cccf3

EDAC/amd64: Gather hardware information early · 658c7f7d

Yazen Ghannam authored 3 years ago

mainline inclusion
from mainline-v5.5-rc1
commit 80355a3b
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

Split out gathering hardware information from init_one_instance()
into a separate function hw_info_get(). This is necessary so that
the information can be cached earlier and used to check if memory is
populated and if ECC is enabled on a node.

Also, define a function hw_info_put() to back out changes made in
hw_info_get().

Check for an allocated PCI device (Function 0 for Family 17h or Function
1 for pre-Family 17h) before freeing, since hw_info_put() may be called
before PCI siblings are reserved.

Drop the family check when freeing pvt->umc. This will be NULL on
pre-Family 17h systems. However, kfree() is safe and will check for a
NULL pointer before freeing.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Robert Richter <rrichter@marvell.com>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20191106012448.243970-3-Yazen.Ghannam@amd.com


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

658c7f7d

EDAC/amd64: Make struct amd64_family_type global · 13103e84

Yazen Ghannam authored 3 years ago

mainline inclusion
from mainline-v5.5-rc1
commit 38ddd4d1
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

The struct amd64_family_type doesn't change between multiple nodes and
instances of the module, so make it global.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Robert Richter <rrichter@marvell.com>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20191106012448.243970-2-Yazen.Ghannam@amd.com


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

13103e84

EDAC/amd64: Set grain per DIMM · 56dda048

Yazen Ghannam authored 3 years ago

mainline inclusion
from mainline-v5.5-rc1
commit 466503d6
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

The following commit introduced a warning on error reports without a
non-zero grain value.

  3724ace5 ("EDAC/mc: Fix grain_bits calculation")

The amd64_edac_mod module does not provide a value, so the warning will
be given on the first reported memory error.

Set the grain per DIMM to cacheline size (64 bytes). This is the current
recommendation.

Fixes: 3724ace5 ("EDAC/mc: Fix grain_bits calculation")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Robert Richter <rrichter@marvell.com>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20191022203448.13962-7-Yazen.Ghannam@amd.com


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

56dda048

EDAC/amd64: Support asymmetric dual-rank DIMMs · 5ad5255f

Yazen Ghannam authored 3 years ago

mainline inclusion
from mainline-v5.4-rc1
commit 81f5090d
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

Future AMD systems will support asymmetric dual-rank DIMMs. These are
DIMMs where the ranks are of different sizes.

The even rank will use the Primary Even Chip Select registers and the
odd rank will use the Secondary Odd Chip Select registers.

Recognize if a Secondary Odd Chip Select is being used. Use the
Secondary Odd Address Mask when calculating the chip select size.

 [ bp: move csrow_sec_enabled() to the header, fix CS_ODD define and
   tone-down the capitalized words spelling. ]

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20190821235938.118710-8-Yazen.Ghannam@amd.com


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

5ad5255f

EDAC/amd64: Cache secondary Chip Select registers · 5963b930

Yazen Ghannam authored 3 years ago

mainline inclusion
from mainline-v5.4-rc1
commit 7574729e
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

AMD Family 17h systems have a set of secondary Chip Select Base
Addresses and Address Masks. These do not represent unique Chip
Selects, rather they are used in conjunction with the primary
Chip Select registers in certain cases.

Cache these secondary Chip Select registers for future use.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20190821235938.118710-7-Yazen.Ghannam@amd.com


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

5963b930

EDAC/amd64: Add PCI device IDs for family 17h, model 70h · 0d5616c0

Isaac Vaughn authored 3 years ago

mainline inclusion
from mainline-v5.4-rc1
commit 3e443eb3
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

Add the new Family 17h Model 70h PCI IDs (device 18h functions 0 and 6)
to the AMD64 EDAC module.

 [ bp: s/f17_base_addr_to_cs_size/f17_addr_mask_to_cs_size/g ]

Signed-off-by: Isaac Vaughn <isaac.vaughn@knights.ucf.edu>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: James Morse <james.morse@arm.com>
Cc: linux-edac@vger.kernel.org
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Robert Richter <rrichter@marvell.com>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20190906192131.8ced0ca112146f32d82b6cae@knights.ucf.edu


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

0d5616c0

EDAC/amd64: Find Chip Select memory size using Address Mask · 7da95752

Yazen Ghannam authored 3 years ago

mainline inclusion
from mainline-v5.4-rc1
commit e53a3b26
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4MKP4


CVE: NA

--------------------------------

Chip Select memory size reporting on AMD Family 17h was recently fixed
in order to account for interleaving. However, the current method is not
robust.

The Chip Select Address Mask can be used to find the memory size. There
are a couple of cases.

1) For single-rank and dual-rank non-interleaved, use the address mask
plus 1 as the size.

2) For dual-rank interleaved, do #1 but "de-interleave" the address mask
first.

Always "de-interleave" the address mask in order to simplify the code
flow. Bit mask manipulation is necessary to check for interleaving, so
just go ahead and do the de-interleaving. In the non-interleaved case,
the original and de-interleaved address masks will be the same.

To de-interleave the mask, count the number of zero bits in the middle
of the mask and swap them with the most significant bits.

For example,
Original=0xFFFF9FE, De-interleaved=0x3FFFFFE

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20190821235938.118710-5-Yazen.Ghannam@amd.com


Signed-off-by: Jackie Liu <liuyun01@kylinos.cn> #openEuler_contributor
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

7da95752