Skip to content
Snippets Groups Projects
Commit 8e8fcc82 authored by Dong Kai's avatar Dong Kai Committed by Yang Yingliang
Browse files

corelockup: Add support of cpu core hang check

ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4F3V1


CVE: NA

--------------------------------

The softlockup and hardlockup detector only check the status
of the cpu which it resides. If certain cpu core suspends,
they are both not works. There is no any valid log but the
cpu already abnormal and brings a lot of problems of system.
To detect this case, we add the corelockup detector.

First we use whether cpu core can responds to nmi as a sectence
to determine if it is suspended. Then things is simple. Per cpu
core maintains it's nmi interrupt counts and detector the
nmi_counts of next cpu core. If the nmi interrupt counts not
changed any more which means it can't respond nmi normally, we
regard it as suspend.

To ensure robustness, only consecutive lost nmi more than two
times then trigger the warn.

The detection chain is as following:
cpu0->cpu1->...->cpuN->cpu0

Signed-off-by: default avatarDong Kai <dongkai11@huawei.com>
Reviewed-by: default avatarKuohai Xu <xukuohai@huawei.com>
Signed-off-by: default avatarYang Yingliang <yangyingliang@huawei.com>
Signed-off-by: default avatarZheng Zengkai <zhengzengkai@huawei.com>
Reviewed-by: default avatarDing Tianhong <dingtianhong@huawei.com>
Signed-off-by: default avatarYang Yingliang <yangyingliang@huawei.com>
parent 0d4086c5
No related branches found
No related tags found
No related merge requests found
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment