corelockup: Add support of cpu core hang check
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4F3V1 CVE: NA -------------------------------- The softlockup and hardlockup detector only check the status of the cpu which it resides. If certain cpu core suspends, they are both not works. There is no any valid log but the cpu already abnormal and brings a lot of problems of system. To detect this case, we add the corelockup detector. First we use whether cpu core can responds to nmi as a sectence to determine if it is suspended. Then things is simple. Per cpu core maintains it's nmi interrupt counts and detector the nmi_counts of next cpu core. If the nmi interrupt counts not changed any more which means it can't respond nmi normally, we regard it as suspend. To ensure robustness, only consecutive lost nmi more than two times then trigger the warn. The detection chain is as following: cpu0->cpu1->...->cpuN->cpu0 Signed-off-by:Dong Kai <dongkai11@huawei.com> Reviewed-by:
Kuohai Xu <xukuohai@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Zheng Zengkai <zhengzengkai@huawei.com> Reviewed-by:
Ding Tianhong <dingtianhong@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
Please register or sign in to comment