- Oct 30, 2021
-
Yang Xingui authored
driver inclusion category: bugfix bugzilla: NA CVE: NA The debugfs dump must run before FLR, because some registers have to be dumped before FLR resets them. Queuing the debugfs dump work from within the FLR work is therefore wrong: both work items are queued on the same workqueue, so the debugfs dump work always executes after FLR and reads data that has already been reset. Signed-off-by:
Yang Xingui <yangxingui@huawei.com> Reviewed-by:
Kangfenglong <kangfenglong@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
yanghui authored
mainline inclusion from mainline-v5.15-rc1 commit 276aeee1c5fc00df700f0782060beae126600472 category: bugfix bugzilla: 181417 CVE: NA ----------------------------------------------- Servers hit the panic below: Kernel version: 5.4.56 BUG: unable to handle page fault for address: 0000000000002c48 RIP: 0010:__next_zones_zonelist+0x1d/0x40 Call Trace: __alloc_pages_nodemask+0x277/0x310 alloc_page_interleave+0x13/0x70 handle_mm_fault+0xf99/0x1390 __do_page_fault+0x288/0x500 do_page_fault+0x30/0x110 page_fault+0x3e/0x50 The panic occurs because MAX_NUMNODES is passed as the third parameter (preferred_nid) of __alloc_pages_nodemask(), so the access to zonelist->zoneref->zone_idx in __next_zones_zonelist() causes a panic. In offset_il_node(), first_node() returns a nid from pol->v.nodes; after that, other threads may change pol->v.nodes before next_node() is called. This race condition will let next_node return MAX_NU...
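The race described above can be modelled in user space. The sketch below is only an illustration, not the kernel code: `first_node()` and `next_node()` are Python re-implementations of the kernel helpers, `MAX_NUMNODES = 8` is an arbitrary value, and taking a local snapshot of the mask is an assumption about one way to avoid the race, since the message above is cut off before it describes the fix.

```python
MAX_NUMNODES = 8  # illustrative value; the real constant depends on the kernel config


def first_node(mask):
    """Lowest node id in the mask, or MAX_NUMNODES if the mask is empty."""
    return min(mask) if mask else MAX_NUMNODES


def next_node(nid, mask):
    """Lowest node id in the mask greater than nid, or MAX_NUMNODES if none."""
    later = [n for n in mask if n > nid]
    return min(later) if later else MAX_NUMNODES


def offset_il_node_snapshot(policy_nodes, n):
    """Pick the (n modulo weight)-th node, walking one private snapshot."""
    nodes = frozenset(policy_nodes)  # local copy; later rebinds cannot affect it
    if not nodes:
        return None  # hypothetical fallback: caller uses the local node
    nid = first_node(nodes)
    for _ in range(n % len(nodes)):
        nid = next_node(nid, nodes)
    return nid
```

Without the snapshot, a walk that starts with `first_node({4, 5})` and continues with `next_node(4, {0, 1})` after a concurrent rebind runs off the end of the mask and yields `MAX_NUMNODES`, which is the invalid node id seen in the panic.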
-
changfengnan authored
mainline inclusion from mainline-v5.10-rc1 commit fc750a3b category: bugfix bugzilla: 45093 CVE: NA ----------------------------------------------- When ext4 is formatted with lazy_journal_init=1 and transactions from the previous filesystem are still on disk, it is possible that they are considered during a recovery after a crash. Because the checksum seed has changed, the CRC check will fail, and the journal recovery fails with checksum error although the journal is otherwise perfectly valid. Fix the problem by checking commit block time stamps to determine whether the data in the journal block is just stale or whether it is indeed corrupt. Reported-by:
kernel test robot <lkp@intel.com> Reviewed-by:
Andreas Dilger <adilger@dilger.ca> Signed-off-by:
Fengnan Chang <changfengnan@hikvision.com> Signed-off-by:
Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20201012164900.20197-1-jack@suse.cz Signed-off-by: Theodore T...
-
Shijie Luo authored
mainline inclusion from mainline-v5.9-rc2 commit 00a3fff0 category: bugfix bugzilla: 45093 CVE: NA ----------------------------------------------- Remove the unnecessary chksum_err and checksum_seen variables as well as some redundant code to make the function easier to understand. [ With changes suggested by jack@ and tytso@ ] Signed-off-by:
Shijie Luo <luoshijie1@huawei.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Reviewed-by:
Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20200819122955.33526-1-luoshijie1@huawei.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Ye Bin <yebin10@huawei.com> Reviewed-by:
Zhang Yi <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
yangerkun authored
hulk inclusion category: bugfix bugzilla: 109246 CVE: NA --------------------------- Our stress testing with IO errors can trigger the following OOB with a very low probability. [59898.282466] BUG: KASAN: slab-out-of-bounds in ext4_find_extent+0x2e4/0x480 ... [59898.287162] Call Trace: [59898.287575] dump_stack+0x8b/0xb9 [59898.288070] print_address_description+0x73/0x280 [59898.289903] ext4_find_extent+0x2e4/0x480 [59898.290553] ext4_ext_map_blocks+0x125/0x1470 [59898.295481] ext4_map_blocks+0x5ee/0x940 [59898.315984] ext4_mpage_readpages+0x63c/0xdb0 [59898.320231] read_pages+0xe6/0x370 [59898.321589] __do_page_cache_readahead+0x233/0x2a0 [59898.321594] ondemand_readahead+0x157/0x450 [59898.321598] generic_file_read_iter+0xcb2/0x1550 [59898.328828] __vfs_read+0x233/0x360 [59898.328840] vfs_read+0xa5/0x190 [59898.330126] ksys_read+0xa5/0x150 [59898.331405] do_syscall_64+0x6d/0x1f0 [59898.331418] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Digging deeper, we found that the block is actually an xattr block, which can happen with the following steps: 1. an extent update for file1 removes a leaf extent block (block A) 2. we need to update the idx extent block too 3. block A is allocated as an xattr block and is set verified 4. an IO error happens for this idx block, and its buffer is released later 5. an extent lookup for file1 reads the idx block and sees block A again 6. since the buffer of block A is already verified, we use it directly, which can lead to the OOB above. As in __ext4_xattr_check_block(), we can check the magic even if the buffer is already verified to fix the problem. Signed-off-by:
yangerkun <yangerkun@huawei.com> Reviewed-by:
Zhang Yi <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
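The idea of checking the magic even on an already verified buffer can be sketched in user space as follows. This is a simplified model, not the ext4 code: the `Buffer` class and `check_extent_buffer()` are hypothetical names, and only the 16-bit extent header magic (0xF30A) and the 32-bit xattr block magic (0xEA020000) are taken from the on-disk format.

```python
import struct

EXT4_EXT_MAGIC = 0xF30A  # extent header magic, stored as little-endian 16-bit


class Buffer:
    """Hypothetical stand-in for a cached block buffer with a verified flag."""

    def __init__(self, data, verified=False):
        self.data = data
        self.verified = verified


def check_extent_buffer(buf):
    """Return True if the buffer may be used as an extent block."""
    (magic,) = struct.unpack_from("<H", buf.data, 0)
    if magic != EXT4_EXT_MAGIC:
        return False  # rejected even when buf.verified is already set
    if buf.verified:
        return True  # the remaining checks were done earlier
    # The full entry validation is elided in this sketch.
    buf.verified = True
    return True
```

A block that was reused as an xattr block and marked verified there starts with the xattr magic, so the cheap magic comparison catches it before the extent code walks past the end of the buffer.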
-
yangerkun authored
hulk inclusion category: bugfix bugzilla: 109246 CVE: NA --------------------------- A buffer marked verified has already been checked, so there is no need to verify it and call set_buffer_verified() again. Signed-off-by:
yangerkun <yangerkun@huawei.com> Reviewed-by:
Jan Kara <jack@suse.cz> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Zhang Yi authored
hulk inclusion category: bugfix bugzilla: 109205 CVE: NA --------------------------- Most error paths of the current extent-updating operations do not roll back partial updates properly when something goes wrong (e.g. in ext4_ext_insert_extent()). So we may get an inconsistent extent tree if the journal has been aborted due to an IO error, which may lead to a BUG_ON later when we access these extent entries in errors=continue mode. This patch drops the extent buffer's verified flag before updating the contents in ext4_ext_get_access(), and sets it again after updating in __ext4_ext_dirty(). After this patch, the extent buffer is forced to be checked again if an extent tree update was interrupted, making sure the extents are consistent. Signed-off-by:
Zhang Yi <yi.zhang@huawei.com> Reviewed-by:
Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20210908120850.4012324-4-yi.zhang@huawei.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> conflict: fs/ext4/extents.c Reviewed-by:
Yang Erkun <yangerkun@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Zhang Yi authored
hulk inclusion category: bugfix bugzilla: 109205 CVE: NA --------------------------- We can now detect overlapping extents in a leaf block and out-of-order index extents in an index block. But the .ee_block of the first extent in a leaf block should also equal the .ei_block of its parent index extent entry. This patch adds a check to detect such an inconsistency between the index and leaf blocks. Signed-off-by:
Zhang Yi <yi.zhang@huawei.com> Link: https://lore.kernel.org/r/20210908120850.4012324-3-yi.zhang@huawei.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> conflict: fs/ext4/extents.c Reviewed-by:
Yang Erkun <yangerkun@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
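The consistency rule in this commit message can be written as a small predicate. The sketch below is a user-space illustration only: `leaf_matches_parent()` is a hypothetical name, extents are modelled as `(ee_block, ee_len)` tuples, and treating an empty leaf as consistent is an assumption of the sketch.

```python
def leaf_matches_parent(parent_ei_block, leaf_extents):
    """Check that the first extent of a leaf starts where its index entry says.

    leaf_extents is a list of (ee_block, ee_len) tuples in on-disk order.
    """
    if not leaf_extents:
        return True  # assumption: nothing to compare in an empty leaf
    first_ee_block, _ = leaf_extents[0]
    return first_ee_block == parent_ei_block
```

For example, a leaf whose first extent starts at logical block 96 is inconsistent with a parent index entry that claims the leaf starts at block 100.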
-
Zhang Yi authored
hulk inclusion category: bugfix bugzilla: 109205 CVE: NA --------------------------- After commit 5946d089 ("ext4: check for overlapping extents in ext4_valid_extent_entries()"), we can detect overlapping extent entries in leaf extent blocks. But out-of-order extent entries in index extent blocks could also trigger bad things if the filesystem is inconsistent. So this patch adds a check to detect out-of-order index extents and return an error. Signed-off-by:
Zhang Yi <yi.zhang@huawei.com> Reviewed-by:
Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/r/20210908120850.4012324-2-yi.zhang@huawei.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Reviewed-by:
Yang Erkun <yangerkun@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
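An ordering check of this kind can be illustrated in a few lines. This is a minimal sketch, assuming that index entries are represented only by their `ei_block` values and that "in order" means strictly increasing; `index_entries_in_order()` is a hypothetical name, not a kernel function.

```python
def index_entries_in_order(ei_blocks):
    """True if every index entry starts at a higher logical block than the one before."""
    return all(prev < cur for prev, cur in zip(ei_blocks, ei_blocks[1:]))
```

A caller would treat a `False` result as filesystem corruption and return an error instead of walking the tree.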
-
Zhang Yi authored
hulk inclusion category: bugfix bugzilla: 182754 CVE: NA --------------------------- Fix the error path in free_dqentry(), pass out the error number if the block to free is not correct. Fixes: 1ccd14b9 ("quota: Split off quota tree handling into a separate file") Link: https://lore.kernel.org/r/20211008093821.1001186-3-yi.zhang@huawei.com Signed-off-by:
Zhang Yi <yi.zhang@huawei.com> Cc: stable@kernel.org Signed-off-by:
Jan Kara <jack@suse.cz> Reviewed-by:
Yang Erkun <yangerkun@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Zhang Yi authored
hulk inclusion category: bugfix bugzilla: 182754 CVE: NA --------------------------- The block number in the quota tree on disk should be smaller than the v2_disk_dqinfo.dqi_blocks. If the quota file was corrupted, we may be allocating an 'allocated' block and that would lead to a loop in a tree, which will probably trigger oops later. This patch adds a check for the block number in the quota tree to prevent such potential issue. Link: https://lore.kernel.org/r/20211008093821.1001186-2-yi.zhang@huawei.com Signed-off-by:
Zhang Yi <yi.zhang@huawei.com> Cc: stable@kernel.org Signed-off-by:
Jan Kara <jack@suse.cz> Reviewed-by:
Yang Erkun <yangerkun@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
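The range rule stated in this commit message (a block number in the quota tree must be smaller than `dqi_blocks`) can be sketched as below. The function name `bad_tree_refs()` and the list-based representation of a tree block are assumptions made for illustration; the kernel code works on on-disk little-endian references.

```python
def bad_tree_refs(refs, dqi_blocks):
    """Return the block references that point at or past the end of the quota file."""
    return [blk for blk in refs if blk >= dqi_blocks]
```

With `dqi_blocks = 8`, a tree block holding the references `[0, 3, 7, 12]` contains one bad reference, 12, and the lookup should fail instead of following it.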
-
余快 authored
mainline inclusion from mainline-next-20211018 commit 52c90e0184f67eecb00b53b79bfdf75e0274f8fd category: bugfix bugzilla: 49890 CVE: NA --------------------------- There is a problem that nbd_handle_reply() might access a freed request: 1) First, a normal io is submitted and completed with a scheduler: internel_tag = blk_mq_get_tag -> get tag from sched_tags blk_mq_rq_ctx_init sched_tags->rq[internel_tag] = sched_tag->static_rq[internel_tag] ... blk_mq_get_driver_tag __blk_mq_get_driver_tag -> get tag from tags tags->rq[tag] = sched_tag->static_rq[internel_tag] So both tags->rq[tag] and sched_tags->rq[internel_tag] point to the request sched_tags->static_rq[internal_tag], even after the io has finished. 2) The nbd server sends a reply with a random tag directly: recv_work nbd_handle_reply blk_mq_tag_to_rq(tags, tag) rq = tags->rq[tag] 3) If the sched_tags->static_rq is freed: blk_mq_sched_free_requests blk_mq_free_rqs(q->tag_set, hctx->sched_tags, i) -> step 2) access rq before clearing rq mapping blk_mq_clear_rq_mapping(set, tags, hctx_idx); __free_pages() -> rq is freed here 4) Then nbd continues to use the freed request in nbd_handle_reply(). Fix the problem by getting 'q_usage_counter' before blk_mq_tag_to_rq(), so the request is guaranteed not to be freed because 'q_usage_counter' is not zero. Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Reviewed-by:
Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20210916141810.2325276-1-yukuai3@huawei.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> Reviewed-by:
Jason Yan <yanaijie@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
余快 authored
mainline inclusion from mainline-next-20211018 commit 961e9f50be9bb47835b0ac7e08d55d2d0a45e493 category: bugfix bugzilla: 49890 CVE: NA --------------------------- Prepare to fix uaf in nbd_read_stat(), no functional changes. Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Reviewed-by:
Ming Lei <ming.lei@redhat.com> Reviewed-by:
Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20210916093350.1410403-7-yukuai3@huawei.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> Conflict: drivers/block/nbd.c Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Reviewed-by:
Jason Yan <yanaijie@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
余快 authored
mainline inclusion from mainline-next-20211018 commit 6157a8f489909db00151a4e361903b9099b03b75 category: bugfix bugzilla: 49890 CVE: NA --------------------------- Checking whether sock_xmit() returns 0 is useless because it never returns 0; document this in a comment and remove such checks. Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Reviewed-by:
Ming Lei <ming.lei@redhat.com> Reviewed-by:
Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20210916093350.1410403-6-yukuai3@huawei.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> Conflict:drivers/block/nbd.c Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Reviewed-by:
Jason Yan <yanaijie@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
余快 authored
mainline inclusion from mainline-next-20211018 commit a83fdc85365586dc5c0f3ff91680e18e37a66f19 category: bugfix bugzilla: 49890 CVE: NA --------------------------- Commit 6a468d59 ("nbd: don't start req until after the dead connection logic") moved blk_mq_start_request() from nbd_queue_rq() to nbd_handle_cmd() to skip starting the request if the connection is dead. However, the request is still started in other error paths. Currently, blk_mq_end_request() is called immediately if nbd_queue_rq() fails, so starting the request in that situation is useless. So remove blk_mq_start_request() from the error paths in nbd_handle_cmd(). Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Reviewed-by:
Ming Lei <ming.lei@redhat.com> Reviewed-by:
Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20210916093350.1410403-5-yukuai3@huawei.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> Reviewed-by:
Jason Yan <yanaijie@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
余快 authored
mainline inclusion from mainline-next-20211018 commit dbd73178da676945d8bbcf6afe731623f683ce0a category: bugfix bugzilla: 49890 CVE: NA --------------------------- The sock on which the client sends a request in nbd_send_cmd() and the sock on which it receives the reply in nbd_read_stat() should be the same. Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Reviewed-by:
Ming Lei <ming.lei@redhat.com> Reviewed-by:
Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20210916093350.1410403-4-yukuai3@huawei.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> Reviewed-by:
Jason Yan <yanaijie@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
余快 authored
mainline inclusion from mainline-next-20211018 commit d14b304f558f8c8f53da3a8d0c0b671f14a9c2f4 category: bugfix bugzilla: 49890 CVE: NA --------------------------- Commit cddce0116058 ("nbd: Aovid double completion of a request") tried to fix the problem that nbd_clear_que() and recv_work() can complete a request concurrently. However, the problem still exists: t1 t2 t3 nbd_disconnect_and_put flush_workqueue recv_work blk_mq_complete_request blk_mq_complete_request_remote -> this is true WRITE_ONCE(rq->state, MQ_RQ_COMPLETE) blk_mq_raise_softirq blk_done_softirq blk_complete_reqs nbd_complete_rq blk_mq_end_request blk_mq_free_request WRITE_ONCE(rq->state, MQ_RQ_IDLE) nbd_clear_que blk_mq_tagset_busy_iter nbd_clear_req __blk_mq_free_request blk_mq_put_tag blk_mq_complete_request -> complete again There are three places where a request can be completed in nbd: recv_work(), nbd_clear_que() and nbd_xmit_timeout(). Since they all hold cmd->lock before completing the request, it's easy to avoid the problem by setting and checking a cmd flag. Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Reviewed-by:
Ming Lei <ming.lei@redhat.com> Reviewed-by:
Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20210916093350.1410403-3-yukuai3@huawei.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> Conflict: drivers/block/nbd.c Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Reviewed-by:
Jason Yan <yanaijie@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
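The "set and check a flag under cmd->lock" approach can be modelled with ordinary threads. The sketch below is a user-space illustration under stated assumptions: `PendingCmd`, `mark_issued()` and `try_complete()` are hypothetical names, a Python `threading.Lock` stands in for `cmd->lock`, and a counter stands in for the call that completes the request.

```python
import threading


class PendingCmd:
    """Hypothetical model of an nbd command with a lock and a completion flag."""

    def __init__(self):
        self.lock = threading.Lock()
        self.inflight = False  # the cmd flag that guards completion
        self.completions = 0   # how many times the request was completed


def mark_issued(cmd):
    """Mark the command as eligible for exactly one completion."""
    with cmd.lock:
        cmd.inflight = True


def try_complete(cmd):
    """Complete the request only if this caller is the one that clears the flag."""
    with cmd.lock:
        if not cmd.inflight:
            return False  # someone else already completed it
        cmd.inflight = False
    cmd.completions += 1  # stands in for completing the block request
    return True
```

If three paths race to call `try_complete()` on the same command, only the first one that takes the lock sees the flag set, so the request is completed once.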
-
余快 authored
mainline inclusion from mainline-next-20211018 commit b5644a3a79bf3be5f1238db1b2f241374b27b0f0 category: bugfix bugzilla: 49890 CVE: NA --------------------------- While handling a response message from the server, nbd_read_stat() will try to get the request by tag, and then complete the request. However, this is problematic if nbd hasn't sent a corresponding request message: t1 t2 submit_bio nbd_queue_rq blk_mq_start_request recv_work nbd_read_stat blk_mq_tag_to_rq blk_mq_complete_request nbd_send_cmd Thus add a new cmd flag 'NBD_CMD_INFLIGHT', which is set in nbd_send_cmd() and checked in nbd_read_stat(). Note that this patch can't fix the problem that blk_mq_tag_to_rq() might return a freed request; this will be fixed in following patches. Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Reviewed-by:
Ming Lei <ming.lei@redhat.com> Reviewed-by:
Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20210916093350.1410403-2-yukuai3@huawei.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> Reviewed-by:
Jason Yan <yanaijie@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
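The ordering problem in this commit message (a reply arriving for a tag whose request was never sent) can be modelled as follows. This is a simplified sketch, not the driver code: `TaggedCmd`, `send_cmd()` and `handle_reply()` are hypothetical names, the `inflight` attribute models 'NBD_CMD_INFLIGHT', and locking is left out to keep the example short.

```python
class TaggedCmd:
    """Hypothetical model of a started request identified by its tag."""

    def __init__(self, tag):
        self.tag = tag
        self.inflight = False  # models NBD_CMD_INFLIGHT


def send_cmd(cmd):
    """Model of sending the request message to the server."""
    cmd.inflight = True


def handle_reply(cmds, tag):
    """Accept a reply only for a command that was actually sent."""
    cmd = cmds.get(tag)
    if cmd is None or not cmd.inflight:
        return "rejected"
    cmd.inflight = False
    return "completed"
```

A reply carrying tag 7 is rejected while the request for tag 7 has only been started, is accepted once `send_cmd()` has run, and is rejected again if the server repeats it.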
-
Yang Yingliang authored
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4D63I CVE: NA ---------------------------------------------------------- Enable CONFIG_ASCEND_CLEAN_CDM by default. Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Wang Wensheng authored
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4D63I CVE: NA ---------------------------------------------------------- Use a bootarg to precisely specify the target node to which we want to move the kernel structure for a cdm node. Signed-off-by:
Wang Wensheng <wangwensheng4@huawei.com> Reviewed-by:
Weilong Chen <chenweilong@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Wang Wensheng authored
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4D63I CVE: NA ------------------------------------------------- Not all cdm nodes are hbm, and we don't need to operate on the other nodes. So we should specify the hbm count per partition. Here we assume that all the hbm nodes appear first among the cdm nodes in one partition. Otherwise the management structure of the hbm nodes could not be moved, which is no worse than disabling this feature. Signed-off-by:
Wang Wensheng <wangwensheng4@huawei.com> Reviewed-by:
Weilong Chen <chenweilong@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Wang Wensheng authored
ascend inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4D63I CVE: NA ------------------- Check whether the topological structure of the DDR/HBM breaks our assumption. If it does, we just return the input nid; otherwise an invalid nid could be returned and it may break the kernel. Fixes: aabbfd385ab2 ("numa: Move the management structures for cdm nodes to ddr") Signed-off-by:
Wang Wensheng <wangwensheng4@huawei.com> Reviewed-by:
Weilong Chen <chenweilong@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Wang Wensheng authored
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4D63I CVE: NA ------------------------------------------------- The cdm nodes are more prone to raising ECC errors, and this may crash the kernel if the essential structures are corrupted. So move the management structures for hbm nodes to the ddr nodes of the same partition to reduce the probability of kernel crashes. Signed-off-by:
Wang Wensheng <wangwensheng4@huawei.com> Reviewed-by:
Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Zheng Zengkai <zhengzengkai@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Fang Lijun authored
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4D4WR CVE: NA --------------------------- This patch adds support for the L3T PMU driver in HiSilicon SoC chips. Each L3T has its own control, counter and interrupt registers and is a separate PMU. Each L3T PMU has 8 programmable counters, and each counter is free-running. Signed-off-by:
Fang Lijun <fanglijun3@huawei.com> Reviewed-by:
Hanjun Guo <guohanjun@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Fang Lijun authored
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4D4WR CVE: NA --------------------------- This patch adds support for the LPDDRC PMU driver in HiSilicon SoC chips. Each DDRC has its own control and counter registers and is a separate PMU. Each LPDDRC PMU has 8 fixed-purpose counters, which are mapped to 8 events by hardware; the LPDDRC PMU driver assumes that the counter index is equal to the event code (0 - 7). Since the counter register is read-only, write_counter sets the prev-count instead of writing the counter register. Signed-off-by:
Fang Lijun <fanglijun3@huawei.com> Reviewed-by:
Hanjun Guo <guohanjun@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Fang Lijun authored
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4D4WR CVE: NA --------------------------- Driver providing the perf backend for LPDDRC and L3T PMU hardware found in HiSilicon SoCs. Signed-off-by:
Fang Lijun <fanglijun3@huawei.com> Reviewed-by:
Hanjun Guo <guohanjun@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Fang Lijun authored
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4D4WR CVE: NA --------------------------- Add support for dt probing in the hisi PMU driver, and fix its compile error when CONFIG_ACPI is disabled. Signed-off-by:
Fang Lijun <fanglijun3@huawei.com> Reviewed-by:
Hanjun Guo <guohanjun@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Xu Qiang authored
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4F3V1 CVE: NA -------------------------------- When hard lockup detection is disabled, core lockup detection is not performed. Signed-off-by:
Xu Qiang <xuqiang36@huawei.com> Reviewed-by:
Ding Tianhong <dingtianhong@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Xu Qiang authored
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4F3V1 CVE: NA -------------------------------- A user-mode interface is added to control the core lockup detection sensitivity. Signed-off-by:
Xu Qiang <xuqiang36@huawei.com> Reviewed-by:
Ding Tianhong <dingtianhong@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Xu Qiang authored
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4F3V1 CVE: NA -------------------------------- Optimized core lockup detection judgment rules to make it easier to understand. Core suspension detection is performed in the hrtimer interrupt processing function. The detection condition is that the hrtimer interrupt and NMI interrupt are not updated for multiple consecutive times. Signed-off-by:
Xu Qiang <xuqiang36@huawei.com> Reviewed-by:
Ding Tianhong <dingtianhong@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Dong Kai authored
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4F3V1 CVE: NA -------------------------------- The corelockup detection is completed on arm64, enable it. Signed-off-by:
Dong Kai <dongkai11@huawei.com> Reviewed-by:
Kuohai Xu <xukuohai@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Zheng Zengkai <zhengzengkai@huawei.com> Reviewed-by:
Ding Tianhong <dingtianhong@huawei.com> Reviewed-by:
Kuohai Xu <xukuohai@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Dong Kai authored
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4F3V1 CVE: NA -------------------------------- Add the cmdline param "enable_corelockup_detector" to enable the core suspend detector. It is enabled by default within ascend features. Signed-off-by:
Dong Kai <dongkai11@huawei.com> Reviewed-by:
Kuohai Xu <xukuohai@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Zheng Zengkai <zhengzengkai@huawei.com> Reviewed-by:
Ding Tianhong <dingtianhong@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Dong Kai authored
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4F3V1 CVE: NA -------------------------------- When using pmu events as the nmi source, the pmu clock is disabled in wfi/wfe mode, so the nmi can't respond periodically. To minimize misjudgments caused by wfi/wfe, we adopt a simple method: disable wfi/wfe at the right time, with the watchdog hrtimer as a good baseline. The watchdog hrtimer is based on the generic timer and has a higher frequency than the nmi. If the watchdog hrtimer does not work, we disable wfi/wfe mode; the pmu nmi should then always respond as long as the cpu core is not suspended. Signed-off-by:
Dong Kai <dongkai11@huawei.com> Reviewed-by:
Kuohai Xu <xukuohai@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Zheng Zengkai <zhengzengkai@huawei.com> Reviewed-by:
Ding Tianhong <dingtianhong@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Dong Kai authored
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4F3V1 CVE: NA -------------------------------- The softlockup and hardlockup detectors only check the status of the cpu on which they reside. If a cpu core suspends, neither of them works. There is no valid log, but the cpu is already abnormal and causes a lot of problems for the system. To detect this case, we add the corelockup detector. First, we use whether a cpu core can respond to nmi as the criterion to determine whether it is suspended. Then things are simple: each cpu core maintains its own nmi interrupt count and checks the nmi count of the next cpu core. If that nmi interrupt count no longer changes, which means the core can't respond to nmi normally, we regard it as suspended. To ensure robustness, the warning is triggered only after nmi is lost more than two consecutive times. The detection chain is as follows: cpu0->cpu1->...->cpuN->cpu0 Signed-off-by:
Dong Kai <dongkai11@huawei.com> Reviewed-by:
Kuohai Xu <xukuohai@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Zheng Zengkai <zhengzengkai@huawei.com> Reviewed-by:
Ding Tianhong <dingtianhong@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
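The detection chain described in this commit message can be modelled in user space. The sketch below is an illustration under stated assumptions, not the kernel implementation: `CoreLockupDetector`, its methods and the `MISS_LIMIT` constant are hypothetical names, and the periodic check is driven by explicit calls instead of a timer.

```python
MISS_LIMIT = 2  # assumed threshold: warn once consecutive misses exceed this


class CoreLockupDetector:
    """Each CPU watches the NMI counter of the next CPU in a ring."""

    def __init__(self, ncpus):
        self.ncpus = ncpus
        self.nmi_counts = [0] * ncpus  # incremented by each CPU's NMI handler
        self.last_seen = [0] * ncpus   # last count observed by the watcher
        self.misses = [0] * ncpus      # consecutive checks without progress

    def nmi(self, cpu):
        """Model of an NMI being handled on the given CPU."""
        self.nmi_counts[cpu] += 1

    def check(self, watcher):
        """Run on `watcher`; return the suspected CPU id, or None."""
        target = (watcher + 1) % self.ncpus  # cpu0->cpu1->...->cpuN->cpu0
        if self.nmi_counts[target] != self.last_seen[target]:
            self.last_seen[target] = self.nmi_counts[target]
            self.misses[target] = 0
            return None
        self.misses[target] += 1
        return target if self.misses[target] > MISS_LIMIT else None
```

In this model, cpu0 reports cpu1 only on the third consecutive check that finds cpu1's counter unchanged, and a single NMI on cpu1 resets the miss count.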
-
Zhou Guanghui authored
ascend inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4EUVI CVE: NA -------------------------------------------------- The user needs the process pid, that is, task tgid. Signed-off-by:
Zhou Guanghui <zhouguanghui1@huawei.com> Reviewed-by:
Weilong Chen <chenweilong@huawei.com> Reviewed-by:
Ding Tianhong <dingtianhong@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Zhou Guanghui authored
ascend inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4EUVI CVE: NA ----------------------------------------------------------- Solve the problem with the 4G DVPP address range coexisting with the shared pool. Signed-off-by:
Zhou Guanghui <zhouguanghui1@huawei.com> Reviewed-by:
Weilong Chen <chenweilong@huawei.com> Reviewed-by:
Ding Tianhong <dingtianhong@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Fang Lijun authored
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4EUVI CVE: NA Signed-off-by:
Fang Lijun <fanglijun3@huawei.com> Signed-off-by:
Zhou Guanghui <zhouguanghui1@huawei.com> Reviewed-by:
Weilong Chen <chenweilong@huawei.com> Reviewed-by:
Ding Tianhong <dingtianhong@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Fang Lijun authored
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4EUVI CVE: NA ------------------------------------------------- __vmalloc_node() is exported because mbuff uses the gfp_mask __GFP_ACCOUNT to limit the memory usage of vmalloc() with the memory cgroup. We add a new parameter vmflags to __vmalloc_node() because VM_USERMAP and VM_HUGE_PAGES are needed by vmalloc_hugepage_user(). By selecting HAVE_ARCH_HUGE_VMALLOC, vmalloc_hugepage_user() can allocate hugepage memory. Also, vmalloc() will allocate hugepage memory if possible. Reference: https://lwn.net/Articles/839107/ Signed-off-by:
Tang Yizhou <tangyizhou@huawei.com> Signed-off-by:
Zhou Guanghui <zhouguanghui1@huawei.com> Reviewed-by:
Weilong Chen <chenweilong@huawei.com> Reviewed-by:
Ding Tianhong <dingtianhong@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Fang Lijun authored
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4EUVI CVE: NA ------------------------------------------------- Signed-off-by:
Fang Lijun <fanglijun3@huawei.com> Signed-off-by:
Zhou Guanghui <zhouguanghui1@huawei.com> Reviewed-by:
Weilong Chen <chenweilong@huawei.com> Reviewed-by:
Ding Tianhong <dingtianhong@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
guomengqi authored
ascend inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4EUVI CVE: NA ------------------------------------------------- This interface is added to support the function of exiting a process from a sharing group. Signed-off-by:
guomengqi <guomengqi3@huawei.com> Signed-off-by:
Zhou Guanghui <zhouguanghui1@huawei.com> Reviewed-by:
Weilong Chen <chenweilong@huawei.com> Reviewed-by:
Ding Tianhong <dingtianhong@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-