Oct 21, 2022

Shivasharan S authored
mainline inclusion from mainline-v5.0 commit 469f72dd
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5WW82
CVE: NA
--------------------------------
This patch adds support for MegaRAID Aero controller PCI IDs. Print a message when a configurable secure type controller is encountered.
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Xibo.Wang <wangxb12@chinatelecom.cn>

Oct 10, 2022

David Leadbeater authored
stable inclusion from stable-v4.19.258 commit 3275f7804f40de3c578d2253232349b07c25f146
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5OWZ7
CVE: CVE-2022-2663
---------------------------
[ Upstream commit 0efe125cfb99e6773a7434f3463f7c2fa28f3a43 ]
Ensure the match happens in the right direction, previously the destination used was the server, not the NAT host, as the comment shows the code intended. Additionally nf_nat_irc uses port 0 as a signal and there's no valid way it can appear in a DCC message, so consider port 0 also forged.
Fixes: 869f37d8 ("[NETFILTER]: nf_conntrack/nf_nat: add IRC helper port")
Signed-off-by: David Leadbeater <dgl@dgl.cx>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Liu Jian <liujian56@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yongqiang Li...

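The shape of the IRC DCC fix above can be pictured with a tiny helper. This is an illustrative sketch, not the netfilter source; the parameter names are invented here:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative sketch, not the kernel code: 'dcc_ip'/'dcc_port' are the
 * address embedded in the DCC message, 'nat_host_ip' is the NAT host the
 * message must match (previously the server was compared by mistake).
 * Port 0 is reserved by nf_nat_irc as an internal signal, so it can
 * never be a legitimate DCC port and is treated as forged as well. */
static bool dcc_is_forged(uint32_t dcc_ip, uint16_t dcc_port,
                          uint32_t nat_host_ip)
{
    return dcc_ip != nat_host_ip || dcc_port == 0;
}
```

Any DCC offer failing either check is dropped rather than rewritten by the NAT helper.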
Jan Kara authored
mainline inclusion from mainline-v6.1-rc1 commit 61a1d87a324ad5e3ed27c6699dfc93218fcf3201
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I58WSQ
CVE: CVE-2022-1184
--------------------------------
The check in __ext4_read_dirblock() for block being outside of directory size was wrong because it compared block number against directory size in bytes. Fix it.
Fixes: 65f8ea4cd57d ("ext4: check if directory block is within i_size")
CVE: CVE-2022-1184
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Link: https://lore.kernel.org/r/20220822114832.1482-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

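The unit mismatch described above is easy to see in a standalone helper. A minimal sketch (not the ext4 source; the function name is invented):

```c
#include <stdbool.h>
#include <stdint.h>

/* 'block' is a logical block number while i_size is in bytes, so the
 * bound must be converted to blocks before comparing. The broken check
 * compared 'block >= i_size' directly, which a byte-sized i_size almost
 * never trips, letting out-of-size directory blocks through. */
static bool dirblock_outside_isize(uint64_t block, uint64_t i_size,
                                   unsigned int blkbits)
{
    return block >= (i_size >> blkbits);
}
```

With 4 KiB blocks (blkbits = 12) and a two-block directory (i_size = 8192), block 2 is correctly flagged as outside while block 1 is not.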
Lukas Czerner authored
mainline inclusion from mainline-v6.0-rc1 commit 65f8ea4cd57dbd46ea13b41dc8bac03176b04233
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I58WSQ
CVE: CVE-2022-1184
--------------------------------
Currently ext4 directory handling code implicitly assumes that the directory blocks are always within the i_size. In fact ext4_append() will attempt to allocate next directory block based solely on i_size and the i_size is then appropriately increased after a successful allocation. However, for this to work it requires i_size to be correct. If, for any reason, the directory inode i_size is corrupted in a way that the directory tree refers to a valid directory block past i_size, we could end up corrupting parts of the directory tree structure by overwriting already used directory blocks when modifying the directory. Fix it by catching the corruption early in __ext4_read_dirblock().
Addresses Red-Hat-Bugzilla: #2070205
CVE: CVE-2022-1184
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20220704142721.157985-1-lczerner@redhat.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Conflicts: fs/ext4/namei.c
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

Oct 09, 2022

Luo Meng authored
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5TY3L
CVE: NA
--------------------------------
A crash as follows:
BUG: unable to handle page fault for address: 000000011241cec7
sd 5:0:0:1: [sdl] Synchronizing SCSI cache
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
CPU: 3 PID: 2465367 Comm: multipath Kdump: loaded Tainted: G W O 5.10.0-60.18.0.50.h478.eulerosv2r11.x86_64 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-20220525_182517-szxrtosci10000 04/01/2014
RIP: 0010:kernfs_new_node+0x22/0x60
Code: cc cc 66 0f 1f 44 00 00 0f 1f 44 00 00 41 54 41 89 cb 0f b7 ca 48 89 f2 53 48 8b 47 08 48 89 fb 48 89 de 48 85 c0 48 0f 44 c7 <48> 8b 78 50 41 51 45 89 c1 45 89 d8 e8 4d ee ff ff 5a 49 89 c4 48
RSP: 0018:ffffa178419539e8 EFLAGS: 00010206
RAX: 000000011241ce77 RBX: ffff9596828395a0 RCX: 00000000...

Sasha Levin authored
stable inclusion from stable-v5.4.215 commit 4051324a6dafd7053c74c475e80b3ba10ae672b0
category: bugfix
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5T9C3
CVE: CVE-2022-3303
---------------------------
[ Upstream commit 8423f0b6d513b259fdab9c9bf4aaa6188d054c2d ]
There is a small race window at snd_pcm_oss_sync() that is called from OSS PCM SNDCTL_DSP_SYNC ioctl; namely the function calls snd_pcm_oss_make_ready() at first, then takes the params_lock mutex for the rest. When the stream is set up again by another thread between them, it leads to inconsistency, and may result in unexpected results such as NULL dereference of OSS buffer as a fuzzer spotted recently. The fix is simply to cover snd_pcm_oss_make_ready() call into the same params_lock mutex with snd_pcm_oss_make_ready_locked() variant.
Reported-and-tested-by: butt3rflyh4ck <butterflyhuangxx@gmail.com>
Reviewed-by: Jaroslav Kysela <perex@perex.cz>
Cc: <stable@vger.kernel.org>
Link: https:...

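The structure of that race fix, widening a mutex to cover a preparation step, can be sketched in userspace pthreads. Everything here is illustrative (stub struct and names, not the ALSA API):

```c
#include <pthread.h>
#include <stdbool.h>

/* Sketch of the fix's shape: the preparation step moves inside the same
 * params_lock critical section as the sync work, so another thread can
 * no longer re-setup the stream between the two phases. */
struct oss_stream {
    pthread_mutex_t params_lock;
    bool ready;
};

/* "_locked" variant: caller must already hold params_lock. */
static void make_ready_locked(struct oss_stream *s)
{
    if (!s->ready)
        s->ready = true;
}

static void oss_sync(struct oss_stream *s)
{
    pthread_mutex_lock(&s->params_lock);
    make_ready_locked(s);   /* previously ran before the lock was taken */
    /* ... drain and reset work, still under params_lock ... */
    pthread_mutex_unlock(&s->params_lock);
}
```

The key design point is that the check-and-prepare and the work that depends on it form one atomic unit under a single lock.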
Yu Kuai authored
hulk inclusion
category: performance
bugzilla: 187597, https://gitee.com/openeuler/kernel/issues/I5QK5M
CVE: NA
--------------------------------
If CONFIG_BLK_BIO_DISPATCH_ASYNC is enabled and the driver supports QUEUE_FLAG_DISPATCH_ASYNC, bios will be dispatched asynchronously to specific CPUs to avoid cross-node memory access in the driver.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

Yu Kuai authored
hulk inclusion
category: performance
bugzilla: 187597, https://gitee.com/openeuler/kernel/issues/I5QK5M
CVE: NA
--------------------------------
request_queue_wrapper is not accessible in drivers currently; introduce a new helper to initialize async dispatch without breaking kabi.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

Yu Kuai authored
hulk inclusion
category: performance
bugzilla: 187597, https://gitee.com/openeuler/kernel/issues/I5QK5M
CVE: NA
--------------------------------
Try to improve performance for raid when users issue io concurrently from multiple nodes.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

Wang ShaoBo authored
hulk inclusion
category: performance
bugzilla: https://gitee.com/openeuler/kernel/issues/I5QK5M
CVE: NA
--------------------------------
For some architectures, masking the underlying processor topology differences can leave software unable to identify the cpu distance, which results in performance fluctuations. So we provide an additional interface for getting the preferred sibling cpumask supported by the platform. This siblings' cpumask indicates those CPUs that are clustered at relatively short distances, though this depends heavily on the specific implementation of the specific platform.
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

Yu Kuai authored
hulk inclusion
category: performance
bugzilla: https://gitee.com/openeuler/kernel/issues/I5QK5M
CVE: NA
--------------------------------
In some architectures, memory access latency across nodes is much worse than within the local node. As a consequence, io performance is rather bad when users issue io from multiple nodes if lock contention exists in the driver. This patch makes io dispatch asynchronous to a specific kthread that is bound to cpus belonging to the same node, so that cross-node memory access in the driver can be avoided.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

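A rough userspace sketch of the dispatch-CPU selection such a scheme needs (all names invented here; the real code works on cpumasks and kthreads): keep the CPUs of the submitter's node in an array and rotate a "last dispatch cpu" cursor through it, so handling always stays on the node whose memory the bio touches.

```c
/* Illustrative round-robin selection of a dispatch CPU among the CPUs
 * of one NUMA node, mirroring the 'dispatch_cpumask'/'last_dispatch_cpu'
 * idea. 'node_cpus' lists the node's CPU ids; '*last_idx' is the cursor
 * persisted between calls (start it at -1 to begin with node_cpus[0]). */
static int next_dispatch_cpu(const int *node_cpus, int nr_cpus, int *last_idx)
{
    *last_idx = (*last_idx + 1) % nr_cpus;
    return node_cpus[*last_idx];
}
```

Round-robin spreads the async work evenly over the node's CPUs instead of hot-spotting one kthread.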
Yu Kuai authored
hulk inclusion
category: performance
bugzilla: https://gitee.com/openeuler/kernel/issues/I5QK5M
CVE: NA
--------------------------------
Add a new flag QUEUE_FLAG_DISPATCH_ASYNC and two new fields 'dispatch_cpumask' and 'last_dispatch_cpu' to request_queue, preparing to support dispatching bios asynchronously on specified cpus. This patch also adds sysfs apis.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

Yu Kuai authored
mainline inclusion from md-next commit ddc489e066cd267b383c0eed4f576f6bdb154588
category: performance
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5PRMO
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?h=md-next&id=ddc489e066cd267b383c0eed4f576f6bdb154588
---------------------
Currently, wait_barrier() will hold 'resync_lock' to read 'conf->barrier', and io can't be dispatched until 'barrier' is dropped. Since holding the 'barrier' is not common, convert 'resync_lock' to use seqlock so that holding the lock can be avoided in the fast path.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-and-tested-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Song Liu <song@kernel.org>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

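The seqlock read pattern that replaces the spinlock fast path can be sketched in userspace with C11 atomics. This is a simplified model, not the kernel's seqlock_t (which readers use via read_seqbegin()/read_seqretry()): an odd sequence count means a writer is mid-update, and a count that changed across the read means the value may be torn, so the reader retries instead of ever taking a lock.

```c
#include <stdatomic.h>

/* Simplified seqlock read side (illustrative struct, not md's r10conf). */
struct conf_sketch {
    atomic_uint seq;   /* even: stable; odd: writer in progress */
    int barrier;
};

static int read_barrier(struct conf_sketch *c)
{
    unsigned int start;
    int value;

    do {
        while ((start = atomic_load(&c->seq)) & 1u)
            ;                                    /* writer active: spin */
        value = c->barrier;
    } while (atomic_load(&c->seq) != start);     /* retry if it changed */

    return value;
}
```

Writers would increment 'seq' before and after updating 'barrier'; uncontended readers pay only two atomic loads, which is the point of the conversion.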
Yu Kuai authored
mainline inclusion from md-next commit 7fdc91928ac109d3d1468ad7f951deb29a375e3d
category: performance
bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5PRMO
CVE: NA
Reference: https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?h=md-next&id=7fdc91928ac109d3d1468ad7f951deb29a375e3d
--------------------------------
Currently, wake_up() is called unconditionally in fast paths such as raid10_make_request(), which will cause lock contention under high concurrency:
raid10_make_request
  wake_up
    __wake_up_common_lock
      spin_lock_irqsave
Improve performance by only calling wake_up() if the waitqueue is not empty.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Signed-off-by: Song Liu <song@kernel.org>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

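The guard is tiny but its effect is easy to demonstrate with stubs. A self-contained sketch (toy types, not the kernel waitqueue API): wake_up() normally takes the waitqueue spinlock even when nobody waits, and the emptiness check skips that acquisition entirely on the common path.

```c
#include <stdbool.h>

/* Toy waitqueue: 'wakeups' counts how often the locked slow path ran. */
struct waitqueue_sketch {
    int nr_waiters;
    int wakeups;
};

static bool waitqueue_active_sketch(const struct waitqueue_sketch *wq)
{
    return wq->nr_waiters > 0;   /* lock-free emptiness check */
}

static void wake_up_sketch(struct waitqueue_sketch *wq)
{
    wq->wakeups++;               /* stands in for lock + scan + unlock */
}

static void wake_up_barrier(struct waitqueue_sketch *wq)
{
    if (waitqueue_active_sketch(wq))   /* the patch's guard */
        wake_up_sketch(wq);
}
```

In the real kernel the lock-free check has memory-ordering caveats (the waiter must queue itself before its condition check), which is why a dedicated helper exists rather than open-coding the test.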
openeuler-ci-bot authored
Merge Pull Request from: @tangbinzy
This PR adapts the 4.19 kernel to the openEuler 22.03 system; step one is just for initial kernel use.
Kernel Issue:
1) The problems of the 4.19 kernel on the openEuler 22.03 system: https://gitee.com/openeuler/kernel/issues/I5Q0UG
2) The common problems of the 4.19 kernel on 22.03/20.03:
2.1. https://gitee.com/openeuler/kernel/issues/I5QR5E
2.2. https://gitee.com/openeuler/kernel/issues/I5QSAP
2.3. https://gitee.com/openeuler/kernel/issues/I5RTF5
2.4. https://gitee.com/openeuler/kernel/issues/I5RZPX
Default config change: N/A
Link: https://gitee.com/openeuler/kernel/pulls/122
Reviewed-by: Jackie Liu <liuyun01@kylinos.cn>
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>

Sep 29, 2022

Guo Mengqi authored
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5R0X9
CVE: NA
--------------------------------
Fix an AA deadlock caused by a nested lock in mg_sp_group_add_task(). Deadlock path:
mg_sp_group_add_task()
  down_write(sp_group_sem)
  find_or_alloc_sp_group()
    !spg_valid()
    sp_group_drop()
      free_sp_group() -> down_write(sp_group_sem) ---> AA deadlock
Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>
Reviewed-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

Guo Mengqi authored
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5QQPG
CVE: NA
--------------------------------
Add a size-0 check in mg_sp_make_share_k2u() to avoid passing a 0-size spa to __insert_sp_area().
Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>
Reviewed-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

Guo Mengqi authored
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5QETC
CVE: NA
--------------------------------
sp_make_share_k2u only supports vmalloc addresses now. Therefore, delete a backup handle case.
Signed-off-by: Guo Mengqi <guomengqi3@huawei.com>
Reviewed-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

Luo Meng authored
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5TMYD
CVE: NA
--------------------------------
This reverts commit 7959a470.
Commit 2fe0e281f7ad ("cifs: fix double free race when mount fails in cifs_get_root()") fixes a double free. However, there is no such issue on 4.19 because it will return after cifs_cleanup_volume_info(). Since merging this patch, cifs_cleanup_volume_info() is skipped, leading to a memory leak.
Signed-off-by: Luo Meng <luomeng12@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

Sep 28, 2022

Xingui Yang authored
driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5SXSB
CVE: NA
------------------------------------------------
When the port is detached, EH will clear ATA_EH_RESET in ehc->i.action when calling ata_eh_reset(), so the device reset won't be executed, and the disk won't return other I/Os normally after an NCQ error. In addition, the abort operation has been added, so resource release is safe. Therefore, release the NCQ command lldd resources directly in hisi_sas_abort_task() on NCQ error, without a soft reset, to make sure the read log command can be executed successfully later. Soft reset still needs to be used in other scenarios.
Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Reviewed-by: kang fenglong <kangfenglong@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

Xingui Yang authored
driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5QDH7
CVE: NA
----------------------------------
The SAS controller determines the disk to which I/Os are delivered based on the port id in the DQ entry when a SATA disk is directly connected. If the link is intermittently disconnected during I/O sending and the port id changes and is used by another link, data inconsistency on the SATA disk may occur during I/O retry. So enable force phy to force the command to be executed on a certain phy; if the port's phy does not match the phy configured in the command, the chip will stop delivering I/Os to the disk.
Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Reviewed-by: kang fenglong <kangfenglong@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

Xingui Yang authored
driver inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q63H
CVE: NA
-------------------------------------
When an NCQ error occurs, the SAS controller will abnormally complete the I/Os newly delivered to the disk, and bit8 in CQ dw3 will be set to 1 to indicate that the current SATA disk is in error status. The current processing flow sets ts->stat to SAS_OPEN_REJECT, and then sas_ata_task_done() sets the fis stat to ATA_ERR. After being analyzed by ata_eh_analyze_tf(), err_mask will be set to AC_ERR_HSM. If a media error occurs four times within 10 minutes and the chip rejects new I/Os four times, NCQ will be disabled due to excessive errors. However, if media errors occur multiple times, NCQ mode shouldn't be disabled. Therefore, use sas_task_abort() to handle abnormally completed I/Os when the SATA disk is in error status.
[10253.397429] hisi_sas_v3_hw 0000:b4:02.0: erroneous completion disk err dev id=2 sas_addr=0x5000000000000605 CQ hdr: 0x400903 0x2007f 0x0 0x80470000
[10253.397430] hisi_sas_v3_hw 0000:b4:02.0: erroneous completion iptt=135 task= pK-error dev id=2 sas_addr=0x5000000000000605 CQ hdr: 0x203 0x20087 0x0 0x100 Error info: 0x0 0x0 0x0 0x0
[10253.397432] hisi_sas_v3_hw 0000:b4:02.0: erroneous completion iptt=136 task= pK-error dev id=2 sas_addr=0x5000000000000605 CQ hdr: 0x203 0x20088 0x0 0x100 Error info: 0x0 0x0 0x0 0x0
Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Reviewed-by: kang fenglong <kangfenglong@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

Hui Tang authored
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5TIOZ
CVE: NA
--------------------------------
BUG: KASAN: double-free or invalid-free in sched_prefer_cpus_free[...]
Freed by task 0:
 save_stack mm/kasan/kasan.c:448 [inline]
 set_track mm/kasan/kasan.c:460 [inline]
 __kasan_slab_free+0x120/0x228 mm/kasan/kasan.c:521
 kasan_slab_free+0x10/0x18 mm/kasan/kasan.c:528
 slab_free_hook mm/slub.c:1397 [inline]
 slab_free_freelist_hook mm/slub.c:1425 [inline]
 slab_free mm/slub.c:3004 [inline]
 kfree+0x84/0x250 mm/slub.c:3965
 sched_prefer_cpus_free+0x58/0x78 kernel/sched/core.c:7219
 free_task+0xb0/0xe8 kernel/fork.c:463
 __delayed_free_task+0x24/0x30 kernel/fork.c:1716
 __rcu_reclaim kernel/rcu/rcu.h:236 [inline]
 rcu_do_batch+0x200/0x5e0 kernel/rcu/tree.c:2584
 invoke_rcu_callbacks kernel/rcu/tree.c:2897 [inline]
 __rcu_process_callbacks kernel/rcu/tree.c:2864 [inline]
 rcu_process_callbacks+0x470/0xb60 kernel/rcu/tree.c:2881
 __do_softirq+0x2d0/0xba0 kernel/softirq.c:292
Initialize 'tsk->se.dyn_affi_stats' to NULL in dup_task_struct().
Fixes: ebca52ab ("sched: Add statistics for scheduler dynamic affinity")
Signed-off-by: Hui Tang <tanghui20@huawei.com>
Reviewed-by: Zhang Qiao <zhangqiao22@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

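Why the NULL initialization fixes a double free can be modeled in a few lines. A toy sketch (the field name follows the commit; the structs and functions are invented): fork-style duplication copies the parent's task struct wholesale, so the child briefly aliases the parent's stats buffer, and if both tasks are later freed, the buffer is freed twice.

```c
#include <stddef.h>

/* Toy model of the bug, not the scheduler code. */
struct task_sketch {
    int *dyn_affi_stats;   /* heap buffer owned by exactly one task */
};

static void dup_task_sketch(struct task_sketch *child,
                            const struct task_sketch *parent)
{
    *child = *parent;               /* arch_dup_task_struct()-style copy:
                                     * child now aliases parent's buffer */
    child->dyn_affi_stats = NULL;   /* the fix: drop the inherited pointer
                                     * so only the parent ever frees it */
}
```

Without the reset, freeing both parent and child would call free() on the same pointer, exactly the KASAN double-free reported above.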
John Donnelly authored
mainline inclusion from mainline-v5.10-rc1 commit 8c4e0f21
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5SXLB
CVE: NA
--------------------------------
Corrects drivers/target/target_core_user.c:688:6: warning: 'page' may be used uninitialized.
Link: https://lore.kernel.org/r/20200924001920.43594-1-john.p.donnelly@oracle.com
Fixes: 3c58f737 ("scsi: target: tcmu: Optimize use of flush_dcache_page")
Cc: Mike Christie <michael.christie@oracle.com>
Acked-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: John Donnelly <john.p.donnelly@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Wenchao Hao <haowenchao@huawei.com>
Reviewed-by: lijinlin <lijinlin3@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

Bodo Stroesser authored
mainline inclusion from mainline-v5.9-rc1 commit 5a0c256d
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5SXLB
CVE: NA
--------------------------------
If tcmu_handle_completions() has to process a padding shorter than sizeof(struct tcmu_cmd_entry), the current call to tcmu_flush_dcache_range() with sizeof(struct tcmu_cmd_entry) as length param is wrong and causes crashes on e.g. ARM, because tcmu_flush_dcache_range() in this case calls flush_dcache_page(vmalloc_to_page(start)); with start being an invalid address above the end of the vmalloc'ed area. The fix is to use the minimum of remaining ring space and sizeof(struct tcmu_cmd_entry) as the length param.
The patch was tested on kernel 4.19.118. See https://bugzilla.kernel.org/show_bug.cgi?id=208045#c10
Link: https://lore.kernel.org/r/20200629093756.8947-1-bstroesser@ts.fujitsu.com
Tested-by: JiangYu <lnsyyj@hotmail.com>
Acked-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Wenchao Hao <haowenchao@huawei.com>
Reviewed-by: lijinlin <lijinlin3@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

Bodo Stroesser authored
mainline inclusion from mainline-v5.9-rc1 commit 3c58f737
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5SXLB
CVE: NA
--------------------------------
(scatter|gather)_data_area() need to flush dcache after writing data to or before reading data from a page in the uio data area. The two routines are able to handle data transfer to/from such a page in fragments and flush the cache after each fragment was copied by calling the wrapper tcmu_flush_dcache_range(). That means:
1) flush_dcache_page() can be called multiple times for the same page.
2) Calling flush_dcache_page() indirectly using the wrapper does not make sense, because each call of the wrapper is for one single page only and the calling routine already has the correct page pointer.
Change (scatter|gather)_data_area() such that, instead of calling tcmu_flush_dcache_range() before/after each memcpy, it now calls flush_dcache_page() before unmapping a page (when writing is complete for that page) or after mapping a page (when starting to read the page). After this change only calls to tcmu_flush_dcache_range() for addresses in the vmalloc'ed command ring are left over.
The patch was tested on ARM with kernel 4.19.118 and 5.7.2.
Link: https://lore.kernel.org/r/20200618131632.32748-2-bstroesser@ts.fujitsu.com
Tested-by: JiangYu <lnsyyj@hotmail.com>
Tested-by: Daniel Meyerholt <dxm523@gmail.com>
Acked-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Wenchao Hao <haowenchao@huawei.com>
Reviewed-by: lijinlin <lijinlin3@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

Bodo Stroesser authored
mainline inclusion from mainline-v5.8-rc1 commit 8c4e0f21
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5SXLB
CVE: NA
--------------------------------
1) If the remaining ring space before the end of the ring is smaller than the next cmd to write, tcmu writes a padding entry which fills the remaining space at the end of the ring. Then tcmu calls tcmu_flush_dcache_range() with the size of struct tcmu_cmd_entry as the data length to flush. If the space filled by the padding was smaller than tcmu_cmd_entry, tcmu_flush_dcache_range() is called for an address range reaching behind the end of the vmalloc'ed ring. tcmu_flush_dcache_range() in a loop calls flush_dcache_page(virt_to_page(start)); for every page being part of the range. On x86 the line is optimized out by the compiler, as flush_dcache_page() is empty on x86. But I assume the above can cause trouble on other architectures that really have a flush_dcache_page(). For paddings only the header part of an entry is relevant; due to alignment rules the header always fits in the remaining space, if padding is needed. So tcmu_flush_dcache_range() can safely be called with sizeof(entry->hdr) as the length here.
2) After it has written a command to the cmd ring, tcmu calls tcmu_flush_dcache_range() using the size of a struct tcmu_cmd_entry as the data length to flush. But if a command needs many iovecs, the real size of the command may be bigger than tcmu_cmd_entry, so a part of the written command is not flushed then.
Link: https://lore.kernel.org/r/20200528193108.9085-1-bstroesser@ts.fujitsu.com
Acked-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Wenchao Hao <haowenchao@huawei.com>
Reviewed-by: lijinlin <lijinlin3@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

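The two flush-length fixes in this series boil down to simple length arithmetic. A hedged sketch (plain functions with invented names, not the tcmu code): for a padding entry only the header exists at the ring tail, so the flush must never run past the space left in the ring; for a real command the flush must cover the command's actual size, which can exceed sizeof(struct tcmu_cmd_entry) when many iovecs are attached.

```c
#include <stddef.h>

/* Padding at the ring tail: flush at most the header, and never more
 * than the space that physically remains before the ring's end. */
static size_t pad_flush_len(size_t ring_space_left, size_t hdr_size)
{
    return ring_space_left < hdr_size ? ring_space_left : hdr_size;
}

/* A written command: flush its real size. The buggy code capped this
 * at sizeof(struct tcmu_cmd_entry), under-flushing large commands. */
static size_t cmd_flush_len(size_t real_cmd_size)
{
    return real_cmd_size;
}
```

Both directions of the bug, over-flushing past vmalloc'ed memory and under-flushing a large command, come from using one fixed structure size where a computed length was needed.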
Ye Weihua authored
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5T8FD
CVE: NA
--------------------------------
__send_signal_locked() invokes __sigqueue_alloc(), which may invoke a normal printk() to print a failure message. This can cause a deadlock in the scenario reported by syzbot below (tested on 5.10):

CPU0                              CPU1
----                              ----
lock(&sighand->siglock);
                                  lock(&tty->read_wait);
                                  lock(&sighand->siglock);
lock(console_owner);

This patch specifies __GFP_NOWARN to __sigqueue_alloc(), so that printk will not be called, and this deadlock problem can be avoided. Syzbot reported the following lockdep error:

======================================================
WARNING: possible circular locking dependency detected
5.10.0-04424-ga472e3c833d3 #1 Not tainted
------------------------------------------------------
syz-executor.2/31970 is trying to acquire lock:
ffffa00014066a60 (console_owner){-.-.}-{0:0}, at: console_trylock_spinning+0xf0/0x2e0 kernel/printk/printk.c:1854
but task is already holding lock:
ffff0000ddb38a98 (&sighand->siglock){-.-.}-{2:2}, at: force_sig_info_to_task+0x60/0x260 kernel/signal.c:1322
which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #4 (&sighand->siglock){-.-.}-{2:2}:
validate_chain+0x6dc/0xb0c kernel/locking/lockdep.c:3728 __lock_acquire+0x498/0x940 kernel/locking/lockdep.c:4954 lock_acquire+0x228/0x580 kernel/locking/lockdep.c:5564 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0xc0/0x15c kernel/locking/spinlock.c:159 __lock_task_sighand+0xf0/0x370 kernel/signal.c:1396 lock_task_sighand include/linux/sched/signal.h:699 [inline] task_work_add+0x1f8/0x2a0 kernel/task_work.c:58 io_req_task_work_add+0x98/0x10c fs/io_uring.c:2115 __io_async_wake+0x338/0x780 fs/io_uring.c:4984 io_poll_wake+0x40/0x50 fs/io_uring.c:5461 __wake_up_common+0xcc/0x2a0 kernel/sched/wait.c:93 __wake_up_common_lock+0xd0/0x130 kernel/sched/wait.c:123 __wake_up+0x1c/0x24 kernel/sched/wait.c:142 pty_set_termios+0x1ac/0x2d0 drivers/tty/pty.c:286 tty_set_termios+0x310/0x46c drivers/tty/tty_ioctl.c:334 set_termios.part.0+0x2dc/0xa50 drivers/tty/tty_ioctl.c:414 set_termios drivers/tty/tty_ioctl.c:368 [inline] tty_mode_ioctl+0x4f4/0xbec drivers/tty/tty_ioctl.c:736 n_tty_ioctl_helper+0x74/0x260 drivers/tty/tty_ioctl.c:883 n_tty_ioctl+0x80/0x3d0 drivers/tty/n_tty.c:2516 tty_ioctl+0x508/0x1100 drivers/tty/tty_io.c:2751 vfs_ioctl fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl fs/ioctl.c:739 [inline] __arm64_sys_ioctl+0x12c/0x18c fs/ioctl.c:739 __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline] invoke_syscall arch/arm64/kernel/syscall.c:48 [inline] el0_svc_common.constprop.0+0xf8/0x420 arch/arm64/kernel/syscall.c:155 do_el0_svc+0x50/0x120 arch/arm64/kernel/syscall.c:217 el0_svc+0x20/0x30 arch/arm64/kernel/entry-common.c:353 el0_sync_handler+0xe4/0x1e0 arch/arm64/kernel/entry-common.c:369 el0_sync+0x148/0x180 arch/arm64/kernel/entry.S:683

-> #3 (&tty->read_wait){....}-{2:2}:
validate_chain+0x6dc/0xb0c kernel/locking/lockdep.c:3728 __lock_acquire+0x498/0x940 kernel/locking/lockdep.c:4954 lock_acquire+0x228/0x580 kernel/locking/lockdep.c:5564 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0xa0/0x120 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:354 [inline] io_poll_double_wake+0x158/0x30c fs/io_uring.c:5093 __wake_up_common+0xcc/0x2a0 kernel/sched/wait.c:93 __wake_up_common_lock+0xd0/0x130 kernel/sched/wait.c:123 __wake_up+0x1c/0x24 kernel/sched/wait.c:142 pty_close+0x1bc/0x330 drivers/tty/pty.c:68 tty_release+0x1e0/0x88c drivers/tty/tty_io.c:1761 __fput+0x1dc/0x500 fs/file_table.c:281 ____fput+0x24/0x30 fs/file_table.c:314 task_work_run+0xf4/0x1ec kernel/task_work.c:151 tracehook_notify_resume include/linux/tracehook.h:188 [inline] do_notify_resume+0x378/0x410 arch/arm64/kernel/signal.c:718 work_pending+0xc/0x198

-> #2 (&tty->write_wait){....}-{2:2}:
validate_chain+0x6dc/0xb0c kernel/locking/lockdep.c:3728 __lock_acquire+0x498/0x940 kernel/locking/lockdep.c:4954 lock_acquire+0x228/0x580 kernel/locking/lockdep.c:5564 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0xc0/0x15c kernel/locking/spinlock.c:159 __wake_up_common_lock+0xb0/0x130 kernel/sched/wait.c:122 __wake_up+0x1c/0x24 kernel/sched/wait.c:142 tty_wakeup+0x54/0xbc drivers/tty/tty_io.c:539 tty_port_default_wakeup+0x38/0x50 drivers/tty/tty_port.c:50 tty_port_tty_wakeup+0x3c/0x50 drivers/tty/tty_port.c:388 uart_write_wakeup+0x38/0x60 drivers/tty/serial/serial_core.c:106 pl011_tx_chars+0x530/0x5c0 drivers/tty/serial/amba-pl011.c:1418 pl011_start_tx_pio drivers/tty/serial/amba-pl011.c:1303 [inline] pl011_start_tx+0x1b4/0x430 drivers/tty/serial/amba-pl011.c:1315 __uart_start.isra.0+0xb4/0xcc drivers/tty/serial/serial_core.c:127 uart_write+0x21c/0x460 drivers/tty/serial/serial_core.c:613 process_output_block+0x120/0x3ac drivers/tty/n_tty.c:590 n_tty_write+0x2c8/0x650 drivers/tty/n_tty.c:2383 do_tty_write drivers/tty/tty_io.c:1028 [inline] file_tty_write.constprop.0+0x2d0/0x520 drivers/tty/tty_io.c:1118 tty_write drivers/tty/tty_io.c:1125 [inline] redirected_tty_write+0xe4/0x104 drivers/tty/tty_io.c:1147 call_write_iter include/linux/fs.h:1960 [inline] new_sync_write+0x264/0x37c fs/read_write.c:515 vfs_write+0x694/0x9d0 fs/read_write.c:602 ksys_write+0xfc/0x200 fs/read_write.c:655 __do_sys_write fs/read_write.c:667 [inline] __se_sys_write fs/read_write.c:664 [inline] __arm64_sys_write+0x50/0x60 fs/read_write.c:664 __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline] invoke_syscall arch/arm64/kernel/syscall.c:48 [inline] el0_svc_common.constprop.0+0xf8/0x420 arch/arm64/kernel/syscall.c:155 do_el0_svc+0x50/0x120 arch/arm64/kernel/syscall.c:217 el0_svc+0x20/0x30 arch/arm64/kernel/entry-common.c:353 el0_sync_handler+0xe4/0x1e0 arch/arm64/kernel/entry-common.c:369 el0_sync+0x148/0x180 arch/arm64/kernel/entry.S:683

-> #1 (&port_lock_key){-.-.}-{2:2}:
validate_chain+0x6dc/0xb0c kernel/locking/lockdep.c:3728 __lock_acquire+0x498/0x940 kernel/locking/lockdep.c:4954 lock_acquire+0x228/0x580 kernel/locking/lockdep.c:5564 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0xa0/0x120 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:354 [inline] pl011_console_write+0x2f0/0x410 drivers/tty/serial/amba-pl011.c:2263 call_console_drivers.constprop.0+0x1f8/0x3b0 kernel/printk/printk.c:1932 console_unlock+0x36c/0x9ec kernel/printk/printk.c:2553 vprintk_emit+0x40c/0x4b0 kernel/printk/printk.c:2075 vprintk_default+0x48/0x54 kernel/printk/printk.c:2092 vprintk_func+0x1f0/0x40c kernel/printk/printk_safe.c:404 printk+0xbc/0xf0 kernel/printk/printk.c:2123 register_console+0x580/0x790 kernel/printk/printk.c:2905 uart_configure_port.constprop.0+0x4a0/0x4e0 drivers/tty/serial/serial_core.c:2431 uart_add_one_port+0x378/0x550 drivers/tty/serial/serial_core.c:2944 pl011_register_port+0xb4/0x210 drivers/tty/serial/amba-pl011.c:2686 pl011_probe+0x334/0x3ec drivers/tty/serial/amba-pl011.c:2736 amba_probe+0x14c/0x2f0 drivers/amba/bus.c:283 really_probe+0x210/0xa5c drivers/base/dd.c:562 driver_probe_device+0x1c8/0x280 drivers/base/dd.c:747 __device_attach_driver+0x18c/0x260 drivers/base/dd.c:853 bus_for_each_drv+0x120/0x1a0 drivers/base/bus.c:431 __device_attach+0x16c/0x3b4 drivers/base/dd.c:922 device_initial_probe+0x28/0x34 drivers/base/dd.c:971 bus_probe_device+0x124/0x13c drivers/base/bus.c:491 fw_devlink_resume+0x164/0x270 drivers/base/core.c:1601 of_platform_default_populate_init+0xf4/0x114 drivers/of/platform.c:543 do_one_initcall+0x11c/0x770 init/main.c:1217 do_initcall_level+0x364/0x388 init/main.c:1290 do_initcalls+0x90/0xc0 init/main.c:1306 do_basic_setup init/main.c:1326 [inline] kernel_init_freeable+0x57c/0x63c init/main.c:1529 kernel_init+0x1c/0x20c init/main.c:1417 ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1034

-> #0 (console_owner){-.-.}-{0:0}:
check_prev_add+0xe0/0x105c kernel/locking/lockdep.c:2988 check_prevs_add+0x1c8/0x3d4 kernel/locking/lockdep.c:3113 validate_chain+0x6dc/0xb0c kernel/locking/lockdep.c:3728 __lock_acquire+0x498/0x940 kernel/locking/lockdep.c:4954 lock_acquire+0x228/0x580 kernel/locking/lockdep.c:5564 console_trylock_spinning+0x130/0x2e0 kernel/printk/printk.c:1875 vprintk_emit+0x268/0x4b0 kernel/printk/printk.c:2074 vprintk_default+0x48/0x54 kernel/printk/printk.c:2092 vprintk_func+0x1f0/0x40c kernel/printk/printk_safe.c:404 printk+0xbc/0xf0 kernel/printk/printk.c:2123 fail_dump lib/fault-inject.c:45 [inline] should_fail+0x2a0/0x370 lib/fault-inject.c:146 __should_failslab+0x8c/0xe0 mm/failslab.c:33 should_failslab+0x14/0x2c mm/slab_common.c:1181 slab_pre_alloc_hook mm/slab.h:495 [inline] slab_alloc_node mm/slub.c:2842 [inline] slab_alloc mm/slub.c:2931 [inline] kmem_cache_alloc+0x8c/0xe64 mm/slub.c:2936 __sigqueue_alloc+0x224/0x5a4 kernel/signal.c:437 __send_signal+0x700/0xeac kernel/signal.c:1121 send_signal+0x348/0x6a0 kernel/signal.c:1247 force_sig_info_to_task+0x184/0x260 kernel/signal.c:1339 force_sig_fault_to_task kernel/signal.c:1678 [inline]
force_sig_fault+0xb0/0xf0 kernel/signal.c:1685 arm64_force_sig_fault arch/arm64/kernel/traps.c:182 [inline] arm64_notify_die arch/arm64/kernel/traps.c:208 [inline] arm64_notify_die+0xdc/0x160 arch/arm64/kernel/traps.c:199 do_sp_pc_abort+0x4c/0x60 arch/arm64/mm/fault.c:794 el0_pc+0xd8/0x19c arch/arm64/kernel/entry-common.c:309 el0_sync_handler+0x12c/0x1e0 arch/arm64/kernel/entry-common.c:394 el0_sync+0x148/0x180 arch/arm64/kernel/entry.S:683 other info that might help us debug this: Chain exists of: console_owner --> &tty->read_wait --> &sighand->siglock Signed-off-by:
Ye Weihua <yeweihua4@huawei.com> Reviewed-by:
Kuohai Xu <xukuohai@huawei.com> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
Qi Zheng authored
mainline inclusion from mainline-v5.19-rc1 commit 3f913fc5f9745613088d3c569778c9813ab9c129 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5T8FD CVE: NA

--------------------------------

We expect no warnings to be issued when we specify __GFP_NOWARN, but currently in paths like alloc_pages() and kmalloc() some warnings are still printed; fix that. Warnings that report usage problems are deliberately left alone: if such a warning is printed, the usage itself should be fixed. One such case is:

WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1));

[zhengqi.arch@bytedance.com: v2]
Link: https://lkml.kernel.org/r/20220511061951.1114-1-zhengqi.arch@bytedance.com
Link: https://lkml.kernel.org/r/20220510113809.80626-1-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Greg Kroah-Hartman <gregkh@linuxfo...
-
- Sep 27, 2022
-
-
Like Xu authored
mainline inclusion from mainline-v5.18 commit 75189d1de1b377e580ebd2d2c55914631eac9c64 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5SDUS CVE: NA

-------------

The NMI watchdog is one of kernel developers' favorite features, but it does not work in an AMD guest even with vPMU enabled; worse, the system misrepresents this capability via /proc. This is a PMC emulation error: KVM does not pass the latest valid value to perf_event in time while the guest NMI watchdog is running, so the perf_event corresponding to the watchdog counter reverts to a stale state at some point after the first guest NMI injection, forcing the hardware register PMC0 to be constantly written to 0x800000000001. The running counter should instead accurately reflect its new value based on the latest coordinated pmc->counter (from the vPMC's point of view) rather than the value last written directly by the guest.

Fixes: 168d918f ("KVM: x86: Adjust counter sample period after a wrmsr")
Reported-by: Dongli Cao <caodongli@kingsoft.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Tested-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <20220409015226.38619-1-likexu@tencent.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com> [Yanan Wang: Adapt the code to linux v4.19]
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
-
Eric Hankland authored
mainline inclusion from mainline-v5.6 commit 168d918f category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5SDUS CVE: NA

-------------

The sample_period of a counter tracks when that counter will overflow and set global status/trigger a PMI. However this currently only gets set when the initial counter is created or when a counter is resumed; this updates the sample period after a wrmsr so running counters will accurately reflect their new value.

Signed-off-by: Eric Hankland <ehankland@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com> [Yanan Wang: Adapt the code to linux v4.19]
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
-
Eric Hankland authored
mainline inclusion from mainline-v5.5 commit 4400cf54 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5SDUS CVE: NA

-------------

Correct the logic in intel_pmu_set_msr() for fixed and general purpose counters. This was recently changed to set pmc->counter without taking into account the value of pmc_read_counter(), which will be incorrect if the counter is currently running and non-zero; this changes back to the old logic, which accounted for the value of currently running counters.

Signed-off-by: Eric Hankland <ehankland@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
-
Like Xu authored
mainline inclusion from mainline-v5.4 commit 3ca270fc category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5SDUS CVE: NA

-------------

Currently, perf_event_period() is used by user tools via ioctl. Following the naming convention, export perf_event_period() for kernel users (such as KVM) that may recalibrate the event period for their assigned counter according to their requirements. perf_event_period() is an external accessor, just like perf_event_{en,dis}able(), and should thus use perf_event_ctx_lock().

Suggested-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
-
Dongliang Mu authored
stable inclusion from stable-v4.19.238 commit 0113fa98a49a8e46a19b0ad80f29c904c6feec23 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5RX5X CVE: CVE-2022-3239

---------------------------

[ Upstream commit c08eadca1bdfa099e20a32f8fa4b52b2f672236d ]

Commit 47677e51 ("[media] em28xx: Only deallocate struct em28xx after finishing all extensions") adds kref_get to many init functions (e.g., em28xx_audio_init). However, kref_init is called too late in em28xx_usb_probe, since em28xx_init_dev, which runs before it, will invoke those init functions and call kref_get. A refcount bug then occurs in my local syzkaller instance. Fix it by moving kref_init before em28xx_init_dev. This issue affects not only dev but also dev->dev_next.

Fixes: 47677e51 ("[media] em28xx: Only deallocate struct em28xx after finishing all extensions")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.co...
-
- Sep 26, 2022
-
-
Li Lingfeng authored
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5SR8X CVE: NA

--------------------------------

======================================================
WARNING: possible circular locking dependency detected
4.18.0+ #4 Tainted: G ---------r- -
------------------------------------------------------
dmsetup/923 is trying to acquire lock:
000000008d8170dd (kn->count#184){++++}, at: kernfs_remove+0x24/0x40 fs/kernfs/dir.c:1354

but task is already holding lock:
000000003377330b (slab_mutex){+.+.}, at: kmem_cache_destroy+0xec/0x320 mm/slab_common.c:928

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (slab_mutex){+.+.}:
 __mutex_lock_common kernel/locking/mutex.c:925 [inline]
 __mutex_lock+0x105/0x11a0 kernel/locking/mutex.c:1072
 slab_attr_store+0x6d/0xe0 mm/slub.c:5526
 sysfs_kf_write+0x10f/0x170 fs/sysfs/file.c:139
 kernfs_fop_write+0x290/0x440 fs/kernfs/file.c:316
 __vfs_write+0x81/0x100 fs/read_write.c:485
 vfs_write+0x184/0x4c0 fs/read_write.c:549
 ksys_write+0xc6/0x1a0 fs/read_write.c:598
 do_syscall_64+0xca/0x5a0 arch/x86/entry/common.c:298
 entry_SYSCALL_64_after_hwframe+0x6a/0xdf

-> #0 (kn->count#184){++++}:
 lock_acquire+0x10f/0x340 kernel/locking/lockdep.c:3868
 kernfs_drain fs/kernfs/dir.c:467 [inline]
 __kernfs_remove fs/kernfs/dir.c:1320 [inline]
 __kernfs_remove+0x6d0/0x890 fs/kernfs/dir.c:1279
 kernfs_remove+0x24/0x40 fs/kernfs/dir.c:1354
 sysfs_remove_dir+0xb6/0xf0 fs/sysfs/dir.c:99
 kobject_del.part.1+0x35/0xe0 lib/kobject.c:573
 kobject_del+0x1b/0x30 lib/kobject.c:569
 shutdown_cache+0x17f/0x310 mm/slab_common.c:592
 kmem_cache_destroy+0x263/0x320 mm/slab_common.c:943
 bio_put_slab block/bio.c:152 [inline]
 bioset_exit+0x20d/0x330 block/bio.c:1916
 cleanup_mapped_device+0x64/0x360 drivers/md/dm.c:1903
 free_dev+0xbc/0x240 drivers/md/dm.c:2058
 __dm_destroy+0x317/0x490 drivers/md/dm.c:2426
 dm_hash_remove_all+0x8f/0x250 drivers/md/dm-ioctl.c:314
 remove_all+0x4d/0x90 drivers/md/dm-ioctl.c:471
 ctl_ioctl+0x426/0x910 drivers/md/dm-ioctl.c:1870
 dm_ctl_ioctl+0x23/0x30 drivers/md/dm-ioctl.c:1892
 vfs_ioctl fs/ioctl.c:46 [inline]
 file_ioctl fs/ioctl.c:509 [inline]
 do_vfs_ioctl+0x1a5/0x1100 fs/ioctl.c:696
 ksys_ioctl+0x7c/0xa0 fs/ioctl.c:713
 __do_sys_ioctl fs/ioctl.c:720 [inline]
 __se_sys_ioctl fs/ioctl.c:718 [inline]
 __x64_sys_ioctl+0x74/0xb0 fs/ioctl.c:718
 do_syscall_64+0xca/0x5a0 arch/x86/entry/common.c:298
 entry_SYSCALL_64_after_hwframe+0x6a/0xdf

other info that might help us debug this:

Possible unsafe locking scenario:

 CPU0                    CPU1
 ----                    ----
 lock(slab_mutex);
                         lock(kn->count#184);
                         lock(slab_mutex);
 lock(kn->count#184);

A potential deadlock may occur when we remove and write a slab-attr-file in /sys/kernfs/slab/xxx/ at the same time. The lock sequence in the remove process is: slab_mutex --> kn->count. The lock sequence in the write process is: kn->count --> slab_mutex. This can be fixed by replacing mutex_lock with mutex_trylock in slab_attr_store.

Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
-
Baokun Li authored
hulk inclusion category: bugfix bugzilla: 187600, https://gitee.com/openeuler/kernel/issues/I5SV2U CVE: NA

--------------------------------

If the starting position of our insert range happens to fall in the hole between two ext4_extent_idx, then, because the lblk of every ext4_extent in the previous ext4_extent_idx is less than start, the "extent" variable is accessed out of bounds, triggering the following UAF:

==================================================================
BUG: KASAN: use-after-free in ext4_ext_shift_extents+0x257/0x790
Read of size 4 at addr ffff88819807a008 by task fallocate/8010
CPU: 3 PID: 8010 Comm: fallocate Tainted: G E 5.10.0+ #492
Call Trace:
 dump_stack+0x7d/0xa3
 print_address_description.constprop.0+0x1e/0x220
 kasan_report.cold+0x67/0x7f
 ext4_ext_shift_extents+0x257/0x790
 ext4_insert_range+0x5b6/0x700
 ext4_fallocate+0x39e/0x3d0
 vfs_fallocate+0x26f/0x470
 ksys_fallocate+0x3a/0x70
 __x64_sys_fallocate+0x4f/0x60
 do_syscall_64+0x33/0x40
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
==================================================================

For right shifts, the cases are:
1. The first ee_block of the ext4_extent_idx is greater than or equal to start: make right shifts directly from the first ee_block.
   1) If it is greater than start, continue searching in the previous ext4_extent_idx.
   2) If it is equal to start, exit the loop (iterator = NULL).
2. The first ee_block of the ext4_extent_idx is less than start: traverse from the last extent to find the first extent whose ee_block is less than start.
   1) If the extent is still the last extent after the traversal, the last ee_block of the ext4_extent_idx is less than start, i.e. start lies in the hole between idx and (idx+1), so exit the loop directly (break) without any right shift.
   2) Otherwise, make right shifts at the position of the found extent, then exit the loop (iterator = NULL).

Fixes: 331573fe ("ext4: Add support FALLOC_FL_INSERT_RANGE for fallocate")
Cc: stable@vger.kernel.org
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
-
Zhihao Cheng authored
hulk inclusion category: bugfix bugzilla: 187046, https://gitee.com/openeuler/kernel/issues/I5QH0X CVE: NA

--------------------------------

It would be better to do more sanity checking (e.g. dqdh_entries, block no.) for the content read from the quota file, which can prevent corrupting the quota file.

Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
-
Zhihao Cheng authored
hulk inclusion category: bugfix bugzilla: 187046, https://gitee.com/openeuler/kernel/issues/I5QH0X CVE: NA

--------------------------------

Clean up all block checking places, replacing them with the helper function do_check_range().

Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
-
Zhihao Cheng authored
hulk inclusion category: bugfix bugzilla: 187046, https://gitee.com/openeuler/kernel/issues/I5QH0X CVE: NA

--------------------------------

Consider the following process:

Init: v2_read_file_info: <3> dqi_free_blk 0 dqi_free_entry 5 dqi_blks 6

Step 1. chown bin f_a -> dquot_acquire -> v2_write_dquot:
 qtree_write_dquot
  do_insert_tree
   find_free_dqentry
    get_free_dqblk
     write_blk(info->dqi_blocks) // info->dqi_blocks = 6, failure. The content in the physical block (corresponding to blk 6) is random.

Step 2. chown root f_a -> dquot_transfer -> dqput_all -> dqput -> ext4_release_dquot -> v2_release_dquot -> qtree_delete_dquot:
 dquot_release
  remove_tree
   free_dqentry
    put_free_dqblk(6)
     info->dqi_free_blk = blk // info->dqi_free_blk = 6

Step 3. drop cache (the buffer head for block 6 is released)

Step 4. chown bin f_b -> dquot_acquire -> commit_dqblk -> v2_write_dquot:
 qtree_write_dquot
  do_insert_tree
   find_free_dqentry
    get_free_dqblk
     dh = (struct qt_disk_dqdbheader *)buf
     blk = info->dqi_free_blk // 6
     ret = read_blk(info, blk, buf) // The content of buf is random
     info->dqi_free_blk = le32_to_cpu(dh->dqdh_next_free) // random blk

Step 5. chown bin f_c -> notify_change -> ext4_setattr -> dquot_transfer:
 dquot = dqget -> acquire_dquot -> ext4_acquire_dquot -> dquot_acquire -> commit_dqblk -> v2_write_dquot -> dq_insert_tree:
  do_insert_tree
   find_free_dqentry
    get_free_dqblk
     blk = info->dqi_free_blk // If blk < 0 and blk is not an error code, it will be returned as dquot
 transfer_to[USRQUOTA] = dquot // A random negative value
 __dquot_transfer(transfer_to)
  dquot_add_inodes(transfer_to[cnt])
   spin_lock(&dquot->dq_dqb_lock) // page fault

which will lead to a kernel page fault:

 Quota error (device sda): qtree_write_dquot: Error -8000 occurred while creating quota
 BUG: unable to handle page fault for address: ffffffffffffe120
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 Oops: 0002 [#1] PREEMPT SMP
 CPU: 0 PID: 5974 Comm: chown Not tainted 6.0.0-rc1-00004
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
 RIP: 0010:_raw_spin_lock+0x3a/0x90
 Call Trace:
  dquot_add_inodes+0x28/0x270
  __dquot_transfer+0x377/0x840
  dquot_transfer+0xde/0x540
  ext4_setattr+0x405/0x14d0
  notify_change+0x68e/0x9f0
  chown_common+0x300/0x430
  __x64_sys_fchownat+0x29/0x40

In order to avoid accessing an invalid quota memory address, this patch adds block number checking of the next/prev free block read from the quota file. Fetch a reproducer in [Link].

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216372
Fixes: 1da177e4 ("Linux-2.6.12-rc2")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
-
Hyunwoo Kim authored
mainline inclusion from mainline-v6.0-rc5 commit 9cb636b5f6a8cc6d1b50809ec8f8d33ae0c84c95 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5QI0W CVE: CVE-2022-40307

---------------------------

A race condition may occur if the user calls close() on another thread during a write() operation on the device node of the efi capsule. This race between the efi_capsule_write() and efi_capsule_flush() functions of efi_capsule_fops ultimately results in a use-after-free. So the page freeing is moved from efi_capsule_flush() to efi_capsule_release().

Cc: <stable@vger.kernel.org> # v4.9+
Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
Link: https://lore.kernel.org/all/20220907102920.GA88602@ubuntu/
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Xia Longlong <xialonglong1@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Xiu Jianfeng <x...
-