- Sep 28, 2022
-
-
Bodo Stroesser authored
mainline inclusion from mainline-v5.8-rc1 commit 8c4e0f21 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5SXLB CVE: NA -------------------------------- 1) If remaining ring space before the end of the ring is smaller then the next cmd to write, tcmu writes a padding entry which fills the remaining space at the end of the ring. Then tcmu calls tcmu_flush_dcache_range() with the size of struct tcmu_cmd_entry as data length to flush. If the space filled by the padding was smaller then tcmu_cmd_entry, tcmu_flush_dcache_range() is called for an address range reaching behind the end of the vmalloc'ed ring. tcmu_flush_dcache_range() in a loop calls flush_dcache_page(virt_to_page(start)); for every page being part of the range. On x86 the line is optimized out by the compiler, as flush_dcache_page() is empty on x86. But I assume the above can cause trouble on other architectures that really have a flush_dcache_page(). For paddings only the header part of an entry is relevant due to alignment rules the header always fits in the remaining space, if padding is needed. So tcmu_flush_dcache_range() can safely be called with sizeof(entry->hdr) as the length here. 2) After it has written a command to cmd ring, tcmu calls tcmu_flush_dcache_range() using the size of a struct tcmu_cmd_entry as data length to flush. But if a command needs many iovecs, the real size of the command may be bigger then tcmu_cmd_entry, so a part of the written command is not flushed then. Link: https://lore.kernel.org/r/20200528193108.9085-1-bstroesser@ts.fujitsu.com Acked-by:
Mike Christie <michael.christie@oracle.com> Signed-off-by:
Bodo Stroesser <bstroesser@ts.fujitsu.com> Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Wenchao Hao <haowenchao@huawei.com> Reviewed-by:
lijinlin <lijinlin3@huawei.com> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
Ye Weihua authored
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5T8FD CVE: NA -------------------------------- __dend_signal_locked() invokes __sigqueue_alloc() which may invoke a normal printk() to print failure message. This can cause a deadlock in the scenario reported by syz-bot below (test in 5.10): CPU0 CPU1 ---- ---- lock(&sighand->siglock); lock(&tty->read_wait); lock(&sighand->siglock); lock(console_owner); This patch specities __GFP_NOWARN to __sigqueue_alloc(), so that printk will not be called, and this deadlock problem can be avoided. Syzbot reported the following lockdep error: ====================================================== WARNING: possible circular locking dependency detected 5.10.0-04424-ga472e3c833d3 #1 Not tainted ------------------------------------------------------ syz-executor.2/31970 is trying to acquire lock: ffffa00014066a60 (console_owner){-.-.}-{0:0}, at: console_trylock_spinning+0xf0/0x2e0 kernel/printk/printk.c:1854 but task is already holding lock: ffff0000ddb38a98 (&sighand->siglock){-.-.}-{2:2}, at: force_sig_info_to_task+0x60/0x260 kernel/signal.c:1322 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #4 (&sighand->siglock){-.-.}-{2:2}: validate_chain+0x6dc/0xb0c kernel/locking/lockdep.c:3728 __lock_acquire+0x498/0x940 kernel/locking/lockdep.c:4954 lock_acquire+0x228/0x580 kernel/locking/lockdep.c:5564 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0xc0/0x15c kernel/locking/spinlock.c:159 __lock_task_sighand+0xf0/0x370 kernel/signal.c:1396 lock_task_sighand include/linux/sched/signal.h:699 [inline] task_work_add+0x1f8/0x2a0 kernel/task_work.c:58 io_req_task_work_add+0x98/0x10c fs/io_uring.c:2115 __io_async_wake+0x338/0x780 fs/io_uring.c:4984 io_poll_wake+0x40/0x50 fs/io_uring.c:5461 __wake_up_common+0xcc/0x2a0 kernel/sched/wait.c:93 __wake_up_common_lock+0xd0/0x130 kernel/sched/wait.c:123 __wake_up+0x1c/0x24 kernel/sched/wait.c:142 pty_set_termios+0x1ac/0x2d0 drivers/tty/pty.c:286 tty_set_termios+0x310/0x46c drivers/tty/tty_ioctl.c:334 set_termios.part.0+0x2dc/0xa50 drivers/tty/tty_ioctl.c:414 set_termios drivers/tty/tty_ioctl.c:368 [inline] tty_mode_ioctl+0x4f4/0xbec drivers/tty/tty_ioctl.c:736 n_tty_ioctl_helper+0x74/0x260 drivers/tty/tty_ioctl.c:883 n_tty_ioctl+0x80/0x3d0 drivers/tty/n_tty.c:2516 tty_ioctl+0x508/0x1100 drivers/tty/tty_io.c:2751 vfs_ioctl fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl fs/ioctl.c:739 [inline] __arm64_sys_ioctl+0x12c/0x18c fs/ioctl.c:739 __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline] invoke_syscall arch/arm64/kernel/syscall.c:48 [inline] el0_svc_common.constprop.0+0xf8/0x420 arch/arm64/kernel/syscall.c:155 do_el0_svc+0x50/0x120 arch/arm64/kernel/syscall.c:217 el0_svc+0x20/0x30 arch/arm64/kernel/entry-common.c:353 el0_sync_handler+0xe4/0x1e0 arch/arm64/kernel/entry-common.c:369 el0_sync+0x148/0x180 arch/arm64/kernel/entry.S:683 -> #3 (&tty->read_wait){....}-{2:2}: validate_chain+0x6dc/0xb0c kernel/locking/lockdep.c:3728 __lock_acquire+0x498/0x940 kernel/locking/lockdep.c:4954 lock_acquire+0x228/0x580 kernel/locking/lockdep.c:5564 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0xa0/0x120 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:354 [inline] io_poll_double_wake+0x158/0x30c fs/io_uring.c:5093 __wake_up_common+0xcc/0x2a0 kernel/sched/wait.c:93 __wake_up_common_lock+0xd0/0x130 kernel/sched/wait.c:123 __wake_up+0x1c/0x24 kernel/sched/wait.c:142 pty_close+0x1bc/0x330 drivers/tty/pty.c:68 tty_release+0x1e0/0x88c drivers/tty/tty_io.c:1761 __fput+0x1dc/0x500 fs/file_table.c:281 ____fput+0x24/0x30 fs/file_table.c:314 task_work_run+0xf4/0x1ec kernel/task_work.c:151 tracehook_notify_resume include/linux/tracehook.h:188 [inline] do_notify_resume+0x378/0x410 arch/arm64/kernel/signal.c:718 work_pending+0xc/0x198 -> #2 (&tty->write_wait){....}-{2:2}: validate_chain+0x6dc/0xb0c kernel/locking/lockdep.c:3728 __lock_acquire+0x498/0x940 kernel/locking/lockdep.c:4954 lock_acquire+0x228/0x580 kernel/locking/lockdep.c:5564 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0xc0/0x15c kernel/locking/spinlock.c:159 __wake_up_common_lock+0xb0/0x130 kernel/sched/wait.c:122 __wake_up+0x1c/0x24 kernel/sched/wait.c:142 tty_wakeup+0x54/0xbc drivers/tty/tty_io.c:539 tty_port_default_wakeup+0x38/0x50 drivers/tty/tty_port.c:50 tty_port_tty_wakeup+0x3c/0x50 drivers/tty/tty_port.c:388 uart_write_wakeup+0x38/0x60 drivers/tty/serial/serial_core.c:106 pl011_tx_chars+0x530/0x5c0 drivers/tty/serial/amba-pl011.c:1418 pl011_start_tx_pio drivers/tty/serial/amba-pl011.c:1303 [inline] pl011_start_tx+0x1b4/0x430 drivers/tty/serial/amba-pl011.c:1315 __uart_start.isra.0+0xb4/0xcc drivers/tty/serial/serial_core.c:127 uart_write+0x21c/0x460 drivers/tty/serial/serial_core.c:613 process_output_block+0x120/0x3ac drivers/tty/n_tty.c:590 n_tty_write+0x2c8/0x650 drivers/tty/n_tty.c:2383 do_tty_write drivers/tty/tty_io.c:1028 [inline] file_tty_write.constprop.0+0x2d0/0x520 drivers/tty/tty_io.c:1118 tty_write drivers/tty/tty_io.c:1125 [inline] redirected_tty_write+0xe4/0x104 drivers/tty/tty_io.c:1147 call_write_iter include/linux/fs.h:1960 [inline] new_sync_write+0x264/0x37c fs/read_write.c:515 vfs_write+0x694/0x9d0 fs/read_write.c:602 ksys_write+0xfc/0x200 fs/read_write.c:655 __do_sys_write fs/read_write.c:667 [inline] __se_sys_write fs/read_write.c:664 [inline] __arm64_sys_write+0x50/0x60 fs/read_write.c:664 __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline] invoke_syscall arch/arm64/kernel/syscall.c:48 [inline] el0_svc_common.constprop.0+0xf8/0x420 arch/arm64/kernel/syscall.c:155 do_el0_svc+0x50/0x120 arch/arm64/kernel/syscall.c:217 el0_svc+0x20/0x30 arch/arm64/kernel/entry-common.c:353 el0_sync_handler+0xe4/0x1e0 arch/arm64/kernel/entry-common.c:369 el0_sync+0x148/0x180 arch/arm64/kernel/entry.S:683 -> #1 (&port_lock_key){-.-.}-{2:2}: validate_chain+0x6dc/0xb0c kernel/locking/lockdep.c:3728 __lock_acquire+0x498/0x940 kernel/locking/lockdep.c:4954 lock_acquire+0x228/0x580 kernel/locking/lockdep.c:5564 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0xa0/0x120 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:354 [inline] pl011_console_write+0x2f0/0x410 drivers/tty/serial/amba-pl011.c:2263 call_console_drivers.constprop.0+0x1f8/0x3b0 kernel/printk/printk.c:1932 console_unlock+0x36c/0x9ec kernel/printk/printk.c:2553 vprintk_emit+0x40c/0x4b0 kernel/printk/printk.c:2075 vprintk_default+0x48/0x54 kernel/printk/printk.c:2092 vprintk_func+0x1f0/0x40c kernel/printk/printk_safe.c:404 printk+0xbc/0xf0 kernel/printk/printk.c:2123 register_console+0x580/0x790 kernel/printk/printk.c:2905 uart_configure_port.constprop.0+0x4a0/0x4e0 drivers/tty/serial/serial_core.c:2431 uart_add_one_port+0x378/0x550 drivers/tty/serial/serial_core.c:2944 pl011_register_port+0xb4/0x210 drivers/tty/serial/amba-pl011.c:2686 pl011_probe+0x334/0x3ec drivers/tty/serial/amba-pl011.c:2736 amba_probe+0x14c/0x2f0 drivers/amba/bus.c:283 really_probe+0x210/0xa5c drivers/base/dd.c:562 driver_probe_device+0x1c8/0x280 drivers/base/dd.c:747 __device_attach_driver+0x18c/0x260 drivers/base/dd.c:853 bus_for_each_drv+0x120/0x1a0 drivers/base/bus.c:431 __device_attach+0x16c/0x3b4 drivers/base/dd.c:922 device_initial_probe+0x28/0x34 drivers/base/dd.c:971 bus_probe_device+0x124/0x13c drivers/base/bus.c:491 fw_devlink_resume+0x164/0x270 drivers/base/core.c:1601 of_platform_default_populate_init+0xf4/0x114 drivers/of/platform.c:543 do_one_initcall+0x11c/0x770 init/main.c:1217 do_initcall_level+0x364/0x388 init/main.c:1290 do_initcalls+0x90/0xc0 init/main.c:1306 do_basic_setup init/main.c:1326 [inline] kernel_init_freeable+0x57c/0x63c init/main.c:1529 kernel_init+0x1c/0x20c init/main.c:1417 ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1034 -> #0 (console_owner){-.-.}-{0:0}: check_prev_add+0xe0/0x105c kernel/locking/lockdep.c:2988 check_prevs_add+0x1c8/0x3d4 kernel/locking/lockdep.c:3113 validate_chain+0x6dc/0xb0c kernel/locking/lockdep.c:3728 __lock_acquire+0x498/0x940 kernel/locking/lockdep.c:4954 lock_acquire+0x228/0x580 kernel/locking/lockdep.c:5564 console_trylock_spinning+0x130/0x2e0 kernel/printk/printk.c:1875 vprintk_emit+0x268/0x4b0 kernel/printk/printk.c:2074 vprintk_default+0x48/0x54 kernel/printk/printk.c:2092 vprintk_func+0x1f0/0x40c kernel/printk/printk_safe.c:404 printk+0xbc/0xf0 kernel/printk/printk.c:2123 fail_dump lib/fault-inject.c:45 [inline] should_fail+0x2a0/0x370 lib/fault-inject.c:146 __should_failslab+0x8c/0xe0 mm/failslab.c:33 should_failslab+0x14/0x2c mm/slab_common.c:1181 slab_pre_alloc_hook mm/slab.h:495 [inline] slab_alloc_node mm/slub.c:2842 [inline] slab_alloc mm/slub.c:2931 [inline] kmem_cache_alloc+0x8c/0xe64 mm/slub.c:2936 __sigqueue_alloc+0x224/0x5a4 kernel/signal.c:437 __send_signal+0x700/0xeac kernel/signal.c:1121 send_signal+0x348/0x6a0 kernel/signal.c:1247 force_sig_info_to_task+0x184/0x260 kernel/signal.c:1339 force_sig_fault_to_task kernel/signal.c:1678 [inline] force_sig_fault+0xb0/0xf0 kernel/signal.c:1685 arm64_force_sig_fault arch/arm64/kernel/traps.c:182 [inline] arm64_notify_die arch/arm64/kernel/traps.c:208 [inline] arm64_notify_die+0xdc/0x160 arch/arm64/kernel/traps.c:199 do_sp_pc_abort+0x4c/0x60 arch/arm64/mm/fault.c:794 el0_pc+0xd8/0x19c arch/arm64/kernel/entry-common.c:309 el0_sync_handler+0x12c/0x1e0 arch/arm64/kernel/entry-common.c:394 el0_sync+0x148/0x180 arch/arm64/kernel/entry.S:683 other info that might help us debug this: Chain exists of: console_owner --> &tty->read_wait --> &sighand->siglock Signed-off-by:
Ye Weihua <yeweihua4@huawei.com> Reviewed-by:
Kuohai Xu <xukuohai@huawei.com> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
Qi Zheng authored
mainline inclusion from mainline-v5.19-rc1 commit 3f913fc5f9745613088d3c569778c9813ab9c129 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5T8FD CVE: NA -------------------------------- We expect no warnings to be issued when we specify __GFP_NOWARN, but currently in paths like alloc_pages() and kmalloc(), there are still some warnings printed, fix it. But for some warnings that report usage problems, we don't deal with them. If such warnings are printed, then we should fix the usage problems. Such as the following case: WARN_ON_ONCE((gfp_flags & __GFP_NOFAIL) && (order > 1)); [zhengqi.arch@bytedance.com: v2] Link: https://lkml.kernel.org/r/20220511061951.1114-1-zhengqi.arch@bytedance.com Link: https://lkml.kernel.org/r/20220510113809.80626-1-zhengqi.arch@bytedance.com Signed-off-by:
Qi Zheng <zhengqi.arch@bytedance.com> Cc: Akinobu Mita <akinobu.mita@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Greg Kroah-Hartman <gregkh@linuxfo...
-
- Sep 27, 2022
-
-
Like Xu authored
mainline inclusion from mainline-v5.18 commit 75189d1de1b377e580ebd2d2c55914631eac9c64 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5SDUS CVE: NA ------------- NMI-watchdog is one of the favorite features of kernel developers, but it does not work in AMD guest even with vPMU enabled and worse, the system misrepresents this capability via /proc. This is a PMC emulation error. KVM does not pass the latest valid value to perf_event in time when guest NMI-watchdog is running, thus the perf_event corresponding to the watchdog counter will enter the old state at some point after the first guest NMI injection, forcing the hardware register PMC0 to be constantly written to 0x800000000001. Meanwhile, the running counter should accurately reflect its new value based on the latest coordinated pmc->counter (from vPMC's point of view) rather than the value written directly by the guest. Fixes: 168d918f ("KVM: x86: Adjust counter sample period after a wrmsr") Reported-by:
Dongli Cao <caodongli@kingsoft.com> Signed-off-by:
Like Xu <likexu@tencent.com> Reviewed-by:
Yanan Wang <wangyanan55@huawei.com> Tested-by:
Yanan Wang <wangyanan55@huawei.com> Reviewed-by:
Jim Mattson <jmattson@google.com> Message-Id: <20220409015226.38619-1-likexu@tencent.com> Cc: stable@vger.kernel.org Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Yanan Wang <wangyanan55@huawei.com> [Yanan Wang: Adapt the code to linux v4.19] Reviewed-by:
Zenghui Yu <yuzenghui@huawei.com> Signed-off-by:
Laibin Qiu <qiulaibin@huawei.com>
-
Eric Hankland authored
mainline inclusion from mainline-v5.6 commit 168d918f category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5SDUS CVE: NA ------------- The sample_period of a counter tracks when that counter will overflow and set global status/trigger a PMI. However this currently only gets set when the initial counter is created or when a counter is resumed; this updates the sample period after a wrmsr so running counters will accurately reflect their new value. Signed-off-by:
Eric Hankland <ehankland@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Yanan Wang <wangyanan55@huawei.com> [Yanan Wang: Adapt the code to linux v4.19] Reviewed-by:
Zenghui Yu <yuzenghui@huawei.com> Signed-off-by:
Laibin Qiu <qiulaibin@huawei.com>
-
Eric Hankland authored
mainline inclusion from mainline-v5.5 commit 4400cf54 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5SDUS CVE: NA ------------- Correct the logic in intel_pmu_set_msr() for fixed and general purpose counters. This was recently changed to set pmc->counter without taking in to account the value of pmc_read_counter() which will be incorrect if the counter is currently running and non-zero; this changes back to the old logic which accounted for the value of currently running counters. Signed-off-by:
Eric Hankland <ehankland@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Yanan Wang <wangyanan55@huawei.com> Reviewed-by:
Zenghui Yu <yuzenghui@huawei.com> Signed-off-by:
Laibin Qiu <qiulaibin@huawei.com>
-
Like Xu authored
mainline inclusion from mainline-v5.4 commit 3ca270fc category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I5SDUS CVE: NA ------------- Currently, perf_event_period() is used by user tools via ioctl. Based on naming convention, exporting perf_event_period() for kernel users (such as KVM) who may recalibrate the event period for their assigned counter according to their requirements. The perf_event_period() is an external accessor, just like the perf_event_{en,dis}able() and should thus use perf_event_ctx_lock(). Suggested-by:
Kan Liang <kan.liang@linux.intel.com> Signed-off-by:
Like Xu <like.xu@linux.intel.com> Acked-by:
Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Yanan Wang <wangyanan55@huawei.com> Reviewed-by:
Zenghui Yu <yuzenghui@huawei.com> Signed-off-by:
Laibin Qiu <qiulaibin@huawei.com>
-
Dongliang Mu authored
stable inclusion from stable-v4.19.238 commit 0113fa98a49a8e46a19b0ad80f29c904c6feec23 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5RX5X CVE: CVE-2022-3239 --------------------------- [ Upstream commit c08eadca1bdfa099e20a32f8fa4b52b2f672236d ] The commit 47677e51("[media] em28xx: Only deallocate struct em28xx after finishing all extensions") adds kref_get to many init functions (e.g., em28xx_audio_init). However, kref_init is called too late in em28xx_usb_probe, since em28xx_init_dev before will invoke those init functions and call kref_get function. Then refcount bug occurs in my local syzkaller instance. Fix it by moving kref_init before em28xx_init_dev. This issue occurs not only in dev but also dev->dev_next. Fixes: 47677e51 ("[media] em28xx: Only deallocate struct em28xx after finishing all extensions") Reported-by:
syzkaller <syzkaller@googlegroups.com> Signed-off-by:
Dongliang Mu <mudongliangabcd@gmail.co...>
-
- Sep 26, 2022
-
-
Li Lingfeng authored
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5SR8X CVE: NA -------------------------------- ====================================================== WARNING: possible circular locking dependency detected 4.18.0+ #4 Tainted: G ---------r- - ------------------------------------------------------ dmsetup/923 is trying to acquire lock: 000000008d8170dd (kn->count#184){++++}, at: kernfs_remove+0x24/0x40 fs/kernfs/dir.c:1354 but task is already holding lock: 000000003377330b (slab_mutex){+.+.}, at: kmem_cache_destroy+0xec/0x320 mm/slab_common.c:928 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (slab_mutex){+.+.}: __mutex_lock_common kernel/locking/mutex.c:925 [inline] __mutex_lock+0x105/0x11a0 kernel/locking/mutex.c:1072 slab_attr_store+0x6d/0xe0 mm/slub.c:5526 sysfs_kf_write+0x10f/0x170 fs/sysfs/file.c:139 kernfs_fop_write+0x290/0x440 fs/kernfs/file.c:316 __vfs_write+0x81/0x100 fs/read_write.c:485 vfs_write+0x184/0x4c0 fs/read_write.c:549 ksys_write+0xc6/0x1a0 fs/read_write.c:598 do_syscall_64+0xca/0x5a0 arch/x86/entry/common.c:298 entry_SYSCALL_64_after_hwframe+0x6a/0xdf -> #0 (kn->count#184){++++}: lock_acquire+0x10f/0x340 kernel/locking/lockdep.c:3868 kernfs_drain fs/kernfs/dir.c:467 [inline] __kernfs_remove fs/kernfs/dir.c:1320 [inline] __kernfs_remove+0x6d0/0x890 fs/kernfs/dir.c:1279 kernfs_remove+0x24/0x40 fs/kernfs/dir.c:1354 sysfs_remove_dir+0xb6/0xf0 fs/sysfs/dir.c:99 kobject_del.part.1+0x35/0xe0 lib/kobject.c:573 kobject_del+0x1b/0x30 lib/kobject.c:569 shutdown_cache+0x17f/0x310 mm/slab_common.c:592 kmem_cache_destroy+0x263/0x320 mm/slab_common.c:943 bio_put_slab block/bio.c:152 [inline] bioset_exit+0x20d/0x330 block/bio.c:1916 cleanup_mapped_device+0x64/0x360 drivers/md/dm.c:1903 free_dev+0xbc/0x240 drivers/md/dm.c:2058 __dm_destroy+0x317/0x490 drivers/md/dm.c:2426 dm_hash_remove_all+0x8f/0x250 drivers/md/dm-ioctl.c:314 remove_all+0x4d/0x90 drivers/md/dm-ioctl.c:471 ctl_ioctl+0x426/0x910 drivers/md/dm-ioctl.c:1870 dm_ctl_ioctl+0x23/0x30 drivers/md/dm-ioctl.c:1892 vfs_ioctl fs/ioctl.c:46 [inline] file_ioctl fs/ioctl.c:509 [inline] do_vfs_ioctl+0x1a5/0x1100 fs/ioctl.c:696 ksys_ioctl+0x7c/0xa0 fs/ioctl.c:713 __do_sys_ioctl fs/ioctl.c:720 [inline] __se_sys_ioctl fs/ioctl.c:718 [inline] __x64_sys_ioctl+0x74/0xb0 fs/ioctl.c:718 do_syscall_64+0xca/0x5a0 arch/x86/entry/common.c:298 entry_SYSCALL_64_after_hwframe+0x6a/0xdf other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(slab_mutex); lock(kn->count#184); lock(slab_mutex); lock(kn->count#184); A potential deadlock may occur when we remove and write a slab-attr-file in /sys/kernfs/slab/xxx/ at the same time. The lock sequence in remove process is: slab_mutex --> kn->count The lock sequence in write process is: kn->count --> slab_mutex This can be fixed by replacing mutex_lock with mutex_trylock in slab_attr_store. Signed-off-by:
Li Lingfeng <lilingfeng3@huawei.com> Reviewed-by:
Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
Baokun Li authored
hulk inclusion category: bugfix bugzilla: 187600, https://gitee.com/openeuler/kernel/issues/I5SV2U CVE: NA -------------------------------- If the starting position of our insert range happens to be in the hole between the two ext4_extent_idx, because the lblk of the ext4_extent in the previous ext4_extent_idx is always less than the start, which leads to the "extent" variable access across the boundary, the following UAF is triggered: ================================================================== BUG: KASAN: use-after-free in ext4_ext_shift_extents+0x257/0x790 Read of size 4 at addr ffff88819807a008 by task fallocate/8010 CPU: 3 PID: 8010 Comm: fallocate Tainted: G E 5.10.0+ #492 Call Trace: dump_stack+0x7d/0xa3 print_address_description.constprop.0+0x1e/0x220 kasan_report.cold+0x67/0x7f ext4_ext_shift_extents+0x257/0x790 ext4_insert_range+0x5b6/0x700 ext4_fallocate+0x39e/0x3d0 vfs_fallocate+0x26f/0x470 ksys_fallocate+0x3a/0x70 __x64_sys_fallocate+0x4f/0x60 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 ================================================================== For right shifts, we can divide them into the following situations: 1. When the first ee_block of ext4_extent_idx is greater than or equal to start, make right shifts directly from the first ee_block. 1) If it is greater than start, we need to continue searching in the previous ext4_extent_idx. 2) If it is equal to start, we can exit the loop (iterator=NULL). 2. When the first ee_block of ext4_extent_idx is less than start, then traverse from the last extent to find the first extent whose ee_block is less than start. 1) If extent is still the last extent after traversal, it means that the last ee_block of ext4_extent_idx is less than start, that is, start is located in the hole between idx and (idx+1), so we can exit the loop directly (break) without right shifts. 2) Otherwise, make right shifts at the corresponding position of the found extent, and then exit the loop (iterator=NULL). Fixes: 331573fe ("ext4: Add support FALLOC_FL_INSERT_RANGE for fallocate") Cc: stable@vger.kernel.org Signed-off-by:
Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by:
Baokun Li <libaokun1@huawei.com> Reviewed-by:
Zhang Yi <yi.zhang@huawei.com> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
Zhihao Cheng authored
hulk inclusion category: bugfix bugzilla: 187046, https://gitee.com/openeuler/kernel/issues/I5QH0X CVE: NA -------------------------------- It would be better to do more sanity checking (eg. dqdh_entries, block no.) for the content read from quota file, which can prevent corrupting the quota file. Signed-off-by:
Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by:
Li Lingfeng <lilingfeng3@huawei.com> Reviewed-by:
Zhihao Cheng <chengzhihao1@huawei.com> Reviewed-by:
Zhang Yi <yi.zhang@huawei.com> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
Zhihao Cheng authored
hulk inclusion category: bugfix bugzilla: 187046, https://gitee.com/openeuler/kernel/issues/I5QH0X CVE: NA -------------------------------- Cleanup all block checking places, replace them with helper function do_check_range(). Signed-off-by:
Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by:
Li Lingfeng <lilingfeng3@huawei.com> Reviewed-by:
Zhihao Cheng <chengzhihao1@huawei.com> Reviewed-by:
Zhihao Cheng <chengzhihao1@huawei.com> Reviewed-by:
Zhang Yi <yi.zhang@huawei.com> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
Zhihao Cheng authored
hulk inclusion category: bugfix bugzilla: 187046, https://gitee.com/openeuler/kernel/issues/I5QH0X CVE: NA -------------------------------- Following process: Init: v2_read_file_info: <3> dqi_free_blk 0 dqi_free_entry 5 dqi_blks 6 Step 1. chown bin f_a -> dquot_acquire -> v2_write_dquot: qtree_write_dquot do_insert_tree find_free_dqentry get_free_dqblk write_blk(info->dqi_blocks) // info->dqi_blocks = 6, failure. The content in physical block (corresponding to blk 6) is random. Step 2. chown root f_a -> dquot_transfer -> dqput_all -> dqput -> ext4_release_dquot -> v2_release_dquot -> qtree_delete_dquot: dquot_release remove_tree free_dqentry put_free_dqblk(6) info->dqi_free_blk = blk // info->dqi_free_blk = 6 Step 3. drop cache (buffer head for block 6 is released) Step 4. chown bin f_b -> dquot_acquire -> commit_dqblk -> v2_write_dquot: qtree_write_dquot do_insert_tree find_free_dqentry get_free_dqblk dh = (struct qt_disk_dqdbheader *)buf blk = info->dqi_free_blk // 6 ret = read_blk(info, blk, buf) // The content of buf is random info->dqi_free_blk = le32_to_cpu(dh->dqdh_next_free) // random blk Step 5. chown bin f_c -> notify_change -> ext4_setattr -> dquot_transfer: dquot = dqget -> acquire_dquot -> ext4_acquire_dquot -> dquot_acquire -> commit_dqblk -> v2_write_dquot -> dq_insert_tree: do_insert_tree find_free_dqentry get_free_dqblk blk = info->dqi_free_blk // If blk < 0 and blk is not an error code, it will be returned as dquot transfer_to[USRQUOTA] = dquot // A random negative value __dquot_transfer(transfer_to) dquot_add_inodes(transfer_to[cnt]) spin_lock(&dquot->dq_dqb_lock) // page fault , which will lead to kernel page fault: Quota error (device sda): qtree_write_dquot: Error -8000 occurred while creating quota BUG: unable to handle page fault for address: ffffffffffffe120 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page Oops: 0002 [#1] PREEMPT SMP CPU: 0 PID: 5974 Comm: chown Not tainted 6.0.0-rc1-00004 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) RIP: 0010:_raw_spin_lock+0x3a/0x90 Call Trace: dquot_add_inodes+0x28/0x270 __dquot_transfer+0x377/0x840 dquot_transfer+0xde/0x540 ext4_setattr+0x405/0x14d0 notify_change+0x68e/0x9f0 chown_common+0x300/0x430 __x64_sys_fchownat+0x29/0x40 In order to avoid accessing invalid quota memory address, this patch adds block number checking of next/prev free block read from quota file. Fetch a reproducer in [Link]. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216372 Fixes: 1da177e4 ("Linux-2.6.12-rc2") Signed-off-by:
Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by:
Li Lingfeng <lilingfeng3@huawei.com> Reviewed-by:
Zhihao Cheng <chengzhihao1@huawei.com> Reviewed-by:
Zhang Yi <yi.zhang@huawei.com> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
Hyunwoo Kim authored
mainline inclusion from mainline-v6.0-rc5 commit 9cb636b5f6a8cc6d1b50809ec8f8d33ae0c84c95 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5QI0W CVE: CVE-2022-40307 --------------------------- A race condition may occur if the user calls close() on another thread during a write() operation on the device node of the efi capsule. This is a race condition that occurs between the efi_capsule_write() and efi_capsule_flush() functions of efi_capsule_fops, which ultimately results in UAF. So, the page freeing process is modified to be done in efi_capsule_release() instead of efi_capsule_flush(). Cc: <stable@vger.kernel.org> # v4.9+ Signed-off-by:
Hyunwoo Kim <imv4bel@gmail.com> Link: https://lore.kernel.org/all/20220907102920.GA88602@ubuntu/ Signed-off-by:
Ard Biesheuvel <ardb@kernel.org> Signed-off-by:
Xia Longlong <xialonglong1@huawei.com> Reviewed-by:
Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Xiu Jianfeng <x...
-
Lu Wei authored
mainline inclusion from mainline-v6.0-rc6 commit 81225b2ea161af48e093f58e8dfee6d705b16af4 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5SYBY CVE: NA -------------------------------- If an AF_PACKET socket is used to send packets through ipvlan and the default xmit function of the AF_PACKET socket is changed from dev_queue_xmit() to packet_direct_xmit() via setsockopt() with the option name of PACKET_QDISC_BYPASS, the skb->mac_header may not be reset and remains as the initial value of 65535, this may trigger slab-out-of-bounds bugs as following: ================================================================= UG: KASAN: slab-out-of-bounds in ipvlan_xmit_mode_l2+0xdb/0x330 [ipvlan] PU: 2 PID: 1768 Comm: raw_send Kdump: loaded Not tainted 6.0.0-rc4+ #6 ardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 all Trace: print_address_description.constprop.0+0x1d/0x160 print_report.cold+0x4f/0x112 kasan_report+0xa3/0x130 ipvlan_xmit_mode_l2+0xdb/0x330 [ipvlan] ipvlan_start_xmit+0x29/0xa0 [ipvlan] __dev_direct_xmit+0x2e2/0x380 packet_direct_xmit+0x22/0x60 packet_snd+0x7c9/0xc40 sock_sendmsg+0x9a/0xa0 __sys_sendto+0x18a/0x230 __x64_sys_sendto+0x74/0x90 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd The root cause is: 1. packet_snd() only reset skb->mac_header when sock->type is SOCK_RAW and skb->protocol is not specified as in packet_parse_headers() 2. packet_direct_xmit() doesn't reset skb->mac_header as dev_queue_xmit() In this case, skb->mac_header is 65535 when ipvlan_xmit_mode_l2() is called. So when ipvlan_xmit_mode_l2() gets mac header with eth_hdr() which use "skb->head + skb->mac_header", out-of-bound access occurs. This patch replaces eth_hdr() with skb_eth_hdr() in ipvlan_xmit_mode_l2() and reset mac header in multicast to solve this out-of-bound bug. Fixes: 2ad7bf36 ("ipvlan: Initial check-in of the IPVLAN driver.") Signed-off-by:
Lu Wei <luwei32@huawei.com> Reviewed-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Lu Wei <luwei32@huawei.com> Reviewed-by:
Yue Haibing <yuehaibing@huawei.com> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
Wang Wensheng authored
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5PD4P CVE: NA -------------------------------- [ 2058.802818][ T290] BUG: KASAN: use-after-free in get_process_sp_res+0x70/0x134 [ 2058.810194][ T290] Read of size 8 at addr ffff00088dc6ab28 by task test_debug_loop/290 [ 2058.820520][ T290] CPU: 5 PID: 290 Comm: test_debug_loop Tainted: G W OE 5.10.0+ #2 [ 2058.829377][ T290] Hardware name: EVB(EP) (DT) [ 2058.833982][ T290] Call trace: [ 2058.837217][ T290] dump_backtrace+0x0/0x30c [ 2058.841660][ T290] show_stack+0x20/0x30 [ 2058.845758][ T290] dump_stack+0x120/0x1b0 [ 2058.850028][ T290] print_address_description.constprop.0+0x2c/0x1fc [ 2058.856555][ T290] __kasan_report+0xfc/0x160 [ 2058.861086][ T290] kasan_report+0x44/0xb0 [ 2058.865356][ T290] __asan_load8+0x94/0xd0 [ 2058.869623][ T290] get_process_sp_res+0x70/0x134 [ 2058.874501][ T290] proc_usage_show+0x1ac/0x304 [ 2058.879208][ T290]...
-
David Jeffery authored
mainline inclusion from mainline-v5.18-rc1 commit 8f5fea65b06de1cc51d4fc23fb4d378d1abd6ed7 category: bugfix bugzilla: 187541, https://gitee.com/openeuler/kernel/issues/I5RUM6 CVE: NA -------------------------------- When blk_mq_delay_run_hw_queues sets an hctx to run in the future, it can reset the delay length for an already pending delayed work run_work. This creates a scenario where multiple hctx may have their queues set to run, but if one runs first and finds nothing to do, it can reset the delay of another hctx and stall the other hctx's ability to run requests. To avoid this I/O stall when an hctx's run_work is already pending, leave it untouched to run at its current designated time rather than extending its delay. The work will still run which keeps closed the race calling blk_mq_delay_run_hw_queues is needed for while also avoiding the I/O stall. Signed-off-by:
David Jeffery <djeffery@redhat.com> Reviewed-by:
Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220131203337.GA17666@redhat Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Yu Kuai <yukuai3@huawei.com> Reviewed-by:
Jason Yan <yanaijie@huawei.com> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
Ma Wupeng authored
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4SK3S CVE: NA -------------------------------- For reliable memory allocation bind to nodes which do not hvve any reliable zones, its memory allocation will fail and then warn message will be produced at the end of __alloc_pages_slowpath(). Though this memory allocation can fallback to movable zone in check_after_alloc() if fallback is enabled, something should be done to prevent this pointless warn log. To solve this problem, fallback to movable zone if no suitable zone found. Signed-off-by:
Ma Wupeng <mawupeng1@huawei.com> Reviewed-by:
Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
- Sep 22, 2022
-
-
Yonglong Liu authored
driver inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5S7WZ CVE: NA ---------------------------- Signed-off-by:
Yonglong Liu <liuyonglong@huawei.com> Reviewed-by:
li yongxin <liyongxin1@huawei.com> Signed-off-by:
Laibin Qiu <qiulaibin@huawei.com>
-
Yonglong Liu authored
driver inclusion category: https://gitee.com/openeuler/kernel/issues/I5S7WZ bugzilla: NA CVE: NA ---------------------------- When enable lots of vf, remove the driver of vf will call keep alive wrong resume. Fixes: 6a804d0a ("net: hns3: fix keep alive can not resume problem when system busy") Signed-off-by:
Yonglong Liu <liuyonglong@huawei.com> Reviewed-by:
li yongxin <liyongxin1@huawei.com> Signed-off-by:
Laibin Qiu <qiulaibin@huawei.com>
-
Yonglong Liu authored
driver inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5S7WZ CVE: NA ---------------------------- Signed-off-by:
Yonglong Liu <liuyonglong@huawei.com> Reviewed-by:
li yongxin <liyongxin1@huawei.com> Signed-off-by:
Laibin Qiu <qiulaibin@huawei.com>
-
Yonglong Liu authored
driver inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5S7WZ CVE: NA ---------------------------- Currently, VF send keep alive to PF every 2s, and PF detect the keep alive for 8s, some case, the work queue may schedule late, cause keep alive lost, then the mac setting from PF may not affect to the VF, and the keep alive can not resume, only reset VF or reload VF driver can resume. This patch adds keep alive resume mechanism, and adds some debug print for this case. When link status change between keep alive lost and resume, the link status of VF may not the same as the PF, so adds push link status to VF to avoid this case. Signed-off-by:
Yonglong Liu <liuyonglong@huawei.com> Reviewed-by:
li yongxin <liyongxin1@huawei.com> Signed-off-by:
Laibin Qiu <qiulaibin@huawei.com>
-
- Sep 20, 2022
-
-
Haimin Zhang authored
stable inclusion from stable-v5.10.111 commit b9c5ac0a15f24d63b20f899072fa6dd8c93af136 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5RX0N?from=project-issue CVE: CVE-2022-3202 -------------------------------- [ Upstream commit a53046291020ec41e09181396c1e829287b48d47 ] Add validation check for JFS_IP(ipimap)->i_imap to prevent a NULL deref in diFree since diFree uses it without do any validations. When function jfs_mount calls diMount to initialize fileset inode allocation map, it can fail and JFS_IP(ipimap)->i_imap won't be initialized. Then it calls diFreeSpecial to close fileset inode allocation map inode and it will flow into jfs_evict_inode. Function jfs_evict_inode just validates JFS_SBI(inode->i_sb)->ipimap, then calls diFree. diFree use JFS_IP(ipimap)->i_imap directly, then it will cause a NULL deref. Reported-by:
TCS Robot <tcs_robot@tencent.com> Signed-off-by:
Haimin Zhang <tcs_kernel@tencent.com> Signed-off-by:
Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Wang Hai <wanghai38@huawei.com> Signed-off-by:
ZhaoLong Wang <wangzhaolong1@huawei.com> Reviewed-by:
Zhang Yi <yi.zhang@huawei.com> Signed-off-by:
Laibin Qiu <qiulaibin@huawei.com>
-
Pavel Skripkin authored
mainline inclusion from mainline-6.0-rc4 commit 9d574f985fe33efd6911f4d752de6f485a1ea732 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5RX0N?from=project-issue CVE: CVE-2022-3202 -------------------------------- Avoid passing inode with JFS_SBI(inode->i_sb)->ipimap == NULL to diFree()[1]. GFP will appear: struct inode *ipimap = JFS_SBI(ip->i_sb)->ipimap; struct inomap *imap = JFS_IP(ipimap)->i_imap; JFS_IP() will return invalid pointer when ipimap == NULL Call Trace: diFree+0x13d/0x2dc0 fs/jfs/jfs_imap.c:853 [1] jfs_evict_inode+0x2c9/0x370 fs/jfs/inode.c:154 evict+0x2ed/0x750 fs/inode.c:578 iput_final fs/inode.c:1654 [inline] iput.part.0+0x3fe/0x820 fs/inode.c:1680 iput+0x58/0x70 fs/inode.c:1670 Reported-and-tested-by:
<syzbot+0a89a7b56db04c21a656@syzkaller.appspotmail.com> Signed-off-by:
Pavel Skripkin <paskripkin@gmail.com> Signed-off-by:
Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by:
ZhaoLong Wang <wangzhaolong1@huawei.com> Reviewed-by:
Zhang Yi <yi.zhang@huawei.com> Signed-off-by:
Laibin Qiu <qiulaibin@huawei.com>
-
- Sep 14, 2022
-
-
Jann Horn authored
stable inclusion from stable-v4.19.257 commit c3b1e88f14e7f442e2ddcbec94527eec84ac0ca3 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5PE9S CVE: CVE-2022-39188 -------------------------------- commit b67fbebd4cf980aecbcc750e1462128bffe8ae15 upstream. Some drivers rely on having all VMAs through which a PFN might be accessible listed in the rmap for correctness. However, on X86, it was possible for a VMA with stale TLB entries to not be listed in the rmap. This was fixed in mainline with commit b67fbebd4cf9 ("mmu_gather: Force tlb-flush VM_PFNMAP vmas"), but that commit relies on preceding refactoring in commit 18ba064e42df3 ("mmu_gather: Let there be one tlb_{start,end}_vma() implementation") and commit 1e9fdf21a4339 ("mmu_gather: Remove per arch tlb_{start,end}_vma()"). This patch provides equivalent protection without needing that refactoring, by forcing a TLB flush between removing PTEs in unmap_vmas() and the call to unlink_file_vma() in free_pgtables(). [This is a stable-specific rewrite of the upstream commit!] Signed-off-by:
Jann Horn <jannh@google.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com> Reviewed-by:
Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by:
Xiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by:
zuoze <zuoze1@huawei.com> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
Hyunwoo Kim authored
mainline inclusion from mainline-v5.19-rc4 commit a09d2d00af53b43c6f11e6ab3cb58443c2cac8a7 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5PRMO CVE: CVE-2022-39842 --------------------- In pxa3xx_gcu_write, a count parameter of type size_t is passed to words of type int. Then, copy_from_user() may cause a heap overflow because it is used as the third argument of copy_from_user(). Signed-off-by:
Hyunwoo Kim <imv4bel@gmail.com> Signed-off-by:
Helge Deller <deller@gmx.de> Signed-off-by:
Zhao Wenhui <zhaowenhui8@huawei.com> Reviewed-by:
Chen Hui <judy.chenhui@huawei.com> Reviewed-by:
Xiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
Paolo Bonzini authored
mainline inclusion from mainline-v5.19-rc2 commit 6cd88243c7e03845a450795e134b488fc2afb736 category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5PJ7H CVE: CVE-2022-39189 ---------------------------------------- If a vCPU is outside guest mode and is scheduled out, it might be in the process of making a memory access. A problem occurs if another vCPU uses the PV TLB flush feature during the period when the vCPU is scheduled out, and a virtual address has already been translated but has not yet been accessed, because this is equivalent to using a stale TLB entry. To avoid this, only report a vCPU as preempted if sure that the guest is at an instruction boundary. A rescheduling request will be delivered to the host physical CPU as an external interrupt, so for simplicity consider any vmexit *not* instruction boundary except for external interrupts. It would in principle be okay to report the vCPU as preempte...
-
Andrew Murray authored
mainline inclusion from mainline-v5.6 commit 4942dc66 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5QR6C CVE: NA ------------- On VHE systems arch.mdcr_el2 is written to mdcr_el2 at vcpu_load time to set options for self-hosted debug and the performance monitors extension. Unfortunately the value of arch.mdcr_el2 is not calculated until kvm_arm_setup_debug() in the run loop after the vcpu has been loaded. This means that the initial brief iterations of the run loop use a zero value of mdcr_el2 - until the vcpu is preempted. This also results in a delay between changes to vcpu->guest_debug taking effect. Fix this by writing to mdcr_el2 in kvm_arm_setup_debug() on VHE systems when a change to arch.mdcr_el2 has been detected. Fixes: d5a21bcc ("KVM: arm64: Move common VHE/non-VHE trap config in separate functions") Cc: <stable@vger.kernel.org> # 4.17.x- Suggested-by: James Mors...
-
- Sep 13, 2022
-
-
David Leadbeater authored
mainline inclusion from mainline-v6.0-rc6 commit e8d5dfd1d8747b56077d02664a8838c71ced948e category: bugfix bugzilla: https://gitee.com/src-openeuler/kernel/issues/I5OWZ7 CVE: CVE-2022-2663 --------------------------- CTCP messages should only be at the start of an IRC message, not anywhere within it. While the helper only decodes packes in the ORIGINAL direction, its possible to make a client send a CTCP message back by empedding one into a PING request. As-is, thats enough to make the helper believe that it saw a CTCP message. Fixes: 869f37d8 ("[NETFILTER]: nf_conntrack/nf_nat: add IRC helper port") Signed-off-by:
David Leadbeater <dgl@dgl.cx> Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
Liu Jian <liujian56@huawei.com> Reviewed-by:
Yue Haibing <yuehaibing@huawei.com> Reviewed-by:
Xiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
- Sep 07, 2022
-
-
Kiselev, Oleg authored
stable inclusion from stable-v4.19.256 commit 7bdfb01fc5f6b3696728aeb527c50386e0ee09a1 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q0SQ CVE: NA -------------------------------- [ Upstream commit 69cb8e9d8cd97cdf5e293b26d70a9dee3e35e6bd ] This patch avoids an attempt to resize the filesystem to an unaligned cluster boundary. An online resize to a size that is not integral to cluster size results in the last iteration attempting to grow the fs by a negative amount, which trips a BUG_ON and leaves the fs with a corrupted in-memory superblock. Signed-off-by:
Oleg Kiselev <okiselev@amazon.com> Link: https://lore.kernel.org/r/0E92A0AB-4F16-4F1A-94B7-702CC6504FDE@amazon.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
Hector Martin authored
stable inclusion from stable-v4.19.256 commit 4fffd776f56cc82280fa8c6036ff2051a511bcf8 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q0SQ CVE: NA -------------------------------- commit 415d832497098030241605c52ea83d4e2cfa7879 upstream. These operations are documented as always ordered in include/asm-generic/bitops/instrumented-atomic.h, and producer-consumer type use cases where one side needs to ensure a flag is left pending after some shared data was updated rely on this ordering, even in the failure case. This is the case with the workqueue code, which currently suffers from a reproducible ordering violation on Apple M1 platforms (which are notoriously out-of-order) that ends up causing the TTY layer to fail to deliver data to userspace properly under the right conditions. This change fixes that bug. Change the documentation to restrict the "no order on failure" story to the _lock() variant (for which it makes sense), and remove the early-exit from the generic implementation, which is what causes the missing barrier semantics in that case. Without this, the remaining atomic op is fully ordered (including on ARM64 LSE, as of recent versions of the architecture spec). Suggested-by:
Linus Torvalds <torvalds@linux-foundation.org> Cc: stable@vger.kernel.org Fixes: e986a0d6 ("locking/atomics, asm-generic/bitops/atomic.h: Rewrite using atomic_*() APIs") Fixes: 61e02392 ("locking/atomic/bitops: Document and clarify ordering semantics for failed test_and_{}_bit()") Signed-off-by:
Hector Martin <marcan@marcan.st> Acked-by:
Will Deacon <will@kernel.org> Reviewed-by:
Arnd Bergmann <arnd@arndb.de> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
Matthias May authored
stable inclusion from stable-v4.19.256 commit 381a9935cca88cf66e3098d733cec9e640247ade category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q0SQ CVE: NA -------------------------------- commit ca2bb69514a8bc7f83914122f0d596371352416c upstream. According to Guillaume Nault RT_TOS should never be used for IPv6. Quote: RT_TOS() is an old macro used to interprete IPv4 TOS as described in the obsolete RFC 1349. It's conceptually wrong to use it even in IPv4 code, although, given the current state of the code, most of the existing calls have no consequence. But using RT_TOS() in IPv6 code is always a bug: IPv6 never had a "TOS" field to be interpreted the RFC 1349 way. There's no historical compatibility to worry about. Fixes: 3a56f86f ("geneve: handle ipv6 priority like ipv4 tos") Acked-by:
Guillaume Nault <gnault@redhat.com> Signed-off-by:
Matthias May <matthias.may@westermo.com> Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off...
-
Trond Myklebust authored
stable inclusion from stable-v4.19.256 commit 1429b2fa358c6fbd5e037edac59e20fa02aa783e category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q0SQ CVE: NA -------------------------------- commit 6622e3a73112fc336c1c2c582428fb5ef18e456a upstream. When we're reusing the backchannel requests instead of freeing them, then we should reinitialise any values of the send/receive xdr_bufs so that they reflect the available space. Fixes: 0d2a970d ("SUNRPC: Fix a backchannel race") Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
Trond Myklebust authored
stable inclusion from stable-v4.19.256 commit 0fffb46ff3d5ed4668aca96441ec7a25b793bd6f category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q0SQ CVE: NA -------------------------------- commit 2135e5d56278ffdb1c2e6d325dc6b87f669b9dac upstream. If someone cancels the open RPC call, then we must not try to free either the open slot or the layoutget operation arguments, since they are likely still in use by the hung RPC call. Fixes: 6949493884fe ("NFSv4: Don't hold the layoutget locks across multiple RPC calls") Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
Zhang Xianwei authored
stable inclusion from stable-v4.19.256 commit a0b3a9b0e6832be88b426b242aa3320c3817e2fb category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q0SQ CVE: NA -------------------------------- commit e35a5e782f67ed76a65ad0f23a484444a95f000f upstream. A client should be able to handle getting an EACCES error while doing a mount operation to reclaim state due to NFS4CLNT_RECLAIM_REBOOT being set. If the server returns RPC_AUTH_BADCRED because authentication failed when we execute "exportfs -au", then RECLAIM_COMPLETE will go a wrong way. After mount succeeds, all OPEN call will fail due to an NFS4ERR_GRACE error being returned. This patch is to fix it by resending a RPC request. Signed-off-by:
Zhang Xianwei <zhang.xianwei8@zte.com.cn> Signed-off-by:
Yi Wang <wang.yi59@zte.com.cn> Fixes: aa5190d0 ("NFSv4: Kill nfs4_async_handle_error() abuses by NFSv4.1") Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
Eric Dumazet authored
stable inclusion from stable-v4.19.256 commit e4a7b589cf19d20d587bc720ffbc55bad8ef26c6 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q0SQ CVE: NA -------------------------------- commit c4ee118561a0f74442439b7b5b486db1ac1ddfeb upstream. sk_forced_mem_schedule() has a bug similar to ones fixed in commit 7c80b038d23e ("net: fix sk_wmem_schedule() and sk_rmem_schedule() errors") While this bug has little chance to trigger in old kernels, we need to fix it before the following patch. Fixes: d83769a5 ("tcp: fix possible deadlock in tcp_send_fin()") Signed-off-by:
Eric Dumazet <edumazet@google.com> Acked-by:
Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by:
Shakeel Butt <shakeelb@google.com> Reviewed-by:
Wei Wang <weiwan@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
Eric Whitney authored
stable inclusion from stable-v4.19.256 commit adf404e9d73dcae4f28fff9f7bcd857cfee2b7de category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q0SQ CVE: NA -------------------------------- commit 7f0d8e1d607c1a4fa9a27362a108921d82230874 upstream. A race can occur in the unlikely event ext4 is unable to allocate a physical cluster for a delayed allocation in a bigalloc file system during writeback. Failure to allocate a cluster forces error recovery that includes a call to mpage_release_unused_pages(). That function removes any corresponding delayed allocated blocks from the extent status tree. If a new delayed write is in progress on the same cluster simultaneously, resulting in the addition of an new extent containing one or more blocks in that cluster to the extent status tree, delayed block accounting can be thrown off if that delayed write then encounters a similar cluster allocation failure during future writeback. Write lock the i_data_sem in mpage_release_unused_pages() to fix this problem. Ext4's block/cluster accounting code for bigalloc relies on i_data_sem for mutual exclusion, as is found in the delayed write path, and the locking in mpage_release_unused_pages() is missing. Cc: stable@kernel.org Reported-by:
Ye Bin <yebin10@huawei.com> Signed-off-by:
Eric Whitney <enwlinux@gmail.com> Link: https://lore.kernel.org/r/20220615160530.1928801-1-enwlinux@gmail.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
Theodore Ts'o authored
stable inclusion from stable-v4.19.256 commit 38ac2511a56a74dfc6e0143bae58ef5fc4e318ea category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q0SQ CVE: NA -------------------------------- commit de394a86658ffe4e89e5328fd4993abfe41b7435 upstream. When doing an online resize, the on-disk superblock on-disk wasn't updated. This means that when the file system is unmounted and remounted, and the on-disk overhead value is non-zero, this would result in the results of statfs(2) to be incorrect. This was partially fixed by Commits 10b01ee92df5 ("ext4: fix overhead calculation to account for the reserved gdt blocks"), 85d825dbf489 ("ext4: force overhead calculation if the s_overhead_cluster makes no sense"), and eb7054212eac ("ext4: update the cached overhead value in the superblock"). However, since it was too expensive to forcibly recalculate the overhead for bigalloc file systems at every mount, this didn't fix the problem for bigalloc file systems. This commit should address the problem when resizing file systems with the bigalloc feature enabled. Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Reviewed-by:
Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20220629040026.112371-1-tytso@mit.edu Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
Lukas Czerner authored
stable inclusion from stable-v4.19.256 commit 4816177c9f154594c8a841ed95b883eb07e4d6ce category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q0SQ CVE: NA -------------------------------- commit b8a04fe77ef1360fbf73c80fddbdfeaa9407ed1b upstream. ext4_append() must always allocate a new block, otherwise we run the risk of overwriting existing directory block corrupting the directory tree in the process resulting in all manner of problems later on. Add a sanity check to see if the logical block is already allocated and error out if it is. Cc: stable@kernel.org Signed-off-by:
Lukas Czerner <lczerner@redhat.com> Reviewed-by:
Andreas Dilger <adilger@dilger.ca> Link: https://lore.kernel.org/r/20220704142721.157985-2-lczerner@redhat.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-
Chen Zhongjin authored
stable inclusion from stable-v4.19.256 commit 1c836bad43f3e2ff71cc397a6e6ccb4e7bd116f8 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q0SQ CVE: NA -------------------------------- [ Upstream commit 28f6c37a2910f565b4f5960df52b2eccae28c891 ] kernel_text_address() treats ftrace_trampoline, kprobe_insn_slot and bpf_text_address as valid kprobe addresses - which is not ideal. These text areas are removable and changeable without any notification to kprobes, and probing on them can trigger unexpected behavior: https://lkml.org/lkml/2022/7/26/1148 Considering that jump_label and static_call text are already forbiden to probe, kernel_text_address() should be replaced with core_kernel_text() and is_module_text_address() to check other text areas which are unsafe to kprobe. [ mingo: Rewrote the changelog. ] Fixes: 5b485629 ("kprobes, extable: Identify kprobes trampolines as kernel text area") Fixes: 74451e66 ("bpf: make jited programs visible in traces") Signed-off-by:
Chen Zhongjin <chenzhongjin@huawei.com> Signed-off-by:
Ingo Molnar <mingo@kernel.org> Acked-by:
Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20220801033719.228248-1-chenzhongjin@huawei.com Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yongqiang Liu <liuyongqiang13@huawei.com>
-