  1. Jul 28, 2022
  2. Jul 26, 2022
  3. Jul 25, 2022
  4. Jul 22, 2022
  5. Jul 21, 2022
  6. Jul 20, 2022
    • scsi: ses: fix slab-out-of-bounds in ses_enclosure_data_process · b101167f
      Zhang Wensheng authored
      hulk inclusion
      category: bugfix
      bugzilla: 187025, https://gitee.com/src-openeuler/kernel/issues/I5HPS5
      
      
      CVE: NA
      
      --------------------------------
      
      KASAN reported a bug like the one below:
      [  494.865170] ==================================================================
      [  494.901335] BUG: KASAN: slab-out-of-bounds in ses_enclosure_data_process+0x234/0x6f0 [ses]
      [  494.901347] Write of size 1 at addr ffff8882f3181a70 by task systemd-udevd/1704
      [  494.931929] i801_smbus 0000:00:1f.4: SPD Write Disable is set
      
      [  494.944092] CPU: 12 PID: 1704 Comm: systemd-udevd Tainted: G
      [  494.944101] Hardware name: Huawei 2288H V5/BC11SPSCB0, BIOS 7.01 11/13/2019
      [  494.964003] i801_smbus 0000:00:1f.4: SMBus using PCI interrupt
      [  494.978532] Call Trace:
      [  494.978544]  dump_stack+0xbe/0xf9
      [  494.978558]  print_address_description.constprop.0+0x19/0x130
      [  495.092838]  ? ses_enclosure_data_process+0x234/0x6f0 [ses]
      [  495.092846]  __kasan_report.cold+0x68/0x80
      [  495.092855]  ? __kasan_kmalloc.constprop.0+0x71/0xd0
      [  495.092862]  ? ses_enclosure_data_process+0x234/0x6f0 [ses]
      [  495.092868]  kasan_report+0x3a/0x50
      [  495.092875]  ses_enclosure_data_process+0x234/0x6f0 [ses]
      [  495.092882]  ? mutex_unlock+0x1d/0x40
      [  495.092889]  ses_intf_add+0x57f/0x910 [ses]
      [  495.092900]  class_interface_register+0x26d/0x290
      [  495.092906]  ? class_destroy+0xd0/0xd0
      [  495.092912]  ? 0xffffffffc0bf8000
      [  495.092919]  ses_init+0x18/0x1000 [ses]
      [  495.092927]  do_one_initcall+0xcb/0x370
      [  495.092934]  ? initcall_blacklisted+0x1b0/0x1b0
      [  495.092942]  ? create_object.isra.0+0x330/0x3a0
      [  495.092950]  ? kasan_unpoison_shadow+0x33/0x40
      [  495.092957]  ? kasan_unpoison_shadow+0x33/0x40
      [  495.092966]  do_init_module+0xe4/0x3a0
      [  495.092972]  load_module+0xd0a/0xdd0
      [  495.092980]  ? layout_and_allocate+0x300/0x300
      [  495.092989]  ? seccomp_run_filters+0x1d6/0x2c0
      [  495.092999]  ? kernel_read_file_from_fd+0xb3/0xe0
      [  495.093006]  __se_sys_finit_module+0x11b/0x1b0
      [  495.093012]  ? __ia32_sys_init_module+0x40/0x40
      [  495.093023]  ? __audit_syscall_entry+0x226/0x290
      [  495.093032]  do_syscall_64+0x33/0x40
      [  495.093041]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [  495.093046] RIP: 0033:0x7f39c3376089
      [  495.093054] Code: 00 48 81 c4 80 00 00 00 89 f0 c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e7 dd 0b 00 f7 d8 64 89 01 48
      [  495.093058] RSP: 002b:00007ffdc6009e18 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
      [  495.093068] RAX: ffffffffffffffda RBX: 000055d4192801c0 RCX: 00007f39c3376089
      [  495.093072] RDX: 0000000000000000 RSI: 00007f39c2fae99d RDI: 000000000000000f
      [  495.093076] RBP: 00007f39c2fae99d R08: 0000000000000000 R09: 0000000000000001
      [  495.093080] R10: 000000000000000f R11: 0000000000000246 R12: 0000000000000000
      [  495.093084] R13: 000055d419282e00 R14: 0000000000020000 R15: 000055d41927f1f0
      
      [  495.093091] Allocated by task 1704:
      [  495.093098]  kasan_save_stack+0x1b/0x40
      [  495.093105]  __kasan_kmalloc.constprop.0+0xc2/0xd0
      [  495.093111]  ses_enclosure_data_process+0x65d/0x6f0 [ses]
      [  495.093117]  ses_intf_add+0x57f/0x910 [ses]
      [  495.093123]  class_interface_register+0x26d/0x290
      [  495.093129]  ses_init+0x18/0x1000 [ses]
      [  495.093134]  do_one_initcall+0xcb/0x370
      [  495.093139]  do_init_module+0xe4/0x3a0
      [  495.093144]  load_module+0xd0a/0xdd0
      [  495.093150]  __se_sys_finit_module+0x11b/0x1b0
      [  495.093155]  do_syscall_64+0x33/0x40
      [  495.093162]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      [  495.093168] The buggy address belongs to the object at ffff8882f3181a40
                      which belongs to the cache kmalloc-64 of size 64
      [  495.093173] The buggy address is located 48 bytes inside of
                      64-byte region [ffff8882f3181a40, ffff8882f3181a80)
      [  495.093175] The buggy address belongs to the page:
      [  495.093181] page:ffffea000bcc6000 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x2f3180
      [  495.093186] head:ffffea000bcc6000 order:2 compound_mapcount:0 compound_pincount:0
      [  495.093194] flags: 0x17ffe0000010200(slab|head|node=0|zone=2|lastcpupid=0x3fff)
      [  495.093204] raw: 017ffe0000010200 ffffea0016e5fb08 ffffea0016921508 ffff888100050e00
      [  495.093211] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
      [  495.093213] page dumped because: kasan: bad access detected
      
      [  495.093216] Memory state around the buggy address:
      [  495.093222]  ffff8882f3181900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [  495.093227]  ffff8882f3181980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [  495.093231] >ffff8882f3181a00: fc fc fc fc fc fc fc fc 00 00 00 00 01 fc fc fc
      [  495.093234]                                                              ^
      [  495.093239]  ffff8882f3181a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [  495.093244]  ffff8882f3181b00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [  495.093246] ==================================================================
      
      Analysis of the vmcore showed that the line "desc_ptr[len] = '\0';"
      in ses_enclosure_data_process causes the slab-out-of-bounds write.
      In ses_enclosure_data_process, "desc_ptr" points into "buf", so it
      has to stay within the memory of "buf". Although there is a
      "desc_ptr >= buf + page7_len" check, it is not sufficient because
      "desc_ptr + 4 + len" may be bigger than "buf + page7_len", which
      leads to the slab-out-of-bounds problem.
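      
      A minimal user-space sketch of the kind of bounds check this implies.
      It is not the actual patch: the helper name descriptor_fits and the
      assumption that bytes 2..3 of the 4-byte header hold a big-endian
      payload length are illustrative only. The check requires room for the
      header, the payload, and the terminating byte written at payload[len].
      
      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <stdint.h>
      
      /*
       * Hypothetical helper: returns 1 and stores the payload length if the
       * descriptor at buf[off] (4-byte header + payload + one terminator
       * byte) lies entirely inside buf[0..buf_len), otherwise returns 0.
       */
      static int descriptor_fits(const uint8_t *buf, size_t buf_len, size_t off,
                                 size_t *payload_len)
      {
          size_t len;
      
          assert(buf != NULL && payload_len != NULL);
      
          /* The 4-byte header must be in bounds before it is read. */
          if (off > buf_len || buf_len - off < 4)
              return 0;
      
          /* Assumed layout: bytes 2..3 hold the big-endian payload length. */
          len = ((size_t)buf[off + 2] << 8) | buf[off + 3];
      
          /* Payload plus the terminator at payload[len] must also fit. */
          if (buf_len - off - 4 <= len)
              return 0;
      
          *payload_len = len;
          return 1;
      }
      ```
      
      Checking the remaining space (buf_len - off - 4) against len, rather
      than comparing only the start pointer with the end of the buffer,
      is what catches a descriptor whose header is in bounds but whose
      payload is not.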
      
      Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
      Reviewed-by: Jason Yan <yanaijie@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      b101167f
    • block: don't delete queue kobject before its children · 0698a51d
      Eric Biggers authored
      stable inclusion
      from linux-4.19.238
      commit b2001eb10f59363da930cdd6e086a2861986fa18
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5EWKO
      
      
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 0f69288253e9fc7c495047720e523b9f1aba5712 ]
      
      kobjects aren't supposed to be deleted before their child kobjects are
      deleted.  Apparently this is usually benign; however, a WARN will be
      triggered if one of the child kobjects has a named attribute group:
      
          sysfs group 'modes' not found for kobject 'crypto'
          WARNING: CPU: 0 PID: 1 at fs/sysfs/group.c:278 sysfs_remove_group+0x72/0x80
          ...
          Call Trace:
            sysfs_remove_groups+0x29/0x40 fs/sysfs/group.c:312
            __kobject_del+0x20/0x80 lib/kobject.c:611
            kobject_cleanup+0xa4/0x140 lib/kobject.c:696
            kobject_release lib/kobject.c:736 [inline]
            kref_put include/linux/kref.h:65 [inline]
            kobject_put+0x53/0x70 lib/kobject.c:753
            blk_crypto_sysfs_unregister+0x10/0x20 block/blk-crypto-sysfs.c:159
            blk_unregister_queue+0xb0/0x110 block/blk-sysfs.c:962
            del_gendisk+0x117/0x250 block/genhd.c:610
      
      Fix this by moving the kobject_del() and the corresponding
      kobject_uevent() to the correct place.
      
      Fixes: 2c2086af ("block: Protect less code with sysfs_lock in blk_{un,}register_queue()")
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Link: https://lore.kernel.org/r/20220124215938.2769-3-ebiggers@kernel.org
      
      
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      
      Conflict:
      	block/blk-sysfs.c
      Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
      Reviewed-by: Jason Yan <yanaijie@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      0698a51d
    • etmem:fix kernel stack overflow in do_swapcache_reclaim · 89ea0cd3
      liubo authored
      euleros inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5GN7K
      
      
      CVE: NA
      
      --------------------------------
      In the do_swapcache_reclaim interface,
      there are the following local variables:
      
      unsigned long nr[MAX_NUMNODES];
      unsigned long nr_to_reclaim[MAX_NUMNODES];
      struct list_head swapcache_list[MAX_NUMNODES];
      
      In the kernel, MAX_NUMNODES is defined as (1 << NODES_SHIFT),
      where NODES_SHIFT is CONFIG_NODES_SHIFT.
      
      Under the x86_64 architecture, CONFIG_NODES_SHIFT is
      defined as follows:
      CONFIG_NODES_SHIFT=10
      
      Therefore, under the x86_64 architecture, these local variables
      may overflow the kernel stack.
      
      Fix this by allocating the arrays dynamically
      instead of placing them on the stack.
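      
      The size argument can be sketched in user space as follows. This is
      an illustration, not the patch: struct reclaim_state and the helper
      names are hypothetical, calloc() stands in for a kernel allocator,
      and the 32 KiB figure assumes an LP64 target (x86_64 kernel stacks
      are typically 16 KiB).
      
      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <stdlib.h>
      
      #define NODES_SHIFT 10                  /* CONFIG_NODES_SHIFT from the text */
      #define MAX_NUMNODES (1 << NODES_SHIFT)
      
      /* Stand-in for the kernel's struct list_head (two pointers). */
      struct list_head { struct list_head *next, *prev; };
      
      /* Hypothetical container for the per-node arrays, now heap-allocated. */
      struct reclaim_state {
          unsigned long *nr;
          unsigned long *nr_to_reclaim;
          struct list_head *swapcache_list;
      };
      
      /* Bytes the three local arrays would occupy on the stack. */
      static size_t stack_bytes_needed(void)
      {
          return 2 * MAX_NUMNODES * sizeof(unsigned long) +
                 MAX_NUMNODES * sizeof(struct list_head);
      }
      
      static void reclaim_state_free(struct reclaim_state *st)
      {
          assert(st != NULL);
          free(st->nr);
          free(st->nr_to_reclaim);
          free(st->swapcache_list);
      }
      
      /* Allocate all three arrays; returns 0 on success, -1 on failure. */
      static int reclaim_state_alloc(struct reclaim_state *st)
      {
          assert(st != NULL);
          st->nr = calloc(MAX_NUMNODES, sizeof(*st->nr));
          st->nr_to_reclaim = calloc(MAX_NUMNODES, sizeof(*st->nr_to_reclaim));
          st->swapcache_list = calloc(MAX_NUMNODES, sizeof(*st->swapcache_list));
          if (!st->nr || !st->nr_to_reclaim || !st->swapcache_list) {
              reclaim_state_free(st);   /* free(NULL) is a no-op */
              return -1;
          }
          return 0;
      }
      ```
      
      With 8-byte longs and pointers, stack_bytes_needed() is
      2 * 1024 * 8 + 1024 * 16 = 32768 bytes, which is why the arrays
      cannot safely live on the stack.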
      
      Signed-off-by: liubo <liubo254@huawei.com>
      Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
      Reviewed-by: wangkefeng <wangkefeng.wang@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
      89ea0cd3
    • etmem:fix kasan slab-out-of-bounds in do_swapcache_reclaim · c9a24119
      liubo authored
      euleros inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5GN7K
      
      
      CVE: NA
      
      --------------------------------
      In the do_swapcache_reclaim interface,
      there is a KASAN slab-out-of-bounds problem.
      
      The cause is that when
      list_for_each_entry_safe_reverse_from traverses
      the LRU linked list, it does not consider that next may be
      equal to the head address, so the head address may be
      accessed as if it were a page address.
      
      To fix this, add a check for whether pos is the list head.
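      
      The general pattern can be sketched in user space as below. This is
      not the etmem code: struct page_stub, the minimal list helpers, and
      sum_ids_reverse_from are stand-ins written for illustration. The
      point is that a cursor into a circular list must be compared with
      the list head before it is converted into an entry pointer.
      
      ```c
      #include <assert.h>
      #include <stddef.h>
      
      /* Minimal circular doubly linked list, modelled on the kernel's. */
      struct list_head { struct list_head *next, *prev; };
      
      /* Hypothetical stand-in for struct page with an embedded list node. */
      struct page_stub {
          int id;
          struct list_head lru;
      };
      
      #define container_of(ptr, type, member) \
          ((type *)((char *)(ptr) - offsetof(type, member)))
      
      static void list_init(struct list_head *h)
      {
          h->next = h->prev = h;
      }
      
      static void list_add_tail(struct list_head *n, struct list_head *h)
      {
          n->prev = h->prev;
          n->next = h;
          h->prev->next = n;
          h->prev = n;
      }
      
      /*
       * Walk backwards from 'start' and sum the ids. The cursor is checked
       * against 'head' before container_of() is applied, because the head
       * is a bare list_head and is not embedded in any page_stub.
       */
      static int sum_ids_reverse_from(struct list_head *start,
                                      struct list_head *head)
      {
          struct list_head *cur = start;
          int sum = 0;
      
          assert(start != NULL && head != NULL);
          while (cur != head) {
              struct page_stub *p = container_of(cur, struct page_stub, lru);
      
              sum += p->id;
              cur = cur->prev;
          }
          return sum;
      }
      ```
      
      If the starting cursor is already the head, the loop body never runs,
      so the head is never dereferenced as a page.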
      
      Signed-off-by: liubo <liubo254@huawei.com>
      Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
      Reviewed-by: wangkefeng <wangkefeng.wang@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
      c9a24119
    • nbd: don't clear 'NBD_CMD_INFLIGHT' flag if request is not completed · 622ecb59
      余快 authored
      mainline inclusion
      from mainline-v5.19-rc1
      commit 2895f1831e911ca87d4efdf43e35eb72a0c7e66e
      category: bugfix
      bugzilla: 187081, https://gitee.com/src-openeuler/kernel/issues/I5H341
      
      
      CVE: NA
      
      --------------------------------
      
      Otherwise I/O will hang, because a request is only completed if the
      cmd has the 'NBD_CMD_INFLIGHT' flag set.
      
      Fixes: 07175cb1baf4 ("nbd: make sure request completion won't concurrent")
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Link: https://lore.kernel.org/r/20220521073749.3146892-4-yukuai3@huawei.com
      
      
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      
      Conflict: fake timeout is not supported yet, so call clear_bit() in
      nbd_handle_reply() directly.
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Reviewed-by: Jason Yan <yanaijie@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      622ecb59
    • blk-throttle: fix io hung due to configuration updates · f4df027e
      余快 authored
      hulk inclusion
      category: bugfix
      bugzilla: 186731, https://gitee.com/src-openeuler/kernel/issues/I5HWTF
      
      
      CVE: NA
      
      --------------------------------
      
      If a new configuration is submitted while a bio is throttled, the
      waiting time is recalculated regardless of how long the bio has
      already waited:
      
      tg_conf_updated
       throtl_start_new_slice
        tg_update_disptime
        throtl_schedule_next_dispatch
      
      An I/O hang can then be triggered by repeatedly submitting a new
      configuration before the throttled bio is dispatched.
      
      Fix the problem by respecting the time that the throttled bio has
      already waited. To do that, instead of starting a new slice in
      tg_conf_updated(), just update 'bytes_disp' and 'io_disp' based on
      the new configuration.
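      
      A toy model of the idea, in user space. The formula is an assumption
      made for illustration and is not necessarily what the patch computes:
      rescale_dispatched and remaining_wait are hypothetical helpers, time
      is in whole seconds, and limits are in bytes per second.
      
      ```c
      #include <assert.h>
      #include <stdint.h>
      
      /*
       * Hypothetical: rescale the amount already charged in the current
       * slice when the limit changes, so that the fraction of the slice
       * already consumed stays the same. Overflow is ignored in this sketch.
       */
      static uint64_t rescale_dispatched(uint64_t disp, uint64_t old_limit,
                                         uint64_t new_limit)
      {
          assert(old_limit > 0);
          return disp * new_limit / old_limit;
      }
      
      /*
       * Seconds a bio still has to wait, given what was already dispatched
       * in this slice and how long the slice has been running.
       */
      static uint64_t remaining_wait(uint64_t disp, uint64_t bio_bytes,
                                     uint64_t limit, uint64_t elapsed)
      {
          uint64_t needed;
      
          assert(limit > 0);
          needed = (disp + bio_bytes + limit - 1) / limit;   /* round up */
          return needed > elapsed ? needed - elapsed : 0;
      }
      ```
      
      For a 10240-byte bio that has waited 5 seconds under a 1024 B/s limit,
      raising the limit to 2048 B/s gives remaining_wait(0, 10240, 2048, 5),
      which is 0 when the slice is kept, but remaining_wait(0, 10240, 2048, 0),
      which is 5, when a new slice is started. Restarting the slice on every
      update therefore keeps pushing the dispatch time back.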
      
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Reviewed-by: Jason Yan <yanaijie@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      f4df027e
    • block: fix NULL pointer dereference in disk_release() · dc2c2374
      余快 authored
      hulk inclusion
      category: bugfix
      bugzilla: 186769, https://gitee.com/openeuler/kernel/issues/I5FYJY
      
      
      CVE: NA
      
      --------------------------------
      
      Our test reported a crash:
      
      run fstests generic/349 at 2022-05-20 20:55:10
      sd 3:0:0:0: Power-on or device reset occurred
      BUG: kernel NULL pointer dereference, address: 0000000000000030
      Call Trace:
       disk_release+0x42/0x170
       device_release+0x92/0x120
       kobject_put+0x183/0x350
       put_disk+0x23/0x30
       sg_device_destroy+0x77/0xd0
       sg_remove_device+0x1b8/0x220
       device_del+0x19b/0x610
       ? kfree_const+0x3e/0x50
       ? kobject_put+0x1d1/0x350
       device_unregister+0x36/0xa0
       __scsi_remove_device+0x1ba/0x240
       scsi_forget_host+0x95/0xd0
       scsi_remove_host+0xba/0x1f0
       sdebug_driver_remove+0x30/0x110 [scsi_debug]
       device_release_driver_internal+0x1ab/0x340
       device_release_driver+0x16/0x20
       bus_remove_device+0x167/0x220
       device_del+0x23e/0x610
       device_unregister+0x36/0xa0
       sdebug_do_remove_host+0x159/0x190 [scsi_debug]
       scsi_debug_exit+0x2d/0x120 [scsi_debug]
       __se_sys_delete_module+0x34c/0x420
       ? exit_to_user_mode_prepare+0x93/0x210
       __x64_sys_delete_module+0x1a/0x30
       do_syscall_64+0x4d/0x70
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      This crash has occurred since commit 2a19b28f7929 ("blk-mq: cancel blk-mq
      dispatch work in both blk_cleanup_queue and disk_release()") was
      backported from mainline.
      
      Commit 61a35cfc2633 ("block: hold a request_queue reference for the
      lifetime of struct gendisk") is not backported, so we can't ensure that
      the request_queue still exists in disk_release(), which is why
      blk_mq_cancel_work_sync() triggers the problem in disk_release().
      However, backporting it would require too many dependent patches and
      would break kABI.
      
      Since we didn't backport the related patches that tear down file system
      I/O in del_gendisk, which fix issues introduced by the refactoring
      patches that move the bdi from the request_queue to the disk, there is
      no need to call blk_mq_cancel_work_sync() from disk_release(). This
      patch just removes blk_mq_cancel_work_sync() from disk_release() to fix
      the above crash.
      
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Reviewed-by: Jason Yan <yanaijie@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      dc2c2374
    • block, bfq: make bfq_has_work() more accurate · e27ff551
      余快 authored
      mainline inclusion
      from mainline-v5.19-rc1
      commit ddc25c86b466d2359b57bc7798f167baa1735a44
      category: bugfix
      bugzilla: 186769, https://gitee.com/openeuler/kernel/issues/I5FYJY
      
      
      CVE: NA
      
      --------------------------------
      
      bfq_has_work() currently uses busy_queues, which is not accurate
      because a busy bfq_queue does not necessarily have requests. Since
      bfqd already has a counter 'queued' that records how many requests are
      in bfq, use it instead of busy_queues.
      
      Note that bfq_has_work() can be called with 'bfqd->lock' held, so the
      lock can't be taken in bfq_has_work() to protect 'bfqd->queued'.
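      
      A user-space sketch of a lock-free check of this kind. struct
      sched_data and has_work are hypothetical stand-ins, and a relaxed
      C11 atomic load is used here in place of whatever lockless read the
      kernel code performs.
      
      ```c
      #include <assert.h>
      #include <stdatomic.h>
      #include <stdbool.h>
      #include <stddef.h>
      
      /* Hypothetical stand-in for the scheduler's per-device data. */
      struct sched_data {
          int busy_queues;     /* queues marked busy; may hold no requests */
          atomic_int queued;   /* requests currently held by the scheduler */
      };
      
      /*
       * Report work based on the request counter, not on busy queues.
       * The counter is read without taking any lock, so the caller may
       * already hold the scheduler lock without risking a deadlock.
       */
      static bool has_work(struct sched_data *sd)
      {
          assert(sd != NULL);
          return atomic_load_explicit(&sd->queued, memory_order_relaxed) > 0;
      }
      ```
      
      The result is only a hint: it can be stale by the time the caller
      acts on it, which is acceptable for a "should I try to dispatch"
      check.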
      
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/r/20220513023507.2625717-3-yukuai3@huawei.com
      
      
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Reviewed-by: Jason Yan <yanaijie@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      e27ff551
    • blk-mq: fix panic during blk_mq_run_work_fn() · ea93e21d
      余快 authored
      hulk inclusion
      category: bugfix
      bugzilla: 186769, https://gitee.com/openeuler/kernel/issues/I5FYJY
      
      
      CVE: NA
      
      --------------------------------
      
      Our test reported the following crash:
      
      BUG: kernel NULL pointer dereference, address: 0000000000000018
      PGD 0 P4D 0
      Oops: 0000 [#1] SMP NOPTI
      CPU: 6 PID: 265 Comm: kworker/6:1H Kdump: loaded Tainted: G           O      5.10.0-60.17.0.h43.eulerosv2r11.x86_64 #1
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-20220320_160524-szxrtosci10000 04/01/2014
      Workqueue: kblockd blk_mq_run_work_fn
      RIP: 0010:blk_mq_delay_run_hw_queues+0xb6/0xe0
      RSP: 0018:ffffacc6803d3d88 EFLAGS: 00010246
      RAX: 0000000000000006 RBX: ffff99e2c3d25008 RCX: 00000000ffffffff
      RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff99e2c911ae18
      RBP: ffffacc6803d3dd8 R08: 0000000000000000 R09: ffff99e2c0901f6c
      R10: 0000000000000018 R11: 0000000000000018 R12: ffff99e2c911ae18
      R13: 0000000000000000 R14: 0000000000000003 R15: ffff99e2c911ae18
      FS:  0000000000000000(0000) GS:ffff99e6bbf00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000018 CR3: 000000007460a006 CR4: 00000000003706e0
      Call Trace:
       __blk_mq_do_dispatch_sched+0x2a7/0x2c0
       ? newidle_balance+0x23e/0x2f0
       __blk_mq_sched_dispatch_requests+0x13f/0x190
       blk_mq_sched_dispatch_requests+0x30/0x60
       __blk_mq_run_hw_queue+0x47/0xd0
       process_one_work+0x1b0/0x350
       worker_thread+0x49/0x300
       ? rescuer_thread+0x3a0/0x3a0
       kthread+0xfe/0x140
       ? kthread_park+0x90/0x90
       ret_from_fork+0x22/0x30
      
      After digging into the vmcore, I found that the queue had been cleaned
      up (blk_cleanup_queue() is done) and the tag set had been
      freed (blk_mq_free_tag_set() is done).
      
      There are two problems here:
      
      1) blk_mq_delay_run_hw_queues() will only be called from
      __blk_mq_do_dispatch_sched() if e->type->ops.has_work() returns true.
      This seems impossible because blk_cleanup_queue() is done, and there
      should be no I/O. However, bfq_has_work() can return true even if no
      I/O is queued. This is because bfq_has_work() uses busy queues, and a
      bfq_queue can stay busy after dispatching all of its requests.
      
      2) 'hctx->run_work' still exists after blk_cleanup_queue().
      blk_mq_cancel_work_sync() is called from blk_cleanup_queue() to cancel
      all the 'run_work'. However, there is no guarantee that new 'run_work'
      won't be queued after that (and before blk_mq_exit_queue() is done).
      
      The first problem is not the root cause; this patch just fixes the second
      problem by grabbing 'q_usage_counter' before queuing 'hctx->run_work'.
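      
      The "take a reference before queuing work" pattern can be sketched in
      user space as below. Everything here is an assumption made for
      illustration: struct queue_stub, the helper names, the plain atomic
      counter standing in for 'q_usage_counter', and the choice to hold the
      reference until the work has finished.
      
      ```c
      #include <assert.h>
      #include <stdatomic.h>
      #include <stdbool.h>
      
      /* Hypothetical stand-in for a request queue. */
      struct queue_stub {
          atomic_int usage;    /* 0 means the queue has been torn down */
          int pending_work;    /* stand-in for queued run_work items */
      };
      
      /* Take a reference only if the queue is still alive. */
      static bool queue_tryget(struct queue_stub *q)
      {
          int old = atomic_load(&q->usage);
      
          while (old > 0) {
              /* On failure 'old' is reloaded and the loop re-checks it. */
              if (atomic_compare_exchange_weak(&q->usage, &old, old + 1))
                  return true;
          }
          return false;
      }
      
      static void queue_put(struct queue_stub *q)
      {
          atomic_fetch_sub(&q->usage, 1);
      }
      
      /* Queue work only while holding a reference on the queue. */
      static bool schedule_run_work(struct queue_stub *q)
      {
          if (!queue_tryget(q))
              return false;    /* queue is gone: do not queue new work */
          q->pending_work++;
          return true;
      }
      
      /* Called when the work item has finished running. */
      static void run_work_done(struct queue_stub *q)
      {
          assert(q->pending_work > 0);
          q->pending_work--;
          queue_put(q);
      }
      ```
      
      Once teardown has dropped the last reference, schedule_run_work()
      fails instead of queuing work against freed state.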
      
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Reviewed-by: Jason Yan <yanaijie@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      ea93e21d
    • blk-mq: cancel blk-mq dispatch work in both blk_cleanup_queue and disk_release() · a27f3109
      Ming Lei authored
      mainline inclusion
      from mainline-v5.16-rc2
      commit 2a19b28f7929866e1cec92a3619f4de9f2d20005
      category: bugfix
      bugzilla: 186769, https://gitee.com/openeuler/kernel/issues/I5FYJY
      
      
      CVE: NA
      
      --------------------------------
      
      To avoid slowing down queue destruction, we don't call
      blk_mq_quiesce_queue() in blk_cleanup_queue(); instead, canceling
      dispatch work is delayed to blk_release_queue().
      
      However, this approach has caused a kernel oops[1], reported by Changhui.
      The log shows that scsi_device can be freed before running
      blk_release_queue(), which is also expected since scsi_device is released
      after the scsi disk is closed and the scsi_device is removed.
      
      Fix the issue by canceling blk-mq dispatch work in both blk_cleanup_queue()
      and disk_release():
      
      1) when disk_release() is run, the disk has been closed, and any sync
      dispatch activities have been done, so canceling dispatch work is enough to
      quiesce filesystem I/O dispatch activity.
      
      2) in blk_cleanup_queue(), we only focus on passthrough request, and
      passthrough request is always explicitly allocated & freed by
      its caller, so once queue is frozen, all sync dispatch activity
      for passthrough request has been done, then it is enough to just cancel
      dispatch work for avoiding any dispatch activity.
      
      [1] kernel panic log
      [12622.769416] BUG: kernel NULL pointer dereference, address: 0000000000000300
      [12622.777186] #PF: supervisor read access in kernel mode
      [12622.782918] #PF: error_code(0x0000) - not-present page
      [12622.788649] PGD 0 P4D 0
      [12622.791474] Oops: 0000 [#1] PREEMPT SMP PTI
      [12622.796138] CPU: 10 PID: 744 Comm: kworker/10:1H Kdump: loaded Not tainted 5.15.0+ #1
      [12622.804877] Hardware name: Dell Inc. PowerEdge R730/0H21J3, BIOS 1.5.4 10/002/2015
      [12622.813321] Workqueue: kblockd blk_mq_run_work_fn
      [12622.818572] RIP: 0010:sbitmap_get+0x75/0x190
      [12622.823336] Code: 85 80 00 00 00 41 8b 57 08 85 d2 0f 84 b1 00 00 00 45 31 e4 48 63 cd 48 8d 1c 49 48 c1 e3 06 49 03 5f 10 4c 8d 6b 40 83 f0 01 <48> 8b 33 44 89 f2 4c 89 ef 0f b6 c8 e8 fa f3 ff ff 83 f8 ff 75 58
      [12622.844290] RSP: 0018:ffffb00a446dbd40 EFLAGS: 00010202
      [12622.850120] RAX: 0000000000000001 RBX: 0000000000000300 RCX: 0000000000000004
      [12622.858082] RDX: 0000000000000006 RSI: 0000000000000082 RDI: ffffa0b7a2dfe030
      [12622.866042] RBP: 0000000000000004 R08: 0000000000000001 R09: ffffa0b742721334
      [12622.874003] R10: 0000000000000008 R11: 0000000000000008 R12: 0000000000000000
      [12622.881964] R13: 0000000000000340 R14: 0000000000000000 R15: ffffa0b7a2dfe030
      [12622.889926] FS:  0000000000000000(0000) GS:ffffa0baafb40000(0000) knlGS:0000000000000000
      [12622.898956] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [12622.905367] CR2: 0000000000000300 CR3: 0000000641210001 CR4: 00000000001706e0
      [12622.913328] Call Trace:
      [12622.916055]  <TASK>
      [12622.918394]  scsi_mq_get_budget+0x1a/0x110
      [12622.922969]  __blk_mq_do_dispatch_sched+0x1d4/0x320
      [12622.928404]  ? pick_next_task_fair+0x39/0x390
      [12622.933268]  __blk_mq_sched_dispatch_requests+0xf4/0x140
      [12622.939194]  blk_mq_sched_dispatch_requests+0x30/0x60
      [12622.944829]  __blk_mq_run_hw_queue+0x30/0xa0
      [12622.949593]  process_one_work+0x1e8/0x3c0
      [12622.954059]  worker_thread+0x50/0x3b0
      [12622.958144]  ? rescuer_thread+0x370/0x370
      [12622.962616]  kthread+0x158/0x180
      [12622.966218]  ? set_kthread_struct+0x40/0x40
      [12622.970884]  ret_from_fork+0x22/0x30
      [12622.974875]  </TASK>
      [12622.977309] Modules linked in: scsi_debug rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs sunrpc dm_multipath intel_rapl_msr intel_rapl_common dell_wmi_descriptor sb_edac rfkill video x86_pkg_temp_thermal intel_powerclamp dcdbas coretemp kvm_intel kvm mgag200 irqbypass i2c_algo_bit rapl drm_kms_helper ipmi_ssif intel_cstate intel_uncore syscopyarea sysfillrect sysimgblt fb_sys_fops pcspkr cec mei_me lpc_ich mei ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter drm fuse xfs libcrc32c sr_mod cdrom sd_mod t10_pi sg ixgbe ahci libahci crct10dif_pclmul crc32_pclmul crc32c_intel libata megaraid_sas ghash_clmulni_intel tg3 wdat_wdt mdio dca wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_debug]
      
      Reported-by: ChanghuiZhong <czhong@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Cc: Bart Van Assche <bvanassche@acm.org>
      Cc: linux-scsi@vger.kernel.org
      Signed-off-by: Ming Lei <ming.lei@redhat.com>
      Link: https://lore.kernel.org/r/20211116014343.610501-1-ming.lei@redhat.com
      
      
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      
      Conflicts:
      ./block/blk-mq.c
      ./block/blk-mq.h
      ./block/blk-sysfs.c
      ./block/genhd.c
      ./block/blk-core.c
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Reviewed-by: Jason Yan <yanaijie@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      a27f3109
    • blk-mq: move cancel of hctx->run_work to the front of blk_exit_queue · 1cf7ad3c
      Yang Yang authored
      mainline inclusion
      from mainline-v5.10-rc1
      commit 47ce030b
      category: bugfix
      bugzilla: 186769, https://gitee.com/openeuler/kernel/issues/I5FYJY
      
      
      CVE: NA
      
      --------------------------------
      
      blk_exit_queue will free elevator_data, while blk_mq_run_work_fn
      will access it. Move cancel of hctx->run_work to the front of
      blk_exit_queue to avoid use-after-free.
      
      Fixes: 1b97871b ("blk-mq: move cancel of hctx->run_work into blk_mq_hw_sysfs_release")
      Signed-off-by: Yang Yang <yang.yang@vivo.com>
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Reviewed-by: Jason Yan <yanaijie@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      1cf7ad3c
    • ext4: fix race condition between ext4_ioctl_setflags and ext4_fiemap · b8493652
      Baokun Li authored
      hulk inclusion
      category: bugfix
      bugzilla: 187222, https://gitee.com/openeuler/kernel/issues/I5H3KE
      
      
      CVE: NA
      
      --------------------------------
      
      Hulk Robot reported a BUG:
      
      ==================================================================
      kernel BUG at fs/ext4/extents_status.c:762!
      invalid opcode: 0000 [#1] SMP KASAN PTI
      [...]
      Call Trace:
       ext4_cache_extents+0x238/0x2f0
       ext4_find_extent+0x785/0xa40
       ext4_fiemap+0x36d/0xe90
       do_vfs_ioctl+0x6af/0x1200
      [...]
      ==================================================================
      
      The above issue may happen as follows:
      -------------------------------------
                 cpu1		    cpu2
      _____________________|_____________________
      do_vfs_ioctl
       ext4_ioctl
        ext4_ioctl_setflags
         ext4_ind_migrate
                              do_vfs_ioctl
                               ioctl_fiemap
                                ext4_fiemap
                                 ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)
                                 ext4_fill_fiemap_extents
          down_write(&EXT4_I(inode)->i_data_sem);
          ext4_ext_check_inode
          ext4_clear_inode_flag(inode, EXT4_INODE_EXTENTS)
          memset(ei->i_data, 0, sizeof(ei->i_data))
          up_write(&EXT4_I(inode)->i_data_sem);
                                  down_read(&EXT4_I(inode)->i_data_sem);
                                  ext4_find_extent
                                   ext4_cache_extents
                                    ext4_es_cache_extent
                                     BUG_ON(end < lblk)
      
      We can easily reproduce this problem with the syzkaller testcase:
      ```
      02:37:07 executing program 3:
      r0 = openat(0xffffffffffffff9c, &(0x7f0000000040)='./file0\x00', 0x26e1, 0x0)
      ioctl$FS_IOC_FSSETXATTR(r0, 0x40086602, &(0x7f0000000080)={0x17e})
      mkdirat(0xffffffffffffff9c, &(0x7f00000000c0)='./file1\x00', 0x1ff)
      r1 = openat(0xffffffffffffff9c, &(0x7f0000000100)='./file1\x00', 0x0, 0x0)
      ioctl$FS_IOC_FIEMAP(r1, 0xc020660b, &(0x7f0000000180)={0x0, 0x1, 0x0, 0xef3, 0x6, []}) (async, rerun: 32)
      ioctl$FS_IOC_FSSETXATTR(r1, 0x40086602, &(0x7f0000000140)={0x17e}) (rerun: 32)
      ```
      
      To solve this issue, we use __generic_block_fiemap() instead of
      generic_block_fiemap() and add inode_lock_shared to avoid the race
      condition.
      
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      b8493652
  7. Jul 14, 2022
    • block: fix that part scan is disabled in device_add_disk() · faf2662e
      余快 authored
      hulk inclusion
      category: bugfix
      bugzilla: 187190, https://gitee.com/src-openeuler/kernel/issues/I5GWOV
      
      
      CVE: NA
      
      --------------------------------
      
      Commit f20a726b ("block: Fix warning in bd_link_disk_holder()")
      moved the setting of the 'GENHD_FL_UP' flag after blkdev_get, which
      disables part scan:
      
      device_add_disk
       register_disk
        blkdev_get
         __blkdev_get
          bdev_get_gendisk
           get_gendisk -> failed because 'GENHD_FL_UP' is not set
      
      And this will cause tests block/017, block/018 and scsi/004 to fail.
      
      Fix the problem by moving part scan as well.
      
      Fixes: f20a726b ("block: Fix warning in bd_link_disk_holder()")
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Reviewed-by: Jason Yan <yanaijie@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • Revert "block: rename bd_invalidated" · 13879644
      余快 authored
      hulk inclusion
      category: bugfix
      bugzilla: 187190, https://gitee.com/src-openeuler/kernel/issues/I5GWOV
      
      
      CVE: NA
      
      --------------------------------
      
      This reverts commit b6113052.
      
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Reviewed-by: Jason Yan <yanaijie@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      13879644
    • 余快's avatar
      Revert "block: move the NEED_PART_SCAN flag to struct gendisk" · acc38712
      余快 authored
      hulk inclusion
      category: bugfix
      bugzilla: 187190, https://gitee.com/src-openeuler/kernel/issues/I5GWOV
      
      
      CVE: NA
      
      --------------------------------
      
      This reverts commit b2f0e44f.
      
      The reverted commit introduces the following problem in the LTP zram tests:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000600
      PGD 0 P4D 0
      Oops: 0002 [#1] SMP PTI
      CPU: 28 PID: 172121 Comm: sh Kdump: loaded Tainted: G           OE    --------- -  - 4.18.0+ #2
      Hardware name: Huawei RH2288H V3/BC11HGSA0, BIOS 5.15 05/21/2019
      RIP: 0010:flush_disk+0x1d/0x50
      RSP: 0018:ffffaf14a516fe20 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff899e26bac380 RCX: 0000000000000000
      RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff899e26bac380
      RBP: ffff899e26bac380 R08: 00000000000006a9 R09: 0000000000000004
      R10: ffff89cd878ff440 R11: 0000000000000001 R12: 0000000000000000
      R13: ffff899e26bac398 R14: ffffaf14a516ff00 R15: ffff89cd8709c3e0
      FS:  00007f78d6840740(0000) GS:ffff89fcbf480000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000600 CR3: 000000308afc0002 CR4: 00000000001606e0
      Call Trace:
       revalidate_disk+0x57/0x80
       reset_store+0xaf/0x120 [zram]
       kernfs_fop_write+0x10f/0x190
       vfs_write+0xad/0x1a0
       ksys_write+0x52/0xc0
       do_syscall_64+0x5d/0x1d0
       entry_SYSCALL_64_after_hwframe+0x65/0xca
      
      This is because "bdev->bd_disk" is not guaranteed to exist, so simply
      converting "set_bit(BDEV_NEED_PART_SCAN, &bdev->bd_flags)" to
      "set_bit(GD_NEED_PART_SCAN, &bdev->bd_disk->state)" is wrong.
      
      The reverted commit was originally backported because commit
      2a57456c8973 ("block: Fix warning in bd_link_disk_holder()") has a
      regression that disables part scan in device_add_disk(). That
      regression will be fixed in a later patch.
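      
      The NULL dereference explained above can be sketched in userspace as
      follows. This is only an illustration under stated assumptions: the
      `sketch_*` types and the `SKETCH_NEED_PART_SCAN` bit are hypothetical,
      a plain OR stands in for set_bit(), and the explicit NULL check exists
      only to show where the converted form would fault. The commit itself
      reverts the conversion rather than adding such a check.
      
      ```c
      #include <assert.h>
      #include <stddef.h>
      
      /* Hypothetical stand-in for the NEED_PART_SCAN bit. */
      #define SKETCH_NEED_PART_SCAN (1UL << 0)
      
      struct sketch_disk {
          unsigned long state;
      };
      
      struct sketch_bdev {
          unsigned long flags;
          struct sketch_disk *disk;   /* may be NULL, like bdev->bd_disk */
      };
      
      /* Original form: the flag lives in the bdev itself, so it is always valid. */
      static void sketch_mark_on_bdev(struct sketch_bdev *b)
      {
          assert(b != NULL);
          b->flags |= SKETCH_NEED_PART_SCAN;
      }
      
      /*
       * Converted form: the flag lives behind b->disk. Returns -1 in the case
       * where an unconditional b->disk->state write would dereference NULL.
       */
      static int sketch_mark_on_disk(struct sketch_bdev *b)
      {
          assert(b != NULL);
          if (b->disk == NULL)
              return -1;
          b->disk->state |= SKETCH_NEED_PART_SCAN;
          return 0;
      }
      ```
      
      With a bdev whose disk pointer is NULL, sketch_mark_on_bdev() still
      succeeds, whereas sketch_mark_on_disk() reports the case that crashed
      in flush_disk() in the oops above.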
      
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Reviewed-by: Jason Yan <yanaijie@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      acc38712
    • 余快's avatar
      Revert "block:Fix kabi broken" · 3e2bfb3a
      余快 authored
      hulk inclusion
      category: bugfix
      bugzilla: 187190, https://gitee.com/src-openeuler/kernel/issues/I5GWOV
      
      
      CVE: NA
      
      --------------------------------
      
      This reverts commit 64ba823f.
      
      The patches that broke kabi will be reverted together.
      
      Signed-off-by: Yu Kuai <yukuai3@huawei.com>
      Reviewed-by: Jason Yan <yanaijie@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      3e2bfb3a
    • Zheng Yejian's avatar
      rcu/tree: Mark functions as notrace · 9b5e728e
      Zheng Yejian authored
      hulk inclusion
      category: bugfix
      bugzilla: 187209, https://gitee.com/openeuler/kernel/issues/I5GWFT
      CVE: NA
      
      --------------------------------
      
      Syzkaller reported a soft lockup; see the following log:
        [   41.463870] watchdog: BUG: soft lockup - CPU#0 stuck for 22s!  [ksoftirqd/0:9]
        [   41.509763] Modules linked in:
        [   41.512295] CPU: 0 PID: 9 Comm: ksoftirqd/0 Not tainted 4.19.90 #13
        [   41.516134] Hardware name: linux,dummy-virt (DT)
        [   41.519182] pstate: 80c00005 (Nzcv daif +PAN +UAO)
        [   41.522415] pc : perf_trace_buf_alloc+0x138/0x238
        [   41.525583] lr : perf_trace_buf_alloc+0x138/0x238
        [   41.528656] sp : ffff8000c137e880
        [   41.531050] x29: ffff8000c137e880 x28: ffff20000850ced0
        [   41.534759] x27: 0000000000000000 x26: ffff8000c137e9c0
        [   41.538456] x25: ffff8000ce5c2ae0 x24: ffff200008358b08
        [   41.542151] x23: 0000000000000000 x22: ffff2000084a50ac
        [   41.545834] x21: ffff8000c137e880 x20: 000000000000001c
        [   41.549516] x19: ffff7dffbfdf88e8 x18: 0000000000000000
        [   41.553202] x17: 0000000000000000 x16: 0000000000000000
        [   41.556892] x15: 1ffff00036e07805 x14: 0000000000000000
        [   41.560592] x13: 0000000000000004 x12: 0000000000000000
        [   41.564315] x11: 1fffefbff7fbf120 x10: ffff0fbff7fbf120
        [   41.568003] x9 : dfff200000000000 x8 : ffff7dffbfdf8904
        [   41.571699] x7 : 0000000000000000 x6 : ffff0fbff7fbf121
        [   41.575398] x5 : ffff0fbff7fbf121 x4 : ffff0fbff7fbf121
        [   41.579086] x3 : ffff20000850cdc8 x2 : 0000000000000008
        [   41.582773] x1 : ffff8000c1376000 x0 : 0000000000000100
        [   41.586495] Call trace:
        [   41.588922]  perf_trace_buf_alloc+0x138/0x238
        [   41.591912]  perf_ftrace_function_call+0x1ac/0x248
        [   41.595123]  ftrace_ops_no_ops+0x3a4/0x488
        [   41.597998]  ftrace_graph_call+0x0/0xc
        [   41.600715]  rcu_dynticks_curr_cpu_in_eqs+0x14/0x70
        [   41.603962]  rcu_is_watching+0xc/0x20
        [   41.606635]  ftrace_ops_no_ops+0x240/0x488
        [   41.609530]  ftrace_graph_call+0x0/0xc
        [   41.612249]  __read_once_size_nocheck.constprop.0+0x1c/0x38
        [   41.615905]  unwind_frame+0x140/0x358
        [   41.618597]  walk_stackframe+0x34/0x60
        [   41.621359]  __save_stack_trace+0x204/0x3b8
        [   41.624328]  save_stack_trace+0x2c/0x38
        [   41.627112]  __kasan_slab_free+0x120/0x228
        [   41.630018]  kasan_slab_free+0x10/0x18
        [   41.632752]  kfree+0x84/0x250
        [   41.635107]  skb_free_head+0x70/0xb0
        [   41.637772]  skb_release_data+0x3f8/0x730
        [   41.640626]  skb_release_all+0x50/0x68
        [   41.643350]  kfree_skb+0x84/0x278
        [   41.645890]  kfree_skb_list+0x4c/0x78
        [   41.648595]  __dev_queue_xmit+0x1a4c/0x23a0
        [   41.651541]  dev_queue_xmit+0x28/0x38
        [   41.654254]  ip6_finish_output2+0xeb0/0x1630
        [   41.657261]  ip6_finish_output+0x2d8/0x7f8
        [   41.660174]  ip6_output+0x19c/0x348
        [   41.663850]  mld_sendpack+0x560/0x9e0
        [   41.666564]  mld_ifc_timer_expire+0x484/0x8a8
        [   41.669624]  call_timer_fn+0x68/0x4b0
        [   41.672355]  expire_timers+0x168/0x498
        [   41.675126]  run_timer_softirq+0x230/0x7a8
        [   41.678052]  __do_softirq+0x2d0/0xba0
        [   41.680763]  run_ksoftirqd+0x110/0x1a0
        [   41.683512]  smpboot_thread_fn+0x31c/0x620
        [   41.686429]  kthread+0x2c8/0x348
        [   41.688927]  ret_from_fork+0x10/0x18
      
      Looking at the call stack above, we find a recursive call in
      'ftrace_graph_call'; see this snippet:
          __read_once_size_nocheck.constprop.0
            ftrace_graph_call
              ......
                rcu_dynticks_curr_cpu_in_eqs
                  ftrace_graph_call
      
      Our analysis is that 'rcu_dynticks_curr_cpu_in_eqs' should not be
      traceable, and we verified that marking the related functions as
      'notrace' avoids the problem.
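      
      The re-entry seen in the call stack can be modelled with a toy
      userspace sketch. It is not ftrace: the `sketch_*` functions are
      hypothetical, the hook stands in for ftrace_ops_no_ops calling
      rcu_is_watching(), the `traced` argument stands in for whether the
      helper is instrumented, and the depth cap exists only so the sketch
      terminates.
      
      ```c
      #include <assert.h>
      #include <stdbool.h>
      
      /* Hypothetical cap so the traced case stops instead of nesting forever. */
      #define SKETCH_MAX_DEPTH 8
      
      static int sketch_hook_calls;
      
      static bool sketch_helper(bool traced, int depth);
      
      /* Models the trace hook: it consults the helper before recording. */
      static void sketch_trace_hook(bool helper_traced, int depth)
      {
          sketch_hook_calls++;
          if (depth >= SKETCH_MAX_DEPTH)
              return;
          (void)sketch_helper(helper_traced, depth + 1);
      }
      
      /*
       * Models the RCU helper. If it is traced, entering it fires the hook
       * again, which calls the helper again, and so on.
       */
      static bool sketch_helper(bool traced, int depth)
      {
          assert(depth >= 0);
          if (traced)
              sketch_trace_hook(traced, depth);
          return true;
      }
      
      /* Counts hook invocations caused by one traced function entry. */
      static int sketch_count_hook_calls(bool helper_traced)
      {
          sketch_hook_calls = 0;
          sketch_trace_hook(helper_traced, 0);
          return sketch_hook_calls;
      }
      ```
      
      In this model, an untraced helper yields exactly one hook invocation
      per entry, while a traced helper nests until the cap is hit, which
      mirrors why excluding the helper from tracing breaks the cycle.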
      
      In the mainline kernel, commit ff5c4f5c ("rcu/tree: Mark the idle
      relevant functions noinstr") marks the related functions as 'noinstr',
      which implies notrace and noinline and places them in the
      .noinstr.text section.
      Link: https://lore.kernel.org/all/20200416114706.625340212@infradead.org/
      
      
      
      The 'noinstr' mechanism has not been introduced in this tree, so we do
      not backport that commit directly (doing so would pull in more changes).
      Instead, we mark as 'notrace' the functions that commit marks as
      'noinstr'.
      
      Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
      Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      9b5e728e
  8. Jul 13, 2022