  1. Jun 02, 2022
  2. Jun 01, 2022
  3. May 31, 2022
  4. May 28, 2022
  5. May 26, 2022
    • ext4: fix bug_on in ext4_writepages · fb4e2c7c
      Ye Bin authored
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I58A6T
      
      
      CVE: NA
      
      ---------------------------
      
      We got the following issue:
      EXT4-fs error (device loop0): ext4_mb_generate_buddy:1141: group 0, block bitmap and bg descriptor inconsistent: 25 vs 31513 free cls
      ------------[ cut here ]------------
      kernel BUG at fs/ext4/inode.c:2708!
      invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
      CPU: 2 PID: 2147 Comm: rep Not tainted 5.18.0-rc2-next-20220413+ #155
      RIP: 0010:ext4_writepages+0x1977/0x1c10
      RSP: 0018:ffff88811d3e7880 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: 0000000000000001 RCX: ffff88811c098000
      RDX: 0000000000000000 RSI: ffff88811c098000 RDI: 0000000000000002
      RBP: ffff888128140f50 R08: ffffffffb1ff6387 R09: 0000000000000000
      R10: 0000000000000007 R11: ffffed10250281ea R12: 0000000000000001
      R13: 00000000000000a4 R14: ffff88811d3e7bb8 R15: ffff888128141028
      FS:  00007f443aed9740(0000) GS:ffff8883aef00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020007200 CR3: 000000011c2a4000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       do_writepages+0x130/0x3a0
       filemap_fdatawrite_wbc+0x83/0xa0
       filemap_flush+0xab/0xe0
       ext4_alloc_da_blocks+0x51/0x120
       __ext4_ioctl+0x1534/0x3210
       __x64_sys_ioctl+0x12c/0x170
       do_syscall_64+0x3b/0x90
      
      It may happen as follows:
      1. write inline_data inode
      vfs_write
        new_sync_write
          ext4_file_write_iter
            ext4_buffered_write_iter
              generic_perform_write
                ext4_da_write_begin
                  ext4_da_write_inline_data_begin -> if the inline data area is
                  too small, a block is allocated for the write, and the
                  mapping then has a dirty page
                      ext4_da_convert_inline_data_to_extent ->clear EXT4_STATE_MAY_INLINE_DATA
      2. fallocate
      do_vfs_ioctl
        ioctl_preallocate
          vfs_fallocate
            ext4_fallocate
              ext4_convert_inline_data
                ext4_convert_inline_data_nolock
                  ext4_map_blocks -> fail will goto restore data
                  ext4_restore_inline_data
                    ext4_create_inline_data
                    ext4_write_inline_data
                    ext4_set_inode_state -> set inode EXT4_STATE_MAY_INLINE_DATA
      3. writepages
      __ext4_ioctl
        ext4_alloc_da_blocks
          filemap_flush
            filemap_fdatawrite_wbc
              do_writepages
                ext4_writepages
                  if (ext4_has_inline_data(inode))
                    BUG_ON(ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA))
      
      The root cause is that, in delayed allocation mode, inline data is not
      destroyed until ext4_writepages is called, but by then the inode may
      already have been converted from inline data to extents. To solve this
      issue, call filemap_flush first.
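
The three steps above can be replayed in a small user-space model. This is only a sketch under stated assumptions: `struct model_inode` and every `model_*` helper are hypothetical stand-ins, not kernel APIs, and the point at which the flush happens (`flush_first`) is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_inode {
    bool has_inline_data;   /* stands in for ext4_has_inline_data() */
    bool may_inline_data;   /* stands in for EXT4_STATE_MAY_INLINE_DATA */
    bool dirty_pages;       /* delayed-allocation pages awaiting writeback */
};

/* Step 1: a delayed-allocation write that outgrows the inline area. */
void model_da_write(struct model_inode *inode)
{
    assert(inode != NULL);
    inode->dirty_pages = true;
    inode->may_inline_data = false;
}

/* Step 3: writeback. Returns false where the kernel would hit the BUG_ON. */
bool model_writepages(struct model_inode *inode)
{
    assert(inode != NULL);
    if (inode->has_inline_data && inode->may_inline_data)
        return false;
    inode->dirty_pages = false;
    inode->has_inline_data = false;  /* inline data is destroyed here */
    return true;
}

/* Step 2: a conversion that fails and restores the inline state. */
void model_failed_convert(struct model_inode *inode, bool flush_first)
{
    assert(inode != NULL);
    if (flush_first && inode->dirty_pages)
        model_writepages(inode);
    if (!inode->has_inline_data)
        return;                      /* nothing left to convert or restore */
    inode->may_inline_data = true;   /* models the restore path */
}
```

With `flush_first` set, the dirty page is written back before the failed conversion can set the flag again, so the later writeback no longer sees the contradictory state.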
      
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Li Nan <linan122@huawei.com>
      Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • ext4: fix warning in ext4_handle_inode_extension · c188888a
      Ye Bin authored
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I58A7W
      
      
      CVE: NA
      
      ---------------------------
      
      We got the following issue:
      EXT4-fs error (device loop0) in ext4_reserve_inode_write:5741: Out of memory
      EXT4-fs error (device loop0): ext4_setattr:5462: inode #13: comm syz-executor.0: mark_inode_dirty error
      EXT4-fs error (device loop0) in ext4_setattr:5519: Out of memory
      EXT4-fs error (device loop0): ext4_ind_map_blocks:595: inode #13: comm syz-executor.0: Can't allocate blocks for non-extent mapped inodes with bigalloc
      ------------[ cut here ]------------
      WARNING: CPU: 1 PID: 4361 at fs/ext4/file.c:301 ext4_file_write_iter+0x11c9/0x1220
      Modules linked in:
      CPU: 1 PID: 4361 Comm: syz-executor.0 Not tainted 5.10.0+ #1
      RIP: 0010:ext4_file_write_iter+0x11c9/0x1220
      RSP: 0018:ffff924d80b27c00 EFLAGS: 00010282
      RAX: ffffffff815a3379 RBX: 0000000000000000 RCX: 000000003b000000
      RDX: ffff924d81601000 RSI: 00000000000009cc RDI: 00000000000009cd
      RBP: 000000000000000d R08: ffffffffbc5a2c6b R09: 0000902e0e52a96f
      R10: ffff902e2b7c1b40 R11: ffff902e2b7c1b40 R12: 000000000000000a
      R13: 0000000000000001 R14: ffff902e0e52aa10 R15: ffffffffffffff8b
      FS:  00007f81a7f65700(0000) GS:ffff902e3bc80000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffffffff600400 CR3: 000000012db88001 CR4: 00000000003706e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       do_iter_readv_writev+0x2e5/0x360
       do_iter_write+0x112/0x4c0
       do_pwritev+0x1e5/0x390
       __x64_sys_pwritev2+0x7e/0xa0
       do_syscall_64+0x37/0x50
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Above issue may happen as follows:
      Assume
      inode.i_size=4096
      EXT4_I(inode)->i_disksize=4096
      
      step 1: set inode->i_size = 8192
      ext4_setattr
        if (attr->ia_size != inode->i_size)
          EXT4_I(inode)->i_disksize = attr->ia_size;
          rc = ext4_mark_inode_dirty
             ext4_reserve_inode_write
                ext4_get_inode_loc
                  __ext4_get_inode_loc
                    sb_getblk --> return -ENOMEM
         ...
         if (!error)  ->will not update i_size
           i_size_write(inode, attr->ia_size);
      Now:
      inode.i_size=4096
      EXT4_I(inode)->i_disksize=8192
      
      step 2: Direct write 4096 bytes
      ext4_file_write_iter
       ext4_dio_write_iter
         iomap_dio_rw ->return error
       if (extend)
         ext4_handle_inode_extension
           WARN_ON_ONCE(i_size_read(inode) < EXT4_I(inode)->i_disksize);
      ->Then trigger warning.
      
      To solve the above issue, if marking the inode dirty fails in
      ext4_setattr, restore 'EXT4_I(inode)->i_disksize' to its old value.
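
The save-and-restore idea can be sketched in user space. The struct and function names below are hypothetical, and `mark_dirty_rc` is an assumed parameter that stands in for the result of marking the inode dirty. It is a model of the invariant, not the actual patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of the two sizes tracked for an inode. */
struct model_inode {
    long long i_size;      /* in-memory size */
    long long i_disksize;  /* size recorded for the on-disk inode */
};

/*
 * Model of a size change. mark_dirty_rc is the (assumed) result of marking
 * the inode dirty: 0 on success or a negative errno such as -ENOMEM.
 */
int model_setattr_size(struct model_inode *inode, long long new_size,
                       int mark_dirty_rc)
{
    long long old_disksize;

    assert(inode != NULL);
    old_disksize = inode->i_disksize;

    inode->i_disksize = new_size;
    if (mark_dirty_rc != 0) {
        /* Roll back so that i_size >= i_disksize still holds. */
        inode->i_disksize = old_disksize;
        return mark_dirty_rc;
    }
    inode->i_size = new_size;
    return 0;
}
```

After a failed call both sizes stay at 4096, so the check `i_size < i_disksize` from step 2 cannot fire.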
      
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Signed-off-by: Li Nan <linan122@huawei.com>
      Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • ext4: fix use-after-free in ext4_rename_dir_prepare · e5e75e67
      Ye Bin authored
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I585D4
      
      
      CVE: NA
      
      ---------------------------
      
      We got issue as follows:
      EXT4-fs (loop0): mounted filesystem without journal. Opts: ,errors=continue
      ext4_get_first_dir_block: bh->b_data=0xffff88810bee6000 len=34478
      ext4_get_first_dir_block: *parent_de=0xffff88810beee6ae bh->b_data=0xffff88810bee6000
      ext4_rename_dir_prepare: [1] parent_de=0xffff88810beee6ae
      
      ==================================================================
      BUG: KASAN: use-after-free in ext4_rename_dir_prepare+0x152/0x220
      Read of size 4 at addr ffff88810beee6ae by task rep/1895
      
      CPU: 13 PID: 1895 Comm: rep Not tainted 5.10.0+ #241
      Call Trace:
       dump_stack+0xbe/0xf9
       print_address_description.constprop.0+0x1e/0x220
       kasan_report.cold+0x37/0x7f
       ext4_rename_dir_prepare+0x152/0x220
       ext4_rename+0xf44/0x1ad0
       ext4_rename2+0x11c/0x170
       vfs_rename+0xa84/0x1440
       do_renameat2+0x683/0x8f0
       __x64_sys_renameat+0x53/0x60
       do_syscall_64+0x33/0x40
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x7f45a6fc41c9
      RSP: 002b:00007ffc5a470218 EFLAGS: 00000246 ORIG_RAX: 0000000000000108
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f45a6fc41c9
      RDX: 0000000000000005 RSI: 0000000020000180 RDI: 0000000000000005
      RBP: 00007ffc5a470240 R08: 00007ffc5a470160 R09: 0000000020000080
      R10: 00000000200001c0 R11: 0000000000000246 R12: 0000000000400bb0
      R13: 00007ffc5a470320 R14: 0000000000000000 R15: 0000000000000000
      
      The buggy address belongs to the page:
      page:00000000440015ce refcount:0 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x10beee
      flags: 0x200000000000000()
      raw: 0200000000000000 ffffea00043ff4c8 ffffea0004325608 0000000000000000
      raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff88810beee580: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ffff88810beee600: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
      >ffff88810beee680: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                                        ^
       ffff88810beee700: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ffff88810beee780: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
      ==================================================================
      Disabling lock debugging due to kernel taint
      ext4_rename_dir_prepare: [2] parent_de->inode=3537895424
      ext4_rename_dir_prepare: [3] dir=0xffff888124170140
      ext4_rename_dir_prepare: [4] ino=2
      ext4_rename_dir_prepare: ent->dir->i_ino=2 parent=-757071872
      
      The reason is that the first directory entry has a 'rec_len' of 34478,
      which yields an illegal parent entry. Currently, the directory entry is
      not checked after the directory block is read in
      'ext4_get_first_dir_block'.
      To solve this issue, check the directory entry in 'ext4_get_first_dir_block'.
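
A bounds check of this kind can be sketched in user space as follows. The function name, the 12-byte minimum, the 4-byte alignment rule and the 4096-byte block size used in the usage note are assumptions for illustration. The kernel's own directory entry check covers more cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sanity check for one directory entry.
 * offset:     byte offset of the entry inside its block
 * rec_len:    record length read from the entry
 * block_size: size of the directory block
 */
bool model_dirent_ok(size_t offset, uint16_t rec_len, size_t block_size)
{
    /* Assumed minimum: 8-byte header plus a one-character padded name. */
    const size_t min_len = 12;

    assert(block_size > 0);
    if (rec_len < min_len || rec_len % 4 != 0)
        return false;
    /* The entry must end inside the block it was read from. */
    if (offset > block_size || rec_len > block_size - offset)
        return false;
    return true;
}
```

With an assumed 4096-byte block, `model_dirent_ok(0, 34478, 4096)` rejects the entry from the log above, so the parent entry pointer would never be computed from it.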
      
      Signed-off-by: Ye Bin <yebin10@huawei.com>
      Signed-off-by: Li Nan <linan122@huawei.com>
      Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
  6. May 24, 2022
    • uce: coredump scenario support kernel recovery · 4b0b9671
      童甜根 authored
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I591O4
      
      
      CVE: NA
      
      --------------------------------
      
      This patch adds uce kernel recovery path support to the coredump process.

      Writing the coredump file to a filesystem depends on that filesystem's
      implementation of the write_iter operation. This patch only supports uce
      kernel recovery in ext4/tmpfs/pipefs.

      The coredump scenario uses bit4 of uce_kernel_recover as its switch,
      which can be set through procfs and the cmdline.
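
Testing such a switch is a plain bitmask check. The sketch below assumes that "bit4" is counted from zero (mask 0x10). The macro and function names are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed zero-based bit position of the coredump switch (hypothetical name). */
#define MODEL_UCE_COREDUMP_BIT 4u

/* Returns true when the coredump recovery bit is set in the given value. */
bool model_coredump_recovery_enabled(unsigned int uce_kernel_recover)
{
    /* Only the low bits are modeled here. */
    assert(MODEL_UCE_COREDUMP_BIT < 8u);
    return (uce_kernel_recover & (1u << MODEL_UCE_COREDUMP_BIT)) != 0;
}
```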
      
      Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
      Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • NULL pointer dereference on rmmod iptable_mangle. · e4be281f
      David Wilder authored
      maillist inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I590MF
      
      
      CVE: NA
      
      --------------------------------
      
      This crash happened on a ppc64le system running ltp network tests when ltp script ran "rmmod iptable_mangle".
      
      [213425.602369] BUG: Kernel NULL pointer dereference at 0x00000010
      [213425.602388] Faulting instruction address: 0xc008000000550bdc
      [213425.602399] Oops: Kernel access of bad area, sig: 11 [#1]
      [213425.602409] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
      [213425.602418] Modules linked in: nf_log_ipv4 nf_log_common iptable_mangle(-) iptable_nat nf_nat nf_conntrack iptable_filter ip_tables xt_limit xt_multiport xt_LOG xt_tcpudp nf_defrag_ipv6 nf_defrag_ipv4 x_tables sch_netem tcp_bbr rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver rds dummy sctp crypto_user veth uhid kvm_pr kvm vfio_iommu_spapr_tce vfio_spapr_eeh vfio hci_vhci bluetooth ecdh_generic ecc vhost_net tap vhost_vsock vmw_vsock_virtio_transport_common vhost vsock uinput n_gsm pps_ldisc ppp_synctty ppp_async ppp_generic slip slhc serport brd tun fuse vfat fat xfs ext4 crc16 mbcache jbd2 mlx5_ib ib_uverbs ib_core mlx5_core mlxfw tls loop be2net ibmveth(XX) st sr_mod cdrom lp parport_pc parport nvram xfrm_user joydev binfmt_misc rpadlpar_io(XX) rpaphp(XX) xsk_diag tcp_diag udp_diag raw_diag inet_diag unix_diag af_packet_diag netlink_diag nfsv3 nfs_acl nfs lockd grace sunrpc fscache af_packet rfkill vmx_crypto gf128mul ibmvnic uio_pdrv_genirq crct10dif_vpmsum uio rtc_generic btrfs
      [213425.602577]  libcrc32c xor raid6_pq dm_service_time sd_mod ibmvfc(XX) scsi_transport_fc crc32c_vpmsum dm_mirror dm_region_hash dm_log sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod [last unloaded: ipt_REJECT]
      [213425.602659] Supported: No, Unreleased kernel
      [213425.602671] CPU: 0 PID: 10 Comm: ksoftirqd/0 Tainted: G               X   5.3.18-14-default #1 SLE15-SP2 (unreleased)
      [213425.602682] NIP:  c008000000550bdc LR: c008000001de00c8 CTR: c008000000550b48
      [213425.602692] REGS: c000000002973250 TRAP: 0380   Tainted: G               X    (5.3.18-14-default)
      [213425.602701] MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 88082822  XER: 00000001
      [213425.602726] CFAR: c008000000551050 IRQMASK: 0
                      GPR00: c008000001de00c8 c0000000029734e0 c00800000055d800 c00000087b7c3600
                      GPR04: c000000002973768 0000000000000000 0000000000000000 c0000007ab050800
                      GPR08: 000000000000000e c0000007ab050814 c000000001558380 c008000001de04e0
                      GPR12: c008000000550b48 c0000000021e0000 c00000000016b358 0000000000000100
                      GPR16: 000000000000008e 00000000000000a0 0000000000000000 0000000000000005
                      GPR20: 0000000000000000 c000000001168fa8 0000000000000000 c0000007ac4b46d4
                      GPR24: c000000002973768 c008000000555f80 0000000000000001 c0000000011ee000
                      GPR28: c0000000011ee000 c00000087b7c3600 c0000007ab05080e c000000002973768
      [213425.602816] NIP [c008000000550bdc] ipt_do_table+0x94/0x980 [ip_tables]
      [213425.602827] LR [c008000001de00c8] iptable_mangle_hook+0x50/0x180 [iptable_mangle]
      [213425.602835] Call Trace:
      [213425.602843] [c0000000029734e0] [c000000002973570] 0xc000000002973570 (unreliable)
      [213425.602856] [c000000002973690] [c008000001de00c8] iptable_mangle_hook+0x50/0x180 [iptable_mangle]
      [213425.602871] [c0000000029736f0] [c000000000a82b60] nf_hook_slow+0x70/0x140
      [213425.602882] [c000000002973740] [c000000000a90cdc] ip_rcv+0xac/0x120
      [213425.602894] [c0000000029737c0] [c0000000009d978c] __netif_receive_skb_core+0x42c/0x1160
      [213425.602906] [c0000000029738a0] [c0000000009dab80] __netif_receive_skb_list_core+0x130/0x330
      [213425.602919] [c000000002973940] [c0000000009dafa4] netif_receive_skb_list_internal+0x224/0x350
      [213425.602932] [c0000000029739c0] [c0000000009db2b4] gro_normal_list.part.109+0x34/0x60
      [213425.602943] [c0000000029739f0] [c0000000009dc0c8] napi_gro_receive+0x1b8/0x200
      [213425.602957] [c000000002973a30] [c008000000e32368] ibmvnic_poll+0x2d0/0x410 [ibmvnic]
      [213425.602969] [c000000002973b10] [c0000000009dcebc] net_rx_action+0x1ec/0x540
      [213425.602982] [c000000002973c30] [c000000000c1ff68] __do_softirq+0x178/0x424
      [213425.602994] [c000000002973d20] [c00000000013c924] run_ksoftirqd+0x64/0x90
      [213425.603006] [c000000002973d40] [c0000000001717c0] smpboot_thread_fn+0x270/0x2c0
      [213425.603018] [c000000002973db0] [c00000000016b4fc] kthread+0x1ac/0x1c0
      [213425.603029] [c000000002973e20] [c00000000000b660] ret_from_kernel_thread+0x5c/0x7c
      [213425.603038] Instruction dump:
      [213425.603046] e8e300c0 82c40000 e92d1178 f9210118 39200000 2fbc0000 7fc74214 419e046c
      [213425.603067] eb380010 2fb90000 419e0474 393e0006 <80850010> 38c00000 7d404e2c 39200001
      [213425.603089] ---[ end trace f2babb2170f723cc ]---
      [213425.690517]
      
      In the crash we find in iptable_mangle_hook() that state->net->ipv4.iptable_mangle=NULL, causing a NULL pointer dereference. net->ipv4.iptable_mangle is set to NULL in iptable_mangle_net_exit(), which is called when the ip_mangle module is unloaded. A rmmod task was found in the crash dump. A second crash showed the same problem when running "rmmod iptable_filter" (net->ipv4.iptable_filter=NULL).

      Once a hook is registered, packets will pick up a pointer from net->ipv4.iptable_$table. The patch adds a call to synchronize_net() in ipt_unregister_table() to ensure that no packets that have picked up the pointer are still in flight before the unregister completes.

      This change has prevented the problem in our testing. However, we have concerns with this change, as it would mean that on netns cleanup we would need one synchronize_net() call for every table in use. Also, on module unload, there would be one synchronize_net() for every existing netns.
      
      Meanwhile, we fix the same problem in IPv6 stack.
      
      Signed-off-by: David Wilder <dwilder@us.ibm.com>
      link: https://www.spinics.net/lists/netdev/msg658602.html

      Signed-off-by: Huang Guobin <huangguobin4@huawei.com>
      Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
      Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
  7. May 23, 2022
    • sched/qos: Add qos_tg_{throttle,unthrottle}_{up,down} · 453eaea6
      Zhang Qiao authored
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I4VZJT
      
      
      CVE: NA
      
      --------------------------------
      
      1. Qos throttle reuses tg_{throttle,unthrottle}_{up,down}, which can
      write some cfs-bandwidth fields and may cause unknown data errors. So
      add qos_tg_{throttle,unthrottle}_{up,down} for qos throttle.

      2. The caller of walk_tg_tree_from() must hold rcu_lock, but currently
      none does, so add it now.
      
      Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com>
      Reviewed-by: Chen Hui <judy.chenhui@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • sched: Throttle offline task at tracehook_notify_resume() · 2701a7bb
      Zhang Qiao authored
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I4VZJT
      
      
      CVE: NA
      
      --------------------------------
      
      Previously, when the cpu was detected to be overloaded, we throttled
      offline tasks at exit_to_user_mode_loop() before returning to user mode.
      Some architectures (e.g., arm64) do not support the QOS scheduler because
      a task does not return to userspace via exit_to_user_mode_loop() on
      these platforms.
      In order to solve this problem and support the qos scheduler on all
      architectures, if we need to throttle offline tasks, we set the flag
      TIF_NOTIFY_RESUME on an offline task when it is picked and throttle
      it at tracehook_notify_resume().
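
The deferred-throttle pattern described here (mark the task when it is picked, act when it is about to resume) can be modeled in user space. All names below are hypothetical, and the boolean `notify_resume` merely stands in for TIF_NOTIFY_RESUME. This is a sketch of the control flow, not of the scheduler code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task model. */
struct model_task {
    bool offline;        /* task belongs to the offline (low priority) class */
    bool notify_resume;  /* stands in for TIF_NOTIFY_RESUME */
    bool throttled;
};

/* Called when the task is picked to run: only mark it, do not throttle yet. */
void model_pick_task(struct model_task *t, bool cpu_overloaded)
{
    assert(t != NULL);
    if (t->offline && cpu_overloaded)
        t->notify_resume = true;
}

/* Called on the resume notification path: perform the deferred throttle. */
void model_notify_resume(struct model_task *t)
{
    assert(t != NULL);
    if (t->notify_resume) {
        t->notify_resume = false;
        t->throttled = true;
    }
}
```

Because the action hangs off a per-task flag rather than an architecture-specific exit loop, the same two hooks work wherever the resume notification exists.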
      
      Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com>
      Reviewed-by: Chen Hui <judy.chenhui@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • sched: enable CONFIG_QOS_SCHED on arm64 · 70d21cfa
      Zhang Qiao authored
      hulk inclusion
      category: feature
      bugzilla: https://gitee.com/openeuler/kernel/issues/I4VZJT
      
      
      CVE: NA
      
      --------------------------------
      
      Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com>
      Reviewed-by: Chen Hui <judy.chenhui@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • sched/qos: Remove dependency CONFIG_x86 · 045a6974
      Zhang Qiao authored
      hulk inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I4VZJT
      
      
      CVE: NA
      
      --------------------------------
      
      After removing the CONFIG_x86 dependency, if CONFIG_QOS_SCHED is enabled,
      only x86 servers can handle the priority inversion issue.
      
      Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com>
      Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
      Reviewed-by: Chen Hui <judy.chenhui@huawei.com>
      Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
      Reviewed-by: Chen Hui <judy.chenhui@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • net/sched: cls_u32: fix netns refcount changes in u32_change() · bede0bb3
      Eric Dumazet authored
      stable inclusion
      from stable-v4.19.241
      commit 75b0cc7904da7b40c6e8f2cf3ec4223b292b1184
      category: bugfix
      bugzilla: 186701, https://gitee.com/src-openeuler/kernel/issues/I5850T
      
      
      CVE: CVE-2022-29581
      
      --------------------------------
      
      commit 3db09e762dc79584a69c10d74a6b98f89a9979f8 upstream.
      
      We are now able to detect extra put_net() at the moment
      they happen, instead of much later in correct code paths.
      
      u32_init_knode() / tcf_exts_init() populates the ->exts.net
      pointer, but as mentioned in tcf_exts_init(),
      the refcount on netns has not been elevated yet.
      
      The refcount is taken only once tcf_exts_get_net()
      is called.
      
      So the two u32_destroy_key() calls from u32_change()
      are attempting to release an invalid reference on the netns.
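
The imbalance can be shown with a small counting model. Everything below is a hypothetical user-space stand-in (`model_net`, `model_exts` and the `model_*` helpers are not kernel names). Splitting the destroy helper into a variant that skips the put is one possible shape for the error path, assumed here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the namespace and the extension block. */
struct model_net  { int refcount; };
struct model_exts { struct model_net *net; };

/* Init only stores the pointer; no reference is taken yet. */
void model_exts_init(struct model_exts *e, struct model_net *net)
{
    assert(e != NULL && net != NULL);
    e->net = net;
}

/* The reference is taken only here. */
void model_exts_get_net(struct model_exts *e)
{
    assert(e != NULL && e->net != NULL);
    e->net->refcount++;
}

/* Destroy without touching the refcount: for keys that never took a ref. */
void model_destroy_key_noput(struct model_exts *e)
{
    assert(e != NULL);
    e->net = NULL;
}

/* Destroy for keys that did take a reference: put, then tear down. */
void model_destroy_key(struct model_exts *e)
{
    assert(e != NULL && e->net != NULL);
    e->net->refcount--;
    model_destroy_key_noput(e);
}
```

Calling `model_destroy_key()` on a key that was only initialized drops a reference that someone else owns, which is the "decrement hit 0" pattern in the report below.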
      
      syzbot report:
      
      refcount_t: decrement hit 0; leaking memory.
      WARNING: CPU: 0 PID: 21708 at lib/refcount.c:31 refcount_warn_saturate+0xbf/0x1e0 lib/refcount.c:31
      Modules linked in:
      CPU: 0 PID: 21708 Comm: syz-executor.5 Not tainted 5.18.0-rc2-next-20220412-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:refcount_warn_saturate+0xbf/0x1e0 lib/refcount.c:31
      Code: 1d 14 b6 b2 09 31 ff 89 de e8 6d e9 89 fd 84 db 75 e0 e8 84 e5 89 fd 48 c7 c7 40 aa 26 8a c6 05 f4 b5 b2 09 01 e8 e5 81 2e 05 <0f> 0b eb c4 e8 68 e5 89 fd 0f b6 1d e3 b5 b2 09 31 ff 89 de e8 38
      RSP: 0018:ffffc900051af1b0 EFLAGS: 00010286
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      RDX: 0000000000040000 RSI: ffffffff8160a0c8 RDI: fffff52000a35e28
      RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000
      R10: ffffffff81604a9e R11: 0000000000000000 R12: 1ffff92000a35e3b
      R13: 00000000ffffffef R14: ffff8880211a0194 R15: ffff8880577d0a00
      FS:  00007f25d183e700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f19c859c028 CR3: 0000000051009000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       __refcount_dec include/linux/refcount.h:344 [inline]
       refcount_dec include/linux/refcount.h:359 [inline]
       ref_tracker_free+0x535/0x6b0 lib/ref_tracker.c:118
       netns_tracker_free include/net/net_namespace.h:327 [inline]
       put_net_track include/net/net_namespace.h:341 [inline]
       tcf_exts_put_net include/net/pkt_cls.h:255 [inline]
       u32_destroy_key.isra.0+0xa7/0x2b0 net/sched/cls_u32.c:394
       u32_change+0xe01/0x3140 net/sched/cls_u32.c:909
       tc_new_tfilter+0x98d/0x2200 net/sched/cls_api.c:2148
       rtnetlink_rcv_msg+0x80d/0xb80 net/core/rtnetlink.c:6016
       netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2495
       netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
       netlink_unicast+0x543/0x7f0 net/netlink/af_netlink.c:1345
       netlink_sendmsg+0x904/0xe00 net/netlink/af_netlink.c:1921
       sock_sendmsg_nosec net/socket.c:705 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:725
       ____sys_sendmsg+0x6e2/0x800 net/socket.c:2413
       ___sys_sendmsg+0xf3/0x170 net/socket.c:2467
       __sys_sendmsg+0xe5/0x1b0 net/socket.c:2496
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      RIP: 0033:0x7f25d0689049
      Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f25d183e168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00007f25d079c030 RCX: 00007f25d0689049
      RDX: 0000000000000000 RSI: 0000000020000340 RDI: 0000000000000005
      RBP: 00007f25d06e308d R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: 00007ffd0b752e3f R14: 00007f25d183e300 R15: 0000000000022000
       </TASK>
      
      Fixes: 35c55fc1 ("cls_u32: use tcf_exts_get_net() before call_rcu()")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      [rkolchmeyer: Backported to 4.19: adjusted u32_destroy_key() signature]
      Signed-off-by: Robert Kolchmeyer <rkolchmeyer@google.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Xu Jia <xujia39@huawei.com>
      Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
      Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
      Reviewed-by: Wang Weiyang <wangweiyang2@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • mm: hwpoison: enable memory error handling on 1GB hugepage optionaly · 2299190c
      Liu Shixin authored
      hulk inclusion
      category: feature
      bugzilla: 186704, https://gitee.com/openeuler/kernel/issues/I574NB
      
      
      CVE: NA
      
      --------------------------------
      
      Memory error handling on 1GB hugepages was disabled by commit 31286a84
      because it may lead to a kernel panic.

      However, that commit results in a more troublesome downstream problem, so
      we have to revert it in some situations. At the same time, we backport commit
      15494520, which resolves the kernel panic described in commit 31286a84.

      We add a new cmdline parameter named 'hugetlb_hwpoison_full' to enable
      memory error handling on 1GB hugepages. By default, memory error handling
      on 1GB hugepages is disabled.

      Note that the kernel panic may not have been completely resolved!
      
      Signed-off-by: Liu Shixin <liushixin2@huawei.com>
      Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • mm: fix gup_pud_range · 5a13dee5
      Qiujun Huang authored
      mainline inclusion
      from mainline-v5.6-rc1
      commit 15494520
      category: bugfix
      bugzilla: 186704, https://gitee.com/openeuler/kernel/issues/I574NB
      CVE: NA
      
      --------------------------------
      
      sorry for not processing for a long time.  I met it again.
      
      patch v1   https://lkml.org/lkml/2019/9/20/656
      
      do_machine_check()
        do_memory_failure()
          memory_failure()
            hw_poison_user_mappings()
              try_to_unmap()
                pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));
      
      ...and now we have a swap entry that indicates that the page entry
      refers to a bad (and poisoned) page of memory, but gup_fast() at this
      level of the page table was ignoring swap entries, and incorrectly
      assuming that "!pxd_none() == valid and present".
      
      And this was not just a poisoned page problem, but a general swap entry
      problem.  So, any swap entry type (device memory migration, numa
      migration, or just regular swapping) could lead to the same problem.
      
      Fix this by checking for pxd_present(), instead of pxd_none().
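
The difference between the two predicates can be seen in a toy encoding. The bit layout below (bit 0 as the present bit, any other non-zero value as a swap-style entry) and all `model_*` names are assumptions for illustration. Real page table formats are architecture specific.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed toy encoding: bit 0 marks a present mapping. */
#define MODEL_PRESENT 0x1u

bool model_pxd_none(uint64_t entry)
{
    return entry == 0;
}

bool model_pxd_present(uint64_t entry)
{
    return (entry & MODEL_PRESENT) != 0;
}

/* Old check: any non-empty entry is treated as a valid mapping. */
bool model_walk_old(uint64_t entry)
{
    return !model_pxd_none(entry);
}

/* New check: only entries with the present bit are followed. */
bool model_walk_new(uint64_t entry)
{
    bool ok = model_pxd_present(entry);

    /* A present entry is never "none" in this encoding. */
    assert(!ok || !model_pxd_none(entry));
    return ok;
}
```

A swap-style entry such as `0x1000` is non-zero but not present: the old check follows it, while the new check skips it.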
      
      Link: http://lkml.kernel.org/r/1578479084-15508-1-git-send-email-hqjagain@gmail.com
      
      
      Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Liu Shixin <liushixin2@huawei.com>
      Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
    • nfc: nfcmrvl: main: reorder destructive operations in nfcmrvl_nci_unregister_dev to avoid bugs · 08f5d6a4
      Duoming Zhou authored
      stable inclusion
      from stable-v4.19.242
      commit b266f492b2af82269aaaab871ac3949420ae678c
      category: bugfix
      bugzilla: https://gitee.com/src-openeuler/kernel/issues/I584YD
      
      
      CVE: CVE-2022-1734
      
      --------------------------------
      
      commit d270453a0d9ec10bb8a802a142fb1b3601a83098 upstream.
      
      There are destructive operations such as nfcmrvl_fw_dnld_abort and
      gpio_free in nfcmrvl_nci_unregister_dev. Resources such as the firmware
      and gpio could be destroyed while upper layer functions such as
      nfcmrvl_fw_dnld_start and nfcmrvl_nci_recv_frame are executing, which leads
      to double-free, use-after-free and null-ptr-deref bugs.
      
      There are three situations that could lead to double-free bugs.
      
      The first situation is shown below:
      
         (Thread 1)                 |      (Thread 2)
      nfcmrvl_fw_dnld_start         |
       ...                          |  nfcmrvl_nci_unregister_dev
       release_firmware()           |   nfcmrvl_fw_dnld_abort
        kfree(fw) //(1)             |    fw_dnld_over
                                    |     release_firmware
        ...                         |      kfree(fw) //(2)
                                    |     ...
      
      The second situation is shown below:
      
         (Thread 1)                 |      (Thread 2)
      nfcmrvl_fw_dnld_start         |
       ...                          |
       mod_timer                    |
       (wait a time)                |
       fw_dnld_timeout              |  nfcmrvl_nci_unregister_dev
         fw_dnld_over               |   nfcmrvl_fw_dnld_abort
          release_firmware          |    fw_dnld_over
           kfree(fw) //(1)          |     release_firmware
           ...                      |      kfree(fw) //(2)
      
      The third situation is shown below:
      
             (Thread 1)               |       (Thread 2)
      nfcmrvl_nci_recv_frame          |
       if(..->fw_download_in_progress)|
        nfcmrvl_fw_dnld_recv_frame    |
         queue_work                   |
                                      |
      fw_dnld_rx_work                 | nfcmrvl_nci_unregister_dev
       fw_dnld_over                   |  nfcmrvl_fw_dnld_abort
        release_firmware              |   fw_dnld_over
         kfree(fw) //(1)              |    release_firmware
                                      |     kfree(fw) //(2)
      
      The firmware struct is freed at position (1) and then freed again
      at position (2).
      
      The crash trace triggered by the PoC is shown below:
      
      BUG: KASAN: double-free or invalid-free in fw_dnld_over
      Call Trace:
        kfree
        fw_dnld_over
        nfcmrvl_nci_unregister_dev
        nci_uart_tty_close
        tty_ldisc_kill
        tty_ldisc_hangup
        __tty_hangup.part.0
        tty_release
        ...
      
      What's more, there are also use-after-free and null-ptr-deref bugs
      in nfcmrvl_fw_dnld_start. If nfcmrvl_nci_unregister_dev frees the
      firmware struct or the gpio, or sets members of priv->fw_dnld to NULL,
      and nfcmrvl_fw_dnld_start then dereferences them, a UAF or NPD bug
      occurs.
      
      This patch moves the destructive operations after nci_unregister_device
      so that the cleanup routine is synchronized with the firmware download
      routine.
      
      nci_unregister_device is well synchronized. If the device is
      detaching, the firmware download routine jumps to its error path. If
      the firmware download routine is executing, nci_unregister_device waits
      until it has finished.
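      
      The ordering rule described above can be sketched as a small,
      single-threaded userspace model. This is only an illustration under
      stated assumptions, not the driver code: struct demo_dev and the demo_*
      functions are hypothetical names, malloc/free stand in for the firmware
      allocation, and the waiting that nci_unregister_device performs is not
      modelled because no second thread exists here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's private data. */
struct demo_dev {
    bool registered;  /* cleared by demo_unregister() */
    void *fw;         /* stand-in for the firmware struct */
    int fw_frees;     /* how many times fw has been released */
};

/* Download path: refuses to start once the device is detaching. */
int demo_fw_dnld_start(struct demo_dev *d)
{
    if (!d->registered)
        return -ENODEV;
    if (d->fw)
        return -EBUSY;
    d->fw = malloc(16);
    return d->fw ? 0 : -ENOMEM;
}

/*
 * Stand-in for the unregister step: once it returns, no new download
 * can start. (Assumption: the real call also waits for a running
 * download, which this single-threaded model cannot show.)
 */
void demo_unregister(struct demo_dev *d)
{
    d->registered = false;
}

/* Destructive step: only legal after demo_unregister(). */
void demo_teardown(struct demo_dev *d)
{
    assert(!d->registered);
    if (d->fw) {
        free(d->fw);
        d->fw = NULL;
        d->fw_frees++;
    }
}
```

      Calling demo_unregister() before demo_teardown() means the firmware
      stand-in is released exactly once, and a later demo_fw_dnld_start()
      fails with -ENODEV instead of touching freed state.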
      
      v1->v2 change:
       	- fix stable branch
      
      Fixes: 3194c687 ("NFC: nfcmrvl: add firmware download support")
      Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com>
      Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
      Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      08f5d6a4
    • Zhang Yi's avatar
      ext4: fix warning when submitting superblock in ext4_commit_super() · ac5df7ff
      Zhang Yi authored
      hulk inclusion
      category: bugfix
      bugzilla: 186737, https://gitee.com/openeuler/kernel/issues/I58COJ
      
      
      CVE: NA
      
      --------------------------------
      
      We already check the io_error and uptodate flags before submitting
      the superblock buffer, and set the uptodate flag again if a previous
      write-out failed. But this is done without a lock and can race with
      another ext4_commit_super(), which finally triggers the '!uptodate'
      WARNING when marking the buffer dirty. Fix it by submitting the buffer
      directly.
      
      Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
      Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
      Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      ac5df7ff
  8. May 19, 2022
    • Baokun Li's avatar
      ext4: fix bug_on in __es_tree_search · fec5e578
      Baokun Li authored
      hulk inclusion
      category: bugfix
      bugzilla: 186770, https://gitee.com/openeuler/kernel/issues/I58670
      
      
      CVE: NA
      
      --------------------------------
      
      Hulk Robot reported a BUG_ON:
      
      ==================================================================
      kernel BUG at fs/ext4/extents_status.c:199!
      [...]
      RIP: 0010:ext4_es_end fs/ext4/extents_status.c:199 [inline]
      RIP: 0010:__es_tree_search+0x1e0/0x260 fs/ext4/extents_status.c:217
      [...]
      Call Trace:
       ext4_es_cache_extent+0x109/0x340 fs/ext4/extents_status.c:766
       ext4_cache_extents+0x239/0x2e0 fs/ext4/extents.c:561
       ext4_find_extent+0x6b7/0xa20 fs/ext4/extents.c:964
       ext4_ext_map_blocks+0x16b/0x4b70 fs/ext4/extents.c:4384
       ext4_map_blocks+0xe26/0x19f0 fs/ext4/inode.c:567
       ext4_getblk+0x320/0x4c0 fs/ext4/inode.c:980
       ext4_bread+0x2d/0x170 fs/ext4/inode.c:1031
       ext4_quota_read+0x248/0x320 fs/ext4/super.c:6257
       v2_read_header+0x78/0x110 fs/quota/quota_v2.c:63
       v2_check_quota_file+0x76/0x230 fs/quota/quota_v2.c:82
       vfs_load_quota_inode+0x5d1/0x1530 fs/quota/dquot.c:2368
       dquot_enable+0x28a/0x330 fs/quota/dquot.c:2490
       ext4_quota_enable fs/ext4/super.c:6137 [inline]
       ext4_enable_quotas+0x5d7/0x960 fs/ext4/super.c:6163
       ext4_fill_super+0xa7c9/0xdc00 fs/ext4/super.c:4754
       mount_bdev+0x2e9/0x3b0 fs/super.c:1158
       mount_fs+0x4b/0x1e4 fs/super.c:1261
      [...]
      ==================================================================
      
      Above issue may happen as follows:
      -------------------------------------
      ext4_fill_super
       ext4_enable_quotas
        ext4_quota_enable
         ext4_iget
          __ext4_iget
           ext4_ext_check_inode
            ext4_ext_check
             __ext4_ext_check
              ext4_valid_extent_entries
               Check for overlapping extents doesn't take effect
         dquot_enable
          vfs_load_quota_inode
           v2_check_quota_file
            v2_read_header
             ext4_quota_read
              ext4_bread
               ext4_getblk
                ext4_map_blocks
                 ext4_ext_map_blocks
                  ext4_find_extent
                   ext4_cache_extents
                    ext4_es_cache_extent
                     __es_tree_search
                      ext4_es_end
                       BUG_ON(es->es_lblk + es->es_len < es->es_lblk)
      
      The corrupted ext4 extents are as follows:
      0af3 0300 0400 0000 00000000    extent_header
      00000000 0100 0000 12000000     extent1
      00000000 0100 0000 18000000     extent2
      02000000 0400 0000 14000000     extent3
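      
      To read the dump above, each 12-byte extent record can be decoded
      field by field. The sketch below assumes the dump shows raw on-disk
      bytes in little-endian order (consistent with the header bytes 0a f3
      for magic 0xF30A) and follows the field order of struct ext4_extent
      (ee_block, ee_len, ee_start_hi, ee_start_lo). The names
      demo_extent_fields and demo_decode_extent are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded view of one 12-byte on-disk extent record. */
struct demo_extent_fields {
    uint32_t block;     /* first logical block covered (ee_block) */
    uint16_t len;       /* number of blocks covered (ee_len) */
    uint16_t start_hi;  /* high 16 bits of the physical block */
    uint32_t start_lo;  /* low 32 bits of the physical block */
};

/* Read a little-endian 16-bit value from a byte buffer. */
static uint16_t demo_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t demo_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode one raw extent record into its four fields. */
struct demo_extent_fields demo_decode_extent(const uint8_t raw[12])
{
    struct demo_extent_fields e;

    assert(raw != NULL);
    e.block = demo_le32(raw);
    e.len = demo_le16(raw + 4);
    e.start_hi = demo_le16(raw + 6);
    e.start_lo = demo_le32(raw + 8);
    return e;
}
```

      Decoded this way, extent1 and extent2 both start at logical block 0
      with length 1, so they overlap, while extent3 covers logical blocks
      2 to 5.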
      
      In the ext4_valid_extent_entries function,
      if prev is 0, no error is returned even if lblock<=prev.
      This was intended to skip the check on the first extent, but
      in the error image above, prev=0+1-1=0 when checking the second extent,
      so even though lblock<=prev, the function does not return an error.
      As a result, BUG_ON is triggered in __es_tree_search and the system panics.
      
      To solve this problem, we only need to check that:
      1. The lblock of the first extent is not less than 0.
      2. The lblock of the next extent is not less than
         the next block of the previous extent.
      The same applies to extent_idx.
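      
      The difference between the old check and the described fix can be
      sketched in userspace as follows. This is a simplified model, not the
      kernel function: struct demo_range and both demo_valid_* helpers are
      hypothetical, only the logical start and length are tracked, and
      32-bit wraparound of lblock + len is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified extent: logical start block and length only. */
struct demo_range {
    uint32_t lblock;
    uint16_t len;
};

/* Old logic as described: prev == 0 doubles as "no previous extent". */
bool demo_valid_old(const struct demo_range *ext, size_t n)
{
    uint32_t prev = 0;

    assert(ext != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ext[i].lblock <= prev && prev)
            return false;
        prev = ext[i].lblock + ext[i].len - 1;  /* last block covered */
    }
    return true;
}

/* Described fix: track the first block after the previous extent. */
bool demo_valid_new(const struct demo_range *ext, size_t n)
{
    uint32_t cur = 0;

    assert(ext != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ext[i].lblock < cur)
            return false;
        cur = ext[i].lblock + ext[i].len;  /* next free logical block */
    }
    return true;
}
```

      With the three extents from the error image, (0,1), (0,1) and (2,4),
      demo_valid_old() accepts the list because prev stays 0 after the first
      extent, while demo_valid_new() rejects it at the second extent.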
      
      Fixes: 5946d089 ("ext4: check for overlapping extents in ext4_valid_extent_entries()")
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Baokun Li <libaokun1@huawei.com>
      Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
      Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
      fec5e578
  9. May 18, 2022
  10. May 17, 2022