  1. Sep 22, 2020
    • ovl: inode reference leak in ovl_is_inuse true case. · 35866c6e
      youngjun authored
      
      stable inclusion
      from linux-4.19.134
      commit 77cc397f6bb05121ef5544e9d78d29ecf4ba7165
      
      --------------------------------
      
      commit 24f14009 upstream.
      
      When ovl_is_inuse() returns true, the trap inode reference is not put.
      Put it, and add a comment explaining the ordering of ovl_is_inuse()
      after ovl_setup_trap().
      
      Fixes: 0be0bfd2 ("ovl: fix regression caused by overlapping layers detection")
      Cc: <stable@vger.kernel.org> # v4.19+
      Reviewed-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: youngjun <her0gyugyu@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      35866c6e
    • arm64/alternatives: don't patch up internal branches · 668f513a
      Ard Biesheuvel authored
      
      stable inclusion
      from linux-4.19.134
      commit 47fed9aa3fe0838eee41cb4f94c386aa18729b9e
      
      --------------------------------
      
      [ Upstream commit 5679b281 ]
      
      Commit f7b93d42 ("arm64/alternatives: use subsections for replacement
      sequences") moved the alternatives replacement sequences into subsections,
      in order to keep them as close as possible to the code that they replace.
      
      Unfortunately, this broke the logic in branch_insn_requires_update,
      which assumed that any branch into kernel executable code was a branch
      that required updating, which is no longer the case now that the code
      sequences that are patched in are in the same section as the patch site
      itself.
      
      So the only way to discriminate branches that require updating and ones
      that don't is to check whether the branch targets the replacement sequence
      itself, and so we can drop the call to kernel_text_address() entirely.
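
      A minimal sketch of the check described above, written as plain C
      rather than kernel code. The name branch_needs_update and its
      parameters are hypothetical; the upstream helper works on the kernel's
      own alternative-entry structure, and its exact bounds handling may
      differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical illustration: a branch copied out of a replacement sequence
 * needs its offset fixed up only when its target lies outside that
 * sequence. Branches that stay inside the sequence move together with it,
 * so their relative offsets remain valid.
 */
static bool branch_needs_update(uintptr_t target, uintptr_t repl_start,
                                size_t repl_len)
{
    assert(repl_len > 0);
    bool internal = target >= repl_start && target < repl_start + repl_len;
    return !internal;
}
```

      The decision depends only on the replacement sequence's own address
      range, so no lookup of kernel text boundaries is involved.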
      
      Fixes: f7b93d42 ("arm64/alternatives: use subsections for replacement sequences")
      Reported-by: Alexandru Elisei <alexandru.elisei@arm.com>
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Tested-by: Alexandru Elisei <alexandru.elisei@arm.com>
      Link: https://lore.kernel.org/r/20200709125953.30918-1-ardb@kernel.org
      Signed-off-by: Will Deacon <will@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      668f513a
    • arm64/alternatives: use subsections for replacement sequences · 2b69cb40
      Ard Biesheuvel authored
      
      stable inclusion
      from linux-4.19.134
      commit d6d9145866fbfac91c9c1dd8e32b355bdd63d000
      
      --------------------------------
      
      [ Upstream commit f7b93d42 ]
      
      When building very large kernels, the logic that emits replacement
      sequences for alternatives fails when relative branches are present
      in the code that is emitted into the .altinstr_replacement section
      and patched in at the original site and fixed up. The reason is that
      the linker will insert veneers if relative branches go out of range,
      and due to the relative distance of the .altinstr_replacement from
      the .text section where its branch targets usually live, veneers
      may be emitted at the end of the .altinstr_replacement section, with
      the relative branches in the sequence pointed at the veneers instead
      of the actual target.
      
      The alternatives patching logic will attempt to fix up the branch to
      point to its original target, which will be the veneer in this case,
      but given that the patch site is likely to be far away as well, it
      will be out of range and so patching will fail. There are other cases
      where these veneers are problematic, e.g., when the target of the
      branch is in .text while the patch site is in .init.text, in which
      case putting the replacement sequence inside .text may not help either.
      
      So let's use subsections to emit the replacement code as closely as
      possible to the patch site, to ensure that veneers are only likely to
      be emitted if they are required at the patch site as well, in which
      case they will be in range for the replacement sequence both before
      and after it is transported to the patch site.
      
      This will prevent alternative sequences in non-init code from being
      released from memory after boot, but this is tolerable given that the
      entire section is only 512 KB on an allyesconfig build (which weighs in
      at 500+ MB for the entire Image). Also, note that modules today carry
      the replacement sequences in non-init sections as well, and any of
      those that target init code will be emitted into init sections after
      this change.
      
      This fixes an early crash when booting an allyesconfig kernel on a
      system where any of the alternatives sequences containing relative
      branches are activated at boot (e.g., ARM64_HAS_PAN on TX2).
      
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Cc: Andre Przywara <andre.przywara@arm.com>
      Cc: Dave P Martin <dave.martin@arm.com>
      Link: https://lore.kernel.org/r/20200630081921.13443-1-ardb@kernel.org
      Signed-off-by: Will Deacon <will@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      2b69cb40
    • block: release bip in a right way in error path · f9c4e7b0
      Chengguang Xu authored
      
      stable inclusion
      from linux-4.19.133
      commit 1d269061bb3823c9e744397713f42e9fa49576eb
      
      --------------------------------
      
      [ Upstream commit 0b8eb629 ]
      
      Release bip using kfree() in the error path when it was allocated
      by kmalloc().
      
      Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      f9c4e7b0
    • cifs: update ctime and mtime during truncate · fd197555
      Zhang Xiaoxu authored
      
      stable inclusion
      from linux-4.19.133
      commit 38bcc785c2eb67b22caca3ec4ac7ce08bcc65326
      
      --------------------------------
      
      [ Upstream commit 5618303d ]
      
      According to the truncate man page, if the size changes, then the
      st_ctime and st_mtime fields should be updated. But cifs does not
      do this.

      This causes xfstests generic/313 to fail.

      So, add the ATTR_MTIME|ATTR_CTIME flags to attrs when changing
      the file size.
      
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      fd197555
    • dm zoned: assign max_io_len correctly · 837f42b9
      Hou Tao authored
      
      stable inclusion
      from linux-4.19.132
      commit de07529a9aad73bae29563097372aeda1a3f31c6
      
      --------------------------------
      
      commit 7b237748 upstream.
      
      The unit of max_io_len is sector instead of byte (spotted through
      code review), so fix it.
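
      To illustrate the kind of unit mismatch described above, here is a
      small sketch assuming 512-byte sectors. The helper names are
      hypothetical and the zone size used in the note below is only an
      example; this is not the dm-zoned code itself.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9  /* assumes 512-byte sectors */

/* Convert a sector-aligned length in bytes to 512-byte sectors. */
static uint64_t bytes_to_sectors(uint64_t bytes)
{
    assert((bytes & ((UINT64_C(1) << SECTOR_SHIFT) - 1)) == 0);
    return bytes >> SECTOR_SHIFT;
}

/* Convert a length in sectors to bytes. */
static uint64_t sectors_to_bytes(uint64_t sectors)
{
    return sectors << SECTOR_SHIFT;
}
```

      For an example 256 MiB zone, the length is 524288 sectors; storing the
      byte count in a field that expects sectors would overstate the limit by
      a factor of 512.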
      
      Fixes: 3b1a94c8 ("dm zoned: drive-managed zoned block device target")
      Cc: stable@vger.kernel.org
      Signed-off-by: Hou Tao <houtao1@huawei.com>
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      837f42b9
    • virtio-blk: free vblk-vqs in error path of virtblk_probe() · 9b4c52f8
      Hou Tao authored
      
      stable inclusion
      from linux-4.19.132
      commit a8c7823274e608903f8d72d72d18e6a4aee45c83
      
      --------------------------------
      
      [ Upstream commit e7eea44e ]
      
      Otherwise there will be a memory leak if alloc_disk() fails.
      
      Fixes: 6a27b656 ("block: virtio-blk: support multi virt queues per virtio-blk device")
      Signed-off-by: Hou Tao <houtao1@huawei.com>
      Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      9b4c52f8
    • mm/slub: fix stack overruns with SLUB_STATS · bedc0149
      Qian Cai authored
      
      stable inclusion
      from linux-4.19.132
      commit 3e632652e3dc186d61d584258becf9ca69ae2f3a
      
      --------------------------------
      
      [ Upstream commit a68ee057 ]
      
      There is no need to copy SLUB_STATS items from root memcg cache to new
      memcg cache copies.  Doing so could result in stack overruns because the
      store function only accepts 0 to clear the stat and returns an error for
      everything else while the show method would print out the whole stat.
      
      The mismatch between the lengths returned by the show and store
      methods then shows up in memcg_propagate_slab_attrs():
      
      	else if (root_cache->max_attr_size < ARRAY_SIZE(mbuf))
      		buf = mbuf;
      
      max_attr_size is only 2 from slab_attr_store(), so mbuf[64] is used
      in show_stat() later, where a bunch of sprintf() calls would overrun
      the stack variable.  Fix it by always allocating a page of buffer to be
      used in show_stat() if SLUB_STATS=y, which should only be used for
      debugging purposes.
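
      As a userspace illustration of why unbounded sprintf() into a small
      buffer is dangerous, the sketch below formats a total plus per-CPU
      counters with every write bounded by the space left. format_stats and
      its output layout are hypothetical and are not the kernel's show_stat().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical analogue of a "show" method: print a total followed by
 * per-CPU counters. Each write is limited to the space left in buf, so a
 * small buffer gets truncated output instead of an overrun.
 * Returns the number of characters stored, excluding the terminating NUL.
 */
static size_t format_stats(char *buf, size_t size,
                           const unsigned int *per_cpu, size_t ncpus)
{
    assert(buf != NULL && size > 0);
    unsigned long total = 0;
    size_t len;

    for (size_t i = 0; i < ncpus; i++)
        total += per_cpu[i];

    /* snprintf() returns the length it wanted to write; clamp to what fit. */
    int n = snprintf(buf, size, "%lu", total);
    len = (size_t)n < size ? (size_t)n : size - 1;

    for (size_t i = 0; i < ncpus && len < size - 1; i++) {
        size_t room = size - len;
        n = snprintf(buf + len, room, " C%zu=%u", i, per_cpu[i]);
        len += (size_t)n < room ? (size_t)n : room - 1;
    }
    return len;
}
```

      With an 8-byte buffer and three counters, the output is cut short but
      stays NUL-terminated inside the buffer, whereas a chain of plain
      sprintf() calls would keep writing past its end.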
      
        # echo 1 > /sys/kernel/slab/fs_cache/shrink
        BUG: KASAN: stack-out-of-bounds in number+0x421/0x6e0
        Write of size 1 at addr ffffc900256cfde0 by task kworker/76:0/53251
      
        Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 07/10/2019
        Workqueue: memcg_kmem_cache memcg_kmem_cache_create_func
        Call Trace:
          number+0x421/0x6e0
          vsnprintf+0x451/0x8e0
          sprintf+0x9e/0xd0
          show_stat+0x124/0x1d0
          alloc_slowpath_show+0x13/0x20
          __kmem_cache_create+0x47a/0x6b0
      
        addr ffffc900256cfde0 is located in stack of task kworker/76:0/53251 at offset 0 in frame:
         process_one_work+0x0/0xb90
      
        this frame has 1 object:
         [32, 72) 'lockdep_map'
      
        Memory state around the buggy address:
         ffffc900256cfc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
         ffffc900256cfd00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        >ffffc900256cfd80: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
                                                               ^
         ffffc900256cfe00: 00 00 00 00 00 f2 f2 f2 00 00 00 00 00 00 00 00
         ffffc900256cfe80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
        ==================================================================
        Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: __kmem_cache_create+0x6ac/0x6b0
        Workqueue: memcg_kmem_cache memcg_kmem_cache_create_func
        Call Trace:
          __kmem_cache_create+0x6ac/0x6b0
      
      Fixes: 107dab5c ("slub: slub-specific propagation changes")
      Signed-off-by: Qian Cai <cai@lca.pw>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Glauber Costa <glauber@scylladb.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Link: http://lkml.kernel.org/r/20200429222356.4322-1-cai@lca.pw
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      bedc0149
    • mm/slub.c: fix corrupted freechain in deactivate_slab() · b0eb7832
      Dongli Zhang authored
      
      stable inclusion
      from linux-4.19.132
      commit 6c09755c02642ea3727a87c994a1d9fab32aa8f4
      
      --------------------------------
      
      [ Upstream commit 52f23478 ]
      
      slub_debug is able to fix a corrupted slab freelist/page.
      However, alloc_debug_processing() only checks the validity of the
      current and next freepointers in the allocation path.  As a result,
      once some objects have their freepointers corrupted, deactivate_slab()
      may cause a page fault.

      The trace below is from a test kernel module run with
      'slub_debug=PUF,kmalloc-128 slub_nomerge'.  The test module corrupts
      the freepointer of one free object on purpose.  Unfortunately,
      deactivate_slab() does not detect it when iterating the freechain.
      
        BUG: unable to handle page fault for address: 00000000123456f8
        #PF: supervisor read access in kernel mode
        #PF: error_code(0x0000) - not-present page
        PGD 0 P4D 0
        Oops: 0000 [#1] SMP PTI
        ... ...
        RIP: 0010:deactivate_slab.isra.92+0xed/0x490
        ... ...
        Call Trace:
         ___slab_alloc+0x536/0x570
         __slab_alloc+0x17/0x30
         __kmalloc+0x1d9/0x200
         ext4_htree_store_dirent+0x30/0xf0
         htree_dirblock_to_tree+0xcb/0x1c0
         ext4_htree_fill_tree+0x1bc/0x2d0
         ext4_readdir+0x54f/0x920
         iterate_dir+0x88/0x190
         __x64_sys_getdents+0xa6/0x140
         do_syscall_64+0x49/0x170
         entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Therefore, this patch adds an extra consistency check in
      deactivate_slab(). Once an object's freepointer is found to be
      corrupted, all following objects starting at this object are isolated.
      
      [akpm@linux-foundation.org: fix build with CONFIG_SLAB_DEBUG=n]
      Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Joe Jin <joe.jin@oracle.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Link: http://lkml.kernel.org/r/20200331031450.12182-1-dongli.zhang@oracle.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      b0eb7832
    • mm: fix swap cache node allocation mask · 1e0effa0
      Hugh Dickins authored
      stable inclusion
      from linux-4.19.132
      commit fa11088c6f75f8da2e20dd2b80105b67f0e4f117
      
      --------------------------------
      
      [ Upstream commit 243bce09 ]
      
      Chris Murphy reports that a slightly overcommitted load, testing swap
      and zram along with i915, splats and keeps on splatting, when it had
      better fail less noisily:
      
        gnome-shell: page allocation failure: order:0,
        mode:0x400d0(__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_RECLAIMABLE),
        nodemask=(null),cpuset=/,mems_allowed=0
        CPU: 2 PID: 1155 Comm: gnome-shell Not tainted 5.7.0-1.fc33.x86_64 #1
        Call Trace:
          dump_stack+0x64/0x88
          warn_alloc.cold+0x75/0xd9
          __alloc_pages_slowpath.constprop.0+0xcfa/0xd30
          __alloc_pages_nodemask+0x2df/0x320
          alloc_slab_page+0x195/0x310
          allocate_slab+0x3c5/0x440
          ___slab_alloc+0x40c/0x5f0
          __slab_alloc+0x1c/0x30
          kmem_cache_alloc+0x20e/0x220
          xas_nomem+0x28/0x70
          add_to_swap_cache+0x321/0x400
          __read_swap_cache_async+0x105/0x240
          swap_cluster_readahead+0x22c/0x2e0
          shmem_swapin+0x8e/0xc0
          shmem_swapin_page+0x196/0x740
          shmem_getpage_gfp+0x3a2/0xa60
          shmem_read_mapping_page_gfp+0x32/0x60
          shmem_get_pages+0x155/0x5e0 [i915]
          __i915_gem_object_get_pages+0x68/0xa0 [i915]
          i915_vma_pin+0x3fe/0x6c0 [i915]
          eb_add_vma+0x10b/0x2c0 [i915]
          i915_gem_do_execbuffer+0x704/0x3430 [i915]
          i915_gem_execbuffer2_ioctl+0x1ea/0x3e0 [i915]
          drm_ioctl_kernel+0x86/0xd0 [drm]
          drm_ioctl+0x206/0x390 [drm]
          ksys_ioctl+0x82/0xc0
          __x64_sys_ioctl+0x16/0x20
          do_syscall_64+0x5b/0xf0
          entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Reported on 5.7, but it goes back really to 3.1: when
      shmem_read_mapping_page_gfp() was implemented for use by i915, and
      allowed for __GFP_NORETRY and __GFP_NOWARN flags in most places, but
      missed swapin's "& GFP_KERNEL" mask for page tree node allocation in
      __read_swap_cache_async() - that was to mask off HIGHUSER_MOVABLE bits
      from what page cache uses, but GFP_RECLAIM_MASK is now what's needed.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=208085
      Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2006151330070.11064@eggly.anvils
      
      
      Fixes: 68da9f05 ("tmpfs: pass gfp to shmem_getpage_gfp")
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
      Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Reported-by: Chris Murphy <lists@colorremedies.com>
      Analyzed-by: Vlastimil Babka <vbabka@suse.cz>
      Analyzed-by: Matthew Wilcox <willy@infradead.org>
      Tested-by: Chris Murphy <lists@colorremedies.com>
      Cc: <stable@vger.kernel.org> [3.1+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      1e0effa0
    • dm writecache: add cond_resched to loop in persistent_memory_claim() · 866f7e40
      Mikulas Patocka authored
      
      stable inclusion
      from linux-4.19.131
      commit 6cd52ae3868b677052ef749a424a7d9ecdc2db08
      
      --------------------------------
      
      commit d35bd764 upstream.
      
      Add cond_resched() to a loop that fills in the mapper memory area
      because the loop can be executed many times.
      
      Fixes: 48debafe ("dm: add writecache target")
      Cc: stable@vger.kernel.org
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      866f7e40
    • dm writecache: correct uncommitted_block when discarding uncommitted entry · 41392d7e
      Huaisheng Ye authored
      
      stable inclusion
      from linux-4.19.131
      commit a33e2d0aa4e265ae5ea06778516b863b17a7c737
      
      --------------------------------
      
      commit 39495b12 upstream.
      
      When an uncommitted entry has been discarded, correct
      wc->uncommitted_block so that it reflects the exact number.
      
      Fixes: 48debafe ("dm: add writecache target")
      Cc: stable@vger.kernel.org
      Signed-off-by: Huaisheng Ye <yehs1@lenovo.com>
      Acked-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      41392d7e
    • ring-buffer: Zero out time extend if it is nested and not absolute · 05219f8a
      Steven Rostedt (VMware) authored
      stable inclusion
      from linux-4.19.131
      commit a1de4067516d1fbf20a08c0b68e8e8e70f39c3c8
      
      --------------------------------
      
      commit 097350d1 upstream.
      
      Currently the ring buffer makes events that happen in interrupts that preempt
      another event have a delta of zero. (Hopefully we can change this soon). But
      this is to deal with the races of updating a global counter with lockless
      and nesting functions updating deltas.
      
      With the addition of absolute time stamps, the time extend didn't follow
      this rule. A time extend can happen if two events happen more than 2^27
      nanoseconds apart, as the delta time field in each event is only 27 bits.
      If that happens, then a time extend is injected with 59 bits of
      nanoseconds to use (2^59 ns, about 18 years). But if the 2^27 nanoseconds happen between
      two events, and as it is writing the event, an interrupt triggers, it will
      see the 2^27 difference as well and inject a time extend of its own. But a
      recent change made the time extend logic not take into account the nesting,
      and this can cause two time extend deltas to happen moving the time stamp
      much further ahead than the current time. This gets all reset when the ring
      buffer moves to the next page, but that can cause time to appear to go
      backwards.
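
      The arithmetic behind the 27-bit and 59-bit figures can be checked with
      a short sketch. The macro and function names are hypothetical and the
      code only models the bit widths mentioned above, not the ring buffer's
      actual event layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DELTA_BITS  27  /* width of the per-event delta field, per the text */
#define EXTEND_BITS 59  /* width available once a time extend is injected */

/* True when a delta (in ns) no longer fits in the 27-bit event field. */
static bool needs_time_extend(uint64_t delta_ns)
{
    return delta_ns >= (UINT64_C(1) << DELTA_BITS);
}

/* Largest span, in whole seconds, that fits in `bits` bits of nanoseconds. */
static uint64_t max_span_seconds(unsigned int bits)
{
    assert(bits < 64);
    return (UINT64_C(1) << bits) / UINT64_C(1000000000);
}
```

      2^27 ns is roughly 134 ms, so a delta of 306454770 ns needs a time
      extend, while 2^59 ns is 576460752 whole seconds, a little over 18
      years of 365.25 days.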
      
      This was observed in a trace-cmd recording, and since the data is saved in a
      file, with trace-cmd report --debug, it was possible to see that this indeed
      did happen!
      
        bash-52501   110d... 81778.908247: sched_switch:         bash:52501 [120] S ==> swapper/110:0 [120] [12770284:0x2e8:64]
        <idle>-0     110d... 81778.908757: sched_switch:         swapper/110:0 [120] R ==> bash:52501 [120] [509947:0x32c:64]
       TIME EXTEND: delta:306454770 length:0
        bash-52501   110.... 81779.215212: sched_swap_numa:      src_pid=52501 src_tgid=52388 src_ngid=52501 src_cpu=110 src_nid=2 dst_pid=52509 dst_tgid=52388 dst_ngid=52501 dst_cpu=49 dst_nid=1 [0:0x378:48]
       TIME EXTEND: delta:306458165 length:0
        bash-52501   110dNh. 81779.521670: sched_wakeup:         migration/110:565 [0] success=1 CPU:110 [0:0x3b4:40]
      
      and at the next page, caused the time to go backwards:
      
        bash-52504   110d... 81779.685411: sched_switch:         bash:52504 [120] S ==> swapper/110:0 [120] [8347057:0xfb4:64]
      CPU:110 [SUBBUFFER START] [81779379165886:0x1320000]
        <idle>-0     110dN.. 81779.379166: sched_wakeup:         bash:52504 [120] success=1 CPU:110 [0:0x10:40]
        <idle>-0     110d... 81779.379167: sched_switch:         swapper/110:0 [120] R ==> bash:52504 [120] [1168:0x3c:64]
      
      Link: https://lkml.kernel.org/r/20200622151815.345d1bf5@oasis.local.home
      
      
      
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Tom Zanussi <zanussi@kernel.org>
      Cc: stable@vger.kernel.org
      Fixes: dc4e2801 ("ring-buffer: Redefine the unimplemented RINGBUF_TYPE_TIME_STAMP")
      Reported-by: Julia Lawall <julia.lawall@inria.fr>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      05219f8a
    • mm/slab: use memzero_explicit() in kzfree() · ead7ad4b
      Waiman Long authored
      stable inclusion
      from linux-4.19.131
      commit 9ac47ed7c9090e0fd60b7a67f5611573b1410a95
      
      --------------------------------
      
      commit 8982ae52 upstream.
      
      The kzfree() function is normally used to clear some sensitive
      information, like encryption keys, in the buffer before freeing it back to
      the pool.  memset() is currently used for buffer clearing.  However
      unlikely, there is still a non-zero probability that the compiler may
      choose to optimize away the memory clearing, especially if LTO is
      used in the future.
      
      To make sure that this optimization will never happen,
      memzero_explicit(), which was introduced in v3.18, is now used in
      kzfree() to future-proof it.
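
      A userspace sketch of the idea, assuming a GCC- or Clang-compatible
      compiler: clear the buffer, then add a compiler barrier that names the
      buffer so the stores cannot be treated as dead. zero_explicit is a
      hypothetical name, and this is an illustration of the technique rather
      than the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Clear a buffer so that the compiler cannot drop the clearing, even when
 * the buffer is freed or goes out of scope right afterwards. The empty asm
 * statement takes the pointer as an input and clobbers memory, which forces
 * the preceding memset() to be kept. Uses GCC/Clang inline-asm syntax.
 */
static void zero_explicit(void *buf, size_t len)
{
    if (len == 0)
        return;
    assert(buf != NULL);
    memset(buf, 0, len);
    __asm__ __volatile__("" : : "r"(buf) : "memory");
}
```

      A plain memset() immediately before free() is the pattern most exposed
      to dead-store elimination, which is why a barrier of this kind is placed
      after the clear.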
      
      Link: http://lkml.kernel.org/r/20200616154311.12314-2-longman@redhat.com
      
      
      Fixes: 3ef0e5ba ("slab: introduce kzfree()")
      Signed-off-by: Waiman Long <longman@redhat.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: "Serge E. Hallyn" <serge@hallyn.com>
      Cc: Joe Perches <joe@perches.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: "Jason A . Donenfeld" <Jason@zx2c4.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      ead7ad4b
    • sched/core: Fix PI boosting between RT and DEADLINE tasks · 7bab2bb2
      Juri Lelli authored
      
      stable inclusion
      from linux-4.19.131
      commit e852bdcce9e41c26127e4b919210e3445590a1a4
      
      --------------------------------
      
      [ Upstream commit 740797ce ]
      
      syzbot reported the following warning:
      
       WARNING: CPU: 1 PID: 6351 at kernel/sched/deadline.c:628
       enqueue_task_dl+0x22da/0x38a0 kernel/sched/deadline.c:1504
      
      At deadline.c:628 we have:
      
       623 static inline void setup_new_dl_entity(struct sched_dl_entity *dl_se)
       624 {
       625 	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
       626 	struct rq *rq = rq_of_dl_rq(dl_rq);
       627
       628 	WARN_ON(dl_se->dl_boosted);
       629 	WARN_ON(dl_time_before(rq_clock(rq), dl_se->deadline));
              [...]
           }
      
      Which means that setup_new_dl_entity() has been called on a task
      currently boosted. This shouldn't happen though, as setup_new_dl_entity()
      is only called when the 'dynamic' deadline of the new entity
      is in the past w.r.t. rq_clock and boosted tasks shouldn't verify this
      condition.
      
      Digging through the PI code I noticed that the above might in fact happen
      if an RT task blocks on an rt_mutex held by a DEADLINE task. In the
      first branch of boosting conditions we check only if a pi_task 'dynamic'
      deadline is earlier than the mutex holder's, and in this case we set the
      mutex holder to be dl_boosted. However, since RT 'dynamic' deadlines are
      only initialized if such tasks get boosted at some point (or if they
      become DEADLINE of course), in general RT 'dynamic' deadlines are usually
      equal to 0 and this verifies the aforementioned condition.

      Fix it by checking that the potential donor task is actually (even if
      only temporarily, because it is in turn boosted) running at DEADLINE
      priority before using its 'dynamic' deadline value.
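
      The condition can be sketched with a toy model. The struct and function
      names below are hypothetical, and the model is deliberately simplified:
      it only covers a holder that already runs at DEADLINE priority and
      ignores the other branches of the real boosting logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_DL_PRIO 0  /* toy convention: DEADLINE tasks have prio < 0 */

struct toy_task {
    int      prio;      /* effective priority */
    uint64_t deadline;  /* 'dynamic' deadline; 0 if never initialized */
};

static bool is_dl_prio(int prio)
{
    return prio < TOY_MAX_DL_PRIO;
}

/*
 * Should the mutex holder inherit the donor's deadline?
 * The donor's deadline is consulted only when the donor really runs at
 * DEADLINE priority. An RT donor's deadline is typically 0 and would
 * otherwise always compare as "earlier".
 */
static bool should_dl_boost(const struct toy_task *holder,
                            const struct toy_task *donor)
{
    assert(holder != NULL);
    if (donor == NULL || !is_dl_prio(donor->prio))
        return false;
    /* Wraparound-safe "donor deadline is earlier" comparison. */
    return (int64_t)(donor->deadline - holder->deadline) < 0;
}
```

      Without the priority check, an RT donor with deadline 0 would pass the
      comparison against any holder whose deadline is non-zero and not far
      enough ahead to wrap, which is the situation the commit message
      describes.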
      
      Fixes: 2d3d891d ("sched/deadline: Add SCHED_DEADLINE inheritance logic")
      Reported-by: <syzbot+119ba87189432ead09b4@syzkaller.appspotmail.com>
      Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com>
      Tested-by: Daniel Wagner <dwagner@suse.de>
      Link: https://lkml.kernel.org/r/20181119153201.GB2119@localhost.localdomain
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      7bab2bb2
    • sched/deadline: Initialize ->dl_boosted · 85e86fa7
      Juri Lelli authored
      
      stable inclusion
      from linux-4.19.131
      commit edf55b5e3bde2fdba1a304b8e069154a4312f566
      
      --------------------------------
      
      [ Upstream commit ce9bc3b2 ]
      
      syzbot reported the following warning triggered via SYSC_sched_setattr():
      
        WARNING: CPU: 0 PID: 6973 at kernel/sched/deadline.c:593 setup_new_dl_entity /kernel/sched/deadline.c:594 [inline]
        WARNING: CPU: 0 PID: 6973 at kernel/sched/deadline.c:593 enqueue_dl_entity /kernel/sched/deadline.c:1370 [inline]
        WARNING: CPU: 0 PID: 6973 at kernel/sched/deadline.c:593 enqueue_task_dl+0x1c17/0x2ba0 /kernel/sched/deadline.c:1441
      
      This happens because the ->dl_boosted flag is currently not initialized by
      __dl_clear_params() (unlike the other flags) and setup_new_dl_entity()
      rightfully complains about it.
      
      Initialize dl_boosted to 0.
      
      Fixes: 2d3d891d ("sched/deadline: Add SCHED_DEADLINE inheritance logic")
      Reported-by: <syzbot+5ac8bac25f95e8b221e7@syzkaller.appspotmail.com>
      Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Tested-by: Daniel Wagner <dwagner@suse.de>
      Link: https://lkml.kernel.org/r/20200617072919.818409-1-juri.lelli@redhat.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      85e86fa7
    • efi/esrt: Fix reference count leak in esre_create_sysfs_entry. · 51fb9c05
      Qiushi Wu authored
      
      stable inclusion
      from linux-4.19.131
      commit a717bbd11e3962e8144ef3e34a0533231dd2c409
      
      --------------------------------
      
      [ Upstream commit 4ddf4739 ]
      
      kobject_init_and_add() takes a reference even when it fails.
      If this function returns an error, kobject_put() must be called to
      properly clean up the memory associated with the object. A previous
      commit, "b8eb7183", fixed a similar problem.
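
      The rule can be modelled in userspace with a toy reference-counted
      object. All names below (toy_obj, toy_init_and_add, toy_put,
      toy_create_entry) are hypothetical stand-ins, and this is a sketch of
      the contract described above rather than the kobject API itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for an object with an embedded reference count. */
struct toy_obj {
    int  refcount;
    int *released;  /* set to 1 by the release step, for demonstration */
};

/* Mimics the contract in the text: a reference is taken even on failure. */
static int toy_init_and_add(struct toy_obj *obj, int *released, int fail)
{
    obj->refcount = 1;
    obj->released = released;
    return fail ? -1 : 0;
}

/* Drop one reference; release the object when the count reaches zero. */
static void toy_put(struct toy_obj *obj)
{
    assert(obj->refcount > 0);
    if (--obj->refcount == 0) {
        *obj->released = 1;
        free(obj);
    }
}

/* Returns 0 on success; on failure the object is released via toy_put(). */
static int toy_create_entry(int fail, int *released, struct toy_obj **out)
{
    struct toy_obj *obj = malloc(sizeof(*obj));

    *out = NULL;
    if (!obj)
        return -1;
    if (toy_init_and_add(obj, released, fail)) {
        toy_put(obj);  /* drop the reference instead of freeing directly */
        return -1;
    }
    *out = obj;
    return 0;
}
```

      Routing the error path through the put function keeps all cleanup in
      the release step, so anything the init call set up before failing is
      torn down in one place.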
      
      Fixes: 0bb54905 ("efi: Add esrt support")
      Signed-off-by: Qiushi Wu <wu000273@umn.edu>
      Link: https://lore.kernel.org/r/20200528183804.4497-1-wu000273@umn.edu
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    • loop: replace kill_bdev with invalidate_bdev · 2476d843
      Zheng Bin authored
      
      stable inclusion
      from linux-4.19.131
      commit a388c0a88b7d676418b5861cfa40a159013cc6a6
      
      --------------------------------
      
      commit f4bd34b1 upstream.
      
      When a filesystem is mounted on a loop device and a LOOP_SET_STATUS64
      ioctl is issued on that device, kill_bdev destroys the buffer_head
      mappings:
      kill_bdev
        truncate_inode_pages
          truncate_inode_pages_range
            do_invalidatepage
              block_invalidatepage
                discard_buffer  -->clear BH_Mapped flag
      
      sb_bread
        __bread_gfp
        bh = __getblk_gfp
        -->discard_buffer clear BH_Mapped flag
        __bread_slow
          submit_bh
            submit_bh_wbc
              BUG_ON(!buffer_mapped(bh))  --> hit this BUG_ON
      
      Fixes: 5db470e2 ("loop: drop caches if offset or block_size are changed")
      Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    • fanotify: fix ignore mask logic for events on child and on dir · 8aa93898
      Amir Goldstein authored
      
      stable inclusion
      from linux-4.19.131
      commit 62b4f338185a9b0831c16dcd2e579c24550d8a4f
      
      --------------------------------
      
      commit 2f02fd3f upstream.
      
      The comments in fanotify_group_event_mask() say:
      
        "If the event is on dir/child and this mark doesn't care about
         events on dir/child, don't send it!"
      
      Specifically, mount and filesystem marks do not care about events
      on child, but they can still specify an ignore mask for those events.
      For example, a group that has:
      - A mount mark with mask 0 and ignore_mask FAN_OPEN
      - An inode mark on a directory with mask FAN_OPEN | FAN_OPEN_EXEC
        with flag FAN_EVENT_ON_CHILD
      
      A child file open for exec would be reported to the group with the
      FAN_OPEN event despite the fact that FAN_OPEN is in the ignore mask of
      the mount mark, because the mark iteration loop skips over non-inode
      marks for events on child when calculating the ignore mask.
      
      Move ignore mask calculation to the top of the iteration loop block
      before excluding marks for events on dir/child.
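      
      The effect of the reordering can be sketched with a small user-space
      model. This is not the kernel code: the struct, the EV_* bit values and
      the function names below are all hypothetical and exist only to show
      why the position of the ignore mask calculation matters.
      
      ```c
      #include <stdbool.h>
      #include <stddef.h>
      #include <stdint.h>
      #include <stdio.h>
      
      /* Illustrative bit values for this model only (not the fanotify ABI). */
      #define EV_OPEN       0x1u
      #define EV_OPEN_EXEC  0x2u
      #define EV_ON_CHILD   0x4u   /* flag: mark wants events on children */
      
      enum mark_type { MARK_INODE, MARK_MOUNT };
      
      struct mark {
          enum mark_type type;
          uint32_t mask;          /* events (and flags) the mark asks for */
          uint32_t ignored_mask;  /* events the mark asks to suppress */
      };
      
      /*
       * Compute which bits of an event on a child are reported to the group.
       * ignore_first == false: ignore masks of skipped marks are lost.
       * ignore_first == true:  ignore masks are collected before any mark
       *                        is skipped.
       */
      static uint32_t child_event_mask(const struct mark *marks, size_t n,
                                       uint32_t event, bool ignore_first)
      {
          uint32_t wanted = 0, ignored = 0;
      
          for (size_t i = 0; i < n; i++) {
              const struct mark *m = &marks[i];
              bool cares_about_child =
                  m->type == MARK_INODE && (m->mask & EV_ON_CHILD);
      
              if (ignore_first)
                  ignored |= m->ignored_mask;
              if (!cares_about_child)
                  continue;
              if (!ignore_first)
                  ignored |= m->ignored_mask;
              wanted |= m->mask;
          }
          return event & wanted & ~ignored;
      }
      
      /* The example group from the text: a mount mark plus a directory mark. */
      static uint32_t example_group_result(bool ignore_first)
      {
          const struct mark marks[] = {
              { MARK_MOUNT, 0, EV_OPEN },
              { MARK_INODE, EV_OPEN | EV_OPEN_EXEC | EV_ON_CHILD, 0 },
          };
      
          return child_event_mask(marks, 2, EV_OPEN | EV_OPEN_EXEC, ignore_first);
      }
      
      int main(void)
      {
          printf("old order: 0x%x\n", (unsigned)example_group_result(false));
          printf("new order: 0x%x\n", (unsigned)example_group_result(true));
          return 0;
      }
      ```
      
      In this model the old order reports both bits (0x3), while collecting
      the ignore mask first leaves only the EV_OPEN_EXEC bit (0x2).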
      
      Link: https://lore.kernel.org/r/20200524072441.18258-1-amir73il@gmail.com
      Reported-by: Jan Kara <jack@suse.cz>
      Link: https://lore.kernel.org/linux-fsdevel/20200521162443.GA26052@quack2.suse.cz/
      Fixes: 55bf882c "fanotify: fix merging marks masks with FAN_ONDIR"
      Fixes: b469e7e4 "fanotify: fix handling of events on child..."
      Signed-off-by: Amir Goldstein <amir73il@gmail.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    • md: add feature flag MD_FEATURE_RAID0_LAYOUT · d8367ae0
      NeilBrown authored
      
      stable inclusion
      from linux-4.19.130
      commit f04928c3c2627deb43acd6724991d4573a4be7c8
      
      --------------------------------
      
      [ Upstream commit 33f2c35a ]
      
      Due to a bug introduced in Linux 3.14 we cannot determine the
      correct layout for a multi-zone RAID0 array - there are two
      possibilities.
      
      It is possible to tell the kernel which to choose using a module
      parameter, but this can be clumsy to use.  It would be best if
      the choice were recorded in the metadata.
      So add a feature flag for this purpose.
      If it is set, then the 'layout' field of the superblock is used
      to determine which layout to use.
      
      If this flag is not set, then mddev->layout gets set to -1,
      which causes the module parameter to be required.
      
      Acked-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Conflicts:
        drivers/md/md.c
      [yyl: adjust context]
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      Reviewed-by: Hou Tao <houtao1@huawei.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    • kretprobe: Prevent triggering kretprobe from within kprobe_flush_task · ea29c178
      Jiri Olsa authored
      
      stable inclusion
      from linux-4.19.130
      commit 98abe944f93faf19c3707f8f188df5073e5668f8
      
      --------------------------------
      
      [ Upstream commit 9b38cc70 ]
      
      Ziqian reported lockup when adding retprobe on _raw_spin_lock_irqsave.
      My test was also able to trigger lockdep output:
      
       ============================================
       WARNING: possible recursive locking detected
       5.6.0-rc6+ #6 Not tainted
       --------------------------------------------
       sched-messaging/2767 is trying to acquire lock:
       ffffffff9a492798 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_hash_lock+0x52/0xa0
      
       but task is already holding lock:
       ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50
      
       other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock(&(kretprobe_table_locks[i].lock));
         lock(&(kretprobe_table_locks[i].lock));
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
       1 lock held by sched-messaging/2767:
        #0: ffffffff9a491a18 (&(kretprobe_table_locks[i].lock)){-.-.}, at: kretprobe_trampoline+0x0/0x50
      
       stack backtrace:
       CPU: 3 PID: 2767 Comm: sched-messaging Not tainted 5.6.0-rc6+ #6
       Call Trace:
        dump_stack+0x96/0xe0
        __lock_acquire.cold.57+0x173/0x2b7
        ? native_queued_spin_lock_slowpath+0x42b/0x9e0
        ? lockdep_hardirqs_on+0x590/0x590
        ? __lock_acquire+0xf63/0x4030
        lock_acquire+0x15a/0x3d0
        ? kretprobe_hash_lock+0x52/0xa0
        _raw_spin_lock_irqsave+0x36/0x70
        ? kretprobe_hash_lock+0x52/0xa0
        kretprobe_hash_lock+0x52/0xa0
        trampoline_handler+0xf8/0x940
        ? kprobe_fault_handler+0x380/0x380
        ? find_held_lock+0x3a/0x1c0
        kretprobe_trampoline+0x25/0x50
        ? lock_acquired+0x392/0xbc0
        ? _raw_spin_lock_irqsave+0x50/0x70
        ? __get_valid_kprobe+0x1f0/0x1f0
        ? _raw_spin_unlock_irqrestore+0x3b/0x40
        ? finish_task_switch+0x4b9/0x6d0
        ? __switch_to_asm+0x34/0x70
        ? __switch_to_asm+0x40/0x70
      
      The code within the kretprobe handler checks for probe reentrancy,
      so we won't trigger any _raw_spin_lock_irqsave probe in there.
      
      The problem is outside of it, in kprobe_flush_task, where we call:
      
        kprobe_flush_task
          kretprobe_table_lock
            raw_spin_lock_irqsave
              _raw_spin_lock_irqsave
      
      where _raw_spin_lock_irqsave triggers the kretprobe and installs
      kretprobe_trampoline handler on _raw_spin_lock_irqsave return.
      
      The kretprobe_trampoline handler is then executed with
      kretprobe_table_locks already locked, and the first thing it does is
      lock kretprobe_table_locks ;-) The whole lockup path looks like:
      
        kprobe_flush_task
          kretprobe_table_lock
            raw_spin_lock_irqsave
              _raw_spin_lock_irqsave ---> probe triggered, kretprobe_trampoline installed
      
              ---> kretprobe_table_locks locked
      
              kretprobe_trampoline
                trampoline_handler
                  kretprobe_hash_lock(current, &head, &flags);  <--- deadlock
      
      Add kprobe_busy_begin/end helpers that mark a code section as having
      a fake probe installed, which prevents another kprobe from triggering
      within that section.
      
      Use these helpers in kprobe_flush_task, so that the probe recursion
      protection check is hit and the probe is never set, which prevents
      the lockup above.
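      
      The idea of a "fake probe" marker can be modelled in user space as
      follows. This is a rough sketch, not the kernel implementation: the
      struct and the model_* names are hypothetical, and a plain global
      pointer stands in for whatever per-CPU state the kernel uses.
      
      ```c
      #include <stdbool.h>
      #include <stddef.h>
      #include <stdio.h>
      
      struct model_probe {
          unsigned long nhit;    /* hits whose handler actually ran */
          unsigned long nmissed; /* hits skipped because a probe was active */
      };
      
      /* Stands in for a per-CPU "currently running probe" pointer. */
      static struct model_probe *current_probe;
      /* Fake probe used only as a marker for busy sections. */
      static struct model_probe busy_marker;
      
      static void model_busy_begin(void) { current_probe = &busy_marker; }
      static void model_busy_end(void)   { current_probe = NULL; }
      
      /* Returns true if the handler ran, false if rejected as recursion. */
      static bool model_probe_hit(struct model_probe *p)
      {
          if (current_probe) {
              p->nmissed++;
              return false;
          }
          current_probe = p;
          p->nhit++;             /* the handler body would run here */
          current_probe = NULL;
          return true;
      }
      
      int main(void)
      {
          struct model_probe lock_probe = { 0, 0 };
      
          model_probe_hit(&lock_probe);   /* normal context: handler runs */
          model_busy_begin();
          model_probe_hit(&lock_probe);   /* inside busy section: skipped */
          model_busy_end();
          printf("hit=%lu missed=%lu\n", lock_probe.nhit, lock_probe.nmissed);
          return 0;
      }
      ```
      
      In the model, a hit inside the busy section is counted as missed and
      its handler never runs, so it cannot try to take a lock that the
      surrounding code already holds.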
      
      Link: http://lkml.kernel.org/r/158927059835.27680.7011202830041561604.stgit@devnote2
      Fixes: ef53d9c5 ("kprobes: improve kretprobe scalability with hashed locking")
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: "Gustavo A . R . Silva" <gustavoars@kernel.org>
      Cc: Anders Roxell <anders.roxell@linaro.org>
      Cc: "Naveen N . Rao" <naveen.n.rao@linux.ibm.com>
      Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Reported-by: "Ziqian SUN (Zamir)" <zsun@redhat.com>
      Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    • ext4: avoid race conditions when remounting with options that change dax · e84fcf7b
      Theodore Ts'o authored
      
      stable inclusion
      from linux-4.19.130
      commit 5dbb625573abf67b9f7d6992c8e3f57acaef5946
      
      --------------------------------
      
      [ Upstream commit 829b37b8 ]
      
      Trying to change dax mount options when remounting could allow mount
      options to be enabled for a small amount of time, and then the mount
      option change would be reverted.
      
      In the case of "mount -o remount,dax", this can cause a race where
      files would temporarily be treated as DAX --- and then not.
      
      Cc: stable@kernel.org
      Reported-by: <syzbot+bca9799bf129256190da@syzkaller.appspotmail.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    • ext4: fix partial cluster initialization when splitting extent · c02a3e78
      Sasha Levin authored
      
      stable inclusion
      from linux-4.19.130
      commit 051bf267b1fabba8a585a185eb816bf0006bdae6
      
      --------------------------------
      
      [ Upstream commit cfb3c85a ]
      
      Fix the bug when calculating the physical block number of the first
      block in the split extent.
      
      This bug occasionally causes xfstests shared/298 to fail on ext4 with
      bigalloc enabled. Ext4 error messages indicate that previously freed
      blocks are being freed again, and the following fsck fails due to
      the inconsistency between the block bitmap and the bg descriptor.
      
      The following is an example case:
      
      1. First, initialize an ext4 filesystem with cluster size '16K', block size
      '4K', in which case, one cluster contains four blocks.
      
      2. Create one file (e.g., xxx.img) on this ext4 filesystem. Now the extent
      tree of this file is like:
      
      ...
      36864:[0]4:220160
      36868:[0]14332:145408
      51200:[0]2:231424
      ...
      
      3. Then execute PUNCH_HOLE fallocate on this file. The hole range is
      like:
      
      ..
      ext4_ext_remove_space: dev 254,16 ino 12 since 49506 end 49506 depth 1
      ext4_ext_remove_space: dev 254,16 ino 12 since 49544 end 49546 depth 1
      ext4_ext_remove_space: dev 254,16 ino 12 since 49605 end 49607 depth 1
      ...
      
      4. Then the extent tree of this file after punching is like
      
      ...
      49507:[0]37:158047
      49547:[0]58:158087
      ...
      
      5. Detailed procedure of punching hole [49544, 49546]
      
      5.1. The block address space:
      ```
      lblk        ~49505  49506   49507~49543     49544~49546    49547~
      	  ---------+------+-------------+----------------+--------
      	    extent | hole |   extent	|	hole	 | extent
      	  ---------+------+-------------+----------------+--------
      pblk       ~158045  158046  158047~158083  158084~158086   158087~
      ```
      
      5.2. The detailed layout of cluster 39521:
      ```
      		cluster 39521
      	<------------------------------->
      
      		hole		  extent
      	<----------------------><--------
      
      lblk      49544   49545   49546   49547
      	+-------+-------+-------+-------+
      	|	|	|	|	|
      	+-------+-------+-------+-------+
      pblk     158084  158085  158086  158087
      ```
      
      5.3. The ftrace output when punching hole [49544, 49546]:
      - ext4_ext_remove_space (start 49544, end 49546)
        - ext4_ext_rm_leaf (start 49544, end 49546, last_extent [49507(158047), 40], partial [pclu 39522 lblk 0 state 2])
          - ext4_remove_blocks (extent [49507(158047), 40], from 49544 to 49546, partial [pclu 39522 lblk 0 state 2]
            - ext4_free_blocks: (block 158084 count 4)
              - ext4_mballoc_free (extent 1/6753/1)
      
      5.4. Ext4 error message in dmesg:
      EXT4-fs error (device vdb): mb_free_blocks:1457: group 1, block 158084:freeing already freed block (bit 6753); block bitmap corrupt.
      EXT4-fs error (device vdb): ext4_mb_generate_buddy:747: group 1, block bitmap and bg descriptor inconsistent: 19550 vs 19551 free clusters
      
      In this case, the whole cluster 39521 is freed mistakenly when freeing
      pblock 158084~158086 (i.e., the first three blocks of this cluster),
      although pblock 158087 (the last remaining block of this cluster) has
      not been freed yet.
      
      The root cause of this issue is that the pclu of the partial cluster is
      calculated mistakenly in ext4_ext_remove_space(). The correct
      partial_cluster.pclu (i.e., the cluster number of the first block in the
      next extent, that is, lblock 49547 (pblock 158087)) should be 39521 rather
      than 39522.
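      
      The cluster numbers quoted above can be checked with a few lines of
      arithmetic. The sketch below assumes the layout from the example (16K
      clusters, 4K blocks, so four blocks per cluster); the helper name
      pblk_to_cluster is made up for illustration and is not an ext4 function.
      
      ```c
      #include <stdint.h>
      #include <stdio.h>
      
      /* Map a physical block number to its cluster, given log2(blocks per cluster). */
      static uint64_t pblk_to_cluster(uint64_t pblk, unsigned int cluster_bits)
      {
          return pblk >> cluster_bits;
      }
      
      int main(void)
      {
          const unsigned int cluster_bits = 2; /* 16K cluster / 4K block = 4 blocks */
      
          for (uint64_t pblk = 158084; pblk <= 158088; pblk++)
              printf("pblk %llu -> cluster %llu\n",
                     (unsigned long long)pblk,
                     (unsigned long long)pblk_to_cluster(pblk, cluster_bits));
          return 0;
      }
      ```
      
      Blocks 158084 through 158087 all map to cluster 39521, and 158088 is the
      first block of cluster 39522. A partial cluster recorded as 39522 therefore
      does not protect block 158087, which is why freeing 158084~158086 released
      the whole of cluster 39521.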
      
      Fixes: f4226d9e ("ext4: fix partial cluster initialization")
      Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
      Reviewed-by: Eric Whitney <enwlinux@gmail.com>
      Cc: stable@kernel.org # v3.19+
      Link: https://lore.kernel.org/r/1590121124-37096-1-git-send-email-jefflexu@linux.alibaba.com
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    • selinux: fix double free · f7c7e9d0
      Tom Rix authored
      
      stable inclusion
      from linux-4.19.130
      commit cd80735a43a9c8fd5e883f8313e3ba7b27167310
      
      --------------------------------
      
      commit 65de5096 upstream.
      
      Clang's static analysis tool reports these double free memory errors.
      
      security/selinux/ss/services.c:2987:4: warning: Attempt to free released memory [unix.Malloc]
                              kfree(bnames[i]);
                              ^~~~~~~~~~~~~~~~
      security/selinux/ss/services.c:2990:2: warning: Attempt to free released memory [unix.Malloc]
              kfree(bvalues);
              ^~~~~~~~~~~~~~
      
      So improve the security_get_bools error handling by freeing these variables
      and setting their return pointers to NULL and the return len to 0.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Tom Rix <trix@redhat.com>
      Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    • arm64: hw_breakpoint: Don't invoke overflow handler on uaccess watchpoints · 77da7a99
      Will Deacon authored
      
      stable inclusion
      from linux-4.19.130
      commit a39340ab12b3234dc9b74cc878d1f6bc81dc97bd
      
      --------------------------------
      
      [ Upstream commit 24ebec25 ]
      
      Unprivileged memory accesses generated by the so-called "translated"
      instructions (e.g. STTR) at EL1 can cause EL0 watchpoints to fire
      unexpectedly if kernel debugging is enabled. In such cases, the
      hw_breakpoint logic will invoke the user overflow handler which will
      typically raise a SIGTRAP back to the current task. This is futile when
      returning back to the kernel because (a) the signal won't have been
      delivered and (b) userspace can't handle the thing anyway.
      
      Avoid invoking the user overflow handler for watchpoints triggered by
      kernel uaccess routines, and instead single-step over the faulting
      instruction as we would if no overflow handler had been installed.
      
      (Fixes tag identifies the introduction of unprivileged memory accesses,
       which exposed this latent bug in the hw_breakpoint code)
      
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Fixes: 57f4959b ("arm64: kernel: Add support for User Access Override")
      Reported-by: Luis Machado <luis.machado@linaro.org>
      Signed-off-by: Will Deacon <will@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    • lib/zlib: remove outdated and incorrect pre-increment optimization · 4c38dbd4
      Jann Horn authored
      
      stable inclusion
      from linux-4.19.130
      commit c1d9c6995d83e63c41eb982d25cbc415e3f004f0
      
      --------------------------------
      
      [ Upstream commit acaab733 ]
      
      The zlib inflate code has an old micro-optimization based on the
      assumption that for pre-increment memory accesses, the compiler will
      generate code that fits better into the processor's pipeline than what
      would be generated for post-increment memory accesses.
      
      This optimization was already removed in upstream zlib in 2016:
      https://github.com/madler/zlib/commit/9aaec95e8211
      
      This optimization causes UB according to C99, which says in section 6.5.6
      "Additive operators": "If both the pointer operand and the result point to
      elements of the same array object, or one past the last element of the
      array object, the evaluation shall not produce an overflow; otherwise, the
      behavior is undefined".
      
      This UB is not only a theoretical concern, but can also cause trouble for
      future work on compiler-based sanitizers.
      
      According to the zlib commit, this optimization also is not optimal
      anymore with modern compilers.
      
      Replace uses of OFF, PUP and UP_UNALIGNED with their definitions in the
      POSTINC case, and remove the macro definitions, just like in the upstream
      patch.
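      
      The difference between the two access styles can be illustrated with a
      simple byte copy. This is a sketch of the general pattern only, not the
      zlib macros or the inflate code; the function name copy_postinc is
      hypothetical.
      
      ```c
      #include <stddef.h>
      #include <stdio.h>
      #include <string.h>
      
      /*
       * Post-increment copy: every pointer value formed here stays within
       * the array or one past its last element, which C99 6.5.6 permits.
       */
      static void copy_postinc(unsigned char *out, const unsigned char *from,
                               size_t len)
      {
          while (len--)
              *out++ = *from++;
      }
      
      /*
       * The pre-increment style would start one element before each buffer
       * and use *++p. Merely computing (src - 1) yields a pointer outside
       * the array, which is the undefined behaviour described above, so it
       * is shown only as a comment:
       *
       *     from = src - 1; out = dst - 1;
       *     while (len--) *++out = *++from;
       */
      
      int main(void)
      {
          const unsigned char src[] = "inflate";
          unsigned char dst[sizeof(src)];
      
          copy_postinc(dst, src, sizeof(src));
          printf("%s\n", (const char *)dst);
          return 0;
      }
      ```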
      
      Signed-off-by: Jann Horn <jannh@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Mikhail Zaslonko <zaslonko@linux.ibm.com>
      Link: http://lkml.kernel.org/r/20200507123112.252723-1-jannh@google.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    • vfio/mdev: Fix reference count leak in add_mdev_supported_type · 364e9f2d
      Qiushi Wu authored
      
      stable inclusion
      from linux-4.19.130
      commit 1a98e4ef324d4735b0a8240717b4e2d1bb9adfd8
      
      --------------------------------
      
      [ Upstream commit aa8ba13c ]
      
      kobject_init_and_add() takes a reference even when it fails.
      If this function returns an error, kobject_put() must be called to
      properly clean up the memory associated with the object. Thus,
      replace kfree() by kobject_put() to fix this issue. Previous
      commit "b8eb7183" fixed a similar problem.
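      
      Why the error path needs a "put" rather than a plain free can be shown
      with a user-space model of a reference-counted object. Everything below
      (struct model_kobj and the model_* and create_entry functions) is
      hypothetical and only mimics the rule stated above; it is not the
      kobject API.
      
      ```c
      #include <stdbool.h>
      #include <stdio.h>
      #include <stdlib.h>
      
      struct model_kobj {
          int refcount;
      };
      
      static int released_count; /* how many objects the release path freed */
      
      static void model_release(struct model_kobj *k)
      {
          released_count++;
          free(k);
      }
      
      /* Drop one reference; the last put runs the release path. */
      static void model_put(struct model_kobj *k)
      {
          if (--k->refcount == 0)
              model_release(k);
      }
      
      /* Takes the initial reference first, so it is held even if "add" fails. */
      static int model_init_and_add(struct model_kobj *k, bool add_fails)
      {
          k->refcount = 1;
          return add_fails ? -1 : 0;
      }
      
      static int create_entry(bool add_fails)
      {
          struct model_kobj *k = malloc(sizeof(*k));
          int rc;
      
          if (!k)
              return -1;
          rc = model_init_and_add(k, add_fails);
          if (rc) {
              model_put(k); /* drop the reference taken above; release frees k */
              return rc;
          }
          model_put(k); /* demo only: a real owner would keep the object alive */
          return 0;
      }
      
      int main(void)
      {
          int rc = create_entry(true);
      
          printf("rc=%d released=%d\n", rc, released_count);
          return 0;
      }
      ```
      
      In the model, returning on failure without model_put() would leave the
      reference count at 1 and the release path would never run.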
      
      Fixes: 7b96953b ("vfio: Mediated device Core driver")
      Signed-off-by: Qiushi Wu <wu000273@umn.edu>
      Reviewed-by: Cornelia Huck <cohuck@redhat.com>
      Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    • PCI: dwc: Fix inner MSI IRQ domain registration · 5e6d265a
      Marc Zyngier authored
      
      stable inclusion
      from linux-4.19.130
      commit ff6136914a9b0704f3495d22624571dd9de162bd
      
      --------------------------------
      
      [ Upstream commit 0414b93e ]
      
      On a system that uses the internal DWC MSI widget, I get this
      warning from debugfs when CONFIG_GENERIC_IRQ_DEBUGFS is selected:
      
        debugfs: File ':soc:pcie@fc000000' in directory 'domains' already present!
      
      This is due to the fact that the DWC MSI code tries to register two
      IRQ domains for the same firmware node, without telling the low
      level code how to distinguish them (by setting a bus token). This
      further confuses debugfs which tries to create corresponding
      files for each domain.
      
      Fix it by tagging the inner domain as DOMAIN_BUS_NEXUS, which is
      the closest thing we have to "generic MSI".
      
      Link: https://lore.kernel.org/r/20200501113921.366597-1-maz@kernel.org
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Acked-by: Jingoo Han <jingoohan1@gmail.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    • dm zoned: return NULL if dmz_get_zone_for_reclaim() fails to find a zone · 2dfbd903
      Hannes Reinecke authored
      
      stable inclusion
      from linux-4.19.130
      commit 4ad7add5077880f232190962f309907d4f8e74f3
      
      --------------------------------
      
      [ Upstream commit 489dc0f0 ]
      
      The only case where dmz_get_zone_for_reclaim() cannot return a zone is
      if the respective lists are empty. So we should just return a simple
      NULL value here as we really don't have an error code which would make
      sense.
      
      Signed-off-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    • ipmi: use vzalloc instead of kmalloc for user creation · 8cb3959a
      Feng Tang authored
      
      stable inclusion
      from linux-4.19.130
      commit 1d9c47a329a3045fbb761d10aa2930f8811011dc
      
      --------------------------------
      
      [ Upstream commit 7c47a219 ]
      
      We have repeatedly seen bmc-watchdog fail to start because of a
      runtime failure of an order-4 memory allocation.
      
           bmc-watchdog: page allocation failure: order:4, mode:0x40cc0(GFP_KERNEL|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0-1
           CPU: 1 PID: 2571 Comm: bmc-watchdog Not tainted 5.5.0-00045-g7d6bb61d6188c #1
           Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.00.01.0015.110720180833 11/07/2018
           Call Trace:
            dump_stack+0x66/0x8b
            warn_alloc+0xfe/0x160
            __alloc_pages_slowpath+0xd3e/0xd80
            __alloc_pages_nodemask+0x2f0/0x340
            kmalloc_order+0x18/0x70
            kmalloc_order_trace+0x1d/0xb0
            ipmi_create_user+0x55/0x2c0 [ipmi_msghandler]
            ipmi_open+0x72/0x110 [ipmi_devintf]
            chrdev_open+0xcb/0x1e0
            do_dentry_open+0x1ce/0x380
            path_openat+0x305/0x14f0
            do_filp_open+0x9b/0x110
            do_sys_open+0x1bd/0x250
            do_syscall_64+0x5b/0x1f0
            entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Using vzalloc/vfree to allocate and free ipmi_user fixes the
      problem.
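      
      For a sense of scale, an order-N page allocation asks for 2^N physically
      contiguous pages. The sketch below assumes a 4 KiB page size, which is an
      assumption about the reporting machine and is not stated in the trace;
      the helper name order_to_bytes is made up for illustration.
      
      ```c
      #include <stddef.h>
      #include <stdio.h>
      
      /* Size in bytes of an order-N allocation: page_size * 2^order. */
      static size_t order_to_bytes(unsigned int order, size_t page_size)
      {
          return page_size << order;
      }
      
      int main(void)
      {
          /* Assumed 4 KiB pages: order 4 means 16 contiguous pages. */
          printf("order 4 = %zu bytes\n", order_to_bytes(4, 4096));
          return 0;
      }
      ```
      
      Under that assumption the failed request was for 64 KiB of physically
      contiguous memory. vzalloc() only needs the memory to be virtually
      contiguous, so it does not depend on finding such a run of free pages.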
      
      Thanks to Stephen Rothwell for finding the vmalloc.h
      inclusion issue.
      
      Signed-off-by: Feng Tang <feng.tang@intel.com>
      Signed-off-by: Corey Minyard <cminyard@mvista.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    • PCI: Fix pci_register_host_bridge() device_register() error handling · 8024756b
      Rob Herring authored
      
      stable inclusion
      from linux-4.19.130
      commit 5d78c4a343f4e89b47b27d382b117ce70a1138a5
      
      --------------------------------
      
      [ Upstream commit 1b54ae83 ]
      
      If device_register() has an error, we should bail out of
      pci_register_host_bridge() rather than continuing on.
      
      Fixes: 37d6a0a6 ("PCI: Add pci_register_host_bridge() interface")
      Link: https://lore.kernel.org/r/20200513223859.11295-1-robh@kernel.org
      Signed-off-by: Rob Herring <robh@kernel.org>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    • drivers: base: Fix NULL pointer exception in __platform_driver_probe() if a... · 2db112b7
      Kuppuswamy Sathyanarayanan authored
      drivers: base: Fix NULL pointer exception in __platform_driver_probe() if a driver developer is foolish
      
      stable inclusion
      from linux-4.19.130
      commit cd74de676af119f4110be9f828f22da953dd94b0
      
      --------------------------------
      
      [ Upstream commit 388bcc6e ]
      
      If platform bus driver registration fails, accessing the
      platform bus spin lock (&drv->driver.bus->p->klist_drivers.k_lock)
      in __platform_driver_probe() without verifying the return value of
      __platform_driver_register() can lead to a NULL pointer exception.
      
      So check the return value before attempting the spin lock.
      
      One such example is below:
      
      For a custom use case, I intentionally failed the platform bus
      registration and expected all the platform device/driver
      registrations to fail gracefully. But I came across this panic
      issue.
      
      [    1.331067] BUG: kernel NULL pointer dereference, address: 00000000000000c8
      [    1.331118] #PF: supervisor write access in kernel mode
      [    1.331163] #PF: error_code(0x0002) - not-present page
      [    1.331208] PGD 0 P4D 0
      [    1.331233] Oops: 0002 [#1] PREEMPT SMP
      [    1.331268] CPU: 3 PID: 1 Comm: swapper/0 Tainted: G        W         5.6.0-00049-g670d35fb0144 #165
      [    1.331341] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
      [    1.331406] RIP: 0010:_raw_spin_lock+0x15/0x30
      [    1.331588] RSP: 0000:ffffc9000001be70 EFLAGS: 00010246
      [    1.331632] RAX: 0000000000000000 RBX: 00000000000000c8 RCX: 0000000000000001
      [    1.331696] RDX: 0000000000000001 RSI: 0000000000000092 RDI: 0000000000000000
      [    1.331754] RBP: 00000000ffffffed R08: 0000000000000501 R09: 0000000000000001
      [    1.331817] R10: ffff88817abcc520 R11: 0000000000000670 R12: 00000000ffffffed
      [    1.331881] R13: ffffffff82dbc268 R14: ffffffff832f070a R15: 0000000000000000
      [    1.331945] FS:  0000000000000000(0000) GS:ffff88817bd80000(0000) knlGS:0000000000000000
      [    1.332008] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [    1.332062] CR2: 00000000000000c8 CR3: 000000000681e001 CR4: 00000000003606e0
      [    1.332126] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [    1.332189] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [    1.332252] Call Trace:
      [    1.332281]  __platform_driver_probe+0x92/0xee
      [    1.332323]  ? rtc_dev_init+0x2b/0x2b
      [    1.332358]  cmos_init+0x37/0x67
      [    1.332396]  do_one_initcall+0x7d/0x168
      [    1.332428]  kernel_init_freeable+0x16c/0x1c9
      [    1.332473]  ? rest_init+0xc0/0xc0
      [    1.332508]  kernel_init+0x5/0x100
      [    1.332543]  ret_from_fork+0x1f/0x30
      [    1.332579] CR2: 00000000000000c8
      [    1.332616] ---[ end trace 3bd87f12e9010b87 ]---
      [    1.333549] note: swapper/0[1] exited with preempt_count 1
      [    1.333592] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
      [    1.333736] Kernel Offset: disabled
      
      Note, this can only be triggered if a driver errors out from this call,
      which should never happen.  If it does, the driver needs to be fixed.
      
      Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
      Link: https://lore.kernel.org/r/20200408214003.3356-1-sathyanarayanan.kuppuswamy@linux.intel.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    • scsi: sr: Fix sr_probe() missing deallocate of device minor · c5472765
      Simon Arlott authored
      
      stable inclusion
      from linux-4.19.130
      commit 5d41879d2ea51b1dd37798d5fdb597d55359caed
      
      --------------------------------
      
      [ Upstream commit 6555781b ]
      
      If the cdrom fails to be registered then the device minor should be
      deallocated.
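
The error-path rule above can be sketched in userspace as follows. This is only an illustration under stated assumptions: `alloc_minor()`, `free_minor()`, `register_cdrom_stub()` and `probe()` are hypothetical stand-ins, not the functions used by the sr driver.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_MINORS 8

/* Hypothetical minor-number table; the real driver uses its own allocator. */
static bool minor_used[MAX_MINORS];

/* Claim the lowest free minor, or return -1 if none is left. */
static int alloc_minor(void)
{
    for (int i = 0; i < MAX_MINORS; i++) {
        if (!minor_used[i]) {
            minor_used[i] = true;
            return i;
        }
    }
    return -1;
}

/* Give a previously claimed minor back to the table. */
static void free_minor(int minor)
{
    assert(minor >= 0 && minor < MAX_MINORS);
    minor_used[minor] = false;
}

/* Hypothetical registration step that can be forced to fail. */
static int register_cdrom_stub(bool should_fail)
{
    return should_fail ? -1 : 0;
}

/* Probe: every resource claimed before a failure is released again. */
static int probe(bool register_fails)
{
    int minor = alloc_minor();

    if (minor < 0)
        return -1;

    if (register_cdrom_stub(register_fails)) {
        free_minor(minor);   /* without this, each failed probe leaks a minor */
        return -1;
    }
    return minor;
}
```

In this sketch, calling `probe(true)` more than `MAX_MINORS` times and then `probe(false)` still succeeds, because failed probes hand their minor back.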
      
      Link: https://lore.kernel.org/r/072dac4b-8402-4de8-36bd-47e7588969cd@0882a8b5-c6c3-11e9-b005-00805fc181fe
      
      
      Signed-off-by: Simon Arlott <simon@octiron.net>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      c5472765
    • Qian Cai's avatar
      vfio/pci: fix memory leaks in alloc_perm_bits() · aabdfc7c
      Qian Cai authored
      
      stable inclusion
      from linux-4.19.130
      commit 78914f1689701e997a17891644117539d6c103cf
      
      --------------------------------
      
      [ Upstream commit 3e63b94b ]
      
      vfio_pci_disable() calls vfio_config_free() but forgets to call
      free_perm_bits(), resulting in memory leaks:
      
      unreferenced object 0xc000000c4db2dee0 (size 16):
        comm "qemu-kvm", pid 4305, jiffies 4295020272 (age 3463.780s)
        hex dump (first 16 bytes):
          00 00 ff 00 ff ff ff ff ff ff ff ff ff ff 00 00  ................
        backtrace:
          [<00000000a6a4552d>] alloc_perm_bits+0x58/0xe0 [vfio_pci]
          [<00000000ac990549>] vfio_config_init+0xdf0/0x11b0 [vfio_pci]
          init_pci_cap_msi_perm at drivers/vfio/pci/vfio_pci_config.c:1125
          (inlined by) vfio_msi_cap_len at drivers/vfio/pci/vfio_pci_config.c:1180
          (inlined by) vfio_cap_len at drivers/vfio/pci/vfio_pci_config.c:1241
          (inlined by) vfio_cap_init at drivers/vfio/pci/vfio_pci_config.c:1468
          (inlined by) vfio_config_init at drivers/vfio/pci/vfio_pci_config.c:1707
          [<000000006db873a1>] vfio_pci_open+0x234/0x700 [vfio_pci]
          [<00000000630e1906>] vfio_group_fops_unl_ioctl+0x8e0/0xb84 [vfio]
          [<000000009e34c54f>] ksys_ioctl+0xd8/0x130
          [<000000006577923d>] sys_ioctl+0x28/0x40
          [<000000006d7b1cf2>] system_call_exception+0x114/0x1e0
          [<0000000008ea7dd5>] system_call_common+0xf0/0x278
      unreferenced object 0xc000000c4db2e330 (size 16):
        comm "qemu-kvm", pid 4305, jiffies 4295020272 (age 3463.780s)
        hex dump (first 16 bytes):
          00 ff ff 00 ff ff ff ff ff ff ff ff ff ff 00 00  ................
        backtrace:
          [<000000004c71914f>] alloc_perm_bits+0x44/0xe0 [vfio_pci]
          [<00000000ac990549>] vfio_config_init+0xdf0/0x11b0 [vfio_pci]
          [<000000006db873a1>] vfio_pci_open+0x234/0x700 [vfio_pci]
          [<00000000630e1906>] vfio_group_fops_unl_ioctl+0x8e0/0xb84 [vfio]
          [<000000009e34c54f>] ksys_ioctl+0xd8/0x130
          [<000000006577923d>] sys_ioctl+0x28/0x40
          [<000000006d7b1cf2>] system_call_exception+0x114/0x1e0
          [<0000000008ea7dd5>] system_call_common+0xf0/0x278
      
      Fixes: 89e1f7d4 ("vfio: Add PCI device driver")
      Signed-off-by: Qian Cai <cai@lca.pw>
      [aw: rolled in follow-up patch]
      Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      aabdfc7c
    • Ard Biesheuvel's avatar
      PCI: Allow pci_resize_resource() for devices on root bus · afa0af94
      Ard Biesheuvel authored
      stable inclusion
      from linux-4.19.130
      commit f7ec605600a3cbed562049600d9a668cb34f6978
      
      --------------------------------
      
      [ Upstream commit d09ddd81 ]
      
      When resizing a BAR, pci_reassign_bridge_resources() is invoked to bring
      the bridge windows of parent bridges in line with the new BAR assignment.
      
      This assumes the device whose BAR is being resized lives on a subordinate
      bus, but this is not necessarily the case. A device may live on the root
      bus, in which case dev->bus->self is NULL, and passing a NULL pci_dev
      pointer to pci_reassign_bridge_resources() will cause it to crash.
      
      So let's make the call to pci_reassign_bridge_resources() conditional on
      whether dev->bus->self is non-NULL in the first place.
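
A minimal userspace sketch of the guard described above. The types and functions here (`struct bus`, `struct dev`, `reassign_bridge_resources()`, `resize_resource()`) are simplified, hypothetical stand-ins for the kernel's structures and are not the actual patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's bus and device structures. */
struct dev;
struct bus {
    struct dev *self;        /* bridge leading to this bus; NULL on a root bus */
};
struct dev {
    struct bus *bus;         /* bus the device sits on */
    int windows_updated;     /* set when this device's bridge windows are redone */
};

/* Stand-in for the bridge reassignment step: it dereferences its argument,
 * so a NULL bridge must never reach it. */
static int reassign_bridge_resources(struct dev *bridge)
{
    assert(bridge != NULL);
    bridge->windows_updated = 1;
    return 0;
}

/* Only walk up to the parent bridge when there is one. */
static int resize_resource(struct dev *dev)
{
    if (dev->bus->self)
        return reassign_bridge_resources(dev->bus->self);

    return 0;                /* root-bus device: no bridge windows to adjust */
}
```

With a device on a root bus (`self == NULL`) the call returns without touching any bridge, while a device behind a bridge still gets its parent's windows updated.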
      
      Fixes: 8bb705e3 ("PCI: Add pci_resize_resource() for resizing BARs")
      Link: https://lore.kernel.org/r/20200421162256.26887-1-ardb@kernel.org
      
      
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      afa0af94
    • Corey Minyard's avatar
      ipmi: fix sleep-in-atomic in free_user at cleanup SRCU user->release_barrier · a09b461d
      Corey Minyard authored
      
      stable inclusion
      from linux-4.19.37
      commit dacdbc115d23f1e4dbc091440bd5cdcdc01173b6
      
      --------------------------------
      
      commit 3b9a9072 upstream.
      
      free_user() could be called in atomic context.
      
      This patch pushed the free operation off into a workqueue.
      
      Example:
      
       BUG: sleeping function called from invalid context at kernel/workqueue.c:2856
       in_atomic(): 1, irqs_disabled(): 0, pid: 177, name: ksoftirqd/27
       CPU: 27 PID: 177 Comm: ksoftirqd/27 Not tainted 4.19.25-3 #1
       Hardware name: AIC 1S-HV26-08/MB-DPSB04-06, BIOS IVYBV060 10/21/2015
       Call Trace:
        dump_stack+0x5c/0x7b
        ___might_sleep+0xec/0x110
        __flush_work+0x48/0x1f0
        ? try_to_del_timer_sync+0x4d/0x80
        _cleanup_srcu_struct+0x104/0x140
        free_user+0x18/0x30 [ipmi_msghandler]
        ipmi_free_recv_msg+0x3a/0x50 [ipmi_msghandler]
        deliver_response+0xbd/0xd0 [ipmi_msghandler]
        deliver_local_response+0xe/0x30 [ipmi_msghandler]
        handle_one_recv_msg+0x163/0xc80 [ipmi_msghandler]
        ? dequeue_entity+0xa0/0x960
        handle_new_recv_msgs+0x15c/0x1f0 [ipmi_msghandler]
        tasklet_action_common.isra.22+0x103/0x120
        __do_softirq+0xf8/0x2d7
        run_ksoftirqd+0x26/0x50
        smpboot_thread_fn+0x11d/0x1e0
        kthread+0x103/0x140
        ? sort_range+0x20/0x20
        ? kthread_destroy_worker+0x40/0x40
        ret_from_fork+0x1f/0x40
      
      Fixes: 77f82696 ("ipmi: fix use-after-free of user->release_barrier.rda")
      
      Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: Corey Minyard <cminyard@mvista.com>
      Cc: stable@vger.kernel.org # 5.0
      Cc: Yang Yingliang <yangyingliang@huawei.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      a09b461d
    • Yang Yingliang's avatar
      Revert "ipmi: fix sleep-in-atomic in free_user at cleanup SRCU user->release_barrier" · 4a5c19a7
      Yang Yingliang authored
      
      hulk inclusion
      category: bugfix
      bugzilla: 5499
      CVE: NA
      
      ---------------------------
      
      cleanup_srcu_struct_quiesced() won't free memory if work_pending()
      is true, which causes a memory leak, so revert it and use the
      mainline patch instead.
      
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      4a5c19a7
    • Douglas Anderson's avatar
      kernel/cpu_pm: Fix uninitted local in cpu_pm · 48883605
      Douglas Anderson authored
      
      stable inclusion
      from linux-4.19.129
      commit 262c6e883e057ea186186f80f2e6e3fd614f7bd0
      
      --------------------------------
      
      commit b5945214 upstream.
      
      cpu_pm_notify() is basically a wrapper of notifier_call_chain().
      notifier_call_chain() doesn't initialize *nr_calls to 0 before it
      starts incrementing it--presumably it's up to the callers to do this.
      
      Unfortunately the callers of cpu_pm_notify() don't init *nr_calls.
      This potentially means you could get too many or too few calls to
      CPU_PM_ENTER_FAILED or CPU_CLUSTER_PM_ENTER_FAILED depending on the
      luck of the stack.
      
      Let's fix this.
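
The hazard can be sketched in plain C. Everything below is a simplified, hypothetical model (`call_chain()`, `notify_enter()`, and the `ok`/`fail` callbacks are illustrative names), assuming only what the text states: the chain walker increments the counter without zeroing it, so the caller has to.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*notifier_fn)(void);

/* Stand-in for the chain walker: counts callbacks invoked by incrementing
 * *nr_calls, but never sets it to 0 first. Stops at the first failure. */
static int call_chain(notifier_fn *chain, size_t n, int *nr_calls)
{
    assert(chain != NULL);

    for (size_t i = 0; i < n; i++) {
        int ret = chain[i]();

        if (nr_calls)
            (*nr_calls)++;
        if (ret)
            return ret;
    }
    return 0;
}

/* Example callbacks: one succeeds, one reports an error. */
static int ok(void)   { return 0; }
static int fail(void) { return -1; }

/* Caller-side fix: zero the counter before handing it to the walker.
 * Without the assignment, the result depends on whatever value the
 * counter happened to hold. */
static int notify_enter(notifier_fn *chain, size_t n, int *nr_calls)
{
    *nr_calls = 0;
    return call_chain(chain, n, nr_calls);
}
```

For a chain of `ok, fail, ok`, `notify_enter()` reports the failure and leaves the counter at 2, regardless of what the counter contained beforehand; that count is what a rollback pass would rely on.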
      
      Fixes: ab10023e ("cpu_pm: Add cpu power management notifiers")
      Cc: stable@vger.kernel.org
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reviewed-by: Stephen Boyd <swboyd@chromium.org>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Link: https://lore.kernel.org/r/20200504104917.v6.3.I2d44fc0053d019f239527a4e5829416714b7e299@changeid
      
      
      Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      48883605
    • Eric Biggers's avatar
      ext4: fix race between ext4_sync_parent() and rename() · f229d18d
      Eric Biggers authored
      
      stable inclusion
      from linux-4.19.129
      commit 8f3f5ba25e2b811be915c1d86cf8d7847287339d
      
      --------------------------------
      
      commit 08adf452 upstream.
      
      'igrab(d_inode(dentry->d_parent))' without holding dentry->d_lock is
      broken because without d_lock, d_parent can be concurrently changed due
      to a rename().  Then if the old directory is immediately deleted, old
      d_parent->d_inode can be NULL.  That causes a NULL dereference in igrab().
      
      To fix this, use dget_parent() to safely grab a reference to the parent
      dentry, which pins the inode.  This also eliminates the need to use
      d_find_any_alias() other than for the initial inode, as we no longer
      throw away the dentry at each step.
      
      This is an extremely hard race to hit, but it is possible.  Adding a
      udelay() in between the reads of ->d_parent and its ->d_inode makes it
      reproducible on a no-journal filesystem using the following program:
      
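          #include <fcntl.h>
          #include <stdio.h>      /* rename() */
          #include <sys/stat.h>   /* mkdir() */
          #include <unistd.h>

          int main(void)
          {
              if (fork()) {
                  /* Parent: keep recreating dir1/file and writing to it with O_SYNC. */
                  for (;;) {
                      mkdir("dir1", 0700);
                      /* O_CREAT requires an explicit mode argument. */
                      int fd = open("dir1/file", O_RDWR|O_CREAT|O_SYNC, 0600);
                      write(fd, "X", 1);
                      close(fd);
                  }
              } else {
                  /* Child: concurrently move the file away and delete its old parent. */
                  mkdir("dir2", 0700);
                  for (;;) {
                      rename("dir1/file", "dir2/file");
                      rmdir("dir1");
                  }
              }
          }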
      
      Fixes: d59729f4 ("ext4: fix races in ext4_sync_parent()")
      Cc: stable@vger.kernel.org
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Link: https://lore.kernel.org/r/20200506183140.541194-1-ebiggers@kernel.org
      
      
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      f229d18d
    • Harshad Shirwadkar's avatar
      ext4: fix EXT_MAX_EXTENT/INDEX to check for zeroed eh_max · 8442633f
      Harshad Shirwadkar authored
      
      stable inclusion
      from linux-4.19.129
      commit acbec3dd4586d271a0248453d2810712439ded1b
      
      --------------------------------
      
      commit c36a71b4 upstream.
      
      If eh->eh_max is 0, EXT_MAX_EXTENT/INDEX would evaluate to unsigned
      (-1) resulting in illegal memory accesses. Although there is no
      consistent repro, we see that generic/019 sometimes crashes because of
      this bug.
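
The wraparound can be illustrated with a small sketch. It uses array indices rather than the pointer arithmetic of the real macros, and `struct hdr`, `last_slot_unguarded()` and `last_slot()` are hypothetical names; the zero check shown is an assumption about the general shape of such a guard, not a copy of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header carrying only the capacity field discussed above. */
struct hdr {
    uint16_t eh_max;   /* number of slots the node can hold */
};

/* Unguarded "last slot" index: when eh_max is 0, the unsigned subtraction
 * wraps around to SIZE_MAX, far outside any real array. */
static size_t last_slot_unguarded(const struct hdr *h)
{
    return (size_t)h->eh_max - 1;
}

/* Guarded variant: returns 0 ("no valid slot") for a zeroed eh_max and
 * leaves *out untouched; otherwise stores the index and returns 1. */
static int last_slot(const struct hdr *h, size_t *out)
{
    assert(out != NULL);

    if (h->eh_max == 0)
        return 0;

    *out = (size_t)h->eh_max - 1;
    return 1;
}
```

Callers of the guarded variant have to handle the "no valid slot" result explicitly instead of indexing with a wrapped value.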
      
      Ran gce-xfstests smoke and verified that there were no regressions.
      
      Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
      Link: https://lore.kernel.org/r/20200421023959.20879-2-harshadshirwadkar@gmail.com
      
      
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: stable@kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
      8442633f