- Dec 27, 2019
-
-
euler inclusion category: bugfix bugzilla: 9513/11006/11050 CVE: NA --------------------------------------------------
[ Cheng Jian: HULK-Syzkaller reported a problem that syzbot had already reported to mainline (lkml); this patch comes from the reply on lkml. v1 https://lkml.org/lkml/2019/2/28/529 v2 https://lkml.org/lkml/2019/3/8/206 We merged v1 first, but it caused bugzilla #11050, because we also use perf_remove_from_context() in perf_event_open() when we move events from a SW context to a HW context, so we can't destroy the event there. v2 does not exhibit that warning. It does the same thing as another patch at https://lkml.org/lkml/2019/3/8/536 but is clearer. ]
First, we have a race between perf_event_release_kernel() and perf_free_event(), which happens when the parent's event is released while the child's fork fails (because of a fatal signal, for example). It looks like this:

    cpu X                            cpu Y
    -----                            -----
                                     copy_process() error path
    perf_release(parent)             +->perf_event_free_task()
    +-> lock(child_ctx->mutex)       |  |
    +-> remove_from_context(child)   |  |
    +-> unlock(child_ctx->mutex)     |  |
    |                                |  +-> lock(child_ctx->mutex)
    |                                |  +-> unlock(child_ctx->mutex)
    |                                +-> free_task(child_task)
    +-> put_task_struct(child_task)

Technically, we're still holding a reference to the task via parent->hw.target, but that's not stopping free_task(), so we end up poking at free'd memory, as is pointed out by KASAN in the syzkaller report (see Link below). The straightforward fix is to drop the hw.target reference while the task is still around. Therein lies the second problem: the users of hw.target (uprobe) assume that it's around at ->destroy() callback time, where they use it for context. So, in order to not break the uprobe teardown and avoid leaking stuff, we need to call ->destroy() at the same time. This patch fixes the race and the subsequent fallout by doing both these things at remove_from_context time. Signed-off-by:
Alexander Shishkin <alexander.shishkin@linux.intel.com> Link: https://syzkaller.appspot.com/bug?extid=a24c397a29ad22d86c98 Signed-off-by:
Cheng Jian <cj.chengjian@huawei.com> Reviewed-by:
Xie XiuQi <xiexiuqi@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
euler inclusion category: bugfix bugzilla: 10679 CVE: NA --------------------------- We want iocb_put() happening on errors, to balance the extra reference we'd taken. As it is, we end up with a leak. The rules should be * error: iocb_put() to deal with the extra ref, return error, let the caller do another iocb_put(). * async: iocb_put() to deal with the extra ref, return 0. * no error, event present immediately: aio_poll_complete() to report it, iocb_put() to deal with the extra ref, return 0. Link: https://patchwork.kernel.org/patch/10842103/ Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
zhengbin <zhengbin13@huawei.com> Reviewed-by:
zhangyi (F) <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
euler inclusion category: bugfix bugzilla: 10679 CVE: NA --------------------------- In case of early wakeups, aio_poll() assumes that aio_poll_complete() has either already happened or is imminent. In that case we do not want to put iocb on the list of cancellables. However, ignored wakeups need to be treated as if wakeup has not happened at all. Trivially fixed by having aio_poll_wake() set ->woken only after it's committed to taking iocb out of the waitqueue. Link: https://patchwork.kernel.org/patch/10842107/ Suggested-by:
zhengbin <zhengbin13@huawei.com> Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
zhengbin <zhengbin13@huawei.com> Reviewed-by:
zhangyi (F) <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
euler inclusion category: bugfix bugzilla: 11043 CVE: NA --------------------------- All indirect buffers obtained by ext4_find_shared() should be released no matter whether the branch should be freed or not. But currently we forget to release the lower-depth indirect buffers when removing space from the same higher-depth indirect block. This leads to a buffer leak and, furthermore, may lead to quota information corruption when using old quota. Consider the following case:
- Create and mount an empty ext4 filesystem without the extent and quota features.
- Run quotacheck and enable the user & group quota.
- Create some files, write some data to them, and then punch holes in some of them; this may trigger the buffer leak described above.
- Disable quota and run quotacheck again; it will create two new aquota files and write the checked quota information to them, which may reuse the freed indirect block (whose buffer and page cache were not freed) as a data block.
- Enable quota again; it will invoke vfs_load_quota_inode()->invalidate_bdev() to try to clean unused buffers and pagecache. Unfortunately, because the buffer of the quota data block is still referenced, the quota code cannot read the up-to-date quota info from the device, which leads to quota information corruption.
This problem can be reproduced by xfstests generic/231 on an ext3 filesystem or an ext4 filesystem without the extent and quota features. This patch fixes the problem by calling brelse() on all indirect buffers, and also cleans up the brelse code in ext4_ind_remove_space(). Reported-by:
Hulk Robot <hulkci@huawei.com> Signed-off-by:
zhangyi (F) <yi.zhang@huawei.com> Reviewed-by:
Miao Xie <miaoxie@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
mainline inclusion from mainline-v5.0 commit 7771bdbb category: bugfix bugzilla: 10979 CVE: NA ------------------------------------------------ The use-after-scope bug detector seems to be almost entirely useless for the Linux kernel. It has existed for over two years, but I've seen only one valid bug so far [1]. And the bug was fixed before it was reported. There were some other use-after-scope reports, but they were false positives due to various reasons, such as incompatibility with the structleak plugin. This feature significantly increases stack usage, especially with GCC < 9, and causes a 32K stack overflow. It probably adds a performance penalty too. Given all that, let's remove the use-after-scope detector entirely. While preparing this patch I noticed that we mistakenly enable use-after-scope detection for the clang compiler regardless of the CONFIG_KASAN_EXTRA setting. This is also fixed now. [1] http://lkml.kernel.org/r/<2...
-
euler inclusion category: bugfix bugzilla: 10989 CVE: NA ------------------------------------------------ MEMCG depends on the task structure not being freed under rcu_read_lock() in get_mem_cgroup_from_mm() after it dereferences mm->owner. A better fix would be to avoid registering forked vmas in userfaultfd contexts reported to the monitor, in case fork ends up failing. Signed-off-by:
Andrea Arcangeli <aarcange@redhat.com> Signed-off-by:
zhong jiang <zhongjiang@huawei.com> Reviewed-by:
Miao Xie <miaoxie@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
mainline inclusion from mainline-5.0-rc8 commit 822ad64d category: bugfix bugzilla: 10783 CVE: NA --------------------------- In the request_key() upcall mechanism there's a dependency loop by which if a key type driver overrides the ->request_key hook and the userspace side manages to lose the authorisation key, the auth key and the internal construction record (struct key_construction) can keep each other pinned. Fix this by the following changes: (1) Killing off the construction record and using the auth key instead. (2) Including the operation name in the auth key payload and making the payload available outside of security/keys/. (3) The ->request_key hook is given the authkey instead of the cons record and operation name. Changes (2) and (3) allow the auth key to naturally be cleaned up if the keyring it is in is destroyed or cleared or the auth key is unlinked. Fixes: 7ee02a316600 ("keys: Fix dependency loop between construction record and auth key") Signed-off-by:
David Howells <dhowells@redhat.com> Signed-off-by:
James Morris <james.morris@microsoft.com> Signed-off-by:
Jason Yan <yanaijie@huawei.com> Reviewed-by:
ZhangXiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
mainline inclusion from mainline-5.0-rc8 commit 7c1857bd category: bugfix bugzilla: 10783 CVE: NA --------------------------- Set the timestamp on new keys rather than leaving it unset. Fixes: 31d5a79d ("KEYS: Do LRU discard in full keyrings") Signed-off-by:
David Howells <dhowells@redhat.com> Signed-off-by:
James Morris <james.morris@microsoft.com> Signed-off-by:
Jason Yan <yanaijie@huawei.com> Reviewed-by:
ZhangXiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
mainline inclusion from mainline-5.0 commit af3d5d1c category: bugfix bugzilla: NA CVE: CVE-2019-3460 ------------------------------------------------- When doing option parsing for standard type values of 1, 2 or 4 octets, the value is converted directly into a variable instead of a pointer. To avoid being tricked into treating it as a pointer, check that the sizes actually match for these option types. In L2CAP every option is fixed size, so it is prudent anyway to ensure that the remote side sends us the right option size along with the option parameters. If the option size does not match the option type, the option is silently ignored. It is a protocol violation, and instead of giving the remote attacker any further hints, just pretend that the option is not present and proceed with the default values. Implementations following the specification and its qualification procedures will always use the correct size and are thus not impacted here. To keep the code readable and consistent across all options, a few cosmetic changes were also required. Signed-off-by:
Marcel Holtmann <marcel@holtmann.org> Reviewed-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Johan Hedberg <johan.hedberg@intel.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
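
A minimal userspace sketch of the size check described in the entry above. The option types, the struct, and the function name are hypothetical and exist only for illustration; this is not the kernel's L2CAP code. An option whose length does not match its fixed size is ignored and the default value stays in place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical option types; not the real L2CAP constants. */
enum opt_type { OPT_MTU = 1, OPT_FLUSH_TO = 2 };

/* Hypothetical configuration holding two fixed-size (2-octet) options. */
struct conf {
	uint16_t mtu;
	uint16_t flush_to;
};

/*
 * Apply one parsed option. Returns 1 if the option was applied and 0 if
 * it was silently ignored (unknown type or wrong size), leaving the
 * previous (default) value untouched.
 */
static int apply_option(struct conf *c, int type, size_t olen, unsigned long val)
{
	assert(c != NULL);

	switch (type) {
	case OPT_MTU:
		if (olen != 2)
			return 0;	/* size mismatch: pretend it is absent */
		c->mtu = (uint16_t)val;
		return 1;
	case OPT_FLUSH_TO:
		if (olen != 2)
			return 0;
		c->flush_to = (uint16_t)val;
		return 1;
	default:
		return 0;		/* unknown option */
	}
}
```

With defaults of mtu = 672, a call such as apply_option(&c, OPT_MTU, 8, ...) leaves c.mtu at 672, while a call with olen = 2 updates it.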
-
mainline inclusion from mainline-5.0 commit 7c9cbd0b category: bugfix bugzilla: NA CVE: CVE-2019-3459 ------------------------------------------------- The function l2cap_get_conf_opt will return L2CAP_CONF_OPT_SIZE + opt->len as the length value. However, opt->len is under the control of the remote user and can be used by an attacker to gain access beyond the bounds of the actual packet. To prevent any potential leak of heap memory, it is enough to check that the resulting len calculation after calling l2cap_get_conf_opt is not below zero. A well-formed packet will always return >= 0 here and will end with the length value being zero after the last option has been parsed. In the case of malformed packets messing with the opt->len field, the length value will become negative. If that is the case, just abort and ignore the option. If an attacker uses a too-short opt->len value, garbage will be parsed, but that is covered by the unknown option handling and the option parameter size checks. Signed-off-by:
Marcel Holtmann <marcel@holtmann.org> Reviewed-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Johan Hedberg <johan.hedberg@intel.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
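
The length check from the entry above can be sketched in userspace as follows. The two-byte header layout (type, length) and all names here are assumptions made for illustration, not the kernel's definitions. The loop subtracts the claimed option size from the remaining length and stops as soon as the result goes negative, so a forged length never moves the parser past the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed header size: one type octet plus one length octet. */
#define OPT_HDR_SIZE 2

/*
 * Count the well-formed options in buf. An option whose claimed length
 * runs past the end of the packet makes len negative; parsing stops
 * there and the option is ignored.
 */
static int count_valid_options(const uint8_t *buf, int len)
{
	int count = 0;

	assert(buf != NULL);

	while (len >= OPT_HDR_SIZE) {
		int olen = buf[1];		/* attacker-controlled */

		len -= OPT_HDR_SIZE + olen;
		if (len < 0)
			break;			/* malformed: abort parsing */

		buf += OPT_HDR_SIZE + olen;
		count++;
	}
	return count;
}
```

A packet with two 2-octet options is fully consumed and len ends at zero; if the second option claims 200 octets, only the first is counted.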
-
euler inclusion category: bugfix bugzilla: 9513/11006 CVE: NA -------------------------------------------------- This reverts commit b772baf9a14ab4975e8884a399a4e0bab2fb6bf9. We merged the patch b772baf9a14a ("perf: Paper over the hw.target problems") to resolve a use-after-free issue (bugzilla #9513/#11006), but it caused some new problems (bugzilla #11050/#11049) in this version. So just revert it. Signed-off-by:
Cheng Jian <cj.chengjian@huawei.com> Reviewed-by:
Xie XiuQi <xiexiuqi@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
mainline inclusion from mainline-master~13 commit 6caabe7f category: bugfix bugzilla: 11026 CVE: NA ------------------------------------------------- If hsr_add_port(hsr, hsr_dev, HSR_PT_MASTER) failed to add port, it directly returns res and forgets to free the node that allocated in hsr_create_self_node(), and forgets to delete the node->mac_list linked in hsr->self_node_db. BUG: memory leak unreferenced object 0xffff8881cfa0c780 (size 64): comm "syz-executor.0", pid 2077, jiffies 4294717969 (age 2415.377s) hex dump (first 32 bytes): e0 c7 a0 cf 81 88 ff ff 00 02 00 00 00 00 ad de ................ 00 e6 49 cd 81 88 ff ff c0 9b 87 d0 81 88 ff ff ..I............. backtrace: [<00000000e2ff5070>] hsr_dev_finalize+0x736/0x960 [hsr] [<000000003ed2e597>] hsr_newlink+0x2b2/0x3e0 [hsr] [<000000003fa8c6b6>] __rtnl_newlink+0xf1f/0x1600 net/core/rtnetlink.c:3182 [<000000001247a7ad>] rtnl_newlink+0x66/0x90 net/core/rtnetlink.c:3240 [<00000000e7d1b61d>] rtnetlink_rcv_msg+0x54e/0xb90 net/core/rtnetlink.c:5130 [<000000005556bd3a>] netlink_rcv_skb+0x129/0x340 net/netlink/af_netlink.c:2477 [<00000000741d5ee6>] netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline] [<00000000741d5ee6>] netlink_unicast+0x49a/0x650 net/netlink/af_netlink.c:1336 [<000000009d56f9b7>] netlink_sendmsg+0x88b/0xdf0 net/netlink/af_netlink.c:1917 [<0000000046b35c59>] sock_sendmsg_nosec net/socket.c:621 [inline] [<0000000046b35c59>] sock_sendmsg+0xc3/0x100 net/socket.c:631 [<00000000d208adc9>] __sys_sendto+0x33e/0x560 net/socket.c:1786 [<00000000b582837a>] __do_sys_sendto net/socket.c:1798 [inline] [<00000000b582837a>] __se_sys_sendto net/socket.c:1794 [inline] [<00000000b582837a>] __x64_sys_sendto+0xdd/0x1b0 net/socket.c:1794 [<00000000c866801d>] do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290 [<00000000fea382d9>] entry_SYSCALL_64_after_hwframe+0x49/0xbe [<00000000e01dacb3>] 0xffffffffffffffff Fixes: c5a75911 ("net/hsr: Use list_head (and rcu) instead of array for slave devices.") Reported-by:
Hulk Robot <hulkci@huawei.com> Signed-off-by:
Mao Wenan <maowenan@huawei.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Mao Wenan <maowenan@huawei.com> Reviewed-by:
YueHaibing <yuehaibing@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
euler inclusion category: feature Bugzilla: 10876 CVE: N/A ---------------------------------------- When I ran Syzkaller testsuite, I got the following call trace. Reviewed-by:
Yang Yingliang <yangyingliang@huawei.com> ================================================================================ UBSAN: Undefined behaviour in ./include/linux/time64.h:120:27 signed integer overflow: 8243129037239968815 * 1000000000 cannot be represented in type 'long long int' CPU: 5 PID: 28854 Comm: syz-executor.1 Not tainted 4.19.24 #4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xca/0x13e lib/dump_stack.c:113 ubsan_epilogue+0xe/0x81 lib/ubsan.c:159 handle_overflow+0x193/0x1e2 lib/ubsan.c:190 timespec64_to_ns include/linux/time64.h:120 [inline] posix_cpu_timer_set+0x95a/0xb70 kernel/time/posix-cpu-timers.c:687 do_timer_settime+0x198/0x2a0 kernel/time/posix-timers.c:892 __do_sys_timer_settime kernel/time/posix-timers.c:918 [inline] __se_sys_timer_settime kernel/time/posix-timers.c:904 [inline] __x64_sys_timer_settime+0x18d/0x260 kernel/time/posix-timers.c:904 do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x462eb9 Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f14e4127c58 EFLAGS: 00000246 ORIG_RAX: 00000000000000df RAX: ffffffffffffffda RBX: 000000000073bfa0 RCX: 0000000000462eb9 RDX: 0000000020000080 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f14e41286bc R13: 00000000004c54cc R14: 0000000000704278 R15: 00000000ffffffff ================================================================================ It is because 'it_interval.tv_sec' is larger than 'KTIME_SEC_MAX' and 'it_interval.tv_sec * NSEC_PER_SEC' overflows in 'timespec64_to_ns()'. 
This patch checks whether 'it_interval.tv_sec' is larger than 'KTIME_SEC_MAX' and saturates if that is the case. Signed-off-by:
Xiongfeng Wang <wangxiongfeng2@huawei.com> Reviewed-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
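
A minimal userspace sketch of the saturation described in the entry above. NSEC_PER_SEC and KTIME_SEC_MAX are named in the entry; KTIME_MAX and the function name are assumptions added for illustration, and the function is not the kernel's timespec64_to_ns(). Seconds values at or above the limit return the maximum instead of overflowing the multiplication.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC	1000000000LL
#define KTIME_MAX	INT64_MAX			/* assumed maximum */
#define KTIME_SEC_MAX	(KTIME_MAX / NSEC_PER_SEC)

/*
 * Convert seconds + nanoseconds to nanoseconds, saturating at KTIME_MAX.
 * Assumes tv_sec is non-negative and tv_nsec is a valid nanosecond count.
 */
static int64_t to_ns_saturating(int64_t tv_sec, long tv_nsec)
{
	assert(tv_nsec >= 0 && tv_nsec < NSEC_PER_SEC);

	if (tv_sec >= KTIME_SEC_MAX)
		return KTIME_MAX;	/* the multiplication would overflow */

	return tv_sec * NSEC_PER_SEC + tv_nsec;
}
```

The value from the UBSAN report, 8243129037239968815 seconds, takes the saturating branch, so the signed multiplication is never evaluated.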
-
euler inclusion category: feature Bugzilla: 11009 CVE: N/A ---------------------------------------- When I ran Syzkaller testsuite, I got the following call trace. Reviewed-by:
Yang Yingliang <yangyingliang@huawei.com> ================================================================================ UBSAN: Undefined behaviour in kernel/time/ntp.c:457:16 signed integer overflow: 9223372036854775807 + 500 cannot be represented in type 'long int' CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.19.25-dirty #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xca/0x13e lib/dump_stack.c:113 ubsan_epilogue+0xe/0x81 lib/ubsan.c:159 handle_overflow+0x193/0x1e2 lib/ubsan.c:190 second_overflow+0x403/0x540 kernel/time/ntp.c:457 accumulate_nsecs_to_secs kernel/time/timekeeping.c:2002 [inline] logarithmic_accumulation kernel/time/timekeeping.c:2046 [inline] timekeeping_advance+0x2bb/0xec0 kernel/time/timekeeping.c:2114 tick_do_update_jiffies64.part.2+0x1a0/0x350 kernel/time/tick-sched.c:97 tick_do_update_jiffies64 kernel/time/tick-sched.c:1229 [inline] tick_nohz_update_jiffies kernel/time/tick-sched.c:499 [inline] tick_nohz_irq_enter kernel/time/tick-sched.c:1232 [inline] tick_irq_enter+0x1fd/0x240 kernel/time/tick-sched.c:1249 irq_enter+0xc4/0x100 kernel/softirq.c:353 entering_irq arch/x86/include/asm/apic.h:517 [inline] entering_ack_irq arch/x86/include/asm/apic.h:523 [inline] smp_apic_timer_interrupt+0x20/0x480 arch/x86/kernel/apic/apic.c:1052 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:864 </IRQ> RIP: 0010:native_safe_halt+0x2/0x10 arch/x86/include/asm/irqflags.h:58 Code: 01 f0 0f 82 bc fd ff ff 48 c7 c7 c0 21 b1 83 e8 a1 0a 02 ff e9 ab fd ff ff 4c 89 e7 e8 77 b6 a5 fe e9 6a ff ff ff 90 90 fb f4 <c3> 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f4 c3 90 90 90 90 90 90 RSP: 0018:ffff888106307d20 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 RAX: 0000000000000007 RBX: dffffc0000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8881062e4f1c RBP: 0000000000000003 R08: ffffed107c5dc77b R09: 0000000000000000 R10: 
0000000000000000 R11: 0000000000000000 R12: ffffffff848c78a0 R13: 0000000000000003 R14: 1ffff11020c60fae R15: 0000000000000000 arch_safe_halt arch/x86/include/asm/paravirt.h:94 [inline] default_idle+0x24/0x2b0 arch/x86/kernel/process.c:561 cpuidle_idle_call kernel/sched/idle.c:153 [inline] do_idle+0x2ca/0x420 kernel/sched/idle.c:262 cpu_startup_entry+0xcb/0xe0 kernel/sched/idle.c:368 start_secondary+0x421/0x570 arch/x86/kernel/smpboot.c:271 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243 ================================================================================ This happens because the user set time_maxerror to 0x7FFFFFFFFFFFFFFF. It overflows when 'MAXFREQ / NSEC_PER_USEC' is added to it in 'second_overflow()'. This patch adds a limit check and saturates 'time_maxerror' when the user sets it. Signed-off-by:
Xiongfeng Wang <wangxiongfeng2@huawei.com> Reviewed-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
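
A rough userspace sketch of the idea in the entry above: clamp the user-supplied value when it is set, so the later periodic increment cannot overflow. The entry does not say which limit the patch uses; NTP_PHASE_LIMIT and its value, like both function names, are assumptions for illustration only. MAXFREQ / NSEC_PER_USEC evaluates to 500, matching the "+ 500" in the UBSAN report.

```c
#include <assert.h>
#include <limits.h>

#define MAXFREQ		500000L		/* ns per second */
#define NSEC_PER_USEC	1000L
#define NTP_PHASE_LIMIT	16000000L	/* assumed upper bound, in usec */

/* Saturate a user-supplied maxerror at the assumed limit. */
static long clamp_maxerror(long user_value)
{
	if (user_value > NTP_PHASE_LIMIT)
		return NTP_PHASE_LIMIT;
	return user_value;
}

/*
 * Per-second increment, as in the overflowing addition from the report.
 * The caller must pass an already clamped value, so the sum stays far
 * below LONG_MAX.
 */
static long advance_maxerror(long maxerror)
{
	assert(maxerror <= NTP_PHASE_LIMIT);

	maxerror += MAXFREQ / NSEC_PER_USEC;
	if (maxerror > NTP_PHASE_LIMIT)
		maxerror = NTP_PHASE_LIMIT;
	return maxerror;
}
```

Clamping at set time is one possible design; the increment itself could also be made saturating, but checking once at the point of user input keeps the per-second path unchanged.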
-
scsi: hisi_sas: add softreset behind abort device at I_T_nexus_reset() to ensure decoupling of SATA device driver inclusion category: bugfix bugzilla: NA CVE: NA ------------------------------------------------- We found that a SATA disk could be read but not written after the system came up. No abnormal I/O was returned during initialization, but when we tried to write to the SATA disk, the I/O never returned and timed out. We noticed that an if-check was removed from sas_I_T_nexus(), which allows internal_task_abort() to run outside of error handling; softreset_ata() is evidently not run afterwards in this case, so the SATA disk is not decoupled. Fixes: 0de2941 ("scsi: hisi_sas: remove the check of sas_dev status in function hisi_sas_I_T_nexus_reset()") Signed-off-by:
Luo Jiaxing <luojiaxing@huawei.com> Reviewed-by:
Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by:
John Garry <john.garry@huawei.com> Reviewed-by:
Xie XiuQi <xiexiuqi@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
mainline inclusion from mainline-next commit abdc644e category: bugfix bugzilla: 5355 CVE: NA -------------------------------------------------- The reason is that while swapping two inodes, we swap the flags too. Some flags such as EXT4_JOURNAL_DATA_FL can really confuse things, since we're not resetting the address operations structure. The simplest way to keep things sane is to restrict the flags that can be swapped. Signed-off-by:
yangerkun <yangerkun@huawei.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org Reviewed-by:
zhangyi (F) <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
mainline inclusion from mainline-next commit aa507b5f category: bugfix bugzilla: 5355 CVE: NA -------------------------------------------------- When swapping two inodes, i_data is swapped without updating the quota information. Also, swap_inode_boot_loader can sometimes do a "revert", so update the quota after all operations have finished. Signed-off-by:
yangerkun <yangerkun@huawei.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Reviewed-by:
zhangyi (F) <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
mainline inclusion from mainline-next commit a46c68a3 category: bugfix bugzilla: 5355 CVE: NA -------------------------------------------------- When swapping, we should make sure that no new dirty pages appear, since we swap i_data between the two inodes: 1. Take i_mmap_sem for write to avoid new pagecache from mmap read/write; 2. Change filemap_flush to filemap_write_and_wait and move the calls into the region protected by the inode lock to avoid new pagecache from buffered read/write. Signed-off-by:
yangerkun <yangerkun@huawei.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Reviewed-by:
zhangyi (F) <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
mainline inclusion from mainline-next commit 67a11611 category: bugfix bugzilla: 5355 CVE: NA -------------------------------------------------- Before actually swapping an inode with the boot inode, some conditions need to be checked to avoid invalid or unpermitted operations, such as whether the inode has inline data. These checks should be done under the inode lock so that the conditions cannot change during the swap. Some of the other conditions cannot change during the swap, but there is no harm in checking them under the inode lock as well. Fixes: ee3c859409("ext4: disallow files with EXT4_JOURNAL_DATA_FL ...") Signed-off-by:
yangerkun <yangerkun@huawei.com> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Reviewed-by:
zhangyi (F) <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
mainline inclusion from mainline-5.0-rc8 commit bb2ba2d7 category: bugfix bugzilla: 10759 CVE: NA ------------------------------------------------- Fix the creation of shortcuts for which the length of the index key value is an exact multiple of the machine word size. The problem is that the code that blanks off the unused bits of the shortcut value malfunctions if the number of bits in the last word equals machine word size. This is due to the "<<" operator being given a shift of zero in this case, and so the mask that should be all zeros is all ones instead. This causes the subsequent masking operation to clear everything rather than clearing nothing. Ordinarily, the presence of the hash at the beginning of the tree index key makes the issue very hard to test for, but in this case, it was encountered due to a development mistake that caused the hash output to be either 0 (keyring) or 1 (non-keyring) only. This made it susceptible to the keyctl/unlink/valid test in the keyutils package. The fix is simply to skip the blanking if the shift would be 0. For example, an index key that is 64 bits long would produce a 0 shift and thus a 'blank' of all 1s. This would then be inverted and AND'd onto the index_key, incorrectly clearing the entire last word. Fixes: 3cb98950 ("Add a generic associative array implementation.") Signed-off-by:
David Howells <dhowells@redhat.com> Signed-off-by:
James Morris <james.morris@microsoft.com> Signed-off-by:
Cheng Jian <cj.chengjian@huawei.com> Reviewed-by:
Li Bin <huawei.libin@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
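
The shift-of-zero case from the entry above can be reproduced in a few lines of userspace C. This is a sketch under assumed names (WORD_BITS, CHUNK_MASK, blank_unused_bits), not the associative array code itself. With a shift of zero, ULONG_MAX << 0 is all ones, its inverse is zero, and the AND would wipe the whole word, so the blanking is skipped in that case.

```c
#include <assert.h>
#include <limits.h>

#define WORD_BITS	(sizeof(unsigned long) * CHAR_BIT)
#define CHUNK_MASK	(WORD_BITS - 1)

/* The mask trick below only works for power-of-two word sizes. */
static_assert((WORD_BITS & (WORD_BITS - 1)) == 0,
	      "word size must be a power of two");

/*
 * Clear the bits of last_word at and above bit (level % WORD_BITS).
 * If level is an exact multiple of the word size, every bit of the
 * last word is in use and nothing must be cleared.
 */
static unsigned long blank_unused_bits(unsigned long last_word,
				       unsigned int level)
{
	unsigned int shift = level & CHUNK_MASK;

	if (shift) {
		unsigned long blank = ULONG_MAX << shift;

		last_word &= ~blank;
	}
	return last_word;
}
```

For level = 4 only the low four bits survive; for level equal to the word size the word is returned unchanged, which is the behaviour the fix restores.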
-
mainline inclusion from mainline-5.0-rc8 commit 3defaf2f category: bugfix bugzilla: 10760 CVE: NA ------------------------------------------------- Lockdep warns about false positive: [ 11.211460] ------------[ cut here ]------------ [ 11.211936] DEBUG_LOCKS_WARN_ON(depth <= 0) [ 11.211985] WARNING: CPU: 0 PID: 141 at ../kernel/locking/lockdep.c:3592 lock_release+0x1ad/0x280 [ 11.213134] Modules linked in: [ 11.214954] RIP: 0010:lock_release+0x1ad/0x280 [ 11.223508] Call Trace: [ 11.223705] <IRQ> [ 11.223874] ? __local_bh_enable+0x7a/0x80 [ 11.224199] up_read+0x1c/0xa0 [ 11.224446] do_up_read+0x12/0x20 [ 11.224713] irq_work_run_list+0x43/0x70 [ 11.225030] irq_work_run+0x26/0x50 [ 11.225310] smp_irq_work_interrupt+0x57/0x1f0 [ 11.225662] irq_work_interrupt+0xf/0x20 since rw_semaphore is released in a different task vs task that locked the sema. It is expected behavior. Fix the warning with up_read_non_owner() and rwsem_release() annotation. Fixes: bae77c5e ("bpf: enable stackmap with build_id in nmi context") Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Signed-off-by:
Cheng Jian <cj.chengjian@huawei.com> Reviewed-by:
Li Bin <huawei.libin@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
euler inclusion category: bugfix bugzilla: 9513/11006 CVE: NA --------------------------------------------------
[ Cheng Jian: HULK-Syzkaller reported a problem that syzbot had already reported to mainline (lkml); this patch comes from the reply on lkml. https://lkml.org/lkml/2019/2/28/529 ]
First, we have a race between perf_event_release_kernel() and perf_free_event(), which happens when the parent's event is released while the child's fork fails (because of a fatal signal, for example). It looks like this:

    cpu X                            cpu Y
    -----                            -----
                                     copy_process() error path
    perf_release(parent)             +->perf_event_free_task()
    +-> lock(child_ctx->mutex)       |  |
    +-> remove_from_context(child)   |  |
    +-> unlock(child_ctx->mutex)     |  |
    |                                |  +-> lock(child_ctx->mutex)
    |                                |  +-> unlock(child_ctx->mutex)
    |                                +-> free_task(child_task)
    +-> put_task_struct(child_task)

Technically, we're still holding a reference to the task via parent->hw.target, but that's not stopping free_task(), so we end up poking at free'd memory, as is pointed out by KASAN in the syzkaller report (see Link below). The straightforward fix is to drop the hw.target reference while the task is still around. Therein lies the second problem: the users of hw.target (uprobe) assume that it's around at ->destroy() callback time, where they use it for context. So, in order to not break the uprobe teardown and avoid leaking stuff, we need to call ->destroy() at the same time. This patch fixes the race and the subsequent fallout by doing both these things at remove_from_context time. Signed-off-by:
Alexander Shishkin <alexander.shishkin@linux.intel.com> Link: https://syzkaller.appspot.com/bug?extid=a24c397a29ad22d86c98 Reported-by:
<syzbot+a24c397a29ad22d86c98@syzkaller.appspotmail.com> Signed-off-by:
Cheng Jian <cj.chengjian@huawei.com> Reviewed-by:
Li Bin <huawei.libin@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
mainline inclusion from mainline-5.x commit: <not-yet-available> category: bugfix bugzilla: 10883 CVE: NA ------------------------------------------------ When soft_offline_in_use_page() runs on a thp tail page after pmd is split, we trigger the following VM_BUG_ON_PAGE(): Memory failure: 0x3755ff: non anonymous thp __get_any_page: 0x3755ff: unknown zero refcount page type 2fffff80000000 Soft offlining pfn 0x34d805 at process virtual address 0x20fff000 page:ffffea000d360140 count:0 mapcount:0 mapping:0000000000000000 index:0x1 flags: 0x2fffff80000000() raw: 002fffff80000000 ffffea000d360108 ffffea000d360188 0000000000000000 raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000 page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0) ------------[ cut here ]------------ kernel BUG at ./include/linux/mm.h:519! soft_offline_in_use_page() passed refcount and page lock from tail page to head page, which is not needed because we can pass any subpage to split_huge_page(). Naoya had fixed the similar issue in the commit c3901e72 (" mm: hwpoison: fix thp split handling in memory_failure()"). But he missed fixing soft offline. Fixes: 61f5d698 ("mm: re-enable THP") Cc: <stable@vger.kernel.org> [4.5+] Acked-by:
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by:
zhongjiang <zhongjiang@huawei.com> Reviewed-by:
Miao Xie <miaoxie@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
euler inclusion category: bugfix bugzilla: 10984 CVE: NA --------------------------- When .mknod creates a block device file in hugetlbfs, it allocates an inode and kmallocs a 'struct resv_map' in resv_map_alloc(). For now, inode->i_mapping->private_data is used to point to the resv_map. However, when the device is opened, bd_acquire() sets i_mapping to bd_inode->i_mapping, resulting in a resv_map memory leak. We fix the leak by adding a new resv_map entry to hugetlbfs_inode_info, which stores the resv_map pointer. Commands to reproduce:
    mount -t hugetlbfs nodev hugetlbfs
    mknod hugetlbfs/dev b 0 0
    exec 30<> hugetlbfs/dev
    umount hugetlbfs/
Fixes: 9119a41e ("mm, hugetlb: unify region structure handling") Signed-off-by:
Yufen Yu <yuyufen@huawei.com> Reviewed-by:
Miao Xie <miaoxie@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Merge 75 patches from the 4.19.27 stable branch (79 total), excluding 4 patches that were already merged:
    0655618 irqchip/gic-v3-mbi: Fix uninitialized mbi_lock
    5024f0a sched/wait: Fix rcuwait_wake_up() ordering
    2368e6d futex: Fix (possible) missed wakeup
    9ad6216 locking/rwsem: Fix (possible) missed wakeup
Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
commit 2a418cf3 upstream. When calling __put_user(foo(), ptr), the __put_user() macro would call foo() in between __uaccess_begin() and __uaccess_end(). If that code were buggy, then those bugs would be run without SMAP protection. Fortunately, there seem to be few instances of the problem in the kernel. Nevertheless, __put_user() should be fixed to avoid doing this. Therefore, evaluate __put_user()'s argument before setting AC. This issue was noticed when an objtool hack by Peter Zijlstra complained about genregs_get() and I compared the assembly output to the C source. [ bp: Massage commit message and fixed up whitespace. ] Fixes: 11f1a4b9 ("x86: reorganize SMAP handling in user space accesses") Signed-off-by:
Andy Lutomirski <luto@kernel.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Acked-by:
Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20190225125231.845656645@infradead.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
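The evaluation-order problem in the __put_user() commit above can be modelled in userspace. The following is a minimal sketch, not the kernel macro: ac_open, uaccess_begin_model(), uaccess_end_model(), PUT_USER_OLD and PUT_USER_NEW are all hypothetical stand-ins, and the value type is fixed to int for simplicity.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the x86 AC flag: true while the
 * "user access" window is open. Not the kernel's real state. */
static bool ac_open;

/* Records whether the value expression ran inside the open window. */
static bool evaluated_inside_window;

static void uaccess_begin_model(void) { ac_open = true; }
static void uaccess_end_model(void)   { ac_open = false; }

/* Example argument expression with an observable side effect. */
static int compute_value(void)
{
    evaluated_inside_window = ac_open;
    return 42;
}

/* Old shape: the argument is expanded after the window is opened. */
#define PUT_USER_OLD(x, ptr)        \
    do {                            \
        uaccess_begin_model();      \
        *(ptr) = (x);               \
        uaccess_end_model();        \
    } while (0)

/* Fixed shape: evaluate the argument first, then open the window. */
#define PUT_USER_NEW(x, ptr)        \
    do {                            \
        int pu_val_ = (x);          \
        uaccess_begin_model();      \
        *(ptr) = pu_val_;           \
        uaccess_end_model();        \
    } while (0)

/* Returns true if compute_value() ran with the window open. */
static bool old_macro_runs_inside(void)
{
    int dst = 0;
    PUT_USER_OLD(compute_value(), &dst);
    return evaluated_inside_window;
}

static bool new_macro_runs_inside(void)
{
    int dst = 0;
    PUT_USER_NEW(compute_value(), &dst);
    return evaluated_inside_window;
}
```

In this model only the old macro shape runs compute_value() while the window is open, which is the behaviour the commit describes as running foo() without SMAP protection.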
-
commit d1a2930d upstream. The MIPS eBPF JIT calls flush_icache_range() in order to ensure the icache observes the code that we just wrote. Unfortunately it gets the end address calculation wrong due to some bad pointer arithmetic. The struct jit_ctx target field is of type pointer to u32, and as such adding one to it will increment the address being pointed to by 4 bytes. Therefore in order to find the address of the end of the code we simply need to add the number of 4 byte instructions emitted, but we mistakenly add the number of instructions multiplied by 4. This results in the call to flush_icache_range() operating on a memory region 4x larger than intended, which is always wasteful and can cause crashes if we overrun into an unmapped page. Fix this by correcting the pointer arithmetic to remove the bogus multiplication, and use braces to remove the need for a set of brackets whilst also making it obvious that the target field is a pointer. Signed-off-by:
Paul Burton <paul.burton@mips.com> Fixes: b6bd53f9 ("MIPS: Add missing file for eBPF JIT.") Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Cc: netdev@vger.kernel.org Cc: bpf@vger.kernel.org Cc: linux-mips@vger.kernel.org Cc: stable@vger.kernel.org # v4.13+ Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
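The pointer arithmetic mistake described in the MIPS eBPF JIT commit above can be shown with plain C. This is an illustrative sketch only: end_offset_buggy() and end_offset_fixed() are hypothetical helpers, not functions from the JIT, and the caller is assumed to pass a buffer large enough for both computations.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset of the computed end address when the instruction count
 * is multiplied by 4 by hand. The pointer type already scales by
 * sizeof(uint32_t), so the result is 4x too large.
 * The buffer must hold at least 4 * n_insns elements. */
static size_t end_offset_buggy(uint32_t *target, size_t n_insns)
{
    uint32_t *end = target + n_insns * 4;
    return (size_t)((char *)end - (char *)target);
}

/* Byte offset of the end address using array indexing, which makes it
 * obvious that target is a pointer to 4-byte instructions. */
static size_t end_offset_fixed(uint32_t *target, size_t n_insns)
{
    uint32_t *end = &target[n_insns];
    return (size_t)((char *)end - (char *)target);
}
```

For 16 emitted instructions the fixed helper yields 64 bytes, while the buggy one yields 256 bytes, matching the "4x larger than intended" region from the commit message.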
-
commit 18836b48 upstream. The switch to the generic dma ops made dma masks mandatory, breaking devices that do not have them set. In case of bcm63xx, it broke ethernet with the following warning when trying to bring the device up:
[ 2.633123] ------------[ cut here ]------------
[ 2.637949] WARNING: CPU: 0 PID: 325 at ./include/linux/dma-mapping.h:516 bcm_enetsw_open+0x160/0xbbc
[ 2.647423] Modules linked in: gpio_button_hotplug
[ 2.652361] CPU: 0 PID: 325 Comm: ip Not tainted 4.19.16 #0
[ 2.658080] Stack : 80520000 804cd3ec 00000000 00000000 804ccc00 87085bdc 87d3f9d4 804f9a17
[ 2.666707] 8049cf18 00000145 80a942a0 00000204 80ac0000 10008400 87085b90 eb3d5ab7
[ 2.675325] 00000000 00000000 80ac0000 000022b0 00000000 00000000 00000007 00000000
[ 2.683954] 0000007a 80500000 0013b381 00000000 80000000 00000000 804a1664 80289878
[ 2.692572] 00000009 00000204 80ac0000 00000200 00000002 00000000 00000000 80a90000
[ 2.701191] ...
[ 2.703701] Call Trace:
[ 2.706244] [<8001f3c8>] show_stack+0x58/0x100
[ 2.710840] [<800336e4>] __warn+0xe4/0x118
[ 2.715049] [<800337d4>] warn_slowpath_null+0x48/0x64
[ 2.720237] [<80289878>] bcm_enetsw_open+0x160/0xbbc
[ 2.725347] [<802d1d4c>] __dev_open+0xf8/0x16c
[ 2.729913] [<802d20cc>] __dev_change_flags+0x100/0x1c4
[ 2.735290] [<802d21b8>] dev_change_flags+0x28/0x70
[ 2.740326] [<803539e0>] devinet_ioctl+0x310/0x7b0
[ 2.745250] [<80355fd8>] inet_ioctl+0x1f8/0x224
[ 2.749939] [<802af290>] sock_ioctl+0x30c/0x488
[ 2.754632] [<80112b34>] do_vfs_ioctl+0x740/0x7dc
[ 2.759459] [<80112c20>] ksys_ioctl+0x50/0x94
[ 2.763955] [<800240b8>] syscall_common+0x34/0x58
[ 2.768782] ---[ end trace fb1a6b14d74e28b6 ]---
[ 2.773544] bcm63xx_enetsw bcm63xx_enetsw.0: cannot allocate rx ring 512
Fix this by adding appropriate DMA masks for the platform devices. Fixes: f8c55dc6 ("MIPS: use generic dma noncoherent ops for simple noncoherent platforms") Signed-off-by:
Jonas Gorski <jonas.gorski@gmail.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Florian Fainelli <f.fainelli@gmail.com> Signed-off-by:
Paul Burton <paul.burton@mips.com> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: James Hogan <jhogan@kernel.org> Cc: stable@vger.kernel.org # v4.19+ Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
commit 94ee12b5 upstream. __cmpxchg_small erroneously uses u8 for load comparison which can be either char or short. This patch changes the local variable to u32 which is sufficiently sized, as the loaded value is already masked and shifted appropriately. Using an integer size avoids any unnecessary canonicalization from use of non native widths. This patch is part of a series that adapts the MIPS small word atomics code for xchg and cmpxchg on short and char to RISC-V. Cc: RISC-V Patches <patches@groups.riscv.org> Cc: Linux RISC-V <linux-riscv@lists.infradead.org> Cc: Linux MIPS <linux-mips@linux-mips.org> Signed-off-by:
Michael Clark <michaeljclark@mac.com> [paul.burton@mips.com: - Fix variable typo per Jonas Gorski. - Consolidate load variable with other declarations.] Signed-off-by:
Paul Burton <paul.burton@mips.com> Fixes: 3ba7f44d ("MIPS: cmpxchg: Implement 1 byte & 2 byte cmpxchg()") Cc: stable@vger.kernel.org # v4.13+ Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
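The truncation described in the __cmpxchg_small commit above can be reproduced in isolation. The sketch below is a simplified userspace model, assuming a value that has already been masked and shifted out of a 32-bit word; matches_old_u8() and matches_old_u32() are hypothetical helpers, not the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Buggy shape: the extracted field is stored in a u8 local, so a
 * 16-bit field loses its upper byte before the comparison. */
static bool matches_old_u8(uint32_t word, unsigned int shift,
                           uint32_t mask, uint32_t old)
{
    uint8_t load = (uint8_t)((word & mask) >> shift);
    return load == old;
}

/* Fixed shape: a u32 local is wide enough for both char and short
 * fields, because the value is already masked and shifted. */
static bool matches_old_u32(uint32_t word, unsigned int shift,
                            uint32_t mask, uint32_t old)
{
    uint32_t load = (word & mask) >> shift;
    return load == old;
}
```

With a 16-bit field holding 0x1234, the u8 variant compares 0x34 against 0x1234 and reports a mismatch, so a short-sized compare-and-exchange modelled this way would never succeed for values above 0xff.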
-
commit cb6acd01 upstream. hugetlb pages should only be migrated if they are 'active'. The routines set/clear_page_huge_active() modify the active state of hugetlb pages. When a new hugetlb page is allocated at fault time, set_page_huge_active is called before the page is locked. Therefore, another thread could race and migrate the page while it is being added to page table by the fault code. This race is somewhat hard to trigger, but can be seen by strategically adding udelay to simulate worst case scheduling behavior. Depending on 'how' the code races, various BUG()s could be triggered. To address this issue, simply delay the set_page_huge_active call until after the page is successfully added to the page table. Hugetlb pages can also be leaked at migration time if the pages are associated with a file in an explicitly mounted hugetlbfs filesystem. For example, consider a two node system with 4GB worth of huge pages available. A program mmaps a 2G file in a hugetlbfs filesystem. It then migrates the pages associated with the file from one node to another. When the program exits, huge page counts are as follows:
node0 1024 free_hugepages 1024 nr_hugepages
node1 0 free_hugepages 1024 nr_hugepages
Filesystem Size Used Avail Use% Mounted on
nodev 4.0G 2.0G 2.0G 50% /var/opt/hugepool
That is as expected. 2G of huge pages are taken from the free_hugepages counts, and 2G is the size of the file in the explicitly mounted filesystem. If the file is then removed, the counts become:
node0 1024 free_hugepages 1024 nr_hugepages
node1 1024 free_hugepages 1024 nr_hugepages
Filesystem Size Used Avail Use% Mounted on
nodev 4.0G 2.0G 2.0G 50% /var/opt/hugepool
Note that the filesystem still shows 2G of pages used, while there actually are no huge pages in use. The only way to 'fix' the filesystem accounting is to unmount the filesystem. If a hugetlb page is associated with an explicitly mounted filesystem, this information is contained in the page_private field.
At migration time, this information is not preserved. To fix, simply transfer page_private from old to new page at migration time if necessary. There is a related race with removing a huge page from a file and migration. When a huge page is removed from the pagecache, the page_mapping() field is cleared, yet page_private remains set until the page is actually freed by free_huge_page(). A page could be migrated while in this state. However, since page_mapping() is not set the hugetlbfs specific routine to transfer page_private is not called and we leak the page count in the filesystem. To fix that, check for this condition before migrating a huge page. If the condition is detected, return EBUSY for the page. Link: http://lkml.kernel.org/r/74510272-7319-7372-9ea6-ec914734c179@oracle.com Link: http://lkml.kernel.org/r/20190212221400.3512-1-mike.kravetz@oracle.com Fixes: bcc54222 ("mm: hugetlb: introduce page_huge_active") Signed-off-by:
Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by:
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: <stable@vger.kernel.org> [mike.kravetz@oracle.com: v2] Link: http://lkml.kernel.org/r/7534d322-d782-8ac6-1c8d-a8dc380eb3ab@oracle.com [mike.kravetz@oracle.com: update comment and changelog] Link: http://lkml.kernel.org/r/420bcfd6-158b-38e4-98da-26d0cd85bd01@oracle.com Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
commit 22163229 upstream. The prepare_fb call always happens on new_plane_state. The drm_atomic_helper_cleanup_planes checks to see if plane state pointer has changed when deciding to call cleanup_fb on either the new_plane_state or the old_plane_state. For a non-async atomic commit the state pointer is swapped, so this helper calls prepare_fb on the new_plane_state and cleanup_fb on the old_plane_state. This makes sense, since we want to prepare the framebuffer we are going to use and clean up the framebuffer we are no longer using. For the async atomic update helpers this differs. The async atomic update helpers perform in-place updates on the existing state. They call drm_atomic_helper_cleanup_planes but the state pointer is not swapped. This means that prepare_fb is called on the new_plane_state and cleanup_fb is called on the new_plane_state (not the old). In the case where old_plane_state->fb == new_plane_state->fb then there should be no behavioral difference between an async update and a non-async commit. But there are issues that arise when old_plane_state->fb != new_plane_state->fb. The first is that the new_plane_state->fb is immediately cleaned up after it has been prepared, so we're using a fb that we shouldn't be. The second occurs during a sequence of async atomic updates and non-async regular atomic commits. Suppose there are two framebuffers being interleaved in a double-buffering scenario, fb1 and fb2:
- Async update, oldfb = NULL, newfb = fb1, prepare fb1, cleanup fb1
- Async update, oldfb = fb1, newfb = fb2, prepare fb2, cleanup fb2
- Non-async commit, oldfb = fb2, newfb = fb1, prepare fb1, cleanup fb2
We call cleanup_fb on fb2 twice in this example scenario, and any further use will result in use-after-free. The simple fix to this problem is to block framebuffer changes in the drm_atomic_helper_async_check function for now.
v2: Move check by itself, add a FIXME (Daniel) Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Cc: <stable@vger.kernel.org> # v4.14+ Fixes: fef9df8b ("drm/atomic: initial support for asynchronous plane update") Signed-off-by:
Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by:
Andrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by:
Harry Wentland <harry.wentland@amd.com> Reviewed-by:
Daniel Vetter <daniel@ffwll.ch> Signed-off-by:
Harry Wentland <harry.wentland@amd.com> Link: https://patchwork.freedesktop.org/patch/275364/ Signed-off-by:
Dave Airlie <airlied@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
commit 0a1d5299 upstream. security_mmap_addr() does a capability check with current_cred(), but we can reach this code from contexts like a VFS write handler where current_cred() must not be used. This can be abused on systems without SMAP to make NULL pointer dereferences exploitable again. Fixes: 8869477a ("security: protect from stack expansion into low vm addresses") Cc: stable@kernel.org Signed-off-by:
Jann Horn <jannh@google.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
commit e30be063 upstream. Commit 18094430 ("mmc: sdhci-esdhc-imx: add ADMA Length Mismatch errata fix") introduced the fix for ERR004536, but the fix is incorrect. After double-checking with IC, bit 7 of register 0x6c needs to be cleared rather than set. Here is the definition of bit 7 of 0x6c:
0: enable the new IC fix for ERR004536
1: do not use the IC fix, keep the same as before
This issue was found on the i.MX845s-evk board with CMDQ enabled and the system under heavy load.
root@imx8mmevk:~# dd if=/dev/mmcblk2 of=/dev/null bs=1M &
root@imx8mmevk:~# memtester 1000M > /dev/zero &
root@imx8mmevk:~#
[ 139.897220] mmc2: cqhci: timeout for tag 16
[ 139.901417] mmc2: cqhci: ============ CQHCI REGISTER DUMP ===========
[ 139.907862] mmc2: cqhci: Caps: 0x0000310a | Version: 0x00000510
[ 139.914311] mmc2: cqhci: Config: 0x00001001 | Control: 0x00000000
[ 139.920753] mmc2: cqhci: Int stat: 0x00000000 | Int enab: 0x00000006
[ 139.927193] mmc2: cqhci: Int sig: 0x00000006 | Int Coal: 0x00000000
[ 139.933634] mmc2: cqhci: TDL base: 0x7809c000 | TDL up32: 0x00000000
[ 139.940073] mmc2: cqhci: Doorbell: 0x00030000 | TCN: 0x00000000
[ 139.946518] mmc2: cqhci: Dev queue: 0x00010000 | Dev Pend: 0x00010000
[ 139.952967] mmc2: cqhci: Task clr: 0x00000000 | SSC1: 0x00011000
[ 139.959411] mmc2: cqhci: SSC2: 0x00000001 | DCMD rsp: 0x00000000
[ 139.965857] mmc2: cqhci: RED mask: 0xfdf9a080 | TERRI: 0x00000000
[ 139.972308] mmc2: cqhci: Resp idx: 0x0000002e | Resp arg: 0x00000900
[ 139.978761] mmc2: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 139.985214] mmc2: sdhci: Sys addr: 0xb2c19000 | Version: 0x00000002
[ 139.991669] mmc2: sdhci: Blk size: 0x00000200 | Blk cnt: 0x00000400
[ 139.998127] mmc2: sdhci: Argument: 0x40110400 | Trn mode: 0x00000033
[ 140.004618] mmc2: sdhci: Present: 0x01088a8f | Host ctl: 0x00000030
[ 140.011113] mmc2: sdhci: Power: 0x00000002 | Blk gap: 0x00000080
[ 140.017583] mmc2: sdhci: Wake-up: 0x00000008 | Clock: 0x0000000f
[ 140.024039] mmc2: sdhci: Timeout: 0x0000008f | Int stat: 0x00000000
[ 140.030497] mmc2: sdhci: Int enab: 0x107f4000 | Sig enab: 0x107f4000
[ 140.036972] mmc2: sdhci: AC12 err: 0x00000000 | Slot int: 0x00000502
[ 140.043426] mmc2: sdhci: Caps: 0x07eb0000 | Caps_1: 0x8000b407
[ 140.049867] mmc2: sdhci: Cmd: 0x00002c1a | Max curr: 0x00ffffff
[ 140.056314] mmc2: sdhci: Resp[0]: 0x00000900 | Resp[1]: 0xffffffff
[ 140.062755] mmc2: sdhci: Resp[2]: 0x328f5903 | Resp[3]: 0x00d00f00
[ 140.069195] mmc2: sdhci: Host ctl2: 0x00000008
[ 140.073640] mmc2: sdhci: ADMA Err: 0x00000007 | ADMA Ptr: 0x7809c108
[ 140.080079] mmc2: sdhci: ============================================
[ 140.086662] mmc2: running CQE recovery
Fixes: 18094430 ("mmc: sdhci-esdhc-imx: add ADMA Length Mismatch errata fix") Signed-off-by:
Haibo Chen <haibo.chen@nxp.com> Cc: stable@vger.kernel.org Signed-off-by:
Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
commit d07e9fad upstream. Free up the allocated memory in the case of an error return. The value of mmc_host->cqe_enabled stays 'false', so cqhci_disable (mmc_cqe_ops->cqe_disable) won't be called to free the memory. Also, cqhci_disable() seems to be designed to disable and free all resources, which makes it unsuitable for handling this corner case. Fixes: a4080225 ("mmc: cqhci: support for command queue enabled host") Signed-off-by:
Alamy Liu <alamy.liu@gmail.com> Acked-by:
Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Signed-off-by:
Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
commit 27ec9dc1 upstream. Not enough space is allocated when DCMD is disabled. CQE_DCMD does not have to be enabled when CQE is enabled (software could halt CQE to send a command). When CQE_DCMD is not enabled, space still needs to be allocated for data transfer. For instance:
CQE_DCMD is enabled: 31 slots space (one slot used by DCMD)
CQE_DCMD is disabled: 32 slots space
Fixes: a4080225 ("mmc: cqhci: support for command queue enabled host") Signed-off-by:
Alamy Liu <alamy.liu@gmail.com> Acked-by:
Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Signed-off-by:
Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
commit e5723f95 upstream. In case of CQHCI, mrq->cmd may be NULL for data requests (non DCMD). In such case mmc_should_fail_request is directly dereferencing mrq->cmd while cmd is NULL. Fix this by checking for mrq->cmd pointer. Fixes: 72a5af55 ("mmc: core: Add support for handling CQE requests") Signed-off-by:
Ritesh Harjani <riteshh@codeaurora.org> Cc: stable@vger.kernel.org Signed-off-by:
Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
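The NULL check described in the mmc core commit above follows a common guard pattern. The sketch below uses hypothetical, simplified types (struct model_cmd, struct model_request) and a hypothetical helper, model_cmd_failed(); it does not reproduce mmc_should_fail_request() itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the command and request. */
struct model_cmd {
    int error;
};

struct model_request {
    struct model_cmd *cmd;   /* may be NULL for non-DCMD data requests */
};

/* Reports whether the command part of the request carries an error.
 * The pointer is tested first, so a request without a command is
 * treated as "no command error" instead of being dereferenced. */
static bool model_cmd_failed(const struct model_request *mrq)
{
    return mrq->cmd != NULL && mrq->cmd->error != 0;
}
```

Because && short-circuits, mrq->cmd->error is only read when mrq->cmd is non-NULL.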
-
commit 5603731a upstream. In R-Car Gen2 or later, the maximum number of transfer blocks was changed from 0xFFFF to 0xFFFFFFFF. Therefore, the Block Count Register should be written with iowrite32(). If another system (U-Boot, hypervisor OS, etc.) uses bit[31:16], this value will not be cleared, so SD/MMC card initialization fails. So, check for the bigger register and use the appropriate write. Also, mark the register as extended on Gen2. Signed-off-by:
Takeshi Saito <takeshi.saito.xv@renesas.com> [wsa: use max_blk_count in if(), add Gen2, update commit message] Signed-off-by:
Wolfram Sang <wsa+renesas@sang-engineering.com> Cc: stable@kernel.org Reviewed-by:
Simon Horman <horms+renesas@verge.net.au> [Ulf: Fixed build error] Signed-off-by:
Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
commit 5c27ff5d upstream. I have encountered an interrupt storm during the eMMC chip probing (and the chip finally didn't get detected). It turned out that U-Boot left the DMAC interrupts enabled while the Linux driver didn't use those. The SDHI driver's interrupt handler somehow assumes that, even if an SDIO interrupt didn't happen, it should return IRQ_HANDLED. I think that if none of the enabled interrupts happened and got handled, we should return IRQ_NONE -- that way the kernel IRQ code recognizes a spurious interrupt and masks it off pretty quickly... Fixes: 7729c7a2 ("mmc: tmio: Provide separate interrupt handlers") Signed-off-by:
Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Reviewed-by:
Wolfram Sang <wsa+renesas@sang-engineering.com> Tested-by:
Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by:
Simon Horman <horms+renesas@verge.net.au> Cc: stable@vger.kernel.org Signed-off-by:
Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
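The return-value rule argued for in the tmio commit above can be sketched as a small userspace model. The names model_irqreturn_t, MODEL_IRQ_NONE, MODEL_IRQ_HANDLED and model_irq_handler() are hypothetical; the real driver reads device registers and returns the kernel's irqreturn_t.

```c
#include <stdint.h>

/* Hypothetical model of the kernel's IRQ_NONE / IRQ_HANDLED values. */
typedef enum {
    MODEL_IRQ_NONE = 0,
    MODEL_IRQ_HANDLED = 1
} model_irqreturn_t;

/* status:       raw interrupt status bits reported by the device
 * enabled_mask: interrupt sources this driver actually enabled
 *
 * Only claim the interrupt when one of the enabled sources fired.
 * Anything else (for example a source left enabled by the bootloader)
 * is reported as not ours, so the core can detect a spurious IRQ. */
static model_irqreturn_t model_irq_handler(uint32_t status,
                                           uint32_t enabled_mask)
{
    uint32_t pending = status & enabled_mask;

    if (pending == 0)
        return MODEL_IRQ_NONE;

    /* A real handler would acknowledge and process 'pending' here. */
    return MODEL_IRQ_HANDLED;
}
```

A status bit outside enabled_mask, such as a leftover DMAC interrupt in the scenario from the commit message, makes this model return MODEL_IRQ_NONE instead of silently claiming the interrupt.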
-
commit c9bd505d upstream. When using the mmc_spi driver with a card-detect pin, I noticed that the card was not detected immediately after probe, but only after it was unplugged and plugged back in (and the CD IRQ fired). The call tree looks something like this:
mmc_spi_probe
  mmc_add_host
    mmc_start_host
      _mmc_detect_change
        mmc_schedule_delayed_work(&host->detect, 0)
          mmc_rescan
            host->bus_ops->detect(host)
              mmc_detect
                _mmc_detect_card_removed
                  host->ops->get_cd(host)
                    mmc_gpio_get_cd -> -ENOSYS (ctx->cd_gpio not set)
  mmc_gpiod_request_cd
    ctx->cd_gpio = desc
To fix this issue, call mmc_detect_change after the card-detect GPIO/IRQ is registered. Signed-off-by:
Jonathan Neuschäfer <j.neuschaefer@gmx.net> Reviewed-by:
Linus Walleij <linus.walleij@linaro.org> Cc: stable@vger.kernel.org Signed-off-by:
Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
[ Upstream commit 94a980c3 ] Fix a call to userspace_mem_region_find to conform to its spec of taking an inclusive, inclusive range. It was previously being called with an inclusive, exclusive range. Also remove a redundant region bounds check in vm_userspace_mem_region_add. Region overlap checking is already performed by the call to userspace_mem_region_find. Tested:
Compiled tools/testing/selftests/kvm with -static
Ran all resulting test binaries on an Intel Haswell test machine
All tests passed
Signed-off-by:
Ben Gardon <bgardon@google.com> Reviewed-by:
Jim Mattson <jmattson@google.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
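The inclusive versus exclusive range mix-up in the KVM selftest commit above can be illustrated with a generic overlap check. This is a sketch under assumptions: ranges_overlap_inclusive() and region_last_byte() are hypothetical helpers, not the selftest library's userspace_mem_region_find().

```c
#include <stdbool.h>
#include <stdint.h>

/* Overlap test for two ranges where both start and end are inclusive. */
static bool ranges_overlap_inclusive(uint64_t a_start, uint64_t a_end,
                                     uint64_t b_start, uint64_t b_end)
{
    return a_start <= b_end && b_start <= a_end;
}

/* Last byte covered by a region; this is the inclusive end that an
 * inclusive-range API expects. size must be non-zero. */
static uint64_t region_last_byte(uint64_t start, uint64_t size)
{
    return start + size - 1;
}
```

Passing start + size (an exclusive end) to an inclusive check reports a false overlap between two regions that are merely adjacent, while start + size - 1 does not.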
-