- Nov 29, 2021
-
-
Zheng Zucheng authored
hulk inclusion category: feature bugzilla: 51828, https://gitee.com/openeuler/kernel/issues/I4K96G CVE: NA -------------------------------- We introduce the idea of qos level to scheduler, which now is supported with different scheduler policies. The qos scheduler will change the policy of correlative tasks when the qos level of a task group is modified with cpu.qos_level cpu cgroup file. In this way we are able to satisfy different needs of tasks in different qos levels. Signed-off-by:
Zhang Qiao <zhangqiao22@huawei.com> Signed-off-by:
Zheng Zucheng <zhengzucheng@huawei.com> Reviewed-by:
Chen Hui <judy.chenhui@huawei.com> Reviewed-by:
Xiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
- Nov 27, 2021
-
-
Pavel Begunkov authored
mainline inclusion from mainline-v5.13-rc1 commit f70865db5ff35f5ed0c7e9ef63e7cca3d4947f04 category: bugfix bugzilla: 185739 CVE: NA ----------------------------------------------- Revert of revert of "io_uring: wait potential ->release() on resurrect", which adds a helper for resurrect not racing completion reinit, as was removed because of a strange bug with no clear root or link to the patch. Was improved, instead of rcu_synchronize(), just wait_for_completion() because we're at 0 refs and it will happen very shortly. Specifically use non-interruptible version to ignore all pending signals that may have ended prior interruptible wait. This reverts commit cb5e1b81304e089ee3ca948db4d29f71902eb575. Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/7a080c20f686d026efade810b116b72f88abaff9.1618101759.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> conflicts: fs/io_uring.c Signed-off-by:
Ye Bin <yebin10@huawei.com> Reviewed-by:
Zhang Yi <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Xiongfeng Wang authored
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4HYY4?from=project-issue CVE: NA ------------------------------------------------- When I hot added a CPU, I found 'cpufreq' directory is not created below /sys/devices/system/cpu/cpuX/. It is because get_cpu_device() failed in add_cpu_dev_symlink(). cpufreq_add_dev() is the .add_dev callback of a CPU subsys interface. It will be called when the CPU device registered into the system. The stack is as follows. register_cpu() ->device_register() ->device_add() ->bus_probe_device() ->cpufreq_add_dev() But only after the CPU device has been registered, we can get the CPU device by get_cpu_device(), otherwise it will return NULL. Since we already have the CPU device in cpufreq_add_dev(), pass it to add_cpu_dev_symlink(). Signed-off-by:
Xiongfeng Wang <wangxiongfeng2@huawei.com> Reviewed-by:
Keqian Zhu <zhukeqian1@huawei.com> Reviewed-by:
Hanjun Guo <guohanjun@huawei.com> Reviewed-by:
Cheng Jian <cj.chengjian@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Xiongfeng Wang authored
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4HYY4?from=project-issue CVE: NA ------------------------------------------------- Per-CPU variables cpc_desc_ptr are initialized in acpi_cppc_processor_probe() when the processor devices are present and added into the system. But when cpu_possible_mask and cpu_present_mask is not equal, only cpc_desc_ptr in cpu_present_mask are initialized, this will cause acpi_get_psd_map() failed in cppc_cpufreq_init(). To fix this issue, we parse the _PSD method for all possible CPUs to get the P-State topology and modify acpi_get_psd_map() to rely on this information. Signed-off-by:
Xiongfeng Wang <wangxiongfeng@huawei.com> Reviewed-by:
Keqian Zhu <zhukeqian1@huawei.com> Reviewed-by:
Hanjun Guo <guohanjun@huawei.com> Reviewed-by:
Cheng Jian <cj.chengjian@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
- Nov 24, 2021
-
-
Cheng Jian authored
hulk inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I3OX0C CVE: NA -------------------------------- We must ensure that the following four instructions are cache-aligned. Otherwise, it will cause problems with the performance of libMicro pread. 1: # uao_user_alternative 9f, str, sttr, xzr, x0, 8 str xzr, [x0], #8 nop subs x1, x1, #8 b.pl 1b with this patch: prc thr usecs/call samples errors cnt/samp size pread_z100 1 1 5.88400 807 0 1 102400 The result of pread can range from 5 to 9 depending on the alignment performance of this function. Signed-off-by:
Cheng Jian <cj.chengjian@huawei.com> Reviewed-by:
Xie XiuQi <xiexiuqi@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
- Nov 23, 2021
-
-
Daniel Vetter authored
mainline inclusion from mainline-v5.4-rc1 commit 75426367 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4JI7R CVE: NA -------------------------------- This completes Emil's series of removing DRM_UNLOCKED from modern drivers. It's entirely cargo-culted since we ignore it on non-DRIVER_LEGACY drivers since: commit ea487835 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Mon Sep 28 21:42:40 2015 +0200 drm: Enforce unlocked ioctl operation for kms driver ioctls Now justifying why we can do this for legacy drives too (and hence close the source of all the bogus copypasting) is a bit more involved. DRM_UNLOCKED was introduced in: commit ed8b6704 Author: Arnd Bergmann <arnd@arndb.de> Date: Wed Dec 16 22:17:09 2009 +0000 drm: convert drm_ioctl to unlocked_ioctl As a immediate hack to keep i810 happy, which would have deadlocked without this trickery. The old BKL is automatically dropped in schedule(), and hence the i810 vs. mmap_sem deadlock didn't actually cause a real deadlock. But with a mutex it would. The solution was to annotate these as DRM_UNLOCKED and mark i810 unsafe on SMP machines. This conversion caused a regression, because unlike the BKL a mutex isn't dropped over schedule (that thing again), which caused a vblank wait in one thread to block the entire desktop and all its apps. Back then we did vblank scheduling by blocking in the client, awesome isn't it. This was fixed quickly in (ok not so quickly, took 2 years): commit 8f4ff2b0 Author: Ilija Hadzic <ihadzic@research.bell-labs.com> Date: Mon Oct 31 17:46:18 2011 -0400 drm: do not sleep on vblank while holding a mutex All the other DRM_UNLOCKED annotations for all the core ioctls was work to reach finer-grained locking for modern drivers. This took years, and culminated in: commit fdd5b877 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Sat Dec 10 22:52:54 2016 +0100 drm: Enforce BKL-less ioctls for modern drivers DRM_UNLOCKED was never required by any legacy drivers, except for the vblank_wait IOCTL. Therefore we will not regress these old drivers by going back to where we've been in 2011. For all modern drivers nothing will change. To make this perfectly clear, also add a comment to DRM_UNLOCKED. v2: Don't forget about drm_ioc32.c (Michel). Cc: Michel Dänzer <michel@daenzer.net> Cc: Emil Velikov <emil.l.velikov@gmail.com> Acked-by:
Emil Velikov <emil.velikov@collabora.com> Acked-by:
Michel Dänzer <michel@daenzer.net> Signed-off-by:
Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190605120835.2798-1-daniel.vetter@ffwll.ch Signed-off-by:
Liu ZiXian <liuzixian4@huawei.com> Signed-off-by:
Cheng Jian <cj.chengjian@huawei.com> Reviewed-by:
wangxiongfeng 00379786 <wangxiongfeng2@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Yang Yingliang authored
hulk inclusion category: bugfix bugzilla: NA CVE: NA -------------------------------- Enable some configs for test. Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Reviewed-by:
Cheng Jian <cj.chengjian@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Zhang Jian authored
ascend inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4JICC CVE: NA ------------------------------------------------- When we access the process's sp_group file and the precess is a kernel process, it's task_struct->mm value will be 0, so we must check it, and make sure the process is not a kernel process. v1->v2: The path of a process as a kernel process is not often triggered, so add a unlikely function to accelerate execytion. Signed-off-by:
Zhang Jian <zhangjian210@huawei.com> Reviewed-by:
Ding Tianhong <dingtianhong@huawei.com> Reviewed-by:
Weilong Chen <chenweilong@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
- Nov 22, 2021
-
-
Jan Kara authored
stable inclusion from stable-5.10.51 commit 8cc58a6e2c394aa48aa05f600be7d279efbafcd7 bugzilla: 175263 Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=8cc58a6e2c394aa48aa05f600be7d279efbafcd7 -------------------------------- commit 11c7aa0ddea8611007768d3e6b58d45dc60a19e1 upstream. Commit 545fbd07 ("rq-qos: fix missed wake-ups in rq_qos_throttle") tried to fix a problem that a process could be sleeping in rq_qos_wait() without anyone to wake it up. However the fix is not complete and the following can still happen: CPU1 (waiter1) CPU2 (waiter2) CPU3 (waker) rq_qos_wait() rq_qos_wait() acquire_inflight_cb() -> fails acquire_inflight_cb() -> fails completes IOs, inflight decreased prepare_to_wait_exclusive() prepare_to_wait_exclusive() has_sleeper = !wq_has_single_sleeper() -> true as there are two sleepers has_sleeper = !wq_has_single_sleeper() -> true io_schedule() io_schedule() Deadlock as now there's nobody to wakeup the two waiters. The logic automatically blocking when there are already sleepers is really subtle and the only way to make it work reliably is that we check whether there are some waiters in the queue when adding ourselves there. That way, we are guaranteed that at least the first process to enter the wait queue will recheck the waiting condition before going to sleep and thus guarantee forward progress. Fixes: 545fbd07 ("rq-qos: fix missed wake-ups in rq_qos_throttle") CC: stable@vger.kernel.org Signed-off-by:
Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210607112613.25344-1-jack@suse.cz Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Lihong Kou <koulihong@huawei.com> Reviewed-by:
Hou Tao <houtao1@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Zekun Shen authored
mainline inclusion from mainline-v5.16-rc2 commit b922f622592af76b57cbc566eaeccda0b31a3496 category: bugfix bugzilla: NA CVE: CVE-2021-43975 ------------------------------------------------- This bug report shows up when running our research tools. The reports is SOOB read, but it seems SOOB write is also possible a few lines below. In details, fw.len and sw.len are inputs coming from io. A len over the size of self->rpc triggers SOOB. The patch fixes the bugs by adding sanity checks. The bugs are triggerable with compromised/malfunctioning devices. They are potentially exploitable given they first leak up to 0xffff bytes and able to overwrite the region later. The patch is tested with QEMU emulater. This is NOT tested with a real device. Attached is the log we found by fuzzing. BUG: KASAN: slab-out-of-bounds in hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic] Read of size 4 at addr ffff888016260b08 by task modprobe/213 CPU: 0 PID: 213 Comm: modprobe Not tainted 5.6.0 #1 Call Trace: dump_stack+0x76/0xa0 print_address_description.constprop.0+0x16/0x200 ? hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic] ? hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic] __kasan_report.cold+0x37/0x7c ? aq_hw_read_reg_bit+0x60/0x70 [atlantic] ? hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic] kasan_report+0xe/0x20 hw_atl_utils_fw_upload_dwords+0x393/0x3c0 [atlantic] hw_atl_utils_fw_rpc_call+0x95/0x130 [atlantic] hw_atl_utils_fw_rpc_wait+0x176/0x210 [atlantic] hw_atl_utils_mpi_create+0x229/0x2e0 [atlantic] ? hw_atl_utils_fw_rpc_wait+0x210/0x210 [atlantic] ? hw_atl_utils_initfw+0x9f/0x1c8 [atlantic] hw_atl_utils_initfw+0x12a/0x1c8 [atlantic] aq_nic_ndev_register+0x88/0x650 [atlantic] ? aq_nic_ndev_init+0x235/0x3c0 [atlantic] aq_pci_probe+0x731/0x9b0 [atlantic] ? aq_pci_func_init+0xc0/0xc0 [atlantic] local_pci_probe+0xd3/0x160 pci_device_probe+0x23f/0x3e0 Reported-by:
Brendan Dolan-Gavitt <brendandg@nyu.edu> Signed-off-by:
Zekun Shen <bruceshenzk@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Reviewed-by:
Xiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by:
Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
fengsheng authored
driver inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4IYX7?from=project-issue CVE: NA ------------------------------------------------------------ This driver is not in use. Remove it. Signed-off-by:
fengsheng <fengsheng5@huawei.com> Reviewed-by:
lidongming <lidongming5@huawei.com> Reviewed-by:
ouyang delong <ouyangdelong@huawei.com> Acked-by:
Xie XiuQi <xiexiuqi@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
fengsheng authored
driver inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4IYWW?from=project-issue CVE: NA ------------------------------------------------------------ This driver is not in use. Remove it. Signed-off-by:
fengsheng <fengsheng5@huawei.com> Reviewed-by:
lidongming <lidongming5@huawei.com> Reviewed-by:
ouyang delong <ouyangdelong@huawei.com> Acked-by:
Xie XiuQi <xiexiuqi@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
fengsheng authored
driver inclusion category: feature bugzilla: https://gitee.com/openeuler/kernel/issues/I4IYWE?from=project-issue CVE: NA ------------------------------------------------------------ This driver is not in use. Remove it. Signed-off-by:
fengsheng <fengsheng5@huawei.com> Reviewed-by:
lidongming <lidongming5@huawei.com> Reviewed-by:
ouyang delong <ouyangdelong@huawei.com> Acked-by:
Xie XiuQi <xiexiuqi@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
- Nov 18, 2021
-
-
Xu Jia authored
hulk inclusion category: bugfix bugzilla: 177871 CVE: NA ------------------------------------------------- The following warning is falsely reported since commit e2eea86c (ipv4: make exception cache less predictible): error: ‘oldest_p’ may be used uninitialized in this function [-Werror=maybe-uninitialized] *oldest_p = oldest->fnhe_next; ~~~~~~~~~~^~~~~~~~~~~~~~~~~~~ net/ipv4/route.c:602:44: note: ‘oldest_p’ was declared here struct fib_nh_exception __rcu **fnhe_p, **oldest_p; Fix and avoid the alarm. Signed-off-by:
Xu Jia <xujia39@huawei.com> Reviewed-by:
Yue Haibing <yuehaibing@huawei.com> Reviewed-by:
Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
zhenwei pi authored
stable inclusion from linux-4.19.207 commit aab312696d37de80502ca633b40184de24f22917 -------------------------------- commit f985911b7bc75d5c98ed24d8aaa8b94c590f7c6a upstream. Hit kernel warning like this, it can be reproduced by verifying 256 bytes datafile by keyctl command, run script: RAWDATA=rawdata SIGDATA=sigdata modprobe pkcs8_key_parser rm -rf *.der *.pem *.pfx rm -rf $RAWDATA dd if=/dev/random of=$RAWDATA bs=256 count=1 openssl req -nodes -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem \ -subj "/C=CN/ST=GD/L=SZ/O=vihoo/OU=dev/CN=xx.com/emailAddress=yy@xx.com" KEY_ID=`openssl pkcs8 -in key.pem -topk8 -nocrypt -outform DER | keyctl \ padd asymmetric 123 @s` keyctl pkey_sign $KEY_ID 0 $RAWDATA enc=pkcs1 hash=sha1 > $SIGDATA keyctl pkey_verify $KEY_ID 0 $RAWDATA $SIGDATA enc=pkcs1 hash=sha1 Then the kernel reports: WARNING: CPU: 5 PID: 344556 at crypto/rsa-pkcs1pad.c:540 pkcs1pad_verify+0x160/0x190 ... Call Trace: public_key_verify_signature+0x282/0x380 ? software_key_query+0x12d/0x180 ? keyctl_pkey_params_get+0xd6/0x130 asymmetric_key_verify_signature+0x66/0x80 keyctl_pkey_verify+0xa5/0x100 do_syscall_64+0x35/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae The reason of this issue, in function 'asymmetric_key_verify_signature': '.digest_size(u8) = params->in_len(u32)' leads overflow of an u8 value, so use u32 instead of u8 for digest_size field. And reorder struct public_key_signature, it saves 8 bytes on a 64-bit machine. Cc: stable@vger.kernel.org Signed-off-by:
zhenwei pi <pizhenwei@bytedance.com> Reviewed-by:
Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by:
Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Nikolay Aleksandrov authored
mainline inclusion from mainline-v5.6-rc4 commit 823d81b0 category: bugfix bugzilla: 185773 CVE: NA ------------------------------------------------- In br_dev_xmit() we perform vlan filtering in br_allowed_ingress() but if the packet has the vlan header inside (e.g. bridge with disabled tx-vlan-offload) then the vlan filtering code will use skb_vlan_untag() to extract the vid before filtering which in turn calls pskb_may_pull() and we may end up with a stale eth pointer. Moreover the cached eth header pointer will generally be wrong after that operation. Remove the eth header caching and just use eth_hdr() directly, the compiler does the right thing and calculates it only once so we don't lose anything. Fixes: 057658cb ("bridge: suppress arp pkts on BR_NEIGH_SUPPRESS ports") Signed-off-by:
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Huang Guobin <huangguobin4@huawei.com> Reviewed-by:
Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
- Nov 17, 2021
-
-
Peter Zijlstra authored
mainline inclusion from mainline-v5.8-rc1 commit 1c3e5d3f category: feature bugzilla: 175666 CVE: NA --------------------------- Currently entry_64_compat is exempt from objtool, but with vmlinux mode there is no hiding it. Make the following changes to make it pass: - change entry_SYSENTER_compat to STT_NOTYPE; it's not a function and doesn't have function type stack setup. - mark all STT_NOTYPE symbols with UNWIND_HINT_EMPTY; so we do validate them and don't treat them as unreachable. - don't abuse RSP as a temp register, this confuses objtool mightily as it (rightfully) thinks we're doing unspeakable things to the stack. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Reviewed-by:
Alexandre Chartre <alexandre.chartre@oracle.com> Acked-by:
Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/20200505134341.272248024@linutronix.de Signed-off-by:
Wang ShaoBo <bobo.shaobowang@huawei.com> Conflicts: arch/x86/entry/entry_64_compat.S [wangshaobo: change ENDPROC to END, avoid objtool skipping STT_FUNC type check] Reviewed-by:
Cheng Jian <cj.chengjian@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
- Nov 15, 2021
-
-
Pavel Begunkov authored
mainline inclusion from mainline-v5.13-rc2 commit 447c19f3b5074409c794b350b10306e1da1ef4ba category: bugfix bugzilla: 185736 CVE: NA ----------------------------------------------- Always remove linked timeout on io_link_timeout_fn() from the master request link list, otherwise we may get use-after-free when first io_link_timeout_fn() puts linked timeout in the fail path, and then will be found and put on master's free. Cc: stable@vger.kernel.org # 5.10+ Fixes: 90cd7e424969d ("io_uring: track link timeout's master explicitly") Reported-and-tested-by:
<syzbot+5a864149dd970b546223@syzkaller.appspotmail.com> Signed-off-by:
Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/69c46bf6ce37fec4fdcd98f0882e18eb07ce693a.1620990121.git.asml.silence@gmail.com Signed-off-by:
Jens Axboe <axboe@kernel.dk> conflicts: fs/io_uring.c Signed-off-by:
Ye Bin <yebin10@huawei.com> Reviewed-by:
Zhang Yi <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Zheng Zengkai authored
phytium inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I41AUQ -------------------------------------- Disabling CONFIG_ARCH_PHYTIUM results in following compile errors: drivers/iommu/arm-smmu.c: In function ‘phytium_smmu_def_domain_type’: drivers/iommu/arm-smmu.c:1641:6: error: implicit declaration of function ‘typeof_ft2000plus’ [-Werror=implicit-function-declaration] 1641 | if (typeof_ft2000plus() || typeof_s2500()) { | ^~~~~~~~~~~~~~~~~ drivers/iommu/arm-smmu.c:1641:29: error: implicit declaration of function ‘typeof_s2500’ [-Werror=implicit-function-declaration] 1641 | if (typeof_ft2000plus() || typeof_s2500()) { | ^~~~~~~~~~~~ cc1: some warnings being treated as errors Fix it by using CONFIG_ARCH_PHYTIUM to control phytium related code. Signed-off-by:
Zheng Zengkai <zhengzengkai@huawei.com> Reviewed-by:
Zhen Lei <thunder.leizhen@huawei.com> Reviewed-by:
Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
- Nov 12, 2021
-
-
Yu'an Wang authored
driver inclusion category: Feature bugzilla: NA CVE: NA In this patch, delete several invalid define and api: 1. sq_head in hisi_qp_status is not used for any judge, as well as qm_sq_head_update 2. crypto hisilicon just support async logic in kernel driver, so hisi_qp_wait logic is abandoned 3. CONFIG_CRYPTO_QM_UACCE seems redundant, so we delete it Signed-off-by:
Yu'an Wang <wangyuan46@huawei.com> Reviewed-by:
Weili Qian <qianweili@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Yu'an Wang authored
driver inclusion category: Bugfix bugzilla: NA CVE: NA Set the flag CRYPTO_TFM_REQ_MAY_BACKLOG in the crypto driver, which can limit task process Signed-off-by:
Yu'an Wang <wangyuan46@huawei.com> Reviewed-by:
Longfang Liu <liulongfang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Eric Dumazet authored
mainline inclusion from mainline-v5.4-rc2 commit 3256a2d6 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4AFRJ?from=project-issue CVE: NA ------------------------------------------------------------ The cited commit exposed an old retransmits_timed_out() bug which assumed it could call tcp_model_timeout() with TCP_RTO_MIN as rto_base for all states. But flows in SYN_SENT or SYN_RECV state uses a different RTO base (1 sec instead of 200 ms, unless BPF choses another value) This caused a reduction of SYN retransmits from 6 to 4 with the default /proc/sys/net/ipv4/tcp_syn_retries value. Fixes: a41e8a88 ("tcp: better handle TCP_USER_TIMEOUT in SYN_SENT state") Signed-off-by:
Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Marek Majkowski <marek@cloudflare.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Conflicts: net/ipv4/tcp_timer.c Signed-off-by: Jiazhenyuan <jiazhenyuan@uniontech.com> #openEuler_contributor Signed-off-by:
Laibin Qiu <qiulaibin@huawei.com> Reviewed-by:
Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Yuchung Cheng authored
mainline inclusion from mainline-v5.1-rc1 commit 01a523b0 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4AFRJ?from=project-issue CVE: NA ------------------------------------------------------------ Create a helper to model TCP exponential backoff for the next patch. This is pure refactor w no behavior change. Signed-off-by:
Yuchung Cheng <ycheng@google.com> Signed-off-by:
Eric Dumazet <edumazet@google.com> Reviewed-by:
Neal Cardwell <ncardwell@google.com> Reviewed-by:
Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Conflicts: net/ipv4/tcp_timer.c Signed-off-by: Jiazhenyuan <jiazhenyuan@uniontech.com> #openEuler_contributor Signed-off-by:
Laibin Qiu <qiulaibin@huawei.com> Reviewed-by:
Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Yuchung Cheng authored
mainline inclusion from mainline-v5.1-rc1 commit 7ae18975 category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I4AFRJ?from=project-issue CVE: NA ------------------------------------------------------------ Previously TCP socket's retrans_stamp is not set if the retransmission has failed to send. As a result if a socket is experiencing local issues to retransmit packets, determining when to abort a socket is complicated w/o knowning the starting time of the recovery since retrans_stamp may remain zero. This complication causes sub-optimal behavior that TCP may use the latest, instead of the first, retransmission time to compute the elapsed time of a stalling connection due to local issues. Then TCP may disrecard TCP retries settings and keep retrying until it finally succeed: not a good idea when the local host is already strained. The simple fix is to always timestamp the start of a recovery. It's worth noting that retrans_stamp is also used to compare echo timestamp values to detect spurious recovery. This patch does not break that because retrans_stamp is still later than when the original packet was sent. Signed-off-by:
Yuchung Cheng <ycheng@google.com> Signed-off-by:
Eric Dumazet <edumazet@google.com> Reviewed-by:
Neal Cardwell <ncardwell@google.com> Reviewed-by:
Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Conflicts: net/ipv4/tcp_timer.c Signed-off-by: Jiazhenyuan <jiazhenyuan@uniontech> #openEuler_contributor Signed-off-by:
Laibin Qiu <qiulaibin@huawei.com> Reviewed-by:
Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Pavel Skripkin authored
stable inclusion from linux-4.19.208 commit a94a60ef75d6cc2ac4c0d07cd043316e7e8b5b3b -------------------------------- commit 2d186afd04d669fe9c48b994c41a7405a3c9f16d upstream. Syzbot reported shift-out-of-bounds bug in profile_init(). The problem was in incorrect prof_shift. Since prof_shift value comes from userspace we need to clamp this value into [0, BITS_PER_LONG -1] boundaries. Second possible shiht-out-of-bounds was found by Tetsuo: sample_step local variable in read_profile() had "unsigned int" type, but prof_shift allows to make a BITS_PER_LONG shift. So, to prevent possible shiht-out-of-bounds sample_step type was changed to "unsigned long". Also, "unsigned short int" will be sufficient for storing [0, BITS_PER_LONG] value, that's why there is no need for "unsigned long" prof_shift. Link: https://lkml.kernel.org/r/20210813140022.5011-1-paskripkin@gmail.com Fixes: 1da177e4 ("Linux-2.6.12-rc2") Reported-and-tested-by:
<syzbot+e68c89a9510c159d9684@syzkaller.appspotmail.com> Suggested-by:
Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by:
Pavel Skripkin <paskripkin@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Cyrill Gorcunov authored
stable inclusion from linux-4.19.208 commit 6a96bac8ba0a5ab9c9af1ed1c77478e19ae1f0f6 -------------------------------- commit e1fbbd073137a9d63279f6bf363151a938347640 upstream. Keno Fischer reported that when a binray loaded via ld-linux-x the prctl(PR_SET_MM_MAP) doesn't allow to setup brk value because it lays before mm:end_data. For example a test program shows | # ~/t | | start_code 401000 | end_code 401a15 | start_stack 7ffce4577dd0 | start_data 403e10 | end_data 40408c | start_brk b5b000 | sbrk(0) b5b000 and when executed via ld-linux | # /lib64/ld-linux-x86-64.so.2 ~/t | | start_code 7fc25b0a4000 | end_code 7fc25b0c4524 | start_stack 7fffcc6b2400 | start_data 7fc25b0ce4c0 | end_data 7fc25b0cff98 | start_brk 55555710c000 | sbrk(0) 55555710c000 This of course prevent criu from restoring such programs. Looking into how kernel operates with brk/start_brk inside brk() syscall I don't see any problem if we allow to setup brk/start_brk without checking for end_data. Even if someone pass some weird address here on a purpose then the worst possible result will be an unexpected unmapping of existing vma (own vma, since prctl works with the callers memory) but test for RLIMIT_DATA is still valid and a user won't be able to gain more memory in case of expanding VMAs via new values shipped with prctl call. Link: https://lkml.kernel.org/r/20210121221207.GB2174@grain Fixes: bbdc6076 ("binfmt_elf: move brk out of mmap when doing direct loader exec") Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com> Reported-by:
Keno Fischer <keno@juliacomputing.com> Acked-by:
Andrey Vagin <avagin@gmail.com> Tested-by:
Andrey Vagin <avagin@gmail.com> Cc: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Kirill Tkhai <ktkhai@virtuozzo.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Cc: Alexander Mikhalitsyn <alexander.mikhalitsyn@virtuozzo.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Andy Shevchenko authored
stable inclusion from linux-4.19.208 commit 523559507138ca4abcf4c2522c0061071c1d60a0 -------------------------------- commit 67db87dc8284070adb15b3c02c1c31d5cf51c5d6 upstream. Currently the CRST parsing relies on the fact that on most of x86 devices the IRQ mapping is 1:1 with Linux vIRQ. However, it may be not true for some. Fix this by converting GSI to Linux vIRQ before checking it. Fixes: ee8209fd ("dma: acpi-dma: parse CSRT to extract additional resources") Signed-off-by:
Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20210730202715.24375-1-andriy.shevchenko@linux.intel.com Signed-off-by:
Vinod Koul <vkoul@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Li Huafei authored
stable inclusion from linux-4.19.208 commit 6cfbbb961bb94de85455fe35140b1350c7ccb76c -------------------------------- The commit 960434acef37 ("tracing/kprobe: Fix to support kretprobe events on unloaded modules") backport from v5.11, which modifies the return value of kprobe_on_func_entry(). However, there is no adaptation modification in create_trace_kprobe(), resulting in the exact opposite behavior. Now we need to return an error immediately only if kprobe_on_func_entry() returns -EINVAL. Fixes: 960434acef37 ("tracing/kprobe: Fix to support kretprobe events on unloaded modules") Signed-off-by:
Li Huafei <lihuafei1@huawei.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Neeraj Upadhyay authored
stable inclusion from linux-4.19.208 commit 3226fb90cf5dc89611f742f122a33d4598076ad5 -------------------------------- commit fd6bc19d upstream. Tasks waiting within exp_funnel_lock() for an expedited grace period to elapse can be starved due to the following sequence of events: 1. Tasks A and B both attempt to start an expedited grace period at about the same time. This grace period will have completed when the lower four bits of the rcu_state structure's ->expedited_sequence field are 0b'0100', for example, when the initial value of this counter is zero. Task A wins, and thus does the actual work of starting the grace period, including acquiring the rcu_state structure's .exp_mutex and sets the counter to 0b'0001'. 2. Because task B lost the race to start the grace period, it waits on ->expedited_sequence to reach 0b'0100' inside of exp_funnel_lock(). This task therefore blocks on the rcu_node structure's ->exp_wq[1] field, keeping in mind that the end-of-grace-period value of ->expedited_sequence (0b'0100') is shifted down two bits before indexing the ->exp_wq[] field. 3. Task C attempts to start another expedited grace period, but blocks on ->exp_mutex, which is still held by Task A. 4. The aforementioned expedited grace period completes, so that ->expedited_sequence now has the value 0b'0100'. A kworker task therefore acquires the rcu_state structure's ->exp_wake_mutex and starts awakening any tasks waiting for this grace period. 5. One of the first tasks awakened happens to be Task A. Task A therefore releases the rcu_state structure's ->exp_mutex, which allows Task C to start the next expedited grace period, which causes the lower four bits of the rcu_state structure's ->expedited_sequence field to become 0b'0101'. 6. Task C's expedited grace period completes, so that the lower four bits of the rcu_state structure's ->expedited_sequence field now become 0b'1000'. 7. The kworker task from step 4 above continues its wakeups. Unfortunately, the wake_up_all() refetches the rcu_state structure's .expedited_sequence field: wake_up_all(&rnp->exp_wq[rcu_seq_ctr(rcu_state.expedited_sequence) & 0x3]); This results in the wakeup being applied to the rcu_node structure's ->exp_wq[2] field, which is unfortunate given that Task B is instead waiting on ->exp_wq[1]. On a busy system, no harm is done (or at least no permanent harm is done). Some later expedited grace period will redo the wakeup. But on a quiet system, such as many embedded systems, it might be a good long time before there was another expedited grace period. On such embedded systems, this situation could therefore result in a system hang. This issue manifested as DPM device timeout during suspend (which usually qualifies as a quiet time) due to a SCSI device being stuck in _synchronize_rcu_expedited(), with the following stack trace: schedule() synchronize_rcu_expedited() synchronize_rcu() scsi_device_quiesce() scsi_bus_suspend() dpm_run_callback() __device_suspend() This commit therefore prevents such delays, timeouts, and hangs by making rcu_exp_wait_wake() use its "s" argument consistently instead of refetching from rcu_state.expedited_sequence. Fixes: 3b5f668e ("rcu: Overlap wakeups with next expedited grace period") Signed-off-by:
Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by:
Paul E. McKenney <paulmck@kernel.org> Signed-off-by:
David Chen <david.chen@nutanix.com> Acked-by:
Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Benjamin Hesmans authored
stable inclusion from linux-4.19.207 commit d6efada330af09253b0f81a0d836cee02192bd4f -------------------------------- [ Upstream commit 730affed24bffcd1eebd5903171960f5ff9f1f22 ] Bug reported by KASAN: BUG: KASAN: use-after-scope in inet6_ehashfn (net/ipv6/inet6_hashtables.c:40) Call Trace: (...) inet6_ehashfn (net/ipv6/inet6_hashtables.c:40) (...) nf_sk_lookup_slow_v6 (net/ipv6/netfilter/nf_socket_ipv6.c:91 net/ipv6/netfilter/nf_socket_ipv6.c:146) It seems that this bug has already been fixed by Eric Dumazet in the past in: commit 78296c97 ("netfilter: xt_socket: fix a stack corruption bug") But a variant of the same issue has been introduced in commit d64d80a2 ("netfilter: x_tables: don't extract flow keys on early demuxed sks in socket match") `daddr` and `saddr` potentially hold a reference to ipv6_var that is no longer in scope when the call to `nf_socket_get_sock_v6` is made. Fixes: d64d80a2 ("netfilter: x_tables: don't extract flow keys on early demuxed sks in socket match") Acked-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Benjamin Hesmans <benjamin.hesmans@tessares.net> Reviewed-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Andy Shevchenko authored
stable inclusion from linux-4.19.207 commit 6bdadfff347e42b6da70a9c77bb443479781c1f3 -------------------------------- [ Upstream commit 817f9916a6e96ae43acdd4e75459ef4f92d96eb1 ] The CONFIG_PCI=y case got a new parameter long time ago. Sync the stub as well. [bhelgaas: add parameter names] Fixes: 725522b5 ("PCI: add the sysfs driver name to all modules") Link: https://lore.kernel.org/r/20210813153619.89574-1-andriy.shevchenko@linux.intel.com Reported-by:
kernel test robot <lkp@intel.com> Signed-off-by:
Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by:
Bjorn Helgaas <bhelgaas@google.com> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Dan Carpenter authored
stable inclusion from linux-4.19.207 commit 1a091bfd11e61032b6192cf2a1ebb259889f28b3 -------------------------------- [ Upstream commit 7eb6ea4148579b85540a41d57bcec315b8af8ff8 ] pci_dev_str_match_path() is often called with a spinlock held so the allocation has to be atomic. The call tree is: pci_specified_resource_alignment() <-- takes spin_lock(); pci_dev_str_match() pci_dev_str_match_path() Fixes: 45db3370 ("PCI: Allow specifying devices using a base bus and path of devfns") Link: https://lore.kernel.org/r/20210812070004.GC31863@kili Signed-off-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Bjorn Helgaas <bhelgaas@google.com> Reviewed-by:
Logan Gunthorpe <logang@deltatee.com> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Paolo Valente authored
stable inclusion from linux-4.19.207 commit 2c1b1848357dc69f62ce3630b850e6680b87854b -------------------------------- [ Upstream commit 2d52c58b9c9bdae0ca3df6a1eab5745ab3f7d80b ] The function bfq_setup_merge prepares the merging between two bfq_queues, say bfqq and new_bfqq. To this goal, it assigns bfqq->new_bfqq = new_bfqq. Then, each time some I/O for bfqq arrives, the process that generated that I/O is disassociated from bfqq and associated with new_bfqq (merging is actually a redirection). In this respect, bfq_setup_merge increases new_bfqq->ref in advance, adding the number of processes that are expected to be associated with new_bfqq. Unfortunately, the stable-merging mechanism interferes with this setup. After bfqq->new_bfqq has been set by bfq_setup_merge, and before all the expected processes have been associated with bfqq->new_bfqq, bfqq may happen to be stably merged with a different queue than the current bfqq->new_bfqq. In this case, bfqq->new_bfqq gets changed. So, some of the processes that have been already accounted for in the ref counter of the previous new_bfqq will not be associated with that queue. This creates an unbalance, because those references will never be decremented. This commit fixes this issue by reestablishing the previous, natural behaviour: once bfqq->new_bfqq has been set, it will not be changed until all expected redirections have occurred. Signed-off-by:
Davide Zini <davidezini2@gmail.com> Signed-off-by:
Paolo Valente <paolo.valente@linaro.org> Link: https://lore.kernel.org/r/20210802141352.74353-2-paolo.valente@linaro.org Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
David Hildenbrand authored
stable inclusion from linux-4.19.207 commit c48402015e02901ff8b1fac5c112b02546364bb6 -------------------------------- commit 7cf209ba8a86410939a24cb1aeb279479a7e0ca6 upstream. Patch series "mm/memory_hotplug: preparatory patches for new online policy and memory" These are all cleanups and one fix previously sent as part of [1]: [PATCH v1 00/12] mm/memory_hotplug: "auto-movable" online policy and memory groups. These patches make sense even without the other series, therefore I pulled them out to make the other series easier to digest. [1] https://lkml.kernel.org/r/20210607195430.48228-1-david@redhat.com This patch (of 4): Checkpatch complained on a follow-up patch that we are using "unsigned" here, which defaults to "unsigned int" and checkpatch is correct. As we will search for a fitting zone using the wrong pfn, we might end up onlining memory to one of the special kernel zones, such as ZONE_DMA, which can end badly as the onlined memory does not satisfy properties of these zones. Use "unsigned long" instead, just as we do in other places when handling PFNs. This can bite us once we have physical addresses in the range of multiple TB. Link: https://lkml.kernel.org/r/20210712124052.26491-2-david@redhat.com Fixes: e5e68930 ("mm, memory_hotplug: display allowed zones in the preferred ordering") Signed-off-by:
David Hildenbrand <david@redhat.com> Reviewed-by:
Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by:
Muchun Song <songmuchun@bytedance.com> Reviewed-by:
Oscar Salvador <osalvador@suse.de> Cc: David Hildenbrand <david@redhat.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com> Cc: Wei Yang <richard.weiyang@linux.alibaba.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mike Rapoport <rppt@kernel.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Len Brown <lenb@kernel.org> Cc: Pavel Tatashin <pasha.tatashin@soleen.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: virtualization@lists.linux-foundation.org Cc: Andy Lutomirski <luto@kernel.org> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> Cc: Anton Blanchard <anton@ozlabs.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Baoquan He <bhe@redhat.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christophe Leroy <christophe.leroy@c-s.fr> Cc: Dave Jiang <dave.jiang@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jia He <justin.he@arm.com> Cc: Joe Perches <joe@perches.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Laurent Dufour <ldufour@linux.ibm.com> Cc: Michel Lespinasse <michel@lespinasse.org> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pierre Morel <pmorel@linux.ibm.com> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Cc: Rich Felker <dalias@libc.org> Cc: Scott Cheloha <cheloha@linux.ibm.com> Cc: Sergei Trofimovich <slyfox@gentoo.org> Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
David Hildenbrand <david@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
zhenggy authored
stable inclusion from linux-4.19.207 commit dfefcc46354530c3ec1d12db0e16c740548c6229 -------------------------------- commit 4f884f3962767877d7aabbc1ec124d2c307a4257 upstream. Commit 10d3be56 ("tcp-tso: do not split TSO packets at retransmit time") may directly retrans a multiple segments TSO/GSO packet without split, Since this commit, we can no longer assume that a retransmitted packet is a single segment. This patch fixes the tp->undo_retrans accounting in tcp_sacktag_one() that use the actual segments(pcount) of the retransmitted packet. Before that commit (10d3be56), the assumption underlying the tp->undo_retrans-- seems correct. Fixes: 10d3be56 ("tcp-tso: do not split TSO packets at retransmit time") Signed-off-by:
zhenggy <zhenggy@chinatelecom.cn> Reviewed-by:
Eric Dumazet <edumazet@google.com> Acked-by:
Yuchung Cheng <ycheng@google.com> Acked-by:
Neal Cardwell <ncardwell@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Eric Dumazet authored
stable inclusion from linux-4.19.207 commit 44ba281510190e2915506016407ba5b374a3add2 -------------------------------- commit 04f08eb44b5011493d77b602fdec29ff0f5c6cd5 upstream. syzbot reported another data-race in af_unix [1] Lets change __skb_insert() to use WRITE_ONCE() when changing skb head qlen. Also, change unix_dgram_poll() to use lockless version of unix_recvq_full() It is verry possible we can switch all/most unix_recvq_full() to the lockless version, this will be done in a future kernel version. [1] HEAD commit: 8596e589b787732c8346f0482919e83cc9362db1 BUG: KCSAN: data-race in skb_queue_tail / unix_dgram_poll write to 0xffff88814eeb24e0 of 4 bytes by task 25815 on cpu 0: __skb_insert include/linux/skbuff.h:1938 [inline] __skb_queue_before include/linux/skbuff.h:2043 [inline] __skb_queue_tail include/linux/skbuff.h:2076 [inline] skb_queue_tail+0x80/0xa0 net/core/skbuff.c:3264 unix_dgram_sendmsg+0xff2/0x1600 net/unix/af_unix.c:1850 sock_sendmsg_nosec net/socket.c:703 [inline] sock_sendmsg net/socket.c:723 [inline] ____sys_sendmsg+0x360/0x4d0 net/socket.c:2392 ___sys_sendmsg net/socket.c:2446 [inline] __sys_sendmmsg+0x315/0x4b0 net/socket.c:2532 __do_sys_sendmmsg net/socket.c:2561 [inline] __se_sys_sendmmsg net/socket.c:2558 [inline] __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2558 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae read to 0xffff88814eeb24e0 of 4 bytes by task 25834 on cpu 1: skb_queue_len include/linux/skbuff.h:1869 [inline] unix_recvq_full net/unix/af_unix.c:194 [inline] unix_dgram_poll+0x2bc/0x3e0 net/unix/af_unix.c:2777 sock_poll+0x23e/0x260 net/socket.c:1288 vfs_poll include/linux/poll.h:90 [inline] ep_item_poll fs/eventpoll.c:846 [inline] ep_send_events fs/eventpoll.c:1683 [inline] ep_poll fs/eventpoll.c:1798 [inline] do_epoll_wait+0x6ad/0xf00 fs/eventpoll.c:2226 __do_sys_epoll_wait fs/eventpoll.c:2238 [inline] __se_sys_epoll_wait fs/eventpoll.c:2233 [inline] __x64_sys_epoll_wait+0xf6/0x120 fs/eventpoll.c:2233 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0x0000001b -> 0x00000001 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 25834 Comm: syz-executor.1 Tainted: G W 5.14.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: 86b18aaa ("skbuff: fix a data race in skb_queue_len()") Cc: Qian Cai <cai@lca.pw> Signed-off-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Baptiste Lepers authored
stable inclusion from linux-4.19.207 commit c09a84aea0d3902f955cc6504e1c25cddb7c48c2 -------------------------------- commit b89a05b21f46150ac10a962aa50109250b56b03b upstream. In perf_event_addr_filters_apply, the task associated with the event (event->ctx->task) is read using READ_ONCE at the beginning of the function, checked, and then re-read from event->ctx->task, voiding all guarantees of the checks. Reuse the value that was read by READ_ONCE to ensure the consistency of the task struct throughout the function. Fixes: 375637bc ("perf/core: Introduce address range filtering") Signed-off-by:
Baptiste Lepers <baptiste.lepers@gmail.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20210906015310.12802-1-baptiste.lepers@gmail.com Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Mike Rapoport authored
stable inclusion from linux-4.19.207 commit 2717db72f74c8e51068801b6327559241c54b86e -------------------------------- commit 34b1999da935a33be6239226bfa6cd4f704c5c88 upstream. Jiri Olsa reported a fault when running: # cat /proc/kallsyms | grep ksys_read ffffffff8136d580 T ksys_read # objdump -d --start-address=0xffffffff8136d580 --stop-address=0xffffffff8136d590 /proc/kcore /proc/kcore: file format elf64-x86-64 Segmentation fault general protection fault, probably for non-canonical address 0xf887ffcbff000: 0000 [#1] SMP PTI CPU: 12 PID: 1079 Comm: objdump Not tainted 5.14.0-rc5qemu+ #508 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-4.fc34 04/01/2014 RIP: 0010:kern_addr_valid Call Trace: read_kcore ? rcu_read_lock_sched_held ? rcu_read_lock_sched_held ? rcu_read_lock_sched_held ? trace_hardirqs_on ? rcu_read_lock_sched_held ? lock_acquire ? lock_acquire ? rcu_read_lock_sched_held ? lock_acquire ? rcu_read_lock_sched_held ? rcu_read_lock_sched_held ? rcu_read_lock_sched_held ? lock_release ? _raw_spin_unlock ? __handle_mm_fault ? rcu_read_lock_sched_held ? lock_acquire ? rcu_read_lock_sched_held ? lock_release proc_reg_read ? vfs_read vfs_read ksys_read do_syscall_64 entry_SYSCALL_64_after_hwframe The fault happens because kern_addr_valid() dereferences existent but not present PMD in the high kernel mappings. Such PMDs are created when free_kernel_image_pages() frees regions larger than 2Mb. In this case, a part of the freed memory is mapped with PMDs and the set_memory_np_noalias() -> ... -> __change_page_attr() sequence will mark the PMD as not present rather than wipe it completely. Have kern_addr_valid() check whether higher level page table entries are present before trying to dereference them to fix this issue and to avoid similar issues in the future. Stable backporting note: ------------------------ Note that the stable marking is for all active stable branches because there could be cases where pagetable entries exist but are not valid - see 9a14aefc ("x86: cpa, fix lookup_address"), for example. So make sure to be on the safe side here and use pXY_present() accessors rather than pXY_none() which could #GP when accessing pages in the direct map. Also see: c40a56a7 ("x86/mm/init: Remove freed kernel image areas from alias mapping") for more info. Reported-by:
Jiri Olsa <jolsa@redhat.com> Signed-off-by:
Mike Rapoport <rppt@linux.ibm.com> Signed-off-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
David Hildenbrand <david@redhat.com> Acked-by:
Dave Hansen <dave.hansen@intel.com> Tested-by:
Jiri Olsa <jolsa@redhat.com> Cc: <stable@vger.kernel.org> # 4.4+ Link: https://lkml.kernel.org/r/20210819132717.19358-1-rppt@kernel.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Mark Brown authored
stable inclusion from linux-4.19.207 commit aadf3115f86c6df4308e2734b5c39eadbe9573e0 -------------------------------- commit e35ac9d0b56e9efefaeeb84b635ea26c2839ea86 upstream. When we need a buffer for SVE register state we call sve_alloc() to make sure that one is there. In order to avoid repeated allocations and frees we keep the buffer around unless we change vector length and just memset() it to ensure a clean register state. The function that deals with this takes the task to operate on as an argument, however in the case where we do a memset() we initialise using the SVE state size for the current task rather than the task passed as an argument. This is only an issue in the case where we are setting the register state for a task via ptrace and the task being configured has a different vector length to the task tracing it. In the case where the buffer is larger in the traced process we will leak old state from the traced process to itself, in the case where the buffer is smaller in the traced process we will overflow the buffer and corrupt memory. Fixes: bc0ee476 ("arm64/sve: Core task context handling") Cc: <stable@vger.kernel.org> # 4.15.x Signed-off-by:
Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20210909165356.10675-1-broonie@kernel.org Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Liu Zixian authored
stable inclusion from linux-4.19.207 commit 2fed7f8eda3211190a27caddd6ba8fd728f7b17b -------------------------------- commit 13db8c50477d83ad3e3b9b0ae247e5cd833a7ae4 upstream. After fork, the child process will get incorrect (2x) hugetlb_usage. If a process uses 5 2MB hugetlb pages in an anonymous mapping, HugetlbPages: 10240 kB and then forks, the child will show, HugetlbPages: 20480 kB The reason for double the amount is because hugetlb_usage will be copied from the parent and then increased when we copy page tables from parent to child. Child will have 2x actual usage. Fix this by adding hugetlb_count_init in mm_init. Link: https://lkml.kernel.org/r/20210826071742.877-1-liuzixian4@huawei.com Fixes: 5d317b2b ("mm: hugetlb: proc: add HugetlbPages field to /proc/PID/status") Signed-off-by:
Liu Zixian <liuzixian4@huawei.com> Reviewed-by:
Naoya Horiguchi <naoya.horiguchi@nec.com> Reviewed-by:
Mike Kravetz <mike.kravetz@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-