- Sep 22, 2020
-
Ding Tianhong authored
ascend inclusion category: feature bugzilla: NA CVE: NA ------------------------------------------------- mem_sleep_current defaults to PM_SUSPEND_TO_IDLE, which causes the system to hang if no wake-up device is registered. Set it to PM_SUSPEND_ON instead to prevent the system from entering an endless loop. Signed-off-by:
Ding Tianhong <dingtianhong@huawei.com> Reviewed-by:
Hanjun Guo <guohanjun@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Qian Cai authored
mainline inclusion from mainline-v5.8-rc1 commit d6c1f098 category: bugfix bugzilla: 35806 CVE: NA ------------------------------------------------- "prev_offset" is a static variable in swapin_nr_pages() that can be accessed concurrently with only mmap_sem held in read mode as noticed by KCSAN, BUG: KCSAN: data-race in swap_cluster_readahead / swap_cluster_readahead write to 0xffffffff92763830 of 8 bytes by task 14795 on cpu 17: swap_cluster_readahead+0x2a6/0x5e0 swapin_readahead+0x92/0x8dc do_swap_page+0x49b/0xf20 __handle_mm_fault+0xcfb/0xd70 handle_mm_fault+0xfc/0x2f0 do_page_fault+0x263/0x715 page_fault+0x34/0x40 1 lock held by (dnf)/14795: #0: ffff897bd2e98858 (&mm->mmap_sem#2){++++}-{3:3}, at: do_page_fault+0x143/0x715 do_user_addr_fault at arch/x86/mm/fault.c:1405 (inlined by) do_page_fault at arch/x86/mm/fault.c:1535 irq event stamp: 83493 count_memcg_event_mm+0x1a6/0x270 count_memcg_event_mm+0x119/0x270 __do_softirq+0x365/0x589 irq_exit+0xa2/0xc0 read to 0xffffffff92763830 of 8 bytes by task 1 on cpu 22: swap_cluster_readahead+0xfd/0x5e0 swapin_readahead+0x92/0x8dc do_swap_page+0x49b/0xf20 __handle_mm_fault+0xcfb/0xd70 handle_mm_fault+0xfc/0x2f0 do_page_fault+0x263/0x715 page_fault+0x34/0x40 1 lock held by systemd/1: #0: ffff897c38f14858 (&mm->mmap_sem#2){++++}-{3:3}, at: do_page_fault+0x143/0x715 irq event stamp: 43530289 count_memcg_event_mm+0x1a6/0x270 count_memcg_event_mm+0x119/0x270 __do_softirq+0x365/0x589 irq_exit+0xa2/0xc0 Signed-off-by:
Qian Cai <cai@lca.pw> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Cc: Marco Elver <elver@google.com> Cc: Hugh Dickins <hughd@google.com> Link: http://lkml.kernel.org/r/20200402213748.2237-1-cai@lca.pw Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Liu Shixin <liushixin2@huawei.com> Reviewed-by:
Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Xiongfeng Wang authored
hulk inclusion category: bugfix bugzilla: NA CVE: NA --------------------------- An ILP32 application is a compat application, but its syscall numbers differ from those of a traditional compat a32 application: they are the same as the LP64 syscall numbers. The secure computing mode 1 syscall check therefore needs to be fixed for ILP32. Signed-off-by:
Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by:
Yury Norov <ynorov@caviumnetworks.com> Reviewed-by:
Hanjun Guo <guohanjun@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
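The check described in the entry above can be pictured with a small userspace model. This is only a sketch, not the kernel's C code: the `abi` labels and the function name are hypothetical, and the syscall numbers are the arm64 (LP64) and 32-bit ARM values for the four calls that strict mode permits.

```python
# Userspace model only; the kernel's real check is written in C and is not
# shown in this log. The ABI labels and function name are hypothetical.
MODE1_SYSCALLS_LP64 = frozenset({63, 64, 93, 139})  # read, write, exit, rt_sigreturn
MODE1_SYSCALLS_A32 = frozenset({3, 4, 1, 119})      # read, write, exit, sigreturn


def mode1_allowed(abi: str, syscall_nr: int) -> bool:
    """Return True if seccomp mode 1 would allow this syscall for the ABI.

    `abi` is one of "lp64", "ilp32" or "a32" (labels invented for this sketch).
    """
    if abi == "a32":
        allowed = MODE1_SYSCALLS_A32
    elif abi in ("lp64", "ilp32"):
        # ILP32 is a compat ABI but shares the LP64 syscall numbering,
        # so it must be checked against the LP64 table.
        allowed = MODE1_SYSCALLS_LP64
    else:
        raise ValueError(f"unknown ABI: {abi}")
    return syscall_nr in allowed
```

Treating every compat task as a32 would reject `write` (64) from an ILP32 task, which is the failure mode the entry describes.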
-
Jeremy Sowden authored
stable inclusion from linux-4.19.119 commit be2b6b4a22013ce7016388b276a6a9b147bb0a24 -------------------------------- commit 01ce31c5 upstream. Removed info log-message if ipip tunnel registration fails during module-initialization: it adds nothing to the error message that is written on all failures. Fixes: dd9ee344 ("vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel") Signed-off-by:
Jeremy Sowden <jeremy@azazel.net> Signed-off-by:
Steffen Klassert <steffen.klassert@secunet.com> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Waiman Long authored
stable inclusion from linux-4.19.118 commit 18779eac17b5df9dff57138d07631573553a41d4 -------------------------------- commit d3ec10aa upstream. A lockdep circular locking dependency report was seen when running a keyutils test: [12537.027242] ====================================================== [12537.059309] WARNING: possible circular locking dependency detected [12537.088148] 4.18.0-147.7.1.el8_1.x86_64+debug #1 Tainted: G OE --------- - - [12537.125253] ------------------------------------------------------ [12537.153189] keyctl/25598 is trying to acquire lock: [12537.175087] 000000007c39f96c (&mm->mmap_sem){++++}, at: __might_fault+0xc4/0x1b0 [12537.208365] [12537.208365] but task is already holding lock: [12537.234507] 000000003de5b58d (&type->lock_class){++++}, at: keyctl_read_key+0x15a/0x220 [12537.270476] [12537.270476] which lock already depends on the new lock. [12537.270476] [12537.307209] [12537.307209] the existing dependency chain (in reverse order) is: [12537.340754] [12537.340754] -> #3 (&type->lock_class){++++}: [12537.367434] down_write+0x4d/0x110 [12537.385202] __key_link_begin+0x87/0x280 [12537.405232] request_key_and_link+0x483/0xf70 [12537.427221] request_key+0x3c/0x80 [12537.444839] dns_query+0x1db/0x5a5 [dns_resolver] [12537.468445] dns_resolve_server_name_to_ip+0x1e1/0x4d0 [cifs] [12537.496731] cifs_reconnect+0xe04/0x2500 [cifs] [12537.519418] cifs_readv_from_socket+0x461/0x690 [cifs] [12537.546263] cifs_read_from_socket+0xa0/0xe0 [cifs] [12537.573551] cifs_demultiplex_thread+0x311/0x2db0 [cifs] [12537.601045] kthread+0x30c/0x3d0 [12537.617906] ret_from_fork+0x3a/0x50 [12537.636225] [12537.636225] -> #2 (root_key_user.cons_lock){+.+.}: [12537.664525] __mutex_lock+0x105/0x11f0 [12537.683734] request_key_and_link+0x35a/0xf70 [12537.705640] request_key+0x3c/0x80 [12537.723304] dns_query+0x1db/0x5a5 [dns_resolver] [12537.746773] dns_resolve_server_name_to_ip+0x1e1/0x4d0 [cifs] [12537.775607] cifs_reconnect+0xe04/0x2500 [cifs] [12537.798322] 
cifs_readv_from_socket+0x461/0x690 [cifs] [12537.823369] cifs_read_from_socket+0xa0/0xe0 [cifs] [12537.847262] cifs_demultiplex_thread+0x311/0x2db0 [cifs] [12537.873477] kthread+0x30c/0x3d0 [12537.890281] ret_from_fork+0x3a/0x50 [12537.908649] [12537.908649] -> #1 (&tcp_ses->srv_mutex){+.+.}: [12537.935225] __mutex_lock+0x105/0x11f0 [12537.954450] cifs_call_async+0x102/0x7f0 [cifs] [12537.977250] smb2_async_readv+0x6c3/0xc90 [cifs] [12538.000659] cifs_readpages+0x120a/0x1e50 [cifs] [12538.023920] read_pages+0xf5/0x560 [12538.041583] __do_page_cache_readahead+0x41d/0x4b0 [12538.067047] ondemand_readahead+0x44c/0xc10 [12538.092069] filemap_fault+0xec1/0x1830 [12538.111637] __do_fault+0x82/0x260 [12538.129216] do_fault+0x419/0xfb0 [12538.146390] __handle_mm_fault+0x862/0xdf0 [12538.167408] handle_mm_fault+0x154/0x550 [12538.187401] __do_page_fault+0x42f/0xa60 [12538.207395] do_page_fault+0x38/0x5e0 [12538.225777] page_fault+0x1e/0x30 [12538.243010] [12538.243010] -> #0 (&mm->mmap_sem){++++}: [12538.267875] lock_acquire+0x14c/0x420 [12538.286848] __might_fault+0x119/0x1b0 [12538.306006] keyring_read_iterator+0x7e/0x170 [12538.327936] assoc_array_subtree_iterate+0x97/0x280 [12538.352154] keyring_read+0xe9/0x110 [12538.370558] keyctl_read_key+0x1b9/0x220 [12538.391470] do_syscall_64+0xa5/0x4b0 [12538.410511] entry_SYSCALL_64_after_hwframe+0x6a/0xdf [12538.435535] [12538.435535] other info that might help us debug this: [12538.435535] [12538.472829] Chain exists of: [12538.472829] &mm->mmap_sem --> root_key_user.cons_lock --> &type->lock_class [12538.472829] [12538.524820] Possible unsafe locking scenario: [12538.524820] [12538.551431] CPU0 CPU1 [12538.572654] ---- ---- [12538.595865] lock(&type->lock_class); [12538.613737] lock(root_key_user.cons_lock); [12538.644234] lock(&type->lock_class); [12538.672410] lock(&mm->mmap_sem); [12538.687758] [12538.687758] *** DEADLOCK *** [12538.687758] [12538.714455] 1 lock held by keyctl/25598: [12538.732097] #0: 000000003de5b58d 
(&type->lock_class){++++}, at: keyctl_read_key+0x15a/0x220 [12538.770573] [12538.770573] stack backtrace: [12538.790136] CPU: 2 PID: 25598 Comm: keyctl Kdump: loaded Tainted: G [12538.844855] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 12/27/2015 [12538.881963] Call Trace: [12538.892897] dump_stack+0x9a/0xf0 [12538.907908] print_circular_bug.isra.25.cold.50+0x1bc/0x279 [12538.932891] ? save_trace+0xd6/0x250 [12538.948979] check_prev_add.constprop.32+0xc36/0x14f0 [12538.971643] ? keyring_compare_object+0x104/0x190 [12538.992738] ? check_usage+0x550/0x550 [12539.009845] ? sched_clock+0x5/0x10 [12539.025484] ? sched_clock_cpu+0x18/0x1e0 [12539.043555] __lock_acquire+0x1f12/0x38d0 [12539.061551] ? trace_hardirqs_on+0x10/0x10 [12539.080554] lock_acquire+0x14c/0x420 [12539.100330] ? __might_fault+0xc4/0x1b0 [12539.119079] __might_fault+0x119/0x1b0 [12539.135869] ? __might_fault+0xc4/0x1b0 [12539.153234] keyring_read_iterator+0x7e/0x170 [12539.172787] ? keyring_read+0x110/0x110 [12539.190059] assoc_array_subtree_iterate+0x97/0x280 [12539.211526] keyring_read+0xe9/0x110 [12539.227561] ? keyring_gc_check_iterator+0xc0/0xc0 [12539.249076] keyctl_read_key+0x1b9/0x220 [12539.266660] do_syscall_64+0xa5/0x4b0 [12539.283091] entry_SYSCALL_64_after_hwframe+0x6a/0xdf One way to prevent this deadlock scenario from happening is to not allow writing to userspace while holding the key semaphore. Instead, an internal buffer is allocated for getting the keys out from the read method first before copying them out to userspace without holding the lock. That requires taking out the __user modifier from all the relevant read methods as well as additional changes to not use any userspace write helpers. That is, 1) The put_user() call is replaced by a direct copy. 2) The copy_to_user() call is replaced by memcpy(). 3) All the fault handling code is removed. 
When compiled on an x86-64 system, the rxrpc_read() function shrinks from 3795 bytes to 2384 bytes with this patch. Fixes: ^1da177e4 ("Linux-2.6.12-rc2") Reviewed-by:
Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by:
Waiman Long <longman@redhat.com> Signed-off-by:
David Howells <dhowells@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
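The two-phase copy described in the entry above (read into an internal buffer under the lock, then copy out with the lock released) can be sketched in userspace as follows. The `Key` class, `read_into` and `read_key` are hypothetical stand-ins, and a `threading.Lock` plays the role of the key semaphore; none of this is the kernel's actual code.

```python
import threading


class Key:
    """Toy stand-in for a kernel key; not the kernel's struct key."""

    def __init__(self, payload: bytes):
        self.sem = threading.Lock()  # models the key semaphore
        self._payload = payload

    def read_into(self, buf: bytearray) -> int:
        """Type-specific read method: a plain copy into a kernel-side buffer."""
        n = min(len(buf), len(self._payload))
        buf[:n] = self._payload[:n]
        return n


def read_key(key: Key, user_buf: bytearray) -> int:
    """Copy the payload out in two phases so no user write happens under the lock."""
    kernel_buf = bytearray(len(user_buf))
    with key.sem:
        # Phase 1: only kernel memory is touched while the lock is held.
        n = key.read_into(kernel_buf)
    # Phase 2: the lock is released; a fault on the "user" buffer can no
    # longer create a dependency between the page-fault path and the key lock.
    user_buf[:n] = kernel_buf[:n]
    return n
```

The cost is one extra buffer and copy per read; in exchange the read methods no longer need any user-access helpers or fault handling.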
-
Pablo Neira Ayuso authored
stable inclusion from linux-4.19.118 commit 79f784c999bc43c55125432b791c6f3821b5995f -------------------------------- commit d9583cdf upstream. EINVAL should be used for malformed netlink messages. A new userspace utility running on an old kernel can easily get EINVAL when exercising new set features, which is misleading. Fixes: 8aeff920 ("netfilter: nf_tables: add stateful object reference to set elements") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
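The distinction drawn in the entry above can be illustrated with a small model. The message does not name the replacement error code; this sketch assumes EOPNOTSUPP for a well-formed request that asks for an unknown feature. The flag mask and function name are hypothetical and are not the real nf_tables values.

```python
import errno

# Hypothetical flag bits for illustration; not the real nf_tables values.
KNOWN_SET_FLAGS = 0x1 | 0x2 | 0x4


def check_set_flags(flags):
    """Return 0 on success or a negative errno, kernel style."""
    if not isinstance(flags, int) or flags < 0:
        # Malformed request: the message itself is broken.
        return -errno.EINVAL
    if flags & ~KNOWN_SET_FLAGS:
        # Well-formed request for a feature this kernel does not know about
        # (assumed error code, see the lead-in).
        return -errno.EOPNOTSUPP
    return 0
```

With separate codes, a newer userspace tool can tell "this kernel is too old" apart from "my request was wrong".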
-
Konstantin Khlebnikov authored
stable inclusion from linux-4.19.117 commit f7379c0050d2bfb65e44b340f1d667254dcc3058 -------------------------------- [ Upstream commit a4837980 ] For HZ < 1000, the 2000us timeout rounds up to 1 jiffy but expires at an unpredictable time, because the next timer interrupt can arrive shortly after the softirq starts. For the commonly used CONFIG_HZ=1000 nothing changes. Fixes: 7acf8a1e ("Replace 2 jiffies with sysctl netdev_budget_usecs to enable softirq tuning") Reported-by:
Dmitry Yakunin <zeil@yandex-team.ru> Signed-off-by:
Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
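The rounding described in the entry above is easy to check by hand. The sketch below models a round-up conversion from microseconds to jiffies and compares a fixed 2000us budget with a budget scaled to two jiffies. The scaling formula is an assumption based on the "2 jiffies" wording in the Fixes: tag, since the log does not show the diff, and both helper names are hypothetical.

```python
USEC_PER_SEC = 1_000_000


def usecs_to_jiffies(usecs: int, hz: int) -> int:
    """Round a microsecond interval up to whole jiffies (ticks of 1/hz s)."""
    return (usecs * hz + USEC_PER_SEC - 1) // USEC_PER_SEC


def two_jiffies_in_usecs(hz: int) -> int:
    """Assumed fix: a budget that spans two jiffies at any tick rate."""
    return 2 * USEC_PER_SEC // hz


for hz in (100, 250, 1000):
    fixed = usecs_to_jiffies(2000, hz)
    scaled = usecs_to_jiffies(two_jiffies_in_usecs(hz), hz)
    print(f"HZ={hz}: 2000us -> {fixed} jiffies, scaled budget -> {scaled} jiffies")
```

A one-jiffy deadline set partway through a tick can expire almost immediately or almost a full tick later, which is the unpredictable behaviour the entry reports. A two-jiffy deadline always leaves at least one full tick.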
-
Tim Stallard authored
stable inclusion
from linux-4.19.117
commit 8fdf8a84ea68fff914f137169e20aba95978a7af

--------------------------------

[ Upstream commit 03e2a984 ]

The behaviour for what is considered an anycast address changed in commit 45e4fd26 ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception"). This now considers the first address in a subnet where there is a route via a gateway to be an anycast address.

This breaks path MTU discovery and traceroutes when a host in a remote network uses the address at the start of a prefix (eg 2600:: advertised as 2600::/48 in the DFZ) as ICMP errors will not be sent to anycast addresses.

This patch excludes any routes with a gateway, or via point to point links, like the behaviour previously from rt6_is_gw_or_nonexthop in net/ipv6/route.c.

This can be tested with:

    ip link add v1 type veth peer name v2
    ip netns add test
    ip netns exec test ip link set lo up
    ip link set v2 netns test
    ip link set v1 up
    ip netns exec test ip link set v2 up
    ip addr add 2001:db8::1/64 dev v1 nodad
    ip addr add 2001:db8:100:: dev lo nodad
    ip netns exec test ip addr add 2001:db8::2/64 dev v2 nodad
    ip netns exec test ip route add unreachable 2001:db8:1::1
    ip netns exec test ip route add 2001:db8:100::/64 via 2001:db8::1
    ip netns exec test sysctl net.ipv6.conf.all.forwarding=1
    ip route add 2001:db8:1::1 via 2001:db8::2
    ping -I 2001:db8::1 2001:db8:1::1 -c1
    ping -I 2001:db8:100:: 2001:db8:1::1 -c1
    ip addr delete 2001:db8:100:: dev lo
    ip netns delete test

Currently the first ping will get back a destination unreachable ICMP error, but the second will never get a response, with "icmp6_send: acast source" logged. After this patch, both get destination unreachable ICMP replies.

Fixes: 45e4fd26 ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception")
Signed-off-by:
Tim Stallard <code@timstallard.me.uk> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Taras Chornyi authored
stable inclusion
from linux-4.19.117
commit 80dd8146df680b8982b659341b8ecd3361f032ca

--------------------------------

[ Upstream commit 690cc863 ]

When CONFIG_IP_MULTICAST is not set, the kernel crashes when a multicast IP address is added to the device with the autojoin flag, or when a multicast IP address is deleted.

Steps to reproduce:

    ip addr add 224.0.0.0/32 dev eth0
    ip addr del 224.0.0.0/32 dev eth0

or

    ip addr add 224.0.0.0/32 dev eth0 autojoin

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000088
 pc : _raw_write_lock_irqsave+0x1e0/0x2ac
 lr : lock_sock_nested+0x1c/0x60
 Call trace:
  _raw_write_lock_irqsave+0x1e0/0x2ac
  lock_sock_nested+0x1c/0x60
  ip_mc_config.isra.28+0x50/0xe0
  inet_rtm_deladdr+0x1a8/0x1f0
  rtnetlink_rcv_msg+0x120/0x350
  netlink_rcv_skb+0x58/0x120
  rtnetlink_rcv+0x14/0x20
  netlink_unicast+0x1b8/0x270
  netlink_sendmsg+0x1a0/0x3b0
  ____sys_sendmsg+0x248/0x290
  ___sys_sendmsg+0x80/0xc0
  __sys_sendmsg+0x68/0xc0
  __arm64_sys_sendmsg+0x20/0x30
  el0_svc_common.constprop.2+0x88/0x150
  do_el0_svc+0x20/0x80
  el0_sync_handler+0x118/0x190
  el0_sync+0x140/0x180

Fixes: 93a714d6 ("multicast: Extend ip address command to enable multicast group join/leave on")
Signed-off-by:
Taras Chornyi <taras.chornyi@plvision.eu> Signed-off-by:
Vadym Kochan <vadym.kochan@plvision.eu> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Petr Machata authored
stable inclusion from linux-4.19.115 commit b12448912c5e3c38f5baa58fb1f8912a1926a542 -------------------------------- [ Upstream commit ccfc5693 ] The handler for FLOW_ACTION_VLAN_MANGLE ends by returning whatever the lower-level function that it calls returns. If there are more actions lined up after this action, those are never offloaded. Fix by only bailing out when the called function returns an error. Fixes: a150201a ("mlxsw: spectrum: Add support for vlan modify TC action") Signed-off-by:
Petr Machata <petrm@mellanox.com> Reviewed-by:
Jiri Pirko <jiri@mellanox.com> Signed-off-by:
Ido Schimmel <idosch@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
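The control-flow bug in the entry above (returning the callee's result unconditionally from inside a loop over actions) can be shown with a short model. The function, the action names and the `bail_only_on_error` switch are hypothetical and exist only to contrast the two behaviours; this is not the driver's code.

```python
def offload_actions(actions, handler, bail_only_on_error=True):
    """Walk a list of actions and return (err, offloaded_actions).

    With bail_only_on_error=False the vlan-mangle branch returns whatever the
    handler returned, even on success, which models the reported bug.
    """
    offloaded = []
    for act in actions:
        err = handler(act)
        if act == "vlan_mangle" and not bail_only_on_error:
            # Old behaviour: leave the loop regardless of the result.
            if err == 0:
                offloaded.append(act)
            return err, offloaded
        if err:
            # Fixed behaviour: stop only when the handler reports an error.
            return err, offloaded
        offloaded.append(act)
    return 0, offloaded
```

For `["vlan_mangle", "drop"]` with a handler that always succeeds, the old path offloads only the first action, while the fixed path offloads both.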
-
Jarod Wilson authored
stable inclusion from linux-4.19.115 commit 7a5f4bd6868cc21ea9d4471265d662f7c487c3fc -------------------------------- [ Upstream commit 744fdc82 ] Bonding slave and team port devices should not have link-local addresses automatically added to them, as it can interfere with openvswitch being able to properly add tc ingress. Basic reproducer, courtesy of Marcelo: $ ip link add name bond0 type bond $ ip link set dev ens2f0np0 master bond0 $ ip link set dev ens2f1np2 master bond0 $ ip link set dev bond0 up $ ip a s 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: ens2f0np0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP group default qlen 1000 link/ether 00:0f:53:2f:ea:40 brd ff:ff:ff:ff:ff:ff 5: ens2f1np2: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq master bond0 state DOWN group default qlen 1000 link/ether 00:0f:53:2f:ea:40 brd ff:ff:ff:ff:ff:ff 11: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link/ether 00:0f:53:2f:ea:40 brd ff:ff:ff:ff:ff:ff inet6 fe80::20f:53ff:fe2f:ea40/64 scope link valid_lft forever preferred_lft forever (above trimmed to relevant entries, obviously) $ sysctl net.ipv6.conf.ens2f0np0.addr_gen_mode=0 net.ipv6.conf.ens2f0np0.addr_gen_mode = 0 $ sysctl net.ipv6.conf.ens2f1np2.addr_gen_mode=0 net.ipv6.conf.ens2f1np2.addr_gen_mode = 0 $ ip a l ens2f0np0 2: ens2f0np0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP group default qlen 1000 link/ether 00:0f:53:2f:ea:40 brd ff:ff:ff:ff:ff:ff inet6 fe80::20f:53ff:fe2f:ea40/64 scope link tentative valid_lft forever preferred_lft forever $ ip a l ens2f1np2 5: ens2f1np2: <NO-CARRIER,BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc mq master 
bond0 state DOWN group default qlen 1000 link/ether 00:0f:53:2f:ea:40 brd ff:ff:ff:ff:ff:ff inet6 fe80::20f:53ff:fe2f:ea40/64 scope link tentative valid_lft forever preferred_lft forever Looks like addrconf_sysctl_addr_gen_mode() bypasses the original "is this a slave interface?" check added by commit c2edacf8, and results in an address getting added, while w/the proposed patch added, no address gets added. This simply adds the same gating check to another code path, and thus should prevent the same devices from erroneously obtaining an ipv6 link-local address. Fixes: d35a00b8 ("net/ipv6: allow sysctl to change link-local address generation mode") Reported-by:
Moshe Levi <moshele@mellanox.com> CC: Stephen Hemminger <stephen@networkplumber.org> CC: Marcelo Ricardo Leitner <mleitner@redhat.com> CC: netdev@vger.kernel.org Signed-off-by:
Jarod Wilson <jarod@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Amritha Nambiar authored
stable inclusion from linux-4.19.115 commit b1cb7f2bc9b4f776ae1ab9583802b1bca34d215a -------------------------------- commit 6e11d157 upstream. Fixes the lower and upper bounds when there are multiple TCs and traffic is on the same TC on the same device. The lower bound is represented by 'qoffset' and the upper limit for hash value is 'qcount + qoffset'. This gives a clean Rx to Tx queue mapping when there are multiple TCs, as the queue indices for upper TCs will be offset by 'qoffset'. v2: Fixed commit description based on comments. Fixes: 1b837d48 ("net: Revoke export for __skb_tx_hash, update it to just be static skb_tx_hash") Fixes: eadec877 ("net: Add support for subordinate traffic classes to netdev_pick_tx") Signed-off-by:
Amritha Nambiar <amritha.nambiar@intel.com> Reviewed-by:
Alexander Duyck <alexander.h.duyck@linux.intel.com> Reviewed-by:
Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
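The bounds in the entry above say that a queue index for a traffic class must land in the half-open range from 'qoffset' to 'qcount + qoffset'. A minimal sketch of such a mapping follows; the function name is hypothetical and the exact folding steps are an assumption, since the log describes the bounds but not the code.

```python
def map_rx_to_tx_queue(rx_queue: int, qoffset: int, qcount: int) -> int:
    """Fold a recorded Rx queue index into [qoffset, qoffset + qcount)."""
    if qcount <= 0:
        raise ValueError("qcount must be positive")
    idx = rx_queue
    if idx >= qoffset:
        idx -= qoffset  # make the index relative to the TC's first queue
    idx %= qcount       # wrap into the number of queues the TC owns
    return idx + qoffset  # shift back into the TC's absolute range
```

For a TC that owns queues 4 to 7 (`qoffset=4`, `qcount=4`), Rx queue 5 maps to Tx queue 5, and an out-of-range index such as 9 is wrapped back into the same TC rather than spilling into another one.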
-
Marcelo Ricardo Leitner authored
stable inclusion from linux-4.19.115 commit e2ed7b117f3fe6aa0237568dcb69ed7f39cb4979 -------------------------------- [ Upstream commit 582eea23 ] Under certain circumstances, depending on the order of addresses on the interfaces, it could be that sctp_v[46]_get_dst() would return a dst with a mismatched struct flowi. For example, when walking through the bind addresses, if the first one is not a match, it saves the dst as a fallback (added in 410f0383), but not the flowi. Then if the next one is also not a match, the previous dst will be returned but with the flowi information for the 2nd address, which is wrong. The fix is to use a locally stored flowi that can be used for such attempts, and copy it to the parameter only in case it is a possible match, together with the corresponding dst entry. The patch updates IPv6 code mostly just to be in sync. Even though the issue is also present there, the fallback is not expected to work with IPv6. Fixes: 410f0383 ("sctp: add routing output fallback") Reported-by:
Jin Meng <meng.a.jin@nokia-sbell.com> Signed-off-by:
Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Tested-by:
Xin Long <lucien.xin@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Conflicts: net/sctp/ipv6.c [yyl: adjust context] Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Reviewed-by:
Wenan Mao <maowenan@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Qiujun Huang authored
stable inclusion
from linux-4.19.115
commit 6ce6aea362d46781d4f5f03cfda16f0a395445d2

--------------------------------

[ Upstream commit 5c3e82fe ]

We should iterate over the datamsgs to move all chunks(skbs) to newsk.

The following case causes the bug:
for the trouble SKB, it was in outq->transmitted list

sctp_outq_sack
    sctp_check_transmitted
        SKB was moved to outq->sacked list
    then throw away the sack queue
        SKB was deleted from outq->sacked
(but it was held by datamsg at sctp_datamsg_to_asoc
So, sctp_wfree was not called here)

then migrate happened

    sctp_for_each_tx_datachunk(
    sctp_clear_owner_w);
    sctp_assoc_migrate();
    sctp_for_each_tx_datachunk(
    sctp_set_owner_w);
SKB was not in the outq, and was not changed to newsk

finally

__sctp_outq_teardown
    sctp_chunk_put (for another skb)
        sctp_datamsg_put
            __kfree_skb(msg->frag_list)
                sctp_wfree (for SKB)
    SKB->sk was still oldsk (skb->sk != asoc->base.sk).

Reported-and-tested-by:
<syzbot+cea71eec5d6de256d54d@syzkaller.appspotmail.com> Signed-off-by:
Qiujun Huang <hqjagain@gmail.com> Acked-by:
Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
William Dauchy authored
stable inclusion from linux-4.19.115 commit 48dee02237117c0758410fa4989ce71bdb6cf184 -------------------------------- [ Upstream commit 25629fda ] When creating a new ipip interface with no local/remote configuration, the lookup is done with the TUNNEL_NO_KEY flag, making it impossible to match the new interface (the only possible match being the fallback or metadata case interface); e.g: `ip link add tunl1 type ipip dev eth0` To fix this case, add a flag check before the key comparison so that an interface with no local/remote config can be matched; this also avoids breaking possible userland tools relying on the TUNNEL_NO_KEY flag and an uninitialised key. For context, on my side I'm creating an extra ipip interface attached to the physical one, and moving it to a dedicated namespace. Fixes: c5441932 ("GRE: Refactor GRE tunneling code.") Signed-off-by:
William Dauchy <w.dauchy@criteo.com> Signed-off-by:
Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Qian Cai authored
stable inclusion from linux-4.19.115 commit 6f2239a1ad0c965d9faeb5d175f8c6c163b4fa57 -------------------------------- [ Upstream commit fbe4e0c1 ] fib_triestat_seq_show() calls hlist_for_each_entry_rcu(tb, head, tb_hlist) without rcu_read_lock(), which triggers a warning: net/ipv4/fib_trie.c:2579 RCU-list traversed in non-reader section!! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by proc01/115277: #0: c0000014507acf00 (&p->lock){+.+.}-{3:3}, at: seq_read+0x58/0x670 Call Trace: dump_stack+0xf4/0x164 (unreliable) lockdep_rcu_suspicious+0x140/0x164 fib_triestat_seq_show+0x750/0x880 seq_read+0x1a0/0x670 proc_reg_read+0x10c/0x1b0 __vfs_read+0x3c/0x70 vfs_read+0xac/0x170 ksys_read+0x7c/0x140 system_call+0x5c/0x68 Fix it by adding a pair of rcu_read_lock/unlock() calls and using cond_resched_rcu() to avoid the situation where walking a large number of items prevents scheduling for a long time. Signed-off-by:
Qian Cai <cai@lca.pw> Reviewed-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Torsten Hilbrich authored
stable inclusion from linux-4.19.114 commit 7df44c92854964ff5540756dd47507908e4e63cf -------------------------------- commit 2a9de3af upstream. The vti6_rcv function performs some tests on the retrieved tunnel, including checking the IP protocol, the XFRM input policy, and the source and destination addresses. In all but one place the skb is released in the error case. When the input policy check fails, the network packet is leaked. Use the same goto label, discard, in this case to fix the problem. Fixes: ed1efb2a ("ipv6: Add support for IPsec virtual tunnel interfaces") Signed-off-by:
Torsten Hilbrich <torsten.hilbrich@secunet.com> Reviewed-by:
Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by:
Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Pablo Neira Ayuso authored
stable inclusion from linux-4.19.114 commit 24c290b811945102e2c0e51cfe4b9efea9ae49d4 -------------------------------- commit 76a109fa upstream. Make sure the forward action is only used from ingress. Fixes: 39e6dea2 ("netfilter: nf_tables: add forward expression to the netdev family") Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Haishuang Yan authored
stable inclusion from linux-4.19.114 commit 113df2c58a723b6e30b3f0b7b5bf1dee16d177db -------------------------------- commit 41e9ec5a upstream. Since pskb_may_pull may change skb->data, we need to reload ip{v6}h at the right place. Fixes: a908fdec ("netfilter: nf_flow_table: move ipv6 offload hook code to nf_flow_table") Fixes: 7d208687 ("netfilter: nf_flow_table: move ipv4 offload hook code to nf_flow_table") Signed-off-by:
Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
YueHaibing authored
stable inclusion
from linux-4.19.114
commit 7ad217a824f7fab1e8534a6dfa82899ae1900bcb

--------------------------------

commit 4c59406e upstream.

After xfrm_add_policy adds a policy, its ref is 2, then

                             xfrm_policy_timer
                               read_lock
                               xp->walk.dead is 0
                               ....
                               mod_timer()
xfrm_policy_kill
  policy->walk.dead = 1
  ....
  del_timer(&policy->timer)
    xfrm_pol_put //ref is 1
  xfrm_pol_put //ref is 0
    xfrm_policy_destroy
      call_rcu
                                 xfrm_pol_hold //ref is 1
                               read_unlock
                               xfrm_pol_put //ref is 0
                                 xfrm_policy_destroy
                                   call_rcu

xfrm_policy_destroy is called twice, which may lead to a double free.

Call Trace:
RIP: 0010:refcount_warn_saturate+0x161/0x210
...
 xfrm_policy_timer+0x522/0x600
 call_timer_fn+0x1b3/0x5e0
 ? __xfrm_decode_session+0x2990/0x2990
 ? msleep+0xb0/0xb0
 ? _raw_spin_unlock_irq+0x24/0x40
 ? __xfrm_decode_session+0x2990/0x2990
 ? __xfrm_decode_session+0x2990/0x2990
 run_timer_softirq+0x5c5/0x10e0

Fix this by using write_lock_bh in xfrm_policy_kill.

Fixes: ea2dea9d ("xfrm: remove policy lock when accessing policy->walk.dead")
Signed-off-by:
YueHaibing <yuehaibing@huawei.com> Acked-by:
Timo Teräs <timo.teras@iki.fi> Acked-by:
Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by:
Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Xin Long authored
stable inclusion from linux-4.19.114 commit 0a7b397c013322fec975f30012302f694efba2da -------------------------------- commit a1a7e3a3 upstream. Without doing verify_sec_ctx_len() check in xfrm_add_acquire(), it may be out-of-bounds to access uctx->ctx_str with uctx->ctx_len, as noticed by syz: BUG: KASAN: slab-out-of-bounds in selinux_xfrm_alloc_user+0x237/0x430 Read of size 768 at addr ffff8880123be9b4 by task syz-executor.1/11650 Call Trace: dump_stack+0xe8/0x16e print_address_description.cold.3+0x9/0x23b kasan_report.cold.4+0x64/0x95 memcpy+0x1f/0x50 selinux_xfrm_alloc_user+0x237/0x430 security_xfrm_policy_alloc+0x5c/0xb0 xfrm_policy_construct+0x2b1/0x650 xfrm_add_acquire+0x21d/0xa10 xfrm_user_rcv_msg+0x431/0x6f0 netlink_rcv_skb+0x15a/0x410 xfrm_netlink_rcv+0x6d/0x90 netlink_unicast+0x50e/0x6a0 netlink_sendmsg+0x8ae/0xd40 sock_sendmsg+0x133/0x170 ___sys_sendmsg+0x834/0x9a0 __sys_sendmsg+0x100/0x1e0 do_syscall_64+0xe5/0x660 entry_SYSCALL_64_after_hwframe+0x6a/0xdf So fix it by adding the missing verify_sec_ctx_len check there. Fixes: 980ebd25 ("[IPSEC]: Sync series - acquire insert") Reported-by:
Hangbin Liu <liuhangbin@gmail.com> Signed-off-by:
Xin Long <lucien.xin@gmail.com> Signed-off-by:
Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Xin Long authored
stable inclusion from linux-4.19.114 commit cf265c64c91957fd0f1b86b7427028d823966d74 -------------------------------- commit 171d449a upstream. It's not sufficient to do 'uctx->len != (sizeof(struct xfrm_user_sec_ctx) + uctx->ctx_len)' check only, as uctx->len may be greater than nla_len(rt), in which case it will cause slab-out-of-bounds when accessing uctx->ctx_str later. This patch fixes it by returning -EINVAL when uctx->len > nla_len(rt). Fixes: df71837d ("[LSM-IPSec]: Security association restriction.") Signed-off-by:
Xin Long <lucien.xin@gmail.com> Signed-off-by:
Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
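
The two length checks described in this commit can be sketched in user-space C as below. This is a minimal illustration, not the kernel code: `struct user_sec_ctx` is a hypothetical stand-in for `struct xfrm_user_sec_ctx`, `attr_len` plays the role of `nla_len(rt)`, and the first check (attribute shorter than the header) is an extra assumption added so the sketch is safe on its own.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct xfrm_user_sec_ctx (field names follow the commit text). */
struct user_sec_ctx {
    uint16_t len;      /* total length claimed by the sender */
    uint16_t exttype;
    uint8_t  ctx_alg;
    uint8_t  ctx_doi;
    uint16_t ctx_len;  /* length of the trailing context string */
};

/* attr_len stands in for nla_len(rt): the bytes actually present in the attribute. */
static int verify_ctx_len(const struct user_sec_ctx *uctx, size_t attr_len)
{
    /* Assumption for this sketch: the header itself must fit. */
    if (attr_len < sizeof(*uctx))
        return -EINVAL;
    /* The check added by the commit: claimed length must not exceed the attribute. */
    if (uctx->len > attr_len)
        return -EINVAL;
    /* The pre-existing check: claimed length must equal header plus context string. */
    if (uctx->len != sizeof(*uctx) + uctx->ctx_len)
        return -EINVAL;
    return 0;
}
```

With only the last check, a sender could claim a consistent but oversized `len`/`ctx_len` pair and the later copy of the context string would read past the attribute.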
-
Nicolas Dichtel authored
stable inclusion from linux-4.19.114 commit f8ee708284e1d62ecc345908b40b7f9ccca4e603 -------------------------------- commit f1ed1026 upstream. I forgot the 4in6/6in4 cases in my previous patch. Let's fix them. Fixes: 95224166 ("vti[6]: fix packet tx through bpf_redirect()") Signed-off-by:
Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by:
Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Raed Salem authored
stable inclusion from linux-4.19.114 commit cb2775c906eed8f350b8deed7d681bf285fbcb72 -------------------------------- commit 03891f82 upstream. This patch handles the asynchronous unregister device event so that the device IPsec offload resources can be cleanly released. Fixes: e4db5b61 ("xfrm: policy: remove pcpu policy cache") Signed-off-by:
Raed Salem <raeds@mellanox.com> Reviewed-by:
Boris Pismenny <borisp@mellanox.com> Reviewed-by:
Saeed Mahameed <saeedm@mellanox.com> Signed-off-by:
Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Ilya Dryomov authored
stable inclusion from linux-4.19.114 commit 1e2d0c50980c55f84035adf7e7cece8a19e6b9ec -------------------------------- commit 76142097 upstream. CEPH_OSDMAP_FULL/NEARFULL aren't set since mimic, so we need to consult per-pool flags as well. Unfortunately the backwards compatibility here is lacking: - the change that deprecated OSDMAP_FULL/NEARFULL went into mimic, but was guarded by require_osd_release >= RELEASE_LUMINOUS - it was subsequently backported to luminous in v12.2.2, but that makes no difference to clients that only check OSDMAP_FULL/NEARFULL because require_osd_release is not client-facing -- it is for OSDs Since all kernels are affected, the best we can do here is just start checking both map flags and pool flags and send that to stable. These checks are best effort, so take osdc->lock and look up pool flags just once. Remove the FIXME, since filesystem quotas are checked above and RADOS quotas are reflected in POOL_FLAG_FULL: when the pool reaches its quota, both POOL_FLAG_FULL and POOL_FLAG_FULL_QUOTA are set. Cc: stable@vger.kernel.org Reported-by:
Yanhu Cao <gmayyyha@gmail.com> Signed-off-by:
Ilya Dryomov <idryomov@gmail.com> Reviewed-by:
Jeff Layton <jlayton@kernel.org> Acked-by:
Sage Weil <sage@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
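
A rough sketch of "check both map flags and pool flags" follows. The flag values and struct layouts are illustrative assumptions, not the real CEPH_* definitions, and the locking around the lookup (osdc->lock in the commit text) is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag values only; not the real Ceph constants. */
#define MAP_FLAG_FULL   (1u << 0)
#define POOL_FLAG_FULL  (1u << 1)

struct osdmap_sketch { uint32_t flags; };
struct pool_sketch   { uint64_t flags; };

/* Treat the target as full if either the map-wide or the per-pool flag is set. */
static bool target_is_full(const struct osdmap_sketch *map,
                           const struct pool_sketch *pool)
{
    if (map->flags & MAP_FLAG_FULL)
        return true;
    /* pool may be NULL if the lookup failed; then only the map flag counts. */
    return pool != NULL && (pool->flags & POOL_FLAG_FULL) != 0;
}
```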
-
Taehee Yoo authored
stable inclusion from linux-4.19.114 commit facf9c7ecc2f0d8e8c65e4d532f690dc5e7aa659 -------------------------------- [ Upstream commit 384d91c2 ] gro_cells_init() returns an error if memory allocation fails. But the vxlan module doesn't check the return value of gro_cells_init(). Fixes: 58ce31cc ("vxlan: GRO support at tunnel layer") Signed-off-by:
Taehee Yoo <ap420073@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Eric Dumazet authored
stable inclusion from linux-4.19.114 commit 58b501cc08ccd688c4dd3d202cbdc4e36aeff79a -------------------------------- [ Upstream commit 6cd6cbf5 ] When an application uses the TCP_QUEUE_SEQ socket option to change tp->rcv_next, we must also update tp->copied_seq. Otherwise, stuff relying on tcp_inq() being precise can eventually be confused. For example, tcp_zerocopy_receive() might crash because it does not expect tcp_recv_skb() to return NULL. We could add tests in various places to fix the issue, or simply make sure tcp_inq() won't return a random value, and leave the fast path as it is. Note that this fixes ioctl(fd, SIOCINQ, &val) at the same time. Fixes: ee995283 ("tcp: Initial repair mode") Fixes: 05255b82 ("tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive") Signed-off-by:
Eric Dumazet <edumazet@google.com> Reported-by:
syzbot <syzkaller@googlegroups.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
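
Why the two sequence numbers must move together can be seen in a small model. The sketch below assumes that the "bytes available to read" value is roughly the difference between the next expected sequence number and the last one copied to user space; `struct seq_state`, `inq()` and `set_queue_seq()` are hypothetical names, and the real tcp_inq() handles more cases.

```c
#include <stdint.h>

/* Hypothetical model of the two receive-side sequence numbers. */
struct seq_state {
    uint32_t rcv_nxt;     /* next sequence number expected from the peer */
    uint32_t copied_seq;  /* first sequence number not yet copied to user space */
};

/* Bytes queued but unread, using wrap-around unsigned arithmetic. */
static uint32_t inq(const struct seq_state *s)
{
    return s->rcv_nxt - s->copied_seq;
}

/* Repair-mode style reset: move both counters so inq() stays at zero. */
static void set_queue_seq(struct seq_state *s, uint32_t seq)
{
    s->rcv_nxt = seq;
    s->copied_seq = seq;
}
```

If only `rcv_nxt` were updated, `inq()` would report the distance between the old and new sequence numbers even though the receive queue is empty.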
-
Petr Machata authored
stable inclusion from linux-4.19.114 commit f5ebb2dd86777379a552acce0d635de8210a427c -------------------------------- [ Upstream commit 32ca98fe ] The fix referenced below causes a crash when an ERSPAN tunnel is created without passing IFLA_INFO_DATA. Fix by validating passed-in data in the same way as ipgre does. Fixes: e1f8f78f ("net: ip_gre: Separate ERSPAN newlink / changelink callbacks") Reported-by:
<syzbot+1b4ebf4dae4e510dd219@syzkaller.appspotmail.com> Signed-off-by:
Petr Machata <petrm@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Petr Machata authored
stable inclusion from linux-4.19.114 commit 54266b2694682e7207ec66bce59f4f5323727dd3 -------------------------------- [ Upstream commit e1f8f78f ] ERSPAN shares most of the code path with GRE and gretap code. While that helps keep the code compact, it is also error prone. Currently a broken userspace can turn a gretap tunnel into a de facto ERSPAN one by passing IFLA_GRE_ERSPAN_VER. There has been a similar issue in ip6gretap in the past. To prevent these problems in future, split the newlink and changelink code paths. Split the ERSPAN code out of ipgre_netlink_parms() into a new function erspan_netlink_parms(). Extract a piece of common logic from ipgre_newlink() and ipgre_changelink() into ipgre_newlink_encap_setup(). Add erspan_newlink() and erspan_changelink(). Fixes: 84e54fe0 ("gre: introduce native tunnel support for ERSPAN") Signed-off-by:
Petr Machata <petrm@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Cong Wang authored
stable inclusion from linux-4.19.114 commit 557d015ffb27b672e24e6ad141fd887783871dc2 -------------------------------- [ Upstream commit 0d1c3530 ] In commit 599be01e ("net_sched: fix an OOB access in cls_tcindex") I moved cp->hash calculation before the first tcindex_alloc_perfect_hash(), but cp->alloc_hash is left untouched. This difference could lead to another out of bound access. cp->alloc_hash should always be the size allocated, we should update it after this tcindex_alloc_perfect_hash(). Reported-and-tested-by:
<syzbot+dcc34d54d68ef7d2d53d@syzkaller.appspotmail.com> Reported-and-tested-by:
<syzbot+c72da7b9ed57cde6fca2@syzkaller.appspotmail.com> Fixes: 599be01e ("net_sched: fix an OOB access in cls_tcindex") Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by:
Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Cong Wang authored
stable inclusion from linux-4.19.114 commit ea3d6652c240978736a91b9e85fde9fee9359be4 -------------------------------- [ Upstream commit ef299cc3 ] route4_change() allocates a new filter and copies values from the old one. After the new filter is inserted into the hash table, the old filter should be removed and freed, as the final step of the update. However, the current code mistakenly removes the new one. This looks apparently wrong to me, and it causes double "free" and use-after-free too, as reported by syzbot. Reported-and-tested-by:
<syzbot+f9b32aaacd60305d9687@syzkaller.appspotmail.com> Reported-and-tested-by:
<syzbot+2f8c233f131943d6056d@syzkaller.appspotmail.com> Reported-and-tested-by:
<syzbot+9c2df9fd5e9445b74e01@syzkaller.appspotmail.com> Fixes: 1109c005 ("net: sched: RCU cls_route") Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by:
Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
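
The update order described above (insert the new filter, then unlink the old one) can be sketched with a plain singly linked bucket. This is an illustration under simplifying assumptions: no RCU, no hashing, and `struct filter` and `replace_filter()` are hypothetical names rather than the cls_route code.

```c
#include <stddef.h>

/* Hypothetical filter node in one hash bucket. */
struct filter {
    int handle;
    struct filter *next;
};

/*
 * Insert `fresh` at the head of the bucket, then unlink `old`.
 * Returns the unlinked node so the caller can free it, or NULL if
 * `old` was not in the bucket.
 */
static struct filter *replace_filter(struct filter **bucket,
                                     struct filter *fresh,
                                     struct filter *old)
{
    struct filter **pp;

    fresh->next = *bucket;
    *bucket = fresh;

    /* Start after the new node and match the OLD node, never the new one. */
    for (pp = &fresh->next; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == old) {
            *pp = old->next;
            old->next = NULL;
            return old;
        }
    }
    return NULL;
}
```

Unlinking the node that was just inserted instead would leave the stale filter reachable while the caller frees it, which matches the double free and use-after-free pattern the commit describes.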
-
Willem de Bruijn authored
stable inclusion from linux-4.19.114 commit 6fb0e4385928900ccb8697748555b3f54bba5193 -------------------------------- [ Upstream commit 61fad681 ] PACKET_RX_RING can cause multiple writers to access the same slot if a fast writer wraps the ring while a slow writer is still copying. This is particularly likely with few, large, slots (e.g., GSO packets). Synchronize kernel thread ownership of rx ring slots with a bitmap. Writers acquire a slot race-free by testing tp_status TP_STATUS_KERNEL while holding the sk receive queue lock. They release this lock before copying and set tp_status to TP_STATUS_USER to release to userspace when done. During copying, another writer may take the lock, also see TP_STATUS_KERNEL, and start writing to the same slot. Introduce a new rx_owner_map bitmap with a bit per slot. To acquire a slot, test and set with the lock held. To release race-free, update tp_status and owner bit as a transaction, so take the lock again. This is one of a variety of discussed options (see Link below): * instead of a shadow ring, embed the data in the slot itself, such as in tp_padding. But any test for this field may match a value left by userspace, causing deadlock. * avoid the lock on release. This leaves a small race if releasing the shadow slot before setting TP_STATUS_USER. The below reproducer showed that this race is not academic. If releasing the slot after tp_status, the race is more subtle. See the first link for details. * add a new tp_status TP_KERNEL_OWNED to avoid the transactional store of two fields. But, legacy applications may interpret all non-zero tp_status as owned by the user. As libpcap does. So this is possible only opt-in by newer processes. It can be added as an optional mode. * embed the struct at the tail of pg_vec to avoid extra allocation. The implementation proved no less complex than a separate field. The additional locking cost on release adds contention, no different than scaling on multicore or multiqueue h/w. 
In practice, neither the reproducer below nor small-packet tcpdump showed a noticeable change in perf report in cycles spent in spinlock. Where contention is problematic, packet sockets support mitigation through PACKET_FANOUT. And we can consider adding opt-in state TP_KERNEL_OWNED. Easy to reproduce by running multiple netperf or similar TCP_STREAM flows concurrently with `tcpdump -B 129 -n greater 60000`. Based on an earlier patchset by Jon Rosen. See links below. I believe this issue goes back to the introduction of tpacket_rcv, which predates git history. Link: https://www.mail-archive.com/netdev@vger.kernel.org/msg237222.html Suggested-by:
Jon Rosen <jrosen@cisco.com> Signed-off-by:
Willem de Bruijn <willemb@google.com> Signed-off-by:
Jon Rosen <jrosen@cisco.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
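
The acquire and release protocol described in this commit can be modelled in user space as follows. This is a sketch under stated assumptions: a pthread mutex stands in for the sk receive queue lock, `struct ring`, `slot_acquire()` and `slot_release()` are hypothetical names, and the status values are placeholders for TP_STATUS_KERNEL and TP_STATUS_USER.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

#define NSLOTS 8

/* Placeholder values standing in for TP_STATUS_KERNEL / TP_STATUS_USER. */
enum { STATUS_KERNEL = 0, STATUS_USER = 1 };

struct ring {
    pthread_mutex_t lock;      /* stands in for the sk receive queue lock */
    uint32_t status[NSLOTS];   /* per-slot status word visible to the reader */
    uint32_t owner_map;        /* one bit per slot: a writer is still copying */
};

/* Acquire: test the status and test-and-set the owner bit with the lock held. */
static bool slot_acquire(struct ring *r, unsigned int slot)
{
    bool ok = false;

    pthread_mutex_lock(&r->lock);
    if (r->status[slot] == STATUS_KERNEL && !(r->owner_map & (1u << slot))) {
        r->owner_map |= 1u << slot;
        ok = true;
    }
    pthread_mutex_unlock(&r->lock);
    return ok;
}

/* Release: update the status and clear the owner bit as one transaction. */
static void slot_release(struct ring *r, unsigned int slot)
{
    pthread_mutex_lock(&r->lock);
    r->status[slot] = STATUS_USER;
    r->owner_map &= ~(1u << slot);
    pthread_mutex_unlock(&r->lock);
}
```

The copy into the slot happens between the two calls without the lock held; the owner bit is what keeps a second writer from claiming the slot during that window, since the status word alone still reads as kernel-owned.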
-
Zh-yuan Ye authored
stable inclusion from linux-4.19.114 commit c94fbe2892d523e8706dc60b714a677f20918ad6 -------------------------------- [ Upstream commit 961d0e5b ] Currently the software CBS does not consider the packet sending time when depleting the credits. It caused the throughput to be Idleslope[kbps] * (Port transmit rate[kbps] / |Sendslope[kbps]|) where Idleslope * (Port transmit rate / (Idleslope + |Sendslope|)) = Idleslope is expected. In order to fix the issue above, this patch takes the time when the packet sending completes into account by moving the anchor time variable "last" ahead to the send completion time upon transmission and adding wait when the next dequeue request comes before the send completion time of the previous packet. changelog: V2->V3: - remove unnecessary whitespace cleanup - add the checks if port_rate is 0 before division V1->V2: - combine variable "send_completed" into "last" - add the comment for estimate of the packet sending Fixes: 585d763a ("net/sched: Introduce Credit Based Shaper (CBS) qdisc") Signed-off-by:
Zh-yuan Ye <ye.zh-yuan@socionext.com> Reviewed-by:
Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
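
The two throughput expressions in this commit can be compared directly. The derivation below assumes the usual credit-based shaper relation sendslope = idleslope - port transmit rate (so sendslope is negative); the symbols I, S, R and the numeric values are illustrative and not taken from the commit.

```latex
% I = idleslope, S = sendslope, R = port transmit rate (all in kbps).
% Assumed relation: S = I - R, hence |S| = R - I and I + |S| = R.
\begin{align*}
  \text{expected throughput} &= I \cdot \frac{R}{I + |S|} = I \cdot \frac{R}{R} = I \\
  \text{observed throughput} &= I \cdot \frac{R}{|S|} = I \cdot \frac{R}{R - I} > I
\end{align*}
% Illustrative numbers: R = 1\,000\,000, I = 200\,000, so |S| = 800\,000.
% Observed: 200\,000 \cdot 1\,000\,000 / 800\,000 = 250\,000 kbps instead of 200\,000 kbps.
```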
-
Ido Schimmel authored
stable inclusion from linux-4.19.114 commit b371fdcd26675e7bc583ac9449c667e2e90b4e7e -------------------------------- [ Upstream commit f6bf1baf ] list_for_each_entry_from_reverse() iterates backwards over the list from the current position, but in the error path we should start from the previous position. Fix this by using list_for_each_entry_continue_reverse() instead. This suppresses the following error from coccinelle: drivers/net/ethernet/mellanox/mlxsw//spectrum_mr.c:655:34-38: ERROR: invalid reference to the index variable of the iterator on line 636 Fixes: c011ec1b ("mlxsw: spectrum: Add the multicast routing offloading logic") Signed-off-by:
Ido Schimmel <idosch@mellanox.com> Reviewed-by:
Jiri Pirko <jiri@mellanox.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
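
The difference between unwinding "from the current position" and "from the previous position" is sketched below with a hand-rolled circular list rather than the kernel's list macros. `struct node` and `activate_all()` are hypothetical; the `refs` counter exists only to make an incorrect rollback visible.

```c
#include <stddef.h>

/* Hypothetical node in a circular doubly linked list with a sentinel head. */
struct node {
    struct node *prev, *next;
    int refs;   /* incremented on activation, decremented on rollback */
    int fail;   /* set to make activation of this node fail */
};

/*
 * Activate every node after `head`. On failure, undo only the nodes that
 * were actually activated, i.e. start from the node BEFORE the failing one.
 */
static int activate_all(struct node *head)
{
    struct node *n;

    for (n = head->next; n != head; n = n->next) {
        if (n->fail)
            goto rollback;
        n->refs++;
    }
    return 0;

rollback:
    /* Step back first, so the failing node itself is skipped. */
    for (n = n->prev; n != head; n = n->prev)
        n->refs--;
    return -1;
}
```

Starting the rollback loop at `n` instead of `n->prev` would decrement the failing node's counter to -1, which is the kind of error the iterator change avoids.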
-
Sasha Levin authored
stable inclusion from linux-4.19.113 commit a22d7fc61f931e280b77dc755c807548bd1765d9 -------------------------------- This reverts commit 2b3541ffdd05198b329d21920a0f606009a1058b. This patch shouldn't have been backported to 4.19. Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Sasha Levin authored
stable inclusion from linux-4.19.113 commit ae2f7c84371a2a4c449a92c956d0e4f83565e257 -------------------------------- This reverts commit 91c5f99d131ed3b231aaef7d4ed6799085b095a3. This patch shouldn't have been backported to 4.19. Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Matteo Croce authored
stable inclusion from linux-4.19.112 commit b4176d3b1a820f792e36d7cadd5bf0eeaf71fb09 -------------------------------- commit 3e72dfdf upstream. Similarly to commit c543cb4a ("ipv4: ensure rcu_read_lock() in ipv4_link_failure()"), __ip_options_compile() must be called under rcu protection. Fixes: 3da1ed7a ("net: avoid use IPCB in cipso_v4_error") Suggested-by:
Guillaume Nault <gnault@redhat.com> Signed-off-by:
Matteo Croce <mcroce@redhat.com> Acked-by:
Paul Moore <paul@paul-moore.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Jakub Kicinski authored
stable inclusion from linux-4.19.111 commit 5ae2daf9977a1fa4f153c20e1996ba28a54a66d1 -------------------------------- commit 88a63771 upstream. Add missing attribute validation for tunnel source and destination ports to the netlink policy. Fixes: af308b94 ("netfilter: nf_tables: add tunnel support") Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
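
What a "missing policy entry" means can be shown with a toy validator. The sketch below is not the kernel's nla_policy machinery: the attribute names, the `policy_len` table and `validate_attr()` are hypothetical, and the error code is chosen for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute types for a tunnel's source and destination ports. */
enum { ATTR_UNSPEC, ATTR_SPORT, ATTR_DPORT, ATTR_MAX = ATTR_DPORT };

/*
 * Minimum payload size per attribute type. An entry left at 0 means
 * "no rule", so any payload length would be accepted for that type.
 */
static const size_t policy_len[ATTR_MAX + 1] = {
    [ATTR_SPORT] = sizeof(uint16_t),
    [ATTR_DPORT] = sizeof(uint16_t),
};

/* Reject unknown types and payloads shorter than the policy requires. */
static int validate_attr(unsigned int type, size_t payload_len)
{
    if (type > ATTR_MAX)
        return -EINVAL;
    if (policy_len[type] != 0 && payload_len < policy_len[type])
        return -EINVAL;
    return 0;
}
```

Without the two table entries, a one-byte port attribute would pass validation and a later two-byte read of it would run past the payload.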
-
Jakub Kicinski authored
stable inclusion from linux-4.19.111 commit 64d43185eba6d61467db53ca026fdeb66fe78646 -------------------------------- commit 9d6effb2 upstream. Add missing attribute validation for NFTA_PAYLOAD_CSUM_FLAGS to the netlink policy. Fixes: 18140969 ("netfilter: nft_payload: layer 4 checksum adjustment for pseudoheader fields") Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Jakub Kicinski authored
stable inclusion from linux-4.19.111 commit 5b425d389ed2627aa04739a076b9da9a9adaad9e -------------------------------- commit c049b345 upstream. Add missing attribute validation for cthelper to the netlink policy. Fixes: 12f7a505 ("netfilter: add user-space connection tracking helper infrastructure") Signed-off-by:
Jakub Kicinski <kuba@kernel.org> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Li Aichun <liaichun@huawei.com> Reviewed-by:
guodeqing <geffrey.guo@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-