Skip to content
Snippets Groups Projects
  1. Oct 19, 2022
  2. Aug 08, 2022
    • Eric Dumazet's avatar
      net: bonding: fix possible NULL deref in rlb code · 4b48e5d4
      Eric Dumazet authored
      stable inclusion
      from stable-4.19.251
      commit 2c0ab68f8f1e1e1b0ef5691275cdee3eafa67833
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5I4FP
      
      
      CVE: NA
      
      --------------------------------
      
      commit ab84db251c04d38b8dc7ee86e13d4050bedb1c88 upstream.
      
      syzbot has two reports involving the same root cause.
      
      bond_alb_initialize() must not set bond->alb_info.rlb_enabled
      if a memory allocation error is detected.
      
      Report 1:
      
      general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN
      KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
      CPU: 0 PID: 12276 Comm: kworker/u4:10 Not tainted 5.19.0-rc3-syzkaller-00132-g3b89b511ea0c #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: netns cleanup_net
      RIP: 0010:rlb_clear_slave+0x10e/0x690 drivers/net/bonding/bond_alb.c:393
      Code: 8e fc 83 fb ff 0f 84 74 02 00 00 e8 cc 2a 8e fc 48 8b 44 24 08 89 dd 48 c1 e5 06 4c 8d 34 28 49 8d 7e 14 48 89 f8 48 c1 e8 03 <42> 0f b6 14 20 48 89 f8 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85
      RSP: 0018:ffffc90018a8f678 EFLAGS: 00010203
      RAX: 0000000000000002 RBX: 0000000000000000 RCX: 0000000000000000
      RDX: ffff88803375bb00 RSI: ffffffff84ec4ac4 RDI: 0000000000000014
      RBP: 0000000000000000 R08: 0000000000000005 R09: 00000000ffffffff
      R10: 0000000000000000 R11: 0000000000000000 R12: dffffc0000000000
      R13: ffff8880ac889000 R14: 0000000000000000 R15: ffff88815a668c80
      FS: 0000000000000000(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00005597077e10b0 CR3: 0000000026668000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
      <TASK>
      bond_alb_deinit_slave+0x43c/0x6b0 drivers/net/bonding/bond_alb.c:1663
      __bond_release_one.cold+0x383/0xd53 drivers/net/bonding/bond_main.c:2370
      bond_slave_netdev_event drivers/net/bonding/bond_main.c:3778 [inline]
      bond_netdev_event+0x993/0xad0 drivers/net/bonding/bond_main.c:3889
      notifier_call_chain+0xb5/0x200 kernel/notifier.c:87
      call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:1945
      call_netdevice_notifiers_extack net/core/dev.c:1983 [inline]
      call_netdevice_notifiers net/core/dev.c:1997 [inline]
      unregister_netdevice_many+0x948/0x18b0 net/core/dev.c:10839
      default_device_exit_batch+0x449/0x590 net/core/dev.c:11333
      ops_exit_list+0x125/0x170 net/core/net_namespace.c:167
      cleanup_net+0x4ea/0xb00 net/core/net_namespace.c:594
      process_one_work+0x996/0x1610 kernel/workqueue.c:2289
      worker_thread+0x665/0x1080 kernel/workqueue.c:2436
      kthread+0x2e9/0x3a0 kernel/kthread.c:376
      ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:302
      </TASK>
      
      Report 2:
      
      general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] PREEMPT SMP KASAN
      KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
      CPU: 1 PID: 5206 Comm: syz-executor.1 Not tainted 5.18.0-syzkaller-12108-g58f9d52ff689 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:rlb_req_update_slave_clients+0x109/0x2f0 drivers/net/bonding/bond_alb.c:502
      Code: 5d 18 8f fc 41 80 3e 00 0f 85 a5 01 00 00 89 d8 48 c1 e0 06 49 03 84 24 68 01 00 00 48 8d 78 30 49 89 c7 48 89 fa 48 c1 ea 03 <80> 3c 2a 00 0f 85 98 01 00 00 4d 39 6f 30 75 83 e8 22 18 8f fc 49
      RSP: 0018:ffffc9000300ee80 EFLAGS: 00010206
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffc90016c11000
      RDX: 0000000000000006 RSI: ffffffff84eb6bf3 RDI: 0000000000000030
      RBP: dffffc0000000000 R08: 0000000000000005 R09: 00000000ffffffff
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff888027c80c80
      R13: ffff88807d7ff800 R14: ffffed1004f901bd R15: 0000000000000000
      FS:  00007f6f46c58700(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020010000 CR3: 00000000516cc000 CR4: 00000000003506e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <TASK>
       alb_fasten_mac_swap+0x886/0xa80 drivers/net/bonding/bond_alb.c:1070
       bond_alb_handle_active_change+0x624/0x1050 drivers/net/bonding/bond_alb.c:1765
       bond_change_active_slave+0xfa1/0x29b0 drivers/net/bonding/bond_main.c:1173
       bond_select_active_slave+0x23f/0xa50 drivers/net/bonding/bond_main.c:1253
       bond_enslave+0x3b34/0x53b0 drivers/net/bonding/bond_main.c:2159
       do_set_master+0x1c8/0x220 net/core/rtnetlink.c:2577
       rtnl_newlink_create net/core/rtnetlink.c:3380 [inline]
       __rtnl_newlink+0x13ac/0x17e0 net/core/rtnetlink.c:3580
       rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3593
       rtnetlink_rcv_msg+0x43a/0xc90 net/core/rtnetlink.c:6089
       netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2501
       netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
       netlink_unicast+0x543/0x7f0 net/netlink/af_netlink.c:1345
       netlink_sendmsg+0x917/0xe10 net/netlink/af_netlink.c:1921
       sock_sendmsg_nosec net/socket.c:714 [inline]
       sock_sendmsg+0xcf/0x120 net/socket.c:734
       ____sys_sendmsg+0x6eb/0x810 net/socket.c:2492
       ___sys_sendmsg+0xf3/0x170 net/socket.c:2546
       __sys_sendmsg net/socket.c:2575 [inline]
       __do_sys_sendmsg net/socket.c:2584 [inline]
       __se_sys_sendmsg net/socket.c:2582 [inline]
       __x64_sys_sendmsg+0x132/0x220 net/socket.c:2582
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x46/0xb0
      RIP: 0033:0x7f6f45a89109
      Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
      RSP: 002b:00007f6f46c58168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00007f6f45b9c030 RCX: 00007f6f45a89109
      RDX: 0000000000000000 RSI: 0000000020000080 RDI: 0000000000000006
      RBP: 00007f6f45ae308d R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: 00007ffed99029af R14: 00007f6f46c58300 R15: 0000000000022000
       </TASK>
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Jay Vosburgh <j.vosburgh@gmail.com>
      Cc: Veaceslav Falico <vfalico@gmail.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Acked-by: default avatarJay Vosburgh <jay.vosburgh@canonical.com>
      Link: https://lore.kernel.org/r/20220627102813.126264-1-edumazet@google.com
      
      
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarYongqiang Liu <liuyongqiang13@huawei.com>
      4b48e5d4
  3. Sep 22, 2020
    • Eric Dumazet's avatar
      bonding/alb: make sure arp header is pulled before accessing it · 3a7c7135
      Eric Dumazet authored
      
      stable inclusion
      from linux-4.19.111
      commit 4dff63b3ff055e6402ffa435ec09a28ab6a5d2f8
      
      --------------------------------
      
      commit b7469e83 upstream.
      
      Similar to commit 38f88c45 ("bonding/alb: properly access headers
      in bond_alb_xmit()"), we need to make sure arp header was pulled
      in skb->head before blindly accessing it in rlb_arp_xmit().
      
      Remove arp_pkt() private helper, since it is more readable/obvious
      to have the following construct back to back :
      
      	if (!pskb_network_may_pull(skb, sizeof(*arp)))
      		return NULL;
      	arp = (struct arp_pkt *)skb_network_header(skb);
      
      syzbot reported :
      
      BUG: KMSAN: uninit-value in bond_slave_has_mac_rx include/net/bonding.h:704 [inline]
      BUG: KMSAN: uninit-value in rlb_arp_xmit drivers/net/bonding/bond_alb.c:662 [inline]
      BUG: KMSAN: uninit-value in bond_alb_xmit+0x575/0x25e0 drivers/net/bonding/bond_alb.c:1477
      CPU: 0 PID: 12743 Comm: syz-executor.4 Not tainted 5.6.0-rc2-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1c9/0x220 lib/dump_stack.c:118
       kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118
       __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215
       bond_slave_has_mac_rx include/net/bonding.h:704 [inline]
       rlb_arp_xmit drivers/net/bonding/bond_alb.c:662 [inline]
       bond_alb_xmit+0x575/0x25e0 drivers/net/bonding/bond_alb.c:1477
       __bond_start_xmit drivers/net/bonding/bond_main.c:4257 [inline]
       bond_start_xmit+0x85d/0x2f70 drivers/net/bonding/bond_main.c:4282
       __netdev_start_xmit include/linux/netdevice.h:4524 [inline]
       netdev_start_xmit include/linux/netdevice.h:4538 [inline]
       xmit_one net/core/dev.c:3470 [inline]
       dev_hard_start_xmit+0x531/0xab0 net/core/dev.c:3486
       __dev_queue_xmit+0x37de/0x4220 net/core/dev.c:4063
       dev_queue_xmit+0x4b/0x60 net/core/dev.c:4096
       packet_snd net/packet/af_packet.c:2967 [inline]
       packet_sendmsg+0x8347/0x93b0 net/packet/af_packet.c:2992
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg net/socket.c:672 [inline]
       __sys_sendto+0xc1b/0xc50 net/socket.c:1998
       __do_sys_sendto net/socket.c:2010 [inline]
       __se_sys_sendto+0x107/0x130 net/socket.c:2006
       __x64_sys_sendto+0x6e/0x90 net/socket.c:2006
       do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x45c479
      Code: ad b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007fc77ffbbc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      RAX: ffffffffffffffda RBX: 00007fc77ffbc6d4 RCX: 000000000045c479
      RDX: 000000000000000e RSI: 00000000200004c0 RDI: 0000000000000003
      RBP: 000000000076bf20 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 0000000000000a04 R14: 00000000004cc7b0 R15: 000000000076bf2c
      
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline]
       kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:127
       kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:82
       slab_alloc_node mm/slub.c:2793 [inline]
       __kmalloc_node_track_caller+0xb40/0x1200 mm/slub.c:4401
       __kmalloc_reserve net/core/skbuff.c:142 [inline]
       __alloc_skb+0x2fd/0xac0 net/core/skbuff.c:210
       alloc_skb include/linux/skbuff.h:1051 [inline]
       alloc_skb_with_frags+0x18c/0xa70 net/core/skbuff.c:5766
       sock_alloc_send_pskb+0xada/0xc60 net/core/sock.c:2242
       packet_alloc_skb net/packet/af_packet.c:2815 [inline]
       packet_snd net/packet/af_packet.c:2910 [inline]
       packet_sendmsg+0x66a0/0x93b0 net/packet/af_packet.c:2992
       sock_sendmsg_nosec net/socket.c:652 [inline]
       sock_sendmsg net/socket.c:672 [inline]
       __sys_sendto+0xc1b/0xc50 net/socket.c:1998
       __do_sys_sendto net/socket.c:2010 [inline]
       __se_sys_sendto+0x107/0x130 net/socket.c:2006
       __x64_sys_sendto+0x6e/0x90 net/socket.c:2006
       do_syscall_64+0xb8/0x160 arch/x86/entry/common.c:296
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Cc: Jay Vosburgh <j.vosburgh@gmail.com>
      Cc: Veaceslav Falico <vfalico@gmail.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarYang Yingliang <yangyingliang@huawei.com>
      Signed-off-by: default avatarLi Aichun <liaichun@huawei.com>
      Reviewed-by: default avatarguodeqing <geffrey.guo@huawei.com>
      Signed-off-by: default avatarYang Yingliang <yangyingliang@huawei.com>
      3a7c7135
    • Eric Dumazet's avatar
      bonding/alb: properly access headers in bond_alb_xmit() · 14890aef
      Eric Dumazet authored
      
      stable inclusion
      from linux-4.19.103
      commit 6513fd0adb0cd1b1b9cd330590f9d04e72a85398
      
      --------------------------------
      
      [ Upstream commit 38f88c45 ]
      
      syzbot managed to send an IPX packet through bond_alb_xmit()
      and af_packet and triggered a use-after-free.
      
      First, bond_alb_xmit() was using ipx_hdr() helper to reach
      the IPX header, but ipx_hdr() was using the transport offset
      instead of the network offset. In the particular syzbot
      report transport offset was 0xFFFF
      
      This patch removes ipx_hdr() since it was only (mis)used from bonding.
      
      Then we need to make sure IPv4/IPv6/IPX headers are pulled
      in skb->head before dereferencing anything.
      
      BUG: KASAN: use-after-free in bond_alb_xmit+0x153a/0x1590 drivers/net/bonding/bond_alb.c:1452
      Read of size 2 at addr ffff8801ce56dfff by task syz-executor.2/18108
       (if (ipx_hdr(skb)->ipx_checksum != IPX_NO_CHECKSUM) ...)
      
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       [<ffffffff8441fc42>] __dump_stack lib/dump_stack.c:17 [inline]
       [<ffffffff8441fc42>] dump_stack+0x14d/0x20b lib/dump_stack.c:53
       [<ffffffff81a7dec4>] print_address_description+0x6f/0x20b mm/kasan/report.c:282
       [<ffffffff81a7e0ec>] kasan_report_error mm/kasan/report.c:380 [inline]
       [<ffffffff81a7e0ec>] kasan_report mm/kasan/report.c:438 [inline]
       [<ffffffff81a7e0ec>] kasan_report.cold+0x8c/0x2a0 mm/kasan/report.c:422
       [<ffffffff81a7dc4f>] __asan_report_load_n_noabort+0xf/0x20 mm/kasan/report.c:469
       [<ffffffff82c8c00a>] bond_alb_xmit+0x153a/0x1590 drivers/net/bonding/bond_alb.c:1452
       [<ffffffff82c60c74>] __bond_start_xmit drivers/net/bonding/bond_main.c:4199 [inline]
       [<ffffffff82c60c74>] bond_start_xmit+0x4f4/0x1570 drivers/net/bonding/bond_main.c:4224
       [<ffffffff83baa558>] __netdev_start_xmit include/linux/netdevice.h:4525 [inline]
       [<ffffffff83baa558>] netdev_start_xmit include/linux/netdevice.h:4539 [inline]
       [<ffffffff83baa558>] xmit_one net/core/dev.c:3611 [inline]
       [<ffffffff83baa558>] dev_hard_start_xmit+0x168/0x910 net/core/dev.c:3627
       [<ffffffff83bacf35>] __dev_queue_xmit+0x1f55/0x33b0 net/core/dev.c:4238
       [<ffffffff83bae3a8>] dev_queue_xmit+0x18/0x20 net/core/dev.c:4278
       [<ffffffff84339189>] packet_snd net/packet/af_packet.c:3226 [inline]
       [<ffffffff84339189>] packet_sendmsg+0x4919/0x70b0 net/packet/af_packet.c:3252
       [<ffffffff83b1ac0c>] sock_sendmsg_nosec net/socket.c:673 [inline]
       [<ffffffff83b1ac0c>] sock_sendmsg+0x12c/0x160 net/socket.c:684
       [<ffffffff83b1f5a2>] __sys_sendto+0x262/0x380 net/socket.c:1996
       [<ffffffff83b1f700>] SYSC_sendto net/socket.c:2008 [inline]
       [<ffffffff83b1f700>] SyS_sendto+0x40/0x60 net/socket.c:2004
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Cc: Jay Vosburgh <j.vosburgh@gmail.com>
      Cc: Veaceslav Falico <vfalico@gmail.com>
      Cc: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarYang Yingliang <yangyingliang@huawei.com>
      Signed-off-by: default avatarLi Aichun <liaichun@huawei.com>
      Reviewed-by: default avatarguodeqing <geffrey.guo@huawei.com>
      Signed-off-by: default avatarYang Yingliang <yangyingliang@huawei.com>
      14890aef
  4. May 17, 2018
  5. May 12, 2018
  6. May 11, 2018
  7. Oct 25, 2017
    • Mark Rutland's avatar
      locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns... · 6aa7de05
      Mark Rutland authored
      locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE()
      
      Please do not apply this to mainline directly, instead please re-run the
      coccinelle script shown below and apply its output.
      
      For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
      preference to ACCESS_ONCE(), and new code is expected to use one of the
      former. So far, there's been no reason to change most existing uses of
      ACCESS_ONCE(), as these aren't harmful, and changing them results in
      churn.
      
      However, for some features, the read/write distinction is critical to
      correct operation. To distinguish these cases, separate read/write
      accessors must be used. This patch migrates (most) remaining
      ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following
      coccinelle script:
      
      ----
      // Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
      // WRITE_ONCE()
      
      // $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch
      
      virtual patch
      
      @ depends on patch @
      expression E1, E2;
      @@
      
      - ACCESS_ONCE(E1) = E2
      + WRITE_ONCE(E1, E2)
      
      @ depends on patch @
      expression E;
      @@
      
      - ACCESS_ONCE(E)
      + READ_ONCE(E)
      ----
      
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Signed-off-by: default avatarPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: davem@davemloft.net
      Cc: linux-arch@vger.kernel.org
      Cc: mpe@ellerman.id.au
      Cc: shuah@kernel.org
      Cc: snitzer@redhat.com
      Cc: thor.thayer@linux.intel.com
      Cc: tj@kernel.org
      Cc: viro@zeniv.linux.org.uk
      Cc: will.deacon@arm.com
      Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      6aa7de05
  8. Jun 21, 2017
  9. Jun 16, 2017
    • Johannes Berg's avatar
      networking: introduce and use skb_put_data() · 59ae1d12
      Johannes Berg authored
      
      A common pattern with skb_put() is to just want to memcpy()
      some data into the new space, introduce skb_put_data() for
      this.
      
      An spatch similar to the one for skb_put_zero() converts many
      of the places using it:
      
          @@
          identifier p, p2;
          expression len, skb, data;
          type t, t2;
          @@
          (
          -p = skb_put(skb, len);
          +p = skb_put_data(skb, data, len);
          |
          -p = (t)skb_put(skb, len);
          +p = skb_put_data(skb, data, len);
          )
          (
          p2 = (t2)p;
          -memcpy(p2, data, len);
          |
          -memcpy(p, data, len);
          )
      
          @@
          type t, t2;
          identifier p, p2;
          expression skb, data;
          @@
          t *p;
          ...
          (
          -p = skb_put(skb, sizeof(t));
          +p = skb_put_data(skb, data, sizeof(t));
          |
          -p = (t *)skb_put(skb, sizeof(t));
          +p = skb_put_data(skb, data, sizeof(t));
          )
          (
          p2 = (t2)p;
          -memcpy(p2, data, sizeof(*p));
          |
          -memcpy(p, data, sizeof(*p));
          )
      
          @@
          expression skb, len, data;
          @@
          -memcpy(skb_put(skb, len), data, len);
          +skb_put_data(skb, data, len);
      
      (again, manually post-processed to retain some comments)
      
      Reviewed-by: default avatarStephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      59ae1d12
  10. Apr 06, 2017
    • Jarod Wilson's avatar
      bonding: attempt to better support longer hw addresses · faeeb317
      Jarod Wilson authored
      
      People are using bonding over Infiniband IPoIB connections, and who knows
      what else. Infiniband has a hardware address length of 20 octets
      (INFINIBAND_ALEN), and the network core defines a MAX_ADDR_LEN of 32.
      Various places in the bonding code are currently hard-wired to 6 octets
      (ETH_ALEN), such as the 3ad code, which I've left untouched here. Besides,
      only alb is currently possible on Infiniband links right now anyway, due
      to commit 1533e773, so the alb code is where most of the changes are.
      
      One major component of this change is the addition of a bond_hw_addr_copy
      function that takes a length argument, instead of using ether_addr_copy
      everywhere that hardware addresses need to be copied about. The other
      major component of this change is converting the bonding code from using
      struct sockaddr for address storage to struct sockaddr_storage, as the
      former has an address storage space of only 14, while the latter is 128
      minus a few, which is necessary to support bonding over device with up to
      MAX_ADDR_LEN octet hardware addresses. Additionally, this probably fixes
      up some memory corruption issues with the current code, where it's
      possible to write an infiniband hardware address into a sockaddr declared
      on the stack.
      
      Lightly tested on a dual mlx4 IPoIB setup, which properly shows a 20-octet
      hardware address now:
      
      $ cat /proc/net/bonding/bond0
      Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
      
      Bonding Mode: fault-tolerance (active-backup) (fail_over_mac active)
      Primary Slave: mlx4_ib0 (primary_reselect always)
      Currently Active Slave: mlx4_ib0
      MII Status: up
      MII Polling Interval (ms): 100
      Up Delay (ms): 100
      Down Delay (ms): 100
      
      Slave Interface: mlx4_ib0
      MII Status: up
      Speed: Unknown
      Duplex: Unknown
      Link Failure Count: 0
      Permanent HW addr:
      80:00:02:08:fe:80:00:00:00:00:00:00:e4:1d:2d:03:00:1d:67:01
      Slave queue ID: 0
      
      Slave Interface: mlx4_ib1
      MII Status: up
      Speed: Unknown
      Duplex: Unknown
      Link Failure Count: 0
      Permanent HW addr:
      80:00:02:09:fe:80:00:00:00:00:00:01:e4:1d:2d:03:00:1d:67:02
      Slave queue ID: 0
      
      Also tested with a standard 1Gbps NIC bonding setup (with a mix of
      e1000 and e1000e cards), running LNST's bonding tests.
      
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: netdev@vger.kernel.org
      Signed-off-by: default avatarJarod Wilson <jarod@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      faeeb317
  11. Oct 18, 2016
  12. Jul 01, 2016
  13. Feb 11, 2016
  14. Nov 22, 2014
  15. Nov 11, 2014
  16. Nov 01, 2014
  17. Oct 07, 2014
    • Mahesh Bandewar's avatar
      bonding: Simplify the xmit function for modes that use xmit_hash · ee637714
      Mahesh Bandewar authored
      
      Earlier change to use usable slave array for TLB mode had an additional
      performance advantage. So extending the same logic to all other modes
      that use xmit-hash for slave selection (viz 802.3AD, and XOR modes).
      Also consolidating this with the earlier TLB change.
      
      The main idea is to build the usable slaves array in the control path
      and use that array for slave selection during xmit operation.
      
      Measured performance in a setup with a bond of 4x1G NICs with 200
      instances of netperf for the modes involved (3ad, xor, tlb)
      cmd: netperf -t TCP_RR -H <TargetHost> -l 60 -s 5
      
      Mode        TPS-Before   TPS-After
      
      802.3ad   : 468,694      493,101
      TLB (lb=0): 392,583      392,965
      XOR       : 475,696      484,517
      
      Signed-off-by: default avatarMahesh Bandewar <maheshb@google.com>
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ee637714
  18. Sep 16, 2014
  19. Sep 14, 2014
    • Nikolay Aleksandrov's avatar
      bonding: adjust locking comments · 8c0bc550
      Nikolay Aleksandrov authored
      
      Now that locks have been removed, remove some unnecessary comments and
      adjust others to reflect reality. Also add a comment to "mode_lock" to
      describe its current users and give a brief summary why they need it.
      
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8c0bc550
    • Nikolay Aleksandrov's avatar
      bonding: alb: convert to bond->mode_lock · 4bab16d7
      Nikolay Aleksandrov authored
      
      The ALB/TLB specific spinlocks are no longer necessary as we now have
      bond->mode_lock for this purpose, so convert them and remove them from
      struct alb_bond_info.
      Also remove the unneeded lock/unlock functions and use spin_lock/unlock
      directly.
      
      Suggested-by: default avatarJay Vosburgh <jay.vosburgh@canonical.com>
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4bab16d7
    • Nikolay Aleksandrov's avatar
      bonding: clean curr_slave_lock use · 1c72cfdc
      Nikolay Aleksandrov authored
      
      Mostly all users of curr_slave_lock already have RTNL as we've discussed
      previously so there's no point in using it, the one case where the lock
      must stay is the 3ad code, in fact it's the only one.
      It's okay to remove it from bond_do_fail_over_mac() as it's called with
      RTNL and drops the curr_slave_lock anyway.
      bond_change_active_slave() is one of the main places where
      curr_slave_lock was used, it's okay to remove it as all callers use RTNL
      these days before calling it, that's why we move the ASSERT_RTNL() in
      the beginning to catch any potential offenders to this rule.
      The RTNL argument actually applies to all of the places where
      curr_slave_lock has been removed from in this patch.
      Also remove the unnecessary bond_deref_active_protected() macro and use
      rtnl_dereference() instead.
      
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1c72cfdc
    • Nikolay Aleksandrov's avatar
      bonding: alb: remove curr_slave_lock · 62c5f518
      Nikolay Aleksandrov authored
      
      First in rlb_teach_disabled_mac_on_primary() it's okay to remove
      curr_slave_lock as all callers except bond_alb_monitor() already hold
      RTNL, and in case bond_alb_monitor() is executing we can at most have a
      period with bad throughput (very unlikely though).
      In bond_alb_monitor() it's okay to remove the read_lock as the slave
      list is walked with RCU and the worst that could happen is another
      transmitter at the same time and thus for a period which currently is 10
      seconds (bond_alb.h: BOND_ALB_LP_TICKS).
      And bond_alb_handle_active_change() is okay because it's always called
      with RTNL. Removed the ASSERT_RTNL() because it'll be inserted in the
      parent function in a following patch.
      
      Signed-off-by: default avatarNikolay Aleksandrov <nikolay@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      62c5f518
  20. Sep 10, 2014
  21. Aug 23, 2014
  22. Jul 21, 2014
  23. Jul 17, 2014
    • Mahesh Bandewar's avatar
      bonding: Do not try to send packets over dead link in TLB mode. · 6b794c1c
      Mahesh Bandewar authored
      
      In TLB mode if tlb_dynamic_lb is NOT set, slaves from the bond
      group are selected based on the hash distribution. This does not
      exclude dead links which are part of the bond. Also if there is a
      temporary link event which brings down the interface, packets
      hashed on that interface would be dropped too.
      
      This patch fixes these issues and distributes flows across the
      UP links only. Also the array construction of links which are
      capable of sending packets happen in the control path leaving
      only link-selection during the data-path.
      
      One possible side effect of this is - at a link event; all
      flows will be shuffled to get good distribution. But impact of
      this should be minimum with the assumption that a member or
      members of the bond group are not available is a very temporary
      situation.
      
      Signed-off-by: default avatarMahesh Bandewar <maheshb@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6b794c1c
  24. Jul 16, 2014
  25. Jun 05, 2014
    • Vlad Yasevich's avatar
      bonding: Support macvlans on top of tlb/rlb mode bonds · 14af9963
      Vlad Yasevich authored
      
      To make TLB mode work, the patch allows learning packets
      to be sent using mac addresses assigned to macvlan devices,
      also taking into an account vlans that may be between the
      bond and macvlan device.
      
      To make RLB work, all we have to do is accept ARP packets
      for addresses added to the bond dev->uc list.  Since RLB
      mode will take care to update the peers directly with
      correct mac addresses, learning packets for these addresses
      do not have be send to switch.
      
      Signed-off-by: default avatarVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      14af9963
  26. May 23, 2014
    • Vlad Yasevich's avatar
      bonding: Send ALB learning packets using the right source · d0c21d43
      Vlad Yasevich authored
      
      ALB learning packets are currentlyalways sent using the slave mac
      address for all vlans configured on top of bond.   This is not always
      correct, as vlans may change their mac address.
      This patch introduced a concept of strict matching where the
      source of learning packets can either strictly match the address
      passed in, or it can determine a more correct address to use.
      
      There are 3 casese to consider:
        1) Switchover.  In this case, we have a new active slave and we need
           tell the switch about all addresses available on the slave.
        2) Monitor.  We'll periodically refresh learning info for all slaves.
           In this case, we refresh all addresses for current active, and just
           the slave address for other slaves.
        3) Teaching of disabled adddress.  This happens as part of the
           failover and in this case, we alwyas to use just the address
           provided.
      
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: default avatarVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d0c21d43
    • Vlad Yasevich's avatar
      bonding: Don't assume 802.1Q when sending alb learning packets. · d6b694c0
      Vlad Yasevich authored
      
      TLB/ALB learning packets always assume 802.1Q vlan protocol, but
      that is no longer the case since we now have support for Q-in-Q
      on top of bonding.  Pass the vlan protocol to alb_send_lp_vid()
      so that the packets are properly tagged.
      
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      Signed-off-by: default avatarVlad Yasevich <vyasevic@redhat.com>
      Acked-by: default avatarVeaceslav Falico <vfalico@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d6b694c0
  27. May 17, 2014