  1. Sep 22, 2022
    • i40e: Fix kernel crash during module removal · f04ab3c8
      Ivan Vecera authored
      stable inclusion
      from stable-v4.19.258
      commit c49f320e2492738d478bc427dcd54ccfe0cba746
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5RZPX
      
      
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit fb8396aeda5872369a8ed6d2301e2c86e303c520 ]
      
      The driver incorrectly frees the client instance, and a subsequent
      i40e module removal leads to a kernel crash.
      
      Reproducer:
      1. Do ethtool offline test followed immediately by another one
      host# ethtool -t eth0 offline; ethtool -t eth0 offline
      2. Remove recursively irdma module that also removes i40e module
      host# modprobe -r irdma
      
      Result:
      [ 8675.035651] i40e 0000:3d:00.0 eno1: offline testing starting
      [ 8675.193774] i40e 0000:3d:00.0 eno1: testing finished
      [ 8675.201316] i40e 0000:3d:00.0 eno1: offline testing starting
      [ 8675.358921] i40e 0000:3d:00.0 eno1: testing finished
      [ 8675.496921] i40e 0000:3d:00.0: IRDMA hardware initialization FAILED init_state=2 status=-110
      [ 8686.188955] i40e 0000:3d:00.1: i40e_ptp_stop: removed PHC on eno2
      [ 8686.943890] i40e 0000:3d:00.1: Deleted LAN device PF1 bus=0x3d dev=0x00 func=0x01
      [ 8686.952669] i40e 0000:3d:00.0: i40e_ptp_stop: removed PHC on eno1
      [ 8687.761787] BUG: kernel NULL pointer dereference, address: 0000000000000030
      [ 8687.768755] #PF: supervisor read access in kernel mode
      [ 8687.773895] #PF: error_code(0x0000) - not-present page
      [ 8687.779034] PGD 0 P4D 0
      [ 8687.781575] Oops: 0000 [#1] PREEMPT SMP NOPTI
      [ 8687.785935] CPU: 51 PID: 172891 Comm: rmmod Kdump: loaded Tainted: G        W I        5.19.0+ #2
      [ 8687.794800] Hardware name: Intel Corporation S2600WFD/S2600WFD, BIOS SE5C620.86B.0X.02.0001.051420190324 05/14/2019
      [ 8687.805222] RIP: 0010:i40e_lan_del_device+0x13/0xb0 [i40e]
      [ 8687.810719] Code: d4 84 c0 0f 84 b8 25 01 00 e9 9c 25 01 00 41 bc f4 ff ff ff eb 91 90 0f 1f 44 00 00 41 54 55 53 48 8b 87 58 08 00 00 48 89 fb <48> 8b 68 30 48 89 ef e8 21 8a 0f d5 48 89 ef e8 a9 78 0f d5 48 8b
      [ 8687.829462] RSP: 0018:ffffa604072efce0 EFLAGS: 00010202
      [ 8687.834689] RAX: 0000000000000000 RBX: ffff8f43833b2000 RCX: 0000000000000000
      [ 8687.841821] RDX: 0000000000000000 RSI: ffff8f4b0545b298 RDI: ffff8f43833b2000
      [ 8687.848955] RBP: ffff8f43833b2000 R08: 0000000000000001 R09: 0000000000000000
      [ 8687.856086] R10: 0000000000000000 R11: 000ffffffffff000 R12: ffff8f43833b2ef0
      [ 8687.863218] R13: ffff8f43833b2ef0 R14: ffff915103966000 R15: ffff8f43833b2008
      [ 8687.870342] FS:  00007f79501c3740(0000) GS:ffff8f4adffc0000(0000) knlGS:0000000000000000
      [ 8687.878427] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 8687.884174] CR2: 0000000000000030 CR3: 000000014276e004 CR4: 00000000007706e0
      [ 8687.891306] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 8687.898441] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 8687.905572] PKRU: 55555554
      [ 8687.908286] Call Trace:
      [ 8687.910737]  <TASK>
      [ 8687.912843]  i40e_remove+0x2c0/0x330 [i40e]
      [ 8687.917040]  pci_device_remove+0x33/0xa0
      [ 8687.920962]  device_release_driver_internal+0x1aa/0x230
      [ 8687.926188]  driver_detach+0x44/0x90
      [ 8687.929770]  bus_remove_driver+0x55/0xe0
      [ 8687.933693]  pci_unregister_driver+0x2a/0xb0
      [ 8687.937967]  i40e_exit_module+0xc/0xf48 [i40e]
      
      Two offline tests cause an IRDMA driver failure (ETIMEDOUT). This
      failure is reported back to i40e_client_subtask(), which calls
      i40e_client_del_instance() to free the client instance referenced
      by pf->cinst and sets this pointer to NULL. During module removal,
      i40e_remove() calls i40e_lan_del_device(), which dereferences
      pf->cinst while it is NULL, and the kernel crashes.

      Do not remove the client instance when the client open callback
      fails; just clear the __I40E_CLIENT_INSTANCE_OPENED bit. The driver
      also needs to handle this situation (netdev is up and the client is
      NOT opened) in i40e_notify_client_of_netdev_close() and call the
      client close callback only when __I40E_CLIENT_INSTANCE_OPENED is
      set.
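
      The behaviour described above can be sketched as a small user-space
      model. This is not the driver code: the struct layouts, the
      INSTANCE_OPENED flag and the close_calls counter are hypothetical
      stand-ins, used only to show that a failed open keeps the instance
      and that close is skipped for an instance that was never opened.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the __I40E_CLIENT_INSTANCE_OPENED state bit. */
#define INSTANCE_OPENED (1UL << 0)

struct client_instance {
	unsigned long state;
	int close_calls;	/* counts invocations of the close callback */
};

struct pf {
	struct client_instance *cinst;
};

/* Models the subtask after the client's open callback returned open_status. */
static void client_subtask(struct pf *pf, int open_status)
{
	struct client_instance *cdev;

	assert(pf != NULL);
	cdev = pf->cinst;
	if (!cdev)
		return;

	if (open_status != 0) {
		/* Keep the instance; only record that it is not opened. */
		cdev->state &= ~INSTANCE_OPENED;
		return;
	}
	cdev->state |= INSTANCE_OPENED;
}

/* Invokes the (modelled) close callback only for an opened instance. */
static void notify_client_of_netdev_close(struct pf *pf)
{
	struct client_instance *cdev;

	assert(pf != NULL);
	cdev = pf->cinst;
	if (!cdev || !(cdev->state & INSTANCE_OPENED))
		return;

	cdev->close_calls++;
	cdev->state &= ~INSTANCE_OPENED;
}
```

      Because pf->cinst stays valid after a failed open, later code that
      dereferences it during removal no longer sees a NULL pointer.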
      
      Fixes: 0ef2d5af ("i40e: KISS the client interface")
      Signed-off-by: Ivan Vecera <ivecera@redhat.com>
      Tested-by: Helena Anna Dubel <helena.anna.dubel@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      
      Signed-off-by: tangbin <tangbin_yewu@cmss.chinamobile.com>
    • i40e: Fix use-after-free in i40e_client_subtask() · 77ffc465
      Yunjian Wang authored
      stable inclusion
      from stable-v4.19.191
      commit c1322eaeb8af0d8985b5cc5fa759140fa0e57b84
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5RZPX
      
      
      CVE: NA
      
      --------------------------------
      
      [ Upstream commit 38318f23a7ef86a8b1862e5e8078c4de121960c3 ]
      
      Currently the call to i40e_client_del_instance frees the object
      pf->cinst, however pf->cinst->lan_info is being accessed after
      the free. Fix this by adding the missing return.
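
      A minimal user-space sketch of the pattern, assuming simplified
      stand-in types: the structs, the lan_info_reads counter and the
      netdev_up field are hypothetical, and only the early return after
      the free mirrors what the commit message describes.

```c
#include <assert.h>
#include <stdlib.h>

struct lan_info {
	int netdev_up;
};

struct client_instance {
	struct lan_info lan_info;
};

struct pf {
	struct client_instance *cinst;
	int lan_info_reads;	/* counts reads of cinst->lan_info */
};

/* Frees the instance and clears the owner's pointer. */
static void client_del_instance(struct pf *pf)
{
	free(pf->cinst);
	pf->cinst = NULL;
}

static void client_subtask(struct pf *pf, int open_status)
{
	struct client_instance *cdev;

	assert(pf != NULL);
	cdev = pf->cinst;
	if (!cdev)
		return;

	if (open_status != 0) {
		client_del_instance(pf);
		return;	/* without this, the read below touches freed memory */
	}

	/* Safe: cdev is still owned by pf on this path. */
	if (cdev->lan_info.netdev_up)
		pf->lan_info_reads++;
}
```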
      
      Addresses-Coverity: ("Read from pointer after free")
      Fixes: 7b0b1a6d ("i40e: Disable iWARP VSI PETCP_ENA flag on netdev down events")
      Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      
      Signed-off-by: tangbin <tangbin_yewu@cmss.chinamobile.com>
    • EDAC: skx_common: downgrade message importance on missing PCI device · 7b3c334c
      Aristeu Rozanski authored
      mainline inclusion
      from mainline-v5.6-rc4
      commit 854bb480
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q0UG
      
      
      CVE: NA
      
      ---------------------------
      
      Both the skx_edac and i10nm_edac drivers are loaded based on the matching CPU
      being available, which leads the module to be loaded automatically in virtual
      machines as well. That will fail due to the missing PCI devices. In both drivers,
      the first function to use the PCI devices is skx_get_hi_lo(), which will simply print
      
      	EDAC skx: Can't get tolm/tohm
      
      for each CPU core, which is noisy. This patch makes it a debug message.
      
      Signed-off-by: Aristeu Rozanski <aris@redhat.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Link: https://lore.kernel.org/r/20191204212325.c4k47p5hrnn3vpb5@redhat.com
      
      
      
      Signed-off-by: tangbin <tangbin_yewu@cmss.chinamobile.com>
    • x86/entry/64: Don't compile ignore_sysret if 32-bit emulation is enabled · a4ee022d
      Andy Lutomirski authored
      mainline inclusion
      from mainline-v5.3
      commit dffb3f9d
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q0UG?from=project-issue
      
      
      CVE: NA
      
      ---------------------------
      
      It's only used if !CONFIG_IA32_EMULATION, so disable it in normal
      configs.  This will save a few bytes of text and reduce confusion.
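
      The idea can be illustrated with a conditional-compilation sketch.
      It is a user-space model, not the assembly entry code: the function
      names are hypothetical, and it assumes the stub simply reports
      -ENOSYS when nothing else handles a 32-bit system call.

```c
#include <assert.h>
#include <errno.h>

/* The model returns a negated errno value, so ENOSYS must be positive. */
static_assert(ENOSYS > 0, "ENOSYS must be positive");

/* Uncomment to model a kernel built with 32-bit emulation: */
/* #define CONFIG_IA32_EMULATION 1 */

#ifndef CONFIG_IA32_EMULATION
/* Only compiled when no 32-bit emulation is configured. */
static long ignore_sysret_model(void)
{
	return -ENOSYS;
}
#endif

/* Picks the handler a 32-bit system call would reach in this model. */
static long compat_syscall_target(void)
{
#ifdef CONFIG_IA32_EMULATION
	return 0;	/* stands in for the real compat entry point */
#else
	return ignore_sysret_model();
#endif
}
```

      With the macro defined, the stub is never compiled, which is the
      text saving the commit message refers to.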
      
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: "Bae, Chang Seok" <chang.seok.bae@intel.com>
      Link: https://lkml.kernel.org/r/0f7dafa72fe7194689de5ee8cfe5d83509fabcf5.1562035429.git.luto@kernel.org
      
      
      
      Signed-off-by: tangbin <tangbin_yewu@cmss.chinamobile.com>
    • x86: Fix early boot crash on gcc-10, third try · 99b180a1
      Borislav Petkov authored
      mainline inclusion
      from mainline-v5.7
      commit a9a3ed1e
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q0UG?from=project-issue
      
      
      CVE: NA
      
      ---------------------------
      
      ... or the odyssey of trying to disable the stack protector for the
      function which generates the stack canary value.
      
      The whole story started with Sergei reporting a boot crash with a kernel
      built with gcc-10:
      
        Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_secondary
        CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.6.0-rc5-00235-gfffb08b37df9 #139
        Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77M-D3H, BIOS F12 11/14/2013
        Call Trace:
          dump_stack
          panic
          ? start_secondary
          __stack_chk_fail
          start_secondary
          secondary_startup_64
        ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_secondary
      
      This happens because gcc-10 tail-call optimizes the last function call
      in start_secondary() - cpu_startup_entry() - and thus emits a stack
      canary check which fails because the canary value changes after the
      boot_init_stack_canary() call.
      
      To fix that, the initial attempt was to mark the one function which
      generates the stack canary with:
      
        __attribute__((optimize("-fno-stack-protector"))) ... start_secondary(void *unused)
      
      however, using the optimize attribute doesn't work cumulatively
      as the attribute does not add to but rather replaces previously
      supplied optimization options - roughly all -fxxx options.
      
      The key one among them is -fno-omit-frame-pointer; losing it means the
      frame pointer, which the kernel needs, is no longer present.
      
      The next attempt to prevent compilers from tail-call optimizing
      the last function call cpu_startup_entry(), shy of carving out
      start_secondary() into a separate compilation unit and building it with
      -fno-stack-protector, was to add an empty asm("").
      
      This current solution was short and sweet, and reportedly, is supported
      by both compilers but we didn't get very far this time: future (LTO?)
      optimization passes could potentially eliminate this, which leads us
      to the third attempt: having an actual memory barrier there which the
      compiler cannot ignore or move around etc.
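
      A rough user-space sketch of the third attempt, assuming GCC or
      Clang: placing a full memory barrier after the last call means that
      call is no longer in tail position. The names idle_entry(),
      secondary_start() and prevent_tail_call() are hypothetical, and
      __sync_synchronize() is used here as the barrier; the kernel has
      its own barrier primitives.

```c
#include <assert.h>

static int idle_loop_entered;

/* Stand-in for the function that would otherwise be tail-called. */
__attribute__((noinline)) static void idle_entry(int state)
{
	assert(state >= 0);
	idle_loop_entered = 1;
}

/* Full memory barrier the compiler may not drop or reorder across. */
#define prevent_tail_call() __sync_synchronize()

static void secondary_start(void)
{
	/* In the real function the stack canary is initialised before this. */
	idle_entry(0);

	/* Code after the call keeps it from being emitted as a tail call. */
	prevent_tail_call();
}
```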
      
      That should hold for a long time, but hey we said that about the other
      two solutions too so...
      
      Reported-by: Sergei Trofimovich <slyfox@gentoo.org>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Tested-by: Kalle Valo <kvalo@codeaurora.org>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20200314164451.346497-1-slyfox@gentoo.org
      
      
      
      Signed-off-by: tangbin <tangbin_yewu@cmss.chinamobile.com>
    • objtool: Don't fail on missing symbol table · e38b3c91
      Josh Poimboeuf authored
      stable inclusion
      from stable-5.10.12
      commit c6fd968f58439398b765300aecd7758d501ee49c
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5Q0UG?from=project-issue
      
      
      CVE: NA
      
      --------------------------------
      
      commit 1d489151e9f9d1647110277ff77282fe4d96d09b upstream.
      
      Thanks to a recent binutils change which doesn't generate unused
      symbols, it's now possible for thunk_64.o to be completely empty
      without CONFIG_PREEMPTION: no text, no data, no symbols.
      
      We could edit the Makefile to only build that file when
      CONFIG_PREEMPTION is enabled, but that will likely create confusion
      if/when the thunks end up getting used by some other code again.
      
      Just ignore it and move on.
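
      "Ignore it and move on" can be sketched as follows. This is a
      simplified model rather than objtool's ELF code: struct section,
      struct object, find_section() and read_symbols() are hypothetical
      names, and the point is only that a missing symbol table is treated
      as zero symbols instead of an error.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct section {
	const char *name;
	size_t nr_entries;
};

struct object {
	const struct section *sections;
	size_t nr_sections;
	size_t nr_symbols;
};

/* Returns the named section, or NULL if the object does not have it. */
static const struct section *find_section(const struct object *obj,
					  const char *name)
{
	size_t i;

	for (i = 0; i < obj->nr_sections; i++)
		if (strcmp(obj->sections[i].name, name) == 0)
			return &obj->sections[i];
	return NULL;
}

/* Returns 0 on success; an object without a symbol table is not an error. */
static int read_symbols(struct object *obj)
{
	const struct section *symtab;

	assert(obj != NULL);
	symtab = find_section(obj, ".symtab");
	if (!symtab) {
		/* Empty object file: nothing to read, nothing to check. */
		obj->nr_symbols = 0;
		return 0;
	}

	obj->nr_symbols = symtab->nr_entries;
	return 0;
}
```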
      
      Reported-by: Nathan Chancellor <natechancellor@gmail.com>
      Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
      Reviewed-by: Miroslav Benes <mbenes@suse.cz>
      Tested-by: Nathan Chancellor <natechancellor@gmail.com>
      Link: https://github.com/ClangBuiltLinux/linux/issues/1254
      
      
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      Signed-off-by: tangbin <tangbin_yewu@cmss.chinamobile.com>
    • net: hns3: update hns3 version to 22.9.1 · de092558
      Yonglong Liu authored
      driver inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5S7WZ
      
      
      CVE: NA
      
      ----------------------------
      
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Reviewed-by: li yongxin <liyongxin1@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
    • net: hns3: fix keep alive can not resume problem when system busy · 6a804d0a
      Yonglong Liu authored
      driver inclusion
      category: bugfix
      bugzilla: https://gitee.com/openeuler/kernel/issues/I5S7WZ
      
      
      CVE: NA
      
      ----------------------------
      
      Currently, the VF sends a keep-alive to the PF every 2s, and the PF
      checks for it with an 8s timeout. In some cases the work queue is
      scheduled late and keep-alives are lost. The MAC settings from the
      PF may then not take effect on the VF, and the keep-alive cannot
      resume; only resetting the VF or reloading the VF driver recovers.

      This patch adds a keep-alive resume mechanism and some debug
      prints for this case.

      If the link status changes between keep-alive loss and resume, the
      link status of the VF may differ from that of the PF, so the PF
      now pushes the link status to the VF to avoid this.
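
      The resume mechanism might look roughly like the model below. It
      is a sketch under assumptions, not the hns3 code: struct vf_state,
      its fields and both function names are hypothetical, and only the
      8s timeout comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define KEEP_ALIVE_TIMEOUT_S 8	/* PF-side timeout from the description */

struct vf_state {
	bool alive;
	uint64_t last_keep_alive_s;
	bool pf_link_up;	/* link status as seen by the PF */
	bool vf_link_up;	/* link status last pushed to the VF */
	unsigned int resume_count;
};

/* Periodic PF check: mark the VF as lost once the timeout has expired. */
static void pf_check_keep_alive(struct vf_state *vf, uint64_t now_s)
{
	assert(vf != NULL);
	if (vf->alive && now_s - vf->last_keep_alive_s > KEEP_ALIVE_TIMEOUT_S)
		vf->alive = false;
}

/* PF handler for a keep-alive message arriving from the VF. */
static void pf_handle_keep_alive(struct vf_state *vf, uint64_t now_s)
{
	assert(vf != NULL);
	vf->last_keep_alive_s = now_s;

	if (!vf->alive) {
		/* Resume instead of staying lost until a VF reset. */
		vf->alive = true;
		vf->resume_count++;
		/* The link may have changed meanwhile: push it again. */
		vf->vf_link_up = vf->pf_link_up;
	}
}
```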
      
      Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
      Reviewed-by: li yongxin <liyongxin1@huawei.com>
      Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
  2. Sep 20, 2022
  3. Sep 14, 2022
  4. Sep 13, 2022
  5. Sep 07, 2022