Commits · 10097449f40604d121f5854b063b62dfcaaa669a · Summer2022 / 22b970495

Jun 09, 2021

block: take bd_mutex around delete_partitions in del_gendisk · 10097449


mainline inclusion
from mainline-v5.13-rc1
commit c76f48eb5c084b1e15c931ae8cc1826cd771d70d
category: bugfix
bugzilla: 55097
CVE: NA

--------------------------------

There is nothing preventing an ioctl from trying to delete partitions
concurrently with del_gendisk, so take bd_mutex to serialize against
that.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210406062303.811835-6-hch@lst.de


Signed-off-by: Jens Axboe <axboe@kernel.dk>
Conflicts:
	block/genhd.c
	block/partitions/core.c
[Yufen: linux-4.19 has not extracted blk_drop_partitions().]
Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

10097449

NFSv4: Fix second deadlock in nfs4_evict_inode() · 01f9b568

Trond Myklebust authored 3 years ago


hulk inclusion
category: bugfix
bugzilla: 51898
CVE: NA

---------------------------

If the inode is being evicted but has to return a layout first, then
that too can cause a deadlock in the corner case where the server
reboots.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

01f9b568

NFSv4: Fix deadlock between nfs4_evict_inode() and nfs4_opendata_get_inode() · d1c9802f

Trond Myklebust authored 3 years ago


hulk inclusion
category: bugfix
bugzilla: 51898
CVE: NA

---------------------------

If the inode is being evicted, but has to return a delegation first,
then it can cause a deadlock in the corner case where the server reboots
before the delegreturn completes, but while the call to iget5_locked() in
nfs4_opendata_get_inode() is waiting for the inode free to complete.
Since the open call still holds a session slot, the reboot recovery
cannot proceed.

In order to break the logjam, we can turn the delegation return into a
privileged operation for the case where we're evicting the inode. We
know that in that case, there can be no other state recovery operation
that conflicts.

Reported-by: zhangxiaoxu (A) <zhangxiaoxu5@huawei.com>
Fixes: 5fcdfacc ("NFSv4: Return delegations synchronously in evict_inode")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

Link: https://patchwork.kernel.org/project/linux-nfs/list/?series=491989


Conflict:
	fs/nfs/nfs4proc.c
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

d1c9802f

Jun 08, 2021

NFSv4.1: fix handling of backchannel binding in BIND_CONN_TO_SESSION · d71caefa

Olga Kornievskaia authored 3 years ago


mainline inclusion
from mainline-v5.7-rc4
commit dff58530
category: bugfix
bugzilla: NA
CVE: NA

--------------------------------

Currently, if the client sends BIND_CONN_TO_SESSION with
NFS4_CDFC4_FORE_OR_BOTH but only gets NFS4_CDFS4_FORE back it ignores
that it wasn't able to enable a backchannel.

To make sure, the client sends BIND_CONN_TO_SESSION as the first
operation on the connections (i.e., no other session compounds have
been sent before), and if the client's request to bind the backchannel
is not satisfied, then reset the connection and retry.

Cc: stable@vger.kernel.org
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

Conflicts:
	include/linux/sunrpc/clnt.h

Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

d71caefa

NFS: Don't gratuitously clear the inode cache when lookup failed · 9c7ae043

Trond Myklebust authored 3 years ago


mainline inclusion
from mainline-v5.12-rc3
commit 47397915ede0192235474b145ebcd81b37b03624
category: bugfix
bugzilla: NA
CVE: NA

--------------------------------

The fact that the lookup revalidation failed does not mean that the
inode contents have changed.

Fixes: 5ceb9d7f ("NFS: Refactor nfs_lookup_revalidate()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

9c7ae043

NFS: Don't revalidate the directory permissions on a lookup failure · 347702e3

Trond Myklebust authored 3 years ago


mainline inclusion
from mainline-v5.12-rc3
commit 82e7ca1334ab16e2e04fafded1cab9dfcdc11b40
category: bugfix
bugzilla: NA
CVE: NA

--------------------------------

There should be no reason to expect the directory permissions to change
just because the directory contents changed or a negative lookup timed
out. So let's avoid doing a full call to nfs_mark_for_revalidate() in
that case.
Furthermore, if this is a negative dentry, and we haven't actually done
a new lookup, then we have no reason yet to believe the directory has
changed at all. So let's remove the gratuitous directory inode
invalidation altogether when called from
nfs_lookup_revalidate_negative().

Reported-by: Geert Jansen <gerardu@amazon.com>
Fixes: 5ceb9d7f ("NFS: Refactor nfs_lookup_revalidate()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

347702e3

NFS: nfs_delegation_find_inode_server must first reference the superblock · 21175e46

Trond Myklebust authored 3 years ago


mainline inclusion
from mainline-v5.11-rc4
commit 113aac6d567bda783af36d08f73bfda47d8e9a40
category: bugfix
bugzilla: NA
CVE: NA

--------------------------------

Before referencing the inode, we must ensure that the superblock can be
referenced. Otherwise, we can end up with iput() calling superblock
operations that are no longer valid or accessible.

Fixes: e39d8a18 ("NFSv4: Fix an Oops during delegation callbacks")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

21175e46

nfs4: strengthen error check to avoid unexpected result · 178890f8

Chengguang Xu authored 3 years ago


mainline inclusion
from mainline-v5.10-rc1
commit 82c596eb
category: bugfix
bugzilla: NA
CVE: NA

--------------------------------

The variable error is of type ssize_t, which is signed and will be
converted to unsigned when compared with the variable size, so add
a check to avoid an unexpected result when error holds a negative
value.
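
The conversion described above can be reproduced in user space. This is a minimal sketch, not the code from the patch: the helper names exceeds_naive(), exceeds_checked() and compare_demo() are hypothetical, and the cast is written out only to make the implicit conversion visible.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

/* Hypothetical helper: compares the way the unpatched check would.
 * The ssize_t value is converted to size_t, so -1 becomes SIZE_MAX. */
static int exceeds_naive(ssize_t error, size_t size)
{
	return (size_t)error > size;
}

/* Hypothetical helper: rejects negative values before comparing. */
static int exceeds_checked(ssize_t error, size_t size)
{
	if (error < 0)
		return 0;	/* a negative errno is not a length */
	return (size_t)error > size;
}

/* Small self-check showing the difference for error = -1. */
static void compare_demo(void)
{
	assert(exceeds_naive(-1, 16) == 1);	/* looks like an oversized result */
	assert(exceeds_checked(-1, 16) == 0);	/* correctly not treated as a length */
}
```

With error = -1 the naive comparison reports that the result is larger than the buffer, while the checked version leaves the negative value to the normal error path.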

Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

178890f8

NFS: Fix interrupted slots by sending a solo SEQUENCE operation · 493b3c9d

Anna Schumaker authored 3 years ago


mainline inclusion
from mainline-v5.8-rc6
commit 913fadc5
category: bugfix
bugzilla: NA
CVE: NA

--------------------------------

We used to do this before 3453d570, but this was changed to better
handle the NFS4ERR_SEQ_MISORDERED error code. This commit fixed the slot
re-use case when the server doesn't receive the interrupted operation,
but if the server does receive the operation then it could still end up
replying to the client with mis-matched operations from the reply cache.

We can fix this by sending a SEQUENCE to the server while recovering from
a SEQ_MISORDERED error when we detect that we are in an interrupted slot
situation.

Fixes: 3453d570 (NFSv4.1: Avoid false retries when RPC calls are interrupted)
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

Conflicts:
	fs/nfs/nfs4proc.c

Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

493b3c9d

NFS: Ensure we time out if a delegreturn does not complete · 110c3d3d

Trond Myklebust authored 3 years ago


mainline inclusion
from mainline-v5.7-rc1
commit 244fcd2f
category: bugfix
bugzilla: NA
CVE: NA

--------------------------------

We can't allow delegreturn to hold up nfs4_evict_inode() forever,
since that can cause the memory shrinkers to block. This patch
therefore ensures that we eventually time out, and complete the
reclaim of the inode.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

110c3d3d

NFSv4.0: nfs4_do_fsinfo() should not do implicit lease renewals · 159b2c25

Robert Milkowski authored 3 years ago


mainline inclusion
from mainline-v5.6-rc1
commit 7dc2993a
category: bugfix
bugzilla: NA
CVE: NA

--------------------------------

Currently, each time nfs4_do_fsinfo() is called it will do an implicit
NFS4 lease renewal, which is not compliant with the NFS4 specification.
This can result in a lease being expired by an NFS server.

Commit 83ca7f5a ("NFS: Avoid PUTROOTFH when managing leases")
introduced implicit client lease renewal in nfs4_do_fsinfo(),
which can result in the NFSv4.0 lease expiring on the server side,
and servers returning NFS4ERR_EXPIRED or NFS4ERR_STALE_CLIENTID.

This can easily be reproduced by frequently unmounting a sub-mount,
then stat'ing it to get it mounted again, which will delay or even
completely prevent the client from sending RENEW operations if no other
NFS operations are issued. Eventually the NFS server will expire the
client's lease and return an error on file access or the next RENEW.

This can also happen when a sub-mount is automatically unmounted
due to inactivity (after nfs_mountpoint_expiry_timeout), then it is
mounted again via stat(). This can result in a short window during
which client's lease will expire on a server but not on a client.
This specific case was observed on production systems.

This patch removes the implicit lease renewal from nfs4_do_fsinfo().

Fixes: 83ca7f5a ("NFS: Avoid PUTROOTFH when managing leases")
Signed-off-by: Robert Milkowski <rmilkowski@gmail.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>

Conflicts:
	fs/nfs/nfs4proc.c

Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

159b2c25

NFS: Use kmemdup_nul() in nfs_readdir_make_qstr() · 8e7d17c8

Trond Myklebust authored 3 years ago


mainline inclusion
from mainline-v5.6-rc1
commit 3803d672
category: bugfix
bugzilla: NA
CVE: NA

--------------------------------

The directory strings stored in the readdir cache may be used with
printk(), so it is better to ensure they are nul-terminated.
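
For illustration, a user-space stand-in for what kmemdup_nul() provides: copy a fixed number of bytes and append a terminating NUL. The name memdup_nul() is hypothetical and malloc() replaces the kernel allocator; this is a sketch, not the kernel implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space analogue of kmemdup_nul(): duplicate len bytes
 * of s and always append a NUL, so the copy is safe to print as a string. */
static char *memdup_nul(const char *s, size_t len)
{
	char *buf;

	assert(s != NULL);
	buf = malloc(len + 1);
	if (!buf)
		return NULL;
	memcpy(buf, s, len);
	buf[len] = '\0';	/* source bytes need not be NUL-terminated */
	return buf;
}
```

The caller owns the returned buffer and releases it with free().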

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

8e7d17c8

NFSv3: Fix bug when using chacl and chmod to change acl · 3a812fa2

Su Yanjun authored 3 years ago


mainline inclusion
from mainline-v5.6-rc1
commit fe1e8dbe
category: bugfix
bugzilla: NA
CVE: NA

--------------------------------

We found a bug when running the following test under NFSv3:
1)
chacl u::r--,g::rwx,o:rw- file1
2)
chmod u+w file1
3)
chacl -l file1

We expect u::rw-, but it shows u::r--, most likely because it returns
the cached acl in the inode.

Digging into the code, we found that the two code paths differ.

chacl->..->__nfs3_proc_setacls->nfs_zap_acl_cache
Then nfs_zap_acl_cache clears the NFS_INO_INVALID_ACL in
NFS_I(inode)->cache_validity.

chmod->..->nfs3_proc_setattr
Because NFS_INO_INVALID_ACL has been cleared by the chacl path,
nfs_zap_acl_cache won't be called.

nfs_setattr_update_inode will set NFS_INO_INVALID_ACL, so call it
before nfs_zap_acl_cache.

Signed-off-by: Su Yanjun <suyanjun218@gmail.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

3a812fa2

NFSv4.x: Handle bad/dead sessions correctly in nfs41_sequence_process() · 9d92b3e1

Trond Myklebust authored 3 years ago


mainline inclusion
from mainline-v5.5-rc1
commit 5c441544
category: bugfix
bugzilla: NA
CVE: NA

--------------------------------

If the server returns a bad or dead session error, then we don't want
to update the session slot number, but just immediately schedule
recovery and allow it to proceed.

We can/should then remove handling in other places.

Fixes: 3453d570 ("NFSv4.1: Avoid false retries when RPC calls are interrupted")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

9d92b3e1

NFSv4.1: Only reap expired delegations · ef41787a

Trond Myklebust authored 3 years ago


mainline inclusion
from mainline-v5.3-rc4
commit ad114089
category: bugfix
bugzilla: NA
CVE: NA

--------------------------------

Fix nfs_reap_expired_delegations() to ensure that we only reap delegations
that are actually expired, rather than triggering on random errors.

Fixes: 45870d69 ("NFSv4.1: Test delegation stateids when server...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

Conflicts:
	fs/nfs/delegation.c

Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

ef41787a

NFSv4.1: Fix open stateid recovery · 3d46ec68

Trond Myklebust authored 3 years ago


mainline inclusion
from mainline-v5.3-rc4
commit 27a30cf6
category: bugfix
bugzilla: NA
CVE: NA

--------------------------------

The logic for checking in nfs41_check_open_stateid() whether the state
is supported by a delegation is inverted. In addition, it makes more
sense to perform that check before we check for expired locks.

Fixes: 8a64c4ef ("NFSv4.1: Even if the stateid is OK,...")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

Conflicts:
	fs/nfs/nfs4proc.c

Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

3d46ec68

NFSv4.1: Don't process the sequence op more than once. · 6b4f30dc

Trond Myklebust authored 3 years ago


mainline inclusion
from mainline-v5.0-rc1
commit c71c46f0
category: bugfix
bugzilla: NA
CVE: NA

--------------------------------

Ensure that if we call nfs41_sequence_process() a second time for the
same rpc_task, then we only process the results once.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

6b4f30dc

NFS: Ensure NFS writeback allocations don't recurse back into NFS. · fa10c640

Trond Myklebust authored 3 years ago


mainline inclusion
from mainline-v5.0-rc1
commit 875bc3fb
category: bugfix
bugzilla: NA
CVE: NA

--------------------------------

All the allocations that we can hit in the NFS layer and sunrpc layers
themselves are already marked as GFP_NOFS, but we need to ensure that
any calls to generic kernel functionality do the right thing as well.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

fa10c640

nfs_remount(): don't leak, don't ignore LSM options quietly · 3e917350

Al Viro authored 3 years ago


mainline inclusion
from mainline-v5.0-rc1
commit 6a0440e5
category: bugfix
bugzilla: NA
CVE: NA

--------------------------------

* if mount(2) passes something like "context=foo" with MS_REMOUNT
in flags (/sbin/mount.nfs will _not_ do that - you need to issue
the syscall manually), you'll get leaked copies for LSM options.
The reason is that instead of nfs_{alloc,free}_parsed_mount_data()
nfs_remount() uses kzalloc/kfree, which lacks the needed cleanup.

* selinux options are not changed on remount (as for any other
fs), but in case of NFS the failure is quiet - they are not compared
to what we used to have, with complaint in case of attempted changes.
Trivially fixed by converting to use of security_sb_remount().

Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Conflict:
	fs/nfs/super.c

Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

3e917350

UACCE backport from mainline · 451823fe

Yu'an Wang authored 3 years ago


hulk inclusion
category: Feature
bugzilla: NA
CVE: NA

Backport uacce from mainline. This moves uacce.c to misc/uacce and updates
Kconfig and Makefile. At the same time, uacce.h is moved from /uapi/linux
to /uapi/misc/uacce.

Signed-off-by: Yu'an Wang <wangyuan46@huawei.com>
Signed-off-by: Kai Ye <yekai13@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

451823fe

crypto: hisilicon-Cap block size at 2^31 · 946bb161

Yu'an Wang authored 3 years ago


hulk inclusion
category: Feature
bugzilla: NA
CVE: NA

The function hisi_acc_create_sg_pool may allocate a block of
memory of size PAGE_SIZE * 2^(MAX_ORDER - 1).  This value may
exceed 2^31 on ia64, which would overflow the u32.
This patch caps it at 2^31.
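
The overflow and the cap can be sketched in user space as follows. The helper capped_block_size() and the macro BLOCK_SIZE_CAP are hypothetical names; the size is computed in 64 bits so that the comparison against 2^31 is done before narrowing to 32 bits. This is an assumption-laden illustration, not the driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cap, matching the 2^31 limit named in the commit message. */
#define BLOCK_SIZE_CAP (UINT64_C(1) << 31)

/* Hypothetical helper: PAGE_SIZE * 2^(max_order - 1), capped at 2^31.
 * page_shift is log2(PAGE_SIZE). */
static uint32_t capped_block_size(unsigned int page_shift,
				  unsigned int max_order)
{
	unsigned int shift;
	uint64_t size;

	assert(max_order >= 1);
	shift = page_shift + max_order - 1;
	assert(shift < 64);

	size = UINT64_C(1) << shift;	/* 64-bit, so it cannot wrap here */
	if (size > BLOCK_SIZE_CAP)
		size = BLOCK_SIZE_CAP;
	return (uint32_t)size;		/* always fits after the cap */
}
```

With a 4 KiB page and max_order = 11 the result is the uncapped 4 MiB; with a 64 KiB page and max_order = 17 the uncapped value would be 2^32, which does not fit in a u32, so the cap applies.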

Signed-off-by: Yu'an Wang <wangyuan46@huawei.com>
Signed-off-by: Zibo Xu <xuzaibo@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

946bb161

crypto: hisilicon-hpre add req check when callback · 289ca045

Yu'an Wang authored 3 years ago


hulk inclusion
category: Bugfix
bugzilla: NA
CVE: NA

When running the hpre kernel state task, a ras error occurred.
After the driver actively called back the incomplete task to
recycle the sqe resources, the hardware wrote back the sqe and
caused the kernel calltrace.

Signed-off-by: Yu'an Wang <wangyuan46@huawei.com>
Signed-off-by: Hui Tang <tanghui20@huawei.com>
Reviewed-by: Zibo Xu <xuzaibo@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

289ca045

crypto: hisilicon- count send_ref when sending bd · b8e8ddd4

Yu'an Wang authored 3 years ago


hulk inclusion
category: Bugfix
bugzilla: NA
CVE: NA

When a BD is delivered, the RAS resets occasionally clear the BD that is
being delivered. The count send_ref ensures that the RAS process does
not perform operations on this QP when the BD is delivered.

Signed-off-by: Yu'an Wang <wangyuan46@huawei.com>
Signed-off-by: Weili Qian <qianweili@huawei.com>
Reviewed-by: Zibo Xu <xuzaibo@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

b8e8ddd4

crypto: hisilicon-enhancement of qm DFX · 6736cf1b

Yu'an Wang authored 3 years ago


hulk inclusion
category: Feature
bugzilla: NA
CVE: NA

Add DebugFS support for dumping xQC and xQE. Users can use a command to
dump information of SQC/CQC/EQC/AEQC/SQE/CQE/EQE/AEQE.

Signed-off-by: Yu'an Wang <wangyuan46@huawei.com>
Signed-off-by: Shukun Tan <tanshukun1@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

6736cf1b

crypto: hisilicon-memory management optimization · 825c43b7

Yu'an Wang authored 3 years ago


hulk inclusion
category: Feature
bugzilla: NA
CVE: NA

Put all the code for memory allocation into the QM initialization
process. Previously, the qp memory was allocated when the qp was created
and released when the qp was released. It is now changed to allocate
all the qp memory once.

Signed-off-by: Yu'an Wang <wangyuan46@huawei.com>
Signed-off-by: Weili Qian <qianweili@huawei.com>
Signed-off-by: Shukun Tan <tanshukun1@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

825c43b7

net: hns3: update hns3 version to 1.9.38.12 · 342493cd

Yonglong Liu authored 3 years ago


driver inclusion
category: bugfix
bugzilla: NA
CVE: NA

-----------------------------

This patch is used to update driver version to 1.9.38.12.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

342493cd

net: hns3: add match_id to check mailbox response from PF to VF · 51260392

Peng Li authored 3 years ago


driver inclusion
category: bugfix
bugzilla: NA
CVE: NA

----------------------------

When a VF needs a response from the PF, the VF will wait (1us - 1s) to
receive the response; otherwise the wait times out and the VF action fails.
If the VF does not receive the response to the 1st action because of a
timeout, the 2nd action may receive the response for the 1st action and get
incorrect response data. The VF must receive the right response from the
PF, or it will cause an unexpected error.

This patch adds match_id to check the mailbox response from PF to VF,
to make sure the VF gets the right response:
1. The message sent from the VF is labelled with match_id, which is a
unique 16-bit non-zero value.
2. The response sent from the PF is labelled with the match_id taken
from the request.
3. The VF uses the match_id to match request and response messages.

This scheme depends on the PF driver. If the PF driver does not support
it, the VF will use the original scheme.

PF driver adds match_id by the patch
430acf6 ("net: hns3: fix possible mismatches resp of mailbox").

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

51260392

net: hns3: fix possible mismatches resp of mailbox · 93596fe7

Chengwen Feng authored 3 years ago


driver inclusion
category: bugfix
bugzilla: NA
CVE: NA

----------------------------

Currently, the mailbox synchronous communication between VF and PF uses
the following fields to maintain communication:
1. Origin_mbx_msg, which combines the message code and subcode and is
used to match request and response.
2. Received_resp, which indicates whether a response has been received.

Mismatches are possible in the following situation:
1. VF sends message A with code=1 subcode=1.
2. PF is blocked for about 500ms when processing message A.
3. VF detects a timeout for message A because it can't get the response
within 500ms.
4. VF sends message B with code=1 subcode=1, the same as message A.
5. PF processes the first message A and sends the response message to
VF.
6. VF identifies the response as matching message B because the
code/subcode is the same. This leads to a mismatch of request and
response.

To fix the above bug, we use the following scheme:
1. The message sent from the VF is labelled with match_id, which is a
unique 16-bit non-zero value.
2. The response sent from the PF is labelled with the match_id taken
from the request.
3. The VF uses the match_id to match request and response messages.

As for the PF driver, it only needs to copy the match_id from request to
response.
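
The scheme above can be sketched as a small user-space model. All names here (struct mbx_msg, struct vf_mbx, vf_send(), pf_reply(), vf_receive()) are hypothetical and do not come from the hns3 driver; the fallback to the original code/subcode scheme for PF drivers without match_id support is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct mbx_msg {
	uint16_t code;
	uint16_t subcode;
	uint16_t match_id;
};

struct vf_mbx {
	uint16_t next_id;	/* last match_id handed out */
	uint16_t pending_id;	/* match_id of the request being waited on */
	bool received;
};

/* VF side: label each request with a fresh non-zero 16-bit match_id. */
static struct mbx_msg vf_send(struct vf_mbx *vf, uint16_t code, uint16_t subcode)
{
	struct mbx_msg req = { code, subcode, 0 };

	assert(vf != NULL);
	vf->next_id++;
	if (vf->next_id == 0)	/* skip zero when the counter wraps */
		vf->next_id = 1;
	req.match_id = vf->next_id;
	vf->pending_id = req.match_id;
	vf->received = false;
	return req;
}

/* PF side: the response only needs to carry the request's match_id. */
static struct mbx_msg pf_reply(const struct mbx_msg *req)
{
	assert(req != NULL);
	return *req;
}

/* VF side: accept a response only if it answers the pending request. */
static bool vf_receive(struct vf_mbx *vf, const struct mbx_msg *resp)
{
	if (resp->match_id != vf->pending_id)
		return false;	/* late reply to an earlier, timed-out request */
	vf->received = true;
	return true;
}
```

In the timeout scenario from the list, the late reply to message A carries A's match_id, so the VF waiting on message B drops it even though code and subcode are identical.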

Fixes: dde1a86e ("net: hns3: Add mailbox support to PF driver")

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

93596fe7

net: hns3: fix the logic for clearing resp_msg · 0ce20b79

Jiaran Zhang authored 3 years ago


driver inclusion
category: bugfix
bugzilla: NA
CVE: NA

----------------------------

In the hclge_mbx_handler function, if there are two consecutive mailbox
messages that require resp_msg, the first message's resp_msg is not
cleared after being processed, causing the second resp_msg data to be
incorrect.

Fix it by clearing the resp_msg before processing every mailbox message.
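
A minimal user-space sketch of the fix, assuming a hypothetical handler that fills in a response only for some message codes. The types and the names handle_one() and process_queue() are illustrative and not taken from hclge_mbx_handler().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RESP_DATA_LEN 8

struct resp_msg {
	uint16_t len;
	uint8_t data[RESP_DATA_LEN];
};

struct mbx_req {
	uint8_t code;
	uint8_t payload;
};

/* Hypothetical handler: only code 1 produces response data. */
static void handle_one(const struct mbx_req *req, struct resp_msg *resp)
{
	if (req->code == 1) {
		resp->len = 1;
		resp->data[0] = req->payload;
	}
}

/* Process n requests, recording the response produced for each one. */
static void process_queue(const struct mbx_req *reqs, size_t n,
			  struct resp_msg *out)
{
	struct resp_msg resp;
	size_t i;

	assert(reqs != NULL && out != NULL);
	for (i = 0; i < n; i++) {
		/* Clear before every message so that data left over from the
		 * previous response cannot leak into this one. */
		memset(&resp, 0, sizeof(resp));
		handle_one(&reqs[i], &resp);
		out[i] = resp;
	}
}
```

If the memset() were hoisted out of the loop, the second response in a sequence would still hold the first message's length and data.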

Fixes: bb5790b7 ("net: hns3: refactor mailbox response scheme between PF and VF")

Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

0ce20b79

net: hns3: fix queue id check error when configure flow director rule by ethtool · 68c65287

Jian Shen authored 3 years ago


driver inclusion
category: bugfix
bugzilla: NA
CVE: NA

----------------------------

Currently, when configuring a flow director rule, the driver uses the
total queue number of each function, rather than the active queue
number, as the upper limit value. This is inconsistent with the value
queried from "ethtool -u". So fix it.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

68c65287

net: hns3: add check for HNS3_NIC_STATE_INITED before net open · 22fff1d5

Jian Shen authored 3 years ago


driver inclusion
category: bugfix
bugzilla: NA
CVE: NA

----------------------------

There is a small chance that a VF reset fails and the VF device is left
uninitialized. In the time window before it retries, if another task
calls hns3_reset_notify_up_enet(), it will access uninitialized
ring memory and trigger a calltrace. So add a check for
HNS3_NIC_STATE_INITED before calling hns3_nic_net_open() in
hns3_reset_notify_up_enet().
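
The guard can be modelled in user space as below. The structure, the bit name NIC_STATE_INITED, the functions nic_net_open() and reset_notify_up(), and the -EFAULT return value are all assumptions made for this sketch; the driver itself uses its own state bits and helpers.

```c
#include <assert.h>
#include <errno.h>

enum nic_state_bit {
	NIC_STATE_INITED = 0,	/* stand-in for HNS3_NIC_STATE_INITED */
};

struct nic_priv {
	unsigned long state;	/* bitmask of nic_state_bit values */
	int ring_ready;		/* non-zero once ring memory is set up */
	int opened;
};

/* Stand-in for the open path, which relies on initialized rings. */
static int nic_net_open(struct nic_priv *priv)
{
	assert(priv->ring_ready);
	priv->opened = 1;
	return 0;
}

/* Refuse to open a device whose initialization has not completed. */
static int reset_notify_up(struct nic_priv *priv)
{
	if (!(priv->state & (1UL << NIC_STATE_INITED)))
		return -EFAULT;	/* assumed error code */
	return nic_net_open(priv);
}
```

A caller that arrives after a failed reset gets an error instead of reaching the open path with uninitialized rings.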

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

22fff1d5

net: hns3: add waiting time before cmdq memory is released · 4173565a

Yufeng Mo authored 3 years ago


driver inclusion
category: bugfix
bugzilla: NA
CVE: NA

----------------------------

After the cmdq registers are cleared, the firmware may take time to
clear out possible left over commands in the cmdq. Driver must release
cmdq memory only after firmware has completed processing of left over
commands.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

4173565a

net: hns3: disable firmware compatible features when uninstall PF · e1eb0c72

Guangbin Huang authored 3 years ago


driver inclusion
category: bugfix
bugzilla: NA
CVE: NA

----------------------------

Currently, the firmware compatible features are enabled in PF driver
initialization process, but they are not disabled in PF driver
deinitialization process and firmware keeps these features in enabled
status.

In this case, if an old PF driver that does not support the firmware
compatible features is loaded (for example, in a VM), firmware will still
send mailbox messages to the PF when the link status changes, and the PF
will print "un-supported mailbox message, code = 201".

To fix this problem, disable these firmware compatible features in PF
driver deinitialization process.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

e1eb0c72

net: hns3: fix change RSS 'hfunc' ineffective issue · 25be528f

Jian Shen authored 3 years ago


driver inclusion
category: bugfix
bugzilla: NA
CVE: NA

----------------------------

When the user changes the rss 'hfunc' without setting the rss 'hkey' via
the ethtool -X command, the driver ignores the 'hfunc' because the hkey
is NULL. This is unreasonable. So fix it.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

25be528f

net: hns3: fix inconsistent vf id print · f5648cfa

Jian Shen authored 3 years ago


driver inclusion
category: bugfix
bugzilla: NA
CVE: NA

----------------------------

The vf id from ethtool is incremented by 1 before being configured to the
driver. So it's necessary to subtract 1 when printing it, in order to
keep it consistent with the user's configuration.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

f5648cfa

net: hns3: remove redundant variable initialization · 24ef95d9

Yonglong Liu authored 3 years ago


driver inclusion
category: cleanup
bugzilla: NA
CVE: NA

----------------------------

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

24ef95d9

net: hns3: replace the tab before the left brace with one space · 5d77e364

Yonglong Liu authored 3 years ago


driver inclusion
category: cleanup
bugzilla: NA
CVE: NA

----------------------------

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

5d77e364

net: hns3: fix hns3_cae_pfc_storm.h missing header guard problem · 35e879f0

Yonglong Liu authored 3 years ago


driver inclusion
category: bugfix
bugzilla: NA
CVE: NA

----------------------------

This patch adds missing header guard to the hns3_cae_pfc_storm.h.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

35e879f0

net: hns3: modify an error type configuration · 5e0fe141

Yufeng Mo authored 3 years ago


driver inclusion
category: bugfix
bugzilla: NA
CVE: NA

----------------------------

According to the UM, the error configuration of the igu_egu_hw_err
is that the type and enable status should be configured separately.
The error type is also changed by mistake when we disable it
currently. The correct way is to configure these two parameters
separately.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

5e0fe141

net: hns3: put off calling register_netdev() until client initialize complete · 0b01a72d

Jian Shen authored 3 years ago


driver inclusion
category: bugfix
bugzilla: NA
CVE: NA

----------------------------

Currently, the netdevice is registered before client initialization
completes. So there is a time window in which the netdevice is available
but not yet usable. In this case, if the user tries to change the channel
number or ring param, it may cause hns3_set_rx_cpu_rmap() to be called
twice, and report a bug.

[47199.416502] hns3 0000:35:00.0 eth1: set channels: tqp_num=1, rxfh=0
[47199.430340] hns3 0000:35:00.0 eth1: already uninitialized
[47199.438554] hns3 0000:35:00.0: rss changes from 4 to 1
[47199.511854] hns3 0000:35:00.0: Channels changed, rss_size from 4 to 1, tqps from 4 to 1
[47200.163524] ------------[ cut here ]------------
[47200.171674] kernel BUG at lib/cpu_rmap.c:142!
[47200.177847] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[47200.185259] Modules linked in: hclge(+) hns3(-) hns3_cae(O) hns_roce_hw_v2 hnae3 vfio_iommu_type1 vfio_pci vfio_virqfd vfio pv680_mii(O) [last unloaded: hclge]
[47200.205912] CPU: 1 PID: 8260 Comm: ethtool Tainted: G           O      5.11.0-rc3+ #1
[47200.215601] Hardware name:  , xxxxxx 02/04/2021
[47200.223052] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[47200.230188] pc : cpu_rmap_add+0x38/0x40
[47200.237472] lr : irq_cpu_rmap_add+0x84/0x140
[47200.243291] sp : ffff800010e93a30
[47200.247295] x29: ffff800010e93a30 x28: ffff082100584880
[47200.254155] x27: 0000000000000000 x26: 0000000000000000
[47200.260712] x25: 0000000000000000 x24: 0000000000000004
[47200.267241] x23: ffff08209ba03000 x22: ffff08209ba038c0
[47200.273789] x21: 000000000000003f x20: ffff0820e2bc1680
[47200.280400] x19: ffff0820c970ec80 x18: 00000000000000c0
[47200.286944] x17: 0000000000000000 x16: ffffb43debe4a0d0
[47200.293456] x15: fffffc2082990600 x14: dead000000000122
[47200.300059] x13: ffffffffffffffff x12: 000000000000003e
[47200.306606] x11: ffff0820815b8080 x10: ffff53e411988000
[47200.313171] x9 : 0000000000000000 x8 : ffff0820e2bc1700
[47200.319682] x7 : 0000000000000000 x6 : 000000000000003f
[47200.326170] x5 : 0000000000000040 x4 : ffff800010e93a20
[47200.332656] x3 : 0000000000000004 x2 : ffff0820c970ec80
[47200.339168] x1 : ffff0820e2bc1680 x0 : 0000000000000004
[47200.346058] Call trace:
[47200.349324]  cpu_rmap_add+0x38/0x40
[47200.354300]  hns3_set_rx_cpu_rmap+0x6c/0xe0 [hns3]
[47200.362294]  hns3_reset_notify_init_enet+0x1cc/0x340 [hns3]
[47200.370049]  hns3_change_channels+0x40/0xb0 [hns3]
[47200.376770]  hns3_set_channels+0x12c/0x2a0 [hns3]
[47200.383353]  ethtool_set_channels+0x140/0x250
[47200.389772]  dev_ethtool+0x714/0x23d0
[47200.394440]  dev_ioctl+0x4cc/0x640
[47200.399277]  sock_do_ioctl+0x100/0x2a0
[47200.404574]  sock_ioctl+0x28c/0x470
[47200.409079]  __arm64_sys_ioctl+0xb4/0x100
[47200.415217]  el0_svc_common.constprop.0+0x84/0x210
[47200.422088]  do_el0_svc+0x28/0x34
[47200.426387]  el0_svc+0x28/0x70
[47200.431308]  el0_sync_handler+0x1a4/0x1b0
[47200.436477]  el0_sync+0x174/0x180
[47200.441562] Code: 11000405 79000c45 f8247861 d65f03c0 (d4210000)
[47200.448869] ---[ end trace a01efe4ce42e5f34 ]---

The process is like below:
executing hns3_client_init
|
register_netdev()
|                           hns3_set_channels()
|                           |
hns3_set_rx_cpu_rmap()      hns3_reset_notify_uninit_enet()
|                               |
|                            quit without calling function
|                            hns3_free_rx_cpu_rmap for flag
|                            HNS3_NIC_STATE_INITED is unset.
|                           |
|                           hns3_reset_notify_init_enet()
|                               |
set HNS3_NIC_STATE_INITED    call hns3_set_rx_cpu_rmap()-- crash

Fix it by calling register_netdev() at the end of function
hns3_client_init().

Fixes: 08a10068 ("net: hns3: re-organize vector handle")

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Reviewed-by: li yongxin <liyongxin1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

0b01a72d