Commits · f4b3a0007abb3a8d818125e364168dd3efeb40f3 · Summer2022 / 22b970264

May 08, 2021

pid: fix pid recover method kabi change · f4b3a000

Jingxian He authored 4 years ago


hulk inclusion
category: feature
bugzilla: 48159
CVE: N/A

------------------------------

The pid recover method adds a new member to task_struct,
which causes a kABI change.
Use a KABI_RESERVE slot of task_struct instead of
adding a new member to task_struct.
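
The reserved-slot idea can be sketched as follows. This is a minimal illustration, not the kernel's code: `struct task_v1`, `struct task_v2`, the `kabi_reserved*` field names and the placement of `fork_pid` are all hypothetical stand-ins. The point is only that reusing a pre-reserved slot keeps the structure size and the offsets of existing members unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a structure whose layout is part of the kABI. */
struct task_v1 {
	int pid;
	long state;
	unsigned long kabi_reserved1;	/* padding slot kept for future use */
	unsigned long kabi_reserved2;
};

/* Same layout, but the first reserved slot now carries the new member. */
struct task_v2 {
	int pid;
	long state;
	union {
		unsigned long kabi_reserved1;
		int fork_pid;		/* hypothetical new member in the slot */
	};
	unsigned long kabi_reserved2;
};

/* Compile-time checks: the size and later offsets are unchanged. */
static_assert(sizeof(struct task_v1) == sizeof(struct task_v2),
	      "structure size changed");
static_assert(offsetof(struct task_v1, kabi_reserved2) ==
	      offsetof(struct task_v2, kabi_reserved2),
	      "member offset changed");
```

Modules built against the old layout would still see every existing member at the same offset, which is what a reserved slot is meant to guarantee.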

Signed-off-by: Jingxian He <hejingxian@huawei.com>
Reviewed-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

4.19.90-2104.26.0

f4b3a000

config: enable kernel hotupgrade features by default · b5758284

Jingxian He authored 4 years ago


hulk inclusion
category: feature
bugzilla: 48159
CVE: N/A

------------------------------

enable kernel hot upgrade features by default:
 1 add pin mem method for checkpoint and restore:
  CONFIG_PIN_MEMORY=y
  CONFIG_PIN_MEMORY_DEV=m
 2 add pid recover method for checkpoint and restore
  CONFIG_PID_RESERVE=y
 3 add cpu park method
  CONFIG_ARM64_CPU_PARK=y
 4 add quick kexec support for kernel
  CONFIG_QUICK_KEXEC=y

Signed-off-by: Sang Yan <sangyan@huawei.com>
Signed-off-by: Jingxian He <hejingxian@huawei.com>
Reviewed-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Acked-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

b5758284

kexec: Add quick kexec support for kernel · 60384564

Jingxian He authored 4 years ago


hulk inclusion
category: feature
bugzilla: 48159
CVE: N/A

------------------------------

In normal kexec, relocating the kernel may take 5 to 10 seconds,
because all segments must be copied from vmalloced memory to
kernel boot memory with the MMU disabled.

We introduce quick kexec to save the time of copying memory as above,
just like kdump (kexec on crash), by using reserved memory
"Quick Kexec".

The quick kimage is constructed in the same way as the crash kernel
image, then all segments of the kimage are simply copied to the
reserved memory.

We also add this support to the kexec_load syscall through the
KEXEC_QUICK flag.

Signed-off-by: Sang Yan <sangyan@huawei.com>
Signed-off-by: Jingxian He <hejingxian@huawei.com>
Reviewed-by: Wang Xiongfeng <wangxiongfeng2@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

60384564

arm64: smp: Add support for cpu park · 1f9bd22e

Jingxian He authored 4 years ago


hulk inclusion
category: feature
bugzilla: 48159
CVE: N/A

------------------------------

Introduce a CPU PARK feature to save the time spent taking CPUs
down and bringing them up during kexec, which may cost 250ms per
CPU to go down and 30ms to come up.

As a result, for 128 cores, it takes more than 30 seconds
to take down and bring up the CPUs during kexec. Think about 256 cores and more.

CPU PARK is a state in which a CPU stays powered on and spins in a loop,
polling for an exit condition, such as an exit address being written.

A block of memory is reserved and filled with the cpu park text section,
exit address and park-magic-flag of each CPU. In the implementation,
one page is reserved per CPU core.

CPUs enter the park state instead of going down in machine_shutdown().
CPUs leave the park state in smp_init() instead of being brought up.

Layout of one cpu park section in the pre-reserved memory block:
+--------------+
+ exit address +
+--------------+
+ park magic   +
+--------------+
+ park codes   +
+      .       +
+      .       +
+      .       +
+--------------+
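
The diagram above can be read as a per-CPU structure. The sketch below is one possible interpretation under stated assumptions: a 4 KiB page, 64-bit fields for the exit address and magic, and the hypothetical names `struct park_section`, `park_section_offset()` and `park_reserved_bytes()`. None of these come from the patch itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PARK_PAGE_SIZE 4096u	/* assumed page size */

/* Hypothetical layout of one per-CPU park section, following the diagram. */
struct park_section {
	uint64_t exit_address;	/* written to release the CPU from the loop */
	uint64_t park_magic;	/* marks the CPU as parked */
	uint8_t  park_code[PARK_PAGE_SIZE - 2 * sizeof(uint64_t)]; /* spin-loop text */
};

static_assert(sizeof(struct park_section) == PARK_PAGE_SIZE,
	      "one section must fill exactly one page");

/* Byte offset of a CPU's section from the start of the reserved block. */
static inline size_t park_section_offset(unsigned int cpu)
{
	return (size_t)cpu * PARK_PAGE_SIZE;
}

/* Total reserved memory needed for a given number of CPUs. */
static inline size_t park_reserved_bytes(unsigned int nr_cpus)
{
	return (size_t)nr_cpus * PARK_PAGE_SIZE;
}
```

Under these assumptions, a 128-core machine needs 128 pages (512 KiB) of reserved memory.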

Signed-off-by: Sang Yan <sangyan@huawei.com>
Signed-off-by: Jingxian He <hejingxian@huawei.com>
Reviewed-by: Wang Xiongfeng <wangxiongfeng2@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

1f9bd22e

pid: add pid reserve method for checkpoint and restore · 892de178

Jingxian He authored 4 years ago


hulk inclusion
category: feature
bugzilla: 48159
CVE: N/A

------------------------------

We record the pids of the dumped tasks in the reserved memory,
and reserve those pids before the init task starts.
In the recover process, free the reserved pids and
reallocate them by setting fork_pid.

Signed-off-by: Jingxian He <hejingxian@huawei.com>
Reviewed-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

892de178

mm: add pin memory method for checkpoint and restore · 1a378b87

Jingxian He authored 4 years ago


hulk inclusion
category: feature
bugzilla: 48159
CVE: N/A

------------------------------

We can use the checkpoint and restore in userspace (criu) method to dump
and restore tasks when updating the kernel.
Currently, criu needs to dump all memory data of tasks to files.
When the memory size is very large (larger than 1G),
dumping the data takes a very long time (more than 1 min).

By pinning the memory data of tasks and collecting the corresponding physical
page mapping info during checkpoint, we can remap the physical pages to the
restored tasks after upgrading the kernel. This pin memory method can
restore the task data within one second.

The pin memory area info is saved in the reserved memblock,
which stays usable across the kernel update.

The pin memory driver provides the following ioctl commands for criu:
1) SET_PIN_MEM_AREA:
Set the pin memory area, which can be remapped to the restore task.
2) CLEAR_PIN_MEM_AREA:
Clear the pin memory area info,
which lets the user reset the pin data.
3) REMAP_PIN_MEM_AREA:
Remap the pages of the pin memory to the restore task.

Signed-off-by: Jingxian He <hejingxian@huawei.com>
Reviewed-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Reviewed-by: Xie XiuQi <xiexiuqi@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

1a378b87

May 07, 2021

Revert "sched: Introduce qos scheduler for co-location" · ea08b58e
Cheng Jian authored 4 years ago
```
This reverts commit 709d6561.
```
4.19.90-2104.25.0

ea08b58e
Revert "sched: Throttle qos cfs_rq when current cpu is running online task" · fae96c9c
Cheng Jian authored 4 years ago
```
This reverts commit 8a8e23b4.
```
fae96c9c
Revert "sched: Enable qos scheduler config" · 5400f5e9
Cheng Jian authored 4 years ago
```
This reverts commit ec6deb3a.
```
5400f5e9
Revert "memcg: support priority for oom" · 4d47f53d
Cheng Jian authored 4 years ago
```
This reverts commit a3f9b995.
```
4d47f53d
Revert "memcg: enable CONFIG_MEMCG_QOS by default" · 145f94b8
Cheng Jian authored 4 years ago
```
This reverts commit d4c4911d.
```
145f94b8
Revert "memcg: fix kabi broken when enable CONFIG_MEMCG_QOS" · 838e48a8
Cheng Jian authored 4 years ago
```
This reverts commit 0a03d546.
```
838e48a8

f2fs: fix to avoid out-of-bounds memory access · e7dd7124

Chao Yu authored 4 years ago

mainline inclusion
from mainline-v5.12
commit b862676e371715456c9dade7990c8004996d0d9e
category: bugfix
bugzilla: NA
CVE: CVE-2021-3506

--------------------------------

butt3rflyh4ck <butterflyhuangxx@gmail.com> reported a bug found by
syzkaller fuzzer with custom modifications in 5.12.0-rc3+ [1]:

 dump_stack+0xfa/0x151 lib/dump_stack.c:120
 print_address_description.constprop.0.cold+0x82/0x32c mm/kasan/report.c:232
 __kasan_report mm/kasan/report.c:399 [inline]
 kasan_report.cold+0x7c/0xd8 mm/kasan/report.c:416
 f2fs_test_bit fs/f2fs/f2fs.h:2572 [inline]
 current_nat_addr fs/f2fs/node.h:213 [inline]
 get_next_nat_page fs/f2fs/node.c:123 [inline]
 __flush_nat_entry_set fs/f2fs/node.c:2888 [inline]
 f2fs_flush_nat_entries+0x258e/0x2960 fs/f2fs/node.c:2991
 f2fs_write_checkpoint+0x1372/0x6a70 fs/f2fs/checkpoint.c:1640
 f2fs_issue_checkpoint+0x149/0x410 fs/f2fs/checkpoint.c:1807
 f2fs_sync_fs+0x20f/0x420 fs/f2fs/super.c:1454
 __sync_filesystem fs/sync.c:39 [inline]
 sync_filesystem fs/sync.c:67 [inline]
 sync_filesystem+0x1b5/0x260 fs/sync.c:48
 generic_shutdown_super+0x70/0x370 fs/super.c:448
 kill_block_super+0x97/0xf0 fs/super.c:1394

The root cause is that if a nat entry in the checkpoint journal area is
corrupted, e.g. the nid of a journalled nat entry exceeds the max nid value,
then during checkpoint, once it tries to flush the nat journal to the NAT
area, get_next_nat_page() may access out-of-bounds memory on nat_bitmap
because it uses the wrong nid value as the bitmap offset.
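
The class of bug described above can be illustrated with a generic bounds-checked bitmap lookup. This is a sketch only, not the f2fs fix: `test_bit_checked()` and `bitmap_bits()` are hypothetical helpers, and the LSB-first bit order used here is an assumption that need not match f2fs_test_bit().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Number of bits that a bitmap of `bytes` bytes can hold. */
static size_t bitmap_bits(size_t bytes)
{
	return bytes * CHAR_BIT;
}

/*
 * Test a bit only if `index` lies inside the bitmap.
 * On an out-of-range index, *valid is set to false and no memory
 * beyond the bitmap is touched.
 */
static bool test_bit_checked(const unsigned char *bitmap, size_t bytes,
			     size_t index, bool *valid)
{
	assert(bitmap != NULL && valid != NULL);

	if (index >= bitmap_bits(bytes)) {
		*valid = false;	/* e.g. a corrupted nid from the journal */
		return false;
	}
	*valid = true;
	return (bitmap[index / CHAR_BIT] >> (index % CHAR_BIT)) & 1u;
}
```

A caller that receives an index from untrusted on-disk data would check `*valid` and reject the entry instead of using it as an offset.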

[1] https://lore.kernel.org/lkml/CAFcO6XOMWdr8pObek6eN6-fs58KG9doRFadgJj-FnF-1x43s2g@mail.gmail.com/T/#u



Reported-and-tested-by: butt3rflyh4ck <butterflyhuangxx@gmail.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Fang Wei <fangwei1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

4.19.90-2104.24.0

e7dd7124

ext4: Reduce ext4 timestamp warnings · e7bf1284

Deepa Dinamani authored 4 years ago

mainline inclusion
from mainline-v5.4-rc1
commit cba465b4
category: bugfix
bugzilla: 50526
CVE: NA

-----------------------------------------------

When ext4 file systems were created intentionally with 128 byte inodes,
the rate-limited warnings of eventual possible timestamp overflow are
still emitted rather frequently.  Remove the warning for now.

Discussion for whether any warning is needed,
and where it should be emitted, can be found at
https://lore.kernel.org/lkml/1567523922.5576.57.camel@lca.pw/.
I can post a separate follow-up patch after the conclusion.

Reported-by: Qian Cai <cai@lca.pw>
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

e7bf1284

livepatch: Restoring code segment permissions after stop_machine completed · 36f72d8e

Zhao Xuehui authored 4 years ago


hulk inclusion
category: bugfix
bugzilla: 51821
CVE: NA

---------------------------

The function 'arch_klp_code_modify_prepare' is called before stop_machine
to change the permissions of the code segment to be readable and writable,
but the permissions of the code segment were not restored to the original
state after stop_machine completed. This may introduce security
issues, so this commit calls 'arch_klp_code_modify_post_process' after
'stop_machine' to fix the problem.

Signed-off-by: Zhao Xuehui <zhaoxuehui1@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

36f72d8e

livepatch: Delete redundant variable 'flag' · f360724e

Zhao Xuehui authored 4 years ago


hulk inclusion
category: bugfix
bugzilla: 51819
CVE: NA

---------------------------

The variable 'flag' in klp_try_enable_patch is assigned the
value '1', but that stored value is never used, so delete it.

Signed-off-by: Zhao Xuehui <zhaoxuehui1@huawei.com>
Reviewed-by: Yang Jihong <yangjihong1@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

f360724e

Apr 30, 2021

memcg: fix kabi broken when enable CONFIG_MEMCG_QOS · 0a03d546

Jing Xiangfeng authored 4 years ago


hulk inclusion
category: feature
bugzilla: 51827
CVE: NA

--------------------------------------

Fix it by moving memcg_priority from struct mem_cgroup to
struct mem_cgroup_extension.

Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

4.19.90-2104.23.0

0a03d546

memcg: enable CONFIG_MEMCG_QOS by default · d4c4911d

Jing Xiangfeng authored 4 years ago


hulk inclusion
category: feature
bugzilla: 51827
CVE: NA

--------------------------------------

enable CONFIG_MEMCG_QOS to support memcg OOM priority.

Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

d4c4911d

memcg: support priority for oom · a3f9b995

Jing Xiangfeng authored 4 years ago


hulk inclusion
category: feature
bugzilla: 51827
CVE: NA

--------------------------------------

If OOM occurs, we first kill a process from the low priority memcg.
If no such process is found, fall back to the normal handling.
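
The selection order can be sketched in plain C. This is an illustrative model, not the memcg code: `struct group`, `pick_victim_group()` and the convention that a smaller number means lower priority are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a memory cgroup for this sketch. */
struct group {
	int priority;	/* assumed: smaller value means lower priority */
	int nr_tasks;	/* killable tasks currently in the group */
};

/*
 * Return the index of the lowest-priority group that still has a task,
 * or -1 when no group qualifies and the caller should fall back to the
 * normal OOM victim selection.
 */
static int pick_victim_group(const struct group *groups, size_t n)
{
	int victim = -1;

	assert(groups != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (groups[i].nr_tasks == 0)
			continue;
		if (victim < 0 || groups[i].priority < groups[victim].priority)
			victim = (int)i;
	}
	return victim;
}
```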

Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

a3f9b995

sched: Enable qos scheduler config · ec6deb3a

Zhang Qiao authored 4 years ago


hulk inclusion
category: feature
bugzilla: 51828
CVE: NA

--------------------------------

Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com>
Reviewed-by: Hui Chen <clare.chenhui@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

ec6deb3a

sched: Throttle qos cfs_rq when current cpu is running online task · 8a8e23b4

Zhang Qiao authored 4 years ago


hulk inclusion
category: feature
bugzilla: 51828
CVE: NA

--------------------------------

In a co-location scenario, we usually deploy online and offline
task groups on the same server.

The online tasks are more important than the offline tasks. To
keep offline tasks from affecting online tasks, we throttle the
offline task groups when online task groups are running on the
same cpu, and unthrottle the offline tasks when the cpu is about to
enter the idle state.

Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com>
Reviewed-by: Hui Chen <clare.chenhui@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

8a8e23b4

sched: Introduce qos scheduler for co-location · 709d6561

Zhang Qiao authored 4 years ago


hulk inclusion
category: feature
bugzilla: 51828
CVE: NA

--------------------------------

We introduce the idea of a qos level to the scheduler, which is
currently supported through different scheduler policies. The qos
scheduler changes the policy of the related tasks when the qos level
of a task group is modified through the cpu.qos_level cpu cgroup file.
In this way we are able to satisfy the different needs of tasks at
different qos levels.

Signed-off-by: Zhang Qiao <zhangqiao22@huawei.com>
Reviewed-by: Hui Chen <clare.chenhui@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

709d6561

ipv6: route: convert comma to semicolon · 403ea842

Xu Wang authored 4 years ago


mainline inclusion
from mainline-v5.9-rc7
commit 91b2c9a0
category: bugfix
bugzilla: 51403
CVE: NA

-------------------------------------------------

Replace a comma between expression statements by a semicolon.

Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Xu Jia <xujia39@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

403ea842

ipv6/route: Add a missing check on proc_dointvec · c62d1c61

Aditya Pakki authored 4 years ago


mainline inclusion
from mainline-v5.0-rc1
commit f0fb9b28
category: bugfix
bugzilla: 51403
CVE: NA

-------------------------------------------------

While flushing the cache via ipv6_sysctl_rtcache_flush(), the call
to proc_dointvec() may fail. The fix adds a check that returns the
error on failure.
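
The pattern is simply "check the helper's return value before acting on its result". A minimal userspace sketch, assuming hypothetical stand-ins (`parse_value()` in place of proc_dointvec(), `handle_flush_request()` in place of the sysctl handler, and a counter in place of the real flush):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a parser that can fail, like proc_dointvec(). */
static int parse_value(int simulate_failure, int *out)
{
	assert(out != NULL);

	if (simulate_failure)
		return -EINVAL;
	*out = 1;
	return 0;
}

static int flush_calls;	/* counts how often the "flush" actually ran */

/* Flush only when parsing succeeded; otherwise propagate the error. */
static int handle_flush_request(int simulate_failure)
{
	int delay = 0;
	int ret = parse_value(simulate_failure, &delay);

	if (ret)
		return ret;	/* the previously missing check */

	flush_calls++;
	return 0;
}
```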

Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Xu Jia <xujia39@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

c62d1c61

netfilter: xtables: avoid BUG_ON · b317bddc

Florian Westphal authored 4 years ago


mainline inclusion
from mainline-v4.20-rc1
commit 70c0eb1c
category: bugfix
bugzilla: 51403
CVE: NA

-------------------------------------------------

I see no reason for them, label or timer cannot be NULL, and if they
were, we'll crash with null deref anyway.

For skb_header_pointer failure, just set hotdrop to true and toss
such packet.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Xu Jia <xujia39@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

b317bddc

SUNRPC: Test whether the task is queued before grabbing the queue spinlocks · 0ab24262

Trond Myklebust authored 4 years ago


mainline inclusion
from mainline-4.20-rc1
commit 5ce97039
category: bugfix
bugzilla: 51820
CVE: NA

-------------------------------------------------

When asked to wake up an RPC task, it makes sense to test whether or not
the task is still queued.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Huang Guobin <huangguobin4@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

0ab24262

SUNRPC: If there is no reply expected, bail early from call_decode · 5d268184

Trond Myklebust authored 4 years ago


mainline inclusion
from mainline-4.20-rc1
commit 9ee94d3e
category: bugfix
bugzilla: 51820
CVE: NA

-------------------------------------------------

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Huang Guobin <huangguobin4@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

5d268184

Apr 29, 2021

SUNRPC: Fix backchannel latency metrics · 601d2221

Chuck Lever authored 4 years ago


mainline inclusion
from mainline-v5.5-rc1
commit 8729aaba
category: bugfix
bugzilla: 51810
CVE: NA

-------------------------------------------------

I noticed that for callback requests, the reported backlog latency
is always zero, and the rtt value is crazy big. The problem was that
rqst->rq_xtime is never set for backchannel requests.

Fixes: 78215759 ("SUNRPC: Make RTT measurement more ... ")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Lu Wei <luwei32@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

601d2221

sunrpc: convert to time64_t for expiry · 2fd885ef

Arnd Bergmann authored 4 years ago


mainline inclusion
from mainline-v5.6-rc1
commit 294ec5b8
category: bugfix
bugzilla: 51810
CVE: NA

-------------------------------------------------

Using signed 32-bit types for UTC time leads to the y2038 overflow,
which is what happens in the sunrpc code at the moment.

This changes the sunrpc code over to use time64_t where possible.
The one exception is the gss_import_v{1,2}_context() function for
kerberos5, which uses 32-bit timestamps in the protocol. Here,
we can at least treat the numbers as 'unsigned', which extends the
range from 2038 to 2106.
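
The 2038 and 2106 limits follow from the largest value each 32-bit type can hold. The sketch below reproduces that arithmetic with a hypothetical helper, `overflow_year()`, which approximates a year as 365.2425 days. It is an estimate of the calendar year only, not a full date conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds in an average Gregorian year (365.2425 days). */
#define SECONDS_PER_YEAR 31556952LL

/*
 * Approximate calendar year in which a seconds-since-1970 counter
 * with the given maximum value runs out.
 */
static int overflow_year(int64_t max_seconds)
{
	assert(max_seconds >= 0);
	return 1970 + (int)(max_seconds / SECONDS_PER_YEAR);
}
```

Passing `INT32_MAX` models a signed 32-bit timestamp, and `UINT32_MAX` models the same 32 bits treated as unsigned.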

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Lu Wei <luwei32@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

2fd885ef

sunrpc: Fix potential leaks in sunrpc_cache_unhash() · 2d5b3a48

Trond Myklebust authored 4 years ago


mainline inclusion
from mainline-v5.6-rc1
commit 1d821637
category: bugfix
bugzilla: 51810
CVE: NA

-------------------------------------------------

When we unhash the cache entry, we need to handle any pending upcalls
by calling cache_fresh_unlocked().

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Lu Wei <luwei32@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

2d5b3a48

SUNRPC: Skip zero-refcount transports · b7282525

Trond Myklebust authored 4 years ago


mainline inclusion
from mainline-v5.3-rc1
commit 163f8821
category: bugfix
bugzilla: 51816
CVE: NA

-------------------------------------------------

When looking for the next transport to use for an RPC call, skip those
that are in the process of being destroyed and that have a zero refcount.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

b7282525

SUNRPC: Fix buffer handling of GSS MIC without slack · ed2c6aeb

Benjamin Coddington authored 4 years ago


mainline inclusion
from mainline-v5.4-rc1
commit 5f1bc399
category: bugfix
bugzilla: 51816
CVE: NA

-------------------------------------------------

The GSS Message Integrity Check data for krb5i may lie partially in the XDR
reply buffer's pages and tail.  If so, we try to copy the entire MIC into
free space in the tail.  But as the estimations of the slack space required
for authentication and verification have improved there may be less free
space in the tail to complete this copy -- see commit 2c94b8ec
("SUNRPC: Use au_rslack when computing reply buffer size").  In fact, there
may only be room in the tail for a single copy of the MIC, and not part of
the MIC and then another complete copy.

The real world failure reported is that `ls` of a directory on NFS may
sometimes return -EIO, which can be traced back to xdr_buf_read_netobj()
failing to find available free space in the tail to copy the MIC.

Fix this by checking for the case of the MIC crossing the boundaries of
head, pages, and tail. If so, shift the buffer until the MIC is contained
completely within the pages or tail.  This allows the remainder of the
function to create a sub buffer that directly addresses the complete MIC.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Cc: stable@vger.kernel.org # v5.1
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Lu Wei <luwei32@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

ed2c6aeb

SUNRPC: Don't allow compiler optimisation of svc_xprt_release_slot() · 1d50109a

Trond Myklebust authored 4 years ago


mainline inclusion
from mainline-5.1-rc1
commit 1602a7b7
category: bugfix
bugzilla: 51818
CVE: NA

-------------------------------------------------
Use READ_ONCE() to tell the compiler not to optimise away the read of
xprt->xpt_flags in svc_xprt_release_slot().
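
The effect of READ_ONCE() can be approximated in userspace with a volatile access. The macro below is a simplified sketch, not the kernel's definition: `READ_ONCE_SKETCH`, `struct xprt_like` and `FLAG_DATA` are hypothetical names, and `__typeof__` is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified approximation of READ_ONCE(): the volatile cast forces the
 * compiler to emit one real load instead of reusing or dropping it.
 */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

#define FLAG_DATA 0x1ul	/* hypothetical flag bit */

/* Hypothetical stand-in for a transport with a flags word. */
struct xprt_like {
	unsigned long flags;
};

/* Re-read the flags from memory on every call. */
static int has_pending_data(struct xprt_like *xprt)
{
	assert(xprt != NULL);
	return (READ_ONCE_SKETCH(xprt->flags) & FLAG_DATA) != 0;
}
```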

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

1d50109a

SUNRPC/nfs: Fix return value for nfs4_callback_compound() · 28ac1720

Trond Myklebust authored 4 years ago


mainline inclusion
from mainline-5.2-rc1
commit 83dd59a0
category: bugfix
bugzilla: NA
CVE: NA

-------------------------------------------------
RPC server procedures are normally expected to return a __be32 encoded
status value of type 'enum rpc_accept_stat', however at least one function
wants to return an authentication status of type 'enum rpc_auth_stat'
in the case where authentication fails.
This patch adds functionality to allow this.

Fixes: a4e187d8 ("NFS: Don't drop CB requests with invalid principals")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
(cherry picked from commit 83dd59a0)
Signed-off-by: Qiheng Lin <linqiheng@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

28ac1720

net/sunrpc: return 0 on attempt to write to "transports" · fa87dc82

Dan Carpenter authored 4 years ago


mainline inclusion
from mainline-5.10-rc4
commit d435c05a
category: bugfix
bugzilla: 51817
CVE: NA

-------------------------------------------------
You can't write to this file because the permissions are 0444.  But
it sort of looked like you could do a write and it would result in
a read.  Then it looked like proc_sys_call_handler() just ignored
it.  Which is confusing.  It's more clear if the "write" just
returns zero.

Also, the "lenp" pointer is never NULL so that check can be removed.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
(cherry picked from commit d435c05a)
Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

fa87dc82

net/sunrpc: Fix return value for sysctl sunrpc.transports · 9323b943

Artur Molchanov authored 4 years ago


mainline inclusion
from mainline-5.10-rc1
commit c09f56b8
category: bugfix
bugzilla: 51817
CVE: NA

-------------------------------------------------
Fix the return value for sysctl sunrpc.transports.
Return the error code from the sysctl proc_handler function proc_do_xprt instead of the number of written bytes.
Otherwise sysctl returns random garbage for this key.

Since v1:
- Handle negative returned value from memory_read_from_buffer as an error

Signed-off-by: Artur Molchanov <arturmolchanov@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
(cherry picked from commit c09f56b8)
Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

9323b943

sunrpc: raise kernel RPC channel buffer size · 49f46108

Roberto Bergantinos Corpas authored 4 years ago


mainline inclusion
from mainline-5.10-rc1
commit 27a1e8a0
category: bugfix
bugzilla: 51817
CVE: NA

-------------------------------------------------
It's possible that, using AUTH_SYS and the mountd manage-gids option, a
user may hit the 8k RPC channel buffer limit. This has been observed
in the field, causing unanswered RPCs on clients after mountd fails to
write on the channel:

rpc.mountd[11231]: auth_unix_gid: error writing reply

Userland nfs-utils uses a buffer size of 32k (RPC_CHAN_BUF_SIZE), so
let's match those two.

Signed-off-by: Roberto Bergantinos Corpas <rbergant@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
(cherry picked from commit 27a1e8a0)
Signed-off-by: Baisong Zhong <zhongbaisong@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

49f46108

sunrpc: add missing newline when printing parameter 'pool_mode' by sysfs · 105ab7c9

Xiongfeng Wang authored 4 years ago


mainline inclusion
from mainline-5.8-rc1
commit 746c6237
category: bugfix
bugzilla: 51810
CVE: NA

-------------------------------------------------
When I cat parameter '/sys/module/sunrpc/parameters/pool_mode', it
displays as follows. It is better to add a newline for easy reading.

[root@hulk-202 ~]# cat /sys/module/sunrpc/parameters/pool_mode
global[root@hulk-202 ~]#
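
The fix amounts to appending a newline when the parameter is formatted. A minimal userspace sketch, assuming a hypothetical helper `show_pool_mode()` that stands in for the parameter's show callback:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format the mode name followed by a newline so that `cat` output ends
 * cleanly. Returns the number of characters written, as snprintf() does.
 */
static int show_pool_mode(char *buf, size_t size, const char *mode)
{
	assert(buf != NULL && mode != NULL);
	return snprintf(buf, size, "%s\n", mode);
}
```

With the newline in place, the shell prompt appears on its own line after the value.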

Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
(cherry picked from commit 746c6237)
Signed-off-by: Yufen Wang <wangyufen@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

105ab7c9

xprtrdma: Fix trace point use-after-free race · d4997fb3

Chuck Lever authored 4 years ago


mainline inclusion
from mainline-5.7-rc4
commit bdb2ce82
category: bugfix
bugzilla: 51810
CVE: NA

-------------------------------------------------

It's not safe to use resources pointed to by the @send_wr of
ib_post_send() _after_ that function returns. Those resources are
typically freed by the Send completion handler, which can run before
ib_post_send() returns.

Thus the trace points currently around ib_post_send() in the
client's RPC/RDMA transport are a hazard, even when they are
disabled. Rearrange them so that they touch the Work Request only
_before_ ib_post_send() is invoked.

Fixes: ab03eff5 ("xprtrdma: Add trace points in RPC Call transmit paths")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
(cherry picked from commit bdb2ce82)
Signed-off-by: Yufen Wang <wangyufen@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

d4997fb3

SUNRPC: Fix backchannel RPC soft lockups · 399f2ab5

Chuck Lever authored 4 years ago


mainline inclusion
from mainline-5.7-rc3
commit 6221f1d9
category: bugfix
bugzilla: 51810
CVE: NA

-------------------------------------------------
Currently, after the forward channel connection goes away,
backchannel operations are causing soft lockups on the server
because call_transmit_status's SOFTCONN logic ignores ENOTCONN.
Such backchannel Calls are aggressively retried until the client
reconnects.

Backchannel Calls should use RPC_TASK_NOCONNECT rather than
RPC_TASK_SOFTCONN. If there is no forward connection, the server is
not capable of establishing a connection back to the client, thus
that backchannel request should fail before the server attempts to
send it. Commit 58255a4e ("NFSD: NFSv4 callback client should
use RPC_TASK_SOFTCONN") was merged several years before
RPC_TASK_NOCONNECT was available.

Because setup_callback_client() explicitly sets NOPING, the NFSv4.0
callback connection depends on the first callback RPC to initiate
a connection to the client. Thus NFSv4.0 needs to continue to use
RPC_TASK_SOFTCONN.

Suggested-by: Trond Myklebust <trondmy@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: <stable@vger.kernel.org> # v4.20+
(cherry picked from commit 6221f1d9)
Signed-off-by: Yufen Wang <wangyufen@huawei.com>
Reviewed-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>

399f2ab5