Commits · a5e62d73dee762e486e26df79fe05df5c2e843c6 · Summer2022 / 22b970264

Apr 02, 2022

Revert "swiotlb: rework "fix info leak with DMA_FROM_DEVICE"" · a5e62d73

Linus Torvalds authored 3 years ago

mainline inclusion
from mainline-v5.18-rc1
commit bddac7c1e02ba47f0570e494c9289acea3062cc1
category: bugfix
bugzilla: 186478, https://gitee.com/openeuler/kernel/issues/I4Z86P


CVE: CVE-2022-0854

--------------------------------

This reverts commit aa6f8dcbab473f3a3c7454b74caa46d36cdc5d13.

It turns out this breaks at least the ath9k wireless driver, and
possibly others.

What the ath9k driver does on packet receive is to set up the DMA
transfer with:

  int ath_rx_init(..)
  ..
                bf->bf_buf_addr = dma_map_single(sc->dev, skb->data,
                                                 common->rx_bufsize,
                                                 DMA_FROM_DEVICE);

and then the receive logic (through ath_rx_tasklet()) will fetch
incoming packets

  static bool ath_edma_get_buffers(..)
  ..
        dma_sync_single_for_cpu(sc->dev, bf->bf_buf_addr,
                                common->rx_bufsize, DMA_FROM_DEVICE);

        ret = ath9k_hw_process_rxdesc_edma(ah, rs, skb->data);
        if (ret == -EINPROGRESS) {
                /*let device gain the buffer again*/
                dma_sync_single_for_device(sc->dev, bf->bf_buf_addr,
                                common->rx_bufsize, DMA_FROM_DEVICE);
                return false;
        }

and it's worth noting how that first DMA sync:

    dma_sync_single_for_cpu(..DMA_FROM_DEVICE);

is there to make sure the CPU can read the DMA buffer (possibly by
copying it from the bounce buffer area, or by doing some cache flush).
The iommu correctly turns that into a "copy from bounce buffer" so that
the driver can look at the state of the packets.

In the meantime, the device may continue to write to the DMA buffer, but
we at least have a snapshot of the state due to that first DMA sync.

But that _second_ DMA sync:

    dma_sync_single_for_device(..DMA_FROM_DEVICE);

is telling the DMA mapping that the CPU wasn't interested in the area
because the packet wasn't there.  In the case of a DMA bounce buffer,
that is a no-op.

Note how it's not a sync for the CPU (the "for_device()" part), and it's
not a sync for data written by the CPU (the "DMA_FROM_DEVICE" part).

Or rather, it _should_ be a no-op.  That's what commit aa6f8dcbab47
broke: it made the code bounce the buffer unconditionally, and changed
the DMA_FROM_DEVICE to just unconditionally and illogically be
DMA_TO_DEVICE.

[ Side note: purely within the confines of the swiotlb driver it wasn't
  entirely illogical: The reason it did that odd DMA_FROM_DEVICE ->
  DMA_TO_DEVICE conversion thing is because inside the swiotlb driver,
  it uses just a swiotlb_bounce() helper that doesn't care about the
  whole distinction of who the sync is for - only which direction to
  bounce.

  So it took the "sync for device" to mean that the CPU must have been
  the one writing, and thought it meant DMA_TO_DEVICE. ]

Also note how the commentary in that commit was wrong, probably due to
that whole confusion, claiming that the commit makes the swiotlb code

                                  "bounce unconditionally (that is, also
    when dir == DMA_TO_DEVICE) in order to avoid synchronising back stale
    data from the swiotlb buffer"

which is nonsensical for two reasons:

 - that "also when dir == DMA_TO_DEVICE" is nonsensical, as that was
   exactly when it always did - and should do - the bounce.

 - since this is a sync for the device (not for the CPU), we're clearly
   fundamentally not copying back stale data from the bounce buffers at
   all, because we'd be copying *to* the bounce buffers.

So that commit was just very confused.  It confused the direction of the
synchronization (to the device, not the cpu) with the direction of the
DMA (from the device).
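
The distinction between "who the sync is for" and "which way the DMA goes"
can be illustrated with a small user-space model. This is only a sketch and
not the kernel's swiotlb code: struct bounce_map, model_sync_for_cpu() and
model_sync_for_device() are hypothetical names, and the enum merely mirrors
the kernel's DMA direction constants.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the kernel's DMA directions; the names are illustrative only. */
enum model_dma_dir { MODEL_BIDIRECTIONAL, MODEL_TO_DEVICE, MODEL_FROM_DEVICE };

/* Hypothetical mapping: the driver's buffer plus the bounce slot. */
struct bounce_map {
	unsigned char *orig;	/* buffer the CPU reads and writes */
	unsigned char *bounce;	/* buffer the device reads and writes */
	size_t len;
};

/* Sync for the CPU: pull in whatever the device may have written. */
void model_sync_for_cpu(struct bounce_map *m, enum model_dma_dir dir)
{
	assert(m != NULL && m->orig != NULL && m->bounce != NULL);
	if (dir == MODEL_FROM_DEVICE || dir == MODEL_BIDIRECTIONAL)
		memcpy(m->orig, m->bounce, m->len);
}

/*
 * Sync for the device: push out whatever the CPU may have written.
 * With MODEL_FROM_DEVICE the CPU wrote nothing, so this is a no-op.
 */
void model_sync_for_device(struct bounce_map *m, enum model_dma_dir dir)
{
	assert(m != NULL && m->orig != NULL && m->bounce != NULL);
	if (dir == MODEL_TO_DEVICE || dir == MODEL_BIDIRECTIONAL)
		memcpy(m->bounce, m->orig, m->len);
}
```

In this model a "for device" sync with MODEL_FROM_DEVICE leaves the bounce
slot alone, so data the device wrote in the meantime is not overwritten.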

Reported-and-bisected-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reported-by: Olha Cherevyk <olha.cherevyk@gmail.com>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Kalle Valo <kvalo@kernel.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Toke Høiland-Jørgensen <toke@toke.dk>
Cc: Maxime Bizon <mbizon@freebox.fr>
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Conflicts:
	Documentation/core-api/dma-attributes.rst
	include/linux/dma-mapping.h
	kernel/dma/swiotlb.c
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

USB: gadget: validate endpoint index for xilinx udc · a91080fd

Szymon Heidrich authored 3 years ago

mainline inclusion
from mainline-v5.17-rc4
commit 7f14c7227f342d9932f9b918893c8814f86d2a0d
category: bugfix
bugzilla: 186427, https://gitee.com/openeuler/kernel/issues/I4ZWQA


CVE: CVE-2022-27223

--------------------------------

Ensure that the host cannot manipulate the index to point
past the endpoint array.
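
The kind of check described here can be sketched as follows. This is a
generic illustration, not the xilinx udc driver code: struct model_udc,
MODEL_MAX_ENDPOINTS and model_get_ep() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_ENDPOINTS 8	/* hypothetical size of the endpoint array */

struct model_ep {
	int id;
};

struct model_udc {
	struct model_ep ep[MODEL_MAX_ENDPOINTS];
};

/* Return the endpoint for a host-supplied index, or NULL if out of range. */
struct model_ep *model_get_ep(struct model_udc *udc, unsigned int index)
{
	assert(udc != NULL);
	if (index >= MODEL_MAX_ENDPOINTS)
		return NULL;	/* reject instead of indexing past the array */
	return &udc->ep[index];
}
```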

Signed-off-by: Szymon Heidrich <szymon.heidrich@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Lijun Fang <fanglijun3@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

sr9700: sanity check for packet length · 2126576f

Oliver Neukum authored 3 years ago

mainline inclusion
from mainline-v5.17-rc4
commit e9da0b56fe27206b49f39805f7dcda8a89379062
category: bugfix
bugzilla: 186472, https://gitee.com/openeuler/kernel/issues/I4ZWKH


CVE: CVE-2022-26966

--------------------------------

A malicious device can leak heap data to user space by
providing bogus frame lengths. Introduce a sanity check.
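
A sanity check of this kind could look roughly like the sketch below. The
framing (a 3-byte header carrying a 16-bit little-endian length) and the
names MODEL_HDR_LEN, MODEL_MAX_PAYLOAD and model_parse_frame_len() are
assumptions made for illustration, not the sr9700 driver's actual layout.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_HDR_LEN     3	/* hypothetical: status byte + 16-bit length */
#define MODEL_MAX_PAYLOAD 1514	/* hypothetical upper bound for one frame */

/* Return the payload length, or -1 if the device-claimed length is bogus. */
long model_parse_frame_len(const unsigned char *buf, size_t buf_len)
{
	size_t claimed;

	assert(buf != NULL);
	if (buf_len < MODEL_HDR_LEN)
		return -1;	/* not even a full header */

	claimed = (size_t)buf[1] | ((size_t)buf[2] << 8);
	if (claimed > MODEL_MAX_PAYLOAD || claimed > buf_len - MODEL_HDR_LEN)
		return -1;	/* would read past the received data */

	return (long)claimed;
}
```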

Signed-off-by: Oliver Neukum <oneukum@suse.com>
Reviewed-by: Grant Grundler <grundler@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Lijun Fang <fanglijun3@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

ima: Fix return value of ima_write_policy() · 50cea8d3

Roberto Sassu authored 3 years ago

stable inclusion
from stable-v4.19.125
commit 657a03ff6c97319d8e664c05a1beebd8eb15d049
category: bugfix
bugzilla: 91661, https://gitee.com/openeuler/kernel/issues/I5047U


CVE: NA

-----------------------------------------------------------------

[ Upstream commit 2e3a34e9 ]

This patch fixes the return value of ima_write_policy() when a new policy
is directly passed to IMA and the current policy requires appraisal of the
file containing the policy. Currently, if appraisal is not in ENFORCE mode,
ima_write_policy() returns 0 and leads user space applications to an
endless loop. Fix this issue by denying the operation regardless of the
appraisal mode.

Cc: stable@vger.kernel.org # 4.10.x
Fixes: 19f8a847 ("ima: measure and appraise the IMA policy itself")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Reviewed-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Wang Weiyang <wangweiyang2@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

ima: Don't modify file descriptor mode on the fly · d3428741

Roberto Sassu authored 3 years ago

stable inclusion
from stable-v4.19.164
commit 709ed96f6ef2b040a0ea9274b4d67ce61a095eb3
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5047U


CVE: NA

-----------------------------------------------------------------

commit 207cdd565dfc95a0a5185263a567817b7ebf5467 upstream.

Commit a408e4a8 ("ima: open a new file instance if no read
permissions") already introduced a second open to measure a file when the
original file descriptor does not allow it. However, it didn't remove the
existing method of changing the mode of the original file descriptor, which
is still necessary if the current process does not have enough privileges
to open a new one.

Changing the mode isn't really an option, as the filesystem might need to
do preliminary steps to make the read possible. Thus, this patch removes
the code and keeps the second open as the only option to measure a file
when it is unreadable with the original file descriptor.

Cc: <stable@vger.kernel.org> # 4.20.x: 0014cc04 ima: Set file->f_mode
Fixes: 2fe5d6de ("ima: integrity appraisal extension")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Wang Weiyang <wangweiyang2@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

ima: Set file->f_mode instead of file->f_flags in ima_calc_file_hash() · 458e2478

Roberto Sassu authored 3 years ago

stable inclusion
from stable-v4.19.125
commit 904de138bae903a6e377322e64312e22a84f2140
category: bugfix
bugzilla: 91657, https://gitee.com/openeuler/kernel/issues/I5047U


CVE: NA

-----------------------------------------------------------------

[ Upstream commit 0014cc04 ]

Commit a408e4a8 ("ima: open a new file instance if no read
permissions") tries to create a new file descriptor to calculate a file
digest if the file has not been opened with O_RDONLY flag. However, if a
new file descriptor cannot be obtained, it sets the FMODE_READ flag to
file->f_flags instead of file->f_mode.

This patch fixes this issue by replacing f_flags with f_mode as it was
before that commit.

Cc: stable@vger.kernel.org # 4.20.x
Fixes: a408e4a8 ("ima: open a new file instance if no read permissions")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Wang Weiyang <wangweiyang2@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

ima: Remove __init annotation from ima_pcrread() · d073934a

Roberto Sassu authored 3 years ago

stable inclusion
from stable-v4.19.169
commit de581e41716795ce93506f3e5b0200048aa4439c
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I5047U


CVE: NA

-----------------------------------------------------------------

commit 8b8c704d upstream.

Commit 6cc7c266 ("ima: Call ima_calc_boot_aggregate() in
ima_eventdigest_init()") added a call to ima_calc_boot_aggregate() so that
the digest can be recalculated for the boot_aggregate measurement entry if
the 'd' template field has been requested. For the 'd' field, only SHA1 and
MD5 digests are accepted.

Given that ima_eventdigest_init() does not have the __init annotation, all
functions called should not have it. This patch removes __init from
ima_pcrread().

Cc: stable@vger.kernel.org
Fixes: 6cc7c266 ("ima: Call ima_calc_boot_aggregate() in ima_eventdigest_init()")
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Wang Weiyang <wangweiyang2@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

ima: Call ima_calc_boot_aggregate() in ima_eventdigest_init() · b7a7cb07

Roberto Sassu authored 3 years ago

stable inclusion
from stable-v4.19.129
commit fcb067cb457e2326c6d759e346f5f5dfef351d50
category: bugfix
bugzilla: 89622, https://gitee.com/openeuler/kernel/issues/I5047U


CVE: NA

-----------------------------------------------------------------

[ Upstream commit 6cc7c266 ]

If the template field 'd' is chosen and the digest to be added to the
measurement entry was not calculated with SHA1 or MD5, it is
recalculated with SHA1, by using the passed file descriptor. However, this
cannot be done for boot_aggregate, because there is no file descriptor.

This patch adds a call to ima_calc_boot_aggregate() in
ima_eventdigest_init(), so that the digest can be recalculated also for the
boot_aggregate entry.

Cc: stable@vger.kernel.org # 3.13.x
Fixes: 3ce1217d ("ima: define template fields library and new helpers")
Reported-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

Conflicts:
	security/integrity/ima/ima_crypto.c

Signed-off-by: Wang Weiyang <wangweiyang2@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

evm: Check size of security.evm before using it · 192b51c2

Roberto Sassu authored 3 years ago

stable inclusion
from stable-v4.19.155
commit 05f703b07727c0eb81f487143b583d5f8561d900
category: bugfix
bugzilla: 83797, https://gitee.com/openeuler/kernel/issues/I5047U


CVE: NA

-----------------------------------------------------------------

commit 455b6c91 upstream.

This patch checks the size for the EVM_IMA_XATTR_DIGSIG and
EVM_XATTR_PORTABLE_DIGSIG types to ensure that the algorithm is read from
the buffer returned by vfs_getxattr_alloc().
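
The idea of checking the size before reading a field can be sketched as
follows. The header layout and the names struct model_sig_hdr and
model_read_hash_algo() are hypothetical and only stand in for the real
xattr structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-size header that precedes a signature blob. */
struct model_sig_hdr {
	unsigned char type;
	unsigned char version;
	unsigned char hash_algo;
};

/*
 * Return the hash algorithm from the header, or -1 if the buffer is too
 * short to contain a complete header.
 */
int model_read_hash_algo(const unsigned char *buf, size_t len)
{
	assert(buf != NULL);
	if (len < sizeof(struct model_sig_hdr))
		return -1;	/* reading hash_algo would go past the buffer */
	return buf[offsetof(struct model_sig_hdr, hash_algo)];
}
```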

Cc: stable@vger.kernel.org # 4.19.x
Fixes: 5feeb611 ("evm: Allow non-SHA1 digital signatures")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Wang Weiyang <wangweiyang2@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

ima: Don't ignore errors from crypto_shash_update() · 93dfedd1

Roberto Sassu authored 3 years ago

stable inclusion
from stable-v4.19.153
commit c470dc530c9ee6ef4b22fed19c77e20c745564e1
category: bugfix
bugzilla: 83782, https://gitee.com/openeuler/kernel/issues/I5047U


CVE: NA

-----------------------------------------------------------------

commit 60386b85 upstream.

Errors returned by crypto_shash_update() are not checked in
ima_calc_boot_aggregate_tfm() and thus can be overwritten at the next
iteration of the loop. This patch adds a check after calling
crypto_shash_update() and returns immediately if the result is not zero.
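
The pattern can be sketched in plain C as below. model_update() is a
hypothetical stand-in for the hash update step (it is not
crypto_shash_update()), and model_hash_chunks() is an illustrative loop, not
the IMA function itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical context: counts calls and fails on call number fail_at. */
struct model_ctx {
	int calls;
	int fail_at;
};

/* Stand-in for a hash update step; returns 0 or a negative error. */
int model_update(struct model_ctx *ctx, const unsigned char *data, size_t len)
{
	(void)data;
	(void)len;
	return ctx->calls++ == ctx->fail_at ? -5 : 0;
}

/*
 * Feed every chunk to model_update() and stop at the first failure, so a
 * later successful call cannot overwrite the error.
 */
int model_hash_chunks(struct model_ctx *ctx,
		      const unsigned char *const *chunks,
		      const size_t *lens, size_t nr_chunks)
{
	int rc = 0;
	size_t i;

	assert(ctx != NULL);
	for (i = 0; i < nr_chunks; i++) {
		rc = model_update(ctx, chunks[i], lens[i]);
		if (rc != 0)
			return rc;
	}
	return rc;
}
```

Without the early return, an error from one iteration would be replaced by
the result of the next one.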

Cc: stable@vger.kernel.org
Fixes: 3323eec9 ("integrity: IMA as an integrity service provider")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Wang Weiyang <wangweiyang2@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

Mar 30, 2022

mm: Fallback to non-mirrored region below low watermark · 368d710d

Ma Wupeng authored 3 years ago

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4SK3S


CVE: NA

--------------------------------

Commit 45bd608e ("mm: Introduce watermark check for memory reliable")
introduced a watermark to reserve memory in the mirrored region for kernel
usage. However, this value does not interact well with kswapd, and it leads
to a fallback to the non-mirrored region without releasing pagecache in the
mirrored region.

With this patch, the watermark is set to zero to avoid this problem.
Memory allocation with ___GFP_RELIABILITY will fall back to the non-mirrored
region if the zone's low watermark is reached, and kswapd will be woken up
at that time.

Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

mm: Disable watermark check if reliable fallback is disabled · a30ba5b6

Ma Wupeng authored 3 years ago

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4SK3S


CVE: NA

--------------------------------

If the memory reliable watermark is far greater than the zone's default
watermark, reliable memory allocation will trigger OOM without waking up
kswapd to release pagecache.

With this patch, the memory reliable watermark is disabled if memory
fallback is disabled.

Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

mm: Do limit checking after memory allocation for memory reliable · 875ffd41

Ma Wupeng authored 3 years ago

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4SK3S


CVE: NA

--------------------------------

Previously, limit checking was done before the real memory allocation. As a
result, some steps in __alloc_pages_slowpath() (kswapd, direct compaction
and so on) were never reached.

Now limit checking is done at the end of __alloc_pages_nodemask(). Pages
will be released if this memory allocation is stopped by limit checking.

Memory allocation will fall back to the movable zone if one of the following
conditions is met:
- memory reliable fallback is enabled
- global init process

Memory allocation with __GFP_NOFAIL will not check any limit.

If memory reliable fallback is disabled and the gfp flags do not contain any
of the following flags:
- __GFP_NORETRY
- __GFP_RETRY_MAYFAIL
- __GFP_THISNODE
Or the following conditions are false
- current->flags & PF_DUMPCORE
- order > PAGE_ALLOC_COSTLY_ORDER
OOM will be triggered to release some memory.

Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

Mar 26, 2022

livepatch/arm64: Fix incorrect endian conversion when long jump · 2b491a2e

Zheng Yejian authored 3 years ago

hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4ZIB2


CVE: NA

--------------------------------

A kernel panic happened on an arm64 big-endian board after calling a function
that had been live-patched. It can be reproduced as follows:
  1. Insert 'livepatch-sample.ko' to patch the kernel function 'cmdline_proc_show';
  2. Enable the patch by executing:
     echo 1 > /sys/kernel/livepatch/livepatch-sample/enabled
  3. Call 'cmdline_proc_show' by executing:
     cat /proc/cmdline
  4. Then we get the following panic logs:
     > kernel BUG at arch/arm64/kernel/traps.c:408!
     > Internal error: Oops - BUG: 0 [#1] SMP
     > Modules linked in: dump_mem(OE) livepatch_cmdline1(OEK) hi1382_gmac(OE)
     > [last unloaded: dump_mem]
     > CPU: 3 PID: 1752 Comm: cat Session: 0 Tainted: G           OE K
     > 5.10.0+ #2
     > Hardware name: Hisilicon PhosphorHi1382 (DT)
     > pstate: 00000005 (nzcv daif -PAN -UAO -TCO BTYPE=--)
     > pc : do_undefinstr+0x23c/0x2b4
     > lr : do_undefinstr+0x5c/0x2b4
     > sp : ffffffc010ac3a80
     > x29: ffffffc010ac3a80 x28: ffffff82eb0a8000
     > x27: 0000000000000000 x26: 0000000000000001
     > x25: 0000000000000000 x24: 0000000000001000
     > x23: 0000000000000000 x22: ffffffd0e0f16000
     > x21: ffffffd0e0ae7000 x20: ffffffc010ac3b00
     > x19: 0000000000021fd6 x18: ffffffd0e04aad94
     > x17: 0000000000000000 x16: 0000000000000000
     > x15: ffffffd0e04b519c x14: 0000000000000000
     > x13: 0000000000000000 x12: 0000000000000000
     > x11: 0000000000000000 x10: 0000000000000000
     > x9 : 0000000000000000 x8 : 0000000000000000
     > x7 : 0000000000000000 x6 : ffffffd0e0f16100
     > x5 : 0000000000000000 x4 : 00000000d5300000
     > x3 : 0000000000000000 x2 : ffffffd0e0f160f0
     > x1 : ffffffd0e0f16103 x0 : 0000000000000005
     > Call trace:
     >  do_undefinstr+0x23c/0x2b4
     >  el1_undef+0x2c/0x44
     >  el1_sync_handler+0xa4/0xb0
     >  el1_sync+0x74/0x100
     >  cmdline_proc_show+0xc/0x44
     >  proc_reg_read_iter+0xb0/0xc4
     >  new_sync_read+0x10c/0x15c
     >  vfs_read+0x144/0x18c
     >  ksys_read+0x78/0xe8
     >  __arm64_sys_read+0x24/0x30

We compare the first 6 instructions of 'cmdline_proc_show' before and after
the patch (see below). Four instructions are modified, so this is the case
where the offset between the old and new function exceeds 128M. We found
that the instruction at 'cmdline_proc_show+0xc' seems incorrect (it is
expected to be '00021fd6').
  origin:     patched:
  --------    --------
  fd7bbea9    929ff7f0
  21d500f0    f2a91b30
  fd030091    f2d00010
  211040f9    d61f0200 <-- cmdline_proc_show+0xc (expected '00021fd6')
  f30b00f9    f30b00f9
  f30300aa    f30300aa

It is caused by an incorrect big-to-little endian conversion, and we
correct it.
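
The byte order involved can be shown with a small user-space sketch. A64
instructions are stored in memory in little-endian byte order even when the
kernel runs big-endian, so an instruction word has to be written byte by byte
(or converted) rather than stored in the CPU's native order. The helper name
model_put_insn_le32() is hypothetical and this is not the livepatch code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Store a 32-bit instruction word as little-endian bytes, independent of
 * the byte order of the CPU running this code.
 */
void model_put_insn_le32(unsigned char *dst, uint32_t insn)
{
	assert(dst != NULL);
	dst[0] = (unsigned char)(insn & 0xff);
	dst[1] = (unsigned char)((insn >> 8) & 0xff);
	dst[2] = (unsigned char)((insn >> 16) & 0xff);
	dst[3] = (unsigned char)((insn >> 24) & 0xff);
}
```

For the word 0xd61f0200 this yields the bytes 00 02 1f d6, which matches the
expected value in the dump above.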

Fixes: 5aa9a1a3 ("livepatch/arm64: support livepatch without ftrace")
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Reviewed-by: Kuohai Xu <xukuohai@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

4.19.90-2203.5.0

arm64/mpam: realign step entry when traversing rmid_transform · e79aa56b

Wang ShaoBo authored 3 years ago

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4SE03


CVE: NA

---------------------------------------------------

This makes the step entry aligned with step_size*step_cnt rather than
step_size, and checks for alignment before traversing rmid_transform.

When modifying rmid with a value not aligned with step_size*step_cnt,
for_each_rmid_transform_point_step_from might miss the next step point if
it has been occupied and step_cnt or step_size is not equal to 1,
which causes the actually allocated rmid to be inconsistent with the
expected one.

Fixes: b47b9a81 ("arm64/mpam: rmid: refine allocation and release process")
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

dt-bindings: mpam: refactor device tree node structure · 880b1947

Xingang Wang authored 3 years ago

arm64/mpam: refactor device tree structure to support multiple
devices

ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49RB2


CVE: NA

---------------------------------------------------

To support multiple mpam device nodes, all nodes should be organized
as children of the same parent node. This makes sure that the mpam
discovery start and complete procedures run in the right order.
Update the devicetree documentation to record this.

Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

arm64/mpam: refactor device tree structure to support multiple devices · 33dea90e

Xingang Wang authored 3 years ago

ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I49RB2


CVE: NA

---------------------------------------------------

The process of MPAM device tree initialization is like this:
arm_mpam_device_probe() 	// driver probe
  mpam_discovery_start()	// start discover mpam devices
    [...] 			// find and add mpam devices
  mpam_discovery_complete()   	// trigger mpam_enable

When there are multiple mpam device nodes, the driver probe procedure
will execute more than once. However, mpam_discovery_start() and
mpam_discovery_complete() should each run only once. Besides, the start
should run first, and the complete should run after all devices are added.

So we reorganize the device tree structure so that there is only one
mpam device parent node, and the probe procedure runs only once.
We add child nodes to represent the mpam devices, and traverse and
add all mpam devices in the middle of the driver probe procedure.
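
The intended ordering can be modeled with a minimal sketch. The counters and
the name model_probe_parent() are hypothetical; they only stand in for the
discovery start, device add and discovery complete steps and are not the
driver's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters standing in for the real discovery state. */
struct model_discovery {
	int started;
	int completed;
	int devices;
};

/* Probe a single parent node: start once, add every child, complete once. */
void model_probe_parent(struct model_discovery *d, size_t nr_children)
{
	size_t i;

	assert(d != NULL);
	d->started++;			/* models mpam_discovery_start() */
	for (i = 0; i < nr_children; i++)
		d->devices++;		/* models adding one mpam device */
	d->completed++;			/* models mpam_discovery_complete() */
}
```

With one parent node and three children, start and complete each run once
while three devices are added in between.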

Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

arm64/mpam: fix __mpam_device_create() section mismatch error · 11e2bfdd

Xingang Wang authored 3 years ago

ascend inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I49RB2


CVE: NA

---------------------------------------------------

Fix modpost section mismatch errors in __mpam_device_create() and others.
These warnings occur with newer versions of gcc, for example 10.1.0.

  [...]
  WARNING: vmlinux.o(.text+0x2ed88): Section mismatch in reference from the
  function __mpam_device_create() to the function .init.text:mpam_device_alloc()
  The function __mpam_device_create() references
  the function __init mpam_device_alloc().
  This is often because __mpam_device_create lacks a __init
  annotation or the annotation of mpam_device_alloc is wrong.

  WARNING: vmlinux.o(.text.unlikely+0xa5c): Section mismatch in reference from
  the function mpam_resctrl_init() to the function .init.text:mpam_init_padding()
  The function mpam_resctrl_init() references
  the function __init mpam_init_padding().
  This is often because mpam_resctrl_init lacks a __init
  annotation or the annotation of mpam_init_padding is wrong.

  WARNING: vmlinux.o(.text.unlikely+0x5a9c): Section mismatch in reference from
  the function resctrl_group_init() to the function .init.text:resctrl_group_setup_root()
  The function resctrl_group_init() references
  the function __init resctrl_group_setup_root().
  This is often because resctrl_group_init lacks a __init
  annotation or the annotation of resctrl_group_setup_root is wrong.
  [...]

Fixes: c5e27c39 ("arm64/mpam: remove __init macro to support driver probe")
Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

Mar 24, 2022

block-map: add __GFP_ZERO flag for alloc_page in function bio_copy_kern · b253ac1c

Haimin Zhang authored 3 years ago

mainline inclusion
from mainline-v5.17-rc5
commit cc8f7fe1f5eab010191aa4570f27641876fa1267
category: bugfix
bugzilla: 186474, https://gitee.com/openeuler/kernel/issues/I4Z2LA


CVE: CVE-2022-0494

--------------------------------

Add __GFP_ZERO flag for alloc_page in function bio_copy_kern to initialize
the buffer of a bio.

Signed-off-by: Haimin Zhang <tcs.kernel@gmail.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220216084038.15635-1-tcs.kernel@gmail.com


Signed-off-by: Jens Axboe <axboe@kernel.dk>

Conflict: commit ce288e053568 ("block: remove BLK_BOUNCE_ISA support")
is not backported.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

hugetlb: Add huge page alloced limit · c7c20ad0

王克锋 authored 3 years ago

hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4YTLN


CVE: NA

--------------------------------

The user wants to reserve a certain amount of memory for normal
(non-huge) pages; that is, hugetlb must not be allowed to use all the
memory.

Add a new kernel parameter "hugepage_prohibit_sz=" to set the size
reserved for normal (non-huge) pages. When allocating a huge page,
fail if the new allocation would exceed the limit.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Peng Liu <liupeng256@huawei.com>
Reviewed-by: Chen Wandun <chenwandun@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

Mar 23, 2022

swiotlb: rework "fix info leak with DMA_FROM_DEVICE" · 3f80e186

Halil Pasic authored 3 years ago

mainline inclusion
from mainline-v5.17-rc8
commit aa6f8dcbab473f3a3c7454b74caa46d36cdc5d13
category: bugfix
bugzilla: 186478, https://gitee.com/openeuler/kernel/issues/I4Z86P


CVE: CVE-2022-0854

--------------------------------

Unfortunately, we ended up merging an old version of the patch "fix info
leak with DMA_FROM_DEVICE" instead of merging the latest one. Christoph
(the swiotlb maintainer) asked me to create an incremental fix
(after I had pointed out the mix-up and asked him for guidance).
So here we go.

The main differences between what we got and what was agreed are:
* swiotlb_sync_single_for_device is also required to do an extra bounce
* We decided not to introduce DMA_ATTR_OVERWRITE until we have exploiters
* The implementation of DMA_ATTR_OVERWRITE is flawed: DMA_ATTR_OVERWRITE
  must take precedence over DMA_ATTR_SKIP_CPU_SYNC

Thus this patch removes DMA_ATTR_OVERWRITE, and makes
swiotlb_sync_single_for_device() bounce unconditionally (that is, also
when dir == DMA_TO_DEVICE) in order to avoid synchronising back stale
data from the swiotlb buffer.

Let me note that if the size used with dma_sync_* API is less than the
size used with dma_[un]map_*, under certain circumstances we may still
end up with swiotlb not being transparent. In that sense, this is no
perfect fix either.

To get this bulletproof, we would have to bounce the entire
mapping/bounce buffer. For that we would have to figure out the starting
address, and the size of the mapping in
swiotlb_sync_single_for_device(). While this does seem possible, there
seems to be no firm consensus on how things are supposed to work.

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Fixes: ddbd89deb7d3 ("swiotlb: fix info leak with DMA_FROM_DEVICE")
Cc: stable@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Conflicts:
	Documentation/core-api/dma-attributes.rst
	include/linux/dma-mapping.h
	kernel/dma/swiotlb.c
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

swiotlb: fix info leak with DMA_FROM_DEVICE · 04c20fc8

Halil Pasic authored 3 years ago

mainline inclusion
from mainline-v5.17-rc6
commit ddbd89deb7d32b1fbb879f48d68fda1a8ac58e8e
category: bugfix
bugzilla: 186478, https://gitee.com/openeuler/kernel/issues/I4Z86P


CVE: CVE-2022-0854

--------------------------------

The problem I'm addressing was discovered by the LTP test covering
cve-2018-1000204.

A short description of what happens follows:
1) The test case issues a command code 00 (TEST UNIT READY) via the SG_IO
   interface with: dxfer_len == 524288, dxdfer_dir == SG_DXFER_FROM_DEV
   and a corresponding dxferp. The peculiar thing about this is that TUR
   is not reading from the device.
2) In sg_start_req() the invocation of blk_rq_map_user() effectively
   bounces the user-space buffer. As if the device was to transfer into
   it. Since commit a45b599a ("scsi: sg: allocate with __GFP_ZERO in
   sg_build_indirect()") we make sure this first bounce buffer is
   allocated with GFP_ZERO.
3) For the rest of the story we keep ignoring that we have a TUR, so the
   device won't touch the buffer we prepare as if the we had a
   DMA_FROM_DEVICE type of situation. My setup uses a virtio-scsi device
   and the  buffer allocated by SG is mapped by the function
   virtqueue_add_split() which uses DMA_FROM_DEVICE for the "in" sgs (here
   scatter-gather and not scsi generics). This mapping involves bouncing
   via the swiotlb (we need swiotlb to do virtio in protected guest like
   s390 Secure Execution, or AMD SEV).
4) When the SCSI TUR is done, we first copy back the content of the second
   (that is swiotlb) bounce buffer (which most likely contains some
   previous IO data), to the first bounce buffer, which contains all
   zeros.  Then we copy back the content of the first bounce buffer to
   the user-space buffer.
5) The test case detects that the buffer, which it zero-initialized,
   ain't all zeros and fails.

One can argue that this is an swiotlb problem, because without swiotlb
we leak all zeros, and the swiotlb should be transparent in a sense that
it does not affect the outcome (if all other participants are well
behaved).

Copying the content of the original buffer into the swiotlb buffer is
the only way I can think of to make swiotlb transparent in such
scenarios. So let's do just that if in doubt, but allow the driver
to tell us that the whole mapped buffer is going to be overwritten,
in which case we can preserve the old behavior and avoid the performance
impact of the extra bounce.
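
The leak described in steps 1) to 5) can be modelled in user space. This is only a sketch: `bounce_map()`, `bounce_unmap()` and the `slot` array are hypothetical stand-ins for swiotlb slot handling, not kernel APIs, and `copy_on_map` models the "bounce the original buffer first" behaviour proposed here.

```c
#include <stddef.h>
#include <string.h>

#define SLOT_SIZE 16

/* Hypothetical stand-in for one swiotlb slot that still holds old I/O data. */
static unsigned char slot[SLOT_SIZE];

/* Leave stale bytes in the slot, as a previous transfer would. */
static void slot_fill_stale(void)
{
	memset(slot, 0xAA, sizeof(slot));
}

/*
 * Map a DMA_FROM_DEVICE buffer. With copy_on_map set, the original
 * buffer is bounced into the slot first.
 */
static void bounce_map(const unsigned char *orig, size_t len, int copy_on_map)
{
	if (copy_on_map)
		memcpy(slot, orig, len);
}

/* Unmap: copy the slot back, whether or not the device wrote anything. */
static void bounce_unmap(unsigned char *orig, size_t len)
{
	memcpy(orig, slot, len);
}

/* Run a transfer in which the device writes nothing (like TUR). */
static int leaked_after_idle_transfer(int copy_on_map)
{
	unsigned char user[SLOT_SIZE] = { 0 };
	size_t i;

	slot_fill_stale();
	bounce_map(user, sizeof(user), copy_on_map);
	bounce_unmap(user, sizeof(user));

	for (i = 0; i < sizeof(user); i++)
		if (user[i] != 0)
			return 1;	/* stale data reached the caller */
	return 0;
}
```

Without the copy on map, the zeroed caller buffer comes back filled with the stale slot contents; with it, the buffer stays all zeros.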

Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Conflicts:
	Documentation/core-api/dma-attributes.rst
	include/linux/dma-mapping.h
	kernel/dma/swiotlb.c
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>


Mar 22, 2022

esp: Fix possible buffer overflow in ESP transformation · 019f9120

Steffen Klassert authored 3 years ago

mainline inclusion
from mainline
commit ebe48d368e97d007bfeb76fcb065d6cfc4c96645
category: bugfix
bugzilla: 186409, https://gitee.com/openeuler/kernel/issues/I4Z0V2


CVE: CVE-2022-0886

--------------------------------

The maximum message size that can be sent is bigger than
the maximum size that skb_page_frag_refill can allocate.
So it is possible to write beyond the allocated buffer.

Fix this by doing a fallback to COW in that case.

v2:

Avoid get_order() costs as suggested by Linus Torvalds.
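
A minimal sketch of the size check with a COW fallback, assuming 4 KiB pages and order-3 page fragments. All `SKETCH_*` names and `esp_pick_path()` are hypothetical and only illustrate comparing the allocation size against a shifted limit instead of calling get_order().

```c
#include <stddef.h>

/* Assumed values for illustration: 4 KiB pages, order-3 page fragments. */
#define SKETCH_PAGE_SIZE   4096UL
#define SKETCH_FRAG_ORDER  3
/* A shift gives the limit directly, so no get_order() call is needed. */
#define SKETCH_FRAG_MAXSIZE (SKETCH_PAGE_SIZE << SKETCH_FRAG_ORDER)

enum esp_path { ESP_PATH_FRAG, ESP_PATH_COW };

/* Pick the copy-on-write path when a page fragment cannot hold the data. */
static enum esp_path esp_pick_path(size_t allocsize)
{
	if (allocsize > SKETCH_FRAG_MAXSIZE)
		return ESP_PATH_COW;
	return ESP_PATH_FRAG;
}
```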

Fixes: cac2661c ("esp4: Avoid skb_cow_data whenever possible")
Fixes: 03e2a30f ("esp6: Avoid skb_cow_data whenever possible")
Reported-by: valis <sec@valis.email>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Xu Jia <xujia39@huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>


sock: remove one redundant SKB_FRAG_PAGE_ORDER macro · b630f785

Yunsheng Lin authored 3 years ago

mainline inclusion
from mainline-v5.15-rc1
commit 723783d077e39c256a1fafebbd97cbb14207c28f
category: bugfix
bugzilla: 186409, https://gitee.com/openeuler/kernel/issues/I4Z0V2


CVE: CVE-2022-0886

--------------------------------

Both SKB_FRAG_PAGE_ORDER are defined to the same value in
net/core/sock.c and drivers/vhost/net.c.

Move the SKB_FRAG_PAGE_ORDER definition to net/sock.h,
as both net/core/sock.c and drivers/vhost/net.c include it,
and it seems a reasonable file to put the macro.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Xu Jia <xujia39@huawei.com>
conflict:
	drivers/vhost/net.c
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>


io_uring: fix UAF in get_files_struct() · 0213acd0

Luo Meng authored 3 years ago

hulk inclusion
category: bugfix
bugzilla: 186337, https://gitee.com/openeuler/kernel/issues/I4XA09


CVE: NA

--------------------------------

If two tasks are running concurrently as follows:
     task1                                        |       task2
io_uring_enter                                    |  io_wqe_worker
  io_submit_sqes                                  |
    io_submit_sqe                                 |
      io_queue_sqe                                |
        io_req_defer                              |
          io_req_defer_prep                       |
            io_prep_work_files                    |
              io_grab_files                       |
                req->work.files = current->files  |
          io_queue_async_work                     |
            __io_queue_async_work                 |
              io_wq_enqueue                       |
                io_wqe_insert_work                |
                                                  |  io_worker_handle_work
                                                  |    io_impersonate_work
                                                  |      current->files = work->files

One of the resulting concurrent UAFs can then be shown as below:
          free                                    |  use (task3: ls -l /proc/io_wqe_worker id/fd)
do_exit // tsk = current = work->files            |
  exit_files                                      |
    put_files_struct                              |
      tsk->files // tsk->files = work->files      |
                                                  |  iterate_dir
                                                  |    proc_readfd_common
                                                  |      p = get_proc_task(file_inode(file))
                                                  |       get_files_struct
                                                  |         files = task->files
                                                  |         atomic_inc(&files->count)

The root cause of the UAF is that no reference is taken on the files
struct when it is stored in req->work.files.
The mainline commit 0f212204 ("io_uring: don't rely on weak ->files
references") fixes this problem; this patch is based on that commit.
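
The idea of taking a reference before handing the pointer to deferred work can be sketched in user space as follows. `struct files`, `files_get()` and `files_put()` are hypothetical stand-ins, not the kernel's files_struct helpers.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct files_struct. */
struct files {
	atomic_int count;
};

static struct files *files_alloc(void)
{
	struct files *f = malloc(sizeof(*f));

	if (f)
		atomic_init(&f->count, 1);
	return f;
}

/* Take a reference before storing the pointer in deferred work. */
static struct files *files_get(struct files *f)
{
	atomic_fetch_add(&f->count, 1);
	return f;
}

/* Drop a reference; returns 1 if this call freed the object. */
static int files_put(struct files *f)
{
	if (atomic_fetch_sub(&f->count, 1) == 1) {
		free(f);
		return 1;
	}
	return 0;
}
```

With the extra reference held by the work item, the owner's put on exit no longer frees the object while the worker can still reach it.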

Signed-off-by: Luo Meng <luomeng12@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>


xfs: fix an undefined behaviour in _da3_path_shift · 76f51e9e

Qian Cai authored 3 years ago

mainline inclusion
from mainline-v5.6-rc4
commit 4982bff1
category: bugfix
bugzilla: 186464, https://gitee.com/openeuler/kernel/issues/I4YYIZ



--------------------------------

In xfs_da3_path_shift() "blk" can be assigned to state->path.blk[-1] if
state->path.active is 1 (which is a valid state) when it tries to add an
entry to a single dir leaf block and then to shift forward to see if
there's a sibling block that would be a better place to put the new
entry. This causes a UBSAN warning given negative array indices are
undefined behavior in C. In practice the warning is entirely harmless
given that "blk" is never dereferenced in this case, but it is still
better to fix up the warning and slightly improve the code.

 UBSAN: Undefined behaviour in fs/xfs/libxfs/xfs_da_btree.c:1989:14
 index -1 is out of range for type 'xfs_da_state_blk_t [5]'
 Call trace:
  dump_backtrace+0x0/0x2c8
  show_stack+0x20/0x2c
  dump_stack+0xe8/0x150
  __ubsan_handle_out_of_bounds+0xe4/0xfc
  xfs_da3_path_shift+0x860/0x86c [xfs]
  xfs_da3_node_lookup_int+0x7c8/0x934 [xfs]
  xfs_dir2_node_addname+0x2c8/0xcd0 [xfs]
  xfs_dir_createname+0x348/0x38c [xfs]
  xfs_create+0x6b0/0x8b4 [xfs]
  xfs_generic_create+0x12c/0x1f8 [xfs]
  xfs_vn_mknod+0x3c/0x4c [xfs]
  xfs_vn_create+0x34/0x44 [xfs]
  do_last+0xd4c/0x10c8
  path_openat+0xbc/0x2f4
  do_filp_open+0x74/0xf4
  do_sys_openat2+0x98/0x180
  __arm64_sys_openat+0xf8/0x170
  do_el0_svc+0x170/0x240
  el0_sync_handler+0x150/0x250
  el0_sync+0x164/0x180
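
One way to avoid forming blk[-1] is to take the element address only after the level has been range-checked. The sketch below uses hypothetical `sketch_*` types and a simplified "can this level move forward" test; it is not the xfs code itself.

```c
#include <stddef.h>

#define SKETCH_MAXDEPTH 5

struct sketch_blk { int index; int count; };
struct sketch_path {
	struct sketch_blk blk[SKETCH_MAXDEPTH];
	int active;
};

/*
 * Walk up from the level above the leaf and return the first level that
 * can still move forward, or -1 if there is none.  The element address
 * is only formed inside the loop, after level >= 0 has been checked, so
 * blk[-1] is never computed when active == 1.
 */
static int first_shiftable_level(const struct sketch_path *path)
{
	int level;

	for (level = (path->active - 1) - 1; level >= 0; level--) {
		const struct sketch_blk *blk = &path->blk[level];

		if (blk->index + 1 < blk->count)
			return level;
	}
	return -1;
}
```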

Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Qian Cai <cai@lca.pw>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>

Conflicts:
	fs/xfs/libxfs/xfs_da_btree.c
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>


xfs: Fix possible null-pointer dereferences in xchk_da_btree_block_check_sibling() · a4e35984

Jia-Ju Bai authored 3 years ago

mainline inclusion
from mainline-v5.3-rc2
commit afa1d96d
category: bugfix
bugzilla: 186464, https://gitee.com/openeuler/kernel/issues/I4YYIZ



--------------------------------

In xchk_da_btree_block_check_sibling(), there is an if statement on
line 274 to check whether ds->state->altpath.blk[level].bp is NULL:
    if (ds->state->altpath.blk[level].bp)

When ds->state->altpath.blk[level].bp is NULL, it is used on line 281:
    xfs_trans_brelse(..., ds->state->altpath.blk[level].bp);
        struct xfs_buf_log_item *bip = bp->b_log_item;
        ASSERT(bp->b_transp == tp);

Thus, possible null-pointer dereferences may occur.

To fix these bugs, ds->state->altpath.blk[level].bp is checked before
being used.
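
A minimal sketch of the check-before-use pattern, with hypothetical names: `sketch_brelse()` stands in for a release helper that dereferences its argument, as xfs_trans_brelse() does in the excerpt above.

```c
#include <stddef.h>

struct sketch_buf { int released; };

/* Stand-in for a release helper that dereferences its argument. */
static void sketch_brelse(struct sketch_buf *bp)
{
	bp->released = 1;
}

/* Release the sibling buffer only if one was actually read in. */
static int release_sibling(struct sketch_buf *bp)
{
	if (!bp)
		return 0;	/* nothing to release */
	sketch_brelse(bp);
	return 1;
}
```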

These bugs are found by a static analysis tool STCheck written by us.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>


xfs: fix use after free in buf log item unlock assert · afea1700

Brian Foster authored 3 years ago

mainline inclusion
from mainline-v5.1-rc5
commit 4d09807f
category: bugfix
bugzilla: 186464, https://gitee.com/openeuler/kernel/issues/I4YYIZ



--------------------------------

The xfs_buf_log_item ->iop_unlock() callback asserts that the buffer
is unlocked when either non-stale or aborted. This assert occurs
after the bli refcount has been dropped and the log item potentially
freed. The aborted check is thus a potential use after free. This
problem has been reproduced with KASAN enabled via generic/475.

Fix up xfs_buf_item_unlock() to query aborted state before the bli
reference is dropped to prevent a potential use after free.
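
The ordering can be sketched as below: sample the state while the reference is still held, then drop the reference and use only the local copy. `struct sketch_bli` and both helpers are hypothetical stand-ins for the buf log item code.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted buffer log item. */
struct sketch_bli {
	int refcount;
	bool aborted;
};

/* Drop one reference; the item is freed when the count reaches zero. */
static void sketch_bli_put(struct sketch_bli *bip)
{
	if (--bip->refcount == 0)
		free(bip);
}

/*
 * Unlock path: read the state we need while we still hold our
 * reference, then drop it and use only the local copy afterwards.
 */
static bool sketch_item_unlock(struct sketch_bli *bip)
{
	bool aborted = bip->aborted;	/* sampled before the put */

	sketch_bli_put(bip);		/* bip may be freed here */
	return aborted;			/* never touches bip again */
}
```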

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>


ACPI/IORT: Do not blindly trust DMA masks from firmware · 1b70444c

Moritz Fischer authored 3 years ago

mainline inclusion
from mainline-v5.11-rc6
commit a1df829ead5877d4a1061e976a50e2e665a16f24
category: bugfix
bugzilla: https://gitee.com/openeuler/qemu/issues/I4WE3Y


CVE: NA

--------------------------------

Address an issue observed on a real-world system with a suboptimal IORT
table, where the DMA masks of PCI devices would get set to 0 as a result.

iort_dma_setup() would query the root complex'/named component IORT
entry for a DMA mask, and use that over the one the device has been
configured with earlier.

Ideally we want to use the minimum mask of what the IORT contains for
the root complex and what the device was configured with.
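
The "minimum of both masks" rule from the last paragraph can be sketched as follows. The helper names are hypothetical, and the sketch covers only the combining step, not how the firmware mask is derived from the IORT entry.

```c
#include <stdint.h>

/* Mask with the low n bits set, for 1 <= n <= 64. */
static uint64_t sketch_dma_bit_mask(unsigned int n)
{
	return n >= 64 ? UINT64_MAX : (UINT64_C(1) << n) - 1;
}

/*
 * Combine the mask the device was configured with and the mask derived
 * from firmware by taking the smaller one, so firmware can narrow the
 * mask but never widen it.
 */
static uint64_t sketch_pick_dma_mask(uint64_t dev_mask, uint64_t fw_mask)
{
	return dev_mask < fw_mask ? dev_mask : fw_mask;
}
```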

Fixes: 5ac65e8c ("ACPI/IORT: Support address size limit for root complexes")
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Link: https://lore.kernel.org/r/20210122012419.95010-1-mdf@kernel.org


Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Conflicts:
	drivers/acpi/arm64/iort.c
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>


Mar 21, 2022

kabi: fix kabi broken in struct fuse_in · 10856783

Zhang Wensheng authored 3 years ago

mainline inclusion
from mainline-v5.17-rc8
commit 0c4bcfdecb1ac0967619ee7ff44871d93c08c909
category: bugfix
bugzilla: 186448, https://gitee.com/openeuler/kernel/issues/I4YORE


CVE: CVE-2022-1011

--------------------------------

A new user_pages member was added to struct fuse_in, which breaks kABI.
Fix the kABI change.

Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
Reviewed-by: Tao Hou <houtao1@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>

4.19.90-2203.4.0


fuse: fix pipe buffer lifetime for direct_io · ee8a1d43

Miklos Szeredi authored 3 years ago

mainline inclusion
from mainline-v5.17-rc8
commit 0c4bcfdecb1ac0967619ee7ff44871d93c08c909
category: bugfix
bugzilla: 186448, https://gitee.com/openeuler/kernel/issues/I4YORE


CVE: CVE-2022-1011

--------------------------------

In FOPEN_DIRECT_IO mode, fuse_file_write_iter() calls
fuse_direct_write_iter(), which normally calls fuse_direct_io(), which then
imports the write buffer with fuse_get_user_pages(), which uses
iov_iter_get_pages() to grab references to userspace pages instead of
actually copying memory.

On the filesystem device side, these pages can then either be read to
userspace (via fuse_dev_read()), or splice()d over into a pipe using
fuse_dev_splice_read() as pipe buffers with &nosteal_pipe_buf_ops.

This is wrong because after fuse_dev_do_read() unlocks the FUSE request,
the userspace filesystem can mark the request as completed, causing write()
to return. At that point, the userspace filesystem should no longer have
access to the pipe buffer.

Fix by copying pages coming from the user address space to new pipe
buffers.

Reported-by: Jann Horn <jannh@google.com>
Fixes: c3021629 ("fuse: support splice() reading from fuse device")
Cc: <stable@vger.kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>

Conflicts:
	fs/fuse/file.c
	fs/fuse/fuse_i.h
Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>
Reviewed-by: Tao Hou <houtao1@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>


blk-throtl: fix race in io dispatching · 06ff79d5

余快 authored 3 years ago

hulk inclusion
category: bugfix
bugzilla: 186449, https://gitee.com/openeuler/kernel/issues/I4YSPC


CVE: NA

--------------------------------

If io is throttled, such io will be issued by blk_throtl_dispatch_work_fn()
or blk_throtl_drain(), and the io is fetched by throtl_pop_queued().
throtl_pop_queued() should be protected by 'queue_lock', as it is in
blk_throtl_dispatch_work_fn(). However, it's not protected in
blk_throtl_drain(), which may lead to concurrent bio_list_pop(), and may
end up crashing the kernel.

Fix the problem by protecting throtl_pop_queued() through 'queue_lock'
in blk_throtl_drain().
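
A user-space sketch of the locking rule, using a pthread mutex in place of 'queue_lock'. The `sketch_*` names are hypothetical, and the real patch may hold the lock over a different span than shown here.

```c
#include <pthread.h>
#include <stddef.h>

struct sketch_bio { struct sketch_bio *next; };

struct sketch_queue {
	pthread_mutex_t lock;		/* plays the role of queue_lock */
	struct sketch_bio *head;	/* throttled bios */
};

/* Unlocked pop: callers must hold q->lock. */
static struct sketch_bio *sketch_pop_queued(struct sketch_queue *q)
{
	struct sketch_bio *bio = q->head;

	if (bio)
		q->head = bio->next;
	return bio;
}

/* Drain path: take the same lock the dispatch path takes around each pop. */
static int sketch_drain(struct sketch_queue *q)
{
	int n = 0;

	for (;;) {
		struct sketch_bio *bio;

		pthread_mutex_lock(&q->lock);
		bio = sketch_pop_queued(q);
		pthread_mutex_unlock(&q->lock);
		if (!bio)
			break;
		n++;	/* the bio would be issued here */
	}
	return n;
}
```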

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>


ext4: Fix symlink file size not match to file content · f91c9577

Ye Bin authored 3 years ago

hulk inclusion
category: bugfix
bugzilla: 186450, https://gitee.com/openeuler/kernel/issues/I4YSJ7


CVE: NA

-----------------------------------------------

We hit the following issue:
[root@yebin home]# fsck.ext4  -fn  ram0yb
e2fsck 1.45.6 (20-Mar-2020)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Symlink /p3/d14/d1a/l3d (inode #3494) is invalid.
Clear? no
Entry 'l3d' in /p3/d14/d1a (3383) has an incorrect filetype (was 7, should be 0).
Fix? no

The symlink's file size does not match its content. If writeback of the
symlink's data block fails, ext4_finish_bio() is called to end the I/O, but
this path does not mark the buffer with an error. When umount then runs a
checkpoint, it cannot detect the buffer error and cleans up the journal,
even though the correct data may still be in the journal area.
To solve this issue, mark the buffer with an error when a bio error is
detected in ext4_finish_bio().
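
The effect of flagging the buffer can be sketched as follows. `struct sketch_bh` and both helpers are hypothetical; they only model "record the bio error on the buffer so that a later checkpoint can see it".

```c
#include <stdbool.h>

/* Hypothetical stand-in for a buffer head with an error flag. */
struct sketch_bh {
	bool write_io_error;
	bool async_write;
};

/*
 * End-of-I/O handling for one buffer.  On a failed bio the buffer is
 * flagged, so a later checkpoint can see the failure and keep the
 * journal copy instead of cleaning it up.
 */
static void sketch_finish_buffer(struct sketch_bh *bh, int bio_status)
{
	if (bio_status != 0)
		bh->write_io_error = true;
	bh->async_write = false;
}

/* Checkpoint may drop the journal copy only if the write succeeded. */
static bool sketch_can_drop_journal_copy(const struct sketch_bh *bh)
{
	return !bh->write_io_error;
}
```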

Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Zhang Yi <yi.zhang@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>


livepatch/core: Check klp_func before 'klp_init_object_loaded' · b89cbe68

Zheng Yejian authored 3 years ago

hulk inclusion
category: feature
bugzilla: 186346, https://gitee.com/openeuler/kernel/issues/I4WBFN


CVE: NA

--------------------------------

Refer to the following procedure:
  klp_init_object
    klp_init_object_loaded
      klp_find_object_symbol <-- 1. oops happens when old_name is NULL!!!
    klp_init_func  <-- 2. old_name is currently first checked here

This problem was introduced in commit 453d3845 ("livepatch/arm64:
fix func size less than limit"), which swapped the order of 'klp_init_func'
and 'klp_init_object_loaded', so old_name is now used before it is checked.

Move these checks before 'klp_init_object_loaded' and add several
log messages that explain why a check failed.
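
A sketch of "validate first, look up second", with hypothetical names. `sketch_find_symbol()` stands in for a lookup that would crash on a NULL name, as strlen() would here.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct sketch_func {
	const char *old_name;
};

/* Stand-in for a symbol lookup that would crash on a NULL name. */
static int sketch_find_symbol(const char *name)
{
	return strlen(name) > 0 ? 0 : -ENOENT;
}

/* Validate every function entry first, then do the lookups. */
static int sketch_init_object(const struct sketch_func *funcs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (!funcs[i].old_name)
			return -EINVAL;	/* reject before any lookup */

	for (i = 0; i < n; i++) {
		int ret = sketch_find_symbol(funcs[i].old_name);

		if (ret)
			return ret;
	}
	return 0;
}
```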

Fixes: 453d3845 ("livepatch/arm64: fix func size less than limit")
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Reviewed-by: Cheng Jian <cj.chengjian@huawei.com>
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>


Mar 18, 2022

irqchip/gic-phytium-2500: Fix issue that interrupts are concentrated in one cpu · c61648a1

Mao HongBo authored 3 years ago

Phytium inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I41AUQ


CVE: NA

-------------------------------------------------

Fix the issue that interrupts are concentrated in one cpu
for Phytium S2500 server.

Signed-off-by: Mao HongBo <maohongbo@phytium.com.cn>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>


blk-mq: add exception handling when srcu->sda alloc failed · ab774358

Laibin Qiu authored 3 years ago

hulk inclusion
category: bugfix
bugzilla: 186352, https://gitee.com/openeuler/kernel/issues/I4YADX


DTS: DTS2022031707143
CVE: NA

--------------------------------

In case of BLK_MQ_F_BLOCKING, per-hctx srcu is used to protect the dispatch
critical area. But the current code does not notice when the srcu memory
allocation fails in blk_mq_alloc_hctx(), which leads to an illegal
address BUG. Add return value validation to avoid this problem.
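
The return value check can be sketched in user space as below. `sketch_init_srcu()` and `sketch_alloc_hctx()` are hypothetical stand-ins, and the `simulate_oom` flag exists only to exercise the failure path.

```c
#include <errno.h>
#include <stdlib.h>

struct sketch_srcu { void *sda; };
struct sketch_hctx { struct sketch_srcu srcu; int blocking; };

/* Stand-in for an init that allocates per-CPU data and may fail. */
static int sketch_init_srcu(struct sketch_srcu *s, int simulate_oom)
{
	s->sda = simulate_oom ? NULL : malloc(64);
	return s->sda ? 0 : -ENOMEM;
}

/* Allocate an hctx; propagate a failed srcu init instead of ignoring it. */
static struct sketch_hctx *sketch_alloc_hctx(int blocking, int simulate_oom)
{
	struct sketch_hctx *hctx = calloc(1, sizeof(*hctx));

	if (!hctx)
		return NULL;
	hctx->blocking = blocking;
	if (blocking && sketch_init_srcu(&hctx->srcu, simulate_oom)) {
		free(hctx);	/* undo the partial setup */
		return NULL;
	}
	return hctx;
}
```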

Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>
Reviewed-by: Hou Tao <houtao1@huawei.com>


Mar 17, 2022

audit: improve audit queue handling when "audit=1" on cmdline · 1518b80b

Paul Moore authored 3 years ago

mainline inclusion
from mainline-v5.17-rc3
commit f26d04331360d42dbd6b58448bd98e4edbfbe1c5
category: bugfix
bugzilla: 186384 https://gitee.com/openeuler/kernel/issues/I4X1AI?from=project-issue


CVE: NA

--------------------------------

When an admin enables audit at early boot via the "audit=1" kernel
command line the audit queue behavior is slightly different; the
audit subsystem goes to greater lengths to avoid dropping records,
which unfortunately can result in problems when the audit daemon is
forcibly stopped for an extended period of time.

This patch makes a number of changes designed to improve the audit
queuing behavior so that leaving the audit daemon in a stopped state
for an extended period does not cause a significant impact to the
system.

- kauditd_send_queue() is now limited to looping through the
  passed queue only once per call.  This not only prevents the
  function from looping indefinitely when records are returned
  to the current queue, it also allows any recovery handling in
  kauditd_thread() to take place when kauditd_send_queue()
  returns.

- Transient netlink send errors seen as -EAGAIN now cause the
  record to be returned to the retry queue instead of going to
  the hold queue.  The intention of the hold queue is to store,
  perhaps for an extended period of time, the events which led
  up to the audit daemon going offline.  The retry queue remains
  a temporary queue intended to protect against transient issues
  between the kernel and the audit daemon.

- The retry queue is now limited by the audit_backlog_limit
  setting, the same as the other queues.  This allows admins
  to bound the size of all of the audit queues on the system.

- kauditd_rehold_skb() now returns records to the end of the
  hold queue to ensure ordering is preserved in the face of
  recent changes to kauditd_send_queue().
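
The "only once per call" rule from the first item can be sketched with a small ring buffer. All `sketch_*` names are hypothetical; the point is that bounding the walk by the queue length at entry terminates even when every failed record is put back on the same queue.

```c
#include <stddef.h>

#define SKETCH_QCAP 8

/* Tiny ring buffer standing in for an skb queue. */
struct sketch_queue {
	int items[SKETCH_QCAP];
	size_t head, len;
};

static int sketch_dequeue(struct sketch_queue *q, int *out)
{
	if (q->len == 0)
		return 0;
	*out = q->items[q->head];
	q->head = (q->head + 1) % SKETCH_QCAP;
	q->len--;
	return 1;
}

static void sketch_enqueue_tail(struct sketch_queue *q, int v)
{
	q->items[(q->head + q->len) % SKETCH_QCAP] = v;
	q->len++;
}

/*
 * Visit each record that was queued at entry exactly once.  Records
 * whose send fails are put back at the tail, which would loop forever
 * if the walk ran "until empty" rather than for the initial length.
 * Returns the number of send attempts.
 */
static size_t sketch_send_queue(struct sketch_queue *q, int (*send)(int))
{
	size_t todo = q->len, attempts = 0;
	int rec;

	while (todo-- > 0 && sketch_dequeue(q, &rec)) {
		attempts++;
		if (send(rec) != 0)
			sketch_enqueue_tail(q, rec);	/* retry on a later call */
	}
	return attempts;
}

/* A sender that always fails, like a stopped audit daemon. */
static int sketch_send_fail(int rec)
{
	(void)rec;
	return -1;
}
```

Because each failed record goes back to the tail and every record is visited once, the original ordering is preserved after a full pass.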

Cc: stable@vger.kernel.org
Fixes: 5b52330b ("audit: fix auditd/kernel connection state tracking")
Fixes: f4b3ee3c85551 ("audit: improve robustness of the audit queue handling")
Reported-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Tested-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Cui GaoSheng <cuigaosheng1@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>


Revert "audit: bugfix for infinite loop when flush the hold queue" · a9140c08

Cui GaoSheng authored 3 years ago

hulk inclusion
category: bugfix
bugzilla: 186384 https://gitee.com/openeuler/kernel/issues/I4X1AI?from=project-issue


CVE: NA

--------------------------------

This reverts commit 67ab712f.

Signed-off-by: Cui GaoSheng <cuigaosheng1@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>


veth: Do not record rx queue hint in veth_xmit · 2632c58b

Daniel Borkmann authored 3 years ago


stable inclusion
from linux-4.19.226
commit bd6e97e2b6f59a19894c7032a83f03ad38ede28e

--------------------------------

commit 710ad98c363a66a0cd8526465426c5c5f8377ee0 upstream.

Laurent reported that they have seen a significant amount of TCP retransmissions
at high throughput from applications residing in network namespaces talking to
the outside world via veths. The drops were seen on the qdisc layer (fq_codel,
as per systemd default) of the phys device such as ena or virtio_net due to all
traffic hitting a _single_ TX queue _despite_ multi-queue device. (Note that the
setup was _not_ using XDP on veths as the issue is generic.)

More specifically, after edbea9220251 ("veth: Store queue_mapping independently
of XDP prog presence") which made it all the way back to v4.19.184+,
skb_record_rx_queue() would set skb->queue_mapping to 1 (given 1 RX and 1 TX
queue by default for veths) instead of leaving at 0.

This is eventually retained and callbacks like ena_select_queue() will also pick
single queue via netdev_core_pick_tx()'s ndo_select_queue() once all the traffic
is forwarded to that device via upper stack or other means. Similarly, for others
not implementing ndo_select_queue() if XPS is disabled, netdev_pick_tx() might
call into the skb_tx_hash() and check for prior skb_rx_queue_recorded() as well.

In general, it is a _bad_ idea for virtual devices like veth to mess around with
queue selection [by default]. Given dev->real_num_tx_queues is by default 1,
the skb->queue_mapping was left untouched, and so prior to edbea9220251 the
netdev_core_pick_tx() could do its job upon __dev_queue_xmit() on the phys device.

Unbreak this and restore prior behavior by removing the skb_record_rx_queue()
from veth_xmit() altogether.

If the veth peer has an XDP program attached, then it would return the first RX
queue index in xdp_md->rx_queue_index (unless configured in non-default manner).
However, this is still better than breaking the generic case.

Fixes: edbea9220251 ("veth: Store queue_mapping independently of XDP prog presence")
Fixes: 638264dc ("veth: Support per queue XDP ring")
Reported-by: Laurent Bernaille <laurent.bernaille@datadoghq.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Cc: Toshiaki Makita <toshiaki.makita1@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Conflicts:
	drivers/net/veth.c
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>


Mar 15, 2022

crypto: pcrypt - Fix use-after-free on module unload · 939559f5

Herbert Xu authored 3 years ago


stable inclusion
from linux-4.19.102
commit 47ef5cb878817127bd3d54c3578bbbd3f7c2bf2c
CVE: NA

-------------------------------

[ Upstream commit 07bfd9bd ]

On module unload of pcrypt we must unregister the crypto algorithms
first and then tear down the padata structure.  Otherwise the
crypto algorithms are still alive and can be used while the padata
structure is being freed.

Fixes: 5068c7a8 ("crypto: pcrypt - Add pcrypt crypto...")
Cc: <stable@vger.kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Lu Jialin <lujialin4@huawei.com>
Reviewed-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Wang Weiyang <wangweiyang2@huawei.com>
Signed-off-by: Laibin Qiu <qiulaibin@huawei.com>

4.19.90-2203.3.0
