- May 22, 2021
-
-
Shile Zhang authored
mainline inclusion from mainline-v5.6-rc1 commit 3c47b787 category: feature bugzilla: NA CVE: NA ------------------------------------------------- The scripts/sortextable.c code has originally copied some code from scripts/recordmount.c, which used the same setjmp/longjmp method to manage control flow. Meanwhile recordmcount has improved its error handling via: 3f1df120 ("recordmcount: Rewrite error/success handling"). So rewrite this part of sortextable as well to get rid of the setjmp/longjmp kludges, with additional refactoring, to make it more readable and easier to extend. No functional changes intended. [ mingo: Rewrote the changelog. ] Signed-off-by:
Shile Zhang <shile.zhang@linux.alibaba.com> Acked-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Michal Marek <michal.lkml@markovi.net> Cc: linux-kbuild@vger.kernel.org Link: https://lkml.kernel.org/r/20191204004633.88660-2-shile.zhang@linux.alibaba.com Signed-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Reviewed-by:
Jian Cheng <cj.chengjian@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Thomas Gleixner authored
mainline inclusion from mainline-v5.2-rc4 commit 4317cf95 category: feature bugzilla: NA CVE: NA ------------------------------------------------- Based on 1 normalized pattern(s): licensed under the gnu general public license version 2 gplv2 extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 5 file(s). Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Reviewed-by:
Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by:
Armijn Hemel <armijn@tjaldur.nl> Reviewed-by:
Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190531081036.993848054@linutronix.de Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Reviewed-by:
Jian Cheng <cj.chengjian@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Ye Bin authored
hulk inclusion category: bugfix bugzilla: 51854 CVE: NA ------------------------------------------------- We got follow bug_on when run fsstress with injecting IO fault: [130747.323114] kernel BUG at fs/ext4/extents_status.c:762! [130747.323117] Internal error: Oops - BUG: 0 [#1] SMP ...... [130747.334329] Call trace: [130747.334553] ext4_es_cache_extent+0x150/0x168 [ext4] [130747.334975] ext4_cache_extents+0x64/0xe8 [ext4] [130747.335368] ext4_find_extent+0x300/0x330 [ext4] [130747.335759] ext4_ext_map_blocks+0x74/0x1178 [ext4] [130747.336179] ext4_map_blocks+0x2f4/0x5f0 [ext4] [130747.336567] ext4_mpage_readpages+0x4a8/0x7a8 [ext4] [130747.336995] ext4_readpage+0x54/0x100 [ext4] [130747.337359] generic_file_buffered_read+0x410/0xae8 [130747.337767] generic_file_read_iter+0x114/0x190 [130747.338152] ext4_file_read_iter+0x5c/0x140 [ext4] [130747.338556] __vfs_read+0x11c/0x188 [130747.338851] vfs_read+0x94/0x150 [130747.339110] ksys_read+0x74/0xf0 If call ext4_ext_insert_extent failed but new extent already inserted, we just update "ex->ee_len = orig_ex.ee_len", this will lead to extent overlap, then cause bug on when cache extent. If call ext4_ext_insert_extent failed don't update ex->ee_len with old value. Maybe there will lead to block leak, but it can be fixed by fsck later. After we fixed above issue with v2 patch, but we got the same issue. ext4_split_extent_at: { ...... err = ext4_ext_insert_extent(handle, inode, ppath, &newex, flags); if (err == -ENOSPC && (EXT4_EXT_MAY_ZEROOUT & split_flag)) { ...... ext4_ext_try_to_merge(handle, inode, path, ex); ->step(1) err = ext4_ext_dirty(handle, inode, path + path->p_depth); ->step(2) if (err) goto fix_extent_len; ...... } ...... fix_extent_len: ex->ee_len = orig_ex.ee_len; ->step(3) ...... } If step(1) have been merged, but step(2) dirty extent failed, then go to fix_extent_len label to fix ex->ee_len with orig_ex.ee_len. But "ex" may not be old one, will cause overwritten. Then will trigger the same issue as previous. If step(2) failed, just return error, don't fix ex->ee_len with old value. This patch's modification is according to Jan Kara's suggestion in V3 patch: ("https://patchwork.ozlabs.org/project/linux-ext4/patch/20210428085158.3728201-1-yebin10@huawei.com/" ) "I see. Now I understand your patch. Honestly, seeing how fragile is trying to fix extent tree after split has failed in the middle, I would probably go even further and make sure we fix the tree properly in case of ENOSPC and EDQUOT (those are easily user triggerable). Anything else indicates a HW problem or fs corruption so I'd rather leave the extent tree as is and don't try to fix it (which also means we will not create overlapping extents)." Signed-off-by:
Ye Bin <yebin10@huawei.com> Reviewed-by:
Jan Kara <jack@suse.cz> Signed-off-by:
Ye Bin <yebin10@huawei.com> Reviewed-by:
Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Ye Bin authored
hulk inclusion category: bugfix bugzilla: 51854 CVE: NA ------------------------------------------------- This reverts commit 5446b76c34ed8875ba05a61fccfe838a98193791. Signed-off-by:
Ye Bin <yebin10@huawei.com> Reviewed-by:
Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Donald Buczek authored
mainline inclusion from mainline-v5.3-rc1 commit 5b596830 category: bugfix bugzilla: NA CVE: NA -------------------------------- RFC 7530 requires us to refetch the lease time attribute once a new clientID is established. This is already implemented for the nfs4.1(+) clients by nfs41_init_clientid, which calls nfs41_finish_session_reset, which calls nfs4_setup_state_renewal. To make nfs4_setup_state_renewal available for nfs4.0, move it further to the top of the source file to include it regardles of CONFIG_NFS_V4_1 and to save a forward declaration. Call nfs4_setup_state_renewal from nfs4_init_clientid. Signed-off-by:
Donald Buczek <buczek@molgen.mpg.de> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by:
Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Reviewed-by:
Zhang Yi <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Donald Buczek authored
mainline inclusion from mainline-v5.3-rc1 commit ea51efaa category: bugfix bugzilla: NA CVE: NA -------------------------------- The function nfs41_setup_state_renewal is useful to the nfs 4.0 client as well, so rename the function to nfs4_setup_state_renewal. Signed-off-by:
Donald Buczek <buczek@molgen.mpg.de> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by:
Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Reviewed-by:
Zhang Yi <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Donald Buczek authored
mainline inclusion from mainline-v5.3-rc1 commit 0efb01b2 category: bugfix bugzilla: NA CVE: NA -------------------------------- Compile nfs4_proc_get_lease_time, enc_get_lease_time and dec_get_lease_time for nfs4.0. Use nfs4_sequence_done instead of nfs41_sequence_done in nfs4_proc_get_lease_time, Signed-off-by:
Donald Buczek <buczek@molgen.mpg.de> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> Conflicts: fs/nfs/nfs4_fs.h Signed-off-by:
Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Reviewed-by:
Zhang Yi <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Donald Buczek authored
mainline inclusion from mainline-v5.3-rc1 commit 2eaf426d category: bugfix bugzilla: NA CVE: NA -------------------------------- The debug message of decode_attr_lease_time incorrectly says "file size". Fix it to "lease time". Signed-off-by:
Donald Buczek <buczek@molgen.mpg.de> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by:
Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Reviewed-by:
Zhang Yi <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Zhang Yi authored
hulk inclusion category: bugfix bugzilla: 51864 CVE: NA --------------------------- In ext4_orphan_cleanup(), if ext4_truncate() failed to get a transaction handle, it didn't remove the inode from the in-core orphan list, which may probably trigger below error dump in ext4_destroy_inode() during the final iput() and could lead to memory corruption on the later orphan list changes. EXT4-fs (sda): Inode 6291467 (00000000b8247c67): orphan list check failed! 00000000b8247c67: 0001f30a 00000004 00000000 00000023 ............#... 00000000e24cde71: 00000006 014082a3 00000000 00000000 ......@......... 0000000072c6a5ee: 00000000 00000000 00000000 00000000 ................ ... This patch fix this by cleanup in-core orphan list manually if ext4_truncate() return error. Signed-off-by:
Zhang Yi <yi.zhang@huawei.com> Signed-off-by:
yangerkun <yangerkun@huawei.com> Reviewed-by:
Zhang Yi <yi.zhang@huawei.com> Reviewed-by:
Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Lin Ma authored
mainline inclusion from mainline-v5.13-rc1 commit e2cb6b891ad2b8caa9131e3be70f45243df82a80 category: bugfix bugzilla: NA CVE: CVE-2021-32399 -------------------------------- There is a possible race condition vulnerability between issuing a HCI command and removing the cont. Specifically, functions hci_req_sync() and hci_dev_do_close() can race each other like below: thread-A in hci_req_sync() | thread-B in hci_dev_do_close() | hci_req_sync_lock(hdev); test_bit(HCI_UP, &hdev->flags); | ... | test_and_clear_bit(HCI_UP, &hdev->flags) hci_req_sync_lock(hdev); | | In this commit we alter the sequence in function hci_req_sync(). Hence, the thread-A cannot issue th. Signed-off-by:
Lin Ma <linma@zju.edu.cn> Cc: Marcel Holtmann <marcel@holtmann.org> Fixes: 7c6a329e ("[Bluetooth] Fix regression from using default link policy") Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Reviewed-by:
Xiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
- May 21, 2021
-
-
Jingxian He authored
hulk inclusion category: feature bugzilla: 48159 CVE: N/A ------------------------------ Enhance variables check and sync for pin mem as followings: 1) Remove unused variable in set_fork_pid; 2) Remove unused calling of access_ok, which is called in copy_from_user; 3) Enhance page_map_entry_start check in pin_mem_area; 4) Keep get_page_map_info and create_page_map_info for internal use, and increase get_page_map_info_by_pid and create_page_map_info_by_pid for external use, which is protected by spinlock; 5) Use spin_lock_irqsave instead of spin_lock. Signed-off-by:
Jingxian He <hejingxian@huawei.com> Reviewed-by:
Jing Xiangfeng <jingxiangfeng@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Cheng Jian <cj.chengjian@huawei.com>
-
Nick Gasson authored
mainline inclusion from mainline-v5.7 commit 1e4bd2ae category: bugfix bugzilla: NA CVE: NA ------------------------------------------------- Fix an issue where addresses in the DWARF line table are offset by -0x40 (GEN_ELF_TEXT_OFFSET). This can be seen with `objdump -S` on the ELF files after perf inject. Committer notes: Ian added this in his Acked-by reply: --- Without too much knowledge this looks good to me. The original code came from oprofile's jit support: https://sourceforge.net/p/oprofile/oprofile/ci/master/tree/opjitconv/debug_line.c#l325 --- Signed-off-by:
Nick Gasson <nick.gasson@arm.com> Acked-by:
Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200528051916.6722-1-nick.gasson@arm.com Signed-off-by:
Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by:
Zhichang Yuan <erik.yuan@arm.com> Reviewed-by:
Yang Jihong <yangjihong1@huawei.com> Signed-off-by:
Cheng Jian <cj.chengjian@huawei.com>
-
Nick Gasson authored
mainline inclusion from mainline-v5.7 commit 7d7e503c category: bugfix bugzilla: NA CVE: NA ------------------------------------------------- For each PC/BCI pair in the JVMTI compiler inlining record table, the jitdump plugin emits debug line table entries for every source line in the method preceding that BCI. Instead only emit one source line per PC/BCI pair. Reported by Ian Rogers. This reduces the .dump size for SPECjbb from ~230MB to ~40MB. Signed-off-by:
Nick Gasson <nick.gasson@arm.com> Acked-by:
Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200528054049.13662-1-nick.gasson@arm.com Signed-off-by:
Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by:
Zhichang Yuan <erik.yuan@arm.com> Reviewed-by:
Yang Jihong <yangjihong1@huawei.com> Signed-off-by:
Cheng Jian <cj.chengjian@huawei.com>
-
Nick Gasson authored
mainline inclusion from mainline-v5.7 commit 0bdf3181 category: bugfix bugzilla: NA CVE: NA ------------------------------------------------- For a Java method signature like: Ljava/lang/AbstractStringBuilder;appendChars(Ljava/lang/String;II)V The demangler produces: void class java.lang.AbstractStringBuilder.appendChars(class java.lang., shorttring., int, int) The arguments should be (java.lang.String, int, int) but the demangler interprets the "S" in String as the type code for "short". Correct this and two other minor things: - There is no "bool" type in Java, should be "boolean". - The demangler prepends "class" to every Java class name. This is not standard Java syntax and it wastes a lot of horizontal space if the signature is long. Remove this as there isn't any ambiguity between class names and primitives. Committer notes: This was split from a larger patch that also added a java demangler 'perf test' entry, that, before this patch shows the error being fixed by it: $ perf test java 65: Demangle Java : FAILED! $ perf test -v java Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc 65: Demangle Java : --- start --- test child forked, pid 307264 FAILED: Ljava/lang/StringLatin1;equals([B[B)Z: bool class java.lang.StringLatin1.equals(byte[], byte[]) != boolean java.lang.StringLatin1.equals(byte[], byte[]) FAILED: Ljava/util/zip/ZipUtils;CENSIZ([BI)J: long class java.util.zip.ZipUtils.CENSIZ(byte[], int) != long java.util.zip.ZipUtils.CENSIZ(byte[], int) FAILED: Ljava/util/regex/Pattern$BmpCharProperty;match(Ljava/util/regex/Matcher;ILjava/lang/CharSequence;)Z: bool class java.util.regex.Pattern$BmpCharProperty.match(class java.util.regex.Matcher., int, class java.lang., charhar, shortequence) != boolean java.util.regex.Pattern$BmpCharProperty.match(java.util.regex.Matcher, int, java.lang.CharSequence) FAILED: Ljava/lang/AbstractStringBuilder;appendChars(Ljava/lang/String;II)V: void class java.lang.AbstractStringBuilder.appendChars(class java.lang., shorttring., int, int) != void java.lang.AbstractStringBuilder.appendChars(java.lang.String, int, int) FAILED: Ljava/lang/Object;<init>()V: void class java.lang.Object<init>() != void java.lang.Object<init>() test child finished with -1 ---- end ---- Demangle Java: FAILED! $ After applying this patch: $ perf test java 65: Demangle Java : Ok $ Signed-off-by:
Nick Gasson <nick.gasson@arm.com> Reviewed-by:
Ian Rogers <irogers@google.com> Tested-by:
Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by:
Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200427061520.24905-4-nick.gasson@arm.com Signed-off-by:
Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by:
Zhichang Yuan <erik.yuan@arm.com> Reviewed-by:
Yang Jihong <yangjihong1@huawei.com> Signed-off-by:
Cheng Jian <cj.chengjian@huawei.com>
-
Nick Gasson authored
mainline inclusion from mainline-v5.7 commit 525c821d category: bugfix bugzilla: NA CVE: NA ------------------------------------------------- Split from a larger patch that was also fixing a problem with the java demangler, so, before applying that patch we see: $ perf test java 65: Demangle Java : FAILED! $ perf test -v java 65: Demangle Java : --- start --- test child forked, pid 307264 FAILED: Ljava/lang/StringLatin1;equals([B[B)Z: bool class java.lang.StringLatin1.equals(byte[], byte[]) != boolean java.lang.StringLatin1.equals(byte[], byte[]) FAILED: Ljava/util/zip/ZipUtils;CENSIZ([BI)J: long class java.util.zip.ZipUtils.CENSIZ(byte[], int) != long java.util.zip.ZipUtils.CENSIZ(byte[], int) FAILED: Ljava/util/regex/Pattern$BmpCharProperty;match(Ljava/util/regex/Matcher;ILjava/lang/CharSequence;)Z: bool class java.util.regex.Pattern$BmpCharProperty.match(class java.util.regex.Matcher., int, class java.lang., charhar, shortequence) != boolean java.util.regex.Pattern$BmpCharProperty.match(java.util.regex.Matcher, int, java.lang.CharSequence) FAILED: Ljava/lang/AbstractStringBuilder;appendChars(Ljava/lang/String;II)V: void class java.lang.AbstractStringBuilder.appendChars(class java.lang., shorttring., int, int) != void java.lang.AbstractStringBuilder.appendChars(java.lang.String, int, int) FAILED: Ljava/lang/Object;<init>()V: void class java.lang.Object<init>() != void java.lang.Object<init>() test child finished with -1 ---- end ---- Demangle Java: FAILED! $ Next patch should fix this. Signed-off-by:
Nick Gasson <nick.gasson@arm.com> Reviewed-by:
Ian Rogers <irogers@google.com> Tested-by:
Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by:
Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200427061520.24905-4-nick.gasson@arm.com Signed-off-by:
Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by:
Zhichang Yuan <erik.yuan@arm.com> Reviewed-by:
Yang Jihong <yangjihong1@huawei.com> Signed-off-by:
Cheng Jian <cj.chengjian@huawei.com>
-
Nick Gasson authored
mainline inclusion from mainline-v5.7 commit 959f8ed4 category: bugfix bugzilla: NA CVE: NA ------------------------------------------------- If the Java sources are compiled with -g:none to disable debug information the perf JVMTI plugin reports a lot of errors like: java: GetLineNumberTable failed with JVMTI_ERROR_ABSENT_INFORMATION java: GetLineNumberTable failed with JVMTI_ERROR_ABSENT_INFORMATION java: GetLineNumberTable failed with JVMTI_ERROR_ABSENT_INFORMATION java: GetLineNumberTable failed with JVMTI_ERROR_ABSENT_INFORMATION java: GetLineNumberTable failed with JVMTI_ERROR_ABSENT_INFORMATION Instead if GetLineNumberTable returns JVMTI_ERROR_ABSENT_INFORMATION simply skip emitting line number information for that method. Unlike the previous patch these errors don't affect the jitdump generation, they just generate a lot of noise. Similarly for native methods which also don't have line tables. Signed-off-by:
Nick Gasson <nick.gasson@arm.com> Reviewed-by:
Ian Rogers <irogers@google.com> Tested-by:
Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200427061520.24905-3-nick.gasson@arm.com [ Moved || operator to the end of the line, not at the start of 2nd if condition ] Signed-off-by:
Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by:
Zhichang Yuan <erik.yuan@arm.com> Reviewed-by:
Yang Jihong <yangjihong1@huawei.com> Signed-off-by:
Cheng Jian <cj.chengjian@huawei.com>
-
Nick Gasson authored
mainline inclusion from mainline-v5.7 commit 953e9240 category: bugfix bugzilla: NA CVE: NA --------------------------- If a Java class is compiled with -g:none to omit debug information, the JVMTI plugin won't write jitdump entries for any method in this class and prints a lot of errors like: java: GetSourceFileName failed with JVMTI_ERROR_ABSENT_INFORMATION The call to GetSourceFileName is used to derive the file name `fn`, but this value is not actually used since commit ca58d7e6 ("perf jvmti: Generate correct debug information for inlined code") which moved the file name lookup into fill_source_filenames(). So the call to GetSourceFileName and related code can be safely removed. Signed-off-by:
Nick Gasson <nick.gasson@arm.com> Reviewed-by:
Ian Rogers <irogers@google.com> Tested-by:
Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20200427061520.24905-2-nick.gasson@arm.com Signed-off-by:
Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by:
Zhichang Yuan <erik.yuan@arm.com> Reviewed-by:
Yang Jihong <yangjihong1@huawei.com> Signed-off-by:
Cheng Jian <cj.chengjian@huawei.com>
-
- May 18, 2021
-
-
Daniel Borkmann authored
mainline inclusion from mainline-v5.13-rc1 commit 801c6058d14a82179a7ee17a4b532cac6fad067f category: bugfix bugzilla: NA CVE: CVE-2021-31829 -------------------------------- The current implemented mechanisms to mitigate data disclosure under speculation mainly address stack and map value oob access from the speculative domain. However, Piotr discovered that uninitialized BPF stack is not protected yet, and thus old data from the kernel stack, potentially including addresses of kernel structures, could still be extracted from that 512 bytes large window. The BPF stack is special compared to map values since it's not zero initialized for every program invocation, whereas map values /are/ zero initialized upon their initial allocation and thus cannot leak any prior data in either domain. In the non-speculative domain, the verifier ensures that every stack slot read must have a prior stack slot write by the BPF program to avoid such data leaking issue. However, this is not enough: for example, when the pointer arithmetic operation moves the stack pointer from the last valid stack offset to the first valid offset, the sanitation logic allows for any intermediate offsets during speculative execution, which could then be used to extract any restricted stack content via side-channel. Given for unprivileged stack pointer arithmetic the use of unknown but bounded scalars is generally forbidden, we can simply turn the register-based arithmetic operation into an immediate-based arithmetic operation without the need for masking. This also gives the benefit of reducing the needed instructions for the operation. Given after the work in 7fedb63a8307 ("bpf: Tighten speculative pointer arithmetic mask"), the aux->alu_limit already holds the final immediate value for the offset register with the known scalar. Thus, a simple mov of the immediate to AX register with using AX as the source for the original instruction is sufficient and possible now in this case. Reported-by:
Piotr Krysiuk <piotras@gmail.com> Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Tested-by:
Piotr Krysiuk <piotras@gmail.com> Reviewed-by:
Piotr Krysiuk <piotras@gmail.com> Reviewed-by:
John Fastabend <john.fastabend@gmail.com> Acked-by:
Alexei Starovoitov <ast@kernel.org> Conflicts: kernel/bpf/verifier.c Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Reviewed-by:
Xiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by:
Kuohai Xu <xukuohai@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Daniel Borkmann authored
stable inclusion from linux-4.19.190 commit 0e2dfdc74a7f4036127356d42ea59388f153f42c -------------------------------- commit b9b34ddbe2076ade359cd5ce7537d5ed019e9807 upstream. The negation logic for the case where the off_reg is sitting in the dst register is not correct given then we cannot just invert the add to a sub or vice versa. As a fix, perform the final bitwise and-op unconditionally into AX from the off_reg, then move the pointer from the src to dst and finally use AX as the source for the original pointer arithmetic operation such that the inversion yields a correct result. The single non-AX mov in between is possible given constant blinding is retaining it as it's not an immediate based operation. Fixes: 979d63d5 ("bpf: prevent out of bounds speculation on pointer arithmetic") Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Tested-by:
Piotr Krysiuk <piotras@gmail.com> Reviewed-by:
Piotr Krysiuk <piotras@gmail.com> Reviewed-by:
John Fastabend <john.fastabend@gmail.com> Acked-by:
Alexei Starovoitov <ast@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
- May 17, 2021
-
-
Coly Li authored
mainline inclusion from mainline-v5.6-rc1 commit 038ba8cc category: bugfix bugzilla: https://bugzilla.openeuler.org/show_bug.cgi?id=26 CVE: NA ----------------------------------------------- In year 2007 high performance SSD was still expensive, in order to save more space for real workload or meta data, the readahead I/Os for non-meta data was bypassed and not cached on SSD. In now days, SSD price drops a lot and people can find larger size SSD with more comfortable price. It is unncessary to alway bypass normal readahead I/Os to save SSD space for now. This patch adds options for readahead data cache policies via sysfs file /sys/block/bcache<N>/readahead_cache_policy, the options are, - "all": cache all readahead data I/Os. - "meta-only": only cache meta data, and bypass other regular I/Os. If users want to make bcache continue to only cache readahead request for metadata and bypass regular data readahead, please set "meta-only" to this sysfs file. By default, bcache will back to cache all read- ahead requests now. Cc: stable@vger.kernel.org Signed-off-by:
Coly Li <colyli@suse.de> Acked-by:
Eric Wheeler <bcache@linux.ewheeler.net> Cc: Michael Lyle <mlyle@lyle.org> Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Li Ruilin <liruilin4@huawei.com> Reviewed-by:
Peng Junyi <pengjunyi1@huawei.com> Acked-by:
Xie XiuQi <xiexiuqi@huawei.com> Signed-off-by:
Cheng Jian <cj.chengjian@huawei.com>
-
Guo Hui authored
uniontech inclusion category: bugfix bugzilla: https://gitee.com/openeuler/kernel/issues/I3RFV8 CVE: NA ---------------------------------------------------------------- Commit eb761d65 ("mm: parallelize deferred struct page initialization within each node") the code "++zone" in follow code: /* Sanity check that the next zone really is unpopulated */ WARN_ON(++zid < MAX_NR_ZONES && populated_zone(++zone)); VM_BUG_ON(nr_init != nr_free); zone->managed_pages += nr_free; makes the managed_pages statistics of the current zone incorrect and the zone may have out-of-bounds memory when CONFIG_DEFERRED_STRUCT_PAGE_INIT=y, causing the Virtual machine system startup to fail when the Virtual machine system current allocated memory is set to half of the Virtual machine maximum memory using virt-manager tool Fix it by putting the code “zone->managed_pages += nr_free;” before “++zone” code Fixes: eb761d65 ("mm: parallelize deferred struct page initialization within each node") Reported-by:
Peng Yuanbo <pengyuanbo@uniontech.com> Signed-off-by:
Guo Hui <guohui@uniontech.com> Reviewed-by:
Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by:
Cheng Jian <cj.chengjian@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
He Zhe authored
mainline inclusion from mainline-v5.9-rc1 commit 59679d99 category: bugfix bugzilla: NA CVE: NA -------------------------------- commit 0688e64b ("NFS: Allow signal interruption of NFS4ERR_DELAYed operations") introduces nfs4_delay_interruptible which also needs an _unsafe version to avoid the following call trace for the same reason explained in commit 416ad3c9 ("freezer: add unsafe versions of freezable helpers for NFS") CPU: 4 PID: 3968 Comm: rm Tainted: G W 5.8.0-rc4 #1 Hardware name: Marvell OcteonTX CN96XX board (DT) Call trace: dump_backtrace+0x0/0x1dc show_stack+0x20/0x30 dump_stack+0xdc/0x150 debug_check_no_locks_held+0x98/0xa0 nfs4_delay_interruptible+0xd8/0x120 nfs4_handle_exception+0x130/0x170 nfs4_proc_rmdir+0x8c/0x220 nfs_rmdir+0xa4/0x360 vfs_rmdir.part.0+0x6c/0x1b0 do_rmdir+0x18c/0x210 __arm64_sys_unlinkat+0x64/0x7c el0_svc_common.constprop.0+0x7c/0x110 do_el0_svc+0x24/0xa0 el0_sync_handler+0x13c/0x1b8 el0_sync+0x158/0x180 Signed-off-by:
He Zhe <zhe.he@windriver.com> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by:
Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Reviewed-by:
Hou Tao <houtao1@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Trond Myklebust authored
mainline inclusion from mainline-v5.2-rc1 commit 0688e64b category: bugfix bugzilla: NA CVE: NA -------------------------------- If the server is unable to immediately execute an RPC call, and returns an NFS4ERR_DELAY then we can assume it is safe to interrupt the operation in order to handle ordinary signals. This allows the application to service timer interrupts that would otherwise have to wait until the server is again able to respond. Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by:
Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Reviewed-by:
Hou Tao <houtao1@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Trond Myklebust authored
mainline inclusion from mainline-5.2-rc1 commit e4ec48d3 category: bugfix bugzilla: 51818 CVE: NA ------------------------------------------------- If a soft NFSv4 request is sent, then we don't need it to time out unless the connection breaks. The reason is that as long as the connection is unbroken, the protocol states that the server is not allowed to drop the request. IOW: as long as the connection remains unbroken, the client may assume that all transmitted RPC requests are being processed by the server, and that retransmissions and timeouts of those requests are unwarranted. Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by:
Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by:
Zhang Changzhong <zhangchangzhong@huawei.com> Reviewed-by:
Hou Tao <houtao1@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Trond Myklebust authored
mainline inclusion from mainline-5.1-rc3 commit d84dd3fb category: bugfix bugzilla: 51818 CVE: NA ------------------------------------------------- If the transport is still connected, then we do want to allow RPC_SOFTCONN tasks to retry. They should time out if and only if the connection is broken. Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> Conflicts: net/sunrpc/clnt.c Signed-off-by:
Zhang Changzhong <zhangchangzhong@huawei.com> Reviewed-by:
Hou Tao <houtao1@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
- May 14, 2021
-
-
Zhang Yi authored
mainline inclusion from mainline-v5.13-rc1 commit a149d2a5cabbf6507a7832a1c4fd2593c55fd450 category: bugfix bugzilla: 50787 CVE: NA --------------------------- Commit <50122847> ("ext4: fix check to prevent initializing reserved inodes") check the block group zero and prevent initializing reserved inodes. But in some special cases, the reserved inode may not all belong to the group zero, it may exist into the second group if we format filesystem below. mkfs.ext4 -b 4096 -g 8192 -N 1024 -I 4096 /dev/sda So, it will end up triggering a false positive report of a corrupted file system. This patch fix it by avoid check reserved inodes if no free inode blocks will be zeroed. Cc: stable@kernel.org Fixes: 50122847 ("ext4: fix check to prevent initializing reserved inodes") Signed-off-by:
Zhang Yi <yi.zhang@huawei.com> Suggested-by:
Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20210331121516.2243099-1-yi.zhang@huawei.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Reviewed-by:
Zhang Xiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Zhao Xuehui authored
hulk inclusion category: bugfix bugzilla: 51843 CVE: NA --------------------------- In function klp_init_patch, a text_mutex lock is used when doing jump_label_apply_nops. However, the jump_label_register in which a text_mutex lock is used is done before the original text_mutex lock released. Thus, an AA deadlock is occured. In this commit, we do jump_label_register after the original text_mutex lock is released to avoid this AA deadlock. Signed-off-by:
Zhao Xuehui <zhaoxuehui1@huawei.com> Reviewed-by:
Kuohai Xu <xukuohai@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Chris Kennelly authored
mainline inclusion from mainline-5.10-rc1 commit 206e22f0 category: bugfix bugzilla: 51854 CVE: NA ------------------------------------------------- This produces a PIE binary with a variety of p_align requirements, suitable for verifying that the load address meets that alignment requirement. Signed-off-by:
Chris Kennelly <ckennelly@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: David Rientjes <rientjes@google.com> Cc: Fangrui Song <maskray@google.com> Cc: Hugh Dickens <hughd@google.com> Cc: Ian Rogers <irogers@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Sandeep Patil <sspatil@google.com> Cc: Song Liu <songliubraving@fb.com> Cc: Suren Baghdasaryan <surenb@google.com> Link: https://lkml.kernel.org/r/20200820170541.1132271-3-ckennelly@google.com Link: https://lkml.kernel.org/r/20200821233848.3904680-3-ckennelly@google.com Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Conflicts: tools/testing/selftests/exec/.gitignore tools/testing/selftests/exec/Makefile Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Reviewed-by:
Zhang Yi <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Chris Kennelly authored
mainline inclusion from mainline-5.10-rc1 commit ce81bb25 category: bugfix bugzilla: 51854 CVE: NA ------------------------------------------------- Patch series "Selecting Load Addresses According to p_align", v3. The current ELF loading mechancism provides page-aligned mappings. This can lead to the program being loaded in a way unsuitable for file-backed, transparent huge pages when handling PIE executables. While specifying -z,max-page-size=0x200000 to the linker will generate suitably aligned segments for huge pages on x86_64, the executable needs to be loaded at a suitably aligned address as well. This alignment requires the binary's cooperation, as distinct segments need to be appropriately paddded to be eligible for THP. For binaries built with increased alignment, this limits the number of bits usable for ASLR, but provides some randomization over using fixed load addresses/non-PIE binaries. This patch (of 2): The current ELF loading mechancism provides page-aligned mappings. This can lead to the program being loaded in a way unsuitable for file-backed, transparent huge pages when handling PIE executables. For binaries built with increased alignment, this limits the number of bits usable for ASLR, but provides some randomization over using fixed load addresses/non-PIE binaries. Tested by verifying program with -Wl,-z,max-page-size=0x200000 loading. [akpm@linux-foundation.org: fix max() warning] [ckennelly@google.com: augment comment] Link: https://lkml.kernel.org/r/20200821233848.3904680-2-ckennelly@google.com Signed-off-by:
Chris Kennelly <ckennelly@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: David Rientjes <rientjes@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Hugh Dickens <hughd@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Sandeep Patil <sspatil@google.com> Cc: Fangrui Song <maskray@google.com> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Shuah Khan <shuah@kernel.org> Link: https://lkml.kernel.org/r/20200820170541.1132271-1-ckennelly@google.com Link: https://lkml.kernel.org/r/20200820170541.1132271-2-ckennelly@google.com Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Conflicts: fs/binfmt_elf.c Reviewed-by:
Zhang Yi <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
- May 13, 2021
-
-
zhangyi (F) authored
mainline inclusion from mainline-5.10-rc1 commit 8394a6ab category: bugfix bugzilla: 51832 CVE: NA --------------------------- Now we only use sb_bread_unmovable() to read superblock and descriptor block at mount time, so there is no opportunity that we need to clear buffer verified bit and also handle buffer write_io error bit. But for the sake of unification, let's introduce ext4_sb_bread_unmovable() to replace all sb_bread_unmovable(). After this patch, we stop using read helpers in fs/buffer.c. Signed-off-by:
zhangyi (F) <yi.zhang@huawei.com> Link: https://lore.kernel.org/r/20200924073337.861472-8-yi.zhang@huawei.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
yangerkun <yangerkun@huawei.com> Reviewed-by:
zhangyi (F) <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
zhangyi (F) authored
mainline inclusion from mainline-5.10-rc1 commit 0a846f49 category: bugfix bugzilla: 51832 CVE: NA --------------------------- We have already remove open codes that invoke helpers provide by fs/buffer.c in all places reading metadata buffers. This patch switch to use ext4_sb_bread() to replace all sb_bread() helpers, which is ext4_read_bh() helper back end. Signed-off-by:
zhangyi (F) <yi.zhang@huawei.com> Link: https://lore.kernel.org/r/20200924073337.861472-7-yi.zhang@huawei.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
yangerkun <yangerkun@huawei.com> Reviewed-by:
zhangyi (F) <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
zhangyi (F) authored
mainline inclusion from mainline-5.10-rc1 commit 5df1d412 category: bugfix bugzilla: 51832 CVE: NA --------------------------- If we readahead inode tables in __ext4_get_inode_loc(), it may bypass buffer_write_io_error() check, so introduce ext4_sb_breadahead_unmovable() to handle this special case. This patch also replace sb_breadahead_unmovable() in ext4_fill_super() for the sake of unification. Signed-off-by:
zhangyi (F) <yi.zhang@huawei.com> Link: https://lore.kernel.org/r/20200924073337.861472-6-yi.zhang@huawei.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
yangerkun <yangerkun@huawei.com> Reviewed-by:
zhangyi (F) <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
zhangyi (F) authored
mainline inclusion from mainline-5.10-rc1 commit 60c776e5 category: bugfix bugzilla: 51832 CVE: NA --------------------------- We have already introduced ext4_buffer_uptodate() to re-set the uptodate bit on buffer which has been failed to write out to disk. Just remove the redundant codes and switch to use ext4_buffer_uptodate() in __ext4_get_inode_loc(). Signed-off-by:
zhangyi (F) <yi.zhang@huawei.com> Link: https://lore.kernel.org/r/20200924073337.861472-5-yi.zhang@huawei.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
yangerkun <yangerkun@huawei.com> Reviewed-by:
zhangyi (F) <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
zhangyi (F) authored
mainline inclusion from mainline-5.10-rc1 commit 2d069c08 category: bugfix bugzilla: 51832 CVE: NA --------------------------- Revome all open codes that read metadata buffers, switch to use ext4_read_bh_*() common helpers. Signed-off-by:
zhangyi (F) <yi.zhang@huawei.com> Suggested-by:
Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20200924073337.861472-4-yi.zhang@huawei.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Conflicts: fs/ext4/balloc.c fs/ext4/inode.c fs/ext4/ialloc.c fs/ext4/inode.c Signed-off-by:
yangerkun <yangerkun@huawei.com> Reviewed-by:
zhangyi (F) <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
zhangyi (F) authored
mainline inclusion from mainline-5.10-rc1 commit fa491b14 category: bugfix bugzilla: 51832 CVE: NA --------------------------- The previous patch add clear_buffer_verified() before we read metadata block from disk again, but it's rather easy to miss clearing of this bit because currently we read metadata buffer through different open codes (e.g. ll_rw_block(), bh_submit_read() and invoke submit_bh() directly). So, it's time to add common helpers to unify in all the places reading metadata buffers instead. This patch add 3 helpers: - ext4_read_bh_nowait(): async read metadata buffer if it's actually not uptodate, clear buffer_verified bit before read from disk. - ext4_read_bh(): sync version of read metadata buffer, it will wait until the read operation return and check the return status. - ext4_read_bh_lock(): try to lock the buffer before read buffer, it will skip reading if the buffer is already locked. After this patch, we need to use these helpers in all the places reading metadata buffer instead of different open codes. Signed-off-by:
zhangyi (F) <yi.zhang@huawei.com> Suggested-by:
Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20200924073337.861472-3-yi.zhang@huawei.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
yangerkun <yangerkun@huawei.com> Reviewed-by:
zhangyi (F) <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Theodore Ts'o authored
mainline inclusion from mainline-5.6-rc1 commit cf2834a5 category: bugfix bugzilla: 51832 CVE: NA --------------------------- In commit 7963e5ac ("ext4: treat buffers with write errors as containing valid data") we missed changing ext4_sb_bread() to use ext4_buffer_uptodate(). So fix this oversight. Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Conflicts: fs/ext4/super.c [we include d9befeda("ext4: clear buffer verified flag if read meta block from disk") first] Signed-off-by:
yangerkun <yangerkun@huawei.com> Reviewed-by:
zhangyi (F) <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Daniel Borkmann authored
mainline inclusion from mainline-v5.4.101 commit 185c2266c1df80bec001c987d64cae2d9cd13816 category: bugfix bugzilla: NA CVE: CVE-2021-3444 -------------------------------- commit 9b00f1b78809309163dda2d044d9e94a3c0248a3 upstream. Recently noticed that when mod32 with a known src reg of 0 is performed, then the dst register is 32-bit truncated in verifier: 0: R1=ctx(id=0,off=0,imm=0) R10=fp0 0: (b7) r0 = 0 1: R0_w=inv0 R1=ctx(id=0,off=0,imm=0) R10=fp0 1: (b7) r1 = -1 2: R0_w=inv0 R1_w=inv-1 R10=fp0 2: (b4) w2 = -1 3: R0_w=inv0 R1_w=inv-1 R2_w=inv4294967295 R10=fp0 3: (9c) w1 %= w0 4: R0_w=inv0 R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R2_w=inv4294967295 R10=fp0 4: (b7) r0 = 1 5: R0_w=inv1 R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R2_w=inv4294967295 R10=fp0 5: (1d) if r1 == r2 goto pc+1 R0_w=inv1 R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R2_w=inv4294967295 R10=fp0 6: R0_w=inv1 R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R2_w=inv4294967295 R10=fp0 6: (b7) r0 = 2 7: R0_w=inv2 R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R2_w=inv4294967295 R10=fp0 7: (95) exit 7: R0=inv1 R1=inv(id=0,umin_value=4294967295,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R2=inv4294967295 R10=fp0 7: (95) exit However, as a runtime result, we get 2 instead of 1, meaning the dst register does not contain (u32)-1 in this case. The reason is fairly straight forward given the 0 test leaves the dst register as-is: # ./bpftool p d x i 23 0: (b7) r0 = 0 1: (b7) r1 = -1 2: (b4) w2 = -1 3: (16) if w0 == 0x0 goto pc+1 4: (9c) w1 %= w0 5: (b7) r0 = 1 6: (1d) if r1 == r2 goto pc+1 7: (b7) r0 = 2 8: (95) exit This was originally not an issue given the dst register was marked as completely unknown (aka 64 bit unknown). However, after 468f6eaf ("bpf: fix 32-bit ALU op verification") the verifier casts the register output to 32 bit, and hence it becomes 32 bit unknown. Note that for the case where the src register is unknown, the dst register is marked 64 bit unknown. After the fix, the register is truncated by the runtime and the test passes: # ./bpftool p d x i 23 0: (b7) r0 = 0 1: (b7) r1 = -1 2: (b4) w2 = -1 3: (16) if w0 == 0x0 goto pc+2 4: (9c) w1 %= w0 5: (05) goto pc+1 6: (bc) w1 = w1 7: (b7) r0 = 1 8: (1d) if r1 == r2 goto pc+1 9: (b7) r0 = 2 10: (95) exit Semantics also match with {R,W}x mod{64,32} 0 -> {R,W}x. Invalid div has always been {R,W}x div{64,32} 0 -> 0. Rewrites are as follows: mod32: mod64: (16) if w0 == 0x0 goto pc+2 (15) if r0 == 0x0 goto pc+1 (9c) w1 %= w0 (9f) r1 %= r0 (05) goto pc+1 (bc) w1 = w1 Fixes: 468f6eaf ("bpf: fix 32-bit ALU op verification") Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Reviewed-by:
John Fastabend <john.fastabend@gmail.com> Acked-by:
Alexei Starovoitov <ast@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
He <Fengqing<hefengqing@huawei.com> Reviewed-by:
Kuohai Xu <xukuohai@huawei.com> Reviewed-by:
Xiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Daniel Borkmann authored
mainline inclusion from mainline-v5.4.98 commit 78e2f71b89b22222583f74803d14f3d90cdf9d12 category: bugfix bugzilla: NA CVE: CVE-2021-3444 -------------------------------- commit e88b2c6e5a4d9ce30d75391e4d950da74bb2bd90 upstream. While reviewing a different fix, John and I noticed an oddity in one of the BPF program dumps that stood out, for example: # bpftool p d x i 13 0: (b7) r0 = 808464450 1: (b4) w4 = 808464432 2: (bc) w0 = w0 3: (15) if r0 == 0x0 goto pc+1 4: (9c) w4 %= w0 [...] In line 2 we noticed that the mov32 would 32 bit truncate the original src register for the div/mod operation. While for the two operations the dst register is typically marked unknown e.g. from adjust_scalar_min_max_vals() the src register is not, and thus verifier keeps tracking original bounds, simplified: 0: R1=ctx(id=0,off=0,imm=0) R10=fp0 0: (b7) r0 = -1 1: R0_w=invP-1 R1=ctx(id=0,off=0,imm=0) R10=fp0 1: (b7) r1 = -1 2: R0_w=invP-1 R1_w=invP-1 R10=fp0 2: (3c) w0 /= w1 3: R0_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R1_w=invP-1 R10=fp0 3: (77) r1 >>= 32 4: R0_w=invP(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R1_w=invP4294967295 R10=fp0 4: (bf) r0 = r1 5: R0_w=invP4294967295 R1_w=invP4294967295 R10=fp0 5: (95) exit processed 6 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 Runtime result of r0 at exit is 0 instead of expected -1. Remove the verifier mov32 src rewrite in div/mod and replace it with a jmp32 test instead. After the fix, we result in the following code generation when having dividend r1 and divisor r6: div, 64 bit: div, 32 bit: 0: (b7) r6 = 8 0: (b7) r6 = 8 1: (b7) r1 = 8 1: (b7) r1 = 8 2: (55) if r6 != 0x0 goto pc+2 2: (56) if w6 != 0x0 goto pc+2 3: (ac) w1 ^= w1 3: (ac) w1 ^= w1 4: (05) goto pc+1 4: (05) goto pc+1 5: (3f) r1 /= r6 5: (3c) w1 /= w6 6: (b7) r0 = 0 6: (b7) r0 = 0 7: (95) exit 7: (95) exit mod, 64 bit: mod, 32 bit: 0: (b7) r6 = 8 0: (b7) r6 = 8 1: (b7) r1 = 8 1: (b7) r1 = 8 2: (15) if r6 == 0x0 goto pc+1 2: (16) if w6 == 0x0 goto pc+1 3: (9f) r1 %= r6 3: (9c) w1 %= w6 4: (b7) r0 = 0 4: (b7) r0 = 0 5: (95) exit 5: (95) exit x86 in particular can throw a 'divide error' exception for div instruction not only for divisor being zero, but also for the case when the quotient is too large for the designated register. For the edx:eax and rdx:rax dividend pair it is not an issue in x86 BPF JIT since we always zero edx (rdx). Hence really the only protection needed is against divisor being zero. Fixes: 68fda450 ("bpf: fix 32-bit divide by zero") Co-developed-by:
John Fastabend <john.fastabend@gmail.com> Signed-off-by:
John Fastabend <john.fastabend@gmail.com> Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Acked-by:
Alexei Starovoitov <ast@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
He <Fengqing<hefengqing@huawei.com> Reviewed-by:
Kuohai Xu <xukuohai@huawei.com> Reviewed-by:
Xiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Jiong Wang authored
mainline inclusion from mainline-v5.1-rc1 commit 654b65a0 category: bugfix bugzilla: NA CVE: CVE-2021-3444 -------------------------------- This patch implements code-gen for new JMP32 instructions on arm64. Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Zi Shen Lim <zlim.lnx@gmail.com> Signed-off-by:
Jiong Wang <jiong.wang@netronome.com> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Conflicts: arch/arm64/net/bpf_jit_comp.c Signed-off-by:
He <Fengqing<hefengqing@huawei.com> Reviewed-by:
Kuohai Xu <xukuohai@huawei.com> Reviewed-by:
Xiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Jiong Wang authored
mainline inclusion from mainline-v5.1-rc1 commit 3f5d6525 category: bugfix bugzilla: NA CVE: CVE-2021-3444 -------------------------------- This patch implements code-gen for new JMP32 instructions on x86_64. Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by:
Jiong Wang <jiong.wang@netronome.com> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Signed-off-by:
He <Fengqing<hefengqing@huawei.com> Reviewed-by:
Kuohai Xu <xukuohai@huawei.com> Reviewed-by:
Xiu Jianfeng <xiujianfeng@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-