- Oct 15, 2020
-
-
Max Reitz authored
mainline inclusion from mainline-v5.4-rc3 commit e093c4be category: bugfix bugzilla: NA CVE: NA --------------------------- To ensure that all blocks touched by the range [offset, offset + count) are allocated, we need to calculate the block count from the difference of the range end (rounded up) and the range start (rounded down). Before this patch, we just round up the byte count, which may lead to unaligned ranges not being fully allocated: $ touch test_file $ block_size=$(stat -fc '%S' test_file) $ fallocate -o $((block_size / 2)) -l $block_size test_file $ xfs_bmap test_file test_file: 0: [0..7]: 1396264..1396271 1: [8..15]: hole There should not be a hole there. Instead, the first two blocks should be fully allocated. With this patch applied, the result is something like this: $ touch test_file $ block_size=$(stat -fc '%S' test_file) $ fallocate -o $((block_size / 2)) -l $block_size test_file $ xfs_bmap test_file test_file: 0: [0..15]: 11024..11039 Signed-off-by:
Max Reitz <mreitz@redhat.com> Reviewed-by:
Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Reviewed-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com> Reviewed-by:
zhangyi (F) <yi.zhang@huawei.com> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Yang Xu authored
stable inclusion from linux-4.19.116 commit 14b96359440421c4d84033933a2a5de73de1510d -------------------------------- commit 2e356101 upstream. Currently, when we add a new user key, the calltrace as below: add_key() key_create_or_update() key_alloc() __key_instantiate_and_link generic_key_instantiate key_payload_reserve ...... Since commit a08bf91c ("KEYS: allow reaching the keys quotas exactly"), we can reach max bytes/keys in key_alloc, but we forget to remove this limit when we reserver space for payload in key_payload_reserve. So we can only reach max keys but not max bytes when having delta between plen and type->def_datalen. Remove this limit when instantiating the key, so we can keep consistent with key_alloc. Also, fix the similar problem in keyctl_chown_key(). Fixes: 0b77f5bf ("keys: make the keyring quotas controllable through /proc/sys") Fixes: a08bf91c ("KEYS: allow reaching the keys quotas exactly") Cc: stable@vger.kernel.org # 5.0.x Cc: Eric Biggers <ebiggers@google.com> Signed-off-by:
Yang Xu <xuyang2018.jy@cn.fujitsu.com> Reviewed-by:
Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Reviewed-by:
Eric Biggers <ebiggers@google.com> Signed-off-by:
Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Lukas Wunner authored
stable inclusion from linux-4.19.148 commit 8b4846ac1af4b0c99817aee7304e9f5dd6ffcb56 -------------------------------- commit e0a851fe upstream. If the call to uart_add_one_port() in serial8250_register_8250_port() fails, a half-initialized entry in the serial_8250ports[] array is left behind. A subsequent reprobe of the same serial port causes that entry to be reused. Because uart->port.dev is set, uart_remove_one_port() is called for the half-initialized entry and bails out with an error message: bcm2835-aux-uart 3f215040.serial: Removing wrong port: (null) != (ptrval) The same happens on failure of mctrl_gpio_init() since commit 4a96895f ("tty/serial/8250: use mctrl_gpio helpers"). Fix by zeroing the uart->port.dev pointer in the probe error path. The bug was introduced in v2.6.10 by historical commit befff6f5bf5f ("[SERIAL] Add new port registration/unregistration functions."): https://git.kernel.org/tglx/history/c/befff6f5bf5f The commit added an unconditional call to uart_remove_one_port() in serial8250_register_port(). In v3.7, commit 835d844d ("8250_pnp: do pnp probe before legacy probe") made that call conditional on uart->port.dev which allows me to fix the issue by zeroing that pointer in the error path. Thus, the present commit will fix the problem as far back as v3.7 whereas still older versions need to also cherry-pick 835d844d. Fixes: 835d844d ("8250_pnp: do pnp probe before legacy probe") Signed-off-by:
Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org # v2.6.10 Cc: stable@vger.kernel.org # v2.6.10: 835d844d: 8250_pnp: do pnp probe before legacy Reviewed-by:
Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/b4a072013ee1a1d13ee06b4325afb19bda57ca1b.1589285873.git.lukas@wunner.de [iwamatsu: Backported to 4.14, 4.19: adjust context] Signed-off-by:
Nobuhiro Iwamatsu (CIP) <nobuhiro1.iwamatsu@toshiba.co.jp> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Xunlei Pang authored
stable inclusion from linux-4.19.148 commit 1aa7a9e5eebc5c40f0a5ea4e4cb8e8bd0267aea1 -------------------------------- commit e3336cab upstream. We've met softlockup with "CONFIG_PREEMPT_NONE=y", when the target memcg doesn't have any reclaimable memory. It can be easily reproduced as below: watchdog: BUG: soft lockup - CPU#0 stuck for 111s![memcg_test:2204] CPU: 0 PID: 2204 Comm: memcg_test Not tainted 5.9.0-rc2+ #12 Call Trace: shrink_lruvec+0x49f/0x640 shrink_node+0x2a6/0x6f0 do_try_to_free_pages+0xe9/0x3e0 try_to_free_mem_cgroup_pages+0xef/0x1f0 try_charge+0x2c1/0x750 mem_cgroup_charge+0xd7/0x240 __add_to_page_cache_locked+0x2fd/0x370 add_to_page_cache_lru+0x4a/0xc0 pagecache_get_page+0x10b/0x2f0 filemap_fault+0x661/0xad0 ext4_filemap_fault+0x2c/0x40 __do_fault+0x4d/0xf9 handle_mm_fault+0x1080/0x1790 It only happens on our 1-vcpu instances, because there's no chance for oom reaper to run to reclaim the to-be-killed process. Add a cond_resched() at the upper shrink_node_memcgs() to solve this issue, this will mean that we will get a scheduling point for each memcg in the reclaimed hierarchy without any dependency on the reclaimable memory in that memcg thus making it more predictable. Suggested-by:
Michal Hocko <mhocko@suse.com> Signed-off-by:
Xunlei Pang <xlpang@linux.alibaba.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Acked-by:
Chris Down <chris@chrisdown.name> Acked-by:
Michal Hocko <mhocko@suse.com> Acked-by:
Johannes Weiner <hannes@cmpxchg.org> Link: http://lkml.kernel.org/r/1598495549-67324-1-git-send-email-xlpang@linux.alibaba.com Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Julius Hemanth Pitti <jpitti@cisco.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Ralph Campbell authored
stable inclusion from linux-4.19.148 commit ec56646e3b2a9a0c3a2fa63732fab731009a25af -------------------------------- [ Upstream commit ec0abae6 ] A migrating transparent huge page has to already be unmapped. Otherwise, the page could be modified while it is being copied to a new page and data could be lost. The function __split_huge_pmd() checks for a PMD migration entry before calling __split_huge_pmd_locked() leading one to think that __split_huge_pmd_locked() can handle splitting a migrating PMD. However, the code always increments the page->_mapcount and adjusts the memory control group accounting assuming the page is mapped. Also, if the PMD entry is a migration PMD entry, the call to is_huge_zero_pmd(*pmd) is incorrect because it calls pmd_pfn(pmd) instead of migration_entry_to_pfn(pmd_to_swp_entry(pmd)). Fix these problems by checking for a PMD migration entry. Fixes: 84c3fc4e ("mm: thp: check pmd migration entry in common path") Signed-off-by:
Ralph Campbell <rcampbell@nvidia.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Reviewed-by:
Yang Shi <shy828301@gmail.com> Reviewed-by:
Zi Yan <ziy@nvidia.com> Cc: Jerome Glisse <jglisse@redhat.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Bharata B Rao <bharata@linux.ibm.com> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> [4.14+] Link: https://lkml.kernel.org/r/20200903183140.19055-1-rcampbell@nvidia.com Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Muchun Song authored
stable inclusion from linux-4.19.148 commit d44a437826119e8307c3904c1e4f513095ea17cb -------------------------------- [ Upstream commit b0399092 ] If a kprobe is marked as gone, we should not kill it again. Otherwise, we can disarm the kprobe more than once. In that case, the statistics of kprobe_ftrace_enabled can unbalance which can lead to that kprobe do not work. Fixes: e8386a0c ("kprobes: support probing module __exit function") Co-developed-by:
Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by:
Muchun Song <songmuchun@bytedance.com> Signed-off-by:
Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Acked-by:
Masami Hiramatsu <mhiramat@kernel.org> Cc: "Naveen N . Rao" <naveen.n.rao@linux.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David S. Miller <davem@davemloft.net> Cc: Song Liu <songliubraving@fb.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200822030055.32383-1-songmuchun@bytedance.com Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Sunghyun Jin authored
stable inclusion from linux-4.19.147 commit 5afd52f302cac2700c59b86d19c329c0ba918977 -------------------------------- commit b3b33d3c upstream. Variable populated, which is a member of struct pcpu_chunk, is used as a unit of size of unsigned long. However, size of populated is miscounted. So, I fix this minor part. Fixes: 8ab16c43 ("percpu: change the number of pages marked in the first_chunk pop bitmap") Cc: <stable@vger.kernel.org> # 4.14+ Signed-off-by:
Sunghyun Jin <mcsmonk@gmail.com> Signed-off-by:
Dennis Zhou <dennis@kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Gustav Wiklander authored
stable inclusion from linux-4.19.147 commit 80c468d9abc9d4129809c1ffc90b3c835a1202c2 -------------------------------- [ Upstream commit b59a7ca1 ] In the prepare_message callback the bus driver has the opportunity to split a transfer into smaller chunks. spi_map_msg is done after prepare_message. Function spi_res_release releases the splited transfers in the message. Therefore spi_res_release should be called after spi_map_msg. The previous try at this was commit c9ba7a16 which released the splited transfers after spi_finalize_current_message had been called. This introduced a race since the message struct could be out of scope because the spi_sync call got completed. Fixes this leak on spi bus driver spi-bcm2835.c when transfer size is greater than 65532: Kmemleak: sg_alloc_table+0x28/0xc8 spi_map_buf+0xa4/0x300 __spi_pump_messages+0x370/0x748 __spi_sync+0x1d4/0x270 spi_sync+0x34/0x58 spi_test_execute_msg+0x60/0x340 [spi_loopback_test] spi_test_run_iter+0x548/0x578 [spi_loopback_test] spi_test_run_test+0x94/0x140 [spi_loopback_test] spi_test_run_tests+0x150/0x180 [spi_loopback_test] spi_loopback_test_probe+0x50/0xd0 [spi_loopback_test] spi_drv_probe+0x84/0xe0 Signed-off-by:
Gustav Wiklander <gustavwi@axis.com> Link: https://lore.kernel.org/r/20200908151129.15915-1-gustav.wiklander@axis.com Signed-off-by:
Mark Brown <broonie@kernel.org> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
David Milburn authored
stable inclusion from linux-4.19.147 commit f10c9c9dce4d3ee542987680e2a8576871c05734 -------------------------------- [ Upstream commit 925dd04c ] Cancel async event work in case async event has been queued up, and nvme_rdma_submit_async_event() runs after event has been freed. Signed-off-by:
David Milburn <dmilburn@redhat.com> Reviewed-by:
Keith Busch <kbusch@kernel.org> Reviewed-by:
Sagi Grimberg <sagi@grimberg.me> Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
David Milburn authored
stable inclusion from linux-4.19.147 commit 514171c50909736af8b6cdf6365c0d15bdb869a2 -------------------------------- [ Upstream commit e126e821 ] Cancel async event work in case async event has been queued up, and nvme_fc_submit_async_event() runs after event has been freed. Signed-off-by:
David Milburn <dmilburn@redhat.com> Reviewed-by:
Keith Busch <kbusch@kernel.org> Reviewed-by:
Sagi Grimberg <sagi@grimberg.me> Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Chuck Lever authored
stable inclusion from linux-4.19.147 commit a6a2cf4d918f3c62b652a87d4e9b667049de4cb1 -------------------------------- [ Upstream commit 644c9f40 ] If a write delegation isn't available, the Linux NFS client uses a zero-stateid when performing a SETATTR. NFSv4.0 provides no mechanism for an NFS server to match such a request to a particular client. It recalls all delegations for that file, even delegations held by the client issuing the request. If that client happens to hold a read delegation, the server will recall it immediately, resulting in an NFS4ERR_DELAY/CB_RECALL/ DELEGRETURN sequence. Optimize out this pipeline bubble by having the client return any delegations it may hold on a file before it issues a SETATTR(zero-stateid) on that file. Signed-off-by:
Chuck Lever <chuck.lever@oracle.com> Signed-off-by:
Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Hou Pu authored
stable inclusion from linux-4.19.146 commit 4f78e55daaa2986bc533d1da3e8e7ac9cc3048f5 -------------------------------- commit ed43ffea upstream. The iSCSI target login thread might get stuck with the following stack: cat /proc/`pidof iscsi_np`/stack [<0>] down_interruptible+0x42/0x50 [<0>] iscsit_access_np+0xe3/0x167 [<0>] iscsi_target_locate_portal+0x695/0x8ac [<0>] __iscsi_target_login_thread+0x855/0xb82 [<0>] iscsi_target_login_thread+0x2f/0x5a [<0>] kthread+0xfa/0x130 [<0>] ret_from_fork+0x1f/0x30 This can be reproduced via the following steps: 1. Initiator A tries to log in to iqn1-tpg1 on port 3260. After finishing PDU exchange in the login thread and before the negotiation is finished the the network link goes down. At this point A has not finished login and tpg->np_login_sem is held. 2. Initiator B tries to log in to iqn2-tpg1 on port 3260. After finishing PDU exchange in the login thread the target expects to process remaining login PDUs in workqueue context. 3. Initiator A' tries to log in to iqn1-tpg1 on port 3260 from a new socket. A' will wait for tpg->np_login_sem with np->np_login_timer loaded to wait for at most 15 seconds. The lock is held by A so A' eventually times out. 4. Before A' got timeout initiator B gets negotiation failed and calls iscsi_target_login_drop()->iscsi_target_login_sess_out(). The np->np_login_timer is canceled and initiator A' will hang forever. Because A' is now in the login thread, no new login requests can be serviced. Fix this by moving iscsi_stop_login_thread_timer() out of iscsi_target_login_sess_out(). Also remove iscsi_np parameter from iscsi_target_login_sess_out(). Link: https://lore.kernel.org/r/20200729130343.24976-1-houpu@bytedance.com Cc: stable@vger.kernel.org Reviewed-by:
Mike Christie <michael.christie@oracle.com> Signed-off-by:
Hou Pu <houpu@bytedance.com> Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Varun Prakash authored
stable inclusion from linux-4.19.146 commit 549a2cac6bc278b7f238a59eeb644205a10a86ca -------------------------------- commit 5528d031 upstream. Current code does not consider 'page_off' in data digest calculation. To fix this, add a local variable 'first_sg' and set first_sg.offset to sg->offset + page_off. Link: https://lore.kernel.org/r/1598358910-3052-1-git-send-email-varun@chelsio.com Fixes: e48354ce ("iscsi-target: Add iSCSI fabric support for target v4.1") Cc: <stable@vger.kernel.org> Reviewed-by:
Mike Christie <michael.christie@oralce.com> Signed-off-by:
Varun Prakash <varun@chelsio.com> Signed-off-by:
Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Darrick J. Wong authored
stable inclusion from linux-4.19.146 commit b701016288dc8cfe55d24b1b9e075adbab25ae78 -------------------------------- [ Upstream commit 125eac24 ] Don't leak kernel memory contents into the shortform attr fork. Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Eric Sandeen <sandeen@redhat.com> Reviewed-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Jens Axboe authored
stable inclusion from linux-4.19.145 commit 732fd460bb72fd51607311009b7d474a6e0e47f3 -------------------------------- [ Upstream commit de1b0ee4 ] If a driver leaves the limit settings as the defaults, then we don't initialize bdi->io_pages. This means that file systems may need to work around bdi->io_pages == 0, which is somewhat messy. Initialize the default value just like we do for ->ra_pages. Cc: stable@vger.kernel.org Fixes: 9491ae4a ("mm: don't cap request size based on read-ahead setting") Reported-by:
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Mikulas Patocka authored
stable inclusion from linux-4.19.144 commit 154096e9966123f198cfc2f959bc559c562fc6b7 -------------------------------- commit f9e040ef upstream. The function dax_direct_access doesn't take partitions into account, it always maps pages from the beginning of the device. Therefore, persistent_memory_claim() must get the partition offset using get_start_sect() and add it to the page offsets passed to dax_direct_access(). Signed-off-by:
Mikulas Patocka <mpatocka@redhat.com> Fixes: 48debafe ("dm: add writecache target") Cc: stable@vger.kernel.org # 4.18+ Signed-off-by:
Mike Snitzer <snitzer@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Tejun Heo authored
stable inclusion from linux-4.19.144 commit a8bb7740aa313994bfa4c21cba399f65985a8a35 -------------------------------- commit 3b545563 upstream. All three generations of Sandisk SSDs lock up hard intermittently. Experiments showed that disabling NCQ lowered the failure rate significantly and the kernel has been disabling NCQ for some models of SD7's and 8's, which is obviously undesirable. Karthik worked with Sandisk to root cause the hard lockups to trim commands larger than 128M. This patch implements ATA_HORKAGE_MAX_TRIM_128M which limits max trim size to 128M and applies it to all three generations of Sandisk SSDs. Signed-off-by:
Tejun Heo <tj@kernel.org> Cc: Karthik Shivaram <karthikgs@fb.com> Cc: stable@vger.kernel.org Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Daniel Borkmann authored
stable inclusion from linux-4.19.144 commit cfb4721fce554cb596ea86af116d3d68a4d91254 -------------------------------- [ Upstream commit 1d1585ca ] Commit 3d708182 ("uaccess: Add non-pagefault user-space read functions") missed to add probe write function, therefore factor out a probe_write_common() helper with most logic of probe_kernel_write() except setting KERNEL_DS, and add a new probe_user_write() helper so it can be used from BPF side. Again, on some archs, the user address space and kernel address space can co-exist and be overlapping, so in such case, setting KERNEL_DS would mean that the given address is treated as being in kernel address space. Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> Signed-off-by:
Alexei Starovoitov <ast@kernel.org> Acked-by:
Andrii Nakryiko <andriin@fb.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lore.kernel.org/bpf/9df2542e68141bfa3addde631441ee45503856a8.1572649915.git.daniel@iogearbox.net Signed-off-by:
Sasha Levin <sashal@kernel.org> Conflicts: mm/maccess.c [yyl: remove VERIFY_WRITE in access_ok()] Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Masami Hiramatsu authored
stable inclusion from linux-4.19.144 commit 61135a9c74c8f79fce637c28f6109ab58467d5bd -------------------------------- [ Upstream commit 3d708182 ] Add probe_user_read(), strncpy_from_unsafe_user() and strnlen_unsafe_user() which allows caller to access user-space in IRQ context. Current probe_kernel_read() and strncpy_from_unsafe() are not available for user-space memory, because it sets KERNEL_DS while accessing data. On some arch, user address space and kernel address space can be co-exist, but others can not. In that case, setting KERNEL_DS means given address is treated as a kernel address space. Also strnlen_user() is only available from user context since it can sleep if pagefault is enabled. To access user-space memory without pagefault, we need these new functions which sets USER_DS while accessing the data. Link: http://lkml.kernel.org/r/155789869802.26965.4940338412595759063.stgit@devnote2 Acked-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by:
Sasha Levin <sashal@kernel.org> Conflicts: mm/maccess.c [yyl: remove VERIFY_READ in access_ok()] Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Mikulas Patocka authored
stable inclusion from linux-4.19.144 commit 884fee7632168ab59ed49a26de430fa3ed5c6a86 -------------------------------- commit b17164e2 upstream. When running in a dax mode, if the user maps a page with MAP_PRIVATE and PROT_WRITE, the xfs filesystem would incorrectly update ctime and mtime when the user hits a COW fault. This breaks building of the Linux kernel. How to reproduce: 1. extract the Linux kernel tree on dax-mounted xfs filesystem 2. run make clean 3. run make -j12 4. run make -j12 at step 4, make would incorrectly rebuild the whole kernel (although it was already built in step 3). The reason for the breakage is that almost all object files depend on objtool. When we run objtool, it takes COW page fault on its .data section, and these faults will incorrectly update the timestamp of the objtool binary. The updated timestamp causes make to rebuild the whole tree. Signed-off-by:
Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Jason Gunthorpe authored
stable inclusion from linux-4.19.144 commit 95968e5cbb5db7ff33288d01a95349d476f19248 -------------------------------- [ Upstream commit 428fc0af ] Otherwise gcc generates warnings if the expression is complicated. Fixes: 312a0c17 ("[PATCH] LOG2: Alter roundup_pow_of_two() so that it can use a ilog2() on a constant") Signed-off-by:
Jason Gunthorpe <jgg@nvidia.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/0-v1-8a2697e3c003+41165-log_brackets_jgg@nvidia.com Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Namhyung Kim authored
stable inclusion from linux-4.19.144 commit faec94592f7364be5f1e512f6330d3b86136166b -------------------------------- [ Upstream commit e62458e3 ] The new string should have enough space for the original string and the back slashes IMHO. Fixes: fbc2844e ("perf vendor events: Use more flexible pattern matching for CPU identification for mapfile.csv") Signed-off-by:
Namhyung Kim <namhyung@kernel.org> Reviewed-by:
Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: William Cohen <wcohen@redhat.com> Link: http://lore.kernel.org/lkml/20200903152510.489233-1-namhyung@kernel.org Signed-off-by:
Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Darrick J. Wong authored
stable inclusion from linux-4.19.144 commit ab2413892e2d26015eae2f279f30935846ca24aa -------------------------------- [ Upstream commit d0c20d38 ] The realtime flag only applies to the data fork, so don't use the realtime block number checks on the attr fork of a realtime file. Fixes: 30b0984d ("xfs: refactor bmap record validation") Signed-off-by:
Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by:
Eric Sandeen <sandeen@redhat.com> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Al Viro authored
stable inclusion from linux-4.19.144 commit 37d933e8b41b83bb8278815e366aec5a542b7e31 -------------------------------- [ Upstream commit 77f4689d ] epoll_loop_check_proc() can run into a file already committed to destruction; we can't grab a reference on those and don't need to add them to the set for reverse path check anyway. Tested-by:
Marc Zyngier <maz@kernel.org> Fixes: a9ed4a65 ("epoll: Keep a reference on files added to the check list") Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Al Grant authored
stable inclusion from linux-4.19.144 commit 5154e806105266406156b3fa67d05df7a398aa6c -------------------------------- [ Upstream commit 39c0a53b ] perf_event.h has macros that define the field offsets in the data_src bitmask in perf records. The SNOOPX and REMOTE offsets were both 37. These are distinct fields, and the bitfield layout in perf_mem_data_src confirms that SNOOPX should be at offset 38. Committer notes: This was extracted from a larger patch that also contained kernel changes. Fixes: 52839e65 ("perf tools: Add support for printing new mem_info encodings") Signed-off-by:
Al Grant <al.grant@arm.com> Reviewed-by:
Andi Kleen <ak@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/9974f2d0-bf7f-518e-d9f7-4520e5ff1bb0@foss.arm.com Signed-off-by:
Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Peter Zijlstra authored
stable inclusion from linux-4.19.144 commit c12a4e6ac32c6d5978fecc0501b7bb9c8ddd2bae -------------------------------- [ Upstream commit 49d9c593 ] Match the pattern elsewhere in this file. Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> Reviewed-by:
Thomas Gleixner <tglx@linutronix.de> Acked-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by:
Marco Elver <elver@google.com> Link: https://lkml.kernel.org/r/20200821085348.251340558@infradead.org Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Jarkko Sakkinen authored
stable inclusion from linux-4.19.143 commit dc828b79feea72cfbc34fd6104a8386bade6fd78 -------------------------------- [ Upstream commit 6c4e79d9 ] The size of the buffers for storing context's and sessions can vary from arch to arch as PAGE_SIZE can be anything between 4 kB and 256 kB (the maximum for PPC64). Define a fixed buffer size set to 16 kB. This should be enough for most use with three handles (that is how many we allow at the moment). Parametrize the buffer size while doing this, so that it is easier to revisit this later on if required. Cc: stable@vger.kernel.org Reported-by:
Stefan Berger <stefanb@linux.ibm.com> Fixes: 745b361e ("tpm: infrastructure for TPM spaces") Reviewed-by:
Jerry Snitselaar <jsnitsel@redhat.com> Tested-by:
Stefan Berger <stefanb@linux.ibm.com> Signed-off-by:
Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Heikki Krogerus authored
stable inclusion from linux-4.19.143 commit 9d57313ce14b6449369f04e07afd8bfe2c70a2bc -------------------------------- commit c15e1bdd upstream. When the primary firmware node pointer is removed from a device (set to NULL) the secondary firmware node pointer, when it exists, is made the primary node for the device. However, the secondary firmware node pointer of the original primary firmware node is never cleared (set to NULL). To avoid situation where the secondary firmware node pointer is pointing to a non-existing object, clearing it properly when the primary node is removed from a device in set_primary_fwnode(). Fixes: 97badf87 ("device property: Make it possible to use secondary firmware nodes") Cc: All applicable <stable@vger.kernel.org> Signed-off-by:
Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Rafael J. Wysocki authored
stable inclusion from linux-4.19.143 commit 517b087a70a3fc59cdba5dcf99afcb8a2f90a3c7 -------------------------------- commit e3eb6e8f upstream. It has been reported that system-wide suspend may be aborted in the absence of any wakeup events due to unforseen interactions of it with the runtume PM framework. One failing scenario is when there are multiple devices sharing an ACPI power resource and runtime-resume needs to be carried out for one of them during system-wide suspend (for example, because it needs to be reconfigured before the whole system goes to sleep). In that case, the runtime-resume of that device involves turning the ACPI power resource "on" which in turn causes runtime-resume requests to be queued up for all of the other devices sharing it. Those requests go to the runtime PM workqueue which is frozen during system-wide suspend, so they are not actually taken care of until the resume of the whole system, but the pm_runtime_barrier() call in __device_suspend() sees them and triggers system wakeup events for them which then cause the system-wide suspend to be aborted if wakeup source objects are in active use. Of course, the logic that leads to triggering those wakeup events is questionable in the first place, because clearly there are cases in which a pending runtime resume request for a device is not connected to any real wakeup events in any way (like the one above). Moreover, it is racy, because the device may be resuming already by the time the pm_runtime_barrier() runs and so if the driver doesn't take care of signaling the wakeup event as appropriate, it will be lost. However, if the driver does take care of that, the extra pm_wakeup_event() call in the core is redundant. Accordingly, drop the conditional pm_wakeup_event() call fron __device_suspend() and make the latter call pm_runtime_barrier() alone. Also modify the comment next to that call to reflect the new code and extend it to mention the need to avoid unwanted interactions between runtime PM and system-wide device suspend callbacks. Fixes: 1e2ef05b ("PM: Limit race conditions between runtime PM and system sleep (v2)") Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by:
Alan Stern <stern@rowland.harvard.edu> Reported-by:
Utkarsh H Patel <utkarsh.h.patel@intel.com> Tested-by:
Utkarsh H Patel <utkarsh.h.patel@intel.com> Tested-by:
Pengfei Xu <pengfei.xu@intel.com> Cc: All applicable <stable@vger.kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Jan Kara authored
stable inclusion from linux-4.19.143 commit 7c3d77a31b0633dd8d61e3ab7e3d919af6fded75 -------------------------------- commit f9cae926 upstream. When we are processing writeback for sync(2), move_expired_inodes() didn't set any inode expiry value (older_than_this). This can result in writeback never completing if there's steady stream of inodes added to b_dirty_time list as writeback rechecks dirty lists after each writeback round whether there's more work to be done. Fix the problem by using sync(2) start time is inode expiry value when processing b_dirty_time list similarly as for ordinarily dirtied inodes. This requires some refactoring of older_than_this handling which simplifies the code noticeably as a bonus. Fixes: 0ae45f63 ("vfs: add support for a lazytime mount option") CC: stable@vger.kernel.org Reviewed-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Jan Kara <jack@suse.cz> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Conflicts: include/trace/events/writeback.h [yyl: strlcpy() is replaced with bdi_get_dev_name()] Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Jan Kara authored
stable inclusion from linux-4.19.143 commit be0937e03bf65b4e113213658bb3c54b5e5d4f0e -------------------------------- commit 5afced3b upstream. Inode's i_io_list list head is used to attach inode to several different lists - wb->{b_dirty, b_dirty_time, b_io, b_more_io}. When flush worker prepares a list of inodes to writeback e.g. for sync(2), it moves inodes to b_io list. Thus it is critical for sync(2) data integrity guarantees that inode is not requeued to any other writeback list when inode is queued for processing by flush worker. That's the reason why writeback_single_inode() does not touch i_io_list (unless the inode is completely clean) and why __mark_inode_dirty() does not touch i_io_list if I_SYNC flag is set. However there are two flaws in the current logic: 1) When inode has only I_DIRTY_TIME set but it is already queued in b_io list due to sync(2), concurrent __mark_inode_dirty(inode, I_DIRTY_SYNC) can still move inode back to b_dirty list resulting in skipping writeback of inode time stamps during sync(2). 2) When inode is on b_dirty_time list and writeback_single_inode() races with __mark_inode_dirty() like: writeback_single_inode() __mark_inode_dirty(inode, I_DIRTY_PAGES) inode->i_state |= I_SYNC __writeback_single_inode() inode->i_state |= I_DIRTY_PAGES; if (inode->i_state & I_SYNC) bail if (!(inode->i_state & I_DIRTY_ALL)) - not true so nothing done We end up with I_DIRTY_PAGES inode on b_dirty_time list and thus standard background writeback will not writeback this inode leading to possible dirty throttling stalls etc. (thanks to Martijn Coenen for this analysis). Fix these problems by tracking whether inode is queued in b_io or b_more_io lists in a new I_SYNC_QUEUED flag. When this flag is set, we know flush worker has queued inode and we should not touch i_io_list. On the other hand we also know that once flush worker is done with the inode it will requeue the inode to appropriate dirty list. When I_SYNC_QUEUED is not set, __mark_inode_dirty() can (and must) move inode to appropriate dirty list. Reported-by:
Martijn Coenen <maco@android.com> Reviewed-by:
Martijn Coenen <maco@android.com> Tested-by:
Martijn Coenen <maco@android.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Fixes: 0ae45f63 ("vfs: add support for a lazytime mount option") CC: stable@vger.kernel.org Signed-off-by:
Jan Kara <jack@suse.cz> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Jan Kara authored
stable inclusion from linux-4.19.143 commit 1f58ddc07eef098a32741e74516bb6ef18009ce1 -------------------------------- commit b35250c0 upstream. Currently, operations on inode->i_io_list are protected by wb->list_lock. In the following patches we'll need to maintain consistency between inode->i_state and inode->i_io_list so change the code so that inode->i_lock protects also all inode's i_io_list handling. Reviewed-by:
Martijn Coenen <maco@android.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> CC: stable@vger.kernel.org # Prerequisite for "writeback: Avoid skipping inode writeback" Signed-off-by:
Jan Kara <jack@suse.cz> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Sergey Senozhatsky authored
stable inclusion from linux-4.19.143 commit 2eb35a11bbcc3636cdabd6228299d6a1b91cde00 -------------------------------- commit 205d300a upstream. We have a number of "uart.port->desc.lock vs desc.lock->uart.port" lockdep reports coming from 8250 driver; this causes a bit of trouble to people, so let's fix it. The problem is reverse lock order in two different call paths: chain #1: serial8250_do_startup() spin_lock_irqsave(&port->lock); disable_irq_nosync(port->irq); raw_spin_lock_irqsave(&desc->lock) chain #2: __report_bad_irq() raw_spin_lock_irqsave(&desc->lock) for_each_action_of_desc() printk() spin_lock_irqsave(&port->lock); Fix this by changing the order of locks in serial8250_do_startup(): do disable_irq_nosync() first, which grabs desc->lock, and grab uart->port after that, so that chain #1 and chain #2 have same lock order. Full lockdep splat: ====================================================== WARNING: possible circular locking dependency detected 5.4.39 #55 Not tainted ====================================================== swapper/0/0 is trying to acquire lock: ffffffffab65b6c0 (console_owner){-...}, at: console_lock_spinning_enable+0x31/0x57 but task is already holding lock: ffff88810a8e34c0 (&irq_desc_lock_class){-.-.}, at: __report_bad_irq+0x5b/0xba which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&irq_desc_lock_class){-.-.}: _raw_spin_lock_irqsave+0x61/0x8d __irq_get_desc_lock+0x65/0x89 __disable_irq_nosync+0x3b/0x93 serial8250_do_startup+0x451/0x75c uart_startup+0x1b4/0x2ff uart_port_activate+0x73/0xa0 tty_port_open+0xae/0x10a uart_open+0x1b/0x26 tty_open+0x24d/0x3a0 chrdev_open+0xd5/0x1cc do_dentry_open+0x299/0x3c8 path_openat+0x434/0x1100 do_filp_open+0x9b/0x10a do_sys_open+0x15f/0x3d7 kernel_init_freeable+0x157/0x1dd kernel_init+0xe/0x105 ret_from_fork+0x27/0x50 -> #1 (&port_lock_key){-.-.}: _raw_spin_lock_irqsave+0x61/0x8d serial8250_console_write+0xa7/0x2a0 console_unlock+0x3b7/0x528 vprintk_emit+0x111/0x17f printk+0x59/0x73 register_console+0x336/0x3a4 uart_add_one_port+0x51b/0x5be serial8250_register_8250_port+0x454/0x55e dw8250_probe+0x4dc/0x5b9 platform_drv_probe+0x67/0x8b really_probe+0x14a/0x422 driver_probe_device+0x66/0x130 device_driver_attach+0x42/0x5b __driver_attach+0xca/0x139 bus_for_each_dev+0x97/0xc9 bus_add_driver+0x12b/0x228 driver_register+0x64/0xed do_one_initcall+0x20c/0x4a6 do_initcall_level+0xb5/0xc5 do_basic_setup+0x4c/0x58 kernel_init_freeable+0x13f/0x1dd kernel_init+0xe/0x105 ret_from_fork+0x27/0x50 -> #0 (console_owner){-...}: __lock_acquire+0x118d/0x2714 lock_acquire+0x203/0x258 console_lock_spinning_enable+0x51/0x57 console_unlock+0x25d/0x528 vprintk_emit+0x111/0x17f printk+0x59/0x73 __report_bad_irq+0xa3/0xba note_interrupt+0x19a/0x1d6 handle_irq_event_percpu+0x57/0x79 handle_irq_event+0x36/0x55 handle_fasteoi_irq+0xc2/0x18a do_IRQ+0xb3/0x157 ret_from_intr+0x0/0x1d cpuidle_enter_state+0x12f/0x1fd cpuidle_enter+0x2e/0x3d do_idle+0x1ce/0x2ce cpu_startup_entry+0x1d/0x1f start_kernel+0x406/0x46a secondary_startup_64+0xa4/0xb0 other info that might help us debug this: Chain exists of: console_owner --> &port_lock_key --> &irq_desc_lock_class Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&irq_desc_lock_class); lock(&port_lock_key); lock(&irq_desc_lock_class); lock(console_owner); *** DEADLOCK *** 2 locks held by swapper/0/0: #0: ffff88810a8e34c0 (&irq_desc_lock_class){-.-.}, at: __report_bad_irq+0x5b/0xba #1: ffffffffab65b5c0 (console_lock){+.+.}, at: console_trylock_spinning+0x20/0x181 stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.39 #55 Hardware name: XXXXXX Call Trace: <IRQ> dump_stack+0xbf/0x133 ? print_circular_bug+0xd6/0xe9 check_noncircular+0x1b9/0x1c3 __lock_acquire+0x118d/0x2714 lock_acquire+0x203/0x258 ? console_lock_spinning_enable+0x31/0x57 console_lock_spinning_enable+0x51/0x57 ? console_lock_spinning_enable+0x31/0x57 console_unlock+0x25d/0x528 ? console_trylock+0x18/0x4e vprintk_emit+0x111/0x17f ? lock_acquire+0x203/0x258 printk+0x59/0x73 __report_bad_irq+0xa3/0xba note_interrupt+0x19a/0x1d6 handle_irq_event_percpu+0x57/0x79 handle_irq_event+0x36/0x55 handle_fasteoi_irq+0xc2/0x18a do_IRQ+0xb3/0x157 common_interrupt+0xf/0xf </IRQ> Signed-off-by:
Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Fixes: 768aec0b ("serial: 8250: fix shared interrupts issues with SMP and RT kernels") Reported-by:
Guenter Roeck <linux@roeck-us.net> Reported-by:
Raul Rangel <rrangel@google.com> BugLink: https://bugs.chromium.org/p/chromium/issues/detail?id=1114800 Link: https://lore.kernel.org/lkml/CAHQZ30BnfX+gxjPm1DUd5psOTqbyDh4EJE=2=VAMW_VDafctkA@mail.gmail.com/T/#u Reviewed-by:
Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by:
Guenter Roeck <linux@roeck-us.net> Tested-by:
Guenter Roeck <linux@roeck-us.net> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200817022646.1484638-1-sergey.senozhatsky@gmail.com Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Valmer Huhn authored
stable inclusion from linux-4.19.143 commit ce755e4eae3e7f782d6a690ac6bd85ac9671b5e0 -------------------------------- commit c6b9e95d upstream. The following in 8250_exar.c line 589 is used to determine the number of ports for each Exar board: nr_ports = board->num_ports ? board->num_ports : pcidev->device & 0x0f; If the number of ports a card has is not explicitly specified, it defaults to the rightmost 4 bits of the PCI device ID. This is prone to error since not all PCI device IDs contain a number which corresponds to the number of ports that card provides. This particular case involves COMMTECH_4222PCIE, COMMTECH_4224PCIE and COMMTECH_4228PCIE cards with device IDs 0x0022, 0x0020 and 0x0021. Currently the multiport cards receive 2, 0 and 1 port instead of 2, 4 and 8 ports respectively. To fix this, each Commtech Fastcom PCIe card is given a struct where the number of ports is explicitly specified. This ensures 'board->num_ports' is used instead of the default 'pcidev->device & 0x0f'. Fixes: d0aeaa83 ("serial: exar: split out the exar code from 8250_pci") Signed-off-by:
Valmer Huhn <valmer.huhn@concurrent-rt.com> Tested-by:
Valmer Huhn <valmer.huhn@concurrent-rt.com> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200813165255.GC345440@icarus.concurrent-rt.com Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Lukas Wunner authored
stable inclusion from linux-4.19.143 commit e12f36220f6f89ecf78edc98a153d548352d6951 -------------------------------- commit 89efbe70 upstream. pl011_probe() calls pl011_setup_port() to reserve an amba_ports[] entry, then calls pl011_register_port() to register the uart driver with the tty layer. If registration of the uart driver fails, the amba_ports[] entry is not released. If this happens 14 times (value of UART_NR macro), then all amba_ports[] entries will have been leaked and driver probing is no longer possible. (To be fair, that can only happen if the DeviceTree doesn't contain alias IDs since they cause the same entry to be used for a given port.) Fix it. Fixes: ef2889f7 ("serial: pl011: Move uart_register_driver call to device") Signed-off-by:
Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org # v3.15+ Cc: Tushar Behera <tushar.behera@linaro.org> Link: https://lore.kernel.org/r/138f8c15afb2f184d8102583f8301575566064a6.1597316167.git.lukas@wunner.de Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Lukas Wunner authored
stable inclusion from linux-4.19.143 commit eec2f7d9f0352a8bfe41980632e4e67a0d5c032b -------------------------------- commit 27afac93 upstream. If probing of a pl011 gets deferred until after free_initmem(), an oops ensues because pl011_console_match() is called which has been freed. Fix by removing the __init attribute from the function and those it calls. Commit 10879ae5 ("serial: pl011: add console matching function") introduced pl011_console_match() not just for early consoles but regular preferred consoles, such as those added by acpi_parse_spcr(). Regular consoles may be registered after free_initmem() for various reasons, one being deferred probing, another being dynamic enablement of serial ports using a DeviceTree overlay. Thus, pl011_console_match() must not be declared __init and the functions it calls mustn't either. Stack trace for posterity: Unable to handle kernel paging request at virtual address 80c38b58 Internal error: Oops: 8000000d [#1] PREEMPT SMP ARM PC is at pl011_console_match+0x0/0xfc LR is at register_console+0x150/0x468 [<80187004>] (register_console) [<805a8184>] (uart_add_one_port) [<805b2b68>] (pl011_register_port) [<805b3ce4>] (pl011_probe) [<80569214>] (amba_probe) [<805ca088>] (really_probe) [<805ca2ec>] (driver_probe_device) [<805ca5b0>] (__device_attach_driver) [<805c8060>] (bus_for_each_drv) [<805c9dfc>] (__device_attach) [<805ca630>] (device_initial_probe) [<805c90a8>] (bus_probe_device) [<805c95a8>] (deferred_probe_work_func) Fixes: 10879ae5 ("serial: pl011: add console matching function") Signed-off-by:
Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org # v4.10+ Cc: Aleksey Makarov <amakarov@marvell.com> Cc: Peter Hurley <peter@hurleysoftware.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Christopher Covington <cov@codeaurora.org> Link: https://lore.kernel.org/r/f827ff09da55b8c57d316a1b008a137677b58921.1597315557.git.lukas@wunner.de Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
George Kennedy authored
stable inclusion from linux-4.19.143 commit 1221d11e5c35db18323ade3d4b2130bde89cc9df -------------------------------- commit bc5269ca upstream. vc_resize() can return with an error after failure. Change VT_RESIZEX ioctl to save struct vc_data values that are modified and restore the original values in case of error. Signed-off-by:
George Kennedy <george.kennedy@oracle.com> Cc: stable <stable@vger.kernel.org> Reported-by:
<syzbot+38a3699c7eaf165b97a6@syzkaller.appspotmail.com> Link: https://lore.kernel.org/r/1596213192-6635-2-git-send-email-george.kennedy@oracle.com Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Tetsuo Handa authored
stable inclusion from linux-4.19.143 commit c1fe757dd3d18497eaca831ed82aa20b4186affd -------------------------------- commit f8d1653d upstream. syzbot is reporting UAF bug in set_origin() from vc_do_resize() [1], for vc_do_resize() calls kfree(vc->vc_screenbuf) before calling set_origin(). Unfortunately, in set_origin(), vc->vc_sw->con_set_origin() might access vc->vc_pos when scroll is involved in order to manipulate cursor, but vc->vc_pos refers already released vc->vc_screenbuf until vc->vc_pos gets updated based on the result of vc->vc_sw->con_set_origin(). Preserving old buffer and tolerating outdated vc members until set_origin() completes would be easier than preventing vc->vc_sw->con_set_origin() from accessing outdated vc members. [1] https://syzkaller.appspot.com/bug?id=6649da2081e2ebdc65c0642c214b27fe91099db3 Reported-by:
syzbot <syzbot+9116ecc1978ca3a12f43@syzkaller.appspotmail.com> Signed-off-by:
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: stable <stable@vger.kernel.org> Link: https://lore.kernel.org/r/1596034621-4714-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Ming Lei authored
stable inclusion from linux-4.19.143 commit 77064570e4c3636da26f30ed28e4d132256424ec -------------------------------- commit d7d8535f upstream. SCHED_RESTART code path is relied to re-run queue for dispatch requests in hctx->dispatch. Meantime the SCHED_RSTART flag is checked when adding requests to hctx->dispatch. memory barriers have to be used for ordering the following two pair of OPs: 1) adding requests to hctx->dispatch and checking SCHED_RESTART in blk_mq_dispatch_rq_list() 2) clearing SCHED_RESTART and checking if there is request in hctx->dispatch in blk_mq_sched_restart(). Without the added memory barrier, either: 1) blk_mq_sched_restart() may miss requests added to hctx->dispatch meantime blk_mq_dispatch_rq_list() observes SCHED_RESTART, and not run queue in dispatch side or 2) blk_mq_dispatch_rq_list still sees SCHED_RESTART, and not run queue in dispatch side, meantime checking if there is request in hctx->dispatch from blk_mq_sched_restart() is missed. IO hang in ltp/fs_fill test is reported by kernel test robot: https://lkml.org/lkml/2020/7/26/77 Turns out it is caused by the above out-of-order OPs. And the IO hang can't be observed any more after applying this patch. Fixes: bd166ef1 ("blk-mq-sched: add framework for MQ capable IO schedulers") Reported-by:
kernel test robot <rong.a.chen@intel.com> Signed-off-by:
Ming Lei <ming.lei@redhat.com> Reviewed-by:
Christoph Hellwig <hch@lst.de> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Christoph Hellwig <hch@lst.de> Cc: David Jeffery <djeffery@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Jens Axboe <axboe@kernel.dk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-
Xianting Tian authored
stable inclusion from linux-4.19.143 commit 4aaac9c537b79ffd0602db06cd5127a455e49275 -------------------------------- [ Upstream commit 377254b2 ] If a device is hot-removed --- for example, when a physical device is unplugged from pcie slot or a nbd device's network is shutdown --- this can result in a BUG_ON() crash in submit_bh_wbc(). This is because the when the block device dies, the buffer heads will have their Buffer_Mapped flag get cleared, leading to the crash in submit_bh_wbc. We had attempted to work around this problem in commit a17712c8 ("ext4: check superblock mapped prior to committing"). Unfortunately, it's still possible to hit the BUG_ON(!buffer_mapped(bh)) if the device dies between when the work-around check in ext4_commit_super() and when submit_bh_wbh() is finally called: Code path: ext4_commit_super judge if 'buffer_mapped(sbh)' is false, return <== commit a17712c8 lock_buffer(sbh) ... unlock_buffer(sbh) __sync_dirty_buffer(sbh,... lock_buffer(sbh) judge if 'buffer_mapped(sbh))' is false, return <== added by this patch submit_bh(...,sbh) submit_bh_wbc(...,sbh,...) [100722.966497] kernel BUG at fs/buffer.c:3095! <== BUG_ON(!buffer_mapped(bh))' in submit_bh_wbc() [100722.966503] invalid opcode: 0000 [#1] SMP [100722.966566] task: ffff8817e15a9e40 task.stack: ffffc90024744000 [100722.966574] RIP: 0010:submit_bh_wbc+0x180/0x190 [100722.966575] RSP: 0018:ffffc90024747a90 EFLAGS: 00010246 [100722.966576] RAX: 0000000000620005 RBX: ffff8818a80603a8 RCX: 0000000000000000 [100722.966576] RDX: ffff8818a80603a8 RSI: 0000000000020800 RDI: 0000000000000001 [100722.966577] RBP: ffffc90024747ac0 R08: 0000000000000000 R09: ffff88207f94170d [100722.966578] R10: 00000000000437c8 R11: 0000000000000001 R12: 0000000000020800 [100722.966578] R13: 0000000000000001 R14: 000000000bf9a438 R15: ffff88195f333000 [100722.966580] FS: 00007fa2eee27700(0000) GS:ffff88203d840000(0000) knlGS:0000000000000000 [100722.966580] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [100722.966581] CR2: 0000000000f0b008 CR3: 000000201a622003 CR4: 00000000007606e0 [100722.966582] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [100722.966583] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [100722.966583] PKRU: 55555554 [100722.966583] Call Trace: [100722.966588] __sync_dirty_buffer+0x6e/0xd0 [100722.966614] ext4_commit_super+0x1d8/0x290 [ext4] [100722.966626] __ext4_std_error+0x78/0x100 [ext4] [100722.966635] ? __ext4_journal_get_write_access+0xca/0x120 [ext4] [100722.966646] ext4_reserve_inode_write+0x58/0xb0 [ext4] [100722.966655] ? ext4_dirty_inode+0x48/0x70 [ext4] [100722.966663] ext4_mark_inode_dirty+0x53/0x1e0 [ext4] [100722.966671] ? __ext4_journal_start_sb+0x6d/0xf0 [ext4] [100722.966679] ext4_dirty_inode+0x48/0x70 [ext4] [100722.966682] __mark_inode_dirty+0x17f/0x350 [100722.966686] generic_update_time+0x87/0xd0 [100722.966687] touch_atime+0xa9/0xd0 [100722.966690] generic_file_read_iter+0xa09/0xcd0 [100722.966694] ? page_cache_tree_insert+0xb0/0xb0 [100722.966704] ext4_file_read_iter+0x4a/0x100 [ext4] [100722.966707] ? __inode_security_revalidate+0x4f/0x60 [100722.966709] __vfs_read+0xec/0x160 [100722.966711] vfs_read+0x8c/0x130 [100722.966712] SyS_pread64+0x87/0xb0 [100722.966716] do_syscall_64+0x67/0x1b0 [100722.966719] entry_SYSCALL64_slow_path+0x25/0x25 To address this, add the check of 'buffer_mapped(bh)' to __sync_dirty_buffer(). This also has the benefit of fixing this for other file systems. With this addition, we can drop the workaround in ext4_commit_supper(). [ Commit description rewritten by tytso. ] Signed-off-by:
Xianting Tian <xianting_tian@126.com> Link: https://lore.kernel.org/r/1596211825-8750-1-git-send-email-xianting_tian@126.com Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Signed-off-by:
Sasha Levin <sashal@kernel.org> Signed-off-by:
Yang Yingliang <yangyingliang@huawei.com>
-