commit 28ae0a748c161ed01e2f43018beb978c74e12a0d
Author: Greg Kroah-Hartman
Date:   Wed Jun 28 11:14:25 2023 +0200

    Linux 6.3.10

    Link: https://lore.kernel.org/r/20230626180805.643662628@linuxfoundation.org
    Tested-by: Jon Hunter
    Tested-by: Ron Economos
    Tested-by: Markus Reichelt
    Tested-by: Chris Paterson (CIP)
    Tested-by: Salvatore Bonaccorso
    Tested-by: Guenter Roeck
    Tested-by: Linux Kernel Functional Testing
    Tested-by: Conor Dooley
    Signed-off-by: Greg Kroah-Hartman

commit 247bcad9a0df0855a2d549124eb4505ffb9b5d01
Author: Namjae Jeon
Date:   Thu May 25 00:13:38 2023 +0900

    ksmbd: call putname after using the last component

    commit 6fe55c2799bc29624770c26f98ba7b06214f43e0 upstream.

    The last component points into the filename struct. Currently putname()
    is called right after vfs_path_parent_lookup(), and the last component
    is then used for lookup_one_qstr_excl(). The name in the last component
    has therefore already been freed by the earlier putname() call, which
    causes a file lookup failure when running the generic/464 test of
    xfstests.

    Fixes: 74d7970febf7 ("ksmbd: fix racy issue from using ->d_parent and ->d_name")
    Signed-off-by: Namjae Jeon
    Signed-off-by: Steve French
    Signed-off-by: Greg Kroah-Hartman

commit bf8355e3d347db88b78fb443f9284df369a4d905
Author: Namjae Jeon
Date:   Sun May 14 10:02:27 2023 +0900

    ksmbd: fix uninitialized pointer read in smb2_create_link()

    commit df14afeed2e6c1bbadef7d2f9c46887bbd6d8d94 upstream.

    There is a case where file_present is true but path is uninitialized.
    This patch sets file_present to false by default and sets it to true
    only once path is initialized.

    Fixes: 74d7970febf7 ("ksmbd: fix racy issue from using ->d_parent and ->d_name")
    Reported-by: Coverity Scan
    Signed-off-by: Namjae Jeon
    Signed-off-by: Steve French
    Signed-off-by: Greg Kroah-Hartman

commit c526418bc005193fd96d94fba12b0f892f0318f3
Author: Namjae Jeon
Date:   Fri May 12 17:05:41 2023 +0900

    ksmbd: fix uninitialized pointer read in ksmbd_vfs_rename()

    commit 48b47f0caaa8a9f05ed803cb4f335fa3a7bfc622 upstream.

    Uninitialized rd.delegated_inode can be used in vfs_rename().
    Fix this by setting rd.delegated_inode to NULL to avoid the
    uninitialized read.

    Fixes: 74d7970febf7 ("ksmbd: fix racy issue from using ->d_parent and ->d_name")
    Reported-by: Coverity Scan
    Signed-off-by: Namjae Jeon
    Signed-off-by: Steve French
    Signed-off-by: Greg Kroah-Hartman

commit 20cb9d47f0bc13fe9b9ba4e711305c1978bb1c2b
Author: Marc Zyngier
Date:   Wed Jun 7 15:38:44 2023 +0100

    KVM: arm64: Restore GICv2-on-GICv3 functionality

    commit 1caa71a7a600f7781ce05ef1e84701c459653663 upstream.

    When reworking the vgic locking, the vgic distributor registration
    got simplified, which was a very good cleanup. But just a tad too
    radical, as we now register the *native* vgic only, ignoring the
    GICv2-on-GICv3 that allows pre-historic VMs (or so I thought)
    to run.

    As it turns out, QEMU still defaults to GICv2 in some cases, and
    this breaks Nathan's setup!

    Fix it by propagating the *requested* vgic type rather than the
    host's version.

    Fixes: 59112e9c390b ("KVM: arm64: vgic: Fix a circular locking issue")
    Reported-by: Nathan Chancellor
    Tested-by: Nathan Chancellor
    Signed-off-by: Marc Zyngier
    link: https://lore.kernel.org/r/20230606221525.GA2269598@dev-arch.thelio-3990X
    Signed-off-by: Greg Kroah-Hartman

commit 2a4d5af41e7ed043a93caf6f97d2d4c566a70531
Author: Pablo Neira Ayuso
Date:   Wed Jun 14 23:20:18 2023 +0200

    netfilter: nf_tables: drop module reference after updating chain

    commit 043d2acf57227db1fdaaa620b2a420acfaa56d6e upstream.

    Otherwise the module reference counter is leaked.

    Fixes: b9703ed44ffb ("netfilter: nf_tables: support for adding new devices to an existing netdev chain")
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Greg Kroah-Hartman

commit 4108066fd2dd4f4a4f561a4218d5797853516e21
Author: Clark Wang
Date:   Mon May 29 16:02:51 2023 +0800

    i2c: imx-lpi2c: fix type char overflow issue when calculating the clock cycle

    [ Upstream commit e69b9bc170c6d93ee375a5cbfd15f74c0fb59bdd ]

    Claim clkhi and clklo as integer type to avoid possible calculation
    errors caused by data overflow.

    Fixes: a55fa9d0e42e ("i2c: imx-lpi2c: add low power i2c bus driver")
    Signed-off-by: Clark Wang
    Signed-off-by: Carlos Song
    Reviewed-by: Andi Shyti
    Signed-off-by: Wolfram Sang
    Signed-off-by: Sasha Levin

commit 1fadaf23facc9ec446164ba3585da0640fff1d61
Author: Dheeraj Kumar Srivastava
Date:   Sat Jun 17 02:52:36 2023 +0530

    x86/apic: Fix kernel panic when booting with intremap=off and x2apic_phys

    [ Upstream commit 85d38d5810e285d5aec7fb5283107d1da70c12a9 ]

    When booting with "intremap=off" and "x2apic_phys" on the kernel command
    line, the physical x2APIC driver ends up being used even when x2APIC
    mode is disabled ("intremap=off" disables x2APIC mode). This happens
    because the first compound condition check in x2apic_phys_probe() is
    false due to x2apic_mode == 0 and so the following one returns true
    after default_acpi_madt_oem_check() having already selected the physical
    x2APIC driver.

    This results in the following panic:

       kernel BUG at arch/x86/kernel/apic/io_apic.c:2409!
       invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
       CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.4.0-rc2-ver4.1rc2 #2
       Hardware name: Dell Inc. PowerEdge R6515/07PXPY, BIOS 2.3.6 07/06/2021
       RIP: 0010:setup_IO_APIC+0x9c/0xaf0
       Call Trace:
        ? native_read_msr
        apic_intr_mode_init
        x86_late_time_init
        start_kernel
        x86_64_start_reservations
        x86_64_start_kernel
        secondary_startup_64_no_verify

    which is:

    setup_IO_APIC:
      apic_printk(APIC_VERBOSE, "ENABLING IO-APIC IRQs\n");
      for_each_ioapic(ioapic)
          BUG_ON(mp_irqdomain_create(ioapic));

    Return 0 to denote that x2APIC has not been enabled when probing the
    physical x2APIC driver.

      [ bp: Massage commit message heavily. ]

    Fixes: 9ebd680bd029 ("x86, apic: Use probe routines to simplify apic selection")
    Signed-off-by: Dheeraj Kumar Srivastava
    Signed-off-by: Borislav Petkov (AMD)
    Reviewed-by: Kishon Vijay Abraham I
    Reviewed-by: Vasant Hegde
    Reviewed-by: Cyrill Gorcunov
    Reviewed-by: Thomas Gleixner
    Link: https://lore.kernel.org/r/20230616212236.1389-1-dheerajkumar.srivastava@amd.com
    Signed-off-by: Sasha Levin

commit 28b3a5f2ad2b92326f74ab6d40906c22d5178604
Author: Omar Sandoval
Date:   Tue Jun 13 14:14:56 2023 -0700

    x86/unwind/orc: Add ELF section with ORC version identifier

    [ Upstream commit b9f174c811e3ae4ae8959dc57e6adb9990e913f4 ]

    Commits ffb1b4a41016 ("x86/unwind/orc: Add 'signal' field to ORC
    metadata") and fb799447ae29 ("x86,objtool: Split UNWIND_HINT_EMPTY in
    two") changed the ORC format. Although ORC is internal to the kernel,
    it's the only way for external tools to get reliable kernel stack traces
    on x86-64. In particular, the drgn debugger [1] uses ORC for stack
    unwinding, and these format changes broke it [2]. As the drgn
    maintainer, I don't care how often or how much the kernel changes the
    ORC format as long as I have a way to detect the change.

    It suffices to store a version identifier in the vmlinux and kernel
    module ELF files (to use when parsing ORC sections from ELF), and in
    kernel memory (to use when parsing ORC from a core dump+symbol table).
    Rather than hard-coding a version number that needs to be manually
    bumped, Peterz suggested hashing the definitions from orc_types.h.
    If there is a format change that isn't caught by this, the hashing
    script can be updated.

    This patch adds an .orc_header allocated ELF section containing the
    20-byte hash to vmlinux and kernel modules, along with the corresponding
    __start_orc_header and __stop_orc_header symbols in vmlinux.

    1: https://github.com/osandov/drgn
    2: https://github.com/osandov/drgn/issues/303

    Fixes: ffb1b4a41016 ("x86/unwind/orc: Add 'signal' field to ORC metadata")
    Fixes: fb799447ae29 ("x86,objtool: Split UNWIND_HINT_EMPTY in two")
    Signed-off-by: Omar Sandoval
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Josh Poimboeuf
    Link: https://lkml.kernel.org/r/aef9c8dc43915b886a8c48509a12ec1b006ca1ca.1686690801.git.osandov@osandov.com
    Signed-off-by: Sasha Levin

commit 5b43a641a2308a07eebbfaded515761aa9b79c7d
Author: Andrey Smetanin
Date:   Mon Apr 24 23:44:11 2023 +0300

    vhost_net: revert upend_idx only on retriable error

    [ Upstream commit 1f5d2e3bab16369d5d4b4020a25db4ab1f4f082c ]

    Fix possible virtqueue used buffers leak and corresponding stuck
    in case of temporary -EIO from sendmsg() which is produced by
    tun driver while backend device is not up.

    In case of no-retriable error and zcopy do not revert upend_idx
    to pass packet data (that is update used_idx in corresponding
    vhost_zerocopy_signal_used()) as if packet data has been
    transferred successfully.

    v2: set vq->heads[ubuf->desc].len equal to VHOST_DMA_DONE_LEN
    in case of fake successful transmit.

    Signed-off-by: Andrey Smetanin
    Message-Id: <20230424204411.24888-1-asmetanin@yandex-team.ru>
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Andrey Smetanin
    Acked-by: Jason Wang
    Signed-off-by: Sasha Levin

commit 0aa3eb551c7155d79bdde4cce36f0a319d5d777b
Author: Shannon Nelson
Date:   Mon Apr 24 15:50:29 2023 -0700

    vhost_vdpa: tell vqs about the negotiated

    [ Upstream commit 376daf317753ccb6b1ecbdece66018f7f6313a7f ]

    As is done in the net, iscsi, and vsock vhost support, let the vdpa vqs
    know about the features that have been negotiated.
    This allows vhost to more safely make decisions based on the features,
    such as when using PACKED vs split queues.

    Signed-off-by: Shannon Nelson
    Acked-by: Jason Wang
    Message-Id: <20230424225031.18947-2-shannon.nelson@amd.com>
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Sasha Levin

commit 8ec977a9d3bb5e960646c8b689dbbcf027d4b040
Author: Rong Tao
Date:   Wed May 24 20:31:24 2023 +0800

    tools/virtio: Fix arm64 ringtest compilation error

    [ Upstream commit 57380fd1249b20ef772549af2c58ef57b21faba7 ]

    Add cpu_relax() for arm64 instead of directly assert(), and add assert.h
    header file. Also, add smp_wmb and smp_mb for arm64.

    Compilation error as follows, avoid __always_inline undefined.

        $ make
        cc -Wall -pthread -O2 -ggdb -flto -fwhole-program -c -o ring.o ring.c
        In file included from ring.c:10:
        main.h: In function ‘busy_wait’:
        main.h:99:21: warning: implicit declaration of function ‘assert’
        [-Wimplicit-function-declaration]
        99 | #define cpu_relax() assert(0)
           |                     ^~~~~~
        main.h:107:17: note: in expansion of macro ‘cpu_relax’
        107 |                 cpu_relax();
            |                 ^~~~~~~~~
        main.h:12:1: note: ‘assert’ is defined in header ‘<assert.h>’; did you forget
        to ‘#include <assert.h>’?
        11 | #include
        +++ |+#include <assert.h>
        12 |
        main.h: At top level:
        main.h:143:23: error: expected ‘;’ before ‘void’
        143 | static __always_inline
            |                       ^
            |                       ;
        144 | void __read_once_size(const volatile void *p, void *res, int size)
            | ~~~~
        main.h:158:23: error: expected ‘;’ before ‘void’
        158 | static __always_inline void __write_once_size(volatile void *p, void *res, int size)
            |                       ^~~~~
            |                       ;
        make: *** [: ring.o] Error 1

    Signed-off-by: Rong Tao
    Message-Id:
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Sasha Levin

commit 7701bef7888b99abe4acb0b6ed4886146e561caf
Author: Min Li
Date:   Sat Jun 3 15:43:45 2023 +0800

    drm/radeon: fix race condition UAF in radeon_gem_set_domain_ioctl

    [ Upstream commit 982b173a6c6d9472730c3116051977e05d17c8c5 ]

    Userspace can race to free the gobj (which robj is converted from), so
    robj should not be accessed again after drm_gem_object_put(); otherwise
    it will result in a use-after-free.

    Reviewed-by: Christian König
    Signed-off-by: Min Li
    Signed-off-by: Alex Deucher
    Signed-off-by: Sasha Levin

commit c088e79f5d6ce13f9f2a5ff21a7d445c4931b6b1
Author: Min Li
Date:   Fri May 26 21:01:31 2023 +0800

    drm/exynos: fix race condition UAF in exynos_g2d_exec_ioctl

    [ Upstream commit 48bfd02569f5db49cc033f259e66d57aa6efc9a3 ]

    If it is async, runqueue_node is freed in g2d_runqueue_worker on another
    worker thread. So in extreme cases, if g2d_runqueue_worker runs first, and
    then executes the following if statement, there will be use-after-free.

    Signed-off-by: Min Li
    Reviewed-by: Andi Shyti
    Signed-off-by: Inki Dae
    Signed-off-by: Sasha Levin

commit f5a19130d4504f43a963edc5a3ad15b3c6bfd254
Author: Inki Dae
Date:   Fri May 19 08:55:05 2023 +0900

    drm/exynos: vidi: fix a wrong error return

    [ Upstream commit 4a059559809fd1ddbf16f847c4d2237309c08edf ]

    Fix a wrong error return by dropping an error return.

    When the vidi driver is removed, if ctx->raw_edid isn't the same as
    fake_edid_info then all we have to do is free ctx->raw_edid so that
    driver removal can work correctly - it's not an error case.

    Signed-off-by: Inki Dae
    Reviewed-by: Andi Shyti
    Signed-off-by: Sasha Levin

commit 7ccc017fea6913884fc382c232fcc36c4a61be2c
Author: Nitesh Shetty
Date:   Mon Jun 5 11:53:53 2023 +0530

    null_blk: Fix: memory release when memory_backed=1

    [ Upstream commit 8cfb98196cceec35416041c6b91212d2b99392e4 ]

    Memory/pages are not freed, when unloading nullblk driver.

    Steps to reproduce issue
      1.free -h
                    total        used        free      shared  buff/cache   available
      Mem:          7.8Gi       260Mi       7.1Gi       3.0Mi       395Mi       7.3Gi
      Swap:            0B          0B          0B
      2.modprobe null_blk memory_backed=1
      3.dd if=/dev/urandom of=/dev/nullb0 oflag=direct bs=1M count=1000
      4.modprobe -r null_blk
      5.free -h
                    total        used        free      shared  buff/cache   available
      Mem:          7.8Gi       1.2Gi       6.1Gi       3.0Mi       398Mi       6.3Gi
      Swap:            0B          0B          0B

    Signed-off-by: Anuj Gupta
    Signed-off-by: Nitesh Shetty
    Link: https://lore.kernel.org/r/20230605062354.24785-1-nj.shetty@samsung.com
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin

commit 85dc6adb63151a6f32aad9a8605d2b494e73f87c
Author: Linus Walleij
Date:   Wed May 10 12:51:56 2023 +0200

    ARM: dts: Fix erroneous ADS touchscreen polarities

    [ Upstream commit 4a672d500bfd6bb87092c33d5a2572c3d0a1cf83 ]

    Several device tree files get the polarity of the pendown-gpios
    wrong: this signal is active low. Fix up all incorrect flags, so
    that operating systems can rely on the flag being correctly set.

    Signed-off-by: Linus Walleij
    Link: https://lore.kernel.org/r/20230510105156.1134320-1-linus.walleij@linaro.org
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Sasha Levin

commit 6b758ca6fb86367e89f62cd956bcf357c4875773
Author: David Zheng
Date:   Wed May 24 11:14:59 2023 -0700

    i2c: designware: fix idx_write_cnt in read loop

    [ Upstream commit 1acfc6e753ed978b36d722f54e57fe4d1e8a6ffa ]

    With IC_INTR_RX_FULL slave interrupt handler reads data in a loop until
    RX FIFO is empty. When testing with the slave-eeprom, each transaction
    has 2 bytes for address/index and 1 byte for value, the address byte
    can be written as data byte due to dropping STOP condition.

    In the test below, the master continuously writes to the slave, first 2
    bytes are index, 3rd byte is value and followed by a STOP condition.

     i2c_write: i2c-3 #0 a=04b f=0000 l=3 [00-D1-D1]
     i2c_write: i2c-3 #0 a=04b f=0000 l=3 [00-D2-D2]
     i2c_write: i2c-3 #0 a=04b f=0000 l=3 [00-D3-D3]

    Upon receiving STOP condition slave eeprom would reset `idx_write_cnt` so
    next 2 bytes can be treated as buffer index for upcoming transaction.
    Supposedly the slave eeprom buffer would be written as

     EEPROM[0x00D1] = 0xD1
     EEPROM[0x00D2] = 0xD2
     EEPROM[0x00D3] = 0xD3

    When CPU load is high the slave irq handler may not read fast enough,
    the interrupt status can be seen as 0x204 with both DW_IC_INTR_STOP_DET
    (0x200) and DW_IC_INTR_RX_FULL (0x4) bits. The slave device may see
    the transactions below.

     0x1 STATUS SLAVE_ACTIVITY=0x1 : RAW_INTR_STAT=0x1594 : INTR_STAT=0x4
     0x1 STATUS SLAVE_ACTIVITY=0x1 : RAW_INTR_STAT=0x1594 : INTR_STAT=0x4
     0x1 STATUS SLAVE_ACTIVITY=0x1 : RAW_INTR_STAT=0x1594 : INTR_STAT=0x4
     0x1 STATUS SLAVE_ACTIVITY=0x1 : RAW_INTR_STAT=0x1794 : INTR_STAT=0x204
     0x1 STATUS SLAVE_ACTIVITY=0x0 : RAW_INTR_STAT=0x1790 : INTR_STAT=0x200
     0x1 STATUS SLAVE_ACTIVITY=0x1 : RAW_INTR_STAT=0x1594 : INTR_STAT=0x4
     0x1 STATUS SLAVE_ACTIVITY=0x1 : RAW_INTR_STAT=0x1594 : INTR_STAT=0x4
     0x1 STATUS SLAVE_ACTIVITY=0x1 : RAW_INTR_STAT=0x1594 : INTR_STAT=0x4

    After `D1` is received, read loop continues to read `00` which is the
    first byte of next index. Since STOP condition is ignored by the loop,
    eeprom buffer index increased to `D2` and `00` is written as value.

    So the slave eeprom buffer becomes

     EEPROM[0x00D1] = 0xD1
     EEPROM[0x00D2] = 0x00
     EEPROM[0x00D3] = 0xD3

    The fix is to use `FIRST_DATA_BYTE` (bit 11) in `IC_DATA_CMD` to split
    the transactions. The first index byte in this case would have bit 11
    set. Check this indication to inject I2C_SLAVE_WRITE_REQUESTED event
    which will reset `idx_write_cnt` in slave eeprom.

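The transaction split described above can be sketched in plain userspace C. This is a toy model for illustration, not the driver code: `FIRST_DATA_BYTE_BIT`, `struct eeprom_model`, `eeprom_rx_word()` and `eeprom_rx_burst()` are hypothetical names, and the only details taken from the commit message are the bit position (bit 11 of `IC_DATA_CMD`) and the 2-byte index followed by data bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; the commit message only says bit 11 of IC_DATA_CMD. */
#define FIRST_DATA_BYTE_BIT (1u << 11)

/* Toy model of a slave EEPROM with a 2-byte index (assumption for illustration). */
struct eeprom_model {
	uint8_t buf[0x10000];
	uint16_t idx;            /* current buffer index */
	unsigned idx_write_cnt;  /* index bytes received in this transaction */
};

/* Handle one word popped from the RX FIFO inside the read loop. */
void eeprom_rx_word(struct eeprom_model *e, uint16_t data_cmd)
{
	uint8_t byte = data_cmd & 0xff;

	/* First byte of a new transaction: start a fresh index. */
	if (data_cmd & FIRST_DATA_BYTE_BIT) {
		e->idx_write_cnt = 0;
		e->idx = 0;
	}

	if (e->idx_write_cnt < 2) {
		/* Still collecting the 2-byte index, high byte first. */
		e->idx = (uint16_t)((e->idx << 8) | byte);
		e->idx_write_cnt++;
	} else {
		/* Index complete: store the data byte and advance. */
		e->buf[e->idx++] = byte;
	}
}

/* Drain a burst of FIFO words without ever seeing a STOP in between. */
void eeprom_rx_burst(struct eeprom_model *e, const uint16_t *words, size_t n)
{
	for (size_t i = 0; i < n; i++)
		eeprom_rx_word(e, words[i]);
}
```

Feeding the three writes from the trace as one uninterrupted burst, with bit 11 set on each leading `00`, leaves `0xD1`, `0xD2` and `0xD3` at indexes `0x00D1` to `0x00D3`. Without the flag check, the second `00` would be stored as data at `0x00D2`, which is the corruption shown above.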
    Signed-off-by: David Zheng
    Acked-by: Jarkko Nikula
    Signed-off-by: Wolfram Sang
    Signed-off-by: Sasha Levin

commit 41241429449b4b8bbc0a0f83f2e7bfa2389610df
Author: Simon Horman
Date:   Wed May 10 14:32:17 2023 +0200

    i2c: mchp-pci1xxxx: Avoid cast to incompatible function type

    [ Upstream commit 7ebfd881abe9e0ea9557b29dab6aa28d294fabb4 ]

    Rather than casting pci1xxxx_i2c_shutdown to an incompatible function
    type, update the type to match that expected by __devm_add_action.

    Reported by clang-16 with W=1:

     .../i2c-mchp-pci1xxxx.c:1159:29: error: cast from 'void (*)(struct pci1xxxx_i2c *)' to 'void (*)(void *)' converts to incompatible function type [-Werror,-Wcast-function-type-strict]
             ret = devm_add_action(dev, (void (*)(void *))pci1xxxx_i2c_shutdown, i2c);
                                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     ./include/linux/device.h:251:29: note: expanded from macro 'devm_add_action'
             __devm_add_action(release, action, data, #action)
                                        ^~~~~~

    No functional change intended.
    Compile tested only.

    Signed-off-by: Simon Horman
    Reviewed-by: Horatiu Vultur
    Reviewed-by: Andi Shyti
    Reviewed-by: Tharun Kumar P
    Signed-off-by: Wolfram Sang
    Signed-off-by: Sasha Levin

commit 367a2795f2290ac73ae955182453f6dbc7e3b16f
Author: Sayed, Karimuddin
Date:   Fri Jun 2 14:38:12 2023 -0500

    ALSA: hda/realtek: Add "Intel Reference board" and "NUC 13" SSID in the ALC256

    [ Upstream commit 1a93f10c5b12bd766a537b24a50fca5373467303 ]

    Add "Intel Reference board" and "Intel NUC 13" SSID in the alc256.
    Enable jack headset volume buttons

    Reviewed-by: Kai Vehmanen
    Signed-off-by: Sayed, Karimuddin
    Signed-off-by: Pierre-Louis Bossart
    Link: https://lore.kernel.org/r/20230602193812.66768-1-pierre-louis.bossart@linux.intel.com
    Signed-off-by: Takashi Iwai
    Signed-off-by: Sasha Levin

commit 52edaa6fb87a73fb782f2294d568b89f78b88963
Author: Min-Hua Chen
Date:   Sat Jun 3 07:52:09 2023 +0800

    net: sched: wrap tc_skip_wrapper with CONFIG_RETPOLINE

    [ Upstream commit 8cde87b007dad2e461015ff70352af56ceb02c75 ]

    This patch fixes the following sparse warning:

    net/sched/sch_api.c:2305:1: sparse: warning: symbol 'tc_skip_wrapper' was not declared. Should it be static?

    No functional change intended.

    Signed-off-by: Min-Hua Chen
    Acked-by: Pedro Tammela
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

commit 0517065daf74eb20808a9fdedfa04424db006354
Author: Chancel Liu
Date:   Tue May 30 18:30:12 2023 +0800

    ASoC: fsl_sai: Enable BCI bit if SAI works on synchronous mode with BYP asserted

    [ Upstream commit 32cf0046a652116d6a216d575f3049a9ff9dd80d ]

    There's an issue on SAI synchronous mode that TX/RX side can't get BCLK
    from RX/TX it sync with if BYP bit is asserted. It's a workaround to
    fix it that enable SION of IOMUX pad control and assert BCI.

    For example if TX sync with RX which means both TX and RX are using clk
    from RX and BYP=1. TX can get BCLK only if the following two conditions
    are valid:
    1. SION of RX BCLK IOMUX pad is set to 1
    2. BCI of TX is set to 1

    Signed-off-by: Chancel Liu
    Acked-by: Shengjiu Wang
    Link: https://lore.kernel.org/r/20230530103012.3448838-1-chancel.liu@nxp.com
    Signed-off-by: Mark Brown
    Signed-off-by: Sasha Levin

commit eb43e2acd1af675c7d3f6ad4ada07f83b9a144cf
Author: Alexander Gordeev
Date:   Thu May 25 19:41:45 2023 +0200

    s390/purgatory: disable branch profiling

    [ Upstream commit 03c5c83b70dca3729a3eb488e668e5044bd9a5ea ]

    Avoid linker error for randomly generated config file that
    has CONFIG_BRANCH_PROFILE_NONE enabled and make it similar
    to riscv, x86 and also to commit 4bf3ec384edf ("s390: disable
    branch profiling for vdso").

    Reviewed-by: Vasily Gorbik
    Signed-off-by: Alexander Gordeev
    Signed-off-by: Sasha Levin

commit 9a8dc72541376eaf5a3ff50fd0c42144a67b87e3
Author: Andreas Gruenbacher
Date:   Wed May 31 21:08:26 2023 +0200

    gfs2: Don't get stuck writing page onto itself under direct I/O

    [ Upstream commit fa58cc888d67e640e354d8b3ceef877ea167b0cf ]

    When a direct I/O write is performed, iomap_dio_rw() invalidates the
    part of the page cache which the write is going to before carrying out
    the write. In the odd case, the direct I/O write will be reading from
    the same page it is writing to. gfs2 carries out writes with page
    faults disabled, so it should have been obvious that this page
    invalidation can cause iomap_dio_rw() to never make any progress.
    Currently, gfs2 will end up in an endless retry loop in
    gfs2_file_direct_write() instead, though.

    Break this endless loop by limiting the number of retries and falling
    back to buffered I/O after that.

    Also simplify should_fault_in_pages() slightly and add a comment to
    make the above case easier to understand.

    Reported-by: Jan Kara
    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Sasha Levin

commit cf2fe455d46c425f2be736f4793f311cd3d5c55e
Author: Sicong Jiang
Date:   Wed May 31 21:06:35 2023 +1200

    ASoC: amd: yc: Add Thinkpad Neo14 to quirks list for acp6x

    [ Upstream commit 57d1e8900495cf1751cec74db16fe1a0fe47efbb ]

    Thinkpad Neo14 Ryzen Edition uses Ryzen 6800H processor, and adding to
    quirks list for acp6x will enable internal mic.

    Signed-off-by: Sicong Jiang
    Link: https://lore.kernel.org/r/20230531090635.89565-1-kevin.jiangsc@gmail.com
    Signed-off-by: Mark Brown
    Signed-off-by: Sasha Levin

commit 4942d43fa894bc9169ad580ff415885a269245ef
Author: Edson Juliano Drosdeck
Date:   Mon May 29 15:19:11 2023 -0300

    ASoC: nau8824: Add quirk to active-high jack-detect

    [ Upstream commit e384dba03e3294ce7ea69e4da558e9bf8f0e8946 ]

    Add entries for Positivo laptops: CW14Q01P, K1424G, N14ZP74G to the
    DMI table, so that active-high jack-detect will work properly on
    these laptops.

    Signed-off-by: Edson Juliano Drosdeck
    Link: https://lore.kernel.org/r/20230529181911.632851-1-edson.drosdeck@gmail.com
    Signed-off-by: Mark Brown
    Signed-off-by: Sasha Levin

commit 07f150cb3bbd5d78c22af7b84204ffd051c9621e
Author: Hao Yao
Date:   Wed May 24 11:51:33 2023 +0800

    platform/x86: int3472: Avoid crash in unregistering regulator gpio

    [ Upstream commit fb109fba728407fa4a84d659b5cb87cd8399d7b3 ]

    When int3472 is loaded before GPIO driver, acpi_get_and_request_gpiod()
    failed but the returned gpio descriptor is not NULL, it will cause panic
    in later gpiod_put(), so set the gpio_desc to NULL in register error
    handling to avoid such crash.

    Signed-off-by: Hao Yao
    Signed-off-by: Bingbu Cao
    Link: https://lore.kernel.org/r/20230524035135.90315-1-bingbu.cao@intel.com
    Reviewed-by: Hans de Goede
    Signed-off-by: Hans de Goede
    Signed-off-by: Sasha Levin

commit 6b448cbc3d9e751ffe92c8989392df62eab16e27
Author: Krzysztof Kozlowski
Date:   Wed May 17 18:37:36 2023 +0200

    soundwire: qcom: add proper error paths in qcom_swrm_startup()

    [ Upstream commit 99e09b9c0ab43346c52f2787ca4e5c4b1798362e ]

    Reverse actions in qcom_swrm_startup() error paths to avoid leaking
    stream memory and keeping runtime PM unbalanced.

    Signed-off-by: Krzysztof Kozlowski
    Reviewed-by: Pierre-Louis Bossart
    Link: https://lore.kernel.org/r/20230517163736.997553-1-krzysztof.kozlowski@linaro.org
    Signed-off-by: Vinod Koul
    Signed-off-by: Sasha Levin

commit a04961316a47b006656f09f800c92cb1c73e7d35
Author: Pierre-Louis Bossart
Date:   Mon May 15 15:48:59 2023 +0800

    soundwire: dmi-quirks: add new mapping for HP Spectre x360

    [ Upstream commit 700581ede41d029403feec935df4616309696fd7 ]

    A BIOS/DMI update seems to have broken some devices, let's add a new
    mapping.

    Link: https://github.com/thesofproject/linux/issues/4323
    Signed-off-by: Pierre-Louis Bossart
    Reviewed-by: Rander Wang
    Signed-off-by: Bard Liao
    Link: https://lore.kernel.org/r/20230515074859.3097-1-yung-chuan.liao@linux.intel.com
    Signed-off-by: Vinod Koul
    Signed-off-by: Sasha Levin

commit d529d13ab3565dc3c68c2c89328b3bfca88e2c87
Author: Herve Codina
Date:   Tue May 23 17:12:22 2023 +0200

    ASoC: simple-card: Add missing of_node_put() in case of error

    [ Upstream commit 8938f75a5e35c597a647c28984a0304da7a33d63 ]

    In the error path, a of_node_put() for platform is missing.
    Just add it.

    Signed-off-by: Herve Codina
    Acked-by: Kuninori Morimoto
    Link: https://lore.kernel.org/r/20230523151223.109551-9-herve.codina@bootlin.com
    Signed-off-by: Mark Brown
    Signed-off-by: Sasha Levin

commit 5811c5519f736406fcb3fc06a683a7560da5e1b1
Author: Srinivas Kandagatla
Date:   Tue May 23 17:54:14 2023 +0100

    ASoC: codecs: wcd938x-sdw: do not set can_multi_write flag

    [ Upstream commit 2d7c2f9272de6347a9cec0fc07708913692c0ae3 ]

    regmap-sdw does not support multi register writes, so there is
    no point in setting this flag. This also leads to incorrect
    programming of WSA codecs with regmap_multi_reg_write() call.

    This invalid configuration should have been rejected by regmap-sdw.

    Signed-off-by: Srinivas Kandagatla
    Link: https://lore.kernel.org/r/20230523165414.14560-1-srinivas.kandagatla@linaro.org
    Signed-off-by: Mark Brown
    Signed-off-by: Sasha Levin

commit 1d5c83c6388a743dcbfc83b658642962c9bf1a25
Author: Clark Wang
Date:   Fri May 5 14:35:57 2023 +0800

    spi: lpspi: disable lpspi module irq in DMA mode

    [ Upstream commit 9728fb3ce11729aa8c276825ddf504edeb00611d ]

    When all bits of IER are set to 0, we still can observe the lpspi irq
    events when using DMA mode to transfer data.

    So disable irq to avoid too many irq events.

    Signed-off-by: Clark Wang
    Link: https://lore.kernel.org/r/20230505063557.3962220-1-xiaoning.wang@nxp.com
    Signed-off-by: Mark Brown
    Signed-off-by: Sasha Levin

commit e3d7dbf093307f5624d78192b508080fbe454887
Author: Vineeth Vijayan
Date:   Thu May 4 20:53:20 2023 +0200

    s390/cio: unregister device when the only path is gone

    [ Upstream commit 89c0c62e947a01e7a36b54582fd9c9e346170255 ]

    Currently, if the device is offline and all the channel paths are
    either configured or varied offline, the associated subchannel gets
    unregistered. Don't unregister the subchannel, instead unregister
    offline device.

    Signed-off-by: Vineeth Vijayan
    Reviewed-by: Peter Oberparleiter
    Signed-off-by: Alexander Gordeev
    Signed-off-by: Sasha Levin

commit 89760f3aaffa1f0346420b05b6fc9263916364e2
Author: Krzysztof Kozlowski
Date:   Mon Feb 20 10:54:01 2023 +0100

    arm64: dts: qcom: sc7280-qcard: drop incorrect dai-cells from WCD938x SDW

    [ Upstream commit 16bd455d0897d1b8b7a9aee2ed51d75b14a34563 ]

    The WCD938x audio codec Soundwire interface part is not a DAI and does
    not allow sound-dai-cells:

      sc7280-herobrine-crd.dtb: codec@0,4: '#sound-dai-cells' does not match any of the regexes: 'pinctrl-[0-9]+'

    Signed-off-by: Krzysztof Kozlowski
    Reviewed-by: Douglas Anderson
    Reviewed-by: Konrad Dybcio
    Signed-off-by: Bjorn Andersson
    Link: https://lore.kernel.org/r/20230220095401.64196-2-krzysztof.kozlowski@linaro.org
    Signed-off-by: Sasha Levin

commit 04dc6bc4a22a7e9795d17e38c59b594f776e7194
Author: Krzysztof Kozlowski
Date:   Mon Feb 20 10:54:00 2023 +0100

    arm64: dts: qcom: sc7280-idp: drop incorrect dai-cells from WCD938x SDW

    [ Upstream commit ca8fc6814844d8787e7fec61b2544a871ea8b675 ]

    The WCD938x audio codec Soundwire interface part is not a DAI and does
    not allow sound-dai-cells:

      sc7280-idp.dtb: codec@0,4: '#sound-dai-cells' does not match any of the regexes: 'pinctrl-[0-9]+'

    Signed-off-by: Krzysztof Kozlowski
    Reviewed-by: Douglas Anderson
    Reviewed-by: Konrad Dybcio
    Signed-off-by: Bjorn Andersson
    Link: https://lore.kernel.org/r/20230220095401.64196-1-krzysztof.kozlowski@linaro.org
    Signed-off-by: Sasha Levin

commit 84a495d481501522069585433fa896f00ef6482f
Author: Hans de Goede
Date:   Thu May 11 11:57:04 2023 -0700

    Input: soc_button_array - add invalid acpi_index DMI quirk handling

    [ Upstream commit 20a99a291d564a559cc2fd013b4824a3bb3f1db7 ]

    Some devices have a wrong entry in their button array which points to
    a GPIO which is required in another driver, so soc_button_array must
    not claim it.
    A specific example of this is the Lenovo Yoga Book X90F / X90L,
    where the PNP0C40 home button entry points to a GPIO which is not
    a home button and which is required by the lenovo-yogabook driver.

    Add a DMI quirk table which can specify an ACPI GPIO resource index which
    should be skipped; and add an entry for the Lenovo Yoga Book X90F / X90L
    to this new DMI quirk table.

    Signed-off-by: Hans de Goede
    Link: https://lore.kernel.org/r/20230414072116.4497-1-hdegoede@redhat.com
    Signed-off-by: Dmitry Torokhov
    Signed-off-by: Sasha Levin

commit 99790dc4872e5fde37f37596b8fbddc0156b8701
Author: Uday Shankar
Date:   Thu May 25 12:22:04 2023 -0600

    nvme: improve handling of long keep alives

    [ Upstream commit c7275ce6a5fd32ca9f5a6294ed89cf0523181af9 ]

    Upon keep alive completion, nvme_keep_alive_work is scheduled with the
    same delay every time. If keep alive commands are completing slowly,
    this may cause a keep alive timeout. The following trace illustrates the
    issue, taking KATO = 8 and TBKAS off for simplicity:

    1. t = 0: run nvme_keep_alive_work, send keep alive
    2. t = ε: keep alive reaches controller, controller restarts its keep
              alive timer
    3. t = 4: host receives keep alive completion, schedules
              nvme_keep_alive_work with delay 4
    4. t = 8: run nvme_keep_alive_work, send keep alive

    Here, a keep alive having RTT of 4 causes a delay of at least 8 - ε
    between the controller receiving successive keep alives. With ε small,
    the controller is likely to detect a keep alive timeout.

    Fix this by calculating the RTT of the keep alive command, and adjusting
    the scheduling delay of the next keep alive work accordingly.

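The delay adjustment described above can be sketched as follows. This is a minimal userspace illustration under stated assumptions, not the kernel implementation: `ka_next_delay()` is a hypothetical helper, and both arguments are assumed to be in the same time unit.

```c
#include <stdint.h>

/*
 * Hypothetical helper: given the normal keep-alive work period and the
 * measured round-trip time of the keep alive that just completed, return
 * how long to wait before running the keep-alive work again.
 */
uint64_t ka_next_delay(uint64_t period, uint64_t rtt)
{
	/* A slow completion already consumed the whole period: run now. */
	if (rtt >= period)
		return 0;

	/* Otherwise only wait for what is left of the period. */
	return period - rtt;
}
```

With the numbers from the trace (period 4, RTT 4) the helper returns 0, so the next keep alive goes out at t = 4 instead of t = 8.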
    Reported-by: Costa Sapuntzakis
    Reported-by: Randy Jennings
    Signed-off-by: Uday Shankar
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Keith Busch
    Signed-off-by: Sasha Levin

commit 750f2e5ab69ec653c785618caaedfe8631174e58
Author: Uday Shankar
Date:   Thu May 25 12:22:03 2023 -0600

    nvme: check IO start time when deciding to defer KA

    [ Upstream commit 774a9636514764ddc0d072ae0d1d1c01a47e6ddd ]

    When a command completes, we set a flag which will skip sending a
    keep alive at the next run of nvme_keep_alive_work when TBKAS is on.
    However, if the command was submitted long ago, it's possible that
    the controller may have also restarted its keep alive timer (as a
    result of receiving the command) long ago. The following trace
    demonstrates the issue, assuming TBKAS is on and KATO = 8 for
    simplicity:

    1. t = 0: submit I/O commands A, B, C, D, E
    2. t = 0.5: commands A, B, C, D, E reach controller, restart its keep
                alive timer
    3. t = 1: A completes
    4. t = 2: run nvme_keep_alive_work, see recent completion, do nothing
    5. t = 3: B completes
    6. t = 4: run nvme_keep_alive_work, see recent completion, do nothing
    7. t = 5: C completes
    8. t = 6: run nvme_keep_alive_work, see recent completion, do nothing
    9. t = 7: D completes
    10. t = 8: run nvme_keep_alive_work, see recent completion, do nothing
    11. t = 9: E completes

    At this point, 8.5 seconds have passed without restarting the
    controller's keep alive timer, so the controller will detect a keep
    alive timeout.

    Fix this by checking the IO start time when deciding to defer sending a
    keep alive command. Only set comp_seen if the command started after the
    most recent run of nvme_keep_alive_work. With this change, the
    completions of B, C, and D will not set comp_seen and the run of
    nvme_keep_alive_work at t = 4 will send a keep alive.

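The start-time check described above can be sketched like this. It is a simplified userspace model, assuming a monotonic integer clock; `struct ka_state`, `ka_note_completion()` and `ka_work()` are hypothetical names, and a real driver working with jiffies would need a wraparound-safe comparison rather than a plain `>`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-side state; field names are illustrative only. */
struct ka_state {
	uint64_t last_check;  /* time of the most recent keep-alive work run */
	bool comp_seen;       /* a fresh completion was seen since that run */
};

/* Called when an I/O command completes; start is its submission time. */
void ka_note_completion(struct ka_state *s, uint64_t start)
{
	/*
	 * Only a command started after the last run shows that the
	 * controller restarted its timer recently.
	 */
	if (start > s->last_check)
		s->comp_seen = true;
}

/* Periodic work: returns true if a keep alive has to be sent now. */
bool ka_work(struct ka_state *s, uint64_t now)
{
	bool send = !s->comp_seen;

	s->comp_seen = false;
	s->last_check = now;
	return send;
}
```

In this model a completion for a command submitted before the previous work run leaves `comp_seen` untouched, so the next run sends a keep alive instead of deferring again.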
    Reported-by: Costa Sapuntzakis
    Reported-by: Randy Jennings
    Signed-off-by: Uday Shankar
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Sagi Grimberg
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Keith Busch
    Signed-off-by: Sasha Levin

commit 3d2949d77ff90e790936eb52521aee76c9cc5354
Author: Uday Shankar
Date:   Thu May 25 12:22:02 2023 -0600

    nvme: double KA polling frequency to avoid KATO with TBKAS on

    [ Upstream commit ea4d453b9ec9ea279c39744cd0ecb47ef48ede35 ]

    With TBKAS on, the completion of one command can defer sending a
    keep alive for up to twice the delay between successive runs of
    nvme_keep_alive_work. The current delay of KATO / 2 thus makes it
    possible for one command to defer sending a keep alive for up to
    KATO, which can result in the controller detecting a KATO. The following
    trace demonstrates the issue, taking KATO = 8 for simplicity:

    1. t = 0: run nvme_keep_alive_work, no keep-alive sent
    2. t = ε: I/O completion seen, set comp_seen = true
    3. t = 4: run nvme_keep_alive_work, see comp_seen == true,
              skip sending keep-alive, set comp_seen = false
    4. t = 8: run nvme_keep_alive_work, see comp_seen == false,
              send a keep-alive command.

    Here, there is a delay of 8 - ε between receiving a command completion
    and sending the next command. With ε small, the controller is likely to
    detect a keep alive timeout.

    Fix this by running nvme_keep_alive_work with a delay of KATO / 4
    whenever TBKAS is on. Going through the above trace now gives us a
    worst-case delay of 4 - ε, which is in line with the recommendation of
    sending a command every KATO / 2 in the NVMe specification.

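The arithmetic behind the KATO / 4 choice can be sketched as below. This is an illustrative model only: `ka_work_period()` and `ka_worst_case_gap()` are hypothetical helpers, and the "twice the work period" bound is taken directly from the commit message above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: keep-alive work period for a given KATO. */
uint64_t ka_work_period(uint64_t kato, bool tbkas)
{
	uint64_t period = kato / 2;

	/* With TBKAS on, poll twice as often. */
	if (tbkas)
		period /= 2;
	return period;
}

/*
 * Upper bound on how long one completion can defer a keep alive:
 * up to two work periods, as described in the commit message.
 */
uint64_t ka_worst_case_gap(uint64_t kato, bool tbkas)
{
	return 2 * ka_work_period(kato, tbkas);
}
```

For KATO = 8 with TBKAS on, the period becomes 2 and the bound becomes 4, matching the worst-case delay of 4 - ε quoted above.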
    Reported-by: Costa Sapuntzakis
    Reported-by: Randy Jennings
    Signed-off-by: Uday Shankar
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Sagi Grimberg
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Keith Busch
    Signed-off-by: Sasha Levin

commit b9026db654d99e0a750b7759d0e3e1b5347388b5
Author: min15.li
Date:   Fri May 26 17:06:56 2023 +0000

    nvme: fix miss command type check

    [ Upstream commit 31a5978243d24d77be4bacca56c78a0fbc43b00d ]

    In the function nvme_passthru_end(), only the value of the command
    opcode is checked, without checking the command type (IO command or
    Admin command). When we send a Dataset Management command (The opcode
    of the Dataset Management command is the same as the Set Feature
    command), kernel thinks it is a set feature command, then sets the
    controller's keep alive interval, and calls nvme_keep_alive_work().

    Signed-off-by: min15.li
    Reviewed-by: Kanchan Joshi
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Keith Busch
    Signed-off-by: Sasha Levin

commit 5458a0a579928524670ff14c1269f6a0f08cdb37
Author: Dan Carpenter
Date:   Thu May 25 18:38:37 2023 +0300

    usb: gadget: udc: fix NULL dereference in remove()

    [ Upstream commit 016da9c65fec9f0e78c4909ed9a0f2d567af6775 ]

    The "udc" pointer was never set in the probe() function so it will
    lead to a NULL dereference in udc_pci_remove() when we do:

            usb_del_gadget_udc(&udc->gadget);

    Signed-off-by: Dan Carpenter
    Link: https://lore.kernel.org/r/ZG+A/dNpFWAlCChk@kili
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: Sasha Levin

commit e6a9a52882c38a5e48c12182118e0429be6411f4
Author: Shida Zhang
Date:   Tue May 16 09:34:30 2023 +0800

    btrfs: fix an uninitialized variable warning in btrfs_log_inode

    [ Upstream commit 8fd9f4232d8152c650fd15127f533a0f6d0a4b2b ]

    This fixes the following warning reported by gcc 10.2.1 under x86_64:

    ../fs/btrfs/tree-log.c: In function ‘btrfs_log_inode’:
    ../fs/btrfs/tree-log.c:6211:9: error: ‘last_range_start’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
     6211 |   ret = insert_dir_log_key(trans, log, path, key.objectid,
          |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     6212 |      first_dir_index, last_dir_index);
          |      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    ../fs/btrfs/tree-log.c:6161:6: note: ‘last_range_start’ was declared here
     6161 |  u64 last_range_start;
          |      ^~~~~~~~~~~~~~~~

    This might be a false positive fixed in later compiler versions but we
    want to have it fixed.

    Reported-by: k2ci
    Reviewed-by: Anand Jain
    Signed-off-by: Shida Zhang
    Reviewed-by: David Sterba
    Signed-off-by: David Sterba
    Signed-off-by: Sasha Levin

commit d0aae9053e8f1331f1d0d38660122993eaeb3813
Author: Osama Muhammad
Date:   Thu May 25 22:27:46 2023 +0500

    nfcsim.c: Fix error checking for debugfs_create_dir

    [ Upstream commit 9b9e46aa07273ceb96866b2e812b46f1ee0b8d2f ]

    This patch fixes the error checking in nfcsim.c.
    The DebugFS kernel API is developed in
    a way that the caller can safely ignore the errors that
    occur during the creation of DebugFS nodes.

    Signed-off-by: Osama Muhammad
    Reviewed-by: Simon Horman
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

commit ae4e69dcae2cb40c83e46e57e58f64c1f0795963
Author: Hans Verkuil
Date:   Mon Apr 24 16:07:28 2023 +0100

    media: cec: core: don't set last_initiator if tx in progress

    [ Upstream commit 73af6c7511038249cad3d5f3b44bf8d78ac0f499 ]

    When a message was received the last_initiator is set to 0xff.
    This will force the signal free time for the next transmit
    to that for a new initiator. However, if a new transmit is
    already in progress, then don't set last_initiator, since
    that's the initiator of the current transmit. Overwriting
    this would cause the signal free time of a following transmit
    to be that of the new initiator instead of a next transmit.
Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit a41100969171cc03a5d7895641c95826be9d159e Author: Hans Verkuil Date: Thu Apr 20 08:26:53 2023 +0100 media: cec: core: disable adapter in cec_devnode_unregister [ Upstream commit fe4526d99e2e06b08bb80316c3a596ea6a807b75 ] Explicitly disable the CEC adapter in cec_devnode_unregister() Usually this does not really do anything important, but for drivers that use the CEC pin framework this is needed to properly stop the hrtimer. Without this a crash would happen when such a driver is unloaded with rmmod. Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Sasha Levin commit 18c7c10352c0f92d2ca7b140a1695eb67f6f1a8a Author: Steve French Date: Thu May 25 18:53:28 2023 -0500 smb3: missing null check in SMB2_change_notify [ Upstream commit b535cc796a4b4942cd189652588e8d37c1f5925a ] If plen is null when passed in, we only checked for null in one of the two places where it could be used. Although plen is always valid (not null) for current callers of the SMB2_change_notify function, this change makes it more consistent. Reported-by: kernel test robot Reported-by: Dan Carpenter Closes: https://lore.kernel.org/all/202305251831.3V1gbbFs-lkp@intel.com/ Signed-off-by: Steve French Signed-off-by: Sasha Levin commit 336a79d6664a66f5e498fe503a91f79d0d9eb598 Author: Marc Zyngier Date: Mon May 15 21:46:00 2023 +0100 arm64: Add missing Set/Way CMO encodings [ Upstream commit 8d0f019e4c4f2ee2de81efd9bf1c27e9fb3c0460 ] Add the missing Set/Way CMOs that apply to tagged memory. 
Signed-off-by: Marc Zyngier Reviewed-by: Cornelia Huck Reviewed-by: Steven Price Reviewed-by: Oliver Upton Link: https://lore.kernel.org/r/20230515204601.1270428-2-maz@kernel.org Signed-off-by: Sasha Levin commit 07ca89c2b4a2c7ec894934493ac11eb3a174ebb9 Author: Denis Arefev Date: Thu Apr 27 14:47:45 2023 +0300 HID: wacom: Add error check to wacom_parse_and_register() [ Upstream commit 16a9c24f24fbe4564284eb575b18cc20586b9270 ] Added a variable check and transition in case of an error Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Denis Arefev Reviewed-by: Ping Cheng Signed-off-by: Jiri Kosina Signed-off-by: Sasha Levin commit cf1738fbaabffc75fbdfe60e3dc1bd9742ecea08 Author: Maurizio Lombardi Date: Mon May 8 18:22:19 2023 +0200 scsi: target: iscsi: Prevent login threads from racing between each other [ Upstream commit 2a737d3b8c792400118d6cf94958f559de9c5e59 ] The tpg->np_login_sem is a semaphore that is used to serialize the login process when multiple login threads run concurrently against the same target portal group. The iscsi_target_locate_portal() function finds the tpg, calls iscsit_access_np() against the np_login_sem semaphore and saves the tpg pointer in conn->tpg; If iscsi_target_locate_portal() fails, the caller will check for the conn->tpg pointer and, if it's not NULL, then it will assume that iscsi_target_locate_portal() called iscsit_access_np() on the semaphore. Make sure that conn->tpg gets initialized only if iscsit_access_np() was successful, otherwise iscsit_deaccess_np() may end up being called against a semaphore we never took, allowing more than one thread to access the same tpg. Signed-off-by: Maurizio Lombardi Link: https://lore.kernel.org/r/20230508162219.1731964-4-mlombard@redhat.com Reviewed-by: Mike Christie Signed-off-by: Martin K. 
Petersen
    Signed-off-by: Sasha Levin

commit 46488d29d13078294621019ebd8bb4e8e82a9f1c
Author: Maurizio Lombardi
Date:   Mon May 8 18:22:18 2023 +0200

    scsi: target: iscsi: Remove unused transport_timer

    [ Upstream commit 98a8c2bf938a5973716f280da618077a3d255976 ]

    Signed-off-by: Maurizio Lombardi
    Link: https://lore.kernel.org/r/20230508162219.1731964-3-mlombard@redhat.com
    Reviewed-by: Mike Christie
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin

commit 25396ffbceec13a0c936e1f87b17351ff66403bf
Author: Maurizio Lombardi
Date:   Mon May 8 18:22:17 2023 +0200

    scsi: target: iscsi: Fix hang in the iSCSI login code

    [ Upstream commit 13247018d68f21e7132924b9853f7e2c423588b6 ]

    If the initiator suddenly stops sending data during a login while
    keeping the TCP connection open, the login_work won't be scheduled and
    will never release the login semaphore; concurrent login operations
    will therefore get stuck and fail.

    The bug is due to the inability of the login timeout code to properly
    handle this particular case.

    Fix the problem by replacing the old per-NP login timer with a new
    per-connection timer.

    The timer is started when an initiator connects to the target; if it
    expires, it sends a SIGINT signal to the thread pointed at by the
    conn->login_kworker pointer.

    conn->login_kworker is set by calling the
    iscsit_set_login_timer_kworker() helper; initially it will point to
    the NP thread. When the login operation's control is in the process of
    being passed from the NP thread to login_work, the conn->login_kworker
    pointer is set to NULL. Finally, login_kworker will be changed to
    point to the worker thread executing the login_work job.

    If conn->login_kworker is NULL when the timer expires, it means that
    the login operation hasn't been completed yet but login_work isn't
    running; in this case the timer will mark the login process as failed
    and will schedule login_work so the latter will be forced to free the
    resources it holds.
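
The expiry decision described above can be summarised in a few lines. This is a minimal sketch of the logic as the commit message states it, not the kernel implementation: the function name `on_login_timer_expiry` and the returned action strings are hypothetical, and `None` stands in for a NULL conn->login_kworker.

```python
def on_login_timer_expiry(login_kworker):
    """Return the action the per-connection login timer takes when it fires.

    login_kworker models conn->login_kworker: the thread currently driving
    the login (the NP thread, later the login_work worker), or None while
    control is being handed over and nobody is running the login.
    """
    if login_kworker is not None:
        # A thread is handling the login: interrupt it with SIGINT.
        return ("send_sigint", login_kworker)
    # Nobody is running the login: mark it failed and schedule login_work
    # so that it releases the resources it holds.
    return ("fail_login_and_schedule_login_work", None)


print(on_login_timer_expiry("np_thread"))
print(on_login_timer_expiry(None))
```

The point of the NULL case is that a stalled initiator can no longer leave the login semaphore held forever, since login_work is forced to run and clean up.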
Signed-off-by: Maurizio Lombardi Link: https://lore.kernel.org/r/20230508162219.1731964-2-mlombard@redhat.com Reviewed-by: Mike Christie Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 278f6042c24d01d9c66aec73a9f79c3069703585 Author: Michael Walle Date: Mon Jun 19 10:56:07 2023 +0200 gpiolib: Fix irq_domain resource tracking for gpiochip_irqchip_add_domain() [ Upstream commit ff7a1790fbf92f1bdd0966d3f0da3ea808ede876 ] Up until commit 6a45b0e2589f ("gpiolib: Introduce gpiochip_irqchip_add_domain()") all irq_domains were allocated by gpiolib itself and thus gpiolib also takes care of freeing it. With gpiochip_irqchip_add_domain() a user of gpiolib can associate an irq_domain with the gpio_chip. This irq_domain is not managed by gpiolib and therefore must not be freed by gpiolib. Fixes: 6a45b0e2589f ("gpiolib: Introduce gpiochip_irqchip_add_domain()") Reported-by: Jiawen Wu Signed-off-by: Michael Walle Reviewed-by: Linus Walleij Reviewed-by: Andy Shevchenko Signed-off-by: Bartosz Golaszewski Signed-off-by: Sasha Levin commit e0ae6e90364e9b885517f37753a78453d6bca0b6 Author: Su Hui Date: Thu Jun 8 10:19:34 2023 +0800 iommu/amd: Fix possible memory leak of 'domain' [ Upstream commit 5b00369fcf6d1ff9050b94800dc596925ff3623f ] Move allocation code down to avoid memory leak. Fixes: 29f54745f245 ("iommu/amd: Add missing domain type checks") Signed-off-by: Su Hui Reviewed-by: Jason Gunthorpe Reviewed-by: Jerry Snitselaar Reviewed-by: Vasant Hegde Link: https://lore.kernel.org/r/20230608021933.856045-1-suhui@nfschina.com Signed-off-by: Joerg Roedel Signed-off-by: Sasha Levin commit f68a2b563987890d25827f0eac13437c4919c891 Author: Jiasheng Jiang Date: Tue Jun 6 11:11:59 2023 +0800 gpio: sifive: add missing check for platform_get_irq [ Upstream commit c1bcb976d8feb107ff2c12caaf12ac5e70f44d5f ] Add the missing check for platform_get_irq() and return error code if it fails. 
    The returned error code will be dealt with in
    builtin_platform_driver(sifive_gpio_driver) and the driver will not
    be registered.

    Fixes: f52d6d8b43e5 ("gpio: sifive: To get gpio irq offset from device tree data")
    Signed-off-by: Jiasheng Jiang
    Signed-off-by: Bartosz Golaszewski
    Signed-off-by: Sasha Levin

commit aad182bd0b8f53b753e39ab98e42d4b3804b45d9
Author: Jiawen Wu
Date:   Wed Jun 7 16:18:03 2023 +0800

    gpiolib: Fix GPIO chip IRQ initialization restriction

    [ Upstream commit 8c00914e5438e3636f26b4f814b3297ae2a1b9ee ]

    In case of gpio-regmap, IRQ chip is added by regmap-irq and associated
    with GPIO chip by gpiochip_irqchip_add_domain(). The initialization
    flag was not added in gpiochip_irqchip_add_domain(), causing
    gpiochip_to_irq() to return -EPROBE_DEFER.

    Fixes: 5467801f1fcb ("gpio: Restrict usage of GPIO chip irq members before initialization")
    Signed-off-by: Jiawen Wu
    Reviewed-by: Andy Shevchenko
    Reviewed-by: Linus Walleij
    Signed-off-by: Bartosz Golaszewski
    Signed-off-by: Sasha Levin

commit 077c5df768fe50e286700e7c9d3892fa8ce43c95
Author: Nicolas Frattaroli
Date:   Fri Apr 21 17:26:10 2023 +0200

    arm64: dts: rockchip: fix nEXTRST on SOQuartz

    [ Upstream commit cf9ae4a0077496e8224d68fc88e3df13dd7e5f37 ]

    In pre-production prototypes (of which I only know one person
    having one, Peter Geis), GPIO0 pin A5 was tied to the SDMMC
    power enable pin on the CM4 connector. On all production models,
    this is not the case; instead, this pin is used for the nEXTRST
    signal, and the SDMMC power enable pin is always pulled high.

    Since everyone currently using the SOQuartz device trees will
    want this change, it is made to the tree without splitting the
    trees into two separate ones of which users will then inevitably
    choose the wrong one.

    This fixes USB and PCIe on a wide variety of CM4IO-compatible
    boards which use the nEXTRST signal.
    Fixes: 5859b5a9c3ac ("arm64: dts: rockchip: add SoQuartz CM4IO dts")
    Signed-off-by: Nicolas Frattaroli
    Link: https://lore.kernel.org/r/20230421152610.21688-1-frattaroli.nicolas@gmail.com
    Signed-off-by: Heiko Stuebner
    Signed-off-by: Sasha Levin

commit 3654477d43a839d7d7485efd5caa297f078db42d
Author: Maciej Żenczykowski
Date:   Sun Jun 18 03:31:30 2023 -0700

    revert "net: align SO_RCVMARK required privileges with SO_MARK"

    [ Upstream commit a9628e88776eb7d045cf46467f1afdd0f7fe72ea ]

    This reverts commit 1f86123b9749 ("net: align SO_RCVMARK required
    privileges with SO_MARK") because the reasoning in the commit message
    is not really correct:
      SO_RCVMARK is used for 'reading' incoming skb mark (via cmsg), as such
      it is more equivalent to 'getsockopt(SO_MARK)' which has no priv check
      and retrieves the socket mark, rather than 'setsockopt(SO_MARK)' which
      sets the socket mark and does require privs.

      Additionally incoming skb->mark may already be visible if
      sysctl_fwmark_reflect and/or sysctl_tcp_fwmark_accept are enabled.

      Furthermore, it is easier to block the getsockopt via bpf
      (either cgroup setsockopt hook, or via syscall filters)
      than to unblock it if it requires CAP_NET_RAW/ADMIN.

    On Android the socket mark is (among other things) used to store
    the network identifier a socket is bound to. Setting it is privileged,
    but retrieving it is not. We'd like unprivileged userspace to be able
    to read the network id of incoming packets (where mark is set via
    iptables [to be moved to bpf])...

    An alternative would be to add another sysctl to control whether
    setting SO_RCVMARK is privileged or not.
    (or even a MASK of which bits in the mark can be exposed)
    But this seems like over-engineering...

    Note: This is a non-trivial revert, due to later merged commit
    e42c7beee71d ("bpf: net: Consider has_current_bpf_ctx() when testing
    capable() in sk_setsockopt()") which changed both 'ns_capable' into
    'sockopt_ns_capable' calls.
Fixes: 1f86123b9749 ("net: align SO_RCVMARK required privileges with SO_MARK") Cc: Larysa Zaremba Cc: Simon Horman Cc: Paolo Abeni Cc: Eyal Birger Cc: Jakub Kicinski Cc: Eric Dumazet Cc: Patrick Rohr Signed-off-by: Maciej Żenczykowski Reviewed-by: Simon Horman Reviewed-by: Kuniyuki Iwashima Link: https://lore.kernel.org/r/20230618103130.51628-1-maze@google.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 8fa7c1b469214ddd8152f4d493ea7857cf5f703f Author: Eric Dumazet Date: Tue Jun 20 18:44:25 2023 +0000 sch_netem: acquire qdisc lock in netem_change() [ Upstream commit 2174a08db80d1efeea382e25ac41c4e7511eb6d6 ] syzbot managed to trigger a divide error [1] in netem. It could happen if q->rate changes while netem_enqueue() is running, since q->rate is read twice. It turns out netem_change() always lacked proper synchronization. [1] divide error: 0000 [#1] SMP KASAN CPU: 1 PID: 7867 Comm: syz-executor.1 Not tainted 6.1.30-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/25/2023 RIP: 0010:div64_u64 include/linux/math64.h:69 [inline] RIP: 0010:packet_time_ns net/sched/sch_netem.c:357 [inline] RIP: 0010:netem_enqueue+0x2067/0x36d0 net/sched/sch_netem.c:576 Code: 89 e2 48 69 da 00 ca 9a 3b 42 80 3c 28 00 4c 8b a4 24 88 00 00 00 74 0d 4c 89 e7 e8 c3 4f 3b fd 48 8b 4c 24 18 48 89 d8 31 d2 <49> f7 34 24 49 01 c7 4c 8b 64 24 48 4d 01 f7 4c 89 e3 48 c1 eb 03 RSP: 0018:ffffc9000dccea60 EFLAGS: 00010246 RAX: 000001a442624200 RBX: 000001a442624200 RCX: ffff888108a4f000 RDX: 0000000000000000 RSI: 000000000000070d RDI: 000000000000070d RBP: ffffc9000dcceb90 R08: ffffffff849c5e26 R09: fffffbfff10e1297 R10: 0000000000000000 R11: dffffc0000000001 R12: ffff888108a4f358 R13: dffffc0000000000 R14: 0000001a8cd9a7ec R15: 0000000000000000 FS: 00007fa73fe18700(0000) GS:ffff8881f6b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fa73fdf7718 CR3: 000000011d36e000 CR4: 0000000000350ee0 Call 
Trace: [] __dev_xmit_skb net/core/dev.c:3931 [inline] [] __dev_queue_xmit+0xcf5/0x3370 net/core/dev.c:4290 [] dev_queue_xmit include/linux/netdevice.h:3030 [inline] [] neigh_hh_output include/net/neighbour.h:531 [inline] [] neigh_output include/net/neighbour.h:545 [inline] [] ip_finish_output2+0xb92/0x10d0 net/ipv4/ip_output.c:235 [] __ip_finish_output+0xc3/0x2b0 [] ip_finish_output+0x31/0x2a0 net/ipv4/ip_output.c:323 [] NF_HOOK_COND include/linux/netfilter.h:298 [inline] [] ip_output+0x224/0x2a0 net/ipv4/ip_output.c:437 [] dst_output include/net/dst.h:444 [inline] [] ip_local_out net/ipv4/ip_output.c:127 [inline] [] __ip_queue_xmit+0x1425/0x2000 net/ipv4/ip_output.c:542 [] ip_queue_xmit+0x4c/0x70 net/ipv4/ip_output.c:556 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot Signed-off-by: Eric Dumazet Cc: Stephen Hemminger Cc: Jamal Hadi Salim Cc: Cong Wang Cc: Jiri Pirko Reviewed-by: Jamal Hadi Salim Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20230620184425.1179809-1-edumazet@google.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 108058802dd8b44154b9a5368d85bb3fc7a50b66 Author: Shyam Sundar S K Date: Thu Jun 22 11:33:09 2023 +0530 platform/x86/amd/pmf: Register notify handler only if SPS is enabled [ Upstream commit 146b6f6855e7656e8329910606595220c761daac ] Power source notify handler is getting registered even when none of the PMF feature in enabled leading to a crash. ... [ 22.592162] Call Trace: [ 22.592164] [ 22.592164] ? rcu_note_context_switch+0x5e0/0x660 [ 22.592166] ? __warn+0x81/0x130 [ 22.592171] ? rcu_note_context_switch+0x5e0/0x660 [ 22.592172] ? report_bug+0x171/0x1a0 [ 22.592175] ? prb_read_valid+0x1b/0x30 [ 22.592177] ? handle_bug+0x3c/0x80 [ 22.592178] ? exc_invalid_op+0x17/0x70 [ 22.592179] ? asm_exc_invalid_op+0x1a/0x20 [ 22.592182] ? rcu_note_context_switch+0x5e0/0x660 [ 22.592183] ? acpi_ut_delete_object_desc+0x86/0xb0 [ 22.592186] ? 
acpi_ut_update_ref_count.part.0+0x22d/0x930 [ 22.592187] __schedule+0xc0/0x1410 [ 22.592189] ? ktime_get+0x3c/0xa0 [ 22.592191] ? lapic_next_event+0x1d/0x30 [ 22.592193] ? hrtimer_start_range_ns+0x25b/0x350 [ 22.592196] schedule+0x5e/0xd0 [ 22.592197] schedule_hrtimeout_range_clock+0xbe/0x140 [ 22.592199] ? __pfx_hrtimer_wakeup+0x10/0x10 [ 22.592200] usleep_range_state+0x64/0x90 [ 22.592203] amd_pmf_send_cmd+0x106/0x2a0 [amd_pmf bddfe0fe3712aaa99acce3d5487405c5213c6616] [ 22.592207] amd_pmf_update_slider+0x56/0x1b0 [amd_pmf bddfe0fe3712aaa99acce3d5487405c5213c6616] [ 22.592210] amd_pmf_set_sps_power_limits+0x72/0x80 [amd_pmf bddfe0fe3712aaa99acce3d5487405c5213c6616] [ 22.592213] amd_pmf_pwr_src_notify_call+0x49/0x90 [amd_pmf bddfe0fe3712aaa99acce3d5487405c5213c6616] [ 22.592216] notifier_call_chain+0x5a/0xd0 [ 22.592218] atomic_notifier_call_chain+0x32/0x50 ... Fix this by moving the registration of source change notify handler only when SPS(Static Slider) is advertised as supported. Reported-by: Allen Zhong Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217571 Fixes: 4c71ae414474 ("platform/x86/amd/pmf: Add support SPS PMF feature") Tested-by: Patil Rajesh Reddy Reviewed-by: Mario Limonciello Signed-off-by: Shyam Sundar S K Link: https://lore.kernel.org/r/20230622060309.310001-1-Shyam-sundar.S-k@amd.com Reviewed-by: Hans de Goede Signed-off-by: Hans de Goede Signed-off-by: Sasha Levin commit 37ed1d3519853373cbd07654eecb7b98cc21a079 Author: Danielle Ratson Date: Tue Jun 20 14:45:15 2023 +0200 selftests: forwarding: Fix race condition in mirror installation [ Upstream commit c7c059fba6fb19c3bc924925c984772e733cb594 ] When mirroring to a gretap in hardware the device expects to be programmed with the egress port and all the encapsulating headers. This requires the driver to resolve the path the packet will take in the software data path and program the device accordingly. 
If the path cannot be resolved (in this case because of an unresolved neighbor), then mirror installation fails until the path is resolved. This results in a race that causes the test to sometimes fail. Fix this by setting the neighbor's state to permanent in a couple of tests, so that it is always valid. Fixes: 35c31d5c323f ("selftests: forwarding: Test mirror-to-gretap w/ UL 802.1d") Fixes: 239e754af854 ("selftests: forwarding: Test mirror-to-gretap w/ UL 802.1q") Signed-off-by: Danielle Ratson Reviewed-by: Petr Machata Signed-off-by: Petr Machata Link: https://lore.kernel.org/r/268816ac729cb6028c7a34d4dda6f4ec7af55333.1687264607.git.petrm@nvidia.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 7798d55a056c9085d1aee53a845c2fb2195ca9b1 Author: Jens Axboe Date: Tue Jun 20 16:11:51 2023 -0600 io_uring/net: use the correct msghdr union member in io_sendmsg_copy_hdr [ Upstream commit 26fed83653d0154704cadb7afc418f315c7ac1f0 ] Rather than assign the user pointer to msghdr->msg_control, assign it to msghdr->msg_control_user to make sparse happy. They are in a union so the end result is the same, but let's avoid new sparse warnings and squash this one. Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202306210654.mDMcyMuB-lkp@intel.com/ Fixes: cac9e4418f4c ("io_uring/net: save msghdr->msg_control for retries") Reviewed-by: Christoph Hellwig Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit fef7c07b4ca521ce8d6b95cc3ffa046afb5c1d09 Author: Jiri Olsa Date: Sun Jun 18 15:14:14 2023 +0200 bpf: Force kprobe multi expected_attach_type for kprobe_multi link [ Upstream commit db8eae6bc5c702d8e3ab2d0c6bb5976c131576eb ] We currently allow to create perf link for program with expected_attach_type == BPF_TRACE_KPROBE_MULTI. 
This will cause crash when we call helpers like get_attach_cookie or get_func_ip in such program, because it will call the kprobe_multi's version (current->bpf_ctx context setup) of those helpers while it expects perf_link's current->bpf_ctx context setup. Making sure that we use BPF_TRACE_KPROBE_MULTI expected_attach_type only for programs attaching through kprobe_multi link. Fixes: ca74823c6e16 ("bpf: Add cookie support to programs attached with kprobe multi link") Signed-off-by: Jiri Olsa Signed-off-by: Andrii Nakryiko Signed-off-by: Daniel Borkmann Link: https://lore.kernel.org/bpf/20230618131414.75649-1-jolsa@kernel.org Signed-off-by: Sasha Levin commit c644783d474973ad3c6991a4e0c69dacf90c2d7f Author: Florent Revest Date: Thu Jun 15 16:56:07 2023 +0200 bpf/btf: Accept function names that contain dots [ Upstream commit 9724160b3942b0a967b91a59f81da5593f28b8ba ] When building a kernel with LLVM=1, LLVM_IAS=0 and CONFIG_KASAN=y, LLVM leaves DWARF tags for the "asan.module_ctor" & co symbols. In turn, pahole creates BTF_KIND_FUNC entries for these and this makes the BTF metadata validation fail because they contain a dot. In a dramatic turn of event, this BTF verification failure can cause the netfilter_bpf initialization to fail, causing netfilter_core to free the netfilter_helper hashmap and netfilter_ftp to trigger a use-after-free. The risk of u-a-f in netfilter will be addressed separately but the existence of "asan.module_ctor" debug info under some build conditions sounds like a good enough reason to accept functions that contain dots in BTF. Although using only LLVM=1 is the recommended way to compile clang-based kernels, users can certainly do LLVM=1, LLVM_IAS=0 as well and we still try to support that combination according to Nick. 
    To clarify:

      - > v5.10 kernel, LLVM=1 (LLVM_IAS=0 is not the default) is
        recommended, but user can still have LLVM=1, LLVM_IAS=0 to
        trigger the issue

      - <= 5.10 kernel, LLVM=1 (LLVM_IAS=0 is the default) is
        recommended in which case GNU as will be used

    Fixes: 1dc92851849c ("bpf: kernel side support for BTF Var and DataSec")
    Signed-off-by: Florent Revest
    Signed-off-by: Daniel Borkmann
    Acked-by: Andrii Nakryiko
    Cc: Yonghong Song
    Cc: Nick Desaulniers
    Link: https://lore.kernel.org/bpf/20230615145607.3469985-1-revest@chromium.org
    Signed-off-by: Sasha Levin

commit d2c436dbb90d80356da15e42e8f9fde441db027f
Author: Francesco Dolcini
Date:   Mon Jun 19 17:44:35 2023 +0200

    Revert "net: phy: dp83867: perform soft reset and retain established link"

    [ Upstream commit a129b41fe0a8b4da828c46b10f5244ca07a3fec3 ]

    This reverts commit da9ef50f545f86ffe6ff786174d26500c4db737a.

    This fixes a regression in which the link would come up, but no
    communication was possible.

    The reverted commit was also removing a comment about
    DP83867_PHYCR_FORCE_LINK_GOOD; this is not added back in this commit
    since it seems that this is unrelated to the original code change.

    Closes: https://lore.kernel.org/all/ZGuDJos8D7N0J6Z2@francesco-nb.int.toradex.com/
    Fixes: da9ef50f545f ("net: phy: dp83867: perform soft reset and retain established link")
    Signed-off-by: Francesco Dolcini
    Reviewed-by: Andrew Lunn
    Reviewed-by: Praneeth Bajjuri
    Link: https://lore.kernel.org/r/20230619154435.355485-1-francesco@dolcini.it
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Sasha Levin

commit cdd5225165c074fa1c5221432a1cab3e76959e55
Author: Pablo Neira Ayuso
Date:   Thu Jun 15 10:14:25 2023 +0200

    netfilter: nfnetlink_osf: fix module autoload

    [ Upstream commit 62f9a68a36d4441a6c412b81faed102594bc6670 ]

    Move the alias from xt_osf to nfnetlink_osf.
    Fixes: f9324952088f ("netfilter: nfnetlink_osf: extract nfnetlink_subsystem code from xt_osf.c")
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Sasha Levin

commit 70c71aee78e280eef2b148a6ccc9e866355195fc
Author: Pablo Neira Ayuso
Date:   Fri Jun 16 15:22:01 2023 +0200

    netfilter: nf_tables: disallow updates of anonymous sets

    [ Upstream commit b770283c98e0eee9133c47bc03b6cc625dc94723 ]

    Disallow updates of set timeout and garbage collection parameters for
    anonymous sets.

    Fixes: 123b99619cca ("netfilter: nf_tables: honor set timeout and garbage collection updates")
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Sasha Levin

commit 7e95119acd929410e55ca8783042815c1448b15a
Author: Pablo Neira Ayuso
Date:   Fri Jun 16 15:21:39 2023 +0200

    netfilter: nf_tables: reject unbound chain set before commit phase

    [ Upstream commit 62e1e94b246e685d89c3163aaef4b160e42ceb02 ]

    Use binding list to track set transaction and to check for unbound
    chains before entering the commit phase.

    Bail out if a chain binding remains unused before entering the commit
    step.

    Fixes: d0e2c7de92c7 ("netfilter: nf_tables: add NFT_CHAIN_BINDING")
    Signed-off-by: Pablo Neira Ayuso
    Signed-off-by: Sasha Levin

commit 3d09b7fff7edfba4411bd17db6af1f8210d16e0c
Author: Pablo Neira Ayuso
Date:   Fri Jun 16 15:21:33 2023 +0200

    netfilter: nf_tables: reject unbound anonymous set before commit phase

    [ Upstream commit 938154b93be8cd611ddfd7bafc1849f3c4355201 ]

    Add a new list to track set transaction and to check for unbound
    anonymous sets before entering the commit phase.

    Bail out at the end of the transaction handling if an anonymous set
    remains unbound.
Fixes: 96518518cc41 ("netfilter: add nftables") Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit 921ca64b7e07006b35137c8b990ff912a5e4b45d Author: Pablo Neira Ayuso Date: Fri Jun 16 15:20:16 2023 +0200 netfilter: nf_tables: disallow element updates of bound anonymous sets [ Upstream commit c88c535b592d3baeee74009f3eceeeaf0fdd5e1b ] Anonymous sets come with NFT_SET_CONSTANT from userspace. Although API allows to create anonymous sets without NFT_SET_CONSTANT, it makes no sense to allow to add and to delete elements for bound anonymous sets. Fixes: 96518518cc41 ("netfilter: add nftables") Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit f661383b5f1aaac3fe121b91e04332944bc90193 Author: Pablo Neira Ayuso Date: Fri Jun 16 15:20:04 2023 +0200 netfilter: nft_set_pipapo: .walk does not deal with generations [ Upstream commit 2b84e215f87443c74ac0aa7f76bb172d43a87033 ] The .walk callback iterates over the current active set, but it might be useful to iterate over the next generation set. Use the generation mask to determine what set view (either current or next generation) is use for the walk iteration. Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges") Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit dc7cdf8cbcbf8b13de1df93f356ec04cdeef5c41 Author: Pablo Neira Ayuso Date: Fri Jun 16 14:51:49 2023 +0200 netfilter: nf_tables: drop map element references from preparation phase [ Upstream commit 628bd3e49cba1c066228e23d71a852c23e26da73 ] set .destroy callback releases the references to other objects in maps. This is very late and it results in spurious EBUSY errors. Drop refcount from the preparation phase instead, update set backend not to drop reference counter from set .destroy path. Exceptions: NFT_TRANS_PREPARE_ERROR does not require to drop the reference counter because the transaction abort path releases the map references for each element since the set is unbound. 
The abort path also deals with releasing reference counter for new elements added to unbound sets. Fixes: 591054469b3e ("netfilter: nf_tables: revisit chain/object refcounting from elements") Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit 0b342806144ab28c7711712df16f085529d46ea4 Author: Pablo Neira Ayuso Date: Fri Jun 16 14:45:26 2023 +0200 netfilter: nf_tables: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain [ Upstream commit 26b5a5712eb85e253724e56a54c17f8519bd8e4e ] Add a new state to deal with rule expressions deactivation from the newrule error path, otherwise the anonymous set remains in the list in inactive state for the next generation. Mark the set/chain transaction as unbound so the abort path releases this object, set it as inactive in the next generation so it is not reachable anymore from this transaction and reference counter is dropped. Fixes: 1240eb93f061 ("netfilter: nf_tables: incorrect error path handling with NFT_MSG_NEWRULE") Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit a1547f81341f14b1b355df04218152e8b5d4b264 Author: Pablo Neira Ayuso Date: Fri Jun 16 14:45:22 2023 +0200 netfilter: nf_tables: fix chain binding transaction logic [ Upstream commit 4bedf9eee016286c835e3d8fa981ddece5338795 ] Add bound flag to rule and chain transactions as in 6a0a8d10a366 ("netfilter: nf_tables: use-after-free in failing rule with bound set") to skip them in case that the chain is already bound from the abort path. This patch fixes an imbalance in the chain use refcnt that triggers a WARN_ON on the table and chain destroy path. This patch also disallows nested chain bindings, which is not supported from userspace. The logic to deal with chain binding in nft_data_hold() and nft_data_release() is not correct. 
The NFT_TRANS_PREPARE state needs a special handling in case a chain is bound but next expressions in the same rule fail to initialize as described by 1240eb93f061 ("netfilter: nf_tables: incorrect error path handling with NFT_MSG_NEWRULE"). The chain is left bound if rule construction fails, so the objects stored in this chain (and the chain itself) are released by the transaction records from the abort path, follow up patch ("netfilter: nf_tables: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain") completes this error handling. When deleting an existing rule, chain bound flag is set off so the rule expression .destroy path releases the objects. Fixes: d0e2c7de92c7 ("netfilter: nf_tables: add NFT_CHAIN_BINDING") Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit 07777fe912e9f6af2d0328809431e4298e3861cf Author: Ross Lagerwall Date: Fri Jun 16 17:45:49 2023 +0100 be2net: Extend xmit workaround to BE3 chip [ Upstream commit 7580e0a78eb29e7bb1a772eba4088250bbb70d41 ] We have seen a bug where the NIC incorrectly changes the length in the IP header of a padded packet to include the padding bytes. The driver already has a workaround for this so do the workaround for this NIC too. This resolves the issue. 
The NIC in question identifies itself as follows: [ 8.828494] be2net 0000:02:00.0: FW version is 10.7.110.31 [ 8.834759] be2net 0000:02:00.0: Emulex OneConnect(be3): PF FLEX10 port 1 02:00.0 Ethernet controller: Emulex Corporation OneConnect 10Gb NIC (be3) (rev 01) Fixes: ca34fe38f06d ("be2net: fix wrong usage of adapter->generation") Signed-off-by: Ross Lagerwall Link: https://lore.kernel.org/r/20230616164549.2863037-1-ross.lagerwall@citrix.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 21ade3f9cf9b32f88cada9b933b93eb713774896 Author: Vladimir Oltean Date: Sat Jun 17 09:26:48 2023 +0300 net: dsa: introduce preferred_default_local_cpu_port and use on MT7530 [ Upstream commit b79d7c14f48083abb3fb061370c0c64a569edf4c ] Since the introduction of the OF bindings, DSA has always had a policy that in case multiple CPU ports are present in the device tree, the numerically smallest one is always chosen. The MT7530 switch family, except the switch on the MT7988 SoC, has 2 CPU ports, 5 and 6, where port 6 is preferable on the MT7531BE switch because it has higher bandwidth. The MT7530 driver developers had 3 options: - to modify DSA when the MT7531 switch support was introduced, such as to prefer the better port - to declare both CPU ports in device trees as CPU ports, and live with the sub-optimal performance resulting from not preferring the better port - to declare just port 6 in the device tree as a CPU port Of course they chose the path of least resistance (3rd option), kicking the can down the road. The hardware description in the device tree is supposed to be stable - developers are not supposed to adopt the strategy of piecemeal hardware description, where the device tree is updated in lockstep with the features that the kernel currently supports. 
Now, as a result of the fact that they did that, any attempts to modify the device tree and describe both CPU ports as CPU ports would make DSA change its default selection from port 6 to 5, effectively resulting in a performance degradation visible to users with the MT7531BE switch as can be seen below. Without preferring port 6: [ ID][Role] Interval Transfer Bitrate Retr [ 5][TX-C] 0.00-20.00 sec 374 MBytes 157 Mbits/sec 734 sender [ 5][TX-C] 0.00-20.00 sec 373 MBytes 156 Mbits/sec receiver [ 7][RX-C] 0.00-20.00 sec 1.81 GBytes 778 Mbits/sec 0 sender [ 7][RX-C] 0.00-20.00 sec 1.81 GBytes 777 Mbits/sec receiver With preferring port 6: [ ID][Role] Interval Transfer Bitrate Retr [ 5][TX-C] 0.00-20.00 sec 1.99 GBytes 856 Mbits/sec 273 sender [ 5][TX-C] 0.00-20.00 sec 1.99 GBytes 855 Mbits/sec receiver [ 7][RX-C] 0.00-20.00 sec 1.72 GBytes 737 Mbits/sec 15 sender [ 7][RX-C] 0.00-20.00 sec 1.71 GBytes 736 Mbits/sec receiver Using one port for WAN and the other ports for LAN is a very popular use case which is what this test emulates. As such, this change proposes that we retroactively modify stable kernels (which don't support the modification of the CPU port assignments, so as to let user space fix the problem and restore the throughput) to keep the mt7530 driver preferring port 6 even with device trees where the hardware is more fully described. Fixes: c288575f7810 ("net: dsa: mt7530: Add the support of MT7531 switch") Signed-off-by: Vladimir Oltean Signed-off-by: Arınç ÜNAL Reviewed-by: Russell King (Oracle) Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit bb1ca506e053f839804a4a7df33d3631d583e14e Author: Arınç ÜNAL Date: Sat Jun 17 09:26:47 2023 +0300 net: dsa: mt7530: fix handling of LLDP frames [ Upstream commit 8332cf6fd7c7087dbc2067115b33979c9851bbc4 ] LLDP frames are link-local frames, therefore they must be trapped to the CPU port. 
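For illustration only, the link-local classification behind this trapping requirement can be sketched in userspace C. IEEE 802.1D reserves the group addresses 01:80:C2:00:00:00 through 01:80:C2:00:00:0F as link-local; BPDUs use the address ending in :00 and LLDP commonly uses the one ending in :0E. The helper and demo names below are hypothetical and are not taken from the mt7530 driver, which programs switch registers rather than inspecting addresses in software.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the mt7530 driver): returns true for the
 * IEEE 802.1D reserved group addresses 01:80:C2:00:00:00..0F.
 */
static bool is_link_local_mac(const uint8_t mac[6])
{
	return mac[0] == 0x01 && mac[1] == 0x80 && mac[2] == 0xc2 &&
	       mac[3] == 0x00 && mac[4] == 0x00 && (mac[5] & 0xf0) == 0x00;
}

/* Self-check covering BPDU, LLDP and an ordinary multicast address. */
static void link_local_demo(void)
{
	const uint8_t bpdu[6] = { 0x01, 0x80, 0xc2, 0x00, 0x00, 0x00 };
	const uint8_t lldp[6] = { 0x01, 0x80, 0xc2, 0x00, 0x00, 0x0e };
	const uint8_t mdns[6] = { 0x01, 0x00, 0x5e, 0x00, 0x00, 0xfb };

	assert(is_link_local_mac(bpdu));   /* STP bridge group address */
	assert(is_link_local_mac(lldp));   /* LLDP nearest-bridge address */
	assert(!is_link_local_mac(mdns));  /* regular IPv4 multicast */
}
```

Frames matching this range must not be forwarded by a bridge, which is why flooding them to user ports is a bug.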
Currently, the MT753X switches treat LLDP frames as regular multicast frames, therefore flooding them to user ports. To fix this, set LLDP frames to be trapped to the CPU port(s). Fixes: b8f126a8d543 ("net-next: dsa: add dsa support for Mediatek MT7530 switch") Signed-off-by: Arınç ÜNAL Reviewed-by: Vladimir Oltean Reviewed-by: Russell King (Oracle) Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 849c5edce672ad5fbd656e53ce26ea2551d87049 Author: Arınç ÜNAL Date: Sat Jun 17 09:26:46 2023 +0300 net: dsa: mt7530: fix handling of BPDUs on MT7530 switch [ Upstream commit d7c66073559386b836bded7cdc8b66ee5c049129 ] BPDUs are link-local frames, therefore they must be trapped to the CPU port. Currently, the MT7530 switch treats BPDUs as regular multicast frames, therefore flooding them to user ports. To fix this, set BPDUs to be trapped to the CPU port. Group this on mt7530_setup() and mt7531_setup_common() into mt753x_trap_frames() and call that. Fixes: b8f126a8d543 ("net-next: dsa: add dsa support for Mediatek MT7530 switch") Signed-off-by: Arınç ÜNAL Reviewed-by: Vladimir Oltean Reviewed-by: Russell King (Oracle) Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 02b2de2e67c33433e5552add9ceca79bab3944fd Author: Arınç ÜNAL Date: Sat Jun 17 09:26:45 2023 +0300 net: dsa: mt7530: fix trapping frames on non-MT7621 SoC MT7530 switch [ Upstream commit 4ae90f90e4909e3014e2dc6a0627964617a7b824 ] All MT7530 switch IP variants share the MT7530_MFC register, but the current driver only writes it for the switch variant that is integrated in the MT7621 SoC. Modify the code to include all MT7530 derivatives. Fixes: b8f126a8d543 ("net-next: dsa: add dsa support for Mediatek MT7530 switch") Suggested-by: Vladimir Oltean Signed-off-by: Arınç ÜNAL Reviewed-by: Vladimir Oltean Reviewed-by: Russell King (Oracle) Reviewed-by: Florian Fainelli Signed-off-by: David S. 
Miller Signed-off-by: Sasha Levin commit 0b35ab59560b72e3f778a14be886def5429f9904 Author: Terin Stock Date: Fri Jun 9 22:58:42 2023 +0200 ipvs: align inner_mac_header for encapsulation [ Upstream commit d7fce52fdf96663ddc2eb21afecff3775588612a ] When using encapsulation the original packet's headers are copied to the inner headers. This preserves the space for an inner mac header, which is not used by the inner payloads for the encapsulation types supported by IPVS. If a packet is using GUE or GRE encapsulation and needs to be segmented, the flow can be passed to __skb_udp_tunnel_segment() which calculates a negative tunnel header length. A negative tunnel header length causes pskb_may_pull() to fail, dropping the packet. This can be observed by attaching probes to ip_vs_in_hook(), __dev_queue_xmit(), and __skb_udp_tunnel_segment():
perf probe --add '__dev_queue_xmit skb->inner_mac_header \
  skb->inner_network_header skb->mac_header skb->network_header'
perf probe --add '__skb_udp_tunnel_segment:7 tnl_hlen'
perf probe -m ip_vs --add 'ip_vs_in_hook skb->inner_mac_header \
  skb->inner_network_header skb->mac_header skb->network_header'
These probes show the headers and tunnel header length for packets which traverse the IPVS encapsulation path. A TCP packet can be forced into the segmentation path by being smaller than a calculated clamped MSS, but larger than the advertised MSS.
probe:ip_vs_in_hook: inner_mac_header=0x0 inner_network_header=0x0 mac_header=0x44 network_header=0x52
probe:ip_vs_in_hook: inner_mac_header=0x44 inner_network_header=0x52 mac_header=0x44 network_header=0x32
probe:dev_queue_xmit: inner_mac_header=0x44 inner_network_header=0x52 mac_header=0x44 network_header=0x32
probe:__skb_udp_tunnel_segment_L7: tnl_hlen=-2
When using veth-based encapsulation, the interfaces are set to be mac-less, which does not preserve space for an inner mac header. This prevents the issue from occurring.
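As a worked check of the probe output above, the negative length can be reproduced with plain arithmetic. This is a sketch under two assumptions that the message does not state: that the tunnel header length is computed as the inner mac header offset minus the outer transport header offset, and that the outer IPv4 header carries no options (20 bytes). The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets relative to skb->head, as printed by the probes. */
struct hdr_offsets {
	uint16_t inner_mac_header;
	uint16_t inner_network_header;
	uint16_t network_header;      /* outer network header */
};

#define OUTER_IPV4_HLEN 20 /* assumption: outer IPv4 header without options */

/* Assumed formula: inner mac header offset minus outer transport offset. */
static int tnl_hlen(const struct hdr_offsets *o)
{
	int transport_header = o->network_header + OUTER_IPV4_HLEN;

	return (int)o->inner_mac_header - transport_header;
}

/* Values taken from the probe output: 0x44 - (0x32 + 20) == -2. */
static void tnl_hlen_demo(void)
{
	const struct hdr_offsets probed = { 0x44, 0x52, 0x32 };
	/* Hypothetical state with the inner mac header moved to 0x52. */
	const struct hdr_offsets aligned = { 0x52, 0x52, 0x32 };

	assert(tnl_hlen(&probed) == -2);
	assert(tnl_hlen(&aligned) == 12);
}
```

With the inner mac header at the same offset as the inner network header, the result becomes 12, which would match an 8-byte UDP header plus a 4-byte GUE header under the same assumptions.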
In our real-world testing of sending a 32KB file we observed operation time increasing from ~75ms for veth-based encapsulation to over 1.5s using IPVS encapsulation due to retries from dropped packets. This changeset modifies the packet on the encapsulation path in ip_vs_tunnel_xmit() and ip_vs_tunnel_xmit_v6() to remove the inner mac header offset. This fixes UDP segmentation for both encapsulation types, and corrects the inner headers for any IPIP flows that may use it. Fixes: 84c0d5e96f3a ("ipvs: allow tunneling with gue encapsulation") Signed-off-by: Terin Stock Acked-by: Julian Anastasov Acked-by: Simon Horman Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit 9f225fc091c97be489d2e3cfc21d278ff0951881 Author: Sergey Shtylyov Date: Sat Jun 17 23:36:22 2023 +0300 mmc: usdhi60rol0: fix deferred probing [ Upstream commit 413db499730248431c1005b392e8ed82c4fa19bf ] The driver overrides the error codes returned by platform_get_irq_byname() to -ENODEV, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating error codes upstream. Fixes: 9ec36cafe43b ("of/irq: do irq resolution in platform_get_irq") Signed-off-by: Sergey Shtylyov Link: https://lore.kernel.org/r/20230617203622.6812-13-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson Signed-off-by: Sasha Levin commit 31a8ccdc721cc548f96de96fc92206758674124b Author: Sergey Shtylyov Date: Sat Jun 17 23:36:20 2023 +0300 mmc: sh_mmcif: fix deferred probing [ Upstream commit 5b067d7f855c61df7f8e2e8ccbcee133c282415e ] The driver overrides the error codes returned by platform_get_irq() to -ENXIO, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. 
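The pattern shared by this series of mmc fixes can be sketched in userspace C. This is a minimal model, not driver code: fake_get_irq() is a hypothetical stand-in for platform_get_irq(), and the K_-prefixed constants mirror the kernel's ENXIO (6) and EPROBE_DEFER (517) values.

```c
#include <assert.h>

#define K_ENXIO        6
#define K_EPROBE_DEFER 517 /* kernel-internal: "try probing again later" */

/*
 * Hypothetical stand-in for platform_get_irq(): returns a positive IRQ
 * number on success or a negative error code on failure.
 */
static int fake_get_irq(int simulated_result)
{
	return simulated_result;
}

/* Old behaviour: every failure is rewritten to one fixed error code. */
static int probe_overriding(int simulated_result)
{
	int irq = fake_get_irq(simulated_result);

	if (irq < 0)
		return -K_ENXIO; /* hides -EPROBE_DEFER from the driver core */
	return 0;
}

/* Fixed behaviour: the original error code is passed on unchanged. */
static int probe_propagating(int simulated_result)
{
	int irq = fake_get_irq(simulated_result);

	if (irq < 0)
		return irq;
	return 0;
}
```

Only the second variant lets the caller see the deferral request and retry the probe later; the first turns a temporary condition into a permanent failure.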
Fixes: 9ec36cafe43b ("of/irq: do irq resolution in platform_get_irq") Signed-off-by: Sergey Shtylyov Link: https://lore.kernel.org/r/20230617203622.6812-11-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson Signed-off-by: Sasha Levin commit 27cfa6e3398d3d13bbd7d0f32f81e9990dd691df Author: Sergey Shtylyov Date: Sat Jun 17 23:36:18 2023 +0300 mmc: sdhci-acpi: fix deferred probing [ Upstream commit b465dea5e1540c7d7b5211adaf94926980d3014b ] The driver overrides the error codes returned by platform_get_irq() to -EINVAL, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. Fixes: 1b7ba57ecc86 ("mmc: sdhci-acpi: Handle return value of platform_get_irq") Signed-off-by: Sergey Shtylyov Acked-by: Adrian Hunter Link: https://lore.kernel.org/r/20230617203622.6812-9-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson Signed-off-by: Sasha Levin commit 4a90e1cea1c59a2f5d02ebbb04519e33147fd37c Author: Sergey Shtylyov Date: Sat Jun 17 23:36:17 2023 +0300 mmc: owl: fix deferred probing [ Upstream commit 3c482e1e830d79b9be8afb900a965135c01f7893 ] The driver overrides the error codes returned by platform_get_irq() to -EINVAL, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. Fixes: ff65ffe46d28 ("mmc: Add Actions Semi Owl SoCs SD/MMC driver") Signed-off-by: Sergey Shtylyov Link: https://lore.kernel.org/r/20230617203622.6812-8-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson Signed-off-by: Sasha Levin commit 46a002c502dca82fbde482a444aae3aa11de1c37 Author: Sergey Shtylyov Date: Sat Jun 17 23:36:16 2023 +0300 mmc: omap_hsmmc: fix deferred probing [ Upstream commit fb51b74a57859b707c3e8055ed0c25a7ca4f6a29 ] The driver overrides the error codes returned by platform_get_irq() to -ENXIO, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. 
Switch to propagating the error codes upstream. Fixes: 9ec36cafe43b ("of/irq: do irq resolution in platform_get_irq") Signed-off-by: Sergey Shtylyov Link: https://lore.kernel.org/r/20230617203622.6812-7-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson Signed-off-by: Sasha Levin commit 14dac63fb8f7130f72be2157bb0cc0b9c7074a56 Author: Sergey Shtylyov Date: Sat Jun 17 23:36:15 2023 +0300 mmc: omap: fix deferred probing [ Upstream commit aedf4ba1ad00aaa94c1b66c73ecaae95e2564b95 ] The driver overrides the error codes returned by platform_get_irq() to -ENXIO, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. Fixes: 9ec36cafe43b ("of/irq: do irq resolution in platform_get_irq") Signed-off-by: Sergey Shtylyov Link: https://lore.kernel.org/r/20230617203622.6812-6-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson Signed-off-by: Sasha Levin commit 9bef5a4d10e6a53cd4f5738942955043a8b189f8 Author: Sergey Shtylyov Date: Sat Jun 17 23:36:14 2023 +0300 mmc: mvsdio: fix deferred probing [ Upstream commit 8d84064da0d4672e74f984e8710f27881137472c ] The driver overrides the error codes returned by platform_get_irq() to -ENXIO, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. Fixes: 9ec36cafe43b ("of/irq: do irq resolution in platform_get_irq") Signed-off-by: Sergey Shtylyov Link: https://lore.kernel.org/r/20230617203622.6812-5-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson Signed-off-by: Sasha Levin commit dc9cb797426479bc1ba300845866a1902ca3cf00 Author: Sergey Shtylyov Date: Sat Jun 17 23:36:13 2023 +0300 mmc: mtk-sd: fix deferred probing [ Upstream commit 0c4dc0f054891a2cbde0426b0c0fdf232d89f47f ] The driver overrides the error codes returned by platform_get_irq() to -EINVAL, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. 
Switch to propagating the error codes upstream. Fixes: 208489032bdd ("mmc: mediatek: Add Mediatek MMC driver") Signed-off-by: Sergey Shtylyov Link: https://lore.kernel.org/r/20230617203622.6812-4-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson Signed-off-by: Sasha Levin commit 6c9f30f23c5f440add2d032f1f0effa7dac1c79d Author: Stefan Wahren Date: Wed Jun 14 23:06:56 2023 +0200 net: qca_spi: Avoid high load if QCA7000 is not available [ Upstream commit 92717c2356cb62c89e8a3dc37cbbab2502562524 ] In case the QCA7000 is not available via SPI (e.g. in reset), the driver will cause a high load. The reason for this is that the synchronization is never finished and schedule() is never called. Since the synchronization is not timing critical, it's safe to drop this from the scheduling condition. Signed-off-by: Stefan Wahren Fixes: 291ab06ecf67 ("net: qualcomm: new Ethernet over SPI driver for QCA7000") Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit f6be9b10ba4e333d55cd5aa164cb3d8b887af76b Author: Íñigo Huguet Date: Thu Jun 15 10:49:29 2023 +0200 sfc: use budget for TX completions [ Upstream commit 4aaf2c52834b7f95acdf9fb0211a1b60adbf421b ] When running workloads heavily unbalanced towards TX (high TX, low RX traffic), the sfc driver can hold the CPU for too long. Although in many cases this is not enough to be visible, it can affect performance and system responsiveness. A way to reproduce it is to use a debug kernel and run some parallel netperf TX tests. On some systems, this will lead to this message being logged: kernel:watchdog: BUG: soft lockup - CPU#12 stuck for 22s! The reason is that the sfc driver doesn't account any NAPI budget for the TX completion events work. With high-TX/low-RX traffic, this causes the CPU to be held for a long time in NAPI poll. The documentation says "drivers can process completions for any number of Tx packets but should only process up to budget number of Rx packets".
However, many drivers do limit the amount of TX completions that they process in a single NAPI poll. In the same way, this patch adds a limit for the TX work in sfc. With the patch applied, the watchdog warning never appears. Tested with netperf in different combinations: single process / parallel processes, TCP / UDP and different sizes of UDP messages. Repeated the tests before and after the patch, without any noticeable difference in network or CPU performance. Test hardware: Intel(R) Xeon(R) CPU E5-1620 v4 @ 3.50GHz (4 cores, 2 threads/core) Solarflare Communications XtremeScale X2522-25G Network Adapter Fixes: 5227ecccea2d ("sfc: remove tx and MCDI handling from NAPI budget consideration") Fixes: d19a53721863 ("sfc_ef100: TX path for EF100 NICs") Reported-by: Fei Liu Signed-off-by: Íñigo Huguet Acked-by: Martin Habets Link: https://lore.kernel.org/r/20230615084929.10506-1-ihuguet@redhat.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit bf7a5d954f862ae267f02afe8ff339775d00167d Author: Yevgeny Kliteynik Date: Sun Jun 4 21:07:04 2023 +0300 net/mlx5: DR, Fix wrong action data allocation in decap action [ Upstream commit ef4c5afc783dc3d47640270a9b94713229c697e8 ] When TUNNEL_L3_TO_L2 decap action was created, a pointer to a local variable was passed as its HW action data, resulting in attempt to free invalid address: BUG: KASAN: invalid-free in mlx5dr_action_destroy+0x318/0x410 [mlx5_core] Fixes: 4781df92f4da ("net/mlx5: DR, Move STEv0 modify header logic") Signed-off-by: Yevgeny Kliteynik Reviewed-by: Alex Vesker Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 3cb0d1406c41bdb7abedbc2ff090e373e754ac12 Author: Sebastian Andrzej Siewior Date: Wed Jun 14 12:02:02 2023 +0200 xfrm: Linearize the skb after offloading if needed. 
[ Upstream commit f015b900bc3285322029b4a7d132d6aeb0e51857 ] With offloading enabled, esp_xmit() gets invoked very late, from within validate_xmit_xfrm() which is after validate_xmit_skb() validates and linearizes the skb if the underlying device does not support fragments. esp_output_tail() may add a fragment to the skb while adding the auth tag/ IV. Devices without the proper support will then send skb->data points to with the correct length so the packet will have garbage at the end. A pcap sniffer will claim that the proper data has been sent since it parses the skb properly. It is not affected with INET_ESP_OFFLOAD disabled. Linearize the skb after offloading if the sending hardware requires it. It was tested on v4, v6 has been adopted. Fixes: 7785bba299a8d ("esp: Add a software GRO codepath") Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Steffen Klassert Signed-off-by: Sasha Levin commit 7c8c796b308cff9337f1ab9d860125f774f00746 Author: Magali Lemes Date: Tue Jun 13 09:32:22 2023 -0300 selftests: net: fcnal-test: check if FIPS mode is enabled [ Upstream commit d7a2fc1437f71cb058c7b11bc33dfc19e4bf277a ] There are some MD5 tests which fail when the kernel is in FIPS mode, since MD5 is not FIPS compliant. Add a check and only run those tests if FIPS mode is not enabled. Fixes: f0bee1ebb5594 ("fcnal-test: Add TCP MD5 tests") Fixes: 5cad8bce26e01 ("fcnal-test: Add TCP MD5 tests for VRF") Reviewed-by: David Ahern Signed-off-by: Magali Lemes Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 74eba3fddf996008944a7d4a9a34d9a8937fddbf Author: Magali Lemes Date: Tue Jun 13 09:32:21 2023 -0300 selftests: net: vrf-xfrm-tests: change authentication and encryption algos [ Upstream commit cb43c60e64ca67fcc9d23bd08f51d2ab8209d9d7 ] The vrf-xfrm-tests tests use the hmac(md5) and cbc(des3_ede) algorithms for performing authentication and encryption, respectively. 
This causes the tests to fail when fips=1 is set, since these algorithms are not allowed in FIPS mode. Therefore, switch from hmac(md5) and cbc(des3_ede) to hmac(sha1) and cbc(aes), which are FIPS compliant. Fixes: 3f251d741150 ("selftests: Add tests for vrf and xfrms") Reviewed-by: David Ahern Signed-off-by: Magali Lemes Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit c7632b61aa2a830f99610fc1a20a2cb4d8291fc5 Author: Magali Lemes Date: Tue Jun 13 09:32:20 2023 -0300 selftests: net: tls: check if FIPS mode is enabled [ Upstream commit d113c395c67b62fc0d3f2004c0afc406aca0a2b7 ] TLS selftests use the ChaCha20-Poly1305 and SM4 algorithms, which are not FIPS compliant. When fips=1, this set of tests fails. Add a check and only run these tests if not in FIPS mode. Fixes: 4f336e88a870 ("selftests/tls: add CHACHA20-POLY1305 to tls selftests") Fixes: e506342a03c7 ("selftests/tls: add SM4 GCM/CCM to tls selftests") Reviewed-by: Jakub Kicinski Signed-off-by: Magali Lemes Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit d8cb7c824abc0bafa9e35cbda66511c1f2444c4f Author: Yonghong Song Date: Thu Jun 8 17:54:39 2023 -0700 bpf: Fix a bpf_jit_dump issue for x86_64 with sysctl bpf_jit_enable. [ Upstream commit ad96f1c9138e0897bee7f7c5e54b3e24f8b62f57 ] The sysctl net/core/bpf_jit_enable does not work now due to commit 1022a5498f6f ("bpf, x86_64: Use bpf_jit_binary_pack_alloc"). The commit saved the jitted insns into 'rw_image' instead of 'image' which caused bpf_jit_dump not dumping proper content. With 'echo 2 > /proc/sys/net/core/bpf_jit_enable', run './test_progs -t fentry_test'. 
Without this patch, the jitted image for one particular prog is:
flen=17 proglen=92 pass=4 image=0000000014c64883 from=test_progs pid=1807
00000000: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
00000010: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
00000020: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
00000030: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
00000040: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
00000050: cc cc cc cc cc cc cc cc cc cc cc cc
With this patch, the jitted image for the same prog is:
flen=17 proglen=92 pass=4 image=00000000b90254b7 from=test_progs pid=1809
00000000: f3 0f 1e fa 0f 1f 44 00 00 66 90 55 48 89 e5 f3
00000010: 0f 1e fa 31 f6 48 8b 57 00 48 83 fa 07 75 2b 48
00000020: 8b 57 10 83 fa 09 75 22 48 8b 57 08 48 81 e2 ff
00000030: 00 00 00 48 83 fa 08 75 11 48 8b 7f 18 be 01 00
00000040: 00 00 48 83 ff 0a 74 02 31 f6 48 bf 18 d0 14 00
00000050: 00 c9 ff ff 48 89 77 00 31 c0 c9 c3
Fixes: 1022a5498f6f ("bpf, x86_64: Use bpf_jit_binary_pack_alloc") Signed-off-by: Yonghong Song Signed-off-by: Daniel Borkmann Acked-by: Song Liu Link: https://lore.kernel.org/bpf/20230609005439.3173569-1-yhs@fb.com Signed-off-by: Sasha Levin commit 21ae0f8f1fec752c8224030a2a54d20df43662f4 Author: Maciej Żenczykowski Date: Mon Jun 5 04:06:54 2023 -0700 xfrm: fix inbound ipv4/udp/esp packets to UDPv6 dualstack sockets [ Upstream commit 1166a530a84758bb9e6b448fc8c195ed413f5ded ] Before Linux v5.8 an AF_INET6 SOCK_DGRAM (udp/udplite) socket with SOL_UDP, UDP_ENCAP, UDP_ENCAP_ESPINUDP{,_NON_IKE} enabled would just unconditionally use xfrm4_udp_encap_rcv(), afterwards such a socket would use the newly added xfrm6_udp_encap_rcv() which only handles IPv6 packets.
Cc: Sabrina Dubroca Cc: Steffen Klassert Cc: Jakub Kicinski Cc: Benedict Wong Cc: Yan Yan Fixes: 0146dca70b87 ("xfrm: add support for UDPv6 encapsulation of ESP") Signed-off-by: Maciej Żenczykowski Reviewed-by: Simon Horman Reviewed-by: Sabrina Dubroca Signed-off-by: Steffen Klassert Signed-off-by: Sasha Levin commit 36ee561b59990acae0acceaf6563020b5bc9f8c1 Author: Maxim Mikityanskiy Date: Wed Jun 7 15:39:50 2023 +0300 bpf: Fix verifier id tracking of scalars on spill [ Upstream commit 713274f1f2c896d37017efee333fd44149710119 ] The following scenario describes a bug in the verifier where it incorrectly concludes about equivalent scalar IDs which could lead to verifier bypass in privileged mode: 1. Prepare a 32-bit rogue number. 2. Put the rogue number into the upper half of a 64-bit register, and roll a random (unknown to the verifier) bit in the lower half. The rest of the bits should be zero (although variations are possible). 3. Assign an ID to the register by MOVing it to another arbitrary register. 4. Perform a 32-bit spill of the register, then perform a 32-bit fill to another register. Due to a bug in the verifier, the ID will be preserved, although the new register will contain only the lower 32 bits, i.e. all zeros except one random bit. At this point there are two registers with different values but the same ID, which means the integrity of the verifier state has been corrupted. 5. Compare the new 32-bit register with 0. In the branch where it's equal to 0, the verifier will believe that the original 64-bit register is also 0, because it has the same ID, but its actual value still contains the rogue number in the upper half. Some optimizations of the verifier prevent the actual bypass, so extra care is needed: the comparison must be between two registers, and both branches must be reachable (this is why one random bit is needed). Both branches are still suitable for the bypass. 6. Right shift the original register by 32 bits to pop the rogue number. 7. 
Use the rogue number as an offset with any pointer. The verifier will believe that the offset is 0, while in reality it's the given number. The fix is similar to the 32-bit BPF_MOV handling in check_alu_op for SCALAR_VALUE. If the spill is narrowing the actual register value, don't keep the ID; make sure it's reset to 0. Fixes: 354e8f1970f8 ("bpf: Support <8-byte scalar spill and refill") Signed-off-by: Maxim Mikityanskiy Signed-off-by: Daniel Borkmann Tested-by: Andrii Nakryiko # Checked veristat delta Acked-by: Yonghong Song Link: https://lore.kernel.org/bpf/20230607123951.558971-2-maxtram95@gmail.com Signed-off-by: Sasha Levin commit 35395e31e5588cbf01eeffaeb9a764d52229c9f7 Author: Leon Romanovsky Date: Mon Jun 5 10:36:15 2023 +0300 xfrm: add missed call to delete offloaded policies [ Upstream commit bf06fcf4be0feefebd27deb8b60ad262f4230489 ] Offloaded policies are deleted through two flows: netdev going down and policy flush. In both cases, the code lacks the relevant call to delete the offloaded policy. Fixes: 919e43fad516 ("xfrm: add an interface to offload policy") Signed-off-by: Leon Romanovsky Reviewed-by: Simon Horman Signed-off-by: Steffen Klassert Signed-off-by: Sasha Levin commit f31ba824c25281d3325d600beecb486123739e0b Author: Reiji Watanabe Date: Fri Jun 2 19:50:34 2023 -0700 KVM: arm64: PMU: Restore the host's PMUSERENR_EL0 [ Upstream commit 8681f71759010503892f9e3ddb05f65c0f21b690 ] Restore the host's PMUSERENR_EL0 value instead of clearing it, before returning back to userspace, as the host's EL0 might have direct access to PMU registers (some bits of PMUSERENR_EL0 might not be zero for the host EL0).
Fixes: 83a7a4d643d3 ("arm64: perf: Enable PMU counter userspace access for perf event") Signed-off-by: Reiji Watanabe Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20230603025035.3781797-2-reijiw@google.com Signed-off-by: Sasha Levin commit 02252d75a92c14f26b86cf73dfb7c0c6ebdd613b Author: Benedict Wong Date: Wed May 10 01:30:22 2023 +0000 xfrm: Ensure policies always checked on XFRM-I input path [ Upstream commit a287f5b0cfc6804c5b12a4be13c7c9fe27869e90 ] This change adds methods in the XFRM-I input path that ensure that policies are checked prior to processing of the subsequent decapsulated packet, after which the relevant policies may no longer be resolvable (due to changing src/dst/proto/etc). Notably, raw ESP/AH packets did not perform policy checks inherently, whereas all other encapsulated packets (UDP, TCP encapsulated) do policy checks after calling xfrm_input handling in the respective encapsulation layer. Fixes: b0355dbbf13c ("Fix XFRM-I support for nested ESP tunnels") Test: Verified with additional Android Kernel Unit tests Test: Verified against Android CTS Signed-off-by: Benedict Wong Signed-off-by: Steffen Klassert Signed-off-by: Sasha Levin commit 74dcf228bba43aaa2ff6fcdaed69edd4b4829883 Author: Benedict Wong Date: Wed May 10 01:30:21 2023 +0000 xfrm: Treat already-verified secpath entries as optional [ Upstream commit 1f8b6df6a997a430b0c48b504638154b520781ad ] This change allows inbound traffic through nested IPsec tunnels to successfully match policies and templates, while retaining the secpath stack trace as necessary for netfilter policies. Specifically, this patch marks secpath entries that have already matched against a relevant policy as having been verified, allowing them to be treated as optional and skipped after a tunnel decapsulation (during which the src/dst/proto/etc may have changed, and the correct policy chain may no longer be resolvable).
This approach is taken as opposed to the iteration in b0355dbbf13c, where the secpath was cleared, since that breaks subsequent validations that rely on the existence of the secpath entries (netfilter policies, or transport-in-tunnel mode, where policies remain resolvable). Fixes: b0355dbbf13c ("Fix XFRM-I support for nested ESP tunnels") Test: Tested against Android Kernel Unit Tests Test: Tested against Android CTS Signed-off-by: Benedict Wong Signed-off-by: Steffen Klassert Signed-off-by: Sasha Levin commit e43c93ede27ad9c13653315092943d271a487573 Author: Chen Aotian Date: Sun Apr 9 10:20:48 2023 +0800 ieee802154: hwsim: Fix possible memory leaks [ Upstream commit a61675294735570daca3779bd1dbb3715f7232bd ] After replacing e->info, it is necessary to free the old einfo. Fixes: f25da51fdc38 ("ieee802154: hwsim: add replacement for fakelb") Reviewed-by: Miquel Raynal Reviewed-by: Alexander Aring Signed-off-by: Chen Aotian Link: https://lore.kernel.org/r/20230409022048.61223-1-chenaotian2@163.com Signed-off-by: Stefan Schmidt Signed-off-by: Sasha Levin commit 75a25a84444c1ed584d9d1da359327f29d19e633 Author: Lee Jones Date: Wed Jun 14 17:38:54 2023 +0100 x86/mm: Avoid using set_pgd() outside of real PGD pages commit d082d48737c75d2b3cc1f972b8c8674c25131534 upstream. KPTI keeps around two PGDs: one for userspace and another for the kernel. Among other things, set_pgd() contains infrastructure to ensure that updates to the kernel PGD are reflected in the user PGD as well. One side-effect of this is that set_pgd() expects to be passed whole pages. Unfortunately, init_trampoline_kaslr() passes in a single entry: 'trampoline_pgd_entry'. When KPTI is on, set_pgd() will update 'trampoline_pgd_entry' (an 8-Byte globally stored [.bss] variable) and will then proceed to replicate that value into the non-existent neighboring user page (located +4k away), leading to the corruption of other global [.bss] stored variables. 
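The corruption described above can be illustrated with a toy userspace model. It is only a sketch of the effect as the message describes it (a second write one page beyond the entry); the real set_pgd() derives the user PGD address differently and applies further conditions. All names below are hypothetical, and the array merely plays the role of .bss.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE  4096
#define WORDS_PER_PAGE (TOY_PAGE_SIZE / sizeof(uint64_t))

/*
 * Toy stand-in for .bss: slot 0 plays the role of 'trampoline_pgd_entry',
 * everything after it represents unrelated global variables.
 */
static uint64_t toy_bss[2 * WORDS_PER_PAGE];

static void toy_bss_reset(void)
{
	memset(toy_bss, 0, sizeof(toy_bss));
}

/* Models the mirrored update: the value is also written one page further. */
static void toy_set_pgd_mirrored(uint64_t *pgdp, uint64_t val)
{
	*pgdp = val;
	pgdp[WORDS_PER_PAGE] = val; /* lands on an unrelated variable */
}

/* Models the fix: a plain assignment touches only the entry itself. */
static void toy_set_pgd_direct(uint64_t *pgdp, uint64_t val)
{
	*pgdp = val;
}
```

In this model the mirrored variant changes toy_bss[WORDS_PER_PAGE] as a side effect, while the direct assignment leaves everything except slot 0 untouched.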
Fix it by directly assigning 'trampoline_pgd_entry' and avoiding set_pgd(). [ dhansen: tweak subject and changelog ] Fixes: 0925dda5962e ("x86/mm/KASLR: Use only one PUD entry for real mode trampoline") Suggested-by: Dave Hansen Signed-off-by: Lee Jones Signed-off-by: Dave Hansen Cc: Link: https://lore.kernel.org/all/20230614163859.924309-1-lee@kernel.org/g Signed-off-by: Greg Kroah-Hartman commit ecc72019f13da7e2217a0cf0ee805785ab5fa374 Author: Jens Axboe Date: Sat Jun 17 19:50:24 2023 -0600 io_uring/poll: serialize poll linked timer start with poll removal Commit ef7dfac51d8ed961b742218f526bd589f3900a59 upstream. We selectively grab the ctx->uring_lock for poll update/removal, but we really should grab it from the start to fully synchronize with linked timeouts. Normally this is indeed the case, but if requests are forced async by the application, we don't fully cover removal and timer disarm within the uring_lock. Make this simpler by having consistent locking state for poll removal. Cc: stable@vger.kernel.org # 6.1+ Reported-by: Querijn Voet Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit d9376472840e73c3221c770d57e2404a54ff3caf Author: Ming Lei Date: Thu Jun 22 16:42:49 2023 +0800 block: make sure local irq is disabled when calling __blkcg_rstat_flush commit 9c39b7a905d84b7da5f59d80f2e455853fea7217 upstream. When __blkcg_rstat_flush() is called from cgroup_rstat_flush*() code path, interrupt is always disabled. When we start to flush blkcg per-cpu stats list in __blkg_release() for avoiding to leak blkcg_gq's reference in commit 20cb1c2fb756 ("blk-cgroup: Flush stats before releasing blkcg_gq"), local irq isn't disabled yet, then lockdep warning may be triggered because the dependent cgroup locks may be acquired from irq(soft irq) handler. Fix the issue by disabling local irq always. 
Fixes: 20cb1c2fb756 ("blk-cgroup: Flush stats before releasing blkcg_gq") Reported-by: Shinichiro Kawasaki Closes: https://lore.kernel.org/linux-block/pz2wzwnmn5tk3pwpskmjhli6g3qly7eoknilb26of376c7kwxy@qydzpvt6zpis/T/#u Cc: stable@vger.kernel.org Cc: Jay Shin Cc: Tejun Heo Cc: Waiman Long Signed-off-by: Ming Lei Reviewed-by: Waiman Long Link: https://lore.kernel.org/r/20230622084249.1208005-1-ming.lei@redhat.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 5cca00b2af474e83f457382211a9e55a8ea3c65b Author: Andrew Powers-Holmes Date: Thu Jun 1 15:25:16 2023 +0200 arm64: dts: rockchip: Fix rk356x PCIe register and range mappings commit 568a67e742dfa90b19a23305317164c5c350b71e upstream. The register and range mappings for the PCIe controller in Rockchip's RK356x SoCs are incorrect. Replace them with corrected values from the vendor BSP sources, updated to match current DT schema. These values are also used in u-boot. Fixes: 66b51ea7d70f ("arm64: dts: rockchip: Add rk3568 PCIe2x1 controller") Cc: stable@vger.kernel.org Signed-off-by: Andrew Powers-Holmes Signed-off-by: Jonas Karlman Signed-off-by: Nicolas Frattaroli Tested-by: Diederik de Haas Link: https://lore.kernel.org/r/20230601132516.153934-1-frattaroli.nicolas@gmail.com Signed-off-by: Heiko Stuebner Signed-off-by: Greg Kroah-Hartman commit 761bd068e59c1af0b1a88b9d7be33b3d0dc3a62f Author: Namjae Jeon Date: Thu Jun 15 15:56:32 2023 +0900 ksmbd: add mnt_want_write to ksmbd vfs functions [ Upstream commit 40b268d384a22276dca1450549f53eed60e21deb ] ksmbd performs write access using vfs helpers. There are cases where mnt_want_write() is not called in the vfs helper. This patch adds the missing mnt_want_write() calls to the ksmbd vfs functions.
Cc: stable@vger.kernel.org Cc: Amir Goldstein Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Sasha Levin commit ef6cc1c465786b3e4e735d1ab8b616b3e94402df Author: Namjae Jeon Date: Fri Apr 21 16:09:01 2023 +0900 ksmbd: fix racy issue from using ->d_parent and ->d_name [ Upstream commit 74d7970febf7e9005375aeda0df821d2edffc9f7 ] Al pointed out that ksmbd has a race caused by using ->d_parent and ->d_name in ksmbd_vfs_unlink() and smb2_vfs_rename(). Use the new lock_rename_child() to lock a stable parent while the underlying rename may be racing. Introduce the vfs_path_parent_lookup() helper to avoid access outside the share, and export vfs functions such as the following so that vfs_path_parent_lookup() can be used. - rename __lookup_hash() to lookup_one_qstr_excl(). - export lookup_one_qstr_excl(). - export getname_kernel() and putname(). vfs_path_parent_lookup() is used for the parent lookup of the destination file, using the absolute pathname given in the FILE_RENAME_INFORMATION request. Signed-off-by: Namjae Jeon Signed-off-by: Steve French Stable-dep-of: 40b268d384a2 ("ksmbd: add mnt_want_write to ksmbd vfs functions") Signed-off-by: Sasha Levin commit e21e30d954a20af41f29ea8d448247a6ac956bf0 Author: Al Viro Date: Thu Mar 16 07:34:34 2023 +0900 fs: introduce lock_rename_child() helper [ Upstream commit 9bc37e04823b5280dd0f22b6680fc23fe81ca325 ] Pass the dentry of a source file and the dentry of a destination directory to lock parent inodes for rename. As soon as this function returns, ->d_parent of the source file dentry is stable and inodes are properly locked for calling vfs_rename(). This helper is needed for the ksmbd server: a rename request in the SMB protocol has to rename an opened file, no matter which directory it's in.
Signed-off-by: Al Viro Signed-off-by: Namjae Jeon Signed-off-by: Al Viro Stable-dep-of: 40b268d384a2 ("ksmbd: add mnt_want_write to ksmbd vfs functions") Signed-off-by: Sasha Levin commit 2b33fd52e89ea55b3ac55adf44f5eba276375b18 Author: Namjae Jeon Date: Thu Mar 16 07:34:33 2023 +0900 ksmbd: remove internal.h include [ Upstream commit 211db0ac9e3dc6c46f2dd53395b34d76af929faf ] Since vfs_path_lookup is exported, It should not be internal. Move vfs_path_lookup prototype in internal.h to linux/namei.h. Suggested-by: Al Viro Reviewed-by: Christian Brauner Signed-off-by: Namjae Jeon Signed-off-by: Al Viro Stable-dep-of: 40b268d384a2 ("ksmbd: add mnt_want_write to ksmbd vfs functions") Signed-off-by: Sasha Levin commit 4940711ffb29ceee7da49fb70c2c51df0bb7a36a Author: Russ Weight Date: Tue Jun 20 13:28:24 2023 -0700 regmap: spi-avmm: Fix regmap_bus max_raw_write [ Upstream commit c8e796895e2310b6130e7577248da1d771431a77 ] The max_raw_write member of the regmap_spi_avmm_bus structure is defined as: .max_raw_write = SPI_AVMM_VAL_SIZE * MAX_WRITE_CNT SPI_AVMM_VAL_SIZE == 4 and MAX_WRITE_CNT == 1 so this results in a maximum write transfer size of 4 bytes which provides only enough space to transfer the address of the target register. It provides no space for the value to be transferred. 
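The arithmetic above can be sketched in plain userspace C. This is only an illustration, not the regmap core: payload_room() and regs_per_chunk() are hypothetical helpers, SPI_AVMM_REG_SIZE is assumed here to be 4 bytes, and the sketch assumes that a chunking routine first subtracts the register-address bytes from max_raw_write and then works out how many values fit in what is left.

```c
#include <stddef.h>

/* Values quoted in the commit message. */
#define SPI_AVMM_VAL_SIZE 4
#define MAX_WRITE_CNT     1
/* Assumption for this sketch: the register address is 4 bytes wide. */
#define SPI_AVMM_REG_SIZE 4

/* Hypothetical helper: bytes left for values once the address is accounted for. */
static size_t payload_room(size_t max_raw_write, size_t reg_bytes)
{
	return max_raw_write > reg_bytes ? max_raw_write - reg_bytes : 0;
}

/* Hypothetical helper: how many whole values fit in a single transfer. */
static size_t regs_per_chunk(size_t max_raw_write, size_t reg_bytes,
			     size_t val_bytes)
{
	return payload_room(max_raw_write, reg_bytes) / val_bytes;
}
```

With the old limit (SPI_AVMM_VAL_SIZE * MAX_WRITE_CNT) the address consumes all 4 bytes, so regs_per_chunk() yields 0 and any later division by that count would fault. With SPI_AVMM_REG_SIZE added to the limit, exactly one value fits.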
This bug became an issue (divide-by-zero in _regmap_raw_write()) after the following was accepted into mainline: commit 3981514180c9 ("regmap: Account for register length when chunking") Change max_raw_write to include space (4 additional bytes) for both the register address and value: .max_raw_write = SPI_AVMM_REG_SIZE + SPI_AVMM_VAL_SIZE * MAX_WRITE_CNT Fixes: 7f9fb67358a2 ("regmap: add Intel SPI Slave to AVMM Bus Bridge support") Reviewed-by: Matthew Gerlach Signed-off-by: Russ Weight Link: https://lore.kernel.org/r/20230620202824.380313-1-russell.h.weight@intel.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 2b1ffbbfccc3cfd212f4ed55c5698c234a9b6e7f Author: Teresa Remmet Date: Wed Jun 14 14:52:40 2023 +0200 regulator: pca9450: Fix LDO3OUT and LDO4OUT MASK [ Upstream commit 7257d930aadcd62d1c7971ab14f3b1126356abdc ] L3_OUT and L4_OUT Bit fields range from Bit 0:4 and thus the mask should be 0x1F instead of 0x0F. Fixes: 0935ff5f1f0a ("regulator: pca9450: add pca9450 pmic driver") Signed-off-by: Teresa Remmet Reviewed-by: Frieder Schrempf Link: https://lore.kernel.org/r/20230614125240.3946519-1-t.remmet@phytec.de Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit a9eea081e111dcc6f035a6ae5080de8ffc633f41 Author: Neil Armstrong Date: Thu Jun 15 14:51:45 2023 +0200 spi: spi-geni-qcom: correctly handle -EPROBE_DEFER from dma_request_chan() [ Upstream commit 9d7054fb3ac2e8d252aae1268f20623f244e644f ] Now spi_geni_grab_gpi_chan() errors are correctly reported, the -EPROBE_DEFER error should be returned from probe in case the GPI dma driver is built as module and/or not probed yet. 
Fixes: b59c122484ec ("spi: spi-geni-qcom: Add support for GPI dma") Fixes: 6532582c353f ("spi: spi-geni-qcom: fix error handling in spi_geni_grab_gpi_chan()") Signed-off-by: Neil Armstrong Link: https://lore.kernel.org/r/20230615-topic-sm8550-upstream-fix-spi-geni-qcom-probe-v2-1-670c3d9e8c9c@linaro.org Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit c12e9e60f234d8c51ca812cee1201f56434797dc Author: Mukesh Sisodiya Date: Mon Jun 19 17:02:34 2023 +0200 wifi: iwlwifi: pcie: Handle SO-F device for PCI id 0x7AF0 commit 4e9f0ec38852c18faa9689322e758575af33e5d4 upstream. Add support for AX1690i and AX1690s devices with PCIE id 0x7AF0. Cc: stable@vger.kernel.org # 6.1+ Signed-off-by: Mukesh Sisodiya Signed-off-by: Gregory Greenman Signed-off-by: Johannes Berg Link: https://lore.kernel.org/r/20230619150233.461290-2-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 6aaa750ca64dd4b271afcd876d26d2d5c664c8f1 Author: Krister Johansen Date: Mon Jun 12 17:44:40 2023 -0700 bpf: ensure main program has an extable commit 0108a4e9f3584a7a2c026d1601b0682ff7335d95 upstream. When subprograms are in use, the main program is not jit'd after the subprograms because jit_subprogs sets a value for prog->bpf_func upon success. Subsequent calls to the JIT are bypassed when this value is non-NULL. This leads to a situation where the main program and its func[0] counterpart are both in the bpf kallsyms tree, but only func[0] has an extable. Extables are only created during JIT. Now there are two nearly identical program ksym entries in the tree, but only one has an extable. Depending upon how the entries are placed, there's a chance that a fault will call search_extable on the aux with the NULL entry. Since jit_subprogs already copies state from func[0] to the main program, include the extable pointer in this state duplication. Additionally, ensure that the copy of the main program in func[0] is not added to the bpf_prog_kallsyms table. 
Instead, let the main program get added later in bpf_prog_load(). This ensures there is only a single copy of the main program in the kallsyms table, and that its tag matches the tag observed by tooling like bpftool. Cc: stable@vger.kernel.org Fixes: 1c2a088a6626 ("bpf: x64: add JIT support for multi-function programs") Signed-off-by: Krister Johansen Acked-by: Yonghong Song Acked-by: Ilya Leoshkevich Tested-by: Ilya Leoshkevich Link: https://lore.kernel.org/r/6de9b2f4b4724ef56efbb0339daaa66c8b68b1e7.1686616663.git.kjlx@templeofstupid.com Signed-off-by: Alexei Starovoitov Signed-off-by: Greg Kroah-Hartman commit 857e7a5a661c42068ceef1d32ee7d0c664f52149 Author: Sergey Shtylyov Date: Sat Jun 17 23:36:12 2023 +0300 mmc: meson-gx: fix deferred probing commit b8ada54fa1b83f3b6480d4cced71354301750153 upstream. The driver overrides the error codes and IRQ0 returned by platform_get_irq() to -EINVAL, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. Since commit ce753ad1549c ("platform: finally disallow IRQ0 in platform_get_irq() and its ilk") IRQ0 is no longer returned by those APIs, so we now can safely ignore it... Fixes: cbcaac6d7dd2 ("mmc: meson-gx-mmc: Fix platform_get_irq's error checking") Cc: stable@vger.kernel.org # v5.19+ Signed-off-by: Sergey Shtylyov Reviewed-by: Neil Armstrong Link: https://lore.kernel.org/r/20230617203622.6812-3-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit cf030a577fe0dc4a1a038e941591b184b2fbc0d1 Author: Sergey Shtylyov Date: Sat Jun 17 23:36:21 2023 +0300 mmc: sunxi: fix deferred probing commit c2df53c5806cfd746dae08e07bc8c4ad247c3b70 upstream. The driver overrides the error codes and IRQ0 returned by platform_get_irq() to -EINVAL, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. 
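The difference between overriding and propagating the error code can be sketched as follows. This is a userspace illustration under stated assumptions, not the driver code: probe_irq_old() and probe_irq_new() are hypothetical names, the irq argument stands in for whatever platform_get_irq() returned, and EPROBE_DEFER is defined locally because userspace errno.h does not provide it (517 is assumed to mirror the kernel-internal value).

```c
#include <errno.h>

/* Assumption: mirrors the kernel-internal value; not defined in userspace. */
#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517
#endif

/* Old pattern: every failure (and IRQ0) is flattened to -EINVAL. */
static int probe_irq_old(int irq)
{
	if (irq <= 0)
		return -EINVAL;
	return 0;
}

/* New pattern: a negative error code is passed through unchanged. */
static int probe_irq_new(int irq)
{
	if (irq < 0)
		return irq;
	return 0;
}
```

Fed -EPROBE_DEFER, the old pattern reports -EINVAL, which a caller would treat as a permanent failure, while the new pattern hands back -EPROBE_DEFER so the probe can be retried later.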
Since commit ce753ad1549c ("platform: finally disallow IRQ0 in platform_get_irq() and its ilk") IRQ0 is no longer returned by those APIs, so we now can safely ignore it... Fixes: 2408a08583d2 ("mmc: sunxi-mmc: Handle return value of platform_get_irq") Cc: stable@vger.kernel.org # v5.19+ Signed-off-by: Sergey Shtylyov Reviewed-by: Jernej Skrabec Link: https://lore.kernel.org/r/20230617203622.6812-12-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 34d8823dba1e92e9f5c6c493fcd6b0cc9a5a5485 Author: Sergey Shtylyov Date: Sat Jun 17 23:36:11 2023 +0300 mmc: bcm2835: fix deferred probing commit 71150ac12558bcd9d75e6e24cf7c872c2efd80f3 upstream. The driver overrides the error codes and IRQ0 returned by platform_get_irq() to -EINVAL, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. Since commit ce753ad1549c ("platform: finally disallow IRQ0 in platform_get_irq() and its ilk") IRQ0 is no longer returned by those APIs, so we now can safely ignore it... Fixes: 660fc733bd74 ("mmc: bcm2835: Add new driver for the sdhost controller.") Cc: stable@vger.kernel.org # v5.19+ Signed-off-by: Sergey Shtylyov Link: https://lore.kernel.org/r/20230617203622.6812-2-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit fde939971b4f4a4f68f1571bc34da62a48b34fb4 Author: Sergey Shtylyov Date: Sat Jun 17 23:36:19 2023 +0300 mmc: sdhci-spear: fix deferred probing commit 8d0caeedcd05a721f3cc2537b0ea212ec4027307 upstream. The driver overrides the error codes and IRQ0 returned by platform_get_irq() to -EINVAL, so if it returns -EPROBE_DEFER, the driver will fail the probe permanently instead of the deferred probing. Switch to propagating the error codes upstream. 
Since commit ce753ad1549c ("platform: finally disallow IRQ0 in platform_get_irq() and its ilk") IRQ0 is no longer returned by those APIs, so we now can safely ignore it... Fixes: 682798a596a6 ("mmc: sdhci-spear: Handle return value of platform_get_irq") Cc: stable@vger.kernel.org # v5.19+ Signed-off-by: Sergey Shtylyov Acked-by: Viresh Kumar Acked-by: Adrian Hunter Link: https://lore.kernel.org/r/20230617203622.6812-10-s.shtylyov@omp.ru Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit cd87d9caef83d19fe4965f6f6caf067bf4b88db7 Author: Christophe Kerello Date: Tue Jun 13 15:41:46 2023 +0200 mmc: mmci: stm32: fix max busy timeout calculation commit 47b3ad6b7842f49d374a01b054a4b1461a621bdc upstream. The way that the timeout is currently calculated could lead to a u64 timeout value in mmci_start_command(). This value is then cast in a u32 register that leads to mmc erase failed issue with some SD cards. Fixes: 8266c585f489 ("mmc: mmci: add hardware busy timeout feature") Signed-off-by: Yann Gautier Signed-off-by: Christophe Kerello Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230613134146.418016-1-yann.gautier@foss.st.com Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit e85d0e750b3ac89c4530cfa6dc85b38ae0ce565f Author: Martin Hundebøll Date: Wed Jun 7 10:27:12 2023 +0200 mmc: meson-gx: remove redundant mmc_request_done() call from irq context commit 3c40eb8145325b0f5b93b8a169146078cb2c49d6 upstream. The call to mmc_request_done() can schedule, so it must not be called from irq context. Wake the irq thread if it needs to be called, and let its existing logic do its work. 
Fixes the following kernel bug, which appears when running an RT patched kernel on the AmLogic Meson AXG A113X SoC: [ 11.111407] BUG: scheduling while atomic: kworker/0:1H/75/0x00010001 [ 11.111438] Modules linked in: [ 11.111451] CPU: 0 PID: 75 Comm: kworker/0:1H Not tainted 6.4.0-rc3-rt2-rtx-00081-gfd07f41ed6b4-dirty #1 [ 11.111461] Hardware name: RTX AXG A113X Linux Platform Board (DT) [ 11.111469] Workqueue: kblockd blk_mq_run_work_fn [ 11.111492] Call trace: [ 11.111497] dump_backtrace+0xac/0xe8 [ 11.111510] show_stack+0x18/0x28 [ 11.111518] dump_stack_lvl+0x48/0x60 [ 11.111530] dump_stack+0x18/0x24 [ 11.111537] __schedule_bug+0x4c/0x68 [ 11.111548] __schedule+0x80/0x574 [ 11.111558] schedule_loop+0x2c/0x50 [ 11.111567] schedule_rtlock+0x14/0x20 [ 11.111576] rtlock_slowlock_locked+0x468/0x730 [ 11.111587] rt_spin_lock+0x40/0x64 [ 11.111596] __wake_up_common_lock+0x5c/0xc4 [ 11.111610] __wake_up+0x18/0x24 [ 11.111620] mmc_blk_mq_req_done+0x68/0x138 [ 11.111633] mmc_request_done+0x104/0x118 [ 11.111644] meson_mmc_request_done+0x38/0x48 [ 11.111654] meson_mmc_irq+0x128/0x1f0 [ 11.111663] __handle_irq_event_percpu+0x70/0x114 [ 11.111674] handle_irq_event_percpu+0x18/0x4c [ 11.111683] handle_irq_event+0x80/0xb8 [ 11.111691] handle_fasteoi_irq+0xa4/0x120 [ 11.111704] handle_irq_desc+0x20/0x38 [ 11.111712] generic_handle_domain_irq+0x1c/0x28 [ 11.111721] gic_handle_irq+0x8c/0xa8 [ 11.111735] call_on_irq_stack+0x24/0x4c [ 11.111746] do_interrupt_handler+0x88/0x94 [ 11.111757] el1_interrupt+0x34/0x64 [ 11.111769] el1h_64_irq_handler+0x18/0x24 [ 11.111779] el1h_64_irq+0x64/0x68 [ 11.111786] __add_wait_queue+0x0/0x4c [ 11.111795] mmc_blk_rw_wait+0x84/0x118 [ 11.111804] mmc_blk_mq_issue_rq+0x5c4/0x654 [ 11.111814] mmc_mq_queue_rq+0x194/0x214 [ 11.111822] blk_mq_dispatch_rq_list+0x3ac/0x528 [ 11.111834] __blk_mq_sched_dispatch_requests+0x340/0x4d0 [ 11.111847] blk_mq_sched_dispatch_requests+0x38/0x70 [ 11.111858] blk_mq_run_work_fn+0x3c/0x70 [ 11.111865] 
process_one_work+0x17c/0x1f0 [ 11.111876] worker_thread+0x1d4/0x26c [ 11.111885] kthread+0xe4/0xf4 [ 11.111894] ret_from_fork+0x10/0x20 Fixes: 51c5d8447bd7 ("MMC: meson: initial support for GX platforms") Cc: stable@vger.kernel.org Signed-off-by: Martin Hundebøll Link: https://lore.kernel.org/r/20230607082713.517157-1-martin@geanix.com Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 46b88e067f64c4f12153ef50a5826315bbda1b06 Author: Stephan Gerhold Date: Thu May 18 11:39:36 2023 +0200 mmc: sdhci-msm: Disable broken 64-bit DMA on MSM8916 commit e6f9e590b72e12bbb86b1b8be7e1981f357392ad upstream. While SDHCI claims to support 64-bit DMA on MSM8916 it does not seem to be properly functional. It is not immediately obvious because SDHCI is usually used with IOMMU bypassed on this SoC, and all physical memory has 32-bit addresses. But when trying to enable the IOMMU it quickly fails with an error such as the following: arm-smmu 1e00000.iommu: Unhandled context fault: fsr=0x402, iova=0xfffff200, fsynr=0xe0000, cbfrsynra=0x140, cb=3 mmc1: ADMA error: 0x02000000 mmc1: sdhci: ============ SDHCI REGISTER DUMP =========== mmc1: sdhci: Sys addr: 0x00000000 | Version: 0x00002e02 mmc1: sdhci: Blk size: 0x00000008 | Blk cnt: 0x00000000 mmc1: sdhci: Argument: 0x00000000 | Trn mode: 0x00000013 mmc1: sdhci: Present: 0x03f80206 | Host ctl: 0x00000019 mmc1: sdhci: Power: 0x0000000f | Blk gap: 0x00000000 mmc1: sdhci: Wake-up: 0x00000000 | Clock: 0x00000007 mmc1: sdhci: Timeout: 0x0000000a | Int stat: 0x00000001 mmc1: sdhci: Int enab: 0x03ff900b | Sig enab: 0x03ff100b mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000 mmc1: sdhci: Caps: 0x322dc8b2 | Caps_1: 0x00008007 mmc1: sdhci: Cmd: 0x0000333a | Max curr: 0x00000000 mmc1: sdhci: Resp[0]: 0x00000920 | Resp[1]: 0x5b590000 mmc1: sdhci: Resp[2]: 0xe6487f80 | Resp[3]: 0x0a404094 mmc1: sdhci: Host ctl2: 0x00000008 mmc1: sdhci: ADMA Err: 0x00000001 | ADMA Ptr: 0x0000000ffffff224 mmc1: sdhci_msm: ----------- VENDOR 
REGISTER DUMP ----------- mmc1: sdhci_msm: DLL sts: 0x00000000 | DLL cfg: 0x60006400 | DLL cfg2: 0x00000000 mmc1: sdhci_msm: DLL cfg3: 0x00000000 | DLL usr ctl: 0x00000000 | DDR cfg: 0x00000000 mmc1: sdhci_msm: Vndr func: 0x00018a9c | Vndr func2 : 0xf88018a8 Vndr func3: 0x00000000 mmc1: sdhci: ============================================ mmc1: sdhci: fffffffff200: DMA 0x0000ffffffffe100, LEN 0x0008, Attr=0x21 mmc1: sdhci: fffffffff20c: DMA 0x0000000000000000, LEN 0x0000, Attr=0x03 Looking closely it's obvious that only the 32-bit part of the address (0xfffff200) arrives at the SMMU, the higher 16-bit (0xffff...) get lost somewhere. This might not be a limitation of the SDHCI itself but perhaps the bus/interconnect it is connected to, or even the connection to the SMMU. Work around this by setting SDHCI_QUIRK2_BROKEN_64_BIT_DMA to avoid using 64-bit addresses. Signed-off-by: Stephan Gerhold Acked-by: Adrian Hunter Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230518-msm8916-64bit-v1-1-5694b0f35211@gerhold.net Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 73c8c6891cc6c05d7d9bad389a2041fd668a9df3 Author: Jisheng Zhang Date: Sat Jun 17 16:53:19 2023 +0800 mmc: litex_mmc: set PROBE_PREFER_ASYNCHRONOUS commit f334ad47683606b682b4166b800d8b372d315436 upstream. mmc host drivers should have enabled the asynchronous probe option, but it seems like we didn't set it for litex_mmc when introducing litex mmc support, so let's set it now. Tested with linux-on-litex-vexriscv on sipeed tang nano 20K fpga. 
Signed-off-by: Jisheng Zhang Acked-by: Gabriel Somlo Fixes: 92e099104729 ("mmc: Add driver for LiteX's LiteSDCard interface") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230617085319.2139-1-jszhang@kernel.org Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 3e07190a1325d077a00a730f4c6cf830c9169b9b Author: Jiawen Wu Date: Mon Jun 19 17:49:48 2023 +0800 net: mdio: fix the wrong parameters commit 408c090002c8ca5da3da1417d1d675583379fae6 upstream. PHY address and device address are passed in the wrong order. Cc: stable@vger.kernel.org Fixes: 4e4aafcddbbf ("net: mdio: Add dedicated C45 API to MDIO bus drivers") Signed-off-by: Jiawen Wu Reviewed-by: Andrew Lunn Reviewed-by: Russell King (Oracle) Link: https://lore.kernel.org/r/20230619094948.84452-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 25382018806f21c1bf67145796733dd94708ec57 Author: Tetsuo Handa Date: Sun Jun 11 22:48:12 2023 +0900 cgroup,freezer: hold cpu_hotplug_lock before freezer_mutex in freezer_css_{online,offline}() commit f0cc749254d12c78e93dae3b27b21dc9546843d0 upstream. syzbot is again reporting circular locking dependency between cpu_hotplug_lock and freezer_mutex. Do like what we did with commit 57dcd64c7e036299 ("cgroup,freezer: hold cpu_hotplug_lock before freezer_mutex"). Reported-by: syzbot Closes: https://syzkaller.appspot.com/bug?extid=2ab700fe1829880a2ec6 Signed-off-by: Tetsuo Handa Tested-by: syzbot Fixes: f5d39b020809 ("freezer,sched: Rewrite core freezer logic") Cc: stable@vger.kernel.org # v6.1+ Signed-off-by: Tejun Heo Signed-off-by: Greg Kroah-Hartman commit 10c8e1ceb6ee3fd30e228770e486589848cb2a09 Author: Xiu Jianfeng Date: Sat Jun 10 17:26:43 2023 +0800 cgroup: Do not corrupt task iteration when rebinding subsystem commit 6f363f5aa845561f7ea496d8b1175e3204470486 upstream. We found a refcount UAF bug as follows: refcount_t: addition on 0; use-after-free. 
WARNING: CPU: 1 PID: 342 at lib/refcount.c:25 refcount_warn_saturate+0xa0/0x148
Workqueue: events cpuset_hotplug_workfn
Call trace:
 refcount_warn_saturate+0xa0/0x148
 __refcount_add.constprop.0+0x5c/0x80
 css_task_iter_advance_css_set+0xd8/0x210
 css_task_iter_advance+0xa8/0x120
 css_task_iter_next+0x94/0x158
 update_tasks_root_domain+0x58/0x98
 rebuild_root_domains+0xa0/0x1b0
 rebuild_sched_domains_locked+0x144/0x188
 cpuset_hotplug_workfn+0x138/0x5a0
 process_one_work+0x1e8/0x448
 worker_thread+0x228/0x3e0
 kthread+0xe0/0xf0
 ret_from_fork+0x10/0x20

Then a kernel panic is triggered, as below:

Unable to handle kernel paging request at virtual address 00000000c0000010
Call trace:
 cgroup_apply_control_disable+0xa4/0x16c
 rebind_subsystems+0x224/0x590
 cgroup_destroy_root+0x64/0x2e0
 css_free_rwork_fn+0x198/0x2a0
 process_one_work+0x1d4/0x4bc
 worker_thread+0x158/0x410
 kthread+0x108/0x13c
 ret_from_fork+0x10/0x18

The race that causes this bug can be shown as below:

(hotplug cpu)                 | (umount cpuset)
mutex_lock(&cpuset_mutex)     | mutex_lock(&cgroup_mutex)
cpuset_hotplug_workfn         |
rebuild_root_domains          | rebind_subsystems
update_tasks_root_domain      | spin_lock_irq(&css_set_lock)
css_task_iter_start           | list_move_tail(&cset->e_cset_node[ss->id]
while(css_task_iter_next)     |                &dcgrp->e_csets[ss->id]);
css_task_iter_end             | spin_unlock_irq(&css_set_lock)
mutex_unlock(&cpuset_mutex)   | mutex_unlock(&cgroup_mutex)

Inside css_task_iter_start/next/end, css_set_lock is held and then released, so while tasks are being iterated (left side), the css_set may be moved to another list (right side). it->cset_head then points to the old list head and it->cset_pos->next points to the head node of the new list, which can't be used as a struct css_set.

To fix this issue, switch from all css_sets to only scgrp's css_sets to patch in-flight iterators to preserve correct iteration, and then update it->cset_head as well.
Reported-by: Gaosheng Cui Link: https://www.spinics.net/lists/cgroups/msg37935.html Suggested-by: Michal Koutný Link: https://lore.kernel.org/all/20230526114139.70274-1-xiujianfeng@huaweicloud.com/ Signed-off-by: Xiu Jianfeng Fixes: 2d8f243a5e6e ("cgroup: implement cgroup->e_csets[]") Cc: stable@vger.kernel.org # v3.16+ Signed-off-by: Tejun Heo Signed-off-by: Greg Kroah-Hartman commit 6c193b407ffef1d3ab42d5211f9a67309221648a Author: Paolo Abeni Date: Tue Jun 20 18:24:23 2023 +0200 mptcp: ensure listener is unhashed before updating the sk status commit 57fc0f1ceaa4016354cf6f88533e20b56190e41a upstream. The MPTCP protocol accesses the listener subflow in a lockless manner in a couple of places (poll, diag). That works only if the msk itself leaves the listener status only after the subflow itself has been closed/disconnected. Otherwise we risk a deadlock in diag, as reported by Christoph. Address the issue by ensuring that the first subflow (the listener one) is always disconnected before updating the msk socket status. Reported-by: Christoph Paasch Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/407 Fixes: b29fcfb54cd7 ("mptcp: full disconnect implementation") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni Reviewed-by: Matthieu Baerts Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit c2b330e10ec6bfda4acd8b906d3c79455670fbe7 Author: Paolo Abeni Date: Tue Jun 20 18:24:21 2023 +0200 mptcp: consolidate fallback and non fallback state machine commit 81c1d029016001f994ce1c46849c5e9900d8eab8 upstream. An orphaned msk releases the used resources via the worker, when the latter first sees the msk in CLOSED status. If the msk status transitions to TCP_CLOSE in the release callback invoked by the worker's final release_sock(), that instance of the workqueue will not take any action.
Additionally the MPTCP code prevents scheduling the worker once the socket reaches the CLOSE status: such msk resources will be leaked. The only code path that can trigger the above scenario is the __mptcp_check_send_data_fin() in fallback mode. Address the issue removing the special handling of fallback socket in __mptcp_check_send_data_fin(), consolidating the state machine for fallback and non fallback socket. Since non-fallback sockets do not send and do not receive data_fin, the mptcp code can update the msk internal status to match the next step in the SM every time data fin (ack) should be generated or received. As a consequence we can remove a bunch of checks for fallback from the fastpath. Fixes: 6e628cd3a8f7 ("mptcp: use mptcp release_cb for delayed tasks") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni Reviewed-by: Mat Martineau Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit db96b45f4c538b1ea4eb057d9f592a3bc301c2dc Author: Paolo Abeni Date: Tue Jun 20 18:24:20 2023 +0200 mptcp: fix possible list corruption on passive MPJ commit 56a666c48b038e91b76471289e2cf60c79d326b9 upstream. At passive MPJ time, if the msk socket lock is held by the user, the new subflow is appended to the msk->join_list under the msk data lock. In mptcp_release_cb()/__mptcp_flush_join_list(), the subflows in that list are moved from the join_list into the conn_list under the msk socket lock. Append and removal could race, possibly corrupting such list. Address the issue splicing the join list into a temporary one while still under the msk data lock. Found by code inspection, the race itself should be almost impossible to trigger in practice. 
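The "splice into a temporary list while still under the lock" idea can be sketched in userspace C. This is a simplified illustration under stated assumptions, not the mptcp code: struct owner, join_add() and flush_join_list() are hypothetical names, a C11 atomic_flag spinlock stands in for the msk data lock, and a singly linked list stands in for the kernel's list type.

```c
#include <stdatomic.h>
#include <stddef.h>

struct node {
	struct node *next;
	int id;
};

struct owner {
	atomic_flag data_lock;   /* stand-in for the msk data lock */
	struct node *join_list;  /* appended to under data_lock */
	struct node *conn_list;  /* only touched by the flushing side */
};

static void lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set(l))
		; /* spin until the flag is released */
}

static void unlock(atomic_flag *l)
{
	atomic_flag_clear(l);
}

/* Producer side: push a new entry while holding the data lock. */
static void join_add(struct owner *o, struct node *n)
{
	lock(&o->data_lock);
	n->next = o->join_list;
	o->join_list = n;
	unlock(&o->data_lock);
}

/* Consumer side: detach the whole list under the data lock, then walk
 * the private copy without holding it. Returns the number moved. */
static int flush_join_list(struct owner *o)
{
	struct node *tmp, *n;
	int moved = 0;

	lock(&o->data_lock);
	tmp = o->join_list;
	o->join_list = NULL;
	unlock(&o->data_lock);

	while ((n = tmp) != NULL) {
		tmp = n->next;
		n->next = o->conn_list;
		o->conn_list = n;
		moved++;
	}
	return moved;
}
```

Because the shared list is emptied in one step under the same lock the producer takes, a concurrent join_add() either lands before the detach (and is moved) or after it (and waits for the next flush); the walk itself never touches the shared list head.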
Fixes: 3e5014909b56 ("mptcp: cleanup MPJ subflow list handling") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni Reviewed-by: Matthieu Baerts Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 72bda6bfea499cbdc18f1891bfac6607edd805ae Author: Paolo Abeni Date: Tue Jun 20 18:24:19 2023 +0200 mptcp: fix possible divide by zero in recvmsg() commit 0ad529d9fd2bfa3fc619552a8d2fb2f2ef0bce2e upstream. Christoph reported a divide by zero bug in mptcp_recvmsg(): divide error: 0000 [#1] PREEMPT SMP CPU: 1 PID: 19978 Comm: syz-executor.6 Not tainted 6.4.0-rc2-gffcc7899081b #20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 RIP: 0010:__tcp_select_window+0x30e/0x420 net/ipv4/tcp_output.c:3018 Code: 11 ff 0f b7 cd c1 e9 0c b8 ff ff ff ff d3 e0 89 c1 f7 d1 01 cb 21 c3 eb 17 e8 2e 83 11 ff 31 db eb 0e e8 25 83 11 ff 89 d8 99 7c 24 04 29 d3 65 48 8b 04 25 28 00 00 00 48 3b 44 24 10 75 60 RSP: 0018:ffffc90000a07a18 EFLAGS: 00010246 RAX: 000000000000ffd7 RBX: 000000000000ffd7 RCX: 0000000000040000 RDX: 0000000000000000 RSI: 000000000003ffff RDI: 0000000000040000 RBP: 000000000000ffd7 R08: ffffffff820cf297 R09: 0000000000000001 R10: 0000000000000000 R11: ffffffff8103d1a0 R12: 0000000000003f00 R13: 0000000000300000 R14: ffff888101cf3540 R15: 0000000000180000 FS: 00007f9af4c09640(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b33824000 CR3: 000000012f241001 CR4: 0000000000170ee0 Call Trace: __tcp_cleanup_rbuf+0x138/0x1d0 net/ipv4/tcp.c:1611 mptcp_recvmsg+0xcb8/0xdd0 net/mptcp/protocol.c:2034 inet_recvmsg+0x127/0x1f0 net/ipv4/af_inet.c:861 ____sys_recvmsg+0x269/0x2b0 net/socket.c:1019 ___sys_recvmsg+0xe6/0x260 net/socket.c:2764 do_recvmmsg+0x1a5/0x470 net/socket.c:2858 __do_sys_recvmmsg net/socket.c:2937 [inline] __se_sys_recvmmsg net/socket.c:2953 [inline] __x64_sys_recvmmsg+0xa6/0x130 net/socket.c:2953 do_syscall_x64 
arch/x86/entry/common.c:50 [inline] do_syscall_64+0x47/0xa0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x72/0xdc RIP: 0033:0x7f9af58fc6a9 Code: 5c c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 4f 37 0d 00 f7 d8 64 89 01 48 RSP: 002b:00007f9af4c08cd8 EFLAGS: 00000246 ORIG_RAX: 000000000000012b RAX: ffffffffffffffda RBX: 00000000006bc050 RCX: 00007f9af58fc6a9 RDX: 0000000000000001 RSI: 0000000020000140 RDI: 0000000000000004 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000f00 R11: 0000000000000246 R12: 00000000006bc05c R13: fffffffffffffea8 R14: 00000000006bc050 R15: 000000000001fe40 mptcp_recvmsg is allowed to release the msk socket lock when blocking, and before re-acquiring it another thread could have switched the sock to TCP_LISTEN status - with a prior connect(AF_UNSPEC) - also clearing icsk_ack.rcv_mss. Address the issue by preventing the disconnect if some other process is concurrently performing a blocking syscall on the same socket, like commit 4faeee0cf8a5 ("tcp: deny tcp_disconnect() when threads are waiting"). Fixes: a6b118febbab ("mptcp: add receive buffer auto-tuning") Cc: stable@vger.kernel.org Reported-by: Christoph Paasch Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/404 Signed-off-by: Paolo Abeni Tested-by: Christoph Paasch Reviewed-by: Matthieu Baerts Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 9c998d59a6b1359ad43d1ef38538af5f55fd01a2 Author: Paolo Abeni Date: Tue Jun 20 18:24:18 2023 +0200 mptcp: handle correctly disconnect() failures commit c2b2ae3925b65070adb27d5a31a31c376f26dec7 upstream. Currently the mptcp code assumes that disconnect() can fail only at mptcp_sendmsg_fastopen() time - to avoid a deadlock scenario - and doesn't even bother returning an error code.
Soon mptcp_disconnect() will handle more error conditions: let's track them explicitly. As a bonus, explicitly annotate TCP-level disconnect as not failing: the mptcp code never blocks for event on the subflows. Fixes: 7d803344fdc3 ("mptcp: fix deadlock in fastopen error path") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni Tested-by: Christoph Paasch Reviewed-by: Matthieu Baerts Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit d64f876c25a5c0a01dc283064602f4012f593bbe Author: Jens Axboe Date: Mon Jun 19 09:41:05 2023 -0600 io_uring/net: disable partial retries for recvmsg with cmsg commit 78d0d2063bab954d19a1696feae4c7706a626d48 upstream. We cannot sanely handle partial retries for recvmsg if we have cmsg attached. If we don't, then we'd just be overwriting the initial cmsg header on retries. Alternatively we could increment and handle this appropriately, but it doesn't seem worth the complication. Move the MSG_WAITALL check into the non-multishot case while at it, since MSG_WAITALL is explicitly disabled for multishot anyway. Link: https://lore.kernel.org/io-uring/0b0d4411-c8fd-4272-770b-e030af6919a0@kernel.dk/ Cc: stable@vger.kernel.org # 5.10+ Reported-by: Stefan Metzmacher Reviewed-by: Stefan Metzmacher Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit bbc7ecfcf8e3c615122ff9889fc628c8d71bd1b1 Author: Jens Axboe Date: Mon Jun 19 09:35:34 2023 -0600 io_uring/net: clear msg_controllen on partial sendmsg retry commit b1dc492087db0f2e5a45f1072a743d04618dd6be upstream. If we have cmsg attached AND we transferred partial data at least, clear msg_controllen on retry so we don't attempt to send that again. 
Cc: stable@vger.kernel.org # 5.10+ Fixes: cac9e4418f4c ("io_uring/net: save msghdr->msg_control for retries") Reported-by: Stefan Metzmacher Reviewed-by: Stefan Metzmacher Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 4a1d32de8abc5c4927e981189922c81b9d77fc0f Author: Dexuan Cui Date: Wed Jun 14 21:44:51 2023 -0700 PCI: hv: Add a per-bus mutex state_lock commit 067d6ec7ed5b49380688e06c1e5f883a71bef4fe upstream. In the case of fast device addition/removal, it's possible that hv_eject_device_work() can start to run before create_root_hv_pci_bus() starts to run; as a result, the pci_get_domain_bus_and_slot() in hv_eject_device_work() can return a 'pdev' of NULL, and hv_eject_device_work() can remove the 'hpdev', and immediately send a message PCI_EJECTION_COMPLETE to the host, and the host immediately unassigns the PCI device from the guest; meanwhile, create_root_hv_pci_bus() and the PCI device driver can be probing the dead PCI device and reporting timeout errors. Fix the issue by adding a per-bus mutex 'state_lock' and grabbing the mutex before powering on the PCI bus in hv_pci_enter_d0(): when hv_eject_device_work() starts to run, it's able to find the 'pdev' and call pci_stop_and_remove_bus_device(pdev): if the PCI device driver has loaded, the PCI device driver's probe() function is already called in create_root_hv_pci_bus() -> pci_bus_add_devices(), and now hv_eject_device_work() -> pci_stop_and_remove_bus_device() is able to call the PCI device driver's remove() function and remove the device reliably; if the PCI device driver hasn't loaded yet, the function call hv_eject_device_work() -> pci_stop_and_remove_bus_device() is able to remove the PCI device reliably and the PCI device driver's probe() function won't be called; if the PCI device driver's probe() is already running (e.g., systemd-udev is loading the PCI device driver), it must be holding the per-device lock, and after the probe() finishes and releases the lock, 
hv_eject_device_work() -> pci_stop_and_remove_bus_device() is able to proceed to remove the device reliably. Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs") Signed-off-by: Dexuan Cui Reviewed-by: Michael Kelley Acked-by: Lorenzo Pieralisi Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230615044451.5580-6-decui@microsoft.com Signed-off-by: Wei Liu Signed-off-by: Greg Kroah-Hartman commit 27b44e8f9f6afed86fe8228fe448f890f485a978 Author: Dexuan Cui Date: Wed Jun 14 21:44:48 2023 -0700 PCI: hv: Fix a race condition in hv_irq_unmask() that can cause panic commit 2738d5ab7929a845b654cd171a1e275c37eb428e upstream. When the host tries to remove a PCI device, the host first sends a PCI_EJECT message to the guest, and the guest is supposed to gracefully remove the PCI device and send a PCI_EJECTION_COMPLETE message to the host; the host then sends a VMBus message CHANNELMSG_RESCIND_CHANNELOFFER to the guest (when the guest receives this message, the device is already unassigned from the guest) and the guest can do some final cleanup work; if the guest fails to respond to the PCI_EJECT message within one minute, the host sends the VMBus message CHANNELMSG_RESCIND_CHANNELOFFER and removes the PCI device forcibly. In the case of fast device addition/removal, it's possible that the PCI device driver is still configuring MSI-X interrupts when the guest receives the PCI_EJECT message; the channel callback calls hv_pci_eject_device(), which sets hpdev->state to hv_pcichild_ejecting, and schedules a work hv_eject_device_work(); if the PCI device driver is calling pci_alloc_irq_vectors() -> ... -> hv_compose_msi_msg(), we can break the while loop in hv_compose_msi_msg() due to the updated hpdev->state, and leave data->chip_data with its default value of NULL; later, when the PCI device driver calls request_irq() -> ... -> hv_irq_unmask(), the guest crashes in hv_arch_irq_unmask() due to data->chip_data being NULL. 
Fix the issue by not testing hpdev->state in the while loop: when the guest receives PCI_EJECT, the device is still assigned to the guest, and the guest has one minute to finish the device removal gracefully. We don't really need to (and we should not) test hpdev->state in the loop. Fixes: de0aa7b2f97d ("PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()") Signed-off-by: Dexuan Cui Reviewed-by: Michael Kelley Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230615044451.5580-3-decui@microsoft.com Signed-off-by: Wei Liu Signed-off-by: Greg Kroah-Hartman commit 2b8b666dd848ec444f18fb027ffdddef38e548db Author: Dexuan Cui Date: Wed Jun 14 21:44:49 2023 -0700 PCI: hv: Remove the useless hv_pcichild_state from struct hv_pci_dev commit add9195e69c94b32e96f78c2f9cea68f0e850b3f upstream. The hpdev->state is never really useful. The only use in hv_pci_eject_device() and hv_eject_device_work() is not really necessary. Signed-off-by: Dexuan Cui Reviewed-by: Michael Kelley Acked-by: Lorenzo Pieralisi Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230615044451.5580-4-decui@microsoft.com Signed-off-by: Wei Liu Signed-off-by: Greg Kroah-Hartman commit 55e6ecd8fd278a10fc0b3fc61f2f8d5f15809591 Author: Dexuan Cui Date: Wed Jun 14 21:44:50 2023 -0700 Revert "PCI: hv: Fix a timing issue which causes kdump to fail occasionally" commit a847234e24d03d01a9566d1d9dcce018cc018d67 upstream. This reverts commit d6af2ed29c7c1c311b96dac989dcb991e90ee195. The statement "the hv_pci_bus_exit() call releases structures of all its child devices" in commit d6af2ed29c7c is not true: in the path hv_pci_probe() -> hv_pci_enter_d0() -> hv_pci_bus_exit(hdev, true): the parameter "keep_devs" is true, so hv_pci_bus_exit() does *not* release the child "struct hv_pci_dev *hpdev" that is created earlier in pci_devices_present_work() -> new_pcichild_device(). 
The commit d6af2ed29c7c was originally made in July 2020 for RHEL 7.7, where the old version of hv_pci_bus_exit() was used; when the commit was rebased and merged into the upstream, people didn't notice that it's not really necessary. The commit itself doesn't cause any issue, but it makes hv_pci_probe() more complicated. Revert it to facilitate some upcoming changes to hv_pci_probe(). Signed-off-by: Dexuan Cui Reviewed-by: Michael Kelley Acked-by: Wei Hu Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230615044451.5580-5-decui@microsoft.com Signed-off-by: Wei Liu Signed-off-by: Greg Kroah-Hartman commit 447123893d3fbfd265e6573401cfca560dbd7b90 Author: Dexuan Cui Date: Wed Jun 14 21:44:47 2023 -0700 PCI: hv: Fix a race condition bug in hv_pci_query_relations() commit 440b5e3663271b0ffbd4908115044a6a51fb938b upstream. Since day 1 of the driver, there has been a race between hv_pci_query_relations() and survey_child_resources(): during fast device hotplug, hv_pci_query_relations() may error out due to device-remove and the stack variable 'comp' is no longer valid; however, pci_devices_present_work() -> survey_child_resources() -> complete() may be running on another CPU and accessing the no-longer-valid 'comp'. Fix the race by flushing the workqueue before we exit from hv_pci_query_relations(). Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs") Signed-off-by: Dexuan Cui Reviewed-by: Michael Kelley Acked-by: Lorenzo Pieralisi Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230615044451.5580-2-decui@microsoft.com Signed-off-by: Wei Liu Signed-off-by: Greg Kroah-Hartman commit 1be7443a689ca05d62dd42080bd174a6ddf7682b Author: Michael Kelley Date: Thu May 18 08:13:52 2023 -0700 Drivers: hv: vmbus: Fix vmbus_wait_for_unload() to scan present CPUs commit 320805ab61e5f1e2a5729ae266e16bec2904050c upstream. vmbus_wait_for_unload() may be called in the panic path after other CPUs are stopped. 
vmbus_wait_for_unload() currently loops through online CPUs looking for the UNLOAD response message. But the values of CONFIG_KEXEC_CORE and crash_kexec_post_notifiers affect the path used to stop the other CPUs, and in one of the paths the stopped CPUs are removed from cpu_online_mask. This removal happens in both x86/x64 and arm64 architectures. In such a case, vmbus_wait_for_unload() only checks the panic'ing CPU, and misses the UNLOAD response message except when the panic'ing CPU is CPU 0. vmbus_wait_for_unload() eventually times out, but only after waiting 100 seconds. Fix this by looping through *present* CPUs in vmbus_wait_for_unload(). The cpu_present_mask is not modified by stopping the other CPUs in the panic path, nor should it be. Also, in a CoCo VM the synic_message_page is not allocated in hv_synic_alloc(), but is set and cleared in hv_synic_enable_regs() and hv_synic_disable_regs() such that it is set only when the CPU is online. If not all present CPUs are online when vmbus_wait_for_unload() is called, the synic_message_page might be NULL. Add a check for this. Fixes: cd95aad55793 ("Drivers: hv: vmbus: handle various crash scenarios") Cc: stable@vger.kernel.org Reported-by: John Starks Signed-off-by: Michael Kelley Reviewed-by: Vitaly Kuznetsov Link: https://lore.kernel.org/r/1684422832-38476-1-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu Signed-off-by: Greg Kroah-Hartman commit 8ee040b074a9bb155ad350667fdcc488db7a44f3 Author: Dexuan Cui Date: Thu May 4 15:41:55 2023 -0700 Drivers: hv: vmbus: Call hv_synic_free() if hv_synic_alloc() fails commit ec97e112985c2581ee61854a4b74f080f6cdfc2c upstream. 
Commit 572086325ce9 ("Drivers: hv: vmbus: Cleanup synic memory free path") says "Any memory allocations that succeeded will be freed when the caller cleans up by calling hv_synic_free()", but if the get_zeroed_page() in hv_synic_alloc() fails, currently hv_synic_free() is not really called in vmbus_bus_init(), consequently there will be a memory leak, e.g. hv_context.hv_numa_map is not freed in the error path. Fix this by updating the goto labels. Cc: stable@kernel.org Signed-off-by: Dexuan Cui Fixes: 4df4cb9e99f8 ("x86/hyperv: Initialize clockevents earlier in CPU onlining") Reviewed-by: Michael Kelley Link: https://lore.kernel.org/r/20230504224155.10484-1-decui@microsoft.com Signed-off-by: Wei Liu Signed-off-by: Greg Kroah-Hartman commit b6e15f5940d1becb471a97c676c452b6c48d153c Author: Liam R. Howlett Date: Tue Jun 6 14:29:12 2023 -0400 mm/mprotect: fix do_mprotect_pkey() limit check commit 77795f900e2a07c1cbedc375789aefb43843b6c2 upstream. The return of do_mprotect_pkey() can still be incorrectly returned as success if there is a gap that spans to or beyond the end address passed in. Update the check to ensure that the end address has indeed been seen. Link: https://lore.kernel.org/all/CABi2SkXjN+5iFoBhxk71t3cmunTk-s=rB4T7qo0UQRh17s49PQ@mail.gmail.com/ Link: https://lkml.kernel.org/r/20230606182912.586576-1-Liam.Howlett@oracle.com Fixes: 82f951340f25 ("mm/mprotect: fix do_mprotect_pkey() return on error") Signed-off-by: Liam R. Howlett Reported-by: Jeff Xu Reviewed-by: Lorenzo Stoakes Acked-by: David Hildenbrand Acked-by: Vlastimil Babka Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit c189994b5dd3b239c223d423c801565addb666cd Author: Lorenzo Stoakes Date: Mon Jun 5 21:11:07 2023 +0100 mm/vmalloc: do not output a spurious warning when huge vmalloc() fails commit 95a301eefa82057571207edd06ea36218985a75e upstream. 
In __vmalloc_area_node() we always warn_alloc() when an allocation performed by vm_area_alloc_pages() fails unless it was due to a pending fatal signal. However, huge page allocations instigated either by vmalloc_huge() or __vmalloc_node_range() (or a caller that invokes this like kvmalloc() or kvmalloc_node()) always fall back to order-0 allocations if the huge page allocation fails. This renders the warning useless and noisy, especially as all callers appear to be aware that this may fall back. This has already resulted in at least one bug report from a user who was confused by this (see link). Therefore, simply update the code to only output this warning for order-0 pages when no fatal signal is pending. Link: https://bugzilla.suse.com/show_bug.cgi?id=1211410 Link: https://lkml.kernel.org/r/20230605201107.83298-1-lstoakes@gmail.com Fixes: 80b1d8fdfad1 ("mm: vmalloc: correct use of __GFP_NOWARN mask in __vmalloc_area_node()") Signed-off-by: Lorenzo Stoakes Acked-by: Vlastimil Babka Reviewed-by: Baoquan He Acked-by: Michal Hocko Reviewed-by: Uladzislau Rezki (Sony) Reviewed-by: David Hildenbrand Cc: Christoph Hellwig Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 1dfe889fd23e2419230840e31b123a30890dfe54 Author: Gavin Shan Date: Thu Jun 15 15:42:59 2023 +1000 KVM: Avoid illegal stage2 mapping on invalid memory slot commit 2230f9e1171a2e9731422a14d1bbc313c0b719d1 upstream. We run into a guest hang in the edk2 firmware when KSM is kept running on the host. The edk2 firmware is waiting for status 0x80 from QEMU's pflash device (TYPE_PFLASH_CFI01) during the operation of sector erasing or buffered write. The status is returned by reading the memory region of the pflash device and the read request should have been forwarded to QEMU and emulated by it. Unfortunately, the read request is covered by an illegal stage2 mapping when the guest hang issue occurs. The read request is completed with QEMU bypassed and wrong status is fetched.
The edk2 firmware runs into an infinite loop with the wrong status. The illegal stage2 mapping is populated due to same page sharing by KSM at (C) even though the associated memory slot has been marked as invalid at (B) when the memory slot is requested to be deleted. It's notable that the active and inactive memory slots can't be swapped when we're in the middle of kvm_mmu_notifier_change_pte() because kvm->mn_active_invalidate_count is elevated, and kvm_swap_active_memslots() will busy loop until it drops to zero again. Besides, the swapping from the active to the inactive memory slots is also avoided by holding &kvm->srcu in __kvm_handle_hva_range(), corresponding to synchronize_srcu_expedited() in kvm_swap_active_memslots().

  CPU-A                                             CPU-B
  -----                                             -----
  ioctl(kvm_fd, KVM_SET_USER_MEMORY_REGION)
  kvm_vm_ioctl_set_memory_region
  kvm_set_memory_region
  __kvm_set_memory_region
  kvm_set_memslot(kvm, old, NULL, KVM_MR_DELETE)
  kvm_invalidate_memslot
  kvm_copy_memslot
  kvm_replace_memslot
  kvm_swap_active_memslots        (A)
  kvm_arch_flush_shadow_memslot   (B)
                                                    same page sharing by KSM
                                                    kvm_mmu_notifier_invalidate_range_start
                                                          :
                                                    kvm_mmu_notifier_change_pte
                                                    kvm_handle_hva_range
                                                    __kvm_handle_hva_range
                                                    kvm_set_spte_gfn            (C)
                                                          :
                                                    kvm_mmu_notifier_invalidate_range_end

Fix the issue by skipping the invalid memory slot at (C) to avoid the illegal stage2 mapping so that the read request for the pflash's status is forwarded to QEMU and emulated by it. In this way, the correct pflash's status can be returned from QEMU to break the infinite loop in the edk2 firmware. We tried a git-bisect and the first problematic commit is cd4c71835228 ("KVM: arm64: Convert to the gfn-based MMU notifier callbacks"). With this, clean_dcache_guest_page() is called after the memory slots are iterated in kvm_mmu_notifier_change_pte(). Before this commit, clean_dcache_guest_page() was called before the iteration over the memory slots.
This change literally enlarges the racy window between kvm_mmu_notifier_change_pte() and memory slot removal so that we're able to reproduce the issue in a practical test case. However, the issue has existed since commit d5d8184d35c9 ("KVM: ARM: Memory virtualization setup"). Cc: stable@vger.kernel.org # v3.9+ Fixes: d5d8184d35c9 ("KVM: ARM: Memory virtualization setup") Reported-by: Shuai Hu Reported-by: Zhenyu Zhang Signed-off-by: Gavin Shan Reviewed-by: David Hildenbrand Reviewed-by: Oliver Upton Reviewed-by: Peter Xu Reviewed-by: Sean Christopherson Reviewed-by: Shaoqin Huang Message-Id: <20230615054259.14911-1-gshan@redhat.com> Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 5f784b8d6548ab67b168583e2e05a109d4aee376 Author: Hans de Goede Date: Wed Jun 14 12:07:56 2023 +0200 thermal/intel/intel_soc_dts_iosf: Fix reporting wrong temperatures commit 0bb619f9227aa370330d2b309733d74750705053 upstream. Since commit 955fb8719efb ("thermal/intel/intel_soc_dts_iosf: Use Intel TCC library") intel_soc_dts_iosf is reporting the wrong temperature. The driver expects tj_max to be in milli-degrees Celsius, but after the switch to the TCC library it is now in degrees Celsius, so instead of e.g. 90000 it is set to 90, causing a temperature 45 degrees below tj_max to be reported as -44910 milli-degrees instead of as 45000 milli-degrees. Fix this by adding back the lost factor of 1000. Fixes: 955fb8719efb ("thermal/intel/intel_soc_dts_iosf: Use Intel TCC library") Reported-by: Bernhard Krug Signed-off-by: Hans de Goede Acked-by: Zhang Rui Cc: 6.3+ <stable@vger.kernel.org> # 6.3+ Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 0c58e97b4753de90e806d99be631115663913d00 Author: Rafael J. Wysocki Date: Wed Jun 14 17:29:21 2023 +0200 ACPI: sleep: Avoid breaking S3 wakeup due to might_sleep() commit 22db06337f590d01d79f60f181d8dfe5a9ef9085 upstream.
The addition of might_sleep() to down_timeout() caused the latter to enable interrupts unconditionally in some cases, which in turn broke the ACPI S3 wakeup path in acpi_suspend_enter(), where down_timeout() is called by acpi_disable_all_gpes() via acpi_ut_acquire_mutex(). Namely, if CONFIG_DEBUG_ATOMIC_SLEEP is set, might_sleep() causes might_resched() to be used and if CONFIG_PREEMPT_VOLUNTARY is set, this triggers __cond_resched() which may call preempt_schedule_common(), so __schedule() gets invoked and it ends up with enabled interrupts (in the prev == next case). Now, enabling interrupts early in the S3 wakeup path causes the kernel to crash. Address this by modifying acpi_suspend_enter() to disable GPEs without attempting to acquire the sleeping lock which is not needed in that code path anyway. Fixes: 99409b935c9a ("locking/semaphore: Add might_sleep() to down_*() family") Reported-by: Srinivas Pandruvada Signed-off-by: Rafael J. Wysocki Acked-by: Peter Zijlstra (Intel) Cc: 5.15+ # 5.15+ Signed-off-by: Greg Kroah-Hartman commit 5c67f1459c388ae357c7c10668fcc2aae700a3f4 Author: Ryusuke Konishi Date: Mon Jun 12 11:14:56 2023 +0900 nilfs2: prevent general protection fault in nilfs_clear_dirty_page() commit 782e53d0c14420858dbf0f8f797973c150d3b6d7 upstream. In a syzbot stress test that deliberately causes file system errors on nilfs2 with a corrupted disk image, it has been reported that nilfs_clear_dirty_page() called from nilfs_clear_dirty_pages() can cause a general protection fault. In nilfs_clear_dirty_pages(), when looking up dirty pages from the page cache and calling nilfs_clear_dirty_page() for each dirty page/folio retrieved, the back reference from the argument page to "mapping" may have been changed to NULL (and possibly others). It is necessary to check this after locking the page/folio. 
So, fix this issue by not calling nilfs_clear_dirty_page() on a page/folio after locking it in nilfs_clear_dirty_pages() if the back reference "mapping" from the page/folio is different from the "mapping" that held the page/folio just before. Link: https://lkml.kernel.org/r/20230612021456.3682-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi Reported-by: syzbot+53369d11851d8f26735c@syzkaller.appspotmail.com Closes: https://lkml.kernel.org/r/000000000000da4f6b05eb9bf593@google.com Tested-by: Ryusuke Konishi Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 1dc4ab1fcedececcd2fafc16dc00ab83c2f7c6aa Author: Ryusuke Konishi Date: Fri Jun 9 12:57:32 2023 +0900 nilfs2: fix buffer corruption due to concurrent device reads commit 679bd7ebdd315bf457a4740b306ae99f1d0a403d upstream. As a result of analysis of a syzbot report, it turned out that in three cases where nilfs2 allocates block device buffers directly via sb_getblk, concurrent reads to the device can corrupt the allocated buffers. Nilfs2 uses sb_getblk for segment summary blocks, that make up a log header, and the super root block, that is the trailer, and when moving and writing the second super block after fs resize. In any of these, since the uptodate flag is not set when storing metadata to be written in the allocated buffers, the stored metadata will be overwritten if a device read of the same block occurs concurrently before the write. This causes metadata corruption and misbehavior in the log write itself, causing warnings in nilfs_btree_assign() as reported. Fix these issues by setting an uptodate flag on the buffer head on the first or before modifying each buffer obtained with sb_getblk, and clearing the flag on failure. When setting the uptodate flag, the lock_buffer/unlock_buffer pair is used to perform necessary exclusive control, and the buffer is filled to ensure that uninitialized bytes are not mixed into the data read from others. 
As for buffers for segment summary blocks, they are filled incrementally, so if the uptodate flag was unset on their allocation, set the flag and zero fill the buffer once at that point. Also, regarding the superblock move routine, the starting point of the memset call to zerofill the block is incorrectly specified, which can cause a buffer overflow on file systems with block sizes greater than 4KiB. In addition, if the superblock is moved within a large block, it is necessary to assume the possibility that the data in the superblock will be destroyed by zero-filling before copying. So fix these potential issues as well. Link: https://lkml.kernel.org/r/20230609035732.20426-1-konishi.ryusuke@gmail.com Signed-off-by: Ryusuke Konishi Reported-by: syzbot+31837fe952932efc8fb9@syzkaller.appspotmail.com Closes: https://lkml.kernel.org/r/00000000000030000a05e981f475@google.com Tested-by: Ryusuke Konishi Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit cc6f800a4cad6d510d4301874650656927a0ca84 Author: Prathu Baronia Date: Thu Jun 8 21:14:49 2023 +0530 scripts: fix the gfp flags header path in gfp-translate commit 2049a7d0cbc6ac8e370e836ed68597be04a7dc49 upstream. The gfp flags have been moved to gfp_types.h, so update the path in the gfp-translate script. Link: https://lkml.kernel.org/r/20230608154450.21758-1-prathubaronia2011@gmail.com Fixes: cb5a065b4ea9c ("headers/deps: mm: Split <linux/gfp_types.h> out of <linux/gfp.h>") Signed-off-by: Prathu Baronia Reviewed-by: David Hildenbrand Cc: Masahiro Yamada Cc: Nathan Chancellor Cc: Nick Desaulniers Cc: Nicolas Schier Cc: Ingo Molnar Cc: Yury Norov Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 0d274f215844bd75fa9a3a9a5cc4eab8665ceeb3 Author: Rafael Aquini Date: Tue Jun 6 19:36:13 2023 -0400 writeback: fix dereferencing NULL mapping->host on writeback_page_template commit 54abe19e00cfcc5a72773d15cd00ed19ab763439 upstream.
When commit 19343b5bdd16 ("mm/page-writeback: introduce tracepoint for wait_on_page_writeback()") repurposed the writeback_dirty_page trace event as a template to create its new wait_on_page_writeback trace event, it ended up opening a window to NULL pointer dereference crashes due to the (infrequent) occurrence of a race where an access to a page in the swap-cache happens concurrently with the moment this page is being written to disk and the tracepoint is enabled: BUG: kernel NULL pointer dereference, address: 0000000000000040 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 800000010ec0a067 P4D 800000010ec0a067 PUD 102353067 PMD 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 1320 Comm: shmem-worker Kdump: loaded Not tainted 6.4.0-rc5+ #13 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20230301gitf80f052277c8-1.fc37 03/01/2023 RIP: 0010:trace_event_raw_event_writeback_folio_template+0x76/0xf0 Code: 4d 85 e4 74 5c 49 8b 3c 24 e8 06 98 ee ff 48 89 c7 e8 9e 8b ee ff ba 20 00 00 00 48 89 ef 48 89 c6 e8 fe d4 1a 00 49 8b 04 24 <48> 8b 40 40 48 89 43 28 49 8b 45 20 48 89 e7 48 89 43 30 e8 a2 4d RSP: 0000:ffffaad580b6fb60 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff90e38035c01c RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff90e38035c044 RBP: ffff90e38035c024 R08: 0000000000000002 R09: 0000000000000006 R10: ffff90e38035c02e R11: 0000000000000020 R12: ffff90e380bac000 R13: ffffe3a7456d9200 R14: 0000000000001b81 R15: ffffe3a7456d9200 FS: 00007f2e4e8a15c0(0000) GS:ffff90e3fbc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000040 CR3: 00000001150c6003 CR4: 0000000000170ee0 Call Trace: ? __die+0x20/0x70 ? page_fault_oops+0x76/0x170 ? kernelmode_fixup_or_oops+0x84/0x110 ? exc_page_fault+0x65/0x150 ? asm_exc_page_fault+0x22/0x30 ? 
trace_event_raw_event_writeback_folio_template+0x76/0xf0 folio_wait_writeback+0x6b/0x80 shmem_swapin_folio+0x24a/0x500 ? filemap_get_entry+0xe3/0x140 shmem_get_folio_gfp+0x36e/0x7c0 ? find_busiest_group+0x43/0x1a0 shmem_fault+0x76/0x2a0 ? __update_load_avg_cfs_rq+0x281/0x2f0 __do_fault+0x33/0x130 do_read_fault+0x118/0x160 do_pte_missing+0x1ed/0x2a0 __handle_mm_fault+0x566/0x630 handle_mm_fault+0x91/0x210 do_user_addr_fault+0x22c/0x740 exc_page_fault+0x65/0x150 asm_exc_page_fault+0x22/0x30 This problem arises from the fact that the repurposed writeback_dirty_page trace event code was written assuming that every pointer to mapping (struct address_space) would come from a file-mapped page-cache object, thus mapping->host would always be populated, and that was a valid case before commit 19343b5bdd16. The swap-cache address space (swapper_spaces), however, doesn't populate its ->host (struct inode) pointer, thus leading to the crashes in the corner-case aforementioned. commit 19343b5bdd16 ended up breaking the assignment of __entry->name and __entry->ino for the wait_on_page_writeback tracepoint -- both dependent on mapping->host carrying a pointer to a valid inode. The assignment of __entry->name was fixed by commit 68f23b89067f ("memcg: fix a crash in wb_workfn when a device disappears"), and this commit fixes the remaining case, for __entry->ino. Link: https://lkml.kernel.org/r/20230606233613.1290819-1-aquini@redhat.com Fixes: 19343b5bdd16 ("mm/page-writeback: introduce tracepoint for wait_on_page_writeback()") Signed-off-by: Rafael Aquini Reviewed-by: Yafang Shao Cc: Aristeu Rozanski Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 0fd6652c93bb3594654b7cba39e7517dce4c2246 Author: Roberto Sassu Date: Wed Jun 7 15:24:27 2023 +0200 memfd: check for non-NULL file_seals in memfd_create() syscall commit 935d44acf621aa0688fef8312dec3e5940f38f4e upstream. Ensure that file_seals is non-NULL before using it in the memfd_create() syscall. 
One situation in which memfd_file_seals_ptr() could return a NULL pointer is when CONFIG_SHMEM=n, which would oops the kernel. Link: https://lkml.kernel.org/r/20230607132427.2867435-1-roberto.sassu@huaweicloud.com Fixes: 47b9012ecdc7 ("shmem: add sealing support to hugetlb-backed memfd") Signed-off-by: Roberto Sassu Cc: Marc-André Lureau Cc: Mike Kravetz Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit b8b17211c09494f9e2b80e8f465cc61f866a1b4e Author: Matthieu Baerts Date: Sat Jun 10 18:11:52 2023 +0200 selftests: mptcp: join: skip mixed tests if not supported commit 6673851be0fc1bfc3353ffb52ff26ae5468f12c9 upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the support of a mix of subflows in v4 and v6 by the in-kernel PM introduced by commit b9d69db87fb7 ("mptcp: let the in-kernel PM use mixed IPv4 and IPv6 addresses"). It looks like there is no external sign we can use to predict the expected behaviour. Instead of accepting different behaviours and thus not really checking for the expected behaviour, we are looking here for a specific kernel version. That's not ideal but it looks better than removing the test because it cannot support older kernel versions. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: ad3493746ebe ("selftests: mptcp: add test-cases for mixed v4/v6 subflows") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit f2a248443b524c4fa8dd81b986d0c30a760aa403 Author: Matthieu Baerts Date: Sat Jun 10 18:11:51 2023 +0200 selftests: mptcp: join: uniform listener tests commit 96b84195df61d374d8028cf426a115ae085031ec upstream. The alignment was different from the other tests because tabs were used instead of spaces. While at it, also use 'echo' instead of 'printf' to print the result to keep the same style as done in the other sub-tests.
And, even if it should be better with, also remove 'stdbuf' and sed's '--unbuffered' option because they are not used in the other subtests and they are not available when using a minimal environment with busybox. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 178d023208eb ("selftests: mptcp: listener test for in-kernel PM") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 54722f776dcbf030d3629a90297befc62e65ac33 Author: Matthieu Baerts Date: Sat Jun 10 18:11:50 2023 +0200 selftests: mptcp: join: skip PM listener tests if not supported commit 0471bb479af03874b09350fcfe51d3743a5608de upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the support of PM listener events introduced by commit f8c9dfbd875b ("mptcp: add pm listener events"). It is possible to look for "mptcp_event_pm_listener" in kallsyms to know in advance if the kernel supports this feature. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 178d023208eb ("selftests: mptcp: listener test for in-kernel PM") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit cf84db1d9ea391088b55e8fde670eb6859c564e5 Author: Matthieu Baerts Date: Sat Jun 10 18:11:49 2023 +0200 selftests: mptcp: join: skip MPC backups tests if not supported commit 632978f0a961b4591a05ba9e39eab24541d83e84 upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the support of sending an MP_PRIO signal for the initial subflow, introduced by commit c157bbe776b7 ("mptcp: allow the in kernel PM to set MPC subflow priority"). It is possible to look for "mptcp_subflow_send_ack" in kallsyms because it was needed to introduce the mentioned feature. 
So we can know in advance if the feature is supported instead of trying and accepting any results. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 914f6a59b10f ("selftests: mptcp: add MPC backup tests") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit ef15b6e6e15d0391ad9b62eef171af5839de5bf8 Author: Matthieu Baerts Date: Sat Jun 10 18:11:48 2023 +0200 selftests: mptcp: join: skip fail tests if not supported commit ff8897b5189495b47895ca247b860a29dc04b36b upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the support of the MP_FAIL / infinite mapping introduced by commit 1e39e5a32ad7 ("mptcp: infinite mapping sending") and the following ones. It is possible to look for one of the infinite mapping counters to know in advance if this feature is available. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: b6e074e171bc ("selftests: mptcp: add infinite map testcase") Cc: stable@vger.kernel.org Fixes: 2ba18161d407 ("selftests: mptcp: add MP_FAIL reset testcase") Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit e7b1950400a6e77afb06ae730ef2cf268077fe63 Author: Matthieu Baerts Date: Sat Jun 10 18:11:47 2023 +0200 selftests: mptcp: join: skip userspace PM tests if not supported commit f2b492b04a167261e1c38eb76f78fb4294473a49 upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the support of the userspace PM introduced by commit 4638de5aefe5 ("mptcp: handle local addrs announced by userspace PMs") and the following ones. It is possible to look for the MPTCP pm_type's sysctl knob to know in advance if the userspace PM is available.
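The two kinds of probe mentioned in these commits, a symbol lookup in kallsyms and a check for a sysctl knob, could be sketched in userspace roughly as follows. This is an illustration only and not the selftest code, which is written in shell: `kallsyms_has` and `sysctl_exists` are hypothetical helper names, and the dotted knob name net.mptcp.pm_type used in the usage note is an assumption about where the pm_type knob lives.

```python
import os


def kallsyms_has(symbol, kallsyms_path="/proc/kallsyms"):
    """Return True if `symbol` is listed in a kallsyms-style file."""
    try:
        with open(kallsyms_path) as f:
            for line in f:
                fields = line.split()
                # Each line looks like "<address> <type> <name> [module]".
                if len(fields) >= 3 and fields[2] == symbol:
                    return True
    except OSError:
        # Unreadable or missing file: treat the symbol as not found.
        return False
    return False


def sysctl_exists(name, proc_sys="/proc/sys"):
    """Return True if the sysctl knob `name` (dotted form) exists under proc_sys."""
    return os.path.exists(os.path.join(proc_sys, *name.split(".")))
```

A test runner could then call, for example, `kallsyms_has("mptcp_subflow_send_ack")` or `sysctl_exists("net.mptcp.pm_type")` and skip the matching sub-tests when the result is False. Both helpers take the file location as a parameter so they can be exercised against a temporary directory.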
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 5ac1d2d63451 ("selftests: mptcp: Add tests for userspace PM type") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 1cc9d563c75d1536bc11623232e7a08726004e91 Author: Matthieu Baerts Date: Sat Jun 10 18:11:46 2023 +0200 selftests: mptcp: join: skip fullmesh flag tests if not supported commit 9db34c4294af9999edc773d96744e2d2d4eb5060 upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the support of the fullmesh flag for the in-kernel PM introduced by commit 2843ff6f36db ("mptcp: remote addresses fullmesh") and commit 1a0d6136c5f0 ("mptcp: local addresses fullmesh"). It looks like there is no easy external sign we can use to predict the expected behaviour. We could add the flag and then check if it has been added but for that, and for each fullmesh test, we would need to setup a new environment, do the checks, clean it and then only start the test from yet another clean environment. To keep it simple and avoid introducing new issues, we look for a specific kernel version. That's not ideal but an acceptable solution for this case. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 6a0653b96f5d ("selftests: mptcp: add fullmesh setting tests") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 8dfc005754328c8dbf0c5e7dc2307862ecd14c4b Author: Matthieu Baerts Date: Sat Jun 10 18:11:45 2023 +0200 selftests: mptcp: join: skip backup if set flag on ID not supported commit 07216a3c5d926bf1b6b360a0073747228a1f9b7f upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. 
Commit bccefb762439 ("selftests: mptcp: simplify pm_nl_change_endpoint") has simplified the way the backup flag is set on an endpoint. Instead of doing: ./pm_nl_ctl set 10.0.2.1 flags backup Now we do: ./pm_nl_ctl set id 1 flags backup The new way is easier to maintain but it is also incompatible with older kernels that do not support the implicit endpoints, which put in place the infrastructure to set flags per ID, hence the second Fixes tag. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: bccefb762439 ("selftests: mptcp: simplify pm_nl_change_endpoint") Cc: stable@vger.kernel.org Fixes: 4cf86ae84c71 ("mptcp: strict local address ID selection") Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 8d4020b1e3813bde224718fe5f4473e0c67c565c Author: Matthieu Baerts Date: Sat Jun 10 18:11:44 2023 +0200 selftests: mptcp: join: skip implicit tests if not supported commit 36c4127ae8dd0ebac6d56d8a1b272dd483471c40 upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the support of the implicit endpoints introduced by commit d045b9eb95a9 ("mptcp: introduce implicit endpoints"). It is possible to look for "mptcp_subflow_send_ack" in kallsyms because it was needed to introduce the mentioned feature. So we can know in advance if the feature is supported instead of trying and accepting any results. Note that here and in the following commits, we re-do the same check for each sub-test of the same function for a few reasons. The main one is not to break the ID assigned to each test in order to be able to easily compare results between different kernel versions. Also, we can still run a specific test even if it is skipped. Another reason is that it makes it clear during the review that a specific subtest will be skipped or not under certain conditions.
At the end, it looks OK to call the exact same helper multiple times: it is not a critical path and it is the same code that is executed, not really more cases to maintain. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 69c6ce7b6eca ("selftests: mptcp: add implicit endpoint test case") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 5abad19fe5de1e1a203809e71646db211e5f3ef0 Author: Matthieu Baerts Date: Sat Jun 10 18:11:43 2023 +0200 selftests: mptcp: join: support RM_ADDR for used endpoints or not commit 425ba803124b90cb9124d99f13b372a89dc151d9 upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. At some points, a new feature caused internal behaviour changes we are verifying in the selftests, see the Fixes tag below. It was not a UAPI change but because in these selftests, we check some internal behaviours, it is normal we have to adapt them from time to time after having added some features. It looks like there is no external sign we can use to predict the expected behaviour. Instead of accepting different behaviours and thus not really checking for the expected behaviour, we are looking here for a specific kernel version. That's not ideal but it looks better than removing the test because it cannot support older kernel versions. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 6fa0174a7c86 ("mptcp: more careful RM_ADDR generation") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit bbeeab87e8a194c56edb80fb92594b135f69bd5b Author: Matthieu Baerts Date: Sat Jun 10 18:11:42 2023 +0200 selftests: mptcp: join: skip Fastclose tests if not supported commit ae947bb2c253ff5f395bb70cb9db8700543bf398 upstream. 
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the support of MP_FASTCLOSE introduced in commit f284c0c77321 ("mptcp: implement fastclose xmit path"). If the MIB counter is not available, the test cannot be verified and the behaviour will not be the expected one. So we can skip the test if the counter is missing. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 01542c9bf9ab ("selftests: mptcp: add fastclose testcase") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 34d183c3f421ecbc649fa73f16ac82605fcdba34 Author: Matthieu Baerts Date: Sat Jun 10 18:11:41 2023 +0200 selftests: mptcp: join: support local endpoint being tracked or not commit d4c81bbb8600257fd3076d0196cb08bd2e5bdf24 upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. At some points, a new feature caused internal behaviour changes we are verifying in the selftests, see the Fixes tag below. It was not a uAPI change but because in these selftests, we check some internal behaviours, it is normal we have to adapt them from time to time after having added some features. It is possible to look for "mptcp_pm_subflow_check_next" in kallsyms because it was needed to introduce the mentioned feature. So we can know in advance what the behaviour we are expecting here instead of supporting the two behaviours. 
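The kallsyms lookup described just above could be sketched as below. This is a minimal illustration, not the helper from the selftests library: the name has_kallsyms_symbol and the optional file argument are hypothetical, and each kallsyms line is assumed to have the form "address type name", optionally followed by a module name in brackets.

```bash
# Hypothetical helper: succeed if a symbol name appears in a kallsyms-style file.
# $1: exact symbol name, e.g. mptcp_pm_subflow_check_next
# $2: optional file path, defaults to /proc/kallsyms (assumption, for testability)
has_kallsyms_symbol() {
	local sym="$1"
	local file="${2:-/proc/kallsyms}"

	# Compare the third field exactly so that a symbol sharing only a
	# prefix with the requested name does not count as a match.
	awk -v s="$sym" '$3 == s { found = 1; exit } END { exit !found }' "$file"
}
```

Matching the whole field rather than a substring is a design choice: it avoids treating a longer, unrelated symbol as proof that the feature is present.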
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 86e39e04482b ("mptcp: keep track of local endpoint still available for each msk") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 76cede8594b154b1123a521727c7223752d6a844 Author: Matthieu Baerts Date: Sat Jun 10 18:11:40 2023 +0200 selftests: mptcp: join: skip test if iptables/tc cmds fail commit 4a0b866a3f7d3c22033f40e93e94befc6fe51bce upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. Some tests are using IPTables and/or TC commands to force some behaviours. If one of these commands fails -- likely because some features are not available due to missing kernel config -- we should intercept the error and skip the tests requiring these features. Note that if we expect to have these features available and if SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES env var is set to 1, the tests will be marked as failed instead of skipped. This patch also replaces the 'exit 1' by 'return 1' not to stop the selftest in the middle without the conclusion if there is an issue with NF or TC. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 8d014eaa9254 ("selftests: mptcp: add ADD_ADDR timeout test case") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 041c1e5388e816fc36b1fd4025566b95edc8ed0a Author: Matthieu Baerts Date: Sat Jun 10 18:11:39 2023 +0200 selftests: mptcp: join: skip check if MIB counter not supported commit 47867f0a7e831e24e5eab3330667ce9682d50fb1 upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the MPTCP MIB counters introduced in commit fc518953bc9c ("mptcp: add and use MIB counter infrastructure") and more later. The MPTCP Join selftest heavily relies on these counters. 
If a counter is not supported by the kernel, it is not displayed when using 'nstat -z'. We can then detect that and skip the verification. A new helper (get_counter()) has been added to do the required checks and return an error if the counter is not available. Note that if we expect to have these features available and if SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES env var is set to 1, the tests will be marked as failed instead of skipped. This new helper also makes sure we get the exact counter we want to avoid issues we had in the past, e.g. with MPTcpExtRmAddr and MPTcpExtRmAddrDrop sharing the same prefix. While at it, we make the way we fetch a MIB counter uniform. Note for the backports: we rarely change these modified blocks so if there are conflicts, it is very likely because a counter is not used in the older kernels and we don't need that chunk. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: b08fbf241064 ("selftests: add test-cases for MPTCP MP_JOIN") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 2d36726a8b5e5d184add6155bc16cf7bc6433d6c Author: Matthieu Baerts Date: Sat Jun 10 18:11:38 2023 +0200 selftests: mptcp: join: helpers to skip tests commit cdb50525345cf5a8359ee391032ef606a7826f08 upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. Here are some helpers that will be used to mark subtests as skipped if a feature is not supported. Marking as a fix for the commit introducing this selftest to help with the backports. While at it, also check if kallsyms feature is available as it will also be used in the following commits to check if MPTCP features are available before starting a test.
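The exact-match counter lookup that the get_counter() commit above describes could be sketched as follows. This is not the real get_counter(): the helper name counter_from_nstat is hypothetical, it reads nstat-style text from standard input instead of calling nstat, and the input is assumed to consist of "name value rate" lines after a header line.

```bash
# Hypothetical helper: print the value of one counter from nstat-style text.
# $1: exact counter name, e.g. MPTcpExtRmAddr
# stdin: lines of the assumed form "<name> <value> <rate>"
# Returns non-zero when the counter is absent, so the caller can skip the check.
counter_from_nstat() {
	local name="$1"

	# Exact comparison on the first field: MPTcpExtRmAddr must not match
	# MPTcpExtRmAddrDrop, which shares the same prefix.
	awk -v n="$name" '$1 == n { print $2; found = 1; exit } END { exit !found }'
}
```

In a test it might be fed from something like: nstat -asz | counter_from_nstat MPTcpExtRmAddr, with the non-zero return used to decide between skipping and failing.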
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: b08fbf241064 ("selftests: add test-cases for MPTCP MP_JOIN") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 06738b3730d92dbc85073f4a1f8d35ce16fcef44 Author: Matthieu Baerts Date: Sat Jun 10 18:11:37 2023 +0200 selftests: mptcp: join: use 'iptables-legacy' if available commit 0c4cd3f86a40028845ad6f8af5b37165666404cd upstream. IPTables commands using 'iptables-nft' fail on old kernels, at least 5.15 because it doesn't see the default IPTables chains: $ iptables -L iptables/1.8.2 Failed to initialize nft: Protocol not supported As a first step before switching to NFTables, we can use iptables-legacy if available. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 8d014eaa9254 ("selftests: mptcp: add ADD_ADDR timeout test case") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 8955f3fd758bf58cd189e179f7061ddd4bf30cc8 Author: Matthieu Baerts Date: Sat Jun 10 18:11:36 2023 +0200 selftests: mptcp: lib: skip if not below kernel version commit b1a6a38ab8a633546cefae890da842f19e006c74 upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. A new function is now available to easily detect if a feature is missing by looking at the kernel version. That's clearly not ideal and this kind of check should be avoided as soon as possible. But sometimes, there is no external sign that a "feature" is available or not: internal behaviours can change without modifying the uAPI and these selftests are verifying the internal behaviours. Sometimes, the only (easy) way to verify if the feature is present is to run the test but then the validation cannot determine if there is a failure with the feature or if the feature is missing.
Then it looks better to check the kernel version instead of having tests that can never fail. In any case, we need a solution not to have a whole selftest being marked as failed just because one sub-test has failed. Note that this env var can be set to 1 not to do such check and run the linked sub-test: SELFTESTS_MPTCP_LIB_NO_KVERSION_CHECK. This new helper is going to be used in the following commits. In order to ease the backport of such future patches, it would be good if this patch is backported up to the introduction of MPTCP selftests, hence the Fixes tag below: this type of check was supposed to be done from the beginning. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 048d19d444be ("mptcp: add basic kselftest for mptcp") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit f13f3a5b0186613c3cd95c4ce1cd011c83899789 Author: Matthieu Baerts Date: Thu Jun 8 18:38:56 2023 +0200 selftests: mptcp: userspace pm: skip PM listener events tests if unavailable commit 626cb7a5f6b892e48f27a76d11af040c538e03dc upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the new listener events linked to the path-manager introduced by commit f8c9dfbd875b ("mptcp: add pm listener events"). It is possible to look for "mptcp_event_pm_listener" in kallsyms to know in advance if the kernel supports this feature and skip these sub-tests if the feature is not supported.
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 6c73008aa301 ("selftests: mptcp: listener test for userspace PM") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit c045d528bc6e5e747d27ca8590ddcb115f21ca6f Author: Matthieu Baerts Date: Thu Jun 8 18:38:55 2023 +0200 selftests: mptcp: userspace pm: skip if not supported commit f90adb033891d418c5dafef34a9aa49f3c860991 upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the MPTCP Userspace PM introduced by commit 4638de5aefe5 ("mptcp: handle local addrs announced by userspace PMs"). We can skip all these tests if the feature is not supported simply by looking for the MPTCP pm_type's sysctl knob. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 259a834fadda ("selftests: mptcp: functional tests for the userspace PM type") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 679bcea381eaf8414292d6d4b29fbab562ccad4d Author: Matthieu Baerts Date: Thu Jun 8 18:38:54 2023 +0200 selftests: mptcp: userspace pm: skip if 'ip' tool is unavailable commit 723d6b9b12338c1caf06bf6fe269962ef04e2c71 upstream. When a required tool is missing, the return code 4 (SKIP) should be returned instead of 1 (FAIL). Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 259a834fadda ("selftests: mptcp: functional tests for the userspace PM type") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 84ffad3e30b203eed48b572c62ec0488342977fa Author: Matthieu Baerts Date: Thu Jun 8 18:38:53 2023 +0200 selftests: mptcp: sockopt: skip TCP_INQ checks if not supported commit b631e3a4e94c77c9007d60b577a069c203ce9594 upstream. 
Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is TCP_INQ cmsg support introduced in commit 2c9e77659a0c ("mptcp: add TCP_INQ cmsg support"). It is possible to look for "mptcp_ioctl" in kallsyms because it was needed to introduce the mentioned feature. We can skip these tests and not set TCPINQ option if the feature is not supported. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 5cbd886ce2a9 ("selftests: mptcp: add TCP_INQ support") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit b77b4113e5e8fe2e763f1366dd6af62fb9066aa4 Author: Matthieu Baerts Date: Thu Jun 8 18:38:52 2023 +0200 selftests: mptcp: sockopt: skip getsockopt checks if not supported commit c6f7eccc519837ebde1d099d9610c4f1d5bd975e upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the getsockopt(SOL_MPTCP) to get info about the MPTCP connections introduced by commit 55c42fa7fa33 ("mptcp: add MPTCP_INFO getsockopt") and the following ones. It is possible to look for "mptcp_diag_fill_info" in kallsyms because it is introduced by the mentioned feature. So we can know in advance if the feature is supported and skip the sub-test if not. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: ce9979129a0b ("selftests: mptcp: add mptcp getsockopt test cases") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 395634f28f70966b4c9c56714a82964cb22fba90 Author: Matthieu Baerts Date: Thu Jun 8 18:38:51 2023 +0200 selftests: mptcp: sockopt: relax expected returned size commit 8dee6ca2ac1e5630a7bb6a98bc0b686916fc2000 upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. 
One of them is the getsockopt(SOL_MPTCP) to get info about the MPTCP connections introduced by commit 55c42fa7fa33 ("mptcp: add MPTCP_INFO getsockopt") and the following ones. We cannot guess in advance which sizes the kernel will return: older kernels can return smaller sizes, e.g. recently the tcp_info structure has been modified in commit 71fc704768f6 ("tcp: add rcv_wnd and plb_rehash to TCP_INFO") where a new field has been added. The userspace can also expect a smaller size if it is compiled with old uAPI kernel headers. So for these sizes, we can only check if they are above a certain threshold, 0 for the moment. We can also only compare sizes with the ones set by the kernel. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: ce9979129a0b ("selftests: mptcp: add mptcp getsockopt test cases") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 5316dde8e05100846794d69835473ef64d0bde4d Author: Matthieu Baerts Date: Thu Jun 8 18:38:50 2023 +0200 selftests: mptcp: pm nl: skip fullmesh flag checks if not supported commit f3761b50b8e4cb4807b5d41e02144c8c8a0f2512 upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the fullmesh flag that can be given to the MPTCP in-kernel path-manager and introduced in commit 2843ff6f36db ("mptcp: remote addresses fullmesh"). If the flag is not visible in the dump after having set it, we don't check the content. Note that if we expect to have this feature and SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES env var is set to 1, we always check the content to avoid regressions.
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 6da1dfdd037e ("selftests: mptcp: add set_flags tests in pm_netlink.sh") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 6cd3dc192b1443a53f09c8974bff9f9eea711f67 Author: Matthieu Baerts Date: Thu Jun 8 18:38:49 2023 +0200 selftests: mptcp: pm nl: remove hardcoded default limits commit 2177d0b08e421971e035672b70f3228d9485c650 upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the checks of the default limits returned by the MPTCP in-kernel path-manager. The default values have been modified by commit 72bcbc46a5c3 ("mptcp: increase default max additional subflows to 2"). Instead of comparing with hardcoded values, we can get the default one and compare with them. Note that if we expect to have the latest version, we continue to check the hardcoded values to avoid unexpected behaviour changes. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: eedbc685321b ("selftests: add PM netlink functional tests") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit f642670e772757b163aed403e434f7a0f60b6ca2 Author: Matthieu Baerts Date: Thu Jun 8 18:38:48 2023 +0200 selftests: mptcp: diag: skip inuse tests if not supported commit dc93086aff040349b5b2a4608c71ea01286dc2cc upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the reporting of the MPTCP sockets being used, introduced by commit c558246ee73e ("mptcp: add statistics for mptcp socket in use"). Similar to the parent commit, it looks like there is no good pre-check to do here, i.e. dedicated function available in kallsyms. Instead, we try to get info and if nothing is returned, the test is marked as skipped. 
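The "skipped unless every feature is expected" policy that several of these commits describe could be sketched as below. The helper name missing_feature_rc is hypothetical. The return codes follow the convention stated elsewhere in this log (4 for SKIP, 1 for FAIL), and SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES is the environment variable named in the commit messages; how the real library wires these together is not shown here.

```bash
# Hypothetical helper: choose the result when a feature appears to be missing.
# Returns 4 (SKIP) by default, or 1 (FAIL) when the tester declared that all
# features are expected by setting SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES=1.
missing_feature_rc() {
	if [ "${SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES:-}" = "1" ]; then
		return 1
	fi
	return 4
}
```

The point of the split is that a missing feature on an old kernel is reported as skipped, while the same observation on a kernel known to support the feature is reported as a regression.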
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: e04a30f78809 ("selftest: mptcp: add test for mptcp socket in use") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit b6e7effd1d1132794f976b0759498df8bacee63f Author: Matthieu Baerts Date: Thu Jun 8 18:38:47 2023 +0200 selftests: mptcp: diag: skip listen tests if not supported commit dc97251bf0b70549c76ba261516c01b8096771c5 upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the listen diag dump support introduced by commit 4fa39b701ce9 ("mptcp: listen diag dump support"). It looks like there is no good pre-check to do here, i.e. dedicated function available in kallsyms. Instead, we try to get info and if nothing is returned, the test is marked as skipped. That's not ideal because something could be wrong with the feature and instead of reporting an error, the test could be marked as skipped. If we know in advance that the feature is supposed to be supported, the tester can set SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES env var to 1: in this case the test will report an error instead of marking the test as skipped if nothing is returned. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: f2ae0fa68e28 ("selftests/mptcp: add diag listen tests") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 2b108e002702a8fece38899108ffb82f278c97a8 Author: Matthieu Baerts Date: Thu Jun 8 18:38:46 2023 +0200 selftests: mptcp: connect: skip TFO tests if not supported commit 06b03083158e90d57866fa220de92c8dd8b9598b upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features.
One of them is the support of TCP_FASTOPEN socket option with MPTCP connections introduced by commit 4ffb0a02346c ("mptcp: add TCP_FASTOPEN sock option"). It is possible to look for "mptcp_fastopen_" in kallsyms to know if the feature is supported or not. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: ca7ae8916043 ("selftests: mptcp: mptfo Initiator/Listener") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 4ac7d1530abc90ca77fd2a1412c9100861ec9628 Author: Matthieu Baerts Date: Thu Jun 8 18:38:45 2023 +0200 selftests: mptcp: connect: skip disconnect tests if not supported commit 4ad39a42da2e9770c8e4c37fe632ed8898419129 upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the full support of disconnections from the userspace introduced by commit b29fcfb54cd7 ("mptcp: full disconnect implementation"). It is possible to look for "mptcp_pm_data_reset" in kallsyms because a preparation patch added it to ease the introduction of the mentioned feature. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 05be5e273c84 ("selftests: mptcp: add disconnect tests") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit a5e7e9313e7494bef087bf4ac2f87c907b29afac Author: Matthieu Baerts Date: Thu Jun 8 18:38:44 2023 +0200 selftests: mptcp: connect: skip transp tests if not supported commit 07bf49401909264a38fa3427c3cce43e8304436a upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. One of them is the support of IP(V6)_TRANSPARENT socket option with MPTCP connections introduced by commit c9406a23c116 ("mptcp: sockopt: add SOL_IP freebind & transparent options"). 
It is possible to look for "__ip_sock_set_tos" in kallsyms because IP(V6)_TRANSPARENT socket option support has been added after TOS support which came with the required infrastructure in MPTCP sockopt code. To support TOS, the following function has been exported (T). Not great but better than checking for a specific kernel version. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 5fb62e9cd3ad ("selftests: mptcp: add tproxy test case") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 4ebb55ee14b7e4e1b8aaed6f9a0f757bc19fbc7b Author: Matthieu Baerts Date: Thu Jun 8 18:38:43 2023 +0200 selftests: mptcp: lib: skip if missing symbol commit 673004821ab98c6645bd21af56a290854e88f533 upstream. Selftests are supposed to run on any kernels, including the old ones not supporting all MPTCP features. New functions are now available to easily detect if a certain feature is missing by looking at kallsyms. These new helpers are going to be used in the following commits. In order to ease the backport of such future patches, it would be good if this patch is backported up to the introduction of MPTCP selftests, hence the Fixes tag below: this type of check was supposed to be done from the beginning. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/368 Fixes: 048d19d444be ("mptcp: add basic kselftest for mptcp") Cc: stable@vger.kernel.org Signed-off-by: Matthieu Baerts Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit e0fec5d8c040a486e9e7b1b3ccd970afd8e6db71 Author: Matthieu Baerts Date: Fri Apr 14 17:47:10 2023 +0200 selftests: mptcp: join: fix ShellCheck warnings commit 0fcd72df8847d3a62eb34a084862157ce0564a94 upstream. Most of the code had an issue according to ShellCheck. That's mainly due to the fact it incorrectly believes most of the code was unreachable because it's invoked by variable name, see how the "tests" array is used. 
Once SC2317 has been ignored, three small warnings were still visible: - SC2155: Declare and assign separately to avoid masking return values. - SC2046: Quote this to prevent word splitting: can be ignored because "ip netns pids" can display more than one pid. - SC2166: Prefer [ p ] || [ q ] as [ p -o q ] is not well defined. This probably didn't fix any actual issues but it might help spotting new interesting warnings reported by ShellCheck as just before, ShellCheck was reporting issues for most lines making it a bit useless. Signed-off-by: Matthieu Baerts Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 7f0ac5c05bac8eebba9bb8bb3f4025a06f724add Author: Matthieu Baerts Date: Fri Apr 14 17:47:09 2023 +0200 selftests: mptcp: remove duplicated entries in usage commit 0a85264e48b642d360720589fdb837a3643fb9b0 upstream. mptcp_connect tool was printing some duplicated entries when showing how to use it: -j -l -r While at it, I also: - moved the very few entries that were not sorted, - added -R that was missing since commit 8a4b910d005d ("mptcp: selftests: add rcvbuf set option"), - removed the -u parameter that has been removed in commit f730b65c9d85 ("selftests: mptcp: try to set mptcp ulp mode in different sk states"). No need to backport this, it is just an internal tool used by our selftests. The help menu is mainly useful for MPTCP kernel devs. Acked-by: Paolo Abeni Signed-off-by: Matthieu Baerts Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 77d9967b43c273bf0d20c23f34d85d5424929376 Author: Nathan Chancellor Date: Tue Jun 20 17:44:50 2023 +0000 riscv: Link with '-z norelro' This patch fixes a stable only patch, so it has no direct upstream equivalent. 
After a stable only patch to explicitly handle the '.got' section to handle an orphan section warning from the linker, certain configurations error when linking with ld.lld, which enables relro by default: ld.lld: error: section: .got is not contiguous with other relro sections This has come up with other architectures before, such as arm and arm64 in commit 0cda9bc15dfc ("ARM: 9038/1: Link with '-z norelro'") and commit 3b92fa7485eb ("arm64: link with -z norelro regardless of CONFIG_RELOCATABLE"). Additionally, '-z norelro' is used unconditionally for RISC-V upstream after commit 26e7aacb83df ("riscv: Allow to downgrade paging mode from the command line"), which alluded to this issue for the same reason. Bring 6.3 in line with mainline and link with '-z norelro', which resolves the above link failure. Fixes: e6d1562dd4e9 ("riscv: vmlinux.lds.S: Explicitly handle '.got' section") Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202306192231.DJmWr6BX-lkp@intel.com/ Signed-off-by: Nathan Chancellor Acked-by: Palmer Dabbelt Signed-off-by: Greg Kroah-Hartman commit cb23a5f55bb7e29d58612ef57601d7a973bef3b3 Author: Michael S. Tsirkin Date: Thu Jun 8 17:42:53 2023 -0400 Revert "virtio-blk: support completion batching for the IRQ path" commit afd384f0dbea2229fd11159efb86a5b41051c4a9 upstream. This reverts commit 07b679f70d73483930e8d3c293942416d9cd5c13. This change appears to have broken things... We now see applications hanging during disk accesses. e.g. multi-port virtio-blk device running in h/w (FPGA) Host running a simple 'fio' test. [global] thread=1 direct=1 ioengine=libaio norandommap=1 group_reporting=1 bs=4K rw=read iodepth=128 runtime=1 numjobs=4 time_based [job0] filename=/dev/vda [job1] filename=/dev/vdb [job2] filename=/dev/vdc ... [job15] filename=/dev/vdp i.e. 16 disks; 4 queues per disk; simple burst of 4KB reads This is repeatedly run in a loop. After a few, normally <10 seconds, fio hangs. 
With 64 queues (16 disks), failure occurs within a few seconds; with 8 queues (2 disks) it may take ~hour before hanging. Last message: fio-3.19 Starting 8 threads Jobs: 1 (f=1): [_(7),R(1)][68.3%][eta 03h:11m:06s] I think this means at the end of the run 1 queue was left incomplete. 'diskstats' (run while fio is hung) shows no outstanding transactions. e.g. $ cat /proc/diskstats ... 252 0 vda 1843140071 0 14745120568 712568645 0 0 0 0 0 3117947 712568645 0 0 0 0 0 0 252 16 vdb 1816291511 0 14530332088 704905623 0 0 0 0 0 3117711 704905623 0 0 0 0 0 0 ... Other stats (in the h/w, and added to the virtio-blk driver ([a]virtio_queue_rq(), [b]virtblk_handle_req(), [c]virtblk_request_done()) all agree, and show every request had a completion, and that virtblk_request_done() never gets called. e.g. PF= 0 vq=0 1 2 3 [a]request_count - 839416590 813148916 105586179 84988123 [b]completion1_count - 839416590 813148916 105586179 84988123 [c]completion2_count - 0 0 0 0 PF= 1 vq=0 1 2 3 [a]request_count - 823335887 812516140 104582672 75856549 [b]completion1_count - 823335887 812516140 104582672 75856549 [c]completion2_count - 0 0 0 0 i.e. the issue is after the virtio-blk driver. This change was introduced in kernel 6.3.0. I am seeing this using 6.3.3. If I run with an earlier kernel (5.15), it does not occur. If I make a simple patch to the 6.3.3 virtio-blk driver, to skip the blk_mq_add_to_batch()call, it does not fail. e.g. kernel 5.15 - this is OK virtio_blk.c,virtblk_done() [irq handler] if (likely(!blk_should_fake_timeout(req->q))) { blk_mq_complete_request(req); } kernel 6.3.3 - this fails virtio_blk.c,virtblk_handle_req() [irq handler] if (likely(!blk_should_fake_timeout(req->q))) { if (!blk_mq_complete_request_remote(req)) { if (!blk_mq_add_to_batch(req, iob, virtblk_vbr_status(vbr), virtblk_complete_batch)) { virtblk_request_done(req); //this never gets called... 
so blk_mq_add_to_batch() must always succeed } } } If I do, kernel 6.3.3 - this is OK virtio_blk.c,virtblk_handle_req() [irq handler] if (likely(!blk_should_fake_timeout(req->q))) { if (!blk_mq_complete_request_remote(req)) { virtblk_request_done(req); //force this here... if (!blk_mq_add_to_batch(req, iob, virtblk_vbr_status(vbr), virtblk_complete_batch)) { virtblk_request_done(req); //this never gets called... so blk_mq_add_to_batch() must always succeed } } } Perhaps you might like to fix/test/revert this change... Martin Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202306090826.C1fZmdMe-lkp@intel.com/ Cc: Suwan Kim Tested-by: edliaw@google.com Reported-by: "Roberts, Martin" Message-Id: <336455b4f630f329380a8f53ee8cad3868764d5c.1686295549.git.mst@redhat.com> Signed-off-by: Michael S. Tsirkin Signed-off-by: Greg Kroah-Hartman commit b84d0644a0a40fb4b8aa1201d087c71cc69554e4 Author: Thomas Gleixner Date: Thu Jun 15 11:18:30 2023 +0200 tick/common: Align tick period during sched_timer setup commit 13bb06f8dd42071cb9a49f6e21099eea05d4b856 upstream. The tick period is aligned very early while the first clock_event_device is registered. At that point the system runs in periodic mode and switches later to one-shot mode if possible. The next wake-up event is programmed based on the aligned value (tick_next_period) but the delta value, that is used to program the clock_event_device, is computed based on ktime_get(). With the subtracted offset, the device fires earlier than the exact time frame. With a large enough offset the system programs the timer for the next wake-up and the remaining time left is too small to make any boot progress. The system hangs. Move the alignment later to the setup of tick_sched timer. At this point the system switches to oneshot mode and a high resolution clocksource is available. At this point it is safe to align tick_next_period because ktime_get() will now return accurate (not jiffies based) time. 

    [bigeasy: Patch description + testing].

    Fixes: e9523a0d81899 ("tick/common: Align tick period with the HZ tick.")
    Reported-by: Mathias Krause
    Reported-by: "Bhatnagar, Rishabh"
    Suggested-by: Mathias Krause
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Thomas Gleixner
    Tested-by: Richard W.M. Jones
    Tested-by: Mathias Krause
    Acked-by: SeongJae Park
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/5a56290d-806e-b9a5-f37c-f21958b5a8c0@grsecurity.net
    Link: https://lore.kernel.org/12c6f9a3-d087-b824-0d05-0d18c9bc1bf3@amazon.com
    Link: https://lore.kernel.org/r/20230615091830.RxMV2xf_@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

commit 779bc5f99a320f4f7793b327415fed6759c504a6
Author: Vishal Moola (Oracle)
Date:   Wed Jun 7 13:41:20 2023 -0700

    afs: Fix waiting for writeback then skipping folio

    commit 819da022dd007398d0c42ebcd8dbb1b681acea53 upstream.

    Commit acc8d8588cb7 converted afs_writepages_region() to write back a
    folio batch. The function waits for writeback to a folio, but then
    proceeds to the rest of the batch without trying to write that folio
    again. This patch has it attempt to write the folio again.

    [DH: Also remove an 'else' that adding a goto makes redundant]

    Fixes: acc8d8588cb7 ("afs: convert afs_writepages_region() to use filemap_get_folios_tag()")
    Signed-off-by: Vishal Moola (Oracle)
    Signed-off-by: David Howells
    cc: Marc Dionne
    cc: linux-afs@lists.infradead.org
    Link: https://lore.kernel.org/r/20230607204120.89416-2-vishal.moola@gmail.com/
    Signed-off-by: Greg Kroah-Hartman

commit e98ffc5f9042e7d460c8224e1268b50ee546d3a1
Author: Vishal Moola (Oracle)
Date:   Wed Jun 7 13:41:19 2023 -0700

    afs: Fix dangling folio ref counts in writeback

    commit a2b6f2ab3e144f8e23666aafeba0e4d9ea4b7975 upstream.

    Commit acc8d8588cb7 converted afs_writepages_region() to write back a
    folio batch. If writeback needs rescheduling, the function exits without
    dropping the references to the folios in fbatch. This patch fixes that.

    [DH: Moved the added line before the _leave()]

    Fixes: acc8d8588cb7 ("afs: convert afs_writepages_region() to use filemap_get_folios_tag()")
    Signed-off-by: Vishal Moola (Oracle)
    Signed-off-by: David Howells
    cc: Marc Dionne
    cc: linux-afs@lists.infradead.org
    Link: https://lore.kernel.org/r/20230607204120.89416-1-vishal.moola@gmail.com/
    Signed-off-by: Greg Kroah-Hartman

commit cd396376fbba1f0ffc50caf8e7880d8c0de0792b
Author: Linus Torvalds
Date:   Wed Jun 21 10:58:46 2023 -0700

    Revert "efi: random: refresh non-volatile random seed when RNG is initialized"

    commit 69cbeb61ff9093a9155cb19a36d633033f71093a upstream.

    This reverts commit e7b813b32a42a3a6281a4fd9ae7700a0257c1d50 (and the
    subsequent fix for it: 41a15855c1ee "efi: random: fix NULL-deref when
    refreshing seed").

    It turns out to cause non-deterministic boot stalls on at least a HP
    6730b laptop.

    Reported-and-bisected-by: Sami Korkalainen
    Link: https://lore.kernel.org/all/GQUnKz2al3yke5mB2i1kp3SzNHjK8vi6KJEh7rnLrOQ24OrlljeCyeWveLW9pICEmB9Qc8PKdNt3w1t_g3-Uvxq1l8Wj67PpoMeWDoH8PKk=@proton.me/
    Cc: Jason A. Donenfeld
    Cc: Bagas Sanjaya
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

commit 43576a9994d49bce4b39d4d1fba319e6ec89eb2e
Author: Mike Kravetz
Date:   Thu Jun 8 13:49:27 2023 -0700

    udmabuf: revert 'Add support for mapping hugepages (v4)'

    commit b7cb3821905b79b6ed474fd5ba34d1e187649139 upstream.

    This effectively reverts commit 16c243e99d33 ("udmabuf: Add support for
    mapping hugepages (v4)"). Recently, Junxiao Chang found a BUG with page
    map counting as described here [1]. This issue pointed out that the
    udmabuf driver was making direct use of subpages of hugetlb pages. This
    is not a good idea, and no other mm code attempts such use. In addition
    to the mapcount issue, this also causes issues with hugetlb vmemmap
    optimization and page poisoning.

    For now, remove hugetlb support.

    If udmabuf wants to be used on hugetlb mappings, it should be changed to
    only use complete hugetlb pages.
    This will require different alignment and size requirements on the
    UDMABUF_CREATE API.

    [1] https://lore.kernel.org/linux-mm/20230512072036.1027784-1-junxiao.chang@intel.com/

    Link: https://lkml.kernel.org/r/20230608204927.88711-1-mike.kravetz@oracle.com
    Fixes: 16c243e99d33 ("udmabuf: Add support for mapping hugepages (v4)")
    Signed-off-by: Mike Kravetz
    Acked-by: Greg Kroah-Hartman
    Acked-by: Vivek Kasireddy
    Acked-by: Gerd Hoffmann
    Cc: David Hildenbrand
    Cc: Dongwon Kim
    Cc: James Houghton
    Cc: Jerome Marchand
    Cc: Junxiao Chang
    Cc: Kirill A. Shutemov
    Cc: Michal Hocko
    Cc: Muchun Song
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

commit d1066c1b3663401cd23c0d6e60cdae750ce00c0f
Author: Namjae Jeon
Date:   Thu Jun 15 22:05:29 2023 +0900

    ksmbd: validate session id and tree id in the compound request

    commit 5005bcb4219156f1bf7587b185080ec1da08518e upstream.

    This patch validates the session id and tree id in a compound request.
    If the first operation in the compound is an SMB2 ECHO request, ksmbd
    bypasses session and tree validation, so work->sess and work->tcon could
    be NULL. If the second request in the compound accesses work->sess or
    tcon, it causes a NULL pointer dereference.

    Cc: stable@vger.kernel.org
    Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-21165
    Signed-off-by: Namjae Jeon
    Signed-off-by: Steve French
    Signed-off-by: Greg Kroah-Hartman

commit 58a9c41064df27632e780c5a3ae3e0e4284957d1
Author: Namjae Jeon
Date:   Thu Jun 15 22:04:40 2023 +0900

    ksmbd: fix out-of-bound read in smb2_write

    commit 5fe7f7b78290638806211046a99f031ff26164e1 upstream.

    ksmbd_smb2_check_message doesn't validate hdr->NextCommand. If
    ->NextCommand is bigger than Offset + Length of smb2 write, it will
    allow an oversized smb2 write length. This will cause an OOB read in
    smb2_write.
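The class of check this fix is about can be sketched in plain C as follows. This is a hypothetical illustration, not ksmbd code: `sketch_payload_in_bounds()` and its parameters are invented for the example, and the real validation in ksmbd operates on the SMB2 header fields rather than on bare integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from ksmbd): return true when a payload that
 * starts at data_offset and spans data_length bytes lies entirely inside
 * a received message of msg_len bytes.
 */
bool sketch_payload_in_bounds(size_t msg_len, uint32_t data_offset,
                              uint32_t data_length)
{
    /* Widen to 64 bits so that offset + length cannot wrap around. */
    uint64_t end = (uint64_t)data_offset + (uint64_t)data_length;

    return end <= (uint64_t)msg_len;
}
```

A caller would reject the request before touching the payload whenever the helper returns false. The widening to 64 bits matters: with 32-bit arithmetic, a large offset plus a small length could wrap to a small value and pass the comparison.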

    Cc: stable@vger.kernel.org
    Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-21164
    Signed-off-by: Namjae Jeon
    Signed-off-by: Steve French
    Signed-off-by: Greg Kroah-Hartman

commit 768caf4019f0391c0b6452afe34cea1704133f7b
Author: Namjae Jeon
Date:   Mon Jun 5 01:57:34 2023 +0900

    ksmbd: validate command payload size

    commit 2b9b8f3b68edb3d67d79962f02e26dbb5ae3808d upstream.

    ->StructureSize2 indicates command payload size. ksmbd should validate
    this size with rfc1002 length before accessing it.
    This patch removes an unneeded check and adds the validation for this.

    [ 8.912583] BUG: KASAN: slab-out-of-bounds in ksmbd_smb2_check_message+0x12a/0xc50
    [ 8.913051] Read of size 2 at addr ffff88800ac7d92c by task kworker/0:0/7
    ...
    [ 8.914967] Call Trace:
    [ 8.915126]
    [ 8.915267] dump_stack_lvl+0x33/0x50
    [ 8.915506] print_report+0xcc/0x620
    [ 8.916558] kasan_report+0xae/0xe0
    [ 8.917080] kasan_check_range+0x35/0x1b0
    [ 8.917334] ksmbd_smb2_check_message+0x12a/0xc50
    [ 8.917935] ksmbd_verify_smb_message+0xae/0xd0
    [ 8.918223] handle_ksmbd_work+0x192/0x820
    [ 8.918478] process_one_work+0x419/0x760
    [ 8.918727] worker_thread+0x2a2/0x6f0
    [ 8.919222] kthread+0x187/0x1d0
    [ 8.919723] ret_from_fork+0x1f/0x30
    [ 8.919954]

    Cc: stable@vger.kernel.org
    Reported-by: Chih-Yen Chang
    Signed-off-by: Namjae Jeon
    Signed-off-by: Steve French
    Signed-off-by: Greg Kroah-Hartman

commit 7f3cc46040d8b6235618775bad229a3af9855c22
Author: Lino Sanfilippo
Date:   Thu Nov 24 14:55:35 2022 +0100

    tpm, tpm_tis: Claim locality in interrupt handler

    commit 0e069265bce5a40c4eee52e2364bbbd4dabee94a upstream.

    Writing the TPM_INT_STATUS register in the interrupt handler to clear
    the interrupts only has effect if a locality is held. Since this is not
    guaranteed at the time the interrupt is fired, claim the locality
    explicitly in the handler.

    Signed-off-by: Lino Sanfilippo
    Tested-by: Michael Niewöhner
    Tested-by: Jarkko Sakkinen
    Reviewed-by: Jarkko Sakkinen
    Signed-off-by: Jarkko Sakkinen
    Signed-off-by: Greg Kroah-Hartman

commit 3acb3dd3145b54933e88ae107e1288c1147d6d33
Author: Alexei Starovoitov
Date:   Mon Apr 10 19:43:44 2023 +0200

    mm: Fix copy_from_user_nofault().

    commit d319f344561de23e810515d109c7278919bff7b0 upstream.

    There are several issues with copy_from_user_nofault():

    - access_ok() is designed for user context only and for that reason
      it has WARN_ON_IN_IRQ() which triggers when bpf, kprobe, eprobe
      and perf on ppc are calling it from irq.

    - it's missing nmi_uaccess_okay() which is a nop on all architectures
      except x86 where it's required.
      The comment in arch/x86/mm/tlb.c explains the details why it's
      necessary. Calling copy_from_user_nofault() from bpf, [ke]probe
      without this check is not safe.

    - __copy_from_user_inatomic() under CONFIG_HARDENED_USERCOPY is calling
      check_object_size()->__check_object_size()->check_heap_object()->find_vmap_area()->spin_lock()
      which is not safe to do from bpf, [ke]probe and perf due to potential
      deadlock.

    Fix all three issues. At the end the copy_from_user_nofault() becomes
    equivalent to copy_from_user_nmi() from safety point of view with
    a difference in the return value.

    Reported-by: Hsin-Wei Hung
    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Florian Lehner
    Tested-by: Hsin-Wei Hung
    Tested-by: Florian Lehner
    Link: https://lore.kernel.org/r/20230410174345.4376-2-dev@der-flo.net
    Signed-off-by: Alexei Starovoitov
    Cc: Javier Honduvilla Coto
    Cc: Daniel Borkmann
    Signed-off-by: Greg Kroah-Hartman

commit acadffcc167df6c8c54b3cfc4e246e6248db6681
Author: Damien Le Moal
Date:   Thu Jun 15 17:18:53 2023 +0900

    ata: libata-scsi: Avoid deadlock on rescan after device resume

    [ Upstream commit 6aa0365a3c8512587fffd42fe438768709ddef8e ]

    When an ATA port is resumed from sleep, the port is reset and a power
    management request is issued to libata EH to reset the port and rescan
    the device(s) attached to the port. Device rescanning is done by
    scheduling an ata_scsi_dev_rescan() work, which will execute
    scsi_rescan_device().

    However, scsi_rescan_device() takes the generic device lock, which is
    also taken by dpm_resume() when the SCSI device is resumed as well. If a
    device rescan execution starts before the completion of the SCSI device
    resume, the rcu locking used to refresh the cached VPD pages of the
    device, combined with the generic device locking from
    scsi_rescan_device() and from dpm_resume() can cause a deadlock.

    Avoid this situation by changing struct ata_port scsi_rescan_task to be
    a delayed work instead of a simple work_struct. ata_scsi_dev_rescan() is
    modified to check if the SCSI device associated with the ATA device that
    must be rescanned is not suspended. If the SCSI device is still
    suspended, ata_scsi_dev_rescan() returns early and reschedules itself
    for execution after an arbitrary delay of 5ms.

    Reported-by: Kai-Heng Feng
    Reported-by: Joe Breuer
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217530
    Fixes: a19a93e4c6a9 ("scsi: core: pm: Rely on the device driver core for async power management")
    Signed-off-by: Damien Le Moal
    Reviewed-by: Hannes Reinecke
    Tested-by: Kai-Heng Feng
    Tested-by: Joe Breuer
    Signed-off-by: Sasha Levin

commit b0cb56fc6e3096c9da04c30d9b501da84dae2b4f
Author: Tom Chung
Date:   Mon May 29 18:00:09 2023 +0800

    drm/amd/display: fix the system hang while disable PSR

    [ Upstream commit ea2062dd1f0384ae1b136d333ee4ced15bedae38 ]

    [Why]
    When PSR is enabled, adjusting the timing parameters may cause a
    system hang because the timing no longer matches the DMCUB settings.

    [How]
    Disable the PSR before adjusting timing parameters.

    Cc: Mario Limonciello
    Cc: Alex Deucher
    Cc: stable@vger.kernel.org
    Acked-by: Stylon Wang
    Signed-off-by: Tom Chung
    Reviewed-by: Wayne Lin
    Tested-by: Daniel Wheeler
    Signed-off-by: Alex Deucher
    Signed-off-by: Sasha Levin

commit 1ca399f127e0a372537625b1d462ed586f5d9139
Author: Rodrigo Siqueira
Date:   Thu Feb 23 11:36:08 2023 -0700

    drm/amd/display: Add wrapper to call planes and stream update

    [ Upstream commit 81f743a08f3b214638aa389e252ae5e6c3592e7c ]

    [Why & How]
    This commit is part of a sequence of changes that replaces the commit
    sequence used in the DC with a new one. As a result of this transition,
    we moved some specific parts from the commit sequence and brought them
    to amdgpu_dm. This commit adds a wrapper inside DM that enables our
    drivers to do any necessary preparation or change before we offload the
    plane/stream update to DC.

    Reviewed-by: Harry Wentland
    Acked-by: Qingqing Zhuo
    Signed-off-by: Rodrigo Siqueira
    Tested-by: Daniel Wheeler
    Signed-off-by: Alex Deucher
    Stable-dep-of: ea2062dd1f03 ("drm/amd/display: fix the system hang while disable PSR")
    Signed-off-by: Sasha Levin

commit da2d907e051d591717d00e28e67ab341b961fd05
Author: Rodrigo Siqueira
Date:   Thu Oct 6 16:40:55 2022 -0400

    drm/amd/display: Use dc_update_planes_and_stream

    [ Upstream commit f7511289821ffccc07579406d6ab520aa11049f5 ]

    [Why & How]
    The old dc_commit_updates_for_stream lacks manipulation for many corner
    cases where the DC feature requires special attention; as a result, it
    starts to show its limitation (e.g., the SubVP feature is not supported
    by it, among other cases). To modernize and unify our internal API, this
    commit replaces the old dc_commit_updates_for_stream with
    dc_update_planes_and_stream, which has more features.

    Reviewed-by: Harry Wentland
    Acked-by: Qingqing Zhuo
    Signed-off-by: Rodrigo Siqueira
    Tested-by: Daniel Wheeler
    Signed-off-by: Alex Deucher
    Stable-dep-of: ea2062dd1f03 ("drm/amd/display: fix the system hang while disable PSR")
    Signed-off-by: Sasha Levin

commit afb49c5af04b32c6d598f676f875456936d0f225
Author: Shyam Prasad N
Date:   Fri Jun 9 17:46:54 2023 +0000

    cifs: fix status checks in cifs_tree_connect

    [ Upstream commit 91f4480c41f56f7c723323cf7f581f1d95d9ffbc ]

    The ordering of status checks at the beginning of
    cifs_tree_connect is wrong. As a result, a tcon
    which is good may stay marked as needing reconnect
    infinitely.

    Fixes: 2f0e4f034220 ("cifs: check only tcon status on tcon related functions")
    Cc: stable@vger.kernel.org # 6.3
    Signed-off-by: Shyam Prasad N
    Signed-off-by: Steve French
    Signed-off-by: Sasha Levin