commit 01e08e5d7656e660c8a4852191e1e133cbdb0a66 Author: Greg Kroah-Hartman Date: Wed Jan 31 16:21:21 2024 -0800 Linux 6.7.3 Link: https://lore.kernel.org/r/20240129170016.356158639@linuxfoundation.org Tested-by: Ronald Warsow Tested-by: SeongJae Park Tested-by: Salvatore Bonaccorso Tested-by: Shuah Khan Tested-by: Allen Pais Tested-by: Florian Fainelli Tested-by: Bagas Sanjaya Tested-by: Ron Economos Tested-by: Jon Hunter Tested-by: Linux Kernel Functional Testing Tested-by: Ricardo B. Marliere Tested-by: Justin M. Forbes Signed-off-by: Greg Kroah-Hartman commit 9547994f03fe96a96ab3985d959c4dbf1259be11 Author: Richard Palethorpe Date: Wed Jan 10 15:01:22 2024 +0200 x86/entry/ia32: Ensure s32 is sign extended to s64 commit 56062d60f117dccfb5281869e0ab61e090baf864 upstream. Presently ia32 registers stored in ptregs are unconditionally cast to unsigned int by the ia32 stub. They are then cast to long when passed to __se_sys*, but will not be sign extended. This takes the sign of the syscall argument into account in the ia32 stub. It still casts to unsigned int to avoid implementation specific behavior. However then casts to int or unsigned int as necessary. So that the following cast to long sign extends the value. This fixes the io_pgetevents02 LTP test when compiled with -m32. Presently the systemcall io_pgetevents_time64() unexpectedly accepts -1 for the maximum number of events. It doesn't appear other systemcalls with signed arguments are effected because they all have compat variants defined and wired up. Fixes: ebeb8c82ffaf ("syscalls/x86: Use 'struct pt_regs' based syscall calling for IA32_EMULATION and x32") Suggested-by: Arnd Bergmann Signed-off-by: Richard Palethorpe Signed-off-by: Nikolay Borisov Signed-off-by: Thomas Gleixner Reviewed-by: Arnd Bergmann Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240110130122.3836513-1-nik.borisov@suse.com Link: https://lore.kernel.org/ltp/20210921130127.24131-1-rpalethorpe@suse.com/ Signed-off-by: Greg Kroah-Hartman commit 155a09d38bfa39ca7e3eb7bf249b4a8d4ff8d443 Author: Tim Chen Date: Mon Jan 22 15:35:34 2024 -0800 tick/sched: Preserve number of idle sleeps across CPU hotplug events commit 9a574ea9069be30b835a3da772c039993c43369b upstream. Commit 71fee48f ("tick-sched: Fix idle and iowait sleeptime accounting vs CPU hotplug") preserved total idle sleep time and iowait sleeptime across CPU hotplug events. Similar reasoning applies to the number of idle calls and idle sleeps to get the proper average of sleep time per idle invocation. Preserve those fields too. Fixes: 71fee48f ("tick-sched: Fix idle and iowait sleeptime accounting vs CPU hotplug") Signed-off-by: Tim Chen Signed-off-by: Thomas Gleixner Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240122233534.3094238-1-tim.c.chen@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit 4e5bed870706d55260b17d915437c7922b731859 Author: Jiri Wiesner Date: Mon Jan 22 18:23:50 2024 +0100 clocksource: Skip watchdog check for large watchdog intervals commit 644649553508b9bacf0fc7a5bdc4f9e0165576a5 upstream. There have been reports of the watchdog marking clocksources unstable on machines with 8 NUMA nodes: clocksource: timekeeping watchdog on CPU373: Marking clocksource 'tsc' as unstable because the skew is too large: clocksource: 'hpet' wd_nsec: 14523447520 clocksource: 'tsc' cs_nsec: 14524115132 The measured clocksource skew - the absolute difference between cs_nsec and wd_nsec - was 668 microseconds: cs_nsec - wd_nsec = 14524115132 - 14523447520 = 667612 The kernel used 200 microseconds for the uncertainty_margin of both the clocksource and watchdog, resulting in a threshold of 400 microseconds (the md variable). Both the cs_nsec and the wd_nsec value indicate that the readout interval was circa 14.5 seconds. The observed behaviour is that watchdog checks failed for large readout intervals on 8 NUMA node machines. This indicates that the size of the skew was directly proportinal to the length of the readout interval on those machines. The measured clocksource skew, 668 microseconds, was evaluated against a threshold (the md variable) that is suited for readout intervals of roughly WATCHDOG_INTERVAL, i.e. HZ >> 1, which is 0.5 second. The intention of 2e27e793e280 ("clocksource: Reduce clocksource-skew threshold") was to tighten the threshold for evaluating skew and set the lower bound for the uncertainty_margin of clocksources to twice WATCHDOG_MAX_SKEW. Later in c37e85c135ce ("clocksource: Loosen clocksource watchdog constraints"), the WATCHDOG_MAX_SKEW constant was increased to 125 microseconds to fit the limit of NTP, which is able to use a clocksource that suffers from up to 500 microseconds of skew per second. Both the TSC and the HPET use default uncertainty_margin. When the readout interval gets stretched the default uncertainty_margin is no longer a suitable lower bound for evaluating skew - it imposes a limit that is far stricter than the skew with which NTP can deal. The root causes of the skew being directly proportinal to the length of the readout interval are: * the inaccuracy of the shift/mult pairs of clocksources and the watchdog * the conversion to nanoseconds is imprecise for large readout intervals Prevent this by skipping the current watchdog check if the readout interval exceeds 2 * WATCHDOG_INTERVAL. Considering the maximum readout interval of 2 * WATCHDOG_INTERVAL, the current default uncertainty margin (of the TSC and HPET) corresponds to a limit on clocksource skew of 250 ppm (microseconds of skew per second). To keep the limit imposed by NTP (500 microseconds of skew per second) for all possible readout intervals, the margins would have to be scaled so that the threshold value is proportional to the length of the actual readout interval. As for why the readout interval may get stretched: Since the watchdog is executed in softirq context the expiration of the watchdog timer can get severely delayed on account of a ksoftirqd thread not getting to run in a timely manner. Surely, a system with such belated softirq execution is not working well and the scheduling issue should be looked into but the clocksource watchdog should be able to deal with it accordingly. Fixes: 2e27e793e280 ("clocksource: Reduce clocksource-skew threshold") Suggested-by: Feng Tang Signed-off-by: Jiri Wiesner Signed-off-by: Thomas Gleixner Tested-by: Paul E. McKenney Reviewed-by: Feng Tang Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240122172350.GA740@incl Signed-off-by: Greg Kroah-Hartman commit 7e4a79e6d9b4123361364aea3a6fd9a0523453e8 Author: Dawei Li Date: Mon Jan 22 16:57:15 2024 +0800 genirq: Initialize resend_node hlist for all interrupt descriptors commit b184c8c2889ceef0a137c7d0567ef9fe3d92276e upstream. For a CONFIG_SPARSE_IRQ=n kernel, early_irq_init() is supposed to initialize all interrupt descriptors. It does except for irq_desc::resend_node, which ia only initialized for the first descriptor. Use the indexed decriptor and not the base pointer to address that. Fixes: bc06a9e08742 ("genirq: Use hlist for managing resend handlers") Signed-off-by: Dawei Li Signed-off-by: Thomas Gleixner Acked-by: Marc Zyngier Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240122085716.2999875-5-dawei.li@shingroup.cn Signed-off-by: Greg Kroah-Hartman commit f869d597f001569a7a21aa6ca3859edb1c8f5ce0 Author: Xi Ruoyao Date: Sat Jan 27 05:05:57 2024 +0800 mips: Call lose_fpu(0) before initializing fcr31 in mips_set_personality_nan commit 59be5c35850171e307ca5d3d703ee9ff4096b948 upstream. If we still own the FPU after initializing fcr31, when we are preempted the dirty value in the FPU will be read out and stored into fcr31, clobbering our setting. This can cause an improper floating-point environment after execve(). For example: zsh% cat measure.c #include int main() { return fetestexcept(FE_INEXACT); } zsh% cc measure.c -o measure -lm zsh% echo $((1.0/3)) # raising FE_INEXACT 0.33333333333333331 zsh% while ./measure; do ; done (stopped in seconds) Call lose_fpu(0) before setting fcr31 to prevent this. Closes: https://lore.kernel.org/linux-mips/7a6aa1bbdbbe2e63ae96ff163fab0349f58f1b9e.camel@xry111.site/ Fixes: 9b26616c8d9d ("MIPS: Respect the ISA level in FCSR handling") Cc: stable@vger.kernel.org Signed-off-by: Xi Ruoyao Signed-off-by: Thomas Bogendoerfer Signed-off-by: Greg Kroah-Hartman commit 026de45cd745028ca0b06c904ffbb99d0c78521a Author: Quanquan Cao Date: Wed Jan 24 17:15:26 2024 +0800 cxl/region:Fix overflow issue in alloc_hpa() commit d76779dd3681c01a4c6c3cae4d0627c9083e0ee6 upstream. Creating a region with 16 memory devices caused a problem. The div_u64_rem function, used for dividing an unsigned 64-bit number by a 32-bit one, faced an issue when SZ_256M * p->interleave_ways. The result surpassed the maximum limit of the 32-bit divisor (4G), leading to an overflow and a remainder of 0. note: At this point, p->interleave_ways is 16, meaning 16 * 256M = 4G To fix this issue, I replaced the div_u64_rem function with div64_u64_rem and adjusted the type of the remainder. Signed-off-by: Quanquan Cao Reviewed-by: Dave Jiang Fixes: 23a22cd1c98b ("cxl/region: Allocate HPA capacity to regions") Cc: Signed-off-by: Dan Williams Signed-off-by: Greg Kroah-Hartman commit e8931ebb0143dc4a3ad302be6f963ae375548bc0 Author: Jithu Joseph Date: Thu Jan 25 00:22:50 2024 -0800 platform/x86/intel/ifs: Call release_firmware() when handling errors. [ Upstream commit 8c898ec07a2fc1d4694e81097a48e94a3816308d ] Missing release_firmware() due to error handling blocked any future image loading. Fix the return code and release_fiwmare() to release the bad image. Fixes: 25a76dbb36dd ("platform/x86/intel/ifs: Validate image size") Reported-by: Pengfei Xu Signed-off-by: Jithu Joseph Signed-off-by: Ashok Raj Tested-by: Pengfei Xu Reviewed-by: Tony Luck Reviewed-by: Ilpo Järvinen Link: https://lore.kernel.org/r/20240125082254.424859-2-ashok.raj@intel.com Signed-off-by: Hans de Goede Signed-off-by: Sasha Levin commit 54ed6fed7cd7ee139dfd40d6818e40b1033e0643 Author: Michael Walle Date: Mon Nov 13 17:43:44 2023 +0100 drm: bridge: samsung-dsim: Don't use FORCE_STOP_STATE [ Upstream commit ff3d5d04db07e5374758baa7e877fde8d683ebab ] The FORCE_STOP_STATE bit is unsuitable to force the DSI link into LP-11 mode. It seems the bridge internally queues DSI packets and when the FORCE_STOP_STATE bit is cleared, they are sent in close succession without any useful timing (this also means that the DSI lanes won't go into LP-11 mode). The length of this gibberish varies between 1ms and 5ms. This sometimes breaks an attached bridge (TI SN65DSI84 in this case). In our case, the bridge will fail in about 1 per 500 reboots. The FORCE_STOP_STATE handling was introduced to have the DSI lanes in LP-11 state during the .pre_enable phase. But as it turns out, none of this is needed at all. Between samsung_dsim_init() and samsung_dsim_set_display_enable() the lanes are already in LP-11 mode. The code as it was before commit 20c827683de0 ("drm: bridge: samsung-dsim: Fix init during host transfer") and 0c14d3130654 ("drm: bridge: samsung-dsim: Fix i.MX8M enable flow to meet spec") was correct in this regard. This patch basically reverts both commits. It was tested on an i.MX8M SoC with an SN65DSI84 bridge. The signals were probed and the DSI packets were decoded during initialization and link start-up. After this patch the first DSI packet on the link is a VSYNC packet and the timing is correct. Command mode between .pre_enable and .enable was also briefly tested by a quick hack. There was no DSI link partner which would have responded, but it was made sure the DSI packet was send on the link. As a side note, the command mode seems to just work in HS mode. I couldn't find that the bridge will handle commands in LP mode. Fixes: 20c827683de0 ("drm: bridge: samsung-dsim: Fix init during host transfer") Fixes: 0c14d3130654 ("drm: bridge: samsung-dsim: Fix i.MX8M enable flow to meet spec") Signed-off-by: Michael Walle Signed-off-by: Inki Dae Link: https://patchwork.freedesktop.org/patch/msgid/20231113164344.1612602-1-mwalle@kernel.org Signed-off-by: Sasha Levin commit 324b61f69f6c658c3f29017ac19113f3fcfd82dd Author: Inochi Amaoto Date: Fri Jan 26 17:20:00 2024 +0800 riscv: dts: sophgo: separate sg2042 mtime and mtimecmp to fit aclint format [ Upstream commit 1f4a994be2c3d13852fd5c1054f292bd303352cc ] Change the timer layout in the dtb to fit the format that needed by the SBI. Fixes: 967a94a92aaa ("riscv: dts: add initial Sophgo SG2042 SoC device tree") Reviewed-by: Chen Wang Reviewed-by: Guo Ren Signed-off-by: Inochi Amaoto Signed-off-by: Chen Wang Signed-off-by: Arnd Bergmann Signed-off-by: Sasha Levin commit ebdb4032174b3ab8beef0ed7847e8bf1e2f70677 Author: Aleksander Jan Bajkowski Date: Mon Jan 22 19:47:09 2024 +0100 MIPS: lantiq: register smp_ops on non-smp platforms [ Upstream commit 4bf2a626dc4bb46f0754d8ac02ec8584ff114ad5 ] Lantiq uses a common kernel config for devices with 24Kc and 34Kc cores. The changes made previously to add support for interrupts on all cores work on 24Kc platforms with SMP disabled and 34Kc platforms with SMP enabled. This patch fixes boot issues on Danube (single core 24Kc) with SMP enabled. Fixes: 730320fd770d ("MIPS: lantiq: enable all hardware interrupts on second VPE") Signed-off-by: Aleksander Jan Bajkowski Signed-off-by: Thomas Bogendoerfer Signed-off-by: Sasha Levin commit 3cb11d189673ae012ecad07bbbcd70bb08ca884b Author: Huacai Chen Date: Fri Jan 26 16:22:07 2024 +0800 LoongArch/smp: Call rcutree_report_cpu_starting() at tlb_init() [ Upstream commit 5056c596c3d1848021a4eaa76ee42f4c05c50346 ] Machines which have more than 8 nodes fail to boot SMP after commit a2ccf46333d7b2cf96 ("LoongArch/smp: Call rcutree_report_cpu_starting() earlier"). Because such machines use tlb-based per-cpu base address rather than dmw-based per-cpu base address, resulting per-cpu variables can only be accessed after tlb_init(). But rcutree_report_cpu_starting() is now called before tlb_init() and accesses per-cpu variables indeed. Since the original patch want to avoid the lockdep warning caused by page allocation in tlb_init(), we can move rcutree_report_cpu_starting() to tlb_init() where after tlb exception configuration but before page allocation. Fixes: a2ccf46333d7b2cf96 ("LoongArch/smp: Call rcutree_report_cpu_starting() earlier") Signed-off-by: Huacai Chen Signed-off-by: Sasha Levin commit 19d73c253a444d2005a914d840822959eb5db8de Author: David Lechner Date: Thu Jan 25 14:53:09 2024 -0600 spi: fix finalize message on error return [ Upstream commit 8c2ae772fe08e33f3d7a83849e85539320701abd ] In __spi_pump_transfer_message(), the message was not finalized in the first error return as it is in the other error return paths. Not finalizing the message could cause anything waiting on the message to complete to hang forever. This adds the missing call to spi_finalize_current_message(). Fixes: ae7d2346dc89 ("spi: Don't use the message queue if possible in spi_sync") Signed-off-by: David Lechner Link: https://msgid.link/r/20240125205312.3458541-2-dlechner@baylibre.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 0dea68be2a345a8b4b5c911d0de92cad831fbdb2 Author: Shyam Prasad N Date: Tue Jan 23 05:07:57 2024 +0000 cifs: fix stray unlock in cifs_chan_skip_or_disable [ Upstream commit 993d1c346b1a51ac41b2193609a0d4e51e9748f4 ] A recent change moved the code that decides to skip a channel or disable multichannel entirely, into a helper function. During this, a mutex_unlock of the session_mutex should have been removed. Doing that here. Fixes: f591062bdbf4 ("cifs: handle servers that still advertise multichannel after disabling") Signed-off-by: Shyam Prasad N Signed-off-by: Steve French Signed-off-by: Sasha Levin commit 1bda88547390761e12a5609742819cca8dbd2e77 Author: Amit Kumar Mahapatra Date: Mon Dec 18 14:36:52 2023 +0530 spi: spi-cadence: Reverse the order of interleaved write and read operations [ Upstream commit 633cd6fe6e1993ba80e0954c2db127a0b1a3e66f ] In the existing implementation, when executing interleaved write and read operations in the ISR for a transfer length greater than the FIFO size, the TXFIFO write precedes the RXFIFO read. Consequently, the initially received data in the RXFIFO is pushed out and lost, leading to a failure in data integrity. To address this issue, reverse the order of interleaved operations and conduct the RXFIFO read followed by the TXFIFO write. Fixes: 6afe2ae8dc48 ("spi: spi-cadence: Interleave write of TX and read of RX FIFO") Signed-off-by: Amit Kumar Mahapatra Link: https://msgid.link/r/20231218090652.18403-1-amit.kumar-mahapatra@amd.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit aa1408093e378d8d951550d8514b0b23a21d702a Author: Kamal Dasu Date: Tue Jan 9 16:00:32 2024 -0500 spi: bcm-qspi: fix SFDP BFPT read by usig mspi read [ Upstream commit 574bf7bbe83794a902679846770f75a9b7f28176 ] SFDP read shall use the mspi reads when using the bcm_qspi_exec_mem_op() call. This fixes SFDP parameter page read failures seen with parts that now use SFDP protocol to read the basic flash parameter table. Fixes: 5f195ee7d830 ("spi: bcm-qspi: Implement the spi_mem interface") Signed-off-by: Kamal Dasu Tested-by: Florian Fainelli Reviewed-by: Florian Fainelli Link: https://msgid.link/r/20240109210033.43249-1-kamal.dasu@broadcom.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit c794cc65bb3eef3142bbecd48f1fd43f1fa8d478 Author: Mario Limonciello Date: Fri Jan 19 05:33:19 2024 -0600 cpufreq/amd-pstate: Fix setting scaling max/min freq values [ Upstream commit 22fb4f041999f5f16ecbda15a2859b4ef4cbf47e ] Scaling min/max freq values were being cached and lagging a setting each time. Fix the ordering of the clamp call to ensure they work. Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217931 Fixes: febab20caeba ("cpufreq/amd-pstate: Fix scaling_min_freq and scaling_max_freq update") Signed-off-by: Mario Limonciello Reviewed-by: Wyes Karny Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin commit c44ec974b4d15fcf426770f1f3a91a3f09a41594 Author: Hsin-Yi Wang Date: Wed Jan 17 17:58:14 2024 -0800 drm/bridge: anx7625: Ensure bridge is suspended in disable() [ Upstream commit 4d5b7daa3c610af3f322ad1e91fc0c752ff32f0e ] Similar to commit 26db46bc9c67 ("drm/bridge: parade-ps8640: Ensure bridge is suspended in .post_disable()"). Add a mutex to ensure that aux transfer won't race with atomic_disable by holding the PM reference and prevent the bridge from suspend. Also we need to use pm_runtime_put_sync_suspend() to suspend the bridge instead of idle with pm_runtime_put_sync(). Fixes: 3203e497eb76 ("drm/bridge: anx7625: Synchronously run runtime suspend.") Fixes: adca62ec370c ("drm/bridge: anx7625: Support reading edid through aux channel") Signed-off-by: Hsin-Yi Wang Tested-by: Xuxin Xiong Reviewed-by: Pin-yen Lin Reviewed-by: Douglas Anderson Signed-off-by: Douglas Anderson Link: https://patchwork.freedesktop.org/patch/msgid/20240118015916.2296741-1-hsinyi@chromium.org Signed-off-by: Sasha Levin commit 0e29e13c4fb3c5d0b201f29a696ba71ce9644b5c Author: Li Lingfeng Date: Thu Jan 18 21:04:01 2024 +0800 block: Move checking GENHD_FL_NO_PART to bdev_add_partition() [ Upstream commit 7777f47f2ea64efd1016262e7b59fab34adfb869 ] Commit 1a721de8489f ("block: don't add or resize partition on the disk with GENHD_FL_NO_PART") prevented all operations about partitions on disks with GENHD_FL_NO_PART in blkpg_do_ioctl() since they are meaningless. However, it changed error code in some scenarios. So move checking GENHD_FL_NO_PART to bdev_add_partition() to eliminate impact. Fixes: 1a721de8489f ("block: don't add or resize partition on the disk with GENHD_FL_NO_PART") Reported-by: Allison Karlitskaya Closes: https://lore.kernel.org/all/CAOYeF9VsmqKMcQjo1k6YkGNujwN-nzfxY17N3F-CMikE1tYp+w@mail.gmail.com/ Signed-off-by: Li Lingfeng Reviewed-by: Yu Kuai Link: https://lore.kernel.org/r/20240118130401.792757-1-lilingfeng@huaweicloud.com Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit 821acbb1dffbf830685e3333461b57cb4d849c7b Author: Mika Westerberg Date: Mon Jan 22 14:00:33 2024 +0200 spi: intel-pci: Remove Meteor Lake-S SoC PCI ID from the list [ Upstream commit 6c314425b9ef6b247cefd0903e287eb072580c3b ] Turns out this "SoC" side controller does not support certain commands, such as reading chip JEDEC ID, so the controller is pretty much unusable in Linux. We should be using the "PCH" side controller instead. For this reason remove this PCI ID from the list. Fixes: c2912d42e86e ("spi: intel-pci: Add support for Meteor Lake-S SPI serial flash") Signed-off-by: Mika Westerberg Link: https://msgid.link/r/20240122120034.2664812-2-mika.westerberg@linux.intel.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit c34b46598069fa3439c253fdae756d6062c0f314 Author: Shravan Kumar Ramani Date: Wed Jan 17 05:01:34 2024 -0500 platform/mellanox: mlxbf-pmc: Fix offset calculation for crspace events [ Upstream commit 732c35ce6d4892f7b07cc9aca61a6ad0fd400a26 ] The event selector fields for 2 counters are contained in one 32-bit register and the current logic does not account for this. Fixes: 423c3361855c ("platform/mellanox: mlxbf-pmc: Add support for BlueField-3") Signed-off-by: Shravan Kumar Ramani Reviewed-by: David Thompson Reviewed-by: Vadim Pasternak Link: https://lore.kernel.org/r/8834cfa496c97c7c2fcebcfca5a2aa007e20ae96.1705485095.git.shravankr@nvidia.com Signed-off-by: Hans de Goede Signed-off-by: Sasha Levin commit eddce6fcf35029edcc876302cdaa5cab25efcd6a Author: Artur Weber Date: Fri Jan 5 07:53:01 2024 +0100 ARM: dts: exynos4212-tab3: add samsung,invert-vclk flag to fimd [ Upstream commit eab4f56d3e75dad697acf8dc2c8be3c341d6c63e ] After more investigation, I've found that it's not the panel driver config that needs to be modified to invert the data polarity, but the FIMD config. Add the missing invert-vclk option that is required to get the display to work correctly. Fixes: ee37a457af1d ("ARM: dts: exynos: Add Samsung Galaxy Tab 3 8.0 boards") Signed-off-by: Artur Weber Link: https://lore.kernel.org/r/20240105-tab3-display-fixes-v2-1-904d1207bf6f@gmail.com Signed-off-by: Krzysztof Kozlowski Signed-off-by: Sasha Levin commit 32515e64a525feedd734a6e2eb7255d0a199932d Author: Wenhua Lin Date: Tue Jan 9 15:38:48 2024 +0800 gpio: eic-sprd: Clear interrupt after set the interrupt type [ Upstream commit 84aef4ed59705585d629e81d633a83b7d416f5fb ] The raw interrupt status of eic maybe set before the interrupt is enabled, since the eic interrupt has a latch function, which would trigger the interrupt event once enabled it from user side. To solve this problem, interrupts generated before setting the interrupt trigger type are ignored. Fixes: 25518e024e3a ("gpio: Add Spreadtrum EIC driver support") Acked-by: Chunyan Zhang Signed-off-by: Wenhua Lin Signed-off-by: Bartosz Golaszewski Signed-off-by: Sasha Levin commit 549b755efa423ca73ba21cbb8eff89a06c30250a Author: Armin Wolf Date: Wed Jan 3 20:27:04 2024 +0100 platform/x86: wmi: Fix error handling in legacy WMI notify handler functions [ Upstream commit 6ba7843b59b77360812617d071313c7f35f3757a ] When wmi_install_notify_handler()/wmi_remove_notify_handler() are unable to enable/disable the WMI device, they unconditionally return an error to the caller. When registering legacy WMI notify handlers, this means that the callback remains registered despite wmi_install_notify_handler() having returned an error. When removing legacy WMI notify handlers, this means that the callback is removed despite wmi_remove_notify_handler() having returned an error. Fix this by only warning when the WMI device could not be enabled. This behaviour matches the bus-based WMI interface. Tested on a Dell Inspiron 3505 and a Acer Aspire E1-731. Fixes: 58f6425eb92f ("WMI: Cater for multiple events with same GUID") Signed-off-by: Armin Wolf Reviewed-by: Ilpo Järvinen Reviewed-by: Hans de Goede Link: https://lore.kernel.org/r/20240103192707.115512-2-W_Armin@gmx.de Signed-off-by: Hans de Goede Signed-off-by: Sasha Levin commit 26ef46b731576387fa56e9aafb6ccb75cf1d6cfd Author: Cristian Marussi Date: Mon Jan 8 12:34:13 2024 +0000 firmware: arm_ffa: Check xa_load() return value [ Upstream commit c00d9738fd5fce15dc5494d05b7599dce23e8146 ] Add a check to verify the result of xa_load() during the partition lookups done while registering/unregistering the scheduler receiver interrupt callbacks and while executing the main scheduler receiver interrupt callback handler. Fixes: 0184450b8b1e ("firmware: arm_ffa: Add schedule receiver callback mechanism") Signed-off-by: Cristian Marussi Link: https://lore.kernel.org/r/20240108-ffa_fixes_6-8-v1-3-75bf7035bc50@arm.com Signed-off-by: Sudeep Holla Signed-off-by: Sasha Levin commit 39cd62bf8c11a85919e1e3ca3d4c4e2adc4ba7d1 Author: Cristian Marussi Date: Mon Jan 8 12:34:12 2024 +0000 firmware: arm_ffa: Add missing rwlock_init() for the driver partition [ Upstream commit 5ff30ade16cd9efc2466d3ea22bbaf370772941a ] Add the missing rwlock initialization for the FF-A partition associated the driver in ffa_setup_partitions(). It will the primary scheduler partition in the host or the VM partition in the virtualised environment. IOW, it corresponds to the partition with VM ID == drv_info->vm_id. Fixes: 1b6bf41b7a65 ("firmware: arm_ffa: Add notification handling mechanism") Signed-off-by: Cristian Marussi Link: https://lore.kernel.org/r/20240108-ffa_fixes_6-8-v1-2-75bf7035bc50@arm.com Signed-off-by: Sudeep Holla Signed-off-by: Sasha Levin commit 895d16d6eb66d2f022455700ba8bb8bbb046d02b Author: Cristian Marussi Date: Mon Jan 8 12:34:11 2024 +0000 firmware: arm_ffa: Add missing rwlock_init() in ffa_setup_partitions() [ Upstream commit 59b2e242b13192e50bf47df3780bf8a7e2260e98 ] Add the missing rwlock initialization for the individual FF-A partition information in ffa_setup_partitions(). Fixes: 0184450b8b1e ("firmware: arm_ffa: Add schedule receiver callback mechanism") Signed-off-by: Cristian Marussi Link: https://lore.kernel.org/r/20240108-ffa_fixes_6-8-v1-1-75bf7035bc50@arm.com Signed-off-by: Sudeep Holla Signed-off-by: Sasha Levin commit 1e651b30824789076c3288438c5290b67a3b44d6 Author: Cristian Marussi Date: Tue Jan 9 15:01:06 2024 +0000 firmware: arm_scmi: Fix the clock protocol version for v3.2 [ Upstream commit 27600c96e2ffa6c1b2cb378ddc75c6620c628d04 ] The clock protocol version as per the SCMI v3.2 specification is 0x30000. Enable the v3.0 clock protocol features only when clock protocol version equals 0x30000. The previous beta version of the spec had this value set to 0x20001 and th same value trickled down from the initial development. The version update were missed in the driver. Fixes: e49e314a2cf7 ("firmware: arm_scmi: Add clock v3.2 CONFIG_SET support") Signed-off-by: Cristian Marussi Link: https://lore.kernel.org/r/20240109150106.2066739-1-cristian.marussi@arm.com Signed-off-by: Sudeep Holla Signed-off-by: Sasha Levin commit 8d35f33c199c3ec7bd1b1e6ece3d99cbf4e63427 Author: Cristian Marussi Date: Mon Jan 8 18:50:50 2024 +0000 firmware: arm_scmi: Use xa_insert() when saving raw queues [ Upstream commit b5dc0ffd36560dbadaed9a3d9fd7838055d62d74 ] Use xa_insert() when saving per-channel raw queues to better check for duplicates. Fixes: 7860701d1e6e ("firmware: arm_scmi: Add per-channel raw injection support") Signed-off-by: Cristian Marussi Link: https://lore.kernel.org/r/20240108185050.1628687-2-cristian.marussi@arm.com Signed-off-by: Sudeep Holla Signed-off-by: Sasha Levin commit e69b3a04b5b538f6974e1aec633b3fcb80737d0b Author: Cristian Marussi Date: Mon Jan 8 18:50:49 2024 +0000 firmware: arm_scmi: Use xa_insert() to store opps [ Upstream commit e8ef4bbe39b9576a73f104f6af743fb9c7b624ba ] When storing opps by level or index use xa_insert() instead of xa_store() and add error-checking to spot bad duplicates indexes possibly wrongly provided by the platform firmware. Fixes: 31c7c1397a33 ("firmware: arm_scmi: Add v3.2 perf level indexing mode support") Signed-off-by: Cristian Marussi Link: https://lore.kernel.org/r/20240108185050.1628687-1-cristian.marussi@arm.com Signed-off-by: Sudeep Holla Signed-off-by: Sasha Levin commit 81499dc7be07888d976c38a184a85437772829d2 Author: Fedor Pchelkin Date: Wed Dec 20 12:53:15 2023 +0300 drm/exynos: gsc: minor fix for loop iteration in gsc_runtime_resume [ Upstream commit 4050957c7c2c14aa795dbf423b4180d5ac04e113 ] Do not forget to call clk_disable_unprepare() on the first element of ctx->clocks array. Found by Linux Verification Center (linuxtesting.org). Fixes: 8b7d3ec83aba ("drm/exynos: gsc: Convert driver to IPP v2 core API") Signed-off-by: Fedor Pchelkin Reviewed-by: Marek Szyprowski Signed-off-by: Inki Dae Signed-off-by: Sasha Levin commit dd25db729ae8431c04ad5bc14f1b3186aa168981 Author: Arnd Bergmann Date: Thu Dec 14 13:32:15 2023 +0100 drm/exynos: fix accidental on-stack copy of exynos_drm_plane [ Upstream commit 960b537e91725bcb17dd1b19e48950e62d134078 ] gcc rightfully complains about excessive stack usage in the fimd_win_set_pixfmt() function: drivers/gpu/drm/exynos/exynos_drm_fimd.c: In function 'fimd_win_set_pixfmt': drivers/gpu/drm/exynos/exynos_drm_fimd.c:750:1: error: the frame size of 1032 bytes is larger than 1024 byte drivers/gpu/drm/exynos/exynos5433_drm_decon.c: In function 'decon_win_set_pixfmt': drivers/gpu/drm/exynos/exynos5433_drm_decon.c:381:1: error: the frame size of 1032 bytes is larger than 1024 bytes There is really no reason to copy the large exynos_drm_plane structure to the stack before using one of its members, so just use a pointer instead. Fixes: 6f8ee5c21722 ("drm/exynos: fimd: Make plane alpha configurable") Signed-off-by: Arnd Bergmann Reviewed-by: Marek Szyprowski Signed-off-by: Inki Dae Signed-off-by: Sasha Levin commit 0745fef8a97d6a569c5c16353be9922d073c3bb0 Author: Sebastian Andrzej Siewior Date: Thu Jan 18 12:54:51 2024 +0100 futex: Prevent the reuse of stale pi_state [ Upstream commit e626cb02ee8399fd42c415e542d031d185783903 ] Jiri Slaby reported a futex state inconsistency resulting in -EINVAL during a lock operation for a PI futex. It requires that the a lock process is interrupted by a timeout or signal: T1 Owns the futex in user space. T2 Tries to acquire the futex in kernel (futex_lock_pi()). Allocates a pi_state and attaches itself to it. T2 Times out and removes its rt_waiter from the rt_mutex. Drops the rtmutex lock and tries to acquire the hash bucket lock to remove the futex_q. The lock is contended and T2 schedules out. T1 Unlocks the futex (futex_unlock_pi()). Finds a futex_q but no rt_waiter. Unlocks the futex (do_uncontended) and makes it available to user space. T3 Acquires the futex in user space. T4 Tries to acquire the futex in kernel (futex_lock_pi()). Finds the existing futex_q of T2 and tries to attach itself to the existing pi_state. This (attach_to_pi_state()) fails with -EINVAL because uval contains the TID of T3 but pi_state points to T1. It's incorrect to unlock the futex and make it available for user space to acquire as long as there is still an existing state attached to it in the kernel. T1 cannot hand over the futex to T2 because T2 already gave up and started to clean up and is blocked on the hash bucket lock, so T2's futex_q with the pi_state pointing to T1 is still queued. T2 observes the futex_q, but ignores it as there is no waiter on the corresponding rt_mutex and takes the uncontended path which allows the subsequent caller of futex_lock_pi() (T4) to observe that stale state. To prevent this the unlock path must dequeue all futex_q entries which point to the same pi_state when there is no waiter on the rt mutex. This requires obviously to make the dequeue conditional in the locking path to prevent a double dequeue. With that it's guaranteed that user space cannot observe an uncontended futex which has kernel state attached. Fixes: fbeb558b0dd0d ("futex/pi: Fix recursive rt_mutex waiter state") Reported-by: Jiri Slaby Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner Tested-by: Jiri Slaby Link: https://lore.kernel.org/r/20240118115451.0TkD_ZhB@linutronix.de Closes: https://lore.kernel.org/all/4611bcf2-44d0-4c34-9b84-17406f881003@kernel.org Signed-off-by: Sasha Levin commit 4101ba70a03a85700193ca9f93a793b7f444e03d Author: Yajun Deng Date: Thu Jan 18 14:18:53 2024 +0800 memblock: fix crash when reserved memory is not added to memory [ Upstream commit 6a9531c3a88096a26cf3ac582f7ec44f94a7dcb2 ] After commit 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()") nid of a reserved region is used by init_reserved_page() (with CONFIG_DEFERRED_STRUCT_PAGE_INIT=y) to access node strucure. In many cases the nid of the reserved memory is not set and this causes a crash. When the nid of a reserved region is not set, fall back to early_pfn_to_nid(), so that nid of the first_online_node will be passed to init_reserved_page(). Fixes: 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()") Signed-off-by: Yajun Deng Link: https://lore.kernel.org/r/20240118061853.2652295-1-yajun.deng@linux.dev [rppt: massaged the commit message] Signed-off-by: Mike Rapoport (IBM) Signed-off-by: Sasha Levin commit 161f3651798bca0597f971766a631b5626f4238c Author: Douglas Anderson Date: Wed Jan 17 10:35:03 2024 -0800 drm/bridge: parade-ps8640: Make sure we drop the AUX mutex in the error case [ Upstream commit a20f1b02bafcbf5a32d96a1d4185d6981cf7d016 ] After commit 26db46bc9c67 ("drm/bridge: parade-ps8640: Ensure bridge is suspended in .post_disable()"), if we hit the error case in ps8640_aux_transfer() then we return without dropping the mutex. Fix this oversight. Fixes: 26db46bc9c67 ("drm/bridge: parade-ps8640: Ensure bridge is suspended in .post_disable()") Reviewed-by: Hsin-Yi Wang Signed-off-by: Douglas Anderson Link: https://patchwork.freedesktop.org/patch/msgid/20240117103502.1.Ib726a0184913925efc7e99c4d4fc801982e1bc24@changeid Signed-off-by: Sasha Levin commit 5a3a33a18221e14b6368203970b07943e6ef6a1b Author: Pin-yen Lin Date: Tue Jan 9 20:04:57 2024 +0800 drm/bridge: parade-ps8640: Ensure bridge is suspended in .post_disable() [ Upstream commit 26db46bc9c675e43230cc6accd110110a7654299 ] The ps8640 bridge seems to expect everything to be power cycled at the disable process, but sometimes ps8640_aux_transfer() holds the runtime PM reference and prevents the bridge from suspend. Prevent that by introducing a mutex lock between ps8640_aux_transfer() and .post_disable() to make sure the bridge is really powered off. Fixes: 826cff3f7ebb ("drm/bridge: parade-ps8640: Enable runtime power management") Signed-off-by: Pin-yen Lin Reviewed-by: Douglas Anderson Signed-off-by: Douglas Anderson Link: https://patchwork.freedesktop.org/patch/msgid/20240109120528.1292601-1-treapking@chromium.org Signed-off-by: Sasha Levin commit 218c1f41c8c0fbbb16752dc6abefdc2033240efd Author: Tomi Valkeinen Date: Wed Jan 3 15:31:08 2024 +0200 drm/bridge: sii902x: Fix audio codec unregistration [ Upstream commit 3fc6c76a8d208d3955c9e64b382d0ff370bc61fc ] The driver never unregisters the audio codec platform device, which can lead to a crash on module reloading, nor does it handle the return value from sii902x_audio_codec_init(). Signed-off-by: Tomi Valkeinen Fixes: ff5781634c41 ("drm/bridge: sii902x: Implement HDMI audio support") Cc: Jyri Sarha Acked-by: Linus Walleij Link: https://lore.kernel.org/r/20240103-si902x-fixes-v1-2-b9fd3e448411@ideasonboard.com Signed-off-by: Neil Armstrong Link: https://patchwork.freedesktop.org/patch/msgid/20240103-si902x-fixes-v1-2-b9fd3e448411@ideasonboard.com Signed-off-by: Sasha Levin commit 2a4c6af7934a7b4c304542c38fee35e09cc1770c Author: Tomi Valkeinen Date: Wed Jan 3 15:31:07 2024 +0200 drm/bridge: sii902x: Fix probing race issue [ Upstream commit 08ac6f132dd77e40f786d8af51140c96c6d739c9 ] A null pointer dereference crash has been observed rarely on TI platforms using sii9022 bridge: [ 53.271356] sii902x_get_edid+0x34/0x70 [sii902x] [ 53.276066] sii902x_bridge_get_edid+0x14/0x20 [sii902x] [ 53.281381] drm_bridge_get_edid+0x20/0x34 [drm] [ 53.286305] drm_bridge_connector_get_modes+0x8c/0xcc [drm_kms_helper] [ 53.292955] drm_helper_probe_single_connector_modes+0x190/0x538 [drm_kms_helper] [ 53.300510] drm_client_modeset_probe+0x1f0/0xbd4 [drm] [ 53.305958] __drm_fb_helper_initial_config_and_unlock+0x50/0x510 [drm_kms_helper] [ 53.313611] drm_fb_helper_initial_config+0x48/0x58 [drm_kms_helper] [ 53.320039] drm_fbdev_dma_client_hotplug+0x84/0xd4 [drm_dma_helper] [ 53.326401] drm_client_register+0x5c/0xa0 [drm] [ 53.331216] drm_fbdev_dma_setup+0xc8/0x13c [drm_dma_helper] [ 53.336881] tidss_probe+0x128/0x264 [tidss] [ 53.341174] platform_probe+0x68/0xc4 [ 53.344841] really_probe+0x188/0x3c4 [ 53.348501] __driver_probe_device+0x7c/0x16c [ 53.352854] driver_probe_device+0x3c/0x10c [ 53.357033] __device_attach_driver+0xbc/0x158 [ 53.361472] bus_for_each_drv+0x88/0xe8 [ 53.365303] __device_attach+0xa0/0x1b4 [ 53.369135] device_initial_probe+0x14/0x20 [ 53.373314] bus_probe_device+0xb0/0xb4 [ 53.377145] deferred_probe_work_func+0xcc/0x124 [ 53.381757] process_one_work+0x1f0/0x518 [ 53.385770] worker_thread+0x1e8/0x3dc [ 53.389519] kthread+0x11c/0x120 [ 53.392750] ret_from_fork+0x10/0x20 The issue here is as follows: - tidss probes, but is deferred as sii902x is still missing. - sii902x starts probing and enters sii902x_init(). - sii902x calls drm_bridge_add(). Now the sii902x bridge is ready from DRM's perspective. - sii902x calls sii902x_audio_codec_init() and platform_device_register_data() - The registration of the audio platform device causes probing of the deferred devices. - tidss probes, which eventually causes sii902x_bridge_get_edid() to be called. - sii902x_bridge_get_edid() tries to use the i2c to read the edid. However, the sii902x driver has not set up the i2c part yet, leading to the crash. Fix this by moving the drm_bridge_add() to the end of the sii902x_init(), which is also at the very end of sii902x_probe(). Signed-off-by: Tomi Valkeinen Fixes: 21d808405fe4 ("drm/bridge/sii902x: Fix EDID readback") Acked-by: Linus Walleij Link: https://lore.kernel.org/r/20240103-si902x-fixes-v1-1-b9fd3e448411@ideasonboard.com Signed-off-by: Neil Armstrong Link: https://patchwork.freedesktop.org/patch/msgid/20240103-si902x-fixes-v1-1-b9fd3e448411@ideasonboard.com Signed-off-by: Sasha Levin commit f98cf99578c25e001dc979e88ec3c5ea77ad835b Author: Arnd Bergmann Date: Mon Oct 23 13:55:58 2023 +0200 drm/panel/raydium-rm692e5: select CONFIG_DRM_DISPLAY_DP_HELPER [ Upstream commit 589830b13ac21bddf99b9bc5a4ec17813d0869ef ] As with several other panel drivers, this fails to link without the DP helper library: ld: drivers/gpu/drm/panel/panel-raydium-rm692e5.o: in function `rm692e5_prepare': panel-raydium-rm692e5.c:(.text+0x11f4): undefined reference to `drm_dsc_pps_payload_pack' Select the same symbols that the others already use. Fixes: 988d0ff29ecf7 ("drm/panel: Add driver for BOE RM692E5 AMOLED panel") Signed-off-by: Arnd Bergmann Reviewed-by: Neil Armstrong Link: https://lore.kernel.org/r/20231023115619.3551348-1-arnd@kernel.org Signed-off-by: Neil Armstrong Link: https://patchwork.freedesktop.org/patch/msgid/20231023115619.3551348-1-arnd@kernel.org Signed-off-by: Sasha Levin commit 462faf589c126ecaffd3a6b780741f5c24d02b98 Author: Artur Weber Date: Fri Jan 5 07:53:02 2024 +0100 drm/panel: samsung-s6d7aa0: drop DRM_BUS_FLAG_DE_HIGH for lsl080al02 [ Upstream commit 62b143b5ec4a14e1ae0dede5aabaf1832e3b0073 ] It turns out that I had misconfigured the device I was using the panel with; the bus data polarity is not high for this panel, I had to change the config on the display controller's side. Fix the panel config to properly reflect its accurate settings. Fixes: 6810bb390282 ("drm/panel: Add Samsung S6D7AA0 panel controller driver") Reviewed-by: Jessica Zhang Signed-off-by: Artur Weber Link: https://lore.kernel.org/r/20240105-tab3-display-fixes-v2-2-904d1207bf6f@gmail.com Signed-off-by: Neil Armstrong Link: https://patchwork.freedesktop.org/patch/msgid/20240105-tab3-display-fixes-v2-2-904d1207bf6f@gmail.com Signed-off-by: Sasha Levin commit ba7e07633f04bd16b8b0ab34bac461e03da106b4 Author: Markus Niebel Date: Thu Oct 12 10:42:08 2023 +0200 drm: panel-simple: add missing bus flags for Tianma tm070jvhg[30/33] [ Upstream commit 45dd7df26cee741b31c25ffdd44fb8794eb45ccd ] The DE signal is active high on this display, fill in the missing bus_flags. This aligns panel_desc with its display_timing. Fixes: 9a2654c0f62a ("drm/panel: Add and fill drm_panel type field") Fixes: b3bfcdf8a3b6 ("drm/panel: simple: add Tianma TM070JVHG33") Signed-off-by: Markus Niebel Signed-off-by: Alexander Stein Reviewed-by: Sam Ravnborg Link: https://lore.kernel.org/r/20231012084208.2731650-1-alexander.stein@ew.tq-group.com Signed-off-by: Neil Armstrong Link: https://patchwork.freedesktop.org/patch/msgid/20231012084208.2731650-1-alexander.stein@ew.tq-group.com Signed-off-by: Sasha Levin commit 31976a047b8a2fae7fa2c3df858314dc3016b06d Author: Douglas Anderson Date: Thu Dec 21 13:55:48 2023 -0800 drm/bridge: parade-ps8640: Wait for HPD when doing an AUX transfer [ Upstream commit 024b32db43a359e0ded3fcc6cd86247cbbed4224 ] Unlike what is claimed in commit f5aa7d46b0ee ("drm/bridge: parade-ps8640: Provide wait_hpd_asserted() in struct drm_dp_aux"), if someone manually tries to do an AUX transfer (like via `i2cdump ${bus} 0x50 i`) while the panel is off we don't just get a simple transfer error. Instead, the whole ps8640 gets thrown for a loop and goes into a bad state. Let's put the function to wait for the HPD (and the magical 50 ms after first reset) back in when we're doing an AUX transfer. This shouldn't actually make things much slower (assuming the panel is on) because we should immediately poll and see the HPD high. Mostly this is just an extra i2c transfer to the bridge. Fixes: f5aa7d46b0ee ("drm/bridge: parade-ps8640: Provide wait_hpd_asserted() in struct drm_dp_aux") Tested-by: Pin-yen Lin Reviewed-by: Pin-yen Lin Signed-off-by: Douglas Anderson Link: https://patchwork.freedesktop.org/patch/msgid/20231221135548.1.I10f326a9305d57ad32cee7f8d9c60518c8be20fb@changeid Signed-off-by: Sasha Levin commit 2b132089a9eb49508b017b4b742dfc98bb5b405f Author: Alex Deucher Date: Fri Jan 19 12:32:59 2024 -0500 drm/amdgpu/gfx11: set UNORD_DISPATCH in compute MQDs [ Upstream commit 3380fcad2c906872110d31ddf7aa1fdea57f9df6 ] This needs to be set to 1 to avoid a potential deadlock in the GC 10.x and newer. On GC 9.x and older, this needs to be set to 0. This can lead to hangs in some mixed graphics and compute workloads. Updated firmware is also required for AQL. Reviewed-by: Feifei Xu Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit e664165d4bea5de733a3c19b518ecc207e503bba Author: Alex Deucher Date: Fri Jan 19 12:23:55 2024 -0500 drm/amdgpu/gfx10: set UNORD_DISPATCH in compute MQDs [ Upstream commit 03ff6d7238b77e5fb2b85dc5fe01d2db9eb893bd ] This needs to be set to 1 to avoid a potential deadlock in the GC 10.x and newer. On GC 9.x and older, this needs to be set to 0. This can lead to hangs in some mixed graphics and compute workloads. Updated firmware is also required for AQL. Reviewed-by: Feifei Xu Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 50a7e7868f753602cda77d641e53dc8a6f01d6bc Author: Friedrich Vock Date: Sat Dec 2 01:17:40 2023 +0100 drm/amdgpu: Enable tunneling on high-priority compute queues [ Upstream commit 91963397c49aa2907aeafa52d929555dcbc9cd07 ] This improves latency if the GPU is already busy with other work. This is useful for VR compositors that submit highly latency-sensitive compositing work on high-priority compute queues while the GPU is busy rendering the next frame. Userspace merge request: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/26462 v2: bump driver version (Alex) Reviewed-by: Marek Olšák Signed-off-by: Friedrich Vock Signed-off-by: Alex Deucher Stable-dep-of: 03ff6d7238b7 ("drm/amdgpu/gfx10: set UNORD_DISPATCH in compute MQDs") Signed-off-by: Sasha Levin commit da37d00c636966c8a0856417d02cb79811b61692 Author: Dillon Varone Date: Thu Dec 28 21:36:39 2023 -0500 drm/amd/display: Init link enc resources in dc_state only if res_pool presents [ Upstream commit aa36d8971fccb55ef3241cbfff9d1799e31d8628 ] [Why & How] res_pool is not initialized in all situations such as virtual environments, and therefore link encoder resources should not be initialized if res_pool is NULL. Cc: Mario Limonciello Cc: Alex Deucher Cc: stable@vger.kernel.org Reviewed-by: Martin Leung Acked-by: Alex Hung Signed-off-by: Dillon Varone Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 30985683549b150b4e55c032034cad0175b98e7a Author: Wenjing Liu Date: Thu Nov 2 14:59:13 2023 -0400 drm/amd/display: update pixel clock params after stream slice count change in context [ Upstream commit cfab803884f426b36b58dbe1f86f99742767c208 ] [why] When ODM slice count is changed, otg master pipe's pixel clock params is no longer valid as the value is dependent on ODM slice count. Reviewed-by: Chaitanya Dhere Acked-by: Hamza Mahfooz Signed-off-by: Wenjing Liu Signed-off-by: Alex Deucher Stable-dep-of: aa36d8971fcc ("drm/amd/display: Init link enc resources in dc_state only if res_pool presents") Signed-off-by: Sasha Levin commit eeb8ae770f8ae0933103a20e4d3c32ce3ea05b9c Author: Charlene Liu Date: Thu Dec 28 13:19:33 2023 -0500 drm/amd/display: Add logging resource checks [ Upstream commit 8a51cc097dd590a86e8eec5398934ef389ff9a7b ] [Why] When mapping resources, resources could be unavailable. Cc: Mario Limonciello Cc: Alex Deucher Cc: stable@vger.kernel.org Reviewed-by: Sung joon Kim Acked-by: Alex Hung Signed-off-by: Charlene Liu Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 7f0b58f2b24f39e9131f7fc01e9d8346ec8bac4c Author: Ilya Bakoulin Date: Wed Jan 3 09:42:04 2024 -0500 drm/amd/display: Clear OPTC mem select on disable [ Upstream commit 3ba2a0bfd8cf94eb225e1c60dff16e5c35bde1da ] [Why] Not clearing the memory select bits prior to OPTC disable can cause DSC corruption issues when attempting to reuse a memory instance for another OPTC that enables ODM. [How] Clear the memory select bits prior to disabling an OPTC. Cc: Mario Limonciello Cc: Alex Deucher Cc: stable@vger.kernel.org Reviewed-by: Charlene Liu Acked-by: Alex Hung Signed-off-by: Ilya Bakoulin Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit dd9c923a582cf4590018973a7c003614cd3b3215 Author: George Shen Date: Sun Dec 17 17:17:57 2023 -0500 drm/amd/display: Disconnect phantom pipe OPP from OPTC being disabled [ Upstream commit 7bdbfb4e36e34eb788e44f27666bf0a2b3b90803 ] [Why] If an OPP is used for a different OPTC without first being disconnected from the previous OPTC, unexpected behaviour can occur. This also applies to phantom pipes, which is what the current logic missed. [How] Disconnect OPPs from OPTC for phantom pipes before disabling OTG master. Also move the disconnection to before the OTG master disable, since the register is double buffered. Reviewed-by: Dillon Varone Acked-by: Rodrigo Siqueira Signed-off-by: George Shen Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher Stable-dep-of: 3ba2a0bfd8cf ("drm/amd/display: Clear OPTC mem select on disable") Signed-off-by: Sasha Levin commit 4b6b479b2da6badff099b2e3abf0248936eefbf5 Author: Ilya Bakoulin Date: Fri Dec 8 12:19:33 2023 -0500 drm/amd/display: Fix hang/underflow when transitioning to ODM4:1 [ Upstream commit e7b2b108cdeab76a7e7324459e50b0c1214c0386 ] [Why] Under some circumstances, disabling an OPTC and attempting to reclaim its OPP(s) for a different OPTC could cause a hang/underflow due to OPPs not being properly disconnected from the disabled OPTC. [How] Ensure that all OPPs are unassigned from an OPTC when it gets disabled. Reviewed-by: Alvin Lee Acked-by: Wayne Lin Signed-off-by: Ilya Bakoulin Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher Stable-dep-of: 3ba2a0bfd8cf ("drm/amd/display: Clear OPTC mem select on disable") Signed-off-by: Sasha Levin commit b1b67136af67b5cfa9800a0debf117da855669f6 Author: Hsin-Yi Wang Date: Tue Nov 7 12:41:52 2023 -0800 drm/panel-edp: drm/panel-edp: Fix AUO B116XTN02 name [ Upstream commit 962845c090c4f85fa4f6872a5b6c89ee61f53cc0 ] Rename AUO 0x235c B116XTN02 to B116XTN02.3 according to decoding edid. Fixes: 3db2420422a5 ("drm/panel-edp: Add AUO B116XTN02, BOE NT116WHM-N21,836X2, NV116WHM-N49 V8.0") Cc: stable@vger.kernel.org Signed-off-by: Hsin-Yi Wang Reviewed-by: Douglas Anderson Acked-by: Maxime Ripard Signed-off-by: Douglas Anderson Link: https://patchwork.freedesktop.org/patch/msgid/20231107204611.3082200-3-hsinyi@chromium.org Signed-off-by: Sasha Levin commit 779eb323e2bfe5d9c7a0965a9e68f2b8fe90030c Author: Hsin-Yi Wang Date: Tue Nov 7 12:41:51 2023 -0800 drm/panel-edp: drm/panel-edp: Fix AUO B116XAK01 name and timing [ Upstream commit fc6e7679296530106ee0954e8ddef1aa58b2e0b5 ] Rename AUO 0x405c B116XAK01 to B116XAK01.0 and adjust the timing of auo_b116xak01: T3=200, T12=500, T7_max = 50 according to decoding edid and datasheet. Fixes: da458286a5e2 ("drm/panel: Add support for AUO B116XAK01 panel") Cc: stable@vger.kernel.org Signed-off-by: Hsin-Yi Wang Reviewed-by: Douglas Anderson Acked-by: Maxime Ripard Signed-off-by: Douglas Anderson Link: https://patchwork.freedesktop.org/patch/msgid/20231107204611.3082200-2-hsinyi@chromium.org Signed-off-by: Sasha Levin commit 21aadc548ca2b802928ee0246b6d99e1ae08b430 Author: Sheng-Liang Pan Date: Fri Oct 27 11:04:56 2023 +0800 drm/panel-edp: Add AUO B116XTN02, BOE NT116WHM-N21,836X2, NV116WHM-N49 V8.0 [ Upstream commit 3db2420422a5912d97966e0176050bb0fc9aa63e ] Add panel identification entry for - AUO B116XTN02 family (product ID:0x235c) - BOE NT116WHM-N21,836X2 (product ID:0x09c3) - BOE NV116WHM-N49 V8.0 (product ID:0x0979) Signed-off-by: Sheng-Liang Pan Signed-off-by: Douglas Anderson Link: https://patchwork.freedesktop.org/patch/msgid/20231027110435.1.Ia01fe9ec1c0953e0050a232eaa782fef2c037516@changeid Stable-dep-of: fc6e76792965 ("drm/panel-edp: drm/panel-edp: Fix AUO B116XAK01 name and timing") Signed-off-by: Sasha Levin commit 886942651af86e85ade7a8b560bdd6157cdc5014 Author: Taimur Hassan Date: Fri Nov 10 10:15:28 2023 -0500 drm/amd/display: Fix conversions between bytes and KB [ Upstream commit 22136ff27c4e01fae81f6588033363a46c72ed8c ] [Why] There are a number of instances where we convert HostVMMinPageSize or GPUVMMinPageSize from bytes to KB by dividing (rather than multiplying) and vice versa. Additionally, in some cases, a parameter is passed through DML in KB but later checked as if it were in bytes. Cc: stable@vger.kernel.org Reviewed-by: Nicholas Kazlauskas Acked-by: Hamza Mahfooz Signed-off-by: Taimur Hassan Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 2ef98c6d753a744e333b7e34b9cf687040fba57d Author: Nicholas Kazlauskas Date: Tue Dec 5 11:22:56 2023 -0500 drm/amd/display: Wake DMCUB before executing GPINT commands [ Upstream commit e5ffd1263dd5b44929c676171802e7b6af483f21 ] [Why] DMCUB can be in idle when we attempt to interface with the HW through the GPINT mailbox resulting in a system hang. [How] Add dc_wake_and_execute_gpint() to wrap the wake, execute, sleep sequence. If the GPINT executes successfully then DMCUB will be put back into sleep after the optional response is returned. It functions similar to the inbox command interface. Cc: Mario Limonciello Cc: Alex Deucher Cc: stable@vger.kernel.org Reviewed-by: Hansen Dsouza Acked-by: Wayne Lin Signed-off-by: Nicholas Kazlauskas Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 303197775a97416b62d4da69280d0c120a20e009 Author: Nicholas Kazlauskas Date: Mon Dec 4 16:35:04 2023 -0500 drm/amd/display: Wake DMCUB before sending a command [ Upstream commit 8892780834ae294bc3697c7d0e056d7743900b39 ] [Why] We can hang in place trying to send commands when the DMCUB isn't powered on. [How] For functions that execute within a DC context or DC lock we can wrap the direct calls to dm_execute_dmub_cmd/list with code that exits idle power optimizations and reallows once we're done with the command submission on success. For DM direct submissions the DM will need to manage the enter/exit sequencing manually. We cannot invoke a DMCUB command directly within the DM execution helper or we can deadlock. Cc: Mario Limonciello Cc: Alex Deucher Cc: stable@vger.kernel.org Reviewed-by: Hansen Dsouza Acked-by: Wayne Lin Signed-off-by: Nicholas Kazlauskas Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 820c3870c491946a78950cdf961bf40e28c1025f Author: Nicholas Kazlauskas Date: Mon Dec 4 14:10:05 2023 -0500 drm/amd/display: Refactor DMCUB enter/exit idle interface [ Upstream commit 8e57c06bf4b0f51a4d6958e15e1a99c9520d00fa ] [Why] We can hang in place trying to send commands when the DMCUB isn't powered on. [How] We need to exit out of the idle state prior to sending a command, but the process that performs the exit also invokes a command itself. Fixing this issue involves the following: 1. Using a software state to track whether or not we need to start the process to exit idle or notify idle. It's possible for the hardware to have exited an idle state without driver knowledge, but entering one is always restricted to a driver allow - which makes the SW state vs HW state mismatch issue purely one of optimization, which should seldomly be hit, if at all. 2. Refactor any instances of exit/notify idle to use a single wrapper that maintains this SW state. This works simialr to dc_allow_idle_optimizations, but works at the DMCUB level and makes sure the state is marked prior to any notify/exit idle so we don't enter an infinite loop. 3. Make sure we exit out of idle prior to sending any commands or waiting for DMCUB idle. This patch takes care of 1/2. A future patch will take care of wrapping DMCUB command submission with calls to this new interface. Cc: Mario Limonciello Cc: Alex Deucher Cc: stable@vger.kernel.org Reviewed-by: Hansen Dsouza Acked-by: Wayne Lin Signed-off-by: Nicholas Kazlauskas Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher Stable-dep-of: 8892780834ae ("drm/amd/display: Wake DMCUB before sending a command") Signed-off-by: Sasha Levin commit 839895c804416ff39d726c919bcdc37eb6a4e630 Author: Samson Tam Date: Tue Nov 28 16:53:12 2023 -0500 drm/amd/display: do not send commands to DMUB if DMUB is inactive from S3 [ Upstream commit 0f657938e4345a77be871d906f3e0de3c58a7a49 ] [Why] On resume from S3, may get apply_idle_optimizations call while DMUB is inactive which will just time out. [How] Set and track power state in dmub_srv and check power state before sending commands to DMUB. Add interface in both dmub_srv and dc_dmub_srv Reviewed-by: Nicholas Kazlauskas Acked-by: Wayne Lin Signed-off-by: Samson Tam Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher Stable-dep-of: 8892780834ae ("drm/amd/display: Wake DMCUB before sending a command") Signed-off-by: Sasha Levin commit 6f64fb8ed7b00fe806d683c02ea0f91cc47a235c Author: Naohiro Aota Date: Tue Dec 19 01:02:29 2023 +0900 btrfs: zoned: optimize hint byte for zoned allocator [ Upstream commit 02444f2ac26eae6385a65fcd66915084d15dffba ] Writing sequentially to a huge file on btrfs on a SMR HDD revealed a decline of the performance (220 MiB/s to 30 MiB/s after 500 minutes). The performance goes down because of increased latency of the extent allocation, which is induced by a traversing of a lot of full block groups. So, this patch optimizes the ffe_ctl->hint_byte by choosing a block group with sufficient size from the active block group list, which does not contain full block groups. After applying the patch, the performance is maintained well. Fixes: 2eda57089ea3 ("btrfs: zoned: implement sequential extent allocation") CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Johannes Thumshirn Signed-off-by: Naohiro Aota Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit 2333a3ed959eb61547ef6681bfe585719d1adda1 Author: Naohiro Aota Date: Tue Dec 19 01:02:28 2023 +0900 btrfs: zoned: factor out prepare_allocation_zoned() [ Upstream commit b271fee9a41ca1474d30639fd6cc912c9901d0f8 ] Factor out prepare_allocation_zoned() for further extension. While at it, optimize the if-branch a bit. Reviewed-by: Johannes Thumshirn Signed-off-by: Naohiro Aota Reviewed-by: David Sterba Signed-off-by: David Sterba Stable-dep-of: 02444f2ac26e ("btrfs: zoned: optimize hint byte for zoned allocator") Signed-off-by: Sasha Levin commit 6e6b4f93614a13466d6629750d7e9acaf6d8fa4c Author: Alexander Stein Date: Thu Nov 2 10:50:48 2023 +0100 media: i2c: imx290: Properly encode registers as little-endian [ Upstream commit 60fc87a69523c294eb23a1316af922f6665a6f8c ] The conversion to CCI also converted the multi-byte register access to big-endian. Correct the register definition by using the correct little-endian ones. Fixes: af73323b9770 ("media: imx290: Convert to new CCI register access helpers") Cc: stable@vger.kernel.org Signed-off-by: Alexander Stein Reviewed-by: Hans de Goede Reviewed-by: Laurent Pinchart [Sakari Ailus: Fixed the Fixes: tag.] Signed-off-by: Sakari Ailus Signed-off-by: Hans Verkuil Signed-off-by: Sasha Levin commit d5d66144c75fdf3f225b935504c21bf33386990b Author: Alexander Stein Date: Thu Nov 2 10:50:47 2023 +0100 media: v4l2-cci: Add support for little-endian encoded registers [ Upstream commit d92e7a013ff33f4e0b31bbf768d0c85a8acefebf ] Some sensors, e.g. Sony IMX290, are using little-endian registers. Add support for those by encoding the endianness into Bit 20 of the register address. Fixes: af73323b9770 ("media: imx290: Convert to new CCI register access helpers") Cc: stable@vger.kernel.org Signed-off-by: Alexander Stein Reviewed-by: Hans de Goede Reviewed-by: Laurent Pinchart [Sakari Ailus: Fixed commit message.] Signed-off-by: Sakari Ailus Signed-off-by: Hans Verkuil Signed-off-by: Sasha Levin commit b6d4d288de9cdad765be357b89f651dabc1c6fcc Author: Sakari Ailus Date: Tue Nov 7 17:42:40 2023 +0200 media: v4l: cci: Add macros to obtain register width and address [ Upstream commit cd93cc245dfe334c38da98c14b34f9597e1b4ea6 ] Add CCI_REG_WIDTH() macro to obtain register width in bits and similarly, CCI_REG_WIDTH_BYTES() to obtain it in bytes. Also add CCI_REG_ADDR() macro to obtain the address of a register. Use both macros in v4l2-cci.c, too. Signed-off-by: Sakari Ailus Reviewed-by: Hans de Goede Reviewed-by: Laurent Pinchart Signed-off-by: Hans Verkuil Stable-dep-of: d92e7a013ff3 ("media: v4l2-cci: Add support for little-endian encoded registers") Signed-off-by: Sasha Levin commit a7a60c8301245a916ddfdddc8ea29704bcc9ad4f Author: Sakari Ailus Date: Tue Nov 7 10:45:30 2023 +0200 media: v4l: cci: Include linux/bits.h [ Upstream commit eba5058633b4d11e2a4d65eae9f1fce0b96365d9 ] linux/bits.h is needed for GENMASK(). Include it. Signed-off-by: Sakari Ailus Reviewed-by: Hans de Goede Reviewed-by: Laurent Pinchart Signed-off-by: Hans Verkuil Stable-dep-of: d92e7a013ff3 ("media: v4l2-cci: Add support for little-endian encoded registers") Signed-off-by: Sasha Levin commit c9d6d63b6c03afaa6f185df249af693a7939577c Author: Ricardo Neri Date: Tue Jan 9 19:07:04 2024 -0800 thermal: intel: hfi: Add syscore callbacks for system-wide PM [ Upstream commit 97566d09fd02d2ab329774bb89a2cdf2267e86d9 ] The kernel allocates a memory buffer and provides its location to the hardware, which uses it to update the HFI table. This allocation occurs during boot and remains constant throughout runtime. When resuming from hibernation, the restore kernel allocates a second memory buffer and reprograms the HFI hardware with the new location as part of a normal boot. The location of the second memory buffer may differ from the one allocated by the image kernel. When the restore kernel transfers control to the image kernel, its HFI buffer becomes invalid, potentially leading to memory corruption if the hardware writes to it (the hardware continues to use the buffer from the restore kernel). It is also possible that the hardware "forgets" the address of the memory buffer when resuming from "deep" suspend. Memory corruption may also occur in such a scenario. To prevent the described memory corruption, disable HFI when preparing to suspend or hibernate. Enable it when resuming. Add syscore callbacks to handle the package of the boot CPU (packages of non-boot CPUs are handled via CPU offline). Syscore ops always run on the boot CPU. Additionally, HFI only needs to be disabled during "deep" suspend and hibernation. Syscore ops only run in these cases. Cc: 6.1+ # 6.1+ Signed-off-by: Ricardo Neri [ rjw: Comment adjustment, subject and changelog edits ] Signed-off-by: Rafael J. Wysocki Signed-off-by: Sasha Levin commit 74090a40021a7d78429bd7af6f7cd7e90e489444 Author: Ricardo Neri Date: Tue Jan 2 20:14:58 2024 -0800 thermal: intel: hfi: Disable an HFI instance when all its CPUs go offline [ Upstream commit 1c53081d773c2cb4461636559b0d55b46559ceec ] In preparation to support hibernation, add functionality to disable an HFI instance during CPU offline. The last CPU of an instance that goes offline will disable such instance. The Intel Software Development Manual states that the operating system must wait for the hardware to set MSR_IA32_PACKAGE_THERM_STATUS[26] after disabling an HFI instance to ensure that it will no longer write on the HFI memory. Some processors, however, do not ever set such bit. Wait a minimum of 2ms to give time hardware to complete any pending memory writes. Signed-off-by: Ricardo Neri Signed-off-by: Rafael J. Wysocki Stable-dep-of: 97566d09fd02 ("thermal: intel: hfi: Add syscore callbacks for system-wide PM") Signed-off-by: Sasha Levin commit b326de729f87b3ad53a382c871701f3016582e24 Author: Ricardo Neri Date: Tue Jan 2 20:14:56 2024 -0800 thermal: intel: hfi: Refactor enabling code into helper functions [ Upstream commit 8a8b6bb93c704776c4b05cb517c3fa8baffb72f5 ] In preparation for the addition of a suspend notifier, wrap the logic to enable HFI and program its memory buffer into helper functions. Both the CPU hotplug callback and the suspend notifier will use them. This refactoring does not introduce functional changes. Signed-off-by: Ricardo Neri Signed-off-by: Rafael J. Wysocki Stable-dep-of: 97566d09fd02 ("thermal: intel: hfi: Add syscore callbacks for system-wide PM") Signed-off-by: Sasha Levin commit 170a9dac29720869398c56cc554142004a0e4157 Author: Srinivasan Shanmugam Date: Wed Jan 17 08:41:52 2024 +0530 drm/amd/display: Fix uninitialized variable usage in core_link_ 'read_dpcd() & write_dpcd()' functions commit a58371d632ebab9ea63f10893a6b6731196b6f8d upstream. The 'status' variable in 'core_link_read_dpcd()' & 'core_link_write_dpcd()' was uninitialized. Thus, initializing 'status' variable to 'DC_ERROR_UNEXPECTED' by default. Fixes the below: drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dpcd.c:226 core_link_read_dpcd() error: uninitialized symbol 'status'. drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dpcd.c:248 core_link_write_dpcd() error: uninitialized symbol 'status'. Cc: stable@vger.kernel.org Cc: Jerry Zuo Cc: Jun Lei Cc: Wayne Lin Cc: Aurabindo Pillai Cc: Rodrigo Siqueira Cc: Hamza Mahfooz Signed-off-by: Srinivasan Shanmugam Reviewed-by: Rodrigo Siqueira Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit 2d2c9032d61164e15a1d98558ba8575c00d621dd Author: Ma Jun Date: Wed Jan 17 14:35:29 2024 +0800 drm/amdgpu/pm: Fix the power source flag error commit ca1ffb174f16b699c536734fc12a4162097c49f4 upstream. The power source flag should be updated when [1] System receives an interrupt indicating that the power source has changed. [2] System resumes from suspend or runtime suspend Signed-off-by: Ma Jun Reviewed-by: Lijo Lazar Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit bf2fb650b1b605ae248c92b2c354bd623c7b12f1 Author: Kenneth Feng Date: Fri Jan 19 16:12:00 2024 +0800 drm/amd/pm: update the power cap setting commit 30269954745c6cac730352829ac9850918457440 upstream. update the power cap setting for smu_v13.0.0/smu_v13.0.7 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2356 Signed-off-by: Kenneth Feng Acked-by: Alex Deucher Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 5c0cc185d44895d5c6bb8304f18a03f7bc9b819f Author: Lijo Lazar Date: Sat Jan 20 13:48:09 2024 +0530 drm/amdgpu: Show vram vendor only if available commit 89a7c0bd74918f723c94c10452265e25063cba9b upstream. Ony if vram vendor info is available, show in sysfs. Signed-off-by: Lijo Lazar Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org # 6.7.x Signed-off-by: Greg Kroah-Hartman commit 10b0d6ea0ea96e84b9ecd2e5622953b6a3cfb0a4 Author: Lijo Lazar Date: Sat Jan 20 13:32:51 2024 +0530 drm/amdgpu: Avoid fetching vram vendor information commit 90751bdeee4e3ac87ebf814bf282b0fa97edfeab upstream. For GFX 9.4.3 APUs, the current method of fetching vram vendor information is not reliable. Avoid fetching the information. Signed-off-by: Lijo Lazar Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org # 6.7.x Signed-off-by: Greg Kroah-Hartman commit 31e967c68c351b25201cf263cc5f6a1355587f44 Author: Lijo Lazar Date: Thu Jan 18 14:25:35 2024 +0530 drm/amd/pm: Fetch current power limit from FW commit f1807682de0edbff6c1e46b19642a517d2e15c57 upstream. Power limit of SMUv13.0.6 SOCs can be updated by out-of-band ways. Fetch the limit from firmware instead of using cached values. Signed-off-by: Lijo Lazar Reviewed-by: Asad Kamal Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org # 6.7.x Signed-off-by: Greg Kroah-Hartman commit e5e134f0e14ee09c2e1b95b819fe8baec3383e78 Author: Tom St Denis Date: Wed Jan 17 12:47:37 2024 -0500 drm/amd/amdgpu: Assign GART pages to AMD device mapping commit e7a8594cc2af920a905db15653c19c362d4ebd3f upstream. This allows kernel mapped pages like the PDB and PTB to be read via the iomem debugfs when there is no vram in the system. Signed-off-by: Tom St Denis Reviewed-by: Christian König Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org # 6.7.x Signed-off-by: Greg Kroah-Hartman commit 4ed605220dced43aa4eca93820c32ac9eb52ae6c Author: Christophe JAILLET Date: Sat Jan 13 15:58:21 2024 +0100 drm/amd/display: Fix a switch statement in populate_dml_output_cfg_from_stream_state() commit 05638ff6dd6f0f38734b6b3ee2c7cf15520f5c00 upstream. It is likely that the statement related to 'dml_edp' is misplaced. So move it in the correct "case SIGNAL_TYPE_EDP". Fixes: 7966f319c66d ("drm/amd/display: Introduce DML2") Signed-off-by: Christophe JAILLET Signed-off-by: Hamza Mahfooz Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit cf656fc7276e5b3709a81bc9d9639459be2b2647 Author: Srinivasan Shanmugam Date: Wed Jan 10 20:58:35 2024 +0530 drm/amd/display: Fix late derefrence 'dsc' check in 'link_set_dsc_pps_packet()' commit 3bb9b1f958c3d986ed90a3ff009f1e77e9553207 upstream. In link_set_dsc_pps_packet(), 'struct display_stream_compressor *dsc' was dereferenced in a DC_LOGGER_INIT(dsc->ctx->logger); before the 'dsc' NULL pointer check. Fixes the below: drivers/gpu/drm/amd/amdgpu/../display/dc/link/link_dpms.c:905 link_set_dsc_pps_packet() warn: variable dereferenced before check 'dsc' (see line 903) Cc: stable@vger.kernel.org Cc: Aurabindo Pillai Cc: Rodrigo Siqueira Cc: Hamza Mahfooz Cc: Wenjing Liu Cc: Qingqing Zhuo Signed-off-by: Srinivasan Shanmugam Reviewed-by: Aurabindo Pillai Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit 86195357c9211068b8c4c8147232b1915ed29fbe Author: Wayne Lin Date: Tue Jan 2 14:20:37 2024 +0800 drm/amd/display: Align the returned error code with legacy DP commit bfe79f5fff1300d96203383582b078c7b0aec80a upstream. [Why] For usb4 connector, AUX transaction is handled by dmub utilizing a differnt code path comparing to legacy DP connector. If the usb4 DP connector is disconnected, AUX access will report EBUSY and cause igt@kms_dp_aux_dev fail. [How] Align the error code with the one reported by legacy DP as EIO. Cc: Mario Limonciello Cc: Alex Deucher Cc: stable@vger.kernel.org Acked-by: Alex Hung Signed-off-by: Wayne Lin Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit 717fc1bd64261d9d25ad613ad1c9d3995a1de677 Author: Nicholas Kazlauskas Date: Fri Dec 15 11:01:42 2023 -0500 drm/amd/display: Port DENTIST hang and TDR fixes to OTG disable W/A commit 4b56f7d47be87cde5f368b67bc7fac53a2c3e8d2 upstream. [Why] We can experience DENTIST hangs during optimize_bandwidth or TDRs if FIFO is toggled and hangs. [How] Port the DCN35 fixes to DCN314. Cc: Mario Limonciello Cc: Alex Deucher Cc: stable@vger.kernel.org Reviewed-by: Charlene Liu Acked-by: Alex Hung Signed-off-by: Nicholas Kazlauskas Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit 58063cc9bcc3bd39f6f8fb8e68f9de8453b229d3 Author: Ovidiu Bunea Date: Mon Dec 18 21:40:45 2023 -0500 drm/amd/display: Fix DML2 watermark calculation commit d3579f5df0536c2f0fabaa3ea80bb2d179884195 upstream. [Why] core_mode_programming in DML2 should output watermark calculations to locals, but it incorrectly uses mode_lib [How] update code to match HW DML2 Cc: Mario Limonciello Cc: Alex Deucher Cc: stable@vger.kernel.org Reviewed-by: Charlene Liu Acked-by: Alex Hung Signed-off-by: Ovidiu Bunea Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit c02d257c654191ecda1dc1af6875d527e85310e7 Author: Srinivasan Shanmugam Date: Mon Jan 8 21:20:28 2024 +0530 drm/amd/display: Fix variable deferencing before NULL check in edp_setup_replay() commit 7073934f5d73f8b53308963cee36f0d389ea857c upstream. In edp_setup_replay(), 'struct dc *dc' & 'struct dmub_replay *replay' was dereferenced before the pointer 'link' & 'replay' NULL check. Fixes the below: drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_edp_panel_control.c:947 edp_setup_replay() warn: variable dereferenced before check 'link' (see line 933) Cc: stable@vger.kernel.org Cc: Bhawanpreet Lakha Cc: Harry Wentland Cc: Rodrigo Siqueira Cc: Aurabindo Pillai Cc: Alex Deucher Signed-off-by: Srinivasan Shanmugam Reviewed-by: Aurabindo Pillai Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit cde4c4f391f45a7b369c9f97fb15160ab268dbdc Author: Lijo Lazar Date: Thu Jan 11 09:47:33 2024 +0530 drm/amd/pm: Add error log for smu v13.0.6 reset commit 91739a897c12dcec699e53f390be1b4abdeef3a0 upstream. For all mode-2 reset fail cases, add error log. Signed-off-by: Lijo Lazar Reviewed-by: Asad Kamal Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org # 6.7.x Signed-off-by: Greg Kroah-Hartman commit e39165fdec04467fcaf01b26d0dd0e61b0583066 Author: Lijo Lazar Date: Thu Jan 11 15:28:53 2024 +0530 drm/amd/pm: Fix smuv13.0.6 current clock reporting commit a992c90d8ed3929b70ae815ce21ca5651cc0a692 upstream. When current clock is equal to max dpm level clock, the level is not indicated correctly with *. Fix by comparing current clock against dpm level value. Signed-off-by: Lijo Lazar Reviewed-by: Asad Kamal Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org # 6.7.x Signed-off-by: Greg Kroah-Hartman commit 562ef60ac05f75d1aa4309caf05f40bef1bde7d3 Author: Likun Gao Date: Fri Jan 5 17:33:34 2024 +0800 drm/amdgpu: correct the cu count for gfx v11 commit f4a94dbb6dc0bed10a5fc63718d00f1de45b12c0 upstream. Correct the algorithm of active CU to skip disabled sa for gfx v11. Signed-off-by: Likun Gao Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 14d31c45985d2c6977f8f568bfe622a8ac44655f Author: Yifan Zhang Date: Tue Jan 9 09:19:22 2024 +0800 drm/amdgpu: update regGL2C_CTRL4 value in golden setting commit 2b9a073b7304f4a9e130d04794c91a0c4f9a5c12 upstream. This patch to update regGL2C_CTRL4 in golden setting. Signed-off-by: Yifan Zhang Reviewed-by: Tim Huang Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org # 6.7.x Signed-off-by: Greg Kroah-Hartman commit 35ab8ad6c12630f0fbb6dc58f9ce8e2b4fd86254 Author: Alex Deucher Date: Tue Jan 9 10:45:42 2024 -0500 drm/amdgpu: drop exp hw support check for GC 9.4.3 commit c3d5e297dcae88274dc6924db337a2159279eced upstream. No longer needed. Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org # 6.7.x Signed-off-by: Greg Kroah-Hartman commit 48683e4dcf805274125b37ec4b617a1e10929075 Author: Ori Messinger Date: Wed Nov 22 00:12:13 2023 -0500 drm/amdgpu: Enable GFXOFF for Compute on GFX11 commit aa0901a9008eeb2710292aff94e615adf7884d5f upstream. On GFX version 11, GFXOFF was disabled due to a MES KIQ firmware issue, which has since been fixed after version 64. This patch only re-enables GFXOFF for GFX version 11 if the GPU's MES KIQ firmware version is newer than version 64. V2: Keep GFXOFF disabled on GFX11 if MES KIQ is below version 64. V3: Add parentheses to avoid GCC warning for parentheses: "suggest parentheses around comparison in operand of ‘&’" V4: Remove "V3" from commit title V5: Change commit description and insert 'Acked-by' Signed-off-by: Ori Messinger Acked-by: Alex Deucher Reviewed-by: Harish Kasiviswanathan Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 43235db21fc23559f50a62f8f273002eeb506f5a Author: Aurabindo Pillai Date: Fri Oct 27 15:59:48 2023 -0400 drm/amd/display: Fix a debugfs null pointer error commit efb91fea652a42fcc037d2a9ef4ecd1ffc5ff4b7 upstream. [WHY & HOW] Check whether get_subvp_en() callback exists before calling it. Cc: Mario Limonciello Cc: Alex Deucher Cc: stable@vger.kernel.org Reviewed-by: Alex Hung Acked-by: Alex Hung Signed-off-by: Aurabindo Pillai Tested-by: Daniel Wheeler Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit 76161cb064ff98cf2190f4ab375f84bf583b01e9 Author: Dan Carpenter Date: Wed Dec 6 18:05:15 2023 +0300 drm/bridge: nxp-ptn3460: simplify some error checking commit 28d3d0696688154cc04983f343011d07bf0508e4 upstream. The i2c_master_send/recv() functions return negative error codes or they return "len" on success. So the error handling here can be written as just normal checks for "if (ret < 0) return ret;". No need to complicate things. Btw, in this code the "len" parameter can never be zero, but even if it were, then I feel like this would still be the best way to write it. Fixes: 914437992876 ("drm/bridge: nxp-ptn3460: fix i2c_master_send() error checking") Suggested-by: Neil Armstrong Signed-off-by: Dan Carpenter Reviewed-by: Robert Foss Signed-off-by: Robert Foss Link: https://patchwork.freedesktop.org/patch/msgid/04242630-42d8-4920-8c67-24ac9db6b3c9@moroto.mountain Signed-off-by: Greg Kroah-Hartman commit 435edec678163de4d0420eb86ee13dd3ccc52c35 Author: Ivan Lipski Date: Fri Jan 5 19:40:50 2024 -0500 Revert "drm/amd/display: fix bandwidth validation failure on DCN 2.1" commit c2ab9ce0ee7225fc05f58a6671c43b8a3684f530 upstream. This commit causes dmesg-warn on several IGT tests on DCN 3.1.6: *ERROR* link_enc_cfg_validate: Invalid link encoder assignments - 0x1c Affected IGT tests include: - amdgpu/[amd_assr|amd_plane|amd_hotplug] - kms_atomic - kms_color - kms_flip - kms_properties - kms_universal_plane and some other tests This reverts commit 3a0fa3bc245ef92838a8296e0055569b8dff94c4. Cc: Melissa Wen Cc: Hamza Mahfooz Reviewed-by: Rodrigo Siqueira Signed-off-by: Ivan Lipski Signed-off-by: Rodrigo Siqueira Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit f015d8b6405d950f30826b4d8d9e1084dd9ea2a4 Author: Mario Limonciello Date: Mon Jun 19 15:04:24 2023 -0500 drm/amd/display: Disable PSR-SU on Parade 0803 TCON again commit 571c2fa26aa654946447c282a09d40a56c7ff128 upstream. When screen brightness is rapidly changed and PSR-SU is enabled the display hangs on panels with this TCON even on the latest DCN 3.1.4 microcode (0x8002a81 at this time). This was disabled previously as commit 072030b17830 ("drm/amd: Disable PSR-SU on Parade 0803 TCON") but reverted as commit 1e66a17ce546 ("Revert "drm/amd: Disable PSR-SU on Parade 0803 TCON"") in favor of testing for a new enough microcode (commit cd2e31a9ab93 ("drm/amd/display: Set minimum requirement for using PSR-SU on Phoenix")). As hangs are still happening specifically with this TCON, disable PSR-SU again for it until it can be root caused. Cc: stable@vger.kernel.org Cc: aaron.ma@canonical.com Cc: binli@gnome.org Cc: Marc Rossi Cc: Hamza Mahfooz Signed-off-by: Mario Limonciello Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2046131 Acked-by: Alex Deucher Reviewed-by: Harry Wentland Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit f829932e011b31b85d89cba0ce757ee875f68d5f Author: Melissa Wen Date: Fri Dec 29 15:25:00 2023 -0100 drm/amd/display: fix bandwidth validation failure on DCN 2.1 commit 3a0fa3bc245ef92838a8296e0055569b8dff94c4 upstream. IGT `amdgpu/amd_color/crtc-lut-accuracy` fails right at the beginning of the test execution, during atomic check, because DC rejects the bandwidth state for a fb sizing 64x64. The test was previously working with the deprecated dc_commit_state(). Now using dc_validate_with_context() approach, the atomic check needs to perform a full state validation. Therefore, set fast_validation to false in the dc_validate_global_state call for atomic check. Cc: stable@vger.kernel.org Fixes: b8272241ff9d ("drm/amd/display: Drop dc_commit_state in favor of dc_commit_streams") Signed-off-by: Melissa Wen Signed-off-by: Hamza Mahfooz Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit 03971c801baf1c322e926cb7dac94e16f2350676 Author: Javier Martinez Canillas Date: Thu Nov 23 23:13:00 2023 +0100 drm: Allow drivers to indicate the damage helpers to ignore damage clips commit 35ed38d58257336c1df26b14fd5110b026e2adde upstream. It allows drivers to set a struct drm_plane_state .ignore_damage_clips in their plane's .atomic_check callback, as an indication to damage helpers such as drm_atomic_helper_damage_iter_init() that the damage clips should be ignored. To be used by drivers that do per-buffer (e.g: virtio-gpu) uploads (rather than per-plane uploads), since these type of drivers need to handle buffer damages instead of frame damages. That way, these drivers could force a full plane update if the framebuffer attached to a plane's state has changed since the last update (page-flip). Fixes: 01f05940a9a7 ("drm/virtio: Enable fb damage clips property for the primary plane") Cc: # v6.4+ Reported-by: nerdopolis Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218115 Suggested-by: Thomas Zimmermann Signed-off-by: Javier Martinez Canillas Reviewed-by: Thomas Zimmermann Reviewed-by: Zack Rusin Acked-by: Sima Vetter Link: https://patchwork.freedesktop.org/patch/msgid/20231123221315.3579454-2-javierm@redhat.com Signed-off-by: Greg Kroah-Hartman commit 07fcfd45b2eaed23c8375224333d64292dbe59b9 Author: Javier Martinez Canillas Date: Thu Nov 23 23:13:01 2023 +0100 drm/virtio: Disable damage clipping if FB changed since last page-flip commit 0240db231dfe5ee5b7a3a03cba96f0844b7a673d upstream. The driver does per-buffer uploads and needs to force a full plane update if the plane's attached framebuffer has change since the last page-flip. Fixes: 01f05940a9a7 ("drm/virtio: Enable fb damage clips property for the primary plane") Cc: # v6.4+ Reported-by: nerdopolis Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218115 Suggested-by: Sima Vetter Signed-off-by: Javier Martinez Canillas Reviewed-by: Thomas Zimmermann Reviewed-by: Zack Rusin Acked-by: Sima Vetter Link: https://patchwork.freedesktop.org/patch/msgid/20231123221315.3579454-3-javierm@redhat.com Signed-off-by: Greg Kroah-Hartman commit c038f9c471a90da8da2509457255ec7fb4c02b79 Author: Zack Rusin Date: Mon Oct 23 09:46:05 2023 +0200 drm: Disable the cursor plane on atomic contexts with virtualized drivers commit 4e3b70da64a53784683cfcbac2deda5d6e540407 upstream. Cursor planes on virtualized drivers have special meaning and require that the clients handle them in specific ways, e.g. the cursor plane should react to the mouse movement the way a mouse cursor would be expected to and the client is required to set hotspot properties on it in order for the mouse events to be routed correctly. This breaks the contract as specified by the "universal planes". Fix it by disabling the cursor planes on virtualized drivers while adding a foundation on top of which it's possible to special case mouse cursor planes for clients that want it. Disabling the cursor planes makes some kms compositors which were broken, e.g. Weston, fallback to software cursor which works fine or at least better than currently while having no effect on others, e.g. gnome-shell or kwin, which put virtualized drivers on a deny-list when running in atomic context to make them fallback to legacy kms and avoid this issue. Signed-off-by: Zack Rusin Fixes: 681e7ec73044 ("drm: Allow userspace to ask for universal plane list (v2)") Cc: # v5.4+ Cc: Maarten Lankhorst Cc: Maxime Ripard Cc: Thomas Zimmermann Cc: David Airlie Cc: Daniel Vetter Cc: Dave Airlie Cc: Gerd Hoffmann Cc: Hans de Goede Cc: Gurchetan Singh Cc: Chia-I Wu Cc: dri-devel@lists.freedesktop.org Cc: virtualization@lists.linux-foundation.org Cc: spice-devel@lists.freedesktop.org Acked-by: Pekka Paalanen Reviewed-by: Javier Martinez Canillas Acked-by: Simon Ser Signed-off-by: Javier Martinez Canillas Link: https://patchwork.freedesktop.org/patch/msgid/20231023074613.41327-2-aesteve@redhat.com Signed-off-by: Greg Kroah-Hartman commit f26a6eac4c7209b81a235a88b9dc322332d2818c Author: Tomi Valkeinen Date: Thu Nov 9 09:38:03 2023 +0200 drm/tidss: Fix atomic_flush check commit 95d4b471953411854f9c80b568da7fcf753f3801 upstream. tidss_crtc_atomic_flush() checks if the crtc is enabled, and if not, returns immediately as there's no reason to do any register changes. However, the code checks for 'crtc->state->enable', which does not reflect the actual HW state. We should instead look at the 'crtc->state->active' flag. This causes the tidss_crtc_atomic_flush() to proceed with the flush even if the active state is false, which then causes us to hit the WARN_ON(!crtc->state->event) check. Fix this by checking the active flag, and while at it, fix the related debug print which had "active" and "needs modeset" wrong way. Cc: Fixes: 32a1795f57ee ("drm/tidss: New driver for TI Keystone platform Display SubSystem") Reviewed-by: Aradhya Bhatia Link: https://lore.kernel.org/r/20231109-tidss-probe-v2-10-ac91b5ea35c0@ideasonboard.com Signed-off-by: Tomi Valkeinen Signed-off-by: Greg Kroah-Hartman commit 5fdbdd9f0a1a9e36bdaaad0738627601672d14a0 Author: Thomas Zimmermann Date: Wed Nov 22 13:09:31 2023 +0100 drm: Fix TODO list mentioning non-KMS drivers commit 9cf5ca1f485cae406968947a92bf304603999fa1 upstream. Non-KMS drivers have been removed from DRM. Update the TODO list accordingly. Signed-off-by: Thomas Zimmermann Fixes: a276afc19eec ("drm: Remove some obsolete drm pciids(tdfx, mga, i810, savage, r128, sis, via)") Cc: Cai Huoqing Cc: Daniel Vetter Cc: Dave Airlie Cc: Thomas Zimmermann Cc: Maarten Lankhorst Cc: Maxime Ripard Cc: David Airlie Cc: Daniel Vetter Cc: Jonathan Corbet Cc: dri-devel@lists.freedesktop.org Cc: # v6.3+ Cc: linux-doc@vger.kernel.org Reviewed-by: David Airlie Reviewed-by: Daniel Vetter Acked-by: Alex Deucher Link: https://patchwork.freedesktop.org/patch/msgid/20231122122449.11588-3-tzimmermann@suse.de Signed-off-by: Greg Kroah-Hartman commit aa2a2dc68b001fbbc9ba6866b458a6ff7887cce9 Author: Dan Carpenter Date: Mon Dec 4 15:29:00 2023 +0300 drm/bridge: nxp-ptn3460: fix i2c_master_send() error checking commit 914437992876838662c968cb416f832110fb1093 upstream. The i2c_master_send/recv() functions return negative error codes or the number of bytes that were able to be sent/received. This code has two problems. 1) Instead of checking if all the bytes were sent or received, it checks that at least one byte was sent or received. 2) If there was a partial send/receive then we should return a negative error code but this code returns success. Fixes: a9fe713d7d45 ("drm/bridge: Add PTN3460 bridge driver") Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter Reviewed-by: Robert Foss Signed-off-by: Robert Foss Link: https://patchwork.freedesktop.org/patch/msgid/0cdc2dce-ca89-451a-9774-1482ab2f4762@moroto.mountain Signed-off-by: Greg Kroah-Hartman commit bd8f006d4b6ff221e0cf54d6d98f59489d0dd3c8 Author: Ville Syrjälä Date: Thu Jan 18 23:21:31 2024 +0200 drm/i915/psr: Only allow PSR in LPSP mode on HSW non-ULT commit f9f031dd21a7ce13a13862fa5281d32e1029c70f upstream. On HSW non-ULT (or at least on Dell Latitude E6540) external displays start to flicker when we enable PSR on the eDP. We observe a much higher SR and PC6 residency than should be possible with an external display, and indeen much higher than what we observe with eDP disabled and only the external display enabled. Looks like the hardware is somehow ignoring the fact that the external display is active during PSR. I wasn't able to redproduce this on my HSW ULT machine, or BDW. So either there's something specific about this particular laptop (eg. some unknown firmware thing) or the issue is limited to just non-ULT HSW systems. All known registers that could affect this look perfectly reasonable on the affected machine. As a workaround let's unmask the LPSP event to prevent PSR entry except while in LPSP mode (only pipe A + eDP active). This will prevent PSR entry entirely when multiple pipes are active. The one slight downside is that we now also prevent PSR entry when driving eDP with pipe B or C, but I think that's a reasonable tradeoff to avoid having to implement a more complex workaround. Cc: stable@vger.kernel.org Fixes: 783d8b80871f ("drm/i915/psr: Re-enable PSR1 on hsw/bdw") Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10092 Signed-off-by: Ville Syrjälä Link: https://patchwork.freedesktop.org/patch/msgid/20240118212131.31868-1-ville.syrjala@linux.intel.com Reviewed-by: Jouni Högander (cherry picked from commit 94501c3ca6400e463ff6cc0c9cf4a2feb6a9205d) Signed-off-by: Joonas Lahtinen Signed-off-by: Greg Kroah-Hartman commit bfd0feb1b109cb63b87fdcd00122603787c75a1a Author: Ville Syrjälä Date: Mon Dec 11 10:16:24 2023 +0200 drm: Don't unref the same fb many times by mistake due to deadlock handling commit cb4daf271302d71a6b9a7c01bd0b6d76febd8f0c upstream. If we get a deadlock after the fb lookup in drm_mode_page_flip_ioctl() we proceed to unref the fb and then retry the whole thing from the top. But we forget to reset the fb pointer back to NULL, and so if we then get another error during the retry, before the fb lookup, we proceed the unref the same fb again without having gotten another reference. The end result is that the fb will (eventually) end up being freed while it's still in use. Reset fb to NULL once we've unreffed it to avoid doing it again until we've done another fb lookup. This turned out to be pretty easy to hit on a DG2 when doing async flips (and CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y). The first symptom I saw that drm_closefb() simply got stuck in a busy loop while walking the framebuffer list. Fortunately I was able to convince it to oops instead, and from there it was easier to track down the culprit. Cc: stable@vger.kernel.org Signed-off-by: Ville Syrjälä Link: https://patchwork.freedesktop.org/patch/msgid/20231211081625.25704-1-ville.syrjala@linux.intel.com Acked-by: Javier Martinez Canillas Signed-off-by: Greg Kroah-Hartman commit 04272ccd2a19da0b238d8f6a7d130c97c6243e6d Author: Ville Syrjälä Date: Tue Jan 16 23:08:21 2024 +0200 Revert "drm/i915/dsi: Do display on sequence later on icl+" commit 6992eb815d087858f8d7e4020529c2fe800456b3 upstream. This reverts commit 88b065943cb583e890324d618e8d4b23460d51a3. Lenovo 82TQ is unhappy if we do the display on sequence this late. The display output shows severe corruption. It's unclear if this is a failure on our part (perhaps something to do with sending commands in LP mode after HS /video mode transmission has been started? Though the backlight on command at least seems to work) or simply that there are some commands in the sequence that are needed to be done earlier (eg. could be some DSC init stuff?). If the latter then I don't think the current Windows code would work either, but maybe this was originally tested with an older driver, who knows. Root causing this fully would likely require a lot of experimentation which isn't really feasible without direct access to the machine, so let's just accept failure and go back to the original sequence. Cc: stable@vger.kernel.org Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10071 Signed-off-by: Ville Syrjälä Link: https://patchwork.freedesktop.org/patch/msgid/20240116210821.30194-1-ville.syrjala@linux.intel.com Acked-by: Jani Nikula (cherry picked from commit dc524d05974f615b145404191fcf91b478950499) Signed-off-by: Joonas Lahtinen Signed-off-by: Greg Kroah-Hartman commit 2e55e7dabeec38a1c1b405d6d741c1be2e20ddb6 Author: Dave Airlie Date: Sat Jan 27 04:04:34 2024 +1000 Revert "nouveau: push event block/allowing out of the fence context" commit 4d7acc8f48bcf27d0dc068f02e55c77e840b9110 upstream. This reverts commit eacabb5462717a52fccbbbba458365a4f5e61f35. This commit causes some regressions in desktop usage, this will reintroduce the original deadlock in DRI_PRIME situations, I've got an idea to fix it by offloading to a workqueue in a different spot, however this code has a race condition where we sometimes miss interrupts so I'd like to fix that as well. Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie Signed-off-by: Greg Kroah-Hartman commit 13eb8c89762d96d6086bb4da7d5763f8ec0d93c2 Author: Rafael J. Wysocki Date: Mon Jan 22 15:18:11 2024 +0100 cpufreq: intel_pstate: Refine computation of P-state for given frequency commit 192cdb1c907fd8df2d764c5bb17496e415e59391 upstream. On systems using HWP, if a given frequency is equal to the maximum turbo frequency or the maximum non-turbo frequency, the HWP performance level corresponding to it is already known and can be used directly without any computation. Accordingly, adjust the code to use the known HWP performance levels in the cases mentioned above. This also helps to avoid limiting CPU capacity artificially in some cases when the BIOS produces the HWP_CAP numbers using a different E-core-to-P-core performance scaling factor than expected by the kernel. Fixes: f5c8cf2a4992 ("cpufreq: intel_pstate: hybrid: Use known scaling factor for P-cores") Cc: 6.1+ # 6.1+ Tested-by: Srinivas Pandruvada Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 11d9f99757315a1c4b82170ccf57ef1969be65cd Author: Mario Limonciello Date: Wed Jan 17 08:29:42 2024 -0600 gpiolib: acpi: Ignore touchpad wakeup on GPD G1619-04 commit 805c74eac8cb306dc69b87b6b066ab4da77ceaf1 upstream. Spurious wakeups are reported on the GPD G1619-04 which can be absolved by programming the GPIO to ignore wakeups. Cc: stable@vger.kernel.org Reported-and-tested-by: George Melikov Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3073 Signed-off-by: Mario Limonciello Reviewed-by: Andy Shevchenko Signed-off-by: Bartosz Golaszewski Signed-off-by: Greg Kroah-Hartman commit ff31e8358ab2da5755a91c87392b178495b2d961 Author: Dave Chinner Date: Tue Jan 16 15:33:07 2024 +1100 xfs: read only mounts with fsopen mount API are busted commit d8d222e09dab84a17bb65dda4b94d01c565f5327 upstream. Recently xfs/513 started failing on my test machines testing "-o ro,norecovery" mount options. This was being emitted in dmesg: [ 9906.932724] XFS (pmem0): no-recovery mounts must be read-only. Turns out, readonly mounts with the fsopen()/fsconfig() mount API have been busted since day zero. It's only taken 5 years for debian unstable to start using this "new" mount API, and shortly after this I noticed xfs/513 had started to fail as per above. The syscall trace is: fsopen("xfs", FSOPEN_CLOEXEC) = 3 mount_setattr(-1, NULL, 0, NULL, 0) = -1 EINVAL (Invalid argument) ..... fsconfig(3, FSCONFIG_SET_STRING, "source", "/dev/pmem0", 0) = 0 fsconfig(3, FSCONFIG_SET_FLAG, "ro", NULL, 0) = 0 fsconfig(3, FSCONFIG_SET_FLAG, "norecovery", NULL, 0) = 0 fsconfig(3, FSCONFIG_CMD_CREATE, NULL, NULL, 0) = -1 EINVAL (Invalid argument) close(3) = 0 Showing that the actual mount instantiation (FSCONFIG_CMD_CREATE) is what threw out the error. During mount instantiation, we call xfs_fs_validate_params() which does: /* No recovery flag requires a read-only mount */ if (xfs_has_norecovery(mp) && !xfs_is_readonly(mp)) { xfs_warn(mp, "no-recovery mounts must be read-only."); return -EINVAL; } and xfs_is_readonly() checks internal mount flags for read only state. This state is set in xfs_init_fs_context() from the context superblock flag state: /* * Copy binary VFS mount flags we are interested in. */ if (fc->sb_flags & SB_RDONLY) set_bit(XFS_OPSTATE_READONLY, &mp->m_opstate); With the old mount API, all of the VFS specific superblock flags had already been parsed and set before xfs_init_fs_context() is called, so this all works fine. However, in the brave new fsopen/fsconfig world, xfs_init_fs_context() is called from fsopen() context, before any VFS superblock have been set or parsed. Hence if we use fsopen(), the internal XFS readonly state is *never set*. Hence anything that depends on xfs_is_readonly() actually returning true for read only mounts is broken if fsopen() has been used to mount the filesystem. Fix this by moving this internal state initialisation to xfs_fs_fill_super() before we attempt to validate the parameters that have been set prior to the FSCONFIG_CMD_CREATE call being made. Signed-off-by: Dave Chinner Fixes: 73e5fff98b64 ("xfs: switch to use the new mount-api") cc: stable@vger.kernel.org Reviewed-by: Christoph Hellwig Signed-off-by: Chandan Babu R Signed-off-by: Greg Kroah-Hartman commit d3887448486caeef9687fb5dfebd4ff91e0f25aa Author: Ma Jun Date: Fri Jan 12 13:33:24 2024 +0800 drm/amdgpu: Fix the null pointer when load rlc firmware commit bc03c02cc1991a066b23e69bbcc0f66e8f1f7453 upstream. If the RLC firmware is invalid because of wrong header size, the pointer to the rlc firmware is released in function amdgpu_ucode_request. There will be a null pointer error in subsequent use. So skip validation to fix it. Fixes: 3da9b71563cb ("drm/amd: Use `amdgpu_ucode_*` helpers for GFX10") Signed-off-by: Ma Jun Acked-by: Alex Deucher Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 29faedc76f549cf67f0e17b91682a7d59a2b7277 Author: Thomas Zimmermann Date: Tue Jan 23 13:09:26 2024 +0100 Revert "drivers/firmware: Move sysfb_init() from device_initcall to subsys_initcall_sync" commit d1b163aa0749706379055e40a52cf7a851abf9dc upstream. This reverts commit 60aebc9559492cea6a9625f514a8041717e3a2e4. Commit 60aebc9559492cea ("drivers/firmware: Move sysfb_init() from device_initcall to subsys_initcall_sync") messes up initialization order of the graphics drivers and leads to blank displays on some systems. So revert the commit. To make the display drivers fully independent from initialization order requires to track framebuffer memory by device and independently from the loaded drivers. The kernel currently lacks the infrastructure to do so. Reported-by: Jaak Ristioja Closes: https://lore.kernel.org/dri-devel/ZUnNi3q3yB3zZfTl@P70.localdomain/T/#t Reported-by: Huacai Chen Closes: https://lore.kernel.org/dri-devel/20231108024613.2898921-1-chenhuacai@loongson.cn/ Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10133 Signed-off-by: Thomas Zimmermann Cc: Javier Martinez Canillas Cc: Thorsten Leemhuis Cc: Jani Nikula Cc: stable@vger.kernel.org # v6.5+ Reviewed-by: Javier Martinez Canillas Acked-by: Jani Nikula Link: https://patchwork.freedesktop.org/patch/msgid/20240123120937.27736-1-tzimmermann@suse.de Signed-off-by: Greg Kroah-Hartman commit 12dc4217f16551d6dee9cbefc23fdb5659558cda Author: Cristian Marussi Date: Wed Dec 20 17:21:12 2023 +0000 firmware: arm_scmi: Check mailbox/SMT channel for consistency commit 437a310b22244d4e0b78665c3042e5d1c0f45306 upstream. On reception of a completion interrupt the shared memory area is accessed to retrieve the message header at first and then, if the message sequence number identifies a transaction which is still pending, the related payload is fetched too. When an SCMI command times out the channel ownership remains with the platform until eventually a late reply is received and, as a consequence, any further transmission attempt remains pending, waiting for the channel to be relinquished by the platform. Once that late reply is received the channel ownership is given back to the agent and any pending request is then allowed to proceed and overwrite the SMT area of the just delivered late reply; then the wait for the reply to the new request starts. It has been observed that the spurious IRQ related to the late reply can be wrongly associated with the freshly enqueued request: when that happens the SCMI stack in-flight lookup procedure is fooled by the fact that the message header now present in the SMT area is related to the new pending transaction, even though the real reply has still to arrive. This race-condition on the A2P channel can be detected by looking at the channel status bits: a genuine reply from the platform will have set the channel free bit before triggering the completion IRQ. Add a consistency check to validate such condition in the A2P ISR. Reported-by: Xinglong Yang Closes: https://lore.kernel.org/all/PUZPR06MB54981E6FA00D82BFDBB864FBF08DA@PUZPR06MB5498.apcprd06.prod.outlook.com/ Fixes: 5c8a47a5a91d ("firmware: arm_scmi: Make scmi core independent of the transport type") Cc: stable@vger.kernel.org # 5.15+ Signed-off-by: Cristian Marussi Tested-by: Xinglong Yang Link: https://lore.kernel.org/r/20231220172112.763539-1-cristian.marussi@arm.com Signed-off-by: Sudeep Holla Signed-off-by: Greg Kroah-Hartman commit 6993328a4cd62a24df254b587c0796a4a1eecc95 Author: Lin Ma Date: Sun Jan 21 15:35:06 2024 +0800 ksmbd: fix global oob in ksmbd_nl_policy commit ebeae8adf89d9a82359f6659b1663d09beec2faa upstream. Similar to a reported issue (check the commit b33fb5b801c6 ("net: qualcomm: rmnet: fix global oob in rmnet_policy"), my local fuzzer finds another global out-of-bounds read for policy ksmbd_nl_policy. See bug trace below: ================================================================== BUG: KASAN: global-out-of-bounds in validate_nla lib/nlattr.c:386 [inline] BUG: KASAN: global-out-of-bounds in __nla_validate_parse+0x24af/0x2750 lib/nlattr.c:600 Read of size 1 at addr ffffffff8f24b100 by task syz-executor.1/62810 CPU: 0 PID: 62810 Comm: syz-executor.1 Tainted: G N 6.1.0 #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x8b/0xb3 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:284 [inline] print_report+0x172/0x475 mm/kasan/report.c:395 kasan_report+0xbb/0x1c0 mm/kasan/report.c:495 validate_nla lib/nlattr.c:386 [inline] __nla_validate_parse+0x24af/0x2750 lib/nlattr.c:600 __nla_parse+0x3e/0x50 lib/nlattr.c:697 __nlmsg_parse include/net/netlink.h:748 [inline] genl_family_rcv_msg_attrs_parse.constprop.0+0x1b0/0x290 net/netlink/genetlink.c:565 genl_family_rcv_msg_doit+0xda/0x330 net/netlink/genetlink.c:734 genl_family_rcv_msg net/netlink/genetlink.c:833 [inline] genl_rcv_msg+0x441/0x780 net/netlink/genetlink.c:850 netlink_rcv_skb+0x14f/0x410 net/netlink/af_netlink.c:2540 genl_rcv+0x24/0x40 net/netlink/genetlink.c:861 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline] netlink_unicast+0x54e/0x800 net/netlink/af_netlink.c:1345 netlink_sendmsg+0x930/0xe50 net/netlink/af_netlink.c:1921 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg+0x154/0x190 net/socket.c:734 ____sys_sendmsg+0x6df/0x840 net/socket.c:2482 ___sys_sendmsg+0x110/0x1b0 net/socket.c:2536 __sys_sendmsg+0xf3/0x1c0 net/socket.c:2565 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7fdd66a8f359 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fdd65e00168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007fdd66bbcf80 RCX: 00007fdd66a8f359 RDX: 0000000000000000 RSI: 0000000020000500 RDI: 0000000000000003 RBP: 00007fdd66ada493 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffc84b81aff R14: 00007fdd65e00300 R15: 0000000000022000 The buggy address belongs to the variable: ksmbd_nl_policy+0x100/0xa80 The buggy address belongs to the physical page: page:0000000034f47940 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1ccc4b flags: 0x200000000001000(reserved|node=0|zone=2) raw: 0200000000001000 ffffea00073312c8 ffffea00073312c8 0000000000000000 raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffffffff8f24b000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffffffff8f24b080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffffffff8f24b100: f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9 00 00 07 f9 ^ ffffffff8f24b180: f9 f9 f9 f9 00 05 f9 f9 f9 f9 f9 f9 00 00 00 05 ffffffff8f24b200: f9 f9 f9 f9 00 00 03 f9 f9 f9 f9 f9 00 00 04 f9 ================================================================== To fix it, add a placeholder named __KSMBD_EVENT_MAX and let KSMBD_EVENT_MAX to be its original value - 1 according to what other netlink families do. Also change two sites that refer the KSMBD_EVENT_MAX to correct value. Cc: stable@vger.kernel.org Fixes: 0626e6641f6b ("cifsd: add server handler for central processing and tranport layers") Signed-off-by: Lin Ma Acked-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit d281ac9a987c553d93211b90fd4fe97d8eca32cd Author: Shin'ichiro Kawasaki Date: Mon Jan 8 15:20:58 2024 +0900 platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe commit 5913320eb0b3ec88158cfcb0fa5e996bf4ef681b upstream. p2sb_bar() unhides P2SB device to get resources from the device. It guards the operation by locking pci_rescan_remove_lock so that parallel rescans do not find the P2SB device. However, this lock causes deadlock when PCI bus rescan is triggered by /sys/bus/pci/rescan. The rescan locks pci_rescan_remove_lock and probes PCI devices. When PCI devices call p2sb_bar() during probe, it locks pci_rescan_remove_lock again. Hence the deadlock. To avoid the deadlock, do not lock pci_rescan_remove_lock in p2sb_bar(). Instead, do the lock at fs_initcall. Introduce p2sb_cache_resources() for fs_initcall which gets and caches the P2SB resources. At p2sb_bar(), refer the cache and return to the caller. Before operating the device at P2SB DEVFN for resource cache, check that its device class is PCI_CLASS_MEMORY_OTHER 0x0580 that PCH specifications define. This avoids unexpected operation to other devices at the same DEVFN. Link: https://lore.kernel.org/linux-pci/6xb24fjmptxxn5js2fjrrddjae6twex5bjaftwqsuawuqqqydx@7cl3uik5ef6j/ Fixes: 9745fb07474f ("platform/x86/intel: Add Primary to Sideband (P2SB) bridge support") Cc: stable@vger.kernel.org Suggested-by: Andy Shevchenko Signed-off-by: Shin'ichiro Kawasaki Link: https://lore.kernel.org/r/20240108062059.3583028-2-shinichiro.kawasaki@wdc.com Reviewed-by: Andy Shevchenko Reviewed-by: Ilpo Järvinen Tested-by Klara Modin Reviewed-by: Hans de Goede Signed-off-by: Hans de Goede Signed-off-by: Greg Kroah-Hartman commit 70c5aecd830d132c514d0cee9fcadcdc7263f119 Author: Nathan Chancellor Date: Thu Jan 4 15:59:03 2024 -0700 platform/x86: intel-uncore-freq: Fix types in sysfs callbacks commit 416de0246f35f43d871a57939671fe814f4455ee upstream. When booting a kernel with CONFIG_CFI_CLANG, there is a CFI failure when accessing any of the values under /sys/devices/system/cpu/intel_uncore_frequency/package_00_die_00: $ cat /sys/devices/system/cpu/intel_uncore_frequency/package_00_die_00/max_freq_khz fish: Job 1, 'cat /sys/devices/system/cpu/int…' terminated by signal SIGSEGV (Address boundary error) $ sudo dmesg &| grep 'CFI failure' [ 170.953925] CFI failure at kobj_attr_show+0x19/0x30 (target: show_max_freq_khz+0x0/0xc0 [intel_uncore_frequency_common]; expected type: 0xd34078c5 The sysfs callback functions such as show_domain_id() are written as if they are going to be called by dev_attr_show() but as the above message shows, they are instead called by kobj_attr_show(). kCFI checks that the destination of an indirect jump has the exact same type as the prototype of the function pointer it is called through and fails when they do not. These callbacks are called through kobj_attr_show() because uncore_root_kobj was initialized with kobject_create_and_add(), which means uncore_root_kobj has a ->sysfs_ops of kobj_sysfs_ops from kobject_create(), which uses kobj_attr_show() as its ->show() value. The only reason there has not been a more noticeable problem until this point is that 'struct kobj_attribute' and 'struct device_attribute' have the same layout, so getting the callback from container_of() works the same with either value. Change all the callbacks and their uses to be compatible with kobj_attr_show() and kobj_attr_store(), which resolves the kCFI failure and allows the sysfs files to work properly. Closes: https://github.com/ClangBuiltLinux/linux/issues/1974 Fixes: ae7b2ce57851 ("platform/x86/intel/uncore-freq: Use sysfs API to create attributes") Cc: stable@vger.kernel.org Signed-off-by: Nathan Chancellor Reviewed-by: Sami Tolvanen Acked-by: Srinivas Pandruvada Link: https://lore.kernel.org/r/20240104-intel-uncore-freq-kcfi-fix-v1-1-bf1e8939af40@kernel.org Signed-off-by: Hans de Goede Signed-off-by: Greg Kroah-Hartman commit f05a497e7bc8851eeeb3a58da180ba469efebb05 Author: Florian Westphal Date: Sat Jan 20 22:50:04 2024 +0100 netfilter: nf_tables: reject QUEUE/DROP verdict parameters commit f342de4e2f33e0e39165d8639387aa6c19dff660 upstream. This reverts commit e0abdadcc6e1. core.c:nf_hook_slow assumes that the upper 16 bits of NF_DROP verdicts contain a valid errno, i.e. -EPERM, -EHOSTUNREACH or similar, or 0. Due to the reverted commit, its possible to provide a positive value, e.g. NF_ACCEPT (1), which results in use-after-free. Its not clear to me why this commit was made. NF_QUEUE is not used by nftables; "queue" rules in nftables will result in use of "nft_queue" expression. If we later need to allow specifiying errno values from userspace (do not know why), this has to call NF_DROP_GETERR and check that "err <= 0" holds true. Fixes: e0abdadcc6e1 ("netfilter: nf_tables: accept QUEUE/DROP verdict parameters") Cc: stable@vger.kernel.org Reported-by: Notselwyn Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit 36a0a80f32209238469deb481967d777a3d539ee Author: Pablo Neira Ayuso Date: Thu Jan 18 10:56:26 2024 +0100 netfilter: nft_chain_filter: handle NETDEV_UNREGISTER for inet/ingress basechain commit 01acb2e8666a6529697141a6017edbf206921913 upstream. Remove netdevice from inet/ingress basechain in case NETDEV_UNREGISTER event is reported, otherwise a stale reference to netdevice remains in the hook list. Fixes: 60a3815da702 ("netfilter: add inet ingress support") Cc: stable@vger.kernel.org Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit 3c798a636ba0f3563b893d6168bf0bade90c1d42 Author: Michael Kelley Date: Mon Jan 22 08:20:28 2024 -0800 hv_netvsc: Calculate correct ring size when PAGE_SIZE is not 4 Kbytes commit 6941f67ad37d5465b75b9ffc498fcf6897a3c00e upstream. Current code in netvsc_drv_init() incorrectly assumes that PAGE_SIZE is 4 Kbytes, which is wrong on ARM64 with 16K or 64K page size. As a result, the default VMBus ring buffer size on ARM64 with 64K page size is 8 Mbytes instead of the expected 512 Kbytes. While this doesn't break anything, a typical VM with 8 vCPUs and 8 netvsc channels wastes 120 Mbytes (8 channels * 2 ring buffers/channel * 7.5 Mbytes/ring buffer). Unfortunately, the module parameter specifying the ring buffer size is in units of 4 Kbyte pages. Ideally, it should be in units that are independent of PAGE_SIZE, but backwards compatibility prevents changing that now. Fix this by having netvsc_drv_init() hardcode 4096 instead of using PAGE_SIZE when calculating the ring buffer size in bytes. Also use the VMBUS_RING_SIZE macro to ensure proper alignment when running with page size larger than 4K. Cc: # 5.15.x Fixes: 7aff79e297ee ("Drivers: hv: Enable Hyper-V code to be built on ARM64") Signed-off-by: Michael Kelley Link: https://lore.kernel.org/r/20240122162028.348885-1-mhklinux@outlook.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit eedc92622b102943c4c1e483c6e07b936106e936 Author: Amir Goldstein Date: Sat Jan 20 12:18:39 2024 +0200 ovl: mark xwhiteouts directory with overlay.opaque='x' commit 420332b94119cdc7db4477cc88484691cb92ae71 upstream. An opaque directory cannot have xwhiteouts, so instead of marking an xwhiteouts directory with a new xattr, overload overlay.opaque xattr for marking both opaque dir ('y') and xwhiteouts dir ('x'). This is more efficient as the overlay.opaque xattr is checked during lookup of directory anyway. This also prevents unnecessary checking the xattr when reading a directory without xwhiteouts, i.e. most of the time. Note that the xwhiteouts marker is not checked on the upper layer and on the last layer in lowerstack, where xwhiteouts are not expected. Fixes: bc8df7a3dc03 ("ovl: Add an alternative type of whiteout") Cc: # v6.7 Reviewed-by: Alexander Larsson Tested-by: Alexander Larsson Signed-off-by: Amir Goldstein Signed-off-by: Greg Kroah-Hartman commit 8f5b860de87039b007e84a28a5eefc888154e098 Author: NeilBrown Date: Mon Jan 22 14:58:16 2024 +1100 nfsd: fix RELEASE_LOCKOWNER commit edcf9725150e42beeca42d085149f4c88fa97afd upstream. The test on so_count in nfsd4_release_lockowner() is nonsense and harmful. Revert to using check_for_locks(), changing that to not sleep. First: harmful. As is documented in the kdoc comment for nfsd4_release_lockowner(), the test on so_count can transiently return a false positive resulting in a return of NFS4ERR_LOCKS_HELD when in fact no locks are held. This is clearly a protocol violation and with the Linux NFS client it can cause incorrect behaviour. If RELEASE_LOCKOWNER is sent while some other thread is still processing a LOCK request which failed because, at the time that request was received, the given owner held a conflicting lock, then the nfsd thread processing that LOCK request can hold a reference (conflock) to the lock owner that causes nfsd4_release_lockowner() to return an incorrect error. The Linux NFS client ignores that NFS4ERR_LOCKS_HELD error because it never sends NFS4_RELEASE_LOCKOWNER without first releasing any locks, so it knows that the error is impossible. It assumes the lock owner was in fact released so it feels free to use the same lock owner identifier in some later locking request. When it does reuse a lock owner identifier for which a previous RELEASE failed, it will naturally use a lock_seqid of zero. However the server, which didn't release the lock owner, will expect a larger lock_seqid and so will respond with NFS4ERR_BAD_SEQID. So clearly it is harmful to allow a false positive, which testing so_count allows. The test is nonsense because ... well... it doesn't mean anything. so_count is the sum of three different counts. 1/ the set of states listed on so_stateids 2/ the set of active vfs locks owned by any of those states 3/ various transient counts such as for conflicting locks. When it is tested against '2' it is clear that one of these is the transient reference obtained by find_lockowner_str_locked(). It is not clear what the other one is expected to be. In practice, the count is often 2 because there is precisely one state on so_stateids. If there were more, this would fail. In my testing I see two circumstances when RELEASE_LOCKOWNER is called. In one case, CLOSE is called before RELEASE_LOCKOWNER. That results in all the lock states being removed, and so the lockowner being discarded (it is removed when there are no more references which usually happens when the lock state is discarded). When nfsd4_release_lockowner() finds that the lock owner doesn't exist, it returns success. The other case shows an so_count of '2' and precisely one state listed in so_stateid. It appears that the Linux client uses a separate lock owner for each file resulting in one lock state per lock owner, so this test on '2' is safe. For another client it might not be safe. So this patch changes check_for_locks() to use the (newish) find_any_file_locked() so that it doesn't take a reference on the nfs4_file and so never calls nfsd_file_put(), and so never sleeps. With this check is it safe to restore the use of check_for_locks() rather than testing so_count against the mysterious '2'. Fixes: ce3c4ad7f4ce ("NFSD: Fix possible sleep during nfsd4_release_lockowner()") Signed-off-by: NeilBrown Reviewed-by: Jeff Layton Cc: stable@vger.kernel.org # v6.2+ Signed-off-by: Chuck Lever Signed-off-by: Greg Kroah-Hartman commit f32a81999d0b8e5ce60afb5f6a3dd7241c17dd67 Author: Emmanuel Grumbach Date: Thu Jan 11 15:07:25 2024 +0200 wifi: iwlwifi: fix a memory corruption commit cf4a0d840ecc72fcf16198d5e9c505ab7d5a5e4d upstream. iwl_fw_ini_trigger_tlv::data is a pointer to a __le32, which means that if we copy to iwl_fw_ini_trigger_tlv::data + offset while offset is in bytes, we'll write past the buffer. Cc: stable@vger.kernel.org Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218233 Fixes: cf29c5b66b9f ("iwlwifi: dbg_ini: implement time point handling") Signed-off-by: Emmanuel Grumbach Signed-off-by: Miri Korenblit Link: https://msgid.link/20240111150610.2d2b8b870194.I14ed76505a5cf87304e0c9cc05cc0ae85ed3bf91@changeid Signed-off-by: Johannes Berg Signed-off-by: Greg Kroah-Hartman commit 69a6f8ee9da574166a907997c8524dd8b000aafb Author: Bernd Edlinger Date: Mon Jan 22 19:34:21 2024 +0100 exec: Fix error handling in begin_new_exec() commit 84c39ec57d409e803a9bb6e4e85daf1243e0e80b upstream. If get_unused_fd_flags() fails, the error handling is incomplete because bprm->cred is already set to NULL, and therefore free_bprm will not unlock the cred_guard_mutex. Note there are two error conditions which end up here, one before and one after bprm->cred is cleared. Fixes: b8a61c9e7b4a ("exec: Generic execfd support") Signed-off-by: Bernd Edlinger Acked-by: Eric W. Biederman Link: https://lore.kernel.org/r/AS8P193MB128517ADB5EFF29E04389EDAE4752@AS8P193MB1285.EURP193.PROD.OUTLOOK.COM Cc: stable@vger.kernel.org Signed-off-by: Kees Cook Signed-off-by: Greg Kroah-Hartman commit aaa054e78d0a7d1b559599a70d9c0c7573eab967 Author: Ilya Dryomov Date: Wed Jan 17 18:59:44 2024 +0100 rbd: don't move requests to the running list on errors commit ded080c86b3f99683774af0441a58fc2e3d60cae upstream. The running list is supposed to contain requests that are pinning the exclusive lock, i.e. those that must be flushed before exclusive lock is released. When wake_lock_waiters() is called to handle an error, requests on the acquiring list are failed with that error and no flushing takes place. Briefly moving them to the running list is not only pointless but also harmful: if exclusive lock gets acquired before all of their state machines are scheduled and go through rbd_lock_del_request(), we trigger rbd_assert(list_empty(&rbd_dev->running_list)); in rbd_try_acquire_lock(). Cc: stable@vger.kernel.org Fixes: 637cd060537d ("rbd: new exclusive lock wait/wake code") Signed-off-by: Ilya Dryomov Reviewed-by: Dongsheng Yang Signed-off-by: Greg Kroah-Hartman commit d8680b722f0ff6d7a01ddacc1844e0d52354d6ff Author: Omar Sandoval Date: Thu Jan 4 11:48:46 2024 -0800 btrfs: don't abort filesystem when attempting to snapshot deleted subvolume commit 7081929ab2572920e94d70be3d332e5c9f97095a upstream. If the source file descriptor to the snapshot ioctl refers to a deleted subvolume, we get the following abort: BTRFS: Transaction aborted (error -2) WARNING: CPU: 0 PID: 833 at fs/btrfs/transaction.c:1875 create_pending_snapshot+0x1040/0x1190 [btrfs] Modules linked in: pata_acpi btrfs ata_piix libata scsi_mod virtio_net blake2b_generic xor net_failover virtio_rng failover scsi_common rng_core raid6_pq libcrc32c CPU: 0 PID: 833 Comm: t_snapshot_dele Not tainted 6.7.0-rc6 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014 RIP: 0010:create_pending_snapshot+0x1040/0x1190 [btrfs] RSP: 0018:ffffa09c01337af8 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff9982053e7c78 RCX: 0000000000000027 RDX: ffff99827dc20848 RSI: 0000000000000001 RDI: ffff99827dc20840 RBP: ffffa09c01337c00 R08: 0000000000000000 R09: ffffa09c01337998 R10: 0000000000000003 R11: ffffffffb96da248 R12: fffffffffffffffe R13: ffff99820535bb28 R14: ffff99820b7bd000 R15: ffff99820381ea80 FS: 00007fe20aadabc0(0000) GS:ffff99827dc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000559a120b502f CR3: 00000000055b6000 CR4: 00000000000006f0 Call Trace: ? create_pending_snapshot+0x1040/0x1190 [btrfs] ? __warn+0x81/0x130 ? create_pending_snapshot+0x1040/0x1190 [btrfs] ? report_bug+0x171/0x1a0 ? handle_bug+0x3a/0x70 ? exc_invalid_op+0x17/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? create_pending_snapshot+0x1040/0x1190 [btrfs] ? create_pending_snapshot+0x1040/0x1190 [btrfs] create_pending_snapshots+0x92/0xc0 [btrfs] btrfs_commit_transaction+0x66b/0xf40 [btrfs] btrfs_mksubvol+0x301/0x4d0 [btrfs] btrfs_mksnapshot+0x80/0xb0 [btrfs] __btrfs_ioctl_snap_create+0x1c2/0x1d0 [btrfs] btrfs_ioctl_snap_create_v2+0xc4/0x150 [btrfs] btrfs_ioctl+0x8a6/0x2650 [btrfs] ? kmem_cache_free+0x22/0x340 ? do_sys_openat2+0x97/0xe0 __x64_sys_ioctl+0x97/0xd0 do_syscall_64+0x46/0xf0 entry_SYSCALL_64_after_hwframe+0x6e/0x76 RIP: 0033:0x7fe20abe83af RSP: 002b:00007ffe6eff1360 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007fe20abe83af RDX: 00007ffe6eff23c0 RSI: 0000000050009417 RDI: 0000000000000003 RBP: 0000000000000003 R08: 0000000000000000 R09: 00007fe20ad16cd0 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffe6eff13c0 R14: 00007fe20ad45000 R15: 0000559a120b6d58 ---[ end trace 0000000000000000 ]--- BTRFS: error (device vdc: state A) in create_pending_snapshot:1875: errno=-2 No such entry BTRFS info (device vdc: state EA): forced readonly BTRFS warning (device vdc: state EA): Skipping commit of aborted transaction. BTRFS: error (device vdc: state EA) in cleanup_transaction:2055: errno=-2 No such entry This happens because create_pending_snapshot() initializes the new root item as a copy of the source root item. This includes the refs field, which is 0 for a deleted subvolume. The call to btrfs_insert_root() therefore inserts a root with refs == 0. btrfs_get_new_fs_root() then finds the root and returns -ENOENT if refs == 0, which causes create_pending_snapshot() to abort. Fix it by checking the source root's refs before attempting the snapshot, but after locking subvol_sem to avoid racing with deletion. CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Sweet Tea Dorminy Reviewed-by: Anand Jain Signed-off-by: Omar Sandoval Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit ed1e9ca4e56af02d0c4c2a1f05460ffe5450c564 Author: Qu Wenruo Date: Wed Jan 10 08:58:26 2024 +1030 btrfs: defrag: reject unknown flags of btrfs_ioctl_defrag_range_args commit 173431b274a9a54fc10b273b46e67f46bcf62d2e upstream. Add extra sanity check for btrfs_ioctl_defrag_range_args::flags. This is not really to enhance fuzzing tests, but as a preparation for future expansion on btrfs_ioctl_defrag_range_args. In the future we're going to add new members, allowing more fine tuning for btrfs defrag. Without the -ENONOTSUPP error, there would be no way to detect if the kernel supports those new defrag features. CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Filipe Manana Signed-off-by: Qu Wenruo Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 877409a6dd350d6d8656a79b29b6cf350aeaa186 Author: David Sterba Date: Mon Jan 15 20:30:26 2024 +0100 btrfs: don't warn if discard range is not aligned to sector commit a208b3f132b48e1f94f620024e66fea635925877 upstream. There's a warning in btrfs_issue_discard() when the range is not aligned to 512 bytes, originally added in 4d89d377bbb0 ("btrfs: btrfs_issue_discard ensure offset/length are aligned to sector boundaries"). We can't do sub-sector writes anyway so the adjustment is the only thing that we can do and the warning is unnecessary. CC: stable@vger.kernel.org # 4.19+ Reported-by: syzbot+4a4f1eba14eb5c3417d1@syzkaller.appspotmail.com Reviewed-by: Johannes Thumshirn Reviewed-by: Anand Jain Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 299d3e2d5aa5350429cd0a5d0f1339d29186942c Author: Chung-Chiang Cheng Date: Fri Jan 12 15:41:05 2024 +0800 btrfs: tree-checker: fix inline ref size in error messages commit f398e70dd69e6ceea71463a5380e6118f219197e upstream. The error message should accurately reflect the size rather than the type. Fixes: f82d1c7ca8ae ("btrfs: tree-checker: Add EXTENT_ITEM and METADATA_ITEM check") CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Filipe Manana Reviewed-by: Qu Wenruo Signed-off-by: Chung-Chiang Cheng Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 12885229ea0709a50891a396c5d7e66e5b7b85af Author: Fedor Pchelkin Date: Wed Jan 3 13:31:27 2024 +0300 btrfs: ref-verify: free ref cache before clearing mount opt commit f03e274a8b29d1d1c1bbd7f764766cb5ca537ab7 upstream. As clearing REF_VERIFY mount option indicates there were some errors in a ref-verify process, a ref cache is not relevant anymore and should be freed. btrfs_free_ref_cache() requires REF_VERIFY option being set so call it just before clearing the mount option. Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Reported-by: syzbot+be14ed7728594dc8bd42@syzkaller.appspotmail.com Fixes: fd708b81d972 ("Btrfs: add a extent ref verify tool") CC: stable@vger.kernel.org # 5.4+ Closes: https://lore.kernel.org/lkml/000000000000e5a65c05ee832054@google.com/ Reported-by: syzbot+c563a3c79927971f950f@syzkaller.appspotmail.com Closes: https://lore.kernel.org/lkml/0000000000007fe09705fdc6086c@google.com/ Reviewed-by: Anand Jain Signed-off-by: Fedor Pchelkin Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 337c5da28f1f6f18fabf6690d1185111a6094f29 Author: Omar Sandoval Date: Thu Jan 4 11:48:47 2024 -0800 btrfs: avoid copying BTRFS_ROOT_SUBVOL_DEAD flag to snapshot of subvolume being deleted commit 3324d0547861b16cf436d54abba7052e0c8aa9de upstream. Sweet Tea spotted a race between subvolume deletion and snapshotting that can result in the root item for the snapshot having the BTRFS_ROOT_SUBVOL_DEAD flag set. The race is: Thread 1 | Thread 2 ----------------------------------------------|---------- btrfs_delete_subvolume | btrfs_set_root_flags(BTRFS_ROOT_SUBVOL_DEAD)| |btrfs_mksubvol | down_read(subvol_sem) | create_snapshot | ... | create_pending_snapshot | copy root item from source down_write(subvol_sem) | This flag is only checked in send and swap activate, which this would cause to fail mysteriously. create_snapshot() now checks the root refs to reject a deleted subvolume, so we can fix this by locking subvol_sem earlier so that the BTRFS_ROOT_SUBVOL_DEAD flag and the root refs are updated atomically. CC: stable@vger.kernel.org # 4.14+ Reported-by: Sweet Tea Dorminy Reviewed-by: Sweet Tea Dorminy Reviewed-by: Anand Jain Signed-off-by: Omar Sandoval Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 1908e9d01e5395adff68d9d308a0fb15337e6272 Author: Naohiro Aota Date: Fri Dec 22 13:56:34 2023 +0900 btrfs: zoned: fix lock ordering in btrfs_zone_activate() commit b18f3b60b35a8c01c9a2a0f0d6424c6d73971dc3 upstream. The btrfs CI reported a lockdep warning as follows by running generic generic/129. WARNING: possible circular locking dependency detected 6.7.0-rc5+ #1 Not tainted ------------------------------------------------------ kworker/u5:5/793427 is trying to acquire lock: ffff88813256d028 (&cache->lock){+.+.}-{2:2}, at: btrfs_zone_finish_one_bg+0x5e/0x130 but task is already holding lock: ffff88810a23a318 (&fs_info->zone_active_bgs_lock){+.+.}-{2:2}, at: btrfs_zone_finish_one_bg+0x34/0x130 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&fs_info->zone_active_bgs_lock){+.+.}-{2:2}: ... -> #0 (&cache->lock){+.+.}-{2:2}: ... This is because we take fs_info->zone_active_bgs_lock after a block_group's lock in btrfs_zone_activate() while doing the opposite in other places. Fix the issue by expanding the fs_info->zone_active_bgs_lock's critical section and taking it before a block_group's lock. Fixes: a7e1ac7bdc5a ("btrfs: zoned: reserve zones for an active metadata/system block group") CC: stable@vger.kernel.org # 6.6 Signed-off-by: Naohiro Aota Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 246f72777e887f69c052129671c2e1dd009fea50 Author: Gerhard Engleder Date: Tue Jan 23 21:09:18 2024 +0100 tsnep: Fix XDP_RING_NEED_WAKEUP for empty fill ring [ Upstream commit 9a91c05f4bd6f6bdd6b8f90445e0da92e3ac956c ] The fill ring of the XDP socket may contain not enough buffers to completey fill the RX queue during socket creation. In this case the flag XDP_RING_NEED_WAKEUP is not set as this flag is only set if the RX queue is not completely filled during polling. Set XDP_RING_NEED_WAKEUP flag also if RX queue is not completely filled during XDP socket creation. Fixes: 3fc2333933fd ("tsnep: Add XDP socket zero-copy RX support") Signed-off-by: Gerhard Engleder Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 9db354014dce82292586d2909b8c0e7995f19227 Author: Gerhard Engleder Date: Tue Jan 23 21:09:17 2024 +0100 tsnep: Remove FCS for XDP data path [ Upstream commit 50bad6f797d4d501c5ef416a6f92e1912ab5aa8b ] The RX data buffer includes the FCS. The FCS is already stripped for the normal data path. But for the XDP data path the FCS is included and acts like additional/useless data. Remove the FCS from the RX data buffer also for XDP. Fixes: 65b28c810035 ("tsnep: Add XDP RX support") Fixes: 3fc2333933fd ("tsnep: Add XDP socket zero-copy RX support") Signed-off-by: Gerhard Engleder Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 9216b473c693417674d51faa67ce4aecdb9e5653 Author: Shenwei Wang Date: Tue Jan 23 10:51:41 2024 -0600 net: fec: fix the unhandled context fault from smmu [ Upstream commit 5e344807735023cd3a67c37a1852b849caa42620 ] When repeatedly changing the interface link speed using the command below: ethtool -s eth0 speed 100 duplex full ethtool -s eth0 speed 1000 duplex full The following errors may sometimes be reported by the ARM SMMU driver: [ 5395.035364] fec 5b040000.ethernet eth0: Link is Down [ 5395.039255] arm-smmu 51400000.iommu: Unhandled context fault: fsr=0x402, iova=0x00000000, fsynr=0x100001, cbfrsynra=0x852, cb=2 [ 5398.108460] fec 5b040000.ethernet eth0: Link is Up - 100Mbps/Full - flow control off It is identified that the FEC driver does not properly stop the TX queue during the link speed transitions, and this results in the invalid virtual I/O address translations from the SMMU and causes the context faults. Fixes: dbc64a8ea231 ("net: fec: move calls to quiesce/resume packet processing out of fec_restart()") Signed-off-by: Shenwei Wang Link: https://lore.kernel.org/r/20240123165141.2008104-1-shenwei.wang@nxp.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 70253e69f1dc26d3e5dec6e7325c3ec1ce7870a9 Author: Hangbin Liu Date: Tue Jan 23 15:59:17 2024 +0800 selftests: bonding: do not test arp/ns target with mode balance-alb/tlb [ Upstream commit a2933a8759a62269754e54733d993b19de870e84 ] The prio_arp/ns tests hard code the mode to active-backup. At the same time, The balance-alb/tlb modes do not support arp/ns target. So remove the prio_arp/ns tests from the loop and only test active-backup mode. Fixes: 481b56e0391e ("selftests: bonding: re-format bond option tests") Reported-by: Jay Vosburgh Closes: https://lore.kernel.org/netdev/17415.1705965957@famine/ Signed-off-by: Hangbin Liu Acked-by: Jay Vosburgh Link: https://lore.kernel.org/r/20240123075917.1576360-1-liuhangbin@gmail.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 4f060f6bccbf49afed27e3d13bed821053811e15 Author: Zhipeng Lu Date: Tue Jan 23 01:24:42 2024 +0800 fjes: fix memleaks in fjes_hw_setup [ Upstream commit f6cc4b6a3ae53df425771000e9c9540cce9b7bb1 ] In fjes_hw_setup, it allocates several memory and delay the deallocation to the fjes_hw_exit in fjes_probe through the following call chain: fjes_probe |-> fjes_hw_init |-> fjes_hw_setup |-> fjes_hw_exit However, when fjes_hw_setup fails, fjes_hw_exit won't be called and thus all the resources allocated in fjes_hw_setup will be leaked. In this patch, we free those resources in fjes_hw_setup and prevents such leaks. Fixes: 2fcbca687702 ("fjes: platform_driver's .probe and .remove routine") Signed-off-by: Zhipeng Lu Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20240122172445.3841883-1-alexious@zju.edu.cn Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 3df7f34a7d4237a9a18eae04510e6b149f2b60fb Author: Maciej Fijalkowski Date: Wed Jan 24 20:16:02 2024 +0100 i40e: update xdp_rxq_info::frag_size for ZC enabled Rx queue [ Upstream commit 0cbb08707c932b3f004bc1a8ec6200ef572c1f5f ] Now that i40e driver correctly sets up frag_size in xdp_rxq_info, let us make it work for ZC multi-buffer as well. i40e_ring::rx_buf_len for ZC is being set via xsk_pool_get_rx_frame_size() and this needs to be propagated up to xdp_rxq_info. Fixes: 1c9ba9c14658 ("i40e: xsk: add RX multi-buffer support") Acked-by: Magnus Karlsson Signed-off-by: Maciej Fijalkowski Link: https://lore.kernel.org/r/20240124191602.566724-12-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov Signed-off-by: Sasha Levin commit 7113fa239a4ef03bce5b8d4ec9b24b740ca26743 Author: Maciej Fijalkowski Date: Wed Jan 24 20:16:01 2024 +0100 i40e: set xdp_rxq_info::frag_size [ Upstream commit a045d2f2d03d23e7db6772dd83e0ba2705dfad93 ] i40e support XDP multi-buffer so it is supposed to use __xdp_rxq_info_reg() instead of xdp_rxq_info_reg() and set the frag_size. It can not be simply converted at existing callsite because rx_buf_len could be un-initialized, so let us register xdp_rxq_info within i40e_configure_rx_ring(), which happen to be called with already initialized rx_buf_len value. Commit 5180ff1364bc ("i40e: use int for i40e_status") converted 'err' to int, so two variables to deal with return codes are not needed within i40e_configure_rx_ring(). Remove 'ret' and use 'err' to handle status from xdp_rxq_info registration. Fixes: e213ced19bef ("i40e: add support for XDP multi-buffer Rx") Signed-off-by: Maciej Fijalkowski Link: https://lore.kernel.org/r/20240124191602.566724-11-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov Signed-off-by: Sasha Levin commit 440fee8feb34a4e23d01cc17fd65a989917a8641 Author: Maciej Fijalkowski Date: Wed Jan 24 20:16:00 2024 +0100 xdp: reflect tail increase for MEM_TYPE_XSK_BUFF_POOL [ Upstream commit fbadd83a612c3b7aad2987893faca6bd24aaebb3 ] XSK ZC Rx path calculates the size of data that will be posted to XSK Rx queue via subtracting xdp_buff::data_end from xdp_buff::data. In bpf_xdp_frags_increase_tail(), when underlying memory type of xdp_rxq_info is MEM_TYPE_XSK_BUFF_POOL, add offset to data_end in tail fragment, so that later on user space will be able to take into account the amount of bytes added by XDP program. Fixes: 24ea50127ecf ("xsk: support mbuf on ZC RX") Signed-off-by: Maciej Fijalkowski Link: https://lore.kernel.org/r/20240124191602.566724-10-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov Signed-off-by: Sasha Levin commit 8bd491e5dcb55269856db759a14ba796cf0b05c9 Author: Maciej Fijalkowski Date: Wed Jan 24 20:15:59 2024 +0100 ice: update xdp_rxq_info::frag_size for ZC enabled Rx queue [ Upstream commit 3de38c87174225487fc93befeea7d380db80aef6 ] Now that ice driver correctly sets up frag_size in xdp_rxq_info, let us make it work for ZC multi-buffer as well. ice_rx_ring::rx_buf_len for ZC is being set via xsk_pool_get_rx_frame_size() and this needs to be propagated up to xdp_rxq_info. Use a bigger hammer and instead of unregistering only xdp_rxq_info's memory model, unregister it altogether and register it again and have xdp_rxq_info with correct frag_size value. Fixes: 1bbc04de607b ("ice: xsk: add RX multi-buffer support") Signed-off-by: Maciej Fijalkowski Link: https://lore.kernel.org/r/20240124191602.566724-9-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov Signed-off-by: Sasha Levin commit 0528ad701722931eea26272527c723120744a252 Author: Maciej Fijalkowski Date: Wed Jan 24 20:15:58 2024 +0100 intel: xsk: initialize skb_frag_t::bv_offset in ZC drivers [ Upstream commit 290779905d09d5fdf6caa4f58ddefc3f4db0c0a9 ] Ice and i40e ZC drivers currently set offset of a frag within skb_shared_info to 0, which is incorrect. xdp_buffs that come from xsk_buff_pool always have 256 bytes of a headroom, so they need to be taken into account to retrieve xdp_buff::data via skb_frag_address(). Otherwise, bpf_xdp_frags_increase_tail() would be starting its job from xdp_buff::data_hard_start which would result in overwriting existing payload. Fixes: 1c9ba9c14658 ("i40e: xsk: add RX multi-buffer support") Fixes: 1bbc04de607b ("ice: xsk: add RX multi-buffer support") Acked-by: Magnus Karlsson Signed-off-by: Maciej Fijalkowski Link: https://lore.kernel.org/r/20240124191602.566724-8-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov Signed-off-by: Sasha Levin commit 663e87702f446b18956b35b6df44ef6b3423cb3f Author: Maciej Fijalkowski Date: Wed Jan 24 20:15:57 2024 +0100 ice: remove redundant xdp_rxq_info registration [ Upstream commit 2ee788c06493d02ee85855414cca39825e768aaf ] xdp_rxq_info struct can be registered by drivers via two functions - xdp_rxq_info_reg() and __xdp_rxq_info_reg(). The latter one allows drivers that support XDP multi-buffer to set up xdp_rxq_info::frag_size which in turn will make it possible to grow the packet via bpf_xdp_adjust_tail() BPF helper. Currently, ice registers xdp_rxq_info in two spots: 1) ice_setup_rx_ring() // via xdp_rxq_info_reg(), BUG 2) ice_vsi_cfg_rxq() // via __xdp_rxq_info_reg(), OK Cited commit under fixes tag took care of setting up frag_size and updated registration scheme in 2) but it did not help as 1) is called before 2) and as shown above it uses old registration function. This means that 2) sees that xdp_rxq_info is already registered and never calls __xdp_rxq_info_reg() which leaves us with xdp_rxq_info::frag_size being set to 0. To fix this misbehavior, simply remove xdp_rxq_info_reg() call from ice_setup_rx_ring(). Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side") Acked-by: Magnus Karlsson Signed-off-by: Maciej Fijalkowski Link: https://lore.kernel.org/r/20240124191602.566724-7-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov Signed-off-by: Sasha Levin commit 3431ec41d05604c1f2ba31f3e8a3b6b3ed3fb082 Author: Tirthendu Sarkar Date: Wed Jan 24 20:15:56 2024 +0100 i40e: handle multi-buffer packets that are shrunk by xdp prog [ Upstream commit 83014323c642b8faa2d64a5f303b41c019322478 ] XDP programs can shrink packets by calling the bpf_xdp_adjust_tail() helper function. For multi-buffer packets this may lead to reduction of frag count stored in skb_shared_info area of the xdp_buff struct. This results in issues with the current handling of XDP_PASS and XDP_DROP cases. For XDP_PASS, currently skb is being built using frag count of xdp_buffer before it was processed by XDP prog and thus will result in an inconsistent skb when frag count gets reduced by XDP prog. To fix this, get correct frag count while building the skb instead of using pre-obtained frag count. For XDP_DROP, current page recycling logic will not reuse the page but instead will adjust the pagecnt_bias so that the page can be freed. This again results in inconsistent behavior as the page refcnt has already been changed by the helper while freeing the frag(s) as part of shrinking the packet. To fix this, only adjust pagecnt_bias for buffers that are stillpart of the packet post-xdp prog run. Fixes: e213ced19bef ("i40e: add support for XDP multi-buffer Rx") Reported-by: Maciej Fijalkowski Signed-off-by: Tirthendu Sarkar Link: https://lore.kernel.org/r/20240124191602.566724-6-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov Signed-off-by: Sasha Levin commit c75708b288ae8713a0e1bdae4523814c16be63d3 Author: Maciej Fijalkowski Date: Wed Jan 24 20:15:55 2024 +0100 ice: work on pre-XDP prog frag count [ Upstream commit ad2047cf5d9313200e308612aed516548873d124 ] Fix an OOM panic in XDP_DRV mode when a XDP program shrinks a multi-buffer packet by 4k bytes and then redirects it to an AF_XDP socket. Since support for handling multi-buffer frames was added to XDP, usage of bpf_xdp_adjust_tail() helper within XDP program can free the page that given fragment occupies and in turn decrease the fragment count within skb_shared_info that is embedded in xdp_buff struct. In current ice driver codebase, it can become problematic when page recycling logic decides not to reuse the page. In such case, __page_frag_cache_drain() is used with ice_rx_buf::pagecnt_bias that was not adjusted after refcount of page was changed by XDP prog which in turn does not drain the refcount to 0 and page is never freed. To address this, let us store the count of frags before the XDP program was executed on Rx ring struct. This will be used to compare with current frag count from skb_shared_info embedded in xdp_buff. A smaller value in the latter indicates that XDP prog freed frag(s). Then, for given delta decrement pagecnt_bias for XDP_DROP verdict. While at it, let us also handle the EOP frag within ice_set_rx_bufs_act() to make our life easier, so all of the adjustments needed to be applied against freed frags are performed in the single place. Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side") Acked-by: Magnus Karlsson Signed-off-by: Maciej Fijalkowski Link: https://lore.kernel.org/r/20240124191602.566724-5-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov Signed-off-by: Sasha Levin commit 5cd781f7216f980207af09c5e0e1bb1eda284540 Author: Maciej Fijalkowski Date: Wed Jan 24 20:15:54 2024 +0100 xsk: fix usage of multi-buffer BPF helpers for ZC XDP [ Upstream commit c5114710c8ce86b8317e9b448f4fd15c711c2a82 ] Currently when packet is shrunk via bpf_xdp_adjust_tail() and memory type is set to MEM_TYPE_XSK_BUFF_POOL, null ptr dereference happens: [1136314.192256] BUG: kernel NULL pointer dereference, address: 0000000000000034 [1136314.203943] #PF: supervisor read access in kernel mode [1136314.213768] #PF: error_code(0x0000) - not-present page [1136314.223550] PGD 0 P4D 0 [1136314.230684] Oops: 0000 [#1] PREEMPT SMP NOPTI [1136314.239621] CPU: 8 PID: 54203 Comm: xdpsock Not tainted 6.6.0+ #257 [1136314.250469] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019 [1136314.265615] RIP: 0010:__xdp_return+0x6c/0x210 [1136314.274653] Code: ad 00 48 8b 47 08 49 89 f8 a8 01 0f 85 9b 01 00 00 0f 1f 44 00 00 f0 41 ff 48 34 75 32 4c 89 c7 e9 79 cd 80 ff 83 fe 03 75 17 41 34 01 0f 85 02 01 00 00 48 89 cf e9 22 cc 1e 00 e9 3d d2 86 [1136314.302907] RSP: 0018:ffffc900089f8db0 EFLAGS: 00010246 [1136314.312967] RAX: ffffc9003168aed0 RBX: ffff8881c3300000 RCX: 0000000000000000 [1136314.324953] RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffffc9003168c000 [1136314.336929] RBP: 0000000000000ae0 R08: 0000000000000002 R09: 0000000000010000 [1136314.348844] R10: ffffc9000e495000 R11: 0000000000000040 R12: 0000000000000001 [1136314.360706] R13: 0000000000000524 R14: ffffc9003168aec0 R15: 0000000000000001 [1136314.373298] FS: 00007f8df8bbcb80(0000) GS:ffff8897e0e00000(0000) knlGS:0000000000000000 [1136314.386105] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [1136314.396532] CR2: 0000000000000034 CR3: 00000001aa912002 CR4: 00000000007706f0 [1136314.408377] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [1136314.420173] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [1136314.431890] PKRU: 55555554 [1136314.439143] Call Trace: [1136314.446058] [1136314.452465] ? __die+0x20/0x70 [1136314.459881] ? page_fault_oops+0x15b/0x440 [1136314.468305] ? exc_page_fault+0x6a/0x150 [1136314.476491] ? asm_exc_page_fault+0x22/0x30 [1136314.484927] ? __xdp_return+0x6c/0x210 [1136314.492863] bpf_xdp_adjust_tail+0x155/0x1d0 [1136314.501269] bpf_prog_ccc47ae29d3b6570_xdp_sock_prog+0x15/0x60 [1136314.511263] ice_clean_rx_irq_zc+0x206/0xc60 [ice] [1136314.520222] ? ice_xmit_zc+0x6e/0x150 [ice] [1136314.528506] ice_napi_poll+0x467/0x670 [ice] [1136314.536858] ? ttwu_do_activate.constprop.0+0x8f/0x1a0 [1136314.546010] __napi_poll+0x29/0x1b0 [1136314.553462] net_rx_action+0x133/0x270 [1136314.561619] __do_softirq+0xbe/0x28e [1136314.569303] do_softirq+0x3f/0x60 This comes from __xdp_return() call with xdp_buff argument passed as NULL which is supposed to be consumed by xsk_buff_free() call. To address this properly, in ZC case, a node that represents the frag being removed has to be pulled out of xskb_list. Introduce appropriate xsk helpers to do such node operation and use them accordingly within bpf_xdp_adjust_tail(). Fixes: 24ea50127ecf ("xsk: support mbuf on ZC RX") Acked-by: Magnus Karlsson # For the xsk header part Signed-off-by: Maciej Fijalkowski Link: https://lore.kernel.org/r/20240124191602.566724-4-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov Signed-off-by: Sasha Levin commit a32136624a356d0ea1ef75a6f9586768fba064a1 Author: Maciej Fijalkowski Date: Wed Jan 24 20:15:53 2024 +0100 xsk: make xsk_buff_pool responsible for clearing xdp_buff::flags [ Upstream commit f7f6aa8e24383fbb11ac55942e66da9660110f80 ] XDP multi-buffer support introduced XDP_FLAGS_HAS_FRAGS flag that is used by drivers to notify data path whether xdp_buff contains fragments or not. Data path looks up mentioned flag on first buffer that occupies the linear part of xdp_buff, so drivers only modify it there. This is sufficient for SKB and XDP_DRV modes as usually xdp_buff is allocated on stack or it resides within struct representing driver's queue and fragments are carried via skb_frag_t structs. IOW, we are dealing with only one xdp_buff. ZC mode though relies on list of xdp_buff structs that is carried via xsk_buff_pool::xskb_list, so ZC data path has to make sure that fragments do *not* have XDP_FLAGS_HAS_FRAGS set. Otherwise, xsk_buff_free() could misbehave if it would be executed against xdp_buff that carries a frag with XDP_FLAGS_HAS_FRAGS flag set. Such scenario can take place when within supplied XDP program bpf_xdp_adjust_tail() is used with negative offset that would in turn release the tail fragment from multi-buffer frame. Calling xsk_buff_free() on tail fragment with XDP_FLAGS_HAS_FRAGS would result in releasing all the nodes from xskb_list that were produced by driver before XDP program execution, which is not what is intended - only tail fragment should be deleted from xskb_list and then it should be put onto xsk_buff_pool::free_list. Such multi-buffer frame will never make it up to user space, so from AF_XDP application POV there would be no traffic running, however due to free_list getting constantly new nodes, driver will be able to feed HW Rx queue with recycled buffers. Bottom line is that instead of traffic being redirected to user space, it would be continuously dropped. To fix this, let us clear the mentioned flag on xsk_buff_pool side during xdp_buff initialization, which is what should have been done right from the start of XSK multi-buffer support. Fixes: 1bbc04de607b ("ice: xsk: add RX multi-buffer support") Fixes: 1c9ba9c14658 ("i40e: xsk: add RX multi-buffer support") Fixes: 24ea50127ecf ("xsk: support mbuf on ZC RX") Signed-off-by: Maciej Fijalkowski Link: https://lore.kernel.org/r/20240124191602.566724-3-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov Signed-off-by: Sasha Levin commit 7b4d93d31aade99210d41cd9d4cbd2957c98bc8c Author: Maciej Fijalkowski Date: Wed Jan 24 20:15:52 2024 +0100 xsk: recycle buffer in case Rx queue was full [ Upstream commit 269009893146c495f41e9572dd9319e787c2eba9 ] Add missing xsk_buff_free() call when __xsk_rcv_zc() failed to produce descriptor to XSK Rx queue. Fixes: 24ea50127ecf ("xsk: support mbuf on ZC RX") Acked-by: Magnus Karlsson Signed-off-by: Maciej Fijalkowski Link: https://lore.kernel.org/r/20240124191602.566724-2-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov Signed-off-by: Sasha Levin commit 1b62822f66443fb5b0b60d342fd91b7cf2b0d7e2 Author: Jakub Kicinski Date: Mon Jan 22 22:05:29 2024 -0800 selftests: netdevsim: fix the udp_tunnel_nic test [ Upstream commit 0879020a7817e7ce636372c016b4528f541c9f4d ] This test is missing a whole bunch of checks for interface renaming and one ifup. Presumably it was only used on a system with renaming disabled and NetworkManager running. Fixes: 91f430b2c49d ("selftests: net: add a test for UDP tunnel info infra") Acked-by: Paolo Abeni Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20240123060529.1033912-1-kuba@kernel.org Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 99c386bf6e1bbae6ad2f756c916d842ef1c8cab1 Author: Jakub Kicinski Date: Mon Jan 22 11:58:15 2024 -0800 selftests: net: fix rps_default_mask with >32 CPUs [ Upstream commit 0719b5338a0cbe80d1637a5fb03d8141b5bfc7a1 ] If there is more than 32 cpus the bitmask will start to contain commas, leading to: ./rps_default_mask.sh: line 36: [: 00000000,00000000: integer expression expected Remove the commas, bash doesn't interpret leading zeroes as oct so that should be good enough. Switch to bash, Simon reports that not all shells support this type of substitution. Fixes: c12e0d5f267d ("self-tests: introduce self-tests for RPS default mask") Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20240122195815.638997-1-kuba@kernel.org Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit dc77f6ab5c3759df60ff87ed24f4d45df0f3b4c4 Author: Jenishkumar Maheshbhai Patel Date: Thu Jan 18 19:59:14 2024 -0800 net: mvpp2: clear BM pool before initialization [ Upstream commit 9f538b415db862e74b8c5d3abbccfc1b2b6caa38 ] Register value persist after booting the kernel using kexec which results in kernel panic. Thus clear the BM pool registers before initialisation to fix the issue. Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit") Signed-off-by: Jenishkumar Maheshbhai Patel Reviewed-by: Maxime Chevallier Link: https://lore.kernel.org/r/20240119035914.2595665-1-jpatel2@marvell.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 4e7ef904929b002f533ac7d5067e488f3b559d0b Author: Bernd Edlinger Date: Mon Jan 22 19:19:09 2024 +0100 net: stmmac: Wait a bit for the reset to take effect [ Upstream commit a5f5eee282a0aae80227697e1d9c811b1726d31d ] otherwise the synopsys_id value may be read out wrong, because the GMAC_VERSION register might still be in reset state, for at least 1 us after the reset is de-asserted. Add a wait for 10 us before continuing to be on the safe side. > From what have you got that delay value? Just try and error, with very old linux versions and old gcc versions the synopsys_id was read out correctly most of the time (but not always), with recent linux versions and recnet gcc versions it was read out wrongly most of the time, but again not always. I don't have access to the VHDL code in question, so I cannot tell why it takes so long to get the correct values, I also do not have more than a few hardware samples, so I cannot tell how long this timeout must be in worst case. Experimentally I can tell that the register is read several times as zero immediately after the reset is de-asserted, also adding several no-ops is not enough, adding a printk is enough, also udelay(1) seems to be enough but I tried that not very often, and I have not access to many hardware samples to be 100% sure about the necessary delay. And since the udelay here is only executed once per device instance, it seems acceptable to delay the boot for 10 us. BTW: my hardware's synopsys id is 0x37. Fixes: c5e4ddbdfa11 ("net: stmmac: Add support for optional reset control") Signed-off-by: Bernd Edlinger Reviewed-by: Jiri Pirko Reviewed-by: Serge Semin Link: https://lore.kernel.org/r/AS8P193MB1285A810BD78C111E7F6AA34E4752@AS8P193MB1285.EURP193.PROD.OUTLOOK.COM Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 86cca272a250cb6bc18a1ec96a4419cbcfa1108f Author: Pablo Neira Ayuso Date: Tue Jan 23 16:38:25 2024 +0100 netfilter: nf_tables: validate NFPROTO_* family [ Upstream commit d0009effa8862c20a13af4cb7475d9771b905693 ] Several expressions explicitly refer to NF_INET_* hook definitions from expr->ops->validate, however, family is not validated. Bail out with EOPNOTSUPP in case they are used from unsupported families. Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables") Fixes: a3c90f7a2323 ("netfilter: nf_tables: flow offload expression") Fixes: 2fa841938c64 ("netfilter: nf_tables: introduce routing expression") Fixes: 554ced0a6e29 ("netfilter: nf_tables: add support for native socket matching") Fixes: ad49d86e07a4 ("netfilter: nf_tables: Add synproxy support") Fixes: 4ed8eb6570a4 ("netfilter: nf_tables: Add native tproxy support") Fixes: 6c47260250fc ("netfilter: nf_tables: add xfrm expression") Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit 496f722faeab9b42318dc3b1fb9988e38b8e8718 Author: Florian Westphal Date: Fri Jan 19 13:34:32 2024 +0100 netfilter: nf_tables: restrict anonymous set and map names to 16 bytes [ Upstream commit b462579b2b86a8f5230543cadd3a4836be27baf7 ] nftables has two types of sets/maps, one where userspace defines the name, and anonymous sets/maps, where userspace defines a template name. For the latter, kernel requires presence of exactly one "%d". nftables uses "__set%d" and "__map%d" for this. The kernel will expand the format specifier and replaces it with the smallest unused number. As-is, userspace could define a template name that allows to move the set name past the 256 bytes upperlimit (post-expansion). I don't see how this could be a problem, but I would prefer if userspace cannot do this, so add a limit of 16 bytes for the '%d' template name. 16 bytes is the old total upper limit for set names that existed when nf_tables was merged initially. Fixes: 387454901bd6 ("netfilter: nf_tables: Allow set names of up to 255 chars") Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit 00c2c29aa36d1d1827c51a3720e9f893a22c7c6a Author: Florian Westphal Date: Fri Jan 19 13:11:32 2024 +0100 netfilter: nft_limit: reject configurations that cause integer overflow [ Upstream commit c9d9eb9c53d37cdebbad56b91e40baf42d5a97aa ] Reject bogus configs where internal token counter wraps around. This only occurs with very very large requests, such as 17gbyte/s. Its better to reject this rather than having incorrect ratelimit. Fixes: d2168e849ebf ("netfilter: nft_limit: add per-byte limiting") Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Sasha Levin commit b01ccd9f1c3d6220f1ce2e9e2eb009e98b11803c Author: Frederic Weisbecker Date: Tue Dec 19 00:19:15 2023 +0100 rcu: Defer RCU kthreads wakeup when CPU is dying [ Upstream commit e787644caf7628ad3269c1fbd321c3255cf51710 ] When the CPU goes idle for the last time during the CPU down hotplug process, RCU reports a final quiescent state for the current CPU. If this quiescent state propagates up to the top, some tasks may then be woken up to complete the grace period: the main grace period kthread and/or the expedited main workqueue (or kworker). If those kthreads have a SCHED_FIFO policy, the wake up can indirectly arm the RT bandwith timer to the local offline CPU. Since this happens after hrtimers have been migrated at CPUHP_AP_HRTIMERS_DYING stage, the timer gets ignored. Therefore if the RCU kthreads are waiting for RT bandwidth to be available, they may never be actually scheduled. This triggers TREE03 rcutorture hangs: rcu: INFO: rcu_preempt self-detected stall on CPU rcu: 4-...!: (1 GPs behind) idle=9874/1/0x4000000000000000 softirq=0/0 fqs=20 rcuc=21071 jiffies(starved) rcu: (t=21035 jiffies g=938281 q=40787 ncpus=6) rcu: rcu_preempt kthread starved for 20964 jiffies! g938281 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0 rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior. rcu: RCU grace-period kthread stack dump: task:rcu_preempt state:R running task stack:14896 pid:14 tgid:14 ppid:2 flags:0x00004000 Call Trace: __schedule+0x2eb/0xa80 schedule+0x1f/0x90 schedule_timeout+0x163/0x270 ? __pfx_process_timeout+0x10/0x10 rcu_gp_fqs_loop+0x37c/0x5b0 ? __pfx_rcu_gp_kthread+0x10/0x10 rcu_gp_kthread+0x17c/0x200 kthread+0xde/0x110 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2b/0x40 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 The situation can't be solved with just unpinning the timer. The hrtimer infrastructure and the nohz heuristics involved in finding the best remote target for an unpinned timer would then also need to handle enqueues from an offline CPU in the most horrendous way. So fix this on the RCU side instead and defer the wake up to an online CPU if it's too late for the local one. Reported-by: Paul E. McKenney Fixes: 5c0930ccaad5 ("hrtimers: Push pending hrtimers away from outgoing CPU earlier") Signed-off-by: Frederic Weisbecker Signed-off-by: Paul E. McKenney Signed-off-by: Neeraj Upadhyay (AMD) Signed-off-by: Sasha Levin commit 65a4ade8a6d205979292e88beeb6a626ddbd4779 Author: Dinghao Liu Date: Tue Nov 28 17:29:01 2023 +0800 net/mlx5e: fix a potential double-free in fs_any_create_groups [ Upstream commit aef855df7e1bbd5aa4484851561211500b22707e ] When kcalloc() for ft->g succeeds but kvzalloc() for in fails, fs_any_create_groups() will free ft->g. However, its caller fs_any_create_table() will free ft->g again through calling mlx5e_destroy_flow_table(), which will lead to a double-free. Fix this by setting ft->g to NULL in fs_any_create_groups(). Fixes: 0f575c20bf06 ("net/mlx5e: Introduce Flow Steering ANY API") Signed-off-by: Dinghao Liu Reviewed-by: Tariq Toukan Reviewed-by: Simon Horman Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 66cc521a739ccd5da057a1cb3d6346c6d0e7619b Author: Zhipeng Lu Date: Wed Jan 17 15:17:36 2024 +0800 net/mlx5e: fix a double-free in arfs_create_groups [ Upstream commit 3c6d5189246f590e4e1f167991558bdb72a4738b ] When `in` allocated by kvzalloc fails, arfs_create_groups will free ft->g and return an error. However, arfs_create_table, the only caller of arfs_create_groups, will hold this error and call to mlx5e_destroy_flow_table, in which the ft->g will be freed again. Fixes: 1cabe6b0965e ("net/mlx5e: Create aRFS flow tables") Signed-off-by: Zhipeng Lu Reviewed-by: Simon Horman Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit f8ec9a7bd98c7d4cc42e0af3017658e35ac78629 Author: Leon Romanovsky Date: Sun Nov 26 11:08:10 2023 +0200 net/mlx5e: Ignore IPsec replay window values on sender side [ Upstream commit 315a597f9bcfe7fe9980985031413457bee95510 ] XFRM stack doesn't prevent from users to configure replay window in TX side and strongswan sets replay_window to be 1. It causes to failures in validation logic when trying to offload the SA. Replay window is not relevant in TX side and should be ignored. Fixes: cded6d80129b ("net/mlx5e: Store replay window in XFRM attributes") Signed-off-by: Aya Levin Signed-off-by: Leon Romanovsky Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit f8462e8a95adfbe3218de178bafa3eff92c609d1 Author: Leon Romanovsky Date: Tue Dec 12 13:52:55 2023 +0200 net/mlx5e: Allow software parsing when IPsec crypto is enabled [ Upstream commit 20f5468a7988dedd94a57ba8acd65ebda6a59723 ] All ConnectX devices have software parsing capability enabled, but it is more correct to set allow_swp only if capability exists, which for IPsec means that crypto offload is supported. Fixes: 2451da081a34 ("net/mlx5: Unify device IPsec capabilities check") Signed-off-by: Leon Romanovsky Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 8ed2f78eb5af74b12dd0b07520a1c9e49831f6fa Author: Rahul Rameshbabu Date: Tue Nov 28 14:01:54 2023 -0800 net/mlx5: Use mlx5 device constant for selecting CQ period mode for ASO [ Upstream commit 20cbf8cbb827094197f3b17db60d71449415db1e ] mlx5 devices have specific constants for choosing the CQ period mode. These constants do not have to match the constants used by the kernel software API for DIM period mode selection. Fixes: cdd04f4d4d71 ("net/mlx5: Add support to create SQ and CQ for ASO") Signed-off-by: Rahul Rameshbabu Reviewed-by: Jianbo Liu Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit ad896e88a3fc9fdf49da07014dec885994e67825 Author: Yevgeny Kliteynik Date: Sun Dec 17 13:20:36 2023 +0200 net/mlx5: DR, Can't go to uplink vport on RX rule [ Upstream commit 5b2a2523eeea5f03d39a9d1ff1bad2e9f8eb98d2 ] Go-To-Vport action on RX is not allowed when the vport is uplink. In such case, the packet should be dropped. Fixes: 9db810ed2d37 ("net/mlx5: DR, Expose steering action functionality") Signed-off-by: Yevgeny Kliteynik Reviewed-by: Erez Shitrit Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit c3a974bcfd651e4560f7857c5d45432ee0bc4483 Author: Yevgeny Kliteynik Date: Sun Dec 17 11:24:08 2023 +0200 net/mlx5: DR, Use the right GVMI number for drop action [ Upstream commit 5665954293f13642f9c052ead83c1e9d8cff186f ] When FW provides ICM addresses for drop RX/TX, the provided capability is 64 bits that contain its GVMI as well as the ICM address itself. In case of TX DROP this GVMI is different from the GVMI that the domain is operating on. This patch fixes the action to use these GVMI IDs, as provided by FW. Fixes: 9db810ed2d37 ("net/mlx5: DR, Expose steering action functionality") Signed-off-by: Yevgeny Kliteynik Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 0d704520a93989565a19a859a85fee2006ce9524 Author: Moshe Shemesh Date: Sat Dec 30 22:40:37 2023 +0200 net/mlx5: Bridge, fix multicast packets sent to uplink [ Upstream commit ec7cc38ef9f83553102e84c82536971a81630739 ] To enable multicast packets which are offloaded in bridge multicast offload mode to be sent also to uplink, FTE bit uplink_hairpin_en should be set. Add this bit to FTE for the bridge multicast offload rules. Fixes: 18c2916cee12 ("net/mlx5: Bridge, snoop igmp/mld packets") Signed-off-by: Moshe Shemesh Reviewed-by: Gal Pressman Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit fc18f59d0f88ae25d8334887f84e5d3e66a7ea72 Author: Yishai Hadas Date: Sun Dec 31 15:19:50 2023 +0200 net/mlx5: Fix a WARN upon a callback command failure [ Upstream commit cc8091587779cfaddb6b29c9e9edb9079a282cad ] The below WARN [1] is reported once a callback command failed. As a callback runs under an interrupt context, needs to use the IRQ save/restore variant. [1] DEBUG_LOCKS_WARN_ON(lockdep_hardirq_context()) WARNING: CPU: 15 PID: 0 at kernel/locking/lockdep.c:4353 lockdep_hardirqs_on_prepare+0x11b/0x180 Modules linked in: vhost_net vhost tap mlx5_vfio_pci vfio_pci vfio_pci_core vfio_iommu_type1 vfio mlx5_vdpa vringh vhost_iotlb vdpa nfnetlink_cttimeout openvswitch nsh ip6table_mangle ip6table_nat ip6table_filter ip6_tables iptable_mangle xt_conntrackxt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_registry overlay rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_umad ib_ipoib ib_cm mlx5_ib ib_uverbs ib_core fuse mlx5_core CPU: 15 PID: 0 Comm: swapper/15 Tainted: G W 6.7.0-rc4+ #1587 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:lockdep_hardirqs_on_prepare+0x11b/0x180 Code: 00 5b c3 c3 e8 e6 0d 58 00 85 c0 74 d6 8b 15 f0 c3 76 01 85 d2 75 cc 48 c7 c6 04 a5 3b 82 48 c7 c7 f1 e9 39 82 e8 95 12 f9 ff <0f> 0b 5b c3 e8 bc 0d 58 00 85 c0 74 ac 8b 3d c6 c3 76 01 85 ff 75 RSP: 0018:ffffc900003ecd18 EFLAGS: 00010086 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000027 RDX: 0000000000000000 RSI: ffff88885fbdb880 RDI: ffff88885fbdb888 RBP: 00000000ffffff87 R08: 0000000000000000 R09: 0000000000000001 R10: 0000000000000000 R11: 284e4f5f4e524157 R12: 00000000002c9aa1 R13: ffff88810aace980 R14: ffff88810aace9b8 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff88885fbc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f731436f4c8 CR3: 000000010aae6001 CR4: 0000000000372eb0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? __warn+0x81/0x170 ? lockdep_hardirqs_on_prepare+0x11b/0x180 ? report_bug+0xf8/0x1c0 ? handle_bug+0x3f/0x70 ? exc_invalid_op+0x13/0x60 ? asm_exc_invalid_op+0x16/0x20 ? lockdep_hardirqs_on_prepare+0x11b/0x180 ? lockdep_hardirqs_on_prepare+0x11b/0x180 trace_hardirqs_on+0x4a/0xa0 raw_spin_unlock_irq+0x24/0x30 cmd_status_err+0xc0/0x1a0 [mlx5_core] cmd_status_err+0x1a0/0x1a0 [mlx5_core] mlx5_cmd_exec_cb_handler+0x24/0x40 [mlx5_core] mlx5_cmd_comp_handler+0x129/0x4b0 [mlx5_core] cmd_comp_notifier+0x1a/0x20 [mlx5_core] notifier_call_chain+0x3e/0xe0 atomic_notifier_call_chain+0x5f/0x130 mlx5_eq_async_int+0xe7/0x200 [mlx5_core] notifier_call_chain+0x3e/0xe0 atomic_notifier_call_chain+0x5f/0x130 irq_int_handler+0x11/0x20 [mlx5_core] __handle_irq_event_percpu+0x99/0x220 ? tick_irq_enter+0x5d/0x80 handle_irq_event_percpu+0xf/0x40 handle_irq_event+0x3a/0x60 handle_edge_irq+0xa2/0x1c0 __common_interrupt+0x55/0x140 common_interrupt+0x7d/0xa0 asm_common_interrupt+0x22/0x40 RIP: 0010:default_idle+0x13/0x20 Code: c0 08 00 00 00 4d 29 c8 4c 01 c7 4c 29 c2 e9 72 ff ff ff cc cc cc cc 8b 05 ea 08 25 01 85 c0 7e 07 0f 00 2d 7f b0 26 00 fb f4 c3 90 66 2e 0f 1f 84 00 00 00 00 00 65 48 8b 04 25 80 d0 02 00 RSP: 0018:ffffc9000010fec8 EFLAGS: 00000242 RAX: 0000000000000001 RBX: 000000000000000f RCX: 4000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff811c410c RBP: ffffffff829478c0 R08: 0000000000000001 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 ? do_idle+0x1ec/0x210 default_idle_call+0x6c/0x90 do_idle+0x1ec/0x210 cpu_startup_entry+0x26/0x30 start_secondary+0x11b/0x150 secondary_startup_64_no_verify+0x165/0x16b irq event stamp: 833284 hardirqs last enabled at (833283): [] do_idle+0x1ec/0x210 hardirqs last disabled at (833284): [] common_interrupt+0xf/0xa0 softirqs last enabled at (833224): [] __do_softirq+0x2bf/0x40e softirqs last disabled at (833177): [] irq_exit_rcu+0x7f/0xa0 Fixes: 34f46ae0d4b3 ("net/mlx5: Add command failures data to debugfs") Signed-off-by: Yishai Hadas Reviewed-by: Moshe Shemesh Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit e24d6f5a7f2d95a98a46257a5a5a5381d572894f Author: Vlad Buslov Date: Fri Nov 10 11:10:22 2023 +0100 net/mlx5e: Fix peer flow lists handling [ Upstream commit d76fdd31f953ac5046555171620f2562715e9b71 ] The cited change refactored mlx5e_tc_del_fdb_peer_flow() to only clear DUP flag when list of peer flows has become empty. However, if any concurrent user holds a reference to a peer flow (for example, the neighbor update workqueue task is updating peer flow's parent encap entry concurrently), then the flow will not be removed from the peer list and, consecutively, DUP flag will remain set. Since mlx5e_tc_del_fdb_peers_flow() calls mlx5e_tc_del_fdb_peer_flow() for every possible peer index the algorithm will try to remove the flow from eswitch instances that it has never peered with causing either NULL pointer dereference when trying to remove the flow peer list head of peer_index that was never initialized or a warning if the list debug config is enabled[0]. Fix the issue by always removing the peer flow from the list even when not releasing the last reference to it. [0]: [ 3102.985806] ------------[ cut here ]------------ [ 3102.986223] list_del corruption, ffff888139110698->next is NULL [ 3102.986757] WARNING: CPU: 2 PID: 22109 at lib/list_debug.c:53 __list_del_entry_valid_or_report+0x4f/0xc0 [ 3102.987561] Modules linked in: act_ct nf_flow_table bonding act_tunnel_key act_mirred act_skbedit vxlan cls_matchall nfnetlink_cttimeout act_gact cls_flower sch_ingress mlx5_vdpa vringh vhost_iotlb vdpa openvswitch nsh xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcg ss oid_registry overlay rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core mlx5_core [last unloaded: bonding] [ 3102.991113] CPU: 2 PID: 22109 Comm: revalidator28 Not tainted 6.6.0-rc6+ #3 [ 3102.991695] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 3102.992605] RIP: 0010:__list_del_entry_valid_or_report+0x4f/0xc0 [ 3102.993122] Code: 39 c2 74 56 48 8b 32 48 39 fe 75 62 48 8b 51 08 48 39 f2 75 73 b8 01 00 00 00 c3 48 89 fe 48 c7 c7 48 fd 0a 82 e8 41 0b ad ff <0f> 0b 31 c0 c3 48 89 fe 48 c7 c7 70 fd 0a 82 e8 2d 0b ad ff 0f 0b [ 3102.994615] RSP: 0018:ffff8881383e7710 EFLAGS: 00010286 [ 3102.995078] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000 [ 3102.995670] RDX: 0000000000000001 RSI: ffff88885f89b640 RDI: ffff88885f89b640 [ 3102.997188] DEL flow 00000000be367878 on port 0 [ 3102.998594] RBP: dead000000000122 R08: 0000000000000000 R09: c0000000ffffdfff [ 3102.999604] R10: 0000000000000008 R11: ffff8881383e7598 R12: dead000000000100 [ 3103.000198] R13: 0000000000000002 R14: ffff888139110000 R15: ffff888101901240 [ 3103.000790] FS: 00007f424cde4700(0000) GS:ffff88885f880000(0000) knlGS:0000000000000000 [ 3103.001486] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3103.001986] CR2: 00007fd42e8dcb70 CR3: 000000011e68a003 CR4: 0000000000370ea0 [ 3103.002596] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 3103.003190] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 3103.003787] Call Trace: [ 3103.004055] [ 3103.004297] ? __warn+0x7d/0x130 [ 3103.004623] ? __list_del_entry_valid_or_report+0x4f/0xc0 [ 3103.005094] ? report_bug+0xf1/0x1c0 [ 3103.005439] ? console_unlock+0x4a/0xd0 [ 3103.005806] ? handle_bug+0x3f/0x70 [ 3103.006149] ? exc_invalid_op+0x13/0x60 [ 3103.006531] ? asm_exc_invalid_op+0x16/0x20 [ 3103.007430] ? __list_del_entry_valid_or_report+0x4f/0xc0 [ 3103.007910] mlx5e_tc_del_fdb_peers_flow+0xcf/0x240 [mlx5_core] [ 3103.008463] mlx5e_tc_del_flow+0x46/0x270 [mlx5_core] [ 3103.008944] mlx5e_flow_put+0x26/0x50 [mlx5_core] [ 3103.009401] mlx5e_delete_flower+0x25f/0x380 [mlx5_core] [ 3103.009901] tc_setup_cb_destroy+0xab/0x180 [ 3103.010292] fl_hw_destroy_filter+0x99/0xc0 [cls_flower] [ 3103.010779] __fl_delete+0x2d4/0x2f0 [cls_flower] [ 3103.011207] fl_delete+0x36/0x80 [cls_flower] [ 3103.011614] tc_del_tfilter+0x56f/0x750 [ 3103.011982] rtnetlink_rcv_msg+0xff/0x3a0 [ 3103.012362] ? netlink_ack+0x1c7/0x4e0 [ 3103.012719] ? rtnl_calcit.isra.44+0x130/0x130 [ 3103.013134] netlink_rcv_skb+0x54/0x100 [ 3103.013533] netlink_unicast+0x1ca/0x2b0 [ 3103.013902] netlink_sendmsg+0x361/0x4d0 [ 3103.014269] __sock_sendmsg+0x38/0x60 [ 3103.014643] ____sys_sendmsg+0x1f2/0x200 [ 3103.015018] ? copy_msghdr_from_user+0x72/0xa0 [ 3103.015265] ___sys_sendmsg+0x87/0xd0 [ 3103.016608] ? copy_msghdr_from_user+0x72/0xa0 [ 3103.017014] ? ___sys_recvmsg+0x9b/0xd0 [ 3103.017381] ? ttwu_do_activate.isra.137+0x58/0x180 [ 3103.017821] ? wake_up_q+0x49/0x90 [ 3103.018157] ? futex_wake+0x137/0x160 [ 3103.018521] ? __sys_sendmsg+0x51/0x90 [ 3103.018882] __sys_sendmsg+0x51/0x90 [ 3103.019230] ? exit_to_user_mode_prepare+0x56/0x130 [ 3103.019670] do_syscall_64+0x3c/0x80 [ 3103.020017] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 3103.020469] RIP: 0033:0x7f4254811ef4 [ 3103.020816] Code: 89 f3 48 83 ec 10 48 89 7c 24 08 48 89 14 24 e8 42 eb ff ff 48 8b 14 24 41 89 c0 48 89 de 48 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 30 44 89 c7 48 89 04 24 e8 78 eb ff ff 48 8b [ 3103.022290] RSP: 002b:00007f424cdd9480 EFLAGS: 00000293 ORIG_RAX: 000000000000002e [ 3103.022970] RAX: ffffffffffffffda RBX: 00007f424cdd9510 RCX: 00007f4254811ef4 [ 3103.023564] RDX: 0000000000000000 RSI: 00007f424cdd9510 RDI: 0000000000000012 [ 3103.024158] RBP: 00007f424cdda238 R08: 0000000000000000 R09: 00007f41d801a4b0 [ 3103.024748] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000001 [ 3103.025341] R13: 00007f424cdd9510 R14: 00007f424cdda240 R15: 00007f424cdd99a0 [ 3103.025931] [ 3103.026182] ---[ end trace 0000000000000000 ]--- [ 3103.027033] ------------[ cut here ]------------ Fixes: 9be6c21fdcf8 ("net/mlx5e: Handle offloads flows per peer") Signed-off-by: Vlad Buslov Reviewed-by: Mark Bloch Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit b7077d52ca22562e85141a7705e3b49e012469a5 Author: Tariq Toukan Date: Sun Nov 5 17:09:46 2023 +0200 net/mlx5e: Fix inconsistent hairpin RQT sizes [ Upstream commit c20767fd45e82d64352db82d4fc8d281a43e4783 ] The processing of traffic in hairpin queues occurs in HW/FW and does not involve the cpus, hence the upper bound on max num channels does not apply to them. Using this bound for the hairpin RQT max_table_size is wrong. It could be too small, and cause the error below [1]. As the RQT size provided on init does not get modified later, use the same value for both actual and max table sizes. [1] mlx5_core 0000:08:00.1: mlx5_cmd_out_err:805:(pid 1200): CREATE_RQT(0x916) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x538faf), err(-22) Fixes: 74a8dadac17e ("net/mlx5e: Preparations for supporting larger number of channels") Signed-off-by: Tariq Toukan Reviewed-by: Gal Pressman Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit 33cdeae8c6fb58cc445f859b67c014dc9f60b4e0 Author: Rahul Rameshbabu Date: Wed Nov 22 18:32:11 2023 -0800 net/mlx5e: Fix operation precedence bug in port timestamping napi_poll context [ Upstream commit 3876638b2c7ebb2c9d181de1191db0de8cac143a ] Indirection (*) is of lower precedence than postfix increment (++). Logic in napi_poll context would cause an out-of-bound read by first increment the pointer address by byte address space and then dereference the value. Rather, the intended logic was to dereference first and then increment the underlying value. Fixes: 92214be5979c ("net/mlx5e: Update doorbell for port timestamping CQ before the software counter") Signed-off-by: Rahul Rameshbabu Reviewed-by: Tariq Toukan Signed-off-by: Saeed Mahameed Signed-off-by: Sasha Levin commit c04709b2cc99ae31c346f79f0211752d7b74df01 Author: Ido Schimmel Date: Mon Jan 22 15:28:43 2024 +0200 net/sched: flower: Fix chain template offload [ Upstream commit 32f2a0afa95fae0d1ceec2ff06e0e816939964b8 ] When a qdisc is deleted from a net device the stack instructs the underlying driver to remove its flow offload callback from the associated filter block using the 'FLOW_BLOCK_UNBIND' command. The stack then continues to replay the removal of the filters in the block for this driver by iterating over the chains in the block and invoking the 'reoffload' operation of the classifier being used. In turn, the classifier in its 'reoffload' operation prepares and emits a 'FLOW_CLS_DESTROY' command for each filter. However, the stack does not do the same for chain templates and the underlying driver never receives a 'FLOW_CLS_TMPLT_DESTROY' command when a qdisc is deleted. This results in a memory leak [1] which can be reproduced using [2]. Fix by introducing a 'tmplt_reoffload' operation and have the stack invoke it with the appropriate arguments as part of the replay. Implement the operation in the sole classifier that supports chain templates (flower) by emitting the 'FLOW_CLS_TMPLT_{CREATE,DESTROY}' command based on whether a flow offload callback is being bound to a filter block or being unbound from one. As far as I can tell, the issue happens since cited commit which reordered tcf_block_offload_unbind() before tcf_block_flush_all_chains() in __tcf_block_put(). The order cannot be reversed as the filter block is expected to be freed after flushing all the chains. [1] unreferenced object 0xffff888107e28800 (size 2048): comm "tc", pid 1079, jiffies 4294958525 (age 3074.287s) hex dump (first 32 bytes): b1 a6 7c 11 81 88 ff ff e0 5b b3 10 81 88 ff ff ..|......[...... 01 00 00 00 00 00 00 00 e0 aa b0 84 ff ff ff ff ................ backtrace: [] __kmem_cache_alloc_node+0x1e8/0x320 [] __kmalloc+0x4e/0x90 [] mlxsw_sp_acl_ruleset_get+0x34d/0x7a0 [] mlxsw_sp_flower_tmplt_create+0x145/0x180 [] mlxsw_sp_flow_block_cb+0x1ea/0x280 [] tc_setup_cb_call+0x183/0x340 [] fl_tmplt_create+0x3da/0x4c0 [] tc_ctl_chain+0xa15/0x1170 [] rtnetlink_rcv_msg+0x3cc/0xed0 [] netlink_rcv_skb+0x170/0x440 [] netlink_unicast+0x540/0x820 [] netlink_sendmsg+0x8d8/0xda0 [] ____sys_sendmsg+0x30f/0xa80 [] ___sys_sendmsg+0x13a/0x1e0 [] __sys_sendmsg+0x11c/0x1f0 [] do_syscall_64+0x40/0xe0 unreferenced object 0xffff88816d2c0400 (size 1024): comm "tc", pid 1079, jiffies 4294958525 (age 3074.287s) hex dump (first 32 bytes): 40 00 00 00 00 00 00 00 57 f6 38 be 00 00 00 00 @.......W.8..... 10 04 2c 6d 81 88 ff ff 10 04 2c 6d 81 88 ff ff ..,m......,m.... backtrace: [] __kmem_cache_alloc_node+0x1e8/0x320 [] __kmalloc_node+0x51/0x90 [] kvmalloc_node+0xa6/0x1f0 [] bucket_table_alloc.isra.0+0x83/0x460 [] rhashtable_init+0x43b/0x7c0 [] mlxsw_sp_acl_ruleset_get+0x428/0x7a0 [] mlxsw_sp_flower_tmplt_create+0x145/0x180 [] mlxsw_sp_flow_block_cb+0x1ea/0x280 [] tc_setup_cb_call+0x183/0x340 [] fl_tmplt_create+0x3da/0x4c0 [] tc_ctl_chain+0xa15/0x1170 [] rtnetlink_rcv_msg+0x3cc/0xed0 [] netlink_rcv_skb+0x170/0x440 [] netlink_unicast+0x540/0x820 [] netlink_sendmsg+0x8d8/0xda0 [] ____sys_sendmsg+0x30f/0xa80 [2] # tc qdisc add dev swp1 clsact # tc chain add dev swp1 ingress proto ip chain 1 flower dst_ip 0.0.0.0/32 # tc qdisc del dev swp1 clsact # devlink dev reload pci/0000:06:00.0 Fixes: bbf73830cd48 ("net: sched: traverse chains in block with tcf_get_next_chain()") Signed-off-by: Ido Schimmel Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit f5fd652487aeeccb86c5ce4b5649079b663ef1e6 Author: Jakub Kicinski Date: Mon Jan 22 12:35:28 2024 -0800 selftests: fill in some missing configs for net [ Upstream commit 04fe7c5029cbdbcdb28917f09a958d939a8f19f7 ] We are missing a lot of config options from net selftests, it seems: tun/tap: CONFIG_TUN, CONFIG_MACVLAN, CONFIG_MACVTAP fib_tests: CONFIG_NET_SCH_FQ_CODEL l2tp: CONFIG_L2TP, CONFIG_L2TP_V3, CONFIG_L2TP_IP, CONFIG_L2TP_ETH sctp-vrf: CONFIG_INET_DIAG txtimestamp: CONFIG_NET_CLS_U32 vxlan_mdb: CONFIG_BRIDGE_VLAN_FILTERING gre_gso: CONFIG_NET_IPGRE_DEMUX, CONFIG_IP_GRE, CONFIG_IPV6_GRE srv6_end_dt*_l3vpn: CONFIG_IPV6_SEG6_LWTUNNEL ip_local_port_range: CONFIG_MPTCP fib_test: CONFIG_NET_CLS_BASIC rtnetlink: CONFIG_MACSEC, CONFIG_NET_SCH_HTB, CONFIG_XFRM_INTERFACE CONFIG_NET_IPGRE, CONFIG_BONDING fib_nexthops: CONFIG_MPLS, CONFIG_MPLS_ROUTING vxlan_mdb: CONFIG_NET_ACT_GACT tls: CONFIG_TLS, CONFIG_CRYPTO_CHACHA20POLY1305 psample: CONFIG_PSAMPLE fcnal: CONFIG_TCP_MD5SIG Try to add them in a semi-alphabetical order. Fixes: 62199e3f1658 ("selftests: net: Add VXLAN MDB test") Fixes: c12e0d5f267d ("self-tests: introduce self-tests for RPS default mask") Fixes: 122db5e3634b ("selftests/net: add MPTCP coverage for IP_LOCAL_PORT_RANGE") Link: https://lore.kernel.org/r/20240122203528.672004-1-kuba@kernel.org Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 7986be39e4e1627312538f8b4ad7efebf845ae11 Author: Zhengchao Shao Date: Mon Jan 22 18:20:01 2024 +0800 ipv6: init the accept_queue's spinlocks in inet6_create [ Upstream commit 435e202d645c197dcfd39d7372eb2a56529b6640 ] In commit 198bc90e0e73("tcp: make sure init the accept_queue's spinlocks once"), the spinlocks of accept_queue are initialized only when socket is created in the inet4 scenario. The locks are not initialized when socket is created in the inet6 scenario. The kernel reports the following error: INFO: trying to register non-static key. The code is fine but needs lockdep annotation, or maybe you didn't initialize this object before use? turning off the locking correctness validator. Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 Call Trace: dump_stack_lvl (lib/dump_stack.c:107) register_lock_class (kernel/locking/lockdep.c:1289) __lock_acquire (kernel/locking/lockdep.c:5015) lock_acquire.part.0 (kernel/locking/lockdep.c:5756) _raw_spin_lock_bh (kernel/locking/spinlock.c:178) inet_csk_listen_stop (net/ipv4/inet_connection_sock.c:1386) tcp_disconnect (net/ipv4/tcp.c:2981) inet_shutdown (net/ipv4/af_inet.c:935) __sys_shutdown (./include/linux/file.h:32 net/socket.c:2438) __x64_sys_shutdown (net/socket.c:2445) do_syscall_64 (arch/x86/entry/common.c:52) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129) RIP: 0033:0x7f52ecd05a3d Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ab a3 0e 00 f7 d8 64 89 01 48 RSP: 002b:00007f52ecf5dde8 EFLAGS: 00000293 ORIG_RAX: 0000000000000030 RAX: ffffffffffffffda RBX: 00007f52ecf5e640 RCX: 00007f52ecd05a3d RDX: 00007f52ecc8b188 RSI: 0000000000000000 RDI: 0000000000000004 RBP: 00007f52ecf5de20 R08: 00007ffdae45c69f R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000293 R12: 00007f52ecf5e640 R13: 0000000000000000 R14: 00007f52ecc8b060 R15: 00007ffdae45c6e0 Fixes: 198bc90e0e73 ("tcp: make sure init the accept_queue's spinlocks once") Signed-off-by: Zhengchao Shao Reviewed-by: Eric Dumazet Link: https://lore.kernel.org/r/20240122102001.2851701-1-shaozhengchao@huawei.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 597ee30fd36af2f3bb64e3bae281c15642c7536e Author: Zhengchao Shao Date: Mon Jan 22 09:18:07 2024 +0800 netlink: fix potential sleeping issue in mqueue_flush_file [ Upstream commit 234ec0b6034b16869d45128b8cd2dc6ffe596f04 ] I analyze the potential sleeping issue of the following processes: Thread A Thread B ... netlink_create //ref = 1 do_mq_notify ... sock = netlink_getsockbyfilp ... //ref = 2 info->notify_sock = sock; ... ... netlink_sendmsg ... skb = netlink_alloc_large_skb //skb->head is vmalloced ... netlink_unicast ... sk = netlink_getsockbyportid //ref = 3 ... netlink_sendskb ... __netlink_sendskb ... skb_queue_tail //put skb to sk_receive_queue ... sock_put //ref = 2 ... ... ... netlink_release ... deferred_put_nlk_sk //ref = 1 mqueue_flush_file spin_lock remove_notification netlink_sendskb sock_put //ref = 0 sk_free ... __sk_destruct netlink_sock_destruct skb_queue_purge //get skb from sk_receive_queue ... __skb_queue_purge_reason kfree_skb_reason __kfree_skb ... skb_release_all skb_release_head_state netlink_skb_destructor vfree(skb->head) //sleeping while holding spinlock In netlink_sendmsg, if the memory pointed to by skb->head is allocated by vmalloc, and is put to sk_receive_queue queue, also the skb is not freed. When the mqueue executes flush, the sleeping bug will occur. Use vfree_atomic instead of vfree in netlink_skb_destructor to solve the issue. Fixes: c05cdb1b864f ("netlink: allow large data transfers from user-space") Signed-off-by: Zhengchao Shao Link: https://lore.kernel.org/r/20240122011807.2110357-1-shaozhengchao@huawei.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 581ae53049fa559c990d62e9e96193bd21ce3ac5 Author: Kuniyuki Iwashima Date: Fri Jan 19 19:16:42 2024 -0800 selftest: Don't reuse port for SO_INCOMING_CPU test. [ Upstream commit 97de5a15edf2d22184f5ff588656030bbb7fa358 ] Jakub reported that ASSERT_EQ(cpu, i) in so_incoming_cpu.c seems to fire somewhat randomly. # # RUN so_incoming_cpu.before_reuseport.test3 ... # # so_incoming_cpu.c:191:test3:Expected cpu (32) == i (0) # # test3: Test terminated by assertion # # FAIL so_incoming_cpu.before_reuseport.test3 # not ok 3 so_incoming_cpu.before_reuseport.test3 When the test failed, not-yet-accepted CLOSE_WAIT sockets received SYN with a "challenging" SEQ number, which was sent from an unexpected CPU that did not create the receiver. The test basically does: 1. for each cpu: 1-1. create a server 1-2. set SO_INCOMING_CPU 2. for each cpu: 2-1. set cpu affinity 2-2. create some clients 2-3. let clients connect() to the server on the same cpu 2-4. close() clients 3. for each server: 3-1. accept() all child sockets 3-2. check if all children have the same SO_INCOMING_CPU with the server The root cause was the close() in 2-4. and net.ipv4.tcp_tw_reuse. In a loop of 2., close() changed the client state to FIN_WAIT_2, and the peer transitioned to CLOSE_WAIT. In another loop of 2., connect() happened to select the same port of the FIN_WAIT_2 socket, and it was reused as the default value of net.ipv4.tcp_tw_reuse is 2. As a result, the new client sent SYN to the CLOSE_WAIT socket from a different CPU, and the receiver's sk_incoming_cpu was overwritten with unexpected CPU ID. Also, the SYN had a different SEQ number, so the CLOSE_WAIT socket responded with Challenge ACK. The new client properly returned RST and effectively killed the CLOSE_WAIT socket. This way, all clients were created successfully, but the error was detected later by 3-2., ASSERT_EQ(cpu, i). To avoid the failure, let's make sure that (i) the number of clients is less than the number of available ports and (ii) such reuse never happens. Fixes: 6df96146b202 ("selftest: Add test for SO_INCOMING_CPU.") Reported-by: Jakub Kicinski Signed-off-by: Kuniyuki Iwashima Tested-by: Jakub Kicinski Link: https://lore.kernel.org/r/20240120031642.67014-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 782f5ec085380ec10ca1d50b3595ed9c7c21dd82 Author: Salvatore Dipietro Date: Fri Jan 19 11:01:33 2024 -0800 tcp: Add memory barrier to tcp_push() [ Upstream commit 7267e8dcad6b2f9fce05a6a06335d7040acbc2b6 ] On CPUs with weak memory models, reads and updates performed by tcp_push to the sk variables can get reordered leaving the socket throttled when it should not. The tasklet running tcp_wfree() may also not observe the memory updates in time and will skip flushing any packets throttled by tcp_push(), delaying the sending. This can pathologically cause 40ms extra latency due to bad interactions with delayed acks. Adding a memory barrier in tcp_push removes the bug, similarly to the previous commit bf06200e732d ("tcp: tsq: fix nonagle handling"). smp_mb__after_atomic() is used to not incur in unnecessary overhead on x86 since not affected. Patch has been tested using an AWS c7g.2xlarge instance with Ubuntu 22.04 and Apache Tomcat 9.0.83 running the basic servlet below: import java.io.IOException; import java.io.OutputStreamWriter; import java.io.PrintWriter; import javax.servlet.ServletException; import javax.servlet.http.HttpServlet; import javax.servlet.http.HttpServletRequest; import javax.servlet.http.HttpServletResponse; public class HelloWorldServlet extends HttpServlet { @Override protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException { response.setContentType("text/html;charset=utf-8"); OutputStreamWriter osw = new OutputStreamWriter(response.getOutputStream(),"UTF-8"); String s = "a".repeat(3096); osw.write(s,0,s.length()); osw.flush(); } } Load was applied using wrk2 (https://github.com/kinvolk/wrk2) from an AWS c6i.8xlarge instance. Before the patch an additional 40ms latency from P99.99+ values is observed while, with the patch, the extra latency disappears. No patch and tcp_autocorking=1 ./wrk -t32 -c128 -d40s --latency -R10000 http://172.31.60.173:8080/hello/hello ... 50.000% 0.91ms 75.000% 1.13ms 90.000% 1.46ms 99.000% 1.74ms 99.900% 1.89ms 99.990% 41.95ms <<< 40+ ms extra latency 99.999% 48.32ms 100.000% 48.96ms With patch and tcp_autocorking=1 ./wrk -t32 -c128 -d40s --latency -R10000 http://172.31.60.173:8080/hello/hello ... 50.000% 0.90ms 75.000% 1.13ms 90.000% 1.45ms 99.000% 1.72ms 99.900% 1.83ms 99.990% 2.11ms <<< no 40+ ms extra latency 99.999% 2.53ms 100.000% 2.62ms Patch has been also tested on x86 (m7i.2xlarge instance) which it is not affected by this issue and the patch doesn't introduce any additional delay. Fixes: 7aa5470c2c09 ("tcp: tsq: move tsq_flags close to sk_wmem_alloc") Signed-off-by: Salvatore Dipietro Reviewed-by: Eric Dumazet Link: https://lore.kernel.org/r/20240119190133.43698-1-dipiets@amazon.com Signed-off-by: Paolo Abeni Signed-off-by: Sasha Levin commit 3b91566b0bebe3f7699ddf2ed595b5241dfdf0d5 Author: David Howells Date: Tue Jan 2 14:02:37 2024 +0000 afs: Fix error handling with lookup via FS.InlineBulkStatus [ Upstream commit 17ba6f0bd14fe3ac606aac6bebe5e69bdaad8ba1 ] When afs does a lookup, it tries to use FS.InlineBulkStatus to preemptively look up a bunch of files in the parent directory and cache this locally, on the basis that we might want to look at them too (for example if someone does an ls on a directory, they may want want to then stat every file listed). FS.InlineBulkStatus can be considered a compound op with the normal abort code applying to the compound as a whole. Each status fetch within the compound is then given its own individual abort code - but assuming no error that prevents the bulk fetch from returning the compound result will be 0, even if all the constituent status fetches failed. At the conclusion of afs_do_lookup(), we should use the abort code from the appropriate status to determine the error to return, if any - but instead it is assumed that we were successful if the op as a whole succeeded and we return an incompletely initialised inode, resulting in ENOENT, no matter the actual reason. In the particular instance reported, a vnode with no permission granted to be accessed is being given a UAEACCES abort code which should be reported as EACCES, but is instead being reported as ENOENT. Fix this by abandoning the inode (which will be cleaned up with the op) if file[1] has an abort code indicated and turn that abort code into an error instead. Whilst we're at it, add a tracepoint so that the abort codes of the individual subrequests of FS.InlineBulkStatus can be logged. At the moment only the container abort code can be 0. Fixes: e49c7b2f6de7 ("afs: Build an abstraction around an "operation" concept") Reported-by: Jeffrey Altman Signed-off-by: David Howells Reviewed-by: Marc Dionne cc: linux-afs@lists.infradead.org Signed-off-by: Sasha Levin commit ca8e060fd4cb11d174fb67c1af0f6d5d4b78c5f6 Author: David Howells Date: Wed Oct 25 17:53:33 2023 +0100 afs: Simplify error handling [ Upstream commit aa453becce5d1ae1b94b7fc22f47d7b05d22b14e ] Simplify error handling a bit by moving it from the afs_addr_cursor struct to the afs_operation and afs_vl_cursor structs and using the error prioritisation function for accumulating errors from multiple sources (AFS tries to rotate between multiple fileservers, some of which may be inaccessible or in some state of offlinedness). Signed-off-by: David Howells cc: Marc Dionne cc: linux-afs@lists.infradead.org Stable-dep-of: 17ba6f0bd14f ("afs: Fix error handling with lookup via FS.InlineBulkStatus") Signed-off-by: Sasha Levin commit bc0a7ccd9ebf5c7d7245c6f560e17b443009a410 Author: David Howells Date: Thu Oct 26 09:54:07 2023 +0100 afs: Don't put afs_call in afs_wait_for_call_to_complete() [ Upstream commit 6f2ff7e89bd05677f4c08fccafcf625ca3e09c1c ] Don't put the afs_call struct in afs_wait_for_call_to_complete() but rather have the caller do it. This will allow the caller to fish stuff out of the afs_call struct rather than the afs_addr_cursor struct, thereby allowing a subsequent patch to subsume it. Signed-off-by: David Howells cc: Marc Dionne cc: linux-afs@lists.infradead.org Stable-dep-of: 17ba6f0bd14f ("afs: Fix error handling with lookup via FS.InlineBulkStatus") Signed-off-by: Sasha Levin commit eeb4b31f4e1f0a3f2249e99289fd88bc61685dd4 Author: David Howells Date: Thu Oct 26 09:43:23 2023 +0100 afs: Wrap most op->error accesses with inline funcs [ Upstream commit 2de5599f63babb416e09b1a6be429a47910dd47c ] Wrap most op->error accesses with inline funcs which will make it easier for a subsequent patch to replace op->error with something else. Two functions are added to this end: (1) afs_op_error() - Get the error code. (2) afs_op_set_error() - Set the error code. Signed-off-by: David Howells cc: Marc Dionne cc: linux-afs@lists.infradead.org Stable-dep-of: 17ba6f0bd14f ("afs: Fix error handling with lookup via FS.InlineBulkStatus") Signed-off-by: Sasha Levin commit 06fefb477d42b8ff405b721c05aba56962cd48d0 Author: David Howells Date: Fri Oct 20 16:04:52 2023 +0100 afs: Use op->nr_iterations=-1 to indicate to begin fileserver iteration [ Upstream commit 075171fd22be33acf4ab354814bfa6de1c3412ce ] Set op->nr_iterations to -1 to indicate that we need to begin fileserver iteration rather than setting error to SHRT_MAX. This makes it easier to eliminate the address cursor. Signed-off-by: David Howells cc: Marc Dionne cc: linux-afs@lists.infradead.org Stable-dep-of: 17ba6f0bd14f ("afs: Fix error handling with lookup via FS.InlineBulkStatus") Signed-off-by: Sasha Levin commit 3cd9a24aaa63c7f15b8b9588b911312dd62bad4c Author: David Howells Date: Fri Oct 20 16:00:18 2023 +0100 afs: Handle the VIO and UAEIO aborts explicitly [ Upstream commit eb8eae65f0c713bcef84b082aa919f72c3d83268 ] When processing the result of a call, handle the VIO and UAEIO abort specifically rather than leaving it to a default case. Rather than erroring out unconditionally, see if there's another server if the volume has more than one server available, otherwise return -EREMOTEIO. Signed-off-by: David Howells cc: Marc Dionne cc: linux-afs@lists.infradead.org Stable-dep-of: 17ba6f0bd14f ("afs: Fix error handling with lookup via FS.InlineBulkStatus") Signed-off-by: Sasha Levin commit 056fc740be000d39a7dba700a935f3bbfbc664e6 Author: David Howells Date: Thu Oct 19 12:55:11 2023 +0100 rxrpc, afs: Allow afs to pin rxrpc_peer objects [ Upstream commit 72904d7b9bfbf2dd146254edea93958bc35bbbfe ] Change rxrpc's API such that: (1) A new function, rxrpc_kernel_lookup_peer(), is provided to look up an rxrpc_peer record for a remote address and a corresponding function, rxrpc_kernel_put_peer(), is provided to dispose of it again. (2) When setting up a call, the rxrpc_peer object used during a call is now passed in rather than being set up by rxrpc_connect_call(). For afs, this meenat passing it to rxrpc_kernel_begin_call() rather than the full address (the service ID then has to be passed in as a separate parameter). (3) A new function, rxrpc_kernel_remote_addr(), is added so that afs can get a pointer to the transport address for display purposed, and another, rxrpc_kernel_remote_srx(), to gain a pointer to the full rxrpc address. (4) The function to retrieve the RTT from a call, rxrpc_kernel_get_srtt(), is then altered to take a peer. This now returns the RTT or -1 if there are insufficient samples. (5) Rename rxrpc_kernel_get_peer() to rxrpc_kernel_call_get_peer(). (6) Provide a new function, rxrpc_kernel_get_peer(), to get a ref on a peer the caller already has. This allows the afs filesystem to pin the rxrpc_peer records that it is using, allowing faster lookups and pointer comparisons rather than comparing sockaddr_rxrpc contents. It also makes it easier to get hold of the RTT. The following changes are made to afs: (1) The addr_list struct's addrs[] elements now hold a peer struct pointer and a service ID rather than a sockaddr_rxrpc. (2) When displaying the transport address, rxrpc_kernel_remote_addr() is used. (3) The port arg is removed from afs_alloc_addrlist() since it's always overridden. (4) afs_merge_fs_addr4() and afs_merge_fs_addr6() do peer lookup and may now return an error that must be handled. (5) afs_find_server() now takes a peer pointer to specify the address. (6) afs_find_server(), afs_compare_fs_alists() and afs_merge_fs_addr[46]{} now do peer pointer comparison rather than address comparison. Signed-off-by: David Howells cc: Marc Dionne cc: linux-afs@lists.infradead.org Stable-dep-of: 17ba6f0bd14f ("afs: Fix error handling with lookup via FS.InlineBulkStatus") Signed-off-by: Sasha Levin commit b87cf8db84d29d3d880d9f75729a259937ececfa Author: David Howells Date: Wed Oct 18 15:38:14 2023 +0100 afs: Turn the afs_addr_list address array into an array of structs [ Upstream commit 07f3502b33a260f873e35708d2fa693eb52225cb ] Turn the afs_addr_list address array into an array of structs, thereby allowing per-address (such as RTT) info to be added. Signed-off-by: David Howells cc: Marc Dionne cc: linux-afs@lists.infradead.org Stable-dep-of: 17ba6f0bd14f ("afs: Fix error handling with lookup via FS.InlineBulkStatus") Signed-off-by: Sasha Levin commit b838e39d922930ddc097c7160931adfbadc3dc1f Author: David Howells Date: Wed Oct 18 08:42:18 2023 +0100 afs: Add comments on abort handling [ Upstream commit fe245c8fcdac339e6b42076c828a6bede3a5e948 ] Add some comments on AFS abort code handling in the rotation algorithm and adjust the errors produced to match. Reported-by: Jeffrey E Altman Signed-off-by: David Howells Reviewed-by: Jeffrey Altman cc: Marc Dionne cc: linux-afs@lists.infradead.org Stable-dep-of: 17ba6f0bd14f ("afs: Fix error handling with lookup via FS.InlineBulkStatus") Signed-off-by: Sasha Levin commit 3f3aae10b7b06b5170faa781da50b9d38f0dfeb9 Author: Oleg Nesterov Date: Thu Nov 30 12:56:14 2023 +0100 afs: fix the usage of read_seqbegin_or_lock() in afs_find_server*() [ Upstream commit 1702e0654ca9a7bcd7c7619c8a5004db58945b71 ] David Howells says: (5) afs_find_server(). There could be a lot of servers in the list and each server can have multiple addresses, so I think this would be better with an exclusive second pass. The server list isn't likely to change all that often, but when it does change, there's a good chance several servers are going to be added/removed one after the other. Further, this is only going to be used for incoming cache management/callback requests from the server, which hopefully aren't going to happen too often - but it is remotely drivable. (6) afs_find_server_by_uuid(). Similarly to (5), there could be a lot of servers to search through, but they are in a tree not a flat list, so it should be faster to process. Again, it's not likely to change that often and, again, when it does change it's likely to involve multiple changes. This can be driven remotely by an incoming cache management request but is mostly going to be driven by setting up or reconfiguring a volume's server list - something that also isn't likely to happen often. Make the "seq" counter odd on the 2nd pass, otherwise read_seqbegin_or_lock() never takes the lock. Signed-off-by: Oleg Nesterov Signed-off-by: David Howells cc: Marc Dionne cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/20231130115614.GA21581@redhat.com/ Stable-dep-of: 17ba6f0bd14f ("afs: Fix error handling with lookup via FS.InlineBulkStatus") Signed-off-by: Sasha Levin commit fa70c6954aabbfbca1fe39b9b60f82cf2e8cec38 Author: David Howells Date: Mon Jan 8 17:22:36 2024 +0000 afs: Hide silly-rename files from userspace [ Upstream commit 57e9d49c54528c49b8bffe6d99d782ea051ea534 ] There appears to be a race between silly-rename files being created/removed and various userspace tools iterating over the contents of a directory, leading to such errors as: find: './kernel/.tmp_cpio_dir/include/dt-bindings/reset/.__afs2080': No such file or directory tar: ./include/linux/greybus/.__afs3C95: File removed before we read it when building a kernel. Fix afs_readdir() so that it doesn't return .__afsXXXX silly-rename files to userspace. This doesn't stop them being looked up directly by name as we need to be able to look them up from within the kernel as part of the silly-rename algorithm. Fixes: 79ddbfa500b3 ("afs: Implement sillyrename for unlink and rename") Signed-off-by: David Howells cc: Marc Dionne cc: linux-afs@lists.infradead.org Signed-off-by: Sasha Levin commit bf4aeff7da85c3becd39fb73bac94122331c30fb Author: Petr Pavlu Date: Mon Jan 22 16:09:28 2024 +0100 tracing: Ensure visibility when inserting an element into tracing_map [ Upstream commit 2b44760609e9eaafc9d234a6883d042fc21132a7 ] Running the following two commands in parallel on a multi-processor AArch64 machine can sporadically produce an unexpected warning about duplicate histogram entries: $ while true; do echo hist:key=id.syscall:val=hitcount > \ /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist sleep 0.001 done $ stress-ng --sysbadaddr $(nproc) The warning looks as follows: [ 2911.172474] ------------[ cut here ]------------ [ 2911.173111] Duplicates detected: 1 [ 2911.173574] WARNING: CPU: 2 PID: 12247 at kernel/trace/tracing_map.c:983 tracing_map_sort_entries+0x3e0/0x408 [ 2911.174702] Modules linked in: iscsi_ibft(E) iscsi_boot_sysfs(E) rfkill(E) af_packet(E) nls_iso8859_1(E) nls_cp437(E) vfat(E) fat(E) ena(E) tiny_power_button(E) qemu_fw_cfg(E) button(E) fuse(E) efi_pstore(E) ip_tables(E) x_tables(E) xfs(E) libcrc32c(E) aes_ce_blk(E) aes_ce_cipher(E) crct10dif_ce(E) polyval_ce(E) polyval_generic(E) ghash_ce(E) gf128mul(E) sm4_ce_gcm(E) sm4_ce_ccm(E) sm4_ce(E) sm4_ce_cipher(E) sm4(E) sm3_ce(E) sm3(E) sha3_ce(E) sha512_ce(E) sha512_arm64(E) sha2_ce(E) sha256_arm64(E) nvme(E) sha1_ce(E) nvme_core(E) nvme_auth(E) t10_pi(E) sg(E) scsi_mod(E) scsi_common(E) efivarfs(E) [ 2911.174738] Unloaded tainted modules: cppc_cpufreq(E):1 [ 2911.180985] CPU: 2 PID: 12247 Comm: cat Kdump: loaded Tainted: G E 6.7.0-default #2 1b58bbb22c97e4399dc09f92d309344f69c44a01 [ 2911.182398] Hardware name: Amazon EC2 c7g.8xlarge/, BIOS 1.0 11/1/2018 [ 2911.183208] pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) [ 2911.184038] pc : tracing_map_sort_entries+0x3e0/0x408 [ 2911.184667] lr : tracing_map_sort_entries+0x3e0/0x408 [ 2911.185310] sp : ffff8000a1513900 [ 2911.185750] x29: ffff8000a1513900 x28: ffff0003f272fe80 x27: 0000000000000001 [ 2911.186600] x26: ffff0003f272fe80 x25: 0000000000000030 x24: 0000000000000008 [ 2911.187458] x23: ffff0003c5788000 x22: ffff0003c16710c8 x21: ffff80008017f180 [ 2911.188310] x20: ffff80008017f000 x19: ffff80008017f180 x18: ffffffffffffffff [ 2911.189160] x17: 0000000000000000 x16: 0000000000000000 x15: ffff8000a15134b8 [ 2911.190015] x14: 0000000000000000 x13: 205d373432323154 x12: 5b5d313131333731 [ 2911.190844] x11: 00000000fffeffff x10: 00000000fffeffff x9 : ffffd1b78274a13c [ 2911.191716] x8 : 000000000017ffe8 x7 : c0000000fffeffff x6 : 000000000057ffa8 [ 2911.192554] x5 : ffff0012f6c24ec0 x4 : 0000000000000000 x3 : ffff2e5b72b5d000 [ 2911.193404] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0003ff254480 [ 2911.194259] Call trace: [ 2911.194626] tracing_map_sort_entries+0x3e0/0x408 [ 2911.195220] hist_show+0x124/0x800 [ 2911.195692] seq_read_iter+0x1d4/0x4e8 [ 2911.196193] seq_read+0xe8/0x138 [ 2911.196638] vfs_read+0xc8/0x300 [ 2911.197078] ksys_read+0x70/0x108 [ 2911.197534] __arm64_sys_read+0x24/0x38 [ 2911.198046] invoke_syscall+0x78/0x108 [ 2911.198553] el0_svc_common.constprop.0+0xd0/0xf8 [ 2911.199157] do_el0_svc+0x28/0x40 [ 2911.199613] el0_svc+0x40/0x178 [ 2911.200048] el0t_64_sync_handler+0x13c/0x158 [ 2911.200621] el0t_64_sync+0x1a8/0x1b0 [ 2911.201115] ---[ end trace 0000000000000000 ]--- The problem appears to be caused by CPU reordering of writes issued from __tracing_map_insert(). The check for the presence of an element with a given key in this function is: val = READ_ONCE(entry->val); if (val && keys_match(key, val->key, map->key_size)) ... The write of a new entry is: elt = get_free_elt(map); memcpy(elt->key, key, map->key_size); entry->val = elt; The "memcpy(elt->key, key, map->key_size);" and "entry->val = elt;" stores may become visible in the reversed order on another CPU. This second CPU might then incorrectly determine that a new key doesn't match an already present val->key and subsequently insert a new element, resulting in a duplicate. Fix the problem by adding a write barrier between "memcpy(elt->key, key, map->key_size);" and "entry->val = elt;", and for good measure, also use WRITE_ONCE(entry->val, elt) for publishing the element. The sequence pairs with the mentioned "READ_ONCE(entry->val);" and the "val->key" check which has an address dependency. The barrier is placed on a path executed when adding an element for a new key. Subsequent updates targeting the same key remain unaffected. From the user's perspective, the issue was introduced by commit c193707dde77 ("tracing: Remove code which merges duplicates"), which followed commit cbf4100efb8f ("tracing: Add support to detect and avoid duplicates"). The previous code operated differently; it inherently expected potential races which result in duplicates but merged them later when they occurred. Link: https://lore.kernel.org/linux-trace-kernel/20240122150928.27725-1-petr.pavlu@suse.com Fixes: c193707dde77 ("tracing: Remove code which merges duplicates") Signed-off-by: Petr Pavlu Acked-by: Tom Zanussi Signed-off-by: Steven Rostedt (Google) Signed-off-by: Sasha Levin commit 4200ad3e46ce50f410fdda302745489441bc70f0 Author: Dan Carpenter Date: Fri Jan 12 09:59:41 2024 +0300 netfs, fscache: Prevent Oops in fscache_put_cache() [ Upstream commit 3be0b3ed1d76c6703b9ee482b55f7e01c369cc68 ] This function dereferences "cache" and then checks if it's IS_ERR_OR_NULL(). Check first, then dereference. Fixes: 9549332df4ed ("fscache: Implement cache registration") Signed-off-by: Dan Carpenter Signed-off-by: David Howells Link: https://lore.kernel.org/r/e84bc740-3502-4f16-982a-a40d5676615c@moroto.mountain/ # v2 Signed-off-by: Sasha Levin commit 0b787c2dea15e7a2828fa3a74a5447df4ed57711 Author: Sharath Srinivasan Date: Fri Jan 19 17:48:39 2024 -0800 net/rds: Fix UBSAN: array-index-out-of-bounds in rds_cmsg_recv [ Upstream commit 13e788deb7348cc88df34bed736c3b3b9927ea52 ] Syzcaller UBSAN crash occurs in rds_cmsg_recv(), which reads inc->i_rx_lat_trace[j + 1] with index 4 (3 + 1), but with array size of 4 (RDS_RX_MAX_TRACES). Here 'j' is assigned from rs->rs_rx_trace[i] and in-turn from trace.rx_trace_pos[i] in rds_recv_track_latency(), with both arrays sized 3 (RDS_MSG_RX_DGRAM_TRACE_MAX). So fix the off-by-one bounds check in rds_recv_track_latency() to prevent a potential crash in rds_cmsg_recv(). Found by syzcaller: ================================================================= UBSAN: array-index-out-of-bounds in net/rds/recv.c:585:39 index 4 is out of range for type 'u64 [4]' CPU: 1 PID: 8058 Comm: syz-executor228 Not tainted 6.6.0-gd2f51b3516da #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x136/0x150 lib/dump_stack.c:106 ubsan_epilogue lib/ubsan.c:217 [inline] __ubsan_handle_out_of_bounds+0xd5/0x130 lib/ubsan.c:348 rds_cmsg_recv+0x60d/0x700 net/rds/recv.c:585 rds_recvmsg+0x3fb/0x1610 net/rds/recv.c:716 sock_recvmsg_nosec net/socket.c:1044 [inline] sock_recvmsg+0xe2/0x160 net/socket.c:1066 __sys_recvfrom+0x1b6/0x2f0 net/socket.c:2246 __do_sys_recvfrom net/socket.c:2264 [inline] __se_sys_recvfrom net/socket.c:2260 [inline] __x64_sys_recvfrom+0xe0/0x1b0 net/socket.c:2260 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b ================================================================== Fixes: 3289025aedc0 ("RDS: add receive message trace used by application") Reported-by: Chenyuan Yang Closes: https://lore.kernel.org/linux-rdma/CALGdzuoVdq-wtQ4Az9iottBqC5cv9ZhcE5q8N7LfYFvkRsOVcw@mail.gmail.com/ Signed-off-by: Sharath Srinivasan Reviewed-by: Simon Horman Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 3d0b8f9bd0c15239e23da22f048d9a3daa91cc15 Author: Horatiu Vultur Date: Fri Jan 19 11:47:50 2024 +0100 net: micrel: Fix PTP frame parsing for lan8814 [ Upstream commit aaf632f7ab6dec57bc9329a438f94504fe8034b9 ] The HW has the capability to check each frame if it is a PTP frame, which domain it is, which ptp frame type it is, different ip address in the frame. And if one of these checks fail then the frame is not timestamp. Most of these checks were disabled except checking the field minorVersionPTP inside the PTP header. Meaning that once a partner sends a frame compliant to 8021AS which has minorVersionPTP set to 1, then the frame was not timestamp because the HW expected by default a value of 0 in minorVersionPTP. This is exactly the same issue as on lan8841. Fix this issue by removing this check so the userspace can decide on this. Fixes: ece19502834d ("net: phy: micrel: 1588 support for LAN8814 phy") Signed-off-by: Horatiu Vultur Reviewed-by: Maxime Chevallier Reviewed-by: Divya Koppera Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit fe7fc23a95035b08fae299655c34e495a03ca6db Author: Arkadiusz Kubalewski Date: Fri Jan 19 14:43:04 2024 +0100 dpll: fix register pin with unregistered parent pin [ Upstream commit 7dc5b18ff71bd6f948810ab8a08b6a6ff8b315c5 ] In case of multiple kernel module instances using the same dpll device: if only one registers dpll device, then only that one can register directly connected pins with a dpll device. When unregistered parent is responsible for determining if the muxed pin can be registered with it or not, the drivers need to be loaded in serialized order to work correctly - first the driver instance which registers the direct pins needs to be loaded, then the other instances could register muxed type pins. Allow registration of a pin with a parent even if the parent was not yet registered, thus allow ability for unserialized driver instance load order. Do not WARN_ON notification for unregistered pin, which can be invoked for described case, instead just return error. Fixes: 9431063ad323 ("dpll: core: Add DPLL framework base functions") Fixes: 9d71b54b65b1 ("dpll: netlink: Add DPLL framework base functions") Reviewed-by: Jan Glaza Reviewed-by: Jiri Pirko Signed-off-by: Arkadiusz Kubalewski Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 1e741ec11f244f0ee5e5338c984b7c9880864e32 Author: Arkadiusz Kubalewski Date: Fri Jan 19 14:43:03 2024 +0100 dpll: fix userspace availability of pins [ Upstream commit db2ec3c94667eaeecc6a74d96594fab6baf80fdc ] If parent pin was unregistered but child pin was not, the userspace would see the "zombie" pins - the ones that were registered with a parent pin (dpll_pin_on_pin_register(..)). Technically those are not available - as there is no dpll device in the system. Do not dump those pins and prevent userspace from any interaction with them. Provide a unified function to determine if the pin is available and use it before acting/responding for user requests. Fixes: 9d71b54b65b1 ("dpll: netlink: Add DPLL framework base functions") Reviewed-by: Jan Glaza Reviewed-by: Jiri Pirko Signed-off-by: Arkadiusz Kubalewski Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 5050a5b9d8b4d3c6f7e376e07670e437db7ccf9c Author: Arkadiusz Kubalewski Date: Fri Jan 19 14:43:02 2024 +0100 dpll: fix pin dump crash for rebound module [ Upstream commit 830ead5fb0c5855ce4d70ba2ed4a673b5f1e7d9b ] When a kernel module is unbound but the pin resources were not entirely freed (other kernel module instance of the same PCI device have had kept the reference to that pin), and kernel module is again bound, the pin properties would not be updated (the properties are only assigned when memory for the pin is allocated), prop pointer still points to the kernel module memory of the kernel module which was deallocated on the unbind. If the pin dump is invoked in this state, the result is a kernel crash. Prevent the crash by storing persistent pin properties in dpll subsystem, copy the content from the kernel module when pin is allocated, instead of using memory of the kernel module. Fixes: 9431063ad323 ("dpll: core: Add DPLL framework base functions") Fixes: 9d71b54b65b1 ("dpll: netlink: Add DPLL framework base functions") Reviewed-by: Jan Glaza Reviewed-by: Przemek Kitszel Signed-off-by: Arkadiusz Kubalewski Reviewed-by: Jiri Pirko Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 79fc259dca8689150c30869e3c667fb635d543c8 Author: Arkadiusz Kubalewski Date: Fri Jan 19 14:43:01 2024 +0100 dpll: fix broken error path in dpll_pin_alloc(..) [ Upstream commit b6a11a7fc4d6337f7ea720b9287d1b9749c4eae0 ] If pin type is not expected, or pin properities failed to allocate memory, the unwind error path shall not destroy pin's xarrays, which were not yet initialized. Add new goto label and use it to fix broken error path. Reviewed-by: Jiri Pirko Signed-off-by: Arkadiusz Kubalewski Signed-off-by: David S. Miller Stable-dep-of: 830ead5fb0c5 ("dpll: fix pin dump crash for rebound module") Signed-off-by: Sasha Levin commit 6438382dd9f80ba8d51ad8b86d20061b783d8702 Author: Yunjian Wang Date: Fri Jan 19 18:22:56 2024 +0800 tun: add missing rx stats accounting in tun_xdp_act [ Upstream commit f1084c427f55d573fcd5688d9ba7b31b78019716 ] The TUN can be used as vhost-net backend, and it is necessary to count the packets transmitted from TUN to vhost-net/virtio-net. However, there are some places in the receive path that were not taken into account when using XDP. It would be beneficial to also include new accounting for successfully received bytes using dev_sw_netstats_rx_add. Fixes: 761876c857cb ("tap: XDP support") Signed-off-by: Yunjian Wang Reviewed-by: Willem de Bruijn Acked-by: Jason Wang Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 4efd09da0d49fb2a5d0138f339d973e7a41a966c Author: Yunjian Wang Date: Fri Jan 19 18:22:35 2024 +0800 tun: fix missing dropped counter in tun_xdp_act [ Upstream commit 5744ba05e7c4bff8fec133dd0f9e51ddffba92f5 ] The commit 8ae1aff0b331 ("tuntap: split out XDP logic") includes dropped counter for XDP_DROP, XDP_ABORTED, and invalid XDP actions. Unfortunately, that commit missed the dropped counter when error occurs during XDP_TX and XDP_REDIRECT actions. This patch fixes this issue. Fixes: 8ae1aff0b331 ("tuntap: split out XDP logic") Signed-off-by: Yunjian Wang Reviewed-by: Willem de Bruijn Acked-by: Jason Wang Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 8072699aa9e67d1727692cfb3c347263bb627fb9 Author: Jakub Kicinski Date: Thu Jan 18 16:58:59 2024 -0800 net: fix removing a namespace with conflicting altnames [ Upstream commit d09486a04f5da0a812c26217213b89a3b1acf836 ] Mark reports a BUG() when a net namespace is removed. kernel BUG at net/core/dev.c:11520! Physical interfaces moved outside of init_net get "refunded" to init_net when that namespace disappears. The main interface name may get overwritten in the process if it would have conflicted. We need to also discard all conflicting altnames. Recent fixes addressed ensuring that altnames get moved with the main interface, which surfaced this problem. Reported-by: Марк Коренберг Link: https://lore.kernel.org/all/CAEmTpZFZ4Sv3KwqFOY2WKDHeZYdi0O7N5H1nTvcGp=SAEavtDg@mail.gmail.com/ Fixes: 7663d522099e ("net: check for altname conflicts when changing netdev's netns") Signed-off-by: Jakub Kicinski Reviewed-by: Eric Dumazet Reviewed-by: Jiri Pirko Reviewed-by: Xin Long Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 22346088d33f051a0c848d711534daa3e47cf6af Author: Michal Schmidt Date: Thu Jan 18 21:50:40 2024 +0100 idpf: distinguish vports by the dev_port attribute [ Upstream commit 359724fa3ab79fbe9f42c6263cddc2afae32eef3 ] idpf registers multiple netdevs (virtual ports) for one PCI function, but it does not provide a way for userspace to distinguish them with sysfs attributes. Per Documentation/ABI/testing/sysfs-class-net, it is a bug not to set dev_port for independent ports on the same PCI bus, device and function. Without dev_port set, systemd-udevd's default naming policy attempts to assign the same name ("ens2f0") to all four idpf netdevs on my test system and obviously fails, leaving three of them with the initial eth name. With this patch, systemd-udevd is able to assign unique names to the netdevs (e.g. "ens2f0", "ens2f0d1", "ens2f0d2", "ens2f0d3"). The Intel-provided out-of-tree idpf driver already sets dev_port. In this patch I chose to do it in the same place in the idpf_cfg_netdev function. Fixes: 0fe45467a104 ("idpf: add create vport and netdev configuration") Signed-off-by: Michal Schmidt Reviewed-by: Jesse Brandeburg Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 4ff8c69389f221ac1ebedd4fb15be6d2b83d44bb Author: Eric Dumazet Date: Thu Jan 18 20:17:49 2024 +0000 udp: fix busy polling [ Upstream commit a54d51fb2dfb846aedf3751af501e9688db447f5 ] Generic sk_busy_loop_end() only looks at sk->sk_receive_queue for presence of packets. Problem is that for UDP sockets after blamed commit, some packets could be present in another queue: udp_sk(sk)->reader_queue In some cases, a busy poller could spin until timeout expiration, even if some packets are available in udp_sk(sk)->reader_queue. v3: - make sk_busy_loop_end() nicer (Willem) v2: - add a READ_ONCE(sk->sk_family) in sk_is_inet() to avoid KCSAN splats. - add a sk_is_inet() check in sk_is_udp() (Willem feedback) - add a sk_is_inet() check in sk_is_tcp(). Fixes: 2276f58ac589 ("udp: use a separate rx queue for packet reception") Signed-off-by: Eric Dumazet Reviewed-by: Paolo Abeni Reviewed-by: Willem de Bruijn Reviewed-by: Kuniyuki Iwashima Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit df57fc2f2abf548aa889a36ab0bdcc94a75399dc Author: Kuniyuki Iwashima Date: Thu Jan 18 17:55:15 2024 -0800 llc: Drop support for ETH_P_TR_802_2. [ Upstream commit e3f9bed9bee261e3347131764e42aeedf1ffea61 ] syzbot reported an uninit-value bug below. [0] llc supports ETH_P_802_2 (0x0004) and used to support ETH_P_TR_802_2 (0x0011), and syzbot abused the latter to trigger the bug. write$tun(r0, &(0x7f0000000040)={@val={0x0, 0x11}, @val, @mpls={[], @llc={@snap={0xaa, 0x1, ')', "90e5dd"}}}}, 0x16) llc_conn_handler() initialises local variables {saddr,daddr}.mac based on skb in llc_pdu_decode_sa()/llc_pdu_decode_da() and passes them to __llc_lookup(). However, the initialisation is done only when skb->protocol is htons(ETH_P_802_2), otherwise, __llc_lookup_established() and __llc_lookup_listener() will read garbage. The missing initialisation existed prior to commit 211ed865108e ("net: delete all instances of special processing for token ring"). It removed the part to kick out the token ring stuff but forgot to close the door allowing ETH_P_TR_802_2 packets to sneak into llc_rcv(). Let's remove llc_tr_packet_type and complete the deprecation. [0]: BUG: KMSAN: uninit-value in __llc_lookup_established+0xe9d/0xf90 __llc_lookup_established+0xe9d/0xf90 __llc_lookup net/llc/llc_conn.c:611 [inline] llc_conn_handler+0x4bd/0x1360 net/llc/llc_conn.c:791 llc_rcv+0xfbb/0x14a0 net/llc/llc_input.c:206 __netif_receive_skb_one_core net/core/dev.c:5527 [inline] __netif_receive_skb+0x1a6/0x5a0 net/core/dev.c:5641 netif_receive_skb_internal net/core/dev.c:5727 [inline] netif_receive_skb+0x58/0x660 net/core/dev.c:5786 tun_rx_batched+0x3ee/0x980 drivers/net/tun.c:1555 tun_get_user+0x53af/0x66d0 drivers/net/tun.c:2002 tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2048 call_write_iter include/linux/fs.h:2020 [inline] new_sync_write fs/read_write.c:491 [inline] vfs_write+0x8ef/0x1490 fs/read_write.c:584 ksys_write+0x20f/0x4c0 fs/read_write.c:637 __do_sys_write fs/read_write.c:649 [inline] __se_sys_write fs/read_write.c:646 [inline] __x64_sys_write+0x93/0xd0 fs/read_write.c:646 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b Local variable daddr created at: llc_conn_handler+0x53/0x1360 net/llc/llc_conn.c:783 llc_rcv+0xfbb/0x14a0 net/llc/llc_input.c:206 CPU: 1 PID: 5004 Comm: syz-executor994 Not tainted 6.6.0-syzkaller-14500-g1c41041124bd #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023 Fixes: 211ed865108e ("net: delete all instances of special processing for token ring") Reported-by: syzbot+b5ad66046b913bc04c6f@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=b5ad66046b913bc04c6f Signed-off-by: Kuniyuki Iwashima Reviewed-by: Eric Dumazet Link: https://lore.kernel.org/r/20240119015515.61898-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit c451c008f563d56d5e676c9dcafae565fcad84bb Author: Eric Dumazet Date: Thu Jan 18 18:36:25 2024 +0000 llc: make llc_ui_sendmsg() more robust against bonding changes [ Upstream commit dad555c816a50c6a6a8a86be1f9177673918c647 ] syzbot was able to trick llc_ui_sendmsg(), allocating an skb with no headroom, but subsequently trying to push 14 bytes of Ethernet header [1] Like some others, llc_ui_sendmsg() releases the socket lock before calling sock_alloc_send_skb(). Then it acquires it again, but does not redo all the sanity checks that were performed. This fix: - Uses LL_RESERVED_SPACE() to reserve space. - Check all conditions again after socket lock is held again. - Do not account Ethernet header for mtu limitation. [1] skbuff: skb_under_panic: text:ffff800088baa334 len:1514 put:14 head:ffff0000c9c37000 data:ffff0000c9c36ff2 tail:0x5dc end:0x6c0 dev:bond0 kernel BUG at net/core/skbuff.c:193 ! Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 6875 Comm: syz-executor.0 Not tainted 6.7.0-rc8-syzkaller-00101-g0802e17d9aca-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023 pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : skb_panic net/core/skbuff.c:189 [inline] pc : skb_under_panic+0x13c/0x140 net/core/skbuff.c:203 lr : skb_panic net/core/skbuff.c:189 [inline] lr : skb_under_panic+0x13c/0x140 net/core/skbuff.c:203 sp : ffff800096f97000 x29: ffff800096f97010 x28: ffff80008cc8d668 x27: dfff800000000000 x26: ffff0000cb970c90 x25: 00000000000005dc x24: ffff0000c9c36ff2 x23: ffff0000c9c37000 x22: 00000000000005ea x21: 00000000000006c0 x20: 000000000000000e x19: ffff800088baa334 x18: 1fffe000368261ce x17: ffff80008e4ed000 x16: ffff80008a8310f8 x15: 0000000000000001 x14: 1ffff00012df2d58 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000001 x10: 0000000000ff0100 x9 : e28a51f1087e8400 x8 : e28a51f1087e8400 x7 : ffff80008028f8d0 x6 : 0000000000000000 x5 : 0000000000000001 x4 : 0000000000000001 x3 : ffff800082b78714 x2 : 0000000000000001 x1 : 0000000100000000 x0 : 0000000000000089 Call trace: skb_panic net/core/skbuff.c:189 [inline] skb_under_panic+0x13c/0x140 net/core/skbuff.c:203 skb_push+0xf0/0x108 net/core/skbuff.c:2451 eth_header+0x44/0x1f8 net/ethernet/eth.c:83 dev_hard_header include/linux/netdevice.h:3188 [inline] llc_mac_hdr_init+0x110/0x17c net/llc/llc_output.c:33 llc_sap_action_send_xid_c+0x170/0x344 net/llc/llc_s_ac.c:85 llc_exec_sap_trans_actions net/llc/llc_sap.c:153 [inline] llc_sap_next_state net/llc/llc_sap.c:182 [inline] llc_sap_state_process+0x1ec/0x774 net/llc/llc_sap.c:209 llc_build_and_send_xid_pkt+0x12c/0x1c0 net/llc/llc_sap.c:270 llc_ui_sendmsg+0x7bc/0xb1c net/llc/af_llc.c:997 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg net/socket.c:745 [inline] sock_sendmsg+0x194/0x274 net/socket.c:767 splice_to_socket+0x7cc/0xd58 fs/splice.c:881 do_splice_from fs/splice.c:933 [inline] direct_splice_actor+0xe4/0x1c0 fs/splice.c:1142 splice_direct_to_actor+0x2a0/0x7e4 fs/splice.c:1088 do_splice_direct+0x20c/0x348 fs/splice.c:1194 do_sendfile+0x4bc/0xc70 fs/read_write.c:1254 __do_sys_sendfile64 fs/read_write.c:1322 [inline] __se_sys_sendfile64 fs/read_write.c:1308 [inline] __arm64_sys_sendfile64+0x160/0x3b4 fs/read_write.c:1308 __invoke_syscall arch/arm64/kernel/syscall.c:37 [inline] invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:51 el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:136 do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:155 el0_svc+0x54/0x158 arch/arm64/kernel/entry-common.c:678 el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:696 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:595 Code: aa1803e6 aa1903e7 a90023f5 94792f6a (d4210000) Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-and-tested-by: syzbot+2a7024e9502df538e8ef@syzkaller.appspotmail.com Signed-off-by: Eric Dumazet Reviewed-by: Kuniyuki Iwashima Link: https://lore.kernel.org/r/20240118183625.4007013-1-edumazet@google.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 6260c32d63d53347b9c29dede29eb4833bda8c59 Author: Lin Ma Date: Thu Jan 18 21:03:06 2024 +0800 vlan: skip nested type that is not IFLA_VLAN_QOS_MAPPING [ Upstream commit 6c21660fe221a15c789dee2bc2fd95516bc5aeaf ] In the vlan_changelink function, a loop is used to parse the nested attributes IFLA_VLAN_EGRESS_QOS and IFLA_VLAN_INGRESS_QOS in order to obtain the struct ifla_vlan_qos_mapping. These two nested attributes are checked in the vlan_validate_qos_map function, which calls nla_validate_nested_deprecated with the vlan_map_policy. However, this deprecated validator applies a LIBERAL strictness, allowing the presence of an attribute with the type IFLA_VLAN_QOS_UNSPEC. Consequently, the loop in vlan_changelink may parse an attribute of type IFLA_VLAN_QOS_UNSPEC and believe it carries a payload of struct ifla_vlan_qos_mapping, which is not necessarily true. To address this issue and ensure compatibility, this patch introduces two type checks that skip attributes whose type is not IFLA_VLAN_QOS_MAPPING. Fixes: 07b5b17e157b ("[VLAN]: Use rtnl_link API") Signed-off-by: Lin Ma Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20240118130306.1644001-1-linma@zju.edu.cn Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 5338331880831ca91598f3d7f4d57824a0556f4b Author: Michael Chan Date: Wed Jan 17 15:45:14 2024 -0800 bnxt_en: Prevent kernel warning when running offline self test [ Upstream commit c20f482129a582455f02eb9a6dcb2a4215274599 ] We call bnxt_half_open_nic() to setup the chip partially to run loopback tests. The rings and buffers are initialized normally so that we can transmit and receive packets in loopback mode. That means page pool buffers are allocated for the aggregation ring just like the normal case. NAPI is not needed because we are just polling for the loopback packets. When we're done with the loopback tests, we call bnxt_half_close_nic() to clean up. When freeing the page pools, we hit a WARN_ON() in page_pool_unlink_napi() because the NAPI state linked to the page pool is uninitialized. The simplest way to avoid this warning is just to initialize the NAPIs during half open and delete the NAPIs during half close. Trying to skip the page pool initialization or skip linking of NAPI during half open will be more complicated. This fix avoids this warning: WARNING: CPU: 4 PID: 46967 at net/core/page_pool.c:946 page_pool_unlink_napi+0x1f/0x30 CPU: 4 PID: 46967 Comm: ethtool Tainted: G S W 6.7.0-rc5+ #22 Hardware name: Dell Inc. PowerEdge R750/06V45N, BIOS 1.3.8 08/31/2021 RIP: 0010:page_pool_unlink_napi+0x1f/0x30 Code: 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 8b 47 18 48 85 c0 74 1b 48 8b 50 10 83 e2 01 74 08 8b 40 34 83 f8 ff 74 02 <0f> 0b 48 c7 47 18 00 00 00 00 c3 cc cc cc cc 66 90 90 90 90 90 90 RSP: 0018:ffa000003d0dfbe8 EFLAGS: 00010246 RAX: ff110003607ce640 RBX: ff110010baf5d000 RCX: 0000000000000008 RDX: 0000000000000000 RSI: ff110001e5e522c0 RDI: ff110010baf5d000 RBP: ff11000145539b40 R08: 0000000000000001 R09: ffffffffc063f641 R10: ff110001361eddb8 R11: 000000000040000f R12: 0000000000000001 R13: 000000000000001c R14: ff1100014553a080 R15: 0000000000003fc0 FS: 00007f9301c4f740(0000) GS:ff1100103fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f91344fa8f0 CR3: 00000003527cc005 CR4: 0000000000771ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: ? __warn+0x81/0x140 ? page_pool_unlink_napi+0x1f/0x30 ? report_bug+0x102/0x200 ? handle_bug+0x44/0x70 ? exc_invalid_op+0x13/0x60 ? asm_exc_invalid_op+0x16/0x20 ? bnxt_free_ring.isra.123+0xb1/0xd0 [bnxt_en] ? page_pool_unlink_napi+0x1f/0x30 page_pool_destroy+0x3e/0x150 bnxt_free_mem+0x441/0x5e0 [bnxt_en] bnxt_half_close_nic+0x2a/0x40 [bnxt_en] bnxt_self_test+0x21d/0x450 [bnxt_en] __dev_ethtool+0xeda/0x2e30 ? native_queued_spin_lock_slowpath+0x17f/0x2b0 ? __link_object+0xa1/0x160 ? _raw_spin_unlock_irqrestore+0x23/0x40 ? __create_object+0x5f/0x90 ? __kmem_cache_alloc_node+0x317/0x3c0 ? dev_ethtool+0x59/0x170 dev_ethtool+0xa7/0x170 dev_ioctl+0xc3/0x530 sock_do_ioctl+0xa8/0xf0 sock_ioctl+0x270/0x310 __x64_sys_ioctl+0x8c/0xc0 do_syscall_64+0x3e/0xf0 entry_SYSCALL_64_after_hwframe+0x6e/0x76 Fixes: 294e39e0d034 ("bnxt: hook NAPIs to page pools") Reviewed-by: Andy Gospodarek Reviewed-by: Ajit Khaparde Signed-off-by: Michael Chan Link: https://lore.kernel.org/r/20240117234515.226944-5-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 4bc28baf37927613a24df363a407eefa163296ae Author: Michael Chan Date: Wed Jan 17 15:45:11 2024 -0800 bnxt_en: Wait for FLR to complete during probe [ Upstream commit 3c1069fa42872f95cf3c6fedf80723d391e12d57 ] The first message to firmware may fail if the device is undergoing FLR. The driver has some recovery logic for this failure scenario but we must wait 100 msec for FLR to complete before proceeding. Otherwise the recovery will always fail. Fixes: ba02629ff6cb ("bnxt_en: log firmware status on firmware init failure") Reviewed-by: Damodharam Ammepalli Signed-off-by: Michael Chan Link: https://lore.kernel.org/r/20240117234515.226944-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 3982fe726a63fb3de6005e534e2ac8ca7e0aca2a Author: Zhengchao Shao Date: Thu Jan 18 09:20:19 2024 +0800 tcp: make sure init the accept_queue's spinlocks once [ Upstream commit 198bc90e0e734e5f98c3d2833e8390cac3df61b2 ] When I run syz's reproduction C program locally, it causes the following issue: pvqspinlock: lock 0xffff9d181cd5c660 has corrupted value 0x0! WARNING: CPU: 19 PID: 21160 at __pv_queued_spin_unlock_slowpath (kernel/locking/qspinlock_paravirt.h:508) Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:__pv_queued_spin_unlock_slowpath (kernel/locking/qspinlock_paravirt.h:508) Code: 73 56 3a ff 90 c3 cc cc cc cc 8b 05 bb 1f 48 01 85 c0 74 05 c3 cc cc cc cc 8b 17 48 89 fe 48 c7 c7 30 20 ce 8f e8 ad 56 42 ff <0f> 0b c3 cc cc cc cc 0f 0b 0f 1f 40 00 90 90 90 90 90 90 90 90 90 RSP: 0018:ffffa8d200604cb8 EFLAGS: 00010282 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff9d1ef60e0908 RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff9d1ef60e0900 RBP: ffff9d181cd5c280 R08: 0000000000000000 R09: 00000000ffff7fff R10: ffffa8d200604b68 R11: ffffffff907dcdc8 R12: 0000000000000000 R13: ffff9d181cd5c660 R14: ffff9d1813a3f330 R15: 0000000000001000 FS: 00007fa110184640(0000) GS:ffff9d1ef60c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000000 CR3: 000000011f65e000 CR4: 00000000000006f0 Call Trace: _raw_spin_unlock (kernel/locking/spinlock.c:186) inet_csk_reqsk_queue_add (net/ipv4/inet_connection_sock.c:1321) inet_csk_complete_hashdance (net/ipv4/inet_connection_sock.c:1358) tcp_check_req (net/ipv4/tcp_minisocks.c:868) tcp_v4_rcv (net/ipv4/tcp_ipv4.c:2260) ip_protocol_deliver_rcu (net/ipv4/ip_input.c:205) ip_local_deliver_finish (net/ipv4/ip_input.c:234) __netif_receive_skb_one_core (net/core/dev.c:5529) process_backlog (./include/linux/rcupdate.h:779) __napi_poll (net/core/dev.c:6533) net_rx_action (net/core/dev.c:6604) __do_softirq (./arch/x86/include/asm/jump_label.h:27) do_softirq (kernel/softirq.c:454 kernel/softirq.c:441) __local_bh_enable_ip (kernel/softirq.c:381) __dev_queue_xmit (net/core/dev.c:4374) ip_finish_output2 (./include/net/neighbour.h:540 net/ipv4/ip_output.c:235) __ip_queue_xmit (net/ipv4/ip_output.c:535) __tcp_transmit_skb (net/ipv4/tcp_output.c:1462) tcp_rcv_synsent_state_process (net/ipv4/tcp_input.c:6469) tcp_rcv_state_process (net/ipv4/tcp_input.c:6657) tcp_v4_do_rcv (net/ipv4/tcp_ipv4.c:1929) __release_sock (./include/net/sock.h:1121 net/core/sock.c:2968) release_sock (net/core/sock.c:3536) inet_wait_for_connect (net/ipv4/af_inet.c:609) __inet_stream_connect (net/ipv4/af_inet.c:702) inet_stream_connect (net/ipv4/af_inet.c:748) __sys_connect (./include/linux/file.h:45 net/socket.c:2064) __x64_sys_connect (net/socket.c:2073 net/socket.c:2070 net/socket.c:2070) do_syscall_64 (arch/x86/entry/common.c:51 arch/x86/entry/common.c:82) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129) RIP: 0033:0x7fa10ff05a3d Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ab a3 0e 00 f7 d8 64 89 01 48 RSP: 002b:00007fa110183de8 EFLAGS: 00000202 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000020000054 RCX: 00007fa10ff05a3d RDX: 000000000000001c RSI: 0000000020000040 RDI: 0000000000000003 RBP: 00007fa110183e20 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000202 R12: 00007fa110184640 R13: 0000000000000000 R14: 00007fa10fe8b060 R15: 00007fff73e23b20 The issue triggering process is analyzed as follows: Thread A Thread B tcp_v4_rcv //receive ack TCP packet inet_shutdown tcp_check_req tcp_disconnect //disconnect sock ... tcp_set_state(sk, TCP_CLOSE) inet_csk_complete_hashdance ... inet_csk_reqsk_queue_add inet_listen //start listen spin_lock(&queue->rskq_lock) inet_csk_listen_start ... reqsk_queue_alloc ... spin_lock_init spin_unlock(&queue->rskq_lock) //warning When the socket receives the ACK packet during the three-way handshake, it will hold spinlock. And then the user actively shutdowns the socket and listens to the socket immediately, the spinlock will be initialized. When the socket is going to release the spinlock, a warning is generated. Also the same issue to fastopenq.lock. Move init spinlock to inet_create and inet_accept to make sure init the accept_queue's spinlocks once. Fixes: fff1f3001cc5 ("tcp: add a spinlock to protect struct request_sock_queue") Fixes: 168a8f58059a ("tcp: TCP Fast Open Server - main code path") Reported-by: Ming Shu Signed-off-by: Zhengchao Shao Reviewed-by: Eric Dumazet Link: https://lore.kernel.org/r/20240118012019.1751966-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit abc5830a0baf3ba282fa04accc275ec53029ade7 Author: Benjamin Poirier Date: Wed Jan 17 19:12:32 2024 -0500 selftests: bonding: Increase timeout to 1200s [ Upstream commit b01f15a7571b7aa222458bc9bf26ab59bd84e384 ] When tests are run by runner.sh, bond_options.sh gets killed before it can complete: make -C tools/testing/selftests run_tests TARGETS="drivers/net/bonding" [...] # timeout set to 120 # selftests: drivers/net/bonding: bond_options.sh # TEST: prio (active-backup miimon primary_reselect 0) [ OK ] # TEST: prio (active-backup miimon primary_reselect 1) [ OK ] # TEST: prio (active-backup miimon primary_reselect 2) [ OK ] # TEST: prio (active-backup arp_ip_target primary_reselect 0) [ OK ] # TEST: prio (active-backup arp_ip_target primary_reselect 1) [ OK ] # TEST: prio (active-backup arp_ip_target primary_reselect 2) [ OK ] # not ok 7 selftests: drivers/net/bonding: bond_options.sh # TIMEOUT 120 seconds This test includes many sleep statements, at least some of which are related to timers in the operation of the bonding driver itself. Increase the test timeout to allow the test to complete. I ran the test in slightly different VMs (including one without HW virtualization support) and got runtimes of 13m39.760s, 13m31.238s, and 13m2.956s. Use a ~1.5x "safety factor" and set the timeout to 1200s. Fixes: 42a8d4aaea84 ("selftests: bonding: add bonding prio option test") Reported-by: Jakub Kicinski Closes: https://lore.kernel.org/netdev/20240116104402.1203850a@kernel.org/#t Suggested-by: Jakub Kicinski Signed-off-by: Benjamin Poirier Reviewed-by: Hangbin Liu Link: https://lore.kernel.org/r/20240118001233.304759-1-bpoirier@nvidia.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 8f3f9186e5bb96a9c9654c41653210e3ea7e48a6 Author: Wen Gu Date: Thu Jan 18 12:32:10 2024 +0800 net/smc: fix illegal rmb_desc access in SMC-D connection dump [ Upstream commit dbc153fd3c142909e564bb256da087e13fbf239c ] A crash was found when dumping SMC-D connections. It can be reproduced by following steps: - run nginx/wrk test: smc_run nginx smc_run wrk -t 16 -c 1000 -d -H 'Connection: Close' - continuously dump SMC-D connections in parallel: watch -n 1 'smcss -D' BUG: kernel NULL pointer dereference, address: 0000000000000030 CPU: 2 PID: 7204 Comm: smcss Kdump: loaded Tainted: G E 6.7.0+ #55 RIP: 0010:__smc_diag_dump.constprop.0+0x5e5/0x620 [smc_diag] Call Trace: ? __die+0x24/0x70 ? page_fault_oops+0x66/0x150 ? exc_page_fault+0x69/0x140 ? asm_exc_page_fault+0x26/0x30 ? __smc_diag_dump.constprop.0+0x5e5/0x620 [smc_diag] ? __kmalloc_node_track_caller+0x35d/0x430 ? __alloc_skb+0x77/0x170 smc_diag_dump_proto+0xd0/0xf0 [smc_diag] smc_diag_dump+0x26/0x60 [smc_diag] netlink_dump+0x19f/0x320 __netlink_dump_start+0x1dc/0x300 smc_diag_handler_dump+0x6a/0x80 [smc_diag] ? __pfx_smc_diag_dump+0x10/0x10 [smc_diag] sock_diag_rcv_msg+0x121/0x140 ? __pfx_sock_diag_rcv_msg+0x10/0x10 netlink_rcv_skb+0x5a/0x110 sock_diag_rcv+0x28/0x40 netlink_unicast+0x22a/0x330 netlink_sendmsg+0x1f8/0x420 __sock_sendmsg+0xb0/0xc0 ____sys_sendmsg+0x24e/0x300 ? copy_msghdr_from_user+0x62/0x80 ___sys_sendmsg+0x7c/0xd0 ? __do_fault+0x34/0x160 ? do_read_fault+0x5f/0x100 ? do_fault+0xb0/0x110 ? __handle_mm_fault+0x2b0/0x6c0 __sys_sendmsg+0x4d/0x80 do_syscall_64+0x69/0x180 entry_SYSCALL_64_after_hwframe+0x6e/0x76 It is possible that the connection is in process of being established when we dump it. Assumed that the connection has been registered in a link group by smc_conn_create() but the rmb_desc has not yet been initialized by smc_buf_create(), thus causing the illegal access to conn->rmb_desc. So fix it by checking before dump. Fixes: 4b1b7d3b30a6 ("net/smc: add SMC-D diag support") Signed-off-by: Wen Gu Reviewed-by: Dust Li Reviewed-by: Wenjia Zhang Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 34de0f04684ec00c093a0455648be055f0e8e24f Author: Qu Wenruo Date: Wed Jan 17 11:02:25 2024 +1030 btrfs: scrub: avoid use-after-free when chunk length is not 64K aligned [ Upstream commit f546c4282673497a06ecb6190b50ae7f6c85b02f ] [BUG] There is a bug report that, on a ext4-converted btrfs, scrub leads to various problems, including: - "unable to find chunk map" errors BTRFS info (device vdb): scrub: started on devid 1 BTRFS critical (device vdb): unable to find chunk map for logical 2214744064 length 4096 BTRFS critical (device vdb): unable to find chunk map for logical 2214744064 length 45056 This would lead to unrepariable errors. - Use-after-free KASAN reports: ================================================================== BUG: KASAN: slab-use-after-free in __blk_rq_map_sg+0x18f/0x7c0 Read of size 8 at addr ffff8881013c9040 by task btrfs/909 CPU: 0 PID: 909 Comm: btrfs Not tainted 6.7.0-x64v3-dbg #11 c50636e9419a8354555555245df535e380563b2b Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 2023.11-2 12/24/2023 Call Trace: dump_stack_lvl+0x43/0x60 print_report+0xcf/0x640 kasan_report+0xa6/0xd0 __blk_rq_map_sg+0x18f/0x7c0 virtblk_prep_rq.isra.0+0x215/0x6a0 [virtio_blk 19a65eeee9ae6fcf02edfad39bb9ddee07dcdaff] virtio_queue_rqs+0xc4/0x310 [virtio_blk 19a65eeee9ae6fcf02edfad39bb9ddee07dcdaff] blk_mq_flush_plug_list.part.0+0x780/0x860 __blk_flush_plug+0x1ba/0x220 blk_finish_plug+0x3b/0x60 submit_initial_group_read+0x10a/0x290 [btrfs e57987a360bed82fe8756dcd3e0de5406ccfe965] flush_scrub_stripes+0x38e/0x430 [btrfs e57987a360bed82fe8756dcd3e0de5406ccfe965] scrub_stripe+0x82a/0xae0 [btrfs e57987a360bed82fe8756dcd3e0de5406ccfe965] scrub_chunk+0x178/0x200 [btrfs e57987a360bed82fe8756dcd3e0de5406ccfe965] scrub_enumerate_chunks+0x4bc/0xa30 [btrfs e57987a360bed82fe8756dcd3e0de5406ccfe965] btrfs_scrub_dev+0x398/0x810 [btrfs e57987a360bed82fe8756dcd3e0de5406ccfe965] btrfs_ioctl+0x4b9/0x3020 [btrfs e57987a360bed82fe8756dcd3e0de5406ccfe965] __x64_sys_ioctl+0xbd/0x100 do_syscall_64+0x5d/0xe0 entry_SYSCALL_64_after_hwframe+0x63/0x6b RIP: 0033:0x7f47e5e0952b - Crash, mostly due to above use-after-free [CAUSE] The converted fs has the following data chunk layout: item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 2214658048) itemoff 16025 itemsize 80 length 86016 owner 2 stripe_len 65536 type DATA|single For above logical bytenr 2214744064, it's at the chunk end (2214658048 + 86016 = 2214744064). This means btrfs_submit_bio() would split the bio, and trigger endio function for both of the two halves. However scrub_submit_initial_read() would only expect the endio function to be called once, not any more. This means the first endio function would already free the bbio::bio, leaving the bvec freed, thus the 2nd endio call would lead to use-after-free. [FIX] - Make sure scrub_read_endio() only updates bits in its range Since we may read less than 64K at the end of the chunk, we should not touch the bits beyond chunk boundary. - Make sure scrub_submit_initial_read() only to read the chunk range This is done by calculating the real number of sectors we need to read, and add sector-by-sector to the bio. Thankfully the scrub read repair path won't need extra fixes: - scrub_stripe_submit_repair_read() With above fixes, we won't update error bit for range beyond chunk, thus scrub_stripe_submit_repair_read() should never submit any read beyond the chunk. Reported-by: Rongrong Fixes: e02ee89baa66 ("btrfs: scrub: switch scrub_simple_mirror() to scrub_stripe infrastructure") Tested-by: Rongrong Reviewed-by: Johannes Thumshirn Signed-off-by: Qu Wenruo Signed-off-by: David Sterba Signed-off-by: Sasha Levin commit e04bf59bdba0fa45d52160be676114e16be855a9 Author: Johannes Berg Date: Thu Jan 11 18:17:44 2024 +0200 wifi: mac80211: fix potential sta-link leak [ Upstream commit b01a74b3ca6fd51b62c67733ba7c3280fa6c5d26 ] When a station is allocated, links are added but not set to valid yet (e.g. during connection to an AP MLD), we might remove the station without ever marking links valid, and leak them. Fix that. Fixes: cb71f1d136a6 ("wifi: mac80211: add sta link addition/removal") Signed-off-by: Johannes Berg Reviewed-by: Ilan Peer Signed-off-by: Miri Korenblit Link: https://msgid.link/20240111181514.6573998beaf8.I09ac2e1d41c80f82a5a616b8bd1d9d8dd709a6a6@changeid Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin commit ebb67233acf708c0a943475af58b53da2b23ff3d Author: Lucas Stach Date: Wed Jan 17 22:06:28 2024 +0100 SUNRPC: use request size to initialize bio_vec in svc_udp_sendto() [ Upstream commit 1d9cabe2817edd215779dc9c2fe5e7ab9aac0704 ] Use the proper size when setting up the bio_vec, as otherwise only zero-length UDP packets will be sent. Fixes: baabf59c2414 ("SUNRPC: Convert svc_udp_sendto() to use the per-socket bio_vec array") Signed-off-by: Lucas Stach Signed-off-by: Chuck Lever Signed-off-by: Sasha Levin commit e4b7b15c93d340e8591d89794667e131a869d80d Author: Namjae Jeon Date: Tue Jan 23 20:42:28 2024 +0900 ksmbd: Add missing set_freezable() for freezable kthread From: Kevin Hao [ Upstream commit 8fb7b723924cc9306bc161f45496497aec733904 ] The kernel thread function ksmbd_conn_handler_loop() invokes the try_to_freeze() in its loop. But all the kernel threads are non-freezable by default. So if we want to make a kernel thread to be freezable, we have to invoke set_freezable() explicitly. Signed-off-by: Kevin Hao Acked-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 0bf525e64fd3ccc0fd136a3a70bb4922f93e90a6 Author: Namjae Jeon Date: Tue Jan 23 20:42:27 2024 +0900 ksmbd: send lease break notification on FILE_RENAME_INFORMATION [ Upstream commit 3fc74c65b367476874da5fe6f633398674b78e5a ] Send lease break notification on FILE_RENAME_INFORMATION request. This patch fix smb2.lease.v2_epoch2 test failure. Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit c9520770de70300d716528d07b6ed182e3a90be9 Author: Namjae Jeon Date: Tue Jan 23 20:42:26 2024 +0900 ksmbd: don't increment epoch if current state and request state are same [ Upstream commit b6e9a44e99603fe10e1d78901fdd97681a539612 ] If existing lease state and request state are same, don't increment epoch in create context. Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit ecfd93955994ecc2a1308f5ee4bd90c7fca9a8c6 Author: Namjae Jeon Date: Tue Jan 23 20:42:25 2024 +0900 ksmbd: fix potential circular locking issue in smb2_set_ea() [ Upstream commit 6fc0a265e1b932e5e97a038f99e29400a93baad0 ] smb2_set_ea() can be called in parent inode lock range. So add get_write argument to smb2_set_ea() not to call nested mnt_want_write(). Signed-off-by: Namjae Jeon Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit bbaf2a0ecd0c5305a577bd1dc87b949a3d87dd7b Author: Jonathan Gray Date: Sat Jan 27 12:03:59 2024 +1100 Revert "drm/amd: Enable PCIe PME from D3" This reverts commit 05f7a3475af0faa8bf77f8637c4a40349db4f78f. duplicated a change made in 6.7 6967741d26c87300a51b5e50d4acd104bc1a9759 Cc: stable@vger.kernel.org # 6.7 Signed-off-by: Jonathan Gray Signed-off-by: Greg Kroah-Hartman commit aa74ce30a8a40d19a4256de4ae5322e71344a274 Author: Benjamin Berg Date: Mon Jan 15 11:18:05 2024 +0100 wifi: ath11k: rely on mac80211 debugfs handling for vif commit 556857aa1d0855aba02b1c63bc52b91ec63fc2cc upstream. mac80211 started to delete debugfs entries in certain cases, causing a ath11k to crash when it tried to delete the entries later. Fix this by relying on mac80211 to delete the entries when appropriate and adding them from the vif_add_debugfs handler. Fixes: 0a3d898ee9a8 ("wifi: mac80211: add/remove driver debugfs entries as appropriate") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218364 Signed-off-by: Benjamin Berg Acked-by: Jeff Johnson Signed-off-by: Kalle Valo Link: https://msgid.link/20240115101805.1277949-1-benjamin@sipsolutions.net Signed-off-by: Greg Kroah-Hartman commit 9b1a6e31edf953542643466299b23d9c9e822e92 Author: Namjae Jeon Date: Tue Jan 23 20:42:24 2024 +0900 ksmbd: set v2 lease version on lease upgrade [ Upstream commit bb05367a66a9990d2c561282f5620bb1dbe40c28 ] If file opened with v2 lease is upgraded with v1 lease, smb server should response v2 lease create context to client. This patch fix smb2.lease.v2_epoch2 test failure. This test case assumes the following scenario: 1. smb2 create with v2 lease(R, LEASE1 key) 2. smb server return smb2 create response with v2 lease context(R, LEASE1 key, epoch + 1) 3. smb2 create with v1 lease(RH, LEASE1 key) 4. smb server return smb2 create response with v2 lease context(RH, LEASE1 key, epoch + 2) i.e. If same client(same lease key) try to open a file that is being opened with v2 lease with v1 lease, smb server should return v2 lease. Signed-off-by: Namjae Jeon Acked-by: Tom Talpey Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 5506bcf75d89bd28fb4645fc969e2a1eb0273083 Author: Charan Teja Kalla Date: Fri Nov 24 16:27:25 2023 +0530 mm: page_alloc: unreserve highatomic page blocks before oom commit ac3f3b0a55518056bc80ed32a41931c99e1f7d81 upstream. __alloc_pages_direct_reclaim() is called from slowpath allocation where high atomic reserves can be unreserved after there is a progress in reclaim and yet no suitable page is found. Later should_reclaim_retry() gets called from slow path allocation to decide if the reclaim needs to be retried before OOM kill path is taken. should_reclaim_retry() checks the available(reclaimable + free pages) memory against the min wmark levels of a zone and returns: a) true, if it is above the min wmark so that slow path allocation will do the reclaim retries. b) false, thus slowpath allocation takes oom kill path. should_reclaim_retry() can also unreserves the high atomic reserves **but only after all the reclaim retries are exhausted.** In a case where there are almost none reclaimable memory and free pages contains mostly the high atomic reserves but allocation context can't use these high atomic reserves, makes the available memory below min wmark levels hence false is returned from should_reclaim_retry() leading the allocation request to take OOM kill path. This can turn into a early oom kill if high atomic reserves are holding lot of free memory and unreserving of them is not attempted. (early)OOM is encountered on a VM with the below state: [ 295.998653] Normal free:7728kB boost:0kB min:804kB low:1004kB high:1204kB reserved_highatomic:8192KB active_anon:4kB inactive_anon:0kB active_file:24kB inactive_file:24kB unevictable:1220kB writepending:0kB present:70732kB managed:49224kB mlocked:0kB bounce:0kB free_pcp:688kB local_pcp:492kB free_cma:0kB [ 295.998656] lowmem_reserve[]: 0 32 [ 295.998659] Normal: 508*4kB (UMEH) 241*8kB (UMEH) 143*16kB (UMEH) 33*32kB (UH) 7*64kB (UH) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 7752kB Per above log, the free memory of ~7MB exist in the high atomic reserves is not freed up before falling back to oom kill path. Fix it by trying to unreserve the high atomic reserves in should_reclaim_retry() before __alloc_pages_direct_reclaim() can fallback to oom kill path. Link: https://lkml.kernel.org/r/1700823445-27531-1-git-send-email-quic_charante@quicinc.com Fixes: 0aaa29a56e4f ("mm, page_alloc: reserve pageblocks for high-order atomic allocations on demand") Signed-off-by: Charan Teja Kalla Reported-by: Chris Goldsworthy Suggested-by: Michal Hocko Acked-by: Michal Hocko Acked-by: David Rientjes Cc: Chris Goldsworthy Cc: David Hildenbrand Cc: Johannes Weiner Cc: Mel Gorman Cc: Pavankumar Kondeti Cc: Vlastimil Babka Cc: Joakim Tjernlund Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 3554e2d33723eb95366e6f02f0502fc461790c43 Author: Hugo Villeneuve Date: Thu Dec 21 18:18:12 2023 -0500 serial: sc16is7xx: improve do/while loop in sc16is7xx_irq() commit d5078509c8b06c5c472a60232815e41af81c6446 upstream. Simplify and improve readability by replacing while(1) loop with do {} while, and by using the keep_polling variable as the exit condition, making it more explicit. Fixes: 834449872105 ("sc16is7xx: Fix for multi-channel stall") Cc: Suggested-by: Andy Shevchenko Signed-off-by: Hugo Villeneuve Link: https://lore.kernel.org/r/20231221231823.2327894-6-hugo@hugovil.com Signed-off-by: Greg Kroah-Hartman commit 30b6b173fc8db6df58aef834cbbe487a81fe271b Author: Hugo Villeneuve Date: Thu Dec 21 18:18:11 2023 -0500 serial: sc16is7xx: remove obsolete loop in sc16is7xx_port_irq() commit ed647256e8f226241ecff7baaecdb8632ffc7ec1 upstream. Commit 834449872105 ("sc16is7xx: Fix for multi-channel stall") changed sc16is7xx_port_irq() from looping multiple times when there was still interrupts to serve. It simply changed the do {} while(1) loop to a do {} while(0) loop, which makes the loop itself now obsolete. Clean the code by removing this obsolete do {} while(0) loop. Fixes: 834449872105 ("sc16is7xx: Fix for multi-channel stall") Cc: Suggested-by: Andy Shevchenko Signed-off-by: Hugo Villeneuve Link: https://lore.kernel.org/r/20231221231823.2327894-5-hugo@hugovil.com Signed-off-by: Greg Kroah-Hartman commit ec79957c2d592fb530471089818ee4f9396c5919 Author: Hugo Villeneuve Date: Thu Dec 21 18:18:08 2023 -0500 serial: sc16is7xx: fix invalid sc16is7xx_lines bitfield in case of probe error commit 8a1060ce974919f2a79807527ad82ac39336eda2 upstream. If an error occurs during probing, the sc16is7xx_lines bitfield may be left in a state that doesn't represent the correct state of lines allocation. For example, in a system with two SC16 devices, if an error occurs only during probing of channel (port) B of the second device, sc16is7xx_lines final state will be 00001011b instead of the expected 00000011b. This is caused in part because of the "i--" in the for/loop located in the out_ports: error path. Fix this by checking the return value of uart_add_one_port() and set line allocation bit only if this was successful. This allows the refactor of the obfuscated for(i--...) loop in the error path, and properly call uart_remove_one_port() only when needed, and properly unset line allocation bits. Also use same mechanism in remove() when calling uart_remove_one_port(). Fixes: c64349722d14 ("sc16is7xx: support multiple devices") Cc: Cc: Yury Norov Signed-off-by: Hugo Villeneuve Link: https://lore.kernel.org/r/20231221231823.2327894-2-hugo@hugovil.com Signed-off-by: Greg Kroah-Hartman commit 181acce4f316d3dc3266be0b27a45c2cbbfd1dab Author: Hugo Villeneuve Date: Mon Dec 11 12:13:53 2023 -0500 serial: sc16is7xx: fix unconditional activation of THRI interrupt commit 9915753037eba7135b209fef4f2afeca841af816 upstream. Commit cc4c1d05eb10 ("sc16is7xx: Properly resume TX after stop") changed behavior to unconditionnaly set the THRI interrupt in sc16is7xx_tx_proc(). For example when sending a 65 bytes message, and assuming the Tx FIFO is initially empty, sc16is7xx_handle_tx() will write the first 64 bytes of the message to the FIFO and sc16is7xx_tx_proc() will then activate THRI. When the THRI IRQ is fired, the driver will write the remaining byte of the message to the FIFO, and disable THRI by calling sc16is7xx_stop_tx(). When sending a 2 bytes message, sc16is7xx_handle_tx() will write the 2 bytes of the message to the FIFO and call sc16is7xx_stop_tx(), disabling THRI. After sc16is7xx_handle_tx() exits, control returns to sc16is7xx_tx_proc() which will unconditionally set THRI. When the THRI IRQ is fired, the driver simply acknowledges the interrupt and does nothing more, since all the data has already been written to the FIFO. This results in 2 register writes and 4 register reads all for nothing and taking precious cycles from the I2C/SPI bus. Fix this by enabling the THRI interrupt only when we fill the Tx FIFO to its maximum capacity and there are remaining bytes to send in the message. Fixes: cc4c1d05eb10 ("sc16is7xx: Properly resume TX after stop") Cc: Signed-off-by: Hugo Villeneuve Link: https://lore.kernel.org/r/20231211171353.2901416-7-hugo@hugovil.com Signed-off-by: Greg Kroah-Hartman Signed-off-by: Greg Kroah-Hartman commit aa7cb4787698add9367b19f7afc667662c9bdb23 Author: Hugo Villeneuve Date: Mon Dec 11 12:13:52 2023 -0500 serial: sc16is7xx: convert from _raw_ to _noinc_ regmap functions for FIFO commit dbf4ab821804df071c8b566d9813083125e6d97b upstream. The SC16IS7XX IC supports a burst mode to access the FIFOs where the initial register address is sent ($00), followed by all the FIFO data without having to resend the register address each time. In this mode, the IC doesn't increment the register address for each R/W byte. The regmap_raw_read() and regmap_raw_write() are functions which can perform IO over multiple registers. They are currently used to read/write from/to the FIFO, and although they operate correctly in this burst mode on the SPI bus, they would corrupt the regmap cache if it was not disabled manually. The reason is that when the R/W size is more than 1 byte, these functions assume that the register address is incremented and handle the cache accordingly. Convert FIFO R/W functions to use the regmap _noinc_ versions in order to remove the manual cache control which was a workaround when using the _raw_ versions. FIFO registers are properly declared as volatile so cache will not be used/updated for FIFO accesses. Fixes: dfeae619d781 ("serial: sc16is7xx") Cc: Signed-off-by: Hugo Villeneuve Link: https://lore.kernel.org/r/20231211171353.2901416-6-hugo@hugovil.com Signed-off-by: Greg Kroah-Hartman commit dbe196ca489f3833e0f9eeb86ac346194f131c91 Author: Hugo Villeneuve Date: Mon Dec 11 12:13:51 2023 -0500 serial: sc16is7xx: change EFR lock to operate on each channels commit 4409df5866b7ff7686ba27e449ca97a92ee063c9 upstream. Now that the driver has been converted to use one regmap per port, change efr locking to operate on a channel basis instead of on the whole IC. Fixes: 3837a0379533 ("serial: sc16is7xx: improve regmap debugfs by using one regmap per port") Cc: # 6.1.x: 3837a03 serial: sc16is7xx: improve regmap debugfs by using one regmap per port Signed-off-by: Hugo Villeneuve Link: https://lore.kernel.org/r/20231211171353.2901416-5-hugo@hugovil.com Signed-off-by: Greg Kroah-Hartman commit b7fbd3e4c2404e02455b555113ca240d9b39805f Author: Hugo Villeneuve Date: Mon Dec 11 12:13:50 2023 -0500 serial: sc16is7xx: remove unused line structure member commit 41a308cbedb2a68a6831f0f2e992e296c4b8aff0 upstream. Now that the driver has been converted to use one regmap per port, the line structure member is no longer used, so remove it. Fixes: 3837a0379533 ("serial: sc16is7xx: improve regmap debugfs by using one regmap per port") Cc: Signed-off-by: Hugo Villeneuve Link: https://lore.kernel.org/r/20231211171353.2901416-4-hugo@hugovil.com Signed-off-by: Greg Kroah-Hartman commit 052181baa70e2504cfc6292f5c350420dfff6bef Author: Hugo Villeneuve Date: Mon Dec 11 12:13:49 2023 -0500 serial: sc16is7xx: remove global regmap from struct sc16is7xx_port commit f6959c5217bd799bcb770b95d3c09b3244e175c6 upstream. Remove global struct regmap so that it is more obvious that this regmap is to be used only in the probe function. Also add a comment to that effect in probe function. Fixes: 3837a0379533 ("serial: sc16is7xx: improve regmap debugfs by using one regmap per port") Cc: Suggested-by: Andy Shevchenko Signed-off-by: Hugo Villeneuve Link: https://lore.kernel.org/r/20231211171353.2901416-3-hugo@hugovil.com Signed-off-by: Greg Kroah-Hartman commit 9ca830e1eb874acbb38eeaef5671f4ea09c3ca19 Author: Hugo Villeneuve Date: Mon Dec 11 12:13:48 2023 -0500 serial: sc16is7xx: remove wasteful static buffer in sc16is7xx_regmap_name() commit 6bcab3c8acc88e265c570dea969fd04f137c8a4c upstream. Using a static buffer inside sc16is7xx_regmap_name() was a convenient and simple way to set the regmap name without having to allocate and free a buffer each time it is called. The drawback is that the static buffer wastes memory for nothing once regmap is fully initialized. Remove static buffer and use constant strings instead. This also avoids a truncation warning when using "%d" or "%u" in snprintf which was flagged by kernel test robot. Fixes: 3837a0379533 ("serial: sc16is7xx: improve regmap debugfs by using one regmap per port") Cc: # 6.1.x: 3837a03 serial: sc16is7xx: improve regmap debugfs by using one regmap per port Suggested-by: Andy Shevchenko Signed-off-by: Hugo Villeneuve Link: https://lore.kernel.org/r/20231211171353.2901416-2-hugo@hugovil.com Signed-off-by: Greg Kroah-Hartman commit 0c84cb41133f3c36b42b88d8a73f629d9d8dbf48 Author: Hugo Villeneuve Date: Mon Oct 30 17:14:47 2023 -0400 serial: sc16is7xx: improve regmap debugfs by using one regmap per port commit 3837a0379533aabb9e4483677077479f7c6aa910 upstream. With this current driver regmap implementation, it is hard to make sense of the register addresses displayed using the regmap debugfs interface, because they do not correspond to the actual register addresses documented in the datasheet. For example, register 1 is displayed as registers 04 thru 07: $ cat /sys/kernel/debug/regmap/spi0.0/registers 04: 10 -> Port 0, register offset 1 05: 10 -> Port 1, register offset 1 06: 00 -> Port 2, register offset 1 -> invalid 07: 00 -> port 3, register offset 1 -> invalid ... The reason is that bits 0 and 1 of the register address correspond to the channel (port) bits, so the register address itself starts at bit 2, and we must 'mentally' shift each register address by 2 bits to get its real address/offset. Also, only channels 0 and 1 are supported by the chip, so channel mask combinations of 10b and 11b are invalid, and the display of these registers is useless. This patch adds a separate regmap configuration for each port, similar to what is done in the max310x driver, so that register addresses displayed match the register addresses in the chip datasheet. Also, each port now has its own debugfs entry. Example with new regmap implementation: $ cat /sys/kernel/debug/regmap/spi0.0-port0/registers 1: 10 2: 01 3: 00 ... $ cat /sys/kernel/debug/regmap/spi0.0-port1/registers 1: 10 2: 01 3: 00 As an added bonus, this also simplifies some operations (read/write/modify) because it is no longer necessary to manually shift register addresses. Signed-off-by: Hugo Villeneuve Link: https://lore.kernel.org/r/20231030211447.974779-1-hugo@hugovil.com Signed-off-by: Greg Kroah-Hartman commit cbb5ec8f75b74175e2a0d41b8281a1d4ca6a6089 Author: Al Viro Date: Sun Nov 19 20:25:58 2023 -0500 rename(): fix the locking of subdirectories commit 22e111ed6c83dcde3037fc81176012721bc34c0b upstream. We should never lock two subdirectories without having taken ->s_vfs_rename_mutex; inode pointer order or not, the "order" proposed in 28eceeda130f "fs: Lock moved directories" is not transitive, with the usual consequences. The rationale for locking renamed subdirectory in all cases was the possibility of race between rename modifying .. in a subdirectory to reflect the new parent and another thread modifying the same subdirectory. For a lot of filesystems that's not a problem, but for some it can lead to trouble (e.g. the case when short directory contents is kept in the inode, but creating a file in it might push it across the size limit and copy its contents into separate data block(s)). However, we need that only in case when the parent does change - otherwise ->rename() doesn't need to do anything with .. entry in the first place. Some instances are lazy and do a tautological update anyway, but it's really not hard to avoid. Amended locking rules for rename(): find the parent(s) of source and target if source and target have the same parent lock the common parent else lock ->s_vfs_rename_mutex lock both parents, in ancestor-first order; if neither is an ancestor of another, lock the parent of source first. find the source and target. if source and target have the same parent if operation is an overwriting rename of a subdirectory lock the target subdirectory else if source is a subdirectory lock the source if target is a subdirectory lock the target lock non-directories involved, in inode pointer order if both source and target are such. That way we are guaranteed that parents are locked (for obvious reasons), that any renamed non-directory is locked (nfsd relies upon that), that any victim is locked (emptiness check needs that, among other things) and subdirectory that changes parent is locked (needed to protect the update of .. entries). We are also guaranteed that any operation locking more than one directory either takes ->s_vfs_rename_mutex or locks a parent followed by its child. Cc: stable@vger.kernel.org Fixes: 28eceeda130f "fs: Lock moved directories" Reviewed-by: Jan Kara Signed-off-by: Al Viro Signed-off-by: Greg Kroah-Hartman commit 3a01daace71b521563c38bbbf874e14c3e58adb7 Author: Charan Teja Kalla Date: Fri Oct 13 18:34:27 2023 +0530 mm/sparsemem: fix race in accessing memory_section->usage commit 5ec8e8ea8b7783fab150cf86404fc38cb4db8800 upstream. The below race is observed on a PFN which falls into the device memory region with the system memory configuration where PFN's are such that [ZONE_NORMAL ZONE_DEVICE ZONE_NORMAL]. Since normal zone start and end pfn contains the device memory PFN's as well, the compaction triggered will try on the device memory PFN's too though they end up in NOP(because pfn_to_online_page() returns NULL for ZONE_DEVICE memory sections). When from other core, the section mappings are being removed for the ZONE_DEVICE region, that the PFN in question belongs to, on which compaction is currently being operated is resulting into the kernel crash with CONFIG_SPASEMEM_VMEMAP enabled. The crash logs can be seen at [1]. compact_zone() memunmap_pages ------------- --------------- __pageblock_pfn_to_page ...... (a)pfn_valid(): valid_section()//return true (b)__remove_pages()-> sparse_remove_section()-> section_deactivate(): [Free the array ms->usage and set ms->usage = NULL] pfn_section_valid() [Access ms->usage which is NULL] NOTE: From the above it can be said that the race is reduced to between the pfn_valid()/pfn_section_valid() and the section deactivate with SPASEMEM_VMEMAP enabled. The commit b943f045a9af("mm/sparse: fix kernel crash with pfn_section_valid check") tried to address the same problem by clearing the SECTION_HAS_MEM_MAP with the expectation of valid_section() returns false thus ms->usage is not accessed. Fix this issue by the below steps: a) Clear SECTION_HAS_MEM_MAP before freeing the ->usage. b) RCU protected read side critical section will either return NULL when SECTION_HAS_MEM_MAP is cleared or can successfully access ->usage. c) Free the ->usage with kfree_rcu() and set ms->usage = NULL. No attempt will be made to access ->usage after this as the SECTION_HAS_MEM_MAP is cleared thus valid_section() return false. Thanks to David/Pavan for their inputs on this patch. [1] https://lore.kernel.org/linux-mm/994410bb-89aa-d987-1f50-f514903c55aa@quicinc.com/ On Snapdragon SoC, with the mentioned memory configuration of PFN's as [ZONE_NORMAL ZONE_DEVICE ZONE_NORMAL], we are able to see bunch of issues daily while testing on a device farm. For this particular issue below is the log. Though the below log is not directly pointing to the pfn_section_valid(){ ms->usage;}, when we loaded this dump on T32 lauterbach tool, it is pointing. [ 540.578056] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 540.578068] Mem abort info: [ 540.578070] ESR = 0x0000000096000005 [ 540.578073] EC = 0x25: DABT (current EL), IL = 32 bits [ 540.578077] SET = 0, FnV = 0 [ 540.578080] EA = 0, S1PTW = 0 [ 540.578082] FSC = 0x05: level 1 translation fault [ 540.578085] Data abort info: [ 540.578086] ISV = 0, ISS = 0x00000005 [ 540.578088] CM = 0, WnR = 0 [ 540.579431] pstate: 82400005 (Nzcv daif +PAN -UAO +TCO -DIT -SSBSBTYPE=--) [ 540.579436] pc : __pageblock_pfn_to_page+0x6c/0x14c [ 540.579454] lr : compact_zone+0x994/0x1058 [ 540.579460] sp : ffffffc03579b510 [ 540.579463] x29: ffffffc03579b510 x28: 0000000000235800 x27:000000000000000c [ 540.579470] x26: 0000000000235c00 x25: 0000000000000068 x24:ffffffc03579b640 [ 540.579477] x23: 0000000000000001 x22: ffffffc03579b660 x21:0000000000000000 [ 540.579483] x20: 0000000000235bff x19: ffffffdebf7e3940 x18:ffffffdebf66d140 [ 540.579489] x17: 00000000739ba063 x16: 00000000739ba063 x15:00000000009f4bff [ 540.579495] x14: 0000008000000000 x13: 0000000000000000 x12:0000000000000001 [ 540.579501] x11: 0000000000000000 x10: 0000000000000000 x9 :ffffff897d2cd440 [ 540.579507] x8 : 0000000000000000 x7 : 0000000000000000 x6 :ffffffc03579b5b4 [ 540.579512] x5 : 0000000000027f25 x4 : ffffffc03579b5b8 x3 :0000000000000001 [ 540.579518] x2 : ffffffdebf7e3940 x1 : 0000000000235c00 x0 :0000000000235800 [ 540.579524] Call trace: [ 540.579527] __pageblock_pfn_to_page+0x6c/0x14c [ 540.579533] compact_zone+0x994/0x1058 [ 540.579536] try_to_compact_pages+0x128/0x378 [ 540.579540] __alloc_pages_direct_compact+0x80/0x2b0 [ 540.579544] __alloc_pages_slowpath+0x5c0/0xe10 [ 540.579547] __alloc_pages+0x250/0x2d0 [ 540.579550] __iommu_dma_alloc_noncontiguous+0x13c/0x3fc [ 540.579561] iommu_dma_alloc+0xa0/0x320 [ 540.579565] dma_alloc_attrs+0xd4/0x108 [quic_charante@quicinc.com: use kfree_rcu() in place of synchronize_rcu(), per David] Link: https://lkml.kernel.org/r/1698403778-20938-1-git-send-email-quic_charante@quicinc.com Link: https://lkml.kernel.org/r/1697202267-23600-1-git-send-email-quic_charante@quicinc.com Fixes: f46edbd1b151 ("mm/sparsemem: add helpers track active portions of a section at boot") Signed-off-by: Charan Teja Kalla Cc: Aneesh Kumar K.V Cc: Dan Williams Cc: David Hildenbrand Cc: Mel Gorman Cc: Oscar Salvador Cc: Vlastimil Babka Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 3889a418b6eb9a1113fb989aaadecf2f64964767 Author: Baolin Wang Date: Fri Dec 15 20:07:52 2023 +0800 mm: migrate: fix getting incorrect page mapping during page migration commit d1adb25df7111de83b64655a80b5a135adbded61 upstream. When running stress-ng testing, we found below kernel crash after a few hours: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 pc : dentry_name+0xd8/0x224 lr : pointer+0x22c/0x370 sp : ffff800025f134c0 ...... Call trace: dentry_name+0xd8/0x224 pointer+0x22c/0x370 vsnprintf+0x1ec/0x730 vscnprintf+0x2c/0x60 vprintk_store+0x70/0x234 vprintk_emit+0xe0/0x24c vprintk_default+0x3c/0x44 vprintk_func+0x84/0x2d0 printk+0x64/0x88 __dump_page+0x52c/0x530 dump_page+0x14/0x20 set_migratetype_isolate+0x110/0x224 start_isolate_page_range+0xc4/0x20c offline_pages+0x124/0x474 memory_block_offline+0x44/0xf4 memory_subsys_offline+0x3c/0x70 device_offline+0xf0/0x120 ...... After analyzing the vmcore, I found this issue is caused by page migration. The scenario is that, one thread is doing page migration, and we will use the target page's ->mapping field to save 'anon_vma' pointer between page unmap and page move, and now the target page is locked and refcount is 1. Currently, there is another stress-ng thread performing memory hotplug, attempting to offline the target page that is being migrated. It discovers that the refcount of this target page is 1, preventing the offline operation, thus proceeding to dump the page. However, page_mapping() of the target page may return an incorrect file mapping to crash the system in dump_mapping(), since the target page->mapping only saves 'anon_vma' pointer without setting PAGE_MAPPING_ANON flag. There are seveval ways to fix this issue: (1) Setting the PAGE_MAPPING_ANON flag for target page's ->mapping when saving 'anon_vma', but this can confuse PageAnon() for PFN walkers, since the target page has not built mappings yet. (2) Getting the page lock to call page_mapping() in __dump_page() to avoid crashing the system, however, there are still some PFN walkers that call page_mapping() without holding the page lock, such as compaction. (3) Using target page->private field to save the 'anon_vma' pointer and 2 bits page state, just as page->mapping records an anonymous page, which can remove the page_mapping() impact for PFN walkers and also seems a simple way. So I choose option 3 to fix this issue, and this can also fix other potential issues for PFN walkers, such as compaction. Link: https://lkml.kernel.org/r/e60b17a88afc38cb32f84c3e30837ec70b343d2b.1702641709.git.baolin.wang@linux.alibaba.com Fixes: 64c8902ed441 ("migrate_pages: split unmap_and_move() to _unmap() and _move()") Signed-off-by: Baolin Wang Reviewed-by: "Huang, Ying" Cc: Matthew Wilcox Cc: David Hildenbrand Cc: Xu Yu Cc: Zi Yan Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 16002a2a365d3c851e890af3df5980648701b411 Author: Steven Rostedt (Google) Date: Fri Dec 1 14:59:36 2023 -0500 mm/rmap: fix misplaced parenthesis of a likely() commit f67f8d4a8c1e1ebc85a6cbdb9a7266f14863461c upstream. Running my yearly branch profiler to see where likely/unlikely annotation may be added or removed, I discovered this: correct incorrect % Function File Line ------- --------- - -------- ---- ---- 0 457918 100 page_try_dup_anon_rmap rmap.h 264 [..] 458021 0 0 page_try_dup_anon_rmap rmap.h 265 I thought it was interesting that line 264 of rmap.h had a 100% incorrect annotation, but the line directly below it was 100% correct. Looking at the code: if (likely(!is_device_private_page(page) && unlikely(page_needs_cow_for_dma(vma, page)))) It didn't make sense. The "likely()" was around the entire if statement (not just the "!is_device_private_page(page)"), which also included the "unlikely()" portion of that if condition. If the unlikely portion is unlikely to be true, that would make the entire if condition unlikely to be true, so it made no sense at all to say the entire if condition is true. What is more likely to be likely is just the first part of the if statement before the && operation. It's likely to be a misplaced parenthesis. And after making the if condition broken into a likely() && unlikely(), both now appear to be correct! Link: https://lkml.kernel.org/r/20231201145936.5ddfdb50@gandalf.local.home Fixes:fb3d824d1a46c ("mm/rmap: split page_dup_rmap() into page_dup_file_rmap() and page_try_dup_anon_rmap()") Signed-off-by: Steven Rostedt (Google) Acked-by: Vlastimil Babka Cc: David Hildenbrand Cc: Vlastimil Babka Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit d1ffcaf5d68d81d9f95a2717477c405ff9971398 Author: Donet Tom Date: Wed Jan 10 14:03:35 2024 +0530 selftests: mm: hugepage-vmemmap fails on 64K page size systems commit 00bcfcd47a52f50f07a2e88d730d7931384cb073 upstream. The kernel sefltest mm/hugepage-vmemmap fails on architectures which has different page size other than 4K. In hugepage-vmemmap page size used is 4k so the pfn calculation will go wrong on systems which has different page size .The length of MAP_HUGETLB memory must be hugepage aligned but in hugepage-vmemmap map length is 2M so this will not get aligned if the system has differnet hugepage size. Added psize() to get the page size and default_huge_page_size() to get the default hugepage size at run time, hugepage-vmemmap test pass on powerpc with 64K page size and x86 with 4K page size. Result on powerpc without patch (page size 64K) *# ./hugepage-vmemmap Returned address is 0x7effff000000 whose pfn is 0 Head page flags (100000000) is invalid check_page_flags: Invalid argument *# Result on powerpc with patch (page size 64K) *# ./hugepage-vmemmap Returned address is 0x7effff000000 whose pfn is 600 *# Result on x86 with patch (page size 4K) *# ./hugepage-vmemmap Returned address is 0x7fc7c2c00000 whose pfn is 1dac00 *# Link: https://lkml.kernel.org/r/3b3a3ae37ba21218481c482a872bbf7526031600.1704865754.git.donettom@linux.vnet.ibm.com Fixes: b147c89cd429 ("selftests: vm: add a hugetlb test case") Signed-off-by: Donet Tom Reported-by: Geetika Moolchandani Tested-by: Geetika Moolchandani Acked-by: Muchun Song Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 61c8b879f9949aa98ecd1c390c48c2aa809996ee Author: James Gowans Date: Wed Dec 13 08:40:04 2023 +0200 kexec: do syscore_shutdown() in kernel_kexec commit 7bb943806ff61e83ae4cceef8906b7fe52453e8a upstream. syscore_shutdown() runs driver and module callbacks to get the system into a state where it can be correctly shut down. In commit 6f389a8f1dd2 ("PM / reboot: call syscore_shutdown() after disable_nonboot_cpus()") syscore_shutdown() was removed from kernel_restart_prepare() and hence got (incorrectly?) removed from the kexec flow. This was innocuous until commit 6735150b6997 ("KVM: Use syscore_ops instead of reboot_notifier to hook restart/shutdown") changed the way that KVM registered its shutdown callbacks, switching from reboot notifiers to syscore_ops.shutdown. As syscore_shutdown() is missing from kexec, KVM's shutdown hook is not run and virtualisation is left enabled on the boot CPU which results in triple faults when switching to the new kernel on Intel x86 VT-x with VMXE enabled. Fix this by adding syscore_shutdown() to the kexec sequence. In terms of where to add it, it is being added after migrating the kexec task to the boot CPU, but before APs are shut down. It is not totally clear if this is the best place: in commit 6f389a8f1dd2 ("PM / reboot: call syscore_shutdown() after disable_nonboot_cpus()") it is stated that "syscore_ops operations should be carried with one CPU on-line and interrupts disabled." APs are only offlined later in machine_shutdown(), so this syscore_shutdown() is being run while APs are still online. This seems to be the correct place as it matches where syscore_shutdown() is run in the reboot and halt flows - they also run it before APs are shut down. The assumption is that the commit message in commit 6f389a8f1dd2 ("PM / reboot: call syscore_shutdown() after disable_nonboot_cpus()") is no longer valid. KVM has been discussed here as it is what broke loudly by not having syscore_shutdown() in kexec, but this change impacts more than just KVM; all drivers/modules which register a syscore_ops.shutdown callback will now be invoked in the kexec flow. Looking at some of them like x86 MCE it is probably more correct to also shut these down during kexec. Maintainers of all drivers which use syscore_ops.shutdown are added on CC for visibility. They are: arch/powerpc/platforms/cell/spu_base.c .shutdown = spu_shutdown, arch/x86/kernel/cpu/mce/core.c .shutdown = mce_syscore_shutdown, arch/x86/kernel/i8259.c .shutdown = i8259A_shutdown, drivers/irqchip/irq-i8259.c .shutdown = i8259A_shutdown, drivers/irqchip/irq-sun6i-r.c .shutdown = sun6i_r_intc_shutdown, drivers/leds/trigger/ledtrig-cpu.c .shutdown = ledtrig_cpu_syscore_shutdown, drivers/power/reset/sc27xx-poweroff.c .shutdown = sc27xx_poweroff_shutdown, kernel/irq/generic-chip.c .shutdown = irq_gc_shutdown, virt/kvm/kvm_main.c .shutdown = kvm_shutdown, This has been tested by doing a kexec on x86_64 and aarch64. Link: https://lkml.kernel.org/r/20231213064004.2419447-1-jgowans@amazon.com Fixes: 6735150b6997 ("KVM: Use syscore_ops instead of reboot_notifier to hook restart/shutdown") Signed-off-by: James Gowans Cc: Baoquan He Cc: Eric Biederman Cc: Paolo Bonzini Cc: Sean Christopherson Cc: Marc Zyngier Cc: Arnd Bergmann Cc: Tony Luck Cc: Borislav Petkov Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Chen-Yu Tsai Cc: Jernej Skrabec Cc: Samuel Holland Cc: Pavel Machek Cc: Sebastian Reichel Cc: Orson Zhai Cc: Alexander Graf Cc: Jan H. Schoenherr Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 05509adf297924f51e1493aa86f9fcde1433ed80 Author: Muhammad Usama Anjum Date: Tue Jan 9 16:24:42 2024 +0500 fs/proc/task_mmu: move mmu notification mechanism inside mm lock commit 4cccb6221cae6d020270606b9e52b1678fc8b71a upstream. Move mmu notification mechanism inside mm lock to prevent race condition in other components which depend on it. The notifier will invalidate memory range. Depending upon the number of iterations, different memory ranges would be invalidated. The following warning would be removed by this patch: WARNING: CPU: 0 PID: 5067 at arch/x86/kvm/../../../virt/kvm/kvm_main.c:734 kvm_mmu_notifier_change_pte+0x860/0x960 arch/x86/kvm/../../../virt/kvm/kvm_main.c:734 There is no behavioural and performance change with this patch when there is no component registered with the mmu notifier. [akpm@linux-foundation.org: narrow the scope of `range', per Sean] Link: https://lkml.kernel.org/r/20240109112445.590736-1-usama.anjum@collabora.com Fixes: 52526ca7fdb9 ("fs/proc/task_mmu: implement IOCTL to get and optionally clear info about PTEs") Signed-off-by: Muhammad Usama Anjum Reported-by: syzbot+81227d2bd69e9dedb802@syzkaller.appspotmail.com Link: https://lore.kernel.org/all/000000000000f6d051060c6785bc@google.com/ Reviewed-by: Sean Christopherson Cc: Andrei Vagin Cc: Arnd Bergmann Cc: David Hildenbrand Cc: Hugh Dickins Cc: Kefeng Wang Cc: Liam R. Howlett Cc: Michał Mirosław Cc: Peter Xu Cc: Ryan Roberts Cc: Stephen Rothwell Cc: Suren Baghdasaryan Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 538db7912d9bd8d92f5afef03c0d5eafe1ef0fbe Author: Di Shen Date: Wed Jan 10 19:55:26 2024 +0800 thermal: gov_power_allocator: avoid inability to reset a cdev commit e95fa7404716f6e25021e66067271a4ad8eb1486 upstream. Commit 0952177f2a1f ("thermal/core/power_allocator: Update once cooling devices when temp is low") adds an update flag to avoid triggering a thermal event when there is no need, and the thermal cdev is updated once when the temperature is low. But when the trips are writable, and switch_on_temp is set to be a higher value, the cooling device state may not be reset to 0, because last_temperature is smaller than switch_on_temp. For example: First: switch_on_temp=70 control_temp=85; Then userspace change the trip_temp: switch_on_temp=45 control_temp=55 cur_temp=54 Then userspace reset the trip_temp: switch_on_temp=70 control_temp=85 cur_temp=57 last_temp=54 At this time, the cooling device state should be reset to 0. However, because cur_temp(57) < switch_on_temp(70) last_temp(54) < switch_on_temp(70) ----> update = false, update is false, the cooling device state can not be reset. Using the observation that tz->passive can also be regarded as the temperature status, set the update flag to the tz->passive value. When the temperature drops below switch_on for the first time, the states of cooling devices can be reset once, and tz->passive is updated to 0. In the next round, because tz->passive is 0, cdev->state will not be updated. By using the tz->passive value as the "update" flag, the issue above can be solved, and the cooling devices can be updated only once when the temperature is low. Fixes: 0952177f2a1f ("thermal/core/power_allocator: Update once cooling devices when temp is low") Cc: 5.13+ # 5.13+ Suggested-by: Wei Wang Signed-off-by: Di Shen Reviewed-by: Lukasz Luba [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit b499289e6338b580b27589baf346f3fecda0c887 Author: Zhihao Cheng Date: Fri Dec 22 16:54:46 2023 +0800 ubifs: ubifs_symlink: Fix memleak of inode->i_link in error path commit 1e022216dcd248326a5bb95609d12a6815bca4e2 upstream. For error handling path in ubifs_symlink(), inode will be marked as bad first, then iput() is invoked. If inode->i_link is initialized by fscrypt_encrypt_symlink() in encryption scenario, inode->i_link won't be freed by callchain ubifs_free_inode -> fscrypt_free_inode in error handling path, because make_bad_inode() has changed 'inode->i_mode' as 'S_IFREG'. Following kmemleak is easy to be reproduced by injecting error in ubifs_jnl_update() when doing symlink in encryption scenario: unreferenced object 0xffff888103da3d98 (size 8): comm "ln", pid 1692, jiffies 4294914701 (age 12.045s) backtrace: kmemdup+0x32/0x70 __fscrypt_encrypt_symlink+0xed/0x1c0 ubifs_symlink+0x210/0x300 [ubifs] vfs_symlink+0x216/0x360 do_symlinkat+0x11a/0x190 do_syscall_64+0x3b/0xe0 There are two ways fixing it: 1. Remove make_bad_inode() in error handling path. We can do that because ubifs_evict_inode() will do same processes for good symlink inode and bad symlink inode, for inode->i_nlink checking is before is_bad_inode(). 2. Free inode->i_link before marking inode bad. Method 2 is picked, it has less influence, personally, I think. Cc: stable@vger.kernel.org Fixes: 2c58d548f570 ("fscrypt: cache decrypted symlink target in ->i_link") Signed-off-by: Zhihao Cheng Suggested-by: Eric Biggers Reviewed-by: Eric Biggers Signed-off-by: Richard Weinberger Signed-off-by: Greg Kroah-Hartman commit 8b745a045e885a35bfd282504870032c824294f1 Author: Huacai Chen Date: Fri Dec 29 16:02:13 2023 +0800 kdump: defer the insertion of crashkernel resources commit 4a693ce65b186fddc1a73621bd6f941e6e3eca21 upstream. In /proc/iomem, sub-regions should be inserted after their parent, otherwise the insertion of parent resource fails. But after generic crashkernel reservation applied, in both RISC-V and ARM64 (LoongArch will also use generic reservation later on), crashkernel resources are inserted before their parent, which causes the parent disappear in /proc/iomem. So we defer the insertion of crashkernel resources to an early_initcall(). 1, Without 'crashkernel' parameter: 100d0100-100d01ff : LOON0001:00 100d0100-100d01ff : LOON0001:00 LOON0001:00 100e0000-100e0bff : LOON0002:00 100e0000-100e0bff : LOON0002:00 LOON0002:00 1fe001e0-1fe001e7 : serial 90400000-fa17ffff : System RAM f6220000-f622ffff : Reserved f9ee0000-f9ee3fff : Reserved fa120000-fa17ffff : Reserved fa190000-fe0bffff : System RAM fa190000-fa1bffff : Reserved fe4e0000-47fffffff : System RAM 43c000000-441ffffff : Reserved 47ff98000-47ffa3fff : Reserved 47ffa4000-47ffa7fff : Reserved 47ffa8000-47ffabfff : Reserved 47ffac000-47ffaffff : Reserved 47ffb0000-47ffb3fff : Reserved 2, With 'crashkernel' parameter, before this patch: 100d0100-100d01ff : LOON0001:00 100d0100-100d01ff : LOON0001:00 LOON0001:00 100e0000-100e0bff : LOON0002:00 100e0000-100e0bff : LOON0002:00 LOON0002:00 1fe001e0-1fe001e7 : serial e6200000-f61fffff : Crash kernel fa190000-fe0bffff : System RAM fa190000-fa1bffff : Reserved fe4e0000-47fffffff : System RAM 43c000000-441ffffff : Reserved 47ff98000-47ffa3fff : Reserved 47ffa4000-47ffa7fff : Reserved 47ffa8000-47ffabfff : Reserved 47ffac000-47ffaffff : Reserved 47ffb0000-47ffb3fff : Reserved 3, With 'crashkernel' parameter, after this patch: 100d0100-100d01ff : LOON0001:00 100d0100-100d01ff : LOON0001:00 LOON0001:00 100e0000-100e0bff : LOON0002:00 100e0000-100e0bff : LOON0002:00 LOON0002:00 1fe001e0-1fe001e7 : serial 90400000-fa17ffff : System RAM e6200000-f61fffff : Crash kernel f6220000-f622ffff : Reserved f9ee0000-f9ee3fff : Reserved fa120000-fa17ffff : Reserved fa190000-fe0bffff : System RAM fa190000-fa1bffff : Reserved fe4e0000-47fffffff : System RAM 43c000000-441ffffff : Reserved 47ff98000-47ffa3fff : Reserved 47ffa4000-47ffa7fff : Reserved 47ffa8000-47ffabfff : Reserved 47ffac000-47ffaffff : Reserved 47ffb0000-47ffb3fff : Reserved Link: https://lkml.kernel.org/r/20231229080213.2622204-1-chenhuacai@loongson.cn Signed-off-by: Huacai Chen Fixes: 0ab97169aa05 ("crash_core: add generic function to do reservation") Cc: Baoquan He Cc: Zhen Lei Cc: [6.6+] Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit 8afa41bb2d31d382325b1e9d875df906c821331c Author: Ma Wupeng Date: Tue Jan 9 12:15:36 2024 +0800 efi: disable mirror feature during crashkernel commit 7ea6ec4c25294e8bc8788148ef854df92ee8dc5e upstream. If the system has no mirrored memory or uses crashkernel.high while kernelcore=mirror is enabled on the command line then during crashkernel, there will be limited mirrored memory and this usually leads to OOM. To solve this problem, disable the mirror feature during crashkernel. Link: https://lkml.kernel.org/r/20240109041536.3903042-1-mawupeng1@huawei.com Signed-off-by: Ma Wupeng Acked-by: Mike Rapoport (IBM) Cc: Signed-off-by: Andrew Morton Signed-off-by: Greg Kroah-Hartman commit f66509fd27be96713a5181479d4d0bfb095da6fd Author: Dave Airlie Date: Wed Jan 10 11:14:05 2024 +1000 nouveau/gsp: handle engines in runl without nonstall interrupts. commit 205e18c13545ab43cc4fe4930732b4feef551198 upstream. It appears on TU106 GPUs (2070), that some of the nvdec engines are in the runlist but have no valid nonstall interrupt, nouveau didn't handle that too well. This should let nouveau/gsp work on those. Cc: stable@vger.kernel.org # v6.7+ Signed-off-by: Dave Airlie Link: https://lore.kernel.org/all/20240110011826.3996289-1-airlied@gmail.com/ Signed-off-by: Greg Kroah-Hartman commit e3d4454f7c004ff3eef2fb9c4896b7bef830d647 Author: Dave Airlie Date: Thu Jan 18 06:19:57 2024 +1000 nouveau/vmm: don't set addr on the fail path to avoid warning commit cacea81390fd8c8c85404e5eb2adeb83d87a912e upstream. nvif_vmm_put gets called if addr is set, but if the allocation fails we don't need to call put, otherwise we get a warning like [523232.435671] ------------[ cut here ]------------ [523232.435674] WARNING: CPU: 8 PID: 1505697 at drivers/gpu/drm/nouveau/nvif/vmm.c:68 nvif_vmm_put+0x72/0x80 [nouveau] [523232.435795] Modules linked in: uinput rfcomm snd_seq_dummy snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink qrtr bnep sunrpc binfmt_misc intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common isst_if_common iwlmvm nfit libnvdimm vfat fat x86_pkg_temp_thermal intel_powerclamp mac80211 snd_soc_avs snd_soc_hda_codec coretemp snd_hda_ext_core snd_soc_core snd_hda_codec_realtek kvm_intel snd_hda_codec_hdmi snd_compress snd_hda_codec_generic ac97_bus snd_pcm_dmaengine snd_hda_intel libarc4 snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec kvm iwlwifi snd_hda_core btusb snd_hwdep btrtl snd_seq btintel irqbypass btbcm rapl snd_seq_device eeepc_wmi btmtk intel_cstate iTCO_wdt cfg80211 snd_pcm asus_wmi bluetooth intel_pmc_bxt iTCO_vendor_support snd_timer ledtrig_audio pktcdvd snd mei_me [523232.435828] sparse_keymap intel_uncore i2c_i801 platform_profile wmi_bmof mei pcspkr ioatdma soundcore i2c_smbus rfkill idma64 dca joydev acpi_tad loop zram nouveau drm_ttm_helper ttm video drm_exec drm_gpuvm gpu_sched crct10dif_pclmul i2c_algo_bit nvme crc32_pclmul crc32c_intel drm_display_helper polyval_clmulni nvme_core polyval_generic e1000e mxm_wmi cec ghash_clmulni_intel r8169 sha512_ssse3 nvme_common wmi pinctrl_sunrisepoint uas usb_storage ip6_tables ip_tables fuse [523232.435849] CPU: 8 PID: 1505697 Comm: gnome-shell Tainted: G W 6.6.0-rc7-nvk-uapi+ #12 [523232.435851] Hardware name: System manufacturer System Product Name/ROG STRIX X299-E GAMING II, BIOS 1301 09/24/2021 [523232.435852] RIP: 0010:nvif_vmm_put+0x72/0x80 [nouveau] [523232.435934] Code: 00 00 48 89 e2 be 02 00 00 00 48 c7 04 24 00 00 00 00 48 89 44 24 08 e8 fc bf ff ff 85 c0 75 0a 48 c7 43 08 00 00 00 00 eb b3 <0f> 0b eb f2 e8 f5 c9 b2 e6 0f 1f 44 00 00 90 90 90 90 90 90 90 90 [523232.435936] RSP: 0018:ffffc900077ffbd8 EFLAGS: 00010282 [523232.435937] RAX: 00000000fffffffe RBX: ffffc900077ffc00 RCX: 0000000000000010 [523232.435938] RDX: 0000000000000010 RSI: ffffc900077ffb38 RDI: ffffc900077ffbd8 [523232.435940] RBP: ffff888e1c4f2140 R08: 0000000000000000 R09: 0000000000000000 [523232.435940] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888503811800 [523232.435941] R13: ffffc900077ffca0 R14: ffff888e1c4f2140 R15: ffff88810317e1e0 [523232.435942] FS: 00007f933a769640(0000) GS:ffff88905fa00000(0000) knlGS:0000000000000000 [523232.435943] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [523232.435944] CR2: 00007f930bef7000 CR3: 00000005d0322001 CR4: 00000000003706e0 [523232.435945] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [523232.435946] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [523232.435964] Call Trace: [523232.435965] [523232.435966] ? nvif_vmm_put+0x72/0x80 [nouveau] [523232.436051] ? __warn+0x81/0x130 [523232.436055] ? nvif_vmm_put+0x72/0x80 [nouveau] [523232.436138] ? report_bug+0x171/0x1a0 [523232.436142] ? handle_bug+0x3c/0x80 [523232.436144] ? exc_invalid_op+0x17/0x70 [523232.436145] ? asm_exc_invalid_op+0x1a/0x20 [523232.436149] ? nvif_vmm_put+0x72/0x80 [nouveau] [523232.436230] ? nvif_vmm_put+0x64/0x80 [nouveau] [523232.436342] nouveau_vma_del+0x80/0xd0 [nouveau] [523232.436506] nouveau_vma_new+0x1a0/0x210 [nouveau] [523232.436671] nouveau_gem_object_open+0x1d0/0x1f0 [nouveau] [523232.436835] drm_gem_handle_create_tail+0xd1/0x180 [523232.436840] drm_prime_fd_to_handle_ioctl+0x12e/0x200 [523232.436844] ? __pfx_drm_prime_fd_to_handle_ioctl+0x10/0x10 [523232.436847] drm_ioctl_kernel+0xd3/0x180 [523232.436849] drm_ioctl+0x26d/0x4b0 [523232.436851] ? __pfx_drm_prime_fd_to_handle_ioctl+0x10/0x10 [523232.436855] nouveau_drm_ioctl+0x5a/0xb0 [nouveau] [523232.437032] __x64_sys_ioctl+0x94/0xd0 [523232.437036] do_syscall_64+0x5d/0x90 [523232.437040] ? syscall_exit_to_user_mode+0x2b/0x40 [523232.437044] ? do_syscall_64+0x6c/0x90 [523232.437046] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Reported-by: Faith Ekstrand Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie Link: https://patchwork.freedesktop.org/patch/msgid/20240117213852.295565-1-airlied@gmail.com Signed-off-by: Greg Kroah-Hartman commit 3d049da3f1de6d1033b0e91164ad8eb82d5c6027 Author: Mario Limonciello Date: Mon Nov 27 23:36:53 2023 -0600 rtc: Extend timeout for waiting for UIP to clear to 1s commit cef9ecc8e938dd48a560f7dd9be1246359248d20 upstream. Specs don't say anything about UIP being cleared within 10ms. They only say that UIP won't occur for another 244uS. If a long NMI occurs while UIP is still updating it might not be possible to get valid data in 10ms. This has been observed in the wild that around s2idle some calls can take up to 480ms before UIP is clear. Adjust callers from outside an interrupt context to wait for up to a 1s instead of 10ms. Cc: # 6.1.y Fixes: ec5895c0f2d8 ("rtc: mc146818-lib: extract mc146818_avoid_UIP") Reported-by: Carsten Hatger Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217626 Tested-by: Mateusz Jończyk Reviewed-by: Mateusz Jończyk Acked-by: Mateusz Jończyk Signed-off-by: Mario Limonciello Link: https://lore.kernel.org/r/20231128053653.101798-5-mario.limonciello@amd.com Signed-off-by: Alexandre Belloni Signed-off-by: Greg Kroah-Hartman commit 0181f829da16c92dd8a910bc31d7b0aa1f98f1ed Author: Mario Limonciello Date: Mon Nov 27 23:36:52 2023 -0600 rtc: Add support for configuring the UIP timeout for RTC reads commit 120931db07b49252aba2073096b595482d71857c upstream. The UIP timeout is hardcoded to 10ms for all RTC reads, but in some contexts this might not be enough time. Add a timeout parameter to mc146818_get_time() and mc146818_get_time_callback(). If UIP timeout is configured by caller to be >=100 ms and a call takes this long, log a warning. Make all callers use 10ms to ensure no functional changes. Cc: # 6.1.y Fixes: ec5895c0f2d8 ("rtc: mc146818-lib: extract mc146818_avoid_UIP") Signed-off-by: Mario Limonciello Tested-by: Mateusz Jończyk Reviewed-by: Mateusz Jończyk Acked-by: Mateusz Jończyk Link: https://lore.kernel.org/r/20231128053653.101798-4-mario.limonciello@amd.com Signed-off-by: Alexandre Belloni Signed-off-by: Greg Kroah-Hartman commit f18817875b96a89fff702fcebe23b5feee3a37b1 Author: Mario Limonciello Date: Mon Nov 27 23:36:50 2023 -0600 rtc: mc146818-lib: Adjust failure return code for mc146818_get_time() commit af838635a3eb9b1bc0d98599c101ebca98f31311 upstream. mc146818_get_time() calls mc146818_avoid_UIP() to avoid fetching the time while RTC update is in progress (UIP). When this fails, the return code is -EIO, but actually there was no IO failure. The reason for the return from mc146818_avoid_UIP() is that the UIP wasn't cleared in the time period. Adjust the return code to -ETIMEDOUT to match the behavior. Tested-by: Mateusz Jończyk Reviewed-by: Mateusz Jończyk Acked-by: Mateusz Jończyk Cc: Fixes: 2a61b0ac5493 ("rtc: mc146818-lib: refactor mc146818_get_time") Signed-off-by: Mario Limonciello Link: https://lore.kernel.org/r/20231128053653.101798-2-mario.limonciello@amd.com Signed-off-by: Alexandre Belloni Signed-off-by: Greg Kroah-Hartman commit e949b1642b29f9ffd32f5ac41953e4c402903c32 Author: Mario Limonciello Date: Mon Nov 27 23:36:51 2023 -0600 rtc: Adjust failure return code for cmos_set_alarm() commit 1311a8f0d4b23f58bbababa13623aa40b8ad4e0c upstream. When mc146818_avoid_UIP() fails to return a valid value, this is because UIP didn't clear in the timeout period. Adjust the return code in this case to -ETIMEDOUT. Tested-by: Mateusz Jończyk Reviewed-by: Mateusz Jończyk Acked-by: Mateusz Jończyk Cc: Fixes: cdedc45c579f ("rtc: cmos: avoid UIP when reading alarm time") Fixes: cd17420ebea5 ("rtc: cmos: avoid UIP when writing alarm time") Signed-off-by: Mario Limonciello Link: https://lore.kernel.org/r/20231128053653.101798-3-mario.limonciello@amd.com Signed-off-by: Alexandre Belloni Signed-off-by: Greg Kroah-Hartman commit dea7cc6ca45a04146b71d099316f320cbf3d636e Author: Mario Limonciello Date: Mon Nov 6 10:23:10 2023 -0600 rtc: cmos: Use ACPI alarm for non-Intel x86 systems too commit 3d762e21d56370a43478b55e604b4a83dd85aafc upstream. Intel systems > 2015 have been configured to use ACPI alarm instead of HPET to avoid s2idle issues. Having HPET programmed for wakeup causes problems on AMD systems with s2idle as well. One particular case is that the systemd "SuspendThenHibernate" feature doesn't work properly on the Framework 13" AMD model. Switching to using ACPI alarm fixes the issue. Adjust the quirk to apply to AMD/Hygon systems from 2021 onwards. This matches what has been tested and is specifically to avoid potential risk to older systems. Cc: # 6.1+ Reported-by: Reported-by: Closes: https://github.com/systemd/systemd/issues/24279 Reported-by: Kelvie Wong Closes: https://community.frame.work/t/systemd-suspend-then-hibernate-wakes-up-after-5-minutes/39392 Signed-off-by: Mario Limonciello Link: https://lore.kernel.org/r/20231106162310.85711-1-mario.limonciello@amd.com Signed-off-by: Alexandre Belloni Signed-off-by: Greg Kroah-Hartman commit baa0aaac16432019651e0d60c41cd34a0c3c3477 Author: Mark Rutland Date: Tue Jan 16 11:02:20 2024 +0000 arm64: entry: fix ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD commit 832dd634bd1b4e3bbe9f10b9c9ba5db6f6f2b97f upstream. Currently the ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD workaround isn't quite right, as it is supposed to be applied after the last explicit memory access, but is immediately followed by an LDR. The ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD workaround is used to handle Cortex-A520 erratum 2966298 and Cortex-A510 erratum 3117295, which are described in: * https://developer.arm.com/documentation/SDEN2444153/0600/?lang=en * https://developer.arm.com/documentation/SDEN1873361/1600/?lang=en In both cases the workaround is described as: | If pagetable isolation is disabled, the context switch logic in the | kernel can be updated to execute the following sequence on affected | cores before exiting to EL0, and after all explicit memory accesses: | | 1. A non-shareable TLBI to any context and/or address, including | unused contexts or addresses, such as a `TLBI VALE1 Xzr`. | | 2. A DSB NSH to guarantee completion of the TLBI. The important part being that the TLBI+DSB must be placed "after all explicit memory accesses". Unfortunately, as-implemented, the TLBI+DSB is immediately followed by an LDR, as we have: | alternative_if ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD | tlbi vale1, xzr | dsb nsh | alternative_else_nop_endif | alternative_if_not ARM64_UNMAP_KERNEL_AT_EL0 | ldr lr, [sp, #S_LR] | add sp, sp, #PT_REGS_SIZE // restore sp | eret | alternative_else_nop_endif | | [ ... KPTI exception return path ... ] This patch fixes this by reworking the logic to place the TLBI+DSB immediately before the ERET, after all explicit memory accesses. The ERET is currently in a separate alternative block, and alternatives cannot be nested. To account for this, the alternative block for ARM64_UNMAP_KERNEL_AT_EL0 is replaced with a single alternative branch to skip the KPTI logic, with the new shape of the logic being: | alternative_insn "b .L_skip_tramp_exit_\@", nop, ARM64_UNMAP_KERNEL_AT_EL0 | [ ... KPTI exception return path ... ] | .L_skip_tramp_exit_\@: | | ldr lr, [sp, #S_LR] | add sp, sp, #PT_REGS_SIZE // restore sp | | alternative_if ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD | tlbi vale1, xzr | dsb nsh | alternative_else_nop_endif | eret The new structure means that the workaround is only applied when KPTI is not in use; this is fine as noted in the documented implications of the erratum: | Pagetable isolation between EL0 and higher level ELs prevents the | issue from occurring. ... and as per the workaround description quoted above, the workaround is only necessary "If pagetable isolation is disabled". Fixes: 471470bc7052 ("arm64: errata: Add Cortex-A520 speculative unprivileged load workaround") Signed-off-by: Mark Rutland Cc: Catalin Marinas Cc: James Morse Cc: Rob Herring Cc: Will Deacon Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240116110221.420467-2-mark.rutland@arm.com Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit 814af6b4e6000e574e74d92197190edf07cc3680 Author: Mark Brown Date: Mon Jan 15 20:15:46 2024 +0000 arm64/sme: Always exit sme_alloc() early with existing storage commit dc7eb8755797ed41a0d1b5c0c39df3c8f401b3d9 upstream. When sme_alloc() is called with existing storage and we are not flushing we will always allocate new storage, both leaking the existing storage and corrupting the state. Fix this by separating the checks for flushing and for existing storage as we do for SVE. Callers that reallocate (eg, due to changing the vector length) should call sme_free() themselves. Fixes: 5d0a8d2fba50 ("arm64/ptrace: Ensure that SME is set up for target when writing SSVE state") Signed-off-by: Mark Brown Cc: Link: https://lore.kernel.org/r/20240115-arm64-sme-flush-v1-1-7472bd3459b7@kernel.org Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit 0d600ea11f4c966f156f3d73ff8f348de016d863 Author: Rob Herring Date: Wed Jan 10 11:29:21 2024 -0600 arm64: errata: Add Cortex-A510 speculative unprivileged load workaround commit f827bcdafa2a2ac21c91e47f587e8d0c76195409 upstream. Implement the workaround for ARM Cortex-A510 erratum 3117295. On an affected Cortex-A510 core, a speculatively executed unprivileged load might leak data from a privileged load via a cache side channel. The issue only exists for loads within a translation regime with the same translation (e.g. same ASID and VMID). Therefore, the issue only affects the return to EL0. The erratum and workaround are the same as ARM Cortex-A520 erratum 2966298, so reuse the existing workaround. Cc: stable@vger.kernel.org Signed-off-by: Rob Herring Reviewed-by: Mark Rutland Link: https://lore.kernel.org/r/20240110-arm-errata-a510-v1-2-d02bc51aeeee@kernel.org Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit 625a9e25192141a101a2dc62d40c71b020df6e25 Author: Rob Herring Date: Wed Jan 10 11:29:20 2024 -0600 arm64: Rename ARM64_WORKAROUND_2966298 commit 546b7cde9b1dd36089649101b75266564600ffe5 upstream. In preparation to apply ARM64_WORKAROUND_2966298 for multiple errata, rename the kconfig and capability. No functional change. Cc: stable@vger.kernel.org Signed-off-by: Rob Herring Reviewed-by: Mark Rutland Link: https://lore.kernel.org/r/20240110-arm-errata-a510-v1-1-d02bc51aeeee@kernel.org Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit eb01db7143cf3a8f0a7027b5ad731b23f8b5be7e Author: Andrew Jones Date: Wed Jan 17 14:09:34 2024 +0100 RISC-V: selftests: cbo: Ensure asm operands match constraints commit 0de65288d75ff96c30e216557d979fb9342c4323 upstream. The 'i' constraint expects a constant operand, which fn and its constant derivative MK_CBO(fn) are, but passing fn through a function as a parameter and using a local variable for MK_CBO(fn) allow the compiler to lose sight of that when no optimization is done. Use a macro instead of a function and skip the local variable to ensure the compiler uses constants, matching the asm constraints. Reported-by: Yunhui Cui Closes: https://lore.kernel.org/all/20240117082514.42967-1-cuiyunhui@bytedance.com Fixes: a29e2a48afe3 ("RISC-V: selftests: Add CBO tests") Signed-off-by: Andrew Jones Link: https://lore.kernel.org/r/20240117130933.57514-2-ajones@ventanamicro.com Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt Signed-off-by: Greg Kroah-Hartman commit 69343951738f7a1ab1475777aa5d57a7d07cb4d5 Author: Guo Ren Date: Fri Dec 22 06:57:00 2023 -0500 riscv: mm: Fixup compat mode boot failure commit 5f449e245e5b0d9d63eef6c8968fbdc3a8594407 upstream. In COMPAT mode, the STACK_TOP is DEFAULT_MAP_WINDOW (0x80000000), but the TASK_SIZE is 0x7fff000. When the user stack is upon 0x7fff000, it will cause a user segment fault. Sometimes, it would cause boot failure when the whole rootfs is rv32. Freeing unused kernel image (initmem) memory: 2236K Run /sbin/init as init process Starting init: /sbin/init exists but couldn't execute it (error -14) Run /etc/init as init process ... Increase the TASK_SIZE to cover STACK_TOP. Cc: stable@vger.kernel.org Fixes: add2cc6b6515 ("RISC-V: mm: Restrict address space for sv39,sv48,sv57") Signed-off-by: Guo Ren Signed-off-by: Guo Ren Reviewed-by: Leonardo Bras Reviewed-by: Charlie Jenkins Link: https://lore.kernel.org/r/20231222115703.2404036-2-guoren@kernel.org Signed-off-by: Palmer Dabbelt Signed-off-by: Greg Kroah-Hartman commit abc8175e34a065f0e04ece3d2eb35596af02218a Author: Guo Ren Date: Fri Dec 22 06:57:01 2023 -0500 riscv: mm: Fixup compat arch_get_mmap_end commit 97b7ac69be2e5a683e898f5267f659fde52efdd5 upstream. When the task is in COMPAT mode, the arch_get_mmap_end should be 2GB, not TASK_SIZE_64. The TASK_SIZE has contained is_compat_mode() detection, so change the definition of STACK_TOP_MAX to TASK_SIZE directly. Cc: stable@vger.kernel.org Fixes: add2cc6b6515 ("RISC-V: mm: Restrict address space for sv39,sv48,sv57") Signed-off-by: Guo Ren Signed-off-by: Guo Ren Reviewed-by: Leonardo Bras Reviewed-by: Charlie Jenkins Link: https://lore.kernel.org/r/20231222115703.2404036-3-guoren@kernel.org Signed-off-by: Palmer Dabbelt Signed-off-by: Greg Kroah-Hartman commit 6e2f37022f0fc0893da4d85a0500c9d547fffd4c Author: Zheng Wang Date: Mon Nov 6 15:48:10 2023 +0100 media: mtk-jpeg: Fix use after free bug due to error path handling in mtk_jpeg_dec_device_run commit 206c857dd17d4d026de85866f1b5f0969f2a109e upstream. In mtk_jpeg_probe, &jpeg->job_timeout_work is bound with mtk_jpeg_job_timeout_work. In mtk_jpeg_dec_device_run, if error happens in mtk_jpeg_set_dec_dst, it will finally start the worker while mark the job as finished by invoking v4l2_m2m_job_finish. There are two methods to trigger the bug. If we remove the module, it which will call mtk_jpeg_remove to make cleanup. The possible sequence is as follows, which will cause a use-after-free bug. CPU0 CPU1 mtk_jpeg_dec_... | start worker | |mtk_jpeg_job_timeout_work mtk_jpeg_remove | v4l2_m2m_release | kfree(m2m_dev); | | | v4l2_m2m_get_curr_priv | m2m_dev->curr_ctx //use If we close the file descriptor, which will call mtk_jpeg_release, it will have a similar sequence. Fix this bug by starting timeout worker only if started jpegdec worker successfully. Then v4l2_m2m_job_finish will only be called in either mtk_jpeg_job_timeout_work or mtk_jpeg_dec_device_run. Fixes: b2f0d2724ba4 ("[media] vcodec: mediatek: Add Mediatek JPEG Decoder Driver") Signed-off-by: Zheng Wang Signed-off-by: Dmitry Osipenko Cc: stable@vger.kernel.org Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit a195de026417fc8f530d649ff792fdf0fbec1e30 Author: Zheng Wang Date: Mon Nov 6 15:48:11 2023 +0100 media: mtk-jpeg: Fix timeout schedule error in mtk_jpegdec_worker. commit 38e1857933def4b3fafc28cc34ff3bbc84cad2c3 upstream. In mtk_jpegdec_worker, if error occurs in mtk_jpeg_set_dec_dst, it will start the timeout worker and invoke v4l2_m2m_job_finish at the same time. This will break the logic of design for there should be only one function to call v4l2_m2m_job_finish. But now the timeout handler and mtk_jpegdec_worker will both invoke it. Fix it by start the worker only if mtk_jpeg_set_dec_dst successfully finished. Fixes: da4ede4b7fd6 ("media: mtk-jpeg: move data/code inside CONFIG_OF blocks") Signed-off-by: Zheng Wang Signed-off-by: Dmitry Osipenko Cc: stable@vger.kernel.org Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit b9ecc0494d4b62d1723be2ce1c3382852aae1546 Author: Alain Volmat Date: Mon Nov 13 15:57:30 2023 +0100 media: i2c: st-mipid02: correct format propagation commit b33cb0cbe2893b96ecbfa16254407153f4b55d16 upstream. Use a copy of the struct v4l2_subdev_format when propagating format from the sink to source pad in order to avoid impacting the sink format returned to the application. Thanks to Jacopo Mondi for pointing the issue. Fixes: 6c01e6f3f27b ("media: st-mipid02: Propagate format from sink to source pad") Signed-off-by: Alain Volmat Cc: stable@vger.kernel.org Reviewed-by: Jacopo Mondi Reviewed-by: Daniel Scally Reviewed-by: Benjamin Mugnier Signed-off-by: Sakari Ailus Signed-off-by: Hans Verkuil Signed-off-by: Greg Kroah-Hartman commit 69bd72260868186cf4a3c24c2129adcb8ff8c591 Author: Andy Shevchenko Date: Fri Dec 8 00:19:01 2023 +0200 mmc: mmc_spi: remove custom DMA mapped buffers commit 84a6be7db9050dd2601c9870f65eab9a665d2d5d upstream. There is no need to duplicate what SPI core or individual controller drivers already do, i.e. mapping the buffers for DMA capable transfers. Note, that the code, besides its redundancy, was buggy: strictly speaking there is no guarantee, while it's true for those which can use this code (see below), that the SPI host controller _is_ the device which does DMA. Also see the Link tags below. Additional notes. Currently only two SPI host controller drivers may use premapped (by the user) DMA buffers: - drivers/spi/spi-au1550.c - drivers/spi/spi-fsl-spi.c Both of them have DMA mapping support code. I don't expect that SPI host controller code is worse than what has been done in mmc_spi. Hence I do not expect any regressions here. Otherwise, I'm pretty much sure these regressions have to be fixed in the respective drivers, and not here. That said, remove all related pieces of DMA mapping code from mmc_spi. Link: https://lore.kernel.org/linux-mmc/c73b9ba9-1699-2aff-e2fd-b4b4f292a3ca@raspberrypi.org/ Link: https://stackoverflow.com/questions/67620728/mmc-spi-issue-not-able-to-setup-mmc-sd-card-in-linux Signed-off-by: Andy Shevchenko Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231207221901.3259962-1-andriy.shevchenko@linux.intel.com Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit eed9119f8f8e8fbf225c08abdbb58597fba807e0 Author: Avri Altman Date: Wed Nov 29 11:25:35 2023 +0200 mmc: core: Use mrq.sbc in close-ended ffu commit 4d0c8d0aef6355660b6775d57ccd5d4ea2e15802 upstream. Field Firmware Update (ffu) may use close-ended or open ended sequence. Each such sequence is comprised of a write commands enclosed between 2 switch commands - to and from ffu mode. So for the close-ended case, it will be: cmd6->cmd23-cmd25-cmd6. Some host controllers however, get confused when multi-block rw is sent without sbc, and may generate auto-cmd12 which breaks the ffu sequence. I encountered this issue while testing fwupd (github.com/fwupd/fwupd) on HP Chromebook x2, a qualcomm based QC-7c, code name - strongbad. Instead of a quirk, or hooking the request function of the msm ops, it would be better to fix the ioctl handling and make it use mrq.sbc instead of issuing SET_BLOCK_COUNT separately. Signed-off-by: Avri Altman Acked-by: Adrian Hunter Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231129092535.3278-1-avri.altman@wdc.com Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit f7013649c5423b419e16fb4c0828c4962e34abe1 Author: Michael Grzeschik Date: Thu Nov 23 23:32:05 2023 +0100 media: videobuf2-dma-sg: fix vmap callback commit 608ca5a60ee47b48fec210aeb7a795a64eb5dcee upstream. For dmabuf import users to be able to use the vaddr from another videobuf2-dma-sg source, the exporter needs to set a proper vaddr on vb2_dma_sg_dmabuf_ops_vmap callback. This patch adds vmap on map if buf->vaddr was not set. Cc: stable@kernel.org Fixes: 7938f4218168 ("dma-buf-map: Rename to iosys-map") Signed-off-by: Michael Grzeschik Acked-by: Tomasz Figa Signed-off-by: Hans Verkuil Signed-off-by: Greg Kroah-Hartman commit fd956ce66beb7eba60aad707d9b36e583e24878c Author: Vegard Nossum Date: Mon Jan 1 00:59:58 2024 +0100 scripts/get_abi: fix source path leak commit 5889d6ede53bc17252f79c142387e007224aa554 upstream. The code currently leaks the absolute path of the ABI files into the rendered documentation. There exists code to prevent this, but it is not effective when an absolute path is passed, which it is when $srctree is used. I consider this to be a minimal, stop-gap fix; a better fix would strip off the actual prefix instead of hacking it off with a regex. Link: https://mastodon.social/@vegard/111677490643495163 Cc: Jani Nikula Cc: stable@vger.kernel.org Signed-off-by: Vegard Nossum Signed-off-by: Jonathan Corbet Link: https://lore.kernel.org/r/20231231235959.3342928-1-vegard.nossum@oracle.com Signed-off-by: Greg Kroah-Hartman commit 78d757c607225a43e583edffdac3fdf3f5ee922e Author: Vegard Nossum Date: Mon Jan 1 00:59:59 2024 +0100 docs: kernel_abi.py: fix command injection commit 3231dd5862779c2e15633c96133a53205ad660ce upstream. The kernel-abi directive passes its argument straight to the shell. This is unfortunate and unnecessary. Let's always use paths relative to $srctree/Documentation/ and use subprocess.check_call() instead of subprocess.Popen(shell=True). This also makes the code shorter. Link: https://fosstodon.org/@jani/111676532203641247 Reported-by: Jani Nikula Cc: stable@vger.kernel.org Signed-off-by: Vegard Nossum Signed-off-by: Jonathan Corbet Link: https://lore.kernel.org/r/20231231235959.3342928-2-vegard.nossum@oracle.com Signed-off-by: Greg Kroah-Hartman commit 4ecf1864f2076872b7aea29d463e785ef6fc9909 Author: Jordan Rife Date: Mon Nov 6 15:24:38 2023 -0600 dlm: use kernel_connect() and kernel_bind() commit e9cdebbe23f1aa9a1caea169862f479ab3fa2773 upstream. Recent changes to kernel_connect() and kernel_bind() ensure that callers are insulated from changes to the address parameter made by BPF SOCK_ADDR hooks. This patch wraps direct calls to ops->connect() and ops->bind() with kernel_connect() and kernel_bind() to protect callers in such cases. Link: https://lore.kernel.org/netdev/9944248dba1bce861375fcce9de663934d933ba9.camel@redhat.com/ Fixes: d74bad4e74ee ("bpf: Hooks for sys_connect") Fixes: 4fbac77d2d09 ("bpf: Hooks for sys_bind") Cc: stable@vger.kernel.org Signed-off-by: Jordan Rife Signed-off-by: David Teigland Signed-off-by: Greg Kroah-Hartman commit 1818ee771c3afcff013fe25043b0ffa2017ecfb9 Author: Alfred Piccioni Date: Tue Dec 19 10:09:09 2023 +0100 lsm: new security_file_ioctl_compat() hook commit f1bb47a31dff6d4b34fb14e99850860ee74bb003 upstream. Some ioctl commands do not require ioctl permission, but are routed to other permissions such as FILE_GETATTR or FILE_SETATTR. This routing is done by comparing the ioctl cmd to a set of 64-bit flags (FS_IOC_*). However, if a 32-bit process is running on a 64-bit kernel, it emits 32-bit flags (FS_IOC32_*) for certain ioctl operations. These flags are being checked erroneously, which leads to these ioctl operations being routed to the ioctl permission, rather than the correct file permissions. This was also noted in a RED-PEN finding from a while back - "/* RED-PEN how should LSM module know it's handling 32bit? */". This patch introduces a new hook, security_file_ioctl_compat(), that is called from the compat ioctl syscall. All current LSMs have been changed to support this hook. Reviewing the three places where we are currently using security_file_ioctl(), it appears that only SELinux needs a dedicated compat change; TOMOYO and SMACK appear to be functional without any change. Cc: stable@vger.kernel.org Fixes: 0b24dcb7f2f7 ("Revert "selinux: simplify ioctl checking"") Signed-off-by: Alfred Piccioni Reviewed-by: Stephen Smalley [PM: subject tweak, line length fixes, and alignment corrections] Signed-off-by: Paul Moore Signed-off-by: Greg Kroah-Hartman commit f87b063bf264c6992457eb32ea5451c3b2ef3fa6 Author: Johan Hovold Date: Wed Dec 13 18:31:31 2023 +0100 ARM: dts: qcom: sdx55: fix USB SS wakeup commit 710dd03464e4ab5b3d329768388b165d61958577 upstream. The USB SS PHY interrupt needs to be provided by the PDC interrupt controller in order to be able to wake the system up from low-power states. Fixes: fea4b41022f3 ("ARM: dts: qcom: sdx55: Add USB3 and PHY support") Cc: stable@vger.kernel.org # 5.12 Cc: Manivannan Sadhasivam Signed-off-by: Johan Hovold Reviewed-by: Konrad Dybcio Reviewed-by: Manivannan Sadhasivam Link: https://lore.kernel.org/r/20231213173131.29436-4-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit 599a6b85b102a1a00fda352e0d127c553f3a9b2b Author: Johan Hovold Date: Thu Dec 14 08:43:18 2023 +0100 arm64: dts: qcom: sdm670: fix USB SS wakeup commit 047b2edc35b8db22354b4fba37818b548fc18896 upstream. The USB SS PHY interrupt needs to be provided by the PDC interrupt controller in order to be able to wake the system up from low-power states. Fixes: 07c8ded6e373 ("arm64: dts: qcom: add sdm670 and pixel 3a device trees") Cc: stable@vger.kernel.org # 6.2 Cc: Richard Acayan Signed-off-by: Johan Hovold Reviewed-by: Konrad Dybcio Tested-by: Richard Acayan Link: https://lore.kernel.org/r/20231214074319.11023-3-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit ea5b4db33bc4963bc98ce72803c42caf083ddf0f Author: Johan Hovold Date: Thu Dec 14 08:43:17 2023 +0100 arm64: dts: qcom: sdm670: fix USB DP/DM HS PHY interrupts commit c42d12ea105f67b0f137f1e52d5c59d13fe12b1f upstream. The USB DP/DM HS PHY interrupts need to be provided by the PDC interrupt controller in order to be able to wake the system up from low-power states and to be able to detect disconnect events, which requires triggering on falling edges. A recent commit updated the trigger type but failed to change the interrupt provider as required. This leads to the current Linux driver failing to probe instead of printing an error during suspend and USB wakeup not working as intended. Fixes: de3b3de30999 ("arm64: dts: qcom: sdm670: fix USB wakeup interrupt types") Fixes: 07c8ded6e373 ("arm64: dts: qcom: add sdm670 and pixel 3a device trees") Cc: stable@vger.kernel.org # 6.2 Cc: Richard Acayan Signed-off-by: Johan Hovold Reviewed-by: Konrad Dybcio Tested-by: Richard Acayan Link: https://lore.kernel.org/r/20231214074319.11023-2-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit ee78909c4b1e037fe23687dbdffe029450004552 Author: Johan Hovold Date: Thu Dec 14 08:43:19 2023 +0100 arm64: dts: qcom: sc8180x: fix USB SS wakeup commit 0afa885d42d05d30161ab8eab1ebacd993edb82b upstream. The USB SS PHY interrupt needs to be provided by the PDC interrupt controller in order to be able to wake the system up from low-power states. Fixes: b080f53a8f44 ("arm64: dts: qcom: sc8180x: Add remoteprocs, wifi and usb nodes") Cc: stable@vger.kernel.org # 6.5 Cc: Vinod Koul Signed-off-by: Johan Hovold Reviewed-by: Konrad Dybcio Link: https://lore.kernel.org/r/20231214074319.11023-4-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit 7180943407a4e221b94cf27160931fae10797aed Author: Johan Hovold Date: Wed Dec 13 18:33:59 2023 +0100 arm64: dts: qcom: sc8180x: fix USB DP/DM HS PHY interrupts commit 687d402bb350b392fa330e9d9d1b917777ee9ed1 upstream. The USB DP/DM HS PHY interrupts need to be provided by the PDC interrupt controller in order to be able to wake the system up from low-power states and to be able to detect disconnect events, which requires triggering on falling edges. A recent commit updated the trigger type but failed to change the interrupt provider as required. This leads to the current Linux driver failing to probe instead of printing an error during suspend and USB wakeup not working as intended. Fixes: 0dc0f6da3d43 ("arm64: dts: qcom: sc8180x: fix USB wakeup interrupt types") Fixes: b080f53a8f44 ("arm64: dts: qcom: sc8180x: Add remoteprocs, wifi and usb nodes") Cc: stable@vger.kernel.org # 6.5 Cc: Vinod Koul Reported-by: Konrad Dybcio Signed-off-by: Johan Hovold Reviewed-by: Konrad Dybcio Tested-by: Konrad Dybcio Link: https://lore.kernel.org/r/20231213173403.29544-2-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit d1020ab1199e197c6504daf3a415d4f7465016ef Author: Johan Hovold Date: Wed Dec 13 18:34:03 2023 +0100 arm64: dts: qcom: sm8150: fix USB SS wakeup commit cc4e1da491b84ca05339a19893884cda78f74aef upstream. The USB SS PHY interrupts need to be provided by the PDC interrupt controller in order to be able to wake the system up from low-power states. Fixes: 0c9dde0d2015 ("arm64: dts: qcom: sm8150: Add secondary USB and PHY nodes") Fixes: b33d2868e8d3 ("arm64: dts: qcom: sm8150: Add USB and PHY device nodes") Cc: stable@vger.kernel.org # 5.10 Cc: Jack Pham Cc: Jonathan Marek Signed-off-by: Johan Hovold Reviewed-by: Konrad Dybcio Link: https://lore.kernel.org/r/20231213173403.29544-6-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit a0c074768af14c90787017f9b55ed50f11553701 Author: Johan Hovold Date: Wed Dec 13 18:34:02 2023 +0100 arm64: dts: qcom: sm8150: fix USB DP/DM HS PHY interrupts commit 134de5e831775e8b178db9b131c1d3769a766982 upstream. The USB DP/DM HS PHY interrupts need to be provided by the PDC interrupt controller in order to be able to wake the system up from low-power states and to be able to detect disconnect events, which requires triggering on falling edges. A recent commit updated the trigger type but failed to change the interrupt provider as required. This leads to the current Linux driver failing to probe instead of printing an error during suspend and USB wakeup not working as intended. Fixes: 54524b6987d1 ("arm64: dts: qcom: sm8150: fix USB wakeup interrupt types") Fixes: 0c9dde0d2015 ("arm64: dts: qcom: sm8150: Add secondary USB and PHY nodes") Fixes: b33d2868e8d3 ("arm64: dts: qcom: sm8150: Add USB and PHY device nodes") Cc: stable@vger.kernel.org # 5.10 Cc: Jack Pham Cc: Jonathan Marek Signed-off-by: Johan Hovold Reviewed-by: Konrad Dybcio Link: https://lore.kernel.org/r/20231213173403.29544-5-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit 866921be9a0397fe5221295390da93c9c3a4f187 Author: Johan Hovold Date: Wed Dec 13 18:34:01 2023 +0100 arm64: dts: qcom: sdm845: fix USB SS wakeup commit 971f5d8b0618d09db75184ddd8cca0767514db5d upstream. The USB SS PHY interrupts need to be provided by the PDC interrupt controller in order to be able to wake the system up from low-power states. Fixes: ca4db2b538a1 ("arm64: dts: qcom: sdm845: Add USB-related nodes") Cc: stable@vger.kernel.org # 4.20 Signed-off-by: Johan Hovold Reviewed-by: Konrad Dybcio Link: https://lore.kernel.org/r/20231213173403.29544-4-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit 72fb2acdd4e04042cea01512a45b2516e2c3e2c9 Author: Johan Hovold Date: Wed Dec 13 18:34:00 2023 +0100 arm64: dts: qcom: sdm845: fix USB DP/DM HS PHY interrupts commit 204f9ed4bad6293933179517624143b8f412347c upstream. The USB DP/DM HS PHY interrupts need to be provided by the PDC interrupt controller in order to be able to wake the system up from low-power states and to be able to detect disconnect events, which requires triggering on falling edges. A recent commit updated the trigger type but failed to change the interrupt provider as required. This leads to the current Linux driver failing to probe instead of printing an error during suspend and USB wakeup not working as intended. Fixes: 84ad9ac8d9ca ("arm64: dts: qcom: sdm845: fix USB wakeup interrupt types") Fixes: ca4db2b538a1 ("arm64: dts: qcom: sdm845: Add USB-related nodes") Cc: stable@vger.kernel.org # 4.20 Signed-off-by: Johan Hovold Reviewed-by: Konrad Dybcio Link: https://lore.kernel.org/r/20231213173403.29544-3-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit 2dccfaad5cab97fc3e6ff90d0f6f3f597038dac3 Author: Johan Hovold Date: Wed Dec 13 18:31:30 2023 +0100 ARM: dts: qcom: sdx55: fix USB DP/DM HS PHY interrupts commit de95f139394a5ed82270f005bc441d2e7c1e51b7 upstream. The USB DP/DM HS PHY interrupts need to be provided by the PDC interrupt controller in order to be able to wake the system up from low-power states and to be able to detect disconnect events, which requires triggering on falling edges. A recent commit updated the trigger type but failed to change the interrupt provider as required. This leads to the current Linux driver failing to probe instead of printing an error during suspend and USB wakeup not working as intended. Fixes: d0ec3c4c11c3 ("ARM: dts: qcom: sdx55: fix USB wakeup interrupt types") Fixes: fea4b41022f3 ("ARM: dts: qcom: sdx55: Add USB3 and PHY support") Cc: stable@vger.kernel.org # 5.12 Cc: Manivannan Sadhasivam Signed-off-by: Johan Hovold Reviewed-by: Konrad Dybcio Reviewed-by: Manivannan Sadhasivam Link: https://lore.kernel.org/r/20231213173131.29436-3-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit 97065416d01bd53143215a23e78fafa91c2f85e1 Author: Stephan Gerhold Date: Mon Dec 4 10:46:11 2023 +0100 arm64: dts: qcom: Add missing vio-supply for AW2013 commit cc1ec484f2d0f464ad11b56fe3de2589c23f73ec upstream. Add the missing vio-supply to all usages of the AW2013 LED controller to ensure that the regulator needed for pull-up of the interrupt and I2C lines is really turned on. While this seems to have worked fine so far some of these regulators are not guaranteed to be always-on. For example, pm8916_l6 is typically turned off together with the display if there aren't any other devices (e.g. sensors) keeping it always-on. Cc: stable@vger.kernel.org # 6.6 Signed-off-by: Stephan Gerhold Link: https://lore.kernel.org/r/20231204-qcom-aw2013-vio-v1-1-5d264bb5c0b2@gerhold.net Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit dbc7561091c05b211942bcf7dfeabd0cd0aeb8ef Author: Johan Hovold Date: Mon Nov 20 17:43:24 2023 +0100 arm64: dts: qcom: sc7280: fix usb_1 wakeup interrupt types commit c34199d967a946e55381404fa949382691737521 upstream. A recent cleanup reordering the usb_1 wakeup interrupts inadvertently switched the DP and SuperSpeed interrupt trigger types. Fixes: 4a7ffc10d195 ("arm64: dts: qcom: align DWC3 USB interrupts with DT schema") Cc: stable@vger.kernel.org # 5.19 Cc: Krzysztof Kozlowski Signed-off-by: Johan Hovold Reviewed-by: Krzysztof Kozlowski Link: https://lore.kernel.org/r/20231120164331.8116-5-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit 4502030566af3d9f3fc05849f80eabb8328b6a25 Author: Johan Hovold Date: Mon Nov 20 17:43:26 2023 +0100 arm64: dts: qcom: sc8180x: fix USB wakeup interrupt types commit 0dc0f6da3d43da8d2297105663e51ecb01b6f790 upstream. The DP/DM wakeup interrupts are edge triggered and which edge to trigger on depends on use-case and whether a Low speed or Full/High speed device is connected. Fixes: b080f53a8f44 ("arm64: dts: qcom: sc8180x: Add remoteprocs, wifi and usb nodes") Cc: stable@vger.kernel.org # 6.5 Cc: Vinod Koul Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20231120164331.8116-7-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit 16ca59fe7f217fdac0c26734221082aa8d644044 Author: Johan Hovold Date: Mon Nov 20 17:43:30 2023 +0100 arm64: dts: qcom: sm8150: fix USB wakeup interrupt types commit 54524b6987d1fffe64cbf3dded1b2fa6b903edf9 upstream. The DP/DM wakeup interrupts are edge triggered and which edge to trigger on depends on use-case and whether a Low speed or Full/High speed device is connected. Fixes: 0c9dde0d2015 ("arm64: dts: qcom: sm8150: Add secondary USB and PHY nodes") Fixes: b33d2868e8d3 ("arm64: dts: qcom: sm8150: Add USB and PHY device nodes") Cc: stable@vger.kernel.org # 5.10 Cc: Jonathan Marek Cc: Jack Pham Signed-off-by: Johan Hovold Reviewed-by: Jack Pham Link: https://lore.kernel.org/r/20231120164331.8116-11-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit 546c81f2f037d8c9cfcdbe9795001955adbdea19 Author: Johan Hovold Date: Mon Nov 20 17:43:27 2023 +0100 arm64: dts: qcom: sdm670: fix USB wakeup interrupt types commit de3b3de30999106549da4df88a7963d0ac02b91e upstream. The DP/DM wakeup interrupts are edge triggered and which edge to trigger on depends on use-case and whether a Low speed or Full/High speed device is connected. Fixes: 07c8ded6e373 ("arm64: dts: qcom: add sdm670 and pixel 3a device trees") Cc: stable@vger.kernel.org # 6.2 Cc: Richard Acayan Signed-off-by: Johan Hovold Acked-by: Richard Acayan Link: https://lore.kernel.org/r/20231120164331.8116-8-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit bd94951ea0f91a46bb70389e821fd59fccfce9ce Author: Johan Hovold Date: Mon Nov 20 17:43:28 2023 +0100 arm64: dts: qcom: sdm845: fix USB wakeup interrupt types commit 84ad9ac8d9ca29033d589e79a991866b38e23b85 upstream. The DP/DM wakeup interrupts are edge triggered and which edge to trigger on depends on use-case and whether a Low speed or Full/High speed device is connected. Fixes: ca4db2b538a1 ("arm64: dts: qcom: sdm845: Add USB-related nodes") Cc: stable@vger.kernel.org # 4.20 Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20231120164331.8116-9-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit 88061887c3dd6085176035d0bae6c683f14c9ba1 Author: Johan Hovold Date: Mon Nov 20 17:43:23 2023 +0100 arm64: dts: qcom: sc7180: fix USB wakeup interrupt types commit 9b956999bf725fd62613f719c3178fdbee6e5f47 upstream. The DP/DM wakeup interrupts are edge triggered and which edge to trigger on depends on use-case and whether a Low speed or Full/High speed device is connected. Fixes: 0b766e7fe5a2 ("arm64: dts: qcom: sc7180: Add USB related nodes") Cc: stable@vger.kernel.org # 5.10 Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20231120164331.8116-4-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit 66c9f5134c89ad575abb61225ba54fd463fb34e3 Author: Stephan Gerhold Date: Mon Dec 4 11:21:21 2023 +0100 arm64: dts: qcom: msm8939: Make blsp_dma controlled-remotely commit 4bbda9421f316efdaef5dbf642e24925ef7de130 upstream. The blsp_dma controller is shared between the different subsystems, which is why it is already initialized by the firmware. We should not reinitialize it from Linux to avoid potential other users of the DMA engine to misbehave. In mainline this can be described using the "qcom,controlled-remotely" property. In the downstream/vendor kernel from Qualcomm there is an opposite "qcom,managed-locally" property. This property is *not* set for the qcom,sps-dma@7884000 [1] so adding "qcom,controlled-remotely" upstream matches the behavior of the downstream/vendor kernel. Adding this seems to fix some weird issues with UART where both input/output becomes garbled with certain obscure firmware versions on some devices. [1]: https://git.codelinaro.org/clo/la/kernel/msm-3.10/-/blob/LA.BR.1.2.9.1-02310-8x16.0/arch/arm/boot/dts/qcom/msm8939-common.dtsi#L866-872 Cc: stable@vger.kernel.org # 6.5 Fixes: 61550c6c156c ("arm64: dts: qcom: Add msm8939 SoC") Signed-off-by: Stephan Gerhold Reviewed-by: Bryan O'Donoghue Link: https://lore.kernel.org/r/20231204-msm8916-blsp-dma-remote-v1-2-3e49c8838c8d@gerhold.net Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit 137728b605de988c4cac09e1b480571173e5443b Author: Stephan Gerhold Date: Mon Dec 4 11:21:20 2023 +0100 arm64: dts: qcom: msm8916: Make blsp_dma controlled-remotely commit 7c45b6ddbcff01f9934d11802010cfeb0879e693 upstream. The blsp_dma controller is shared between the different subsystems, which is why it is already initialized by the firmware. We should not reinitialize it from Linux to avoid potential other users of the DMA engine to misbehave. In mainline this can be described using the "qcom,controlled-remotely" property. In the downstream/vendor kernel from Qualcomm there is an opposite "qcom,managed-locally" property. This property is *not* set for the qcom,sps-dma@7884000 [1] so adding "qcom,controlled-remotely" upstream matches the behavior of the downstream/vendor kernel. Adding this seems to fix some weird issues with UART where both input/output becomes garbled with certain obscure firmware versions on some devices. [1]: https://git.codelinaro.org/clo/la/kernel/msm-3.10/-/blob/LA.BR.1.2.9.1-02310-8x16.0/arch/arm/boot/dts/qcom/msm8916.dtsi#L1466-1472 Cc: stable@vger.kernel.org # 6.5 Fixes: a0e5fb103150 ("arm64: dts: qcom: Add msm8916 BLSP device nodes") Signed-off-by: Stephan Gerhold Reviewed-by: Bryan O'Donoghue Link: https://lore.kernel.org/r/20231204-msm8916-blsp-dma-remote-v1-1-3e49c8838c8d@gerhold.net Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit de70bb8b7426702526500ace71ef07fc3c269741 Author: Sam Edwards Date: Fri Dec 15 19:10:19 2023 -0700 arm64: dts: rockchip: Fix rk3588 USB power-domain clocks commit 44de8996ed5a10f08f2fe947182da6535edcfae5 upstream. The QoS blocks saved/restored when toggling the PD_USB power domain are clocked by ACLK_USB. Attempting to access these memory regions without that clock running will result in an indefinite CPU stall. The PD_USB node wasn't specifying this clock dependency, resulting in hangs when trying to toggle the power domain (either on or off), unless we get "lucky" and have ACLK_USB running for another reason at the time. This "luck" can result from the bootloader leaving USB powered/clocked, and if no built-in driver wants USB, Linux will disable the unused PD+CLK on boot when {pd,clk}_ignore_unused aren't given. This can also be unlucky because the two cleanup tasks run in parallel and race: if the CLK is disabled first, the PD deactivation stalls the boot. In any case, the PD cannot then be reenabled (if e.g. the driver loads later) once the clock has been stopped. Fix this by specifying a dependency on ACLK_USB, instead of only ACLK_USB_ROOT. The child-parent relationship means the former implies the latter anyway. Fixes: c9211fa2602b8 ("arm64: dts: rockchip: Add base DT for rk3588 SoC") Cc: stable@vger.kernel.org Signed-off-by: Sam Edwards Link: https://lore.kernel.org/r/20231216021019.1543811-1-CFSworks@gmail.com [changed to only include the missing clock, not dropping the root-clocks] Signed-off-by: Heiko Stuebner Signed-off-by: Greg Kroah-Hartman commit 4a91630e24a67a255aea21ba06302d63c01ae9f6 Author: Tianling Shen Date: Sat Dec 16 12:07:23 2023 +0800 arm64: dts: rockchip: configure eth pad driver strength for orangepi r1 plus lts commit fc5a80a432607d05e85bba37971712405f75c546 upstream. The default strength is not enough to provide stable connection under 3.3v LDO voltage. Fixes: 387b3bbac5ea ("arm64: dts: rockchip: Add Xunlong OrangePi R1 Plus LTS") Cc: stable@vger.kernel.org # 6.6+ Signed-off-by: Tianling Shen Link: https://lore.kernel.org/r/20231216040723.17864-1-cnsztl@gmail.com Signed-off-by: Heiko Stuebner Signed-off-by: Greg Kroah-Hartman commit 393950c8d1a254378469b24202f641e0c0db9904 Author: Cixi Geng Date: Wed Jul 12 00:23:46 2023 +0800 arm64: dts: sprd: fix the cpu node for UMS512 commit 2da4f4a7b003441b80f0f12d8a216590f652a40f upstream. The UMS512 Socs have 8 cores contains 6 a55 and 2 a75. modify the cpu nodes to correct information. Fixes: 2b4881839a39 ("arm64: dts: sprd: Add support for Unisoc's UMS512") Cc: stable@vger.kernel.org Signed-off-by: Cixi Geng Link: https://lore.kernel.org/r/20230711162346.5978-1-cixi.geng@linux.dev Signed-off-by: Chunyan Zhang Signed-off-by: Greg Kroah-Hartman commit 9c3495d5adf39e7f3375bd004aa913644545acad Author: Johan Hovold Date: Wed Dec 13 18:31:29 2023 +0100 ARM: dts: qcom: sdx55: fix pdc '#interrupt-cells' commit cc25bd06c16aa582596a058d375b2e3133f79b93 upstream. The Qualcomm PDC interrupt controller binding expects two cells in interrupt specifiers. Fixes: 9d038b2e62de ("ARM: dts: qcom: Add SDX55 platform and MTP board support") Cc: stable@vger.kernel.org # 5.12 Cc: Manivannan Sadhasivam Signed-off-by: Johan Hovold Reviewed-by: Konrad Dybcio Reviewed-by: Manivannan Sadhasivam Link: https://lore.kernel.org/r/20231213173131.29436-2-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit fda40117a8105d6155d45df7f4e8a1eef5d2647d Author: Paul Cercueil Date: Wed Dec 6 23:15:54 2023 +0100 ARM: dts: samsung: exynos4210-i9100: Unconditionally enable LDO12 commit 84228d5e29dbc7a6be51e221000e1d122125826c upstream. The kernel hangs for a good 12 seconds without any info being printed to dmesg, very early in the boot process, if this regulator is not enabled. Force-enable it to work around this issue, until we know more about the underlying problem. Signed-off-by: Paul Cercueil Fixes: 8620cc2f99b7 ("ARM: dts: exynos: Add devicetree file for the Galaxy S2") Cc: stable@vger.kernel.org # v5.8+ Link: https://lore.kernel.org/r/20231206221556.15348-2-paul@crapouillou.net Signed-off-by: Krzysztof Kozlowski Signed-off-by: Greg Kroah-Hartman commit f651443f357d35b5b1c640353536ad632ec83337 Author: Johan Hovold Date: Mon Nov 20 17:43:21 2023 +0100 ARM: dts: qcom: sdx55: fix USB wakeup interrupt types commit d0ec3c4c11c3b30e1f2d344973b2a7bf0f986734 upstream. The DP/DM wakeup interrupts are edge triggered and which edge to trigger on depends on use-case and whether a Low speed or Full/High speed device is connected. Fixes: fea4b41022f3 ("ARM: dts: qcom: sdx55: Add USB3 and PHY support") Cc: stable@vger.kernel.org # 5.12 Cc: Manivannan Sadhasivam Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20231120164331.8116-2-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit 1bb948bbdca6d29ff4c12e25993f23d8e004defe Author: Johan Hovold Date: Mon Oct 16 10:06:58 2023 +0200 arm64: dts: qcom: sc8280xp-crd: fix eDP phy compatible commit 663affdb12b3e26c77d103327cf27de720c8117e upstream. The sc8280xp Display Port PHYs can be used in either DP or eDP mode and this is configured using the devicetree compatible string which defaults to DP mode in the SoC dtsi. Override the default compatible string for the CRD eDP PHY node so that the eDP settings are used. Fixes: 4a883a8d80b5 ("arm64: dts: qcom: sc8280xp-crd: Enable EDP") Cc: stable@vger.kernel.org # 6.3 Signed-off-by: Johan Hovold Link: https://lore.kernel.org/r/20231016080658.6667-1-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit d2d8b18ba136fe0932480a6efd5866d72afba87b Author: Andrejs Cainikovs Date: Fri Oct 20 17:30:22 2023 +0200 ARM: dts: imx6q-apalis: add can power-up delay on ixora board commit b76bbf835d8945080b22b52fc1e6f41cde06865d upstream. Newer variants of Ixora boards require a power-up delay when powering up the CAN transceiver of up to 1ms. Cc: stable@vger.kernel.org Signed-off-by: Andrejs Cainikovs Signed-off-by: Shawn Guo Signed-off-by: Greg Kroah-Hartman commit 4e2eba51b39c42ab4645195693a3dd1c0be60c4f Author: Shyam Prasad N Date: Wed Jan 3 12:51:49 2024 +0000 cifs: update iface_last_update on each query-and-update [ Upstream commit 78e727e58e54efca4c23863fbd9e16e9d2d83f81 ] iface_last_update was an unused field when it was introduced. Later, when we had periodic update of server interface list, this field was used regularly to decide when to update next. However, with the new logic of updating the interfaces, it becomes crucial that this field be updated whenever parse_server_interfaces runs successfully. This change updates this field when either the server does not support query of interfaces; so that we do not query the interfaces repeatedly. It also updates the field when the function reaches the end. Fixes: aa45dadd34e4 ("cifs: change iface_list from array to sorted linked list") Signed-off-by: Shyam Prasad N Signed-off-by: Steve French Signed-off-by: Sasha Levin commit d61ba1d71ea6039eca7ada870bf3f0c3c8fc12e4 Author: Shyam Prasad N Date: Tue Jan 2 13:14:46 2024 +0000 cifs: handle servers that still advertise multichannel after disabling [ Upstream commit f591062bdbf4742b7f1622173017f19e927057b0 ] Some servers like Azure SMB servers always advertise multichannel capability in server capabilities list. Such servers return error STATUS_NOT_IMPLEMENTED for ioctl calls to query server interfaces, and expect clients to consider that as a sign that they do not support multichannel. We already handled this at mount time. Soon after the tree connect, we query server interfaces. And when server returned STATUS_NOT_IMPLEMENTED, we kept interface list as empty. When cifs_try_adding_channels gets called, it would not find any interfaces, so will not add channels. For the case where an active multichannel mount exists, and multichannel is disabled by such a server, this change will now allow the client to disable secondary channels on the mount. It will check the return status of query server interfaces call soon after a tree reconnect. If the return status is EOPNOTSUPP, then instead of the check to add more channels, we'll disable the secondary channels instead. For better code reuse, this change also moves the common code for disabling multichannel to a helper function. Signed-off-by: Shyam Prasad N Signed-off-by: Steve French Stable-dep-of: 78e727e58e54 ("cifs: update iface_last_update on each query-and-update") Signed-off-by: Sasha Levin commit 9201b54603496dc227ecc07bb6b8a78d5e711d8e Author: Paulo Alcantara Date: Fri Jan 19 01:08:26 2024 -0300 smb: client: fix parsing of SMB3.1.1 POSIX create context [ Upstream commit 76025cc2285d9ede3d717fe4305d66f8be2d9346 ] The data offset for the SMB3.1.1 POSIX create context will always be 8-byte aligned so having the check 'noff + nlen >= doff' in smb2_parse_contexts() is wrong as it will lead to -EINVAL because noff + nlen == doff. Fix the sanity check to correctly handle aligned create context data. Fixes: af1689a9b770 ("smb: client: fix potential OOBs in smb2_parse_contexts()") Signed-off-by: Paulo Alcantara Signed-off-by: Steve French Signed-off-by: Sasha Levin commit 144a2afd8ad54b2973aee4e60a6a511f0da49278 Author: Geert Uytterhoeven Date: Mon Sep 25 13:10:22 2023 +0200 sh: ecovec24: Rename missed backlight field from fbdev to dev [ Upstream commit d87123aa9a7920e88633ffc5c5a0a22ab08bdc06 ] One instance of gpio_backlight_platform_data.fbdev was renamed, but the second instance was forgotten, causing a build failure: arch/sh/boards/mach-ecovec24/setup.c: In function ‘arch_setup’: arch/sh/boards/mach-ecovec24/setup.c:1223:37: error: ‘struct gpio_backlight_platform_data’ has no member named ‘fbdev’; did you mean ‘dev’? 1223 | gpio_backlight_data.fbdev = NULL; | ^~~~~ | dev Fix this by updating the second instance. Fixes: ed369def91c1579a ("backlight/gpio_backlight: Rename field 'fbdev' to 'dev'") Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202309231601.Uu6qcRnU-lkp@intel.com/ Signed-off-by: Geert Uytterhoeven Acked-by: Thomas Zimmermann Reviewed-by: John Paul Adrian Glaubitz Link: https://lore.kernel.org/r/20230925111022.3626362-1-geert+renesas@glider.be Signed-off-by: John Paul Adrian Glaubitz Signed-off-by: Sasha Levin commit d95dcb15cd42a3706b5dd47cbf1d5d37d28520b4 Author: Niklas Cassel Date: Thu Jan 11 13:05:32 2024 +0100 scsi: core: Kick the requeue list after inserting when flushing [ Upstream commit 6df0e077d76bd144c533b61d6182676aae6b0a85 ] When libata calls ata_link_abort() to abort all ata queued commands, it calls blk_abort_request() on the SCSI command representing each QC. This causes scsi_timeout() to be called, which calls scsi_eh_scmd_add() for each SCSI command. scsi_eh_scmd_add() sets the SCSI host to state recovery, and then adds the command to shost->eh_cmd_q. This will wake up the SCSI EH, and eventually the libata EH strategy handler will be called, which calls scsi_eh_flush_done_q() to either flush retry or flush finish each failed command. The commands that are flush retried by scsi_eh_flush_done_q() are done so using scsi_queue_insert(). Before commit 8b566edbdbfb ("scsi: core: Only kick the requeue list if necessary"), __scsi_queue_insert() called blk_mq_requeue_request() with the second argument set to true, indicating that it should always kick/run the requeue list after inserting. After commit 8b566edbdbfb ("scsi: core: Only kick the requeue list if necessary"), __scsi_queue_insert() does not kick/run the requeue list after inserting, if the current SCSI host state is recovery (which is the case in the libata example above). This optimization is probably fine in most cases, as I can only assume that most often someone will eventually kick/run the queues. However, that is not the case for scsi_eh_flush_done_q(), where we can see that the request gets inserted to the requeue list, but the queue is never started after the request has been inserted, leading to the block layer waiting for the completion of command that never gets to run. Since scsi_eh_flush_done_q() is called by SCSI EH context, the SCSI host state is most likely always in recovery when this function is called. Thus, let scsi_eh_flush_done_q() explicitly kick the requeue list after inserting a flush retry command, so that scsi_eh_flush_done_q() keeps the same behavior as before commit 8b566edbdbfb ("scsi: core: Only kick the requeue list if necessary"). Simple reproducer for the libata example above: $ hdparm -Y /dev/sda $ echo 1 > /sys/class/scsi_device/0\:0\:0\:0/device/delete Fixes: 8b566edbdbfb ("scsi: core: Only kick the requeue list if necessary") Reported-by: Kevin Locke Closes: https://lore.kernel.org/linux-scsi/ZZw3Th70wUUvCiCY@kevinlocke.name/ Signed-off-by: Niklas Cassel Link: https://lore.kernel.org/r/20240111120533.3612509-1-cassel@kernel.org Reviewed-by: Bart Van Assche Reviewed-by: Damien Le Moal Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 838127ab54fe4c2425070e1f3b9f3327f38d6ce5 Author: Christophe JAILLET Date: Sun Oct 29 08:20:40 2023 +0100 riscv: Fix an off-by-one in get_early_cmdline() [ Upstream commit adb1f95d388a43c4c564ef3e436f18900dde978e ] The ending NULL is not taken into account by strncat(), so switch to strlcat() to correctly compute the size of the available memory when appending CONFIG_CMDLINE to 'early_cmdline'. Fixes: 26e7aacb83df ("riscv: Allow to downgrade paging mode from the command line") Signed-off-by: Christophe JAILLET Reviewed-by: Alexandre Ghiti Link: https://lore.kernel.org/r/9f66d2b58c8052d4055e90b8477ee55d9a0914f9.1698564026.git.christophe.jaillet@wanadoo.fr Signed-off-by: Palmer Dabbelt Signed-off-by: Sasha Levin commit 7a5b2ca3f643c6e8655efc9043cb65cbdc697d53 Author: Charlie Jenkins Date: Thu Jan 4 11:42:49 2024 -0800 riscv: Fix relocation_hashtable size [ Upstream commit a35551c7244d9d061643a01eb96cc3ba04eaf45c ] A second dereference is needed to get the accurate size of the relocation_hashtable. Signed-off-by: Charlie Jenkins Fixes: d8792a5734b0 ("riscv: Safely remove entries from relocation list") Reported-by: kernel test robot Reported-by: Julia Lawall Closes: https://lore.kernel.org/r/202312120044.wTI1Uyaa-lkp@intel.com/ Reviewed-by: Dan Carpenter Link: https://lore.kernel.org/r/20240104-module_loading_fix-v3-3-a71f8de6ce0f@rivosinc.com Signed-off-by: Palmer Dabbelt Signed-off-by: Sasha Levin commit 72301e18d8b0a3666bd567dcdfee7aafb1cdcbbd Author: Charlie Jenkins Date: Thu Jan 4 11:42:48 2024 -0800 riscv: Correctly free relocation hashtable on error [ Upstream commit 4b38b36bfbd83b23e20c172d08dd85773791e3bd ] When there is not enough allocatable memory for the relocation hashtable, module loading should exit gracefully. Previously, this was attempted to be accomplished by checking if an unsigned number is less than zero which does not work. Instead have the caller check if the hashtable was correctly allocated and add a comment explaining that hashtable_bits that is 0 is valid. Signed-off-by: Charlie Jenkins Fixes: d8792a5734b0 ("riscv: Safely remove entries from relocation list") Reported-by: kernel test robot Reported-by: Dan Carpenter Closes: https://lore.kernel.org/r/202312132019.iYGTwW0L-lkp@intel.com/ Reported-by: kernel test robot Reported-by: Julia Lawall Closes: https://lore.kernel.org/r/202312120044.wTI1Uyaa-lkp@intel.com/ Reviewed-by: Dan Carpenter Link: https://lore.kernel.org/r/20240104-module_loading_fix-v3-2-a71f8de6ce0f@rivosinc.com Signed-off-by: Palmer Dabbelt Signed-off-by: Sasha Levin commit 2fa79badf4bfeffda6b5032cf62b828486ec9a99 Author: Charlie Jenkins Date: Thu Jan 4 11:42:47 2024 -0800 riscv: Fix module loading free order [ Upstream commit 78996eee79ebdfe8b6f0e54cb6dcc792d5129291 ] Reverse order of kfree calls to resolve use-after-free error. Signed-off-by: Charlie Jenkins Fixes: d8792a5734b0 ("riscv: Safely remove entries from relocation list") Reported-by: kernel test robot Reported-by: Dan Carpenter Closes: https://lore.kernel.org/r/202312132019.iYGTwW0L-lkp@intel.com/ Reported-by: kernel test robot Reported-by: Julia Lawall Closes: https://lore.kernel.org/r/202312120044.wTI1Uyaa-lkp@intel.com/ Reviewed-by: Dan Carpenter Link: https://lore.kernel.org/r/20240104-module_loading_fix-v3-1-a71f8de6ce0f@rivosinc.com Signed-off-by: Palmer Dabbelt Signed-off-by: Sasha Levin commit 691bb9b5c0d6d59bfa2dd0bc56a1b832a271b0c3 Author: Bart Van Assche Date: Mon Dec 18 14:52:15 2023 -0800 scsi: ufs: core: Remove the ufshcd_hba_exit() call from ufshcd_async_scan() [ Upstream commit ee36710912b2075c417100a8acc642c9c6496501 ] Calling ufshcd_hba_exit() from a function that is called asynchronously from ufshcd_init() is wrong because this triggers multiple race conditions. Instead of calling ufshcd_hba_exit(), log an error message. Reported-by: Daniel Mentz Fixes: 1d337ec2f35e ("ufs: improve init sequence") Signed-off-by: Bart Van Assche Link: https://lore.kernel.org/r/20231218225229.2542156-3-bvanassche@acm.org Reviewed-by: Can Guo Reviewed-by: Manivannan Sadhasivam Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 7ff852b6b012aa0554b4f1c0f10e324563001a1e Author: Miquel Raynal Date: Thu Nov 30 12:13:12 2023 +0100 dmaengine: xilinx: xdma: Fix the count of elapsed periods in cyclic mode [ Upstream commit 26ee018ff6d1c326ac9b9be36513e35870ed09db ] Xilinx DMA engine is capable of keeping track of the number of elapsed periods and this is an increasing 32-bit counter which is only reset when turning off the engine. No need to add this value to our local counter. Fixes: cd8c732ce1a5 ("dmaengine: xilinx: xdma: Support cyclic transfers") Signed-off-by: Miquel Raynal Link: https://lore.kernel.org/r/20231130111315.729430-2-miquel.raynal@bootlin.com Signed-off-by: Vinod Koul Signed-off-by: Sasha Levin commit 0a32a2d8467079fe1995454499521b488fb58c00 Author: Rex Zhang Date: Tue Dec 12 10:21:58 2023 +0800 dmaengine: idxd: Move dma_free_coherent() out of spinlocked context [ Upstream commit e271c0ba3f919c48e90c64b703538fbb7865cb63 ] Task may be rescheduled within dma_free_coherent(). So dma_free_coherent() can't be called between spin_lock() and spin_unlock() to avoid Call Trace: Call Trace: dump_stack_lvl+0x37/0x50 __might_resched+0x16a/0x1c0 vunmap+0x2c/0x70 __iommu_dma_free+0x96/0x100 idxd_device_evl_free+0xd5/0x100 [idxd] device_release_driver_internal+0x197/0x200 unbind_store+0xa1/0xb0 kernfs_fop_write_iter+0x120/0x1c0 vfs_write+0x2d3/0x400 ksys_write+0x63/0xe0 do_syscall_64+0x44/0xa0 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Move it out of the context. Fixes: 244da66cda35 ("dmaengine: idxd: setup event log configuration") Signed-off-by: Rex Zhang Reviewed-by: Dave Jiang Reviewed-by: Fenghua Yu Link: https://lore.kernel.org/r/20231212022158.358619-2-rex.zhang@intel.com Signed-off-by: Vinod Koul Signed-off-by: Sasha Levin commit 9263fd2a63487c6d04cbb7b74a48fb12e1e352d0 Author: Amelie Delaunay Date: Wed Dec 13 17:04:52 2023 +0100 dmaengine: fix NULL pointer in channel unregistration function [ Upstream commit f5c24d94512f1b288262beda4d3dcb9629222fc7 ] __dma_async_device_channel_register() can fail. In case of failure, chan->local is freed (with free_percpu()), and chan->local is nullified. When dma_async_device_unregister() is called (because of managed API or intentionally by DMA controller driver), channels are unconditionally unregistered, leading to this NULL pointer: [ 1.318693] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000d0 [...] [ 1.484499] Call trace: [ 1.486930] device_del+0x40/0x394 [ 1.490314] device_unregister+0x20/0x7c [ 1.494220] __dma_async_device_channel_unregister+0x68/0xc0 Look at dma_async_device_register() function error path, channel device unregistration is done only if chan->local is not NULL. Then add the same condition at the beginning of __dma_async_device_channel_unregister() function, to avoid NULL pointer issue whatever the API used to reach this function. Fixes: d2fb0a043838 ("dmaengine: break out channel registration") Signed-off-by: Amelie Delaunay Reviewed-by: Dave Jiang Link: https://lore.kernel.org/r/20231213160452.2598073-1-amelie.delaunay@foss.st.com Signed-off-by: Vinod Koul Signed-off-by: Sasha Levin commit 0515904b169e0d0aa5eecdacd481a437c8114556 Author: Frank Li Date: Tue Nov 14 10:48:21 2023 -0500 dmaengine: fsl-edma: fix eDMAv4 channel allocation issue [ Upstream commit dc51b4442dd94ab12c146c1897bbdb40e16d5636 ] The eDMAv4 channel mux has a limitation where certain requests must use even channels, while others must use odd numbers. Add two flags (ARGS_EVEN_CH and ARGS_ODD_CH) to reflect this limitation. The device tree source (dts) files need to be updated accordingly. This issue was identified by the following commit: commit a725990557e7 ("arm64: dts: imx93: Fix the dmas entries order") Reverting channel orders triggered this problem. Fixes: 72f5801a4e2b ("dmaengine: fsl-edma: integrate v3 support") Signed-off-by: Frank Li Link: https://lore.kernel.org/r/20231114154824.3617255-2-Frank.Li@nxp.com Signed-off-by: Vinod Koul Signed-off-by: Sasha Levin commit ba1c101b28935c192d939ed8e27cfb689781dcbf Author: Marcelo Schmitt Date: Tue Dec 19 17:26:27 2023 -0300 iio: adc: ad7091r: Enable internal vref if external vref is not supplied [ Upstream commit e71c5c89bcb165a02df35325aa13d1ee40112401 ] The ADC needs a voltage reference to work correctly. Users can provide an external voltage reference or use the chip internal reference to operate the ADC. The availability of an in chip reference for the ADC saves the user from having to supply an external voltage reference, which makes the external reference an optional property as described in the device tree documentation. Though, to use the internal reference, it must be enabled by writing to the configuration register. Enable AD7091R internal voltage reference if no external vref is supplied. Fixes: 260442cc5be4 ("iio: adc: ad7091r5: Add scale and external VREF support") Signed-off-by: Marcelo Schmitt Link: https://lore.kernel.org/r/b865033fa6a4fc4bf2b4a98ec51a6144e0f64f77.1703013352.git.marcelo.schmitt1@gmail.com Cc: Signed-off-by: Jonathan Cameron Signed-off-by: Sasha Levin commit 49bf4d6a3da669d398cb22f1ded0fa01e2ddf7d7 Author: Helge Deller Date: Wed Jan 3 21:17:23 2024 +0100 parisc/power: Fix power soft-off button emulation on qemu commit 6472036581f947109b20664121db1d143e916f0b upstream. Make sure to start the kthread to check the power button on qemu as well if the power button address was provided. This fixes the qemu built-in system_powerdown runtime command. Fixes: d0c219472980 ("parisc/power: Add power soft-off when running on qemu") Signed-off-by: Helge Deller Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Greg Kroah-Hartman commit 166aa1d2834440a41587a4b87ed91d297a453deb Author: Helge Deller Date: Wed Jan 3 21:02:16 2024 +0100 parisc/firmware: Fix F-extend for PDC addresses commit 735ae74f73e55c191d48689bd11ff4a06ea0508f upstream. When running with narrow firmware (64-bit kernel using a 32-bit firmware), extend PDC addresses into the 0xfffffff0.00000000 region instead of the 0xf0f0f0f0.00000000 region. This fixes the power button on the C3700 machine in qemu (64-bit CPU with 32-bit firmware), and my assumption is that the previous code was really never used (because most 64-bit machines have a 64-bit firmware), or it just worked on very old machines because they may only decode 40-bit of virtual addresses. Cc: stable@vger.kernel.org Signed-off-by: Helge Deller Signed-off-by: Greg Kroah-Hartman commit 642adb03541673f3897f64bbb62856ffd73807f5 Author: Bhaumik Bhatt Date: Mon Dec 11 14:42:51 2023 +0800 bus: mhi: host: Add spinlock to protect WP access when queueing TREs commit b89b6a863dd53bc70d8e52d50f9cfaef8ef5e9c9 upstream. Protect WP accesses such that multiple threads queueing buffers for incoming data do not race. Meanwhile, if CONFIG_TRACE_IRQFLAGS is enabled, irq will be enabled once __local_bh_enable_ip is called as part of write_unlock_bh. Hence, let's take irqsave lock after TRE is generated to avoid running write_unlock_bh when irqsave lock is held. Cc: stable@vger.kernel.org Fixes: 189ff97cca53 ("bus: mhi: core: Add support for data transfer") Signed-off-by: Bhaumik Bhatt Signed-off-by: Qiang Yu Reviewed-by: Jeffrey Hugo Tested-by: Jeffrey Hugo Reviewed-by: Manivannan Sadhasivam Link: https://lore.kernel.org/r/1702276972-41296-2-git-send-email-quic_qianyu@quicinc.com Signed-off-by: Manivannan Sadhasivam Signed-off-by: Greg Kroah-Hartman commit b8eff20d87092e14cac976d057cb0aea2f1d0830 Author: Qiang Yu Date: Mon Dec 11 14:42:52 2023 +0800 bus: mhi: host: Drop chan lock before queuing buffers commit 01bd694ac2f682fb8017e16148b928482bc8fa4b upstream. Ensure read and write locks for the channel are not taken in succession by dropping the read lock from parse_xfer_event() such that a callback given to client can potentially queue buffers and acquire the write lock in that process. Any queueing of buffers should be done without channel read lock acquired as it can result in multiple locks and a soft lockup. Cc: # 5.7 Fixes: 1d3173a3bae7 ("bus: mhi: core: Add support for processing events from client device") Signed-off-by: Qiang Yu Reviewed-by: Jeffrey Hugo Tested-by: Jeffrey Hugo Reviewed-by: Manivannan Sadhasivam Link: https://lore.kernel.org/r/1702276972-41296-3-git-send-email-quic_qianyu@quicinc.com [mani: added fixes tag and cc'ed stable] Signed-off-by: Manivannan Sadhasivam Signed-off-by: Greg Kroah-Hartman commit ecf8320111822a1ae5d5fc512953eab46d543d0b Author: Krishna chaitanya chundru Date: Tue Oct 31 15:21:05 2023 +0530 bus: mhi: host: Add alignment check for event ring read pointer commit eff9704f5332a13b08fbdbe0f84059c9e7051d5f upstream. Though we do check the event ring read pointer by "is_valid_ring_ptr" to make sure it is in the buffer range, but there is another risk the pointer may be not aligned. Since we are expecting event ring elements are 128 bits(struct mhi_ring_element) aligned, an unaligned read pointer could lead to multiple issues like DoS or ring buffer memory corruption. So add a alignment check for event ring read pointer. Fixes: ec32332df764 ("bus: mhi: core: Sanity check values from remote device before use") cc: stable@vger.kernel.org Signed-off-by: Krishna chaitanya chundru Reviewed-by: Jeffrey Hugo Reviewed-by: Manivannan Sadhasivam Link: https://lore.kernel.org/r/20231031-alignment_check-v2-1-1441db7c5efd@quicinc.com Signed-off-by: Manivannan Sadhasivam Signed-off-by: Greg Kroah-Hartman commit d157f10b96216d8c3b7f27023e13aa70b2b1c8a6 Author: Serge Semin Date: Sat Dec 2 14:14:20 2023 +0300 mips: Fix max_mapnr being uninitialized on early stages commit e1a9ae45736989c972a8d1c151bc390678ae6205 upstream. max_mapnr variable is utilized in the pfn_valid() method in order to determine the upper PFN space boundary. Having it uninitialized effectively makes any PFN passed to that method invalid. That in its turn causes the kernel mm-subsystem occasion malfunctions even after the max_mapnr variable is actually properly updated. For instance, pfn_valid() is called in the init_unavailable_range() method in the framework of the calls-chain on MIPS: setup_arch() +-> paging_init() +-> free_area_init() +-> memmap_init() +-> memmap_init_zone_range() +-> init_unavailable_range() Since pfn_valid() always returns "false" value before max_mapnr is initialized in the mem_init() method, any flatmem page-holes will be left in the poisoned/uninitialized state including the IO-memory pages. Thus any further attempts to map/remap the IO-memory by using MMU may fail. In particular it happened in my case on attempt to map the SRAM region. The kernel bootup procedure just crashed on the unhandled unaligned access bug raised in the __update_cache() method: > Unhandled kernel unaligned access[#1]: > CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.7.0-rc1-XXX-dirty #2056 > ... > Call Trace: > [<8011ef9c>] __update_cache+0x88/0x1bc > [<80385944>] ioremap_page_range+0x110/0x2a4 > [<80126948>] ioremap_prot+0x17c/0x1f4 > [<80711b80>] __devm_ioremap+0x8c/0x120 > [<80711e0c>] __devm_ioremap_resource+0xf4/0x218 > [<808bf244>] sram_probe+0x4f4/0x930 > [<80889d20>] platform_probe+0x68/0xec > ... Let's fix the problem by initializing the max_mapnr variable as soon as the required data is available. In particular it can be done right in the paging_init() method before free_area_init() is called since all the PFN zone boundaries have already been calculated by that time. Cc: stable@vger.kernel.org Signed-off-by: Serge Semin Signed-off-by: Thomas Bogendoerfer Signed-off-by: Greg Kroah-Hartman commit b0028f333420a65a53a63978522db680b37379dd Author: Eric Dumazet Date: Fri Jan 12 13:26:57 2024 +0000 nbd: always initialize struct msghdr completely commit 78fbb92af27d0982634116c7a31065f24d092826 upstream. syzbot complains that msg->msg_get_inq value can be uninitialized [1] struct msghdr got many new fields recently, we should always make sure their values is zero by default. [1] BUG: KMSAN: uninit-value in tcp_recvmsg+0x686/0xac0 net/ipv4/tcp.c:2571 tcp_recvmsg+0x686/0xac0 net/ipv4/tcp.c:2571 inet_recvmsg+0x131/0x580 net/ipv4/af_inet.c:879 sock_recvmsg_nosec net/socket.c:1044 [inline] sock_recvmsg+0x12b/0x1e0 net/socket.c:1066 __sock_xmit+0x236/0x5c0 drivers/block/nbd.c:538 nbd_read_reply drivers/block/nbd.c:732 [inline] recv_work+0x262/0x3100 drivers/block/nbd.c:863 process_one_work kernel/workqueue.c:2627 [inline] process_scheduled_works+0x104e/0x1e70 kernel/workqueue.c:2700 worker_thread+0xf45/0x1490 kernel/workqueue.c:2781 kthread+0x3ed/0x540 kernel/kthread.c:388 ret_from_fork+0x66/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242 Local variable msg created at: __sock_xmit+0x4c/0x5c0 drivers/block/nbd.c:513 nbd_read_reply drivers/block/nbd.c:732 [inline] recv_work+0x262/0x3100 drivers/block/nbd.c:863 CPU: 1 PID: 7465 Comm: kworker/u5:1 Not tainted 6.7.0-rc7-syzkaller-00041-gf016f7547aee #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023 Workqueue: nbd5-recv recv_work Fixes: f94fd25cb0aa ("tcp: pass back data left in socket after receive") Reported-by: syzbot Signed-off-by: Eric Dumazet Cc: stable@vger.kernel.org Cc: Josef Bacik Cc: Jens Axboe Cc: linux-block@vger.kernel.org Cc: nbd@other.debian.org Reviewed-by: Simon Horman Link: https://lore.kernel.org/r/20240112132657.647112-1-edumazet@google.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit b997bf417d44e121d59ec4d39fdef5b60b4b768e Author: Nathan Lynch Date: Tue Jan 16 08:09:25 2024 -0600 seq_buf: Make DECLARE_SEQ_BUF() usable commit 7a8e9cdf9405819105ae7405cd91e482bf574b01 upstream. Using the address operator on the array doesn't work: ./include/linux/seq_buf.h:27:27: error: initialization of ‘char *’ from incompatible pointer type ‘char (*)[128]’ [-Werror=incompatible-pointer-types] 27 | .buffer = &__ ## NAME ## _buffer, \ | ^ Apart from fixing that, we can improve DECLARE_SEQ_BUF() by using a compound literal to define the buffer array without attaching a name to it. This makes the macro a single statement, allowing constructs such as: static DECLARE_SEQ_BUF(my_seq_buf, MYSB_SIZE); to work as intended. Link: https://lkml.kernel.org/r/20240116-declare-seq-buf-fix-v1-1-915db4692f32@linux.ibm.com Cc: stable@vger.kernel.org Acked-by: Kees Cook Fixes: dcc4e5728eea ("seq_buf: Introduce DECLARE_SEQ_BUF and seq_buf_str()") Signed-off-by: Nathan Lynch Signed-off-by: Steven Rostedt (Google) Signed-off-by: Greg Kroah-Hartman commit 7abe1494896c052d5dcf4ba910656777aff72fbb Author: Tony Krowiak Date: Mon Jan 15 13:54:36 2024 -0500 s390/vfio-ap: do not reset queue removed from host config commit b9bd10c43456d16abd97b717446f51afb3b88411 upstream. When a queue is unbound from the vfio_ap device driver, it is reset to ensure its crypto data is not leaked when it is bound to another device driver. If the queue is unbound due to the fact that the adapter or domain was removed from the host's AP configuration, then attempting to reset it will fail with response code 01 (APID not valid) getting returned from the reset command. Let's ensure that the queue is assigned to the host's configuration before resetting it. Signed-off-by: Tony Krowiak Reviewed-by: "Jason J. Herne" Reviewed-by: Halil Pasic Fixes: eeb386aeb5b7 ("s390/vfio-ap: handle config changed and scan complete notification") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240115185441.31526-7-akrowiak@linux.ibm.com Signed-off-by: Alexander Gordeev Signed-off-by: Greg Kroah-Hartman commit 2c9bb887dcb5b6ce59ef5cc0a04100f4e9ed6c4c Author: Tony Krowiak Date: Mon Jan 15 13:54:35 2024 -0500 s390/vfio-ap: reset queues associated with adapter for queue unbound from driver commit f009cfa466558b7dfe97f167ba1875d6f9ea4c07 upstream. When a queue is unbound from the vfio_ap device driver, if that queue is assigned to a guest's AP configuration, its associated adapter is removed because queues are defined to a guest via a matrix of adapters and domains; so, it is not possible to remove a single queue. If an adapter is removed from the guest's AP configuration, all associated queues must be reset to prevent leaking crypto data should any of them be assigned to a different guest or device driver. The one caveat is that if the queue is being removed because the adapter or domain has been removed from the host's AP configuration, then an attempt to reset the queue will fail with response code 01, AP-queue number not valid; so resetting these queues should be skipped. Acked-by: Halil Pasic Signed-off-by: Tony Krowiak Fixes: 09d31ff78793 ("s390/vfio-ap: hot plug/unplug of AP devices when probed/removed") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240115185441.31526-6-akrowiak@linux.ibm.com Signed-off-by: Alexander Gordeev Signed-off-by: Greg Kroah-Hartman commit 44992a765642df2f9c18ab2cf46d901a9752001d Author: Tony Krowiak Date: Mon Jan 15 13:54:34 2024 -0500 s390/vfio-ap: reset queues filtered from the guest's AP config commit f848cba767e59f8d5c54984b1d45451aae040d50 upstream. When filtering the adapters from the configuration profile for a guest to create or update a guest's AP configuration, if the APID of an adapter and the APQI of a domain identify a queue device that is not bound to the vfio_ap device driver, the APID of the adapter will be filtered because an individual APQN can not be filtered due to the fact the APQNs are assigned to an AP configuration as a matrix of APIDs and APQIs. Consequently, a guest will not have access to all of the queues associated with the filtered adapter. If the queues are subsequently made available again to the guest, they should re-appear in a reset state; so, let's make sure all queues associated with an adapter unplugged from the guest are reset. In order to identify the set of queues that need to be reset, let's allow a vfio_ap_queue object to be simultaneously stored in both a hashtable and a list: A hashtable used to store all of the queues assigned to a matrix mdev; and/or, a list used to store a subset of the queues that need to be reset. For example, when an adapter is hot unplugged from a guest, all guest queues associated with that adapter must be reset. Since that may be a subset of those assigned to the matrix mdev, they can be stored in a list that can be passed to the vfio_ap_mdev_reset_queues function. Signed-off-by: Tony Krowiak Acked-by: Halil Pasic Fixes: 48cae940c31d ("s390/vfio-ap: refresh guest's APCB by filtering AP resources assigned to mdev") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240115185441.31526-5-akrowiak@linux.ibm.com Signed-off-by: Alexander Gordeev Signed-off-by: Greg Kroah-Hartman commit c706b3e6acc4bd3c52bdb6fc63baa5f331636965 Author: Tony Krowiak Date: Mon Jan 15 13:54:33 2024 -0500 s390/vfio-ap: let on_scan_complete() callback filter matrix and update guest's APCB commit 774d10196e648e2c0b78da817f631edfb3dfa557 upstream. When adapters and/or domains are added to the host's AP configuration, this may result in multiple queue devices getting created and probed by the vfio_ap device driver. For each queue device probed, the matrix of adapters and domains assigned to a matrix mdev will be filtered to update the guest's APCB. If any adapters or domains get added to or removed from the APCB, the guest's AP configuration will be dynamically updated (i.e., hot plug/unplug). To dynamically update the guest's configuration, its VCPUs must be taken out of SIE for the period of time it takes to make the update. This is disruptive to the guest's operation and if there are many queues probed due to a change in the host's AP configuration, this could be troublesome. The problem is exacerbated by the fact that the 'on_scan_complete' callback also filters the mdev's matrix and updates the guest's AP configuration. In order to reduce the potential amount of disruption to the guest that may result from a change to the host's AP configuration, let's bypass the filtering of the matrix and updating of the guest's AP configuration in the probe callback - if due to a host config change - and defer it until the 'on_scan_complete' callback is invoked after the AP bus finishes its device scan operation. This way the filtering and updating will be performed only once regardless of the number of queues added. Signed-off-by: Tony Krowiak Reviewed-by: Halil Pasic Fixes: 48cae940c31d ("s390/vfio-ap: refresh guest's APCB by filtering AP resources assigned to mdev") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240115185441.31526-4-akrowiak@linux.ibm.com Signed-off-by: Alexander Gordeev Signed-off-by: Greg Kroah-Hartman commit 24f07b64c791c957be6f437a374d2ca8fdfad4bc Author: Tony Krowiak Date: Mon Jan 15 13:54:32 2024 -0500 s390/vfio-ap: loop over the shadow APCB when filtering guest's AP configuration commit 16fb78cbf56e42b8efb2682a4444ab59e32e7959 upstream. While filtering the mdev matrix, it doesn't make sense - and will have unexpected results - to filter an APID from the matrix if the APID or one of the associated APQIs is not in the host's AP configuration. There are two reasons for this: 1. An adapter or domain that is not in the host's AP configuration can be assigned to the matrix; this is known as over-provisioning. Queue devices, however, are only created for adapters and domains in the host's AP configuration, so there will be no queues associated with an over-provisioned adapter or domain to filter. 2. The adapter or domain may have been externally removed from the host's configuration via an SE or HMC attached to a DPM enabled LPAR. In this case, the vfio_ap device driver would have been notified by the AP bus via the on_config_changed callback and the adapter or domain would have already been filtered. Since the matrix_mdev->shadow_apcb.apm and matrix_mdev->shadow_apcb.aqm are copied from the mdev matrix sans the APIDs and APQIs not in the host's AP configuration, let's loop over those bitmaps instead of those assigned to the matrix. Signed-off-by: Tony Krowiak Reviewed-by: Halil Pasic Fixes: 48cae940c31d ("s390/vfio-ap: refresh guest's APCB by filtering AP resources assigned to mdev") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240115185441.31526-3-akrowiak@linux.ibm.com Signed-off-by: Alexander Gordeev Signed-off-by: Greg Kroah-Hartman commit cdd134d56138302976685e6c7bc4755450b3880e Author: Tony Krowiak Date: Mon Jan 15 13:54:31 2024 -0500 s390/vfio-ap: always filter entire AP matrix commit 850fb7fa8c684a4c6bf0e4b6978f4ddcc5d43d11 upstream. The vfio_ap_mdev_filter_matrix function is called whenever a new adapter or domain is assigned to the mdev. The purpose of the function is to update the guest's AP configuration by filtering the matrix of adapters and domains assigned to the mdev. When an adapter or domain is assigned, only the APQNs associated with the APID of the new adapter or APQI of the new domain are inspected. If an APQN does not reference a queue device bound to the vfio_ap device driver, then it's APID will be filtered from the mdev's matrix when updating the guest's AP configuration. Inspecting only the APID of the new adapter or APQI of the new domain will result in passing AP queues through to a guest that are not bound to the vfio_ap device driver under certain circumstances. Consider the following: guest's AP configuration (all also assigned to the mdev's matrix): 14.0004 14.0005 14.0006 16.0004 16.0005 16.0006 unassign domain 4 unbind queue 16.0005 assign domain 4 When domain 4 is re-assigned, since only domain 4 will be inspected, the APQNs that will be examined will be: 14.0004 16.0004 Since both of those APQNs reference queue devices that are bound to the vfio_ap device driver, nothing will get filtered from the mdev's matrix when updating the guest's AP configuration. Consequently, queue 16.0005 will get passed through despite not being bound to the driver. This violates the linux device model requirement that a guest shall only be given access to devices bound to the device driver facilitating their pass-through. To resolve this problem, every adapter and domain assigned to the mdev will be inspected when filtering the mdev's matrix. Signed-off-by: Tony Krowiak Acked-by: Halil Pasic Fixes: 48cae940c31d ("s390/vfio-ap: refresh guest's APCB by filtering AP resources assigned to mdev") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240115185441.31526-2-akrowiak@linux.ibm.com Signed-off-by: Alexander Gordeev Signed-off-by: Greg Kroah-Hartman commit 6ce2041036098207757c63db9649fc8ec5e231b2 Author: Herve Codina Date: Tue Dec 5 16:21:00 2023 +0100 soc: fsl: cpm1: qmc: Fix rx channel reset commit dfe66d012af2ddfa566cf9c860b8472b412fb7e4 upstream. The qmc_chan_reset_rx() set the is_rx_stopped flag. This leads to an inconsistent state in the following sequence. qmc_chan_stop() qmc_chan_reset() Indeed, after the qmc_chan_reset() call, the channel must still be stopped. Only a qmc_chan_start() call can move the channel from stopped state to started state. Fix the issue removing the is_rx_stopped flag setting from qmc_chan_reset() Fixes: 3178d58e0b97 ("soc: fsl: cpm1: Add support for QMC") Cc: stable@vger.kernel.org Signed-off-by: Herve Codina Reviewed-by: Christophe Leroy Link: https://lore.kernel.org/r/20231205152116.122512-4-herve.codina@bootlin.com Signed-off-by: Greg Kroah-Hartman commit 4c1080e07d206a4120c0a4828da68ff0200ccf10 Author: Herve Codina Date: Tue Dec 5 16:20:59 2023 +0100 soc: fsl: cpm1: qmc: Fix __iomem addresses declaration commit a5ec3a21220da06bdda2e686012ca64fdb6c513d upstream. Running sparse (make C=1) on qmc.c raises a lot of warning such as: ... warning: incorrect type in assignment (different address spaces) expected struct cpm_buf_desc [usertype] *[noderef] __iomem bd got struct cpm_buf_desc [noderef] [usertype] __iomem *txbd_free ... Indeed, some variable were declared 'type *__iomem var' instead of 'type __iomem *var'. Use the correct declaration to remove these warnings. Fixes: 3178d58e0b97 ("soc: fsl: cpm1: Add support for QMC") Cc: stable@vger.kernel.org Signed-off-by: Herve Codina Reviewed-by: Christophe Leroy Link: https://lore.kernel.org/r/20231205152116.122512-3-herve.codina@bootlin.com Signed-off-by: Greg Kroah-Hartman commit c17fcc9673676fc7a4a8ccd899e7c1203393a9a8 Author: Herve Codina Date: Tue Dec 5 16:20:58 2023 +0100 soc: fsl: cpm1: tsa: Fix __iomem addresses declaration commit fc0c64154e5ddeb6f63c954735bd646ce5b8d9a4 upstream. Running sparse (make C=1) on tsa.c raises a lot of warning such as: --- 8< --- warning: incorrect type in assignment (different address spaces) expected void *[noderef] si_regs got void [noderef] __iomem * --- 8< --- Indeed, some variable were declared 'type *__iomem var' instead of 'type __iomem *var'. Use the correct declaration to remove these warnings. Fixes: 1d4ba0b81c1c ("soc: fsl: cpm1: Add support for TSA") Cc: stable@vger.kernel.org Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202312051959.9YdRIYbg-lkp@intel.com/ Signed-off-by: Herve Codina Reviewed-by: Christophe Leroy Link: https://lore.kernel.org/r/20231205152116.122512-2-herve.codina@bootlin.com Signed-off-by: Greg Kroah-Hartman commit 900a5edf278b3c96eedd6fc49443ba058bcda5ff Author: Bingbu Cao Date: Wed Nov 22 17:46:07 2023 +0800 media: ov01a10: Enable runtime PM before registering async sub-device commit 47a78052db51b16e8045524fbf33373b58f1323b upstream. As the sensor device maybe accessible right after its async sub-device is registered, such as ipu-bridge will try to power up sensor by sensor's client device's runtime PM from the async notifier callback, if runtime PM is not enabled, it will fail. So runtime PM should be ready before its async sub-device is registered and accessible by others. It also sets the runtime PM status to active as the sensor was turned on by i2c-core. Fixes: 0827b58dabff ("media: i2c: add ov01a10 image sensor driver") Cc: stable@vger.kernel.org Signed-off-by: Bingbu Cao Signed-off-by: Sakari Ailus Signed-off-by: Hans Verkuil Signed-off-by: Greg Kroah-Hartman commit a14314c065c14fa370cd7915b072d300fbead269 Author: Bingbu Cao Date: Wed Nov 22 17:46:08 2023 +0800 media: ov13b10: Enable runtime PM before registering async sub-device commit 7b0454cfd8edb3509619407c3b9f78a6d0dee1a5 upstream. As the sensor device maybe accessible right after its async sub-device is registered, such as ipu-bridge will try to power up sensor by sensor's client device's runtime PM from the async notifier callback, if runtime PM is not enabled, it will fail. So runtime PM should be ready before its async sub-device is registered and accessible by others. Fixes: 7ee850546822 ("media: Add sensor driver support for the ov13b10 camera.") Cc: stable@vger.kernel.org Signed-off-by: Bingbu Cao Signed-off-by: Sakari Ailus Signed-off-by: Hans Verkuil Signed-off-by: Greg Kroah-Hartman commit 7dd53e14a6e98b42e746edf5f60fd89e90d5d376 Author: Bingbu Cao Date: Wed Nov 22 17:46:09 2023 +0800 media: ov9734: Enable runtime PM before registering async sub-device commit e242e9c144050ed120cf666642ba96b7c4462a4c upstream. As the sensor device maybe accessible right after its async sub-device is registered, such as ipu-bridge will try to power up sensor by sensor's client device's runtime PM from the async notifier callback, if runtime PM is not enabled, it will fail. So runtime PM should be ready before its async sub-device is registered and accessible by others. Fixes: d3f863a63fe4 ("media: i2c: Add ov9734 image sensor driver") Cc: stable@vger.kernel.org Signed-off-by: Bingbu Cao Signed-off-by: Sakari Ailus Signed-off-by: Hans Verkuil Signed-off-by: Greg Kroah-Hartman commit 9a416d624e5fb7246ea97c11fbfea7e0e27abf43 Author: Xiaolei Wang Date: Fri Dec 15 10:00:49 2023 +0800 rpmsg: virtio: Free driver_override when rpmsg_remove() commit d5362c37e1f8a40096452fc201c30e705750e687 upstream. Free driver_override when rpmsg_remove(), otherwise the following memory leak will occur: unreferenced object 0xffff0000d55d7080 (size 128): comm "kworker/u8:2", pid 56, jiffies 4294893188 (age 214.272s) hex dump (first 32 bytes): 72 70 6d 73 67 5f 6e 73 00 00 00 00 00 00 00 00 rpmsg_ns........ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<000000009c94c9c1>] __kmem_cache_alloc_node+0x1f8/0x320 [<000000002300d89b>] __kmalloc_node_track_caller+0x44/0x70 [<00000000228a60c3>] kstrndup+0x4c/0x90 [<0000000077158695>] driver_set_override+0xd0/0x164 [<000000003e9c4ea5>] rpmsg_register_device_override+0x98/0x170 [<000000001c0c89a8>] rpmsg_ns_register_device+0x24/0x30 [<000000008bbf8fa2>] rpmsg_probe+0x2e0/0x3ec [<00000000e65a68df>] virtio_dev_probe+0x1c0/0x280 [<00000000443331cc>] really_probe+0xbc/0x2dc [<00000000391064b1>] __driver_probe_device+0x78/0xe0 [<00000000a41c9a5b>] driver_probe_device+0xd8/0x160 [<000000009c3bd5df>] __device_attach_driver+0xb8/0x140 [<0000000043cd7614>] bus_for_each_drv+0x7c/0xd4 [<000000003b929a36>] __device_attach+0x9c/0x19c [<00000000a94e0ba8>] device_initial_probe+0x14/0x20 [<000000003c999637>] bus_probe_device+0xa0/0xac Signed-off-by: Xiaolei Wang Fixes: b0b03b811963 ("rpmsg: Release rpmsg devices in backends") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231215020049.78750-1-xiaolei.wang@windriver.com Signed-off-by: Mathieu Poirier Signed-off-by: Greg Kroah-Hartman commit 15914e41d0735398b70954bdfc9b6a76a2ddc3f5 Author: Bingbu Cao Date: Wed Nov 22 17:46:06 2023 +0800 media: imx355: Enable runtime PM before registering async sub-device commit efa5fe19c0a9199f49e36e1f5242ed5c88da617d upstream. As the sensor device maybe accessible right after its async sub-device is registered, such as ipu-bridge will try to power up sensor by sensor's client device's runtime PM from the async notifier callback, if runtime PM is not enabled, it will fail. So runtime PM should be ready before its async sub-device is registered and accessible by others. Fixes: df0b5c4a7ddd ("media: add imx355 camera sensor driver") Cc: stable@vger.kernel.org Signed-off-by: Bingbu Cao Signed-off-by: Sakari Ailus Signed-off-by: Hans Verkuil Signed-off-by: Greg Kroah-Hartman commit d26edf4ee3672cc9828f2a3ffae34086a712574d Author: Johan Hovold Date: Thu Nov 9 10:31:00 2023 +0100 soc: qcom: pmic_glink_altmode: fix port sanity check commit c4fb7d2eac9ff9bfc35a2e4d40c7169a332416e0 upstream. The PMIC GLINK altmode driver currently supports at most two ports. Fix the incomplete port sanity check on notifications to avoid accessing and corrupting memory beyond the port array if we ever get a notification for an unsupported port. Fixes: 080b4e24852b ("soc: qcom: pmic_glink: Introduce altmode support") Cc: stable@vger.kernel.org # 6.3 Signed-off-by: Johan Hovold Reviewed-by: Dmitry Baryshkov Reviewed-by: Konrad Dybcio Link: https://lore.kernel.org/r/20231109093100.19971-1-johan+linaro@kernel.org Signed-off-by: Bjorn Andersson Signed-off-by: Greg Kroah-Hartman commit f97b329d1f98ab1b757110b2baa247c0f77b17e5 Author: Miquel Raynal Date: Fri Dec 15 13:32:08 2023 +0100 mtd: rawnand: Clarify conditions to enable continuous reads commit 828f6df1bcba7f64729166efc7086ea657070445 upstream. The current logic is probably fine but is a bit convoluted. Plus, we don't want partial pages to be part of the sequential operation just in case the core would optimize the page read with a subpage read (which would break the sequence). This may happen on the first and last page only, so if the start offset or the end offset is not aligned with a page boundary, better avoid them to prevent any risk. Cc: stable@vger.kernel.org Fixes: 003fe4b9545b ("mtd: rawnand: Support for sequential cache reads") Signed-off-by: Miquel Raynal Tested-by: Martin Hundebøll Link: https://lore.kernel.org/linux-mtd/20231215123208.516590-5-miquel.raynal@bootlin.com Signed-off-by: Greg Kroah-Hartman commit 17413aeae039336ce410d873415fda13c62adb98 Author: Miquel Raynal Date: Fri Dec 15 13:32:07 2023 +0100 mtd: rawnand: Prevent sequential reads with on-die ECC engines commit a62c4597953fe54c6af04166a5e2872efd0e1490 upstream. Some devices support sequential reads when using the on-die ECC engines, some others do not. It is a bit hard to know which ones will break other than experimentally, so in order to avoid such a difficult and painful task, let's just pretend all devices should avoid using this optimization when configured like this. Cc: stable@vger.kernel.org Fixes: 003fe4b9545b ("mtd: rawnand: Support for sequential cache reads") Signed-off-by: Miquel Raynal Tested-by: Martin Hundebøll Link: https://lore.kernel.org/linux-mtd/20231215123208.516590-4-miquel.raynal@bootlin.com Signed-off-by: Greg Kroah-Hartman commit 2a3cc28b600f267f6f1804cf84b14e39c3c744dd Author: Miquel Raynal Date: Fri Dec 15 13:32:06 2023 +0100 mtd: rawnand: Fix core interference with sequential reads commit 7c9414c870c027737d0f2ed7b0ed10f26edb1c61 upstream. A couple of reports pointed at some strange failures happening a bit randomly since the introduction of sequential page reads support. After investigation it turned out the most likely reason for these issues was the fact that sometimes a (longer) read might happen, starting at the same page that was read previously. This is optimized by the raw NAND core, by not sending the READ_PAGE command to the NAND device and just reading out the data in a local cache. When this page is also flagged as being the starting point for a sequential read, it means the page right next will be accessed without the right instructions. The NAND chip will be confused and will not output correct data. In order to avoid such situation from happening anymore, we can however handle this case with a bit of additional logic, to postpone the initialization of the read sequence by one page. Reported-by: Alexander Shiyan Closes: https://lore.kernel.org/linux-mtd/CAP1tNvS=NVAm-vfvYWbc3k9Cx9YxMc2uZZkmXk8h1NhGX877Zg@mail.gmail.com/ Reported-by: Måns Rullgård Closes: https://lore.kernel.org/linux-mtd/yw1xfs6j4k6q.fsf@mansr.com/ Reported-by: Martin Hundebøll Closes: https://lore.kernel.org/linux-mtd/9d0c42fcde79bfedfe5b05d6a4e9fdef71d3dd52.camel@geanix.com/ Fixes: 003fe4b9545b ("mtd: rawnand: Support for sequential cache reads") Cc: stable@vger.kernel.org Signed-off-by: Miquel Raynal Tested-by: Martin Hundebøll Link: https://lore.kernel.org/linux-mtd/20231215123208.516590-3-miquel.raynal@bootlin.com Signed-off-by: Greg Kroah-Hartman commit 427f8206e84d4de1f49d639576ebf4ca74455da5 Author: Miquel Raynal Date: Fri Dec 15 13:32:05 2023 +0100 mtd: rawnand: Prevent crossing LUN boundaries during sequential reads commit bbcd80f53a5e8c27c2511f539fec8c373f500cf4 upstream. The ONFI specification states that devices do not need to support sequential reads across LUN boundaries. In order to prevent such event from happening and possibly failing, let's introduce the concept of "pause" in the sequential read to handle these cases. The first/last pages remain the same but any time we cross a LUN boundary we will end and restart (if relevant) the sequential read operation. Cc: stable@vger.kernel.org Fixes: 003fe4b9545b ("mtd: rawnand: Support for sequential cache reads") Signed-off-by: Miquel Raynal Tested-by: Martin Hundebøll Link: https://lore.kernel.org/linux-mtd/20231215123208.516590-2-miquel.raynal@bootlin.com Signed-off-by: Greg Kroah-Hartman commit 1168d6b79d2fafb41299fbc1b528e20644c562a5 Author: Miquel Raynal Date: Tue Dec 5 08:59:36 2023 +0100 mtd: maps: vmu-flash: Fix the (mtd core) switch to ref counters commit a7d84a2e7663bbe12394cc771107e04668ea313a upstream. While switching to ref counters for track mtd devices use, the vmu-flash driver was forgotten. The reason for reading the ref counter seems debatable, but let's just fix the build for now. Fixes: 19bfa9ebebb5 ("mtd: use refcount to prevent corruption") Reported-by: kernel test robot Closes: https://lore.kernel.org/oe-kbuild-all/202312022315.79twVRZw-lkp@intel.com/ Cc: stable@vger.kernel.org Signed-off-by: Miquel Raynal Link: https://lore.kernel.org/linux-mtd/20231205075936.13831-1-miquel.raynal@bootlin.com Signed-off-by: Greg Kroah-Hartman commit eaef4650fa2050147ca25fd7ee43bc0082e03c87 Author: Christian Marangi Date: Tue Oct 24 20:30:15 2023 +0200 PM / devfreq: Fix buffer overflow in trans_stat_show commit 08e23d05fa6dc4fc13da0ccf09defdd4bbc92ff4 upstream. Fix buffer overflow in trans_stat_show(). Convert simple snprintf to the more secure scnprintf with size of PAGE_SIZE. Add condition checking if we are exceeding PAGE_SIZE and exit early from loop. Also add at the end a warning that we exceeded PAGE_SIZE and that stats is disabled. Return -EFBIG in the case where we don't have enough space to write the full transition table. Also document in the ABI that this function can return -EFBIG error. Link: https://lore.kernel.org/all/20231024183016.14648-2-ansuelsmth@gmail.com/ Cc: stable@vger.kernel.org Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218041 Fixes: e552bbaf5b98 ("PM / devfreq: Add sysfs node for representing frequency transition information.") Signed-off-by: Christian Marangi Signed-off-by: Chanwoo Choi Signed-off-by: Greg Kroah-Hartman commit 8fc366549f9b11544827883329a515f6a4af037c Author: Anthony Krowiak Date: Thu Nov 9 11:44:20 2023 -0500 s390/vfio-ap: unpin pages on gisc registration failure commit 7b2d039da622daa9ba259ac6f38701d542b237c3 upstream. In the vfio_ap_irq_enable function, after the page containing the notification indicator byte (NIB) is pinned, the function attempts to register the guest ISC. If registration fails, the function sets the status response code and returns without unpinning the page containing the NIB. In order to avoid a memory leak, the NIB should be unpinned before returning from the vfio_ap_irq_enable function. Co-developed-by: Janosch Frank Signed-off-by: Janosch Frank Signed-off-by: Anthony Krowiak Reviewed-by: Matthew Rosato Fixes: 783f0a3ccd79 ("s390/vfio-ap: add s390dbf logging to the vfio_ap_irq_enable function") Cc: Link: https://lore.kernel.org/r/20231109164427.460493-2-akrowiak@linux.ibm.com Signed-off-by: Alexander Gordeev Signed-off-by: Greg Kroah-Hartman commit e78f1a43e72daf77705ad5b9946de66fc708b874 Author: Herbert Xu Date: Tue Nov 28 14:22:13 2023 +0800 crypto: s390/aes - Fix buffer overread in CTR mode commit d07f951903fa9922c375b8ab1ce81b18a0034e3b upstream. When processing the last block, the s390 ctr code will always read a whole block, even if there isn't a whole block of data left. Fix this by using the actual length left and copy it into a buffer first for processing. Fixes: 0200f3ecc196 ("crypto: s390 - add System z hardware support for CTR mode") Cc: Reported-by: Guangwu Zhang Signed-off-by: Herbert Xu Reviewd-by: Harald Freudenberger Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 6822a14271786150e178869f1495cc03e74c5029 Author: Herbert Xu Date: Sat Dec 2 09:01:54 2023 +0800 hwrng: core - Fix page fault dead lock on mmap-ed hwrng commit 78aafb3884f6bc6636efcc1760c891c8500b9922 upstream. There is a dead-lock in the hwrng device read path. This triggers when the user reads from /dev/hwrng into memory also mmap-ed from /dev/hwrng. The resulting page fault triggers a recursive read which then dead-locks. Fix this by using a stack buffer when calling copy_to_user. Reported-by: Edward Adam Davis Reported-by: syzbot+c52ab18308964d248092@syzkaller.appspotmail.com Fixes: 9996508b3353 ("hwrng: core - Replace u32 in driver API with byte array") Cc: Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 2ea043279a2c92c1025e2deeed1ffbc3289afd94 Author: Hongchen Zhang Date: Thu Nov 16 08:56:09 2023 +0800 PM: hibernate: Enforce ordering during image compression/decompression commit 71cd7e80cfde548959952eac7063aeaea1f2e1c6 upstream. An S4 (suspend to disk) test on the LoongArch 3A6000 platform sometimes fails with the following error messaged in the dmesg log: Invalid LZO compressed length That happens because when compressing/decompressing the image, the synchronization between the control thread and the compress/decompress/crc thread is based on a relaxed ordering interface, which is unreliable, and the following situation may occur: CPU 0 CPU 1 save_image_lzo lzo_compress_threadfn atomic_set(&d->stop, 1); atomic_read(&data[thr].stop) data[thr].cmp = data[thr].cmp_len; WRITE data[thr].cmp_len Then CPU0 gets a stale cmp_len and writes it to disk. During resume from S4, wrong cmp_len is loaded. To maintain data consistency between the two threads, use the acquire/release variants of atomic set and read operations. Fixes: 081a9d043c98 ("PM / Hibernate: Improve performance of LZO/plain hibernation, checksum image") Cc: All applicable Signed-off-by: Hongchen Zhang Co-developed-by: Weihao Li Signed-off-by: Weihao Li [ rjw: Subject rewrite and changelog edits ] Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 7a09e2a77a8d4b133cc7e5e4f9a4ccf4c2283d58 Author: Herbert Xu Date: Thu Dec 7 18:36:57 2023 +0800 crypto: api - Disallow identical driver names commit 27016f75f5ed47e2d8e0ca75a8ff1f40bc1a5e27 upstream. Disallow registration of two algorithms with identical driver names. Cc: Reported-by: Ovidiu Panait Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit bffc4cc334c5bb31ded54bc3cfd651735a3cb79e Author: Gao Xiang Date: Wed Dec 6 12:55:34 2023 +0800 erofs: fix lz4 inplace decompression commit 3c12466b6b7bf1e56f9b32c366a3d83d87afb4de upstream. Currently EROFS can map another compressed buffer for inplace decompression, that was used to handle the cases that some pages of compressed data are actually not in-place I/O. However, like most simple LZ77 algorithms, LZ4 expects the compressed data is arranged at the end of the decompressed buffer and it explicitly uses memmove() to handle overlapping: __________________________________________________________ |_ direction of decompression --> ____ |_ compressed data _| Although EROFS arranges compressed data like this, it typically maps two individual virtual buffers so the relative order is uncertain. Previously, it was hardly observed since LZ4 only uses memmove() for short overlapped literals and x86/arm64 memmove implementations seem to completely cover it up and they don't have this issue. Juhyung reported that EROFS data corruption can be found on a new Intel x86 processor. After some analysis, it seems that recent x86 processors with the new FSRM feature expose this issue with "rep movsb". Let's strictly use the decompressed buffer for lz4 inplace decompression for now. Later, as an useful improvement, we could try to tie up these two buffers together in the correct order. Reported-and-tested-by: Juhyung Park Closes: https://lore.kernel.org/r/CAD14+f2AVKf8Fa2OO1aAUdDNTDsVzzR6ctU_oJSmTyd6zSYR2Q@mail.gmail.com Fixes: 0ffd71bcc3a0 ("staging: erofs: introduce LZ4 decompression inplace") Fixes: 598162d05080 ("erofs: support decompress big pcluster for lz4 backend") Cc: stable # 5.4+ Tested-by: Yifan Zhao Signed-off-by: Gao Xiang Link: https://lore.kernel.org/r/20231206045534.3920847-1-hsiangkao@linux.alibaba.com Signed-off-by: Greg Kroah-Hartman commit 7abdfd45a650c714d5ebab564bb1b988f14d9b49 Author: Tianjia Zhang Date: Thu Dec 14 11:08:34 2023 +0800 crypto: lib/mpi - Fix unexpected pointer access in mpi_ec_init commit ba3c5574203034781ac4231acf117da917efcd2a upstream. When the mpi_ec_ctx structure is initialized, some fields are not cleared, causing a crash when referencing the field when the structure was released. Initially, this issue was ignored because memory for mpi_ec_ctx is allocated with the __GFP_ZERO flag. For example, this error will be triggered when calculating the Za value for SM2 separately. Fixes: d58bb7e55a8a ("lib/mpi: Introduce ec implementation to MPI library") Cc: stable@vger.kernel.org # v6.5 Signed-off-by: Tianjia Zhang Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit b135aafeac4b3adfa54d1990d443cdd168dd3703 Author: David Disseldorp Date: Fri Dec 8 11:41:56 2023 +1100 btrfs: sysfs: validate scrub_speed_max value commit 2b0122aaa800b021e36027d7f29e206f87c761d6 upstream. The value set as scrub_speed_max accepts size with suffixes (k/m/g/t/p/e) but we should still validate it for trailing characters, similar to what we do with chunk_size_store. CC: stable@vger.kernel.org # 5.15+ Signed-off-by: David Disseldorp Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 78653c35206c9d2d24d6d4c91c29cab5165d0cb7 Author: Viresh Kumar Date: Fri Jan 5 13:55:37 2024 +0530 OPP: Pass rounded rate to _set_opp() commit 7269c250db1b89cda72ca419b7bd5e37997309d6 upstream. The OPP core finds the eventual frequency to set with the help of clk_round_rate() and the same was earlier getting passed to _set_opp() and that's what would get configured. The commit 1efae8d2e777 ("OPP: Make dev_pm_opp_set_opp() independent of frequency") mistakenly changed that. Fix it. Fixes: 1efae8d2e777 ("OPP: Make dev_pm_opp_set_opp() independent of frequency") Cc: v5.18+ # v6.0+ Signed-off-by: Viresh Kumar Signed-off-by: Greg Kroah-Hartman commit 5c57ec3cf4bf208a2992aa601f060e1e3915facd Author: Josef Bacik Date: Thu Dec 14 11:18:50 2023 -0500 arm64: properly install vmlinuz.efi commit 7b21ed7d119dc06b0ed2ba3e406a02cafe3a8d03 upstream. If you select CONFIG_EFI_ZBOOT, we will generate vmlinuz.efi, and then when we go to install the kernel we'll install the vmlinux instead because install.sh only recognizes Image.gz as wanting the compressed install image. With CONFIG_EFI_ZBOOT we don't get the proper kernel installed, which means it doesn't boot, which makes for a very confused and subsequently angry kernel developer. Fix this by properly installing our compressed kernel if we've enabled CONFIG_EFI_ZBOOT. Signed-off-by: Josef Bacik Cc: # 6.1.x Fixes: c37b830fef13 ("arm64: efi: enable generic EFI compressed boot") Reviewed-by: Simon Glass Link: https://lore.kernel.org/r/6edb1402769c2c14c4fbef8f7eaedb3167558789.1702570674.git.josef@toxicpanda.com Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit 9bd3dce27b01c51295b60e1433e1dadfb16649f7 Author: Rafael J. Wysocki Date: Wed Dec 27 21:41:06 2023 +0100 PM: sleep: Fix possible deadlocks in core system-wide PM code commit 7839d0078e0d5e6cc2fa0b0dfbee71de74f1e557 upstream. It is reported that in low-memory situations the system-wide resume core code deadlocks, because async_schedule_dev() executes its argument function synchronously if it cannot allocate memory (and not only in that case) and that function attempts to acquire a mutex that is already held. Executing the argument function synchronously from within dpm_async_fn() may also be problematic for ordering reasons (it may cause a consumer device's resume callback to be invoked before a requisite supplier device's one, for example). Address this by changing the code in question to use async_schedule_dev_nocall() for scheduling the asynchronous execution of device suspend and resume functions and to directly run them synchronously if async_schedule_dev_nocall() returns false. Link: https://lore.kernel.org/linux-pm/ZYvjiqX6EsL15moe@perf/ Reported-by: Youngmin Nam Signed-off-by: Rafael J. Wysocki Reviewed-by: Stanislaw Gruszka Tested-by: Youngmin Nam Reviewed-by: Ulf Hansson Cc: 5.7+ # 5.7+: 6aa09a5bccd8 async: Split async_schedule_node_domain() Cc: 5.7+ # 5.7+: 7d4b5d7a37bd async: Introduce async_schedule_dev_nocall() Cc: 5.7+ # 5.7+ Signed-off-by: Greg Kroah-Hartman commit 22f7c9cb05cd5153100c859099da64982e9c1d1f Author: Rafael J. Wysocki Date: Wed Dec 27 21:38:23 2023 +0100 async: Introduce async_schedule_dev_nocall() commit 7d4b5d7a37bdd63a5a3371b988744b060d5bb86f upstream. In preparation for subsequent changes, introduce a specialized variant of async_schedule_dev() that will not invoke the argument function synchronously when it cannot be scheduled for asynchronous execution. The new function, async_schedule_dev_nocall(), will be used for fixing possible deadlocks in the system-wide power management core code. Signed-off-by: Rafael J. Wysocki Reviewed-by: Stanislaw Gruszka for the series. Tested-by: Youngmin Nam Reviewed-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 8ae1a4fd03bd36694758e75f6f77644a6dc0db3b Author: Rafael J. Wysocki Date: Wed Dec 27 21:37:02 2023 +0100 async: Split async_schedule_node_domain() commit 6aa09a5bccd8e224d917afdb4c278fc66aacde4d upstream. In preparation for subsequent changes, split async_schedule_node_domain() in two pieces so as to allow the bottom part of it to be called from a somewhat different code path. No functional impact. Signed-off-by: Rafael J. Wysocki Reviewed-by: Stanislaw Gruszka Tested-by: Youngmin Nam Reviewed-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 73986e8d2808c24b88c7b36a081fb351d2bf6a30 Author: Suraj Jitindar Singh Date: Wed Dec 13 16:16:35 2023 +1100 ext4: allow for the last group to be marked as trimmed commit 7c784d624819acbeefb0018bac89e632467cca5a upstream. The ext4 filesystem tracks the trim status of blocks at the group level. When an entire group has been trimmed then it is marked as such and subsequent trim invocations with the same minimum trim size will not be attempted on that group unless it is marked as able to be trimmed again such as when a block is freed. Currently the last group can't be marked as trimmed due to incorrect logic in ext4_last_grp_cluster(). ext4_last_grp_cluster() is supposed to return the zero based index of the last cluster in a group. This is then used by ext4_try_to_trim_range() to determine if the trim operation spans the entire group and as such if the trim status of the group should be recorded. ext4_last_grp_cluster() takes a 0 based group index, thus the valid values for grp are 0..(ext4_get_groups_count - 1). Any group index less than (ext4_get_groups_count - 1) is not the last group and must have EXT4_CLUSTERS_PER_GROUP(sb) clusters. For the last group we need to calculate the number of clusters based on the number of blocks in the group. Finally subtract 1 from the number of clusters as zero based indexing is expected. Rearrange the function slightly to make it clear what we are calculating and returning. Reproducer: // Create file system where the last group has fewer blocks than // blocks per group $ mkfs.ext4 -b 4096 -g 8192 /dev/nvme0n1 8191 $ mount /dev/nvme0n1 /mnt Before Patch: $ fstrim -v /mnt /mnt: 25.9 MiB (27156480 bytes) trimmed // Group not marked as trimmed so second invocation still discards blocks $ fstrim -v /mnt /mnt: 25.9 MiB (27156480 bytes) trimmed After Patch: fstrim -v /mnt /mnt: 25.9 MiB (27156480 bytes) trimmed // Group marked as trimmed so second invocation DOESN'T discard any blocks fstrim -v /mnt /mnt: 0 B (0 bytes) trimmed Fixes: 45e4ab320c9b ("ext4: move setting of trimmed bit into ext4_try_to_trim_range()") Cc: # 4.19+ Signed-off-by: Suraj Jitindar Singh Reviewed-by: Jan Kara Link: https://lore.kernel.org/r/20231213051635.37731-1-surajjs@amazon.com Signed-off-by: Theodore Ts'o Signed-off-by: Greg Kroah-Hartman commit d0f0780f03df54d08ced118d27834ee5008724e4 Author: Geoff Levand Date: Sun Dec 24 09:52:46 2023 +0900 powerpc/ps3_defconfig: Disable PPC64_BIG_ENDIAN_ELF_ABI_V2 commit 482b718a84f08b6fc84879c3e90cc57dba11c115 upstream. Commit 8c5fa3b5c4df ("powerpc/64: Make ELFv2 the default for big-endian builds"), merged in Linux-6.5-rc1 changes the calling ABI in a way that is incompatible with the current code for the PS3's LV1 hypervisor calls. This change just adds the line '# CONFIG_PPC64_BIG_ENDIAN_ELF_ABI_V2 is not set' to the ps3_defconfig file so that the PPC64_ELF_ABI_V1 is used. Fixes run time errors like these: BUG: Kernel NULL pointer dereference at 0x00000000 Faulting instruction address: 0xc000000000047cf0 Oops: Kernel access of bad area, sig: 11 [#1] Call Trace: [c0000000023039e0] [c00000000100ebfc] ps3_create_spu+0xc4/0x2b0 (unreliable) [c000000002303ab0] [c00000000100d4c4] create_spu+0xcc/0x3c4 [c000000002303b40] [c00000000100eae4] ps3_enumerate_spus+0xa4/0xf8 Fixes: 8c5fa3b5c4df ("powerpc/64: Make ELFv2 the default for big-endian builds") Cc: stable@vger.kernel.org # v6.5+ Signed-off-by: Geoff Levand Signed-off-by: Michael Ellerman Link: https://msgid.link/df906ac1-5f17-44b9-b0bb-7cd292a0df65@infradead.org Signed-off-by: Greg Kroah-Hartman commit 6fb70694f8d1ac34e45246b0ac988f025e1e5b55 Author: Lukas Schauer Date: Fri Dec 1 11:11:28 2023 +0100 pipe: wakeup wr_wait after setting max_usage commit e95aada4cb93d42e25c30a0ef9eb2923d9711d4a upstream. Commit c73be61cede5 ("pipe: Add general notification queue support") a regression was introduced that would lock up resized pipes under certain conditions. See the reproducer in [1]. The commit resizing the pipe ring size was moved to a different function, doing that moved the wakeup for pipe->wr_wait before actually raising pipe->max_usage. If a pipe was full before the resize occured it would result in the wakeup never actually triggering pipe_write. Set @max_usage and @nr_accounted before waking writers if this isn't a watch queue. Link: https://bugzilla.kernel.org/show_bug.cgi?id=212295 [1] Link: https://lore.kernel.org/r/20231201-orchideen-modewelt-e009de4562c6@brauner Fixes: c73be61cede5 ("pipe: Add general notification queue support") Reviewed-by: David Howells Cc: Signed-off-by: Lukas Schauer [Christian Brauner : rewrite to account for watch queues] Signed-off-by: Christian Brauner Signed-off-by: Greg Kroah-Hartman commit 55aca2ce91a63740278502066beaddbd841af9c6 Author: Marcelo Schmitt Date: Tue Dec 19 17:26:01 2023 -0300 iio: adc: ad7091r: Allow users to configure device events [ Upstream commit 020e71c7ffc25dfe29ed9be6c2d39af7bd7f661f ] AD7091R-5 devices are supported by the ad7091r-5 driver together with the ad7091r-base driver. Those drivers declared iio events for notifying user space when ADC readings fall bellow the thresholds of low limit registers or above the values set in high limit registers. However, to configure iio events and their thresholds, a set of callback functions must be implemented and those were not present until now. The consequence of trying to configure ad7091r-5 events without the proper callback functions was a null pointer dereference in the kernel because the pointers to the callback functions were not set. Implement event configuration callbacks allowing users to read/write event thresholds and enable/disable event generation. Since the event spec structs are generic to AD7091R devices, also move those from the ad7091r-5 driver the base driver so they can be reused when support for ad7091r-2/-4/-8 be added. Fixes: ca69300173b6 ("iio: adc: Add support for AD7091R5 ADC") Suggested-by: David Lechner Signed-off-by: Marcelo Schmitt Link: https://lore.kernel.org/r/59552d3548dabd56adc3107b7b4869afee2b0c3c.1703013352.git.marcelo.schmitt1@gmail.com Cc: Signed-off-by: Jonathan Cameron Signed-off-by: Sasha Levin commit b05c9588c7c0f97d86dc6442cc86c97b2341beec Author: Marcelo Schmitt Date: Sat Dec 16 14:46:37 2023 -0300 iio: adc: ad7091r: Set alert bit in config register [ Upstream commit 149694f5e79b0c7a36ceb76e7c0d590db8f151c1 ] The ad7091r-base driver sets up an interrupt handler for firing events when inputs are either above or below a certain threshold. However, for the interrupt signal to come from the device it must be configured to enable the ALERT/BUSY/GPO pin to be used as ALERT, which was not being done until now. Enable interrupt signals on the ALERT/BUSY/GPO pin by setting the proper bit in the configuration register. Signed-off-by: Marcelo Schmitt Link: https://lore.kernel.org/r/e8da2ee98d6df88318b14baf3dc9630e20218418.1702746240.git.marcelo.schmitt1@gmail.com Signed-off-by: Jonathan Cameron Stable-dep-of: 020e71c7ffc2 ("iio: adc: ad7091r: Allow users to configure device events") Signed-off-by: Sasha Levin commit a434c75e0671f925f0c534b0a2284de7b8f164e0 Author: Krzysztof Kozlowski Date: Tue Oct 17 11:09:33 2023 -0500 soundwire: fix initializing sysfs for same devices on different buses [ Upstream commit 8a8a9ac8a4972ee69d3dd3d1ae43963ae39cee18 ] If same devices with same device IDs are present on different soundwire buses, the probe fails due to conflicting device names and sysfs entries: sysfs: cannot create duplicate filename '/bus/soundwire/devices/sdw:0:0217:0204:00:0' The link ID is 0 for both devices, so they should be differentiated by the controller ID. Add the controller ID so, the device names and sysfs entries look like: sdw:1:0:0217:0204:00:0 -> ../../../devices/platform/soc@0/6ab0000.soundwire-controller/sdw-master-1-0/sdw:1:0:0217:0204:00:0 sdw:3:0:0217:0204:00:0 -> ../../../devices/platform/soc@0/6b10000.soundwire-controller/sdw-master-3-0/sdw:3:0:0217:0204:00:0 [PLB changes: use bus->controller_id instead of bus->id] Fixes: 7c3cd189b86d ("soundwire: Add Master registration") Cc: stable@vger.kernel.org Reviewed-by: Bard Liao Reviewed-by: Vijendar Mukunda Co-developed-by: Pierre-Louis Bossart Signed-off-by: Pierre-Louis Bossart Signed-off-by: Krzysztof Kozlowski Reviewed-by: Krzysztof Kozlowski Tested-by: Krzysztof Kozlowski Acked-by: Mark Brown Tested-by: Srinivas Kandagatla Link: https://lore.kernel.org/r/20231017160933.12624-3-pierre-louis.bossart@linux.intel.com Signed-off-by: Vinod Koul Signed-off-by: Sasha Levin commit 7364e2e244d833ed48b0084f2eeeb817a2c687f5 Author: Pierre-Louis Bossart Date: Tue Oct 17 11:09:32 2023 -0500 soundwire: bus: introduce controller_id [ Upstream commit 6543ac13c623f906200dfd3f1c407d8d333b6995 ] The existing SoundWire support misses a clear Controller/Manager hiearchical definition to deal with all variants across SOC vendors. a) Intel platforms have one controller with 4 or more Managers. b) AMD platforms have two controllers with one Manager each, but due to BIOS issues use two different link_id values within the scope of a single controller. c) QCOM platforms have one or more controller with one Manager each. This patch adds a 'controller_id' which can be set by higher levels. If assigned to -1, the controller_id will be set to the system-unique IDA-assigned bus->id. The main change is that the bus->id is no longer used for any device name, which makes the definition completely predictable and not dependent on any enumeration order. The bus->id is only used to insert the Managers in the stream rt context. Reviewed-by: Bard Liao Reviewed-by: Vijendar Mukunda Signed-off-by: Pierre-Louis Bossart Reviewed-by: Krzysztof Kozlowski Tested-by: Krzysztof Kozlowski Signed-off-by: Srinivas Kandagatla Link: https://lore.kernel.org/stable/20231017160933.12624-2-pierre-louis.bossart%40linux.intel.com Tested-by: Srinivas Kandagatla Link: https://lore.kernel.org/r/20231017160933.12624-2-pierre-louis.bossart@linux.intel.com Signed-off-by: Vinod Koul Stable-dep-of: 8a8a9ac8a497 ("soundwire: fix initializing sysfs for same devices on different buses") Signed-off-by: Sasha Levin