commit dd5ae3523018ed0d6e58591910cd61bc596fc89c Author: Greg Kroah-Hartman Date: Wed Mar 17 17:11:47 2021 +0100 Linux 5.11.7 Tested-by: Jon Hunter Tested-by: Jason Self Tested-by: Linux Kernel Functional Testing Tested-by: Guenter Roeck Tested-by: Ross Schmidt Link: https://lore.kernel.org/r/20210315135507.611436477@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman commit 198b86551885f58f9d8396d152db2b6ba96836db Author: Andrew Scull Date: Mon Mar 15 12:21:36 2021 +0000 KVM: arm64: Fix nVHE hyp panic host context restore Commit c4b000c3928d4f20acef79dccf3a65ae3795e0b0 upstream. When panicking from the nVHE hyp and restoring the host context, x29 is expected to hold a pointer to the host context. This wasn't being done so fix it to make sure there's a valid pointer the host context being used. Rather than passing a boolean indicating whether or not the host context should be restored, instead pass the pointer to the host context. NULL is passed to indicate that no context should be restored. Fixes: a2e102e20fd6 ("KVM: arm64: nVHE: Handle hyp panics") Cc: stable@vger.kernel.org # 5.11.y only Signed-off-by: Andrew Scull Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20210219122406.1337626-1-ascull@google.com Signed-off-by: Greg Kroah-Hartman commit 4699bb8dd14c3ad7377b036c85742360c603f117 Author: Mike Rapoport Date: Fri Mar 12 21:07:12 2021 -0800 mm/page_alloc.c: refactor initialization of struct page for holes in memory layout commit 0740a50b9baa4472cfb12442df4b39e2712a64a4 upstream. There could be struct pages that are not backed by actual physical memory. This can happen when the actual memory bank is not a multiple of SECTION_SIZE or when an architecture does not register memory holes reserved by the firmware as memblock.memory. Such pages are currently initialized using init_unavailable_mem() function that iterates through PFNs in holes in memblock.memory and if there is a struct page corresponding to a PFN, the fields of this page are set to default values and it is marked as Reserved. init_unavailable_mem() does not take into account zone and node the page belongs to and sets both zone and node links in struct page to zero. Before commit 73a6e474cb37 ("mm: memmap_init: iterate over memblock regions rather that check each PFN") the holes inside a zone were re-initialized during memmap_init() and got their zone/node links right. However, after that commit nothing updates the struct pages representing such holes. On a system that has firmware reserved holes in a zone above ZONE_DMA, for instance in a configuration below: # grep -A1 E820 /proc/iomem 7a17b000-7a216fff : Unknown E820 type 7a217000-7bffffff : System RAM unset zone link in struct page will trigger VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn), page); in set_pfnblock_flags_mask() when called with a struct page from a range other than E820_TYPE_RAM because there are pages in the range of ZONE_DMA32 but the unset zone link in struct page makes them appear as a part of ZONE_DMA. Interleave initialization of the unavailable pages with the normal initialization of memory map, so that zone and node information will be properly set on struct pages that are not backed by the actual memory. With this change the pages for holes inside a zone will get proper zone/node links and the pages that are not spanned by any node will get links to the adjacent zone/node. The holes between nodes will be prepended to the zone/node above the hole and the trailing pages in the last section that will be appended to the zone/node below. [akpm@linux-foundation.org: don't initialize static to zero, use %llu for u64] Link: https://lkml.kernel.org/r/20210225224351.7356-2-rppt@kernel.org Fixes: 73a6e474cb37 ("mm: memmap_init: iterate over memblock regions rather that check each PFN") Signed-off-by: Mike Rapoport Reported-by: Qian Cai Reported-by: Andrea Arcangeli Reviewed-by: Baoquan He Acked-by: Vlastimil Babka Reviewed-by: David Hildenbrand Cc: Borislav Petkov Cc: Chris Wilson Cc: "H. Peter Anvin" Cc: Łukasz Majczak Cc: Ingo Molnar Cc: Mel Gorman Cc: Michal Hocko Cc: "Sarvela, Tomi P" Cc: Thomas Gleixner Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Mike Rapoport Signed-off-by: Greg Kroah-Hartman commit b72b83c6a0e70e266d1b276f0ded1fcbf41eb0dc Author: Zhou Guanghui Date: Fri Mar 12 21:08:30 2021 -0800 mm/memcg: rename mem_cgroup_split_huge_fixup to split_page_memcg and add nr_pages argument commit be6c8982e4ab9a41907555f601b711a7e2a17d4c upstream. Rename mem_cgroup_split_huge_fixup to split_page_memcg and explicitly pass in page number argument. In this way, the interface name is more common and can be used by potential users. In addition, the complete info(memcg and flag) of the memcg needs to be set to the tail pages. Link: https://lkml.kernel.org/r/20210304074053.65527-2-zhouguanghui1@huawei.com Signed-off-by: Zhou Guanghui Acked-by: Johannes Weiner Reviewed-by: Zi Yan Reviewed-by: Shakeel Butt Acked-by: Michal Hocko Cc: Hugh Dickins Cc: "Kirill A. Shutemov" Cc: Nicholas Piggin Cc: Kefeng Wang Cc: Hanjun Guo Cc: Tianhong Ding Cc: Weilong Chen Cc: Rui Xiang Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 5a162c56353de6d07eab79a02904c31fbb7aa577 Author: Zhou Guanghui Date: Fri Mar 12 21:08:33 2021 -0800 mm/memcg: set memcg when splitting page commit e1baddf8475b06cc56f4bafecf9a32a124343d9f upstream. As described in the split_page() comment, for the non-compound high order page, the sub-pages must be freed individually. If the memcg of the first page is valid, the tail pages cannot be uncharged when be freed. For example, when alloc_pages_exact is used to allocate 1MB continuous physical memory, 2MB is charged(kmemcg is enabled and __GFP_ACCOUNT is set). When make_alloc_exact free the unused 1MB and free_pages_exact free the applied 1MB, actually, only 4KB(one page) is uncharged. Therefore, the memcg of the tail page needs to be set when splitting a page. Michel: There are at least two explicit users of __GFP_ACCOUNT with alloc_exact_pages added recently. See 7efe8ef274024 ("KVM: arm64: Allocate stage-2 pgd pages with GFP_KERNEL_ACCOUNT") and c419621873713 ("KVM: s390: Add memcg accounting to KVM allocations"), so this is not just a theoretical issue. Link: https://lkml.kernel.org/r/20210304074053.65527-3-zhouguanghui1@huawei.com Signed-off-by: Zhou Guanghui Acked-by: Johannes Weiner Reviewed-by: Zi Yan Reviewed-by: Shakeel Butt Acked-by: Michal Hocko Cc: Hanjun Guo Cc: Hugh Dickins Cc: Kefeng Wang Cc: "Kirill A. Shutemov" Cc: Nicholas Piggin Cc: Rui Xiang Cc: Tianhong Ding Cc: Weilong Chen Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 27980e6bd3a21a5bf13a47090d7e934aaf339227 Author: Suren Baghdasaryan Date: Fri Mar 12 21:08:06 2021 -0800 mm/madvise: replace ptrace attach requirement for process_madvise commit 96cfe2c0fd23ea7c2368d14f769d287e7ae1082e upstream. process_madvise currently requires ptrace attach capability. PTRACE_MODE_ATTACH gives one process complete control over another process. It effectively removes the security boundary between the two processes (in one direction). Granting ptrace attach capability even to a system process is considered dangerous since it creates an attack surface. This severely limits the usage of this API. The operations process_madvise can perform do not affect the correctness of the operation of the target process; they only affect where the data is physically located (and therefore, how fast it can be accessed). What we want is the ability for one process to influence another process in order to optimize performance across the entire system while leaving the security boundary intact. Replace PTRACE_MODE_ATTACH with a combination of PTRACE_MODE_READ and CAP_SYS_NICE. PTRACE_MODE_READ to prevent leaking ASLR metadata and CAP_SYS_NICE for influencing process performance. Link: https://lkml.kernel.org/r/20210303185807.2160264-1-surenb@google.com Signed-off-by: Suren Baghdasaryan Reviewed-by: Kees Cook Acked-by: Minchan Kim Acked-by: David Rientjes Cc: Jann Horn Cc: Jeff Vander Stoep Cc: Michal Hocko Cc: Shakeel Butt Cc: Tim Murray Cc: Florian Weimer Cc: Oleg Nesterov Cc: James Morris Cc: [5.10+] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit eb2a88359a82b1d0c150849ad0873f5d2403fa85 Author: Nadav Amit Date: Fri Mar 12 21:08:17 2021 -0800 mm/userfaultfd: fix memory corruption due to writeprotect commit 6ce64428d62026a10cb5d80138ff2f90cc21d367 upstream. Userfaultfd self-test fails occasionally, indicating a memory corruption. Analyzing this problem indicates that there is a real bug since mmap_lock is only taken for read in mwriteprotect_range() and defers flushes, and since there is insufficient consideration of concurrent deferred TLB flushes in wp_page_copy(). Although the PTE is flushed from the TLBs in wp_page_copy(), this flush takes place after the copy has already been performed, and therefore changes of the page are possible between the time of the copy and the time in which the PTE is flushed. To make matters worse, memory-unprotection using userfaultfd also poses a problem. Although memory unprotection is logically a promotion of PTE permissions, and therefore should not require a TLB flush, the current userrfaultfd code might actually cause a demotion of the architectural PTE permission: when userfaultfd_writeprotect() unprotects memory region, it unintentionally *clears* the RW-bit if it was already set. Note that this unprotecting a PTE that is not write-protected is a valid use-case: the userfaultfd monitor might ask to unprotect a region that holds both write-protected and write-unprotected PTEs. The scenario that happens in selftests/vm/userfaultfd is as follows: cpu0 cpu1 cpu2 ---- ---- ---- [ Writable PTE cached in TLB ] userfaultfd_writeprotect() [ write-*unprotect* ] mwriteprotect_range() mmap_read_lock() change_protection() change_protection_range() ... change_pte_range() [ *clear* “write”-bit ] [ defer TLB flushes ] [ page-fault ] ... wp_page_copy() cow_user_page() [ copy page ] [ write to old page ] ... set_pte_at_notify() A similar scenario can happen: cpu0 cpu1 cpu2 cpu3 ---- ---- ---- ---- [ Writable PTE cached in TLB ] userfaultfd_writeprotect() [ write-protect ] [ deferred TLB flush ] userfaultfd_writeprotect() [ write-unprotect ] [ deferred TLB flush] [ page-fault ] wp_page_copy() cow_user_page() [ copy page ] ... [ write to page ] set_pte_at_notify() This race exists since commit 292924b26024 ("userfaultfd: wp: apply _PAGE_UFFD_WP bit"). Yet, as Yu Zhao pointed, these races became apparent since commit 09854ba94c6a ("mm: do_wp_page() simplification") which made wp_page_copy() more likely to take place, specifically if page_count(page) > 1. To resolve the aforementioned races, check whether there are pending flushes on uffd-write-protected VMAs, and if there are, perform a flush before doing the COW. Further optimizations will follow to avoid during uffd-write-unprotect unnecassary PTE write-protection and TLB flushes. Link: https://lkml.kernel.org/r/20210304095423.3825684-1-namit@vmware.com Fixes: 09854ba94c6a ("mm: do_wp_page() simplification") Signed-off-by: Nadav Amit Suggested-by: Yu Zhao Reviewed-by: Peter Xu Tested-by: Peter Xu Cc: Andrea Arcangeli Cc: Andy Lutomirski Cc: Pavel Emelyanov Cc: Mike Kravetz Cc: Mike Rapoport Cc: Minchan Kim Cc: Will Deacon Cc: Peter Zijlstra Cc: [5.9+] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 1ae5cef2362a2018ccf95c3943c05640082674ff Author: OGAWA Hirofumi Date: Fri Mar 12 21:07:37 2021 -0800 mm/highmem.c: fix zero_user_segments() with start > end commit 184cee516f3e24019a08ac8eb5c7cf04c00933cb upstream. zero_user_segments() is used from __block_write_begin_int(), for example like the following zero_user_segments(page, 4096, 1024, 512, 918) But new the zero_user_segments() implementation for for HIGHMEM + TRANSPARENT_HUGEPAGE doesn't handle "start > end" case correctly, and hits BUG_ON(). (we can fix __block_write_begin_int() instead though, it is the old and multiple usage) Also it calls kmap_atomic() unnecessarily while start == end == 0. Link: https://lkml.kernel.org/r/87v9ab60r4.fsf@mail.parknet.co.jp Fixes: 0060ef3b4e6d ("mm: support THPs in zero_user_segments") Signed-off-by: OGAWA Hirofumi Cc: Matthew Wilcox Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 4e11dc5f0e9cd67651911bfad605fadd08030a33 Author: Marc Zyngier Date: Thu Mar 11 10:00:16 2021 +0000 KVM: arm64: Fix exclusive limit for IPA size commit 262b003d059c6671601a19057e9fe1a5e7f23722 upstream. When registering a memslot, we check the size and location of that memslot against the IPA size to ensure that we can provide guest access to the whole of the memory. Unfortunately, this check rejects memslot that end-up at the exact limit of the addressing capability for a given IPA size. For example, it refuses the creation of a 2GB memslot at 0x8000000 with a 32bit IPA space. Fix it by relaxing the check to accept a memslot reaching the limit of the IPA space. Fixes: c3058d5da222 ("arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE") Reviewed-by: Eric Auger Signed-off-by: Marc Zyngier Cc: stable@vger.kernel.org Reviewed-by: Andrew Jones Link: https://lore.kernel.org/r/20210311100016.3830038-3-maz@kernel.org Signed-off-by: Greg Kroah-Hartman commit 9c501b2204800b472bfbc53226e02a2dc7b12c48 Author: Marc Zyngier Date: Thu Mar 11 10:00:15 2021 +0000 KVM: arm64: Reject VM creation when the default IPA size is unsupported commit 7d717558dd5ef10d28866750d5c24ff892ea3778 upstream. KVM/arm64 has forever used a 40bit default IPA space, partially due to its 32bit heritage (where the only choice is 40bit). However, there are implementations in the wild that have a *cough* much smaller *cough* IPA space, which leads to a misprogramming of VTCR_EL2, and a guest that is stuck on its first memory access if userspace dares to ask for the default IPA setting (which most VMMs do). Instead, blundly reject the creation of such VM, as we can't satisfy the requirements from userspace (with a one-off warning). Also clarify the boot warning, and document that the VM creation will fail when an unsupported IPA size is provided. Although this is an ABI change, it doesn't really change much for userspace: - the guest couldn't run before this change, but no error was returned. At least userspace knows what is happening. - a memory slot that was accepted because it did fit the default IPA space now doesn't even get a chance to be registered. The other thing that is left doing is to convince userspace to actually use the IPA space setting instead of relying on the antiquated default. Fixes: 233a7cb23531 ("kvm: arm64: Allow tuning the physical address size for VM") Signed-off-by: Marc Zyngier Cc: stable@vger.kernel.org Reviewed-by: Andrew Jones Reviewed-by: Eric Auger Link: https://lore.kernel.org/r/20210311100016.3830038-2-maz@kernel.org Signed-off-by: Greg Kroah-Hartman commit 6d448bab893bb870d558454428ebf4558b79c424 Author: Suzuki K Poulose Date: Fri Mar 5 18:52:47 2021 +0000 KVM: arm64: nvhe: Save the SPE context early commit b96b0c5de685df82019e16826a282d53d86d112c upstream. The nVHE KVM hyp drains and disables the SPE buffer, before entering the guest, as the EL1&0 translation regime is going to be loaded with that of the guest. But this operation is performed way too late, because : - The owning translation regime of the SPE buffer is transferred to EL2. (MDCR_EL2_E2PB == 0) - The guest Stage1 is loaded. Thus the flush could use the host EL1 virtual address, but use the EL2 translations instead of host EL1, for writing out any cached data. Fix this by moving the SPE buffer handling early enough. The restore path is doing the right thing. Fixes: 014c4c77aad7 ("KVM: arm64: Improve debug register save/restore flow") Cc: stable@vger.kernel.org Cc: Christoffer Dall Cc: Marc Zyngier Cc: Will Deacon Cc: Catalin Marinas Cc: Mark Rutland Cc: Alexandru Elisei Reviewed-by: Alexandru Elisei Signed-off-by: Suzuki K Poulose Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20210302120345.3102874-1-suzuki.poulose@arm.com Message-Id: <20210305185254.3730990-2-maz@kernel.org> Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 308d9960c1461337d1d101fd9712dafa143e07fd Author: Will Deacon Date: Fri Mar 5 18:52:48 2021 +0000 KVM: arm64: Avoid corrupting vCPU context register in guest exit commit 31948332d5fa392ad933f4a6a10026850649ed76 upstream. Commit 7db21530479f ("KVM: arm64: Restore hyp when panicking in guest context") tracks the currently running vCPU, clearing the pointer to NULL on exit from a guest. Unfortunately, the use of 'set_loaded_vcpu' clobbers x1 to point at the kvm_hyp_ctxt instead of the vCPU context, causing the subsequent RAS code to go off into the weeds when it saves the DISR assuming that the CPU context is embedded in a struct vCPU. Leave x1 alone and use x3 as a temporary register instead when clearing the vCPU on the guest exit path. Cc: Marc Zyngier Cc: Andrew Scull Cc: Fixes: 7db21530479f ("KVM: arm64: Restore hyp when panicking in guest context") Suggested-by: Quentin Perret Signed-off-by: Will Deacon Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20210226181211.14542-1-will@kernel.org Message-Id: <20210305185254.3730990-3-maz@kernel.org> Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 92d908dc52912e4f89b69779069e54c65ed2d86b Author: Jia He Date: Fri Mar 5 18:52:54 2021 +0000 KVM: arm64: Fix range alignment when walking page tables commit 357ad203d45c0f9d76a8feadbd5a1c5d460c638b upstream. When walking the page tables at a given level, and if the start address for the range isn't aligned for that level, we propagate the misalignment on each iteration at that level. This results in the walker ignoring a number of entries (depending on the original misalignment) on each subsequent iteration. Properly aligning the address before the next iteration addresses this issue. Cc: stable@vger.kernel.org Reported-by: Howard Zhang Acked-by: Will Deacon Signed-off-by: Jia He Fixes: b1e57de62cfb ("KVM: arm64: Add stand-alone page-table walker infrastructure") [maz: rewrite commit message] Signed-off-by: Marc Zyngier Link: https://lore.kernel.org/r/20210303024225.2591-1-justin.he@arm.com Message-Id: <20210305185254.3730990-9-maz@kernel.org> Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit f2f50032b65d30a2122f693ba4850335bbd4f699 Author: Marc Zyngier Date: Wed Mar 3 16:45:05 2021 +0000 KVM: arm64: Ensure I-cache isolation between vcpus of a same VM commit 01dc9262ff5797b675c32c0c6bc682777d23de05 upstream. It recently became apparent that the ARMv8 architecture has interesting rules regarding attributes being used when fetching instructions if the MMU is off at Stage-1. In this situation, the CPU is allowed to fetch from the PoC and allocate into the I-cache (unless the memory is mapped with the XN attribute at Stage-2). If we transpose this to vcpus sharing a single physical CPU, it is possible for a vcpu running with its MMU off to influence another vcpu running with its MMU on, as the latter is expected to fetch from the PoU (and self-patching code doesn't flush below that level). In order to solve this, reuse the vcpu-private TLB invalidation code to apply the same policy to the I-cache, nuking it every time the vcpu runs on a physical CPU that ran another vcpu of the same VM in the past. This involve renaming __kvm_tlb_flush_local_vmid() to __kvm_flush_cpu_context(), and inserting a local i-cache invalidation there. Cc: stable@vger.kernel.org Signed-off-by: Marc Zyngier Acked-by: Will Deacon Acked-by: Catalin Marinas Link: https://lore.kernel.org/r/20210303164505.68492-1-maz@kernel.org Signed-off-by: Greg Kroah-Hartman commit fd398928b1b06df777c0c5076b97d48f013f9575 Author: Wanpeng Li Date: Wed Feb 24 09:37:29 2021 +0800 KVM: kvmclock: Fix vCPUs > 64 can't be online/hotpluged commit d7eb79c6290c7ae4561418544072e0a3266e7384 upstream. # lscpu Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian CPU(s): 88 On-line CPU(s) list: 0-63 Off-line CPU(s) list: 64-87 # cat /proc/cmdline BOOT_IMAGE=/vmlinuz-5.10.0-rc3-tlinux2-0050+ root=/dev/mapper/cl-root ro rd.lvm.lv=cl/root rhgb quiet console=ttyS0 LANG=en_US .UTF-8 no-kvmclock-vsyscall # echo 1 > /sys/devices/system/cpu/cpu76/online -bash: echo: write error: Cannot allocate memory The per-cpu vsyscall pvclock data pointer assigns either an element of the static array hv_clock_boot (#vCPU <= 64) or dynamically allocated memory hvclock_mem (vCPU > 64), the dynamically memory will not be allocated if kvmclock vsyscall is disabled, this can result in cpu hotpluged fails in kvmclock_setup_percpu() which returns -ENOMEM. It's broken for no-vsyscall and sometimes you end up with vsyscall disabled if the host does something strange. This patch fixes it by allocating this dynamically memory unconditionally even if vsyscall is disabled. Fixes: 6a1cac56f4 ("x86/kvm: Use __bss_decrypted attribute in shared variables") Reported-by: Zelin Deng Cc: Brijesh Singh Cc: stable@vger.kernel.org#v4.19-rc5+ Signed-off-by: Wanpeng Li Message-Id: <1614130683-24137-1-git-send-email-wanpengli@tencent.com> Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 8da4090a0f4688e6cb3d2b55f985f297c882022c Author: Sean Christopherson Date: Thu Mar 4 18:18:08 2021 -0800 KVM: x86: Ensure deadline timer has truly expired before posting its IRQ commit beda430177f56656e7980dcce93456ffaa35676b upstream. When posting a deadline timer interrupt, open code the checks guarding __kvm_wait_lapic_expire() in order to skip the lapic_timer_int_injected() check in kvm_wait_lapic_expire(). The injection check will always fail since the interrupt has not yet be injected. Moving the call after injection would also be wrong as that wouldn't actually delay delivery of the IRQ if it is indeed sent via posted interrupt. Fixes: 010fd37fddf6 ("KVM: LAPIC: Reduce world switch latency caused by timer_advance_ns") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson Message-Id: <20210305021808.3769732-1-seanjc@google.com> Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 9bd012cb9b431424ce411da291c51ae80c830201 Author: Andy Lutomirski Date: Thu Mar 4 11:05:54 2021 -0800 x86/entry: Fix entry/exit mismatch on failed fast 32-bit syscalls commit 5d5675df792ff67e74a500c4c94db0f99e6a10ef upstream. On a 32-bit fast syscall that fails to read its arguments from user memory, the kernel currently does syscall exit work but not syscall entry work. This confuses audit and ptrace. For example: $ ./tools/testing/selftests/x86/syscall_arg_fault_32 ... strace: pid 264258: entering, ptrace_syscall_info.op == 2 ... This is a minimal fix intended for ease of backporting. A more complete cleanup is coming. Fixes: 0b085e68f407 ("x86/entry: Consolidate 32/64 bit syscall entry") Signed-off-by: Andy Lutomirski Signed-off-by: Thomas Gleixner Signed-off-by: Borislav Petkov Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/8c82296ddf803b91f8d1e5eac89e5803ba54ab0e.1614884673.git.luto@kernel.org Signed-off-by: Greg Kroah-Hartman commit eb4d509d3da6dcd89ec0e43e99bdefb5ddda9970 Author: Joerg Roedel Date: Wed Mar 3 15:17:16 2021 +0100 x86/sev-es: Use __copy_from_user_inatomic() commit bffe30dd9f1f3b2608a87ac909a224d6be472485 upstream. The #VC handler must run in atomic context and cannot sleep. This is a problem when it tries to fetch instruction bytes from user-space via copy_from_user(). Introduce a insn_fetch_from_user_inatomic() helper which uses __copy_from_user_inatomic() to safely copy the instruction bytes to kernel memory in the #VC handler. Fixes: 5e3427a7bc432 ("x86/sev-es: Handle instruction fetches from user-space") Signed-off-by: Joerg Roedel Signed-off-by: Borislav Petkov Cc: stable@vger.kernel.org # v5.10+ Link: https://lkml.kernel.org/r/20210303141716.29223-6-joro@8bytes.org Signed-off-by: Greg Kroah-Hartman commit 178615c6bceef1104c865b9d1e4133a859de5095 Author: Joerg Roedel Date: Wed Mar 3 15:17:15 2021 +0100 x86/sev-es: Correctly track IRQ states in runtime #VC handler commit 62441a1fb53263bda349b6e5997c3cc5c120d89e upstream. Call irqentry_nmi_enter()/irqentry_nmi_exit() in the #VC handler to correctly track the IRQ state during its execution. Fixes: 0786138c78e79 ("x86/sev-es: Add a Runtime #VC Exception Handler") Reported-by: Andy Lutomirski Signed-off-by: Joerg Roedel Signed-off-by: Borislav Petkov Cc: stable@vger.kernel.org # v5.10+ Link: https://lkml.kernel.org/r/20210303141716.29223-5-joro@8bytes.org Signed-off-by: Greg Kroah-Hartman commit d023a773a0e81d3843f12547226b99d8486d5d3e Author: Joerg Roedel Date: Wed Mar 3 15:17:13 2021 +0100 x86/sev-es: Check regs->sp is trusted before adjusting #VC IST stack commit 545ac14c16b5dbd909d5a90ddf5b5a629a40fa94 upstream. The code in the NMI handler to adjust the #VC handler IST stack is needed in case an NMI hits when the #VC handler is still using its IST stack. But the check for this condition also needs to look if the regs->sp value is trusted, meaning it was not set by user-space. Extend the check to not use regs->sp when the NMI interrupted user-space code or the SYSCALL gap. Fixes: 315562c9af3d5 ("x86/sev-es: Adjust #VC IST Stack on entering NMI handler") Reported-by: Andy Lutomirski Signed-off-by: Joerg Roedel Signed-off-by: Borislav Petkov Cc: stable@vger.kernel.org # 5.10+ Link: https://lkml.kernel.org/r/20210303141716.29223-3-joro@8bytes.org Signed-off-by: Greg Kroah-Hartman commit 8d5720ce452f008bbc47cb010597da4035af9fdc Author: Joerg Roedel Date: Wed Mar 3 15:17:12 2021 +0100 x86/sev-es: Introduce ip_within_syscall_gap() helper commit 78a81d88f60ba773cbe890205e1ee67f00502948 upstream. Introduce a helper to check whether an exception came from the syscall gap and use it in the SEV-ES code. Extend the check to also cover the compatibility SYSCALL entry path. Fixes: 315562c9af3d5 ("x86/sev-es: Adjust #VC IST Stack on entering NMI handler") Signed-off-by: Joerg Roedel Signed-off-by: Borislav Petkov Cc: stable@vger.kernel.org # 5.10+ Link: https://lkml.kernel.org/r/20210303141716.29223-2-joro@8bytes.org Signed-off-by: Greg Kroah-Hartman commit 1794ef8142ddeda54e74280e1798e08dbcef0882 Author: Josh Poimboeuf Date: Fri Feb 5 08:24:02 2021 -0600 x86/unwind/orc: Disable KASAN checking in the ORC unwinder, part 2 commit e504e74cc3a2c092b05577ce3e8e013fae7d94e6 upstream. KASAN reserves "redzone" areas between stack frames in order to detect stack overruns. A read or write to such an area triggers a KASAN "stack-out-of-bounds" BUG. Normally, the ORC unwinder stays in-bounds and doesn't access the redzone. But sometimes it can't find ORC metadata for a given instruction. This can happen for code which is missing ORC metadata, or for generated code. In such cases, the unwinder attempts to fall back to frame pointers, as a best-effort type thing. This fallback often works, but when it doesn't, the unwinder can get confused and go off into the weeds into the KASAN redzone, triggering the aforementioned KASAN BUG. But in this case, the unwinder's confusion is actually harmless and working as designed. It already has checks in place to prevent off-stack accesses, but those checks get short-circuited by the KASAN BUG. And a BUG is a lot more disruptive than a harmless unwinder warning. Disable the KASAN checks by using READ_ONCE_NOCHECK() for all stack accesses. This finishes the job started by commit 881125bfe65b ("x86/unwind: Disable KASAN checking in the ORC unwinder"), which only partially fixed the issue. Fixes: ee9f8fce9964 ("x86/unwind: Add the ORC unwinder") Reported-by: Ivan Babrou Signed-off-by: Josh Poimboeuf Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Borislav Petkov Reviewed-by: Steven Rostedt (VMware) Tested-by: Ivan Babrou Cc: stable@kernel.org Link: https://lkml.kernel.org/r/9583327904ebbbeda399eca9c56d6c7085ac20fe.1612534649.git.jpoimboe@redhat.com Signed-off-by: Greg Kroah-Hartman commit 8f105cac6c48f62147a7a15db51ebbe7b4c27377 Author: Andrey Konovalov Date: Fri Mar 12 21:08:13 2021 -0800 kasan: fix KASAN_STACK dependency for HW_TAGS commit d9b571c885a8974fbb7d4ee639dbc643fd000f9e upstream. There's a runtime failure when running HW_TAGS-enabled kernel built with GCC on hardware that doesn't support MTE. GCC-built kernels always have CONFIG_KASAN_STACK enabled, even though stack instrumentation isn't supported by HW_TAGS. Having that config enabled causes KASAN to issue MTE-only instructions to unpoison kernel stacks, which causes the failure. Fix the issue by disallowing CONFIG_KASAN_STACK when HW_TAGS is used. (The commit that introduced CONFIG_KASAN_HW_TAGS specified proper dependency for CONFIG_KASAN_STACK_ENABLE but not for CONFIG_KASAN_STACK.) Link: https://lkml.kernel.org/r/59e75426241dbb5611277758c8d4d6f5f9298dac.1615215441.git.andreyknvl@google.com Fixes: 6a63a63ff1ac ("kasan: introduce CONFIG_KASAN_HW_TAGS") Signed-off-by: Andrey Konovalov Reported-by: Catalin Marinas Cc: Cc: Will Deacon Cc: Vincenzo Frascino Cc: Dmitry Vyukov Cc: Andrey Ryabinin Cc: Alexander Potapenko Cc: Marco Elver Cc: Peter Collingbourne Cc: Evgenii Stepanov Cc: Branislav Rankov Cc: Kevin Brodsky Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit f32ff9f5ceb2fe1e10be2264029b0aad198cc2da Author: Andrey Konovalov Date: Fri Mar 12 21:08:10 2021 -0800 kasan, mm: fix crash with HW_TAGS and DEBUG_PAGEALLOC commit f9d79e8dce4077d3c6ab739c808169dfa99af9ef upstream. Currently, kasan_free_nondeferred_pages()->kasan_free_pages() is called after debug_pagealloc_unmap_pages(). This causes a crash when debug_pagealloc is enabled, as HW_TAGS KASAN can't set tags on an unmapped page. This patch puts kasan_free_nondeferred_pages() before debug_pagealloc_unmap_pages() and arch_free_page(), which can also make the page unavailable. Link: https://lkml.kernel.org/r/24cd7db274090f0e5bc3adcdc7399243668e3171.1614987311.git.andreyknvl@google.com Fixes: 94ab5b61ee16 ("kasan, arm64: enable CONFIG_KASAN_HW_TAGS") Signed-off-by: Andrey Konovalov Cc: Catalin Marinas Cc: Will Deacon Cc: Vincenzo Frascino Cc: Dmitry Vyukov Cc: Andrey Ryabinin Cc: Alexander Potapenko Cc: Marco Elver Cc: Peter Collingbourne Cc: Evgenii Stepanov Cc: Branislav Rankov Cc: Kevin Brodsky Cc: Christoph Hellwig Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit d28492be82e19fc69cc69975fc2052b37ef0c821 Author: Lior Ribak Date: Fri Mar 12 21:07:41 2021 -0800 binfmt_misc: fix possible deadlock in bm_register_write commit e7850f4d844e0acfac7e570af611d89deade3146 upstream. There is a deadlock in bm_register_write: First, in the begining of the function, a lock is taken on the binfmt_misc root inode with inode_lock(d_inode(root)). Then, if the user used the MISC_FMT_OPEN_FILE flag, the function will call open_exec on the user-provided interpreter. open_exec will call a path lookup, and if the path lookup process includes the root of binfmt_misc, it will try to take a shared lock on its inode again, but it is already locked, and the code will get stuck in a deadlock To reproduce the bug: $ echo ":iiiii:E::ii::/proc/sys/fs/binfmt_misc/bla:F" > /proc/sys/fs/binfmt_misc/register backtrace of where the lock occurs (#5): 0 schedule () at ./arch/x86/include/asm/current.h:15 1 0xffffffff81b51237 in rwsem_down_read_slowpath (sem=0xffff888003b202e0, count=, state=state@entry=2) at kernel/locking/rwsem.c:992 2 0xffffffff81b5150a in __down_read_common (state=2, sem=) at kernel/locking/rwsem.c:1213 3 __down_read (sem=) at kernel/locking/rwsem.c:1222 4 down_read (sem=) at kernel/locking/rwsem.c:1355 5 0xffffffff811ee22a in inode_lock_shared (inode=) at ./include/linux/fs.h:783 6 open_last_lookups (op=0xffffc9000022fe34, file=0xffff888004098600, nd=0xffffc9000022fd10) at fs/namei.c:3177 7 path_openat (nd=nd@entry=0xffffc9000022fd10, op=op@entry=0xffffc9000022fe34, flags=flags@entry=65) at fs/namei.c:3366 8 0xffffffff811efe1c in do_filp_open (dfd=, pathname=pathname@entry=0xffff8880031b9000, op=op@entry=0xffffc9000022fe34) at fs/namei.c:3396 9 0xffffffff811e493f in do_open_execat (fd=fd@entry=-100, name=name@entry=0xffff8880031b9000, flags=, flags@entry=0) at fs/exec.c:913 10 0xffffffff811e4a92 in open_exec (name=) at fs/exec.c:948 11 0xffffffff8124aa84 in bm_register_write (file=, buffer=, count=19, ppos=) at fs/binfmt_misc.c:682 12 0xffffffff811decd2 in vfs_write (file=file@entry=0xffff888004098500, buf=buf@entry=0xa758d0 ":iiiii:E::ii::i:CF ", count=count@entry=19, pos=pos@entry=0xffffc9000022ff10) at fs/read_write.c:603 13 0xffffffff811defda in ksys_write (fd=, buf=0xa758d0 ":iiiii:E::ii::i:CF ", count=19) at fs/read_write.c:658 14 0xffffffff81b49813 in do_syscall_64 (nr=, regs=0xffffc9000022ff58) at arch/x86/entry/common.c:46 15 0xffffffff81c0007c in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:120 To solve the issue, the open_exec call is moved to before the write lock is taken by bm_register_write Link: https://lkml.kernel.org/r/20210228224414.95962-1-liorribak@gmail.com Fixes: 948b701a607f1 ("binfmt_misc: add persistent opened binary handler for containers") Signed-off-by: Lior Ribak Acked-by: Helge Deller Cc: Al Viro Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit fabe9bbf52c7bda6cefafbc2e75ab1a41951d94d Author: Christophe Leroy Date: Tue Mar 9 08:39:39 2021 +0000 powerpc: Fix missing declaration of [en/dis]able_kernel_vsx() commit bd73758803c2eedc037c2268b65a19542a832594 upstream. Add stub instances of enable_kernel_vsx() and disable_kernel_vsx() when CONFIG_VSX is not set, to avoid following build failure. CC [M] drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.o In file included from ./drivers/gpu/drm/amd/amdgpu/../display/dc/dm_services_types.h:29, from ./drivers/gpu/drm/amd/amdgpu/../display/dc/dm_services.h:37, from drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:27: drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c: In function 'dcn_bw_apply_registry_override': ./drivers/gpu/drm/amd/amdgpu/../display/dc/os_types.h:64:3: error: implicit declaration of function 'enable_kernel_vsx'; did you mean 'enable_kernel_fp'? [-Werror=implicit-function-declaration] 64 | enable_kernel_vsx(); \ | ^~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:640:2: note: in expansion of macro 'DC_FP_START' 640 | DC_FP_START(); | ^~~~~~~~~~~ ./drivers/gpu/drm/amd/amdgpu/../display/dc/os_types.h:75:3: error: implicit declaration of function 'disable_kernel_vsx'; did you mean 'disable_kernel_fp'? [-Werror=implicit-function-declaration] 75 | disable_kernel_vsx(); \ | ^~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:676:2: note: in expansion of macro 'DC_FP_END' 676 | DC_FP_END(); | ^~~~~~~~~ cc1: some warnings being treated as errors make[5]: *** [drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.o] Error 1 This works because the caller is checking if VSX is available using cpu_has_feature(): #define DC_FP_START() { \ if (cpu_has_feature(CPU_FTR_VSX_COMP)) { \ preempt_disable(); \ enable_kernel_vsx(); \ } else if (cpu_has_feature(CPU_FTR_ALTIVEC_COMP)) { \ preempt_disable(); \ enable_kernel_altivec(); \ } else if (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE)) { \ preempt_disable(); \ enable_kernel_fp(); \ } \ When CONFIG_VSX is not selected, cpu_has_feature(CPU_FTR_VSX_COMP) constant folds to 'false' so the call to enable_kernel_vsx() is discarded and the build succeeds. Fixes: 16a9dea110a6 ("amdgpu: Enable initial DCN support on POWER") Cc: stable@vger.kernel.org # v5.6+ Reported-by: Geert Uytterhoeven Reported-by: kernel test robot Signed-off-by: Christophe Leroy [mpe: Incorporate some discussion comments into the change log] Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/8d7d285a027e9d21f5ff7f850fa71a2655b0c4af.1615279170.git.christophe.leroy@csgroup.eu Signed-off-by: Greg Kroah-Hartman commit 2f12b9febb38de5d7c0bbf0dd7f4a2fdb9ab1e0b Author: Nicholas Piggin Date: Mon Mar 8 18:55:30 2021 +1000 powerpc: Fix inverted SET_FULL_REGS bitop commit 73ac79881804eed2e9d76ecdd1018037f8510cb1 upstream. This bit operation was inverted and set the low bit rather than cleared it, breaking the ability to ptrace non-volatile GPRs after exec. Fix. Only affects 64e and 32-bit. Fixes: feb9df3462e6 ("powerpc/64s: Always has full regs, so remove remnant checks") Cc: stable@vger.kernel.org # v5.8+ Signed-off-by: Nicholas Piggin Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20210308085530.3191843-1-npiggin@gmail.com Signed-off-by: Greg Kroah-Hartman commit 9e46a01793b1c1292c83e6028ff4d679b27cbd98 Author: Naveen N. Rao Date: Thu Mar 4 07:34:11 2021 +0530 powerpc/64s: Fix instruction encoding for lis in ppc_function_entry() commit cea15316ceee2d4a51dfdecd79e08a438135416c upstream. 'lis r2,N' is 'addis r2,0,N' and the instruction encoding in the macro LIS_R2 is incorrect (it currently maps to 'addis r0,r2,N'). Fix the same. Fixes: c71b7eff426f ("powerpc: Add ABIv2 support to ppc_function_entry") Cc: stable@vger.kernel.org # v3.16+ Reported-by: Jiri Olsa Signed-off-by: Naveen N. Rao Acked-by: Segher Boessenkool Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20210304020411.16796-1-naveen.n.rao@linux.vnet.ibm.com Signed-off-by: Greg Kroah-Hartman commit 15dcba8a7fdbb096252c7cc3e17dee384c1c52b8 Author: Ard Biesheuvel Date: Fri Mar 5 10:21:05 2021 +0100 efi: stub: omit SetVirtualAddressMap() if marked unsupported in RT_PROP table commit 9e9888a0fe97b9501a40f717225d2bef7100a2c1 upstream. The EFI_RT_PROPERTIES_TABLE contains a mask of runtime services that are available after ExitBootServices(). This mostly does not concern the EFI stub at all, given that it runs before that. However, there is one call that is made at runtime, which is the call to SetVirtualAddressMap() (which is not even callable at boot time to begin with) So add the missing handling of the RT_PROP table to ensure that we only call SetVirtualAddressMap() if it is not being advertised as unsupported by the firmware. Cc: # v5.10+ Tested-by: Shawn Guo Signed-off-by: Ard Biesheuvel Signed-off-by: Greg Kroah-Hartman commit ab4728011789c84a6bca3c3a88ea8ace7acec837 Author: Peter Zijlstra Date: Wed Feb 24 11:42:08 2021 +0100 sched: Simplify set_affinity_pending refcounts commit 50caf9c14b1498c90cf808dbba2ca29bd32ccba4 upstream. Now that we have set_affinity_pending::stop_pending to indicate if a stopper is in progress, and we have the guarantee that if that stopper exists, it will (eventually) complete our @pending we can simplify the refcount scheme by no longer counting the stopper thread. Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()") Cc: stable@kernel.org Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Valentin Schneider Link: https://lkml.kernel.org/r/20210224131355.724130207@infradead.org Signed-off-by: Greg Kroah-Hartman commit 6ebb2dbadb442e8efb4b7a5daa4a50031b2a8491 Author: Peter Zijlstra Date: Wed Feb 24 11:31:09 2021 +0100 sched: Fix affine_move_task() self-concurrency commit 9e81889c7648d48dd5fe13f41cbc99f3c362484a upstream. Consider: sched_setaffinity(p, X); sched_setaffinity(p, Y); Then the first will install p->migration_pending = &my_pending; and issue stop_one_cpu_nowait(pending); and the second one will read p->migration_pending and _also_ issue: stop_one_cpu_nowait(pending), the _SAME_ @pending. This causes stopper list corruption. Add set_affinity_pending::stop_pending, to indicate if a stopper is in progress. Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()") Cc: stable@kernel.org Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Valentin Schneider Link: https://lkml.kernel.org/r/20210224131355.649146419@infradead.org Signed-off-by: Greg Kroah-Hartman commit a8957bc5a1a086fc806a6441c9157f12e6ef7129 Author: Peter Zijlstra Date: Wed Feb 24 11:21:35 2021 +0100 sched: Optimize migration_cpu_stop() commit 3f1bc119cd7fc987c8ed25ffb717f99403bb308c upstream. When the purpose of migration_cpu_stop() is to migrate the task to 'any' valid CPU, don't migrate the task when it's already running on a valid CPU. Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()") Cc: stable@kernel.org Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Valentin Schneider Link: https://lkml.kernel.org/r/20210224131355.569238629@infradead.org Signed-off-by: Greg Kroah-Hartman commit dd8b56e25964452c63eb01ce49b9e4cfd97c4dc0 Author: Peter Zijlstra Date: Wed Feb 24 11:50:39 2021 +0100 sched: Simplify migration_cpu_stop() commit c20cf065d4a619d394d23290093b1002e27dff86 upstream. When affine_move_task() issues a migration_cpu_stop(), the purpose of that function is to complete that @pending, not any random other p->migration_pending that might have gotten installed since. This realization much simplifies migration_cpu_stop() and allows further necessary steps to fix all this as it provides the guarantee that @pending's stopper will complete @pending (and not some random other @pending). Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()") Cc: stable@kernel.org Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Valentin Schneider Link: https://lkml.kernel.org/r/20210224131355.430014682@infradead.org Signed-off-by: Greg Kroah-Hartman commit 3d0fef8ad71809778aeb4e43599ae5ec1d5428c9 Author: Peter Zijlstra Date: Wed Feb 24 11:15:23 2021 +0100 sched: Collate affine_move_task() stoppers commit 58b1a45086b5f80f2b2842aa7ed0da51a64a302b upstream. The SCA_MIGRATE_ENABLE and task_running() cases are almost identical, collapse them to avoid further duplication. Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()") Cc: stable@kernel.org Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Valentin Schneider Link: https://lkml.kernel.org/r/20210224131355.500108964@infradead.org Signed-off-by: Greg Kroah-Hartman commit 1cd4fb3a325474a44b4451c2b19fc9f1f1ed630d Author: Mathieu Desnoyers Date: Wed Feb 17 11:56:51 2021 -0500 sched/membarrier: fix missing local execution of ipi_sync_rq_state() commit ce29ddc47b91f97e7f69a0fb7cbb5845f52a9825 upstream. The function sync_runqueues_membarrier_state() should copy the membarrier state from the @mm received as parameter to each runqueue currently running tasks using that mm. However, the use of smp_call_function_many() skips the current runqueue, which is unintended. Replace by a call to on_each_cpu_mask(). Fixes: 227a4aadc75b ("sched/membarrier: Fix p->mm->membarrier_state racy load") Reported-by: Nadav Amit Signed-off-by: Mathieu Desnoyers Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Cc: stable@vger.kernel.org # 5.4.x+ Link: https://lore.kernel.org/r/74F1E842-4A84-47BF-B6C2-5407DFDD4A4A@gmail.com Signed-off-by: Greg Kroah-Hartman commit 5b4f39edf62bad6e59fd9083e92667edd74fb964 Author: Peter Zijlstra Date: Sat Feb 13 13:10:35 2021 +0100 sched: Fix migration_cpu_stop() requeueing commit 8a6edb5257e2a84720fe78cb179eca58ba76126f upstream. When affine_move_task(p) is called on a running task @p, which is not otherwise already changing affinity, we'll first set p->migration_pending and then do: stop_one_cpu(cpu_of_rq(rq), migration_cpu_stop, &arg); This then gets us to migration_cpu_stop() running on the CPU that was previously running our victim task @p. If we find that our task is no longer on that runqueue (this can happen because of a concurrent migration due to load-balance etc.), then we'll end up at the: } else if (dest_cpu < 1 || pending) { branch. Which we'll take because we set pending earlier. Here we first check if the task @p has already satisfied the affinity constraints, if so we bail early [A]. Otherwise we'll reissue migration_cpu_stop() onto the CPU that is now hosting our task @p: stop_one_cpu_nowait(cpu_of(rq), migration_cpu_stop, &pending->arg, &pending->stop_work); Except, we've never initialized pending->arg, which will be all 0s. This then results in running migration_cpu_stop() on the next CPU with arg->p == NULL, which gives the by now obvious result of fireworks. The cure is to change affine_move_task() to always use pending->arg, furthermore we can use the exact same pattern as the SCA_MIGRATE_ENABLE case, since we'll block on the pending->done completion anyway, no point in adding yet another completion in stop_one_cpu(). This then gives a clear distinction between the two migration_cpu_stop() use cases: - sched_exec() / migrate_task_to() : arg->pending == NULL - affine_move_task() : arg->pending != NULL; And we can have it ignore p->migration_pending when !arg->pending. Any stop work from sched_exec() / migrate_task_to() is in addition to stop works from affine_move_task(), which will be sufficient to issue the completion. Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()") Cc: stable@kernel.org Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Reviewed-by: Valentin Schneider Link: https://lkml.kernel.org/r/20210224131355.357743989@infradead.org Signed-off-by: Greg Kroah-Hartman commit 10466d88b6b889bd868b276dc4af4ea935601781 Author: Arnd Bergmann Date: Fri Mar 12 21:07:47 2021 -0800 linux/compiler-clang.h: define HAVE_BUILTIN_BSWAP* commit 97e4910232fa1f81e806aa60c25a0450276d99a2 upstream. Separating compiler-clang.h from compiler-gcc.h inadventently dropped the definitions of the three HAVE_BUILTIN_BSWAP macros, which requires falling back to the open-coded version and hoping that the compiler detects it. Since all versions of clang support the __builtin_bswap interfaces, add back the flags and have the headers pick these up automatically. This results in a 4% improvement of compilation speed for arm defconfig. Note: it might also be worth revisiting which architectures set CONFIG_ARCH_USE_BUILTIN_BSWAP for one compiler or the other, today this is set on six architectures (arm32, csky, mips, powerpc, s390, x86), while another ten architectures define custom helpers (alpha, arc, ia64, m68k, mips, nios2, parisc, sh, sparc, xtensa), and the rest (arm64, h8300, hexagon, microblaze, nds32, openrisc, riscv) just get the unoptimized version and rely on the compiler to detect it. A long time ago, the compiler builtins were architecture specific, but nowadays, all compilers that are able to build the kernel have correct implementations of them, though some may not be as optimized as the inline asm versions. The patch that dropped the optimization landed in v4.19, so as discussed it would be fairly safe to backport this revert to stable kernels to the 4.19/5.4/5.10 stable kernels, but there is a remaining risk for regressions, and it has no known side-effects besides compile speed. Link: https://lkml.kernel.org/r/20210226161151.2629097-1-arnd@kernel.org Link: https://lore.kernel.org/lkml/20210225164513.3667778-1-arnd@kernel.org/ Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually exclusive") Signed-off-by: Arnd Bergmann Reviewed-by: Nathan Chancellor Reviewed-by: Kees Cook Acked-by: Miguel Ojeda Acked-by: Nick Desaulniers Acked-by: Luc Van Oostenryck Cc: Masahiro Yamada Cc: Nick Hu Cc: Greentime Hu Cc: Vincent Chen Cc: Paul Walmsley Cc: Palmer Dabbelt Cc: Albert Ou Cc: Guo Ren Cc: Randy Dunlap Cc: Sami Tolvanen Cc: Marco Elver Cc: Arvind Sankar Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit db2d807997f3f80c5b4f3f9536e903f005aeaa17 Author: Minchan Kim Date: Fri Mar 12 21:08:41 2021 -0800 zram: fix broken page writeback commit 2766f1821600cc7562bae2128ad0b163f744c5d9 upstream. commit 0d8359620d9b ("zram: support page writeback") introduced two problems. It overwrites writeback_store's return value as kstrtol's return value, which makes return value zero so user could see zero as return value of write syscall even though it wrote data successfully. It also breaks index value in the loop in that it doesn't increase the index any longer. It means it can write only first starting block index so user couldn't write all idle pages in the zram so lose memory saving chance. This patch fixes those issues. Link: https://lkml.kernel.org/r/20210312173949.2197662-2-minchan@kernel.org Fixes: 0d8359620d9b("zram: support page writeback") Signed-off-by: Minchan Kim Reported-by: Amos Bianchi Cc: Sergey Senozhatsky Cc: John Dias Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 2eeb6948049a3164b5434d26d4f1116fdd9eb475 Author: Minchan Kim Date: Fri Mar 12 21:08:38 2021 -0800 zram: fix return value on writeback_store commit 57e0076e6575a7b7cef620a0bd2ee2549ef77818 upstream. writeback_store's return value is overwritten by submit_bio_wait's return value. Thus, writeback_store will return zero since there was no IO error. In the end, write syscall from userspace will see the zero as return value, which could make the process stall to keep trying the write until it will succeed. Link: https://lkml.kernel.org/r/20210312173949.2197662-1-minchan@kernel.org Fixes: 3b82a051c101("drivers/block/zram/zram_drv.c: fix error return codes not being returned in writeback_store") Signed-off-by: Minchan Kim Cc: Sergey Senozhatsky Cc: Colin Ian King Cc: John Dias Cc: Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 86a41a3b2ed7fda7e8641108c03bfc52cd07258b Author: Matthew Wilcox (Oracle) Date: Fri Mar 12 21:08:03 2021 -0800 include/linux/sched/mm.h: use rcu_dereference in in_vfork() [ Upstream commit 149fc787353f65b7e72e05e7b75d34863266c3e2 ] Fix a sparse warning by using rcu_dereference(). Technically this is a bug and a sufficiently aggressive compiler could reload the `real_parent' pointer outside the protection of the rcu lock (and access freed memory), but I think it's pretty unlikely to happen. Link: https://lkml.kernel.org/r/20210221194207.1351703-1-willy@infradead.org Fixes: b18dc5f291c0 ("mm, oom: skip vforked tasks from being selected") Signed-off-by: Matthew Wilcox (Oracle) Reviewed-by: Miaohe Lin Acked-by: Michal Hocko Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 17e14c8a01ca681d5afbd245081ff5b1c5fa26b4 Author: Arnd Bergmann Date: Fri Mar 12 21:07:04 2021 -0800 stop_machine: mark helpers __always_inline [ Upstream commit cbf78d85079cee662c45749ef4f744d41be85d48 ] With clang-13, some functions only get partially inlined, with a specialized version referring to a global variable. This triggers a harmless build-time check for the intel-rng driver: WARNING: modpost: drivers/char/hw_random/intel-rng.o(.text+0xe): Section mismatch in reference from the function stop_machine() to the function .init.text:intel_rng_hw_init() The function stop_machine() references the function __init intel_rng_hw_init(). This is often because stop_machine lacks a __init annotation or the annotation of intel_rng_hw_init is wrong. In this instance, an easy workaround is to force the stop_machine() function to be inline, along with related interfaces that did not show the same behavior at the moment, but theoretically could. The combination of the two patches listed below triggers the behavior in clang-13, but individually these commits are correct. Link: https://lkml.kernel.org/r/20210225130153.1956990-1-arnd@kernel.org Fixes: fe5595c07400 ("stop_machine: Provide stop_machine_cpuslocked()") Fixes: ee527cd3a20c ("Use stop_machine_run in the Intel RNG driver") Signed-off-by: Arnd Bergmann Cc: Nathan Chancellor Cc: Nick Desaulniers Cc: Thomas Gleixner Cc: Sebastian Andrzej Siewior Cc: "Paul E. McKenney" Cc: Ingo Molnar Cc: Prarit Bhargava Cc: Daniel Bristot de Oliveira Cc: Peter Zijlstra Cc: Valentin Schneider Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 2dacea74389c377152292abea55419c055d0c4e8 Author: Arnd Bergmann Date: Fri Mar 12 21:07:01 2021 -0800 memblock: fix section mismatch warning [ Upstream commit 34dc2efb39a231280fd6696a59bbe712bf3c5c4a ] The inlining logic in clang-13 is rewritten to often not inline some functions that were inlined by all earlier compilers. In case of the memblock interfaces, this exposed a harmless bug of a missing __init annotation: WARNING: modpost: vmlinux.o(.text+0x507c0a): Section mismatch in reference from the function memblock_bottom_up() to the variable .meminit.data:memblock The function memblock_bottom_up() references the variable __meminitdata memblock. This is often because memblock_bottom_up lacks a __meminitdata annotation or the annotation of memblock is wrong. Interestingly, these annotations were present originally, but got removed with the explanation that the __init annotation prevents the function from getting inlined. I checked this again and found that while this is the case with clang, gcc (version 7 through 10, did not test others) does inline the functions regardless. As the previous change was apparently intended to help the clang builds, reverting it to help the newer clang versions seems appropriate as well. gcc builds don't seem to care either way. Link: https://lkml.kernel.org/r/20210225133808.2188581-1-arnd@kernel.org Fixes: 5bdba520c1b3 ("mm: memblock: drop __init from memblock functions to make it inline") Reference: 2cfb3665e864 ("include/linux/memblock.h: add __init to memblock_set_bottom_up()") Signed-off-by: Arnd Bergmann Reviewed-by: David Hildenbrand Reviewed-by: Mike Rapoport Cc: Nathan Chancellor Cc: Nick Desaulniers Cc: Faiyaz Mohammed Cc: Baoquan He Cc: Thomas Bogendoerfer Cc: Aslan Bakirov Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit aee07c5595b53e2c196d5f352fb2dff3fba64431 Author: Peter Zijlstra Date: Tue Mar 9 15:21:18 2021 +0100 seqlock,lockdep: Fix seqcount_latch_init() [ Upstream commit 4817a52b306136c8b2b2271d8770401441e4cf79 ] seqcount_init() must be a macro in order to preserve the static variable that is used for the lockdep key. Don't then wrap it in an inline function, which destroys that. Luckily there aren't many users of this function, but fix it before it becomes a problem. Fixes: 80793c3471d9 ("seqlock: Introduce seqcount_latch_t") Reported-by: Eric Dumazet Signed-off-by: Peter Zijlstra (Intel) Link: https://lkml.kernel.org/r/YEeFEbNUVkZaXDp4@hirez.programming.kicks-ass.net Signed-off-by: Sasha Levin commit 87a2ad9d9bf82b2e493c690f6680b961a81c033a Author: Daniel Axtens Date: Thu Feb 25 14:09:59 2021 +1100 powerpc/64s/exception: Clean up a missed SRR specifier [ Upstream commit c080a173301ffc62cb6c76308c803c7fee05517a ] Nick's patch cleaning up the SRR specifiers in exception-64s.S missed a single instance of EXC_HV_OR_STD. Clean that up. Caught by clang's integrated assembler. Fixes: 3f7fbd97d07d ("powerpc/64s/exception: Clean up SRR specifiers") Signed-off-by: Daniel Axtens Acked-by: Nicholas Piggin Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20210225031006.1204774-2-dja@axtens.net Signed-off-by: Sasha Levin commit 5e38bd7c7a1f158cb753dd5eee20520ed2069a4f Author: Anna-Maria Behnsen Date: Tue Feb 23 17:02:40 2021 +0100 hrtimer: Update softirq_expires_next correctly after __hrtimer_get_next_event() [ Upstream commit 46eb1701c046cc18c032fa68f3c8ccbf24483ee4 ] hrtimer_force_reprogram() and hrtimer_interrupt() invokes __hrtimer_get_next_event() to find the earliest expiry time of hrtimer bases. __hrtimer_get_next_event() does not update cpu_base::[softirq_]_expires_next to preserve reprogramming logic. That needs to be done at the callsites. hrtimer_force_reprogram() updates cpu_base::softirq_expires_next only when the first expiring timer is a softirq timer and the soft interrupt is not activated. That's wrong because cpu_base::softirq_expires_next is left stale when the first expiring timer of all bases is a timer which expires in hard interrupt context. hrtimer_interrupt() does never update cpu_base::softirq_expires_next which is wrong too. That becomes a problem when clock_settime() sets CLOCK_REALTIME forward and the first soft expiring timer is in the CLOCK_REALTIME_SOFT base. Setting CLOCK_REALTIME forward moves the clock MONOTONIC based expiry time of that timer before the stale cpu_base::softirq_expires_next. cpu_base::softirq_expires_next is cached to make the check for raising the soft interrupt fast. In the above case the soft interrupt won't be raised until clock monotonic reaches the stale cpu_base::softirq_expires_next value. That's incorrect, but what's worse it that if the softirq timer becomes the first expiring timer of all clock bases after the hard expiry timer has been handled the reprogramming of the clockevent from hrtimer_interrupt() will result in an interrupt storm. That happens because the reprogramming does not use cpu_base::softirq_expires_next, it uses __hrtimer_get_next_event() which returns the actual expiry time. Once clock MONOTONIC reaches cpu_base::softirq_expires_next the soft interrupt is raised and the storm subsides. Change the logic in hrtimer_force_reprogram() to evaluate the soft and hard bases seperately, update softirq_expires_next and handle the case when a soft expiring timer is the first of all bases by comparing the expiry times and updating the required cpu base fields. Split this functionality into a separate function to be able to use it in hrtimer_interrupt() as well without copy paste. Fixes: 5da70160462e ("hrtimer: Implement support for softirq based hrtimers") Reported-by: Mikael Beckius Suggested-by: Thomas Gleixner Tested-by: Mikael Beckius Signed-off-by: Anna-Maria Behnsen Signed-off-by: Thomas Gleixner Signed-off-by: Ingo Molnar Link: https://lore.kernel.org/r/20210223160240.27518-1-anna-maria@linutronix.de Signed-off-by: Sasha Levin commit 186d77e23a558070c144208b19974b0b4bad254b Author: Kan Liang Date: Mon Nov 30 11:38:41 2020 -0800 perf/x86/intel: Set PERF_ATTACH_SCHED_CB for large PEBS and LBR [ Upstream commit afbef30149587ad46f4780b1e0cc5e219745ce90 ] To supply a PID/TID for large PEBS, it requires flushing the PEBS buffer in a context switch. For normal LBRs, a context switch can flip the address space and LBR entries are not tagged with an identifier, we need to wipe the LBR, even for per-cpu events. For LBR callstack, save/restore the stack is required during a context switch. Set PERF_ATTACH_SCHED_CB for the event with large PEBS & LBR. Fixes: 9c964efa4330 ("perf/x86/intel: Drain the PEBS buffer during context switches") Signed-off-by: Kan Liang Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Link: https://lkml.kernel.org/r/20201130193842.10569-2-kan.liang@linux.intel.com Signed-off-by: Sasha Levin commit 1f89cb3cba5b423358f83cf3d2ef002be4193f7b Author: Kan Liang Date: Mon Nov 30 11:38:40 2020 -0800 perf/core: Flush PMU internal buffers for per-CPU events [ Upstream commit a5398bffc01fe044848c5024e5e867e407f239b8 ] Sometimes the PMU internal buffers have to be flushed for per-CPU events during a context switch, e.g., large PEBS. Otherwise, the perf tool may report samples in locations that do not belong to the process where the samples are processed in, because PEBS does not tag samples with PID/TID. The current code only flush the buffers for a per-task event. It doesn't check a per-CPU event. Add a new event state flag, PERF_ATTACH_SCHED_CB, to indicate that the PMU internal buffers have to be flushed for this event during a context switch. Add sched_cb_entry and perf_sched_cb_usages back to track the PMU/cpuctx which is required to be flushed. Only need to invoke the sched_task() for per-CPU events in this patch. The per-task events have been handled in perf_event_context_sched_in/out already. Fixes: 9c964efa4330 ("perf/x86/intel: Drain the PEBS buffer during context switches") Reported-by: Gabriel Marin Originally-by: Namhyung Kim Signed-off-by: Kan Liang Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Link: https://lkml.kernel.org/r/20201130193842.10569-1-kan.liang@linux.intel.com Signed-off-by: Sasha Levin commit a6757e81ba1b3702b89e7edc14059ea86a57bfc9 Author: Paolo Abeni Date: Thu Mar 4 13:32:10 2021 -0800 mptcp: fix memory accounting on allocation error [ Upstream commit eaeef1ce55ec9161e0c44ff27017777b1644b421 ] In case of memory pressure the MPTCP xmit path keeps at most a single skb in the tx cache, eventually freeing additional ones. The associated counter for forward memory is not update accordingly, and that causes the following splat: WARNING: CPU: 0 PID: 12 at net/core/stream.c:208 sk_stream_kill_queues+0x3ca/0x530 net/core/stream.c:208 Modules linked in: CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.11.0-rc2 #59 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Workqueue: events mptcp_worker RIP: 0010:sk_stream_kill_queues+0x3ca/0x530 net/core/stream.c:208 Code: 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e 63 01 00 00 8b ab 00 01 00 00 e9 60 ff ff ff e8 2f 24 d3 fe 0f 0b eb 97 e8 26 24 d3 fe <0f> 0b eb a0 e8 1d 24 d3 fe 0f 0b e9 a5 fe ff ff 4c 89 e7 e8 0e d0 RSP: 0018:ffffc900000c7bc8 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff88810030ac40 RSI: ffffffff8262ca4a RDI: 0000000000000003 RBP: 0000000000000d00 R08: 0000000000000000 R09: ffffffff85095aa7 R10: ffffffff8262c9ea R11: 0000000000000001 R12: ffff888108908100 R13: ffffffff85095aa0 R14: ffffc900000c7c48 R15: 1ffff92000018f85 FS: 0000000000000000(0000) GS:ffff88811b200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fa7444baef8 CR3: 0000000035ee9005 CR4: 0000000000170ef0 Call Trace: __mptcp_destroy_sock+0x4a7/0x6c0 net/mptcp/protocol.c:2547 mptcp_worker+0x7dd/0x1610 net/mptcp/protocol.c:2272 process_one_work+0x896/0x1170 kernel/workqueue.c:2275 worker_thread+0x605/0x1350 kernel/workqueue.c:2421 kthread+0x344/0x410 kernel/kthread.c:292 ret_from_fork+0x22/0x30 arch/x86/entry/entry_64.S:296 At close time, as reported by syzkaller/Christoph. This change address the issue properly updating the fwd allocated memory counter in the error path. Reported-by: Christoph Paasch Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/136 Fixes: 724cfd2ee8aa ("mptcp: allocate TX skbs in msk context") Signed-off-by: Paolo Abeni Signed-off-by: Mat Martineau Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit cba9e678aafcdc879c5fc6222957f364acfe0010 Author: Florian Westphal Date: Thu Mar 4 13:32:09 2021 -0800 mptcp: put subflow sock on connect error [ Upstream commit f07157792c633b528de5fc1dbe2e4ea54f8e09d4 ] mptcp_add_pending_subflow() performs a sock_hold() on the subflow, then adds the subflow to the join list. Without a sock_put the subflow sk won't be freed in case connect() fails. unreferenced object 0xffff88810c03b100 (size 3000): [..] sk_prot_alloc.isra.0+0x2f/0x110 sk_alloc+0x5d/0xc20 inet6_create+0x2b7/0xd30 __sock_create+0x17f/0x410 mptcp_subflow_create_socket+0xff/0x9c0 __mptcp_subflow_connect+0x1da/0xaf0 mptcp_pm_nl_work+0x6e0/0x1120 mptcp_worker+0x508/0x9a0 Fixes: 5b950ff4331ddda ("mptcp: link MPC subflow into msk only after accept") Signed-off-by: Florian Westphal Signed-off-by: Mat Martineau Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit b9fc94c4f38405e5eefdad19da699af04bff63e9 Author: Willem de Bruijn Date: Mon Mar 1 15:09:44 2021 +0000 net: expand textsearch ts_state to fit skb_seq_state [ Upstream commit b228c9b058760500fda5edb3134527f629fc2dc3 ] The referenced commit expands the skb_seq_state used by skb_find_text with a 4B frag_off field, growing it to 48B. This exceeds container ts_state->cb, causing a stack corruption: [ 73.238353] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: skb_find_text+0xc5/0xd0 [ 73.247384] CPU: 1 PID: 376 Comm: nping Not tainted 5.11.0+ #4 [ 73.252613] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 [ 73.260078] Call Trace: [ 73.264677] dump_stack+0x57/0x6a [ 73.267866] panic+0xf6/0x2b7 [ 73.270578] ? skb_find_text+0xc5/0xd0 [ 73.273964] __stack_chk_fail+0x10/0x10 [ 73.277491] skb_find_text+0xc5/0xd0 [ 73.280727] string_mt+0x1f/0x30 [ 73.283639] ipt_do_table+0x214/0x410 The struct is passed between skb_find_text and its callbacks skb_prepare_seq_read, skb_seq_read and skb_abort_seq read through the textsearch interface using TS_SKB_CB. I assumed that this mapped to skb->cb like other .._SKB_CB wrappers. skb->cb is 48B. But it maps to ts_state->cb, which is only 40B. skb->cb was increased from 40B to 48B after ts_state was introduced, in commit 3e3850e989c5 ("[NETFILTER]: Fix xfrm lookup in ip_route_me_harder/ip6_route_me_harder"). Increase ts_state.cb[] to 48 to fit the struct. Also add a BUILD_BUG_ON to avoid a repeat. The alternative is to directly add a dependency from textsearch onto linux/skbuff.h, but I think the intent is textsearch to have no such dependencies on its callers. Link: https://bugzilla.kernel.org/show_bug.cgi?id=211911 Fixes: 97550f6fa592 ("net: compound page support in skb_seq_read") Reported-by: Kris Karas Signed-off-by: Willem de Bruijn Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 828a2cf4bdfb01d4f37623561359cb697dfa0646 Author: Wei Yongjun Date: Fri Mar 12 08:04:21 2021 +0000 perf/arm_dmc620_pmu: Fix error return code in dmc620_pmu_device_probe() [ Upstream commit c8e3866836528a4ba3b0535834f03768d74f7d8e ] Fix to return negative error code -ENOMEM from the error handling case instead of 0, as done elsewhere in this function. Fixes: 53c218da220c ("driver/perf: Add PMU driver for the ARM DMC-620 memory controller") Reported-by: Hulk Robot Signed-off-by: Wei Yongjun Link: https://lore.kernel.org/r/20210312080421.277562-1-weiyongjun1@huawei.com Signed-off-by: Will Deacon Signed-off-by: Sasha Levin commit aba4bdddaaca8df12876b78639c67f65f7cab870 Author: Dave Airlie Date: Thu Mar 11 14:35:27 2021 +1000 drm/nouveau: fix dma syncing for loops (v2) [ Upstream commit 4042160c2e5433e0759782c402292a90b5bf458d ] The index variable should only be increased in one place. Noticed this while trying to track down another oops. v2: use while loop. Fixes: f295c8cfec83 ("drm/nouveau: fix dma syncing warning with debugging on.") Signed-off-by: Dave Airlie Reviewed-by: Michael J. Ruhl Signed-off-by: Dave Airlie Link: https://patchwork.freedesktop.org/patch/msgid/20210311043527.5376-1-airlied@gmail.com Signed-off-by: Sasha Levin commit 33448e4e24d3ffe0d377ee24e78771e57047f904 Author: Jens Axboe Date: Thu Mar 11 10:49:20 2021 -0700 io_uring: perform IOPOLL reaping if canceler is thread itself [ Upstream commit d052d1d685f5125249ab4ff887562c88ba959638 ] We bypass IOPOLL completion polling (and reaping) for the SQPOLL thread, but if it's the thread itself invoking cancelations, then we still need to perform it or no one will. Fixes: 9936c7c2bc76 ("io_uring: deduplicate core cancellations sequence") Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit add3e42e5365ef6a177a7f091ad81657f807221e Author: Ard Biesheuvel Date: Wed Mar 10 18:15:11 2021 +0100 arm64: mm: use a 48-bit ID map when possible on 52-bit VA builds [ Upstream commit 7ba8f2b2d652cd8d8a2ab61f4be66973e70f9f88 ] 52-bit VA kernels can run on hardware that is only 48-bit capable, but configure the ID map as 52-bit by default. This was not a problem until recently, because the special T0SZ value for a 52-bit VA space was never programmed into the TCR register anwyay, and because a 52-bit ID map happens to use the same number of translation levels as a 48-bit one. This behavior was changed by commit 1401bef703a4 ("arm64: mm: Always update TCR_EL1 from __cpu_set_tcr_t0sz()"), which causes the unsupported T0SZ value for a 52-bit VA to be programmed into TCR_EL1. While some hardware simply ignores this, Mark reports that Amberwing systems choke on this, resulting in a broken boot. But even before that commit, the unsupported idmap_t0sz value was exposed to KVM and used to program TCR_EL2 incorrectly as well. Given that we already have to deal with address spaces being either 48-bit or 52-bit in size, the cleanest approach seems to be to simply default to a 48-bit VA ID map, and only switch to a 52-bit one if the placement of the kernel in DRAM requires it. This is guaranteed not to happen unless the system is actually 52-bit VA capable. Fixes: 90ec95cda91a ("arm64: mm: Introduce VA_BITS_MIN") Reported-by: Mark Salter Link: http://lore.kernel.org/r/20210310003216.410037-1-msalter@redhat.com Signed-off-by: Ard Biesheuvel Link: https://lore.kernel.org/r/20210310171515.416643-2-ardb@kernel.org Signed-off-by: Will Deacon Signed-off-by: Sasha Levin commit 92204ad2df5460d8b43fb15b0c3111079e938455 Author: Daiyue Zhang Date: Mon Mar 1 14:10:53 2021 +0800 configfs: fix a use-after-free in __configfs_open_file [ Upstream commit 14fbbc8297728e880070f7b077b3301a8c698ef9 ] Commit b0841eefd969 ("configfs: provide exclusion between IO and removals") uses ->frag_dead to mark the fragment state, thus no bothering with extra refcount on config_item when opening a file. The configfs_get_config_item was removed in __configfs_open_file, but not with config_item_put. So the refcount on config_item will lost its balance, causing use-after-free issues in some occasions like this: Test: 1. Mount configfs on /config with read-only items: drwxrwx--- 289 root root 0 2021-04-01 11:55 /config drwxr-xr-x 2 root root 0 2021-04-01 11:54 /config/a --w--w--w- 1 root root 4096 2021-04-01 11:53 /config/a/1.txt ...... 2. Then run: for file in /config do echo $file grep -R 'key' $file done 3. __configfs_open_file will be called in parallel, the first one got called will do: if (file->f_mode & FMODE_READ) { if (!(inode->i_mode & S_IRUGO)) goto out_put_module; config_item_put(buffer->item); kref_put() package_details_release() kfree() the other one will run into use-after-free issues like this: BUG: KASAN: use-after-free in __configfs_open_file+0x1bc/0x3b0 Read of size 8 at addr fffffff155f02480 by task grep/13096 CPU: 0 PID: 13096 Comm: grep VIP: 00 Tainted: G W 4.14.116-kasan #1 TGID: 13096 Comm: grep Call trace: dump_stack+0x118/0x160 kasan_report+0x22c/0x294 __asan_load8+0x80/0x88 __configfs_open_file+0x1bc/0x3b0 configfs_open_file+0x28/0x34 do_dentry_open+0x2cc/0x5c0 vfs_open+0x80/0xe0 path_openat+0xd8c/0x2988 do_filp_open+0x1c4/0x2fc do_sys_open+0x23c/0x404 SyS_openat+0x38/0x48 Allocated by task 2138: kasan_kmalloc+0xe0/0x1ac kmem_cache_alloc_trace+0x334/0x394 packages_make_item+0x4c/0x180 configfs_mkdir+0x358/0x740 vfs_mkdir2+0x1bc/0x2e8 SyS_mkdirat+0x154/0x23c el0_svc_naked+0x34/0x38 Freed by task 13096: kasan_slab_free+0xb8/0x194 kfree+0x13c/0x910 package_details_release+0x524/0x56c kref_put+0xc4/0x104 config_item_put+0x24/0x34 __configfs_open_file+0x35c/0x3b0 configfs_open_file+0x28/0x34 do_dentry_open+0x2cc/0x5c0 vfs_open+0x80/0xe0 path_openat+0xd8c/0x2988 do_filp_open+0x1c4/0x2fc do_sys_open+0x23c/0x404 SyS_openat+0x38/0x48 el0_svc_naked+0x34/0x38 To fix this issue, remove the config_item_put in __configfs_open_file to balance the refcount of config_item. Fixes: b0841eefd969 ("configfs: provide exclusion between IO and removals") Signed-off-by: Daiyue Zhang Signed-off-by: Yi Chen Signed-off-by: Ge Qiu Reviewed-by: Chao Yu Acked-by: Al Viro Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin commit 9460288cfc6912bde5bfa7138ebd68b84108881e Author: James Smart Date: Mon Mar 8 16:51:26 2021 -0800 nvme-fc: fix racing controller reset and create association [ Upstream commit f20ef34d71abc1fc56b322aaa251f90f94320140 ] Recent patch to prevent calling __nvme_fc_abort_outstanding_ios in interrupt context results in a possible race condition. A controller reset results in errored io completions, which schedules error work. The change of error work to a work element allows it to fire after the ctrl state transition to NVME_CTRL_CONNECTING, causing any outstanding io (used to initialize the controller) to fail and cause problems for connect_work. Add a state check to only schedule error work if not in the RESETTING state. Fixes: 19fce0470f05 ("nvme-fc: avoid calling _nvme_fc_abort_outstanding_ios from interrupt context") Signed-off-by: Nigel Kirkland Signed-off-by: James Smart Signed-off-by: Christoph Hellwig Signed-off-by: Sasha Levin commit 2652a763abf21c7932962de5f4545c0f33021dbc Author: Anthony DeRossi Date: Tue Mar 2 17:17:25 2021 -0800 drm/ttm: Fix TTM page pool accounting [ Upstream commit ca63d76fd2319db984f2875992643f900caf2c72 ] Freed pages are not subtracted from the allocated_pages counter in ttm_pool_type_fini(), causing a leak in the count on device removal. The next shrinker invocation loops forever trying to free pages that are no longer in the pool: rcu: INFO: rcu_sched self-detected stall on CPU rcu: 3-....: (9998 ticks this GP) idle=54e/1/0x4000000000000000 softirq=434857/434857 fqs=2237 (t=10001 jiffies g=2194533 q=49211) NMI backtrace for cpu 3 CPU: 3 PID: 1034 Comm: kswapd0 Tainted: P O 5.11.0-com #1 Hardware name: System manufacturer System Product Name/PRIME X570-PRO, BIOS 1405 11/19/2019 Call Trace: ... sysvec_apic_timer_interrupt+0x77/0x80 asm_sysvec_apic_timer_interrupt+0x12/0x20 RIP: 0010:mutex_unlock+0x16/0x20 Code: e7 48 8b 70 10 e8 7a 53 77 ff eb aa e8 43 6c ff ff 0f 1f 00 65 48 8b 14 25 00 6d 01 00 31 c9 48 89 d0 f0 48 0f b1 0f 48 39 c2 <74> 05 e9 e3 fe ff ff c3 66 90 48 8b 47 20 48 85 c0 74 0f 8b 50 10 RSP: 0018:ffffbdb840797be8 EFLAGS: 00000246 RAX: ffff9ff445a41c00 RBX: ffffffffc02a9ef8 RCX: 0000000000000000 RDX: ffff9ff445a41c00 RSI: ffffbdb840797c78 RDI: ffffffffc02a9ac0 RBP: 0000000000000080 R08: 0000000000000000 R09: ffffbdb840797c80 R10: 0000000000000000 R11: fffffffffffffff5 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000084 R15: ffffffffc02a9a60 ttm_pool_shrink+0x7d/0x90 [ttm] ttm_pool_shrinker_scan+0x5/0x20 [ttm] do_shrink_slab+0x13a/0x1a0 ... debugfs shows the incorrect total: $ cat /sys/kernel/debug/dri/0/ttm_page_pool --- 0--- --- 1--- --- 2--- --- 3--- --- 4--- --- 5--- --- 6--- --- 7--- --- 8--- --- 9--- ---10--- wc : 0 0 0 0 0 0 0 0 0 0 0 uc : 0 0 0 0 0 0 0 0 0 0 0 wc 32 : 0 0 0 0 0 0 0 0 0 0 0 uc 32 : 0 0 0 0 0 0 0 0 0 0 0 DMA uc : 0 0 0 0 0 0 0 0 0 0 0 DMA wc : 0 0 0 0 0 0 0 0 0 0 0 DMA : 0 0 0 0 0 0 0 0 0 0 0 total : 3029 of 8244261 Using ttm_pool_type_take() to remove pages from the pool before freeing them correctly accounts for the freed pages. Fixes: d099fc8f540a ("drm/ttm: new TT backend allocation pool v3") Signed-off-by: Anthony DeRossi Link: https://patchwork.freedesktop.org/patch/msgid/20210303011723.22512-1-ajderossi@gmail.com Reviewed-by: Christian König Signed-off-by: Christian König Signed-off-by: Maarten Lankhorst Signed-off-by: Sasha Levin commit bfd2566889f3ea37584d39cb5903bba3475ba40e Author: Jia-Ju Bai Date: Tue Mar 9 19:30:17 2021 -0800 block: rsxx: fix error return code of rsxx_pci_probe() [ Upstream commit df66617bfe87487190a60783d26175b65d2502ce ] When create_singlethread_workqueue returns NULL to card->event_wq, no error return code of rsxx_pci_probe() is assigned. To fix this bug, st is assigned with -ENOMEM in this case. Fixes: 8722ff8cdbfa ("block: IBM RamSan 70/80 device driver") Reported-by: TOTE Robot Signed-off-by: Jia-Ju Bai Link: https://lore.kernel.org/r/20210310033017.4023-1-baijiaju1990@gmail.com Signed-off-by: Jens Axboe Signed-off-by: Sasha Levin commit 07744d5bf4a6ef2287232c85755dd2f775cea042 Author: Ondrej Mosnacek Date: Fri Jan 15 18:43:56 2021 +0100 NFSv4.2: fix return value of _nfs4_get_security_label() [ Upstream commit 53cb245454df5b13d7063162afd7a785aed6ebf2 ] An xattr 'get' handler is expected to return the length of the value on success, yet _nfs4_get_security_label() (and consequently also nfs4_xattr_get_nfs4_label(), which is used as an xattr handler) returns just 0 on success. Fix this by returning label.len instead, which contains the length of the result. Fixes: aa9c2669626c ("NFS: Client implementation of Labeled-NFS") Signed-off-by: Ondrej Mosnacek Reviewed-by: James Morris Reviewed-by: Paul Moore Signed-off-by: Anna Schumaker Signed-off-by: Sasha Levin commit 55b0c3d184ed597cafe977cac6548acef9ecfe73 Author: Trond Myklebust Date: Mon Mar 8 14:42:52 2021 -0500 NFS: Don't gratuitously clear the inode cache when lookup failed [ Upstream commit 47397915ede0192235474b145ebcd81b37b03624 ] The fact that the lookup revalidation failed, does not mean that the inode contents have changed. Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()") Signed-off-by: Trond Myklebust Signed-off-by: Anna Schumaker Signed-off-by: Sasha Levin commit 594c1e4f69e97f97d6c5415f5adc5fd3667f95ad Author: Trond Myklebust Date: Mon Mar 8 14:42:51 2021 -0500 NFS: Don't revalidate the directory permissions on a lookup failure [ Upstream commit 82e7ca1334ab16e2e04fafded1cab9dfcdc11b40 ] There should be no reason to expect the directory permissions to change just because the directory contents changed or a negative lookup timed out. So let's avoid doing a full call to nfs_mark_for_revalidate() in that case. Furthermore, if this is a negative dentry, and we haven't actually done a new lookup, then we have no reason yet to believe the directory has changed at all. So let's remove the gratuitous directory inode invalidation altogether when called from nfs_lookup_revalidate_negative(). Reported-by: Geert Jansen Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()") Signed-off-by: Trond Myklebust Signed-off-by: Anna Schumaker Signed-off-by: Sasha Levin commit e1e4c4bb81d2b5205aff9ca3c308e32d545ea693 Author: Benjamin Coddington Date: Wed Mar 3 08:47:16 2021 -0500 SUNRPC: Set memalloc_nofs_save() for sync tasks [ Upstream commit f0940f4b3284a00f38a5d42e6067c2aaa20e1f2e ] We could recurse into NFS doing memory reclaim while sending a sync task, which might result in a deadlock. Set memalloc_nofs_save for sync task execution. Fixes: a1231fda7e94 ("SUNRPC: Set memalloc_nofs_save() on all rpciod/xprtiod jobs") Signed-off-by: Benjamin Coddington Signed-off-by: Anna Schumaker Signed-off-by: Sasha Levin commit 5f13849cf5eda8d84469580661be993ac4209aac Author: Anshuman Khandual Date: Fri Mar 5 10:54:57 2021 +0530 arm64/mm: Fix pfn_valid() for ZONE_DEVICE based memory [ Upstream commit eeb0753ba27b26f609e61f9950b14f1b934fe429 ] pfn_valid() validates a pfn but basically it checks for a valid struct page backing for that pfn. It should always return positive for memory ranges backed with struct page mapping. But currently pfn_valid() fails for all ZONE_DEVICE based memory types even though they have struct page mapping. pfn_valid() asserts that there is a memblock entry for a given pfn without MEMBLOCK_NOMAP flag being set. The problem with ZONE_DEVICE based memory is that they do not have memblock entries. Hence memblock_is_map_memory() will invariably fail via memblock_search() for a ZONE_DEVICE based address. This eventually fails pfn_valid() which is wrong. memblock_is_map_memory() needs to be skipped for such memory ranges. As ZONE_DEVICE memory gets hotplugged into the system via memremap_pages() called from a driver, their respective memory sections will not have SECTION_IS_EARLY set. Normal hotplug memory will never have MEMBLOCK_NOMAP set in their memblock regions. Because the flag MEMBLOCK_NOMAP was specifically designed and set for firmware reserved memory regions. memblock_is_map_memory() can just be skipped as its always going to be positive and that will be an optimization for the normal hotplug memory. Like ZONE_DEVICE based memory, all normal hotplugged memory too will not have SECTION_IS_EARLY set for their sections Skipping memblock_is_map_memory() for all non early memory sections would fix pfn_valid() problem for ZONE_DEVICE based memory and also improve its performance for normal hotplug memory as well. Cc: Catalin Marinas Cc: Will Deacon Cc: Ard Biesheuvel Cc: Robin Murphy Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Acked-by: David Hildenbrand Fixes: 73b20c84d42d ("arm64: mm: implement pte_devmap support") Signed-off-by: Anshuman Khandual Acked-by: Catalin Marinas Link: https://lore.kernel.org/r/1614921898-4099-2-git-send-email-anshuman.khandual@arm.com Signed-off-by: Will Deacon Signed-off-by: Sasha Levin commit 49e95245f113371a12a1a5fe81ada9a36426d91c Author: Wei Yongjun Date: Thu Mar 4 10:04:23 2021 +0000 cpufreq: qcom-hw: Fix return value check in qcom_cpufreq_hw_cpu_init() [ Upstream commit 536eb97abeba857126ad055de5923fa592acef25 ] In case of error, the function ioremap() returns NULL pointer not ERR_PTR(). The IS_ERR() test in the return value check should be replaced with NULL test. Fixes: 67fc209b527d ("cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooks") Reported-by: Hulk Robot Signed-off-by: Wei Yongjun Acked-by: Shawn Guo Signed-off-by: Viresh Kumar Signed-off-by: Sasha Levin commit e00dc812e45c3d9ea61ccbb83e739cd56d57bc7e Author: Shawn Guo Date: Sun Feb 28 09:33:19 2021 +0800 cpufreq: qcom-hw: fix dereferencing freed memory 'data' [ Upstream commit 02fc409540303801994d076fcdb7064bd634dbf3 ] Commit 67fc209b527d ("cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooks") introduces an issue of dereferencing freed memory 'data'. Fix it. Fixes: 67fc209b527d ("cpufreq: qcom-hw: drop devm_xxx() calls from init/exit hooks") Reported-by: kernel test robot Reported-by: Dan Carpenter Signed-off-by: Shawn Guo Signed-off-by: Viresh Kumar Signed-off-by: Sasha Levin commit a902066ac48ef381f14e49875c7797e4f43ac1eb Author: Atish Patra Date: Wed Mar 3 11:55:49 2021 -0800 net: macb: Add default usrio config to default gem config [ Upstream commit b12422362ce947098ac420ac3c975fc006af4c02 ] There is no usrio config defined for default gem config leading to a kernel panic devices that don't define a data. This issue can be reprdouced with microchip polar fire soc where compatible string is defined as "cdns,macb". Fixes: edac63861db7 ("add userio bits as platform configuration") Signed-off-by: Atish Patra Acked-by: Nicolas Ferre Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 98bac6bd42a7ab90576ad87dd80dfd37bf5945a5 Author: Jordan Niethe Date: Thu Feb 25 14:19:46 2021 +1100 powerpc/sstep: Fix VSX instruction emulation [ Upstream commit 5c88a17e15795226b56d83f579cbb9b7a4864f79 ] Commit af99da74333b ("powerpc/sstep: Support VSX vector paired storage access instructions") added loading and storing 32 word long data into adjacent VSRs. However the calculation used to determine if two VSRs needed to be loaded/stored inadvertently prevented the load/storing taking place for instructions with a data length less than 16 words. This causes the emulation to not function correctly, which can be seen by the alignment_handler selftest: $ ./alignment_handler [snip] test: test_alignment_handler_vsx_207 tags: git_version:powerpc-5.12-1-0-g82d2c16b350f VSX: 2.07B Doing lxsspx: PASSED Doing lxsiwax: FAILED: Wrong Data Doing lxsiwzx: PASSED Doing stxsspx: PASSED Doing stxsiwx: PASSED failure: test_alignment_handler_vsx_207 test: test_alignment_handler_vsx_300 tags: git_version:powerpc-5.12-1-0-g82d2c16b350f VSX: 3.00B Doing lxsd: PASSED Doing lxsibzx: PASSED Doing lxsihzx: PASSED Doing lxssp: FAILED: Wrong Data Doing lxv: PASSED Doing lxvb16x: PASSED Doing lxvh8x: PASSED Doing lxvx: PASSED Doing lxvwsx: FAILED: Wrong Data Doing lxvl: PASSED Doing lxvll: PASSED Doing stxsd: PASSED Doing stxsibx: PASSED Doing stxsihx: PASSED Doing stxssp: PASSED Doing stxv: PASSED Doing stxvb16x: PASSED Doing stxvh8x: PASSED Doing stxvx: PASSED Doing stxvl: PASSED Doing stxvll: PASSED failure: test_alignment_handler_vsx_300 [snip] Fix this by making sure all VSX instruction emulation correctly load/store from the VSRs. Fixes: af99da74333b ("powerpc/sstep: Support VSX vector paired storage access instructions") Signed-off-by: Jordan Niethe Reviewed-by: Ravi Bangoria Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20210225031946.1458206-1-jniethe5@gmail.com Signed-off-by: Sasha Levin commit c9dc6f49afb55e3b00458173ebce34f4799f1606 Author: Sergey Shtylyov Date: Sun Feb 28 23:26:34 2021 +0300 sh_eth: fix TRSCER mask for R7S72100 [ Upstream commit 75be7fb7f978202c4c3a1a713af4485afb2ff5f6 ] According to the RZ/A1H Group, RZ/A1M Group User's Manual: Hardware, Rev. 4.00, the TRSCER register has bit 9 reserved, hence we can't use the driver's default TRSCER mask. Add the explicit initializer for sh_eth_cpu_data::trscer_err_mask for R7S72100. Fixes: db893473d313 ("sh_eth: Add support for r7s72100") Signed-off-by: Sergey Shtylyov Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 7af661b1b62d5199500714cae16bd393c1bdd88a Author: Ioana Ciornei Date: Fri Feb 26 17:30:20 2021 +0200 net: phy: ti: take into account all possible interrupt sources [ Upstream commit 73f476aa1975bae6a792b340f5b26ffcfba869a6 ] The previous implementation of .handle_interrupt() did not take into account the fact that all the interrupt status registers should be acknowledged since multiple interrupt sources could be asserted. Fix this by reading all the status registers before exiting with IRQ_NONE or triggering the PHY state machine. Fixes: 1d1ae3c6ca3f ("net: phy: ti: implement generic .handle_interrupt() callback") Reported-by: Sven Schuchmann Signed-off-by: Ioana Ciornei Link: https://lore.kernel.org/r/20210226153020.867852-1-ciorneiioana@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 48d541ef9b7d6d1256b2d0a1e86cfe641fedf4a2 Author: Ido Schimmel Date: Thu Feb 25 18:57:21 2021 +0200 mlxsw: spectrum_router: Ignore routes using a deleted nexthop object [ Upstream commit dc860b88ce0a7ed9a048d5042cbb175daf60b657 ] Routes are currently processed from a workqueue whereas nexthop objects are processed in system call context. This can result in the driver not finding a suitable nexthop group for a route and issuing a warning [1]. Fix this by ignoring such routes earlier in the process. The subsequent deletion notification will be ignored as well. [1] WARNING: CPU: 2 PID: 7754 at drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c:4853 mlxsw_sp_router_fib_event_work+0x1112/0x1e00 [mlxsw_spectrum] [...] CPU: 2 PID: 7754 Comm: kworker/u8:0 Not tainted 5.11.0-rc6-cq-20210207-1 #16 Hardware name: Mellanox Technologies Ltd. MSN2100/SA001390, BIOS 5.6.5 05/24/2018 Workqueue: mlxsw_core_ordered mlxsw_sp_router_fib_event_work [mlxsw_spectrum] RIP: 0010:mlxsw_sp_router_fib_event_work+0x1112/0x1e00 [mlxsw_spectrum] Fixes: cdd6cfc54c64 ("mlxsw: spectrum_router: Allow programming routes with nexthop objects") Signed-off-by: Ido Schimmel Reported-by: Alex Veber Tested-by: Alex Veber Reviewed-by: Petr Machata Reviewed-by: Jiri Pirko Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit 69eba6dfb351f9cb7df573d0b2a6a775a0a25903 Author: Ian Abbott Date: Tue Feb 23 14:30:50 2021 +0000 staging: comedi: pcl818: Fix endian problem for AI command data commit 148e34fd33d53740642db523724226de14ee5281 upstream. The analog input subdevice supports Comedi asynchronous commands that use Comedi's 16-bit sample format. However, the call to `comedi_buf_write_samples()` is passing the address of a 32-bit integer parameter. On bigendian machines, this will copy 2 bytes from the wrong end of the 32-bit value. Fix it by changing the type of the parameter holding the sample value to `unsigned short`. [Note: the bug was introduced in commit edf4537bcbf5 ("staging: comedi: pcl818: use comedi_buf_write_samples()") but the patch applies better to commit d615416de615 ("staging: comedi: pcl818: introduce pcl818_ai_write_sample()").] Fixes: d615416de615 ("staging: comedi: pcl818: introduce pcl818_ai_write_sample()") Cc: # 4.0+ Signed-off-by: Ian Abbott Link: https://lore.kernel.org/r/20210223143055.257402-10-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman commit bc446306e2bf4fc899a89265a9d72aa8fbddc771 Author: Ian Abbott Date: Tue Feb 23 14:30:49 2021 +0000 staging: comedi: pcl711: Fix endian problem for AI command data commit a084303a645896e834883f2c5170d044410dfdb3 upstream. The analog input subdevice supports Comedi asynchronous commands that use Comedi's 16-bit sample format. However, the call to `comedi_buf_write_samples()` is passing the address of a 32-bit integer variable. On bigendian machines, this will copy 2 bytes from the wrong end of the 32-bit value. Fix it by changing the type of the variable holding the sample value to `unsigned short`. Fixes: 1f44c034de2e ("staging: comedi: pcl711: use comedi_buf_write_samples()") Cc: # 3.19+ Signed-off-by: Ian Abbott Link: https://lore.kernel.org/r/20210223143055.257402-9-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman commit 61fd133419de153ce554a516ba7984315c4bdef7 Author: Ian Abbott Date: Tue Feb 23 14:30:48 2021 +0000 staging: comedi: me4000: Fix endian problem for AI command data commit b39dfcced399d31e7c4b7341693b18e01c8f655e upstream. The analog input subdevice supports Comedi asynchronous commands that use Comedi's 16-bit sample format. However, the calls to `comedi_buf_write_samples()` are passing the address of a 32-bit integer variable. On bigendian machines, this will copy 2 bytes from the wrong end of the 32-bit value. Fix it by changing the type of the variable holding the sample value to `unsigned short`. Fixes: de88924f67d1 ("staging: comedi: me4000: use comedi_buf_write_samples()") Cc: # 3.19+ Signed-off-by: Ian Abbott Link: https://lore.kernel.org/r/20210223143055.257402-8-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman commit ee81c14d867406fcb232aeb5490bfb997895a159 Author: Ian Abbott Date: Tue Feb 23 14:30:47 2021 +0000 staging: comedi: dmm32at: Fix endian problem for AI command data commit 54999c0d94b3c26625f896f8e3460bc029821578 upstream. The analog input subdevice supports Comedi asynchronous commands that use Comedi's 16-bit sample format. However, the call to `comedi_buf_write_samples()` is passing the address of a 32-bit integer variable. On bigendian machines, this will copy 2 bytes from the wrong end of the 32-bit value. Fix it by changing the type of the variable holding the sample value to `unsigned short`. [Note: the bug was introduced in commit 1700529b24cc ("staging: comedi: dmm32at: use comedi_buf_write_samples()") but the patch applies better to the later (but in the same kernel release) commit 0c0eadadcbe6e ("staging: comedi: dmm32at: introduce dmm32_ai_get_sample()").] Fixes: 0c0eadadcbe6e ("staging: comedi: dmm32at: introduce dmm32_ai_get_sample()") Cc: # 3.19+ Signed-off-by: Ian Abbott Link: https://lore.kernel.org/r/20210223143055.257402-7-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman commit b4a6dc4a7611dd137c1f8f6752419777847c91f6 Author: Ian Abbott Date: Tue Feb 23 14:30:46 2021 +0000 staging: comedi: das800: Fix endian problem for AI command data commit 459b1e8c8fe97fcba0bd1b623471713dce2c5eaf upstream. The analog input subdevice supports Comedi asynchronous commands that use Comedi's 16-bit sample format. However, the call to `comedi_buf_write_samples()` is passing the address of a 32-bit integer variable. On bigendian machines, this will copy 2 bytes from the wrong end of the 32-bit value. Fix it by changing the type of the variable holding the sample value to `unsigned short`. Fixes: ad9eb43c93d8 ("staging: comedi: das800: use comedi_buf_write_samples()") Cc: # 3.19+ Signed-off-by: Ian Abbott Link: https://lore.kernel.org/r/20210223143055.257402-6-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman commit 82ecb3c446f74bedc23a43dec1850bd2f1276a03 Author: Ian Abbott Date: Tue Feb 23 14:30:45 2021 +0000 staging: comedi: das6402: Fix endian problem for AI command data commit 1c0f20b78781b9ca50dc3ecfd396d0db5b141890 upstream. The analog input subdevice supports Comedi asynchronous commands that use Comedi's 16-bit sample format. However, the call to `comedi_buf_write_samples()` is passing the address of a 32-bit integer variable. On bigendian machines, this will copy 2 bytes from the wrong end of the 32-bit value. Fix it by changing the type of the variable holding the sample value to `unsigned short`. Fixes: d1d24cb65ee3 ("staging: comedi: das6402: read analog input samples in interrupt handler") Cc: # 3.19+ Signed-off-by: Ian Abbott Link: https://lore.kernel.org/r/20210223143055.257402-5-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman commit 62201ee024e35e17f08072966a2317e48a92fa4f Author: Ian Abbott Date: Tue Feb 23 14:30:44 2021 +0000 staging: comedi: adv_pci1710: Fix endian problem for AI command data commit b2e78630f733a76508b53ba680528ca39c890e82 upstream. The analog input subdevice supports Comedi asynchronous commands that use Comedi's 16-bit sample format. However, the calls to `comedi_buf_write_samples()` are passing the address of a 32-bit integer variable. On bigendian machines, this will copy 2 bytes from the wrong end of the 32-bit value. Fix it by changing the type of the variables holding the sample value to `unsigned short`. The type of the `val` parameter of `pci1710_ai_read_sample()` is changed to `unsigned short *` accordingly. The type of the `val` variable in `pci1710_ai_insn_read()` is also changed to `unsigned short` since its address is passed to `pci1710_ai_read_sample()`. Fixes: a9c3a015c12f ("staging: comedi: adv_pci1710: use comedi_buf_write_samples()") Cc: # 4.0+ Signed-off-by: Ian Abbott Link: https://lore.kernel.org/r/20210223143055.257402-4-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman commit 26334ba6dd7f3fbe72c8a2f0f9eae425e320e383 Author: Ian Abbott Date: Tue Feb 23 14:30:43 2021 +0000 staging: comedi: addi_apci_1500: Fix endian problem for command sample commit ac0bbf55ed3be75fde1f8907e91ecd2fd589bde3 upstream. The digital input subdevice supports Comedi asynchronous commands that read interrupt status information. This uses 16-bit Comedi samples (of which only the bottom 8 bits contain status information). However, the interrupt handler is calling `comedi_buf_write_samples()` with the address of a 32-bit variable `unsigned int status`. On a bigendian machine, this will copy 2 bytes from the wrong end of the variable. Fix it by changing the type of the variable to `unsigned short`. Fixes: a8c66b684efa ("staging: comedi: addi_apci_1500: rewrite the subdevice support functions") Cc: #4.0+ Signed-off-by: Ian Abbott Link: https://lore.kernel.org/r/20210223143055.257402-3-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman commit cdca970941ce1c6d5ff8a65ccd818e7d2bbe0f04 Author: Ian Abbott Date: Tue Feb 23 14:30:42 2021 +0000 staging: comedi: addi_apci_1032: Fix endian problem for COS sample commit 25317f428a78fde71b2bf3f24d05850f08a73a52 upstream. The Change-Of-State (COS) subdevice supports Comedi asynchronous commands to read 16-bit change-of-state values. However, the interrupt handler is calling `comedi_buf_write_samples()` with the address of a 32-bit integer `&s->state`. On bigendian architectures, it will copy 2 bytes from the wrong end of the 32-bit integer. Fix it by transferring the value via a 16-bit integer. Fixes: 6bb45f2b0c86 ("staging: comedi: addi_apci_1032: use comedi_buf_write_samples()") Cc: # 3.19+ Signed-off-by: Ian Abbott Link: https://lore.kernel.org/r/20210223143055.257402-2-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman commit 3ff08f7339b7ca8d639fbc9d750a70a09684d62b Author: Lee Gibson Date: Fri Feb 26 14:51:57 2021 +0000 staging: rtl8192e: Fix possible buffer overflow in _rtl92e_wx_set_scan commit 8687bf9ef9551bcf93897e33364d121667b1aadf upstream. Function _rtl92e_wx_set_scan calls memcpy without checking the length. A user could control that length and trigger a buffer overflow. Fix by checking the length is within the maximum allowed size. Reviewed-by: Dan Carpenter Signed-off-by: Lee Gibson Cc: stable Link: https://lore.kernel.org/r/20210226145157.424065-1-leegib@gmail.com Signed-off-by: Greg Kroah-Hartman commit 1a1642a610394d27fb29255ab7e803b5b77b740c Author: Lee Gibson Date: Mon Mar 1 13:26:48 2021 +0000 staging: rtl8712: Fix possible buffer overflow in r8712_sitesurvey_cmd commit b93c1e3981af19527beee1c10a2bef67a228c48c upstream. Function r8712_sitesurvey_cmd calls memcpy without checking the length. A user could control that length and trigger a buffer overflow. Fix by checking the length is within the maximum allowed size. Signed-off-by: Lee Gibson Link: https://lore.kernel.org/r/20210301132648.420296-1-leegib@gmail.com Cc: stable Signed-off-by: Greg Kroah-Hartman commit 85bd010b4c03b3b920491e05ed7f4973f83c997a Author: Dan Carpenter Date: Tue Mar 2 14:19:39 2021 +0300 staging: ks7010: prevent buffer overflow in ks_wlan_set_scan() commit e163b9823a0b08c3bb8dc4f5b4b5c221c24ec3e5 upstream. The user can specify a "req->essid_len" of up to 255 but if it's over IW_ESSID_MAX_SIZE (32) that can lead to memory corruption. Fixes: 13a9930d15b4 ("staging: ks7010: add driver from Nanonote extra-repository") Signed-off-by: Dan Carpenter Cc: stable Link: https://lore.kernel.org/r/YD4fS8+HmM/Qmrw6@mwanda Signed-off-by: Greg Kroah-Hartman commit c21fccda627d18882237af01e3d73d8aea7db49b Author: Dan Carpenter Date: Fri Mar 5 11:56:32 2021 +0300 staging: rtl8188eu: fix potential memory corruption in rtw_check_beacon_data() commit d4ac640322b06095128a5c45ba4a1e80929fe7f3 upstream. The "ie_len" is a value in the 1-255 range that comes from the user. We have to cap it to ensure that it's not too large or it could lead to memory corruption. Fixes: 9a7fe54ddc3a ("staging: r8188eu: Add source files for new driver - part 1") Signed-off-by: Dan Carpenter Cc: stable Link: https://lore.kernel.org/r/YEHyQCrFZKTXyT7J@mwanda Signed-off-by: Greg Kroah-Hartman commit 6e90e2392f26d2eb6c0ee2a5b3621a2debd641a1 Author: Dan Carpenter Date: Wed Feb 24 11:45:59 2021 +0300 staging: rtl8712: unterminated string leads to read overflow commit d660f4f42ccea50262c6ee90c8e7ad19a69fb225 upstream. The memdup_user() function does not necessarily return a NUL terminated string so this can lead to a read overflow. Switch from memdup_user() to strndup_user() to fix this bug. Fixes: c6dc001f2add ("staging: r8712u: Merging Realtek's latest (v2.6.6). Various fixes.") Cc: stable Signed-off-by: Dan Carpenter Link: https://lore.kernel.org/r/YDYSR+1rj26NRhvb@mwanda Signed-off-by: Greg Kroah-Hartman commit 1cdd069f7080acf6370250853c1211890f4ff38f Author: Dan Carpenter Date: Fri Mar 5 11:58:03 2021 +0300 staging: rtl8188eu: prevent ->ssid overflow in rtw_wx_set_scan() commit 74b6b20df8cfe90ada777d621b54c32e69e27cd7 upstream. This code has a check to prevent read overflow but it needs another check to prevent writing beyond the end of the ->ssid[] array. Fixes: a2c60d42d97c ("staging: r8188eu: Add files for new driver - part 16") Signed-off-by: Dan Carpenter Cc: stable Link: https://lore.kernel.org/r/YEHymwsnHewzoam7@mwanda Signed-off-by: Greg Kroah-Hartman commit 952cf8b4b05b5343765e74ed38ed5950e1af1087 Author: Dan Carpenter Date: Fri Mar 5 11:12:49 2021 +0300 staging: rtl8192u: fix ->ssid overflow in r8192_wx_set_scan() commit 87107518d7a93fec6cdb2559588862afeee800fb upstream. We need to cap len at IW_ESSID_MAX_SIZE (32) to avoid memory corruption. This can be controlled by the user via the ioctl. Fixes: 5f53d8ca3d5d ("Staging: add rtl8192SU wireless usb driver") Signed-off-by: Dan Carpenter Cc: stable Link: https://lore.kernel.org/r/YEHoAWMOSZBUw91F@mwanda Signed-off-by: Greg Kroah-Hartman commit 2754ab0efc08a9ab6f50d4ad592967db37dd38cc Author: Dmitry Baryshkov Date: Fri Feb 12 22:26:58 2021 +0300 misc: fastrpc: restrict user apps from sending kernel RPC messages commit 20c40794eb85ea29852d7bc37c55713802a543d6 upstream. Verify that user applications are not using the kernel RPC message handle to restrict them from directly attaching to guest OS on the remote subsystem. This is a port of CVE-2019-2308 fix. Fixes: c68cfb718c8f ("misc: fastrpc: Add support for context Invoke method") Cc: Srinivas Kandagatla Cc: Jonathan Marek Cc: stable@vger.kernel.org Signed-off-by: Dmitry Baryshkov Link: https://lore.kernel.org/r/20210212192658.3476137-1-dmitry.baryshkov@linaro.org Signed-off-by: Greg Kroah-Hartman commit 5d6496a8d31f2e5f9ff2ac02bd99e50ee97923e5 Author: Shile Zhang Date: Thu Feb 18 20:31:16 2021 +0800 misc/pvpanic: Export module FDT device table commit 65527a51c66f4edfa28602643d7dd4fa366eb826 upstream. Export the module FDT device table to ensure the FDT compatible strings are listed in the module alias. This help the pvpanic driver can be loaded on boot automatically not only the ACPI device, but also the FDT device. Fixes: 46f934c9a12fc ("misc/pvpanic: add support to get pvpanic device info FDT") Signed-off-by: Shile Zhang Link: https://lore.kernel.org/r/20210218123116.207751-1-shile.zhang@linux.alibaba.com Cc: stable Signed-off-by: Greg Kroah-Hartman commit ca67a57d517fedbcae2697935299676e643759a7 Author: Alexander Shiyan Date: Wed Feb 17 11:06:08 2021 +0300 Revert "serial: max310x: rework RX interrupt handling" commit 2334de198fed3da72e9785ecdd691d101aa96e77 upstream. This reverts commit fce3c5c1a2d9cd888f2987662ce17c0c651916b2. FIFO is triggered 4 intervals after receiving a byte, it's good when we don't care about the time of reception, but are only interested in the presence of any activity on the line. Unfortunately, this method is not suitable for all tasks, for example, the RS-485 protocol will not work properly, since the state machine must track the request-response time and after the timeout expires, a decision is made that the device on the line is not responding. Signed-off-by: Alexander Shiyan Link: https://lore.kernel.org/r/20210217080608.31192-1-shc_work@mail.ru Fixes: fce3c5c1a2d9 ("serial: max310x: rework RX interrupt handling") Cc: Thomas Petazzoni Cc: stable Signed-off-by: Greg Kroah-Hartman commit 15ab5e45f6c47e0a17ba69c9a1f666b54610fb24 Author: Shuah Khan Date: Sun Mar 7 20:53:31 2021 -0700 usbip: fix vudc usbip_sockfd_store races leading to gpf commit 46613c9dfa964c0c60b5385dbdf5aaa18be52a9c upstream. usbip_sockfd_store() is invoked when user requests attach (import) detach (unimport) usb gadget device from usbip host. vhci_hcd sends import request and usbip_sockfd_store() exports the device if it is free for export. Export and unexport are governed by local state and shared state - Shared state (usbip device status, sockfd) - sockfd and Device status are used to determine if stub should be brought up or shut down. Device status is shared between host and client. - Local state (tcp_socket, rx and tx thread task_struct ptrs) A valid tcp_socket controls rx and tx thread operations while the device is in exported state. - While the device is exported, device status is marked used and socket, sockfd, and thread pointers are valid. Export sequence (stub-up) includes validating the socket and creating receive (rx) and transmit (tx) threads to talk to the client to provide access to the exported device. rx and tx threads depends on local and shared state to be correct and in sync. Unexport (stub-down) sequence shuts the socket down and stops the rx and tx threads. Stub-down sequence relies on local and shared states to be in sync. There are races in updating the local and shared status in the current stub-up sequence resulting in crashes. These stem from starting rx and tx threads before local and global state is updated correctly to be in sync. 1. Doesn't handle kthread_create() error and saves invalid ptr in local state that drives rx and tx threads. 2. Updates tcp_socket and sockfd, starts stub_rx and stub_tx threads before updating usbip_device status to SDEV_ST_USED. This opens up a race condition between the threads and usbip_sockfd_store() stub up and down handling. Fix the above problems: - Stop using kthread_get_run() macro to create/start threads. - Create threads and get task struct reference. - Add kthread_create() failure handling and bail out. - Hold usbip_device lock to update local and shared states after creating rx and tx threads. - Update usbip_device status to SDEV_ST_USED. - Update usbip_device tcp_socket, sockfd, tcp_rx, and tcp_tx - Start threads after usbip_device (tcp_socket, sockfd, tcp_rx, tcp_tx, and status) is complete. Credit goes to syzbot and Tetsuo Handa for finding and root-causing the kthread_get_run() improper error handling problem and others. This is a hard problem to find and debug since the races aren't seen in a normal case. Fuzzing forces the race window to be small enough for the kthread_get_run() error path bug and starting threads before updating the local and shared state bug in the stub-up sequence. Fixes: 9720b4bc76a83807 ("staging/usbip: convert to kthread") Cc: stable@vger.kernel.org Reported-by: syzbot Reported-by: syzbot Reported-by: syzbot Reported-by: Tetsuo Handa Signed-off-by: Shuah Khan Link: https://lore.kernel.org/r/b1c08b983ffa185449c9f0f7d1021dc8c8454b60.1615171203.git.skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman commit ff5d8d20368fc612f0c0dd44f5a94728877c2bd4 Author: Shuah Khan Date: Sun Mar 7 20:53:30 2021 -0700 usbip: fix vhci_hcd attach_store() races leading to gpf commit 718ad9693e3656120064b715fe931f43a6201e67 upstream. attach_store() is invoked when user requests import (attach) a device from usbip host. Attach and detach are governed by local state and shared state - Shared state (usbip device status) - Device status is used to manage the attach and detach operations on import-able devices. - Local state (tcp_socket, rx and tx thread task_struct ptrs) A valid tcp_socket controls rx and tx thread operations while the device is in exported state. - Device has to be in the right state to be attached and detached. Attach sequence includes validating the socket and creating receive (rx) and transmit (tx) threads to talk to the host to get access to the imported device. rx and tx threads depends on local and shared state to be correct and in sync. Detach sequence shuts the socket down and stops the rx and tx threads. Detach sequence relies on local and shared states to be in sync. There are races in updating the local and shared status in the current attach sequence resulting in crashes. These stem from starting rx and tx threads before local and global state is updated correctly to be in sync. 1. Doesn't handle kthread_create() error and saves invalid ptr in local state that drives rx and tx threads. 2. Updates tcp_socket and sockfd, starts stub_rx and stub_tx threads before updating usbip_device status to VDEV_ST_NOTASSIGNED. This opens up a race condition between the threads, port connect, and detach handling. Fix the above problems: - Stop using kthread_get_run() macro to create/start threads. - Create threads and get task struct reference. - Add kthread_create() failure handling and bail out. - Hold vhci and usbip_device locks to update local and shared states after creating rx and tx threads. - Update usbip_device status to VDEV_ST_NOTASSIGNED. - Update usbip_device tcp_socket, sockfd, tcp_rx, and tcp_tx - Start threads after usbip_device (tcp_socket, sockfd, tcp_rx, tcp_tx, and status) is complete. Credit goes to syzbot and Tetsuo Handa for finding and root-causing the kthread_get_run() improper error handling problem and others. This is hard problem to find and debug since the races aren't seen in a normal case. Fuzzing forces the race window to be small enough for the kthread_get_run() error path bug and starting threads before updating the local and shared state bug in the attach sequence. - Update usbip_device tcp_rx and tcp_tx pointers holding vhci and usbip_device locks. Tested with syzbot reproducer: - https://syzkaller.appspot.com/text?tag=ReproC&x=14801034d00000 Fixes: 9720b4bc76a83807 ("staging/usbip: convert to kthread") Cc: stable@vger.kernel.org Reported-by: syzbot Reported-by: syzbot Reported-by: syzbot Reported-by: Tetsuo Handa Signed-off-by: Shuah Khan Link: https://lore.kernel.org/r/bb434bd5d7a64fbec38b5ecfb838a6baef6eb12b.1615171203.git.skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman commit f11d195b505d47d0442c59981efa41c47d0a8c9c Author: Shuah Khan Date: Sun Mar 7 20:53:29 2021 -0700 usbip: fix stub_dev usbip_sockfd_store() races leading to gpf commit 9380afd6df70e24eacbdbde33afc6a3950965d22 upstream. usbip_sockfd_store() is invoked when user requests attach (import) detach (unimport) usb device from usbip host. vhci_hcd sends import request and usbip_sockfd_store() exports the device if it is free for export. Export and unexport are governed by local state and shared state - Shared state (usbip device status, sockfd) - sockfd and Device status are used to determine if stub should be brought up or shut down. - Local state (tcp_socket, rx and tx thread task_struct ptrs) A valid tcp_socket controls rx and tx thread operations while the device is in exported state. - While the device is exported, device status is marked used and socket, sockfd, and thread pointers are valid. Export sequence (stub-up) includes validating the socket and creating receive (rx) and transmit (tx) threads to talk to the client to provide access to the exported device. rx and tx threads depends on local and shared state to be correct and in sync. Unexport (stub-down) sequence shuts the socket down and stops the rx and tx threads. Stub-down sequence relies on local and shared states to be in sync. There are races in updating the local and shared status in the current stub-up sequence resulting in crashes. These stem from starting rx and tx threads before local and global state is updated correctly to be in sync. 1. Doesn't handle kthread_create() error and saves invalid ptr in local state that drives rx and tx threads. 2. Updates tcp_socket and sockfd, starts stub_rx and stub_tx threads before updating usbip_device status to SDEV_ST_USED. This opens up a race condition between the threads and usbip_sockfd_store() stub up and down handling. Fix the above problems: - Stop using kthread_get_run() macro to create/start threads. - Create threads and get task struct reference. - Add kthread_create() failure handling and bail out. - Hold usbip_device lock to update local and shared states after creating rx and tx threads. - Update usbip_device status to SDEV_ST_USED. - Update usbip_device tcp_socket, sockfd, tcp_rx, and tcp_tx - Start threads after usbip_device (tcp_socket, sockfd, tcp_rx, tcp_tx, and status) is complete. Credit goes to syzbot and Tetsuo Handa for finding and root-causing the kthread_get_run() improper error handling problem and others. This is a hard problem to find and debug since the races aren't seen in a normal case. Fuzzing forces the race window to be small enough for the kthread_get_run() error path bug and starting threads before updating the local and shared state bug in the stub-up sequence. Tested with syzbot reproducer: - https://syzkaller.appspot.com/text?tag=ReproC&x=14801034d00000 Fixes: 9720b4bc76a83807 ("staging/usbip: convert to kthread") Cc: stable@vger.kernel.org Reported-by: syzbot Reported-by: syzbot Reported-by: syzbot Reported-by: Tetsuo Handa Signed-off-by: Shuah Khan Link: https://lore.kernel.org/r/268a0668144d5ff36ec7d87fdfa90faf583b7ccc.1615171203.git.skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman commit e0a3d7e8a8ffb16c0966ca9b9e4ae86c94d7b46c Author: Shuah Khan Date: Sun Mar 7 20:53:28 2021 -0700 usbip: fix vudc to check for stream socket commit 6801854be94fe8819b3894979875ea31482f5658 upstream. Fix usbip_sockfd_store() to validate the passed in file descriptor is a stream socket. If the file descriptor passed was a SOCK_DGRAM socket, sock_recvmsg() can't detect end of stream. Cc: stable@vger.kernel.org Suggested-by: Tetsuo Handa Signed-off-by: Shuah Khan Link: https://lore.kernel.org/r/387a670316002324113ac7ea1e8b53f4085d0c95.1615171203.git.skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman commit d28f4c2c777315ad8700a9cc9b01a85e9befc0c1 Author: Shuah Khan Date: Sun Mar 7 20:53:27 2021 -0700 usbip: fix vhci_hcd to check for stream socket commit f55a0571690c4aae03180e001522538c0927432f upstream. Fix attach_store() to validate the passed in file descriptor is a stream socket. If the file descriptor passed was a SOCK_DGRAM socket, sock_recvmsg() can't detect end of stream. Cc: stable@vger.kernel.org Suggested-by: Tetsuo Handa Signed-off-by: Shuah Khan Link: https://lore.kernel.org/r/52712aa308915bda02cece1589e04ee8b401d1f3.1615171203.git.skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman commit 1652c100ca7279d084935b63ad105de11be4e77b Author: Shuah Khan Date: Sun Mar 7 20:53:26 2021 -0700 usbip: fix stub_dev to check for stream socket commit 47ccc8fc2c9c94558b27b6f9e2582df32d29e6e8 upstream. Fix usbip_sockfd_store() to validate the passed in file descriptor is a stream socket. If the file descriptor passed was a SOCK_DGRAM socket, sock_recvmsg() can't detect end of stream. Cc: stable@vger.kernel.org Suggested-by: Tetsuo Handa Signed-off-by: Shuah Khan Link: https://lore.kernel.org/r/e942d2bd03afb8e8552bd2a5d84e18d17670d521.1615171203.git.skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman commit c5a43bffa0c7984f4155de26d64b58dfc13048c2 Author: Sebastian Reichel Date: Tue Feb 23 17:44:18 2021 +0100 USB: serial: cp210x: add some more GE USB IDs commit 42213a0190b535093a604945db05a4225bf43885 upstream. GE CS1000 has some more custom USB IDs for CP2102N; add them to the driver to have working auto-probing. Signed-off-by: Sebastian Reichel Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman commit b68e6af8997399489d4b7a69165d2edaa56c1e70 Author: Karan Singhal Date: Tue Feb 16 11:03:10 2021 -0500 USB: serial: cp210x: add ID for Acuity Brands nLight Air Adapter commit ca667a33207daeaf9c62b106815728718def60ec upstream. IDs of nLight Air Adapter, Acuity Brands, Inc.: vid: 10c4 pid: 88d8 Signed-off-by: Karan Singhal Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman commit e760989414bc1dffe610856e145345817cd70767 Author: Niv Sardi Date: Mon Mar 1 17:16:12 2021 -0300 USB: serial: ch341: add new Product ID commit 5563b3b6420362c8a1f468ca04afe6d5f0a8d0a3 upstream. Add PID for CH340 that's found on cheap programmers. The driver works flawlessly as soon as the new PID (0x9986) is added to it. These look like ANU232MI but ship with a ch341 inside. They have no special identifiers (mine only has the string "DB9D20130716" printed on the PCB and nothing identifiable on the packaging. The merchant i bought it from doesn't sell these anymore). the lsusb -v output is: Bus 001 Device 009: ID 9986:7523 Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 1.10 bDeviceClass 255 Vendor Specific Class bDeviceSubClass 0 bDeviceProtocol 0 bMaxPacketSize0 8 idVendor 0x9986 idProduct 0x7523 bcdDevice 2.54 iManufacturer 0 iProduct 0 iSerial 0 bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 0x0027 bNumInterfaces 1 bConfigurationValue 1 iConfiguration 0 bmAttributes 0x80 (Bus Powered) MaxPower 96mA Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 0 bAlternateSetting 0 bNumEndpoints 3 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 1 bInterfaceProtocol 2 iInterface 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x82 EP 2 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0020 1x 32 bytes bInterval 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x02 EP 2 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0020 1x 32 bytes bInterval 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x81 EP 1 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x0008 1x 8 bytes bInterval 1 Signed-off-by: Niv Sardi Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman commit 6b2876b831a950293f789b246082517122932c5f Author: Pavel Skripkin Date: Tue Mar 2 02:01:52 2021 +0300 USB: serial: io_edgeport: fix memory leak in edge_startup commit cfdc67acc785e01a8719eeb7012709d245564701 upstream. sysbot found memory leak in edge_startup(). The problem was that when an error was received from the usb_submit_urb(), nothing was cleaned up. Reported-by: syzbot+59f777bdcbdd7eea5305@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin Fixes: 6e8cf7751f9f ("USB: add EPIC support to the io_edgeport driver") Cc: stable@vger.kernel.org # 2.6.21: c5c0c55598ce Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman commit b4bea73a5df516a2befa4feae8b79e177e55c788 Author: Mathias Nyman Date: Thu Mar 11 13:53:53 2021 +0200 xhci: Fix repeated xhci wake after suspend due to uncleared internal wake state commit d26c00e7276fc92b18c253d69e872f6b03832bad upstream. If port terminations are detected in suspend, but link never reaches U0 then xHCI may have an internal uncleared wake state that will cause an immediate wake after suspend. This wake state is normally cleared when driver clears the PORT_CSC bit, which is set after a device is enabled and in U0. Write 1 to clear PORT_CSC for ports that don't have anything connected when suspending. This makes sure any pending internal wake states in xHCI are cleared. Cc: stable@vger.kernel.org Tested-by: Mika Westerberg Signed-off-by: Mathias Nyman Link: https://lore.kernel.org/r/20210311115353.2137560-5-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit 004caa19cb2e87c29a13209a0abb84d6dfab8052 Author: Forest Crossman Date: Thu Mar 11 13:53:52 2021 +0200 usb: xhci: Fix ASMedia ASM1042A and ASM3242 DMA addressing commit b71c669ad8390dd1c866298319ff89fe68b45653 upstream. I've confirmed that both the ASMedia ASM1042A and ASM3242 have the same problem as the ASM1142 and ASM2142/ASM3142, where they lose some of the upper bits of 64-bit DMA addresses. As with the other chips, this can cause problems on systems where the upper bits matter, and adding the XHCI_NO_64BIT_SUPPORT quirk completely fixes the issue. Cc: stable@vger.kernel.org Signed-off-by: Forest Crossman Signed-off-by: Mathias Nyman Link: https://lore.kernel.org/r/20210311115353.2137560-4-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit 631ad31ea151ac5d4808ddfc8f8a4132f391390a Author: Mathias Nyman Date: Thu Mar 11 13:53:51 2021 +0200 xhci: Improve detection of device initiated wake signal. commit 253f588c70f66184b1f3a9bbb428b49bbda73e80 upstream. A xHC USB 3 port might miss the first wake signal from a USB 3 device if the port LFPS reveiver isn't enabled fast enough after xHC resume. xHC host will anyway be resumed by a PME# signal, but will go back to suspend if no port activity is seen. The device resends the U3 LFPS wake signal after a 100ms delay, but by then host is already suspended, starting all over from the beginning of this issue. USB 3 specs say U3 wake LFPS signal is sent for max 10ms, then device needs to delay 100ms before resending the wake. Don't suspend immediately if port activity isn't detected in resume. Instead add a retry. If there is no port activity then delay for 120ms, and re-check for port activity. Cc: Signed-off-by: Mathias Nyman Link: https://lore.kernel.org/r/20210311115353.2137560-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit 8d7d6b4c2026c90bcd98e30cd20ac7853741c89a Author: Stanislaw Gruszka Date: Thu Mar 11 13:53:50 2021 +0200 usb: xhci: do not perform Soft Retry for some xHCI hosts commit a4a251f8c23518899d2078c320cf9ce2fa459c9f upstream. On some systems rt2800usb and mt7601u devices are unable to operate since commit f8f80be501aa ("xhci: Use soft retry to recover faster from transaction errors") Seems that some xHCI controllers can not perform Soft Retry correctly, affecting those devices. To avoid the problem add xhci->quirks flag that restore pre soft retry xhci behaviour for affected xHCI controllers. Currently those are AMD_PROMONTORYA_4 and AMD_PROMONTORYA_2, since it was confirmed by the users: on those xHCI hosts issue happen and is gone after disabling Soft Retry. [minor commit message rewording for checkpatch -Mathias] Fixes: f8f80be501aa ("xhci: Use soft retry to recover faster from transaction errors") Cc: # 4.20+ Reported-by: Bernhard Tested-by: Bernhard Signed-off-by: Stanislaw Gruszka Signed-off-by: Mathias Nyman Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=202541 Link: https://lore.kernel.org/r/20210311115353.2137560-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman commit a85d7eb852dbf466bba04b2142b76278be24e5ef Author: Yoshihiro Shimoda Date: Mon Mar 8 10:55:38 2021 +0900 usb: renesas_usbhs: Clear PIPECFG for re-enabling pipe with other EPNUM commit b1d25e6ee57c2605845595b6c61340d734253eb3 upstream. According to the datasheet, this controller has a restriction which "set an endpoint number so that combinations of the DIR bit and the EPNUM bits do not overlap.". However, since the udc core driver is possible to assign a bulk pipe as an interrupt endpoint, an endpoint number may not match the pipe number. After that, when user rebinds another gadget driver, this driver broke the restriction because the driver didn't clear any configuration in usb_ep_disable(). Example: # modprobe g_ncm Then, EP3 = pipe 3, EP4 = pipe 4, EP5 = pipe 6 # rmmod g_ncm # modprobe g_hid Then, EP3 = pipe 6, EP4 = pipe 7. So, pipe 3 and pipe 6 are set as EP3. So, clear PIPECFG register in usbhs_pipe_free(). Fixes: dfb87b8bfe09 ("usb: renesas_usbhs: gadget: fix re-enabling pipe without re-connecting") Cc: stable Signed-off-by: Yoshihiro Shimoda Link: https://lore.kernel.org/r/1615168538-26101-1-git-send-email-yoshihiro.shimoda.uh@renesas.com Signed-off-by: Greg Kroah-Hartman commit 63dd82c2aab04b0bfe8cce0ceb6e2134c4379527 Author: Pete Zaitcev Date: Wed Mar 3 22:10:53 2021 -0600 USB: usblp: fix a hang in poll() if disconnected commit 9de2c43acf37a17dc4c69ff78bb099b80fb74325 upstream. Apparently an application that opens a device and calls select() on it, will hang if the decice is disconnected. It's a little surprising that we had this bug for 15 years, but apparently nobody ever uses select() with a printer: only write() and read(), and those work fine. Well, you can also select() with a timeout. The fix is modeled after devio.c. A few other drivers check the condition first, then do not add the wait queue in case the device is disconnected. We doubt that's completely race-free. So, this patch adds the process first, then locks properly and checks for the disconnect. Reviewed-by: Zqiang Signed-off-by: Pete Zaitcev Cc: stable Link: https://lore.kernel.org/r/20210303221053.1cf3313e@suzdal.zaitcev.lan Signed-off-by: Greg Kroah-Hartman commit 851c74a266f9eefe9654474198cd7b84872695aa Author: Matthias Kaehlcke Date: Tue Mar 2 10:37:03 2021 -0800 usb: dwc3: qcom: Honor wakeup enabled/disabled state commit 2664deb0930643149d61cddbb66ada527ae180bd upstream. The dwc3-qcom currently enables wakeup interrupts unconditionally when suspending, however this should not be done when wakeup is disabled (e.g. through the sysfs attribute power/wakeup). Only enable wakeup interrupts when device_may_wakeup() returns true. Fixes: a4333c3a6ba9 ("usb: dwc3: Add Qualcomm DWC3 glue driver") Reviewed-by: Bjorn Andersson Signed-off-by: Matthias Kaehlcke Cc: stable Link: https://lore.kernel.org/r/20210302103659.v2.1.I44954d9e1169f2cf5c44e6454d357c75ddfa99a2@changeid Signed-off-by: Greg Kroah-Hartman commit f09baa292abc12638a0afe88980832be957075c9 Author: Shawn Guo Date: Mon Mar 1 15:57:45 2021 +0800 usb: dwc3: qcom: add ACPI device id for sc8180x commit 1edbff9c80ed32071fffa7dbaaea507fdb21ff2d upstream. It enables USB Host support for sc8180x ACPI boot, both the standalone one and the one behind URS (USB Role Switch). And they share the the same dwc3_acpi_pdata with sdm845. Signed-off-by: Shawn Guo Link: https://lore.kernel.org/r/20210301075745.20544-1-shawn.guo@linaro.org Cc: stable Signed-off-by: Greg Kroah-Hartman commit 69d4cb11a8492786551b817a8278625add9921f4 Author: Shawn Guo Date: Fri Jan 15 11:50:57 2021 +0800 usb: dwc3: qcom: add URS Host support for sdm845 ACPI boot commit c25c210f590e7a37eecd865d84f97d1f40e39786 upstream. For sdm845 ACPI boot, the URS (USB Role Switch) node in ACPI DSDT table holds the memory resource, while interrupt resources reside in the child nodes USB0 and UFN0. It adds USB0 host support by probing URS node, creating platform device for USB0 node, and then retrieve interrupt resources from USB0 platform device. Reviewed-by: Bjorn Andersson Signed-off-by: Shawn Guo Link: https://lore.kernel.org/r/20210115035057.10994-1-shawn.guo@linaro.org Signed-off-by: Greg Kroah-Hartman commit 78a723baadfef8d3d2e52992b50f41fdd749dc1a Author: Serge Semin Date: Fri Feb 12 23:55:19 2021 +0300 usb: dwc3: qcom: Add missing DWC3 OF node refcount decrement commit 1cffb1c66499a9db9a735473778abf8427d16287 upstream. of_get_child_by_name() increments the reference counter of the OF node it managed to find. So after the code is done using the device node, the refcount must be decremented. Add missing of_node_put() invocation then to the dwc3_qcom_of_register_core() method, since DWC3 OF node is being used only there. Fixes: a4333c3a6ba9 ("usb: dwc3: Add Qualcomm DWC3 glue driver") Signed-off-by: Serge Semin Link: https://lore.kernel.org/r/20210212205521.14280-1-Sergey.Semin@baikalelectronics.ru Cc: stable Signed-off-by: Greg Kroah-Hartman commit ab20ef1ce0fc1149ef0695e62a09d1d0bd68fea9 Author: Ruslan Bilovol Date: Mon Mar 1 13:49:32 2021 +0200 usb: gadget: f_uac1: stop playback on function disable commit cc2ac63d4cf72104e0e7f58bb846121f0f51bb19 upstream. There is missing playback stop/cleanup in case of gadget's ->disable callback that happens on events like USB host resetting or gadget disconnection Fixes: 0591bc236015 ("usb: gadget: add f_uac1 variant based on a new u_audio api") Cc: # 4.13+ Signed-off-by: Ruslan Bilovol Link: https://lore.kernel.org/r/1614599375-8803-3-git-send-email-ruslan.bilovol@gmail.com Signed-off-by: Greg Kroah-Hartman commit 7c5ed451c8b373ff7963513cb43e584f01a49850 Author: Ruslan Bilovol Date: Mon Mar 1 13:49:31 2021 +0200 usb: gadget: f_uac2: always increase endpoint max_packet_size by one audio slot commit 789ea77310f0200c84002884ffd628e2baf3ad8a upstream. As per UAC2 Audio Data Formats spec (2.3.1.1 USB Packets), if the sampling rate is a constant, the allowable variation of number of audio slots per virtual frame is +/- 1 audio slot. It means that endpoint should be able to accept/send +1 audio slot. Previous endpoint max_packet_size calculation code was adding sometimes +1 audio slot due to DIV_ROUND_UP behaviour which was rounding up to closest integer. However this doesn't work if the numbers are divisible. It had no any impact with Linux hosts which ignore this issue, but in case of more strict Windows it caused rejected enumeration Thus always add +1 audio slot to endpoint's max packet size Fixes: 913e4a90b6f9 ("usb: gadget: f_uac2: finalize wMaxPacketSize according to bandwidth") Cc: Peter Chen Cc: #v4.3+ Signed-off-by: Ruslan Bilovol Link: https://lore.kernel.org/r/1614599375-8803-2-git-send-email-ruslan.bilovol@gmail.com Signed-off-by: Greg Kroah-Hartman commit 7d2fb64881e4a80051c6067c0eb3626fdac3fc67 Author: Dan Carpenter Date: Mon Feb 15 15:57:16 2021 +0000 USB: gadget: u_ether: Fix a configfs return code commit 650bf52208d804ad5ee449c58102f8dc43175573 upstream. If the string is invalid, this should return -EINVAL instead of 0. Fixes: 73517cf49bd4 ("usb: gadget: add RNDIS configfs options for class/subclass/protocol") Cc: stable Acked-by: Lorenzo Colitti Signed-off-by: Dan Carpenter Link: https://lore.kernel.org/r/YCqZ3P53yyIg5cn7@mwanda Signed-off-by: Greg Kroah-Hartman commit 8cbbe286dfa01f695a46267cb6293cc844fc7580 Author: Wei Yongjun Date: Fri Mar 5 03:49:27 2021 +0000 USB: gadget: udc: s3c2410_udc: fix return value check in s3c2410_udc_probe() commit 414c20df7d401bcf1cb6c13d2dd944fb53ae4acf upstream. In case of error, the function devm_platform_ioremap_resource() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR(). Fixes: 188db4435ac6 ("usb: gadget: s3c: use platform resources") Cc: stable Reported-by: Hulk Robot Reviewed-by: Arnd Bergmann Reviewed-by: Krzysztof Kozlowski Signed-off-by: Wei Yongjun Link: https://lore.kernel.org/r/20210305034927.3232386-1-weiyongjun1@huawei.com Signed-off-by: Greg Kroah-Hartman commit 12957e93a041191ce3f3ba807c09c65d538b61eb Author: Yorick de Wid Date: Sat Feb 13 15:49:02 2021 +0100 Goodix Fingerprint device is not a modem commit 4d8654e81db7346f915eca9f1aff18f385cab621 upstream. The CDC ACM driver is false matching the Goodix Fingerprint device against the USB_CDC_ACM_PROTO_AT_V25TER. The Goodix Fingerprint device is a biometrics sensor that should be handled in user-space. libfprint has some support for Goodix fingerprint sensors, although not for this particular one. It is possible that the vendor allocates a PID per OEM (Lenovo, Dell etc). If this happens to be the case then more devices from the same vendor could potentially match the ACM modem module table. Signed-off-by: Yorick de Wid Cc: stable Link: https://lore.kernel.org/r/20210213144901.53199-1-ydewid@gmail.com Signed-off-by: Greg Kroah-Hartman commit 836238c6998b08cc8b5c3f6748056ec965eaa02f Author: Paulo Alcantara Date: Mon Mar 8 12:00:50 2021 -0300 cifs: do not send close in compound create+close requests commit 04ad69c342fc4de5bd23be9ef15ea7574fb1a87e upstream. In case of interrupted syscalls, prevent sending CLOSE commands for compound CREATE+CLOSE requests by introducing an CIFS_CP_CREATE_CLOSE_OP flag to indicate lower layers that it should not send a CLOSE command to the MIDs corresponding the compound CREATE+CLOSE request. A simple reproducer: #!/bin/bash mount //server/share /mnt -o username=foo,password=*** tc qdisc add dev eth0 root netem delay 450ms stat -f /mnt &>/dev/null & pid=$! sleep 0.01 kill $pid tc qdisc del dev eth0 root umount /mnt Before patch: ... 6 0.256893470 192.168.122.2 → 192.168.122.15 SMB2 402 Create Request File: ;GetInfo Request FS_INFO/FileFsFullSizeInformation;Close Request 7 0.257144491 192.168.122.15 → 192.168.122.2 SMB2 498 Create Response File: ;GetInfo Response;Close Response 9 0.260798209 192.168.122.2 → 192.168.122.15 SMB2 146 Close Request File: 10 0.260841089 192.168.122.15 → 192.168.122.2 SMB2 130 Close Response, Error: STATUS_FILE_CLOSED Signed-off-by: Paulo Alcantara (SUSE) Reviewed-by: Ronnie Sahlberg Reviewed-by: Aurelien Aptel CC: Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 431e53d76189eb191656cd9aed0b33b931c13e2b Author: Frank Li Date: Wed Mar 3 11:42:48 2021 -0600 mmc: cqhci: Fix random crash when remove mmc module/card commit f06391c45e83f9a731045deb23df7cc3814fd795 upstream. [ 6684.493350] Unable to handle kernel paging request at virtual address ffff800011c5b0f0 [ 6684.498531] mmc0: card 0001 removed [ 6684.501556] Mem abort info: [ 6684.509681] ESR = 0x96000047 [ 6684.512786] EC = 0x25: DABT (current EL), IL = 32 bits [ 6684.518394] SET = 0, FnV = 0 [ 6684.521707] EA = 0, S1PTW = 0 [ 6684.524998] Data abort info: [ 6684.528236] ISV = 0, ISS = 0x00000047 [ 6684.532986] CM = 0, WnR = 1 [ 6684.536129] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000081b22000 [ 6684.543923] [ffff800011c5b0f0] pgd=00000000bffff003, p4d=00000000bffff003, pud=00000000bfffe003, pmd=00000000900e1003, pte=0000000000000000 [ 6684.557915] Internal error: Oops: 96000047 [#1] PREEMPT SMP [ 6684.564240] Modules linked in: sdhci_esdhc_imx(-) sdhci_pltfm sdhci cqhci mmc_block mmc_core fsl_jr_uio caam_jr caamkeyblob_desc caamhash_desc caamalg_desc crypto_engine rng_core authenc libdes crct10dif_ce flexcan can_dev caam error [last unloaded: mmc_core] [ 6684.587281] CPU: 0 PID: 79138 Comm: kworker/0:3H Not tainted 5.10.9-01410-g3ba33182767b-dirty #10 [ 6684.596160] Hardware name: Freescale i.MX8DXL EVK (DT) [ 6684.601320] Workqueue: kblockd blk_mq_run_work_fn [ 6684.606094] pstate: 40000005 (nZcv daif -PAN -UAO -TCO BTYPE=--) [ 6684.612286] pc : cqhci_request+0x148/0x4e8 [cqhci] ^GMessage from syslogd@ at Thu Jan 1 01:51:24 1970 ...[ 6684.617085] lr : cqhci_request+0x314/0x4e8 [cqhci] [ 6684.626734] sp : ffff80001243b9f0 [ 6684.630049] x29: ffff80001243b9f0 x28: ffff00002c3dd000 [ 6684.635367] x27: 0000000000000001 x26: 0000000000000001 [ 6684.640690] x25: ffff00002c451000 x24: 000000000000000f [ 6684.646007] x23: ffff000017e71c80 x22: ffff00002c451000 [ 6684.651326] x21: ffff00002c0f3550 x20: ffff00002c0f3550 [ 6684.656651] x19: ffff000017d46880 x18: ffff00002cea1500 [ 6684.661977] x17: 0000000000000000 x16: 0000000000000000 [ 6684.667294] x15: 000001ee628e3ed1 x14: 0000000000000278 [ 6684.672610] x13: 0000000000000001 x12: 0000000000000001 [ 6684.677927] x11: 0000000000000000 x10: 0000000000000000 [ 6684.683243] x9 : 000000000000002b x8 : 0000000000001000 [ 6684.688560] x7 : 0000000000000010 x6 : ffff00002c0f3678 [ 6684.693886] x5 : 000000000000000f x4 : ffff800011c5b000 [ 6684.699211] x3 : 000000000002d988 x2 : 0000000000000008 [ 6684.704537] x1 : 00000000000000f0 x0 : 0002d9880008102f [ 6684.709854] Call trace: [ 6684.712313] cqhci_request+0x148/0x4e8 [cqhci] [ 6684.716803] mmc_cqe_start_req+0x58/0x68 [mmc_core] [ 6684.721698] mmc_blk_mq_issue_rq+0x460/0x810 [mmc_block] [ 6684.727018] mmc_mq_queue_rq+0x118/0x2b0 [mmc_block] The problem occurs when cqhci_request() get called after cqhci_disable() as it leads to access of allocated memory that has already been freed. Let's fix the problem by calling cqhci_disable() a bit later in the remove path. Signed-off-by: Frank Li Diagnosed-by: Adrian Hunter Acked-by: Adrian Hunter Link: https://lore.kernel.org/r/20210303174248.542175-1-Frank.Li@nxp.com Fixes: f690f4409ddd ("mmc: mmc: Enable CQE's") Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit 8ec6ff65208ae035a109e5d0dd044647d4ec3560 Author: Adrian Hunter Date: Wed Mar 3 11:26:14 2021 +0200 mmc: core: Fix partition switch time for eMMC commit 66fbacccbab91e6e55d9c8f1fc0910a8eb6c81f7 upstream. Avoid the following warning by always defining partition switch time: [ 3.209874] mmc1: unspecified timeout for CMD6 - use generic [ 3.222780] ------------[ cut here ]------------ [ 3.233363] WARNING: CPU: 1 PID: 111 at drivers/mmc/core/mmc_ops.c:575 __mmc_switch+0x200/0x204 Reported-by: Paul Fertser Fixes: 1c447116d017 ("mmc: mmc: Fix partition switch timeout for some eMMCs") Signed-off-by: Adrian Hunter Link: https://lore.kernel.org/r/168bbfd6-0c5b-5ace-ab41-402e7937c46e@intel.com Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit fd3dac97eb4e524210df93b617e2edbdf749c15f Author: Yann Gautier Date: Thu Feb 25 15:54:54 2021 +0100 mmc: mmci: Add MMC_CAP_NEED_RSP_BUSY for the stm32 variants commit 774514bf977377c9137640a0310bd64eed0f7323 upstream. An issue has been observed on STM32MP157C-EV1 board, with an erase command with secure erase argument, ending up waiting for ~4 hours before timeout. The requested busy timeout from the mmc core ends up with 14784000ms (~4 hours), but the supported host->max_busy_timeout is 86767ms, which leads to that the core switch to use an R1 response in favor of the R1B and polls for busy with the host->card_busy() ops. In this case the polling doesn't work as expected, as we never detects that the card stops signaling busy, which leads to the following message: mmc1: Card stuck being busy! __mmc_poll_for_busy The problem boils done to that the stm32 variants can't use R1 responses in favor of R1B responses, as it leads to an internal state machine in the controller to get stuck. To continue to process requests, it would need to be reset. To fix this problem, let's set MMC_CAP_NEED_RSP_BUSY for the stm32 variant, which prevent the mmc core from switching to R1 responses. Additionally, let's cap the cmd->busy_timeout to the host->max_busy_timeout, thus rely on 86767ms to be sufficient (~66 seconds was need for this test case). Fixes: 94fe2580a2f3 ("mmc: core: Enable erase/discard/trim support for all mmc hosts") Signed-off-by: Yann Gautier Link: https://lore.kernel.org/r/20210225145454.12780-1-yann.gautier@foss.st.com Cc: stable@vger.kernel.org [Ulf: Simplified the code and extended the commit message] Signed-off-by: Ulf Hansson Signed-off-by: Greg Kroah-Hartman commit d4029cc53b7ac801008087df250f007fad54f07e Author: Juergen Gross Date: Sat Mar 6 17:18:33 2021 +0100 xen/events: avoid handling the same event on two cpus at the same time commit b6622798bc50b625a1e62f82c7190df40c1f5b21 upstream. When changing the cpu affinity of an event it can happen today that (with some unlucky timing) the same event will be handled on the old and the new cpu at the same time. Avoid that by adding an "event active" flag to the per-event data and call the handler only if this flag isn't set. Cc: stable@vger.kernel.org Reported-by: Julien Grall Signed-off-by: Juergen Gross Reviewed-by: Julien Grall Link: https://lore.kernel.org/r/20210306161833.4552-4-jgross@suse.com Signed-off-by: Boris Ostrovsky Signed-off-by: Greg Kroah-Hartman commit fad86da87b35bfae1f0365e1a536151e276fb0ce Author: Juergen Gross Date: Sat Mar 6 17:18:32 2021 +0100 xen/events: don't unmask an event channel when an eoi is pending commit 25da4618af240fbec6112401498301a6f2bc9702 upstream. An event channel should be kept masked when an eoi is pending for it. When being migrated to another cpu it might be unmasked, though. In order to avoid this keep three different flags for each event channel to be able to distinguish "normal" masking/unmasking from eoi related masking/unmasking and temporary masking. The event channel should only be able to generate an interrupt if all flags are cleared. Cc: stable@vger.kernel.org Fixes: 54c9de89895e ("xen/events: add a new "late EOI" evtchn framework") Reported-by: Julien Grall Signed-off-by: Juergen Gross Reviewed-by: Julien Grall Reviewed-by: Boris Ostrovsky Tested-by: Ross Lagerwall Link: https://lore.kernel.org/r/20210306161833.4552-3-jgross@suse.com Signed-off-by: Greg Kroah-Hartman [boris -- corrected Fixed tag format] Signed-off-by: Boris Ostrovsky commit 37f4ed2941d6652e75cb5d10c59246ad80c1c0b6 Author: Juergen Gross Date: Sat Mar 6 17:18:31 2021 +0100 xen/events: reset affinity of 2-level event when tearing it down commit 9e77d96b8e2724ed00380189f7b0ded61113b39f upstream. When creating a new event channel with 2-level events the affinity needs to be reset initially in order to avoid using an old affinity from earlier usage of the event channel port. So when tearing an event channel down reset all affinity bits. The same applies to the affinity when onlining a vcpu: all old affinity settings for this vcpu must be reset. As percpu events get initialized before the percpu event channel hook is called, resetting of the affinities happens after offlining a vcpu (this is working, as initial percpu memory is zeroed out). Cc: stable@vger.kernel.org Reported-by: Julien Grall Signed-off-by: Juergen Gross Reviewed-by: Julien Grall Link: https://lore.kernel.org/r/20210306161833.4552-2-jgross@suse.com Signed-off-by: Boris Ostrovsky Signed-off-by: Greg Kroah-Hartman commit 2874b3cf9c762437889ad7465545a966e755692b Author: Heikki Krogerus Date: Mon Mar 1 17:30:11 2021 +0300 software node: Fix node registration commit 8891123f9cbb9c1ee531e5a87fa116f0af685c48 upstream. Software node can not be registered before its parent. Fixes: 80488a6b1d3c ("software node: Add support for static node descriptors") Cc: 5.10+ # 5.10+ Signed-off-by: Heikki Krogerus Reviewed-by: Andy Shevchenko Tested-by: Andy Shevchenko Signed-off-by: Rafael J. Wysocki Signed-off-by: Greg Kroah-Hartman commit 7e2f266995531d7fa20dd3911330db0975e1a261 Author: Stefan Haberland Date: Fri Mar 5 13:54:39 2021 +0100 s390/dasd: fix hanging IO request during DASD driver unbind commit 66f669a272898feb1c69b770e1504aa2ec7723d1 upstream. Prevent that an IO request is build during device shutdown initiated by a driver unbind. This request will never be able to be processed or canceled and will hang forever. This will lead also to a hanging unbind. Fix by checking not only if the device is in READY state but also check that there is no device offline initiated before building a new IO request. Fixes: e443343e509a ("s390/dasd: blk-mq conversion") Cc: # v4.14+ Signed-off-by: Stefan Haberland Tested-by: Bjoern Walk Reviewed-by: Jan Hoeppner Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 471fd26dccd7566fbc16b55f2c0e5c2550665356 Author: Stefan Haberland Date: Fri Mar 5 13:54:38 2021 +0100 s390/dasd: fix hanging DASD driver unbind commit 7d365bd0bff3c0310c39ebaffc9a8458e036d666 upstream. In case of an unbind of the DASD device driver the function dasd_generic_remove() is called which shuts down the device. Among others this functions removes the int_handler from the cdev. During shutdown the device cancels all outstanding IO requests and waits for completion of the clear request. Unfortunately the clear interrupt will never be received when there is no interrupt handler connected. Fix by moving the int_handler removal after the call to the state machine where no request or interrupt is outstanding. Cc: stable@vger.kernel.org Signed-off-by: Stefan Haberland Tested-by: Bjoern Walk Reviewed-by: Jan Hoeppner Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit c9038764d8ac13d2f2ea4f8d6198c1cb798fa739 Author: Rob Herring Date: Tue Mar 9 17:44:12 2021 -0700 arm64: perf: Fix 64-bit event counter read truncation commit 7bb8bc6eb550116c504fb25af8678b9d7ca2abc5 upstream. Commit 0fdf1bb75953 ("arm64: perf: Avoid PMXEV* indirection") changed armv8pmu_read_evcntr() to return a u32 instead of u64. The result is silent truncation of the event counter when using 64-bit counters. Given the offending commit appears to have passed thru several folks, it seems likely this was a bad rebase after v8.5 PMU 64-bit counters landed. Cc: Alexandru Elisei Cc: Julien Thierry Cc: Mark Rutland Cc: Will Deacon Cc: Catalin Marinas Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Arnaldo Carvalho de Melo Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: Fixes: 0fdf1bb75953 ("arm64: perf: Avoid PMXEV* indirection") Signed-off-by: Rob Herring Acked-by: Mark Rutland Reviewed-by: Alexandru Elisei Link: https://lore.kernel.org/r/20210310004412.1450128-1-robh@kernel.org Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit d6c36ce9c15f8e4b699a4af56e2636c308369a00 Author: Catalin Marinas Date: Tue Mar 9 12:26:01 2021 +0000 arm64: mte: Map hotplugged memory as Normal Tagged commit d15dfd31384ba3cb93150e5f87661a76fa419f74 upstream. In a system supporting MTE, the linear map must allow reading/writing allocation tags by setting the memory type as Normal Tagged. Currently, this is only handled for memory present at boot. Hotplugged memory uses Normal non-Tagged memory. Introduce pgprot_mhp() for hotplugged memory and use it in add_memory_resource(). The arm64 code maps pgprot_mhp() to pgprot_tagged(). Note that ZONE_DEVICE memory should not be mapped as Tagged and therefore setting the memory type in arch_add_memory() is not feasible. Signed-off-by: Catalin Marinas Fixes: 0178dc761368 ("arm64: mte: Use Normal Tagged attributes for the linear map") Reported-by: Patrick Daly Tested-by: Patrick Daly Link: https://lore.kernel.org/r/1614745263-27827-1-git-send-email-pdaly@codeaurora.org Cc: # 5.10.x Cc: Will Deacon Cc: Andrew Morton Cc: Vincenzo Frascino Cc: David Hildenbrand Reviewed-by: David Hildenbrand Reviewed-by: Vincenzo Frascino Reviewed-by: Anshuman Khandual Link: https://lore.kernel.org/r/20210309122601.5543-1-catalin.marinas@arm.com Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit 0adb91be46b6597f2f8dc6c91d848a09e78882c9 Author: Andrey Konovalov Date: Mon Mar 8 17:10:23 2021 +0100 arm64: kasan: fix page_alloc tagging with DEBUG_VIRTUAL commit 86c83365ab76e4b43cedd3ce07a07d32a4dc79ba upstream. When CONFIG_DEBUG_VIRTUAL is enabled, the default page_to_virt() macro implementation from include/linux/mm.h is used. That definition doesn't account for KASAN tags, which leads to no tags on page_alloc allocations. Provide an arm64-specific definition for page_to_virt() when CONFIG_DEBUG_VIRTUAL is enabled that takes care of KASAN tags. Fixes: 2813b9c02962 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc") Cc: Signed-off-by: Andrey Konovalov Reviewed-by: Catalin Marinas Link: https://lore.kernel.org/r/4b55b35202706223d3118230701c6a59749d9b72.1615219501.git.andreyknvl@google.com Signed-off-by: Will Deacon Signed-off-by: Greg Kroah-Hartman commit 7e0815797656f029fab2edc309406cddf931b9d8 Author: Jan Kara Date: Mon Feb 22 10:48:09 2021 +0100 block: Try to handle busy underlying device on discard commit 56887cffe946bb0a90c74429fa94d6110a73119d upstream. Commit 384d87ef2c95 ("block: Do not discard buffers under a mounted filesystem") made paths issuing discard or zeroout requests to the underlying device try to grab block device in exclusive mode. If that failed we returned EBUSY to userspace. This however caused unexpected fallout in userspace where e.g. FUSE filesystems issue discard requests from userspace daemons although the device is open exclusively by the kernel. Also shrinking of logical volume by LVM issues discard requests to a device which may be claimed exclusively because there's another LV on the same PV. So to avoid these userspace regressions, fall back to invalidate_inode_pages2_range() instead of returning EBUSY to userspace and return EBUSY only of that call fails as well (meaning that there's indeed someone using the particular device range we are trying to discard). Link: https://bugzilla.kernel.org/show_bug.cgi?id=211167 Fixes: 384d87ef2c95 ("block: Do not discard buffers under a mounted filesystem") CC: stable@vger.kernel.org Signed-off-by: Jan Kara Reviewed-by: Christoph Hellwig Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 3cf3d312bab4f2acaa79b9e5b502440764df7285 Author: Shin'ichiro Kawasaki Date: Thu Mar 11 16:25:46 2021 +0900 block: Discard page cache of zone reset target range commit e5113505904ea1c1c0e1f92c1cfa91fbf4da1694 upstream. When zone reset ioctl and data read race for a same zone on zoned block devices, the data read leaves stale page cache even though the zone reset ioctl zero clears all the zone data on the device. To avoid non-zero data read from the stale page cache after zone reset, discard page cache of reset target zones in blkdev_zone_mgmt_ioctl(). Introduce the helper function blkdev_truncate_zone_range() to discard the page cache. Ensure the page cache discarded by calling the helper function before and after zone reset in same manner as fallocate does. This patch can be applied back to the stable kernel version v5.10.y. Rework is needed for older stable kernels. Signed-off-by: Shin'ichiro Kawasaki Fixes: 3ed05a987e0f ("blk-zoned: implement ioctls") Cc: # 5.10+ Reviewed-by: Christoph Hellwig Reviewed-by: Johannes Thumshirn Link: https://lore.kernel.org/r/20210311072546.678999-1-shinichiro.kawasaki@wdc.com Signed-off-by: Jens Axboe Signed-off-by: Greg Kroah-Hartman commit 7fbc077be2f3fad5d75ddfd4b598eeba66459c5d Author: Eric W. Biederman Date: Fri Mar 12 15:07:09 2021 -0600 Revert 95ebabde382c ("capabilities: Don't allow writing ambiguous v3 file capabilities") commit 3b0c2d3eaa83da259d7726192cf55a137769012f upstream. It turns out that there are in fact userspace implementations that care and this recent change caused a regression. https://github.com/containers/buildah/issues/3071 As the motivation for the original change was future development, and the impact is existing real world code just revert this change and allow the ambiguity in v3 file caps. Cc: stable@vger.kernel.org Fixes: 95ebabde382c ("capabilities: Don't allow writing ambiguous v3 file capabilities") Signed-off-by: Eric W. Biederman Signed-off-by: Greg Kroah-Hartman commit 614b31a7409718fd792c11a7a96dc23617585fe5 Author: Beata Michalska Date: Thu Mar 4 15:07:34 2021 +0000 opp: Don't drop extra references to OPPs accidentally commit 606a5d4227e4610399c61086ac55c46068a90b03 upstream. We are required to call dev_pm_opp_put() from outside of the opp_table->lock as debugfs removal needs to happen lock-less to avoid circular dependency issues. commit cf1fac943c63 ("opp: Reduce the size of critical section in _opp_kref_release()") tried to fix that introducing a new routine _opp_get_next() which keeps returning OPPs that can be freed by the callers and this routine shall be called without holding the opp_table->lock. Though the commit overlooked the fact that the OPPs can be referenced by other users as well and this routine will end up dropping references which were taken by other users and hence freeing the OPPs prematurely. In effect, other users of the OPPs will end up having invalid pointers at hand. We didn't see any crash reports earlier as the exact situation never happened, though it is certainly possible. We need a way to mark which OPPs are no longer referenced by the OPP core, so we don't drop extra references to them accidentally. This commit adds another OPP flag, "removed", which is used to track this. And now we should never end up dropping extra references to the OPPs. Cc: v5.11+ # v5.11+ Fixes: cf1fac943c63 ("opp: Reduce the size of critical section in _opp_kref_release()") Signed-off-by: Beata Michalska [ Viresh: Almost rewrote entire patch, added new "removed" field, rewrote commit log and added the correct Fixes tag. ] Co-developed-by: Viresh Kumar Signed-off-by: Viresh Kumar Signed-off-by: Greg Kroah-Hartman commit 569c90def48f1150925818929729dc6ea0e63c75 Author: Pavel Skripkin Date: Tue Mar 9 01:30:57 2021 +0300 ALSA: usb-audio: fix use after free in usb_audio_disconnect commit c5aa956eaeb05fe87e33433d7fd9f5e4d23c7416 upstream. The problem was in wrong "if" placement. chip->quirk_type is freed in snd_card_free_when_closed(), but inside if statement it's accesed. Fixes: 9799110825db ("ALSA: usb-audio: Disable USB autosuspend properly in setup_disable_autosuspend()") Signed-off-by: Pavel Skripkin Cc: Link: https://lore.kernel.org/r/16da19126ff461e5e64a9aec648cce28fb8ed73e.1615242183.git.paskripkin@gmail.com Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit d5fb0e451fa8993f452943844a34bfda4f00161c Author: Pavel Skripkin Date: Tue Mar 9 01:30:36 2021 +0300 ALSA: usb-audio: fix NULL ptr dereference in usb_audio_probe commit 30dea07180de3aa0ad613af88431ef4e34b5ef68 upstream. syzbot reported null pointer dereference in usb_audio_probe. The problem was in case, when quirk == NULL. It's not an error condition, so quirk must be checked before dereferencing. Call Trace: usb_probe_interface+0x315/0x7f0 drivers/usb/core/driver.c:396 really_probe+0x291/0xe60 drivers/base/dd.c:554 driver_probe_device+0x26b/0x3d0 drivers/base/dd.c:740 __device_attach_driver+0x1d1/0x290 drivers/base/dd.c:846 bus_for_each_drv+0x15f/0x1e0 drivers/base/bus.c:431 __device_attach+0x228/0x4a0 drivers/base/dd.c:914 bus_probe_device+0x1e4/0x290 drivers/base/bus.c:491 device_add+0xbdb/0x1db0 drivers/base/core.c:3242 usb_set_configuration+0x113f/0x1910 drivers/usb/core/message.c:2164 usb_generic_driver_probe+0xba/0x100 drivers/usb/core/generic.c:238 usb_probe_device+0xd9/0x2c0 drivers/usb/core/driver.c:293 really_probe+0x291/0xe60 drivers/base/dd.c:554 driver_probe_device+0x26b/0x3d0 drivers/base/dd.c:740 __device_attach_driver+0x1d1/0x290 drivers/base/dd.c:846 bus_for_each_drv+0x15f/0x1e0 drivers/base/bus.c:431 __device_attach+0x228/0x4a0 drivers/base/dd.c:914 bus_probe_device+0x1e4/0x290 drivers/base/bus.c:491 device_add+0xbdb/0x1db0 drivers/base/core.c:3242 usb_new_device.cold+0x721/0x1058 drivers/usb/core/hub.c:2555 hub_port_connect drivers/usb/core/hub.c:5223 [inline] hub_port_connect_change drivers/usb/core/hub.c:5363 [inline] port_event drivers/usb/core/hub.c:5509 [inline] hub_event+0x2357/0x4320 drivers/usb/core/hub.c:5591 process_one_work+0x98d/0x1600 kernel/workqueue.c:2275 worker_thread+0x64c/0x1120 kernel/workqueue.c:2421 kthread+0x3b1/0x4a0 kernel/kthread.c:292 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294 Reported-by: syzbot+719da9b149a931f5143f@syzkaller.appspotmail.com Fixes: 9799110825db ("ALSA: usb-audio: Disable USB autosuspend properly in setup_disable_autosuspend()") Signed-off-by: Pavel Skripkin Cc: Link: https://lore.kernel.org/r/f1ebad6e721412843bd1b12584444c0a63c6b2fb.1615242183.git.paskripkin@gmail.com Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit d846692eb892215e8dd0c37997b3182bc7da22d9 Author: Kai-Heng Feng Date: Thu Mar 4 12:34:16 2021 +0800 ALSA: usb-audio: Disable USB autosuspend properly in setup_disable_autosuspend() commit 9799110825dba087c2bdce886977cf84dada2005 upstream. Rear audio on Lenovo ThinkStation P620 stops working after commit 1965c4364bdd ("ALSA: usb-audio: Disable autosuspend for Lenovo ThinkStation P620"): [ 6.013526] usbcore: registered new interface driver snd-usb-audio [ 6.023064] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x100, wIndex = 0x0, type = 1 [ 6.023083] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x202, wIndex = 0x0, type = 4 [ 6.023090] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x100, wIndex = 0x0, type = 1 [ 6.023098] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x202, wIndex = 0x0, type = 4 [ 6.023103] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x100, wIndex = 0x0, type = 1 [ 6.023110] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x202, wIndex = 0x0, type = 4 [ 6.045846] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x100, wIndex = 0x0, type = 1 [ 6.045866] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x202, wIndex = 0x0, type = 4 [ 6.045877] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x100, wIndex = 0x0, type = 1 [ 6.045886] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x202, wIndex = 0x0, type = 4 [ 6.045894] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x100, wIndex = 0x0, type = 1 [ 6.045908] usb 3-6: cannot get ctl value: req = 0x81, wValue = 0x202, wIndex = 0x0, type = 4 I overlooked the issue because when I was working on the said commit, only the front audio is tested. Apology for that. Changing supports_autosuspend in driver is too late for disabling autosuspend, because it was already used by USB probe routine, so it can break the balance on the following code that depends on supports_autosuspend. Fix it by using usb_disable_autosuspend() helper, and balance the suspend count in disconnect callback. Fixes: 1965c4364bdd ("ALSA: usb-audio: Disable autosuspend for Lenovo ThinkStation P620") Signed-off-by: Kai-Heng Feng Cc: Link: https://lore.kernel.org/r/20210304043419.287191-1-kai.heng.feng@canonical.com Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 05c65afceed81a6360f61b3236bf2e20978ac100 Author: Takashi Iwai Date: Thu Mar 4 09:50:09 2021 +0100 ALSA: usb-audio: Apply the control quirk to Plantronics headsets commit 06abcb18b3a021ba1a3f2020cbefb3ed04e59e72 upstream. Other Plantronics headset models seem requiring the same workaround as C320-M to add the 20ms delay for the control messages, too. Apply the workaround generically for devices with the vendor ID 0x047f. Note that the problem didn't surface before 5.11 just with luck. Since 5.11 got a big code rewrite about the stream handling, the parameter setup procedure has changed, and this seemed triggering the problem more often. BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1182552 Cc: Link: https://lore.kernel.org/r/20210304085009.4770-1-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 1ed48f8b089f51c1ad16c39813844988fb5fa9e9 Author: Takashi Iwai Date: Thu Mar 4 09:30:21 2021 +0100 ALSA: usb-audio: Fix "cannot get freq eq" errors on Dell AE515 sound bar commit fec60c3bc5d1713db2727cdffc638d48f9c07dc3 upstream. Dell AE515 sound bar (413c:a506) spews the error messages when the driver tries to read the current sample frequency, hence it needs to be on the list in snd_usb_get_sample_rate_quirk(). BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=211551 Cc: Link: https://lore.kernel.org/r/20210304083021.2152-1-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 8db51c29fb969a8fbb229b556ac58c5134c88878 Author: Takashi Iwai Date: Wed Mar 10 12:28:08 2021 +0100 ALSA: hda: Avoid spurious unsol event handling during S3/S4 commit 5ff9dde42e8c72ed8102eb8cb62e03f9dc2103ab upstream. When HD-audio bus receives unsolicited events during its system suspend/resume (S3 and S4) phase, the controller driver may still try to process events although the codec chips are already (or yet) powered down. This might screw up the codec communication, resulting in CORB/RIRB errors. Such events should be rather skipped, as the codec chip status such as the jack status will be fully refreshed at the system resume time. Since we're tracking the system suspend/resume state in codec power.power_state field, let's add the check in the common unsol event handler entry point to filter out such events. BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1182377 Tested-by: Abhishek Sahu Cc: # 183ab39eb0ea: ALSA: hda: Initialize power_state Link: https://lore.kernel.org/r/20210310112809.9215-3-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 5d599146decd88ff2bc65c7bfeb55ee85bbe1f13 Author: Takashi Iwai Date: Wed Mar 10 12:28:07 2021 +0100 ALSA: hda: Flush pending unsolicited events before suspend commit 13661fc48461282e43fe8f76bf5bf449b3d40687 upstream. The HD-audio controller driver processes the unsolicited events via its work asynchronously, and this might be pending when the system goes to suspend. When a lengthy event handling like ELD byte reads is running, this might trigger unexpected accesses among suspend/resume procedure, typically seen with Nvidia driver that still requires the handling via unsolicited event verbs for ELD updates. This patch adds the flush of unsol_work to assure that pending events are processed before going into suspend. Buglink: https://bugzilla.suse.com/show_bug.cgi?id=1182377 Reported-and-tested-by: Abhishek Sahu Cc: Link: https://lore.kernel.org/r/20210310112809.9215-2-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit ce0e1780ccb2a30354d18d338622b6e02a0ee127 Author: Takashi Iwai Date: Mon Mar 8 17:07:26 2021 +0100 ALSA: hda: Drop the BATCH workaround for AMD controllers commit 28e96c1693ec1cdc963807611f8b5ad400431e82 upstream. The commit c02f77d32d2c ("ALSA: hda - Workaround for crackled sound on AMD controller (1022:1457)") introduced a few workarounds for the recent AMD HD-audio controller, and one of them is the forced BATCH PCM mode so that PulseAudio avoids the timer-based scheduling. This was thought to cover for some badly working applications, but this actually worsens for more others. In total, this wasn't a good idea to enforce it. This is a partial revert of the commit above for dropping the PCM BATCH enforcement part to recover from the regression again. Fixes: c02f77d32d2c ("ALSA: hda - Workaround for crackled sound on AMD controller (1022:1457)") BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=195303 Cc: Link: https://lore.kernel.org/r/20210308160726.22930-1-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 17ea8fdb019392a91ab2cf70489fd4c5ab2280a8 Author: Simeon Simeonoff Date: Mon Mar 8 20:48:35 2021 +0200 ALSA: hda/ca0132: Add Sound BlasterX AE-5 Plus support commit f15c5c11abfbf8909eb30598315ecbec2311cfdc upstream. The new AE-5 Plus model has a different Subsystem ID compared to the non-plus model. Adding the new id to the list of quirks. Signed-off-by: Simeon Simeonoff Cc: Link: https://lore.kernel.org/r/998cafbe10b648f724ee33570553f2d780a38963.camel@gmail.com Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 40dda525fc6358e2a52abd841548c7ebb8f582e5 Author: Takashi Iwai Date: Sat Mar 6 10:50:18 2021 +0100 ALSA: hda/conexant: Add quirk for mute LED control on HP ZBook G5 commit 56b26497bb4b7ff970612dc25a8a008c34463f7b upstream. The mute and mic-mute LEDs on HP ZBook Studio G5 are controlled via GPIO bits 0x10 and 0x20, respectively, and we need the extra setup for those. As the similar code is already present for other HP models but with different GPIO pins, this patch factors out the common helper code and applies those GPIO values for each model. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=211893 Cc: Link: https://lore.kernel.org/r/20210306095018.11746-1-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit c276c5f1c71a6243db7bcf8f6f5dca2057e8b5b3 Author: Takashi Iwai Date: Wed Mar 10 12:28:09 2021 +0100 ALSA: hda/hdmi: Cancel pending works before suspend commit eea46a0879bcca23e15071f9968c0f6e6596e470 upstream. The per_pin->work might be still floating at the suspend, and this may hit the access to the hardware at an unexpected timing. Cancel the work properly at the suspend callback for avoiding the buggy access. Note that the bug doesn't trigger easily in the recent kernels since the work is queued only when the repoll count is set, and usually it's only at the resume callback, but it's still possible to hit in theory. BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1182377 Reported-and-tested-by: Abhishek Sahu Cc: Link: https://lore.kernel.org/r/20210310112809.9215-4-tiwai@suse.de Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 1c5f1ab8bf8f7059bb8fd8a1f5080e4cb609e5d3 Author: John Ernberg Date: Wed Mar 3 18:14:39 2021 +0000 ALSA: usb: Add Plantronics C320-M USB ctrl msg delay quirk commit fc7c5c208eb7bc2df3a9f4234f14eca250001cb6 upstream. The microphone in the Plantronics C320-M headset will randomly fail to initialize properly, at least when using Microsoft Teams. Introducing a 20ms delay on the control messages appears to resolve the issue. Link: https://gitlab.freedesktop.org/pulseaudio/pulseaudio/-/issues/1065 Tested-by: Andreas Kempe Signed-off-by: John Ernberg Cc: Link: https://lore.kernel.org/r/20210303181405.39835-1-john.ernberg@actia.se Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 575bf63862b3be38c1d013e6c5dbe98d1a97b005 Author: AngeloGioacchino Del Regno Date: Thu Jan 14 23:10:58 2021 +0100 clk: qcom: gpucc-msm8998: Add resets, cxc, fix flags on gpu_gx_gdsc [ Upstream commit a59c16c80bd791878cf81d1d5aae508eeb2e73f1 ] The GPU GX GDSC has GPU_GX_BCR reset and gfx3d_clk CXC, as stated on downstream kernels (and as verified upstream, because otherwise random lockups happen). Also, add PWRSTS_RET and NO_RET_PERIPH: also as found downstream, and also as verified here, to avoid GPU related lockups it is necessary to force retain mem, but *not* peripheral when enabling this GDSC (and, of course, the inverse on disablement). With this change, the GPU finally works flawlessly on my four different MSM8998 devices from two different manufacturers. Signed-off-by: AngeloGioacchino Del Regno Link: https://lore.kernel.org/r/20210114221059.483390-11-angelogioacchino.delregno@somainline.org Signed-off-by: Stephen Boyd Signed-off-by: Sasha Levin commit 98616e88a6038a93007cdaa50d5515fe171521d3 Author: Aleksandr Miloserdov Date: Tue Feb 9 10:22:02 2021 +0300 scsi: target: core: Prevent underflow for service actions [ Upstream commit 14d24e2cc77411301e906a8cf41884739de192de ] TCM buffer length doesn't necessarily equal 8 + ADDITIONAL LENGTH which might be considered an underflow in case of Data-In size being greater than 8 + ADDITIONAL LENGTH. So truncate buffer length to prevent underflow. Link: https://lore.kernel.org/r/20210209072202.41154-3-a.miloserdov@yadro.com Reviewed-by: Roman Bolshakov Reviewed-by: Bodo Stroesser Signed-off-by: Aleksandr Miloserdov Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 202738aea5533f7c945affff76e1815095caf309 Author: Aleksandr Miloserdov Date: Tue Feb 9 10:22:01 2021 +0300 scsi: target: core: Add cmd length set before cmd complete [ Upstream commit 1c73e0c5e54d5f7d77f422a10b03ebe61eaed5ad ] TCM doesn't properly handle underflow case for service actions. One way to prevent it is to always complete command with target_complete_cmd_with_length(), however it requires access to data_sg, which is not always available. This change introduces target_set_cmd_data_length() function which allows to set command data length before completing it. Link: https://lore.kernel.org/r/20210209072202.41154-2-a.miloserdov@yadro.com Reviewed-by: Roman Bolshakov Reviewed-by: Bodo Stroesser Signed-off-by: Aleksandr Miloserdov Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 42ae74da77d49c9df59e3851c8bb8adeeddef6e5 Author: Mike Christie Date: Sat Feb 6 22:46:00 2021 -0600 scsi: libiscsi: Fix iscsi_prep_scsi_cmd_pdu() error handling [ Upstream commit d28d48c699779973ab9a3bd0e5acfa112bd4fdef ] If iscsi_prep_scsi_cmd_pdu() fails we try to add it back to the cmdqueue, but we leave it partially setup. We don't have functions that can undo the pdu and init task setup. We only have cleanup_task which can clean up both parts. So this has us just fail the cmd and go through the standard cleanup routine and then have the SCSI midlayer retry it like is done when it fails in the queuecommand path. Link: https://lore.kernel.org/r/20210207044608.27585-2-michael.christie@oracle.com Reviewed-by: Lee Duncan Signed-off-by: Mike Christie Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 2d0e68226f17d78466fee09cf2df40885d27c7c7 Author: Lin Feng Date: Thu Feb 25 17:20:53 2021 -0800 sysctl.c: fix underflow value setting risk in vm_table [ Upstream commit 3b3376f222e3ab58367d9dd405cafd09d5e37b7c ] Apart from subsystem specific .proc_handler handler, all ctl_tables with extra1 and extra2 members set should use proc_dointvec_minmax instead of proc_dointvec, or the limit set in extra* never work and potentially echo underflow values(negative numbers) is likely make system unstable. Especially vfs_cache_pressure and zone_reclaim_mode, -1 is apparently not a valid value, but we can set to them. And then kernel may crash. # echo -1 > /proc/sys/vm/vfs_cache_pressure Link: https://lkml.kernel.org/r/20201223105535.2875-1-linf@wangsu.com Signed-off-by: Lin Feng Cc: Alexey Dobriyan Cc: "Eric W. Biederman" Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 9f5f4f5a3bd67751972af46eb81c5cfa11c4de74 Author: David Hildenbrand Date: Thu Feb 25 17:17:24 2021 -0800 drivers/base/memory: don't store phys_device in memory blocks [ Upstream commit e9a2e48e8704c9d20a625c6f2357147d03ea7b97 ] No need to store the value for each and every memory block, as we can easily query the value at runtime. Reshuffle the members to optimize the memory layout. Also, let's clarify what the interface once was used for and why it's legacy nowadays. "phys_device" was used on s390x in older versions of lsmem[2]/chmem[3], back when they were still part of s390x-tools. They were later replaced by the variants in linux-utils. For example, RHEL6 and RHEL7 contain lsmem/chmem from s390-utils. RHEL8 switched to versions from util-linux on s390x [4]. "phys_device" was added with sysfs support for memory hotplug in commit 3947be1969a9 ("[PATCH] memory hotplug: sysfs and add/remove functions") in 2005. It always returned 0. s390x started returning something != 0 on some setups (if sclp.rzm is set by HW) in 2010 via commit 57b552ba0b2f ("memory hotplug/s390: set phys_device"). For s390x, it allowed for identifying which memory block devices belong to the same storage increment (RZM). Only if all memory block devices comprising a single storage increment were offline, the memory could actually be removed in the hypervisor. Since commit e5d709bb5fb7 ("s390/memory hotplug: provide memory_block_size_bytes() function") in 2013 a memory block device spans at least one storage increment - which is why the interface isn't really helpful/used anymore (except by old lsmem/chmem tools). There were once RFC patches to make use of "phys_device" in ACPI context; however, the underlying problem could be solved using different interfaces [1]. [1] https://patchwork.kernel.org/patch/2163871/ [2] https://github.com/ibm-s390-tools/s390-tools/blob/v2.1.0/zconf/lsmem [3] https://github.com/ibm-s390-tools/s390-tools/blob/v2.1.0/zconf/chmem [4] https://bugzilla.redhat.com/show_bug.cgi?id=1504134 Link: https://lkml.kernel.org/r/20210201181347.13262-2-david@redhat.com Signed-off-by: David Hildenbrand Acked-by: Michal Hocko Reviewed-by: Oscar Salvador Cc: Dave Hansen Cc: Greg Kroah-Hartman Cc: Gerald Schaefer Cc: Jonathan Corbet Cc: "Rafael J. Wysocki" Cc: Mauro Carvalho Chehab Cc: Ilya Dryomov Cc: Vaibhav Jain Cc: Tom Rix Cc: Geert Uytterhoeven Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 64dd11b8b3028342928d9e53589c24cdadc733f9 Author: Heiko Carstens Date: Wed Feb 17 07:13:02 2021 +0100 s390/smp: __smp_rescan_cpus() - move cpumask away from stack [ Upstream commit 62c8dca9e194326802b43c60763f856d782b225c ] Avoid a potentially large stack frame and overflow by making "cpumask_t avail" a static variable. There is no concurrent access due to the existing locking. Signed-off-by: Heiko Carstens Signed-off-by: Vasily Gorbik Signed-off-by: Sasha Levin commit 7c1e30d4c5503b3ed970970480d4de314953a3a7 Author: Andrey Konovalov Date: Wed Feb 24 12:05:42 2021 -0800 kasan: fix memory corruption in kasan_bitops_tags test [ Upstream commit e66e1799a76621003e5b04c9c057826a2152e103 ] Since the hardware tag-based KASAN mode might not have a redzone that comes after an allocated object (when kasan.mode=prod is enabled), the kasan_bitops_tags() test ends up corrupting the next object in memory. Change the test so it always accesses the redzone that lies within the allocated object's boundaries. Link: https://linux-review.googlesource.com/id/I67f51d1ee48f0a8d0fe2658c2a39e4879fe0832a Link: https://lkml.kernel.org/r/7d452ce4ae35bb1988d2c9244dfea56cf2cc9315.1610733117.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov Reviewed-by: Marco Elver Reviewed-by: Alexander Potapenko Cc: Andrey Ryabinin Cc: Branislav Rankov Cc: Catalin Marinas Cc: Dmitry Vyukov Cc: Evgenii Stepanov Cc: Kevin Brodsky Cc: Peter Collingbourne Cc: Vincenzo Frascino Cc: Will Deacon Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin commit 531bf07268091a3342fb503134fb359c175a031f Author: Keith Busch Date: Mon Jan 4 15:02:58 2021 -0800 PCI/ERR: Retain status from error notification [ Upstream commit 387c72cdd7fb6bef650fb078d0f6ae9682abf631 ] Overwriting the frozen detected status with the result of the link reset loses the NEED_RESET result that drivers are depending on for error handling to report the .slot_reset() callback. Retain this status so that subsequent error handling has the correct flow. Link: https://lore.kernel.org/r/20210104230300.1277180-4-kbusch@kernel.org Reported-by: Hinko Kocevar Tested-by: Hedi Berriche Signed-off-by: Keith Busch Signed-off-by: Bjorn Helgaas Acked-by: Sean V Kelley Acked-by: Hedi Berriche Signed-off-by: Sasha Levin commit 1008fa244e3dea131d7d5476cde475bbe03cb434 Author: Keita Suzuki Date: Fri Oct 30 07:14:30 2020 +0000 i40e: Fix memory leak in i40e_probe [ Upstream commit 58cab46c622d6324e47bd1c533693c94498e4172 ] Struct i40e_veb is allocated in function i40e_setup_pf_switch, and stored to an array field veb inside struct i40e_pf. However when i40e_setup_misc_vector fails, this memory leaks. Fix this by calling exit and teardown functions. Signed-off-by: Keita Suzuki Tested-by: Tony Brelinski Signed-off-by: Tony Nguyen Signed-off-by: Sasha Levin commit aa8119817b7afb357219434be50d8215ffda2329 Author: Geert Uytterhoeven Date: Tue Feb 2 11:03:32 2021 +0100 PCI: Fix pci_register_io_range() memory leak [ Upstream commit f6bda644fa3a7070621c3bf12cd657f69a42f170 ] Kmemleak reports: unreferenced object 0xc328de40 (size 64): comm "kworker/1:1", pid 21, jiffies 4294938212 (age 1484.670s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 e0 d8 fc eb 00 00 00 00 ................ 00 00 10 fe 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [] pci_register_io_range+0x3c/0x80 [<2c7f139e>] of_pci_range_to_resource+0x48/0xc0 [] devm_of_pci_get_host_bridge_resources.constprop.0+0x2ac/0x3ac [] devm_of_pci_bridge_init+0x60/0x1b8 [] devm_pci_alloc_host_bridge+0x54/0x64 [] rcar_pcie_probe+0x2c/0x644 In case a PCI host driver's probe is deferred, the same I/O range may be allocated again, and be ignored, causing a memory leak. Fix this by (a) letting logic_pio_register_range() return -EEXIST if the passed range already exists, so pci_register_io_range() will free it, and by (b) making pci_register_io_range() not consider -EEXIST an error condition. Link: https://lore.kernel.org/r/20210202100332.829047-1-geert+renesas@glider.be Signed-off-by: Geert Uytterhoeven Signed-off-by: Bjorn Helgaas Signed-off-by: Sasha Levin commit b8982da7657b4ec79a49dd1a6d4fc02fab0e38aa Author: Sasha Levin Date: Fri Feb 5 22:50:32 2021 -0500 kbuild: clamp SUBLEVEL to 255 [ Upstream commit 9b82f13e7ef316cdc0a8858f1349f4defce3f9e0 ] Right now if SUBLEVEL becomes larger than 255 it will overflow into the territory of PATCHLEVEL, causing havoc in userspace that tests for specific kernel version. While userspace code tests for MAJOR and PATCHLEVEL, it doesn't test SUBLEVEL at any point as ABI changes don't happen in the context of stable tree. Thus, to avoid overflows, simply clamp SUBLEVEL to it's maximum value in the context of LINUX_VERSION_CODE. This does not affect "make kernelversion" and such. Signed-off-by: Sasha Levin Signed-off-by: Masahiro Yamada Signed-off-by: Sasha Levin commit f20ae52db0f94dfedc3d2b2c746a5ad992f79482 Author: Theodore Ts'o Date: Thu Jan 21 12:33:20 2021 -0500 ext4: don't try to processed freed blocks until mballoc is initialized [ Upstream commit 027f14f5357279655c3ebc6d14daff8368d4f53f ] If we try to make any changes via the journal between when the journal is initialized, but before the multi-block allocated is initialized, we will end up deferencing a NULL pointer when the journal commit callback function calls ext4_process_freed_data(). The proximate cause of this failure was commit 2d01ddc86606 ("ext4: save error info to sb through journal if available") since file system corruption problems detected before the call to ext4_mb_init() would result in a journal commit before we aborted the mount of the file system.... and we would then trigger the NULL pointer deref. Link: https://lore.kernel.org/r/YAm8qH/0oo2ofSMR@mit.edu Reported-by: Murphy Zhou Reviewed-by: Jan Kara Signed-off-by: Theodore Ts'o Signed-off-by: Sasha Levin commit 14901733de5db70b2d1060f6a74ecd964356ab27 Author: Bjorn Helgaas Date: Tue Feb 2 14:17:54 2021 -0600 PCI/LINK: Remove bandwidth notification [ Upstream commit b4c7d2076b4e767dd2e075a2b3a9e57753fc67f5 ] The PCIe Bandwidth Change Notification feature logs messages when the link bandwidth changes. Some users have reported that these messages occur often enough to significantly reduce NVMe performance. GPUs also seem to generate these messages. We don't know why the link bandwidth changes, but in the reported cases there's no indication that it's caused by hardware failures. Remove the bandwidth change notifications for now. Hopefully we can add this back when we have a better understanding of why this happens and how we can make the messages useful instead of overwhelming. Link: https://lore.kernel.org/r/20200115221008.GA191037@google.com/ Link: https://lore.kernel.org/r/155605909349.3575.13433421148215616375.stgit@gimli.home/ Link: https://bugzilla.kernel.org/show_bug.cgi?id=206197 Signed-off-by: Bjorn Helgaas Signed-off-by: Sasha Levin commit 0bcef8312c5f7887d1dcb3a8636797a037264e47 Author: Arnd Bergmann Date: Mon Jan 25 13:45:27 2021 +0100 drivers/base: build kunit tests without structleak plugin [ Upstream commit 38009c766725a9877ea8866fc813a5460011817f ] The structleak plugin causes the stack frame size to grow immensely: drivers/base/test/property-entry-test.c: In function 'pe_test_reference': drivers/base/test/property-entry-test.c:481:1: error: the frame size of 2640 bytes is larger than 2048 bytes [-Werror=frame-larger-than=] 481 | } | ^ drivers/base/test/property-entry-test.c: In function 'pe_test_uints': drivers/base/test/property-entry-test.c:99:1: error: the frame size of 2592 bytes is larger than 2048 bytes [-Werror=frame-larger-than=] Turn it off in this file. Signed-off-by: Arnd Bergmann Link: https://lore.kernel.org/r/20210125124533.101339-3-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit ec574d41d8ef94fb3d745afd5bd72db8b52b87d8 Author: Krzysztof Wilczyński Date: Wed Jan 20 18:48:10 2021 +0000 PCI: mediatek: Add missing of_node_put() to fix reference leak [ Upstream commit 42814c438aac79746d310f413a27d5b0b959c5de ] The for_each_available_child_of_node helper internally makes use of the of_get_next_available_child() which performs an of_node_get() on each iteration when searching for next available child node. Should an available child node be found, then it would return a device node pointer with reference count incremented, thus early return from the middle of the loop requires an explicit of_node_put() to prevent reference count leak. To stop the reference leak, explicitly call of_node_put() before returning after an error occurred. Link: https://lore.kernel.org/r/20210120184810.3068794-1-kw@linux.com Signed-off-by: Krzysztof Wilczyński Signed-off-by: Lorenzo Pieralisi Signed-off-by: Sasha Levin commit 46393d626e2ff2c7379b25bb587275dae72e7c61 Author: Martin Kaiser Date: Fri Jan 15 22:24:35 2021 +0100 PCI: xgene-msi: Fix race in installing chained irq handler [ Upstream commit a93c00e5f975f23592895b7e83f35de2d36b7633 ] Fix a race where a pending interrupt could be received and the handler called before the handler's data has been setup, by converting to irq_set_chained_handler_and_data(). See also 2cf5a03cb29d ("PCI/keystone: Fix race in installing chained IRQ handler"). Based on the mail discussion, it seems ok to drop the error handling. Link: https://lore.kernel.org/r/20210115212435.19940-3-martin@kaiser.cx Signed-off-by: Martin Kaiser Signed-off-by: Lorenzo Pieralisi Signed-off-by: Sasha Levin commit cbeeedcfbacebbc1b84352e1fc33ff80b8917fd6 Author: Ronald Tschalär Date: Fri Feb 19 11:10:51 2021 -0800 Input: applespi - don't wait for responses to commands indefinitely. [ Upstream commit 0ce1ac23149c6da939a5926c098c270c58c317a0 ] The response to a command may never arrive or it may be corrupted (and hence dropped) for some reason. While exceedingly rare, when it did happen it blocked all further commands. One way to fix this was to do a suspend/resume. However, recovering automatically seems like a nicer option. Hence this puts a time limit (1 sec) on how long we're willing to wait for a response, after which we assume it got lost. Signed-off-by: Ronald Tschalär Link: https://lore.kernel.org/r/20210217190718.11035-1-ronald@innovation.ch Signed-off-by: Dmitry Torokhov Signed-off-by: Sasha Levin commit bc6759bd959dc62aea9933f5b54932d44bac80e0 Author: Khalid Aziz Date: Fri Oct 23 11:56:11 2020 -0600 sparc64: Use arch_validate_flags() to validate ADI flag [ Upstream commit 147d8622f2a26ef34beacc60e1ed8b66c2fa457f ] When userspace calls mprotect() to enable ADI on an address range, do_mprotect_pkey() calls arch_validate_prot() to validate new protection flags. arch_validate_prot() for sparc looks at the first VMA associated with address range to verify if ADI can indeed be enabled on this address range. This has two issues - (1) Address range might cover multiple VMAs while arch_validate_prot() looks at only the first VMA, (2) arch_validate_prot() peeks at VMA without holding mmap lock which can result in race condition. arch_validate_flags() from commit c462ac288f2c ("mm: Introduce arch_validate_flags()") allows for VMA flags to be validated for all VMAs that cover the address range given by user while holding mmap lock. This patch updates sparc code to move the VMA check from arch_validate_prot() to arch_validate_flags() to fix above two issues. Suggested-by: Jann Horn Suggested-by: Christoph Hellwig Suggested-by: Catalin Marinas Signed-off-by: Khalid Aziz Reviewed-by: Catalin Marinas Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 9301cf801e4f9206c58d8091aeb91a75d24616ab Author: Andreas Larsson Date: Fri Feb 5 14:20:31 2021 +0100 sparc32: Limit memblock allocation to low memory [ Upstream commit bda166930c37604ffa93f2425426af6921ec575a ] Commit cca079ef8ac29a7c02192d2bad2ffe4c0c5ffdd0 changed sparc32 to use memblocks instead of bootmem, but also made high memory available via memblock allocation which does not work together with e.g. phys_to_virt and can lead to kernel panic. This changes back to only low memory being allocatable in the early stages, now using memblock allocation. Signed-off-by: Andreas Larsson Acked-by: Mike Rapoport Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 3cd74eaa31dd99b593b0fc90d81a8a548e7ec0b0 Author: AngeloGioacchino Del Regno Date: Wed Jan 13 19:38:15 2021 +0100 clk: qcom: gdsc: Implement NO_RET_PERIPH flag [ Upstream commit 785c02eb35009a4be6dbc68f4f7d916e90b7177d ] In some rare occasions, we want to only set the RETAIN_MEM bit, but not the RETAIN_PERIPH one: this is seen on at least SDM630/636/660's GPU-GX GDSC, where unsetting and setting back the RETAIN_PERIPH bit will generate chaos and panics during GPU suspend time (mainly, the chaos is unaligned access). For this reason, introduce a new NO_RET_PERIPH flag to the GDSC driver to address this corner case. Signed-off-by: AngeloGioacchino Del Regno Link: https://lore.kernel.org/r/20210113183817.447866-8-angelogioacchino.delregno@somainline.org Signed-off-by: Stephen Boyd Signed-off-by: Sasha Levin commit a41baef1cb1c7427fb646d9689dd68b096cced1e Author: Suravee Suthikulpanit Date: Mon Feb 8 06:27:12 2021 -0600 iommu/amd: Fix performance counter initialization [ Upstream commit 6778ff5b21bd8e78c8bd547fd66437cf2657fd9b ] Certain AMD platforms enable power gating feature for IOMMU PMC, which prevents the IOMMU driver from updating the counter while trying to validate the PMC functionality in the init_iommu_perf_ctr(). This results in disabling PMC support and the following error message: "AMD-Vi: Unable to read/write to IOMMU perf counter" To workaround this issue, disable power gating temporarily by programming the counter source to non-zero value while validating the counter, and restore the prior state afterward. Signed-off-by: Suravee Suthikulpanit Tested-by: Tj (Elloe Linux) Link: https://lore.kernel.org/r/20210208122712.5048-1-suravee.suthikulpanit@amd.com Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753 Signed-off-by: Joerg Roedel Signed-off-by: Sasha Levin commit 4eb948eff16365d90f07c5d7cd51dd2e943c0a61 Author: Michael Ellerman Date: Wed Feb 10 00:59:20 2021 +1100 powerpc/64: Fix stack trace not displaying final frame [ Upstream commit e3de1e291fa58a1ab0f471a4b458eff2514e4b5f ] In commit bf13718bc57a ("powerpc: show registers when unwinding interrupt frames") we changed our stack dumping logic to show the full registers whenever we find an interrupt frame on the stack. However we didn't notice that on 64-bit this doesn't show the final frame, ie. the interrupt that brought us in from userspace, whereas on 32-bit it does. That is due to confusion about the size of that last frame. The code in show_stack() calls validate_sp(), passing it STACK_INT_FRAME_SIZE to check the sp is at least that far below the top of the stack. However on 64-bit that size is too large for the final frame, because it includes the red zone, but we don't allocate a red zone for the first frame. So add a new define that encodes the correct size for 32-bit and 64-bit, and use it in show_stack(). This results in the full trace being shown on 64-bit, eg: sysrq: Trigger a crash Kernel panic - not syncing: sysrq triggered crash CPU: 0 PID: 83 Comm: sh Not tainted 5.11.0-rc2-gcc-8.2.0-00188-g571abcb96b10-dirty #649 Call Trace: [c00000000a1c3ac0] [c000000000897b70] dump_stack+0xc4/0x114 (unreliable) [c00000000a1c3b00] [c00000000014334c] panic+0x178/0x41c [c00000000a1c3ba0] [c00000000094e600] sysrq_handle_crash+0x40/0x50 [c00000000a1c3c00] [c00000000094ef98] __handle_sysrq+0xd8/0x210 [c00000000a1c3ca0] [c00000000094f820] write_sysrq_trigger+0x100/0x188 [c00000000a1c3ce0] [c0000000005559dc] proc_reg_write+0x10c/0x1b0 [c00000000a1c3d10] [c000000000479950] vfs_write+0xf0/0x360 [c00000000a1c3d60] [c000000000479d9c] ksys_write+0x7c/0x140 [c00000000a1c3db0] [c00000000002bf5c] system_call_exception+0x19c/0x2c0 [c00000000a1c3e10] [c00000000000d35c] system_call_common+0xec/0x278 --- interrupt: c00 at 0x7fff9fbab428 NIP: 00007fff9fbab428 LR: 000000001000b724 CTR: 0000000000000000 REGS: c00000000a1c3e80 TRAP: 0c00 Not tainted (5.11.0-rc2-gcc-8.2.0-00188-g571abcb96b10-dirty) MSR: 900000000280f033 CR: 22002884 XER: 00000000 IRQMASK: 0 GPR00: 0000000000000004 00007fffc3cb8960 00007fff9fc59900 0000000000000001 GPR04: 000000002a4b32d0 0000000000000002 0000000000000063 0000000000000063 GPR08: 000000002a4b32d0 0000000000000000 0000000000000000 0000000000000000 GPR12: 0000000000000000 00007fff9fcca9a0 0000000000000000 0000000000000000 GPR16: 0000000000000000 0000000000000000 0000000000000000 00000000100b8fd0 GPR20: 000000002a4b3485 00000000100b8f90 0000000000000000 0000000000000000 GPR24: 000000002a4b0440 00000000100e77b8 0000000000000020 000000002a4b32d0 GPR28: 0000000000000001 0000000000000002 000000002a4b32d0 0000000000000001 NIP [00007fff9fbab428] 0x7fff9fbab428 LR [000000001000b724] 0x1000b724 --- interrupt: c00 Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20210209141627.2898485-1-mpe@ellerman.id.au Signed-off-by: Sasha Levin commit 16a18856f9c99463b7af2060ee4b1cd6e84a4bf7 Author: Filipe Laíns Date: Sat Jan 23 18:02:20 2021 +0000 HID: logitech-dj: add support for the new lightspeed connection iteration [ Upstream commit fab3a95654eea01d6b0204995be8b7492a00d001 ] This new connection type is the new iteration of the Lightspeed connection and will probably be used in some of the newer gaming devices. It is currently use in the G Pro X Superlight. This patch should be backported to older versions, as currently the driver will panic when seing the unsupported connection. This isn't an issue when using the receiver that came with the device, as Logitech has been using different PIDs when they change the connection type, but is an issue when using a generic receiver (well, generic Lightspeed receiver), which is the case of the one in the Powerplay mat. Currently, the only generic Ligthspeed receiver we support, and the only one that exists AFAIK, is ther Powerplay. As it stands, the driver will panic when seeing a G Pro X Superlight connected to the Powerplay receiver and won't send any input events to userspace! The kernel will warn about this so the issue should be easy to identify, but it is still very worrying how hard it will fail :( [915977.398471] logitech-djreceiver 0003:046D:C53A.0107: unusable device of type UNKNOWN (0x0f) connected on slot 1 Signed-off-by: Filipe Laíns Signed-off-by: Jiri Kosina Signed-off-by: Sasha Levin commit 8f4e4316c6aee8215b6c07e023f7f93c751cbaee Author: Athira Rajeev Date: Fri Feb 5 04:14:52 2021 -0500 powerpc/perf: Record counter overflow always if SAMPLE_IP is unset [ Upstream commit d137845c973147a22622cc76c7b0bc16f6206323 ] While sampling for marked events, currently we record the sample only if the SIAR valid bit of Sampled Instruction Event Register (SIER) is set. SIAR_VALID bit is used for fetching the instruction address from Sampled Instruction Address Register(SIAR). But there are some usecases, where the user is interested only in the PMU stats at each counter overflow and the exact IP of the overflow event is not required. Dropping SIAR invalid samples will fail to record some of the counter overflows in such cases. Example of such usecase is dumping the PMU stats (event counts) after some regular amount of instructions/events from the userspace (ex: via ptrace). Here counter overflow is indicated to userspace via signal handler, and captured by monitoring and enabling I/O signaling on the event file descriptor. In these cases, we expect to get sample/overflow indication after each specified sample_period. Perf event attribute will not have PERF_SAMPLE_IP set in the sample_type if exact IP of the overflow event is not requested. So while profiling if SAMPLE_IP is not set, just record the counter overflow irrespective of SIAR_VALID check. Suggested-by: Michael Ellerman Signed-off-by: Athira Rajeev [mpe: Reflow comment and if formatting] Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/1612516492-1428-1-git-send-email-atrajeev@linux.vnet.ibm.com Signed-off-by: Sasha Levin commit 23581f5a24864dd24edb1f3aa16df894f7370c27 Author: Nicholas Piggin Date: Sat Jan 30 23:08:35 2021 +1000 powerpc: improve handling of unrecoverable system reset [ Upstream commit 11cb0a25f71818ca7ab4856548ecfd83c169aa4d ] If an unrecoverable system reset hits in process context, the system does not have to panic. Similar to machine check, call nmi_exit() before die(). Signed-off-by: Nicholas Piggin Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20210130130852.2952424-26-npiggin@gmail.com Signed-off-by: Sasha Levin commit b9bd02fa6bf9f28931c721b61de14c234305d1a7 Author: Alain Volmat Date: Fri Feb 5 19:59:32 2021 +0100 spi: stm32: make spurious and overrun interrupts visible [ Upstream commit c64e7efe46b7de21937ef4b3594d9b1fc74f07df ] We do not expect to receive spurious interrupts so rise a warning if it happens. RX overrun is an error condition that signals a corrupted RX stream both in dma and in irq modes. Report the error and abort the transfer in either cases. Signed-off-by: Alain Volmat Link: https://lore.kernel.org/r/1612551572-495-9-git-send-email-alain.volmat@foss.st.com Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 326aaaa472c94a84a5b9da128039417041ee9270 Author: Oliver O'Halloran Date: Tue Nov 3 15:35:06 2020 +1100 powerpc/pci: Add ppc_md.discover_phbs() [ Upstream commit 5537fcb319d016ce387f818dd774179bc03217f5 ] On many powerpc platforms the discovery and initalisation of pci_controllers (PHBs) happens inside of setup_arch(). This is very early in boot (pre-initcalls) and means that we're initialising the PHB long before many basic kernel services (slab allocator, debugfs, a real ioremap) are available. On PowerNV this causes an additional problem since we map the PHB registers with ioremap(). As of commit d538aadc2718 ("powerpc/ioremap: warn on early use of ioremap()") a warning is printed because we're using the "incorrect" API to setup and MMIO mapping in searly boot. The kernel does provide early_ioremap(), but that is not intended to create long-lived MMIO mappings and a seperate warning is printed by generic code if early_ioremap() mappings are "leaked." This is all fixable with dumb hacks like using early_ioremap() to setup the initial mapping then replacing it with a real ioremap later on in boot, but it does raise the question: Why the hell are we setting up the PHB's this early in boot? The old and wise claim it's due to "hysterical rasins." Aside from amused grapes there doesn't appear to be any real reason to maintain the current behaviour. Already most of the newer embedded platforms perform PHB discovery in an arch_initcall and between the end of setup_arch() and the start of initcalls none of the generic kernel code does anything PCI related. On powerpc scanning PHBs occurs in a subsys_initcall so it should be possible to move the PHB discovery to a core, postcore or arch initcall. This patch adds the ppc_md.discover_phbs hook and a core_initcall stub that calls it. The core_initcalls are the earliest to be called so this will any possibly issues with dependency between initcalls. This isn't just an academic issue either since on pseries and PowerNV EEH init occurs in an arch_initcall and depends on the pci_controllers being available, similarly the creation of pci_dns occurs at core_initcall_sync (i.e. between core and postcore initcalls). These problems need to be addressed seperately. Reported-by: kernel test robot Signed-off-by: Oliver O'Halloran [mpe: Make discover_phbs() static] Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20201103043523.916109-1-oohall@gmail.com Signed-off-by: Sasha Levin commit aa4e53a7c90ef7f53a8acc8389480e2a985bd431 Author: Lubomir Rintel Date: Tue Jan 26 08:37:38 2021 +0100 Platform: OLPC: Fix probe error handling [ Upstream commit cec551ea0d41c679ed11d758e1a386e20285b29d ] Reset ec_priv if probe ends unsuccessfully. Signed-off-by: Lubomir Rintel Link: https://lore.kernel.org/r/20210126073740.10232-2-lkundrak@v3.sk Reviewed-by: Hans de Goede Signed-off-by: Hans de Goede Signed-off-by: Sasha Levin commit 02dcad5adca9ee88254ad52000fc6b1e83ed69fd Author: Pan Bian Date: Wed Jan 20 20:50:05 2021 -0800 platform/x86: amd-pmc: put device on error paths [ Upstream commit 745ed17a04f966406c8c27c8f992544336c06013 ] Put the PCI device rdev on error paths to fix potential reference count leaks. Signed-off-by: Pan Bian Link: https://lore.kernel.org/r/20210121045005.73342-1-bianpan2016@163.com Reviewed-by: Hans de Goede Signed-off-by: Hans de Goede Signed-off-by: Sasha Levin commit 9535025c428cb6cb33ed6549340622d86eaf5581 Author: Jeremy Linton Date: Tue Jan 19 18:04:06 2021 -0600 mmc: sdhci-iproc: Add ACPI bindings for the RPi [ Upstream commit 4f9833d3ec8da34861cd0680b00c73e653877eb9 ] The RPi4 has an Arasan controller it carries over from the RPi3 and a newer eMMC2 controller. Because of a couple of quirks, it seems wiser to bind these controllers to the same driver that DT is using on this platform rather than the generic sdhci_acpi driver with PNP0D40. So, BCM2847 describes the older Arasan and BRCME88C describes the newer eMMC2. The older Arasan is reusing an existing ACPI _HID used by other OSes booting these tables on the RPi. With this change, Linux is capable of utilizing the SD card slot, and the Wi-Fi when booted with UEFI+ACPI on the RPi4. Signed-off-by: Jeremy Linton Acked-by: Florian Fainelli Link: https://lore.kernel.org/r/20210120000406.1843400-2-jeremy.linton@arm.com Signed-off-by: Ulf Hansson Signed-off-by: Sasha Levin commit cc98f74c9be1affd687936d5ecd5057b66759472 Author: Chaotian Jing Date: Fri Dec 18 15:16:11 2020 +0800 mmc: mediatek: fix race condition between msdc_request_timeout and irq [ Upstream commit 0354ca6edd464a2cf332f390581977b8699ed081 ] when get request SW timeout, if CMD/DAT xfer done irq coming right now, then there is race between the msdc_request_timeout work and irq handler, and the host->cmd and host->data may set to NULL in irq handler. also, current flow ensure that only one path can go to msdc_request_done(), so no need check the return value of cancel_delayed_work(). Signed-off-by: Chaotian Jing Link: https://lore.kernel.org/r/20201218071611.12276-1-chaotian.jing@mediatek.com Signed-off-by: Ulf Hansson Signed-off-by: Sasha Levin commit 697f819dd81ce9476581d816537440e1532e92fd Author: Christophe JAILLET Date: Tue Dec 8 21:35:27 2020 +0100 mmc: mxs-mmc: Fix a resource leak in an error handling path in 'mxs_mmc_probe()' [ Upstream commit 0bb7e560f821c7770973a94e346654c4bdccd42c ] If 'mmc_of_parse()' fails, we must undo the previous 'dma_request_chan()' call. Signed-off-by: Christophe JAILLET Link: https://lore.kernel.org/r/20201208203527.49262-1-christophe.jaillet@wanadoo.fr Signed-off-by: Ulf Hansson Signed-off-by: Sasha Levin commit 85269a7cda5d6c29dfbcc7222f01c85dc3642938 Author: Lu Baolu Date: Tue Jan 26 16:07:29 2021 +0800 iommu/vt-d: Clear PRQ overflow only when PRQ is empty [ Upstream commit 28a77185f1cd0650b664f54614143aaaa3a7a615 ] It is incorrect to always clear PRO when it's set w/o first checking whether the overflow condition has been cleared. Current code assumes that if an overflow condition occurs it must have been cleared by earlier loop. However since the code runs in a threaded context, the overflow condition could occur even after setting the head to the tail under some extreme condition. To be sane, we should read both head/tail again when seeing a pending PRO and only clear PRO after all pending PRs have been handled. Suggested-by: Kevin Tian Signed-off-by: Lu Baolu Link: https://lore.kernel.org/linux-iommu/MWHPR11MB18862D2EA5BD432BF22D99A48CA09@MWHPR11MB1886.namprd11.prod.outlook.com/ Link: https://lore.kernel.org/r/20210126080730.2232859-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel Signed-off-by: Sasha Levin commit f72672c290b4c5521c1b09d266c7666f43a6f03d Author: Steven J. Magnani Date: Thu Jan 7 17:41:16 2021 -0600 udf: fix silent AED tagLocation corruption [ Upstream commit 63c9e47a1642fc817654a1bc18a6ec4bbcc0f056 ] When extending a file, udf_do_extend_file() may enter following empty indirect extent. At the end of udf_do_extend_file() we revert prev_epos to point to the last written extent. However if we end up not adding any further extent in udf_do_extend_file(), the reverting points prev_epos into the header area of the AED and following updates of the extents (in udf_update_extents()) will corrupt the header. Make sure that we do not follow indirect extent if we are not going to add any more extents so that returning back to the last written extent works correctly. Link: https://lore.kernel.org/r/20210107234116.6190-2-magnani@ieee.org Signed-off-by: Steven J. Magnani Signed-off-by: Jan Kara Signed-off-by: Sasha Levin commit 69c732112b069c3fa8d8e5ee697d7a471be2c0fc Author: Can Guo Date: Wed Jan 20 02:04:21 2021 -0800 scsi: ufs: Protect some contexts from unexpected clock scaling [ Upstream commit 0e9d4ca43ba8112821397f56a26d20682001c011 ] In contexts like suspend, shutdown, and error handling we need to suspend devfreq to make sure these contexts won't be disturbed by clock scaling. However, suspending devfreq is not enough since users can still trigger a clock scaling by manipulating the devfreq sysfs nodes like min/max_freq and governor even after devfreq is suspended. Moreover, mere suspending devfreq cannot synchroinze a clock scaling which has already been invoked through these sysfs nodes. Add one more flag in struct clk_scaling and wrap the entire func ufshcd_devfreq_scale() with the clk_scaling_lock, so that we can use this flag and clk_scaling_lock to control and synchronize clock scaling invoked through devfreq sysfs nodes. Link: https://lore.kernel.org/r/1611137065-14266-2-git-send-email-cang@codeaurora.org Reviewed-by: Stanley Chu Signed-off-by: Can Guo Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit a35887ef14630fa3eaef600fa35aab217b19a45d Author: Jaegeuk Kim Date: Mon Jan 11 01:59:27 2021 -0800 scsi: ufs: WB is only available on LUN #0 to #7 [ Upstream commit a2fca52ee640a04112ed9d9a137c940ea6ad288e ] Kernel stack violation when getting unit_descriptor/wb_buf_alloc_units from rpmb LUN. The reason is that the unit descriptor length is different per LU. The length of Normal LU is 45 while the one of rpmb LU is 35. int ufshcd_read_desc_param(struct ufs_hba *hba, ...) { param_offset=41; param_size=4; buff_len=45; ... buff_len=35 by rpmb LU; if (is_kmalloc) { /* Make sure we don't copy more data than available */ if (param_offset + param_size > buff_len) param_size = buff_len - param_offset; --> param_size = 250; memcpy(param_read_buf, &desc_buf[param_offset], param_size); --> memcpy(param_read_buf, desc_buf+41, 250); [ 141.868974][ T9174] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: wb_buf_alloc_units_show+0x11c/0x11c } } Link: https://lore.kernel.org/r/20210111095927.1830311-1-jaegeuk@kernel.org Reviewed-by: Avri Altman Signed-off-by: Jaegeuk Kim Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 8fc2b8ddc086cd1cb05f4e3f5af1d518782417f6 Author: akshatzen Date: Sat Jan 9 18:08:45 2021 +0530 scsi: pm80xx: Fix missing tag_free in NVMD DATA req [ Upstream commit 5d28026891c7041deec08cc5ddd8f3abd90195e1 ] Tag was not freed in NVMD get/set data request failure scenario. This caused a tag leak each time a request failed. Link: https://lore.kernel.org/r/20210109123849.17098-5-Viswas.G@microchip.com Acked-by: Jack Wang Signed-off-by: akshatzen Signed-off-by: Viswas G Signed-off-by: Ruksar Devadi Signed-off-by: Radha Ramachandran Signed-off-by: Martin K. Petersen Signed-off-by: Sasha Levin commit 3e635b8184700132a9d96b9a0febbcdfa3e4a44b Author: Wolfram Sang Date: Wed Dec 23 18:21:52 2020 +0100 i2c: rcar: optimize cacheline to minimize HW race condition [ Upstream commit 25c2e0fb5fefb8d7847214cf114d94c7aad8e9ce ] 'flags' and 'io' are needed first, so they should be at the beginning of the private struct. Signed-off-by: Wolfram Sang Reviewed-by: Niklas Söderlund Signed-off-by: Wolfram Sang Signed-off-by: Sasha Levin commit 131ccd1c676b9400094f845448307951b78b07f6 Author: Wolfram Sang Date: Wed Dec 23 18:21:51 2020 +0100 i2c: rcar: faster irq code to minimize HW race condition [ Upstream commit c7b514ec979e23a08c411f3d8ed39c7922751422 ] To avoid the HW race condition on R-Car Gen2 and earlier, we need to write to ICMCR as soon as possible in the interrupt handler. We can improve this by writing a static value instead of masking out bits. Signed-off-by: Wolfram Sang Reviewed-by: Niklas Söderlund Signed-off-by: Wolfram Sang Signed-off-by: Sasha Levin commit 8a2d54228d5a3904fc3653ee306743e74ec1f7d2 Author: Florian Westphal Date: Thu Mar 4 13:32:08 2021 -0800 mptcp: reset last_snd on subflow close [ Upstream commit e0be4931f3fee2e04dec4013ea4f27ec2db8556f ] Send logic caches last active subflow in the msk, so it needs to be cleared when the cached subflow is closed. Fixes: d5f49190def61c ("mptcp: allow picking different xmit subflows") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/155 Reported-by: Christoph Paasch Acked-by: Paolo Abeni Reviewed-by: Matthieu Baerts Signed-off-by: Florian Westphal Signed-off-by: Mat Martineau Signed-off-by: David S. Miller Signed-off-by: Sasha Levin commit 862e90a3eeda09afe4aab65fc3b1356534628258 Author: Paolo Abeni Date: Wed Jan 20 15:39:10 2021 +0100 mptcp: always graft subflow socket to parent [ Upstream commit 866f26f2a9c33bc70eb0f07ffc37fd9424ffe501 ] Currently, incoming subflows link to the parent socket, while outgoing ones link to a per subflow socket. The latter is not really needed, except at the initial connect() time and for the first subflow. Always graft the outgoing subflow to the parent socket and free the unneeded ones early. This allows some code cleanup, reduces the amount of memory used and will simplify the next patch Reviewed-by: Mat Martineau Signed-off-by: Paolo Abeni Signed-off-by: Jakub Kicinski Signed-off-by: Sasha Levin commit f4d509c9978c586161e262481bc9ae59906bd706 Author: Thomas Bogendoerfer Date: Mon Mar 8 10:24:47 2021 +0100 MIPS: kernel: Reserve exception base early to prevent corruption [ Upstream commit bd67b711bfaa02cf19e88aa2d9edae5c1c1d2739 ] BMIPS is one of the few platforms that do change the exception base. After commit 2dcb39645441 ("memblock: do not start bottom-up allocations with kernel_end") we started seeing BMIPS boards fail to boot with the built-in FDT being corrupted. Before the cited commit, early allocations would be in the [kernel_end, RAM_END] range, but after commit they would be within [RAM_START + PAGE_SIZE, RAM_END]. The custom exception base handler that is installed by bmips_ebase_setup() done for BMIPS5000 CPUs ends-up trampling on the memory region allocated by unflatten_and_copy_device_tree() thus corrupting the FDT used by the kernel. To fix this, we need to perform an early reservation of the custom exception space. Additional we reserve the first 4k (1k for R3k) for either normal exception vector space (legacy CPUs) or special vectors like cache exceptions. Huge thanks to Serge for analysing and proposing a solution to this issue. Fixes: 2dcb39645441 ("memblock: do not start bottom-up allocations with kernel_end") Reported-by: Kamal Dasu Debugged-by: Serge Semin Acked-by: Mike Rapoport Tested-by: Florian Fainelli Reviewed-by: Serge Semin Signed-off-by: Thomas Bogendoerfer Signed-off-by: Sasha Levin commit d7452e9c026d05a62e77adee88f962240c4a9087 Author: Hans Verkuil Date: Fri Feb 26 11:37:47 2021 +0100 media: rc: compile rc-cec.c into rc-core commit f09f9f93afad770a04b35235a0aa465fcc8d6e3d upstream. The rc-cec keymap is unusual in that it can't be built as a module, instead it is registered directly in rc-main.c if CONFIG_MEDIA_CEC_RC is set. This is because it can be called from drm_dp_cec_set_edid() via cec_register_adapter() in an asynchronous context, and it is not allowed to use request_module() to load rc-cec.ko in that case. Trying to do so results in a 'WARN_ON_ONCE(wait && current_is_async())'. Since this keymap is only used if CONFIG_MEDIA_CEC_RC is set, we just compile this keymap into the rc-core module and never as a separate module. Signed-off-by: Hans Verkuil Fixes: 2c6d1fffa1d9 (drm: add support for DisplayPort CEC-Tunneling-over-AUX) Reported-by: Hans de Goede Signed-off-by: Sean Young Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit 0ff5612a81f31fbf4c87757cf7d052575ab84454 Author: Biju Das Date: Mon Mar 1 13:08:27 2021 +0100 media: v4l: vsp1: Fix bru null pointer access commit ac8d82f586c8692b501cb974604a71ef0e22a04c upstream. RZ/G2L SoC has only BRS. This patch fixes null pointer access,when only BRS is enabled. Fixes: cbb7fa49c7466("media: v4l: vsp1: Rename BRU to BRx") Signed-off-by: Biju Das Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit 3ab804278382f6b118f260446b23e5ae684e59c8 Author: Biju Das Date: Mon Mar 1 13:08:28 2021 +0100 media: v4l: vsp1: Fix uif null pointer access commit 6732f313938027a910e1f7351951ff52c0329e70 upstream. RZ/G2L SoC has no UIF. This patch fixes null pointer access, when UIF module is not used. Fixes: 5e824f989e6e8("media: v4l: vsp1: Integrate DISCOM in display pipeline") Signed-off-by: Biju Das Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit 0c2b5e1a5cb487d20214849d6ea75afdbcea1953 Author: Dafna Hirschfeld Date: Mon Mar 1 18:18:35 2021 +0100 media: rkisp1: params: fix wrong bits settings commit 2025a48cfd92d541c5ee47deee97f8a46d00c4ac upstream. The histogram mode is set using 'rkisp1_params_set_bits'. Only the bits of the mode should be the value argument for that function. Otherwise bits outside the mode mask are turned on which is not what was intended. Fixes: bae1155cf579 ("media: staging: rkisp1: add output device for parameters") Signed-off-by: Dafna Hirschfeld Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit a678d679c17e26ea9109c1f4f2f0d36d23474788 Author: Maxim Mikityanskiy Date: Fri Feb 5 23:51:39 2021 +0100 media: usbtv: Fix deadlock on suspend commit 8a7e27fd5cd696ba564a3f62cedef7269cfd0723 upstream. usbtv doesn't support power management, so on system suspend the .disconnect callback of the driver is called. The teardown sequence includes a call to snd_card_free. Its implementation waits until the refcount of the sound card device drops to zero, however, if its file is open, snd_card_file_add takes a reference, which can't be dropped during the suspend, because the userspace processes are already frozen at this point. snd_card_free waits for completion forever, leading to a hang on suspend. This commit fixes this deadlock condition by replacing snd_card_free with snd_card_free_when_closed, that doesn't wait until all references are released, allowing suspend to progress. Fixes: 63ddf68de52e ("[media] usbtv: add audio support") Signed-off-by: Maxim Mikityanskiy Signed-off-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit 6b8f84ceb4fcce63fe8ce879c517f143eee6195f Author: Sergey Shtylyov Date: Sun Feb 28 23:27:32 2021 +0300 sh_eth: fix TRSCER mask for R7S9210 commit 165bc5a4f30eee4735845aa7dbd6b738643f2603 upstream. According to the RZ/A2M Group User's Manual: Hardware, Rev. 2.00, the TRSCER register has bit 9 reserved, hence we can't use the driver's default TRSCER mask. Add the explicit initializer for sh_eth_cpu_data:: trscer_err_mask for R7S9210. Fixes: 6e0bb04d0e4f ("sh_eth: Add R7S9210 support") Signed-off-by: Sergey Shtylyov Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e90d53746d66f58eb5a99cdd499d113d95ae2513 Author: Colin Ian King Date: Thu Mar 4 09:49:28 2021 +0000 qxl: Fix uninitialised struct field head.surface_id commit 738acd49eb018feb873e0fac8f9517493f6ce2c7 upstream. The surface_id struct field in head is not being initialized and static analysis warns that this is being passed through to dev->monitors_config->heads[i] on an assignment. Clear up this warning by initializing it to zero. Addresses-Coverity: ("Uninitialized scalar variable") Fixes: a6d3c4d79822 ("qxl: hook monitors_config updates into crtc, not encoder.") Signed-off-by: Colin Ian King Link: http://patchwork.freedesktop.org/patch/msgid/20210304094928.2280722-1-colin.king@canonical.com Signed-off-by: Gerd Hoffmann Signed-off-by: Maarten Lankhorst Signed-off-by: Greg Kroah-Hartman commit 0a269eb9979ee6aa46de507f962949bc8ad6c7ec Author: Wang Qing Date: Mon Mar 1 20:08:21 2021 +0800 s390/crypto: return -EFAULT if copy_to_user() fails commit 942df4be7ab40195e2a839e9de81951a5862bc5b upstream. The copy_to_user() function returns the number of bytes remaining to be copied, but we want to return -EFAULT if the copy doesn't complete. Fixes: e06670c5fe3b ("s390: vfio-ap: implement VFIO_DEVICE_GET_INFO ioctl") Signed-off-by: Wang Qing Reviewed-by: Tony Krowiak Signed-off-by: Heiko Carstens Link: https://lore.kernel.org/r/1614600502-16714-1-git-send-email-wangqing@vivo.com Signed-off-by: Heiko Carstens Signed-off-by: Greg Kroah-Hartman commit 8c64644d3db2767661ff38a31fa3b900e4154de5 Author: Eric Farman Date: Mon Mar 1 19:33:24 2021 +0100 s390/cio: return -EFAULT if copy_to_user() fails commit d9c48a948d29bcb22f4fe61a81b718ef6de561a0 upstream. Fixes: 120e214e504f ("vfio: ccw: realize VFIO_DEVICE_G(S)ET_IRQ_INFO ioctls") Signed-off-by: Eric Farman Signed-off-by: Heiko Carstens Signed-off-by: Greg Kroah-Hartman commit bc8ceb4724afac4191fecbac87f2a7f3539444f9 Author: Tvrtko Ursulin Date: Tue Mar 2 11:42:13 2021 +0000 drm/i915: Wedge the GPU if command parser setup fails commit a829f033e966d5e4aa27c3ef2b381f51734e4a7f upstream. Commit 311a50e76a33 ("drm/i915: Add support for mandatory cmdparsing") introduced mandatory command parsing but setup failures were not translated into wedging the GPU which was probably the intent. Possible errors come in two categories. Either the sanity check on internal tables has failed, which should be caught in CI unless an affected platform would be missed in testing; or memory allocation failure happened during driver load, which should be extremely unlikely but for correctness should still be handled. v2: * Tidy coding style. (Chris) [airlied: cherry-picked to avoid rc1 base] Signed-off-by: Tvrtko Ursulin Fixes: 311a50e76a33 ("drm/i915: Add support for mandatory cmdparsing") Cc: Jon Bloomfield Cc: Joonas Lahtinen Cc: Chris Wilson Reviewed-by: Chris Wilson Link: https://patchwork.freedesktop.org/patch/msgid/20210302114213.1102223-1-tvrtko.ursulin@linux.intel.com (cherry picked from commit 5a1a659762d35a6dc51047c9127c011303c77b7f) Signed-off-by: Rodrigo Vivi Signed-off-by: Dave Airlie Signed-off-by: Greg Kroah-Hartman commit 6936a61a317df41e527687de541e089a40c1d4ad Author: Noralf Trønnes Date: Fri Feb 19 13:22:03 2021 +0100 drm/shmem-helpers: vunmap: Don't put pages for dma-buf commit 64e194e278673bceb68fb2dde7dbc3d812bfceb3 upstream. dma-buf importing was reworked in commit 7d2cd72a9aa3 ("drm/shmem-helpers: Simplify dma-buf importing"). Before that commit drm_gem_shmem_prime_import_sg_table() did set ->pages_use_count=1 and drm_gem_shmem_vunmap_locked() could call drm_gem_shmem_put_pages() unconditionally. Now without the use count set, put pages is called also on dma-bufs. Fix this by only putting pages if it's not imported. Signed-off-by: Noralf Trønnes Fixes: 7d2cd72a9aa3 ("drm/shmem-helpers: Simplify dma-buf importing") Cc: Daniel Vetter Cc: Thomas Zimmermann Acked-by: Thomas Zimmermann Tested-by: Thomas Zimmermann Link: https://patchwork.freedesktop.org/patch/msgid/20210219122203.51130-1-noralf@tronnes.org (cherry picked from commit cdea72518a2b38207146e92e1c9e2fac15975679) Signed-off-by: Thomas Zimmermann Signed-off-by: Maarten Lankhorst Signed-off-by: Greg Kroah-Hartman commit cef14d5d92f14a6e282c3216c2da63e05f14758a Author: Artem Lapkin Date: Tue Mar 2 12:22:02 2021 +0800 drm: meson_drv add shutdown function commit fa0c16caf3d73ab4d2e5d6fa2ef2394dbec91791 upstream. Problem: random stucks on reboot stage about 1/20 stuck/reboots // debug kernel log [ 4.496660] reboot: kernel restart prepare CMD:(null) [ 4.498114] meson_ee_pwrc c883c000.system-controller:power-controller: shutdown begin [ 4.503949] meson_ee_pwrc c883c000.system-controller:power-controller: shutdown domain 0:VPU... ...STUCK... Solution: add shutdown function to meson_drm driver // debug kernel log [ 5.231896] reboot: kernel restart prepare CMD:(null) [ 5.246135] [drm:meson_drv_shutdown] ... [ 5.259271] meson_ee_pwrc c883c000.system-controller:power-controller: shutdown begin [ 5.274688] meson_ee_pwrc c883c000.system-controller:power-controller: shutdown domain 0:VPU... [ 5.338331] reboot: Restarting system [ 5.358293] psci: PSCI_0_2_FN_SYSTEM_RESET reboot_mode:0 cmd:(null) bl31 reboot reason: 0xd bl31 reboot reason: 0x0 system cmd 1. ...REBOOT... Tested: on VIM1 VIM2 VIM3 VIM3L khadas sbcs - 1000+ successful reboots and Odroid boards, WeTek Play2 (GXBB) Fixes: bbbe775ec5b5 ("drm: Add support for Amlogic Meson Graphic Controller") Signed-off-by: Artem Lapkin Tested-by: Christian Hewitt Acked-by: Neil Armstrong Acked-by: Kevin Hilman Signed-off-by: Neil Armstrong Link: https://patchwork.freedesktop.org/patch/msgid/20210302042202.3728113-1-art@khadas.com Signed-off-by: Maarten Lankhorst Signed-off-by: Greg Kroah-Hartman commit 084b4f3304f923e39e64216a818a8bbf398dd896 Author: Alex Deucher Date: Tue Mar 9 22:58:47 2021 -0500 drm/amdgpu: fix S0ix handling when the CONFIG_AMD_PMC=m commit a5cb3c1a36376c25cd25fd3e99918dc48ac420bb upstream. Need to check the module variant as well. Acked-by: Prike Liang Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 24753adb30536889f2a3076caa7ce957106591ba Author: Thomas Zimmermann Date: Wed Mar 3 14:32:29 2021 +0100 drm: Use USB controller's DMA mask when importing dmabufs commit 659ab7a49cbebe0deffcbe1f9560e82006b21817 upstream. USB devices cannot perform DMA and hence have no dma_mask set in their device structure. Therefore importing dmabuf into a USB-based driver fails, which breaks joining and mirroring of display in X11. For USB devices, pick the associated USB controller as attachment device. This allows the DRM import helpers to perform the DMA setup. If the DMA controller does not support DMA transfers, we're out of luck and cannot import. Our current USB-based DRM drivers don't use DMA, so the actual DMA device is not important. Tested by joining/mirroring displays of udl and radeon under Gnome/X11. v8: * release dmadev if device initialization fails (Noralf) * fix commit description (Noralf) v7: * fix use-before-init bug in gm12u320 (Dan) v6: * implement workaround in DRM drivers and hold reference to DMA device while USB device is in use * remove dev_is_usb() (Greg) * collapse USB helper into usb_intf_get_dma_device() (Alan) * integrate Daniel's TODO statement (Daniel) * fix typos (Greg) v5: * provide a helper for USB interfaces (Alan) * add FIXME item to documentation and TODO list (Daniel) v4: * implement workaround with USB helper functions (Greg) * use struct usb_device->bus->sysdev as DMA device (Takashi) v3: * drop gem_create_object * use DMA mask of USB controller, if any (Daniel, Christian, Noralf) v2: * move fix to importer side (Christian, Daniel) * update SHMEM and CMA helpers for new PRIME callbacks Signed-off-by: Thomas Zimmermann Fixes: 6eb0233ec2d0 ("usb: don't inherity DMA properties for USB devices") Tested-by: Pavel Machek Reviewed-by: Greg Kroah-Hartman Acked-by: Christian König Acked-by: Daniel Vetter Acked-by: Noralf Trønnes Cc: Christoph Hellwig Cc: Greg Kroah-Hartman Cc: # v5.10+ Signed-off-by: Thomas Zimmermann Link: https://patchwork.freedesktop.org/patch/msgid/20210303133229.3288-1-tzimmermann@suse.de Signed-off-by: Maarten Lankhorst Signed-off-by: Greg Kroah-Hartman commit 3923eefd32e36ada97974b9130858d9b39d03752 Author: Neil Roberts Date: Tue Feb 23 16:51:25 2021 +0100 drm/shmem-helper: Don't remove the offset in vm_area_struct pgoff commit 11d5a4745e00e73745774671dbf2fb07bd6e2363 upstream. When mmapping the shmem, it would previously adjust the pgoff in the vm_area_struct to remove the fake offset that is added to be able to identify the buffer. This patch removes the adjustment and makes the fault handler use the vm_fault address to calculate the page offset instead. Although using this address is apparently discouraged, several DRM drivers seem to be doing it anyway. The problem with removing the pgoff is that it prevents drm_vma_node_unmap from working because that searches the mapping tree by address. That doesn't work because all of the mappings are at offset 0. drm_vma_node_unmap is being used by the shmem helpers when purging the buffer. This fixes a bug in Panfrost which is using drm_gem_shmem_purge. Without this the mapping for the purged buffer can still be accessed which might mean it would access random pages from other buffers v2: Don't check whether the unsigned page_offset is less than 0. Cc: stable@vger.kernel.org Fixes: 17acb9f35ed7 ("drm/shmem: Add madvise state and purge helpers") Signed-off-by: Neil Roberts Reviewed-by: Steven Price Signed-off-by: Steven Price Link: https://patchwork.freedesktop.org/patch/msgid/20210223155125.199577-3-nroberts@igalia.com Signed-off-by: Maarten Lankhorst Signed-off-by: Greg Kroah-Hartman commit 11c708e69bedd1a03db5a6c25e0e59505ee0a722 Author: Neil Roberts Date: Tue Feb 23 16:51:24 2021 +0100 drm/shmem-helper: Check for purged buffers in fault handler commit d611b4a0907cece060699f2fd347c492451cd2aa upstream. When a buffer is madvised as not needed and then purged, any attempts to access the buffer from user-space should cause a bus fault. This patch adds a check for that. Cc: stable@vger.kernel.org Fixes: 17acb9f35ed7 ("drm/shmem: Add madvise state and purge helpers") Signed-off-by: Neil Roberts Reviewed-by: Steven Price Signed-off-by: Steven Price Link: https://patchwork.freedesktop.org/patch/msgid/20210223155125.199577-2-nroberts@igalia.com Signed-off-by: Maarten Lankhorst Signed-off-by: Greg Kroah-Hartman commit e238a59c46bd51d61155e6a1f6a798738e3bd6bb Author: Alex Deucher Date: Thu Dec 10 01:45:12 2020 -0500 drm/amdgpu/display: handle aux backlight in backlight_get_brightness commit 0ad3e64eb46d8c47de3af552e282894e3893e973 upstream. Need to fetch it via aux. Reviewed-by: Nicholas Kazlauskas Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 7c39e084afb0bde83edda80bf1f73d7d9694bf9e Author: Alex Deucher Date: Thu Dec 10 01:20:08 2020 -0500 drm/amdgpu/display: don't assert in set backlight function commit dfd8b7fbd985ec1cf76fe10f2875a50b10833740 upstream. It just spams the logs. Reviewed-by: Nicholas Kazlauskas Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit dcddebbb8e6bd61e0d37ec54464ed2d39e9653ec Author: Alex Deucher Date: Thu Dec 10 01:18:40 2020 -0500 drm/amdgpu/display: simplify backlight setting commit a2f8d988698d7d3645b045f4940415b045140b81 upstream. Avoid the extra wrapper function. Reviewed-by: Nicholas Kazlauskas Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 216158fc01ebb47a3525f538fcc7b7c2dcc85ff0 Author: Kenneth Feng Date: Tue Mar 9 21:10:16 2021 +0800 drm/amd/pm: bug fix for pcie dpm commit 50ceb1fe7acd50831180f4b5597bf7b39e8059c8 upstream. Currently the pcie dpm has two problems. 1. Only the high dpm level speed/width can be overrided if the requested values are out of the pcie capability. 2. The high dpm level is always overrided though sometimes it's not necesarry. Signed-off-by: Kenneth Feng Reviewed-by: Alex Deucher Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 50ffbd09dfb6bb4a9f283c0cdb0a5dd9e99dde57 Author: Evan Quan Date: Fri Mar 5 14:21:26 2021 +0800 drm/amd/pm: correct the watermark settings for Polaris commit 48123d068fcb584838ce29912660c5e9490bad0e upstream. The "/ 10" should be applied to the right-hand operand instead of the left-hand one. Signed-off-by: Evan Quan Noticed-by: Georgios Toptsidis Reviewed-by: Feifei Xu Reviewed-by: Alex Deucher Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit fc24cc8868a1be2819e8958c1bbaa6c51bb88efc Author: Holger Hoffstätte Date: Fri Mar 5 12:39:21 2021 +0100 drm/amd/display: Fix nested FPU context in dcn21_validate_bandwidth() commit 15e8b95d5f7509e0b09289be8c422c459c9f0412 upstream. Commit 41401ac67791 added FPU wrappers to dcn21_validate_bandwidth(), which was correct. Unfortunately a nested function alredy contained DC_FP_START()/DC_FP_END() calls, which results in nested FPU context enter/exit and complaints by kernel_fpu_begin_mask(). This can be observed e.g. with 5.10.20, which backported 41401ac67791 and now emits the following warning on boot: WARNING: CPU: 6 PID: 858 at arch/x86/kernel/fpu/core.c:129 kernel_fpu_begin_mask+0xa5/0xc0 Call Trace: dcn21_calculate_wm+0x47/0xa90 [amdgpu] dcn21_validate_bandwidth_fp+0x15d/0x2b0 [amdgpu] dcn21_validate_bandwidth+0x29/0x40 [amdgpu] dc_validate_global_state+0x3c7/0x4c0 [amdgpu] The warning is emitted due to the additional DC_FP_START/END calls in patch_bounding_box(), which is inlined into dcn21_calculate_wm(), its only caller. Removing the calls brings the code in line with dcn20 and makes the warning disappear. Fixes: 41401ac67791 ("drm/amd/display: Add FPU wrappers to dcn21_validate_bandwidth()") Signed-off-by: Holger Hoffstätte Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 4290476449535da46bbe7ac865c0fafa06ee17fd Author: Holger Hoffstätte Date: Fri Mar 5 15:23:18 2021 +0100 drm/amdgpu/display: use GFP_ATOMIC in dcn21_validate_bandwidth_fp() commit 680174cfd1e1cea70a8f30ccb44d8fbdf996018e upstream. After fixing nested FPU contexts caused by 41401ac67791 we're still seeing complaints about spurious kernel_fpu_end(). As it turns out this was already fixed for dcn20 in commit f41ed88cbd ("drm/amdgpu/display: use GFP_ATOMIC in dcn20_validate_bandwidth_internal") but never moved forward to dcn21. Signed-off-by: Holger Hoffstätte Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 78122c730749460de4378983af5e292e46bc7fe3 Author: Takashi Iwai Date: Wed Feb 3 13:42:41 2021 +0100 drm/amd/display: Add a backlight module option commit 7a46f05e5e163c00e41892e671294286e53fe15c upstream. There seem devices that don't work with the aux channel backlight control. For allowing such users to test with the other backlight control method, provide a new module option, aux_backlight, to specify enabling or disabling the aux backport support explicitly. As default, the aux support is detected by the hardware capability. v2: make the backlight option generic in case we add future backlight types (Alex) BugLink: https://bugzilla.opensuse.org/show_bug.cgi?id=1180749 BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1438 Reviewed-by: Nicholas Kazlauskas Signed-off-by: Takashi Iwai Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman commit 90da35d821f527b2d0082d12a532ce8146be70e8 Author: Christian König Date: Mon Mar 8 19:35:14 2021 +0100 drm/radeon: also init GEM funcs in radeon_gem_prime_import_sg_table commit a25955ba123499d7db520175c6be59c29f9215e3 upstream. Otherwise we will run into a NULL ptr deref. Signed-off-by: Christian König Bug: https://bugzilla.kernel.org/show_bug.cgi?id=212137 Reviewed-by: Alex Deucher Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org # 5.11.x Signed-off-by: Greg Kroah-Hartman commit da27b6b41f12c9550342bd47bc576ce518f87cb9 Author: Daniel Vetter Date: Mon Feb 22 11:06:43 2021 +0100 drm/compat: Clear bounce structures commit de066e116306baf3a6a62691ac63cfc0b1dabddb upstream. Some of them have gaps, or fields we don't clear. Native ioctl code does full copies plus zero-extends on size mismatch, so nothing can leak. But compat is more hand-rolled so need to be careful. None of these matter for performance, so just memset. Also I didn't fix up the CONFIG_DRM_LEGACY or CONFIG_DRM_AGP ioctl, those are security holes anyway. Acked-by: Maxime Ripard Reported-by: syzbot+620cf21140fc7e772a5d@syzkaller.appspotmail.com # vblank ioctl Cc: syzbot+620cf21140fc7e772a5d@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter Link: https://patchwork.freedesktop.org/patch/msgid/20210222100643.400935-1-daniel.vetter@ffwll.ch (cherry picked from commit e926c474ebee404441c838d18224cd6f246a71b7) Signed-off-by: Maarten Lankhorst Signed-off-by: Greg Kroah-Hartman commit b1d21391595bd4366907bba050cabe216108d820 Author: Tong Zhang Date: Sat Feb 27 23:46:25 2021 -0500 drm/fb-helper: only unmap if buffer not null commit 874a52f9b693ed8bf7a92b3592a547ce8a684e6f upstream. drm_fbdev_cleanup() can be called when fb_helper->buffer is null, hence fb_helper->buffer should be checked before calling drm_client_buffer_vunmap(). This buffer is also checked in drm_client_framebuffer_delete(), so we should also do the same thing for drm_client_buffer_vunmap(). [ 199.128742] RIP: 0010:drm_client_buffer_vunmap+0xd/0x20 [ 199.129031] Code: 43 18 48 8b 53 20 49 89 45 00 49 89 55 08 5b 44 89 e0 41 5c 41 5d 41 5e 5d c3 0f 1f 00 53 48 89 fb 48 8d 7f 10 e8 73 7d a1 ff <48> 8b 7b 10 48 8d 73 18 5b e9 75 53 fc ff 0 f 1f 44 00 00 48 b8 00 [ 199.130041] RSP: 0018:ffff888103f3fc88 EFLAGS: 00010282 [ 199.130329] RAX: 0000000000000001 RBX: 0000000000000000 RCX: ffffffff8214d46d [ 199.130733] RDX: 1ffffffff079c6b9 RSI: 0000000000000246 RDI: ffffffff83ce35c8 [ 199.131119] RBP: ffff888103d25458 R08: 0000000000000001 R09: fffffbfff0791761 [ 199.131505] R10: ffffffff83c8bb07 R11: fffffbfff0791760 R12: 0000000000000000 [ 199.131891] R13: ffff888103d25468 R14: ffff888103d25418 R15: ffff888103f18120 [ 199.132277] FS: 00007f36fdcbb6a0(0000) GS:ffff88815b400000(0000) knlGS:0000000000000000 [ 199.132721] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 199.133033] CR2: 0000000000000010 CR3: 0000000103d26000 CR4: 00000000000006f0 [ 199.133420] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 199.133807] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 199.134195] Call Trace: [ 199.134333] drm_fbdev_cleanup+0x179/0x1a0 [ 199.134562] drm_fbdev_client_unregister+0x2b/0x40 [ 199.134828] drm_client_dev_unregister+0xa8/0x180 [ 199.135088] drm_dev_unregister+0x61/0x110 [ 199.135315] mgag200_pci_remove+0x38/0x52 [mgag200] [ 199.135586] pci_device_remove+0x62/0xe0 [ 199.135806] device_release_driver_internal+0x148/0x270 [ 199.136094] driver_detach+0x76/0xe0 [ 199.136294] bus_remove_driver+0x7e/0x100 [ 199.136521] pci_unregister_driver+0x28/0xf0 [ 199.136759] __x64_sys_delete_module+0x268/0x300 [ 199.137016] ? __ia32_sys_delete_module+0x300/0x300 [ 199.137285] ? call_rcu+0x3e4/0x580 [ 199.137481] ? fpregs_assert_state_consistent+0x4d/0x60 [ 199.137767] ? exit_to_user_mode_prepare+0x2f/0x130 [ 199.138037] do_syscall_64+0x33/0x40 [ 199.138237] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 199.138517] RIP: 0033:0x7f36fdc3dcf7 Signed-off-by: Tong Zhang Fixes: 763aea17bf57 ("drm/fb-helper: Unmap client buffer during shutdown") Cc: Thomas Zimmermann Cc: Sam Ravnborg Cc: Maxime Ripard Cc: Maarten Lankhorst Cc: David Airlie Cc: Daniel Vetter Cc: dri-devel@lists.freedesktop.org Cc: # v5.11+ Signed-off-by: Thomas Zimmermann Link: https://patchwork.freedesktop.org/patch/msgid/20210228044625.171151-1-ztong0001@gmail.com Signed-off-by: Maarten Lankhorst Signed-off-by: Greg Kroah-Hartman commit 37f79ab59cef9bbb7b6c3c5ade8fcd3841f524e6 Author: Edwin Peer Date: Fri Feb 26 04:43:10 2021 -0500 bnxt_en: reliably allocate IRQ table on reset to avoid crash commit 20d7d1c5c9b11e9f538ed4a2289be106de970d3e upstream. The following trace excerpt corresponds with a NULL pointer dereference of 'bp->irq_tbl' in bnxt_setup_inta() on an Aarch64 system after many device resets: Unable to handle kernel NULL pointer dereference at ... 000000d ... pc : string+0x3c/0x80 lr : vsnprintf+0x294/0x7e0 sp : ffff00000f61ba70 pstate : 20000145 x29: ffff00000f61ba70 x28: 000000000000000d x27: ffff0000009c8b5a x26: ffff00000f61bb80 x25: ffff0000009c8b5a x24: 0000000000000012 x23: 00000000ffffffe0 x22: ffff000008990428 x21: ffff00000f61bb80 x20: 000000000000000d x19: 000000000000001f x18: 0000000000000000 x17: 0000000000000000 x16: ffff800b6d0fb400 x15: 0000000000000000 x14: ffff800b7fe31ae8 x13: 00001ed16472c920 x12: ffff000008c6b1c9 x11: ffff000008cf0580 x10: ffff00000f61bb80 x9 : 00000000ffffffd8 x8 : 000000000000000c x7 : ffff800b684b8000 x6 : 0000000000000000 x5 : 0000000000000065 x4 : 0000000000000001 x3 : ffff0a00ffffff04 x2 : 000000000000001f x1 : 0000000000000000 x0 : 000000000000000d Call trace: string+0x3c/0x80 vsnprintf+0x294/0x7e0 snprintf+0x44/0x50 __bnxt_open_nic+0x34c/0x928 [bnxt_en] bnxt_open+0xe8/0x238 [bnxt_en] __dev_open+0xbc/0x130 __dev_change_flags+0x12c/0x168 dev_change_flags+0x20/0x60 ... Ordinarily, a call to bnxt_setup_inta() (not in trace due to inlining) would not be expected on a system supporting MSIX at all. However, if bnxt_init_int_mode() does not end up being called after the call to bnxt_clear_int_mode() in bnxt_fw_reset_close(), then the driver will think that only INTA is supported and bp->irq_tbl will be NULL, causing the above crash. In the error recovery scenario, we call bnxt_clear_int_mode() in bnxt_fw_reset_close() early in the sequence. Ordinarily, we will call bnxt_init_int_mode() in bnxt_hwrm_if_change() after we reestablish communication with the firmware after reset. However, if the sequence has to abort before we call bnxt_init_int_mode() and if the user later attempts to re-open the device, then it will cause the crash above. We fix it in 2 ways: 1. Check for bp->irq_tbl in bnxt_setup_int_mode(). If it is NULL, call bnxt_init_init_mode(). 2. If we need to abort in bnxt_hwrm_if_change() and cannot complete the error recovery sequence, set the BNXT_STATE_ABORT_ERR flag. This will cause more drastic recovery at the next attempt to re-open the device, including a call to bnxt_init_int_mode(). Fixes: 3bc7d4a352ef ("bnxt_en: Add BNXT_STATE_IN_FW_RESET state.") Reviewed-by: Scott Branden Signed-off-by: Edwin Peer Signed-off-by: Michael Chan Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 4693548951ed246b2e4ee90fc1f9ec77c9e20c46 Author: Wang Qing Date: Mon Mar 1 20:01:33 2021 +0800 s390/cio: return -EFAULT if copy_to_user() fails again commit 51c44babdc19aaf882e1213325a0ba291573308f upstream. The copy_to_user() function returns the number of bytes remaining to be copied, but we want to return -EFAULT if the copy doesn't complete. Fixes: e01bcdd61320 ("vfio: ccw: realize VFIO_DEVICE_GET_REGION_INFO ioctl") Signed-off-by: Wang Qing Signed-off-by: Heiko Carstens Link: https://lore.kernel.org/r/1614600093-13992-1-git-send-email-wangqing@vivo.com Signed-off-by: Heiko Carstens Signed-off-by: Greg Kroah-Hartman commit 354b3f64709225674e05d7e0c2515e8451b26737 Author: Jian Shen Date: Sat Feb 27 15:24:53 2021 +0800 net: hns3: fix bug when calculating the TCAM table info commit b36fc875bcdee56865c444a2cdae17d354a6d5f5 upstream. The function hclge_fd_convert_tuple() is used to convert tuples and tuples mask to TCAM x and y. But it misuses the source mac as source mac mask when convert INNER_SRC_MAC, which may cause the flow director rule works unexpectedly. So fix it. Fixes: 117328680288 ("net: hns3: Add input key and action config support for flow director") Signed-off-by: Jian Shen Signed-off-by: Huazhong Tan Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 1e6c0f6703298d12a32cd7c057cd56eef9fde742 Author: Jian Shen Date: Sat Feb 27 15:24:52 2021 +0800 net: hns3: fix query vlan mask value error for flow director commit c75ec148a316e8cf52274d16b9b422703b96f5ce upstream. Currently, the driver returns VLAN_VID_MASK for vlan mask field, when get flow director rule information for rule doesn't use vlan. It may cause the vlan mask value display as 0xf000 in this case, like below: estuary:/$ ethtool -u eth1 50 RX rings available Total 1 rules Filter: 2 Rule Type: TCP over IPv4 Src IP addr: 0.0.0.0 mask: 255.255.255.255 Dest IP addr: 0.0.0.0 mask: 255.255.255.255 TOS: 0x0 mask: 0xff Src port: 0 mask: 0xffff Dest port: 0 mask: 0xffff VLAN EtherType: 0x0 mask: 0xffff VLAN: 0x0 mask: 0xf000 User-defined: 0x1234 mask: 0x0 Action: Direct to queue 3 Fix it by return 0. Fixes: 05c2314fe6a8 ("net: hns3: Add support for rule query of flow director") Signed-off-by: Jian Shen Signed-off-by: Huazhong Tan Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 163fab2e19d66eb0f2f20f8784d5c630e8cfbceb Author: Jian Shen Date: Sat Feb 27 15:24:51 2021 +0800 net: hns3: fix error mask definition of flow director commit ae85ddda0f1b341b2d25f5a5e0eff1d42b6ef3df upstream. Currently, some bit filed definitions of flow director TCAM configuration command are incorrect. Since the wrong MSB is always 0, and these fields are assgined in order, so it still works. Fix it by redefine them. Fixes: 117328680288 ("net: hns3: Add input key and action config support for flow director") Signed-off-by: Jian Shen Signed-off-by: Huazhong Tan Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit e58feda63ef7c596eca61fc919979aafc0a5b4e5 Author: Ravi Bangoria Date: Thu Mar 4 11:59:58 2021 +0530 perf report: Fix -F for branch & mem modes commit 6740a4e70e5d1b9d8e7fe41fd46dd5656d65dadf upstream. perf report fails to add valid additional fields with -F when used with branch or mem modes. Fix it. Before patch: $ perf record -b $ perf report -b -F +srcline_from --stdio Error: Invalid --fields key: `srcline_from' After patch: $ perf report -b -F +srcline_from --stdio # Samples: 8K of event 'cycles' # Event count (approx.): 8784 ... Committer notes: There was an inversion: when looking at branch stack dimensions (keys) it was checking if the sort mode was 'mem', not 'branch'. Fixes: aa6b3c99236b ("perf report: Make -F more strict like -s") Reported-by: Athira Jajeev Signed-off-by: Ravi Bangoria Reviewed-by: Athira Jajeev Tested-by: Arnaldo Carvalho de Melo Tested-by: Athira Jajeev Cc: Jiri Olsa Cc: Kan Liang Cc: Namhyung Kim Link: http://lore.kernel.org/lkml/20210304062958.85465-1-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Greg Kroah-Hartman commit 4f55f7f57e917f576442b2fb11cb5d4d0773d90d Author: Ian Rogers Date: Fri Feb 26 14:14:31 2021 -0800 perf traceevent: Ensure read cmdlines are null terminated. commit 137a5258939aca56558f3a23eb229b9c4b293917 upstream. Issue detected by address sanitizer. Fixes: cd4ceb63438e9e28 ("perf util: Save pid-cmdline mapping into tracing header") Signed-off-by: Ian Rogers Acked-by: Namhyung Kim Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Mark Rutland Cc: Peter Zijlstra Cc: Stephane Eranian Link: http://lore.kernel.org/lkml/20210226221431.1985458-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Greg Kroah-Hartman commit ea54838e532194daf80d4af002375424e364eca1 Author: Danielle Ratson Date: Thu Feb 25 18:57:20 2021 +0200 mlxsw: spectrum_ethtool: Add an external speed to PTYS register commit ae9b24ddb69b4e31cda1b5e267a5a08a1db11717 upstream. Currently, only external bits are added to the PTYS register, whereas there is one external bit that is wrongly marked as internal, and so was recently removed from the register. Add that bit to the PTYS register again, as this bit is no longer internal. Its removal resulted in '100000baseLR4_ER4/Full' link mode no longer being supported, causing a regression on some setups. Fixes: 5bf01b571cf4 ("mlxsw: spectrum_ethtool: Remove internal speeds from PTYS register") Signed-off-by: Danielle Ratson Reported-by: Eddie Shklaer Tested-by: Eddie Shklaer Reviewed-by: Jiri Pirko Signed-off-by: Ido Schimmel Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 182d5a16e9c3ce12431101b249199046df964982 Author: Danielle Ratson Date: Thu Feb 25 18:57:19 2021 +0200 selftests: forwarding: Fix race condition in mirror installation commit edcbf5137f093b5502f5f6b97cce3cbadbde27aa upstream. When mirroring to a gretap in hardware the device expects to be programmed with the egress port and all the encapsulating headers. This requires the driver to resolve the path the packet will take in the software data path and program the device accordingly. If the path cannot be resolved (in this case because of an unresolved neighbor), then mirror installation fails until the path is resolved. This results in a race that causes the test to sometimes fail. Fix this by setting the neighbor's state to permanent, so that it is always valid. Fixes: b5b029399fa6d ("selftests: forwarding: mirror_gre_bridge_1d_vlan: Add STP test") Signed-off-by: Danielle Ratson Reviewed-by: Petr Machata Signed-off-by: Ido Schimmel Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 20f62c1c87c7a3bba4235843239bbf4b615b4484 Author: Arnd Bergmann Date: Thu Feb 25 15:57:27 2021 +0100 net: phy: make mdio_bus_phy_suspend/resume as __maybe_unused commit 7f654157f0aefba04cd7f6297351c87b76b47b89 upstream. When CONFIG_PM_SLEEP is disabled, the compiler warns about unused functions: drivers/net/phy/phy_device.c:273:12: error: unused function 'mdio_bus_phy_suspend' [-Werror,-Wunused-function] static int mdio_bus_phy_suspend(struct device *dev) drivers/net/phy/phy_device.c:293:12: error: unused function 'mdio_bus_phy_resume' [-Werror,-Wunused-function] static int mdio_bus_phy_resume(struct device *dev) The logic is intentional, so just mark these two as __maybe_unused and remove the incorrect #ifdef. Fixes: 4c0d2e96ba05 ("net: phy: consider that suspend2ram may cut off PHY power") Signed-off-by: Arnd Bergmann Reviewed-by: Andrew Lunn Link: https://lore.kernel.org/r/20210225145748.404410-1-arnd@kernel.org Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit bfc1e5619432659c00d1c4f3cfa7879d070d0f23 Author: Yinjun Zhang Date: Thu Feb 25 13:51:02 2021 +0100 ethtool: fix the check logic of at least one channel for RX/TX commit a4fc088ad4ff4a99d01978aa41065132b574b4b2 upstream. The command "ethtool -L combined 0" may clean the RX/TX channel count and skip the error path, since the attrs tb[ETHTOOL_A_CHANNELS_RX_COUNT] and tb[ETHTOOL_A_CHANNELS_TX_COUNT] are NULL in this case when recent ethtool is used. Tested using ethtool v5.10. Fixes: 7be92514b99c ("ethtool: check if there is at least one channel for TX/RX in the core") Signed-off-by: Yinjun Zhang Signed-off-by: Simon Horman Signed-off-by: Louis Peens Link: https://lore.kernel.org/r/20210225125102.23989-1-simon.horman@netronome.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 25048ef1f9d32c86a1596d475779f5593fdf0887 Author: Joakim Zhang Date: Thu Feb 25 17:01:13 2021 +0800 net: stmmac: fix wrongly set buffer2 valid when sph unsupport commit 396e13e11577b614db77db0bbb6fca935b94eb1b upstream. In current driver, buffer2 available only when hardware supports split header. Wrongly set buffer2 valid in stmmac_rx_refill when refill buffer address. You can see that desc3 is 0x81000000 after initialization, but turn out to be 0x83000000 after refill. Fixes: 67afd6d1cfdf ("net: stmmac: Add Split Header support and enable it in XGMAC cores") Signed-off-by: Joakim Zhang Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 6baecb96c5bfb44efeb4bc665f4c008d57ff6683 Author: Joakim Zhang Date: Thu Feb 25 17:01:11 2021 +0800 net: stmmac: fix watchdog timeout during suspend/resume stress test commit c511819d138de38e1637eedb645c207e09680d0f upstream. stmmac_xmit() call stmmac_tx_timer_arm() at the end to modify tx timer to do the transmission cleanup work. Imagine such a situation, stmmac enters suspend immediately after tx timer modified, it's expire callback stmmac_tx_clean() would not be invoked. This could affect BQL, since netdev_tx_sent_queue() has been called, but netdev_tx_completed_queue() have not been involved, as a result, dql_avail(&dev_queue->dql) finally always return a negative value. __dev_queue_xmit->__dev_xmit_skb->qdisc_run->__qdisc_run->qdisc_restart->dequeue_skb: if ((q->flags & TCQ_F_ONETXQUEUE) && netif_xmit_frozen_or_stopped(txq)) // __QUEUE_STATE_STACK_XOFF is set Net core will stop transmitting any more. Finillay, net watchdong would timeout. To fix this issue, we should call netdev_tx_reset_queue() in stmmac_resume(). Fixes: 54139cf3bb33 ("net: stmmac: adding multiple buffers for rx") Signed-off-by: Joakim Zhang Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 598b147b6adb2eb760b2ebd93aed60f4d0ec7dad Author: Joakim Zhang Date: Thu Feb 25 17:01:10 2021 +0800 net: stmmac: stop each tx channel independently commit a3e860a83397bf761ec1128a3f0ba186445992c6 upstream. If clear GMAC_CONFIG_TE bit, it would stop all tx channels, but users may only want to stop specific tx channel. Fixes: 48863ce5940f ("stmmac: add DMA support for GMAC 4.xx") Signed-off-by: Joakim Zhang Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 9ef4c1b438a3646cc339dd334f5b05c84bef1fd5 Author: Antonio Terceiro Date: Wed Feb 24 10:00:46 2021 -0300 perf build: Fix ccache usage in $(CC) when generating arch errno table commit dacfc08dcafa7d443ab339592999e37bbb8a3ef0 upstream. This was introduced by commit e4ffd066ff440a57 ("perf: Normalize gcc parameter when generating arch errno table"). Assuming the first word of $(CC) is the actual compiler breaks usage like CC="ccache gcc": the script ends up calling ccache directly with gcc arguments, what fails. Instead of getting the first word, just remove from $(CC) any word that starts with a "-". This maintains the spirit of the original patch, while not breaking ccache users. Fixes: e4ffd066ff440a57 ("perf: Normalize gcc parameter when generating arch errno table") Signed-off-by: Antonio Terceiro Tested-by: Arnaldo Carvalho de Melo Cc: Alexander Shishkin Cc: He Zhe Cc: Jiri Olsa Cc: Mark Rutland Cc: Namhyung Kim Cc: Peter Zijlstra Cc: stable@vger.kernel.org Link: http://lore.kernel.org/lkml/20210224130046.346977-1-antonio.terceiro@linaro.org Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Greg Kroah-Hartman commit 5864c338d26c04ab7bda860fc83e9a3b1b086b0d Author: Kun-Chuan Hsieh Date: Wed Feb 24 05:27:52 2021 +0000 tools/resolve_btfids: Fix build error with older host toolchains commit 41462c6e730ca0e63f5fed5a517052385d980c54 upstream. Older libelf.h and glibc elf.h might not yet define the ELF compression types. Checking and defining SHF_COMPRESSED fix the build error when compiling with older toolchains. Also, the tool resolve_btfids is compiled with host toolchain. The host toolchain is more likely to be older than the cross compile toolchain. Fixes: 51f6463aacfb ("tools/resolve_btfids: Fix sections with wrong alignment") Signed-off-by: Kun-Chuan Hsieh Signed-off-by: Daniel Borkmann Acked-by: Jiri Olsa Link: https://lore.kernel.org/bpf/20210224052752.5284-1-jetswayss@gmail.com Signed-off-by: Greg Kroah-Hartman commit bc4b6843c44a11840ef25eac8e3f80f19f18b0e5 Author: Antony Antony Date: Wed Oct 14 16:17:48 2020 +0200 ixgbe: fail to create xfrm offload of IPsec tunnel mode SA commit d785e1fec60179f534fbe8d006c890e5ad186e51 upstream. Based on talks and indirect references ixgbe IPsec offlod do not support IPsec tunnel mode offload. It can only support IPsec transport mode offload. Now explicitly fail when creating non transport mode SA with offload to avoid false performance expectations. Fixes: 63a67fe229ea ("ixgbe: add ipsec offload add and remove SA") Signed-off-by: Antony Antony Acked-by: Shannon Nelson Tested-by: Tony Brelinski Signed-off-by: Tony Nguyen Signed-off-by: Greg Kroah-Hartman commit b074f1e5d4d8a7c981a2800791a19a0a24bb3193 Author: Hayes Wang Date: Fri Mar 5 17:34:41 2021 +0800 r8169: fix r8168fp_adjust_ocp_cmd function commit abbf9a0ef8848dca58c5b97750c1c59bbee45637 upstream. The (0xBAF70000 & 0x00FFF000) << 6 should be (0xf70 << 18). Fixes: 561535b0f239 ("r8169: fix OCP access on RTL8117") Signed-off-by: Hayes Wang Acked-by: Heiner Kallweit Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2d89e955e34b7f70e4f840e60a4706bcec42084e Author: Julian Wiedmann Date: Tue Mar 9 17:52:21 2021 +0100 s390/qeth: fix notification for pending buffers during teardown commit 7eefda7f353ef86ad82a2dc8329e8a3538c08ab6 upstream. The cited commit reworked the state machine for pending TX buffers. In qeth_iqd_tx_complete() it turned PENDING into a transient state, and uses NEED_QAOB for buffers that get parked while waiting for their QAOB completion. But it missed to adjust the check in qeth_tx_complete_buf(). So if qeth_tx_complete_pending_bufs() is called during teardown to drain the parked TX buffers, we no longer raise a notification for af_iucv. Instead of updating the checked state, just move this code into qeth_tx_complete_pending_bufs() itself. This also gets rid of the special-case in the common TX completion path. Fixes: 8908f36d20d8 ("s390/qeth: fix af_iucv notification race") Signed-off-by: Julian Wiedmann Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 890a7665ef524c9a411cf63dfef6409d0fb3a7cb Author: Julian Wiedmann Date: Tue Mar 9 17:52:20 2021 +0100 s390/qeth: schedule TX NAPI on QAOB completion commit 3e83d467a08e25b27c44c885f511624a71c84f7c upstream. When a QAOB notifies us that a pending TX buffer has been delivered, the actual TX completion processing by qeth_tx_complete_pending_bufs() is done within the context of a TX NAPI instance. We shouldn't rely on this instance being scheduled by some other TX event, but just do it ourselves. qeth_qdio_handle_aob() is called from qeth_poll(), ie. our main NAPI instance. To avoid touching the TX queue's NAPI instance before/after it is (un-)registered, reorder the code in qeth_open() and qeth_stop() accordingly. Fixes: 0da9581ddb0f ("qeth: exploit asynchronous delivery of storage blocks") Signed-off-by: Julian Wiedmann Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 27666f87952294677e6b30e80bad50cf5a43b00a Author: Julian Wiedmann Date: Tue Mar 9 17:52:19 2021 +0100 s390/qeth: improve completion of pending TX buffers commit c20383ad1656b0f6354dd50e4acd894f9d94090d upstream. The current design attaches a pending TX buffer to a custom single-linked list, which is anchored at the buffer's slot on the TX ring. The buffer is then checked for final completion whenever this slot is processed during a subsequent TX NAPI poll cycle. But if there's insufficient traffic on the ring, we might never make enough progress to get back to this ring slot and discover the pending buffer's final TX completion. In particular if this missing TX completion blocks the application from sending further traffic. So convert the custom single-linked list code to a per-queue list_head, and scan this list on every TX NAPI cycle. Fixes: 0da9581ddb0f ("qeth: exploit asynchronous delivery of storage blocks") Signed-off-by: Julian Wiedmann Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d76d8e4862a0d041b374d5456f074994b8def40b Author: Julian Wiedmann Date: Tue Mar 9 17:52:18 2021 +0100 s390/qeth: fix memory leak after failed TX Buffer allocation commit e7a36d27f6b9f389e41d8189a8a08919c6835732 upstream. When qeth_alloc_qdio_queues() fails to allocate one of the buffers that back an Output Queue, the 'out_freeoutqbufs' path will free all previously allocated buffers for this queue. But it misses to free the half-finished queue struct itself. Move the buffer allocation into qeth_alloc_output_queue(), and deal with such errors internally. Fixes: 0da9581ddb0f ("qeth: exploit asynchronous delivery of storage blocks") Signed-off-by: Julian Wiedmann Reviewed-by: Alexandra Winter Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit cc1438d6ba44377b5465f7c08516fd547fcd225a Author: Jia-Ju Bai Date: Mon Mar 8 01:13:55 2021 -0800 net: qrtr: fix error return code of qrtr_sendmsg() commit 179d0ba0c454057a65929c46af0d6ad986754781 upstream. When sock_alloc_send_skb() returns NULL to skb, no error return code of qrtr_sendmsg() is assigned. To fix this bug, rc is assigned with -ENOMEM in this case. Fixes: 194ccc88297a ("net: qrtr: Support decoding incoming v2 packets") Reported-by: TOTE Robot Signed-off-by: Jia-Ju Bai Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2f71c51db8debdc062404b34c639680225f0a6f4 Author: Vladimir Oltean Date: Sun Mar 7 15:23:39 2021 +0200 net: enetc: allow hardware timestamping on TX queues with tc-etf enabled commit 29d98f54a4fe1b6a9089bec8715a1b89ff9ad59c upstream. The txtime is passed to the driver in skb->skb_mstamp_ns, which is actually in a union with skb->tstamp (the place where software timestamps are kept). Since commit b50a5c70ffa4 ("net: allow simultaneous SW and HW transmit timestamping"), __sock_recv_timestamp has some logic for making sure that the two calls to skb_tstamp_tx: skb_tx_timestamp(skb) # Software timestamp in the driver -> skb_tstamp_tx(skb, NULL) and skb_tstamp_tx(skb, &shhwtstamps) # Hardware timestamp in the driver will both do the right thing and in a race-free manner, meaning that skb_tx_timestamp will deliver a cmsg with the software timestamp only, and skb_tstamp_tx with a non-NULL hwtstamps argument will deliver a cmsg with the hardware timestamp only. Why are races even possible? Well, because although the software timestamp skb->tstamp is private per skb, the hardware timestamp skb_hwtstamps(skb) lives in skb_shinfo(skb), an area which is shared between skbs and their clones. And skb_tstamp_tx works by cloning the packets when timestamping them, therefore attempting to perform hardware timestamping on an skb's clone will also change the hardware timestamp of the original skb. And the original skb might have been yet again cloned for software timestamping, at an earlier stage. So the logic in __sock_recv_timestamp can't be as simple as saying "does this skb have a hardware timestamp? if yes I'll send the hardware timestamp to the socket, otherwise I'll send the software timestamp", precisely because the hardware timestamp is shared. Instead, it's quite the other way around: __sock_recv_timestamp says "does this skb have a software timestamp? if yes, I'll send the software timestamp, otherwise the hardware one". This works because the software timestamp is not shared with clones. But that means we have a problem when we attempt hardware timestamping with skbs that don't have the skb->tstamp == 0. __sock_recv_timestamp will say "oh, yeah, this must be some sort of odd clone" and will not deliver the hardware timestamp to the socket. And this is exactly what is happening when we have txtime enabled on the socket: as mentioned, that is put in a union with skb->tstamp, so it is quite easy to mistake it. Do what other drivers do (intel igb/igc) and write zero to skb->tstamp before taking the hardware timestamp. It's of no use to us now (we're already on the TX confirmation path). Fixes: 0d08c9ec7d6e ("enetc: add support time specific departure base on the qos etf") Cc: Vinicius Costa Gomes Signed-off-by: Vladimir Oltean Acked-by: Vinicius Costa Gomes Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit a1f308089257616cdb91b4334c5eaa81ae17e387 Author: Paul Cercueil Date: Sun Mar 7 13:17:48 2021 +0000 net: davicom: Fix regulator not turned off on driver removal commit cf9e60aa69ae6c40d3e3e4c94dd6c8de31674e9b upstream. We must disable the regulator that was enabled in the probe function. Fixes: 7994fe55a4a2 ("dm9000: Add regulator and reset support to dm9000") Signed-off-by: Paul Cercueil Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit c32b8c03ee1c821af0b152dd174da4eb94967fa8 Author: Paul Cercueil Date: Sun Mar 7 13:17:47 2021 +0000 net: davicom: Fix regulator not turned off on failed probe commit ac88c531a5b38877eba2365a3f28f0c8b513dc33 upstream. When the probe fails or requests to be defered, we must disable the regulator that was previously enabled. Fixes: 7994fe55a4a2 ("dm9000: Add regulator and reset support to dm9000") Signed-off-by: Paul Cercueil Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 8eb43ff45aee8dbf81409afb8e84d0c71d10dc3c Author: Xie He Date: Sun Mar 7 03:33:07 2021 -0800 net: lapbether: Remove netif_start_queue / netif_stop_queue commit f7d9d4854519fdf4d45c70a4d953438cd88e7e58 upstream. For the devices in this driver, the default qdisc is "noqueue", because their "tx_queue_len" is 0. In function "__dev_queue_xmit" in "net/core/dev.c", devices with the "noqueue" qdisc are specially handled. Packets are transmitted without being queued after a "dev->flags & IFF_UP" check. However, it's possible that even if this check succeeds, "ops->ndo_stop" may still have already been called. This is because in "__dev_close_many", "ops->ndo_stop" is called before clearing the "IFF_UP" flag. If we call "netif_stop_queue" in "ops->ndo_stop", then it's possible in "__dev_queue_xmit", it sees the "IFF_UP" flag is present, and then it checks "netif_xmit_stopped" and finds that the queue is already stopped. In this case, it will complain that: "Virtual device ... asks to queue packet!" To prevent "__dev_queue_xmit" from generating this complaint, we should not call "netif_stop_queue" in "ops->ndo_stop". We also don't need to call "netif_start_queue" in "ops->ndo_open", because after a netdev is allocated and registered, the "__QUEUE_STATE_DRV_XOFF" flag is initially not set, so there is no need to call "netif_start_queue" to clear it. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Xie He Acked-by: Martin Schiller Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e1283ef5d66e418beee8eb3c0530f0c16f852a87 Author: Wong Vee Khee Date: Fri Mar 5 14:03:42 2021 +0800 stmmac: intel: Fixes clock registration error seen for multiple interfaces commit 8eb37ab7cc045ec6305a6a1a9c32374695a1a977 upstream. Issue seen when enumerating multiple Intel mGbE interfaces in EHL. [ 6.898141] intel-eth-pci 0000:00:1d.2: enabling device (0000 -> 0002) [ 6.900971] intel-eth-pci 0000:00:1d.2: Fail to register stmmac-clk [ 6.906434] intel-eth-pci 0000:00:1d.2: User ID: 0x51, Synopsys ID: 0x52 We fix it by making the clock name to be unique following the format of stmmac-pci_name(pci_dev) so that we can differentiate the clock for these Intel mGbE interfaces in EHL platform as follow: /sys/kernel/debug/clk/stmmac-0000:00:1d.1 /sys/kernel/debug/clk/stmmac-0000:00:1d.2 /sys/kernel/debug/clk/stmmac-0000:00:1e.4 Fixes: 58da0cfa6cf1 ("net: stmmac: create dwmac-intel.c to contain all Intel platform") Signed-off-by: Wong Vee Khee Signed-off-by: Voon Weifeng Co-developed-by: Ong Boon Leong Signed-off-by: Ong Boon Leong Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit dbee60e04e42266267740e3240aaab3d29cd509f Author: Ong Boon Leong Date: Fri Mar 5 13:49:30 2021 +0800 net: stmmac: Fix VLAN filter delete timeout issue in Intel mGBE SGMII commit 9a7b3950c7e15968e23d83be215e95ccc7c92a53 upstream. For Intel mGbE controller, MAC VLAN filter delete operation will time-out if serdes power-down sequence happened first during driver remove() with below message. [82294.764958] intel-eth-pci 0000:00:1e.4 eth2: stmmac_dvr_remove: removing driver [82294.778677] intel-eth-pci 0000:00:1e.4 eth2: Timeout accessing MAC_VLAN_Tag_Filter [82294.779997] intel-eth-pci 0000:00:1e.4 eth2: failed to kill vid 0081/0 [82294.947053] intel-eth-pci 0000:00:1d.2 eth1: stmmac_dvr_remove: removing driver [82295.002091] intel-eth-pci 0000:00:1d.1 eth0: stmmac_dvr_remove: removing driver Therefore, we delay the serdes power-down to be after unregister_netdev() which triggers the VLAN filter delete. Fixes: b9663b7ca6ff ("net: stmmac: Enable SERDES power up/down sequence") Signed-off-by: Ong Boon Leong Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 00d566df2cceb8591913b3ea3b43d2918915f7e3 Author: Paul Moore Date: Thu Mar 4 16:29:51 2021 -0500 cipso,calipso: resolve a number of problems with the DOI refcounts commit ad5d07f4a9cd671233ae20983848874731102c08 upstream. The current CIPSO and CALIPSO refcounting scheme for the DOI definitions is a bit flawed in that we: 1. Don't correctly match gets/puts in netlbl_cipsov4_list(). 2. Decrement the refcount on each attempt to remove the DOI from the DOI list, only removing it from the list once the refcount drops to zero. This patch fixes these problems by adding the missing "puts" to netlbl_cipsov4_list() and introduces a more conventional, i.e. not-buggy, refcounting mechanism to the DOI definitions. Upon the addition of a DOI to the DOI list, it is initialized with a refcount of one, removing a DOI from the list removes it from the list and drops the refcount by one; "gets" and "puts" behave as expected with respect to refcounts, increasing and decreasing the DOI's refcount by one. Fixes: b1edeb102397 ("netlabel: Replace protocol/NetLabel linking with refrerence counts") Fixes: d7cce01504a0 ("netlabel: Add support for removing a CALIPSO DOI.") Reported-by: syzbot+9ec037722d2603a9f52e@syzkaller.appspotmail.com Signed-off-by: Paul Moore Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 3e4b709e00977f62fb38b029dce9681ff6fc9877 Author: Hillf Danton Date: Thu Mar 4 10:30:09 2021 -0800 netdevsim: init u64 stats for 32bit hardware commit 863a42b289c22df63db62b10fc2c2ffc237e2125 upstream. Init the u64 stats in order to avoid the lockdep prints on the 32bit hardware like INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 0 PID: 4695 Comm: syz-executor.0 Not tainted 5.11.0-rc5-syzkaller #0 Hardware name: ARM-Versatile Express Backtrace: [<826fc5b8>] (dump_backtrace) from [<826fc82c>] (show_stack+0x18/0x1c arch/arm/kernel/traps.c:252) [<826fc814>] (show_stack) from [<8270d1f8>] (__dump_stack lib/dump_stack.c:79 [inline]) [<826fc814>] (show_stack) from [<8270d1f8>] (dump_stack+0xa8/0xc8 lib/dump_stack.c:120) [<8270d150>] (dump_stack) from [<802bf9c0>] (assign_lock_key kernel/locking/lockdep.c:935 [inline]) [<8270d150>] (dump_stack) from [<802bf9c0>] (register_lock_class+0xabc/0xb68 kernel/locking/lockdep.c:1247) [<802bef04>] (register_lock_class) from [<802baa2c>] (__lock_acquire+0x84/0x32d4 kernel/locking/lockdep.c:4711) [<802ba9a8>] (__lock_acquire) from [<802be840>] (lock_acquire.part.0+0xf0/0x554 kernel/locking/lockdep.c:5442) [<802be750>] (lock_acquire.part.0) from [<802bed10>] (lock_acquire+0x6c/0x74 kernel/locking/lockdep.c:5415) [<802beca4>] (lock_acquire) from [<81560548>] (seqcount_lockdep_reader_access include/linux/seqlock.h:103 [inline]) [<802beca4>] (lock_acquire) from [<81560548>] (__u64_stats_fetch_begin include/linux/u64_stats_sync.h:164 [inline]) [<802beca4>] (lock_acquire) from [<81560548>] (u64_stats_fetch_begin include/linux/u64_stats_sync.h:175 [inline]) [<802beca4>] (lock_acquire) from [<81560548>] (nsim_get_stats64+0xdc/0xf0 drivers/net/netdevsim/netdev.c:70) [<8156046c>] (nsim_get_stats64) from [<81e2efa0>] (dev_get_stats+0x44/0xd0 net/core/dev.c:10405) [<81e2ef5c>] (dev_get_stats) from [<81e53204>] (rtnl_fill_stats+0x38/0x120 net/core/rtnetlink.c:1211) [<81e531cc>] (rtnl_fill_stats) from [<81e59d58>] (rtnl_fill_ifinfo+0x6d4/0x148c net/core/rtnetlink.c:1783) [<81e59684>] (rtnl_fill_ifinfo) from [<81e5ceb4>] (rtmsg_ifinfo_build_skb+0x9c/0x108 net/core/rtnetlink.c:3798) [<81e5ce18>] (rtmsg_ifinfo_build_skb) from [<81e5d0ac>] (rtmsg_ifinfo_event net/core/rtnetlink.c:3830 [inline]) [<81e5ce18>] (rtmsg_ifinfo_build_skb) from [<81e5d0ac>] (rtmsg_ifinfo_event net/core/rtnetlink.c:3821 [inline]) [<81e5ce18>] (rtmsg_ifinfo_build_skb) from [<81e5d0ac>] (rtmsg_ifinfo+0x44/0x70 net/core/rtnetlink.c:3839) [<81e5d068>] (rtmsg_ifinfo) from [<81e45c2c>] (register_netdevice+0x664/0x68c net/core/dev.c:10103) [<81e455c8>] (register_netdevice) from [<815608bc>] (nsim_create+0xf8/0x124 drivers/net/netdevsim/netdev.c:317) [<815607c4>] (nsim_create) from [<81561184>] (__nsim_dev_port_add+0x108/0x188 drivers/net/netdevsim/dev.c:941) [<8156107c>] (__nsim_dev_port_add) from [<815620d8>] (nsim_dev_port_add_all drivers/net/netdevsim/dev.c:990 [inline]) [<8156107c>] (__nsim_dev_port_add) from [<815620d8>] (nsim_dev_probe+0x5cc/0x750 drivers/net/netdevsim/dev.c:1119) [<81561b0c>] (nsim_dev_probe) from [<815661dc>] (nsim_bus_probe+0x10/0x14 drivers/net/netdevsim/bus.c:287) [<815661cc>] (nsim_bus_probe) from [<811724c0>] (really_probe+0x100/0x50c drivers/base/dd.c:554) [<811723c0>] (really_probe) from [<811729c4>] (driver_probe_device+0xf8/0x1c8 drivers/base/dd.c:740) [<811728cc>] (driver_probe_device) from [<81172fe4>] (__device_attach_driver+0x8c/0xf0 drivers/base/dd.c:846) [<81172f58>] (__device_attach_driver) from [<8116fee0>] (bus_for_each_drv+0x88/0xd8 drivers/base/bus.c:431) [<8116fe58>] (bus_for_each_drv) from [<81172c6c>] (__device_attach+0xdc/0x1d0 drivers/base/dd.c:914) [<81172b90>] (__device_attach) from [<8117305c>] (device_initial_probe+0x14/0x18 drivers/base/dd.c:961) [<81173048>] (device_initial_probe) from [<81171358>] (bus_probe_device+0x90/0x98 drivers/base/bus.c:491) [<811712c8>] (bus_probe_device) from [<8116e77c>] (device_add+0x320/0x824 drivers/base/core.c:3109) [<8116e45c>] (device_add) from [<8116ec9c>] (device_register+0x1c/0x20 drivers/base/core.c:3182) [<8116ec80>] (device_register) from [<81566710>] (nsim_bus_dev_new drivers/net/netdevsim/bus.c:336 [inline]) [<8116ec80>] (device_register) from [<81566710>] (new_device_store+0x178/0x208 drivers/net/netdevsim/bus.c:215) [<81566598>] (new_device_store) from [<8116fcb4>] (bus_attr_store+0x2c/0x38 drivers/base/bus.c:122) [<8116fc88>] (bus_attr_store) from [<805b4b8c>] (sysfs_kf_write+0x48/0x54 fs/sysfs/file.c:139) [<805b4b44>] (sysfs_kf_write) from [<805b3c90>] (kernfs_fop_write_iter+0x128/0x1ec fs/kernfs/file.c:296) [<805b3b68>] (kernfs_fop_write_iter) from [<804d22fc>] (call_write_iter include/linux/fs.h:1901 [inline]) [<805b3b68>] (kernfs_fop_write_iter) from [<804d22fc>] (new_sync_write fs/read_write.c:518 [inline]) [<805b3b68>] (kernfs_fop_write_iter) from [<804d22fc>] (vfs_write+0x3dc/0x57c fs/read_write.c:605) [<804d1f20>] (vfs_write) from [<804d2604>] (ksys_write+0x68/0xec fs/read_write.c:658) [<804d259c>] (ksys_write) from [<804d2698>] (__do_sys_write fs/read_write.c:670 [inline]) [<804d259c>] (ksys_write) from [<804d2698>] (sys_write+0x10/0x14 fs/read_write.c:667) [<804d2688>] (sys_write) from [<80200060>] (ret_fast_syscall+0x0/0x2c arch/arm/mm/proc-v7.S:64) Fixes: 83c9e13aa39a ("netdevsim: add software driver for testing offloads") Reported-by: syzbot+e74a6857f2d0efe3ad81@syzkaller.appspotmail.com Tested-by: Dmitry Vyukov Signed-off-by: Hillf Danton Signed-off-by: Jakub Kicinski Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d493ac5ea098c76b18f812ccf207e6a98a2c847a Author: Daniele Palmas Date: Thu Mar 4 14:15:13 2021 +0100 net: usb: qmi_wwan: allow qmimux add/del with master up commit 6c59cff38e66584ae3ac6c2f0cbd8d039c710ba7 upstream. There's no reason for preventing the creation and removal of qmimux network interfaces when the underlying interface is up. This makes qmi_wwan mux implementation more similar to the rmnet one, simplifying userspace management of the same logical interfaces. Fixes: c6adf77953bc ("net: usb: qmi_wwan: add qmap mux protocol support") Reported-by: Aleksander Morgado Signed-off-by: Daniele Palmas Acked-by: Bjørn Mork Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 12e1eb5a1a9bbf0a357358392d5f416ba4631956 Author: Vladimir Oltean Date: Thu Mar 4 12:56:53 2021 +0200 net: dsa: sja1105: fix SGMII PCS being forced to SPEED_UNKNOWN instead of SPEED_10 commit 053d8ad10d585adf9891fcd049637536e2fe9ea7 upstream. When using MLO_AN_PHY or MLO_AN_FIXED, the MII_BMCR of the SGMII PCS is read before resetting the switch so it can be reprogrammed afterwards. This works for the speeds of 1Gbps and 100Mbps, but not for 10Mbps, because SPEED_10 is actually 0, so AND-ing anything with 0 is false, therefore that last branch is dead code. Do what others do (genphy_read_status_fixed, phy_mii_ioctl) and just remove the check for SPEED_10, let it fall into the default case. Fixes: ffe10e679cec ("net: dsa: sja1105: Add support for the SGMII port") Signed-off-by: Vladimir Oltean Reviewed-by: Andrew Lunn Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2930991670c65e2c0affddc8610723d3451c6b37 Author: Vladimir Oltean Date: Thu Mar 4 12:29:43 2021 +0200 net: mscc: ocelot: properly reject destination IP keys in VCAP IS1 commit f1becbed411c6fa29d7ce3def3a1dcd4f63f2d74 upstream. An attempt is made to warn the user about the fact that VCAP IS1 cannot offload keys matching on destination IP (at least given the current half key format), but sadly that warning fails miserably in practice, due to the fact that it operates on an uninitialized "match" variable. We must first decode the keys from the flow rule. Fixes: 75944fda1dfe ("net: mscc: ocelot: offload ingress skbedit and vlan actions to VCAP IS1") Reported-by: Colin Ian King Signed-off-by: Vladimir Oltean Reviewed-by: Colin Ian King Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 41fed22409f567faabdb88742039477314546e7b Author: Maximilian Heyne Date: Thu Mar 4 14:43:17 2021 +0000 net: sched: avoid duplicates in classes dump commit bfc2560563586372212b0a8aeca7428975fa91fe upstream. This is a follow up of commit ea3274695353 ("net: sched: avoid duplicates in qdisc dump") which has fixed the issue only for the qdisc dump. The duplicate printing also occurs when dumping the classes via tc class show dev eth0 Fixes: 59cc1f61f09c ("net: sched: convert qdisc linked list to hashtable") Signed-off-by: Maximilian Heyne Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d0f380d733eff3ab787dccdfc1c88f9409bcc151 Author: Ido Schimmel Date: Thu Mar 4 10:57:53 2021 +0200 nexthop: Do not flush blackhole nexthops when loopback goes down commit 76c03bf8e2624076b88d93542d78e22d5345c88e upstream. As far as user space is concerned, blackhole nexthops do not have a nexthop device and therefore should not be affected by the administrative or carrier state of any netdev. However, when the loopback netdev goes down all the blackhole nexthops are flushed. This happens because internally the kernel associates blackhole nexthops with the loopback netdev. This behavior is both confusing to those not familiar with kernel internals and also diverges from the legacy API where blackhole IPv4 routes are not flushed when the loopback netdev goes down: # ip route add blackhole 198.51.100.0/24 # ip link set dev lo down # ip route show 198.51.100.0/24 blackhole 198.51.100.0/24 Blackhole IPv6 routes are flushed, but at least user space knows that they are associated with the loopback netdev: # ip -6 route show 2001:db8:1::/64 blackhole 2001:db8:1::/64 dev lo metric 1024 pref medium Fix this by only flushing blackhole nexthops when the loopback netdev is unregistered. Fixes: ab84be7e54fc ("net: Initial nexthop code") Signed-off-by: Ido Schimmel Reported-by: Donald Sharp Reviewed-by: David Ahern Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit cef0c0a3f9baeb56d931d62b5bf2fa6493e2dfea Author: Ong Boon Leong Date: Wed Mar 3 20:38:40 2021 +0530 net: stmmac: fix incorrect DMA channel intr enable setting of EQoS v4.10 commit 879c348c35bb5fb758dd881d8a97409c1862dae8 upstream. We introduce dwmac410_dma_init_channel() here for both EQoS v4.10 and above which use different DMA_CH(n)_Interrupt_Enable bit definitions for NIE and AIE. Fixes: 48863ce5940f ("stmmac: add DMA support for GMAC 4.xx") Signed-off-by: Ong Boon Leong Signed-off-by: Ramesh Babu B Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 56c88527b360bd22f8c9940ba78316c73066af78 Author: Kevin(Yudong) Yang Date: Wed Mar 3 09:43:54 2021 -0500 net/mlx4_en: update moderation when config reset commit 00ff801bb8ce6711e919af4530b6ffa14a22390a upstream. This patch fixes a bug that the moderation config will not be applied when calling mlx4_en_reset_config. For example, when turning on rx timestamping, mlx4_en_reset_config() will be called, causing the NIC to forget previous moderation config. This fix is in phase with a previous fix: commit 79c54b6bbf06 ("net/mlx4_en: Fix TX moderation info loss after set_ringparam is called") Tested: Before this patch, on a host with NIC using mlx4, run netserver and stream TCP to the host at full utilization. $ sar -I SUM 1 INTR intr/s 14:03:56 sum 48758.00 After rx hwtstamp is enabled: $ sar -I SUM 1 14:10:38 sum 317771.00 We see the moderation is not working properly and issued 7x more interrupts. After the patch, and turned on rx hwtstamp, the rate of interrupts is as expected: $ sar -I SUM 1 14:52:11 sum 49332.00 Fixes: 79c54b6bbf06 ("net/mlx4_en: Fix TX moderation info loss after set_ringparam is called") Signed-off-by: Kevin(Yudong) Yang Reviewed-by: Eric Dumazet Reviewed-by: Neal Cardwell CC: Tariq Toukan Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e4abfedd42b90afc08036a9656046eed936d50d4 Author: Biao Huang Date: Tue Mar 2 11:33:23 2021 +0800 net: ethernet: mtk-star-emac: fix wrong unmap in RX handling commit 95b39f07a17faef3a9b225248ba449b976e529c8 upstream. mtk_star_dma_unmap_rx() should unmap the dma_addr of old skb rather than that of new skb. Assign new_dma_addr to desc_data.dma_addr after all handling of old skb ends to avoid unexpected receive side error. Fixes: f96e9641e92b ("net: ethernet: mtk-star-emac: fix error path in RX handling") Signed-off-by: Biao Huang Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 10ec8fe8eac06c4660bb15666867bc5d465bf681 Author: DENG Qingfang Date: Tue Mar 2 00:01:59 2021 +0800 net: dsa: tag_mtk: fix 802.1ad VLAN egress commit 9200f515c41f4cbaeffd8fdd1d8b6373a18b1b67 upstream. A different TPID bit is used for 802.1ad VLAN frames. Reported-by: Ilario Gelmetti Fixes: f0af34317f4b ("net: dsa: mediatek: combine MediaTek tag with VLAN tag") Signed-off-by: DENG Qingfang Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 8220486f94514114d382935e8931748d789bd69a Author: Vladimir Oltean Date: Mon Mar 1 13:18:18 2021 +0200 net: enetc: keep RX ring consumer index in sync with hardware commit 3a5d12c9be6f30080600c8bacaf310194e37d029 upstream. The RX rings have a producer index owned by hardware, where newly received frame buffers are placed, and a consumer index owned by software, where newly allocated buffers are placed, in expectation of hardware being able to place frame data in them. Hardware increments the producer index when a frame is received, however it is not allowed to increment the producer index to match the consumer index (RBCIR) since the ring can hold at most RBLENR[LENGTH]-1 received BDs. Whenever the producer index matches the value of the consumer index, the ring has no unprocessed received frames and all BDs in the ring have been initialized/prepared by software, i.e. hardware owns all BDs in the ring. The code uses the next_to_clean variable to keep track of the producer index, and the next_to_use variable to keep track of the consumer index. The RX rings are seeded from enetc_refill_rx_ring, which is called from two places: 1. initially the ring is seeded until full with enetc_bd_unused(rx_ring), i.e. with 511 buffers. This will make next_to_clean=0 and next_to_use=511: .ndo_open -> enetc_open -> enetc_setup_bdrs -> enetc_setup_rxbdr -> enetc_refill_rx_ring 2. then during the data path processing, it is refilled with 16 buffers at a time: enetc_msix -> napi_schedule -> enetc_poll -> enetc_clean_rx_ring -> enetc_refill_rx_ring There is just one problem: the initial seeding done during .ndo_open updates just the producer index (ENETC_RBPIR) with 0, and the software next_to_clean and next_to_use variables. Notably, it will not update the consumer index to make the hardware aware of the newly added buffers. Wait, what? So how does it work? Well, the reset values of the producer index and of the consumer index of a ring are both zero. As per the description in the second paragraph, it means that the ring is full of buffers waiting for hardware to put frames in them, which by coincidence is almost true, because we have in fact seeded 511 buffers into the ring. But will the hardware attempt to access the 512th entry of the ring, which has an invalid BD in it? Well, no, because in order to do that, it would have to first populate the first 511 entries, and the NAPI enetc_poll will kick in by then. Eventually, after 16 processed slots have become available in the RX ring, enetc_clean_rx_ring will call enetc_refill_rx_ring and then will [ finally ] update the consumer index with the new software next_to_use variable. From now on, the next_to_clean and next_to_use variables are in sync with the producer and consumer ring indices. So the day is saved, right? Well, not quite. Freeing the memory allocated for the rings is done in: enetc_close -> enetc_clear_bdrs -> enetc_clear_rxbdr -> this just disables the ring -> enetc_free_rxtx_rings -> enetc_free_rx_ring -> sets next_to_clean and next_to_use to 0 but again, nothing is committed to the hardware producer and consumer indices (yay!). The assumption is that the ring is disabled, so the indices don't matter anyway, and it's the responsibility of the "open" code path to set those up. .. Except that the "open" code path does not set those up properly. While initially, things almost work, during subsequent enetc_close -> enetc_open sequences, we have problems. To be precise, the enetc_open that is subsequent to enetc_close will again refill the ring with 511 entries, but it will leave the consumer index untouched. Untouched means, of course, equal to the value it had before disabling the ring and draining the old buffers in enetc_close. But as mentioned, enetc_setup_rxbdr will at least update the producer index though, through this line of code: enetc_rxbdr_wr(hw, idx, ENETC_RBPIR, 0); so at this stage we'll have: next_to_clean=0 (in hardware 0) next_to_use=511 (in hardware we'll have the refill index prior to enetc_close) Again, the next_to_clean and producer index are in sync and set to correct values, so the driver manages to limp on. Eventually, 16 ring entries will be consumed by enetc_poll, and the savior enetc_clean_rx_ring will come and call enetc_refill_rx_ring, and then update the hardware consumer ring based upon the new next_to_use. So.. it works? Well, by coincidence, it almost does, but there's a circumstance where enetc_clean_rx_ring won't be there to save us. If the previous value of the consumer index was 15, there's a problem, because the NAPI poll sequence will only issue a refill when 16 or more buffers have been consumed. It's easiest to illustrate this with an example: ip link set eno0 up ip addr add 192.168.100.1/24 dev eno0 ping 192.168.100.1 -c 20 # ping this port from another board ip link set eno0 down ip link set eno0 up ping 192.168.100.1 -c 20 # ping it again from the same other board One by one: 1. ip link set eno0 up -> calls enetc_setup_rxbdr: -> calls enetc_refill_rx_ring(511 buffers) -> next_to_clean=0 (in hw 0) -> next_to_use=511 (in hw 0) 2. ping 192.168.100.1 -c 20 # ping this port from another board enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=1 next_to_clean 0 (in hw 1) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=2 next_to_clean 1 (in hw 2) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=3 next_to_clean 2 (in hw 3) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=4 next_to_clean 3 (in hw 4) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=5 next_to_clean 4 (in hw 5) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=6 next_to_clean 5 (in hw 6) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=7 next_to_clean 6 (in hw 7) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=8 next_to_clean 7 (in hw 8) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=9 next_to_clean 8 (in hw 9) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=10 next_to_clean 9 (in hw 10) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=11 next_to_clean 10 (in hw 11) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=12 next_to_clean 11 (in hw 12) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=13 next_to_clean 12 (in hw 13) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=14 next_to_clean 13 (in hw 14) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=15 next_to_clean 14 (in hw 15) next_to_use 511 (in hw 0) enetc_clean_rx_ring: enetc_refill_rx_ring(16) increments next_to_use by 16 (mod 512) and writes it to hw enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=0 next_to_clean 15 (in hw 16) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=1 next_to_clean 16 (in hw 17) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=2 next_to_clean 17 (in hw 18) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=3 next_to_clean 18 (in hw 19) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=4 next_to_clean 19 (in hw 20) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=5 next_to_clean 20 (in hw 21) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=6 next_to_clean 21 (in hw 22) next_to_use 15 (in hw 15) 20 packets transmitted, 20 packets received, 0% packet loss 3. ip link set eno0 down enetc_free_rx_ring: next_to_clean 0 (in hw 22), next_to_use 0 (in hw 15) 4. ip link set eno0 up -> calls enetc_setup_rxbdr: -> calls enetc_refill_rx_ring(511 buffers) -> next_to_clean=0 (in hw 0) -> next_to_use=511 (in hw 15) 5. ping 192.168.100.1 -c 20 # ping it again from the same other board enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=1 next_to_clean 0 (in hw 1) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=2 next_to_clean 1 (in hw 2) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=3 next_to_clean 2 (in hw 3) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=4 next_to_clean 3 (in hw 4) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=5 next_to_clean 4 (in hw 5) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=6 next_to_clean 5 (in hw 6) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=7 next_to_clean 6 (in hw 7) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=8 next_to_clean 7 (in hw 8) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=9 next_to_clean 8 (in hw 9) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=10 next_to_clean 9 (in hw 10) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=11 next_to_clean 10 (in hw 11) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=12 next_to_clean 11 (in hw 12) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=13 next_to_clean 12 (in hw 13) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=14 next_to_clean 13 (in hw 14) next_to_use 511 (in hw 15) 20 packets transmitted, 12 packets received, 40% packet loss And there it dies. No enetc_refill_rx_ring (because cleaned_cnt must be equal to 15 for that to happen), no nothing. The hardware enters the condition where the producer (14) + 1 is equal to the consumer (15) index, which makes it believe it has no more free buffers to put packets in, so it starts discarding them: ip netns exec ns0 ethtool -S eno0 | grep -v ': 0' NIC statistics: Rx ring 0 discarded frames: 8 Summarized, if the interface receives between 16 and 32 (mod 512) frames and then there is a link flap, then the port will eventually die with no way to recover. If it receives less than 16 (mod 512) frames, then the initial NAPI poll [ before the link flap ] will not update the consumer index in hardware (it will remain zero) which will be ok when the buffers are later reinitialized. If more than 32 (mod 512) frames are received, the initial NAPI poll has the chance to refill the ring twice, updating the consumer index to at least 32. So after the link flap, the consumer index is still wrong, but the post-flap NAPI poll gets a chance to refill the ring once (because it passes through cleaned_cnt=15) and makes the consumer index be again back in sync with next_to_use. The solution to this problem is actually simple, we just need to write next_to_use into the hardware consumer index at enetc_open time, which always brings it back in sync after an initial buffer seeding process. The simpler thing would be to put the write to the consumer index into enetc_refill_rx_ring directly, but there are issues with the MDIO locking: in the NAPI poll code we have the enetc_lock_mdio() taken from top-level and we use the unlocked enetc_wr_reg_hot, whereas in enetc_open, the enetc_lock_mdio() is not taken at the top level, but instead by each individual enetc_wr_reg, so we are forced to put an additional enetc_wr_reg in enetc_setup_rxbdr. Better organization of the code is left as a refactoring exercise. Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") Signed-off-by: Vladimir Oltean Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit cf01b4ba6788949daf90803d6d2cedbbb07784e5 Author: Vladimir Oltean Date: Mon Mar 1 13:18:17 2021 +0200 net: enetc: remove bogus write to SIRXIDR from enetc_setup_rxbdr commit 96a5223b918c8b79270fc0fec235a7ebad459098 upstream. The Station Interface Receive Interrupt Detect Register (SIRXIDR) contains a 16-bit wide mask of 'interrupt detected' events for each ring associated with a port. Bit i is write-1-to-clean for RX ring i. I have no explanation whatsoever how this line of code came to be inserted in the blamed commit. I checked the downstream versions of that patch and none of them have it. The somewhat comical aspect of it is that we're writing a binary number to the SIRXIDR register, which is derived from enetc_bd_unused(rx_ring). Since the RX rings have 512 buffer descriptors, we end up writing 511 to this register, which is 0x1ff, so we are effectively clearing the 'interrupt detected' event for rings 0-8. This register is not what is used for interrupt handling though - it only provides a summary for the entire SI. The hardware provides one separate Interrupt Detect Register per RX ring, which auto-clears upon read. So there doesn't seem to be any adverse effect caused by this bogus write. There is, however, one reason why this should be handled as a bugfix: next_to_clean _should_ be committed to hardware, just not to that register, and this was obscuring the fact that it wasn't. This is fixed in the next patch, and removing the bogus line now allows the fix patch to be backported beyond that point. Fixes: fd5736bf9f23 ("enetc: Workaround for MDIO register access issue") Signed-off-by: Vladimir Oltean Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 6997660f384dc63cf0c64f0844ffd6333516c8a5 Author: Vladimir Oltean Date: Mon Mar 1 13:18:16 2021 +0200 net: enetc: force the RGMII speed and duplex instead of operating in inband mode commit c76a97218dcbb2cb7cec1404ace43ef96c87d874 upstream. The ENETC port 0 MAC supports in-band status signaling coming from a PHY when operating in RGMII mode, and this feature is enabled by default. It has been reported that RGMII is broken in fixed-link, and that is not surprising considering the fact that no PHY is attached to the MAC in that case, but a switch. This brings us to the topic of the patch: the enetc driver should have not enabled the optional in-band status signaling for RGMII unconditionally, but should have forced the speed and duplex to what was resolved by phylink. Note that phylink does not accept the RGMII modes as valid for in-band signaling, and these operate a bit differently than 1000base-x and SGMII (notably there is no clause 37 state machine so no ACK required from the MAC, instead the PHY sends extra code words on RXD[3:0] whenever it is not transmitting something else, so it should be safe to leave a PHY with this option unconditionally enabled even if we ignore it). The spec talks about this here: https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/138/RGMIIv1_5F00_3.pdf Fixes: 71b77a7a27a3 ("enetc: Migrate to PHYLINK and PCS_LYNX") Cc: Florian Fainelli Cc: Andrew Lunn Cc: Russell King Signed-off-by: Vladimir Oltean Acked-by: Russell King Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 272db1b7f5aab6d0843bfe9c25f77a68e376f32f Author: Vladimir Oltean Date: Mon Mar 1 13:18:15 2021 +0200 net: enetc: don't disable VLAN filtering in IFF_PROMISC mode commit a74dbce9d4541888fe0d39afe69a3a95004669b4 upstream. Quoting from the blamed commit: In promiscuous mode, it is more intuitive that all traffic is received, including VLAN tagged traffic. It appears that it is necessary to set the flag in PSIPVMR for that to be the case, so VLAN promiscuous mode is also temporarily enabled. On exit from promiscuous mode, the setting made by ethtool is restored. Intuitive or not, there isn't any definition issued by a standards body which says that promiscuity has anything to do with VLAN filtering - it only has to do with accepting packets regardless of destination MAC address. In fact people are already trying to use this misunderstanding/bug of the enetc driver as a justification to transform promiscuity into something it never was about: accepting every packet (maybe that would be the "rx-all" netdev feature?): https://lore.kernel.org/netdev/20201110153958.ci5ekor3o2ekg3ky@ipetronik.com/ This is relevant because there are use cases in the kernel (such as tc-flower rules with the protocol 802.1Q and a vlan_id key) which do not (yet) use the vlan_vid_add API to be compatible with VLAN-filtering NICs such as enetc, so for those, disabling rx-vlan-filter is currently the only right solution to make these setups work: https://lore.kernel.org/netdev/CA+h21hoxwRdhq4y+w8Kwgm74d4cA0xLeiHTrmT-VpSaM7obhkg@mail.gmail.com/ The blamed patch has unintentionally introduced one more way for this to work, which is to enable IFF_PROMISC, however this is non-portable because port promiscuity is not meant to disable VLAN filtering. Therefore, it could invite people to write broken scripts for enetc, and then wonder why they are broken when migrating to other drivers that don't handle promiscuity in the same way. Fixes: 7070eea5e95a ("enetc: permit configuration of rx-vlan-filter with ethtool") Cc: Markus Blöchl Signed-off-by: Vladimir Oltean Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 6db63b552ae54e97941d6a6a16fb5a1f31c59161 Author: Vladimir Oltean Date: Mon Mar 1 13:18:14 2021 +0200 net: enetc: fix incorrect TPID when receiving 802.1ad tagged packets commit 827b6fd046516af605e190c872949f22208b5d41 upstream. When the enetc ports have rx-vlan-offload enabled, they report a TPID of ETH_P_8021Q regardless of what was actually in the packet. When rx-vlan-offload is disabled, packets have the proper TPID. Fix this inconsistency by finishing the TODO left in the code. Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") Signed-off-by: Vladimir Oltean Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ff966263f5f9fdf9740f03fed0762ce73c230a6a Author: Vladimir Oltean Date: Mon Mar 1 13:18:13 2021 +0200 net: enetc: take the MDIO lock only once per NAPI poll cycle commit 6d36ecdbc4410e61a0e02adc5d3abeee22a8ffd3 upstream. The workaround for the ENETC MDIO erratum caused a performance degradation of 82 Kpps (seen with IP forwarding of two 1Gbps streams of 64B packets). This is due to excessive locking and unlocking in the fast path, which can be avoided. By taking the MDIO read-side lock only once per NAPI poll cycle, we are able to regain 54 Kpps (65%) of the performance hit. The rest of the performance degradation comes from the TX data path, but unfortunately it doesn't look like we can optimize that away easily, even with netdev_xmit_more(), there just isn't any skb batching done, to help with taking the MDIO lock less often than once per packet. We need to change the register accessor type for enetc_get_tx_tstamp, because it now runs under the enetc_lock_mdio as per the new call path detailed below: enetc_msix -> napi_schedule -> enetc_poll -> enetc_lock_mdio -> enetc_clean_tx_ring -> enetc_get_tx_tstamp -> enetc_clean_rx_ring -> enetc_unlock_mdio Fixes: fd5736bf9f23 ("enetc: Workaround for MDIO register access issue") Signed-off-by: Vladimir Oltean Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 5fb230e6334111bf44b07d4ad6cb621546a8f59d Author: Vladimir Oltean Date: Mon Mar 1 13:18:12 2021 +0200 net: enetc: initialize RFS/RSS memories for unused ports too commit 3222b5b613db558e9a494bbf53f3c984d90f71ea upstream. Michael reports that since linux-next-20210211, the AER messages for ECC errors have started reappearing, and this time they can be reliably reproduced with the first ping on one of his LS1028A boards. $ ping 1[ 33.258069] pcieport 0000:00:1f.0: AER: Multiple Corrected error received: 0000:00:00.0 72.16.0.1 PING [ 33.267050] pcieport 0000:00:1f.0: AER: can't find device of ID0000 172.16.0.1 (172.16.0.1): 56 data bytes 64 bytes from 172.16.0.1: seq=0 ttl=64 time=17.124 ms 64 bytes from 172.16.0.1: seq=1 ttl=64 time=0.273 ms $ devmem 0x1f8010e10 32 0xC0000006 It isn't clear why this is necessary, but it seems that for the errors to go away, we must clear the entire RFS and RSS memory, not just for the ports in use. Sadly the code is structured in such a way that we can't have unified logic for the used and unused ports. For the minimal initialization of an unused port, we need just to enable and ioremap the PF memory space, and a control buffer descriptor ring. Unused ports must then free the CBDR because the driver will exit, but used ports can not pick up from where that code path left, since the CBDR API does not reinitialize a ring when setting it up, so its producer and consumer indices are out of sync between the software and hardware state. So a separate enetc_init_unused_port function was created, and it gets called right after the PF memory space is enabled. Fixes: 07bf34a50e32 ("net: enetc: initialize the RFS and RSS memories") Reported-by: Michael Walle Cc: Jesse Brandeburg Signed-off-by: Vladimir Oltean Tested-by: Michael Walle Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 6a1d94af43bad9b0333c920e7c9f8574124c75c2 Author: Vladimir Oltean Date: Mon Mar 1 13:18:11 2021 +0200 net: enetc: don't overwrite the RSS indirection table when initializing commit c646d10dda2dcde82c6ce5a474522621ab2b8b19 upstream. After the blamed patch, all RX traffic gets hashed to CPU 0 because the hashing indirection table set up in: enetc_pf_probe -> enetc_alloc_si_resources -> enetc_configure_si -> enetc_setup_default_rss_table is overwritten later in: enetc_pf_probe -> enetc_init_port_rss_memory which zero-initializes the entire port RSS table in order to avoid ECC errors. The trouble really is that enetc_init_port_rss_memory really neads enetc_alloc_si_resources to be called, because it depends upon enetc_alloc_cbdr and enetc_setup_cbdr. But that whole enetc_configure_si thing could have been better thought out, it has nothing to do in a function called "alloc_si_resources", especially since its counterpart, "free_si_resources", does nothing to unwind the configuration of the SI. The point is, we need to pull out enetc_configure_si out of enetc_alloc_resources, and move it after enetc_init_port_rss_memory. This allows us to set up the default RSS indirection table after initializing the memory. Fixes: 07bf34a50e32 ("net: enetc: initialize the RFS and RSS memories") Cc: Jesse Brandeburg Signed-off-by: Vladimir Oltean Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 05f23acf9083611a56de1fdb0d52136e57951a42 Author: Sergey Shtylyov Date: Sun Feb 28 23:25:43 2021 +0300 sh_eth: fix TRSCER mask for SH771x commit 8c91bc3d44dfef8284af384877fbe61117e8b7d1 upstream. According to the SH7710, SH7712, SH7713 Group User's Manual: Hardware, Rev. 3.00, the TRSCER register actually has only bit 7 valid (and named differently), with all the other bits reserved. Apparently, this was not the case with some early revisions of the manual as we have the other bits declared (and set) in the original driver. Follow the suit and add the explicit sh_eth_cpu_data::trscer_err_mask initializer for SH771x... Fixes: 86a74ff21a7a ("net: sh_eth: add support for Renesas SuperH Ethernet") Signed-off-by: Sergey Shtylyov Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 74563632e33779f16c1cc040cf0ef35e9163743b Author: DENG Qingfang Date: Mon Mar 1 01:08:23 2021 +0800 net: dsa: tag_rtl4_a: fix egress tags commit 9eb8bc593a5eed167dac2029abef343854c5ba75 upstream. Commit 86dd9868b878 has several issues, but was accepted too soon before anyone could take a look. - Double free. dsa_slave_xmit() will free the skb if the xmit function returns NULL, but the skb is already freed by eth_skb_pad(). Use __skb_put_padto() to avoid that. - Unnecessary allocation. It has been done by DSA core since commit a3b0b6479700. - A u16 pointer points to skb data. It should be __be16 for network byte order. - Typo in comments. "numer" -> "number". Fixes: 86dd9868b878 ("net: dsa: tag_rtl4_a: Support also egress tags") Signed-off-by: DENG Qingfang Reviewed-by: Florian Fainelli Reviewed-by: Linus Walleij Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 95cafdf41f1893c3690011ddd031d5c233b33f9a Author: Jakub Kicinski Date: Tue Mar 2 18:46:43 2021 -0800 docs: networking: drop special stable handling commit dbbe7c962c3a8163bf724dbc3c9fdfc9b16d3117 upstream. Leave it to Greg. Signed-off-by: Jakub Kicinski Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e410aad6f3825d1782727cfd2ee48adc9fc518e2 Author: Linus Torvalds Date: Wed Mar 10 10:18:04 2021 -0800 Revert "mm, slub: consider rest of partial list if acquire_slab() fails" commit 9b1ea29bc0d7b94d420f96a0f4121403efc3dd85 upstream. This reverts commit 8ff60eb052eeba95cfb3efe16b08c9199f8121cf. The kernel test robot reports a huge performance regression due to the commit, and the reason seems fairly straightforward: when there is contention on the page list (which is what causes acquire_slab() to fail), we do _not_ want to just loop and try again, because that will transfer the contention to the 'n->list_lock' spinlock we hold, and just make things even worse. This is admittedly likely a problem only on big machines - the kernel test robot report comes from a 96-thread dual socket Intel Xeon Gold 6252 setup, but the regression there really is quite noticeable: -47.9% regression of stress-ng.rawpkt.ops_per_sec and the commit that was marked as being fixed (7ced37197196: "slub: Acquire_slab() avoid loop") actually did the loop exit early very intentionally (the hint being that "avoid loop" part of that commit message), exactly to avoid this issue. The correct thing to do may be to pick some kind of reasonable middle ground: instead of breaking out of the loop on the very first sign of contention, or trying over and over and over again, the right thing may be to re-try _once_, and then give up on the second failure (or pick your favorite value for "once"..). Reported-by: kernel test robot Link: https://lore.kernel.org/lkml/20210301080404.GF12822@xsang-OptiPlex-9020/ Cc: Jann Horn Cc: David Rientjes Cc: Joonsoo Kim Acked-by: Christoph Lameter Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit f8f1613b1581f824d0dae2977e49b966e0de91b7 Author: Paulo Alcantara Date: Mon Mar 8 12:00:49 2021 -0300 cifs: return proper error code in statfs(2) commit 14302ee3301b3a77b331cc14efb95bf7184c73cc upstream. In cifs_statfs(), if server->ops->queryfs is not NULL, then we should use its return value rather than always returning 0. Instead, use rc variable as it is properly set to 0 in case there is no server->ops->queryfs. Signed-off-by: Paulo Alcantara (SUSE) Reviewed-by: Aurelien Aptel Reviewed-by: Ronnie Sahlberg CC: Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 6e9c30bcbdc8cb60a3a9d7fa04802289ad39f5db Author: Aurelien Aptel Date: Thu Mar 4 17:42:21 2021 +0000 cifs: fix credit accounting for extra channel commit a249cc8bc2e2fed680047d326eb9a50756724198 upstream. With multichannel, operations like the queries from "ls -lR" can cause all credits to be used and errors to be returned since max_credits was not being set correctly on the secondary channels and thus the client was requesting 0 credits incorrectly in some cases (which can lead to not having enough credits to perform any operation on that channel). Signed-off-by: Aurelien Aptel CC: # v5.8+ Reviewed-by: Shyam Prasad N Signed-off-by: Steve French Signed-off-by: Greg Kroah-Hartman commit 2d71f1c1cecb6a143e7485fe0f19b4c6631cc7bd Author: Christian Brauner Date: Sat Mar 6 11:10:10 2021 +0100 mount: fix mounting of detached mounts onto targets that reside on shared mounts commit ee2e3f50629f17b0752b55b2566c15ce8dafb557 upstream. Creating a series of detached mounts, attaching them to the filesystem, and unmounting them can be used to trigger an integer overflow in ns->mounts causing the kernel to block any new mounts in count_mounts() and returning ENOSPC because it falsely assumes that the maximum number of mounts in the mount namespace has been reached, i.e. it thinks it can't fit the new mounts into the mount namespace anymore. Depending on the number of mounts in your system, this can be reproduced on any kernel that supportes open_tree() and move_mount() by compiling and running the following program: /* SPDX-License-Identifier: LGPL-2.1+ */ #define _GNU_SOURCE #include #include #include #include #include #include #include #include #include #include #include #include #include /* open_tree() */ #ifndef OPEN_TREE_CLONE #define OPEN_TREE_CLONE 1 #endif #ifndef OPEN_TREE_CLOEXEC #define OPEN_TREE_CLOEXEC O_CLOEXEC #endif #ifndef __NR_open_tree #if defined __alpha__ #define __NR_open_tree 538 #elif defined _MIPS_SIM #if _MIPS_SIM == _MIPS_SIM_ABI32 /* o32 */ #define __NR_open_tree 4428 #endif #if _MIPS_SIM == _MIPS_SIM_NABI32 /* n32 */ #define __NR_open_tree 6428 #endif #if _MIPS_SIM == _MIPS_SIM_ABI64 /* n64 */ #define __NR_open_tree 5428 #endif #elif defined __ia64__ #define __NR_open_tree (428 + 1024) #else #define __NR_open_tree 428 #endif #endif /* move_mount() */ #ifndef MOVE_MOUNT_F_EMPTY_PATH #define MOVE_MOUNT_F_EMPTY_PATH 0x00000004 /* Empty from path permitted */ #endif #ifndef __NR_move_mount #if defined __alpha__ #define __NR_move_mount 539 #elif defined _MIPS_SIM #if _MIPS_SIM == _MIPS_SIM_ABI32 /* o32 */ #define __NR_move_mount 4429 #endif #if _MIPS_SIM == _MIPS_SIM_NABI32 /* n32 */ #define __NR_move_mount 6429 #endif #if _MIPS_SIM == _MIPS_SIM_ABI64 /* n64 */ #define __NR_move_mount 5429 #endif #elif defined __ia64__ #define __NR_move_mount (428 + 1024) #else #define __NR_move_mount 429 #endif #endif static inline int sys_open_tree(int dfd, const char *filename, unsigned int flags) { return syscall(__NR_open_tree, dfd, filename, flags); } static inline int sys_move_mount(int from_dfd, const char *from_pathname, int to_dfd, const char *to_pathname, unsigned int flags) { return syscall(__NR_move_mount, from_dfd, from_pathname, to_dfd, to_pathname, flags); } static bool is_shared_mountpoint(const char *path) { bool shared = false; FILE *f = NULL; char *line = NULL; int i; size_t len = 0; f = fopen("/proc/self/mountinfo", "re"); if (!f) return 0; while (getline(&line, &len, f) > 0) { char *slider1, *slider2; for (slider1 = line, i = 0; slider1 && i < 4; i++) slider1 = strchr(slider1 + 1, ' '); if (!slider1) continue; slider2 = strchr(slider1 + 1, ' '); if (!slider2) continue; *slider2 = '\0'; if (strcmp(slider1 + 1, path) == 0) { /* This is the path. Is it shared? */ slider1 = strchr(slider2 + 1, ' '); if (slider1 && strstr(slider1, "shared:")) { shared = true; break; } } } fclose(f); free(line); return shared; } static void usage(void) { const char *text = "mount-new [--recursive] \n"; fprintf(stderr, "%s", text); _exit(EXIT_SUCCESS); } #define exit_usage(format, ...) \ ({ \ fprintf(stderr, format "\n", ##__VA_ARGS__); \ usage(); \ }) #define exit_log(format, ...) \ ({ \ fprintf(stderr, format "\n", ##__VA_ARGS__); \ exit(EXIT_FAILURE); \ }) static const struct option longopts[] = { {"help", no_argument, 0, 'a'}, { NULL, no_argument, 0, 0 }, }; int main(int argc, char *argv[]) { int exit_code = EXIT_SUCCESS, index = 0; int dfd, fd_tree, new_argc, ret; char *base_dir; char *const *new_argv; char target[PATH_MAX]; while ((ret = getopt_long_only(argc, argv, "", longopts, &index)) != -1) { switch (ret) { case 'a': /* fallthrough */ default: usage(); } } new_argv = &argv[optind]; new_argc = argc - optind; if (new_argc < 1) exit_usage("Missing base directory\n"); base_dir = new_argv[0]; if (*base_dir != '/') exit_log("Please specify an absolute path"); /* Ensure that target is a shared mountpoint. */ if (!is_shared_mountpoint(base_dir)) exit_log("Please ensure that \"%s\" is a shared mountpoint", base_dir); dfd = open(base_dir, O_RDONLY | O_DIRECTORY | O_CLOEXEC); if (dfd < 0) exit_log("%m - Failed to open base directory \"%s\"", base_dir); ret = mkdirat(dfd, "detached-move-mount", 0755); if (ret < 0) exit_log("%m - Failed to create required temporary directories"); ret = snprintf(target, sizeof(target), "%s/detached-move-mount", base_dir); if (ret < 0 || (size_t)ret >= sizeof(target)) exit_log("%m - Failed to assemble target path"); /* * Having a mount table with 10000 mounts is already quite excessive * and shoult account even for weird test systems. */ for (size_t i = 0; i < 10000; i++) { fd_tree = sys_open_tree(dfd, "detached-move-mount", OPEN_TREE_CLONE | OPEN_TREE_CLOEXEC | AT_EMPTY_PATH); if (fd_tree < 0) { fprintf(stderr, "%m - Failed to open %d(detached-move-mount)", dfd); exit_code = EXIT_FAILURE; break; } ret = sys_move_mount(fd_tree, "", dfd, "detached-move-mount", MOVE_MOUNT_F_EMPTY_PATH); if (ret < 0) { if (errno == ENOSPC) fprintf(stderr, "%m - Buggy mount counting"); else fprintf(stderr, "%m - Failed to attach mount to %d(detached-move-mount)", dfd); exit_code = EXIT_FAILURE; break; } close(fd_tree); ret = umount2(target, MNT_DETACH); if (ret < 0) { fprintf(stderr, "%m - Failed to unmount %s", target); exit_code = EXIT_FAILURE; break; } } (void)unlinkat(dfd, "detached-move-mount", AT_REMOVEDIR); close(dfd); exit(exit_code); } and wait for the kernel to refuse any new mounts by returning ENOSPC. How many iterations are needed depends on the number of mounts in your system. Assuming you have something like 50 mounts on a standard system it should be almost instantaneous. The root cause of this is that detached mounts aren't handled correctly when source and target mount are identical and reside on a shared mount causing a broken mount tree where the detached source itself is propagated which propagation prevents for regular bind-mounts and new mounts. This ultimately leads to a miscalculation of the number of mounts in the mount namespace. Detached mounts created via open_tree(fd, path, OPEN_TREE_CLONE) are essentially like an unattached new mount, or an unattached bind-mount. They can then later on be attached to the filesystem via move_mount() which calls into attach_recursive_mount(). Part of attaching it to the filesystem is making sure that mounts get correctly propagated in case the destination mountpoint is MS_SHARED, i.e. is a shared mountpoint. This is done by calling into propagate_mnt() which walks the list of peers calling propagate_one() on each mount in this list making sure it receives the propagation event. The propagate_one() functions thereby skips both new mounts and bind mounts to not propagate them "into themselves". Both are identified by checking whether the mount is already attached to any mount namespace in mnt->mnt_ns. The is what the IS_MNT_NEW() helper is responsible for. However, detached mounts have an anonymous mount namespace attached to them stashed in mnt->mnt_ns which means that IS_MNT_NEW() doesn't realize they need to be skipped causing the mount to propagate "into itself" breaking the mount table and causing a disconnect between the number of mounts recorded as being beneath or reachable from the target mountpoint and the number of mounts actually recorded/counted in ns->mounts ultimately causing an overflow which in turn prevents any new mounts via the ENOSPC issue. So teach propagation to handle detached mounts by making it aware of them. I've been tracking this issue down for the last couple of days and then verifying that the fix is correct by unmounting everything in my current mount table leaving only /proc and /sys mounted and running the reproducer above overnight verifying the number of mounts counted in ns->mounts. With this fix the counts are correct and the ENOSPC issue can't be reproduced. This change will only have an effect on mounts created with the new mount API since detached mounts cannot be created with the old mount API so regressions are extremely unlikely. Link: https://lore.kernel.org/r/20210306101010.243666-1-christian.brauner@ubuntu.com Fixes: 2db154b3ea8e ("vfs: syscall: Add move_mount(2) to move mounts around") Cc: David Howells Cc: Al Viro Cc: linux-fsdevel@vger.kernel.org Cc: Reviewed-by: Christoph Hellwig Signed-off-by: Christian Brauner Signed-off-by: Greg Kroah-Hartman commit ad83fa5a15a6a394718ae4034a44d58904acabd1 Author: Johan Hovold Date: Mon Mar 1 10:05:19 2021 +0100 gpio: fix gpio-device list corruption commit cf25ef6b631c6fc6c0435fc91eba8734cca20511 upstream. Make sure to hold the gpio_lock when removing the gpio device from the gpio_devices list (when dropping the last reference) to avoid corrupting the list when there are concurrent accesses. Fixes: ff2b13592299 ("gpio: make the gpiochip a real device") Cc: stable@vger.kernel.org # 4.6 Reviewed-by: Saravana Kannan Signed-off-by: Johan Hovold Signed-off-by: Bartosz Golaszewski [ johan: adjust context to 5.11 ] Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman commit 9b9c21c218c03ad4815e120e08d4b5aa26a72607 Author: Lorenzo Bianconi Date: Sun Feb 7 12:48:31 2021 +0100 mt76: dma: do not report truncated frames to mac80211 commit d0bd52c591a1070c54dc428e926660eb4f981099 upstream. Commit b102f0c522cf6 ("mt76: fix array overflow on receiving too many fragments for a packet") fixes a possible OOB access but it introduces a memory leak since the pending frame is not released to page_frag_cache if the frag array of skb_shared_info is full. Commit 93a1d4791c10 ("mt76: dma: fix a possible memory leak in mt76_add_fragment()") fixes the issue but does not free the truncated skb that is forwarded to mac80211 layer. Fix the leftover issue discarding even truncated skbs. Fixes: 93a1d4791c10 ("mt76: dma: fix a possible memory leak in mt76_add_fragment()") Signed-off-by: Lorenzo Bianconi Signed-off-by: Kalle Valo Link: https://lore.kernel.org/r/a03166fcc8214644333c68674a781836e0f57576.1612697217.git.lorenzo@kernel.org Signed-off-by: Greg Kroah-Hartman commit 70a43e24f58e8002bbebeff165f484927ae1d375 Author: Junlin Yang Date: Fri Mar 5 16:48:39 2021 +0800 ibmvnic: remove excessive irqsave commit 69cdb7947adb816fc9325b4ec02a6dddd5070b82 upstream. ibmvnic_remove locks multiple spinlocks while disabling interrupts: spin_lock_irqsave(&adapter->state_lock, flags); spin_lock_irqsave(&adapter->rwi_lock, flags); As reported by coccinelle, the second _irqsave() overwrites the value saved in 'flags' by the first _irqsave(), therefore when the second _irqrestore() comes,the value in 'flags' is not valid,the value saved by the first _irqsave() has been lost. This likely leads to IRQs remaining disabled. So remove the second _irqsave(): spin_lock_irqsave(&adapter->state_lock, flags); spin_lock(&adapter->rwi_lock); Generated by: ./scripts/coccinelle/locks/flags.cocci ./drivers/net/ethernet/ibm/ibmvnic.c:5413:1-18: ERROR: nested lock+irqsave that reuses flags from line 5404. Fixes: 4a41c421f367 ("ibmvnic: serialize access to work queue on remove") Signed-off-by: Junlin Yang Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 7ff7b1a6d6e6f982370faf75a32f8f3f0f98b0ca Author: Jiri Wiesner Date: Thu Mar 4 17:18:28 2021 +0100 ibmvnic: always store valid MAC address commit 67eb211487f0c993d9f402d1c196ef159fd6a3b5 upstream. The last change to ibmvnic_set_mac(), 8fc3672a8ad3, meant to prevent users from setting an invalid MAC address on an ibmvnic interface that has not been brought up yet. The change also prevented the requested MAC address from being stored by the adapter object for an ibmvnic interface when the state of the ibmvnic interface is VNIC_PROBED - that is after probing has finished but before the ibmvnic interface is brought up. The MAC address stored by the adapter object is used and sent to the hypervisor for checking when an ibmvnic interface is brought up. The ibmvnic driver ignoring the requested MAC address when in VNIC_PROBED state caused LACP bonds (bonds in 802.3ad mode) with more than one slave to malfunction. The bonding code must be able to change the MAC address of its slaves before they are brought up during enslaving. The inability of kernels with 8fc3672a8ad3 to set the MAC addresses of bonding slaves is observable in the output of "ip address show". The MAC addresses of the slaves are the same as the MAC address of the bond on a working system whereas the slaves retain their original MAC addresses on a system with a malfunctioning LACP bond. Fixes: 8fc3672a8ad3 ("ibmvnic: fix ibmvnic_set_mac") Signed-off-by: Jiri Wiesner Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e6e77eda7286efcc184d209b4e537db2ca2eda20 Author: Michal Suchanek Date: Tue Mar 2 20:47:47 2021 +0100 ibmvnic: Fix possibly uninitialized old_num_tx_queues variable warning. commit 6881b07fdd24850def1f03761c66042b983ff86e upstream. GCC 7.5 reports: ../drivers/net/ethernet/ibm/ibmvnic.c: In function 'ibmvnic_reset_init': ../drivers/net/ethernet/ibm/ibmvnic.c:5373:51: warning: 'old_num_tx_queues' may be used uninitialized in this function [-Wmaybe-uninitialized] ../drivers/net/ethernet/ibm/ibmvnic.c:5373:6: warning: 'old_num_rx_queues' may be used uninitialized in this function [-Wmaybe-uninitialized] The variable is initialized only if(reset) and used only if(reset && something) so this is a false positive. However, there is no reason to not initialize the variables unconditionally avoiding the warning. Fixes: 635e442f4a48 ("ibmvnic: merge ibmvnic_reset_init and ibmvnic_init") Signed-off-by: Michal Suchanek Reviewed-by: Sukadev Bhattiprolu Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit badba52f7bfcc2060161d07497c346f42d337285 Author: Maciej Fijalkowski Date: Wed Mar 3 19:56:36 2021 +0100 libbpf: Clear map_info before each bpf_obj_get_info_by_fd commit 2b2aedabc44e9660f90ccf7ba1ca2706d75f411f upstream. xsk_lookup_bpf_maps, based on prog_fd, looks whether current prog has a reference to XSKMAP. BPF prog can include insns that work on various BPF maps and this is covered by iterating through map_ids. The bpf_map_info that is passed to bpf_obj_get_info_by_fd for filling needs to be cleared at each iteration, so that it doesn't contain any outdated fields and that is currently missing in the function of interest. To fix that, zero-init map_info via memset before each bpf_obj_get_info_by_fd call. Also, since the area of this code is touched, in general strcmp is considered harmful, so let's convert it to strncmp and provide the size of the array name for current map_info. While at it, do s/continue/break/ once we have found the xsks_map to terminate the search. Fixes: 5750902a6e9b ("libbpf: proper XSKMAP cleanup") Signed-off-by: Maciej Fijalkowski Signed-off-by: Daniel Borkmann Acked-by: Björn Töpel Link: https://lore.kernel.org/bpf/20210303185636.18070-4-maciej.fijalkowski@intel.com Signed-off-by: Greg Kroah-Hartman commit ad9704ef14d0b24e18593cb0ff182a1bd8f9004f Author: Maciej Fijalkowski Date: Wed Mar 3 19:56:35 2021 +0100 samples, bpf: Add missing munmap in xdpsock commit 6bc6699881012b5bd5d49fa861a69a37fc01b49c upstream. We mmap the umem region, but we never munmap it. Add the missing call at the end of the cleanup. Fixes: 3945b37a975d ("samples/bpf: use hugepages in xdpsock app") Signed-off-by: Maciej Fijalkowski Signed-off-by: Daniel Borkmann Acked-by: Björn Töpel Link: https://lore.kernel.org/bpf/20210303185636.18070-3-maciej.fijalkowski@intel.com Signed-off-by: Greg Kroah-Hartman commit 8bc6d1c56d92e730ed5244a304f26af80499c7cc Author: Yauheni Kaliuta Date: Sun Feb 28 12:30:17 2021 +0200 selftests/bpf: Mask bpf_csum_diff() return value to 16 bits in test_verifier commit 6185266c5a853bb0f2a459e3ff594546f277609b upstream. The verifier test labelled "valid read map access into a read-only array 2" calls the bpf_csum_diff() helper and checks its return value. However, architecture implementations of csum_partial() (which is what the helper uses) differ in whether they fold the return value to 16 bit or not. For example, x86 version has ... if (unlikely(odd)) { result = from32to16(result); result = ((result >> 8) & 0xff) | ((result & 0xff) << 8); } ... while generic lib/checksum.c does: result = from32to16(result); if (odd) result = ((result >> 8) & 0xff) | ((result & 0xff) << 8); This makes the helper return different values on different architectures, breaking the test on non-x86. To fix this, add an additional instruction to always mask the return value to 16 bits, and update the expected return value accordingly. Fixes: fb2abb73e575 ("bpf, selftest: test {rd, wr}only flags and direct value access") Signed-off-by: Yauheni Kaliuta Signed-off-by: Daniel Borkmann Link: https://lore.kernel.org/bpf/20210228103017.320240-1-yauheni.kaliuta@redhat.com Signed-off-by: Greg Kroah-Hartman commit 02e58f0aa436299d07c5dfe65538075e5a61a638 Author: Hangbin Liu Date: Wed Feb 24 16:14:03 2021 +0800 selftests/bpf: No need to drop the packet when there is no geneve opt commit 557c223b643a35effec9654958d8edc62fd2603a upstream. In bpf geneve tunnel test we set geneve option on tx side. On rx side we only call bpf_skb_get_tunnel_opt(). Since commit 9c2e14b48119 ("ip_tunnels: Set tunnel option flag when tunnel metadata is present") geneve_rx() will not add TUNNEL_GENEVE_OPT flag if there is no geneve option, which cause bpf_skb_get_tunnel_opt() return ENOENT and _geneve_get_tunnel() in test_tunnel_kern.c drop the packet. As it should be valid that bpf_skb_get_tunnel_opt() return error when there is not tunnel option, there is no need to drop the packet and break all geneve rx traffic. Just set opt_class to 0 in this test and keep returning TC_ACT_OK. Fixes: 933a741e3b82 ("selftests/bpf: bpf tunnel test.") Signed-off-by: Hangbin Liu Signed-off-by: Daniel Borkmann Acked-by: William Tu Link: https://lore.kernel.org/bpf/20210224081403.1425474-1-liuhangbin@gmail.com Signed-off-by: Greg Kroah-Hartman commit 2073eca18603a7d4dee90dbaba5908169b627eef Author: Ilya Leoshkevich Date: Sat Feb 27 06:17:26 2021 +0100 selftests/bpf: Use the last page in test_snprintf_btf on s390 commit 42a382a466a967dc053c73b969cd2ac2fec502cf upstream. test_snprintf_btf fails on s390, because NULL points to a readable struct lowcore there. Fix by using the last page instead. Error message example: printing fffffffffffff000 should generate error, got (361) Fixes: 076a95f5aff2 ("selftests/bpf: Add bpf_snprintf_btf helper tests") Signed-off-by: Ilya Leoshkevich Signed-off-by: Daniel Borkmann Acked-by: Heiko Carstens Acked-by: Yonghong Song Link: https://lore.kernel.org/bpf/20210227051726.121256-1-iii@linux.ibm.com Signed-off-by: Greg Kroah-Hartman commit 3e14c42627f2cdac58ad9e036eee72240d1d8435 Author: Guangbin Huang Date: Sat Feb 27 11:05:58 2021 +0800 net: phy: fix save wrong speed and duplex problem if autoneg is on commit d9032dba5a2b2bbf0fdce67c8795300ec9923b43 upstream. If phy uses generic driver and autoneg is on, enter command "ethtool -s eth0 speed 50" will not change phy speed actually, but command "ethtool eth0" shows speed is 50Mb/s because phydev->speed has been set to 50 and no update later. And duplex setting has same problem too. However, if autoneg is on, phy only changes speed and duplex according to phydev->advertising, but not phydev->speed and phydev->duplex. So in this case, phydev->speed and phydev->duplex don't need to be set in function phy_ethtool_ksettings_set() if autoneg is on. Fixes: 51e2a3846eab ("PHY: Avoid unnecessary aneg restarts") Signed-off-by: Guangbin Huang Signed-off-by: Huazhong Tan Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 226d1438c67c7e1fa2b1807dfe8386adce5866ee Author: Jason A. Donenfeld Date: Sat Feb 27 01:40:19 2021 +0100 net: always use icmp{,v6}_ndo_send from ndo_start_xmit commit 4372339efc06bc2a796f4cc9d0a7a929dfda4967 upstream. There were a few remaining tunnel drivers that didn't receive the prior conversion to icmp{,v6}_ndo_send. Knowing now that this could lead to memory corrution (see ee576c47db60 ("net: icmp: pass zeroed opts from icmp{,v6}_ndo_send before sending") for details), there's even more imperative to have these all converted. So this commit goes through the remaining cases that I could find and does a boring translation to the ndo variety. The Fixes: line below is the merge that originally added icmp{,v6}_ ndo_send and converted the first batch of icmp{,v6}_send users. The rationale then for the change applies equally to this patch. It's just that these drivers were left out of the initial conversion because these network devices are hiding in net/ rather than in drivers/net/. Cc: Florian Westphal Cc: Willem de Bruijn Cc: David S. Miller Cc: Hideaki YOSHIFUJI Cc: David Ahern Cc: Jakub Kicinski Cc: Steffen Klassert Fixes: 803381f9f117 ("Merge branch 'icmp-account-for-NAT-when-sending-icmps-from-ndo-layer'") Signed-off-by: Jason A. Donenfeld Acked-by: Willem de Bruijn Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit be70d3dd9f7bd207d466e59744f4515a067339e3 Author: Vasily Averin Date: Sat Feb 27 11:27:45 2021 +0300 netfilter: x_tables: gpf inside xt_find_revision() commit 8e24edddad152b998b37a7f583175137ed2e04a5 upstream. nested target/match_revfn() calls work with xt[NFPROTO_UNSPEC] lists without taking xt[NFPROTO_UNSPEC].mutex. This can race with module unload and cause host to crash: general protection fault: 0000 [#1] Modules linked in: ... [last unloaded: xt_cluster] CPU: 0 PID: 542455 Comm: iptables RIP: 0010:[] [] strcmp+0x18/0x40 RDX: 0000000000000003 RSI: ffff9a5a5d9abe10 RDI: dead000000000111 R13: ffff9a5a5d9abe10 R14: ffff9a5a5d9abd8c R15: dead000000000100 (VvS: %R15 -- &xt_match, %RDI -- &xt_match.name, xt_cluster unregister match in xt[NFPROTO_UNSPEC].match list) Call Trace: [] match_revfn+0x54/0xc0 [] match_revfn+0xaf/0xc0 [] xt_find_revision+0x6e/0xf0 [] do_ipt_get_ctl+0x100/0x420 [ip_tables] [] nf_getsockopt+0x4f/0x70 [] ip_getsockopt+0xde/0x100 [] raw_getsockopt+0x25/0x50 [] sock_common_getsockopt+0x1a/0x20 [] SyS_getsockopt+0x7d/0xf0 [] system_call_fastpath+0x25/0x2a Fixes: 656caff20e1 ("netfilter 04/09: x_tables: fix match/target revision lookup") Signed-off-by: Vasily Averin Reviewed-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit c85ee6f273e288894bc27bfe31e578d6cffdb9df Author: Florian Westphal Date: Wed Feb 24 17:23:19 2021 +0100 netfilter: nf_nat: undo erroneous tcp edemux lookup commit 03a3ca37e4c6478e3a84f04c8429dd5889e107fd upstream. Under extremely rare conditions TCP early demux will retrieve the wrong socket. 1. local machine establishes a connection to a remote server, S, on port p. This gives: laddr:lport -> S:p ... both in tcp and conntrack. 2. local machine establishes a connection to host H, on port p2. 2a. TCP stack choses same laddr:lport, so we have laddr:lport -> H:p2 from TCP point of view. 2b). There is a destination NAT rewrite in place, translating H:p2 to S:p. This results in following conntrack entries: I) laddr:lport -> S:p (origin) S:p -> laddr:lport (reply) II) laddr:lport -> H:p2 (origin) S:p -> laddr:lport2 (reply) NAT engine has rewritten laddr:lport to laddr:lport2 to map the reply packet to the correct origin. When server sends SYN/ACK to laddr:lport2, the PREROUTING hook will undo-the SNAT transformation, rewriting IP header to S:p -> laddr:lport This causes TCP early demux to associate the skb with the TCP socket of the first connection. The INPUT hook will then reverse the DNAT transformation, rewriting the IP header to H:p2 -> laddr:lport. Because packet ends up with the wrong socket, the new connection never completes: originator stays in SYN_SENT and conntrack entry remains in SYN_RECV until timeout, and responder retransmits SYN/ACK until it gives up. To resolve this, orphan the skb after the input rewrite: Because the source IP address changed, the socket must be incorrect. We can't move the DNAT undo to prerouting due to backwards compatibility, doing so will make iptables/nftables rules to no longer match the way they did. After orphan, the packet will be handed to the next protocol layer (tcp, udp, ...) and that will repeat the socket lookup just like as if early demux was disabled. Fixes: 41063e9dd1195 ("ipv4: Early TCP socket demux.") Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1427 Signed-off-by: Florian Westphal Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman commit 3b72d5a703842f582502d97906f17d6ee122dac2 Author: Eric Dumazet Date: Mon Mar 1 10:29:17 2021 -0800 tcp: add sanity tests to TCP_QUEUE_SEQ commit 8811f4a9836e31c14ecdf79d9f3cb7c5d463265d upstream. Qingyu Li reported a syzkaller bug where the repro changes RCV SEQ _after_ restoring data in the receive queue. mprotect(0x4aa000, 12288, PROT_READ) = 0 mmap(0x1ffff000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x1ffff000 mmap(0x20000000, 16777216, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x20000000 mmap(0x21000000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x21000000 socket(AF_INET6, SOCK_STREAM, IPPROTO_IP) = 3 setsockopt(3, SOL_TCP, TCP_REPAIR, [1], 4) = 0 connect(3, {sa_family=AF_INET6, sin6_port=htons(0), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 0 setsockopt(3, SOL_TCP, TCP_REPAIR_QUEUE, [1], 4) = 0 sendmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="0x0000000000000003\0\0", iov_len=20}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 20 setsockopt(3, SOL_TCP, TCP_REPAIR, [0], 4) = 0 setsockopt(3, SOL_TCP, TCP_QUEUE_SEQ, [128], 4) = 0 recvfrom(3, NULL, 20, 0, NULL, NULL) = -1 ECONNRESET (Connection reset by peer) syslog shows: [ 111.205099] TCP recvmsg seq # bug 2: copied 80, seq 0, rcvnxt 80, fl 0 [ 111.207894] WARNING: CPU: 1 PID: 356 at net/ipv4/tcp.c:2343 tcp_recvmsg_locked+0x90e/0x29a0 This should not be allowed. TCP_QUEUE_SEQ should only be used when queues are empty. This patch fixes this case, and the tx path as well. Fixes: ee9952831cfd ("tcp: Initial repair mode") Signed-off-by: Eric Dumazet Cc: Pavel Emelyanov Link: https://bugzilla.kernel.org/show_bug.cgi?id=212005 Reported-by: Qingyu Li Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit cda8b1cbecacacf4e378930dcf7ea787a6f6e2c1 Author: Arjun Roy Date: Thu Feb 25 15:26:28 2021 -0800 tcp: Fix sign comparison bug in getsockopt(TCP_ZEROCOPY_RECEIVE) commit 2107d45f17bedd7dbf4178462da0ac223835a2a7 upstream. getsockopt(TCP_ZEROCOPY_RECEIVE) has a bug where we read a user-provided "len" field of type signed int, and then compare the value to the result of an "offsetofend" operation, which is unsigned. Negative values provided by the user will be promoted to large positive numbers; thus checking that len < offsetofend() will return false when the intention was that it return true. Note that while len is originally checked for negative values earlier on in do_tcp_getsockopt(), subsequent calls to get_user() re-read the value from userspace which may have changed in the meantime. Therefore, re-add the check for negative values after the call to get_user in the handler code for TCP_ZEROCOPY_RECEIVE. Fixes: c8856c051454 ("tcp-zerocopy: Return inq along with tcp receive zerocopy.") Reported-by: kernel test robot Reported-by: Dan Carpenter Signed-off-by: Arjun Roy Link: https://lore.kernel.org/r/20210225232628.4033281-1-arjunroy.kdev@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit c7a871958d57f3f1df32543ff74b17455e32a75b Author: Torin Cooper-Bennun Date: Fri Feb 26 16:34:41 2021 +0000 can: tcan4x5x: tcan4x5x_init(): fix initialization - clear MRAM before entering Normal Mode commit 2712625200ed69c642b9abc3a403830c4643364c upstream. This patch prevents a potentially destructive race condition. The device is fully operational on the bus after entering Normal Mode, so zeroing the MRAM after entering this mode may lead to loss of information, e.g. new received messages. This patch fixes the problem by first initializing the MRAM, then bringing the device into Normale Mode. Fixes: 5443c226ba91 ("can: tcan4x5x: Add tcan4x5x driver to the kernel") Link: https://lore.kernel.org/r/20210226163440.313628-1-torin@maxiluxsystems.com Suggested-by: Marc Kleine-Budde Signed-off-by: Torin Cooper-Bennun Signed-off-by: Marc Kleine-Budde Signed-off-by: Greg Kroah-Hartman commit d2ba576e68e4bf5e9deb59c6fb530401c3116d90 Author: Joakim Zhang Date: Thu Feb 18 19:00:37 2021 +0800 can: flexcan: invoke flexcan_chip_freeze() to enter freeze mode commit c63820045e2000f05657467a08715c18c9f490d9 upstream. Invoke flexcan_chip_freeze() to enter freeze mode, since need poll freeze mode acknowledge. Fixes: e955cead03117 ("CAN: Add Flexcan CAN controller driver") Link: https://lore.kernel.org/r/20210218110037.16591-4-qiangqing.zhang@nxp.com Signed-off-by: Joakim Zhang Signed-off-by: Marc Kleine-Budde Signed-off-by: Greg Kroah-Hartman commit e857529dfec3ae93da27e4d3288c6bb8ae8196ab Author: Joakim Zhang Date: Thu Feb 18 19:00:36 2021 +0800 can: flexcan: enable RX FIFO after FRZ/HALT valid commit ec15e27cc8904605846a354bb1f808ea1432f853 upstream. RX FIFO enable failed could happen when do system reboot stress test: [ 0.303958] flexcan 5a8d0000.can: 5a8d0000.can supply xceiver not found, using dummy regulator [ 0.304281] flexcan 5a8d0000.can (unnamed net_device) (uninitialized): Could not enable RX FIFO, unsupported core [ 0.314640] flexcan 5a8d0000.can: registering netdev failed [ 0.320728] flexcan 5a8e0000.can: 5a8e0000.can supply xceiver not found, using dummy regulator [ 0.320991] flexcan 5a8e0000.can (unnamed net_device) (uninitialized): Could not enable RX FIFO, unsupported core [ 0.331360] flexcan 5a8e0000.can: registering netdev failed [ 0.337444] flexcan 5a8f0000.can: 5a8f0000.can supply xceiver not found, using dummy regulator [ 0.337716] flexcan 5a8f0000.can (unnamed net_device) (uninitialized): Could not enable RX FIFO, unsupported core [ 0.348117] flexcan 5a8f0000.can: registering netdev failed RX FIFO should be enabled after the FRZ/HALT are valid. But the current code enable RX FIFO and FRZ/HALT at the same time. Fixes: e955cead03117 ("CAN: Add Flexcan CAN controller driver") Link: https://lore.kernel.org/r/20210218110037.16591-3-qiangqing.zhang@nxp.com Signed-off-by: Joakim Zhang Signed-off-by: Marc Kleine-Budde Signed-off-by: Greg Kroah-Hartman commit f05cf960f49519bfe39990cbe62f58965f7aa5c7 Author: Joakim Zhang Date: Thu Feb 18 19:00:35 2021 +0800 can: flexcan: assert FRZ bit in flexcan_chip_freeze() commit 449052cfebf624b670faa040245d3feed770d22f upstream. Assert HALT bit to enter freeze mode, there is a premise that FRZ bit is asserted. This patch asserts FRZ bit in flexcan_chip_freeze, although the reset value is 1b'1. This is a prepare patch, later patch will invoke flexcan_chip_freeze() to enter freeze mode, which polling freeze mode acknowledge. Fixes: b1aa1c7a2165b ("can: flexcan: fix transition from and to freeze mode in chip_{,un}freeze") Link: https://lore.kernel.org/r/20210218110037.16591-2-qiangqing.zhang@nxp.com Signed-off-by: Joakim Zhang Signed-off-by: Marc Kleine-Budde Signed-off-by: Greg Kroah-Hartman commit 21702a55f54afe7f8e160170f318c678ef3b4913 Author: Andy Shevchenko Date: Thu Feb 25 18:33:20 2021 +0200 gpio: pca953x: Set IRQ type when handle Intel Galileo Gen 2 commit eb441337c7147514ab45036cadf09c3a71e4ce31 upstream. The commit 0ea683931adb ("gpio: dwapb: Convert driver to using the GPIO-lib-based IRQ-chip") indeliberately made a regression on how IRQ line from GPIO I²C expander is handled. I.e. it reveals that the quirk for Intel Galileo Gen 2 misses the part of setting IRQ type which previously was predefined by gpio-dwapb driver. Now, we have to reorganize the approach to call necessary parts, which can be done via ACPI_GPIO_QUIRK_ABSOLUTE_NUMBER quirk. Without this fix and with above mentioned change the kernel hangs on the first IRQ event with: gpio gpiochip3: Persistence not supported for GPIO 1 irq 32, desc: 62f8fb50, depth: 0, count: 0, unhandled: 0 ->handle_irq(): 41c7b0ab, handle_bad_irq+0x0/0x40 ->irq_data.chip(): e03f1e72, 0xc2539218 ->action(): 0ecc7e6f ->action->handler(): 8a3db21e, irq_default_primary_handler+0x0/0x10 IRQ_NOPROBE set unexpected IRQ trap at vector 20 Fixes: ba8c90c61847 ("gpio: pca953x: Override IRQ for one of the expanders on Galileo Gen 2") Depends-on: 0ea683931adb ("gpio: dwapb: Convert driver to using the GPIO-lib-based IRQ-chip") Signed-off-by: Andy Shevchenko Acked-by: Mika Westerberg Reviewed-by: Linus Walleij Signed-off-by: Greg Kroah-Hartman commit 757a28c5a18f593e9a002fd81681dddcb6129daf Author: Oleksij Rempel Date: Fri Feb 26 10:24:56 2021 +0100 can: skb: can_skb_set_owner(): fix ref counting if socket was closed before setting skb ownership commit e940e0895a82c6fbaa259f2615eb52b57ee91a7e upstream. There are two ref count variables controlling the free()ing of a socket: - struct sock::sk_refcnt - which is changed by sock_hold()/sock_put() - struct sock::sk_wmem_alloc - which accounts the memory allocated by the skbs in the send path. In case there are still TX skbs on the fly and the socket() is closed, the struct sock::sk_refcnt reaches 0. In the TX-path the CAN stack clones an "echo" skb, calls sock_hold() on the original socket and references it. This produces the following back trace: | WARNING: CPU: 0 PID: 280 at lib/refcount.c:25 refcount_warn_saturate+0x114/0x134 | refcount_t: addition on 0; use-after-free. | Modules linked in: coda_vpu(E) v4l2_jpeg(E) videobuf2_vmalloc(E) imx_vdoa(E) | CPU: 0 PID: 280 Comm: test_can.sh Tainted: G E 5.11.0-04577-gf8ff6603c617 #203 | Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree) | Backtrace: | [<80bafea4>] (dump_backtrace) from [<80bb0280>] (show_stack+0x20/0x24) r7:00000000 r6:600f0113 r5:00000000 r4:81441220 | [<80bb0260>] (show_stack) from [<80bb593c>] (dump_stack+0xa0/0xc8) | [<80bb589c>] (dump_stack) from [<8012b268>] (__warn+0xd4/0x114) r9:00000019 r8:80f4a8c2 r7:83e4150c r6:00000000 r5:00000009 r4:80528f90 | [<8012b194>] (__warn) from [<80bb09c4>] (warn_slowpath_fmt+0x88/0xc8) r9:83f26400 r8:80f4a8d1 r7:00000009 r6:80528f90 r5:00000019 r4:80f4a8c2 | [<80bb0940>] (warn_slowpath_fmt) from [<80528f90>] (refcount_warn_saturate+0x114/0x134) r8:00000000 r7:00000000 r6:82b44000 r5:834e5600 r4:83f4d540 | [<80528e7c>] (refcount_warn_saturate) from [<8079a4c8>] (__refcount_add.constprop.0+0x4c/0x50) | [<8079a47c>] (__refcount_add.constprop.0) from [<8079a57c>] (can_put_echo_skb+0xb0/0x13c) | [<8079a4cc>] (can_put_echo_skb) from [<8079ba98>] (flexcan_start_xmit+0x1c4/0x230) r9:00000010 r8:83f48610 r7:0fdc0000 r6:0c080000 r5:82b44000 r4:834e5600 | [<8079b8d4>] (flexcan_start_xmit) from [<80969078>] (netdev_start_xmit+0x44/0x70) r9:814c0ba0 r8:80c8790c r7:00000000 r6:834e5600 r5:82b44000 r4:82ab1f00 | [<80969034>] (netdev_start_xmit) from [<809725a4>] (dev_hard_start_xmit+0x19c/0x318) r9:814c0ba0 r8:00000000 r7:82ab1f00 r6:82b44000 r5:00000000 r4:834e5600 | [<80972408>] (dev_hard_start_xmit) from [<809c6584>] (sch_direct_xmit+0xcc/0x264) r10:834e5600 r9:00000000 r8:00000000 r7:82b44000 r6:82ab1f00 r5:834e5600 r4:83f27400 | [<809c64b8>] (sch_direct_xmit) from [<809c6c0c>] (__qdisc_run+0x4f0/0x534) To fix this problem, only set skb ownership to sockets which have still a ref count > 0. Fixes: 0ae89beb283a ("can: add destructor for self generated skbs") Cc: Oliver Hartkopp Cc: Andre Naujoks Link: https://lore.kernel.org/r/20210226092456.27126-1-o.rempel@pengutronix.de Suggested-by: Eric Dumazet Signed-off-by: Oleksij Rempel Reviewed-by: Oliver Hartkopp Signed-off-by: Marc Kleine-Budde Signed-off-by: Greg Kroah-Hartman commit bdeb4e313a532c5e793e9b6f152ed58a8a700c28 Author: Andy Shevchenko Date: Thu Feb 25 18:33:19 2021 +0200 gpiolib: acpi: Allow to find GpioInt() resource by name and index commit 809390219fb9c2421239afe5c9eb862d73978ba0 upstream. Currently only search by index is supported. However, in some cases we might need to pass the quirks to the acpi_dev_gpio_irq_get(). For this, split out acpi_dev_gpio_irq_get_by() and replace acpi_dev_gpio_irq_get() by calling above with NULL for name parameter. Fixes: ba8c90c61847 ("gpio: pca953x: Override IRQ for one of the expanders on Galileo Gen 2") Depends-on: 0ea683931adb ("gpio: dwapb: Convert driver to using the GPIO-lib-based IRQ-chip") Signed-off-by: Andy Shevchenko Acked-by: Mika Westerberg Acked-by: Linus Walleij Signed-off-by: Greg Kroah-Hartman commit d64123aa0c10b32746c9d1ccc0a016ffbca75dcd Author: Andy Shevchenko Date: Thu Feb 25 18:33:18 2021 +0200 gpiolib: acpi: Add ACPI_GPIO_QUIRK_ABSOLUTE_NUMBER quirk commit 62d5247d239d4b48762192a251c647d7c997616a upstream. On some systems the ACPI tables has wrong pin number and instead of having a relative one it provides an absolute one in the global GPIO number space. Add ACPI_GPIO_QUIRK_ABSOLUTE_NUMBER quirk to cope with such cases. Fixes: ba8c90c61847 ("gpio: pca953x: Override IRQ for one of the expanders on Galileo Gen 2") Depends-on: 0ea683931adb ("gpio: dwapb: Convert driver to using the GPIO-lib-based IRQ-chip") Signed-off-by: Andy Shevchenko Acked-by: Mika Westerberg Acked-by: Linus Walleij Signed-off-by: Greg Kroah-Hartman commit 9144b0c03e1010380060abf38773d589eff516f8 Author: Matthias Schiffer Date: Wed Mar 3 16:50:49 2021 +0100 net: l2tp: reduce log level of messages in receive path, add counter instead commit 3e59e8856758eb5a2dfe1f831ef53b168fd58105 upstream. Commit 5ee759cda51b ("l2tp: use standard API for warning log messages") changed a number of warnings about invalid packets in the receive path so that they are always shown, instead of only when a special L2TP debug flag is set. Even with rate limiting these warnings can easily cause significant log spam - potentially triggered by a malicious party sending invalid packets on purpose. In addition these warnings were noticed by projects like Tunneldigger [1], which uses L2TP for its data path, but implements its own control protocol (which is sufficiently different from L2TP data packets that it would always be passed up to userspace even with future extensions of L2TP). Some of the warnings were already redundant, as l2tp_stats has a counter for these packets. This commit adds one additional counter for invalid packets that are passed up to userspace. Packets with unknown session are not counted as invalid, as there is nothing wrong with the format of these packets. With the additional counter, all of these messages are either redundant or benign, so we reduce them to pr_debug_ratelimited(). [1] https://github.com/wlanslovenija/tunneldigger/issues/160 Fixes: 5ee759cda51b ("l2tp: use standard API for warning log messages") Signed-off-by: Matthias Schiffer Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit edb24de123165bca427ea5b36db2faedc53b95b7 Author: Kalle Valo Date: Mon Feb 22 17:14:09 2021 +0200 ath11k: fix AP mode for QCA6390 commit 77d7e87128d4dfb400df4208b2812160e999c165 upstream. Commit c134d1f8c436 ("ath11k: Handle errors if peer creation fails") completely broke AP mode on QCA6390: kernel: [ 151.230734] ath11k_pci 0000:06:00.0: failed to create peer after vdev start delay: -22 wpa_supplicant[2307]: Failed to set beacon parameters wpa_supplicant[2307]: Interface initialization failed wpa_supplicant[2307]: wlan0: interface state UNINITIALIZED->DISABLED wpa_supplicant[2307]: wlan0: AP-DISABLED wpa_supplicant[2307]: wlan0: Unable to setup interface. wpa_supplicant[2307]: Failed to initialize AP interface This was because commit c134d1f8c436 ("ath11k: Handle errors if peer creation fails") added error handling for ath11k_peer_create(), which had been failing all along but was unnoticed due to the missing error handling. The actual bug was introduced already in commit aa44b2f3ecd4 ("ath11k: start vdev if a bss peer is already created"). ath11k_peer_create() was failing because for AP mode the peer is created already earlier op_add_interface() and we should skip creation here, but the check for modes was wrong. Fixing that makes AP mode work again. This shouldn't affect IPQ8074 nor QCN9074 as they have hw_params.vdev_start_delay disabled. Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1 Fixes: c134d1f8c436 ("ath11k: Handle errors if peer creation fails") Signed-off-by: Kalle Valo Link: https://lore.kernel.org/r/1614006849-25764-1-git-send-email-kvalo@codeaurora.org Signed-off-by: Greg Kroah-Hartman commit 4b39fd96cbaade5aadb2aa438b530c1a0d7e884f Author: Balazs Nemeth Date: Tue Mar 9 12:31:01 2021 +0100 net: avoid infinite loop in mpls_gso_segment when mpls_hlen == 0 commit d348ede32e99d3a04863e9f9b28d224456118c27 upstream. A packet with skb_inner_network_header(skb) == skb_network_header(skb) and ETH_P_MPLS_UC will prevent mpls_gso_segment from pulling any headers from the packet. Subsequently, the call to skb_mac_gso_segment will again call mpls_gso_segment with the same packet leading to an infinite loop. In addition, ensure that the header length is a multiple of four, which should hold irrespective of the number of stacked labels. Signed-off-by: Balazs Nemeth Acked-by: Willem de Bruijn Reviewed-by: David Ahern Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 99b1d3f74b9ef72c2f74c8e4c078e1bc0706e748 Author: Balazs Nemeth Date: Tue Mar 9 12:31:00 2021 +0100 net: check if protocol extracted by virtio_net_hdr_set_proto is correct commit 924a9bc362a5223cd448ca08c3dde21235adc310 upstream. For gso packets, virtio_net_hdr_set_proto sets the protocol (if it isn't set) based on the type in the virtio net hdr, but the skb could contain anything since it could come from packet_snd through a raw socket. If there is a mismatch between what virtio_net_hdr_set_proto sets and the actual protocol, then the skb could be handled incorrectly later on. An example where this poses an issue is with the subsequent call to skb_flow_dissect_flow_keys_basic which relies on skb->protocol being set correctly. A specially crafted packet could fool skb_flow_dissect_flow_keys_basic preventing EINVAL to be returned. Avoid blindly trusting the information provided by the virtio net header by checking that the protocol in the packet actually matches the protocol set by virtio_net_hdr_set_proto. Note that since the protocol is only checked if skb->dev implements header_ops->parse_protocol, packets from devices without the implementation are not checked at this stage. Fixes: 9274124f023b ("net: stricter validation of untrusted gso packets") Signed-off-by: Balazs Nemeth Acked-by: Willem de Bruijn Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 9b815b6efaf8072ea0cdf2ed22ac51c0b76574d9 Author: Daniel Borkmann Date: Fri Feb 26 22:22:48 2021 +0100 net: Fix gro aggregation for udp encaps with zero csum commit 89e5c58fc1e2857ccdaae506fb8bc5fed57ee063 upstream. We noticed a GRO issue for UDP-based encaps such as vxlan/geneve when the csum for the UDP header itself is 0. In that case, GRO aggregation does not take place on the phys dev, but instead is deferred to the vxlan/geneve driver (see trace below). The reason is essentially that GRO aggregation bails out in udp_gro_receive() for such case when drivers marked the skb with CHECKSUM_UNNECESSARY (ice, i40e, others) where for non-zero csums 2abb7cdc0dc8 ("udp: Add support for doing checksum unnecessary conversion") promotes those skbs to CHECKSUM_COMPLETE and napi context has csum_valid set. This is however not the case for zero UDP csum (here: csum_cnt is still 0 and csum_valid continues to be false). At the same time 57c67ff4bd92 ("udp: additional GRO support") added matches on !uh->check ^ !uh2->check as part to determine candidates for aggregation, so it certainly is expected to handle zero csums in udp_gro_receive(). The purpose of the check added via 662880f44203 ("net: Allow GRO to use and set levels of checksum unnecessary") seems to catch bad csum and stop aggregation right away. One way to fix aggregation in the zero case is to only perform the !csum_valid check in udp_gro_receive() if uh->check is infact non-zero. Before: [...] swapper 0 [008] 731.946506: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100400 len=1500 (1) swapper 0 [008] 731.946507: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100200 len=1500 swapper 0 [008] 731.946507: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101100 len=1500 swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101700 len=1500 swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101b00 len=1500 swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100600 len=1500 swapper 0 [008] 731.946508: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100f00 len=1500 swapper 0 [008] 731.946509: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100a00 len=1500 swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100500 len=1500 swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100700 len=1500 swapper 0 [008] 731.946516: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101d00 len=1500 (2) swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101000 len=1500 swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101c00 len=1500 swapper 0 [008] 731.946517: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101400 len=1500 swapper 0 [008] 731.946518: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100e00 len=1500 swapper 0 [008] 731.946518: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497101600 len=1500 swapper 0 [008] 731.946521: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff966497100800 len=774 swapper 0 [008] 731.946530: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff966497100400 len=14032 (1) swapper 0 [008] 731.946530: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff966497101d00 len=9112 (2) [...] # netperf -H 10.55.10.4 -t TCP_STREAM -l 20 MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.55.10.4 () port 0 AF_INET : demo Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 20.01 13129.24 After: [...] swapper 0 [026] 521.862641: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d479000 len=11286 (1) swapper 0 [026] 521.862643: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d479000 len=11236 (1) swapper 0 [026] 521.862650: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d478500 len=2898 (2) swapper 0 [026] 521.862650: net:netif_receive_skb: dev=enp10s0f0 skbaddr=0xffff93ab0d479f00 len=8490 (3) swapper 0 [026] 521.862653: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d478500 len=2848 (2) swapper 0 [026] 521.862653: net:netif_receive_skb: dev=test_vxlan skbaddr=0xffff93ab0d479f00 len=8440 (3) [...] # netperf -H 10.55.10.4 -t TCP_STREAM -l 20 MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.55.10.4 () port 0 AF_INET : demo Recv Send Send Socket Socket Message Elapsed Size Size Size Time Throughput bytes bytes bytes secs. 10^6bits/sec 87380 16384 16384 20.01 24576.53 Fixes: 57c67ff4bd92 ("udp: additional GRO support") Fixes: 662880f44203 ("net: Allow GRO to use and set levels of checksum unnecessary") Signed-off-by: Daniel Borkmann Cc: Eric Dumazet Cc: Jesse Brandeburg Cc: Tom Herbert Acked-by: Willem de Bruijn Acked-by: John Fastabend Link: https://lore.kernel.org/r/20210226212248.8300-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman commit 11fe93511d02c2e24f8c9eee1ca3f7f8b36e39c3 Author: Felix Fietkau Date: Sun Feb 14 19:49:11 2021 +0100 ath9k: fix transmitting to stations in dynamic SMPS mode commit 3b9ea7206d7e1fdd7419cbd10badd3b2c80d04b4 upstream. When transmitting to a receiver in dynamic SMPS mode, all transmissions that use multiple spatial streams need to be sent using CTS-to-self or RTS/CTS to give the receiver's extra chains some time to wake up. This fixes the tx rate getting stuck at <= MCS7 for some clients, especially Intel ones, which make aggressive use of SMPS. Cc: stable@vger.kernel.org Reported-by: Martin Kennedy Signed-off-by: Felix Fietkau Signed-off-by: Kalle Valo Link: https://lore.kernel.org/r/20210214184911.96702-1-nbd@nbd.name Signed-off-by: Greg Kroah-Hartman commit 9b51619ad0c22dd20bdf7dc8fce4ebefb50cf342 Author: Davide Caratti Date: Mon Mar 8 10:00:04 2021 +0100 mptcp: fix length of ADD_ADDR with port sub-option commit 27ab92d9996e4e003a726d22c56d780a1655d6b4 upstream. in current Linux, MPTCP peers advertising endpoints with port numbers use a sub-option length that wrongly accounts for the trailing TCP NOP. Also, receivers will only process incoming ADD_ADDR with port having such wrong sub-option length. Fix this, making ADD_ADDR compliant to RFC8684 §3.4.1. this can be verified running tcpdump on the kselftests artifacts: unpatched kernel: [root@bottarga mptcp]# tcpdump -tnnr unpatched.pcap | grep add-addr reading from file unpatched.pcap, link-type LINUX_SLL (Linux cooked v1), snapshot length 65535 IP 10.0.1.1.10000 > 10.0.1.2.53078: Flags [.], ack 101, win 509, options [nop,nop,TS val 214459678 ecr 521312851,mptcp add-addr v1 id 1 a00:201:2774:2d88:7436:85c3:17fd:101], length 0 IP 10.0.1.2.53078 > 10.0.1.1.10000: Flags [.], ack 101, win 502, options [nop,nop,TS val 521312852 ecr 214459678,mptcp add-addr[bad opt]] patched kernel: [root@bottarga mptcp]# tcpdump -tnnr patched.pcap | grep add-addr reading from file patched.pcap, link-type LINUX_SLL (Linux cooked v1), snapshot length 65535 IP 10.0.1.1.10000 > 10.0.1.2.38178: Flags [.], ack 101, win 509, options [nop,nop,TS val 3728873902 ecr 2732713192,mptcp add-addr v1 id 1 10.0.2.1:10100 hmac 0xbccdfcbe59292a1f,nop,nop], length 0 IP 10.0.1.2.38178 > 10.0.1.1.10000: Flags [.], ack 101, win 502, options [nop,nop,TS val 2732713195 ecr 3728873902,mptcp add-addr v1-echo id 1 10.0.2.1:10100,nop,nop], length 0 Fixes: 22fb85ffaefb ("mptcp: add port support for ADD_ADDR suboption writing") CC: stable@vger.kernel.org # 5.11+ Reviewed-by: Mat Martineau Acked-and-tested-by: Geliang Tang Signed-off-by: Davide Caratti Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 138bd4da2c7657fe0bc583824005f4e3bc153e47 Author: Maciej W. Rozycki Date: Wed Mar 3 02:16:04 2021 +0100 crypto: mips/poly1305 - enable for all MIPS processors commit 6c810cf20feef0d4338e9b424ab7f2644a8b353e upstream. The MIPS Poly1305 implementation is generic MIPS code written such as to support down to the original MIPS I and MIPS III ISA for the 32-bit and 64-bit variant respectively. Lift the current limitation then to enable code for MIPSr1 ISA or newer processors only and have it available for all MIPS processors. Signed-off-by: Maciej W. Rozycki Fixes: a11d055e7a64 ("crypto: mips/poly1305 - incorporate OpenSSL/CRYPTOGAMS optimized implementation") Cc: stable@vger.kernel.org # v5.5+ Acked-by: Jason A. Donenfeld Signed-off-by: Thomas Bogendoerfer Signed-off-by: Greg Kroah-Hartman commit e2d1bb4b1a100f6f5d31860b74127470bd55c6bf Author: Jakub Kicinski Date: Fri Mar 5 14:17:29 2021 -0800 ethernet: alx: fix order of calls on resume commit a4dcfbc4ee2218abd567d81d795082d8d4afcdf6 upstream. netif_device_attach() will unpause the queues so we can't call it before __alx_open(). This went undetected until commit b0999223f224 ("alx: add ability to allocate and free alx_napi structures") but now if stack tries to xmit immediately on resume before __alx_open() we'll crash on the NAPI being null: BUG: kernel NULL pointer dereference, address: 0000000000000198 CPU: 0 PID: 12 Comm: ksoftirqd/0 Tainted: G OE 5.10.0-3-amd64 #1 Debian 5.10.13-1 Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./H77-D3H, BIOS F15 11/14/2013 RIP: 0010:alx_start_xmit+0x34/0x650 [alx] Code: 41 56 41 55 41 54 55 53 48 83 ec 20 0f b7 57 7c 8b 8e b0 0b 00 00 39 ca 72 06 89 d0 31 d2 f7 f1 89 d2 48 8b 84 df RSP: 0018:ffffb09240083d28 EFLAGS: 00010297 RAX: 0000000000000000 RBX: ffffa04d80ae7800 RCX: 0000000000000004 RDX: 0000000000000000 RSI: ffffa04d80afa000 RDI: ffffa04e92e92a00 RBP: 0000000000000042 R08: 0000000000000100 R09: ffffa04ea3146700 R10: 0000000000000014 R11: 0000000000000000 R12: ffffa04e92e92100 R13: 0000000000000001 R14: ffffa04e92e92a00 R15: ffffa04e92e92a00 FS: 0000000000000000(0000) GS:ffffa0508f600000(0000) knlGS:0000000000000000 i915 0000:00:02.0: vblank wait timed out on crtc 0 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000198 CR3: 000000004460a001 CR4: 00000000001706f0 Call Trace: dev_hard_start_xmit+0xc7/0x1e0 sch_direct_xmit+0x10f/0x310 Cc: # 4.9+ Fixes: bc2bebe8de8e ("alx: remove WoL support") Reported-by: Zbynek Michl Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=983595 Signed-off-by: Jakub Kicinski Tested-by: Zbynek Michl Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 48a84965536e4572df03c3e9e5b79948b7c4f4ae Author: Greg Kurz Date: Mon Feb 15 10:45:06 2021 +0100 powerpc/pseries: Don't enforce MSI affinity with kdump commit f9619d5e5174867536b7e558683bc4408eab833f upstream. Depending on the number of online CPUs in the original kernel, it is likely for CPU #0 to be offline in a kdump kernel. The associated IRQs in the affinity mappings provided by irq_create_affinity_masks() are thus not started by irq_startup(), as per-design with managed IRQs. This can be a problem with multi-queue block devices driven by blk-mq : such a non-started IRQ is very likely paired with the single queue enforced by blk-mq during kdump (see blk_mq_alloc_tag_set()). This causes the device to remain silent and likely hangs the guest at some point. This is a regression caused by commit 9ea69a55b3b9 ("powerpc/pseries: Pass MSI affinity to irq_create_mapping()"). Note that this only happens with the XIVE interrupt controller because XICS has a workaround to bypass affinity, which is activated during kdump with the "noirqdistrib" kernel parameter. The issue comes from a combination of factors: - discrepancy between the number of queues detected by the multi-queue block driver, that was used to create the MSI vectors, and the single queue mode enforced later on by blk-mq because of kdump (i.e. keeping all queues fixes the issue) - CPU#0 offline (i.e. kdump always succeed with CPU#0) Given that I couldn't reproduce on x86, which seems to always have CPU#0 online even during kdump, I'm not sure where this should be fixed. Hence going for another approach : fine-grained affinity is for performance and we don't really care about that during kdump. Simply revert to the previous working behavior of ignoring affinity masks in this case only. Fixes: 9ea69a55b3b9 ("powerpc/pseries: Pass MSI affinity to irq_create_mapping()") Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Greg Kurz Reviewed-by: Laurent Vivier Reviewed-by: Cédric Le Goater Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/20210215094506.1196119-1-groug@kaod.org Signed-off-by: Greg Kroah-Hartman commit fc95046ed0d9d730ca63026c7d8caf3bd15c33e9 Author: Athira Rajeev Date: Thu Feb 25 05:10:39 2021 -0500 powerpc/perf: Fix handling of privilege level checks in perf interrupt context commit 5ae5fbd2107959b68ac69a8b75412208663aea88 upstream. Running "perf mem record" in powerpc platforms with selinux enabled resulted in soft lockup's. Below call-trace was seen in the logs: CPU: 58 PID: 3751 Comm: sssd_nss Not tainted 5.11.0-rc7+ #2 NIP: c000000000dff3d4 LR: c000000000dff3d0 CTR: 0000000000000000 REGS: c000007fffab7d60 TRAP: 0100 Not tainted (5.11.0-rc7+) ... NIP _raw_spin_lock_irqsave+0x94/0x120 LR _raw_spin_lock_irqsave+0x90/0x120 Call Trace: 0xc00000000fd47260 (unreliable) skb_queue_tail+0x3c/0x90 audit_log_end+0x6c/0x180 common_lsm_audit+0xb0/0xe0 slow_avc_audit+0xa4/0x110 avc_has_perm+0x1c4/0x260 selinux_perf_event_open+0x74/0xd0 security_perf_event_open+0x68/0xc0 record_and_restart+0x6e8/0x7f0 perf_event_interrupt+0x22c/0x560 performance_monitor_exception0x4c/0x60 performance_monitor_common_virt+0x1c8/0x1d0 interrupt: f00 at _raw_spin_lock_irqsave+0x38/0x120 NIP: c000000000dff378 LR: c000000000b5fbbc CTR: c0000000007d47f0 REGS: c00000000fd47860 TRAP: 0f00 Not tainted (5.11.0-rc7+) ... NIP _raw_spin_lock_irqsave+0x38/0x120 LR skb_queue_tail+0x3c/0x90 interrupt: f00 0x38 (unreliable) 0xc00000000aae6200 audit_log_end+0x6c/0x180 audit_log_exit+0x344/0xf80 __audit_syscall_exit+0x2c0/0x320 do_syscall_trace_leave+0x148/0x200 syscall_exit_prepare+0x324/0x390 system_call_common+0xfc/0x27c The above trace shows that while the CPU was handling a performance monitor exception, there was a call to security_perf_event_open() function. In powerpc core-book3s, this function is called from perf_allow_kernel() check during recording of data address in the sample via perf_get_data_addr(). Commit da97e18458fb ("perf_event: Add support for LSM and SELinux checks") introduced security enhancements to perf. As part of this commit, the new security hook for perf_event_open() was added in all places where perf paranoid check was previously used. In powerpc core-book3s code, originally had paranoid checks in perf_get_data_addr() and power_pmu_bhrb_read(). So perf_paranoid_kernel() checks were replaced with perf_allow_kernel() in these PMU helper functions as well. The intention of paranoid checks in core-book3s was to verify privilege access before capturing some of the sample data. Along with paranoid checks, perf_allow_kernel() also does a security_perf_event_open(). Since these functions are accessed while recording a sample, we end up calling selinux_perf_event_open() in PMI context. Some of the security functions use spinlock like sidtab_sid2str_put(). If a perf interrupt hits under a spin lock and if we end up in calling selinux hook functions in PMI handler, this could cause a dead lock. Since the purpose of this security hook is to control access to perf_event_open(), it is not right to call this in interrupt context. The paranoid checks in powerpc core-book3s were done at interrupt time which is also not correct. Reference commits: Commit cd1231d7035f ("powerpc/perf: Prevent kernel address leak via perf_get_data_addr()") Commit bb19af816025 ("powerpc/perf: Prevent kernel address leak to userspace via BHRB buffer") We only allow creation of events that have already passed the privilege checks in perf_event_open(). So these paranoid checks are not needed at event time. As a fix, patch uses 'event->attr.exclude_kernel' check to prevent exposing kernel address for userspace only sampling. Fixes: cd1231d7035f ("powerpc/perf: Prevent kernel address leak via perf_get_data_addr()") Cc: stable@vger.kernel.org # v4.17+ Suggested-by: Michael Ellerman Signed-off-by: Athira Rajeev Acked-by: Peter Zijlstra (Intel) Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/1614247839-1428-1-git-send-email-atrajeev@linux.vnet.ibm.com Signed-off-by: Greg Kroah-Hartman commit dfaabaf60d857dd17cf96acb96f6b2059d97a903 Author: Christophe Leroy Date: Mon Feb 1 06:29:50 2021 +0000 powerpc/603: Fix protection of user pages mapped with PROT_NONE commit c119565a15a628efdfa51352f9f6c5186e506a1c upstream. On book3s/32, page protection is defined by the PP bits in the PTE which provide the following protection depending on the access keys defined in the matching segment register: - PP 00 means RW with key 0 and N/A with key 1. - PP 01 means RW with key 0 and RO with key 1. - PP 10 means RW with both key 0 and key 1. - PP 11 means RO with both key 0 and key 1. Since the implementation of kernel userspace access protection, PP bits have been set as follows: - PP00 for pages without _PAGE_USER - PP01 for pages with _PAGE_USER and _PAGE_RW - PP11 for pages with _PAGE_USER and without _PAGE_RW For kernelspace segments, kernel accesses are performed with key 0 and user accesses are performed with key 1. As PP00 is used for non _PAGE_USER pages, user can't access kernel pages not flagged _PAGE_USER while kernel can. For userspace segments, both kernel and user accesses are performed with key 0, therefore pages not flagged _PAGE_USER are still accessible to the user. This shouldn't be an issue, because userspace is expected to be accessible to the user. But unlike most other architectures, powerpc implements PROT_NONE protection by removing _PAGE_USER flag instead of flagging the page as not valid. This means that pages in userspace that are not flagged _PAGE_USER shall remain inaccessible. To get the expected behaviour, just mimic other architectures in the TLB miss handler by checking _PAGE_USER permission on userspace accesses as if it was the _PAGE_PRESENT bit. Note that this problem only is only for 603 cores. The 604+ have an hash table, and hash_page() function already implement the verification of _PAGE_USER permission on userspace pages. Fixes: f342adca3afc ("powerpc/32s: Prepare Kernel Userspace Access Protection") Cc: stable@vger.kernel.org # v5.2+ Reported-by: Christoph Plattner Signed-off-by: Christophe Leroy Signed-off-by: Michael Ellerman Link: https://lore.kernel.org/r/4a0c6e3bb8f0c162457bf54d9bc6fd8d7b55129f.1612160907.git.christophe.leroy@csgroup.eu Signed-off-by: Greg Kroah-Hartman commit 62937f9a502f776fff2f1f4eaf1d07bc44ea89c1 Author: Dmitry V. Levin Date: Mon Feb 22 08:00:00 2021 +0000 uapi: nfnetlink_cthelper.h: fix userspace compilation error commit c33cb0020ee6dd96cc9976d6085a7d8422f6dbed upstream. Apparently, and could not be included into the same compilation unit because of a cut-and-paste typo in the former header. Fixes: 12f7a505331e6 ("netfilter: add user-space connection tracking helper infrastructure") Cc: # v3.6 Signed-off-by: Dmitry V. Levin Signed-off-by: Pablo Neira Ayuso Signed-off-by: Greg Kroah-Hartman