commit cd2c5381cba9b0c40519b25841315621738688a0 Author: Greg Kroah-Hartman Date: Thu Oct 30 09:38:45 2014 -0700 Linux 3.14.23 commit f5dcee1c537f977545fc9652ace5c2ac93a519ca Author: David S. Miller Date: Fri Oct 24 09:59:02 2014 -0700 sparc64: Implement __get_user_pages_fast(). [ Upstream commit 06090e8ed89ea2113a236befb41f71d51f100e60 ] It is not sufficient to only implement get_user_pages_fast(), you must also implement the atomic version __get_user_pages_fast() otherwise you end up using the weak symbol fallback implementation which simply returns zero. This is dangerous, because it causes the futex code to loop forever if transparent hugepages are supported (see get_futex_key()). Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d9cd30ad9f2d40780894e30bb23d168db1b5dfb8 Author: David S. Miller Date: Thu Oct 23 12:58:13 2014 -0700 sparc64: Fix register corruption in top-most kernel stack frame during boot. [ Upstream commit ef3e035c3a9b81da8a778bc333d10637acf6c199 ] Meelis Roos reported that kernels built with gcc-4.9 do not boot, we eventually narrowed this down to only impacting machines using UltraSPARC-III and derivitive cpus. The crash happens right when the first user process is spawned: [ 54.451346] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004 [ 54.451346] [ 54.571516] CPU: 1 PID: 1 Comm: init Not tainted 3.16.0-rc2-00211-gd7933ab #96 [ 54.666431] Call Trace: [ 54.698453] [0000000000762f8c] panic+0xb0/0x224 [ 54.759071] [000000000045cf68] do_exit+0x948/0x960 [ 54.823123] [000000000042cbc0] fault_in_user_windows+0xe0/0x100 [ 54.902036] [0000000000404ad0] __handle_user_windows+0x0/0x10 [ 54.978662] Press Stop-A (L1-A) to return to the boot prom [ 55.050713] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004 Further investigation showed that compiling only per_cpu_patch() with an older compiler fixes the boot. Detailed analysis showed that the function is not being miscompiled by gcc-4.9, but it is using a different register allocation ordering. With the gcc-4.9 compiled function, something during the code patching causes some of the %i* input registers to get corrupted. Perhaps we have a TLB miss path into the firmware that is deep enough to cause a register window spill and subsequent restore when we get back from the TLB miss trap. Let's plug this up by doing two things: 1) Stop using the firmware stack for client interface calls into the firmware. Just use the kernel's stack. 2) As soon as we can, call into a new function "start_early_boot()" to put a one-register-window buffer between the firmware's deepest stack frame and the top-most initial kernel one. Reported-by: Meelis Roos Tested-by: Meelis Roos Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 53d0f8feae8d9da5d589829b75ff3c85912335ed Author: Dave Kleikamp Date: Tue Oct 7 08:12:37 2014 -0500 sparc64: Increase size of boot string to 1024 bytes [ Upstream commit 1cef94c36bd4d79b5ae3a3df99ee0d76d6a4a6dc ] This is the longest boot string that silo supports. Signed-off-by: Dave Kleikamp Cc: Bob Picco Cc: David S. Miller Cc: sparclinux@vger.kernel.org Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 53060b79aa9afab213af6d560a30a3799aed64f8 Author: David S. Miller Date: Sat Sep 27 21:30:57 2014 -0700 sparc64: Kill unnecessary tables and increase MAX_BANKS. [ Upstream commit d195b71bad4347d2df51072a537f922546a904f1 ] swapper_low_pmd_dir and swapper_pud_dir are actually completely useless and unnecessary. We just need swapper_pg_dir[]. Naturally the other page table chunks will be allocated on an as-needed basis. Since the kernel actually accesses these tables in the PAGE_OFFSET view, there is not even a TLB locality advantage of placing them in the kernel image. Use the hard coded vmlinux.ld.S slot for swapper_pg_dir which is naturally page aligned. Increase MAX_BANKS to 1024 in order to handle heavily fragmented virtual guests. Even with this MAX_BANKS increase, the kernel is 20K+ smaller. Signed-off-by: David S. Miller Acked-by: Bob Picco Signed-off-by: Greg Kroah-Hartman commit 6334e1dda64c9511e01991a360d5fef502ee936d Author: bob picco Date: Thu Sep 25 12:25:03 2014 -0700 sparc64: sparse irq [ Upstream commit ee6a9333fa58e11577c1b531b8e0f5ffc0fd6f50 ] This patch attempts to do a few things. The highlights are: 1) enable SPARSE_IRQ unconditionally, 2) kills off !SPARSE_IRQ code 3) allocates ivector_table at boot time and 4) default to cookie only VIRQ mechanism for supported firmware. The first firmware with cookie only support for me appears on T5. You can optionally force the HV firmware to not cookie only mode which is the sysino support. The sysino is a deprecated HV mechanism according to the most recent SPARC Virtual Machine Specification. HV_GRP_INTR is what controls the cookie/sysino firmware versioning. The history of this interface is: 1) Major version 1.0 only supported sysino based interrupt interfaces. 2) Major version 2.0 added cookie based VIRQs, however due to the fact that OSs were using the VIRQs without negoatiating major version 2.0 (Linux and Solaris are both guilty), the VIRQs calls were allowed even with major version 1.0 To complicate things even further, the VIRQ interfaces were only actually hooked up in the hypervisor for LDC interrupt sources. VIRQ calls on other device types would result in HV_EINVAL errors. So effectively, major version 2.0 is unusable. 3) Major version 3.0 was created to signal use of VIRQs and the fact that the hypervisor has these calls hooked up for all interrupt sources, not just those for LDC devices. A new boot option is provided should cookie only HV support have issues. hvirq - this is the version for HV_GRP_INTR. This is related to HV API versioning. The code attempts major=3 first by default. The option can be used to override this default. I've tested with SPARSE_IRQ on T5-8, M7-4 and T4-X and Jalap?no. Signed-off-by: Bob Picco Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit f2450cb9e4e54ffadee804a91fc50b272088db2f Author: David S. Miller Date: Sat Sep 27 11:05:21 2014 -0700 sparc64: Adjust vmalloc region size based upon available virtual address bits. [ Upstream commit bb4e6e85daa52a9f6210fa06a5ec6269598a202b ] In order to accomodate embedded per-cpu allocation with large numbers of cpus and numa nodes, we have to use as much virtual address space as possible for the vmalloc region. Otherwise we can get things like: PERCPU: max_distance=0x380001c10000 too large for vmalloc space 0xff00000000 So, once we select a value for PAGE_OFFSET, derive the size of the vmalloc region based upon that. Signed-off-by: David S. Miller Acked-by: Bob Picco Signed-off-by: Greg Kroah-Hartman commit e0b18223c64d6e11248553fae8a2343d4fe66a42 Author: David S. Miller Date: Wed Sep 24 21:49:29 2014 -0700 sparc64: Increase MAX_PHYS_ADDRESS_BITS to 53. Make sure, at compile time, that the kernel can properly support whatever MAX_PHYS_ADDRESS_BITS is defined to. On M7 chips, use a max_phys_bits value of 49. Based upon a patch by Bob Picco. Signed-off-by: David S. Miller Acked-by: Bob Picco Signed-off-by: Greg Kroah-Hartman commit 29070cdcc7533872cb5a16949603557cde43eb84 Author: David S. Miller Date: Wed Sep 24 21:20:14 2014 -0700 sparc64: Use kernel page tables for vmemmap. [ Upstream commit c06240c7f5c39c83dfd7849c0770775562441b96 ] For sparse memory configurations, the vmemmap array behaves terribly and it takes up an inordinate amount of space in the BSS section of the kernel image unconditionally. Just build huge PMDs and look them up just like we do for TLB misses in the vmalloc area. Kernel BSS shrinks by about 2MB. Signed-off-by: David S. Miller Acked-by: Bob Picco Signed-off-by: Greg Kroah-Hartman commit eccd108dbb4b150f3554ddd5a87c05d0627ec65d Author: David S. Miller Date: Wed Sep 24 20:56:11 2014 -0700 sparc64: Fix physical memory management regressions with large max_phys_bits. [ Upstream commit 0dd5b7b09e13dae32869371e08e1048349fd040c ] If max_phys_bits needs to be > 43 (f.e. for T4 chips), things like DEBUG_PAGEALLOC stop working because the 3-level page tables only can cover up to 43 bits. Another problem is that when we increased MAX_PHYS_ADDRESS_BITS up to 47, several statically allocated tables became enormous. Compounding this is that we will need to support up to 49 bits of physical addressing for M7 chips. The two tables in question are sparc64_valid_addr_bitmap and kpte_linear_bitmap. The first holds a bitmap, with 1 bit for each 4MB chunk of physical memory, indicating whether that chunk actually exists in the machine and is valid. The second table is a set of 2-bit values which tell how large of a mapping (4MB, 256MB, 2GB, 16GB, respectively) we can use at each 256MB chunk of ram in the system. These tables are huge and take up an enormous amount of the BSS section of the sparc64 kernel image. Specifically, the sparc64_valid_addr_bitmap is 4MB, and the kpte_linear_bitmap is 128K. So let's solve the space wastage and the DEBUG_PAGEALLOC problem at the same time, by using the kernel page tables (as designed) to manage this information. We have to keep using large mappings when DEBUG_PAGEALLOC is disabled, and we do this by encoding huge PMDs and PUDs. On a T4-2 with 256GB of ram the kernel page table takes up 16K with DEBUG_PAGEALLOC disabled and 256MB with it enabled. Furthermore, this memory is dynamically allocated at run time rather than coded statically into the kernel image. Signed-off-by: David S. Miller Acked-by: Bob Picco Signed-off-by: Greg Kroah-Hartman commit 4f3a7dd1b14d9ccc668b4d613d8fb788dd5e19dc Author: David S. Miller Date: Wed Sep 17 10:14:56 2014 -0700 sparc64: Adjust KTSB assembler to support larger physical addresses. [ Upstream commit 8c82dc0e883821c098c8b0b130ffebabf9aab5df ] As currently coded the KTSB accesses in the kernel only support up to 47 bits of physical addressing. Adjust the instruction and patching sequence in order to support arbitrary 64 bits addresses. Signed-off-by: David S. Miller Acked-by: Bob Picco Signed-off-by: Greg Kroah-Hartman commit b2bbcaa1dd6e2801895ad4df4067aea2dce7f2b3 Author: David S. Miller Date: Fri Sep 26 21:58:33 2014 -0700 sparc64: Define VA hole at run time, rather than at compile time. [ Upstream commit 4397bed080598001e88f612deb8b080bb1cc2322 ] Now that we use 4-level page tables, we can provide up to 53-bits of virtual address space to the user. Adjust the VA hole based upon the capabilities of the cpu type probed. Signed-off-by: David S. Miller Acked-by: Bob Picco Signed-off-by: Greg Kroah-Hartman commit c964b0ba69092f8b340f445b2dc4db00f420b61b Author: David S. Miller Date: Fri Sep 26 21:19:46 2014 -0700 sparc64: Switch to 4-level page tables. [ Upstream commit ac55c768143aa34cc3789c4820cbb0809a76fd9c ] This has become necessary with chips that support more than 43-bits of physical addressing. Based almost entirely upon a patch by Bob Picco. Signed-off-by: David S. Miller Acked-by: Bob Picco Signed-off-by: Greg Kroah-Hartman commit a2613388d68c4bbd8512dc3e3bbdf693106b06aa Author: bob picco Date: Tue Sep 16 10:09:06 2014 -0400 sparc64: T5 PMU The T5 (niagara5) has different PCR related HV fast trap values and a new HV API Group. This patch utilizes these and shares when possible with niagara4. We use the same sparc_pmu niagara4_pmu. Should there be new effort to obtain the MCU perf statistics then this would have to be changed. Cc: sparclinux@vger.kernel.org Signed-off-by: Bob Picco Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit af02e9dd14cc732849ca5739d304944a96731a52 Author: Allen Pais Date: Mon Sep 8 11:48:55 2014 +0530 sparc64: cpu hardware caps support for sparc M6 and M7 Signed-off-by: Allen Pais Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit a1548c5733d51cd8773acddcb9f2413d336515dc Author: Allen Pais Date: Mon Sep 8 11:48:54 2014 +0530 sparc64: support M6 and M7 for building CPU distribution map Add M6 and M7 chip type in cpumap.c to correctly build CPU distribution map that spans all online CPUs. Signed-off-by: Allen Pais Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d4562a38e32d91d3cafd0106ec636013941797e4 Author: Allen Pais Date: Mon Sep 8 11:48:53 2014 +0530 sparc64: correctly recognise M6 and M7 cpu type The following patch adds support for correctly recognising M6 and M7 cpu type. Signed-off-by: Allen Pais Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit d517ac72570b14086f0d7f8e90f81f8c86e93aae Author: David S. Miller Date: Wed Sep 24 21:05:30 2014 -0700 sparc64: Fix hibernation code refrence to PAGE_OFFSET. We changed PAGE_OFFSET to be a variable rather than a constant, but this reference here in the hibernate assembler got missed. Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2424eeabd42388ad030dc3cb92ee2b0da288553d Author: David S. Miller Date: Sat Oct 18 23:12:33 2014 -0400 sparc64: Do not define thread fpregs save area as zero-length array. [ Upstream commit e2653143d7d79a49f1a961aeae1d82612838b12c ] This breaks the stack end corruption detection facility. What that facility does it write a magic value to "end_of_stack()" and checking to see if it gets overwritten. "end_of_stack()" is "task_thread_info(p) + 1", which for sparc64 is the beginning of the FPU register save area. So once the user uses the FPU, the magic value is overwritten and the debug checks trigger. Fix this by making the size explicit. Due to the size we use for the fpsaved[], gsr[], and xfsr[] arrays we are limited to 7 levels of FPU state saves. So each FPU register set is 256 bytes, allocate 256 * 7 for the fpregs area. Reported-by: Meelis Roos Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 9160f5959f36a712f060f6e350f1008942c4f6a4 Author: David S. Miller Date: Tue Oct 14 19:37:58 2014 -0700 sparc64: Fix FPU register corruption with AES crypto offload. [ Upstream commit f4da3628dc7c32a59d1fb7116bb042e6f436d611 ] The AES loops in arch/sparc/crypto/aes_glue.c use a scheme where the key material is preloaded into the FPU registers, and then we loop over and over doing the crypt operation, reusing those pre-cooked key registers. There are intervening blkcipher*() calls between the crypt operation calls. And those might perform memcpy() and thus also try to use the FPU. The sparc64 kernel FPU usage mechanism is designed to allow such recursive uses, but with a catch. There has to be a trap between the two FPU using threads of control. The mechanism works by, when the FPU is already in use by the kernel, allocating a slot for FPU saving at trap time. Then if, within the trap handler, we try to use the FPU registers, the pre-trap FPU register state is saved into the slot. Then at trap return time we notice this and restore the pre-trap FPU state. Over the long term there are various more involved ways we can make this work, but for a quick fix let's take advantage of the fact that the situation where this happens is very limited. All sparc64 chips that support the crypto instructiosn also are using the Niagara4 memcpy routine, and that routine only uses the FPU for large copies where we can't get the source aligned properly to a multiple of 8 bytes. We look to see if the FPU is already in use in this context, and if so we use the non-large copy path which only uses integer registers. Furthermore, we also limit this special logic to when we are doing kernel copy, rather than a user copy. Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit c254ab484fb9a74853e616199eaba56a90e9eb8e Author: David S. Miller Date: Fri Oct 10 15:49:16 2014 -0400 sparc64: Fix lockdep warnings on reboot on Ultra-5 [ Upstream commit bdcf81b658ebc4c2640c3c2c55c8b31c601b6996 ] Inconsistently, the raw_* IRQ routines do not interact with and update the irqflags tracing and lockdep state, whereas the raw_* spinlock interfaces do. This causes problems in p1275_cmd_direct() because we disable hardirqs by hand using raw_local_irq_restore() and then do a raw_spin_lock() which triggers a lockdep trace because the CPU's hw IRQ state doesn't match IRQ tracing's internal software copy of that state. The CPU's irqs are disabled, yet current->hardirqs_enabled is true. ==================== reboot: Restarting system ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:3536 check_flags+0x7c/0x240() DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled) Modules linked in: openpromfs CPU: 0 PID: 1 Comm: systemd-shutdow Tainted: G W 3.17.0-dirty #145 Call Trace: [000000000045919c] warn_slowpath_common+0x5c/0xa0 [0000000000459210] warn_slowpath_fmt+0x30/0x40 [000000000048f41c] check_flags+0x7c/0x240 [0000000000493280] lock_acquire+0x20/0x1c0 [0000000000832b70] _raw_spin_lock+0x30/0x60 [000000000068f2fc] p1275_cmd_direct+0x1c/0x60 [000000000068ed28] prom_reboot+0x28/0x40 [000000000043610c] machine_restart+0x4c/0x80 [000000000047d2d4] kernel_restart+0x54/0x80 [000000000047d618] SyS_reboot+0x138/0x200 [00000000004060b4] linux_sparc_syscall32+0x34/0x60 ---[ end trace 5c439fe81c05a100 ]--- possible reason: unannotated irqs-off. irq event stamp: 2010267 hardirqs last enabled at (2010267): [<000000000049a358>] vprintk_emit+0x4b8/0x580 hardirqs last disabled at (2010266): [<0000000000499f08>] vprintk_emit+0x68/0x580 softirqs last enabled at (2010046): [<000000000045d278>] __do_softirq+0x378/0x4a0 softirqs last disabled at (2010039): [<000000000042bf08>] do_softirq_own_stack+0x28/0x40 Resetting ... ==================== Use local_* variables of the hw IRQ interfaces so that IRQ tracing sees all of our changes. Reported-by: Meelis Roos Tested-by: Meelis Roos Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 13603e835a8ac04e626862d89a08502e75f7083f Author: David S. Miller Date: Sat Oct 4 21:05:14 2014 -0700 sparc64: Fix reversed start/end in flush_tlb_kernel_range() [ Upstream commit 473ad7f4fb005d1bb727e4ef27d370d28703a062 ] When we have to split up a flush request into multiple pieces (in order to avoid the firmware range) we don't specify the arguments in the right order for the second piece. Fix the order, or else we get hangs as the code tries to flush "a lot" of entries and we get lockups like this: [ 4422.981276] NMI watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [expect:117032] [ 4422.996130] Modules linked in: ipv6 loop usb_storage igb ptp sg sr_mod ehci_pci ehci_hcd pps_core n2_rng rng_core [ 4423.016617] CPU: 12 PID: 117032 Comm: expect Not tainted 3.17.0-rc4+ #1608 [ 4423.030331] task: fff8003cc730e220 ti: fff8003d99d54000 task.ti: fff8003d99d54000 [ 4423.045282] TSTATE: 0000000011001602 TPC: 00000000004521e8 TNPC: 00000000004521ec Y: 00000000 Not tainted [ 4423.064905] TPC: <__flush_tlb_kernel_range+0x28/0x40> [ 4423.074964] g0: 000000000052fd10 g1: 00000001295a8000 g2: ffffff7176ffc000 g3: 0000000000002000 [ 4423.092324] g4: fff8003cc730e220 g5: fff8003dfedcc000 g6: fff8003d99d54000 g7: 0000000000000006 [ 4423.109687] o0: 0000000000000000 o1: 0000000000000000 o2: 0000000000000003 o3: 00000000f0000000 [ 4423.127058] o4: 0000000000000080 o5: 00000001295a8000 sp: fff8003d99d56d01 ret_pc: 000000000052ff54 [ 4423.145121] RPC: <__purge_vmap_area_lazy+0x314/0x3a0> [ 4423.155185] l0: 0000000000000000 l1: 0000000000000000 l2: 0000000000a38040 l3: 0000000000000000 [ 4423.172559] l4: fff8003dae8965e0 l5: ffffffffffffffff l6: 0000000000000000 l7: 00000000f7e2b138 [ 4423.189913] i0: fff8003d99d576a0 i1: fff8003d99d576a8 i2: fff8003d99d575e8 i3: 0000000000000000 [ 4423.207284] i4: 0000000000008008 i5: fff8003d99d575c8 i6: fff8003d99d56df1 i7: 0000000000530c24 [ 4423.224640] I7: [ 4423.234193] Call Trace: [ 4423.239051] [0000000000530c24] free_vmap_area_noflush+0x64/0x80 [ 4423.251029] [0000000000531a7c] remove_vm_area+0x5c/0x80 [ 4423.261628] [0000000000531b80] __vunmap+0x20/0x120 [ 4423.271352] [000000000071cf18] n_tty_close+0x18/0x40 [ 4423.281423] [00000000007222b0] tty_ldisc_close+0x30/0x60 [ 4423.292183] [00000000007225a4] tty_ldisc_reinit+0x24/0xa0 [ 4423.303120] [0000000000722ab4] tty_ldisc_hangup+0xd4/0x1e0 [ 4423.314232] [0000000000719aa0] __tty_hangup+0x280/0x3c0 [ 4423.324835] [0000000000724cb4] pty_close+0x134/0x1a0 [ 4423.334905] [000000000071aa24] tty_release+0x104/0x500 [ 4423.345316] [00000000005511d0] __fput+0x90/0x1e0 [ 4423.354701] [000000000047fa54] task_work_run+0x94/0xe0 [ 4423.365126] [0000000000404b44] __handle_signal+0xc/0x2c Fixes: 4ca9a23765da ("sparc64: Guard against flushing openfirmware mappings.") Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit a312639a66a3668d4d3335855d82967e0be0dddd Author: Andreas Larsson Date: Fri Aug 29 17:08:21 2014 +0200 sparc: Let memset return the address argument [ Upstream commit 74cad25c076a2f5253312c2fe82d1a4daecc1323 ] This makes memset follow the standard (instead of returning 0 on success). This is needed when certain versions of gcc optimizes around memset calls and assume that the address argument is preserved in %o0. Signed-off-by: Andreas Larsson Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit b8dd329aca97726199a213ecbbbebe617db00050 Author: Sowmini Varadhan Date: Tue Sep 16 11:37:08 2014 -0400 sparc64: Move request_irq() from ldc_bind() to ldc_alloc() [ Upstream commit c21c4ab0d6921f7160a43216fa6973b5924de561 ] The request_irq() needs to be done from ldc_alloc() to avoid the following (caught by lockdep) [00000000004a0738] __might_sleep+0xf8/0x120 [000000000058bea4] kmem_cache_alloc_trace+0x184/0x2c0 [00000000004faf80] request_threaded_irq+0x80/0x160 [000000000044f71c] ldc_bind+0x7c/0x220 [0000000000452454] vio_port_up+0x54/0xe0 [00000000101f6778] probe_disk+0x38/0x220 [sunvdc] [00000000101f6b8c] vdc_port_probe+0x22c/0x300 [sunvdc] [0000000000451a88] vio_device_probe+0x48/0x60 [000000000074c56c] really_probe+0x6c/0x300 [000000000074c83c] driver_probe_device+0x3c/0xa0 [000000000074c92c] __driver_attach+0x8c/0xa0 [000000000074a6ec] bus_for_each_dev+0x6c/0xa0 [000000000074c1dc] driver_attach+0x1c/0x40 [000000000074b0fc] bus_add_driver+0xbc/0x280 Signed-off-by: Sowmini Varadhan Acked-by: Dwight Engen Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit c26873b31017e776331fa00b61c4d0075aac1d55 Author: bob picco Date: Tue Sep 16 09:28:15 2014 -0400 sparc64: find_node adjustment [ Upstream commit 3dee9df54836d5f844f3d58281d3f3e6331b467f ] We have seen an issue with guest boot into LDOM that causes early boot failures because of no matching rules for node identitity of the memory. I analyzed this on my T4 and concluded there might not be a solution. I saw the issue in mainline too when booting into the control/primary domain - with guests configured. Note, this could be a firmware bug on some older machines. I'll provide a full explanation of the issues below. Should we not find a matching BEST latency group for a real address (RA) then we will assume node 0. On the T4-2 here with the information provided I can't see an alternative. Technically the LDOM shown below should match the MBLOCK to the favorable latency group. However other factors must be considered too. Were the memory controllers configured "fine" grained interleave or "coarse" grain interleaved - T4. Also should a "group" MD node be considered a NUMA node? There has to be at least one Machine Description (MD) "group" and hence one NUMA node. The group can have one or more latency groups (lg) - more than one memory controller. The current code chooses the smallest latency as the most favorable per group. The latency and lg information is in MLGROUP below. MBLOCK is the base and size of the RAs for the machine as fetched from OBP /memory "available" property. My machine has one MBLOCK but more would be possible - with holes? For a T4-2 the following information has been gathered: with LDOM guest MEMBLOCK configuration: memory size = 0x27f870000 memory.cnt = 0x3 memory[0x0] [0x00000020400000-0x0000029fc67fff], 0x27f868000 bytes memory[0x1] [0x0000029fd8a000-0x0000029fd8bfff], 0x2000 bytes memory[0x2] [0x0000029fd92000-0x0000029fd97fff], 0x6000 bytes reserved.cnt = 0x2 reserved[0x0] [0x00000020800000-0x000000216c15c0], 0xec15c1 bytes reserved[0x1] [0x00000024800000-0x0000002c180c1e], 0x7980c1f bytes MBLOCK[0]: base[20000000] size[280000000] offset[0] (note: "base" and "size" reported in "MBLOCK" encompass the "memory[X]" values) (note: (RA + offset) & mask = val is the formula to detect a match for the memory controller. should there be no match for find_node node, a return value of -1 resulted for the node - BAD) There is one group. It has these forward links MLGROUP[1]: node[545] latency[1f7e8] match[200000000] mask[200000000] MLGROUP[2]: node[54d] latency[2de60] match[0] mask[200000000] NUMA NODE[0]: node[545] mask[200000000] val[200000000] (latency[1f7e8]) (note: "val" is the best lg's (smallest latency) "match") no LDOM guest - bare metal MEMBLOCK configuration: memory size = 0xfdf2d0000 memory.cnt = 0x3 memory[0x0] [0x00000020400000-0x00000fff6adfff], 0xfdf2ae000 bytes memory[0x1] [0x00000fff6d2000-0x00000fff6e7fff], 0x16000 bytes memory[0x2] [0x00000fff766000-0x00000fff771fff], 0xc000 bytes reserved.cnt = 0x2 reserved[0x0] [0x00000020800000-0x00000021a04580], 0x1204581 bytes reserved[0x1] [0x00000024800000-0x0000002c7d29fc], 0x7fd29fd bytes MBLOCK[0]: base[20000000] size[fe0000000] offset[0] there are two groups group node[16d5] MLGROUP[0]: node[1765] latency[1f7e8] match[0] mask[200000000] MLGROUP[3]: node[177d] latency[2de60] match[200000000] mask[200000000] NUMA NODE[0]: node[1765] mask[200000000] val[0] (latency[1f7e8]) group node[171d] MLGROUP[2]: node[1775] latency[2de60] match[0] mask[200000000] MLGROUP[1]: node[176d] latency[1f7e8] match[200000000] mask[200000000] NUMA NODE[1]: node[176d] mask[200000000] val[200000000] (latency[1f7e8]) (note: for this two "group" bare metal machine, 1/2 memory is in group one's lg and 1/2 memory is in group two's lg). Cc: sparclinux@vger.kernel.org Signed-off-by: Bob Picco Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 6b837f132cc041c53d65ab881f33f186a6a63b66 Author: David S. Miller Date: Sat Oct 18 23:03:09 2014 -0400 sparc64: Fix corrupted thread fault code. [ Upstream commit 84bd6d8b9c0f06b3f188efb479c77e20f05e9a8a ] Every path that ends up at do_sparc64_fault() must install a valid FAULT_CODE_* bitmask in the per-thread fault code byte. Two paths leading to the label winfix_trampoline (which expects the FAULT_CODE_* mask in register %g4) were not doing so: 1) For pre-hypervisor TLB protection violation traps, if we took the 'winfix_trampoline' path we wouldn't have %g4 initialized with the FAULT_CODE_* value yet. Resulting in using the TLB_TAG_ACCESS register address value instead. 2) In the TSB miss path, when we notice that we are going to use a hugepage mapping, but we haven't allocated the hugepage TSB yet, we still have to take the window fixup case into consideration and in that particular path we leave %g4 not setup properly. Errors on this sort were largely invisible previously, but after commit 4ccb9272892c33ef1c19a783cfa87103b30c2784 ("sparc64: sun4v TLB error power off events") we now have a fault_code mask bit (FAULT_CODE_BAD_RA) that triggers due to this bug. FAULT_CODE_BAD_RA triggers because this bit is set in TLB_TAG_ACCESS (see #1 above) and thus we get seemingly random bus errors triggered for user processes. Fixes: 4ccb9272892c ("sparc64: sun4v TLB error power off events") Reported-by: Meelis Roos Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 1771ef5474e659fe5de1d2a4c6f9e8fc176493f7 Author: bob picco Date: Tue Sep 16 09:26:47 2014 -0400 sparc64: sun4v TLB error power off events [ Upstream commit 4ccb9272892c33ef1c19a783cfa87103b30c2784 ] We've witnessed a few TLB events causing the machine to power off because of prom_halt. In one case it was some nfs related area during rmmod. Another was an mmapper of /dev/mem. A more recent one is an ITLB issue with a bad pagesize which could be a hardware bug. Bugs happen but we should attempt to not power off the machine and/or hang it when possible. This is a DTLB error from an mmapper of /dev/mem: [root@sparcie ~]# SUN4V-DTLB: Error at TPC[fffff80100903e6c], tl 1 SUN4V-DTLB: TPC<0xfffff80100903e6c> SUN4V-DTLB: O7[fffff801081979d0] SUN4V-DTLB: O7<0xfffff801081979d0> SUN4V-DTLB: vaddr[fffff80100000000] ctx[1250] pte[98000000000f0610] error[2] . This is recent mainline for ITLB: [ 3708.179864] SUN4V-ITLB: TPC<0xfffffc010071cefc> [ 3708.188866] SUN4V-ITLB: O7[fffffc010071cee8] [ 3708.197377] SUN4V-ITLB: O7<0xfffffc010071cee8> [ 3708.206539] SUN4V-ITLB: vaddr[e0003] ctx[1a3c] pte[2900000dcc800eeb] error[4] . Normally sun4v_itlb_error_report() and sun4v_dtlb_error_report() would call prom_halt() and drop us to OF command prompt "ok". This isn't the case for LDOMs and the machine powers off. For the HV reported error of HV_ENORADDR for HV HV_MMU_MAP_ADDR_TRAP we cause a SIGBUS error by qualifying it within do_sparc64_fault() for fault code mask of FAULT_CODE_BAD_RA. This is done when trap level (%tl) is less or equal one("1"). Otherwise, for %tl > 1, we proceed eventually to die_if_kernel(). The logic of this patch was partially inspired by David Miller's feedback. Power off of large sparc64 machines is painful. Plus die_if_kernel provides more context. A reset sequence isn't a brief period on large sparc64 but better than power-off/power-on sequence. Cc: sparclinux@vger.kernel.org Signed-off-by: Bob Picco Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 88a929f9157f37da3e3997dc1de803ced85a082f Author: Daniel Hellstrom Date: Wed Sep 10 14:17:52 2014 +0200 sparc32: dma_alloc_coherent must honour gfp flags [ Upstream commit d1105287aabe88dbb3af825140badaa05cf0442c ] dma_zalloc_coherent() calls dma_alloc_coherent(__GFP_ZERO) but the sparc32 implementations sbus_alloc_coherent() and pci32_alloc_coherent() doesn't take the gfp flags into account. Tested on the SPARC32/LEON GRETH Ethernet driver which fails due to dma_alloc_coherent(__GFP_ZERO) returns non zeroed pages. Signed-off-by: Daniel Hellstrom Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 92392b1f872c35f4def26f83eeaf14b4acbd053d Author: David S. Miller Date: Mon Aug 11 15:38:46 2014 -0700 sparc64: Fix pcr_ops initialization and usage bugs. [ Upstream commit 8bccf5b313180faefce38e0d1140f76e0f327d28 ] Christopher reports that perf_event_print_debug() can crash in uniprocessor builds. The crash is due to pcr_ops being NULL. This happens because pcr_arch_init() is only invoked by smp_cpus_done() which only executes in SMP builds. init_hw_perf_events() is closely intertwined with pcr_ops being setup properly, therefore: 1) Call pcr_arch_init() early on from init_hw_perf_events(), instead of from smp_cpus_done(). 2) Do not hook up a PMU type if pcr_ops is NULL after pcr_arch_init(). 3) Move init_hw_perf_events to a later initcall so that it we will be sure to invoke pcr_arch_init() after all cpus are brought up. Finally, guard the one naked sequence of pcr_ops dereferences in __global_pmu_self() with an appropriate NULL check. Reported-by: Christopher Alexander Tobias Schulze Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 08a7f3c1e2d1513649883ce6d54a96d2cfe9e950 Author: David S. Miller Date: Mon Aug 11 20:45:01 2014 -0700 sparc64: Do not disable interrupts in nmi_cpu_busy() [ Upstream commit 58556104e9cd0107a7a8d2692cf04ef31669f6e4 ] nmi_cpu_busy() is a SMP function call that just makes sure that all of the cpus are spinning using cpu cycles while the NMI test runs. It does not need to disable IRQs because we just care about NMIs executing which will even with 'normal' IRQs disabled. It is not legal to enable hard IRQs in a SMP cross call, in fact this bug triggers the BUG check in irq_work_run_list(): BUG_ON(!irqs_disabled()); Because now irq_work_run() is invoked from the tail of generic_smp_call_function_single_interrupt(). Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ead061b0198612788364d44127b35ba3c82d7e85 Author: Dave Chinner Date: Tue Sep 23 15:36:27 2014 +1000 xfs: ensure WB_SYNC_ALL writeback handles partial pages correctly commit 0d085a529b427d97710e6a41f8a4f23e1757cd12 upstream. XFS has been having trouble with stray delayed allocation extents beyond EOF for a long time. Recent changes to the collapse range code has triggered erroneous EBUSY errors on page invalidtion for block size smaller than page size filesystems. These have been caused by dirty buffers beyond EOF on a partial page which do not get written to disk during a sync. The issue is that write-ahead in xfs_cluster_write() finds such a partial page and handles it by leaving the page dirty but pushing it into a writeback state. This used to work just fine, as the write_cache_pages() code would then find the dirty partial page in the next mapping tree lookup as the dirty tag is still set. Unfortunately, when we moved to a mark and sweep approach to writeback to fix other writeback sync issues, we broken this. THe act of marking the page as under writeback now clears the TOWRITE tag in the radix tree, even though the page is still dirty. This causes the TOWRITE tag to be cleared, and hence the next lookup on the mapping tree does not find the dirty partial page and so doesn't try to write it again. This same writeback bug was found recently in ext4 and fixed in commit 1c8349a ("ext4: fix data integrity sync in ordered mode") without communication to the wider filesystem community. We can use exactly the same fix here so the TOWRITE flag is not cleared on partial page writes. cc: stable@vger.kernel.org # dependent on 1c8349a17137b93f0a83f276c764a6df1b9a116e Root-cause-found-by: Brian Foster Signed-off-by: Dave Chinner Reviewed-by: Brian Foster Signed-off-by: Dave Chinner Signed-off-by: Greg Kroah-Hartman commit 0b0dfc144bdf887690d7772a39ff14c18d86795c Author: Chao Yu Date: Thu Jul 24 17:25:42 2014 +0800 ecryptfs: avoid to access NULL pointer when write metadata in xattr commit 35425ea2492175fd39f6116481fe98b2b3ddd4ca upstream. Christopher Head 2014-06-28 05:26:20 UTC described: "I tried to reproduce this on 3.12.21. Instead, when I do "echo hello > foo" in an ecryptfs mount with ecryptfs_xattr specified, I get a kernel crash: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [] fsstack_copy_attr_all+0x2/0x61 PGD d7840067 PUD b2c3c067 PMD 0 Oops: 0002 [#1] SMP Modules linked in: nvidia(PO) CPU: 3 PID: 3566 Comm: bash Tainted: P O 3.12.21-gentoo-r1 #2 Hardware name: ASUSTek Computer Inc. G60JX/G60JX, BIOS 206 03/15/2010 task: ffff8801948944c0 ti: ffff8800bad70000 task.ti: ffff8800bad70000 RIP: 0010:[] [] fsstack_copy_attr_all+0x2/0x61 RSP: 0018:ffff8800bad71c10 EFLAGS: 00010246 RAX: 00000000000181a4 RBX: ffff880198648480 RCX: 0000000000000000 RDX: 0000000000000004 RSI: ffff880172010450 RDI: 0000000000000000 RBP: ffff880198490e40 R08: 0000000000000000 R09: 0000000000000000 R10: ffff880172010450 R11: ffffea0002c51e80 R12: 0000000000002000 R13: 000000000000001a R14: 0000000000000000 R15: ffff880198490e40 FS: 00007ff224caa700(0000) GS:ffff88019fcc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000000bb07f000 CR4: 00000000000007e0 Stack: ffffffff811826e8 ffff8800a39d8000 0000000000000000 000000000000001a ffff8800a01d0000 ffff8800a39d8000 ffffffff81185fd5 ffffffff81082c2c 00000001a39d8000 53d0abbc98490e40 0000000000000037 ffff8800a39d8220 Call Trace: [] ? ecryptfs_setxattr+0x40/0x52 [] ? ecryptfs_write_metadata+0x1b3/0x223 [] ? should_resched+0x5/0x23 [] ? ecryptfs_initialize_file+0xaf/0xd4 [] ? ecryptfs_create+0xf4/0x142 [] ? vfs_create+0x48/0x71 [] ? do_last.isra.68+0x559/0x952 [] ? link_path_walk+0xbd/0x458 [] ? path_openat+0x224/0x472 [] ? do_filp_open+0x2b/0x6f [] ? __alloc_fd+0xd6/0xe7 [] ? do_sys_open+0x65/0xe9 [] ? system_call_fastpath+0x16/0x1b RIP [] fsstack_copy_attr_all+0x2/0x61 RSP CR2: 0000000000000000 ---[ end trace df9dba5f1ddb8565 ]---" If we create a file when we mount with ecryptfs_xattr_metadata option, we will encounter a crash in this path: ->ecryptfs_create ->ecryptfs_initialize_file ->ecryptfs_write_metadata ->ecryptfs_write_metadata_to_xattr ->ecryptfs_setxattr ->fsstack_copy_attr_all It's because our dentry->d_inode used in fsstack_copy_attr_all is NULL, and it will be initialized when ecryptfs_initialize_file finish. So we should skip copying attr from lower inode when the value of ->d_inode is invalid. Signed-off-by: Chao Yu Signed-off-by: Tyler Hicks Signed-off-by: Greg Kroah-Hartman commit c238c3434129b28de67d01c22b7456b2dd138759 Author: klightspeed@killerwolves.net Date: Wed Sep 10 18:55:41 2014 +1000 ARM: mvebu: Netgear RN102: Use Hardware BCH ECC commit ace8578182dc347b043c0825b9873f62fdaa5b77 upstream. The bootloader on the Netgear ReadyNAS RN102 uses Hardware BCH ECC (strength = 4), while the pxa3xx NAND driver by default uses Hamming ECC (strength = 1). This patch changes the ECC mode on these machines to match that of the bootloader and of the stock firmware. That way, it is now possible to update the kernel from userland (e.g. using standard tools from mtd-utils package); u-boot will happily load and boot it. Fixes: 92beaccd8b49 ("ARM: mvebu: Enable NAND controller in ReadyNAS 102 .dts file") Signed-off-by: Ben Peddell Acked-by: Ezequiel Garcia Tested-by: Arnaud Ebalard Link: https://lkml.kernel.org/r/1410339341-3372-1-git-send-email-klightspeed@killerwolves.net Signed-off-by: Jason Cooper Signed-off-by: Greg Kroah-Hartman commit 55c5fc46be38debf2045c38c23f5084d09f21919 Author: Arnaud Ebalard Date: Sat Sep 6 22:49:38 2014 +0200 ARM: mvebu: Netgear RN2120: Use Hardware BCH ECC commit 500abb6ccb9e3f8d638a7f422443a8549245ef90 upstream. The bootloader on the Netgear ReadyNAS RN2120 uses Hardware BCH ECC (strength = 4), while the pxa3xx NAND driver by default uses Hamming ECC (strength = 1). This patch changes the ECC mode on these machines to match that of the bootloader and of the stock firmware. That way, it is now possible to update the kernel from userland (e.g. using standard tools from mtd-utils package); u-boot will happily load and boot it. The issue was initially reported and fixed by Ben Pedell for RN102. The RN2120 shares the same Hynix H27U1G8F2BTR NAND flash and setup. This patch is based on Ben's fix for RN102. Fixes: ad51eddd95ad ("ARM: mvebu: Enable NAND controller in ReadyNAS 2120 .dts file") Signed-off-by: Arnaud Ebalard Link: https://lkml.kernel.org/r/61f6a1b7ad0adc57a0e201b9680bc2e5f214a317.1410035142.git.arno@natisbad.org Signed-off-by: Jason Cooper Signed-off-by: Greg Kroah-Hartman commit d5f2d5fb6caa9b6233bea9176d03accee7ca815f Author: Arnaud Ebalard Date: Sat Sep 6 22:49:25 2014 +0200 ARM: mvebu: Netgear RN104: Use Hardware BCH ECC commit 225b94cdf719d0bc522a354bdafc18e5da5ff83b upstream. The bootloader on the Netgear ReadyNAS RN104 uses Hardware BCH ECC (strength = 4), while the pxa3xx NAND driver by default uses Hamming ECC (strength = 1). This patch changes the ECC mode on these machines to match that of the bootloader and of the stock firmware. That way, it is now possible to update the kernel from userland (e.g. using standard tools from mtd-utils package); u-boot will happily load and boot it. The issue was initially reported and fixed by Ben Pedell for RN102. The RN104 shares the same Hynix H27U1G8F2BTR NAND flash and setup. This patch is based on Ben's fix for RN102. Fixes: 0373a558bd79 ("ARM: mvebu: Enable NAND controller in ReadyNAS 104 .dts file") Signed-off-by: Arnaud Ebalard Link: https://lkml.kernel.org/r/920c7e7169dc6aaaa3eb4bced2336d38e77b8864.1410035142.git.arno@natisbad.org Signed-off-by: Jason Cooper Signed-off-by: Greg Kroah-Hartman commit 42dc6df404f2dad7bc3e86945e55eb8906b3b301 Author: Ludovic Desroches Date: Mon Sep 22 15:51:33 2014 +0200 ARM: at91/PMC: don't forget to write PMC_PCDR register to disable clocks commit cfa1950e6c6b72251e80adc736af3c3d2907ab0e upstream. When introducing support for sama5d3, the write to PMC_PCDR register has been accidentally removed. Reported-by: Nathalie Cyrille Signed-off-by: Ludovic Desroches Signed-off-by: Nicolas Ferre Signed-off-by: Greg Kroah-Hartman commit bdd044a9e071c24d1b2f4e338a1e426d37d1f601 Author: Andreas Henriksson Date: Tue Sep 23 17:12:52 2014 +0200 ARM: at91: fix at91sam9263ek DT mmc pinmuxing settings commit b65e0fb3d046cc65d0a3c45d43de351fb363271b upstream. As discovered on a custom board similar to at91sam9263ek and basing its devicetree on that one apparently the pin muxing doesn't get set up properly. This was discovered since the custom boards u-boot does funky stuff with the pin muxing and leaved it set to SPI which made the MMC driver not work under Linux. The fix is simply to define the given configuration as the default. This probably worked by pure luck before, but it's better to make the muxing explicitly set. Signed-off-by: Andreas Henriksson Acked-by: Boris Brezillon Signed-off-by: Nicolas Ferre Signed-off-by: Greg Kroah-Hartman commit f1dabe249c74a4158597824281ea74dd5d261215 Author: David Dueck Date: Wed Sep 17 10:33:32 2014 +0200 ARM: at91/dt: Fix typo regarding can0_clk commit 0a51d644c20f5c88fd3a659119d1903f74927082 upstream. Otherwise the clock for can0 will never get enabled. Signed-off-by: David Dueck Signed-off-by: Anthony Harivel Acked-by: Boris Brezillon Signed-off-by: Nicolas Ferre Signed-off-by: Greg Kroah-Hartman commit 92c42b2ef35aec29585b0516f9c8cb13c66d708d Author: Anssi Hannula Date: Sun Oct 19 19:25:19 2014 +0300 ALSA: hda - hdmi: Fix missing ELD change event on plug/unplug commit 6acce400d9daf1353fbf497302670c90a3205e1d upstream. The ELD ALSA control change event is sent by hdmi_present_sense() when eld_changed is true. Currently, it is only true when the ELD buffer contents have been modified. However, the user-visible ELD controls also change to a zero-length value and back when eld_valid is unset/set, and no event is currently sent in such cases (such as when unplugging or replugging a sink). Fix the code to always set eld_changed if eld_valid value is changed, and therefore to always send the change event when the user-visible value changes. Signed-off-by: Anssi Hannula Cc: David Henningsson Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 9198b615897deca33fbe2d86990c911db7183703 Author: Vlad Catoi Date: Sat Oct 18 17:45:41 2014 -0500 ALSA: usb-audio: Add support for Steinberg UR22 USB interface commit f0b127fbfdc8756eba7437ab668f3169280bd358 upstream. Adding support for Steinberg UR22 USB interface via quirks table patch See Ubuntu bug report: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1317244 Also see threads: http://linux-audio.4202.n7.nabble.com/Support-for-Steinberg-UR22-Yamaha-USB-chipset-0499-1509-tc82888.html#a82917 http://www.steinberg.net/forums/viewtopic.php?t=62290 Tested by at least 4 people judging by the threads. Did not test MIDI interface, but audio output and capture both are functional. Built 3.17 kernel with this driver on Ubuntu 14.04 & tested with mpg123 Patch applied to 3.13 Ubuntu kernel works well enough for daily use. Signed-off-by: Vlad Catoi Acked-by: Clemens Ladisch Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 78cc329441b851c40a9ae6f43a95ff6a14b6ef47 Author: Harsha Priya Date: Thu Oct 9 11:04:56 2014 +0000 ALSA: ALC283 codec - Avoid pop noise on headphones during suspend/resume commit b450b17c156e264bc44a198046d3ebaaef5a041d upstream. This patch sets the headphones mode to default before suspending which helps avoid the pop noise on headphones Signed-off-by: Harsha Priya Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 725a14505cd1a93a1a75cb971b119a2931730d18 Author: Takashi Iwai Date: Mon Oct 13 23:18:02 2014 +0200 ALSA: emu10k1: Fix deadlock in synth voice lookup commit 95926035b187cc9fee6fb61385b7da9c28123f74 upstream. The emu10k1 voice allocator takes voice_lock spinlock. When there is no empty stream available, it tries to release a voice used by synth, and calls get_synth_voice. The callback function, snd_emu10k1_synth_get_voice(), however, also takes the voice_lock, thus it deadlocks. The fix is simply removing the voice_lock holds in snd_emu10k1_synth_get_voice(), as this is always called in the spinlock context. Reported-and-tested-by: Arthur Marsh Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 4f5de9214861a47c7b295f41d2af2783455101d2 Author: Anatol Pomozov Date: Fri Oct 17 12:43:34 2014 -0700 ALSA: pcm: use the same dma mmap codepath both for arm and arm64 commit a011e213f3700233ed2a676f1ef0a74a052d7162 upstream. This avoids following kernel crash when try to playback on arm64 [ 107.497203] [] snd_pcm_mmap_data_fault+0x90/0xd4 [ 107.503405] [] __do_fault+0xb0/0x498 [ 107.508565] [] handle_mm_fault+0x224/0x7b0 [ 107.514246] [] do_page_fault+0x11c/0x310 [ 107.519738] [] do_mem_abort+0x38/0x98 Tested: backported to 3.14 and tried to playback on arm64 machine Signed-off-by: Anatol Pomozov Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 805e4f907fb42c160f6e9c9b33fc94ffb20d868e Author: Victor Kamensky Date: Tue Oct 14 06:55:05 2014 +0100 arm64: compat: fix compat types affecting struct compat_elf_prpsinfo commit 971a5b6fe634bb7b617d8c5f25b6a3ddbc600194 upstream. The compat_elf_prpsinfo structure does not match the arch/arm struct elf_pspsinfo definition. As result NT_PRPSINFO note in core file created by arm64 kernel for aarch32 (compat) process has wrong size. So gdb cannot display command that caused process crash. Fix is to change size of __compat_uid_t, __compat_gid_t so it would match size of similar fields in arch/arm case. Signed-off-by: Victor Kamensky Acked-by: Arnd Bergmann Signed-off-by: Catalin Marinas Signed-off-by: Greg Kroah-Hartman commit db3b820e8dc467fbf38341418717be909fa8d4b1 Author: Andy Shevchenko Date: Thu Sep 18 20:08:53 2014 +0300 spi: dw-mid: terminate ongoing transfers at exit commit 8e45ef682cb31fda62ed4eeede5d9745a0a1b1e2 upstream. Do full clean up at exit, means terminate all ongoing DMA transfers. Signed-off-by: Andy Shevchenko Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 4e9c74a3333fbb7ce5fd575ae2649bf4e1130775 Author: Dmitry Kasatkin Date: Fri Jun 27 18:04:27 2014 +0300 ima: provide flag to identify new empty files commit b151d6b00bbb798c58f2f21305e7d43fa763f34f upstream. On ima_file_free(), newly created empty files are not labeled with an initial security.ima value, because the iversion did not change. Commit dff6efc "fs: fix iversion handling" introduced a change in iversion behavior. To verify this change use the shell command: $ (exec >foo) $ getfattr -h -e hex -d -m security foo This patch defines the IMA_NEW_FILE flag. The flag is initially set, when IMA detects that a new file is created, and subsequently checked on the ima_file_free() hook to set the initial security.ima value. Signed-off-by: Dmitry Kasatkin Signed-off-by: Mimi Zohar Signed-off-by: Greg Kroah-Hartman commit 5d8f79d49db525d29cf7d9251b35ff7fbeec7b1c Author: Alexey Kardashevskiy Date: Thu Sep 25 16:39:18 2014 +1000 powerpc/iommu/ddw: Fix endianness commit 9410e0185e65394c0c6d046033904b53b97a9423 upstream. rtas_call() accepts and returns values in CPU endianness. The ddw_query_response and ddw_create_response structs members are defined and treated as BE but as they are passed to rtas_call() as (u32 *) and they get byteswapped automatically, the data is CPU-endian. This fixes ddw_query_response and ddw_create_response definitions and use. of_read_number() is designed to work with device tree cells - it assumes the input is big-endian and returns data in CPU-endian. However due to the ddw_create_response struct fix, create.addr_hi/lo are already CPU-endian so do not byteswap them. ddw_avail is a pointer to the "ibm,ddw-applicable" property which contains 3 cells which are big-endian as it is a device tree. rtas_call() accepts a RTAS token in CPU-endian. This makes use of of_property_read_u32_array to byte swap and avoid the need for a number of be32_to_cpu calls. Cc: Benjamin Herrenschmidt [aik: folded Anton's patch with of_property_read_u32_array] Signed-off-by: Alexey Kardashevskiy Acked-by: Anton Blanchard Signed-off-by: Michael Ellerman Signed-off-by: Greg Kroah-Hartman commit 19f4b01dbc75d117994c55d5e9cfa37a814f8b47 Author: Catalin Marinas Date: Fri Oct 17 17:38:49 2014 +0100 futex: Ensure get_futex_key_refs() always implies a barrier commit 76835b0ebf8a7fe85beb03c75121419a7dec52f0 upstream. Commit b0c29f79ecea (futexes: Avoid taking the hb->lock if there's nothing to wake up) changes the futex code to avoid taking a lock when there are no waiters. This code has been subsequently fixed in commit 11d4616bd07f (futex: revert back to the explicit waiter counting code). Both the original commit and the fix-up rely on get_futex_key_refs() to always imply a barrier. However, for private futexes, none of the cases in the switch statement of get_futex_key_refs() would be hit and the function completes without a memory barrier as required before checking the "waiters" in futex_wake() -> hb_waiters_pending(). The consequence is a race with a thread waiting on a futex on another CPU, allowing the waker thread to read "waiters == 0" while the waiter thread to have read "futex_val == locked" (in kernel). Without this fix, the problem (user space deadlocks) can be seen with Android bionic's mutex implementation on an arm64 multi-cluster system. Signed-off-by: Catalin Marinas Reported-by: Matteo Franchin Fixes: b0c29f79ecea (futexes: Avoid taking the hb->lock if there's nothing to wake up) Acked-by: Davidlohr Bueso Tested-by: Mike Galbraith Cc: Darren Hart Cc: Thomas Gleixner Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Paul E. McKenney Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 017ff97daa4a7892181a4dd315c657108419da0c Author: Sasha Levin Date: Mon Oct 13 15:51:05 2014 -0700 kernel: add support for gcc 5 commit 71458cfc782eafe4b27656e078d379a34e472adf upstream. We're missing include/linux/compiler-gcc5.h which is required now because gcc branched off to v5 in trunk. Just copy the relevant bits out of include/linux/compiler-gcc4.h, no new code is added as of now. This fixes a build error when using gcc 5. Signed-off-by: Sasha Levin Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit c950851a1d17a71877fd61f11572207f6676c7f6 Author: Yann Droneaud Date: Thu Oct 9 15:24:40 2014 -0700 fanotify: enable close-on-exec on events' fd when requested in fanotify_init() commit 0b37e097a648aa71d4db1ad108001e95b69a2da4 upstream. According to commit 80af258867648 ("fanotify: groups can specify their f_flags for new fd"), file descriptors created as part of file access notification events inherit flags from the event_f_flags argument passed to syscall fanotify_init(2)[1]. Unfortunately O_CLOEXEC is currently silently ignored. Indeed, event_f_flags are only given to dentry_open(), which only seems to care about O_ACCMODE and O_PATH in do_dentry_open(), O_DIRECT in open_check_o_direct() and O_LARGEFILE in generic_file_open(). It's a pity, since, according to some lookup on various search engines and http://codesearch.debian.net/, there's already some userspace code which use O_CLOEXEC: - in systemd's readahead[2]: fanotify_fd = fanotify_init(FAN_CLOEXEC|FAN_NONBLOCK, O_RDONLY|O_LARGEFILE|O_CLOEXEC|O_NOATIME); - in clsync[3]: #define FANOTIFY_EVFLAGS (O_LARGEFILE|O_RDONLY|O_CLOEXEC) int fanotify_d = fanotify_init(FANOTIFY_FLAGS, FANOTIFY_EVFLAGS); - in examples [4] from "Filesystem monitoring in the Linux kernel" article[5] by Aleksander Morgado: if ((fanotify_fd = fanotify_init (FAN_CLOEXEC, O_RDONLY | O_CLOEXEC | O_LARGEFILE)) < 0) Additionally, since commit 48149e9d3a7e ("fanotify: check file flags passed in fanotify_init"). having O_CLOEXEC as part of fanotify_init() second argument is expressly allowed. So it seems expected to set close-on-exec flag on the file descriptors if userspace is allowed to request it with O_CLOEXEC. But Andrew Morton raised[6] the concern that enabling now close-on-exec might break existing applications which ask for O_CLOEXEC but expect the file descriptor to be inherited across exec(). In the other hand, as reported by Mihai Dontu[7] close-on-exec on the file descriptor returned as part of file access notify can break applications due to deadlock. So close-on-exec is needed for most applications. More, applications asking for close-on-exec are likely expecting it to be enabled, relying on O_CLOEXEC being effective. If not, it might weaken their security, as noted by Jan Kara[8]. So this patch replaces call to macro get_unused_fd() by a call to function get_unused_fd_flags() with event_f_flags value as argument. This way O_CLOEXEC flag in the second argument of fanotify_init(2) syscall is interpreted and close-on-exec get enabled when requested. [1] http://man7.org/linux/man-pages/man2/fanotify_init.2.html [2] http://cgit.freedesktop.org/systemd/systemd/tree/src/readahead/readahead-collect.c?id=v208#n294 [3] https://github.com/xaionaro/clsync/blob/v0.2.1/sync.c#L1631 https://github.com/xaionaro/clsync/blob/v0.2.1/configuration.h#L38 [4] http://www.lanedo.com/~aleksander/fanotify/fanotify-example.c [5] http://www.lanedo.com/2013/filesystem-monitoring-linux-kernel/ [6] http://lkml.kernel.org/r/20141001153621.65e9258e65a6167bf2e4cb50@linux-foundation.org [7] http://lkml.kernel.org/r/20141002095046.3715eb69@mdontu-l [8] http://lkml.kernel.org/r/20141002104410.GB19748@quack.suse.cz Link: http://lkml.kernel.org/r/cover.1411562410.git.ydroneaud@opteya.com Signed-off-by: Yann Droneaud Reviewed-by: Jan Kara Reviewed by: Heinrich Schuchardt Tested-by: Heinrich Schuchardt Cc: Mihai Don\u021bu Cc: Pádraig Brady Cc: Heinrich Schuchardt Cc: Jan Kara Cc: Valdis Kletnieks Cc: Michael Kerrisk-manpages Cc: Lino Sanfilippo Cc: Richard Guy Briggs Cc: Eric Paris Cc: Al Viro Cc: Michael Kerrisk Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 814ec1a87de1c8d54748ab8a7e76dd1dec967f0d Author: Junxiao Bi Date: Thu Oct 9 15:28:23 2014 -0700 mm: clear __GFP_FS when PF_MEMALLOC_NOIO is set commit 934f3072c17cc8886f4c043b47eeeb1b12f8de33 upstream. commit 21caf2fc1931 ("mm: teach mm by current context info to not do I/O during memory allocation") introduces PF_MEMALLOC_NOIO flag to avoid doing I/O inside memory allocation, __GFP_IO is cleared when this flag is set, but __GFP_FS implies __GFP_IO, it should also be cleared. Or it may still run into I/O, like in superblock shrinker. And this will make the kernel run into the deadlock case described in that commit. See Dave Chinner's comment about io in superblock shrinker: Filesystem shrinkers do indeed perform IO from the superblock shrinker and have for years. Even clean inodes can require IO before they can be freed - e.g. on an orphan list, need truncation of post-eof blocks, need to wait for ordered operations to complete before it can be freed, etc. IOWs, Ext4, btrfs and XFS all can issue and/or block on arbitrary amounts of IO in the superblock shrinker context. XFS, in particular, has been doing transactions and IO from the VFS inode cache shrinker since it was first introduced.... Fix this by clearing __GFP_FS in memalloc_noio_flags(), this function has masked all the gfp_mask that will be passed into fs for the processes setting PF_MEMALLOC_NOIO in the direct reclaim path. v1 thread at: https://lkml.org/lkml/2014/9/3/32 Signed-off-by: Junxiao Bi Cc: Dave Chinner Cc: joyce.xue Cc: Ming Lei Cc: Trond Myklebust Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit c84a5f3b11e3b035fae0303e6705d0d3f6b85107 Author: Champion Chen Date: Sat Sep 6 14:06:08 2014 -0500 Bluetooth: Fix issue with USB suspend in btusb driver commit 85560c4a828ec9c8573840c9b66487b6ae584768 upstream. Suspend could fail for some platforms because btusb_suspend==> btusb_stop_traffic ==> usb_kill_anchored_urbs. When btusb_bulk_complete returns before system suspend and resubmits an URB, the system cannot enter suspend state. Signed-off-by: Champion Chen Signed-off-by: Larry Finger Signed-off-by: Marcel Holtmann Signed-off-by: Greg Kroah-Hartman commit 30861ec2cc9b4a14741facdb4ef2faede0959147 Author: Johan Hedberg Date: Fri Aug 15 21:06:51 2014 +0300 Bluetooth: Fix incorrect LE CoC PDU length restriction based on HCI MTU commit 72c6fb915ff2d30ae14053edee4f0d30019bad76 upstream. The l2cap_create_le_flowctl_pdu() function that l2cap_segment_le_sdu() calls is perfectly capable of doing packet fragmentation if given bigger PDUs than the HCI buffers allow. Forcing the PDU length based on the HCI MTU (conn->mtu) would therefore needlessly strict operation on hardware with limited LE buffers (e.g. both Intel and Broadcom seem to have this set to just 27 bytes). This patch removes the restriction and makes it possible to send PDUs of the full length that the remote MPS value allows. Signed-off-by: Johan Hedberg Signed-off-by: Marcel Holtmann Signed-off-by: Greg Kroah-Hartman commit 93d192a930c7b5bd97f1f77f90d298703896fdd3 Author: Loic Poulain Date: Fri Aug 8 19:07:16 2014 +0200 Bluetooth: Fix HCI H5 corrupted ack value commit 4807b51895dce8aa650ebebc51fa4a795ed6b8b8 upstream. In this expression: seq = (seq - 1) % 8 seq (u8) is implicitly converted to an int in the arithmetic operation. So if seq value is 0, operation is ((0 - 1) % 8) => (-1 % 8) => -1. The new seq value is 0xff which is an invalid ACK value, we expect 0x07. It leads to frequent dropped ACK and retransmission. Fix this by using '&' binary operator instead of '%'. Signed-off-by: Loic Poulain Signed-off-by: Marcel Holtmann Signed-off-by: Greg Kroah-Hartman commit c24580ec5132545788c48249f9d8fab758ee908c Author: Stanislaw Gruszka Date: Wed Sep 24 11:24:54 2014 +0200 rt2800: correct BBP1_TX_POWER_CTRL mask commit 01f7feeaf4528bec83798316b3c811701bac5d3e upstream. Two bits control TX power on BBP_R1 register. Correct the mask, otherwise we clear additional bit on BBP_R1 register, what can have unknown, possible negative effect. Signed-off-by: Stanislaw Gruszka Signed-off-by: John W. Linville Signed-off-by: Greg Kroah-Hartman commit 98f0d20b2adf4e1cbeae63387bff155da350fdf6 Author: Ricardo Ribalda Delgado Date: Wed Aug 27 14:57:57 2014 +0200 PCI: Generate uppercase hex for modalias interface class commit 89ec3dcf17fd3fa009ecf8faaba36828dd6bc416 upstream. Some implementations of modprobe fail to load the driver for a PCI device automatically because the "interface" part of the modalias from the kernel is lowercase, and the modalias from file2alias is uppercase. The "interface" is the low-order byte of the Class Code, defined in PCI r3.0, Appendix D. Most interface types defined in the spec do not use alpha characters, so they won't be affected. For example, 00h, 01h, 10h, 20h, etc. are unaffected. Print the "interface" byte of the Class Code in uppercase hex, as we already do for the Vendor ID, Device ID, Class, etc. [bhelgaas: changelog] Signed-off-by: Ricardo Ribalda Delgado Signed-off-by: Bjorn Helgaas Acked-by: Greg Kroah-Hartman Signed-off-by: Greg Kroah-Hartman commit 575993900824f2ec6b7f945af823ebebb1094bfd Author: Douglas Lehr Date: Thu Aug 21 09:26:52 2014 +1000 PCI: Increase IBM ipr SAS Crocodile BARs to at least system page size commit 9fe373f9997b48fcd6222b95baf4a20c134b587a upstream. The Crocodile chip occasionally comes up with 4k and 8k BAR sizes. Due to an erratum, setting the SR-IOV page size causes the physical function BARs to expand to the system page size. Since ppc64 uses 64k pages, when Linux tries to assign the smaller resource sizes to the now 64k BARs the address will be truncated and the BARs will overlap. Force Linux to allocate the resource as a full page, which avoids the overlap. [bhelgaas: print expanded resource, too] Signed-off-by: Douglas Lehr Signed-off-by: Anton Blanchard Signed-off-by: Bjorn Helgaas Acked-by: Milton Miller Signed-off-by: Greg Kroah-Hartman commit 1ed8711eab1de778174bba04464ac94860403792 Author: Thomas Petazzoni Date: Wed Sep 17 17:58:27 2014 +0200 PCI: mvebu: Fix uninitialized variable in mvebu_get_tgt_attr() commit 56fab6e189441d714a2bfc8a64f3df9c0749dff7 upstream. Geert Uytterhoeven reported a warning when building pci-mvebu: drivers/pci/host/pci-mvebu.c: In function 'mvebu_get_tgt_attr': drivers/pci/host/pci-mvebu.c:887:39: warning: 'rtype' may be used uninitialized in this function [-Wmaybe-uninitialized] if (slot == PCI_SLOT(devfn) && type == rtype) { ^ And indeed, the code of mvebu_get_tgt_attr() may lead to the usage of rtype when being uninitialized, even though it would only happen if we had entries other than I/O space and 32 bits memory space. This commit fixes that by simply skipping the current DT range being considered, if it doesn't match the resource type we're looking for. Reported-by: Geert Uytterhoeven Signed-off-by: Thomas Petazzoni Signed-off-by: Bjorn Helgaas Signed-off-by: Greg Kroah-Hartman commit db2dccfee50de21a0d4c8ca7b215bf4d50a27335 Author: Oren Givon Date: Wed Sep 17 10:31:56 2014 +0300 iwlwifi: Add missing PCI IDs for the 7260 series commit 4f08970f5284dce486f0e2290834aefb2a262189 upstream. Add 4 missing PCI IDs for the 7260 series. Signed-off-by: Oren Givon Signed-off-by: Emmanuel Grumbach Signed-off-by: Greg Kroah-Hartman commit 23357512d141c38733bd0619b10448ff315f3b3c Author: Andy Adamson Date: Mon Sep 29 12:31:57 2014 -0400 NFSv4.1: Fix an NFSv4.1 state renewal regression commit d1f456b0b9545f1606a54cd17c20775f159bd2ce upstream. Commit 2f60ea6b8ced ("NFSv4: The NFSv4.0 client must send RENEW calls if it holds a delegation") set the NFS4_RENEW_TIMEOUT flag in nfs4_renew_state, and does not put an nfs41_proc_async_sequence call, the NFSv4.1 lease renewal heartbeat call, on the wire to renew the NFSv4.1 state if the flag was not set. The NFS4_RENEW_TIMEOUT flag is set when "now" is after the last renewal (cl_last_renewal) plus the lease time divided by 3. This is arbitrary and sometimes does the following: In normal operation, the only way a future state renewal call is put on the wire is via a call to nfs4_schedule_state_renewal, which schedules a nfs4_renew_state workqueue task. nfs4_renew_state determines if the NFS4_RENEW_TIMEOUT should be set, and the calls nfs41_proc_async_sequence, which only gets sent if the NFS4_RENEW_TIMEOUT flag is set. Then the nfs41_proc_async_sequence rpc_release function schedules another state remewal via nfs4_schedule_state_renewal. Without this change we can get into a state where an application stops accessing the NFSv4.1 share, state renewal calls stop due to the NFS4_RENEW_TIMEOUT flag _not_ being set. The only way to recover from this situation is with a clientid re-establishment, once the application resumes and the server has timed out the lease and so returns NFS4ERR_BAD_SESSION on the subsequent SEQUENCE operation. An example application: open, lock, write a file. sleep for 6 * lease (could be less) ulock, close. In the above example with NFSv4.1 delegations enabled, without this change, there are no OP_SEQUENCE state renewal calls during the sleep, and the clientid is recovered due to lease expiration on the close. This issue does not occur with NFSv4.1 delegations disabled, nor with NFSv4.0, with or without delegations enabled. Signed-off-by: Andy Adamson Link: http://lkml.kernel.org/r/1411486536-23401-1-git-send-email-andros@netapp.com Fixes: 2f60ea6b8ced (NFSv4: The NFSv4.0 client must send RENEW calls...) Signed-off-by: Trond Myklebust Signed-off-by: Greg Kroah-Hartman commit c66ff82656e29949b1c1bfea1739dbca1dfde8be Author: Trond Myklebust Date: Sat Sep 27 17:41:51 2014 -0400 NFSv4: fix open/lock state recovery error handling commit df817ba35736db2d62b07de6f050a4db53492ad8 upstream. The current open/lock state recovery unfortunately does not handle errors such as NFS4ERR_CONN_NOT_BOUND_TO_SESSION correctly. Instead of looping, just proceeds as if the state manager is finished recovering. This patch ensures that we loop back, handle higher priority errors and complete the open/lock state recovery. Signed-off-by: Trond Myklebust Signed-off-by: Greg Kroah-Hartman commit 7c4ed3855612cf818b65649984ef1ba7b7cbab39 Author: Trond Myklebust Date: Sat Sep 27 17:02:26 2014 -0400 NFSv4: Fix lock recovery when CREATE_SESSION/SETCLIENTID_CONFIRM fails commit a4339b7b686b4acc8b6de2b07d7bacbe3ae44b83 upstream. If a NFSv4.x server returns NFS4ERR_STALE_CLIENTID in response to a CREATE_SESSION or SETCLIENTID_CONFIRM in order to tell us that it rebooted a second time, then the client will currently take this to mean that it must declare all locks to be stale, and hence ineligible for reboot recovery. RFC3530 and RFC5661 both suggest that the client should instead rely on the server to respond to inelegible open share, lock and delegation reclaim requests with NFS4ERR_NO_GRACE in this situation. Signed-off-by: Trond Myklebust Signed-off-by: Greg Kroah-Hartman commit 29c736dccf67f3c1f9ff88440fbf2d89b782c41f Author: Frans Klaver Date: Thu Sep 25 11:19:51 2014 +0200 tty: omap-serial: fix division by zero commit dc3187564e61260f49eceb21a4e7eb5e4428e90a upstream. If the chosen baud rate is large enough (e.g. 3.5 megabaud), the calculated n values in serial_omap_is_baud_mode16() may become 0. This causes a division by zero when calculating the difference between calculated and desired baud rates. To prevent this, cap the n13 and n16 values on 1. Division by zero in kernel. [] (unwind_backtrace) from [] (show_stack+0x10/0x14) [] (show_stack) from [] (Ldiv0+0x8/0x10) [] (Ldiv0) from [] (serial_omap_baud_is_mode16+0x4c/0x68) [] (serial_omap_baud_is_mode16) from [] (serial_omap_set_termios+0x90/0x8d8) [] (serial_omap_set_termios) from [] (uart_change_speed+0xa4/0xa8) [] (uart_change_speed) from [] (uart_set_termios+0xa0/0x1fc) [] (uart_set_termios) from [] (tty_set_termios+0x248/0x2c0) [] (tty_set_termios) from [] (set_termios+0x248/0x29c) [] (set_termios) from [] (tty_mode_ioctl+0x1c8/0x4e8) [] (tty_mode_ioctl) from [] (tty_ioctl+0xa94/0xb18) [] (tty_ioctl) from [] (do_vfs_ioctl+0x4a0/0x560) [] (do_vfs_ioctl) from [] (SyS_ioctl+0x4c/0x74) [] (SyS_ioctl) from [] (ret_fast_syscall+0x0/0x30) Signed-off-by: Frans Klaver Signed-off-by: Greg Kroah-Hartman commit 7f5f71a9265d9829577393d9005b165f28b1cd77 Author: Willy Tarreau Date: Sat Sep 27 12:31:37 2014 +0200 lzo: check for length overrun in variable length encoding. commit 72cf90124e87d975d0b2114d930808c58b4c05e4 upstream. This fix ensures that we never meet an integer overflow while adding 255 while parsing a variable length encoding. It works differently from commit 206a81c ("lzo: properly check for overruns") because instead of ensuring that we don't overrun the input, which is tricky to guarantee due to many assumptions in the code, it simply checks that the cumulated number of 255 read cannot overflow by bounding this number. The MAX_255_COUNT is the maximum number of times we can add 255 to a base count without overflowing an integer. The multiply will overflow when multiplying 255 by more than MAXINT/255. The sum will overflow earlier depending on the base count. Since the base count is taken from a u8 and a few bits, it is safe to assume that it will always be lower than or equal to 2*255, thus we can always prevent any overflow by accepting two less 255 steps. This patch also reduces the CPU overhead and actually increases performance by 1.1% compared to the initial code, while the previous fix costs 3.1% (measured on x86_64). The fix needs to be backported to all currently supported stable kernels. Reported-by: Willem Pinckaers Cc: "Don A. Bailey" Signed-off-by: Willy Tarreau Signed-off-by: Greg Kroah-Hartman commit be73cb4d097fd2bb49a5277f80da44a72466a161 Author: Willy Tarreau Date: Sat Sep 27 12:31:36 2014 +0200 Revert "lzo: properly check for overruns" commit af958a38a60c7ca3d8a39c918c1baa2ff7b6b233 upstream. This reverts commit 206a81c ("lzo: properly check for overruns"). As analysed by Willem Pinckaers, this fix is still incomplete on certain rare corner cases, and it is easier to restart from the original code. Reported-by: Willem Pinckaers Cc: "Don A. Bailey" Signed-off-by: Willy Tarreau Signed-off-by: Greg Kroah-Hartman commit be30bc63af981c48efe5b4e8d260ced0095f4c9b Author: Willy Tarreau Date: Sat Sep 27 12:31:35 2014 +0200 Documentation: lzo: document part of the encoding commit d98a0526434d27e261f622cf9d2e0028b5ff1a00 upstream. Add a complete description of the LZO format as processed by the decompressor. I have not found a public specification of this format hence this analysis, which will be used to better understand the code. Cc: Willem Pinckaers Cc: "Don A. Bailey" Signed-off-by: Willy Tarreau Signed-off-by: Greg Kroah-Hartman commit 4e95348cdac055cf0ba87af0212bab78ab5964bb Author: Olga Kornievskaia Date: Wed Sep 24 18:11:28 2014 -0400 Fixing lease renewal commit 8faaa6d5d48b201527e0451296d9e71d23afb362 upstream. Commit c9fdeb28 removed a 'continue' after checking if the lease needs to be renewed. However, if client hasn't moved, the code falls down to starting reboot recovery erroneously (ie., sends open reclaim and gets back stale_clientid error) before recovering from getting stale_clientid on the renew operation. Signed-off-by: Olga Kornievskaia Fixes: c9fdeb280b8c (NFS: Add basic migration support to state manager thread) Signed-off-by: Trond Myklebust Signed-off-by: Greg Kroah-Hartman commit c3e1d75a1b07be57e6793becfef725105b8f96e3 Author: Geert Uytterhoeven Date: Sun Sep 28 10:50:06 2014 +0200 m68k: Disable/restore interrupts in hwreg_present()/hwreg_write() commit e4dc601bf99ccd1c95b7e6eef1d3cf3c4b0d4961 upstream. hwreg_present() and hwreg_write() temporarily change the VBR register to another vector table. This table contains a valid bus error handler only, all other entries point to arbitrary addresses. If an interrupt comes in while the temporary table is active, the processor will start executing at such an arbitrary address, and the kernel will crash. While most callers run early, before interrupts are enabled, or explicitly disable interrupts, Finn Thain pointed out that macsonic has one callsite that doesn't, causing intermittent boot crashes. There's another unsafe callsite in hilkbd. Fix this for good by disabling and restoring interrupts inside hwreg_present() and hwreg_write(). Explicitly disabling interrupts can be removed from the callsites later. Reported-by: Finn Thain Signed-off-by: Geert Uytterhoeven Signed-off-by: Greg Kroah-Hartman commit c01f090185b62b906e627e1fa3602a57e0db5f9f Author: Alexander Usyskin Date: Mon Aug 25 16:46:53 2014 +0300 mei: bus: fix possible boundaries violation commit cfda2794b5afe7ce64ee9605c64bef0e56a48125 upstream. function 'strncpy' will fill whole buffer 'id.name' of fixed size (32) with string value and will not leave place for NULL-terminator. Possible buffer boundaries violation in following string operations. Replace strncpy with strlcpy. Signed-off-by: Alexander Usyskin Signed-off-by: Tomas Winkler Signed-off-by: Greg Kroah-Hartman commit 4b417357687ddf1191c248fc18c167822c4d978b Author: K. Y. Srinivasan Date: Wed Aug 27 16:25:35 2014 -0700 Drivers: hv: vmbus: Fix a bug in vmbus_open() commit 45d727cee9e200f5b351528b9fb063b69cf702c8 upstream. Fix a bug in vmbus_open() and properly propagate the error. I would like to thank Dexuan Cui for identifying the issue. Signed-off-by: K. Y. Srinivasan Tested-by: Sitsofe Wheeler Signed-off-by: Greg Kroah-Hartman commit 9c9520596f2c96b906712c468d776319d7836540 Author: K. Y. Srinivasan Date: Wed Aug 27 16:25:34 2014 -0700 Drivers: hv: vmbus: Cleanup vmbus_establish_gpadl() commit 72c6b71c245dac8f371167d97ef471b367d0b66b upstream. Eliminate the call to BUG_ON() by waiting for the host to respond. We are trying to reclaim the ownership of memory that was given to the host and so we will have to wait until the host responds. Signed-off-by: K. Y. Srinivasan Tested-by: Sitsofe Wheeler Signed-off-by: Greg Kroah-Hartman commit 965d61bf8737f52ffd60b98255eb6d07b6ea2b62 Author: K. Y. Srinivasan Date: Wed Aug 27 16:25:33 2014 -0700 Drivers: hv: vmbus: Cleanup vmbus_close_internal() commit 98d731bb064a9d1817a6ca9bf8b97051334a7cfe upstream. Eliminate calls to BUG_ON() in vmbus_close_internal(). We have chosen to potentially leak memory, than crash the guest in case of failures. In this version of the patch I have addressed comments from Dan Carpenter (dan.carpenter@oracle.com). Signed-off-by: K. Y. Srinivasan Tested-by: Sitsofe Wheeler Signed-off-by: Greg Kroah-Hartman commit b8c396a6c072ddbdd31540d61bd24a12445e33a0 Author: K. Y. Srinivasan Date: Wed Aug 27 16:25:32 2014 -0700 Drivers: hv: vmbus: Cleanup vmbus_teardown_gpadl() commit 66be653083057358724d56d817e870e53fb81ca7 upstream. Eliminate calls to BUG_ON() by properly handling errors. In cases where rollback is possible, we will return the appropriate error to have the calling code decide how to rollback state. In the case where we are transferring ownership of the guest physical pages to the host, we will wait for the host to respond. Signed-off-by: K. Y. Srinivasan Tested-by: Sitsofe Wheeler Signed-off-by: Greg Kroah-Hartman commit 5fe80abd8a4fd879a6bf02e3936f03dbffc57c41 Author: K. Y. Srinivasan Date: Wed Aug 27 16:25:31 2014 -0700 Drivers: hv: vmbus: Cleanup vmbus_post_msg() commit fdeebcc62279119dbeafbc1a2e39e773839025fd upstream. Posting messages to the host can fail because of transient resource related failures. Correctly deal with these failures and increase the number of attempts to post the message before giving up. In this version of the patch, I have normalized the error code to Linux error code. Signed-off-by: K. Y. Srinivasan Tested-by: Sitsofe Wheeler Signed-off-by: Greg Kroah-Hartman commit afc41640309383af419174e741c515d999f4dc25 Author: Kees Cook Date: Thu Sep 18 11:25:37 2014 -0700 firmware_class: make sure fw requests contain a name commit 471b095dfe0d693a8d624cbc716d1ee4d74eb437 upstream. An empty firmware request name will trigger warnings when building device names. Make sure this is caught earlier and rejected. The warning was visible via the test_firmware.ko module interface: echo -ne "\x00" > /sys/devices/virtual/misc/test_firmware/trigger_request Reported-by: Sasha Levin Signed-off-by: Kees Cook Tested-by: Sasha Levin Signed-off-by: Greg Kroah-Hartman commit 7489d15203157651c8b77d0526c08c86dc4742f9 Author: Xuelin Shi Date: Tue Jul 1 16:32:38 2014 +0800 dmaengine: fix xor sources continuation commit 87cea76384257e6ac3fa4791b6a6b9d0335f7457 upstream. the partial xor result must be kept until the next tx is generated. Signed-off-by: Xuelin Shi Signed-off-by: Dan Williams Signed-off-by: Greg Kroah-Hartman commit 54adfd41b8ee08a00e6285e8ef27246655467d24 Author: Joe Lawrence Date: Tue Aug 26 17:10:41 2014 -0400 qla2xxx: Fix shost use-after-free on device removal commit db7157d4cfce6edf052452fb1d327d4d11b67f4c upstream. Once calling scsi_host_put, be careful to not access qla_hw_data through the Scsi_Host private data (ie, scsi_qla_host base_vha). Fixes: fe1b806f4f71 ("qla2xxx: Refactor shutdown code so some functionality can be reused") Signed-off-by: Joe Lawrence Acked-by: Chad Dupuis Signed-off-by: Christoph Hellwig Signed-off-by: Greg Kroah-Hartman commit 4952a180c73f95f3eca4f55580bd49ce79cd6789 Author: Arun Easi Date: Thu Sep 25 06:14:45 2014 -0400 qla2xxx: Use correct offset to req-q-out for reserve calculation commit 75554b68ac1e018bca00d68a430b92ada8ab52dd upstream. Signed-off-by: Arun Easi Signed-off-by: Saurav Kashyap Signed-off-by: Christoph Hellwig Signed-off-by: Greg Kroah-Hartman commit 8fd7a73aa51373d7c16440269244c3865a61a9bf Author: Chris J Arges Date: Tue Sep 23 09:22:25 2014 -0500 mptfusion: enable no_write_same for vmware scsi disks commit 4089b71cc820a426d601283c92fcd4ffeb5139c2 upstream. When using a virtual SCSI disk in a VMWare VM if blkdev_issue_zeroout is used data can be improperly zeroed out using the mptfusion driver. This patch disables write_same for this driver and the vmware subsystem_vendor which ensures that manual zeroing out is used instead. BugLink: http://bugs.launchpad.net/bugs/1371591 Reported-by: Bruce Lucas Tested-by: Chris J Arges Signed-off-by: Chris J Arges Reviewed-by: Martin K. Petersen Signed-off-by: Christoph Hellwig Signed-off-by: Greg Kroah-Hartman commit 7f12c9e03919c64e2dfc63ddd4fedf180d3cd44f Author: Mike Christie Date: Mon Sep 29 13:55:41 2014 -0500 be2iscsi: check ip buffer before copying commit a41a9ad3bbf61fae0b6bfb232153da60d14fdbd9 upstream. Dan Carpenter found a issue where be2iscsi would copy the ip from userspace to the driver buffer before checking the len of the data being copied: http://marc.info/?l=linux-scsi&m=140982651504251&w=2 This patch just has us only copy what we the driver buffer can support. Tested-by: John Soni Jose Signed-off-by: Mike Christie Signed-off-by: Christoph Hellwig Signed-off-by: Greg Kroah-Hartman commit 2f06fa04cf35da5c24481da3ac84a2900d0b99c3 Author: Xiubo Li Date: Sun Sep 28 17:09:54 2014 +0800 regmap: fix possible ZERO_SIZE_PTR pointer dereferencing error. commit d6b41cb06044a7d895db82bdd54f6e4219970510 upstream. Since we cannot make sure the 'val_count' will always be none zero here, and then if it equals to zero, the kmemdup() will return ZERO_SIZE_PTR, which equals to ((void *)16). So this patch fix this with just doing the zero check before calling kmemdup(). Signed-off-by: Xiubo Li Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit e2fe6c3046ba3a75b3228181a34831b9bb8ee861 Author: Pankaj Dubey Date: Sat Sep 27 09:47:55 2014 +0530 regmap: fix NULL pointer dereference in _regmap_write/read commit 5336be8416a71b5568d2cf54a2f2066abe9f2a53 upstream. If LOG_DEVICE is defined and map->dev is NULL it will lead to NULL pointer dereference. This patch fixes this issue by adding check for dev->NULL in all such places in regmap.c Signed-off-by: Pankaj Dubey Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 6d063730920d6bf76105473e67d7ed34bdef437f Author: Xiubo Li Date: Sun Sep 28 11:35:25 2014 +0800 regmap: debugfs: fix possbile NULL pointer dereference commit 2c98e0c1cc6b8e86f1978286c3d4e0769ee9d733 upstream. If 'map->dev' is NULL and there will lead dev_name() to be NULL pointer dereference. So before dev_name(), we need to have check of the map->dev pionter. We also should make sure that the 'name' pointer shouldn't be NULL for debugfs_create_dir(). So here using one default "dummy" debugfs name when the 'name' pointer and 'map->dev' are both NULL. Signed-off-by: Xiubo Li Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 4e973d3174fe8d683c382bc5261a3d8643f91242 Author: Andy Shevchenko Date: Fri Sep 12 15:11:58 2014 +0300 spi: dw-mid: check that DMA was inited before exit commit fb57862ead652454ceeb659617404c5f13bc34b5 upstream. If the driver was compiled with DMA support, but DMA channels weren't acquired by some reason, mid_spi_dma_exit() will crash the kernel. Fixes: 7063c0d942a1 (spi/dw_spi: add DMA support) Signed-off-by: Andy Shevchenko Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 65eea26bfac18b939ef8274785831bdd74fb12f8 Author: Andy Shevchenko Date: Thu Sep 18 20:08:51 2014 +0300 spi: dw-mid: respect 8 bit mode commit b41583e7299046abdc578c33f25ed83ee95b9b31 upstream. In case of 8 bit mode and DMA usage we end up with every second byte written as 0. We have to respect bits_per_word settings what this patch actually does. Signed-off-by: Andy Shevchenko Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 94b209e7d25da7ad9f72ad27a87ba2faddcf1fe6 Author: Bryan O'Donoghue Date: Wed Sep 24 00:26:24 2014 +0100 x86/intel/quark: Switch off CR4.PGE so TLB flush uses CR3 instead commit ee1b5b165c0a2f04d2107e634e51f05d0eb107de upstream. Quark x1000 advertises PGE via the standard CPUID method PGE bits exist in Quark X1000's PTEs. In order to flush an individual PTE it is necessary to reload CR3 irrespective of the PTE.PGE bit. See Quark Core_DevMan_001.pdf section 6.4.11 This bug was fixed in Galileo kernels, unfixed vanilla kernels are expected to crash and burn on this platform. Signed-off-by: Bryan O'Donoghue Cc: Borislav Petkov Link: http://lkml.kernel.org/r/1411514784-14885-1-git-send-email-pure.logic@nexus-software.ie Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit dff6e8cd03a5c2709ef0933e8b48cbf8b28aee4a Author: David Matlack Date: Fri Sep 19 16:03:25 2014 -0700 kvm: don't take vcpu mutex for obviously invalid vcpu ioctls commit 2ea75be3219571d0ec009ce20d9971e54af96e09 upstream. vcpu ioctls can hang the calling thread if issued while a vcpu is running. However, invalid ioctls can happen when userspace tries to probe the kind of file descriptors (e.g. isatty() calls ioctl(TCGETS)); in that case, we know the ioctl is going to be rejected as invalid anyway and we can fail before trying to take the vcpu mutex. This patch does not change functionality, it just makes invalid ioctls fail faster. Signed-off-by: David Matlack Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit a4095e0bcfb8bade0db43db9c90dd8ee4b62ab13 Author: Christian Borntraeger Date: Wed Sep 3 16:21:32 2014 +0200 KVM: s390: unintended fallthrough for external call commit f346026e55f1efd3949a67ddd1dcea7c1b9a615e upstream. We must not fallthrough if the conditions for external call are not met. Signed-off-by: Christian Borntraeger Reviewed-by: Thomas Huth Signed-off-by: Greg Kroah-Hartman commit 55eb96ec53e7b041f10ffcc7b1442a31d89d4488 Author: David Matlack Date: Mon Aug 18 15:46:06 2014 -0700 kvm: fix potentially corrupt mmio cache commit ee3d1570b58677885b4552bce8217fda7b226a68 upstream. vcpu exits and memslot mutations can run concurrently as long as the vcpu does not aquire the slots mutex. Thus it is theoretically possible for memslots to change underneath a vcpu that is handling an exit. If we increment the memslot generation number again after synchronize_srcu_expedited(), vcpus can safely cache memslot generation without maintaining a single rcu_dereference through an entire vm exit. And much of the x86/kvm code does not maintain a single rcu_dereference of the current memslots during each exit. We can prevent the following case: vcpu (CPU 0) | thread (CPU 1) --------------------------------------------+-------------------------- 1 vm exit | 2 srcu_read_unlock(&kvm->srcu) | 3 decide to cache something based on | old memslots | 4 | change memslots | (increments generation) 5 | synchronize_srcu(&kvm->srcu); 6 retrieve generation # from new memslots | 7 tag cache with new memslot generation | 8 srcu_read_unlock(&kvm->srcu) | ... | | ... | | | By incrementing the generation after synchronizing with kvm->srcu readers, we ensure that the generation retrieved in (6) will become invalid soon after (8). Keeping the existing increment is not strictly necessary, but we do keep it and just move it for consistency from update_memslots to install_new_memslots. It invalidates old cached MMIOs immediately, instead of having to wait for the end of synchronize_srcu_expedited, which makes the code more clearly correct in case CPU 1 is preempted right after synchronize_srcu() returns. To avoid halving the generation space in SPTEs, always presume that the low bit of the generation is zero when reconstructing a generation number out of an SPTE. This effectively disables MMIO caching in SPTEs during the call to synchronize_srcu_expedited. Using the low bit this way is somewhat like a seqcount---where the protected thing is a cache, and instead of retrying we can simply punt if we observe the low bit to be 1. Signed-off-by: David Matlack Reviewed-by: Xiao Guangrong Reviewed-by: David Matlack Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit a523af29af05c0e0fbb054904a9b9c8ac55ca1cb Author: David Matlack Date: Mon Aug 18 15:46:07 2014 -0700 kvm: x86: fix stale mmio cache bug commit 56f17dd3fbc44adcdbc3340fe3988ddb833a47a7 upstream. The following events can lead to an incorrect KVM_EXIT_MMIO bubbling up to userspace: (1) Guest accesses gpa X without a memory slot. The gfn is cached in struct kvm_vcpu_arch (mmio_gfn). On Intel EPT-enabled hosts, KVM sets the SPTE write-execute-noread so that future accesses cause EPT_MISCONFIGs. (2) Host userspace creates a memory slot via KVM_SET_USER_MEMORY_REGION covering the page just accessed. (3) Guest attempts to read or write to gpa X again. On Intel, this generates an EPT_MISCONFIG. The memory slot generation number that was incremented in (2) would normally take care of this but we fast path mmio faults through quickly_check_mmio_pf(), which only checks the per-vcpu mmio cache. Since we hit the cache, KVM passes a KVM_EXIT_MMIO up to userspace. This patch fixes the issue by using the memslot generation number to validate the mmio cache. Signed-off-by: David Matlack [xiaoguangrong: adjust the code to make it simpler for stable-tree fix.] Signed-off-by: Xiao Guangrong Reviewed-by: David Matlack Reviewed-by: Xiao Guangrong Tested-by: David Matlack Signed-off-by: Paolo Bonzini Signed-off-by: Greg Kroah-Hartman commit 3e9a823aeb6af0a4ddd590658b59b6fa1f450fe9 Author: Filipe Manana Date: Sun Mar 30 23:02:53 2014 +0100 Btrfs: send, fix data corruption due to incorrect hole detection commit 766b5e5ae78dd04a93a275690a49e23d7dcb1f39 upstream. During an incremental send, when we finish processing an inode (corresponding to a regular file) we would assume the gap between the end of the last processed file extent and the file's size corresponded to a file hole, and therefore incorrectly send a bunch of zero bytes to overwrite that region in the file. This affects only kernel 3.14. Reproducer: mkfs.btrfs -f /dev/sdc mount /dev/sdc /mnt xfs_io -f -c "falloc -k 0 268435456" /mnt/foo btrfs subvolume snapshot -r /mnt /mnt/mysnap0 xfs_io -c "pwrite -S 0x01 -b 9216 16190218 9216" /mnt/foo xfs_io -c "pwrite -S 0x02 -b 1121 198720104 1121" /mnt/foo xfs_io -c "pwrite -S 0x05 -b 9216 107887439 9216" /mnt/foo xfs_io -c "pwrite -S 0x06 -b 9216 225520207 9216" /mnt/foo xfs_io -c "pwrite -S 0x07 -b 67584 102138300 67584" /mnt/foo xfs_io -c "pwrite -S 0x08 -b 7000 94897484 7000" /mnt/foo xfs_io -c "pwrite -S 0x09 -b 113664 245083212 113664" /mnt/foo xfs_io -c "pwrite -S 0x10 -b 123 17937788 123" /mnt/foo xfs_io -c "pwrite -S 0x11 -b 39936 229573311 39936" /mnt/foo xfs_io -c "pwrite -S 0x12 -b 67584 174792222 67584" /mnt/foo xfs_io -c "pwrite -S 0x13 -b 9216 249253213 9216" /mnt/foo xfs_io -c "pwrite -S 0x16 -b 67584 150046083 67584" /mnt/foo xfs_io -c "pwrite -S 0x17 -b 39936 118246040 39936" /mnt/foo xfs_io -c "pwrite -S 0x18 -b 67584 215965442 67584" /mnt/foo xfs_io -c "pwrite -S 0x19 -b 33792 97096725 33792" /mnt/foo xfs_io -c "pwrite -S 0x20 -b 125952 166300596 125952" /mnt/foo xfs_io -c "pwrite -S 0x21 -b 123 1078957 123" /mnt/foo xfs_io -c "pwrite -S 0x25 -b 9216 212044492 9216" /mnt/foo xfs_io -c "pwrite -S 0x26 -b 7000 265037146 7000" /mnt/foo xfs_io -c "pwrite -S 0x27 -b 42757 215922685 42757" /mnt/foo xfs_io -c "pwrite -S 0x28 -b 7000 69865411 7000" /mnt/foo xfs_io -c "pwrite -S 0x29 -b 67584 67948958 67584" /mnt/foo xfs_io -c "pwrite -S 0x30 -b 39936 266967019 39936" /mnt/foo xfs_io -c "pwrite -S 0x31 -b 1121 19582453 1121" /mnt/foo xfs_io -c "pwrite -S 0x32 -b 17408 257710255 17408" /mnt/foo xfs_io -c "pwrite -S 0x33 -b 39936 3895518 39936" /mnt/foo xfs_io -c "pwrite -S 0x34 -b 125952 12045847 125952" /mnt/foo xfs_io -c "pwrite -S 0x35 -b 17408 19156379 17408" /mnt/foo xfs_io -c "pwrite -S 0x36 -b 39936 50160066 39936" /mnt/foo xfs_io -c "pwrite -S 0x37 -b 113664 9549793 113664" /mnt/foo xfs_io -c "pwrite -S 0x38 -b 105472 94391506 105472" /mnt/foo xfs_io -c "pwrite -S 0x39 -b 23552 143632863 23552" /mnt/foo xfs_io -c "pwrite -S 0x40 -b 39936 241283845 39936" /mnt/foo xfs_io -c "pwrite -S 0x41 -b 113664 199937606 113664" /mnt/foo xfs_io -c "pwrite -S 0x42 -b 67584 67380093 67584" /mnt/foo xfs_io -c "pwrite -S 0x43 -b 67584 26793129 67584" /mnt/foo xfs_io -c "pwrite -S 0x44 -b 39936 14421913 39936" /mnt/foo xfs_io -c "pwrite -S 0x45 -b 123 253097405 123" /mnt/foo xfs_io -c "pwrite -S 0x46 -b 1121 128233424 1121" /mnt/foo xfs_io -c "pwrite -S 0x47 -b 105472 91577959 105472" /mnt/foo xfs_io -c "pwrite -S 0x48 -b 1121 7245381 1121" /mnt/foo xfs_io -c "pwrite -S 0x49 -b 113664 182414694 113664" /mnt/foo xfs_io -c "pwrite -S 0x50 -b 9216 32750608 9216" /mnt/foo xfs_io -c "pwrite -S 0x51 -b 67584 266546049 67584" /mnt/foo xfs_io -c "pwrite -S 0x52 -b 67584 87969398 67584" /mnt/foo xfs_io -c "pwrite -S 0x53 -b 9216 260848797 9216" /mnt/foo xfs_io -c "pwrite -S 0x54 -b 39936 119461243 39936" /mnt/foo xfs_io -c "pwrite -S 0x55 -b 7000 200178693 7000" /mnt/foo xfs_io -c "pwrite -S 0x56 -b 9216 243316029 9216" /mnt/foo xfs_io -c "pwrite -S 0x57 -b 7000 209658229 7000" /mnt/foo xfs_io -c "pwrite -S 0x58 -b 101376 179745192 101376" /mnt/foo xfs_io -c "pwrite -S 0x59 -b 9216 64012300 9216" /mnt/foo xfs_io -c "pwrite -S 0x60 -b 125952 181705139 125952" /mnt/foo xfs_io -c "pwrite -S 0x61 -b 23552 235737348 23552" /mnt/foo xfs_io -c "pwrite -S 0x62 -b 113664 106021355 113664" /mnt/foo xfs_io -c "pwrite -S 0x63 -b 67584 135753552 67584" /mnt/foo xfs_io -c "pwrite -S 0x64 -b 23552 95730888 23552" /mnt/foo xfs_io -c "pwrite -S 0x65 -b 11 17311415 11" /mnt/foo xfs_io -c "pwrite -S 0x66 -b 33792 120695553 33792" /mnt/foo xfs_io -c "pwrite -S 0x67 -b 9216 17164631 9216" /mnt/foo xfs_io -c "pwrite -S 0x68 -b 9216 136065853 9216" /mnt/foo xfs_io -c "pwrite -S 0x69 -b 67584 37752198 67584" /mnt/foo xfs_io -c "pwrite -S 0x70 -b 101376 189717473 101376" /mnt/foo xfs_io -c "pwrite -S 0x71 -b 7000 227463698 7000" /mnt/foo xfs_io -c "pwrite -S 0x72 -b 9216 12655137 9216" /mnt/foo xfs_io -c "pwrite -S 0x73 -b 7000 7488866 7000" /mnt/foo xfs_io -c "pwrite -S 0x74 -b 113664 87813649 113664" /mnt/foo xfs_io -c "pwrite -S 0x75 -b 33792 25802183 33792" /mnt/foo xfs_io -c "pwrite -S 0x76 -b 39936 93524024 39936" /mnt/foo xfs_io -c "pwrite -S 0x77 -b 33792 113336388 33792" /mnt/foo xfs_io -c "pwrite -S 0x78 -b 105472 184955320 105472" /mnt/foo xfs_io -c "pwrite -S 0x79 -b 101376 225691598 101376" /mnt/foo xfs_io -c "pwrite -S 0x80 -b 23552 77023155 23552" /mnt/foo xfs_io -c "pwrite -S 0x81 -b 11 201888192 11" /mnt/foo xfs_io -c "pwrite -S 0x82 -b 11 115332492 11" /mnt/foo xfs_io -c "pwrite -S 0x83 -b 67584 230278015 67584" /mnt/foo xfs_io -c "pwrite -S 0x84 -b 11 120589073 11" /mnt/foo xfs_io -c "pwrite -S 0x85 -b 125952 202207819 125952" /mnt/foo xfs_io -c "pwrite -S 0x86 -b 113664 86672080 113664" /mnt/foo xfs_io -c "pwrite -S 0x87 -b 17408 208459603 17408" /mnt/foo xfs_io -c "pwrite -S 0x88 -b 7000 73372211 7000" /mnt/foo xfs_io -c "pwrite -S 0x89 -b 7000 42252122 7000" /mnt/foo xfs_io -c "pwrite -S 0x90 -b 23552 46784881 23552" /mnt/foo xfs_io -c "pwrite -S 0x91 -b 101376 63172351 101376" /mnt/foo xfs_io -c "pwrite -S 0x92 -b 23552 59341931 23552" /mnt/foo xfs_io -c "pwrite -S 0x93 -b 39936 239599283 39936" /mnt/foo xfs_io -c "pwrite -S 0x94 -b 67584 175643105 67584" /mnt/foo xfs_io -c "pwrite -S 0x97 -b 23552 105534880 23552" /mnt/foo xfs_io -c "pwrite -S 0x98 -b 113664 8236844 113664" /mnt/foo xfs_io -c "pwrite -S 0x99 -b 125952 144489686 125952" /mnt/foo xfs_io -c "pwrite -S 0xa0 -b 7000 73273112 7000" /mnt/foo xfs_io -c "pwrite -S 0xa1 -b 125952 194580243 125952" /mnt/foo xfs_io -c "pwrite -S 0xa2 -b 123 56296779 123" /mnt/foo xfs_io -c "pwrite -S 0xa3 -b 11 233066845 11" /mnt/foo xfs_io -c "pwrite -S 0xa4 -b 39936 197727090 39936" /mnt/foo xfs_io -c "pwrite -S 0xa5 -b 101376 53579812 101376" /mnt/foo xfs_io -c "pwrite -S 0xa6 -b 9216 85669738 9216" /mnt/foo xfs_io -c "pwrite -S 0xa7 -b 125952 21266322 125952" /mnt/foo xfs_io -c "pwrite -S 0xa8 -b 23552 125726568 23552" /mnt/foo xfs_io -c "pwrite -S 0xa9 -b 9216 18423680 9216" /mnt/foo xfs_io -c "pwrite -S 0xb0 -b 1121 165901483 1121" /mnt/foo btrfs subvolume snapshot -r /mnt /mnt/mysnap1 xfs_io -c "pwrite -S 0xff -b 10 16190218 10" /mnt/foo btrfs subvolume snapshot -r /mnt /mnt/mysnap2 md5sum /mnt/foo # returns 79e53f1466bfc09fd82b450689e6119e md5sum /mnt/mysnap2/foo # returns 79e53f1466bfc09fd82b450689e6119e too btrfs send /mnt/mysnap1 -f /tmp/1.snap btrfs send -p /mnt/mysnap1 /mnt/mysnap2 -f /tmp/2.snap mkfs.btrfs -f /dev/sdc mount /dev/sdc /mnt btrfs receive /mnt -f /tmp/1.snap btrfs receive /mnt -f /tmp/2.snap md5sum /mnt/mysnap2/foo # returns 2bb414c5155767cedccd7063e51beabd !! A testcase for xfstests follows soon too. Signed-off-by: Filipe David Borba Manana Signed-off-by: Chris Mason Signed-off-by: Greg Kroah-Hartman commit 6172eb2d0bccf29d83e334a6ca67415589d6c405 Author: Josef Ahmad Date: Tue Sep 2 13:45:20 2014 +0300 pci_ids: Add support for Intel Quark ILB commit bb048713bba3ead39f6112910906d9fe3f88ede7 upstream. This patch adds the PCI id for Intel Quark ILB. It will be used for GPIO and Multifunction device driver. Signed-off-by: Josef Ahmad Acked-by: Bjorn Helgaas Signed-off-by: Andy Shevchenko Signed-off-by: Lee Jones Signed-off-by: Chang Rebecca Swee Fun Signed-off-by: Greg Kroah-Hartman commit 00ada3c3176a741d55b3be800e8d0cdb4f388826 Author: Bryan O'Donoghue Date: Mon Aug 4 10:22:54 2014 -0700 usb: pch_udc: usb gadget device support for Intel Quark X1000 commit a68df7066a6f974db6069e0b93c498775660a114 upstream. This patch is to enable the USB gadget device for Intel Quark X1000 Signed-off-by: Bryan O'Donoghue Signed-off-by: Bing Niu Signed-off-by: Alvin (Weike) Chen Signed-off-by: Felipe Balbi Signed-off-by: Chang Rebecca Swee Fun Signed-off-by: Greg Kroah-Hartman commit dc3980ea4ad9d8d0b63b3cde732c9b95750208ce Author: Andy Lutomirski Date: Wed Oct 8 12:32:47 2014 -0700 fs: Add a missing permission check to do_umount commit a1480dcc3c706e309a88884723446f2e84fedd5b upstream. Accessing do_remount_sb should require global CAP_SYS_ADMIN, but only one of the two call sites was appropriately protected. Fixes CVE-2014-7975. Signed-off-by: Andy Lutomirski Signed-off-by: Greg Kroah-Hartman commit 6ebe2d33e2867223e6a35afc3ba547aee5a3e196 Author: Sage Weil Date: Fri Sep 26 08:30:06 2014 -0700 Btrfs: fix race in WAIT_SYNC ioctl commit 42383020beb1cfb05f5d330cc311931bc4917a97 upstream. We check whether transid is already committed via last_trans_committed and then search through trans_list for pending transactions. If last_trans_committed is updated by btrfs_commit_transaction after we check it (there is no locking), we will fail to find the committed transaction and return EINVAL to the caller. This has been observed occasionally by ceph-osd (which uses this ioctl heavily). Fix by rechecking whether the provided transid <= last_trans_committed after the search fails, and if so return 0. Signed-off-by: Sage Weil Signed-off-by: Chris Mason Signed-off-by: Greg Kroah-Hartman commit 3daf513d5aca4038a70514e448fc6dbc871ad679 Author: Josef Bacik Date: Fri Sep 19 15:43:34 2014 -0400 Btrfs: fix build_backref_tree issue with multiple shared blocks commit bbe9051441effce51c9a533d2c56440df64db2d7 upstream. Marc Merlin sent me a broken fs image months ago where it would blow up in the upper->checked BUG_ON() in build_backref_tree. This is because we had a scenario like this block a -- level 4 (not shared) | block b -- level 3 (reloc block, shared) | block c -- level 2 (not shared) | block d -- level 1 (shared) | block e -- level 0 (shared) We go to build a backref tree for block e, we notice block d is shared and add it to the list of blocks to lookup it's backrefs for. Now when we loop around we will check edges for the block, so we will see we looked up block c last time. So we lookup block d and then see that the block that points to it is block c and we can just skip that edge since we've already been up this path. The problem is because we clear need_check when we see block d (as it is shared) we never add block b as needing to be checked. And because block c is in our path already we bail out before we walk up to block b and add it to the backref check list. To fix this we need to reset need_check if we trip over a block that doesn't need to be checked. This will make sure that any subsequent blocks in the path as we're walking up afterwards are added to the list to be processed. With this patch I can now mount Marc's fs image and it'll complete the balance without panicing. Thanks, Reported-by: Marc MERLIN Signed-off-by: Josef Bacik Signed-off-by: Chris Mason Signed-off-by: Greg Kroah-Hartman commit e5efe4c1a248dc7cc226eb1bb6ea2d502a51028c Author: Josef Bacik Date: Fri Sep 19 10:40:00 2014 -0400 Btrfs: cleanup error handling in build_backref_tree commit 75bfb9aff45e44625260f52a5fd581b92ace3e62 upstream. When balance panics it tends to panic in the BUG_ON(!upper->checked); test, because it means it couldn't build the backref tree properly. This is annoying to users and frankly a recoverable error, nothing in this function is actually fatal since it is just an in-memory building of the backrefs for a given bytenr. So go through and change all the BUG_ON()'s to ASSERT()'s, and fix the BUG_ON(!upper->checked) thing to just return an error. This patch also fixes the error handling so it tears down the work we've done properly. This code was horribly broken since we always just panic'ed instead of actually erroring out, so it needed to be completely re-worked. With this patch my broken image no longer panics when I mount it. Thanks, Signed-off-by: Josef Bacik Signed-off-by: Chris Mason Signed-off-by: Greg Kroah-Hartman commit 935edd0b9d811cb96065796bf1886fb8d86989cd Author: Josef Bacik Date: Thu Sep 18 11:30:44 2014 -0400 Btrfs: try not to ENOSPC on log replay commit 1d52c78afbbf80b58299e076a159617d6b42fe3c upstream. When doing log replay we may have to update inodes, which traditionally goes through our delayed inode stuff. This will try to move space over from the trans handle, but we don't reserve space in our trans handle on replay since we don't know how much we will need, so instead we try to flush. But because we have a trans handle open we won't flush anything, so if we are out of reserve space we will simply return ENOSPC. Since we know that if an operation made it into the log then we definitely had space before the box bought the farm then we don't need to worry about doing this space reservation. Use the fs_info->log_root_recovering flag to skip the delayed inode stuff and update the item directly. Thanks, Signed-off-by: Josef Bacik Signed-off-by: Chris Mason Signed-off-by: Greg Kroah-Hartman commit c5e89b9aa507da08d7d5b3429cd4bcffc7529557 Author: Liu Bo Date: Tue Sep 16 17:49:30 2014 +0800 Btrfs: fix up bounds checking in lseek commit 4d1a40c66bed0b3fa43b9da5fbd5cbe332e4eccf upstream. An user reported this, it is because that lseek's SEEK_SET/SEEK_CUR/SEEK_END allow a negative value for @offset, but btrfs's SEEK_DATA/SEEK_HOLE don't prepare for that and convert the negative @offset into unsigned type, so we get (end < start) warning. [ 1269.835374] ------------[ cut here ]------------ [ 1269.836809] WARNING: CPU: 0 PID: 1241 at fs/btrfs/extent_io.c:430 insert_state+0x11d/0x140() [ 1269.838816] BTRFS: end < start 4094 18446744073709551615 [ 1269.840334] CPU: 0 PID: 1241 Comm: a.out Tainted: G W 3.16.0+ #306 [ 1269.858229] Call Trace: [ 1269.858612] [] dump_stack+0x4e/0x68 [ 1269.858952] [] warn_slowpath_common+0x8c/0xc0 [ 1269.859416] [] warn_slowpath_fmt+0x46/0x50 [ 1269.859929] [] insert_state+0x11d/0x140 [ 1269.860409] [] __set_extent_bit+0x3b6/0x4e0 [ 1269.860805] [] lock_extent_bits+0x87/0x200 [ 1269.861697] [] btrfs_file_llseek+0x148/0x2a0 [ 1269.862168] [] SyS_lseek+0xae/0xc0 [ 1269.862620] [] system_call_fastpath+0x16/0x1b [ 1269.862970] ---[ end trace 4d33ea885832054b ]--- This assumes that btrfs starts finding DATA/HOLE from the beginning of file if the assigned @offset is negative. Also we add alignment for lock_extent_bits 's range. Reported-by: Toralf Förster Signed-off-by: Liu Bo Signed-off-by: Chris Mason Signed-off-by: Greg Kroah-Hartman commit 91419a956156de0d49de9cb907c8bac0b679eca2 Author: Filipe Manana Date: Thu Sep 11 11:44:49 2014 +0100 Btrfs: add missing compression property remove in btrfs_ioctl_setflags commit 78a017a2c92df9b571db0a55a016280f9019c65e upstream. The behaviour of a 'chattr -c' consists of getting the current flags, clearing the FS_COMPR_FL bit and then sending the result to the set flags ioctl - this means the bit FS_NOCOMP_FL isn't set in the flags passed to the ioctl. This results in the compression property not being cleared from the inode - it was cleared only if the bit FS_NOCOMP_FL was set in the received flags. Reproducer: $ mkfs.btrfs -f /dev/sdd $ mount /dev/sdd /mnt && cd /mnt $ mkdir a $ chattr +c a $ touch a/file $ lsattr a/file --------c------- a/file $ chattr -c a $ touch a/file2 $ lsattr a/file2 --------c------- a/file2 $ lsattr -d a ---------------- a Reported-by: Andreas Schneider Signed-off-by: Filipe Manana Signed-off-by: Chris Mason Signed-off-by: Greg Kroah-Hartman commit b1a821f574f0d2200dd3514010246341d1d7abf5 Author: David Sterba Date: Wed Jul 23 14:39:35 2014 +0200 btrfs: wake up transaction thread from SYNC_FS ioctl commit 2fad4e83e12591eb3bd213875b9edc2d18e93383 upstream. The transaction thread may want to do more work, namely it pokes the cleaner ktread that will start processing uncleaned subvols. This can be triggered by user via the 'btrfs fi sync' command, otherwise there was a delay up to 30 seconds before the cleaner started to clean old snapshots. Signed-off-by: David Sterba Signed-off-by: Chris Mason Signed-off-by: Greg Kroah-Hartman