commit 206f91a12c5f69c9b4dfd4e0029043794a046933 Author: Sasha Levin Date: Sun Apr 3 18:26:26 2016 -0400 Linux 4.1.21 Signed-off-by: Sasha Levin commit 919e67a63aa967566a909f4f6e1c13f8e88cf76e Author: Alexander Shishkin Date: Thu Mar 24 11:14:53 2016 +0000 perf/core: Fix perf_sched_count derailment [ Upstream commit 927a5570855836e5d5859a80ce7e91e963545e8f ] The error path in perf_event_open() is such that asking for a sampling event on a PMU that doesn't generate interrupts will end up in dropping the perf_sched_count even though it hasn't been incremented for this event yet. Given a sufficient amount of these calls, we'll end up disabling scheduler's jump label even though we'd still have active events in the system, thereby facilitating the arrival of the infernal regions upon us. I'm fixing this by moving account_event() inside perf_event_alloc(). Signed-off-by: Alexander Shishkin Signed-off-by: Peter Zijlstra (Intel) Cc: Cc: Arnaldo Carvalho de Melo Cc: Arnaldo Carvalho de Melo Cc: Jiri Olsa Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Stephane Eranian Cc: Thomas Gleixner Cc: Vince Weaver Cc: vince@deater.net Link: http://lkml.kernel.org/r/1456917854-29427-1-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar Signed-off-by: He Kuang Signed-off-by: Sasha Levin commit 882f862db7f3509f055208b2e3e5bd263265a03b Author: Peter Zijlstra Date: Thu Mar 24 11:14:52 2016 +0000 perf: Cure event->pending_disable race [ Upstream commit 28a967c3a2f99fa3b5f762f25cb2a319d933571b ] Because event_sched_out() checks event->pending_disable _before_ actually disabling the event, it can happen that the event fires after it checks but before it gets disabled. This would leave event->pending_disable set and the queued irq_work will try and process it. However, if the event trigger was during schedule(), the event might have been de-scheduled by the time the irq_work runs, and perf_event_disable_local() will fail. Fix this by checking event->pending_disable _after_ we call event->pmu->del(). This depends on the latter being a compiler barrier, such that the compiler does not lift the load and re-creates the problem. Tested-by: Alexander Shishkin Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Alexander Shishkin Cc: Arnaldo Carvalho de Melo Cc: Jiri Olsa Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174948.040469884@infradead.org Signed-off-by: Ingo Molnar Signed-off-by: He Kuang Signed-off-by: Sasha Levin commit 5709e7ba03717ab760b5ad6ebcf4a9e2f633dcc4 Author: Peter Zijlstra Date: Thu Mar 24 11:14:51 2016 +0000 perf: Do not double free [ Upstream commit 130056275ade730e7a79c110212c8815202773ee ] In case of: err_file: fput(event_file), we'll end up calling perf_release() which in turn will free the event. Do not then free the event _again_. Tested-by: Alexander Shishkin Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Alexander Shishkin Cc: Arnaldo Carvalho de Melo Cc: Jiri Olsa Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: dvyukov@google.com Cc: eranian@google.com Cc: oleg@redhat.com Cc: panand@redhat.com Cc: sasha.levin@oracle.com Cc: vince@deater.net Link: http://lkml.kernel.org/r/20160224174947.697350349@infradead.org Signed-off-by: Ingo Molnar Signed-off-by: He Kuang Signed-off-by: Sasha Levin commit 143cf26c48278bd438a97a8bd3e18b6460192981 Author: Yang Shi Date: Thu Mar 24 11:14:50 2016 +0000 arm64: replace read_lock to rcu lock in call_step_hook [ Upstream commit cf0a25436f05753aca5151891aea4fd130556e2a ] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917 in_atomic(): 1, irqs_disabled(): 128, pid: 383, name: sh Preemption disabled at:[] kgdb_cpu_enter+0x158/0x6b8 CPU: 3 PID: 383 Comm: sh Tainted: G W 4.1.13-rt13 #2 Hardware name: Freescale Layerscape 2085a RDB Board (DT) Call trace: [] dump_backtrace+0x0/0x128 [] show_stack+0x24/0x30 [] dump_stack+0x80/0xa0 [] ___might_sleep+0x18c/0x1a0 [] __rt_spin_lock+0x2c/0x40 [] rt_read_lock+0x40/0x58 [] single_step_handler+0x38/0xd8 [] do_debug_exception+0x58/0xb8 Exception stack(0xffff80834a1e7c80 to 0xffff80834a1e7da0) 7c80: ffffff9c ffffffff 92c23ba0 0000ffff 4a1e7e40 ffff8083 001bfcc4 ffff8000 7ca0: f2000400 00000000 00000000 00000000 4a1e7d80 ffff8083 0049501c ffff8000 7cc0: 00005402 00000000 00aaa210 ffff8000 4a1e7ea0 ffff8083 000833f4 ffff8000 7ce0: ffffff9c ffffffff 92c23ba0 0000ffff 4a1e7ea0 ffff8083 001bfcc0 ffff8000 7d00: 4a0fc400 ffff8083 00005402 00000000 4a1e7d40 ffff8083 00490324 ffff8000 7d20: ffffff9c 00000000 92c23ba0 0000ffff 000a0000 00000000 00000000 00000000 7d40: 00000008 00000000 00080000 00000000 92c23b8b 0000ffff 92c23b8e 0000ffff 7d60: 00000038 00000000 00001cb2 00000000 00000005 00000000 92d7b498 0000ffff 7d80: 01010101 01010101 92be9000 0000ffff 00000000 00000000 00000030 00000000 [] el1_dbg+0x18/0x6c This issue is similar with 62c6c61("arm64: replace read_lock to rcu lock in call_break_hook"), but comes to single_step_handler. This also solves kgdbts boot test silent hang issue on 4.4 -rt kernel. Signed-off-by: Yang Shi Acked-by: Will Deacon Signed-off-by: Catalin Marinas Signed-off-by: He Kuang Signed-off-by: Sasha Levin commit 1a138f3e487026aede3642cbe09aee0f64c2f66b Author: Yang Shi Date: Thu Mar 24 11:14:49 2016 +0000 arm64: replace read_lock to rcu lock in call_break_hook [ Upstream commit 62c6c61adbc623cdacf74b8f29c278e539060c48 ] BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917 in_atomic(): 0, irqs_disabled(): 128, pid: 342, name: perf 1 lock held by perf/342: #0: (break_hook_lock){+.+...}, at: [] call_break_hook+0x34/0xd0 irq event stamp: 62224 hardirqs last enabled at (62223): [] __call_rcu.constprop.59+0x104/0x270 hardirqs last disabled at (62224): [] vprintk_emit+0x68/0x640 softirqs last enabled at (0): [] copy_process.part.8+0x428/0x17f8 softirqs last disabled at (0): [< (null)>] (null) CPU: 0 PID: 342 Comm: perf Not tainted 4.1.6-rt5 #4 Hardware name: linux,dummy-virt (DT) Call trace: [] dump_backtrace+0x0/0x128 [] show_stack+0x20/0x30 [] dump_stack+0x7c/0xa0 [] ___might_sleep+0x174/0x260 [] __rt_spin_lock+0x28/0x40 [] rt_read_lock+0x60/0x80 [] call_break_hook+0x30/0xd0 [] brk_handler+0x30/0x98 [] do_debug_exception+0x50/0xb8 Exception stack(0xffffffc00514fe30 to 0xffffffc00514ff50) fe20: 00000000 00000000 c1594680 0000007f fe40: ffffffff ffffffff 92063940 0000007f 0550dcd8 ffffffc0 00000000 00000000 fe60: 0514fe70 ffffffc0 000be1f8 ffffffc0 0514feb0 ffffffc0 0008948c ffffffc0 fe80: 00000004 00000000 0514fed0 ffffffc0 ffffffff ffffffff 9282a948 0000007f fea0: 00000000 00000000 9282b708 0000007f c1592820 0000007f 00083914 ffffffc0 fec0: 00000000 00000000 00000010 00000000 00000064 00000000 00000001 00000000 fee0: 005101e0 00000000 c1594680 0000007f c1594740 0000007f ffffffd8 ffffff80 ff00: 00000000 00000000 00000000 00000000 c1594770 0000007f c1594770 0000007f ff20: 00665e10 00000000 7f7f7f7f 7f7f7f7f 01010101 01010101 00000000 00000000 ff40: 928e4cc0 0000007f 91ff11e8 0000007f call_break_hook is called in atomic context (hard irq disabled), so replace the sleepable lock to rcu lock, replace relevant list operations to rcu version and call synchronize_rcu() in unregister_break_hook(). And, replace write lock to spinlock in {un}register_break_hook. Signed-off-by: Yang Shi Signed-off-by: Will Deacon Signed-off-by: He Kuang Signed-off-by: Sasha Levin commit f2b132595b89d9236b386e1d6ed3fcf5e9edf4cb Author: Jan Kara Date: Mon Dec 7 14:34:49 2015 -0500 ext4: fix races of writeback with punch hole and zero range When doing delayed allocation, update of on-disk inode size is postponed until IO submission time. However hole punch or zero range fallocate calls can end up discarding the tail page cache page and thus on-disk inode size would never be properly updated. Make sure the on-disk inode size is updated before truncating page cache. Signed-off-by: Jan Kara Signed-off-by: Theodore Ts'o Reviewed-by: Mingming Cao Signed-off-by: Sasha Levin commit 181aaebde9360b8235df647ee36dafdc041d4964 Author: Jan Kara Date: Mon Dec 7 14:31:11 2015 -0500 ext4: fix races between buffered IO and collapse / insert range Current code implementing FALLOC_FL_COLLAPSE_RANGE and FALLOC_FL_INSERT_RANGE is prone to races with buffered writes and page faults. If buffered write or write via mmap manages to squeeze between filemap_write_and_wait_range() and truncate_pagecache() in the fallocate implementations, the written data is simply discarded by truncate_pagecache() although it should have been shifted. Fix the problem by moving filemap_write_and_wait_range() call inside i_mutex and i_mmap_sem. That way we are protected against races with both buffered writes and page faults. Signed-off-by: Jan Kara Signed-off-by: Theodore Ts'o Reviewed-by: Mingming Cao Signed-off-by: Sasha Levin commit 9621787d69783fc23d14e1332377d7170d6928ed Author: Jan Kara Date: Mon Dec 7 14:29:17 2015 -0500 ext4: move unlocked dio protection from ext4_alloc_file_blocks() Currently ext4_alloc_file_blocks() was handling protection against unlocked DIO. However we now need to sometimes call it under i_mmap_sem and sometimes not and DIO protection ranks above it (although strictly speaking this cannot currently create any deadlocks). Also ext4_zero_range() was actually getting & releasing unlocked DIO protection twice in some cases. Luckily it didn't introduce any real bug but it was a land mine waiting to be stepped on. So move DIO protection out from ext4_alloc_file_blocks() into the two callsites. Signed-off-by: Jan Kara Signed-off-by: Theodore Ts'o Reviewed-by: Mingming Cao Signed-off-by: Sasha Levin commit 248766f068fd1d3d95479f470bc926d1136141d6 Author: Jan Kara Date: Mon Dec 7 14:28:03 2015 -0500 ext4: fix races between page faults and hole punching Currently, page faults and hole punching are completely unsynchronized. This can result in page fault faulting in a page into a range that we are punching after truncate_pagecache_range() has been called and thus we can end up with a page mapped to disk blocks that will be shortly freed. Filesystem corruption will shortly follow. Note that the same race is avoided for truncate by checking page fault offset against i_size but there isn't similar mechanism available for punching holes. Fix the problem by creating new rw semaphore i_mmap_sem in inode and grab it for writing over truncate, hole punching, and other functions removing blocks from extent tree and for read over page faults. We cannot easily use i_data_sem for this since that ranks below transaction start and we need something ranking above it so that it can be held over the whole truncate / hole punching operation. Also remove various workarounds we had in the code to reduce race window when page fault could have created pages with stale mapping information. Signed-off-by: Jan Kara Signed-off-by: Theodore Ts'o Reviewed-by: Mingming Cao Signed-off-by: Sasha Levin commit 14b4d1419ee6e71e672926d90a5bd87f698014d3 Author: Hauke Mehrtens Date: Sun Mar 6 22:28:56 2016 +0100 MIPS: Fix build error when SMP is used without GIC [ Upstream commit 588bad2ef32cae7abad24d5ca2f4611a7a7fb2a2 ] commit 7a50e4688dabb8005df39b2b992d76629b8af8aa upstream. The MIPS_GIC_IPI should only be selected when MIPS_GIC is also selected, otherwise it results in a compile error. smp-gic.c uses some functions from include/linux/irqchip/mips-gic.h like plat_ipi_call_int_xlate() which are only added to the header file when MIPS_GIC is set. The Lantiq SoC does not use the GIC, but supports SMP. The calls top the functions from smp-gic.c are already protected by some #ifdefs The first part of this was introduced in commit 72e20142b2bf ("MIPS: Move GIC IPI functions out of smp-cmp.c") Signed-off-by: Hauke Mehrtens Cc: Paul Burton Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/12774/ Signed-off-by: Ralf Baechle Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 7c196e5a2e90d172ce2bca85bb368f70a016b02c Author: Markos Chandras Date: Thu Jul 9 10:40:38 2015 +0100 MIPS: Kconfig: Disable MIPS MT and SMP implementations for R6 [ Upstream commit 5676319c91c8d668635ac0b9b6d9145c4fa418ac ] R6 does not support the MIPS MT ASE and the CMP/SMP options so restrict them in order to prevent users from selecting incompatible SMP configuration for R6 cores. We also disable the CPS/SMP option because its support hasn't been added to the CPS code yet. Signed-off-by: Markos Chandras Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/10637/ Signed-off-by: Ralf Baechle Signed-off-by: Sasha Levin commit e652be4b177954875b8d2d842abfda3626cb1d6a Author: Markos Chandras Date: Wed Jul 1 09:31:14 2015 +0100 Revert "MIPS: Kconfig: Disable SMP/CPS for 64-bit" [ Upstream commit 1c885357da2d3cf62132e611c0beaf4cdf607dd9 ] This reverts commit 6ca716f2e5571d25a3899c6c5c91ff72ea6d6f5e. SMP/CPS is now supported on 64bit cores. Cc: # 4.1 Reviewed-by: Paul Burton Signed-off-by: Markos Chandras Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/10592/ Signed-off-by: Ralf Baechle Signed-off-by: Sasha Levin commit c839d6e6b096f1c9f2d9be880aedd796095b7a80 Author: James Hogan Date: Tue Mar 8 16:47:53 2016 +0000 ld-version: Fix awk regex compile failure [ Upstream commit 4b7b1ef2c2f83d702272555e8adb839a50ba0f8e ] The ld-version.sh script fails on some versions of awk with the following error, resulting in build failures for MIPS: awk: scripts/ld-version.sh: line 4: regular expression compile failed (missing '(') This is due to the regular expression ".*)", meant to strip off the beginning of the ld version string up to the close bracket, however brackets have a meaning in regular expressions, so lets escape it so that awk doesn't expect a corresponding open bracket. Fixes: ccbef1674a15 ("Kbuild, lto: add ld-version and ld-ifversion ...") Reported-by: Geert Uytterhoeven Signed-off-by: James Hogan Tested-by: Michael S. Tsirkin Acked-by: Michael S. Tsirkin Tested-by: Sudip Mukherjee Cc: Michal Marek Cc: Andi Kleen Cc: Geert Uytterhoeven Cc: linux-mips@linux-mips.org Cc: linux-kbuild@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org # 4.4.x- Patchwork: https://patchwork.linux-mips.org/patch/12838/ Signed-off-by: Ralf Baechle Signed-off-by: Sasha Levin commit c7d4bd1d975e3fa1dd4ecf557ada0e792d551a6c Author: Ludovic Desroches Date: Thu Mar 10 10:17:55 2016 +0100 dmaengine: at_xdmac: fix residue computation [ Upstream commit 25c5e9626ca4d40928dc9c44f009ce2ed0a739e7 ] When computing the residue we need two pieces of information: the current descriptor and the remaining data of the current descriptor. To get that information, we need to read consecutively two registers but we can't do it in an atomic way. For that reason, we have to check manually that current descriptor has not changed. Signed-off-by: Ludovic Desroches Suggested-by: Cyrille Pitchen Reported-by: David Engraf Tested-by: David Engraf Fixes: e1f7c9eee707 ("dmaengine: at_xdmac: creation of the atmel eXtended DMA Controller driver") Cc: stable@vger.kernel.org #4.1 and later Signed-off-by: Vinod Koul Signed-off-by: Sasha Levin commit eac525506a083a389ba173880979a6291401af2d Author: Paolo Bonzini Date: Tue Mar 8 12:13:39 2016 +0100 KVM: MMU: fix ept=0/pte.u=1/pte.w=0/CR0.WP=0/CR4.SMEP=1/EFER.NX=0 combo [ Upstream commit 844a5fe219cf472060315971e15cbf97674a3324 ] Yes, all of these are needed. :) This is admittedly a bit odd, but kvm-unit-tests access.flat tests this if you run it with "-cpu host" and of course ept=0. KVM runs the guest with CR0.WP=1, so it must handle supervisor writes specially when pte.u=1/pte.w=0/CR0.WP=0. Such writes cause a fault when U=1 and W=0 in the SPTE, but they must succeed because CR0.WP=0. When KVM gets the fault, it sets U=0 and W=1 in the shadow PTE and restarts execution. This will still cause a user write to fault, while supervisor writes will succeed. User reads will fault spuriously now, and KVM will then flip U and W again in the SPTE (U=1, W=0). User reads will be enabled and supervisor writes disabled, going back to the originary situation where supervisor writes fault spuriously. When SMEP is in effect, however, U=0 will enable kernel execution of this page. To avoid this, KVM also sets NX=1 in the shadow PTE together with U=0. If the guest has not enabled NX, the result is a continuous stream of page faults due to the NX bit being reserved. The fix is to force EFER.NX=1 even if the CPU is taking care of the EFER switch. (All machines with SMEP have the CPU_LOAD_IA32_EFER vm-entry control, so they do not use user-return notifiers for EFER---if they did, EFER.NX would be forced to the same value as the host). There is another bug in the reserved bit check, which I've split to a separate patch for easier application to stable kernels. Cc: stable@vger.kernel.org Cc: Andy Lutomirski Reviewed-by: Xiao Guangrong Fixes: f6577a5fa15d82217ca73c74cd2dcbc0f6c781dd Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit 09b4fd2014b1ef7d46df8df553f94254ba2a0497 Author: Martin Schwidefsky Date: Mon Feb 15 14:46:49 2016 +0100 s390/mm: four page table levels vs. fork [ Upstream commit 3446c13b268af86391d06611327006b059b8bab1 ] The fork of a process with four page table levels is broken since git commit 6252d702c5311ce9 "[S390] dynamic page tables." All new mm contexts are created with three page table levels and an asce limit of 4TB. If the parent has four levels dup_mmap will add vmas to the new context which are outside of the asce limit. The subsequent call to copy_page_range will walk the three level page table structure of the new process with non-zero pgd and pud indexes. This leads to memory clobbers as the pgd_index *and* the pud_index is added to the mm->pgd pointer without a pgd_deref in between. The init_new_context() function is selecting the number of page table levels for a new context. The function is used by mm_init() which in turn is called by dup_mm() and mm_alloc(). These two are used by fork() and exec(). The init_new_context() function can distinguish the two cases by looking at mm->context.asce_limit, for fork() the mm struct has been copied and the number of page table levels may not change. For exec() the mm_alloc() function set the new mm structure to zero, in this case a three-level page table is created as the temporary stack space is located at STACK_TOP_MAX = 4TB. This fixes CVE-2016-2143. Reported-by: Marcin Kościelnicki Reviewed-by: Heiko Carstens Cc: stable@vger.kernel.org Signed-off-by: Martin Schwidefsky Signed-off-by: Sasha Levin commit 1b3ce90bcd25ea9ba08450e605df29e16387a7ca Author: Steven Rostedt (Red Hat) Date: Wed Mar 9 11:58:41 2016 -0500 tracing: Fix check for cpu online when event is disabled [ Upstream commit dc17147de328a74bbdee67c1bf37d2f1992de756 ] Commit f37755490fe9b ("tracepoints: Do not trace when cpu is offline") added a check to make sure that tracepoints only get called when the cpu is online, as it uses rcu_read_lock_sched() for protection. Commit 3a630178fd5f3 ("tracing: generate RCU warnings even when tracepoints are disabled") added lockdep checks (including rcu checks) for events that are not enabled to catch possible RCU issues that would only be triggered if a trace event was enabled. Commit f37755490fe9b only stopped the warnings when the trace event was enabled but did not prevent warnings if the trace event was called when disabled. To fix this, the cpu online check is moved to where the condition is added to the trace event. This will place the cpu online check in all places that it may be used now and in the future. Cc: stable@vger.kernel.org # v3.18+ Fixes: f37755490fe9b ("tracepoints: Do not trace when cpu is offline") Fixes: 3a630178fd5f3 ("tracing: generate RCU warnings even when tracepoints are disabled") Reported-by: Sudeep Holla Tested-by: Sudeep Holla Signed-off-by: Steven Rostedt Signed-off-by: Sasha Levin commit 3b9f9280aa1321618fc5024314aca23a8716ffd6 Author: Alex Deucher Date: Tue Mar 8 11:31:00 2016 -0500 Revert "drm/radeon/pm: adjust display configuration after powerstate" [ Upstream commit d74e766e1916d0e09b86e4b5b9d0f819628fd546 ] This reverts commit 39d4275058baf53e89203407bf3841ff2c74fa32. This caused a regression on some older hardware. bug: https://bugzilla.kernel.org/show_bug.cgi?id=113891 Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 8fc3813ab4b3a863b44b56013f02d8c955ffd954 Author: Alex Deucher Date: Thu Mar 3 19:26:24 2016 -0500 drm/radeon/dp: add back special handling for NUTMEG [ Upstream commit c8213a638f65bf487c10593c216525952cca3690 ] When I fixed the dp rate selection in: 092c96a8ab9d1bd60ada2ed385cc364ce084180e drm/radeon: fix dp link rate selection (v2) I accidently dropped the special handling for NUTMEG DP bridge chips. They require a fixed link rate. Reviewed-by: Christian König Reviewed-by: Ken Wang Reviewed-by: Harry Wentland Tested-by: Ken Moffat Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit fddbe6c2569a24f097a9973d08a8e282c977ecf3 Author: Alex Deucher Date: Thu Dec 17 10:23:34 2015 -0500 drm/radeon: fix dp link rate selection (v2) [ Upstream commit 092c96a8ab9d1bd60ada2ed385cc364ce084180e ] Need to properly handle the max link rate in the dpcd. This prevents some cases where 5.4 Ghz is selected when it shouldn't be. v2: simplify logic, add array bounds check Reviewed-by: Tom St Denis Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit 7dac6e4062f42f37ef99e86e7f0369ff476af5f6 Author: Alex Deucher Date: Thu May 14 12:47:45 2015 -0400 drm/radeon: make dpcd parameters const [ Upstream commit 0c3a88407ef2be8bb7c302c298d6ff58ebde4a43 ] Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin commit eb34a645aee3906baac9cad7defdabf61ac40bfd Author: Radim Krčmář Date: Fri Mar 4 15:08:42 2016 +0100 KVM: VMX: disable PEBS before a guest entry [ Upstream commit 7099e2e1f4d9051f31bbfa5803adf954bb5d76ef ] Linux guests on Haswell (and also SandyBridge and Broadwell, at least) would crash if you decided to run a host command that uses PEBS, like perf record -e 'cpu/mem-stores/pp' -a This happens because KVM is using VMX MSR switching to disable PEBS, but SDM [2015-12] 18.4.4.4 Re-configuring PEBS Facilities explains why it isn't safe: When software needs to reconfigure PEBS facilities, it should allow a quiescent period between stopping the prior event counting and setting up a new PEBS event. The quiescent period is to allow any latent residual PEBS records to complete its capture at their previously specified buffer address (provided by IA32_DS_AREA). There might not be a quiescent period after the MSR switch, so a CPU ends up using host's MSR_IA32_DS_AREA to access an area in guest's memory. (Or MSR switching is just buggy on some models.) The guest can learn something about the host this way: If the guest doesn't map address pointed by MSR_IA32_DS_AREA, it results in #PF where we leak host's MSR_IA32_DS_AREA through CR2. After that, a malicious guest can map and configure memory where MSR_IA32_DS_AREA is pointing and can therefore get an output from host's tracing. This is not a critical leak as the host must initiate with PEBS tracing and I have not been able to get a record from more than one instruction before vmentry in vmx_vcpu_run() (that place has most registers already overwritten with guest's). We could disable PEBS just few instructions before vmentry, but disabling it earlier shouldn't affect host tracing too much. We also don't need to switch MSR_IA32_PEBS_ENABLE on VMENTRY, but that optimization isn't worth its code, IMO. (If you are implementing PEBS for guests, be sure to handle the case where both host and guest enable PEBS, because this patch doesn't.) Fixes: 26a4f3c08de4 ("perf/x86: disable PEBS on a guest entry.") Cc: Reported-by: Jiří Olša Signed-off-by: Radim Krčmář Signed-off-by: Paolo Bonzini Signed-off-by: Sasha Levin commit c62aadae234ffad0901c20ac1a1aa4e13cce1c20 Author: Al Viro Date: Mon Mar 7 23:07:10 2016 -0500 jffs2: reduce the breakage on recovery from halfway failed rename() [ Upstream commit f93812846f31381d35c04c6c577d724254355e7f ] d_instantiate(new_dentry, old_inode) is absolutely wrong thing to do - it will oops if new_dentry used to be positive, for starters. What we need is d_invalidate() the target and be done with that. Cc: stable@vger.kernel.org # v3.18+ Signed-off-by: Al Viro Signed-off-by: Sasha Levin commit 906e5a6e6e73316fa4741ca53be014c9477a100c Author: Al Viro Date: Mon Mar 7 22:17:07 2016 -0500 ncpfs: fix a braino in OOM handling in ncp_fill_cache() [ Upstream commit 803c00123a8012b3a283c0530910653973ef6d8f ] Failing to allocate an inode for child means that cache for *parent* is incompletely populated. So it's parent directory inode ('dir') that needs NCPI_DIR_CACHE flag removed, *not* the child inode ('inode', which is what we'd failed to allocate in the first place). Fucked-up-in: commit 5e993e25 ("ncpfs: get rid of d_validate() nonsense") Fucked-up-by: Al Viro Cc: stable@vger.kernel.org # v3.19 Signed-off-by: Al Viro Signed-off-by: Sasha Levin commit 6d44ac3f884b220573b2d46c691127fb6fee0707 Author: Paul Mackerras Date: Sat Mar 5 19:34:39 2016 +1100 KVM: PPC: Book3S HV: Sanitize special-purpose register values on guest exit [ Upstream commit ccec44563b18a0ce90e2d4f332784b3cb25c8e9c ] Thomas Huth discovered that a guest could cause a hard hang of a host CPU by setting the Instruction Authority Mask Register (IAMR) to a suitable value. It turns out that this is because when the code was added to context-switch the new special-purpose registers (SPRs) that were added in POWER8, we forgot to add code to ensure that they were restored to a sane value on guest exit. This adds code to set those registers where a bad value could compromise the execution of the host kernel to a suitable neutral value on guest exit. Cc: stable@vger.kernel.org # v3.14+ Fixes: b005255e12a3 Reported-by: Thomas Huth Reviewed-by: David Gibson Signed-off-by: Paul Mackerras Signed-off-by: Sasha Levin commit 0d00dbe120f15fd798e7b0e2c69d67da887a9bfb Author: Mugunthan V N Date: Mon Mar 7 01:41:22 2016 -0700 ARM: dts: dra7: do not gate cpsw clock due to errata i877 [ Upstream commit 0f514e690740e54815441a87708c3326f8aa8709 ] Errata id: i877 Description: ------------ The RGMII 1000 Mbps Transmit timing is based on the output clock (rgmiin_txc) being driven relative to the rising edge of an internal clock and the output control/data (rgmiin_txctl/txd) being driven relative to the falling edge of an internal clock source. If the internal clock source is allowed to be static low (i.e., disabled) for an extended period of time then when the clock is actually enabled the timing delta between the rising edge and falling edge can change over the lifetime of the device. This can result in the device switching characteristics degrading over time, and eventually failing to meet the Data Manual Delay Time/Skew specs. To maintain RGMII 1000 Mbps IO Timings, SW should minimize the duration that the Ethernet internal clock source is disabled. Note that the device reset state for the Ethernet clock is "disabled". Other RGMII modes (10 Mbps, 100Mbps) are not affected Workaround: ----------- If the SoC Ethernet interface(s) are used in RGMII mode at 1000 Mbps, SW should minimize the time the Ethernet internal clock source is disabled to a maximum of 200 hours in a device life cycle. This is done by enabling the clock as early as possible in IPL (QNX) or SPL/u-boot (Linux/Android) by setting the register CM_GMAC_CLKSTCTRL[1:0]CLKTRCTRL = 0x2:SW_WKUP. So, do not allow to gate the cpsw clocks using ti,no-idle property in cpsw node assuming 1000 Mbps is being used all the time. If someone does not need 1000 Mbps and wants to gate clocks to cpsw, this property needs to be deleted in their respective board files. Signed-off-by: Mugunthan V N Signed-off-by: Grygorii Strashko Signed-off-by: Lokesh Vutla Cc: Signed-off-by: Paul Walmsley Signed-off-by: Sasha Levin commit a7029eb2f16ad28aee80f16998bfb1a2f2a787c5 Author: Lokesh Vutla Date: Mon Mar 7 01:41:21 2016 -0700 ARM: OMAP2+: hwmod: Introduce ti,no-idle dt property [ Upstream commit 6327a31a3f875c438ca13058bc4c73f1a752cd8a ] commit 2e18f5a1bc18e8af7031b3b26efde25307014837 upstream. Introduce a dt property, ti,no-idle, that prevents an IP to idle at any point. This is to handle Errata i877, which tells that GMAC clocks cannot be disabled. Acked-by: Roger Quadros Tested-by: Mugunthan V N Signed-off-by: Lokesh Vutla Signed-off-by: Sekhar Nori Signed-off-by: Dave Gerlach Acked-by: Rob Herring Signed-off-by: Paul Walmsley Signed-off-by: Greg Kroah-Hartman Signed-off-by: Sasha Levin commit 6991aaf3004c32f5e3e856b0bfb0874e122db65d Author: Peter Ujfalusi Date: Thu Nov 12 09:32:58 2015 +0200 ARM: OMAP2+: hwmod: Add hwmod flag for HWMOD_OPT_CLKS_NEEDED [ Upstream commit c12ba8ce2335389ce5416f88391cd67c7325c963 ] Some module needs more than one functional clock in order to be accessible, like the McASPs found in DRA7xx family. This flag will indicate that the opt_clks need to be handled at the same time as the main_clk for the given hwmod, ensuring that all needed clocks are enabled before we try to access the module's address space. Signed-off-by: Peter Ujfalusi Acked-by: Paul Walmsley Tested-by: Felipe Balbi Signed-off-by: Tony Lindgren Signed-off-by: Sasha Levin commit bde1cccf1a837d6905fe71543c2f4a4e3328dce0 Author: Nicholas Bellinger Date: Sat Mar 5 20:00:12 2016 -0800 target: Drop incorrect ABORT_TASK put for completed commands [ Upstream commit 7f54ab5ff52fb0b91569bc69c4a6bc5cac1b768d ] This patch fixes a recent ABORT_TASK regression associated with commit febe562c, where a left-over target_put_sess_cmd() would still be called when __target_check_io_state() detected a command has already been completed, and explicit ABORT must be avoided. Note commit febe562c dropped the local kref_get_unless_zero() check in core_tmr_abort_task(), but did not drop this extra corresponding target_put_sess_cmd() in the failure path. So go ahead and drop this now bogus target_put_sess_cmd(), and avoid this potential use-after-free. Reported-by: Dan Lane Cc: Quinn Tran Cc: Himanshu Madhani Cc: Sagi Grimberg Cc: Christoph Hellwig Cc: Hannes Reinecke Cc: Andy Grover Cc: Mike Christie Cc: stable@vger.kernel.org # 3.14+ Signed-off-by: Nicholas Bellinger Signed-off-by: Sasha Levin commit 546a8b3c4059af5fd8466f3d1848321e7613904c Author: Konstantin Khlebnikov Date: Sun Jan 31 16:21:29 2016 +0300 ovl: copy new uid/gid into overlayfs runtime inode [ Upstream commit b81de061fa59f17d2730aabb1b84419ef3913810 ] Overlayfs must update uid/gid after chown, otherwise functions like inode_owner_or_capable() will check user against stale uid. Catched by xfstests generic/087, it chowns file and calls utimes. Signed-off-by: Konstantin Khlebnikov Signed-off-by: Miklos Szeredi Cc: Signed-off-by: Sasha Levin commit 091baa9c784fe57b8778a4b754931ffe57245db3 Author: Konstantin Khlebnikov Date: Sun Jan 31 16:17:53 2016 +0300 ovl: ignore lower entries when checking purity of non-directory entries [ Upstream commit 45d11738969633ec07ca35d75d486bf2d8918df6 ] After rename file dentry still holds reference to lower dentry from previous location. This doesn't matter for data access because data comes from upper dentry. But this stale lower dentry taints dentry at new location and turns it into non-pure upper. Such file leaves visible whiteout entry after remove in directory which shouldn't have whiteouts at all. Overlayfs already tracks pureness of file location in oe->opaque. This patch just uses that for detecting actual path type. Comment from Vivek Goyal's patch: Here are the details of the problem. Do following. $ mkdir upper lower work merged upper/dir/ $ touch lower/test $ sudo mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir= work merged $ mv merged/test merged/dir/ $ rm merged/dir/test $ ls -l merged/dir/ /usr/bin/ls: cannot access merged/dir/test: No such file or directory total 0 c????????? ? ? ? ? ? test Basic problem seems to be that once a file has been unlinked, a whiteout has been left behind which was not needed and hence it becomes visible. Whiteout is visible because parent dir is of not type MERGE, hence od->is_real is set during ovl_dir_open(). And that means ovl_iterate() passes on iterate handling directly to underlying fs. Underlying fs does not know/filter whiteouts so it becomes visible to user. Why did we leave a whiteout to begin with when we should not have. ovl_do_remove() checks for OVL_TYPE_PURE_UPPER() and does not leave whiteout if file is pure upper. In this case file is not found to be pure upper hence whiteout is left. So why file was not PURE_UPPER in this case? I think because dentry is still carrying some leftover state which was valid before rename. For example, od->numlower was set to 1 as it was a lower file. After rename, this state is not valid anymore as there is no such file in lower. Signed-off-by: Konstantin Khlebnikov Reported-by: Viktor Stanchev Suggested-by: Vivek Goyal Link: https://bugzilla.kernel.org/show_bug.cgi?id=109611 Acked-by: Vivek Goyal Signed-off-by: Miklos Szeredi Cc: Signed-off-by: Sasha Levin commit e786702fff38e2b5142029d6de615abf1c8e436f Author: Rui Wang Date: Fri Jan 8 23:09:59 2016 +0800 ovl: fix getcwd() failure after unsuccessful rmdir [ Upstream commit ce9113bbcbf45a57c082d6603b9a9f342be3ef74 ] ovl_remove_upper() should do d_drop() only after it successfully removes the dir, otherwise a subsequent getcwd() system call will fail, breaking userspace programs. This is to fix: https://bugzilla.kernel.org/show_bug.cgi?id=110491 Signed-off-by: Rui Wang Reviewed-by: Konstantin Khlebnikov Signed-off-by: Miklos Szeredi Cc: Signed-off-by: Sasha Levin commit 7ce08a0c9992da8986b06154a768646e41012469 Author: Jouni Malinen Date: Tue Mar 1 00:29:00 2016 +0200 mac80211: Fix Public Action frame RX in AP mode [ Upstream commit 1ec7bae8bec9b72e347e01330c745ab5cdd66f0e ] Public Action frames use special rules for how the BSSID field (Address 3) is set. A wildcard BSSID is used in cases where the transmitter and recipient are not members of the same BSS. As such, we need to accept Public Action frames with wildcard BSSID. Commit db8e17324553 ("mac80211: ignore frames between TDLS peers when operating as AP") added a rule that drops Action frames to TDLS-peers based on an Action frame having different DA (Address 1) and BSSID (Address 3) values. This is not correct since it misses the possibility of BSSID being a wildcard BSSID in which case the Address 1 would not necessarily match. Fix this by allowing mac80211 to accept wildcard BSSID in an Action frame when in AP mode. Fixes: db8e17324553 ("mac80211: ignore frames between TDLS peers when operating as AP") Cc: stable@vger.kernel.org Signed-off-by: Jouni Malinen Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin commit 87e0016ccb1f9cbe377d4af19cb840acbbdff206 Author: Johannes Berg Date: Fri Feb 26 22:13:40 2016 +0100 mac80211: check PN correctly for GCMP-encrypted fragmented MPDUs [ Upstream commit 9acc54beb474c81148e2946603d141cf8716b19f ] Just like for CCMP we need to check that for GCMP the fragments have PNs that increment by one; the spec was updated to fix this security issue and now has the following text: The receiver shall discard MSDUs and MMPDUs whose constituent MPDU PN values are not incrementing in steps of 1. Adapt the code for CCMP to work for GCMP as well, luckily the relevant fields already alias each other so no code duplication is needed (just check the aliasing with BUILD_BUG_ON.) Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin commit a60ebc3d637071d56b66aac189dd3fcdfa707704 Author: Takashi Iwai Date: Mon Feb 29 18:01:18 2016 +0100 ASoC: wm_adsp: Fix enum ctl accesses in a wrong type [ Upstream commit 15c665700bf6f4543f003ac0fbb1e9ec692e93f2 ] The firmware ctls like "DSP1 Firmware" in wm_adsp codec driver are enum, while the current driver accesses wrongly via value.integer.value[]. They have to be via value.enumerated.item[] instead. Signed-off-by: Takashi Iwai Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit f4d57e47121afdfad04a82cbd07b246873ec1e19 Author: Takashi Iwai Date: Mon Feb 29 18:01:15 2016 +0100 ASoC: wm8994: Fix enum ctl accesses in a wrong type [ Upstream commit 8019c0b37cd5a87107808300a496388b777225bf ] The DRC Mode like "AIF1DRC1 Mode" and EQ Mode like "AIF1.1 EQ Mode" in wm8994 codec driver are enum ctls, while the current driver accesses wrongly via value.integer.value[]. They have to be via value.enumerated.item[] instead. Signed-off-by: Takashi Iwai Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 58de29e6c58f7830826f48d7b0454cc1cec630ca Author: Takashi Iwai Date: Mon Feb 29 18:01:12 2016 +0100 ASoC: wm8958: Fix enum ctl accesses in a wrong type [ Upstream commit d0784829ae3b0beeb69b476f017d5c8a2eb95198 ] "MBC Mode", "VSS Mode", "VSS HPF Mode" and "Enhanced EQ Mode" ctls in wm8958 codec driver are enum, while the current driver accesses wrongly via value.integer.value[]. They have to be via value.enumerated.item[] instead. Signed-off-by: Takashi Iwai Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 33824bb3cb275aa931c996c2f67bf8bf9babe301 Author: Takashi Iwai Date: Mon Feb 29 17:20:48 2016 +0100 ASoC: dapm: Fix ctl value accesses in a wrong type [ Upstream commit 741338f99f16dc24d2d01ac777b0798ae9d10a90 ] snd_soc_dapm_dai_link_get() and _put() access the associated ctl values as value.integer.value[]. However, this is an enum ctl, and it has to be accessed via value.enumerated.item[]. The former is long while the latter is unsigned int, so they don't align. Fixes: c66150824b8a ('ASoC: dapm: add code to configure dai link parameters') Cc: Signed-off-by: Takashi Iwai Signed-off-by: Mark Brown Signed-off-by: Sasha Levin commit 01ff3a0a01366a231593476cfe775596ebdba30f Author: Maximilain Schneider Date: Tue Feb 23 01:17:28 2016 +0000 can: gs_usb: fixed disconnect bug by removing erroneous use of kfree() [ Upstream commit e9a2d81b1761093386a0bb8a4f51642ac785ef63 ] gs_destroy_candev() erroneously calls kfree() on a struct gs_can *, which is allocated through alloc_candev() and should instead be freed using free_candev() alone. The inappropriate use of kfree() causes the kernel to hang when gs_destroy_candev() is called. Only the struct gs_usb * which is allocated through kzalloc() should be freed using kfree() when the device is disconnected. Signed-off-by: Maximilian Schneider Cc: linux-stable Signed-off-by: Marc Kleine-Budde Signed-off-by: Sasha Levin commit 870be7d2ade42485fa40ac3d2ac8bcffa3afc957 Author: Felix Fietkau Date: Thu Feb 18 19:49:18 2016 +0100 mac80211: minstrel_ht: set default tx aggregation timeout to 0 [ Upstream commit 7a36b930e6ed4702c866dc74a5ad07318a57c688 ] The value 5000 was put here with the addition of the timeout field to ieee80211_start_tx_ba_session. It was originally added in mac80211 to save resources for drivers like iwlwifi, which only supports a limited number of concurrent aggregation sessions. Since iwlwifi does not use minstrel_ht and other drivers don't need this, 0 is a better default - especially since there have been recent reports of aggregation setup related issues reproduced with ath9k. This should improve stability without causing any adverse effects. Cc: stable@vger.kernel.org Acked-by: Avery Pennarun Signed-off-by: Felix Fietkau Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin commit ea46df70efaa589117dce85dec3e3707362e514a Author: Charles Keepax Date: Thu Feb 18 15:47:13 2016 +0000 ASoC: samsung: Use IRQ safe spin lock calls [ Upstream commit 316fa9e09ad76e095b9d7e9350c628b918370a22 ] Lockdep warns of a potential lock inversion, i2s->lock is held numerous times whilst we are under the substream lock (snd_pcm_stream_lock). If we use the IRQ unsafe spin lock calls, you can also end up locking snd_pcm_stream_lock whilst under i2s->lock (if an IRQ happens whilst we are holding i2s->lock). This could result in deadlock. [ 18.147001] CPU0 CPU1 [ 18.151509] ---- ---- [ 18.156022] lock(&(&pri_dai->spinlock)->rlock); [ 18.160701] local_irq_disable(); [ 18.166622] lock(&(&substream->self_group.lock)->rlock); [ 18.174595] lock(&(&pri_dai->spinlock)->rlock); [ 18.181806] [ 18.184408] lock(&(&substream->self_group.lock)->rlock); [ 18.190045] [ 18.190045] *** DEADLOCK *** This patch changes to using the irq safe spinlock calls, to avoid this issue. Fixes: ce8bcdbb61d9 ("ASoC: samsung: i2s: Protect more registers with a spinlock") Signed-off-by: Charles Keepax Tested-by: Anand Moon Signed-off-by: Mark Brown Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin commit 7e62b968351c3759db1ad78b4aaaeff72ab2b998 Author: Liad Kaufman Date: Sun Feb 14 15:32:58 2016 +0200 iwlwifi: mvm: inc pending frames counter also when txing non-sta [ Upstream commit fb896c44f88a75843a072cd6961b1615732f7811 ] Until this patch, when TXing non-sta the pending_frames counter wasn't increased, but it WAS decreased in iwl_mvm_rx_tx_cmd_single(), what makes it negative in certain conditions. This in turn caused much trouble when we need to remove the station since we won't be waiting forever until pending_frames gets 0. In certain cases, we were exhausting the station table even in BSS mode, because we had a lot of stale stations. Increase the counter also in iwl_mvm_tx_skb_non_sta() after a successful TX to avoid this outcome. CC: [3.18+] Signed-off-by: Liad Kaufman Signed-off-by: Emmanuel Grumbach Signed-off-by: Sasha Levin commit 60ca0012a0965fe57712eef8361ec99b9c76eb06 Author: Sven Eckelmann Date: Tue Feb 2 08:12:26 2016 +0100 mac80211: minstrel: Change expected throughput unit back to Kbps [ Upstream commit 212c5a5e6ba61678be6b5fee576e38bccb50b613 ] The change from cur_tp to the function minstrel_get_tp_avg/minstrel_ht_get_tp_avg changed the unit used for the current throughput. For example in minstrel_ht the correct conversion between them would be: mrs->cur_tp / 10 == minstrel_ht_get_tp_avg(..). This factor 10 must also be included in the calculation of minstrel_get_expected_throughput and minstrel_ht_get_expected_throughput to return values with the unit [Kbps] instead of [10Kbps]. Otherwise routing algorithms like B.A.T.M.A.N. V will make incorrect decision based on these values. Its kernel based implementation expects expected_throughput always to have the unit [Kbps] and not sometimes [10Kbps] and sometimes [Kbps]. The same requirement has iw or olsrdv2's nl80211 based statistics module which retrieve the same data via NL80211_STA_INFO_TX_BITRATE. Cc: stable@vger.kernel.org Fixes: 6a27b2c40b48 ("mac80211: restructure per-rate throughput calculation into function") Signed-off-by: Sven Eckelmann Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin commit d5bb89facc7b689292d85471be1fdbae1928e224 Author: Chris Bainbridge Date: Wed Jan 27 15:46:18 2016 +0000 mac80211: fix use of uninitialised values in RX aggregation [ Upstream commit f39ea2690bd61efec97622c48323f40ed6e16317 ] Use kzalloc instead of kmalloc for struct tid_ampdu_rx to initialize the "removed" field (all others are initialized manually). That fixes: UBSAN: Undefined behaviour in net/mac80211/rx.c:932:29 load of value 2 is not a valid value for type '_Bool' CPU: 3 PID: 1134 Comm: kworker/u16:7 Not tainted 4.5.0-rc1+ #265 Workqueue: phy0 rt2x00usb_work_rxdone 0000000000000004 ffff880254a7ba50 ffffffff8181d866 0000000000000007 ffff880254a7ba78 ffff880254a7ba68 ffffffff8188422d ffffffff8379b500 ffff880254a7bab8 ffffffff81884747 0000000000000202 0000000348620032 Call Trace: [] dump_stack+0x45/0x5f [] ubsan_epilogue+0xd/0x40 [] __ubsan_handle_load_invalid_value+0x67/0x70 [] ieee80211_sta_reorder_release.isra.16+0x5ed/0x730 [] ieee80211_prepare_and_rx_handle+0xd04/0x1c00 [] __ieee80211_rx_handle_packet+0x1f3/0x750 [] ieee80211_rx_napi+0x447/0x990 While at it, convert to use sizeof(*tid_agg_rx) instead. Fixes: 788211d81bfdf ("mac80211: fix RX A-MPDU session reorder timer deletion") Cc: stable@vger.kernel.org Signed-off-by: Chris Bainbridge [reword commit message, use sizeof(*tid_agg_rx)] Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin commit 6d5202f35ff2277d76eb53da93ed70080d6f4ec1 Author: Johannes Berg Date: Wed Jan 27 13:29:34 2016 +0100 cfg80211/wext: fix message ordering [ Upstream commit cb150b9d23be6ee7f3a0fff29784f1c5b5ac514d ] Since cfg80211 frequently takes actions from its netdev notifier call, wireless extensions messages could still be ordered badly since the wext netdev notifier, since wext is built into the kernel, runs before the cfg80211 netdev notifier. For example, the following can happen: 5: wlan1: mtu 1500 qdisc mq state DOWN group default link/ether 02:00:00:00:01:00 brd ff:ff:ff:ff:ff:ff 5: wlan1: link/ether when setting the interface down causes the wext message. To also fix this, export the wireless_nlevent_flush() function and also call it from the cfg80211 notifier. Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin commit 746ba2ee59997437988060c709324057b761bd96 Author: Johannes Berg Date: Wed Jan 27 12:37:52 2016 +0100 wext: fix message delay/ordering [ Upstream commit 8bf862739a7786ae72409220914df960a0aa80d8 ] Beniamino reported that he was getting an RTM_NEWLINK message for a given interface, after the RTM_DELLINK for it. It turns out that the message is a wireless extensions message, which was sent because the interface had been connected and disconnection while it was deleted caused a wext message. For its netlink messages, wext uses RTM_NEWLINK, but the message is without all the regular rtnetlink attributes, so "ip monitor link" prints just rudimentary information: 5: wlan1: mtu 1500 qdisc mq state DOWN group default link/ether 02:00:00:00:01:00 brd ff:ff:ff:ff:ff:ff Deleted 5: wlan1: mtu 1500 qdisc noop state DOWN group default link/ether 02:00:00:00:01:00 brd ff:ff:ff:ff:ff:ff 5: wlan1: link/ether (from my hwsim reproduction) This can cause userspace to get confused since it doesn't expect an RTM_NEWLINK message after RTM_DELLINK. The reason for this is that wext schedules a worker to send out the messages, and the scheduling delay can cause the messages to get out to userspace in different order. To fix this, have wext register a netdevice notifier and flush out any pending messages when netdevice state changes. This fixes any ordering whenever the original message wasn't sent by a notifier itself. Cc: stable@vger.kernel.org Reported-by: Beniamino Galvani Signed-off-by: Johannes Berg Signed-off-by: Sasha Levin