- May 13, 2015
-
-
Sergey Senozhatsky authored
Fix the following oops: hpet_msi_get_hwirq+0x1f/0x27 msi_domain_alloc+0x35/0xfe ? trace_hardirqs_on_caller+0x16c/0x188 irq_domain_alloc_irqs_recursive+0x51/0x95 __irq_domain_alloc_irqs+0x151/0x223 hpet_assign_irq+0x5d/0x68 hpet_msi_capability_lookup+0x121/0x1cb ? hpet_enable+0x2b4/0x2b4 hpet_late_init+0x5f/0xf2 ? hpet_enable+0x2b4/0x2b4 do_one_initcall+0x184/0x199 kernel_init_freeable+0x1af/0x237 ? rest_init+0x13a/0x13a kernel_init+0xe/0xd4 ret_from_fork+0x3f/0x70 ? rest_init+0x13a/0x13a Since 3cb96f0c ('x86/hpet: Enhance HPET IRQ to support hierarchical irqdomains') hpet_msi_capability_lookup() uses hpet_assign_irq(). The latter initializes irq_alloc_info on stack, but passes a NULL pointer to irq_domain_alloc_irqs(), which causes a NULL pointer dereference later in hpet_msi_get_hwirq(). Pass the pointer to the irq_alloc_info irq_domain_alloc_irqs(). Fixes: 3cb96f0c 'x86/hpet: Enhance HPET IRQ to support hierarchical irqdomains' S...
-
Ingo Molnar authored
Huang Ying reported x86 boot hangs due to this commit. Turns out that the change, despite its changelog, does more than just change timeouts: it also changes the way we assert/deassert INIT via the APIC_DM_INIT IPI, in the x2apic case it skips the deassert step. This is historically fragile code and the patch did not improve it, so revert these changes. This commit: 1a744cb3 ("x86/smp/boot: Remove 10ms delay from cpu_up() on modern processors") independently removes the worst of the delays (the 10 msec delay). The remaining delays can be addressed one by one, combined with careful testing. Reported-by:
Huang Ying <ying.huang@intel.com> Cc: Anthony Liguori <aliguori@amazon.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Gang Wei <gang.wei@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jan H. Schönherr <jschoenh@amazon.de> Cc: Len Brown <len.brown@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tim Deegan <tim@xen.org> Link: http://lkml.kernel.org/r/1430732554-7294-1-git-send-email-jschoenh@amazon.de Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
- May 12, 2015
-
-
Len Brown authored
Modern processor familes do not require the 10ms delay in cpu_up() to de-assert INIT. This speeds up boot and resume by 10ms per (application) processor. Signed-off-by:
Len Brown <len.brown@intel.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jan H. Schönherr <jschoenh@amazon.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/021ce30c88f216ad39686646421194dc25671e55.1431379433.git.len.brown@intel.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
Len Brown authored
No change to default behavior. Replace the hard-coded mdelay(10) in cpu_up() with a variable udelay, that is set to a defined default -- rather than a magic number. Add a boot-time override, "cpu_init_udelay=N" Signed-off-by:
Len Brown <len.brown@intel.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jan H. Schönherr <jschoenh@amazon.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/2fe8e6c798e8def271122f62df9bbf58dc283e2a.1431379433.git.len.brown@intel.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
- May 11, 2015
-
-
Borislav Petkov authored
Software optimization guides for both F15h and F16h cite those NOPs as the optimal ones. A microbenchmark confirms that actually even older families are better with the single-insn NOPs so switch to them for the alternatives. Cycles count below includes the loop overhead of the measurement but that overhead is the same with all runs. F10h, revE: ----------- Running NOP tests, 1000 NOPs x 1000000 repetitions K8: 90 288.212282 cycles 66 90 288.220840 cycles 66 66 90 288.219447 cycles 66 66 66 90 288.223204 cycles 66 66 90 66 90 571.393424 cycles 66 66 90 66 66 90 571.374919 cycles 66 66 66 90 66 66 90 572.249281 cycles 66 66 66 90 66 66 66 90 571.388651 cycles P6: 90 288.214193 cycles 66 90 288.225550 cycles 0f 1f 00 288.224441 cycles 0f 1f 40 00 288.225030 cycles 0f 1f 44 00 00 288.233558 cycles 66 0f 1f 44 0...
-
- May 10, 2015
-
-
Brian Gerst authored
Since the ISA irqs are in a single block, use ISA_IRQ_VECTOR(irq) instead of individual macros. Signed-off-by:
Brian Gerst <brgerst@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1431185813-15413-5-git-send-email-brgerst@gmail.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
Brian Gerst authored
Use IA32_SYSCALL_VECTOR for both compat and native. Signed-off-by:
Brian Gerst <brgerst@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1431185813-15413-4-git-send-email-brgerst@gmail.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
Brian Gerst authored
The invalidate_interrupt* functions no longer exist. Signed-off-by:
Brian Gerst <brgerst@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1431185813-15413-3-git-send-email-brgerst@gmail.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
Brian Gerst authored
Move irq_regs and irq_stat definitions to irq.c. Signed-off-by:
Brian Gerst <brgerst@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1431185813-15413-2-git-send-email-brgerst@gmail.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
- May 08, 2015
-
-
Denys Vlasenko authored
32-bit code has PER_CPU_VAR(cpu_current_top_of_stack). 64-bit code uses somewhat more obscure: PER_CPU_VAR(cpu_tss + TSS_sp0). Define the 'cpu_current_top_of_stack' macro on CONFIG_X86_64 as well so that the PER_CPU_VAR(cpu_current_top_of_stack) expression can be used in both 32-bit and 64-bit code. Signed-off-by:
Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1429889495-27850-3-git-send-email-dvlasenk@redhat.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
Denys Vlasenko authored
Signed-off-by:
Denys Vlasenko <dvlasenk@redhat.com> Acked-by:
Andy Lutomirski <luto@kernel.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1429889495-27850-2-git-send-email-dvlasenk@redhat.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
Denys Vlasenko authored
PER_CPU_VAR(kernel_stack) is redundant: - On the 64-bit build, we can use PER_CPU_VAR(cpu_tss + TSS_sp0). - On the 32-bit build, we can use PER_CPU_VAR(cpu_current_top_of_stack). PER_CPU_VAR(kernel_stack) will be deleted by a separate change. Signed-off-by:
Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1429889495-27850-1-git-send-email-dvlasenk@redhat.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
Denys Vlasenko authored
With both gcc 4.7.2 and 4.9.2, sometimes gcc mysteriously doesn't inline very small functions we expect to be inlined: $ nm --size-sort vmlinux | grep -iF ' t ' | uniq -c | grep -v '^ *1 ' | sort -rn 473 000000000000000b t spin_unlock_irqrestore 449 000000000000005f t rcu_read_unlock 355 0000000000000009 t atomic_inc <== THIS 353 000000000000006e t rcu_read_lock 350 0000000000000075 t rcu_read_lock_sched_held 291 000000000000000b t spin_unlock 266 0000000000000019 t arch_local_irq_restore 215 000000000000000b t spin_lock 180 0000000000000011 t kzalloc 165 0000000000000012 t list_add_tail 161 0000000000000019 t arch_local_save_flags 153 0000000000000016 t test_and_set_bit 134 000000000000000b t spin_unlock_irq 134 0000000000000009 t atomic_dec <== THIS 130 000000000000000b t spin_unlock_bh 122 0000000000000010 t brelse 120 0000000000000016 t test_and_clear_bit 120 000000000000000b t spin_lock_irq 119 000000000000001e t get_dma_ops 117 0000000000000053 t cpumask_next 116 0000000000000036 t kref_get 114 000000000000001a t schedule_work 106 000000000000000b t spin_lock_bh 103 0000000000000019 t arch_local_irq_disable ... Note sizes of marked functions. They are merely 9 bytes long! Selecting function with 'atomic' in their names: 355 0000000000000009 t atomic_inc 134 0000000000000009 t atomic_dec 98 0000000000000014 t atomic_dec_and_test 31 000000000000000e t atomic_add_return 27 000000000000000a t atomic64_inc 26 000000000000002f t kmap_atomic 24 0000000000000009 t atomic_add 12 0000000000000009 t atomic_sub 10 0000000000000021 t __atomic_add_unless 10 000000000000000a t atomic64_add 5 000000000000001f t __atomic_add_unless.constprop.7 5 000000000000000a t atomic64_dec 4 000000000000001f t __atomic_add_unless.constprop.18 4 000000000000001f t __atomic_add_unless.constprop.12 4 000000000000001f t __atomic_add_unless.constprop.10 3 000000000000001f t __atomic_add_unless.constprop.13 3 0000000000000011 t atomic64_add_return 2 000000000000001f t __atomic_add_unless.constprop.9 2 000000000000001f t __atomic_add_unless.constprop.8 2 000000000000001f t __atomic_add_unless.constprop.6 2 000000000000001f t __atomic_add_unless.constprop.5 2 000000000000001f t __atomic_add_unless.constprop.3 2 000000000000001f t __atomic_add_unless.constprop.22 2 000000000000001f t __atomic_add_unless.constprop.14 2 000000000000001f t __atomic_add_unless.constprop.11 2 000000000000001e t atomic_dec_if_positive 2 0000000000000014 t atomic_inc_and_test 2 0000000000000011 t atomic_add_return.constprop.4 2 0000000000000011 t atomic_add_return.constprop.17 2 0000000000000011 t atomic_add_return.constprop.16 2 000000000000000d t atomic_inc.constprop.4 2 000000000000000c t atomic_cmpxchg This patch fixes this for x86 atomic ops via s/inline/__always_inline/. This decreases allyesconfig kernel by about 25k: text data bss dec hex filename 82399481 22255416 20627456 125282353 777a831 vmlinux.before 82375570 22255544 20627456 125258570 7774b4a vmlinux Signed-off-by:
Denys Vlasenko <dvlasenk@redhat.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1431080762-17797-1-git-send-email-dvlasenk@redhat.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
Denys Vlasenko authored
By the nature of TEST operation, it is often possible to test a narrower part of the operand: "testl $3, mem" -> "testb $3, mem" This results in shorter insns, because TEST insn has no sign-entending byte-immediate forms unlike other ALU ops. text data bss dec hex filename 11674 0 0 11674 2d9a entry_64.o.before 11658 0 0 11658 2d8a entry_64.o Changes in object code: - f7 84 24 88 00 00 00 03 00 00 00 testl $0x3,0x88(%rsp) + f6 84 24 88 00 00 00 03 testb $0x3,0x88(%rsp) - f7 44 24 68 03 00 00 00 testl $0x3,0x68(%rsp) + f6 44 24 68 03 testb $0x3,0x68(%rsp) - f7 84 24 90 00 00 00 03 00 00 00 testl $0x3,0x90(%rsp) + f6 84 24 90 00 00 00 03 testb $0x3,0x90(%rsp) Signed-off-by:
Denys Vlasenko <dvlasenk@redhat.com> Acked-by:
Andy Lutomirski <luto@kernel.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1430140912-7960-2-git-send-email-dvlasenk@redhat.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
Denys Vlasenko authored
After TESTs, use logically correct JZ/JNZ mnemonics instead of JE/JNE. This doesn't change code. Signed-off-by:
Denys Vlasenko <dvlasenk@redhat.com> Acked-by:
Andy Lutomirski <luto@kernel.org> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Drewry <wad@chromium.org> Link: http://lkml.kernel.org/r/1430140912-7960-1-git-send-email-dvlasenk@redhat.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
- May 06, 2015
-
-
Stefano Stabellini authored
Make sure that xen_swiotlb_init allocates buffers that are DMA capable when at least one memblock is available below 4G. Otherwise we assume that all devices on the SoC can cope with >4G addresses. We do this on ARM and ARM64, where dom0 is mapped 1:1, so pfn == mfn in this case. No functional changes on x86. From: Chen Baozi <baozich@gmail.com> Signed-off-by:
Chen Baozi <baozich@gmail.com> Signed-off-by:
Stefano Stabellini <stefano.stabellini@eu.citrix.com> Tested-by:
Chen Baozi <baozich@gmail.com> Acked-by:
Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com>
-
Borislav Petkov authored
Add some text to the macro magic for future reference and against failing human memory. Requested-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
Bobby Powers authored
The following commit: f893959b ("x86/fpu: Don't abuse drop_init_fpu() in flush_thread()") removed drop_init_fpu() usage from flush_thread(). This seems to break things for me - the Go 1.4 test suite fails all over the place with floating point comparision errors (offending commit found through bisection). The functional change was that flush_thread() after this commit only calls restore_init_xstate() when both use_eager_fpu() and !used_math() are true. drop_init_fpu() (now fpu_reset_state()) calls restore_init_xstate() regardless of whether current used_math() - apply the same logic here. Switch used_math() -> tsk_used_math(tsk) to consistently use the grabbed tsk instead of current, like in the rest of flush_thread(). Tested-by:
Dave Hansen <dave.hansen@intel.com> Signed-off-by:
Bobby Powers <bobbypowers@gmail.com> Signed-off-by:
Borislav Petkov <bp@suse.de> Acked-by:
Oleg Nesterov <oleg@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pekka Riikonen <priikone@iki.fi> Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com> Cc: Rik van Riel <riel@redhat.com> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: f893959b ("x86/fpu: Don't abuse drop_init_fpu() in flush_thread()") Link: http://lkml.kernel.org/r/1430147441-9820-1-git-send-email-bobbypowers@gmail.com Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
H.J. Lu authored
GCC 5 added a compiler option, -mskip-rax-setup, for x86-64. It skips setting up the RAX register when SSE is disabled and there are no variable arguments passed in vector registers. (According to the x86_64 ABI, %al is used as a hidden register containing the number of vector registers used). Since the kernel doesn't pass vector registers to functions with variable arguments, this option can be used to optimize the x86-64 kernel. This GCC feature was suggested by Rasmus Villemoes <linux@rasmusvillemoes.dk>. This is the corresponding kernel change using it. For kernel v3.17: text data bss dec filename 11455921 2204048 5853184 19513153 vmlinux #with -mskip-rax-setup 11480079 2204048 5853184 19537311 vmlinux For Kernel v4.0+ - custom config: text data bss dec filename 10231778 3479800 16617472 30329050 vmlinux-gcc5+-mskip-rax-setup 10268797 3547448 16621568 30437813 vmlinux Signed-off-by:
H.J. Lu <hjl.tools@gmail.com> Signed-off-by:
Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
Jan H. Schönherr authored
Remove the per-CPU delays during SMP initialization, which seems to be possible on newer architectures with an x2APIC. Xen does this since 2011. In fact, this commit is basically a combination of the following Xen commits. The first removes the delays, the second fixes an issue with the removal: commit 68fce206f6dba9981e8322269db49692c95ce250 Author: Tim Deegan <Tim.Deegan@citrix.com> Date: Tue Jul 19 14:13:01 2011 +0100 x86: Remove timeouts from INIT-SIPI-SIPI sequence when using x2apic. Some of the timeouts are pointless since they're waiting for the ICR to ack the IPI delivery and that doesn't happen on x2apic. The others should be benign (and are suggested in the SDM) but removing them makes AP bringup much more reliable on some test boxes. Signed-off-by:
Tim Deegan <Tim.Deegan@citrix.com> commit f12ee533150761df5a7099c83f2a5fa6c07d1187 Author: Gang Wei <gang.wei@intel.com> Date: Thu Dec 29 10:07:54 2011 +0000 X86: Add a delay between INIT & SIPIs for tboot AP bring-up in X2APIC case Without this delay, Xen could not bring APs up while working with TXT/tboot, because tboot needs some time in APs to handle INIT before becoming ready for receiving SIPIs (this delay was removed as part of c/s 23724 by Tim Deegan). Signed-off-by:
Gang Wei <gang.wei@intel.com> Acked-by:
Keir Fraser <keir@xen.org> Acked-by:
Tim Deegan <tim@xen.org> Committed-by:
Tim Deegan <tim@xen.org> Signed-off-by:
Jan H. Schönherr <jschoenh@amazon.de> Cc: Anthony Liguori <aliguori@amazon.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Gang Wei <gang.wei@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Len Brown <len.brown@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tim Deegan <tim@xen.org> Link: http://lkml.kernel.org/r/1430732554-7294-1-git-send-email-jschoenh@amazon.de Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
Marc Dionne authored
Commit 75182b16 ("x86/asm/entry: Switch all C consumers of kernel_stack to this_cpu_sp0()") changed current_thread_info to use this_cpu_sp0, and indirectly made it rely on init_tss which was exported with EXPORT_PER_CPU_SYMBOL_GPL. As a result some macros and inline functions such as set/get_fs, test_thread_flag and variants have been made unusable for external modules. Make cpu_tss exported with EXPORT_PER_CPU_SYMBOL so that these functions are accessible again, as they were previously. Signed-off-by:
Marc Dionne <marc.dionne@your-file-system.com> Acked-by:
Andy Lutomirski <luto@amacapital.net> Link: http://lkml.kernel.org/r/1430763404-21221-1-git-send-email-marc.dionne@your-file-system.com Signed-off-by:
Thomas Gleixner <tglx@linutronix.de>
-
Boris Ostrovsky authored
Commit 61f01dd9 ("x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue") makes AMD processors set SS to __KERNEL_DS in __switch_to() to deal with cases when SS is NULL. This breaks Xen PV guests who do not want to load SS with__KERNEL_DS. Since the problem that the commit is trying to address would have to be fixed in the hypervisor (if it in fact exists under Xen) there is no reason to set X86_BUG_SYSRET_SS_ATTRS flag for PV VPCUs here. This can be easily achieved by adding x86_hyper_xen_hvm.set_cpu_features op which will clear this flag. (And since this structure is no longer HVM-specific we should do some renaming). Signed-off-by:
Boris Ostrovsky <boris.ostrovsky@oracle.com> Reported-by:
Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by:
David Vrabel <david.vrabel@citrix.com>
-
- May 05, 2015
-
-
Jan Kiszka authored
We are able to use x2APIC mode in the absence of interrupt remapping on certain hypervisors. So it is fine to disable IRQ_REMAP without having to give up x2APIC support. Signed-off-by:
Jan Kiszka <jan.kiszka@siemens.com> Link: http://lkml.kernel.org/r/55479709.4030901@siemens.com Signed-off-by:
Thomas Gleixner <tglx@linutronix.de>
-
Jan Kiszka authored
If ACPI is disabled, acpi_gbl_FADT is not available, and and the build breaks. Signed-off-by:
Jan Kiszka <jan.kiszka@siemens.com> Link: http://lkml.kernel.org/r/55479708.2000104@siemens.com Signed-off-by:
Thomas Gleixner <tglx@linutronix.de>
-
Thomas Gleixner authored
Nothing changes those ops. Make the initializers readable while at it. Reported-by:
Krzysztof Kozlowski <k.kozlowski.k@gmail.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de>
-
Tahsin Erdogan authored
A spinlock is regarded as contended when there is at least one waiter. Currently, the code that checks whether there are any waiters rely on tail value being greater than head. However, this is not true if tail reaches the max value and wraps back to zero, so arch_spin_is_contended() incorrectly returns 0 (not contended) when tail is smaller than head. The original code (before regression) handled this case by casting the (tail - head) to an unsigned value. This change simply restores that behavior. Fixes: d6abfdb2 ("x86/spinlocks/paravirt: Fix memory corruption on unlock") Signed-off-by:
Tahsin Erdogan <tahsin@google.com> Cc: peterz@infradead.org Cc: Waiman.Long@hp.com Cc: borntraeger@de.ibm.com Cc: oleg@redhat.com Cc: raghavendra.kt@linux.vnet.ibm.com Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1430799331-20445-1-git-send-email-tahsin@google.com Signed-off-by:
Thomas Gleixner <tglx@linutronix.de>
-
- May 01, 2015
-
-
Sam Bobroff authored
Patches 7cba160a "powernv/cpuidle: Redesign idle states management" and 77b54e9f "powernv/powerpc: Add winkle support for offline cpus" use non-volatile condition registers (cr2, cr3 and cr4) early in the system reset interrupt handler (system_reset_pSeries()) before it has been determined if state loss has occurred. If state loss has not occurred, control returns via the power7_wakeup_noloss() path which does not restore those condition registers, leaving them corrupted. Fix this by restoring the condition registers in the power7_wakeup_noloss() case. This is apparent when running a KVM guest on hardware that does not support winkle or sleep and the guest makes use of secondary threads. In practice this means Power7 machines, though some early unreleased Power8 machines may also be susceptible. The secondary CPUs are taken off line before the guest is started and they call pnv_smp_cpu_kill_self(). This checks support for sleep states (in this case there is no support) and power7_nap() is called. When the CPU is woken, power7_nap() returns and because the CPU is still off line, the main while loop executes again. The sleep states support test is executed again, but because the tested values cannot have changed, the compiler has optimized the test away and instead we rely on the result of the first test, which has been left in cr3 and/or cr4. With the result overwritten, the wrong branch is taken and power7_winkle() is called on a CPU that does not support it, leading to it stalling. Fixes: 7cba160a ("powernv/cpuidle: Redesign idle states management") Fixes: 77b54e9f ("powernv/powerpc: Add winkle support for offline cpus") [mpe: Massage change log a bit more] Signed-off-by:
Sam Bobroff <sam.bobroff@au1.ibm.com> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-
Gavin Shan authored
Commit 1c509148b ("powerpc/eeh: Do probe on pci_dn") probes EEH devices in early stage, which is reasonable to pSeries platform. However, it's wrong for PowerNV platform because the PE# isn't determined until the resources (IO and MMIO) are assigned to PE in hotplug case. So we have to delay probing EEH devices for PowerNV platform until the PE# is assigned. Fixes: ff57b454 ("powerpc/eeh: Do probe on pci_dn") Signed-off-by:
Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-
Gavin Shan authored
When asserting reset in pcibios_set_pcie_reset_state(), the PE is enforced to (hardware) frozen state in order to drop unexpected PCI transactions (except PCI config read/write) automatically by hardware during reset, which would cause recursive EEH error. However, the (software) frozen state EEH_PE_ISOLATED is missed. When users get 0xFF from PCI config or MMIO read, EEH_PE_ISOLATED is set in PE state retrival backend. Unfortunately, nobody (the reset handler or the EEH recovery functinality in host) will clear EEH_PE_ISOLATED when the PE has been passed through to guest. The patch sets and clears EEH_PE_ISOLATED properly during reset in function pcibios_set_pcie_reset_state() to fix the issue. Fixes: 28158cd1 ("Enhance pcibios_set_pcie_reset_state()") Reported-by:
Carol L. Soto <clsoto@us.ibm.com> Signed-off-by:
Gavin Shan <gwshan@linux.vnet.ibm.com> Tested-by:
Carol L. Soto <clsoto@us.ibm.com> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-
Nathan Fontenot authored
The incorrect ordering of operations during cpu dlpar add results in invalid affinity for the cpu being added. The ibm,associativity property in the device tree is populated with all zeroes for the added cpu which results in invalid affinity mappings and all cpus appear to belong to node 0. This occurs because rtas configure-connector is called prior to making the rtas set-indicator calls. Phyp does not assign affinity information for a cpu until the rtas set-indicator calls are made to set the isolation and allocation state. Correct the order of operations to make the rtas set-indicator calls (done in dlpar_acquire_drc) before calling rtas configure-connector. Fixes: 1a8061c4 ("powerpc/pseries: Add kernel based CPU DLPAR handling") Signed-off-by:
Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-
Jiang Liu authored
An IO port or MMIO resource assigned to a PCI host bridge may be consumed by the host bridge itself or available to its child bus/devices. The ACPI specification defines a bit (Producer/Consumer) to tell whether the resource is consumed by the host bridge itself, but firmware hasn't used that bit consistently, so we can't rely on it. Before commit 593669c2 ("x86/PCI/ACPI: Use common ACPI resource interfaces to simplify implementation"), arch/x86/pci/acpi.c ignored all IO port resources defined by acpi_resource_io and acpi_resource_fixed_io to filter out IO ports consumed by the host bridge itself. Commit 593669c2 ("x86/PCI/ACPI: Use common ACPI resource interfaces to simplify implementation") started accepting all IO port and MMIO resources, which caused a regression that IO port resources consumed by the host bridge itself became available to its child devices. Then commit 63f1789e ("x86/PCI/ACPI: Ignore resources consumed by host bridge itself") ignored resources consumed by the host bridge itself by checking the IORESOURCE_WINDOW flag, which accidently removed MMIO resources defined by acpi_resource_memory24, acpi_resource_memory32 and acpi_resource_fixed_memory32. On x86 and IA64 platforms, all IO port and MMIO resources are assumed to be available to child bus/devices except one special case: IO port [0xCF8-0xCFF] is consumed by the host bridge itself to access PCI configuration space. So explicitly filter out PCI CFG IO ports[0xCF8-0xCFF]. This solution will also ease the way to consolidate ACPI PCI host bridge common code from x86, ia64 and ARM64. Related ACPI table are archived at: https://bugzilla.kernel.org/show_bug.cgi?id=94221 Related discussions at: http://patchwork.ozlabs.org/patch/461633/ https://lkml.org/lkml/2015/3/29/304 Fixes: 63f1789e (Ignore resources consumed by host bridge itself) Reported-by:
Bernhard Thaler <bernhard.thaler@wvnet.at> Signed-off-by:
Jiang Liu <jiang.liu@linux.intel.com> Cc: 4.0+ <stable@vger.kernel.org> # 4.0+ Reviewed-by:
Bjorn Helgaas <bhelgaas@google.com> Signed-off-by:
Rafael J. Wysocki <rafael.j.wysocki@intel.com>
-
- Apr 30, 2015
-
-
Suzuki K. Poulose authored
With commit d5efd9cc ("arm64: pmu: add support for interrupt-affinity property"), we print a warning when we find a PMU SPI with a missing missing interrupt-affinity property in a pmu node. Unfortunately, we pass the wrong (NULL) device node to of_node_full_name, resulting in unhelpful messages such as: hw perfevents: Failed to parse <no-node>/interrupt-affinity[0] This patch fixes the name to that of the pmu node. Fixes: d5efd9cc (arm64: pmu: add support for interrupt-affinity property) Acked-by:
Mark Rutland <mark.rutland@arm.com> Signed-off-by:
Suzuki K. Poulose <suzuki.poulose@arm.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Will Deacon authored
PPIs are affine by nature, so the interrupt-affinity property is not used and therefore we shouldn't print a warning in its absence. Reported-by:
Maxime Ripard <maxime.ripard@free-electrons.com> Reviewed-by:
Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Michael Ellerman authored
This reverts commit feba4036. Although the principle of this change is good, the implementation has a few issues. Firstly we can sometimes fail to abort a syscall because r12 may have been clobbered by C code if we went down the virtual CPU accounting path, or if syscall tracing was enabled. Secondly we have decided that it is safer to abort the syscall even earlier in the syscall entry path, so that we avoid the syscall tracing path when we are transactional. So that we have time to thoroughly test those changes we have decided to revert this for this merge window and will merge the fixed version in the next window. NB. Rather than reverting the selftest we just drop tm-syscall from TEST_PROGS so that it's not run by default. Fixes: feba4036 ("powerpc/tm: Abort syscalls in active transactions") Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-
Dean Nelson authored
__dma_alloc() does a PAGE_ALIGN() on the passed in size argument before doing anything else. __dma_free() does not. And because it doesn't, it is possible to leak memory should size not be an integer multiple of PAGE_SIZE. The solution is to add a PAGE_ALIGN() to __dma_free() like is done in __dma_alloc(). Additionally, this patch removes a redundant PAGE_ALIGN() from __dma_alloc_coherent(), since __dma_alloc_coherent() can only be called from __dma_alloc(), which already does a PAGE_ALIGN() before the call. Cc: stable@vger.kernel.org Acked-by:
Catalin Marinas <catalin.marinas@arm.com> Signed-off-by:
Dean Nelson <dnelson@redhat.com> Signed-off-by:
Will Deacon <will.deacon@arm.com>
-
Boris Ostrovsky authored
Commit 77e32c89 ("clockevents: Manage device's state separately for the core") decouples clockevent device's modes from states. With this change when a Xen guest tries to resume, it won't be calling its set_mode op which needs to be done on each VCPU in order to make the hypervisor aware that we are in oneshot mode. This happens because clockevents_tick_resume() (which is an intermediate step of resuming ticks on a processor) doesn't call clockevents_set_state() anymore and because during suspend clockevent devices on all VCPUs (except for the one doing the suspend) are left in ONESHOT state. As result, during resume the clockevents state machine will assume that device is already where it should be and doesn't need to be updated. To avoid this problem we should suspend ticks on all VCPUs during suspend. Signed-off-by:
Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by:
David Vrabel <david.vrabel@citrix.com>
-
- Apr 29, 2015
-
-
Daniel Axtens authored
Load the PowerNV platform pci controller ops into pci controllers after all the operations are loaded into the platform ops struct, not before. Otherwise we aren't actually setting the ops properly which can break IO for some devices. Fixes: 65ebf4b6 ("powerpc/powernv: Move controller ops from ppc_md to controller_ops") Reported-by:
Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by:
Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by:
Daniel Axtens <dja@axtens.net> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-
Michael Ellerman authored
Commit 34cb7954 "Convert ICS mutex lock to spin lock" added an include of asm/spinlock.h, which does not work in the SMP=n case. It should instead include linux/spinlock.h Fixes: 34cb7954 ("KVM: PPC: Book3S HV: Convert ICS mutex lock to spin lock") Acked-by:
Paul Mackerras <paulus@samba.org> Reviewed-by:
Alexander Graf <agraf@suse.de> Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au>
-
- Apr 28, 2015
-
-
Chris Metcalf authored
The code accidentally used cpu_isset() previously in one place (though properly node_isset() elsewhere). Signed-off-by:
Chris Metcalf <cmetcalf@ezchip.com>
-
- Apr 27, 2015
-
-
Paolo Bonzini authored
This reverts commits 0a4e6be9 and 80f7fdb1. The task migration notifier was originally introduced in order to support the pvclock vsyscall with non-synchronized TSC, but KVM only supports it with synchronized TSC. Hence, on KVM the race condition is only needed due to a bad implementation on the host side, and even then it's so rare that it's mostly theoretical. As far as KVM is concerned it's possible to fix the host, avoiding the additional complexity in the vDSO and the (re)introduction of the task migration notifier. Xen, on the other hand, hasn't yet implemented vsyscall support at all, so we do not care about its plans for non-synchronized TSC. Reported-by:
Peter Zijlstra <peterz@infradead.org> Suggested-by:
Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by:
Paolo Bonzini <pbonzini@redhat.com>
-