  1. Jun 23, 2017
  2. Apr 15, 2017
    • x86/apic/timer: Set ->min_delta_ticks and ->max_delta_ticks · 747d04b3
      Nicolai Stange authored
      
      In preparation for making the clockevents core NTP correction aware,
      all clockevent device drivers must set ->min_delta_ticks and
      ->max_delta_ticks rather than ->min_delta_ns and ->max_delta_ns: a
      clockevent device's rate is going to change dynamically and thus, the
      ratio of ns to ticks ceases to stay invariant.
      
      Make the x86 arch's apic clockevent driver initialize these fields
      properly.
      
      This patch alone doesn't introduce any change in functionality as the
      clockevents core still looks exclusively at the (untouched) ->min_delta_ns
      and ->max_delta_ns. As soon as this has changed, a followup patch will
      purge the initialization of ->min_delta_ns and ->max_delta_ns from this
      driver.
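
      As a rough illustration of why a limit expressed in ns cannot stay fixed, the
      sketch below converts nanoseconds to device ticks with a mult/shift pair. The
      helper name ns_to_ticks() and the sample values are assumptions made for this
      example; they are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a duration in nanoseconds to device ticks
 * with a fixed-point scale factor, ticks = (ns * mult) >> shift.
 * The caller must keep ns * mult below 2^64.
 */
static uint64_t ns_to_ticks(uint64_t ns, uint32_t mult, uint32_t shift)
{
	assert(shift < 64);
	return (ns * mult) >> shift;
}
```

      With shift = 20, a mult of 1 << 20 maps 1000000 ns to 1000000 ticks, while a
      slightly adjusted mult of (1 << 20) + 1024 maps the same 1000000 ns to 1000976
      ticks. A bound stored in ticks is unaffected by such an adjustment.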
      
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Richard Cochran <richardcochran@gmail.com>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: x86@kernel.org
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Signed-off-by: Nicolai Stange <nicstange@gmail.com>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
  3. Mar 22, 2017
  4. Mar 14, 2017
  5. Mar 11, 2017
    • x86/acpi: Restore the order of CPU IDs · 2b85b3d2
      Dou Liyang authored
      
      The following commits:
      
        f7c28833 ("x86/acpi: Enable acpi to register all possible cpus at
      boot time") and 8f54969d ("x86/acpi: Introduce persistent storage
      for cpuid <-> apicid mapping")
      
      ... registered all the possible CPUs at boot time via ACPI tables to
      make the mapping of cpuid <-> apicid fixed. Both enabled and disabled
      CPUs could have a logical CPU ID after boot time.
      
      But ACPI tables are unreliable: the number and order of Local APIC
      entries, which depend on the firmware, are often inconsistent with the
      physical devices. Even if they are consistent, the disabled CPUs, which
      take up some logical CPU IDs, also make the order discontinuous.

      Revert the registration of disabled CPUs, but keep the allocation
      logic of logical CPU IDs and some of the code location changes.
      
      Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Tested-by: Xiaolong Ye <xiaolong.ye@intel.com>
      Cc: rjw@rjwysocki.net
      Cc: linux-acpi@vger.kernel.org
      Cc: guzheng1@huawei.com
      Cc: izumi.taku@jp.fujitsu.com
      Cc: lenb@kernel.org
      Link: http://lkml.kernel.org/r/1488528147-2279-4-git-send-email-douly.fnst@cn.fujitsu.com
      
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  6. Mar 01, 2017
  7. Feb 11, 2017
  8. Feb 09, 2017
    • Revert "x86/ioapic: Restore IO-APIC irq_chip retrigger callback" · d966564f
      Linus Torvalds authored
      
      This reverts commit 020eb3da.
      
      Gabriel C reports that it causes his machine to not boot, and we haven't
      tracked down the reason for it yet.  Since the bug it fixes has been
      around for a longish time, we're better off reverting the fix for now.
      
      Gabriel says:
       "It hangs early and freezes with a lot RCU warnings.
      
        I bisected it down to :
      
        > Ruslan Ruslichenko (1):
        >       x86/ioapic: Restore IO-APIC irq_chip retrigger callback
      
        Reverting this one fixes the problem for me..
      
        The box is a PRIMERGY TX200 S5 , 2 socket , 2 x E5520 CPU(s) installed"
      
      and Ruslan and Thomas are currently stumped.
      
      Reported-and-bisected-by: Gabriel C <nix.or.die@gmail.com>
      Cc: Ruslan Ruslichenko <rruslich@cisco.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@kernel.org   # for the backport of the original commit
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  9. Feb 07, 2017
  10. Feb 01, 2017
  11. Jan 29, 2017
    • x86/boot/e820: Rename e820_reserve_resources*() to e820__reserve_resources*() · 1506c8dc
      Ingo Molnar authored
      
      Also do some minor cleanups.
      
      No change in functionality.
      
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Huang, Ying <ying.huang@intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul Jackson <pj@sgi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  12. Jan 28, 2017
    • x86/boot/e820: Remove spurious asm/e820/api.h inclusions · 5520b7e7
      Ingo Molnar authored
      
      A commonly used lowlevel x86 header, asm/pgtable.h, includes asm/e820/api.h
      spuriously, without making direct use of it.
      
      Removing it is not simple: over the years various .c code learned to rely
      on this indirect inclusion.
      
      Remove the unnecessary include - this should speed up the kernel build a bit,
      as a large header is not included anymore in totally unrelated code.
      
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Huang, Ying <ying.huang@intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul Jackson <pj@sgi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/boot/e820: Move asm/e820.h to asm/e820/api.h · 66441bd3
      Ingo Molnar authored

      In line with asm/e820/types.h, move the e820 API declarations to
      asm/e820/api.h and update all usage sites.
      
      This is just a mechanical, obviously correct move & replace patch,
      there will be subsequent changes to clean up the code and to make
      better use of the new header organization.
      
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Huang, Ying <ying.huang@intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul Jackson <pj@sgi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wei Yang <richard.weiyang@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel....
  13. Jan 20, 2017
  14. Jan 18, 2017
  15. Jan 14, 2017
  16. Jan 09, 2017
  17. Jan 06, 2017
  18. Jan 05, 2017
    • x86/irq, trace: Add __irq_entry annotation to x86's platform IRQ handlers · c4158ff5
      Daniel Bristot de Oliveira authored
      
      This patch adds the __irq_entry annotation to the default x86
      platform IRQ handlers. ftrace's function_graph tracer uses the
      __irq_entry annotation to notify the entry and return of IRQ
      handlers.
      
      For example, before the patch:
        354549.667252 |   3)  d..1              |  default_idle_call() {
        354549.667252 |   3)  d..1              |    arch_cpu_idle() {
        354549.667253 |   3)  d..1              |      default_idle() {
        354549.696886 |   3)  d..1              |        smp_trace_reschedule_interrupt() {
        354549.696886 |   3)  d..1              |          irq_enter() {
        354549.696886 |   3)  d..1              |            rcu_irq_enter() {
      
      After the patch:
        366416.254476 |   3)  d..1              |    arch_cpu_idle() {
        366416.254476 |   3)  d..1              |      default_idle() {
        366416.261566 |   3)  d..1  ==========> |
        366416.261566 |   3)  d..1              |        smp_trace_reschedule_interrupt() {
        366416.261566 |   3)  d..1              |          irq_enter() {
        366416.261566 |   3)  d..1              |            rcu_irq_enter() {
      
      KASAN also uses this annotation. The smp_apic_timer_interrupt()
      was already annotated.
      
      Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
      Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Claudio Fontana <claudio.fontana@huawei.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Cc: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Nicolai Stange <nicstange@gmail.com>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Wanpeng Li <wanpeng.li@hotmail.com>
      Cc: linux-edac@vger.kernel.org
      Link: http://lkml.kernel.org/r/059fdf437c2f0c09b13c18c8fe4e69999d3ffe69.1483528431.git.bristot@redhat.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  19. Dec 25, 2016
  20. Dec 13, 2016
    • x86/smpboot: Make logical package management more robust · 9d85eb91
      Thomas Gleixner authored
      
      The logical package management has several issues:
      
       - The APIC ids provided by ACPI are not required to be the same as the
         initial APIC id which can be retrieved by CPUID. The APIC ids provided
         by ACPI are those which are written by the BIOS into the APIC. The
         initial id is set by hardware and can not be changed. The hardware
         provided ids contain the real hardware package information.
      
         Especially AMD sets the effective APIC id different from the hardware id
         as they need to reserve space for the IOAPIC ids starting at id 0.
      
         As a consequence those machines trigger the currently active firmware
         bug printouts in dmesg. These are obviously wrong.
      
       - Virtual machines have their own interesting ways of enumerating APICs and
         packages which are not reliably covered by the current implementation.
      
      The sizing of the mapping array has been tweaked to be generously large to
      handle systems which provide a wrong core count when HT is disabled so the
      whole magic which checks for space in the physical hotplug case is not
      needed anymore.
      
      Simplify the whole machinery and do the mapping when the CPU starts and the
      CPUID derived physical package information is available. This solves the
      observed problems on AMD machines and works for the virtualization issues
      as well.
      
      Remove the extra call from XEN cpu bringup code as it is no longer
      required.
      
      Fixes: d49597fd ("x86/cpu: Deal with broken firmware (VMWare/XEN)")
      Reported-and-tested-by: Borislav Petkov <bp@suse.de>
      Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: M. Vefa Bicakci <m.v.b@runbox.com>
      Cc: xen-devel <xen-devel@lists.xen.org>
      Cc: Charles (Chas) Williams <ciwillia@brocade.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1612121102260.3429@nanos
      
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  21. Dec 10, 2016
  22. Nov 24, 2016
    • x86/apic/uv: Silence a shift wrapping warning · c4597fd7
      Dan Carpenter authored
      
      'm_io' is stored in 6 bits so it's a number in the 0-63 range.  Static
      analysis tools complain that 1 << 63 will wrap so I have changed it to
      1ULL << m_io.
      
      This code is over three years old so presumably the bug doesn't happen
      very frequently in real life or someone would have complained by now.
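
      A minimal sketch of the difference follows. The helper name bit_for_shift()
      is an assumption made for this example and does not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: turn a 6-bit shift count into a 64-bit value. */
static uint64_t bit_for_shift(unsigned int m_io)
{
	/* A 6-bit field can only hold values in the 0-63 range. */
	assert(m_io <= 63);

	/*
	 * 1ULL is at least 64 bits wide, so every shift count up to 63 is
	 * well defined. With a plain int constant (1 << m_io) the shift is
	 * done in int, which is undefined once the result no longer fits,
	 * i.e. for counts of 31 or more where int is 32 bits wide.
	 */
	return 1ULL << m_io;
}
```

      For example, bit_for_shift(63) yields 0x8000000000000000, a value that a
      32-bit int cannot represent.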
      
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Alex Thorlton <athorlton@sgi.com>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Travis <travis@sgi.com>
      Cc: Nathan Zimmer <nzimmer@sgi.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: kernel-janitors@vger.kernel.org
      Fixes: b15cc4a1 ("x86, uv, uv3: Update x2apic Support for SGI UV3")
      Link: http://lkml.kernel.org/r/20161123221908.GA23997@mwanda
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  23. Nov 10, 2016
    • x86/apic: Prevent tracing on apic_msr_write_eoi() · 8ca22552
      Wanpeng Li authored
      
      The following RCU lockdep warning led to adding irq_enter()/irq_exit() into
      smp_reschedule_interrupt():
      
       RCU used illegally from idle CPU!
       rcu_scheduler_active = 1, debug_locks = 0
       RCU used illegally from extended quiescent state!
       no locks held by swapper/1/0.
       
        do_trace_write_msr
        native_write_msr
        native_apic_msr_eoi_write
        smp_reschedule_interrupt
        reschedule_interrupt
      
      As Peterz pointed out:
      
      | So now we're making a very frequent interrupt slower because of debug 
      | code.
      |
      | The thing is, many many smp_reschedule_interrupt() invocations don't
      | actually execute anything much at all and are only sent to tickle the
      | return to user path (which does the actual preemption).
      | 
      | Having to do the whole irq_enter/irq_exit dance just for this unlikely
      | debug case totally blows.
      
      Use the wrmsr_notrace() variant in native_apic_msr_eoi_write(), annotate the
      kvm variant with notrace and add a native_apic_eoi callback to the apic
      structure so KVM guests are covered as well.
      
      This allows reverting the irq_enter/irq_exit dance in
      smp_reschedule_interrupt().
      
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Acked-by: Paolo Bonzini <pbonzini@redhat.com>
      Cc: kvm@vger.kernel.org
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Borislav Petkov <bp@alien8.de>
      Link: http://lkml.kernel.org/r/1478488420-5982-3-git-send-email-wanpeng.li@hotmail.com
      
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  24. Oct 08, 2016
    • x86/apic: Prevent pointless warning messages · df610d67
      Thomas Gleixner authored
      
      Markus reported that he sees new warnings:
      
        APIC: NR_CPUS/possible_cpus limit of 4 reached.  Processor 4/0x84 ignored.
        APIC: NR_CPUS/possible_cpus limit of 4 reached.  Processor 5/0x85 ignored.
      
      This comes from the recent persistent cpuid <-> nodeid changes. The code
      which emits the warning has been called prior to these changes only for
      enabled processors. Now it's called for disabled processors as well to get
      the possible cpu accounting correct. So if the kernel is compiled for the
      number of actual available/enabled CPUs and the BIOS reports disabled CPUs
      as well then the above warnings are printed.
      
      That's a pointless exercise as it only makes sense if there are more CPUs
      enabled than the kernel supports.
      
      Make the warning conditional on enabled processors so we are back to the
      state before these changes.
      
      Fixes: 8f54969d ("x86/acpi: Introduce persistent storage for cpuid <-> apicid mapping") 
      Reported-and-tested-by: Markus Trippelsdorf <markus@trippelsdorf.de>
      Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: linux-acpi@vger.kernel.org
      Cc: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610071549330.19804@nanos
      
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • nmi_backtrace: add more trigger_*_cpu_backtrace() methods · 9a01c3ed
      Chris Metcalf authored

      Patch series "improvements to the nmi_backtrace code" v9.
      
      This patch series modifies the trigger_xxx_backtrace() NMI-based remote
      backtracing code to make it more flexible, and makes a few small
      improvements along the way.
      
      The motivation comes from the task isolation code, where there are
      scenarios where we want to be able to diagnose a case where some cpu is
      about to interrupt a task-isolated cpu.  It can be helpful to see both
      where the interrupting cpu is, and also an approximation of where the
      cpu that is being interrupted is.  The nmi_backtrace framework allows us
      to discover the stack of the interrupted cpu.
      
      I've tested that the change works as desired on tile, and build-tested
      x86, arm, mips, and sparc64.  For x86 I confirmed that the generic
      cpuidle stuff as well as the architecture-specific routines are in the
      new cpuidle section.  For arm, mips, and sparc I just build-tested it
      and made sure the generic cpuidle routines were in the new cpuidle
      section, but I didn't attempt to figure out which the platform-specific
      idle routines might be.  That might be more usefully done by someone
      with platform experience in follow-up patches.
      
      This patch (of 4):
      
      Currently you can only request a backtrace of either all cpus, or all
      cpus but yourself.  It can also be helpful to request a remote backtrace
      of a single cpu, and since we want that, the logical extension is to
      support a cpumask as the underlying primitive.
      
      This change modifies the existing lib/nmi_backtrace.c code to take a
      cpumask as its basic primitive, and modifies the linux/nmi.h code to use
      the new "cpumask" method instead.
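
      The idea of building the all, all-but-self and single-cpu requests on one
      mask-based primitive can be sketched as follows. Everything here (the toy_
      names, the 64-bit mask type, recording cpu numbers instead of sending NMIs)
      is an assumption made for illustration and is not the lib/nmi_backtrace.c
      code.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_CPUS 64

/* Toy cpumask for this sketch: one bit per CPU, at most 64 CPUs. */
typedef uint64_t toy_cpumask_t;

static toy_cpumask_t toy_mask_of(unsigned int cpu)
{
	assert(cpu < TOY_MAX_CPUS);
	return 1ULL << cpu;
}

static toy_cpumask_t toy_mask_all(unsigned int nr_cpus)
{
	assert(nr_cpus <= TOY_MAX_CPUS);
	return nr_cpus == TOY_MAX_CPUS ? ~0ULL : (1ULL << nr_cpus) - 1;
}

/*
 * The single primitive: visit every CPU in the mask. A real
 * implementation would send an NMI; this sketch only records the CPU
 * numbers in out[] (room for TOY_MAX_CPUS entries is required) and
 * returns how many CPUs were selected.
 */
static unsigned int toy_trigger_cpumask(toy_cpumask_t mask, unsigned int *out)
{
	unsigned int n = 0;

	for (unsigned int cpu = 0; cpu < TOY_MAX_CPUS; cpu++)
		if (mask & (1ULL << cpu))
			out[n++] = cpu;
	return n;
}

/* The older request styles become thin wrappers around the primitive. */
static unsigned int toy_trigger_all(unsigned int nr_cpus, unsigned int *out)
{
	return toy_trigger_cpumask(toy_mask_all(nr_cpus), out);
}

static unsigned int toy_trigger_allbutself(unsigned int nr_cpus,
					   unsigned int self, unsigned int *out)
{
	return toy_trigger_cpumask(toy_mask_all(nr_cpus) & ~toy_mask_of(self), out);
}

static unsigned int toy_trigger_single(unsigned int cpu, unsigned int *out)
{
	return toy_trigger_cpumask(toy_mask_of(cpu), out);
}
```

      With the mask as the basic unit, adding a new request style only needs a
      new way to build a mask, not a new code path in the backtrace logic.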
      
      The existing clients of nmi_backtrace (arm and x86) are converted to
      using the new cpumask approach in this change.
      
      The other users of the backtracing API (sparc64 and mips) are converted
      to use the cpumask approach rather than the all/allbutself approach.
      The mips code ignored the "include_self" boolean but with this change it
      will now also dump a local backtrace if requested.
      
      Link: http://lkml.kernel.org/r/1472487169-14923-2-git-send-email-cmetcalf@mellanox.com
      
      
      Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
      Tested-by: Daniel Thompson <daniel.thompson@linaro.org> [arm]
      Reviewed-by: Aaron Tomlin <atomlin@redhat.com>
      Reviewed-by: Petr Mladek <pmladek@suse.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: David Miller <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  25. Oct 04, 2016
    • x86/irq: Prevent force migration of irqs which are not in the vector domain · db91aa79
      Mika Westerberg authored
      
      When a CPU is about to be offlined we call fixup_irqs() that resets IRQ
      affinities related to the CPU in question. The same thing is also done when
      the system is suspended to S-states like S3 (mem).
      
      For each IRQ we try to complete any on-going move regardless of whether the
      IRQ is actually part of x86_vector_domain. For each IRQ descriptor we fetch
      its chip_data, assume it is of type struct apic_chip_data and manipulate it
      by clearing old_domain mask etc. irq_chips that are not part of the
      x86_vector_domain, like those created by various GPIO drivers, will find
      their chip_data changed unexpectedly.
      
      Below is an example where GPIO chip owned by pinctrl-sunrisepoint.c gets
      corrupted after resume:
      
        # cat /sys/kernel/debug/gpio
        gpiochip0: GPIOs 360-511, parent: platform/INT344B:00, INT344B:00:
         gpio-511 (                    |sysfs               ) in  hi
      
        # rtcwake -s10 -mmem
        <10 seconds passes>
      
        # cat /sys/kernel/debug/gpio
        gpiochip0: GPIOs 360-511, parent: platform/INT344B:00, INT344B:00:
         gpio-511 (                    |sysfs               ) in  ?
      
      Note '?' in the output. It means the struct gpio_chip ->get function is
      NULL whereas before suspend it was there.
      
      Fix this by first checking that the IRQ belongs to x86_vector_domain before
      we try to use the chip_data as struct apic_chip_data.
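
      The shape of such a check can be sketched as below. All types and names
      (toy_domain, toy_irq_data, toy_force_complete_move() and so on) are
      stand-ins invented for this example; they are not the kernel's structures
      or the code of the fix.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types for this sketch only. */
struct toy_domain {
	const char *name;
};

struct toy_apic_chip_data {
	int move_in_progress;
};

struct toy_irq_data {
	const struct toy_domain *domain;
	void *chip_data;	/* meaning depends on the owning domain */
};

static const struct toy_domain toy_vector_domain = { "vector" };

/*
 * Touch chip_data only when the IRQ belongs to the vector domain.
 * Returns 1 if the data was updated, 0 if the IRQ was left alone.
 */
static int toy_force_complete_move(struct toy_irq_data *d)
{
	struct toy_apic_chip_data *data;

	assert(d != NULL);

	/* Foreign domains keep their own private data behind chip_data. */
	if (d->domain != &toy_vector_domain)
		return 0;

	data = d->chip_data;
	if (data == NULL)
		return 0;

	data->move_in_progress = 0;
	return 1;
}
```

      Without the domain comparison, the cast would reinterpret whatever a GPIO
      driver stored behind chip_data and overwrite part of it.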
      
      Reported-and-tested-by: Sakari Ailus <sakari.ailus@linux.intel.com>
      Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Cc: stable@vger.kernel.org # 4.4+
      Link: http://lkml.kernel.org/r/20161003101708.34795-1-mika.westerberg@linux.intel.com
      
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  26. Sep 27, 2016
  27. Sep 22, 2016
    • x86/acpi: Introduce persistent storage for cpuid <-> apicid mapping · 8f54969d
      Gu Zheng authored
      
      This patch set aims to make the cpuid <-> nodeid mapping persistent, so that
      caches based on that mapping, such as wq_numa_possible_cpumask, do not cause
      problems when nodes go online or offline.

      It contains 4 steps:
      1. Enable the apic registration flow to handle both enabled and disabled cpus.
      2. Introduce a new array storing all possible cpuid <-> apicid mapping.
      3. Enable the _MAT and MADT related APIs to return non-present or disabled cpus' apicid.
      4. Establish all possible cpuid <-> nodeid mapping.
      
      This patch finishes step 2.
      
      In this patch, we introduce a new static array named cpuid_to_apicid[],
      which is large enough to store info for all possible cpus.
      
      Then we modify how the cpuid is calculated. Currently, generic_processor_info()
      simply picks the next unused cpuid, which is why the cpuid <-> nodeid
      mapping changes with node hotplug.
      
      After this patch, we find the next unused cpuid, map it to an apicid,
      and store the mapping in cpuid_to_apicid[], so that cpuid <-> apicid
      mapping will be persistent.
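
      A minimal sketch of such a persistent allocation is shown below. Only the
      array name cpuid_to_apicid[] comes from the description above; the table
      size, the counter and the function toy_allocate_logical_cpuid() are
      assumptions made for this example.

```c
#include <assert.h>

#define TOY_NR_CPUS 4	/* table size chosen for the example */

/* Slots below nr_logical_cpuids hold an apicid and are never released. */
static int cpuid_to_apicid[TOY_NR_CPUS];
static int nr_logical_cpuids;

/*
 * Return the logical cpuid for an apicid. An apicid that was seen before
 * gets its old cpuid back; a new one takes the next unused slot.
 * Returns -1 when the table is full.
 */
static int toy_allocate_logical_cpuid(int apicid)
{
	int i;

	assert(apicid >= 0);

	for (i = 0; i < nr_logical_cpuids; i++)
		if (cpuid_to_apicid[i] == apicid)
			return i;

	if (nr_logical_cpuids >= TOY_NR_CPUS)
		return -1;

	cpuid_to_apicid[nr_logical_cpuids] = apicid;
	return nr_logical_cpuids++;
}
```

      Because entries are never cleared, a CPU that is removed and added again
      with the same apicid ends up with the same cpuid.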
      
      And finally we will use this array to make cpuid <-> nodeid persistent.
      
      Until now, the cpuid <-> apicid mapping has been established at local apic
      registration time, with non-present or disabled cpus ignored.
      
      In this patch, we establish all possible cpuid <-> apicid mapping when
      registering local apic.
      
      Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
      Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
      Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: mika.j.penttila@gmail.com
      Cc: len.brown@intel.com
      Cc: rafael@kernel.org
      Cc: rjw@rjwysocki.net
      Cc: yasu.isimatu@gmail.com
      Cc: linux-mm@kvack.org
      Cc: linux-acpi@vger.kernel.org
      Cc: isimatu.yasuaki@jp.fujitsu.com
      Cc: gongzhaogang@inspur.com
      Cc: tj@kernel.org
      Cc: izumi.taku@jp.fujitsu.com
      Cc: cl@linux.com
      Cc: chen.tang@easystack.cn
      Cc: akpm@linux-foundation.org
      Cc: kamezawa.hiroyu@jp.fujitsu.com
      Cc: lenb@kernel.org
      Link: http://lkml.kernel.org/r/1472114120-3281-4-git-send-email-douly.fnst@cn.fujitsu.com
      
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86/acpi: Enable acpi to register all possible cpus at boot time · f7c28833
      Gu Zheng authored
      
      The cpuid <-> nodeid mapping is first established at boot time, and workqueue
      caches the mapping in wq_numa_possible_cpumask in wq_numa_init() at boot time.
      
      When doing node online/offline, cpuid <-> nodeid mapping is established/destroyed,
      which means, cpuid <-> nodeid mapping will change if node hotplug happens. But
      workqueue does not update wq_numa_possible_cpumask.
      
      So here is the problem:
      
      Assume we have the following cpuid <-> nodeid in the beginning:
      
        Node | CPU
      ------------------------
      node 0 |  0-14, 60-74
      node 1 | 15-29, 75-89
      node 2 | 30-44, 90-104
      node 3 | 45-59, 105-119
      
      and we hot-remove node2 and node3, it becomes:
      
        Node | CPU
      ------------------------
      node 0 |  0-14, 60-74
      node 1 | 15-29, 75-89
      
      and we hot-add node4 and node5, it becomes:
      
        Node | CPU
      ------------------------
      node 0 |  0-14, 60-74
      node 1 | 15-29, 75-89
      node 4 | 30-59
      node 5 | 90-119
      
      But in wq_numa_possible_cpumask, cpu30 is still mapped to node2, and so on.
      
      When a pool workqueue is initialized, if its cpumask belongs to a node, its
      pool->node will be mapped to that node. And memory used by this workqueue will
      also be allocated on that node.
      
      static struct worker_pool *get_unbound_pool(const struct workqueue_attrs *attrs){
      ...
              /* if cpumask is contained inside a NUMA node, we belong to that node */
              if (wq_numa_enabled) {
                      for_each_node(node) {
                              if (cpumask_subset(pool->attrs->cpumask,
                                                 wq_numa_possible_cpumask[node])) {
                                      pool->node = node;
                                      break;
                              }
                      }
              }
      
      Since wq_numa_possible_cpumask is not updated, it could be mapped to an offline node,
      which will lead to memory allocation failure:
      
       SLUB: Unable to allocate memory on node 2 (gfp=0x80d0)
        cache: kmalloc-192, object size: 192, buffer size: 192, default order: 1, min order: 0
        node 0: slabs: 6172, objs: 259224, free: 245741
        node 1: slabs: 3261, objs: 136962, free: 127656
      
      It happens here:
      
      create_worker(struct worker_pool *pool)
       |--> worker = alloc_worker(pool->node);
      
      static struct worker *alloc_worker(int node)
      {
              struct worker *worker;
      
              worker = kzalloc_node(sizeof(*worker), GFP_KERNEL, node); /* <-- the wrong node is used here */
      
              ......
      
              return worker;
      }
      
      [Solution]
      
      There are four mappings in the kernel:
      1. nodeid (logical node id)   <->   pxm
      2. apicid (physical cpu id)   <->   nodeid
      3. cpuid (logical cpu id)     <->   apicid
      4. cpuid (logical cpu id)     <->   nodeid
      
      1. pxm (proximity domain) is provided by ACPI firmware in SRAT, and the nodeid <-> pxm
         mapping is set up at boot time. This mapping is persistent and won't change.

      2. The apicid <-> nodeid mapping is set up using the info in 1. It is set up at boot
         time and CPU hotadd time, and cleared at CPU hotremove time. This mapping is also
         persistent.

      3. The cpuid <-> apicid mapping is set up at boot time and CPU hotadd time. A cpuid is
         allocated, lower ids first, released at CPU hotremove time, and reused for other
         hotadded CPUs. So this mapping is not persistent.

      4. The cpuid <-> nodeid mapping is also set up at boot time and CPU hotadd time, and
         cleared at CPU hotremove time. As a result of 3, this mapping is not persistent.
      
      To fix this problem, we establish cpuid <-> nodeid mapping for all the possible
      cpus at boot time, and make it persistent. And according to init_cpu_to_node(),
      cpuid <-> nodeid mapping is based on apicid <-> nodeid mapping and cpuid <-> apicid
      mapping. So the key point is obtaining all cpus' apicid.
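
      How the cpuid <-> nodeid mapping follows from the other two can be sketched
      as a composition of two lookups. The types, names and the -1 "no node"
      value below are assumptions made for this example; this is not the code of
      init_cpu_to_node().

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NO_NODE (-1)

/* One apicid <-> nodeid pair, as it would be derived from firmware data. */
struct toy_apic_node {
	int apicid;
	int node;
};

/* Look up the node of an apicid; TOY_NO_NODE if the apicid is unknown. */
static int toy_apicid_to_node(const struct toy_apic_node *map, size_t n,
			      int apicid)
{
	for (size_t i = 0; i < n; i++)
		if (map[i].apicid == apicid)
			return map[i].node;
	return TOY_NO_NODE;
}

/*
 * cpuid -> nodeid is the composition cpuid -> apicid -> nodeid, so it is
 * only as stable as the cpuid -> apicid table passed in.
 */
static int toy_cpu_to_node(const int *cpu_to_apicid, size_t nr_cpus,
			   const struct toy_apic_node *map, size_t n,
			   size_t cpu)
{
	assert(cpu_to_apicid != NULL && map != NULL);

	if (cpu >= nr_cpus)
		return TOY_NO_NODE;
	return toy_apicid_to_node(map, n, cpu_to_apicid[cpu]);
}
```

      If the cpuid -> apicid table never changes, the composed result never
      changes either, which is why obtaining every cpu's apicid is the key step.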
      
      apicid can be obtained by _MAT (Multiple APIC Table Entry) method or found in
      MADT (Multiple APIC Description Table). So we finish the job in the following steps:
      
      1. Enable the apic registration flow to handle both enabled and disabled cpus.
         This is done by introducing an extra parameter to generic_processor_info to let the
         caller control if disabled cpus are ignored.
      
      2. Introduce a new array storing all possible cpuid <-> apicid mapping. And also modify
         the way cpuid is calculated. Establish all possible cpuid <-> apicid mapping when
         registering local apic. Store the mapping in this array.
      
      3. Enable the _MAT and MADT related APIs to return non-present or disabled cpus' apicid.
         This is also done by introducing an extra parameter to these apis to let the caller
         control if disabled cpus are ignored.
      
      4. Establish all possible cpuid <-> nodeid mapping.
         This is done via an additional acpi namespace walk for processors.
      
      This patch finishes step 1.
      
      Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
      Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
      Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: mika.j.penttila@gmail.com
      Cc: len.brown@intel.com
      Cc: rafael@kernel.org
      Cc: rjw@rjwysocki.net
      Cc: yasu.isimatu@gmail.com
      Cc: linux-mm@kvack.org
      Cc: linux-acpi@vger.kernel.org
      Cc: isimatu.yasuaki@jp.fujitsu.com
      Cc: gongzhaogang@inspur.com
      Cc: tj@kernel.org
      Cc: izumi.taku@jp.fujitsu.com
      Cc: cl@linux.com
      Cc: chen.tang@easystack.cn
      Cc: akpm@linux-foundation.org
      Cc: kamezawa.hiroyu@jp.fujitsu.com
      Cc: lenb@kernel.org
      Link: http://lkml.kernel.org/r/1472114120-3281-3-git-send-email-douly.fnst@cn.fujitsu.com
      
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  28. Sep 20, 2016