  1. Oct 27, 2022
  2. Jul 19, 2021
    • x86/apic: Mark _all_ legacy interrupts when IO/APIC is missing · a9c92fd4
      Thomas Gleixner authored
      
      stable inclusion
      from linux-4.19.194
      commit 7e25cb1b22f81239ae3332e14a1d0cff7014bccd
      
      --------------------------------
      
      commit 7d65f9e80646c595e8c853640a9d0768a33e204c upstream.
      
      PIC interrupts do not support affinity setting and they can end up on
      any online CPU. Therefore, it's required to mark the associated vectors
      as system-wide reserved. Otherwise, the corresponding irq descriptors
      are copied to the secondary CPUs but the vectors are not marked as
      assigned or reserved. This works correctly for the IO/APIC case.
      
      When the IO/APIC is disabled via config, kernel command line or lack of
      enumeration then all legacy interrupts are routed through the PIC, but
      nothing marks them as system-wide reserved vectors.
      
      As a consequence, a subsequent allocation on a secondary CPU can result in
      allocating one of these vectors, which triggers the BUG() in
      apic_update_vector() because the interrupt descriptor slot is not empty.
      
      Imran tried to work around that by marking those interrupts as allocated
      when a CPU comes online. But that's wrong in case that the IO/APIC is
      available and one of the legacy interrupts, e.g. IRQ0, has been switched to
      PIC mode because then marking them as allocated will fail as they are
      already marked as system vectors.
      
      Stay consistent and update the legacy vectors after attempting IO/APIC
      initialization and mark them as system vectors in case that no IO/APIC is
      available.
      
      Fixes: 69cde000 ("x86/vector: Use matrix allocator for vector assignment")
      Reported-by: Imran Khan <imran.f.khan@oracle.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20210519233928.2157496-1-imran.f.khan@oracle.com
      
      
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
  3. May 26, 2021
    • x86/apic/vector: Force interrupt handler invocation to irq context · 3e134563
      Thomas Gleixner authored
      
      mainline inclusion
      from mainline-5.7
      commit 008f1d60
      category: bugfix
      bugzilla: NA
      CVE: NA
      
      -------------------------------------------------
      
      Sathyanarayanan reported that the PCI-E AER error injection mechanism
      can result in a NULL pointer dereference in apic_ack_edge():
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
       RIP: 0010:apic_ack_edge+0x1e/0x40
       Call Trace:
         handle_edge_irq+0x7d/0x1e0
         generic_handle_irq+0x27/0x30
         aer_inject_write+0x53a/0x720
      
      It crashes in irq_complete_move() which dereferences get_irq_regs() which
      is obviously NULL when this is called from non interrupt context.
      
      Of course the pointer could be checked, but that just papers over the real
      issue. Invoking the low level interrupt handling mechanism from random code
      can wreck the fragile interrupt affinity mechanism of x86 as interrupts
      can only be moved in interrupt context or with special care when a CPU goes
      offline and the move has to be enforced.
      
      In the best case this triggers the warning in the MSI affinity setter, but
      if the call happens on the correct CPU it just corrupts state and might
      prevent further interrupt delivery for the affected device.
      
      Mark the APIC interrupts as unsuitable for being invoked in random contexts.
      
      This prevents the AER injection from proliferating the wreckage, but that's
      less broken than the current state of affairs and more correct than just
      papering over the problem by sprinkling random checks all over the place
      and silently corrupting state.
      
      Reported-by: <sathyanarayanan.kuppuswamy@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/20200306130623.684591280@linutronix.de
      
      
      Signed-off-by: Liao Chang <liaochang1@huawei.com>
      Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
  4. Sep 22, 2020
    • genirq/affinity: Make affinity setting if activated opt-in · 7cf94405
      Thomas Gleixner authored
      
      stable inclusion
      from linux-4.19.141
      commit 5c4d9eefd314e763dcb2a499797176c17ad6ab69
      
      --------------------------------
      
      commit f0c7baca upstream.
      
      John reported that on a RK3288 system the perf per CPU interrupts are all
      affine to CPU0 and provided the analysis:
      
       "It looks like what happens is that because the interrupts are not per-CPU
        in the hardware, armpmu_request_irq() calls irq_force_affinity() while
        the interrupt is deactivated and then request_irq() with IRQF_PERCPU |
        IRQF_NOBALANCING.
      
        Now when irq_startup() runs with IRQ_STARTUP_NORMAL, it calls
        irq_setup_affinity() which returns early because IRQF_PERCPU and
        IRQF_NOBALANCING are set, leaving the interrupt on its original CPU."
      
      This was broken by the recent commit which blocked interrupt affinity
      setting in hardware before activation of the interrupt. While this works in
      general, it does not work for this particular case. As contrary to the
      initial analysis not all interrupt chip drivers implement an activate
      callback, the safe cure is to make the deferred interrupt affinity setting
      at activation time opt-in.
      
      Implement the necessary core logic and make the two irqchip implementations
      for which this is required opt-in. In hindsight this would have been the
      right thing to do, but ...
      
      Fixes: baedb87d ("genirq/affinity: Handle affinity setting on inactive interrupts correctly")
      Reported-by: John Keeping <john@metanate.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Marc Zyngier <maz@kernel.org>
      Acked-by: Marc Zyngier <maz@kernel.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/87blk4tzgm.fsf@nanos.tec.linutronix.de
      
      
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    • genirq/affinity: Handle affinity setting on inactive interrupts correctly · c998ccb2
      Thomas Gleixner authored
      
      stable inclusion
      from linux-4.19.134
      commit 2048e4375c552614d26a7191394d8a8398fe7a85
      
      --------------------------------
      
      commit baedb87d upstream.
      
      Setting interrupt affinity on inactive interrupts is inconsistent when
      hierarchical irq domains are enabled. The core code should just store the
      affinity and not call into the irq chip driver for inactive interrupts
      because the chip drivers may not be in a state to handle such requests.
      
      X86 has a hacky workaround for that, but no other irq chip has one, which
      causes problems e.g. on GIC V3 ITS.
      
      Instead of adding more ugly hacks all over the place, solve the problem in
      the core code. If the affinity is set on an inactive interrupt then:
      
          - Store it in the irq descriptor's affinity mask
          - Update the effective affinity to reflect that so user space has
            a consistent view
          - Don't call into the irq chip driver
      
      This is the core equivalent of the X86 workaround and works correctly
      because the affinity setting is established in the irq chip when the
      interrupt is activated later on.
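
The three rules above can be sketched in user space roughly as follows. This is a minimal model, not the kernel implementation: `struct toy_irq`, `toy_set_affinity()`, `toy_activate()` and `toy_chip_set_affinity()` are hypothetical names invented for illustration, and affinity masks are reduced to plain 64-bit integers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for per-interrupt state. */
struct toy_irq {
    bool     activated;          /* has the interrupt been activated? */
    uint64_t affinity;           /* requested affinity mask */
    uint64_t effective_affinity; /* mask reported to user space */
    int      chip_calls;         /* number of calls into the chip driver */
};

/* Stand-in for the irq chip driver's affinity callback. */
static void toy_chip_set_affinity(struct toy_irq *irq, uint64_t mask)
{
    irq->chip_calls++;
    irq->effective_affinity = mask;
}

/* Inactive interrupts only record the mask; the chip is left alone. */
static void toy_set_affinity(struct toy_irq *irq, uint64_t mask)
{
    irq->affinity = mask;
    if (!irq->activated) {
        /* Keep the user-visible view consistent without touching hardware. */
        irq->effective_affinity = mask;
        return;
    }
    toy_chip_set_affinity(irq, mask);
}

/* Activation establishes the previously stored affinity in the chip. */
static void toy_activate(struct toy_irq *irq)
{
    irq->activated = true;
    toy_chip_set_affinity(irq, irq->affinity);
}
```

In this model the chip callback is first reached from `toy_activate()`, which mirrors the statement that the affinity setting is established when the interrupt is activated later on.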
      
      Note, that this is only effective when hierarchical irq domains are enabled
      by the architecture. Doing it unconditionally would break legacy irq chip
      implementations.
      
      For hierarchical irq domains this works correctly as none of the drivers can
      have a dependency on affinity setting in inactive state by design.
      
      Remove the X86 workaround as it is no longer required.
      
      Fixes: 02edee15 ("x86/apic/vector: Ignore set_affinity call for inactive interrupts")
      Reported-by: Ali Saidi <alisaidi@amazon.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Ali Saidi <alisaidi@amazon.com>
      Cc: stable@vger.kernel.org
      Link: https://lore.kernel.org/r/20200529015501.15771-1-alisaidi@amazon.com
      Link: https://lkml.kernel.org/r/877dv2rv25.fsf@nanos.tec.linutronix.de
      
      
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
  5. Dec 27, 2019
    • x86/apic/vector: Warn when vector space exhaustion breaks affinity · b6e4f231
      Neil Horman authored and 谢秀奇 committed
      
      [ Upstream commit 743dac49 ]
      
      On x86, CPUs are limited in the number of interrupts they can have affined
      to them as they only support 256 interrupt vectors per CPU. 32 vectors are
      reserved for the CPU and the kernel reserves another 22 for internal
      purposes. That leaves 202 vectors for assignment to devices.
      
      When an interrupt is set up or the affinity is changed by the kernel or the
      administrator, the vector assignment code attempts to honor the requested
      affinity mask. If the vector space on the CPUs in that affinity mask is
      exhausted the code falls back to a wider set of CPUs and assigns a vector
      on a CPU outside of the requested affinity mask silently.
      
      While the effective affinity is reflected in the corresponding
      /proc/irq/$N/effective_affinity* files the silent breakage of the requested
      affinity can lead to unexpected behaviour for administrators.
      
      Add a pr_warn() when this happens so that administrators are at least
      informed about it in the syslog.
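
A rough user-space sketch of the vector budget and the fallback path described above. All `toy_`/`TOY_` names and the four-CPU limit are assumptions made for illustration, and the warning text is only indicative of what such a message could say, not the kernel's actual output.

```c
#include <stdbool.h>
#include <stdio.h>

#define TOY_NR_CPUS         4    /* assumed small system for the sketch */
#define TOY_NR_VECTORS      256  /* vectors per CPU */
#define TOY_CPU_RESERVED    32   /* reserved for the CPU */
#define TOY_KERNEL_RESERVED 22   /* reserved by the kernel */
#define TOY_DEVICE_VECTORS \
    (TOY_NR_VECTORS - TOY_CPU_RESERVED - TOY_KERNEL_RESERVED)

static int toy_used[TOY_NR_CPUS];  /* device vectors in use per CPU */

/* Take a vector on the first CPU in mask with space; return the CPU or -1. */
static int toy_alloc_in_mask(unsigned int mask)
{
    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
        if ((mask & (1u << cpu)) && toy_used[cpu] < TOY_DEVICE_VECTORS) {
            toy_used[cpu]++;
            return cpu;
        }
    }
    return -1;
}

/* Honor the requested mask first; on exhaustion fall back and warn. */
static int toy_assign_vector(int irq, unsigned int requested,
                             bool *broke_affinity)
{
    int cpu = toy_alloc_in_mask(requested);

    *broke_affinity = false;
    if (cpu >= 0)
        return cpu;

    /* Requested CPUs are full: widen the search to all CPUs. */
    cpu = toy_alloc_in_mask((1u << TOY_NR_CPUS) - 1);
    if (cpu >= 0) {
        *broke_affinity = true;
        fprintf(stderr,
                "irq %d: affinity broken due to vector space exhaustion\n",
                irq);
    }
    return cpu;
}
```

`TOY_DEVICE_VECTORS` evaluates to 256 - 32 - 22 = 202, matching the figure in the changelog.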
      
      [ tglx: Massaged changelog and made the pr_warn() more informative ]
      
      Reported-by: <djuran@redhat.com>
      Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: <djuran@redhat.com>
      Link: https://lkml.kernel.org/r/20190822143421.9535-1-nhorman@tuxdriver.com
      
      
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    • x86/irq: Handle spurious interrupt after shutdown gracefully · 18cf32ca
      Thomas Gleixner authored and 谢秀奇 committed
      
      commit b7107a67 upstream
      
      Since the rework of the vector management, warnings about spurious
      interrupts have been reported. Robert provided some more information and
      did an initial analysis. The following situation leads to these warnings:
      
         CPU 0                  CPU 1               IO_APIC
      
                                                    interrupt is raised
                                                    sent to CPU1
                                Unable to handle
                                immediately
                                (interrupts off,
                                 deep idle delay)
         mask()
         ...
         free()
           shutdown()
           synchronize_irq()
           clear_vector()
                                do_IRQ()
                                  -> vector is clear
      
      Before the rework the vector entries of legacy interrupts were statically
      assigned and occupied precious vector space while most of them were
      unused. Due to that the above situation was handled silently because the
      vector was handled and the core handler of the assigned interrupt
      descriptor noticed that it is shut down and returned.
      
      While this has been usually observed with legacy interrupts, this situation
      is not limited to them. Any other interrupt source, e.g. MSI, can cause the
      same issue.
      
      After adding proper synchronization for level triggered interrupts, this
      can only happen for edge triggered interrupts where the IO-APIC obviously
      cannot provide information about interrupts in flight.
      
      While the spurious warning is actually harmless in this case it worries
      users and driver developers.
      
      Handle it gracefully by marking the vector entry as VECTOR_SHUTDOWN instead
      of VECTOR_UNUSED when the vector is freed up.
      
      If the late handling described above happens, the spurious detector will not
      complain and switches the entry to VECTOR_UNUSED. Any subsequent spurious
      interrupt on that line will trigger the spurious warning as before.
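
The state handling can be modelled in a few lines. The enum and function names below are hypothetical stand-ins chosen for the sketch; only the VECTOR_SHUTDOWN / VECTOR_UNUSED distinction is taken from the changelog.

```c
#include <stdbool.h>

enum toy_vector_state {
    TOY_VECTOR_UNUSED,    /* nothing was ever expected on this vector */
    TOY_VECTOR_SHUTDOWN,  /* recently freed, one late interrupt tolerated */
    TOY_VECTOR_ASSIGNED,  /* an interrupt descriptor is installed */
};

/* Freeing a vector marks it as shut down rather than unused. */
static void toy_free_vector(enum toy_vector_state *slot)
{
    *slot = TOY_VECTOR_SHUTDOWN;
}

/*
 * Called when an interrupt arrives on a vector without a descriptor.
 * Returns true if the spurious-interrupt warning should be emitted.
 */
static bool toy_handle_stray(enum toy_vector_state *slot)
{
    if (*slot == TOY_VECTOR_SHUTDOWN) {
        /* First late interrupt after free: swallow it silently. */
        *slot = TOY_VECTOR_UNUSED;
        return false;
    }
    return true;
}
```

Only the first late interrupt after a free is tolerated; a second one on the same entry is reported as before.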
      
      Fixes: 464d1230 ("x86/vector: Switch IOAPIC to global reservation mode")
      Reported-by: Robert Hodaszi <Robert.Hodaszi@digi.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Robert Hodaszi <Robert.Hodaszi@digi.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Link: https://lkml.kernel.org/r/20190628111440.459647741@linutronix.de
      
      
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
    • irq/matrix: Spread managed interrupts on allocation · 5fc68885
      Dou Liyang authored and 谢秀奇 committed
      
      [ Upstream commit 76f99ae5 ]
      
      Linux spreads out the non-managed interrupts across the possible target CPUs
      to avoid vector space exhaustion.
      
      Managed interrupts are treated differently, as for them the vectors are
      reserved (with guarantee) when the interrupt descriptors are initialized.
      
      When the interrupt is requested a real vector is assigned. The assignment
      logic uses the first CPU in the affinity mask for assignment. If the
      interrupt has more than one CPU in the affinity mask, which happens when a
      multi queue device has fewer queues than CPUs, then doing the same search as
      for non managed interrupts makes sense as it puts the interrupt on the
      least interrupt plagued CPU. For single CPU affine vectors that's obviously
      a NOOP.
      
      Restructure the matrix allocation code so it does the 'best CPU' search, add
      the sanity check for an empty affinity mask and adapt the call site in the
      x86 vector management code.
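
A minimal sketch of a 'best CPU' search with an empty-mask check, assuming that "best" means the CPU in the mask with the most free vectors. `toy_find_best_cpu()`, `TOY_MATRIX_CPUS` and the flat `avail[]` array are illustrative inventions, not the matrix allocator's real data structures.

```c
#define TOY_MATRIX_CPUS 8  /* assumed CPU count for the sketch */

/*
 * Return the CPU in mask with the most available vectors.
 * Returns -1 for an empty mask (the sanity check mentioned above).
 * On a tie the lowest-numbered CPU wins.
 */
static int toy_find_best_cpu(const unsigned int avail[TOY_MATRIX_CPUS],
                             unsigned int mask)
{
    int best = -1;
    unsigned int max_avail = 0;

    for (int cpu = 0; cpu < TOY_MATRIX_CPUS; cpu++) {
        if (!(mask & (1u << cpu)))
            continue;
        if (best < 0 || avail[cpu] > max_avail) {
            best = cpu;
            max_avail = avail[cpu];
        }
    }
    return best;
}
```

For a mask with a single CPU the loop simply returns that CPU, which corresponds to the NOOP case for single CPU affine vectors.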
      
      [ tglx: Added the empty mask check to the core and improved change log ]
      
      Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: hpa@zytor.com
      Link: https://lkml.kernel.org/r/20180908175838.14450-2-dou_liyang@163.com
      
      
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
  6. Sep 08, 2018
  7. Aug 05, 2018
    • x86: Don't include linux/irq.h from asm/hardirq.h · 447ae316
      Nicolai Stange authored
      
      The next patch in this series will have to make the definition of
      irq_cpustat_t available to entering_irq().
      
      Inclusion of asm/hardirq.h into asm/apic.h would cause circular header
      dependencies like
      
        asm/smp.h
          asm/apic.h
            asm/hardirq.h
              linux/irq.h
                linux/topology.h
                  linux/smp.h
                    asm/smp.h
      
      or
      
        linux/gfp.h
          linux/mmzone.h
            asm/mmzone.h
              asm/mmzone_64.h
                asm/smp.h
                  asm/apic.h
                    asm/hardirq.h
                      linux/irq.h
                        linux/irqdesc.h
                          linux/kobject.h
                            linux/sysfs.h
                              linux/kernfs.h
                                linux/idr.h
                                  linux/gfp.h
      
      and others.
      
      This causes compilation errors because of the header guards becoming
      effective in the second inclusion: symbols/macros that had been defined
      before wouldn't be available to intermediate headers in the #include chain
      anymore.
      
      A possible workaround would be to move the definition of irq_cpustat_t
      into its own header and include that from both, asm/hardirq.h and
      asm/apic.h.
      
      However, this wouldn't solve the real problem, namely asm/hardirq.h
      unnecessarily pulling in all the linux/irq.h cruft: nothing in
      asm/hardirq.h itself requires it. Also, note that there are some other
      archs, like e.g. arm64, which don't have that #include in their
      asm/hardirq.h.
      
      Remove the linux/irq.h #include from x86's asm/hardirq.h.
      
      Fix resulting compilation errors by adding appropriate #includes to *.c
      files as needed.
      
      Note that some of these *.c files could be cleaned up a bit wrt. their
      set of #includes, but that should better be done from separate patches, if
      at all.
      
      Signed-off-by: Nicolai Stange <nstange@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  8. Jul 31, 2018
  9. Jun 06, 2018
    • x86/apic/vector: Print APIC control bits in debugfs · a07771ac
      Thomas Gleixner authored
      
      Extend the debuggability of the vector management by adding the state bits
      to the debugfs output.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Song Liu <songliubraving@fb.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <liu.song.a23@gmail.com>
      Cc: Dmitry Safonov <0x7f454c46@gmail.com>
      Cc: Mike Travis <mike.travis@hpe.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Link: https://lkml.kernel.org/r/20180604162224.908136099@linutronix.de
    • x86/apic: Provide apic_ack_irq() · c0255770
      Thomas Gleixner authored
      
      apic_ack_edge() is explicitly for handling interrupt affinity cleanup when
      interrupt remapping is not available or disabled.
      
      Remapped interrupts and also some of the platform specific special
      interrupts, e.g. UV, invoke ack_APIC_irq() directly.
      
      To address the issue of failing an affinity update with -EBUSY the delayed
      affinity mechanism can be reused, but ack_APIC_irq() does not handle
      that. Adding this to ack_APIC_irq() is not possible, because that function
      is also used for exceptions and directly handled interrupts like IPIs.
      
      Create a new function, which just contains the conditional invocation of
      irq_move_irq() and the final ack_APIC_irq().
      
      Reuse the new function in apic_ack_edge().
      
      Preparatory change for the real fix.
      
      Fixes: dccfe314 ("x86/vector: Simplify vector move cleanup")
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Song Liu <songliubraving@fb.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Song Liu <liu.song.a23@gmail.com>
      Cc: Dmitry Safonov <0x7f454c46@gmail.com>
      Cc: stable@vger.kernel.org
      Cc: Mike Travis <mike.travis@hpe.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Link: https://lkml.kernel.org/r/20180604162224.471925894@linutronix.de
    • x86/apic/vector: Prevent hlist corruption and leaks · 80ae7b1a
      Thomas Gleixner authored
      
      Several people observed the WARN_ON() in irq_matrix_free() which triggers
      when the caller tries to free a vector which is not in the allocation
      range. Song provided the trace information which made it possible to decode
      the root cause.
      
      The rework of the vector allocation mechanism failed to preserve a sanity
      check, which prevents setting a new target vector/CPU when the previous
      affinity change has not fully completed.
      
      As a result a half finished affinity change can be overwritten, which can
      cause the leak of an irq descriptor pointer on the previous target CPU and
      double enqueue of the hlist head into the cleanup lists of two or more
      CPUs. After one CPU cleaned up its vector the next CPU will invoke the
      cleanup handler with vector 0, which triggers the out of range warning in
      the matrix allocator.
      
      Prevent this by checking the apic_data of the interrupt whether the
      move_in_progress flag is false and the hlist node is not hashed. Return
      -EBUSY if not.
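
The guard can be sketched as below. This is a simplified model under stated assumptions: `struct toy_apic_data` and `toy_retarget()` are hypothetical, and the "hlist node is hashed" condition is reduced to a boolean `cleanup_queued` flag.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-in for the per-interrupt APIC data. */
struct toy_apic_data {
    bool move_in_progress;  /* previous affinity change not yet completed */
    bool cleanup_queued;    /* models "the hlist node is hashed" */
    int  cpu;               /* current target CPU */
    int  vector;            /* current target vector */
};

/*
 * Refuse to set a new target while the previous move is unfinished,
 * so a half-finished affinity change is never overwritten.
 */
static int toy_retarget(struct toy_apic_data *d, int cpu, int vector)
{
    if (d->move_in_progress || d->cleanup_queued)
        return -EBUSY;

    d->cpu = cpu;
    d->vector = vector;
    return 0;
}
```

A caller in this model has to be prepared for `-EBUSY` and retry later, which is the user-visible consequence the following paragraph discusses.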
      
      This prevents the damage and restores the behaviour before the vector
      allocation rework, but due to other changes in that area it also widens the
      chance that user space can observe -EBUSY. In theory this should be fine,
      but actually not all user space tools handle -EBUSY correctly. Addressing
      that is not part of this fix, but will be addressed in follow up patches.
      
      Fixes: 69cde000 ("x86/vector: Use matrix allocator for vector assignment")
      Reported-by: Dmitry Safonov <0x7f454c46@gmail.com>
      Reported-by: Tariq Toukan <tariqt@mellanox.com>
      Reported-by: Song Liu <liu.song.a23@gmail.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Song Liu <songliubraving@fb.com>
      Cc: Joerg Roedel <jroedel@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Cc: Mike Travis <mike.travis@hpe.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Link: https://lkml.kernel.org/r/20180604162224.303870257@linutronix.de
  10. May 19, 2018
  11. Feb 23, 2018
  12. Jan 17, 2018
  13. Dec 30, 2017
    • genirq/msi, x86/vector: Prevent reservation mode for non maskable MSI · bc976233
      Thomas Gleixner authored
      
      The new reservation mode for interrupts assigns a dummy vector when the
      interrupt is allocated and assigns a real vector when the interrupt is
      requested. The reservation mode prevents vector pressure when devices with
      a large number of queues/interrupts are initialized, but only a minimal
      subset of those queues/interrupts is actually used.
      
      This mode has an issue with MSI interrupts which cannot be masked. If the
      driver is not careful or the hardware emits an interrupt before the device
      irq is requested by the driver then the interrupt ends up on the dummy
      vector as a spurious interrupt which can cause malfunction of the device or
      in the worst case a lockup of the machine.
      
      Change the logic for the reservation mode so that the early activation of
      MSI interrupts checks whether:
      
       - the device is a PCI/MSI device
       - the reservation mode of the underlying irqdomain is activated
       - PCI/MSI masking is globally enabled
       - the PCI/MSI device uses either MSI-X, which supports masking, or
         MSI with the maskbit supported.
      
      If one of those conditions is false, then clear the reservation mode flag
      in the irq data of the interrupt and invoke irq_domain_activate_irq() with
      the reserve argument cleared. In the x86 vector code, clear the can_reserve
      flag in the vector allocation data so a subsequent free_irq() won't create
      the same situation again. The interrupt stays assigned to a real vector
      until pci_disable_msi() is invoked and all allocations are undone.
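
The four conditions can be read as a single predicate. The sketch below is an assumption-laden illustration: `struct toy_msi_dev` and `toy_can_reserve()` are invented names, and each condition is flattened into a boolean instead of being derived from real irqdomain or PCI state.

```c
#include <stdbool.h>

/* Hypothetical flattened view of the properties the check looks at. */
struct toy_msi_dev {
    bool is_pci_msi;               /* the device is a PCI/MSI device */
    bool domain_reservation_mode;  /* irqdomain has reservation mode active */
    bool is_msix;                  /* MSI-X always supports masking */
    bool msi_maskbit;              /* plain MSI with the maskbit supported */
};

/*
 * Reservation mode is kept only if every condition holds. If this
 * returns false, the caller would clear the reservation flag and
 * activate the interrupt with a real vector right away.
 */
static bool toy_can_reserve(const struct toy_msi_dev *dev,
                            bool pci_msi_masking_enabled)
{
    if (!dev->is_pci_msi)
        return false;
    if (!dev->domain_reservation_mode)
        return false;
    if (!pci_msi_masking_enabled)
        return false;
    return dev->is_msix || dev->msi_maskbit;
}
```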
      
      Fixes: 4900be83 ("x86/vector/msi: Switch to global reservation mode")
      Reported-by: Alexandru Chirvasitu <achirvasub@gmail.com>
      Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com>
      Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Cc: Mikael Pettersson <mikpelinux@gmail.com>
      Cc: Josh Poulson <jopoulso@microsoft.com>
      Cc: Mihai Costache <v-micos@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: linux-pci@vger.kernel.org
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Dexuan Cui <decui@microsoft.com>
      Cc: Simon Xiao <sixiao@microsoft.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Jork Loeser <Jork.Loeser@microsoft.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: devel@linuxdriverproject.org
      Cc: KY Srinivasan <kys@microsoft.com>
      Cc: Alan Cox <alan@linux.intel.com>
      Cc: Sakari Ailus <sakari.ailus@intel.com>
      Cc: linux-media@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712291406420.1899@nanos
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712291409460.1899@nanos
    • genirq/irqdomain: Rename early argument of irq_domain_activate_irq() · 702cb0a0
      Thomas Gleixner authored
      
      The 'early' argument of irq_domain_activate_irq() is actually used to
      denote reservation mode. To avoid confusion, rename it before abuse
      happens.
      
      No functional change.
      
      Fixes: 72491643 ("genirq/irqdomain: Update irq_domain_ops.activate() signature")
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Alexandru Chirvasitu <achirvasub@gmail.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Cc: Mikael Pettersson <mikpelinux@gmail.com>
      Cc: Josh Poulson <jopoulso@microsoft.com>
      Cc: Mihai Costache <v-micos@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: linux-pci@vger.kernel.org
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Dexuan Cui <decui@microsoft.com>
      Cc: Simon Xiao <sixiao@microsoft.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Jork Loeser <Jork.Loeser@microsoft.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: devel@linuxdriverproject.org
      Cc: KY Srinivasan <kys@microsoft.com>
      Cc: Alan Cox <alan@linux.intel.com>
      Cc: Sakari Ailus <sakari.ailus@intel.com>
      Cc: linux-media@vger.kernel.org
    • x86/vector: Use IRQD_CAN_RESERVE flag · 945f50a5
      Thomas Gleixner authored
      
      Set the new CAN_RESERVE flag when the initial reservation for an interrupt
      happens. The flag is used in a subsequent patch to disable reservation mode
      for a certain class of MSI devices.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com>
      Tested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Cc: Mikael Pettersson <mikpelinux@gmail.com>
      Cc: Josh Poulson <jopoulso@microsoft.com>
      Cc: Mihai Costache <v-micos@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: linux-pci@vger.kernel.org
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Dexuan Cui <decui@microsoft.com>
      Cc: Simon Xiao <sixiao@microsoft.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Jork Loeser <Jork.Loeser@microsoft.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: devel@linuxdriverproject.org
      Cc: KY Srinivasan <kys@microsoft.com>
      Cc: Alan Cox <alan@linux.intel.com>
      Cc: Sakari Ailus <sakari.ailus@intel.com>
      Cc: linux-media@vger.kernel.org
      
  14. Dec 07, 2017
  15. Nov 24, 2017
  16. Oct 17, 2017
    • x86/vector: Use correct per cpu variable in free_moved_vector() · 0696d059
      Thomas Gleixner authored
      
      free_moved_vector() accesses the per cpu vector array with this_cpu_write()
      to clear the vector. The function has two call sites:
      
       1) The vector cleanup IPI
       2) The force_complete_move() code path
      
      For #1 this_cpu_write() is correct as it runs on the CPU on which the
      vector needs to be freed.
      
      For #2 this_cpu_write() is wrong because the function is called from an
      outgoing CPU which is not necessarily the CPU on which the previous vector
      needs to be freed. As a result it sets the vector on the outgoing CPU to
      NULL, which is pointless as that CPU does not handle interrupts
      anymore. What's worse is that it leaves the vector on the previous target
      CPU in place which later on triggers the BUG_ON(vector) in the vector
      allocation code when the vector gets reused. That's possible because the
      bitmap allocator entry of that CPU is freed correctly.
      
      Always use the CPU to which the vector was associated and clear the vector
      entry on that CPU. Fixup the tracepoint as well so it tracks on which CPU
      the vector gets removed.
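
The difference between the two variants can be illustrated with a plain two-dimensional array standing in for the per cpu vector array. Everything named `toy_` or `TOY_` is hypothetical; the running CPU is passed explicitly because user-space code has no equivalent of this_cpu_write().

```c
#include <stddef.h>

#define TOY_PCPU_CPUS    4
#define TOY_PCPU_VECTORS 256

struct toy_desc { int irq; };

/* Stand-in for the per cpu vector -> descriptor array. */
static struct toy_desc *toy_vector_irq[TOY_PCPU_CPUS][TOY_PCPU_VECTORS];

/* Records where the interrupt lived before the move. */
struct toy_move {
    int prev_cpu;
    int prev_vector;
};

/* Buggy variant: clears the slot on whichever CPU runs the code. */
static void toy_free_moved_buggy(const struct toy_move *m, int running_cpu)
{
    toy_vector_irq[running_cpu][m->prev_vector] = NULL;
}

/* Fixed variant: always clears the slot on the previous target CPU. */
static void toy_free_moved_fixed(const struct toy_move *m)
{
    toy_vector_irq[m->prev_cpu][m->prev_vector] = NULL;
}
```

When the buggy variant runs on a CPU other than `prev_cpu`, the stale entry stays in place, which corresponds to the leftover vector that later trips the BUG_ON() described above.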
      
      Fixes: 69cde000 ("x86/vector: Use matrix allocator for vector assignment")
      Reported-by: Petri Latvala <petri.latvala@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Yu Chen <yu.c.chen@intel.com>
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1710161614430.1973@nanos
      0696d059
  17. Oct 12, 2017
  18. Sep 26, 2017
    • Thomas Gleixner's avatar
      x86/vector: Respect affinity mask in irq descriptor · d6ffc6ac
      Thomas Gleixner authored
      
      The interrupt descriptor has a preset affinity mask at allocation
      time, which is usually the default affinity mask.
      
      The current code does not respect that mask and places the vector at some
      random CPU, which gets corrected later by a set_affinity() call. That's
      silly because the vector allocation can respect the mask upfront and place
      the interrupt on a CPU which is in the mask. If that fails, then the
      affinity is broken and the interrupt is assigned to any online CPU.
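
      A minimal sketch of that placement policy, assuming CPU sets are
      32-bit masks and that a per-CPU count of free vectors is available.
      pick_target_cpu() and first_cpu_with_free_vector() are hypothetical
      helpers, and picking the first suitable CPU is a simplification of
      whatever selection the real allocator performs.

```c
#include <assert.h>
#include <stdint.h>

/* Return the lowest CPU in 'candidates' with a free vector, or -1. */
static int first_cpu_with_free_vector(uint32_t candidates,
                                      const unsigned int *free_vectors,
                                      unsigned int nr_cpus)
{
    unsigned int cpu;

    assert(nr_cpus <= 32);
    for (cpu = 0; cpu < nr_cpus; cpu++) {
        if ((candidates & (UINT32_C(1) << cpu)) && free_vectors[cpu] > 0)
            return (int)cpu;
    }
    return -1;
}

/*
 * Try the preset affinity mask first. Only if no CPU in the mask can take
 * the interrupt, break the affinity and fall back to any online CPU.
 */
static int pick_target_cpu(uint32_t affinity, uint32_t online,
                           const unsigned int *free_vectors,
                           unsigned int nr_cpus, int *affinity_broken)
{
    int cpu;

    *affinity_broken = 0;
    cpu = first_cpu_with_free_vector(affinity & online, free_vectors, nr_cpus);
    if (cpu >= 0)
        return cpu;

    *affinity_broken = 1;
    return first_cpu_with_free_vector(online, free_vectors, nr_cpus);
}
```

      A return value of -1 means that no online CPU has vector space left.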
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213156.431670325@linutronix.de
      d6ffc6ac
    • Thomas Gleixner's avatar
      x86/irq: Simplify hotplug vector accounting · 2cffad7b
      Thomas Gleixner authored
      
      Before a CPU is taken offline the number of active interrupt vectors on the
      outgoing CPU and the number of vectors which are available on the other
      online CPUs are counted and compared. If the active vectors are more than
      the available vectors on the other CPUs then the CPU hot-unplug operation
      is aborted. This again uses loop based search and is inaccurate.
      
      The bitmap matrix allocator has accurate accounting information and can
      tell exactly whether the vector space is sufficient or not.
      
      Emit a message when the number of globally reserved (unallocated) vectors is
      larger than the number of available vectors after offlining a CPU because
      after that point request_irq() might fail.
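
      The accounting can be sketched with plain counters. This is only an
      illustration: struct vector_accounting_sim and check_cpu_offline() are
      hypothetical, and the abort branch is an assumption carried over from
      the older behaviour described above, not a statement about where the
      kernel performs that check.

```c
#include <assert.h>

struct vector_accounting_sim {
    unsigned int global_available; /* free vectors summed over online CPUs */
    unsigned int global_reserved;  /* reserved, not yet allocated vectors */
};

enum offline_verdict_sim {
    OFFLINE_OK,
    OFFLINE_WARN_RESERVED, /* later request_irq() calls might fail */
    OFFLINE_ABORT          /* active vectors do not fit elsewhere */
};

static enum offline_verdict_sim
check_cpu_offline(const struct vector_accounting_sim *acct,
                  unsigned int cpu_available, unsigned int cpu_allocated)
{
    unsigned int remaining;

    /* The outgoing CPU's free vectors vanish from the global pool. */
    assert(cpu_available <= acct->global_available);
    remaining = acct->global_available - cpu_available;

    /* Its active vectors have to move to the remaining CPUs. */
    if (cpu_allocated > remaining)
        return OFFLINE_ABORT;
    remaining -= cpu_allocated;

    /* Reservations are overcommitted: offlining proceeds, but warn. */
    if (acct->global_reserved > remaining)
        return OFFLINE_WARN_RESERVED;

    return OFFLINE_OK;
}
```

      Because the counters are exact, the decision is a few comparisons
      instead of a search over all CPUs and vectors.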
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213156.351193962@linutronix.de
      2cffad7b
    • Thomas Gleixner's avatar
      x86/vector: Switch IOAPIC to global reservation mode · 464d1230
      Thomas Gleixner authored
      
      IOAPICs install and allocate vectors for inactive interrupts. This results
      in problems on CPU offline and wastes vector resources for nothing.
      
      Handle inactive IOAPIC interrupts in the same way as inactive MSI
      interrupts and switch them to the global reservation mode.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213156.273454591@linutronix.de
      464d1230
    • Thomas Gleixner's avatar
      x86/vector/msi: Switch to global reservation mode · 4900be83
      Thomas Gleixner authored
      
      Devices with many queues allocate a huge number of interrupts and get
      assigned a vector for each of them, even if the queues are not active and
      the interrupts never requested. This causes problems with the decision
      whether the global vector space is sufficient for CPU hot unplug
      operations.
      
      Change it to a reservation scheme, which allows overcommitment.
      
      When the interrupt is allocated and initialized the vector assignment
      merely updates the reservation request counter in the matrix
      allocator. This counter is used to emit warnings when the reservation
      exceeds the available vector space, but does not affect CPU offline
      operations. Like the managed interrupts the corresponding MSI/DMAR/IOAPIC
      entries are directed to the special shutdown vector.
      
      When the interrupt is requested, then the activation code tries to assign a
      real vector. If that succeeds the interrupt is started up and functional.
      
      If that fails, then subsequently request_irq() fails with -ENOSPC.
      
      This allows a clear separation of inactive and active modes and simplifies
      the final decisions whether the global vector space is sufficient for CPU
      offline operations.
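
      The two phases can be sketched as follows. The sketch only models the
      counters; struct vector_space_sim, struct irq_sim, irq_reserve() and
      irq_activate() are hypothetical names, and a single global pool stands
      in for the per-CPU vector space.

```c
#include <assert.h>
#include <errno.h>

struct vector_space_sim {
    unsigned int available; /* vectors which can still be assigned */
    unsigned int reserved;  /* outstanding reservation requests */
};

struct irq_sim {
    int is_reserved; /* counted in the reservation counter */
    int has_vector;  /* a real vector is assigned */
};

/*
 * Allocation time: only bump the reservation counter. Overcommitment is
 * allowed, so this never fails. Returns nonzero if a warning is due.
 */
static int irq_reserve(struct vector_space_sim *vs, struct irq_sim *irq)
{
    vs->reserved++;
    irq->is_reserved = 1;
    irq->has_vector = 0;
    return vs->reserved > vs->available;
}

/* Request time: try to turn the reservation into a real vector. */
static int irq_activate(struct vector_space_sim *vs, struct irq_sim *irq)
{
    if (irq->has_vector)
        return 0;
    if (vs->available == 0)
        return -ENOSPC;

    vs->available--;
    if (irq->is_reserved) {
        vs->reserved--;
        irq->is_reserved = 0;
    }
    irq->has_vector = 1;
    return 0;
}
```

      With one vector available and two reservations, the second
      reservation triggers the warning, the first activation succeeds and
      the second one fails with -ENOSPC.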
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213156.184211133@linutronix.de
      4900be83
    • Thomas Gleixner's avatar
      x86/vector: Handle managed interrupts proper · 2db1f959
      Thomas Gleixner authored
      
      Managed interrupts need to reserve interrupt vectors permanently, but as
      long as the interrupt is deactivated, the vector should not be active.
      
      Reserve a new system vector, which can be used for the initial setup of
      MSI/DMAR/IOAPIC entries. In that situation the interrupts are disabled in
      the corresponding MSI/DMAR/IOAPIC devices. So the vector should never be
      sent to any CPU.
      
      When the managed interrupt is started up, a real vector is assigned from
      the managed vector space and configured in MSI/DMAR/IOAPIC.
      
      This allows a clear separation of inactive and active modes and simplifies
      the final decisions whether the global vector space is sufficient for CPU
      offline operations.
      
      The vector space can be reserved even on offline CPUs and will survive CPU
      offline/online operations.
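
      A rough model of that lifetime, assuming a single 64-bit word per CPU
      is enough to hold the vector bits. struct cpu_sim and the managed_*()
      helpers are hypothetical and only illustrate that the reservation is
      independent of the CPU's online state.

```c
#include <assert.h>
#include <stdint.h>

struct cpu_sim {
    int online;
    uint64_t managed; /* vectors reserved for managed interrupts */
    uint64_t active;  /* managed vectors currently in use */
};

/* Reservation works regardless of whether the CPU is online. */
static void managed_reserve(struct cpu_sim *cpu, unsigned int vector)
{
    assert(vector < 64);
    cpu->managed |= UINT64_C(1) << vector;
}

/* Startup: hand out a real vector from the reserved managed space. */
static int managed_startup(struct cpu_sim *cpu)
{
    uint64_t free_bits = cpu->managed & ~cpu->active;
    unsigned int v;

    if (!cpu->online || !free_bits)
        return -1;

    for (v = 0; v < 64; v++) {
        if (free_bits & (UINT64_C(1) << v)) {
            cpu->active |= UINT64_C(1) << v;
            return (int)v;
        }
    }
    return -1;
}

/* Offline: active vectors go away, the reservation survives. */
static void cpu_set_offline(struct cpu_sim *cpu)
{
    cpu->online = 0;
    cpu->active = 0;
}
```

      Reserving on an offline CPU succeeds, startup only works once the CPU
      is online, and the same vector is available again after an
      offline/online cycle.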
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213156.104616625@linutronix.de
      2db1f959
    • Thomas Gleixner's avatar
      x86/vector: Untangle internal state from irq_cfg · ba224fea
      Thomas Gleixner authored
      
      The vector management state is not required to live in irq_cfg. irq_cfg is
      only relevant for the depending irq domains (IOAPIC, DMAR, MSI ...).
      
      The separation of the vector management status allows directing a shut down
      interrupt to a special shutdown vector w/o confusing the internal state of
      the vector management.
      
      Preparatory change for the rework of managed interrupts and the global
      vector reservation scheme.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213155.683712356@linutronix.de
      ba224fea
    • Thomas Gleixner's avatar
      x86/vector: Compile SMP only code conditionally · ba801640
      Thomas Gleixner authored
      
      No point in compiling this for UP.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213155.603191841@linutronix.de
      ba801640
    • Thomas Gleixner's avatar
      x86/vector: Use matrix allocator for vector assignment · 69cde000
      Thomas Gleixner authored
      
      Replace the magic vector allocation code by a simple bitmap matrix
      allocator. This avoids loops and hoops over CPUs and vector arrays, so in
      case of densely used vector spaces it's way faster.
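
      The idea of a bitmap based allocation can be sketched in user space
      as below. This is a simplified illustration, not the kernel's matrix
      allocator: struct cpu_vectors_sim and its helpers are hypothetical,
      and a real implementation would use optimized find-first-zero-bit
      primitives instead of the inner bit loop.

```c
#include <assert.h>
#include <stdint.h>

#define NR_VECTORS_SIM 256
#define WORDS_SIM      (NR_VECTORS_SIM / 64)

/* One bit per vector; a set bit means the vector is occupied. */
struct cpu_vectors_sim {
    uint64_t used[WORDS_SIM];
};

static void set_vector(struct cpu_vectors_sim *c, unsigned int v)
{
    assert(v < NR_VECTORS_SIM);
    c->used[v / 64] |= UINT64_C(1) << (v % 64);
}

static void clear_vector(struct cpu_vectors_sim *c, unsigned int v)
{
    assert(v < NR_VECTORS_SIM);
    c->used[v / 64] &= ~(UINT64_C(1) << (v % 64));
}

/*
 * Allocate the lowest vector which is neither used on this CPU nor marked
 * as a system vector. Fully occupied words are skipped in one comparison.
 */
static int alloc_vector(struct cpu_vectors_sim *cpu,
                        const struct cpu_vectors_sim *system)
{
    unsigned int w, b;

    for (w = 0; w < WORDS_SIM; w++) {
        uint64_t free_bits = ~(cpu->used[w] | system->used[w]);

        if (!free_bits)
            continue;
        for (b = 0; b < 64; b++) {
            if (free_bits & (UINT64_C(1) << b)) {
                set_vector(cpu, w * 64 + b);
                return (int)(w * 64 + b);
            }
        }
    }
    return -1; /* vector space exhausted on this CPU */
}
```

      Allocation takes the lowest free bit and does not try to spread
      vectors across priority levels.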
      
      This also gets rid of the magic 'spread the vectors across priority
      levels' heuristics in the current allocator:
      
      The comment in __assign_irq_vector says:
      
         * NOTE! The local APIC isn't very good at handling
         * multiple interrupts at the same interrupt level.
         * As the interrupt level is determined by taking the
         * vector number and shifting that right by 4, we
         * want to spread these out a bit so that they don't
         * all fall in the same interrupt level.                         
      
      After doing some palaeontological research the following was found in the
      PPro Developer Manual Volume 3:
      
           "7.4.2. Valid Interrupts
      
           The local and I/O APICs support 240 distinct vectors in the range of 16
           to 255. Interrupt priority is implied by its vector, according to the
           following relationship: priority = vector / 16
      
           One is the lowest priority and 15 is the highest. Vectors 16 through
           31 are reserved for exclusive use by the processor. The remaining
           vectors are for general use. The processor's local APIC includes an
           in-service entry and a holding entry for each priority level. To avoid
           losing interrupts, software should allocate no more than 2 interrupt
           vectors per priority."
      
      The current SDM tells nothing about that, instead it states:
      
           "If more than one interrupt is generated with the same vector number,
            the local APIC can set the bit for the vector both in the IRR and the
            ISR. This means that for the Pentium 4 and Intel Xeon processors, the
            IRR and ISR can queue two interrupts for each interrupt vector: one
            in the IRR and one in the ISR. Any additional interrupts issued for
            the same interrupt vector are collapsed into the single bit in the
            IRR.
      
            For the P6 family and Pentium processors, the IRR and ISR registers
            can queue no more than two interrupts per interrupt vector and will
            reject other interrupts that are received within the same vector."
      
         Which means that on P6/Pentium the APIC will reject a new message and
         tell the sender to retry, which increases the load on the APIC bus and
         nothing more.
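
      The relationship quoted from the old comment and the PPro manual
      (priority = vector / 16, i.e. the vector shifted right by 4) can be
      worked through with a small helper. vector_priority_class() and
      count_in_class() are hypothetical names used only for this
      illustration.

```c
#include <assert.h>

/* Priority class as described in the quoted manual: vector / 16. */
static unsigned int vector_priority_class(unsigned int vector)
{
    assert(vector <= 255);
    return vector >> 4;
}

/* Count how many of the given vectors fall into one priority class. */
static unsigned int count_in_class(const unsigned int *vectors,
                                   unsigned int n, unsigned int prio)
{
    unsigned int i, count = 0;

    for (i = 0; i < n; i++) {
        if (vector_priority_class(vectors[i]) == prio)
            count++;
    }
    return count;
}
```

      Vectors 33, 34 and 35 all map to class 2, while vector 48 is the
      first one in class 3. A linear allocation therefore puts far more than
      two vectors into one class, which is exactly what the old heuristics
      tried to avoid.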
      
      There is no affirmative answer from Intel on that, but it's a sane approach
      to remove that for the following reasons:
      
          1) No other (relevant Open Source) operating system bothers to
             implement this or mentions this at all.
      
          2) The current allocator has no enforcement for this and especially the
             legacy interrupts, which are the main source of interrupts on these
             P6 and older systems, are allocated linearly in the same priority
             level and just work.
      
          3) The current machines have no problem with that at all as verified
             with some experiments.
      
          4) AMD at least confirmed that such an issue is unknown.
      
          5) P6 and older are dinosaurs almost 20 years EOL, so there is really
             no reason to worry about that too much.
      
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213155.443678104@linutronix.de
      69cde000
    • Thomas Gleixner's avatar
      x86/vector: Add tracepoints for vector management · 8d1e3dca
      Thomas Gleixner authored
      
      Add tracepoints for analysing the new vector management.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213155.357986795@linutronix.de
      8d1e3dca
    • Thomas Gleixner's avatar
      x86/vector: Add vector domain debugfs support · 65d7ed57
      Thomas Gleixner authored
      
      Add the debug callback for the vector domain, which gives detailed
      information about vector usage if invoked for the domain by using the
      matrix allocator debug function and vector/target information when invoked
      for a particular interrupt.
      
      Extra information for the vector domain:
      
      Online bitmaps:       32
      Global available:   6352
      Global reserved:       5
      Total allocated:      20
      System: 41: 0-19,32,50,128,238-255
       | CPU | avl | man | act | vectors
           0   183     4    19  33-48,51-53
           1   199     4     1  33
           2   199     4     0  
      
      Extra information for interrupts:
      
           Vector:    42
           Target:     4
      
      This allows a detailed analysis of the vector usage and the association to
      interrupts and devices.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213155.188137174@linutronix.de
      65d7ed57
    • Thomas Gleixner's avatar
      x86/irq/vector: Initialize matrix allocator · 0fa115da
      Thomas Gleixner authored
      
      Initialize the matrix allocator and add the proper accounting points to the
      code.
      
      No functional change, just preparation.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213155.108410660@linutronix.de
      0fa115da
    • Thomas Gleixner's avatar
      x86/vector: Move helper functions around · 99a1482d
      Thomas Gleixner authored
      
      Move the helper functions to a different place as they would end up in the
      middle of management functions.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213154.949581934@linutronix.de
      99a1482d
    • Thomas Gleixner's avatar
      x86/vector: Remove pointless pointer checks · 258d86ee
      Thomas Gleixner authored
      
      The info pointer checks in assign_irq_vector_policy() are pointless because
      the pointer cannot be NULL, otherwise the calling code would already crash.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213154.859484148@linutronix.de
      258d86ee
    • Thomas Gleixner's avatar
      x86/apic: Get rid of the legacy irq data storage · 4ef76eb6
      Thomas Gleixner authored
      
      Now that the legacy PIC takeover by the IOAPIC is marked accordingly the
      early boot allocation of APIC data is no longer necessary. Use the regular
      allocation mechanism as it is used by non legacy interrupts and fill in the
      known information (vector and affinity) so the allocator reuses the vector.
      This is important as the timer check might move the timer interrupt 0 back
      to the PIC in case the delivery through the IOAPIC fails.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213154.780521549@linutronix.de
      4ef76eb6
    • Thomas Gleixner's avatar
      x86/vector: Simplify vector move cleanup · dccfe314
      Thomas Gleixner authored
      
      The vector move cleanup needs to walk the vector space and do a lot of
      sanity checks to find a vector to clean up.
      
      With single CPU affinities this can be simplified and made more robust by
      queueing the vector configuration which needs to be cleaned up in a hlist
      on the CPU which was the previous target.
      
      That removes all the race conditions because the cleanup either finds a
      valid list entry or not. The latter happens when the interrupt was torn
      down before the cleanup handler was able to run.
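
      The queueing scheme can be sketched with a plain singly linked list
      per CPU. The kernel uses an hlist and proper locking, neither of which
      is modelled here; struct cleanup_entry_sim, queue_cleanup(),
      cancel_cleanup() and run_cleanup() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_SIM 4

struct cleanup_entry_sim {
    struct cleanup_entry_sim *next;
    unsigned int prev_vector; /* vector to release on the old CPU */
    int queued;
};

/* One pending-cleanup list per CPU, indexed by the previous target CPU. */
static struct cleanup_entry_sim *cleanup_list[NR_CPUS_SIM];

/* Interrupt moved away: queue the old configuration on the old CPU. */
static void queue_cleanup(unsigned int prev_cpu, struct cleanup_entry_sim *e)
{
    assert(prev_cpu < NR_CPUS_SIM);
    e->next = cleanup_list[prev_cpu];
    e->queued = 1;
    cleanup_list[prev_cpu] = e;
}

/* Interrupt torn down before the cleanup ran: just unlink the entry. */
static void cancel_cleanup(unsigned int prev_cpu, struct cleanup_entry_sim *e)
{
    struct cleanup_entry_sim **pp;

    assert(prev_cpu < NR_CPUS_SIM);
    for (pp = &cleanup_list[prev_cpu]; *pp; pp = &(*pp)->next) {
        if (*pp == e) {
            *pp = e->next;
            e->next = NULL;
            e->queued = 0;
            return;
        }
    }
}

/* Cleanup handler on 'cpu': process whatever is queued, nothing else. */
static unsigned int run_cleanup(unsigned int cpu)
{
    struct cleanup_entry_sim *e;
    unsigned int handled = 0;

    assert(cpu < NR_CPUS_SIM);
    e = cleanup_list[cpu];
    cleanup_list[cpu] = NULL;
    while (e) {
        struct cleanup_entry_sim *next = e->next;

        e->next = NULL;
        e->queued = 0;
        handled++;
        e = next;
    }
    return handled;
}
```

      The handler never has to search the vector space: an entry is either
      on the list of the previous target CPU or it is already gone.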
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213154.622727892@linutronix.de
      dccfe314
    • Thomas Gleixner's avatar
      x86/vector: Store the single CPU targets in apic data · 029c6e1c
      Thomas Gleixner authored
      
      Now that the interrupt affinities are targeted at single CPUs storing them
      in a cpumask is overkill. Store them in a dedicated variable.
      
      This does not yet remove the domain cpumasks because the current allocator
      relies on them. Preparatory change for the allocator rework.
      
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Juergen Gross <jgross@suse.com>
      Tested-by: Yu Chen <yu.c.chen@intel.com>
      Acked-by: Juergen Gross <jgross@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Alok Kataria <akataria@vmware.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Rui Zhang <rui.zhang@intel.com>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Len Brown <lenb@kernel.org>
      Link: https://lkml.kernel.org/r/20170913213154.544867277@linutronix.de
      029c6e1c