Skip to content

Commit

Permalink
Merge tag 'pci-v6.5-changes' of git://git.kernel.org/pub/scm/linux/ke…
Browse files Browse the repository at this point in the history
…rnel/git/pci/pci

Pull pci updates from Bjorn Helgaas:
 "Enumeration:

   - Export pcie_retrain_link() for use outside ASPM

   - Add Data Link Layer Link Active Reporting as another way for
     pcie_retrain_link() to determine the link is up

   - Work around link training failures (especially on the ASMedia
     ASM2824 switch) by training first at 2.5GT/s and then attempting
     higher rates

  Resource management:

   - When we coalesce host bridge windows, remove invalidated resources
     from the resource tree so future allocations work correctly

  Hotplug:

   - Cancel bringup sequence if card is not present, to keep from
     blinking Power Indicator indefinitely

   - Reassign bridge resources if necessary for ACPI hotplug

  Driver binding:

   - Convert platform_device .remove() callbacks to return void instead
     of a mostly useless int

  Power management:

   - Reduce wait time for secondary bus to be ready to speed up resume

   - Avoid putting EloPOS E2/S2/H2 (as well as Elo i2) PCIe Ports in
     D3cold

   - Call _REG when transitioning D-states so AML that uses the PCI
     config space OpRegion works, which fixes some ASMedia GPIO
     controllers after resume

  Virtualization:

   - Delay extra 250ms after FLR of Solidigm P44 Pro NVMe to avoid KVM
     hang when guest is rebooted

   - Add function 1 DMA alias quirk for Marvell 88SE9235

  Error handling:

   - Unexport pci_save_aer_state() since it's only used in drivers/pci/

   - Drop recommendation for drivers to configure AER Capability, since
     the PCI core does this for all devices

  ASPM:

   - Disable ASPM on MFD function removal to avoid use-after-free

   - Tighten up pci_enable_link_state() and pci_disable_link_state()
     interfaces so they don't enable/disable states the driver didn't
     specify

   - Avoid link retraining race that can happen if ASPM sets link
     control parameters while the link is in the midst of training for
     some other reason

  Endpoint framework:

   - Change "PCI Endpoint Virtual NTB driver" Kconfig prompt to be
     different from "PCI Endpoint NTB driver"

   - Automatically create a function specific attributes group for
     endpoint drivers to avoid reference counting issues

   - Fix many EPC test issues

   - Return pci_epf_type_add_cfs() error if EPF has no driver

   - Add kernel-doc for pci_epc_raise_irq() and pci_epc_map_msi_irq()
     MSI vector parameters

   - Pass EPF device ID to driver probe functions

   - Return -EALREADY if EPC has already been started/stopped

   - Add linkdown notifier support and use it in qcom-ep

   - Add Bus Master Enable event support and use it in qcom-ep

   - Add Qualcomm Modem Host Interface (MHI) endpoint driver

   - Add Layerscape PME interrupt handling to manage link-up
     notification

  Cadence PCIe controller driver:

   - Wait for link retrain to complete when working around the J721E
     i2085 erratum with Gen2 mode

  Faraday FTPC100 PCI controller driver:

   - Release clock resources on error paths

  Freescale i.MX6 PCIe controller driver:

   - Save and restore Root Port MSI control to work around hardware defect

  Intel VMD host bridge driver:

   - Reset VMD config register between soft reboots

   - Capture pci_reset_bus() return value instead of printing junk when
     it fails

  Qualcomm PCIe controller driver:

   - Add SDX65 endpoint compatible string to DT binding

   - Disable register write access after init for IP v2.3.3, v2.9.0

   - Use DWC helpers for enabling/disabling writes to DBI registers

   - Hide slot hotplug capability for IP v1.0.0, v1.9.0, v2.1.0, v2.3.2,
     v2.3.3, v2.7.0, v2.9.0

   - Reuse v2.3.2 post-init sequence for v2.4.0

  Renesas R-Car PCIe controller driver:

   - Remove unused static pcie_base and pcie_dev

  Rockchip PCIe controller driver:

   - Remove writes to unused registers

   - Write endpoint Device ID using correct register

   - Assert PCI Configuration Enable bit after probe so endpoint
     responds instead of generating Request Retry Status messages

   - Poll waiting for PHY PLLs to lock

   - Update RK3399 example DT binding to be valid

   - Use RK3399 PCIE_CLIENT_LEGACY_INT_CTRL to generate INTx instead of
     manually generating PCIe message

   - Use multiple windows to avoid address translation conflicts

   - Use u32 (not u16) when accessing 32-bit registers

   - Hide MSI-X Capability, since RK3399 can't generate MSI-X

   - Set endpoint controller required alignment to 256

  Synopsys DesignWare PCIe controller driver:

   - Wait for link to come up only if we've initiated link training

  Miscellaneous:

   - Add pci_clear_master() stub for non-CONFIG_PCI"

* tag 'pci-v6.5-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (116 commits)
  Documentation: PCI: correct spelling
  PCI: vmd: Fix uninitialized variable usage in vmd_enable_domain()
  PCI: xgene-msi: Convert to platform remove callback returning void
  PCI: tegra: Convert to platform remove callback returning void
  PCI: rockchip-host: Convert to platform remove callback returning void
  PCI: mvebu: Convert to platform remove callback returning void
  PCI: mt7621: Convert to platform remove callback returning void
  PCI: mediatek-gen3: Convert to platform remove callback returning void
  PCI: mediatek: Convert to platform remove callback returning void
  PCI: iproc: Convert to platform remove callback returning void
  PCI: hisi-error: Convert to platform remove callback returning void
  PCI: dwc: Convert to platform remove callback returning void
  PCI: j721e: Convert to platform remove callback returning void
  PCI: brcmstb: Convert to platform remove callback returning void
  PCI: altera-msi: Convert to platform remove callback returning void
  PCI: altera: Convert to platform remove callback returning void
  PCI: aardvark: Convert to platform remove callback returning void
  PCI: rcar: Use correct product family name for Renesas R-Car
  PCI: layerscape: Add the endpoint linkup notifier support
  PCI: endpoint: pci-epf-vntb: Fix typo in comments
  ...
  • Loading branch information
Linus Torvalds committed Jun 30, 2023
2 parents 28968f3 + 6ecac46 commit 9070577
Show file tree
Hide file tree
Showing 70 changed files with 1,626 additions and 843 deletions.
11 changes: 4 additions & 7 deletions Documentation/PCI/endpoint/pci-ntb-howto.rst
Original file line number Diff line number Diff line change
Expand Up @@ -88,13 +88,10 @@ commands can be used::
# echo 0x104c > functions/pci_epf_ntb/func1/vendorid
# echo 0xb00d > functions/pci_epf_ntb/func1/deviceid

In order to configure NTB specific attributes, a new sub-directory to func1
should be created::

# mkdir functions/pci_epf_ntb/func1/pci_epf_ntb.0/

The NTB function driver will populate this directory with various attributes
that can be configured by the user::
The PCI endpoint framework also automatically creates a sub-directory in the
function attribute directory. This sub-directory has the same name as the name
of the function device and is populated with the following NTB specific
attributes that can be configured by the user::

# ls functions/pci_epf_ntb/func1/pci_epf_ntb.0/
db_count mw1 mw2 mw3 mw4 num_mws
Expand Down
13 changes: 5 additions & 8 deletions Documentation/PCI/endpoint/pci-vntb-howto.rst
Original file line number Diff line number Diff line change
Expand Up @@ -84,13 +84,10 @@ commands can be used::
# echo 0x1957 > functions/pci_epf_vntb/func1/vendorid
# echo 0x0809 > functions/pci_epf_vntb/func1/deviceid

In order to configure NTB specific attributes, a new sub-directory to func1
should be created::

# mkdir functions/pci_epf_vntb/func1/pci_epf_vntb.0/

The NTB function driver will populate this directory with various attributes
that can be configured by the user::
The PCI endpoint framework also automatically creates a sub-directory in the
function attribute directory. This sub-directory has the same name as the name
of the function device and is populated with the following NTB specific
attributes that can be configured by the user::

# ls functions/pci_epf_vntb/func1/pci_epf_vntb.0/
db_count mw1 mw2 mw3 mw4 num_mws
Expand All @@ -103,7 +100,7 @@ A sample configuration for NTB function is given below::
# echo 1 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/num_mws
# echo 0x100000 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw1

A sample configuration for virtual NTB driver for virutal PCI bus::
A sample configuration for virtual NTB driver for virtual PCI bus::

# echo 0x1957 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/vntb_vid
# echo 0x080A > functions/pci_epf_vntb/func1/pci_epf_vntb.0/vntb_pid
Expand Down
2 changes: 1 addition & 1 deletion Documentation/PCI/msi-howto.rst
Original file line number Diff line number Diff line change
Expand Up @@ -290,7 +290,7 @@ PCI_IRQ_MSI or PCI_IRQ_MSIX flags.
List of device drivers MSI(-X) APIs
===================================

The PCI/MSI subystem has a dedicated C file for its exported device driver
The PCI/MSI subsystem has a dedicated C file for its exported device driver
APIs — `drivers/pci/msi/api.c`. The following functions are exported:

.. kernel-doc:: drivers/pci/msi/api.c
Expand Down
2 changes: 1 addition & 1 deletion Documentation/PCI/pci-error-recovery.rst
Original file line number Diff line number Diff line change
Expand Up @@ -364,7 +364,7 @@ Note, however, not all failures are truly "permanent". Some are
caused by over-heating, some by a poorly seated card. Many
PCI error events are caused by software bugs, e.g. DMA's to
wild addresses or bogus split transactions due to programming
errors. See the discussion in powerpc/eeh-pci-error-recovery.txt
errors. See the discussion in Documentation/powerpc/eeh-pci-error-recovery.rst
for additional detail on real-life experience of the causes of
software errors.

Expand Down
183 changes: 65 additions & 118 deletions Documentation/PCI/pcieaer-howto.rst
Original file line number Diff line number Diff line change
Expand Up @@ -16,62 +16,61 @@ Overview
About this guide
----------------

This guide describes the basics of the PCI Express Advanced Error
This guide describes the basics of the PCI Express (PCIe) Advanced Error
Reporting (AER) driver and provides information on how to use it, as
well as how to enable the drivers of endpoint devices to conform with
PCI Express AER driver.
well as how to enable the drivers of Endpoint devices to conform with
the PCIe AER driver.


What is the PCI Express AER Driver?
-----------------------------------
What is the PCIe AER Driver?
----------------------------

PCI Express error signaling can occur on the PCI Express link itself
or on behalf of transactions initiated on the link. PCI Express
PCIe error signaling can occur on the PCIe link itself
or on behalf of transactions initiated on the link. PCIe
defines two error reporting paradigms: the baseline capability and
the Advanced Error Reporting capability. The baseline capability is
required of all PCI Express components providing a minimum defined
required of all PCIe components providing a minimum defined
set of error reporting requirements. Advanced Error Reporting
capability is implemented with a PCI Express advanced error reporting
capability is implemented with a PCIe Advanced Error Reporting
extended capability structure providing more robust error reporting.

The PCI Express AER driver provides the infrastructure to support PCI
Express Advanced Error Reporting capability. The PCI Express AER
driver provides three basic functions:
The PCIe AER driver provides the infrastructure to support PCIe Advanced
Error Reporting capability. The PCIe AER driver provides three basic
functions:

- Gathers the comprehensive error information if errors occurred.
- Reports error to the users.
- Performs error recovery actions.

AER driver only attaches root ports which support PCI-Express AER
capability.
The AER driver only attaches to Root Ports and RCECs that support the PCIe
AER capability.


User Guide
==========

Include the PCI Express AER Root Driver into the Linux Kernel
-------------------------------------------------------------
Include the PCIe AER Root Driver into the Linux Kernel
------------------------------------------------------

The PCI Express AER Root driver is a Root Port service driver attached
to the PCI Express Port Bus driver. If a user wants to use it, the driver
has to be compiled. Option CONFIG_PCIEAER supports this capability. It
depends on CONFIG_PCIEPORTBUS, so pls. set CONFIG_PCIEPORTBUS=y and
CONFIG_PCIEAER = y.
The PCIe AER driver is a Root Port service driver attached
via the PCIe Port Bus driver. If a user wants to use it, the driver
must be compiled. It is enabled with CONFIG_PCIEAER, which
depends on CONFIG_PCIEPORTBUS.

Load PCI Express AER Root Driver
--------------------------------
Load PCIe AER Root Driver
-------------------------

Some systems have AER support in firmware. Enabling Linux AER support at
the same time the firmware handles AER may result in unpredictable
the same time the firmware handles AER would result in unpredictable
behavior. Therefore, Linux does not handle AER events unless the firmware
grants AER control to the OS via the ACPI _OSC method. See the PCI FW 3.0
grants AER control to the OS via the ACPI _OSC method. See the PCI Firmware
Specification for details regarding _OSC usage.

AER error output
----------------

When a PCIe AER error is captured, an error message will be output to
console. If it's a correctable error, it is output as a warning.
console. If it's a correctable error, it is output as an info message.
Otherwise, it is printed as an error. So users could choose different
log level to filter out correctable error messages.

Expand All @@ -82,9 +81,9 @@ Below shows an example::
0000:50:00.0: [20] Unsupported Request (First)
0000:50:00.0: TLP Header: 04000001 00200a03 05010000 00050100

In the example, 'Requester ID' means the ID of the device who sends
the error message to root port. Pls. refer to pci express specs for
other fields.
In the example, 'Requester ID' means the ID of the device that sent
the error message to the Root Port. Please refer to PCIe specs for other
fields.

AER Statistics / Counters
-------------------------
Expand All @@ -96,90 +95,81 @@ Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats
Developer Guide
===============

To enable AER aware support requires a software driver to configure
the AER capability structure within its device and to provide callbacks.
To enable error recovery, a software driver must provide callbacks.

To support AER better, developers need understand how AER does work
firstly.
To support AER better, developers need to understand how AER works.

PCI Express errors are classified into two types: correctable errors
and uncorrectable errors. This classification is based on the impacts
PCIe errors are classified into two types: correctable errors
and uncorrectable errors. This classification is based on the impact
of those errors, which may result in degraded performance or function
failure.

Correctable errors pose no impacts on the functionality of the
interface. The PCI Express protocol can recover without any software
interface. The PCIe protocol can recover without any software
intervention or any loss of data. These errors are detected and
corrected by hardware. Unlike correctable errors, uncorrectable
corrected by hardware.

Unlike correctable errors, uncorrectable
errors impact functionality of the interface. Uncorrectable errors
can cause a particular transaction or a particular PCI Express link
can cause a particular transaction or a particular PCIe link
to be unreliable. Depending on those error conditions, uncorrectable
errors are further classified into non-fatal errors and fatal errors.
Non-fatal errors cause the particular transaction to be unreliable,
but the PCI Express link itself is fully functional. Fatal errors, on
but the PCIe link itself is fully functional. Fatal errors, on
the other hand, cause the link to be unreliable.

When AER is enabled, a PCI Express device will automatically send an
error message to the PCIe root port above it when the device captures
When PCIe error reporting is enabled, a device will automatically send an
error message to the Root Port above it when it captures
an error. The Root Port, upon receiving an error reporting message,
internally processes and logs the error message in its PCI Express
capability structure. Error information being logged includes storing
internally processes and logs the error message in its AER
Capability structure. Error information being logged includes storing
the error reporting agent's requestor ID into the Error Source
Identification Registers and setting the error bits of the Root Error
Status Register accordingly. If AER error reporting is enabled in Root
Error Command Register, the Root Port generates an interrupt if an
Status Register accordingly. If AER error reporting is enabled in the Root
Error Command Register, the Root Port generates an interrupt when an
error is detected.

Note that the errors as described above are related to the PCI Express
Note that the errors as described above are related to the PCIe
hierarchy and links. These errors do not include any device specific
errors because device specific errors will still get sent directly to
the device driver.

Configure the AER capability structure
--------------------------------------

AER aware drivers of PCI Express component need change the device
control registers to enable AER. They also could change AER registers,
including mask and severity registers. Helper function
pci_enable_pcie_error_reporting could be used to enable AER. See
section 3.3.

Provide callbacks
-----------------

callback reset_link to reset pci express link
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
callback reset_link to reset PCIe link
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This callback is used to reset the pci express physical link when a
fatal error happens. The root port aer service driver provides a
default reset_link function, but different upstream ports might
have different specifications to reset pci express link, so all
upstream ports should provide their own reset_link functions.
This callback is used to reset the PCIe physical link when a
fatal error happens. The Root Port AER service driver provides a
default reset_link function, but different Upstream Ports might
have different specifications to reset the PCIe link, so
Upstream Port drivers may provide their own reset_link functions.

Section 3.2.2.2 provides more detailed info on when to call
reset_link.

PCI error-recovery callbacks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The PCI Express AER Root driver uses error callbacks to coordinate
The PCIe AER Root driver uses error callbacks to coordinate
with downstream device drivers associated with a hierarchy in question
when performing error recovery actions.

Data struct pci_driver has a pointer, err_handler, to point to
pci_error_handlers who consists of a couple of callback function
pointers. AER driver follows the rules defined in
pci-error-recovery.txt except pci express specific parts (e.g.
reset_link). Pls. refer to pci-error-recovery.txt for detailed
pointers. The AER driver follows the rules defined in
pci-error-recovery.rst except PCIe-specific parts (e.g.
reset_link). Please refer to pci-error-recovery.rst for detailed
definitions of the callbacks.

Below sections specify when to call the error callback functions.
The sections below specify when to call the error callback functions.

Correctable errors
~~~~~~~~~~~~~~~~~~

Correctable errors pose no impacts on the functionality of
the interface. The PCI Express protocol can recover without any
the interface. The PCIe protocol can recover without any
software intervention or any loss of data. These errors do not
require any recovery actions. The AER driver clears the device's
correctable error status register accordingly and logs these errors.
Expand All @@ -190,12 +180,12 @@ Non-correctable (non-fatal and fatal) errors
If an error message indicates a non-fatal error, performing link reset
at upstream is not required. The AER driver calls error_detected(dev,
pci_channel_io_normal) to all drivers associated within a hierarchy in
question. for example::
question. For example::

EndPoint<==>DownstreamPort B<==>UpstreamPort A<==>RootPort
Endpoint <==> Downstream Port B <==> Upstream Port A <==> Root Port

If Upstream port A captures an AER error, the hierarchy consists of
Downstream port B and EndPoint.
If Upstream Port A captures an AER error, the hierarchy consists of
Downstream Port B and Endpoint.

A driver may return PCI_ERS_RESULT_CAN_RECOVER,
PCI_ERS_RESULT_DISCONNECT, or PCI_ERS_RESULT_NEED_RESET, depending on
Expand All @@ -212,36 +202,11 @@ to reset the link. If error_detected returns PCI_ERS_RESULT_CAN_RECOVER
and reset_link returns PCI_ERS_RESULT_RECOVERED, the error handling goes
to mmio_enabled.

helper functions
----------------
::

int pci_enable_pcie_error_reporting(struct pci_dev *dev);

pci_enable_pcie_error_reporting enables the device to send error
messages to root port when an error is detected. Note that devices
don't enable the error reporting by default, so device drivers need
call this function to enable it.

::

int pci_disable_pcie_error_reporting(struct pci_dev *dev);

pci_disable_pcie_error_reporting disables the device to send error
messages to root port when an error is detected.

::

int pci_aer_clear_nonfatal_status(struct pci_dev *dev);`

pci_aer_clear_nonfatal_status clears non-fatal errors in the uncorrectable
error status register.

Frequent Asked Questions
------------------------

Q:
What happens if a PCI Express device driver does not provide an
What happens if a PCIe device driver does not provide an
error recovery handler (pci_driver->err_handler is equal to NULL)?

A:
Expand All @@ -257,24 +222,6 @@ A:
Fatal error recovery will fail if the errors are reported by the
upstream ports who are attached by the service driver.

Q:
How does this infrastructure deal with driver that is not PCI
Express aware?

A:
This infrastructure calls the error callback functions of the
driver when an error happens. But if the driver is not aware of
PCI Express, the device might not report its own errors to root
port.

Q:
What modifications will that driver need to make it compatible
with the PCI Express AER Root driver?

A:
It could call the helper functions to enable AER in devices and
cleanup uncorrectable status register. Pls. refer to section 3.3.


Software error injection
========================
Expand All @@ -296,5 +243,5 @@ from:

https://git.kernel.org/cgit/linux/kernel/git/gong.chen/aer-inject.git/

More information about aer-inject can be found in the document comes
with its source code.
More information about aer-inject can be found in the document in
its source code.
2 changes: 2 additions & 0 deletions Documentation/devicetree/bindings/pci/qcom,pcie-ep.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ properties:
compatible:
enum:
- qcom,sdx55-pcie-ep
- qcom,sdx65-pcie-ep
- qcom,sm8450-pcie-ep

reg:
Expand Down Expand Up @@ -109,6 +110,7 @@ allOf:
contains:
enum:
- qcom,sdx55-pcie-ep
- qcom,sdx65-pcie-ep
then:
properties:
clocks:
Expand Down
Loading

0 comments on commit 9070577

Please sign in to comment.