Skip to content

Commit

Permalink
Merge branch 'pci/hotplug'
Browse files Browse the repository at this point in the history
  - Differentiate between pciehp surprise and safe removal (Lukas Wunner)

  - Remove unnecessary pciehp includes (Lukas Wunner)

  - Drop pciehp hotplug_slot_ops wrappers (Lukas Wunner)

  - Tolerate PCIe Slot Presence Detect being hardwired to zero to
    workaround broken hardware, e.g., the Wilocity switch/wireless device
    (Lukas Wunner)

  - Unify pciehp controller & slot structs (Lukas Wunner)

  - Constify hotplug_slot_ops (Lukas Wunner)

  - Drop hotplug_slot_info (Lukas Wunner)

  - Embed hotplug_slot struct into users instead of allocating it
    separately (Lukas Wunner)

  - Initialize PCIe port service drivers directly instead of relying on
    initcall ordering (Keith Busch)

  - Restore PCI config state after a slot reset (Keith Busch)

  - Save/restore DPC config state along with other PCI config state (Keith
    Busch)

  - Reference count devices during AER handling to avoid race issue with
    concurrent hot removal (Keith Busch)

  - If an Upstream Port reports ERR_FATAL, don't try to read the Port's
    config space because it is probably unreachable (Keith Busch)

  - During error handling, use slot-specific reset instead of secondary
    bus reset to avoid link up/down issues on hotplug ports (Keith Busch)

  - Restore previous AER/DPC handling that does not remove and re-enumerate
    devices on ERR_FATAL (Keith Busch)

  - Notify all drivers that may be affected by error recovery resets (Keith
    Busch)

  - Always generate error recovery uevents, even if a driver doesn't have
    error callbacks (Keith Busch)

  - Make PCIe link active reporting detection generic (Keith Busch)

  - Support D3cold in PCIe hierarchies during system sleep and runtime,
    including hotplug and Thunderbolt ports (Mika Westerberg)

  - Handle hpmemsize/hpiosize kernel parameters uniformly, whether slots
    are empty or occupied (Jon Derrick)

  - Remove duplicated include from pci/pcie/err.c and unused variable from
    cpqphp (YueHaibing)

  - Remove driver pci_cleanup_aer_uncorrect_error_status() calls (Oza
    Pawandeep)

  - Uninline PCI bus accessors for better ftracing (Keith Busch)

  - Remove unused AER Root Port .error_resume method (Keith Busch)

  - Use kfifo in AER instead of a local version (Keith Busch)

  - Use threaded IRQ in AER bottom half (Keith Busch)

  - Use managed resources in AER core (Keith Busch)

  - Reuse pcie_port_find_device() for AER injection (Keith Busch)

  - Abstract AER interrupt handling to disconnect error injection (Keith
    Busch)

  - Refactor AER injection callbacks to simplify future improvments (Keith
    Busch)

* pci/hotplug:
  PCI/AER: Refactor error injection fallbacks
  PCI/AER: Abstract AER interrupt handling
  PCI/AER: Reuse existing pcie_port_find_device() interface
  PCI/AER: Use managed resource allocations
  PCI/AER: Use threaded IRQ for bottom half
  PCI/AER: Use kfifo_in_spinlocked() to insert locked elements
  PCI/AER: Use kfifo for tracking events instead of reimplementing it
  PCI/AER: Remove error source from AER struct aer_rpc
  PCI/AER: Remove unused aer_error_resume()
  PCI: Uninline PCI bus accessors for better ftracing
  PCI/AER: Remove pci_cleanup_aer_uncorrect_error_status() calls
  PCI: pnv_php: Use kmemdup()
  PCI: cpqphp: Remove set but not used variable 'physical_slot'
  PCI/ERR: Remove duplicated include from err.c
  PCI: Equalize hotplug memory and io for occupied and empty slots
  PCI / ACPI: Whitelist D3 for more PCIe hotplug ports
  ACPI / property: Allow multiple property compatible _DSD entries
  PCI/PME: Implement runtime PM callbacks
  PCI: pciehp: Implement runtime PM callbacks
  PCI/portdrv: Add runtime PM hooks for port service drivers
  PCI/portdrv: Resume upon exit from system suspend if left runtime suspended
  PCI: pciehp: Do not handle events if interrupts are masked
  PCI: pciehp: Disable hotplug interrupt during suspend
  PCI / ACPI: Enable wake automatically for power managed bridges
  PCI: Do not skip power-managed bridges in pci_enable_wake()
  PCI: Make link active reporting detection generic
  PCI: Unify device inaccessible
  PCI/ERR: Always report current recovery status for udev
  PCI/ERR: Simplify broadcast callouts
  PCI/ERR: Run error recovery callbacks for all affected devices
  PCI/ERR: Handle fatal error recovery
  PCI/ERR: Use slot reset if available
  PCI/AER: Don't read upstream ports below fatal errors
  PCI/AER: Take reference on error devices
  PCI/DPC: Save and restore config state
  PCI: portdrv: Restore PCI config state on slot reset
  PCI: portdrv: Initialize service drivers directly
  PCI: hotplug: Document TODOs
  PCI: hotplug: Embed hotplug_slot
  PCI: hotplug: Drop hotplug_slot_info
  PCI: hotplug: Constify hotplug_slot_ops
  PCI: pciehp: Reshuffle controller struct for clarity
  PCI: pciehp: Rename controller struct members for clarity
  PCI: pciehp: Unify controller and slot structs
  PCI: pciehp: Tolerate Presence Detect hardwired to zero
  PCI: pciehp: Drop hotplug_slot_ops wrappers
  PCI: pciehp: Drop unnecessary includes
  PCI: pciehp: Differentiate between surprise and safe removal
  PCI: Simplify disconnected marking
  • Loading branch information
Bjorn Helgaas committed Oct 20, 2018
2 parents de468b7 + e51cd9c commit 20634dc
Show file tree
Hide file tree
Showing 84 changed files with 1,383 additions and 1,747 deletions.
35 changes: 10 additions & 25 deletions Documentation/PCI/pci-error-recovery.txt
Original file line number Diff line number Diff line change
Expand Up @@ -110,7 +110,7 @@ The actual steps taken by a platform to recover from a PCI error
event will be platform-dependent, but will follow the general
sequence described below.

STEP 0: Error Event: ERR_NONFATAL
STEP 0: Error Event
-------------------
A PCI bus error is detected by the PCI hardware. On powerpc, the slot
is isolated, in that all I/O is blocked: all reads return 0xffffffff,
Expand Down Expand Up @@ -228,7 +228,13 @@ proceeds to either STEP3 (Link Reset) or to STEP 5 (Resume Operations).
If any driver returned PCI_ERS_RESULT_NEED_RESET, then the platform
proceeds to STEP 4 (Slot Reset)

STEP 3: Slot Reset
STEP 3: Link Reset
------------------
The platform resets the link. This is a PCI-Express specific step
and is done whenever a fatal error has been detected that can be
"solved" by resetting the link.

STEP 4: Slot Reset
------------------

In response to a return value of PCI_ERS_RESULT_NEED_RESET, the
Expand Down Expand Up @@ -314,7 +320,7 @@ Failure).
>>> However, it probably should.


STEP 4: Resume Operations
STEP 5: Resume Operations
-------------------------
The platform will call the resume() callback on all affected device
drivers if all drivers on the segment have returned
Expand All @@ -326,7 +332,7 @@ a result code.
At this point, if a new error happens, the platform will restart
a new error recovery sequence.

STEP 5: Permanent Failure
STEP 6: Permanent Failure
-------------------------
A "permanent failure" has occurred, and the platform cannot recover
the device. The platform will call error_detected() with a
Expand All @@ -349,27 +355,6 @@ errors. See the discussion in powerpc/eeh-pci-error-recovery.txt
for additional detail on real-life experience of the causes of
software errors.

STEP 0: Error Event: ERR_FATAL
-------------------
PCI bus error is detected by the PCI hardware. On powerpc, the slot is
isolated, in that all I/O is blocked: all reads return 0xffffffff, all
writes are ignored.

STEP 1: Remove devices
--------------------
Platform removes the devices depending on the error agent, it could be
this port for all subordinates or upstream component (likely downstream
port)

STEP 2: Reset link
--------------------
The platform resets the link. This is a PCI-Express specific step and is
done whenever a fatal error has been detected that can be "solved" by
resetting the link.

STEP 3: Re-enumerate the devices
--------------------
Initiates the re-enumeration.

Conclusion; General Remarks
---------------------------
Expand Down
2 changes: 1 addition & 1 deletion arch/powerpc/include/asm/pnv-pci.h
Original file line number Diff line number Diff line change
Expand Up @@ -54,7 +54,6 @@ void pnv_cxl_release_hwirq_ranges(struct cxl_irq_ranges *irqs,

struct pnv_php_slot {
struct hotplug_slot slot;
struct hotplug_slot_info slot_info;
uint64_t id;
char *name;
int slot_no;
Expand All @@ -72,6 +71,7 @@ struct pnv_php_slot {
struct pci_dev *pdev;
struct pci_bus *bus;
bool power_state_check;
u8 attention_state;
void *fdt;
void *dt;
struct of_changeset ocs;
Expand Down
97 changes: 71 additions & 26 deletions drivers/acpi/property.c
Original file line number Diff line number Diff line change
Expand Up @@ -24,11 +24,15 @@ static int acpi_data_get_property_array(const struct acpi_device_data *data,
acpi_object_type type,
const union acpi_object **obj);

/* ACPI _DSD device properties GUID: daffd814-6eba-4d8c-8a91-bc9bbf4aa301 */
static const guid_t prp_guid =
static const guid_t prp_guids[] = {
/* ACPI _DSD device properties GUID: daffd814-6eba-4d8c-8a91-bc9bbf4aa301 */
GUID_INIT(0xdaffd814, 0x6eba, 0x4d8c,
0x8a, 0x91, 0xbc, 0x9b, 0xbf, 0x4a, 0xa3, 0x01);
/* ACPI _DSD data subnodes GUID: dbb8e3e6-5886-4ba6-8795-1319f52a966b */
0x8a, 0x91, 0xbc, 0x9b, 0xbf, 0x4a, 0xa3, 0x01),
/* Hotplug in D3 GUID: 6211e2c0-58a3-4af3-90e1-927a4e0c55a4 */
GUID_INIT(0x6211e2c0, 0x58a3, 0x4af3,
0x90, 0xe1, 0x92, 0x7a, 0x4e, 0x0c, 0x55, 0xa4),
};

static const guid_t ads_guid =
GUID_INIT(0xdbb8e3e6, 0x5886, 0x4ba6,
0x87, 0x95, 0x13, 0x19, 0xf5, 0x2a, 0x96, 0x6b);
Expand Down Expand Up @@ -56,6 +60,7 @@ static bool acpi_nondev_subnode_extract(const union acpi_object *desc,
dn->name = link->package.elements[0].string.pointer;
dn->fwnode.ops = &acpi_data_fwnode_ops;
dn->parent = parent;
INIT_LIST_HEAD(&dn->data.properties);
INIT_LIST_HEAD(&dn->data.subnodes);

result = acpi_extract_properties(desc, &dn->data);
Expand Down Expand Up @@ -288,6 +293,35 @@ static void acpi_init_of_compatible(struct acpi_device *adev)
adev->flags.of_compatible_ok = 1;
}

static bool acpi_is_property_guid(const guid_t *guid)
{
int i;

for (i = 0; i < ARRAY_SIZE(prp_guids); i++) {
if (guid_equal(guid, &prp_guids[i]))
return true;
}

return false;
}

struct acpi_device_properties *
acpi_data_add_props(struct acpi_device_data *data, const guid_t *guid,
const union acpi_object *properties)
{
struct acpi_device_properties *props;

props = kzalloc(sizeof(*props), GFP_KERNEL);
if (props) {
INIT_LIST_HEAD(&props->list);
props->guid = guid;
props->properties = properties;
list_add_tail(&props->list, &data->properties);
}

return props;
}

static bool acpi_extract_properties(const union acpi_object *desc,
struct acpi_device_data *data)
{
Expand All @@ -312,21 +346,21 @@ static bool acpi_extract_properties(const union acpi_object *desc,
properties->type != ACPI_TYPE_PACKAGE)
break;

if (!guid_equal((guid_t *)guid->buffer.pointer, &prp_guid))
if (!acpi_is_property_guid((guid_t *)guid->buffer.pointer))
continue;

/*
* We found the matching GUID. Now validate the format of the
* package immediately following it.
*/
if (!acpi_properties_format_valid(properties))
break;
continue;

data->properties = properties;
return true;
acpi_data_add_props(data, (const guid_t *)guid->buffer.pointer,
properties);
}

return false;
return !list_empty(&data->properties);
}

void acpi_init_properties(struct acpi_device *adev)
Expand All @@ -336,6 +370,7 @@ void acpi_init_properties(struct acpi_device *adev)
acpi_status status;
bool acpi_of = false;

INIT_LIST_HEAD(&adev->data.properties);
INIT_LIST_HEAD(&adev->data.subnodes);

if (!adev->handle)
Expand Down Expand Up @@ -398,11 +433,16 @@ static void acpi_destroy_nondev_subnodes(struct list_head *list)

void acpi_free_properties(struct acpi_device *adev)
{
struct acpi_device_properties *props, *tmp;

acpi_destroy_nondev_subnodes(&adev->data.subnodes);
ACPI_FREE((void *)adev->data.pointer);
adev->data.of_compatible = NULL;
adev->data.pointer = NULL;
adev->data.properties = NULL;
list_for_each_entry_safe(props, tmp, &adev->data.properties, list) {
list_del(&props->list);
kfree(props);
}
}

/**
Expand All @@ -427,32 +467,37 @@ static int acpi_data_get_property(const struct acpi_device_data *data,
const char *name, acpi_object_type type,
const union acpi_object **obj)
{
const union acpi_object *properties;
int i;
const struct acpi_device_properties *props;

if (!data || !name)
return -EINVAL;

if (!data->pointer || !data->properties)
if (!data->pointer || list_empty(&data->properties))
return -EINVAL;

properties = data->properties;
for (i = 0; i < properties->package.count; i++) {
const union acpi_object *propname, *propvalue;
const union acpi_object *property;
list_for_each_entry(props, &data->properties, list) {
const union acpi_object *properties;
unsigned int i;

property = &properties->package.elements[i];
properties = props->properties;
for (i = 0; i < properties->package.count; i++) {
const union acpi_object *propname, *propvalue;
const union acpi_object *property;

propname = &property->package.elements[0];
propvalue = &property->package.elements[1];
property = &properties->package.elements[i];

if (!strcmp(name, propname->string.pointer)) {
if (type != ACPI_TYPE_ANY && propvalue->type != type)
return -EPROTO;
if (obj)
*obj = propvalue;
propname = &property->package.elements[0];
propvalue = &property->package.elements[1];

return 0;
if (!strcmp(name, propname->string.pointer)) {
if (type != ACPI_TYPE_ANY &&
propvalue->type != type)
return -EPROTO;
if (obj)
*obj = propvalue;

return 0;
}
}
}
return -EINVAL;
Expand Down
2 changes: 1 addition & 1 deletion drivers/acpi/x86/apple.c
Original file line number Diff line number Diff line change
Expand Up @@ -132,8 +132,8 @@ void acpi_extract_apple_properties(struct acpi_device *adev)
}
WARN_ON(free_space != (void *)newprops + newsize);

adev->data.properties = newprops;
adev->data.pointer = newprops;
acpi_data_add_props(&adev->data, &apple_prp_guid, newprops);

out_free:
ACPI_FREE(props);
Expand Down
1 change: 0 additions & 1 deletion drivers/crypto/qat/qat_common/adf_aer.c
Original file line number Diff line number Diff line change
Expand Up @@ -198,7 +198,6 @@ static pci_ers_result_t adf_slot_reset(struct pci_dev *pdev)
pr_err("QAT: Can't find acceleration device\n");
return PCI_ERS_RESULT_DISCONNECT;
}
pci_cleanup_aer_uncorrect_error_status(pdev);
if (adf_dev_aer_schedule_reset(accel_dev, ADF_DEV_RESET_SYNC))
return PCI_ERS_RESULT_DISCONNECT;

Expand Down
7 changes: 0 additions & 7 deletions drivers/dma/ioat/init.c
Original file line number Diff line number Diff line change
Expand Up @@ -1252,7 +1252,6 @@ static pci_ers_result_t ioat_pcie_error_detected(struct pci_dev *pdev,
static pci_ers_result_t ioat_pcie_error_slot_reset(struct pci_dev *pdev)
{
pci_ers_result_t result = PCI_ERS_RESULT_RECOVERED;
int err;

dev_dbg(&pdev->dev, "%s post reset handling\n", DRV_NAME);

Expand All @@ -1267,12 +1266,6 @@ static pci_ers_result_t ioat_pcie_error_slot_reset(struct pci_dev *pdev)
pci_wake_from_d3(pdev, false);
}

err = pci_cleanup_aer_uncorrect_error_status(pdev);
if (err) {
dev_err(&pdev->dev,
"AER uncorrect error status clear failed: %#x\n", err);
}

return result;
}

Expand Down
2 changes: 1 addition & 1 deletion drivers/gpio/gpiolib-acpi.c
Original file line number Diff line number Diff line change
Expand Up @@ -1198,7 +1198,7 @@ int acpi_gpio_count(struct device *dev, const char *con_id)
bool acpi_can_fallback_to_crs(struct acpi_device *adev, const char *con_id)
{
/* Never allow fallback if the device has properties */
if (adev->data.properties || adev->driver_gpios)
if (acpi_dev_has_props(adev) || adev->driver_gpios)
return false;

return con_id == NULL;
Expand Down
1 change: 0 additions & 1 deletion drivers/infiniband/hw/hfi1/pcie.c
Original file line number Diff line number Diff line change
Expand Up @@ -650,7 +650,6 @@ pci_resume(struct pci_dev *pdev)
struct hfi1_devdata *dd = pci_get_drvdata(pdev);

dd_dev_info(dd, "HFI1 resume function called\n");
pci_cleanup_aer_uncorrect_error_status(pdev);
/*
* Running jobs will fail, since it's asynchronous
* unlike sysfs-requested reset. Better than
Expand Down
1 change: 0 additions & 1 deletion drivers/infiniband/hw/qib/qib_pcie.c
Original file line number Diff line number Diff line change
Expand Up @@ -597,7 +597,6 @@ qib_pci_resume(struct pci_dev *pdev)
struct qib_devdata *dd = pci_get_drvdata(pdev);

qib_devinfo(pdev, "QIB resume function called\n");
pci_cleanup_aer_uncorrect_error_status(pdev);
/*
* Running jobs will fail, since it's asynchronous
* unlike sysfs-requested reset. Better than
Expand Down
2 changes: 0 additions & 2 deletions drivers/net/ethernet/atheros/alx/main.c
Original file line number Diff line number Diff line change
Expand Up @@ -1964,8 +1964,6 @@ static pci_ers_result_t alx_pci_error_slot_reset(struct pci_dev *pdev)
if (!alx_reset_mac(hw))
rc = PCI_ERS_RESULT_RECOVERED;
out:
pci_cleanup_aer_uncorrect_error_status(pdev);

rtnl_unlock();

return rc;
Expand Down
7 changes: 0 additions & 7 deletions drivers/net/ethernet/broadcom/bnx2.c
Original file line number Diff line number Diff line change
Expand Up @@ -8793,13 +8793,6 @@ static pci_ers_result_t bnx2_io_slot_reset(struct pci_dev *pdev)
if (!(bp->flags & BNX2_FLAG_AER_ENABLED))
return result;

err = pci_cleanup_aer_uncorrect_error_status(pdev);
if (err) {
dev_err(&pdev->dev,
"pci_cleanup_aer_uncorrect_error_status failed 0x%0x\n",
err); /* non-fatal, continue */
}

return result;
}

Expand Down
8 changes: 0 additions & 8 deletions drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
Original file line number Diff line number Diff line change
Expand Up @@ -14385,14 +14385,6 @@ static pci_ers_result_t bnx2x_io_slot_reset(struct pci_dev *pdev)

rtnl_unlock();

/* If AER, perform cleanup of the PCIe registers */
if (bp->flags & AER_ENABLED) {
if (pci_cleanup_aer_uncorrect_error_status(pdev))
BNX2X_ERR("pci_cleanup_aer_uncorrect_error_status failed\n");
else
DP(NETIF_MSG_HW, "pci_cleanup_aer_uncorrect_error_status succeeded\n");
}

return PCI_ERS_RESULT_RECOVERED;
}

Expand Down
7 changes: 0 additions & 7 deletions drivers/net/ethernet/broadcom/bnxt/bnxt.c
Original file line number Diff line number Diff line change
Expand Up @@ -9231,13 +9231,6 @@ static pci_ers_result_t bnxt_io_slot_reset(struct pci_dev *pdev)

rtnl_unlock();

err = pci_cleanup_aer_uncorrect_error_status(pdev);
if (err) {
dev_err(&pdev->dev,
"pci_cleanup_aer_uncorrect_error_status failed 0x%0x\n",
err); /* non-fatal, continue */
}

return PCI_ERS_RESULT_RECOVERED;
}

Expand Down
1 change: 0 additions & 1 deletion drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
Original file line number Diff line number Diff line change
Expand Up @@ -4747,7 +4747,6 @@ static pci_ers_result_t eeh_slot_reset(struct pci_dev *pdev)
pci_set_master(pdev);
pci_restore_state(pdev);
pci_save_state(pdev);
pci_cleanup_aer_uncorrect_error_status(pdev);

if (t4_wait_dev_ready(adap->regs) < 0)
return PCI_ERS_RESULT_DISCONNECT;
Expand Down
1 change: 0 additions & 1 deletion drivers/net/ethernet/emulex/benet/be_main.c
Original file line number Diff line number Diff line change
Expand Up @@ -6151,7 +6151,6 @@ static pci_ers_result_t be_eeh_reset(struct pci_dev *pdev)
if (status)
return PCI_ERS_RESULT_DISCONNECT;

pci_cleanup_aer_uncorrect_error_status(pdev);
be_clear_error(adapter, BE_CLEAR_ALL);
return PCI_ERS_RESULT_RECOVERED;
}
Expand Down
2 changes: 0 additions & 2 deletions drivers/net/ethernet/intel/e1000e/netdev.c
Original file line number Diff line number Diff line change
Expand Up @@ -6854,8 +6854,6 @@ static pci_ers_result_t e1000_io_slot_reset(struct pci_dev *pdev)
result = PCI_ERS_RESULT_RECOVERED;
}

pci_cleanup_aer_uncorrect_error_status(pdev);

return result;
}

Expand Down
Loading

0 comments on commit 20634dc

Please sign in to comment.