Merge branch 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core

* 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/i7core: (83 commits)
  i7core_edac: Better describe the supported devices
  Add support for Westmere to i7core_edac driver
  i7core_edac: don't free on success
  i7core_edac: Add support for X5670
  Always call i7core_[ur]dimm_check_mc_ecc_err
  i7core_edac: fix memory leak of i7core_dev
  EDAC: add __init to i7core_xeon_pci_fixup
  i7core_edac: Fix wrong device id for channel 1 devices
  i7core: add support for Lynnfield alternate address
  i7core_edac: Add initial support for Lynnfield
  i7core_edac: do not export static functions
  edac: fix i7core build
  edac: i7core_edac produces undefined behaviour on 32bit
  i7core_edac: Use a more generic approach for probing PCI devices
  i7core_edac: PCI device is called NONCORE, instead of NOCORE
  i7core_edac: Fix ringbuffer maxsize
  i7core_edac: First store, then increment
  i7core_edac: Better parse "any" addrmask
  i7core_edac: Use a lockless ringbuffer
  edac: Create an unique instance for each kobj
  ...
Linus Torvalds committed Jun 4, 2010
2 parents e620d1e + 52707f9 commit 9a9620d
Showing 13 changed files with 2,598 additions and 44 deletions.
152 changes: 152 additions & 0 deletions Documentation/edac.txt
@@ -6,6 +6,8 @@ Written by Doug Thompson <dougthompson@xmission.com>
7 Dec 2005
17 Jul 2007 Updated

(c) Mauro Carvalho Chehab <mchehab@redhat.com>
05 Aug 2009 Nehalem interface

EDAC is maintained and written by:

@@ -717,3 +719,153 @@ unique drivers for their hardware systems.
The 'test_device_edac' sample driver is located at the
bluesmoke.sourceforge.net project site for EDAC.

=======================================================================
NEHALEM USAGE OF EDAC APIs

This chapter documents some EXPERIMENTAL mappings of the EDAC API used by the
Nehalem EDAC driver. They will likely change in future versions
of the driver.

Due to the way Nehalem exports Memory Controller data, some adjustments
were made in the i7core_edac driver. This chapter covers those differences.

1) On Nehalem, there is one Memory Controller per Quick Path Interconnect
(QPI). In the driver, the term "socket" means one QPI. This is
associated with a physical CPU socket.

Each MC has 3 physical read channels, 3 physical write channels and
3 logical channels. The driver currently sees it as just 3 channels.
Each channel can have up to 3 DIMMs.

The minimum known unit is the DIMM. There is no information about csrows.
As the EDAC API maps the minimum unit as csrows, the driver sequentially
maps channel/dimm into different csrows.

For example, supposing the following layout:
Ch0 phy rd0, wr0 (0x063f4031): 2 ranks, UDIMMs
dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400
Ch1 phy rd1, wr1 (0x063f4031): 2 ranks, UDIMMs
dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
Ch2 phy rd3, wr3 (0x063f4031): 2 ranks, UDIMMs
dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
The driver will map it as:
csrow0: channel 0, dimm0
csrow1: channel 0, dimm1
csrow2: channel 1, dimm0
csrow3: channel 2, dimm0

The driver exports one DIMM per csrow.

Each QPI is exported as a different memory controller.
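
The sequential channel/dimm to csrow numbering described above can be
sketched in shell. This is only an illustration, not the driver's code: the
function name map_csrows and the "channel:dimm" argument format are
hypothetical, and the populated slots are assumed to be passed already
ordered by channel and then by dimm.

```shell
#!/bin/sh
# Hypothetical helper (not part of the driver): print one csrow per
# populated "channel:dimm" slot, numbering csrows in the order given.
map_csrows() {
    csrow=0
    for slot in "$@"; do
        channel=${slot%%:*}    # text before the colon
        dimm=${slot##*:}       # text after the colon
        echo "csrow${csrow}: channel ${channel}, dimm${dimm}"
        csrow=$((csrow + 1))
    done
}

# Populated slots taken from the example layout above.
map_csrows 0:0 0:1 1:0 2:0
```

With the four slots of the example layout, the helper prints the same
csrow0..csrow3 assignment listed above.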

2) The Nehalem MC has the ability to generate errors. The driver implements this
functionality via some error injection nodes:

For injecting a memory error, there are some sysfs nodes, under
/sys/devices/system/edac/mc/mc?/:

inject_addrmatch/*:
Controls the error injection mask register. It is possible to specify
several characteristics of the address to match an error code:
dimm = the affected dimm. Numbers are relative to a channel;
rank = the memory rank;
channel = the channel that will generate an error;
bank = the affected bank;
page = the page address;
column (or col) = the address column.
Each of the above values can be set to "any" to match any valid value.

At driver init, all values are set to any.

For example, to generate an error at rank 1 of dimm 2, for any channel,
any bank, any page, any column:
echo 2 >/sys/devices/system/edac/mc/mc0/inject_addrmatch/dimm
echo 1 >/sys/devices/system/edac/mc/mc0/inject_addrmatch/rank

To return to the default behaviour of matching any, you can do:
echo any >/sys/devices/system/edac/mc/mc0/inject_addrmatch/dimm
echo any >/sys/devices/system/edac/mc/mc0/inject_addrmatch/rank

inject_eccmask:
specifies which bits will be affected by the error;

inject_section:
specifies what ECC cache section will get the error:
3 for both
2 for the highest
1 for the lowest

inject_type:
specifies the type of error, being a combination of the following bits:
bit 0 - repeat
bit 1 - ecc
bit 2 - parity

inject_enable starts the error generation when a value other
than 0 is written.

All inject variables can be read. Root permission is needed to write them.
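
Since inject_type is a combination of bits, the value to write can be
computed from flag names. The sketch below assumes only the bit positions
listed above (bit 0 repeat, bit 1 ecc, bit 2 parity); the function name
inject_type_value is hypothetical and is not something the driver provides.

```shell
#!/bin/sh
# Hypothetical helper: turn flag names into the numeric inject_type value.
inject_type_value() {
    value=0
    for flag in "$@"; do
        case "$flag" in
            repeat) value=$((value | 1)) ;;   # bit 0
            ecc)    value=$((value | 2)) ;;   # bit 1
            parity) value=$((value | 4)) ;;   # bit 2
            *) echo "unknown flag: $flag" >&2; return 1 ;;
        esac
    done
    echo "$value"
}

inject_type_value ecc          # prints 2
inject_type_value repeat ecc   # prints 3
```

The result could then be written to inject_type, for example with
echo "$(inject_type_value ecc)" >/sys/devices/system/edac/mc/mc0/inject_type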

The datasheet states that the error will only be generated after a write to an
address that matches inject_addrmatch. It seems, however, that reading will
also produce an error.

For example, the following code will generate an error for any write access
at socket 0, on any DIMM/address on channel 2:

echo 2 >/sys/devices/system/edac/mc/mc0/inject_addrmatch/channel
echo 2 >/sys/devices/system/edac/mc/mc0/inject_type
echo 64 >/sys/devices/system/edac/mc/mc0/inject_eccmask
echo 3 >/sys/devices/system/edac/mc/mc0/inject_section
echo 1 >/sys/devices/system/edac/mc/mc0/inject_enable
dd if=/dev/mem of=/dev/null seek=16k bs=4k count=1 >& /dev/null

For socket 1, replace "mc0" with "mc1" in the above
commands.
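
To avoid editing every path when switching sockets, the sysfs writes can be
wrapped in a function that takes the controller directory as a parameter.
This is a minimal sketch, assuming the node names documented above; the
function name arm_ecc_injection and its arguments are hypothetical. It only
arms the injection, so a memory access such as the dd command above is still
needed to trigger the error.

```shell
#!/bin/sh
# Hypothetical wrapper around the sysfs writes documented above.
#   $1: controller directory, e.g. /sys/devices/system/edac/mc/mc1
#   $2: channel number to match
arm_ecc_injection() {
    mc_dir=$1
    channel=$2
    echo "$channel" > "$mc_dir/inject_addrmatch/channel" || return 1
    echo 2  > "$mc_dir/inject_type"    || return 1   # bit 1: ecc
    echo 64 > "$mc_dir/inject_eccmask" || return 1   # same mask as above
    echo 3  > "$mc_dir/inject_section" || return 1   # both sections
    echo 1  > "$mc_dir/inject_enable"  || return 1   # start injection
}

# Usage (as root, with the i7core_edac driver loaded):
#   arm_ecc_injection /sys/devices/system/edac/mc/mc1 2
```

Each write is checked, so the function stops at the first node that cannot
be written, for instance when it is not run as root.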

The generated error message will look like:

EDAC MC0: UE row 0, channel-a= 0 channel-b= 0 labels "-": NON_FATAL (addr = 0x0075b980, socket=0, Dimm=0, Channel=2, syndrome=0x00000040, count=1, Err=8c0000400001009f:4000080482 (read error: read ECC error))

3) Nehalem specific Corrected Error memory counters

Nehalem has some registers to count memory errors. The driver uses those
registers to report Corrected Errors on devices with Registered DIMMs.

However, those counters don't work with Unregistered DIMMs. As the chipset
offers some counters that also work with UDIMMs (but with a coarser level of
granularity than the default ones), the driver exposes those registers for
UDIMM memories.

They can be read by looking at the contents of all_channel_counts/:

$ for i in /sys/devices/system/edac/mc/mc0/all_channel_counts/*; do echo $i; cat $i; done
/sys/devices/system/edac/mc/mc0/all_channel_counts/udimm0
0
/sys/devices/system/edac/mc/mc0/all_channel_counts/udimm1
0
/sys/devices/system/edac/mc/mc0/all_channel_counts/udimm2
0

What happens here is that errors on different csrows, but at the same
dimm number will increment the same counter.
So, in this memory mapping:
csrow0: channel 0, dimm0
csrow1: channel 0, dimm1
csrow2: channel 1, dimm0
csrow3: channel 2, dimm0
The hardware will increment udimm0 for an error at the first dimm of any
channel (csrow0, csrow2 or csrow3 in this mapping);
The hardware will increment udimm1 for an error at the second dimm of any
channel (csrow1 in this mapping);
The hardware will increment udimm2 for an error at the third dimm of any
channel (not populated in this mapping).
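
The rule that the counter depends only on the dimm number, and not on the
channel, can be sketched as follows. The helper name udimm_counter_for and
the "channel:dimm" slot notation are hypothetical, used here only to
illustrate the grouping for the memory mapping above.

```shell
#!/bin/sh
# Hypothetical helper: name the all_channel_counts/ counter that covers a
# "channel:dimm" slot. Only the dimm number matters; the channel is ignored.
udimm_counter_for() {
    slot=$1
    echo "udimm${slot##*:}"
}

# The four populated slots of the mapping above (csrow0..csrow3).
for slot in 0:0 0:1 1:0 2:0; do
    echo "channel ${slot%%:*}, dimm${slot##*:} -> $(udimm_counter_for "$slot")"
done
```

Under this assumption, three of the four csrows in the example share udimm0,
which is why the per-UDIMM counters are coarser than the per-csrow ones.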

4) Standard error counters

The standard error counters are updated when an mcelog error is received
by the driver. Since, with UDIMMs, this is counted by software, it is
possible that some errors could be lost. With RDIMMs, they display the
contents of the registers.
2 changes: 2 additions & 0 deletions arch/x86/include/asm/pci_x86.h
@@ -53,6 +53,8 @@ extern int pcibios_last_bus;
extern struct pci_bus *pci_root_bus;
extern struct pci_ops pci_root_ops;

void pcibios_scan_specific_bus(int busn);

/* pci-irq.c */

struct irq_info {
10 changes: 10 additions & 0 deletions arch/x86/kernel/cpu/mcheck/mce.c
@@ -36,6 +36,7 @@
#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/debugfs.h>
#include <linux/edac_mce.h>

#include <asm/processor.h>
#include <asm/hw_irq.h>
@@ -168,6 +169,15 @@ void mce_log(struct mce *mce)
for (;;) {
entry = rcu_dereference_check_mce(mcelog.next);
for (;;) {
/*
* If edac_mce is enabled, it will check the error type
* and will process it, if it is a known error.
* Otherwise, the error will be sent through mcelog
* interface
*/
if (edac_mce_parse(mce))
return;

/*
* When the buffer fills up discard new entries.
* Assume that the earlier errors are the more
42 changes: 25 additions & 17 deletions arch/x86/pci/legacy.c
@@ -11,28 +11,14 @@
  */
 static void __devinit pcibios_fixup_peer_bridges(void)
 {
-	int n, devfn;
-	long node;
+	int n;

 	if (pcibios_last_bus <= 0 || pcibios_last_bus > 0xff)
 		return;
 	DBG("PCI: Peer bridge fixup\n");

-	for (n=0; n <= pcibios_last_bus; n++) {
-		u32 l;
-		if (pci_find_bus(0, n))
-			continue;
-		node = get_mp_bus_to_node(n);
-		for (devfn = 0; devfn < 256; devfn += 8) {
-			if (!raw_pci_read(0, n, devfn, PCI_VENDOR_ID, 2, &l) &&
-			    l != 0x0000 && l != 0xffff) {
-				DBG("Found device at %02x:%02x [%04x]\n", n, devfn, l);
-				printk(KERN_INFO "PCI: Discovered peer bus %02x\n", n);
-				pci_scan_bus_on_node(n, &pci_root_ops, node);
-				break;
-			}
-		}
-	}
+	for (n=0; n <= pcibios_last_bus; n++)
+		pcibios_scan_specific_bus(n);
 }

int __init pci_legacy_init(void)
@@ -50,6 +36,28 @@ int __init pci_legacy_init(void)
return 0;
}

void pcibios_scan_specific_bus(int busn)
{
int devfn;
long node;
u32 l;

if (pci_find_bus(0, busn))
return;

node = get_mp_bus_to_node(busn);
for (devfn = 0; devfn < 256; devfn += 8) {
if (!raw_pci_read(0, busn, devfn, PCI_VENDOR_ID, 2, &l) &&
l != 0x0000 && l != 0xffff) {
DBG("Found device at %02x:%02x [%04x]\n", busn, devfn, l);
printk(KERN_INFO "PCI: Discovered peer bus %02x\n", busn);
pci_scan_bus_on_node(busn, &pci_root_ops, node);
return;
}
}
}
EXPORT_SYMBOL_GPL(pcibios_scan_specific_bus);

int __init pci_subsys_init(void)
{
/*
13 changes: 13 additions & 0 deletions drivers/edac/Kconfig
@@ -69,6 +69,9 @@ config EDAC_MM_EDAC
occurred so that a particular failing memory module can be
replaced. If unsure, select 'Y'.

config EDAC_MCE
bool

config EDAC_AMD64
tristate "AMD64 (Opteron, Athlon64) K8, F10h, F11h"
depends on EDAC_MM_EDAC && K8_NB && X86_64 && PCI && EDAC_DECODE_MCE
@@ -166,6 +169,16 @@ config EDAC_I5400
Support for error detection and correction the Intel
i5400 MCH chipset (Seaburg).

config EDAC_I7CORE
tristate "Intel i7 Core (Nehalem) processors"
depends on EDAC_MM_EDAC && PCI && X86
select EDAC_MCE
help
Support for error detection and correction the Intel
i7 Core (Nehalem) Integrated Memory Controller that exists on
newer processors like i7 Core, i7 Core Extreme, Xeon 35xx
and Xeon 55xx processors.

config EDAC_I82860
tristate "Intel 82860"
depends on EDAC_MM_EDAC && PCI && X86_32
2 changes: 2 additions & 0 deletions drivers/edac/Makefile
@@ -8,6 +8,7 @@

obj-$(CONFIG_EDAC) := edac_stub.o
obj-$(CONFIG_EDAC_MM_EDAC) += edac_core.o
obj-$(CONFIG_EDAC_MCE) += edac_mce.o

edac_core-objs := edac_mc.o edac_device.o edac_mc_sysfs.o edac_pci_sysfs.o
edac_core-objs += edac_module.o edac_device_sysfs.o
@@ -23,6 +24,7 @@ obj-$(CONFIG_EDAC_CPC925) += cpc925_edac.o
obj-$(CONFIG_EDAC_I5000) += i5000_edac.o
obj-$(CONFIG_EDAC_I5100) += i5100_edac.o
obj-$(CONFIG_EDAC_I5400) += i5400_edac.o
obj-$(CONFIG_EDAC_I7CORE) += i7core_edac.o
obj-$(CONFIG_EDAC_E7XXX) += e7xxx_edac.o
obj-$(CONFIG_EDAC_E752X) += e752x_edac.o
obj-$(CONFIG_EDAC_I82443BXGX) += i82443bxgx_edac.o
23 changes: 22 additions & 1 deletion drivers/edac/edac_core.h
@@ -341,12 +341,30 @@ struct csrow_info {
struct channel_info *channels;
};

struct mcidev_sysfs_group {
const char *name; /* group name */
struct mcidev_sysfs_attribute *mcidev_attr; /* group attributes */
};

struct mcidev_sysfs_group_kobj {
struct list_head list; /* list for all instances within a mc */

struct kobject kobj; /* kobj for the group */

struct mcidev_sysfs_group *grp; /* group description table */
struct mem_ctl_info *mci; /* the parent */
};

/* mcidev_sysfs_attribute structure
* used for driver sysfs attributes and in mem_ctl_info
* sysfs top level entries
*/
 struct mcidev_sysfs_attribute {
-	struct attribute attr;
+	/* It should use either attr or grp */
+	struct attribute attr;
+	struct mcidev_sysfs_group *grp;	/* Points to a group of attributes */
+
+	/* Ops for show/store values at the attribute - not used on group */
 	ssize_t (*show)(struct mem_ctl_info *,char *);
 	ssize_t (*store)(struct mem_ctl_info *, const char *,size_t);
 };
@@ -424,6 +442,9 @@ struct mem_ctl_info {
/* edac sysfs device control */
struct kobject edac_mci_kobj;

/* list for all grp instances within a mc */
struct list_head grp_kobj_list;

/* Additional top controller level attributes, but specified
* by the low level driver.
*