Comment 0 for bug 1769730

dann frazier (dannf) wrote :

[Impact]
APEI (the ACPI Platform Error Interface) is supposed to report PCIe errors to the AER (Advanced Error Reporting) driver, which surfaces them to userspace. However, we currently report only "recoverable" errors and not errors of other severities (e.g. correctable), which hides signs of faulty hardware from the user.

[Test Case]
$ sudo apt install rasdaemon
# On a system that supports ACPI EINJ (check with: dmesg | grep "ACPI: EINJ"), use the attached script to inject a correctable PCIe error.
$ sudo ras-mc-ctl --errors
# There should be an entry for the injected error, as shown below:
No Memory errors.

PCIe AER events:
1 2018-05-07 17:55:46 +0000 Corrected error: Receiver Error

No Extlog errors.

No MCE errors.
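
# Optional: a minimal sketch for checking the result non-interactively. It is not part of the attached script; the function name has_aer_event is hypothetical, and it assumes the report layout shown above (a "PCIe AER events:" header followed by numbered entries and a blank line).

```shell
# has_aer_event (hypothetical helper): reads `ras-mc-ctl --errors` output
# on stdin and succeeds only if the "PCIe AER events:" section contains
# at least one numbered entry.
has_aer_event() {
    awk '
        /^PCIe AER events:/        { in_aer = 1; next }  # section starts
        in_aer && /^[[:space:]]*$/ { in_aer = 0 }        # blank line ends it
        in_aer && /^[0-9]+ /       { found = 1 }         # numbered entry
        END                        { exit(found ? 0 : 1) }
    '
}
```

# Possible usage:
$ sudo ras-mc-ctl --errors | has_aer_event && echo "AER event recorded"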

[Regression Risk]