It seems that "no officially accepted spec about SMART attribute decoding" also applies here, in the sense that way too many drives get the raw counts wrong. In the 30 or so logs that I looked at in the various Launchpad/RedHat/fd.o bug reports related to this, I didn't see an implausible normalized value, though.
I appreciate the effort of doing vendor-independent bad-block checking, but a lot of people get tons of false alarms because of it, and thus won't believe it any more when a disk really is failing some day.
My feeling is that a more cautious approach would be to use the normalized value vs. threshold for the time being, and switch to the raw values if/when they can be made more reliable (then we should use something in between logarithmic and linear scaling, though, since due to sheer probabilities large disks will have more bad sectors, and also more reserve sectors, than small ones).
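To make that concrete, here is a minimal sketch in C of what I mean: attribute_failing() mirrors the normalized-value-vs.-threshold comparison, and raw_count_suspicious() shows one possible sub-linear scaling of the raw limit with disk size (sqrt as a compromise between logarithmic and linear). The function names, the sqrt() choice and the factor 2.0 are purely illustrative assumptions of mine, not anything taken from libatasmart or udisks:

    /* Sketch only: trust the firmware's own normalized-vs-threshold
     * verdict for now; the raw-count limit is a hypothetical add-on. */
    #include <math.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Attribute is "failing" only when the normalized current value
     * has dropped to or below the vendor threshold. */
    static bool attribute_failing(uint8_t current, uint8_t threshold)
    {
            /* Threshold 0 means "not used for failure prediction". */
            if (threshold == 0)
                    return false;
            return current <= threshold;
    }

    /* Hypothetical raw-count check: tolerate more reallocated sectors
     * on larger disks, growing sub-linearly with capacity. */
    static bool raw_count_suspicious(uint64_t reallocated, uint64_t disk_bytes)
    {
            double gib = (double) disk_bytes / (1024.0 * 1024.0 * 1024.0);
            /* e.g. ~16 sectors allowed at 64 GiB, ~64 at 1 TiB */
            double limit = 2.0 * sqrt(gib);
            return (double) reallocated > limit;
    }

    int main(void)
    {
            /* normalized 100 vs. threshold 5 -> not failing */
            printf("failing: %d\n", attribute_failing(100, 5));
            /* 20 reallocated sectors on a 500 GB disk -> still under limit */
            printf("suspicious: %d\n",
                   raw_count_suspicious(20, 500ULL * 1000 * 1000 * 1000));
            return 0;
    }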
The bigger problem with this is (as you already mentioned) that the raw value is misparsed way too often. Random examples from bug reports (see the sketch after the examples):
http://launchpadlibrarian.net/34574037/smartctl.txt
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       327697

http://launchpadlibrarian.net/35971054/smartctl_tests.log
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       65542

http://launchpadlibrarian.net/36599746/smartctl_tests-deer.log
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       65552

https://bugzilla.redhat.com/attachment.cgi?id=382378
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       655424

https://bugzilla.redhat.com/show_bug.cgi?id=506254
  reallocated-sector-count 100/100/5 FAIL 1900724 sectors Prefail Online
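For what it's worth, the odd-looking raw numbers above decompose into small, believable counts if one assumes (and this is only my assumption, the logs don't confirm it) that those vendors pack several 16-bit counters into the 48-bit raw field, and the whole field then gets read as a single integer. A quick C illustration:

    /* Split the reported raw values into 16-bit words, assuming (purely
     * hypothetically) a packed-counter layout in the 48-bit raw field. */
    #include <inttypes.h>
    #include <stdio.h>

    int main(void)
    {
            uint64_t raw_examples[] = { 327697, 65542, 65552, 655424 };
            size_t n = sizeof(raw_examples) / sizeof(raw_examples[0]);

            for (size_t i = 0; i < n; i++) {
                    uint64_t raw = raw_examples[i];
                    uint16_t word0 = raw & 0xFFFF;          /* lowest 16 bits */
                    uint16_t word1 = (raw >> 16) & 0xFFFF;  /* next 16 bits   */

                    printf("raw %-8" PRIu64 " -> word0=%u word1=%u\n",
                           raw, (unsigned) word0, (unsigned) word1);
            }
            return 0;
    }

Running this prints e.g. "raw 65542 -> word0=6 word1=1", which looks a lot more like a real reallocation count than 65542 does.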