do not generate apport reports for non-critical kernel messages

Bug #346303 reported by Matt Zimmerman
This bug affects 6 people
Affects               Status        Importance  Assigned to   Milestone
apport (Ubuntu)       Invalid       Low         Unassigned
  (Nominated for Karmic by Mark Stosberg)
  Lucid               Invalid       Low         Unassigned
  Oneiric             Invalid       Undecided   Unassigned
kerneloops (Ubuntu)   Fix Released  Low         James Westby
  (Nominated for Karmic by Mark Stosberg)
  Lucid               Won't Fix     Low         James Westby
  Oneiric             Fix Released  High        Unassigned

Bug Description

Binary package hint: kerneloops

The dialog indicates a "serious kernel problem", but kerneloops also triggers on WARNING: (which is not a serious problem). Perhaps kerneloops should not trigger apport for WARNINGs, or perhaps it should not be described as a serious problem by apport.

ProblemType: Bug
Architecture: amd64
DistroRelease: Ubuntu 9.04
Package: kerneloops 0.12-0ubuntu4
ProcEnviron:
 LC_COLLATE=C
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/zsh
SourcePackage: kerneloops
Uname: Linux 2.6.28-11-generic x86_64

Revision history for this message
Matt Zimmerman (mdz) wrote :
Changed in apport (Ubuntu):
importance: Undecided → Low
status: New → Triaged
Changed in kerneloops (Ubuntu):
importance: Undecided → Low
status: New → Triaged
Revision history for this message
Mark Stosberg (markstos) wrote :

I disagree about the "Low Importance" of this.

For many ThinkPad users like myself, our first impression of Karmic is a pop-up stating your kernel has encountered a "serious error" and that the system may be unstable and may need to be shut down. This is due to an otherwise harmless warning from the ibm-acpi driver, which triggers this scary message on a daily basis for us.

To me it seems very important that warnings should be handled differently. Warnings may be just that: a warning. It is inappropriate to treat them as a "serious error" that "may require a restart".

Here's a related bug about the issue:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/430361

You can see that there were about 20 duplicate bug reports filed about this even before Karmic was officially released.

Matt Zimmerman (mdz)
Changed in apport (Ubuntu):
assignee: nobody → Canonical Desktop Team (canonical-desktop-team)
Revision history for this message
Steve Beattie (sbeattie) wrote :

A couple of things:

First, a kerneloops update was issued last night (see bug 471137) that effectively neuters kerneloops by default for karmic, so you should stop seeing these once you apply the update. (It can be re-enabled by editing /etc/default/kerneloops)

What kerneloops is doing is monitoring the kernel log, looking for output that looks like a kernel oops (hence the name) or was generated by a BUG_ON() or WARN_ON() call. I haven't looked at what typical WARN_ON()s are to verify that they are generally safe to ignore. In the case of the thinkpad_acpi issue in bug 430361, it's a false positive in that the reported message starts with "Warning", which is what kerneloops triggers on to look for an invocation of WARN_ON(). The thinkpad_acpi kernel code that emits it should either be fixed to do a proper WARN_ON() if it's warranted, or the string should be changed to "Alert" or something similar.

Thanks!

Martin Pitt (pitti)
Changed in apport (Ubuntu):
assignee: Canonical Desktop Team (canonical-desktop-team) → Martin Pitt (pitti)
Revision history for this message
Henrique de Moraes Holschuh (hmh) wrote :

Oh yes, I am going to change kernel code because your tool scared your users senseless for a LOG_WARN message. Forget it.

LOG_WARN is of LOWER severity than KERN_ERR, which is also somewhat common, and in no way means the kernel is unstable. At most it means a driver encountered an error condition that the user should know about.

We have KERN_CRIT, KERN_ALERT and KERN_EMERG for the situations where something really bad is about to happen, and WARN_ON() and BUG_ON() for kernel bugs or other really strange situations that should be notified to the kernel people.
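
The ordering described above can be made concrete. The following is a minimal sketch in Python, assuming raw kernel log lines carry the classic `<N>` printk prefix (as seen in /proc/kmsg). The helper names `parse_level` and `is_critical` are hypothetical and are not part of kerneloops or apport; "LOG_WARN" is read here as the kernel's KERN_WARNING level.

```python
import re

# Kernel printk levels; a lower number means a higher severity.
KERN_LEVELS = {
    0: "KERN_EMERG",
    1: "KERN_ALERT",
    2: "KERN_CRIT",
    3: "KERN_ERR",
    4: "KERN_WARNING",
    5: "KERN_NOTICE",
    6: "KERN_INFO",
    7: "KERN_DEBUG",
}

_PREFIX = re.compile(r"^<([0-7])>")


def parse_level(line):
    """Return the numeric level from a '<N>' prefix, or None if absent."""
    match = _PREFIX.match(line)
    return int(match.group(1)) if match else None


def is_critical(line):
    """True only for KERN_CRIT and more severe levels (EMERG, ALERT, CRIT)."""
    level = parse_level(line)
    return level is not None and level <= 2
```

Under this reading, a `<4>` (warning) or `<3>` (error) line never counts as critical; only levels 0 to 2 do.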

It gets even more strange to see this kind of answer in a bug (#430361) that was ultimately caused by some utterly broken tool that touches sysfs files at random. Instead of people scrambling out to find out why something they are about to ship is reading files in sysfs at random, I get asked to change a kernel driver log message so that it won't trip a (broken) regexp that thinks anything that starts with "WARNING" is the result of WARN_ON?!

Compare THIS:
------------[ cut here ]------------
WARNING: at drivers/platform/x86/thinkpad_acpi.c:3759 hotkey_enabledisable_warn+0x50/0x65 [thinkpad_acpi]()
Hardware name: 2687DDU
thinkpad_acpi: hotkey enable/disable functionality has been removed from the driver. Hotkeys are always enabled
Modules linked in: thinkpad_acpi ppp_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc option usbserial usb_storage radeon drm i2c_core tp_smapi thinkpad_ec snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm rfkill snd_seq_dummy ipw2200 libipw video output snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq tg3 libphy snd_timer snd_seq_device snd soundcore snd_page_alloc ehci_hcd uhci_hcd joydev [last unloaded: thinkpad_acpi]
Pid: 12451, comm: bash Not tainted 2.6.31.6-t43 #1
Call Trace:
 [<c1021119>] warn_slowpath_common+0x60/0x90
 [<c102117d>] warn_slowpath_fmt+0x24/0x27
 [<fa969431>] hotkey_enabledisable_warn+0x50/0x65 [thinkpad_acpi]
 [<fa9694d3>] hotkey_write+0x8d/0x164 [thinkpad_acpi]
 [<fa969980>] dispatch_procfs_write+0x71/0x91 [thinkpad_acpi]
 [<c10a54d3>] proc_file_write+0x6b/0x86
 [<c10a5468>] ? proc_file_write+0x0/0x86
 [<c10a1cbd>] proc_reg_write+0x81/0x95
 [<c10a1c3c>] ? proc_reg_write+0x0/0x95
 [<c10707ee>] vfs_write+0x8a/0x113
 [<c1070910>] sys_write+0x3b/0x60
 [<c10028f0>] sysenter_do_call+0x12/0x22
---[ end trace 7534b65473f2e981 ]---

to THIS:
thinkpad_acpi: WARNING: sysfs attribute hotkey_enable is deprecated and will be removed. Hotkeys can be disabled through hotkey_mask

Those "cut here" and "end trace" lines are not there for show.
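
The difference between the two outputs can be sketched in code. This is a minimal Python illustration, not the kerneloops implementation (which is written in C and is not shown in this thread). It assumes the delimiter strings and the "WARNING: at " prefix exactly as they appear in the trace quoted above; the function names `naive_match` and `find_warn_blocks` are hypothetical.

```python
CUT_HERE = "------------[ cut here ]------------"
END_TRACE_PREFIX = "---[ end trace "


def naive_match(line):
    """The kind of check criticised above: any line mentioning WARNING."""
    return "WARNING" in line


def find_warn_blocks(lines):
    """Collect blocks delimited by 'cut here' and 'end trace' lines.

    A block is kept only if the line right after the 'cut here' marker
    starts with 'WARNING: at ', as in the WARN_ON() output quoted above.
    """
    blocks = []
    current = None
    for line in lines:
        text = line.strip()
        if text == CUT_HERE:
            current = [text]  # start (or restart) a candidate block
        elif current is not None:
            current.append(text)
            if text.startswith(END_TRACE_PREFIX):
                if len(current) > 1 and current[1].startswith("WARNING: at "):
                    blocks.append(current)
                current = None
    return blocks
```

With this approach, the one-line thinkpad_acpi printk is matched by `naive_match` but produces no block from `find_warn_blocks`, while the delimited trace does.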

Revision history for this message
Matt Zimmerman (mdz) wrote : Re: [Bug 346303] Re: Kernel oops dialog is inconsistent with kerneloops semantics

On Sun, Nov 15, 2009 at 05:13:53PM -0000, Henrique de Moraes Holschuh wrote:
> Oh yes, I am going to change kernel code because your tool scared your
> users senseless for a LOG_WARN message. Forget it.

I don't know what you're referring to here. Did someone ask you to change
kernel code?

The problem in Ubuntu (as explained in the title and description of this
bug) is that there is a mismatch in the messages presented to the user in:

 * Apport (which says a "serious kernel problem" has been encountered), and

 * Kerneloops, which attempts to capture many other things which are NOT
   "serious kernel problems"

Right now, apport is triggered for any kerneloops event, which is why we
have this problem.

In my view, the way to fix this is to enable apport or kerneloops to
differentiate between serious and non-serious events, and only trigger an
apport problem report for a serious one.

Is there anything about that which you find problematic?

--
 - mdz

Revision history for this message
Henrique de Moraes Holschuh (hmh) wrote :

On Sun, 15 Nov 2009, Matt Zimmerman wrote:
> On Sun, Nov 15, 2009 at 05:13:53PM -0000, Henrique de Moraes Holschuh wrote:
> > Oh yes, I am going to change kernel code because your tool scared your
> > users senseless for a LOG_WARN message. Forget it.
>
> I don't know what you're referring to here. Did someone ask you to change
> kernel code?

Yes. A snide comment promoting a kernel change to get rid of "WARNING" in
the printks in the thinkpad-acpi driver, to avoid triggering the tool,
instead of fixing the tool (and whatever was causing it to be triggered).

Which is, obviously, exactly the wrong way to go about it, and the way it
was delivered pissed me off.

I don't have a web interface handy right now, or I'd tell you the comment
number exactly.

> In my view, the way to fix this is to enable apport or kerneloops to
> differentiate between serious and non-serious events, and only trigger an
> apport problem report for a serious one.
>
> Is there anything about that which you find problematic?

No, I don't think what you propose is problematic: that's the correct way to
fix the issue. Obviously KERN_WARN and higher-urgency messages need to be
shown to the user, it is just a matter of using the right language for each
severity level (and also of handling WARN_ON and BUG_ON output blocks).

--
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

Revision history for this message
Martin Pitt (pitti) wrote : Re: Kernel oops dialog is inconsistent with kerneloops semantics

Michael, any chance you could look into this, since you did the recent kerneloops integration? If you do not have enough time, do you have some pointers on what the current oops/warning detection looks like and where it decides whether or not to call apport's kernel_oops script? Thanks!

Changed in apport (Ubuntu):
assignee: Martin Pitt (pitti) → Michael Vogt (mvo)
summary: - Kernel oops dialog is inconsistent with kerneloops semantics
+ do not generate apport reports for non-critical kernel messages
Revision history for this message
James Westby (james-w) wrote :

I did the work, so I'll fix this.

Thanks,

James

Changed in apport (Ubuntu):
assignee: Michael Vogt (mvo) → James Westby (james-w)
Changed in kerneloops (Ubuntu):
assignee: nobody → James Westby (james-w)
Revision history for this message
Matt Zimmerman (mdz) wrote :

I don't think there's any work to do in apport here, only in kerneloops. Have I missed something? It would be good to get this off of the apport bug list.

Revision history for this message
Martin Pitt (pitti) wrote :

I agree. If kerneloops generates reports that apport will just ignore, then it shouldn't generate them in the first place.

Changed in apport (Ubuntu):
status: Triaged → Invalid
Changed in apport (Ubuntu Lucid):
assignee: James Westby (james-w) → nobody
status: Triaged → Invalid
Changed in apport (Ubuntu):
assignee: James Westby (james-w) → nobody
Changed in kerneloops (Ubuntu Oneiric):
status: New → Triaged
importance: Undecided → High
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package kerneloops - 0.12+git20090217-1ubuntu12

---------------
kerneloops (0.12+git20090217-1ubuntu12) oneiric; urgency=low

  [ Brian Murray ]
  * submit.c: do not report OOPSes with WARNING in them (LP: #346303)
  * debian/kerneloops.default: re-enable kerneloops for oneiric, now
    that apport is enabled.
  * kerneloops-applet.c: modify number of arguments for
    notify_notification_new

  [ Kees Cook ]
  * debian/control: add libgtk2.0-dev and libdbus-glib-1-dev to Build Depends.
  * Makefile: add dbus to pkg-config calls.
 -- Brian Murray <email address hidden> Wed, 13 Jul 2011 12:38:39 -0700
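
The first changelog entry ("do not report OOPSes with WARNING in them") describes a filter applied before submission. The actual change lives in submit.c (C) and is not shown in this thread, so the following Python sketch only illustrates the stated rule. The names `should_submit` and `filter_reports`, and the choice to match on the literal string "WARNING:", are assumptions.

```python
def should_submit(oops_text):
    """Return False for captured reports that contain a WARNING line."""
    for line in oops_text.splitlines():
        if "WARNING:" in line:  # assumed marker; the real check may differ
            return False
    return True


def filter_reports(reports):
    """Keep only the captured reports that pass should_submit()."""
    return [report for report in reports if should_submit(report)]
```

Note that a rule like this also drops genuine WARN_ON() blocks, not just one-line printk warnings, which matches the bug title's goal of not reporting non-critical kernel messages.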

Changed in kerneloops (Ubuntu Oneiric):
status: Triaged → Fix Released
Changed in apport (Ubuntu Oneiric):
status: New → Confirmed
Changed in apport (Ubuntu):
status: Confirmed → Invalid
Changed in apport (Ubuntu Oneiric):
status: Confirmed → Invalid
Revision history for this message
Rolf Leggewie (r0lf) wrote :

Lucid has reached the end of its life and is no longer receiving any updates. Marking the Lucid task for this ticket as "Won't Fix".

Changed in kerneloops (Ubuntu Lucid):
status: Triaged → Won't Fix