Comment 31 for bug 226470

Robert-bradbury (robert-bradbury) wrote:

Update. I strongly suspect the top-level problem area is in:
  mozilla/widget/src/gtk2/nsWindow.cpp - nsWindow::NativeCreate(...)
between lines ~2825 and 2897. The lower-level calls there include functions
such as:
gtk_window_new(), gtk_window_group_add_window(), gtk_container_add(),
gtk_widget_realize() and gtk_window_set_focus(). The problem is that the
...realize() and ...set_focus() functions are also called from places other
than NativeCreate(), so debugging it is tricky.

In the course of trying to debug across a Create-TAB request, gdb ended up
handing me a "Cannot get thread event message: debugger service failed" error.
Attempts to continue Firefox ended in a SEGFAULT and core dump (so I've
lost the error state, though I have the sessionstore.js file for it).

There is some interaction between the Create-TAB request and the creation of
new threads, so the pthread library gets dragged into this discussion (along
with GDK/GTK/GLIB). I think it would help if people also made clear:
1) What processor you are using.
2) What thread library you are using.
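
For anyone unsure how to answer those two questions, here is a minimal sketch
that prints both. It assumes Linux with glibc: _SC_NPROCESSORS_ONLN and
_CS_GNU_LIBPTHREAD_VERSION are extensions rather than strict POSIX, and the
function names are my own, not part of any library.

```c
#define _GNU_SOURCE /* must precede the includes; exposes the glibc extensions */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Number of logical processors currently online (hyperthreads are counted). */
long online_cpus(void)
{
    return sysconf(_SC_NPROCESSORS_ONLN);
}

/* Copy the thread library name/version into buf.
 * Returns 0 on success, -1 if the C library does not report it
 * (buf then holds "unknown"). */
int thread_library(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
#ifdef _CS_GNU_LIBPTHREAD_VERSION
    if (confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, len) > 0)
        return 0;
#endif
    strncpy(buf, "unknown", len - 1);
    buf[len - 1] = '\0';
    return -1;
}

/* Print both answers in a form that can be pasted into a bug comment. */
void print_thread_environment(void)
{
    char lib[128];
    thread_library(lib, sizeof lib);
    printf("online logical CPUs: %ld\n", online_cpus());
    printf("thread library: %s\n", lib);
}
```

Calling print_thread_environment() from a small main() is enough; on a
hyperthreaded single-core machine the CPU count should include the extra
logical processor.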

I'm running a Pentium IV (Prescott) which has only 1 core but does support
hyperthreading. I'm using the most recent release of the Linux POSIX pthread
library (glibc 2.5 I think) and it looks like GLIB is supposed to be using
pthread_mutex_lock() and pthread_mutex_unlock() to get and release locks.

It appears that it may be impossible using gdb (at least on my system) to debug
pthread_mutex functions (setting breakpoints at them results in the "... thread
event..." message mentioned previously).
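
If breakpoints on the pthread_mutex functions are unusable, one alternative is
to trace lock traffic from thin wrappers instead. The sketch below is only an
illustration of that idea: traced_lock() and traced_unlock() are hypothetical
names, this is not how GLIB itself is instrumented, and printing pthread_self()
as an unsigned long assumes Linux, where pthread_t is an integer type.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical tracing wrapper: take the lock, then log who got it and where. */
int traced_lock(pthread_mutex_t *m, const char *where)
{
    assert(m != NULL);
    int rc = pthread_mutex_lock(m);
    fprintf(stderr, "[thread %lu] lock   %p at %s -> %d\n",
            (unsigned long)pthread_self(), (void *)m, where, rc);
    return rc;
}

/* Hypothetical tracing wrapper: log first, then release, so the message is
 * emitted while the calling thread still owns the mutex. */
int traced_unlock(pthread_mutex_t *m, const char *where)
{
    assert(m != NULL);
    fprintf(stderr, "[thread %lu] unlock %p at %s\n",
            (unsigned long)pthread_self(), (void *)m, where);
    return pthread_mutex_unlock(m);
}
```

Both wrappers return the underlying pthread result, so they can be dropped in
wherever the plain calls were. As with any added logging, the extra I/O may
change the timing enough to hide a race.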

It may be necessary to compile GLIB with G_DEBUG_LOCKS (glib/gthread.h) and
set the proper debug flags and/or compile mozilla with MOZ_LOGGING at least for
the widget/src/gtk2 functions (see #define LOG() in
widget/src/gtk2/nsCommonWidget.h). Of course adding logging to either the
widget functions or GLIB may disrupt the timing sufficiently to make the
problem go away. One thing that appears key is the need to find out where the
DestroyNotify is coming from (see
gtk+/gdk/x11/gdkevents-x11.c:gdk_event_translate() -- case DestroyNotify:).
If your Gtk/Gdk library is compiled with debugging, running Firefox after
"export GDK_DEBUG=events" may help by printing destroy notify messages to the
console log. That logs every event from startup, though; what one really wants
is a way to do "_gdk_debug_flags |= GDK_DEBUG_EVENTS;" (see the GDK_NOTE macro
in gdk/gdkinternals.h) only after you have loaded up all of the windows & tabs
that lead up to the problem state.
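
One way to get that kind of late switch, sketched here outside of GDK, is to
flip a flag word from a signal handler so logging can be turned on with
"kill -USR1 <pid>" once the problem state is set up. Everything named demo_* or
DEMO_* below is a stand-in of my own; this does not touch GDK's real
_gdk_debug_flags and is only meant to show the mechanism.

```c
#define _POSIX_C_SOURCE 200809L /* must precede the includes; for sigaction() */
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for GDK's internal flag word and its events bit. */
#define DEMO_DEBUG_EVENTS 1
static volatile sig_atomic_t demo_debug_flags = 0;

/* Signal handler: toggle event logging. Only touches a sig_atomic_t. */
static void toggle_event_debug(int signo)
{
    (void)signo;
    demo_debug_flags ^= DEMO_DEBUG_EVENTS;
}

/* Install the toggle on SIGUSR1. Returns 0 on success, -1 on failure. */
int install_debug_toggle(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = toggle_event_debug;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Non-zero while event logging is switched on. */
int event_debug_enabled(void)
{
    return (demo_debug_flags & DEMO_DEBUG_EVENTS) != 0;
}

/* Log an event only while the flag is set, in the spirit of GDK_NOTE. */
void note_event(const char *what)
{
    assert(what != NULL);
    if (event_debug_enabled())
        fprintf(stderr, "event: %s\n", what);
}
```

The handler toggles rather than sets, so a second SIGUSR1 switches the logging
off again once the DestroyNotify of interest has been captured.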

I hope the above helps us get our hands around the problem.