> Thanks, RB. So is it "run" or "continue" once I get in gdb?
If you are running straight from the shell, e.g. "gdb firefox-bin", then you have to use "run" to get things going. If you attach to an existing process, e.g. "gdb firefox-bin process-id-#", then once gdb gets going it will suspend the process and you have to "continue" it to get it to respond again.
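To make the run-versus-continue distinction concrete, here is a small sketch in Python. The helper name gdb_invocation and the PID 12345 are made up for illustration; the command lines themselves are the two forms described above.

```python
def gdb_invocation(binary, pid=None):
    """Return (argv, first_command) for the two ways of starting gdb.

    Hypothetical helper: it only builds the command line and does not
    start gdb itself.
    """
    if pid is None:
        # Started from the shell: the program is loaded but not yet
        # running, so the first command at the (gdb) prompt is "run".
        return ["gdb", binary], "run"
    # Attached to a live process: gdb suspends it on attach, so the
    # first command at the (gdb) prompt is "continue".
    return ["gdb", binary, str(pid)], "continue"


if __name__ == "__main__":
    for pid in (None, 12345):  # 12345 is a placeholder PID
        argv, first = gdb_invocation("firefox-bin", pid)
        print(" ".join(argv), "-> then type:", first)
```

Either way, nothing happens in Firefox until that first command is typed at the gdb prompt.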
> And what about the 300s delay -- should I leave gdb alone for 300s before I
> try, or "continue" gdb and expect a 300s delay before anything happens? Can we
> reduce the 300s delay without recompiling?
I am unsure about the 300s delay. Firefox-bin is *not* a small program and loading in all of the symbol tables takes a while (depends on the speed of your system). But once you get a gdb prompt you should be able to set breakpoints, do stack traces, run, continue, etc. without excessive delays.
I would strongly urge you to find a "GDB Quick Reference" manual. Google should offer it up and it's only 2 pages. Perhaps a dozen of the commands in it are essential (I can advise on those, but that is perhaps better done one-on-one).
> My main problem was that Firefox just sat there and didn't even give me one
> window. And no CPU usage.
Sounds like it was not running or "suspended". When I run firefox on my machine, the first thing it does is ask which profile I would like to run (I've got several). If you get that far it is "running".
If you run it from the start out of gdb, you can also set breakpoints using the gdb "break" command, e.g. "break XRE_main" or "break gtk_init" or "break gdk_init". If one of those triggers, Firefox stops (i.e. it becomes unusable) and gdb becomes active, so that you can do "backtrace"(s). The critical "incantation" in gdb is "thread apply all bt". That will show you what the various firefox threads are doing. (Firefox has to go through all of these breakpoints before it is really "running".)
>
> Multiple profiles: The extra firefox instantiations is just to create
> contention for memory, X and network bandwidth. I was planning on connecting
> gdb only to "Firefox1".
>
Nothing wrong with that if Firefox1 is the one which springs the error. But if I understand your approach it sounds like you are going to have multiple instantiations of Firefox (each with a separate process-ID affiliated with a specific profile). I'm not optimistic that you will be able to make Firefox1 break rather than 2 or 3. (Take it from someone who has tried 100 windows and half-a-thousand tabs.) Fortunately, if you start them up in separate shell windows, I believe the Glib/GDK/GTK errors provide the process-ID of the process which throws the "unexpectedly destroyed" error. You should be able to attach GDB to that specific process, do stack traces, set breakpoints, and continue the process (so in theory, if you throw more URLs at it, it will throw the error again).
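If you end up capturing the terminal output of several instances, something like the following could pick out the process-ID to attach to. It is only a sketch: it assumes the usual GLib log layout "(program:PID): Domain-LEVEL **: message", the function name is made up, and the sample lines in the usage part are invented for illustration rather than copied from a real run.

```python
import re

# Assumed GLib log layout: "(program:PID): Domain-LEVEL **: message"
_GLIB_LOG = re.compile(
    r"^\((?P<prog>[^:()]+):(?P<pid>\d+)\): "
    r"(?P<domain>[\w-]+)-(?P<level>[A-Z]+) \*\*: (?P<msg>.*)$"
)


def pids_with_destroyed_window(lines):
    """Return the PIDs (in first-seen order) that logged an
    "unexpectedly destroyed" message. Hypothetical helper."""
    pids = []
    for line in lines:
        match = _GLIB_LOG.match(line.strip())
        if match and "unexpectedly destroyed" in match.group("msg"):
            pid = int(match.group("pid"))
            if pid not in pids:
                pids.append(pid)
    return pids


if __name__ == "__main__":
    # Invented sample output, for illustration only.
    sample = [
        "(firefox-bin:4242): Gdk-WARNING **: GdkWindow 0x2a00c3f unexpectedly destroyed",
        "(firefox-bin:4243): Gtk-CRITICAL **: some unrelated message",
    ]
    for pid in pids_with_destroyed_window(sample):
        print("attach with: gdb firefox-bin", pid)
```

If your GLib prints the messages in a different layout, the regular expression will need adjusting.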
So, Q1 is: will firefox-bin run on your machine (ignoring gdb involvement)?
Then Q2 is whether you can generate the window unexpectedly destroyed (untitled window) error in a relatively reliable fashion (I can do it but I can't do it reliably). Then Q3 is whether, once you have generated the untitled window problem, you can use gdb to attach to the specific firefox-bin which threw the error. At that point things start to get into the "we need more detailed information" category, and until we have it Q4 remains an open question.
It would be useful to know whether the version of Linux you are running uses glibc with linux-pthreads. That is how the firefox-bin I provided was compiled. Linux-pthreads is supposed to be significantly faster than older implementations of threads, and so its probability of generating errors (if we are dealing with very subtle timing issues) may be much different from what was seen in earlier releases of Firefox and/or on earlier releases of Linux.
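One way to check which thread library glibc reports is sketched below. It assumes that "linux-pthreads" here means NPTL, and that the C library exposes the GNU-specific CS_GNU_LIBPTHREAD_VERSION configuration string. The function names are made up; on a system without that string the first function simply returns None.

```python
import os


def pthread_implementation():
    """Return glibc's thread library version string, or None if it is
    not available on this system. Hypothetical helper."""
    name = "CS_GNU_LIBPTHREAD_VERSION"  # GNU extension, not portable
    if name not in getattr(os, "confstr_names", {}):
        return None
    try:
        return os.confstr(name)
    except (OSError, ValueError):
        return None


def is_nptl(version):
    """True if the reported string names NPTL (assumed to be what
    "linux-pthreads" refers to)."""
    return version is not None and version.upper().startswith("NPTL")


if __name__ == "__main__":
    version = pthread_implementation()
    print("thread library:", version, "| NPTL:", is_nptl(version))
```

From a shell, "getconf GNU_LIBPTHREAD_VERSION" should report the same string.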
(And as an aside to any "real" Firefox developers reading this thread: given the increase in the number of cores on processors that we can anticipate, if this turns out to be a subtle timing problem, you have a serious Q/A problem. If you can't guarantee that Firefox will fail gracefully on a machine with 64B of memory [1] (which it does not), how can you assert that it will work on machines with 2, 4, or 8 cores? And I don't particularly care if it works on a 4-core Windows Vista machine. I only care if it works on an 8-core Linux machine or an 8-core FreeBSD machine.)
1. Netscape 4.72 did not even come close to this requirement for memory.