untitled popup window
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
firefox-3.0 (Ubuntu) | Invalid | Undecided | Unassigned |
flashplugin-nonfree (Ubuntu) | Incomplete | Undecided | Unassigned |
Bug Description
Binary package hint: firefox-3.0
With some websites, an "Untitled window" pops up. Closing the window closes the browser.
ProblemType: Bug
Architecture: i386
Date: Sun May 4 08:58:33 2008
DistroRelease: Ubuntu 8.04
Package: firefox-3.0 3.0~b5+
PackageArchitec
ProcEnviron:
PATH=/
LANG=en_US.UTF-8
SHELL=/bin/bash
SourcePackage: firefox-3.0
Uname: Linux 2.6.24-17-generic i686
In Mozilla Bugzilla #263160, Nilbus (nilbus) wrote : | #2 |
I've experienced this bug several times as well. It happens after I've had
firefox open for several weeks. On each page reload, on the frames page, any of
the frames may "jump out" of the page in a new window. Between 80 and 90% of
the time, the frame will actually fix itself on reload, but may pop out on
subsequent reloads.
Restarting firefox does fix the problem. Everything Mikael said was true for me
as well.
I run Gentoo Linux with Xorg and Blackbox WM.
I took a screenshot: <a
href="http://
In Mozilla Bugzilla #263160, Trevor-watson (trevor-watson) wrote : | #3 |
I have this happen at least once a day on FF 1.0.2 on Solaris, running under
GNOME. It has been happening since FF 0.9 and on more than one GNOME release.
In Mozilla Bugzilla #263160, Gervase Markham (gerv-mozilla) wrote : | #4 |
This is an automated message, with ID "auto-resolve01".
This bug has had no comments for a long time. Statistically, we have found that
bug reports that have not been confirmed by a second user after three months are
highly unlikely to be the source of a fix to the code.
While your input is very important to us, our resources are limited and so we
are asking for your help in focussing our efforts. If you can still reproduce
this problem in the latest version of the product (see below for how to obtain a
copy) or, for feature requests, if it's not present in the latest version and
you still believe we should implement it, please visit the URL of this bug
(given at the top of this mail) and add a comment to that effect, giving more
reproduction information if you have it.
If it is not a problem any longer, you need take no action. If this bug is not
changed in any way in the next two weeks, it will be automatically resolved.
Thank you for your help in this matter.
The latest beta releases can be obtained from:
Firefox: http://
Thunderbird: http://
Seamonkey: http://
In Mozilla Bugzilla #263160, Gervase Markham (gerv-mozilla) wrote : | #5 |
This bug has been automatically resolved after a period of inactivity (see above
comment). If anyone thinks this is incorrect, they should feel free to reopen it.
In Mozilla Bugzilla #263160, Philringnalda (philringnalda) wrote : | #6 |
Reopening, since I have a couple more to mark as duplicates...
In Mozilla Bugzilla #263160, Philringnalda (philringnalda) wrote : | #7 |
*** Bug 348734 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #263160, Philringnalda (philringnalda) wrote : | #8 |
*** Bug 354104 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #263160, Bugzilla-hernmarck (bugzilla-hernmarck) wrote : | #9 |
Mozilla/5.0 (X11; U; Linux i686 (x86_64); de; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4
I can confirm this bug - I often leave ff (and Mozilla-suite) open for days/weeks. Early 2006 I started "facing" this bug and hating it :-)
I often work with Typo3, phpMyAdmin and other tools heavily using frames.
This summer I also noticed this bug in mozilla
Mozilla/5.0 (X11; U; Linux x86_64; de-AT; rv:1.7.13) Gecko/20060411
There I have about 30 tabs open - all the time (news sites etc).
I never saw it on MS Windows, I mostly work on linux (SuSE 10.0 - Xorg 6.8.2, kde). Until end of 2005 I had SuSE 9.1 (XFree 4.3.99, kde) and can't remember this behaviour.
So I think it's also depending on the Window-Manager...
I always try to continue working without closing firefox. When closing the tab with the frames the frame-windows are also disappearing. But sometimes it's useless to try opening the same framepage in another tab or ff window - you always get the "flying frame windows". Just now: after waiting (and reporting here) I can open the last problem frame page again without problems, so maybe it's also a time problem...???
Reproducible: no step-by-step.
It happens when it happens (having ff open for some days and using many frame websites).
/Christian
In Mozilla Bugzilla #263160, Robert-leathley (robert-leathley) wrote : | #10 |
For me (see Bug 354104) it happens on SLES9, XFree86 4.3.99, Window Maker 0.92.0
In Mozilla Bugzilla #263160, Amettler (amettler) wrote : | #11 |
Created an attachment (id=243592)
gtk errors logged to stderr when bug occurs
In Mozilla Bugzilla #263160, Amettler (amettler) wrote : | #12 |
I can confirm the occurrence of this bug on Debian with KDE, on both i386 and amd64; I usually keep firefox open for weeks.
At the moment, this and bug 341731 are more or less the only stability issues I'm experiencing. One can interact with the dislocated content as usual, but closing them or clicking in the pane where they should be rendered causes a crash; reloading the page several times will eventually result in the frames being rendered correctly, but loading another page will often cause the dislocation to occur again. "tail -n 1000 .xsession-errors | grep Gecko" is attached.
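The attachment above was produced with `tail -n 1000 .xsession-errors | grep Gecko`. For anyone who wants the same filter inside a script, a minimal Python sketch follows. The function name `tail_grep` and the default log path are illustrative assumptions, not part of any Mozilla or GTK tooling.

```python
from collections import deque


def tail_grep(path, pattern="Gecko", last_n=1000):
    """Return lines containing `pattern` among the last `last_n` lines of `path`."""
    with open(path, "r", errors="replace") as handle:
        # A deque with maxlen keeps only the final `last_n` lines in memory.
        tail = deque(handle, maxlen=last_n)
    return [line.rstrip("\n") for line in tail if pattern in line]


if __name__ == "__main__":
    import os

    # Assumed location of the session log; adjust for your desktop setup.
    log = os.path.expanduser("~/.xsession-errors")
    if os.path.exists(log):
        for line in tail_grep(log):
            print(line)
```

Passing a different `pattern`, for example `"Gdk-WARNING"`, narrows the output to the GDK warnings discussed later in this report.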
In Mozilla Bugzilla #263160, Dev-null-gmx (dev-null-gmx) wrote : | #13 |
I can confirm this bug. It happens very often on my Debian Computer too. It seems to be connected to the time the xsession (and/or the computer) is running. Furthermore I've noticed this bug the first time after upgrading to a dual-core system using smp.
Why is the status still unconfirmed after several people have confirmed this bug?
In Mozilla Bugzilla #263160, Philringnalda (philringnalda) wrote : | #14 |
It's unconfirmed partly because it's certainly in the wrong product and component, though the right one isn't clear, and partly because nobody knows what's at fault, and mostly because it doesn't make any difference: if someone chooses to work on it, the difference between this report being UNCO and NEW won't matter to them, while there's a slight chance that someone looking through UNCO bugs will say "oh, I know something that causes that..."
In Mozilla Bugzilla #263160, Dev-null-gmx (dev-null-gmx) wrote : | #15 |
Sorry, but I can't follow that argument. I'd think the contrary is the case. If there is a confirmed (and very nasty) bug reported by several people, I would assume somebody feels the obligation to correct this bug before the next Firefox release. I mean somebody has to be responsible for the quality of Firefox. If the bug is unconfirmed, developers might not care about it because to them, it's very possible that the bug is a problem somewhere else and not in their product.
You're right that if someone chooses to work on it, the difference between this report being UNCO or NEW won't matter to them. Though, I think marking this bug as new will improve the chances that somebody chooses to work on it.
As far as I can see, the bug was opened in 2004. How many Firefox releases have there been since the first report? Why does nobody fix the bugs for new releases? (Or at least stop the new releases until somebody is willing to fix the bug.) How can you release new Firefox versions with open bugs? Or do the developers think there are no bugs because even after several confirmations you leave the bug unconfirmed?
I suggest increasing the severity to at least critical because for users who happen to stumble upon this bug, it's very painful.
In Mozilla Bugzilla #263160, Braden (braden) wrote : | #16 |
I'm experiencing this in Epiphany; see <http://
In Mozilla Bugzilla #263160, Aaron-lithic (aaron-lithic) wrote : | #17 |
*** Bug 367211 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #263160, Aaron-lithic (aaron-lithic) wrote : | #18 |
this bug is seriously awful, i also am at a loss why it hasn't been prioritized upwards.
In Mozilla Bugzilla #263160, Roc-ocallahan (roc-ocallahan) wrote : | #19 |
It is awful. It's very hard to debug because it's very hard to reproduce. If you can figure out a reliable way to reproduce it, that would help very very much.
In Mozilla Bugzilla #263160, Braden (braden) wrote : | #20 |
If my experience is any indication, Epiphany might be a better context in which to reproduce this than Firefox. This bug bites me *frequently* in Epiphany. While I don't have a magic formula for reliably reproducing it, it tends to happen within an hour or so of use. "Busy" pages with a lot of IFRAMEs seem especially likely to trigger it. A heavily customized Google home page is one such animal; cnn.com seems to be another.
While Epiphany is my preferred browser, I've tried reproducing this in Firefox--and I haven't had any luck so far.
Oh, and when I do close one of these rogue windows causing the browser to "crash", I don't get a stack. I get a program exit with a return code of 1.
And FWIW, I'm on x86_64.
I'd be happy to help someone familiar with the Mozilla code to chase this down; but without a stack, I'm not sure where to start.
In Mozilla Bugzilla #263160, Roc-ocallahan (roc-ocallahan) wrote : | #21 |
My guess is that what happens is a subframe's window gets created as a top level widget by mistake. (The crash occurs later so a stack might not help.) I really have no idea how that could happen. A mess of logging code in nsWindow:
In Mozilla Bugzilla #263160, Braden (braden) wrote : | #22 |
I'll give it a shot.
In Mozilla Bugzilla #263160, Braden (braden) wrote : | #23 |
Created an attachment (id=253838)
Backtrace from the creation of a rogue window in Epiphany
This is a backtrace from the creation of one of these rogue top-level windows when using Epiphany.
In Mozilla Bugzilla #263160, Roc-ocallahan (roc-ocallahan) wrote : | #24 |
In that call to NativeCreate, aNativeParent is non-null. So we should be hitting either
http://
or
http://
and setting parentGdkWindow or parentGtkContainer to something, which should ensure that this window gets created inside some other window. Can you figure out why it isn't?
In Mozilla Bugzilla #263160, Philringnalda (philringnalda) wrote : | #25 |
*** Bug 370787 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #263160, Erikred (erikred) wrote : | #26 |
Glad to get my bug report 370787 classified as a duplicate of this one. What Mikael Hedberg and the rest of you have described is exactly what is happening to me.
I think the key to reproducing the bug is to open LOTS of windows with LOTS of tabs in them, say 10-20 windows and 100 tabs. The more the merrier. And try some of the more prone web sites such as www.marketwatch.com or www.huffingtonp
I have a hunch that web pages with lots of subframes or flash frames also may provoke the bug faster.
This is really a major annoyance, and seeing how many other people have the problem I vote for upgrading it to critical.
In Mozilla Bugzilla #263160, Philringnalda (philringnalda) wrote : | #27 |
*** Bug 370915 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #263160, Erikred (erikred) wrote : | #28 |
Bug 370915 is another excellent description of the same problem.
I have some observations to add: In my experience, it is not strictly required
to run out of memory and start chewing up swap space before the problem occurs.
For example, at the moment of this writing I have 3G ram, 1.4G used, 0G swap
used and I already have two disembodied Untitled windows popping up.
I can also confirm the console error messages syndrome, although I had not made
the connection with the disembodied window problem before.
I can however confirm that the combination of X and my 2-3 firefox processes is chewing up quite a lot of CPU while this is going on.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #29 |
Ok, I, as author of Bug 370915, agree that we are all talking about the same bug and am moving the discussion from Bug 370915 to Bug 263160. Wrestling with this is difficult unless you are adept at building Firefox and system libraries completely in debug mode. It took me months to work out how to do this but I now have such a system (a "debug" version of firefox-bin alone without the shared libraries is 130MB). To debug this specific problem easily it appears that you also need to be using gtk+ libraries (gdk & gtk) and the glib libraries (glib) compiled for debugging. Having libstdc++ and glibc compiled for debugging helps as well. (Minus points to the Firefox developers for not releasing a static binary for Linux with all of these included!)
I am currently *still* running the gdb & firefox instance in the state that produces the bug in the hope that we can figure out what to do with it (I'm filing this bug report using a different seamonkey process set). As stated in the series of bug reports, the bug isn't easy to reproduce -- but once you get Firefox+X into the state where it is consuming a significant fraction of CPU time (40-60% minimum?), and depending on what other processes are consuming CPU time, you can make it happen without too much trouble.
I thought memory usage was the problem initially but I no longer think that is the real problem. The problem is that if you leave Firefox running for days, and/or have opened and closed lots of windows, the Firefox heap becomes increasingly fragmented and it takes more CPU time to allocate or deallocate anything from the heap. This becomes problematic if one is near the system physical memory limits (active resident memory ~= total physical memory) because running through the fragmented heap may require paging, which will of course make the process slower.
My current working hypothesis is that there is a subtle coordination/timing problem between Firefox, GDK/GLIB & X.
Here is the scenario. My Firefox is currently using 214 MiB of X Server Memory according to the Process Monitor. I believe that X programs map shared memory and then coordinate when Firefox can write into it and when X can read from it. Firefox says "create a new tab". Firefox talks to GDK/GLIB they talk to X and begin this process. (I am relatively well qualified to debug C programs but relatively illiterate about Firefox/
The key error seems to be the "GdkWindow ...... unexpe...
In Mozilla Bugzilla #263160, Erikred (erikred) wrote : | #30 |
Robert Bradbury,
I think you are an absolute hero for getting this close to identifying the bug. Operations on a window that X has not yet finished creating seems like a good theory. Please keep up the good work!!
While we are on the topic, is it not strange that X and/or Firefox does not seem to have robust heap management and garbage collection? I mean, if I create huge numbers of tabs, X/Firefox will get very large, but they do not seem to shrink much if I delete the window. Maybe I'm wrong, I'm certainly no expert on the innards of X or Firefox.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #31 |
Update. I strongly suspect the top level problem area is in:
mozilla/
between lines ~2825 to 2897. The lower level calls include such functions
as:
gtk_window_new(), gtk_window_
gtk_widget_
...realize() and ...set_focus() functions are called from other than
NativeCreate() so debugging it is tricky.
In the course of trying to debug across a Create-TAB request, gdb ended up
handing me a "Cannot get thread event message: debugger service failed" error.
Attempts to continue Firefox ended up with a SEGFAULT and core dump (so I've
lost the error state, though I have the sessionstore.js file for it).
There is some interaction between the Create-TAB request and creating new
threads so the pthread() library gets dragged into this discussion (along with
GDK/GTK/GLIB). I think it would help if people also made clear:
1) What processor you are using.
2) What thread library you are using.
I'm running a Pentium IV (Prescott) which has only 1 core but does support
hyperthreading. I'm using the most recent release of the Linux Posix pthread
library (glibc 2.5 I think) and it looks like GLIB is supposed to be using
pthread_
It appears that it may be impossible using gdb (at least on my system) to debug
pthread_mutex functions (setting breakpoints at them results in the "... thread
event..." message mentioned previously).
It may be necessary to compile GLIB with G_DEBUG_LOCKS (glib/gthreads.h) and
set the proper debug flags and/or compile mozilla with MOZ_LOGGING at least for
the widget/src/gtk2 functions (see #define LOG() in
widget/
widget functions or GLIB may disrupt the timing sufficiently to make the
problem go away. One thing that appears key is the need to find out where the
DestroyNotify is coming from (see
gtk+/gdk/
If your Gtk/Gdk library is compiled with debugging, running Firefox with
"export GDK_DEBUG=events" may help provide destroy notify messages on the
console log, but what one really wants is a way to do "_gdk_debug_flags |=
GDK_DEBUG_EVENTS;" (see GDK_NOTE macro in gdk/gdkinternals.h) after you have
loaded up all of the windows & tabs that lead up to the problem state.
I hope the above helps to put our hands around the problem.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #32 |
(In reply to comment #31)
> Update. I strongly suspect the top level problem area is in:
> mozilla/
> between lines ~2825 to 2897. The lower level calls include such functions
> as:
> gtk_window_new(), gtk_window_
> gtk_widget_
> ...realize() and ...set_focus() functions are called from other than
> NativeCreate() so debugging it is tricky.
>
> In the course of trying to debug across a Create-TAB request, gdb ended up
> handing me a "Cannot get thread event message: debugger service failed" error.
> Attempts to continue Firefox ended up with a SEGFAULT and core dump (so I've
> lost the error state, though I have the sessionstore.js file for it).
>
> There is some interaction between the Create-TAB request and creating new
> threads so the pthread() library gets dragged into this discussion (along with
> GDK/GTK/GLIB). I think it would help if people also made clear:
> 1) What processor you are using.
> 2) What thread library you are using.
>
> I'm running a Pentium IV (Prescott) which has only 1 core but does support
> hyperthreading. I'm using the most recent release of the Linux Posix pthread
> library (glibc 2.5 I think) and it looks like GLIB is supposed to be using
> pthread_
>
> It appears that it may be impossible using gdb (at least on my system) to debug
> pthread_mutex functions (setting breakpoints at them results in the "... thread
> event..." message mentioned previously).
>
> It may be necessary to compile GLIB with G_DEBUG_LOCKS (glib/gthreads.h) and
> set the proper debug flags and/or compile mozilla with MOZ_LOGGING at least for
> the widget/src/gtk2 functions (see #define LOG() in
> widget/
> widget functions or GLIB may disrupt the timing sufficiently to make the
> problem go away. One thing that appears key is the need to find out where the
> DestroyNotify is coming from (see
> gtk+/gdk/
> If your Gtk/Gdk library is compiled with debugging, running Firefox with
> "export GDK_DEBUG=events" may help provide destroy notify messages on the
> console log, but what one really wants is a way to do "_gdk_debug_flags |=
> GDK_DEBUG_EVENTS;" (see GDK_NOTE macro in gdk/gdkinternals.h) after you have
> loaded up all of the windows & tabs that lead up to the problem state.
>
> I hope the above helps to put our hands around the problem.
>
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #33 |
Sorry about comment #32. I was trying to correct a spelling error in #31 but there doesn't appear to be an easy way to do that.
Eric, regarding Firefox memory usage, I can kind of explain that problem. I believe that Firefox does have garbage collection for Java and perhaps Javascript. However everything else, the image management, the TCP/IP management, the window and tab management, etc. seem to all be written in C++ and C. So those will normally go through the C++: new()/delete() functions, or C: malloc()/free() functions. These all end up on normal Linux systems using the "standard" GNU glibc memory management functions which in turn rely upon the standard Linux(UNIX) sbrk() and brk() system calls.
The Glibc memory management function system *is* robust (I think it is 90+ pages of code). The problem is that it was not designed to handle situations of "run-for-days" allocating and deallocating many small memory fragments. Once you allocate such memory (in C++ or C) its location has to remain fixed in the virtual memory address space. Over time that means that the heap memory becomes increasingly fragmented (lots of little small holes) and total memory usage tends to creep up. This isn't the same as a "memory leak" where you are losing track of the memory. The glibc memory management system *knows* where all the small fragments are -- the problem is it can't relocate the in use fragments (defragment the heap) to turn all of the unused small fragments into a single large free fragment (preferably at the end of the virtual address space) which could then be returned to Linux (and shrink your VM memory requirements). In contrast, I believe Java, and perhaps Javascript, are sufficiently object oriented that you can relocate objects and perform garbage collection thus preventing the problem of excessive memory fragmentation in their heaps.
In practice, if you watch what Firefox is doing on the System Monitor you may sometimes see VM shrink if you open a window, open lots of tabs in it and then close that window, particularly if you have opened all of those tabs sometime previously. But if they are "new" URLs, then those records may get added to your history list (which may be at the end of Virtual Memory). In this case you can delete the window and the memory will be returned to the glibc pool, but because the history records are locking up the end of the glibc pool, glibc will not return the memory to Linux. In practice you only see VM shrink at the very end of a normal "Quit" request when Firefox has closed all of the windows, closed the bookmarks file, closed the history file, closed all TCP/IP handles, i.e. freed up *all* of the memory in the glibc memory pool. Only when all of the memory in the heap is completely free will the glibc memory allocator reunite it all as one big hunk of free memory and return it all to Linux (in practice this is done by issuing a brk() system call to lower the last physical address of the process heap).
The "right" way to make this problem better is to put (a) the history records; (b) the Bookmarks file; and (c) image files into separately managed heaps away from the glibc functions (so glibc is ...
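The fragmentation argument in comment #33 can be illustrated with a toy model. The sketch below is not glibc's allocator: `ToyHeap`, its first-fit policy, and the sizes used are all assumptions chosen for illustration. It only models the one property the comment relies on, namely that a contiguous brk-style heap can shrink only from the top, so a small long-lived allocation near the top pins everything below it. Real glibc malloc is more nuanced (it trims the top of the heap once the free space there passes a threshold, and it serves large requests through mmap), so treat this as a picture of the mechanism rather than a measurement.

```python
class ToyHeap:
    """Toy model of a contiguous, brk-style heap (not glibc's real algorithm)."""

    def __init__(self):
        self.chunks = []  # [size, owner-or-None] entries in address order

    def malloc(self, size, owner):
        # First fit: reuse a free chunk if one is large enough, splitting it.
        for i, (csize, cowner) in enumerate(self.chunks):
            if cowner is None and csize >= size:
                self.chunks[i] = [size, owner]
                if csize > size:
                    self.chunks.insert(i + 1, [csize - size, None])
                return
        # Otherwise grow the heap at the top (the role brk()/sbrk() plays).
        self.chunks.append([size, owner])

    def free(self, owner):
        for chunk in self.chunks:
            if chunk[1] == owner:
                chunk[1] = None
        self._coalesce()
        # Only free space at the very top can be handed back to the OS.
        while self.chunks and self.chunks[-1][1] is None:
            self.chunks.pop()

    def _coalesce(self):
        # Merge neighbouring free chunks; in-use chunks never move.
        merged = []
        for size, owner in self.chunks:
            if merged and owner is None and merged[-1][1] is None:
                merged[-1][0] += size
            else:
                merged.append([size, owner])
        self.chunks = merged

    def heap_size(self):
        return sum(size for size, _ in self.chunks)

    def in_use(self):
        return sum(size for size, owner in self.chunks if owner is not None)


def demo():
    heap = ToyHeap()
    for tab in range(100):
        heap.malloc(1000, ("tab", tab))      # short-lived page data
        heap.malloc(10, ("history", tab))    # small long-lived record
    peak = heap.heap_size()
    for tab in range(100):
        heap.free(("tab", tab))              # close every tab
    return peak, heap.heap_size(), heap.in_use()


if __name__ == "__main__":
    print(demo())  # (101000, 101000, 1000)
```

In this model the heap stays at its peak size even though only about 1% of it is still in use, because each small "history" record sits above a freed "tab" block. Freeing the history records as well lets the free space coalesce and the heap drop to zero, which matches the observation that memory is returned only at quit.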
In Mozilla Bugzilla #263160, Braden (braden) wrote : | #34 |
FWIW... I'm observing this on a machine with 6 GB of physical memory.
So I am very skeptical that throwing more physical memory at this problem will alleviate it in the least. On the contrary, I'd be more optimistic about a theory that suggested large amounts of physical memory could aggravate the problem. But I don't have one.
That is not to say that fragmentation of the process address space could not be related to this. It does make a certain amount of sense--just considering the fact that this bug seems only to surface after the process has been running for a while. But, please, let's try to avoid clogging this report with *too* much speculation.
In Mozilla Bugzilla #263160, Roc-ocallahan (roc-ocallahan) wrote : | #35 |
Robert: thanks for all the info!
My current hypothesis is that window creation is failing somehow and GTK isn't picking it up, or we're not checking GTK results correctly, and then we create another window with the failed window as its parent and this new window ends up as a rogue toplevel window.
So what I'd like you (or someone else) to try is setting a breakpoint at nsWindow:
There are a lot of reports of these "unexpectedly destroyed" messages happening to various apps over the years, but nothing much in the way of information about what causes them or how to resolve them...
In Mozilla Bugzilla #263160, Erikred (erikred) wrote : | #36 |
Robert Bradbury, thanks for the lesson on firefox heap management. Great stuff.
I wish I had a better way of finding this kind of information from a web search.
Too many false hits on anything that relates to firefox.
Braden, Robert O'Callahan, Metler, Phil Ringnalda and everyone else:
I should have included you all in the initial kudos. Not that it matters much, I'm a total nobody around here anyway :-). But I'm sure you agree that Bradbury did a great job with his gdb magic.
I'm rooting for you all, unfortunately I'm not capable of contributing much else than anecdotal evidence about the bug.
In Mozilla Bugzilla #263160, Braden (braden) wrote : | #37 |
(In reply to comment #24)
> In that call to NativeCreate, aNativeParent is non-null. So we should be
> hitting either
> http://
> or
> http://
> and setting parentGdkWindow or parentGtkContainer to something, which should
> ensure that this window gets created inside some other window.
We hit
2270 else if (aNativeParent && GDK_IS_
and do
2271 parentGdkWindow = GDK_WINDOW(
> Can you figure out why it isn't?
Umm... It seems that mWindowType == eWindowType_popup. That doesn't seem right; any idea why it might be happening?
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #38 |
Morning update. After the gdb debacle yesterday (gdb needs some work... :-(), I made the following changes.
1) Recompile gtk+ with --enable-debug=all (because GDK_DEBUG wasn't working);
2) Setup firefox-3.0a2 (with all the debug code) to run with:
export NSPR_LOG_
export NSPR_LOG_
export GDK_DEBUG=events
export GOBJECT_
to reveal some interesting problems with GLIB "leftovers" when
firefox exits).
firefox-bin 2>firefox.err (= the GDK error log)
3) Run firefox restoring the previous session (which had demonstrated the problem).
Now, the session that first showed the problem had 73 windows & 424 tabs. When gdb went belly up I was up to 76 windows and 438 tabs (demonstrating the tab-start=
I'm now up to 100 windows and 586 tabs (mainly using random URLs from the first 25 pages of Digg) with firefox-bin consuming 1.1 GiB of VM and 1.2 GiB Mem (is this including page tables???). No problem. I can't push this much further because I normally run firefox with a 1.4 GiB virtual memory limit (ulimit -Sv 1400000) -- due to Firefox's poor handling of memory allocation failures it will likely core dump if I push it to 1.4. (If one allows Firefox VM >> system PhysMem (1.5 GiB for me) ==> watch the system turn into a dog -- but this is really a Linux paging problem somewhat aggravated by the Firefox heap management problems so not for this discussion).
Firefox has been running for ~12 hours. I ran it for a while with Java and Javascript disabled (because the logging seems to slow down new tab/window creation), but Javascript is now enabled without making much difference. One difference may be that the AdBlock addon may not have been active in the previous instance when the problem did occur. Noscript is active and is blocking Javascript on most sites (gmail exempted). Gmail works fine (and it tends to be a moderately reliable "helper" (?) in my case to trigger the problem state).
CPU-wise firefox-bin+X are consuming 40-60% of the CPU time.
The debug log files (for NSPR & GDK) are rather large (tens of MB). I am concerned that outputting the debug messages has changed the timing of Firefox+
I'm not sure I understand the discussion of the nsWindow code yet, but would argue that until we know *precisely* where the DestroyNotify is coming from and why it is happening it may be difficult to know whether changing the upper level code has fixed or simply masked the real problem. But given the number (5362) of "Gdk-Message: destroy notify" events I'm seeing in the GDK log file, it isn't going to be as simple as setting a breakpoint in gdkevents-
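For reference, part of the launch setup described in comment #38 can be scripted: `GDK_DEBUG=events` in the environment, stderr redirected to a log file, and a soft virtual-memory limit equivalent to `ulimit -Sv 1400000`. The sketch below is one possible way to do that from Python on Linux. The helper names `ulimit_kib_to_bytes` and `run_with_gdk_debug` are made up for this example, the truncated `NSPR_LOG_` and `GOBJECT_` variables from the comment are deliberately left out, and, as the comment notes, `GDK_DEBUG` only produces output when gtk+ was built with debugging enabled.

```python
import os
import subprocess


def ulimit_kib_to_bytes(kib):
    """`ulimit -v` counts KiB; RLIMIT_AS in Python's resource module counts bytes."""
    return kib * 1024


def run_with_gdk_debug(cmd, log_path, soft_vm_kib=None):
    """Run `cmd` with GDK_DEBUG=events and write its stderr to `log_path`."""
    env = dict(os.environ, GDK_DEBUG="events")

    def limit_address_space():
        # Runs in the child just before exec; mirrors `ulimit -Sv <kib>`.
        import resource

        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        soft = ulimit_kib_to_bytes(soft_vm_kib)
        if hard != resource.RLIM_INFINITY:
            soft = min(soft, hard)  # a soft limit may not exceed the hard limit
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    with open(log_path, "wb") as err:
        completed = subprocess.run(
            cmd,
            env=env,
            stderr=err,
            preexec_fn=limit_address_space if soft_vm_kib else None,
        )
    return completed.returncode
```

A call matching the comment would look like `run_with_gdk_debug(["firefox-bin"], "firefox.err", soft_vm_kib=1400000)`, assuming `firefox-bin` is on the PATH.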
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #39 |
Created an attachment (id=255907)
Example of Window creation & destruction with NSPR_LOG_
Example of "normal" window creation & destruction trace (for comparison purposes).
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #40 |
Created an attachment (id=255908)
Example of unusual window destruction case.
This is an example of a window destruction trace which occurs when firefox is "inactive", i.e. no firefox windows or tabs are being manipulated by the keyboard or mouse. If firefox is free to create & destroy windows "behind the scenes" then debugging Bug #263160 is going to be a problem.
In Mozilla Bugzilla #263160, Erikred (erikred) wrote : | #41 |
Robert Bradbury,
I have a suggestion that might help on getting the problem to show up more
quickly and with fewer windows/tabs:
After starting up the gdb/firefox combo, start up a 2nd firefox (I use firefox
--class=
tabs/windows. Start a 3rd, 4th, ... fox as well if you like.
When I do this, each of the foxes need not have so many windows and tabs before
the Untitled windows start popping up.
This method rhymes with your working assumption about the bug being related to
X interactions: having multiple foxes creates more competition for getting the
attention of X, and competition for X resources will cause unpredictable delays
in creating windows etc.
Also, this way, the size of the gdb/fox combo may also be more manageable.
In Mozilla Bugzilla #263160, Erikred (erikred) wrote : | #42 |
Robert Bradbury,
In response to comment #40, I also get Untitled windows popping up when FF is idle. No need to be doing any keyboard or mouse operations. Presumably this happens because one of the "busy" web pages (www.marketwatc
So it is probably some javascript or java doing the deed (I have javascript enabled everywhere, with Adblock and Flashblock add-ons, and popups blocked).
In Mozilla Bugzilla #263160, Roc-ocallahan (roc-ocallahan) wrote : | #43 |
(In reply to comment #37)
> 2270 else if (aNativeParent && GDK_IS_
>
> and do
>
> 2271 parentGdkWindow = GDK_WINDOW(
Which is what? null? or something that was destroyed?
> > Can you figure out why it isn't?
>
> Umm... It seems that mWindowType == eWindowType_popup. That doesn't seem right;
> any idea why it might be happening?
No. Are you sure? Look up the call stack to see where it got set...
In Mozilla Bugzilla #263160, Roc-ocallahan (roc-ocallahan) wrote : | #44 |
*** Bug 244482 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #263160, Jnoyes-sf (jnoyes-sf) wrote : | #45 |
So how does a bug opened 5 months earlier get resolved as the duplicate?
What a convenient way to make this bug look 5 months younger, drop its vote counts, and reset all its flags.
Sheesh. This is never gonna get fixed.
In Mozilla Bugzilla #263160, Roc-ocallahan (roc-ocallahan) wrote : | #46 |
Because this bug has all the analysis and people actually helping to work on it, that's why.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #47 |
Jnoyes, not so fast. Robert O'C. is right -- to solve this bug we are going to have to concentrate experiences and knowledge in one place. But not solving it is not in the cards now that I have a handle on it (i.e. I've got at least one complete core dump of the process state when the problem took place, as well as at least some understanding of the complexity of the problem).
Let me recount experiences over the last day. I backed off on running Firefox with the debug flags enabled (generating the large log files). I've now recreated most of the original situation (i.e. all the URLs) + some more. I am up to 107 windows and 658 tabs in Firefox and it has been running most of the day without any problems. It's consuming 1.1 GiB VirMem, 1.2 GiB Mem, 511 MiB X server Mem (on a 1.5 GiB PhysMem w/ a 1.4 VirMem ulimit set).
I tried to lean on the process switching Firefox+X CPU use theory. Nada. I ran glxgears (on a non-DRI X instantiation). That drove X use up to the point where Firefox was very slow (30+ seconds to minimize a Firefox window). As Firefox was pretty unusable in that situation I backed off to a situation of running a continual loop of MPlayer video+sound side-by-side with Firefox. That bumps X usage a bit but does not present the problem with Firefox working after a fashion. Overall the system is currently @ 100% CPU use, varying something like 30-60% firefox-bin, 10-30% X, and most of the rest gnome-system-
Now here is the interesting part. While I was trying to resume the complex Firefox session last night, my seamonkey session went belly-up (with the *same* problem) -- i.e. starts out with "Gecko:Process-#): Gdk-WARNING **: GdkWindow 0x######## unexpectedly destroyed" followed by many errors regarding GDK/GLib trying to manipulate null objects or assertion failures.
In all cases that I've seen this problem starts out with the "unexpectedly destroyed" WARNING. So for people who want to work on this you have to start your Firefox/
Now Firefox 3.0a2 and Seamonkey 1.0.7 are relatively distinct on my system. The Firefox instantiation is running with the debug libs (separately downloaded and compiled for debugging). Seamonkey was running with the standard system libraries (though I'm running it with the debug libraries now). Firefox was compiled in full debug mode, Seamonkey was not.
Now, common aspects. My normal Firefox runs (including the current run) have NoScript enabled (currently not showing the problem when one might expect that it should). The Firefox 3.0a2 run when the "unexpectedly destroyed" error took place may not have had NoScript in effect. Seamonkey also does not normally have a NoScript activity in effect. So this raises the open question of whether Javascripts are running amok in such a way as to corrupt GDK/GLIB memory? Or perhaps whether a Javascript garbage collection takes place from time to time when CPU resources should be shifting directly betwe...
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #48 |
Created an attachment (id=256156)
gdb trace of SegFault while attempting to trigger this bug
Well, after a day Firefox finally core dumped. But it does not appear to be related to the destroy window activity. Firefox was running at around 1.1-1.2 GiB (with a 1.4GiB ulimit set) so I don't think it was a problem of hitting the memory limit either. I believe the activity was returning from a message in gmail back to the Inbox (so it was attempting to redraw the list of messages in the Inbox). As firefox is so slow after one has 100 windows open I was working in other windows and am not precisely sure what it was doing. If anyone wants to dig into a 1.16 GiB core file however I'd be happy to hand it over.
In Mozilla Bugzilla #263160, Erikred (erikred) wrote : | #49 |
Robert Bradbury,
I should correct myself. What I was doing was
firefox --class=Firefox1 -P default
firefox --class=Firefox2 -P profile2
firefox --class=Firefox3 -P profile3
...
and so on. But you probably realized what I meant already. I encourage you to give it a try. Even if you load the same session in all of the instantiations, I think it will do the trick of tripping up the bug.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #50 |
Created an attachment (id=256310)
gdb trace of seamonkey around the time of problem
The attached trace is a debug of seamonkey in the "problem state". I threw two "untitled windows" right after each other in seamonkey when accessing the NY Times.
Notes:
1) I run Firefox with NoScript active; seamonkey (as currently installed) does not limit Javascript.
2) Seamonkey is not compiled with -g2, so code information may be limited. It was however running on -g2 compiled gtk+ libs.
3) It is most likely not a memory consumption related problem. Firefox had aborted several days ago and Seamonkey was only consuming ~300MB on a 1.5 GiB main memory machine. (In contrast Firefox was consuming 1.1-1.2 GiB and not throwing the problem.)
4) I note from the stack trace that one of the threads (thread 4) appears to be doing a DNS lookup on "graphics8.
This may be important. I normally run Firefox with NoScript enabled except primarily for gmail.com and NCBI PubMed. Seamonkey runs with Javascript completely operational (and thus the NY Times advertisements can run away with the browser). This tends to be consistent with my experience in Firefox. E.g. I am much more likely to spring the problem when gmail (with javascript) is running than when it is not.
Now, while bearing in mind that the stack trace is *after* two subsequent "untitled window" errors had taken place, it is interesting that when GDB attached to the process it was still doing a DNS lookup. The problem in my mind is not explicitly related to Javascript running or the async DNS lookups but in the interference in GLIB/GDK/GTK they may introduce.
Further information regarding the parent "tab". You can take the tab which does not normally display anything once the "untitled window" has appeared and run it <BACK>. In that case it will properly display the previous window. But it does not destroy or manipulate the "orphan" untitled window. Only closing the tab for the orphan window will eliminate it. So there is still some link between the orphan window and the tabs.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #51 |
Created an attachment (id=256359)
Firefox segfaulting in gdk_window_
I'm not an expert at interpreting these stack traces yet, but it looks like this segfault is in gdk/x11/
What I was doing at the time was a Google search and had just done a search along the lines of "firefox fetch CVS site:mozilla.org". It segfaulted before it could display any of the results.
It was not a memory problem (the core file is only 282 MB). Nor was the system particularly busy. The libraries were gtk+-2.10.9 and glib-2.12.9.
Just as an FYI, for those of you who build firefox from scratch and who want to debug these types of problems, I would highly urge you to compile toolkit/
It also looks as if we can arrange glib so if we can tell it *when* there is a problem (using a signal or the debugger to set a flag) we could have it produce stack traces in gdk_event_
In Mozilla Bugzilla #263160, Erikred (erikred) wrote : | #52 |
Robert B,
Would you consider making your debug environment available by ftp? If I had your setup, including a dot.gdb file or equivalent gdb setup commands, I could try some runs and see if any usable data came out of it. Or is it fair to say that you have enough data at this point? I'm on Fedora Core 5.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #53 |
Erik, I'm working towards precisely what you ask. I am *almost* at the point where I have GDK turn on stack tracing after the first orphan window condition occurs. This is a *REAL* pain as it involves a callback from the GDK library which is written in C to the Mozilla stack trace code which is in C++ (I am learning things I really didn't want to have to learn about allowing C to call C++).
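For anyone else who has to route a callback from a C library into C++ code, the usual pattern is a small function with C linkage that forwards to the C++ object. What follows is only a minimal sketch under stated assumptions: `register_trace_hook`, `c_side_report`, `trace_trampoline` and `TraceRecorder` are hypothetical names invented for illustration, not functions from GDK, GLib or the Mozilla tree.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the C library side. In a real build these
// declarations would sit in a C header and the bodies would be compiled as C.
extern "C" {
typedef void (*trace_hook_fn)(const char *reason, void *user_data);

static trace_hook_fn g_hook = nullptr;
static void *g_hook_data = nullptr;

// C-callable registration point (hypothetical name).
void register_trace_hook(trace_hook_fn fn, void *user_data) {
    g_hook = fn;
    g_hook_data = user_data;
}

// What the C side would call when it detects the problem condition.
void c_side_report(const char *reason) {
    if (g_hook) g_hook(reason, g_hook_data);
}
}  // extern "C"

// C++ side: an ordinary class that collects the reports.
class TraceRecorder {
public:
    void Record(const std::string &reason) { reasons_.push_back(reason); }
    const std::vector<std::string> &reasons() const { return reasons_; }

private:
    std::vector<std::string> reasons_;
};

// Trampoline with C linkage: it has an unmangled name the C side can call,
// recovers the C++ object from the opaque pointer, and never lets a C++
// exception propagate back into C code.
extern "C" void trace_trampoline(const char *reason, void *user_data) {
    assert(user_data != nullptr);
    try {
        static_cast<TraceRecorder *>(user_data)->Record(reason ? reason : "");
    } catch (...) {
        // Swallow: unwinding through C frames is not safe.
    }
}
```

Usage would be to create a `TraceRecorder`, pass `trace_trampoline` and the recorder's address to `register_trace_hook`, and let the C side call `c_side_report`. The opaque `void *` is what carries the C++ object across the language boundary.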
But getting a workable static debug variant of Firefox 3.0a2 is proving difficult (there are problems in the Mozilla cairo/pango code -- even the latest CVS sources refuse to compile).
I am making available the relatively static debuggable Firefox 2.0 I managed to assemble at one point.
URL: http://
Note the firefox-2.0d link.
This is a complete 2.0 install directory which does work for me (Firefox has some real problems if you don't give it a complete environment on startup -- but that is another bug).
At any rate you should note the firefox-bin file (which is the key component) at 139 MB (debugging doesn't come cheap...). If you do a ldd analysis on that firefox-bin you will see that it uses a limited set of system libraries. That is because almost all of the other libraries have been compiled into it, having been compiled with debugging enabled. So if you choose to use it to diagnose the problem (since we seem to have it narrowed down to a mozilla / gtk / pthread / glib arena) it may be useful.
Also please note, the downloads are going to be fed via a standard U.S. DSL line so one's use should be balanced.
In Mozilla Bugzilla #263160, Erikred (erikred) wrote : | #55 |
Robert B,
The file date was surprisingly old,
is this the right one?
firefox-bin 24-Dec-2006 12:26 139M
In Mozilla Bugzilla #263160, Erikred (erikred) wrote : | #56 |
Robert B,
I saw you said "managed to assemble at some point", so that probably means that
2006-12-24 is indeed the version you meant.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #57 |
Yes, that version is the Firefox 2.0 version. As such it is old. I have not yet been able to get the 3.0 version to assemble in the same way. But given the history of this problem I think it is present in most if not all versions of Linux based Firefox. You can attach firefox-bin in gdb and set a breakpoint at "g_log_
But it seems apparent at this point that it is a Linux/GTK-specific problem so we are not going to receive assistance from Firefox developers focused on other operating systems.
In Mozilla Bugzilla #263160, Erikred (erikred) wrote : | #58 |
RB,
I tried the binary last night. Here's what I did:
1. make a copy of my firefox-2.0.0.1 to firefox-2.0d
2. drop your firefox-bin into firefox-2.0d
3. env LD_LIBRARY_
4. fix some .so filenames (link .so files to .so.6 or .so.11, for example)
5. try again, program starts
6. gdb attach
At this point I was not quite sure what to do in gdb, I tried "cont" and later "run". I didn't get any windows, though, and there was hardly any cpu usage.
Is my recipe sound?
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #59 |
Erik, It sounds as if you have the right approach. This entire process is a real *pain* due to the need of getting the .so's right. That is why I attempted a 2.0 compile with as many static debug libs as possible. It appears at this time that given general Linux system configurations and the Firefox dependencies on system libraries that it is impossible to actually compile a fully "static" Firefox under Linux (see bug 372269 for example).
Now, with respect to your debugging scenario. If you have the firefox-bin installed correctly (and all of the libraries upon which it may depend -- dicey question I'll admit), the firefox-bin should start up and run (if it doesn't then there is some kind of library compatibility problem).
So the simple question is: can you get firefox-bin up and running on your machine? If it runs, then at least we have the possible library incompatibility problems under control.
Now, firefox-bin is compiled with debugging symbols for Firefox as well as being a static link to the gtk+, glib, stdc++ and glibc with debug symbol libraries. So in theory you should be able to grab almost anything of significance in the binary. Firefox still seems to load "dynamic" libraries, (e.g. thai fonts from either cairo or pango) so it is not fully "stand-alone". The "debugability" of those libraries depends upon how they are compiled on your system (but since that isn't the focus of this bug it may be a no-op).
Once you have firefox-bin up and running normally, you can attach it from gdb. Simply gdb the path to the binary and the process number. You then want to set a breakpoint at g_log_default_
I would fully expect, given my experience that M/F in this state "works" some of the time, that one is going to be dealing with kind of a 50:50 probability of the stack traces being useful. The "gold ring" in this case is having someone say this is a stack trace for a DestroyNotify Event which in turn generated Firefox attempting to mess around with a window which no longer existed.
The fundamental questions may be why was the window "destroyed" and why did M/F not recognize it as such (and one would hope compensate for it).
As a warning, I do not know if you will be able to work with this in your proposed multi-profile scenario. I suspect that one is going to have a problem of attaching multiple gdbs to multiple firefox-bin's and attempting to manage that. That sounds a bit tricky.
In Mozilla Bugzilla #263160, Erikred (erikred) wrote : | #60 |
Thanks, RB. So is it "run" or "continue" once I get in gdb? Or does it not matter?
And what about the 300s delay -- should I leave gdb alone for 300s before I try,
or "continue" gdb and expect a 300s delay before anything happens? Can we reduce the 300s delay without recompiling?
My main problem was that Firefox just sat there and didn't even give me one window. And no CPU usage.
Multiple profiles: The extra firefox instantiations is just to create contention for memory, X and network bandwidth. I was planning on connecting gdb only to "Firefox1".
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #61 |
> Thanks, RB. So is it "run" or "continue" once I get in gdb?
If you are running straight from the shell, e.g. "gdb firefox-bin" then you have to use "run" to get things going. If you attach an existing process, e.g. "gdb firefox-bin process-id-#", then once gdb gets going it will suspend the process and you have to "continue" it to get it to respond again.
> And what about the 300s delay -- should I leave gdb alone for 300s before I
> try, or "continue" gdb and expect a 300s delay before anything happens? Can we
> reduce the 300s delay without recompiling?
I am unsure about the 300s delay. Firefox-bin is *not* a small program and loading in all of the symbol tables takes a while (depends on the speed of your system). But once you get a gdb prompt you should be able to set breakpoints, do stack traces, run, continue, etc. without excessive delays.
I would strongly urge you to find a "GDB Quick Reference" manual. Google should offer it up and it's only 2 pages. There are perhaps a dozen commands from that that are essential (which I can advise on but is perhaps better done one-on-one).
> My main problem was that Firefox just sat there and didn't even give me one
> window. And no CPU usage.
Sounds like it was not running or "suspended". When I run firefox on my machine, the first thing it attempts is to request which profile I would like to run (I've got several). If you get that far it is "running".
If you run it from the start out of gdb, you can also set breakpoints using the gdb "break" command, e.g. "break XRE_main" or "break gtk_init" or "break gdk_init". Each of those should trigger a stopping of Firefox (i.e. it becomes unusable) and an activation of gdb, such that you can do "backtrace"(s). The critical "incantation" in gdb is "thread apply all bt"; that will show you what the various firefox threads are doing. (Firefox has to go through all of these breakpoints before it is really "running".)
>
> Multiple profiles: The extra firefox instantiations is just to create
> contention for memory, X and network bandwidth. I was planning on connecting
> gdb only to "Firefox1".
>
Nothing wrong with that if Firefox1 is the one which springs the error. But if I understand your approach it sounds like you are going to have multiple instantiations of Firefox (each with a separate process-ID affiliated with specific profiles). I'm not optimistic that you will be able to break 1 vs. 2 or 3. (Take it from someone who has tried 100 windows and half-a-thousand tabs.) Fortunately, if you start them up in separate shell windows, I believe the Glib/GDK/GTK errors provide the process-ID of the process which throws the "unexpectedly destroyed" error. You should be able to attach GDB to that specific process, do stack traces, set breakpoints, and continue the process (so in theory if you throw more URLs at it it will throw the error again).
So, Q1 is will firefox-bin run on your machine (ignoring gdb involvement).
Then Q2 is whether you can generate the window unexpectedly destroyed (untitled window) error in a relatively reliable fashion (I can do it but I can't do it reliably). Then Q3 is whether once you have generated the untitled windo...
In Mozilla Bugzilla #263160, Erikred (erikred) wrote : | #62 |
Created an attachment (id=257189)
case if using RobertB firefox-bin 2.0 version with debug instrumentation
In Mozilla Bugzilla #263160, Erikred (erikred) wrote : | #63 |
RB,
I realize now that my firefox process based on your debug binary never really ran properly in my environment. It got killed by signal 11, but then it strangely continued with assorted messages about how to attach gdb, and so on.
The problem is related to symbol SSL_Implemented
Please see the attachment above. You will also see the message about the 300sec delay, which perhaps you don't get when you run. I thought maybe it was there to give me time to start gdb.
Summary: There is something fundamentally wrong with my setup using your binary, so I have not been able to produce any useful data.
About linux-pthread: "locate linux-pthread" produces no matching filenames on my computer. I'm running Fedora Core 5.
In Mozilla Bugzilla #263160, Ian-hixie (ian-hixie) wrote : | #64 |
FWIW, I've been seeing this a _lot_ in the last few months.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #65 |
Erik, I've been wrestling with trying to get a 3.0 version compiled (not yet but getting there). But briefly on your gdb attachment.
Netscape has two "portable" system library sets, the portable runtime "nspr" [1], and the security "nss" [2]. Presumably so one can share them between Mozilla, Firefox, Seamonkey, Thunderbird, etc. These are usually in subdirectories under /usr/lib, etc.
While the binary I released has the "nspr" functions, compiled with debugging, loaded into the binary, the "nss" functions are *much* more problematic. Some of them may be compiled into the binary (I'd have to check), but it may try to access others at runtime (from your system libraries). If so then there could quite possibly be problems intermixing the nss libraries. There is also interaction between the nss libraries and the SSL libraries on your system (probably). So there is ample opportunity for difficulties.
Do you run into the problem if you completely avoid https: pages (or anything likely to request a password)?
One possibility - try renaming your nss libraries, e.g. mv /usr/lib/nss /usr/lib/NSS and see if it runs [I've never tried this]. You might be able to run fine and then only get a runtime error at those times that it tried to use encryption. (Or it might fault in a clearer location saying that the NSS libraries are unavailable -- in which case we have another "bug" that the Mozilla code doesn't cleanly handle missing security libraries). [There are lots of examples of this -- you ought to try starting it without the various subdirectories containing the icons, "pseudo-code", etc. under MOZILLA_FIVE_HOME (MOZILLA_LIBDIR) sometime... :-(]
1. http://
2. http://
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #66 |
Further update. I've been working with SeaMonkey 1.1.1 because its memory requirements seem significantly lower than Firefox (why is this???). At any rate this morning I managed to spring the untitled window problem several times. The system is *not* memory constrained (I've got 20+MB of kernel file system buffers). It is however relatively busy, running mplayer and a fairly high network load (~20-30% busy over a DSL line). The same URL will not always reproduce the problem. I sprang it both on the result of an eBay query as well as google query results (where I commonly make multiple "Open Link in New Tab" requests in rapid succession before the initial request has completed). I think this may be a critical aspect of this -- trying to get the browser to open multiple tabs (or the resizing of window contents of half-downloaded
The work-around is to select the tab with no displayed contents, i.e. the untitled windows "parent" tab and copy its URL. Then close that tab (which will close all the untitled windows [which although they display the URL don't have the "window dressing", e.g. scroll bars, required to do anything with them]). Then open an entirely new browser window (ctrl-N), paste the copied URL. The page seems to always display properly for me. It is highly annoying to have to do this however.
I've been running SeaMonkey for 2 days, but it isn't heavily loaded. About 20 windows, maybe 60-70 tabs, only consuming 176MB (VirMem) / 51MB (ResMem). The trick seems to be to make enough process switching take place that the Mozilla/GTK threads experience delays (which is probably why one frequently runs into it when one is either up against the system memory limits or when one is running large system builds in the background).
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #67 |
Jackpot sort of. It turns out that with my system in this state the bug may be easier to reproduce. I was able to reproduce the problem 3 times using gmail. Gmail appears to be particularly adept at reproducing the problem due to its interface.
The steps appear to be:
0) Make your network "relatively" busy (i.e. such that an Inbox "fetch" from gmail will not be "instantaneous").
1) Ctrl-T to open a new tab.
2) Enter http://
When you do this, gmail should start up (I've got it set up so it does an autologin) and give you the [LOADING...] message in the upper left corner of the tab window. Now, shortly after this, gmail will throw up 2 untitled windows (if you are going to encounter the problem). Gmail seems to treat one of these as the "Inbox" window and the other as the "messagebox" window. Now, unlike the classical case, gmail will usually throw up identical images in both the "tab" window and the separate "Inbox" window. If you check a box in the tab window *or* Inbox window the image will usually change in *both* windows. If you click on a message it will show up in the separate messagebox window (and the tab window most of the time I think). I think what is happening is that Gmail has two window images (Inbox & Messagebox) that it is flipping back and forth into the tab window.
Points of note. If you open something else, e.g. www.google.com in the "fresh" tab *before* you open www.gmail.com it doesn't seem to spring the problem (i.e. you will get normal gmail behavior within the tab). You have to open gmail in a "fresh" tab.
Gmail can also get "stuck". This appears to happen when it is downloading the Inbox contents (on a bandwidth-limited connection). In which case you will not get the 2 untitled windows and the inbox will not display in the tab window. Now interestingly, after I closed the tab in this case, then tried it again I got *4* untitled windows (2 apparently from the closed tab and 2 from the new tab). When I closed the new tab all 4 untitled windows went away. Very briefly during the inbox/window setup process seemed to appear an item at the bottom of the window indicating that it was contacting (Waiting for/Transferring data from) "chatenabled.
I've got a couple of gdb traces I'll be attaching.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #68 |
Created an attachment (id=258427)
gdb trace of Gmail hung during loading Inbox
This was the trace where gmail hung during loading the Inbox. It did not seem to spring the message it sometimes does involving the fact that loading the Inbox was taking too long (perhaps because the "secondary" window creation was hung as well?).
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #69 |
Created an attachment (id=258428)
gdb trace of gmail hung with 4 untitled windows
This is the trace when gmail had brought up 4 untitled windows (presumably 2 from the previous request to gmail, and 2 from the current request).
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #70 |
*** Bug 352178 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #71 |
The plot thickens even more... Ok. While running moderately high network load (perhaps 50+% of outgoing bandwidth) *and* while running a limited Firefox build (80+% CPU consumption)... New Window (pulls up www.google.com as the default window), change the URL to www.gmail.com. Spawns the typical 2 untitled windows and hangs in the primary tab window with a red "[Loading...]" box in the upper right hand corner. Other SeaMonkey windows appear to work fine. Start a second new window (starts with www.google.com), reset URL to www.gmail.com, bang, 2 more untitled windows (now we have 4). This time it appears to hang for a long time with "transferring data from 'chatenabled.
Ok, minor qualification. If I have one window which has 2 tabs, one a "normal" URL, the other a "hung", 2 untitled window "enabled", gmail ... If I attempt to move up or move down in the gmail window (i.e. by single clicking above or below the window position marker in the scroll bar) nothing will happen. If I attempt to drag up or drag down the window position marker nothing will happen. If I click up or click down *and* switch to the non-gmail tab and switch back to the gmail tab the view will have scrolled (up or down).
I would guess we have a thread activation or priority problem. It would appear that Mozilla/GTK is ignoring signals coming from the current window (i.e. scroll this window up or down) but is responding to those signals when the window is reactivated after having been deactivated. Of course in the case of gmail there is a complex interaction with javascript going on.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #72 |
Created an attachment (id=258441)
gdb trace of SeaMonkey hung in gmail with 4 untitled windows
This is an alternate to the previous trace. This is a completely different instantiation of a gmail window (with untitled windows) being "hung". Indeed under this trace there are 4 untitled windows attached to 2 "gmail" tabs at least one of which is fairly unresponsive (i.e. it will seem to respond if I switch to other tabs or windows but not if I sit and wait on the primary window).
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #73 |
A bug report for this was filed in the Gnome bug database (#417973) which was marked as a duplicate of:
http://
which is where I am now posting in an attempt to get the attention of the gtk+ developers.
I will confirm an additional situation of opening a new window, starting gmail and getting untitled popup windows. This did *not* occur when non-browser network traffic was stopped. So I think a key element to reproducing this bug involves having a system load which delays prompt processing of the browser network requests (perhaps DNS lookups which I think may involve a specific SeaMonkey/Firefox thread(?)).
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #74 |
I have refiled this as a GTK+ bug. See:
http://
We shall have to see whether the GTK developers bounce it back into the Mozilla camp.
In Mozilla Bugzilla #263160, Erikred (erikred) wrote : | #75 |
Shoot, no response over in GTK+ land. Could it be that the bug title did not ring a bell when people read it? It may not sound like a GTK+ bug just from reading the title.
"Gmail / Firefox / SeaMonkey / Epiphany fail to manage windows properly"
I like the bug title on *this* site better, is there a way to change the bug title and see if anyone reacts? Something like
"frames open in new GTK+ windows, leaving firefox unusable"
Only intended as a friendly suggestion!
In Mozilla Bugzilla #263160, Howard Chu (hyc) wrote : | #76 |
I'm also seeing this behavior in Suiterunner. It's happened off and on for the past several weeks using the browser. Today is the first time I saw it with MailNews though - I had just finished reading a message in the preview pane, deleted it, and the next new message popped up in a separate window instead of in the preview pane. All the same symptoms as above - GDK errors, etc...
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #77 |
Created an attachment (id=264615)
Full trace of GdkWindow %#lx unexpectedly destroyed
This is a subset of a much larger debugging session. This is running firefox under gdb with args set to "--sync". In particular this shows the start of a window unexpectedly destroyed error set.
In this instance I think I know what caused it because the full thread trace reveals images being loaded from a site which "auto refreshes", in particular the page which displayed the "untitled window", had the following code:
<meta http-equiv=
Now, the page in question also had multiple javascripts, and though I was running with NoScript enabled, the page appears to have some very strange code which appears that it might be designed to prevent NoScript from functioning properly (I don't know how NoScript disables javascripts).
I also know that system builds were running and the CPU was maxed out.
This suggests that one could build a debug test case by simply opening a lot of pages which frequently refresh (1-5 sec instead of 900) and max out the CPU by repeatedly building firefox.
I am more convinced than ever this is a thread sequencing problem as when gmail has the problem it seems to be in the Destroy Frame code, when one is "archiving" a message and returning to the Inbox screen. It looks as if gmail may be running multiple Javascript threads -- one which is closing the message window (frame?) while the other is redrawing the Inbox window (frame?). It is worth noting that because gmail "monitors" one's Inbox and perhaps one's potential chat partners, it does its own internal equivalent of a "REFRESH" at random intervals. But it is being done by javascript rather than the HTML code.
Firefox seems to be assuming that actions by gdk are synchronous and they really don't appear to be. So if it starts some activity in a window and that thread gets suspended, then destroys the window, the thread which was doing stuff to the window finds a destroyed window when it is unsuspended.
The question is whether in gdk there is some way to guarantee that all pending operations within a window are complete before one destroys it?
There is also an interesting question that might be asked, "What happens to javascripts which are dealing with a window when the window is destroyed (or refreshed with a completely different window)?"
In Mozilla Bugzilla #263160, L. David Baron (dbaron) wrote : | #78 |
Mozilla interacts with Gdk only on a single thread (often called "the main thread").
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #79 |
David, that is true, thread 1 through DoProcessNextNa
But glib is designed to handle inputs from multiple threads and has locks on the data structures to prevent different threads from interfering with each other. But in the situations where I am seeing the untitled window problem there are anywhere from 9-12 threads running. Thread 2 appears to be the network interface (to fetch the contents of web pages) and in some cases a third thread may be doing a DNS lookup, but I have to assume that the other threads are presumably asynchronous javascript or plugin threads (though in my test case I've limited plugins to just NoScript and AdBlock).
Glib seems entirely event driven, so if one has a case where thread A is destroying a window and thread B is manipulating the window and thread A isn't locking the window down so it cannot be manipulated during its destruction (or it cannot be destroyed during its manipulation) then you have a recipe for what we are seeing when the machine gets busy -- thread B starts a sequence of operations on the window, gets suspended, thread A deletes it, and thread B resumes only to discover its assertions are failing because the window no longer exists.
The REFRESH and gmail examples are pretty clear. To "destroy" the window, you have to start the process of freeing all of the data structures within the window -- that may take a while (especially if memory is in short supply and paging of the fragmented heap causes thread suspensions). If an event comes along and tries to manipulate partially destroyed window structures there may be problems.
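The locking discipline described above (thread A must not destroy the window while thread B is in the middle of manipulating it, and B must notice once the window is gone) can be sketched in isolation. This is only an illustration of the hypothesis, not a claim about how Mozilla or GDK are structured; per comment #78 Mozilla talks to Gdk on the main thread only. `GuardedWindow`, `Manipulate`, `Destroy` and `RunRace` are hypothetical names.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical window record guarded by one lock.
struct GuardedWindow {
    std::mutex lock;
    bool destroyed = false;
    int ops_applied = 0;
    int ops_rejected = 0;
};

// "Thread B": try to manipulate the window. Returns false when the window
// has already been destroyed, instead of touching freed state.
bool Manipulate(GuardedWindow &w) {
    std::lock_guard<std::mutex> hold(w.lock);
    if (w.destroyed) {
        ++w.ops_rejected;
        return false;
    }
    ++w.ops_applied;
    return true;
}

// "Thread A": mark the window destroyed under the same lock, so the
// destruction cannot interleave with a manipulation in progress.
void Destroy(GuardedWindow &w) {
    std::lock_guard<std::mutex> hold(w.lock);
    w.destroyed = true;
}

// Run both threads concurrently. Returns true when no manipulation succeeded
// after one had already been rejected, and every attempt was accounted for.
bool RunRace(int attempts) {
    assert(attempts >= 0);
    GuardedWindow w;
    bool applied_after_destroy = false;
    std::thread b([&] {
        bool seen_reject = false;
        for (int i = 0; i < attempts; ++i) {
            bool ok = Manipulate(w);
            if (!ok) {
                seen_reject = true;
            } else if (seen_reject) {
                applied_after_destroy = true;
            }
        }
    });
    std::thread a([&] { Destroy(w); });
    a.join();
    b.join();
    return !applied_after_destroy &&
           w.ops_applied + w.ops_rejected == attempts;
}
```

How many operations land before the destroy varies from run to run, but with the shared lock and the `destroyed` check no operation is applied afterwards. Without the check, thread B would be in exactly the position of the failing assertions described above.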
I'm just guessing, but it looks like in nsWindow.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #80 |
Created an attachment (id=264760)
Traces of window (frame) destroy's
This is a set of backtraces from closing the gmail login window. The gmail login window does have a number of javascripts that run, including the counter that updates how much free space one has (it uses the setTimeout() function).
From studying this and the glib functions it looks to me like the nsBaseAppShell is running and processing events, which in turn invokes the g_main_context functions to process pending gtk/glib events. Glib maintains its own event queue which through previously setup callbacks ends up back in the Firefox functions in nsWindow.cpp -- delete_event_cb() and OnDeleteEvent() which go through all kinds of functions to delete all of the Firefox data that is associated with the window. Now, on a machine with a heavy CPU load and/or one where Firefox is using a lot of memory [1], particularly with complex windows, the destruction and freeing can easily become suspended. If at that time, the result of an HTML REFRESH takes place or a javascript timer goes off and they "think" they have control over the window, I suspect they may do things which add actions to the glib events queue. Once the g_closure_invoke function completes and backs out to the g_main_context level it will continue to process these events only to discover that the window that the events were being applied to no longer exists -- and this is what causes the Gtk error messages.
I think the nsWindow:
I can see this as being a bug which is likely to only occur in gtk/glib due to the way it seems to be handling queueing and dispatching events. But the firefox windows code may be at fault for not preventing any activities within the windows while they are being deleted.
1. The problem shows up under high memory usage because the Linux paging functions aren't particularly adept at dealing with this (and one can wait many seconds for a page to get swapped in). It shows up when you leave Firefox running for an extended period because the heap becomes fragmented and one is more likely to have pages required for freeing data structures (in the fragmented heap) unavailable forcing a suspension of the window destroy thread.
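As a rough illustration of the sequence described in this comment, here is a toy single-threaded event queue in Python. It is only a sketch of the hypothesis, not Mozilla or Gtk/glib code: the Window class, the run_queue function and the "gmail-login" name are all made up for the example. It shows an operation that was queued against a window being dispatched after the destroy event for that same window.

```python
from collections import deque


class Window:
    """Stand-in for a toolkit window; not a real Gtk/Gdk object."""

    def __init__(self, name):
        self.name = name
        self.alive = True


def run_queue(events):
    """Dispatch queued (action, window) pairs in order and return a log."""
    log = []
    queue = deque(events)
    while queue:
        action, window = queue.popleft()
        if action == "destroy":
            window.alive = False
            log.append(f"destroyed {window.name}")
        elif not window.alive:
            # Analogue of the consistency-check failures seen on the console.
            log.append(f"error: {action} on destroyed {window.name}")
        else:
            log.append(f"{action} {window.name}")
    return log


win = Window("gmail-login")
# A timer or refresh queues a redraw for a window whose destroy is already queued.
log = run_queue([("destroy", win), ("redraw", win)])
print("\n".join(log))
```

Note that no second thread is needed in this model: the stale operation only has to sit in the queue behind the destroy.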
In Mozilla Bugzilla #263160, L. David Baron (dbaron) wrote : | #81 |
Why do you think there are multiple threads involved? There should only be one.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #82 |
(In reply to comment #81)
> Why do you think there are multiple threads involved? There should only be
> one.
>
I've got a debug compiled Firefox currently running under gdb with several dozen open windows (and perhaps a hundred+ tabs). If I ctrl-C it and say "thread apply all bt" I get 7 active threads (1, 2, 3, 12, 25, 26, 27). Now gdb cannot give me meaningful stack traces for the later threads, but I view this as a gdb bug in tracing system calls. #1 is, as you have said, the "main" thread apparently controlling much of the activity with gtk/gdk/glib. #2 is apparently the single network communication thread, which is of course highly questionable if my system (or network) happens to have separate connections for dial-up, DSL, Cable, Satellite and WiFi links to various sources. Single-threading network communications at the application level is wrong. The distribution and collection of network requests is an OS-level management problem. If thread creation is not prohibitively expensive (and it will not be on 2, 4 & 8 core processors) then all network requests should be running on individual threads. I do not fully understand what threads are created and destroyed within the current Mozilla/Firefox model -- if there is a document which clearly outlines this I would be happy to review it. (I strongly suspect that the "model" for the program is lost within the minds of a few developers who wrote core aspects of the code -- presumably many of whom were writing for a Windows paradigm rather than an open source Linux paradigm.)
I have no evidence that asynchronous operations, e.g. the HTML REFRESH operation or asynchronous javascript timers are or are not operating in different threads (the handling of asynchronous operations isn't exactly documented to the best of my knowledge). It really doesn't matter. If an async interrupt occurs in the "default" main thread it can still add operations to the glib "event" processing queue unless it recognizes that it is adding such operations to a window (or its subcomponents) which are in the process of being deleted.
I am willing to be wrong about this. Point out the precise functions where REFRESH and/or Javascript Timeouts are being handled and point out the precise locations where they will lock or block on access to the window in which they are running. Or point out that all of these window activities are being bounced back up to the top level where they are being enqueued and the enqueuing will detect an attempt to enqueue on a "Destroyed" object. Because as things stand right now -- it looks like the code is destroying the window and it is enqueuing things to be done to the window which has been destroyed. One should not perform magic upon an object incapable of supporting magic. At least in my humble opinion.
In Mozilla Bugzilla #263160, L. David Baron (dbaron) wrote : | #83 |
There's lots of asynchronous stuff happening, but everything to do with Gdk/X, windows, and scripts should be running on the main thread. The other threads (timer thread, socket transport thread, and a few others) should have nothing to do with it.
Please don't use this bug to discuss general complaints about threading design; if we have that debate here we won't be able to find the part relevant to this bug anymore.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #84 |
David, understood.
If indeed all Gtk/glib requests are being submitted through the main thread, there should *still* be a constraint that they should *NOT* be submitted while a window deletion is in progress. Indeed, all frames, windows, etc. should have a flag indicating that "modifications to this window/frame will be bounced" and such a flag should be set once a window destruction (or redraw) process is invoked.
The form of the error, i.e. "Window unexpectedly destroyed" followed by several window data structure consistency checks reeks of the fact that one is adding things to do to windows that are in the middle of the downwards spiral for destruction. That is a really bad idea. One should not be attempting to schedule activities for windows which are effectively dead!
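To make the proposed flag concrete, a minimal Python sketch follows. It is an assumption about how such a guard could look, not the way nsWindow or Gtk actually work: GuardedWindow, begin_destroy and submit are hypothetical names invented for this example.

```python
class GuardedWindow:
    """Hypothetical window wrapper that bounces work once teardown starts."""

    def __init__(self, name):
        self.name = name
        self.destroying = False
        self.applied = []   # operations that were carried out
        self.bounced = []   # operations rejected because teardown had begun

    def begin_destroy(self):
        # Set the flag first, before any data structures are freed.
        self.destroying = True

    def submit(self, operation):
        """Apply the operation unless teardown has started; return True if applied."""
        if self.destroying:
            self.bounced.append(operation)
            return False
        self.applied.append(operation)
        return True


win = GuardedWindow("frame-1")
win.submit("resize")       # accepted, window is still live
win.begin_destroy()
win.submit("redraw")       # bounced, window is on its way out
print(win.applied, win.bounced)
```

The point of the design is that the check happens at submission time, so nothing for a dying window ever reaches the toolkit's queue.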
Robert
In Mozilla Bugzilla #263160, Adam Guthrie (ispiked) wrote : | #85 |
*** Bug 339251 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #263160, Nico R. (n-roeser) wrote : | #86 |
The following bugs seem to be duplicates of this bug report:
bug 349497,
bug 354970,
bug 366896.
They are probably filed with the wrong product and/or component.
Please have a look at them and take appropriate action. Thanks!
By the way, I have also experienced this bug with the Firefox Preferences window two days ago, and with its Downloads window twice today.
In Mozilla Bugzilla #263160, Braden (braden) wrote : | #87 |
*** Bug 349497 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #263160, Braden (braden) wrote : | #88 |
*** Bug 354970 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #263160, D. Hugh Redelmeier (hugh-mimosa) wrote : | #89 |
I'm a refugee from bug 349497. And before that, https:/
I've had this problem for a long time. I experienced this using firefox on Fedora Core 5 and then 6 on x86_64. I put some more detail into the Redhat bugzilla entry.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #90 |
I would agree that #349497 and #354970 are the same bug. Bug #366896 might be if one is getting a case where gtk/glib is not catching a case of the window being destroyed and is attempting to manipulate deleted (free-ed) memory structures. Given the number of structure consistency checks in Gtk/Glib it is easy to imagine that they may not have caught them all (I've encountered a few of these even with the most up-to-date libraries). If a user is running on libraries which do not have the debug (& consistency check) options enabled, SEGFAULTs would not be out of the realm of possibility. One needs a stack trace to determine whether the faults are taking place explicitly within Gtk/Glib.
We are clearly in the realm of items being added to various "event" queues for a window while the window is in the process of being destroyed. In cases of high CPU use and/or high memory use -- a window "destroy" operation is not a "guaranteed to go to completion" situation and therefore adding anything to pending event queues (where when they come to the head of a queue they are dealing with a semi- (or completely) destroyed object) is quite problematic.
I maintain my position -- that this is a Firefox (and associated program) problem that one should not be attempting to add activities to a window queue during its destruction. Whether you view this as a Firefox problem or a library problem is open to debate. (For example the most immediate action when one detects a window destroy request is to destroy all Javascripts (and timeouts) or window REFRESH's associated with said window *before* one destroys the window itself!)
In Mozilla Bugzilla #263160, Philringnalda (philringnalda) wrote : | #91 |
*** Bug 381270 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #263160, Howard Chu (hyc) wrote : | #92 |
Something else I've noticed recently - as noted before, the problems don't start right away, they only start happening after mozilla has been running for a while. Over the course of time things run a lot more sluggishly. Just now I noticed in top that the X server was taking 16% of my CPU even though nothing was happening anywhere. The seamonkey binary was taking like 1%, nothing else was really taking anything. But when I exited seamonkey, the X server's CPU usage dropped back to 0.7%. So it appears that seamonkey is doing something weird that confuses the X server, and that's when things start going wrong.
In Mozilla Bugzilla #263160, Howard Chu (hyc) wrote : | #93 |
Seeing the X server slowdowns again, after some prolonged use. As a wild-ass-guess I suspect seamonkey is using up the X server's backing store resources. I wonder if this has anything to do with the cached page renderings.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #94 |
(In reply to comment #93)
> Seeing the X server slowdowns again, after some prolonged use. As a
> wild-ass-guess I suspect seamonkey is using up the X server's backing store
> resources. I wonder if this has anything to do with the cached page renderings.
>
Howard, I've seen the X server CPU usage go up in conjunction with the Firefox CPU usage but it generally happens only when I've got hundreds of tabs (windows) open. It seems to be aggravated if I have many other programs open which may be using X as well. Firefox is not a particularly large user of "X Server Memory". Right now, on my machine, the System Monitor is indicating that acroread is consuming 30 MiB, epiphany: 19 MiB, soffice.bin: 16.5 MiB, nautilus: 747 KiB, gnumeric: 336 KiB. Firefox is only consuming 335 KiB. If anything, Firefox may be consuming CPU because it may be constantly transferring data from its own memory to the X-server memory. (I suspect for example, one can't run Firefox in "DRI" mode for things like animated GIFs, video, etc.).
But in any case, this is not a bug for discussing Firefox+X CPU usage -- a separate bug report should be filed for that if one has not been filed already. This bug should *only* be discussing cases where one is destroying a window and one gets the "Window --address-- unexpectedly destroyed" error on the Firefox (Seamonkey/
The working hypothesis is that this usually only happens under high CPU use conditions (or high swapping conditions) because there are thread race conditions that allow window commands to be added to a window event queue *after* the window deletion process has begun. Unless you are consistently seeing "window unexpectedly destroyed" errors, you should not be discussing your problem(s) under this bug.
If you find or create a bug which discusses the Firefox-X CPU usage problems, you may want to make a note of it here as it is certainly true that the high CPU usage may make this bug more likely to occur. This bug is more critical than high CPU usage, because if one closes the Untitled Windows (rather than the tabs associated with them) it will crash Firefox.
In Mozilla Bugzilla #263160, Howard Chu (hyc) wrote : | #95 |
Those "window unexpectedly destroyed" messages start showing up shortly after the X server starts to get sluggish, that's the reason I got here in the first place.
In my case there are no other CPU or memory-hungry processes on the machine. Nor am I watching any videos inside a browser when the problems occur. And as I noted before, the problem doesn't just affect browser windows - in Seamonkey once things start going wrong, any window can be affected, e.g. the Print Status dialog, different panes in the MailNews module, etc. have all been affected at various times.
If Window *destruction* is the root cause of the problem, then why is the visible effect disrupting window *creation/
I think you're just chasing a symptom here, and you really need to find out why the CPU usage got to this bad state in the first place.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #96 |
Ok, Howard, sounds like a legitimate case of this bug. The reason that CPU usage or memory usage (and swapping) are critical is that to redraw *any* previously drawn window or tab requires destroying the previous window and all of the objects within it (this includes javascripts, GDK/GTK/GLIB "window object" structures, etc.) It is very complex, and therefore time consuming. It is also difficult to debug because the "destroy" (and memory deallocation) processes are usually not explicit in the code but are linked into the data structures and may be called in some very strange situations.
In a "light load" situation, window destructions will tend to run to completion in one operation (with no interference). In heavy load situations, particularly if large amounts of the heap are paged out, a delete operation may be suspended (freeing heap memory may require scanning through much of a highly fragmented heap to find out how to insert the freed memory back into the free memory list). Window operations (window or sub-window "redraws") can also occur asynchronously -- through either HTTP, e.g.
<meta http-equiv=
or Javascript, using the setTimeout() functions and window.
It *would* be useful to know what URLs you have open when you see this happening as well as whether you think there are pages that happen to be doing some type of "auto-refresh". If you don't know all the pages, and you don't have privacy concerns send me your sessionstore.js file and I'll turn it into a list of URLs and try to track down the offending URLs.
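For anyone who would rather pull the URL list out themselves, a small Python sketch follows. It rests on an assumption about the file layout, namely that each history entry appears as url:"..." (or "url":"...") somewhere in the text of sessionstore.js; the sample string below is invented for the example and is not taken from a real session file.

```python
import re

# Assumption: entries look like url:"..." or "url":"..." in the session text.
URL_PATTERN = re.compile(r'\burl"?\s*:\s*"([^"]*)"')


def extract_urls(session_text):
    """Return the distinct URLs found in the text, in first-seen order."""
    seen = []
    for match in URL_PATTERN.finditer(session_text):
        url = match.group(1)
        if url not in seen:
            seen.append(url)
    return seen


# Made-up sample in the assumed layout, with one repeated URL.
sample = (
    '({windows:[{tabs:['
    '{entries:[{url:"http://example.com/a", title:"A"}]}, '
    '{entries:[{url:"http://example.org/b"}, {url:"http://example.com/a"}]}'
    ']}]})'
)
urls = extract_urls(sample)
print("\n".join(urls))
```

To run it against a real file, read the file contents into a string and pass that to extract_urls; if the layout differs from the assumption, the pattern will need adjusting.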
It isn't that the URLs are really doing anything wrong -- it is that the Firefox/
I suspect there is no test case in Mozilla for a set of windows which attempts to bury the machine (e.g. 100% CPU and/or Swap usage based on nothing more than window refresh/redraw requests). If there were we probably would have resolved this bug long ago.
My statements still stand that there is a second, probably unrelated bug involving excessive CPU usage (I suspect this is due to poor management of "inactive" windows but haven't begun to investigate it yet). For example, I suspect a few hundred windows monitoring a few hundred active RSS feeds would drag the machine into the ground -- but it would not do so if the X utilization of visible vs. non-visible windows were managed properly.
In Mozilla Bugzilla #263160, Howard Chu (hyc) wrote : | #97 |
Thanks for the recap.
One site that seems to suffer from the problem pretty consistently is the discussion threads at http://
It still takes a long time before the problem first appears, but once it starts, it pretty much keeps on going. For some sites the problem will appear on a window, but hitting Return in the location bar will cause the page to be displayed correctly (with the bogus window disappearing). For this site, once the problem appears, any kind of navigation (forward to new threads, back to previous pages, etc) has the problem.
In Mozilla Bugzilla #263160, Howard Chu (hyc) wrote : | #98 |
PS: I've also tried LD_PRELOADING tcmalloc_minimal.so as Google's malloc library generally performs better than glibc's in multithreaded programs. It seems to delay the onset of the problem but doesn't cure it.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #99 |
Howard, looking at the realtechworld.com site, it looks like they are making use of:
<!-- BEGIN: AdSolution-Tag 4.3: Global-Code [PLACE IN HTML-HEAD-AREA!] -->
<script type="text/
<!-- END: AdSolution-Tag 4.3: Global-Code -->
Taking a look at the aslmain.js script, sure enough they have an assembly of complex ad manipulation code, including a setTimeout() call, lots of calls writing to the window() and even options for handling java and/or shockwave "ads". I am reasonably sure they are attempting to redraw the window, presumably with rotating ads, on a regular basis thus producing your frequent encounters with the destroy window problem.
You may be able to avoid this by turning off javascript entirely (but it looks like the realtechworld.com site depends heavily on it (boo, hiss!)), or use the NoScript addon to enable it *only* for specific "worthy" sites (e.g. gmail, amazon, perhaps realtechworld, etc. as I do).
However the advertising people are presumably trying to get increasingly clever and shove their "services" down your throat (avoiding NoScript and similar tools) so selective "enabling" is likely to have a limited lifetime.
The only way to fix the problem ultimately is to completely disable Web 2.0 (push) type applications -- i.e. download & draw the page once & don't download anything else until I explicitly request a reload.
In Mozilla Bugzilla #263160, Tom+bugzilla (tom+bugzilla) wrote : | #100 |
Created an attachment (id=267808)
Screenshot displaying window error
In Mozilla Bugzilla #263160, Tom+bugzilla (tom+bugzilla) wrote : | #101 |
Created an attachment (id=267809)
Screenshot displaying window error
I've been seeing this problem for some time now too. Still in 2.0.0.4 and somewhat surprised nothing has been done to resolve it since 2004 (0.9).
A page that causes this problem for me is: http://
Attached is the screenshot for this bug to show what is going on.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #102 |
It looks a lot like the developertutori
PLEASE! - ALL FUTURE ADDITIONS TO THIS BUG SHOULD INDICATE WHETHER JAVASCRIPT IS ENABLED OR BLOCKED. Esp. with respect to "unknown" sites. Tracing javascript timeout calls is very difficult if they go through several levels of indirection to get to the code which is setting and springing the timeouts.
HTTP Refresh problems causing destroyed+detached windows are probably much easier to trace than complex javascript timeout window manipulations.
It should be noted that this is *not* a case of Firefox getting "worse" (the problem has been around for ages), the problem is that the advertisers are making more frequent use of AJAX/Web 3.0/Timeouts in attempts to shove different ads down your throat -- "Well they didn't click on that ad after 5 minutes -- let's throw a different ad at them.".
If you are using "unsafe" sites with Javascript enabled (NoScript solves many of these problems) then (cough) you get what you deserve. You *are* allowing commercial enterprises (and corrupted web sites) to run programs (not merely display text) on your computer. Of course one should only worry about browser security, after say what -- half a million bugs? A browser with less than that, say 383,866 bugs, well certainly that's got to be a "safe" system.
In Mozilla Bugzilla #263160, cburroughs (chris-burroughs) wrote : | #103 |
I have created bug 386429 to track the separate X usage bug first described in comment #92.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #104 |
I am confirming that the bug *still* exists in Firefox 3.0a7pre using CVS source dated 19 Jul 07. I managed to spring it by reopening (via Back) a gmail window. It looks like gmail may be managing more windows-
I will also confirm that it isn't a problem of memory use by the current Firefox (the 3.0a7pre version was only consuming about 30% of main memory). However main memory was fully in use and a high (nice -19'ed) CPU load had been generated by starting a Gentoo package emerge sequence.
The problem clearly seems to be a window delete (or redraw / resize?) operation is stuck into the glib events queue at the same time various processes are operating on the window. When the subsequent operations go to work on the deleted window the errors (and new untitled windows) are the result.
It seems to me that this might present a security problem as one is depending on the integrity of the glib code to detect the fact that a window has been deleted and prevent operations on it -- if there are cases where it misses that situation the code (which might be foreign Javascript) could be copying things to/from random parts of memory (e.g. former window memory reallocated to contain form data such as CC #'s, SS #'s, etc).
In Mozilla Bugzilla #263160, Aaron Lehmann (aaronl) wrote : | #105 |
On the topic of possible heap fragmentation, has anyone tried linking Firefox with TCMalloc? It seems like others have had some pretty good results: http://
In Mozilla Bugzilla #263160, Howard Chu (hyc) wrote : | #106 |
(In reply to comment #105)
> On the topic of possible heap fragmentation, has anyone tried linking Firefox
> with TCMalloc? It seems like others have had some pretty good results:
> http://
>
Yes, I run with tcmalloc all the time now. It's only a band-aid, not a true fix. The problems still occur, it just takes longer for them to begin.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #107 |
Getting back on topic (firefox/mozilla heap memory usage is a separate problem from the untitled window problem) [1].
This comment is to confirm that in Epiphany (which has no NoScript option so Javascript is enabled), even somewhat moderate browser usage (VirMem ~735 MiB ResMem ~481 MiB) can trigger rather frequent untitled windows if one subjects the CPU to even moderate non-browser loads. The two specific URLs are the NY Times home page [2] and a recent CNN news article [3]. I suspect both of these sites, like many news sites, are on auto HTML refresh and/or Javascript managed page reloads, e.g. "window.
Someone who is skilled in HTML/Javascript needs to write a test program [5] which attempts to swamp the CPU (and/or network) with page reloads (increasingly shorter times between page reload requests should do it). Open up a couple of hundred pages (tabs) with refreshes every 1-5 seconds and I'm reasonably sure the problem will reveal itself. I would hazard a guess that this type of stress testing of various asynchronous browser features on various operating systems has not been done.
I would note that given my limited investigation of the window redraw (delete + draw) code thus far I do not believe a patch involving suspending or producing errors on window operations during a destroy window would be that difficult (and it could be applied back to version 2.0.0.X as part of the ongoing security upgrades). But it should be developed by someone who really understands how the code works and not by someone like me.
1. I'm sure there are more than a few bugs active regarding Firefox memory usage (I filed a few myself). If anyone runs across them, they may want to post the references to them here and to this topic in the most closely related. There is a relation between these bugs because as Firefox is used for long periods of time extensive numbers of window management data structures are allocated by gtk/gdk/glib causing heap fragmentation. The more fragmented the heap is the more CPU time (and paging) will be required to execute a "Delete Window" operation and the more likely an asynchronous operation on the window will be triggered during the deletion operation (thus leading to the window unexpectedly destroyed error).
2. http://
3. http://
4. I am unsure whether Epiphany is using the same window management code as is found in the mozilla sources or simply code which involves similar gdk/gtk/glib functions for the reload/redraw window operations. This may open the question as to whether this is a Mozilla problem or a system library problem -- it ultimately revolves around what operations should be permitted on windows slated for destruction and whether the application or the libraries should manage that.
5. I'm not going to reread all of the comments on this bug but I think I may have encouraged/
In Mozilla Bugzilla #263160, Antoine-mechelynck-gmail (antoine-mechelynck-gmail) wrote : | #108 |
(In reply to comment #107)
Epiphany is a Gecko product, see http://
In Mozilla Bugzilla #263160, Mozilla-bugs-rambler (mozilla-bugs-rambler) wrote : | #109 |
Looks like these bugs are the duplicates:
bug 362955
bug 368260
Also, I'm copying console errors I get when this happens so this bug shows up in the searches:
(seamonkey-
(seamonkey-
(seamonkey-
(seamonkey-
(seamonkey-
(seamonkey-
(seamonkey-
(seamonkey-
(seamonkey-
(seamonkey-
(seamonkey-
(seamonkey-
(seamonkey-
(seamonkey-
(seamonkey-
(seamonkey-
(seamonkey-
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #110 |
Created an attachment (id=287587)
Yet another example of Firefox going south...
This is yet another example of X spitting out errors when Firefox has gone south.
In Mozilla Bugzilla #263160, Bugzilla-tuxmachine (bugzilla-tuxmachine) wrote : | #111 |
*** Bug 402774 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #112 |
Jeremy, the reason that I assert solving this bug is not difficult is because the code to DELETE Javascript and HTML redraw asynchronous operations *should* be in the base level code. You have to call such functions before one "really" destroys a window otherwise one has the browser running code/functions which are attached to a non-existent window (or perhaps you don't have such code and that is why Firefox memory and CPU usage grows over time -- orphaned active window subroutines suck down the machine).
The problem is that "code invoked" redraw (delete and recreate) window operations do not clean things up the same way that a legitimate window delete operation should.
I recently experienced this problem a lot with gmail and a maxed out machine. Gmail apparently redraws its primary window asynchronously under the control of Javascript.
It is *very* simple. Whenever a window redraw is issued:
a) Shut down all javascripts on the window.
b) Shut down all HTML redraws on the window.
Then redraw the window.
It isn't easy for me to fix since I don't know the functions required.
But for someone who does it shouldn't be that difficult.
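The ordering in steps a) and b) can be sketched in Python as below. This is only a model of the idea under stated assumptions, not Firefox code: Page, set_timeout, schedule_refresh and redraw are hypothetical names, and the real functions that would have to be called are exactly the ones I said I don't know.

```python
class Page:
    """Hypothetical page object that tracks its own pending async work."""

    def __init__(self):
        self.timers = {}      # timer id -> callback (setTimeout analogue)
        self.refreshes = {}   # refresh id -> delay in seconds (HTML refresh analogue)
        self._next_id = 0
        self.redraw_count = 0

    def set_timeout(self, callback):
        self._next_id += 1
        self.timers[self._next_id] = callback
        return self._next_id

    def schedule_refresh(self, delay_seconds):
        self._next_id += 1
        self.refreshes[self._next_id] = delay_seconds
        return self._next_id

    def redraw(self):
        """Cancel all pending timers and refreshes, then redraw."""
        cancelled = len(self.timers) + len(self.refreshes)
        self.timers.clear()       # a) shut down script timers
        self.refreshes.clear()    # b) shut down pending HTML refreshes
        self.redraw_count += 1    # only now is the window redrawn
        return cancelled


page = Page()
page.set_timeout(lambda: None)
page.schedule_refresh(5)
cancelled = page.redraw()
print(cancelled, page.redraw_count)
```

After the redraw the new page is free to register its own timers; the old ones simply never get a chance to fire against the old window.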
In Mozilla Bugzilla #263160, Reed Loden (reed) wrote : | #113 |
Created an attachment (id=290364)
craziness (#1)
This is what I've been getting lately. :(
Windows start appearing and doubling... they eventually come to a peak and then all close.
In Mozilla Bugzilla #263160, Reed Loden (reed) wrote : | #114 |
Created an attachment (id=290366)
craziness (#2)
another one
In Mozilla Bugzilla #263160, Reed Loden (reed) wrote : | #115 |
I seem to get this when I have lots of tinderbox tabs open. Note that http://
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #116 |
Reed, if the page you cite indeed contains that many HTTP refresh commands then that would be a very good way to trigger this problem.
It is very clear to me what the fix needs to be.
Before one redraws a window/tab (which internally appears to be a delete and recreate window operation) one has to delete any pending HTTP refresh and any Javascript equivalents. One cannot have asynchronous operations attempting to redraw a window which has just been destroyed.
In Mozilla Bugzilla #263160, L. David Baron (dbaron) wrote : | #117 |
I also saw this again recently; it may be due to the massive pixmap leaks in bug 403481 (which vlad fixed yesterday).
In Mozilla Bugzilla #263160, Ventnor-bugzilla (ventnor-bugzilla) wrote : | #118 |
Any way to reproduce this? I've never seen anything like this before.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #119 |
Michael, it helps to be strongly up against the system limits. I do not know how this problem displays under Windows as I believe it is an X-windows interface problem. I could reproduce it relatively frequently when Firefox was consuming 60-70% of system memory and if I was running Gentoo emerges on various programs (and thus ~100% CPU usage) at the same time.
It is a combination of the X architecture (that one can submit actions to disembodied windows) with Firefox activities which place an unusual load on activity. Linux is not extremely responsive to "paging on demand", so an excessively large Firefox heap is going to stress this and delay responses to adding or deleting anything from the Firefox heap memory space (because there will be delays in paging things in or out). And thus Firefox "delete" and "recreate" windows may have long time windows and allow for interruption by async windows operations -- which is what bothers the X windows manager -- remember the messages are about operations on "deleted windows".
The way to test for this is a stress test on Firefox HTTP and/or javascript page refresh commands when you are stressing the system under high load conditions. If Firefox 3.0 has not been stress tested to the max, i.e. what are the limits to reliable page refreshes under *Linux* [1], then IMO it should not be released.
1. At one point I started to write a test page which would evolve continually decreasing times for HTTP refresh and/or javascript page refresh commands. I never finished it but it seems to be quite feasible. At some point such a diagnostic should swamp one's system. If your system is sufficiently loaded with other processes I believe that will trigger the observed bug.
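A simple variant of such a test can be generated with a few lines of Python. The sketch below does not evolve the delay within one page; instead it builds a set of static pages whose HTML refresh delay shrinks from page to page, which is an assumption about what would be enough to load the browser. The function names, file names and default delays are made up for the example.

```python
def refresh_page(index, delay_seconds):
    """Return HTML for one test page that reloads itself after delay_seconds."""
    return (
        "<!DOCTYPE html>\n"
        "<html><head>\n"
        f'<meta http-equiv="refresh" content="{delay_seconds}">\n'
        f"<title>refresh test {index}</title>\n"
        "</head><body>\n"
        f"<p>Page {index}, reloading every {delay_seconds} s.</p>\n"
        "</body></html>\n"
    )


def build_pages(count, start_delay=5, min_delay=1):
    """Build pages whose refresh delay drops by one second per page."""
    pages = {}
    for index in range(count):
        delay = max(min_delay, start_delay - index)
        pages[f"refresh_{index:03d}.html"] = refresh_page(index, delay)
    return pages


pages = build_pages(8)
print(sorted(pages))
```

Writing each entry of pages to a file in a local directory and opening the files in many tabs, while the machine is busy with something else, should give a rough stress test; whether it actually triggers the bug is untested.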
In Mozilla Bugzilla #263160, Antoine-mechelynck-gmail (antoine-mechelynck-gmail) wrote : | #120 |
(In reply to comment #119)
For a stress test of one kind of page refreshes, I suggest opening various Tinderbox pages in increasingly many tabs: links to such pages can be found at http://
In Mozilla Bugzilla #263160, Antoine-mechelynck-gmail (antoine-mechelynck-gmail) wrote : | #121 |
P.S. Of course, to avoid "DoD attacks", use _copies_ of the original pages.
In Mozilla Bugzilla #263160, Antoine-mechelynck-gmail (antoine-mechelynck-gmail) wrote : | #122 |
P.P.S. :-( I meant DoS attacks.
In Mozilla Bugzilla #263160, Howard Chu (hyc) wrote : | #123 |
I've found that running Seamonkey inside gdb slows Mozilla down enough to cause the problems to occur much more often.
E.g. invoke as "seamonkey -g" and then just "run". Gdb prints a message every time a thread starts or exits, and this appears to be enough overhead to trip things up.
In Mozilla Bugzilla #263160, D. Hugh Redelmeier (hugh-mimosa) wrote : | #124 |
I have a dual-core AMD system running Fedora Core 7 (and before that, FC6). I used to get this kind of FireFox crash every few days. After I turned off one core I seem to get a lot fewer. I cannot be sure that turning off the core was the cause of fewer FireFox misbehaviours -- it could be a coincidence of time.
(I turned off one core because of a Linux kernel bug that showed up this spring.)
It seems to me that this mildly supports the race condition theory.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #125 |
The conditions documented by "D. Hugh" indicate the potential problems of a multi-CPU system with software not designed for such. The problem is that when an async window/tab refresh comes in, the second CPU may get it while the first CPU is dealing with the process of deleting and redrawing the first window.
You have to delete the async refresh operations before you attempt to redraw a tab or window.
The X server is taking things stuck into a queue and if you stick a "delete window" operation into the queue and then an async operation sticks a "do something with that window" into the queue it is not surprising that problems result.
The X (windows) server I believe is working ok. It is Firefox which is not recognizing that it is attempting operations on windows in the process of being deleted.
There should either be a block on operations on windows being deleted or there should be an elimination of refresh operations on windows being deleted.
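The second option above (eliminating operations aimed at windows that are being deleted) can be illustrated with a toy queue. This is only a sketch of the idea, not Firefox or X code; the `drain` function and the operation names are made up for illustration.

```python
from collections import deque


def drain(queue, live_windows):
    """Process queued operations, skipping any aimed at a destroyed window.

    `queue` holds (op, window_id) tuples; `live_windows` is the set of
    windows that currently exist. Returns the operations actually applied.
    """
    applied = []
    pending = deque(queue)
    while pending:
        op, win = pending.popleft()
        if op == "destroy":
            live_windows.discard(win)
            applied.append((op, win))
        elif win in live_windows:
            applied.append((op, win))
        # otherwise the window is gone, so the stale operation is dropped
    return applied
```

For example, a "refresh" queued after a "destroy" of the same window is silently discarded, while operations on other windows go through unchanged.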
In Mozilla Bugzilla #263160, Nilbus (nilbus) wrote : | #126 |
Note that this bug is not limited to multi-CPU systems. I was only on a single-CPU when I experienced this problem.
In Mozilla Bugzilla #263160, Erikred (erikred) wrote : | #127 |
I can also confirm that the bug exists independently of whether a single-core or dual-core CPU is being used. For me, the bug has NOT been tripped more frequently since I got a dual-core system; rather the other way around, if anything. That observation is in some sense consistent with Robert Bradbury's earlier observations that higher CPU load matters -- it is likely that a dual core is more lightly loaded.
In Mozilla Bugzilla #263160, c7d2f5c8667d26fffd5e7772d632c76d (c7d2f5c8667d26fffd5e7772d632c76d-deactivatedaccount) wrote : | #128 |
*** Bug 368260 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #263160, Erikred (erikred) wrote : | #129 |
Here's a twist that may be useful for differentially debugging this problem:
I recently switched to fedora8_x86_64, but I stayed with a 32b version of firefox, because it was easier to get Java to work that way.
Since this change, I have NOT seen the bug. No more disembodied windows, no
gtk console messages. However, instead, firefox has started crashing regularly,
and it tends to correlate with visiting previously mentioned popular sites that tend to exercise the original bug.
To complicate matters, this is with a slightly newer firefox binary (2.0.0.10),
but presumably someone else can confirm that 2.0.0.10 is still buggy when run on a 32b linux.
Here's the plain and simple error message:
/usr/lib/
In Mozilla Bugzilla #263160, Timeless-bemail (timeless-bemail) wrote : | #130 |
*** Bug 409059 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #131 |
(In reply to comment #129)
> I recently switched to fedora8_x86_64, but I stayed with a 32b version of
> firefox, because it was easier to get Java to work that way.
Erik, see my comment #48 under Bug 244482 (I didn't realize it had been marked duplicate) regarding the probability of the bug *not* appearing on multi-core (or perhaps simply faster) CPUs. I view 64 bit kernels & libraries (including the X server) as being inherently faster than 32-bit equivalents due to the increased number of registers available on the 64 bit architecture compared with the 32 bit architecture. So it may simply be due to the fact that critical aspects of the system (like how fast X is processing window operations) run faster and make it harder to trigger the bug.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #132 |
Note: This is using Epiphany rather than Firefox, but the symptoms are the same.
I have definitely confirmed that the NY Times home page (www.nytimes.com) can trigger this error. I was doing other work and this morning a minimized window sitting on the NY Times home page sprang an "untitled" window with a refreshed home page. The log file contained the typical "Gdk-WARNING **: GdkWindow 0x2123dc0 unexpectedly destroyed" followed by 6 Gdk-CRITICAL/
I tried to save the "Untitled Window" and that did not work. Going back to the original window and executing a "save" did work, but the "Save" window failed to exit properly (I think the entire browser window set was effectively hung). Trying to minimize the now dysfunctional "save" pop-up window seemed to result in: "Gdk-CRITICAL **: gdk_window_hide: assertion `GDK_IS_WINDOW (window)' failed" (followed by 3 Gtk-CRITICAL/
Killing the original NY Times home page window (clicking on the window X) deleted the original window, the untitled window and the save pop-up window (at least there is a workaround). Of course it's a bad workaround if you have other useful tabs in the same window as the one causing the problem (though I believe killing the dysfunctional tab might have worked).
Now, I've looked at the homepage window and it does not have a:
meta http-equiv=
command to refresh the window using HTML. Given that I don't think I've seen the error occur when I have Javascript disabled, I think the NY Times is using a javascript window refresh timeout.
It is worth noting that I think you could reproduce the bug (at least on a 32 bit single core CPU) if one simply copied down a number of pages (newspaper or TV station "homepages" might be a good bet) and hacked them to contain a line like
<meta http-equiv=
The problem is getting the full impact of loading a time-consuming, network-bandwidth-limited page. This probably requires something like:
content="1;url=http://
content="1;url=http://
etc.
Then the problem is how to get it to repeat itself. It might require one master reloading file which sets up multiple foreign-site reloads.
But I think if you do something like this and max out either the CPU or the network bandwidth you should eventually get to the point where the bug becomes reproducible.
It is also useful to note that the NY Times home page includes the lines:
<meta http-equiv=
<meta http-equiv="Pragma" content="no-cache">
which I believe function to prevent caching of the page contents. Generally speaking if you have a fast network connection the pages one is loading should probably contain such lines. If on the other hand you max out your network bandwidth before you max out your CPU (or memory) one may want to try loading pages which are more static and can be cached (to reduce the network load).
In Mozilla Bugzilla #263160, Paul Brannan (pbrannan) wrote : | #133 |
> Killing original NY Times home page window (clicking on the window X) deleted
> the original window, the untitled window and the save pop-up window (at least
> there is a work-around). Of course its a bad work around if you have other
> useful tabs in the same window as the one causing the problem (though I
> believe killing the dysfunctional tab might have worked).
Usually killing the tab also destroys the unwanted window, for me. Sometimes it takes down firefox, but I think that's because this bug is usually triggered in low-memory situations.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #134 |
The NY Times JavaScript timeout function appears to be in the file:
http://
I fetched it using "wget".
It looks like it times out every 15 minutes. I still haven't figured out where it
gets called from (perhaps it is simply set up when the common.js file is loaded from the home page). It gives you a good idea of how to set up the timeouts
using JavaScript (which I don't speak).
Since JavaScript appears to have a millisecond timer vs. HTML which uses seconds,
one ought to be able to max out the machine by having a function like the NY
Times timer function start with a 5 second refresh then deduct 10-100
milliseconds for each successive refresh until the machine gets maxed out.
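The shrinking-delay idea can be written down as a small schedule generator. This is a sketch in Python rather than JavaScript, and `refresh_schedule` with its parameters is a hypothetical name; the 5 second start and the 10-100 millisecond step are the figures from the comment above.

```python
def refresh_schedule(start_ms=5000, step_ms=100, floor_ms=0):
    """Yield successive refresh delays, shrinking by step_ms each time.

    Stops once the delay would reach floor_ms, so the caller decides
    how aggressive the final refresh rate gets.
    """
    if step_ms <= 0:
        raise ValueError("step_ms must be positive")
    delay = start_ms
    while delay > floor_ms:
        yield delay
        delay -= step_ms
```

With the defaults this produces 50 delays, from 5000 ms down to 100 ms; a page script would feed each value to its timer in turn.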
In Mozilla Bugzilla #263160, Roc-ocallahan (roc-ocallahan) wrote : | #135 |
There are so many comments here that it's going to be hard to get anything done.
What we need here is a testcase that will reproduce the bug, not just for one person but for many or hopefully all people. This may involve writing HTML or possibly even using a Python web server. Or you may be able to get away with enabling popups and using window.open and document.write. Please let's focus on that and not discuss the details of hardware configurations or speculate about what might be causing the bug.
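As a starting point for the Python web server route, here is a minimal sketch using only the standard library. It is not a known reproducer. The handler name, the one-second default and the port are assumptions, and the `Refresh` response header is a non-standard (though widely honoured) equivalent of the meta refresh tag discussed earlier.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def response_headers(refresh_seconds):
    """Headers that ask the browser to reload and not to cache the page."""
    return [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Refresh", str(refresh_seconds)),
        ("Cache-Control", "no-cache"),
        ("Pragma", "no-cache"),
    ]


class ReloadHandler(BaseHTTPRequestHandler):
    refresh_seconds = 1  # hypothetical default; tune per machine

    def do_GET(self):
        body = b"<html><body><p>reload test</p></body></html>"
        self.send_response(200)
        for name, value in response_headers(self.refresh_seconds):
            self.send_header(name, value)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Serve the reloading page on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), ReloadHandler).serve_forever()
```

Calling `serve()` and opening the page in many tabs would give a local, self-contained source of page refreshes that does not load anyone else's site.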
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #136 |
Please note Bug #413390 and the NYT-test.sh attachment to it. In a perfect world, i.e. if Firefox could launch tabs until one's swap space was exhausted, I suspect that script (or multiple invocations thereof) would in fact provide the test case ":roc" desires. The script might provide a test case if one increases MAXSESSIONS to 250+ and increases INTERVAL to 30+, but it is going to require running the script for several hours to generate a sufficient number of tabs (running a sufficient number of page refreshes) that an "untitled window" may appear.
I suspect the INTERVAL and MAXSESSIONS values are going to be highly CPU dependent. The goal is to have a sufficiently large number of asynchronous JavaScript timeouts running that one or more timeouts will expire before a previous page refresh (delete and redraw) operation has completed, leading to the GDK errors and the "untitled window".
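To reason about how many timers are needed before expiries start landing inside a refresh that is still in progress, a toy model may help. It assumes every refresh takes the same fixed time and runs one after another, which is a simplification of my own; `count_overlaps` is a hypothetical helper, not something from the attached script.

```python
def count_overlaps(fire_times, refresh_ms):
    """Count timer firings that land while an earlier refresh is still running.

    `fire_times` are timestamps (ms) at which some tab's timer expires;
    each firing starts a refresh assumed to last `refresh_ms`.
    """
    overlaps = 0
    busy_until = float("-inf")
    for t in sorted(fire_times):
        if t < busy_until:
            overlaps += 1
        busy_until = max(busy_until, t + refresh_ms)
    return overlaps
```

The denser the firing times relative to the refresh duration, the higher the count, which matches the intuition that more tabs or a slower machine make the condition easier to hit.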
To the best of my knowledge there is no way to obtain a "ps" within Firefox for currently pending Javascript timeouts. This is another "bug".
In Mozilla Bugzilla #263160, Mook-moz+mozbz (mook-moz+mozbz) wrote : | #137 |
*** Bug 365734 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #263160, Antoine-mechelynck-gmail (antoine-mechelynck-gmail) wrote : | #138 |
*** Bug 367832 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #139 |
Please note the creation of Bug #427024, using a very recent release of Firefox 3.0pre. That bug provides both a sessionstore.js file and firefox log files for a reproducible case (at least within an existing firefox session) of the "window unexpectedly destroyed" and the "untitled window" problems using Gmail, which means the problem is being generated using JavaScript -- this is slightly different from many of the problems reported under this bug which are typically generated by window redraw commands from HTTP timeouts (sometimes used by news providers, advertisers, etc.).
In Mozilla Bugzilla #263160, Antoine-mechelynck-gmail (antoine-mechelynck-gmail) wrote : | #140 |
*** Bug 399436 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #141 |
Created an attachment (id=317518)
GDB log of window unexpectedly destroyed errors
Ok, here you go: finally, after more than a year of encountering this problem, a set of stack traces. This is a "current" firefox (CVS compiled 29 Mar 08 / version 3.0pre). Firefox itself, gdk+, glib and libc are compiled with debug symbols (-g2).
The firefox session has been running 3 days (since the system was rebooted). It was a restart of a previous long running session so it currently has 53 windows and 445 tabs open.
The problem is gmail is completely dysfunctional! The symptoms first appeared as an old (working) gmail window could not compose a message. One could enter the To: line and the Subject: line but the window would fail to echo text typed into the main body message. One could discard the partial message and start a new message and it exhibited the same problem. One could close the old gmail window and attempt to reopen a new gmail window and that would produce the standard "GdkWindow ... unexpectedly destroyed" messages *consistently*. There are 14 GdkWindow warnings followed by 3 gdk_x11_
It should be noted that a gmail "window" is displayed, with a title "Gmail - Inbox(4) - <email address hidden> - Mozilla Firefox" but no text is displayed within the window.
The debugger was attached to firefox, some breakpoints were set and deleted when they were determined to be too "chatty". The last ~18 backtraces involved the g_log messages resulting from a fresh gmail window restart.
I will attempt to keep this firefox/gdb session open in the hope that someone who understands widget/
It should be noted, that even with a troublesome sessionstore.js file (many windows and tabs) it still usually takes several days of use to get Firefox into the window destroying problem state.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #142 |
Created an attachment (id=317520)
Window destroyed problems opening new tabs
Ok, here are the Firefox traces from the same problematic firefox session. In this case however, the problem is not gmail, instead it is opening new tabs from http://
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #143 |
Regarding the "semi-responsiv
It may be worth noting that although the Firefox gmail window contents is "dead" (i.e. the window body displays the contents of what was previously at that location on the monitor), it is still "alive".
One can move the window around on the monitor, can move it between workspaces and interestingly enough it is still communicating with gmail. If I use Galeon to access my gmail mailbox and send myself a test message the title bar on the window does change from 4 unread messages to 5 unread messages. The time for this to happen however is some number of seconds, significantly longer than for the same change to be reflected in the Galeon window for my mailbox.
sra136 (sra136) wrote : | #144 |
Binary package hint: firefox-3.0
With some websites, an "Untitled window" pops up. Closing the window will close the browser.
ProblemType: Bug
Architecture: i386
Date: Sun May 4 08:58:33 2008
DistroRelease: Ubuntu 8.04
Package: firefox-3.0 3.0~b5+
PackageArchitec
ProcEnviron:
PATH=/
LANG=en_US.UTF-8
SHELL=/bin/bash
SourcePackage: firefox-3.0
Uname: Linux 2.6.24-17-generic i686
sra136 (sra136) wrote : | #145 |
- Dependencies.txt Edit (2.6 KiB, text/plain; charset="utf-8")
- pluginreg.dat.txt Edit (2.5 KiB, text/plain; charset="utf-8")
- profiles.ini.txt Edit (94 bytes, text/plain; charset="utf-8")
Joe Smith (yasumoto7) wrote : | #146 |
Could you give an example of a site where this occurs?
Arthur (moz-liebesgedichte) wrote : | #147 |
I've had this happen on tagi.ch several times today now. I've never seen it before. Probably advertisers have changed their popup-scripts. Bug #227068 sounds similar.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #148 |
Ok, here is the stack trace from an "unexpectedly destroyed" error in the context of returning from a gmail message back to the Inbox index:
Breakpoint 1, IA__g_logv (log_domain=
format=
396 gboolean was_fatal = (log_level & G_LOG_FLAG_FATAL) != 0;
(gdb) thread apply bt all
(gdb) thread apply all bt
Thread 7 (Thread 0xb67a8b90 (LWP 12544)):
#0 0xb7f83410 in __kernel_vsyscall ()
#1 0xb71e65b7 in *__GI___poll (fds=0xb67a7f98, nfds=2, timeout=65535000)
at ../sysdeps/
#2 0xb7dffe5a in PR_Poll () from /usr/local/
#3 0x080d0a0c in ?? ()
#4 0x08cbd450 in ?? ()
#5 0x00000002 in ?? ()
#6 0x03e7fc18 in ?? ()
#7 0xb7dfe516 in PR_ExitMonitor () from /usr/local/
#8 0x080d137d in ?? ()
#9 0x08cbcf70 in ?? ()
#10 0x00000001 in ?? ()
#11 0xb67a8208 in ?? ()
#12 0xb7eb7ff4 in ?? () from /usr/local/
#13 0x08cbd7b8 in ?? ()
#14 0x00000001 in ?? ()
#15 0xb67a8218 in ?? ()
#16 0xb7e81dd7 in ?? () from /usr/local/
#17 0x08cbd7d8 in ?? ()
#18 0x00000000 in ?? ()
Thread 6 (Thread 0xb5f75b90 (LWP 12545)):
#0 0xb7f83410 in __kernel_vsyscall ()
#1 0xb7f73b12 in pthread_
#2 0xb7dfd3a5 in ?? () from /usr/local/
#3 0x08c3497c in ?? ()
#4 0x08c58410 in ?? ()
#5 0xb5f7528c in ?? ()
#6 0xb7f745f5 in __pthread_
#7 0xb7dfe194 in PR_WaitCondVar () from /usr/local/
#8 0xb7e8646f in ?? () from /usr/local/
#9 0x08c34978 in ?? ()
#10 0x00051fb9 in ?? ()
#11 0x08c58410 in ?? ()
#12 0xb7eb7ff4 in ?? () from /usr/local/
#13 0x08e2a420 in ?? ()
#14 0x00000000 in ?? ()
Thread 5 (Thread 0xb4679b90 (LWP 12549)):
#0 0xb7f83410 in __kernel_vsyscall ()
#1 0xb7f737e5 in pthread_
#2 0xb7dfe226 in PR_WaitCondVar () from /usr/local/
#3 0x088daa5a in ?? ()
#4 0x090aeb18 in ?? ()
#5 0xffffffff in ?? ()
#6 0xb1c8b2f0 in ?? ()
#7 0xb1746cb0 in ?? ()
#8 0x08bff4a8 in ?? ()
#9 0x00000000 in ?? ()
Thread 4 (Thread 0xb4e7ab90 (LWP 12550)):
#0 0xb7f83410 in __kernel_vsyscall ()
#1 0xb7f737e5 in pthread_
#2 0xb7dfe226 in PR_WaitCondVar () from /usr/local/
#3 0x088c6de3 in ?? ()
#4 0x09290278 in ?? ()
#5 0xffffffff in ?? ()
#6 0x092901dc in ?? ()
#7 0xb7e0cff4 in ?? () from /usr/local/
#8 0x092902b8 in ?? ()
#9 0x00000000 in ?? ()
Thread 3 (Thread 0xb2593b90 (LWP 12563)):
#0 0xb7f83410 in __kernel_vsyscall ()
#1 0xb7f737e5 in pthread_
#2 0xb7dfe226 in PR_WaitCondVar () from /usr/local/
#3 0xb7dfe287 in PR_Wait () from /...
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #149 |
Another example with clearer traces:
(the former may involve extended Glib errors while this involves current glib errors.)
Breakpoint 1, IA__g_logv (log_domain=
format=
args1=
396 gboolean was_fatal = (log_level & G_LOG_FLAG_FATAL) != 0;
(gdb) bt
#0 IA__g_logv (log_domain=
format=
args1=
#1 0xb77ed9b9 in IA__g_log (log_domain=
format=
#2 0xb79e1477 in IA__gdk_
#3 0xb79c6c68 in gdk_event_translate (display=0x8c18018, event=0x2211c2e8, xevent=0xbfb43e5c,
return_
#4 0xb79c82d7 in _gdk_events_queue (display=0x8c18018) at gdkevents-
#5 0xb79c879f in gdk_event_dispatch (source=0x8c1f188, callback=0, user_data=0x0) at gdkevents-
#6 0xb77e47f8 in IA__g_main_
#7 0xb77e7a4e in g_main_
at gmain.c:2642
#8 0xb77e7f9c in IA__g_main_
#9 0x08246aec in ?? ()
#10 0x00000000 in ?? ()
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #150 |
My recently filed bug reports on this bug have been generated by Gmail, which seems adept at triggering this bug under heavy load conditions (i.e. 277+ active tabs in the browser).
And so I am saying to the people who wish to verify firefox functionality -- you will not know it until you test it. My Gmail problems do not seem to appear until I have multiple sites active.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #151 |
Let us seriously discuss this question. The bug has been here for 4+ years and has still not been resolved. Therefore it must be an issue between the Mozilla developers and the X developers -- who do not choose to cross-pollinate with respect to potential X bugs. OR we must generally consent to the fact that the masses are generally immune to Linux and proceed along their general way.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #152 |
As this has been a Firefox bug for 4+ years and is still unresolved, I feel compelled to point out that it appears in relatively static mode with respect to the glib stack dumps and their position.
The fundamental problem appears to be "do this operation on window X when window X and its subunits have been deleted".
That requires a commitment from the release "gods" of firefox 3.0 that they will not release it with "known bugs on deck". It is insufficient if a program can be claimed to work for Windows and not for Linux. IMO, that is a non-functionable.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #153 |
Created an attachment (id=322449)
Set of gdb stack traces of destroyed windows
Here is a set of gdb stack traces of firefox throwing the "window unexpectedly destroyed" bug (plus a few other glib errors). The URLs involved were gmail (searching one's own mailbox) and the Internet Movie Database (www.imdb.com).
I've got a firefox setup *NOW* which is regularly throwing these errors into gdb. It will not get resolved until someone, presumably someone who understands Firefox's javascript enabled use of windows timers, creation and destruction, contacts me for further information.
I can verify that this is currently the *real* bug, because in gmail when it is throwing bugs I see it create and subsequently delete the little untitled windows before it returns to the main screen.
Alexander Sack (asac) wrote : Re: [Bug 226470] Re: untitled popup window | #154 |
On Fri, May 23, 2008 at 08:29:31AM -0000, Arthur wrote:
> I've had this happen on tagi.ch several times today now. I've never seen
> it before. Probably advertisers have changed their popup-scripts. Bug
> #227068 sounds similar.
>
Please attach a screenshot of that situation.
status incomplete
Thanks,
- Alexander
Changed in firefox-3.0: | |
status: | New → Incomplete |
Arthur (moz-liebesgedichte) wrote : | #155 |
It's as mysteriously gone as it came... I haven't seen it now for several days. Really weird.
Alexander Sack (asac) wrote : | #156 |
On Fri, May 30, 2008 at 07:50:59AM -0000, Arthur wrote:
> It's as mysteriously gone as it came... I haven't seen it now for
> several days. Really weird.
>
OK thanks. If you see it again, please reopen this bug.
status invalid
- Alexander
Changed in firefox-3.0: | |
status: | Incomplete → Invalid |
Arthur (moz-liebesgedichte) wrote : | #157 |
- Screenshot.png Edit (386.9 KiB, image/png)
Speaking of the devil... Today it showed up again. This time under nzz.ch. See attached screenshot.
Changed in firefox-3.0: | |
status: | Invalid → New |
Alexander Sack (asac) wrote : | #158 |
On Fri, May 30, 2008 at 01:03:41PM -0000, Arthur wrote:
> Speaking of the devil... Today it showed up again. This time under
> nzz.ch. See attached screenshot.
>
Can you reproduce? maybe uninstalling flash helps? maybe disabling
your extensions in tools -> addons helps?
status incomplete
- Alexander
Changed in firefox-3.0: | |
status: | New → Incomplete |
Arthur (moz-liebesgedichte) wrote : | #159 |
I'm pretty sure it's triggered by certain ads which unfortunately change each time you load the page. I'll try to see what triggers it, but that can take some time as they are rather infrequent. By the way I've only ever seen this with the Ubuntu FF3 Beta 5, never with the current mozilla.org nightlies.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #160 |
It also may be of use to look at Bug #437021 which is a distinct bug of its own because it relates to Firefox SEGFAULTing under Linux (repeatable as I have 5+ traces involving the problem with the associated crashes of Firefox), only the most recent of which involved getting a gdb trace. But Firefox *was* in the state where it was repeatedly throwing the "window unexpectedly destroyed" messages and it was being generated usually from "gmail" which probably means an improperly handled Javascript window timeout (or destruction) problem.
Alexander Sack (asac) wrote : Re: [Bug 226470] Re: untitled popup window | #161 |
On Mon, Jun 02, 2008 at 08:57:26PM -0000, Arthur wrote:
> I'm pretty sure it's triggered by certain adds which unfortunately
> change each time you load the page. I'll try to see what triggers it,
> but that can take its time as they are rather infrequent. By the way
> I've only ever seen this with the Ubuntu FF3 Beta 5, never with the
> current mozilla.org nightlies.
>
I assume those are Flash ads. Try to remove the Adobe Flash player
and install mozilla-
affects ubuntu/firefox-3.0
status invalid
affects ubuntu/
status incomplete
- Alexander
Changed in firefox-3.0: | |
status: | Incomplete → Invalid |
Been having the same issue since starting on Ubuntu 7.04 and Firefox 2.0
Apparently it's caused by a number of things and it's tied to Gnome
Read more:
https:/
In Mozilla Bugzilla #263160, Cameron McCormack (cam-mcc) wrote : | #163 |
I used to get this kind of behaviour before Firefox 3, where occasionally I'd get a tab's frame open in a separate top-level window (with no window title, and strangely isn't focussable -- using Sawfish as my window manager). Since Firefox 3 I don't get this as much, but I have noticed it happening with Flash sometimes. I have the FlashBlock extension running. Sometimes the new top-level window has the Flash object running in it (despite the fact that the replacement graphic in the main page's window is showing), and sometimes it is an empty, grey window. Sorry I haven't got any more useful information to provide.
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #164 |
Maybe, just maybe, I have located at least one source of this problem. People plagued by this over the years may want to look at Bug #467744.
But what I am seeing in that bug is consistent with this bug. It depends entirely on *when* the thread destroying the parent window marks it as "destroyed". The gdk/gtk libraries seem to have this interesting feature that the windows don't immediately disappear when they are "destroyed" but are simply marked as such for a period of time. Of course if one is able to create a new window as a "child" of a window in the process of being destroyed then one is likely to end up with the "orphan" windows we see with this bug.
The asynchronous (multi-threaded) aspect of window creation and destruction is why this bug was/is so sensitive to the machine CPU/memory (swapping) usage and so difficult to get a handle on.
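The "marked as destroyed" situation described above can be pictured with a toy model. `ToyWindow` and `create_child` are invented for illustration and are not the gdk/gtk API; the sketch only shows one way a child-creation request could be refused once the parent carries the destroyed mark, so that no orphan is produced.

```python
class ToyWindow:
    """Toy stand-in for a toolkit window; not the GDK API."""

    def __init__(self, parent=None):
        self.parent = parent
        self.destroyed = False
        self.children = []

    def destroy(self):
        # Mark first, as the comment describes; real teardown would follow later.
        self.destroyed = True
        for child in self.children:
            child.destroy()


def create_child(parent):
    """Create a child window unless the parent is already marked destroyed."""
    if parent.destroyed:
        return None  # refusing here avoids an orphan top-level window
    child = ToyWindow(parent)
    parent.children.append(child)
    return child
```

In this model a request that arrives after `destroy()` gets `None` back instead of a window with no living parent.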
In Mozilla Bugzilla #263160, L. David Baron (dbaron) wrote : | #165 |
(In reply to comment #152)
> The asynchronous (multi-threaded) aspect of window creation and destruction is
> why this bug was/is so sensitive to the machine CPU/memory (swapping) usage
> and so difficult to get a handle on.
As I said in comment 78, all of Mozilla's interaction with GTK/Gdk/X11 is on a single thread.
In Mozilla Bugzilla #263160, Erikred (erikred) wrote : | #166 |
David Baron,
But what if one has multiple firefoxes all pounding on GTK/Gdk/X11 at the same time? Could that be part of the problem? I just got an Untitled window again, this time in Fedora 10 64b with 5 instances of firefox 3.0.4 running and maybe 500 tabs altogether.
On a possibly related note, in fedora 10 my X11 process has been going wild using up 11-12G of main memory, and there appears to be a correlation with whether the browsers are started sequentially/
In Mozilla Bugzilla #263160, Robert-bradbury (robert-bradbury) wrote : | #167 |
Erik/David, related to your comments regarding the problem, see my comments on Bug #467744 # 6.
Regarding David's claims that the GDK access is single threaded: I want the function names that ensure this. I have been reading C since 1974, and I have actually met both Dennis Ritchie and Ken Thompson at various points. You can claim crap but this is a "trust but verify" world. One of the shortcomings, IMO, of the mozilla perspective is that they do *NOT* have a process for bringing one "up-to-speed".
Getting back to Erik's points, there is a question of whether or not asynchronous processes (threads) get to address the display manager (GDK). Given his many valid points about when and how the display manager may be addressed, there is the issue of how one is managing that. (Note I do not see messages between the Firefox developers and the GDK/GTK developers revealing that they might understand the capabilities and limits of their software systems, which, when you are attempting to operate on a deleted window, you clearly do not understand.)
In Mozilla Bugzilla #263160, Karlt (karlt) wrote : | #168 |
I'd expect this to result from http://
Changes to gdk_window_new before gtk+-2.14.0 would have meant that the crash of bug 467744 resulted instead:
http://
But gtk+-2.15.1 and newer will probably start showing these symptoms again as the crash of bug 467744 is patched up here:
http://
In Mozilla Bugzilla #263160, Ovemen (ovemen) wrote : | #169 |
It used to happen to me every time on RH9 and FC6, using Firefox 1.5 (and I believe also 2.0), after the browser was used for a day or so.
Now, with FC6, FF 3.0.10 it doesn't happen very often, but it happens, especially recently.
** (evince:16737): WARNING **: Unimplemented named action: POPPLER_DEST_FITBH, please post a bug report in Evince bugzilla (http://
** (evince:16737): WARNING **: Unimplemented named action: POPPLER_DEST_FITBH, please post a bug report in Evince bugzilla (http://
** (evince:16737): WARNING **: Unimplemented named action: POPPLER_DEST_FITBH, please post a bug report in Evince bugzilla (http://
** (evince:16737): WARNING **: Unimplemented named action: POPPLER_DEST_FITBH, please post a bug report in Evince bugzilla (http://
(Gecko:28775): Gtk-CRITICAL **: gtk_drag_
(Gecko:28775): Gdk-WARNING **: GdkWindow 0x28d6e54 unexpectedly destroyed
(Gecko:28775): Gdk-WARNING **: GdkWindow 0x28d6e47 unexpectedly destroyed
(Gecko:28775): Gdk-WARNING **: GdkWindow 0x28d664f unexpectedly destroyed
(Gecko:28775): Gdk-CRITICAL **: gdk_window_
(Gecko:28775): Gdk-CRITICAL **: gdk_window_
(Gecko:28775): GLib-GObject-
(Gecko:28775): GLib-GObject-
(Gecko:28775): Gdk-CRITICAL **: gdk_x11_
(Gecko:28775): Gdk-CRITICAL **: gdk_window_hide: assertion `GDK_IS_WINDOW (window)' failed
(Gecko:28775): Gdk-CRITICAL **: gdk_window_
(Gecko:28775): Gdk-CRITICAL **: _gdk_window_
(Gecko:28775): GLib-GObject-
(Gecko:28775): Gdk-WARNING **: GdkWindow 0x28d6650 unexpectedly destroyed
(Gecko:28775): Gdk-CRITICAL **: gdk_window_
(Gecko:28775): Gdk-CRITICAL **: gdk_window_
(Gecko:28775): GLib-GObject-
(Gecko:28775): GLib-GObject-
(Gecko:28775): Gdk-CRITICAL **: gdk_x11_
(Gecko:28775): Gdk-CRITICAL **: gdk_window_hide: assertion `GDK_IS_WINDOW (window)' failed
(Gecko:28775): Gdk-CRITICAL **: gdk_window_
(Gecko:28775): Gdk-CRITICAL **: _gdk_window_
(Gecko:28775): GLib-GObject-
(Gecko:28775): Gdk-WARNING **: GdkWindow 0x28d6653 unexpectedly destroyed
(Gecko:28775): Gdk-CRITICAL **: gdk_window_
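Console output like the above can be summarised with a short script. This is only a sketch for triage. The regular expression is inferred from the lines quoted in this comment, `summarize` is a hypothetical helper name, and truncated or unrelated lines are simply skipped.

```python
import re
from collections import Counter

# Matches lines such as:
# (Gecko:28775): Gdk-WARNING **: GdkWindow 0x28d6e54 unexpectedly destroyed
LINE = re.compile(
    r"^\((?P<prog>[^:]+):(?P<pid>\d+)\): "
    r"(?P<domain>[\w-]+?)-(?P<level>WARNING|CRITICAL) \*\*: (?P<msg>.*)$"
)
DESTROYED = re.compile(r"GdkWindow (0x[0-9a-f]+) unexpectedly destroyed")


def summarize(log_text):
    """Count messages per (domain, level) and collect destroyed window ids."""
    counts = Counter()
    destroyed = []
    for line in log_text.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue  # truncated or unrelated lines are ignored
        counts[(m["domain"], m["level"])] += 1
        hit = DESTROYED.search(m["msg"])
        if hit:
            destroyed.append(hit.group(1))
    return counts, destroyed
```

Run over a saved console log, this gives a quick count of how many windows were reported destroyed and which assertion domains followed.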
In Mozilla Bugzilla #263160, Jruderman (jruderman) wrote : | #170 |
*** Bug 395999 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #263160, Jruderman (jruderman) wrote : | #171 |
*** Bug 410325 has been marked as a duplicate of this bug. ***
In Mozilla Bugzilla #263160, Francewhoa (francewhoa) wrote : | #172 |
> sometimes it is an empty, grey window
Same here, with Ubuntu 12.04 LTS Unity 2d
Managed to grab a couple of screenshots. Note the ad banner image (which is an
iframe) that has turned up in the wrong place, its own window. Also, the
blue-ish box in the middle of the page is my desktop background image hanging
around from a workspace change:
http://www.mozilla.se/bugs/firefox-bug-1.png
Right after that, reloading the page made it even worse. Now both the iframe of
the banner and the frame of the whole tab got ripped out. Note how the window
covers the toolbar of the "real" firefox window, and the "Screen Shot" window
showing through the firefox window is actually the first screenshot, which again
is on a different workspace. That area isn't updated at all - dragging a window
across it creates a "trail".
http://www.mozilla.se/bugs/firefox-bug-2.png