Emacs crashes in GC due to VM design
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
VM | Triaged | High | Uday Reddy | | |
Bug Description
Pieter van Oostrum found the development version of Emacs crashing in GC, and reported it:
https:/
Pip Cet diagnosed the problem as follows:
That indeed looks like a stack overflow.
Here's some speculation about what I think is happening:
We're seeing deep recursion in the garbage collector. If you look at
the tag bits of the objects marked by mark_object, you'll notice the
sequence is
symbol - cons - vectorlike - vectorlike - symbol - cons - vectorlike -
vectorlike - ...
That means there are thousands of symbols referring to values which
again contain symbols, and so on.
...
Essentially, that code is building a singly-linked list of message
vectors, but the links go via symbols rather than directly to the next
message. The garbage collector isn't written for that case, and
recurses rather than iterating, causing the stack overflow.
The first attachment to this message is an Elisp file which does the
same thing, by creating thousands of symbols. On GNU/Linux, with
fairly default standard stack size settings, I get a segfault after
some 85,000 symbols have been created.
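The failure mode described above can be modeled outside Emacs. The Python sketch below is not the attached Elisp file and not the real mark_object; the Obj class and all function names are hypothetical, chosen only for illustration. It builds a chain in the order symbol, cons, vectorlike, vectorlike, next symbol, and compares a marker that recurses into each reference with one that keeps an explicit worklist. Python raises RecursionError at the point where a C program would run out of stack.

```python
import sys


class Obj:
    """A toy heap object: a kind tag, outgoing references, and a mark bit."""

    def __init__(self, kind):
        self.kind = kind
        self.refs = []
        self.marked = False


def build_chain(n):
    """Build n links of symbol -> cons -> vectorlike -> vectorlike -> next symbol."""
    head = None
    prev_tail = None
    for _ in range(n):
        sym = Obj("symbol")
        cons = Obj("cons")
        vec_a = Obj("vectorlike")
        vec_b = Obj("vectorlike")
        sym.refs.append(cons)
        cons.refs.append(vec_a)
        vec_a.refs.append(vec_b)
        if prev_tail is None:
            head = sym
        else:
            # The link to the next message goes through a symbol.
            prev_tail.refs.append(sym)
        prev_tail = vec_b
    return head


def mark_recursive(obj):
    """Mark by recursing into every reference; depth grows with chain length."""
    if obj.marked:
        return
    obj.marked = True
    for child in obj.refs:
        mark_recursive(child)


def mark_iterative(root):
    """Mark with an explicit worklist; call depth stays constant."""
    stack = [root]
    count = 0
    while stack:
        obj = stack.pop()
        if obj.marked:
            continue
        obj.marked = True
        count += 1
        stack.extend(obj.refs)
    return count


def recursive_mark_overflows(n):
    """Return True if recursive marking of an n-link chain exhausts the stack."""
    try:
        mark_recursive(build_chain(n))
    except RecursionError:
        return True
    return False


if __name__ == "__main__":
    links = sys.getrecursionlimit()  # 4 objects per link, so depth exceeds the limit
    print("recursive marker overflows:", recursive_mark_overflows(links))
    print("iterative marker visits:", mark_iterative(build_chain(links)))
```

The iterative version trades call-stack depth for heap memory in the worklist, which is why it survives chains that defeat the recursive one.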
Eli Zaretskii responded:
Of course, given enough recursive data structures we can always crash
the current GC the way it is implemented. But the question is how
many such recursive symbols are there in Pieter's sessions? are they
anywhere near the 1000000000 mark you used in your test program? IOW,
I think we need to know how close we are in real-life sessions to the
dangerous mark.
Maybe this is also worth reporting to VM developers. They might
consider changing their implementation to avoid these problems.
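One way to approach the question of how close a real session comes to the limit is to measure the nesting depth a recursive marker would reach on a given object graph, without actually recursing. The Python sketch below is only a model under stated assumptions: the graph is a plain dict mapping each node to the nodes it references, max_mark_depth is a hypothetical helper, and nothing here accounts for cases the real collector may handle specially.

```python
def max_mark_depth(graph, root):
    """Return the deepest nesting a recursive depth-first marker would reach.

    graph maps each node to an iterable of the nodes it references.
    The traversal is simulated with an explicit stack of iterators, so the
    measurement itself cannot overflow the call stack.
    """
    marked = {root}
    stack = [iter(graph.get(root, ()))]
    deepest = 1
    while stack:
        for child in stack[-1]:
            if child not in marked:
                marked.add(child)
                # Entering an unmarked child is one more nested call.
                stack.append(iter(graph.get(child, ())))
                deepest = max(deepest, len(stack))
                break
        else:
            # All references of the top node are handled: the call returns.
            stack.pop()
    return deepest


if __name__ == "__main__":
    # A chain of 100,000 links, each node referring only to the next one.
    chain = {i: [i + 1] for i in range(100_000)}
    print("depth on a long chain:", max_mark_depth(chain, 0))
    # A wide but shallow graph: many references, little nesting.
    wide = {"root": list(range(100_000))}
    print("depth on a wide graph:", max_mark_depth(wide, "root"))
```

The contrast between the two graphs is the point: the number of objects alone does not determine the risk, the length of the longest chain the marker descends does.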