javascript parsing error
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Beautiful Soup | Fix Released | Undecided | Unassigned |
Python | Fix Released | Unknown | |
beautifulsoup (Debian) | Fix Released | Unknown | |
beautifulsoup (Ubuntu) | Fix Released | Low | Unassigned |
python2.6 (Ubuntu) | Invalid | Low | Unassigned |
python2.7 (Ubuntu) | Fix Released | Low | Unassigned |
Bug Description
>>> p = """
... <HTML>
... <HEAD>
... </HEAD>
... <BODY>
... <script type=text/
... rgvij="></if";
... </script>
... </BODY>
... </html>
... """
>>> soup = BeautifulSoup(p)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'BeautifulSoup' is not defined
>>> from BeautifulSoup import BeautifulSoup
>>> soup = BeautifulSoup(p)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/
'th' : ['tr'],
File "/Users/
"""We need to pop up to the previous tag of this type, unless
File "/Users/
#If we encounter one of the nesting reset triggers
File "/opt/local/
self.goahead(0)
File "/opt/local/
k = self.parse_
File "/opt/local/
self.error("bad end tag: %r" % (rawdata[i:j],))
File "/opt/local/
raise HTMLParseError(
HTMLParser.
>>>
This works correctly in the 3.0.x series.
Changed in beautifulsoup (Debian):
status: Unknown → New
Changed in python:
status: Unknown → New
Changed in beautifulsoup:
status: New → Confirmed
Changed in beautifulsoup (Ubuntu):
importance: Undecided → Low
Changed in beautifulsoup (Debian):
status: New → Confirmed
Changed in beautifulsoup (Debian):
status: Confirmed → Fix Released
Changed in beautifulsoup:
status: Confirmed → Fix Released
affects: python-defaults (Ubuntu) → python2.6 (Ubuntu)
Changed in python2.7 (Ubuntu):
importance: Undecided → Low
status: New → Triaged
Changed in python:
status: New → Fix Released
This also happens when the JavaScript is wrapped in <!-- -->. I think at least this case should be handled, because the contents of HTML comments are easier to ignore than the contents of strings within JavaScript blocks.
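For anyone re-checking this on a current interpreter, here is a minimal sketch that feeds similar markup to the standard library parser directly. It assumes Python 3, where `html.parser.HTMLParser` treats the contents of `<script>` as raw text. The two sample documents are illustrative stand-ins, not the reporter's exact input (which is truncated above), and `ScriptCollector` and `script_text` are hypothetical helper names.

```python
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Record the raw text inside <script> elements and every end tag seen."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.script_chunks = []
        self.end_tags = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        self.end_tags.append(tag)
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Script text may arrive in more than one chunk, so collect all of it.
        if self.in_script:
            self.script_chunks.append(data)


def script_text(markup):
    """Parse markup and return (joined script text, list of end tags)."""
    parser = ScriptCollector()
    parser.feed(markup)
    parser.close()
    return "".join(parser.script_chunks), parser.end_tags


# Illustrative input: a JavaScript string literal that looks like an end tag.
SAMPLE_PLAIN = (
    '<html><body><script type="text/javascript">'
    'var s = "></if";'
    "</script></body></html>"
)

# The same script wrapped in an HTML comment, as in the last comment above.
SAMPLE_COMMENTED = (
    '<html><body><script type="text/javascript"><!--\n'
    'var s = "></if";\n'
    "//--></script></body></html>"
)

for sample in (SAMPLE_PLAIN, SAMPLE_COMMENTED):
    text, end_tags = script_text(sample)
    print(repr(text))
    print(end_tags)
```

If the parser handles the script body as raw text, the `</if` sequence shows up inside the collected script text and never in the list of end tags. A "bad end tag" error at this point would indicate the behaviour reported here.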