libtidy creates invalid tags
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tidy (Ubuntu) | New | Undecided | Unassigned |
Bug Description
Binary package hint: tidy
The following HTML causes libtidy to produce invalid tags:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://
<html><body>
<p><b><i><a href="A"> <big>B<
</body></html>
The result of running this through the tidy command line is:
$ tidy -w0 t.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://
<html xmlns="http://
<head>
<meta name="generator" content="HTML Tidy for Linux (vers 25 March 2009), see www.w3.org" />
<title></title>
</head>
<body>
<p><b><i><a href="A"
</body>
</html>
The invalid tags change on each run, which looks like a use-after-free problem.
This is running on Ubuntu 10.04 LTS, updated recently.
Linux 2.6.32-22-generic #35-Ubuntu SMP Tue Jun 1 14:18:25 UTC 2010 x86_64 GNU/Linux
/etc/tidy.conf contains only comments, and there are no other tidy configuration files.
Using LD_PRELOAD I tested the libtidy from earlier distributions;
only the version from libtidy-
This is definitely a use-after-free: I confirmed it by commenting out the free() call in alloc.c.
After this change the previously invalid tags show up as <big> and </big>.
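The reasoning behind that test can be shown in isolation. This is only a minimal sketch and not tidy code: diag_free, diag_keep_freed, dup_name and stale_name_matches are hypothetical names made up for illustration, and the assumption is simply that some second pointer still refers to a tag name after its owner has been released. If the allocator's free is turned into a no-op, the stale pointer keeps reading the original bytes, which would explain why the output becomes stable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical diagnostic switch: when nonzero, diag_free() leaks the block
   instead of releasing it, mimicking a commented-out free() call. */
static int diag_keep_freed = 1;

/* Hypothetical stand-in for an allocator's free wrapper (not tidy's alloc.c). */
static void diag_free(void *p)
{
    if (diag_keep_freed)
        return;            /* deliberately leak: stale pointers stay readable */
    free(p);
}

/* Copy a tag name into heap storage. Returns NULL on allocation failure. */
static char *dup_name(const char *s)
{
    assert(s != NULL);
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Simulate the suspected pattern: keep an alias to a name, release the owner,
   then read through the alias. Returns 1 if the alias still reads as
   `expected`. With a real free() that read would be undefined behaviour,
   so the function refuses to perform it unless the diagnostic is enabled. */
static int stale_name_matches(const char *expected)
{
    char *owner = dup_name(expected);
    if (owner == NULL)
        return 0;
    const char *alias = owner;   /* second reference, not a copy */
    diag_free(owner);            /* owner is "released" here */
    if (!diag_keep_freed)
        return 0;                /* never touch genuinely freed memory */
    return strcmp(alias, expected) == 0;
}
```

With diag_keep_freed left at 1, stale_name_matches("big") returns 1, since the stale alias still sees the original name. The diagnostic leaks every block it is asked to free, so it is only suitable for narrowing down a bug, not as a fix.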
The code that causes this was introduced in parse.c revision 1.178, titled "inline propagation".
A workaround is to revert that patch, but simply commenting out the call
to InlineDup1 on line 1535 is sufficient.
parse.c was at revision 1.188 at the time of testing.
The code in InlineDup1 (istack.c) was introduced by the same patch.