empty document causes strange parse error (memory pointer issue?)
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
lxml | Confirmed | Low | Unassigned | | |
Bug Description
b*jwp@torch:clients 0$ python lxmlv.py
Python : sys.version_
lxml.etree : (2, 3, -99, 0)
libxml used : (2, 7, 8)
libxml compiled : (2, 7, 3)
libxslt used : (1, 1, 26)
libxslt compiled : (1, 1, 24)
Python 2.7.1 (r271:86832, Jan 19 2011, 15:23:13)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import lxml.html
>>> lxml.html
<module 'lxml.html' from '/pluto/
>>> lxml.html.
<function fromstring at 0x1005e36e0>
>>> fs=lxml.
>>> fs
<function fromstring at 0x1005e36e0>
>>> fs(b'')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/pluto/
doc = document_
File "/pluto/
value = etree.fromstrin
File "lxml.etree.pyx", line 2740, in lxml.etree.
File "parser.pxi", line 1556, in lxml.etree.
File "parser.pxi", line 1435, in lxml.etree.
File "parser.pxi", line 943, in lxml.etree.
File "parser.pxi", line 547, in lxml.etree.
File "parser.pxi", line 628, in lxml.etree.
File "parser.pxi", line 579, in lxml.etree.
lxml.etree.
>>> fs(b'<x/>')
<Element x at 0x1005ced10>
>>> fs(b'')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/pluto/
doc = document_
File "/pluto/
value = etree.fromstrin
File "lxml.etree.pyx", line 2740, in lxml.etree.
File "parser.pxi", line 1556, in lxml.etree.
File "parser.pxi", line 1435, in lxml.etree.
File "parser.pxi", line 943, in lxml.etree.
File "parser.pxi", line 547, in lxml.etree.
File "parser.pxi", line 628, in lxml.etree.
File "parser.pxi", line 577, in lxml.etree.
lxml.etree.
Is this really a bug? An XML document must contain a root node, and I suppose an HTML document must as well, so an empty string is not a valid XML/HTML document. Did I miss something?
Of course, you can argue that an empty document should cause an informative, useful error, not a strange one.
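Until the error message is improved, a caller can reject empty input before it reaches the parser. The following is only a sketch under stated assumptions: `parse_html_checked` is a hypothetical helper name, not part of lxml, and treating whitespace-only input as empty is a choice made for this example. The `parser` argument exists so the check can be exercised without lxml installed; by default it falls back to `lxml.html.fromstring`, the function used in the report above.

```python
def parse_html_checked(data, parser=None):
    """Parse HTML, but fail early with a clear error on empty input.

    Hypothetical wrapper for illustration; not part of lxml.
    """
    if parser is None:
        # Imported lazily so the emptiness check works without lxml.
        import lxml.html
        parser = lxml.html.fromstring

    # Assumption: None, b'', '' and whitespace-only input all count as
    # "empty", because none of them can contain a root element.
    if data is None or not data.strip():
        raise ValueError("cannot parse an empty document: no root element")

    return parser(data)
```

With lxml available, `parse_html_checked(b'<x/>')` hands the input to `lxml.html.fromstring` unchanged, while `parse_html_checked(b'')` raises a `ValueError` with a stable message instead of depending on whatever the parser reports for empty input.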