Element's serialised namespace can be different to in-memory ns after being inserted into an el
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
lxml | Confirmed | Medium | Unassigned | |
Bug Description
I've come across a situation where the namespace of an Element is correct when using say etree.QName(
There's an attached script which reproduces the issue, but here's a brief description of how this situation occurs:
You have two XML trees:
<a xmlns="x:/foo"/>
<f:c xmlns="x:/bar" xmlns:f="x:/foo"/>
Note that {x:/foo}a and {x:/foo}c are in the same namespace, but c has a different default ns to a.
When c is inserted as a child of a, it loses its "f": "x:/foo" entry in the nsmap but retains the None: "x:/bar" entry (as expected). When serialised, the c element is therefore written without a prefix but with xmlns="x:/bar", so it is effectively moved into the x:/bar ns. In memory, however, c.tag still reports "{x:/foo}c" as expected.
Interactive examples:
$ python
Python 3.4.2 (default, Oct 19 2014, 17:52:17)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.51)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml import etree
>>> a = etree.XML('<a xmlns="x:/foo"/>')
>>> c = etree.XML('<f:c xmlns="x:/bar" xmlns:f=
>>> print(etree.
<a xmlns="x:/foo"/>
>>> print(etree.
<f:c xmlns="x:/bar" xmlns:f="x:/foo"/>
>>> c.nsmap
{'f': 'x:/foo', None: 'x:/bar'}
>>> a.insert(0, c)
>>> c.nsmap
{None: 'x:/bar'}
>>> c.tag
'{x:/foo}c'
>>> print(etree.
<a xmlns="x:/foo"><c xmlns="
Note that this also happens in the same way if a and c start in the same document and c is inserted into a again:
>>> a = etree.XML('<a xmlns="x:/foo"><f:c xmlns="x:/bar" xmlns:f=
>>> print(etree.
<a xmlns="x:/foo"><f:c xmlns="x:/bar" xmlns:f=
>>> c = list(a)[0]
>>> a.insert(0, c)
>>> print(etree.
<a xmlns="x:/foo"><c xmlns="
Strangely, if you insert c twice, it somewhat fixes itself by giving x:/foo a default prefix in the nsmap:
>>> a = etree.XML('<a xmlns="x:/foo"><f:c xmlns="x:/bar" xmlns:f=
>>> print(etree.
<a xmlns="x:/foo"><f:c xmlns="x:/bar" xmlns:f=
>>> c = list(a)[0]
>>> a.insert(0, c)
>>> print(etree.
<a xmlns="x:/foo"><c xmlns="
>>> a.insert(0, c)
>>> print(etree.
<a xmlns="
My versions:
Python : sys.version_
lxml.etree : (3, 4, 2, 0)
libxml used : (2, 9, 0)
libxml compiled : (2, 9, 0)
libxslt used : (1, 1, 28)
libxslt compiled : (1, 1, 28)
The bug stems from an nsmap property that is not cleaned up:
>>> x = etree.fromstring('<div xmlns="foo"/>')
>>> x.tag = 'div'
>>> assert etree.fromstring(etree.tostring(x)).tag == 'div'
Traceback (most recent call last):
File "<input>", line 1, in <module>
AssertionError
>>> x.nsmap
{None: 'foo'}
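
The round-trip assertion above can be generalised to a whole tree. The following is only a sketch: the helper name roundtrip_tag_mismatches is hypothetical (not part of lxml), and the fallback import exists only so the snippet runs where lxml is not installed. It serialises a tree, parses it again, and lists every element whose in-memory tag differs from its reparsed tag. Whether the second call reports a mismatch presumably depends on the lxml version in use.

```python
try:
    from lxml import etree  # the library this report is about
except ImportError:
    # Assumption: fall back to the stdlib so the sketch still runs.
    # The behaviour discussed in this report is specific to lxml.
    import xml.etree.ElementTree as etree


def roundtrip_tag_mismatches(root):
    """Return (in_memory_tag, reparsed_tag) pairs that differ after a
    serialise/parse round trip. Hypothetical helper, not an lxml API."""
    reparsed = etree.fromstring(etree.tostring(root))
    # Skip comments and processing instructions, whose .tag is not a string.
    before = [el.tag for el in root.iter() if isinstance(el.tag, str)]
    after = [el.tag for el in reparsed.iter() if isinstance(el.tag, str)]
    return [(b, a) for b, a in zip(before, after) if b != a]


a = etree.fromstring('<a xmlns="x:/foo"/>')
c = etree.fromstring('<f:c xmlns="x:/bar" xmlns:f="x:/foo"/>')

# Freshly parsed element: tags should survive the round trip.
print(roundtrip_tag_mismatches(c))

# After the insert described in this report, an affected lxml build
# would be expected to list c here.
a.insert(0, c)
print(roundtrip_tag_mismatches(a))
```

A check like this could be run before writing a document out, to catch elements whose serialised namespace would no longer match what el.tag reports.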