element find removes for subsequent calls

Bug #2003038 reported by KeithSloan
Affects | Status | Importance | Assigned to | Milestone
lxml    | New    | Undecided  | Unassigned  |

Bug Description

% python3 lxmlReport.py
Python : sys.version_info(major=3, minor=10, micro=9, releaselevel='final', serial=0)
lxml.etree : (4, 9, 2, 0)
libxml used : (2, 9, 14)
libxml compiled : (2, 9, 14)
libxslt used : (1, 1, 35)
libxslt compiled : (1, 1, 35)

I have a function

    def getSolid(self, sname) :
        print(f"getSolid : {self.solids} {len(self.solids)} {sname}")
        #self.printElement(self.solids)
        #return self.solids.find(f"*[@name='{sname}']")
        ret = self.solids.find(f"*[@name='{sname}']")
        if ret is not None:
            self.printElement(ret)
        print(ret)
        return ret

I call it once and it outputs

getSolid : <Element solids at 0x10589eb00> 163 VTBox2
b'<box xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" aunit="radian" lunit="mm" name="VTBox2" x="300" y="6" z="990"></box>\n '
<Element box at 0x1058b2e40>
solidDict {'VTBox2': <Element box at 0x1058b2e40>}

I call it from a different place with the same arguments and get

getSolid : <Element solids at 0x10589eb00> 162 VTBox2
None
Solid : VTBox2 Not Found

Note that the length of the solids element on the second call is one less, i.e. 162 instead of 163.
It looks like the find call is removing the element.

scoder (scoder) wrote :

Thanks for the report, but there shouldn't be anything in .find() that would change the document.

Is this repeatable? I.e., if you call it multiple times, does an element disappear each time?

Could you remove the "printElement()" call, and also any further code on the caller side, to make sure it's lxml that does the tree modification and not something on your side? A short, complete reproducer would help.
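
As a starting point for such a reproducer, here is a minimal sketch that checks whether repeated .find() calls alone change the number of children. The two-element `solids` tree is a made-up stand-in for the real GDML data, so this only illustrates the check and does not reproduce the reporter's setup.

```python
from lxml import etree

# Hypothetical stand-in for the real <solids> container from the report.
solids = etree.fromstring(
    "<solids>"
    '<box name="VTBox2" x="300" y="6" z="990"/>'
    '<box name="Other" x="1" y="1" z="1"/>'
    "</solids>"
)

before = len(solids)

# Look up the same element twice, with nothing else touching the tree.
first = solids.find("*[@name='VTBox2']")
second = solids.find("*[@name='VTBox2']")

after = len(solids)

print(before, after)                                # 2 2
print(first is not None, second is not None)        # True True
```

If the child count only drops once other calls (for example an append of the found element) are added back in, the tree modification is coming from those calls rather than from .find().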

KeithSloan (keilh) wrote :

Here are some comments from a fellow developer, Munther Hindi.

"It looks like the find returns a reference to the element and the append moves the element from one container to the other. So appending back to the self.materials probably moves the element back to self.materials and leaves the local materials list empty. I did not check (watching my usual football now). So after the find one needs to make a copy of the found element and then append that."

"So I am guessing that internally the append operation is reassigning the parent of the element, rather than creating a new element in that parent. So probably need to try and make a deep copy of the element before the append. Will try later, but if you get to it before I do, please let me know."

At the top of the file:

    import copy

Then:

    def processElement(self, matxml, elem):
        print(f"Process Element : {elem}")
        print(f"len self.materials: {len(self.materials)}")
        elemXml = self.materials.find(f"*[@name='{elem}']")
        if elemXml is not None:
            newelemXml = copy.deepcopy(elemXml) # <- make a deep copy of the found element
            print(f"Element : {elemXml.get('name')}")
            matxml.append(newelemXml) # <- append the copy
            self.printMaterials()

    def processMaterial(self, mat):
        print(f"Process Material : {mat}")
        newMat = etree.Element("material")
        matXml = self.materials.find(f"*[@name='{mat}']")
        print(f"matXml {matXml}")

        if matXml is not None:
            # Munther please review
            # newMat.insert(0, matXml)
            newmatXml = copy.deepcopy(matXml) # <- make a deep copy of the found element
            newMat.append(newmatXml) # <- append the copy
            for fractXml in matXml.findall("fraction"):
                ref = fractXml.get("ref")
                print(f"Fraction ref {ref}")
                self.processElement(newMat, ref)
            # Composites are handled in their own loop, not once per fraction
            for compXml in matXml.findall("composite"):
                ref = compXml.get("ref")
                print(f"Composite ref {ref}")
                self.processElement(newMat, ref)
        return newMat

Munther

Code is at https://github.com/KeithSloan/CERN-Alice-MultiFile-gdml

The test script is testVelo1.

If one comments out the matxml.append calls, the run finds the materials; with the appends in place, the finds fail.
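
The effect described above can be shown in isolation. The following is a small sketch on a made-up `materials` tree (the element names are illustrative only, not taken from the GDML files): in lxml an element has a single parent, so appending a found element to another parent moves it out of the original tree, while appending a deep copy leaves the original in place.

```python
import copy
from lxml import etree

# Hypothetical stand-in for self.materials.
materials = etree.fromstring(
    '<materials><element name="Iron"/><element name="Carbon"/></materials>'
)
new_mat = etree.Element("material")

# Appending the found element itself moves it into new_mat.
iron = materials.find("*[@name='Iron']")
new_mat.append(iron)
moved_len = len(materials)
lookup_after_move = materials.find("*[@name='Iron']")

# Appending a deep copy leaves the original where it was.
carbon = materials.find("*[@name='Carbon']")
new_mat.append(copy.deepcopy(carbon))
copied_len = len(materials)
lookup_after_copy = materials.find("*[@name='Carbon']")

print(moved_len, lookup_after_move)            # 1 None
print(copied_len, lookup_after_copy is None)   # 1 False
```

The second find for "Iron" returns None because the element now lives under `new_mat`, which matches the "first one okay, second gets None" pattern in the output above.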

KeithSloan (keilh) wrote :

That is, only some of the finds fail, i.e. a find of the same solid for a different volume.
The first one seems okay; the second gets None.
