
Extracting Raw XML via lxml etree

I'm trying to extract raw XML from an XML file. So if my data is: ... Lots of XML ...

Solution 1:

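You should be able to use etree.tostring() to serialize each matching element back to XML. It returns bytes by default, so decode the result if you need a str.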

Example...

from lxml import etree

xml = """
<xml><getThese><clonedKey>1</clonedKey><clonedKey>2</clonedKey><clonedKey>3</clonedKey><randomStuff>this is a sentence</randomStuff></getThese><getThese><clonedKey>6</clonedKey><clonedKey>8</clonedKey><clonedKey>3</clonedKey><randomStuff>more words</randomStuff></getThese></xml>
"""

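# Discard whitespace-only text nodes between elements while parsing
parser = etree.XMLParser(remove_blank_text=True)
tree = etree.fromstring(xml, parser=parser)

# tostring() returns bytes, so decode() each serialized element into a str
elems = []
for elem in tree.xpath("getThese"):
    elems.append(etree.tostring(elem).decode())

print(elems)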

Printed output...

['<getThese><clonedKey>1</clonedKey><clonedKey>2</clonedKey><clonedKey>3</clonedKey><randomStuff>this is a sentence</randomStuff></getThese>', '<getThese><clonedKey>6</clonedKey><clonedKey>8</clonedKey><clonedKey>3</clonedKey><randomStuff>more words</randomStuff></getThese>']
