Python: Update Xml-file Using Elementtree While Conserving Layout As Much As Possible
from xml.etree import ElementTree as ET
import re


class ElementTreeHelper:
    def __init__(self, xml_file_name):
        # Binary mode lets ElementTree honor the encoding declared in the file.
        with open(xml_file_name, "rb") as xml_file:
            self.__parse_xml_declaration(xml_file)
            self.element_tree = ET.parse(xml_file)
            xml_file.seek(0)
            root_tag_namespace = self.__root_tag_namespace(self.element_tree)
            self.namespace = None
            if root_tag_namespace is not None:
                self.namespace = '{' + root_tag_namespace + '}'
                # Register the root tag namespace as having an empty prefix, so
                # that it is written back as xmlns="..." instead of ns0, then
                # re-parse xml_file.
                ET.register_namespace('', root_tag_namespace)
                self.element_tree = ET.parse(xml_file)

    def find(self, xpath_query):
        return self.element_tree.find(xpath_query)

    def write(self, xml_file_name):
        with open(xml_file_name, "wb") as xml_file:
            if self.xml_declaration_line is not None:
                # The declaration was read as bytes, so write it back as bytes.
                xml_file.write(self.xml_declaration_line + b'\n')
            return self.element_tree.write(xml_file)

    def __parse_xml_declaration(self, xml_file):
        # Remember the first line if it is an XML declaration, then rewind.
        first_line = xml_file.readline().strip()
        if first_line.startswith(b'<?xml') and first_line.endswith(b'?>'):
            self.xml_declaration_line = first_line
        else:
            self.xml_declaration_line = None
        xml_file.seek(0)

    def __root_tag_namespace(self, element_tree):
        # ElementTree stores qualified tags as "{namespace}localname".
        namespace_search = re.search(r'^\{(\S+)\}', element_tree.getroot().tag)
        if namespace_search is not None:
            return namespace_search.group(1)
        else:
            return None


def __main():
    el_tree_hlp = ElementTreeHelper('houses.xml')
    dogs_tag = el_tree_hlp.element_tree.getroot().find(
        '{ns}house/{ns}dogs'.format(
            ns=el_tree_hlp.namespace or ''))
    one_dog_added = int(dogs_tag.text.strip()) + 1
    dogs_tag.text = str(one_dog_added)
    el_tree_hlp.write('hejsan.xml')


if __name__ == '__main__':
    __main()
The output:
<?xml version="1.0"?>
<group xmlns="http://dogs.house.local">
    <house>
        <id>2821</id>
        <dogs>3</dogs>
    </house>
</group>
If someone has an improvement to this solution, please don't hesitate to grab the code and improve it.
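For anyone who wants to see what the register_namespace call changes, here is a minimal sketch that serializes the same tree before and after registering the namespace with an empty prefix. The sample document and the serialize helper are made up for illustration and are not part of the helper class. Note that in this sketch the namespace is registered after parsing and still takes effect on write, so with the standard-library ElementTree the re-parse in the helper may not be strictly necessary.

```python
import io
from xml.etree import ElementTree as ET

# Hypothetical sample input, modelled on the houses.xml used above.
NS = "http://dogs.house.local"
SOURCE = b'<group xmlns="http://dogs.house.local"><house><dogs>2</dogs></house></group>'


def serialize(tree):
    """Write the tree into an in-memory buffer and return the bytes."""
    buf = io.BytesIO()
    tree.write(buf)
    return buf.getvalue()


tree = ET.ElementTree(ET.fromstring(SOURCE))

# Without registration ElementTree invents a prefix for the namespace.
before = serialize(tree)

# register_namespace is global: it affects every later serialization.
ET.register_namespace('', NS)
after = serialize(tree)

print(before)
print(after)
```

Keep in mind that the registry is global to the process, so registering an empty prefix for one namespace affects all documents written afterwards.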
Solution 2:
Round-tripping, unfortunately, isn't a trivial problem. With XML, it's generally not possible to preserve the original document unless you use a special parser (like DecentXML but that's for Java).
Depending on your needs, you have the following options:
If you control the source and you can secure your code with unit tests, you can write your own simple parser. This parser doesn't accept full XML but only a limited subset. You can, for example, read the whole document as a string and then use Python's string operations to locate <dogs> and replace anything up to the next <. Hack? Yes.

You can filter the output. XML allows the string <ns0: only in one place, so you can search and replace it with <, and then do the same with <group xmlns:ns0=" → <group xmlns=". This is pretty safe unless you can have CDATA in your XML.

You can write your own simple XML parser. Read the input as a string and then create Elements for each pair of <> plus their positions in the input. That allows you to take the input apart quickly but only works for small inputs.
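A rough sketch of the first two options, under the assumption that the document contains exactly one <dogs> element and that ElementTree picked the prefix ns0. The function names replace_dogs_text and strip_ns0 are invented for this example. The closing-tag replacement (</ns0:) is an addition that the description above does not spell out, but closing tags carry the prefix too.

```python
def replace_dogs_text(xml_text, new_value):
    """Replace everything between '<dogs>' and the next '<', leaving the rest untouched."""
    start = xml_text.index('<dogs>') + len('<dogs>')
    end = xml_text.index('<', start)
    return xml_text[:start] + new_value + xml_text[end:]


def strip_ns0(xml_text):
    """Turn a generated ns0 prefix back into a default namespace declaration."""
    xml_text = xml_text.replace('xmlns:ns0="', 'xmlns="')
    # Opening tags first, then closing tags (assumption: both carry the prefix).
    xml_text = xml_text.replace('<ns0:', '<')
    xml_text = xml_text.replace('</ns0:', '</')
    return xml_text


original = (
    '<?xml version="1.0"?>\n'
    '<group xmlns="http://dogs.house.local">\n'
    '    <house>\n'
    '        <dogs>2</dogs>\n'
    '    </house>\n'
    '</group>\n'
)
print(replace_dogs_text(original, '3'))
print(strip_ns0('<ns0:group xmlns:ns0="http://dogs.house.local"><ns0:dogs>2</ns0:dogs></ns0:group>'))
```

The first function never parses the document, so whitespace, comments and the declaration survive as they are. That is also why it only works while the input stays inside the limited subset you tested.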
Solution 3:
When saving the XML, passing the default_namespace argument to write() is an easy way to avoid the ns0 prefix. In my code, the key line is:
xmltree.write(xmlfiile, "utf-8", default_namespace=xmlnamespace)
The full snippet:
import os
from xml.etree import ElementTree as ET

# xmlfiile holds the path to the XML file to update.
if os.path.isfile(xmlfiile):
    xmltree = ET.parse(xmlfiile)
    root = xmltree.getroot()
    xmlnamespace = root.tag.split('{')[1].split('}')[0]  # get namespace
    initwin = xmltree.find("./{" + xmlnamespace + "}test")
    initwin.find("./{" + xmlnamespace + "}content").text = "aaa"
    xmltree.write(xmlfiile, "utf-8", default_namespace=xmlnamespace)
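To try this without touching a file on disk, here is a small sketch that writes into an in-memory buffer instead. The sample document and the serialize helper are assumptions made for illustration. One caveat: ElementTree raises a ValueError when default_namespace is used and some element in the tree has no namespace at all, so this only fits documents where every element is namespace-qualified.

```python
import io
from xml.etree import ElementTree as ET

# Hypothetical sample input with a single default namespace.
SOURCE = b'<group xmlns="http://dogs.house.local"><house><dogs>2</dogs></house></group>'


def serialize(tree, **write_options):
    """Write the tree as UTF-8 into a buffer, passing extra options to write()."""
    buf = io.BytesIO()
    tree.write(buf, "utf-8", **write_options)
    return buf.getvalue()


tree = ET.ElementTree(ET.fromstring(SOURCE))

# Same extraction as in the snippet above: "{namespace}tag" -> "namespace".
namespace = tree.getroot().tag.split('{')[1].split('}')[0]

# Update a value using the fully qualified tag name.
tree.find("./{" + namespace + "}house/{" + namespace + "}dogs").text = "3"

without_prefix = serialize(tree, default_namespace=namespace)
print(without_prefix)
```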
Solution 4:
etree from lxml provides this feature.
elementTree.write('houses2.xml', encoding="UTF-8", xml_declaration=True)
makes sure the XML declaration is not omitted. When writing to the file, lxml does not change the namespaces.
http://lxml.de/parsing.html is the link for its tutorial.
P.S.: lxml has to be installed separately.