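Bear with me, I'm new to Python. I'm trying to copy a few XML nodes from sample1.xml into out.xml if they don't already exist in sample2.xml.
This is how far I got before getting stuck: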
import xml.etree.ElementTree as ET

tree = ET.ElementTree(file='sample1.xml')
addtree = ET.ElementTree(file='sample2.xml')
root = tree.getroot()
addroot = addtree.getroot()
for adel in addroot.findall('.//cars/car'):
    for el in root.findall('cars/car'):
        with open('out.xml', 'w+') as f:
            f.write("BEFORE\n")
            f.write(el.tag)
            f.write("\n")
            f.write(adel.tag)
            f.write("\n")
            f.write("\n")
            f.write("AFTER\n")
            el = adel
            f.write(el.tag)
            f.write("\n")
            f.write(adel.tag)
I don't know what I'm missing, but this only copies the actual "tag" name itself.
It outputs this:
BEFORE
car
car
AFTER
car
car
So I'm missing the child nodes, and also the <, >, </, > markup around the tags. The expected result is shown below.
sample1.xml:
<cars>
<car>
<use-car>0</use-car>
<use-gas>0</use-gas>
<car-name />
<car-key />
<car-location>hawaii</car-location>
<car-port>5</car-port>
</car>
</cars>
sample2.xml:
<cars>
<old>
1
</old>
<new>
8
</new>
<car />
</cars>
Expected result in out.xml (the final product):
<cars>
<old>
1
</old>
<new>
8
</old>
<car>
<use-car>0</use-car>
<use-gas>0</use-gas>
<car-name />
<car-key />
<car-location>hawaii</car-location>
<car-port>5</car-port>
</car>
</cars>
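All the other nodes, old and new, must stay untouched. I just want to replace <car /> with the car node from sample1.xml, including all of its children and grandchildren (if any exist).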
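First, your XML has a couple of trivial problems: the closing cars tag is missing a /, and the closing new tag incorrectly reads old when it should read new.
Second, a disclaimer: my solution below has its limitations. In particular, it does not handle replacing the car node from sample1 into multiple places in sample2. It works fine for the sample files you've supplied, though.
Third: credit is due to the top answers on accessing an ElementTree node's parent node. They informed the implementation of get_node_parent_info below.
Finally, the code: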
import xml.etree.ElementTree as ET


def find_child(node, with_name):
    """Recursively find node with given name"""
    for element in list(node):
        if element.tag == with_name:
            return element
        elif list(element):
            sub_result = find_child(element, with_name)
            if sub_result is not None:
                return sub_result
    return None


def replace_node(from_tree, to_tree, node_name):
    """
    Replace node with given node_name in to_tree with
    the same-named node from the from_tree
    """
    # Find nodes of given name ('car' in the example) in each tree
    from_node = find_child(from_tree.getroot(), node_name)
    to_node = find_child(to_tree.getroot(), node_name)

    # Find where to substitute the from_node into the to_tree
    to_parent, to_index = get_node_parent_info(to_tree, to_node)

    # Replace to_node with from_node
    to_parent.remove(to_node)
    to_parent.insert(to_index, from_node)


def get_node_parent_info(tree, node):
    """
    Return tuple of (parent, index) where:
        parent = node's parent within tree
        index = index of node under parent
    """
    parent_map = {c: p for p in tree.iter() for c in p}
    parent = parent_map[node]
    return parent, list(parent).index(node)


from_tree = ET.ElementTree(file='sample1.xml')
to_tree = ET.ElementTree(file='sample2.xml')

replace_node(from_tree, to_tree, 'car')

# ET.dump(to_tree)
to_tree.write('output.xml')
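The disclaimer above notes that replace_node only handles a single car node. Below is a minimal sketch of one way that limit might be lifted, not part of the original answer. The names replace_all_nodes, SAMPLE1 and SAMPLE2, and the extra garage element, are hypothetical and exist only for illustration. It parses in-memory strings rather than files, and it inserts a deep copy at each location so that the same Element object is never shared between two parents.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-ins for sample1.xml and sample2.xml.
SAMPLE1 = """<cars>
  <car>
    <use-car>0</use-car>
    <car-location>hawaii</car-location>
  </car>
</cars>"""

SAMPLE2 = """<cars>
  <old>1</old>
  <car />
  <garage><car /></garage>
</cars>"""


def replace_all_nodes(from_root, to_root, node_name):
    """Replace every node_name element below to_root with a deep copy of the
    first node_name element found below from_root. Return the replacement count."""
    from_node = from_root.find('.//' + node_name)
    if from_node is None:
        return 0
    replaced = 0
    # Snapshot the elements first so the tree is not mutated while iterating.
    for parent in list(to_root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag == node_name:
                new_node = copy.deepcopy(from_node)
                new_node.tail = child.tail  # keep the surrounding whitespace
                parent[index] = new_node    # in-place swap keeps the position
                replaced += 1
    return replaced


from_root = ET.fromstring(SAMPLE1)
to_root = ET.fromstring(SAMPLE2)
count = replace_all_nodes(from_root, to_root, 'car')
print(count)
print(ET.tostring(to_root, encoding='unicode'))
```

Assigning to parent[index] replaces the child in place, so no separate parent map or remove/insert pair is needed in this variant.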
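Update: It was recently brought to my attention that the find_child() implementation in the solution I originally provided would fail if the "child" in question was not in the first branch of the XML tree being traversed. I've updated the implementation above to correct this.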
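As a point of comparison, ElementTree's built-in find() and iter() already search every branch, so they could stand in for a hand-written recursive search. The sketch below is an illustration rather than the answer's own method. It uses a hypothetical tree (the XML string and the id attribute are made up) in which the target car sits in the second branch rather than the first.

```python
import xml.etree.ElementTree as ET

# Hypothetical tree: the target <car> is in the second branch, not the first.
XML = """<cars>
  <old><note>1</note></old>
  <new><car id="target" /></new>
</cars>"""

root = ET.fromstring(XML)

# './/car' searches all descendants of root, at any depth, in document order.
first_car = root.find('.//car')

# iter() walks the whole subtree (root included) and filters by tag name.
first_car_via_iter = next(root.iter('car'), None)

# A tag that does not occur anywhere simply yields None.
missing = root.find('.//truck')

print(first_car.get('id'), first_car is first_car_via_iter, missing)
```

One difference to keep in mind: iter() also considers the element it is called on, whereas find('.//car') and find_child() only look below it.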