How to iterate over XML tags in Python using ElementTree and save them to CSV

拉米兹·谢赫

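I am trying to iterate over all nodes and child nodes in a tree using ElementTree. I would like to get all parent and child XML tags as columns, with their values as rows, so that child nodes are appended to the parent in CSV format. I am using Python 2.7. The header should be printed only once, with the respective values below it.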

XML file:

<Customers>  
<Customer CustomerID="GREAL">  
      <CompanyName>Great Lakes Food Market</CompanyName>  
      <ContactName>Howard Snyder</ContactName>  
      <ContactTitle>Marketing Manager</ContactTitle>  
      <Phone>(503) 555-7555</Phone>  
      <FullAddress>  
        <Address>2732 Baker Blvd.</Address>  
        <City>Eugene</City>  
        <Region>OR</Region>  
        <PostalCode>97403</PostalCode>  
        <Country>USA</Country>  
      </FullAddress>  
 </Customer>  
    <Customer CustomerID="HUNGC">  
      <CompanyName>Hungry Coyote Import Store</CompanyName>  
      <ContactName>Yoshi Latimer</ContactName>  
      <ContactTitle>Sales Representative</ContactTitle>  
      <Phone>(503) 555-6874</Phone>  
      <Fax>(503) 555-2376</Fax>  
      <FullAddress>  
        <Address>City Center Plaza 516 Main St.</Address>  
        <City>Elgin</City>  
        <Region>OR</Region>  
        <PostalCode>97827</PostalCode>  
        <Country>USA</Country>  
      </FullAddress>  
    </Customer>  
    <Customer CustomerID="LAZYK">  
      <CompanyName>Lazy K Kountry Store</CompanyName>  
      <ContactName>John Steel</ContactName>  
      <ContactTitle>Marketing Manager</ContactTitle>  
      <Phone>(509) 555-7969</Phone>  
      <Fax>(509) 555-6221</Fax>  
      <FullAddress>  
        <Address>12 Orchestra Terrace</Address>  
        <City>Walla Walla</City>  
        <Region>WA</Region>  
        <PostalCode>99362</PostalCode>  
        <Country>USA</Country>  
      </FullAddress>  
    </Customer>  
    <Customer CustomerID="LETSS">  
      <CompanyName>Let's Stop N Shop</CompanyName>  
      <ContactName>Jaime Yorres</ContactName>  
      <ContactTitle>Owner</ContactTitle>  
      <Phone>(415) 555-5938</Phone>  
      <FullAddress>  
        <Address>87 Polk St. Suite 5</Address>  
        <City>San Francisco</City>  
        <Region>CA</Region>  
        <PostalCode>94117</PostalCode>  
        <Country>USA</Country>  
      </FullAddress>  
    </Customer>  
  </Customers>  

My code:

#Import Libraries
import csv
import xmlschema
import xml.etree.ElementTree as ET

#Define the variable to store the XML Document
xml_file = 'C:/Users/391648/Desktop/BOSS_20190618_20190516_18062019141928_CUMA/source_Files_XML/CustomersOrders.xml'

#using XML Schema Library validate the XML against XSD
my_schema = xmlschema.XMLSchema('C:/Users/391648/Desktop/BOSS_20190618_20190516_18062019141928_CUMA/source_Files_XML/CustomersOrders.xsd')
SchemaCheck = my_schema.is_valid(xml_file)
print(SchemaCheck) #Prints as True if the document is validated with XSD

#Parse XML & get root
tree = ET.parse(xml_file)
root = tree.getroot()

#Create & Open CSV file
xml_data_to_csv = open('C:/Users/391648/Desktop/BOSS_20190618_20190516_18062019141928_CUMA/source_Files_XML/PythonXMl.csv','w')

#create variable to write to csv
csvWriter = csv.writer(xml_data_to_csv)

#Create list contains header
count =0

#Loop for each node
for element in root.findall('Customers/Customer'):
    List_nodes = []

    #Get head by Tag
    if count ==0:
        list_header =[]
        Full_Address = []
        CompanyName = element.find('CompanyName').tag
        list_header.append(CompanyName)

        ContactName = element.find('ContactName').tag
        list_header.append(ContactName)

        ContactTitle = element.find('ContactTitle').tag
        list_header.append(ContactTitle)

        Phone = element.find('Phone').tag
        list_header.append(Phone)

        print(list_header)
        csvWriter.writerow(list_header)

        count = count + 1

    #Get the data of the Node
    CompanyName = element.find('CompanyName').text
    List_nodes.append(CompanyName)

    ContactName = element.find('ContactName').text
    List_nodes.append(ContactName)

    ContactTitle = element.find('ContactTitle').text
    List_nodes.append(ContactTitle)

    Phone = element.find('Phone').text
    List_nodes.append(Phone)

    print(List_nodes)

    #Write List_Nodes to CSV
    csvWriter.writerow(List_nodes)

xml_data_to_csv.close()

Expected CSV output:

CompanyName,ContactName,ContactTitle,Phone, Address, City, Region, PostalCode, Country
Great Lakes Food Market,Howard Snyder,Marketing Manager,(503) 555-7555, City Center Plaza 516 Main St., Elgin, OR, 97827, USA
Hungry Coyote Import Store,Yoshi Latimer,Sales Representative,(503) 555-6874, 12 Orchestra Terrace, Walla Walla, WA, 99362, USA
五元素

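I changed a few things:

  • Removed the schema validation, since I do not have the XSD. You can include it.
  • Made the child-node traversal dynamic instead of referencing each child node statically.
  • Changed the main for loop condition to for customer in root.findall('Customer') from for customer in root.findall('Customers/Customer').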
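
The last change matters because getroot() already returns the Customers element, and findall paths are resolved relative to the element they are called on. A minimal sketch, assuming Python 3 and a shortened in-memory sample (the xml_text string is illustrative, not the asker's file):

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened sample with the same shape as the question's XML.
xml_text = """<Customers>
  <Customer CustomerID="GREAL"><CompanyName>Great Lakes Food Market</CompanyName></Customer>
  <Customer CustomerID="HUNGC"><CompanyName>Hungry Coyote Import Store</CompanyName></Customer>
</Customers>"""

root = ET.fromstring(xml_text)  # root is the <Customers> element itself

# Paths are relative to root, so this looks for a <Customers> child
# *inside* <Customers> and matches nothing.
wrong = root.findall('Customers/Customer')

# The <Customer> elements are direct children of root.
right = root.findall('Customer')

print(root.tag, len(wrong), len(right))
```

With the original path the loop body never runs, which is why only an empty CSV comes out.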

Otherwise, I tried to keep your program structure and library usage intact. Here is the modified program:

import xml.etree.ElementTree as et
import csv

tree = et.parse("../data/customers.xml")
root = tree.getroot()
headers = []
count = 0
xml_data_to_csv = open('../data/customers.csv', 'w')

csvWriter = csv.writer(xml_data_to_csv)
for customer in root.findall('Customer'):
    data = []
    for detail in customer:
        if detail.tag == 'FullAddress':
            # Flatten the address: each child becomes its own column
            for addresspart in detail:
                # Strip trailing newline characters only ('\n\r', not '/n/r')
                data.append(addresspart.text.rstrip('\n\r'))
                if count == 0:
                    headers.append(addresspart.tag)
        else:
            data.append(detail.text.rstrip('\n\r'))
            if count == 0:
                headers.append(detail.tag)
    # Write the header once, built from the first customer's tags
    if count == 0:
        csvWriter.writerow(headers)
    csvWriter.writerow(data)
    count = count + 1

xml_data_to_csv.close()

With the given input XML, it produces:

CompanyName,ContactName,ContactTitle,Phone,Address,City,Region,PostalCode,Country
Great Lakes Food Market,Howard Snyder,Marketing Manager,(503) 555-7555,2732 Baker Blvd.,Eugene,OR,97403,USA
Hungry Coyote Import Store,Yoshi Latimer,Sales Representative,(503) 555-6874,(503) 555-2376,City Center Plaza 516 Main St.,Elgin,OR,97827,USA
Lazy K Kountry Store,John Steel,Marketing Manager,(509) 555-7969,(509) 555-6221,12 Orchestra Terrace,Walla Walla,WA,99362,USA
Let's Stop N Shop,Jaime Yorres,Owner,(415) 555-5938,87 Polk St. Suite 5,San Francisco,CA,94117,USA
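
One thing to watch in the output above: customers that have a Fax element produce one more field than the header row, so their address columns shift to the right. A possible way around this, sketched here assuming Python 3 and a shortened in-memory sample, is to collect each customer as a dict, build the header from every tag seen, and let csv.DictWriter fill the gaps. The helper name customer_to_row and the sample data are illustrative and not part of the program above.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, shortened sample: the second customer has a <Fax>, the first does not.
xml_text = """<Customers>
  <Customer CustomerID="GREAL">
    <CompanyName>Great Lakes Food Market</CompanyName>
    <Phone>(503) 555-7555</Phone>
    <FullAddress><City>Eugene</City><Region>OR</Region></FullAddress>
  </Customer>
  <Customer CustomerID="HUNGC">
    <CompanyName>Hungry Coyote Import Store</CompanyName>
    <Phone>(503) 555-6874</Phone>
    <Fax>(503) 555-2376</Fax>
    <FullAddress><City>Elgin</City><Region>OR</Region></FullAddress>
  </Customer>
</Customers>"""


def customer_to_row(customer):
    """Flatten one <Customer> into a dict of tag -> text (illustrative helper)."""
    row = {}
    for detail in customer:
        # Container elements such as <FullAddress> contribute their children.
        parts = list(detail) or [detail]
        for part in parts:
            row[part.tag] = (part.text or '').strip()
    return row


root = ET.fromstring(xml_text)
rows = [customer_to_row(c) for c in root.findall('Customer')]

# Header = every tag seen across all customers, in first-seen order.
headers = []
for row in rows:
    for tag in row:
        if tag not in headers:
            headers.append(tag)

out = io.StringIO()  # stands in for a real file opened with newline=''
writer = csv.DictWriter(out, fieldnames=headers, restval='')
writer.writeheader()
writer.writerows(rows)  # missing keys (no Fax) are written as empty cells
print(out.getvalue())
```

In this sketch Fax ends up as the last column because it is first seen on the second customer; a fixed header list could be passed as fieldnames instead if a particular column order is needed.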

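Note: instead of writing to the CSV inside the loop, you can append the rows to a list and write them all at once. Which is better depends on the size of your content and your performance needs.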
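
A rough sketch of that batching idea, assuming Python 3; the two hard-coded rows stand in for the lists built inside the customer loop and are purely illustrative:

```python
import csv
import io

# Hypothetical rows, standing in for the data collected per customer.
headers = ['CompanyName', 'Phone']
rows = []
for name, phone in [('Great Lakes Food Market', '(503) 555-7555'),
                    ('Lazy K Kountry Store', '(509) 555-7969')]:
    rows.append([name, phone])  # accumulate instead of calling writerow here

out = io.StringIO()  # a real script would use open(path, 'w', newline='')
writer = csv.writer(out)
writer.writerow(headers)
writer.writerows(rows)  # one call once every row has been collected
print(out.getvalue())
```

Holding all rows in memory is fine for small files like this one; for very large inputs, writing row by row keeps memory use flat.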


Update: when you have customers and their orders in the XML

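The XML processing and CSV writing code structure remains the same. In addition, the Orders element is processed while processing each customer. Under Orders, each Order element can be handled just like a Customer. As you mentioned, each Order also has a ShipInfo element.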

Assuming the input XML is as follows (based on the comments below):

<Customers>
    <Customer CustomerID="GREAL">
        <CompanyName>Great Lakes Food Market</CompanyName>
        <ContactName>Howard Snyder</ContactName>
        <ContactTitle>Marketing Manager</ContactTitle>
        <Phone>(503) 555-7555</Phone>
        <FullAddress>
            <Address>2732 Baker Blvd.</Address>
            <City>Eugene</City>
            <Region>OR</Region>
            <PostalCode>97403</PostalCode>
            <Country>USA</Country>
        </FullAddress>
        <Orders>
            <Order>
                <Param1>Value1</Param1>
                <Param2>Value2</Param2>
                <ShipInfo>
                    <ShipInfoParam1>Value3</ShipInfoParam1>
                    <ShipInfoParam2>Value4</ShipInfoParam2>
                </ShipInfo>
            </Order>
            <Order>
                <Param1>Value5</Param1>
                <Param2>Value6</Param2>
                <ShipInfo>
                    <ShipInfoParam1>Value7</ShipInfoParam1>
                    <ShipInfoParam2>Value8</ShipInfoParam2>
                </ShipInfo>
            </Order>
        </Orders>
    </Customer>
    <Customer CustomerID="HUNGC">
        <CompanyName>Hungry Coyote Import Store</CompanyName>
        <ContactName>Yoshi Latimer</ContactName>
        <ContactTitle>Sales Representative</ContactTitle>
        <Phone>(503) 555-6874</Phone>
        <Fax>(503) 555-2376</Fax>
        <FullAddress>
            <Address>City Center Plaza 516 Main St.</Address>
            <City>Elgin</City>
            <Region>OR</Region>
            <PostalCode>97827</PostalCode>
            <Country>USA</Country>
        </FullAddress>
        <Orders>
            <Order>
                <Param1>Value7</Param1>
                <Param2>Value8</Param2>
                <ShipInfo>
                    <ShipInfoParam1>Value9</ShipInfoParam1>
                    <ShipInfoParam2>Value10</ShipInfoParam2>
                </ShipInfo>
            </Order>
        </Orders>
    </Customer>
    <Customer CustomerID="LAZYK">
        <CompanyName>Lazy K Kountry Store</CompanyName>
        <ContactName>John Steel</ContactName>
        <ContactTitle>Marketing Manager</ContactTitle>
        <Phone>(509) 555-7969</Phone>
        <Fax>(509) 555-6221</Fax>
        <FullAddress>
            <Address>12 Orchestra Terrace</Address>
            <City>Walla Walla</City>
            <Region>WA</Region>
            <PostalCode>99362</PostalCode>
            <Country>USA</Country>
        </FullAddress>
    </Customer>
    <Customer CustomerID="LETSS">
        <CompanyName>Let's Stop N Shop</CompanyName>
        <ContactName>Jaime Yorres</ContactName>
        <ContactTitle>Owner</ContactTitle>
        <Phone>(415) 555-5938</Phone>
        <FullAddress>
            <Address>87 Polk St. Suite 5</Address>
            <City>San Francisco</City>
            <Region>CA</Region>
            <PostalCode>94117</PostalCode>
            <Country>USA</Country>
        </FullAddress>
    </Customer>
</Customers>

Here is the modified code that processes customers and orders:

import xml.etree.ElementTree as et
import csv

tree = et.parse("../data/customers-with-orders.xml")
root = tree.getroot()

customer_csv = open('../data/customers-part.csv', 'w')
order_csv = open('../data/orders-part.csv', 'w')

customerCsvWriter = csv.writer(customer_csv)
orderCsvWriter = csv.writer(order_csv)

customerHeaders = []
orderHeaders = ['CustomerID']
isFirstCustomer = True
isFirstOrder = True


def processOrders(orders, customerId):
    # orders is the <Orders> element, passed in explicitly rather than
    # read from the loop variable of the calling code
    global isFirstOrder
    for order in orders.findall('Order'):
        orderData = [customerId]
        for orderdetail in order:
            if orderdetail.tag == 'ShipInfo':
                for shipinfopart in orderdetail:
                    orderData.append(shipinfopart.text.rstrip('\n\r'))
                    if isFirstOrder:
                        orderHeaders.append(shipinfopart.tag)
            else:
                orderData.append(orderdetail.text.rstrip('\n\r'))
                if isFirstOrder:
                    orderHeaders.append(orderdetail.tag)
        if isFirstOrder:
            orderCsvWriter.writerow(orderHeaders)
        orderCsvWriter.writerow(orderData)
        isFirstOrder = False


for customer in root.findall('Customer'):
    customerData = []
    customerId = customer.get('CustomerID')
    for detail in customer:
        if detail.tag == 'FullAddress':
            for addresspart in detail:
                customerData.append(addresspart.text.rstrip('\n\r'))
                if isFirstCustomer:
                    customerHeaders.append(addresspart.tag)
        elif detail.tag == 'Orders':
            # Orders go to their own CSV, keyed by the customer ID
            processOrders(detail, customerId)
        else:
            customerData.append(detail.text.rstrip('\n\r'))
            if isFirstCustomer:
                customerHeaders.append(detail.tag)
    if isFirstCustomer:
        customerCsvWriter.writerow(customerHeaders)
    customerCsvWriter.writerow(customerData)
    isFirstCustomer = False

customer_csv.close()
order_csv.close()

Output produced in customers-part.csv:

CompanyName,ContactName,ContactTitle,Phone,Address,City,Region,PostalCode,Country
Great Lakes Food Market,Howard Snyder,Marketing Manager,(503) 555-7555,2732 Baker Blvd.,Eugene,OR,97403,USA
Hungry Coyote Import Store,Yoshi Latimer,Sales Representative,(503) 555-6874,(503) 555-2376,City Center Plaza 516 Main St.,Elgin,OR,97827,USA
Lazy K Kountry Store,John Steel,Marketing Manager,(509) 555-7969,(509) 555-6221,12 Orchestra Terrace,Walla Walla,WA,99362,USA
Let's Stop N Shop,Jaime Yorres,Owner,(415) 555-5938,87 Polk St. Suite 5,San Francisco,CA,94117,USA

Output produced in orders-part.csv:

CustomerID,Param1,Param2,ShipInfoParam1,ShipInfoParam2
GREAL,Value1,Value2,Value3,Value4
GREAL,Value5,Value6,Value7,Value8
HUNGC,Value7,Value8,Value9,Value10

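Note: the code can be optimized further through reuse; I leave that part to you. Also note that the customer ID is added to each order so that the orders can be told apart.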
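
One way the reuse mentioned in the note might look, as a sketch only: the customer and order loops both flatten an element's children while expanding one nested container, so a single helper could serve both. The name flatten, its expand and skip parameters, and the shortened sample are all hypothetical; Python 3 is assumed.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened sample with the same nesting as the XML above.
xml_text = """<Customers>
  <Customer CustomerID="GREAL">
    <CompanyName>Great Lakes Food Market</CompanyName>
    <FullAddress><City>Eugene</City></FullAddress>
    <Orders>
      <Order>
        <Param1>Value1</Param1>
        <ShipInfo><ShipInfoParam1>Value3</ShipInfoParam1></ShipInfo>
      </Order>
    </Orders>
  </Customer>
</Customers>"""


def flatten(element, expand, skip=()):
    """Return (tags, values) for the children of element.

    Children whose tag is in `expand` contribute their own children instead;
    children whose tag is in `skip` are ignored. Illustrative helper only.
    """
    tags, values = [], []
    for child in element:
        if child.tag in skip:
            continue
        parts = child if child.tag in expand else [child]
        for part in parts:
            tags.append(part.tag)
            values.append((part.text or '').strip())
    return tags, values


root = ET.fromstring(xml_text)
customer = root.find('Customer')

# Customer row: expand the address, leave the orders for the second CSV.
customer_tags, customer_values = flatten(customer, expand={'FullAddress'}, skip={'Orders'})
print(customer_tags, customer_values)

for order in customer.findall('Orders/Order'):
    order_tags, order_values = flatten(order, expand={'ShipInfo'})
    # Prefix the customer ID so each order row can be traced back.
    print(['CustomerID'] + order_tags, [customer.get('CustomerID')] + order_values)
```

The returned tag lists can be written as the header on the first row and the value lists passed to writerow, just as in the program above.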
