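I am trying to iterate over all nodes and child nodes in an XML tree using ElementTree. I want every parent and child XML tag to become a column, with the child node values appended to the parent's row in CSV format. I am using Python 2.7. The header should be printed only once, with the corresponding values in the rows below it.
XML file: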
<Customers>
<Customer CustomerID="GREAL">
<CompanyName>Great Lakes Food Market</CompanyName>
<ContactName>Howard Snyder</ContactName>
<ContactTitle>Marketing Manager</ContactTitle>
<Phone>(503) 555-7555</Phone>
<FullAddress>
<Address>2732 Baker Blvd.</Address>
<City>Eugene</City>
<Region>OR</Region>
<PostalCode>97403</PostalCode>
<Country>USA</Country>
</FullAddress>
</Customer>
<Customer CustomerID="HUNGC">
<CompanyName>Hungry Coyote Import Store</CompanyName>
<ContactName>Yoshi Latimer</ContactName>
<ContactTitle>Sales Representative</ContactTitle>
<Phone>(503) 555-6874</Phone>
<Fax>(503) 555-2376</Fax>
<FullAddress>
<Address>City Center Plaza 516 Main St.</Address>
<City>Elgin</City>
<Region>OR</Region>
<PostalCode>97827</PostalCode>
<Country>USA</Country>
</FullAddress>
</Customer>
<Customer CustomerID="LAZYK">
<CompanyName>Lazy K Kountry Store</CompanyName>
<ContactName>John Steel</ContactName>
<ContactTitle>Marketing Manager</ContactTitle>
<Phone>(509) 555-7969</Phone>
<Fax>(509) 555-6221</Fax>
<FullAddress>
<Address>12 Orchestra Terrace</Address>
<City>Walla Walla</City>
<Region>WA</Region>
<PostalCode>99362</PostalCode>
<Country>USA</Country>
</FullAddress>
</Customer>
<Customer CustomerID="LETSS">
<CompanyName>Let's Stop N Shop</CompanyName>
<ContactName>Jaime Yorres</ContactName>
<ContactTitle>Owner</ContactTitle>
<Phone>(415) 555-5938</Phone>
<FullAddress>
<Address>87 Polk St. Suite 5</Address>
<City>San Francisco</City>
<Region>CA</Region>
<PostalCode>94117</PostalCode>
<Country>USA</Country>
</FullAddress>
</Customer>
</Customers>
My code:
#Import Libraries
import csv
import xmlschema
import xml.etree.ElementTree as ET
#Define the variable to store the XML Document
xml_file = 'C:/Users/391648/Desktop/BOSS_20190618_20190516_18062019141928_CUMA/source_Files_XML/CustomersOrders.xml'
#using XML Schema Library validate the XML against XSD
my_schema = xmlschema.XMLSchema('C:/Users/391648/Desktop/BOSS_20190618_20190516_18062019141928_CUMA/source_Files_XML/CustomersOrders.xsd')
SchemaCheck = my_schema.is_valid(xml_file)
print(SchemaCheck) #Prints as True if the document is validated with XSD
#Parse XML & get root
tree = ET.parse(xml_file)
root = tree.getroot()
#Create & Open CSV file
xml_data_to_csv = open('C:/Users/391648/Desktop/BOSS_20190618_20190516_18062019141928_CUMA/source_Files_XML/PythonXMl.csv','w')
#create variable to write to csv
csvWriter = csv.writer(xml_data_to_csv)
#Create list contains header
count =0
#Loop for each node
for element in root.findall('Customers/Customer'):
    List_nodes = []
    #Get head by Tag
    if count ==0:
        list_header =[]
        Full_Address = []
        CompanyName = element.find('CompanyName').tag
        list_header.append(CompanyName)
        ContactName = element.find('ContactName').tag
        list_header.append(ContactName)
        ContactTitle = element.find('ContactTitle').tag
        list_header.append(ContactTitle)
        Phone = element.find('Phone').tag
        list_header.append(Phone)
        print(list_header)
        csvWriter.writerow(list_header)
        count = count + 1
    #Get the data of the Node
    CompanyName = element.find('CompanyName').text
    List_nodes.append(CompanyName)
    ContactName = element.find('ContactName').text
    List_nodes.append(ContactName)
    ContactTitle = element.find('ContactTitle').text
    List_nodes.append(ContactTitle)
    Phone = element.find('Phone').text
    List_nodes.append(Phone)
    print(List_nodes)
    #Write List_Nodes to CSV
    csvWriter.writerow(List_nodes)
xml_data_to_csv.close()
Expected CSV output:
CompanyName,ContactName,ContactTitle,Phone, Address, City, Region, PostalCode, Country
Great Lakes Food Market,Howard Snyder,Marketing Manager,(503) 555-7555, City Center Plaza 516 Main St., Elgin, OR, 97827, USA
Hungry Coyote Import Store,Yoshi Latimer,Sales Representative,(503) 555-6874, 12 Orchestra Terrace, Walla Walla, WA, 99362, USA
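I changed a few things. The for loop condition changed
from for customer in root.findall('Customers/Customer')
to for customer in root.findall('Customer')
because root is already the Customers element and findall paths are resolved relative to it, so 'Customers/Customer' matches nothing.
Otherwise, I tried to keep your program structure and library usage intact. Here is the modified program: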
import xml.etree.ElementTree as et
import csv

tree = et.parse("../data/customers.xml")
root = tree.getroot()
headers = []
count = 0
xml_data_to_csv = open('../data/customers.csv', 'w')
csvWriter = csv.writer(xml_data_to_csv)
for customer in root.findall('Customer'):
    data = []
    for detail in customer:
        if(detail.tag == 'FullAddress'):
            # Flatten the nested address fields into the customer row
            for addresspart in detail:
                data.append(addresspart.text.rstrip('\n\r'))
                if(count == 0):
                    headers.append(addresspart.tag)
        else:
            data.append(detail.text.rstrip('\n\r'))
            if(count == 0):
                headers.append(detail.tag)
    # Headers are collected from the first customer only
    if(count == 0):
        csvWriter.writerow(headers)
    csvWriter.writerow(data)
    count = count + 1
xml_data_to_csv.close()
With the given input XML, it produces:
CompanyName,ContactName,ContactTitle,Phone,Address,City,Region,PostalCode,Country
Great Lakes Food Market,Howard Snyder,Marketing Manager,(503) 555-7555,2732 Baker Blvd.,Eugene,OR,97403,USA
Hungry Coyote Import Store,Yoshi Latimer,Sales Representative,(503) 555-6874,(503) 555-2376,City Center Plaza 516 Main St.,Elgin,OR,97827,USA
Lazy K Kountry Store,John Steel,Marketing Manager,(509) 555-7969,(509) 555-6221,12 Orchestra Terrace,Walla Walla,WA,99362,USA
Let's Stop N Shop,Jaime Yorres,Owner,(415) 555-5938,87 Polk St. Suite 5,San Francisco,CA,94117,USA
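Note: instead of writing to the CSV inside the loop, you can append the rows to a list and write them all at once. Which is better depends on the size of your content and your performance needs.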
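The note above can be sketched roughly as follows. Collecting the rows first also helps with the optional Fax element: in the output above, rows that contain a Fax value have one more column than the header. This is only a minimal sketch, assuming Python 3; the shortened SAMPLE document and the helper names customer_rows and to_csv_text are made up for illustration and are not part of the original program. It uses csv.DictWriter so that a missing field is written as an empty cell.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Shortened sample in the same shape as the question's XML (Fax is optional).
SAMPLE = """<Customers>
  <Customer CustomerID="GREAL">
    <CompanyName>Great Lakes Food Market</CompanyName>
    <Phone>(503) 555-7555</Phone>
    <FullAddress><City>Eugene</City><Country>USA</Country></FullAddress>
  </Customer>
  <Customer CustomerID="HUNGC">
    <CompanyName>Hungry Coyote Import Store</CompanyName>
    <Phone>(503) 555-6874</Phone>
    <Fax>(503) 555-2376</Fax>
    <FullAddress><City>Elgin</City><Country>USA</Country></FullAddress>
  </Customer>
</Customers>"""


def customer_rows(root):
    """Return (fieldnames, rows): one dict per Customer, FullAddress flattened."""
    fieldnames, rows = [], []
    for customer in root.findall('Customer'):
        row = {}
        for detail in customer:
            # Children of FullAddress become top-level columns.
            parts = list(detail) if detail.tag == 'FullAddress' else [detail]
            for part in parts:
                row[part.tag] = (part.text or '').strip()
                if part.tag not in fieldnames:
                    fieldnames.append(part.tag)  # first-seen column order
        rows.append(row)
    return fieldnames, rows


def to_csv_text(root):
    """Write the header once, then all collected rows in a single call."""
    fieldnames, rows = customer_rows(root)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval='',
                            lineterminator='\n')
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv_text(ET.fromstring(SAMPLE))
print(csv_text)
```

Because columns are added in the order they are first seen, Fax ends up as the last column here. If a fixed column order matters, passing an explicit fieldnames list to DictWriter would be the simpler choice.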
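The XML processing and CSV writing code structure remains the same. Additionally, the Orders element is processed while each customer is processed. Under Orders, the Order elements can be handled just like Customer. As you mentioned, each Order also has a ShipInfo element.
Assuming the input XML is (based on the comment below):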
<Customers>
<Customer CustomerID="GREAL">
<CompanyName>Great Lakes Food Market</CompanyName>
<ContactName>Howard Snyder</ContactName>
<ContactTitle>Marketing Manager</ContactTitle>
<Phone>(503) 555-7555</Phone>
<FullAddress>
<Address>2732 Baker Blvd.</Address>
<City>Eugene</City>
<Region>OR</Region>
<PostalCode>97403</PostalCode>
<Country>USA</Country>
</FullAddress>
<Orders>
<Order>
<Param1>Value1</Param1>
<Param2>Value2</Param2>
<ShipInfo>
<ShipInfoParam1>Value3</ShipInfoParam1>
<ShipInfoParam2>Value4</ShipInfoParam2>
</ShipInfo>
</Order>
<Order>
<Param1>Value5</Param1>
<Param2>Value6</Param2>
<ShipInfo>
<ShipInfoParam1>Value7</ShipInfoParam1>
<ShipInfoParam2>Value8</ShipInfoParam2>
</ShipInfo>
</Order>
</Orders>
</Customer>
<Customer CustomerID="HUNGC">
<CompanyName>Hungry Coyote Import Store</CompanyName>
<ContactName>Yoshi Latimer</ContactName>
<ContactTitle>Sales Representative</ContactTitle>
<Phone>(503) 555-6874</Phone>
<Fax>(503) 555-2376</Fax>
<FullAddress>
<Address>City Center Plaza 516 Main St.</Address>
<City>Elgin</City>
<Region>OR</Region>
<PostalCode>97827</PostalCode>
<Country>USA</Country>
</FullAddress>
<Orders>
<Order>
<Param1>Value7</Param1>
<Param2>Value8</Param2>
<ShipInfo>
<ShipInfoParam1>Value9</ShipInfoParam1>
<ShipInfoParam2>Value10</ShipInfoParam2>
</ShipInfo>
</Order>
</Orders>
</Customer>
<Customer CustomerID="LAZYK">
<CompanyName>Lazy K Kountry Store</CompanyName>
<ContactName>John Steel</ContactName>
<ContactTitle>Marketing Manager</ContactTitle>
<Phone>(509) 555-7969</Phone>
<Fax>(509) 555-6221</Fax>
<FullAddress>
<Address>12 Orchestra Terrace</Address>
<City>Walla Walla</City>
<Region>WA</Region>
<PostalCode>99362</PostalCode>
<Country>USA</Country>
</FullAddress>
</Customer>
<Customer CustomerID="LETSS">
<CompanyName>Let's Stop N Shop</CompanyName>
<ContactName>Jaime Yorres</ContactName>
<ContactTitle>Owner</ContactTitle>
<Phone>(415) 555-5938</Phone>
<FullAddress>
<Address>87 Polk St. Suite 5</Address>
<City>San Francisco</City>
<Region>CA</Region>
<PostalCode>94117</PostalCode>
<Country>USA</Country>
</FullAddress>
</Customer>
</Customers>
Here is the modified code that processes customers and orders:
import xml.etree.ElementTree as et
import csv

tree = et.parse("../data/customers-with-orders.xml")
root = tree.getroot()
customer_csv = open('../data/customers-part.csv', 'w')
order_csv = open('../data/orders-part.csv', 'w')
customerCsvWriter = csv.writer(customer_csv)
orderCsvWriter = csv.writer(order_csv)
customerHeaders = []
orderHeaders = ['CustomerID']
isFirstCustomer = True
isFirstOrder = True

def processOrders(customerId, orders):
    # orders is the customer's Orders element, passed in explicitly
    global isFirstOrder
    for order in orders.findall('Order'):
        orderData = [customerId]
        for orderdetail in order:
            if(orderdetail.tag == 'ShipInfo'):
                for shipinfopart in orderdetail:
                    orderData.append(shipinfopart.text.rstrip('\n\r'))
                    if(isFirstOrder):
                        orderHeaders.append(shipinfopart.tag)
            else:
                orderData.append(orderdetail.text.rstrip('\n\r'))
                if(isFirstOrder):
                    orderHeaders.append(orderdetail.tag)
        if(isFirstOrder):
            orderCsvWriter.writerow(orderHeaders)
        orderCsvWriter.writerow(orderData)
        isFirstOrder = False

for customer in root.findall('Customer'):
    customerData = []
    customerId = customer.get('CustomerID')
    for detail in customer:
        if(detail.tag == 'FullAddress'):
            for addresspart in detail:
                customerData.append(addresspart.text.rstrip('\n\r'))
                if(isFirstCustomer):
                    customerHeaders.append(addresspart.tag)
        elif(detail.tag == 'Orders'):
            # Orders go to their own CSV file, keyed by CustomerID
            processOrders(customerId, detail)
        else:
            customerData.append(detail.text.rstrip('\n\r'))
            if(isFirstCustomer):
                customerHeaders.append(detail.tag)
    if(isFirstCustomer):
        customerCsvWriter.writerow(customerHeaders)
    customerCsvWriter.writerow(customerData)
    isFirstCustomer = False
customer_csv.close()
order_csv.close()
Output produced in customers-part.csv:
CompanyName,ContactName,ContactTitle,Phone,Address,City,Region,PostalCode,Country
Great Lakes Food Market,Howard Snyder,Marketing Manager,(503) 555-7555,2732 Baker Blvd.,Eugene,OR,97403,USA
Hungry Coyote Import Store,Yoshi Latimer,Sales Representative,(503) 555-6874,(503) 555-2376,City Center Plaza 516 Main St.,Elgin,OR,97827,USA
Lazy K Kountry Store,John Steel,Marketing Manager,(509) 555-7969,(509) 555-6221,12 Orchestra Terrace,Walla Walla,WA,99362,USA
Let's Stop N Shop,Jaime Yorres,Owner,(415) 555-5938,87 Polk St. Suite 5,San Francisco,CA,94117,USA
Output produced in orders-part.csv:
CustomerID,Param1,Param2,ShipInfoParam1,ShipInfoParam2
GREAL,Value1,Value2,Value3,Value4
GREAL,Value5,Value6,Value7,Value8
HUNGC,Value7,Value8,Value9,Value10
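Note: the code can be optimized further through reuse; I leave that part to you. Also note that the customer ID is added to every order row so that the orders of different customers can be told apart.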
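One possible direction for the reuse mentioned in the note is a single helper that flattens the children of any element, so the same code serves both Customer and Order. The sketch below is only an illustration under that assumption: the function name flatten, its parameters nested_tags and skip_tags, and the shortened SAMPLE element are hypothetical and do not come from the original program.

```python
import xml.etree.ElementTree as ET


def flatten(element, nested_tags=(), skip_tags=()):
    """Return (tag, text) pairs for the children of element.

    Children whose tag is in nested_tags are expanded one level;
    children whose tag is in skip_tags are left out.
    """
    pairs = []
    for child in element:
        if child.tag in skip_tags:
            continue
        parts = list(child) if child.tag in nested_tags else [child]
        for part in parts:
            pairs.append((part.tag, (part.text or '').strip()))
    return pairs


# Shortened sample in the same shape as one Customer from the answer's XML.
SAMPLE = """<Customer CustomerID="GREAL">
  <CompanyName>Great Lakes Food Market</CompanyName>
  <FullAddress><City>Eugene</City></FullAddress>
  <Orders>
    <Order>
      <Param1>Value1</Param1>
      <ShipInfo><ShipInfoParam1>Value3</ShipInfoParam1></ShipInfo>
    </Order>
  </Orders>
</Customer>"""

customer = ET.fromstring(SAMPLE)

# Customer row: expand FullAddress, leave Orders for the second file.
customer_pairs = flatten(customer, nested_tags=('FullAddress',),
                         skip_tags=('Orders',))

# Order rows: prefix each with the CustomerID, expand ShipInfo.
order_rows = [
    [('CustomerID', customer.get('CustomerID'))]
    + flatten(order, nested_tags=('ShipInfo',))
    for order in customer.findall('Orders/Order')
]

print(customer_pairs)
print(order_rows)
```

The tags in each list of pairs can serve as the header row and the texts as the data row, which would replace the two nearly identical loops in the program above.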