SAMPLE XML FILE
<ArticleSet>
<Article>
<ForeName>a</ForeName>
<LastName>b</LastName>
<Affiliation>harvard university of science. [email protected]</Affiliation>
<Keywords>-</Keywords>
</Article>
<Article>
<ForeName>a</ForeName>
<LastName>b</LastName>
<Affiliation>-</Affiliation>
<Keywords>-</Keywords>
</Article>
<Article>
<ForeName>a</ForeName>
<LastName>b</LastName>
<Affiliation>harvard university of science. [email protected]</Affiliation>
<Keywords>-</Keywords>
</Article>
</ArticleSet>
SAMPLE CODE
from xml.etree import ElementTree as etree
import re
root = etree.parse("sampleinput.xml").getroot()
for article in root.iter("Affiliation"):
    if(article.text != "-"):
        email = re.search(r'[\w\.-]+@[\w\.-]+', article.text)
        c = etree.Element("<Email>")
        c.text = email.group(0)
        etree.write(article,c)
REQUIRED OUTPUT (UPDATED XML FILE)
<?xml version="1.0"?>
<ArticleSet>
<Article>
<ForeName>a</ForeName>
<LastName>b</LastName>
<Affiliation>harvard university of science. [email protected]</Affiliation>
<Keywords>-</Keywords>
<Email>[email protected]</Email>
</Article>
<Article>
<ForeName>a</ForeName>
<LastName>b</LastName>
<Affiliation>-</Affiliation>
<Keywords>-</Keywords>
<Email>-</Email>
</Article>
<Article>
<ForeName>a</ForeName>
<LastName>b</LastName>
<Affiliation>harvard university of science. [email protected]</Affiliation>
<Keywords>-</Keywords>
<Email>[email protected]</Email>
</Article>
</ArticleSet>
I want to extract the email address from each <Affiliation> tag, create a new tag named <Email>, and store the extracted address in it. If <Affiliation> is equal to -, the article should get <Email>-</Email> instead.
ERROR
Traceback (most recent call last):
  File "C:/Users/Ghost Rider/Documents/Python/addingTagsToXML.py", line 11, in <module>
    etree.write(article,c)
AttributeError: module 'xml.etree.ElementTree' has no attribute 'write'
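The error occurs because write is a method of the ElementTree object returned by parse(), not a function of the xml.etree.ElementTree module. Also, the tag name passed to Element should be "Email", without angle brackets. You can try this:
import re
import xml.etree.ElementTree as ET

tree = ET.parse('filename.xml')
root = tree.getroot()
for article in root.findall('Article'):
    # Look up <Affiliation> by name rather than by position
    affiliation = article.findtext('Affiliation', default='-')
    match = re.search(r'[\w.-]+@[\w.-]+', affiliation)
    child = ET.Element('Email')
    # Fall back to '-' when the affiliation is '-' or contains no email
    child.text = match.group() if match else '-'
    # Add <Email> as the last child of this <Article>
    article.append(child)
# write() belongs to the tree object, not to the module
tree.write('filename.xml')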
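To check the approach without touching a file on disk, the same logic can be run against an in-memory string. This is only a minimal sketch: the helper name add_email_tags, the constants EMAIL_RE and SAMPLE, and the address someone@example.org are placeholders made up for illustration and do not come from the original post.

```python
import re
import xml.etree.ElementTree as ET

# Same pattern as above; hypothetical constant name
EMAIL_RE = re.compile(r"[\w.-]+@[\w.-]+")

def add_email_tags(root):
    """Append an <Email> child to every <Article> directly under root."""
    # findall() returns a list, so adding children while looping is safe
    for article in root.findall("Article"):
        affiliation = article.findtext("Affiliation", default="-")
        match = EMAIL_RE.search(affiliation)
        email = ET.SubElement(article, "Email")
        email.text = match.group() if match else "-"
    return root

# Placeholder data shaped like the sample file (address is invented)
SAMPLE = (
    "<ArticleSet>"
    "<Article><ForeName>a</ForeName><LastName>b</LastName>"
    "<Affiliation>harvard university of science. someone@example.org</Affiliation>"
    "<Keywords>-</Keywords></Article>"
    "<Article><ForeName>a</ForeName><LastName>b</LastName>"
    "<Affiliation>-</Affiliation><Keywords>-</Keywords></Article>"
    "</ArticleSet>"
)

root = add_email_tags(ET.fromstring(SAMPLE))
emails = [a.findtext("Email") for a in root.findall("Article")]
print(emails)  # ['someone@example.org', '-']

# Serialise to a string; to save to disk, wrap the root in ET.ElementTree(root)
# and call its write() method instead
xml_text = ET.tostring(root, encoding="unicode")
```

Keeping the logic in a function that takes the root element makes it easy to reuse for both a parsed file (tree.getroot()) and a test string.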