I'm writing a Python function that takes a chunk of text, read from a text file using f.readlines,
and splits it into a list. The text contains dividers, and I want to split it specifically at those locations. Below is an example of the text file in question.
@model:2.4.0=Skeleton "Skeleton"
@compartments
Cell=1.0 "Cell"
@species
Cell:[A]=100.0 "A"
Cell:[B]=1.0 "B"
Cell:[C]=0.0 "C"
Cell:[D]=0.0 "D"
@parameters
kcat=4000
km = 146
v2_k = 88
@reactions
@r=v1 "v1"
A -> C : B
Cell * kcat * B * A / (km + A)
@r=v2 "v2"
C -> C+D
Cell * v2_k * C
My desired output is a Python dictionary that has the divider names as keys and all the content between each divider and the next as values. For example, the first element of the sections dictionary should be:
sections['@model'] = ':2.4.0=Skeleton "Skeleton"'
Current Code
def split_sections(SBshorthand_file):
    '''
    Takes a SBshorthand file and returns a dictionary of each of the sections.
    Keys of the dictionary are the dividers.
    Values of dictionary are the content between dividers.
    '''
    SBfile=parse_SBshorthand_read(SBshorthand_file) #simple parsing function. uses f.read()
    dividers=["@model", "@units", "@compartments", "@species", "@parameters", "@rules", "@reactions", "@events"]
    sections={}
    for i in dividers:
        pattern=re.compile(i)
        if re.findall(pattern,SBfile) == []:
            pass
            # print 'Section \'{}\' not present in {}'.format(i,SBshorthand_file)
        else:
            SBfile2=re.sub(pattern,'\n'+i,SBfile)
            print SBfile2
However, this does not do what I want. Does anybody have any ideas on how to fix my code? Thanks.
-----------------Edit--------------------
Please note that the section '@reactions' contains a number of 'reactions' all of which start with @r, but they all need to be grouped under the reactions key.
import re
x="""@model:2.4.0=Skeleton "Skeleton"
@compartments
Cell=1.0 "Cell"
@species
Cell:[A]=100.0 "A"
Cell:[B]=1.0 "B"
Cell:[C]=0.0 "C"
Cell:[D]=0.0 "D"
@parameters
kcat=4000
km = 146
v2_k = 88
@reactions
@r=v1 "v1"
A -> C : B
Cell * kcat * B * A / (km + A)
@r=v2 "v2"
C -> C+D
Cell * v2_k * C"""
print dict(re.findall(r"(?:^|(?<=\n))(@\w+)([\s\S]*?)(?=\n@(?!r\b)\w+|$)",x))
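You can use re.findall directly to get what you want. Each match captures a divider name (@\w+) and, lazily, everything up to the next line that starts with a divider. The negative lookahead (?!r\b) stops a line beginning with @r from ending the current section, so every reaction stays grouped under @reactions. dict() then turns the resulting (divider, content) pairs into a dictionary. Note that when a divider sits on its own line, its value begins with a newline.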
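If the single regular expression is hard to maintain, a line-by-line pass is another option. The following is only a sketch under some assumptions: the text has already been read into one string, the divider names are exactly those listed in the question, and the names DIVIDERS, SAMPLE and the rewritten split_sections signature (taking the text rather than a file name) are illustrative choices, not part of the original code.

```python
import re

# Divider names copied from the question's list. "@r" is deliberately absent,
# so reaction headers never start a new section.
DIVIDERS = ["@model", "@units", "@compartments", "@species",
            "@parameters", "@rules", "@reactions", "@events"]

# Sample text copied from the question.
SAMPLE = """@model:2.4.0=Skeleton "Skeleton"
@compartments
Cell=1.0 "Cell"
@species
Cell:[A]=100.0 "A"
Cell:[B]=1.0 "B"
Cell:[C]=0.0 "C"
Cell:[D]=0.0 "D"
@parameters
kcat=4000
km = 146
v2_k = 88
@reactions
@r=v1 "v1"
A -> C : B
Cell * kcat * B * A / (km + A)
@r=v2 "v2"
C -> C+D
Cell * v2_k * C"""


def split_sections(text):
    """Return a dict mapping each divider found in text to its content."""
    collected = {}
    current = None
    for line in text.splitlines():
        match = re.match(r"@\w+", line)
        if match and match.group(0) in DIVIDERS:
            # A known divider opens a new section; keep any text that
            # follows it on the same line (as with "@model:2.4.0=...").
            current = match.group(0)
            rest = line[match.end():]
            collected[current] = [rest] if rest else []
        elif current is not None:
            # Every other line, including "@r=..." lines, belongs to the
            # section that is currently open.
            collected[current].append(line)
    return dict((key, "\n".join(lines)) for key, lines in collected.items())


sections = split_sections(SAMPLE)
print(sections["@model"])      # :2.4.0=Skeleton "Skeleton"
print(sections["@reactions"])  # all @r blocks, grouped under one key
```

Unlike the findall version, the values here carry no leading newline, and a divider missing from the file (such as @units in the sample) simply has no key in the result.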