I have a Python regex like this:
re.compile(r'(\[chartsjs\].*\[/chartsjs\])', re.DOTALL)
I am trying to do a re.findall
on patterns like this:
[charts]
name: mychart
type: line
labels: fish, cat, dog
data: 4, 5, 6
data2:5, 7, 9
[/charts]
this is some text
[charts]
name: second
type: line
labels: 100, 500, 1000
data: 50, 100, 10000
data2: 100, 100, 100
[/charts]
But it seems to be matching from the first [charts]
to the very last [/charts]
and grabbing everything in between, because when I print the result to the console I see this:
[u'[chartsjs]\r\nname: mychart\r\ntype: line\r\nlabels: fish, cat, dog\r\ndata: 4, 5, 6\r\ndata2:5, 7, 9\r\n[/chartsjs]\r\n\r\nthis is some text now fool\r\n\r\n[chartsjs]\r\nname: second\r\ntype: line\r\nlabels: 100, 500, 1000\r\ndata: 50, 100, 10000\r\ndata2: 100, 100, 100\r\n[/chartsjs]']
I would like the regex to return the first match, skip the arbitrary text, and then find any number of further matches. Is there a way to do this?
There is just one problem in your regex.
.*
is greedy: it first consumes everything up to the end of the input and then backtracks only as far as it must for the rest of the pattern to match. As a result the match ends at the last [/charts]
rather than the first one, so both blocks and the text between them come back as a single match.
To make it stop at the first [/charts]
make the quantifier lazy by appending a question mark: .*?
A lazy quantifier matches as few characters as possible, so each match ends at the nearest closing [/charts]
Here is the version I tested:
import re
a="""
[charts]
name: mychart
type: line
labels: fish, cat, dog
data: 4, 5, 6
data2:5, 7, 9
[/charts]
this is some text
[charts]
name: second
type: line
labels: 100, 500, 1000
data: 50, 100, 10000
data2: 100, 100, 100
[/charts]
"""
# Raw string keeps the backslashes literal; .*? is the lazy quantifier.
for c in re.findall(r'(\[charts\].*?\[/charts\])', a, re.DOTALL):
    print(c)
Output:
[charts]
name: mychart
type: line
labels: fish, cat, dog
data: 4, 5, 6
data2:5, 7, 9
[/charts]
[charts]
name: second
type: line
labels: 100, 500, 1000
data: 50, 100, 10000
data2: 100, 100, 100
[/charts]
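To see the difference side by side, here is a minimal sketch that runs the greedy and the lazy pattern over the same input. The shortened sample text and the variable names (text, greedy, lazy) are made up for illustration, and it uses the [charts] tag from the sample; swap in chartsjs if that is the tag in your real data.

```python
import re

# Shortened, hypothetical sample with two blocks separated by plain text.
text = """[charts]
name: mychart
[/charts]
this is some text
[charts]
name: second
[/charts]"""

# Greedy: .* runs to the end of the input, then backtracks to the LAST [/charts].
greedy = re.findall(r'\[charts\].*\[/charts\]', text, re.DOTALL)

# Lazy: .*? grows one character at a time and stops at the FIRST [/charts].
lazy = re.findall(r'\[charts\].*?\[/charts\]', text, re.DOTALL)

print(len(greedy))  # 1 (both blocks plus the text between them)
print(len(lazy))    # 2 (one match per block)
```

The re.DOTALL flag is still needed in both cases so that the dot can cross line breaks.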