import re
s = "{'key': 1234}{'test': {xyz}123}"
regex = re.compile("{.+?}")
result = regex.findall(s)
print(result)
The result is ["{'key': 1234}", "{'test': {xyz}"],
but I want the result to be ["{'key': 1234}", "{'test': {xyz}123}"].
If I change re.compile("{.+?}") to re.compile("{.+}"),
the result becomes ["{'key': 1234}{'test': {xyz}123}"],
which is not what I want either.
I suggest changing your regex to something like this:
>>> s = "{'key': 1234}{'test': {xyz}123}"
>>> re.findall(r'\{(?:\{[^{}]*}|[^{}])*}', s)
["{'key': 1234}", "{'test': {xyz}123}"]
The problem lies in how the greedy and non-greedy quantifiers behave.
>>> re.findall(r'\{.*}', s)
["{'key': 1234}{'test': {xyz}123}"]
Here .* is greedy: it matches as many characters as possible, up to the last }. The match therefore runs from the first { to the last }, which is why you get the output above.
>>> re.findall(r'\{.*?}', s)
["{'key': 1234}", "{'test': {xyz}"]
Here .*? does a non-greedy match: starting from the { symbol, it matches as few characters as possible and stops at the first closing brace }. That is why you get the output above.
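The difference between the two quantifiers is easier to see when the span of every match is printed. This is only a minimal sketch; the helper name show_spans is made up for illustration and is not part of the re module:

```python
import re

s = "{'key': 1234}{'test': {xyz}123}"

def show_spans(pattern, text):
    """Return (start, end, matched_text) for each non-overlapping match."""
    return [(m.start(), m.end(), m.group()) for m in re.finditer(pattern, text)]

# Greedy: a single match that runs from the first { to the last }
greedy = show_spans(r'\{.*}', s)
print(greedy)

# Non-greedy: each match stops at the first } found after its opening {
lazy = show_spans(r'\{.*?}', s)
print(lazy)
```

The greedy pattern yields one span covering the whole string, while the non-greedy pattern yields two shorter spans, and the trailing 123} is left unmatched.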
Solution:
\{(?:\{[^{}]*}|[^{}])*}
\{ matches a literal { symbol.
(?:...) is a non-capturing group.
\{[^{}]*}|[^{}] matches either a complete inner {...} block that contains no braces, or (|) any single character that is not { or } ([^{}]).
(?:\{[^{}]*}|[^{}])* repeats that group zero or more times. The alternation switches back and forth between the two patterns: if the first alternative fails at the current position, the second one is tried. If neither matches, the group stops repeating (it is allowed to repeat zero or more times) and control transfers to the following part of the pattern, the final }.
} matches the literal closing brace.
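The same pattern can be written with re.VERBOSE so that each piece carries its own comment. This is a sketch of the pattern described above, not a general brace matcher; the name BRACES and the second test string are assumptions added for illustration:

```python
import re

# Same pattern as above, spread over several lines with re.VERBOSE
BRACES = re.compile(r"""
    \{              # literal opening brace of the outer block
    (?:             # non-capturing group, repeated zero or more times
        \{[^{}]*}   # one complete inner block with no braces inside it
      |             # or
        [^{}]       # any single character that is not a brace
    )*
    }               # literal closing brace of the outer block
""", re.VERBOSE)

s = "{'key': 1234}{'test': {xyz}123}"
result = BRACES.findall(s)
print(result)

# The pattern describes only one level of inner braces, so with deeper
# nesting the outermost block is not matched as a whole.
deeper = BRACES.findall("{a{b{c}}}")
print(deeper)
```

For the original string this gives the two desired blocks. For input nested more than one level deep, only an inner part is matched, so arbitrarily nested braces would need a different approach, such as a small parser that counts brace depth.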