I'm trying to match a type definition such as
def euro : t1 -> t2 -> t3 (and this pattern may repeat further in other examples)
I came up with this regex:
^def ([^\s]*)\s:\s([^\s]*)(\s->\s[^\s]*)*
But while it captures euro and t1, it then captures " -> t2" (arrow included) rather than t2 and t3 as separate items.
I can't see what I am doing wrong. My goal is to capture
euro t1 t2 t3
as four separate items, but what I currently get is:
0: "def euro : t1 -> t2 -> t3"
1: "euro"
2: "t1"
3: " -> t3"
You can't use a repeated capturing group for this in a JS regex: the group is overwritten on each iteration, so every value except the last one is dropped.
When creating a regular expression that needs a capturing group to grab part of the text matched, a common mistake is to repeat the capturing group instead of capturing a repeated group. The difference is that the repeated capturing group will capture only the last iteration, while a group capturing another group that's repeated will capture all iterations.
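To make the difference concrete, here is a small sketch that runs both variants of the pattern on the input from the question. The variable names (input, repeated, wrapped) are only for illustration.

```javascript
// Same input, two patterns that differ only in where the capturing group sits.
var input = "def euro : t1 -> t2 -> t3";

// Repeated capturing group: group 3 is overwritten on every iteration.
var repeated = input.match(/^def (\S*)\s:\s(\S*)(\s->\s\S*)*/);

// Capturing group around a repeated non-capturing group:
// group 3 holds the whole repeated tail.
var wrapped = input.match(/^def (\S*)\s:\s(\S*)((?:\s->\s\S*)*)/);

console.log(repeated[3]); // " -> t3"
console.log(wrapped[3]);  // " -> t2 -> t3"
```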
One way out is to capture the whole repeated substring in a single group and then split it. Here is an example:
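var s = "def euro : t1 -> t2 -> t3";
// Group 3 wraps the repeated non-capturing group, so it holds the whole " -> t2 -> t3" tail
var rx = /^def (\S*)\s:\s(\S*)((?:\s->\s\S*)*)/;
var res = [];
var m = s.match(rx);
if (m) {
  res = [m[1], m[2]];
  // Split the tail on the arrows; filter(Boolean) drops the empty leading item
  for (var part of m[3].split(" -> ").filter(Boolean)) {
    res.push(part);
  }
}
console.log(res); // ["euro", "t1", "t2", "t3"]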
Pattern details
^ - start of string
def - a literal substring
(\S*) - Capturing group 1: 0+ non-whitespace chars
\s:\s - a : enclosed with single whitespaces
(\S*) - Capturing group 2: 0+ non-whitespace chars
((?:\s->\s\S*)*) - Capturing group 3: 0+ repetitions of the following pattern sequence:
  \s->\s - whitespace, ->, whitespace
  \S* - 0+ non-whitespace chars
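Since the question mentions that the arrow part may repeat further in other inputs, the same idea can be wrapped in a function. This is only a sketch: parseTypeDef is a made-up helper name, and splitting on /\s->\s/ rather than the literal " -> " is my own choice, so that the split accepts the same whitespace the pattern does.

```javascript
// parseTypeDef is a hypothetical helper name, not part of any library.
// It returns [name, type1, type2, ...] or null if the line does not match.
function parseTypeDef(line) {
  var m = line.match(/^def (\S*)\s:\s(\S*)((?:\s->\s\S*)*)/);
  if (!m) {
    return null; // not a definition of the expected shape
  }
  // m[3] looks like " -> t2 -> t3"; splitting leaves an empty first item,
  // which filter(Boolean) removes.
  var tail = m[3].split(/\s->\s/).filter(Boolean);
  return [m[1], m[2]].concat(tail);
}

console.log(parseTypeDef("def euro : t1 -> t2 -> t3"));     // ["euro", "t1", "t2", "t3"]
console.log(parseTypeDef("def f : a -> b -> c -> d -> e")); // ["f", "a", "b", "c", "d", "e"]
console.log(parseTypeDef("def x : int"));                   // ["x", "int"]
console.log(parseTypeDef("let x = 1"));                     // null
```

Returning null for a non-matching line keeps "no match" distinct from a definition with no arrows, which comes back as a two-item array.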