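I want to split a sentence on any one of several characters (listed below). My regex splits on most of them, but not on '[' and ']' (the opening and closing square brackets). If I change the string SPECIAL_CHARACTERS_REGEX to [ :;'=\\()!-\\[\\]], it starts splitting on the integers in the string instead of on the square brackets. How can I make the regex split on square brackets rather than on integers? (It behaves as if "[]" stood for all integers.)
A related question: is there a way to also split numbers off from the rest of a string? For example, 9pm should be split into 9 and pm.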
This:
private static final String SPECIAL_CHARACTERS_REGEX = "[ :;'=\\()!-]";
String rawMessage = "let's meet tomorrow at 9:30p? 7-8pm? i=you go (no Go!) [to do !]";
String[] tokens = rawMessage.split(SPECIAL_CHARACTERS_REGEX);
Gives:
Input: let's meet tomorrow at 9:30p? 7-8pm? i=you go (no Go!) [to do !]
output: [let, s, meet, tomorrow, at, 9, 30p?, 7, 8pm?, i, you, go, , no, Go, , , [to, do, , ]]
And,
This:
private static final String SPECIAL_CHARACTERS_REGEX = "[ :;'=\\()!-\\[\\]]";
String rawMessage = "let's meet tomorrow at 9:30p? 7-8pm? i=you go (no Go!) [to do !]";
String[] tokens = rawMessage.split(SPECIAL_CHARACTERS_REGEX);
Gives:
Input: let's meet tomorrow at 9:30p? 7-8pm? i=you go (no Go!) [to do !]
output: [let, s, meet, tomorrow, at, , , , , p, , , , , pm, , i, you, go, , no, , o, , , , to, do]
Expected output:
{"let", "s", "meet", "tomorrow", "at", "9", "30", "p", "7", "8", "pm", "i", "you", "go", "no", "Go", "to", "do"}
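If you leave the hyphen in the middle of the character class, you need to escape it as well. In your second pattern, !-\\[ is read as a range running from '!' to '[', which covers the digits, the uppercase letters and '?'; that is why the string is split on integers and on the G of Go.
Alternatively, put the hyphen at the very start or end of the character class, where it is taken literally and needs no escape. Also, you don't need to escape () here, and you probably want a quantifier such as * or + after the character class so that a run of delimiters counts as one split point.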
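To make the range problem concrete, here is a small sketch that compares the question's second pattern with a variant that moves the hyphen to the end of the class. The class name HyphenRangeDemo and the sample string are made up for illustration; only the two patterns come from the discussion above.

```java
import java.util.Arrays;

public class HyphenRangeDemo {
    // Pattern from the question: "!-\\[" forms a range from '!' (U+0021)
    // to '[' (U+005B), which includes digits, uppercase letters and '?'.
    static final String ACCIDENTAL_RANGE = "[ :;'=\\()!-\\[\\]]";

    // Hyphen moved to the end of the class, where it is a literal '-'.
    static final String LITERAL_HYPHEN = "[ :;'=()!\\[\\]-]";

    public static void main(String[] args) {
        String sample = "7-8pm [Go]"; // hypothetical sample input

        // Print both splits side by side for comparison.
        System.out.println(Arrays.toString(sample.split(ACCIDENTAL_RANGE)));
        System.out.println(Arrays.toString(sample.split(LITERAL_HYPHEN)));

        // A digit is a delimiter only under the accidental range.
        System.out.println("9".matches(ACCIDENTAL_RANGE)); // true
        System.out.println("9".matches(LITERAL_HYPHEN));   // false
    }
}
```

With the hyphen last, the class matches only the characters that are actually written in it, so digits and letters stay inside the tokens.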
Update: To get the expected result, you can do this.
private static final String SPECIAL_CHARACTERS_REGEX = "[ :;'?=()!\\[\\]-]+|(?<=\\d)(?=\\D)";
String rawMessage = "let's meet tomorrow at 9:30p? 7-8pm? i=you go (no Go!) [to do !]";
String[] tokens = rawMessage.split(SPECIAL_CHARACTERS_REGEX);
System.out.println(Arrays.toString(tokens));
Regular expression:
[ :;'?=()!\[\]-]+   any character of: ' ', ':', ';', ''', '?', '=', '(', ')', '!', '[', ']', '-' (1 or more times)
|                   OR
(?<=                look behind to see if there is:
  \d                  a digit (0-9)
)                   end of look-behind
(?=                 look ahead to see if there is:
  \D                  a non-digit (anything but 0-9)
)                   end of look-ahead
Output:
[let, s, meet, tomorrow, at, 9, 30, p, 7, 8, pm, i, you, go, no, Go, to, do]
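For comparison, the same tokens can also be obtained by matching what you want to keep instead of splitting on what you want to drop. The sketch below puts the split regex from this answer next to a Matcher-based alternative; the class name TokenizeSketch, the method names and the token pattern \d+|[A-Za-z]+ are assumptions for illustration, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenizeSketch {
    // Split regex from the answer: runs of delimiters, or the zero-width
    // boundary between a digit and a following non-digit.
    private static final String SPLIT_REGEX = "[ :;'?=()!\\[\\]-]+|(?<=\\d)(?=\\D)";

    // Assumed alternative: a token is a run of digits or a run of ASCII letters.
    private static final Pattern TOKEN = Pattern.compile("\\d+|[A-Za-z]+");

    static List<String> splitOnDelimiters(String text) {
        return Arrays.asList(text.split(SPLIT_REGEX));
    }

    static List<String> matchTokens(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(text);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String rawMessage = "let's meet tomorrow at 9:30p? 7-8pm? i=you go (no Go!) [to do !]";
        System.out.println(splitOnDelimiters(rawMessage));
        System.out.println(matchTokens(rawMessage));
        System.out.println(splitOnDelimiters("9pm")); // [9, pm]
    }
}
```

The two approaches agree on the sample message but differ at the edges. The look-around in the split regex only fires going from a digit to a non-digit, so an input like pm9 stays one token when split, while the matcher separates it into pm and 9. String.split also yields an empty leading element when the text begins with a delimiter, which the matcher version never produces.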