Suppose I have a string that contains numbers (some with comma separators) mixed with text:
my_string = "2 Marine Cargo 14,642 10,528 16,016 more text 8,609 argA 2,106 argB"
I want to extract them into an array split into "numbers" and "word groups":
resultArray = {"2", "Marine Cargo", "14,642", "10,528", "16,016",
"more text", "8,609", "argA", "2,106", "argB"};
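Note 0: There may be multiple spaces between entries; these should be ignored.
Note 1: "Marine Cargo" and "more text" are not split into separate strings, because each is a group of words with no number between them. argA and argB are separate because there is a number between them.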
You can try this regular expression. Rather than splitting on a delimiter, it matches each token in turn:
([\d,]+|[a-zA-Z]+ *[a-zA-Z]*) //note the spacing between + and *.
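[a-zA-Z]+ *[a-zA-Z]* // matches a word, then any spaces (if present), then another word (if present).
Note that this covers phrases of at most two words, and when a single word is followed by spaces and then a number (as with argA), the match includes the trailing spaces.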
String regEx = "[0-9,]+|[a-zA-Z]+ *[a-zA-Z]*";
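If phrases of more than two words are possible, or trailing spaces in the matches are unwanted, the second alternative can be adjusted. The following is a minimal sketch, not part of the original answer: the class name PhraseTokenizer, the method tokenize, and the modified pattern are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PhraseTokenizer {
    // Hypothetical variant of the pattern: a run of digits/commas, or one or
    // more words separated by spaces. The group (?: +[a-zA-Z]+)* only consumes
    // spaces that are followed by another word, so no trailing space is captured.
    private static final Pattern TOKEN =
            Pattern.compile("[0-9,]+|[a-zA-Z]+(?: +[a-zA-Z]+)*");

    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "more text here" is a three-word phrase and stays in one token.
        System.out.println(
                tokenize("2 Marine Cargo 14,642 more text here 8,609 argA 2,106 argB"));
    }
}
```

Runs of several spaces inside a phrase are kept as they appear in the input; if that matters, each token could be normalized afterwards, for example with replaceAll(" +", " ").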
Use it like this:
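import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String input = "2 Marine Cargo 14,642 10,528 16,016 more text 8,609 argA 2,106 argB";
        System.out.println("Return Value :");
        Pattern pattern = Pattern.compile("[0-9,]+|[a-zA-Z]+ *[a-zA-Z]*");
        List<String> result = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            // trim(): " *" can capture a trailing space when no second word follows (e.g. "argA ")
            String token = m.group().trim();
            System.out.println(">" + token + "<");
            result.add(token);
        }
    }
}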
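The same matching loop can also be written as a reusable method and checked against the expected array from the question. This is only a sketch under some assumptions: it needs Java 9 or later for Matcher.results(), and the class name MatchCollector and the method tokenize are hypothetical names chosen for this example.

```java
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class MatchCollector {
    // Same pattern as in the answer.
    private static final Pattern TOKEN =
            Pattern.compile("[0-9,]+|[a-zA-Z]+ *[a-zA-Z]*");

    // Requires Java 9+ for Matcher.results().
    public static List<String> tokenize(String input) {
        return TOKEN.matcher(input)
                .results()
                .map(MatchResult::group)
                .map(String::trim) // drop the space that " *" may capture after a lone word
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String input = "2 Marine Cargo 14,642 10,528 16,016 more text 8,609 argA 2,106 argB";
        List<String> expected = List.of("2", "Marine Cargo", "14,642", "10,528", "16,016",
                "more text", "8,609", "argA", "2,106", "argB");
        // Compare the tokens with the array given in the question.
        System.out.println(tokenize(input).equals(expected));
    }
}
```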
Below is the detailed explanation of the regex, auto-generated by https://regex101.com:
1st Alternative [0-9,]+
Match a single character present in the list below [0-9,]+
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
0-9 a single character in the range between 0 (index 48) and 9 (index 57) (case sensitive)
, matches the character , literally (case sensitive)
2nd Alternative [a-zA-Z]+ *[a-zA-Z]*
Match a single character present in the list below [a-zA-Z]+
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
a-z a single character in the range between a (index 97) and z (index 122) (case sensitive)
A-Z a single character in the range between A (index 65) and Z (index 90) (case sensitive)
" " (space) matches the space character literally (case sensitive)
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
Match a single character present in the list below [a-zA-Z]*
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
a-z a single character in the range between a (index 97) and z (index 122) (case sensitive)
A-Z a single character in the range between A (index 65) and Z (index 90) (case sensitive)
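As the breakdown of the first alternative shows, [0-9,]+ accepts any run of digits and commas, so a stray comma on its own would also count as a "number". If the input may contain such commas, a stricter numeric pattern could be tried. The sketch below compares the two; the stricter pattern, the sample input, and the names StrictNumberDemo and matches are assumptions for illustration, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StrictNumberDemo {
    // Collect every match of the given regex, in order of appearance.
    public static List<String> matches(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "total , 14,642 items";
        // Original first alternative: any run of digits and commas,
        // so the lone "," is reported as a match too.
        System.out.println(matches("[0-9,]+", input));
        // Assumed stricter form: digits, then optional groups of a comma
        // followed by exactly three digits; the lone "," is skipped.
        System.out.println(matches("[0-9]+(?:,[0-9]{3})*", input));
    }
}
```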