How to split text by numbers and word groups

Siddhartha

Suppose I have a string that contains some comma-separated numbers along with text:

  my_string =  "2 Marine Cargo       14,642 10,528       16,016 more text 8,609 argA 2,106 argB"

I want to extract them into an array, split into "numbers" and "word groups":

 resultArray = {"2", "Marine Cargo", "14,642", "10,528", "16,016",
                "more text", "8,609", "argA", "2,106", "argB"};

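Note 0: There may be multiple spaces between entries; they should be ignored.

Note 1: "Marine Cargo" and "more text" are not split into separate strings, because each is a group of words with no number in between. argA and argB, on the other hand, are separate because there is a number between them.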

Angel

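You can try matching the tokens with this regular expression:

[0-9,]+|[a-zA-Z]+ *[a-zA-Z]* // note the space between + and *
  • [0-9,]+ // matches one or more digits and commas
  • [a-zA-Z]+ *[a-zA-Z]* // matches a word, then any spaces (if present), then a second word (if present); a group of three or more words would be split into several matches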

    String regEx = "[0-9,]+|[a-zA-Z]+ *[a-zA-Z]*";
    

You use it like this:

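import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SplitDemo {
  public static void main(String[] args) {
    String input = "2 Marine Cargo       14,642 10,528       16,016 more text 8,609 argA 2,106 argB";
    System.out.println("Return Value :");

    // Either a run of digits and commas, or a word optionally followed by spaces and a second word
    Pattern pattern = Pattern.compile("[0-9,]+|[a-zA-Z]+ *[a-zA-Z]*");

    List<String> result = new ArrayList<>();
    Matcher m = pattern.matcher(input);
    while (m.find()) {
      // trim() drops the trailing space the pattern captures when a single word precedes a number ("argA ")
      String token = m.group(0).trim();
      System.out.println(">" + token + "<");
      result.add(token);
    }
  }
}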
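The pattern above pairs at most two words and relies on trimming to remove a stray trailing space. As a minimal sketch of one possible variant (not part of the original answer), the word alternative can be written as a word followed by any number of "spaces plus word" repetitions. The class name `WordGroupTokenizer` and the method `tokenize` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordGroupTokenizer {
    // Hypothetical variant of the pattern: a run of digits and commas, or one
    // or more words separated by spaces. Every repetition ends in a letter,
    // so a match can never end with a space.
    private static final Pattern TOKEN =
            Pattern.compile("[0-9,]+|[a-zA-Z]+(?: +[a-zA-Z]+)*");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "2 Marine Cargo       14,642 10,528       16,016 more text 8,609 argA 2,106 argB";
        for (String token : tokenize(input)) {
            System.out.println(">" + token + "<");
        }
    }
}
```

With this variant no `trim()` call should be needed. Extra spaces between two words of the same group are kept inside the match, so they would have to be collapsed separately if that matters.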

Below is the detailed explanation of the pattern [0-9,]+|[a-zA-Z]+ *[a-zA-Z]* as auto-generated by https://regex101.com


1st Alternative [0-9,]+
Match a single character present in the list below [0-9,]+
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
0-9 a single character in the range between 0 (index 48) and 9 (index 57) (case sensitive)
, matches the character , literally (case sensitive)


2nd Alternative [a-zA-Z]+ *[a-zA-Z]*
Match a single character present in the list below [a-zA-Z]+
+ Quantifier — Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
a-z a single character in the range between a (index 97) and z (index 122) (case sensitive)
A-Z a single character in the range between A (index 65) and Z (index 90) (case sensitive)
' ' matches the space character literally (case sensitive)
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
Match a single character present in the list below [a-zA-Z]*
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
a-z a single character in the range between a (index 97) and z (index 122) (case sensitive)
A-Z a single character in the range between A (index 65) and Z (index 90) (case sensitive)
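Since the question asks for a split, a different approach is to split on the whitespace that touches a number instead of matching tokens. The following is only a sketch under the assumption that every number starts and ends with a digit (as in "14,642"). The names `DigitBoundarySplitter`, `BOUNDARY` and `split` are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class DigitBoundarySplitter {
    // Split on whitespace that is followed by a digit or preceded by a digit.
    // Whitespace between two words matches neither branch, so word groups
    // stay together.
    private static final String BOUNDARY = "\\s+(?=\\d)|(?<=\\d)\\s+";

    static List<String> split(String input) {
        // trim() avoids an empty leading element when the input starts with spaces
        return Arrays.asList(input.trim().split(BOUNDARY));
    }

    public static void main(String[] args) {
        String input = "2 Marine Cargo       14,642 10,528       16,016 more text 8,609 argA 2,106 argB";
        for (String token : split(input)) {
            System.out.println(">" + token + "<");
        }
    }
}
```

Unlike the matching approach, this keeps any character that is not whitespace inside a token, so punctuation or accented letters in the text would not be dropped.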
