I am working on some code and have run into an issue with splitting a string into tokens. Given the string below, I can split it into separate tokens:
String line = "hello world ; how are you ;";
such as hello, world, and ;
But when the string looks like:
String line2 = "hello world; how are you;";
I get tokens such as world; and you;, when I actually want the semicolon to be its own token. Thank you in advance for the help.
It is possible to split the second line on word boundaries (\b) and then remove the whitespace-only pieces with a filter:
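// Requires: import java.util.Arrays;
String line2 = "hello world; how are you;";
String[] arr = Arrays.stream(line2.split("\\b"))   // split at every word boundary
        .filter(s -> !s.matches("\\s+"))           // drop pieces that are only whitespace
        .toArray(String[]::new);
System.out.println(Arrays.toString(arr));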
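Output (the first semicolon token keeps its trailing space, because there is no word boundary between ; and the space that follows it):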
[hello, world, ; , how, are, you, ;]
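If the trailing space in the "; " token is unwanted, one option is to trim every piece and drop the ones that end up empty. The following is only a minimal sketch of that variation; the class name BoundarySplitDemo and the tokenize helper are hypothetical names chosen for illustration, not part of the original answer.

```java
import java.util.Arrays;

// Hypothetical demo class; the name and the tokenize helper are illustrative only.
class BoundarySplitDemo {

    // Splits at word boundaries, trims each piece, and drops pieces
    // that contained nothing but whitespace.
    static String[] tokenize(String line) {
        return Arrays.stream(line.split("\\b"))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String line2 = "hello world; how are you;";
        System.out.println(Arrays.toString(tokenize(line2)));
        // prints [hello, world, ;, how, are, you, ;]
    }
}
```

Trimming before filtering means a piece such as "; " becomes ";" and a piece such as " " becomes empty and is discarded, so a single filter covers both cases.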
Another option is to match the tokens themselves instead of splitting on a delimiter. A suitable regular expression is:
\w+|\S+
That is, at least one word character [0-9A-Za-z_] OR at least one non-space character:
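// Requires Java 9+ (Matcher.results()) and: import java.util.regex.Pattern;
String[] arr2 = Pattern.compile("\\w+|\\S+")
        .matcher(line2)
        .results()                  // stream of MatchResult, one per token
        .map(mr -> mr.group(0))     // the matched text
        .toArray(String[]::new);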
The result is the same, except that the first semicolon token has no trailing space: [hello, world, ;, how, are, you, ;]
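Matcher.results() was only added in Java 9. On Java 8 the same matching approach can be sketched with an explicit find() loop. The version below also swaps \S+ for [^\w\s]+, which is a deliberate change from the answer's pattern: \S+ also consumes word characters, so an input like "a;b" would yield the tokens a and ;b. The class name MatchTokenizer and its tokenize method are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; names are illustrative, not from the original answer.
class MatchTokenizer {

    // A token is either a run of word characters or a run of characters
    // that are neither word characters nor whitespace (e.g. punctuation).
    private static final Pattern TOKEN = Pattern.compile("\\w+|[^\\w\\s]+");

    static List<String> tokenize(String line) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(line);
        while (m.find()) {          // advance to the next token, if any
            tokens.add(m.group());  // the matched text
        }
        return tokens;
    }

    public static void main(String[] args) {
        // No space after the first semicolon, to show it is still split off.
        System.out.println(tokenize("hello world;how are you;"));
        // prints [hello, world, ;, how, are, you, ;]
    }
}
```

Compiling the pattern once as a constant avoids recompiling it on every call, which matters if many lines are tokenized.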