I have 2 text files:
File1: the format of this file is user_id tweet_id tweet_text
File1
60730027 6298443824 thank you echo park. you've changed A LOT, but as long as I'm getting paid to make you move, I'm still with it! 2009-12-03 02:54:10
60730027 6297282530 fat Albert Einstein goin in right now over here!!! 2009-12-03 01:35:22
File2
The format of this file is genome_id name ascii_name
4045417 Southwest Indent Southwest Indent
4045418 Southeast Point Southeast Point
Here is the code snippet that reads File1:
public void readfromFile() throws FileNotFoundException {
    Scanner inputStream;
    String source=null;
    FileInputStream file = new FileInputStream("file1.txt");
    String regex = "/[a-zA-Z ]+/";
    Scanner fileScan = new Scanner(file);
    while(fileScan.hasNextLine()){
        word = fileScan.nextLine();
        word = word.replaceAll(regex, "").toLowerCase();
        PrintWriter outputStreamName = new PrintWriter(new FileOutputStream("temp.txt"));
        outputStreamName.printf("%s",word);
    }
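My goal is first to replace the data in user_id, tweet_id, and genome_id with empty values, and then to convert uppercase text to lowercase. However, when this code processes file1, the text file is not changed at all, and I would like to understand why. When I print to the console instead, I do get output.
Expected output: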
thank you echo park youve changed a lot but as long as im getting paid to make you move im still with it
fat albert einstein goin in right now over here
Based on the expected output, you want to remove everything except letters, dots, and the single spaces between words.
[^a-zA-Z. ]+|(?<=\d)\s*(?=\d)|(?<=\D)\s*(?=\d)|(?<=\d)\s*(?=\D)
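Or try it without lookarounds (note that this variant also consumes the characters on either side of the whitespace, so it is not strictly equivalent):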
[^a-zA-Z. ]+|\d\s+\d|\D\s+\d|\d\s+\D
Here \s matches any whitespace character, [\r\n\t\f ].
Sample code:
String regex = "[^a-zA-Z. ]+|(?<=\\d)\\s*(?=\\d)|(?<=\\D)\\s*(?=\\d)|(?<=\\d)\\s*(?=\\D)";
// replaceAll returns a new String, so the result must be assigned
str = str.replaceAll(regex, "");
Output:
thank you echo park. youve changed A LOT but as long as Im getting paid to make you move Im still with it
fat Albert Einstein goin in right now over here
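The output above still contains uppercase letters, while the question also asks for lowercase. A minimal sketch of how the two steps might be combined follows; the class name RegexCleanDemo and the method clean are hypothetical names chosen for illustration, and the pattern is the one from the sample code above.

```java
import java.util.Locale;

public class RegexCleanDemo {
    // Pattern copied from the sample code above.
    private static final String REGEX =
        "[^a-zA-Z. ]+|(?<=\\d)\\s*(?=\\d)|(?<=\\D)\\s*(?=\\d)|(?<=\\d)\\s*(?=\\D)";

    // Hypothetical helper: strips ids, punctuation and timestamps, then lowercases.
    static String clean(String line) {
        // Strings are immutable, so the result of replaceAll is what we return.
        return line.replaceAll(REGEX, "").toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String line = "60730027 6297282530 fat Albert Einstein goin in right now over here!!! 2009-12-03 01:35:22";
        System.out.println(clean(line));
    }
}
```

Because the pattern keeps dots, the first sample tweet would still contain "park." rather than "park"; drop the dot from the character class if that is not wanted.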
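The pattern above also removes the apostrophe, which changes I'm and you've to Im and youve. To keep ' in the output, add it to the character class: [^a-zA-Z.' ]+
Better still, simply match all the words with [a-zA-Z']+
Sample code: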
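import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "60730027 6297282530 fat Albert Einstein goin in right now over here!!! 2009-12-03 01:35:22 ";
// Match runs of letters and apostrophes; digits and punctuation are skipped
Pattern p = Pattern.compile("[a-zA-Z']+");
Matcher m = p.matcher(str);
while (m.find()) {
    System.out.print(m.group()+" ");
}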
Output:
fat Albert Einstein goin in right now over here
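To get exactly the lowercase, apostrophe-free text shown in the question's expected output, the matched words can be post-processed before they are joined. This is only a sketch under that assumption; WordExtractDemo and extractWords are hypothetical names, not part of the original answer.

```java
import java.util.Locale;
import java.util.StringJoiner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordExtractDemo {
    // Same word pattern as in the sample code above.
    private static final Pattern WORD = Pattern.compile("[a-zA-Z']+");

    // Hypothetical helper: collects the words, drops apostrophes, lowercases,
    // and joins with single spaces (no trailing space).
    static String extractWords(String line) {
        StringJoiner joined = new StringJoiner(" ");
        Matcher m = WORD.matcher(line);
        while (m.find()) {
            String word = m.group().replace("'", "").toLowerCase(Locale.ROOT);
            if (!word.isEmpty()) {
                joined.add(word);
            }
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        String line = "60730027 6298443824 thank you echo park. you've changed A LOT, "
                + "but as long as I'm getting paid to make you move, I'm still with it! "
                + "2009-12-03 02:54:10";
        System.out.println(extractWords(line));
    }
}
```

Using a StringJoiner instead of printing each word followed by " " avoids the trailing space at the end of the line.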
Note: since you are checking for the next line with hasNextLine(), read a whole line as well.
Change:
source = inputStream.next();
to:
source = inputStream.nextLine();
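As for why the file stays unchanged: in the question's code the PrintWriter is created inside the loop and never closed, so temp.txt is truncated on every iteration and the buffered text may never be flushed to disk. In addition, Java regexes have no delimiters, so the slashes in "/[a-zA-Z ]+/" are matched literally. The following is a rough sketch of one way to put the pieces together, not the original author's code. TweetFileCleaner, clean and cleanFile are hypothetical names, and the simpler pattern [^a-zA-Z ]+ is swapped in here as an assumption (it also drops dots and apostrophes, as in the expected output).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

public class TweetFileCleaner {
    // Hypothetical helper: keep only letters and spaces, collapse runs of
    // whitespace, trim the ends, and lowercase.
    static String clean(String line) {
        return line.replaceAll("[^a-zA-Z ]+", "")
                   .replaceAll("\\s+", " ")
                   .trim()
                   .toLowerCase(Locale.ROOT);
    }

    // Opens the writer once, outside the loop; try-with-resources closes
    // (and therefore flushes) both streams even if an exception is thrown.
    static void cleanFile(Path in, Path out) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             PrintWriter writer = new PrintWriter(
                     Files.newBufferedWriter(out, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.println(clean(line));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // File names taken from the question; the input may be overridden via args.
        Path in = Paths.get(args.length > 0 ? args[0] : "file1.txt");
        if (!Files.exists(in)) {
            System.err.println("Input file not found: " + in);
            return;
        }
        cleanFile(in, Paths.get("temp.txt"));
    }
}
```

Writing to a separate file such as temp.txt, as the question does, leaves file1.txt untouched; replacing the original afterwards would be a separate step.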