I am trying to take a text file and read every word in it into a binary tree. The specific error I get is:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
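The text file I am reading was assigned by my professor for this project, so I know it should not cause any memory problems. I have never run into this kind of exception before and don't know where to start. Any help is appreciated. Here is my code: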
public class Tester {
public static void main(String[] args) throws FileNotFoundException {
Tester run = new Tester();
run.it();
}
public void it() throws FileNotFoundException {
BTree theTree = new BTree();
String str = this.readInFile();
String [] firstWords = this.breakIntoWords(str);
String [] finalWords = this.removeNullValues(firstWords);
for(int i = 0; i < finalWords.length; i++) {
theTree.add(finalWords[i]);
}
theTree.print();
}
public String readInFile() throws FileNotFoundException {
String myFile = "";
int numWords = 0;
Scanner myScan = new Scanner(new File("Dracula.txt"));
while(myScan.hasNext() == true) {
myFile += myScan.nextLine() + " ";
}
return myFile;
}
public String [] breakIntoWords(String myFile) {
String[] words = new String[myFile.length()];
String nextWord = "";
int position = 0;
int i = 0;
while(myFile.length() > position) {
char next = myFile.charAt(position);
next = Character.toLowerCase(next);
// First trim beginning
while (((next < 'a') || (next > 'z')) && !Character.isDigit(next)) {
position++;
next = myFile.charAt(position);
next = Character.toLowerCase(next);
}
// Now pull only letters or numbers until we hit a space
while(!Character.isWhitespace(next)) {
if (Character.isLetterOrDigit(next)) {
nextWord += myFile.charAt(position);
}
position++;
next = myFile.charAt(position);
}
words [i] = nextWord;
i++;
}
return words;
}
public String[] removeNullValues(String[] myWords) {
String[] justMyWords = new String[myWords.length];
for (int i = 0; i < myWords.length; i++) {
if (myWords[i] != null) {
justMyWords[i] = myWords[i];
}
}
return justMyWords;
}
}
Here is my BTree class:
public class BTree {
private BTNode root;
private int nodeCount;
public boolean add(String word) {
BTNode myNode = new BTNode(word);
if(root == null) {
root = myNode;
nodeCount++;
return true;
}
if(findNode(word)) {
int tmp = myNode.getNumInstance();
tmp++;
myNode.setNumInstance(tmp);
return false;
}
BTNode temp = root;
while(temp != null) {
if(word.compareTo(temp.getMyWord()) < 0) {
if(temp.getRightChild() == null) {
temp.setLeftChild(myNode);
nodeCount++;
return true;
} else {
temp = temp.getRightChild();
}
} else {
if(temp.getLeftChild() == null) {
temp.setLeftChild(myNode);
nodeCount++;
return true;
} else {
temp = temp.getLeftChild();
}
}
}
return false;
}
public boolean findNode(String word) {
return mySearch(root, word);
}
private boolean mySearch(BTNode root, String word) {
if (root == null) {
return false;
}
if ((root.getMyWord().compareTo(word) < 0)) {
return true;
} else {
if (word.compareTo(root.getMyWord()) > 0) {
return mySearch(root.getLeftChild(), word);
} else {
return mySearch(root.getRightChild(), word);
}
}
}
public void print() {
printTree(root);
}
private void printTree(BTNode root) {
if (root == null) {
System.out.print(".");
return;
}
printTree(root.getLeftChild());
System.out.print(root.getMyWord());
printTree(root.getRightChild());
}
public int wordCount() {
return nodeCount;
}
}
And my BTNode class:
public class BTNode {
private BTNode rightChild;
private BTNode leftChild;
private String myWord;
private int numWords;
private int numInstance;
private boolean uniqueWord;
private boolean isRoot;
private boolean isDeepest;
public BTNode(String myWord){
this.numInstance = 1;
this.myWord = myWord;
this.rightChild = null;
this.leftChild = null;
}
public String getMyWord() {
return myWord;
}
public void setMyWord(String myWord) {
this.myWord = myWord;
}
public BTNode getRightChild() {
return rightChild;
}
public void setRightChild(BTNode rightChild) {
this.rightChild = rightChild;
}
public BTNode getLeftChild() {
return leftChild;
}
public void setLeftChild(BTNode leftChild) {
this.leftChild = leftChild;
}
public int getnumWords() {
return numWords;
}
public void setnumWords(int numWords) {
this.numWords = numWords;
}
public boolean isUniqueWord() {
return uniqueWord;
}
public void setUniqueWord(boolean uniqueWord) {
this.uniqueWord = uniqueWord;
}
public boolean isRoot() {
return isRoot;
}
public void setRoot(boolean isRoot) {
this.isRoot = isRoot;
}
public boolean isDeepest() {
return isDeepest;
}
public void setDeepest(boolean isDeepest) {
this.isDeepest = isDeepest;
}
public int getNumInstance() {
return numInstance;
}
public void setNumInstance(int numInstance) {
this.numInstance = numInstance;
}
}
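A file this small should not be the cause of the OutOfMemoryError.
This is not what causes the error, but if you want to read the entire file into memory,
don't read it line by line and concatenate the strings. Each concatenation copies the whole string built so far, which slows your program down.
You can use: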
String myFile = new String(Files.readAllBytes(Paths.get("Dracula.txt")));
myFile = myFile.replaceAll("\r\n", " ");
return myFile;
That is not blazingly fast either, but it is faster.
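A slightly more defensive variant of the snippet above, as a sketch only. It assumes the file is UTF-8 encoded (the snippet above falls back to the platform default charset), it replaces every kind of line break rather than only `\r\n`, and it declares `IOException`, which `Files.readAllBytes` throws, so the `throws FileNotFoundException` clauses in the question would have to be widened. The class name `FileReading` and the `Path` parameter are illustrative and not part of the original code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class FileReading {
    // Reads the whole file in one call and turns every line break into a space.
    // Assumption: the file is UTF-8 encoded.
    static String readInFile(Path path) throws IOException {
        String text = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        // \R matches any line break sequence (\r\n, \n, \r, ...); available since Java 8.
        return text.replaceAll("\\R", " ");
    }

    public static void main(String[] args) throws IOException {
        // Falls back to the file name used in the question when no argument is given.
        String text = readInFile(Paths.get(args.length > 0 ? args[0] : "Dracula.txt"));
        System.out.println(text.length() + " characters read");
    }
}
```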
Now for the errors
The words array is too large
public String[] breakIntoWords(String myFile) {
String[] words = new String[myFile.length()];
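You define words as an array with one slot per character in the file. If your variable names mean what they say, that is far too large: what you need is an array whose length is the number of words in the file.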
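One way to avoid guessing the size up front is to collect the words in a list that grows on demand. The following is a minimal sketch, not the original author's method: the class name `WordList` is made up for illustration, and it swaps the hand-written character loop for `String.split` on whitespace followed by stripping everything that is not a letter or digit, which is assumed to match the intent of the question's loop.

```java
import java.util.ArrayList;
import java.util.List;

class WordList {
    // Collects the words of a text into a list that holds exactly as many
    // entries as there are words, so there are no unused null slots.
    static List<String> breakIntoWords(String text) {
        List<String> words = new ArrayList<>();
        // Tokens are separated by runs of whitespace.
        for (String token : text.split("\\s+")) {
            // Keep only letters and digits, e.g. "don't" becomes "dont".
            String cleaned = token.replaceAll("[^\\p{L}\\p{Nd}]", "");
            if (!cleaned.isEmpty()) {
                words.add(cleaned);
            }
        }
        return words;
    }
}
```

Because the list never contains null entries, a separate pass such as removeNullValues is no longer needed. As written in the question, that method copies each entry into a new array of the same length at the same index, so the null slots remain.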
nextWord is never reset (this is the cause of the OutOfMemoryError)
// Now pull only letters or numbers until we hit a space
while (!Character.isWhitespace(next)) {
if (Character.isLetterOrDigit(next)) {
nextWord += myFile.charAt(position);
}
position++;
next = myFile.charAt(position);
}
words[i] = nextWord;
i++;
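Because nextWord is never set back to "" after it has been assigned to words[i], it keeps growing word by word, and the array contents end up looking like this: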
words[0] = "Word1"
words[1] = "Word1Word2"
words[2] = "Word1Word2Word3"
As you can imagine, this uses up an enormous amount of memory.
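A minimal patch of the loop could look like the sketch below. It keeps the array-based signature from the question, starts a fresh buffer for every word, and guards every `charAt` call with a bounds check, since the unguarded `position++` followed by `charAt(position)` in the question can run past the end of the string. The class name `WordBreaker` is hypothetical, and using `Character.isLetterOrDigit` to find the start of a word is a simplification of the original range test.

```java
import java.util.Arrays;

class WordBreaker {
    static String[] breakIntoWords(String myFile) {
        int length = myFile.length();
        // Every word needs at least one character plus a whitespace separator,
        // so length / 2 + 1 slots are always enough.
        String[] words = new String[length / 2 + 1];
        int count = 0;
        int position = 0;
        while (position < length) {
            // Skip everything that cannot start a word.
            while (position < length && !Character.isLetterOrDigit(myFile.charAt(position))) {
                position++;
            }
            // Fresh buffer for each word: this is the reset the original lacked.
            StringBuilder nextWord = new StringBuilder();
            // Pull only letters and digits until whitespace or the end of the text.
            while (position < length && !Character.isWhitespace(myFile.charAt(position))) {
                char next = myFile.charAt(position);
                if (Character.isLetterOrDigit(next)) {
                    nextWord.append(next);
                }
                position++;
            }
            if (nextWord.length() > 0) {
                words[count++] = nextWord.toString();
            }
        }
        // Trim the unused tail so the result contains no null entries.
        return Arrays.copyOf(words, count);
    }
}
```

With the reset in place each array entry holds a single word, so the total memory used grows with the size of the file instead of with the square of the word count.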