MALLET topic model example does not compile

飞鼠

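I want to use MALLET from Java (rather than from the command line), so I added the jar to my project and took the example code from http://mallet.cs.umass.edu/topics-devel.php. The code compiles, but when I run it I get the following error: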

Exception in thread "main" java.lang.NoClassDefFoundError: gnu/trove/TObjectIntHashMap
    at cc.mallet.types.Alphabet.<init>(Alphabet.java:51)
    at cc.mallet.types.Alphabet.<init>(Alphabet.java:70)
    at cc.mallet.pipe.TokenSequence2FeatureSequence.<init>(TokenSequence2FeatureSequence.java:35)
    at mallet.TopicModel.main(TopicModel.java:25)
Caused by: java.lang.ClassNotFoundException: gnu.trove.TObjectIntHashMap
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.net.URLClassLoader$1.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
    at java.lang.ClassLoader.loadClass(Unknown Source)
    ... 4 more

I'm not sure what is causing the error. Can anyone help? Here is the code:

package mallet;

import cc.mallet.util.*;
import cc.mallet.types.*;
import cc.mallet.pipe.*;
import cc.mallet.pipe.iterator.*;
import cc.mallet.topics.*;

import java.util.*;
import java.util.regex.*;
import java.io.*;

public class TopicModel {

public static void main(String[] args) throws Exception {

    String filePath = "D:/ap.txt";
    // Begin by importing documents from text to feature sequences
    ArrayList<Pipe> pipeList = new ArrayList<Pipe>();

    // Pipes: lowercase, tokenize, remove stopwords, map to features
    pipeList.add( new CharSequenceLowercase() );
    pipeList.add( new CharSequence2TokenSequence(Pattern.compile("\\p{L}[\\p{L}\\p{P}]+\\p{L}")) );
    pipeList.add( new TokenSequenceRemoveStopwords(new File("stoplists/en.txt"), "UTF-8", false, false, false) );
    pipeList.add( new TokenSequence2FeatureSequence() );

    InstanceList instances = new InstanceList (new SerialPipes(pipeList));

    Reader fileReader = new InputStreamReader(new FileInputStream(new File(filePath)), "UTF-8");
    instances.addThruPipe(new CsvIterator (fileReader, Pattern.compile("^(\\S*)[\\s,]*(\\S*)[\\s,]*(.*)$"),
                                           3, 2, 1)); // data, label, name fields

    // Create a model with 100 topics, alpha_t = 0.01, beta_w = 0.01
    //  Note that the first parameter is passed as the sum over topics, while
    //  the second is the parameter for a single dimension of the Dirichlet prior.
    int numTopics = 100;
    ParallelTopicModel model = new ParallelTopicModel(numTopics, 1.0, 0.01);

    model.addInstances(instances);

    // Use two parallel samplers, which each look at one half the corpus and combine
    //  statistics after every iteration.
    model.setNumThreads(2);

    // Run the model for 50 iterations and stop (this is for testing only, 
    //  for real applications, use 1000 to 2000 iterations)
    model.setNumIterations(50);
    model.estimate();

    // Show the words and topics in the first instance

    // The data alphabet maps word IDs to strings
    Alphabet dataAlphabet = instances.getDataAlphabet();

    FeatureSequence tokens = (FeatureSequence) model.getData().get(0).instance.getData();
    LabelSequence topics = model.getData().get(0).topicSequence;

    Formatter out = new Formatter(new StringBuilder(), Locale.US);
    for (int position = 0; position < tokens.getLength(); position++) {
        out.format("%s-%d ", dataAlphabet.lookupObject(tokens.getIndexAtPosition(position)), topics.getIndexAtPosition(position));
    }
    System.out.println(out);

    // Estimate the topic distribution of the first instance, 
    //  given the current Gibbs state.
    double[] topicDistribution = model.getTopicProbabilities(0);

    // Get an array of sorted sets of word ID/count pairs
    ArrayList<TreeSet<IDSorter>> topicSortedWords = model.getSortedWords();

    // Show top 5 words in topics with proportions for the first document
    for (int topic = 0; topic < numTopics; topic++) {
        Iterator<IDSorter> iterator = topicSortedWords.get(topic).iterator();

        out = new Formatter(new StringBuilder(), Locale.US);
        out.format("%d\t%.3f\t", topic, topicDistribution[topic]);
        int rank = 0;
        while (iterator.hasNext() && rank < 5) {
            IDSorter idCountPair = iterator.next();
            out.format("%s (%.0f) ", dataAlphabet.lookupObject(idCountPair.getID()), idCountPair.getWeight());
            rank++;
        }
        System.out.println(out);
    }

    // Create a new instance with high probability of topic 0
    StringBuilder topicZeroText = new StringBuilder();
    Iterator<IDSorter> iterator = topicSortedWords.get(0).iterator();

    int rank = 0;
    while (iterator.hasNext() && rank < 5) {
        IDSorter idCountPair = iterator.next();
        topicZeroText.append(dataAlphabet.lookupObject(idCountPair.getID()) + " ");
        rank++;
    }

    // Create a new instance named "test instance" with empty target and source fields.
    InstanceList testing = new InstanceList(instances.getPipe());
    testing.addThruPipe(new Instance(topicZeroText.toString(), null, "test instance", null));

    TopicInferencer inferencer = model.getInferencer();
    double[] testProbabilities = inferencer.getSampledDistribution(testing.get(0), 10, 1, 5);
    System.out.println("0\t" + testProbabilities[0]);
}

}

飞鼠

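I solved the problem. First I tried importing Trove 3.1 into Eclipse, but that did not work. Then I noticed that the Mallet folder contains a "lib" folder, so I added all of the jar files in it to my Eclipse project. Bingo! It works.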
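
For anyone hitting the same NoClassDefFoundError, here is a small sketch for checking which dependency classes the JVM can actually see at run time. It is not part of MALLET; the class name ClasspathCheck and its two helper methods are made up for illustration. My understanding is that Trove 3.x moved this class to gnu.trove.map.hash.TObjectIntHashMap, which would explain why the 3.1 jar did not satisfy the gnu.trove.TObjectIntHashMap lookup from the stack trace, but treat that as an assumption and check it against your own jars.

```java
import java.security.CodeSource;

public class ClasspathCheck {

    // Returns true if the named class can be located by this class's loader.
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Returns the jar or directory a class was loaded from, or a short marker string.
    static String locationOf(String className) {
        try {
            Class<?> c = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK classes usually have no code source
            return src == null ? "(JDK / bootstrap)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not found)";
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "gnu.trove.TObjectIntHashMap",          // the class named in the stack trace
            "gnu.trove.map.hash.TObjectIntHashMap", // assumed Trove 3.x location
            "cc.mallet.types.Alphabet"              // MALLET itself
        };
        for (String name : names) {
            System.out.println(name + " -> " + locationOf(name));
        }
    }
}
```

Run it with the same build path as the topic model project. If the first name prints "(not found)", the Trove jar that MALLET expects is still missing from the classpath, and adding the jars from Mallet's "lib" folder should fix it.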
