CMUSphinx never recognizes any words in an audio file

kevinn2065

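Sphinx does not seem to recognize or process the audio file: it accepts the audio stream and then spits out an empty array (the SpeechResult result). I don't think there is anything wrong with the audio files I am using, since I have tried several and none of them work. Does anyone have an audio file that they know works? And is there anything obvious that could cause a stream to produce no transcription?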

public static void main(String args[]) throws IOException {
    Configuration configuration = new Configuration();
    configuration.setAcousticModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us");
    configuration.setDictionaryPath("resource:/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict");
    configuration.setLanguageModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us.lm.dmp");

    StreamSpeechRecognizer recognizer = new StreamSpeechRecognizer(configuration);
    //recognizer.startRecognition(new FileInputStream("E:/1video/hello-5.mp3"));

    File file = new File("E:/1video/bargain_not.wav");
    // A single stream is enough; a second, unused FileInputStream would only leak a file handle.
    InputStream is = new FileInputStream(file);

    //is = AutomaticSpeechRecognition.class.getResourceAsStream("/edu/cmu/sphinx/demo/aligner/10001-90210-01803.wav");
    recognizer.startRecognition(is);
    SpeechResult result = null;
    while((result = recognizer.getResult()) != null) {
        System.out.println(result.getResult()); 
        System.out.println(result.getHypothesis());

        System.out.println(result.getWords()); 
    }
    //result = recognizer.getResult();
    //System.out.println(result);
    //System.out.println(result.toString());
    //System.out.println(result.getWords());
    /*for (WordResult wordResult : result.getWords())
    {
        System.out.println(wordResult);
    }*/
    recognizer.stopRecognition();


}

Here is the output from running it. It does not seem to report any failure:

 09:31:13.430 INFO unitManager          CI Unit: *+NSN+
 09:31:13.433 INFO unitManager          CI Unit: *+SPN+
 09:31:13.433 INFO unitManager          CI Unit: AA
 09:31:13.434 INFO unitManager          CI Unit: AE
 09:31:13.434 INFO unitManager          CI Unit: AH
 09:31:13.434 INFO unitManager          CI Unit: AO
 09:31:13.434 INFO unitManager          CI Unit: AW
 09:31:13.434 INFO unitManager          CI Unit: AY
 09:31:13.434 INFO unitManager          CI Unit: B
 09:31:13.434 INFO unitManager          CI Unit: CH
 09:31:13.434 INFO unitManager          CI Unit: D
 09:31:13.434 INFO unitManager          CI Unit: DH
 09:31:13.434 INFO unitManager          CI Unit: EH
 09:31:13.435 INFO unitManager          CI Unit: ER
 09:31:13.435 INFO unitManager          CI Unit: EY
 09:31:13.435 INFO unitManager          CI Unit: F
 09:31:13.435 INFO unitManager          CI Unit: G
 09:31:13.435 INFO unitManager          CI Unit: HH
 09:31:13.435 INFO unitManager          CI Unit: IH
 09:31:13.435 INFO unitManager          CI Unit: IY
 09:31:13.435 INFO unitManager          CI Unit: JH
 09:31:13.435 INFO unitManager          CI Unit: K
 09:31:13.435 INFO unitManager          CI Unit: L
 09:31:13.435 INFO unitManager          CI Unit: M
 09:31:13.436 INFO unitManager          CI Unit: N
 09:31:13.436 INFO unitManager          CI Unit: NG
 09:31:13.436 INFO unitManager          CI Unit: OW
 09:31:13.436 INFO unitManager          CI Unit: OY
 09:31:13.436 INFO unitManager          CI Unit: P
 09:31:13.436 INFO unitManager          CI Unit: R
 09:31:13.436 INFO unitManager          CI Unit: S
 09:31:13.436 INFO unitManager          CI Unit: SH
 09:31:13.436 INFO unitManager          CI Unit: T
 09:31:13.436 INFO unitManager          CI Unit: TH
 09:31:13.436 INFO unitManager          CI Unit: UH
 09:31:13.437 INFO unitManager          CI Unit: UW
 09:31:13.437 INFO unitManager          CI Unit: V
 09:31:13.437 INFO unitManager          CI Unit: W
 09:31:13.437 INFO unitManager          CI Unit: Y
 09:31:13.437 INFO unitManager          CI Unit: Z
 09:31:13.437 INFO unitManager          CI Unit: ZH
 09:31:14.014 INFO autoCepstrum         Cepstrum component auto-configured      as follows: autoCepstrum {MelFrequencyFilterBank, Denoise,      DiscreteCosineTransform2, Lifter}
 09:31:14.030 INFO dictionary           Loading dictionary from: jar:file:/C:/Users/Kevin/.m2/repository/edu/cmu/sphinx/sphinx4-data/1.0-SNAPSHOT/sphinx4-data-1.0-SNAPSHOT.jar!/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict
 09:31:14.132 INFO dictionary           Loading filler dictionary from: jar:file:/C:/Users/Kevin/.m2/repository/edu/cmu/sphinx/sphinx4-data/1.0-SNAPSHOT/sphinx4-data-1.0-SNAPSHOT.jar!/edu/cmu/sphinx/models/en-us/en-us/noisedict
 09:31:14.132 INFO acousticModelLoader  Loading tied-state acoustic model from: jar:file:/C:/Users/Kevin/.m2/repository/edu/cmu/sphinx/sphinx4-data/1.0-SNAPSHOT/sphinx4-data-1.0-SNAPSHOT.jar!/edu/cmu/sphinx/models/en-us/en-us
 09:31:14.133 INFO acousticModelLoader  Pool means Entries: 16128
 09:31:14.133 INFO acousticModelLoader  Pool variances Entries: 16128
 09:31:14.133 INFO acousticModelLoader  Pool transition_matrices Entries: 42
 09:31:14.133 INFO acousticModelLoader  Pool senones Entries: 5126
 09:31:14.133 INFO acousticModelLoader  Gaussian weights: mixture_weights. Entries: 15378
 09:31:14.133 INFO acousticModelLoader  Pool senones Entries: 5126
 09:31:14.133 INFO acousticModelLoader  Context Independent Unit Entries: 42
 09:31:14.133 INFO acousticModelLoader  HMM Manager: 137095 hmms
 09:31:14.134 INFO acousticModel        CompositeSenoneSequences: 0
 09:31:14.134 INFO largeTrigramModel    Loading n-gram language model from: jar:file:/C:/Users/Kevin/.m2/repository/edu/cmu/sphinx/sphinx4-data/1.0-SNAPSHOT/sphinx4-data-1.0-SNAPSHOT.jar!/edu/cmu/sphinx/models/en-us/en-us.lm.dmp
 09:31:14.807 INFO largeTrigramModel    1-grams: 19794
 09:31:14.807 INFO largeTrigramModel    2-grams: 1377200
 09:31:14.807 INFO largeTrigramModel    3-grams: 3178194
 09:31:15.582 INFO lexTreeLinguist      Max CI Units 43
 09:31:15.583 INFO lexTreeLinguist      Unit table size 79507
 09:31:15.585 INFO speedTracker         # ----------------------------- Timers----------------------------------------
 09:31:15.585 INFO speedTracker         # Name               Count   CurTime   MinTime   MaxTime   AvgTime   TotTime   
 09:31:15.586 INFO speedTracker         Load Dictionary      1       0.1020s   0.1020s   0.1020s   0.1020s   0.1020s   
 09:31:15.586 INFO speedTracker         Load LM              1       0.6730s   0.6730s   0.6730s   0.6730s   0.6730s   
 09:31:15.586 INFO speedTracker         Compile              1       0.7760s   0.7760s   0.7760s   0.7760s   0.7760s   
 09:31:15.586 INFO speedTracker         Load AM              1       1.5450s   1.5450s   1.5450s   1.5450s   1.5450s   
 09:31:15.608 INFO speedTracker            This  Time Audio: 1.94s  Proc: 0.01s  Speed: 0.00 X real time
 09:31:15.608 INFO speedTracker            Total Time Audio: 1.94s  Proc: 0.01s 0.00 X real time
 09:31:15.609 INFO memoryTracker           Mem  Total: 454.75 Mb  Free: 262.35 Mb
 09:31:15.609 INFO memoryTracker           Used: This: 192.40 Mb  Avg: 192.40 Mb  Max: 192.40 Mb
 09:31:15.610 INFO largeTrigramModel    LM Cache Size: 0 Hits: 0 Misses: 0
 <s> </s>

Travis

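As Nikolay Shmyrev says, the file must be a 16 kHz, 16-bit, mono MS WAV file. Such a file can be recorded with Audacity: set the project to 16 kHz and mono, then use File > Export and make sure to select WAV (Microsoft) signed 16-bit PCM.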
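
A quick way to check a file against these requirements before handing it to Sphinx is to read its header with the standard `javax.sound.sampled` API. The following is only a minimal sketch: the class name `WavFormatCheck` and the helper `isSphinxCompatible` are illustrative and not part of Sphinx, and the little-endian check is an assumption based on WAV PCM data normally being little-endian.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.File;
import java.io.IOException;

// Hypothetical helper class, not part of the Sphinx API.
public class WavFormatCheck {

    /** Returns true for 16 kHz, 16-bit, mono, signed little-endian PCM. */
    static boolean isSphinxCompatible(AudioFormat f) {
        return AudioFormat.Encoding.PCM_SIGNED.equals(f.getEncoding())
                && f.getSampleRate() == 16000f
                && f.getSampleSizeInBits() == 16
                && f.getChannels() == 1
                && !f.isBigEndian(); // assumption: WAV PCM is little-endian
    }

    public static void main(String[] args)
            throws IOException, UnsupportedAudioFileException {
        if (args.length != 1) {
            System.err.println("usage: java WavFormatCheck <file.wav>");
            return;
        }
        File file = new File(args[0]);
        // Java Sound parses the header and reports the actual format.
        try (AudioInputStream in = AudioSystem.getAudioInputStream(file)) {
            AudioFormat format = in.getFormat();
            System.out.println("Detected format: " + format);
            System.out.println(isSphinxCompatible(format)
                    ? "Matches 16 kHz / 16-bit / mono PCM."
                    : "Does not match 16 kHz / 16-bit / mono PCM; re-export the file.");
        }
    }
}
```

Running it on the file from the question (for example `java WavFormatCheck E:/1video/bargain_not.wav`) prints the detected format, so a 44.1 kHz or stereo recording shows up immediately.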
