Sphinx-4 1.0 Beta
Sphinx-4 1.0 Beta Ranking & Summary
File size: 28.9 MB
Platform: Windows All
License: GPL
Price:
Downloads: 869
Date added: 2007-07-27
Publisher: Paul Lamere
Sphinx-4 1.0 Beta description
Sphinx-4 is a state-of-the-art speech recognition system written entirely in the Java programming language.
Sphinx-4 was created via a joint collaboration between the Sphinx group at Carnegie Mellon University, Mitsubishi Electric Research Labs (MERL), Sun Microsystems Laboratories, and Hewlett Packard (HP), with contributions from the University of California at Santa Cruz (UCSC) and the Massachusetts Institute of Technology (MIT).
Sphinx-4 started out as a port of Sphinx-3 to the Java programming language, but evolved into a recognizer designed to be much more flexible than Sphinx-3, thus becoming an excellent platform for speech research.
Sphinx-4 is a very flexible system capable of performing many different types of recognition tasks. As such, it is difficult to characterize its speed and accuracy with just a few simple numbers.
Instead, we regularly run regression tests on Sphinx-4 to determine how it performs under a variety of tasks. The tasks are as follows (each is more difficult than the one before it):
- Isolated Digits (TI46): Runs Sphinx-4 with pre-recorded test data to gather performance metrics for recognizing just one word at a time. The vocabulary is merely the spoken digits from 0 through 9, with a single utterance containing just one digit. (TI46 refers to the "NIST CD-ROM Version of the Texas Instruments-developed 46-Word Speaker-Dependent Isolated Word Speech Database".)
- Connected Digits (TIDIGITS): Extends the Isolated Digits test to recognize more than one word at a time (i.e., continuous speech). The vocabulary is merely the spoken digits from 0 through 9, with a single utterance containing a sequence of digits. (TIDIGITS refers to the "NIST CD-ROM Version of the Texas Instruments-developed Studio Quality Speaker-Independent Connected-Digit Corpus".)
- Small Vocabulary (AN4): Extends the vocabulary to approximately 100 words, with input data that includes both spoken words and words spelled out letter by letter.
- Medium Vocabulary (RM1): Extends the vocabulary to approximately 1,000 words.
- Medium Vocabulary (WSJ5K): Extends the vocabulary to approximately 5,000 words.
- Medium Vocabulary (WSJ20K): Extends the vocabulary to approximately 20,000 words.
- Large Vocabulary (HUB4): Extends the vocabulary to approximately 64,000 words.
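The listing does not say which metric these regression tests report. Word error rate (WER) is a common way to score recognition accuracy on tasks like these, so the following is only an illustrative sketch of how a recognized transcript could be scored against a reference. The class and method names (`WordErrorRate`, `tokenize`, `editDistance`, `wordErrorRate`) are hypothetical and are not part of Sphinx-4.

```java
// Illustrative sketch only: not Sphinx-4 code. All names here are hypothetical.
class WordErrorRate {

    // Splits a transcript into words on whitespace; an empty transcript yields no words.
    static String[] tokenize(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    // Word-level Levenshtein distance: the minimum number of substitutions,
    // insertions, and deletions needed to turn the reference into the hypothesis.
    static int editDistance(String[] ref, String[] hyp) {
        int[][] d = new int[ref.length + 1][hyp.length + 1];
        for (int i = 0; i <= ref.length; i++) d[i][0] = i; // delete all reference words
        for (int j = 0; j <= hyp.length; j++) d[0][j] = j; // insert all hypothesis words
        for (int i = 1; i <= ref.length; i++) {
            for (int j = 1; j <= hyp.length; j++) {
                int cost = ref[i - 1].equals(hyp[j - 1]) ? 0 : 1;
                d[i][j] = Math.min(
                        Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                        d[i - 1][j - 1] + cost);
            }
        }
        return d[ref.length][hyp.length];
    }

    // WER = (substitutions + insertions + deletions) / number of reference words.
    static double wordErrorRate(String reference, String hypothesis) {
        String[] ref = tokenize(reference);
        if (ref.length == 0) {
            throw new IllegalArgumentException("reference transcript is empty");
        }
        return (double) editDistance(ref, tokenize(hypothesis)) / ref.length;
    }

    public static void main(String[] args) {
        // A connected-digits style utterance with one substitution and one deletion.
        System.out.println(wordErrorRate("one two three four", "one too three")); // prints 0.5
    }
}
```

For an isolated-digit task each utterance is a single word, so this measure reduces to the fraction of utterances recognized incorrectly.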
Main features:
- Live mode and batch mode speech recognizers, capable of recognizing discrete and continuous speech.
- Generalized pluggable front end architecture. Includes pluggable implementations of preemphasis, Hamming window, FFT, Mel frequency filter bank, discrete cosine transform, cepstral mean normalization, and extraction of cepstra, delta cepstra, and double delta cepstra features.
- Generalized pluggable language model architecture. Includes pluggable language model support for ASCII and binary versions of unigram, bigram, trigram, Java Speech API Grammar Format (JSGF), and ARPA-format FST grammars.
- Generalized acoustic model architecture. Includes pluggable support for Sphinx-3 acoustic models.
- Generalized search management. Includes pluggable support for breadth first and word pruning searches.
- Utilities for post-processing recognition results, including obtaining confidence scores, generating lattices and embedding ECMAScript into JSGF tags.
- Standalone tools. Includes tools for displaying waveforms and spectrograms and generating features from audio.
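To make the front end stages named above more concrete, here is a minimal sketch of the first two of them, preemphasis and Hamming windowing, written from their textbook definitions. It does not use the Sphinx-4 API: the class `FrontEndSketch`, its method names, and the preemphasis factor of 0.97 are assumptions chosen for illustration.

```java
// Illustrative sketch only: textbook definitions, not the Sphinx-4 implementation.
class FrontEndSketch {

    // Preemphasis filter y[n] = x[n] - alpha * x[n-1], which boosts high frequencies.
    // The first sample has no predecessor and is passed through unchanged.
    static double[] preemphasize(double[] samples, double alpha) {
        double[] out = new double[samples.length];
        if (samples.length == 0) {
            return out;
        }
        out[0] = samples[0];
        for (int n = 1; n < samples.length; n++) {
            out[n] = samples[n] - alpha * samples[n - 1];
        }
        return out;
    }

    // Hamming window coefficients w[n] = 0.54 - 0.46 * cos(2 * pi * n / (N - 1)).
    static double[] hammingWindow(int size) {
        double[] w = new double[size];
        if (size == 1) {
            w[0] = 1.0; // avoid dividing by zero for a one-sample window
            return w;
        }
        for (int n = 0; n < size; n++) {
            w[n] = 0.54 - 0.46 * Math.cos(2.0 * Math.PI * n / (size - 1));
        }
        return w;
    }

    // Multiplies one frame of audio by the window, sample by sample.
    static double[] applyWindow(double[] frame, double[] window) {
        if (frame.length != window.length) {
            throw new IllegalArgumentException("frame and window lengths differ");
        }
        double[] out = new double[frame.length];
        for (int n = 0; n < frame.length; n++) {
            out[n] = frame[n] * window[n];
        }
        return out;
    }

    public static void main(String[] args) {
        double[] frame = {1.0, 1.0, 1.0, 1.0, 1.0};
        // 0.97 is an assumed, commonly used preemphasis factor.
        double[] emphasized = preemphasize(frame, 0.97);
        double[] windowed = applyWindow(emphasized, hammingWindow(frame.length));
        for (double v : windowed) {
            System.out.println(v);
        }
    }
}
```

In a pluggable design like the one described, each stage would take the previous stage's output as its input, so later steps such as the FFT and Mel filter bank would consume the windowed frames produced here.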
System requirements:
- Java 2 SDK, Standard Edition 5.0 or better
- Ant 1.6.0