[needs-packaging] CMUSphinx
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Ubuntu | Confirmed | Wishlist | Unassigned | |
Bug Description
URL:
http://
http://
Description:
The Sphinx Group at Carnegie Mellon University is committed to releasing the long-time, DARPA-funded Sphinx projects widely, in order to stimulate the creation of speech-using tools and applications, and to advance the state of the art both directly in speech recognition and in related areas, including dialog systems and speech synthesis.
The Sphinx Group has been supported for many years by funding from the Defense Advanced Research Projects Agency, and the recognition engines to be released are those that the group used for the various DARPA projects and their respective evaluations.
Recent supporters of the project also include Telefónica I & D, Sun Microsystems, and Mitsubishi Electric Research Labs.
The licensing terms for the Sphinx engines and tools are derived from BSD, and based, in particular, upon the license for the Apache web server. There is no restriction against commercial use or redistribution. (License terms for CMU Sphinx)
The packages that the CMU Sphinx Group is releasing are a set of reasonably mature, world-class speech components that provide a basic level of technology to anyone interested in creating speech-using applications without the once-prohibitive initial investment cost in research and development; the same components are open to peer review by all researchers in the field, and are used for linguistic research as well.
Note however that Sphinx is not a final product. Those with a certain level of expertise can achieve great results with the versions of Sphinx available here, but a naive user will certainly need further help. In other words, the software available here is not meant for users with no experience in speech, but for expert users.
This site will be the canonical location for the release of the Sphinx trainers, recognizers, acoustic and language models, and documentation.
Capabilities
======
Live mode and batch mode speech recognizers, capable of recognizing discrete and continuous speech.
Generalized pluggable front end architecture. Includes pluggable implementations of preemphasis, Hamming window, FFT, Mel frequency filter bank, discrete cosine transform, cepstral mean normalization, and feature extraction of cepstra, delta cepstra, and double delta cepstra features.
Generalized pluggable language model architecture. Includes pluggable language model support for ASCII and binary versions of unigram, bigram, trigram, Java Speech API Grammar Format (JSGF), and ARPA-format FST grammars.
Generalized acoustic model architecture. Includes pluggable support for Sphinx-3 acoustic models.
Generalized search management. Includes pluggable support for breadth first and word pruning searches.
Utilities for post-processing recognition results, including obtaining confidence scores, generating lattices and embedding ECMAScript into JSGF tags.
Standalone tools. Includes tools for displaying waveforms and spectrograms and generating features from audio.
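To give a rough idea of what a "pluggable front end" means in the list above, here is a minimal Python sketch of three of the named stages (preemphasis, Hamming window, cepstral mean normalization) chained as interchangeable functions. This is not Sphinx code and does not use the Sphinx API: the function names, the `run_pipeline` helper, and the 0.97 preemphasis factor are all assumptions chosen for illustration.

```python
import math


def preemphasize(samples, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1]; the first sample is passed through.

    alpha=0.97 is an illustrative value, not taken from the Sphinx sources.
    """
    if not samples:
        return []
    out = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        out.append(cur - alpha * prev)
    return out


def hamming_window(frame):
    """Multiply a frame by the standard Hamming window (0.54 - 0.46 cos)."""
    n = len(frame)
    if n < 2:
        return list(frame)
    return [
        x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
        for i, x in enumerate(frame)
    ]


def cepstral_mean_normalize(cepstra):
    """Subtract the per-coefficient mean, computed over all frames."""
    if not cepstra:
        return []
    dims = len(cepstra[0])
    means = [sum(f[d] for f in cepstra) / len(cepstra) for d in range(dims)]
    return [[f[d] - means[d] for d in range(dims)] for f in cepstra]


def run_pipeline(data, stages):
    """Hypothetical helper: feed data through each stage in order."""
    for stage in stages:
        data = stage(data)
    return data
```

A stage can be swapped or dropped by editing the list passed to the helper, for example `run_pipeline(samples, [preemphasize, hamming_window])`. A real front end would also split the signal into overlapping frames and compute the FFT, Mel filter bank, and DCT, which are omitted here.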
License:
The licensing terms for the Sphinx engines and tools are derived from BSD, and based, in particular, upon the license for the Apache web server. There is no restriction against commercial use or redistribution.
(License terms for CMU Sphinx)
summary:
- [needs-packaging] CMUSphinx - speech recognition and synthesis project
+ [needs-packaging] CMUSphinx
Changed in ubuntu:
status: New → Confirmed
Packages sphinx2-bin, sphinx2-hmm-6k, and libsphinx2g0 are already in the Ubuntu repositories.
Changing this bug from 'New' to 'Invalid'. If you find additional software, please reopen this bug by setting it back to 'New'.