Environment Settings

Install Swig-decoder

Note that this option is currently supported only on Ubuntu. On macOS, I ran into dependency issues.

This implementation is originally from Baidu's DeepSpeech. DeepSpeech is built on PaddlePaddle, which I am not familiar with, so I only adopted its decoder. To check out the bash file, click here. In general, Baidu's DeepSpeech repo provides a lot of helpful resources. Huge thanks to them!

The decoder for our language model needs C++ support and uses Swig as the wrapper, so you first need to install Swig (http://www.swig.org). As of 04/16/2018 (see the issue addressed here), sudo apt-get install on Ubuntu installs swig 3.0.8, which is lower than the required version. You therefore need to go to the Swig website and install swig 3.0.12, which is the version we use. You may also need to update your $PATH after installation.
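To confirm which Swig version is on your $PATH, a check along the following lines may help. This is a minimal sketch: the helper names are made up for illustration, and treating any version at or above 3.0.12 as acceptable is an assumption, since this guide has only been tested with 3.0.12.

```python
import re
import shutil
import subprocess

# Version named in this guide; newer versions are assumed (not verified) to work.
REQUIRED_SWIG = (3, 0, 12)


def parse_swig_version(text):
    """Extract (major, minor, patch) from `swig -version` output, or None."""
    match = re.search(r"SWIG Version (\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(part) for part in match.groups()) if match else None


def swig_is_new_enough(required=REQUIRED_SWIG):
    """Return True if a swig binary on $PATH reports at least `required`."""
    if shutil.which("swig") is None:
        return False  # swig is not installed or not on $PATH
    result = subprocess.run(["swig", "-version"], capture_output=True, text=True)
    version = parse_swig_version(result.stdout)
    return version is not None and version >= required
```

Calling swig_is_new_enough() returns False both when swig is missing and when it is too old, which is the case right after a plain apt-get install of 3.0.8.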

Next, you need to install OpenFst 1.6.3. The official address is http://www.openfst.org/twiki/pub/FST/FstDownload/openfst-1.6.3.tar.gz. If the OpenFst website is down (which has happened to me before), you can download the OpenFst 1.6.3 source from ./src/openfst in my repo.

Once you have installed all the dependencies, the next step is to get all the files in this folder. The easiest way is to clone Baidu's DeepSpeech (https://github.com/PaddlePaddle/DeepSpeech.git), although you only need that one folder. Then go to /swig and run setup.sh (lazy installation), or setup.py (if you already have KenLM and everything else installed).

To test whether you have installed it successfully, try

import swig_decoders

in Python.
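If you prefer a check that prints a readable message instead of raising ImportError, something like the sketch below could be used. The helper name module_available is hypothetical; it relies only on the standard library's importlib.util.find_spec.

```python
import importlib.util


def module_available(name):
    """Return True if the top-level module `name` can be located."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if module_available("swig_decoders"):
        print("swig_decoders is installed")
    else:
        print("swig_decoders was not found; try re-running the setup step")
```

Note that this only confirms the module can be found. Running import swig_decoders itself is still the definitive test that the compiled extension loads.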

In our program, the integration of this language model can be found under the /src folder in the languageModel.py file.

Language Model

One thing to reiterate about the swig decoder is that it exists specifically to use a language model in speech recognition. It uses KenLM (https://kheafield.com/code/kenlm/). After you install it, you may want a language model; again, Baidu provides some good ones here: https://github.com/PaddlePaddle/DeepSpeech#released-models.

Non-Conda Setting

We have provided an environment.yml file to simplify installation. However, if for some reason you don't want to use Conda, you can install each package individually with pip. For instance:

Install Tensorflow

pip install tensorflow

Install Keras

pip install keras
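After installing with pip, you can check which versions ended up in your environment. The following is a small sketch using the standard library's importlib.metadata (Python 3.8+); the helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for package in ("tensorflow", "keras"):
        print(package, installed_version(package) or "not installed")
```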

AIND-VUI-Capstone Package

This is another resource we have found very helpful. For more detailed local environment setup, please refer to https://github.com/udacity/AIND-VUI-Capstone/blob/master/README.md

ASR-evaluation

Installation:

pip install asr-evaluation

Command-line usage

wer <true_text.txt> <predicted_test.txt>

For more detailed information, please refer to https://github.com/belambert/asr-evaluation
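For intuition about the number the wer command reports, word error rate is the word-level edit distance between the reference and the prediction, divided by the number of reference words. The sketch below illustrates that definition only; it is not the asr-evaluation implementation, the function name is made up, and the tool may normalize or tokenize text differently.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] = edit distance between the ref words seen so far and hyp[:j]
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]  # turning i reference words into nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1] / len(ref)
```

For example, comparing "a b c d" against "a b" gives two deletions over four reference words, so the rate is 0.5.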
