Environment Settings
Note that this option is currently supported only on Ubuntu. On macOS, I ran into dependency issues.
(This implementation originally comes from Baidu's DeepSpeech. However, DeepSpeech is built on PaddlePaddle, which I am not familiar with, so I adopted only its decoder. To check out the bash file, click here.) In general, Baidu's DeepSpeech repo provides a lot of helpful resources. Huge thanks to them!
The decoder for our language model needs C++ support and uses Swig as the wrapper, so you first need to install Swig (http://www.swig.org). (See the issue addressed here.) As of 04/16/2018, sudo apt-get install on Ubuntu installs swig 3.0.8, which is lower than the required version, so you need to go to the Swig website and install swig 3.0.12, which is the version we use. (You may also need to update your $PATH after installation.)
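To confirm which Swig version is actually picked up from your $PATH, a small check like the following can help. It is only a sketch: it assumes that swig -version prints a line of the form "SWIG Version X.Y.Z", and the helper names are made up for illustration. Versions are compared as integer tuples, because a plain string comparison would wrongly rank 3.0.8 above 3.0.12.

```python
import re
import shutil
import subprocess

REQUIRED_SWIG = (3, 0, 12)  # the version this README says we use


def parse_swig_version(text):
    """Extract (major, minor, patch) from `swig -version` output, or None.

    Assumes the output contains "SWIG Version X.Y.Z".
    """
    match = re.search(r"SWIG Version (\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(part) for part in match.groups()) if match else None


def swig_is_new_enough(text, required=REQUIRED_SWIG):
    """Return True if the reported version is at least `required`."""
    version = parse_swig_version(text)
    return version is not None and version >= required


if __name__ == "__main__":
    swig_path = shutil.which("swig")  # first swig found on PATH
    if swig_path is None:
        print("swig not found on PATH")
    else:
        result = subprocess.run(
            [swig_path, "-version"], capture_output=True, text=True
        )
        print(swig_path, parse_swig_version(result.stdout))
        print("new enough:", swig_is_new_enough(result.stdout))
```

If the printed path points at the apt-installed binary rather than the one you built, adjust $PATH so the newer install comes first.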
Next, you need to install openfst 1.6.3. The official download is http://www.openfst.org/twiki/pub/FST/FstDownload/openfst-1.6.3.tar.gz. If the openfst website is down (which I have encountered before), you can download the openfst 1.6.3 source from ./src/openfst in my repo.
Once you have installed all the dependencies, the next step is to get all the files in this folder. An easy way to do that is to clone Baidu's DeepSpeech (https://github.com/PaddlePaddle/DeepSpeech.git), although you really only need that folder. Then go to /swig and run setup.sh (lazy installation), or setup.py (if you already have KenLM and everything else installed).
To test whether you have installed it successfully, try import swig_decoders in Python.
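If you would rather check from a script than an interactive session, the sketch below looks the module up without importing it. The helper name module_available is hypothetical; only the module name swig_decoders comes from this README.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module called `name` can be located."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if module_available("swig_decoders"):
        print("swig_decoders is installed")
    else:
        print("swig_decoders not found; re-run the setup step and check its output")
```

Note that this only confirms the module can be found. A real import can still fail if the compiled extension was built against a different Python, so running import swig_decoders remains the definitive test.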
In our program, the integration of this language model can be found in the languageModel.py file under the /src folder.
One thing I need to reiterate about the swig decoder is that it is specifically for using a language model in speech recognition. It uses KenLM (https://kheafield.com/code/kenlm/). After you install it, you may want a language model; again, Baidu provides some good ones here: https://github.com/PaddlePaddle/DeepSpeech#released-models.
We have provided an environment.yml file to facilitate the installation process. However, if for some reason you don't want to use Conda, you can install each package individually using pip. For instance:
pip install tensorflow
pip install keras
This is another resource we have found to be very helpful. For a more detailed local environment setup, please refer to https://github.com/udacity/AIND-VUI-Capstone/blob/master/README.md
Installation:
pip install asr-evaluation
Command-line usage:
wer <true_text.txt> <predicted_test.txt>
For more detailed information, please refer to https://github.com/belambert/asr-evaluation
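For intuition about what the wer number means, here is a minimal sketch of the standard word error rate definition: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. It is not the asr-evaluation implementation, which may differ in details such as text normalization; the function name is made up for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between ref[:i-1] and hyp[:j]
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]  # turning ref[:i] into an empty hypothesis costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the) over 6 reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.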