It would be nice if this could be integrated into the Python speech_recognition module. There is a corresponding feature request there: https://github.com/Uberi/speech_recognition/issues/359