This is the main repository of the IT-Project "Missing Title". The report with additional information is stored in the documentation repository.
- Docker
- Docker Compose
- Git to download this repository
- Python 3 for the initialization
- The prerequisites of the Python package mysqlclient, also needed for the initialization
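A quick way to check the command-line prerequisites above is to look them up on the `PATH`. The following is only a minimal sketch: the executable names in `REQUIRED_TOOLS` are assumptions (for example, the interpreter may be called `python` instead of `python3` on some systems), and the helper `missing_tools` is a hypothetical name that is not part of this repository. It does not check the mysqlclient prerequisites.

```python
import shutil

# Assumed executable names; adjust them to your platform.
REQUIRED_TOOLS = ["docker", "docker-compose", "git", "python3"]


def missing_tools(tools, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is sufficient.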
- Open a shell
- Clone this repository to your local system with `git clone https://git.informatik.fh-nuernberg.de/kaldi/kaldi-customization.git` and switch into the repository folder (first time only)
- Use the env.cmd or env.sh script in your shell to set up the environment variables for docker-compose
- Build or import missing docker images (first time only)
  - kaldi-base: See kaldi/base/README.md. Note: Make sure the name of the image matches the name in the corresponding Dockerfile.
  - Create all other images by starting the docker-compose or by executing `docker-compose build`
- Start the customization service:
  - Load the compose with `docker-compose up` and have a cup of tea or coffee
  - Wait until the service is online (the website is reachable at localhost:8080)
  - To scale the number of workers, use the `--scale` parameter for `docker-compose up`; the available workers are `text-preparation-worker`, `data-preparation-worker`, `kaldi-worker` and `decode-worker`
- Use the initialization script initialization/init.py (first time only):
  - A few modules are required to execute the script. For example, using pip and pipenv:
    - Open another shell
    - Run `pipenv install` and `pipenv shell` to activate the pipenv shell
    - Run `pip install -r ./initialization/requirements.txt` to install the requirements
    - Note: If an error occurs, verify that the prerequisites for the Python package mysqlclient are installed
  - Execute `python ./initialization/init.py` to prepare the database and upload default model data
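The setup steps ask you to wait until the website is reachable before running the initialization script. That wait could be automated roughly as follows. This is a sketch under assumptions, not part of the repository: the helper names `is_reachable` and `wait_until_online`, the number of attempts and the delay are all illustrative, and the URL `http://localhost:8080` assumes the web interface is served over plain HTTP.

```python
import http.client
import time
import urllib.error
import urllib.request


def is_reachable(url, timeout=3.0):
    """Return True if an HTTP request to `url` receives any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if only with an error status.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, DNS failure, or a broken response.
        return False


def wait_until_online(url, attempts=60, delay=5.0,
                      probe=is_reachable, sleep=time.sleep):
    """Poll `url` up to `attempts` times; return True once it is reachable."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

For example, calling `wait_until_online("http://localhost:8080")` in the second shell returns `True` as soon as the website answers, after which `python ./initialization/init.py` can be started. A reachable website does not necessarily mean that every backend component is ready, so treat the result as a hint only.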
- Web Interface: localhost:8080
- Web API: localhost:8080/api
- Make sure that there are no running jobs like a training
- Use `docker-compose stop` in the repository folder or press `Ctrl + C` in the shell where you started the compose
- All data (database, files) is stored persistently on the local disk
- Use `docker-compose down` to shut down the service and delete the database
This file defines the service. It is used by docker to build and run the images/containers.
Definition of the public API. See api/README.md for further information.
Contains some global settings for the docker-compose.
Persistent storage for the database (/dfs/mariadb) and the file service (/dfs/data).
Do not touch manually!
Use a SQL explorer (e.g. MySQL Workbench) and the MinIO web client at localhost:9001 instead.
As the name indicates: preparation for the first usage. See the initial setup guide.
Also contains the pretrained acoustic models.
Our docker image with a kaldi installation. Use the base image and see the README there.
The server components to run the kaldi customization web service.
This is the API backend. It provides access to the features of the kaldi customization web service and handles authentication.
See the README.
This is the web frontend for users. It offers a user interface to train and test user-defined ASR.
Scripts and resources which are used by several components.
The worker directory contains the workers used in the backend to process the user requests via the API.
See the directories for further information about the workers:
- text-preparation-worker: Extract text from uploaded resource files.
- data-preparation-worker: Prepares the training process.
- kaldi-worker: This is the general kaldi-worker to process ASR testing.
- decode-worker: Decodes audio to text.
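The four workers listed above are the services that can be scaled with the `--scale` parameter of `docker-compose up`. The sketch below builds such a command line from a mapping of worker name to instance count. The function `compose_up_command` is a hypothetical helper that is not part of this repository, and the instance counts in the usage note are purely illustrative.

```python
# Worker service names as listed in this README.
WORKERS = (
    "text-preparation-worker",
    "data-preparation-worker",
    "kaldi-worker",
    "decode-worker",
)


def compose_up_command(scale):
    """Build a `docker-compose up` argument list with `--scale` options.

    `scale` maps a worker service name to the desired number of instances.
    """
    command = ["docker-compose", "up"]
    for service, count in scale.items():
        if service not in WORKERS:
            raise ValueError(f"unknown worker: {service}")
        if count < 0:
            raise ValueError(f"instance count must not be negative: {count}")
        command += ["--scale", f"{service}={count}"]
    return command
```

For instance, `compose_up_command({"decode-worker": 2})` yields the arguments for `docker-compose up --scale decode-worker=2`, which can be passed to `subprocess.run(..., check=True)` from the repository folder.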
An SQL server for the persistent data.
An in-memory Redis server for the task queue.