Kaggle competition to complete: https://www.kaggle.com/competitions/asr-numbers-recognition-in-russian/overview
There are results from several runs on different dataset formats (i.e. different target-text representations of the same numbers):
Result of training with the digit representation (Audio -> 1234)

Result of training with the word representation (Audio -> "тысяча двести тридцать четыре", the number spelled out in Russian words)

Result of training with the token representation (Audio -> <1>|<200><30><4>)

The token representation achieves the best validation loss.
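The token targets can be produced deterministically from the integer label. Below is a minimal sketch of such a converter; the function name and the exact token inventory (e.g. a single token for Russian teens, which are one word each) are assumptions, and only the `|` separator after the thousands token follows the example above:

```python
def number_to_tokens(n: int) -> str:
    """Convert an integer into token targets like 1234 -> '<1>|<200><30><4>'."""
    parts = []
    if n >= 1000:
        # thousands prefix; multi-word thousand counts (e.g. 21) would need recursion
        parts.append(f"<{n // 1000}>|")
        n %= 1000
    if n >= 100:
        parts.append(f"<{(n // 100) * 100}>")
        n %= 100
    if 10 <= n <= 19:
        # 10-19 are single words in Russian, so emit one token
        parts.append(f"<{n}>")
        n = 0
    if n >= 10:
        parts.append(f"<{(n // 10) * 10}>")
        n %= 10
    if n:
        parts.append(f"<{n}>")
    return "".join(parts)
```

For instance, `number_to_tokens(1234)` yields `<1>|<200><30><4>`, matching the representation shown above.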
The project exposes training metrics to Prometheus via Pushgateway. The
docker-compose.yml file includes the pushgateway, prometheus and grafana
services. After training on each dataset version, the script pushes the latest
val_loss metric to the Pushgateway.
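The push can be done with the official prometheus_client library. This sketch assumes the default Pushgateway port 9091, a `dataset_version` label, and the job name `asr_training` (all assumptions; only the `val_loss` metric name comes from the setup above):

```python
from prometheus_client import CollectorRegistry, Gauge, push_to_gateway


def push_val_loss(value: float, dataset_version: str,
                  gateway: str = "localhost:9091") -> None:
    """Push the latest val_loss for one dataset version to the Pushgateway."""
    registry = CollectorRegistry()
    gauge = Gauge(
        "val_loss",
        "Validation loss of the last training run",
        ["dataset_version"],  # assumed label; lets each run keep its own series
        registry=registry,
    )
    gauge.labels(dataset_version=dataset_version).set(value)
    # grouped under one job so each push overwrites the previous value
    push_to_gateway(gateway, job="asr_training", registry=registry)
```

Prometheus then scrapes the Pushgateway, so `val_loss` becomes queryable in Grafana.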
Grafana is provisioned with Prometheus as a data source and an alert rule that
checks the val_loss metric. Alerts are sent through a Telegram contact point.
Set the following environment variables before starting the stack:
export GF_TELEGRAM_BOT_TOKEN=your_bot_token
export GF_TELEGRAM_CHAT_ID=your_chat_id

Start the full stack with:

docker-compose up -d

Grafana will be available at http://localhost:3000 and Prometheus at http://localhost:9090.
Alerts will trigger when val_loss stays above 1 and will be sent to the
specified Telegram chat.
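For reference, the alert condition can be expressed as a Prometheus-style alerting rule. Grafana's provisioned alerts use their own YAML format, so the fragment below is only an equivalent sketch, and the 5m hold duration is an assumption:

```yaml
groups:
  - name: training_alerts
    rules:
      - alert: HighValidationLoss
        expr: val_loss > 1
        for: 5m          # assumed duration before the alert fires
        labels:
          severity: warning
        annotations:
          summary: "val_loss has stayed above 1"
```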
To collaborate on this project, please follow the GitHub workflow.
Please read the Conventional Commits specification to ensure that your commits are named properly.
This project supports orchestration using Apache Airflow with Docker Compose.
- Local Airflow deployment (via Docker)
- Pipeline with three steps: Data Processing → Model Training → Model Testing
- Integrated with DVC for data and artifact versioning
Make sure Docker and Docker Compose are installed.
docker-compose up -d

- Airflow UI will be available at: http://localhost:8080
- Default credentials:
admin/admin
If port 8080 is busy, change it in docker-compose.yml:
ports:
- "8081:8080"

Once Airflow is running:
- Go to the web UI.
- Find the DAG named mlops_pipeline.
- Toggle it on, then click Trigger DAG.
The DAG contains:
- process_data: runs data_download and data_prepare
- train_model: runs model training
All steps are executed via uv for dependency consistency and tracked with DVC.
- MLflow: http://localhost:5000
- MinIO: http://localhost:9000
- Prometheus: http://localhost:9090
- Grafana: http://localhost:3000
The repository exposes a FastAPI application that wraps the existing training
pipeline located in src/models and src/data. The service can download and
prepare data, perform inference using a trained Conformer model and report
metrics on the validation split.
Run the API locally with:

uvicorn src.api.main:app --reload

To start the service with Docker Compose:

docker-compose up api

The model checkpoint and validation dataset paths are configured in
service_config.yaml.
- POST /download_data: download the raw dataset from Kaggle.
- POST /prepare_data: generate dataset splits and vocabularies.
- POST /predict: return the transcription for an uploaded audio file.
- GET /metrics: compute the character error rate on the validation split.
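A minimal client call to the prediction endpoint might look like this; the port 8000, the upload field name `file`, and the response shape are assumptions, so check src/api/main.py for the actual definitions:

```python
import requests

API_URL = "http://localhost:8000"  # assumed host/port for the FastAPI service


def predict(audio_path: str, api_url: str = API_URL) -> dict:
    """Send one audio file to POST /predict and return the JSON response."""
    with open(audio_path, "rb") as f:
        resp = requests.post(f"{api_url}/predict", files={"file": f})
    resp.raise_for_status()
    return resp.json()


# Example (requires the service to be running):
# print(predict("data/val/0001.wav"))
```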
python scripts/query_api.py sends the validation set audio files to the API
and saves predictions to results.json.