🇰🇿 Multimodal Kazakh Research Repository

Welcome to the Multimodal Kazakh Research repo, a work-in-progress collection of tools, scripts, and models for research on multimodal learning (mostly vision) in the Kazakh language.

📁 Structure

.
├── projects/
│   └── horde-vision/
│       ├── benchmark/       # Benchmarking apps and model request scripts
│       ├── evaluate/        # Evaluation and metrics calculation
│       ├── scripts/         # Data processing and utility scripts
│       └── train_scripts/   # Training (SFT, RL) and inference scripts
└── results/                 # Evaluation results and visualizations

🧠 Goals

🖼️ Build and curate high-quality Kazakh-language multimodal datasets
🤖 Train and evaluate multimodal models on Kazakh content
🧪 Support downstream tasks such as retrieval, captioning, and visual question answering (VQA)

📊 Results

Horde Vision Model Performance Summary

| Model | caption | vqa | ocr | reason | instruct_follow | Avg Rank |
|---|---|---|---|---|---|---|
| horde-vision | 83.5 (↑12.3%) | 68.1 (↑5.3%) | 64.7 (↑2.6%) | 77.4 (↑5.7%) | 70.5 (↑5.9%) | #1 |
| Qolda | 75.2 (↑8.7%) | 61.7 (↑3.0%) | 60.6 (↑2.0%) | 70.3 (↑2.9%) | 62.2 (↑2.8%) | #2 |
| Qwen3-VL-8B-Instruct | 41.3 (↑0.5%) | 53.6 (↑1.1%) | 59.3 (↑2.1%) | 55.5 (↑0.7%) | 49.5 (↑0.9%) | #3 |
| gemma-3-4b-it | 42.0 (↑0.1%) | 41.8 (↑0.4%) | 50.3 (↑2.3%) | 53.0 (↑0.6%) | 42.5 (↑0.5%) | #4 |
| Qwen2.5-VL-7B-Instruct | 35.4 (↑0.0%) | 41.6 (↑0.4%) | 51.0 (↑0.9%) | 44.6 (↑0.3%) | 37.7 (↑0.3%) | #5 |
| Llama-3.2-11B-Vision | 36.2 (↑0.1%) | 38.0 (↑0.3%) | 15.0 (↑0.1%) | 43.4 (↑0.3%) | 36.4 (↑0.3%) | #6 |
| InternVL3-8B | 26.1 (↑0.6%) | 29.0 (↑0.0%) | 29.1 (↑0.3%) | 27.3 (↑0.0%) | 25.7 (↑0.0%) | #7 |
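
The table does not say how the Avg Rank column is derived. One plausible reading, assumed here and not confirmed by the repository, is that each model is ranked within every task (1 = highest score) and the per-task ranks are averaged. The sketch below applies that assumption to the scores from the summary table; the names `SCORES` and `average_ranks` are hypothetical and not taken from the repo's `evaluate/` code.

```python
from statistics import mean

# Per-task scores copied from the summary table, in the order:
# caption, vqa, ocr, reason, instruct_follow.
# The parenthesised deltas are left out because their meaning is not documented.
SCORES = {
    "horde-vision":           [83.5, 68.1, 64.7, 77.4, 70.5],
    "Qolda":                  [75.2, 61.7, 60.6, 70.3, 62.2],
    "Qwen3-VL-8B-Instruct":   [41.3, 53.6, 59.3, 55.5, 49.5],
    "gemma-3-4b-it":          [42.0, 41.8, 50.3, 53.0, 42.5],
    "Qwen2.5-VL-7B-Instruct": [35.4, 41.6, 51.0, 44.6, 37.7],
    "Llama-3.2-11B-Vision":   [36.2, 38.0, 15.0, 43.4, 36.4],
    "InternVL3-8B":           [26.1, 29.0, 29.1, 27.3, 25.7],
}


def average_ranks(scores):
    """Rank models within each task (1 = best) and average the ranks.

    Assumption: higher scores are better. Ties, which do not occur in
    this table, would be broken by dictionary order.
    """
    n_tasks = len(next(iter(scores.values())))
    ranks = {model: [] for model in scores}
    for task in range(n_tasks):
        ordered = sorted(scores, key=lambda m: scores[m][task], reverse=True)
        for position, model in enumerate(ordered, start=1):
            ranks[model].append(position)
    return {model: mean(per_task) for model, per_task in ranks.items()}


avg = average_ranks(SCORES)
leaderboard = sorted(avg, key=avg.get)  # lowest average rank first

for place, model in enumerate(leaderboard, start=1):
    print(f"#{place} {model}: average rank {avg[model]:.1f}")
```

Under this assumption the resulting order matches the #1 to #7 ordering in the table, which is consistent with, but does not prove, that the column was computed this way.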
