Add Arabic diacritics (tashkeel/harakat) using Rust/Python/C++/WASM and NLP models
Updated Oct 4, 2025 - Rust
A versatile library offering utility functions for processing and transforming Arabic text. Can be used in Node.js and the browser.
Rababa, the diacritization library for Arabic and Hebrew (Abjad scripts in general)
Uses natural language processing techniques to predict the diacritics of Arabic text.
Official code for "Fine-Tashkeel at KSAA-2026" — Systematic evaluation of 18 Seq2Seq, token classification, decoder LLM, and ASR models for automatic Arabic text diacritization. 5th place at KSAA-2026 Shared Task (OSACT7 @ LREC 2026).
A versatile library in Java offering utility functions for processing and transforming Arabic text.
Shaddah is a simple web app for automatically adding diacritics (tashkeel) to Arabic text. The app uses mishkal.py as the base model for processing and diacritizing Arabic text, allowing you to get properly diacritized text quickly and easily.
The official implementation of CATT Arabic diacritization models.
ABGD converts Arabic text into a structured list of decimal numbers based on traditional Abjad gematrical values. It also supports encoding diacritics (e.g. shadda + fatha → 0.61) using a smart fractional system.
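The letter-to-number step that the ABGD description mentions can be sketched as follows. This is a minimal illustration and not ABGD's actual API: the names `ABJAD_VALUES`, `abjad_values`, and `abjad_total` are hypothetical, the table uses the traditional eastern (Mashriqi) Abjad order, and the fractional encoding of diacritics is not reproduced because the description does not specify it. Diacritics and any other non-letter characters are simply skipped here.

```python
# Hypothetical sketch: map Arabic letters to traditional Abjad values.
# Letters are listed in eastern (Mashriqi) Abjad order, written as escapes:
# alif, ba, jim, dal, ha, waw, zay, hah, tah, ya, kaf, lam, mim, nun,
# sin, ayn, fa, sad, qaf, ra, shin, ta, tha, kha, dhal, dad, zah, ghayn.
ABJAD_ORDER = (
    "\u0627\u0628\u062C\u062F\u0647\u0648\u0632\u062D\u0637"
    "\u064A\u0643\u0644\u0645\u0646\u0633\u0639\u0641\u0635"
    "\u0642\u0631\u0634\u062A\u062B\u062E\u0630\u0636\u0638"
    "\u063A"
)

# Values run 1..9, 10..90, 100..900, then 1000 for the final letter.
_VALUES = [unit * 10**power for power in range(3) for unit in range(1, 10)] + [1000]

ABJAD_VALUES = dict(zip(ABJAD_ORDER, _VALUES))


def abjad_values(text: str) -> list[int]:
    """Return the Abjad value of each base letter in order.

    Assumption: characters outside the 28-letter table (diacritics,
    spaces, punctuation, letter variants) are ignored, not encoded.
    """
    return [ABJAD_VALUES[ch] for ch in text if ch in ABJAD_VALUES]


def abjad_total(text: str) -> int:
    """Sum the Abjad values of all recognized letters in the text."""
    return sum(abjad_values(text))


if __name__ == "__main__":
    word = "\u0627\u0644\u0644\u0647"  # alif, lam, lam, ha
    print(abjad_values(word), abjad_total(word))
```

A real implementation would also need to decide how to treat letter variants such as hamza forms and ta marbuta, which this sketch leaves out.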