Opennyai : An efficient NLP Pipeline for Indian Legal documents

Opennyai is a Python library for natural language preprocessing of Indian legal texts.

This library provides unified access to the following three pre-trained AI models developed by OpenNyAI, all of which focus on Indian court judgments:

Named Entity Recognition (NER): GitHub, paper, Huggingface Datasets
Judgment Structuring using Sentence Rhetorical Roles: GitHub, paper, Huggingface Datasets
Extractive Summarizer: GitHub

This library is mainly intended for running the pre-trained models on your own judgment texts. For details about the data and model training, please refer to the individual Git repositories.

🔧 1. Installation

Using uv (recommended)

Install uv if you haven't already:

curl -LsSf https://astral.sh/uv/install.sh | sh

Create a virtual environment and install opennyai:

uv venv --python 3.13
source .venv/bin/activate
uv pip install opennyai

Using pip

You can also install with pip directly (requires Python >= 3.13):

python -m venv .venv
source .venv/bin/activate
pip install -U opennyai
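
To confirm that the environment meets the stated Python requirement and that the package was installed, a small check like the following may help. This is a minimal sketch using only the standard library; the helper names `python_is_supported` and `installed_version` are illustrative and not part of opennyai.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 13)  # minimum version stated in the install instructions


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


def installed_version(package="opennyai"):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("Python OK:", python_is_supported())
print("opennyai version:", installed_version())
```

If the second line prints `None`, the package is not visible to the interpreter in use, which usually means the virtual environment was not activated.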

For GPU support

If you want to use spaCy with a GPU, install CuPy for your CUDA version:

pip install cupy-cuda12x  # For CUDA 12.x

If you run into installation issues, please refer to the spaCy installation instructions for CuPy.
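
One way to choose a value for the `use_gpu` flag used later in this README is to check whether CuPy can be found in the current environment. The sketch below uses only the standard library; `module_available` is an illustrative helper, not an opennyai or spaCy function, and finding the module does not guarantee that a working CUDA device is present.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module called `name` can be located."""
    return importlib.util.find_spec(name) is not None


# Locating CuPy only shows that the GPU dependency was installed;
# it does not prove that a CUDA device is actually usable.
use_gpu = module_available("cupy")
print("use_gpu =", use_gpu)
```

Defaulting to `False` when CuPy is missing keeps the same script usable on CPU-only machines.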

📖 2. Documentation

Please refer to the Documentation for more details.

👩‍💻 3. Usage

To run the three OpenNyAI models on judgment texts of your choice, run the following Python code:

from opennyai import Pipeline
from opennyai.utils import Data
import urllib.request

# Get court judgment texts on which to run the AI models
text1 = urllib.request.urlopen(
    'https://raw.githubusercontent.com/OpenNyAI/Opennyai/master/samples/sample_judgment1.txt').read().decode()
text2 = urllib.request.urlopen(
    'https://raw.githubusercontent.com/OpenNyAI/Opennyai/master/samples/sample_judgment2.txt').read().decode()

# You can also load text from your own files into this list
texts_to_process = [text1, text2]

# Create a Data object to preprocess the texts before running the ML models
data = Data(texts_to_process)

# Set this to True if you have access to a GPU, otherwise False
use_gpu = True

# Choose which of the three models to run: 'NER', 'Rhetorical_Role', 'Summarizer'.
# E.g. if only named entities are of interest, select just 'NER'.

pipeline = Pipeline(components=['NER', 'Rhetorical_Role', 'Summarizer'], use_gpu=use_gpu, verbose=True)

results = pipeline(data)
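
The sample above downloads two judgments. To process judgments stored on disk instead, the list of texts can be built from local files. The following is a minimal sketch, assuming plain-text files in a single folder; `load_judgment_texts` and the folder name `my_judgments` are hypothetical and not part of opennyai.

```python
from pathlib import Path


def load_judgment_texts(folder, pattern="*.txt", encoding="utf-8"):
    """Read every file matching `pattern` in `folder`, sorted by file name."""
    paths = sorted(Path(folder).glob(pattern))
    return [path.read_text(encoding=encoding) for path in paths]


# Hypothetical usage: "my_judgments" is a placeholder directory name.
# texts_to_process = load_judgment_texts("my_judgments")
# data = Data(texts_to_process)
```

Sorting by file name keeps the order of the returned texts stable, so `results[i]` can be matched back to the i-th file.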

The output of each model is available under the following keys of each element of the output:

results[0]['annotations']  # sentence-level results: each entry has entities, rhetorical role, and other details
results[0]['summary']  # summary for each rhetorical role of the first judgment text
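
As a rough illustration of how one element of the output might be inspected, the sketch below counts sentences and entity labels and lists the roles that have a summary. The nested field names (`entities` and `labels` inside each annotation, and `summary` as a mapping from role to text) are assumptions about the output shape and should be checked against the documentation; `describe_result` is an illustrative helper, and the mock data is not real model output.

```python
from collections import Counter


def describe_result(result):
    """Summarize one pipeline result, assuming the keys named in the comments."""
    annotations = result.get("annotations", [])
    # Assumed: each annotation may carry an "entities" list of dicts with "labels".
    entity_counts = Counter(
        label
        for sentence in annotations
        for entity in sentence.get("entities", [])
        for label in entity.get("labels", [])
    )
    # Assumed: "summary" maps a rhetorical role name to its summary text.
    summary_roles = sorted(result.get("summary", {}))
    return {
        "sentences": len(annotations),
        "entity_counts": dict(entity_counts),
        "summary_roles": summary_roles,
    }


# Mock result that only illustrates the assumed shape.
mock_result = {
    "annotations": [
        {
            "text": "In the Supreme Court of India",
            "entities": [{"text": "Supreme Court of India", "labels": ["COURT"]}],
            "labels": ["PREAMBLE"],
        },
        {"text": "The appellant filed a suit.", "entities": [], "labels": ["FACTS"]},
    ],
    "summary": {"facts": "The appellant filed a suit.", "analysis": ""},
}

print(describe_result(mock_result))
```

With real output, the same helper could be called as `describe_result(results[0])`, provided the assumed keys match.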

For more details on usage, please refer to the documentation.

Google Colab Notebook

Description	Link
Run Inference
