Skip to content
/ Padie Public

An open-source Python package for conversational AI with support for Nigerian languages, including Pidgin, Yoruba, Hausa, and Igbo.

License

Notifications You must be signed in to change notification settings

sir-temi/Padie

Repository files navigation

🌟 Padie 🌟

License Status Python Contributions Welcome

Padie is an evolving, open-source Python package designed to enable conversational AI systems with support for Nigerian languages, including Pidgin, Yoruba, Hausa, and Igbo. It aims to provide AI-powered tools for language detection, intent recognition, and response generation, while fostering community collaboration to enhance its capabilities.

🔧 Note: Padie is a work in progress. Models are being trained and refined, and we’re actively gathering datasets to improve accuracy. Your contributions can make a difference!


Features

  • 🌍 AI-Powered Language Detection
    Automatically identify the language of input text across supported Nigerian languages.

  • 🎯 AI-Powered Intent Recognition
    Accurately understand user intentions across multiple domains and contexts.

  • 🤖 Dynamic Response Generation
    Generate intelligent, context-aware responses tailored to user input.

  • ⚙️ Framework-Agnostic Design
    Seamlessly integrate Padie into any framework or application.

  • 🧑‍🤝‍🧑 Community Contributions
    An open platform for developers, linguists, and AI enthusiasts to contribute datasets, models, and features.


🛠️ Handling Datasets and Citations

Dataset Format

Datasets should follow a structured format. Each entry must include at least the following fields:

{
    "text": "Example input text.",
    "label": "language_label",
    "format": "format_type",
    "source": {
        "provider": null,
        "url": null
    },
    "citation": null
}

Citations

  • If the data includes a citation, create a unique citation code (e.g., "c1", "c2") and add the citation details to a separate file named citations.json.
  • Replace the citation field in the dataset with the corresponding code.

Example Citation File (citations.json):

{
    "c2": {
        "title": "Naija Language Translation",
        "author": "Wazobia",
        "year": 2021,
        "url": "https://example.com/naija-paper"
    }
}

Example Dataset Entry with Citation Code:

{
    "text": "Example text in a Nigerian language.",
    "label": "pidgin",
    "format": "article",
    "source": {
        "provider": "BBC",
        "url": "https://www.bbc.com/article"
    },
    "citation": "c2"
}

Why Use a Citation File?

  • Ensures clarity by centralising citation details.
  • Prevents redundancy and reduces dataset file size.
  • Facilitates easier management and updates.

📋 How to Contribute

We welcome contributions from everyone—whether you're a developer, linguist, or data scientist! Here's how you can get involved:

  1. Fork the Repository:
    Click the "Fork" button at the top of the repository page to create your copy.

  2. Clone Your Fork:

    git clone https://github.com/sir-temi/Padie.git
  3. Create a Branch:

    git checkout -b feature-name
  4. Make Your Changes:

    • Contribute datasets for supported or new languages.
    • Add citations and ensure proper referencing in citations.json.
    • Improve AI models or build new ones.
    • Fix bugs or add features.
  5. Commit and Push:

    git commit -m "Describe your changes"
    git push origin feature-name
  6. Submit a Pull Request:
    Open a pull request against the main branch with a clear description of your changes.


📦 Installation

🚧 Coming Soon!
We’re finalizing the core framework and will provide installation instructions once it’s ready.


🌍 Open Source Contribution

Padie is licensed under the MIT License, ensuring it remains free and open for everyone to use, contribute to, and enhance.

About

An open-source Python package for conversational AI with support for Nigerian languages, including Pidgin, Yoruba, Hausa, and Igbo.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages