
Repository for the GSoC project 'Towards a Neural Extraction Framework'


advenk/neural-extraction-framework

 
 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 



DBpedia Neural Extraction Framework

A research framework for multilingual information extraction over DBpedia, developed and extended through multiple Google Summer of Code (GSoC) projects.


📖 About

The Neural Extraction Framework is a specialized repository under the DBpedia project focused on advancing information extraction using neural networks and large language models. This framework complements DBpedia's traditional extraction methods by providing state-of-the-art neural approaches for extracting structured knowledge from unstructured text.

🎯 Key Features

  • Neural-based Triple Extraction: Extract RDF triples using language models
  • Multi-language Support: Language-specific pipelines, currently for English and Hindi
  • Semantic Web Integration: RDF/SPARQL compatible outputs
  • Evaluation Frameworks: Benchmarking tools for IE performance
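
To make the triple-extraction and RDF-output features concrete, here is a minimal sketch of how one extracted fact could be represented and serialized as an N-Triples line. The `Triple` class and its `to_ntriples` method are hypothetical names chosen for illustration and are not taken from this repository; the example fact is hand-written, not model output.

```python
from dataclasses import dataclass

# DBpedia namespaces for resources and ontology properties.
DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"


@dataclass(frozen=True)
class Triple:
    """Hypothetical container for one extracted subject-predicate-object fact."""
    subject: str
    predicate: str
    obj: str

    def to_ntriples(self) -> str:
        # N-Triples writes each IRI in angle brackets and ends the line with " ."
        return f"<{self.subject}> <{self.predicate}> <{self.obj}> ."


# Hand-written example for the sentence "Berlin is the capital of Germany."
triple = Triple(DBR + "Germany", DBO + "capital", DBR + "Berlin")
print(triple.to_ntriples())
```

Lines in this form can be loaded into any RDF store and queried with SPARQL. The sketch handles IRI objects only; literal values would need quoting and escaping.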

🗣️ Supported Languages

  • 🇬🇧 English (project directories with no suffix)
  • 🇮🇳 Hindi (project directories with the _H suffix, e.g. GSoC25_H)
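
The suffix convention can be checked programmatically. The following is a small sketch, assuming the only rule is that a directory name ending in `_H` is a Hindi project and anything else is English; the function name `language_of` is hypothetical.

```python
from pathlib import Path


def language_of(project_dir: str) -> str:
    """Infer the target language of a project directory from its name suffix."""
    # Path(...).name ignores a trailing slash and any parent directories.
    name = Path(project_dir).name
    return "Hindi" if name.endswith("_H") else "English"


print(language_of("GSoC25_H"))
```

If more languages are added later, the two-way check would need to become a lookup table of suffixes.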

🚀 GSoC Projects

This repository hosts Google Summer of Code (GSoC) projects that advance neural extraction techniques:

🏆 Recent Contributions

  • GSoC 2025: Enhanced English and Hindi information extraction pipelines with small language model (SLM) integration, link prediction, and predicate linking
  • GSoC 2024: Language model integration and pipeline optimization
  • GSoC 2023: Advanced relation extraction methods
  • GSoC 2022: Relation extraction pipeline improvements
  • GSoC 2021: Initial neural framework development

🛠️ Technology Stack


Implementation Language

  • Python

📦 Installation

Prerequisites

  • Python 3.8+
  • Git

Quick Start

# Clone the repository
git clone https://github.com/dbpedia/neural-extraction-framework.git
cd neural-extraction-framework

# Navigate to specific GSoC project
cd GSoC25_H  # Example: Hindi Chapter

# Follow project-specific setup
# Each directory has its own README with detailed instructions

Important: Each GSoC project has unique dependencies and setup requirements. Always refer to the README in the specific project directory.
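
Since dependencies differ per project, it can help to install them from inside a per-project virtual environment. The sketch below only builds the pip command for a given project directory; it assumes the project ships a `requirements.txt` at its top level, which may not hold for every directory. The helper name `pip_install_command` is hypothetical.

```python
import sys
from pathlib import Path
from typing import List


def pip_install_command(project_dir: str) -> List[str]:
    """Build the pip command that installs a project's listed dependencies."""
    requirements = Path(project_dir) / "requirements.txt"
    # Using sys.executable ensures pip runs for the active interpreter,
    # which matters when a virtual environment is activated.
    return [sys.executable, "-m", "pip", "install", "-r", str(requirements)]


command = pip_install_command("GSoC25_H")
print(" ".join(command))
# To actually install, pass the list to subprocess.run(command, check=True).
```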


🎓 Project Structure

Each GSoC directory typically contains:

  • README.md - Project-specific documentation
  • src/ - Source code and implementations
  • data/ - Datasets and benchmarks
  • models/ - Trained models or model configurations
  • notebooks/ - Jupyter notebooks for experiments
  • requirements.txt - Python dependencies
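
Because this layout is typical rather than guaranteed, a quick check shows which entries a given project directory actually has. This is a minimal sketch; `TYPICAL_ENTRIES` mirrors the list above and `missing_entries` is a hypothetical helper name.

```python
from pathlib import Path
from typing import List

# Entries a GSoC project directory typically contains, per the list above.
TYPICAL_ENTRIES = [
    "README.md",
    "src",
    "data",
    "models",
    "notebooks",
    "requirements.txt",
]


def missing_entries(project_dir: str) -> List[str]:
    """Return the typical entries that do not exist in project_dir."""
    root = Path(project_dir)
    return [name for name in TYPICAL_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    # Example: inspect the Hindi project from the Quick Start section.
    print(missing_entries("GSoC25_H"))
```

An empty list means the directory follows the full layout; anything else tells you which parts to look for in the project's own README.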

🤝 Contributing

We appreciate all contributions! To contribute:

  • Fork the repository
  • Create a feature branch (git checkout -b feature/amazing-feature)
  • Commit your changes (git commit -m 'Add amazing feature')
  • Push to the branch (git push origin feature/amazing-feature)
  • Open a Pull Request

Please ensure your code follows the project's coding standards and includes appropriate tests.


📬 Contact & Community

  • Forum
  • Slack


🌟 Acknowledgements

  • Google Summer of Code for supporting student contributions
  • DBpedia Association for mentorship and infrastructure
  • All GSoC students and mentors who have contributed
  • The open-source community for tools and collaboration

Built with ❤️ by the DBpedia Community

Visit DBpedia Explore Repos


Languages

  • Jupyter Notebook 84.6%
  • Python 15.3%
  • Other 0.1%