DocIntel-Extraction-Structured

Overview

This repository extracts the Markdown layout of documents with Document Intelligence, then uses gpt-4o to convert that Markdown into structured outputs.
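The two-stage flow described above can be sketched roughly as follows. This is only an illustration, not the notebooks' actual code: `run_pipeline`, `ExtractionResult`, and the two stub functions are hypothetical names, and the stubs stand in for the real Document Intelligence and gpt-4o calls so that the sketch runs without credentials.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class ExtractionResult:
    source: str    # path of the input document
    markdown: str  # layout Markdown produced by stage 1
    fields: dict   # structured output produced by stage 2


def run_pipeline(
    path: str,
    extract_markdown: Callable[[str], str],
    to_structured: Callable[[str], str],
) -> ExtractionResult:
    """Run stage 1 (document -> Markdown), then stage 2 (Markdown -> JSON)."""
    markdown = extract_markdown(path)
    raw = to_structured(markdown)
    fields = json.loads(raw)  # stage 2 is assumed to return a JSON string
    if not isinstance(fields, dict):
        raise ValueError("expected a JSON object from the structuring step")
    return ExtractionResult(source=path, markdown=markdown, fields=fields)


# Hypothetical stand-ins for the real services, for illustration only.
def fake_extract(path: str) -> str:
    return "# Invoice\n\nTotal: 42.00"


def fake_model(markdown: str) -> str:
    total = markdown.rsplit("Total: ", 1)[1]
    return json.dumps({"title": "Invoice", "total": float(total)})
```

Passing the two stages in as callables keeps the sketch independent of any particular SDK; in practice each stub would be replaced by the corresponding call from the notebooks.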

Installation

To set up the project, follow these steps:

  1. Clone the repository:

    git clone https://github.com/szetinglau/DocIntel-Extraction-Structured.git
    cd DocIntel-Extraction-Structured
  2. Create a virtual environment and activate it:

    python3 -m venv venv
    source venv/bin/activate   # On Windows use `venv\Scripts\activate`
  3. Install the required dependencies:

    pip install -r requirements.txt
  4. Set your credentials in config.json.
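As a quick check that step 4 is done, a notebook cell could load config.json and fail early when a value is missing. The sketch below is an assumption about how that might look: the key names in `REQUIRED_KEYS` and the `load_config` helper are hypothetical, so substitute the keys that the repository's config.json actually defines.

```python
import json
from pathlib import Path

# Hypothetical key names; replace with the keys used in the real config.json.
REQUIRED_KEYS = (
    "doc_intel_endpoint",
    "doc_intel_key",
    "openai_endpoint",
    "openai_key",
)


def load_config(path="config.json", required=REQUIRED_KEYS):
    """Read the JSON config and raise if any required value is absent or empty."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(config, dict):
        raise ValueError(f"{path} must contain a JSON object")
    missing = [key for key in required if not config.get(key)]
    if missing:
        raise KeyError(f"{path} is missing values for: {', '.join(missing)}")
    return config
```

Calling `load_config()` at the top of a notebook surfaces an empty or forgotten credential before any request is sent.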

Usage

To use the notebooks, follow these steps:

  1. Launch Jupyter Notebook:

    jupyter notebook
  2. Open the notebook you need from the Jupyter interface and follow its instructions to run the document extraction.

License

This project is licensed under the MIT License. See the LICENSE file for more details.
