DocIntel-Extraction-Structured

Overview

The repository is focused on extracting the markdown layout from various documents using document intelligence then using gpt-4o to convert into structured outputs.

Installation

To set up the project, follow these steps:

Clone the repository:

git clone https://github.com/szetinglau/DocIntel-Extraction-Structured.git
cd DocIntel-Extraction-Structured

Create a virtual environment and activate it:

python3 -m venv venv
source venv/bin/activate   # On Windows use `venv\Scripts\activate`

Install the required dependencies:
```
pip install -r requirements.txt
```
Set your credentials in config.json

Usage

To use the notebooks, follow these steps:

Launch Jupyter Notebook:
```
jupyter notebook
```
Open the desired notebook from the Jupyter interface and follow the instructions within the notebook to perform document extraction tasks.

License

This project is licensed under the MIT License. See the LICENSE file for more details.

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
data		data
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
config.json		config.json
content-schema.json		content-schema.json
process-cleared-agreements.ipynb		process-cleared-agreements.ipynb
process-isda-content.ipynb		process-isda-content.ipynb
process-lch-ltd.ipynb		process-lch-ltd.ipynb
requirements.txt		requirements.txt
sample.env		sample.env

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

DocIntel-Extraction-Structured

Overview

Installation

Usage

License

About

Uh oh!

Releases

Packages

Languages

License

KathyLau/DocIntel-Extraction-Structured

Folders and files

Latest commit

History

Repository files navigation

DocIntel-Extraction-Structured

Overview

Installation

Usage

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages