Skip to content
/ gpt-rag-ingestion Public template

The GPT-RAG Data Ingestion service automates processing of diverse documents—PDFs, images, spreadsheets, transcripts, and SharePoint—readying them for Azure AI Search. It applies smart chunking, generates text and image embeddings, and enables rich, multimodal retrieval.

License

Notifications You must be signed in to change notification settings

Azure/gpt-rag-ingestion

Repository files navigation

GPT-RAG Data Ingestion

Part of the GPT-RAG solution.

The GPT-RAG Data Ingestion service automates the processing of diverse document types—such as PDFs, images, spreadsheets, transcripts, and SharePoint files—preparing them for indexing in Azure AI Search. It uses intelligent chunking strategies tailored to each format, generates text and image embeddings, and enables rich, multimodal retrieval experies for agent-based RAG applications.

For full documentation, visit the GPT-RAG documentation site.

Contributing

We welcome contributions! See the contribution guidelines for details on how to contribute.

Trademarks

This project may contain trademarks or logos. Authorized use of Microsoft trademarks or logos must follow Microsoft’s Trademark & Brand Guidelines. Modified versions must not imply sponsorship or cause confusion. Third-party trademarks are subject to their own policies.

About

The GPT-RAG Data Ingestion service automates processing of diverse documents—PDFs, images, spreadsheets, transcripts, and SharePoint—readying them for Azure AI Search. It applies smart chunking, generates text and image embeddings, and enables rich, multimodal retrieval.

Resources

License

Code of conduct

Security policy

Stars

Watchers

Forks

Packages

No packages published

Contributors 15