Add HTML/Webpage Parsing Layer for RAG Pipeline #82

@aliamerj

Description

Right now, our RAG pipeline only handles PDFs, but a lot of valuable content lives on web pages and in raw HTML. We should extend our parser to ingest HTML documents directly and extract both their text and their visuals (for example, images).
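
To make the request concrete, here is a minimal sketch of what such a parsing layer could look like, using only Python's standard library. The names `ParsedPage`, `parse_html`, and the `base_url` parameter are hypothetical and not part of the existing codebase; the sketch also assumes that "visuals" means `<img>` elements, which may be narrower than what we end up needing.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from urllib.parse import urljoin


@dataclass
class ParsedPage:
    """Hypothetical container for what the HTML layer hands to the pipeline."""
    text: str
    images: list = field(default_factory=list)  # dicts with "src" and "alt"


class _Extractor(HTMLParser):
    # Tags whose contents are never user-visible text.
    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self, base_url=""):
        super().__init__(convert_charrefs=True)
        self.base_url = base_url
        self.chunks = []
        self.images = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "img":
            attr_map = dict(attrs)
            src = attr_map.get("src")
            if src:
                # Resolve relative image paths against the page URL.
                self.images.append({
                    "src": urljoin(self.base_url, src),
                    "alt": attr_map.get("alt") or "",
                })

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth:
            return
        # Collapse runs of whitespace and drop empty fragments.
        cleaned = " ".join(data.split())
        if cleaned:
            self.chunks.append(cleaned)


def parse_html(html, base_url=""):
    """Extract visible text and image references from an HTML string."""
    extractor = _Extractor(base_url)
    extractor.feed(html)
    extractor.close()
    return ParsedPage(text="\n".join(extractor.chunks), images=extractor.images)


if __name__ == "__main__":
    sample = '<h1>Title</h1><p>Hello world</p><img src="img/a.png" alt="Chart">'
    page = parse_html(sample, base_url="https://example.com/docs/")
    print(page.text)
    print(page.images)
```

A real implementation would probably want a more forgiving parser, handling for other visual elements such as `<svg>` or `<figure>`, and a way to fetch the image bytes; this only shows the text/visual split that the issue asks for.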
