Open
Labels
good first issue (Good for newcomers)
Description
Currently, the embedding pipeline only scrapes a minimal subset of the metadata available in PDFs. We may as well retrieve all of the available data:
This should just be a matter of updating the sqlite metadata table definition, updating the embedding pipeline to add that data into the metadata.json, and updating the generate_index_metadata.py script to insert it.
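A rough sketch of the schema and insert steps, in case it helps whoever picks this up. Everything named here is an assumption rather than the project's actual code: the `metadata` table name, the `filename` key, the field list (taken from the standard PDF document-information keys), and the shape of the records in metadata.json are all hypothetical. The real table definition and generate_index_metadata.py will likely differ.

```python
import json
import sqlite3

# Hypothetical field list, based on the standard PDF document-information keys.
# Which fields the pipeline should actually store is an assumption.
PDF_INFO_FIELDS = [
    "title", "author", "subject", "keywords",
    "creator", "producer", "creation_date", "mod_date",
]


def ensure_columns(conn, table, fields):
    """Add any of the given TEXT columns that the table does not have yet."""
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    for field in fields:
        if field not in existing:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {field} TEXT")


def insert_metadata(conn, table, records, fields):
    """Insert one row per record; fields missing from a record become NULL."""
    columns = ["filename"] + list(fields)
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    rows = [[record.get(column) for column in columns] for record in records]
    conn.executemany(sql, rows)


# Stand-in for the existing database: a table holding only a minimal subset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadata (filename TEXT PRIMARY KEY, title TEXT)")

# Step 1: widen the table definition.
ensure_columns(conn, "metadata", PDF_INFO_FIELDS)

# Steps 2 and 3: stand-in for metadata.json as written by the embedding
# pipeline, then loaded and inserted into the table.
metadata_json = json.dumps(
    [{"filename": "example.pdf", "title": "Example", "author": "A. Author"}]
)
insert_metadata(conn, "metadata", json.loads(metadata_json), PDF_INFO_FIELDS)
conn.commit()
```

Storing every field as nullable TEXT keeps the change additive, so PDFs that lack a given field still insert cleanly.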