Migrate vector database from Pinecone to Qdrant#27
Open
hunterbryant wants to merge 8 commits into main
Pinecone's free tier was discontinued. Migrated to Qdrant Cloud (1 GB free tier), which uses the same LangChain vector store interface pattern.
- Swap @langchain/pinecone + @pinecone-database/pinecone for @langchain/qdrant + @qdrant/js-client-rest
- Rewrite context.ts to use QdrantVectorStore.fromExistingCollection()
- Update all 4 embed endpoints to use QdrantVectorStore.fromDocuments()
- Delete unused pinecone.ts utility (was dead code)
- Update CLAUDE.md to reflect new stack and env vars (QDRANT_URL, QDRANT_API_KEY, QDRANT_COLLECTION)
https://claude.ai/code/session_01M6Bh7NH98wXMiEeAKJMEAu
The LangChain NotionAPILoader was throwing AggregateErrors when pages were moved or had access issues. This rewrites the endpoint to use @notionhq/client directly, giving better error handling (skips individual pages that fail instead of crashing) and clearer logging. https://claude.ai/code/session_01M6Bh7NH98wXMiEeAKJMEAu
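The skip-on-failure behaviour described above could look roughly like the sketch below. It is not the code from this PR: `loadPagesSkippingFailures` and the injected `fetchPageText` callback are hypothetical names, and the callback stands in for whatever @notionhq/client calls the endpoint actually makes.

```typescript
// Hypothetical sketch: load pages one by one and record failures instead of
// letting a single bad page abort the whole run.
interface LoadedPage {
  id: string;
  text: string;
}

interface FailedPage {
  id: string;
  reason: string;
}

interface LoadSummary {
  loaded: LoadedPage[];
  failed: FailedPage[];
}

async function loadPagesSkippingFailures(
  pageIds: string[],
  // Assumed to wrap the real Notion request for one page.
  fetchPageText: (id: string) => Promise<string>,
): Promise<LoadSummary> {
  const summary: LoadSummary = { loaded: [], failed: [] };
  for (const id of pageIds) {
    try {
      const text = await fetchPageText(id);
      summary.loaded.push({ id, text });
    } catch (err) {
      // A moved or inaccessible page is logged and skipped.
      const reason = err instanceof Error ? err.message : String(err);
      summary.failed.push({ id, reason });
    }
  }
  return summary;
}
```

Returning the failures alongside the successes keeps the logging decision with the caller, which can report exactly which pages were skipped and why.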
The endpoint now streams progress via SSE instead of returning a single JSON response. The admin page shows a live progress bar with page names, count, and status (loading/embedding/done/error). The Index button is disabled while running. https://claude.ai/code/session_01M6Bh7NH98wXMiEeAKJMEAu
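For reviewers unfamiliar with SSE, a progress update on the wire is a `data:` line followed by a blank line. The sketch below shows one way to encode and decode such events. The `IndexProgress` shape and both function names are assumptions based on the fields listed above (page name, count, status), not the actual payload used by this endpoint.

```typescript
// Hypothetical payload shape; the real field names are not shown in this PR.
type IndexStatus = "loading" | "embedding" | "done" | "error";

interface IndexProgress {
  page: string;
  current: number;
  total: number;
  status: IndexStatus;
}

// Server side: serialize one progress update as a single SSE frame.
function formatSseEvent(progress: IndexProgress): string {
  return `data: ${JSON.stringify(progress)}\n\n`;
}

// Client side: decode every complete frame found in a received text chunk.
function parseSseChunk(chunk: string): IndexProgress[] {
  const prefix = "data: ";
  return chunk
    .split("\n\n")
    .filter((frame) => frame.startsWith(prefix))
    .map((frame) => JSON.parse(frame.slice(prefix.length)) as IndexProgress);
}
```

On the server these strings would typically be written to a streamed response with the `text/event-stream` content type; the admin page can then update the progress bar from each decoded event.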
Previously all 2000+ pages were loaded into memory then uploaded in one batch at the end. Now each page is fetched, chunked, and uploaded to Qdrant immediately so progress is saved incrementally. The admin UI also shows a running chunk count alongside the page progress.
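A minimal sketch of the per-page flow described above, under stated assumptions: `indexIncrementally`, `chunkText`, and the injected `fetchPageText` / `uploadChunks` callbacks are hypothetical, the fixed-size character chunker is a stand-in for whatever text splitter the endpoint uses, and `uploadChunks` stands in for the Qdrant upload.

```typescript
// Naive fixed-size chunker, used here only as a placeholder splitter.
function chunkText(text: string, size: number): string[] {
  if (size <= 0) throw new Error("chunk size must be positive");
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += size) {
    chunks.push(text.slice(i, i + size));
  }
  return chunks;
}

interface IndexTotals {
  pages: number;
  chunks: number;
}

// Fetch, chunk, and upload one page at a time so that work already done
// is persisted even if a later page fails.
async function indexIncrementally(
  pageIds: string[],
  fetchPageText: (id: string) => Promise<string>,
  uploadChunks: (pageId: string, chunks: string[]) => Promise<void>,
  chunkSize: number = 1000,
  onProgress?: (totals: IndexTotals) => void,
): Promise<IndexTotals> {
  const totals: IndexTotals = { pages: 0, chunks: 0 };
  for (const id of pageIds) {
    const text = await fetchPageText(id);
    const pageChunks = chunkText(text, chunkSize);
    await uploadChunks(id, pageChunks); // upload before moving to the next page
    totals.pages += 1;
    totals.chunks += pageChunks.length;
    onProgress?.({ ...totals }); // running counts for the admin UI
  }
  return totals;
}
```

Compared with collecting everything and uploading once at the end, this keeps at most one page's chunks in memory and gives the UI a running chunk count after each page.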
Summary
This PR migrates the vector database backend from Pinecone to Qdrant Cloud for the RAG pipeline. All embedding operations and vector retrieval have been updated to use Qdrant's API instead of Pinecone's.
Key Changes
- Deleted the src/lib/utilities/pinecone.ts utility file and removed all Pinecone client initialization code
- Replaced PineconeStore with QdrantVectorStore across all embedding endpoints:
  - src/routes/api/embed/texts/+server.ts
  - src/routes/api/embed/notion-file/+server.ts
  - src/routes/api/embed/notion-url/+server.ts
  - src/routes/api/embed/urls/+server.ts
- Updated src/lib/utilities/context.ts to use QdrantVectorStore.fromExistingCollection() instead of PineconeStore.fromExistingIndex()
- Replaced PINECONE_API_KEY and PINECONE_INDEX with QDRANT_URL, QDRANT_API_KEY, and QDRANT_COLLECTION
- Updated CLAUDE.md to reflect Qdrant as the vector database, including the architecture diagrams and setup instructions
- Replaced @langchain/pinecone and @pinecone-database/pinecone with @langchain/qdrant and @qdrant/js-client-rest
- Updated package.json to allow Node >= 20.0.0 (removed upper bound)
Implementation Details
- The Qdrant store is configured with url, apiKey, and collectionName
- Removed the maxConcurrency parameter from embedding operations (Qdrant handles concurrency differently)
- Return type of getContext() simplified from Promise<string | ScoredVector[]> to Promise<string>
https://claude.ai/code/session_01M6Bh7NH98wXMiEeAKJMEAu
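Since the new env vars are all required, a small validation step can fail fast with a clear message instead of surfacing as a connection error later. The sketch below is an illustration only: `qdrantConfigFromEnv` is a hypothetical helper, not something this PR adds, and it takes the environment as a plain object so it does not depend on how the app loads env vars.

```typescript
// Hypothetical helper: maps the three env vars named in this PR onto the
// url / apiKey / collectionName options listed above.
interface QdrantConfig {
  url: string;
  apiKey: string;
  collectionName: string;
}

type Env = Record<string, string | undefined>;

function qdrantConfigFromEnv(env: Env): QdrantConfig {
  const required = ["QDRANT_URL", "QDRANT_API_KEY", "QDRANT_COLLECTION"];
  // Treat unset and empty values the same way: both are unusable.
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing Qdrant env vars: ${missing.join(", ")}`);
  }
  return {
    url: env["QDRANT_URL"] as string,
    apiKey: env["QDRANT_API_KEY"] as string,
    collectionName: env["QDRANT_COLLECTION"] as string,
  };
}
```

The returned object is intended to be passed as the options argument to QdrantVectorStore.fromExistingCollection() or QdrantVectorStore.fromDocuments(), assuming those calls take url, apiKey, and collectionName as described above.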
Note
Medium Risk
Switches the RAG/embedding backend to Qdrant and updates required environment variables/dependencies, which can break retrieval or embedding jobs if configuration or collection semantics differ. Node engine constraint loosening may also allow untested Node 21+ runtimes in some environments.
Overview
Migrates the vector DB integration from Pinecone to Qdrant Cloud for RAG retrieval and embedding workflows, including updating the documented pipeline/embedding guidance and required env vars to QDRANT_URL/QDRANT_API_KEY/QDRANT_COLLECTION.
Updates package.json to drop Pinecone-related packages in favor of @langchain/qdrant + @qdrant/js-client-rest, and relaxes the Node engines constraint to >= 20.0.0 (removing the previous <21 upper bound).
Written by Cursor Bugbot for commit 3c84d7b.