Add support for remote OpenAI-compatible embeddings by alexleach · Pull Request #480 · tobi/qmd

alexleach · 2026-03-28T06:50:12Z

This is mainly a tidy-up of #116, which fell behind main, but adds support for configurable remote endpoints.

There are already many Issues and PRs requesting that support for remote, OpenAI-compatible endpoints be re-added to the code base, so I apologise for creating a new one, but it's clearly a popular request!

PRs

A lot of these are endpoint-specific, targeting Voyager, Gemini, or OpenRouter. This PR is generic, allowing the use of any embedding provider that follows OpenAI's API specification for embeddings.
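To illustrate what "generic" means here, the following is a minimal sketch of a request against any OpenAI-compatible `/embeddings` endpoint with a configurable base URL. The `RemoteEmbeddingConfig` shape and the `buildEmbeddingRequest` and `embed` helpers are hypothetical names for illustration, not the code in this PR; only the request body (`model`, `input`) and the response layout (`data[].index`, `data[].embedding`) follow OpenAI's embeddings API.

```typescript
// Hypothetical config shape for illustration; the PR's own
// EmbeddingProviderConfig may look different.
interface RemoteEmbeddingConfig {
  baseUrl: string; // e.g. "https://api.openai.com/v1" or a local server URL
  model: string;
  apiKey?: string; // many local servers do not need one
}

interface EmbeddingRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build the HTTP request without sending it, so it is easy to inspect.
function buildEmbeddingRequest(
  config: RemoteEmbeddingConfig,
  input: string[],
): EmbeddingRequest {
  // Strip trailing slashes so "…/v1" and "…/v1/" give the same URL.
  const url = `${config.baseUrl.replace(/\/+$/, "")}/embeddings`;
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (config.apiKey) headers["Authorization"] = `Bearer ${config.apiKey}`;
  const body = JSON.stringify({ model: config.model, input });
  return { url, init: { method: "POST", headers, body } };
}

// Send the request and return one vector per input, in input order.
async function embed(
  config: RemoteEmbeddingConfig,
  input: string[],
): Promise<number[][]> {
  const { url, init } = buildEmbeddingRequest(config, input);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`Embedding request failed: HTTP ${res.status}`);
  const json = (await res.json()) as {
    data: { index: number; embedding: number[] }[];
  };
  // Sort by index in case the server returns items out of order.
  return [...json.data].sort((a, b) => a.index - b.index).map((d) => d.embedding);
}
```

Because only `baseUrl` changes between providers, the same helper could in principle point at a hosted API or at a locally hosted model server.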

Issues

I went through a few of them, and thought that @jonesj38's version was the closest to what I wanted. I made some minor changes, before it fell behind main and needed some clean merging. I created a PR to his repository, but it became quite messy and I have received no feedback from that in a couple of weeks.

Summary
Clearly there are a lot of use-cases for remote endpoints. My use-case, as mentioned in a couple of existing PRs and Issues, is that node-llama-cpp does not build in Docker on Apple Silicon Macs. Even if it did, it wouldn't have support for Apple's GPU.

So, I need to host the models in Docker Model Runner, which is treated as an OpenAI-compatible remote endpoint.

Either way, I am using this fork, but I would much prefer if it was merged upstream so I can benefit from any future code changes, too. (It wasn't easy rebasing the fork on main!)

Replace the rerank() stub with a real listwise reranker using gpt-4o-mini.

- Sends top candidates with query to gpt-4o-mini as a ranking task
- Parses comma-separated index output, handles missing/duplicate indices
- Skips API call for ≤2 documents (not worth the latency)
- Falls back to original order on API failure
- Cost: ~$0.001 per rerank call
- Updated qmd.ts to route through OpenAI reranker instead of skipping

The full qmd query pipeline with OpenAI now:

1. Query expansion (gpt-4o-mini)
2. BM25 + vector search (parallel)
3. RRF fusion
4. Cross-encoder reranking (gpt-4o-mini) ← NEW
5. Position-aware blending
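The parsing and fallback behaviour described in this commit message can be sketched roughly as follows. This is not the code from the commit: `parseRanking`, `rerankOrFallback`, and the injected `askModel` callback are hypothetical names, and the sketch assumes the model replies with 0-based indices separated by commas.

```typescript
// Turn a reply such as "2, 0, 1" into a complete ordering of `count` items.
// Invalid, out-of-range, and duplicate indices are ignored; indices the
// model left out are appended in their original order.
function parseRanking(output: string, count: number): number[] {
  const seen = new Set<number>();
  const order: number[] = [];
  for (const token of output.split(",")) {
    const trimmed = token.trim();
    if (!/^\d+$/.test(trimmed)) continue; // not a plain non-negative integer
    const index = Number(trimmed);
    if (index >= count || seen.has(index)) continue; // out of range or duplicate
    seen.add(index);
    order.push(index);
  }
  for (let i = 0; i < count; i++) {
    if (!seen.has(i)) order.push(i); // keep documents the model skipped
  }
  return order;
}

// `askModel` stands in for the chat-completion call, which is not shown here.
async function rerankOrFallback<T>(
  docs: T[],
  askModel: (docs: T[]) => Promise<string>,
): Promise<T[]> {
  if (docs.length <= 2) return docs; // too few documents to justify a call
  try {
    const output = await askModel(docs);
    return parseRanking(output, docs.length).map((i) => docs[i] as T);
  } catch {
    return docs; // on API failure, keep the original order
  }
}
```

For example, `parseRanking("2, 0, 2, 9, x", 4)` yields `[2, 0, 1, 3]`: the duplicate `2`, the out-of-range `9`, and the non-numeric `x` are dropped, and the unmentioned indices `1` and `3` are appended at the end.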

alexleach and others added 9 commits March 27, 2026 21:53

feat: add OpenAI embedding support

571e930

feat: add OpenAI embedding and query expansion support

8cb2441

fix: use default embedding LLM for hybrid vector queries

ff83ba1

feat: configurable remote OpenAI URLs

094ef0a

fix: add baseUrl to EmbeddingProviderConfig

575af82

fix: don't build/launch node-llama-cpp when using OpenUI

447847e

fix: restore rerank function after conflict resolution

ccf3ff6

chore: bump openai package version

12c7d18

alexleach wants to merge 9 commits into tobi:main from alexleach:feat/openai-embeddings-clean-backup
