
feat(inference): add chat completion message listing endpoint. #5459

Open
skamenan7 wants to merge 36 commits into ogx-ai:main from
skamenan7:feat/MessagesRoute-Jira-3612

Conversation

Contributor

@skamenan7 skamenan7 commented Apr 7, 2026

What does this PR do?

Adds GET /v1/chat/completions/{completion_id}/messages, the OpenAI-compatible endpoint for listing messages from a stored chat completion.

The route reads from the inference store, flattens input and output messages into a single paginated list with synthetic IDs ({completion_id}-{index}), and supports the after, limit, and order query params. It follows the same pattern as GET /conversations/{id}/items.
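
To make the flattening and cursor behavior concrete, here is a minimal Python sketch. It is not the code in this PR: the function name `list_messages`, the `MessagePage` dataclass, and the default `limit` of 20 are assumptions made for illustration. Only the ID scheme, the query params, and the response fields come from the description above.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class MessagePage:
    """Hypothetical response shape mirroring the fields shown in the test plan."""
    data: list
    first_id: Optional[str]
    last_id: Optional[str]
    has_more: bool


def list_messages(
    completion_id: str,
    input_messages: list,
    output_messages: list,
    after: Optional[str] = None,
    limit: int = 20,  # assumed default, not taken from the PR
    order: str = "asc",
) -> MessagePage:
    if limit < 1:
        raise ValueError("limit must be >= 1")

    # Flatten inputs first, then outputs, and assign synthetic IDs by position.
    flat: list[dict[str, Any]] = [
        {**msg, "id": f"{completion_id}-{i}"}
        for i, msg in enumerate([*input_messages, *output_messages])
    ]
    if order == "desc":
        flat.reverse()

    # Resolve the cursor: the page starts right after the message it names.
    start = 0
    if after is not None:
        ids = [m["id"] for m in flat]
        if after not in ids:
            raise ValueError(
                f"cursor '{after}' not found in completion '{completion_id}'"
            )
        start = ids.index(after) + 1

    page = flat[start : start + limit]
    return MessagePage(
        data=page,
        first_id=page[0]["id"] if page else None,
        last_id=page[-1]["id"] if page else None,
        has_more=start + limit < len(flat),
    )
```

In this sketch an unknown cursor raises `ValueError`; a route handler would presumably translate that into the 400 response shown in the test plan.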

According to the conformance report at https://github.com/llamastack/llama-stack/blob/main/docs/docs/api-openai/conformance.mdx#chat, the /messages route is currently missing.

Test Plan

Started a local server with Ollama (remote::ollama provider, llama3.2:3b),
then ran these against it:

Create a completion:

curl -s -X POST http://localhost:8321/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"ollama/llama3.2:3b","messages":[
    {"role":"system","content":"Be brief."},
    {"role":"user","content":"Say hello"}
  ]}' | jq '{id, model, content: .choices[0].message.content}'
{"id": "chatcmpl-899", "model": "ollama/llama3.2:3b", "content": "Hello!"}

List messages (the new route):

curl -s http://localhost:8321/v1/chat/completions/chatcmpl-899/messages | jq .
{
  "object": "list",
  "data": [
    {"id": "chatcmpl-899-0", "role": "system", "content": "Be brief."},
    {"id": "chatcmpl-899-1", "role": "user", "content": "Say hello"},
    {"id": "chatcmpl-899-2", "role": "assistant", "content": "Hello!"}
  ],
  "first_id": "chatcmpl-899-0",
  "last_id": "chatcmpl-899-2",
  "has_more": false
}

Pagination (limit=1, then cursor):

curl -s 'http://localhost:8321/v1/chat/completions/chatcmpl-899/messages?limit=1' | jq .
{"data": [{"id": "chatcmpl-899-0", "role": "system"}], "has_more": true}
curl -s 'http://localhost:8321/v1/chat/completions/chatcmpl-899/messages?after=chatcmpl-899-0' | jq .
{"data": [{"id": "chatcmpl-899-1", "role": "user"}, {"id": "chatcmpl-899-2", "role": "assistant"}], "has_more": false}
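
The same cursor walk can be scripted from a client. The sketch below assumes only the response fields shown above (`data`, `has_more`, and per-message `id`); the helper names `iter_messages` and `http_fetcher` are hypothetical and not part of any SDK.

```python
import json
from typing import Any, Callable, Iterator, Optional
from urllib.parse import urlencode
from urllib.request import urlopen


def iter_messages(
    fetch_page: Callable[[Optional[str]], dict],
) -> Iterator[dict]:
    """Yield every message by following the `after` cursor until has_more is false."""
    after: Optional[str] = None
    while True:
        page = fetch_page(after)
        yield from page["data"]
        # Stop on the last page, or on an empty page to avoid looping forever.
        if not page.get("has_more") or not page["data"]:
            return
        after = page["data"][-1]["id"]


def http_fetcher(base_url: str, completion_id: str, limit: int = 1):
    """Build a fetch_page callable that hits the new route (hypothetical helper)."""
    def fetch(after: Optional[str]) -> dict:
        params: dict[str, Any] = {"limit": limit}
        if after is not None:
            params["after"] = after
        url = (
            f"{base_url}/v1/chat/completions/{completion_id}/messages"
            f"?{urlencode(params)}"
        )
        with urlopen(url) as resp:
            return json.load(resp)
    return fetch
```

Against the local server from this test plan, usage would look like `for msg in iter_messages(http_fetcher("http://localhost:8321", "chatcmpl-899")): ...`. Passing the fetcher in as a callable keeps the loop testable without a running server.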

Invalid cursor returns 400:

curl -s 'http://localhost:8321/v1/chat/completions/chatcmpl-899/messages?after=bogus' | jq .
{"error": {"message": "Failed to list chat completion messages: cursor 'bogus' not found in completion 'chatcmpl-899'."}}

Multi-turn (4 input + 1 output = 5 messages):

curl -s http://localhost:8321/v1/chat/completions/chatcmpl-623/messages | jq '[.data[] | {id, role, content}]'
[
  {"id": "chatcmpl-623-0", "role": "system", "content": "You are a math tutor. Be brief."},
  {"id": "chatcmpl-623-1", "role": "user", "content": "What is 2+2?"},
  {"id": "chatcmpl-623-2", "role": "assistant", "content": "4"},
  {"id": "chatcmpl-623-3", "role": "user", "content": "What is 4+4?"},
  {"id": "chatcmpl-623-4", "role": "assistant", "content": "8"}
]

Unit tests

Unit tests in tests/unit/utils/inference/test_inference_store.py covering:

  • Basic input/output message listing
  • Pagination with limit and cursor
  • Descending order
  • Not-found and invalid-cursor error handling
  • Multi-choice (n>1) global ID assignment
  • Tool call preservation
  • After-last-message empty page

uv run pytest tests/unit/utils/inference/test_inference_store.py -v -k "list_chat_completion_messages"
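
For the multi-choice (n>1) case listed above, one plausible reading of "global ID assignment" is that a single index counter runs across the inputs and then across every choice's output message, so no two messages share an ID. The sketch below illustrates that reading only; the helper name `assign_global_ids` and the ordering of choices by their `index` field are assumptions, not confirmed details of the implementation.

```python
from typing import Any


def assign_global_ids(
    completion_id: str,
    input_messages: list,
    choices: list,
) -> list:
    """Number inputs and all choice outputs with one shared counter (assumed scheme)."""
    # Assumption: outputs follow inputs, ordered by each choice's `index`.
    ordered = sorted(choices, key=lambda c: c["index"])
    messages: list[dict[str, Any]] = [
        *input_messages,
        *(c["message"] for c in ordered),
    ]
    return [
        {**msg, "id": f"{completion_id}-{i}"}
        for i, msg in enumerate(messages)
    ]
```

With two input messages and two choices, this yields four messages with suffixes 0 through 3, which is the uniqueness property such a unit test would want to pin down.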

Integration test (against running server):

uv run pytest tests/integration/inference/ -v -k "messages" \
  --text-model ollama/llama3.2:3b --stack-config=http://localhost:8321

Passes. The test covers pagination with limit=1, the after cursor, and role and ID assertions.

@meta-cla meta-cla Bot added the CLA Signed label Apr 7, 2026
Contributor

github-actions Bot commented Apr 7, 2026

✱ Stainless preview builds

This PR will update the llama-stack-client SDKs with the following commit message.

feat(inference): add chat completion message listing endpoint.

Edit this comment to update it. It will appear in the SDK's changelogs.

llama-stack-client-openapi studio · code · diff

Your SDK build had at least one "warning" diagnostic, but this did not represent a regression.
generate ⚠️

llama-stack-client-python studio · code · diff

Your SDK build had at least one "warning" diagnostic, but this did not represent a regression.
generate ⚠️ · build ✅ · lint ❗ · test ✅

pip install https://pkg.stainless.com/s/llama-stack-client-python/00c5fd261ef2089ddee36d9c17922643c5d4d854/ogx_client-0.7.0a2-py3-none-any.whl

llama-stack-client-go studio · conflict

Your SDK build had at least one new note diagnostic, which is a regression from the base state.

New diagnostics (1 note)
💡 Schema/EnumHasOneMember: This enum schema has just one member, so it could be defined using [`const`](https://json-schema.org/understanding-json-schema/reference/const).

llama-stack-client-node studio · conflict

Your SDK build resulted in a merge conflict between your custom code and the newly generated changes, but this did not represent a regression.


This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push.
If you push custom code to the preview branch, re-run this workflow to update the comment.
Last updated: 2026-04-28 16:23:25 UTC

Contributor

mergify Bot commented Apr 7, 2026

This pull request has merge conflicts that must be resolved before it can be merged. @skamenan7 please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify Bot added the needs-rebase label Apr 7, 2026
@skamenan7 skamenan7 force-pushed the feat/MessagesRoute-Jira-3612 branch 2 times, most recently from 617ff74 to ed847f6 Compare April 7, 2026 17:21
@mergify mergify Bot removed the needs-rebase label Apr 7, 2026
@skamenan7 skamenan7 marked this pull request as ready for review April 7, 2026 18:59
@skamenan7 skamenan7 force-pushed the feat/MessagesRoute-Jira-3612 branch 3 times, most recently from 8c2b05d to f051ef4 Compare April 8, 2026 15:50
Contributor

mergify Bot commented Apr 8, 2026

This pull request has merge conflicts that must be resolved before it can be merged. @skamenan7 please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify Bot added the needs-rebase label Apr 8, 2026
@skamenan7 skamenan7 force-pushed the feat/MessagesRoute-Jira-3612 branch 2 times, most recently from 2e577bf to 57af7f5 Compare April 8, 2026 16:51
@mergify mergify Bot removed the needs-rebase label Apr 8, 2026
@skamenan7 skamenan7 force-pushed the feat/MessagesRoute-Jira-3612 branch 2 times, most recently from c4aa828 to eb9adcb Compare April 8, 2026 19:15
Contributor

github-actions Bot commented Apr 8, 2026

Recording workflow completed

Providers: azure

Recordings have been generated and will be committed automatically by the companion workflow.

View workflow run

Fork PR: Recordings will be committed if you have "Allow edits from maintainers" enabled.

Contributor

github-actions Bot commented Apr 8, 2026

Recordings committed successfully

Recordings from the integration tests have been committed to this PR.

View commit workflow

@skamenan7 skamenan7 force-pushed the feat/MessagesRoute-Jira-3612 branch from 62a79e6 to d138aa4 Compare April 8, 2026 20:17
Contributor

mergify Bot commented Apr 9, 2026

This pull request has merge conflicts that must be resolved before it can be merged. @skamenan7 please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@skamenan7 skamenan7 marked this pull request as ready for review April 17, 2026 20:56
Collaborator

@cdoern cdoern left a comment

small review, initially looking at this I see new conformance issues we should fix

Comment thread docs/docs/api-openai/conformance.mdx Outdated
Comment thread docs/docs/api-openai/conformance.mdx Outdated
### Chat

- **Score:** 98.5% · **Issues:** 5 · **Missing:** 1
+ **Score:** 98.2% · **Issues:** 7 · **Missing:** 1
Collaborator

same here

Contributor Author

I kept this PR scoped to the actual schema fix, so I only changed the API side here.
There’s still one remaining Chat conformance item on GET /chat/completions/{completion_id}/messages. That looks like a schema-shape mismatch in the OpenAI conformance check, not an actual endpoint behavior issue. I didn’t make that conformance-side change here, but I can update it separately if you want.

…formance tooling.

Keep the API-side schema fix in this change while leaving the remaining messages conformance false-positive to a separate OpenAI coverage follow-up.

Signed-off-by: skamenan7 <skamenan@redhat.com>
@skamenan7 skamenan7 requested a review from cdoern April 22, 2026 17:26
Comment thread docs/docs/api-openai/conformance.mdx Outdated

| Property | Issues |
|----------|--------|
| `responses.200.content.application/json.properties.data.items` | Type added: ['object'] |
Collaborator

this is a net-new issue introduced by the new route. can we fix this? if not possible let me know.

Contributor Author

@skamenan7 skamenan7 Apr 23, 2026

yeah, fixed this in 2e6284b73.

The remaining item was coming from the OpenAI coverage diff on ChatCompletionMessageList.data.items rather than the route behavior itself. OpenAI wraps that item schema differently (allOf vs our bare $ref), so I normalized that in scripts/openai_coverage.py for the conformance check.

The /chat/completions/{completion_id}/messages entry is back to 0 issues now.

PTAL @cdoern the above change.
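
For readers unfamiliar with this kind of schema normalization, the sketch below shows one way an `allOf` wrapper can be collapsed before two OpenAPI schemas are diffed. It is an illustration only and not the code in scripts/openai_coverage.py: the function name `unwrap_single_allof` is hypothetical, and it handles only the simplest case, where `allOf` holds exactly one member. The real OpenAI wrapper may carry more members, in which case the actual normalization would need to differ.

```python
from typing import Any


def unwrap_single_allof(schema: Any) -> Any:
    """Recursively replace {"allOf": [X]} with X so it compares equal to a bare X."""
    if isinstance(schema, list):
        return [unwrap_single_allof(item) for item in schema]
    if not isinstance(schema, dict):
        return schema

    all_of = schema.get("allOf")
    if isinstance(all_of, list) and len(all_of) == 1 and isinstance(all_of[0], dict):
        # Keep sibling keywords (e.g. "description") alongside the unwrapped member.
        siblings = {k: v for k, v in schema.items() if k != "allOf"}
        return unwrap_single_allof({**all_of[0], **siblings})

    # Multi-member allOf is left alone; only its children are normalized.
    return {key: unwrap_single_allof(value) for key, value in schema.items()}
```

Applying the same normalization to both sides before diffing means a wrapper-only difference no longer shows up as a "Type added" issue, while genuine differences in the referenced schema still do.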

Normalize ChatCompletionMessageList items during OpenAI coverage checks so the new messages endpoint no longer reports a false-positive schema regression.

Signed-off-by: skamenan7 <skamenan@redhat.com>
@skamenan7 skamenan7 requested a review from cdoern April 24, 2026 14:50
skamenan7 and others added 16 commits April 24, 2026 16:14
Signed-off-by: skamenan7 <skamenan@redhat.com>
Signed-off-by: skamenan7 <skamenan@redhat.com>
Signed-off-by: skamenan7 <skamenan@redhat.com>
Adds GET /v1/chat/completions/{completion_id}/messages, the OpenAI-compatible
endpoint for listing messages from a stored chat completion. The route reads
from the inference store, flattens input and output messages into a single
paginated list with synthetic IDs, and supports after, limit, and order
query params.

Signed-off-by: Sumanth Kamenani <skamenan@redhat.com>
Signed-off-by: skamenan7 <skamenan@redhat.com>
…s pagination.

Signed-off-by: skamenan7 <skamenan@redhat.com>
Made-with: Cursor
Signed-off-by: skamenan7 <skamenan@redhat.com>
Co-Authored-By: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Adds GET /v1/chat/completions/{completion_id}/messages, the OpenAI-compatible
endpoint for listing messages from a stored chat completion. The route reads
from the inference store, flattens input and output messages into a single
paginated list with synthetic IDs, and supports after, limit, and order
query params.

Signed-off-by: Sumanth Kamenani <skamenan@redhat.com>
Signed-off-by: skamenan7 <skamenan@redhat.com>
…s pagination.

Signed-off-by: skamenan7 <skamenan@redhat.com>
Made-with: Cursor
Signed-off-by: skamenan7 <skamenan@redhat.com>
Prevent the chat completion messages listing endpoint from failing when stored input messages include multipart file content. Filter unsupported parts in the listing response and extend the regression coverage for both file-only and mixed multipart inputs.

Signed-off-by: skamenan7 <skamenan@redhat.com>
…formance tooling.

Keep the API-side schema fix in this change while leaving the remaining messages conformance false-positive to a separate OpenAI coverage follow-up.

Signed-off-by: skamenan7 <skamenan@redhat.com>
Normalize ChatCompletionMessageList items during OpenAI coverage checks so the new messages endpoint no longer reports a false-positive schema regression.

Signed-off-by: skamenan7 <skamenan@redhat.com>
Signed-off-by: skamenan7 <skamenan@redhat.com>
Signed-off-by: skamenan7 <skamenan@redhat.com>
Signed-off-by: skamenan7 <skamenan@redhat.com>
Signed-off-by: skamenan7 <skamenan@redhat.com>

Labels

CLA Signed


2 participants