feat: optimize client conversion performance #162
Myriad-Dreamin wants to merge 3 commits into EmaFanClub:main
Conversation
Pull request overview
This PR refactors the LLM integration to improve conversion performance by introducing an incremental MessageHistory that caches provider-specific message shapes, avoiding repeated full-history conversions on each request.
Changes:
- Introduces MessageHistory in the LLM base layer and updates provider clients to incrementally append/merge API-formatted messages.
- Reworks LLMClient usage to build a request handler (buildHandler) and moves retry utilities into the core LLM modules.
- Updates the agent loop to use MessageHistory and the new handler-based request flow.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| packages/ema/src/schema.ts | Removes now-unneeded type guard helpers from the schema module. |
| packages/ema/src/llm/retry.ts | Deletes the standalone retry module (logic moved elsewhere). |
| packages/ema/src/llm/base.ts | Adds MessageHistory and relocates retry primitives (RetryConfig, RetryExhaustedError, isAbortError). |
| packages/ema/src/llm/client.ts | Introduces createHistory() and buildHandler() + a new withRetry() wrapper, replacing the prior generate() entrypoint. |
| packages/ema/src/llm/openai_client.ts | Refactors conversion to incremental appendMessage() + makeApiRequest(history, ...) returning LLMResponse. |
| packages/ema/src/llm/google_client.ts | Refactors conversion to incremental appendMessage() + makeApiRequest(history, ...) returning LLMResponse. |
| packages/ema/src/agent.ts | Migrates the agent loop to call the new handler with MessageHistory and append responses/tool results to history. |
Comments suppressed due to low confidence (1)
packages/ema/src/agent.ts:303
`ContextManager.history.addModelMessage()` updates only `history`, but `checkCompleteMessages(this.contextManager.messages)` still inspects the stale `_state.messages` array. This can prevent completion detection (or throw if `_state.messages` is empty). Consider making `messages` a view of `history.messages`, or updating `_state.messages` whenever history is appended, and use the same source of truth throughout the loop.
```ts
// Add model message to context
this.contextManager.history.addModelMessage(response);
// Check if task is complete (no tool calls)
if (checkCompleteMessages(this.contextManager.messages)) {
  this.events.emit("runFinished", {
    ok: true,
```
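The fix suggested above, deriving `messages` from `history.messages` so there is a single source of truth, could be sketched as follows. This is a minimal illustration with simplified stand-ins for the EMA `Message` and `MessageHistory` types, not the actual classes from the PR:

```typescript
// Sketch: ContextManager exposes `messages` as a read-only view of
// history.messages, so appends through the history are always visible to
// completion checks. Types are simplified stand-ins for the real EMA schema.
type Message = { role: string; contents: unknown[] };

class MessageHistory {
  private _messages: Message[] = [];
  get messages(): readonly Message[] {
    return this._messages;
  }
  appendMessage(message: Message): this {
    this._messages.push(message);
    // (the real class would also update the provider-specific apiMessages here)
    return this;
  }
}

class ContextManager {
  readonly history = new MessageHistory();
  // Derived view: there is no separate _state.messages array to go stale.
  get messages(): readonly Message[] {
    return this.history.messages;
  }
}

const ctx = new ContextManager();
ctx.history.appendMessage({ role: "model", contents: [] });
console.log(ctx.messages.length); // → 1, the appended message is visible immediately
```

With this shape, `checkCompleteMessages(this.contextManager.messages)` would always see the same list the history mutates.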
```ts
 * @param messages Internal message array (schema compatible with the selected provider)
 * @param signal Optional abort signal
 */
generate(
  messages: MessageHistory,
  signal?: AbortSignal,
): Promise<LLMResponse> {
  return handler(messages, apiTools, systemPrompt, signal);
},
};
}
```
This PR removes LLMClient.generate(...), but there are still call sites in the repo (e.g. packages/ema/src/server.ts:389 and packages/ema/src/tests/llm/openai_client.spec.ts:31) that invoke generate. Either re-introduce a backward-compatible generate wrapper on LLMClient (implemented via createHistory() + buildHandler()), or update those call sites in this PR to the new handler/history API so the build/tests don’t break.
Suggested change:

```diff
- * @param messages Internal message array (schema compatible with the selected provider)
+ * @param messages Internal message array or history (schema compatible with the selected provider)
  * @param signal Optional abort signal
  */
 generate(
-  messages: MessageHistory,
+  messages: MessageHistory | Message[],
   signal?: AbortSignal,
 ): Promise<LLMResponse> {
   return handler(messages, apiTools, systemPrompt, signal);
 },
};
}
```

Or, as a backward-compatible wrapper on `LLMClient`:

```ts
/**
 * Backward-compatible wrapper around the handler-based generate API.
 *
 * This method preserves the pre-refactor interface used by existing call
 * sites, while delegating to {@link buildHandler} so that the new retry and
 * routing logic is reused.
 *
 * @param messages Chat messages to send to the model.
 * @param tools Optional tool definitions (EMA schema).
 * @param systemPrompt Optional system instruction text.
 * @param signal Optional abort signal.
 * @returns The LLM response.
 */
generate(
  messages: Message[],
  tools?: Tool[],
  systemPrompt?: string,
  signal?: AbortSignal,
): Promise<LLMResponse> {
  const handler = this.buildHandler(tools, systemPrompt);
  return handler.generate(messages, signal);
}
```
```diff
 case "function_call": {
   items.push({
-    type: "function_call_output",
+    type: "function_call",
     call_id: content.id!,
-    output: JSON.stringify(content.result),
+    name: content.name,
+    arguments: JSON.stringify(content.args),
   });
```
FunctionCall.id is optional in the EMA schema, but the OpenAI adapter uses a non-null assertion (content.id!) when building call_id. If a tool call/response is missing an id (e.g., provider doesn’t return one), this will send undefined to the OpenAI API and likely fail. Consider requiring ids for OpenAI-bound tool items (validate and throw a clear error) or generating a stable fallback id when absent.
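One way to handle the missing-id case, validating or generating a stable fallback id before building the OpenAI-bound item, could look like the sketch below. `resolveCallId` and `toToolCallItem` are illustrative names, not part of the PR, and the fallback scheme is an assumption:

```typescript
// Sketch: require or synthesize a call_id for OpenAI-bound tool items.
// `FunctionCallLike` is a simplified stand-in for the EMA FunctionCall content.
type FunctionCallLike = { id?: string; name: string; args: unknown };

let fallbackCounter = 0;

// Returns the existing id, or a process-local fallback; a real implementation
// would record the fallback so the matching function_call_output reuses it.
function resolveCallId(content: FunctionCallLike): string {
  if (content.id) return content.id;
  return `call_fallback_${fallbackCounter++}`;
}

function toToolCallItem(content: FunctionCallLike) {
  return {
    type: "function_call" as const,
    call_id: resolveCallId(content),
    name: content.name,
    arguments: JSON.stringify(content.args),
  };
}

const item = toToolCallItem({ name: "search", args: { q: "ema" } });
console.log(item.call_id); // → "call_fallback_0" when no id was provided
```

Alternatively, throwing a clear error when `id` is absent (instead of generating one) keeps the failure close to its cause rather than surfacing as an opaque API rejection.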
```ts
export class MessageHistory<M = any> {
  messages: Message[] = [];
  private apiMessages: M[] = [];
```
MessageHistory.messages is publicly mutable. If callers push/splice this array directly, apiMessages will become out of sync (since only appendMessage() updates both). Consider making messages private/readonly and exposing a read-only view (or a getMessages() copy) to preserve the invariant that messages and apiMessages represent the same history.
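Encapsulating the array as suggested could be sketched like this. The types are simplified stand-ins, and `apiLength` is added purely so the invariant is observable in the example:

```typescript
// Sketch: keep the canonical array private so every mutation flows through
// appendMessage(), preserving the messages/apiMessages invariant.
type Message = { role: string; contents: unknown[] };

class MessageHistory<M = unknown> {
  private _messages: Message[] = [];
  private apiMessages: M[] = [];

  constructor(private readonly convert: (m: Message) => M) {}

  // Read-only view: callers cannot push/splice and desync apiMessages.
  get messages(): readonly Message[] {
    return this._messages;
  }

  appendMessage(message: Message): this {
    this._messages.push(message);
    this.apiMessages.push(this.convert(message));
    return this;
  }

  // Illustrative accessor so the invariant can be checked from outside.
  get apiLength(): number {
    return this.apiMessages.length;
  }
}

const history = new MessageHistory((m) => ({ role: m.role }));
history.appendMessage({ role: "user", contents: [] });
console.log(history.messages.length === history.apiLength); // → true
```

Note that `readonly Message[]` only prevents mutation at the type level; a `getMessages()` returning a copy would also guard against callers that cast it away.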
```ts
if (config.max_retries <= 0) {
  throw new Error("Max retries must be greater than 0");
}
if (!config.enabled) {
  return await fn(...args);
}
```
withRetry validates config.max_retries before checking config.enabled. If retries are disabled, max_retries shouldn’t matter; current code will still throw for max_retries <= 0 even when enabled === false. Consider moving the max_retries validation to after the !config.enabled early return (or skipping validation when disabled).
Suggested change:

```diff
-if (config.max_retries <= 0) {
-  throw new Error("Max retries must be greater than 0");
-}
 if (!config.enabled) {
   return await fn(...args);
 }
+if (config.max_retries <= 0) {
+  throw new Error("Max retries must be greater than 0");
+}
```
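A minimal `withRetry` with the corrected ordering could look like the sketch below. It uses a simplified `RetryConfigLike` shape; the real EMA implementation also handles abort detection (`isAbortError`) and logging:

```typescript
// Sketch: validate max_retries only when retries are actually enabled.
interface RetryConfigLike {
  enabled: boolean;
  max_retries: number;
  initial_delay: number; // seconds
  max_delay: number; // seconds
  exponential_base: number;
}

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

async function withRetry<T>(
  config: RetryConfigLike,
  fn: () => Promise<T>,
): Promise<T> {
  // Disabled: bypass entirely, so max_retries is never validated.
  if (!config.enabled) {
    return await fn();
  }
  if (config.max_retries <= 0) {
    throw new Error("Max retries must be greater than 0");
  }
  let lastError: unknown;
  // One initial attempt plus up to max_retries retries.
  for (let attempt = 0; attempt <= config.max_retries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt === config.max_retries) break;
      // Exponential backoff, capped at max_delay.
      const delaySeconds = Math.min(
        config.initial_delay * config.exponential_base ** attempt,
        config.max_delay,
      );
      await sleep(delaySeconds * 1000);
    }
  }
  throw lastError;
}
```

With this ordering, `enabled: false` combined with `max_retries: 0` simply calls `fn()` once instead of throwing.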
```ts
/**
 * Elegant retry mechanism module
 *
 * Provides decorators and utility functions to support retry logic for async functions.
 *
 * Features:
 * - Supports exponential backoff strategy
 * - Configurable retry count and intervals
 * - Supports specifying retryable exception types
 * - Detailed logging
 * - Fully decoupled, non-invasive to business code
 */
export class RetryConfig {
  constructor(
    /**
     * Whether to enable retry mechanism
     */
    public readonly enabled: boolean = true,
    /**
     * Maximum number of retries
     */
    public readonly max_retries: number = 3,
    /**
     * Initial delay time (seconds)
     */
    public readonly initial_delay: number = 1.0,
    /**
     * Maximum delay time (seconds)
     */
    public readonly max_delay: number = 60.0,
    /**
     * Exponential backoff base
     */
    public readonly exponential_base: number = 2.0,
    /**
     * Retryable exception types
     */
    // public readonly retryable_exceptions: Array<typeof Error> = [Error],
  ) {}
}

export function isAbortError(error: unknown): boolean {
  if (!(error instanceof Error)) {
    return false;
  }
  if (error.name === "AbortError") {
    return true;
  }
  return error.message.toLowerCase().includes("abort");
}
```
RetryConfig/RetryExhaustedError/isAbortError were moved into llm/base.ts and llm/retry.ts is deleted, but packages/ema/src/config.ts still imports/exports RetryConfig from ./llm/retry (now missing). Please update the public import/export path (and any downstream imports) to point at the new location to avoid a module-not-found build failure.
```diff
 case "function_response": {
   items.push({
-    type: "function_call",
+    type: "function_call_output",
     call_id: content.id!,
-    name: content.name,
-    arguments: JSON.stringify(content.args),
+    output: JSON.stringify(content.result),
   });
```
Same issue for tool outputs: FunctionResponse.id is optional, but call_id: content.id! assumes it exists. If id is undefined, the request payload will be invalid for OpenAI tool output items. Recommend validating id presence for function_response before pushing a function_call_output item (or generating a fallback id consistently with the originating call).
```ts
/**
 * Holds the messages for the specific LLM provider.
 */
export class MessageHistory<M = any> {
  messages: Message[] = [];
  private apiMessages: M[] = [];

  constructor(private readonly client: LLMClientBase<M>) {}

  getApiMessagesForClient(client: LLMClientBase<M>): M[] {
    if (client !== this.client) {
      // this ensures that we always give correct message format to the client
      throw new Error(
        `Client mismatch: converted to ${this.client.constructor.name} while expected ${client.constructor.name}`,
      );
    }
    return this.apiMessages;
  }

  /** Adds a user message to context. */
  addUserMessage(contents: Content[]): void {
    this.appendMessage({ role: "user", contents: contents });
  }

  /** Adds a model message to context. */
  addModelMessage(response: LLMResponse): void {
    this.appendMessage(response.message);
  }

  /** Adds a tool result message to context. */
  addToolMessage(contents: FunctionResponse[]): void {
    this.appendMessage({ role: "user", contents: contents });
  }

  appendMessage(message: Message): this {
    this.messages.push(message);
    this.apiMessages = this.client.appendMessage(this.apiMessages, message);
    return this;
  }
}
```
New core behavior (MessageHistory incremental conversion + LLMClient.buildHandler/withRetry) isn’t covered by tests (no MessageHistory references under packages/ema/src/tests). Adding vitest coverage for history append/merging and retry enabled/disabled paths would help prevent regressions, especially since this PR changes the primary LLM call flow.
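A test along the suggested lines could be sketched as follows. It uses simplified stand-ins for the EMA types and plain assertions for self-containment; a real spec would live under packages/ema/src/tests and use vitest's `describe`/`it`/`expect`:

```typescript
// Sketch: the invariant a MessageHistory spec could assert — every append
// keeps the internal and API-formatted message lists the same length/order.
type Message = { role: string; contents: unknown[] };

class FakeClient {
  // Mimics the incremental appendMessage(apiMessages, message) contract.
  appendMessage(apiMessages: string[], message: Message): string[] {
    return [...apiMessages, message.role];
  }
}

class MessageHistory {
  messages: Message[] = [];
  private apiMessages: string[] = [];
  constructor(private readonly client: FakeClient) {}
  appendMessage(message: Message): this {
    this.messages.push(message);
    this.apiMessages = this.client.appendMessage(this.apiMessages, message);
    return this;
  }
  // Illustrative accessor so the spec can observe the converted side.
  get apiSnapshot(): readonly string[] {
    return this.apiMessages;
  }
}

const history = new MessageHistory(new FakeClient());
history.appendMessage({ role: "user", contents: [] });
history.appendMessage({ role: "model", contents: [] });
console.log(history.apiSnapshot); // → ["user", "model"]
```

A companion spec for `withRetry` would cover both the `enabled: false` bypass and the exhausted-retries error path.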
```diff
 get messages(): Message[] {
-  return this.state.messages;
+  return this._state.messages;
 }

 set messages(v: Message[]) {
-  this.state.messages = v;
+  this._state.messages = v;
+  this.history = v.reduce(
+    (acc, msg) => acc.appendMessage(msg),
+    this.llmClient.createHistory(),
+  );
 }
```
ContextManager.messages rebuilds history from the passed array, but later mutations happen via history.addModelMessage/addToolMessage without updating _state.messages. This splits state between two histories and makes getHistory()/state.messages inaccurate after the loop starts. Recommend keeping only one canonical store (preferably history.messages) and deriving messages from it, or ensuring all append operations go through a single API that updates both.
No description provided.