
Conversation

@JonasHelming
Contributor

What it does

Adds a TodoList feature to the AI chat system, allowing agents to track task progress during conversations.
The todo tools are not yet integrated into any agent.

  • Introduces TodoList interface and TodoListImpl in @theia/ai-chat
  • Follows the per-request pattern (like ChangeSet) with ChatTreeTodoList aggregator for branch-aware state
  • Adds todoWrite and todoRead tools in @theia/ai-ide
  • Persists todo state with chat sessions
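The per-request pattern described above could be sketched roughly as follows. The type and class names mirror those mentioned in this PR (`TodoItemState`, `TodoListImpl`, `SerializedTodoItem`), but the bodies are illustrative assumptions, not the actual implementation:

```typescript
// Hypothetical sketch of the per-request todo list; actual code in
// @theia/ai-chat may differ.
export type TodoItemState = 'pending' | 'completed';

export interface TodoItem {
    readonly id: string;
    content: string;
    state: TodoItemState;
}

export interface SerializedTodoItem {
    id: string;
    content: string;
    state: TodoItemState;
}

export class TodoListImpl {
    private items = new Map<string, TodoItem>();

    addItem(content: string): TodoItem {
        // Ids are generated locally here for illustration.
        const item: TodoItem = { id: `todo-${this.items.size + 1}`, content, state: 'pending' };
        this.items.set(item.id, item);
        return item;
    }

    updateState(id: string, state: TodoItemState): boolean {
        const item = this.items.get(id);
        if (!item) {
            return false;
        }
        item.state = state;
        return true;
    }

    // Snapshot used when persisting todo state with the chat session.
    toSerializable(): SerializedTodoItem[] {
        return Array.from(this.items.values()).map(i => ({ ...i }));
    }
}
```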

How to test

  1. Add ~{todoWrite} and ~{todoRead} to an agent prompt and start a chat session
  2. Ask the agent to perform a multi-step task and use the todo list
  3. Verify the agent can create and update todos via todoWrite
  4. Verify the agent can read current state via todoRead
  5. Test branching: edit an earlier message, verify the new branch has independent todo state
  6. Test persistence: reload the session, verify todos are restored

Follow-ups

Integrate the tools into agents, specifically the Theia Coder agent mode.

Breaking changes

  • This PR introduces breaking changes and requires careful review. If yes, the breaking changes section in the changelog has been updated.

@JonasHelming JonasHelming requested a review from sdirix January 6, 2026 13:57
@github-project-automation github-project-automation bot moved this to Waiting on reviewers in PR Backlog Jan 6, 2026
@sdirix
Member

sdirix commented Jan 8, 2026

I will review at the latest next week.

Member

@sdirix sdirix left a comment


The todo list works well in itself; the concept of giving the LLM a tool to structure its output/planning is sound.

However, I don't think we need the concept of "reading todos", nor do we really need to track todos across the various chat model branches; in fact, we should think about whether todos should be a chat-model concept at all.

A well-designed "writeToDo" tool already gives the LLM all the structure it needs, and the current state of the ToDo list is already tracked via the tool call history in the chat branches.

In the current state of the PR, I don't see a reason to store ToDos in the chat model in addition to the already existing tool calls. The only use cases I can think of are:

  • Injecting ToDos from outside the LLM programmatically
  • Cross-Session work management
  • Automatic todo list restoring after session compacting (which is not a feature in the state of the PR and can be achieved in different means too)
  • Special UI integration in the Chat View

But these use cases are not part of this PR, so as of now I don't see a reason to add it. For the UI integration, we can add special rendering for the ToDoWrite tool calls themselves without needing to rely on special structures in the chat model.

import { Disposable, Emitter, Event } from '@theia/core';
import { generateUuid } from '@theia/core/lib/common/uuid';

export type TodoItemState = 'pending' | 'completed';
Member


An inProgress state could be useful. Anthropic uses that too, and I think it just makes sense to report to the user what the LLM/agent is currently working on. Anthropic makes sure that only one todo can be in progress; maybe we want to enforce that too.

Contributor Author


I was thinking about this and had it in the first version. Your argument that it enforces step-by-step completion convinces me; I will add it again.

removeItem(id: string): boolean;
clear(): void;
toSerializable(): SerializedTodoItem[];
restoreFromSerialized(items: SerializedTodoItem[]): void;
Member


I would have expected the serialized list to be an optional constructor argument similar to the chat-model design.
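The suggested constructor-argument pattern could look roughly like this; the names are illustrative and the actual chat-model design may differ:

```typescript
// Sketch of restoring a serialized todo list via an optional
// constructor argument instead of a restoreFromSerialized() method.
interface SerializedTodoItem {
    content: string;
    state: 'pending' | 'completed';
}

class TodoList {
    private readonly items: SerializedTodoItem[] = [];

    constructor(serialized?: SerializedTodoItem[]) {
        // Restore silently: the list has not been handed out yet,
        // so no change notifications are needed here.
        if (serialized) {
            this.items.push(...serialized.map(i => ({ ...i })));
        }
    }

    get size(): number {
        return this.items.length;
    }
}
```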

notes: item.notes
});
}
this.notifyChange();
Member


Restoring is done before we hand the ChatModel to the outside world, so there is no need to trigger notifications.

export type TodoItemState = 'pending' | 'completed';

export interface TodoItem {
readonly id: string;
Member


I think I would get rid of the id. I don't see any real usage for it besides filling the context. When adding todos, we can make sure that the content is unique.
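An id-less variant where uniqueness is enforced on the content itself could be as simple as the following sketch (illustrative only):

```typescript
// Todos keyed by their content string; duplicates are rejected
// instead of being distinguished by an id.
function addTodo(todos: string[], content: string): boolean {
    if (todos.includes(content)) {
        return false; // duplicate content, nothing added
    }
    todos.push(content);
    return true;
}
```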

}

@injectable()
export class TodoReadToolProvider implements ToolProvider {
Member


I don't really understand what this tool is used for. An LLM knows the list, as it wrote it before. When we restore a previous chat or continue with a new message, the LLM still has the list in context, as it has the last todo list write in context.

Even if we switch agents or models, we still send the old tool calls along, so they still have access to the current list.

The only use case I could think of is to access the actual Todo list after a destructive compact operation where we don't send the old messages but just a summary of them. But even then we could just make sure the summary contains the current Todo list.


When I tried this PR, the LLM basically read the list once at the beginning of the chat, got no items returned, and then never called the tool again. Sometimes it never called it at all.

return {
id: TodoWriteToolProvider.ID,
name: TodoWriteToolProvider.ID,
description: 'Manages a structured task list for the current chat session. ' +
Member

@sdirix sdirix Jan 18, 2026


I think we need many more detailed instructions on when and how to use the todo list. For me it often happened that the todo list was created once with 5-6 items, the LLM then did some work, and the next call completed ALL items at once.

This was likely a case of the todos being too trivial to warrant a todo list to begin with.


I tried adding an "in-progress" state and told it to only work on items which are "in-progress" (without limiting it to one, like Anthropic does). That worked well for me, and it always completed todos in order without skipping entries (i.e. completing multiple at once).
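The "only work on in-progress items" guidance could be folded into the tool description itself; the following wording is an assumption for illustration, not the PR's final text:

```typescript
// Hypothetical extension of the todoWrite tool description with the
// in-progress guidance that worked well in testing.
const TODO_WRITE_DESCRIPTION =
    'Manages a structured task list for the current chat session. ' +
    'Mark exactly one item as inProgress before working on it, ' +
    'complete items one at a time in order, and update the list ' +
    'after every finished step instead of batching completions.';
```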

@github-project-automation github-project-automation bot moved this from Waiting on reviewers to Waiting on author in PR Backlog Jan 18, 2026
JonasHelming and others added 4 commits January 19, 2026 11:40
Co-authored-by: Stefan Dirix <sdirix@eclipsesource.com>
Co-authored-by: Stefan Dirix <sdirix@eclipsesource.com>
Co-authored-by: Stefan Dirix <sdirix@eclipsesource.com>