Add todo tool #16798
base: master
Conversation
I will review at the latest next week.
The todo list works in itself as the concept is sound, giving the LLM a tool to structure its output/planning.
However, I don't think we need the concept of "reading todos", nor do we really need to track todos over the various chat model branches; in fact, we should think about whether todos should be a chat-model concept at all.
A well designed "writeToDo" tool already gives the LLM all the structure it needs, and the current state of the todo list is already tracked via the tool call history in the chat branches.
As of the current state of the PR, I don't see a reason to store todos in the chat model in addition to the already existing tool calls. The only use cases I can think of:
- Injecting ToDos from outside the LLM programmatically
- Cross-Session work management
- Automatic todo list restoring after session compacting (which is not a feature in the current state of the PR and can be achieved by other means too)
- Special UI integration in the Chat View
But the use cases are not part of this PR, so as of now, I don't see a reason to add it. For the UI integration we can add special rendering for the ToDoWrite tool calls themselves without needing to rely on special structures in the chat-model.
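To illustrate the point: if the list state lives entirely in the tool-call history, the write tool itself can be stateless. Below is a minimal sketch of such a tool; the names `TodoWriteTool` and `TodoEntry` and the argument shape are my own assumptions for illustration, not the PR's actual API.

```typescript
// Hypothetical sketch: a stateless todo-write tool. The full list is passed
// as the argument of every call, so the "current state" is simply the
// argument of the most recent call in the chat's tool-call history.
interface TodoEntry {
    content: string;
    state: 'pending' | 'completed';
}

class TodoWriteTool {
    // Nothing is stored in the chat model; the tool only summarizes its input.
    handle(argsJson: string): string {
        const { todos } = JSON.parse(argsJson) as { todos: TodoEntry[] };
        const done = todos.filter(t => t.state === 'completed').length;
        // This summary is what the LLM sees; a UI renderer could instead
        // render the tool call's arguments directly.
        return `Todo list updated: ${done}/${todos.length} completed.`;
    }
}
```

A chat view could then add special rendering for these tool calls by reading their arguments, without any additional structure in the chat model.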
import { Disposable, Emitter, Event } from '@theia/core';
import { generateUuid } from '@theia/core/lib/common/uuid';

export type TodoItemState = 'pending' | 'completed';
An inProgress state could be useful. Anthropic uses that too and I think it just makes sense to return to the user what the LLM/agent is currently working on. Anthropic makes sure that only one todo can be in progress, maybe we want to enforce that too.
I was thinking about this and had it in the first version. Your argument that it enforces step by step completions convinces me, will add it again.
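One way the single in-progress invariant could be enforced is sketched below; the function name and item shape are assumptions for illustration, not the PR's actual code.

```typescript
type TodoItemState = 'pending' | 'inProgress' | 'completed';

interface TodoItem {
    content: string;
    state: TodoItemState;
}

// Hypothetical helper: moving one item to 'inProgress' demotes any other
// in-progress item back to 'pending', so at most one item is active at a time.
function setInProgress(items: TodoItem[], content: string): TodoItem[] {
    return items.map((item): TodoItem => {
        if (item.content === content) {
            return { ...item, state: 'inProgress' };
        }
        return item.state === 'inProgress' ? { ...item, state: 'pending' } : item;
    });
}
```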
removeItem(id: string): boolean;
clear(): void;
toSerializable(): SerializedTodoItem[];
restoreFromSerialized(items: SerializedTodoItem[]): void;
I would have expected the serialized list to be an optional constructor argument similar to the chat-model design.
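A sketch of what that could look like, mirroring the chat-model style; the class name `TodoListImpl` is an assumption for illustration:

```typescript
interface SerializedTodoItem {
    content: string;
    state: 'pending' | 'completed';
    notes?: string;
}

class TodoListImpl {
    private readonly items: SerializedTodoItem[] = [];

    // Restoring happens in the constructor, before the list is handed to the
    // outside world, so no change notification is needed for the initial content.
    constructor(serialized?: SerializedTodoItem[]) {
        if (serialized) {
            this.items.push(...serialized);
        }
    }

    toSerializable(): SerializedTodoItem[] {
        // Return copies so callers cannot mutate internal state.
        return this.items.map(item => ({ ...item }));
    }
}
```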
notes: item.notes
});
}
this.notifyChange();
Restoring is done before we hand the ChatModel to the outside world, so no need to trigger notifications
export type TodoItemState = 'pending' | 'completed';

export interface TodoItem {
    readonly id: string;
I think I would get rid of the id. I don't see any real usage for this besides filling the context. When adding todos we can make sure that content is unique.
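If the `id` is dropped, uniqueness could be enforced on the content itself, for example along these lines (a hypothetical sketch, not the PR's code):

```typescript
// Hypothetical: identify todos by their content; adding a duplicate is rejected.
class TodoSet {
    private readonly contents = new Set<string>();

    add(content: string): boolean {
        const normalized = content.trim();
        if (this.contents.has(normalized)) {
            return false; // duplicate content, nothing added
        }
        this.contents.add(normalized);
        return true;
    }

    size(): number {
        return this.contents.size;
    }
}
```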
}

@injectable()
export class TodoReadToolProvider implements ToolProvider {
I don't really understand what this tool is used for. An LLM knows the list, as it did write it before. When we restore from a previous chat or continue with a new message, the LLM still has the list in context as it has the last todo list write in context.
Even if we switch agents or models, we still send the old tool calls with us, so they still have access to the current list.
The only use case I could think of is to access the actual Todo list after a destructive compact operation where we don't send the old messages but just a summary of them. But even then we could just make sure the summary contains the current Todo list.
When I tried this PR, the LLMs basically read once at the beginning of the chat, got no items returned, and then never called the tool again. Sometimes they never called it at all.
return {
    id: TodoWriteToolProvider.ID,
    name: TodoWriteToolProvider.ID,
    description: 'Manages a structured task list for the current chat session. ' +
I think we need many more detailed instructions on when and how to use the todo list. For me it often happened that the todo list was created once with 5-6 items, the LLM then did some work, and the next call marked ALL items as completed at once.
This was likely a case of the todos being too trivial for the tool to be useful for that task in the first place.
I tried adding an "inProgress" state and told the LLM to only work on items which are in progress (without limiting it to one item like Anthropic does). That worked well for me: it always completed todos in order, without skipping entries or completing several at once.
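Such instructions could be baked directly into the tool description, roughly along these lines. The wording below is my illustration of the idea, not the PR's actual description text:

```typescript
// Illustrative only: a more prescriptive tool description that tells the model
// how to use an in-progress state, in the spirit of the experiment above.
const TODO_WRITE_DESCRIPTION =
    'Manages a structured task list for the current chat session. ' +
    'Use it for multi-step tasks, not for trivial single-step requests. ' +
    'Mark exactly one item as inProgress before working on it, ' +
    'and mark it completed immediately after finishing it, ' +
    'before moving the next item to inProgress. ' +
    'Never complete several items in one call.';
```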
Co-authored-by: Stefan Dirix <sdirix@eclipsesource.com>
What it does
Adds a TodoList feature to the AI chat system, allowing agents to track task progress during conversations.
The todo tool is not yet integrated in any agent.
How to test
Follow-ups
Integrate in agents, specifically the Theia Coder agent mode
Breaking changes
Attribution
Review checklist
nls service (for details, please see the Internationalization/Localization section in the Coding Guidelines)

Reminder for reviewers