[Breaking change] Going all-in on MCP #169
@Darktex this is really detailed and thorough! Thank you so much. Instead of Qs, I will detail my thoughts and POV below and how they were answered as I read through this:
Open Qs that I was brainstorming before reading:
All of these have been well addressed by the RFC. Some more Qs I have now:
It would be really cool to whiteboard an environment + model training as we start Phase 1
Thanks for the kind words!
I don't think so. At the end of the day, all you are doing in practice is using fastmcp and a
The only case where this is not true is if your policy is not an LLM but a "normal" RL policy whose output layer is fixed to exactly N outputs, where N is the number of legal moves. Something like AlphaGo. But for models like that, OpenEnvs is not going to be very useful, because you cannot take AlphaGo and start playing Pokemon anyway... So I think that's a reasonable tradeoff to make: assume we build for LLMs.
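To make the tradeoff concrete, here is a minimal stdlib-only sketch of the two kinds of policy output. It is an illustration, not code from the RFC: `FixedActionPolicy`, `dispatch_tool_call`, and the tool names are all hypothetical, and the JSON call shape is only assumed to resemble a tool call.

```python
import json
from typing import Callable, Dict, List


class FixedActionPolicy:
    """Classic RL-style policy: the output layer has exactly N slots."""

    def __init__(self, legal_moves: List[str]) -> None:
        self.legal_moves = list(legal_moves)  # N is frozen at construction

    def act(self, scores: List[float]) -> str:
        # The score vector must line up one-to-one with the N legal moves.
        if len(scores) != len(self.legal_moves):
            raise ValueError("score vector must have exactly N entries")
        best = max(range(len(scores)), key=scores.__getitem__)
        return self.legal_moves[best]


def dispatch_tool_call(tools: Dict[str, Callable[..., str]], call_json: str) -> str:
    """LLM-style action: a JSON tool call naming any tool the environment lists."""
    call = json.loads(call_json)
    name = call["name"]
    if name not in tools:
        raise KeyError(f"unknown tool: {name}")
    return tools[name](**call.get("arguments", {}))


# Two toy environments with different tool sets (hypothetical names).
go_tools = {"place_stone": lambda x, y: f"stone at ({x}, {y})"}
pokemon_tools = {"use_move": lambda move: f"used {move}"}

go_call = '{"name": "place_stone", "arguments": {"x": 3, "y": 4}}'
pokemon_call = '{"name": "use_move", "arguments": {"move": "tackle"}}'
print(dispatch_tool_call(go_tools, go_call))
print(dispatch_tool_call(pokemon_tools, pokemon_call))
```

The same dispatcher serves both environments because the action is text naming a tool, whereas the fixed-output policy is tied to one move list of one size.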
Ack'ing this for now. Led to bigger convo on work chat.
This commit consolidates the evolution of the OpenEnv RFC structure, incorporating extensive feedback and iterative refinements:
- Refactored RFC 000 with clear project phases and architectural principles
- Enhanced RFC 001 with two-interface model (CodeAct + ToolCall) and simulation clarity
- Restructured RFC 002 with dual-mode patterns and clearer positioning
- Completely revised RFC 003 with MCP primer, progressive disclosure, and clearer interface separation
- Removed superseded RFCs (004-007 and FEEDBACK_DECISIONS.md) after porting content
- Added comprehensive diagrams (Mermaid) for MCP architecture visualization
- Archived previous RFC versions for historical reference
- Established MCP as the universal interface for environment interactions

The final state represents a clean, cohesive RFC suite focused on the MCP-based architecture with clear separation between CodeAct and ToolCall interfaces.
Summary
I have been thinking about revising our abstractions to really go all-in on MCP and reduce the friction between Prod and Training. If accepted, our RFCs (and code)
will change meaningfully.
We are not at this stage yet. This PR introduces a single consolidated RFC (env_tools_rfc.md) that proposes significant architectural changes. It's meant to spark discussion and alignment before we commit to implementation.
Key Changes Proposed
mappings
See rfcs/env_tools_rfc.md for full details, rationale, and design decisions.
Progression Plan
Discussion Points
Feedback welcome on all aspects - this is the right time to course-correct before implementation begins.