diff --git a/authors.yaml b/authors.yaml
index 99b5111cbe..3afeebcedb 100644
--- a/authors.yaml
+++ b/authors.yaml
@@ -499,4 +499,7 @@ himadri518:
website: "https://www.linkedin.com/in/himadri-acharya-086ba261/"
avatar: "https://avatars.githubusercontent.com/u/14100684?v=4"
-
+jhall-openai:
+ name: "Josh Hall"
+ website: "https://www.linkedin.com/in/jhall14/"
+ avatar: "https://avatars.githubusercontent.com/u/198997750?v=4"
diff --git a/examples/codex/codex_mcp_agents_sdk/building_consistent_workflows_codex_cli_agents_sdk.ipynb b/examples/codex/codex_mcp_agents_sdk/building_consistent_workflows_codex_cli_agents_sdk.ipynb
new file mode 100644
index 0000000000..0b2fe4cb98
--- /dev/null
+++ b/examples/codex/codex_mcp_agents_sdk/building_consistent_workflows_codex_cli_agents_sdk.ipynb
@@ -0,0 +1,703 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "041db3ac",
+ "metadata": {},
+ "source": [
+ "# Building Consistent Workflows with Codex CLI & Agents SDK\n",
+ "### Ensuring Repeatable, Traceable, and Scalable Agentic Development\n",
+ "\n",
+ "## Introduction\n",
+ "Developers strive for consistency in everything they do. With Codex CLI and the Agents SDK, that consistency can now scale like never before. Whether you’re refactoring a large codebase, rolling out new features, or introducing a new testing framework, Codex integrates seamlessly into CLI, IDE, and cloud workflows to automate and enforce repeatable development patterns. \n",
+ "\n",
+ "In this track, we’ll build both single and multi-agent systems using the Agents SDK, with Codex CLI exposed as an MCP Server. This enables: \n",
+ "- **Consistency and Repeatability** by providing each agent a scoped context. \n",
+ "- **Scalable Orchestration** to coordinate single and multi-agent systems. \n",
+ "- **Observability & Auditability** by reviewing the full agentic stack trace. \n",
+ "\n",
+ "## What We’ll Cover\n",
+ "- Initializing Codex CLI as an MCP Server: How to run Codex as a long-running MCP process. \n",
+ "- Building Single-Agent Systems: Using Codex MCP for scoped tasks. \n",
+ "- Orchestrating Multi-Agent Workflows: Coordinating multiple specialized agents. \n",
+ "- Tracing Agentic Behavior: Leveraging agent traces for visibility and evaluation. \n",
+ "\n",
+ "## Prerequisites & Setup\n",
+ "Before starting this track, ensure you have the following: \n",
+ "- Basic coding familiarity: You should be comfortable with Python and JavaScript. \n",
+ "- Developer environment: You’ll need an IDE, like VS Code or Cursor. \n",
+ "- OpenAI API key: Create or find your API key in the OpenAI Dashboard.\n",
+ "\n",
+ "\n",
+ "## Environment Setup\n",
+ "1. Create a `.env` file in your project directory and add your `OPENAI_API_KEY` to it.\n",
+ "2. Install the dependencies.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "f15f3e42",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%pip install openai-agents openai python-dotenv  # install dependencies (python-dotenv provides the dotenv import used below)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "76a91cc2",
+ "metadata": {},
+ "source": [
+ "## Initializing Codex CLI as an MCP Server\n",
+ "Here we run Codex CLI as an MCP Server inside the Agents SDK by passing the initialization parameters for `codex mcp`. This command starts Codex CLI as an MCP server that exposes two tools: `codex()` and `codex-reply()`. These are the underlying tools that the Agents SDK will call when it needs to invoke Codex. \n",
+ "- `codex()` is used for creating a conversation. \n",
+ "- `codex-reply()` is for continuing a conversation. \n",
+ "\n",
+ "```python\n",
+ "import asyncio\n",
+ "from agents import Agent, Runner\n",
+ "from agents.mcp import MCPServerStdio\n",
+ "\n",
+ "async def main() -> None:\n",
+ " async with MCPServerStdio(\n",
+ " name=\"Codex CLI\",\n",
+ " params={\n",
+ " \"command\": \"npx\",\n",
+ " \"args\": [\"-y\", \"codex\", \"mcp\"],\n",
+ " },\n",
+ " client_session_timeout_seconds=360000,\n",
+ " ) as codex_mcp_server:\n",
+ " print(\"Codex MCP server started.\")\n",
+ " # We will add more code here in the next section\n",
+ " return\n",
+ "```\n",
+ "\n",
+ "Note that we extend the MCP client session timeout (`client_session_timeout_seconds=360000`, i.e. 100 hours) so that Codex CLI has enough time to execute and complete the given task. \n",
+ "\n",
+ "---\n",
+ "\n",
+ "## Building Single Agent Systems\n",
+ "Let’s start with a simple example that uses our Codex MCP Server. We define two agents: \n",
+ "1. **Designer Agent** – brainstorms and creates a small brief for a game. \n",
+ "2. **Developer Agent** – implements a simple game according to the Designer’s spec.\n",
+ "\n",
+ "```python\n",
+ "developer_agent = Agent(\n",
+ " name=\"Game Developer\",\n",
+ " instructions=(\n",
+ " \"You are an expert in building simple games using basic html + css + javascript with no dependencies. \"\n",
+ " \"Save your work in a file called index.html in the current directory. \"\n",
+ " \"Always call codex with \\\"approval-policy\\\": \\\"never\\\" and \\\"sandbox\\\": \\\"workspace-write\\\"\"\n",
+ " ),\n",
+ " mcp_servers=[codex_mcp_server],\n",
+ ")\n",
+ "\n",
+ "designer_agent = Agent(\n",
+ " name=\"Game Designer\",\n",
+ " instructions=(\n",
+ " \"You are an indie game connoisseur. Come up with an idea for a single page html + css + javascript game that a developer could build in about 50 lines of code. \"\n",
+ " \"Format your request as a 3 sentence design brief for a game developer and call the Game Developer coder with your idea.\"\n",
+ " ),\n",
+ " model=\"gpt-5\",\n",
+ " handoffs=[developer_agent],\n",
+ ")\n",
+ "\n",
+ "result = await Runner.run(designer_agent, \"Implement a fun new game!\")\n",
+ "```\n",
+ "\n",
+ "Notice that we are providing the Developer agent with the ability to write files to the project directory without asking the user for permissions. \n",
+ "\n",
+ "Now run the code and you’ll see an `index.html` file generated. Go ahead and open the file and start playing the game! \n",
+ "\n",
+ "Here are a few screenshots of the game my agentic system created. Yours will be different!\n",
+ "\n",
+ "(Screenshots: example gameplay and game-over score.)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d8cf6db9",
+ "metadata": {},
+ "source": [
+ "Here's the full executable code. Note that it might take a few minutes to run. The run has succeeded if an `index.html` file is produced. You might also see some MCP event warnings about format; you can ignore them."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c9134a41",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import os\n",
+ "from dotenv import load_dotenv\n",
+ "import asyncio\n",
+ "from agents import Agent, Runner, set_default_openai_key\n",
+ "from agents.mcp import MCPServerStdio\n",
+ "\n",
+ "load_dotenv(override=True) # load the API key from the .env file. We set override to True here to ensure the notebook is loading any changes\n",
+ "set_default_openai_key(os.getenv(\"OPENAI_API_KEY\")) # register the API key with the Agents SDK\n",
+ "\n",
+ "async def main() -> None:\n",
+ " async with MCPServerStdio(\n",
+ " name=\"Codex CLI\",\n",
+ " params={\n",
+ " \"command\": \"npx\",\n",
+ " \"args\": [\"-y\", \"codex\", \"mcp\"],\n",
+ " },\n",
+ " client_session_timeout_seconds=360000,\n",
+ " ) as codex_mcp_server:\n",
+ " developer_agent = Agent(\n",
+ " name=\"Game Developer\",\n",
+ " instructions=(\n",
+ " \"You are an expert in building simple games using basic html + css + javascript with no dependencies. \"\n",
+ " \"Save your work in a file called index.html in the current directory. \"\n",
+ " \"Always call codex with \\\"approval-policy\\\": \\\"never\\\" and \\\"sandbox\\\": \\\"workspace-write\\\"\"\n",
+ " ),\n",
+ " mcp_servers=[codex_mcp_server],\n",
+ " )\n",
+ "\n",
+ " designer_agent = Agent(\n",
+ " name=\"Game Designer\",\n",
+ " instructions=(\n",
+ " \"You are an indie game connoisseur. Come up with an idea for a single page html + css + javascript game that a developer could build in about 50 lines of code. \"\n",
+ " \"Format your request as a 3 sentence design brief for a game developer and call the Game Developer coder with your idea.\"\n",
+ " ),\n",
+ " model=\"gpt-5\",\n",
+ " handoffs=[developer_agent],\n",
+ " )\n",
+ "\n",
+ " result = await Runner.run(designer_agent, \"Implement a fun new game!\")\n",
+ " # print(result.final_output)\n",
+ "\n",
+ "\n",
+ "if __name__ == \"__main__\":\n",
+ " # Jupyter/IPython already runs an event loop, so calling asyncio.run() here\n",
+ " # raises \"asyncio.run() cannot be called from a running event loop\".\n",
+ " # Workaround: if a loop is running (notebook), use top-level `await`; otherwise use asyncio.run().\n",
+ " try:\n",
+ " asyncio.get_running_loop()\n",
+ " await main()\n",
+ " except RuntimeError:\n",
+ " asyncio.run(main())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "407e2d8f",
+ "metadata": {},
+ "source": [
+ "---\n",
+ "\n",
+ "## Orchestrating Multi-Agent Workflows\n",
+ "For larger workflows, we introduce a team of agents: \n",
+ "- **Project Manager**: Breaks down task list, creates requirements, and coordinates work. \n",
+ "- **Designer**: Produces UI/UX specifications. \n",
+ "- **Frontend Developer**: Implements UI/UX. \n",
+ "- **Backend Developer**: Implements APIs and logic. \n",
+ "- **Tester**: Validates outputs against acceptance criteria. \n",
+ "\n",
+ "In this example, we intentionally have the Project Manager agent enforce gating logic between each of the specialized downstream agents. This ensures that artifacts exist before handoffs are made. This mirrors real-world enterprise workflows such as Jira task orchestration, long-chained rollouts, and QA sign-offs. \n",
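+ "\n",
+ "The gating idea can also be sketched in plain Python. The helper below is a minimal, hypothetical illustration: the `GATES` mapping and the `missing_artifacts` function are assumptions made for this sketch, not part of the Agents SDK or of this notebook's code. It reports which required files are still absent before a handoff. In this guide, the Project Manager enforces the same checks through its instructions.\n",
+ "\n",
+ "```python\n",
+ "from pathlib import Path\n",
+ "\n",
+ "# Hypothetical mapping: handoff step -> files that must exist first.\n",
+ "# File names follow the deliverables used later in this guide.\n",
+ "GATES = {\n",
+ "    'designer': ['REQUIREMENTS.md', 'TEST.md', 'AGENT_TASKS.md'],\n",
+ "    'developers': ['design/design_spec.md'],\n",
+ "    'tester': ['frontend/index.html', 'backend/server.js'],\n",
+ "}\n",
+ "\n",
+ "def missing_artifacts(step: str, root: str = '.') -> list[str]:\n",
+ "    \"\"\"Return the required files for `step` that do not exist under `root`.\"\"\"\n",
+ "    return [name for name in GATES[step] if not (Path(root) / name).is_file()]\n",
+ "```\n",
+ "\n",
+ "For example, `missing_artifacts('tester')` returns an empty list only once both `frontend/index.html` and `backend/server.js` exist under the current directory.\n",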
+ "\n",
+ "(Figure: Multi-agent orchestration with Codex MCP and gated handoffs producing artifacts.)\n",
+ "\n",
+ "\n",
+ "In this structure, each of our agents serves a specialized purpose. The Project Manager is responsible for coordinating all other agents and ensuring the overall task is complete.\n",
+ "\n",
+ "## Define the Codex CLI MCP Server\n",
+ "We set up our MCP Server to initialize Codex CLI just as we did in the single agent example.\n",
+ "\n",
+ "```python\n",
+ "async def main() -> None:\n",
+ " async with MCPServerStdio(\n",
+ " name=\"Codex CLI\",\n",
+ " params={\n",
+ " \"command\": \"npx\",\n",
+ " \"args\": [\"-y\", \"codex\", \"mcp\"],\n",
+ " },\n",
+ " client_session_timeout_seconds=360000,\n",
+ " ) as codex_mcp_server:\n",
+ " print(\"Codex MCP server started.\")\n",
+ " # We will add more code here in the next section\n",
+ " return\n",
+ "```\n",
+ "\n",
+ "\n",
+ "\n",
+ "## Define each specialized agent\n",
+ "Below we define each of our specialized agents and provide access to our Codex MCP server. Notice that we also pass the `RECOMMENDED_PROMPT_PREFIX` to each agent, which helps the system optimize for handoffs between agents. \n",
+ "\n",
+ "```python\n",
+ "# Downstream agents are defined first for clarity, then PM references them in handoffs.\n",
+ "designer_agent = Agent(\n",
+ " name=\"Designer\",\n",
+ " instructions=(\n",
+ " f\"\"\"{RECOMMENDED_PROMPT_PREFIX}\"\"\"\n",
+ " \"You are the Designer.\\n\"\n",
+ " \"Your only source of truth is AGENT_TASKS.md and REQUIREMENTS.md from the Project Manager.\\n\"\n",
+ " \"Do not assume anything that is not written there.\\n\\n\"\n",
+ " \"You may use the internet for additional guidance or research.\\n\"\n",
+ " \"Deliverables (write to /design):\\n\"\n",
+ " \"- design_spec.md – a single page describing the UI/UX layout, main screens, and key visual notes as requested in AGENT_TASKS.md.\\n\"\n",
+ " \"- wireframe.md – a simple text or ASCII wireframe if specified.\\n\\n\"\n",
+ " \"Keep the output short and implementation-friendly.\\n\"\n",
+ " \"When complete, handoff to the Project Manager with transfer_to_project_manager.\\n\"\n",
+ " \"When creating files, call Codex MCP with {\\\"approval-policy\\\":\\\"never\\\",\\\"sandbox\\\":\\\"workspace-write\\\"}.\"\n",
+ " ),\n",
+ " model=\"gpt-5\",\n",
+ " tools=[WebSearchTool()],\n",
+ " mcp_servers=[codex_mcp_server],\n",
+ " handoffs=[],\n",
+ ")\n",
+ "\n",
+ "frontend_developer_agent = Agent(\n",
+ " name=\"Frontend Developer\",\n",
+ " instructions=(\n",
+ " f\"\"\"{RECOMMENDED_PROMPT_PREFIX}\"\"\"\n",
+ " \"You are the Frontend Developer.\\n\"\n",
+ " \"Read AGENT_TASKS.md and design_spec.md. Implement exactly what is described there.\\n\\n\"\n",
+ " \"Deliverables (write to /frontend):\\n\"\n",
+ " \"- index.html – main page structure\\n\"\n",
+ " \"- styles.css or inline styles if specified\\n\"\n",
+ " \"- main.js or game.js if specified\\n\\n\"\n",
+ " \"Follow the Designer’s DOM structure and any integration points given by the Project Manager.\\n\"\n",
+ " \"Do not add features or branding beyond the provided documents.\\n\\n\"\n",
+ " \"When complete, handoff to the Project Manager with transfer_to_project_manager.\\n\"\n",
+ " \"When creating files, call Codex MCP with {\\\"approval-policy\\\":\\\"never\\\",\\\"sandbox\\\":\\\"workspace-write\\\"}.\"\n",
+ " ),\n",
+ " model=\"gpt-5\",\n",
+ " mcp_servers=[codex_mcp_server],\n",
+ " handoffs=[],\n",
+ ")\n",
+ "\n",
+ "backend_developer_agent = Agent(\n",
+ " name=\"Backend Developer\",\n",
+ " instructions=(\n",
+ " f\"\"\"{RECOMMENDED_PROMPT_PREFIX}\"\"\"\n",
+ " \"You are the Backend Developer.\\n\"\n",
+ " \"Read AGENT_TASKS.md and REQUIREMENTS.md. Implement the backend endpoints described there.\\n\\n\"\n",
+ " \"Deliverables (write to /backend):\\n\"\n",
+ " \"- package.json – include a start script if requested\\n\"\n",
+ " \"- server.js – implement the API endpoints and logic exactly as specified\\n\\n\"\n",
+ " \"Keep the code as simple and readable as possible. No external database.\\n\\n\"\n",
+ " \"When complete, handoff to the Project Manager with transfer_to_project_manager.\\n\"\n",
+ " \"When creating files, call Codex MCP with {\\\"approval-policy\\\":\\\"never\\\",\\\"sandbox\\\":\\\"workspace-write\\\"}.\"\n",
+ " ),\n",
+ " model=\"gpt-5\",\n",
+ " mcp_servers=[codex_mcp_server],\n",
+ " handoffs=[],\n",
+ ")\n",
+ "\n",
+ "tester_agent = Agent(\n",
+ " name=\"Tester\",\n",
+ " instructions=(\n",
+ " f\"\"\"{RECOMMENDED_PROMPT_PREFIX}\"\"\"\n",
+ " \"You are the Tester.\\n\"\n",
+ " \"Read AGENT_TASKS.md and TEST.md. Verify that the outputs of the other roles meet the acceptance criteria.\\n\\n\"\n",
+ " \"Deliverables (write to /tests):\\n\"\n",
+ " \"- TEST_PLAN.md – bullet list of manual checks or automated steps as requested\\n\"\n",
+ " \"- test.sh or a simple automated script if specified\\n\\n\"\n",
+ " \"Keep it minimal and easy to run.\\n\\n\"\n",
+ " \"When complete, handoff to the Project Manager with transfer_to_project_manager.\\n\"\n",
+ " \"When creating files, call Codex MCP with {\\\"approval-policy\\\":\\\"never\\\",\\\"sandbox\\\":\\\"workspace-write\\\"}.\"\n",
+ " ),\n",
+ " model=\"gpt-5\",\n",
+ " mcp_servers=[codex_mcp_server],\n",
+ " handoffs=[],\n",
+ ")\n",
+ "```\n",
+ "\n",
+ "\n",
+ "\n",
+ "After each role completes its assignment, it calls `transfer_to_project_manager`, which lets the Project Manager confirm that the required files exist (or request fixes) before unblocking the next team. \n",
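+ "\n",
+ "Handoff tool names in this guide follow a `transfer_to_<agent name>` pattern. As a rough sketch, assume the SDK derives the default name by lowercasing the agent's `name` and replacing every non-alphanumeric character with an underscore. That rule is an assumption worth checking against your installed version, and `default_handoff_tool_name` is a hypothetical helper written for this illustration, not an SDK function.\n",
+ "\n",
+ "```python\n",
+ "import re\n",
+ "\n",
+ "def default_handoff_tool_name(agent_name: str) -> str:\n",
+ "    \"\"\"Approximate the default handoff tool name for an agent (assumed rule).\"\"\"\n",
+ "    slug = re.sub(r'[^a-zA-Z0-9]', '_', agent_name).lower()\n",
+ "    return f'transfer_to_{slug}'\n",
+ "\n",
+ "print(default_handoff_tool_name('Project Manager'))  # transfer_to_project_manager\n",
+ "```\n",
+ "\n",
+ "Under that assumption, the name referenced in a prompt should match the target agent's `name` exactly, so it is worth printing the derived names once when you rename an agent.\n",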
+ "\n",
+ "## Define Project Manager Agent\n",
+ "The Project Manager is the only agent that receives the initial prompt, creates the planning documents in the project directory, and enforces the gatekeeping logic before every transfer. \n",
+ "\n",
+ "```python \n",
+ "project_manager_agent = Agent(\n",
+ "name=\"Project Manager\",\n",
+ "instructions=(\n",
+ " f\"\"\"{RECOMMENDED_PROMPT_PREFIX}\"\"\"\n",
+ " \"\"\"\n",
+ " You are the Project Manager.\n",
+ "\n",
+ " Objective:\n",
+ " Convert the input task list into three project-root files the team will execute against.\n",
+ "\n",
+ " Deliverables (write in project root):\n",
+ " - REQUIREMENTS.md: concise summary of product goals, target users, key features, and constraints.\n",
+ " - TEST.md: tasks with [Owner] tags (Designer, Frontend, Backend, Tester) and clear acceptance criteria.\n",
+ " - AGENT_TASKS.md: one section per role containing:\n",
+ " - Project name\n",
+ " - Required deliverables (exact file names and purpose)\n",
+ " - Key technical notes and constraints\n",
+ "\n",
+ " Process:\n",
+ " - Resolve ambiguities with minimal, reasonable assumptions. Be specific so each role can act without guessing.\n",
+ " - Create files using Codex MCP with {\"approval-policy\":\"never\",\"sandbox\":\"workspace-write\"}.\n",
+ " - Do not create folders. Only create REQUIREMENTS.md, TEST.md, AGENT_TASKS.md.\n",
+ "\n",
+ " Handoffs (gated by required files):\n",
+ " 1) After the three files above are created, hand off to the Designer with transfer_to_designer and include REQUIREMENTS.md and AGENT_TASKS.md.\n",
+ " 2) Wait for the Designer to produce /design/design_spec.md. Verify that file exists before proceeding.\n",
+ " 3) When design_spec.md exists, hand off in parallel to both:\n",
+ " - Frontend Developer with transfer_to_frontend_developer (provide design_spec.md, REQUIREMENTS.md, AGENT_TASKS.md).\n",
+ " - Backend Developer with transfer_to_backend_developer (provide REQUIREMENTS.md, AGENT_TASKS.md).\n",
+ " 4) Wait for Frontend to produce /frontend/index.html and Backend to produce /backend/server.js. Verify both files exist.\n",
+ " 5) When both exist, hand off to the Tester with transfer_to_tester and provide all prior artifacts and outputs.\n",
+ " 6) Do not advance to the next handoff until the required files for that step are present. If something is missing, request the owning agent to supply it and re-check.\n",
+ "\n",
+ " PM Responsibilities:\n",
+ " - Coordinate all roles, track file completion, and enforce the above gating checks.\n",
+ " - Do NOT respond with status updates. Just handoff to the next agent until the project is complete.\n",
+ " \"\"\"\n",
+ "),\n",
+ "model=\"gpt-5\",\n",
+ "model_settings=ModelSettings(\n",
+ " reasoning=Reasoning(effort=\"medium\")\n",
+ "),\n",
+ "handoffs=[designer_agent, frontend_developer_agent, backend_developer_agent, tester_agent],\n",
+ "mcp_servers=[codex_mcp_server],\n",
+ ")\n",
+ "```\n",
+ "\n",
+ "After constructing the Project Manager, the script sets every specialist's handoffs back to the Project\n",
+ "Manager. This ensures deliverables return for validation before moving on.\n",
+ "\n",
+ "```python\n",
+ "designer_agent.handoffs = [project_manager_agent]\n",
+ "frontend_developer_agent.handoffs = [project_manager_agent]\n",
+ "backend_developer_agent.handoffs = [project_manager_agent]\n",
+ "tester_agent.handoffs = [project_manager_agent]\n",
+ "```\n",
+ "\n",
+ "## Add in your task list\n",
+ "This is the task that the Project Manager will refine into specific requirements and tasks for the entire system.\n",
+ "\n",
+ "```python\n",
+ "task_list = \"\"\"\n",
+ "Goal: Build a tiny browser game to showcase a multi-agent workflow.\n",
+ "\n",
+ "High-level requirements:\n",
+ "- Single-screen game called \"Bug Busters\".\n",
+ "- Player clicks a moving bug to earn points.\n",
+ "- Game ends after 20 seconds and shows final score.\n",
+ "- Optional: submit score to a simple backend and display a top-10 leaderboard.\n",
+ "\n",
+ "Roles:\n",
+ "- Designer: create a one-page UI/UX spec and basic wireframe.\n",
+ "- Frontend Developer: implement the page and game logic.\n",
+ "- Backend Developer: implement a minimal API (GET /health, GET/POST /scores).\n",
+ "- Tester: write a quick test plan and a simple script to verify core routes.\n",
+ "\n",
+ "Constraints:\n",
+ "- No external database—memory storage is fine.\n",
+ "- Keep everything readable for beginners; no frameworks required.\n",
+ "- All outputs should be small files saved in clearly named folders.\n",
+ "\"\"\"\n",
+ "```\n",
+ "\n",
+ "Next, run your system, sit back, and you’ll see the agents go to work and create a game in a few minutes! We've included the fully executable code below. Once it finishes, you'll find the following files in your project directory. Note that this multi-agent orchestration usually took about 11 minutes to complete.\n",
+ "\n",
+ "```markdown\n",
+ "root_directory/\n",
+ "├── AGENT_TASKS.md\n",
+ "├── REQUIREMENTS.md\n",
+ "├── backend\n",
+ "│ ├── package.json\n",
+ "│ └── server.js\n",
+ "├── design\n",
+ "│ ├── design_spec.md\n",
+ "│ └── wireframe.md\n",
+ "├── frontend\n",
+ "│ ├── game.js\n",
+ "│ ├── index.html\n",
+ "│ └── styles.css\n",
+ "└── TEST.md\n",
+ "```\n",
+ "\n",
+ "Start your backend server by running `node server.js` from the `backend` directory, then open `frontend/index.html` to play your game.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "ebe128a8",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import os\n",
+ "from dotenv import load_dotenv\n",
+ "import asyncio\n",
+ "from agents import Agent, Runner, WebSearchTool, ModelSettings, set_default_openai_key\n",
+ "from agents.mcp import MCPServerStdio\n",
+ "from agents.extensions.handoff_prompt import RECOMMENDED_PROMPT_PREFIX\n",
+ "from openai.types.shared import Reasoning\n",
+ "\n",
+ "load_dotenv(override=True) # load the API key from the .env file. We set override to True here to ensure the notebook is loading any changes\n",
+ "set_default_openai_key(os.getenv(\"OPENAI_API_KEY\")) # register the API key with the Agents SDK\n",
+ "\n",
+ "async def main() -> None:\n",
+ " async with MCPServerStdio(\n",
+ " name=\"Codex CLI\",\n",
+ " params={\"command\": \"npx\", \"args\": [\"-y\", \"codex\", \"mcp\"]},\n",
+ " client_session_timeout_seconds=360000,\n",
+ " ) as codex_mcp_server:\n",
+ "\n",
+ " # Downstream agents are defined first for clarity, then PM references them in handoffs.\n",
+ " designer_agent = Agent(\n",
+ " name=\"Designer\",\n",
+ " instructions=(\n",
+ " f\"\"\"{RECOMMENDED_PROMPT_PREFIX}\"\"\"\n",
+ " \"You are the Designer.\\n\"\n",
+ " \"Your only source of truth is AGENT_TASKS.md and REQUIREMENTS.md from the Project Manager.\\n\"\n",
+ " \"Do not assume anything that is not written there.\\n\\n\"\n",
+ " \"You may use the internet for additional guidance or research.\\n\"\n",
+ " \"Deliverables (write to /design):\\n\"\n",
+ " \"- design_spec.md – a single page describing the UI/UX layout, main screens, and key visual notes as requested in AGENT_TASKS.md.\\n\"\n",
+ " \"- wireframe.md – a simple text or ASCII wireframe if specified.\\n\\n\"\n",
+ " \"Keep the output short and implementation-friendly.\\n\"\n",
+ " \"When complete, handoff to the Project Manager with transfer_to_project_manager.\\n\"\n",
+ " \"When creating files, call Codex MCP with {\\\"approval-policy\\\":\\\"never\\\",\\\"sandbox\\\":\\\"workspace-write\\\"}.\"\n",
+ " ),\n",
+ " model=\"gpt-5\",\n",
+ " tools=[WebSearchTool()],\n",
+ " mcp_servers=[codex_mcp_server],\n",
+ " handoffs=[],\n",
+ " )\n",
+ "\n",
+ " frontend_developer_agent = Agent(\n",
+ " name=\"Frontend Developer\",\n",
+ " instructions=(\n",
+ " f\"\"\"{RECOMMENDED_PROMPT_PREFIX}\"\"\"\n",
+ " \"You are the Frontend Developer.\\n\"\n",
+ " \"Read AGENT_TASKS.md and design_spec.md. Implement exactly what is described there.\\n\\n\"\n",
+ " \"Deliverables (write to /frontend):\\n\"\n",
+ " \"- index.html – main page structure\\n\"\n",
+ " \"- styles.css or inline styles if specified\\n\"\n",
+ " \"- main.js or game.js if specified\\n\\n\"\n",
+ " \"Follow the Designer’s DOM structure and any integration points given by the Project Manager.\\n\"\n",
+ " \"Do not add features or branding beyond the provided documents.\\n\\n\"\n",
+ " \"When complete, handoff to the Project Manager with transfer_to_project_manager.\\n\"\n",
+ " \"When creating files, call Codex MCP with {\\\"approval-policy\\\":\\\"never\\\",\\\"sandbox\\\":\\\"workspace-write\\\"}.\"\n",
+ " ),\n",
+ " model=\"gpt-5\",\n",
+ " mcp_servers=[codex_mcp_server],\n",
+ " handoffs=[],\n",
+ " )\n",
+ "\n",
+ " backend_developer_agent = Agent(\n",
+ " name=\"Backend Developer\",\n",
+ " instructions=(\n",
+ " f\"\"\"{RECOMMENDED_PROMPT_PREFIX}\"\"\"\n",
+ " \"You are the Backend Developer.\\n\"\n",
+ " \"Read AGENT_TASKS.md and REQUIREMENTS.md. Implement the backend endpoints described there.\\n\\n\"\n",
+ " \"Deliverables (write to /backend):\\n\"\n",
+ " \"- package.json – include a start script if requested\\n\"\n",
+ " \"- server.js – implement the API endpoints and logic exactly as specified\\n\\n\"\n",
+ " \"Keep the code as simple and readable as possible. No external database.\\n\\n\"\n",
+ " \"When complete, handoff to the Project Manager with transfer_to_project_manager.\\n\"\n",
+ " \"When creating files, call Codex MCP with {\\\"approval-policy\\\":\\\"never\\\",\\\"sandbox\\\":\\\"workspace-write\\\"}.\"\n",
+ " ),\n",
+ " model=\"gpt-5\",\n",
+ " mcp_servers=[codex_mcp_server],\n",
+ " handoffs=[],\n",
+ " )\n",
+ "\n",
+ " tester_agent = Agent(\n",
+ " name=\"Tester\",\n",
+ " instructions=(\n",
+ " f\"\"\"{RECOMMENDED_PROMPT_PREFIX}\"\"\"\n",
+ " \"You are the Tester.\\n\"\n",
+ " \"Read AGENT_TASKS.md and TEST.md. Verify that the outputs of the other roles meet the acceptance criteria.\\n\\n\"\n",
+ " \"Deliverables (write to /tests):\\n\"\n",
+ " \"- TEST_PLAN.md – bullet list of manual checks or automated steps as requested\\n\"\n",
+ " \"- test.sh or a simple automated script if specified\\n\\n\"\n",
+ " \"Keep it minimal and easy to run.\\n\\n\"\n",
+ " \"When complete, handoff to the Project Manager with transfer_to_project_manager.\\n\"\n",
+ " \"When creating files, call Codex MCP with {\\\"approval-policy\\\":\\\"never\\\",\\\"sandbox\\\":\\\"workspace-write\\\"}.\"\n",
+ " ),\n",
+ " model=\"gpt-5\",\n",
+ " mcp_servers=[codex_mcp_server],\n",
+ " handoffs=[],\n",
+ " )\n",
+ "\n",
+ " project_manager_agent = Agent(\n",
+ " name=\"Project Manager\",\n",
+ " instructions=(\n",
+ " f\"\"\"{RECOMMENDED_PROMPT_PREFIX}\"\"\"\n",
+ " \"\"\"\n",
+ " You are the Project Manager.\n",
+ "\n",
+ " Objective:\n",
+ " Convert the input task list into three project-root files the team will execute against.\n",
+ "\n",
+ " Deliverables (write in project root):\n",
+ " - REQUIREMENTS.md: concise summary of product goals, target users, key features, and constraints.\n",
+ " - TEST.md: tasks with [Owner] tags (Designer, Frontend, Backend, Tester) and clear acceptance criteria.\n",
+ " - AGENT_TASKS.md: one section per role containing:\n",
+ " - Project name\n",
+ " - Required deliverables (exact file names and purpose)\n",
+ " - Key technical notes and constraints\n",
+ "\n",
+ " Process:\n",
+ " - Resolve ambiguities with minimal, reasonable assumptions. Be specific so each role can act without guessing.\n",
+ " - Create files using Codex MCP with {\"approval-policy\":\"never\",\"sandbox\":\"workspace-write\"}.\n",
+ " - Do not create folders. Only create REQUIREMENTS.md, TEST.md, AGENT_TASKS.md.\n",
+ "\n",
+ " Handoffs (gated by required files):\n",
+ " 1) After the three files above are created, hand off to the Designer with transfer_to_designer and include REQUIREMENTS.md and AGENT_TASKS.md.\n",
+ " 2) Wait for the Designer to produce /design/design_spec.md. Verify that file exists before proceeding.\n",
+ " 3) When design_spec.md exists, hand off in parallel to both:\n",
+ " - Frontend Developer with transfer_to_frontend_developer (provide design_spec.md, REQUIREMENTS.md, AGENT_TASKS.md).\n",
+ " - Backend Developer with transfer_to_backend_developer (provide REQUIREMENTS.md, AGENT_TASKS.md).\n",
+ " 4) Wait for Frontend to produce /frontend/index.html and Backend to produce /backend/server.js. Verify both files exist.\n",
+ " 5) When both exist, hand off to the Tester with transfer_to_tester and provide all prior artifacts and outputs.\n",
+ " 6) Do not advance to the next handoff until the required files for that step are present. If something is missing, request the owning agent to supply it and re-check.\n",
+ "\n",
+ " PM Responsibilities:\n",
+ " - Coordinate all roles, track file completion, and enforce the above gating checks.\n",
+ " - Do NOT respond with status updates. Just handoff to the next agent until the project is complete.\n",
+ " \"\"\"\n",
+ " ),\n",
+ " model=\"gpt-5\",\n",
+ " model_settings=ModelSettings(\n",
+ " reasoning=Reasoning(effort=\"medium\")\n",
+ " ),\n",
+ " handoffs=[designer_agent, frontend_developer_agent, backend_developer_agent, tester_agent],\n",
+ " mcp_servers=[codex_mcp_server],\n",
+ " )\n",
+ "\n",
+ " designer_agent.handoffs = [project_manager_agent]\n",
+ " frontend_developer_agent.handoffs = [project_manager_agent]\n",
+ " backend_developer_agent.handoffs = [project_manager_agent]\n",
+ " tester_agent.handoffs = [project_manager_agent]\n",
+ "\n",
+ " # Example task list input for the Project Manager\n",
+ " task_list = \"\"\"\n",
+ "Goal: Build a tiny browser game to showcase a multi-agent workflow.\n",
+ "\n",
+ "High-level requirements:\n",
+ "- Single-screen game called \"Bug Busters\".\n",
+ "- Player clicks a moving bug to earn points.\n",
+ "- Game ends after 20 seconds and shows final score.\n",
+ "- Optional: submit score to a simple backend and display a top-10 leaderboard.\n",
+ "\n",
+ "Roles:\n",
+ "- Designer: create a one-page UI/UX spec and basic wireframe.\n",
+ "- Frontend Developer: implement the page and game logic.\n",
+ "- Backend Developer: implement a minimal API (GET /health, GET/POST /scores).\n",
+ "- Tester: write a quick test plan and a simple script to verify core routes.\n",
+ "\n",
+ "Constraints:\n",
+ "- No external database—memory storage is fine.\n",
+ "- Keep everything readable for beginners; no frameworks required.\n",
+ "- All outputs should be small files saved in clearly named folders.\n",
+ "\"\"\"\n",
+ "\n",
+ " # Only the Project Manager receives the task list directly\n",
+ " result = await Runner.run(project_manager_agent, task_list, max_turns=30)\n",
+ " print(result.final_output)\n",
+ "\n",
+ "if __name__ == \"__main__\":\n",
+ " # Jupyter/IPython already runs an event loop, so calling asyncio.run() here\n",
+ " # raises \"asyncio.run() cannot be called from a running event loop\".\n",
+ " # Workaround: if a loop is running (notebook), use top-level `await`; otherwise use asyncio.run().\n",
+ " try:\n",
+ " asyncio.get_running_loop()\n",
+ " await main()\n",
+ " except RuntimeError:\n",
+ " asyncio.run(main())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "9e828b04",
+ "metadata": {},
+ "source": [
+ "---\n",
+ "\n",
+ "## Tracing the agentic behavior using Traces\n",
+ "As the complexity of your agentic systems grows, it’s important to see how these agents are interacting. We can do this with the Traces dashboard, which records: \n",
+ "- Prompts, tool calls, and handoffs between agents. \n",
+ "- MCP Server calls, Codex CLI calls, execution times, and file writes. \n",
+ "- Errors and warnings. \n",
+ "\n",
+ "Let’s take a look at the agent trace for the team of agents above.\n",
+ "\n",
+ "\n",
+ "(Screenshot: agent trace for the multi-agent workflow.)\n",
+ "\n",
+ "In this Trace, we can confirm that every agent handoff is quarterbacked by our Project Manager Agent, which confirms that specific artifacts exist before handing off to the next agent. Additionally, we can see specific invocations of the Codex MCP Server and see that each output is generated by calling the Responses API. The timeline bars highlight execution durations, making it easy to spot long-running steps and understand how control passes between agents.\n",
+ "\n",
+ "You can even click into each trace to see the specific details of the prompt, tool calls, and other metadata. Over time you can view this information to further tune, optimize, and track your agentic system performance.\n",
+ "\n",
+ "\n",
+ "(Screenshot: detail view of a single trace.)\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "7b446e22",
+ "metadata": {},
+ "source": [
+ "---\n",
+ "\n",
+ "## Recap of What We Did in This Guide\n",
+ "In this guide, we walked through the process of building consistent, scalable workflows using Codex CLI and the Agents SDK. Specifically, we covered: \n",
+ "\n",
+ "- **Codex MCP Server Setup** – How to initialize Codex CLI as an MCP server and make it available as tools for agent interactions. \n",
+ "- **Single-Agent Example** – A simple workflow with a Designer Agent and a Developer Agent, where Codex executed scoped tasks deterministically to produce a playable game. \n",
+ "- **Multi-Agent Orchestration** – Expanding to a larger workflow with a Project Manager, Designer, Frontend Developer, Backend Developer, and Tester, mirroring complex task orchestration and sign-off processes. \n",
+ "- **Traces & Observability** – Using built-in Traces to capture prompts, tool calls, handoffs, execution times, and artifacts, giving full visibility into agentic behavior for debugging, evaluation, and future optimization. \n",
+ "\n",
+ "---\n",
+ "\n",
+ "## Moving Forward: Applying These Lessons\n",
+ "Now that you’ve seen Codex MCP and the Agents SDK in action, here’s how you can apply these concepts in real projects: \n",
+ "\n",
+ "### 1. Scale to Real-World Rollouts\n",
+ "- Apply the same multi-agent orchestration to large code refactors (e.g., 500+ files, framework migrations). \n",
+ "- Use Codex MCP’s deterministic execution for long-running, auditable rollouts with traceable progress. \n",
+ "\n",
+ "### 2. Accelerate Delivery Without Losing Control\n",
+ "- Organize teams of specialized agents to parallelize development, while maintaining gating logic for artifact validation. \n",
+ "- Reduce turnaround time for new features, testing, or codebase modernization. \n",
+ "\n",
+ "### 3. Extend and Connect to Your Development Workflows\n",
+ "- Connect MCP-powered agents with Jira, GitHub, or CI/CD pipelines via webhooks for automated, repeatable development cycles. \n",
+ "- Leverage Codex MCP in multi-agent service orchestration: not just codegen, but also documentation, QA, and deployment. \n"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python (openai-cookbook)",
+ "language": "python",
+ "name": "openai-cookbook"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.11.8"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/examples/codex/images/game_example_1.png b/examples/codex/images/game_example_1.png
new file mode 100644
index 0000000000..b1ba426416
Binary files /dev/null and b/examples/codex/images/game_example_1.png differ
diff --git a/examples/codex/images/game_example_2.png b/examples/codex/images/game_example_2.png
new file mode 100644
index 0000000000..56b04121d2
Binary files /dev/null and b/examples/codex/images/game_example_2.png differ
diff --git a/examples/codex/images/multi_agent_codex_workflow.png b/examples/codex/images/multi_agent_codex_workflow.png
new file mode 100644
index 0000000000..20535adea1
Binary files /dev/null and b/examples/codex/images/multi_agent_codex_workflow.png differ
diff --git a/examples/codex/images/multi_agent_trace.png b/examples/codex/images/multi_agent_trace.png
new file mode 100644
index 0000000000..f904421b04
Binary files /dev/null and b/examples/codex/images/multi_agent_trace.png differ
diff --git a/examples/codex/images/multi_agent_trace_details.png b/examples/codex/images/multi_agent_trace_details.png
new file mode 100644
index 0000000000..112776eecf
Binary files /dev/null and b/examples/codex/images/multi_agent_trace_details.png differ
diff --git a/registry.yaml b/registry.yaml
index e8e4041d46..ffc947f8c7 100644
--- a/registry.yaml
+++ b/registry.yaml
@@ -4,6 +4,16 @@
# should build pages for, and indicates metadata such as tags, creation date and
# authors for each page.
+- title: Building Consistent Workflows with Codex CLI & Agents SDK
+ path: examples/codex/codex_mcp_agents_sdk/building_consistent_workflows_codex_cli_agents_sdk.ipynb
+ date: 2025-10-01
+ authors:
+ - jhall-openai
+ tags:
+ - agents-sdk
+ - codex
+ - mcp
+
- title: GPT-5-Codex Prompting Guide
path: examples/gpt-5-codex_prompting_guide.ipynb
date: 2025-09-23