
[Feature Request] New Agent Tool: upload_screen_from_image for seamless UI Bootstrapping #7

@jeremiesigrist

Description

Context:
Currently, the Stitch MCP tools allow agents to create screens via generate_screen_from_text or retrieve existing screens via get_screen. However, when an agent is working on an existing local application, there is no direct way to "send" the current visual state of the app back to Stitch to iterate on its design.

The current workaround requires the agent to describe the UI in text (losing precision) or the user to manually upload a screenshot to the Stitch web interface.
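
For illustration, the text-only workaround might look roughly like the sketch below from an MCP client. generate_screen_from_text is the existing tool named above, but the argument names here are assumptions, not the actual Stitch MCP schema.

    import { Client } from "@modelcontextprotocol/sdk/client/index.js";

    declare const stitch: Client; // an already-connected Stitch MCP client session

    // Hypothetical sketch of the current workaround: the agent flattens the UI
    // into prose, which is exactly where precision is lost.
    const generated = await stitch.callTool({
      name: "generate_screen_from_text",              // existing Stitch MCP tool
      arguments: {                                     // argument names are assumed
        projectId: "proj_123",
        prompt:
          "Admin dashboard with a left sidebar (Home, Orders, Settings), " +
          "a top bar with a search field, and a three-column card grid...",
      },
    });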

Proposed Feature:
Add a new tool to the Stitch MCP server: upload_screen_from_image.

Functional Specification:
This tool would allow an agent to upload a local image file (screenshot) directly to a Stitch project. Stitch should then process this image using its internal vision-to-layout engine to create a new editable SCREEN.

Tool Definition (JSON Schema style):

   {
     "name": "upload_screen_from_image",
     "description": "Uploads a screenshot of an existing application to a Stitch project to serve as a design baseline.",
     "parameters": {
       "type": "object",
       "properties": {
         "projectId": {
           "type": "string",
           "description": "The ID of the project where the screen will be added."
         },
         "imagePath": {
           "type": "string",
           "description": "The local path or buffer of the screenshot to upload."
         },
         "title": {
           "type": "string",
           "description": "The title of the new screen (e.g., 'Current Admin Dashboard')."
         },
         "deviceType": {
           "type": "string",
           "enum": ["DESKTOP", "MOBILE", "TABLET"],
           "default": "DESKTOP"
         }
       },
       "required": ["projectId", "imagePath", "title"]
     }
   }
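
If the server is built on the TypeScript MCP SDK, registering such a tool could look roughly like the sketch below. This is only an illustration: stitchClient.uploadScreen stands in for whatever internal Stitch endpoint would accept the image, and nothing here reflects the actual Stitch MCP implementation.

    import { readFile } from "node:fs/promises";
    import { z } from "zod";
    import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";

    // Hypothetical wrapper around Stitch's (not yet existing) upload endpoint.
    declare const stitchClient: {
      uploadScreen(args: {
        projectId: string;
        title: string;
        deviceType: "DESKTOP" | "MOBILE" | "TABLET";
        imageBase64: string;
      }): Promise<{ screenId: string }>;
    };

    const server = new McpServer({ name: "stitch-mcp", version: "0.1.0" });

    server.tool(
      "upload_screen_from_image",
      "Uploads a screenshot of an existing application to a Stitch project to serve as a design baseline.",
      {
        projectId: z.string(),
        imagePath: z.string(),
        title: z.string(),
        deviceType: z.enum(["DESKTOP", "MOBILE", "TABLET"]).default("DESKTOP"),
      },
      async ({ projectId, imagePath, title, deviceType }) => {
        const image = await readFile(imagePath);          // read the screenshot from disk
        const { screenId } = await stitchClient.uploadScreen({
          projectId,
          title,
          deviceType,
          imageBase64: image.toString("base64"),
        });
        return {
          content: [{ type: "text", text: `Created screen ${screenId} in project ${projectId}` }],
        };
      },
    );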

User (and Agent) Workflow

  1. Agent takes a screenshot of the local development server (using chrome_screenshot or similar).
  2. Agent calls upload_screen_from_image with the local path.
  3. Stitch creates a new screen in the project, automatically converting the image into editable components/layout.
  4. User opens the Stitch UI, tweaks the design visually.
  5. Agent pulls the changes back via get_screen and updates the React code (see the sketch below).
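
A rough sketch of that loop from the agent's side, assuming an already-connected MCP client session; all IDs and the get_screen argument shape are made up for illustration.

    import { Client } from "@modelcontextprotocol/sdk/client/index.js";

    declare const stitch: Client;                  // connected Stitch MCP client session
    const screenshotPath = "/tmp/dashboard.png";   // produced by chrome_screenshot or similar

    // Step 2: push the current visual state into Stitch as a new screen.
    await stitch.callTool({
      name: "upload_screen_from_image",
      arguments: {
        projectId: "proj_123",
        imagePath: screenshotPath,
        title: "Current Admin Dashboard",
        deviceType: "DESKTOP",
      },
    });

    // ...the user tweaks the design visually in the Stitch UI (step 4)...

    // Step 5: pull the revised design back and diff it against the React code.
    const revised = await stitch.callTool({
      name: "get_screen",
      arguments: { projectId: "proj_123", screenId: "screen_456" },  // assumed argument names
    });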

Key Benefits

  • Zero-Friction Baseline: Eliminates the "blank page" problem when redesigning existing apps.
  • Improved Accuracy: Avoids the "hallucinations" or omissions inherent in text-to-image prompts.
  • Full Agent Autonomy: Enables a truly closed-loop system where the agent can say: "I see your app looks like this, let me propose a design improvement in Stitch."
