feat: add window capture MCP tool for GUI screenshots #279
Open
mokcontoro wants to merge 9 commits into develop
Collaborator
That's a very cool feature; it could be useful for integration tests too. I will try it.
Contributor, Author
Also sharing a demo result.
New users unfamiliar with MCP or rosbridge need a quick plain-language explanation before diving into installation.

Added:
- "What is this?" section with ASCII architecture diagram
- "What you need" 3-point prerequisites list
- Clearer formatting for key benefits section

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix: safe response handling in services.py (#257)
* fix: safe response handling in topics.py (#258)
* fix: safe response handling in nodes.py (#259)
* fix: safe response handling in parameters.py (#260)
* fix: safe response handling in actions.py (#261)
* fix: safe response handling in ros_metadata.py (#251)

Add capture_window and list_windows MCP tools that capture X11 GUI windows (TurtleSim, RViz, Gazebo, etc.) and return them as ImageContent. This enables the AI to see what's displayed in ROS GUI applications.

Features:
- capture_window: screenshot any window by name, returns ImageContent
- list_windows: list all available GUI windows with sizes
- Optional resize support for bandwidth control
- Graceful fallback when X11/dependencies not available

Dependencies: python3-xlib (optional, for X11 capture)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Bump version to 3.1.0
- Update restructuring_plan.md with window capture tool category (33 tools)
- Update README.md features list with GUI window capture
- Add unit tests for window capture tools (test_window_capture.py)
- Add conftest.py stub for test environment compatibility

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

a0c4e96 to 8aaad81
Summary
- `capture_window` MCP tool that captures X11 GUI windows (TurtleSim, RViz, Gazebo, etc.) and returns them as `ImageContent`
- `list_windows` MCP tool to discover available GUI windows

How it works
The tool uses `python3-xlib` to find and capture X11 windows by name. It converts the raw pixel data to JPEG and returns it as `ImageContent` that displays inline in the AI client.

Use cases

Dependencies
- `python3-xlib` (optional; tools gracefully return an error message if not installed)
- `pillow`, `numpy` (already in project dependencies)

Test plan
- `list_windows()` returns available windows
- `capture_window(window_name="TurtleSim")` returns screenshot as `ImageContent`
- `capture_window` with resize option works

Tested on ROS 2 Jazzy / WSL2 (WSLg) with TurtleSim.
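
For readers who want a concrete picture of the "raw pixel data to JPEG" step described under "How it works", here is a minimal sketch. It is not the code from this PR. The helper name `raw_bgrx_to_jpeg_payload`, the `max_width` and `quality` parameters, and the assumption that the capture arrives as 4-byte BGRX pixels (a common X11 ZPixmap layout on 24/32-bit displays) are all illustrative. The returned dict only mirrors the fields of an MCP image content item; the real tool may build it differently.

```python
import base64
import io

from PIL import Image


def raw_bgrx_to_jpeg_payload(raw, width, height, max_width=None, quality=80):
    """Convert raw BGRX pixels into a base64-encoded JPEG payload (hypothetical helper)."""
    # Read 4 bytes per pixel as B, G, R, padding and build an RGB image from them.
    image = Image.frombytes("RGB", (width, height), raw, "raw", "BGRX")

    # Optional downscale that keeps the aspect ratio, to limit payload size.
    if max_width is not None and width > max_width:
        new_height = max(1, round(height * max_width / width))
        image = image.resize((max_width, new_height))

    buffer = io.BytesIO()
    image.save(buffer, format="JPEG", quality=quality)
    return {
        "type": "image",
        "mimeType": "image/jpeg",
        "data": base64.b64encode(buffer.getvalue()).decode("ascii"),
    }


# Synthetic 4x2 solid-red frame standing in for a real window capture.
frame = bytes([0, 0, 255, 0]) * (4 * 2)
payload = raw_bgrx_to_jpeg_payload(frame, 4, 2, max_width=2)
```

Passing `max_width` corresponds loosely to the resize option in the test plan: smaller images mean less data sent to the AI client, at the cost of detail.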
🤖 Generated with Claude Code
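
The optional-dependency behaviour noted under Dependencies (tools return an error message when `python3-xlib` is missing) could look roughly like the sketch below. This is an assumption about the pattern, not the PR's implementation: the function name `list_windows_sketch`, the error strings, and the use of the EWMH `_NET_CLIENT_LIST` property to enumerate windows are all illustrative choices.

```python
import importlib.util


def list_windows_sketch():
    """Return {'windows': [...]} on success, or {'error': '...'} if capture is unavailable."""
    # python-xlib is optional, so probe for it instead of importing at module load.
    if importlib.util.find_spec("Xlib") is None:
        return {"error": "python3-xlib is not installed; window capture is unavailable."}

    try:
        from Xlib import X, display

        d = display.Display()  # raises if no X server is reachable
        root = d.screen().root
        # Assumes an EWMH-compliant window manager that publishes _NET_CLIENT_LIST.
        prop = root.get_full_property(
            d.intern_atom("_NET_CLIENT_LIST"), X.AnyPropertyType
        )
        windows = []
        for window_id in (prop.value if prop else []):
            window = d.create_resource_object("window", int(window_id))
            geometry = window.get_geometry()
            windows.append(
                {
                    "name": window.get_wm_name(),
                    "width": geometry.width,
                    "height": geometry.height,
                }
            )
        return {"windows": windows}
    except Exception as exc:
        # Any X11 failure becomes a message instead of crashing the MCP server.
        return {"error": f"window listing failed: {exc}"}


result = list_windows_sketch()
```

Returning a plain error dict keeps the server usable on headless machines, where neither the package nor a display may be present.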