Releases: askui/python-sdk
v0.10.0
What's Changed
- feat(models)!: based get model on google genai api by @adi-wan-askui in #108
🚀 New Features
- Google Gemini API Support: The `askui` model now uses `gemini-2.5-flash` as the default model, falling back to the original `askui` model (Inference API's VQA endpoint) if the Google GenAI API fails, e.g., because the schema is not supported or for an unknown reason. For example, the Google GenAI API does not support recursive schemas at the moment.
- New Model Options: `askui/gemini-2.5-flash` and `askui/gemini-2.5-pro` are now supported as model choices.
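The fallback behavior described above can be pictured with a minimal sketch. The names `get_with_fallback`, `gemini_backend`, and `vqa_backend` are hypothetical and not part of the SDK; the sketch only assumes that the primary backend signals failure by raising an exception.

```python
from typing import Callable


def get_with_fallback(
    query: str,
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
) -> str:
    """Try the primary backend first; use the fallback if it raises."""
    try:
        return primary(query)
    except Exception:
        # Assumption: any failure (unsupported schema, unknown error)
        # triggers the fallback.
        return fallback(query)


def gemini_backend(query: str) -> str:
    # Hypothetical stand-in that always fails, e.g., on a recursive schema.
    raise ValueError("recursive schemas are not supported")


def vqa_backend(query: str) -> str:
    # Hypothetical stand-in for the original VQA endpoint.
    return f"vqa answer to: {query}"


result = get_with_fallback("What is the page title?", gemini_backend, vqa_backend)
```

Here `result` comes from `vqa_backend`, because the primary stand-in raises.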
🚨 Breaking Changes
- Default Model Change: The `askui` default model for `AgentBase.get()` (and, therefore, `VisionAgent.get()` etc.) has changed, which may affect the behavior of existing implementations.
Full Changelog: v0.9.7...v0.10.0
v0.9.7
What's Changed
Rerelease of v0.9.6 due to a problem while releasing.
Full Changelog: v0.9.6...v0.9.7
v0.9.6
What's Changed
Rerelease of v0.9.5 due to a dependency problem.
- Add missing protobuf dependency by @onur-askui in #107
Full Changelog: v0.9.5...v0.9.6
v0.9.5
What's Changed
- Update controller grpc by @onur-askui in #94
- feat: set askui controller path directly by @adi-wan-askui in #105
- Fix UnicodeEncodeError by @programminx-askui in #106
- feat: add display management tools and enhance VisionAgent by @danyalxahid-askui in #103
🚀 New Features
- Computer Tool support of "cursor_position" action: `VisionAgent` can now retrieve the current cursor position, e.g., to answer questions like "Is the cursor currently hovering over the star icon button?"
- Multiple Display Support: Comprehensive multi-display functionality for computer vision agents
  - New Display Management Tools ⟶ e.g., `VisionAgent` now searches through all available displays if something cannot be found on one:
    - `ListDisplaysTool`: List all available displays with their properties
    - `SetActiveDisplayTool`: Set the active display for screenshots and actions
    - `RetrieveActiveDisplayTool`: Get information about the currently active display
- AskUI Controller Path Configuration: Enhanced flexibility in controller setup
  - Direct Path Setting: New `ASKUI_CONTROLLER_PATH` environment variable for direct controller executable specification
  - Priority-Based Resolution: Controller path resolution with precedence: direct path > component registry > installation directory
  - Cross-Platform Support: Improved path resolution for Windows, macOS, and Linux
  - Better Error Handling: Clear error messages for missing or invalid controller paths
- Modernized GRPC Controller Architecture: Major overhaul of the controller communication system
  - JSON Schema Integration: New JSON schema definitions for `AgentOS-Send-Request-2501` and `AgentOS-Send-Response-2501`
  - Automated Code Generation:
    - New `grpc:gen` and `json:gen` PDM scripts for automated code generation
    - Generated Python classes from JSON schemas using `datamodel-code-generator`
    - Updated GRPC bindings with enhanced type safety
  - Enhanced Command Helpers: New `command_helpers.py` module with functions for:
    - Mouse position management (`create_get_mouse_position_command`, `create_set_mouse_position_command`)
    - Render object management (quad, line, image, text commands)
    - Styling system with `create_style` function supporting CSS-like properties
  - Improved Proto Definitions: Updated `Controller_V1.proto` with expanded command support
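The controller path precedence listed above (direct path > component registry > installation directory) could look roughly like the following sketch. The function `resolve_controller_path` and its parameters are hypothetical. The sketch also assumes that the first configured source wins and that a configured but missing file is an error, which is not confirmed by these notes.

```python
from pathlib import Path
from typing import Mapping, Optional


def resolve_controller_path(
    env: Mapping[str, str],
    registry_path: Optional[Path] = None,
    installation_path: Optional[Path] = None,
) -> Path:
    """Pick the highest-priority configured path and check that it exists."""
    direct = env.get("ASKUI_CONTROLLER_PATH")
    candidates = [
        ("ASKUI_CONTROLLER_PATH", Path(direct) if direct else None),
        ("component registry", registry_path),
        ("installation directory", installation_path),
    ]
    for source, path in candidates:
        if path is None:
            continue  # Source not configured; try the next one.
        if not path.is_file():
            # Assumption: an invalid configured path is reported, not skipped.
            raise FileNotFoundError(
                f"Controller path from {source} does not exist: {path}"
            )
        return path
    raise FileNotFoundError("No AskUI Controller path is configured")
```

Passing the environment as a mapping (for example `os.environ`) keeps the function easy to test without touching the real process environment.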
🔧 Improvements
- Development Workflow Enhancements:
  - New Dev Dependency Group: Separated development dependencies (`datamodel-code-generator`, `grpcio-tools`) into a dedicated group
  - Automated Generation Scripts: New `scripts/grpc-gen.sh` for GRPC code generation
🐞 Bug Fixes
- Unicode Encoding: Fixed encoding (`UnicodeEncodeError`) in the Chat API persistence layer
🔄 Dependencies
- Moved to Dev Dependencies:
  - `grpcio-tools>=1.73.1`: Moved from core to dev dependencies (was `>=1.67.0`)
  - `datamodel-code-generator>=0.31.2`: Added for automated code generation
Full Changelog: v0.9.4...v0.9.5
v0.9.4
What's Changed
- feat: add web testing agent and related tools by @adi-wan-askui in #97
- fix: slow typing in playwright agent os by @adi-wan-askui in #96
🚀 New Features
- Web Testing Agent: We've introduced the `WebTestingAgent` for doing simple exploratory testing. Given a URL, it explores the features of a website or web app, creates testing scenarios, and executes them.
  - Main Limitations:
    - Features, scenarios, and executions are currently not scoped to a particular URL. So if you try to test multiple apps (across different chats/conversations), it may get confused.
    - It can go off the rails, e.g., if it encounters a link to another website/web app on the website/app it should test, it may also test the other one.
    - With a growing number of features, scenarios, and executions, it may get more and more confused, as it is currently not scalable.
    - It currently lacks focus in what to test, so it may sometimes test things that are not really important.
    - It shares the current issues of the `WebVisionAgent`.
🐞 Bug Fixes
- Performance Optimization: Fixed slow typing in the Playwright agent OS integration, which was caused by incorrect units
🔧 Improvements
- Python 3.13 Compatibility: Enhanced `NOT_GIVEN` implementation now works correctly as dataclass field defaults in Python 3.13
- Configuration Management:
  - Extracted mypy configuration to separate `mypy.ini` file for better import handling and module-specific settings
  - Improved VS Code debugger configuration for chat API module path
- Enhanced Utility Modules: New utility modules have been added:
  - `api_utils`: Streamlined API interaction utilities
  - `datetime_utils`: Enhanced datetime handling capabilities
  - `id_utils`: Improved ID generation and validation
  - `not_given`: Better handling of optional parameters with immutable `NOT_GIVEN` implementation
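To illustrate the idea of an immutable `NOT_GIVEN` value used as a dataclass field default, here is a minimal sketch. It is not the SDK's implementation: the `NotGiven` class and the `RequestOptions` dataclass are hypothetical. The sketch relies on the fact that `dataclasses` rejects defaults whose class is unhashable, so the sentinel is kept hashable and without instance state.

```python
from dataclasses import dataclass
from typing import Union


class NotGiven:
    """Hypothetical immutable singleton meaning 'no value was passed'."""

    __slots__ = ()  # No instance attributes, so instances cannot be mutated.
    _instance = None

    def __new__(cls) -> "NotGiven":
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __bool__(self) -> bool:
        return False

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = NotGiven()


@dataclass
class RequestOptions:
    # None is a meaningful value here, so a separate sentinel marks "unset".
    timeout: Union[float, None, NotGiven] = NOT_GIVEN


unset = RequestOptions()
explicit_none = RequestOptions(timeout=None)
```

The sentinel lets callers distinguish "no timeout was specified" (`unset.timeout is NOT_GIVEN`) from "the timeout was explicitly disabled" (`explicit_none.timeout is None`).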
🔄 Dependencies
- Added: `jsonref>=1.1.0` for JSON reference handling in testing tools
Full Changelog: v0.9.3...v0.9.4
v0.9.3
What's Changed
- fix: override beta flag settings by @adi-wan-askui in #95
🐞 Bug Fixes
- allow overriding `betas` flag with empty list
- override it with empty list in `AndroidVisionAgent` so that it does not use computer beta flag (AskUI Inference API default)
- fix serialization issues in telemetry module for `AgentBase.act()`
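These notes do not say why an empty `betas` list could not override the default before. One common way such an override gets lost is a truthiness check, since an empty list is falsy. The sketch below contrasts the two variants; `DEFAULT_BETAS` and both functions are hypothetical and only illustrate the pattern.

```python
from typing import List, Optional

# Hypothetical default; the real default is defined by the AskUI Inference API.
DEFAULT_BETAS = ["computer-use"]


def resolve_betas(betas: Optional[List[str]] = None) -> List[str]:
    """Use the default only when nothing was passed; keep an explicit []."""
    if betas is None:
        return list(DEFAULT_BETAS)
    return list(betas)


def resolve_betas_truthiness(betas: Optional[List[str]] = None) -> List[str]:
    """Variant that cannot be overridden with []: an empty list is falsy."""
    return list(betas or DEFAULT_BETAS)
```

With `resolve_betas([])` the caller gets an empty list back, whereas `resolve_betas_truthiness([])` silently returns the default.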
Full Changelog: v0.9.2...v0.9.3
v0.9.2
What's Changed
- fix(models): make askui token optional by @adi-wan-askui in #93
🐞 Bug Fixes
- make askui token optional if `ASKUI__AUTHORIZATION` is set when using AskUI Inference API
Full Changelog: v0.9.1...v0.9.2
v0.9.1
What's Changed
- feat/credentials fowarding chat by @adi-wan-askui in #92
🚀 New Features
- AskUI Inference API: Configure the authorization header using the `ASKUI__AUTHORIZATION` env variable (takes precedence over constructing the authorization header from `ASKUI_TOKEN`)
- Chat API: Allow configuring the workspace id and the AskUI Inference API authorization header from request headers so that a client, e.g., https://hub.askui.com, can set them
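The precedence between the two environment variables can be sketched as follows. The function `resolve_authorization` is hypothetical, and the way the header is built from `ASKUI_TOKEN` (a `Basic` scheme with the base64-encoded token) is an assumption for illustration, not something these notes confirm.

```python
import base64
from typing import Mapping, Optional


def resolve_authorization(env: Mapping[str, str]) -> Optional[str]:
    """Prefer an explicit header value; otherwise derive one from the token."""
    explicit = env.get("ASKUI__AUTHORIZATION")
    if explicit:
        return explicit
    token = env.get("ASKUI_TOKEN")
    if token:
        # Assumption: Basic scheme with the base64-encoded token.
        encoded = base64.b64encode(token.encode("utf-8")).decode("ascii")
        return f"Basic {encoded}"
    return None
```

When both variables are set, the value of `ASKUI__AUTHORIZATION` is returned unchanged and `ASKUI_TOKEN` is ignored.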
Other
- Allow importing `OnMessageCbParam` from `askui`
Full Changelog: v0.9.0...v0.9.1
v0.9.0
What's Changed
- introduce health endpoint for chat api by @onur-askui in #85
- refactor!: make agents more composable by @adi-wan-askui in #86
- feat: web automation support using playwright + install chat api with pip by @adi-wan-askui in #90
- fix(models): askui & anthropic api settings by @adi-wan-askui in #91
🚀 New Features
- Web Automation Support: We've introduced the `WebVisionAgent` for browser automation, powered by Playwright. This new agent allows you to automate tasks directly within web browsers. You can install the required dependencies using `pip install askui[web]`.
- Chat API is Now Part of the Package: The AskUI Chat API has been integrated into the `askui` package under the `askui.chat` module. You can now run it directly using `python -m askui.chat`.
- Enhanced Agent Capabilities:
  - The `act` method across all agents now accepts `tools` and `settings` parameters, allowing for more fine-grained control over agent execution.
  - The `AndroidVisionAgent` can now leverage the Claude 4 model for its operations.
  - Agents now support new actions like `scroll` and `wait` for more complex interactions.
🔧 Improvements
- Composable Agent Architecture: Agents have been significantly refactored for better composability and extensibility. A new base class, AgentBase, has been introduced, from which `VisionAgent`, `AndroidVisionAgent`, and the new `WebVisionAgent` inherit.
- Refactored Settings Management: Settings of the AskUI Inference API and the Anthropic API have been refactored to be more consistent and easier to use, and to allow access to all API settings, e.g., you can now set the model to be used for the `act` method when using the AskUI Inference API with `export ASKUI__MESSAGES__MODEL=anthropic-claude-3-5-sonnet-20241022`. Check inference_api.py or messages_api.py for more details.
- Chat API Enhancements:
  - The Chat API now uses a consistent default port of `9261` for easier testing and setup.
  - The Chat UI has been moved and is now hosted on the [AskUI Hub](https://hub.askui.com/).
- Refined Keyboard Tooling: The internal keyboard tooling has been improved to better support a wider range of keys and modifier combinations.
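The variable name `ASKUI__MESSAGES__MODEL` in the settings item above suggests that double underscores separate nesting levels, as Pydantic-based settings commonly do. The following stdlib-only sketch shows that mapping; the function `nested_settings` is hypothetical, and treating `__` as a nesting delimiter is an assumption inferred from the variable name.

```python
from typing import Any, Dict, Mapping


def nested_settings(env: Mapping[str, str], prefix: str = "ASKUI__") -> Dict[str, Any]:
    """Turn PREFIX + A__B=value entries into {"a": {"b": value}}."""
    settings: Dict[str, Any] = {}
    for name, value in env.items():
        if not name.startswith(prefix):
            continue  # Not a nested setting handled by this sketch.
        *parents, leaf = name[len(prefix):].lower().split("__")
        node = settings
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return settings


env = {
    "ASKUI__MESSAGES__MODEL": "anthropic-claude-3-5-sonnet-20241022",
    "ASKUI_TOKEN": "ignored-by-this-sketch",
}
settings = nested_settings(env)
```

In this sketch, `settings["messages"]["model"]` holds the model name, while `ASKUI_TOKEN` is skipped because it uses a single underscore after the prefix.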
🚨 Breaking Changes
- Optional Dependencies: Core dependencies have been made optional to provide a more lightweight installation. You now need to install extras based on your needs. For example:
  - For web automation: `pip install askui[web]`
  - For Android automation: `pip install askui[android]`
  - For using the Chat API: `pip install askui[chat]`
  - To install everything: `pip install askui[all]`
- Removed APIs and Exceptions:
  - The unused exceptions `AskUiApiError` and `AskUiApiRequestFailedError` have been removed. Please use more specific exceptions.
  - Older methods of configuring `Inference` and `Anthropic` APIs via environment variables and settings classes have been removed in favor of the new Pydantic-based settings management.
- `ActModel.act` Method Signature: The signature for `ActModel.act` has been extended with new `tools` and `settings` parameters. If you have implemented custom models, you will need to update your method signatures accordingly.
- Configuration Changes: All environment variables for configuring the AskUI Inference API or Anthropic API have been replaced by more consistent environment variables, except for `ANTHROPIC_API_KEY`, `ASKUI_TOKEN`, `ASKUI_WORKSPACE_ID`, and `ASKUI_INFERENCE_ENDPOINT`, which are still supported.
📜 Documentation
- The `README.md` has been significantly updated to reflect the new architecture, installation procedures, and features.
- Updated installation instructions to use extras like `pip install askui[chat]`.
- Updated usage examples for running the chat API with `python -m askui.chat`.
- Clarified that the Chat UI is now hosted on the [AskUI Hub](https://hub.askui.com/).
🔄 Dependencies
- Added:
  - `playwright>=1.41.0` for web automation support.
  - `greenlet>=3.1.1` and `pyee<14,>=13` as dependencies for `playwright`.
- Restructured:
  - Dependencies are now managed in optional groups (`android`, `chat`, `mcp`, `pynput`, `web`) to reduce the size of the default installation. You must now install the extras you need.
- Removed from Core Dependencies:
  - `fastmcp`, `mcp`, `openapi-pydantic`, `python-multipart`, and `typer` have been moved to the `[mcp]` optional dependency group.
  - `httpx-sse` and `sse-starlette` were removed as they are no longer needed.
🧪 Experimental
- AskUI Chat: The AskUI Chat feature remains in an experimental stage. We welcome your feedback as we continue to improve its functionality and user experience.
Full Changelog: v0.8.0...v0.9.0
v0.8.0
What's Changed
- feat(agent): add support for `locator` in `type()` by @adi-wan-askui in #84
- feat!: introduce Claude 4 Sonnet by @onur-askui in #83
🚀 Features
- add support for clicking/focusing and clearing to `VisionAgent.type()` so that consumers don't have to use custom solutions
- add support for Claude 4 + thinking (new computer tool only supported partially for now)
🐞 Bug Fixes
- Fix exceptions being hidden by a serialization exception from the telemetry module by fixing the serialization of (exception) classes
🚨 Breaking Changes
- default model used for `VisionAgent.act()` changed, which may make these calls behave differently from before
Full Changelog: v0.7.0...v0.8.0