Releases · askui/python-sdk

30 Jul 15:03

adi-wan-askui

v0.10.0

c2ec271

v0.10.0

What's Changed

feat(models)!: based get model on google genai api by @adi-wan-askui in #108

🚀 New Features

Google Gemini API Support: The askui model now uses gemini-2.5-flash as the default model and falls back to the original askui model (the Inference API's VQA endpoint) if the Google GenAI API fails, e.g., because it does not support the response schema or for an unknown reason. For example, the Google GenAI API does not currently support recursive schemas.
New Model Options: askui/gemini-2.5-flash and askui/gemini-2.5-pro are now supported as model choices.
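
The fallback behavior described above can be pictured roughly as follows. This is a minimal sketch, not the SDK's implementation: `get_with_fallback`, `gemini_backend`, and `vqa_backend` are hypothetical names, and the backends are plain stand-in functions rather than real API clients.

```python
from typing import Callable, Sequence


def get_with_fallback(query: str, backends: Sequence[Callable[[str], str]]) -> str:
    """Try each backend in order and return the first successful answer."""
    last_error = None
    for backend in backends:
        try:
            return backend(query)
        except Exception as error:  # any failure moves on to the next backend
            last_error = error
    raise RuntimeError("All backends failed") from last_error


def gemini_backend(query: str) -> str:
    # Hypothetical stand-in that simulates a rejected (e.g. recursive) schema.
    raise ValueError("schema not supported")


def vqa_backend(query: str) -> str:
    # Hypothetical stand-in for the original askui model.
    return f"vqa answer to: {query}"


if __name__ == "__main__":
    print(get_with_fallback("What is shown?", [gemini_backend, vqa_backend]))
```

Because the fallback is silent in this sketch, the caller cannot tell which backend answered; a real implementation may well log or report that.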

🚨 Breaking Changes

Default Model Change: The askui default model for AgentBase.get() (and, therefore, VisionAgent.get() etc.) has changed, which may affect the behavior of existing implementations.

Full Changelog: v0.9.7...v0.10.0

Contributors

adi-wan-askui


30 Jul 08:27

onur-askui

v0.9.7

3b32e44

v0.9.7

What's Changed

Rerelease of v0.9.6 due to a problem while releasing.

Full Changelog: v0.9.6...v0.9.7


30 Jul 08:20

onur-askui

v0.9.6

e0dce94

v0.9.6

What's Changed

Rerelease of v0.9.5 due to a dependency problem.

Add missing protobuf dependency by @onur-askui in #107

Full Changelog: v0.9.5...v0.9.6

Contributors

onur-askui


29 Jul 16:10

adi-wan-askui

v0.9.5

efd12ef

v0.9.5

What's Changed

Update controller grpc by @onur-askui in #94
feat: set askui controller path directly by @adi-wan-askui in #105
Fix UnicodeEncodeError by @programminx-askui in #106
feat: add display management tools and enhance VisionAgent by @danyalxahid-askui in #103

🚀 New Features

Computer Tool Support for the "cursor_position" Action: VisionAgent can now retrieve the current cursor position, e.g., to answer questions like "Is the cursor currently hovering over the star icon button?"
Multiple Display Support: Comprehensive multi-display functionality for computer vision agents
- New Display Management Tools ⟶ e.g., VisionAgent now searches through all available displays if something cannot be found on one:
  - ListDisplaysTool: List all available displays with their properties
  - SetActiveDisplayTool: Set the active display for screenshots and actions
  - RetrieveActiveDisplayTool: Get information about the currently active display
AskUI Controller Path Configuration: Enhanced flexibility in controller setup
- Direct Path Setting: New ASKUI_CONTROLLER_PATH environment variable for direct controller executable specification
- Priority-Based Resolution: Controller path resolution with precedence: direct path > component registry > installation directory
- Cross-Platform Support: Improved path resolution for Windows, macOS, and Linux
- Better Error Handling: Clear error messages for missing or invalid controller paths
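
The precedence rule above (direct path > component registry > installation directory) could be sketched like this. The function name, its parameters, and the choice to raise as soon as the highest-priority configured path is invalid are assumptions for illustration; only the ASKUI_CONTROLLER_PATH variable and the ordering come from the notes.

```python
from pathlib import Path
from typing import Callable, Mapping, Optional


def resolve_controller_path(
    env: Mapping[str, str],
    registry_path: Optional[str] = None,
    installation_path: Optional[str] = None,
    exists: Callable[[Path], bool] = Path.is_file,
) -> Path:
    """Return the controller path from the highest-priority configured source."""
    candidates = [
        ("ASKUI_CONTROLLER_PATH", env.get("ASKUI_CONTROLLER_PATH")),
        ("component registry", registry_path),
        ("installation directory", installation_path),
    ]
    for source, candidate in candidates:
        if not candidate:
            continue  # source not configured, try the next one
        path = Path(candidate)
        if not exists(path):
            # Assumed behavior: report the invalid path instead of falling through.
            raise FileNotFoundError(f"Controller path from {source} is invalid: {path}")
        return path
    raise FileNotFoundError("No controller path configured")
```

Passing `exists` as a parameter keeps the sketch testable without touching the file system.
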
Modernized GRPC Controller Architecture: Major overhaul of the controller communication system
- JSON Schema Integration: New JSON schema definitions for AgentOS-Send-Request-2501 and AgentOS-Send-Response-2501
- Automated Code Generation:
  - New grpc:gen and json:gen PDM scripts for automated code generation
  - Generated Python classes from JSON schemas using datamodel-code-generator
  - Updated GRPC bindings with enhanced type safety
- Enhanced Command Helpers: New command_helpers.py module with functions for:
  - Mouse position management (create_get_mouse_position_command, create_set_mouse_position_command)
  - Render object management (quad, line, image, text commands)
  - Styling system with create_style function supporting CSS-like properties
- Improved Proto Definitions: Updated Controller_V1.proto with expanded command support

🔧 Improvements

Development Workflow Enhancements:
- New Dev Dependency Group: Separated development dependencies (datamodel-code-generator, grpcio-tools) into dedicated group
- Automated Generation Scripts: New scripts/grpc-gen.sh for GRPC code generation

🐞 Bug Fixes

Unicode Encoding: Fixed a UnicodeEncodeError in the Chat API persistence layer

🔄 Dependencies

Moved to Dev Dependencies:
- grpcio-tools>=1.73.1: Moved from core to dev dependencies (was >=1.67.0)
- datamodel-code-generator>=0.31.2: Added for automated code generation

Full Changelog: v0.9.4...v0.9.5

Contributors

adi-wan-askui, programminx-askui, and 2 other contributors


22 Jul 10:00

adi-wan-askui

v0.9.4

5977bde

v0.9.4

What's Changed

feat: add web testing agent and related tools by @adi-wan-askui in #97
fix: slow typing in playwright agent os by @adi-wan-askui in #96

🚀 New Features

Web Testing Agent: We've introduced the WebTestingAgent for simple exploratory testing. Given a URL, it explores the features of a website or web app, creates testing scenarios, and executes them.
- Main Limitations:
  - Features, scenarios, and executions are currently not scoped to a particular URL, so if you test multiple apps (across different chats/conversations), it may get confused.
  - It can go off the rails, e.g., if it encounters a link to another website or web app on the one it should test, it may test the other one as well.
  - As the number of features, scenarios, and executions grows, it may get increasingly confused, as it does not currently scale.
  - It currently lacks focus in what to test, so it may sometimes test things that are not really important.
  - It shares the current issues of the WebVisionAgent.

🐞 Bug Fixes

Performance Optimization: Fixed slow typing in the Playwright agent OS integration, which was caused by incorrect units

🔧 Improvements

Python 3.13 Compatibility: Enhanced NOT_GIVEN implementation now works correctly as dataclass field defaults in Python 3.13
Configuration Management:
- Extracted mypy configuration to separate mypy.ini file for better import handling and module-specific settings
- Improved VS Code debugger configuration for chat API module path
Enhanced Utility Modules: New utility modules have been added:
- api_utils: Streamlined API interaction utilities
- datetime_utils: Enhanced datetime handling capabilities
- id_utils: Improved ID generation and validation
- not_given: Better handling of optional parameters with immutable NOT_GIVEN implementation
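
To illustrate what an immutable NOT_GIVEN sentinel used as a dataclass field default can look like, here is a minimal sketch. The `NotGiven` class, `RequestOptions`, and `effective_timeout` below are hypothetical and are not the SDK's implementation; the point is only that the sentinel lets code distinguish "not supplied" from an explicit `None`.

```python
from dataclasses import dataclass
from typing import Union


class NotGiven:
    """Hypothetical immutable singleton marking an argument that was not supplied."""

    __slots__ = ()  # no instance attributes, so instances cannot be mutated
    _instance = None

    def __new__(cls) -> "NotGiven":
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __bool__(self) -> bool:
        return False

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = NotGiven()


@dataclass
class RequestOptions:
    # The sentinel is hashable, so dataclasses accept it as a plain default.
    timeout: Union[float, None, NotGiven] = NOT_GIVEN


def effective_timeout(options: RequestOptions, default: float = 30.0) -> Union[float, None]:
    """Use the default only when no timeout was given; an explicit None is kept."""
    if isinstance(options.timeout, NotGiven):
        return default
    return options.timeout
```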

🔄 Dependencies

Added:
- jsonref>=1.1.0 for JSON reference handling in testing tools

Full Changelog: v0.9.3...v0.9.4

Contributors

adi-wan-askui


14 Jul 13:50

adi-wan-askui

v0.9.3

023f089

v0.9.3

What's Changed

fix: override beta flag settings by @adi-wan-askui in #95

🐞 Bug Fixes

allow overriding the betas flag with an empty list
override it with an empty list in AndroidVisionAgent so that it does not use the computer beta flag (the AskUI Inference API default)
fix serialization issues in the telemetry module for AgentBase.act()
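
The difference between "no override" and "override with an empty list" in the first two fixes can be sketched as below. This is an illustration of the general pattern only; `resolve_betas` and the flag name in `DEFAULT_BETAS` are hypothetical and say nothing about how the SDK actually stores or names its beta flags.

```python
from typing import List, Optional, Sequence

DEFAULT_BETAS = ["computer-use"]  # hypothetical default flag name


def resolve_betas(betas: Optional[Sequence[str]] = None) -> List[str]:
    """Return the caller's betas, falling back to the default only for None."""
    if betas is None:
        return list(DEFAULT_BETAS)
    # An empty list is a deliberate override and must not be replaced by the
    # default, which is what a truthiness check like `betas or DEFAULT_BETAS` would do.
    return list(betas)
```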

Full Changelog: v0.9.2...v0.9.3

Contributors

adi-wan-askui


11 Jul 12:11

adi-wan-askui

v0.9.2

dc677dc

v0.9.2

What's Changed

fix(models): make askui token optional by @adi-wan-askui in #93

🐞 Bug Fixes

make the AskUI token optional when using the AskUI Inference API if ASKUI__AUTHORIZATION is set

Full Changelog: v0.9.1...v0.9.2

Contributors

adi-wan-askui


11 Jul 11:54

adi-wan-askui

v0.9.1

ffd5e7f

v0.9.1

What's Changed

feat/credentials fowarding chat by @adi-wan-askui in #92

🚀 New Features

AskUI Inference API: Configure the authorization header using the ASKUI__AUTHORIZATION environment variable (takes precedence over constructing the authorization header from ASKUI_TOKEN)
Chat API: Allow configuring the workspace ID and the AskUI Inference API authorization header via request headers so that a client, e.g., https://hub.askui.com, can set them
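
The precedence between the two environment variables might look roughly like this. The function name is hypothetical, and the way a header is built from ASKUI_TOKEN (here `Basic` plus the base64-encoded token) is an assumption for illustration; only the rule that ASKUI__AUTHORIZATION wins comes from the notes.

```python
import base64
from typing import Mapping


def build_authorization_header(env: Mapping[str, str]) -> str:
    """Prefer an explicit header value over one derived from the token."""
    explicit = env.get("ASKUI__AUTHORIZATION")
    if explicit:
        return explicit  # used verbatim, the token is not consulted
    token = env.get("ASKUI_TOKEN")
    if token:
        # Assumed scheme: base64-encode the token and send it as Basic auth.
        encoded = base64.b64encode(token.encode("utf-8")).decode("ascii")
        return f"Basic {encoded}"
    raise ValueError("Set ASKUI__AUTHORIZATION or ASKUI_TOKEN")
```

Calling it with `os.environ` would apply the same rule to the real process environment.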

Other

Allow importing OnMessageCbParam from askui

Full Changelog: v0.9.0...v0.9.1

Contributors

adi-wan-askui


10 Jul 13:09

adi-wan-askui

v0.9.0

c2cac82

v0.9.0

What's Changed

introduce health endpoint for chat api by @onur-askui in #85
refactor!: make agents more composable by @adi-wan-askui in #86
feat: web automation support using playwright + install chat api with pip by @adi-wan-askui in #90
fix(models): askui & anthropic api settings by @adi-wan-askui in #91

🚀 New Features

Web Automation Support: We've introduced the WebVisionAgent for browser automation, powered by Playwright. This new agent allows you to automate tasks directly within web browsers. You can install the required dependencies using pip install askui[web].
Chat API is Now Part of the Package: The AskUI Chat API has been integrated into the askui package under the askui.chat module. You can now run it directly using python -m askui.chat.
Enhanced Agent Capabilities:
- The act method across all agents now accepts tools and settings parameters, allowing for more fine-grained control over agent execution.
- The AndroidVisionAgent can now leverage the Claude 4 model for its operations.
- Agents now support new actions like scroll and wait for more complex interactions.

🔧 Improvements

Composable Agent Architecture: Agents have been significantly refactored for better composability and extensibility. A new base class, AgentBase, has been introduced, from which VisionAgent, AndroidVisionAgent, and the new WebVisionAgent inherit.
Refactored Settings Management: Settings of the AskUI Inference API and the Anthropic API have been refactored to be more consistent and easier to use, and to allow access to all API settings. For example, you can now set the model used for the act method when using the AskUI Inference API with export ASKUI__MESSAGES__MODEL=anthropic-claude-3-5-sonnet-20241022. Check inference_api.py or messages_api.py for more details.
Chat API Enhancements:
- The Chat API now uses a consistent default port of 9261 for easier testing and setup.
- The Chat UI has been moved and is now hosted on the [AskUI Hub](https://hub.askui.com/).
Refined Keyboard Tooling: The internal keyboard tooling has been improved to better support a wider range of keys and modifier combinations.

🚨 Breaking Changes

Optional Dependencies: Core dependencies have been made optional to provide a more lightweight installation. You now need to install extras based on your needs. For example:
- For web automation: pip install askui[web]
- For Android automation: pip install askui[android]
- For using the Chat API: pip install askui[chat]
- To install everything: pip install askui[all]
Removed APIs and Exceptions:
- The unused exceptions AskUiApiError and AskUiApiRequestFailedError have been removed. Please use more specific exceptions.
- Older methods of configuring Inference and Anthropic APIs via environment variables and settings classes have been removed in favor of the new Pydantic-based settings management.
ActModel.act Method Signature: The signature for ActModel.act has been extended with new tools and settings parameters. If you have implemented custom models, you will need to update your method signatures accordingly.
Configuration Changes: All environment variables for configuring the AskUI Inference API or the Anthropic API have been replaced by more consistent ones, except for ANTHROPIC_API_KEY, ASKUI_TOKEN, ASKUI_WORKSPACE_ID, and ASKUI_INFERENCE_ENDPOINT, which are still supported.
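
Variable names such as ASKUI__MESSAGES__MODEL suggest a double-underscore convention for nested settings. The sketch below shows how such names could map to a nested structure; `parse_nested_env` is a hypothetical helper, and the actual mapping in the SDK's Pydantic-based settings classes may differ.

```python
from typing import Any, Dict, Mapping


def parse_nested_env(
    env: Mapping[str, str],
    prefix: str = "ASKUI__",
    delimiter: str = "__",
) -> Dict[str, Any]:
    """Turn PREFIX__A__B=value entries into {"a": {"b": value}}."""
    settings: Dict[str, Any] = {}
    for name, value in env.items():
        if not name.startswith(prefix):
            continue  # e.g. ASKUI_TOKEN has a single underscore and is skipped
        keys = name[len(prefix):].lower().split(delimiter)
        node = settings
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return settings


if __name__ == "__main__":
    example = {"ASKUI__MESSAGES__MODEL": "anthropic-claude-3-5-sonnet-20241022"}
    print(parse_nested_env(example))
```

The sketch does not handle conflicting entries, e.g. a variable that is both a leaf and a parent of another variable.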

📜 Documentation

The README.md has been significantly updated to reflect the new architecture, installation procedures, and features.
Updated installation instructions to use extras like pip install askui[chat].
Updated usage examples for running the chat API with python -m askui.chat.
Clarified that the Chat UI is now hosted on the [AskUI Hub](https://hub.askui.com/).

🔄 Dependencies

Added:
- playwright>=1.41.0 for web automation support.
- greenlet>=3.1.1 and pyee<14,>=13 as dependencies for playwright.
Restructured:
- Dependencies are now managed in optional groups (android, chat, mcp, pynput, web) to reduce the size of the default installation. You must now install the extras you need.
Removed from Core Dependencies:
- fastmcp, mcp, openapi-pydantic, python-multipart, and typer have been moved to the [mcp] optional dependency group.
- httpx-sse and sse-starlette were removed as they are no longer needed.
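
Since dependencies now live in optional groups, code that relies on one of them may want to fail early with a helpful message. A minimal sketch, assuming the `web` extra provides the `playwright` module as stated above; `is_module_available` and `require_extra` are hypothetical helpers, not part of the SDK.

```python
from importlib.util import find_spec


def is_module_available(module_name: str) -> bool:
    """Check for a top-level module without importing it."""
    return find_spec(module_name) is not None


def require_extra(module_name: str, extra: str) -> None:
    """Raise a descriptive error if the module behind an extra is missing."""
    if not is_module_available(module_name):
        raise ImportError(
            f"'{module_name}' is not installed; try: pip install askui[{extra}]"
        )


if __name__ == "__main__":
    # Raises ImportError unless playwright is installed.
    require_extra("playwright", "web")
```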

🧪 Experimental

AskUI Chat: The AskUI Chat feature remains in an experimental stage. We welcome your feedback as we continue to improve its functionality and user experience.

Full Changelog: v0.8.0...v0.9.0

Contributors

adi-wan-askui and onur-askui


26 Jun 13:56

adi-wan-askui

v0.8.0

978f472

v0.8.0

What's Changed

feat(agent): add support for locator in type() by @adi-wan-askui in #84
feat!: introduce Claude 4 Sonnet by @onur-askui in #83

🚀 Features

add support for clicking/focusing and clearing to VisionAgent.type() so that consumers don't have to use custom solutions
add support for Claude 4 + thinking (new computer tool only supported partially for now)

🐞 Bug Fixes

fix exceptions being hidden by a serialization exception from the telemetry module by fixing the serialization of (exception) classes

🚨 Breaking Changes

the default model used for VisionAgent.act() changed, which may make these calls behave differently from before

Full Changelog: v0.7.0...v0.8.0

Contributors

adi-wan-askui and onur-askui
