System Prompts

System prompts define how the Vision Agent behaves when executing tasks through the act() command. A well-structured system prompt provides the agent with the necessary context, capabilities, and constraints to successfully interact with your application's UI across different devices and platforms.

Overview

The Vision Agent uses system prompts to understand:

What actions it can perform
What device/platform it's operating on
Specific information about your UI
How to format test reports
Special rules or edge cases to handle

The default prompts work well for general use cases, but customizing them for your specific application can significantly improve reliability and performance.

Prompt Structure

System prompts for the AskUI ComputerAgent consist of six distinct parts:

Part	Default	Purpose
System Capabilities	Yes	Defines what the agent can do and how it should behave
Device Information	Yes	Provides platform-specific context (desktop, mobile, web)
UI Information	No (but strongly recommended!)	Custom information about your specific UI
Report Format	Yes	Specifies how to format execution results
Cache Use	Yes	Specifies when and how the agent should use cache files
Additional Rules	No	Special handling for edge cases or known issues
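
To make the part structure concrete, here is a minimal sketch of how six such parts could be joined into a single system prompt string. The `PromptParts` dataclass, its `render` method, and the section headings are hypothetical and only illustrate the idea; they are not AskUI's actual implementation, which may order or format the parts differently.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """Hypothetical container mirroring the six parts in the table above."""

    system_capabilities: str
    device_information: str
    report_format: str
    ui_information: str = ""    # no default content, but strongly recommended
    cache_use: str = ""         # assumed to be filled in from caching settings
    additional_rules: str = ""  # optional

    def render(self) -> str:
        """Join all non-empty parts under a heading, skipping empty ones."""
        sections = [
            ("System Capabilities", self.system_capabilities),
            ("Device Information", self.device_information),
            ("UI Information", self.ui_information),
            ("Report Format", self.report_format),
            ("Cache Use", self.cache_use),
            ("Additional Rules", self.additional_rules),
        ]
        return "\n\n".join(
            f"## {title}\n{body.strip()}" for title, body in sections if body.strip()
        )


parts = PromptParts(
    system_capabilities="You can click, type, and scroll.",
    device_information="Platform: desktop",
    report_format="Summarize the run in Markdown.",
    ui_information="The main menu is behind the hamburger icon.",
)
rendered = parts.render()
print(rendered)
```

In this sketch, parts left empty (here the cache and additional-rules parts) simply do not appear in the rendered text.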

1. System Capabilities

Defines the agent's core capabilities and operational guidelines.

Default prompts available:

COMPUTER_USE_CAPABILITIES - For desktop applications
WEB_BROWSER_CAPABILITIES - For web applications
ANDROID_CAPABILITIES - For Android devices

Important: We strongly recommend using the default AskUI capabilities unless you have specific requirements, as custom capabilities can lead to unexpected behavior.

2. Device Information

Provides platform-specific context to help the agent understand the environment.

Default options:

DESKTOP_DEVICE_INFORMATION - Platform, architecture, internet access
WEB_AGENT_DEVICE_INFORMATION - Browser environment details
ANDROID_DEVICE_INFORMATION - ADB connection, device type

3. UI Information

This is the most important part to customize for your application.

Provide specific details about your UI that the agent needs to know:

Location of key functions and features
Non-standard interaction patterns
Common navigation paths
Areas where users typically encounter issues
Actions that should NOT be performed
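
If UI knowledge is kept as structured data (for example in a config file), it can be turned into prompt text with a small helper. The sketch below is one possible approach; `build_ui_information` and the section names are hypothetical and not part of AskUI.

```python
def build_ui_information(sections: dict[str, list[str]]) -> str:
    """Render {heading: [facts]} as bold headings followed by bullet points."""
    blocks = []
    for heading, facts in sections.items():
        bullets = "\n".join(f"- {fact}" for fact in facts)
        blocks.append(f"**{heading}:**\n{bullets}")
    return "\n\n".join(blocks)


ui_information = build_ui_information(
    {
        "Navigation": ["Main menu is behind the hamburger icon in the top-left corner"],
        "Forbidden Actions": ['Do NOT tick the "Remember me" checkbox'],
    }
)
print(ui_information)
```

The result is plain text, so it can be passed wherever a `ui_information` value is expected.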

4. Report Format

Specifies how the agent should format its execution report.

Default options:

MD_REPORT_FORMAT - Markdown formatted summary with observations
NO_REPORT_FORMAT - No formal report required

5. Cache Use Prompt

This part is added automatically, depending on your caching settings.

6. Additional Rules

Optional rules for handling specific edge cases or known issues with your application.

Use cases:

Browser-specific workarounds (e.g., Firefox startup wizards)
Special handling for specific UI states
Recovery strategies for common failure scenarios

Creating Custom Prompts

Using Factory Functions (Recommended)

The simplest way to create custom prompts:

from askui.prompts.act_prompts import create_web_agent_prompt

# Create prompt with custom UI information
prompt = create_web_agent_prompt(
    ui_information="""
    **Navigation:**
    - Main menu is accessible via hamburger icon in top-left corner
    - Search functionality is in the header on all pages

    **Login Flow:**
    - Username field must be filled before password field becomes active
    - "Remember me" checkbox should NOT be used in automated tests

    **Common Issues:**
    - Loading spinner may appear for 2-3 seconds after clicking "Submit"
    - Error messages appear as toast notifications in top-right corner
    """,
    additional_rules="""
    - Always wait for the loading spinner to disappear before proceeding
    - Never click "Save and Exit" without explicit user confirmation
    """
)

# Use in agent
from askui import WebVisionAgent
from askui.models.shared.settings import ActSettings, MessageSettings

with WebVisionAgent() as agent:
    agent.act(
        "Log in with username 'testuser' and password 'testpass123'",
        # CAUTION: this also overrides any other MessageSettings
        # that may have been provided earlier!
        act_settings=ActSettings(messages=MessageSettings(system=prompt))
    )

Available factory functions:

create_computer_agent_prompt() - Desktop applications
create_web_agent_prompt() - Web applications
create_android_agent_prompt() - Android devices

Using ActSystemPrompt Directly

For full control over all prompt components:

from askui.models.shared.prompts import ActSystemPrompt
from askui.prompts.act_prompts import (
    WEB_BROWSER_CAPABILITIES,
    WEB_AGENT_DEVICE_INFORMATION,
    NO_REPORT_FORMAT,
)

prompt = ActSystemPrompt(
    system_capabilities=WEB_BROWSER_CAPABILITIES,
    device_information=WEB_AGENT_DEVICE_INFORMATION,
    ui_information="Your custom UI information here",
    report_format=NO_REPORT_FORMAT,
    additional_rules="Your additional rules here"
)

Power User Override (Not Recommended)

Warning: This feature is intended for power users only and can lead to unexpected behavior.

ActSystemPrompt includes a prompt field that completely overrides all structured prompt parts when set. This is useful only if you need full control over the exact prompt text:

from askui import WebVisionAgent
from askui.models.shared.prompts import ActSystemPrompt
from askui.models.shared.settings import ActSettings, MessageSettings

# Power user override - ignores all other prompt fields
prompt = ActSystemPrompt(
    prompt="Your completely custom system prompt here",
    # All other fields will be ignored when prompt is set:
    system_capabilities="Ignored",
    device_information="Ignored",
    # ... etc
)

with WebVisionAgent() as agent:
    agent.act(
        "Your task",
        act_settings=ActSettings(messages=MessageSettings(system=prompt))
    )

Important limitations:

⚠️ Using the prompt field will trigger a UserWarning on model creation
⚠️ All structured prompt parts (capabilities, device info, etc.) are completely ignored
✅ Other MessageSettings fields remain unchanged (thinking, max_tokens, temperature, tool_choice, provider_options)
✅ Only the system prompt text itself is affected - all other settings remain at their configured values
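
Because the override is reported as a standard Python `UserWarning`, it can be recorded or escalated with the built-in `warnings` module. The sketch below uses a hypothetical stand-in, `make_prompt`, in place of the real model creation, since the exact warning message and the point at which AskUI emits it are not specified here.

```python
import warnings
from typing import Optional


def make_prompt(prompt: Optional[str] = None) -> str:
    """Hypothetical stand-in that warns when a full override is supplied."""
    if prompt is not None:
        warnings.warn("prompt override in use", UserWarning, stacklevel=2)
        return prompt
    return "structured default prompt"


# Record warnings instead of printing them, e.g. to log them in CI.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    make_prompt("Your completely custom system prompt here")

override_warnings = [w for w in caught if issubclass(w.category, UserWarning)]
print(len(override_warnings))

# Alternatively, turn the warning into an error so that overrides cannot
# slip into production runs unnoticed.
with warnings.catch_warnings():
    warnings.simplefilter("error", UserWarning)
    try:
        make_prompt("Another override")
        escalated = False
    except UserWarning:
        escalated = True
print(escalated)
```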

When to use this:

You have extensive experience with prompt engineering
You need to experiment with completely different prompt structures
You're conducting research or debugging specific prompt behaviors

When NOT to use this:

For normal customization needs (use factory functions or structured fields instead)
When you want to maintain the tested structure of default prompts
In production environments where reliability is critical

Modifying Default Prompts

You can extend the default prompts with your own content:

from askui.prompts.act_prompts import (
    create_computer_agent_prompt,
    BROWSER_SPECIFIC_RULES,
)

# Add your own rules to the defaults
custom_rules = f"""
{BROWSER_SPECIFIC_RULES}

**Application-Specific Rules:**
- Always verify the page title before proceeding with actions
- Wait 1 second after navigation before taking screenshots
- Ignore popup notifications that appear during test execution
"""

prompt = create_computer_agent_prompt(
    ui_information="E-commerce checkout flow with 3-step process",
    additional_rules=custom_rules
)

Best Practices

Language and Clarity

Use consistent English: Stick to clear English throughout your prompt. Mixed languages or non-English prompts will degrade performance.
Be specific and detailed: Provide as much relevant detail as possible. Over-specification is better than under-specification.
Use structured format: Organize information with bullet points and clear sections.
Avoid contradictions: Ensure rules don't conflict with each other.
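
Contradictions are easy to introduce as rule lists grow. As a rough illustration, the sketch below flags rules where the same action appears after both "Always" and "Never". It is a crude heuristic based on text matching, not a feature of AskUI, and `find_conflicts` is a hypothetical helper; it will miss conflicts that are phrased differently.

```python
import re


def find_conflicts(rules: list[str]) -> list[tuple[str, str]]:
    """Return (always_rule, never_rule) pairs that target the same action."""
    always: dict[str, str] = {}
    never: dict[str, str] = {}
    for rule in rules:
        match = re.match(r"\s*-?\s*(always|never)\s+(.*)", rule, re.IGNORECASE)
        if not match:
            continue
        keyword, action = match.group(1).lower(), match.group(2)
        # Normalize so that casing and punctuation do not matter.
        key = re.sub(r"[^a-z0-9 ]", "", action.lower()).strip()
        (always if keyword == "always" else never)[key] = rule
    return [(always[k], never[k]) for k in always if k in never]


rules = [
    "- Always click 'Accept cookies' when the banner appears",
    "- Never click 'Accept cookies' when the banner appears.",
    "- Always wait for the loading spinner to disappear",
]
conflicts = find_conflicts(rules)
print(conflicts)
```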

UI Information

Document navigation patterns: Explain how users navigate through your application, as you would in user documentation.
Identify unique elements: Point out non-standard UI components or interactions.
List forbidden actions: Explicitly state what the agent should NOT do.

Additional Rules

Handle individual failures: Add specific instructions to overcome agent failures that you commonly encounter.
Target specific issues: Use this section to address known failure scenarios.
Include examples: Show concrete examples of the situation you're addressing.
Keep it current: Update rules as your application changes.

Testing and Iteration

Start with defaults: Use default prompts initially to establish a baseline.
Add UI information: Customize with your application-specific details.
Monitor failures: Track where the agent struggles or fails.
Refine rules: Add additional rules to handle discovered edge cases.
Test changes: Verify that prompt changes improve reliability.
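
The loop above can be backed by a simple measurement: run the same task list against each prompt variant and compare pass rates. The sketch below shows the idea with a hypothetical stand-in, `fake_run`, instead of a real agent run; `pass_rate` and the task names are likewise illustrative and should be replaced with your own test harness.

```python
from collections import Counter
from typing import Callable


def pass_rate(run_task: Callable[[str, str], bool], prompt: str, tasks: list[str]) -> float:
    """Run every task once with the given prompt and return the share that passed."""
    outcomes = Counter(run_task(prompt, task) for task in tasks)
    return outcomes[True] / len(tasks)


def fake_run(prompt: str, task: str) -> bool:
    """Hypothetical stand-in for a real agent run.

    It pretends that checkout tasks only pass once the prompt mentions the spinner.
    """
    return "checkout" not in task or "spinner" in prompt


tasks = ["log in", "search for a laptop", "complete checkout", "edit checkout address"]
baseline = pass_rate(fake_run, "default prompt", tasks)
refined = pass_rate(fake_run, "default prompt + wait for the spinner", tasks)
print(f"baseline={baseline:.2f} refined={refined:.2f}")  # baseline=0.50 refined=1.00
```

Real agent runs are not deterministic, so in practice each task would likely need to be repeated several times per variant before the comparison is meaningful.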

Available Constants

Import these constants from askui.prompts.act_prompts:

System Capabilities:

GENERAL_CAPABILITIES
COMPUTER_USE_CAPABILITIES
ANDROID_CAPABILITIES
WEB_BROWSER_CAPABILITIES

Device Information:

DESKTOP_DEVICE_INFORMATION
ANDROID_DEVICE_INFORMATION
WEB_AGENT_DEVICE_INFORMATION

Report Formats:

MD_REPORT_FORMAT
NO_REPORT_FORMAT

Additional Rules:

BROWSER_SPECIFIC_RULES
BROWSER_INSTALL_RULES
ANDROID_RECOVERY_RULES

Example: Complete Custom Prompt

from askui import WebVisionAgent
from askui.prompts.act_prompts import create_web_agent_prompt
from askui.models.shared.settings import ActSettings, MessageSettings

# Create comprehensive custom prompt
prompt = create_web_agent_prompt(
    ui_information="""
    **Application Overview:**
    - Multi-page e-commerce application with product catalog and checkout
    - Uses single-page navigation with URL updates

    **Key Features:**
    - Product search in header (always visible)
    - Shopping cart icon shows item count
    - Checkout is 3-step process: Cart → Shipping → Payment

    **Important Elements:**
    - "Add to Cart" buttons are blue with white text
    - Price displays always show currency symbol ($)
    - Out-of-stock items show "Notify Me" instead of "Add to Cart"

    **Navigation:**
    - Home: Click logo in top-left
    - Categories: Dropdown menu under "Shop" in header
    - Cart: Click cart icon in top-right
    - Account: Click user icon in top-right

    **Common Patterns:**
    - All forms require clicking "Next" or "Continue" to proceed
    - Error messages appear in red above form fields
    - Success messages appear as green banner at top of page

    **Timing Considerations:**
    - Product images may take 1-2 seconds to load
    - Cart updates trigger 500ms animation
    - Checkout validation shows spinner for 1-3 seconds
    """,
    additional_rules="""
    - Always verify cart contents before proceeding to checkout
    - Wait for page transitions to complete before taking next action
    - If "Out of Stock" message appears, report it and stop execution
    - Ignore promotional popups that may appear during browsing

    **DO NOT:**
    - Click "Complete Purchase" without explicit user confirmation
    - Submit payment information
    - Delete items from saved lists
    """
)

# Use the prompt
with WebVisionAgent() as agent:
    agent.act(
        "Find a laptop under $1000 and add it to cart",
        # CAUTION: this also overrides any other MessageSettings
        # that may have been provided earlier!
        act_settings=ActSettings(messages=MessageSettings(system=prompt))
    )
