askui · Lumpin-askui · Nov 18, 2025 · Nov 29, 2025 · Feb 8, 2026 · Feb 8, 2026
diff --git a/.env.template b/.env.template
diff --git a/.gitattributes b/.gitattributes
diff --git a/.gitignore b/.gitignore
@@ -0,0 +1,10 @@
+.DS_Store
+.pdm-python
+
+.venv
+.mypy_cache
+.ruff_cache
+__pycache__
+
+.askui_cache
+.env
diff --git a/README.md b/README.md
@@ -1,112 +1,288 @@
-## Description
+# AskUI Demo Project
 
-This is a proof of concept demonstrating the use of **AskUI** to automate desktop devices.
-As an example, we tested a grocery cashier game and sent a quote request to Bahn Business.
-The goal is to showcase AskUI's capabilities for desktop automation (Android and Web are supported too).
+A task-driven automation framework built on AskUI Agent that reads tasks from the `tasks/` folder, performs UI interactions, and generates per-task reports with screenshots in a timestamped workspace. Tasks are organized in a hierarchical folder structure with support for rules, setup, and teardown.
 
-## What is AskUI Agent?
+## Overview
 
-AskUI Agent acts as a "brain" that follows instructions and uses a provided set of tools to perform tasks.
+This project automates UI tasks defined in text-based files under the `tasks/` directory. The AskUI Agent:
 
-AskUI provides a default toolset for each agent type. For example:
+- Reads tasks from the **Task Folder** (`tasks/`) — supports `.txt`, `.md`, `.csv`, `.json`, and `.pdf`
+- Supports **hierarchical task organization** with rules, setup, and teardown per folder
+- Executes each task step-by-step via UI automation
+- Writes a summary report per task (what was done, result, issues, conclusion)
+- Saves screenshots of system interactions and includes them in reports
+- Writes all outputs into a timestamped **Agent Workspace** directory
+- Supports **custom tools** via the `helpers/` module
+- Supports **caching** for repeated task runs
 
-* **Default Computer Agent toolset**:
+## Project Structure
 
-  * Full control of mouse and keyboard
-  * Take and analyze screenshots
-  * Uses the default display but can list and switch displays
+```
+solution-delivery-template/
+├── main.py                        # Entry point - hierarchical folder runner
+├── system_prompt.py               # System prompt builder (reads from prompts/)
+├── requirements.txt               # Python dependencies
+├── ruff.toml                      # Linting/formatting configuration
+├── .vscode/settings.json          # Editor & AskUI Shell terminal profile
+├── helpers/                       # Custom tools and utilities
+│   ├── __init__.py
+│   ├── get_tools.py               # Tool factory function
+│   └── tools/
+│       ├── __init__.py
+│       └── greeting_tool.py       # Example custom tool
+├── prompts/                       # Prompt parts for the system prompt (MD files)
+│   ├── system_capabilities.md     # Agent capabilities description
+│   ├── device_information.md      # Desktop device context
+│   └── report_format.md           # Report formatting guidelines
+├── tasks/                         # Task definitions (hierarchical)
+│   └── demo/
+│       ├── rules.md               # Rules for this task group
+│       ├── calculator.csv         # CSV test case
+│       ├── clock_demo.txt         # Text task
+│       ├── notepad_hello.md       # Markdown task
+│       └── webbrowser.json        # JSON task
+├── agent_workspace/               # Generated per run (timestamped)
+├── .gitignore
+└── README.md                      # This file
+```
 
-The toolset can be expanded with custom tools.
-For this POC, we added custom tools in [tools.py](./helpers/tools.py).
+## Task Hierarchy
 
-* Example tools: **FileWriteTool** and **FileReadTool** allow the agent to read a CSV and write a `.md` report.
+Tasks are organized in folders under `tasks/`. Each folder can contain:
 
-**ℹ️ Remark:** Tools can be anything: sending emails, making API calls, interacting with devices, etc.
+| File | Purpose |
+|------|---------|
+| `rules.(md\|txt\|csv\|json\|pdf)` | Context/rules injected as system prompt for all tasks in folder |
+| `setup.(md\|txt\|csv\|json\|pdf)` | Executed before tasks in folder |
+| `teardown.(md\|txt\|csv\|json\|pdf)` | Executed after all tasks in folder |
+| `*.csv`, `*.md`, `*.txt`, `*.json`, `*.pdf` | Task files (executed in sorted order) |
+| Subdirectories | Subgroups that inherit parent rules |
 
-## Requirements
+Rules cascade from parent to child folders, so subgroups inherit their parent's context.
 
-* [AskUI Suite Installed](https://docs.askui.com/01-tutorials/00-installation)
-* Chrome browser
+### Example: setup.md
 
-## Example Agent Output
+A setup file runs before any tasks in the folder. Use it to prepare the environment:
 
-All outputs are saved under [reports](./reports).
+```markdown
+## Setup Steps
+
+1. Open the Settings application.
+2. Navigate to the "Network" section.
+3. Ensure WiFi is enabled before proceeding with tests.
+```
 
-### Cashier Game
+### Example: teardown.md
 
-Prompt used to play the game:
+A teardown file runs after all tasks in the folder complete. Use it to clean up:
 
-```python
-desktop_agent.act(
-    goal="""
-    Open "https://www.mortgagecalculator.org/money-games/grocery-cashier/"
-    Play the first level of the game.
-    Save a screenshot after each interaction.
-    Write a detailed report of the gameplay.
-    Use only the mouse for interaction.
-    Use tools in parallel to speed up gameplay.
-    """,
-    tools=custom_tools,
-)
+```markdown
+## Teardown Steps
+
+1. Close the Settings application.
+2. Return to the home screen.
+3. Clear any temporary files created during testing.
 ```
 
-After execution, the agent generated this
-[Execution Report.](./reports/20250923_grocery_cashier_game/grocery_cashier_game_test_report.md)
-
-### Automating GK Software Website to make a contact request.
-
-The test plan is defined in a [CSV file](./csv_files/demo.csv).
-
-Code snippet used:
-
-```python
-desktop_agent.act(
-    goal="""
-      You are an AI UI Automation Engineer created with AskUI Agent.  
-      You are running in a controlled test environment with full control of the computer, browser, and UI.  
-      You can analyze and solve image-based questions as part of test execution.
-      Analyze the icons and click on the correct one based on the question.
-      Execute the test case from the CSV file step by step.  
-      After each interaction, capture and save a screenshot.  
-      Write a detailed report of every interaction, including visual findings if images are involved.  
-      Use tools in parallel to optimize and speed up execution.
-      Do Not raise any Exception and do not ask the user for any input.
-    """,
-    tools=[
-        FileReadTool(absolute_csv_file_path),
-    ] + custom_tools,
-)
+## Prerequisites
+
+Before you begin, ensure you have:
+
+- **AskUI Shell** installed on your system
+- Python 3.12 or higher
+- Access to the AskUI platform with valid credentials
+
+### Installing AskUI Shell
+
+If you haven't already, install AskUI Shell following the [official installation guide](https://docs.askui.com/).
+
+## Installation
+
+### Step 1: Open AskUI Shell
+
+Launch the AskUI Shell environment:
+
+```bash
+askui-shell
 ```
 
-After execution, the agent generated this [Execution Report.](./reports/20250923_submit_contact_request_via_GKSoftware_website/TC001_Test_Report.md)
+### Step 2: Configure AskUI Credentials (First Time Only)
 
-## Setup Steps
+1. **Create an Access Token**
+   Follow the [Access Token Guide](https://docs.askui.com/02-how-to-guides/01-account-management/04-tokens#create-access-token).
+
+2. **Set Up Your Credentials**
+   Follow the [Credentials Setup Guide](https://docs.askui.com/04-reference/02-askui-suite/02-askui-suite/ADE/Public/AskUI-SetSettings#askui-setsettings).
+
+### Step 3: Set Up Python Environment
+
+Activate the virtual environment (run this each time you start a new terminal):
+
+```powershell
+AskUI-EnablePythonEnvironment -name 'AskUI-POC' -CreateIfNotExists
+```
+
+### Step 4: Install Dependencies
+
+Install required packages (only needed the first time or when `requirements.txt` is updated):
+
+```powershell
+pip install -r requirements.txt
+```
+
+### Step 5: Configure Environment Variables
+
+```bash
+cp .env.template .env
+# Edit .env file with your API keys
+```
+
+## Configuration
+
+Key paths are defined in `main.py`:
+
+- **`TASK_FOLDER`** (`tasks/`): Folder containing task files the agent reads and executes.
+- **`AGENT_WORKSPACE`** (`agent_workspace/YYYY-MM-DD_HH-MM-SS/`): Where the agent can write reports and screenshots (timestamped per run).
+
+You can customize the system prompt by editing the markdown files in `prompts/`:
+- `system_capabilities.md` — Agent capabilities and behavior rules
+- `device_information.md` — Information about the device being controlled
+- `report_format.md` — Report formatting guidelines
+
+## Usage
+
+### Running Tasks
+
+```bash
+# Run all tasks from the default tasks/ folder
+python main.py
+
+# Run tasks from a specific subfolder
+python main.py tasks/demo
+
+# Run a single task file (with setup/teardown from its folder hierarchy)
+python main.py tasks/demo/calculator.csv
 
-1. **Open AskUI Shell**
+# Custom caching options
+python main.py tasks/demo --cache-strategy auto --cache-dir .askui_cache
+```
+
+### Output Structure
+
+Each run creates a new workspace directory:
+
+```
+agent_workspace/YYYY-MM-DD_HH-MM-SS/
+├── <task_name>/
+│   ├── <task_name>_report.md
+│   └── <task_name>_screenshot.png
+└── ... (HTML report artifacts from SimpleHtmlReporter)
+```
+
+### Adding New Tasks
+
+1. Create a new folder under `tasks/` for your task group
+2. Add a `rules.md` with context and rules for the group
+3. Optionally add `setup` and `teardown` files
+4. Add task files (CSV, Markdown, etc.) — they execute in sorted order
+
+### Adding Custom Tools
+
+1. Create new tool classes in `helpers/tools/`
+2. Inherit from `askui.models.shared.tools.Tool`
+3. Register tools in `helpers/get_tools.py`
+
+See `helpers/tools/greeting_tool.py` for an example.
+
+## Task Formats
+
+Tasks can be provided in several formats. The agent reads files from `tasks/` and interprets them as tasks to execute.
+
+### Plain text (`.txt`)
+
+Short step-by-step instructions, e.g. open an app, read and report information, include a screenshot.
+
+### Markdown (`.md`)
+
+Structured task with objective, steps, and deliverables.
+
+### CSV
 
-   ```bash
-   askui-shell
-   ```
+Table format with test case ID, name, preconditions, step number, step description, and expected result — suitable for test-case style automation.
 
-2. **Configure AskUI Credentials** (first-time setup only)
+**Example columns:** `Test case ID`, `Test case name`, `Precondition`, `Step number`, `Step description`, `Expected result`
+
+### JSON
+
+Structured task with `id`, `name`, `description`, `precondition`, `steps` (array of `number`, `action`, `expectedResult`), and optional `deliverables`.
+
+### PDF
+
+PDF files are supported as task references. The agent will note the PDF path for processing.
+
+## Agent Tools
+
+The **AskUI Agent** comes with built-in computer tools for UI automation, including:
+
+- Mouse control (move, click, press, drag)
+- Keyboard input (typing, key presses)
+- Taking screenshots
+- Other desktop interaction capabilities
+
+**In addition**, this project adds the following tools:
+
+- **ReadFromFileTool** (base: Task Folder): Read task file contents
+- **ListFilesTool** (Task Folder & Agent Workspace): List files in those directories
+- **WriteToFileTool** (base: Agent Workspace): Write reports and other files
+- **ComputerSaveScreenshotTool** (base: Agent Workspace): Capture and save screenshots to disk
+- **PrintToConsoleTool**: Print messages to the console
+- **Window management tools**: Virtual display, process/window listing, focus control
+- **Custom tools**: Registered via `helpers/get_tools.py` (e.g., GreetingTool)
+
+Reporting is enhanced by **SimpleHtmlReporter**, which writes HTML reports into the agent workspace.
+
+## Available VLMs
+The project supports multiple Claude models:
+
+claude-opus-4-6
+claude-sonnet-4-6 (default)
+claude-haiku-4-5-20251001
+claude-sonnet-4-5-20250929
+
+## Web UI
+
+The project includes a web-based dashboard for managing tasks, viewing reports, editing prompts, and running the agent — all from the browser.
+
+### Running the Web UI
+
+```bash
+# Install dependencies (if not already)
+pip install -r requirements.txt
+
+# Start the server
+uvicorn web.server:app --port 8877 --reload
+```
 
-   1. Create an access token: [Access Token Guide](https://docs.askui.com/02-how-to-guides/01-account-management/04-tokens)
-   2. Set up credentials: [Credentials Setup Guide](https://docs.askui.com/04-reference/02-askui-suite/02-askui-suite/ADE/Public/AskUI-SetSettings)
+Then open **http://localhost:8877** in your browser.
 
-3. **Enable Python Environment**
+### Features
 
-   ```bash
-   AskUI-EnablePythonEnvironment -name 'AskUIDemo' -CreateIfNotExists
-   ```
+| Page | Description |
+|------|-------------|
+| **Dashboard** | Overview of tasks, runs, datasets, tools, and cache |
+| **Tasks** | Browse, create, edit, and delete task files in the hierarchy |
+| **Runner** | Execute tasks with live terminal output and stop support |
+| **Reports** | View past runs, HTML reports, markdown reports, and screenshots |
+| **Test Data** | Manage JSON/CSV test datasets with inline editing |
+| **Tools** | View available custom tools and their parameters |
+| **Prompts** | Edit system prompt components (capabilities, device info, report format) |
+| **Cache** | View and clear cached execution trajectories |
+| **Settings** | Configure AskUI credentials and model provider API keys (writes to `.env`) |
 
-4. **Install Dependencies** (re-run if `requirements.txt` changes)
+### Theme
 
-   ```bash
-   pip install -r requirements.txt
-   ```
+The UI supports both light and dark themes, toggled via the switch at the bottom of the sidebar. The theme preference is persisted in the browser.
 
-5. **Run the Agent**
+## License
 
-   ```bash
-   python ./act.py
-   ```
+This project is provided as an AskUI solution delivery template.