MOVA Ticket Agent

Contract-Driven Multimodal UI Support Agent

Gemini Live Agent Challenge · Category: UI Navigator

Live Demo → | Process Ticket | Morning Report | Contract Catalog

The Problem

Support teams handling UI-related tickets face two compounding issues:

Ambiguous screenshots — agents describe what they think they see, not what the UI actually shows. A "login doesn't work" ticket can mean 5 different root causes requiring 5 entirely different resolutions.
Inconsistent responses — without a constrained action space, agents improvise answers. The same issue gets different instructions depending on who handles it, leading to repeat contacts and escalations.

MOVA Ticket Agent solves both: Gemini's multimodal vision reads the screenshot directly, and a contract system constrains which resolution paths are even possible — making every response auditable and repeatable.

What It Does

Reads screenshots, not descriptions — Gemini 2.5 Flash extracts up to 14 structured UI signals (error codes, form state, step indicators, button labels) directly from the uploaded image, bypassing agent interpretation bias
Contracts constrain the answer space — each resolution path is defined in a JSON contract specifying allowed actions, required inputs, and permitted outputs; the model cannot hallucinate outside these bounds
Channel-aware resolution — chat tickets receive a single actionable next step; email tickets receive a ranked solution list — same signal set, different contract, different output shape
Full audit trail per episode — every ticket becomes an episode: raw signals → candidate contracts → business checks → resolution → saved to Firestore; reviewable anytime
Overnight morning report — aggregates all processed episodes into a structured daily digest with channel breakdown, contract usage stats, and escalation flags
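
As an illustration of the morning report described above, here is a minimal sketch of the aggregation step. The episode fields (`episode_id`, `channel`, `resolution_contract`) and the function name `build_morning_report` are assumptions made for this example; the episode schema actually stored in Firestore may differ.

```python
from collections import Counter

ESCALATION_CONTRACT = "escalate_unknown_case_v1"


def build_morning_report(episodes):
    """Aggregate processed episodes into a small daily digest.

    `episodes` is a list of dicts with hypothetical keys:
    episode_id, channel, resolution_contract.
    """
    by_channel = Counter(ep["channel"] for ep in episodes)
    contract_usage = Counter(ep["resolution_contract"] for ep in episodes)
    escalated = [
        ep["episode_id"]
        for ep in episodes
        if ep["resolution_contract"] == ESCALATION_CONTRACT
    ]
    return {
        "total_episodes": len(episodes),
        "by_channel": dict(by_channel),
        "contract_usage": dict(contract_usage),
        "escalated_episode_ids": escalated,
    }
```

Keeping the digest a pure function of the episode list means the same code can run against Firestore results or the in-memory fallback.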

Live Demo — Step-by-Step for Judges

The deployed app is fully functional with real Gemini API calls, Firestore storage, and Cloud Storage.

Quick path (2 minutes)

1. Open → https://contract-support-agent-799834288723.europe-west3.run.app
2. Submit a chat ticket (simulates a user who can't log in):
   - Description: I can't log into my account, it says my credentials are invalid
   - Channel: chat
   - Upload: demo_assets/screenshots/login_screen_error.png (red error banner: "Invalid credentials or account locked", 5/5 failed attempts)
   - Click Process Ticket
3. See the pipeline trace: signals extracted from screenshot → contracts ranked → business checks → single next step instruction for chat
4. Submit an email ticket (simulates a user stuck waiting for confirmation):
   - Description: I registered but never received the confirmation email
   - Channel: email
   - Upload: demo_assets/screenshots/registration_waiting_confirmation.png (confirmation pending screen, step 2/3)
   - Click Process Ticket
5. See the difference: same pipeline, different contract → ranked solution list instead of single step
6. Morning Report → see both episodes aggregated: channel breakdown, contract usage, escalation status
7. Contract Catalog → browse all 8 contracts; click any to see its full specification (allowed actions, required inputs, output shape)

Architecture

┌──────────────────────────────────────────────────────┐
│  User Input: ticket text + screenshot + channel       │
└──────────────────┬───────────────────────────────────┘
                   │
          ┌────────▼────────┐
          │  Signal Agent   │  ← Gemini 2.5 Flash (multimodal)
          │                 │    contracts: extract_support_signals_v1
          │  Extracts ≤14   │              identify_support_step_v1
          │  structured     │
          │  UI signals     │
          └────────┬────────┘
                   │ signal_set: {error_code, form_state, step, ...}
          ┌────────▼────────┐
          │ Contract Agent  │  ← Gemini (text)
          │                 │    contract: rank_candidate_resolution_contracts_v1
          │ Ranks which     │
          │ resolution      │
          │ contract fits   │
          └────────┬────────┘
                   │ selected_contract
          ┌────────▼──────────────┐
          │ Business Check Agent  │  ← Read-only Firestore queries
          │                       │    contracts: check_account_state_v1
          │ Validates account /   │              check_registration_state_v1
          │ registration state    │
          └────────┬──────────────┘
                   │ checks: {account_locked, email_sent, ...}
          ┌────────▼──────────────────────────────────┐
          │         Resolution Agent                   │  ← Gemini (text)
          │                                            │
          │  chat   → chat_guided_resolution_v1        │
          │           "single next step instruction"   │
          │                                            │
          │  email  → email_ranked_resolution_v1       │
          │           "ranked solution list (1–3)"     │
          │                                            │
          │  unknown→ escalate_unknown_case_v1         │
          └────────┬──────────────────────────────────┘
                   │
          ┌────────▼────────┐
          │    Episode      │  → Firestore (full audit trail)
          │    Storage      │  → Morning Report aggregation
          └─────────────────┘
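
The data flow in the diagram above can be sketched as a small orchestration function. This is a hedged outline, not the project's agent code: `Episode`, `run_pipeline`, and the four stage callables are hypothetical names, and the rule that an unknown channel or a missing contract selection falls through to escalation is an assumption drawn from the diagram.

```python
from dataclasses import dataclass, field

# Channel routing as drawn in the diagram above.
RESOLUTION_BY_CHANNEL = {
    "chat": "chat_guided_resolution_v1",
    "email": "email_ranked_resolution_v1",
}
ESCALATION_CONTRACT = "escalate_unknown_case_v1"


@dataclass
class Episode:
    """Hypothetical in-memory audit record: one trace entry per stage."""
    channel: str
    description: str
    trace: list = field(default_factory=list)

    def record(self, stage, payload):
        self.trace.append({"stage": stage, "payload": payload})
        return payload


def run_pipeline(channel, description, screenshot, *,
                 extract_signals, rank_contracts, run_checks, resolve):
    """Run the four stages in order, recording every intermediate result."""
    episode = Episode(channel, description)
    signals = episode.record("signals", extract_signals(screenshot, description))
    selected = episode.record("selected_contract", rank_contracts(signals))
    checks = episode.record("checks", run_checks(selected, signals))
    # Assumption: escalate when nothing was selected or the channel is unknown.
    if selected is None:
        resolution_contract = ESCALATION_CONTRACT
    else:
        resolution_contract = RESOLUTION_BY_CHANNEL.get(channel, ESCALATION_CONTRACT)
    output = resolve(resolution_contract, signals, checks)
    episode.record("resolution", {"contract": resolution_contract, "output": output})
    return episode
```

Passing the stages in as callables keeps the sketch runnable without Gemini or Firestore; in a test they can be replaced with plain lambdas.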

Key design principle: The contract constrains the allowed action space. Gemini extracts signals and selects contracts — but cannot produce outputs outside the contract's allowed_actions and success_outputs. This makes every resolution auditable and reproducible.
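
One way to back this principle with a check in code, in addition to the prompt, is to validate each model response against the contract before it is stored or shown. The sketch below assumes a hypothetical response shape (`action` plus an `outputs` mapping) and a hypothetical helper name `contract_violations`; only the contract keys come from the contract example in this README.

```python
CHAT_CONTRACT = {
    "contract_id": "chat_guided_resolution_v1",
    "allowed_actions": ["generate_next_step_instruction"],
    "success_outputs": ["chat_next_step_instruction"],
}


def contract_violations(contract, proposed):
    """Return the reasons `proposed` breaks `contract` (empty list = compliant).

    `proposed` is an assumed shape: {"action": str, "outputs": {name: value}}.
    """
    problems = []
    action = proposed.get("action")
    if action not in contract["allowed_actions"]:
        problems.append(f"action {action!r} is not in allowed_actions")
    unexpected = set(proposed.get("outputs", {})) - set(contract["success_outputs"])
    if unexpected:
        problems.append(f"outputs not in success_outputs: {sorted(unexpected)}")
    return problems
```

A response with a non-empty violation list could be rejected or routed to `escalate_unknown_case_v1`; which of the two the project does is not stated here.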

Tech Stack

Layer	Technology	Why
LLM / Vision	Gemini 2.5 Flash	Multimodal: reads screenshot pixels directly; fast and cost-efficient for batch processing
Agent Orchestration	Google ADK (custom)	ADK-style agent classes with explicit contract references; each agent has a single responsibility
Gen AI SDK	`google-genai`	Unified client for both Vertex AI and API key auth; structured JSON output mode
Backend	Python 3.12 + FastAPI	Async-first, clean route separation between HTML views and JSON API
Frontend	Jinja2 + Vanilla CSS	Zero JS dependencies; pipeline trace renders server-side for reliability
Persistence	Firestore	Schemaless episode storage; in-memory fallback for local dev without credentials
File Storage	Cloud Storage	Screenshot upload with signed URL pattern; MIME auto-detection
Deployment	Cloud Run	Serverless, scales to zero; deployed via `gcloud run deploy --source .` in one command
Contracts	JSON (8 files)	Declarative, version-controlled, diff-able; no code changes needed to constrain agent behavior

Contract System

All 8 contracts live in contracts/. Each contract is a JSON file of the following shape (shown here for chat_guided_resolution_v1):

{
  "contract_id": "chat_guided_resolution_v1",
  "kind": "resolution",
  "purpose": "Provide the next single troubleshooting step for chat interaction",
  "applicable_channels": ["chat"],
  "required_inputs": ["selected_contract", "signal_set"],
  "allowed_actions": ["generate_next_step_instruction"],
  "success_outputs": ["chat_next_step_instruction"]
}

Contract	Kind	Role
`extract_support_signals_v1`	signal_extraction	Defines the 14 signals Gemini must extract from a screenshot
`identify_support_step_v1`	step_identification	Maps visual cues to a support step label
`rank_candidate_resolution_contracts_v1`	diagnostic	Ranks which resolution contract best fits the signals
`check_account_state_v1`	business_check	Read-only account lock / suspension check
`check_registration_state_v1`	business_check	Read-only registration / email confirmation check
`chat_guided_resolution_v1`	resolution	Chat channel: produce one next step
`email_ranked_resolution_v1`	resolution	Email channel: produce ranked solution list
`escalate_unknown_case_v1`	escalation	Fallback when no contract matches with sufficient confidence
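
Because the contracts are plain JSON files, the catalog can be loaded and sanity-checked at startup. The following is a minimal sketch under assumptions: `load_contracts` and `contracts_of_kind` are hypothetical helpers, and the set of required keys is inferred from the single example contract above, so other contracts may legitimately carry a different set.

```python
import json
from pathlib import Path

# Assumption: every contract carries at least these keys (taken from the example).
REQUIRED_KEYS = {
    "contract_id", "kind", "purpose",
    "required_inputs", "allowed_actions", "success_outputs",
}


def load_contracts(directory):
    """Load every *.json file in `directory` into a dict keyed by contract_id."""
    catalog = {}
    for path in sorted(Path(directory).glob("*.json")):
        contract = json.loads(path.read_text(encoding="utf-8"))
        missing = REQUIRED_KEYS - contract.keys()
        if missing:
            raise ValueError(f"{path.name}: missing keys {sorted(missing)}")
        contract_id = contract["contract_id"]
        if contract_id in catalog:
            raise ValueError(f"{path.name}: duplicate contract_id {contract_id!r}")
        catalog[contract_id] = contract
    return catalog


def contracts_of_kind(catalog, kind):
    """Return the sorted contract ids whose `kind` matches."""
    return sorted(cid for cid, c in catalog.items() if c["kind"] == kind)
```

Failing fast on a malformed file keeps a broken contract from silently widening or narrowing what an agent is allowed to do.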

Getting Started

Prerequisites

Python 3.12+
pip
A Google AI Studio API key → aistudio.google.com (free tier works)
(Optional) Google Cloud project with Firestore + Cloud Storage for full persistence

Run locally in 4 steps

# 1. Clone and enter
git clone https://github.com/your-org/mova-ticket-agent
cd mova-ticket-agent

# 2. Create virtual environment and install
python -m venv .venv && source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -r requirements.txt

# 3. Configure environment
cp .env.example .env
# Open .env and set GEMINI_API_KEY=your_key_here
# Everything else has sensible defaults (in-memory Firestore fallback)

# 4. Start
uvicorn app.main:app --reload --host 0.0.0.0 --port 8080

Open http://localhost:8080

Environment variables

Variable	Required	Description
`GEMINI_API_KEY`	Yes	Google AI Studio API key
`GOOGLE_GENAI_USE_VERTEXAI`	No	Set `true` to use Vertex AI instead of API key
`GOOGLE_CLOUD_PROJECT`	No	GCP project ID (for Firestore + Storage)
`STORAGE_BUCKET`	No	Cloud Storage bucket name for screenshots
`MODEL_NAME`	No	Gemini model (default: `gemini-2.5-flash`)

Without GOOGLE_CLOUD_PROJECT, the app runs with an in-memory fallback in place of Firestore. This is fully functional for demo purposes, but episodes held in memory are lost when the process restarts.
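
The variables in the table can be read into a typed settings object along these lines. This is a sketch, not the project's `config.py` (which uses Pydantic): it relies on a standard-library dataclass to stay self-contained, `Settings` and `load_settings` are hypothetical names, and treating `GEMINI_API_KEY` as optional when Vertex AI is enabled is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Settings:
    gemini_api_key: Optional[str]
    use_vertex_ai: bool
    project: Optional[str]
    storage_bucket: Optional[str]
    model_name: str

    @property
    def uses_in_memory_store(self):
        # No GCP project configured means the in-memory fallback is used.
        return self.project is None


def load_settings(env=None):
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    use_vertex = env.get("GOOGLE_GENAI_USE_VERTEXAI", "false").strip().lower() == "true"
    api_key = env.get("GEMINI_API_KEY") or None
    # Assumption: the API key is only mandatory when Vertex AI is not used.
    if not use_vertex and api_key is None:
        raise RuntimeError("GEMINI_API_KEY must be set unless GOOGLE_GENAI_USE_VERTEXAI=true")
    return Settings(
        gemini_api_key=api_key,
        use_vertex_ai=use_vertex,
        project=env.get("GOOGLE_CLOUD_PROJECT") or None,
        storage_bucket=env.get("STORAGE_BUCKET") or None,
        model_name=env.get("MODEL_NAME") or "gemini-2.5-flash",
    )
```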

UI Pages Reference

Page	URL	What judges see
Process Ticket	`/`	Ticket submission form + live activity stats
Morning Report	`/demo/report`	Overnight digest: channel breakdown, contract usage, escalations
Ticket Detail	`/demo/ticket/{id}`	Full pipeline trace: screenshot → signals → contracts → resolution
Contract Catalog	`/demo/contracts`	All 8 contracts with kind badges and purpose descriptions
Contract Viewer	`/demo/contracts/{id}`	Single contract: full JSON spec + allowed actions
Episode Viewer	`/demo/episode/{id}`	Raw episode audit trail stored in Firestore

API Reference

POST /api/tickets/process       Process ticket (multipart/form-data: description, channel, screenshot)
GET  /api/contracts             Contract catalog (JSON)
GET  /api/contracts/{id}        Single contract (JSON)
GET  /api/reports/morning       Morning report (JSON)
GET  /api/episodes              Episodes list (JSON)
GET  /api/episodes/{id}         Single episode (JSON)
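
For scripted testing, the ticket endpoint can be called with the standard library alone. The sketch below builds the multipart/form-data body by hand using the three field names listed above; the helper names `build_ticket_request` and `process_ticket` are hypothetical, and the assumption that the endpoint answers with a JSON body is not confirmed by this README.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path


def build_ticket_request(base_url, description, channel, screenshot_path):
    """Build (but do not send) a POST request for /api/tickets/process."""
    boundary = uuid.uuid4().hex
    path = Path(screenshot_path)
    mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    parts = []
    # Plain text fields.
    for name, value in (("description", description), ("channel", channel)):
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode("utf-8")
        )
    # File field carrying the screenshot bytes.
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="screenshot"; '
        f'filename="{path.name}"\r\nContent-Type: {mime}\r\n\r\n'.encode("utf-8")
        + path.read_bytes()
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/tickets/process",
        data=b"".join(parts),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def process_ticket(base_url, description, channel, screenshot_path):
    """Send the ticket and decode the response, assumed here to be JSON."""
    request = build_ticket_request(base_url, description, channel, screenshot_path)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

Against a local instance, a call would look like `process_ticket("http://localhost:8080", "I can't log in", "chat", "demo_assets/screenshots/login_screen_error.png")`.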

Deploy to Cloud Run

gcloud run deploy contract-support-agent \
  --source . \
  --project YOUR_PROJECT_ID \
  --region europe-west3 \
  --allow-unauthenticated \
  --set-env-vars="GOOGLE_GENAI_USE_VERTEXAI=false,GEMINI_API_KEY=YOUR_KEY,MODEL_NAME=gemini-2.5-flash"

Responsible AI

Risk	How we addressed it
Hallucination	Resolution agents cannot produce outputs outside the contract's `allowed_actions`; the contract is passed verbatim into the prompt as a hard constraint
Scope creep	Business check agents are read-only by design; the contract explicitly lists `prohibited_actions: ["write", "update", "delete"]`
Auditability	Every ticket produces a Firestore episode with full intermediate state: raw signals, contract ranking scores, check results, and final resolution — reviewable at any time
Bias in signal extraction	Signal extraction contract defines an explicit closed vocabulary of 14 fields; the model fills in values or returns `null`, it cannot introduce new signal categories
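
The closed-vocabulary idea in the last row can also be enforced after the model responds. In this sketch, `SIGNAL_FIELDS` is a hypothetical four-field subset: `error_code`, `form_state`, and `step` appear in the architecture diagram, `button_labels` is an assumed name, and the real list of 14 fields lives in `extract_support_signals_v1`. The function name `normalize_signals` is likewise an assumption.

```python
# Hypothetical subset of the 14 fields defined by extract_support_signals_v1.
SIGNAL_FIELDS = ("error_code", "form_state", "step", "button_labels")


def normalize_signals(raw):
    """Project a raw model response onto the closed signal vocabulary.

    Returns (signals, dropped): every known field is present (None when the
    model gave no value), and keys outside the vocabulary are listed in
    `dropped` rather than passed downstream.
    """
    signals = {name: raw.get(name) for name in SIGNAL_FIELDS}
    dropped = sorted(set(raw) - set(SIGNAL_FIELDS))
    return signals, dropped
```

Recording the dropped keys in the episode, instead of discarding them silently, would keep the audit trail complete.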

What's Next

Streaming resolution — surface Gemini's reasoning token-by-token via SSE for real-time chat UX
Contract versioning UI — allow support managers to edit contracts in the browser and A/B test resolution quality
Feedback loop — agents rate resolution quality post-ticket; low-rated outcomes trigger contract refinement suggestions via Gemini
Multi-language signal extraction — extend prompt contracts to handle non-English UI screenshots (currently English only)
Vertex AI RAG grounding — ground resolution contracts against a knowledge base of historical resolved tickets

Project Structure

mova-ticket-agent/
├── app/
│   ├── agents/          # ADK-style agents (signal, contract, business_check, resolution, root)
│   ├── services/        # Gemini, Firestore, Cloud Storage, business checks
│   ├── routers/         # FastAPI route handlers (html_pages, api)
│   ├── config.py        # Settings (Pydantic BaseModel)
│   └── main.py          # App factory
├── contracts/           # 8 JSON contracts (declarative action constraints)
├── templates/           # Jinja2 HTML templates
├── static/              # CSS + logo
├── demo_assets/
│   └── screenshots/     # Two pre-built demo PNG screenshots
├── Dockerfile
└── requirements.txt
