
joy7758/verifiable-agent-demo


Verifiable Agent Demo

A minimal end-to-end demonstration of the Digital Biosphere Architecture stack.

This repository connects persona, interaction semantics, governance context, execution traceability, and audit evidence into one walkthrough. It is a demo and reference path rather than a general-purpose framework.

Shared doctrine:

Sandbox controls execution; portable evidence verifies execution.

  1. Governance decides what should be allowed.
  2. Execution integrity proves what actually happened.
  3. Audit evidence exports artifacts for independent review.
```mermaid
flowchart LR
    Persona["Persona (POP)"] --> Intent["Intent Object (AIP)"]
    Intent --> Governance["Governance Check"]
    Governance --> Trace["Execution Trace"]
    Trace --> Audit["Audit Evidence (ARO)"]
```
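The five-layer path above can be sketched in a few lines of Python. This is an illustrative sketch only: the function and field names here (`run_pipeline`, `permitted_actions`, the receipt scheme) are assumptions for exposition, not the demo's actual API.

```python
# Illustrative sketch of the persona -> intent -> governance -> trace -> audit path.
# All names and structures here are assumptions, not this repository's real API.
import hashlib
import json


def run_pipeline(persona: dict, requested_action: str) -> dict:
    # Persona (POP): portable persona context projected into the run
    intent = {"persona_id": persona["id"], "action": requested_action}  # Intent (AIP)

    # Governance check: decide whether the intent should be allowed
    if requested_action not in persona.get("permitted_actions", []):
        return {"verdict": "denied", "intent": intent}

    # Execution trace: record each step as inspectable evidence
    trace = [{"step": 1, "event": "executed", "action": requested_action}]

    # Audit evidence (ARO): export a bounded artifact committed to by a hash
    payload = json.dumps({"intent": intent, "trace": trace}, sort_keys=True)
    receipt = hashlib.sha256(payload.encode()).hexdigest()
    return {"verdict": "allowed", "intent": intent, "trace": trace, "receipt": receipt}
```

The point of the sketch is the ordering: the intent object exists before execution, governance gates execution, and the audit receipt is derived from what actually ran.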

What this demo proves

  • a portable persona-oriented entry point can be projected into runtime
  • explicit intent and action objects can be emitted before execution
  • result objects can be emitted after execution
  • execution steps can be recorded as inspectable evidence
  • audit-facing artifacts can be exported as bounded outputs

Architecture Path in this Demo

  • Persona Layer -> POP-aligned persona context carried into the run
  • Interaction Layer -> intent, action, and result objects emitted under interaction/
  • Governance Layer -> referenced as the control checkpoint for runtime policy and budget constraints
  • Execution Integrity Layer -> runtime execution trace and verifiable execution context
  • Audit Evidence Layer -> ARO-style exported evidence artifacts

This repository does not claim a full Token Governor integration. It demonstrates a minimal aligned path across the broader stack, with explicit governance checkpoint references in the emitted interaction and result objects.

It now also includes one fixed enterprise sandbox artifact chain for the scenario of organizing customer visit records → generating a weekly report → initiating approval, while still not claiming a general full-stack Token Governor integration.

How to read this demo

This demo is a guided path across layers. It is not the normative specification for each layer, and it points outward to the canonical repositories for those layers: digital-biosphere-architecture, persona-object-protocol, agent-intent-protocol, token-governor, and aro-audit.

Execution Evidence Demo Note

See docs/execution-evidence-demo-note.md.

Expected Artifacts

Repo-tracked sample bundle:

  • interaction/intent.json
  • interaction/action.json
  • interaction/result.json
  • evidence/example_audit.json
  • evidence/result.json
  • evidence/sample-manifest.json
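For orientation, an intent payload along the lines of `interaction/intent.json` might look like the following. The field names here are hypothetical, chosen to mirror the layers described above; consult the tracked sample for the real schema.

```python
# Hypothetical shape of an interaction/intent.json payload.
# Field names are illustrative assumptions, not the tracked sample's schema.
import json

intent = {
    "object_type": "intent",
    "persona_ref": "pop:demo-persona",          # POP-aligned persona context
    "goal": "summarize customer visit records",
    "governance_checkpoint": "runtime-policy",  # referenced control checkpoint
    "budget": {"max_steps": 10},                # bounded execution budget
}

print(json.dumps(intent, indent=2))
```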

Additional tracked example:

  • evidence/crew_demo_audit.json

Current concrete examples in this repository include:

  • docs/quick-walkthrough.md
  • docs/interaction-flow.md
  • docs/shortest-validation-loop.md

Run the Demo

Fastest local path

```shell
python3 -m demo.agent
```

Scripted wrapper

```shell
bash scripts/run_demo.sh
```

This local wrapper writes fresh output under artifacts/demo_output/.

Enterprise sandbox artifact chain

```shell
python3 examples/enterprise_sandbox_demo/run.py
```

This writes a reviewer-facing directory under artifacts/enterprise_sandbox_demo/ containing:

  • intent.json
  • policy.json
  • trace.jsonl
  • sep.bundle.json
  • replay_verdict.json
  • audit_receipt.json
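A file like `trace.jsonl` lends itself to chained verification, where each record commits to its predecessor so tampering is detectable on replay. The chaining scheme below is an assumption for illustration; the demo's actual `replay_verdict.json` logic may differ.

```python
# Sketch of hash-chained trace records and their replay verification.
# The chaining scheme is an assumed example, not this demo's actual format.
import hashlib
import json


def chain_records(events: list) -> list:
    """Build trace records where each record commits to the previous hash."""
    records, prev = [], "0" * 64  # genesis sentinel for the first record
    for event in events:
        body = json.dumps({"event": event, "prev": prev}, sort_keys=True)
        digest = hashlib.sha256(body.encode()).hexdigest()
        records.append({"event": event, "prev": prev, "hash": digest})
        prev = digest
    return records


def verify_chain(records: list) -> bool:
    """Recompute every hash and check it links to its predecessor."""
    prev = "0" * 64
    for rec in records:
        body = json.dumps({"event": rec["event"], "prev": prev}, sort_keys=True)
        if rec["prev"] != prev or rec["hash"] != hashlib.sha256(body.encode()).hexdigest():
            return False
        prev = rec["hash"]
    return True
```

Under this scheme, editing any single record (or reordering records) invalidates every subsequent hash, which is what makes the exported trace reviewer-verifiable without trusting the runtime.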

Existing CrewAI demo path

```shell
bash scripts/setup_framework_venv.sh
.venv/bin/python crew/crew_demo.py
```

Environment notes:

  • Python 3 is sufficient for the minimal local path.
  • Refresh the tracked deterministic sample bundle with python3 scripts/refresh_demo_samples.py.
  • The optional CrewAI and LangChain paths should run from a git-ignored local .venv/ created by scripts/setup_framework_venv.sh.
  • The pinned framework helper environment currently uses crewai 1.10.1, langchain 1.2.12, and langchain-core 1.2.18.
  • CrewAI currently requires Python <3.14.
  • Both demo paths use deterministic local mock data and do not require external API calls.

Repository Automation

  • The Mermaid render workflow opens PRs to main only through a dedicated GitHub App.
  • Configure repository variable PROTOCOL_BOT_APP_ID and repository secret PROTOCOL_BOT_PRIVATE_KEY under Settings -> Secrets and variables -> Actions.
  • The default repository GITHUB_TOKEN remains read-only and is not used for auto-PR promotion.

Paper Evaluation Harness

This repository now includes a paper-ready evaluation harness for Execution Evidence Architecture for Agentic Software Systems: From Intent Objects to Verifiable Audit Receipts.

Primary entry points:

  • make eval-baseline
  • make eval-evidence
  • make eval-external-baseline
  • make eval-framework-pair
  • make eval-langchain-pair
  • make eval-ablation
  • make falsification-checks
  • make human-review-kit
  • make review-sample
  • make compare
  • make paper-eval
  • make top-journal-pack

Supporting material:

Generated outputs:

  • artifacts/runs/<task_id>/<mode>/
  • docs/paper_support/comparison-summary.md
  • docs/paper_support/comparison-summary.csv
  • artifacts/metrics/comparison-summary.json
  • docs/paper_support/external-baseline-summary.md
  • docs/paper_support/framework-pair-summary.md
  • docs/paper_support/langchain-pair-summary.md
  • docs/paper_support/ablation-summary.md
  • docs/paper_support/falsification-summary.md
  • artifacts/human_review/synthetic-review-summary.json

English LaTeX Manuscript Draft

The repository also includes a manuscript draft grounded in the current implemented harness and checked-in metrics:

Related Repositories

Minimal Reference Surface

  • interaction/ for explicit interaction objects
  • evidence/ for audit and result artifacts
  • demo/ and crew/ for runnable entry points
  • integration/ for persona and intent adapters
  • docs/spec/ for schema notes and example payloads

Further Reading
