Agentic CLI is a command-line toolkit for researching topics, planning workflows, simulating shell commands (with optional execution), and retrieving documentation from MCP-compatible servers. The current milestone delivers the project scaffold, a lightweight runner loop, artifact helpers, a fully functional web research workflow with explainable reports, deterministic planning with Mermaid diagram generation, and a simulation-first runner that produces auditable run logs.
```bash
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
cp .env.example .env  # adjust values as needed
agent --help
```

Run a research session to validate the workflow and generated artifacts:

```bash
agent research "hello world in rust"
```

The command writes a timestamped directory under `artifacts/` containing `report.md` and
`sources.json` with numbered references. Add tokens like `openai/openai-cookbook:README.md` to the
topic to pull GitHub files into the report alongside the web sources.
The Typer-based CLI exposes four top-level commands:
| Command | Description |
| --- | --- |
| `agent research "topic"` | Perform web research, summarise findings, and emit markdown + JSON artifacts. |
| `agent plan "goal"` | Generate a structured plan, Markdown brief, and Mermaid diagram artifacts. |
| `agent run "goal"` | Generate a plan, propose shell commands, and simulate or execute them with a run log. |
| `agent docs "library@version"` | Retrieve documentation snapshots via Context7 MCP (requires configuration). |
Run `agent --help` or `agent <command> --help` for the latest options. For a complete, real
example of the planning flow see `examples/nextauth_plan.md`.
Prefer calling the workflows directly? Import the high-level SDK dataclasses/functions from
`agentic_cli.sdk` to bypass Typer option parsing and work with typed results:

```python
from agentic_cli.sdk import (
    DocsRequest,
    PlanRequest,
    ResearchRequest,
    RunRequest,
    create_plan,
    execute_plan,
    fetch_docs,
    perform_research,
)

artifacts = perform_research(ResearchRequest(topic="vector databases"))
if artifacts:
    print(artifacts.report_path)

plan = create_plan(PlanRequest(goal="set up Next.js + NextAuth"))
print(plan.plan_markdown, len(plan.steps))

run = execute_plan(RunRequest(goal="ship release", allow_run=False))
print(run.run_log, len(run.records))

docs = fetch_docs(DocsRequest(package="next-auth@5.0.0", limit=5))
print(docs.docs_markdown)
```

Every SDK call returns dataclasses containing resolved artifact paths plus structured metadata (e.g. plan steps, execution records, Context7 responses), keeping programmatic integrations simple.
The acceptance test scenario, `agent plan "set up Next.js + NextAuth"`, produces a timestamped
artifact directory containing Markdown, JSON, and Mermaid files. A captured output is available in
`examples/nextauth_plan.md` alongside the Mermaid definition so you can
render the SVG locally once `@mermaid-js/mermaid-cli` is installed.
- Search the web via a lightweight DuckDuckGo HTML query (`tools/web_search.py`).
- Fetch each result, extract readable text with Readability + BeautifulSoup, and chunk the content.
- Summarise the chunks deterministically and register each source with the citation registry.
- Detect `owner/repo:path` tokens in the topic, fetch the referenced files via the GitHub reader, summarise them, and add their links as additional citations.
- Emit `report.md` containing a “Findings” section with inline references plus a numbered “Sources” list, and save the structured source metadata to `sources.json`.
- Store all artifacts in `artifacts/<timestamp>_research/` so runs are easy to audit.
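The chunking step can be sketched as below. `chunk_text` is an illustrative helper, not necessarily the exact signature in `tools/web_search.py`; it splits on paragraph boundaries so the deterministic summariser sees stable, readable input:

```python
def chunk_text(text: str, max_chars: int = 1200) -> list[str]:
    """Split extracted article text into roughly equal-sized chunks."""
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        # Flush the current chunk when adding this paragraph would exceed the budget.
        if size + len(paragraph) > max_chars and current:
            chunks.append("\n\n".join(current))
            current, size = [], 0
        current.append(paragraph)
        size += len(paragraph)
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```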
The implementation is intentionally mockable: helper functions accept injectable HTTP clients, the summariser is deterministic, and the citation registry deduplicates URLs to keep references tidy.
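The deduplication behaviour can be illustrated with a minimal registry sketch (the names below are illustrative, not the exact `tools/citations.py` API):

```python
from dataclasses import dataclass, field


@dataclass
class CitationRegistry:
    """Assigns stable 1-based reference numbers and deduplicates URLs."""

    _numbers: dict[str, int] = field(default_factory=dict)
    _titles: dict[str, str] = field(default_factory=dict)

    def register(self, url: str, title: str) -> int:
        # Re-registering a known URL returns its existing number,
        # so repeated sources never inflate the reference list.
        if url not in self._numbers:
            self._numbers[url] = len(self._numbers) + 1
            self._titles[url] = title
        return self._numbers[url]

    def sources(self) -> list[tuple[int, str, str]]:
        """Return (number, title, url) triples in citation order."""
        return [(n, self._titles[u], u) for u, n in self._numbers.items()]
```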
This repository uses Black, Ruff, MyPy, and pytest. Convenience commands are provided in the
`Makefile`:

```bash
make install  # install project and dev dependencies
make lint     # run Ruff and Black (check mode)
make type     # run mypy static type checks
make test     # run pytest suite
make format   # format code with Ruff (import order) and Black
```

To run the CLI manually without installing the package globally:

```bash
python -m agentic_cli.cli.app --help
```

Build custom tools and deterministic workflows with the extension SDK and scaffolding command. See `docs/extending.md` for a guided tour of the registry, code generation, and testing story.
```text
src/agentic_cli/
    __init__.py          # package metadata
    config.py            # pydantic-powered settings helper
    artifacts/
        manager.py       # artifact directory helpers
    cli/
        app.py           # Typer application wiring subcommands
        research_cmd.py  # CLI wrapper delegating to the SDK research workflow
        plan_cmd.py      # CLI wrapper delegating to the SDK planner
        run_cmd.py       # CLI wrapper delegating to the SDK executor
        docs_cmd.py      # CLI wrapper delegating to the SDK documentation fetcher
    sdk/
        __init__.py      # consolidated SDK exports
        research.py      # research workflow logic + dataclasses
        plan.py          # planning workflow logic + dataclasses
        run.py           # execution workflow logic + dataclasses
        docs.py          # documentation workflow logic + dataclasses
    runner/
        agent.py         # agent + tool dataclasses
        policy.py        # heuristic policy helper
        runner.py        # orchestration loop with simulation + denylist
    tools/
        __init__.py
        citations.py     # citation registry helpers
        github_reader.py # GitHub file fetch + summarisation helpers
        mermaid_gen.py   # Mermaid diagram helpers
        shell.py         # simulation-first shell execution helpers
        web_search.py    # search, fetch, chunk, summarise utilities
```
Tests live under tests/ and cover the CLI scaffold, configuration parsing, runner behaviour, the
research tooling (chunking, citation formatting, summarisation limits, GitHub ingestion), the
planning workflow, Context7 integration, and the shell simulator.
- Parse the goal to identify primary focus items (technologies, packages, or deliverables).
- Assemble a deterministic sequence of steps covering discovery, preparation, implementation, and validation, enriching the plan with detailed guidance.
- Render a Markdown briefing and JSON payload under a timestamped directory in `artifacts/`.
- Generate a Mermaid flowchart (`plan.mmd`) and render an SVG diagram when `mmdc` from `@mermaid-js/mermaid-cli` is installed.
This deterministic approach keeps plans auditable and makes it easy to version the diagrams alongside the textual steps.
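Turning the ordered steps into a flowchart definition can be as simple as the sketch below (an illustrative shape, not necessarily the real `tools/mermaid_gen.py` API):

```python
def steps_to_mermaid(steps: list[str]) -> str:
    """Render plan steps as a top-down Mermaid flowchart definition."""
    lines = ["flowchart TD"]
    for i, step in enumerate(steps):
        # Quote labels so punctuation in step titles stays valid Mermaid syntax.
        lines.append(f'    S{i}["{step}"]')
    # Chain each step to the next to express the deterministic ordering.
    for i in range(len(steps) - 1):
        lines.append(f"    S{i} --> S{i + 1}")
    return "\n".join(lines)
```

The resulting text, saved as `plan.mmd`, renders to SVG with `mmdc -i plan.mmd -o plan.svg` once `@mermaid-js/mermaid-cli` is installed.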
- Generate a deterministic plan using the same focus extraction as the planning command.
- Convert each step into a safe shell command suggestion (currently an `echo` preview) using the shell tool helpers.
- Execute each command in simulation mode by default, capturing previews, stdout, and stderr in memory. Pass `--allow-run` to run the commands for real; dangerous patterns such as `rm -rf`, `sudo`, `chmod 777`, and `curl | sh` remain blocked by the denylist.
- Persist a Markdown `run_log.md` and structured `run_trace.json` under `artifacts/<timestamp>_run/` so every execution is auditable.
This flow gives you an end-to-end rehearsal of the plan before choosing to execute commands in your environment.
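The denylist check can be sketched as a regex scan over each proposed command. The patterns below are illustrative; the real list lives in the shell tool helpers:

```python
import re

# Illustrative denylist covering the dangerous patterns mentioned above.
DENYLIST = [
    r"\brm\s+-rf\b",
    r"\bsudo\b",
    r"\bchmod\s+777\b",
    r"curl[^|]*\|\s*(ba)?sh",
]


def is_blocked(command: str) -> bool:
    """Return True when a proposed command matches a dangerous pattern."""
    return any(re.search(pattern, command) for pattern in DENYLIST)
```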
The optional `agent docs` command integrates with the Context7 MCP service to capture structured
documentation for a package. To enable it:

- Install Node.js 18+ and the Context7 CLI if you plan to run a local MCP server.
- Set `CONTEXT7_URL` in your environment (e.g., `https://context7.your-domain/v1`).
- Optionally set `CONTEXT7_API_KEY` when the service requires authentication.
- Leave `CONTEXT7_MODE` at `http` for HTTP-based access (the default in `.env.example`).

Running `agent docs next-auth@5.0.0` writes a timestamped directory under `artifacts/` containing:

- `docs.json`: the structured payload from Context7 (library metadata + document entries).
- `docs.md`: a human-readable summary with titles, summaries, and source hints.
These artifacts can be referenced by subsequent agent research or agent plan runs to augment
their context with authoritative documentation.
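Rendering `docs.md` from the structured payload could look like the sketch below. The entry shape (`title`/`summary`/`source` keys) is an assumption for illustration; the real Context7 response may differ:

```python
def render_docs_markdown(library: str, entries: list[dict]) -> str:
    """Turn a structured docs payload into a human-readable Markdown summary.

    The payload shape here is hypothetical: each entry is assumed to carry
    optional title/summary/source keys.
    """
    lines = [f"# Documentation: {library}", ""]
    for entry in entries:
        lines.append(f"## {entry.get('title', 'Untitled')}")
        if summary := entry.get("summary"):
            lines.append(summary)
        if source := entry.get("source"):
            lines.append(f"_Source: {source}_")
        lines.append("")
    return "\n".join(lines)
```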
The runner module introduces:
- Agent state (`agent.py`): dataclasses for messages, tool definitions, and tool call results, along with a lightweight `Agent` container for instructions and memory.
- Policy heuristics (`policy.py`): a deterministic decision helper that selects tools when the last message is from the user and falls back to final responses otherwise.
- Execution loop (`runner.py`): coordinates policy decisions, invokes tools, enforces simulation mode for shell commands, blocks denylisted patterns (e.g., `rm -rf`), logs progress via `structlog`, and returns structured trace data for downstream formatting.
- Tool plugins (`runner/plugins.py`): discover additional `ToolDefinition` implementations from configuration or package entry points. See `docs/third_party_tools.md` for details on distributing custom tools.
GitHub Actions runs `make lint`, `make type`, and `make test` on every push and pull request. The
workflow lives at `.github/workflows/ci.yml` and ensures Ruff, Black,
MyPy, and pytest stay green across environments.
Future iterations will focus on swapping deterministic stubs for real LLM calls (once API keys are configured), adding caching for repeated research runs, and enriching the runner policy with more representative traces.
🧪 Before contributing, READ THE BRANCHING GUIDE! We've had enough interdimensional Git disasters.
- `CONTRIBUTING.md`: complete development workflow and branching strategy
- `docs/branching-workflow.md`: Mermaid diagrams showing proper Git workflow
TL;DR: Always create feature branches from latest main, keep them short-lived, and test before pushing. No exceptions.
This project is licensed under the terms of the MIT License. See LICENSE.