From ba3ef02440a48e9572b96863566cbaaeae16788f Mon Sep 17 00:00:00 2001 From: "daibo@machinepulse.ai" Date: Sun, 26 Apr 2026 17:01:22 +0800 Subject: [PATCH 1/3] docs: add /docs site, restructure README, align skill with protocol MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Add 9 docs under /docs/ (signal format, architecture, quick-start, build-a-sensor, multi-sensor, sensorhub, why-w2a, rfc-graph, CONTRIBUTING) so the README can stay focused on the landing-page experience and link out for depth. - Restructure the README quick-start around the Claude Code plugin path; demote SensorHub and "Missing a sensor?" to H3 subsections under Sensors; thin the hero-area horizontal rules. - Remove all `w2a` CLI pseudo-commands across docs and README — only capabilities that work today are described. - Align build-w2a-sensor skill with the protocol: mandatory npm keywords for `npm search w2a-sensor` / SensorHub discovery, `source_event.schema` property descriptions as best practice, `event.type` `domain` is the abstract source space (not the platform name), action verbs in past tense. Co-Authored-By: Claude Opus 4.7 (1M context) --- README.md | 211 +++++++++++++------------------ docs/CONTRIBUTING.md | 35 +++++ docs/architecture.md | 110 ++++++++++++++++ docs/build-a-sensor.md | 101 +++++++++++++++ docs/multi-sensor.md | 37 ++++++ docs/quick-start.md | 85 +++++++++++++ docs/rfc-graph.md | 48 +++++++ docs/sensorhub.md | 54 ++++++++ docs/signal-format.md | 152 ++++++++++++++++++++++ docs/why-w2a.md | 37 ++++++ skills/build-w2a-sensor/SKILL.md | 28 +++- 11 files changed, 773 insertions(+), 125 deletions(-) create mode 100644 docs/CONTRIBUTING.md create mode 100644 docs/architecture.md create mode 100644 docs/build-a-sensor.md create mode 100644 docs/multi-sensor.md create mode 100644 docs/quick-start.md create mode 100644 docs/rfc-graph.md create mode 100644 docs/sensorhub.md create mode 100644 docs/signal-format.md create mode 100644 docs/why-w2a.md diff --git a/README.md b/README.md index a96aa4d..489f2cb 100644 --- a/README.md +++ b/README.md @@ -1,120 +1,63 @@ ![Welcome to World2Agent](./docs/images/readme-banner.png) -**Agents can't act on what they can't perceive.** +

+ Agents can't act on what they can't perceive. +

+ +

+ License + npm +

+ +

+ Website · + Quick Start · + Sensors · + SensorHub · + Docs · + Community +

+ + +

+ + Watch the W2A Concept Video + +

+

+ ▶️ Watch: What is World2Agent? +

+ +

+ GitHub Stars +

+

+ Like what you see? Give us a ⭐ — every star helps more developers discover W2A. +

-World2Agent is an open protocol that connects the world to AI agents. It standardizes how agents perceive their surroundings — stock movements, meeting updates, new research papers, GitHub trending repos, X/Twitter feeds, and anything else that can emit a signal. - -## Why World2Agent? - -AI agents today are mostly reactive — they wait for user input, or have to actively search for information. A truly useful agent needs to proactively perceive its environment: a stock price hitting your threshold, a meeting agenda changing 10 minutes before it starts, a new paper dropping in your research area, a repo trending on GitHub that's relevant to your project. - -Without a standard, every agent builder has to: - -* Write bespoke integrations for each data source - -* Design their own signal schema — none of which are interoperable - -* Handle polling, webhooks, auth, dedup, backpressure from scratch - -World2Agent makes perception pluggable. Install a sensor, get structured signals. Swap one sensor for another, your agent code doesn't change. Compose multiple sensors, they all speak the same schema. Standardized, open, and pluggable — for perception. - -* **Unified signal format** — one schema for all sources, designed for AI consumption - -* **Pluggable sensors** — each sensor is an independent npm package; install only what you need +*** -* **Pluggable delivery** — direct to agent, or enriched via a graph layer (self-hosted or third-party) +## What is World2Agent? -* **Pluggable transports** — stdout pipe, HTTP POST, or any custom transport +World2Agent (W2A) is an open protocol that standardizes how AI agents perceive the real world. Install a sensor, your agent gets structured, real-time data. Swap sensors freely — they all speak the same schema. -* **Zero lock-in** — run sensors yourself, compose them freely, no central server +W2A isn't a product. It's an open protocol and an invitation. We built the first sensors — the real breakthroughs will come from the community. -We built the protocol and the first sensors. But these are just the starting point — the real breakthroughs will come from the community. +→ [Why W2A? Full story](./docs/why-w2a.md) ## Architecture **World → Sensor → Agent** -Sensors watch data sources and emit structured signals following W2A Protocol — a unified signal schema designed for AI consumption. Your agent receives signals and decides what to do. - -The protocol defines what a signal looks like. Sensors do the work. Agents make the decisions. - -This is the core loop — and it's all you need to get started. - -## Roadmap - -As your needs grow, W2A supports more advanced patterns: - -* **Graph layer** — compose and enrich signals from multiple sensors before they reach your agent. Run it yourself, or use a hosted service. Graph input and output both follow W2A Protocol, so it slots in without changing your agent code. - -* **SensorHub** — an open registry where anyone can publish, discover, and install sensors from the community. Think npm, but for real-world perception. - -These are on the roadmap. The protocol and the first sensors are ready today. +Sensors watch data sources and emit structured data following W2A Protocol. Your agent receives signals and decides what to do. ![World2Agent system architecture](./docs/images/system-architecture.png) -## Packages - -> [**SDK Reference →**](https://github.com/machinepulse-ai/world2agent-typescript-sdk) Full API documentation for sensor developers and signal consumers. 
- -*** - -## Signal Format - -Every signal follows a unified schema: - -```typescript -{ - signal_id: "uuid-v4", - schema_version: "w2a/0.1", - emitted_at: 1719000000000, - source: { - sensor_id: "", - sensor_version: "0.1.0", - source_type: "slack", - user_identity: "U01A2B3C4D", // Slack id of the user this sensor serves - package: "", // canonical package coordinate; usually = sensor_id - }, - event: { // normalized cross-source classification - type: "messaging.message.mentioned", // domain.entity.action - occurred_at: 1719000000000, - summary: "Zhang Wei asked about payment deployment safety in #engineering; staging error rate spiked 2h ago, blocking release pipeline", - }, - source_event: { // optional, self-describing original payload from the source - schema: { /* JSON Schema draft-07 describing `data` */ }, - data: { channel_id: "C01ENG0001", message_ts: "1719000000.001200", user_id: "U09Z8Y7X6W" }, - }, - attachments: [ // optional, content blobs (tagged union on `type`) - { type: "inline", mime_type: "text/plain", description: "Original message text", data: "..." }, - { type: "reference", mime_type: "image/png", description: "Error rate dashboard screenshot", uri: "https://..." }, - ], -} -``` - -Key design decisions: - -* **`event.summary`** is the soul of the signal. An AI reading only the summary must be able to decide whether and how to act. Follow Actor-Action-Object-Context-Impact: *who did what, where, and why it matters*. - -* **`event` vs `source_event`** — `event` is the normalized cross-source classification (`type` / `occurred_at` / `summary`). `source_event` is the self-describing original payload from the source platform (`schema` + `data`, both required when present). Keeping them separate lets agents pattern-match on `event.type` without knowing platform-specific shapes, while graph layers still get the full structured facts. - -* **`attachments`** carry actual content blobs (message bodies, diffs, images, audio). Each item is a tagged union: `{ type: "inline", mime_type, description, data }` for embedded content, or `{ type: "reference", mime_type, description, uri }` for externally-addressable content. `description` is required on both so AI always understands what it's looking at. Not for structured metadata — that belongs in `source_event`. - -* **No routing in protocol** -- routing/priority is a consumer-side concern, not the sensor's. - -*** +→ [Signal format spec](./docs/signal-format.md) · [Architecture deep dive](./docs/architecture.md) ## Quick Start -W2A plugs into any agent that can consume structured signals. Pick the integration that matches your setup, or pipe sensors directly into your own consumer. - -Browse the full sensor catalog at [sensorhub.world2agent.ai](https://sensorhub.world2agent.ai). - -> **Security — install only sensors you trust.** -> -> A sensor's signals drive what your agent perceives and does, so an untrusted sensor is effectively an untrusted instruction source. We strongly recommend installing only open-source sensors from authors you trust, and reviewing the code before running it. - -### Claude Code - -In an active Claude Code session, install the `world2agent` plugin: +The fastest way to feel W2A is with Claude Code. 
In an active session, install the `world2agent` plugin: ``` /plugin marketplace add machinepulse-ai/world2agent-plugins @@ -125,51 +68,79 @@ In an active Claude Code session, install the `world2agent` plugin: Add a sensor — for example, Hacker News: ``` -/world2agent:sensor-add @world2agent/sensor-hacknews +/world2agent:sensor-add @world2agent/sensor-hackernews ``` -Then restart Claude Code with the plugin channel loaded so sensor signals can be delivered into your session: +Restart Claude Code with the plugin channel loaded so sensor signals flow into your session: ```bash claude --dangerously-load-development-channels plugin:world2agent@world2agent-plugins ``` -### More agent integrations +> **Security — install only sensors you trust.** A sensor's signals drive what your agent perceives and does, so an untrusted sensor is effectively an untrusted instruction source. Stick to open-source sensors from authors you trust, and review the code first. + +Or pipe directly to any agent runtime — no plugin needed: + +```bash +w2a-sensor-hackernews | your-agent +``` + +**Building your own agent?** See the [developer quick start](./docs/quick-start.md#option-2-code--sdk--sensor) for the SDK code path. + +→ [Full guide](./docs/quick-start.md) · [Multi-sensor](./docs/multi-sensor.md) · [SensorHub](./docs/sensorhub.md) -More first-class agent integrations are on the way. Until then, any agent can consume W2A signals directly via the pipe mode below. +## Sensors -### Pipe mode +### SensorHub -Every sensor ships as a standalone CLI, so you can pipe signals into any consumer — no plugin required: +Every sensor is a standard npm package. SensorHub is the discovery layer on top — browse the catalog at [world2agent.ai/hub](https://world2agent.ai/hub), or search npm directly: ```bash -w2a-sensor-slack | your-consumer-app +npm search w2a-sensor +npm install @world2agent/sensor-hackernews ``` -*** +→ [SensorHub guide](./docs/sensorhub.md) + +### Missing a sensor? -## Build a sensor +[Build your own](./docs/build-a-sensor.md) in ~50 lines. The `build-w2a-sensor` skill walks an AI coding agent through discovery, signal design, scaffolding, and the install recipe. -A sensor is an independent npm package that watches one source and emits `W2ASignal`. Install our skill and ask your coding agent to build it — the skill walks through source interrogation, signal design, scaffold, and install recipe: +Once it's ready, ship it to npm: ```bash -npx skills add https://github.com/machinepulse-ai/world2agent/skills/build-w2a-sensor +npm publish ``` -See the [TypeScript SDK](https://github.com/machinepulse-ai/world2agent-typescript-sdk) for the `defineSensor` / `run` / transport APIs. +That's all it takes to share your sensor with the world — once published, it's installable by every W2A agent everywhere, and SensorHub indexes it for discovery. -*** +## Roadmap + +* **Graph layer** — compose and enrich signals from multiple sensors before they reach your agent. → [RFC](./docs/rfc-graph.md) + +## Contributing + +* 🔧 **Build a sensor** — `npm publish` and it's live -## Contribute +* 🐛 **Report bugs** — [open an issue](https://github.com/machinepulse-ai/world2agent/issues) -World2Agent is an open protocol — the real breakthroughs come from the community. Ways to get involved: +* 💡 **Suggest a sensor** — [Discussions](https://github.com/machinepulse-ai/world2agent/discussions) -* **Publish a sensor** — pick a source you care about and build a sensor for it (see *Build a sensor* above). Once it's on npm, anyone can install it. 
High-quality sensors get surfaced on [sensorhub.world2agent.ai](https://sensorhub.world2agent.ai). -* **Evolve the protocol** — propose schema changes via PR against [`schema/`](./schema). Protocol changes land here first, then flow into the SDK and plugins. -* **Improve the SDK** — the reference TypeScript SDK lives at [`world2agent-typescript-sdk`](https://github.com/machinepulse-ai/world2agent-typescript-sdk). Help with transports, testing utilities, or SDKs in other languages. -* **Add an agent integration** — bring W2A to another agent runtime via the [plugins repo](https://github.com/machinepulse-ai/world2agent-plugins). -* **File issues & ideas** — bug reports, ambiguous schema fields, sensor wishlist entries all welcome. +→ [Contributing guide](./docs/CONTRIBUTING.md) + +## Community + +[Website](https://machinepulse.ai/) · [Discord]([DISCORD_LINK]) · [X / Twitter](https://x.com/machinepulse) · [YouTube]([YOUTUBE_LINK]) + + + ## License -Apache 2.0 +[Apache 2.0](./LICENSE) + +*** + +

+ Built by MachinePulse · Open source, open protocol, open invitation. +

diff --git a/docs/CONTRIBUTING.md b/docs/CONTRIBUTING.md new file mode 100644 index 0000000..f6a8e64 --- /dev/null +++ b/docs/CONTRIBUTING.md @@ -0,0 +1,35 @@ +# Contributing to World2Agent + +Thanks for your interest in W2A. Here's how you can help. + +## Build a Sensor + +The most impactful way to contribute. Every new sensor expands what agents can perceive. + +1. Use the W2A SDK to build your sensor ([guide](./build-a-sensor.md)) +2. `npm publish` to distribute + +That's it. No PR to the main repo required — your sensor is an independent npm package. + +## Improve the Protocol + +Found an edge case in the signal format? Have a better idea for event type conventions? Open an issue or a Discussion with the `protocol` label. + +## Improve Docs + +Found something confusing? Typo? Missing example? PRs to `/docs` are always welcome. + +## Report Bugs + +[Open an issue](https://github.com/machinepulse-ai/world2agent/issues) with: +- What you were trying to do +- What happened +- What you expected + +## Suggest a Sensor + +Don't want to build one yourself? Tell us what perception your agent needs in [Discussions](https://github.com/machinepulse-ai/world2agent/discussions). Tag it `sensor-request`. + +## Code of Conduct + +Be kind. Be constructive. We're building this together. diff --git a/docs/architecture.md b/docs/architecture.md new file mode 100644 index 0000000..c7e28d6 --- /dev/null +++ b/docs/architecture.md @@ -0,0 +1,110 @@ +# Architecture + +## Core Loop + +**World → Sensor → Agent** + +Sensors watch data sources and emit structured data following W2A Protocol — a unified signal schema designed for AI consumption. Your agent receives signals and decides what to do. + +The protocol defines what a signal looks like. Sensors do the work. Agents make the decisions. + +``` +World (Flights API, Calendar, GitHub, X, Steam, ...) + │ + ▼ +Sensors (npm packages, emitting W2A-formatted data) + │ + ▼ +Agent (receives signals, reasons, acts) +``` + +This is the core loop — and it's all you need to get started. + +## Key Design Decisions + +1. **Protocol is natural-language-first** — designed for agents, not humans +2. **Sensors don't make value judgments** — they provide controllable granularity of listening and subscription +3. **Sensors don't assume where they run** — consumers define how sensors emit via transports +4. **Graph output stays W2A Protocol** — technically acts as middleware + +## Sensor + +A sensor is a small program that watches one data source and emits structured data following W2A Protocol. Each sensor is an independent npm package. + +Sensors support multiple delivery methods: +- **stdout pipe** — `w2a-sensor-github | your-agent` +- **HTTP POST** — push to an endpoint +- **WebSocket / SSE** — streaming delivery +- **Custom transport** — implement your own + +A sensor does NOT: +- Route or prioritize signals (that's the consumer's job) +- Define actions (that's the agent's job) +- Assume a specific agent runtime + +## SensorHub + +SensorHub is a discovery layer on top of npm. Every sensor is a standard npm package — SensorHub makes them easier to find. + +``` +Developer builds sensor → npm publish → submit to SensorHub + │ +User searches SensorHub → finds sensor → npm install +``` + +No separate registry. npm is the single source for hosting and distribution. + +## Graph Layer (Roadmap) + +As needs grow, W2A supports an intermediate graph layer that composes and enriches signals from multiple sensors before they reach the agent. 
+ +``` +World + │ + ▼ +Sensors ──────────────┐ + │ │ + ▼ ▼ +Agent Graph layer +(direct) (compose, enrich, filter) + │ + ▼ + Agent +``` + +Graph deployment options: +- **Self-hosted** — agent owner runs their own instance, data stays local +- **Third-party** — hosted service (e.g. Karpo), zero ops + +Graph input and output both follow W2A Protocol, so it slots in without changing agent code. + +## Full Architecture + +``` +┌─────────────────────────────────────────────────────────┐ +│ World │ +│ Flights API · Calendar · GitHub · X · Steam · ... │ +└──────────────────────┬──────────────────────────────────┘ + │ +┌──────────────────────▼──────────────────────────────────┐ +│ Sensors (following W2A Protocol) [Today]│ +│ github · x · steam · gcal · feishu · hn · ... │ +│ │ +│ ←→ SensorHub (discover, publish, install) [Today]│ +└──────────┬───────────────────────────┬──────────────────┘ + │ │ + Direct path Graph path [Roadmap] + │ │ + │ ┌────────────▼────────────────┐ + │ │ Graph layer │ + │ │ ┌─ Self-hosted │ + │ │ └─ Third-party (e.g. Karpo)│ + │ └────────────┬────────────────┘ + │ │ + └───────────┬───────────────┘ + │ +┌──────────────────────▼──────────────────────────────────┐ +│ Agent │ +│ Receives signals, reasons, acts │ +└─────────────────────────────────────────────────────────┘ +``` diff --git a/docs/build-a-sensor.md b/docs/build-a-sensor.md new file mode 100644 index 0000000..659c0cc --- /dev/null +++ b/docs/build-a-sensor.md @@ -0,0 +1,101 @@ +# Build a Sensor + +Write your own sensor in ~50 lines. + +## Fast path: use the skill + +The fastest way is to let an AI coding agent walk you through it. Install our skill: + +```bash +npx skills add https://github.com/machinepulse-ai/world2agent/skills/build-w2a-sensor +``` + +Then ask your coding agent: *"Build a W2A sensor for \."* The skill walks through source interrogation, signal design, scaffolding, and the install recipe — the hard parts of sensor design are upstream of the code, and the skill front-loads them. + +The example below is roughly what the skill produces — read on if you want to understand the shape, or skip straight to running the skill. + +## Minimal Example + +```typescript +import { defineSensor } from "@world2agent/sdk/sensor"; +import { createSignal } from "@world2agent/sdk"; +import { z } from "zod"; + +export default defineSensor({ + id: "my-sensor", + version: "0.1.0", + source_type: "my-source", + auth: { + type: "api_key", + fields: [{ name: "token", label: "API Token", sensitive: true }], + }, + configSchema: z.object({ token: z.string() }), + + async start(ctx) { + const interval = setInterval(async () => { + const data = await fetchMySource(ctx.config.token); + + const signal = createSignal(this, { + event: { + // domain is the abstract source space (`tasks`, `messaging`, `repo`, …), + // not the platform name — platform lives in `source.source_type`. 
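        // `action` is a verb in past tense ("created", not "create" or "creating"),
        // because a signal describes something that already happened.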
+ type: "tasks.item.created", + summary: `${data.author} created "${data.title}" in ${data.project}; priority ${data.priority}, assigned to you`, + }, + source_event: { + schema: { + type: "object", + properties: { + id: { + type: "string", + description: "Item id in the source platform", + }, + priority: { + type: "number", + description: "Priority level: 0 (low) to 3 (urgent)", + }, + }, + }, + data: { id: data.id, priority: data.priority }, + }, + attachments: [ + { + type: "inline", + mime_type: "text/plain", + description: "Item description", + data: data.body, + }, + ], + }); + + await ctx.emit(signal); + }, 60_000); + + return () => clearInterval(interval); // cleanup + }, +}); +``` + +## Key Points + +- `event.summary` is mandatory — write it so an AI can triage from this field alone +- Follow `Actor-Action-Object-Context-Impact`: who did what, where, why it matters +- `source_event` (top-level, not inside `event`) must include both `schema` (JSON Schema draft-07) and `data`. Every property in `schema` SHOULD carry a `description` — that's what makes the payload self-describing +- `attachments` is a tagged union — each item is `{ type: "inline", data }` or `{ type: "reference", uri }`, both with required `mime_type` and `description`. For content (text, images, diffs), not structured metadata +- The `start` function returns a cleanup function + +## Publish + +```bash +npm publish +``` + +Sensors are distributed via npm. Once published, anyone can install your sensor by package name. + +## Standalone CLI + +Every sensor also works as a standalone CLI — pipe it to any agent: + +```bash +w2a-sensor-my-source | your-agent +``` diff --git a/docs/multi-sensor.md b/docs/multi-sensor.md new file mode 100644 index 0000000..303ce56 --- /dev/null +++ b/docs/multi-sensor.md @@ -0,0 +1,37 @@ +# Multi-Sensor Composition + +Compose sensors freely — they all speak the same schema. Same handler, regardless of source. Swap one sensor for another, your agent code doesn't change. + +## Basic Composition + +```typescript +import { runAll } from "@world2agent/sdk/sensor"; +import { fanout, stdoutTransport, httpTransport } from "@world2agent/sdk/transports"; + +await runAll([ + { spec: github, config: { token: "xxx" } }, + { spec: steam, config: { userId: "xxx" } }, + { spec: gcal, config: { clientId: "xxx", clientSecret: "xxx", refresh_token: "xxx" } }, +], { + onSignal: fanout([ + stdoutTransport(), + httpTransport({ url: "https://your-app.com/api/signals" }), + ]), +}); +``` + +## Pipe Mode + +Every sensor has a standalone CLI. Compose at the shell level: + +```bash +w2a-sensor-github | your-consumer-app +``` + +## Multiple Transports + +Fan out the same signal to multiple destinations: + +- `stdoutTransport()` — pipe to another process +- `httpTransport({ url })` — POST to an endpoint +- Custom transports — implement your own diff --git a/docs/quick-start.md b/docs/quick-start.md new file mode 100644 index 0000000..0ce8c68 --- /dev/null +++ b/docs/quick-start.md @@ -0,0 +1,85 @@ +# Quick Start + +W2A plugs into any agent that can consume structured signals. Pick the path that fits your setup. + +> **Security — install only sensors you trust.** +> +> A sensor's signals drive what your agent perceives and does, so an untrusted sensor is effectively an untrusted instruction source. We strongly recommend installing only open-source sensors from authors you trust, and reviewing the code before running it. + +## Option 1: Claude Code (recommended) + +The fastest way to feel W2A. 
In an active Claude Code session, install the `world2agent` plugin: + +``` +/plugin marketplace add machinepulse-ai/world2agent-plugins +/plugin install world2agent@world2agent-plugins +/reload-plugins +``` + +Add a sensor — for example, Hacker News: + +``` +/world2agent:sensor-add @world2agent/sensor-hackernews +``` + +Restart Claude Code with the plugin channel loaded so sensor signals can be delivered into your session: + +```bash +claude --dangerously-load-development-channels plugin:world2agent@world2agent-plugins +``` + +Browse the full sensor catalog at [world2agent.ai/hub](https://world2agent.ai/hub). + +## Option 2: Code — SDK + Sensor + +For agents you're writing yourself. Install the SDK and a sensor: + +```bash +npm install @world2agent/sdk @world2agent/sensor-github +``` + +Start receiving signals: + +```typescript +import { run } from "@world2agent/sdk/sensor"; +import { createSignalHandler } from "@world2agent/sdk/consumer"; +import github from "@world2agent/sensor-github"; + +// 1. Create a handler — an event router for incoming signals +const handler = createSignalHandler(); + +// 2. Register listeners for each event type you care about +// `domain` is the abstract source space (`repo`), not the platform name — +// the platform identity is in `signal.source.source_type`. +handler.on("repo.trending.entered", async (signal) => { + console.log("Trending:", signal.event.summary); + // Your agent logic here +}); + +handler.on("repo.repo.starred", async (signal) => { + console.log("New star:", signal.event.summary); +}); + +// 3. Run the sensor — signals flow in, your agent decides what to do +await run(github, { + config: { token: "xxx" }, + onSignal: (signal) => handler.handle(signal), +}); +``` + +## Option 3: CLI Pipe — any agent runtime + +Every sensor is also a standalone CLI. Pipe it directly to your agent: + +```bash +w2a-sensor-github | your-agent +``` + +No SDK, no TypeScript, no setup. The sensor emits W2A-formatted JSON to stdout, your agent reads stdin. More first-class agent integrations are on the way; until then this is how any runtime can consume W2A. + +## Next Steps + +- Browse available sensors → [Sensor Library](../README.md#sensors) +- Find community sensors → [SensorHub](./sensorhub.md) +- Build your own → [Build a Sensor](./build-a-sensor.md) +- Compose multiple sensors → [Multi-Sensor Composition](./multi-sensor.md) diff --git a/docs/rfc-graph.md b/docs/rfc-graph.md new file mode 100644 index 0000000..f5ff5ee --- /dev/null +++ b/docs/rfc-graph.md @@ -0,0 +1,48 @@ +# RFC: Graph Layer + +> Status: Roadmap — not yet implemented + +## Problem + +Single sensors provide atomic signals. But real-world perception often requires crossing multiple sources — "you have a free evening + a restaurant you like just opened a new menu + the weather is nice" isn't any one sensor's output. + +## Proposal + +A graph layer that sits between sensors and agents, composing and enriching signals from multiple sources before they reach the agent. + +### Key Constraint + +**Graph output stays W2A Protocol.** The graph is technically middleware — it consumes W2A signals and emits W2A signals. This means: + +- Agent code doesn't change when you add a graph +- You can swap between direct path and graph path without touching the agent +- Graphs are composable — one graph can feed another + +### Deployment Options + +- **Self-hosted** — agent owner runs their own instance. Data stays local. +- **Third-party** — hosted service (e.g. Karpo). 
Zero ops, trade-off is data passes through a third party. + +### Architecture + +``` +Sensors ──────────────┐ + │ │ + ▼ ▼ +Agent Graph layer +(direct) (compose, enrich, filter) + │ + ▼ + Agent +``` + +## Open Questions + +1. How does a graph define its composition logic? Config file? Code? Natural language? +2. Should graphs be publishable on SensorHub alongside sensors? +3. How to handle backpressure when a graph consumes high-frequency sensors? +4. Should graphs be able to call external APIs (e.g. LLM for enrichment), or stay pure data transformations? + +## Status + +This is on the roadmap. The protocol and sensors are the current priority. Community input welcome — open a [Discussion](https://github.com/machinepulse-ai/world2agent/discussions) if you have thoughts on graph design. diff --git a/docs/sensorhub.md b/docs/sensorhub.md new file mode 100644 index 0000000..d33d6b5 --- /dev/null +++ b/docs/sensorhub.md @@ -0,0 +1,54 @@ +# SensorHub + +SensorHub is a discovery layer on top of npm. Every W2A sensor is a standard npm package — SensorHub makes them easier to find. + +No separate registry. No new platform. npm is the single source for hosting and distribution. + +## Find Sensors + +Browse the catalog at [world2agent.ai/hub](https://world2agent.ai/hub), or search npm directly: + +```bash +npm search w2a-sensor +``` + +Install any sensor by package name: + +```bash +npm install {PackageName} +``` + +## Publish a Sensor + +Build your sensor with the [W2A SDK](./build-a-sensor.md), then ship it to npm: + +```bash +npm publish +``` + +That's the distribution. SensorHub indexes published sensors and surfaces them on the website — no separate publish step today. + +Requirements: +- An npm account (this is the quality gate — npm identity = accountability) +- A public GitHub link is recommended; open-source sensors get higher visibility + +## How It Works + +SensorHub is a thin index layer. It doesn't host any code or packages. + +Each sensor entry stores: + +| Field | Source | +|-------|--------| +| npm package name | Developer submits | +| Description | Developer submits | +| GitHub URL | Developer submits (optional, boosts ranking) | +| Category tags | Developer submits | +| Install count | Pulled from npm API | +| Open source flag | Auto-detected from GitHub URL | + +## Ranking + +Sensors are ranked by: **npm install count × open source weight** + +Open-source sensors (with a GitHub link) rank higher. This incentivizes transparency and auditability. diff --git a/docs/signal-format.md b/docs/signal-format.md new file mode 100644 index 0000000..d627126 --- /dev/null +++ b/docs/signal-format.md @@ -0,0 +1,152 @@ +# Signal Format + +Every sensor emits the same envelope and every consumer accepts the same envelope — no sensor-specific shapes on the wire. The canonical schema lives in [`schema/0.1/schema.ts`](../schema/0.1/schema.ts) (with a generated [`schema.json`](../schema/0.1/schema.json) alongside it). Pin to the directory, never to `main`. 
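For orientation, here is a rough TypeScript view of the envelope described by the example and field tables below. It is a reader's sketch, not the canonical definition (that lives in `schema/0.1/schema.ts`), and the optionality of the sub-fields inside `source` is approximate:

```typescript
// Sketch only; see schema/0.1/schema.ts for the canonical types.
interface W2ASignalSketch {
  signal_id: string;                 // UUID v4, fresh per emission
  schema_version: "w2a/0.1";         // consumers reject versions they don't understand
  emitted_at: number;                // UTC ms, when the sensor emitted
  source: {
    sensor_id: string;               // npm coordinate of the sensor
    sensor_version: string;
    source_type: string;             // coarse platform grouping, e.g. "github"
    user_identity?: string;          // identity the sensor serves, if any
    package: string;                 // usually equal to sensor_id
  };
  event: {
    type: string;                    // domain.entity.action
    occurred_at: number;             // when the underlying event happened
    summary: string;                 // Actor-Action-Object-Context-Impact
  };
  source_event?: {
    schema: Record<string, unknown>; // JSON Schema draft-07 describing `data`
    data: Record<string, unknown>;
  };
  attachments?: Array<
    | { type: "inline"; mime_type: string; description: string; data: string }
    | { type: "reference"; mime_type: string; description: string; uri: string }
  >;
  _meta?: Record<string, unknown>;   // vendor / experimental, unknown keys ignored
}
```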
+ +## Example + +```json +{ + "signal_id": "8b1f0c4a-5d2e-4f87-9a1b-3c0e5f8a9d12", + "schema_version": "w2a/0.1", + "emitted_at": 1719000000123, + + "source": { + "sensor_id": "@world2agent/sensor-github", + "sensor_version": "0.1.0", + "source_type": "github", + "user_identity": "octocat", + "package": "@world2agent/sensor-github" + }, + + "event": { + "type": "repo.trending.entered", + "occurred_at": 1719000000000, + "summary": "llm-agents/perception gained 523 stars in 24h and hit #3 on GitHub Trending in AI; relevant to your active 'agent perception' research thread" + }, + + "source_event": { + "schema": { + "type": "object", + "properties": { + "repo": { + "type": "string", + "description": "Repository in `owner/name` form" + }, + "stars_today": { + "type": "integer", + "description": "Stars gained in the last 24 hours" + }, + "trending_rank": { + "type": "integer", + "description": "Position on GitHub Trending; 1 is top of the list" + } + }, + "required": ["repo", "stars_today"] + }, + "data": { + "repo": "llm-agents/perception", + "stars_today": 523, + "trending_rank": 3 + } + }, + + "attachments": [ + { + "type": "inline", + "mime_type": "text/plain", + "description": "Repository README excerpt", + "data": "Perception primitives for LLM agents…" + }, + { + "type": "reference", + "mime_type": "image/png", + "description": "Stars-over-time chart", + "uri": "https://example.com/charts/llm-agents-perception.png" + } + ] +} +``` + +## Fields + +### Envelope + +| Field | Required | Notes | +|---|---|---| +| `signal_id` | yes | UUID v4. Fresh per emission, even for logically identical events — consumers dedupe on this. | +| `schema_version` | yes | `"w2a/0.1"`. Consumers MUST reject versions they don't understand. | +| `emitted_at` | yes | When the sensor emitted, UTC ms. Distinct from `event.occurred_at`. | +| `source` | yes | Who emitted and where the event came from. | +| `event` | yes | Normalized cross-source classification. | +| `source_event` | optional | Self-describing original payload from the source. | +| `attachments` | optional | Content blobs (text, images, audio, etc.). | +| `_meta` | optional | Vendor / experimental fields. Consumers MUST ignore unknown keys. Available on most objects. | + +### `source` + +`sensor_id` is the npm coordinate, `package` is what channels and bridges use to derive the agent-side handler id (typically equal to `sensor_id`). `source_type` is a coarse platform grouping (`github`, `cron`, `feishu`) shared across sensors of the same platform — it's an open set, no central registry. + +### `event` — the soul of the signal + +`event.summary` is what an AI reads first. If the summary alone is not enough to decide whether and how to act, the signal has failed. + +Pattern: **Actor → Action → Object → Context → Impact**. + +```text +[Actor] [Action] [Object] in [Context]; [Impact] +``` + +Examples: + +- ✅ `"Zhang Wei asked about payment deployment safety in #engineering; staging error rate spiked 2h ago, blocking release pipeline"` +- ✅ `"llm-agents/perception gained 523 stars in 24h and hit #3 on Trending; relevant to your active research thread"` +- ❌ `"new event"` / `"PR update"` / `"price moved"` — vague, reject and rewrite. 
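In sensor code this usually means assembling the summary from source fields rather than forwarding a raw title. A minimal sketch, with entirely hypothetical field names:

```typescript
// Hypothetical source fields; the shape is illustrative, not a real sensor API.
function buildSummary(e: {
  author: string;
  title: string;
  project: string;
  blockedCount: number;
}): string {
  // Actor, Action, Object, Context; then Impact after the semicolon.
  return (
    `${e.author} created "${e.title}" in ${e.project}; ` +
    `blocks ${e.blockedCount} downstream task(s)`
  );
}
```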
+ +`event.type` follows `domain.entity.action`: + +| Example | Domain | Entity | Action | +|---|---|---|---| +| `repo.pull_request.opened` | repo | pull_request | opened | +| `messaging.message.mentioned` | messaging | message | mentioned | +| `market.quote.threshold_crossed` | market | quote | threshold_crossed | +| `calendar.event.updated` | calendar | event | updated | + +Open namespace — sensors coin their own triples. Consumers pattern-match on this string, so the triples a sensor emits are part of its public contract; treat them as you would a public API. + +`domain` is the **abstract source space** (`messaging`, `repo`, `market`, `calendar`), not the platform name. The platform identity already lives in `source.source_type` — keeping the two orthogonal is what lets one handler match the same semantic event across platforms: `handler.on("messaging.message.mentioned")` catches Slack, Discord, Lark, and Teams alike. A sensor for GitHub stars emits `repo.repo.starred` (with `source.source_type: "github"`), not `github.repo.starred`. + +`event.occurred_at` is when the underlying event happened. If the source doesn't expose it, fall back to `emitted_at`. + +### `event` vs `source_event` vs `attachments` + +Three channels, three jobs — keep them separate. + +| Field | Carries | Example | +|---|---|---| +| `event` | Normalized classification — `type`, `occurred_at`, `summary` | `type: "messaging.message.mentioned"`, summary text | +| `source_event` | Self-describing structured data: `{ schema, data }` with JSON Schema draft-07 | IDs, numbers, booleans, enums the graph or agent will reason over | +| `attachments` | Unstructured content blobs | Message body, PDF, screenshot, audio clip | + +Every property in `source_event.schema` SHOULD carry a `description` — that is what makes the payload self-describing. A schema that only declares types (`{ "type": "integer" }`) leaves the consumer guessing what the value means; a schema with descriptions (`{ "type": "integer", "description": "Stars gained in the last 24 hours" }`) lets an agent reason about the data without sensor-specific knowledge. + +Never put structured machine data in an attachment. Never put large blobs in `source_event.data`. + +### `attachments` — tagged union + +Each attachment is `InlineAttachment` or `ReferenceAttachment`, discriminated by `type`. `description` is required on both — AI must always know what it's looking at. + +```json +{ "type": "inline", "mime_type": "text/plain", "description": "…", "data": "…" } +{ "type": "reference", "mime_type": "image/png", "description": "…", "uri": "https://…" } +``` + +- `inline.data` — UTF-8 for text mime types, base64 for binary. +- Prefer `reference` for anything larger than a few KiB, or anything already addressable. Consumers fetch references on demand. + +## Design notes + +**No routing in protocol.** Routing and priority are consumer-side concerns. A sensor just emits; the consumer decides what matters. The same sensor feeds a Claude Code agent, a Slack bot, and a dashboard with zero changes. + +**Versioning.** The schema directory name (`0.1`) is the protocol version. Breaking changes bump the directory; additive changes land in place. Consumers pin to the directory. + +**Extensibility via `_meta`.** Most objects carry `_meta?: Record` so implementations can attach experimental or vendor fields without protocol bumps. Consumers MUST ignore unknown `_meta` keys. 
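A minimal consumer-side guard that follows these rules (version check, `signal_id` dedupe, unknown `_meta` keys simply ignored), sketched in TypeScript. The in-memory `Set` is illustrative; a real consumer would persist seen ids:

```typescript
const seen = new Set<string>(); // illustrative; persist seen ids in practice

function accept(signal: { signal_id: string; schema_version: string }): boolean {
  // Consumers MUST reject schema versions they don't understand.
  if (signal.schema_version !== "w2a/0.1") return false;

  // signal_id is fresh per emission, even for logically identical events; dedupe on it.
  if (seen.has(signal.signal_id)) return false;
  seen.add(signal.signal_id);

  // Unknown _meta keys are ignored simply by never reading them.
  return true;
}
```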
diff --git a/docs/why-w2a.md b/docs/why-w2a.md new file mode 100644 index 0000000..0c5c127 --- /dev/null +++ b/docs/why-w2a.md @@ -0,0 +1,37 @@ +# Why World2Agent? + +AI agents today are mostly reactive — they wait for user input, or have to actively search for information. A truly useful agent needs to proactively perceive its environment: a stock price hitting your threshold, a meeting agenda changing 10 minutes before it starts, a new paper dropping in your research area, a repo trending on GitHub that's relevant to your project. + +Right now, an agent only knows your flight is delayed when you tell it — or when it happens to check. That's not perception. That's polling at best, and human labor at worst. + +Real perception means: the world changes, your agent knows — instantly, automatically, without anyone pulling the trigger. + +## The Problem for Builders + +Without a standard, every agent builder has to: + +- Write bespoke integrations for each data source +- Design their own signal schema — none of which are interoperable +- Handle polling, webhooks, auth, dedup, backpressure from scratch + +Each integration is a one-off. Each one parses a different API, emits a different JSON shape, breaks when you swap agent frameworks. The result: developers waste hours building perception infrastructure from scratch, and agents run on insufficient context. + +## What W2A Changes + +World2Agent makes perception pluggable: + +- **Unified signal format** — one schema for all sources, designed for AI consumption +- **Pluggable sensors** — each sensor is an independent npm package; install only what you need +- **Pluggable delivery** — direct to agent, or enriched via a graph layer (self-hosted or third-party) +- **Pluggable transports** — stdout pipe, HTTP POST, or any custom transport +- **Zero lock-in** — run sensors yourself, compose them freely, no central server + +Install a sensor, get structured data. Swap one sensor for another, your agent code doesn't change. Compose multiple sensors, they all speak the same schema. + +## This Needs the Community + +W2A isn't a product. It's an open protocol and an invitation — to build the perception layer for AI agents, together. + +There are millions of data sources out there. We built a few sensors. The rest should come from you. + +We built the protocol and the first sensors. But these are just the starting point — the real breakthroughs will come from the community. diff --git a/skills/build-w2a-sensor/SKILL.md b/skills/build-w2a-sensor/SKILL.md index 7161439..764beb2 100644 --- a/skills/build-w2a-sensor/SKILL.md +++ b/skills/build-w2a-sensor/SKILL.md @@ -44,7 +44,9 @@ Pick the highest available option — **do not default to polling if a push chan ### 3. What event types will it emit? -List every triple it will emit as `domain.entity.action` (open namespace — you coin them). Examples: `slack.message.received`, `jira.issue.created`, `market.quote.threshold_crossed`. These strings are your sensor's public contract — consumers pattern-match against them, so stability matters. +List every triple it will emit as `domain.entity.action` (open namespace — you coin them). Examples: `messaging.message.received`, `repo.issue.created`, `market.quote.threshold_crossed`. These strings are your sensor's public contract — consumers pattern-match against them, so stability matters. + +`domain` is the **abstract source space** (`messaging`, `repo`, `market`, `calendar`), **not the platform name**. 
The platform identity already lives in `source.source_type` (`"slack"`, `"github"`, `"jira"`). A Slack sensor emits `messaging.message.mentioned`, not `slack.message.mentioned` — that's what lets a consumer write `handler.on("messaging.message.mentioned")` once and match @-events from Slack, Discord, Lark, and Teams alike. `action` is a verb in past tense (`mentioned`, `opened`, `received`), not a gerund (`trending` ❌ → `trending_entered` ✅). ### 4. What config does the sensor need? @@ -72,10 +74,12 @@ Three channels, three jobs. Keep them separate: | Field | Carries | Example | |---|---|---| -| `event` | Normalized cross-source classification — `type`, `occurred_at`, `summary` | `type: "slack.message.mentioned"`, summary text | +| `event` | Normalized cross-source classification — `type`, `occurred_at`, `summary` | `type: "messaging.message.mentioned"`, summary text | | `source_event` | Self-describing structured data from the source: `{ schema, data }` with JSON Schema draft-07 for `schema` | IDs, numbers, booleans, enums the graph/agent will reason over | | `attachments` | Unstructured content blobs (message bodies, diffs, images, audio) | Text body of the message, PDF file, screenshot | +Every property in `source_event.schema` SHOULD carry a `description` — that's what makes the payload self-describing. A schema that only declares types (`{ "type": "integer" }`) leaves the consumer guessing what the value means; with a description (`{ "type": "integer", "description": "Stars gained in the last 24 hours" }`) an agent can reason about the data without sensor-specific knowledge. + Never put structured machine data in an attachment. Never put large blobs in `source_event.data`. ### Attachment choice: inline vs reference @@ -109,13 +113,21 @@ Layout: ### `package.json` -Substitute `` and `` with whatever coordinates you publish under. +Substitute `` and `` with whatever coordinates you publish under. The `keywords` array is the discoverability contract — fill in the real `source_type` decided in Phase 1, do not leave the placeholder. `npm search w2a-sensor` and SensorHub indexing both rely on these keywords. ```json { "name": "", "version": "0.1.0", "description": "W2A sensor — ", + "keywords": [ + "world2agent", + "w2a", + "w2a-sensor", + "sensor", + "agent", + "" + ], "type": "module", "main": "./dist/index.js", "types": "./dist/index.d.ts", @@ -146,6 +158,8 @@ Substitute `` and `` with whatever coordinates you publi } ``` +The first five keywords are mandatory for every W2A sensor — they are how consumers discover sensors via `npm search`. The last entry is the `source_type` decided in Phase 1 (e.g. `"github"`, `"hackernews"`, `"slack"`). Add additional source-specific tags (`"trending"`, `"oauth"`, etc.) only if they would actually help discovery. + The `w2a` block is tooling metadata the install CLI reads; it is not part of the wire protocol. ### `src/index.ts` @@ -193,8 +207,12 @@ export function transformEvent(/* source event args */): W2ASignal { summary: "...", // Phase 2 template }, source_event: { - schema: { /* JSON Schema draft-07 describing `data` */ }, - data: { /* original event fields */ }, + schema: { + /* JSON Schema draft-07 describing `data`. + Every property SHOULD carry a `description` so the payload + is self-describing without sensor-specific knowledge. 
*/ + }, + data: { /* original event fields */ }, }, attachments: [ { From 555e891cc0caeeb1f3438c15470c63b7f2b162da Mon Sep 17 00:00:00 2001 From: "daibo@machinepulse.ai" Date: Sun, 26 Apr 2026 17:58:04 +0800 Subject: [PATCH 2/3] =?UTF-8?q?docs:=20address=20PR=20#1=20review=20?= =?UTF-8?q?=E2=80=94=20fix=20placeholders,=20naming,=20and=20SensorHub=20f?= =?UTF-8?q?low?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - README: point Website to world2agent.ai and SensorHub to /hub - sensorhub.md: replace `{PackageName}` with ``, document the manual SensorHub submission step (no auto-crawl from npm today) - quick-start.md, signal-format.md: rename `repo.repo.starred` to `repo.star.added` to satisfy `domain.entity.action` - signal-format.md: surface the past-tense `action` rule from schema.ts - SKILL.md: drop the `trending` example to align with quick-start's `repo.trending.entered` Co-Authored-By: Claude Opus 4.7 (1M context) --- README.md | 12 ++++-------- docs/quick-start.md | 2 +- docs/sensorhub.md | 4 ++-- docs/signal-format.md | 4 +++- skills/build-w2a-sensor/SKILL.md | 2 +- 5 files changed, 11 insertions(+), 13 deletions(-) diff --git a/README.md b/README.md index 489f2cb..969b07c 100644 --- a/README.md +++ b/README.md @@ -10,22 +10,18 @@

- Website · + Website · Quick Start · Sensors · - SensorHub · + SensorHub · Docs · Community

- - Watch the W2A Concept Video - -

-

- ▶️ Watch: What is World2Agent? + + Watch the W2A Concept Video

diff --git a/docs/quick-start.md b/docs/quick-start.md index 0ce8c68..f5ccc7a 100644 --- a/docs/quick-start.md +++ b/docs/quick-start.md @@ -56,7 +56,7 @@ handler.on("repo.trending.entered", async (signal) => { // Your agent logic here }); -handler.on("repo.repo.starred", async (signal) => { +handler.on("repo.star.added", async (signal) => { console.log("New star:", signal.event.summary); }); diff --git a/docs/sensorhub.md b/docs/sensorhub.md index d33d6b5..43fe855 100644 --- a/docs/sensorhub.md +++ b/docs/sensorhub.md @@ -15,7 +15,7 @@ npm search w2a-sensor Install any sensor by package name: ```bash -npm install {PackageName} +npm install ``` ## Publish a Sensor @@ -26,7 +26,7 @@ Build your sensor with the [W2A SDK](./build-a-sensor.md), then ship it to npm: npm publish ``` -That's the distribution. SensorHub indexes published sensors and surfaces them on the website — no separate publish step today. +`npm publish` handles distribution. To get listed on SensorHub, submit your sensor at [world2agent.ai/hub/submit](https://world2agent.ai/hub/submit) — this is a one-time manual step per package; SensorHub does not auto-crawl npm today. Requirements: - An npm account (this is the quality gate — npm identity = accountability) diff --git a/docs/signal-format.md b/docs/signal-format.md index d627126..2d77c03 100644 --- a/docs/signal-format.md +++ b/docs/signal-format.md @@ -111,9 +111,11 @@ Examples: | `market.quote.threshold_crossed` | market | quote | threshold_crossed | | `calendar.event.updated` | calendar | event | updated | +`action` must be a **past-tense verb** (`opened`, `mentioned`, `threshold_crossed`), never a base form or gerund (`open` ❌, `opening` ❌). Signals describe things that already happened, and consumers read the triple as a sentence — past tense is what makes it a sentence. + Open namespace — sensors coin their own triples. Consumers pattern-match on this string, so the triples a sensor emits are part of its public contract; treat them as you would a public API. -`domain` is the **abstract source space** (`messaging`, `repo`, `market`, `calendar`), not the platform name. The platform identity already lives in `source.source_type` — keeping the two orthogonal is what lets one handler match the same semantic event across platforms: `handler.on("messaging.message.mentioned")` catches Slack, Discord, Lark, and Teams alike. A sensor for GitHub stars emits `repo.repo.starred` (with `source.source_type: "github"`), not `github.repo.starred`. +`domain` is the **abstract source space** (`messaging`, `repo`, `market`, `calendar`), not the platform name. The platform identity already lives in `source.source_type` — keeping the two orthogonal is what lets one handler match the same semantic event across platforms: `handler.on("messaging.message.mentioned")` catches Slack, Discord, Lark, and Teams alike. A sensor for GitHub stars emits `repo.star.added` (with `source.source_type: "github"`), not `github.repo.starred`. `event.occurred_at` is when the underlying event happened. If the source doesn't expose it, fall back to `emitted_at`. diff --git a/skills/build-w2a-sensor/SKILL.md b/skills/build-w2a-sensor/SKILL.md index 764beb2..187c0ef 100644 --- a/skills/build-w2a-sensor/SKILL.md +++ b/skills/build-w2a-sensor/SKILL.md @@ -46,7 +46,7 @@ Pick the highest available option — **do not default to polling if a push chan List every triple it will emit as `domain.entity.action` (open namespace — you coin them). 
Examples: `messaging.message.received`, `repo.issue.created`, `market.quote.threshold_crossed`. These strings are your sensor's public contract — consumers pattern-match against them, so stability matters. -`domain` is the **abstract source space** (`messaging`, `repo`, `market`, `calendar`), **not the platform name**. The platform identity already lives in `source.source_type` (`"slack"`, `"github"`, `"jira"`). A Slack sensor emits `messaging.message.mentioned`, not `slack.message.mentioned` — that's what lets a consumer write `handler.on("messaging.message.mentioned")` once and match @-events from Slack, Discord, Lark, and Teams alike. `action` is a verb in past tense (`mentioned`, `opened`, `received`), not a gerund (`trending` ❌ → `trending_entered` ✅). +`domain` is the **abstract source space** (`messaging`, `repo`, `market`, `calendar`), **not the platform name**. The platform identity already lives in `source.source_type` (`"slack"`, `"github"`, `"jira"`). A Slack sensor emits `messaging.message.mentioned`, not `slack.message.mentioned` — that's what lets a consumer write `handler.on("messaging.message.mentioned")` once and match @-events from Slack, Discord, Lark, and Teams alike. `action` is a verb in past tense (`mentioned`, `opened`, `received`, `entered`), not a base form or gerund (`mention` ❌, `opening` ❌). ### 4. What config does the sensor need? From 8b3426b5720395efaad330126f57ffd815076301 Mon Sep 17 00:00:00 2001 From: "daibo@machinepulse.ai" Date: Sun, 26 Apr 2026 21:06:33 +0800 Subject: [PATCH 3/3] docs: replace community link placeholders with real handles Drop the Discord placeholder, set Twitter to @Karpo_AI, and point YouTube at the actual channel URL. Co-Authored-By: Claude Opus 4.7 (1M context) --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 969b07c..5f4d7ed 100644 --- a/README.md +++ b/README.md @@ -126,7 +126,7 @@ That's all it takes to share your sensor with the world — once published, it's ## Community -[Website](https://machinepulse.ai/) · [Discord]([DISCORD_LINK]) · [X / Twitter](https://x.com/machinepulse) · [YouTube]([YOUTUBE_LINK]) +[Website](https://machinepulse.ai/) · [X / Twitter](https://x.com/Karpo_AI) · [YouTube](https://www.youtube.com/channel/UCmuDMSxQp2LLJ4nrkPuCGQw)