Merged
22 changes: 6 additions & 16 deletions README.md
@@ -8,15 +8,15 @@

 bubseek turns fragmented data across operational systems, repositories, and agent runtime traces into **explainable, actionable, and shareable insights** without heavy ETL. It keeps the Bub runtime and extension model while packaging a practical default distribution for real deployments.

-`bubseek` now boots through a single distribution entry point and targets SeekDB/OceanBase tape storage through the SQLAlchemy URL or `OCEANBASE_*` settings.
+`bubseek` packages a practical Bub distribution with SeekDB/OceanBase defaults, bundled channels, and builtin skills, without adding a second CLI surface on top of `bub`.

 ## Features

 - **Lightweight and on-demand** — Trigger analysis when needed instead of maintaining large offline pipelines.
 - **Explainability first** — Conclusions are returned together with agent reasoning context.
 - **Cloud-edge ready** — Supports distributed deployment and local execution boundaries.
 - **Agent observability** — Treats agent behavior as governed, inspectable runtime data.
-- **Bub-compatible** — Forwards Bub commands directly; no fork of the core runtime.
+- **Bub-compatible** — Uses Bub directly as the runtime and command surface; no fork of the core runtime.

 ## Quick start

@@ -26,23 +26,15 @@ Requires [uv](https://docs.astral.sh/uv/) (recommended) or pip, and Python 3.12+
 git clone https://github.com/ob-labs/bubseek.git
 cd bubseek
 uv sync
-uv run bubseek --help
-uv run bubseek chat
+uv run bub --help
+uv run bub chat
 ```

-If your runtime reads credentials from `.env`, bubseek forwards them to the Bub subprocess:
-
-```dotenv
-BUB_MODEL=openrouter:qwen/qwen3-coder-next
-BUB_API_KEY=sk-or-v1-...
-BUB_API_BASE=https://openrouter.ai/api/v1
-```
-
-Configure SeekDB or OceanBase before running `bubseek`, using `BUB_TAPESTORE_SQLALCHEMY_URL=mysql+oceanbase://...` or the matching `OCEANBASE_*` variables.
+Configure SeekDB or OceanBase before running `bubseek`, using `BUB_TAPESTORE_SQLALCHEMY_URL=mysql+oceanbase://...`.

 ## Add contrib

-Contrib packages remain standard Python packages. Add them as normal dependencies. The bundled channel extras resolve from GitHub-hosted `bub-contrib` packages instead of local workspace packages.
+Contrib packages remain standard Python packages. Add them as normal dependencies. bubseek ships its built-in channels and marimo support by default, and resolves bundled contrib packages from GitHub-hosted `bub-contrib` packages instead of local workspace packages.

 ```toml
 [project]
@@ -58,8 +50,6 @@ Then sync your environment:
 uv sync
 ```

-- Optional extras: Feishu `uv sync --extra feishu`, DingTalk `uv sync --extra dingtalk`, WeChat `uv sync --extra wechat`, Discord `uv sync --extra discord`, Marimo `uv sync --extra marimo`.
-
 ## Documentation

 ## Development
6 changes: 3 additions & 3 deletions contrib/bubseek-marimo/README.md
@@ -12,15 +12,15 @@ Marimo channel for Bub — native marimo dashboard with chat and insights index.
 ## Installation

 ```bash
-uv sync --extra marimo
+uv sync
 # or
-pip install bubseek[marimo]
+pip install .
 ```

 ## Gateway

 ```bash
-bubseek gateway --enable-channel marimo
+bub gateway --enable-channel marimo
 ```

 Open `http://localhost:2718/` — marimo gallery. Click **dashboard** for chat + index. The dashboard queues turns asynchronously and refreshes transcript events from the channel backend.
2 changes: 1 addition & 1 deletion contrib/bubseek-marimo/scripts/verify_marimo.sh
@@ -3,5 +3,5 @@
 # Requires: .env with OPENROUTER_API_KEY (or equivalent) for chat.
 set -e
 cd "$(dirname "$0")/../.."
-uv sync --extra marimo
+uv sync
 uv run pytest contrib/bubseek-marimo/tests/test_marimo_e2e.py -v "$@"
14 changes: 5 additions & 9 deletions contrib/bubseek-marimo/src/bubseek_marimo/channel.py
@@ -104,17 +104,13 @@ def _insights_dir(self) -> Path:

     def _tapestore_url(self) -> str:
         if resolve_tapestore_url is not None:
-            return resolve_tapestore_url(self._workspace_dir())
-        env = env_with_workspace_dotenv(self._workspace_dir()) if env_with_workspace_dotenv else self._marimo_env()
-        url = (env.get("BUB_TAPESTORE_SQLALCHEMY_URL") or "").strip()
+            url = resolve_tapestore_url(self._workspace_dir())
+        else:
+            env = env_with_workspace_dotenv(self._workspace_dir()) if env_with_workspace_dotenv else self._marimo_env()
+            url = (env.get("BUB_TAPESTORE_SQLALCHEMY_URL") or "").strip()
         if url:
             return url
-        host = (env.get("OCEANBASE_HOST") or "127.0.0.1").strip()
-        port = int((env.get("OCEANBASE_PORT") or "2881").strip())
-        user = (env.get("OCEANBASE_USER") or "root").strip()
-        password = env.get("OCEANBASE_PASSWORD") or ""
-        database = (env.get("OCEANBASE_DATABASE") or "bub").strip()
-        return f"mysql+oceanbase://{user}:{password}@{host}:{port}/{database}"
+        raise RuntimeError("BUB_TAPESTORE_SQLALCHEMY_URL is required for the marimo channel")

     def _ensure_seed_notebooks(self) -> None:
         insights_dir = self._insights_dir()
34 changes: 33 additions & 1 deletion contrib/bubseek-marimo/tests/test_marimo_e2e.py
@@ -12,6 +12,7 @@
 import sys
 import time
 from pathlib import Path
+from types import ModuleType
 from urllib.parse import urlsplit

 import pytest
@@ -39,6 +40,17 @@ async def _noop_handler(*_args, **_kwargs) -> None:
     return None


+def _stub_bubseek_oceanbase(monkeypatch: pytest.MonkeyPatch) -> None:
+    monkeypatch.setitem(sys.modules, "bubseek.oceanbase", ModuleType("bubseek.oceanbase"))
+
+
+def _require_tapestore_url() -> str:
+    url = (os.environ.get("BUB_TAPESTORE_SQLALCHEMY_URL") or "").strip()
+    if not url:
+        pytest.skip("BUB_TAPESTORE_SQLALCHEMY_URL is required for marimo gateway tests")
+    return url
+
+
 def _port_ready(host: str, port: int, timeout: float = 2.0) -> bool:
     try:
         with socket.create_connection((host, port), timeout=timeout):
@@ -91,10 +103,12 @@ def _assert_notebook_loads(filename: str) -> tuple[int, str]:


 def test_workspace_resolution_priority(monkeypatch, tmp_path) -> None:
+    _stub_bubseek_oceanbase(monkeypatch)
     from bubseek_marimo.channel import MarimoChannel

     marimo_workspace = tmp_path / "marimo-workspace"
     bubb_workspace = tmp_path / "bub-workspace"
+    monkeypatch.setenv("BUB_TAPESTORE_SQLALCHEMY_URL", "mysql+oceanbase://seek:secret@seekdb.example:2881/analytics")
     monkeypatch.setenv("BUB_MARIMO_WORKSPACE", str(marimo_workspace))
     monkeypatch.setenv("BUB_WORKSPACE_PATH", str(bubb_workspace))

@@ -105,8 +119,10 @@ def test_workspace_resolution_priority(monkeypatch, tmp_path) -> None:


 def test_workspace_resolution_falls_back_to_cwd(monkeypatch, tmp_path) -> None:
+    _stub_bubseek_oceanbase(monkeypatch)
     from bubseek_marimo.channel import MarimoChannel

+    monkeypatch.setenv("BUB_TAPESTORE_SQLALCHEMY_URL", "mysql+oceanbase://seek:secret@seekdb.example:2881/analytics")
     monkeypatch.delenv("BUB_MARIMO_WORKSPACE", raising=False)
     monkeypatch.delenv("BUB_WORKSPACE_PATH", raising=False)
     monkeypatch.chdir(tmp_path)
@@ -120,13 +136,29 @@ def test_workspace_resolution_falls_back_to_cwd(monkeypatch, tmp_path) -> None:
     assert channel._insights_dir() == tmp_path.resolve() / "insights"


+def test_marimo_channel_requires_explicit_tapestore_url(monkeypatch, tmp_path) -> None:
+    _stub_bubseek_oceanbase(monkeypatch)
+    from bubseek_marimo.channel import MarimoChannel
+
+    monkeypatch.delenv("BUB_TAPESTORE_SQLALCHEMY_URL", raising=False)
+    monkeypatch.delenv("BUB_MARIMO_WORKSPACE", raising=False)
+    monkeypatch.delenv("BUB_WORKSPACE_PATH", raising=False)
+    monkeypatch.chdir(tmp_path)
+    monkeypatch.setattr("bubseek_marimo.channel.discover_project_root", lambda start: None)
+    monkeypatch.setattr("bubseek_marimo.channel._discover_project_root_fallback", lambda start: None)
+
+    with pytest.raises(RuntimeError, match="BUB_TAPESTORE_SQLALCHEMY_URL is required"):
+        MarimoChannel(_noop_handler)
+
+
 @pytest.fixture(scope="module")
 def gateway_process():
     """Start gateway with marimo channel, yield process, cleanup on teardown."""
     global PORT, MARIMO_PORT

     workspace = REPO_ROOT
     env = os.environ.copy()
+    env["BUB_TAPESTORE_SQLALCHEMY_URL"] = _require_tapestore_url()
     PORT = _pick_free_port()
     MARIMO_PORT = _pick_free_port()
     while MARIMO_PORT == PORT:
@@ -142,7 +174,7 @@ def gateway_process():
         pytest.fail("uv executable is required for marimo gateway tests")

     proc = subprocess.Popen( # noqa: S603
-        [uv_executable, "run", "bubseek", "gateway", "--enable-channel", "marimo"],
+        [uv_executable, "run", "bub", "gateway", "--enable-channel", "marimo"],
         cwd=str(REPO_ROOT),
         env=env,
         stdout=subprocess.DEVNULL,
4 changes: 2 additions & 2 deletions contrib/bubseek-schedule/README.md
@@ -37,7 +37,7 @@ dependencies = [
 - **`load_state` starts the scheduler** on the first inbound message. That way `bub chat` (CLI-only: only the `cli` channel is enabled) still persists jobs to SeekDB. Previously, `AsyncIOScheduler` was only started by the `schedule` channel, so CLI chat left jobs in memory-only `_pending_jobs` and **nothing was written to `apscheduler_jobs`**.
 - The channel name is `schedule`. Enabling it in `bub gateway` is optional for persistence; it still starts/stops the scheduler cleanly when you use gateway with that channel.
 - Jobs are persisted to:
-  - **OceanBase/SeekDB**: Same URL as the tape store (`BUB_TAPESTORE_SQLALCHEMY_URL` / `OCEANBASE_*`), table `apscheduler_jobs`.
+  - **OceanBase/SeekDB**: Same URL as the tape store (`BUB_TAPESTORE_SQLALCHEMY_URL`), table `apscheduler_jobs`.

 ## Provided Tools

@@ -47,7 +47,7 @@ dependencies = [

 ## Debug: job in chat but not in Marimo kanban / DB

-The gateway resolves the job store URL from `BUB_TAPESTORE_SQLALCHEMY_URL` or workspace `.env` (`OCEANBASE_*`). Marimo must use the **same** URL. If `insights/schedule_kanban.py` pointed at the default `127.0.0.1:2881/bub` while your `.env` uses another host/db, the table will look empty.
+The gateway resolves the job store URL from `BUB_TAPESTORE_SQLALCHEMY_URL` in the workspace `.env` or process environment. Marimo must use the **same** URL.

 From the bubseek repo root:

8 changes: 6 additions & 2 deletions contrib/bubseek-schedule/src/tests/test_bubseek_schedule.py
@@ -18,9 +18,11 @@ def _seekdb_url() -> str:
 def test_jobstore_roundtrip():
     """Test jobstore roundtrip via APScheduler on SeekDB/OceanBase."""
     from apscheduler.schedulers.background import BackgroundScheduler
+
+    url = _seekdb_url()
     from bubseek_schedule.jobstore import OceanBaseJobStore

-    store = OceanBaseJobStore(url=_seekdb_url(), tablename="apscheduler_jobs_test_roundtrip")
+    store = OceanBaseJobStore(url=url, tablename="apscheduler_jobs_test_roundtrip")
     scheduler = BackgroundScheduler(jobstores={"default": store})
     scheduler.start()

@@ -38,9 +40,11 @@ def test_jobstore_roundtrip():
 def test_jobstore_get_due_jobs():
     """Test get_due_jobs and get_next_run_time."""
     from apscheduler.schedulers.background import BackgroundScheduler
+
+    url = _seekdb_url()
     from bubseek_schedule.jobstore import OceanBaseJobStore

-    store = OceanBaseJobStore(url=_seekdb_url(), tablename="apscheduler_jobs_test_due")
+    store = OceanBaseJobStore(url=url, tablename="apscheduler_jobs_test_due")
     scheduler = BackgroundScheduler(jobstores={"default": store})
     scheduler.start()

12 changes: 6 additions & 6 deletions docs/api-reference.md
@@ -2,14 +2,14 @@

 The public Python surface is intentionally small.

-## bubseek
+## bubseek.config

-Package root. Re-exports the CLI entry function.
+Configuration helpers for resolving tapestore settings.

-::: bubseek
+::: bubseek.config

-## bubseek.__main__
+## bubseek.database

-CLI entry point. `main()` forwards CLI arguments and `.env` values to the `bub` subprocess.
+Database bootstrap helpers used by maintenance scripts.

-::: bubseek.__main__
+::: bubseek.database
7 changes: 3 additions & 4 deletions docs/architecture.md
@@ -4,10 +4,9 @@ This page explains what bubseek is responsible for, and what it deliberately lea

 ## What bubseek does

-- provides the `bubseek` executable as a single bootstrap entry point over `bub`
-- forwards `.env` values to the Bub subprocess
 - standardizes tape storage on SeekDB/OceanBase
 - ships a small set of builtin skills with the package
+- bundles a practical set of contrib channels and tools by default
 - pins a practical default Bub runtime version

 ## What bubseek does not do
@@ -25,7 +24,7 @@ Bub remains the runtime, command surface, and extension host.

 ### bubseek

-bubseek is the distribution layer: packaging, bootstrap behavior, runtime defaults, and builtin skills.
+bubseek is the distribution layer: packaging, runtime defaults, plugin wiring, and builtin skills.

 ### Python packaging

@@ -35,7 +34,7 @@ Python packaging handles dependency resolution, lockfiles, and installation. Con

 From a user perspective, the benefit is simple: there is less to learn.

-- run `bubseek` the same way you would run `bub`
+- run `bub`
 - add contrib the same way you add any Python dependency
 - use builtin skills without an extra sync step
 - treat generated marimo notebooks as runtime artifacts under `insights/`, not committed templates
35 changes: 19 additions & 16 deletions docs/configuration.md
@@ -30,22 +30,25 @@ dependencies = [
 ]
 ```

-If you do not want them installed by default, put them under `optional-dependencies` instead:
+For bubseek itself, the official distribution keeps its built-in channels and marimo support in the default dependency set:

 ```toml
-[project.optional-dependencies]
-feishu = ["bub-feishu"]
-dingtalk = ["bub-dingtalk"]
-wechat = ["bub-wechat"]
-discord = ["bub-discord"]
-marimo = ["bubseek-marimo"]
+[project]
+dependencies = [
+    "bub",
+    "bub-feishu",
+    "bub-dingtalk",
+    "bub-wechat",
+    "bub-discord",
+    "bubseek-marimo",
+]
 ```

-Install with: `uv sync --extra feishu` / `pip install bubseek[feishu]` (Feishu); `uv sync --extra dingtalk` / `pip install bubseek[dingtalk]` (DingTalk); `uv sync --extra wechat` / `pip install bubseek[wechat]` ([WeChat](https://github.com/bubbuild/bub-contrib/tree/main/packages/bub-wechat)); `uv sync --extra discord` / `pip install bubseek[discord]` ([Discord](https://github.com/bubbuild/bub-contrib/tree/main/packages/bub-discord)); `uv sync --extra marimo` / `pip install bubseek[marimo]` (Marimo channel with bundled notebook skills).
+Install with the normal project sync or package install: `uv sync` / `pip install .`.

 ## Runtime credentials

-bubseek forwards `.env` values to the Bub subprocess. Bub reads `BUB_*` variables (see [Bub deployment](https://github.com/bubbuild/bub/blob/main/docs/deployment.md)).
+Bub reads `BUB_*` variables directly (see [Bub deployment](https://github.com/bubbuild/bub/blob/main/docs/deployment.md)).

 **Minimal OpenRouter setup:**

@@ -68,21 +71,21 @@ BUB_API_BASE=https://openrouter.ai/api/v1
 | `BUB_TELEGRAM_ALLOW_CHATS` | Comma-separated chat allowlist |
 | `BUB_SEARCH_OLLAMA_API_KEY` | Required for web.search tool (bundled) |
 | `BUB_SEARCH_OLLAMA_API_BASE` | Ollama API base (default: `https://ollama.com/api`) |
-| `BUB_FEISHU_APP_ID` | Required for Feishu channel (optional extra: `bubseek[feishu]`) |
+| `BUB_FEISHU_APP_ID` | Required for Feishu channel |
 | `BUB_FEISHU_APP_SECRET` | Required for Feishu channel |
-| `BUB_DINGTALK_CLIENT_ID` | AppKey for DingTalk channel (optional extra: `bubseek[dingtalk]`) |
+| `BUB_DINGTALK_CLIENT_ID` | AppKey for DingTalk channel |
 | `BUB_DINGTALK_CLIENT_SECRET` | AppSecret for DingTalk channel |
 | `BUB_DINGTALK_ALLOW_USERS` | Comma-separated staff_ids, or `*` for all |
-| WeChat token file | After `bub login wechat`, credentials live under `~/.bub/wechat_token.json` (optional extra: `bubseek[wechat]`); see [bub-wechat](https://github.com/bubbuild/bub-contrib/tree/main/packages/bub-wechat) |
-| `BUB_DISCORD_TOKEN` | Discord bot token (optional extra: `bubseek[discord]`); see [bub-discord](https://github.com/bubbuild/bub-contrib/tree/main/packages/bub-discord) |
+| WeChat token file | After `bub login wechat`, credentials live under `~/.bub/wechat_token.json`; see [bub-wechat](https://github.com/bubbuild/bub-contrib/tree/main/packages/bub-wechat) |
+| `BUB_DISCORD_TOKEN` | Discord bot token; see [bub-discord](https://github.com/bubbuild/bub-contrib/tree/main/packages/bub-discord) |
 | `BUB_DISCORD_ALLOW_USERS` | Optional comma-separated allowlist (user id / username / global name) |
 | `BUB_DISCORD_ALLOW_CHANNELS` | Optional comma-separated channel id allowlist |
 | `BUB_MARIMO_HOST` | Marimo channel bind host (default: `127.0.0.1`) |
 | `BUB_MARIMO_PORT` | Marimo channel bind port (default: `2718`) |
 | `BUB_MARIMO_WORKSPACE` | Workspace for insights (default: `BUB_WORKSPACE_PATH` or `.`) |
 | `BUB_TAPESTORE_SQLALCHEMY_URL` | SQLAlchemy tape store URL (bundled) |

-When `BUB_TAPESTORE_SQLALCHEMY_URL` is unset, bubseek builds a SeekDB/OceanBase URL from the `OCEANBASE_*` variables. Set either the full `mysql+oceanbase://...` URL or the `OCEANBASE_*` fields before running.
+Set `BUB_TAPESTORE_SQLALCHEMY_URL` to the full `mysql+oceanbase://...` URL before running any tapestore-backed features.

 ## Builtin skills

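
For setups that previously relied on the `OCEANBASE_*` variables, the full URL now has to be written out by hand. The following is a minimal sketch of composing it from the same five fields the removed fallback read (host, port, user, password, database). `build_tapestore_url` is a hypothetical helper, not part of bubseek, and percent-encoding the credentials is an added precaution that the removed fallback did not take.

```python
from urllib.parse import quote


def build_tapestore_url(
    host: str = "127.0.0.1",
    port: int = 2881,
    user: str = "root",
    password: str = "",
    database: str = "bub",
) -> str:
    """Compose a mysql+oceanbase:// URL from individual connection fields.

    Hypothetical helper; the defaults mirror the values the removed
    OCEANBASE_* fallback used.
    """
    # Encode reserved characters such as "@" or "/" so they cannot be
    # mistaken for URL delimiters.
    user_enc = quote(user, safe="")
    password_enc = quote(password, safe="")
    return f"mysql+oceanbase://{user_enc}:{password_enc}@{host}:{port}/{database}"


if __name__ == "__main__":
    # Print a line suitable for pasting into a workspace .env file.
    print(f"BUB_TAPESTORE_SQLALCHEMY_URL={build_tapestore_url(password='p@ss/word')}")
```

The printed value can be placed in the workspace `.env` or exported in the process environment, which are the two places the updated docs say the URL is read from.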
@@ -93,14 +96,14 @@ bubseek also vendors skills at build time via `pdm-build-skills`; these are merg
 - `friendly-python` and `piglet` from [PsiACE/skills](https://github.com/PsiACE/skills)
 - `plugin-creator` from [bub-contrib/.agents/skills/plugin-creator](https://github.com/bubbuild/bub-contrib/tree/main/.agents/skills/plugin-creator)

-The optional `bubseek[marimo]` extra provides:
+The bundled marimo support provides:
 - **MarimoChannel** — inbound WebSocket for gateway; chat dashboard at `http://0.0.0.0:2718/`
 - **marimo skill** — output data insights as marimo `.py` notebooks; index of charts in `{workspace}/insights/`
 - References [marimo-team/skills](https://github.com/marimo-team/skills) marimo-notebook conventions

 The dashboard and index are generated into `{workspace}/insights/` at runtime from one canonical template source. They should not be hand-edited inside the repository.

-Run `bubseek gateway --enable-channel marimo` to enable the marimo dashboard.
+Run `bub gateway --enable-channel marimo` to enable the marimo dashboard.

 ## Advanced: downstream skill packaging
