12 changes: 6 additions & 6 deletions docs/architecture/03_providers.md
@@ -2,7 +2,7 @@

## 1. Purpose

-The providers layer is Ergon's boundary between runtime code and external execution substrates. It owns four concerns: resolving `model_id` strings to `pydantic_ai.models.Model` instances, provisioning and tearing down E2B sandboxes via per-benchmark manager subclasses, surfacing sandbox state transitions as dashboard events, and publishing worker outputs as content-addressed blobs that evaluators can re-read. Everything that crosses the process boundary (LLM API, container runtime, blob storage) is routed through this layer so the runtime, workers, and evaluators stay substrate-agnostic.
+The provider-style boundaries are Ergon's adapters between runtime code and external execution substrates. Model resolution lives in the generation registry, while sandbox infrastructure now lives under `ergon_core.core.sandbox` because it owns lifecycle, instrumentation, event emission, and artifact publishing rather than just a third-party provider adapter.

## 2. Core abstractions

@@ -11,12 +11,12 @@ The providers layer is Ergon's boundary between runtime code and external execut
| `_BACKEND_REGISTRY` | module-level dict | `ergon_core/core/providers/generation/model_resolution.py` | Frozen shape; entries grow via registration. | Providers layer. |
| `resolve_model_target` | function | `ergon_core/core/providers/generation/model_resolution.py` | Public, frozen signature. Returns `ResolvedModel`. | Providers layer. |
| `register_model_backend` | function | `ergon_core/core/providers/generation/model_resolution.py` | Public, frozen signature. | Providers layer; callers are backend modules executing at import time. |
-| `BaseSandboxManager` | abstract class + singleton | `ergon_core/core/providers/sandbox/manager.py` | Shape stable; `event_sink` activation path in flux. | Providers layer. |
-| `DefaultSandboxManager` | concrete class | `ergon_core/core/providers/sandbox/manager.py` | Frozen. | Providers layer. |
+| `BaseSandboxManager` | abstract class + singleton | `ergon_core/core/sandbox/manager.py` | Shape stable; `event_sink` activation path in flux. | Sandbox domain. |
+| `DefaultSandboxManager` | concrete class | `ergon_core/core/sandbox/manager.py` | Frozen. | Sandbox domain. |
| `SWEBenchSandboxManager`, `MiniF2FSandboxManager`, `ResearchRubricsSandboxManager` | concrete subclasses | `ergon_builtins/` | Owned per benchmark; singletons. | Benchmark authors. |
-| `SandboxEventSink` | `typing.Protocol` | `ergon_core/core/providers/sandbox/event_sink.py` | Frozen protocol; activation path in flux. | Providers layer. |
-| `NoopSandboxEventSink`, `DashboardEmitterSandboxEventSink` | implementations | `ergon_core/core/providers/sandbox/event_sink.py` | Frozen. | Providers layer. |
-| `SandboxResourcePublisher` | class | `ergon_core/core/providers/sandbox/resource_publisher.py` | Frozen API; storage backend swappable via `ERGON_BLOB_ROOT`. | Providers layer. |
+| `SandboxEventSink` | `typing.Protocol` | `ergon_core/core/sandbox/event_sink.py` | Frozen protocol; activation path in flux. | Sandbox domain. |
+| `NoopSandboxEventSink`, `DashboardEmitterSandboxEventSink` | implementations | `ergon_core/core/sandbox/event_sink.py` | Frozen. | Sandbox domain. |
+| `SandboxResourcePublisher` | class | `ergon_core/core/sandbox/resource_publisher.py` | Frozen API; storage backend swappable via `ERGON_BLOB_ROOT`. | Sandbox domain. |
| `TransformersModel` | `pydantic_ai.models.Model` subclass | `ergon_builtins/ergon_builtins/models/transformers_backend.py` | Frozen. | ML team (TRL training loop callers). |

### 2.1 Generation registry
2 changes: 1 addition & 1 deletion docs/architecture/cross_cutting/artifacts.md
@@ -15,7 +15,7 @@ produces computed artifacts through `CriterionRuntime.run_command(...)`.

| Type | Location | Freeze | Owner |
|------|----------|--------|-------|
-| `SandboxResourcePublisher` | `ergon_core/core/providers/sandbox/resource_publisher.py` | Stable | Sandbox provider |
+| `SandboxResourcePublisher` | `ergon_core/core/sandbox/resource_publisher.py` | Stable | Sandbox domain |
| `RunResource` | ORM row; table `run_resources` | Stable wire shape | Persistence layer |
| `dashboard/resource.published` | Inngest event | Stable | Dashboard lane |
| `CriterionRuntime.read_resource(name)` | Proposed per RFC | Pending | Evaluator layer |
@@ -51,7 +51,7 @@ to a `type[BaseSandboxManager]` (not an instance). The cleanup function would
need to resolve the class and call the static method
`BaseSandboxManager.terminate_by_sandbox_id(sandbox_id)`.
`terminate_by_sandbox_id` is a `@staticmethod` at
-`ergon_core/ergon_core/core/providers/sandbox/manager.py:472-490` that calls
+`ergon_core/ergon_core/core/sandbox/manager.py:472-490` that calls
`AsyncSandbox.kill(sandbox_id=..., api_key=...)` directly via E2B, so no
instance is needed. However, `cleanup_cancelled_task_fn` currently has no
import path to `SANDBOX_MANAGERS`.
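
As an illustration of the resolution step described above, here is a minimal, self-contained sketch. It does not import Ergon or E2B: `BaseSandboxManager` and `SWEBenchSandboxManager` below are stub stand-ins, the `"swebench"` slug and the `release_sandbox` helper are hypothetical names, and the registry is assumed to map a slug to a manager class rather than an instance.

```python
import asyncio


class BaseSandboxManager:
    """Stub stand-in for the real class; only the static method is modelled."""

    killed: list[str] = []  # records ids so the sketch can be checked

    @staticmethod
    async def terminate_by_sandbox_id(sandbox_id: str) -> bool:
        # The real method is described as calling AsyncSandbox.kill(...) via E2B.
        # This stub only records the id and reports success.
        BaseSandboxManager.killed.append(sandbox_id)
        return True


class SWEBenchSandboxManager(BaseSandboxManager):
    """Stub subclass standing in for a per-benchmark manager."""


# Assumed registry shape: slug -> manager class (not an instance).
SANDBOX_MANAGERS: dict[str, type[BaseSandboxManager]] = {
    "swebench": SWEBenchSandboxManager,  # hypothetical slug
}


async def release_sandbox(slug: str, sandbox_id: str) -> bool:
    """Resolve the manager class for a slug and terminate without an instance."""
    manager_cls = SANDBOX_MANAGERS.get(slug)
    if manager_cls is None:
        # Unknown benchmark slug: nothing to terminate.
        return False
    # Static method, so it is called on the class itself.
    return await manager_cls.terminate_by_sandbox_id(sandbox_id)


result = asyncio.run(release_sandbox("swebench", "sbx-123"))
```

Because the method is static, the cleanup path in this sketch never constructs a manager, which is why a `type[BaseSandboxManager]` registry value is sufficient.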
@@ -278,7 +278,7 @@ import logging
import inngest

from ergon_builtins.registry import SANDBOX_MANAGERS
-from ergon_core.core.providers.sandbox.manager import BaseSandboxManager
+from ergon_core.core.sandbox.manager import BaseSandboxManager
from ergon_core.core.runtime.events.task_events import TaskCancelledEvent
from ergon_core.core.runtime.inngest_client import RUN_CANCEL, inngest_client
from ergon_core.core.runtime.services.task_cleanup_dto import CleanupResult
@@ -712,13 +712,13 @@ class TestReleaseSandboxStep:
     async def test_releases_sandbox_when_fields_present(self) -> None:
         """terminate_by_sandbox_id called exactly once for valid payload."""
         with patch(
-            "ergon_core.core.providers.sandbox.manager.BaseSandboxManager"
+            "ergon_core.core.sandbox.manager.BaseSandboxManager"
             ".terminate_by_sandbox_id",
             new_callable=AsyncMock,
             return_value=True,
         ) as mock_terminate:
             from ergon_builtins.registry import SANDBOX_MANAGERS
-            from ergon_core.core.providers.sandbox.manager import BaseSandboxManager
+            from ergon_core.core.sandbox.manager import BaseSandboxManager

             # Any known slug from SANDBOX_MANAGERS
             slug = next(iter(SANDBOX_MANAGERS))