[TOW-1299] App Isolation traits #159

sammuti · 2026-01-12T18:17:28Z

Introduces an ExecutionBackend trait abstraction to support multiple compute substrates (local subprocesses, Kubernetes
pods, etc.) through a uniform interface. Refactors execution to cleanly separate CLI local runs from tower-runner
server-side execution.

Changes

New abstraction layer (crates/tower-runtime/src/execution.rs)

ExecutionBackend trait - defines interface for compute substrates
ExecutionHandle trait - manages running executions (status, logs, termination)
ExecutionSpec - unified specification for execution requests
Supporting types: BundleRef, RuntimeConfig, CacheConfig, ResourceLimits, NetworkingSpec
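
To make the shape of the abstraction concrete, here is a minimal sketch of how the two traits and the spec might fit together. It is an illustration under stated assumptions, not the code in this PR: the real traits are async (the PR adds `async-trait`), while this version is synchronous and std-only so it can be read in isolation. The `Status` variants, the fields of `ExecutionSpec`, the `create` method name, `FakeBackend`, `FakeHandle`, and `run_and_stop` are all hypothetical.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Lifecycle states an execution can report (variant names are assumptions).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Preparing,
    Running,
    Exited,
    Crashed,
}

/// Where the code to run comes from; only a local variant is sketched here.
#[derive(Debug, Clone)]
pub enum BundleRef {
    Local { path: PathBuf },
}

/// Unified request handed to any backend (this field set is an assumption).
#[derive(Debug, Clone)]
pub struct ExecutionSpec {
    pub id: String,
    pub bundle: BundleRef,
    pub env: HashMap<String, String>,
}

/// A running execution: query its status and terminate it.
pub trait ExecutionHandle {
    fn status(&self) -> Status;
    fn terminate(&mut self);
}

/// A compute substrate that can start executions.
pub trait ExecutionBackend {
    type Handle: ExecutionHandle;
    fn create(&self, spec: ExecutionSpec) -> Result<Self::Handle, String>;
}

/// In-memory stand-in backend, used only to exercise the traits.
pub struct FakeBackend;

pub struct FakeHandle {
    status: Status,
}

impl ExecutionHandle for FakeHandle {
    fn status(&self) -> Status {
        self.status
    }

    fn terminate(&mut self) {
        self.status = Status::Exited;
    }
}

impl ExecutionBackend for FakeBackend {
    type Handle = FakeHandle;

    fn create(&self, spec: ExecutionSpec) -> Result<FakeHandle, String> {
        if spec.id.is_empty() {
            return Err("execution id must not be empty".to_string());
        }
        Ok(FakeHandle { status: Status::Running })
    }
}

/// Backend-agnostic caller: compiles against any `ExecutionBackend`.
pub fn run_and_stop<B: ExecutionBackend>(backend: &B, spec: ExecutionSpec) -> Result<Status, String> {
    let mut handle = backend.create(spec)?;
    handle.terminate();
    Ok(handle.status())
}
```

The point of the sketch is `run_and_stop`: it never names a concrete backend, which is the property that would let a subprocess backend and a Kubernetes backend be swapped behind the same call site.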

Backend implementations

SubprocessBackend - For tower-runner server-side native execution
- Implements ExecutionBackend with full logs() support for multiple consumers
- Used by tower-runner to stream logs to control plane
- Located in crates/tower-runtime/src/backends/subprocess.rs
CliBackend (new) - For CLI --local runs
- Simple single-consumer pattern matching original develop behavior
- Caller creates channel and owns receiver directly
- No complex logs() method needed
- Located in crates/tower-runtime/src/backends/cli.rs

Refactoring

Updated tower-cmd/run.rs to use CliBackend for --local runs
Removed dead code: AppLauncher struct (replaced by direct backend usage), unused imports

Design Rationale

The abstraction cleanly separates two distinct use cases:

CLI --local execution: Simple, single consumer pattern for user-facing CLI runs
tower-runner server execution: Multi-consumer pattern supporting log streaming to control plane
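
The difference between the two patterns can be sketched with standard-library channels. This is a simplified illustration, not the PR's implementation: `emit_lines`, `LogFanout`, and `publish` are hypothetical names, and the real backends presumably use async channels rather than `std::sync::mpsc`.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};

/// Single-consumer pattern: the caller creates the channel, passes the
/// sender in, and keeps the receiver for itself.
pub fn emit_lines(tx: Sender<String>, lines: Vec<String>) {
    for line in lines {
        // A send error means the only consumer is gone, so stop producing.
        if tx.send(line).is_err() {
            break;
        }
    }
}

/// Multi-consumer pattern: every call to `logs()` registers a new
/// subscriber, and each published line is delivered to all of them.
#[derive(Clone, Default)]
pub struct LogFanout {
    subscribers: Arc<Mutex<Vec<Sender<String>>>>,
}

impl LogFanout {
    /// Returns a fresh receiver that sees every line published from now on.
    pub fn logs(&self) -> Receiver<String> {
        let (tx, rx) = channel();
        self.subscribers.lock().unwrap().push(tx);
        rx
    }

    /// Sends a line to all live subscribers and forgets the closed ones.
    pub fn publish(&self, line: &str) {
        let mut subscribers = self.subscribers.lock().unwrap();
        subscribers.retain(|tx| tx.send(line.to_string()).is_ok());
    }
}
```

In the single-consumer case there is no `logs()` method at all, because ownership of the receiver already answers the question of who reads the output. The fan-out version pays for a lock and a subscriber list in exchange for letting several readers attach independently.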

Copilot

Pull request overview

This PR introduces an ExecutionBackend trait abstraction to enable Tower to support multiple compute substrates (local processes, Kubernetes pods, etc.) through a uniform interface, while refactoring existing local execution to implement this new abstraction.

Changes:

Added new execution abstraction layer with ExecutionBackend and ExecutionHandle traits
Implemented LocalBackend wrapping existing LocalApp functionality
Added dependencies async-trait and uuid for trait support and ID generation

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| crates/tower-runtime/src/execution.rs | Defines core execution traits, types, and abstractions for backend-agnostic execution management |
| crates/tower-runtime/src/local.rs | Implements `LocalBackend` and `LocalHandle` to adapt existing subprocess execution to new abstraction |
| crates/tower-runtime/src/lib.rs | Exports new execution module |
| crates/tower-runtime/src/errors.rs | Adds error variants for execution abstraction (`AppNotStarted`, `NoHandle`, `InvalidPackage`) |
| crates/tower-runtime/Cargo.toml | Adds `async-trait` and `uuid` dependencies |
| Cargo.toml | Defines workspace-level versions for `async-trait` and `uuid` |

Copilot · 2026-01-12T18:19:25Z

crates/tower-runtime/src/local.rs

+            package: match spec.bundle {
+                BundleRef::Local { path } => Package::from_unpacked_path(path).await,
+            },


The Package::from_unpacked_path call is not wrapped in error handling. If this operation fails, the error message will be generic. Consider adding context about which bundle path failed to load to improve debugging.
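
One way to attach that context is to map the low-level failure into an error variant that carries the path. The sketch below is an assumption-laden illustration: `load_manifest` stands in for `Package::from_unpacked_path` (whose real signature is not shown in this thread), `LoadError` and its fields are hypothetical, and `manifest.toml` is a made-up file name.

```rust
use std::fmt;
use std::path::{Path, PathBuf};

/// Hypothetical error type that records which bundle failed to load.
#[derive(Debug)]
pub enum LoadError {
    InvalidPackage { path: PathBuf, reason: String },
}

impl fmt::Display for LoadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            LoadError::InvalidPackage { path, reason } => {
                write!(f, "failed to load bundle at {}: {}", path.display(), reason)
            }
        }
    }
}

impl std::error::Error for LoadError {}

/// Stand-in loader: reads a manifest from the unpacked bundle directory
/// and wraps any I/O failure with the directory that was being loaded.
pub fn load_manifest(dir: &Path) -> Result<String, LoadError> {
    let manifest = dir.join("manifest.toml"); // hypothetical file name
    std::fs::read_to_string(&manifest).map_err(|err| LoadError::InvalidPackage {
        path: dir.to_path_buf(),
        reason: err.to_string(),
    })
}
```

With this shape, a missing or unreadable bundle produces a message that names the directory, instead of a bare I/O error.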

Copilot · 2026-01-12T18:19:25Z

crates/tower-runtime/src/local.rs

+            typical_cold_start_ms: 1000,     // ~1s for venv + sync
+            typical_warm_start_ms: 100,      // ~100ms with warm cache


These hardcoded timing estimates should be documented as approximate values that may vary based on system resources and bundle complexity. Consider adding a comment explaining these are typical values, not guarantees.

Suggested change

-            typical_cold_start_ms: 1000,     // ~1s for venv + sync
-            typical_warm_start_ms: 100,      // ~100ms with warm cache
+            // The following timing values are typical, approximate estimates and may vary
+            // based on system resources, bundle complexity, and runtime conditions.
+            typical_cold_start_ms: 1000,     // ~1s for venv + sync on a typical development machine
+            typical_warm_start_ms: 100,      // ~100ms with a warm cache under typical conditions

Copilot · 2026-01-12T18:19:26Z

crates/tower-runtime/src/local.rs

+        loop {
+            let status = self.status().await?;
+            match status {
+                ExecutionStatus::Preparing | ExecutionStatus::Running => {
+                    tokio::time::sleep(Duration::from_millis(100)).await;
+                }
+                _ => return Ok(status),
+            }
+        }


The polling interval of 100ms is hardcoded. For long-running executions, this creates unnecessary overhead. Consider making the polling interval configurable or implementing an event-based notification mechanism instead of polling.
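
A notification-based alternative could look roughly like the following. It is a std-only sketch using a `Mutex` and `Condvar`; in async code a channel such as `tokio::sync::watch` plays a similar role. `StatusCell`, `set`, `wait_for_completion`, and the `Status` variants are hypothetical names, not part of this PR.

```rust
use std::sync::{Arc, Condvar, Mutex};

/// Execution states; `Exited` and `Crashed` are treated as terminal.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Preparing,
    Running,
    Exited,
    Crashed,
}

impl Status {
    pub fn is_terminal(self) -> bool {
        matches!(self, Status::Exited | Status::Crashed)
    }
}

/// Shared status slot: writers update it, waiters block until it is terminal.
#[derive(Clone)]
pub struct StatusCell {
    inner: Arc<(Mutex<Status>, Condvar)>,
}

impl StatusCell {
    pub fn new(initial: Status) -> Self {
        Self { inner: Arc::new((Mutex::new(initial), Condvar::new())) }
    }

    /// Records a new status and wakes every waiter.
    pub fn set(&self, status: Status) {
        let (lock, cvar) = &*self.inner;
        *lock.lock().unwrap() = status;
        cvar.notify_all();
    }

    /// Blocks without polling until the status becomes terminal.
    pub fn wait_for_completion(&self) -> Status {
        let (lock, cvar) = &*self.inner;
        let guard = lock.lock().unwrap();
        let guard = cvar.wait_while(guard, |status| !status.is_terminal()).unwrap();
        *guard
    }
}
```

The waiter sleeps until a writer calls `set`, so a long-running execution costs no wakeups between state changes. The trade-off is that whatever observes the process must remember to publish every transition.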

Copilot · 2026-01-12T18:19:26Z

crates/tower-runtime/src/execution.rs

+    pub async fn status(&self) -> Result<ExecutionStatus, Error> {
+        self.app
+            .as_ref()
+            .ok_or(Error::AppNotStarted)?


The error message 'app not started' is vague. Consider a more descriptive error such as 'cannot get status: no app is currently running' to provide better context to users.

Copilot · 2026-01-12T18:19:26Z

crates/tower-runtime/src/execution.rs

+pub struct AppLauncher<A: App> {
+    backend: Arc<A::Backend>,
+    app: Option<A>,
+}


The AppLauncher struct and its methods lack documentation comments. Since this is a public API component of the new abstraction, it should include doc comments explaining its purpose, usage patterns, and lifecycle management responsibilities.

bradhe

Is this something that you want to land? It's pretty WIP-y it seems to me, has loads of duplicated stuff from elsewhere in the tower-runtime crate. I've left some comments for now, please let me know how you'd like to proceed.

crates/tower-runtime/src/execution.rs

bradhe · 2026-01-12T18:25:29Z

crates/tower-runtime/src/local.rs

+        let opts = StartOptions {
+            ctx: spec.telemetry_ctx,
+            package: match spec.bundle {
+                BundleRef::Local { path } => Package::from_unpacked_path(path).await,


This should be called PackageRef not BundleRef

sammuti · 2026-01-13T17:55:29Z

crates/tower-runtime/Cargo.toml

+uuid = { workspace = true }
+
+# K8s dependencies (optional)
+k8s-openapi = { version = "0.23", features = ["v1_31"], optional = true }


Update deps to latest

bradhe

Did another review here. Let's review my feedback synchronously.

bradhe · 2026-01-14T15:40:02Z

crates/tower-cmd/src/run.rs

 /// monitor_local_status is a helper function that will monitor the status of a given app and waits for
 /// it to progress to a terminal state.
-async fn monitor_local_status(app: Arc<Mutex<LocalApp>>) -> Status {
-    debug!("Starting status monitoring for LocalApp");
+async fn monitor_cli_status(handle: Arc<Mutex<tower_runtime::backends::cli::CliHandle>>) -> Status {


This was renamed to "cli", but runners in third-party infrastructure (e.g. self-hosted runners) will use local processes too, not Kubernetes...

bradhe · 2026-01-14T15:41:49Z

crates/tower-runtime/src/backends/k8s.rs

+        // Build container spec
+        // Note: In K8s, 'command' = entrypoint, 'args' = command
+        let container = Container {
+            name: "app".to_string(),
+            image: Some(spec.runtime.image.clone()),
+            env: Some(env_vars),
+            command: spec.runtime.entrypoint.clone(), // K8s command = entrypoint
+            args: spec.runtime.command.clone(),       // K8s args = command
+            volume_mounts: if volume_mounts.is_empty() {
+                None
+            } else {
+                Some(volume_mounts)
+            },
+            resources: Some(resources),
+            working_dir: Some("/app".to_string()),
+            ..Default::default()
+        };
+
+        // Build pod spec
+        let pod_spec = PodSpec {
+            containers: vec![container],
+            volumes: if volumes.is_empty() {
+                None
+            } else {
+                Some(volumes)
+            },
+            restart_policy: Some("Never".to_string()),
+            ..Default::default()
+        };
+
+        Ok(Pod {
+            metadata: k8s_openapi::apimachinery::pkg::apis::meta::v1::ObjectMeta {
+                name: Some(format!("tower-run-{}", spec.id)),
+                namespace: Some(self.namespace.clone()),
+                labels: Some(labels),
+                ..Default::default()
+            },
+            spec: Some(pod_spec),
+            ..Default::default()
+        })


I'm assuming a change for this is coming?

bradhe · 2026-01-14T15:44:11Z

crates/tower-runtime/src/execution.rs

+    /// Get current execution status
+    async fn status(&self) -> Result<Status, Error>;


Does this get the status of the execution environment setup or the status of the app that's running? Or both?

bradhe · 2026-01-14T15:45:57Z

crates/tower-runtime/src/backends/k8s.rs

+        // Create ConfigMap with bundle contents and get path mapping
+        let path_mapping = self.create_bundle_configmap(&spec).await?;


Just calling this out for myself that I expect this will go away.

bradhe · 2026-01-14T15:47:20Z

crates/tower-runtime/src/backends/k8s.rs

+        Ok(match phase.as_str() {
+            "Pending" => Status::None,
+            "Running" => Status::Running,
+            "Succeeded" => Status::Exited,


If this function is meant to get the status of the app in its lifecycle, this means that once the Pod is provisioned, it'll get marked as "Exited" right?
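
For reference when reading this mapping: Kubernetes defines the Pod phases Pending, Running, Succeeded, Failed, and Unknown, and Succeeded means all containers terminated successfully and will not be restarted, so it describes a finished workload rather than a freshly provisioned Pod. A complete mapping might be sketched as below. The `Crashed` variant and the handling of `Failed` and unrecognized phases are assumptions, since the excerpt above shows only the first three arms.

```rust
/// App-level status; `None`, `Running`, and `Exited` appear in the diff,
/// `Crashed` is an assumed variant for failed runs.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    None,
    Running,
    Exited,
    Crashed,
}

/// Maps a Kubernetes Pod phase string to an app-level status.
pub fn status_from_phase(phase: &str) -> Status {
    match phase {
        // Accepted by the cluster, but containers are not all running yet.
        "Pending" => Status::None,
        "Running" => Status::Running,
        // All containers terminated successfully and will not restart.
        "Succeeded" => Status::Exited,
        // At least one container terminated in failure.
        "Failed" => Status::Crashed,
        // "Unknown" or anything unrecognized: report no usable status.
        _ => Status::None,
    }
}
```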

bradhe · 2026-01-14T15:48:14Z

crates/tower-runtime/src/backends/k8s.rs

+            if tokio::time::timeout(std::time::Duration::from_secs(60), condition)
+                .await
+                .is_ok()


What happens if a container takes longer than 60 seconds to log?

bradhe · 2026-01-14T15:48:31Z

crates/tower-runtime/src/backends/k8s.rs

+                            line,
+                        };
+                        if tx.send(output).is_err() {
+                            break;


probs wanna log the error?
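
A minimal sketch of what logging the failure could look like, using a standard-library channel and `eprintln!` in place of whatever channel type and logging macro the backend actually uses. `forward_lines` is a hypothetical helper.

```rust
use std::sync::mpsc::Sender;

/// Forwards lines to the consumer and returns how many were delivered.
/// If the receiver has been dropped, the reason is logged before stopping.
pub fn forward_lines<I>(lines: I, tx: &Sender<String>) -> usize
where
    I: IntoIterator<Item = String>,
{
    let mut forwarded = 0;
    for line in lines {
        if let Err(err) = tx.send(line) {
            // Stand-in for a structured log call such as a `warn!` macro.
            eprintln!("log receiver dropped after {forwarded} lines, stopping stream: {err}");
            break;
        }
        forwarded += 1;
    }
    forwarded
}
```

A dropped receiver is often a normal shutdown rather than a fault, so a low log level is probably enough here.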

bradhe · 2026-01-14T15:49:04Z

crates/tower-runtime/src/backends/k8s.rs

+    async fn terminate(&mut self) -> Result<(), Error> {
+        let pods: Api<Pod> = Api::namespaced(self.client.clone(), &self.namespace);
+
+        pods.delete(&self.pod_name, &DeleteParams::default())
+            .await
+            .map_err(|_| Error::TerminateFailed)?;
+
+        Ok(())
+    }


Does SubprocessHandle guarantee the process is dead by the end of terminate or is it fire and forget?
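
For a subprocess, the difference between the two behaviours comes down to whether the kill signal is followed by a wait. The sketch below shows the guaranteed variant with `std::process`. It is not the PR's `SubprocessHandle`, whose internals are not shown in this thread, and `terminate_and_reap` is a hypothetical name.

```rust
use std::io;
use std::process::{Child, ExitStatus};

/// Kills the child and blocks until the OS reports that it has exited,
/// so the process is guaranteed to be gone when this returns `Ok`.
pub fn terminate_and_reap(child: &mut Child) -> io::Result<ExitStatus> {
    match child.kill() {
        Ok(()) => {}
        // Older Rust versions report an already-exited child as InvalidInput;
        // that is not a failure here, the child still gets reaped below.
        Err(err) if err.kind() == io::ErrorKind::InvalidInput => {}
        Err(err) => return Err(err),
    }
    // Without this wait, `kill` alone is fire-and-forget.
    child.wait()
}
```

Calling only `kill` sends the signal and returns immediately. Adding `wait` is what turns termination into a synchronous guarantee, at the cost of blocking the caller until the process is reaped.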

bradhe · 2026-01-14T15:49:33Z

crates/tower-runtime/src/backends/k8s.rs

+        // Delete pod
+        self.terminate().await?;


Cleanup is typically called after the app is already terminated/exited.

bradhe · 2026-01-14T15:50:43Z

crates/tower-runtime/src/local.rs

Do we need a k8s.rs in here for a kubernetes app now?

sammuti requested review from bradhe, Copilot, giray123, jo-sm, konstantinoscs and socksy January 12, 2026 18:17

Copilot AI reviewed Jan 12, 2026

bradhe reviewed Jan 12, 2026

sammuti changed the title ~~[TOW-1299] App Isolation traits~~ [WIP][TOW-1299] App Isolation traits Jan 12, 2026

sammuti added 9 commits January 13, 2026 12:13

impl first iter

b74ea44

Working k8s backend

64c8956

move k8s backend to tower-runner

079982a

Minor

164c3c8

Remove cache abstractions

c066ab6

Refactor run.rs correctly

3fb7831

Refactor back the k8s runtime

03c9a6d

minor

ef57bf7

Make AppLauncher generic over ExecutionBackend instead of app

7432ed6

sammuti force-pushed the feature/tow-1299 branch from 6183c33 to 7432ed6 Compare January 13, 2026 17:53

sammuti commented Jan 13, 2026

sammuti added 5 commits January 14, 2026 11:08

Renaming local -> subprocess

d741dac

CliBackend

26369da

remove AppLauncher

948d901

minor

3b2947c

Update deps

6d8860d

sammuti changed the title ~~[WIP][TOW-1299] App Isolation traits~~ [TOW-1299] App Isolation traits Jan 14, 2026

sammuti added 2 commits January 14, 2026 13:29

dep updates

bf8b042

BundleRef -> PackageRef

ad2c197

bradhe reviewed Jan 14, 2026

3 participants
