-
Notifications
You must be signed in to change notification settings - Fork 1
[TOW-1299] App Isolation traits #159
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces an ExecutionBackend trait abstraction to enable Tower to support multiple compute substrates (local processes, Kubernetes pods, etc.) through a uniform interface, while refactoring existing local execution to implement this new abstraction.
Changes:
- Added new execution abstraction layer with
ExecutionBackendandExecutionHandletraits - Implemented
LocalBackendwrapping existingLocalAppfunctionality - Added dependencies
async-traitanduuidfor trait support and ID generation
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| crates/tower-runtime/src/execution.rs | Defines core execution traits, types, and abstractions for backend-agnostic execution management |
| crates/tower-runtime/src/local.rs | Implements LocalBackend and LocalHandle to adapt existing subprocess execution to new abstraction |
| crates/tower-runtime/src/lib.rs | Exports new execution module |
| crates/tower-runtime/src/errors.rs | Adds error variants for execution abstraction (AppNotStarted, NoHandle, InvalidPackage) |
| crates/tower-runtime/Cargo.toml | Adds async-trait and uuid dependencies |
| Cargo.toml | Defines workspace-level versions for async-trait and uuid |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
crates/tower-runtime/src/local.rs
Outdated
| package: match spec.bundle { | ||
| BundleRef::Local { path } => Package::from_unpacked_path(path).await, | ||
| }, |
Copilot
AI
Jan 12, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Package::from_unpacked_path call is not wrapped in error handling. If this operation fails, the error message will be generic. Consider adding context about which bundle path failed to load to improve debugging.
crates/tower-runtime/src/local.rs
Outdated
| typical_cold_start_ms: 1000, // ~1s for venv + sync | ||
| typical_warm_start_ms: 100, // ~100ms with warm cache |
Copilot
AI
Jan 12, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These hardcoded timing estimates should be documented as approximate values that may vary based on system resources and bundle complexity. Consider adding a comment explaining these are typical values, not guarantees.
| typical_cold_start_ms: 1000, // ~1s for venv + sync | |
| typical_warm_start_ms: 100, // ~100ms with warm cache | |
| // The following timing values are typical, approximate estimates and may vary | |
| // based on system resources, bundle complexity, and runtime conditions. | |
| typical_cold_start_ms: 1000, // ~1s for venv + sync on a typical development machine | |
| typical_warm_start_ms: 100, // ~100ms with a warm cache under typical conditions |
crates/tower-runtime/src/local.rs
Outdated
| loop { | ||
| let status = self.status().await?; | ||
| match status { | ||
| ExecutionStatus::Preparing | ExecutionStatus::Running => { | ||
| tokio::time::sleep(Duration::from_millis(100)).await; | ||
| } | ||
| _ => return Ok(status), | ||
| } | ||
| } |
Copilot
AI
Jan 12, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The polling interval of 100ms is hardcoded. For long-running executions, this creates unnecessary overhead. Consider making the polling interval configurable or implementing an event-based notification mechanism instead of polling.
| pub async fn status(&self) -> Result<ExecutionStatus, Error> { | ||
| self.app | ||
| .as_ref() | ||
| .ok_or(Error::AppNotStarted)? |
Copilot
AI
Jan 12, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message 'app not started' is vague. Consider a more descriptive error such as 'cannot get status: no app is currently running' to provide better context to users.
| pub struct AppLauncher<A: App> { | ||
| backend: Arc<A::Backend>, | ||
| app: Option<A>, | ||
| } |
Copilot
AI
Jan 12, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The AppLauncher struct and its methods lack documentation comments. Since this is a public API component of the new abstraction, it should include doc comments explaining its purpose, usage patterns, and lifecycle management responsibilities.
bradhe
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this something that you want to land? It's pretty WIP-y it seems to me, has loads of duplicated stuff from elsewhere in the tower-runtime crate. I've left some comments for now, please let me know how you'd like to proceed.
crates/tower-runtime/src/local.rs
Outdated
| let opts = StartOptions { | ||
| ctx: spec.telemetry_ctx, | ||
| package: match spec.bundle { | ||
| BundleRef::Local { path } => Package::from_unpacked_path(path).await, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be called PackageRef not BundleRef
6183c33 to
7432ed6
Compare
crates/tower-runtime/Cargo.toml
Outdated
| uuid = { workspace = true } | ||
|
|
||
| # K8s dependencies (optional) | ||
| k8s-openapi = { version = "0.23", features = ["v1_31"], optional = true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update deps to latest
bradhe
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did another review here. Let's review my feedback synchronously.
| /// monitor_local_status is a helper function that will monitor the status of a given app and waits for | ||
| /// it to progress to a terminal state. | ||
| async fn monitor_local_status(app: Arc<Mutex<LocalApp>>) -> Status { | ||
| debug!("Starting status monitoring for LocalApp"); | ||
| async fn monitor_cli_status(handle: Arc<Mutex<tower_runtime::backends::cli::CliHandle>>) -> Status { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was renamed to "cli" but runner in third party infrastructure (e.g. self-hosted runners) will use local processes too, not Kubernetes...
| // Build container spec | ||
| // Note: In K8s, 'command' = entrypoint, 'args' = command | ||
| let container = Container { | ||
| name: "app".to_string(), | ||
| image: Some(spec.runtime.image.clone()), | ||
| env: Some(env_vars), | ||
| command: spec.runtime.entrypoint.clone(), // K8s command = entrypoint | ||
| args: spec.runtime.command.clone(), // K8s args = command | ||
| volume_mounts: if volume_mounts.is_empty() { | ||
| None | ||
| } else { | ||
| Some(volume_mounts) | ||
| }, | ||
| resources: Some(resources), | ||
| working_dir: Some("/app".to_string()), | ||
| ..Default::default() | ||
| }; | ||
|
|
||
| // Build pod spec | ||
| let pod_spec = PodSpec { | ||
| containers: vec![container], | ||
| volumes: if volumes.is_empty() { | ||
| None | ||
| } else { | ||
| Some(volumes) | ||
| }, | ||
| restart_policy: Some("Never".to_string()), | ||
| ..Default::default() | ||
| }; | ||
|
|
||
| Ok(Pod { | ||
| metadata: k8s_openapi::apimachinery::pkg::apis::meta::v1::ObjectMeta { | ||
| name: Some(format!("tower-run-{}", spec.id)), | ||
| namespace: Some(self.namespace.clone()), | ||
| labels: Some(labels), | ||
| ..Default::default() | ||
| }, | ||
| spec: Some(pod_spec), | ||
| ..Default::default() | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming a change for this is coming?
| /// Get current execution status | ||
| async fn status(&self) -> Result<Status, Error>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this get the status of the execution environment setup or the status of the app that's running? Or both?
| // Create ConfigMap with bundle contents and get path mapping | ||
| let path_mapping = self.create_bundle_configmap(&spec).await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just calling this out for myself that I expect this will go away.
| Ok(match phase.as_str() { | ||
| "Pending" => Status::None, | ||
| "Running" => Status::Running, | ||
| "Succeeded" => Status::Exited, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this function is meant to get the status of the app in it's lifecycle, this means that once the Pod is provisioned, it'll get marked as "Exited" right?
| if tokio::time::timeout(std::time::Duration::from_secs(60), condition) | ||
| .await | ||
| .is_ok() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if a container takes longer than 60 seconds to log?
| line, | ||
| }; | ||
| if tx.send(output).is_err() { | ||
| break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probs wanna log the error?
| async fn terminate(&mut self) -> Result<(), Error> { | ||
| let pods: Api<Pod> = Api::namespaced(self.client.clone(), &self.namespace); | ||
|
|
||
| pods.delete(&self.pod_name, &DeleteParams::default()) | ||
| .await | ||
| .map_err(|_| Error::TerminateFailed)?; | ||
|
|
||
| Ok(()) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does SubprocessHandle guarantee the process is dead by the end of terminate or is it fire and forget?
| // Delete pod | ||
| self.terminate().await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cleanup is typically called after the app is already terminated/exited.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a k8s.rs in here for a kubernetes app now?
Introduces ExecutionBackend trait abstraction to support multiple compute substrates (local subprocesses, Kubernetes
pods, etc.) through a uniform interface. Refactors execution to cleanly separate CLI local runs from tower-runner
server-side execution.
Changes
New abstraction layer (crates/tower-runtime/src/execution.rs)
Backend implementations
Refactoring
Design Rationale
The abstraction cleanly separates two distinct use cases: