Idea: parallel sub-goal execution with a lightweight DAG scheduler #5
Replies: 3 comments
---
Thanks for taking the time to dig into our architecture - and this is our first discussion, too! This is a well-thought-out proposal, and the DAG pattern itself is solid. You're right that the patrol cron creates unnecessary serialization. We've been thinking about this exact bottleneck, though our approach is heading in a slightly different direction than shell-level DAG scheduling.

The short version: rather than adding a scheduler layer between the cron and the agents, we're working on making goal decomposition itself dependency-aware at the GoalOps framework level. The idea is that when a goal gets decomposed, the controller already understands which sub-goals are independent and can dispatch them in parallel through the existing session system, with no separate scheduler process needed. Think of it as pushing the DAG logic up into the orchestration layer rather than down into shell scripts.

That said, a few things from your proposal are genuinely useful:
---
glad this resonated. the mid-execution re-planning question is one we spent a lot of time on. our answer is basically three layers:
the key insight for us was that full re-planning is almost never needed if your tasks are small enough. a 15-minute task that fails and retries with fresh context costs less than a re-planning step that touches the whole graph. we basically chose to make failure cheap rather than prediction perfect. the one case where this breaks down is when upstream output fundamentally changes the approach (e.g. the architect decides to use a different database). for that we have manual plan editing but no automatic detection yet - that's still a gap.
---
Parallel sub-goal execution with a DAG scheduler is the right architecture. A few production lessons from running this at scale:

**Cost estimation before scheduling.** When you have 5 parallel sub-goals, the DAG scheduler should estimate the cost of each before committing to execute them in parallel. If sub-goals 3 and 4 together would exceed the remaining budget, the scheduler should serialize them or skip the lower-priority one, rather than discovering the budget problem mid-execution.

**Dependency tracking must include "soft" dependencies.** Hard dependencies (B can't start until A finishes) are easy. But agents also have soft dependencies: B's output quality improves if A's context is available, even though B could run without it. These soft dependencies often determine the optimal serialization order even when parallelization is technically possible.

**Budget propagation through the DAG.** The root task has a budget. When it spawns 5 parallel sub-goals, how is the budget divided? We use a dynamic pool with per-sub-goal soft limits: a sub-goal can draw up to 40% of the pool but can't exceed that. This gives flexibility while preventing one runaway sub-goal from starving the others. There's a file-based sketch at the end of this comment.

**Cancellation propagation.** When the root task is cancelled, all in-flight sub-goals need to be cancelled too. In a DAG, this means cascade cancellation: any node whose parent is cancelled should also cancel. The DAG scheduler needs to track running sub-goals and send cancel signals.

More on coordination architecture: https://blog.kinthai.ai/221-agents-multi-agent-coordination-lessons
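Here's that budget-pool sketch. It's file-based to match the rest of the thread; the paths, units, and `flock` layout are all illustrative, not our internal implementation:

```bash
# budget-pool.sh (sketch): shared pool with per-sub-goal soft limits.
# One-time setup (illustrative): mkdir -p budget && echo 10000 | tee budget/pool budget/initial
POOL_DIR=budget
SOFT_LIMIT_PCT=40    # no single sub-goal may draw more than 40% of the pool

# try_spend <sub-goal-id> <cost> -> returns 0 if granted, 1 if denied
try_spend() {
  local id="$1" cost="$2"
  (
    flock -x 9                    # serialize all pool updates
    local remaining initial spent cap
    remaining=$(cat "$POOL_DIR/pool")
    initial=$(cat "$POOL_DIR/initial")
    spent=$(cat "$POOL_DIR/spent.$id" 2>/dev/null || echo 0)
    cap=$(( initial * SOFT_LIMIT_PCT / 100 ))
    if (( cost > remaining || spent + cost > cap )); then
      exit 1                      # over the pool, or over this sub-goal's cap
    fi
    echo $(( remaining - cost )) > "$POOL_DIR/pool"
    echo $(( spent + cost ))    > "$POOL_DIR/spent.$id"
  ) 9> "$POOL_DIR/lock"
}
```

A scheduler would gate dispatch on it, e.g. `try_spend research-api 120 || defer research-api`, serializing or skipping sub-goals that get denied.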
---
Hey! I saw your question on our project (bernstein#723) and spent some time reading through your architecture. Really solid setup - 28 lobsters on a single box is no joke.
One thing jumped out at me while reading `ARCHITECTURE.md` and `crontab.example`: goal execution appears to be sequential, driven by a 30-minute patrol cron. With 28 agents available, a 6-step goal takes 6 patrol cycles (potentially 3+ hours) even when some sub-goals have no dependency on each other.

Here's a pattern we use that might help - a file-based DAG executor that fits naturally into your shell-script architecture. The core idea: when you decompose a goal, declare dependencies between sub-goals explicitly, then let independent ones run in parallel.
The pattern
1. Goal decomposition with dependency declarations
When the controller decomposes a goal, instead of a flat list, emit a dependency graph:
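For example - the schema here is a sketch (the four named sub-goals are the ones discussed below; `test` and `review` are assumed names to round out the six steps):

```json
{
  "goal": "example-feature",
  "sub_goals": {
    "research-api":  { "deps": [] },
    "research-auth": { "deps": [] },
    "design-schema": { "deps": ["research-api"] },
    "implement":     { "deps": ["research-api", "research-auth"] },
    "test":          { "deps": ["implement"] },
    "review":        { "deps": ["test"] }
  }
}
```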
In this example, `research-api` and `research-auth` have no dependencies - they can run immediately and in parallel. `design-schema` only needs `research-api`. `implement` waits for both research streams. This turns a 6-step serial chain into 4 waves.

2. The scheduler (shell-native)
Replace the 30-min cron patrol with an event-driven loop:
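Something along these lines - `spawn_lobster` and the `runs/current` layout are placeholders for your own dispatch script and state directory:

```bash
#!/usr/bin/env bash
# dag-scheduler.sh (sketch): dispatch every ready sub-goal, then wait for
# the next completion event. Assumes jq; dag.json as sketched above.

RUN=runs/current
mkdir -p "$RUN/done" "$RUN/claims"

# print sub-goals whose deps are all done and that nobody has claimed yet
ready_sub_goals() {
  jq -r '.sub_goals | to_entries[] | [.key, (.value.deps | join(","))] | @tsv' \
    "$RUN/dag.json" |
  while IFS=$'\t' read -r sg deps; do
    [ -e "$RUN/done/$sg" ] && continue      # already finished
    [ -d "$RUN/claims/$sg" ] && continue    # already claimed
    ok=1
    IFS=',' read -ra dd <<< "$deps"
    for d in "${dd[@]}"; do
      [ -n "$d" ] && [ ! -e "$RUN/done/$d" ] && ok=0
    done
    [ "$ok" -eq 1 ] && echo "$sg"
  done
}

while :; do
  for sg in $(ready_sub_goals); do
    # mkdir is atomic: exactly one process wins the claim
    if mkdir "$RUN/claims/$sg" 2>/dev/null; then
      spawn_lobster "$sg" &
    fi
  done
  # wake on the next completion marker; fall back to 5s polling without inotify
  inotifywait -qq -e create "$RUN/done" 2>/dev/null || sleep 5
done
```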
The key details:
- `mkdir` for atomic locking - no race conditions between lobsters claiming the same sub-goal
- `inotifywait` makes it event-driven instead of polling (falls back to a 5s sleep)
- completion status is written back to `dag.json`, which triggers the next wave immediately

3. Lobster completion callback
When a lobster finishes a sub-goal, it writes its output and signals completion:
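For instance, as the last step of the lobster wrapper (same illustrative layout as the scheduler sketch; `$WORKDIR/output.md` stands in for wherever your agent leaves its result):

```bash
# lobster-finish.sh (sketch): publish output, then signal completion
SG="$1"                # sub-goal name, e.g. research-api
RUN=runs/current

mkdir -p "$RUN/outputs"
cp "$WORKDIR/output.md" "$RUN/outputs/$SG.md"   # visible to downstream waves

# creating this marker is the event the scheduler's inotifywait wakes on
touch "$RUN/done/$SG"
```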
4. Context propagation between waves
The important part: downstream sub-goals need to see upstream outputs. When spawning a lobster for `design-schema`, inject the output from its dependency:
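A sketch, with `start_lobster_session` and `goals/<sub-goal>.md` standing in for your actual dispatch and prompt files:

```bash
# spawn_lobster (sketch): assemble the prompt with upstream context injected
SG="$1"
RUN=runs/current
PROMPT="$RUN/prompts/$SG.md"

mkdir -p "$RUN/prompts"
cp "goals/$SG.md" "$PROMPT"    # the sub-goal's own brief

# append each dependency's published output as upstream context
for dep in $(jq -r --arg sg "$SG" '.sub_goals[$sg].deps[]' "$RUN/dag.json"); do
  {
    echo
    echo "## Upstream context: $dep"
    cat "$RUN/outputs/$dep.md"
  } >> "$PROMPT"
done

start_lobster_session --prompt-file "$PROMPT" --sub-goal "$SG"
```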
This way, `implement` automatically receives the research outputs from both `research-api` and `research-auth` without any explicit message passing. File-based, simple, no message broker needed.

Impact estimate
For a typical 6-step goal with the dependency structure above: serial execution takes 6 patrol cycles at 30 minutes each, roughly 3 hours of wall-clock time, while the event-driven 4-wave schedule is bounded by actual task time rather than the patrol interval - on the order of 20-25 minutes.

That's roughly an 8-9x speedup on multi-step goals, and it scales with lobster count. Independent research tasks, parallel test suites, concurrent reviews - anything without a true dependency runs immediately.
Bonus: result deduplication
One more thing that might save you significant API costs. If multiple goals ask similar research questions, you're paying for the same Claude calls repeatedly across 28 lobsters. A simple content-addressed cache:
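A sketch - `run_claude_query` stands in for whatever wraps your actual API call:

```bash
# cached_query (sketch): content-addressed cache keyed on the query text
CACHE=cache/queries
mkdir -p "$CACHE"

cached_query() {
  local query="$1" key
  key=$(printf '%s' "$query" | sha256sum | cut -d' ' -f1)
  if [ -e "$CACHE/$key" ]; then
    cat "$CACHE/$key"                         # cache hit: no API call
  else
    run_claude_query "$query" | tee "$CACHE/$key"
  fi
}
```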
For fuzzy matching (not just exact queries), you can normalize queries before hashing - lowercase, strip whitespace, sort words. Won't catch everything but it's zero-dependency and handles the obvious duplicates.
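That normalization is a one-liner to bolt on:

```bash
# normalize before hashing: lowercase, split on whitespace, sort words
normalize() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -s '[:space:]' '\n' |
    sort | tr '\n' ' '
}
# then key the cache on it: key=$(normalize "$query" | sha256sum | cut -d' ' -f1)
```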
We've been iterating on similar patterns in Bernstein (Python-based, but the file-state + DAG scheduling concepts are the same). Our orchestrator runs a tick-based control loop with tiered phase scheduling - fast ops every tick, heavy ops every 30th tick - which might be useful if you want to keep the cron approach but tier the frequency.
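In shell terms the tiering is tiny to express - a sketch of the idea, not Bernstein's actual loop:

```bash
# tiered tick loop (sketch): cheap checks every tick, heavy work every 30th
tick=0
while :; do
  run_fast_ops                              # placeholder: check markers, claims
  (( tick % 30 == 0 )) && run_heavy_ops     # placeholder: full patrol pass
  tick=$(( tick + 1 ))
  sleep 10                                  # tick interval; tune to taste
done
```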
Happy to go deeper on any of this. Solid project you've got here.