Skip to content

Adding Time Models to Shuttle#217

Open
dylanjwolff wants to merge 8 commits intoawslabs:mainfrom
dylanjwolff:time-models-pr
Open

Adding Time Models to Shuttle#217
dylanjwolff wants to merge 8 commits intoawslabs:mainfrom
dylanjwolff:time-models-pr

Conversation

@dylanjwolff
Copy link
Contributor

@dylanjwolff dylanjwolff commented Oct 1, 2025

This PR creates a new API in Shuttle to allow for specifying a model for wall-clock time and associated functions / primitives.
In spirit, this API is similar to that of the Scheduler API in that it allows users to swap out different time models (or create their own time models), depending on their application and use case.

As initial examples, this PR contains two time models:

  1. The ConstantSteppedTimeModel, which advances a global time by a configurable constant amount (starting from 0) at each scheduling step.
  2. The FrozenTimeModel, which is based on a ConstantSteppedTimeModel of where the step size is zero. This model additionally provides the ability for users to manually expire Sleeps and Timeouts without advancing the global clock.

Both of these models will automatically "fast-forward" time to wake the next sleeping tasks if all tasks are sleeping.

Primitive Representations:

As part of this change, Shuttle now vends it's own Duration, Instant, Sleep, Interval and Timeout primitives, whose behavior depends on the current TimeModel. These primitives are intended to eventually fully model the corresponding types in std::time and tokio::time. In this PR coverage of all functionality is incomplete, but enough core features are implemented such that most common use cases should be covered. Full feature parity can be achieved over time as the TimeModels become more mature.

The Shuttle Instant primitive is an enum. For this PR, there is only one representation of Durations, using a concrete std::time::Duration. However, in the future we can add representations to support other Time Models which might use logical scalar or vector clocks, for example.

The Shuttle Duration primitive is just a std::time::Duration by default. The reason for this is that std::time::Duration is very commonly used in library APIs. This can make switching to a different Duration type extremely painful, as all dependencies also must be made compatible. By leaving Duration unchanged, we can keep the barrier to adoption of time modeling low. With the advanced-time-models feature flag, users can opt-in to time models that may not be able to represent time intervals as a single value (for example, elapsed time between vector-clock timestamps).

Alternative Designs:

Because std::time::Duration and other primitives are concrete types, the corresponding Shuttle primitives also must be concrete. Instead of enums with a fixed number of variants, we could instead opt for a struct which contains a type erased Box<dyn Duration>. This is more flexible, as users can even bring their own primitive representations without needing to make changes to Shuttle itself to add an enum variant. However, it would require type-casting in all operations involving multiple primitives (for example, comparing two durations or arithmetic operations between instants and/or durations). This approach is also further complicated by many Duration methods being const, which would preclude a fully dynamic Duration implementation.

As long as there are only a small number of representations for these primitives that all Time Models share, then an enum allows them to share logic without these complications. However, if later we find that there are many different primitive representations then we may need to reconsider the fully dynamic approach instead.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@sarsko
Copy link
Contributor

sarsko commented Oct 7, 2025

Needs conflict resolution in order to run benchmarks

while ExecutionState::num_runnable() == 0
&& ExecutionState::with(|s| Rc::clone(&s.time_model))
.borrow_mut()
.wake_next()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto on the with comment from above

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also why do we need a loop? We can just do one call no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use a loop to protect against stale wakers in the time model. It's possible that we don't need this with more diligent tracking, but at least for the current TimeModel implementations there are cases where we have old wakers that don't end up actually making any futures runnable when woken.

I think it is more complexity that it's worth to say that the TimeModel must always wake a non-stale waker with wake_next

runnable_tasks: Vec<*const Task>,

// Counter for unique timing resource ids (Sleeps, Timeouts and Intervals)
pub(crate) timer_id_counter: u64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this for? It's never used no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's used in shuttle/src/sync/time/mod.rs to differentiate sleepers with the same deadlines (see uses of increment_timer_counter())

&& ExecutionState::with(|s| Rc::clone(&s.time_model))
.borrow_mut()
.wake_next()
{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto on the loop comment


pub(crate) fn num_runnable() -> usize {
Self::with(|state| state.tasks.iter().filter(|t| t.runnable()).count())
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should improve the tracking of runnable so that this becomes O(1)

Copy link
Contributor Author

@dylanjwolff dylanjwolff Oct 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, but I don't know if we want to block this PR on that change. if so, I should probably open another PR to do that separately.

this would be strongly related to building the set of runnable tasks incrementally, which I have a prototype of on this branch:

https://github.com/dylanjwolff/shuttle/tree/incremental-persistent-vec

but I haven't PR'ed it because I actually think the right thing to do is to change the Shuttle scheduler API so that the individual schedulers manage their own runnable task sets in whatever data structure is best for that algorithm (for example, priority queue for RP).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah no don't block the PR on this.

And yeah, lets chat about potential scheduler API changes on Monday

@dylanjwolff
Copy link
Contributor Author

Might be worth adding this example as a test case:

https://rfd.shared.oxide.computer/rfd/0609

thread::spawn(move || {
for _ in 0..TEST_LENGTH {
thread::sleep(Duration::from_millis(1));
thread::sleep(Duration::ZERO);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't pay attention to this first time around, but why are we doing Duration::ZERO? The test should work just fine with a sleep of 1ms, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I recall correctly, I read the original intent of these sleeps to be to insert a yield point without calling yield_now, since PCT takes that yield_now call as a hint that it might be in a busy loop and reduces the current thread's priority.

Since these tests are for the PCT algorithm/implementation and not meant to actually exercise timing behaviors, I think the timeouts should be Duration::ZERO if kept as a sleep, or (probably better) changed to an explicit yield_now_not_a_scheduling_hint API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah getcha, thanks. We could make switch pub. But also: surely these tests will be equivalent with what was before when running under the time model which implements time as modeled before this PR? Like we could just create a time model which makes these tests work with Duration::from_millis(1) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like we could just create a time model which makes these tests work with Duration::from_millis(1) ?

Yeah, this is also fine. My thought was that changing the duration to zero makes it more explicit that this test doesn't actually want any timing related behaviors to influence its outcome.

We could make switch pub.

For this specific test, I think it would probably be most clear if we did call switch directly. But the tradeoff for making it fully pub is that there are now two ways to insert a context switch for users of Shuttle, and the difference between the two is relatively subtle.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decided to not make switch pub as we generally don't want people inserting scheduling points at whim (though they of course can by sleeping for no ZERO duration)

@sarsko sarsko marked this pull request as draft January 13, 2026 00:20
@sarsko
Copy link
Contributor

sarsko commented Jan 13, 2026

Converted to draft just to note that it is not mergeable because time models should live elsewhere. I'll reopen once its ready for review

@sarsko sarsko marked this pull request as ready for review January 28, 2026 20:32
@sarsko
Copy link
Contributor

sarsko commented Jan 29, 2026

Made the code reviewable (and started integrating it into our testing).
I have a few comments:

  1. I am generally pro merging and publishing this.
  2. I am tempted to refactor this into being in a "Futures-first" shape (eg. meaning it could be plug-and-played with bach or tokio or any other async runtime).
  3. The refactoring is not urgent, but I'd prefer that model over every tool building their own way of doing time modeling.
  4. The same Futures-first approach is how I'd like to do network modeling as well


#[allow(clippy::useless_conversion)]
fn advance(&mut self, dur: Duration) {
self.current_time_elapsed += dur.into();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong no? We also need to potentially wake the next task (aka add an unblock_expired)

This is generally a sharp edge when implementing time models. I'm thinking we'll want to protect against this by doing something like the following.

trait TimeModelRunnable: TimeModel  {
    fn advance(&mut self, dur: Duration) {
	    TimeModel::advance(self, dur);
	    self.unblock_expired();
    }

	// etc for all other functions
}

impl<T> TimeModelRunnable for T where T: TimeModel {}

trait TimeModel {
    fn unblock_expired(&mut self) {}
    
    fn advance(&mut self, dur: Duration) {}

	// etc more functions
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's been a while, but I think you are right that this is missing and unblock_expired. I can't think of a use case for advance without waking, and even if there is one, that should be a more explicit call.

+1 for adding the wrapper -- seems like a good idea. Probably should be done for step as well, maybe even wake_next.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come we .into() here?

type Output = ();

fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
let is_expired = get_time_model()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the reason for doing the call to register_sleep then a second call to register_sleep with the waker after — just pass the waker the first time.

Also this way of implementing register_sleep is dangerous in that we are requiring the time model to correctly manage the waker. I can't really see when a custom "waker manager" would be needed meaning we could just manage the waker for the time model, following the same layered trait methodology as above.

Actually I'm not sure I see how register_sleep is a function of the time model at all — why don't we just let the time model manage time (via current_time) and have the TimeModelRunnable implement the following:

    fn register_sleep(&mut self, deadline: Instant, sleep_id: u64, waker: Waker) -> bool {
		if self.current_time() >= deadline {
			true
		} else {
			self.register_waker(waker)
			false
		}
    }

	fn register_waker(waker: Waker) {
		todo!()
	}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrt. to why we need a custom register_sleep, I think this gets into the more "advanced time models" territory, where you might want a representation of time that isn't easily represented as a single Instant. Like if we use a vector time where each task with a particular tag has it's own view of the time (emulating a distributed system).

But maybe this can be put behind the "advanced time models" feature flag to hide some of the complexity for the basic case?

thread::switch();
}

/// Returns a future which sleeps until the duration has elapsed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all of this async stuff should be moved into a pub mod async and have their async_ prefix stripped

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sike, async is of course a registered keyword and I think r#async is ugly

#[derive(Debug)]
pub struct Timeout<F>
where
F: Future,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This constraint needs to be removed (not there in Tokio: https://docs.rs/tokio/latest/tokio/time/struct.Timeout.html)

/// Timeout a future
#[pin_project]
#[derive(Debug)]
pub struct Interval {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: add fn disable


impl TimeModel for ConstantSteppedTimeModel {
fn pause(&mut self) {
warn!("Pausing stepped model has no effect")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surely pausing ConstantSteppedTimeModel means that it becomes the FrozenTimeModel ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it probably should. Maybe there's an argument that you shouldn't be manually pausing/unpausing the non-frozen models because it's error prone?

}

fn resume(&mut self) {
warn!("Resuming stepped model has no effect")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar comment as for pause


/// Expire all current timeouts/sleeps requested by tasks whose tags match the
/// given predicate. May not be implemented by all TimeModels.
pub fn trigger_timeouts<F>(trigger: F)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments:

  1. I don't see why this only is implemented for the FrozenTimeModel
  2. With a bit of refactoring of the traits this can be implemented for all time models
  3. I am undecided on whether we want this at all. I understand we have it for legacy reasons, but I'm wondering whether we want to support that use case at all. See below.

This way of doing it is based on tracking stuff in Labels and using that to time things out via the passed trigger. This is 1: Error prone on the "maintaining stuff in the Labels side", 2: Error prone on the trigger_timeouts side, 3: Generally hard to maintain and keep up to date, 4: Not actually correct with regards to how time works and 5: Dangerous because it creates a scheme where a task is permanently expired (ie. it will have a weird form of "Midas' touch" and timeout every single timing event it ever touches).

If we add an api to get the current task count (this can be gotten in a roundabout way currently, and TaskId is also forgeable via From<usize>), and a way to get the Instant where a task will be woken, trigger_timeouts mechanism can be implemented in "user space" by doing a similar loop as we do in FrozenTimeModel::trigger_timeouts, but with time::advance() as the driver of timeouts. This solves 4 and 5 above, and alleviates 1 to 3 somewhat, though generally I find trigger_timeouts and friends to not be something which fills me with joy and happiness.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why this only is implemented for the FrozenTimeModel

I think for the same reason you mentioned; it's there for legacy reasons and I wasn't sure we wanted it at all. Generally I think this kind of manually trying to trigger timing behaviors is a bit of an anti-pattern for automated testing -- the whole point is to catch scenarios that you didn't think of, not ones you knew about enough already to hard-code.

Doing it in user-space seems reasonable to me (though realistically, I'm not sure who would actually go to the trouble).

@sarsko
Copy link
Contributor

sarsko commented Feb 5, 2026

I wanna add a tokio::Runtime-like construct to Shuttle(-tokio), which contains the tasks on that Runtime and their clock. The way we do time modeling would have to be updated to accommodate that. The motivation for adding Runtime is 1: I think it is an ergonomic and intuitive way of separating hosts, 2: A lot of code that uses Tokio is written that way already, 3: It's the way Turmoil is implemented, so we could (with a couple more additions) get Turmoil support cheaply.

}

/// Manually wake a task without affecting the global clock
pub fn wake_frozen(&mut self, sleep_id: u64) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait why does this exist?

Copy link
Contributor Author

@dylanjwolff dylanjwolff Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's been a while, but I think it's because the frozen time model is just a thin wrapper around the constant stepped model (step size zero). So this is to allow the frozen time model to trigger timeouts without advancing the clock of the inner constant model. For sure it shouldn't be pub, but more generally I don't know whether we do/don't want to advance the clock on triggering timeouts for the frozen model. Per the other convo, it's not even clear whether we even want to support "trigger_timeouts" manually anyways

distribution: ConstantTimeDistribution,
current_step_size: std::time::Duration,
current_time_elapsed: std::time::Duration,
waiters: BinaryHeap<Reverse<(std::time::Duration, TaskId, u64)>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a super fan of the untypedness of hits

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(ie what are the fields, in particular the u64)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's just the sleep ID

current_step_size: std::time::Duration,
current_time_elapsed: std::time::Duration,
waiters: BinaryHeap<Reverse<(std::time::Duration, TaskId, u64)>>,
wakers: HashMap<u64, Waker>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have we separated the waiters and the wakers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember, but I think it is because of type restrictions for what you can put on a binary heap in Rust (PartialOrd).

/// Puts the current thread to sleep
/// Behavior of this function depends on the TimeModel provided to Shuttle
pub fn sleep(dur: Duration) {
if dur == Duration::ZERO {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I agree with the backdoor here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is even the use case of a zero duration sleep?

note that without the backdoor, it is impossible for the current thread to be scheduled again right away after sleeping for zero with the current constant time model implementation (since the clock needs to advance one step to wake it)


/// A constant distribution; each sample returns the same time
#[derive(Clone, Copy, Debug, Hash, PartialEq, Eq)]
pub struct ConstantTimeDistribution {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait huh why does this exist?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a bit speculative, but you might want to sample the time-step size from different distributions (e.g. normal, exponential, constant).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants