Skip to content

Conversation

@abcxff
Copy link
Contributor

@abcxff abcxff commented Dec 11, 2025

No description provided.

@vercel
Copy link

vercel bot commented Dec 11, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

4 Skipped Deployments
Project Deployment Preview Comments Updated (UTC)
rivet-cloud Ignored Ignored Dec 11, 2025 1:09am
rivet-inspector Ignored Ignored Preview Dec 11, 2025 1:09am
rivet-site Ignored Ignored Preview Dec 11, 2025 1:09am
rivetkit-serverless Skipped Skipped Dec 11, 2025 1:09am

@vercel vercel bot temporarily deployed to Preview – rivetkit-serverless December 11, 2025 01:09 Inactive
Copy link
Contributor Author

abcxff commented Dec 11, 2025

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@claude
Copy link

claude bot commented Dec 11, 2025

Code Review: Actor Alarm Tests

This PR adds comprehensive test coverage for Rivet Actor alarm functionality. Overall, this is excellent work with thorough test coverage and well-structured code. Here are my findings:

✅ Strengths

  1. Comprehensive Test Coverage - The PR includes 16 well-designed tests covering:

    • Basic alarm functionality
    • Edge cases (past alarms, null timestamps, rapid cycles)
    • Advanced scenarios (crash behavior, multiple actors, concurrent alarms)
  2. Excellent Test Organization - Tests are grouped logically with MARK comments:

    • Core Functionality
    • Edge Cases
    • Advanced Usage
  3. Well-Documented Actor Behaviors - Each test actor struct has clear documentation explaining its purpose and behavior.

  4. Proper Synchronization - Tests use appropriate synchronization primitives (oneshot channels, unbounded channels) to coordinate between test code and actor behavior.

  5. Follows Repository Conventions - Code adheres to CLAUDE.md guidelines:

    • Uses structured logging with tracing
    • Follows naming conventions
    • Proper error handling with anyhow

🔍 Code Quality Issues

1. Potential Race Condition in wait_for_actor_wake_from_alarm (actors_alarm.rs:75)

The timeout calculation uses saturating_sub which could result in zero timeout if the function has already been running:

_ = tokio::time::sleep(std::time::Duration::from_secs(timeout_secs).saturating_sub(start.elapsed())) => {

Recommendation: Use a proper deadline approach:

let deadline = start + std::time::Duration::from_secs(timeout_secs);
tokio::select! {
    result = lifecycle_rx.recv() => { /* ... */ }
    _ = tokio::time::sleep_until(deadline.into()) => {
        bail!(/* timeout */);
    }
}

2. Polling-Based Helpers Are Inefficient (actors_alarm.rs:11-113)

Functions like wait_for_actor_wake_polling and wait_for_actor_sleep poll every 50ms, which is inefficient and can lead to flaky tests.

Recommendation: Use lifecycle events consistently (like wait_for_actor_wake_from_alarm does) instead of polling. Consider refactoring these helpers to subscribe to lifecycle events.

3. Mutex Poisoning Not Handled (throughout)

Multiple instances of .lock().unwrap() on Mutex could panic if a thread panics while holding the lock:

if let Some(tx) = self.ready_tx.lock().unwrap().take() {

Recommendation: Use .expect() with a descriptive message, or handle poisoning explicitly:

if let Some(tx) = self.ready_tx.lock().expect("ready_tx mutex poisoned").take() {

4. Hardcoded Magic Numbers (actors_alarm.rs:610, 657, etc.)

Time values like 5000, 2000 are hardcoded throughout without constants or comments explaining why those specific values were chosen.

Recommendation: Add constants with descriptive names:

const SHORT_ALARM_MS: i64 = 1000;
const MEDIUM_ALARM_MS: i64 = 2000;
const LONG_ALARM_MS: i64 = 5000;

5. Missing Test Documentation

Some complex tests like alarm_behavior_with_crash_policy_restart lack comments explaining the expected sequence of events.

Recommendation: Add comments explaining the test flow, especially for multi-step scenarios.

⚠️ Potential Bugs

1. Test alarm_behavior_with_crash_policy_restart Has Incorrect Logic (actors_alarm.rs:1196-1271)

The AlarmSleepThenCrashActor behavior is confusing:

  • Gen 0: Sets alarm and crashes (doesn't sleep)
  • Gen 1: Sends sleep intent
  • The test expects the actor to wake from the gen 0 alarm, but gen 0 never actually slept

Issue: The comment says "set alarm and sleeping" but the code returns ActorStartResult::Crash without sending sleep intent.

Recommendation: Clarify the intended behavior or fix the implementation to match expectations.

2. Incomplete Test (actors_alarm.rs:1256-1263)

The test expects the actor to wake from alarm but the logic seems incorrect:

// Verify the next gen is awake (woke from gen 0's alarm)
let actor = wait_for_actor_wake_polling(/* ... */).await.expect("actor should be sleeping");

The expect message says "should be sleeping" but it's checking for wake.

🎯 Best Practices & Improvements

1. Extract Helper Functions (test_helpers.rs)

The new create_actor helper is good, but it could be made more flexible:

pub async fn create_actor_with_options(
    port: u16,
    namespace: &str,
    name: &str,
    runner_name: &str,
    crash_policy: rivet_types::actors::CrashPolicy,
    input: Option<String>,
    datacenter: Option<String>,
) -> CreateResponse

2. Test Timeouts Should Be Consistent

Some tests use TestOpts::new_with_timeout(1, 10), others use 45 seconds. Document why different timeouts are needed.

3. Duplicate Code in Actor Behaviors

Many actor behaviors share similar patterns (ready_tx notification, generation checking). Consider creating builder functions or macros to reduce duplication.

🔒 Security Considerations

No security issues identified. The test code properly handles:

  • Input validation
  • Resource cleanup
  • Error handling

📊 Test Coverage Assessment

Excellent coverage of:

  • ✅ Basic alarm set/fire
  • ✅ Clear alarm
  • ✅ Replace alarm
  • ✅ Past alarms
  • ✅ Multiple alarms
  • ✅ Multi-cycle alarms
  • ✅ Crash policy interactions
  • ✅ Concurrent actor alarms

Missing coverage:

  • ❓ Alarm behavior during actor migration
  • ❓ Alarm persistence across system restarts (if applicable)
  • ❓ Maximum alarm time bounds testing

📝 Minor Issues

  1. Inconsistent Naming: Some actors are named AlarmAndSleepActor, others AlarmSleepThenCrashActor. Consider standardizing to verb-noun pattern.

  2. Unused Variable (actors_lifecycle.rs:43): The refactoring removed sleep-related code but could be cleaner.

  3. Commented Code (actors_lifecycle.rs:234-236): TODO comments about missing HTTP tunnel implementation - consider creating tracking issues.

🎉 Conclusion

This is high-quality test code that significantly improves coverage of the alarm system. The main concerns are:

  1. Fix the race condition in timeout handling
  2. Clarify/fix the crash-with-alarm test logic
  3. Consider moving from polling to event-based waiting for better test reliability

Once these issues are addressed, this will be an excellent addition to the test suite.

Recommendation: ✅ Approve with minor changes

@pkg-pr-new
Copy link

pkg-pr-new bot commented Dec 11, 2025

More templates

@rivetkit/cloudflare-workers

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/cloudflare-workers@3624

@rivetkit/db

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/db@3624

@rivetkit/framework-base

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/framework-base@3624

@rivetkit/next-js

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/next-js@3624

@rivetkit/react

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/react@3624

rivetkit

pnpm add https://pkg.pr.new/rivet-dev/rivet/rivetkit@3624

@rivetkit/sql-loader

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sql-loader@3624

@rivetkit/engine-runner

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner@3624

@rivetkit/engine-runner-protocol

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner-protocol@3624

commit: b9ccf81

@abcxff abcxff requested a review from NathanFlurry December 11, 2025 02:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants