@codeflash-ai codeflash-ai bot commented Dec 4, 2025

📄 253% (2.53x) speedup for compute_events_latency in inference/core/interfaces/stream/watchdog.py

⏱️ Runtime : 302 microseconds → 85.6 microseconds (best of 45 runs)

📝 Explanation and details

The optimized code achieves a 252% speedup by eliminating function call overhead and reducing unnecessary operations in the critical path.

Key optimizations:

  1. Inlined compatibility check in compute_events_latency: The original code called are_events_compatible() which created a list and performed complex checks. The optimized version directly checks if either event is None or if frame_ids differ, eliminating function call overhead and list creation.

  2. Early-exit optimization in are_events_compatible: Instead of using any() with a generator expression and building a complete frame_ids list, the optimized version uses explicit loops that return False immediately upon finding the first None or mismatched frame_id.
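The two optimizations above can be sketched as follows — a minimal reconstruction from this description, not the repository's exact code (the `ModelActivityEvent` here is a stand-in for the real class):

```python
from datetime import datetime
from typing import List, Optional


class ModelActivityEvent:
    """Minimal stand-in for the real event class."""

    def __init__(self, event_timestamp: datetime, frame_id: int):
        self.event_timestamp = event_timestamp
        self.frame_id = frame_id


def compute_events_latency(
    earlier_event: Optional[ModelActivityEvent],
    later_event: Optional[ModelActivityEvent],
) -> Optional[float]:
    # Inlined compatibility check: no helper call, no list allocation.
    if earlier_event is None or later_event is None:
        return None
    if earlier_event.frame_id != later_event.frame_id:
        return None
    return (later_event.event_timestamp - earlier_event.event_timestamp).total_seconds()


def are_events_compatible(events: List[Optional[ModelActivityEvent]]) -> bool:
    # Early-exit loops: bail out on the first None or mismatched frame_id
    # instead of building a complete frame_ids list and calling any().
    if not events:
        return False
    for event in events:
        if event is None:
            return False
    first_frame_id = events[0].frame_id
    for event in events[1:]:
        if event.frame_id != first_frame_id:
            return False
    return True
```

The key design point in both functions is that the cheap disqualifying checks (None, frame_id mismatch) run first and return immediately, so no intermediate list is ever built on the failure paths.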

Performance impact by test case:

  • None events (336-378% faster): The inlined checks in compute_events_latency avoid the function call entirely when events are None
  • Mismatched frame_ids (403-446% faster): Direct frame_id comparison is much faster than the original's list-building approach
  • Valid events (158-208% faster): Even when computation proceeds, avoiding the function call overhead provides significant gains
  • Large-scale tests (215-407% faster): The optimizations scale well, particularly benefiting scenarios with many mismatched frame_ids
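Per-path timings like these can be reproduced roughly with `timeit` (an illustrative sketch: `Event` and `latency_inlined` are hypothetical stand-ins mirroring the optimized control flow, and absolute numbers depend on hardware):

```python
import timeit
from datetime import datetime


class Event:
    """Hypothetical stand-in for ModelActivityEvent."""

    def __init__(self, event_timestamp, frame_id):
        self.event_timestamp = event_timestamp
        self.frame_id = frame_id


def latency_inlined(earlier, later):
    # Mirrors the optimized control flow: None checks and the frame_id
    # comparison happen inline, with no helper call or list allocation.
    if earlier is None or later is None:
        return None
    if earlier.frame_id != later.frame_id:
        return None
    return (later.event_timestamp - earlier.event_timestamp).total_seconds()


e1 = Event(datetime(2024, 6, 1, 12, 0, 0), frame_id=1)
e2 = Event(datetime(2024, 6, 1, 12, 0, 5), frame_id=1)
mismatch = Event(datetime(2024, 6, 1, 12, 0, 5), frame_id=2)

# Time each branch separately: valid pair, None event, mismatched frame_id.
for label, args in [("valid", (e1, e2)), ("none", (None, e2)), ("mismatch", (e1, mismatch))]:
    per_call = timeit.timeit(lambda: latency_inlined(*args), number=100_000) / 100_000
    print(f"{label}: {per_call * 1e9:.0f} ns per call")
```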

Hot path impact: Based on the function reference showing compute_events_latency is called within _generate_report() for latency monitoring, this optimization will improve the performance of stream processing pipelines where latency measurements are computed frequently. The 252% speedup means latency monitoring operations that previously took ~300μs now complete in ~85μs, reducing overhead in real-time video processing workflows.

The optimizations preserve all original behavior while dramatically reducing computational overhead through smarter control flow and elimination of unnecessary operations.

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 11 Passed |
| 🌀 Generated Regression Tests | 258 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
⚙️ Existing Unit Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
| --- | --- | --- | --- |
| inference/unit_tests/core/interfaces/stream/test_watchdog.py::test_compute_events_latency_when_events_are_compatible | 5.00μs | 1.93μs | 159% ✅ |
| inference/unit_tests/core/interfaces/stream/test_watchdog.py::test_compute_events_latency_when_events_are_not_compatible | 3.38μs | 915ns | 270% ✅ |
🌀 Generated Regression Tests and Runtime
from datetime import datetime, timedelta
from typing import List, Optional

# imports
import pytest
from inference.core.interfaces.stream.watchdog import compute_events_latency


# function to test
class ModelActivityEvent:
    """A minimal mock of the ModelActivityEvent class."""

    def __init__(self, event_timestamp: datetime, frame_id: int):
        self.event_timestamp = event_timestamp
        self.frame_id = frame_id


# unit tests

# ========== Basic Test Cases ==========


def test_basic_latency_positive():
    """Test with two valid events, later_event after earlier_event."""
    t1 = datetime(2024, 6, 1, 12, 0, 0)
    t2 = datetime(2024, 6, 1, 12, 0, 5)
    e1 = ModelActivityEvent(event_timestamp=t1, frame_id=42)
    e2 = ModelActivityEvent(event_timestamp=t2, frame_id=42)
    codeflash_output = compute_events_latency(e1, e2)
    result = codeflash_output  # 4.08μs -> 1.58μs (158% faster)


def test_basic_latency_zero():
    """Test with two events at the same timestamp (should be zero latency)."""
    t = datetime(2024, 6, 1, 12, 0, 0)
    e1 = ModelActivityEvent(event_timestamp=t, frame_id=1)
    e2 = ModelActivityEvent(event_timestamp=t, frame_id=1)
    codeflash_output = compute_events_latency(e1, e2)
    result = codeflash_output  # 3.60μs -> 1.28μs (180% faster)


def test_basic_latency_negative():
    """Test with earlier_event after later_event (negative latency)."""
    t1 = datetime(2024, 6, 1, 12, 0, 10)
    t2 = datetime(2024, 6, 1, 12, 0, 5)
    e1 = ModelActivityEvent(event_timestamp=t1, frame_id=123)
    e2 = ModelActivityEvent(event_timestamp=t2, frame_id=123)
    codeflash_output = compute_events_latency(e1, e2)
    result = codeflash_output  # 3.51μs -> 1.36μs (158% faster)


def test_basic_different_frame_ids():
    """Test with two events with different frame_ids (should return None)."""
    t1 = datetime(2024, 6, 1, 12, 0, 0)
    t2 = datetime(2024, 6, 1, 12, 0, 5)
    e1 = ModelActivityEvent(event_timestamp=t1, frame_id=100)
    e2 = ModelActivityEvent(event_timestamp=t2, frame_id=200)
    codeflash_output = compute_events_latency(e1, e2)
    result = codeflash_output  # 2.58μs -> 473ns (446% faster)


# ========== Edge Test Cases ==========


def test_earlier_event_none():
    """Test with earlier_event as None (should return None)."""
    t2 = datetime(2024, 6, 1, 12, 0, 5)
    e2 = ModelActivityEvent(event_timestamp=t2, frame_id=1)
    codeflash_output = compute_events_latency(None, e2)
    result = codeflash_output  # 1.53μs -> 352ns (336% faster)


def test_later_event_none():
    """Test with later_event as None (should return None)."""
    t1 = datetime(2024, 6, 1, 12, 0, 0)
    e1 = ModelActivityEvent(event_timestamp=t1, frame_id=1)
    codeflash_output = compute_events_latency(e1, None)
    result = codeflash_output  # 1.54μs -> 376ns (309% faster)


def test_both_events_none():
    """Test with both events as None (should return None)."""
    codeflash_output = compute_events_latency(None, None)
    result = codeflash_output  # 1.42μs -> 359ns (296% faster)


def test_events_with_microsecond_difference():
    """Test with events differing by microseconds."""
    t1 = datetime(2024, 6, 1, 12, 0, 0, 123456)
    t2 = datetime(2024, 6, 1, 12, 0, 0, 223456)
    e1 = ModelActivityEvent(event_timestamp=t1, frame_id=99)
    e2 = ModelActivityEvent(event_timestamp=t2, frame_id=99)
    codeflash_output = compute_events_latency(e1, e2)
    result = codeflash_output  # 3.96μs -> 1.53μs (158% faster)


def test_events_with_large_time_difference():
    """Test with events separated by a large time delta (e.g., 10 days)."""
    t1 = datetime(2024, 6, 1, 12, 0, 0)
    t2 = t1 + timedelta(days=10, hours=5, minutes=30, seconds=15)
    e1 = ModelActivityEvent(event_timestamp=t1, frame_id=7)
    e2 = ModelActivityEvent(event_timestamp=t2, frame_id=7)
    expected = (10 * 24 * 3600) + (5 * 3600) + (30 * 60) + 15
    codeflash_output = compute_events_latency(e1, e2)
    result = codeflash_output  # 3.30μs -> 1.25μs (165% faster)


def test_events_with_negative_frame_id():
    """Test with negative frame_id (should still work if frame_ids match)."""
    t1 = datetime(2024, 6, 1, 12, 0, 0)
    t2 = datetime(2024, 6, 1, 12, 0, 10)
    e1 = ModelActivityEvent(event_timestamp=t1, frame_id=-5)
    e2 = ModelActivityEvent(event_timestamp=t2, frame_id=-5)
    codeflash_output = compute_events_latency(e1, e2)
    result = codeflash_output  # 3.44μs -> 1.11μs (208% faster)


def test_events_with_zero_frame_id():
    """Test with frame_id zero (should work if both have zero)."""
    t1 = datetime(2024, 6, 1, 12, 0, 0)
    t2 = datetime(2024, 6, 1, 12, 0, 1)
    e1 = ModelActivityEvent(event_timestamp=t1, frame_id=0)
    e2 = ModelActivityEvent(event_timestamp=t2, frame_id=0)
    codeflash_output = compute_events_latency(e1, e2)
    result = codeflash_output  # 3.23μs -> 1.10μs (194% faster)


def test_are_events_compatible_empty_list():
    """Test are_events_compatible with empty list (should return False)."""
    from inference.core.interfaces.stream.watchdog import are_events_compatible

    assert are_events_compatible([]) is False


def test_are_events_compatible_with_none_in_list():
    """Test are_events_compatible with a None event in the list."""
    from inference.core.interfaces.stream.watchdog import are_events_compatible

    t = datetime(2024, 6, 1, 12, 0, 0)
    e1 = ModelActivityEvent(event_timestamp=t, frame_id=1)
    assert are_events_compatible([e1, None]) is False


def test_are_events_compatible_with_different_frame_ids():
    """Test are_events_compatible with different frame_ids."""
    from inference.core.interfaces.stream.watchdog import are_events_compatible

    t = datetime(2024, 6, 1, 12, 0, 0)
    e1 = ModelActivityEvent(event_timestamp=t, frame_id=1)
    e2 = ModelActivityEvent(event_timestamp=t, frame_id=2)
    assert are_events_compatible([e1, e2]) is False


def test_are_events_compatible_with_same_frame_ids():
    """Test are_events_compatible with same frame_ids."""
    from inference.core.interfaces.stream.watchdog import are_events_compatible

    t = datetime(2024, 6, 1, 12, 0, 0)
    e1 = ModelActivityEvent(event_timestamp=t, frame_id=5)
    e2 = ModelActivityEvent(event_timestamp=t, frame_id=5)
    assert are_events_compatible([e1, e2]) is True


# ========== Large Scale Test Cases ==========


def test_large_scale_events_latency():
    """Test compute_events_latency with large number of events (performance and correctness)."""
    base_time = datetime(2024, 6, 1, 12, 0, 0)
    frame_id = 12345
    # Generate 1000 events, each 1 second apart
    events = [
        ModelActivityEvent(
            event_timestamp=base_time + timedelta(seconds=i), frame_id=frame_id
        )
        for i in range(1000)
    ]
    # Test the latency between the first and last event
    codeflash_output = compute_events_latency(events[0], events[-1])
    result = codeflash_output  # 4.59μs -> 1.51μs (204% faster)


def test_large_scale_compatible_check():
    """Test are_events_compatible with 1000 events with same frame_id (should be True)."""
    from inference.core.interfaces.stream.watchdog import are_events_compatible

    t = datetime(2024, 6, 1, 12, 0, 0)
    events = [ModelActivityEvent(event_timestamp=t, frame_id=1) for _ in range(1000)]
    assert are_events_compatible(events) is True


def test_large_scale_incompatible_check():
    """Test are_events_compatible with 1000 events, one with different frame_id (should be False)."""
    from inference.core.interfaces.stream.watchdog import are_events_compatible

    t = datetime(2024, 6, 1, 12, 0, 0)
    events = [ModelActivityEvent(event_timestamp=t, frame_id=1) for _ in range(999)]
    events.append(ModelActivityEvent(event_timestamp=t, frame_id=2))
    assert are_events_compatible(events) is False


def test_large_scale_with_none_event():
    """Test are_events_compatible with 1000 events, one is None (should be False)."""
    from inference.core.interfaces.stream.watchdog import are_events_compatible

    t = datetime(2024, 6, 1, 12, 0, 0)
    events = [ModelActivityEvent(event_timestamp=t, frame_id=1) for _ in range(999)] + [
        None
    ]
    assert are_events_compatible(events) is False


# ========== Mutation Testing (Robustness) ==========


@pytest.mark.parametrize("delta_seconds", [1, 10, 100, -1, -10, 0])
def test_mutation_various_deltas(delta_seconds):
    """Test that the function returns correct latency for various time deltas."""
    t1 = datetime(2024, 6, 1, 12, 0, 0)
    t2 = t1 + timedelta(seconds=delta_seconds)
    e1 = ModelActivityEvent(event_timestamp=t1, frame_id=77)
    e2 = ModelActivityEvent(event_timestamp=t2, frame_id=77)
    codeflash_output = compute_events_latency(e1, e2)
    result = codeflash_output  # 22.1μs -> 7.31μs (202% faster)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from datetime import datetime, timedelta
from typing import Optional

# imports
import pytest
from inference.core.interfaces.stream.watchdog import compute_events_latency


# --- Mock ModelActivityEvent for testing ---
class ModelActivityEvent:
    def __init__(self, event_timestamp: datetime, frame_id: int):
        self.event_timestamp = event_timestamp
        self.frame_id = frame_id


# --- Unit tests ---

# ========== BASIC TEST CASES ==========


def test_latency_basic_positive():
    # Basic case: later_event is 10 seconds after earlier_event, same frame_id
    earlier = ModelActivityEvent(datetime(2024, 6, 1, 10, 0, 0), frame_id=1)
    later = ModelActivityEvent(datetime(2024, 6, 1, 10, 0, 10), frame_id=1)
    codeflash_output = compute_events_latency(
        earlier, later
    )  # 3.92μs -> 1.42μs (176% faster)


def test_latency_basic_zero():
    # Basic case: events at same time, same frame_id
    t = datetime(2024, 6, 1, 10, 0, 0)
    earlier = ModelActivityEvent(t, frame_id=42)
    later = ModelActivityEvent(t, frame_id=42)
    codeflash_output = compute_events_latency(
        earlier, later
    )  # 3.52μs -> 1.26μs (179% faster)


def test_latency_basic_negative():
    # Basic case: earlier_event is after later_event, same frame_id
    earlier = ModelActivityEvent(datetime(2024, 6, 1, 10, 0, 10), frame_id=7)
    later = ModelActivityEvent(datetime(2024, 6, 1, 10, 0, 0), frame_id=7)
    codeflash_output = compute_events_latency(
        earlier, later
    )  # 3.60μs -> 1.36μs (164% faster)


def test_latency_basic_float_seconds():
    # Basic case: fractional seconds
    earlier = ModelActivityEvent(datetime(2024, 6, 1, 10, 0, 0, 500000), frame_id=3)
    later = ModelActivityEvent(datetime(2024, 6, 1, 10, 0, 1, 250000), frame_id=3)
    codeflash_output = compute_events_latency(
        earlier, later
    )  # 3.54μs -> 1.28μs (175% faster)


# ========== EDGE TEST CASES ==========


def test_latency_none_earlier_event():
    # Edge: earlier_event is None
    later = ModelActivityEvent(datetime.now(), frame_id=1)
    codeflash_output = compute_events_latency(
        None, later
    )  # 1.60μs -> 336ns (378% faster)


def test_latency_none_later_event():
    # Edge: later_event is None
    earlier = ModelActivityEvent(datetime.now(), frame_id=1)
    codeflash_output = compute_events_latency(
        earlier, None
    )  # 1.63μs -> 350ns (367% faster)


def test_latency_both_none():
    # Edge: both events are None
    codeflash_output = compute_events_latency(
        None, None
    )  # 1.46μs -> 336ns (333% faster)


def test_latency_different_frame_ids():
    # Edge: frame_ids do not match
    earlier = ModelActivityEvent(datetime(2024, 6, 1, 10, 0, 0), frame_id=1)
    later = ModelActivityEvent(datetime(2024, 6, 1, 10, 0, 5), frame_id=2)
    codeflash_output = compute_events_latency(
        earlier, later
    )  # 2.96μs -> 564ns (426% faster)


def test_latency_events_with_same_timestamp_different_frame_id():
    # Edge: same timestamp, different frame_id
    t = datetime(2024, 6, 1, 10, 0, 0)
    earlier = ModelActivityEvent(t, frame_id=1)
    later = ModelActivityEvent(t, frame_id=2)
    codeflash_output = compute_events_latency(
        earlier, later
    )  # 2.71μs -> 539ns (403% faster)


def test_latency_events_with_none_in_list_are_events_compatible():
    # Edge: are_events_compatible returns False if any event is None
    from inference.core.interfaces.stream.watchdog import are_events_compatible

    e1 = ModelActivityEvent(datetime.now(), frame_id=1)
    assert are_events_compatible([e1, None]) is False


def test_latency_minimum_datetime():
    # Edge: events with minimum datetime
    earlier = ModelActivityEvent(datetime.min, frame_id=1)
    later = ModelActivityEvent(datetime.min, frame_id=1)
    codeflash_output = compute_events_latency(
        earlier, later
    )  # 3.86μs -> 1.50μs (158% faster)


def test_latency_large_time_difference():
    # Edge: events with a large time difference
    earlier = ModelActivityEvent(datetime(2000, 1, 1, 0, 0, 0), frame_id=1)
    later = ModelActivityEvent(datetime(2020, 1, 1, 0, 0, 0), frame_id=1)
    expected = (later.event_timestamp - earlier.event_timestamp).total_seconds()
    codeflash_output = compute_events_latency(
        earlier, later
    )  # 3.72μs -> 1.00μs (272% faster)


# ========== LARGE SCALE TEST CASES ==========


def test_latency_large_scale_sequential_events():
    # Large scale: compute latency across a sequence of events with the same frame_id
    frame_id = 77
    base_time = datetime(2024, 6, 1, 0, 0, 0)
    events = [
        ModelActivityEvent(base_time + timedelta(seconds=i), frame_id=frame_id)
        for i in range(1000)
    ]
    # Test random pairs in the list
    for i in range(0, 1000, 100):  # test every 100th event
        earlier = events[i]
        later = events[i + 50]
        codeflash_output = compute_events_latency(
            earlier, later
        )  # 13.1μs -> 4.17μs (215% faster)


def test_latency_large_scale_all_different_frame_ids():
    # Large scale: all events have different frame_ids, should always return None
    base_time = datetime(2024, 6, 1, 0, 0, 0)
    events = [
        ModelActivityEvent(base_time + timedelta(seconds=i), frame_id=i)
        for i in range(100)
    ]
    for i in range(0, 99):
        codeflash_output = compute_events_latency(
            events[i], events[i + 1]
        )  # 87.4μs -> 17.2μs (407% faster)


def test_latency_large_scale_some_none_events():
    # Large scale: some events are None, should always return None
    frame_id = 42
    base_time = datetime(2024, 6, 1, 0, 0, 0)
    events = [
        (
            ModelActivityEvent(base_time + timedelta(seconds=i), frame_id=frame_id)
            if i % 10 != 0
            else None
        )
        for i in range(100)
    ]
    # Try pairs where one is None
    for i in range(0, 99):
        if events[i] is None or events[i + 1] is None:
            codeflash_output = compute_events_latency(events[i], events[i + 1])
        else:
            codeflash_output = compute_events_latency(events[i], events[i + 1])


def test_latency_large_scale_randomized():
    # Large scale: randomized frame_ids and timestamps, only matching frame_ids should work
    import random

    random.seed(42)
    base_time = datetime(2024, 6, 1, 0, 0, 0)
    events = []
    for i in range(500):
        frame_id = random.randint(1, 5)
        ts = base_time + timedelta(seconds=random.randint(0, 10000))
        events.append(ModelActivityEvent(ts, frame_id))
    # Pick pairs with same frame_id
    for frame_id in range(1, 6):
        filtered = [e for e in events if e.frame_id == frame_id]
        if len(filtered) >= 2:
            earlier, later = filtered[0], filtered[1]
            expected = (later.event_timestamp - earlier.event_timestamp).total_seconds()
            codeflash_output = compute_events_latency(earlier, later)
    # Pick pairs with different frame_id, should return None
    for i in range(10):
        e1 = events[i]
        e2 = events[-(i + 1)]
        if e1.frame_id != e2.frame_id:
            codeflash_output = compute_events_latency(e1, e2)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-compute_events_latency-miqpyqt3` and push.


@codeflash-ai codeflash-ai bot requested a review from mashraf-222 December 4, 2025 00:50
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Dec 4, 2025