Skip to content

chore(crons): Move monitor update logic to be entirely inside the serializer take 2 #97943

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

wedamija
Copy link
Member

This consolidates update logic to be inside of the serializer. It also fixes some weirdness where we seem to save the monitor first, but then perform other validation that can fail.

The second commit here fixes a problem where when we update a monitor with a parital config, we end up dropping some required fields

@wedamija wedamija requested a review from a team as a code owner August 15, 2025 18:48
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Aug 15, 2025
MonitorCheckIn.objects.filter(
monitor_id=instance.id, status=CheckInStatus.IN_PROGRESS
).update(timeout_at=TruncMinute(F("date_added")) + get_max_runtime(max_runtime))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Serializer Logic Mismatch Causes Unnecessary Updates

In MonitorSerializer.update, the logic for checkin_margin and max_runtime updates compares against validated_data["config"]. This partial data can incorrectly trigger updates to MonitorEnvironment and MonitorCheckIn objects, even when these values haven't changed, potentially passing None to update functions.

Fix in Cursor Fix in Web

wedamija added a commit that referenced this pull request Aug 15, 2025
We had a bug where we accidentally didn't save all the required fields for the monitor config. This causes a bug when we copy the config over to the monitor checkin where we end up storing a null value.

We should instead treat this as a warning - if the config is already stored in the monitor, it should be replicated regardless of whether it's valid or not. The validation should happen at mutation time, which is being fixed in #97943

Fixes SENTRY-4DDN
Comment on lines +449 to +461
if "config" in validated_data:
new_config = validated_data["config"]
checkin_margin = new_config.get("checkin_margin")
if checkin_margin != existing_margin:
MonitorEnvironment.objects.filter(monitor_id=instance.id).update(
next_checkin_latest=F("next_checkin") + get_checkin_margin(checkin_margin)
)

max_runtime = new_config.get("max_runtime")
if max_runtime != existing_max_runtime:
MonitorCheckIn.objects.filter(
monitor_id=instance.id, status=CheckInStatus.IN_PROGRESS
).update(timeout_at=TruncMinute(F("date_added")) + get_max_runtime(max_runtime))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential bug: The timeout update logic in MonitorValidator uses a partial config, causing custom timeouts to be reset to default values during unrelated partial updates.
  • Description: When a monitor's configuration is partially updated (e.g., only changing the schedule), the timeout update logic incorrectly uses the partial validated_data["config"]. This dictionary lacks keys like checkin_margin or max_runtime if they were not part of the update. The code then compares None (from new_config.get("checkin_margin")) with the existing value, which evaluates to true and triggers an update. This incorrectly resets the monitor's checkin_margin and max_runtime to their default values, causing users to lose their custom timeout settings and potentially leading to incorrect monitor alerting.

  • Suggested fix: The timeout update logic should check if checkin_margin and max_runtime were present in the incoming validated_data["config"] before comparing and updating their values. This ensures that only explicitly changed values are updated and prevents accidental resets.
    severity: 0.7, confidence: 0.95

Did we get this right? 👍 / 👎 to inform future reviews.

Copy link

codecov bot commented Aug 15, 2025

Codecov Report

❌ Patch coverage is 95.29412% with 4 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/sentry/monitors/validators.py 94.80% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #97943      +/-   ##
==========================================
+ Coverage   79.55%   80.63%   +1.08%     
==========================================
  Files        8593     8595       +2     
  Lines      379066   379100      +34     
  Branches    24700    24700              
==========================================
+ Hits       301558   305683    +4125     
+ Misses      77130    73039    -4091     
  Partials      378      378              

wedamija added a commit that referenced this pull request Aug 15, 2025
We had a bug where we accidentally didn't save all the required fields
for the monitor config. This causes a bug when we copy the config over
to the monitor checkin where we end up storing a null value.

We should instead treat this as a warning - if the config is already
stored in the monitor, it should be replicated regardless of whether
it's valid or not. The validation should happen at mutation time, which
is being fixed in #97943

Fixes SENTRY-4DDN
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Scope: Backend Automatically applied to PRs that change backend components
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant