Skip to content

GODRIVER-3638 Prohibit using failpoints on sharded topologies. #2168

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

matthewdale
Copy link
Collaborator

@matthewdale matthewdale commented Aug 12, 2025

GODRIVER-3638

Summary

  • If SetFailPoint is called when mtest is connected to a sharded topology.
  • Add an mtest option AllowFailPointsOnSharded to allow using failpoints on sharded topologies.

Background & Motivation

On sharded topologies, failpoints are applied to only a single mongoS. If the driver is connected to multiple mongoS instances, there's a possibility a different mongoS will be selected for a subsequent command. In that case, the failpoint is effectively ignored, leading to a test failure that is extremely difficult to diagnose.

Note that if we implement GODRIVER-3328, we no longer need to prohibit using failpoints on sharded topologies.

@matthewdale matthewdale force-pushed the godriver3638-prohibit-failpoint branch from 3ab7bb9 to 9aa3b5a Compare August 12, 2025 23:38
Copy link
Contributor

API Change Report

No changes found!

Copy link
Contributor

mongodb-drivers-pr-bot bot commented Aug 12, 2025

🧪 Performance Results

Commit SHA: de83205

The following benchmark tests for version 689ebf5509236500071bcbf9 had statistically significant changes (i.e., |z-score| > 1.96):

Benchmark Measurement % Change Patch Value Stable Region H-Score Z-Score
BenchmarkMultiInsertLargeDocument total_time_seconds 17.5686 1.2747 Avg: 1.0842
Med: 1.0816
Stdev: 0.0488
0.8534 3.9068
BenchmarkBSONDeepDocumentDecoding ops_per_second_min -12.7165 1724.9141 Avg: 1976.2210
Med: 1984.2671
Stdev: 121.0901
0.7361 -2.0754
BenchmarkSingleFindOneByID ns_per_op 12.3064 280211.0000 Avg: 249505.8032
Med: 248200.0000
Stdev: 7712.0254
0.8686 3.9815
BenchmarkSingleFindOneByID ops_per_second_med -11.4774 3644.3414 Avg: 4116.8474
Med: 4138.1822
Stdev: 116.3377
0.8713 -4.0615
BenchmarkSingleRunCommand total_bytes_allocated -11.2053 87749080.0000 Avg: 98822432.8182
Med: 99663608.0000
Stdev: 5104085.6817
0.7753 -2.1695
BenchmarkSingleRunCommand total_mem_allocs -10.8790 950527.0000 Avg: 1066557.3750
Med: 1075540.5000
Stdev: 54558.0767
0.7703 -2.1267
BenchmarkLargeDocInsertOne ops_per_second_med -8.7716 5208.2113 Avg: 5708.9783
Med: 5716.2701
Stdev: 153.3932
0.8383 -3.2646
BenchmarkSingleFindOneByID ops_per_second_max -6.3893 4330.8229 Avg: 4626.4169
Med: 4626.5025
Stdev: 106.1006
0.8106 -2.7860
BenchmarkBSONFullDocumentEncoding total_bytes_allocated -5.4923 229160736.0000 Avg: 242478287.0909
Med: 242695476.0000
Stdev: 3720589.4273
0.8429 -3.5794
BenchmarkBSONFullDocumentEncoding total_mem_allocs -5.4816 1323641.0000 Avg: 1400405.4773
Med: 1401957.0000
Stdev: 21670.3891
0.8412 -3.5424
BenchmarkBSONFullDocumentEncoding ns_per_op 5.2213 26972.0000 Avg: 25633.5909
Med: 25543.5000
Stdev: 379.8047
0.8432 3.5239
BenchmarkBSONFullDocumentEncoding ops_per_second_med -3.6557 40217.1727 Avg: 41743.1955
Med: 41743.2250
Stdev: 579.9676
0.7902 -2.6312
BenchmarkBSONFullDocumentEncoding ops_per_second_max -3.3123 41720.5557 Avg: 43149.8004
Med: 43119.2521
Stdev: 527.6487
0.7955 -2.7087
BenchmarkSingleRunCommand allocated_bytes_per_op 0.1269 12212.0000 Avg: 12196.5217
Med: 12196.0000
Stdev: 4.3991
0.8537 3.5185
BenchmarkBSONFlatDocumentEncoding allocated_bytes_per_op 0.0980 6278.0000 Avg: 6271.8523
Med: 6272.0000
Stdev: 3.0078
0.7342 2.0439

For a comprehensive view of all microbenchmark results for this PR's commit, please check out the Evergreen perf task for this patch.

@matthewdale matthewdale force-pushed the godriver3638-prohibit-failpoint branch from 9aa3b5a to 7645c94 Compare August 15, 2025 04:30
@matthewdale matthewdale added enhancement review-priority-low Low Priority PR for Review: within 3 business days labels Aug 15, 2025
@matthewdale matthewdale marked this pull request as ready for review August 15, 2025 04:49
@Copilot Copilot AI review requested due to automatic review settings August 15, 2025 04:49
@matthewdale matthewdale requested a review from a team as a code owner August 15, 2025 04:49
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements a safety mechanism to prevent unreliable failpoint usage on sharded MongoDB topologies by prohibiting SetFailPoint calls unless explicitly allowed through a new option.

  • Adds AllowFailPointsOnSharded() option to bypass the failpoint prohibition on sharded topologies
  • Modifies existing tests to either exclude sharded topologies or explicitly allow failpoints when needed
  • Implements runtime validation in SetFailPoint to prevent accidental failpoint usage on sharded clusters

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.

Show a summary per file
File Description
internal/integration/mtest/options.go Adds AllowFailPointsOnSharded() method to enable failpoint usage on sharded topologies
internal/integration/mtest/mongotest.go Implements failpoint validation logic and stores the allowFailPointsOnSharded flag
internal/integration/server_selection_prose_test.go Uses AllowFailPointsOnSharded() for a test that requires failpoints on sharded topology
internal/integration/retryable_writes_prose_test.go Excludes sharded topology from one test and allows failpoints for another
internal/integration/retryable_reads_prose_test.go Excludes sharded topology from one test and allows failpoints for another
internal/integration/sdam_prose_test.go Excludes sharded topology from RTT monitoring test
internal/integration/csot_prose_test.go Excludes sharded topology from multiple GridFS timeout tests
internal/integration/crud_prose_test.go Excludes sharded topology from bulk write error handling tests

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
You can also share your feedback on Copilot code review for a chance to win a $100 gift card. Take the survey.

@matthewdale matthewdale force-pushed the godriver3638-prohibit-failpoint branch from 7645c94 to de83205 Compare August 15, 2025 05:02
@matthewdale matthewdale merged commit 42852c5 into mongodb:master Aug 19, 2025
34 of 36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement review-priority-low Low Priority PR for Review: within 3 business days
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants