@neoneo008

Improve MongoDB bulkWrite performance by making the following optional parameters configurable:

  1. Unordered writes: https://mongodb.github.io/mongo-java-driver/5.6/apidocs/driver-core/com/mongodb/client/model/BulkWriteOptions.html#ordered(boolean)
  2. Bypass document validation: https://mongodb.github.io/mongo-java-driver/5.6/apidocs/driver-core/com/mongodb/client/model/BulkWriteOptions.html#bypassDocumentValidation(java.lang.Boolean)

This is a backward-compatible change: the default configuration values preserve the existing behaviour of ordered writes with document validation.

@neoneo008
Author

@Jiabao-Sun can you please help review?

@Savonitar Savonitar left a comment

Thanks for adding support for MongoDB's ordered and bypassDocumentValidation write options!
The PR looks good overall.
However, I have concerns about the test coverage and whether the tests actually validate that these parameters work as intended when failures occur.

}

@Test
void unorderedWrite() throws Exception {

nit: The naming style is slightly inconsistent (unorderedWrite vs testRecovery)
Maybe it is cleaner to pick one style?

void bypassDocumentValidation() throws Exception {
    final String collection = "test-sink-with-bypass-doc-validation";
    final MongoSink<Document> sink =
            createSink(collection, DeliveryGuarantee.AT_LEAST_ONCE, true, false);

createSink(collection, DeliveryGuarantee.AT_LEAST_ONCE, true, false);

In this test, you are passing false for the bypass parameter, which effectively disables the feature you intend to test. Or am I missing something?

}

@Test
void bypassDocumentValidation() throws Exception {

env.fromSequence(1, 5).map(new TestMapFunction()).sinkTo(sink);
env.execute();
assertThatIdsAreWritten(collectionOf(collection), 1, 2, 3, 4, 5);

The current tests don't fully validate the behavioral difference between ordered and unordered writes.
It would be valuable to add a test case that injects a failure (e.g., a duplicate key error) in the middle of a batch.
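
To make the expected difference concrete, here is a minimal plain-Java model of the semantics such a test could assert on. It does not use the MongoDB driver or the connector: `bulkInsert` and the class name are hypothetical, and `_id` values are modelled as integers in a set. It only encodes the documented behaviour that an ordered bulk write stops at the first error, while an unordered one attempts every remaining operation.

```java
import java.util.List;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

public class BulkWriteOrderingSketch {

    /**
     * Simulates inserting a batch of ids into a collection that already holds
     * some ids. A repeated id stands in for a duplicate key error.
     * Returns the ids present in the collection after the bulk write.
     */
    static SortedSet<Integer> bulkInsert(Set<Integer> existing, List<Integer> batch, boolean ordered) {
        SortedSet<Integer> stored = new TreeSet<>(existing);
        for (int id : batch) {
            boolean duplicate = !stored.add(id);
            if (duplicate && ordered) {
                // Ordered: the first failing operation aborts the rest of the batch.
                break;
            }
            // Unordered: the failure is recorded by the server, later operations still run.
        }
        return stored;
    }

    public static void main(String[] args) {
        Set<Integer> existing = Set.of(3);          // id 3 was written before the batch
        List<Integer> batch = List.of(1, 2, 3, 4, 5);

        System.out.println(bulkInsert(existing, batch, true));   // prints [1, 2, 3]
        System.out.println(bulkInsert(existing, batch, false));  // prints [1, 2, 3, 4, 5]
    }
}
```

In an integration test against a real collection, the equivalent would be to pre-insert a document with `_id` 3, run the sink over 1 to 5, and assert that ids 4 and 5 are present only when ordered writes are disabled. The job itself may still fail on the duplicate key error in both cases, so the assertion likely needs to inspect the collection after the failure.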


env.fromSequence(1, 5).map(new TestMapFunction()).sinkTo(sink);
env.execute();
assertThatIdsAreWritten(collectionOf(collection), 1, 2, 3, 4, 5);

I think the current test passes regardless of the flag because the collection has no validation rules.
To verify that bypassDocumentValidation works, maybe we can create the collection with a validator and assert that writes of non-conforming documents succeed only when the bypass flag is enabled?
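
A rough sketch of that idea using the sync Java driver directly, not the connector's test utilities. It assumes `mongodb-driver-sync` is on the classpath and a MongoDB instance is reachable at `mongodb://localhost:27017`; the database name, collection name, and `required_field` are made up for illustration.

```java
import com.mongodb.MongoBulkWriteException;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import com.mongodb.client.model.BulkWriteOptions;
import com.mongodb.client.model.CreateCollectionOptions;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.InsertOneModel;
import com.mongodb.client.model.ValidationOptions;
import java.util.List;
import org.bson.Document;

public class BypassValidationSketch {
    public static void main(String[] args) {
        // Assumption: a local MongoDB instance; names below are illustrative only.
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoDatabase db = client.getDatabase("sketch_db");
            db.getCollection("validated").drop();

            // Validator: every document must contain "required_field".
            db.createCollection(
                    "validated",
                    new CreateCollectionOptions()
                            .validationOptions(
                                    new ValidationOptions()
                                            .validator(Filters.exists("required_field"))));
            MongoCollection<Document> coll = db.getCollection("validated");

            // This document violates the validator on purpose.
            List<InsertOneModel<Document>> batch =
                    List.of(new InsertOneModel<>(new Document("_id", 1)));

            boolean rejected = false;
            try {
                coll.bulkWrite(batch, new BulkWriteOptions().bypassDocumentValidation(false));
            } catch (MongoBulkWriteException e) {
                rejected = true; // expected: the server reports a validation write error
            }
            if (!rejected) {
                throw new AssertionError("write should fail while validation is enforced");
            }

            // With the bypass flag set, the same write should be accepted.
            coll.bulkWrite(batch, new BulkWriteOptions().bypassDocumentValidation(true));
            if (coll.countDocuments() != 1) {
                throw new AssertionError("write should succeed when validation is bypassed");
            }
        }
    }
}
```

In the connector's test, the same collection setup could run before `env.execute()`, with one case per flag value, so that the test fails if the option is not actually propagated to the bulk write.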
