Checking the equality of dimensions with HashSet #8

Open
krishnanand5 wants to merge 1 commit into zzjas:master from krishnanand5:fixed-flaky-avro-extensions
@krishnanand5

Description

This PR proposes a fix for two flaky tests: org.apache.druid.data.input.AvroStreamInputFormatTest.testParseSchemaless and org.apache.druid.data.input.AvroStreamInputRowParserTest.testParseSchemaless.

Both tests check whether the retrieved schema matches the expected schema stored in the test files. The schema elements are not retrieved in a guaranteed order, so the order-sensitive comparison fails intermittently. The flakiness was detected with the NonDex tool.

https://github.com/krishnanand5/druid/blob/ad32f8458670339808a3136a9578a10a52b8394f/extensions-core/avro-extensions/src/test/java/org/apache/druid/data/input/AvroStreamInputRowParserTest.java#L304

[ERROR] Failures: 
[ERROR] org.apache.druid.data.input.AvroStreamInputFormatTest.testParseSchemaless
[ERROR]   Run 1: AvroStreamInputFormatTest.testParseSchemaless:434 expected:<[nestedArrayVal, someOtherId, someIntArray, someFloat, someUnion, eventType, id, someFixed, someBytes, someEnum, someLong, someInt, timestamp]> but was:<[nestedArrayVal, someLong, someEnum, someFixed, someBytes, someInt, someIntArray, eventType, someUnion, someFloat, someOtherId, id, timestamp]>
[ERROR]   Run 2: AvroStreamInputFormatTest.testParseSchemaless:434 expected:<[nestedArrayVal, someOtherId, someIntArray, someFloat, someUnion, eventType, id, someFixed, someBytes, someEnum, someLong, someInt, timestamp]> but was:<[nestedArrayVal, someUnion, someLong, eventType, someFloat, someInt, someBytes, someEnum, id, someIntArray, timestamp, someFixed, someOtherId]>
[ERROR]   Run 3: AvroStreamInputFormatTest.testParseSchemaless:434 expected:<[nestedArrayVal, someOtherId, someIntArray, someFloat, someUnion, eventType, id, someFixed, someBytes, someEnum, someLong, someInt, timestamp]> but was:<[nestedArrayVal, someFixed, someFloat, someIntArray, id, someBytes, someInt, eventType, timestamp, someLong, someEnum, someOtherId, someUnion]>
[ERROR]   Run 4: AvroStreamInputFormatTest.testParseSchemaless:434 expected:<[nestedArrayVal, someOtherId, someIntArray, someFloat, someUnion, eventType, id, someFixed, someBytes, someEnum, someLong, someInt, timestamp]> but was:<[nestedArrayVal, someUnion, someFixed, someInt, timestamp, someEnum, eventType, someLong, someIntArray, someFloat, id, someBytes, someOtherId]>
[INFO] 
[INFO] 
[ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0

To overcome this problem, I propose using a HashSet for the equality check, so that the comparison no longer depends on element order.
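As a minimal sketch of the idea (not the actual change in this PR), the snippet below shows why a list comparison is order-sensitive while a HashSet comparison is not. The class name `DimensionOrderSketch` and the helper `sameDimensions` are hypothetical names used only for illustration; the dimension names are taken from the failure log above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

// Hypothetical illustration class; not part of the Druid code base.
public class DimensionOrderSketch {

    // Order-insensitive comparison: true when both lists contain
    // the same distinct elements, regardless of their order.
    public static boolean sameDimensions(List<String> expected, List<String> actual) {
        return new HashSet<>(expected).equals(new HashSet<>(actual));
    }

    public static void main(String[] args) {
        List<String> expected =
            Arrays.asList("nestedArrayVal", "someOtherId", "someIntArray", "timestamp");
        List<String> actual =
            Arrays.asList("nestedArrayVal", "someIntArray", "timestamp", "someOtherId");

        // List.equals compares element by element, so a reordering fails.
        System.out.println(expected.equals(actual));          // prints "false"

        // Set equality ignores ordering, so the same elements match.
        System.out.println(sameDimensions(expected, actual)); // prints "true"
    }
}
```

One trade-off of this approach is that a set comparison also ignores duplicate elements, which is acceptable only when the compared names are expected to be unique.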

Steps to verify the patch:

  1. Clone https://github.com/apache/druid.git
  2. mvn install -pl extensions-core/avro-extensions -am -DskipTests
  3. mvn -pl extensions-core/avro-extensions test -Dtest=org.apache.druid.data.input.AvroStreamInputFormatTest#testParseSchemaless
    (or)
    mvn -pl extensions-core/avro-extensions test -Dtest=org.apache.druid.data.input.AvroStreamInputRowParserTest#testParseSchemaless
  4. mvn -pl extensions-core/avro-extensions edu.illinois:nondex-maven-plugin:2.1.1:nondex -Dtest=org.apache.druid.data.input.AvroStreamInputFormatTest#testParseSchemaless
    (or)
    mvn -pl extensions-core/avro-extensions edu.illinois:nondex-maven-plugin:2.1.1:nondex -Dtest=org.apache.druid.data.input.AvroStreamInputRowParserTest#testParseSchemaless
  5. To further verify the reliability of the fix, set the number of NonDex runs to 100:
    mvn -pl extensions-core/avro-extensions edu.illinois:nondex-maven-plugin:2.1.1:nondex -DnondexRuns=100 -Dtest=org.apache.druid.data.input.AvroStreamInputFormatTest#testParseSchemaless
    (or)
    mvn -pl extensions-core/avro-extensions edu.illinois:nondex-maven-plugin:2.1.1:nondex -DnondexRuns=100 -Dtest=org.apache.druid.data.input.AvroStreamInputRowParserTest#testParseSchemaless

Release note

Fixed minor flakiness observed in assertInputRowCorrect().


This PR has:

  • been self-reviewed.
  • using the concurrency checklist (Remove this item if the PR doesn't have any relation to concurrency.)
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.
