Allow unevaluated properties to extend the configuration #514
Summary of Changes (Gemini Code Assist)
This pull request makes the extractor configuration more flexible by modifying its JSON schema. By removing the "unevaluatedProperties": false constraint, the configuration can now be extended with additional properties, which was previously prevented.
Code Review
This pull request proposes to allow extending the base extractor configuration by removing the "unevaluatedProperties": false constraint from the JSON schema. While this makes the configuration more flexible, it introduces a critical risk by disabling static validation for typos in property names. This could lead to misconfigurations that are only caught at runtime, potentially causing serious issues like data loss or incorrect processing. I have added a review comment detailing this concern and suggesting a safer alternative that maintains validation while still allowing for schema extension.
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "description": "Base extractor configuration object",
  "unevaluatedProperties": false,
Removing "unevaluatedProperties": false allows for extending the configuration, but it critically weakens static validation. This change means that typos in standard property names (e.g., state-stor instead of state-store) will no longer be detected by JSON schema validators in IDEs or CI pipelines.
While Pydantic's runtime validation will eventually catch such errors, the lack of static validation increases the risk of deploying misconfigured extractors. For example, a misspelled state-store configuration would cause the extractor to fall back to a default LocalStateStore, potentially leading to data reprocessing or loss of state, which is a critical correctness issue.
To maintain robust validation, I recommend keeping "unevaluatedProperties": false and documenting how users can extend the schema using allOf. This approach ensures that base properties are always validated correctly while still providing a clear path for extensibility.
Example for an extending schema:
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "allOf": [
    { "$ref": "https://.../extractor_config.schema.json" }
  ],
  "properties": {
    "my-custom-property": { "type": "string" }
  }
}
Isn't this what we are currently doing? Why is this change necessary really?
unevaluatedProperties: false prevents extending the schema via allOf with additional properties.
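To illustrate the point above, here is a minimal sketch in Python of how unevaluatedProperties scopes its check. It is a toy model, not a real validator and not the project's code: the evaluate function and the BASE, EXTENDED, OPEN_BASE and STRICT_EXTENDED schemas are hypothetical, only three keywords are handled, and the base schema is inlined in allOf where a real extension would use $ref. The property names state-store and my-custom-property are taken from the discussion.

```python
def evaluate(schema, instance):
    """Toy check for allOf, properties and unevaluatedProperties: false.

    Returns (errors, evaluated_names). Every other keyword is ignored.
    """
    errors = []
    evaluated = set()

    # Names evaluated inside allOf subschemas are visible to this schema.
    for sub in schema.get("allOf", []):
        sub_errors, sub_evaluated = evaluate(sub, instance)
        errors.extend(sub_errors)
        evaluated |= sub_evaluated

    # A declared property that is present counts as evaluated.
    for name in schema.get("properties", {}):
        if name in instance:
            evaluated.add(name)

    # The check sees only this schema object and its subschemas,
    # never the schema that references it.
    if schema.get("unevaluatedProperties") is False:
        for name in sorted(instance):
            if name not in evaluated:
                errors.append(f"unevaluated property: {name}")

    return errors, evaluated


# Closed base, as before this PR (hypothetical, reduced to one property).
BASE = {"properties": {"state-store": {}}, "unevaluatedProperties": False}
EXTENDED = {"allOf": [BASE], "properties": {"my-custom-property": {}}}

# Open base, as after this PR; the extending schema closes the object itself.
OPEN_BASE = {"properties": {"state-store": {}}}
STRICT_EXTENDED = {
    "allOf": [OPEN_BASE],
    "properties": {"my-custom-property": {}},
    "unevaluatedProperties": False,
}

config = {"state-store": {}, "my-custom-property": "x"}
typo_config = {"state-stor": {}, "my-custom-property": "x"}

print(evaluate(EXTENDED, config)[0])
# ['unevaluated property: my-custom-property']
print(evaluate(STRICT_EXTENDED, config)[0])
# []
print(evaluate(STRICT_EXTENDED, typo_config)[0])
# ['unevaluated property: state-stor']
```

In this sketch the closed base rejects my-custom-property because the check inside the base cannot see properties declared by the schema that references it. With an open base, an extending schema that sets "unevaluatedProperties": false on itself still flags the state-stor typo, so static typo detection depends on the outermost schema choosing to close the object.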
Codecov Report
✅ All modified and coverable lines are covered by tests.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #514      +/-   ##
==========================================
+ Coverage   81.25%   81.30%   +0.04%
==========================================
  Files          43       43
  Lines        4210     4210
==========================================
+ Hits         3421     3423       +2
+ Misses        789      787       -2
This PR allows unevaluated properties so that the configuration can be extended. With "unevaluatedProperties": false set, the extractor config cannot be extended.