[b/505303589] Updated parser validation script by prasoonbirla-google · Pull Request #710 · chronicle/content-hub

prasoonbirla-google · 2026-04-22T11:41:26Z

Fix: Improve parser validation output structure and timestamp handling

Description

What problem does this PR solve?
This PR addresses several issues in the parser validation script (run_parser_validations.py) to make test event comparisons more accurate and to improve debugging capabilities:

The script was previously discarding many UDM fields returned by the parser API, selectively picking only metadata and additional, which caused structure mismatches with the expected test_events.json files.
The log_type passed to the validation API was hardcoded to a dummy value, which could affect parser routing or results.
Timestamps were causing false-positive test failures due to differences in microsecond zero-padding (e.g., .198Z vs .198000Z) and due to the diffing tool not ignoring the camelCase eventTimestamp field when the year falls back to the current execution year.
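
To illustrate the zero-padding mismatch, here is a minimal sketch that is not taken from the script. The sample timestamp values and the format string are assumptions chosen for the example. It shows two strings that differ as text but describe the same instant, which is why a plain string diff reports a failure.

```python
from datetime import datetime

# Assumed layout of the timestamps under comparison (illustrative only).
FMT = "%Y-%m-%dT%H:%M:%S.%fZ"

short = "2024-01-15T10:30:00.198Z"      # millisecond precision
padded = "2024-01-15T10:30:00.198000Z"  # same instant, zero-padded to microseconds

# A textual comparison treats the two values as different...
same_text = short == padded

# ...while parsing both shows they are the same point in time,
# because %f right-pads a short fraction with zeros.
same_instant = datetime.strptime(short, FMT) == datetime.strptime(padded, FMT)

print(same_text, same_instant)
```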

How does this PR solve the problem?

Preserve Full UDM Structure: Modified the event transformation logic to nest the complete raw event payload returned by the parser under idm.readOnlyUdm (using camelCase as expected by the tests), ensuring fields like principal, target, and observer are correctly included in the validation phase.
Dynamic Log Type Resolution: The script now checks for metadata.json in the cbn directory and extracts the actual logType to pass into chronicle_client.run_parser(), gracefully falling back to a default if unavailable.
Timestamp Formatting and Filtering Fixes:
- Updated normalize_timestamp() to cleanly strip trailing zeros from microseconds, matching the canonical expected log format.
- Updated filter_timestamps() to properly ignore the camelCase eventTimestamp field during the symmetric diff, preventing false failures caused by fallback execution years.
JSON Output for Debugging: Replaced the print(validation_results) statement with file I/O that cleanly dumps the API payload into a validation_results.json file in the same directory as the generated markdown report.
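
The trailing-zero stripping described above could look roughly like the sketch below. This is not the code from run_parser_validations.py: the regular expression, the handling of numeric UTC offsets, and the choice to drop an all-zero fraction entirely are assumptions made for illustration.

```python
import re

# Fractional seconds that sit directly before a "Z" or a numeric UTC offset.
_FRACTION = re.compile(r"\.(\d+)(?=Z$|[+-]\d{2}:\d{2}$)")


def normalize_timestamp(value: str) -> str:
    """Strip trailing zeros from the fractional-second part of a timestamp."""

    def _strip(match: re.Match) -> str:
        digits = match.group(1).rstrip("0")
        # Assumption: an all-zero fraction is removed together with the dot.
        return f".{digits}" if digits else ""

    return _FRACTION.sub(_strip, value)


print(normalize_timestamp("2024-01-15T10:30:00.198000Z"))
print(normalize_timestamp("2024-01-15T10:30:00Z"))
```

Timestamps without a fractional part pass through unchanged, so the helper can be applied to both the expected and the actual side before diffing.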

Any other relevant information (e.g., design choices, tradeoffs, known issues):
The timestamp normalizer handles cases with and without existing milliseconds. By keeping the complete unmodified event mapped directly into readOnlyUdm, the validation suite now acts as a much stricter and more accurate gate against parser regressions across all UDM fields.
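
As a rough sketch of the design choice, the two helpers below contrast selective field picking with nesting the whole event. Both function names and the sample event are hypothetical; only the idm.readOnlyUdm path and the field names metadata, additional, principal, and target come from the description above.

```python
import copy


def pick_selected_fields(raw_event: dict) -> dict:
    """Hypothetical stand-in for the earlier behaviour: keep two fields only."""
    return {key: raw_event[key] for key in ("metadata", "additional") if key in raw_event}


def nest_full_event(raw_event: dict) -> dict:
    """Nest the complete event under idm.readOnlyUdm without dropping fields."""
    return {"idm": {"readOnlyUdm": copy.deepcopy(raw_event)}}


# Illustrative event; real parser output carries many more fields.
raw_event = {
    "metadata": {"eventType": "NETWORK_CONNECTION"},
    "principal": {"ip": ["10.0.0.1"]},
    "target": {"ip": ["10.0.0.2"]},
}

selected = pick_selected_fields(raw_event)
nested = nest_full_event(raw_event)
print(sorted(selected))
print(sorted(nested["idm"]["readOnlyUdm"]))
```

With the nested form, a regression in principal or target shows up in the diff, whereas the selective form would never see it.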

Checklist:

Please ensure you have completed the following items before submitting your PR.
This helps us review your contribution faster and more efficiently.

General Checks:

I have read and followed the project's contributing.md guide.
My code follows the project's coding style guidelines.
I have performed a self-review of my own code.
My changes do not introduce any new warnings.
My changes pass all existing tests.
I have added new tests where appropriate to cover my changes. (If applicable)
I have updated the documentation where necessary (e.g., README, API docs). (If applicable)

Open-Source Specific Checks:

My changes do not introduce any Personally Identifiable Information (PII) or sensitive customer data.
My changes do not expose any internal-only code examples, configurations, or URLs.
All code examples, comments, and messages are generic and suitable for a public repository.
I understand that any internal context or sensitive details related to this work are handled separately in internal systems (Buganizer for Google team members).

For Google Team Members and Reviewers Only:

I have included the Buganizer ID in the PR title or description (e.g., "Internal Buganizer ID: 123456789" or "Related Buganizer: go/buganizer/123456789").
I have ensured that all internal discussions and PII related to this work remain in Buganizer.
I have tagged the PR with one or more labels that reflect the pull request purpose.


gemini-code-assist · 2026-04-22T11:41:43Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the parser validation script to improve test accuracy and debugging capabilities. By preserving the full UDM structure, dynamically resolving log types, and refining timestamp normalization, the changes reduce false-positive test failures and provide a more robust validation process for parser regressions.

Highlights

UDM Structure Preservation: Modified event transformation to nest the complete raw event payload under readOnlyUdm, ensuring all UDM fields are included in validation.
Dynamic Log Type Resolution: Updated the script to extract the logType from metadata.json if available, replacing the previous hardcoded dummy value.
Timestamp Normalization: Improved timestamp handling by stripping trailing zeros from microseconds and ignoring the eventTimestamp field during diffing to prevent false-positive test failures.
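
For reference, ignoring a timestamp field during the diff can be done by dropping it from both sides before comparison. The sketch below is an assumption about the general approach, not the script's actual filter_timestamps(); in particular, the snake_case key in the ignore set is a guess at what was filtered before this change.

```python
# Keys removed before diffing. "eventTimestamp" is named in the PR;
# "event_timestamp" is an assumed pre-existing spelling.
IGNORED_KEYS = frozenset({"eventTimestamp", "event_timestamp"})


def filter_timestamps(node):
    """Return a copy of a JSON-like structure without the ignored keys."""
    if isinstance(node, dict):
        return {
            key: filter_timestamps(value)
            for key, value in node.items()
            if key not in IGNORED_KEYS
        }
    if isinstance(node, list):
        return [filter_timestamps(item) for item in node]
    return node


expected = {"metadata": {"eventTimestamp": "2025-03-01T00:00:00Z", "productName": "demo"}}
actual = {"metadata": {"eventTimestamp": "2026-03-01T00:00:00Z", "productName": "demo"}}

print(filter_timestamps(expected) == filter_timestamps(actual))
```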


Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request refactors timestamp normalization to handle microseconds, updates timestamp filtering, and introduces dynamic log type detection from a metadata file. It also simplifies the UDM event construction by nesting the original event under a new 'readOnlyUdm' key. Feedback includes a security concern regarding the use of 'json.loads' instead of 'yaml.safe_load' as per the style guide, a suggestion for more robust I/O error handling, and a note about missing logic for dumping validation results mentioned in the PR description.
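
One way the I/O handling and the result dump mentioned in the review could be approached is sketched below. The function names, the default log type value, and the demo at the end are hypothetical. The sketch parses metadata.json with the standard-library json module to stay dependency-free; the review suggests yaml.safe_load instead, and that call could replace json.load here if PyYAML is available.

```python
import json
import tempfile
from pathlib import Path

DEFAULT_LOG_TYPE = "UNKNOWN_LOG_TYPE"  # hypothetical fallback value


def resolve_log_type(cbn_dir: Path, default: str = DEFAULT_LOG_TYPE) -> str:
    """Read logType from metadata.json, falling back on any read or parse error."""
    metadata_path = cbn_dir / "metadata.json"
    try:
        with metadata_path.open(encoding="utf-8") as handle:
            metadata = json.load(handle)
    except (OSError, ValueError):
        # OSError: missing or unreadable file. ValueError: malformed JSON or bad encoding.
        return default
    log_type = metadata.get("logType") if isinstance(metadata, dict) else None
    return log_type if isinstance(log_type, str) and log_type else default


def dump_validation_results(results, report_path: Path) -> Path:
    """Write results to validation_results.json next to the markdown report."""
    out_path = report_path.parent / "validation_results.json"
    with out_path.open("w", encoding="utf-8") as handle:
        json.dump(results, handle, indent=2, sort_keys=True, default=str)
    return out_path


# Small demo in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    cbn_dir = Path(tmp)
    missing = resolve_log_type(cbn_dir)  # no metadata.json yet
    (cbn_dir / "metadata.json").write_text('{"logType": "EXAMPLE_LOG"}', encoding="utf-8")
    found = resolve_log_type(cbn_dir)
    out_path = dump_validation_results({"events": []}, cbn_dir / "report.md")
    written = json.loads(out_path.read_text(encoding="utf-8"))

print(missing, found, out_path.name)
```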

Updated parser validation script

b2378aa

prasoonbirla-google requested a review from a team as a code owner April 22, 2026 11:41

gemini-code-assist Bot reviewed Apr 22, 2026


Comment thread tools/parsers/validations/run_parser_validations.py

Comment thread tools/parsers/validations/run_parser_validations.py

TalShafir1 approved these changes Apr 23, 2026

prasoonbirla-google wants to merge 1 commit into main from updated-parser-validation-script
