NRL-1215 Detect duplicate fields #801
Merged: mattdean3-nhs merged 27 commits into develop from feature/axkr1-NRL-1215-duplicate-fields on Jan 30, 2025
Commits
- b9f8b59 NRL-1215 Add json duplicate checker and tests, use it for request (axelkrastek1-nhs)
- cd586dc NRL-1215 Fix implementation, refine type hints (axelkrastek1-nhs)
- 2d38b6a NRL-1215 Add extra edge case (axelkrastek1-nhs)
- 4e77c0a NRL-1215 Add tests for duplicates of each field in DocumentReference (axelkrastek1-nhs)
- d079bce NRL-1215 Fix linting issues (axelkrastek1-nhs)
- bfbbf8a NRL-1215 Fix error code for duplicates (axelkrastek1-nhs)
- 4fab5ab NRL-1215 Fix possible intermittent related to dict order (axelkrastek1-nhs)
- af54474 NRL-1215 Add logging for duplicate key checks and integration tests f… (axelkrastek1-nhs)
- 03a820c NRL-1215 Add test for edge case with duplicate keys in JSON arrays (axelkrastek1-nhs)
- 4467cf7 NRL-1215 Enhance duplicate detection for JSON arrays and add correspo… (axelkrastek1-nhs)
- 705b372 NRL-1215 Remove commented-out code for duplicate detection in JSON ar… (axelkrastek1-nhs)
- fce0d9a NRL-1215 Fix linting issues (axelkrastek1-nhs)
- b07d007 [NRL-1215] WIP - Using hooks instead of class for dupe detection (mattdean3-nhs)
- 20ba00e NRL-1215 Alternative implementation passing all tests (axelkrastek1-nhs)
- 633c565 NRL-1215 Fix linting, remove redundant code (axelkrastek1-nhs)
- 688b2a7 NRL-1215 Simplify implementation (axelkrastek1-nhs)
- f017a8f NRL-1215 Add extra docstring (axelkrastek1-nhs)
- 8c54886 NRL-1215 Fix case sensitive detection (axelkrastek1-nhs)
- 8bf8776 NRL-1215 Add more prints for test report (axelkrastek1-nhs)
- 5dbbd92 NRL-1215 Clean up prints (axelkrastek1-nhs)
- 5bc99cb NRL-1215 Fix remove extra prints (axelkrastek1-nhs)
- 0235c76 NRL-1215 Add docstrings (axelkrastek1-nhs)
- ef8f45a NRL-1215 Log when duplicate key in post body (axelkrastek1-nhs)
- 023e514 [NRL-1215] Switch root to DocumentReference in json dupe checker and … (mattdean3-nhs)
- 9601c3f [NRL-1215] Add __str__ to OperationOutcomeError to allow logging of e… (mattdean3-nhs)
- e8d35e4 [NRL-1215] Fixup integ tests for root. fix (mattdean3-nhs)
- 67af076 Merge branch 'develop' into feature/axkr1-NRL-1215-duplicate-fields (mattdean3-nhs)
@@ -0,0 +1,74 @@

```python
import json
from typing import Any


def check_for_duplicate_keys(pairs: list[tuple[str, Any]]) -> dict[str, Any]:
    """Custom JSON object_pairs_hook that checks for duplicate keys."""
    keys: dict[str, Any] = {}
    dupes: dict[str, Any] = {}
    normalized_keys: list[str] = []

    for key, value in pairs:
        # Compare case-insensitively, so "Status" and "status" count as duplicates.
        normalized_key = key.lower()
        if normalized_key in normalized_keys:
            dupes.setdefault(key, []).append(value)
        else:
            keys[key] = value
            normalized_keys.append(normalized_key)

    if dupes:
        # Record duplicates under a sentinel key for flatten_duplicates to find.
        keys["__duplicates__"] = dupes

    return keys


def flatten_duplicates(data: dict[str, Any] | list[Any]) -> list[str]:
    """Flattens a JSON structure and returns a list of duplicate paths."""
    duplicates: list[str] = []
    items = data.items() if isinstance(data, dict) else enumerate(data)

    for key, value in items:
        if key == "__duplicates__":
            duplicates.extend(value.keys())
        elif isinstance(value, (dict, list)):
            # Array indices are written as [n]; format_path later merges them
            # into the preceding key.
            path = f"{key}" if isinstance(data, dict) else f"[{key}]"
            dupes = flatten_duplicates(value)
            duplicates.extend([f"{path}.{dupe}" for dupe in dupes])

    return duplicates


def format_path(path: str) -> str:
    """Transforms a path like key1.[2].key2 into key1[2].key2"""
    parts = path.split(".")
    formatted_parts: list[str] = []
    for part in parts:
        # Guard against a leading index (top-level array), which would
        # otherwise raise IndexError on the empty list.
        if part.startswith("[") and formatted_parts:
            formatted_parts[-1] += part
        else:
            formatted_parts.append(part)
    return ".".join(formatted_parts)


def check_duplicate_keys(json_content: str) -> tuple[list[str], list[str]]:
    """Find all duplicate keys in a JSON string.

    Traverses the entire JSON structure and reports:
    - List of keys that appear multiple times at the same level
    - Full paths to each duplicate key occurrence

    A key is considered duplicate if it appears multiple times within
    the same object, regardless of nesting level or array position.
    """
    try:
        dupe_data = json.loads(json_content, object_pairs_hook=check_for_duplicate_keys)
        duplicate_paths = [
            f"DocumentReference.{format_path(path)}"
            for path in flatten_duplicates(dupe_data)
        ]
        # dict.fromkeys de-duplicates while preserving first-seen order.
        duplicate_keys = list(
            dict.fromkeys([key.split(".")[-1] for key in duplicate_paths])
        )
        return duplicate_keys, duplicate_paths
    except json.JSONDecodeError as exc:
        raise ValueError("Error: Invalid JSON format") from exc
```
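For context on why the checker relies on `object_pairs_hook`: by default `json.loads` builds a plain `dict`, so a repeated key silently keeps only its last value and the duplicate is lost before any validation can see it. The sketch below is a minimal, standalone illustration of that behaviour and is not the code from this PR. The payload, the `record_pairs` hook, and the `__repeats__` sentinel are hypothetical names chosen for the example.

```python
import json
from typing import Any

# Hypothetical payload: "Status" repeats "status" (case-insensitively)
# and "id" appears twice.
PAYLOAD = '{"status": "current", "Status": "superseded", "id": "1", "id": "2"}'

# Default parsing: for identical keys the last value wins, with no warning.
plain = json.loads(PAYLOAD)


def record_pairs(pairs: list[tuple[str, Any]]) -> dict[str, Any]:
    """Hypothetical hook: keep the first value per case-insensitive key
    and list every repeated key under a sentinel entry."""
    seen: set[str] = set()
    result: dict[str, Any] = {}
    repeats: list[str] = []
    for key, value in pairs:
        folded = key.lower()
        if folded in seen:
            repeats.append(key)
        else:
            seen.add(folded)
            result[key] = value
    if repeats:
        result["__repeats__"] = repeats
    return result


# The hook receives every (key, value) pair of each object in document order,
# so repeats are still visible at this point.
hooked = json.loads(PAYLOAD, object_pairs_hook=record_pairs)

print(plain)   # {'status': 'current', 'Status': 'superseded', 'id': '2'}
print(hooked)  # {'status': 'current', 'id': '1', '__repeats__': ['Status', 'id']}
```

Because the hook is invoked once per JSON object, innermost objects first, a sentinel entry like this ends up attached to whichever nested object contained the repeat, and a later pass can walk the parsed structure to rebuild full paths.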