
Implementation of Custom Sensitivity Rules #51

@MaximumTrainer

Context:
OpenDataMask is a Spring Boot (Kotlin) and Vue.js (TypeScript) platform for data masking, modelled after Tonic Structural. It currently has a built-in Sensitivity Scan that detects 19 standard PII types. We need to expand this by implementing Custom Sensitivity Rules, allowing users to define their own detection logic and automate generator assignments as described in the Tonic Custom Sensitivity Rules Guide.

Objective:
Develop a full-stack feature that allows users to create, test, and apply custom sensitivity rules globally across all workspaces. These rules must detect specific columns based on naming patterns and data types and recommend a linked Generator Preset.

1. Functional Requirements

A. Rule Definition Engine
Attributes: Each rule must have a Name (which becomes the "Sensitivity Type"), a Description, a Data Type Filter, and one or more Matchers.

Matchers: Support "Contains," "Starts With," "Ends With," and "Regex" for column names (with case-sensitivity toggle).
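The four matcher kinds and the case-sensitivity toggle could be evaluated roughly as follows. This is a minimal sketch in TypeScript (the backend version would be Kotlin); `MatchKind`, `ColumnNameMatcher`, and `matchesColumnName` are hypothetical names, not existing OpenDataMask code.

```typescript
// Hypothetical shapes: the issue does not prescribe field or enum names.
type MatchKind = "CONTAINS" | "STARTS_WITH" | "ENDS_WITH" | "REGEX";

interface ColumnNameMatcher {
  kind: MatchKind;
  pattern: string;
  caseSensitive: boolean;
}

// Returns true when the column name satisfies the matcher.
// An invalid REGEX pattern makes `new RegExp` throw, so patterns
// should be validated when the rule is saved.
function matchesColumnName(matcher: ColumnNameMatcher, columnName: string): boolean {
  // For literal matchers, lower-case both sides when case is ignored.
  const fold = (s: string): string => (matcher.caseSensitive ? s : s.toLowerCase());
  switch (matcher.kind) {
    case "CONTAINS":
      return fold(columnName).includes(fold(matcher.pattern));
    case "STARTS_WITH":
      return fold(columnName).startsWith(fold(matcher.pattern));
    case "ENDS_WITH":
      return fold(columnName).endsWith(fold(matcher.pattern));
    case "REGEX":
      // The "i" flag implements the case-insensitive toggle for regex matchers.
      return new RegExp(matcher.pattern, matcher.caseSensitive ? "" : "i").test(columnName);
  }
}
```

Folding case only for the literal matchers keeps regex patterns untouched, so character classes such as `\D` are not accidentally rewritten by lower-casing.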

Generic Typing: Rules should use "Generic Data Types" (e.g., Text, Numeric, Date) that map to source-specific types (e.g., VARCHAR in Postgres, StringType in Spark/MongoDB).

B. Generator Preset Integration
Each rule must be linked to a Generator Preset (already existing in OpenDataMask).

When a rule matches a column, the linked preset must be suggested as the "Recommended Generator" in the Privacy Hub.
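One way to turn a rule match into a Privacy Hub recommendation is sketched below in TypeScript. All type and field names (`SensitivityRule.generatorPresetId`, `Recommendation`, `recommend`) are assumptions for illustration; the real preset model already exists in OpenDataMask and may look different.

```typescript
// Hypothetical shapes, not the existing OpenDataMask model.
interface GeneratorPreset { id: number; name: string; }
interface SensitivityRule { name: string; generatorPresetId: number; }
interface ColumnRef { table: string; column: string; }

interface Recommendation {
  table: string;
  column: string;
  sensitivityType: string;      // the custom rule's Name
  recommendedGenerator: string; // name of the linked Generator Preset
}

// Builds the recommendation for a column that the rule has already matched.
// Returns undefined when the linked preset no longer exists (assumed behaviour).
function recommend(
  rule: SensitivityRule,
  column: ColumnRef,
  presets: Map<number, GeneratorPreset>
): Recommendation | undefined {
  const preset = presets.get(rule.generatorPresetId);
  if (preset === undefined) return undefined;
  return {
    table: column.table,
    column: column.column,
    sensitivityType: rule.name,
    recommendedGenerator: preset.name,
  };
}
```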

C. Rule Preview Panel
Within the "Create Rule" UI, implement a "Preview" function.

The user selects a Workspace, and the system runs the proposed rule against that workspace’s schema, displaying a list of columns that would be caught.
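The preview can be a pure function over the workspace's column metadata, which keeps it from persisting anything. The TypeScript sketch below uses hypothetical names (`DraftRule`, `previewRule`) and assumes that a column is caught when it passes the type filter and any one matcher accepts its name; the issue does not say whether multiple matchers combine with AND or OR.

```typescript
type GenericType = "TEXT" | "NUMERIC" | "DATE";

// Hypothetical view of the stored column metadata.
interface SchemaColumn { table: string; name: string; genericType: GenericType; }

// An unsaved rule as entered in the "Create Rule" form (hypothetical shape).
interface DraftRule {
  name: string;
  typeFilter: GenericType | null;                    // null = any data type
  nameMatchers: Array<(columnName: string) => boolean>;
}

// Returns the columns the draft rule would catch. Nothing is written anywhere.
function previewRule(draft: DraftRule, columns: readonly SchemaColumn[]): SchemaColumn[] {
  return columns.filter(
    (col) =>
      (draft.typeFilter === null || col.genericType === draft.typeFilter) &&
      draft.nameMatchers.some((matches) => matches(col.name)) // assumed OR semantics
  );
}

// Example: an "Ends With id" rule over a small workspace schema.
const columns: SchemaColumn[] = [
  { table: "users", name: "user_id", genericType: "NUMERIC" },
  { table: "users", name: "email", genericType: "TEXT" },
  { table: "payments", name: "tx_id", genericType: "NUMERIC" },
];
const draft: DraftRule = {
  name: "Internal_ID",
  typeFilter: "NUMERIC",
  nameMatchers: [(n) => n.toLowerCase().endsWith("id")],
};
const hits = previewRule(draft, columns).map((c) => `${c.table}.${c.name}`);
// hits is ["users.user_id", "payments.tx_id"]
```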

D. Scanner Integration
Modify the existing SensitivityScan logic to execute Custom Rules alongside built-in rules.

Custom rule matches should be labeled with the custom Name and marked as AT_RISK in the Privacy Hub.
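A possible shape for the combined scan loop is sketched below in TypeScript. The names (`Finding`, `CustomRule`, `scanColumns`) are hypothetical, and two behaviours are assumptions because the issue leaves them open: built-in detection takes precedence when both a built-in and a custom rule match, and built-in findings carry the same `AT_RISK` status.

```typescript
// Hypothetical shapes for illustration only.
interface Finding {
  columnKey: string;        // e.g. "users.uid"
  sensitivityType: string;  // built-in PII type or the custom rule's Name
  source: "BUILT_IN" | "CUSTOM";
  status: "AT_RISK";
}

interface CustomRule {
  name: string;
  active: boolean;
  matches: (columnKey: string) => boolean;
}

// Runs built-in detection first, then active custom rules, for each column.
function scanColumns(
  columnKeys: readonly string[],
  detectBuiltIn: (columnKey: string) => string | null, // stand-in for the existing scan
  customRules: readonly CustomRule[]
): Finding[] {
  const findings: Finding[] = [];
  for (const key of columnKeys) {
    const builtInType = detectBuiltIn(key);
    if (builtInType !== null) {
      // Assumption: a built-in match wins over any custom rule.
      findings.push({ columnKey: key, sensitivityType: builtInType, source: "BUILT_IN", status: "AT_RISK" });
      continue;
    }
    // Inactive rules are skipped; the first active rule that matches labels the column.
    const rule = customRules.find((r) => r.active && r.matches(key));
    if (rule !== undefined) {
      findings.push({ columnKey: key, sensitivityType: rule.name, source: "CUSTOM", status: "AT_RISK" });
    }
  }
  return findings;
}
```

If a column should instead be able to carry several sensitivity types, the `find` call would become a `filter` and the precedence check would be dropped.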

2. Technical Implementation Guidance

Backend (Kotlin): Create a SensitivityRule entity and a corresponding JPA repository. Update the SensitivityScannerService to iterate through active custom rules during a scan.

Generic Type Mapping: Implement a utility to normalize database-specific metadata (from SchemaIntrospection) into the generic types used by the rule engine.
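The normalization utility could be a simple lookup over lower-cased type names, as in this TypeScript sketch. `normalizeType` and the mapping entries are illustrative assumptions: the list is deliberately short and would need to be extended for each supported source.

```typescript
type GenericType = "TEXT" | "NUMERIC" | "DATE";

// Illustrative, non-exhaustive mapping from source type names to generic types.
const TYPE_MAP = new Map<string, GenericType>([
  ["varchar", "TEXT"],
  ["character varying", "TEXT"],
  ["char", "TEXT"],
  ["text", "TEXT"],
  ["string", "TEXT"],
  ["stringtype", "TEXT"],
  ["int", "NUMERIC"],
  ["integer", "NUMERIC"],
  ["bigint", "NUMERIC"],
  ["numeric", "NUMERIC"],
  ["decimal", "NUMERIC"],
  ["date", "DATE"],
  ["timestamp", "DATE"],
  ["datetime", "DATE"],
]);

// Normalizes a database-specific type name such as "VARCHAR(255)".
// Unrecognized types map to "UNKNOWN" so they never satisfy a type filter.
function normalizeType(sourceType: string): GenericType | "UNKNOWN" {
  const base = sourceType
    .toLowerCase()
    .replace(/\s*\([^)]*\)/g, "") // drop length/precision, e.g. "(255)" or "(10,2)"
    .trim();
  return TYPE_MAP.get(base) ?? "UNKNOWN";
}
```

Returning an explicit `"UNKNOWN"` rather than guessing keeps rules with a type filter from matching columns whose type the mapping does not cover.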

Frontend (Vue.js): Add a "Sensitivity Rules" management page under System Settings. Use a modal or side-drawer for the Rule Creator with the Preview results list.

Acceptance Criteria

| ID | Criteria | Test Method |
| --- | --- | --- |
| AC 1 | Global Rule Creation | Verify a user can save a rule named "Internal_ID" with a matcher for column names containing uid and an "Integer" type filter. |
| AC 2 | Type Mapping | Verify that a "Text" rule correctly identifies both VARCHAR in a PostgreSQL source and STRING in a MySQL source. |
| AC 3 | Preview Accuracy | Given a workspace with columns user_id and tx_id, a preview for a rule matching *id must display both columns before the rule is saved. |
| AC 4 | Scanner Execution | After running a scan, a column matched by a custom rule must appear in the /api/workspaces/{id}/privacy-hub/recommendations endpoint. |
| AC 5 | Preset Assignment | Verify that applying the "Bulk Recommendation" in the Privacy Hub correctly assigns the specific Generator Preset linked to the custom rule. |
| AC 6 | Sensitivity Type Tagging | Verify the UI displays the custom Rule Name (e.g., "Product_SKU") in the "Sensitivity" column of the Database View instead of "Unknown." |

Instruction for Developer:
When implementing the Rule Preview Panel, ensure it does not save the rule to the database. It should perform a transient schema-match operation against the SchemaColumn metadata stored in the application database. Refer to the existing PrivacyHub logic to ensure consistency in how "At Risk" statuses are updated.
