Implementation of Custom Sensitivity Rules

Context:
OpenDataMask is a Spring Boot (Kotlin) and Vue.js (TypeScript) platform for data masking, modelled after Tonic Structural. It currently has a built-in Sensitivity Scan that detects 19 standard PII types. We need to expand this by implementing Custom Sensitivity Rules, allowing users to define their own detection logic and automate generator assignments as described in the [Tonic Custom Sensitivity Rules Guide](https://www.tonic.ai/guides/custom-sensitivity-rules-to-automate-sensitive-data-detection).

Objective:
Develop a full-stack feature that allows users to create, test, and apply custom sensitivity rules globally across all workspaces. These rules must detect specific columns based on naming patterns and data types and recommend a linked Generator Preset.

1. Functional Requirements
A. Rule Definition Engine
Attributes: Each rule must have a Name (which becomes the "Sensitivity Type"), Description, Data Type Filter, and one or more Matchers.

Matchers: Support "Contains," "Starts With," "Ends With," and "Regex" for column names (with case-sensitivity toggle).

Generic Typing: Rules should use "Generic Data Types" (e.g., Text, Numeric, Date) that map to source-specific types (e.g., VARCHAR in PostgreSQL, StringType in Spark, string in MongoDB).
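
To make the rule shape concrete, here is a minimal sketch of a rule and its column-name matching, written in TypeScript for illustration (the Kotlin backend would mirror it). All type and field names (`SensitivityRule`, `ColumnNameMatcher`, `dataTypeFilter`, and so on) are hypothetical, and treating multiple matchers as "any matcher may hit" is an assumption; the requirement above does not say whether matchers are combined with AND or OR.

```typescript
// Hypothetical shapes; field names are assumptions, not the OpenDataMask schema.
type MatcherKind = "CONTAINS" | "STARTS_WITH" | "ENDS_WITH" | "REGEX";
type GenericDataType = "TEXT" | "NUMERIC" | "DATE";

interface ColumnNameMatcher {
  kind: MatcherKind;
  pattern: string;
  caseSensitive: boolean;
}

interface SensitivityRule {
  name: string; // becomes the "Sensitivity Type"
  description: string;
  dataTypeFilter: GenericDataType;
  matchers: ColumnNameMatcher[]; // one or more
}

// Evaluate a single matcher against a column name.
function matcherMatches(m: ColumnNameMatcher, columnName: string): boolean {
  if (m.kind === "REGEX") {
    // An invalid user-supplied pattern throws SyntaxError here,
    // so validating the pattern when the rule is saved is advisable.
    return new RegExp(m.pattern, m.caseSensitive ? "" : "i").test(columnName);
  }
  const name = m.caseSensitive ? columnName : columnName.toLowerCase();
  const pattern = m.caseSensitive ? m.pattern : m.pattern.toLowerCase();
  if (m.kind === "CONTAINS") return name.includes(pattern);
  if (m.kind === "STARTS_WITH") return name.startsWith(pattern);
  return name.endsWith(pattern); // ENDS_WITH
}

// Assumption: a rule matches when the type filter agrees AND any matcher hits.
function ruleMatches(
  rule: SensitivityRule,
  columnName: string,
  columnType: GenericDataType
): boolean {
  return (
    rule.dataTypeFilter === columnType &&
    rule.matchers.some((m) => matcherMatches(m, columnName))
  );
}

// Usage: a rule in the spirit of AC 1.
const internalId: SensitivityRule = {
  name: "Internal_ID",
  description: "Internal numeric identifiers",
  dataTypeFilter: "NUMERIC",
  matchers: [{ kind: "CONTAINS", pattern: "uid", caseSensitive: false }],
};
const hit = ruleMatches(internalId, "Customer_UID", "NUMERIC");
```

Keeping the matcher evaluation a pure function means the same logic can back both the scanner and the preview.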

B. Generator Preset Integration
Each rule must be linked to a Generator Preset (already existing in OpenDataMask).

When a rule matches a column, the linked preset must be suggested as the "Recommended Generator" in the Privacy Hub.

C. Rule Preview Panel
Within the "Create Rule" UI, implement a "Preview" function.

The user selects a Workspace, and the system runs the proposed rule against that workspace’s schema, displaying a list of columns that would be caught.

D. Scanner Integration
Modify the existing SensitivityScan logic to execute Custom Rules alongside built-in rules.

Custom rule matches should be labeled with the custom Name and marked as AT_RISK in the Privacy Hub.
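
One way the scanner could combine built-in detection with custom rules is sketched below in TypeScript (illustrative only; the real change belongs in the Kotlin scanner). Every name here (`ScannedColumn`, `Finding`, `scanColumn`, `generatorPresetId`) is hypothetical. Two behaviours are assumptions rather than requirements: built-in detection takes precedence when both match, and the first matching active custom rule wins.

```typescript
// Hypothetical shapes for illustration; not the existing OpenDataMask types.
interface ScannedColumn {
  table: string;
  name: string;
  genericType: string;
}

interface CustomRule {
  name: string; // shown as the sensitivity type
  active: boolean;
  dataTypeFilter: string;
  nameMatches: (columnName: string) => boolean; // stands in for matcher evaluation
  generatorPresetId: number; // linked Generator Preset
}

interface Finding {
  table: string;
  column: string;
  sensitivityType: string;
  status: "AT_RISK";
  recommendedPresetId: number | null;
}

// A built-in detector returns a PII type name, or null when nothing is detected.
type BuiltInDetector = (column: ScannedColumn) => string | null;

function scanColumn(
  column: ScannedColumn,
  detectBuiltIn: BuiltInDetector,
  customRules: CustomRule[]
): Finding | null {
  // Assumption: built-in detection runs first and wins on conflict.
  const builtInType = detectBuiltIn(column);
  if (builtInType !== null) {
    return {
      table: column.table,
      column: column.name,
      sensitivityType: builtInType,
      status: "AT_RISK",
      recommendedPresetId: null, // built-in recommendations are resolved elsewhere
    };
  }
  // Assumption: the first active rule whose type filter and matchers agree wins.
  const rule = customRules.find(
    (r) =>
      r.active &&
      r.dataTypeFilter === column.genericType &&
      r.nameMatches(column.name)
  );
  if (rule === undefined) return null;
  return {
    table: column.table,
    column: column.name,
    sensitivityType: rule.name, // custom Name is the label
    status: "AT_RISK",
    recommendedPresetId: rule.generatorPresetId,
  };
}
```

Whatever precedence is chosen, it should be decided explicitly, since a column matched by both a built-in type and a custom rule would otherwise get an ambiguous recommendation.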

2. Technical Implementation Guidance
Backend (Kotlin): Create a SensitivityRule entity and a corresponding JPA repository. Update the SensitivityScannerService to iterate through active custom rules during a scan.

Generic Type Mapping: Implement a utility to normalize database-specific metadata (from SchemaIntrospection) into the generic types used by the rule engine.
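
As a rough illustration of the normalization step, the sketch below maps a native type name to a generic type (TypeScript for illustration; the actual utility would live in the Kotlin backend). The function name `toGenericType` and the lookup table are hypothetical, and the table is deliberately incomplete: it lists only a handful of common type names, and each connector would need its own reviewed entries.

```typescript
type GenericDataType = "TEXT" | "NUMERIC" | "DATE";

// Illustrative, incomplete lookup keyed by lower-cased native type name.
const NATIVE_TO_GENERIC = new Map<string, GenericDataType>([
  ["varchar", "TEXT"],
  ["character varying", "TEXT"],
  ["char", "TEXT"],
  ["text", "TEXT"],
  ["stringtype", "TEXT"], // Spark
  ["string", "TEXT"], // e.g. MongoDB BSON string
  ["int", "NUMERIC"],
  ["integer", "NUMERIC"],
  ["int4", "NUMERIC"],
  ["bigint", "NUMERIC"],
  ["numeric", "NUMERIC"],
  ["decimal", "NUMERIC"],
  ["date", "DATE"],
  ["timestamp", "DATE"],
]);

// Returns null for types the table does not know, so that callers can
// decide whether an unmapped column is skipped or reported.
function toGenericType(nativeType: string): GenericDataType | null {
  // Drop length/precision suffixes such as VARCHAR(255) or NUMERIC(10,2).
  const base = nativeType.toLowerCase().replace(/\(.*\)/, "").trim();
  return NATIVE_TO_GENERIC.get(base) ?? null;
}
```

Returning `null` for unknown types, rather than defaulting to Text, keeps a type-filtered rule from matching columns whose type was never classified.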

Frontend (Vue.js): Add a "Sensitivity Rules" management page under System Settings. Use a modal or side-drawer for the Rule Creator with the Preview results list.

Acceptance Criteria 
| ID | Criteria | Test Method |
| --- | --- | --- |
| AC 1 | Global Rule Creation | Verify a user can save a rule named "Internal_ID" with a matcher for column names containing uid and an "Integer" type filter. |
| AC 2 | Type Mapping | Verify that a "Text" rule correctly identifies both VARCHAR in a PostgreSQL source and TEXT in a MySQL source. |
| AC 3 | Preview Accuracy | Given a workspace with columns user_id and tx_id, a preview for a rule matching column names that end in id (e.g., an "Ends With" matcher for id) must display both columns before the rule is saved. |
| AC 4 | Scanner Execution | After running a scan, a column matched by a custom rule must appear in the /api/workspaces/{id}/privacy-hub/recommendations endpoint. |
| AC 5 | Preset Assignment | Verify that applying the "Bulk Recommendation" in the Privacy Hub correctly assigns the specific Generator Preset linked to the custom rule. |
| AC 6 | Sensitivity Type Tagging | Verify the UI displays the custom Rule Name (e.g., "Product_SKU") in the "Sensitivity" column of the Database View instead of "Unknown." |

Instruction for Developer:
When implementing the Rule Preview Panel, ensure it does not save the rule to the database. It should perform a transient schema-match operation against the SchemaColumn metadata stored in the application database. Refer to the existing PrivacyHub logic to ensure consistency in how "At Risk" statuses are updated.
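
The transient preview described above can be sketched as a pure function over already-loaded column metadata (TypeScript for illustration; names such as `DraftRule` and `previewRule` are hypothetical, and the `SchemaColumn` fields shown are assumed, not taken from the real entity). Nothing is persisted: the draft rule exists only as an argument, and the column list is not modified.

```typescript
// Assumed subset of the stored SchemaColumn metadata.
interface SchemaColumn {
  tableName: string;
  columnName: string;
  genericType: string;
}

// An unsaved rule as submitted by the "Create Rule" form.
interface DraftRule {
  dataTypeFilter: string;
  nameMatches: (columnName: string) => boolean; // stands in for matcher evaluation
}

// Pure and read-only: returns the columns the draft rule would catch.
function previewRule(
  draft: DraftRule,
  columns: readonly SchemaColumn[]
): SchemaColumn[] {
  return columns.filter(
    (c) => c.genericType === draft.dataTypeFilter && draft.nameMatches(c.columnName)
  );
}

// Usage, in the spirit of AC 3 (column types are assumed to be NUMERIC here).
const workspaceColumns: SchemaColumn[] = [
  { tableName: "users", columnName: "user_id", genericType: "NUMERIC" },
  { tableName: "payments", columnName: "tx_id", genericType: "NUMERIC" },
  { tableName: "users", columnName: "created_at", genericType: "DATE" },
];
const draft: DraftRule = {
  dataTypeFilter: "NUMERIC",
  nameMatches: (name) => name.endsWith("id"),
};
const caught = previewRule(draft, workspaceColumns).map((c) => c.columnName);
```

Because the preview reads only stored metadata, it does not need a connection to the source database and cannot change any "At Risk" status.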


Implementation of Custom Sensitivity Rules #51
