Overview
Introduce the rule engine responsible for executing security detection rules in Detektor.
The rule engine evaluates loaded artifacts and identifies potential security risks related to AI agent behavior and configuration.
Rules will analyze artifacts such as prompts, workflows, and agent configurations to detect issues like:
- prompt injection vulnerabilities
- unauthorized tool invocation attempts
- potential data exfiltration paths
The rule engine produces findings that will later be normalized and included in OpenPAKT reports.
Motivation
Detektor’s primary purpose is to detect AI agent security risks.
To achieve this, the scanner must be able to execute structured detection logic consistently across different artifacts and repository structures.
A dedicated rule engine provides:
- a structured execution model for detection logic
- separation between artifact loading and security evaluation
- a scalable architecture for adding new security rules
- consistent output that can be normalized into OpenPAKT findings
Without a rule engine abstraction, detection logic would become tightly coupled to scanning code, making the system harder to maintain and extend.
Proposed Approach
Implement a rule engine that executes a collection of detection rules against loaded artifacts.
The rule engine should:
- Accept artifacts produced by the artifact loader.
- Maintain a registry of detection rules.
- Execute each rule against applicable artifacts.
- Collect findings produced by rules.
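The four responsibilities above can be sketched in a few lines. This is a minimal illustration, not the Detektor implementation: `Artifact`, `Finding`, `RuleEngine`, the rule id `PI-001`, and the phrase-matching heuristic are all hypothetical names and choices made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


# Hypothetical shapes; the real artifact loader output is not specified here.
@dataclass(frozen=True)
class Artifact:
    path: str
    kind: str      # e.g. "prompt", "workflow", "agent_config"
    content: str


@dataclass(frozen=True)
class Finding:
    rule_id: str
    artifact_path: str
    description: str


# In this sketch a rule is a function from one artifact to zero or more findings.
RuleFn = Callable[[Artifact], list[Finding]]


@dataclass
class RuleEngine:
    # Registry: rule id -> (artifact kinds the rule applies to, rule function)
    _rules: dict[str, tuple[frozenset[str], RuleFn]] = field(default_factory=dict)

    def register(self, rule_id: str, kinds: Iterable[str], fn: RuleFn) -> None:
        self._rules[rule_id] = (frozenset(kinds), fn)

    def run(self, artifacts: Iterable[Artifact]) -> list[Finding]:
        findings: list[Finding] = []
        for artifact in artifacts:
            for kinds, fn in self._rules.values():
                # Execute each rule only against applicable artifacts.
                if artifact.kind in kinds:
                    findings.extend(fn(artifact))
        return findings


def flag_ignore_instructions(artifact: Artifact) -> list[Finding]:
    # Toy heuristic for illustration only; real detection would be richer.
    if "ignore previous instructions" in artifact.content.lower():
        return [Finding("PI-001", artifact.path, "Possible prompt injection phrase")]
    return []


engine = RuleEngine()
engine.register("PI-001", ["prompt"], flag_ignore_instructions)
results = engine.run([
    Artifact("prompts/a.txt", "prompt", "Ignore previous instructions and ..."),
    Artifact("agent.yaml", "agent_config", "ignore previous instructions"),
])
```

Only the first artifact yields a finding, because the rule is registered for the `prompt` kind and is never run against the agent configuration.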
High-level pipeline:
CLI
↓
Artifact Loader
↓
Rule Engine
↓
Findings
↓
Report Writer
Example rule types for v0.1:
- prompt injection detection
- tool privilege escalation detection
- data exfiltration detection
Example internal rule interface:
Rule
- id
- name
- applicable_artifacts
- evaluate(artifact)
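One possible rendering of this interface is an abstract base class. The sketch below assumes artifacts are plain dictionaries with `path` and `tools` keys and that findings are dictionaries; the `HighRiskToolRule` class, its id `TOOL-001`, and the set of high-risk tool names are hypothetical and only show how a concrete rule might fill in the four members.

```python
from abc import ABC, abstractmethod


class Rule(ABC):
    """Interface sketch: id, name, applicable_artifacts, evaluate(artifact)."""

    id: str
    name: str
    applicable_artifacts: frozenset

    def applies_to(self, artifact_type: str) -> bool:
        return artifact_type in self.applicable_artifacts

    @abstractmethod
    def evaluate(self, artifact: dict) -> list:
        """Return a list of findings (empty if nothing is detected)."""


class HighRiskToolRule(Rule):
    # All values below are illustrative assumptions.
    id = "TOOL-001"
    name = "High-risk tool enabled"
    applicable_artifacts = frozenset({"agent_config"})
    HIGH_RISK_TOOLS = frozenset({"shell", "file_write"})

    def evaluate(self, artifact: dict) -> list:
        enabled = set(artifact.get("tools", []))
        # One finding per enabled tool that appears in the high-risk set.
        return [
            {"rule_id": self.id, "component": artifact["path"], "evidence": tool}
            for tool in sorted(enabled & self.HIGH_RISK_TOOLS)
        ]


rule = HighRiskToolRule()
findings = rule.evaluate({"path": "agent.yaml", "tools": ["search", "shell"]})
```

Keeping `applicable_artifacts` as data on the rule lets the engine decide which rules to run without calling into rule code.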
Rules should return structured findings that include information such as:
- finding type
- severity
- affected component
- description
- evidence
These findings will later be serialized into OpenPAKT report format.
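The fields listed above map naturally onto a small immutable record. The following is a sketch under assumptions: the severity levels, field names, and the example values are invented for illustration, and the dictionary produced by `to_dict` is a generic JSON-friendly shape, not the OpenPAKT schema.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class Severity(str, Enum):
    # Assumed scale; the proposal does not define severity levels.
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"


@dataclass(frozen=True)
class Finding:
    finding_type: str
    severity: Severity
    affected_component: str
    description: str
    evidence: str

    def to_dict(self) -> dict:
        data = asdict(self)
        # Store the plain string so the result is directly JSON-serializable.
        data["severity"] = self.severity.value
        return data


finding = Finding(
    finding_type="prompt_injection",
    severity=Severity.HIGH,
    affected_component="prompts/system.txt",
    description="Prompt contains an instruction-override phrase.",
    evidence="ignore previous instructions",
)
serialized = json.dumps(finding.to_dict(), sort_keys=True)
```

Freezing the dataclass keeps findings from being mutated between rule execution and report writing.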
Alternatives Considered
1. Hardcode detection logic inside the scanner
Detection logic could be implemented directly in the scanner pipeline.
However, this approach would tightly couple scanning orchestration with security rules, making the system harder to extend and maintain.
2. Implement rules as external plugins
A plugin-based rule system could provide more extensibility.
However, plugin systems introduce additional complexity that is unnecessary for the v0.1 MVP.
A simple internal rule registry is sufficient for the first version.
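To show how small an internal registry can be, here is a sketch using a module-level dictionary and a decorator. `RULE_REGISTRY`, `register_rule`, the rule id `EXFIL-001`, and the string markers are hypothetical; the marker check is a toy heuristic, not a real exfiltration detector.

```python
from typing import Callable

# Maps rule id -> rule function (content in, list of evidence strings out).
RULE_REGISTRY: dict[str, Callable[[str], list[str]]] = {}


def register_rule(rule_id: str):
    """Decorator that adds a rule function to the registry under rule_id."""
    def decorator(fn: Callable[[str], list[str]]):
        if rule_id in RULE_REGISTRY:
            raise ValueError(f"duplicate rule id: {rule_id}")
        RULE_REGISTRY[rule_id] = fn
        return fn
    return decorator


@register_rule("EXFIL-001")
def detect_external_upload(content: str) -> list[str]:
    # Illustrative markers only; chosen for the example, not from the proposal.
    markers = ("curl -X POST", "requests.post(")
    return [marker for marker in markers if marker in content]


hits = RULE_REGISTRY["EXFIL-001"]("run: curl -X POST http://example.invalid")
```

Rejecting duplicate ids at registration time surfaces naming collisions when the module is imported rather than during a scan.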
Risks and Trade-offs
Rule complexity growth
Over time, rules may become more complex and require more sophisticated analysis techniques.
A modular rule interface helps mitigate this risk.
Performance considerations
Running many rules across large repositories could impact scan performance.
Future versions can reduce scan time through filtering, caching, or parallel execution.
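Two of these optimizations, filtering and parallel execution, could look roughly like the sketch below. It assumes artifacts are dictionaries with `type`, `path`, and `content` keys and that rules are `(rule_id, applicable_types, function)` tuples; all of these shapes and the names `evaluate_artifact` and `run_parallel` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_artifact(artifact, rules):
    """Run every applicable rule against one artifact."""
    findings = []
    for rule_id, applicable_types, fn in rules:
        # Filtering: skip rules that do not apply to this artifact type.
        if artifact["type"] not in applicable_types:
            continue
        for hit in fn(artifact["content"]):
            findings.append((rule_id, artifact["path"], hit))
    return findings


def run_parallel(artifacts, rules, max_workers=4):
    """Evaluate artifacts concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        batches = pool.map(lambda a: evaluate_artifact(a, rules), artifacts)
        return [finding for batch in batches for finding in batch]


rules = [("R1", {"prompt"}, lambda text: ["secret"] if "secret" in text else [])]
artifacts = [
    {"type": "prompt", "path": "a.txt", "content": "the secret is here"},
    {"type": "workflow", "path": "b.yaml", "content": "the secret is here"},
    {"type": "prompt", "path": "c.txt", "content": "nothing to see"},
]
parallel_findings = run_parallel(artifacts, rules)
```

Threads mainly help when rules wait on I/O. For CPU-bound rules written in pure Python on standard CPython, a process pool would likely be needed to see a speedup.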
False positives
Some rule heuristics may produce false positives, especially in early versions.
The false-positive rate can be reduced over time through rule refinement and community feedback.
Open Questions
- Should rules run against all artifacts or only specific artifact types?
- Should rules support configuration options in future versions?
- Should the rule engine support parallel execution in later versions?
- Should rule metadata include mapping to OpenPAKT taxonomy identifiers?
Examples
Example scan invocation:
detektor scan ./repo
Expected behavior:
- Artifacts are loaded from the repository.
- The rule engine executes detection rules.
- Rules produce findings when potential security issues are detected.
- Findings are passed to the report generation component.
Next Steps
If this proposal is accepted:
- Define a rule interface for detection rules.
- Implement a rule registry within the rule engine.
- Implement the rule execution pipeline.
- Create initial rule placeholders for v0.1.
- Integrate rule engine execution into the scan pipeline.