Update existing security_patterns.json and core.py with enhancements#10
Update existing security_patterns.json and core.py with enhancements#10S3DFX-CYBER wants to merge 6 commits intoOWASP-BLT:mainfrom
Conversation
|
Looks good overall, just recommend double checking severity consistency and JSON schema compatibility before merge. |
|
@coderabbitai Looks good overall, just recommend double checking severity consistency and JSON schema compatibility before merge. Can u review |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chainLet me inspect the 🏁 Script executed: Length of output: 20012 Let me also check for any schema definition file in the repo. 🏁 Script executed: Length of output: 695 🏁 Script executed: Length of output: 23037 🏁 Script executed: Length of output: 17188
🔴 Blocking Issue:
|
| Section | Entry | Old Severity | New Severity | Runtime Effect |
|---|---|---|---|---|
file_patterns |
database |
warning |
high |
❌ Dropped silently |
file_patterns |
file_upload |
warning |
high |
❌ Dropped silently |
file_patterns |
api_endpoints |
warning |
high |
❌ Dropped silently |
file_patterns |
user_input |
warning |
high |
❌ Dropped silently |
label_patterns |
api |
warning |
high |
❌ Dropped silently |
label_patterns |
dependencies |
warning |
high |
❌ Dropped silently |
Fix option A – Revert these to "warning" to preserve current behaviour, then open a separate PR to add "high" support in core.py.
Fix option B – Extend core.py in this PR to handle "high" in the report:
# In generate_report(), after Line 327
high = [a for a in advice_list if a.severity == "high"]
# And in the report body, after the critical block:
if high:
report.append("## 🟠 High Severity Security Considerations\n")
for advice in high:
report.append(self._format_advice(advice))🟡 Schema Consistency Issues
1. references array missing from all label_patterns entries
All 7 file_patterns entries now have a references array, but none of the 8 label_patterns entries do. This is a structural inconsistency that will matter once the engine is updated to consume references directly.
// label_patterns.security — missing references
{
"severity": "critical",
"guidance": "🔒 Security-Related Changes\n..."
// ← no "references" array
}2. references field is not yet consumed by the engine
_generate_advice_from_pattern() (core.py Line 153) calls _get_documentation_links(pattern_key) which returns hardcoded links, completely ignoring the new references arrays in the JSON. The enriched links added in this PR won't surface to end users until this method is updated:
# core.py — currently ignores pattern_data["references"]
doc_links = self._get_documentation_links(pattern_key)
# Should be:
doc_links = pattern_data.get("references") or self._get_documentation_links(pattern_key)🟡 Severity Semantic Inconsistencies
3. CWE-306 assigned two different severity levels
CWE-306 (Missing Authentication for Critical Function) is cited in:
authentication→"critical"✅api_endpoints→"high"⚠️
Since the same weakness drives both, the severity should be consistent (or the CWE reference removed from api_endpoints).
4. Under-classified severities worth reconsidering
| Entry | CWE | Current Severity | Suggested |
|---|---|---|---|
file_upload |
CWE-434 (Unrestricted Upload) | high |
critical — can lead directly to RCE |
user_input |
CWE-79 (XSS), CWE-89 (SQLi) | high |
critical — SQLi is CVSS 9.8+ |
database |
CWE-89 (SQLi) | high |
critical — same CWE as above |
5. CWE-89 (SQL injection) covered by both database and user_input
Both patterns reference CWE-89. Consider whether the guidance for each should be scoped differently (e.g. database → storage-level SQLi, user_input → input sanitisation) to avoid duplicate advisories firing on the same file.
Summary
| # | Issue | Severity | Blocking? |
|---|---|---|---|
| 1 | "high" not handled in generate_report() — advisories silently dropped |
🔴 | Yes |
| 2 | references not consumed by engine |
🟡 | No |
| 3 | label_patterns missing references field |
🟡 | No |
| 4 | CWE-306 has two different severity levels | 🟡 | No |
| 5 | file_upload / user_input / database arguably under-classified |
🟡 | No |
Issue #1 should be addressed before merging — everything else is a follow-up improvement.
|
@coderabbitai Authentication pattern not triggering , also suggest fixes |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 25388
🔴 Root Cause 1:
|
| # | Root Cause | Location | Impact |
|---|---|---|---|
| 1 | fnmatch doesn't support ** globstar |
core.py → _matches_pattern() |
All **/x/** file patterns never fire |
| 2 | Label match is one-directional | core.py → evaluate_context() |
Short/hyphenated labels (auth, auth-flow) never match |
| 3 | references from JSON ignored |
core.py → _generate_advice_from_pattern() |
Enriched links in JSON never reach the report |
Fix #1 is the primary culprit — without it, the authentication file pattern cannot trigger under any circumstances regardless of what files are passed in.
|
@coderabbitai update the file with appropriate changes so i can apply |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 17692
"""
Core advisory engine that evaluates context and provides security guidance.
"""
import json
import os
import re
from typing import Dict, List, Optional, Any
from dataclasses import dataclass, asdict
from datetime import datetime
`@dataclass`
class AdvisoryContext:
"""Context information for generating security advice."""
issue_labels: List[str]
repo_metadata: Dict[str, Any]
file_patterns: List[str]
contributor_intent: Optional[str] = None
past_patterns: Optional[Dict[str, Any]] = None
`@dataclass`
class SecurityAdvice:
"""Security advice generated for a contribution."""
severity: str # info, warning, high, critical
title: str
message: str
documentation_links: List[str]
recommendations: List[str]
timestamp: str
class AdvisoryEngine:
"""Main advisory engine for providing security guidance."""
def __init__(self, config_path: str = "config/security_patterns.json"):
self.config_path = config_path
self.security_patterns = self._load_security_patterns()
self.learning_data = self._load_learning_data()
def _load_security_patterns(self) -> Dict:
"""Load security patterns from configuration."""
if os.path.exists(self.config_path):
with open(self.config_path, 'r') as f:
return json.load(f)
return self._get_default_patterns()
def _get_default_patterns(self) -> Dict:
"""Return default security patterns."""
return {
"file_patterns": {
"authentication": {
"patterns": ["**/auth/**", "**/login/**", "**/password/**"],
"severity": "critical",
"guidance": "Authentication changes require careful review"
},
"api_keys": {
"patterns": ["**/*api*key*", "**/*secret*", "**/*token*"],
"severity": "critical",
"guidance": "Never commit API keys or secrets"
},
"database": {
"patterns": ["**/*db*", "**/*database*", "**/migrations/**"],
"severity": "warning",
"guidance": "Database changes should include security considerations"
},
"encryption": {
"patterns": ["**/*crypto*", "**/*encrypt*", "**/*hash*"],
"severity": "critical",
"guidance": "Use established cryptographic libraries"
}
},
"label_patterns": {
"security": {
"severity": "critical",
"guidance": "Security-related changes require thorough review"
},
"authentication": {
"severity": "critical",
"guidance": "Authentication changes must follow security best practices"
},
"data-privacy": {
"severity": "critical",
"guidance": "Ensure compliance with data privacy regulations"
},
"api": {
"severity": "warning",
"guidance": "API changes should include input validation and rate limiting"
}
}
}
def _load_learning_data(self) -> Dict:
"""Load learning data from past patterns."""
learning_path = "config/learning_data.json"
if os.path.exists(learning_path):
with open(learning_path, 'r') as f:
return json.load(f)
return {"patterns": [], "feedback": []}
def evaluate_context(self, context: AdvisoryContext) -> List[SecurityAdvice]:
"""Evaluate context and generate security advice."""
advice_list = []
# Evaluate based on issue labels
# FIX 2: Bidirectional matching + hyphen normalisation so that short labels
# like "auth" or "auth-flow" correctly match the "authentication" pattern key.
for label in context.issue_labels:
label_lower = label.lower().replace("-", "_")
for pattern_key, pattern_data in self.security_patterns.get("label_patterns", {}).items():
pattern_key_norm = pattern_key.replace("-", "_")
if pattern_key_norm in label_lower or label_lower in pattern_key_norm:
advice = self._generate_advice_from_pattern(
pattern_key, pattern_data, "label", context
)
advice_list.append(advice)
# Evaluate based on file patterns
for file_path in context.file_patterns:
for pattern_key, pattern_data in self.security_patterns.get("file_patterns", {}).items():
if self._matches_pattern(file_path, pattern_data.get("patterns", [])):
advice = self._generate_advice_from_pattern(
pattern_key, pattern_data, "file", context
)
advice_list.append(advice)
# Add general security advice
if not advice_list:
advice_list.append(self._get_general_advice(context))
# Refine based on learning data
advice_list = self._refine_with_learning(advice_list, context)
return advice_list
def _matches_pattern(self, file_path: str, patterns: List[str]) -> bool:
"""Check if file path matches any of the patterns (supports ** globstar).
FIX 1: The previous implementation used fnmatch.fnmatch() which does NOT
understand '**' (globstar/recursive) syntax. All patterns like '**/auth/**'
would silently never match. This version translates globstar patterns to
equivalent regular expressions before matching.
"""
import fnmatch
# Normalize path separators to forward slashes
normalized = file_path.replace("\\", "/")
for pattern in patterns:
if "**" in pattern:
# Translate globstar pattern to a regex:
# **/foo → optional leading directories
# foo/** → optional trailing directories
# ** → any path segment(s)
# * → any characters except '/'
regex = re.escape(pattern)
regex = regex.replace(r"\*\*/", "(.+/)?") # leading **/
regex = regex.replace(r"/\*\*", "(/.*)?") # trailing /**
regex = regex.replace(r"\*\*", ".*") # bare **
regex = regex.replace(r"\*", "[^/]+") # single-level *
if re.fullmatch(regex, normalized):
return True
else:
if fnmatch.fnmatch(normalized, pattern):
return True
return False
def _generate_advice_from_pattern(
self, pattern_key: str, pattern_data: Dict, source_type: str, context: AdvisoryContext
) -> SecurityAdvice:
"""Generate security advice from a matched pattern."""
severity = pattern_data.get("severity", "info")
guidance = pattern_data.get("guidance", "Please review security implications")
# Build recommendations
recommendations = self._get_recommendations(pattern_key, severity)
# FIX 3: Prefer the 'references' array from the JSON pattern over the
# hardcoded fallback links, so the enriched OWASP links added to
# security_patterns.json are actually surfaced in the report.
doc_links = (
pattern_data.get("references")
or self._get_documentation_links(pattern_key)
)
return SecurityAdvice(
severity=severity,
title=f"Security Advisory: {pattern_key.replace('_', ' ').title()}",
message=guidance,
documentation_links=doc_links,
recommendations=recommendations,
timestamp=datetime.utcnow().isoformat()
)
def _get_general_advice(self, context: AdvisoryContext) -> SecurityAdvice:
"""Generate general security advice."""
return SecurityAdvice(
severity="info",
title="General Security Guidance",
message="Thank you for contributing! Please review these general security best practices.",
documentation_links=[
"https://owasp.org/www-project-top-ten/",
"https://cheatsheetseries.owasp.org/"
],
recommendations=[
"Review OWASP Top 10 security risks",
"Ensure input validation is in place",
"Check for proper error handling",
"Verify authentication and authorization",
"Avoid hardcoding sensitive information"
],
timestamp=datetime.utcnow().isoformat()
)
def _get_recommendations(self, pattern_key: str, severity: str) -> List[str]:
"""Get specific recommendations for a pattern."""
recommendations = {
"authentication": [
"Use multi-factor authentication where possible",
"Implement proper session management",
"Hash passwords with bcrypt or Argon2",
"Add rate limiting to prevent brute force attacks"
],
"api_keys": [
"Use environment variables for secrets",
"Never commit secrets to version control",
"Rotate keys regularly",
"Use a secrets manager (e.g., HashiCorp Vault, AWS Secrets Manager)"
],
"database": [
"Use parameterized queries to prevent SQL injection",
"Implement proper access controls",
"Encrypt sensitive data at rest",
"Validate and sanitize all inputs"
],
"encryption": [
"Use well-tested cryptographic libraries",
"Avoid creating custom encryption algorithms",
"Use strong key lengths (AES-256, RSA-2048+)",
"Implement proper key management"
],
"security": [
"Follow the principle of least privilege",
"Implement defense in depth",
"Keep security dependencies up to date",
"Conduct security testing"
]
}
return recommendations.get(pattern_key, [
"Review security implications carefully",
"Consult security documentation",
"Consider security testing"
])
def _get_documentation_links(self, pattern_key: str) -> List[str]:
"""Get documentation links for a pattern (fallback when JSON references absent)."""
docs = {
"authentication": [
"https://cheatsheetseries.owasp.org/cheatsheets/Authentication_Cheat_Sheet.html",
"https://owasp.org/www-project-top-ten/2017/A2_2017-Broken_Authentication"
],
"api_keys": [
"https://cheatsheetseries.owasp.org/cheatsheets/Secrets_Management_Cheat_Sheet.html",
"https://owasp.org/www-project-top-ten/2017/A3_2017-Sensitive_Data_Exposure"
],
"database": [
"https://cheatsheetseries.owasp.org/cheatsheets/SQL_Injection_Prevention_Cheat_Sheet.html",
"https://cheatsheetseries.owasp.org/cheatsheets/Query_Parameterization_Cheat_Sheet.html"
],
"encryption": [
"https://cheatsheetseries.owasp.org/cheatsheets/Cryptographic_Storage_Cheat_Sheet.html",
"https://cheatsheetseries.owasp.org/cheatsheets/Key_Management_Cheat_Sheet.html"
],
"security": [
"https://owasp.org/www-project-top-ten/",
"https://cheatsheetseries.owasp.org/"
]
}
return docs.get(pattern_key, [
"https://owasp.org/www-project-top-ten/",
"https://cheatsheetseries.owasp.org/"
])
def _refine_with_learning(
self, advice_list: List[SecurityAdvice], context: AdvisoryContext
) -> List[SecurityAdvice]:
"""Refine advice based on learning data."""
# Check if we have feedback patterns
feedback = self.learning_data.get("feedback", [])
if feedback:
# Adjust severity or recommendations based on past feedback
for advice in advice_list:
relevant_feedback = [
f for f in feedback
if f.get("pattern") == advice.title
]
if relevant_feedback:
# Calculate average helpfulness
avg_helpful = sum(
f.get("helpful", 0) for f in relevant_feedback
) / len(relevant_feedback)
# If advice was not helpful, adjust it
if avg_helpful < 0.5:
advice.message += "\n\nNote: This guidance is being refined based on contributor feedback."
return advice_list
def capture_intent(self, intent: str, context: AdvisoryContext) -> None:
"""Capture contributor intent for better guidance."""
intent_data = {
"intent": intent,
"timestamp": datetime.utcnow().isoformat(),
"context": {
"labels": context.issue_labels,
"files": context.file_patterns
}
}
# Store intent for learning
self.learning_data.setdefault("intents", []).append(intent_data)
self._save_learning_data()
def record_feedback(self, advice_title: str, helpful: bool, comments: str = "") -> None:
"""Record feedback on advice for learning loop."""
feedback_data = {
"pattern": advice_title,
"helpful": 1 if helpful else 0,
"comments": comments,
"timestamp": datetime.utcnow().isoformat()
}
self.learning_data.setdefault("feedback", []).append(feedback_data)
self._save_learning_data()
def _save_learning_data(self) -> None:
"""Save learning data to file."""
learning_path = "config/learning_data.json"
os.makedirs(os.path.dirname(learning_path), exist_ok=True)
with open(learning_path, 'w') as f:
json.dump(self.learning_data, f, indent=2)
def generate_report(self, advice_list: List[SecurityAdvice]) -> str:
"""Generate a formatted report from advice list."""
if not advice_list:
return "No specific security advisories for this contribution."
report = ["# 🛡️ BLT Preflight Security Advisory\n"]
report.append("This advisory system helps you understand security expectations before contributing.\n")
report.append("---\n")
# FIX 4: Group by all four supported severity levels (info, warning, high, critical).
# Previously "high" was not handled, causing those advisories to be silently dropped
# from the report even though the JSON patterns were correctly classified.
critical = [a for a in advice_list if a.severity == "critical"]
high = [a for a in advice_list if a.severity == "high"]
warnings = [a for a in advice_list if a.severity == "warning"]
info = [a for a in advice_list if a.severity == "info"]
if critical:
report.append("## 🔴 Critical Security Considerations\n")
for advice in critical:
report.append(self._format_advice(advice))
if high:
report.append("## 🟠 High Severity Security Considerations\n")
for advice in high:
report.append(self._format_advice(advice))
if warnings:
report.append("## 🟡 Security Warnings\n")
for advice in warnings:
report.append(self._format_advice(advice))
if info:
report.append("## 🔵 Security Information\n")
for advice in info:
report.append(self._format_advice(advice))
report.append("\n---")
report.append("\n*This is an advisory system - not enforcement. These suggestions help prevent common security issues.*")
report.append("\n*Questions? Check our [documentation](docs/SECURITY_GUIDANCE.md) or ask a maintainer.*")
return "\n".join(report)
def _format_advice(self, advice: SecurityAdvice) -> str:
"""Format individual advice for display."""
lines = [f"### {advice.title}\n"]
lines.append(f"{advice.message}\n")
if advice.recommendations:
lines.append("**Recommendations:**")
for rec in advice.recommendations:
lines.append(f"- {rec}")
lines.append("")
if advice.documentation_links:
lines.append("**Learn more:**")
for link in advice.documentation_links:
lines.append(f"- {link}")
lines.append("")
return "\n".join(lines)Here's a summary of every change made and why:
|
|
@coderabbitai will this changes fix my pr |
|
Tip For best results, initiate chat on the files or code changes.
Here's what each fix addresses in relation to your PR:
What you need to do
Once both files are updated together, all the patterns you added (OAuth, JWT, 2FA, SSL/TLS, etc.) will correctly trigger advisories, surface your OWASP references, and appear under the right severity section in the report. 🚀 |
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ⚙️ Run configurationConfiguration used: Repository: OWASP-BLT/coderabbit/.coderabbit.yml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Enhance security advisory generation by improving pattern matching and severity handling.
|
@coderabbitai retry and review again |
|
The results will be posted shortly. 🚀 ✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (4)
config/security_patterns.json (3)
29-29: The**/config/**pattern may generate excessive false positives.This pattern matches every file in any
config/directory, including innocuous files likelogging.json,theme.yaml, orfeatures.jsonthat contain no secrets. Consider narrowing to specific sensitive files:"**/config/*secret*", "**/config/*credential*", "**/config/*key*"Alternatively, if broad coverage is intentional for this security-focused tool, this is acceptable—just be aware users may see warnings for non-sensitive config files.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/security_patterns.json` at line 29, The glob pattern "**/config/**" is too broad and will produce many false positives; update the security_patterns.json to narrow matches to likely sensitive filenames (e.g., patterns matching "secret", "credential", "key", or other known sensitive names) instead of all files in any config directory, or explicitly document that broad config scanning is intentional; specifically replace or supplement the existing "**/config/**" entry with targeted patterns like "**/config/*secret*", "**/config/*credential*", "**/config/*key*" or equivalent project-specific sensitive names to reduce noise.
136-169: Label patterns guidance is well-improved; consider addingreferencesarrays for consistency.The updated guidance text for label patterns is comprehensive and actionable. However, unlike
file_patterns, these entries lackreferencesarrays. As noted in the PR discussion, this creates a schema inconsistency. The engine's_generate_advice_from_patternwill fall back to hardcoded links, which works but loses the benefit of the enriched OWASP references.Consider adding
referencesarrays to label patterns for consistency:"authentication": { "severity": "critical", "guidance": "...", "references": [ "https://cheatsheetseries.owasp.org/cheatsheets/Authentication_Cheat_Sheet.html" ] }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/security_patterns.json` around lines 136 - 169, The label_patterns entries (e.g., "authentication", "authorization", "security", "data-privacy", "api", "vulnerability", "encryption", "dependencies") lack a references array causing schema inconsistency and fallback behavior in _generate_advice_from_pattern; add a "references" array to each label object containing the canonical URL(s) referenced in its guidance (for example the OWASP cheat sheet or project links already cited in the guidance) so the engine can consume explicit links rather than falling back to hardcoded values.
69-70:**/requests*and**/urllib*patterns may not achieve intended goal.These patterns attempt to catch files using the
requests/urlliblibraries to warn about SSL verification, but filename matching is unreliable:
- Files using
requestsaren't necessarily namedrequests_*.py- Files named
requests_handler.pymight not importrequestsat all- Most actual usage is in files named
api_client.py,http_utils.py, etc.Consider removing these patterns and relying on static analysis tools (like Bandit) to detect actual
verify=Falseusage. Alternatively, keep them but acknowledge they'll have both false positives and false negatives.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/security_patterns.json` around lines 69 - 70, The filename-glob patterns "**/requests*" and "**/urllib*" are unreliable for detecting use of the requests/urllib libraries (they produce false positives/negatives); update the entry that contains these patterns by either removing "**/requests*" and "**/urllib*" entirely and adding a note to rely on actual static analysis (e.g., Bandit) to detect verify=False/SSL issues, or retain them but add a clarifying comment in the same JSON block stating they are best-effort and will yield false positives/negatives; reference the existing patterns "**/requests*" and "**/urllib*" in your change so reviewers can locate and modify them.src/advisory_engine/core.py (1)
193-193: Consider usingdatetime.now(timezone.utc)instead ofdatetime.utcnow()at all 4 occurrences.
datetime.utcnow()is deprecated as of Python 3.12 in favor of timezone-aware datetimes. This appears at lines 193, 213, 318, and 335.♻️ Suggested change
Add import at top of file:
from datetime import datetime, timezoneThen replace all usages:
- timestamp=datetime.utcnow().isoformat() + timestamp=datetime.now(timezone.utc).isoformat()Apply the same change at lines 193, 213, 318, and 335.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/advisory_engine/core.py` at line 193, Replace all uses of datetime.utcnow() with timezone-aware datetime.now(timezone.utc) and add the timezone import: change datetime.utcnow().isoformat() to datetime.now(timezone.utc).isoformat() for each occurrence (the spots that set timestamp and any other places calling datetime.utcnow()); also add from datetime import timezone to the module imports so the code is timezone-aware and avoids the deprecated datetime.utcnow().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@config/security_patterns.json`:
- Around line 85-86: The glob pattern "**/*file*" is too broad and matches many
unrelated paths (e.g., Dockerfile, Makefile, profile.py); replace that single
pattern with a set of targeted patterns to capture actual upload/file-storage
paths instead—swap "**/*file*" for explicit entries such as "**/upload*",
"**/uploads/**", "**/storage/**", "**/media/**", "**/attachments/**",
"**/*file_upload*", and "**/*fileupload*" so only likely upload-related files
are matched (update the "patterns" array that currently contains "**/*upload*"
and "**/*file*").
In `@src/advisory_engine/core.py`:
- Line 13: Remove the invalid backticks wrapping the `@dataclass` decorators in
this module: replace occurrences of `` `@dataclass` `` with a plain `@dataclass`
decorator (both instances reported on lines 13 and 23) so the dataclass
decorators before the affected class definitions (the classes immediately
following those decorators) are valid Python; ensure you import dataclass from
dataclasses if not already present.
- Around line 157-162: The glob-to-regex conversion treats single-level '*' as
one-or-more chars using '[^/]+', which breaks matches when the segment can be
empty (e.g., filenames starting with the literal); update the replacement that
transforms '*' (the line operating on pattern -> regex, referencing the pattern
variable and the regex being matched against normalized) to use a zero-or-more
class '[^/]*' instead of '[^/]+', ensuring '**/*api*' and similar patterns
correctly match names like 'api_client.py'.
---
Nitpick comments:
In `@config/security_patterns.json`:
- Line 29: The glob pattern "**/config/**" is too broad and will produce many
false positives; update the security_patterns.json to narrow matches to likely
sensitive filenames (e.g., patterns matching "secret", "credential", "key", or
other known sensitive names) instead of all files in any config directory, or
explicitly document that broad config scanning is intentional; specifically
replace or supplement the existing "**/config/**" entry with targeted patterns
like "**/config/*secret*", "**/config/*credential*", "**/config/*key*" or
equivalent project-specific sensitive names to reduce noise.
- Around line 136-169: The label_patterns entries (e.g., "authentication",
"authorization", "security", "data-privacy", "api", "vulnerability",
"encryption", "dependencies") lack a references array causing schema
inconsistency and fallback behavior in _generate_advice_from_pattern; add a
"references" array to each label object containing the canonical URL(s)
referenced in its guidance (for example the OWASP cheat sheet or project links
already cited in the guidance) so the engine can consume explicit links rather
than falling back to hardcoded values.
- Around line 69-70: The filename-glob patterns "**/requests*" and "**/urllib*"
are unreliable for detecting use of the requests/urllib libraries (they produce
false positives/negatives); update the entry that contains these patterns by
either removing "**/requests*" and "**/urllib*" entirely and adding a note to
rely on actual static analysis (e.g., Bandit) to detect verify=False/SSL issues,
or retain them but add a clarifying comment in the same JSON block stating they
are best-effort and will yield false positives/negatives; reference the existing
patterns "**/requests*" and "**/urllib*" in your change so reviewers can locate
and modify them.
In `@src/advisory_engine/core.py`:
- Line 193: Replace all uses of datetime.utcnow() with timezone-aware
datetime.now(timezone.utc) and add the timezone import: change
datetime.utcnow().isoformat() to datetime.now(timezone.utc).isoformat() for each
occurrence (the spots that set timestamp and any other places calling
datetime.utcnow()); also add from datetime import timezone to the module imports
so the code is timezone-aware and avoids the deprecated datetime.utcnow().
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: OWASP-BLT/coderabbit/.coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: b224e3b0-e9a2-45a8-b08f-590589bddb1c
📒 Files selected for processing (2)
config/security_patterns.jsonsrc/advisory_engine/core.py
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
|
Looks good overall. I don't see any blockers we can address the remaining improvements in future iterations. |
|
@DonnieBLT would appreciate a review |
|
👋 Hi @S3DFX-CYBER! This pull request needs a peer review before it can be merged. Please request a review from a team member who is not:
Once a valid peer review is submitted, this check will pass automatically. Thank you!
|
|
@coderabbitai help me Add unit tests for the new behaviour |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chainLet me first inspect the existing test structure and conventions in the repository. 🏁 Script executed: Length of output: 3394 🏁 Script executed: Length of output: 28936
New test functions — append after
|
| Test function | Fix | What it verifies |
|---|---|---|
test_globstar_pattern_matching |
Fix 1 | _matches_pattern() correctly resolves ** for deep paths, root paths, Windows paths, and non-matching paths |
test_bidirectional_label_matching |
Fix 2 | auth, auth-flow, API all match; bug does not |
test_references_wiring |
Fix 3 | Custom JSON references are used; hardcoded fallback URL is absent |
test_high_severity_in_report |
Fix 4 | 🟠 section present; all four severity sections co-exist correctly |
test_high_severity_via_file_pattern |
Fix 4 (E2E) | severity="high" on a file pattern flows through evaluate_context → generate_report without being dropped |
Closes #9
Security_patterns.json enhanced with
Resources used - https://cheatsheetseries.owasp.org/index.html
Summary by CodeRabbit
New Features
Improvements