Rewrite as modular Python 3.12+ package (v3.0.0)#6
ethanolivertroy wants to merge 1 commit into main from
Conversation
Decompose the 3,031-line monolith into a proper src-layout package with uv/hatchling, registry-based framework analyzers, and an in-memory OktaData bus that eliminates the original file-read coupling between analysis phases.

Key changes:
- Python 3.12+ with modern type hints, pyproject.toml, uv lockfile
- CLI: `okta-inspector` (primary) + `okta-audit` (backward compat)
- New --frameworks flag for selective framework execution
- Environment variable support (OKTA_DOMAIN, OKTA_API_TOKEN)
- Auto-discovery registry pattern for analyzers and reporters
- Add CMMC 2.0 (Level 2) compliance framework (NIST 800-171)
- 7 framework analyzers, 10 report generators, 30 source modules
- Test suite with mock OktaData fixtures
📝 Walkthrough

Introduces okta-inspector, a comprehensive Okta compliance audit tool with an HTTP client, data collector, modular compliance analyzers for frameworks (CMMC, FedRAMP, IRAP, ISMAP, PCI-DSS, SOC2, STIG), framework reporters, and a CLI entry point. Includes build configuration, core models, and test fixtures.
Sequence Diagram

sequenceDiagram
actor User
participant CLI
participant Client as OktaClient
participant Collector as OktaDataCollector
participant Engine as AuditEngine
participant Analyzers
participant Reporters
participant Output as OutputManager
User->>CLI: Run okta-inspector --domain X --token Y --frameworks F
CLI->>Client: Create client (domain, token)
CLI->>Output: Create output manager
CLI->>Client: test_connection()
Client->>Client: GET /users?limit=1
Client-->>CLI: ✓ Connected
CLI->>Engine: Create AuditEngine (client, output, frameworks)
User->>CLI: Initiate audit
CLI->>Engine: run()
Engine->>Collector: Create OktaDataCollector
Engine->>Collector: collect()
Collector->>Client: GET /policies
Collector->>Client: GET /policies/{id}/rules
Collector->>Client: GET /authenticators
Collector->>Client: GET /users, /groups, /apps
Collector->>Client: GET /idps, /network-zones
Collector->>Client: GET /logs, /hooks, /streams
Client-->>Collector: OktaData (complete)
Engine->>Output: _save_raw_data(okta_data)
Engine->>Analyzers: get_analyzers(frameworks)
activate Analyzers
Analyzers->>Analyzers: analyze(okta_data)
note over Analyzers: Generate compliance findings<br/>for each framework
Analyzers-->>Engine: List[ComplianceFinding]
deactivate Analyzers
Engine->>Output: _save_analysis(findings)
Engine->>Reporters: get_reporters(frameworks)
activate Reporters
Reporters->>Reporters: generate(findings, okta_data, output)
note over Reporters: Write framework-specific<br/>reports (Markdown/text)
Reporters-->>Engine: (void)
deactivate Reporters
Engine->>Output: create_archive()
Output-->>Engine: Archive path
Engine->>Engine: _print_summary(result)
Engine-->>CLI: AuditResult
CLI-->>User: ✓ Audit complete<br/>Archive: okta_audit_*.zip
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Review Summary by Qodo

Modular Python 3.12+ package rewrite with CMMC 2.0 framework and enhanced compliance reporting (v3.0.0)
Description

- Complete rewrite of monolithic okta-audit.py into modular okta_inspector package (v3.0.0) with 30 modules
- **New CMMC 2.0 Level 2 framework** added with 12 Okta-evaluable practices across AC, IA, and AU domains, including SPRS scoring and POA&M template generation
- Modernized to Python 3.12+ with uv/hatchling, comprehensive type hints, and auto-discovery registry pattern for analyzers and reporters
- Replaced fragile file-read coupling with in-memory OktaData bus for all collected API data across 40+ endpoints
- **7 compliance frameworks** implemented: DISA STIG (11 controls), FedRAMP/NIST 800-53 (15 controls), IRAP/ISM (10 controls), ISMAP/ISO 27001 (8 controls), SOC 2 CC6 (6 controls), PCI-DSS 4.0.1 (7 controls), CMMC 2.0 (12 practices)
- New CLI features: --frameworks flag for selective execution, environment variable support (OKTA_DOMAIN/OKTA_API_TOKEN), OAuth Bearer token option
- **10 report generators** including executive summary, unified compliance matrix, validation scripts, and framework-specific reporters with remediation guidance
- Shared analysis functions for reusable compliance checks (sessions, passwords, MFA, users, certificates, monitoring)
- Comprehensive test suite with pytest fixtures and 11 passing tests covering models, analyzers, and reporters
- Backward compatibility maintained with okta-audit CLI entry point

Diagram

flowchart LR
A["Okta API<br/>40+ endpoints"] -->|"collector.py"| B["OktaData<br/>in-memory bus"]
B -->|"7 analyzers"| C["ComplianceFinding<br/>registry"]
C -->|"10 reporters"| D["Markdown/JSON<br/>reports"]
D -->|"output.py"| E["Archive<br/>audit output"]
F["CLI<br/>--frameworks"] -->|"engine.py"| B
G["CMMC 2.0<br/>NEW"] -->|"SPRS scoring"| D
File Changes

1. src/okta_inspector/reporters/cmmc.py
Code Review by Qodo
1. STIG report filters all findings
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request represents a significant architectural overhaul, transforming a single script into a structured, modern Python package. The primary goal is to enhance the tool's extensibility, maintainability, and performance by introducing modular components, a streamlined data pipeline, and support for additional compliance standards. This foundational change sets the stage for future development and broader applicability of the Okta compliance audit tool.

Highlights
/greptile review
@codex review
Code Review
This is an impressive and comprehensive refactoring of the original script into a well-structured, modular Python package. The use of modern tooling like hatchling and uv, strict type checking, and the plugin-based architecture for analyzers and reporters are all excellent choices that will significantly improve maintainability and extensibility.
My review has identified a few key areas for improvement:
- Critical Flaw in MFA Detection: The logic to check for MFA enforcement is based on policy names rather than the actual policy rules, which could lead to false positives. This needs to be addressed to ensure the accuracy of the audit findings.
- Configuration and Dependency Cleanup: There's a redundant and conflicting dependency group in
pyproject.tomlthat should be removed. - Code Duplication and Efficiency: There are opportunities to reduce code duplication, such as the session extraction logic in analyzers, and to improve efficiency by avoiding re-running analyses in the audit engine.
- Test Determinism: One of the tests is non-deterministic due to its reliance on the current time, which can be fixed by mocking
datetime.now().
Overall, this is a fantastic rewrite. Addressing these points will make the tool even more robust and reliable.
```python
def is_mfa_enforced(data: OktaData) -> bool:
    """Return True if any access policy appears to require MFA."""
    for policy in data.access_policies:
        name = policy.get("name", "")
        if "Admin Console" in name or "Dashboard" in name:
            return True
    return False
```
This MFA enforcement check is critically flawed as it only verifies the name of the policy, not its actual configuration or rules. A policy could be named "Admin Console" but be disabled or have rules that do not enforce MFA, leading to a dangerous false positive. The same issue applies to has_admin_console_mfa and has_dashboard_mfa.
The check should be made more robust by inspecting the rules within the policy to confirm that MFA is actually required. This would involve iterating through policy.get("_embedded", {}).get("rules", []) and checking the actions for an MFA requirement, such as "factorLifetime" or "factorPromptMode" settings.
```toml
[dependency-groups]
dev = [
    "pytest>=9.0.2",
]
```
The [dependency-groups] section is redundant and conflicts with the [project.optional-dependencies] section defined above. This dev group specifies pytest>=9.0.2, while the dev group under optional-dependencies specifies pytest>=8. This can lead to confusion and inconsistent development environments.
For a modern hatchling project, all optional dependencies, including development dependencies, should be defined under [project.optional-dependencies]. You should remove this [dependency-groups] section to resolve the conflict and simplify the configuration.
```python
def analyze_sessions(data: OktaData) -> list[SessionAnalysis]:
    """Analyze sign-on policies for session settings."""
    results: list[SessionAnalysis] = []
    for policy in data.sign_on_policies:
        sa = SessionAnalysis(
            policy_id=policy.get("id", ""),
            policy_name=policy.get("name", ""),
            priority=policy.get("priority"),
        )
        # Rules may be embedded or fetched separately — not available from
        # the sign-on policy endpoint in the current collector, so we skip
        # rule-level detail here. Framework analyzers that need rule data
        # should check access_policies (which include inline rules).
        results.append(sa)
    return results
```
This analyze_sessions function appears to be dead code. It is not called by any of the framework analyzers in the pull request. Instead, each analyzer that needs session settings (cmmc.py, fedramp.py, etc.) implements its own _extract_session_settings helper function.
To reduce code duplication and improve maintainability, you should consolidate the session extraction logic into this common function and have all analyzers use it. This function should be enhanced to extract the rule-level details that the individual helpers are currently extracting.
```python
name = "cmmc"
display_name = "CMMC 2.0 (Level 2)"

def analyze(self, data: OktaData) -> list[ComplianceFinding]:
```
The analyze method is over 380 lines long, which makes it difficult to read, maintain, and test. This is a common pattern across all the analyzer modules.
Consider refactoring this method by breaking down the logic for each control check into its own private helper method. For example:
```python
def _check_ac_l2_3_1_8(self, pw_policies: list[PasswordPolicyAnalysis]) -> list[ComplianceFinding]:
    # ... logic for this control
    return findings

def _check_ac_l2_3_1_10(self, sessions: list[dict]) -> list[ComplianceFinding]:
    # ... logic for this control
    return findings

# in analyze():
findings.extend(self._check_ac_l2_3_1_8(pw_policies))
findings.extend(self._check_ac_l2_3_1_10(sessions))
# ... and so on
```

This would make the main analyze method a much clearer orchestrator of the individual checks and improve the overall modularity of the analyzer.
```python
from okta_inspector.analyzers.common import (
    analyze_authenticators,
    analyze_certificates,
    analyze_monitoring,
    analyze_password_policies,
    analyze_users,
)
```
The _save_analysis method re-runs several analysis functions from analyzers.common (e.g., analyze_password_policies, analyze_users). These functions are already called within each individual framework analyzer.
This is inefficient as it performs the same computations multiple times. It could also lead to inconsistencies if the analysis functions were non-deterministic.
To improve performance and ensure consistency, the analysis should be performed only once. You could run these common analysis functions at the beginning of the _run_analyzers method, store their results, and pass them to both the individual analyzers and the _save_analysis method.
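A sketch of that shape, with stub stand-ins for the shared analysis functions (the real signatures in analyzers/common.py may differ; the `CommonAnalyses` bundle and `run_audit` orchestrator are hypothetical names, not the PR's actual API):

```python
from dataclasses import dataclass

# Stub stand-ins for the shared analysis passes in analyzers/common.py
# (function names taken from the review; bodies are illustrative only).
CALLS = {"password_policies": 0, "users": 0}

def analyze_password_policies(data: dict) -> list[dict]:
    CALLS["password_policies"] += 1
    return [p for p in data["policies"] if p.get("type") == "PASSWORD"]

def analyze_users(data: dict) -> dict:
    CALLS["users"] += 1
    return {"total": len(data["users"])}

@dataclass
class CommonAnalyses:
    """Bundle of precomputed shared analyses, passed to every consumer."""
    password_policies: list[dict]
    users: dict

def run_audit(data: dict, analyzers: list) -> tuple[list, CommonAnalyses]:
    # Compute the shared analyses exactly once, up front.
    common = CommonAnalyses(
        password_policies=analyze_password_policies(data),
        users=analyze_users(data),
    )
    findings: list = []
    for analyzer in analyzers:
        findings.extend(analyzer(data, common))  # analyzers reuse `common`
    # _save_analysis would also receive `common` instead of recomputing.
    return findings, common
```

Each shared function then runs once per audit regardless of how many framework analyzers consume its result.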
```python
system_logs: list[dict[str, Any]] = field(default_factory=list)

# Additional
org_factors: list[dict[str, Any]] | dict[str, Any] = field(default_factory=dict)
```
The type hint for org_factors is inconsistent with its default value and the expected API response. The Okta API endpoint /api/v1/org/factors returns a list of factor objects.
The type hint should be list[dict[str, Any]] and the default value should be field(default_factory=list) to accurately reflect the data structure.
```diff
-org_factors: list[dict[str, Any]] | dict[str, Any] = field(default_factory=dict)
+org_factors: list[dict[str, Any]] = field(default_factory=list)
```
```python
def test_analyze_users(sample_okta_data: OktaData):
    result = analyze_users(sample_okta_data)
    assert result.total_users == 3
    assert result.active_users == 2
    # u2 has lastLogin in Oct 2025, >90 days ago from March 2026
    assert len(result.inactive_users) >= 1
```
This test is non-deterministic because the analyze_users function it tests uses datetime.now(). The test's success depends on when it is run relative to the hardcoded dates in the test fixture.
To make this test reliable, you should mock datetime.now() to return a fixed timestamp. This can be done using unittest.mock.patch or freezegun.
Greptile Summary

This PR is a substantial and well-executed architectural rewrite of a 3,031-line monolith into a 30-module package.

Key findings:
Confidence Score: 4/5
Important Files Changed
Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
CLI["cli.py\nmain()"] --> CLIENT["OktaClient\nclient.py"]
CLI --> OUTPUT["OutputManager\noutput.py"]
CLI --> ENGINE["AuditEngine\nengine.py"]
ENGINE --> PHASE1["Phase 1: Collect"]
PHASE1 --> COLLECTOR["OktaDataCollector\ncollector.py"]
COLLECTOR -->|"40+ API endpoints"| OKTA[("Okta API")]
COLLECTOR --> OKTADATA["OktaData\nmodels.py"]
ENGINE --> SAVE_RAW["save_raw_data()\n→ core_data/*.json"]
ENGINE --> PHASE2["Phase 2: Analyze"]
PHASE2 --> REGISTRY["get_analyzers()\nanalyzers/__init__.py"]
REGISTRY --> STIG["STIGAnalyzer"]
REGISTRY --> FEDRAMP["FedRAMPAnalyzer"]
REGISTRY --> IRAP["IRAPAnalyzer"]
REGISTRY --> ISMAP["ISMAPAnalyzer"]
REGISTRY --> SOC2["SOC2Analyzer"]
REGISTRY --> PCI["PCIDSSAnalyzer"]
REGISTRY --> CMMC["CMMCAnalyzer ✨NEW"]
STIG & FEDRAMP & IRAP & ISMAP & SOC2 & PCI & CMMC -->|"analyze(OktaData)"| COMMON["analyzers/common.py\nanalyze_sessions()\nanalyze_password_policies()\nis_mfa_enforced() ⚠️\nanalyze_users()\nanalyze_certificates()\nanalyze_monitoring()"]
COMMON --> FINDINGS["list[ComplianceFinding]"]
ENGINE --> PHASE3["Phase 3: Report"]
PHASE3 --> REPORTERS["get_reporters()\nreporters/__init__.py"]
REPORTERS --> EXEC["ExecutiveSummaryReporter"]
REPORTERS --> MATRIX["MatrixReporter"]
REPORTERS --> VALID["ValidationReporter"]
REPORTERS --> FRAMEWORK_REPORTS["7× Framework Reporters\n(STIG, FedRAMP, IRAP, ISMAP,\nSOC2, PCI-DSS, CMMC ✨NEW)"]
FINDINGS --> REPORTERS
REPORTERS --> OUTPUT
OUTPUT --> ARCHIVE["okta_audit_YYYYMMDD.zip"]
Reviews (1): Last reviewed commit: "Rewrite okta-inspector as modular Python..."
```python
    """Return True if any access policy appears to require MFA."""
    for policy in data.access_policies:
        name = policy.get("name", "")
        if "Admin Console" in name or "Dashboard" in name:
            return True
    return False
```
is_mfa_enforced produces false positives
The function returns True simply if any access policy exists whose name contains "Admin Console" or "Dashboard". In Okta, an access policy named "Okta Admin Console" is created by default for every org — even one with MFA not configured. The policy's existence does not imply its rules enforce MFA; you'd need to inspect the rules' factorMode / constraints / reauthenticateIn to determine actual MFA enforcement.
Because this is the single shared predicate for the most-critical control across all seven frameworks (IA.L2-3.5.3, ISM-0974, E8-MFA, A.9.4.2, and equivalents in FedRAMP/SOC 2/PCI-DSS), a false "pass" here silently masks a missing MFA configuration for every framework simultaneously.
A safer heuristic would flag as "manual" when inspection of policy rules isn't available, rather than assuming a True result:
Current:

```python
    """Return True if any access policy appears to require MFA."""
    for policy in data.access_policies:
        name = policy.get("name", "")
        if "Admin Console" in name or "Dashboard" in name:
            return True
    return False
```

Suggested:

```python
def is_mfa_enforced(data: OktaData) -> bool:
    """Return True if any access policy appears to require MFA.

    NOTE: This is a best-effort heuristic. Without rule-level data the
    presence of an "Admin Console" or "Dashboard" named policy is used as
    a proxy. Consider verifying policy rules manually.
    """
    for policy in data.access_policies:
        name = policy.get("name", "")
        # Check for MFA requirement in embedded rules if available
        for rule in policy.get("_embedded", {}).get("rules", []):
            actions = rule.get("actions", {})
            signon = actions.get("signon", {})
            if signon.get("requireFactor") or signon.get("factorMode") in ("2FA", "REQUIRED"):
                return True
        # Fall back to name heuristic only when no rule data is present
        if "Admin Console" in name or "Dashboard" in name:
            if policy.get("_embedded", {}).get("rules"):
                # Rules are present but none require MFA
                continue
            return True  # No rule data; assume name-based heuristic
    return False
```
```python
    return [cls() for cls in _ANALYZER_REGISTRY.values()]
return [_ANALYZER_REGISTRY[n]() for n in names if n in _ANALYZER_REGISTRY]
```
Silent drop of invalid framework names
When a user passes --frameworks foo,cmmc, the list comprehension silently filters out any name not in the registry (if n in _ANALYZER_REGISTRY). If all names are unrecognized, get_analyzers returns [], the engine runs zero analyzers, and the archive is created with no findings — which looks identical to a perfect audit. No warning is emitted to the user.
Current:

```python
    return [cls() for cls in _ANALYZER_REGISTRY.values()]
return [_ANALYZER_REGISTRY[n]() for n in names if n in _ANALYZER_REGISTRY]
```

Suggested:

```python
def get_analyzers(names: list[str] | None = None) -> list[FrameworkAnalyzer]:
    """Instantiate analyzers. Pass *names* to select a subset."""
    _ensure_loaded()
    if names is None:
        return [cls() for cls in _ANALYZER_REGISTRY.values()]
    unknown = [n for n in names if n not in _ANALYZER_REGISTRY]
    if unknown:
        import logging
        logging.getLogger(__name__).warning(
            "Unknown framework(s) ignored: %s. Available: %s",
            ", ".join(unknown),
            ", ".join(sorted(_ANALYZER_REGISTRY)),
        )
    return [_ANALYZER_REGISTRY[n]() for n in names if n in _ANALYZER_REGISTRY]
```
```python
f"**Okta Domain:** {data.domain}  ",
"**Assessment Basis:** Cybersecurity Maturity Model Certification (CMMC) 2.0 - Level 2  ",
"**NIST SP 800-171 Rev 2 Mapping:** Practices align to NIST 800-171 security requirements",
```
findings_by_id overwrites duplicate control IDs
The dict comprehension keeps only the last finding for each control_id:
```python
findings_by_id: dict[str, ComplianceFinding] = {
    f.control_id: f for f in findings if f.framework == "CMMC"
}
```

The CMMC analyzer emits one finding per password policy for controls like AC.L2-3.1.8, IA.L2-3.5.7, IA.L2-3.5.8. If an org has two password policies where the first fails and the second passes (or vice versa), only the last one survives here. The SPRS calculation and the POA&M table are then built on incomplete data, potentially missing a real failure.
Consider aggregating per-control: prefer the worst-case status (fail > manual > pass) so failing policies are never silently dropped.
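A minimal sketch of that aggregation (the status strings and their ordering come from this comment; `ComplianceFinding` is simplified to a dict):

```python
# Severity ranking assumed from the review's ordering: fail > manual > pass.
_SEVERITY = {"fail": 2, "manual": 1, "pass": 0}

def aggregate_by_control(findings: list[dict]) -> dict[str, dict]:
    """Keep one finding per control_id, preferring the worst status."""
    by_id: dict[str, dict] = {}
    for f in findings:
        current = by_id.get(f["control_id"])
        if current is None or _SEVERITY[f["status"]] > _SEVERITY[current["status"]]:
            by_id[f["control_id"]] = f
    return by_id
```

Because the comparison is order-independent, a failing policy survives whether it is listed before or after a passing one.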
```python
# ---------- Intermediate analysis types used by common.py ----------
```
org_factors default factory type mismatch
The field is declared as list[dict] | dict but default_factory=dict will always produce a dict at runtime. Any code that branches on isinstance(data.org_factors, list) to iterate over factors will silently get an empty dict instead of an empty list, skipping the loop. Consistent typing makes this easier to reason about.
```python
org_factors: dict[str, Any] = field(default_factory=dict)
```
If the Okta /org/factors endpoint can genuinely return either a list or a dict depending on the org, add a comment explaining when each type is expected, and update all callers to handle both.
```python
    data=data,
    api_call_count=self._client.api_call_count,
    timestamp=datetime.now().isoformat(),
    domain=self._client.domain,
```
Naive `datetime.now()` — inconsistent timezone handling
Every other timestamp in the codebase uses datetime.now(tz=timezone.utc). This call produces a timezone-naive datetime, so AuditResult.timestamp will be serialized without a timezone offset, making it ambiguous when the JSON output is consumed by downstream tools.
Suggested change:

```python
timestamp=datetime.now(timezone.utc).isoformat(),
```
Also add the missing import at the top of the file:
```python
from datetime import datetime, timezone
```

```python
        details=dict(details),
    )


def _extract_session_settings(data: OktaData) -> list[dict]:
    results: list[dict] = []
    for policy in data.access_policies:
        for rule in policy.get("_embedded", {}).get("rules", []):
            actions = rule.get("actions", {})
            signon = actions.get("signon", {})
            session = signon.get("session", {})
            results.append(
                {
                    "policy_name": policy.get("name", ""),
                    "rule_name": rule.get("name", ""),
                    "idle_minutes": session.get("maxSessionIdleMinutes"),
                    "lifetime_minutes": session.get("maxSessionLifetimeMinutes"),
                }
            )
```
_extract_session_settings duplicated from irap.py
This identical helper is defined in both cmmc.py (lines ~40–55) and irap.py. It should be extracted to analyzers/common.py alongside the other shared analysis helpers. Keeping two copies risks the two diverging silently when the Okta access policy schema changes.
```toml
[project.optional-dependencies]
dev = [
    "pytest>=8",
    "pytest-cov",
    "ruff",
    "mypy",
    "types-requests",
]

[project.scripts]
okta-inspector = "okta_inspector.cli:main"
okta-audit = "okta_inspector.cli:main"

[project.urls]
Homepage = "https://github.com/ethanolivertroy/okta-inspector"
Repository = "https://github.com/ethanolivertroy/okta-inspector"

[tool.hatch.build.targets.wheel]
packages = ["src/okta_inspector"]

[tool.ruff]
target-version = "py312"
line-length = 120

[tool.ruff.lint]
select = ["E", "F", "I", "UP", "B", "SIM"]

[tool.mypy]
python_version = "3.12"
strict = true

[tool.pytest.ini_options]
testpaths = ["tests"]

[dependency-groups]
```
Duplicate dev dependency groups with conflicting pytest constraints
pytest is declared twice with different version constraints:
- `[project.optional-dependencies] dev` → `pytest>=8`
- `[dependency-groups] dev` → `pytest>=9.0.2`
[dependency-groups] is a PEP 735 feature (uv-native) while [project.optional-dependencies] is the standard PEP 508 approach. Having both creates ambiguity about which constraint resolvers will respect. The uv.lock will win on uv sync, but tools that use pip extras might install a different version. Consider removing one of the two, or making the constraints consistent.
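One possible resolution, assuming the standard extra should be the single source of truth (a sketch, not the PR's actual file):

```toml
# Keep the PEP 508 extra as the only declaration of dev dependencies and
# delete the [dependency-groups] table, so pip extras and uv sync resolve
# the same pytest constraint.
[project.optional-dependencies]
dev = [
    "pytest>=8",
    "pytest-cov",
    "ruff",
    "mypy",
    "types-requests",
]
```

The inverse choice (keeping only `[dependency-groups]`) also works if all contributors install with uv, but pip users would then lose the `.[dev]` extra.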
```python
if f.framework == "DISA STIG":
    findings_by_id[f.control_id] = f
```
🔴 STIG reporter uses wrong framework name, producing an empty report
The STIG analyzer sets _FRAMEWORK = "STIG" (src/okta_inspector/analyzers/stig.py:16), but the STIG reporter filters findings with f.framework == "DISA STIG". Since no finding will ever have framework="DISA STIG", the findings_by_id dict will always be empty, and the entire STIG compliance checklist report will show all controls as "NOT ASSESSED" with zero assessed/pass/fail counts.
```diff
-if f.framework == "DISA STIG":
-    findings_by_id[f.control_id] = f
+if f.framework == "STIG":
+    findings_by_id[f.control_id] = f
```
```python
ts = datetime.now(tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")

findings_by_id: dict[str, ComplianceFinding] = {
    f.control_id: f for f in findings if f.framework == "SOC 2"
```
🔴 SOC2 reporter uses wrong framework name, producing an empty report
The SOC2 analyzer sets _FRAMEWORK = "SOC2" (src/okta_inspector/analyzers/soc2.py:13), but the SOC2 reporter filters findings with f.framework == "SOC 2" (note the space). Since no finding will ever have framework="SOC 2", the findings_by_id dict will always be empty, and the entire SOC 2 compliance report will show all controls as "NOT ASSESSED".
```diff
-    f.control_id: f for f in findings if f.framework == "SOC 2"
+    f.control_id: f for f in findings if f.framework == "SOC2"
```
```python
if response.status_code == 429:
    self._handle_rate_limit(response, page_count)
    continue
```
🔴 Rate-limit retry consumes a page slot and drops query params on first page
When a 429 response is received, page_count has already been incremented (line 74) before the continue (line 85). On the retry iteration, page_count is incremented again, so: (1) the 429 counts against max_pages, potentially preventing all data from being fetched, and (2) params if page_count == 1 else None evaluates to None on the retry since page_count is now ≥ 2, dropping critical query parameters like since for the /logs endpoint (src/okta_inspector/collector.py:121).
Example: first-page 429 drops params for logs collection
In collector.py:121, the logs endpoint is called with params={"since": since, "limit": ...}. If the first request returns 429, the retry will not include since, causing the query to return all logs instead of the last 24 hours.
Prompt for agents
In src/okta_inspector/client.py, the get() method at lines 73-85 has a bug where 429 rate-limit retries increment page_count and drop query params. Fix this by decrementing page_count when a 429 is encountered (before the continue), so that: (1) the retry doesn't count against max_pages, and (2) the params are correctly sent on the retry since page_count will still be 1. Specifically, add `page_count -= 1` after line 84 (the _handle_rate_limit call) and before the `continue` on line 85. Also consider adding a max-retry limit to prevent infinite loops if the server keeps returning 429.
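The prompt above describes the fix abstractly; here is a self-contained sketch of a pagination loop in which a 429 retry neither consumes a page slot nor loses the first-page params (the helper and its `get_page(page, params)` callback are hypothetical, not the PR's actual OktaClient API):

```python
import time

def fetch_all_pages(get_page, params=None, max_pages=10, max_retries=5):
    """Paginated fetch where a 429 retries the SAME page.

    `get_page(page, params)` is a hypothetical callable returning
    (status_code, items, has_next); params are only sent for page 1,
    mirroring the client under review.
    """
    items, page, retries = [], 1, 0
    while page <= max_pages:
        status, batch, has_next = get_page(page, params if page == 1 else None)
        if status == 429:
            retries += 1
            if retries > max_retries:
                raise RuntimeError("rate limited too many times")
            time.sleep(0)  # stand-in for honoring the Retry-After header
            continue  # page is NOT advanced: the retry keeps its slot and params
        retries = 0  # reset the retry budget after a successful page
        items.extend(batch)
        if not has_next:
            break
        page += 1
    return items
```

Because `page` is only incremented after a successful response, a first-page 429 re-sends the original query params (e.g. `since` for `/logs`), and the bounded retry counter prevents an infinite loop under sustained rate limiting.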
```python
def is_mfa_enforced(data: OktaData) -> bool:
    """Return True if any access policy appears to require MFA."""
    for policy in data.access_policies:
        name = policy.get("name", "")
        if "Admin Console" in name or "Dashboard" in name:
            return True
    return False
```
🔴 is_mfa_enforced checks policy name existence instead of actual MFA enforcement
The function is_mfa_enforced claims to check if MFA is enforced, but it only checks whether an access policy named "Admin Console" or "Dashboard" exists. Since these policies exist by default in virtually all Okta orgs, this function returns True for nearly every tenant regardless of whether MFA is actually required. This causes false "pass" results for MFA compliance checks across all 6 frameworks that use it (FedRAMP IA-2, CMMC IA.L2-3.5.3, IRAP ISM-0974, ISMAP A.9.4.2, SOC2 CC6.1, PCI-DSS 8.3.1). The same issue affects has_admin_console_mfa and has_dashboard_mfa at lines 116-121.
Prompt for agents
In src/okta_inspector/analyzers/common.py, the is_mfa_enforced function at lines 107-113 needs to actually check MFA enforcement rather than just checking policy name existence. The function should inspect the embedded rules within access policies and check whether any rule requires MFA (e.g., by checking if actions.appSignOn.verificationMethod.factorMode is set to something like 'REQUIRED', or if the policy rules have constraints requiring additional authentication factors). A minimal improvement would be to check if the policy has embedded rules that require a second factor. Similarly fix has_admin_console_mfa (line 116) and has_dashboard_mfa (line 120) to check for actual MFA configuration rather than just policy name existence.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1efe660e53
```python
# Index findings by control_id for quick lookup
findings_by_id: dict[str, ComplianceFinding] = {}
for f in findings:
    if f.framework == "DISA STIG":
```
Match STIG framework label when indexing findings
Use the same framework key as the analyzer here. STIGAnalyzer emits findings with framework="STIG" (see src/okta_inspector/analyzers/stig.py), but this reporter only indexes "DISA STIG", so every STIG control is treated as NOT ASSESSED even when analyzer results exist; this makes the generated STIG checklist materially incorrect for normal runs.
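One way to prevent this class of bug is to share a single label constant between analyzer and reporter; a simplified sketch (class shapes and the `STIG-001` control ID are illustrative, not the PR's actual base classes):

```python
# Define the framework label once and reference it on both sides, so the
# analyzer's tag and the reporter's filter can never drift apart.
STIG_FRAMEWORK = "STIG"

class STIGAnalyzer:
    framework = STIG_FRAMEWORK

    def analyze(self) -> list[dict]:
        # One placeholder finding tagged with the shared label.
        return [{"framework": self.framework, "control_id": "STIG-001", "status": "pass"}]

class STIGReporter:
    framework = STIG_FRAMEWORK  # same constant as the analyzer

    def index(self, findings: list[dict]) -> dict[str, dict]:
        return {f["control_id"]: f for f in findings if f["framework"] == self.framework}
```

A cross-check test asserting that every reporter's `framework` matches its analyzer's would have caught both the STIG and SOC 2 mismatches.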
```python
ts = datetime.now(tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")

findings_by_id: dict[str, ComplianceFinding] = {
    f.control_id: f for f in findings if f.framework == "SOC 2"
```
Match SOC2 framework label in SOC 2 reporter
This filter uses "SOC 2", but the analyzer emits framework="SOC2" (src/okta_inspector/analyzers/soc2.py). Because of the mismatch, findings_by_id is empty and the SOC 2 report marks controls as not assessed instead of reflecting actual analyzer outcomes.
```shell
check_file "$OUTPUT_DIR/analysis/session_analysis.json" "Session analysis"
check_file "$OUTPUT_DIR/analysis/password_policy_analysis.json" "Password policy analysis"
check_file "$OUTPUT_DIR/analysis/monitoring_analysis.json" "Monitoring analysis"
```
Point validation checks to files the engine actually writes
These checks reference analysis artifacts (`session_analysis.json`, `monitoring_analysis.json`) that are never produced by `AuditEngine._save_analysis`, which writes split files like `active_event_hooks.json`, `active_log_streams.json`, and `log_event_summary.json` instead. As shipped, the generated validation script will report missing-file failures on a standard output directory even when the audit completed successfully.
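Under that reading, a corrected stanza would target the split files named above (a sketch; `check_file` is re-declared here only so the snippet is self-contained, standing in for the generated script's helper):

```shell
# Sketch: point the generated validation script at files the engine
# actually writes. check_file is a minimal stand-in for the real helper.
check_file() {
  if [ -f "$1" ]; then echo "OK: $2"; else echo "MISSING: $2"; fi
}

check_file "$OUTPUT_DIR/analysis/active_event_hooks.json" "Active event hooks"
check_file "$OUTPUT_DIR/analysis/active_log_streams.json" "Active log streams"
check_file "$OUTPUT_DIR/analysis/log_event_summary.json" "Log event summary"
```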
Actionable comments posted: 19
🧹 Nitpick comments (21)
src/okta_inspector/__main__.py (1)

3-5: Consider adding an `if __name__ == "__main__":` guard. Calling `main()` unconditionally at module level means any import of this module (e.g., during testing or introspection) will trigger execution. While this works for `python -m okta_inspector`, wrapping the call in a guard is safer.

♻️ Proposed fix

```diff
 from okta_inspector.cli import main

-main()
+if __name__ == "__main__":
+    main()
```
src/okta_inspector/cli.py (1)
121-123: Consider logging the traceback for debugging. The broad `Exception` catch is acceptable for a CLI entrypoint to ensure graceful exit, but using `logging.exception()` instead of `logging.error()` would preserve the traceback for easier debugging.

♻️ Proposed fix

```diff
 except Exception as e:
-    logging.getLogger(__name__).error("Audit failed: %s", e)
+    logging.getLogger(__name__).exception("Audit failed: %s", e)
     sys.exit(1)
```
src/okta_inspector/output.py (2)
84-92: Archive timestamp also uses local time. Line 86 uses `datetime.now()` without a timezone, the same inconsistency as the directory naming. Consider using UTC here as well for consistency with report timestamps.
31-36: Inconsistent timezone handling in directory timestamp. Line 33 uses `datetime.now()` (local time) for the directory name, but reporters use `datetime.now(tz=timezone.utc)` for their timestamps. This inconsistency could cause confusion when correlating timestamps. Consider using UTC consistently:

🔧 Proposed fix

```diff
+from datetime import datetime, timezone
+
 def __init__(self, base_dir: Path | None = None) -> None:
     if base_dir is None:
-        ts = datetime.now().strftime("%Y%m%d_%H%M%S")
+        ts = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
         base_dir = Path(f"okta_audit_results_{ts}")
     self.base_dir = base_dir
     self._ensure_dirs()
```
src/okta_inspector/collector.py (2)
49-52: Inconsistent query parameter handling. Line 50 embeds the query parameter directly in the URL string (`f"/policies?type={policy_type}"`), while other methods use the `params` argument (e.g., line 79). Consider using `params` consistently for clarity and proper URL encoding:

🔧 Proposed fix

```diff
 for policy_type, attr in policy_map:
-    result = self._client.get(f"/policies?type={policy_type}")
+    result = self._client.get("/policies", params={"type": policy_type})
     if result and isinstance(result, list):
         setattr(data, attr, result)
```
72-75: Type handling assumes all endpoints return compatible types. Lines 73-75 set attributes directly from API results without distinguishing between list and dict responses. For example, `default_auth_server` should be a dict, while `authenticators` should be a list. If an endpoint returns an unexpected type (e.g., an error response), this could cause downstream issues. The current approach works if the client handles errors gracefully, but consider adding type guards for critical fields.
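One possible shape for such a guard (a sketch; the expected-type map and the `set_guarded` helper are illustrative assumptions, mirroring the attribute names mentioned above rather than shipped code):

```python
# Sketch of a type guard for collector results. The EXPECTED_TYPES map and
# helper name are illustrative assumptions, not the shipped collector code.
import logging

logger = logging.getLogger(__name__)

EXPECTED_TYPES = {
    "default_auth_server": dict,
    "authenticators": list,
}


def set_guarded(data: object, attr: str, result: object) -> bool:
    """Assign attr only when result matches the expected container type."""
    expected = EXPECTED_TYPES.get(attr)
    if expected is not None and not isinstance(result, expected):
        logger.warning(
            "Skipping %s: expected %s, got %s",
            attr, expected.__name__, type(result).__name__,
        )
        return False
    setattr(data, attr, result)
    return True
```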
src/okta_inspector/reporters/__init__.py (1)
34-39: Unknown reporter names are silently ignored. When `names` contains a reporter name that doesn't exist in the registry, it's silently skipped. This could mask configuration errors or typos in CLI arguments. Consider logging a warning or raising an error for unknown names:

🔧 Proposed improvement

```diff
 def get_reporters(names: list[str] | None = None) -> list[ReportGenerator]:
     """Instantiate reporters. Pass *names* to select a subset."""
     _ensure_loaded()
     if names is None:
         return [cls() for cls in _REPORTER_REGISTRY.values()]
-    return [_REPORTER_REGISTRY[n]() for n in names if n in _REPORTER_REGISTRY]
+    reporters = []
+    for n in names:
+        if n in _REPORTER_REGISTRY:
+            reporters.append(_REPORTER_REGISTRY[n]())
+        else:
+            import logging
+            logging.getLogger(__name__).warning("Unknown reporter requested: %s", n)
+    return reporters
```
src/okta_inspector/reporters/matrix.py (1)
18-23: The `findings` parameter is unused. The `generate` method receives `findings` but doesn't incorporate them into the matrix. Consider using the findings to show dynamic pass/fail status per control area, which would make the matrix more actionable.
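A sketch of folding findings into per-area statuses (the `control_id`/`status` fields follow the `ComplianceFinding` usage quoted elsewhere in this review; the area-prefix convention and plain-dict findings are illustrative assumptions):

```python
# Sketch: derive a worst-case status per control area from findings.
# Assumes control IDs carry an area prefix (e.g. "AC-11" -> "AC"), which
# is an illustrative convention, not necessarily the shipped schema.
_ORDER = {"pass": 0, "manual": 1, "fail": 2}


def status_by_area(findings: list[dict]) -> dict[str, str]:
    areas: dict[str, str] = {}
    for f in findings:
        area = f["control_id"].split("-", 1)[0]
        current = areas.get(area, "pass")
        if _ORDER[f["status"]] > _ORDER[current]:
            areas[area] = f["status"]
        else:
            areas.setdefault(area, current)
    return areas
```

The matrix reporter could then render each area's worst status instead of a static layout.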
src/okta_inspector/analyzers/soc2.py (1)
37-51: Duplicated session extraction logic. This `_extract_session_settings` function is nearly identical to the one in `cmmc.py` (context snippet 1), but omits `lifetime_minutes`. Consider:

- Extracting both fields here for consistency
- Moving the shared extraction logic to `analyzers/common.py`

This would reduce duplication and ensure all analyzers have access to the same session data.

♻️ Proposed refactor

```diff
-def _extract_session_settings(data: OktaData) -> list[dict]:
-    results: list[dict] = []
-    for policy in data.access_policies:
-        for rule in policy.get("_embedded", {}).get("rules", []):
-            actions = rule.get("actions", {})
-            signon = actions.get("signon", {})
-            session = signon.get("session", {})
-            results.append(
-                {
-                    "policy_name": policy.get("name", ""),
-                    "rule_name": rule.get("name", ""),
-                    "idle_minutes": session.get("maxSessionIdleMinutes"),
-                }
-            )
-    return results
+# Move to analyzers/common.py and import:
+# from okta_inspector.analyzers.common import extract_session_settings
```

In `analyzers/common.py`:

```python
def extract_session_settings(data: OktaData) -> list[dict]:
    """Extract session settings from access policies."""
    results: list[dict] = []
    for policy in data.access_policies:
        for rule in policy.get("_embedded", {}).get("rules", []):
            actions = rule.get("actions", {})
            signon = actions.get("signon", {})
            session = signon.get("session", {})
            results.append({
                "policy_name": policy.get("name", ""),
                "rule_name": rule.get("name", ""),
                "idle_minutes": session.get("maxSessionIdleMinutes"),
                "lifetime_minutes": session.get("maxSessionLifetimeMinutes"),
            })
    return results
```
src/okta_inspector/reporters/executive.py (1)
13-13: Hardcoded `_FRAMEWORKS` list may drift from actual registered analyzers. This tuple is used in the "Frameworks Assessed" section (lines 75-76), but it doesn't reflect which frameworks were actually run or which are available in the registry. If a new framework is added or one is renamed, this list becomes stale.

Consider dynamically getting frameworks from findings or using `available_frameworks()` from the analyzer registry.

♻️ Proposed fix

```diff
-_FRAMEWORKS = ("FedRAMP", "DISA STIG", "IRAP", "ISMAP", "SOC 2", "PCI-DSS", "CMMC")
+from okta_inspector.analyzers import available_frameworks as get_available_frameworks
```

Then in `generate()`:

```diff
     lines.extend([
         "## Frameworks Assessed",
         "",
     ])
-    for fw in _FRAMEWORKS:
-        lines.append(f"- {fw}")
+    # Show frameworks that produced findings
+    for fw in sorted(framework_counts.keys()):
+        lines.append(f"- {fw}")
     lines.append("")
```
src/okta_inspector/engine.py (2)
76-77: Consider logging exception tracebacks for debugging. The broad `Exception` catches are acceptable for resilience (one failing analyzer/reporter shouldn't abort the entire audit), but the current logging only captures the exception message. Including the traceback would help diagnose issues.

♻️ Proposed improvement

```diff
 except Exception as e:
-    logger.error("Error in %s analyzer: %s", analyzer.name, e)
+    logger.exception("Error in %s analyzer: %s", analyzer.name, e)
```

Using `logger.exception()` automatically includes the traceback.

Also applies to: 86-87
53-59: Inconsistent timestamp timezone. Line 57 uses `datetime.now()` without a timezone, producing a naive datetime, while all reporters use `datetime.now(tz=timezone.utc)`. This inconsistency could cause confusion when comparing timestamps.

♻️ Proposed fix

```diff
+from datetime import datetime, timezone
+
 result = AuditResult(
     findings=findings,
     data=data,
     api_call_count=self._client.api_call_count,
-    timestamp=datetime.now().isoformat(),
+    timestamp=datetime.now(tz=timezone.utc).isoformat(),
     domain=self._client.domain,
 )
```
src/okta_inspector/analyzers/__init__.py (1)
34-39: Consider warning when unknown framework names are requested. When `names` contains an unknown framework (e.g., a typo like `"stg"` instead of `"stig"`), `get_analyzers` silently skips it. This could lead to confusing behavior where a user expects a framework to run but it doesn't, with no indication why.

♻️ Proposed improvement to log unknown names

```diff
+import logging
+
+logger = logging.getLogger(__name__)
+
 def get_analyzers(names: list[str] | None = None) -> list[FrameworkAnalyzer]:
     """Instantiate analyzers. Pass *names* to select a subset."""
     _ensure_loaded()
     if names is None:
         return [cls() for cls in _ANALYZER_REGISTRY.values()]
-    return [_ANALYZER_REGISTRY[n]() for n in names if n in _ANALYZER_REGISTRY]
+    result = []
+    for n in names:
+        if n in _ANALYZER_REGISTRY:
+            result.append(_ANALYZER_REGISTRY[n]())
+        else:
+            logger.warning("Unknown framework '%s' requested; skipping.", n)
+    return result
```
src/okta_inspector/reporters/stig.py (1)
148-149: Remove unnecessary f-string prefixes. These lines are static strings with no interpolation.

♻️ Proposed fix

```diff
-    f"| Metric | Count |",
-    f"|--------|-------|",
+    "| Metric | Count |",
+    "|--------|-------|",
```
src/okta_inspector/analyzers/cmmc.py (1)
43-58: Consider moving `_extract_session_settings` to `analyzers/common.py`. This function is duplicated verbatim in `cmmc.py`, `pci_dss.py`, and `irap.py`. Extracting it to `common.py` would reduce duplication and ensure consistent behavior across analyzers.

♻️ Proposed refactor

Add to `src/okta_inspector/analyzers/common.py`:

```python
def extract_session_settings(data: OktaData) -> list[dict]:
    """Extract session timeout settings from embedded access policy rules."""
    results: list[dict] = []
    for policy in data.access_policies:
        for rule in policy.get("_embedded", {}).get("rules", []):
            actions = rule.get("actions", {})
            signon = actions.get("signon", {})
            session = signon.get("session", {})
            results.append(
                {
                    "policy_name": policy.get("name", ""),
                    "rule_name": rule.get("name", ""),
                    "idle_minutes": session.get("maxSessionIdleMinutes"),
                    "lifetime_minutes": session.get("maxSessionLifetimeMinutes"),
                }
            )
    return results
```

Then import and use it in each analyzer.
src/okta_inspector/reporters/cmmc.py (2)
376-389: Rename unused loop variable `description` to `_description`. The loop unpacks `description` but never uses it, which can confuse readers about whether it was accidentally omitted from the logic.

♻️ Proposed fix

```diff
-    for practice_id, description, point_value in practices:
+    for practice_id, _description, point_value in practices:
```
173-173: Remove extraneous f-string prefix. This string has no placeholders, so the `f` prefix is unnecessary.

♻️ Proposed fix

```diff
-    f"**Starting Score:** 110 ",
+    "**Starting Score:** 110 ",
```
src/okta_inspector/client.py (1)
114-116: Avoid catching bare `Exception`. The broad `except Exception` can mask unexpected errors (`KeyboardInterrupt` is not caught, but other unexpected exceptions like a `TypeError` from internal bugs would be silently swallowed). Use a more specific exception type.

♻️ Proposed fix

```diff
-except Exception as e:
+except requests.exceptions.RequestException as e:
     logger.error("API connection failed: %s", e)
     return False
```
src/okta_inspector/analyzers/common.py (2)
107-121: Function names suggest MFA enforcement detection, but the implementation only checks policy names. `is_mfa_enforced()` returns `True` if any access policy name contains "Admin Console" or "Dashboard", which doesn't verify MFA is actually required. Similarly, `has_admin_console_mfa()` and `has_dashboard_mfa()` only confirm these policies exist.

Consider renaming to `has_admin_console_policy()` / `has_dashboard_policy()` to better reflect behavior, or enhance to inspect the policy's `actions.signon.access` field for actual MFA requirements.
139-150: Users who have never logged in are not flagged as inactive. The loop only processes users where `last_login` is truthy. Users who have never logged in (e.g., `lastLogin` is `None` or missing) won't be included in `inactive_users`, even though they may represent accounts that should be reviewed.

Consider whether never-logged-in users should be flagged separately or included in the inactive list.

💡 Possible enhancement

```python
# Add tracking for never-logged-in users
never_logged_in = [u for u in data.users if not u.get("lastLogin")]
result.never_logged_in = never_logged_in  # Would need to add field to UserAnalysis
```
src/okta_inspector/analyzers/fedramp.py (1)
327-347: AU-2 finding may be misleading when system logs are empty. An empty `data.system_logs` could indicate the API query time window had no events (e.g., a new org, or a query limit), not necessarily that logging is disabled. The "fail" status with the message "verify logging is enabled" is appropriate for manual follow-up, but consider softening it to a "manual" status, since empty logs don't definitively prove logging is disabled.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pyproject.toml`:
- Around line 63-66: The project defines pytest in two places causing redundancy
and a version conflict: remove the duplicate by consolidating dependencies to a
single mechanism — either delete pytest from [project.optional-dependencies]
under the "dev" group (keeping the [dependency-groups] entry "dev" with
"pytest>=9.0.2") or remove the [dependency-groups] "dev" block (keeping
pytest>=8 in [project.optional-dependencies]); update only the pytest entry so
there's one source of truth and ensure the chosen version (pytest>=9.0.2 or
pytest>=8) is the intended one.
In `@src/okta_inspector/analyzers/fedramp.py`:
- Around line 208-244: The current MFA checks use name-based heuristics
(is_mfa_enforced, has_admin_console_mfa, has_dashboard_mfa) which can
misclassify policies; update these helpers to inspect each policy's
rules/actions/conditions (e.g., look for actions like "ENROLL" or "CHALLENGE",
or conditions requiring multifactor in policy.actions and policy.conditions) to
determine if MFA is actually required, and then update the callers in fedramp.py
that build _finding entries to use the revised boolean results (or, if you keep
name-based detection, change the _finding descriptions to explicitly state they
are heuristics rather than definitive enforcement checks).
- Around line 43-58: The collector doesn't fetch rules for ACCESS_POLICY so
_extract_session_settings (in src/okta_inspector/analyzers/fedramp.py) receives
access_policies without an "_embedded.rules" array and AC-11/AC-12 fall back to
"manual"; update the collector (src/okta_inspector/collector.py) to load rules
for access policies the same way it does for PASSWORD policies (either call
/policies/{id}/rules for each policy id or include the expand parameter when
fetching access policies) so that data.access_policies contains
"_embedded.rules" and _extract_session_settings can read
session.maxSessionIdleMinutes and session.maxSessionLifetimeMinutes correctly.
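A sketch of that collector change, fetching rules per policy id (the `/policies/{id}/rules` path matches the endpoint named in the PR description; the `client.get` signature and data shapes are assumptions based on the snippets quoted in this review):

```python
# Sketch: load rules for each access policy so "_embedded.rules" is
# populated for the analyzers. The client interface and policy shapes are
# assumptions based on the collector snippets quoted in this review.
def collect_access_policy_rules(client, access_policies: list[dict]) -> None:
    for policy in access_policies:
        pid = policy.get("id")
        if not pid:
            continue
        rules = client.get(f"/policies/{pid}/rules")
        if isinstance(rules, list):
            policy.setdefault("_embedded", {})["rules"] = rules
```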
In `@src/okta_inspector/analyzers/pci_dss.py`:
- Around line 113-139: The control uses the old PCI-DSS v3.x ID and threshold;
update the loop that iterates over pw_policies and the associated _finding calls
to use control ID "8.3.4" and change the threshold check from max_attempts > 6
to max_attempts > 10 (and update the failure message to mention 10 attempts);
also update the pass message to state it locks after X attempts within the
10-attempt limit and change the manual "Account lockout" _finding to use "8.3.4"
so all _finding invocations for this check consistently reference the new
PCI-DSS v4.0 requirement.
In `@src/okta_inspector/analyzers/stig.py`:
- Line 16: Change the module-level constant _FRAMEWORK from "STIG" to "DISA
STIG" so the analyzer's framework string matches the reporter filter; locate the
_FRAMEWORK constant in the stig analyzer module (symbol _FRAMEWORK) and update
its value to "DISA STIG" to align with the reporter that expects framework ==
"DISA STIG".
In `@src/okta_inspector/client.py`:
- Around line 122-139: The header parsing in _handle_rate_limit and
_check_remaining can raise ValueError for malformed X-Rate-Limit-Reset or
X-Rate-Limit-Remaining values; update those methods to defensively parse headers
(wrap int(...) conversions in try/except ValueError or use a helper to
safely_int with a fallback), treat invalid reset_header as None (fall back to
exponential backoff in _handle_rate_limit) and treat invalid remaining as "not
low" (skip pausing in _check_remaining), and optionally log a debug/warning when
a header is malformed; reference the methods _handle_rate_limit and
_check_remaining to locate the changes.
In `@src/okta_inspector/engine.py`:
- Around line 80-87: The _run_reporters function currently calls get_reporters()
with no filtering so every reporter runs even when the user passed --frameworks;
change _run_reporters to only invoke reporters relevant to the selected
frameworks by (a) adding or using a reporter.framework attribute (e.g., "stig",
"cis", or None for framework-agnostic) and (b) filtering get_reporters() results
to include reporters whose framework is in the selected frameworks list or whose
framework is None (executive/matrix/validation). Update the filtering logic
inside _run_reporters (before calling reporter.generate) and ensure you still
log reporter.display_name and catch exceptions with reporter.name as before.
In `@src/okta_inspector/reporters/irap.py`:
- Around line 75-77: The current dict comprehension (creating findings_by_id
from findings filtered by f.framework == "IRAP") overwrites entries when
multiple ComplianceFinding objects share the same control_id; change the
aggregation to collect lists of findings per control_id (e.g., build
findings_by_id: dict[str, list[ComplianceFinding]] using a defaultdict(list) or
manual grouping over findings where f.framework == "IRAP") and then update the
rendering logic to iterate over each list for a control_id (or emit multiple
rows) so all findings for a given control_id are preserved.
- Around line 185-189: The check that computes has_log_forwarding uses
irap_findings and looks for control ID "ISM-0991", but the analyzer emits
"ISM-0407" for log streams, so update the condition that computes
has_log_forwarding to test for a "pass" status on "ISM-0407" (or coordinate to
emit ISM-0991); specifically modify the generator that sets has_log_forwarding
(variable name has_log_forwarding, iterating irap_findings) to look for cid ==
"ISM-0407" and f.status == "pass" so backup_maturity reflects actual log
forwarding state.
- Around line 201-205: The triple of formatted strings includes one unnecessary
f-string: change the f-string for the Assessment Basis line (currently
f"**Assessment Basis:** ASD Essential Eight Maturity Model") to a plain string
"**Assessment Basis:** ASD Essential Eight Maturity Model" (no f-prefix) so it
matches the other literals; keep the surrounding entries that use ts and
data.domain as f-strings.
- Around line 93-105: In the IRAP report string assembly, several string
literals were accidentally marked as f-strings even though they contain no
placeholders; remove the unnecessary f prefixes from the static lines (for
example the lines with "**Assessment Basis:** Australian Government Information
Security Manual (ISM)", the table header and divider strings like "| Metric |
Count |" and "|--------|-------|") so only the lines that interpolate variables
(those referencing total_controls and status_counts) remain f-strings; update
the string list in the report-building function that constructs the markdown
table (the block that builds the "Assessment Summary" section) to use plain
string literals where there are no placeholders.
- Around line 14-45: The _ISM_CONTROLS dictionary is missing the ISM-0072 entry
so findings for that control won't appear in reports; add a new tuple
("ISM-0072", "Domain validation for .gov.au") into the appropriate category list
(e.g., "Authentication" or a more fitting category) inside the _ISM_CONTROLS
dict so the report iteration that references _ISM_CONTROLS will include
ISM-0072; ensure the exact symbol _ISM_CONTROLS is updated and that the string
"ISM-0072" matches the analyzer output.
In `@src/okta_inspector/reporters/matrix.py`:
- Around line 43-57: The Matrix reporter references non-existent files (e.g.,
`analysis/session_analysis.json`, `analysis/user_analysis.json`,
`analysis/monitoring_analysis.json`) in the Analysis File column; either update
src/okta_inspector/reporters/matrix.py to point to the actual filenames produced
(e.g., `users_{status}.json`, `inactive_users.json`, and the individual
monitoring output files) or modify engine.py._save_analysis() to produce
aggregated files with the exact names used in the table (create
`session_analysis.json`, `user_analysis.json`, and `monitoring_analysis.json`
containing the combined data). Ensure the entries in the Matrix strings (Session
Management, Account Management, Monitoring) reference the corrected filenames so
the table matches the outputs.
In `@src/okta_inspector/reporters/soc2.py`:
- Around line 82-84: The filter in findings_by_id is excluding SOC2 analyzer
results because it checks f.framework == "SOC 2" while SOC2Analyzer._FRAMEWORK
== "SOC2"; update the filter in the reporter (the findings_by_id comprehension)
to normalize framework names (e.g., strip whitespace and compare
case-insensitively) or accept both variants (check f.framework in ("SOC2","SOC
2")) so SOC2Analyzer findings are included; reference the findings_by_id
variable and SOC2Analyzer/_FRAMEWORK to locate the change.
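One way to sketch the normalization (helper names hypothetical):

```python
def _normalize(s: str) -> str:
    # Strip whitespace and casefold so "SOC 2", "SOC2", and " soc2 " compare equal.
    return s.replace(" ", "").strip().casefold()


def matches_framework(framework: str, target: str = "SOC2") -> bool:
    """True when a finding's framework string refers to the target framework."""
    return _normalize(framework) == _normalize(target)
```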
- Around line 223-231: The three static recommendation strings added via
recs.extend(...) are numbered using len(recs) + 1 before any are appended,
causing duplicate numbers; fix by computing the starting index once (e.g., start
= len(recs) + 1) and then generate the three entries using that start plus
incremental offsets (or use enumerate over a list of messages with a start
parameter) so each appended string in recs gets a unique, sequential number;
update the recs.extend call to build the numbered strings using this computed
start (reference the recs.extend call in soc2.py).
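A sketch of the numbering fix using `enumerate` with a computed start (function name hypothetical):

```python
def extend_numbered(recs: list[str], messages: list[str]) -> None:
    """Append messages with unique, sequential numbers.

    Computing the start index once avoids the bug where len(recs) + 1 is
    evaluated before any of the new entries are appended.
    """
    start = len(recs) + 1
    recs.extend(f"{n}. {msg}" for n, msg in enumerate(messages, start=start))
```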
In `@src/okta_inspector/reporters/stig.py`:
- Around line 69-70: The filter is checking f.framework == "DISA STIG" but the
analyzer defines the framework as "STIG", so change the filter to use the
analyzer's framework constant (e.g., compare to STIGAnalyzer._FRAMEWORK) or to
the exact string "STIG" so findings are not dropped; update the conditional in
the reporter that populates findings_by_id (the block referencing f.framework
and f.control_id) to match the STIGAnalyzer's _FRAMEWORK value.
In `@src/okta_inspector/reporters/validation.py`:
- Around line 118-137: The validation script references non-existent analysis
files (SESSION_FILE / session_analysis.json and monitoring_analysis.json) while
engine.py._save_analysis() only writes password_policy_analysis.json,
inactive_users.json, authenticator_analysis.json, active_event_hooks.json,
active_log_streams.json, and log_event_summary.json; fix by either updating the
reporter in src/okta_inspector/reporters/validation.py to check the actual files
produced (replace SESSION_FILE and monitoring checks with the real filenames
like password_policy_analysis.json, inactive_users.json,
authenticator_analysis.json, active_event_hooks.json, active_log_streams.json,
log_event_summary.json and corresponding jq output blocks) or add
session/monitoring aggregation and file writes to engine.py._save_analysis()
(implement creation of session_analysis.json and monitoring_analysis.json) so
they exist for the SESSION_FILE and monitoring checks to read.
- Around line 213-252: Update the QUICK_REFERENCE.md content built in the md =
f"""... block so the documented filenames match actual outputs: change
users.json to all_users.json, apps.json to applications.json, remove the
non-created session_analysis.json and monitoring_analysis.json entries, and
replace the user_analysis.json line with a note that individual user status
files are created (e.g., per-user status files) under analysis; ensure the final
QUICK_REFERENCE.md entry remains listed. Locate the multi-line string in
src/okta_inspector/reporters/validation.py and edit those specific filename
lines so the generated documentation reflects the real output structure.
- Around line 110-122: The validation script is checking filenames that don't
match what engine.py writes; update the check_file calls in validation.py to
match _save_raw_data and _save_analysis outputs: replace
"$OUTPUT_DIR/core_data/users.json" with "$OUTPUT_DIR/core_data/all_users.json"
and "$OUTPUT_DIR/core_data/apps.json" with
"$OUTPUT_DIR/core_data/applications.json"; remove or stop checking for
"session_analysis.json" (engine.py doesn't write it) and instead add checks for
the actual analysis files written by _save_analysis—e.g.
"$OUTPUT_DIR/analysis/active_event_hooks.json" and
"$OUTPUT_DIR/analysis/active_log_streams.json" (and keep existing
password_policy_analysis.json if _save_analysis writes it); ensure you update
the descriptive labels passed to check_file accordingly so messages match the
real files.
---
Nitpick comments:
In `@src/okta_inspector/__main__.py`:
- Around line 3-5: The module currently calls main() at import time; wrap the
call to the CLI entrypoint in an if __name__ == "__main__": guard so imports
don't execute the program. Specifically, keep the import from okta_inspector.cli
(main) but move the main() invocation inside an if __name__ == "__main__": block
to ensure main() only runs when the package is executed as a script.
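The guarded entry point would look like the sketch below (a local stub stands in for `okta_inspector.cli.main` so the snippet is self-contained):

```python
def main() -> int:
    """Stand-in for okta_inspector.cli.main."""
    return 0


# Without this guard, `import okta_inspector.__main__` would launch the audit
# at import time; with it, main() runs only under `python -m okta_inspector`.
if __name__ == "__main__":
    main()
```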
In `@src/okta_inspector/analyzers/__init__.py`:
- Around line 34-39: get_analyzers silently ignores unknown names from the names
list which can confuse callers; update get_analyzers to detect any requested
names not present in _ANALYZER_REGISTRY (after calling _ensure_loaded()) and
emit a warning (using the module logger) listing the unknown names so callers
know which requested frameworks were skipped; keep the existing return behavior
but log the set difference between names and _ANALYZER_REGISTRY.keys() before
instantiating known analyzers.
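A sketch of the warning behavior, with a stand-in registry (the real `_ANALYZER_REGISTRY` is populated by `_ensure_loaded()`):

```python
import logging

logger = logging.getLogger(__name__)

# Stand-in registry mapping framework names to analyzer classes.
_ANALYZER_REGISTRY = {"fedramp": dict, "stig": dict}


def get_analyzers(names):
    """Instantiate known analyzers, warning about unknown requested names."""
    unknown = sorted(set(names) - _ANALYZER_REGISTRY.keys())
    if unknown:
        logger.warning("Unknown framework(s) skipped: %s", ", ".join(unknown))
    return [_ANALYZER_REGISTRY[n]() for n in names if n in _ANALYZER_REGISTRY]
```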
In `@src/okta_inspector/analyzers/cmmc.py`:
- Around line 43-58: The _extract_session_settings function is duplicated across
analyzers; to fix, create a single shared function named
extract_session_settings in the common analyzer module that contains the same
logic (iterate data.access_policies, extract rule.get("_embedded",
{}).get("rules", []), pull actions->signon->session and map policy_name,
rule_name, idle_minutes, lifetime_minutes), then remove the duplicate
_extract_session_settings definitions from cmmc.py, pci_dss.py, and irap.py and
import and call the shared extract_session_settings function there (update any
call sites to the new name if needed) so all analyzers use the single
implementation.
In `@src/okta_inspector/analyzers/common.py`:
- Around line 107-121: The current functions (is_mfa_enforced,
has_admin_console_mfa, has_dashboard_mfa) only check policy names rather than
whether MFA is required; either rename them to has_admin_console_policy and
has_dashboard_policy to reflect that behavior, or update their logic to inspect
each policy's actions -> signon fields (e.g.,
policy.get("actions",{}).get("signon",{})) and verify the signon/access or
related MFA requirement fields indicate MFA is enforced (and return True only in
that case); update any callers/tests to match the chosen approach.
- Around line 139-150: The loop that builds result.inactive_users only checks
users with a truthy lastLogin, so accounts with missing or None lastLogin are
ignored; update the logic in the analyzer (around cutoff, data.users, and
result.inactive_users in src/okta_inspector/analyzers/common.py) to handle
never-logged-in users—either by appending users with no lastLogin to
result.inactive_users or by adding a new field on the UserAnalysis (e.g.,
result.never_logged_in) and populating it with [u for u in data.users if not
u.get("lastLogin")], keeping existing isoformat parsing for users with
lastLogin. Ensure you update the UserAnalysis type/constructor if you add the
never_logged_in field.
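A sketch of the two-bucket split (function name and return shape are illustrative; `datetime.fromisoformat` handles Okta's trailing-`Z` timestamps on the PR's Python 3.12+ target):

```python
from datetime import datetime, timedelta, timezone


def split_inactive(users: list[dict], inactive_days: int = 90):
    """Separate stale accounts from accounts that have never logged in.

    Previously, users with a missing/None lastLogin were silently skipped.
    """
    cutoff = datetime.now(timezone.utc) - timedelta(days=inactive_days)
    never_logged_in = [u for u in users if not u.get("lastLogin")]
    inactive = [
        u for u in users
        if u.get("lastLogin") and datetime.fromisoformat(u["lastLogin"]) < cutoff
    ]
    return inactive, never_logged_in
```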
In `@src/okta_inspector/analyzers/fedramp.py`:
- Around line 327-347: The AU-2 block currently treats an empty data.system_logs
as a definitive failure; update the logic that appends the _finding for "AU-2"
so that when data.system_logs is empty it uses a "manual" status (rather than
"fail") and a softer message explaining that no events were returned and manual
verification of logging configuration/collection window is required; keep the
existing "pass" branch unchanged and reference the AU-2 finding creation via the
_finding(...) call that uses data.system_logs to determine which branch to
append.
In `@src/okta_inspector/analyzers/soc2.py`:
- Around line 37-51: The current _extract_session_settings in soc2.py duplicates
logic from cmmc.py and omits lifetime_minutes; extract the shared logic into a
single function (e.g., extract_session_settings) in analyzers/common.py that
returns dicts with "policy_name", "rule_name", "idle_minutes" (from
maxSessionIdleMinutes) and "lifetime_minutes" (from maxSessionLifetimeMinutes),
keep the signature accepting OktaData, then replace the
_extract_session_settings implementation in soc2.py with a call/import to
analyzers.common.extract_session_settings (removing the local duplicate) and
adjust any references to use the unified function.
In `@src/okta_inspector/cli.py`:
- Around line 121-123: Replace the current exception logging in the CLI
entrypoint so the traceback is preserved: in the except Exception as e block in
src/okta_inspector/cli.py (the block that currently calls
logging.getLogger(__name__).error("Audit failed: %s", e) before sys.exit(1)),
call logging.exception (or logging.getLogger(__name__).exception) with a
descriptive message (e.g., "Audit failed") instead of logging.error so the full
traceback is recorded, then keep the existing sys.exit(1).
In `@src/okta_inspector/client.py`:
- Around line 114-116: Replace the broad "except Exception as e" that surrounds
the API call (the block that logs via logger.error("API connection failed: %s",
e) and returns False) with a narrow, explicit exception handler for the
network/HTTP client errors you expect (e.g.,
requests.exceptions.RequestException and requests.exceptions.Timeout if using
requests, or the specific Okta SDK/network exception types your client library
raises); log the error the same way and return False for those expected
exceptions, but do not swallow unexpected exceptions—either let them propagate
or catch them and re-raise after logging. Ensure you import the specific
exception types and update the except clause to reference them instead of the
bare Exception.
In `@src/okta_inspector/collector.py`:
- Around line 49-52: The loop over policy_map uses an inline f-string URL when
calling self._client.get(f"/policies?type={policy_type}") which is inconsistent
with other calls that pass query params via the params argument; change the call
in that loop to use self._client.get("/policies", params={"type": policy_type})
so the request uses proper URL encoding and matches the pattern used elsewhere
(the _client.get call inside the policy_map loop and subsequent setattr(data,
attr, result) remain the same).
- Around line 72-75: The loop over mapping in collector.py unconditionally sets
attributes on the data object from self._client.get results (for endpoint, attr
in mapping / setattr(data, attr, result)), which assumes return types match
expected schemas; add type guards before setting critical fields (e.g., verify
result is dict for attributes like default_auth_server and dict-like auth server
objects, and list for attributes like authenticators) and skip or log and
normalize unexpected types (e.g., empty list/dict or None) rather than directly
setattr; perform these checks around the self._client.get call and only assign
when the type matches the expected shape, otherwise handle the error case (log
and leave default) so downstream consumers don’t receive invalid types.
In `@src/okta_inspector/engine.py`:
- Around line 76-77: Replace the current logger.error calls inside the exception
handlers for analyzers and reporters in engine.py (the except Exception as e
blocks around analyzer.name and reporter.name) with logger.exception so the
traceback is logged; update the two catch blocks that now call
logger.error("Error in %s analyzer: %s", analyzer.name, e) and the analogous
reporter block to use logger.exception with a clear message (e.g., "Error in %s
analyzer" % analyzer.name) so the exception and its stack trace are recorded for
debugging.
- Around line 53-59: The AuditResult timestamp is created with a naive datetime
(datetime.now()) in the AuditResult construction; change it to a timezone-aware
UTC timestamp (e.g., datetime.now(tz=timezone.utc).isoformat()) to match other
reporters and ensure consistency, and add/import timezone from datetime if it
isn’t already imported; update the timestamp expression used when building the
AuditResult in engine.py (the AuditResult(...) block) accordingly.
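The aware-UTC timestamp from the comment above, as a one-liner helper:

```python
from datetime import datetime, timezone


def utc_timestamp() -> str:
    """Timezone-aware UTC timestamp, e.g. for AuditResult and reporters."""
    return datetime.now(tz=timezone.utc).isoformat()
```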
In `@src/okta_inspector/output.py`:
- Around line 84-92: The archive timestamp in create_archive currently uses
naive local time via datetime.now(); change it to use UTC to match other report
timestamps by using datetime.now(timezone.utc) (or
datetime.utcnow()/replace(tzinfo=timezone.utc)) when building ts so the
archive_path uses a UTC timestamp; ensure imports are present and keep the same
strftime format and filename construction in create_archive and reference
variables ts and archive_path.
- Around line 31-36: The directory timestamp in the __init__ of the output class
uses datetime.now() (local time) which is inconsistent with other reporters that
use UTC; change the timestamp generation to use datetime.now(tz=timezone.utc).
Specifically, update the __init__ method where ts is set (the base_dir =
Path(f"okta_audit_results_{ts}") line) to call
datetime.now(tz=timezone.utc).strftime(...) (and import timezone if not already
present) so base_dir timestamps match the reporters and then keep the existing
self.base_dir and self._ensure_dirs() behavior.
In `@src/okta_inspector/reporters/__init__.py`:
- Around line 34-39: In get_reporters, don't silently ignore unknown reporter
names: after calling _ensure_loaded(), compute unknowns = [n for n in names if n
not in _REPORTER_REGISTRY] and if any exist raise a ValueError (or alternatively
log a warning via the module logger) listing the unknown names so mis-typed or missing
reporters are surfaced; then return [_REPORTER_REGISTRY[n]() for n in names if n
in _REPORTER_REGISTRY] as before. Use the function name get_reporters and the
_REPORTER_REGISTRY symbol to locate the code.
In `@src/okta_inspector/reporters/cmmc.py`:
- Around line 376-389: The loop over _CMMC_PRACTICES unpacks a variable named
description that is unused; rename description to _description in the for-loop
(for practice_id, _description, point_value in practices) to make intent clear
and avoid unused-variable confusion, leaving all logic that references
practice_id, point_value, findings_by_id, finding.severity, _SEVERITY_WEIGHTS,
deduction, score, and deductions unchanged.
- Line 173: The string literal currently written as f"**Starting Score:** 110 "
contains no interpolation; remove the unnecessary f prefix so it reads as the
plain string "**Starting Score:** 110 " (locate the occurrence in the cmmc
reporter output list/variable in src/okta_inspector/reporters/cmmc.py and
replace it with a normal string).
In `@src/okta_inspector/reporters/executive.py`:
- Line 13: The hardcoded _FRAMEWORKS tuple can become stale; change the code to
derive the frameworks dynamically (either from the report's findings or from the
analyzer registry via available_frameworks()) instead of the static _FRAMEWORKS
constant. In practice, remove or stop using _FRAMEWORKS in the executive.report
generation path and update generate() to compute the list of frameworks to
display by calling the registry.available_frameworks() (or by extracting unique
framework identifiers from findings) and use that result for the "Frameworks
Assessed" section so it always reflects actual run/registered analyzers.
In `@src/okta_inspector/reporters/matrix.py`:
- Around line 18-23: The generate method currently ignores the findings list;
update generate(self, findings: list[ComplianceFinding], data: OktaData, output:
OutputManager) to iterate over findings and compute pass/fail (or severity) per
control area (e.g., by grouping on finding.control_id or finding.control_area
and evaluating finding.status/severity), then inject those dynamic statuses into
the matrix output created by this method via the OutputManager APIs used
elsewhere in this class; ensure you reference the ComplianceFinding attributes
(control_id/control_area/status/severity) to populate the matrix cells so the
produced matrix reflects real pass/fail for each control.
In `@src/okta_inspector/reporters/stig.py`:
- Around line 148-149: The two static markdown strings currently using
unnecessary f-string prefixes (f"| Metric | Count |" and f"|--------|-------|")
should be changed to plain string literals without the f prefix; locate the
usage in the STIG reporter where these lines are emitted (the list or template
building the table in src/okta_inspector/reporters/stig.py) and remove the
leading "f" so they read "| Metric | Count |" and "|--------|-------|"
respectively.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c55d17da-441e-40f4-ae42-cb30c2768d11
⛔ Files ignored due to path filters (1)
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (38)
.gitignore, pyproject.toml, src/okta_inspector/__init__.py, src/okta_inspector/__main__.py, src/okta_inspector/analyzers/__init__.py, src/okta_inspector/analyzers/base.py, src/okta_inspector/analyzers/cmmc.py, src/okta_inspector/analyzers/common.py, src/okta_inspector/analyzers/fedramp.py, src/okta_inspector/analyzers/irap.py, src/okta_inspector/analyzers/ismap.py, src/okta_inspector/analyzers/pci_dss.py, src/okta_inspector/analyzers/soc2.py, src/okta_inspector/analyzers/stig.py, src/okta_inspector/cli.py, src/okta_inspector/client.py, src/okta_inspector/collector.py, src/okta_inspector/engine.py, src/okta_inspector/models.py, src/okta_inspector/output.py, src/okta_inspector/reporters/__init__.py, src/okta_inspector/reporters/base.py, src/okta_inspector/reporters/cmmc.py, src/okta_inspector/reporters/executive.py, src/okta_inspector/reporters/fedramp.py, src/okta_inspector/reporters/irap.py, src/okta_inspector/reporters/ismap.py, src/okta_inspector/reporters/matrix.py, src/okta_inspector/reporters/pci_dss.py, src/okta_inspector/reporters/soc2.py, src/okta_inspector/reporters/stig.py, src/okta_inspector/reporters/validation.py, tests/__init__.py, tests/conftest.py, tests/test_analyzers/__init__.py, tests/test_analyzers/test_common.py, tests/test_models.py, tests/test_reporters/__init__.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Agent
- GitHub Check: Greptile Review
🧰 Additional context used
🪛 Ruff (0.15.6)
src/okta_inspector/cli.py
[warning] 121-121: Do not catch blind exception: Exception
(BLE001)
src/okta_inspector/output.py
[error] 77-77: os.chmod setting a permissive mask 0o755 on file or directory
(S103)
src/okta_inspector/engine.py
[warning] 76-76: Do not catch blind exception: Exception
(BLE001)
[warning] 86-86: Do not catch blind exception: Exception
(BLE001)
src/okta_inspector/reporters/stig.py
[error] 148-148: f-string without any placeholders
Remove extraneous f prefix
(F541)
[error] 149-149: f-string without any placeholders
Remove extraneous f prefix
(F541)
src/okta_inspector/client.py
[warning] 114-114: Do not catch blind exception: Exception
(BLE001)
src/okta_inspector/reporters/irap.py
[error] 93-93: f-string without any placeholders
Remove extraneous f prefix
(F541)
[error] 99-99: f-string without any placeholders
Remove extraneous f prefix
(F541)
[error] 100-100: f-string without any placeholders
Remove extraneous f prefix
(F541)
[error] 203-203: f-string without any placeholders
Remove extraneous f prefix
(F541)
src/okta_inspector/reporters/cmmc.py
[error] 173-173: f-string without any placeholders
Remove extraneous f prefix
(F541)
[warning] 377-377: Loop control variable description not used within loop body
Rename unused description to _description
(B007)
```toml
[dependency-groups]
dev = [
    "pytest>=9.0.2",
]
```
Duplicate dev dependency group with conflicting pytest version.
The [dependency-groups] section defines pytest>=9.0.2, but [project.optional-dependencies] at lines 29-36 already defines pytest>=8. This creates:
- Redundancy between the two dependency mechanisms
- Version conflict (`>=8` vs `>=9.0.2`)
Consider consolidating into one approach. If using uv with PEP 735 dependency-groups, remove pytest from optional-dependencies.dev. Otherwise, remove the [dependency-groups] section.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pyproject.toml` around lines 63 - 66, The project defines pytest in two
places causing redundancy and a version conflict: remove the duplicate by
consolidating dependencies to a single mechanism — either delete pytest from
[project.optional-dependencies] under the "dev" group (keeping the
[dependency-groups] entry "dev" with "pytest>=9.0.2") or remove the
[dependency-groups] "dev" block (keeping pytest>=8 in
[project.optional-dependencies]); update only the pytest entry so there's one
source of truth and ensure the chosen version (pytest>=9.0.2 or pytest>=8) is
the intended one.
```python
def _extract_session_settings(data: OktaData) -> list[dict]:
    results: list[dict] = []
    for policy in data.access_policies:
        for rule in policy.get("_embedded", {}).get("rules", []):
            actions = rule.get("actions", {})
            signon = actions.get("signon", {})
            session = signon.get("session", {})
            results.append(
                {
                    "policy_name": policy.get("name", ""),
                    "rule_name": rule.get("name", ""),
                    "idle_minutes": session.get("maxSessionIdleMinutes"),
                    "lifetime_minutes": session.get("maxSessionLifetimeMinutes"),
                }
            )
    return results
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify that the collector does not fetch rules for ACCESS_POLICY
rg -n -A 10 'ACCESS_POLICY' src/okta_inspector/collector.py
# Check if any other analyzer has a workaround for this
rg -n '_embedded.*rules' src/okta_inspector/analyzers/
Repository: ethanolivertroy/okta-inspector
Length of output: 1212
🏁 Script executed:
cat -n src/okta_inspector/analyzers/fedramp.py
Repository: ethanolivertroy/okta-inspector
Length of output: 19744
🏁 Script executed:
sed -n '40,80p' src/okta_inspector/collector.py
Repository: ethanolivertroy/okta-inspector
Length of output: 2078
🏁 Script executed:
# Check all uses of _extract_session_settings or similar patterns
rg -B2 -A5 'password_policy_rules|\.rules' src/okta_inspector/analyzers/ | head -80
# Check STIG analyzer since it also uses _embedded.rules
sed -n '35,65p' src/okta_inspector/analyzers/stig.py
Repository: ethanolivertroy/okta-inspector
Length of output: 1199
🏁 Script executed:
# Search for expand parameters or API calls to /policies
rg -n 'expand|/policies' src/okta_inspector/collector.py
# Check if there are any comments about why rules aren't fetched for access policies
rg -B5 -A5 'ACCESS_POLICY' src/okta_inspector/collector.py
Repository: ethanolivertroy/okta-inspector
Length of output: 768
Access policy rules are not fetched by the collector, causing AC-11 and AC-12 to always report "manual" status.
The collector in src/okta_inspector/collector.py fetches rules only for PASSWORD policies (line 59), but not for ACCESS_POLICY. The _extract_session_settings function expects _embedded.rules to be populated in data.access_policies, but this never occurs. This affects multiple analyzers (fedramp, stig, soc2, irap, pci_dss, cmmc).
The collector must be updated to fetch access policy rules via /policies/{id}/rules similar to how it handles PASSWORD policies, or use an expand parameter when fetching access policies. Without this fix, session idle timeout (AC-11) and session lifetime (AC-12) controls will always fall back to "manual" verification status.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/okta_inspector/analyzers/fedramp.py` around lines 43 - 58, The collector
doesn't fetch rules for ACCESS_POLICY so _extract_session_settings (in
src/okta_inspector/analyzers/fedramp.py) receives access_policies without an
"_embedded.rules" array and AC-11/AC-12 fall back to "manual"; update the
collector (src/okta_inspector/collector.py) to load rules for access policies
the same way it does for PASSWORD policies (either call /policies/{id}/rules for
each policy id or include the expand parameter when fetching access policies) so
that data.access_policies contains "_embedded.rules" and
_extract_session_settings can read session.maxSessionIdleMinutes and
session.maxSessionLifetimeMinutes correctly.
```python
# -- IA-2 / IA-2(1): MFA enforcement -------------------------------
mfa_ok = is_mfa_enforced(data)
admin_mfa = has_admin_console_mfa(data)
dash_mfa = has_dashboard_mfa(data)
if mfa_ok:
    findings.append(
        _finding(
            "IA-2",
            "MFA enforcement detected",
            "high",
            "pass",
            "Access policies indicate MFA is enforced.",
            admin_console_mfa=admin_mfa,
            dashboard_mfa=dash_mfa,
        )
    )
else:
    findings.append(
        _finding(
            "IA-2",
            "MFA enforcement not detected",
            "high",
            "fail",
            "No access policy requiring MFA was found.",
        )
    )

if admin_mfa:
    findings.append(
        _finding("IA-2(1)", "Privileged account MFA configured", "high", "pass",
                 "Admin Console access policy detected.")
    )
else:
    findings.append(
        _finding("IA-2(1)", "Privileged account MFA not configured", "high", "fail",
                 "No Admin Console access policy found.")
    )
```
MFA enforcement detection relies on policy name heuristics, not actual rule inspection.
The is_mfa_enforced() helper (in common.py:107-113) only checks if a policy name contains "Admin Console" or "Dashboard", not whether MFA is actually required in the policy rules. This can produce:
- False positives: A policy named "Admin Console" exists but doesn't actually require MFA
- False negatives: MFA is enforced via a differently-named policy
Consider inspecting the policy's actions or conditions to verify MFA is actually required, or update the finding comments to clarify this is a heuristic check.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/okta_inspector/analyzers/fedramp.py` around lines 208 - 244, The current
MFA checks use name-based heuristics (is_mfa_enforced, has_admin_console_mfa,
has_dashboard_mfa) which can misclassify policies; update these helpers to
inspect each policy's rules/actions/conditions (e.g., look for actions like
"ENROLL" or "CHALLENGE", or conditions requiring multifactor in policy.actions
and policy.conditions) to determine if MFA is actually required, and then update
the callers in fedramp.py that build _finding entries to use the revised boolean
results (or, if you keep name-based detection, change the _finding descriptions
to explicitly state they are heuristics rather than definitive enforcement
checks).
```python
# -- 8.2.6: Lockout <= 6 attempts ----------------------------------
for pp in pw_policies:
    if pp.max_attempts > 6:
        findings.append(
            _finding(
                "8.2.6",
                "Lockout threshold exceeds 6 attempts",
                "medium",
                "fail",
                f"Policy '{pp.policy_name}' allows {pp.max_attempts} attempts (max 6).",
                max_attempts=pp.max_attempts,
            )
        )
    else:
        findings.append(
            _finding(
                "8.2.6",
                "Lockout threshold within PCI-DSS limit",
                "medium",
                "pass",
                f"Policy '{pp.policy_name}' locks after {pp.max_attempts} attempts.",
            )
        )
if not pw_policies:
    findings.append(
        _finding("8.2.6", "Account lockout", "medium", "manual", "No password policies found.")
    )
```
🧩 Analysis chain
🌐 Web query:
PCI-DSS 4.0 requirement 8.3.4 account lockout
💡 Result:
PCI DSS 4.0 requirement 8.3.4 (also in v4.0.1) requires that invalid authentication attempts are limited by locking out the user ID after not more than 10 invalid attempts, with the lockout duration set to a minimum of 30 minutes or until the user's identity is confirmed by an administrator or process. This prevents brute-force attacks by stopping repeated password guessing.

Defined Approach Requirements:
- Lock out the user ID after not more than 10 attempts.
- Lockout duration: minimum 30 minutes or until identity confirmed.

Testing Procedures:
- 8.3.4.a: Verify systems lock accounts after ≤10 invalid logons.
- 8.3.4.b: Verify lockout lasts ≥30 minutes or until identity confirmed.

Applicability Note: Does not apply to POS terminal user accounts accessing only one card at a time (e.g., cashier IDs).

Purpose: Mitigates brute-force attacks; without lockout, attackers can guess passwords indefinitely. This falls under Requirement 8: Identify Users and Authenticate Access to System Components, specifically 8.3 Strong authentication for users and administrators.

Note: Older PCI DSS v3 had 6 attempts (8.1.6/8.1.7); updated to 10 in v4.0.
Citations:
- 1: https://service.securitm.ru/docs/pci-dss-v4-0/requirement-8-3-4
- 2: https://www.youtube.com/watch?v=GRcrKvVScCo
- 3: https://blog.rsisecurity.com/creating-a-pci-dss-account-lockout-policy/
- 4: https://www.securitymetrics.com/blog/password-updates-and-requirements-in-pci-4
Update control ID and threshold to PCI-DSS v4.0 requirements.
The code uses control ID 8.2.6 with a 6-attempt threshold, which corresponds to PCI-DSS v3.x. In PCI-DSS v4.0, account lockout is requirement 8.3.4, and the threshold is 10 failed attempts (not 6). Update the control ID and adjust the threshold accordingly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/okta_inspector/analyzers/pci_dss.py` around lines 113 - 139, The control
uses the old PCI-DSS v3.x ID and threshold; update the loop that iterates over
pw_policies and the associated _finding calls to use control ID "8.3.4" and
change the threshold check from max_attempts > 6 to max_attempts > 10 (and
update the failure message to mention 10 attempts); also update the pass message
to state it locks after X attempts within the 10-attempt limit and change the
manual "Account lockout" _finding to use "8.3.4" so all _finding invocations for
this check consistently reference the new PCI-DSS v4.0 requirement.
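The v4.0-aligned check described above can be sketched as follows. This is a standalone illustration, not the package's actual code: `PwPolicy` and the `_finding` helper here are minimal hypothetical stand-ins for the package's password-policy model and finding constructor.

```python
from dataclasses import dataclass


@dataclass
class PwPolicy:
    """Hypothetical stand-in for the package's password policy analysis object."""
    policy_name: str
    max_attempts: int


def _finding(control_id, title, severity, status, comments, **details):
    # Hypothetical stand-in for the package's _finding helper.
    return {"control_id": control_id, "title": title, "severity": severity,
            "status": status, "comments": comments, "details": details}


# PCI-DSS v4.0 req. 8.3.4: lock out the user ID after no more than 10 invalid attempts.
PCI_V4_MAX_ATTEMPTS = 10


def check_lockout(pw_policies: list[PwPolicy]) -> list[dict]:
    findings = []
    for pp in pw_policies:
        if pp.max_attempts > PCI_V4_MAX_ATTEMPTS:
            findings.append(_finding(
                "8.3.4", "Lockout threshold exceeds 10 attempts", "medium", "fail",
                f"Policy '{pp.policy_name}' allows {pp.max_attempts} attempts (max 10).",
                max_attempts=pp.max_attempts,
            ))
        else:
            findings.append(_finding(
                "8.3.4", "Lockout threshold within PCI-DSS v4.0 limit", "medium", "pass",
                f"Policy '{pp.policy_name}' locks after {pp.max_attempts} attempts.",
            ))
    if not pw_policies:
        findings.append(_finding(
            "8.3.4", "Account lockout", "medium", "manual", "No password policies found."
        ))
    return findings
```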
```python
)
from okta_inspector.models import ComplianceFinding, OktaData


_FRAMEWORK = "STIG"
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Check what framework string the STIG reporter filters on
rg -n 'framework\s*==\s*["\047]' src/okta_inspector/reporters/stig.py
```

Repository: ethanolivertroy/okta-inspector
Length of output: 284

🏁 Script executed:

```bash
#!/bin/bash
# Check what framework string the STIG reporter filters on
rg -n 'framework' src/okta_inspector/reporters/stig.py -A 2 -B 2
```

Repository: ethanolivertroy/okta-inspector
Length of output: 266

🏁 Script executed:

```bash
#!/bin/bash
# Check how the STIG analyzer uses _FRAMEWORK
cat -n src/okta_inspector/analyzers/stig.py | head -50
```

Repository: ethanolivertroy/okta-inspector
Length of output: 1860

🏁 Script executed:

```bash
#!/bin/bash
# Check another framework's naming pattern
fd -t f 'nist.*\.py$' src/okta_inspector/analyzers/ src/okta_inspector/reporters/ | head -4 | xargs -I {} sh -c 'echo "=== {} ===" && rg "_FRAMEWORK|framework ==" {} | head -5'
```

Repository: ethanolivertroy/okta-inspector
Length of output: 56

🏁 Script executed:

```bash
#!/bin/bash
# List all analyzer files to see what frameworks exist
ls -1 src/okta_inspector/analyzers/*.py | grep -v __pycache__ | grep -v base | grep -v common | head -10
```

Repository: ethanolivertroy/okta-inspector
Length of output: 380

🏁 Script executed:

```bash
#!/bin/bash
# Check CMMC analyzer _FRAMEWORK value
rg -n '_FRAMEWORK' src/okta_inspector/analyzers/cmmc.py
# Check CMMC reporter framework filter
rg -n 'framework' src/okta_inspector/reporters/cmmc.py -A 1 -B 1
```

Repository: ethanolivertroy/okta-inspector
Length of output: 276
Change _FRAMEWORK from "STIG" to "DISA STIG" to match the reporter filter.
The STIG reporter filters findings on framework == "DISA STIG" (line 69 of src/okta_inspector/reporters/stig.py), but the analyzer sets _FRAMEWORK = "STIG". This mismatch prevents any findings from being processed by the reporter. The module's own docstring confirms the framework should be "DISA STIG", consistent with other framework pairs in the codebase (e.g., CMMC uses matching strings in both analyzer and reporter).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/okta_inspector/analyzers/stig.py` at line 16, Change the module-level
constant _FRAMEWORK from "STIG" to "DISA STIG" so the analyzer's framework
string matches the reporter filter; locate the _FRAMEWORK constant in the stig
analyzer module (symbol _FRAMEWORK) and update its value to "DISA STIG" to align
with the reporter that expects framework == "DISA STIG".
```python
    recs.extend([
        f"{len(recs) + 1}. **Implement continuous monitoring:** Configure Okta "
        "event hooks or log streams to forward security events to a SIEM for "
        "CC7 compliance.",
        f"{len(recs) + 1}. **Establish access review cadence:** Perform quarterly "
        "user access reviews to demonstrate ongoing CC6.2 and CC6.3 compliance.",
        f"{len(recs) + 1}. **Document policies:** Ensure all access control policies "
        "referenced by Okta configurations are formally documented and approved.",
    ])
```
Recommendation numbering produces duplicates.
The three static recommendations are added via recs.extend() with numbering calculated using len(recs) + 1 before any of them are appended. This results in duplicate numbers (e.g., all three get the same number).
🛠️ Proposed fix

```diff
-    recs.extend([
-        f"{len(recs) + 1}. **Implement continuous monitoring:** Configure Okta "
-        "event hooks or log streams to forward security events to a SIEM for "
-        "CC7 compliance.",
-        f"{len(recs) + 1}. **Establish access review cadence:** Perform quarterly "
-        "user access reviews to demonstrate ongoing CC6.2 and CC6.3 compliance.",
-        f"{len(recs) + 1}. **Document policies:** Ensure all access control policies "
-        "referenced by Okta configurations are formally documented and approved.",
-    ])
+    recs.append(
+        f"{len(recs) + 1}. **Implement continuous monitoring:** Configure Okta "
+        "event hooks or log streams to forward security events to a SIEM for "
+        "CC7 compliance."
+    )
+    recs.append(
+        f"{len(recs) + 1}. **Establish access review cadence:** Perform quarterly "
+        "user access reviews to demonstrate ongoing CC6.2 and CC6.3 compliance."
+    )
+    recs.append(
+        f"{len(recs) + 1}. **Document policies:** Ensure all access control policies "
+        "referenced by Okta configurations are formally documented and approved."
+    )
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/okta_inspector/reporters/soc2.py` around lines 223 - 231, The three
static recommendation strings added via recs.extend(...) are numbered using
len(recs) + 1 before any are appended, causing duplicate numbers; fix by
computing the starting index once (e.g., start = len(recs) + 1) and then
generate the three entries using that start plus incremental offsets (or use
enumerate over a list of messages with a start parameter) so each appended
string in recs gets a unique, sequential number; update the recs.extend call to
build the numbered strings using this computed start (reference the recs.extend
call in soc2.py).
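The start-index approach described in the prompt can be sketched as a small helper. This is a hypothetical standalone function, not the package's code; the point is that the numbering base is computed once before any entry is appended:

```python
def append_numbered(recs: list[str], messages: list[str]) -> None:
    """Append messages to recs with sequential numbering continuing from the list's end."""
    # enumerate's start is evaluated once, so appending inside the loop
    # cannot shift the numbering the way len(recs) + 1 inside extend() does.
    for i, msg in enumerate(messages, start=len(recs) + 1):
        recs.append(f"{i}. {msg}")


recs = ["1. Existing recommendation."]
append_numbered(recs, [
    "**Implement continuous monitoring:** forward security events to a SIEM.",
    "**Establish access review cadence:** quarterly user access reviews.",
    "**Document policies:** formally document and approve access control policies.",
])
```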
```python
        if f.framework == "DISA STIG":
            findings_by_id[f.control_id] = f
```
Critical: Framework name mismatch causes all STIG findings to be filtered out.
The reporter filters by f.framework == "DISA STIG", but the STIGAnalyzer in src/okta_inspector/analyzers/stig.py sets _FRAMEWORK = "STIG". This mismatch means findings_by_id will always be empty, and the report will show "NOT ASSESSED" for all 24 V-IDs.
🐛 Proposed fix: Align with the analyzer's framework constant

```diff
     findings_by_id: dict[str, ComplianceFinding] = {}
     for f in findings:
-        if f.framework == "DISA STIG":
+        if f.framework == "STIG":
             findings_by_id[f.control_id] = f
```

Alternatively, update the analyzer to use "DISA STIG" for consistency with the `display_name`, but that would require changes across multiple files.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/okta_inspector/reporters/stig.py` around lines 69 - 70, The filter is
checking f.framework == "DISA STIG" but the analyzer defines the framework as
"STIG", so change the filter to use the analyzer's framework constant (e.g.,
compare to STIGAnalyzer._FRAMEWORK) or to the exact string "STIG" so findings
are not dropped; update the conditional in the reporter that populates
findings_by_id (the block referencing f.framework and f.control_id) to match the
STIGAnalyzer's _FRAMEWORK value.
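One way to rule out this class of analyzer/reporter drift entirely is to define each framework name exactly once and import it on both sides. A minimal sketch, with hypothetical module layout and simplified dict-based findings rather than the package's `ComplianceFinding` dataclass:

```python
# constants.py (hypothetical shared module): the single source of truth.
FRAMEWORK_STIG = "DISA STIG"


# Analyzer side: tag findings with the shared constant.
def analyze(findings_out: list[dict]) -> None:
    findings_out.append({"framework": FRAMEWORK_STIG, "control_id": "V-273186"})


# Reporter side: filter on the same constant, so the strings can never diverge.
def collect(findings: list[dict]) -> dict[str, dict]:
    return {f["control_id"]: f for f in findings if f["framework"] == FRAMEWORK_STIG}
```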
```bash
# --- File existence checks ---
echo "--- Core Data Files ---"
check_file "$OUTPUT_DIR/core_data/users.json" "Users data"
check_file "$OUTPUT_DIR/core_data/groups.json" "Groups data"
check_file "$OUTPUT_DIR/core_data/apps.json" "Applications data"
check_file "$OUTPUT_DIR/core_data/authenticators.json" "Authenticators data"
echo ""

echo "--- Analysis Files ---"
check_file "$OUTPUT_DIR/analysis/session_analysis.json" "Session analysis"
check_file "$OUTPUT_DIR/analysis/password_policy_analysis.json" "Password policy analysis"
check_file "$OUTPUT_DIR/analysis/monitoring_analysis.json" "Monitoring analysis"
echo ""
```
Validation script checks for non-existent files.
The script checks for files that don't match the actual filenames saved by engine.py:
| Script checks | Actual filename saved |
|---|---|
| `users.json` | `all_users.json` |
| `apps.json` | `applications.json` |
| `session_analysis.json` | Not saved |
| `monitoring_analysis.json` | Not saved (individual files: `active_event_hooks.json`, `active_log_streams.json`) |
This will cause the validation script to report false FAILures. Align the filenames with what _save_raw_data() and _save_analysis() actually write in engine.py.
🔧 Proposed fix for core_data checks

```diff
 echo "--- Core Data Files ---"
-check_file "$OUTPUT_DIR/core_data/users.json" "Users data"
+check_file "$OUTPUT_DIR/core_data/all_users.json" "Users data"
 check_file "$OUTPUT_DIR/core_data/groups.json" "Groups data"
-check_file "$OUTPUT_DIR/core_data/apps.json" "Applications data"
+check_file "$OUTPUT_DIR/core_data/applications.json" "Applications data"
 check_file "$OUTPUT_DIR/core_data/authenticators.json" "Authenticators data"
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/okta_inspector/reporters/validation.py` around lines 110 - 122, The
validation script is checking filenames that don't match what engine.py writes;
update the check_file calls in validation.py to match _save_raw_data and
_save_analysis outputs: replace "$OUTPUT_DIR/core_data/users.json" with
"$OUTPUT_DIR/core_data/all_users.json" and "$OUTPUT_DIR/core_data/apps.json"
with "$OUTPUT_DIR/core_data/applications.json"; remove or stop checking for
"session_analysis.json" (engine.py doesn't write it) and instead add checks for
the actual analysis files written by _save_analysis—e.g.
"$OUTPUT_DIR/analysis/active_event_hooks.json" and
"$OUTPUT_DIR/analysis/active_log_streams.json" (and keep existing
password_policy_analysis.json if _save_analysis writes it); ensure you update
the descriptive labels passed to check_file accordingly so messages match the
real files.
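A way to keep the engine's output filenames and the generated validation script from drifting apart is a single manifest that both sides consume. This is a sketch under assumed names (`CORE_DATA_FILES`, `render_checks` are hypothetical, not part of the package):

```python
# Hypothetical manifest: one source of truth for expected output files.
# The engine would save using these names; the validation reporter would
# generate its check_file lines from the same dict.
CORE_DATA_FILES: dict[str, str] = {
    "all_users.json": "Users data",
    "groups.json": "Groups data",
    "applications.json": "Applications data",
    "authenticators.json": "Authenticators data",
}


def render_checks(output_dir_var: str = "$OUTPUT_DIR") -> str:
    """Emit the shell check_file block for the validation script."""
    lines = ['echo "--- Core Data Files ---"']
    for fname, label in CORE_DATA_FILES.items():
        lines.append(f'check_file "{output_dir_var}/core_data/{fname}" "{label}"')
    return "\n".join(lines)
```

With this, renaming a file in the engine automatically updates the validation script in the next run.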
```bash
echo "--- Analysis Files ---"
check_file "$OUTPUT_DIR/analysis/session_analysis.json" "Session analysis"
check_file "$OUTPUT_DIR/analysis/password_policy_analysis.json" "Password policy analysis"
check_file "$OUTPUT_DIR/analysis/monitoring_analysis.json" "Monitoring analysis"
echo ""

# --- Session analysis validation ---
echo "--- Session Configuration ---"
SESSION_FILE="$OUTPUT_DIR/analysis/session_analysis.json"
if [[ -f "$SESSION_FILE" ]]; then
    POLICY_COUNT=$(jq 'length' "$SESSION_FILE" 2>/dev/null || echo 0)
    echo "  Session policies found: $POLICY_COUNT"

    if [[ "$POLICY_COUNT" -gt 0 ]]; then
        jq -r '.[] | "  Policy: \(.policy_name) (ID: \(.policy_id))"' "$SESSION_FILE" 2>/dev/null || true
        jq -r '.[] | .rules[]? | "    Rule: \(.name) | Idle: \(.idle_timeout_minutes)m | Lifetime: \(.lifetime_minutes)m | Persistent: \(.persistent_cookies)"' "$SESSION_FILE" 2>/dev/null || true
    fi
fi
echo ""
```
Session analysis validation references non-existent file.
The script references session_analysis.json and monitoring_analysis.json, but engine.py._save_analysis() does not create these files. It saves:
- `password_policy_analysis.json`
- `inactive_users.json`
- `authenticator_analysis.json`
- `active_event_hooks.json`, `active_log_streams.json`, `log_event_summary.json`
Either update the script to check for files that actually exist, or add session/monitoring analysis aggregation to engine.py._save_analysis().
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/okta_inspector/reporters/validation.py` around lines 118 - 137, The
validation script references non-existent analysis files (SESSION_FILE /
session_analysis.json and monitoring_analysis.json) while
engine.py._save_analysis() only writes password_policy_analysis.json,
inactive_users.json, authenticator_analysis.json, active_event_hooks.json,
active_log_streams.json, and log_event_summary.json; fix by either updating the
reporter in src/okta_inspector/reporters/validation.py to check the actual files
produced (replace SESSION_FILE and monitoring checks with the real filenames
like password_policy_analysis.json, inactive_users.json,
authenticator_analysis.json, active_event_hooks.json, active_log_streams.json,
log_event_summary.json and corresponding jq output blocks) or add
session/monitoring aggregation and file writes to engine.py._save_analysis()
(implement creation of session_analysis.json and monitoring_analysis.json) so
they exist for the SESSION_FILE and monitoring checks to read.
````python
    md = f"""# Quick Reference Guide

**Generated:** {ts}

---

## Output Directory Structure

```
okta_audit_results_<timestamp>/
|-- core_data/                           Raw Okta API data
|   |-- users.json                       All user accounts
|   |-- groups.json                      Group definitions
|   |-- apps.json                        Application integrations
|   |-- authenticators.json              Configured authenticators
|   |-- policies/                        Policy data by type
|   |-- network_zones.json               Network zone definitions
|   +-- ...
|
|-- analysis/                            Processed analysis results
|   |-- session_analysis.json            Session timeout settings
|   |-- password_policy_analysis.json    Password policy evaluation
|   |-- authenticator_analysis.json      MFA and authenticator review
|   |-- monitoring_analysis.json         Log streams and event hooks
|   +-- user_analysis.json               User status breakdown
|
|-- compliance/                          Framework-specific reports
|   |-- executive_summary.md             High-level findings overview
|   |-- unified_compliance_matrix.md     Cross-framework mapping
|   |-- fedramp/                         FedRAMP FIPS analysis
|   |-- disa_stig/                       DISA STIG checklist
|   |-- irap/                            IRAP and Essential Eight
|   |-- ismap/                           ISMAP ISO 27001 assessment
|   |-- soc2/                            SOC 2 Trust Services report
|   |-- pci_dss/                         PCI-DSS Req. 7 & 8
|   +-- cmmc/                            CMMC 2.0 assessment
|
|-- validate_compliance.sh               Automated validation script
+-- QUICK_REFERENCE.md                   This file
```
````
QUICK_REFERENCE.md documents non-existent files.
The directory structure documentation references files that don't exist:
- `users.json` → should be `all_users.json`
- `apps.json` → should be `applications.json`
- `session_analysis.json` → not created
- `monitoring_analysis.json` → not created
- `user_analysis.json` → not created (individual status files are created instead)
Update the documentation to reflect the actual output structure.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/okta_inspector/reporters/validation.py` around lines 213 - 252, Update
the QUICK_REFERENCE.md content built in the md = f"""... block so the documented
filenames match actual outputs: change users.json to all_users.json, apps.json
to applications.json, remove the non-created session_analysis.json and
monitoring_analysis.json entries, and replace the user_analysis.json line with a
note that individual user status files are created (e.g., per-user status files)
under analysis; ensure the final QUICK_REFERENCE.md entry remains listed. Locate
the multi-line string in src/okta_inspector/reporters/validation.py and edit
those specific filename lines so the generated documentation reflects the real
output structure.
Pull request overview
This PR rewrites the Okta audit tool into a modular Python 3.12+ package (okta_inspector) with a registry-based analyzer/reporter architecture, adds CMMC 2.0 checks, and introduces a new CLI plus structured output/report generation.
Changes:
- Introduces a new `src/okta_inspector/` package (client, collector, engine, models, analyzers, reporters) and a new CLI entrypoint.
- Adds CMMC 2.0 analyzer + report, and expands multi-framework reporting (executive, matrix, framework-specific reports, validation assets).
- Adds pytest fixtures and unit tests for models and shared analyzer utilities.
Reviewed changes
Copilot reviewed 34 out of 39 changed files in this pull request and generated 19 comments.
| File | Description |
|---|---|
| tests/\_\_init\_\_.py | Marks tests as a package. |
| tests/conftest.py | Adds shared OktaData fixtures for analyzer tests. |
| tests/test_analyzers/\_\_init\_\_.py | Marks analyzer tests as a package. |
| tests/test_analyzers/test_common.py | Adds tests for shared analyzer functions. |
| tests/test_models.py | Adds basic model tests (ComplianceFinding/OktaData/intermediate analyses). |
| tests/test_reporters/\_\_init\_\_.py | Marks reporter tests as a package. |
| src/okta_inspector/\_\_init\_\_.py | Defines package version. |
| src/okta_inspector/\_\_main\_\_.py | Enables `python -m okta_inspector`. |
| src/okta_inspector/cli.py | New CLI with env var support and framework selection. |
| src/okta_inspector/client.py | New Okta API client (pagination + rate limit handling). |
| src/okta_inspector/collector.py | Collects Okta API data into an in-memory OktaData bus. |
| src/okta_inspector/engine.py | Orchestrates collect → analyze → report → archive and persists outputs. |
| src/okta_inspector/models.py | Adds dataclasses for findings, OktaData, and analysis intermediates. |
| src/okta_inspector/output.py | Adds output directory management and archive creation. |
| src/okta_inspector/analyzers/\_\_init\_\_.py | Analyzer registry + auto-discovery loader. |
| src/okta_inspector/analyzers/base.py | Defines analyzer interface. |
| src/okta_inspector/analyzers/common.py | Shared analysis helpers used across frameworks. |
| src/okta_inspector/analyzers/cmmc.py | Implements CMMC 2.0 Level 2 analyzer. |
| src/okta_inspector/analyzers/fedramp.py | Implements FedRAMP analyzer. |
| src/okta_inspector/analyzers/irap.py | Implements IRAP analyzer. |
| src/okta_inspector/analyzers/ismap.py | Implements ISMAP analyzer. |
| src/okta_inspector/analyzers/pci_dss.py | Implements PCI-DSS analyzer. |
| src/okta_inspector/analyzers/soc2.py | Implements SOC 2 analyzer. |
| src/okta_inspector/analyzers/stig.py | Implements DISA STIG analyzer. |
| src/okta_inspector/reporters/\_\_init\_\_.py | Reporter registry + auto-discovery loader. |
| src/okta_inspector/reporters/base.py | Defines reporter interface. |
| src/okta_inspector/reporters/cmmc.py | Generates CMMC report and SPRS estimate. |
| src/okta_inspector/reporters/executive.py | Generates executive summary report. |
| src/okta_inspector/reporters/fedramp.py | Generates FedRAMP/FIPS report. |
| src/okta_inspector/reporters/irap.py | Generates IRAP report + Essential Eight assessment. |
| src/okta_inspector/reporters/ismap.py | Generates ISMAP report. |
| src/okta_inspector/reporters/matrix.py | Generates unified compliance matrix report. |
| src/okta_inspector/reporters/pci_dss.py | Generates PCI-DSS report. |
| src/okta_inspector/reporters/soc2.py | Generates SOC 2 report. |
| src/okta_inspector/reporters/stig.py | Generates DISA STIG checklist report. |
| src/okta_inspector/reporters/validation.py | Generates validation script and quick reference guide. |
| pyproject.toml | Updates packaging to hatchling + Python 3.12+ and defines CLI scripts. |
| .gitignore | Expands ignores for Python/uv/test artifacts and archives. |
```python
from okta_inspector import __version__
from okta_inspector.analyzers import available_frameworks


def main(argv: list[str] | None = None) -> None:
    parser = argparse.ArgumentParser(
        prog="okta-inspector",
        description=f"Okta Multi-Framework Compliance Audit Tool v{__version__}",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""\
examples:
  okta-inspector -d your-org.okta.com -t YOUR_API_TOKEN
  okta-inspector -d your-org.okta.com -t YOUR_API_TOKEN --frameworks stig,cmmc
  OKTA_DOMAIN=your-org.okta.com OKTA_API_TOKEN=xxx okta-inspector
""",
    )

    parser.add_argument(
        "-d",
        "--domain",
        default=os.environ.get("OKTA_DOMAIN"),
        help="Okta domain (e.g. your-org.okta.com). Env: OKTA_DOMAIN",
    )
    parser.add_argument(
        "-t",
        "--token",
        default=os.environ.get("OKTA_API_TOKEN"),
        help="Okta API token. Env: OKTA_API_TOKEN",
    )
    parser.add_argument(
        "-o",
        "--output-dir",
        help="Custom output directory (default: timestamped)",
    )
    parser.add_argument(
        "-p",
        "--page-size",
        type=int,
        default=200,
        help="Items per API page (default: 200)",
    )
    parser.add_argument(
        "--max-pages",
        type=int,
        default=10,
        help="Max pages to retrieve per endpoint (default: 10)",
    )
    parser.add_argument(
        "--oauth",
        action="store_true",
        help="Token is OAuth 2.0 Bearer (default: SSWS)",
    )
    parser.add_argument(
        "--frameworks",
        help=f"Comma-separated list of frameworks to run (default: all). Available: {', '.join(available_frameworks()) or 'loading...'}",
    )
```
--frameworks help text calls available_frameworks() at argument-definition time, which triggers analyzer auto-imports and defeats the later “Lazy imports to keep --help fast” intent (and can slow --help/startup). Consider avoiding the dynamic call in the help string (e.g., list frameworks in epilog after parsing, or print them on demand).
```python
        self.base_dir.mkdir(exist_ok=True)
        (self.base_dir / "core_data").mkdir(exist_ok=True)
        (self.base_dir / "analysis").mkdir(exist_ok=True)
        compliance = self.base_dir / "compliance"
        compliance.mkdir(exist_ok=True)
        for sub in _COMPLIANCE_SUBDIRS:
            (compliance / sub).mkdir(exist_ok=True)
```
OutputManager._ensure_dirs() uses self.base_dir.mkdir(exist_ok=True) without parents=True. If the user passes a nested --output-dir (e.g., reports/okta/run1), directory creation will fail. Use mkdir(parents=True, exist_ok=True) for base_dir (and consider doing the same for child dirs for consistency).
Suggested change:

```diff
-        self.base_dir.mkdir(exist_ok=True)
-        (self.base_dir / "core_data").mkdir(exist_ok=True)
-        (self.base_dir / "analysis").mkdir(exist_ok=True)
-        compliance = self.base_dir / "compliance"
-        compliance.mkdir(exist_ok=True)
-        for sub in _COMPLIANCE_SUBDIRS:
-            (compliance / sub).mkdir(exist_ok=True)
+        self.base_dir.mkdir(parents=True, exist_ok=True)
+        (self.base_dir / "core_data").mkdir(parents=True, exist_ok=True)
+        (self.base_dir / "analysis").mkdir(parents=True, exist_ok=True)
+        compliance = self.base_dir / "compliance"
+        compliance.mkdir(parents=True, exist_ok=True)
+        for sub in _COMPLIANCE_SUBDIRS:
+            (compliance / sub).mkdir(parents=True, exist_ok=True)
```
```bash
# --- File existence checks ---
echo "--- Core Data Files ---"
check_file "$OUTPUT_DIR/core_data/users.json" "Users data"
check_file "$OUTPUT_DIR/core_data/groups.json" "Groups data"
check_file "$OUTPUT_DIR/core_data/apps.json" "Applications data"
check_file "$OUTPUT_DIR/core_data/authenticators.json" "Authenticators data"
echo ""
```
The generated validation script checks for core_data/users.json and core_data/apps.json, but the engine currently writes core_data/all_users.json and core_data/applications.json. As-is, the script will always report missing core files. Align the validation checks (and QUICK_REFERENCE.md) with the actual filenames, or adjust the engine output names for backward compatibility.
```bash
echo "--- Analysis Files ---"
check_file "$OUTPUT_DIR/analysis/session_analysis.json" "Session analysis"
check_file "$OUTPUT_DIR/analysis/password_policy_analysis.json" "Password policy analysis"
check_file "$OUTPUT_DIR/analysis/monitoring_analysis.json" "Monitoring analysis"
echo ""
```
The validation script expects analysis/session_analysis.json and analysis/monitoring_analysis.json, but AuditEngine._save_analysis() does not create those files (it writes active_log_streams.json, log_event_summary.json, etc.). This makes the validation output misleading and breaks the documented output structure. Either write the expected aggregate analysis files, or update the script/docs to match the produced filenames.
```python
        save(data.users, "all_users.json")
        save(data.groups, "groups.json")
        save(data.apps, "applications.json")
```
Raw output filenames are inconsistent with other parts of this PR: validation/quick-reference expect apps.json and users.json, while this code writes applications.json and all_users.json. Pick one naming scheme (ideally matching existing README/legacy output) and apply it consistently across engine, validation reporter, and docs.
Suggested change:

```diff
-        save(data.users, "all_users.json")
-        save(data.groups, "groups.json")
-        save(data.apps, "applications.json")
+        save(data.users, "users.json")
+        save(data.groups, "groups.json")
+        save(data.apps, "apps.json")
```
```python
            # Deduct based on practice point value, scaled by severity
            severity_weight = _SEVERITY_WEIGHTS.get(finding.severity, 1)
            deduction = min(point_value, severity_weight)
            score -= deduction
            reason = finding.comments.replace("|", "/").replace("\n", " ")
            if len(reason) > 80:
                reason = reason[:77] + "..."
            deductions.append(
                (practice_id, deduction, finding.severity, reason)
```
_calculate_sprs() says deductions are based on the practice’s NIST/SPRS point value, but it currently uses deduction = min(point_value, severity_weight), which can under-deduct high-weight practices when the finding severity is lower (e.g., a 5-point practice with medium severity deducts only 1). If this is meant to approximate SPRS, the deduction should be the practice point value (the 5/3/1 weight), independent of finding.severity (or update the report text to match the implemented heuristic).
Suggested change:

```diff
-            # Deduct based on practice point value, scaled by severity
-            severity_weight = _SEVERITY_WEIGHTS.get(finding.severity, 1)
-            deduction = min(point_value, severity_weight)
-            score -= deduction
-            reason = finding.comments.replace("|", "/").replace("\n", " ")
-            if len(reason) > 80:
-                reason = reason[:77] + "..."
-            deductions.append(
-                (practice_id, deduction, finding.severity, reason)
+            # Deduct based on practice point value (NIST/SPRS weight)
+            deduction = point_value
+            score -= deduction
+            reason = finding.comments.replace("|", "/").replace("\n", " ")
+            if len(reason) > 80:
+                reason = reason[:77] + "..."
+            deductions.append(
+                (practice_id, point_value, finding.severity, reason)
```
```python
        page_size: int = 200,
        max_pages: int = 10,
    ) -> None:
        self.domain = domain.rstrip("/")
```
OktaClient builds base_url as https://{domain}/api/v1 after only stripping trailing /. If callers pass a full URL (e.g., https://acme.okta.com), requests will go to https://https://acme.okta.com/.... Consider normalizing domain (strip scheme, whitespace) or accept base_url explicitly and validate input.
Suggested change:

```diff
-        self.domain = domain.rstrip("/")
+        # Normalize domain input: strip whitespace, scheme, and trailing slashes
+        clean_domain = domain.strip()
+        if clean_domain.startswith("http://"):
+            clean_domain = clean_domain[len("http://") :]
+        elif clean_domain.startswith("https://"):
+            clean_domain = clean_domain[len("https://") :]
+        self.domain = clean_domain.rstrip("/")
```
```python
            try:
                response = self._session.get(
                    url,
                    params=params if page_count == 1 else None,
                )
```
|
HTTP requests are made without an explicit timeout (requests.Session.get(...)). If the network stalls, the audit can hang indefinitely. Consider setting a reasonable default timeout (and making it configurable via CLI) for operational reliability.
```python
"| Control Area | FedRAMP | DISA STIG | IRAP (ISM) | ISMAP (ISO 27001) "
"| SOC 2 | PCI-DSS | CMMC 2.0 | Analysis File |",
"|---|---|---|---|---|---|---|---|---|",
# Session Management
"| **Session Management** | AC-11 | V-273186 | ISM-1546 | A.9.4.2 "
"| CC6.6 | 8.2.8 | AC.L2-3.1.10 | `analysis/session_analysis.json` |",
# Authentication
"| **Authentication** | IA-2 | V-273193 | ISM-0974 | A.9.4.2 "
"| CC6.1 | 8.3.1 | IA.L2-3.5.3 | `analysis/authenticator_analysis.json` |",
# Password Policy
"| **Password Policy** | IA-5 | V-273195 | ISM-0421 | A.9.2.4 "
"| CC6.1 | 8.3.6 | IA.L2-3.5.7 | `analysis/password_policy_analysis.json` |",
# Account Management
"| **Account Management** | AC-2 | V-273188 | ISM-1175 | A.9.2.1 "
"| CC6.2 | 7.1.1 | AC.L2-3.1.5 | `analysis/user_analysis.json` |",
# Monitoring
"| **Monitoring** | AU-4 | V-273202 | ISM-0407 | A.12.4.1 "
"| CC7.2 | 10.2.1 | AU.L2-3.3.1 | `analysis/monitoring_analysis.json` |",
```
The matrix references analysis artifacts that aren't currently produced by `AuditEngine._save_analysis()` (e.g., `analysis/session_analysis.json`, `analysis/monitoring_analysis.json`, `analysis/user_analysis.json`). As a result, the "Analysis File" links in this report will be broken. Either generate these aggregate analysis files in `_save_analysis()` or update the matrix to point at the filenames actually written (e.g., `inactive_users.json`, `active_log_streams.json`, etc.).
```python
def _save_analysis(self, data: OktaData, findings: list[ComplianceFinding]) -> None:
    """Persist analysis-level JSON (derived data + findings)."""
    from okta_inspector.analyzers.common import (
        analyze_authenticators,
        analyze_certificates,
        analyze_monitoring,
        analyze_password_policies,
        analyze_users,
    )

    save = self._output.save_json

    # Shared analysis artifacts
    pw = analyze_password_policies(data)
    save([asdict(p) for p in pw], "password_policy_analysis.json", "analysis")

    users = analyze_users(data)
    save(users.inactive_users, "inactive_users.json", "analysis")
    for status, user_list in users.users_by_status.items():
        save(user_list, f"users_{status}.json", "analysis")

    auths = analyze_authenticators(data)
    save([asdict(a) for a in auths], "authenticator_analysis.json", "analysis")

    certs = analyze_certificates(data)
    save(certs.cert_idps, "certificate_idps.json", "analysis")
    save(certs.cert_authenticators, "certificate_authenticators.json", "analysis")

    monitoring = analyze_monitoring(data)
    save(monitoring.active_event_hooks, "active_event_hooks.json", "analysis")
    save(monitoring.active_log_streams, "active_log_streams.json", "analysis")
    save(monitoring.log_event_summary, "log_event_summary.json", "analysis")
```
`_save_analysis()` does not generate the aggregate analysis files referenced elsewhere (README/matrix/validation), such as `session_analysis.json`, `monitoring_analysis.json`, or a unified `user_analysis.json`. Either add these outputs here (e.g., `save([asdict(s) for s in analyze_sessions(...)], ...)`, `save(asdict(monitoring), ...)`, `save(asdict(users), ...)`) or update the downstream reporters/docs to the new per-aspect filenames.
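The first option amounts to serializing each whole analysis dataclass as one aggregate JSON file instead of several per-aspect files. A self-contained sketch of that pattern (`MonitoringAnalysis` and `save_json` are illustrative stand-ins for the package's real types, not its actual API):

```python
# Illustrative aggregate-artifact write: dump an entire analysis dataclass as
# one JSON file via dataclasses.asdict, mirroring the shape the matrix links.
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class MonitoringAnalysis:
    active_event_hooks: list[dict] = field(default_factory=list)
    active_log_streams: list[dict] = field(default_factory=list)
    log_event_summary: dict = field(default_factory=dict)


def save_json(obj: object, name: str, subdir: Path) -> Path:
    """Write obj as pretty-printed JSON under subdir, creating it if needed."""
    subdir.mkdir(parents=True, exist_ok=True)
    path = subdir / name
    path.write_text(json.dumps(obj, indent=2))
    return path


out = Path(tempfile.mkdtemp()) / "analysis"
monitoring = MonitoringAnalysis(
    active_log_streams=[{"id": "ls1", "status": "ACTIVE"}]
)
# One aggregate monitoring_analysis.json instead of three per-aspect files:
written = save_json(asdict(monitoring), "monitoring_analysis.json", out)
```

The same `asdict(...)` call works for the user and session analysis dataclasses, so closing the gap is mostly a matter of three extra `save(...)` lines.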
```python
)
from okta_inspector.models import ComplianceFinding, OktaData

_FRAMEWORK = "STIG"
```
**1. STIG report filters out all findings** 🐞 Bug ✓ Correctness

`STIGAnalyzer` emits findings with `framework="STIG"`, but `STIGReporter` only indexes findings where `framework == "DISA STIG"`, so the STIG checklist report will show controls as NOT ASSESSED even when findings were generated.

Agent Prompt

### Issue description
STIGAnalyzer produces findings tagged with framework `"STIG"`, but STIGReporter filters on `"DISA STIG"`, resulting in an empty STIG report.
### Issue Context
Framework identifiers must be consistent across analyzers and reporters.
### Fix Focus Areas
- src/okta_inspector/analyzers/stig.py[16-16]
- src/okta_inspector/reporters/stig.py[66-70]
### Proposed fix
Pick one canonical framework identifier (e.g., `"DISA STIG"` or `"STIG"`) and update both the analyzer and the reporter to use it consistently. Consider centralizing framework IDs as constants to prevent drift.
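Centralizing the identifiers (as the proposed fix suggests) makes this class of mismatch structurally impossible. A sketch using a string-valued enum so both sides compare against the same constant (`Framework` is an illustrative name, not an existing okta-inspector symbol; on Python 3.12 `enum.StrEnum` would be the more idiomatic base):

```python
# Illustrative shared framework IDs: analyzer tags and reporter filters both
# reference the same enum member, so the strings can never drift apart.
from enum import Enum


class Framework(str, Enum):
    STIG = "DISA STIG"
    SOC2 = "SOC 2"
    FEDRAMP = "FedRAMP"
    CMMC = "CMMC 2.0"


# Analyzer side: tag findings with the shared constant.
findings = [
    {"framework": Framework.STIG, "status": "Open"},
    {"framework": Framework.SOC2, "status": "Open"},
]

# Reporter side: filter on the same constant (str-mixin enums also compare
# equal to their plain-string values, easing incremental migration).
stig_findings = [f for f in findings if f["framework"] == Framework.STIG]
```

The same constant set would also fix the SOC2 mismatch reported below, since `"SOC2"` vs `"SOC 2"` collapses to one `Framework.SOC2` member.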
```python
)
from okta_inspector.models import ComplianceFinding, OktaData

_FRAMEWORK = "SOC2"
```
**2. SOC 2 report filters out all findings** 🐞 Bug ✓ Correctness

`SOC2Analyzer` emits findings with `framework="SOC2"`, but `SOC2Reporter` filters on `framework == "SOC 2"`, so the SOC 2 report will omit all SOC2 findings.

Agent Prompt

### Issue description
The SOC2 analyzer uses framework id `"SOC2"` while the SOC2 reporter expects `"SOC 2"`.
### Issue Context
Reporter selection depends on an exact string match.
### Fix Focus Areas
- src/okta_inspector/analyzers/soc2.py[13-13]
- src/okta_inspector/reporters/soc2.py[82-84]
### Proposed fix
Use a single canonical value for the SOC2 framework across both modules (either `"SOC2"` everywhere or `"SOC 2"` everywhere). Prefer central constants shared by analyzers and reporters to avoid future mismatches.
```python
def _save_analysis(self, data: OktaData, findings: list[ComplianceFinding]) -> None:
    """Persist analysis-level JSON (derived data + findings).""""
    from okta_inspector.analyzers.common import (
        analyze_authenticators,
        analyze_certificates,
        analyze_monitoring,
        analyze_password_policies,
        analyze_users,
    )

    save = self._output.save_json

    # Shared analysis artifacts
    pw = analyze_password_policies(data)
    save([asdict(p) for p in pw], "password_policy_analysis.json", "analysis")

    users = analyze_users(data)
    save(users.inactive_users, "inactive_users.json", "analysis")
    for status, user_list in users.users_by_status.items():
        save(user_list, f"users_{status}.json", "analysis")

    auths = analyze_authenticators(data)
    save([asdict(a) for a in auths], "authenticator_analysis.json", "analysis")

    certs = analyze_certificates(data)
    save(certs.cert_idps, "certificate_idps.json", "analysis")
    save(certs.cert_authenticators, "certificate_authenticators.json", "analysis")

    monitoring = analyze_monitoring(data)
    save(monitoring.active_event_hooks, "active_event_hooks.json", "analysis")
    save(monitoring.active_log_streams, "active_log_streams.json", "analysis")
    save(monitoring.log_event_summary, "log_event_summary.json", "analysis")
```
**3. Validation script fails immediately** 🐞 Bug ✓ Correctness

`ValidationReporter`'s generated script runs with `set -e` and calls `check_file` for `analysis/session_analysis.json`, but `AuditEngine` never writes `session_analysis.json`, so the script exits early and cannot validate the remaining artifacts.

Agent Prompt

### Issue description
The generated validate_compliance.sh checks for `analysis/session_analysis.json` and, because of `set -e`, exits when it is missing. The engine never generates that file.
### Issue Context
`okta_inspector.analyzers.common.analyze_sessions()` exists but is not used in `AuditEngine._save_analysis()`.
### Fix Focus Areas
- src/okta_inspector/engine.py[126-157]
- src/okta_inspector/analyzers/common.py[31-45]
- src/okta_inspector/reporters/validation.py[32-63]
- src/okta_inspector/reporters/validation.py[110-122]
### Proposed fix
Option A (preferred):
- Call `analyze_sessions(data)` in `_save_analysis()` and write `analysis/session_analysis.json` in the schema the validation script expects.
Option B:
- Change the validation script so it does not hard-fail on missing optional analysis files (e.g., call `check_file ... || true`, or drop `set -e` for the existence-check section).
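In the spirit of Option B, the existence checks can collect every missing artifact and report them all instead of aborting on the first, as `set -e` does. A self-contained sketch (`check_artifacts` and the `EXPECTED` list are illustrative; the real script's file set comes from the reporters):

```python
# Illustrative tolerant artifact check: return all missing expected files
# rather than failing fast, so one absent optional file cannot mask the rest.
import tempfile
from pathlib import Path

EXPECTED = [
    "analysis/password_policy_analysis.json",
    "analysis/authenticator_analysis.json",
    "analysis/session_analysis.json",  # optional: engine may not emit this yet
]


def check_artifacts(base: Path, expected: list[str]) -> list[str]:
    """Return the relative paths in `expected` that are missing under base."""
    return [rel for rel in expected if not (base / rel).exists()]


base = Path(tempfile.mkdtemp())  # empty output dir for demonstration
missing = check_artifacts(base, EXPECTED)
for rel in missing:
    print(f"WARNING: missing artifact: {rel}")
```

An exit code can still be derived at the end (`sys.exit(1 if missing else 0)`), preserving CI behavior while producing a complete report.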
```python
def _collect_policies(self, data: OktaData) -> None:
    logger.info("Retrieving all policy types...")
    policy_map: list[tuple[str, str]] = [
        ("OKTA_SIGN_ON", "sign_on_policies"),
        ("PASSWORD", "password_policies"),
        ("MFA_ENROLL", "mfa_enrollment_policies"),
        ("ACCESS_POLICY", "access_policies"),
        ("USER_LIFECYCLE", "user_lifecycle_policies"),
    ]
    for policy_type, attr in policy_map:
        result = self._client.get(f"/policies?type={policy_type}")
        if result and isinstance(result, list):
            setattr(data, attr, result)

            # Fetch password policy rules
            if policy_type == "PASSWORD":
                for policy in result:
                    pid = policy.get("id")
                    if pid:
                        rules = self._client.get(f"/policies/{pid}/rules")
                        if rules and isinstance(rules, list):
                            data.password_policy_rules[pid] = rules
```
**4. Access policy rules not collected** 🐞 Bug ✓ Correctness

The collector stores ACCESS_POLICY results from `/policies?type=ACCESS_POLICY` but does not retrieve their rules, while multiple analyzers read `access_policies[*]._embedded.rules`; session-related checks will therefore come back empty or "manual" regardless of the actual configuration.

Agent Prompt

### Issue description
Session/rule checks across frameworks depend on access policy rules, but the collector does not fetch them.
### Issue Context
Analyzers expect `policy.get('_embedded', {}).get('rules', [])` to exist in `OktaData.access_policies`.
### Fix Focus Areas
- src/okta_inspector/collector.py[40-61]
- src/okta_inspector/analyzers/stig.py[38-56]
- src/okta_inspector/analyzers/fedramp.py[43-58]
- src/okta_inspector/analyzers/pci_dss.py[38-51]
- src/okta_inspector/analyzers/soc2.py[37-51]
- src/okta_inspector/analyzers/irap.py[37-55]
- src/okta_inspector/analyzers/cmmc.py[43-60]
### Proposed fix
Enhance `_collect_policies()` to also fetch rules for ACCESS_POLICY policies (e.g., call `/policies/{id}/rules`) and either:
- attach them under a consistent field (e.g., add `_embedded: {rules: ...}`), or
- add a dedicated `access_policy_rules: dict[policy_id, rules]` to `OktaData` and update the analyzers to use it.
Add or update tests to ensure a sample access policy includes rules and the session analyzers find them.
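The first variant of the proposed fix can be sketched as a small helper that fetches each policy's rules and embeds them where the analyzers already look. The helper name and the stub client are illustrative; only the `/policies/{id}/rules` path and the `_embedded.rules` shape come from the review above:

```python
# Illustrative sketch: fetch rules for each access policy and attach them
# under _embedded.rules, matching what the analyzers expect to read.
from typing import Any


def attach_access_policy_rules(client: Any, policies: list[dict[str, Any]]) -> None:
    """Fetch /policies/{id}/rules for each policy and embed the result."""
    for policy in policies:
        pid = policy.get("id")
        if not pid:
            continue
        rules = client.get(f"/policies/{pid}/rules")
        if rules and isinstance(rules, list):
            # setdefault keeps any _embedded content the API already returned.
            policy.setdefault("_embedded", {})["rules"] = rules


class _StubClient:
    """Fake client returning one rule per policy, for illustration only."""

    def get(self, path: str) -> list[dict[str, Any]]:
        return [{"name": "Default Rule", "path": path}]


policies = [{"id": "pol1"}, {"id": "pol2"}]
attach_access_policy_rules(_StubClient(), policies)
```

Embedding under `_embedded` keeps the analyzers untouched; the alternative `access_policy_rules` dict is cleaner typed but requires updating all seven analyzers.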
```python
def _ensure_dirs(self) -> None:
    self.base_dir.mkdir(exist_ok=True)
    (self.base_dir / "core_data").mkdir(exist_ok=True)
    (self.base_dir / "analysis").mkdir(exist_ok=True)
```
**5. Nested output dir may crash** 🐞 Bug ⛯ Reliability

`OutputManager._ensure_dirs` calls `self.base_dir.mkdir(exist_ok=True)` without `parents=True`, so `--output-dir some/missing/dir` raises `FileNotFoundError` and aborts the run.

Agent Prompt

### Issue description
Output directory initialization fails for nested paths.
### Issue Context
Users may reasonably pass nested `--output-dir` values.
### Fix Focus Areas
- src/okta_inspector/output.py[42-49]
### Proposed fix
Change `self.base_dir.mkdir(exist_ok=True)` to `self.base_dir.mkdir(parents=True, exist_ok=True)` (and consider doing the same for the subdirectories for consistency).
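For completeness, the fixed helper can be sketched as a standalone function; `ensure_dirs` is an illustrative free-function form of the method above, exercised against a deliberately nested, non-existent path:

```python
# Illustrative fix: parents=True creates the whole output tree, so a nested
# --output-dir like "runs/2024/okta" works on a fresh checkout.
import tempfile
from pathlib import Path


def ensure_dirs(base_dir: Path) -> None:
    base_dir.mkdir(parents=True, exist_ok=True)
    for sub in ("core_data", "analysis"):
        (base_dir / sub).mkdir(parents=True, exist_ok=True)


base = Path(tempfile.mkdtemp()) / "some" / "missing" / "dir"
ensure_dirs(base)  # no FileNotFoundError even though the parents were absent
```

`exist_ok=True` also makes the call idempotent, so re-running an audit into the same directory is safe.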
Summary
- Decompose the `okta-audit.py` monolith into a proper `src/okta_inspector/` package (30 modules)
- `uv`/hatchling, modern type hints, auto-discovery registry pattern
- In-memory `OktaData` bus replaces fragile file-read coupling between analysis phases
- New `--frameworks` flag for selective execution, env var support (`OKTA_DOMAIN`/`OKTA_API_TOKEN`)

Architecture
- Core: `client`, `collector`, `engine`, `output`
- `models`: `ComplianceFinding`, `OktaData`, typed intermediates
- Analyzers: `stig`, `fedramp`, `irap`, `ismap`, `soc2`, `pci_dss`, `cmmc`
- Reporters: `executive`, `matrix`, `validation`, + 7 framework reporters
- Shared analysis helpers in `analyzers/common`

Frameworks (7)
Test plan
- `uv sync` installs cleanly
- `uv run okta-inspector --help` shows all 7 frameworks
- `uv run okta-audit --version` backward compat works
- `uv run pytest tests/ -v` — 11 tests pass

Summary by CodeRabbit
Release Notes
New Features
- Introduced `okta-inspector`, a multi-framework compliance audit tool supporting CMMC 2.0, FedRAMP, IRAP, ISMAP, PCI-DSS, SOC 2, and DISA STIG frameworks.
- Two CLI entry points (`okta-inspector` and `okta-audit`) to run compliance audits against Okta environments.

Tests