…nd cache operations
…tage Dockerfile: Refactor Dockerfile to use multi-stage builds for backend and frontend, add Nginx for serving frontend with SSL support, expose ports 80/443/8443, and update start script. Expand observability roadmap section. Enhance BatchMetricsRecorder with cache update for batch operation consistency.
…and docs: Enhance Dockerfile with ARG FRONTEND_DIST_PATH for customizable frontend build output, add build-time validation to fail fast on missing assets, introduce CERT_CN for self-signed certs, validate nginx config, and update comments for better context. Refine SQL query normalization for consistent whitespace handling to improve matching accuracy. Update README with Docker build arguments guide for deployment flexibility across environments.
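The whitespace normalization mentioned in the commit summary can be sketched as follows (an illustrative helper, not the project's actual implementation):

```typescript
// Collapse runs of whitespace and trim, so logically identical SQL
// strings compare equal regardless of original formatting.
function normalizeSql(sql: string): string {
  return sql.replace(/\s+/g, ' ').trim();
}
```

This kind of normalization is what makes "consistent whitespace handling" improve matching accuracy: two queries that differ only in indentation or line breaks normalize to the same key.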
Walkthrough

Adds a full observability and analytics stack: multi-stage Docker build (frontend + backend), nginx/start scripts with certs, frontend Analytics module and pages/services, backend index/slow-query analyzers and persistence, many DB migrations/PL/SQL updates, tracing/latency tooling, logging/logstash, and large docs and build updates.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Scheduler
    participant Analyzer as IndexEffectivenessAnalyzer
    participant DB as OracleDB
    participant Audit as audit_index_analysis
    participant Metrics as Prometheus
    Scheduler->>Analyzer: analyzeIndexEffectiveness()
    Analyzer->>DB: query index usage (V$SEGMENT_STATISTICS / V$OBJECT_USAGE)
    DB-->>Analyzer: usage rows
    Analyzer->>Analyzer: calculateUtilityScore() / determineAction()
    Analyzer->>Audit: INSERT analysis row
    Audit-->>Analyzer: persisted
    Analyzer->>Metrics: export gauges/counters
```
```mermaid
sequenceDiagram
    participant App as SpringApp
    participant QAnalyzer as QueryPlanAnalyzer
    participant Slow as SlowQueryAnalyzer
    participant DB as OracleDB
    participant Persist as RecommendationPersistenceService
    participant Audit as audit_query_recommendations
    participant Metrics as Prometheus
    App->>QAnalyzer: processQuery(sql)
    QAnalyzer->>Slow: analyzeQuery(sql, execTime)
    alt execTime > threshold
        Slow->>DB: extract execution plan (V$SQL_PLAN / v$sql)
        DB-->>Slow: plan data
        Slow->>Slow: analyzePlanForBottlenecks()
        alt recommendations enabled
            Slow->>Persist: persistRecommendation(...)
            Persist->>Audit: INSERT recommendation
            Audit-->>Persist: persisted
            Slow->>Metrics: increment recommendation counters
        end
    end
```
Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60–90 minutes

Areas needing extra attention:
Possibly related PRs
Poem
Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 12
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
sql/tables/users.sql (1)
5-17: Add CHECK constraints to V500 migration to prevent data integrity divergence

Verification confirms that a Flyway migration does exist (V500__user_settings_profile.sql) that adds the new columns (avatar_key, account_status, account_deactivated_at) to existing databases. However, the migration is incomplete: it omits the two CHECK constraints that enforce account status validity and deactivation logic.

Result: New databases (via sql/tables/users.sql) will have constraints; existing databases updated via V500 will not. This creates a critical data integrity gap.

Add the following to db/migration/V500__user_settings_profile.sql after the column additions:

```sql
BEGIN
    EXECUTE IMMEDIATE 'ALTER TABLE users ADD CONSTRAINT chk_account_status_values
        CHECK (account_status IN (''ACTIVE'', ''INACTIVE'', ''DEACTIVATED''))';
EXCEPTION
    WHEN OTHERS THEN
        IF SQLCODE = -2261 THEN
            NULL; -- Constraint already exists
        ELSE
            RAISE;
        END IF;
END;
/

BEGIN
    EXECUTE IMMEDIATE 'ALTER TABLE users ADD CONSTRAINT chk_deactivated_timestamp
        CHECK ((account_status = ''DEACTIVATED'' AND account_deactivated_at IS NOT NULL)
            OR (account_status IN (''ACTIVE'', ''INACTIVE'') AND account_deactivated_at IS NULL))';
EXCEPTION
    WHEN OTHERS THEN
        IF SQLCODE = -2261 THEN
            NULL; -- Constraint already exists
        ELSE
            RAISE;
        END IF;
END;
/
```
🟡 Minor comments (12)
build.gradle-62-68 (1)
62-68: Add BOM consistency comment for reactor-test.
Spring Boot 3.5.7 manages reactor-test through the Project Reactor BOM 2025.0.0, which also manages reactor-netty. Since both artifacts come from the same Reactor BOM, which groups versions meant to work together, there are no version conflicts.
For consistency with the existing reactor-netty comment, add a similar note to line 127 documenting that reactor-test is BOM-managed:
```groovy
implementation 'io.projectreactor:reactor-test' // BOM-managed via reactor-bom 2025.0.0
```

src/main/java/com/rcs/ssf/tracing/LatencyAnalyzer.java-213-228 (1)
213-228: Update Javadoc or implement missing layer handlers for "database" and "cache"

The mismatch between documented examples and actual implementation is confirmed. The Javadoc mentions "database" and "cache" as valid layer names, but the switch statement only handles http, graphql, service, dao, and jdbc. Passing "database" or "cache" would silently discard measurements.

Either align the Javadoc to document only the supported layers, or add proper mappings (e.g., "database" → dao/jdbc, "cache" → a dedicated field) and corresponding fields to LayerLatencyStats.
67-68: Fix SLF4J placeholder usage so debug logs actually show throughput and memory usage

The debug messages currently use {:.2f} / {:.1f} inside {} which SLF4J does not interpret as placeholders, so the numeric arguments are effectively ignored or mis-bound:

- Line 67: "throughput={:.2f}/sec" ignores the throughput argument; the literal {:.2f} is logged.
- Line 94: same issue as above.
- Line 144: "Recorded memory pressure for operation '{}': {:.1f}%" logs the literal {:.1f} and drops usagePercent.

Consider simplifying to plain {} placeholders so the values are actually logged:

```diff
-log.debug("Recorded batch metrics for operation '{}': processed={}, failed={}, throughput={:.2f}/sec",
-        operationName, processedItems, failedItems, throughput);
+log.debug("Recorded batch metrics for operation '{}': processed={}, failed={}, throughput={}/sec",
+        operationName, processedItems, failedItems, throughput);

-log.debug("Recorded individual operation metrics for '{}': processed={}, failed={}, throughput={:.2f}/sec",
-        operationName, processedItems, failedItems, throughput);
+log.debug("Recorded individual operation metrics for '{}': processed={}, failed={}, throughput={}/sec",
+        operationName, processedItems, failedItems, throughput);

-log.debug("Recorded memory pressure for operation '{}': {:.1f}%", operationName, usagePercent);
+log.debug("Recorded memory pressure for operation '{}': {}%", operationName, usagePercent);
```

If you still want fixed decimal precision, you can wrap the numeric arguments with String.format, but the main fix is to revert to valid {} placeholders.

Also applies to: 94-95, 144-145
frontend/src/app/features/analytics/services/analytics.service.ts-74-81 (1)
74-81: Use environment-based config for API base URL, consistent with rest of codebase
baseUrl is hard-coded to https://localhost:8443, which breaks in other environments. The codebase already defines apiUrl in frontend/src/environments/environment.prod.ts (supporting runtime override via the API_URL env var), and other services like DynamicFormService already follow this pattern via environment.apiUrl. Align AnalyticsService with the existing approach by injecting or importing environment and using environment.apiUrl instead of a hard-coded string.

frontend/src/app/features/analytics/services/analytics.service.ts-85-89 (1)
85-89: Update the cache TTL implementation and remove redundant error handling

shareReplay(1) at lines 85-89 (and also at 102-105) caches indefinitely without a windowTime parameter, contradicting the "30 seconds" comment. Additionally, the catchError at line 88 duplicates the error handling already present in fetchDashboard() at line 193, making it redundant.

Fix by either:

- Updating the comment to reflect indefinite caching, or
- Implementing the intended 30-second TTL:

```ts
shareReplay({ bufferSize: 1, windowTime: 30000, refCount: true })
```

and removing the outer catchError since fetchDashboard() already handles errors.

frontend/src/app/features/analytics/pages/performance.component.ts-149-156 (1)
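For reference, the intended 30-second window semantics look like this as a minimal standalone TTL cache (an illustrative sketch in plain TypeScript; makeTtlCache is a hypothetical helper, not part of the service or of RxJS):

```typescript
// Minimal TTL cache illustrating "serve cached value while fresh, recompute
// after ttlMs" -- the behavior windowTime gives shareReplay.
type Entry<T> = { value: T; storedAt: number };

function makeTtlCache<T>(ttlMs: number, now: () => number = Date.now) {
  let entry: Entry<T> | null = null;
  return {
    get(compute: () => T): T {
      if (entry && now() - entry.storedAt < ttlMs) {
        return entry.value; // still fresh: reuse cached value
      }
      entry = { value: compute(), storedAt: now() }; // expired or empty: recompute
      return entry.value;
    },
  };
}
```

The injectable clock makes the expiry behavior easy to verify without real waiting, which is also how you would unit-test the RxJS variant with a TestScheduler.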
149-156: Narrow status parameter type for stronger type safety

PerformanceMetric.status is 'healthy' | 'warning' | 'critical', but getStatusColor(status: string) accepts any string. Tighten this to the union type to catch typos at compile time:

```diff
-  getStatusColor(status: string): string {
+  getStatusColor(status: PerformanceMetric['status']): string {
```

This keeps the implementation the same but makes template usage safer.
docker-start.sh-11-56 (1)
11-56: Fix exit code capture in cleanup function.

The static analysis warning is valid. Line 12 captures $? after the function has already started executing, which means it captures the exit code of the previous command in the function (likely the echo on line 14), not the exit code of the process that triggered cleanup.

The exit code should be captured at the point where cleanup is called, not inside the cleanup function itself. However, since this is used in a trap, the current approach won't work correctly.

Recommended fix: remove the exit code capture from cleanup and rely on the exit code passed to cleanup when called explicitly:

```diff
 cleanup() {
-    local exit_code=$?
-
     echo "Received termination signal, cleaning up..."
     # ... shutdown logic ...
     echo "Cleanup complete, exiting with code $exit_code"
-    exit $exit_code
+    exit ${1:-0}
 }
```

Then call cleanup with explicit exit codes:

```diff
-cleanup
+cleanup $JAVA_EXIT_CODE
```

For trap-based invocations, the exit code will default to 0, which is appropriate for graceful shutdown signals.
As per static analysis hints.
Committable suggestion skipped: line range outside the PR's diff.
src/main/resources/application.yml-126-126 (1)
126-126: Update README.md and HELP.md to align with MinIO URL default change.

The application now defaults to http://minio-server:9000, but README.md and HELP.md still document http://localhost:9000 in their environment setup examples. This creates contradictory guidance for developers.

Update both files:

- README.md: Change export MINIO_URL=http://localhost:9000 to export MINIO_URL=http://minio-server:9000 (in Quick Start section)
- HELP.md: Change export MINIO_URL=http://localhost:9000 to export MINIO_URL=http://minio-server:9000 (in Environment Checklist and environment template sections)

Alternatively, if localhost development is still expected, revert the application.yml default to http://localhost:9000 and update docker-compose to override via environment variable. Either way, documentation must match the code's actual default behavior.

ROADMAP_OBSERVABILITY.md-178-239 (1)
178-239: Add language specifiers to fenced code blocks.

Several fenced code blocks are missing language specifiers, which reduces readability in some Markdown renderers and affects accessibility.

As per coding guidelines from markdownlint, apply these changes:

For ASCII diagrams (line 178):

````diff
-```
+```text
 ┌──────────────────────────────────────────────────────────────┐
````

For pseudocode blocks (lines 245, 291, 413):

````diff
-```
+```text
 1. Query Oracle DBA_INDEX_USAGE for each user index:
````

Or if the pseudocode represents SQL or a specific language, use that language identifier instead.
Also applies to: 245-347, 291-357, 413-437
sql/master.sql-92-109 (1)
92-109: Handle potential LISTAGG overflow in validation block.

The validation block uses LISTAGG to concatenate invalid object names (line 96), but the target variable is limited to 4000 bytes. If many objects are invalid, this will raise ORA-01489.

Apply this diff to handle overflow gracefully:

```diff
 DECLARE
     v_invalid_count NUMBER := 0;
-    v_invalid_list VARCHAR2(4000) := '';
+    v_invalid_list CLOB;
 BEGIN
-    SELECT COUNT(*), LISTAGG(object_type || ' ' || object_name, ', ')
+    SELECT COUNT(*), LISTAGG(object_type || ' ' || object_name, ', ')
         WITHIN GROUP (ORDER BY object_type, object_name)
     INTO v_invalid_count, v_invalid_list
     FROM user_objects
     WHERE status = 'INVALID';

     IF v_invalid_count > 0 THEN
         DBMS_OUTPUT.PUT_LINE('WARNING: Found ' || v_invalid_count || ' invalid objects:');
-        DBMS_OUTPUT.PUT_LINE(v_invalid_list);
+        DBMS_OUTPUT.PUT_LINE(SUBSTR(v_invalid_list, 1, 32000));
         RAISE_APPLICATION_ERROR(-20999, 'Schema setup failed due to invalid objects');
     ELSE
         DBMS_OUTPUT.PUT_LINE('SUCCESS: All objects compiled successfully!');
     END IF;
 END;
```

Alternatively, use a cursor to print each invalid object separately.
src/main/java/com/rcs/ssf/metrics/SlowQueryAnalyzer.java-155-203 (1)
155-203: Inconsistent null vs. empty return values.

Line 158 returns null when sqlId is null, but line 202 returns an empty ExecutionPlan on exceptions. This inconsistency requires callers to handle both null checks and empty object checks.

Consider standardizing on one approach:

```diff
 private ExecutionPlan extractExecutionPlan(String sql) {
     String sqlId = getSqlId(sql);
     if (sqlId == null) {
-        return null;
+        return new ExecutionPlan("", null, null);
     }
```
425-437: Generic hint uses placeholder table names.

Line 431 returns a hint with generic placeholders "table1, table2" that won't work directly. Users must manually identify and substitute the correct table names.

Consider either:

- Extracting actual table names from the query/plan
- Adding a note in the rationale that table names need to be identified:

```diff
 result.put("rationale", "Nested loop join detected with potentially large inner table. " +
-        "Consider HASH JOIN hint or ensure join columns are indexed.");
+        "Consider HASH JOIN hint (substitute actual table names) or ensure join columns are indexed.");
```
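As a rough sketch of the first option, table names that follow FROM/JOIN can often be pulled out with a small regex (illustrative TypeScript only, not the analyzer's Java code; it ignores subqueries, quoted identifiers, schema prefixes, and hints):

```typescript
// Naive extraction of table names following FROM / JOIN keywords.
// Not a SQL parser: subqueries, quoted identifiers, and hints are not handled.
function extractTableNames(sql: string): string[] {
  const pattern = /\b(?:from|join)\s+([a-zA-Z_][\w$#]*)/gi;
  const names = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(sql)) !== null) {
    names.add(match[1].toLowerCase());
  }
  return [...names];
}
```

Even this crude form would let the hint say "Consider HASH JOIN on orders/customers" instead of "table1, table2"; anything more robust should reuse the object names already present in the V$SQL_PLAN rows.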
🧹 Nitpick comments (35)
src/main/java/com/rcs/ssf/tracing/LatencyAnalyzer.java (3)
12-54: Docs mention behaviors/metrics that aren't implemented or are only stubbed

The class-level Javadoc promises:

- span collection from OpenTelemetry,
- tracing.latency_attribution metrics,
- real attribution-based analysis,

while the implementation currently:

- only pre-registers tracing.latency_by_layer timers for a fixed resolver list,
- never updates those timers or emits a tracing.latency_attribution metric,
- relies solely on latencyStats populated externally.

This is fine as a scaffold, but the docs should either be tightened to describe the current behavior, or the missing pieces (span collection + attribution metric emission) implemented, to avoid confusion for future maintainers and operators.
Also applies to: 95-113
217-240: Eviction strategy is arbitrary; consider making intent explicit

Two minor points in recordLayerLatency:

Per-resolver aggregation:

- Each call overwrites the layer latency (stats.httpLatencyMs = latencyMs) rather than accumulating.
- Combined with stats.totalLatencyMs = ..., what you store is effectively the latest observation per resolver, not a windowed aggregate.
- If that's intentional (e.g., "last seen latency per resolver"), a short comment would help clarify and avoid future "why not sum?" confusion.

Eviction policy:

- if (latencyStats.size() > MAX_TRACKED_SPANS) then latencyStats.remove(latencyStats.keySet().iterator().next()) removes an arbitrary resolver (CHM is not ordered).
- This is OK as a simple cap, but it's not truly "oldest". If you care about which entries are dropped (e.g., least recently updated), consider tracking timestamps or using an LRU-like strategy.

Neither is a blocker, but documenting the intended semantics here will make the behavior much clearer to future readers.
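To make the cap deterministic, LRU-style eviction can be sketched as follows (illustrative TypeScript, not the analyzer's Java code; in Java, a LinkedHashMap with accessOrder=true and removeEldestEntry gives the same behavior):

```typescript
// Minimal LRU cap: inserting beyond capacity evicts the least recently used key.
// JS/TS Map preserves insertion order, so re-inserting on access moves a key
// to the most-recent (back) position.
class LruCap<K, V> {
  private map = new Map<K, V>();
  constructor(private capacity: number) {}

  set(key: K, value: V): void {
    if (this.map.has(key)) this.map.delete(key); // refresh position
    this.map.set(key, value);
    if (this.map.size > this.capacity) {
      const oldest = this.map.keys().next().value as K; // front = least recent
      this.map.delete(oldest);
    }
  }

  get(key: K): V | undefined {
    const value = this.map.get(key);
    if (value !== undefined) {
      this.map.delete(key); // touch: move to most-recent position
      this.map.set(key, value);
    }
    return value;
  }

  has(key: K): boolean {
    return this.map.has(key);
  }
}
```

Note that a plain LinkedHashMap-style structure is not thread-safe, so the Java version would need external synchronization if it replaces the current ConcurrentHashMap.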
71-93: Scheduled analysis logic is reasonable, but consider one-time metric registration

The scheduled analyzeLatencyByLayer + detectAndAlertBottlenecks flow is coherent, and the bottleneck alert counter looks good.

One small improvement: initializeMetricsForLayers() is invoked on every schedule tick, but in Micrometer registering the same Timer (same name + tags) is idempotent; you only need to do this once (e.g., constructor or @PostConstruct). Rebuilding them every 5 minutes just adds overhead.

If you keep this as a stub until you hook in actual span collection, a short comment noting that could prevent someone from trying to optimize the wrong part later.
Also applies to: 170-201
src/main/java/com/rcs/ssf/metrics/BatchMetricsRecorder.java (1)
90-93: Good that individual operations now feed the same aggregated metrics; consider symmetry for throughput gauges

Routing individual-operation stats through the same BatchMetrics instance as batch completions ensures totals, error rate, and averages reflect all traffic, not just batched runs. If you want observability parity with recordBatchCompletion, you might also expose a dedicated throughput gauge for this path (e.g., a separate metric name or tag for "individual" vs "batch") to keep the external metrics story symmetrical.

frontend/src/app/features/analytics/analytics.component.ts (1)
35-40: Remove redundant manual navigation; route config already handles the redirect

The route configuration correctly handles /analytics → /analytics/dashboard via the child empty route with pathMatch: 'full' and redirectTo: 'dashboard'. The ngOnInit manual navigation at lines 35-40 is redundant and can be safely removed. This simplifies the component and eliminates the risk of duplicate navigation events.

frontend/src/app/features/analytics/pages/dashboard.component.ts (1)
286-299: Error handling pattern misalignment between service and component

The error handler in fetchDashboardData's tap({ error }) won't execute because AnalyticsService.getDashboard() and refreshDashboard() both use catchError(() => of(getDefaultDashboard())) at the service level, converting errors into successful responses.

If you want the UI to surface real HTTP failures, consider either removing the catchError from the service (and handling defaults in the component), or exposing an explicit error signal from the service instead of silently returning defaults. This aligns error handling with Angular best practices: centralize transport-level concerns in services, keep UI-specific error reactions in components.

Regarding the echarts [options] with null: ngx-echarts is designed to accept null for its [options] input and handles it safely without rendering when null, so no guard is needed.

frontend/src/app/features/analytics/pages/reports.component.ts (4)
202-217: Consider using finalize to centralize the isLoading reset

loadReports correctly guards the subscription with takeUntil, but isLoading is manually reset in both success and error handlers. Using RxJS finalize would reduce duplication and make it harder to forget a branch if this grows:

```ts
this.analyticsService.getReports()
  .pipe(
    takeUntil(this.destroy$),
    finalize(() => { this.isLoading = false; })
  )
  .subscribe({
    next: (reports) => { this.reports = reports; },
    error: (err) => {
      console.error('Failed to load reports:', err);
      this.message.error(`Failed to load reports${err?.message ? ': ' + err.message : '. Please try again.'}`);
    }
  });
```
219-247: mimeType is computed but unused; either wire it in or drop it

In downloadReport, mimeType is computed via getMimeType but never used. Either:

- Remove the variable and helper if the server-provided content-type is sufficient, or
- Use it when constructing a Blob (if you ever wrap the response yourself) or for setting headers/analytics so the mapping actually has an effect.

Also, getFileExtension just lowercases the format; if only a fixed set of formats is expected (csv, json,

Also applies to: 249-261
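If the format set is fixed, one explicit way to wire the mapping in is a small lookup table (a sketch only; the format list and MIME values here are assumptions, not the component's actual set):

```typescript
// Hypothetical explicit mapping from report format to extension and MIME type.
// Unknown formats fall back to a generic binary type.
const FORMAT_INFO: Record<string, { ext: string; mime: string }> = {
  csv:  { ext: 'csv',  mime: 'text/csv' },
  json: { ext: 'json', mime: 'application/json' },
  pdf:  { ext: 'pdf',  mime: 'application/pdf' },
};

function formatInfo(format: string): { ext: string; mime: string } {
  const key = format.trim().toLowerCase();
  return FORMAT_INFO[key] ?? { ext: key, mime: 'application/octet-stream' };
}
```

Centralizing the table means getFileExtension and getMimeType can't drift apart, and new formats are added in one place.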
275-291: Revisit export methods' sync/async behavior and isExporting semantics

generateDashboardReport, generatePerformanceReport, and exportAllAsCSV wrap service calls in try/catch/finally and toggle isExporting, but this only handles synchronous failures. If analyticsService.exportDashboardAsCSV() / exportPerformanceMetricsAsCSV() are (or later become) HttpClient-based observables or promises, errors won't be caught here and isExporting will flip back to false immediately, not when the work finishes.

Consider:

- Having these service methods return an Observable<void> / Observable<Blob> and handling them like your other async flows (pipe(finalize(...)).subscribe({...})), or
- Making them clearly synchronous "local-only" helpers and possibly dropping isExporting around them if it doesn't reflect a real in-flight operation.

This will keep the UI state and error handling aligned with the actual behavior.
Also applies to: 293-309, 357-373
311-340: Tighten types for usage CSV builder and consider basic CSV escaping

generateUsageReport + buildUsageMetricsCSV work, but a couple of small improvements would help:

- buildUsageMetricsCSV(metrics: any[]) can use the existing UsageMetric type for stronger guarantees:

```ts
private buildUsageMetricsCSV(metrics: UsageMetric[]): string {
  // ...
}
```

- You already quote featureName, but any embedded quotes (") or newlines would still break the CSV. If these names can come from user-defined strings, consider minimal escaping (e.g., double quotes inside values).

Everything else in this flow (use of takeUntil, Blob creation, and download helper reuse) looks good.

Also applies to: 342-355
frontend/src/app/features/analytics/pages/usage.component.ts (3)
38-67: Verify percentage scale for nz-progress and consider showing the numeric value

[nzPercent]="metric.percentage" assumes percentage is already 0–100 as expected by nz-progress. If the service ever returns a 0–1 fraction, the bar will appear almost empty. It might be worth asserting/normalizing in the service or here.

Optionally, you could also render the numeric value (e.g. {{ metric.percentage | number:'1.0-2' }}%) alongside the bar for more precise visibility.
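A defensive normalizer for the 0–1 vs 0–100 ambiguity could look like this (an assumption-heavy sketch: it treats any value ≤ 1 as a fraction and clamps the result to [0, 100]):

```typescript
// Normalize a percentage that may arrive as a 0-1 fraction or a 0-100 value.
// Assumption: values <= 1 are fractions; out-of-range results are clamped.
function normalizePercent(value: number): number {
  const scaled = value <= 1 ? value * 100 : value;
  return Math.min(100, Math.max(0, scaled));
}
```

Whether the "≤ 1 means fraction" heuristic is acceptable depends on the data; if real sub-1% values are possible, normalize in the service where the unit is known instead.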
71-77: Guard ECharts rendering when usageChartOptions is null/empty

updateChartOptions explicitly sets usageChartOptions to null when there are no metrics, but the template always renders the <div echarts [options]="usageChartOptions">. Depending on ngx-echarts/ECharts behavior, passing null here may be tolerated or could throw at runtime.

To be safe and avoid mounting a chart with no data, consider:

```html
<div class="chart-container" *ngIf="usageChartOptions">
  <div echarts [options]="usageChartOptions" class="chart"></div>
</div>
```

That keeps the chart out of the DOM when there's nothing meaningful to show and avoids any surprises from null options.

Also applies to: 130-134, 172-204
140-164: Refresh pipeline with switchMap/takeUntil looks good; consider user feedback on errors

The refresh$ + switchMap + takeUntil(destroy$) setup is clean and avoids race conditions between rapid refreshes. One minor gap is that load failures only log to the console; unlike the reports component, the user gets no on-screen indication.

If you want parity with the reports page, you could inject NzMessageService here as well and surface a simple "Failed to load usage metrics" toast in the error handler.
NzMessageServicehere as well and surface a simple “Failed to load usage metrics” toast in the error handler.Also applies to: 166-170
.dockerignore (1)
1-31: .dockerignore patterns look good; be aware of excluded docs and compose files

The ignore list is sane for keeping the build context lean (build artifacts, node_modules, logs, env files, IDE configs). Just note that docs/, *.md, and docker-compose.yml will never be available in the build context, so if you ever want to COPY documentation or compose samples into the image, you'll need to relax these patterns.

src/main/java/com/rcs/ssf/service/DynamicCrudService.java (2)
47-72: Review expanded ALLOWED_TABLES whitelist for security and coverage

Expanding ALLOWED_TABLES to include users, roles, MFA, API key, and multiple audit tables significantly broadens what the dynamic CRUD UI can introspect/mutate. That's fine as long as it's intentional, but it's worth double-checking:

- Confirm that all of these tables are expected to be operator-visible via dynamic CRUD, especially users, api_keys, and MFA tables, even though sensitive columns are name-filtered.
- The SQL docs mention an audit_http_compression table; if you intend dynamic CRUD coverage for all audit tables, consider adding it here as well.

For example:

```diff
-        "audit_graphql_execution_plans",
-        "audit_circuit_breaker_events",
+        "audit_graphql_execution_plans",
+        "audit_circuit_breaker_events",
+        "audit_http_compression",
```
387-495: data_default batching logic looks correct; only minor refactor opportunities

The split between the main metadata query (no data_default) and a separate user_tab_columns batch to populate defaults via safeReadColumnDefault is sound and should avoid the LONG/ORA-00932 issues while eliminating the N+1 pattern. The defensive catch on DataAccessException with a fallback to null defaults is also reasonable.

If you want to tighten this further (optional):

- Make defaultsByColumn unmodifiable (Map.copyOf(defaultsMap)) before using it in the stream, to prevent accidental mutation later.
- Consider normalizing the map key (e.g., col.getName().toUpperCase(Locale.ROOT)) on both insertion and lookup if you ever introduce quoted/mixed-case column names, to avoid silent misses.

Functionally, though, this implementation is fine as-is.
sql/README.md (1)
44-123: Minor Markdown and wording nits (table spacing, code fence language, hyphenation)

To satisfy markdownlint and improve readability, you can make a few small tweaks:

1. Add a blank line before the migrations table:

```diff
-### ✅ ALL Flyway Migrations Unified
-| Migration | File | Status |
+### ✅ ALL Flyway Migrations Unified
+
+| Migration | File | Status |
```

2. Specify a language for the execution-order fenced block (it's mostly plain text):

````diff
-```
+```text
 Step 1: Sequences (audit_seq, user_id_seq)
 ...
 Step 9: Summary (print object counts and confirmation)
````

3. Hyphenate "performance-critical":

```diff
-**Indexes (30+):**
-- All performance critical indexes included
+**Indexes (30+):**
+- All performance-critical indexes included
```

These are cosmetic only and won't affect behavior.
docker-compose.yml (1)
183-185: Port exposure matches nginx+SSL flow; consider whether 8443 needs to stay public

Exposing 80/443 alongside 8443 makes sense with nginx terminating HTTP/HTTPS in the container, and the existing healthcheck still targeting 8443. If 8443 is now only an internal app port behind nginx, you might optionally drop the host binding for 8443 to reduce the exposed surface, keeping just 80/443 for normal access.
sql/seed/default_roles.sql (1)
15-35: Use a NULL-safe comparison in the MERGE UPDATE to backfill missing descriptions

In the MERGE, the UPDATE branch currently runs only when r.description != src.description. If an existing role has description IS NULL, that predicate is UNKNOWN and the row won't be updated, so the seeded description won't be backfilled.

If you want to ensure descriptions are populated or refreshed whenever they differ (including NULL → non-NULL), consider a NULL-safe condition, e.g.:

```diff
 WHEN MATCHED THEN UPDATE SET
     description = src.description,
     updated_at = SYSTIMESTAMP
-    WHERE r.description != src.description;
+    WHERE (r.description IS NULL AND src.description IS NOT NULL)
+       OR (r.description IS NOT NULL AND r.description != src.description);
```

The rest of the script (idempotent role creation and guarded ROLE_ADMIN grant) looks solid.
nginx.conf (1)
135-158: Consider extracting security headers to reduce duplication.

Security headers are repeated in three location blocks (server level, static assets, and SPA routing). While this is necessary because add_header in a child block doesn't inherit from parent blocks, it creates maintenance overhead.

Consider using an include file or nginx map to centralize security headers.

Create /etc/nginx/snippets/security-headers.conf:

```nginx
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
add_header X-Content-Type-Options "nosniff" always;
add_header X-Frame-Options "SAMEORIGIN" always;
add_header X-XSS-Protection "1; mode=block" always;
add_header Referrer-Policy "strict-origin-when-cross-origin" always;
```

Then include it in each location block:

```diff
 location ~ ^/(assets|styles)/ {
     root /var/www/html/browser;
     expires 30d;
     add_header Cache-Control "public, immutable";
-    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
-    add_header X-Content-Type-Options "nosniff" always;
-    add_header X-Frame-Options "SAMEORIGIN" always;
-    add_header X-XSS-Protection "1; mode=block" always;
-    add_header Referrer-Policy "strict-origin-when-cross-origin" always;
+    include /etc/nginx/snippets/security-headers.conf;
 }
```

Note: This requires updating the Dockerfile to copy the snippet file.
Dockerfile (1)
40-55: Consider documenting certificate rotation strategy.

The self-signed certificate has a 365-day validity. For long-running containers, this could cause expiration issues. The warning comments correctly note this is for local/dev only, but documentation on certificate rotation would be helpful.

Add a comment block or reference to documentation:

```diff
 # Generate self-signed SSL certificates for nginx with configurable CN
+# NOTE: Certificate expires in 365 days. For production:
+#   1. Mount CA-issued certificates via volume: -v /path/to/certs:/etc/nginx/certs
+#   2. Or use cert-manager/Let's Encrypt for automatic rotation
+# For long-running dev containers, recreate container annually or mount certs.
 RUN mkdir -p /etc/nginx/certs && \
     openssl req -x509 -newkey rsa:2048 -keyout /etc/nginx/certs/private-key.pem -out /etc/nginx/certs/certificate.pem -days 365 -nodes \
     -subj "/CN=${CERT_CN}/O=Organization/C=US"
```

sql/tables/05_persisted_queries.sql (1)
19-35: Consider adding CHECK constraint for status field.

The status column (line 30) has a default value of 'ACTIVE' but no CHECK constraint to enforce valid values. This could allow invalid status values to be inserted.

Apply this diff to add a CHECK constraint:

```diff
     usage_count NUMBER(19) DEFAULT 0 NOT NULL,
     client_name VARCHAR2(255),
-    status VARCHAR2(20) DEFAULT 'ACTIVE' NOT NULL,
+    status VARCHAR2(20) DEFAULT 'ACTIVE' NOT NULL CHECK (status IN ('ACTIVE', 'DEPRECATED', 'DISABLED')),
     created_by VARCHAR2(255),
     updated_by VARCHAR2(255)
```

Adjust the valid status values based on your application's requirements.
sql/packages/dynamic_crud_pkg_body.sql (2)
239-265: Primary-key helper only supports single-column PKs
get_primary_key_column intentionally returns only the first PK column; composite primary keys will not be handled correctly by the ID-returning logic that depends on this helper. If you ever allow dynamic CRUD over tables with composite PKs, you'll need either to:
- Extend this helper to return all PK columns, and
- Adjust the INSERT/RETURNING path to bind multiple columns.
Otherwise, it would be good to document clearly (in spec/comments) that dynamic_crud_pkg only supports single-column PKs on the allowed tables.
323-327: INSERT/ID generation path looks correct but treats IDENTITY differently than SEQUENCE

The new INSERT branch correctly:

- Builds a quoted column/value list with bind variables.
- Uses user_tab_identity_cols to detect identity columns.
- Uses RETURNING "<pk_column>" INTO :ret_id with DBMS_SQL when there is no identity column and a PK exists, and falls back to a plain INSERT otherwise.

Two things to consider tightening up:
Comment vs logic – The comment “Check if table has a SEQUENCE-based PK (not IDENTITY)” is misleading: the code checks for an IDENTITY column and treats the “no identity” case as “possibly sequence-based”. You may want to reword the comment to avoid confusion.
Capturing IDs for IDENTITY PKs – For IDENTITY columns you currently just run a normal INSERT and never populate
p_generated_id. Oracle also supportsRETURNING identity_column INTO ..., so if the caller expects IDs for both SEQUENCE‑based and IDENTITY‑based PKs, you could simplify by using the same RETURNING path for both and dropping the identity check altogether.If the current behavior (no generated ID for identity PKs) is intentional and all callers know this, keeping it is fine, but it is worth making explicit.
Also applies to: 373-430
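As a concrete illustration of the unified approach, RETURNING captures the generated value the same way for identity and sequence-backed PKs (the table below is hypothetical, purely for demonstration):

```sql
-- RETURNING works identically whether the PK value comes from an identity
-- column or a sequence default, so one code path can serve both cases.
CREATE TABLE demo_items (
    id   NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    name VARCHAR2(100)
);

DECLARE
    v_id demo_items.id%TYPE;
BEGIN
    INSERT INTO demo_items (name)
    VALUES ('first')
    RETURNING id INTO v_id;   -- captures the identity-generated value

    DBMS_OUTPUT.PUT_LINE('generated id = ' || v_id);
END;
/
```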
db/migration/V601__index_effectiveness_audit_tables.sql (1)
24-51: Optional: constrain improvement percentage and rationale length if needed

The `audit_query_recommendations` table looks solid, with good completeness checks on `approved`/`implemented`. If you want to prevent out-of-range values later, consider:

- Adding `CHECK (estimated_improvement_percent BETWEEN 0 AND 100)` if the value is meant to be a percentage.
- Verifying that 2000 chars is enough for `rationale` given your expected usage; if not, switching to `CLOB` keeps future changes easier.

Not urgent, but worth deciding up front to avoid later migrations.
sql/tables/06_user_settings.sql (2)
19-55: Consider tightening boolean flags and time representation

The three new tables look coherent overall, but a few minor schema-hardening points:

- The `notification_*` columns and API key lifecycle flags (`revoked_at`, etc.) are modeled as `NUMBER(1)`/`NUMBER(19)` without checks. If these are truly boolean flags (0/1), consider `CHECK (notification_emails IN (0,1))`-style constraints to prevent accidental non-boolean values.
- You're storing `created_at`/`updated_at` as Unix epoch millis (`NUMBER(19)`) here, while `users.created_at`/`updated_at` are `TIMESTAMP`. That's fine if intentional, but it's worth confirming that your Java/JPA layer expects and correctly converts these numeric timestamps, and that mixed conventions won't cause confusion.

These are non-blocking, but tightening them now will make downstream logic and reporting simpler.
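For illustration, the flag hardening could look like the following (only `notification_emails` is named above; the table name and any further columns are assumptions to verify against the real DDL):

```sql
-- Constrain a boolean-style NUMBER(1) flag to 0/1.
-- Repeat one constraint per notification_* column; assumes the table
-- from 06_user_settings.sql is named user_settings.
ALTER TABLE user_settings
    ADD CONSTRAINT chk_notification_emails
    CHECK (notification_emails IN (0, 1));
```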
62-77: Using `timestamp` as a column name

`account_deactivation_audit.timestamp` is a valid column name, but it does shadow a common SQL type/keyword and may make queries slightly harder to read (e.g., `WHERE timestamp > ...`).

If you're still flexible on naming, something like `occurred_at` or `event_at` tends to be clearer. If this table is already referenced in code, keeping it is fine; just be consistent in always qualifying with the table name to avoid ambiguity.

src/test/java/com/rcs/ssf/metrics/IndexEffectivenessAnalyzerTest.java (1)
34-81: Tests are not exercising IndexEffectivenessAnalyzer behavior

The current tests (`testUtilityScoreCalculation`, `testActionDetermination`, etc.) only perform inline arithmetic and string comparisons; they never invoke any methods on `IndexEffectivenessAnalyzer`. This gives no real coverage of the production code.

If you want meaningful tests here, consider:

- Extracting `calculateUtilityScore`, `determineAction`, and `generateRationale` to package-private methods (or a small helper class) so they can be invoked directly, or
- Adding an integration-style test that stubs `JdbcTemplate.queryForList(...)` and then calls `analyzeIndexEffectiveness()` to verify that:
  - At least one row is processed.
  - `persistAnalysis` is called (via verifying an insert or inspecting `audit_index_analysis` in a test DB).
  - Gauges/counters in `MeterRegistry` are updated as expected.

As it stands, these tests can be removed or refactored; they don't protect the analyzer's logic.
db/migration/V700__recompile_plsql_packages_fix_id_generation.sql (1)
389-417: Unused local variable `v_has_identity`

Inside the CREATE/INSERT branch's nested `DECLARE`, `v_has_identity` is declared but never used. It's harmless but can be removed to reduce noise:

```diff
 DECLARE
-    v_has_identity NUMBER := 0;
     v_generated_id_value NUMBER;
     v_cursor INTEGER;
 BEGIN
     ...
 END;
```

sql/tables/users.sql (1)
30-203: Data consistency correction block is robust but quite heavy for a DDL script

The correction block does a thorough job:

- Counts inconsistent rows,
- Logs sample IDs via `DBMS_OUTPUT`,
- Fixes both "DEACTIVATED with NULL timestamp" and "ACTIVE/INACTIVE with non-NULL timestamp" cases,
- Handles constraint and permission errors explicitly.

Two minor considerations:

- The pre-flight privilege check just verifies that the queries against `user_sys_privs`/`user_tab_privs` succeed; it doesn't actually assert that you have UPDATE privileges (zero rows still logs "Update privileges verified"). That's mostly cosmetic, since the UPDATEs themselves will fail with ORA-1031 if privileges are missing.
- Running this block in regular app migrations may be overkill; if it's intended strictly for DBA-run initialization scripts, a comment in the header clarifying that (and ensuring it's not wired into Flyway) would help avoid surprises.
Functionally it looks sound; these are just maintainability clarifications.
src/main/java/com/rcs/ssf/metrics/IndexEffectivenessAnalyzer.java (1)
60-82: Index stats collection logic is robust; verify V$ view access assumptions
`getIndexUsageStats()` is well structured: it tries `V$SEGMENT_STATISTICS`, then `V$OBJECT_USAGE`, then falls back to `USER_INDEXES` with zero usage counts, all wrapped in defensive exception handling.

Two small points to be aware of:

- Access to `V$SEGMENT_STATISTICS` and `V$OBJECT_USAGE` typically requires additional dictionary privileges. In many app schemas these queries will fail with ORA-1031, and you'll always hit the fallback. That's fine given your logging, but you may want to document the privilege requirements (and test the fallback behavior) so operators know what to expect.
- You hard-code an 8KB block size (`COALESCE(i.leaf_blocks, 1) * 8192`) to estimate index size. If your environments use non-8KB blocks, consider deriving block size from a parameter/table (`v$parameter` or tablespace metadata) or at least documenting the assumption.
Also applies to: 98-181
src/main/java/com/rcs/ssf/metrics/SlowQueryAnalyzer.java (4)
234-246: Consider extracting LIKE wildcard escaping to a utility method.

Lines 236-238 manually escape LIKE wildcards in a specific order. This logic is fragile and duplicated whenever needed.
Extract to a reusable utility method:
```java
private String escapeLikePattern(String input) {
    return input
        .replace("\\", "\\\\")  // Escape backslash first
        .replace("%", "\\%")    // Escape percent
        .replace("_", "\\_");   // Escape underscore
}
```

Then use:

```diff
-String escapedSql = searchSql
-    .replace("\\", "\\\\")
-    .replace("%", "\\%")
-    .replace("_", "\\_");
+String escapedSql = escapeLikePattern(searchSql);
```
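One detail worth double-checking either way: backslash escaping only takes effect if the query declares the escape character. In Oracle that means pairing the escaped bind value with an ESCAPE clause (sketch only; the table and column names here are placeholders, not from the codebase):

```sql
-- Without ESCAPE, '\%' is a literal backslash followed by a wildcard match;
-- this clause makes '\' the escape character for the predicate.
SELECT *
FROM   some_query_log                         -- placeholder table
WHERE  sql_text LIKE :escaped_pattern ESCAPE '\';
```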
298-343: Consider refactoring verbose validation logic.

The 45 lines of manual field extraction and type checking (lines 298-343) could be simplified for better maintainability.
Consider introducing a helper method or validation utility:
```java
private static class RecommendationExtractor {
    static RecommendationFields extract(Map<String, Object> recommendation) {
        String type = getStringOrNull(recommendation, "type");
        String confidence = getStringOrNull(recommendation, "confidence");
        String rationale = getStringOrNull(recommendation, "rationale");

        if (type == null || confidence == null || rationale == null) {
            return null; // Invalid
        }

        Double improvement = getDoubleOrNull(recommendation, "improvement_percent");
        String indexDef = getStringOrNull(recommendation, "index_definition");
        String hint = getStringOrNull(recommendation, "hint_suggestion");

        return new RecommendationFields(type, confidence, rationale, improvement, indexDef, hint);
    }

    private static String getStringOrNull(Map<String, Object> map, String key) {
        Object val = map.get(key);
        return val instanceof String ? (String) val : null;
    }

    private static Double getDoubleOrNull(Map<String, Object> map, String key) {
        Object val = map.get(key);
        return val instanceof Number ? ((Number) val).doubleValue() : null;
    }
}
```
389-420: Extract hardcoded improvement percentage to a constant.

Line 403 uses a magic number `80.0` for the estimated improvement percentage. This should be a named constant for clarity and maintainability.

```diff
+private static final double MISSING_INDEX_IMPROVEMENT_PERCENT = 80.0;
+private static final double INEFFICIENT_JOIN_IMPROVEMENT_PERCENT = 50.0;
+private static final double CORRELATED_SUBQUERY_IMPROVEMENT_PERCENT = 60.0;
+
 private Map<String, Object> analyzeFullTableScan(String sql, ExecutionPlan plan) {
     Map<String, Object> result = new HashMap<>();
     // ...
-    result.put("improvement_percent", 80.0);
+    result.put("improvement_percent", MISSING_INDEX_IMPROVEMENT_PERCENT);
```

Apply similar changes to lines 430 and 447 for consistency.
458-470: Fragile table name extraction from execution plan.

Line 464 assumes `parts[3]` contains the table name after splitting on whitespace. This is brittle and may fail if Oracle changes the plan format or if operations have different structures.

Consider more robust parsing with validation:

```diff
 private String extractTableFromPlan(String planText) {
     String[] lines = planText.split("\n");
     for (String line : lines) {
         if (line.contains("TABLE ACCESS FULL")) {
             String[] parts = line.trim().split("\\s+");
-            if (parts.length >= 4) {
-                return parts[3];
+            // Look for table name: expect pattern like "TABLE ACCESS FULL table_name"
+            for (int i = 0; i < parts.length - 1; i++) {
+                if ("FULL".equalsIgnoreCase(parts[i]) && i + 1 < parts.length) {
+                    String tableName = parts[i + 1];
+                    // Validate: table names typically don't contain parentheses or operators
+                    if (!tableName.contains("(") && !tableName.matches("\\d+\\.?")) {
+                        return tableName;
+                    }
+                }
             }
         }
     }
     return null;
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (51)
- `.dockerignore` (1 hunks)
- `Dockerfile` (2 hunks)
- `JAVA_21_UPGRADE.md` (0 hunks)
- `README.md` (1 hunks)
- `ROADMAP_OBSERVABILITY.md` (1 hunks)
- `build.gradle` (1 hunks)
- `db/migration/V017__account_deactivation_audit_reason_code_not_null.sql` (1 hunks)
- `db/migration/V500__user_settings_profile.sql` (1 hunks)
- `db/migration/V601__index_effectiveness_audit_tables.sql` (1 hunks)
- `db/migration/V700__recompile_plsql_packages_fix_id_generation.sql` (1 hunks)
- `docker-compose.yml` (1 hunks)
- `docker-start.sh` (1 hunks)
- `frontend/src/app/app.routes.ts` (1 hunks)
- `frontend/src/app/features/analytics/analytics.component.ts` (1 hunks)
- `frontend/src/app/features/analytics/pages/dashboard.component.ts` (1 hunks)
- `frontend/src/app/features/analytics/pages/performance.component.ts` (1 hunks)
- `frontend/src/app/features/analytics/pages/reports.component.ts` (1 hunks)
- `frontend/src/app/features/analytics/pages/usage.component.ts` (1 hunks)
- `frontend/src/app/features/analytics/services/analytics.service.ts` (1 hunks)
- `nginx.conf` (4 hunks)
- `setup-oracle-schema.sh` (1 hunks)
- `sql/README.md` (1 hunks)
- `sql/grants/data_dictionary_privileges.sql` (1 hunks)
- `sql/indexes/users_indexes.sql` (1 hunks)
- `sql/master.sql` (2 hunks)
- `sql/packages/dynamic_crud_pkg_body.sql` (5 hunks)
- `sql/packages/dynamic_crud_pkg_spec.sql` (1 hunks)
- `sql/seed/assign_rcs_admin_role.sql` (1 hunks)
- `sql/seed/default_roles.sql` (1 hunks)
- `sql/tables/05_persisted_queries.sql` (1 hunks)
- `sql/tables/06_user_settings.sql` (1 hunks)
- `sql/tables/07_audit_index_analysis.sql` (1 hunks)
- `sql/tables/08_audit_query_recommendations.sql` (1 hunks)
- `sql/tables/user_roles.sql` (1 hunks)
- `sql/tables/users.sql` (2 hunks)
- `src/main/java/com/rcs/ssf/SsfApplication.java` (1 hunks)
- `src/main/java/com/rcs/ssf/dynamic/DynamicCrudGateway.java` (3 hunks)
- `src/main/java/com/rcs/ssf/graphql/AuthMutation.java` (1 hunks)
- `src/main/java/com/rcs/ssf/metrics/BatchMetricsRecorder.java` (3 hunks)
- `src/main/java/com/rcs/ssf/metrics/IndexEffectivenessAnalyzer.java` (1 hunks)
- `src/main/java/com/rcs/ssf/metrics/QueryNormalizationUtils.java` (1 hunks)
- `src/main/java/com/rcs/ssf/metrics/QueryPlanAnalyzer.java` (4 hunks)
- `src/main/java/com/rcs/ssf/metrics/RecommendationPersistenceService.java` (1 hunks)
- `src/main/java/com/rcs/ssf/metrics/SlowQueryAnalyzer.java` (1 hunks)
- `src/main/java/com/rcs/ssf/security/JwtAuthenticationFilter.java` (1 hunks)
- `src/main/java/com/rcs/ssf/service/DynamicCrudService.java` (6 hunks)
- `src/main/java/com/rcs/ssf/tracing/LatencyAnalyzer.java` (1 hunks)
- `src/main/java/com/rcs/ssf/tracing/OtelConfig.java` (1 hunks)
- `src/main/resources/application-dev.yml` (1 hunks)
- `src/main/resources/application.yml` (2 hunks)
- `src/test/java/com/rcs/ssf/metrics/IndexEffectivenessAnalyzerTest.java` (1 hunks)
💤 Files with no reviewable changes (1)
- JAVA_21_UPGRADE.md
🧰 Additional context used
🧬 Code graph analysis (5)

- frontend/src/app/features/analytics/pages/dashboard.component.ts (1)
  - frontend/src/app/features/analytics/services/analytics.service.ts: `AnalyticsDashboard` (29-39), `LoginAttemptTrendPoint` (41-45)
- src/main/java/com/rcs/ssf/metrics/QueryNormalizationUtils.java (1)
  - src/main/java/com/rcs/ssf/service/DynamicCrudService.java: `Slf4j` (37-616)
- frontend/src/app/features/analytics/pages/performance.component.ts (1)
  - frontend/src/app/features/analytics/services/analytics.service.ts: `PerformanceMetric` (6-12)
- frontend/src/app/features/analytics/pages/usage.component.ts (1)
  - frontend/src/app/features/analytics/services/analytics.service.ts: `UsageMetric` (14-19)
- frontend/src/app/app.routes.ts (1)
  - frontend/src/app/core/guards/auth.guard.ts: `authGuard` (21-40)
🪛 GitHub Actions: CI
src/test/java/com/rcs/ssf/metrics/IndexEffectivenessAnalyzerTest.java
[error] 31-31: IndexEffectivenessAnalyzer constructor mismatch. Passed (MeterRegistry, DataSource, JdbcTemplate) but constructor expects (MeterRegistry, JdbcTemplate). Remove DataSource argument or adjust constructor.
build.gradle
[error] 1-1: Gradle build failed: see console output for details. Command: /home/runner/work/spring-graphql/spring-graphql/gradlew clean test jacocoTestReport
🪛 LanguageTool
sql/README.md
[grammar] ~123-~123: Use a hyphen to join words.
Context: ...t` Indexes (30+): - All performance critical indexes included **Packages (2...
(QB_NEW_EN_HYPHEN)
🪛 markdownlint-cli2 (0.18.1)
sql/README.md
45-45: Tables should be surrounded by blank lines
(MD058, blanks-around-tables)
83-83: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
ROADMAP_OBSERVABILITY.md
178-178: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
245-245: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
291-291: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
413-413: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🪛 Shellcheck (0.11.0)
docker-start.sh
[warning] 12-12: This $? refers to echo/printf, not a previous command. Assign to variable to avoid it being overwritten.
(SC2320)
- Add su-exec to Dockerfile for privilege dropping, enabling non-root execution of the Java application
- Update docker-start.sh to run Java as appuser, with pre-flight validations for su-exec and appgroup
- Fix healthcheck regex in docker-compose.yml for more robust Elasticsearch cluster status parsing
- Add reactor-test dependency in build.gradle for testing Spring reactive components
- Correct v_operation normalization in the PL/SQL package for proper audit logging in both success and error cases
- Add validation for non-empty execution plans before generating recommendations to prevent errors
Actionable comments posted: 11
🧹 Nitpick comments (13)
frontend/src/styles.css (1)
254-277: Consider documenting component-to-style mappings.

The usage comments are helpful for discovering available styles. To improve developer ergonomics, consider adding a comment block that maps these styles to specific analytics components (e.g., "Use `.card-glass` for metric summary cards in AnalyticsDashboard"). This would reduce guesswork when implementing UI features.

docker-start.sh (1)
1-153: Consider failure recovery and process monitoring.

The script orchestrates nginx and Java processes, but if nginx crashes unexpectedly (lines 84–103 only detect startup failure, not runtime crashes), the Java process continues running without a reverse proxy. Similarly, if the Java process crashes, nginx continues serving requests that cannot be routed.
For production deployments, consider:
- Continuous monitoring of both processes with automatic restart or container exit on critical process failure
- A supervisor tool (e.g., `supervisord`) or init system (e.g., `s6`, `runit`) to manage multiple services with automatic restarts
- Health check endpoints on both services with periodic verification in the main loop
This is acceptable for development/testing containers, but a production image may benefit from more robust orchestration.
src/test/java/com/rcs/ssf/metrics/IndexEffectivenessAnalyzerTest.java (1)
82-113: Consider more precise mock matching and value assertions.

This test correctly invokes the analyzer, but could be more robust:
- Line 98: The mock uses `(Object[]) any()`, which is permissive and might miss signature mismatches.
- Lines 110-112: Assertions check for non-empty metrics but don't verify tag values match the test data (e.g., `INDEX_NAME = "IDX_USERS_EMAIL"`).
- No verification of the actual persisted values or error handling.
Consider these refinements:
```diff
-when(jdbcTemplate.update(any(String.class), (Object[]) any())).thenReturn(1);
+// Capture the actual arguments to verify persisted values
+ArgumentCaptor<Object[]> argsCaptor = ArgumentCaptor.forClass(Object[].class);
+when(jdbcTemplate.update(any(String.class), argsCaptor.capture())).thenReturn(1);
```

```diff
-assertFalse(meterRegistry.get("oracle.index.utility_score").gauges().isEmpty());
+// Verify metrics are tagged with the correct index name
+assertNotNull(meterRegistry.get("oracle.index.utility_score")
+    .tag("index_name", "IDX_USERS_EMAIL")
+    .gauge());
```

Additionally, consider adding a test case for error handling (e.g., when `jdbcTemplate.queryForList()` throws an exception).

src/main/java/com/rcs/ssf/metrics/SlowQueryAnalyzer.java (2)
397-431: Hardcoded improvement percentages should be documented as estimates.

The `improvement_percent` value of 80% (line 414) is a fixed estimate rather than a measured value. While this heuristic is reasonable for full table scans, users should understand these are approximations.

Consider adding a comment or documentation clarifying that improvement percentages are heuristic estimates, not guarantees.
469-481: Fragile string parsing for table name extraction.

The `extractTableFromPlan` method relies on a specific Oracle plan format ("TABLE ACCESS FULL table_name") with whitespace splitting. This could break if Oracle changes the plan text format in future versions.

Consider using a more robust pattern-based approach:

```diff
 private String extractTableFromPlan(String planText) {
-    // Simple extraction - look for TABLE ACCESS FULL followed by table name
     String[] lines = planText.split("\n");
+    Pattern tablePattern = Pattern.compile("TABLE ACCESS FULL\\s+(\\w+)", Pattern.CASE_INSENSITIVE);
     for (String line : lines) {
-        if (line.contains("TABLE ACCESS FULL")) {
-            String[] parts = line.trim().split("\\s+");
-            if (parts.length >= 4) {
-                return parts[3]; // TABLE ACCESS FULL table_name
-            }
+        Matcher matcher = tablePattern.matcher(line);
+        if (matcher.find()) {
+            return matcher.group(1);
         }
     }
     return null;
 }
```

build.gradle (4)
42-44: Consolidate split configuration blocks for clarity.

The `configurations` block is split into two locations (lines 42-44 for Flyway, and lines 197-201 for Tomcat exclusion). Consolidate these into a single block early in the build script to improve readability and maintainability.

```diff
-configurations {
-    create('flyway')
-}
+configurations {
+    create('flyway')
+    all {
+        exclude group: 'org.springframework.boot', module: 'spring-boot-starter-tomcat'
+    }
+}
-
-configurations {
-    all {
-        exclude group: 'org.springframework.boot', module: 'spring-boot-starter-tomcat'
-    }
-}
```

Also applies to: 197-201
160-196: Simplify and harden the Flyway task graph validation logic.

Line 167 uses `graph.allTasks.any { it.name.toLowerCase().startsWith('flyway') }` to detect Flyway tasks, which relies on Gradle's internal task naming convention. This approach is fragile and could break if Flyway changes its task naming. Consider using Gradle's task type matching instead for robustness.

Additionally, the validation occurs within the `flyway` configuration block, meaning credentials are only checked when the Flyway block is evaluated (not necessarily when other tasks run). Consider moving this to a separate task or using a different hook if early validation is critical.

```diff
-gradle.taskGraph.whenReady { graph ->
-    if (graph.allTasks.any { it.name.toLowerCase().startsWith('flyway') }) {
+gradle.taskGraph.whenReady { graph ->
+    if (graph.allTasks.any { it instanceof org.flywaydb.gradle.task.FlywayTask }) {
```
71-76: Centralize resilience4j dependency versions in the dependency-management block.

Lines 71-76 manually pin all resilience4j modules to `2.1.0`. For consistency with the BOM-driven approach used for OpenTelemetry and Spring Boot managed dependencies, consider centralizing resilience4j versions in the `dependencyManagement` block rather than repeating the version across all modules.

Add to the `dependencyManagement` block (after line 31):

```diff
 dependencyManagement {
     imports {
         mavenBom "io.opentelemetry:opentelemetry-bom:1.40.0"
         mavenBom "io.opentelemetry.instrumentation:opentelemetry-instrumentation-bom:2.6.0"
     }
     dependencies {
         dependency 'io.opentelemetry.semconv:opentelemetry-semconv:1.37.0'
+        // Resilience4j version centralization
+        dependency 'io.github.resilience4j:resilience4j-spring-boot3:2.1.0'
+        dependency 'io.github.resilience4j:resilience4j-circuitbreaker:2.1.0'
+        dependency 'io.github.resilience4j:resilience4j-retry:2.1.0'
+        dependency 'io.github.resilience4j:resilience4j-bulkhead:2.1.0'
+        dependency 'io.github.resilience4j:resilience4j-timelimiter:2.1.0'
+        dependency 'io.github.resilience4j:resilience4j-micrometer:2.1.0'
     }
 }
```

Then update lines 71-76 to remove the versions:

```diff
 // Resilience & Observability
-implementation 'io.github.resilience4j:resilience4j-spring-boot3:2.1.0'
-implementation 'io.github.resilience4j:resilience4j-circuitbreaker:2.1.0'
-implementation 'io.github.resilience4j:resilience4j-retry:2.1.0'
-implementation 'io.github.resilience4j:resilience4j-bulkhead:2.1.0'
-implementation 'io.github.resilience4j:resilience4j-timelimiter:2.1.0'
-implementation 'io.github.resilience4j:resilience4j-micrometer:2.1.0'
+implementation 'io.github.resilience4j:resilience4j-spring-boot3'
+implementation 'io.github.resilience4j:resilience4j-circuitbreaker'
+implementation 'io.github.resilience4j:resilience4j-retry'
+implementation 'io.github.resilience4j:resilience4j-bulkhead'
+implementation 'io.github.resilience4j:resilience4j-timelimiter'
+implementation 'io.github.resilience4j:resilience4j-micrometer'
```
199-199: Add explanatory comment for Tomcat exclusion.

Line 199 excludes `spring-boot-starter-tomcat` without context. Given that line 54 explicitly adds `spring-boot-starter-jetty`, the exclusion is intentional but unclear to future readers. Add a brief comment explaining the rationale.

```diff
 configurations {
     all {
+        // Use Jetty (line 54) as the embedded servlet container instead of the default Tomcat
         exclude group: 'org.springframework.boot', module: 'spring-boot-starter-tomcat'
     }
 }
```

frontend/src/app/features/analytics/analytics.module.ts (1)
1-15: AnalyticsModule wiring is fine; some imports may be redundant

Given that `AnalyticsComponent` and the child pages are standalone and routing is handled via `AnalyticsRoutingModule`, `CommonModule` and importing `AnalyticsComponent` here are likely unnecessary unless you plan to use them in non-standalone declarations later. You could simplify to only import `AnalyticsRoutingModule` and keep `AnalyticsService` in providers, but the current setup is functionally correct.

frontend/src/app/features/analytics/pages/dashboard.component.ts (1)
272-368: Data-loading flow is robust; consider a simple loading state for UX

The init/refresh pipeline (`loadData` → `fetchDashboardData` with `takeUntil`, state updates, and a guarded `catchError` that preserves the last good dashboard while clearing the chart) is solid. As an optional improvement, you might add an `isLoading` flag to disable the Refresh button and/or show a spinner while a request is in flight, to avoid overlapping refresh calls and give clearer feedback on slow responses.

src/main/java/com/rcs/ssf/tracing/LatencyAnalyzer.java (1)
30-38: Thread-safety design with immutable snapshots and `ConcurrentHashMap.compute` looks solid

The combination of:

- Immutable `LayerLatencyStats` with `final` fields and a constructor-computed `totalLatencyMs`, and
- Atomic updates via `latencyStats.compute(resolver, (key, old) -> old.withLayerLatency(...))`

addresses the earlier data-race concerns: readers in `logLatencyAttributions`/`detectAndAlertBottlenecks` now observe consistent per-resolver snapshots without torn `long` reads.

One minor doc nit: the Javadoc on `recordLayerLatency` still mentions a "compareAndSet pattern", but the implementation now relies on `ConcurrentHashMap.compute(...)` for atomic read-modify-write. It might be worth rephrasing that comment to match the current approach.

Also applies to: 216-224, 269-323
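The pattern described above can be distilled into a few lines. This is a hedged, standalone sketch, not `LatencyAnalyzer` itself; class, field, and method names are illustrative:

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the pattern: per-key immutable stats updated atomically via
// ConcurrentHashMap.compute, so readers always see a consistent snapshot.
public class ComputeSnapshotDemo {

    /** Immutable per-resolver stats; every update produces a new instance. */
    static final class LayerStats {
        final long dbMs;
        final long serviceMs;
        final long totalMs; // derived once in the constructor, never torn

        LayerStats(long dbMs, long serviceMs) {
            this.dbMs = dbMs;
            this.serviceMs = serviceMs;
            this.totalMs = dbMs + serviceMs;
        }

        LayerStats withDb(long ms)      { return new LayerStats(dbMs + ms, serviceMs); }
        LayerStats withService(long ms) { return new LayerStats(dbMs, serviceMs + ms); }
    }

    final ConcurrentHashMap<String, LayerStats> stats = new ConcurrentHashMap<>();

    /** Atomic read-modify-write: compute() serializes updates per key. */
    void recordDb(String resolver, long ms) {
        stats.compute(resolver, (k, old) ->
                old == null ? new LayerStats(ms, 0) : old.withDb(ms));
    }

    void recordService(String resolver, long ms) {
        stats.compute(resolver, (k, old) ->
                old == null ? new LayerStats(0, ms) : old.withService(ms));
    }

    public static void main(String[] args) {
        ComputeSnapshotDemo demo = new ComputeSnapshotDemo();
        demo.recordDb("userById", 12);
        demo.recordService("userById", 3);
        LayerStats snap = demo.stats.get("userById"); // consistent snapshot
        if (snap.totalMs != 15) throw new AssertionError("expected 15");
        System.out.println("total=" + snap.totalMs); // prints total=15
    }
}
```

Because each `LayerStats` is immutable, a reader that grabs a reference can aggregate `dbMs`, `serviceMs`, and `totalMs` without ever seeing a half-applied update.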
src/main/java/com/rcs/ssf/metrics/BatchMetricsRecorder.java (1)
189-251: Excellent thread-safe refactoring with atomic snapshots.

The refactoring from mutable fields to `AtomicReference<MetricsSnapshot>` significantly improves thread safety and consistency:

- Eliminates TOCTOU races: All getters (`getErrorRate`, `getAverageThroughput`, etc.) now read from a consistent snapshot, preventing time-of-check to time-of-use issues where multiple fields could be read at different points in time.
- Lock-free atomic updates: The CAS loop in `recordSuccess` (lines 197-210) correctly implements lock-free updates, avoiding synchronization overhead.
- On-demand derived metrics: `getAverageThroughput()` and `getErrorRate()` now compute from the atomic snapshot, ensuring they're always consistent with the underlying data.
- Proper zero-division protection: Lines 226, 238-240, and 246-248 correctly handle edge cases.

Minor consideration: Under very high contention, the CAS loop could spin and create multiple short-lived `MetricsSnapshot` objects. For metrics recording, this is an acceptable tradeoff for the benefits of lock-free operation.
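For readers unfamiliar with the CAS-loop snapshot idiom, here is a hedged, self-contained sketch. The names are illustrative and not copied from `BatchMetricsRecorder`:

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch of the AtomicReference + CAS-loop pattern: an immutable snapshot
// is swapped in atomically, so derived metrics are always consistent.
public class CasSnapshotDemo {

    /** Immutable snapshot: readers always see a mutually consistent pair. */
    record MetricsSnapshot(long successes, long failures) {
        double errorRate() {
            long total = successes + failures;
            return total == 0 ? 0.0 : (double) failures / total; // zero-division guard
        }
    }

    final AtomicReference<MetricsSnapshot> ref =
            new AtomicReference<>(new MetricsSnapshot(0, 0));

    void recordSuccess() {
        // Lock-free CAS loop: retry until our updated snapshot wins the race.
        MetricsSnapshot prev, next;
        do {
            prev = ref.get();
            next = new MetricsSnapshot(prev.successes() + 1, prev.failures());
        } while (!ref.compareAndSet(prev, next));
    }

    void recordFailure() {
        MetricsSnapshot prev, next;
        do {
            prev = ref.get();
            next = new MetricsSnapshot(prev.successes(), prev.failures() + 1);
        } while (!ref.compareAndSet(prev, next));
    }

    public static void main(String[] args) {
        CasSnapshotDemo m = new CasSnapshotDemo();
        m.recordSuccess();
        m.recordSuccess();
        m.recordSuccess();
        m.recordFailure();
        double rate = m.ref.get().errorRate();
        if (Math.abs(rate - 0.25) > 1e-9) throw new AssertionError("expected 0.25");
        System.out.println("errorRate=" + rate); // prints errorRate=0.25
    }
}
```

The short-lived snapshot objects created by failed CAS attempts are exactly the allocation churn the "minor consideration" above refers to; for metrics-rate workloads it is usually negligible.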
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (22)
- `Dockerfile` (2 hunks)
- `build.gradle` (1 hunks)
- `db/migration/V700__recompile_plsql_packages_fix_id_generation.sql` (1 hunks)
- `docker-compose.yml` (2 hunks)
- `docker-start.sh` (1 hunks)
- `frontend/src/app/app.component.html` (1 hunks)
- `frontend/src/app/app.routes.ts` (1 hunks)
- `frontend/src/app/features/analytics/analytics-routing.module.ts` (1 hunks)
- `frontend/src/app/features/analytics/analytics.component.ts` (1 hunks)
- `frontend/src/app/features/analytics/analytics.module.ts` (1 hunks)
- `frontend/src/app/features/analytics/pages/dashboard.component.ts` (1 hunks)
- `frontend/src/app/features/analytics/pages/reports.component.ts` (1 hunks)
- `frontend/src/app/features/analytics/pages/usage.component.ts` (1 hunks)
- `frontend/src/app/features/analytics/services/analytics.service.ts` (1 hunks)
- `frontend/src/styles.css` (1 hunks)
- `setup-oracle-schema.sh` (1 hunks)
- `src/main/java/com/rcs/ssf/metrics/BatchMetricsRecorder.java` (4 hunks)
- `src/main/java/com/rcs/ssf/metrics/SlowQueryAnalyzer.java` (1 hunks)
- `src/main/java/com/rcs/ssf/security/JwtAuthenticationFilter.java` (1 hunks)
- `src/main/java/com/rcs/ssf/service/DynamicCrudService.java` (6 hunks)
- `src/main/java/com/rcs/ssf/tracing/LatencyAnalyzer.java` (1 hunks)
- `src/test/java/com/rcs/ssf/metrics/IndexEffectivenessAnalyzerTest.java` (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
- frontend/src/app/app.routes.ts
- src/main/java/com/rcs/ssf/security/JwtAuthenticationFilter.java
- Dockerfile
- frontend/src/app/features/analytics/services/analytics.service.ts
- frontend/src/app/features/analytics/pages/reports.component.ts
- docker-compose.yml
- frontend/src/app/features/analytics/pages/usage.component.ts
- src/main/java/com/rcs/ssf/service/DynamicCrudService.java
🧰 Additional context used
🧬 Code graph analysis (4)

- frontend/src/app/features/analytics/analytics.component.ts (1)
  - frontend/src/app/features/analytics/analytics-routing.module.ts: `Component` (7-16)
- frontend/src/app/features/analytics/analytics-routing.module.ts (4)
  - frontend/src/app/features/analytics/analytics.component.ts: `Component` (12-32)
  - frontend/src/app/features/analytics/pages/dashboard.component.ts: `Component` (19-434)
  - frontend/src/app/app.routes.ts: `routes` (12-123)
  - frontend/src/app/features/analytics/analytics.module.ts: `NgModule` (7-15)
- frontend/src/app/features/analytics/analytics.module.ts (1)
  - frontend/src/app/features/analytics/analytics-routing.module.ts: `NgModule` (77-81)
- frontend/src/app/features/analytics/pages/dashboard.component.ts (1)
  - frontend/src/app/features/analytics/services/analytics.service.ts: `AnalyticsDashboard` (30-40), `LoginAttemptTrendPoint` (42-46)
🪛 Biome (2.1.2)
frontend/src/styles.css

[error] 190-190: Unexpected unknown pseudo-elements: ng-deep
See MDN web docs for more details. Use a known pseudo-element instead.
(lint/correctness/noUnknownPseudoElement)

The same `ng-deep` diagnostic is also reported at lines 199, 206, 216, 223, 232, 238, and 246.

[error] 250-250: Unexpected shorthand property border after border-color
(lint/suspicious/noShorthandPropertyOverrides)
🪛 Shellcheck (0.11.0)
docker-start.sh
[warning] 12-12: This $? refers to echo/printf, not a previous command. Assign to variable to avoid it being overwritten.
(SC2320)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (23)
frontend/src/styles.css (2)
8-184: LGTM: Button and card styling are well-crafted.

The gradient, neon, glass, and 3D button styles, along with the glassmorphism, gradient-border, neumorphism, and holographic card styles, are visually cohesive, accessible (clear hover states), and performant (CSS-based animations). The consistent 0.3s transition timing and smooth transforms enhance UX without overhead.
190-251: The provided diff is incorrect and doesn't match the stated fix.

The review correctly identifies that `border-color: transparent` (line 249) is redundant before `border: 2px solid transparent` (line 251), but the provided diff removes and re-adds the `border` shorthand identically, making no actual change.

The correct fix should remove the redundant `border-color` line:

```diff
 .input-animated ::ng-deep .ant-input:focus {
-    border-color: transparent;
     background:
         linear-gradient(white, white) padding-box,
         linear-gradient(135deg, #667eea 0%, #764ba2 100%) border-box;
     border: 2px solid transparent;
 }
```

Regarding the `::ng-deep` warnings: Biome's CSS linter doesn't recognize this Angular-specific pseudo-element. This is a known false positive when using third-party component libraries (Ant Design) with view encapsulation. Add a rule to `biome.json` to suppress this warning, or use component-scoped styles to avoid piercing encapsulation entirely.

Likely an incorrect or invalid review comment.
docker-start.sh (2)
84-103: Verify HTTP-only health check aligns with SSL/HTTPS setup.

Line 92 performs the nginx readiness check on `http://localhost:80` (HTTP), but the PR description mentions automatic generation of self-signed SSL certificates and HTTPS deployment. Confirm whether:
- nginx is configured to serve both HTTP and HTTPS, and the HTTP check is sufficient to verify general readiness
- The health check should instead probe HTTPS (
:443), requiring curl to skip certificate validation (-kflag)- There's a downstream service (e.g., orchestrator) that performs HTTPS health checks
If HTTPS is the primary protocol, consider updating the health check to validate the secure endpoint:
- if curl -sf http://localhost:80 > /dev/null 2>&1; then + if curl -sfk https://localhost:443 > /dev/null 2>&1; then
134-134: No action required — JAR filename matches correctly.

The Dockerfile (line 58) copies the built JAR `oracle-graphql-1.0.0-RELEASE.jar` and renames it to `app.jar` in the container. The docker-start.sh script (line 134) correctly references this renamed file. The filenames match as intended.

setup-oracle-schema.sh (1)
1-112: Good structure and error handling overall.

The script properly uses `set -e` for fail-fast semantics, validates non-empty inputs, handles sqlplus exit codes gracefully, and provides clear messaging. The issues flagged above are about credential exposure at lines 75 and 98; addressing those will resolve the security concerns.

db/migration/V700__recompile_plsql_packages_fix_id_generation.sql (2)
468-490: Operation normalization for record_audit is now correct.

The code properly normalizes `v_norm_operation` back to SQL verbs (`'INSERT'`, `'SELECT'`, `'UPDATE'`, `'DELETE'`) before calling `record_audit` (lines 468–474 and 484–489). This addresses the critical operation enum mismatch flagged in the previous review—constants like `c_op_create = 'CREATE'` are now correctly mapped to expected SQL verbs before auditing.
398-398: The hard-coded `id` column is correct for the current implementation.

Verification confirms all three tables in `g_allowed_tables` use `id` as their primary key column name:

- USERS: `id NUMBER(19) DEFAULT user_id_seq.NEXTVAL PRIMARY KEY` (sequence-based)
- AUDIT_LOGIN_ATTEMPTS: `id NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY` (identity-based)
- AUDIT_SESSIONS: `id NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY` (identity-based)

The code successfully achieves the stated PR objective of supporting both SEQUENCE-based and IDENTITY-based primary keys. The hard-coded `id` in the RETURNING clause works correctly for all allowed tables. No dynamic PK lookup helper exists in the codebase, and none is necessary given the consistent use of `id` across all allowed tables.

Likely an incorrect or invalid review comment.
src/test/java/com/rcs/ssf/metrics/IndexEffectivenessAnalyzerTest.java (1)
24-32: Constructor mismatch resolved.

The previous constructor issue has been fixed. The test now correctly instantiates `IndexEffectivenessAnalyzer` with `(MeterRegistry, JdbcTemplate)`, matching the production signature.

src/main/java/com/rcs/ssf/metrics/SlowQueryAnalyzer.java (6)
63-100: Excellent constructor validation.

The defensive parameter validation is thorough and provides clear error messages. The unused `planRegressionThresholdPercent` field is acceptable given the TODO at lines 291-296 explaining the deferred baseline comparison feature.
115-150: LGTM!

The main analysis flow is well-structured with appropriate null checks, fast-path optimization for queries under the threshold, and proper error handling. The conditional execution based on feature flags (`executionPlanCaptureEnabled`, `recommendationGenerationEnabled`) is a good design choice.
483-603: Excellent dual-strategy approach for WHERE clause extraction.

The combination of JSqlParser for robust parsing (lines 483-533) with a regex-based fallback (lines 535-603) provides both accuracy and resilience. The visitor pattern, parameter filtering, and defensive error handling are well-implemented.
605-624: Correct use of Counter instead of Gauge for recommendation metrics.

Using `Counter` (line 614) instead of `Gauge` is the right choice to avoid memory leaks from repeated meter registration. The tagging strategy with `query_hash`, `type`, and `confidence` enables effective metric filtering.
628-670: LGTM!

The hash generation using SHA-256 is appropriate for query deduplication, and the fallback to `hashCode()` provides resilience. The `ExecutionPlan` inner class is cleanly designed as an immutable data holder.
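The SHA-256-with-fallback pattern described here can be illustrated with a small standalone sketch. The class name, helper signature, and 16-character truncation are assumptions for illustration, not the actual SlowQueryAnalyzer code:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public final class QueryHasher {

    // Hypothetical helper showing SHA-256 hashing with a hashCode() fallback
    // for query deduplication; truncation keeps metric tag cardinality small.
    static String queryHash(String normalizedSql) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(normalizedSql.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            // Keep a stable 16-hex-char prefix of the 64-char digest
            return hex.substring(0, 16);
        } catch (NoSuchAlgorithmException e) {
            // Fallback: far weaker, but keeps deduplication functional
            return Integer.toHexString(normalizedSql.hashCode());
        }
    }

    public static void main(String[] args) {
        String sql = "SELECT * FROM users WHERE id = ?";
        // The same normalized SQL must always hash to the same value
        System.out.println(queryHash(sql));
    }
}
```

The key property for deduplication is determinism: the same normalized SQL text always yields the same tag value, regardless of JVM or restart.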
206-258: The getSqlId method already includes performance optimizations; environmental validation is appropriate.

The code implements a two-stage lookup strategy that mitigates the LIKE performance concern:
- Exact match first (lines 222–228): Attempts canonicalized SQL comparison before falling back to prefix matching.
- Intentional 500-character prefix (line 235): The code explicitly uses 500 characters (not 100) to reduce false positives, as documented in the comments.
- Proper query ordering (lines 226, 245): Orders by `last_active_time DESC` and fetches only the first row, reducing unnecessary result processing.
- Correct ESCAPE handling (lines 236–239): LIKE wildcards are properly escaped to prevent unintended pattern matching.
The design shows thoughtful optimization. Performance impact will be environment-dependent (tied to v$sql size and query frequency in your system). Monitoring the query execution times for this lookup would be appropriate to validate behavior under production load, but no code changes are required at this time.
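The ESCAPE handling called out above follows a standard pattern. A minimal sketch, where the helper name and the backslash escape character are assumptions rather than the actual getSqlId code:

```java
public final class LikeEscaper {

    // Escape LIKE wildcards so a SQL-text prefix is matched literally.
    // Assumes the query declares "... LIKE ? ESCAPE '\'" on the database side.
    static String escapeLikeWildcards(String prefix) {
        return prefix
                .replace("\\", "\\\\")  // escape the escape character first
                .replace("%", "\\%")    // then the two LIKE wildcards
                .replace("_", "\\_");
    }

    public static void main(String[] args) {
        String raw = "SELECT sql_text FROM v$sql WHERE sql_text LIKE 'a_b%'";
        // Append '%' only after escaping, so just the prefix stays a wildcard
        System.out.println(escapeLikeWildcards(raw) + "%");
    }
}
```

The ordering matters: escaping `\` first prevents the later `%` and `_` replacements from double-escaping their own escape characters.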
build.gradle (3)
34-40: SQL directory structure verified and correctly mapped.

The `sql/` directory exists at the project root and contains a well-organized structure with database resources (cleanup scripts, grants, indexes, packages, sequences, and tables). The `srcDir 'sql'` configuration is valid and properly maps these SQL files as build resources.
194-196: Directory is properly committed and contains migration files—no action needed.

Verification confirms the `db/migration` directory exists and is properly committed to the repository with multiple SQL migration files present. The `failOnMissingLocations = true` setting will not cause unexpected failures on initial project setup.
68-68: No action required—BOM-managed reactor-netty version is correctly configured.
Omitting a version to allow pickup from the BOM is the standard approach, and the inline comment accurately notes Spring Boot 3.5.7 provides reactor-netty 1.1.8. The web search confirms oracle-r2dbc is interoperable with r2dbc-pool, and no documented conflicts exist between reactor-netty 1.1.8, oracle-r2dbc 1.1.1, and r2dbc-pool 1.0.1 in production environments. The dependency declaration is correct as-is.
frontend/src/app/app.component.html (1)
36-64: Analytics submenu wiring looks consistent with feature routing.

The new submenu cleanly maps to the `/analytics/{dashboard,performance,usage,reports}` routes and matches the existing menu structure; no functional issues spotted.

frontend/src/app/features/analytics/analytics-routing.module.ts (1)
6-75: Lazy-loaded analytics routing with safe fallback looks solid.

The `safeLoadComponent` wrapper plus `AnalyticsLoadErrorComponent` gives a clear, resilient fallback for failed lazy-loads, and the child routes under `AnalyticsComponent` (dashboard/performance/usage/reports with redirects to `dashboard`) line up cleanly with the intended `/analytics/*` structure.
5-31: Minimal analytics container is appropriate.

This standalone `AnalyticsComponent` is a straightforward routing host with scoped layout styling; it keeps the feature boundary clear without adding unnecessary logic.
375-433: Defensive helpers around login counts and dates look good
`getSuccessfulLoginAttempts` safely clamps negatives, the health color/icon helpers centralize status mapping, and `formatDate` gracefully falls back on invalid or unexpected date strings—these should make the dashboard resilient to mildly inconsistent backend data without blowing up the UI.
173-187: LGTM! Clean immutable snapshot implementation.

The `MetricsSnapshot` class correctly implements an immutable data holder with final fields, supporting the thread-safe atomic update pattern used in `BatchMetrics`. The private visibility is appropriate since this is an internal implementation detail.
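The immutable-snapshot-plus-atomic-update pattern the comment praises can be sketched as follows. The field names and class shape are illustrative assumptions, not the actual `BatchMetricsRecorder` internals:

```java
import java.util.concurrent.atomic.AtomicReference;

public final class BatchMetricsSketch {

    // Immutable snapshot: all fields final, state replaced wholesale so
    // readers always observe a consistent (success, failure) pair.
    private static final class Snapshot {
        final long successCount;
        final long failureCount;

        Snapshot(long successCount, long failureCount) {
            this.successCount = successCount;
            this.failureCount = failureCount;
        }
    }

    private final AtomicReference<Snapshot> state =
            new AtomicReference<>(new Snapshot(0, 0));

    void recordSuccess() {
        // CAS loop: builds a new snapshot rather than mutating in place
        state.updateAndGet(s -> new Snapshot(s.successCount + 1, s.failureCount));
    }

    long successCount() {
        return state.get().successCount;
    }

    public static void main(String[] args) {
        BatchMetricsSketch metrics = new BatchMetricsSketch();
        metrics.recordSuccess();
        metrics.recordSuccess();
        System.out.println(metrics.successCount()); // prints 2
    }
}
```

The design choice: because the snapshot is immutable, readers never need a lock; torn reads across the two counters are impossible by construction.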
65-65: No verification needed—the concern is not supported by evidence in the codebase.

The review comment raises a hypothetical concern about external callers using the old `recordSuccess` signature with 4 parameters. However, codebase verification reveals:

- Zero calls to `recordSuccess` with 4 parameters anywhere in the codebase
- All internal calls (lines 65, 106) correctly use the new 3-parameter signature
- Test file references `BatchMetrics` only through `getMetrics()`, never directly invoking `recordSuccess`
- No evidence of external code calling `recordSuccess` with the legacy signature

While `BatchMetrics` is indeed `public static` (line 193) and theoretically accessible externally, the actual usage patterns show it is not called directly from outside `BatchMetricsRecorder`. The refactoring is safe.

Likely an incorrect or invalid review comment.
```sql
CREATE OR REPLACE PACKAGE dynamic_crud_pkg AS
  -- Supported operation names
  c_op_create CONSTANT VARCHAR2(10) := 'CREATE';
  c_op_read   CONSTANT VARCHAR2(10) := 'READ';
  c_op_update CONSTANT VARCHAR2(10) := 'UPDATE';
  c_op_delete CONSTANT VARCHAR2(10) := 'DELETE';

  SUBTYPE t_operation IS VARCHAR2(10);

  -- Metadata helpers
  FUNCTION is_table_allowed(p_table_name IN VARCHAR2) RETURN BOOLEAN;
  FUNCTION normalize_table_name(p_table_name IN VARCHAR2) RETURN VARCHAR2;

  -- Execute a single-row CRUD operation
  PROCEDURE execute_operation(
    p_table_name    IN  VARCHAR2,
    p_operation     IN  t_operation,
    p_column_names  IN  dyn_column_name_nt,
    p_column_values IN  dyn_column_value_nt,
    p_filters       IN  dyn_filter_nt DEFAULT NULL,
    p_audit         IN  dyn_audit_ctx_rec DEFAULT NULL,
    p_message       OUT VARCHAR2,
    p_generated_id  OUT VARCHAR2,
    p_affected_rows OUT NUMBER
  );

  -- Execute a bulk CRUD operation using collection payload
  PROCEDURE execute_bulk(
    p_table_name IN  VARCHAR2,
    p_operation  IN  t_operation,
    p_rows       IN  dyn_row_op_nt,
    p_filters    IN  dyn_filter_nt DEFAULT NULL,
    p_audit      IN  dyn_audit_ctx_rec DEFAULT NULL,
    p_message    OUT VARCHAR2,
    p_affected   OUT NUMBER
  );
END dynamic_crud_pkg;
```
🧩 Analysis chain
Embedded package copy diverges from canonical standalone; risks deployment drift.
This migration recompiles the entire dynamic_crud_pkg spec and body inline (lines 26–531), duplicating sql/packages/dynamic_crud_pkg_body.sql. While the past review suggested either extracting a canonical source or ensuring byte-for-byte parity, the current approach remains an embedded copy.
Risk: Whichever file is applied last to the database "wins." If future changes are made to the standalone file, this migration becomes stale unless manually synchronized. The past review example shows operation constants and audit logic already diverging between the two versions.
Recommendation: Either:
- Replace the embedded package with a call to source the standalone file (if Flyway/SQL*Plus supports it), or
- Generate this migration from the canonical standalone file as part of the build/CI pipeline to ensure they stay synchronized.
Check whether the standalone package file and this migration are already divergent by comparing operation handling and PK lookup logic:
Also applies to: 66-531
🏁 Script executed:
#!/bin/bash
# Compare key sections between standalone and migration
echo "=== Operation constants in standalone ===" && \
rg -n "c_op_create|c_op_read|c_op_update|c_op_delete" sql/packages/dynamic_crud_pkg_spec.sql -A 1
echo -e "\n=== Operation constants in migration ===" && \
rg -n "c_op_create|c_op_read|c_op_update|c_op_delete" db/migration/V700__recompile_plsql_packages_fix_id_generation.sql -A 1 | head -20
echo -e "\n=== Checking if standalone has PK lookup helper ===" && \
rg -n "FUNCTION.*get.*pk\|FUNCTION.*pk.*column\|find.*primary" -i sql/packages/dynamic_crud_pkg_body.sql

Length of output: 1161
The scripts confirm a critical divergence: the operation constant values differ fundamentally between the two files.
Embedded package constants diverge critically from standalone version—migration will not execute correctly.
The verification confirms the operation constants differ fundamentally and create a broken normalization path:
- Standalone file (`sql/packages/dynamic_crud_pkg_spec.sql`): `c_op_create := 'INSERT'`, `c_op_read := 'SELECT'`
- Migration file (lines 28–31): `c_op_create := 'CREATE'`, `c_op_read := 'READ'`
The migration's normalization logic (lines 346–348) converts input strings 'INSERT' and 'SELECT' to the package constants c_op_create and c_op_read, but these constants do not match the input values. Subsequent conditional checks (lines 372, 421) that evaluate WHEN c_op_create will fail because the assignment never completes—the normalized variables remain unset, causing the ELSE clause to trigger and the operation to fail silently or error.
Fix required: Align the migration's operation constants with the standalone file values ('INSERT', 'SELECT'), or rewrite the normalization logic to match the new constant definitions.
🤖 Prompt for AI Agents
In db/migration/V700__recompile_plsql_packages_fix_id_generation.sql around
lines 26 to 62, the package operation constants currently use 'CREATE' and
'READ' but the normalization and conditional logic expect 'INSERT' and 'SELECT'
(as in sql/packages/dynamic_crud_pkg_spec.sql); change c_op_create to 'INSERT'
and c_op_read to 'SELECT' (or alternatively update the normalization/conditional
checks to map to the new names) so the normalized operation values match the
constants used in the WHEN branches and downstream logic.
| cleanup() { | ||
| local exit_code=$? |
Fix the exit code capture in trap handler.
Line 12 uses $? immediately after the function is defined, which captures the exit status of the local command (always 0), not any meaningful exit code. When a signal-based trap (SIGTERM/SIGINT) is triggered, there is no command exit code to preserve. The cleanup function will always exit with code 0, losing any error context.
Consider one of these approaches:
- Remove `$?` and use a fixed exit code (e.g., `exit 1` on error, `exit 0` on success)
- Accept the exit code as a parameter: `cleanup() { local exit_code=${1:-0}; ... }`
- Use a global variable to track error states explicitly
```diff
-cleanup() {
-    local exit_code=$?
-
+cleanup() {
+    local exit_code=${1:-0}
+
```

Then update the trap invocation to pass exit codes explicitly when needed, or use a global variable to track error states from the main script.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
 cleanup() {
-    local exit_code=$?
+    local exit_code=${1:-0}
```
🧰 Tools
🪛 Shellcheck (0.11.0)
[warning] 12-12: This $? refers to echo/printf, not a previous command. Assign to variable to avoid it being overwritten.
(SC2320)
🤖 Prompt for AI Agents
In docker-start.sh around lines 11 to 12, the trap handler captures the exit
code incorrectly by reading $? inside the cleanup function immediately after
declaring a local variable (which yields 0); change cleanup to accept an
explicit exit code parameter or use a global status variable and update trap
invocations to pass the intended code. For example, redefine cleanup to take
exit_code=${1:-0} and call trap 'cleanup 1' for signals or call cleanup "$?" on
normal exit, or alternatively set a global error flag in the script and have
cleanup use that flag to decide exit 0 vs 1; ensure trap and normal-exit paths
pass the correct code so cleanup exits with meaningful status.
```sh
echo "=========================================================================="
echo ""
echo "Executing from: $SQL_DIR"
echo "Connection: $CONN_STRING"
```
Remove password from output—credentials printed to terminal and logs.
Command-line parameters can be read by anyone via /proc/<pid>/cmdline, and using sqlplus /nolog avoids exposing the password to process monitoring. Line 75 echoes the full CONN_STRING containing the password to the terminal and any captured output/logs.
Remove the connection string from output, or redact the password:
```diff
 echo ""
 echo "=========================================================================="
 echo "SETTING UP SCHEMA"
 echo "=========================================================================="
 echo ""
 echo "Executing from: $SQL_DIR"
-echo "Connection: $CONN_STRING"
+echo "Connection: ${ORACLE_USER}@${ORACLE_CONN}"
 echo ""
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In setup-oracle-schema.sh around line 75, the script currently echoes the full
CONN_STRING which exposes credentials; remove the echo or redact the password
before printing (e.g., replace password portion with ****) and prefer invoking
sqlplus with /nolog and then connecting with a secure CONNECT operation so
credentials are not visible in process listings or logs.
```sh
sqlplus "$CONN_STRING" << 'EOF'
SET HEADING ON FEEDBACK ON TIMING ON ECHO ON
@master.sql
EXIT
EOF
```
Avoid passing password on sqlplus command line—use /nolog with heredoc connection.
The recommended pattern for sqlplus is sqlplus /nolog with credentials passed via heredoc, which prevents the password from showing in process listings. Currently, the password is visible to any process monitor running ps.
Refactor to connect within the sqlplus session:
```diff
 # Run sqlplus script
 # Temporarily disable errexit to capture sqlplus exit code and handle failures gracefully
 set +e
-sqlplus "$CONN_STRING" << 'EOF'
+sqlplus /nolog << EOF
+CONNECT ${ORACLE_USER}/${ORACLE_PASSWORD}@${ORACLE_CONN}
 SET HEADING ON FEEDBACK ON TIMING ON ECHO ON
 @master.sql
 EXIT
 EOF
```

Note the unquoted `EOF` delimiter so the connection variables expand inside the heredoc.

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In setup-oracle-schema.sh around lines 98-102, the script invokes sqlplus with
the full connection string on the command line exposing the password; change to
use sqlplus /nolog and perform CONNECT inside the heredoc so credentials are not
visible in process listings: replace the sqlplus "$CONN_STRING" invocation with
sqlplus /nolog <<EOF (note: remove the quoted 'EOF' so variables expand) and add
a first line inside the heredoc that runs CONNECT "$CONN_STRING" before the
SET/@master.sql/EXIT lines; keep the rest of the heredoc intact.
```java
@Test
@DisplayName("Should handle zero total indexes gracefully")
void testZeroTotalIndexes() {
    // If no indexes exist, utility should be 0
    double utility = (0.0 * 100.0) / (0.0 + 1.0);
    assertEquals(0.0, utility, 0.01);
}

@Test
@DisplayName("Should handle high usage count")
void testHighUsageCount() {
    // High usage should result in high utility score
    double utility = (1000.0 * 100.0) / (50.0 + 1.0);
    assertTrue(utility > 95.0);
}
```
Tests don't invoke analyzer methods.
Both testZeroTotalIndexes and testHighUsageCount perform inline arithmetic calculations without calling any methods on the analyzer. Like testUtilityScoreCalculation, these tests provide false coverage and should be rewritten to verify the analyzer's actual behavior.
Consider consolidating these edge cases into parameterized tests that actually invoke the analyzer with mocked data:
@ParameterizedTest
@CsvSource({
"0, 0, 0.0", // zero total indexes
"1000, 50, 96.08" // high usage count
})
@DisplayName("Should handle edge cases in utility calculation")
void testUtilityEdgeCases(long accessCount, int totalIndexes, double expectedMinUtility) {
when(jdbcTemplate.queryForList(any(String.class))).thenReturn(List.of(
Map.of(
"INDEX_NAME", "EDGE_CASE_IDX",
"TABLE_NAME", "TEST",
"TOTAL_SIZE_BYTES", new java.math.BigDecimal("1000"),
"ACCESS_COUNT", new java.math.BigDecimal(accessCount),
"READ_COUNT", new java.math.BigDecimal("0"),
"WRITE_COUNT", new java.math.BigDecimal("0")
)
));
when(jdbcTemplate.update(any(String.class), (Object[]) any())).thenReturn(1);
analyzer.analyzeIndexEffectiveness();
// Verify utility score was calculated
assertDoesNotThrow(() -> meterRegistry.get("oracle.index.utility_score")
.tag("index_name", "EDGE_CASE_IDX")
.gauge());
}

🤖 Prompt for AI Agents
In src/test/java/com/rcs/ssf/metrics/IndexEffectivenessAnalyzerTest.java around
lines 66-80, the two tests perform inline math instead of invoking the analyzer;
rewrite them to actually call analyzer.analyzeIndexEffectiveness() with mocked
JDBC results and metric updates. Replace the two inline tests with a
parameterized test (CsvSource for accessCount, totalIndexes, expectedMinUtility)
that stubs jdbcTemplate.queryForList(...) to return a Map for an index
(ACCESS_COUNT as BigDecimal, TOTAL_SIZE_BYTES, etc.), stubs
jdbcTemplate.update(...) as needed, calls analyzer.analyzeIndexEffectiveness(),
and then asserts the metric was recorded (e.g. retrieve the gauge from
meterRegistry by name and tags and assert it exists or meets the expected
value). Ensure mocks are reset/verified per case and keep display name and
assertions robust to floating-point variance.
- Deleted `EXPLAIN_PLAN_ANALYSIS.sql` and `OPTIMIZED_QUERY.sql` as they are no longer needed for query optimization analysis. - Removed `PROJECT_PLAN.md` which contained outdated project roadmap and implementation details. - Eliminated `generate_bcrypt.py` script for generating bcrypt passwords, as it is no longer in use.
- Introduced new TLS/SSL certificates and keys for secure communication. - Configured Logstash to accept TCP input with mTLS for Spring Boot logs. - Added Beats input for Nginx/Infrastructure logs with SSL enabled. - Implemented JDBC input for fetching audit logs from Oracle database. - Updated Logback configuration to include Logstash TCP appender for structured logging. - Added necessary drivers and configuration files for JDBC connectivity.
|
| GitGuardian id | GitGuardian status | Secret | Commit | Filename | |
|---|---|---|---|---|---|
| 22718269 | Triggered | Generic Private Key | 99d6ea3 | logstash/certs/client.pkcs8.key | View secret |
| 22718270 | Triggered | Generic Private Key | 99d6ea3 | logstash/certs/server.key | View secret |
| 22718271 | Triggered | Generic Private Key | 99d6ea3 | logstash/certs/ca.key | View secret |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace and store your secrets safely. Learn here the best practices.
- Revoke and rotate these secrets.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future consider
- following these best practices for managing and storing secrets including API keys and other credentials
- install secret detection on pre-commit to catch secret before it leaves your machine and ease remediation.
🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.
Actionable comments posted: 8
🧹 Nitpick comments (1)
logstash/config/pipelines.conf (1)
23-35: Consider JDBC connection reliability improvements.

The JDBC input configuration could benefit from additional resilience and flexibility settings:
Recommendations:
Parameterize the connection string for flexibility:
```diff
- jdbc_connection_string => "jdbc:oracle:thin:@oracle:1521/FREEPDB1"
+ jdbc_connection_string => "${JDBC_CONNECTION_STRING:-jdbc:oracle:thin:@oracle:1521/FREEPDB1}"
```

Add connection validation and pooling to prevent stale connections:

```
jdbc {
  # ... existing config ...
  jdbc_validate_connection => true
  jdbc_validation_timeout => 5
  jdbc_pool_timeout => 30
  connection_retry_attempts => 3
  connection_retry_attempts_wait_time => 5
}
```

Evaluate the polling schedule (`* * * * *` = every minute). For audit logs, this may be aggressive. Consider reducing frequency if real-time ingestion isn't required:

```
schedule => "*/5 * * * *"  # Every 5 minutes
```

Add error handling to prevent pipeline failures on database issues:

```
jdbc {
  # ... existing config ...
  jdbc_fetch_size => 100
  clean_run => false  # Preserve state across restarts
}
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- `logstash/drivers/ojdbc11.jar` is excluded by `!**/*.jar`
📒 Files selected for processing (16)
- `ROADMAP_OBSERVABILITY.md` (1 hunks)
- `docker-compose.yml` (3 hunks)
- `logstash/Dockerfile` (1 hunks)
- `logstash/certs/ca.crt` (1 hunks)
- `logstash/certs/ca.key` (1 hunks)
- `logstash/certs/ca.srl` (1 hunks)
- `logstash/certs/client.crt` (1 hunks)
- `logstash/certs/client.csr` (1 hunks)
- `logstash/certs/client.key` (1 hunks)
- `logstash/certs/client.pkcs8.key` (1 hunks)
- `logstash/certs/server.crt` (1 hunks)
- `logstash/certs/server.csr` (1 hunks)
- `logstash/certs/server.key` (1 hunks)
- `logstash/certs/server.pkcs8.key` (1 hunks)
- `logstash/config/pipelines.conf` (1 hunks)
- `src/main/resources/logback-spring.xml` (5 hunks)
✅ Files skipped from review due to trivial changes (9)
- logstash/certs/client.crt
- logstash/certs/server.crt
- logstash/certs/client.csr
- logstash/certs/ca.srl
- logstash/certs/server.pkcs8.key
- logstash/certs/client.pkcs8.key
- logstash/certs/server.csr
- logstash/certs/ca.crt
- logstash/certs/server.key
🧰 Additional context used
🪛 Gitleaks (8.29.0)
logstash/certs/client.key
[high] 1-52: Identified a Private Key, which may compromise cryptographic security and sensitive data encryption.
(private-key)
logstash/certs/ca.key
[high] 1-52: Identified a Private Key, which may compromise cryptographic security and sensitive data encryption.
(private-key)
🪛 markdownlint-cli2 (0.18.1)
ROADMAP_OBSERVABILITY.md
178-178: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
244-244: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
290-290: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
412-412: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (4)
logstash/Dockerfile (1)
1-6: LGTM: Secure privilege escalation pattern.

The Dockerfile correctly follows security best practices by temporarily elevating to root only for the necessary file operations, then returning to the unprivileged `logstash` user. The file permissions (644) are appropriately restrictive for a JAR file.

docker-compose.yml (2)
138-156: LGTM: Kibana service properly configured.

The Kibana service is correctly configured with proper health checks and dependency management on Elasticsearch.
85-85: Good improvement: More permissive Elasticsearch healthcheck.

The updated healthcheck regex accepts both green and yellow cluster status, which is appropriate for single-node development environments that cannot achieve green status (no replicas).
src/main/resources/logback-spring.xml (1)
24-27: Good addition: MDC trace correlation fields.

Flattening MDC keys for trace_id, user_id, and request_id improves log correlation and integration with distributed tracing systems.
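As an illustration of the flattened-MDC idea (not the project's actual Logback/SLF4J wiring), a minimal sketch of thread-local correlation fields emitted as flat, top-level JSON keys might look like this; the class and method names are assumptions:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public final class MdcSketch {

    // Thread-local context map, mimicking what SLF4J's MDC provides
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(LinkedHashMap::new);

    static void put(String key, String value) {
        CONTEXT.get().put(key, value);
    }

    // Render a log event with the context keys flattened to top-level fields,
    // the same shape the logback change gives trace_id/user_id/request_id
    static String format(String message) {
        StringBuilder sb = new StringBuilder("{\"message\":\"").append(message).append('"');
        for (Map.Entry<String, String> e : CONTEXT.get().entrySet()) {
            sb.append(",\"").append(e.getKey()).append("\":\"").append(e.getValue()).append('"');
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        put("trace_id", "abc123");
        put("user_id", "42");
        put("request_id", "req-7");
        System.out.println(format("login ok"));
    }
}
```

Flat top-level keys (rather than a nested `mdc` object) are what let log backends index and filter on `trace_id` directly, which is the point of the logback-spring.xml change.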
| ``` | ||
| ┌──────────────────────────────────────────────────────────────┐ | ||
| │ Index Effectiveness & Optimization Pipeline │ | ||
| └──────────────────────────────────────────────────────────────┘ | ||
|
|
||
| ┌─ COLLECTION PHASE (Scheduled, 24h cycle) ──────────────────┐ | ||
| │ ┌──────────────────────────────────────────────────────────┐ │ | ||
| │ │ Oracle DBA Metadata Query │ │ | ||
| │ │ - DBA_INDEX_USAGE (index read/write stats) │ │ | ||
| │ │ - DBA_INDEXES (size, status, compression) │ │ | ||
| │ │ - DBA_TAB_COLUMNS (column stats, NDV) │ │ | ||
| │ │ - DBA_SQL_PLAN_BASELINES (baseline plans, usage) │ │ | ||
| │ │ - V$SQL + V$SQL_PLAN (runtime execution plans) │ │ | ||
| │ └──────────────────────────────────────────────────────────┘ │ | ||
| │ ↓ │ | ||
| │ ┌──────────────────────────────────────────────────────────┐ │ | ||
| │ │ SlowQueryAnalyzer Component (Java) │ │ | ||
| │ │ - Intercept queries >1000ms via JDBC/R2DBC listeners │ │ | ||
| │ │ - Extract execution plan from V$SQL_PLAN │ │ | ||
| │ │ - Identify bottleneck: full table scans, missing indexes │ │ | ||
| │ │ - Correlate with QueryPlanAnalyzer baseline │ │ | ||
| │ │ - Generate recommendations: new index, hint rewrite │ │ | ||
| │ └──────────────────────────────────────────────────────────┘ │ | ||
| │ ↓ │ | ||
| │ ┌──────────────────────────────────────────────────────────┐ │ | ||
| │ │ AUDIT_INDEX_ANALYSIS table (persist findings) │ │ | ||
| │ │ - analysis_id, table_name, index_name, utility_score │ │ | ||
| │ │ - action (REMOVE, ADD, REBUILD), rationale │ │ | ||
| │ │ - analyzed_at, recommended_by (component) │ │ | ||
| │ └──────────────────────────────────────────────────────────┘ │ | ||
| └──────────────────────────────────────────────────────────────┘ | ||
|
|
||
| ┌─ ANALYTICS PHASE (Real-time query analysis) ────────────────┐ | ||
| │ ┌──────────────────────────────────────────────────────────┐ │ | ||
| │ │ On Each Slow Query (>1s): │ │ | ||
| │ │ 1. Extract plan from Oracle V$SQL_PLAN │ │ | ||
| │ │ 2. Parse for: TABLE ACCESS FULL, INDEX RANGE SCAN │ │ | ||
| │ │ 3. Compare to QueryPlanAnalyzer baseline │ │ | ||
| │ │ 4. If regression (>5% slower than baseline): │ │ | ||
| │ │ - Log alert: "Query plan regression detected" │ │ | ||
| │ │ - Suggest index or query rewrite │ │ | ||
| │ │ - Store in AUDIT_QUERY_RECOMMENDATIONS │ │ | ||
| │ │ 5. Emit Prometheus metric: query_plan_regression │ │ | ||
| │ └──────────────────────────────────────────────────────────┘ │ | ||
| └──────────────────────────────────────────────────────────────┘ | ||
|
|
||
| ┌─ DASHBOARD PHASE (Visualization & Remediation) ────────────┐ | ||
| │ ┌──────────────────────────────────────────────────────────┐ │ | ||
| │ │ Grafana Dashboard: index-effectiveness-dashboard.json │ │ | ||
| │ │ - Panel 1: Index Utility Score (bar chart) │ │ | ||
| │ │ Top unused indexes (utility <10%) │ │ | ||
| │ │ - Panel 2: Missing Indexes (table) │ │ | ||
| │ │ Columns with high query count, no matching index │ │ | ||
| │ │ - Panel 3: Slow Query Analysis (table) │ │ | ||
| │ │ Query hash, plan changes, recommendations │ │ | ||
| │ │ - Panel 4: Index Size vs Usage (scatter) │ │ | ||
| │ │ Identify bloated indexes (large size, low utility) │ │ | ||
| │ │ - Panel 5: Query Regression Alerts (timeline) │ │ | ||
| │ │ Plan changes, execution time variance │ │ | ||
| │ └──────────────────────────────────────────────────────────┘ │ | ||
| └──────────────────────────────────────────────────────────────┘ | ||
- **Detailed Tasks**:
  - [ ] **Index Usage Analysis (24h Scheduled Task)**:
    - **Component**: `com.rcs.ssf.metrics.IndexEffectivenessAnalyzer`
    - **Trigger**: `@Scheduled(fixedRate = 86400000)` (24 hours)
    - **Implementation**:
      ```text
      1. Query Oracle DBA_INDEX_USAGE for each user index:
         SELECT index_name, table_name, object_id,
                leaf_blocks, distinct_keys,
                btree_space, used_space,
                access_count (total reads + writes)
         FROM dba_index_usage
         WHERE owner = 'APP_USER' AND index_type = 'NORMAL'

      2. For each index, calculate utility metric:
         - Total index size (bytes) = leaf_blocks × 8192
         - Read weight = 1.0 (beneficial)
         - Write weight = 2.0 (maintenance cost)
         - utility = (access_count × read_weight - write_count × write_weight) / total_size

      3. Flag candidates:
         - REMOVE: utility < 10% AND access_count < 100 per day
           (extremely low usage, maintenance cost not justified)
         - MONITOR: utility 10-30%
           (potential candidate if business case weak)
         - KEEP: utility >= 30% (actively used, maintain)

      4. For each table, cross-check DBA_TAB_COLUMNS against existing indexes:
         - Identify high-cardinality columns (NDV > 100) without index
         - Check query patterns in QueryPlanAnalyzer for those columns
         - If query count > 1000/day AND no index: recommend new composite

      5. Persist findings in AUDIT_INDEX_ANALYSIS:
         INSERT INTO AUDIT_INDEX_ANALYSIS (
           table_name, index_name, utility_score,
           size_mb, access_count, action, rationale,
           analyzed_at, created_at
         )
      ```
    - **Metrics Exported**:
      - Gauge: `oracle.index.utility_score{table, index}` (0-100%)
      - Gauge: `oracle.index.size_mb{table, index}`
      - Counter: `oracle.index.action{table, index, action=REMOVE|ADD|MONITOR}`
    - **Dashboard Integration**:
      - Bar chart: Top 20 indexes by utility (descending)
      - Table: Unused indexes (utility <10%, sortable by size, action recommended)
      - Table: Missing indexes (table, column_name, estimated_impact_percent)
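The utility formula and flagging thresholds described above can be sketched in plain Java. This is a minimal sketch: the class and method names are illustrative assumptions, not the actual `IndexEffectivenessAnalyzer` API.

```java
// Illustrative sketch of steps 2-3 above. Weights and thresholds are taken
// from the roadmap; everything else (names, signatures) is hypothetical.
public class IndexUtilitySketch {
    static final double READ_WEIGHT = 1.0;    // reads are beneficial
    static final double WRITE_WEIGHT = 2.0;   // writes carry maintenance cost
    static final long BLOCK_SIZE_BYTES = 8192L;

    /** Net benefit per byte of index storage (step 2). */
    public static double utility(long accessCount, long writeCount, long leafBlocks) {
        double totalSizeBytes = leafBlocks * (double) BLOCK_SIZE_BYTES;
        if (totalSizeBytes <= 0) {
            return 0.0;
        }
        return (accessCount * READ_WEIGHT - writeCount * WRITE_WEIGHT) / totalSizeBytes;
    }

    /** Flagging rules from step 3, applied to a normalized 0-100 utility score. */
    public static String action(double utilityPercent, long accessPerDay) {
        if (utilityPercent < 10.0 && accessPerDay < 100) {
            return "REMOVE";  // extremely low usage, maintenance cost not justified
        }
        if (utilityPercent < 30.0) {
            return "MONITOR"; // potential removal candidate if business case is weak
        }
        return "KEEP";        // actively used, maintain
    }
}
```

Note the edge case: an index with low utility but heavy daily access falls into MONITOR rather than REMOVE, matching the AND condition in step 3.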
  - [ ] **Slow Query Capture & Analysis (Real-time)**:
    - **Component**: `com.rcs.ssf.metrics.SlowQueryAnalyzer` (NEW)
    - **Trigger**: JDBC/R2DBC listener on every query execution (O(1) fast path for <1s queries)
    - **Implementation**:
      ```text
      1. Intercept query at JDBC/R2DBC execution layer:
         - Capture: SQL text, execution_time_ms, rows_affected
         - Query threshold: 1000ms (configurable via application.yml)
         - Fast path: Skip analysis for <1000ms (99% of queries)

      2. For slow queries (>1000ms):
         - Extract Oracle V$SQL_PLAN for query plan:
           SELECT * FROM v$sql_plan
           WHERE sql_id = ?
           ORDER BY plan_line_id

         - Parse plan for indicators:
           * TABLE ACCESS FULL (missing index)
           * FILTER operations (correlated subqueries)
           * NESTED LOOP with large row count (join inefficiency)
           * SORT (not optimal for order)

         - Extract execution statistics:
           SELECT * FROM v$sqlstats
           WHERE sql_id = ?

      3. Correlation Analysis (with QueryPlanAnalyzer baseline):
         - Get baseline for query from queryBaselines map
         - Calculate variance: |current - baseline| / baseline
         - Detect regression: if variance > 5% threshold:
           - Flag as potential problem
           - Recommend investigation
           - Store in AUDIT_QUERY_RECOMMENDATIONS

      4. Generate Recommendations:
         IF plan shows TABLE ACCESS FULL:
           - Recommend composite index: (leading_where_col, sort_col)
           - Estimate improvement: cardinality × index_selectivity

         IF plan shows NESTED LOOP with large inner table:
           - Recommend HASH JOIN hint or index on join column
           - Estimate improvement: rows × inner_table_scans_avoided

         IF plan shows correlation (multiple FILTER):
           - Recommend denormalization or materialized view
           - Estimate improvement: cascade effect on other queries

         Confidence levels:
           - HIGH (>80%): Missing simple index, obvious inefficiency
           - MEDIUM (50-80%): Needs DBA verification
           - LOW (<50%): Speculative, requires expert review

      5. Persistence:
         INSERT INTO AUDIT_QUERY_RECOMMENDATIONS (
           query_hash, execution_plan, recommendation_type,
           confidence, estimated_improvement_percent,
           index_definition, hint_suggestion, rationale,
           created_at
         )

      6. Alert:
         IF regression detected:
           - Log WARN: "Query plan regression for [query_hash]"
           - Emit counter: query.plan.regression
           - If critical (>50% slower): Emit ALERT to monitoring
      ```
    - **Metrics Exported**:
      - Counter: `query.slow_detected{query_hash}` (incremented for >1s queries)
      - Counter: `query.plan.regression{query_hash}` (plan deviation >5%)
      - Timer: `query.recommendation.generation_ms` (recommendation calc time)
      - Gauge: `query.recommendation.confidence{query_hash, type}` (HIGH/MEDIUM/LOW)
    - **Database Tables Required**:
      ```sql
      CREATE TABLE AUDIT_INDEX_ANALYSIS (
        id NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        table_name VARCHAR2(255) NOT NULL,
        index_name VARCHAR2(255),
        utility_score NUMBER(5,2),       -- 0-100%
        size_mb NUMBER(10,2),
        access_count NUMBER(19),
        read_count NUMBER(19),
        write_count NUMBER(19),
        action VARCHAR2(20),             -- REMOVE, ADD, MONITOR, KEEP
        rationale VARCHAR2(1000),
        analyzed_at TIMESTAMP NOT NULL,
        created_at TIMESTAMP DEFAULT SYSTIMESTAMP NOT NULL
      );
      CREATE INDEX idx_analysis_action ON AUDIT_INDEX_ANALYSIS(action, analyzed_at DESC);
      CREATE INDEX idx_analysis_table ON AUDIT_INDEX_ANALYSIS(table_name, analyzed_at DESC);

      CREATE TABLE AUDIT_QUERY_RECOMMENDATIONS (
        id NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        query_hash VARCHAR2(64) NOT NULL,
        execution_plan CLOB,
        plan_cost NUMBER(10),
        rows_estimated NUMBER(19),
        rows_actual NUMBER(19),
        recommendation_type VARCHAR2(50),  -- MISSING_INDEX, HINT, REWRITE, DENORMALIZE
        confidence VARCHAR2(20),           -- HIGH, MEDIUM, LOW
        estimated_improvement_percent NUMBER(5,2),
        index_definition VARCHAR2(1000),
        hint_suggestion VARCHAR2(500),
        rationale VARCHAR2(2000),
        approved CHAR(1) DEFAULT 'N' NOT NULL,
        approved_by VARCHAR2(255),
        approved_at TIMESTAMP,
        implemented CHAR(1) DEFAULT 'N' NOT NULL,
        implemented_at TIMESTAMP,
        created_at TIMESTAMP DEFAULT SYSTIMESTAMP NOT NULL
      );
      CREATE INDEX idx_recommendations_hash ON AUDIT_QUERY_RECOMMENDATIONS(query_hash, approved);
      CREATE INDEX idx_recommendations_type ON AUDIT_QUERY_RECOMMENDATIONS(recommendation_type, created_at DESC);
      ```
    - **Configuration** (application.yml):
      ```yaml
      observability:
        slow_query:
          threshold_ms: 1000                      # Alert on queries >1s
          plan_regression_threshold_percent: 5    # Variance >5% is regression
          execution_plan_capture_enabled: true
          recommendation_generation_enabled: true
          index_analysis_schedule_hours: 24
      ```
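The indicator-to-recommendation mapping (step 4) and the confidence buckets above can be sketched as follows. All names here are hypothetical illustrations, not the real `SlowQueryAnalyzer` API.

```java
import java.util.List;

// Illustrative sketch: map plan-operation strings to the recommendation types
// stored in AUDIT_QUERY_RECOMMENDATIONS, and bucket a raw confidence percent.
public class RecommendationSketch {
    /** First matching indicator wins; order mirrors the checks in step 4. */
    public static String recommendationType(List<String> planOperations) {
        long filterCount = planOperations.stream()
                .filter(op -> op.contains("FILTER"))
                .count();
        for (String op : planOperations) {
            if (op.contains("TABLE ACCESS FULL")) {
                return "MISSING_INDEX"; // recommend composite index
            }
            if (op.contains("NESTED LOOPS")) {
                return "HINT";          // consider HASH JOIN hint or join-column index
            }
        }
        if (filterCount > 1) {
            return "DENORMALIZE";       // multiple FILTERs: correlated subqueries
        }
        return "NONE";
    }

    /** Confidence buckets from the roadmap: HIGH >80%, MEDIUM 50-80%, LOW <50%. */
    public static String confidence(double percent) {
        if (percent > 80.0) return "HIGH";
        if (percent >= 50.0) return "MEDIUM";
        return "LOW";
    }
}
```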
  - [ ] **Query Plan Regression Detection & Alerting**:
    - **Integration Point**: SlowQueryAnalyzer + QueryPlanAnalyzer baseline
    - **Logic**:
      ```text
      When slow query detected (>1s):
      1. Extract normalized query from QueryPlanAnalyzer
      2. Get baseline execution time from queryBaselines map
      3. If query has no baseline yet:
         - First 100 executions establish baseline via BASELINE_SAMPLES
         - No alert during baseline period
      4. Once baseline established:
         - Calculate: variance_percent = |current - baseline| / baseline × 100
         - If variance_percent > ANOMALY_THRESHOLD_PERCENT (5%):
           * Plan likely changed (rewrite, stats change, environment)
           * Log WARN with query, current time, baseline time, variance
           * Emit counter: oracle.query.regression{query_type, variance_percent}
           * Extract V$SQL_PLAN and compare to last known good plan
           * Store comparison in AUDIT_PLAN_CHANGES table
           * Suggest recompilation or index addition

      Alert Severity:
      - CRITICAL (>50% slower): Likely missing index or plan corruption
        → Immediate investigation, escalate to DBA
      - WARNING (20-50%): Significant regression
        → Schedule analysis, update recommendation
      - INFO (5-20%): Minor variance, possibly environment difference
        → Monitor trend, investigate if persists >24h
      ```
    - **Dashboard**:
      - Timeline: Query regression events (query_hash, variance%, timestamp)
      - Heatmap: Query variance distribution (rows: query_type, columns: time, color: variance%)
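The variance formula and severity buckets above reduce to a few lines of Java. This is a minimal sketch with the roadmap's thresholds; the class name is an assumption, not the actual analyzer.

```java
// Illustrative sketch of the regression logic: percent variance from baseline
// and the severity buckets (INFO 5-20%, WARNING 20-50%, CRITICAL >50%).
public class RegressionSketch {
    /** variance_percent = |current - baseline| / baseline × 100 */
    public static double variancePercent(double currentMs, double baselineMs) {
        return Math.abs(currentMs - baselineMs) / baselineMs * 100.0;
    }

    public static String severity(double variancePercent) {
        if (variancePercent > 50.0) return "CRITICAL"; // likely missing index or plan corruption
        if (variancePercent > 20.0) return "WARNING";  // significant regression
        if (variancePercent > 5.0)  return "INFO";     // minor variance, watch the trend
        return "OK";                                    // within the anomaly threshold
    }
}
```

Because the formula uses an absolute value, a query that suddenly runs much faster also trips the threshold, which is intentional: either direction signals a plan change worth investigating.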
  - [ ] **Index Effectiveness Dashboard** (`monitoring/grafana/index-effectiveness-dashboard.json`):
    - **Panel 1: Index Utility Score Distribution**
      - Type: Bar chart (horizontal)
      - Query: `SELECT index_name, utility_score FROM AUDIT_INDEX_ANALYSIS WHERE analyzed_at = latest`
      - Top 20 indexes sorted by utility (descending)
      - Color: Green (>50%), Yellow (20-50%), Red (<20%)
      - Link to recommendation details

    - **Panel 2: Unused Indexes Candidate List**
      - Type: Table
      - Query: `SELECT table_name, index_name, utility_score, size_mb, access_count FROM AUDIT_INDEX_ANALYSIS WHERE utility_score < 10 AND access_count IS NOT NULL`
      - Columns: Table, Index, Utility %, Size (MB), 30d Usage Count, Action (Remove?)
      - Sortable by size (largest first, potential space savings)
      - Drilldown: Show full index definition, recent queries using it, drop statement

    - **Panel 3: Missing Index Recommendations**
      - Type: Table
      - Query: `SELECT table_name, missing_column, query_count_7d, estimated_improvement FROM index_gap_analysis`
      - Columns: Table, Column(s), Query Count (7d), Est. Improvement (%), Recommendation
      - Sortable by estimated improvement
      - Drilldown: Show queries that would benefit, create index script

    - **Panel 4: Index Size vs Usage Scatter**
      - Type: Scatter plot
      - X-axis: Index size (MB, log scale)
      - Y-axis: Utility score (0-100%)
      - Bubble size: Access count
      - Color: Green (efficient), Yellow (monitor), Red (bloated, low utility)
      - Identify candidates: Large size + low utility = space savings opportunity

    - **Panel 5: Query Plan Regression Timeline**
      - Type: Graph (time series)
      - Y-axis: Variance % from baseline
      - X-axis: Time
      - Series: Top 5 regressed queries
      - Threshold lines: +5% (warning), +20% (alert)
      - Annotation: When changes detected, show recommendation

    - **Panel 6: Statistics Summary**
      - Type: Stat cards
      - Total indexes monitored: COUNT(*)
      - Indexes recommended for removal: COUNT(*) WHERE action = REMOVE
      - Potential space savings: SUM(size_mb) for removal candidates
      - Query regressions detected (7d): COUNT(*) of variance >5%
      - Avg query plan stability: STDDEV(variance_percent)
  - [ ] **Extended Statistics & Column Correlation Analysis**:
    - **Purpose**: Identify correlated columns that might benefit from composite indexes
    - **Implementation**:
      - Enable Oracle extended statistics via `DBMS_STATS.CREATE_EXTENDED_STATS` for hot tables
      - Sample existing query patterns: WHERE col1 = ? AND col2 = ?
      - Measure correlation: `CORR(col1, col2)` on sample data
      - If correlation >0.8 and query count high: recommend composite index
      - Store analysis in AUDIT_COLUMN_CORRELATION table
    - **Example**:
      ```sql
      -- For audit_login_attempts table:
      -- Detect correlation between (ip_address, username) if frequently queried together
      -- Note: CREATE_EXTENDED_STATS is a function, so invoke it via SELECT
      SELECT DBMS_STATS.CREATE_EXTENDED_STATS(
               ownname   => 'APP_USER',
               tabname   => 'AUDIT_LOGIN_ATTEMPTS',
               extension => '(ip_address,username)'
             )
      FROM dual;
      ```
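The correlation measure used above is a plain Pearson coefficient, the same statistic Oracle's `CORR` aggregate computes. A minimal standalone sketch of the check (a pair of columns with a coefficient above 0.8 and high query volume would be flagged as a composite-index candidate):

```java
// Pearson correlation coefficient over two equal-length samples.
// Illustrative only; in production this runs inside Oracle via CORR().
public class ColumnCorrelationSketch {
    public static double pearson(double[] x, double[] y) {
        int n = x.length;
        double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sx += x[i]; sy += y[i];
            sxx += x[i] * x[i]; syy += y[i] * y[i];
            sxy += x[i] * y[i];
        }
        double cov = sxy - sx * sy / n;   // n × covariance
        double vx = sxx - sx * sx / n;    // n × variance of x
        double vy = syy - sy * sy / n;    // n × variance of y
        return cov / Math.sqrt(vx * vy);
    }
}
```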
- **Success Criteria**:
  - ✅ Index utility scores calculated every 24h; unused indexes (<10% utility) catalogued
  - ✅ Missing indexes identified (high query count on unindexed columns); recommendations with confidence levels
  - ✅ Query plan regressions detected within 1 hour of occurrence; variance >5% triggers alert
  - ✅ SlowQueryAnalyzer component intercepts >1s queries; execution plans captured and analyzed
  - ✅ Grafana dashboard displays top unused indexes, missing index suggestions, plan regression timeline
  - ✅ AUDIT_INDEX_ANALYSIS and AUDIT_QUERY_RECOMMENDATIONS tables populated; admin can review and approve recommendations
  - ✅ At least 1 actionable recommendation surfaced per week (remove/add/rewrite)
  - ✅ After implementing recommendations: query performance improves 10-30% on affected queries
- **Implementation Priority**:
  - **Phase 1 (Week 1-2)**: SlowQueryAnalyzer + index utility analysis + AUDIT tables
  - **Phase 2 (Week 3)**: Grafana dashboard + basic recommendations
  - **Phase 3 (Week 4)**: Extended statistics + advanced correlation analysis + regression detection
- **Risk Mitigation**:
  - **Risk**: False positives in recommendations (suggest index for rarely used query)
    → **Mitigation**: Confidence thresholds (HIGH >= 80%); require approval before implementation
  - **Risk**: Regression detection noise from normal variance
    → **Mitigation**: Baseline period (100 samples before alerting); configurable threshold (default 5%)
  - **Risk**: Index removal breaks production query
    → **Mitigation**: Test in shadow traffic first; maintain changelog; rollback procedure documented
  - **Risk**: Extended statistics overhead on large tables
    → **Mitigation**: Run off-peak; sample 1M rows instead of full table; schedule monthly instead of daily
|
Add language identifiers to fenced code blocks for proper rendering.
Several fenced code blocks lack language identifiers, which prevents proper syntax highlighting and rendering. Based on the content, these appear to be pseudo-code, SQL, or YAML configurations.
Apply language identifiers to improve documentation readability:
- Line 178: ASCII diagram → use `text`
- Line 244: Algorithm/pseudo-code → use `text` or `java`
- Line 290: Algorithm/pseudo-code → use `text` or `java`
- Line 412: Algorithm/pseudo-code → use `text` or `java`

Example fix for line 178:

````diff
- ```
+ ```text
  ┌──────────────────────────────────────────────────────────────┐
````

🧰 Tools
🪛 markdownlint-cli2 (0.18.1)
178-178: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
244-244: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
290-290: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
412-412: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
In ROADMAP_OBSERVABILITY.md around lines 178-525, several fenced code blocks
lack language identifiers which prevents proper syntax highlighting; update the
backticks to include appropriate language tags: change the ASCII art block at
~line 178 to ```text, mark the algorithm/pseudo-code blocks (around ~244, ~290,
~412) with ```text or ```java as appropriate, mark the SQL DDL blocks with
```sql, and mark the application.yml/config block with ```yaml so each fenced
block has a correct language identifier for rendering.
…oard docs

- Updated roadmap checkboxes for Log Correlation, ILM, and Kibana Dashboards to completed ([x])
- Added JAEGER_BASE_URL env var to docker-compose for Elasticsearch Jaeger integration
- Created kibana/dashboards/README.md with setup prerequisites, field mappings, and import instructions for log analysis dashboards
- Ensures centralized logging infrastructure with proper correlation and lifecycle management for better observability and troubleshooting
Actionable comments posted: 4
♻️ Duplicate comments (4)
ROADMAP_OBSERVABILITY.md (1)
178-178: Add language identifiers to fenced code blocks (duplicate linting issue).

This is a recurrence of the same MD040 linting violation previously flagged. While the underlying issue is identical to the past review, these are NEW code blocks being added in this commit and must be fixed. Apply language identifiers to all four fenced code blocks:

- Line 178 (ASCII architecture diagram): change the bare fence to ```` ```text ````
- Line 244 (pseudo-code algorithm): change the bare fence to ```` ```text ````
- Line 290 (pseudo-code algorithm): change the bare fence to ```` ```text ````
- Line 412 (pseudo-code algorithm): change the bare fence to ```` ```text ````

Example fix for line 178:

````diff
- ```
+ ```text
  ┌──────────────────────────────────────────────────────────────┐
````

Also applies to: 244-244, 290-290, 412-412
docker-compose.yml (2)
227-227: Volume mount path may not match nginx expectations.

Previous review flagged that nginx is configured to read certificates from `/etc/nginx/certs` but the volume mounts to `/etc/certs`. This path mismatch would cause nginx to fail on startup when attempting to load SSL certificates.

Verify the nginx certificate paths in the nginx configuration:

```bash
#!/bin/bash
# Check nginx.conf for certificate path references
echo "=== Nginx certificate paths ==="
rg -n "ssl_certificate|ssl_certificate_key" nginx.conf docker-start.sh Dockerfile

# Check if /etc/nginx/certs is used
echo ""
echo "=== References to /etc/nginx/certs ==="
rg -n "/etc/nginx/certs" nginx.conf docker-start.sh Dockerfile
```
175-175: CRITICAL: Certificate volume mount exposes committed private keys.

As flagged in previous reviews, this volume mount (`./logstash/certs:/usr/share/logstash/certs`) relies on private keys committed to the repository. This is a critical security vulnerability.

logstash/config/pipelines.conf (1)

7-10: CRITICAL: Private keys committed to repository.

As flagged in previous reviews, the certificate files referenced here are committed to the repository, which is a critical security vulnerability. Private keys must never be stored in version control.
🧹 Nitpick comments (12)
logstash/config/pipelines.conf (2)
33-87: Recommend explicit column selection in JDBC queries.

All four JDBC inputs use `SELECT *`, which can cause issues:

- Performance overhead fetching unused columns
- Brittleness when table schemas change (new columns may break mappings)
- Unnecessary data transfer from Oracle to Logstash

Apply this pattern to all four JDBC inputs (login_attempts, sessions, dynamic_crud, error_log):

```diff
- statement => "SELECT * FROM AUDIT_LOGIN_ATTEMPTS WHERE created_at > :sql_last_value"
+ statement => "SELECT audit_id, created_at, user_id, username, ip_address, action, success, failure_reason, trace_id FROM AUDIT_LOGIN_ATTEMPTS WHERE created_at > :sql_last_value"
```

Adjust column lists based on actual table schemas and required fields for each audit table.
34-88: Consider less aggressive polling schedule for audit tables.

The cron schedule `* * * * *` polls Oracle every minute for all four audit tables. This may be excessive depending on audit log volume and could impact database performance.

Consider adjusting based on requirements:

- High-priority audit events (login attempts, sessions): keep at 1 minute
- Lower-priority events (CRUD, errors): reduce to `*/5 * * * *` (every 5 minutes)

```diff
  # JDBC Input for AUDIT tables - Dynamic CRUD
  jdbc {
    # ... connection config ...
-   schedule => "* * * * *"
+   schedule => "*/5 * * * *"  # Poll every 5 minutes
    # ... rest of config ...
  }
```

docker-compose.yml (2)
157-181: Add healthcheck for Logstash service.

The Logstash service is missing a healthcheck, which means Docker Compose cannot verify if Logstash is ready to accept connections. This could lead to race conditions where the app tries to send logs before Logstash is ready.

Add a healthcheck to the Logstash service:

```diff
  logstash:
    build:
      context: .
      dockerfile: logstash/Dockerfile
    container_name: logstash
    depends_on:
      elasticsearch:
        condition: service_healthy
      oracle:
        condition: service_healthy
    environment:
      XPACK_MONITORING_ENABLED: "false"
      ORACLE_USER: ${ORACLE_USER:-APP_USER}
      ORACLE_PASSWORD: ${ORACLE_PASSWORD:-ssfpassword}
      JAEGER_BASE_URL: ${JAEGER_BASE_URL:-http://jaeger:16686}
    volumes:
      - ./logstash/config/pipelines.conf:/usr/share/logstash/pipeline/logstash.conf
      - ./logstash/certs:/usr/share/logstash/certs
    ports:
      - "5001:5001"
      - "5044:5044"
    networks:
      - oracle-network
+   healthcheck:
+     test: ["CMD", "curl", "-f", "http://localhost:9600/_node/stats"]
+     interval: 30s
+     timeout: 10s
+     retries: 3
+     start_period: 60s
```
85-85: Consider simplifying Elasticsearch healthcheck.

The current regex `grep -q '\"status\":\\s*\"\\(green\\|yellow\\)\"'` is complex and fragile. A simpler approach using `grep -E` would be more maintainable.

```diff
- test: ["CMD-SHELL", "curl -s http://localhost:9200/_cluster/health | grep -q '\"status\":\\s*\"\\(green\\|yellow\\)\"'"]
+ test: ["CMD-SHELL", "curl -s http://localhost:9200/_cluster/health | grep -qE '\"status\":\"(green|yellow)\"'"]
```

monitoring/elasticsearch/README.md (1)
171-176: Add language specifiers to code blocks.

Two code blocks are missing language specifiers, which reduces syntax highlighting and IDE support.

````diff
-```
+```plaintext
 ┌─────────────────┐     ┌────────────────┐     ┌──────────────────┐
 │  Spring Boot    │────▶│   Logstash     │────▶│  Elasticsearch   │
 │  (TCP 5001)     │     │  (pipelines)   │     │  (logs-app-*)    │
 └─────────────────┘     └────────────────┘     └──────────────────┘
 ...
````

And:

````diff
-```
+```text
 {JAEGER_BASE_URL}/trace/{trace_id}
````

monitoring/elasticsearch/setup-elasticsearch.sh (2)
42-42: Shellcheck warning: declare and assign separately.

Shellcheck flags SC2155 because combining a `local` declaration with command substitution can mask the command's return value, potentially hiding errors.

```diff
- local body=$(echo "$response" | sed '$d')
+ local body
+ body=$(echo "$response" | sed '$d')
```

However, given that `echo` and `sed` rarely fail, and the script has other error handling via HTTP status codes, this is a low-priority refinement.
118-127: Consider validating JSON files before applying.

The script applies JSON files without validating their syntax first. Invalid JSON will cause curl to fail, but pre-validation would provide clearer error messages.

```diff
  if [ -f "$ILM_DIR/ssf-logs-policy.json" ]; then
+     # Validate JSON syntax
+     if ! jq empty "$ILM_DIR/ssf-logs-policy.json" 2>/dev/null; then
+         echo "ERROR: Invalid JSON in $ILM_DIR/ssf-logs-policy.json"
+         exit 1
+     fi
+
      curl -X PUT "$ELASTICSEARCH_URL/_ilm/policy/ssf-logs-policy" \
          -H "Content-Type: application/json" \
          -d @"$ILM_DIR/ssf-logs-policy.json"
```

Apply the same pattern to template files (lines 139-163).
monitoring/elasticsearch/templates/logs-app-template.json (3)
4-10: Document field limit and dynamic template interaction.

The combination of a 2,000-field limit (line 9) with `ignore_dynamic_beyond_limit: true` (line 10) and a dynamic string-to-keyword template (lines 13–23) could silently drop fields if the template generates too many keyword variants. This creates a latent risk of undetected data loss.

Consider documenting the expected field cardinality or adjusting the limit if you anticipate high-cardinality string fields (e.g., many unique logger names or tags).
message.keywordusesignore_above: 32766(Lucene maximum), whileexception_message.keywordusesignore_above: 2048. This inconsistency means exception messages are truncated at 2,048 bytes whereas log messages are preserved up to the Lucene limit. Consider aligning these values or documenting the intentional difference."exception_message": { "type": "text", "fields": { "keyword": { "type": "keyword", - "ignore_above": 2048 + "ignore_above": 32766 } } },Also applies to: 74-82
29-37: Consider specifying an explicit analyzer for text fields.The
message,exception_message, andstacktracefields are typed astextbut do not specify an analyzer, so they will use Elasticsearch's default analyzer. For optimal search and analysis of log content, consider explicitly specifying an analyzer (e.g.,standard,simple, or a custom analyzer tuned for your logs)."message": { "type": "text", + "analyzer": "standard", "fields": { "keyword": { "type": "keyword", "ignore_above": 32766 } } },Also applies to: 74-82, 83-85
src/main/java/com/rcs/ssf/http/filter/TraceIdFilter.java (2)
133-147: Consider propagating trace/span IDs into `TraceContextHolder` as well

You already extract `traceId`/`spanId` from the current span and set `requestId`/`userId` into `TraceContextHolder`, but `TraceContextHolder` in `src/main/java/com/rcs/ssf/http/filter/TraceContextHolder.java` also exposes `setTraceId`/`setSpanId`. For service-layer code that relies on `TraceContextHolder` instead of MDC, it may be useful to populate those too.

Example adjustment:

```diff
  Span currentSpan = Span.current();

  // Extract trace_id from OpenTelemetry span context
  String traceId = null;
  String spanId = null;
  if (currentSpan != null && currentSpan.getSpanContext().isValid()) {
      traceId = currentSpan.getSpanContext().getTraceId();
      spanId = currentSpan.getSpanContext().getSpanId();
  }

  try {
      // Store in ThreadLocal for propagation to services
      TraceContextHolder.setRequestId(requestId);
      TraceContextHolder.setUserId(userId);
+     if (traceId != null) {
+         TraceContextHolder.setTraceId(traceId);
+     }
+     if (spanId != null) {
+         TraceContextHolder.setSpanId(spanId);
+     }
```

This keeps logs and `TraceContextHolder` in sync for downstream consumers.
147-162: MDC population looks good; optionally avoid repeated hashing of `userId`

The MDC fields and PII handling are consistent with `includePii`, and the explicit cleanup in `finally` matches these keys. To reduce duplicate work and centralize the decision, you could compute the display form of `userId` once and reuse it for MDC and logging:

```diff
  MDC.put("request_id", requestId);
  if (traceId != null) {
      MDC.put("trace_id", traceId);
  }
  if (spanId != null) {
      MDC.put("span_id", spanId);
  }
- if (userId != null && !userId.isEmpty()) {
-     // Respect includePii setting for MDC logs
-     MDC.put("user_id", includePii ? userId : hashPii(userId));
- } else {
-     MDC.put("user_id", "anonymous");
- }
+ String mdcUserId;
+ if (userId != null && !userId.isEmpty()) {
+     mdcUserId = includePii ? userId : hashPii(userId);
+ } else {
+     mdcUserId = "anonymous";
+ }
+ MDC.put("user_id", mdcUserId);
```

Not required for correctness, but it simplifies the flow and ensures a single decision point for the user identifier.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
- ROADMAP_OBSERVABILITY.md (1 hunks)
- docker-compose.yml (3 hunks)
- kibana/dashboards/README.md (1 hunks)
- kibana/dashboards/log-analysis.ndjson (1 hunks)
- logstash/config/pipelines.conf (1 hunks)
- monitoring/elasticsearch/README.md (1 hunks)
- monitoring/elasticsearch/ilm/ssf-logs-policy.json (1 hunks)
- monitoring/elasticsearch/setup-elasticsearch.sh (1 hunks)
- monitoring/elasticsearch/templates/logs-app-template.json (1 hunks)
- monitoring/elasticsearch/templates/logs-audit-template.json (1 hunks)
- monitoring/elasticsearch/templates/logs-nginx-template.json (1 hunks)
- src/main/java/com/rcs/ssf/http/filter/TraceIdFilter.java (4 hunks)
✅ Files skipped from review due to trivial changes (1)
- kibana/dashboards/log-analysis.ndjson
🧰 Additional context used
🧬 Code graph analysis (1)
src/main/java/com/rcs/ssf/http/filter/TraceIdFilter.java (1)
src/main/java/com/rcs/ssf/http/filter/TraceContextHolder.java (1)
TraceContextHolder(23-221)
🪛 LanguageTool
kibana/dashboards/README.md
[uncategorized] ~48-~48: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...ion message | | stacktrace | text | Full stack trace | ### Index Template Snippet Ap...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
monitoring/elasticsearch/README.md
[style] ~98-~98: To elevate your writing, try using a synonym like ‘required’ here. Or, to avoid using the passive voice, try replacing the past participle ‘needed’ with an adjective.
Context: ...isting Data If shard count changes are needed for existing indices: ```bash # 1. Cre...
(IS_NEEDED_NECESSARY)
🪛 markdownlint-cli2 (0.18.1)
ROADMAP_OBSERVABILITY.md
178-178: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
244-244: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
290-290: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
412-412: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
monitoring/elasticsearch/README.md
171-171: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
176-176: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🪛 Shellcheck (0.11.0)
monitoring/elasticsearch/setup-elasticsearch.sh
[warning] 42-42: Declare and assign separately to avoid masking return values.
(SC2155)
🔇 Additional comments (10)
kibana/dashboards/README.md (1)
1-368: Excellent comprehensive documentation for Kibana dashboard setup.

This documentation thoroughly covers all aspects of the dashboard setup including prerequisites, field mappings, trace correlation with Jaeger, sample log documents, and troubleshooting. The inclusion of concrete examples (JSON snippets, curl commands, code references) makes it actionable for users.
monitoring/elasticsearch/templates/logs-nginx-template.json (1)
1-78: LGTM: Well-configured nginx log template.

The field type mappings are appropriate for nginx access/error logs:

- `client_ip` as `ip` type enables CIDR queries
- `status` as `integer` for HTTP status codes
- `bytes` as `long` handles large response sizes
- Timing fields (`response_time`, `upstream_response_time`, `request_time`) as `float` are standard for metrics
- `user_agent` with dual text/keyword mapping enables both full-text search and aggregations

The template aligns well with the other log templates in this PR and uses the same ILM policy for consistent lifecycle management.
monitoring/elasticsearch/templates/logs-audit-template.json (1)
1-118: LGTM: Well-structured audit log template.

The template configuration is well-suited for audit logs with appropriate design choices:

- 3 shards provide higher query parallelism for compliance queries (vs. 2 for app/nginx logs)
- Dynamic template defaulting strings to keywords is appropriate for structured audit data
- Field type selections are correct: `ip` for IP addresses, `boolean` for success/failure, `text` with `keyword` subfields for error messages enabling both search and aggregations
- Higher `ignore_above` limit (2048) for error messages vs. user agents (512) is reasonable given error message verbosity

Aligns well with the ILM policy and other templates in this observability stack.
monitoring/elasticsearch/README.md (1)
1-241: Excellent comprehensive Elasticsearch setup documentation.

This documentation provides thorough coverage of the observability stack setup including ILM policy details, index templates, shard sizing rationale with concrete numbers, trace correlation with Jaeger, and comprehensive troubleshooting guidance. The inclusion of the log flow architecture diagram and concrete sizing guidelines makes this highly actionable.
monitoring/elasticsearch/ilm/ssf-logs-policy.json (1)
1-53: LGTM: Well-designed ILM policy following Elasticsearch best practices.

The four-phase lifecycle is appropriately configured:
- Hot phase: 30-day/50GB rollover with high priority (100) ensures timely indexing
- Warm phase: Shrink to 1 shard and forcemerge optimize for query performance while reducing cluster overhead
- Cold phase: 0 replicas and low priority (0) minimize storage costs for rarely-accessed logs
- Delete phase: 365-day retention aligns with typical compliance requirements
The policy correctly references MinIO for long-term archival beyond 365 days, establishing a clear offloading strategy.
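As a point of reference, a four-phase policy with these characteristics is typically expressed like this (values taken from the bullet points above; the actual file may differ in detail):

```json
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_age": "30d", "max_primary_shard_size": "50gb" },
          "set_priority": { "priority": 100 }
        }
      },
      "warm": {
        "actions": {
          "shrink": { "number_of_shards": 1 },
          "forcemerge": { "max_num_segments": 1 }
        }
      },
      "cold": {
        "actions": {
          "allocate": { "number_of_replicas": 0 },
          "set_priority": { "priority": 0 }
        }
      },
      "delete": {
        "min_age": "365d",
        "actions": { "delete": {} }
      }
    }
  }
}
```

Offloading to MinIO beyond 365 days happens outside ILM itself, e.g. via snapshot tooling, which is why the policy only needs to handle the delete cutoff.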
monitoring/elasticsearch/setup-elasticsearch.sh (1)
1-219: Well-structured setup script with good error handling.The script demonstrates several good practices:
- Idempotent resource creation with proper error handling
- Elasticsearch readiness polling with configurable retries
- Clear status messaging and comprehensive verification
- Graceful handling of existing resources (resource_already_exists_exception)
The script correctly provisions the ILM policy, index templates, and initial indices with proper aliases for the write indices.
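The readiness-polling pattern the script uses can be reduced to a small retry helper like this sketch (the function name, defaults, and the commented Elasticsearch URL are illustrative, not taken from the script):

```shell
#!/bin/sh
# Retry a command until it succeeds or the attempt budget is exhausted.
retry() {
  retries="$1"; delay="$2"; shift 2
  i=0
  until "$@"; do
    i=$((i + 1))
    if [ "$i" -ge "$retries" ]; then
      echo "gave up after $retries attempts" >&2
      return 1
    fi
    sleep "$delay"
  done
}

# Example: poll Elasticsearch until the cluster answers (URL is illustrative).
# retry 30 2 curl -fsS http://localhost:9200/_cluster/health
retry 3 0 true && echo "ready"
```

Combined with treating `resource_already_exists_exception` responses as success, this keeps the provisioning steps safe to re-run.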
src/main/java/com/rcs/ssf/http/filter/TraceIdFilter.java (4)
9-16: Imports for MDC and UTF‑8 hashing are appropriate

`MDC` and `StandardCharsets` imports align with the new logging and hashing logic and are correctly scoped.
166-168: Exposing the `X-Trace-ID` header is a solid addition

Conditionally setting `X-Trace-ID` when a valid trace is present is a good way to let clients correlate logs/traces without leaking internals when tracing is disabled.
205-214: Status code attribution and MDC cleanup in `finally` are correct

Recording `http.status_code` in the span within `finally` ensures it is captured even on errors, and the explicit `MDC.remove(...)` calls for each key you set (`request_id`, `trace_id`, `span_id`, `user_id`) correctly prevent cross-request leakage on reused threads.
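The cleanup discipline praised here — set per-request context, then remove every key in `finally` — can be sketched independently of SLF4J with a plain thread-local map standing in for MDC (class and method names are illustrative; the real filter uses `org.slf4j.MDC`):

```java
import java.util.HashMap;
import java.util.Map;

public class ContextCleanupSketch {
    // Stand-in for SLF4J's MDC: a per-thread key/value context.
    private static final ThreadLocal<Map<String, String>> CTX =
            ThreadLocal.withInitial(HashMap::new);

    static void put(String key, String value) { CTX.get().put(key, value); }
    static String get(String key) { return CTX.get().get(key); }
    static void remove(String key) { CTX.get().remove(key); }

    static void handleRequest(String traceId) {
        put("trace_id", traceId);
        put("request_id", "req-1");
        try {
            // ... request processing that may throw ...
        } finally {
            // Remove every key that was set, so a pooled thread
            // does not leak context into the next request.
            remove("trace_id");
            remove("request_id");
        }
    }

    public static void main(String[] args) {
        handleRequest("abc123");
        // After the request, the thread-local context is clean again.
        System.out.println(get("trace_id") == null && get("request_id") == null);
    }
}
```

The same reasoning applies to the span's status-code attribute: work done in `finally` runs on both the success and exception paths, which is exactly the guarantee the review is calling out.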
228-237: Explicit UTF‑8 encoding in `hashPii` is a good portability fix

Using `combined.getBytes(StandardCharsets.UTF_8)` makes the SHA‑256 hashing deterministic across environments and avoids platform-default charset pitfalls; the rest of the method's behavior and error handling remain intact.
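The fix amounts to always passing an explicit charset when turning the input string into bytes. A minimal, self-contained version of that pattern (this is a sketch — the real `hashPii` method's surrounding logic is not reproduced here):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class HashSketch {
    // SHA-256 over an explicit UTF-8 encoding: identical bytes (and hash)
    // regardless of the JVM's platform-default charset.
    static String sha256Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    public static void main(String[] args) {
        // Standard test vector: SHA-256("abc")
        System.out.println(sha256Hex("abc"));
    }
}
```

Had the code called `input.getBytes()` with no charset, the digest could differ between hosts whose default encodings disagree, which is exactly the pitfall the review notes.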
This pull request introduces significant improvements to the Docker build and deployment workflow for the project. The main changes include a new multi-stage Docker build that separates backend and frontend builds, adds support for build-time configuration via arguments, and enhances security and flexibility for SSL certificate handling. The documentation has also been updated to reflect these new capabilities and provide clear usage instructions.
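A multi-stage layout with build arguments of the kind described above typically follows this shape — a sketch using the base images and ARG names mentioned in this PR, with illustrative paths and build commands (nginx installation and cert generation are omitted):

```dockerfile
# --- Stage 1: frontend build (node:20-alpine) ---
FROM node:20-alpine AS frontend
ARG FRONTEND_DIST_PATH=dist
WORKDIR /app
COPY frontend/ ./
# Fail fast at build time if the expected output directory is missing
RUN npm ci && npm run build && test -d "${FRONTEND_DIST_PATH}"

# --- Stage 2: backend build (eclipse-temurin:21-jdk-alpine) ---
FROM eclipse-temurin:21-jdk-alpine AS backend
WORKDIR /build
COPY . .
RUN ./mvnw -q -DskipTests package

# --- Stage 3: runtime image ---
FROM eclipse-temurin:21-jdk-alpine
ARG CERT_CN=localhost
COPY --from=backend /build/target/*.jar /app/app.jar
COPY --from=frontend /app/dist /usr/share/nginx/html
EXPOSE 80 443 8443
CMD ["java", "-jar", "/app/app.jar"]
```

Keeping build-time toolchains in earlier stages means only the final stage's layers ship in the image, and the `test -d` check makes a misconfigured `FRONTEND_DIST_PATH` fail the build instead of producing an image with missing assets.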
Docker build and deployment improvements:

- Refactored the `Dockerfile` to use multi-stage builds for the backend (eclipse-temurin:21-jdk-alpine) and frontend (node:20-alpine), improving build efficiency and separation of concerns. [1] [2]
- Added build arguments `FRONTEND_DIST_PATH` and `CERT_CN` to allow customization of the frontend output location and the SSL certificate Common Name, with validation steps to prevent misconfiguration.

Documentation updates:

- Updated `README.md` to document the new Docker build arguments, usage examples, production warnings for SSL certificates, and Docker Compose integration, making deployment steps clearer and safer for developers.

Build context optimization:

- Added a `.dockerignore` file to exclude unnecessary files and directories from the Docker build context, reducing image size and build times.

Summary by CodeRabbit
New Features
Infrastructure
Database
Bug Fixes
Documentation