
Func endpoints #24

Open
zlovtnik wants to merge 10 commits into main from func-endpoints

Conversation

@zlovtnik
Owner

@zlovtnik zlovtnik commented Mar 10, 2026

Summary by CodeRabbit

  • New Features

    • Added gRPC health check endpoint alongside HTTP API.
    • Extended gender options to support male, female, non-binary, and prefer-not-to-say.
    • Implemented tenant health caching with background polling for improved performance.
    • Enhanced OAuth redirect URI validation and handling.
  • Database Migrations

    • Made gender column nullable with safeguards against data loss.
    • Migrated gender field to enumerated type with comprehensive value mapping.
  • Refactor

    • Restructured backend architecture with new adapter and service layers.
    • Introduced centralized configuration and application state management.

feat(person): replace bool gender with DB-backed enum

Add a `Gender` enum derived with `DbEnum` and switch `Person`/`PersonDTO`
`gender` from `bool` to `Option<Gender>` (with serde default for null/missing).

Also update API tests and request payloads to use string enum values
(e.g. `"male"`, `"female"`) and add `diesel-derive-enum` to support
Postgres enum mapping. This enables richer, type-safe gender values
instead of a limiting boolean flag.
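The string-to-enum mapping the payloads rely on can be sketched without the derives (a minimal, std-only sketch; the real type derives `DbEnum`, `Serialize`, and `Deserialize`, so the match below stands in for the serde/diesel machinery):

```rust
// Sketch of the enum shape described above; variant names come from the
// PR summary, everything else here is illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Gender {
    Male,
    Female,
    NonBinary,
    PreferNotToSay,
}

impl Gender {
    /// Map the snake_case wire values used by the API payloads.
    /// Null/missing stays `None` via the serde default on the field.
    pub fn from_str_value(s: &str) -> Option<Gender> {
        match s {
            "male" => Some(Gender::Male),
            "female" => Some(Gender::Female),
            "non_binary" => Some(Gender::NonBinary),
            "prefer_not_to_say" => Some(Gender::PreferNotToSay),
            _ => None,
        }
    }
}
```

A `Person.gender` of type `Option<Gender>` then carries four real values plus "unspecified", which is what the boolean could not express.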
- Added new modules for adapters, domain, and ports to improve code organization.
- Implemented gRPC server with a HealthService for health checks.
- Introduced a new AppError type for better error handling across the application.
- Migrated from dotenv to dotenvy for environment variable management.
- Enhanced session management and logging capabilities.
- Updated authentication middleware to mark hot paths for performance monitoring.
- Added health check functionality with a new HealthSnapshot domain type.
- Refactored database access patterns to use async traits for better scalability.
- Introduced a unified AppState for managing shared application state.
@coderabbitai

coderabbitai bot commented Mar 10, 2026

📝 Walkthrough

Walkthrough

Comprehensive backend refactor introducing multi-transport server architecture (HTTP + gRPC), unified configuration management via AppConfig, new gRPC health service with Protobuf definitions, gender enum for database models, centralized AppState, and AppError for cross-transport error handling. Database migrations make gender nullable and enum-based. Main entrypoint rewritten to spawn HTTP and gRPC servers concurrently with graceful shutdown.

Changes

Cohort / File(s) Summary
Cargo.toml & Build Configuration
Cargo.toml, build.rs
Added runtime dependencies (anyhow, prost, tonic, config, dotenvy, tower-http, sqlx, diesel-derive-enum); added tonic-build for gRPC code generation; updated build.rs to generate schema.rs via diesel CLI with cargo rerun hints and Result-based error handling.
Runtime Configuration System
src/config/runtime.rs, src/config/mod.rs
New comprehensive AppConfig with nested ServerConfig, GrpcConfig, DatabaseConfig, RedisConfig, KeycloakConfig, SessionConfig, RuntimeConfig, BootstrapConfig; AppConfig::from_env() loads environment/dotenv; implements Debug redaction for sensitive fields (URLs, secrets); provides bind_addr utility methods.
Error Handling & Type System
src/app_error.rs
New AppError enum (NotFound, Validation, Unauthorized, Conflict, ServiceError, Database, Internal); implements ResponseError for HTTP responses with JSON serialization and error masking; implements From for tonic::Status to map errors to gRPC codes.
Application State Management
src/state.rs
New AppState wrapper around shared state; async initialization with config, pool, tenant manager, Redis, Keycloak, WebSocket broadcaster, per-tenant sqlx pools, and session key; provides public accessors and per-tenant pool lifecycle management with poisoning-safe RwLock patterns.
gRPC Service & Protobuf
proto/nexus/core/core.proto, src/adapters/grpc/..., src/adapters/grpc/mod.rs, src/adapters/grpc/server.rs
New proto3 HealthService with Check RPC and request/response messages; gRPC adapter with HealthGrpcService wrapping CoreHealthService; run_grpc_server spawns tonic server with watch-channel shutdown support; tonic::include_proto!() macro includes generated definitions.
HTTP Server Adapter
src/adapters/http/server.rs, src/adapters/http/mod.rs
New run_http_server bootstraps Actix-web with AppState, cookie sessions (TTL, security, SameSite), CORS with dynamic origins, tracing middleware, multiple AppData entries (tenant manager, pools, Redis, broadcaster, Keycloak); graceful shutdown via watch channel; configurable workers and timeouts.
Database Adapters & Repositories
src/adapters/db/tenant_sqlx_repository.rs, src/adapters/db/mod.rs
New TenantSqlxRepositoryImpl implementing TenantSqlxRepository trait; wraps sqlx::PgPool; health_check executes SELECT 1 query with error mapping to AppError.
Domain & Service Ports
src/domain/health.rs, src/domain/mod.rs, src/ports/outbound.rs, src/ports/repository.rs, src/ports/mod.rs
New HealthSnapshot with status/service/timestamp fields; ClockPort trait with now_rfc3339() for async timestamps; TenantSqlxRepository trait for health checks; SystemClock implementation.
Core Health Service
src/services/core/health_service.rs, src/services/core/mod.rs, src/services/mod.rs
New HealthService wrapping Arc; check() returns HealthSnapshot with "ok" status and RFC3339 timestamp via clock; exposed via new core module.
Main Entrypoint Rewrite
src/main.rs
Replaced Actix-only main with Tokio-based entrypoint; loads AppConfig, initializes AppState, spawns HTTP and gRPC servers concurrently via JoinSet; implements graceful shutdown via watch channel on Ctrl-C; returns anyhow::Result<()>; removed legacy dotenv, migrations, and per-transport configuration.
Database Migrations & Schema
migrations/2026-02-28-000001_make_people_gender_nullable/*, migrations/2026-03-01-000002_convert_people_gender_to_enum/*, src/schema.rs
Make gender column nullable via DOWN guard checking for NULL rows before applying NOT NULL constraint; convert boolean gender to gender enum with variants (male, female, non_binary, prefer_not_to_say); schema.rs updated to reflect Nullable type.
PersonGender Enum & Model Changes
src/models/person.rs, src/api/address_book_controller.rs
Introduce PersonGender DbEnum with Male/Female/NonBinary/PreferNotToSay variants (snake_case, serde); replace bool gender with Option<PersonGender> in Person and PersonDTO; update filter API to return Result<Page, ServiceError>; adjust tests and queries to use enum values.
Controller & Middleware Updates
src/api/account_controller.rs, src/api/controller_context.rs, src/middleware/auth_middleware.rs, src/api/health_controller.rs
Add redirect_uri field and validation to KeycloakCallbackRequest; update stateless OAuth flow with redirect_uri handling and UUID-based login_session; enhance DatabaseContext with from_state constructor; refactor auth middleware to use tenant-derived helpers; implement tenant health caching with background polling in health_detailed.
Keycloak Token Exchange & WebSocket Logging
src/utils/keycloak.rs, src/utils/ws_logger.rs
Add optional redirect_uri_override parameter to exchange_code_for_token and exchange_code_stateless; implement init_websocket_logging_with_filter allowing filter override; replace dotenv with dotenvy.
Database Pool & Tenant Management
src/config/db.rs, src/config/oracle_db.rs, src/config/hybrid_manager.rs
Add public get_pool(tenant_id) helper to TenantPoolManager; minor error handling and formatting refinements across Oracle and hybrid managers; get_tenant_pool now delegates to get_pool.
Controller & API Enhancements
src/api/nfag_controller.rs, src/api/tenant_controller.rs, src/api/functional_operations_controller.rs, src/api/crud_engine.rs
Add HOT PATH comments to high-frequency endpoints; minor formatting in health collection helpers; strengthen validation for functional operations (filter conditions, take/skip bounds); refactor error handling for consistency.
Service & Utility Formatting
src/services/account_service.rs, src/services/address_book_service.rs, src/models/user/operations.rs, src/utils/token_utils.rs
Pass Keycloak client to token lookup in me flow; log and propagate errors in address_book filter instead of wrapping; reformat function signatures across multiple lines; minor logging and error handling adjustments.
Functional Library & Examples
functional_lib/src/lazy_pipeline.rs, functional_lib/src/query_composition.rs, functional_lib/src/unified_pagination.rs, examples/pipeline_metrics_demo.rs
Minor formatting and whitespace adjustments in lazy pipeline and query composition; refactor CURSOR_KEY initialization in unified_pagination without behavioral change; reformat demo println for readability.
Documentation & Root Exports
docs/rust_backend_refactor_audit.md, src/lib.rs, src/adapters/mod.rs
New audit document detailing Wave 1 refactor scope, module inventory, async boundary audit, gRPC surface targets, hot path markers, and baseline gates; expose new public modules (adapters, app_error, domain, ports, state) in lib.rs and adapters submodules (db, grpc, http).

Sequence Diagram(s)

sequenceDiagram
    participant Main
    participant AppState
    participant ConfigLoader
    participant HTTPServer
    participant GRPCServer
    participant Database
    participant ShutdownChannel

    Main->>ConfigLoader: AppConfig::from_env()
    ConfigLoader->>ConfigLoader: Parse env vars<br/>with defaults
    ConfigLoader-->>Main: AppConfig

    Main->>AppState: AppState::new(config)
    AppState->>Database: Initialize pools<br/>Run migrations
    Database-->>AppState: Pool ready
    AppState->>AppState: Register tenant<br/>Initialize Redis/Keycloak
    AppState-->>Main: AppState initialized

    Main->>HTTPServer: spawn run_http_server<br/>(state, shutdown)
    Main->>GRPCServer: spawn run_grpc_server<br/>(state, shutdown)

    HTTPServer->>AppState: Access config, pools
    GRPCServer->>AppState: Access config, pools

    Note over Main: Await Ctrl-C or task failure
    Main->>ShutdownChannel: Send shutdown signal
    ShutdownChannel->>HTTPServer: Notify shutdown
    ShutdownChannel->>GRPCServer: Notify shutdown

    HTTPServer->>HTTPServer: Graceful stop
    GRPCServer->>GRPCServer: Graceful stop

    HTTPServer-->>Main: Task complete
    GRPCServer-->>Main: Task complete

    Main->>Main: Exit with result
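The shutdown fan-out in the diagram above can be sketched with std threads standing in for Tokio tasks (an assumption-laden sketch: the real entrypoint uses `tokio::sync::watch` and a `JoinSet`, and the `AtomicBool` here is only a stand-in for the watch channel):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

// One shared flag, many servers observing it -- the same shape as the
// watch-channel shutdown, minus the async runtime.
pub fn run_until_shutdown(server_names: &[&str]) -> Vec<String> {
    let shutdown = Arc::new(AtomicBool::new(false));
    let mut handles = Vec::new();
    for name in server_names {
        let flag = Arc::clone(&shutdown);
        let name = name.to_string();
        handles.push(thread::spawn(move || {
            // Each "server" polls the shared flag instead of awaiting
            // a watch receiver.
            while !flag.load(Ordering::SeqCst) {
                thread::sleep(Duration::from_millis(5));
            }
            format!("{}: graceful stop", name)
        }));
    }
    // Simulate Ctrl-C arriving.
    thread::sleep(Duration::from_millis(20));
    shutdown.store(true, Ordering::SeqCst);
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Each task drains to completion after the signal, mirroring the "Graceful stop" / "Task complete" steps in the diagram.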
sequenceDiagram
    participant Client
    participant GRPCAdapter
    participant HealthService
    participant ClockPort
    participant Database

    Client->>GRPCAdapter: Check(HealthCheckRequest)
    GRPCAdapter->>HealthService: check()
    HealthService->>ClockPort: now_rfc3339()
    ClockPort->>ClockPort: chrono::Utc::now()<br/>.to_rfc3339()
    ClockPort-->>HealthService: RFC3339 timestamp
    HealthService->>Database: SELECT 1 (via pool)
    Database-->>HealthService: OK
    HealthService->>HealthService: Build HealthSnapshot<br/>(status, service, timestamp)
    HealthService-->>GRPCAdapter: Ok(HealthSnapshot)
    GRPCAdapter->>GRPCAdapter: Convert to<br/>tonic::Status
    GRPCAdapter-->>Client: HealthCheckResponse

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

🐰 A rabbit hops through configs bright,
gRPC servers, HTTP in flight,
Gender enums bloom with care,
AppState wires everywhere,
From Tokio's watch, shutdown's fair!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

Title check — ❓ Inconclusive
Explanation: The title 'Func endpoints' is vague and generic, using a non-descriptive abbreviation that conveys no meaningful information about the substantial changes in this pull request.
Resolution: Provide a more specific title that captures the main architectural changes, such as 'Refactor backend to support multi-transport gRPC and HTTP adapters with runtime config' or 'Add gRPC adapter, runtime configuration, and multi-tenant SQLx repositories'.
✅ Passed checks (2 passed)
Description Check — ✅ Passed: Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage — ✅ Passed: Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
src/api/health_controller.rs (2)

218-279: ⚠️ Potential issue | 🟠 Major

This hot path still does a full tenant fan-out on every poll.

Line 218 marks this as an orchestrator-polled endpoint, but Lines 261-283 still list every tenant and run a per-tenant SELECT 1. That makes health checks scale with tenant count and can turn the endpoint itself into a load spike. I'd keep /health/detailed out of the poller path or serve tenant details from cached/background results instead. Also, Line 269 should go through the TenantPoolManager contract instead of the ad-hoc get_tenant_pool branch.

As per coding guidelines, "Use TenantPoolManager to obtain per-tenant connection pools with the pattern: manager.get_pool(&tenant_id)?."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/api/health_controller.rs` around lines 218 - 279, The detailed health
endpoint currently fans out to every tenant on each poll (iterating
Tenant::list_all and running per-tenant SELECT 1), which is a hot-path
anti-pattern; instead, stop doing synchronous per-request tenant checks in the
handler and read precomputed tenant health from a background/cached store (or
remove tenant details from /health/detailed), and where you must access a tenant
pool replace ad-hoc manager.get_tenant_pool(&tenant.id) usage with the
TenantPoolManager contract: call manager.get_pool(&tenant.id)? (or
manager.get_pool(&tenant.id).map_err(...)) to obtain the pool; move the loop and
per-tenant SQL checks into a background task that periodically updates a shared
cache (e.g., an Arc<RwLock/ConcurrentMap> or web::Data) and have the handler
read that cache instead of iterating Tenant::list_all and executing SELECT 1
inline.

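The cached/background pattern this comment asks for can be sketched std-only (a hypothetical `probe` closure stands in for the per-tenant `SELECT 1`, and the cache type is illustrative; the real code would obtain pools via the `TenantPoolManager` contract):

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;
use std::time::Duration;

// Shared cache the handler reads; the background task is the only writer.
pub type HealthCache = Arc<RwLock<HashMap<String, bool>>>;

pub fn spawn_health_poller<F>(cache: HealthCache, tenants: Vec<String>, probe: F)
where
    F: Fn(&str) -> bool + Send + 'static,
{
    thread::spawn(move || loop {
        for t in &tenants {
            // The expensive per-tenant check runs here, off the request path.
            let healthy = probe(t);
            cache.write().unwrap().insert(t.clone(), healthy);
        }
        thread::sleep(Duration::from_millis(50));
    });
}

/// The request handler only reads the cache and never probes inline,
/// so /health/detailed stays O(1) in tenant count per request.
pub fn cached_health(cache: &HealthCache, tenant: &str) -> Option<bool> {
    cache.read().unwrap().get(tenant).copied()
}
```

`None` here means "not yet probed", which the handler should surface distinctly rather than fold into "healthy", per the second comment below.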
259-305: ⚠️ Potential issue | 🟠 Major

Don't report tenant probe failures as healthy.

Line 265 turns Tenant::list_all errors into an empty list, and Lines 289-304 then collapse join/query failures to None, which Line 304 treats as healthy. That means this endpoint can return overall healthy while tenant inspection is broken.

Suggested direction
-    let tenants = if let Some(manager_ref) = manager {
+    let (tenants, tenant_probe_failed) = if let Some(manager_ref) = manager {
         let manager_data = manager_ref.clone();
         match tokio::task::spawn_blocking(move || {
             let mut main_conn = main_conn
                 .get()
                 .map_err(|e| format!("Failed to get db connection: {}", e))?;
-            let tenants = Tenant::list_all(&mut main_conn).unwrap_or_else(|_| Vec::new());
+            let tenants = Tenant::list_all(&mut main_conn)
+                .map_err(|e| format!("Failed to list tenants: {}", e))?;
             let mut tenant_healths = Vec::new();

             // build tenant_healths ...

             Ok::<Vec<TenantHealth>, String>(tenant_healths)
         })
         .await
         {
-            Ok(Ok(healths)) if !healths.is_empty() => Some(healths),
-            _ => None,
+            Ok(Ok(healths)) => (Some(healths), false),
+            _ => (None, true),
         }
     } else {
-        None
+        (None, false)
     };

     let overall_status = if db_status.is_healthy()
         && (cache_status.is_healthy() || !cache_status.is_available())
+        && !tenant_probe_failed
         && tenants
             .as_ref()
             .map_or(true, |t| t.iter().all(|th| th.status.is_healthy()))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/api/health_controller.rs` around lines 259 - 305, The tenant probe
currently swallows Tenant::list_all errors and treats a failed spawn/join as
None which the overall_status then considers healthy; fix by (1) stopping the
swallow inside the tokio::task::spawn_blocking closure — replace the
Tenant::list_all(...).unwrap_or_else(|_| Vec::new()) with propagation (use ? /
map_err -> return Err(String)) so the closure returns Err on list/query
failures, and (2) change the final tenants.as_ref().map_or(true, |t| ...) to
map_or(false, |t| ...) so a missing/failed tenant probe is treated as unhealthy;
reference: the spawn_blocking closure that builds tenant_healths,
Tenant::list_all, and the tenants.as_ref().map_or(...) check used in
overall_status.
src/models/person.rs (1)

183-206: ⚠️ Potential issue | 🟠 Major

Don't ignore unknown gender filters.

An unrecognized filter.gender currently turns into None, which drops the predicate entirely and broadens the query instead of failing fast. Please validate the value before building the query, or route invalid values into an explicit error path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/models/person.rs` around lines 183 - 206, The current filter.gender
branch silently returns None for unrecognized strings, dropping the predicate;
change this to validate and return an explicit error instead of producing None:
detect invalid values from filter.gender (the same match that maps to
Gender::Male/ Female/ NonBinary/ PreferNotToSay) and propagate a failure (e.g.,
return Result::Err or convert the enclosing function to return a Result) rather
than mapping to None; keep the successful branch creating the Box<dyn
BoxableExpression<people::table, diesel::pg::Pg, SqlType = Nullable<Bool>>>
using people::gender.eq(Some(Gender::...)) but ensure unknown strings produce a
clear error path so invalid filter.gender values are rejected rather than
ignored.
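The fail-fast validation the comment above asks for might look like this (a std-only sketch; the real code would return the crate's `ServiceError` and feed the parsed value into the Diesel predicate rather than a plain `Result<_, String>`):

```rust
// Mirror of the PR's gender enum; the error type here is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Gender {
    Male,
    Female,
    NonBinary,
    PreferNotToSay,
}

/// Unknown filter values become an explicit Err instead of silently
/// dropping the predicate and broadening the query.
pub fn parse_gender_filter(raw: &str) -> Result<Gender, String> {
    match raw {
        "male" => Ok(Gender::Male),
        "female" => Ok(Gender::Female),
        "non_binary" => Ok(Gender::NonBinary),
        "prefer_not_to_say" => Ok(Gender::PreferNotToSay),
        other => Err(format!("invalid gender filter: {}", other)),
    }
}
```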
src/middleware/auth_middleware.rs (1)

112-123: ⚠️ Potential issue | 🟠 Major

Return 500 when auth infrastructure is missing.

Falling back to 401 Unauthorized when AppState, TenantPoolManager, or KeycloakClient is absent turns a server misconfiguration/outage into a client-auth problem. These branches should fail as internal errors so callers and monitoring see the real fault.

Also applies to: 127-138, 397-423

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/middleware/auth_middleware.rs` around lines 112 - 123, The code currently
treats missing auth infrastructure (AppState, TenantPoolManager, KeycloakClient)
as a 401 Unauthorized in the keycloak_client_from_request branch; change these
branches to return a 500 Internal Server Error instead so infrastructure faults
surface as server errors. Locate the branch around
keycloak_client_from_request(&req) and replace the Unauthorized response with an
InternalServerError response (using HttpResponse::InternalServerError and the
existing ResponseBody pattern), preserving the ServiceResponse::new(request,
response) async return shape. Apply the same change for the other similar
branches referenced (the checks around lines 127-138 and 397-423) so all missing
infrastructure cases return 500 rather than 401.
🧹 Nitpick comments (6)
src/models/user_token.rs (1)

62-62: Consider removing redundant dotenvy::dotenv() call.

This call is likely redundant since SECRET_KEY initialization (line 14) already loads the environment. If generate_token is always called after SECRET_KEY is accessed, this second call has no effect. However, keeping it provides defense-in-depth if MAX_AGE needs to be loaded independently.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/models/user_token.rs` at line 62, The dotenvy::dotenv() call inside
generate_token appears redundant because SECRET_KEY is already initialized
earlier; remove the standalone dotenvy::dotenv() invocation in generate_token
(or document why defense-in-depth is required) and instead ensure generate_token
reads MAX_AGE via std::env::var or a configuration accessor (e.g., keep using
SECRET_KEY and access MAX_AGE with env::var("MAX_AGE")/parse) so no second
dotenv call is necessary; update generate_token and any references to
MAX_AGE/SECRET_KEY to rely on the module-level initialization or a single config
loader.
Cargo.toml (1)

61-61: Exact version pin on home crate may cause dependency conflicts.

Using =0.5.11 pins to an exact version, which can cause resolution failures when other dependencies require different versions of home. Unless there's a specific compatibility issue, consider using "0.5" or "0.5.11" (without =) to allow patch updates.

Suggested change
-home = "=0.5.11"
+home = "0.5"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Cargo.toml` at line 61, The Cargo.toml entry pins the home crate to an exact
version using `=0.5.11`, which can cause dependency resolution conflicts; edit
the Cargo.toml dependency line for the `home` crate to remove the leading `=`
and allow flexibility (e.g., use "0.5" to accept patch/minor updates or "0.5.11"
without `=` for patch upgrades) so Cargo can resolve compatible versions across
the dependency graph.
proto/core.proto (1)

1-17: Proto file directory structure doesn't match package declaration.

The static analysis tool (Buf) correctly identifies that files with package nexus.core should be in a nexus/core directory relative to the proto root. Currently the file is at proto/core.proto.

Consider moving this file to proto/nexus/core/core.proto (or renaming to proto/nexus/core/health.proto for clarity) to align with the package structure. This follows Protobuf best practices and ensures compatibility with Buf and other proto tooling.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@proto/core.proto` around lines 1 - 17, The proto package declaration "package
nexus.core;" in core.proto doesn't match its file path; move the file under a
matching directory (e.g., create proto/nexus/core/ and relocate core.proto
there, or rename to health.proto inside that directory) so the package
nexus.core aligns with the filesystem and Buf expectations; update any
build/proto includes or import paths that reference core.proto to the new path
to avoid breaking references to the HealthService, HealthCheckRequest, and
HealthCheckResponse symbols.
src/state.rs (2)

201-210: Edge case: URL parsing may not handle all credential formats.

The redact_connection_url function assumes credentials are in the format scheme://user:password@host. It may not correctly handle:

  • URLs without a scheme prefix
  • URLs with special characters in the password that include @ or :

For most standard database/Redis URLs, this works. Consider using the url crate for more robust parsing if edge cases become a concern.

💡 More robust URL redaction using the `url` crate
fn redact_connection_url(url: &str) -> String {
    match url::Url::parse(url) {
        Ok(mut parsed) => {
            if parsed.password().is_some() {
                let _ = parsed.set_password(Some("<redacted>"));
            }
            parsed.to_string()
        }
        Err(_) => {
            // Fallback to original logic or return as-is
            url.to_string()
        }
    }
}

This handles edge cases and is more maintainable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/state.rs` around lines 201 - 210, The current redact_connection_url
function assumes credentials follow user:password@host and can mis-handle
missing schemes or passwords/hosts containing ':' or '@'; update
redact_connection_url to use the url crate for parsing: attempt
url::Url::parse(url), on success replace the password via
set_password(Some("<redacted>")) if present and return parsed.to_string(), and
on Err fall back to returning the original string (or the existing simple
redaction) so malformed or scheme-less URLs won't cause incorrect redaction.

149-180: RwLock poisoning recovery strategy is reasonable but warrants caution.

The code recovers from poisoned locks by logging an error and accessing the inner data via into_inner(). This is a pragmatic approach that prioritizes availability over strict consistency.

However, if the lock was poisoned, it indicates a panic occurred while holding the lock, which could mean the HashMap is in an inconsistent state. Consider whether this recovery behavior is acceptable for your reliability requirements.

💡 Alternative: fail-fast approach if strict consistency is required

If you prefer fail-fast behavior:

 pub fn sqlx_pool_for_tenant(&self, tenant_id: &str) -> Option<sqlx::PgPool> {
-    match self.inner.sqlx_pools.read() {
-        Ok(pools) => pools.get(tenant_id).cloned(),
-        Err(poisoned) => {
-            tracing::error!(
-                tenant_id = tenant_id,
-                error = %poisoned,
-                "sqlx pool map lock poisoned during read; recovering inner guard"
-            );
-            let pools = poisoned.into_inner();
-            pools.get(tenant_id).cloned()
-        }
-    }
+    self.inner.sqlx_pools
+        .read()
+        .expect("sqlx pool map lock poisoned")
+        .get(tenant_id)
+        .cloned()
 }

The current approach is valid if availability is prioritized over consistency guarantees.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/state.rs` around lines 149 - 180, The current methods
sqlx_pool_for_tenant and insert_sqlx_pool silently recover from RwLock poisoning
by calling into_inner(), which can mask inconsistent state; change their
behavior to propagate poisoning instead: update sqlx_pool_for_tenant to return
Result<Option<sqlx::PgPool>,
std::sync::PoisonError<std::sync::RwLockReadGuard<'_, _>>> (or a boxed
error/your crate error type) and return Err(poisoned) when the read() fails, and
update insert_sqlx_pool to return Result<(),
std::sync::PoisonError<std::sync::RwLockWriteGuard<'_, _>>> (or your error type)
and return Err(poisoned) when write() fails; keep the successful branches
identical (get/cloned and insert) so callers of inner.sqlx_pools must handle the
poison case (or map it to a panic/log at call site) rather than silently using
into_inner().
src/config/runtime.rs (1)

162-212: Consider consolidating environment variable mappings.

The manual override pattern is verbose but explicit. One observation: the config::Environment source added at line 160 with separator __ should already handle environment variables like SERVER__HOST, DB__URL, etc. The explicit overrides handle non-standard naming (e.g., APP_HOST → server.host, DATABASE_URL → db.url).

This approach works correctly, but consider documenting which env vars are "canonical" (nested like DB__URL) vs "legacy/convenient" (flat like DATABASE_URL) for maintainability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/config/runtime.rs` around lines 162 - 212, The env override block
duplicates mappings already handled by config::Environment with separator "__"
(e.g., SERVER__HOST, DB__URL) and keeps legacy flat names (e.g., APP_HOST,
DATABASE_URL) via env_opt and builder.set_override; refactor by
grouping/centralizing the manual overrides into a single mapping function (e.g.,
a new function that iterates a static list of (env_var, config_key) pairs and
calls env_opt + builder.set_override) to reduce boilerplate, keep env_opt and
builder usage intact, and add a short doc comment near the config::Environment
setup explaining which env names are canonical (double-underscore nested) vs
legacy/convenience (flat) so future maintainers know which form to prefer.
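The table-driven consolidation the prompt describes could be sketched like this (`LEGACY_MAP` and the `lookup` parameter are illustrative names; the real code would feed the resulting pairs into `builder.set_override`, and `lookup` abstracts `std::env::var` so the mapping is testable):

```rust
// Canonical form is the nested double-underscore name (SERVER__HOST);
// these flat names are kept only for legacy convenience.
const LEGACY_MAP: &[(&str, &str)] = &[
    ("APP_HOST", "server.host"),
    ("DATABASE_URL", "db.url"),
];

/// Collect a config override for every legacy variable the lookup resolves.
pub fn legacy_overrides<F>(lookup: F) -> Vec<(String, String)>
where
    F: Fn(&str) -> Option<String>,
{
    LEGACY_MAP
        .iter()
        .filter_map(|(var, key)| lookup(var).map(|v| (key.to_string(), v)))
        .collect()
}
```

A single static list keeps the legacy names in one place and doubles as the documentation of which flat variables are still honored.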
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/adapters/http/server.rs`:
- Around line 67-78: The session cookie is set with SameSite::Strict which
prevents the browser from sending the session on the cross-site OAuth callback;
update the SessionMiddleware configuration (the SessionMiddleware::builder call
where cookie_same_site is set) to use SameSite::Lax (or SameSite::None with
Secure=true if you must support cross-site POSTs) instead of SameSite::Strict so
the stored oauth_state, nonce and PKCE verifier are sent on the IdP redirect and
can be retrieved in your callback handler.

In `@src/api/account_controller.rs`:
- Around line 86-89: The redirect_uri field can be used for open redirects;
before calling exchange_code_stateless in keycloak_callback_stateless, validate
callback_request.redirect_uri (and the struct field redirect_uri) against an
application allowlist (e.g., ALLOWED_REDIRECT_URIS env var or config) and reject
requests with a non-empty allowlist that does not contain the provided URI by
returning a BadRequest/AuthResponse error; ensure the check handles missing/None
redirect_uri by allowing Keycloak defaults and normalizes/trims values for exact
matching (do not rely solely on Keycloak).

In `@src/app_error.rs`:
- Around line 23-63: The ResponseError/From<AppError> adapters are flattening
structured ServiceError into a single "error" string and dropping
.with_detail()/.with_tag() metadata; restore the structured transport shape by
returning ServiceError (or including its detail/tag fields) from
AppError::error_response instead of a lone string in error_response, and when
converting AppError -> tonic::Status preserve the ServiceError metadata (use the
ServiceError payload for the status message and attach detail/tag bytes or
metadata via tonic::Status::with_details or by setting gRPC metadata) so HTTP
and gRPC keep the same ServiceError shape and its .with_detail()/.with_tag()
information (refer to ResponseError impl, error_response, AppError::from ->
tonic::Status, and the ServiceError/.with_detail()/.with_tag() symbols).
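Setting the metadata question aside, the basic cross-transport mapping can be sketched with plain numeric codes (a sketch only: the real code implements `ResponseError` and `From<AppError> for tonic::Status`, and the `ServiceError`/`Database` wrapper variants are omitted here for brevity):

```rust
// Variant names come from the PR summary; the numeric codes are the
// standard HTTP status codes and tonic::Code discriminants
// (NotFound=5, InvalidArgument=3, Unauthenticated=16,
//  AlreadyExists=6, Internal=13).
#[derive(Debug)]
pub enum AppError {
    NotFound(String),
    Validation(String),
    Unauthorized(String),
    Conflict(String),
    Internal(String),
}

impl AppError {
    pub fn http_status(&self) -> u16 {
        match self {
            AppError::NotFound(_) => 404,
            AppError::Validation(_) => 400,
            AppError::Unauthorized(_) => 401,
            AppError::Conflict(_) => 409,
            AppError::Internal(_) => 500,
        }
    }

    pub fn grpc_code(&self) -> i32 {
        match self {
            AppError::NotFound(_) => 5,
            AppError::Validation(_) => 3,
            AppError::Unauthorized(_) => 16,
            AppError::Conflict(_) => 6,
            AppError::Internal(_) => 13,
        }
    }
}
```

The review's point is that both transports should also carry the structured `ServiceError` payload, not just these codes plus a flattened message string.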

In `@src/main.rs`:
- Around line 38-53: When the first task join returns Some(join_res) before any
shutdown, an Ok(()) from that task should be treated as an unexpected failure
instead of success; change the Some(join_res) branch so that after resolving
join_res via join_res.context("server task join failure")? you check the inner
task_result: if it is Ok(()) call anyhow::bail!("server task stopped
unexpectedly") (or similar error), otherwise propagate the Err via
task_result.map_err(anyhow::Error::from)?; keep the remaining loop (while let
Some(pending) ...) unchanged for draining siblings after a deliberate shutdown.

In `@src/middleware/auth_middleware.rs`:
- Around line 184-192: The code is trusting claims.tenant_id from the decoded
token to select a tenant pool; change it to derive tenant_id from server-side
context (e.g., extract from host/subdomain or request context) before calling
TenantPoolManager, use manager.get_pool(&tenant_id)? (or the existing
manager.get_pool pattern) to obtain the pool, then decode/verify the token via
token_utils::decode_token and token_utils::verify_token and explicitly assert
that token_data.claims.tenant_id == server_derived_tenant_id (reject if
mismatch); update both the block using
token_utils::decode_token/token_utils::verify_token and the similar logic
referenced at 496-523 to follow this server-first tenant resolution and
cross-check pattern.
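The server-first resolution and cross-check can be sketched as below; deriving the tenant from the Host header's first subdomain label is one possible convention, assumed here for illustration:

```rust
// Derive the tenant from the request host, never from the token alone.
fn tenant_from_host(host: &str) -> Option<&str> {
    let host = host.split(':').next()?; // drop any port suffix
    let labels: Vec<&str> = host.split('.').collect();
    // Require tenant + domain + TLD, so "example.com" yields no tenant.
    if labels.len() >= 3 { Some(labels[0]) } else { None }
}

// After decoding/verifying the token, reject any claim that disagrees with
// the server-derived tenant.
fn check_token_tenant(server_tenant: &str, claim_tenant: &str) -> Result<(), String> {
    if server_tenant == claim_tenant {
        Ok(())
    } else {
        Err("token tenant_id does not match server-derived tenant".to_string())
    }
}
```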

In `@src/models/person.rs`:
- Around line 14-21: The Gender enum (pub enum Gender) uses #[DbValueStyle =
"snake_case"] for Diesel but lacks Serde renaming, so JSON will emit
"Male"/"NonBinary" etc.; add #[serde(rename_all = "snake_case")] to the enum
declaration (alongside the existing derives Serialize, Deserialize and DbEnum)
so Serde serializes/deserializes the same snake_case variants as the DB; apply
the same change to the other enums noted (the ones around lines 27 and 40) to
keep JSON/DB casing consistent.
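A std-only stand-in shows the invariant the attribute guarantees: one canonical snake_case spelling shared by the DB and JSON layers. In the actual model this is exactly what `#[serde(rename_all = "snake_case")]` next to the DbEnum derive provides:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Gender {
    Male,
    Female,
    NonBinary,
    PreferNotToSay,
}

impl Gender {
    // The single canonical spelling used on the wire and in the DB.
    fn as_str(self) -> &'static str {
        match self {
            Gender::Male => "male",
            Gender::Female => "female",
            Gender::NonBinary => "non_binary",
            Gender::PreferNotToSay => "prefer_not_to_say",
        }
    }

    fn parse(s: &str) -> Option<Self> {
        match s {
            "male" => Some(Gender::Male),
            "female" => Some(Gender::Female),
            "non_binary" => Some(Gender::NonBinary),
            "prefer_not_to_say" => Some(Gender::PreferNotToSay),
            _ => None, // "NonBinary" etc. is rejected, matching the DB enum
        }
    }
}
```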

In `@src/schema.rs`:
- Line 673: schema.rs currently references a non-existent type
`crate::models::person::GenderMapping` for the `gender` column; update the
reference to the actual enum type `crate::models::person::Gender` (i.e., change
`GenderMapping` → `Gender`) so Diesel types align with the enum defined in
`src/models/person.rs`, and then regenerate the file by running your migration
and `diesel print-schema` rather than manually editing schema.rs.

---

Outside diff comments:
In `@src/api/health_controller.rs`:
- Around line 218-279: The detailed health endpoint currently fans out to every
tenant on each poll (iterating Tenant::list_all and running per-tenant SELECT
1), which is a hot-path anti-pattern; instead, stop doing synchronous
per-request tenant checks in the handler and read precomputed tenant health from
a background/cached store (or remove tenant details from /health/detailed), and
where you must access a tenant pool replace ad-hoc
manager.get_tenant_pool(&tenant.id) usage with the TenantPoolManager contract:
call manager.get_pool(&tenant.id)? (or
manager.get_pool(&tenant.id).map_err(...)) to obtain the pool; move the loop and
per-tenant SQL checks into a background task that periodically updates a shared
cache (e.g., an Arc<RwLock/ConcurrentMap> or web::Data) and have the handler
read that cache instead of iterating Tenant::list_all and executing SELECT 1
inline.
- Around line 259-305: The tenant probe currently swallows Tenant::list_all
errors and treats a failed spawn/join as None which the overall_status then
considers healthy; fix by (1) stopping the swallow inside the
tokio::task::spawn_blocking closure — replace the
Tenant::list_all(...).unwrap_or_else(|_| Vec::new()) with propagation (use ? /
map_err -> return Err(String)) so the closure returns Err on list/query
failures, and (2) change the final tenants.as_ref().map_or(true, |t| ...) to
map_or(false, |t| ...) so a missing/failed tenant probe is treated as unhealthy;
reference: the spawn_blocking closure that builds tenant_healths,
Tenant::list_all, and the tenants.as_ref().map_or(...) check used in
overall_status.
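Both health_controller suggestions can be sketched together with std types only (names are illustrative): the probe propagates listing failures instead of swallowing them, and results land in a shared cache that the handler reads without running per-tenant queries:

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

type HealthCache = Arc<RwLock<HashMap<String, bool>>>;

fn probe_tenants(
    list_all: impl Fn() -> Result<Vec<String>, String>,
    check: impl Fn(&str) -> bool,
) -> Result<HashMap<String, bool>, String> {
    // `?` propagates "could not enumerate tenants" rather than
    // unwrap_or_else(|_| Vec::new()), which reported the failure as healthy.
    let tenants = list_all()?;
    Ok(tenants
        .into_iter()
        .map(|t| {
            let healthy = check(&t);
            (t, healthy)
        })
        .collect())
}

// Run by a background task on an interval; the handler only calls the reader.
fn refresh_cache(cache: &HealthCache, probed: HashMap<String, bool>) {
    *cache.write().expect("health cache poisoned") = probed;
}

fn read_tenant_health(cache: &HealthCache, tenant: &str) -> Option<bool> {
    cache.read().expect("health cache poisoned").get(tenant).copied()
}
```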

In `@src/middleware/auth_middleware.rs`:
- Around line 112-123: The code currently treats missing auth infrastructure
(AppState, TenantPoolManager, KeycloakClient) as a 401 Unauthorized in the
keycloak_client_from_request branch; change these branches to return a 500
Internal Server Error instead so infrastructure faults surface as server errors.
Locate the branch around keycloak_client_from_request(&req) and replace the
Unauthorized response with an InternalServerError response (using
HttpResponse::InternalServerError and the existing ResponseBody pattern),
preserving the ServiceResponse::new(request, response) async return shape. Apply
the same change for the other similar branches referenced (the checks around
lines 127-138 and 397-423) so all missing infrastructure cases return 500 rather
than 401.
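The distinction reduces to a small classification, sketched here with plain status codes (the real branches would build HttpResponse::Unauthorized or HttpResponse::InternalServerError with the ResponseBody pattern):

```rust
// Client auth problems stay 401; absent server wiring surfaces as 500.
enum AuthFailure {
    MissingCredentials,    // no token / bad token from the client
    MissingInfrastructure, // AppState, pool manager, or Keycloak client absent
}

fn status_code(failure: &AuthFailure) -> u16 {
    match failure {
        AuthFailure::MissingCredentials => 401,
        AuthFailure::MissingInfrastructure => 500,
    }
}
```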

In `@src/models/person.rs`:
- Around line 183-206: The current filter.gender branch silently returns None
for unrecognized strings, dropping the predicate; change this to validate and
return an explicit error instead of producing None: detect invalid values from
filter.gender (the same match that maps to Gender::Male/ Female/ NonBinary/
PreferNotToSay) and propagate a failure (e.g., return Result::Err or convert the
enclosing function to return a Result) rather than mapping to None; keep the
successful branch creating the Box<dyn BoxableExpression<people::table,
diesel::pg::Pg, SqlType = Nullable<Bool>>> using
people::gender.eq(Some(Gender::...)) but ensure unknown strings produce a clear
error path so invalid filter.gender values are rejected rather than ignored.
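The validation half of the fix can be sketched standalone: unknown filter strings become an explicit Err instead of a silently dropped predicate, and the enclosing query builder would return Result and surface this as a 400:

```rust
#[derive(Debug, PartialEq)]
enum Gender {
    Male,
    Female,
    NonBinary,
    PreferNotToSay,
}

fn parse_gender_filter(raw: &str) -> Result<Gender, String> {
    match raw {
        "male" => Ok(Gender::Male),
        "female" => Ok(Gender::Female),
        "non_binary" => Ok(Gender::NonBinary),
        "prefer_not_to_say" => Ok(Gender::PreferNotToSay),
        // Reject rather than ignore, so callers learn their filter was invalid.
        other => Err(format!("invalid gender filter: '{other}'")),
    }
}
```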

---

Nitpick comments:
In `@Cargo.toml`:
- Line 61: The Cargo.toml entry pins the home crate to an exact version using
`=0.5.11`, which can cause dependency resolution conflicts; edit the Cargo.toml
dependency line for the `home` crate to remove the leading `=` and allow
flexibility (e.g., use "0.5" to accept patch/minor updates or "0.5.11" without
`=` for patch upgrades) so Cargo can resolve compatible versions across the
dependency graph.

In `@proto/core.proto`:
- Around line 1-17: The proto package declaration "package nexus.core;" in
core.proto doesn't match its file path; move the file under a matching directory
(e.g., create proto/nexus/core/ and relocate core.proto there, or rename to
health.proto inside that directory) so the package nexus.core aligns with the
filesystem and Buf expectations; update any build/proto includes or import paths
that reference core.proto to the new path to avoid breaking references to the
HealthService, HealthCheckRequest, and HealthCheckResponse symbols.

In `@src/config/runtime.rs`:
- Around line 162-212: The env override block duplicates mappings already
handled by config::Environment with separator "__" (e.g., SERVER__HOST, DB__URL)
and keeps legacy flat names (e.g., APP_HOST, DATABASE_URL) via env_opt and
builder.set_override; refactor by grouping/centralizing the manual overrides
into a single mapping function (e.g., a new function that iterates a static list
of (env_var, config_key) pairs and calls env_opt + builder.set_override) to
reduce boilerplate, keep env_opt and builder usage intact, and add a short doc
comment near the config::Environment setup explaining which env names are
canonical (double-underscore nested) vs legacy/convenience (flat) so future
maintainers know which form to prefer.
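The centralized mapping might look like the sketch below: a static table of legacy flat env names to canonical nested config keys, resolved through an injected lookup so the real code can pass `|v| std::env::var(v).ok()`. The key names are illustrative:

```rust
const LEGACY_ENV_MAP: &[(&str, &str)] = &[
    ("APP_HOST", "server.host"),
    ("DATABASE_URL", "db.url"),
];

// One loop replaces the repeated env_opt + builder.set_override boilerplate.
fn legacy_overrides(get: impl Fn(&str) -> Option<String>) -> Vec<(String, String)> {
    LEGACY_ENV_MAP
        .iter()
        .filter_map(|&(var, key)| get(var).map(|v| (key.to_string(), v)))
        .collect()
}
```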

In `@src/models/user_token.rs`:
- Line 62: The dotenvy::dotenv() call inside generate_token appears redundant
because SECRET_KEY is already initialized earlier; remove the standalone
dotenvy::dotenv() invocation in generate_token (or document why defense-in-depth
is required) and instead ensure generate_token reads MAX_AGE via std::env::var
or a configuration accessor (e.g., keep using SECRET_KEY and access MAX_AGE with
env::var("MAX_AGE")/parse) so no second dotenv call is necessary; update
generate_token and any references to MAX_AGE/SECRET_KEY to rely on the
module-level initialization or a single config loader.

In `@src/state.rs`:
- Around line 201-210: The current redact_connection_url function assumes
credentials follow user:password@host and can mis-handle missing schemes or
passwords/hosts containing ':' or '@'; update redact_connection_url to use the
url crate for parsing: attempt url::Url::parse(url), on success replace the
password via set_password(Some("<redacted>")) if present and return
parsed.to_string(), and on Err fall back to returning the original string (or
the existing simple redaction) so malformed or scheme-less URLs won't cause
incorrect redaction.
- Around line 149-180: The current methods sqlx_pool_for_tenant and
insert_sqlx_pool silently recover from RwLock poisoning by calling into_inner(),
which can mask inconsistent state; change their behavior to propagate poisoning
instead: update sqlx_pool_for_tenant to return Result<Option<sqlx::PgPool>,
std::sync::PoisonError<std::sync::RwLockReadGuard<'_, _>>> (or a boxed
error/your crate error type) and return Err(poisoned) when the read() fails, and
update insert_sqlx_pool to return Result<(),
std::sync::PoisonError<std::sync::RwLockWriteGuard<'_, _>>> (or your error type)
and return Err(poisoned) when write() fails; keep the successful branches
identical (get/cloned and insert) so callers of inner.sqlx_pools must handle the
poison case (or map it to a panic/log at call site) rather than silently using
into_inner().
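Both state.rs points can be sketched with std-only stand-ins. The redaction mirrors what the url crate's `set_password(Some("<redacted>"))` would do by only inspecting the authority section, and the pool map surfaces lock poisoning instead of recovering via into_inner(); String stands in for sqlx::PgPool:

```rust
use std::collections::HashMap;
use std::sync::RwLock;

fn redact_connection_url(raw: &str) -> String {
    // Only look inside the authority: after "scheme://", before the next '/'.
    let Some(scheme_end) = raw.find("://") else { return raw.to_string() };
    let authority_start = scheme_end + 3;
    let authority_end = raw[authority_start..]
        .find('/')
        .map_or(raw.len(), |i| authority_start + i);
    let authority = &raw[authority_start..authority_end];

    // Credentials end at the LAST '@' there; passwords may themselves contain '@'.
    let Some(at) = authority.rfind('@') else { return raw.to_string() };
    let user = authority[..at].split(':').next().unwrap_or("");
    format!(
        "{}{}:<redacted>@{}",
        &raw[..authority_start],
        user,
        &raw[authority_start + at + 1..]
    )
}

struct Pools {
    inner: RwLock<HashMap<String, String>>,
}

impl Pools {
    fn pool_for_tenant(&self, tenant: &str) -> Result<Option<String>, String> {
        // Poisoning is propagated to the caller instead of masked.
        let guard = self.inner.read().map_err(|e| format!("pool map poisoned: {e}"))?;
        Ok(guard.get(tenant).cloned())
    }

    fn insert_pool(&self, tenant: &str, pool: String) -> Result<(), String> {
        let mut guard = self.inner.write().map_err(|e| format!("pool map poisoned: {e}"))?;
        guard.insert(tenant.to_string(), pool);
        Ok(())
    }
}
```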

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f23a5cc0-716c-43dc-afb4-aad66b40c73e

📥 Commits

Reviewing files that changed from the base of the PR and between 8fdaeea and 72d58f6.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (42)
  • Cargo.toml
  • build.rs
  • docs/rust_backend_refactor_audit.md
  • migrations/2026-02-28-000001_make_people_gender_nullable/down.sql
  • migrations/2026-02-28-000001_make_people_gender_nullable/up.sql
  • migrations/2026-03-01-000002_convert_people_gender_to_enum/down.sql
  • migrations/2026-03-01-000002_convert_people_gender_to_enum/up.sql
  • proto/core.proto
  • src/adapters/db/mod.rs
  • src/adapters/db/tenant_sqlx_repository.rs
  • src/adapters/grpc/health_impl.rs
  • src/adapters/grpc/mod.rs
  • src/adapters/grpc/server.rs
  • src/adapters/http/mod.rs
  • src/adapters/http/server.rs
  • src/adapters/mod.rs
  • src/api/account_controller.rs
  • src/api/address_book_controller.rs
  • src/api/controller_context.rs
  • src/api/health_controller.rs
  • src/api/nfag_controller.rs
  • src/api/tenant_controller.rs
  • src/app_error.rs
  • src/config/mod.rs
  • src/config/runtime.rs
  • src/domain/health.rs
  • src/domain/mod.rs
  • src/lib.rs
  • src/main.rs
  • src/middleware/auth_middleware.rs
  • src/models/person.rs
  • src/models/user_token.rs
  • src/ports/mod.rs
  • src/ports/outbound.rs
  • src/ports/repository.rs
  • src/schema.rs
  • src/services/core/health_service.rs
  • src/services/core/mod.rs
  • src/services/mod.rs
  • src/state.rs
  • src/utils/keycloak.rs
  • src/utils/ws_logger.rs

…e, query_composition, and unified_pagination modules
- Consolidated legacy environment variable overrides into a single loop for cleaner code.
- Improved error handling for missing Keycloak client secret in non-dev environments.
- Updated the `AppConfig` struct to use nested configuration keys for better organization.
- Added tenant ID extraction from request host for multi-tenant support in authentication middleware.
- Enhanced error responses in authentication middleware for unauthorized access.
- Refactored person model to use a more descriptive gender enum and improved filtering logic.
- Updated user token validation functions to improve readability and maintainability.
- Cleaned up logging in address book service and other modules for better debugging.
- Improved session key handling and connection URL redaction for security.
- Enhanced Keycloak client methods for better error handling and clarity.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
src/config/hybrid_manager.rs (1)

81-102: ⚠️ Potential issue | 🟠 Major

Don't turn pool-creation failures into NotFound.

If get_or_create_pool_functional(tenant_id) returns an error, this branch logs it and then falls through to ServiceError::not_found(...). That masks real infra/config failures as a missing-tenant 404. Please propagate the original ServiceError instead of collapsing it to None—ideally through the manager's standard pool lookup path.

Based on learnings: Use TenantPoolManager to obtain per-tenant connection pools with the pattern: manager.get_pool(&tenant_id)?.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/config/hybrid_manager.rs` around lines 81 - 102, The current code masks
infra/config errors by converting get_or_create_pool_functional failures into
None and thereby returning ServiceError::not_found; replace this chain with the
manager's standard API so errors propagate. Instead of calling
get_tenant_pool(...).or_else(....get_or_create_pool_functional...) and mapping
failures to None, call self.pg_tenant_manager.get_pool(&tenant_id)? (or
otherwise return the Err from get_or_create_pool_functional) so the original
ServiceError is propagated; remove the ok_or_else(...) not-found fallthrough and
use the Result-returning TenantPoolManager::get_pool/get_pool(&tenant_id) path
to obtain the pool.
src/api/account_controller.rs (1)

678-678: ⚠️ Potential issue | 🟠 Major

Security concern: Login session contains full ID token string.

Line 678 uses format!("oauth-{}", id_token_str) which embeds the entire ID token in the login session. This is problematic because:

  1. ID tokens can be large (several KB)
  2. The ID token may contain sensitive claims
  3. This differs from the stateless flow (line 939/1109) which correctly uses uuid::Uuid::new_v4()
🔒 Proposed fix for consistency and security
-    let login_session = format!("oauth-{}", id_token_str);
+    let login_session = format!("oauth-{}", uuid::Uuid::new_v4());
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/api/account_controller.rs` at line 678, The login session currently
embeds the full ID token via the login_session = format!("oauth-{}",
id_token_str) assignment; replace this so the session does not contain the raw
token (use a generated opaque identifier such as uuid::Uuid::new_v4() or another
secure random ID) and store the ID token only in a secure server-side store if
needed (refer to the login_session variable and id_token_str usage in
account_controller.rs and ensure any downstream code that expects the session
string is updated to accept the UUID/opaque ID instead of the token).
src/schema.rs (1)

669-680: ⚠️ Potential issue | 🔴 Critical

Fix incorrect type reference in schema.

The gender column references a non-existent type crate::models::person::Gender. The actual enum is named PersonGender (defined in src/models/person.rs line 18). Update the schema to use crate::models::person::PersonGender:

gender -> Nullable<crate::models::person::PersonGender>,

Additionally, src/schema.rs should be auto-generated by Diesel CLI rather than manually edited. Per coding guidelines: "Do not edit src/schema.rs manually; let Diesel manage schema generation from migrations." Regenerate this file using diesel print-schema with proper custom type configuration.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/schema.rs` around lines 669 - 680, The schema currently references a
non-existent enum type for the gender column; update the diesel table definition
so the gender column uses the correct Rust type
crate::models::person::PersonGender (the enum defined in src/models/person.rs)
and then regenerate src/schema.rs using the Diesel CLI (diesel print-schema)
rather than editing the file manually so Diesel writes the correct custom type
mapping for the gender (gender -> Nullable<crate::models::person::PersonGender>)
and any other schema changes.
src/middleware/auth_middleware.rs (1)

744-764: ⚠️ Potential issue | 🟡 Minor

Test assertions don't match current implementation.

The tests expect specific error messages ("Missing authorization header", "Invalid authorization scheme", "Empty token") but the current extract_token implementation at lines 566-590 returns different messages:

  • "Missing authorization header or auth_token cookie" (line 589)
  • No explicit "Invalid authorization scheme" - it falls through to the missing header case
  • No explicit "Empty token" - empty tokens in header fall through
💚 Fix test assertions to match implementation
     #[actix_rt::test]
     async fn functional_auth_extract_token_missing_header() {
         let req = test::TestRequest::get().uri("/test").to_srv_request();
         let result = FunctionalAuthenticationMiddleware::<()>::extract_token(&req);

         assert!(result.is_err());
-        assert_eq!(result.unwrap_err(), "Missing authorization header");
+        assert_eq!(result.unwrap_err(), "Missing authorization header or auth_token cookie");
     }

     #[actix_rt::test]
     async fn functional_auth_extract_token_invalid_scheme() {
         let req = test::TestRequest::get()
             .uri("/test")
             .insert_header((constants::AUTHORIZATION, "Basic abc123"))
             .to_srv_request();

         let result = FunctionalAuthenticationMiddleware::<()>::extract_token(&req);

         assert!(result.is_err());
-        assert_eq!(result.unwrap_err(), "Invalid authorization scheme");
+        assert_eq!(result.unwrap_err(), "Missing authorization header or auth_token cookie");
     }

     #[actix_rt::test]
     async fn functional_auth_extract_token_empty_token() {
         let req = test::TestRequest::get()
             .uri("/test")
             .insert_header((constants::AUTHORIZATION, "Bearer "))
             .to_srv_request();

         let result = FunctionalAuthenticationMiddleware::<()>::extract_token(&req);

         assert!(result.is_err());
-        assert_eq!(result.unwrap_err(), "Empty token");
+        assert_eq!(result.unwrap_err(), "Missing authorization header or auth_token cookie");
     }

Alternatively, update the extract_token implementation to return more specific error messages matching the test expectations.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/middleware/auth_middleware.rs` around lines 744 - 764, The tests expect
specific error strings but extract_token currently returns different/more
generic messages; update the FunctionalAuthenticationMiddleware::extract_token
implementation so it returns the exact messages the tests assert: return
Err("Missing authorization header") when neither the Authorization header nor
auth_token cookie exist, return Err("Invalid authorization scheme") when an
Authorization header exists but the scheme is not "Bearer", and return
Err("Empty token") when a Bearer header or auth_token cookie is present but the
token string is empty; ensure these checks are performed in that order and
return the exact quoted strings the tests use.
src/api/health_controller.rs (1)

359-368: ⚠️ Potential issue | 🟡 Minor

Overall health reports Unhealthy when tenant cache is empty.

The condition at line 363 uses map_or(false, |t| t.iter().all(...)), which returns false (Unhealthy) when tenants is None. This means:

  1. On first request before the cache is populated, health reports Unhealthy
  2. If tenant probe fails, health reports Unhealthy even if DB and cache are healthy

This could cause orchestrators to mark the service as unhealthy during startup or transient tenant-probe failures.

🛡️ Proposed fix: Treat empty/missing tenant data as healthy
     let overall_status = if db_status.is_healthy()
         && (cache_status.is_healthy() || !cache_status.is_available())
         && tenants
             .as_ref()
-            .map_or(false, |t| t.iter().all(|th| th.status.is_healthy()))
+            .map_or(true, |t| t.is_empty() || t.iter().all(|th| th.status.is_healthy()))
     {
         Status::Healthy
     } else {
         Status::Unhealthy
     };

This treats "no tenant data yet" as healthy, avoiding false negatives during startup. Alternatively, return a separate Status::Degraded or include tenant health as informational only.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/api/health_controller.rs` around lines 359 - 368, The overall health
computation sets overall_status to Unhealthy when tenants is None because it
uses tenants.as_ref().map_or(false, ...); change this to treat missing/empty
tenant data as healthy by using map_or(true, |t| t.iter().all(|th|
th.status.is_healthy())) (or otherwise consider marking tenant failures as
degraded/informational), so update the condition that combines db_status,
cache_status, and tenants (the overall_status assignment referencing db_status,
cache_status, tenants, and Status::Healthy/Status::Unhealthy) to accept tenants
== None as healthy.
🧹 Nitpick comments (7)
src/api/functional_operations_controller.rs (2)

334-334: Include the unknown operation type in the error message for consistency.

The error for unknown mutation types (line 405) includes the actual value, but this one doesn't. Including the value aids debugging.

♻️ Suggested improvement
-            _ => return Err(ServiceError::bad_request("Unknown operation type")),
+            _ => return Err(ServiceError::bad_request(&format!(
+                "Unknown operation type: '{}'",
+                operation.op_type
+            ))),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/api/functional_operations_controller.rs` at line 334, The match arm
currently returns a generic ServiceError::bad_request("Unknown operation type");
change the wildcard arm to bind the unmatched value (e.g., use `unknown` instead
of `_`) and include that bound value in the error message so the response
contains the actual operation type (e.g., build the message with the bound
variable and pass it into ServiceError::bad_request). Refer to the match
handling the operation type and ServiceError::bad_request to locate the change.

260-294: Inconsistent validation and overflow handling between chain operations and standalone endpoints.

The chain endpoint silently accepts unrecognized filter/map params with fallback behavior (_ => true for filter, _ => x for map), while the standalone demo_filter and demo_map endpoints strictly validate and reject invalid params. Additionally, the chain's map operations (e.g., x * 2) don't use saturating_* arithmetic like the standalone endpoint does (lines 196-200), which could cause different overflow behavior.

Consider aligning validation and arithmetic handling for consistency:

♻️ Suggested alignment for chain filter validation
             "filter" => {
+                match operation.param.as_str() {
+                    "even" | "odd" => {}
+                    _ => {
+                        return Err(ServiceError::bad_request(&format!(
+                            "Invalid filter condition '{}'. Supported in chain: even, odd",
+                            operation.param
+                        )))
+                    }
+                }
                 output_data = ChainBuilder::from_vec(current_data.clone())
                     .filter(|x| match operation.param.as_str() {
                         "even" => x % 2 == 0,
                         "odd" => x % 2 != 0,
-                        _ => true,
+                        _ => unreachable!(),
                     })
                     .collect();
♻️ Suggested alignment for chain map with saturating arithmetic
             "map" => {
+                match operation.param.as_str() {
+                    "double" | "square" | "increment" => {}
+                    _ => {
+                        return Err(ServiceError::bad_request(&format!(
+                            "Invalid map transformation '{}'. Supported in chain: double, square, increment",
+                            operation.param
+                        )))
+                    }
+                }
                 output_data = ChainBuilder::from_vec(current_data.clone())
                     .map(|x| match operation.param.as_str() {
-                        "double" => x * 2,
-                        "square" => x * x,
-                        "increment" => x + 1,
-                        _ => x,
+                        "double" => x.saturating_mul(2),
+                        "square" => x.saturating_mul(x),
+                        "increment" => x.saturating_add(1),
+                        _ => unreachable!(),
                     })
                     .collect();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/api/functional_operations_controller.rs` around lines 260 - 294, The
chain handling currently accepts unknown params and uses non-saturating
arithmetic; update the "filter" and "map" arms that use operation.op_type /
operation.param and ChainBuilder::from_vec so they mirror the standalone
demo_filter and demo_map validation: pre-validate operation.param and return a
validation error (HTTP 400) for unknown filter/map params instead of falling
back to defaults, and change the map closure to use saturating arithmetic (e.g.,
x.saturating_mul(2) for "double", x.saturating_mul(x) for "square",
x.saturating_add(1) for "increment") so Transformations (TransformationStep) and
chain behavior match demo_filter/demo_map.
src/config/runtime.rs (1)

134-212: Well-designed configuration module with security considerations.

Good practices observed:

  • Sensitive fields redacted in Debug implementations
  • Legacy environment variable compatibility via explicit overrides
  • Required field validation for production use

Consider case-insensitive comparison for app_env to avoid subtle configuration issues:

🔧 Optional improvement
-        if cfg.bootstrap.app_env != "dev" && cfg.keycloak.client_secret.trim().is_empty() {
+        if !cfg.bootstrap.app_env.eq_ignore_ascii_case("dev") && cfg.keycloak.client_secret.trim().is_empty() {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/config/runtime.rs` around lines 134 - 212, The check in
AppConfig::from_env uses a case-sensitive comparison of cfg.bootstrap.app_env to
"dev", which can fail if the env var is cased differently; update the condition
in from_env to use a case-insensitive comparison (e.g., trim and call
eq_ignore_ascii_case("dev")) on cfg.bootstrap.app_env so non-lowercase values
like "DEV" or "Dev" are treated as dev.
src/adapters/http/server.rs (1)

107-118: Consider distinguishing unexpected channel closure from intentional shutdown.

The shutdown watcher logs a warning when the channel closes unexpectedly (line 114) but still proceeds to stop the server gracefully. This is reasonable behavior, though you may want to propagate this as an error condition for observability purposes.

💡 Optional: Propagate channel closure as error
     tokio::spawn(async move {
         if *shutdown.borrow() {
             handle.stop(true).await;
             return;
         }

         if shutdown.changed().await.is_err() {
             tracing::warn!("http shutdown watcher channel closed before change notification");
+            // Could return an error here if you want to track this as a failure
         }

         handle.stop(true).await;
     });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/adapters/http/server.rs` around lines 107 - 118, The shutdown watcher
currently treats channel closure as a warning then proceeds to stop the server;
update the tokio::spawn block that uses shutdown, shutdown.changed(), and
handle.stop(true).await to distinguish intentional shutdown from unexpected
channel closure by treating the .changed().await.is_err() path as an error
condition: log with tracing::error (including context like "shutdown channel
closed unexpectedly") and surface the failure (e.g., by signaling an
observability/error metric or returning/propagating an Err from this task)
instead of only warning, while retaining the graceful handle.stop(true).await
call for cleanup.
src/api/health_controller.rs (1)

86-102: Background health probe task has no cancellation mechanism.

The spawned task at lines 92-100 loops indefinitely with no way to stop it. While this may be acceptable for a server that runs until process termination, consider:

  1. Accepting a shutdown signal to gracefully terminate the probe
  2. Making the 30-second interval configurable
💡 Optional: Add shutdown support
 fn ensure_tenant_health_probe_running(
     manager: TenantPoolManager,
     main_conn: web::Data<DatabasePool>,
+    shutdown: tokio::sync::watch::Receiver<bool>,
 ) {
     let cache = tenant_health_cache().clone();
     let _ = TENANT_HEALTH_POLL_HANDLE.get_or_init(|| {
         tokio::spawn(async move {
             loop {
+                if *shutdown.borrow() {
+                    break;
+                }
                 let tenant_healths =
                     collect_tenant_health(manager.clone(), main_conn.clone()).await;
                 let mut cached = cache.tenants.write().await;
                 *cached = tenant_healths;
-                tokio::time::sleep(Duration::from_secs(30)).await;
+                tokio::select! {
+                    _ = tokio::time::sleep(Duration::from_secs(30)) => {}
+                    _ = shutdown.changed() => break,
+                }
             }
         })
     });
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/api/health_controller.rs` around lines 86 - 102, The background probe
spawned by ensure_tenant_health_probe_running currently loops forever; modify it
to accept cancellation and a configurable interval by (1) adding a shutdown
trigger (e.g., pass in a tokio::sync::watch::Receiver,
tokio_util::sync::CancellationToken, or otherwise listen to an application
shutdown signal) and using tokio::select! inside the spawned task to break the
loop when the shutdown is signaled, and (2) replacing the hardcoded
Duration::from_secs(30) with a configurable interval value (from config or an
argument) so the sleep uses that variable; update TENANT_HEALTH_POLL_HANDLE
initialization to capture the shutdown token/receiver and the interval and
ensure collect_tenant_health(manager.clone(), main_conn.clone()) and writes to
tenant_health_cache() remain inside the cancellable loop.
src/state.rs (2)

27-27: Consider tokio::sync::RwLock for async-safe locking.

Using std::sync::RwLock in an async context can block the Tokio runtime executor threads when contended, potentially causing performance degradation or deadlocks under load. Since AppState is used in async handlers, prefer tokio::sync::RwLock which yields to the scheduler while waiting.

♻️ Proposed fix
-use std::sync::{Arc, RwLock};
+use std::sync::Arc;
+use tokio::sync::RwLock;

Note: This change requires updating sqlx_pool_for_tenant and insert_sqlx_pool to be async methods.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/state.rs` at line 27, Replace the current std::sync::RwLock used on the
AppState field sqlx_pools with tokio::sync::RwLock to avoid blocking the async
runtime, and update the associated accessors sqlx_pool_for_tenant and
insert_sqlx_pool to be async functions that .await on the RwLock reads/writes;
ensure you update their call sites to await them and adjust import paths (use
tokio::sync::RwLock) and signature changes accordingly.

151-175: Document that tenant_id must be server-derived, not client-supplied.

Per coding guidelines, tenant context should be derived from host/subdomain, mTLS, or server-side lookup. Since these methods accept tenant_id as a parameter, consider adding documentation or debug assertions to reinforce that callers must not pass client-supplied values directly.

📝 Suggested documentation
+    /// Returns the sqlx pool for the given tenant.
+    ///
+    /// # Security
+    /// The `tenant_id` must be derived from server-side context (host/subdomain,
+    /// mTLS, or server-side lookup). Never pass client-supplied tenant IDs.
     pub fn sqlx_pool_for_tenant(
         &self,
         tenant_id: &str,
     ) -> Result<Option<sqlx::PgPool>, std::sync::PoisonError<()>> {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/state.rs` around lines 151 - 175, Document and enforce that tenant_id is
server-derived (not client-supplied) for the methods sqlx_pool_for_tenant and
insert_sqlx_pool: add clear doc comments on both functions stating the tenant_id
must originate from server-controlled sources (host/subdomain, mTLS, server
lookup) and not from request payloads, and add a lightweight runtime check
(e.g., a debug_assert or call to a helper like
validate_server_derived_tenant_id) to fail fast in debug builds if a suspicious
value is passed; update any State/inner struct documentation to the same effect
so callers are clearly guided.
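A minimal sketch of the suggested guard; the helper name `validate_server_derived_tenant_id` comes from the prompt above, and the exact character policy is an assumption, not taken from the codebase:

```rust
/// # Security
/// `tenant_id` must be derived from server-controlled context
/// (host/subdomain, mTLS, or a server-side lookup), never from
/// request payloads.
///
/// Hypothetical helper; the allowed-character set is illustrative.
fn validate_server_derived_tenant_id(tenant_id: &str) -> bool {
    !tenant_id.is_empty()
        && tenant_id
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
}
```

Callers would wrap this in `debug_assert!(validate_server_derived_tenant_id(tenant_id))` so suspicious values fail fast in debug builds without adding release-path overhead.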
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/api/account_controller.rs`:
- Around line 219-261: The validate_redirect_uri function currently treats an
empty ALLOWED_REDIRECT_URIS allowlist as "deny all" and returns a ServiceError;
to implement fail-open behavior, remove the explicit error when
allowlist.is_empty() and only enforce membership if allowlist is non-empty: keep
using normalize_redirect_uri for the input, skip empty normalized inputs, and
only call allowlist.contains(&normalized_uri) when allowlist.len() > 0; update
the error paths to only trigger for non-empty allowlists and add a short comment
near ALLOWED_REDIRECT_URIS/allowlist explaining the chosen policy (or document
elsewhere if you intend to remain fail-closed).

In `@src/config/hybrid_manager.rs`:
- Around line 194-201: The scheme-less Postgres detection in the function that
returns TenantDbType::Postgres incorrectly rejects credentialized addresses
because of the "!url_lower.contains(\":\")" guard; update the logic that checks
url_lower (used around the TenantDbType::Postgres return) to instead verify
there is a "/" following the "@" (e.g., find the index of '@' and ensure a '/'
occurs after it) or simply remove the colon-negation check so
"user:pass@host/db" is accepted; adjust the conditional that currently reads
url_lower.contains("/") && !url_lower.contains(":") accordingly so
credential-containing URLs are classified as Postgres.

In `@src/main.rs`:
- Around line 38-57: The match on result currently always returns or bails (via
the join_res handling and the Err/Ok branches), making the subsequent while let
Some(pending) loop unreachable; either remove that loop or restructure so we
drain remaining tasks before returning: capture the failure (e.g., store an
anyhow::Error or a boolean) instead of immediately returning inside match
(references: tasks.join_next(), shutdown_tx.send(true), join_res, task_result),
then after the match run the while let Some(pending) { let task_result =
pending.context("server task join failure")?; if let Err(e) = task_result {
record/aggregate the error } } and finally return or bail with the
stored/aggregated error; alternatively, if draining is unnecessary simply delete
the unreachable while loop.

In `@src/models/person.rs`:
- Around line 168-179: The name and phone filters use people::name.like(...) and
people::phone.like(...) without escaping SQL LIKE metacharacters, unlike the
email filter which applies escaping; add the same escaping logic used for the
email filter: extract or create a small helper (e.g., escape_like or reuse the
email escape logic) that escapes %, _, and the escape char, run it on
filter.name and filter.phone before building the format!("%{}%", ...), and use
the escaped value in people::name.like(...) and people::phone.like(...) so all
three filters (email, name, phone) consistently escape user input.
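A std-only sketch of such a helper; the name `escape_like` follows the prompt, and `\` is assumed to be the escape character declared in the query:

```rust
/// Escapes SQL LIKE metacharacters (`%`, `_`) and the escape character
/// itself so user input can be embedded in a `%...%` pattern.
/// Assumes the query declares `ESCAPE '\'`.
fn escape_like(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        if matches!(c, '%' | '_' | '\\') {
            out.push('\\');
        }
        out.push(c);
    }
    out
}
```

The filters would then build patterns as `format!("%{}%", escape_like(&filter.name))` before passing them to `.like(...)`.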

In `@src/services/account_service.rs`:
- Around line 542-549: The call to user_ops::find_login_info_by_token inside
me() is converting any error into ServiceError::internal_server_error, which
turns missing/revoked sessions into 500s; change the error mapping so that a
"not found"/missing-login result maps to ServiceError::unauthorized (same
behavior as refresh()), while other unexpected DB errors still become
internal_server_error. Update the map_err for find_login_info_by_token to
inspect the returned error (or match its variant) and return
ServiceError::unauthorized for the not-found case and
ServiceError::internal_server_error(...) otherwise, referencing
find_login_info_by_token, me(), refresh(), and
ServiceError::internal_server_error in your fix.
- Around line 506-510: The doctest for the me() function was not updated after
its signature changed to accept a third parameter (keycloak_client:
&KeycloakClient); update the example to call me(&auth, &pool, &keycloak_client)
(or construct/pass a suitable mock/dummy KeycloakClient used in other tests) and
ensure any setup in the doctest creates the KeycloakClient value so the snippet
compiles when doctests are run; locate the example near the pub fn me(...)
declaration and adjust the call and setup accordingly.
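The intended error mapping can be sketched with stand-in types; `DbError` and the variant names below are illustrative, not the crate's actual types:

```rust
// Stand-in error types; the real code would match on the Diesel/repo error.
enum DbError {
    NotFound,
    Other(String),
}

enum ServiceError {
    Unauthorized,
    InternalServerError(String),
}

// Missing/revoked sessions become 401s (matching refresh()); anything
// else remains a 500.
fn map_login_lookup_err(e: DbError) -> ServiceError {
    match e {
        DbError::NotFound => ServiceError::Unauthorized,
        DbError::Other(msg) => ServiceError::InternalServerError(msg),
    }
}
```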

In `@src/state.rs`:
- Around line 45-51: The code creates two separate DB pools (main_pool via
db::init_db_pool and sqlx_pool) causing double connections; fix by either using
a single shared pool type across the app (replace usages of diesel/r2d2
main_pool or sqlx_pool so all DB access uses one pool instance), or if both
libraries must coexist during migration, compute and apply separate connection
caps (e.g., split cfg.db.max_connections between db::init_db_pool and the sqlx
pool creation) and centralize pool creation so init_db_pool and the function
constructing sqlx_pool reference the shared configuration/value; update callers
that reference main_pool or sqlx_pool accordingly to use the chosen shared pool
symbol.
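If both pools must coexist during the migration, the cap split might look like this; the even 50/50 split and the floor of one connection per pool are assumptions:

```rust
// Splits a single configured connection cap between the diesel and sqlx
// pools so the combined total roughly respects cfg.db.max_connections.
// The even split and the minimum of one connection per pool are assumed.
fn split_pool_caps(max_connections: u32) -> (u32, u32) {
    let diesel_cap = (max_connections / 2).max(1);
    let sqlx_cap = max_connections.saturating_sub(diesel_cap).max(1);
    (diesel_cap, sqlx_cap)
}
```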

In `@src/utils/keycloak.rs`:
- Around line 927-935: Normalize self.issuer_url by trimming any trailing '/'
(e.g., use self.issuer_url.trim_end_matches('/')) before building jwks_url, and
create the blocking HTTP client with explicit timeouts instead of
BlockingClient::new() (use
BlockingClient::builder().timeout(Duration::from_secs(...)).build() ) so the
.get(&jwks_url).send() call cannot block indefinitely; update the code around
jwks_url, BlockingClient::new(), and the .get(...).send() sequence to use the
normalized URL and a timed client and propagate/send a clear error if the
request times out or fails.
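The URL-normalization half of the fix is small; the `/protocol/openid-connect/certs` suffix below is Keycloak's usual JWKS path, assumed here for illustration:

```rust
// Normalizes the issuer URL before appending the JWKS path, so
// "https://kc.example/realms/x/" and "...realms/x" yield the same URL.
fn jwks_url(issuer_url: &str) -> String {
    format!(
        "{}/protocol/openid-connect/certs",
        issuer_url.trim_end_matches('/')
    )
}
```

The client half is the standard reqwest builder pattern, e.g. `reqwest::blocking::Client::builder().timeout(Duration::from_secs(10)).build()`, so the JWKS fetch cannot hang indefinitely.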

---

Outside diff comments:
In `@src/api/account_controller.rs`:
- Line 678: The login session currently embeds the full ID token via the
login_session = format!("oauth-{}", id_token_str) assignment; replace this so
the session does not contain the raw token (use a generated opaque identifier
such as uuid::Uuid::new_v4() or another secure random ID) and store the ID token
only in a secure server-side store if needed (refer to the login_session
variable and id_token_str usage in account_controller.rs and ensure any
downstream code that expects the session string is updated to accept the
UUID/opaque ID instead of the token).

In `@src/api/health_controller.rs`:
- Around line 359-368: The overall health computation sets overall_status to
Unhealthy when tenants is None because it uses tenants.as_ref().map_or(false,
...); change this to treat missing/empty tenant data as healthy by using
map_or(true, |t| t.iter().all(|th| th.status.is_healthy())) (or otherwise
consider marking tenant failures as degraded/informational), so update the
condition that combines db_status, cache_status, and tenants (the overall_status
assignment referencing db_status, cache_status, tenants, and
Status::Healthy/Status::Unhealthy) to accept tenants == None as healthy.
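The suggested condition, reduced to booleans (a sketch; the real code combines `Status` values):

```rust
// Missing tenant data (None) counts as healthy; only an actually
// unhealthy tenant degrades the overall status.
fn overall_healthy(db_ok: bool, cache_ok: bool, tenants: Option<&[bool]>) -> bool {
    db_ok && cache_ok && tenants.map_or(true, |t| t.iter().all(|&healthy| healthy))
}
```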

In `@src/config/hybrid_manager.rs`:
- Around line 81-102: The current code masks infra/config errors by converting
get_or_create_pool_functional failures into None and thereby returning
ServiceError::not_found; replace this chain with the manager's standard API so
errors propagate. Instead of calling
get_tenant_pool(...).or_else(....get_or_create_pool_functional...) and mapping
failures to None, call self.pg_tenant_manager.get_pool(&tenant_id)? (or
otherwise return the Err from get_or_create_pool_functional) so the original
ServiceError is propagated; remove the ok_or_else(...) not-found fallthrough and
use the Result-returning TenantPoolManager::get_pool/get_pool(&tenant_id) path
to obtain the pool.
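The shape of the fix, with stand-in types (string errors in place of `ServiceError`, a dummy pool type in place of the real pool):

```rust
// Stand-in for TenantPoolManager::get_pool: infra/config failures
// surface as Err rather than being flattened to "not found".
fn get_or_create_pool(tenant_id: &str) -> Result<String, String> {
    if tenant_id.is_empty() {
        return Err("invalid tenant configuration".to_string());
    }
    Ok(format!("pool:{tenant_id}"))
}

fn resolve_pool(tenant_id: &str) -> Result<String, String> {
    // `?` propagates the original error instead of ok_or_else(not_found).
    let pool = get_or_create_pool(tenant_id)?;
    Ok(pool)
}
```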

In `@src/middleware/auth_middleware.rs`:
- Around line 744-764: The tests expect specific error strings but extract_token
currently returns different/more generic messages; update the
FunctionalAuthenticationMiddleware::extract_token implementation so it returns
the exact messages the tests assert: return Err("Missing authorization header")
when neither the Authorization header nor auth_token cookie exist, return
Err("Invalid authorization scheme") when an Authorization header exists but the
scheme is not "Bearer", and return Err("Empty token") when a Bearer header or
auth_token cookie is present but the token string is empty; ensure these checks
are performed in that order and return the exact quoted strings the tests use.
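A sketch of the required ordering and exact messages, with the header and cookie inputs simplified to `Option<&str>`:

```rust
// Checks run in the order the tests assert: header presence, scheme,
// then emptiness; falls back to the auth_token cookie.
fn extract_token(
    auth_header: Option<&str>,
    cookie_token: Option<&str>,
) -> Result<String, &'static str> {
    if let Some(header) = auth_header {
        let token = header
            .strip_prefix("Bearer ")
            .ok_or("Invalid authorization scheme")?;
        if token.trim().is_empty() {
            return Err("Empty token");
        }
        return Ok(token.trim().to_string());
    }
    if let Some(token) = cookie_token {
        if token.trim().is_empty() {
            return Err("Empty token");
        }
        return Ok(token.trim().to_string());
    }
    Err("Missing authorization header")
}
```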

In `@src/schema.rs`:
- Around line 669-680: The schema currently references a non-existent enum type
for the gender column; update the diesel table definition so the gender column
uses the correct Rust type crate::models::person::PersonGender (the enum defined
in src/models/person.rs) and then regenerate src/schema.rs using the Diesel CLI
(diesel print-schema) rather than editing the file manually so Diesel writes the
correct custom type mapping for the gender (gender ->
Nullable<crate::models::person::PersonGender>) and any other schema changes.

---

Nitpick comments:
In `@src/adapters/http/server.rs`:
- Around line 107-118: The shutdown watcher currently treats channel closure as
a warning then proceeds to stop the server; update the tokio::spawn block that
uses shutdown, shutdown.changed(), and handle.stop(true).await to distinguish
intentional shutdown from unexpected channel closure by treating the
.changed().await.is_err() path as an error condition: log with tracing::error
(including context like "shutdown channel closed unexpectedly") and surface the
failure (e.g., by signaling an observability/error metric or
returning/propagating an Err from this task) instead of only warning, while
retaining the graceful handle.stop(true).await call for cleanup.

In `@src/api/functional_operations_controller.rs`:
- Line 334: The match arm currently returns a generic
ServiceError::bad_request("Unknown operation type"); change the wildcard arm to
bind the unmatched value (e.g., use `unknown` instead of `_`) and include that
bound value in the error message so the response contains the actual operation
type (e.g., build the message with the bound variable and pass it into
ServiceError::bad_request). Refer to the match handling the operation type and
ServiceError::bad_request to locate the change.
- Around line 260-294: The chain handling currently accepts unknown params and
uses non-saturating arithmetic; update the "filter" and "map" arms that use
operation.op_type / operation.param and ChainBuilder::from_vec so they mirror
the standalone demo_filter and demo_map validation: pre-validate operation.param
and return a validation error (HTTP 400) for unknown filter/map params instead
of falling back to defaults, and change the map closure to use saturating
arithmetic (e.g., x.saturating_mul(2) for "double", x.saturating_mul(x) for
"square", x.saturating_add(1) for "increment") so Transformations
(TransformationStep) and chain behavior match demo_filter/demo_map.
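Both points can be sketched in one helper: the wildcard arm binds the unknown value so it appears in the error, and the map operations use saturating arithmetic:

```rust
// Validates the param up front (unknown values become a 400-style error
// that names the offending input) and uses saturating arithmetic so the
// chained ops match demo_map's overflow behavior.
fn apply_map(param: &str, x: i64) -> Result<i64, String> {
    match param {
        "double" => Ok(x.saturating_mul(2)),
        "square" => Ok(x.saturating_mul(x)),
        "increment" => Ok(x.saturating_add(1)),
        unknown => Err(format!("Unknown map param: {unknown}")),
    }
}
```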

In `@src/api/health_controller.rs`:
- Around line 86-102: The background probe spawned by
ensure_tenant_health_probe_running currently loops forever; modify it to accept
cancellation and a configurable interval by (1) adding a shutdown trigger (e.g.,
pass in a tokio::sync::watch::Receiver, tokio_util::sync::CancellationToken, or
otherwise listen to an application shutdown signal) and using tokio::select!
inside the spawned task to break the loop when the shutdown is signaled, and (2)
replacing the hardcoded Duration::from_secs(30) with a configurable interval
value (from config or an argument) so the sleep uses that variable; update
TENANT_HEALTH_POLL_HANDLE initialization to capture the shutdown token/receiver
and the interval and ensure collect_tenant_health(manager.clone(),
main_conn.clone()) and writes to tenant_health_cache() remain inside the
cancellable loop.

In `@src/config/runtime.rs`:
- Around line 134-212: The check in AppConfig::from_env uses a case-sensitive
comparison of cfg.bootstrap.app_env to "dev", which can fail if the env var is
cased differently; update the condition in from_env to use a case-insensitive
comparison (e.g., trim and call eq_ignore_ascii_case("dev")) on
cfg.bootstrap.app_env so non-lowercase values like "DEV" or "Dev" are treated as
dev.
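The fix is a one-liner:

```rust
// Treats "dev", "DEV", " Dev " etc. uniformly; surrounding whitespace
// from the environment variable is trimmed before comparing.
fn is_dev_env(app_env: &str) -> bool {
    app_env.trim().eq_ignore_ascii_case("dev")
}
```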

In `@src/state.rs`:
- Line 27: Replace the current std::sync::RwLock used on the AppState field
sqlx_pools with tokio::sync::RwLock to avoid blocking the async runtime, and
update the associated accessors sqlx_pool_for_tenant and insert_sqlx_pool to be
async functions that .await on the RwLock reads/writes; ensure you update their
call sites to await them and adjust import paths (use tokio::sync::RwLock) and
signature changes accordingly.
- Around line 151-175: Document and enforce that tenant_id is server-derived
(not client-supplied) for the methods sqlx_pool_for_tenant and insert_sqlx_pool:
add clear doc comments on both functions stating the tenant_id must originate
from server-controlled sources (host/subdomain, mTLS, server lookup) and not
from request payloads, and add a lightweight runtime check (e.g., a debug_assert
or call to a helper like validate_server_derived_tenant_id) to fail fast in
debug builds if a suspicious value is passed; update any State/inner struct
documentation to the same effect so callers are clearly guided.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: abcac3b2-e13e-4f99-8b4e-290107db4bdf

📥 Commits

Reviewing files that changed from the base of the PR and between 72d58f6 and 7fce183.

📒 Files selected for processing (38)
  • Cargo.toml
  • build.rs
  • docs/rust_backend_refactor_audit.md
  • examples/pipeline_metrics_demo.rs
  • functional_lib/src/lazy_pipeline.rs
  • functional_lib/src/query_composition.rs
  • functional_lib/src/unified_pagination.rs
  • proto/nexus/core/core.proto
  • src/adapters/grpc/health_impl.rs
  • src/adapters/http/server.rs
  • src/api/account_controller.rs
  • src/api/address_book_controller.rs
  • src/api/controller_context.rs
  • src/api/crud_engine.rs
  • src/api/functional_operations_controller.rs
  • src/api/health_controller.rs
  • src/api/nfag_controller_v2.rs
  • src/api/tenant_controller.rs
  • src/app_error.rs
  • src/config/app.rs
  • src/config/cache.rs
  • src/config/db.rs
  • src/config/hybrid_manager.rs
  • src/config/oracle_db.rs
  • src/config/runtime.rs
  • src/lib.rs
  • src/main.rs
  • src/middleware/auth_middleware.rs
  • src/models/person.rs
  • src/models/user/operations.rs
  • src/models/user_token.rs
  • src/schema.rs
  • src/services/account_service.rs
  • src/services/address_book_service.rs
  • src/state.rs
  • src/utils/keycloak.rs
  • src/utils/token_utils.rs
  • src/utils/ws_logger.rs
✅ Files skipped from review due to trivial changes (8)
  • functional_lib/src/lazy_pipeline.rs
  • src/api/nfag_controller_v2.rs
  • functional_lib/src/query_composition.rs
  • src/models/user/operations.rs
  • functional_lib/src/unified_pagination.rs
  • src/config/app.rs
  • src/config/cache.rs
  • src/api/crud_engine.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/api/tenant_controller.rs
  • src/utils/ws_logger.rs

Comment on lines +219 to +261
fn validate_redirect_uri(redirect_uri: Option<&str>) -> Result<Option<String>, ServiceError> {
    let allowlist = match env::var("ALLOWED_REDIRECT_URIS") {
        Ok(values) => {
            let mut normalized = Vec::new();
            for value in values.split([',', ';']) {
                let normalized_uri = normalize_redirect_uri(value)?;
                if normalized_uri.is_empty() {
                    continue;
                }
                normalized.push(normalized_uri);
            }
            normalized
        }
        Err(_) => Vec::new(),
    }
    .into_iter()
    .filter(|uri| !uri.is_empty())
    .collect::<Vec<_>>();

    if let Some(uri) = redirect_uri {
        let normalized_uri = normalize_redirect_uri(uri)?;
        if normalized_uri.is_empty() {
            return Ok(None);
        }
        if allowlist.is_empty() {
            return Err(
                ServiceError::bad_request("redirect_uri is not allowed by server policy")
                    .with_tag("redirect_uri_not_allowed")
                    .with_detail(format!("Provided redirect_uri: {normalized_uri}")),
            );
        }
        if !allowlist.contains(&normalized_uri) {
            return Err(
                ServiceError::bad_request("redirect_uri is not allowed by server policy")
                    .with_tag("redirect_uri_not_allowed")
                    .with_detail(format!("Provided redirect_uri: {normalized_uri}")),
            );
        }
        return Ok(Some(normalized_uri));
    }

    Ok(None)
}

⚠️ Potential issue | 🟠 Major

Logic issue: Empty allowlist blocks all redirect URIs instead of allowing defaults.

When ALLOWED_REDIRECT_URIS is not set or empty, the function returns an error for any provided redirect_uri (lines 243-248). This prevents the stateless OAuth flow from working unless the environment variable is explicitly configured.

Consider whether the intended behavior should be:

  1. Fail-closed (current): Require explicit allowlist configuration
  2. Fail-open: Allow any URI when no allowlist is configured (less secure)
  3. Default allowlist: Use a sensible default when not configured

If fail-closed is intentional, document this requirement clearly. If not, the condition should be inverted:

🔧 Proposed fix if fail-open is intended
         if allowlist.is_empty() {
-            return Err(
-                ServiceError::bad_request("redirect_uri is not allowed by server policy")
-                    .with_tag("redirect_uri_not_allowed")
-                    .with_detail(format!("Provided redirect_uri: {normalized_uri}")),
-            );
+            // No allowlist configured - allow the redirect_uri
+            return Ok(Some(normalized_uri));
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/api/account_controller.rs` around lines 219 - 261, The
validate_redirect_uri function currently treats an empty ALLOWED_REDIRECT_URIS
allowlist as "deny all" and returns a ServiceError; to implement fail-open
behavior, remove the explicit error when allowlist.is_empty() and only enforce
membership if allowlist is non-empty: keep using normalize_redirect_uri for the
input, skip empty normalized inputs, and only call
allowlist.contains(&normalized_uri) when allowlist.len() > 0; update the error
paths to only trigger for non-empty allowlists and add a short comment near
ALLOWED_REDIRECT_URIS/allowlist explaining the chosen policy (or document
elsewhere if you intend to remain fail-closed).
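If fail-open is the chosen policy, the membership check reduces to the following sketch over pre-normalized strings:

```rust
// Empty allowlist = no policy configured = allow (fail-open);
// a non-empty allowlist enforces exact membership of the normalized URI.
fn redirect_allowed(allowlist: &[String], normalized_uri: &str) -> bool {
    allowlist.is_empty() || allowlist.iter().any(|uri| uri == normalized_uri)
}
```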

Comment on lines +194 to 201
if url_lower.contains("@localhost")
    || url_lower.contains("@127.0.0.1")
    || url_lower.contains("@::1")
{
    // Likely PostgreSQL connection string format: user:pass@host/db
    if url_lower.contains("/") && !url_lower.contains(":") {
        return Ok(TenantDbType::Postgres);
    }

⚠️ Potential issue | 🟠 Major

The scheme-less Postgres check rejects normal user:pass@host/db URLs.

The !url_lower.contains(":") guard rules out the common Postgres form where credentials contain :. Those URLs will fall through to the final bad-request branch even though this block claims to support scheme-less PostgreSQL strings.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/config/hybrid_manager.rs` around lines 194 - 201, The scheme-less
Postgres detection in the function that returns TenantDbType::Postgres
incorrectly rejects credentialized addresses because of the
"!url_lower.contains(\":\")" guard; update the logic that checks url_lower (used
around the TenantDbType::Postgres return) to instead verify there is a "/"
following the "@" (e.g., find the index of '@' and ensure a '/' occurs after it)
or simply remove the colon-negation check so "user:pass@host/db" is accepted;
adjust the conditional that currently reads url_lower.contains("/") &&
!url_lower.contains(":") accordingly so credential-containing URLs are
classified as Postgres.
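The corrected predicate, sketched as a standalone function (the name is illustrative):

```rust
// Accepts scheme-less Postgres strings like user:pass@host/db by
// requiring a '/' after the '@' instead of forbidding ':' anywhere.
fn looks_like_schemeless_postgres(url_lower: &str) -> bool {
    match url_lower.find('@') {
        Some(at) => url_lower[at + 1..].contains('/'),
        None => false,
    }
}
```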

Comment thread src/main.rs
Comment on lines +38 to +57
result = tasks.join_next() => {
    let _ = shutdown_tx.send(true);
    match result {
        Some(join_res) => {
            let task_result = join_res.context("server task join failure")?;
            match task_result {
                Ok(()) => anyhow::bail!("server task stopped unexpectedly"),
                Err(task_err) => return Err(anyhow::Error::from(task_err)),
            }
        }
        None => {
            anyhow::bail!("both server tasks stopped unexpectedly");
        }
    }

    while let Some(pending) = tasks.join_next().await {
        let task_result = pending.context("server task join failure")?;
        task_result.map_err(anyhow::Error::from)?;
    }
},

⚠️ Potential issue | 🟡 Minor

Unreachable code after early return.

Lines 53-56 (the while let Some(pending) loop) are unreachable. The match result block at lines 40-51 returns or bails in all branches (Ok(()) bails at line 44, Err returns at line 45, None bails at line 49), so control never reaches line 53.

🐛 Proposed fix: Remove unreachable code or restructure
         result = tasks.join_next() => {
             let _ = shutdown_tx.send(true);
             match result {
                 Some(join_res) => {
                     let task_result = join_res.context("server task join failure")?;
                     match task_result {
                         Ok(()) => anyhow::bail!("server task stopped unexpectedly"),
                         Err(task_err) => return Err(anyhow::Error::from(task_err)),
                     }
                 }
                 None => {
                     anyhow::bail!("both server tasks stopped unexpectedly");
                 }
             }
-
-            while let Some(pending) = tasks.join_next().await {
-                let task_result = pending.context("server task join failure")?;
-                task_result.map_err(anyhow::Error::from)?;
-            }
         }

If the intent is to drain remaining tasks after one fails, you'd need to restructure to avoid the early return/bail, for example by storing the error and draining before returning it.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
result = tasks.join_next() => {
    let _ = shutdown_tx.send(true);
    match result {
        Some(join_res) => {
            let task_result = join_res.context("server task join failure")?;
            match task_result {
                Ok(()) => anyhow::bail!("server task stopped unexpectedly"),
                Err(task_err) => return Err(anyhow::Error::from(task_err)),
            }
        }
        None => {
            anyhow::bail!("both server tasks stopped unexpectedly");
        }
    }
},
redirect_url: env::var("KEYCLOAK_REDIRECT_URL")
.unwrap_or_else(|_| "http://localhost:8000/api/callback".to_string()),
};
let keycloak_client = web::Data::new(
rcs::utils::keycloak::KeycloakClient::new(keycloak_config)
.await
.map_err(|e| {
io::Error::new(
io::ErrorKind::Other,
format!("Failed to initialize Keycloak client: {}", e),
)
})?,
);
// Load or generate session encryption key for OAuth session state storage
// In production, load from a secure key management system (e.g., AWS KMS, HashiCorp Vault)
let session_key = {
use base64::Engine;
match env::var("SESSION_ENCRYPTION_KEY") {
Ok(key_b64) => {
// Attempt to decode the base64-encoded 64-byte key
let key_vec = base64::engine::general_purpose::STANDARD
.decode(&key_b64)
.map_err(|e| {
io::Error::new(
io::ErrorKind::InvalidInput,
format!("Failed to decode SESSION_ENCRYPTION_KEY from base64: {}", e),
)
})?;
// Convert Vec<u8> to [u8; 64]
let key_bytes: [u8; 64] = key_vec.try_into().map_err(|_| {
io::Error::new(
io::ErrorKind::InvalidInput,
"SESSION_ENCRYPTION_KEY must be exactly 64 bytes when decoded from base64".to_string(),
)
})?;
actix_web::cookie::Key::from(&key_bytes)
}
Err(_) => {
// Fallback to generation only in development
let is_dev = env::var("APP_ENV").map(|v| v == "dev").unwrap_or(false);
if is_dev {
log::warn!("SESSION_ENCRYPTION_KEY not set. Generating random key for development. DO NOT use in production.");
actix_web::cookie::Key::generate()
} else {
return Err(io::Error::new(
io::ErrorKind::InvalidInput,
"SESSION_ENCRYPTION_KEY must be set in production. Set APP_ENV=dev to generate a key for development.",
));
));
}
}
}
};
let manager = config::db::TenantPoolManager::new(main_pool.clone());
// Hard-coded tenant for demonstration; in production, load from the database
manager
.add_tenant_pool("tenant1".to_string(), main_pool.clone())
.expect("Failed to add tenant pool");
// Clone log_broadcaster for use in main server
let main_broadcaster = log_broadcaster.clone();
// Create and share a PureFunctionRegistry to encourage functional usage across middleware
#[cfg(feature = "functional")]
let pure_registry =
Arc::new(rcs::functional::pure_function_registry::PureFunctionRegistry::new());
// Start the main HTTP server
HttpServer::new(move || {
// Use shared CORS origin configuration from middleware::ws_security
let allowed_origins = rcs::middleware::ws_security::get_allowed_origins();
let mut cors_builder = Cors::default();
// Apply allowed origins to CORS builder
for origin in allowed_origins {
cors_builder = cors_builder.allowed_origin(&origin);
}
// Add common methods and headers
cors_builder = cors_builder
.allowed_methods(vec![
http::Method::GET,
http::Method::POST,
http::Method::PUT,
http::Method::DELETE,
http::Method::OPTIONS,
])
.allowed_headers(vec![
http::header::AUTHORIZATION,
http::header::ACCEPT,
http::header::CONTENT_TYPE,
http::header::HeaderName::from_static("x-tenant-id"),
])
.expose_headers(vec![
http::header::AUTHORIZATION,
http::header::CONTENT_TYPE,
http::header::HeaderName::from_static("x-tenant-id"),
])
.max_age(3600);
// Check the allow-credentials flag
let cors = if env::var("CORS_ALLOW_CREDENTIALS")
.map(|v| v == "true")
.unwrap_or(false)
{
cors_builder.supports_credentials()
} else {
cors_builder
};
let app = App::new()
// Configure secure session middleware for OAuth state storage
// Uses HttpOnly, Secure (HTTPS only in production), SameSite=Strict cookies
.wrap(
SessionMiddleware::builder(CookieSessionStore::default(), session_key.clone())
.session_lifecycle(
PersistentSession::default()
.session_ttl(Duration::seconds(600)) // 10 minute OAuth session TTL
)
.cookie_name("oauth_session".to_string())
.cookie_path("/api".to_string()) // Covers /api/auth/* and /api/callback
.cookie_http_only(true) // Prevent JavaScript access
.cookie_same_site(SameSite::Strict) // Prevent CSRF
// In production, set to true and ensure HTTPS
.cookie_secure(env::var("SESSION_COOKIE_SECURE")
.map(|v| v == "true")
.unwrap_or(false))
.build()
)
.app_data(web::Data::new(manager.clone()))
.app_data(web::Data::new(main_pool.clone()))
.app_data(web::Data::new(redis_client.clone()))
.app_data(web::Data::new(main_broadcaster.clone()))
.app_data(keycloak_client.clone())
.wrap(tracing_actix_web::TracingLogger::default());
#[cfg(feature = "functional")]
let app = app.wrap(rcs::middleware::auth_middleware::functional_auth::FunctionalAuthentication::with_registry(pure_registry.clone()));
// CORS must be applied LAST so it wraps all other middleware and executes FIRST
// This ensures CORS headers are added to ALL responses including auth errors (401/403)
app.configure(config::app::config_services)
.wrap(cors)
})
.bind(&app_url)?
.run()
.await
}
#[cfg(test)]
mod tests {
use std::panic::{catch_unwind, AssertUnwindSafe};
use actix_cors::Cors;
use actix_web::dev::Service;
use actix_web::web;
use actix_web::{http, App, HttpServer};
use futures::FutureExt;
use testcontainers::clients;
use testcontainers::images::postgres::Postgres;
use testcontainers::Container;
use rcs::config;
use rcs::utils::ws_logger::{init_websocket_logging, LogBroadcaster};
fn try_run_postgres<'a>(docker: &'a clients::Cli) -> Option<Container<'a, Postgres>> {
catch_unwind(AssertUnwindSafe(|| docker.run(Postgres::default()))).ok()
}
#[actix_web::test]
#[cfg(feature = "functional")]
async fn test_startup_ok() {
use std::sync::Arc;
let docker = clients::Cli::default();
let postgres = match try_run_postgres(&docker) {
Some(container) => container,
None => {
eprintln!("Skipping test_startup_ok because Docker is unavailable");
return;
}
};
let pool = config::db::init_db_pool(
format!(
"postgres://postgres:postgres@127.0.0.1:{}/postgres",
postgres.get_host_port_ipv4(5432)
)
.as_str(),
);
config::db::run_migration(&mut pool.get().unwrap());
// Initialize logging for tests
let log_broadcaster = LogBroadcaster::new(100);
init_websocket_logging(log_broadcaster.clone())
.expect("failed to initialize websocket logging in test_startup_ok");
let test_registry =
Arc::new(rcs::functional::pure_function_registry::PureFunctionRegistry::new());
HttpServer::new(move || {
App::new()
.wrap(
Cors::default() // sends access-control-allow-origin: * by default
// .allowed_origin("http://127.0.0.1:8080")
.send_wildcard()
.allowed_methods(vec!["GET", "POST", "PUT", "DELETE"])
.allowed_headers(vec![http::header::AUTHORIZATION, http::header::ACCEPT])
.allowed_header(http::header::CONTENT_TYPE)
.max_age(3600),
)
.app_data(web::Data::new(pool.clone()))
.app_data(web::Data::new(log_broadcaster.clone()))
.wrap(tracing_actix_web::TracingLogger::default())
.wrap(rcs::middleware::auth_middleware::functional_auth::FunctionalAuthentication::with_registry(test_registry.clone()))
.wrap_fn(|req, srv| srv.call(req).map(|res| res))
.configure(config::app::config_services)
})
.bind("localhost:8000".to_string())
.unwrap()
.run();
// Test passes if server starts without panicking - HTTP binding success is the verification
}
/// Starts an Actix HTTP server configured with CORS and a database pool to verify it can start without authentication middleware.
///
/// # Examples
///
/// ```
/// // This test starts a PostgreSQL test container, runs DB migrations,
/// // and launches an Actix server bound to localhost:8001 with permissive CORS
/// // and no authentication middleware to ensure startup succeeds.
/// #[actix_web::test]
/// async fn test_startup_without_auth_middleware_ok() {
/// // setup test Postgres, pool, migrations, and start server...
/// }
/// ```
#[actix_web::test]
async fn test_startup_without_auth_middleware_ok() {
let docker = clients::Cli::default();
let postgres = match try_run_postgres(&docker) {
Some(container) => container,
None => {
eprintln!(
"Skipping test_startup_without_auth_middleware_ok because Docker is unavailable"
);
return;
}
};
let pool = config::db::init_db_pool(
format!(
"postgres://postgres:postgres@127.0.0.1:{}/postgres",
postgres.get_host_port_ipv4(5432)
)
.as_str(),
);
config::db::run_migration(&mut pool.get().unwrap());
// Initialize logging for tests
let log_broadcaster = LogBroadcaster::new(100);
init_websocket_logging(log_broadcaster.clone()).expect(
"failed to initialize websocket logging in test_startup_without_auth_middleware_ok",
);
let _ = HttpServer::new(move || {
App::new()
.wrap(
Cors::default() // sends access-control-allow-origin: * by default
// .allowed_origin("http://127.0.0.1:8080")
.send_wildcard()
.allowed_methods(vec!["GET", "POST", "PUT", "DELETE"])
.allowed_headers(vec![http::header::AUTHORIZATION, http::header::ACCEPT])
.allowed_header(http::header::CONTENT_TYPE)
.max_age(3600),
)
.app_data(web::Data::new(pool.clone()))
.app_data(web::Data::new(log_broadcaster.clone()))
.wrap(tracing_actix_web::TracingLogger::default())
.wrap_fn(|req, srv| srv.call(req).map(|res| res))
.configure(config::app::config_services)
})
.bind("localhost:8001".to_string())
.unwrap()
.run()
.await;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main.rs` around lines 38 - 57, The match on result currently always
returns or bails (via the join_res handling and the Err/Ok branches), making the
subsequent while let Some(pending) loop unreachable; either remove that loop or
restructure so we drain remaining tasks before returning: capture the failure
(e.g., store an anyhow::Error or a boolean) instead of immediately returning
inside match (references: tasks.join_next(), shutdown_tx.send(true), join_res,
task_result), then after the match run the while let Some(pending) { let
task_result = pending.context("server task join failure")?; if let Err(e) =
task_result { record/aggregate the error } } and finally return or bail with the
stored/aggregated error; alternatively, if draining is unnecessary simply delete
the unreachable while loop.

Comment thread src/models/person.rs
Comment on lines +168 to +179
if let Some(name) = filter.name.as_ref() {
predicate_results.push(Ok(
Box::new(people::name.like(format!("%{}%", name)).nullable())
as Box<
dyn BoxableExpression<
people::table,
diesel::pg::Pg,
SqlType = diesel::sql_types::Nullable<diesel::sql_types::Bool>,
>,
>,
));
}

⚠️ Potential issue | 🟡 Minor

Inconsistent LIKE escaping across filter fields.

The email filter properly escapes SQL LIKE wildcards (%, _, \) at lines 152-155, but name and phone filters do not apply the same escaping. This could cause unexpected behavior if users search for names containing % or _.

🛡️ Proposed fix: Apply consistent escaping
         if let Some(name) = filter.name.as_ref() {
-            predicate_results.push(Ok(
-                Box::new(people::name.like(format!("%{}%", name)).nullable())
+            let escaped_name = name
+                .replace('\\', "\\\\")
+                .replace('%', "\\%")
+                .replace('_', "\\_");
+            predicate_results.push(Ok(
+                Box::new(people::name.like(format!("%{}%", escaped_name)).nullable())
                     as Box<
                         dyn BoxableExpression<
                             people::table,
                             diesel::pg::Pg,
                             SqlType = diesel::sql_types::Nullable<diesel::sql_types::Bool>,
                         >,
                     >,
             ));
         }

         if let Some(phone) = filter.phone.as_ref() {
-            predicate_results.push(Ok(Box::new(
-                people::phone.like(format!("%{}%", phone)).nullable(),
-            )
+            let escaped_phone = phone
+                .replace('\\', "\\\\")
+                .replace('%', "\\%")
+                .replace('_', "\\_");
+            predicate_results.push(Ok(Box::new(
+                people::phone.like(format!("%{}%", escaped_phone)).nullable(),
+            )

Also applies to: 181-192
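The escaping in the diff is repeated per field; it can be factored into one helper shared by the email, name, and phone filters. A standalone sketch (the helper name `escape_like` is illustrative, and it assumes Postgres's default `\` escape character):

```rust
/// Escape SQL LIKE metacharacters so user input matches literally.
/// Backslash must be escaped first, otherwise it would re-escape
/// the replacements added for `%` and `_`.
fn escape_like(input: &str) -> String {
    input
        .replace('\\', "\\\\")
        .replace('%', "\\%")
        .replace('_', "\\_")
}

fn main() {
    // "100%" should match a literal percent sign, not act as a wildcard.
    assert_eq!(escape_like("100%"), "100\\%");
    assert_eq!(escape_like("a_b"), "a\\_b");
    assert_eq!(escape_like("c\\d"), "c\\\\d");
    // The escaped value is then wrapped for the contains-style search.
    let pattern = format!("%{}%", escape_like("100%"));
    assert_eq!(pattern, "%100\\%%");
}
```

Each filter would then call `people::name.like(format!("%{}%", escape_like(name)))`, keeping the three fields consistent.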

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/models/person.rs` around lines 168 - 179, The name and phone filters use
people::name.like(...) and people::phone.like(...) without escaping SQL LIKE
metacharacters, unlike the email filter which applies escaping; add the same
escaping logic used for the email filter: extract or create a small helper
(e.g., escape_like or reuse the email escape logic) that escapes %, _, and the
escape char, run it on filter.name and filter.phone before building the
format!("%{}%", ...), and use the escaped value in people::name.like(...) and
people::phone.like(...) so all three filters (email, name, phone) consistently
escape user input.

Comment on lines +506 to +510
pub fn me(
authen_header: &HeaderValue,
pool: &Pool,
keycloak_client: &KeycloakClient,
) -> Result<LoginInfoDTO, ServiceError> {

⚠️ Potential issue | 🟡 Minor

Update the me() doctest for the new signature.

The example above still calls me(&auth, &pool). After adding keycloak_client, that snippet no longer compiles if doctests are enabled.

📘 Minimal doc fix
-/// let _ = me(&auth, &pool);
+/// let keycloak_client = unimplemented!();
+/// let _ = me(&auth, &pool, &keycloak_client);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/services/account_service.rs` around lines 506 - 510, The doctest for the
me() function was not updated after its signature changed to accept a third
parameter (keycloak_client: &KeycloakClient); update the example to call
me(&auth, &pool, &keycloak_client) (or construct/pass a suitable mock/dummy
KeycloakClient used in other tests) and ensure any setup in the doctest creates
the KeycloakClient value so the snippet compiles when doctests are run; locate
the example near the pub fn me(...) declaration and adjust the call and setup
accordingly.

Comment on lines +542 to +549
user_ops::find_login_info_by_token(
&verified_token_data.claims,
conn,
Some(keycloak_client),
)
.map_err(|_| {
ServiceError::internal_server_error("Database error".to_string())
})

⚠️ Potential issue | 🟠 Major

Don't turn missing login info into a 500.

refresh() maps find_login_info_by_token misses to unauthorized, but me() now converts the same condition into internal_server_error. A revoked or unknown session should stay a 401; otherwise normal auth failures start surfacing as 500s.

🔐 Minimal fix
-                        .map_err(|_| {
-                            ServiceError::internal_server_error("Database error".to_string())
-                        })
+                        .map_err(|_| {
+                            ServiceError::unauthorized(
+                                constants::MESSAGE_PROCESS_TOKEN_ERROR.to_string(),
+                            )
+                        })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/services/account_service.rs` around lines 542 - 549, The call to
user_ops::find_login_info_by_token inside me() is converting any error into
ServiceError::internal_server_error, which turns missing/revoked sessions into
500s; change the error mapping so that a "not found"/missing-login result maps
to ServiceError::unauthorized (same behavior as refresh()), while other
unexpected DB errors still become internal_server_error. Update the map_err for
find_login_info_by_token to inspect the returned error (or match its variant)
and return ServiceError::unauthorized for the not-found case and
ServiceError::internal_server_error(...) otherwise, referencing
find_login_info_by_token, me(), refresh(), and
ServiceError::internal_server_error in your fix.
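The variant-based mapping the prompt describes can be sketched with stand-in types; the real `ServiceError` constructors and repository error live in the crate, so these enums are illustrative only:

```rust
// Illustrative stand-ins for the repository error and ServiceError.
#[derive(Debug, PartialEq)]
enum RepoError {
    NotFound,
    Db(String),
}

#[derive(Debug, PartialEq)]
enum ServiceError {
    Unauthorized(String),
    InternalServerError(String),
}

fn map_login_lookup_err(e: RepoError) -> ServiceError {
    match e {
        // A missing/revoked session is an auth failure (401),
        // not a server fault (500).
        RepoError::NotFound => ServiceError::Unauthorized("invalid token".into()),
        // Genuine database failures still surface as 500s.
        RepoError::Db(msg) => ServiceError::InternalServerError(msg),
    }
}

fn main() {
    assert_eq!(
        map_login_lookup_err(RepoError::NotFound),
        ServiceError::Unauthorized("invalid token".into())
    );
    assert_eq!(
        map_login_lookup_err(RepoError::Db("pool timeout".into())),
        ServiceError::InternalServerError("pool timeout".into())
    );
}
```

With this shape, `me()` and `refresh()` share one mapping and normal auth failures stay 401s.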

Comment thread src/state.rs
Comment on lines +45 to +51
let main_pool = db::init_db_pool(&cfg.db.url);
{
let mut conn = main_pool
.get()
.context("failed to get db connection for migrations")?;
db::run_migration(&mut conn).context("database migration failed")?;
}

⚠️ Potential issue | 🟠 Major

Dual pool creation to the same database doubles connection usage.

Two separate connection pools are created for the same database:

  1. main_pool (diesel/r2d2) at line 45
  2. sqlx_pool at lines 96-104

This effectively doubles the connection footprint against the same PostgreSQL instance. If cfg.db.max_connections is 10, the application may hold up to 20 connections, potentially exhausting database connection limits or causing resource contention.

Consider either:

  1. Migrating fully to sqlx and deprecating the diesel pool
  2. Sharing a single pool type across the application
  3. If both are required during migration, explicitly configuring separate connection limits for each (e.g., max_connections / 2 each)

Also applies to: 96-107
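Option 3 (splitting the budget) can be sketched as a small pure function; the function name and the at-least-one floor are illustrative choices, not the crate's API:

```rust
// Split a total connection budget between the two pools
// (diesel/r2d2 and sqlx), guaranteeing each pool at least
// one connection so neither is starved.
fn split_connection_budget(max_connections: u32) -> (u32, u32) {
    let diesel_cap = (max_connections / 2).max(1);
    let sqlx_cap = max_connections.saturating_sub(diesel_cap).max(1);
    (diesel_cap, sqlx_cap)
}

fn main() {
    // A budget of 10 becomes 5 + 5 instead of 10 + 10 = 20 connections.
    assert_eq!(split_connection_budget(10), (5, 5));
    // Odd budgets still sum to the original total.
    assert_eq!(split_connection_budget(7), (3, 4));
    // A degenerate budget of 1 deliberately overshoots (1 + 1)
    // rather than leaving a pool with zero connections.
    assert_eq!(split_connection_budget(1), (1, 1));
}
```

The two caps would then feed `db::init_db_pool` and the sqlx pool builder from the same `cfg.db` value.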

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/state.rs` around lines 45 - 51, The code creates two separate DB pools
(main_pool via db::init_db_pool and sqlx_pool) causing double connections; fix
by either using a single shared pool type across the app (replace usages of
diesel/r2d2 main_pool or sqlx_pool so all DB access uses one pool instance), or
if both libraries must coexist during migration, compute and apply separate
connection caps (e.g., split cfg.db.max_connections between db::init_db_pool and
the sqlx pool creation) and centralize pool creation so init_db_pool and the
function constructing sqlx_pool reference the shared configuration/value; update
callers that reference main_pool or sqlx_pool accordingly to use the chosen
shared pool symbol.

Comment thread src/utils/keycloak.rs
Comment on lines 927 to +935
let jwks_url = format!("{}/protocol/openid-connect/certs", self.issuer_url);

let client = BlockingClient::new();
let jwks: Jwks = client.get(&jwks_url).send().map_err(|e| Box::new(e))?.json().map_err(|e| Box::new(e))?;

let jwk = jwks.keys.iter().find(|j| j.kid == kid).ok_or_else(|| Box::new(std::io::Error::new(std::io::ErrorKind::NotFound, format!("No JWK found for key ID: {}", kid))) as Box<dyn std::error::Error + Send + Sync>)?;
let jwks: Jwks = client
.get(&jwks_url)
.send()
.map_err(|e| Box::new(e))?
.json()
.map_err(|e| Box::new(e))?;

⚠️ Potential issue | 🟠 Major


Add timeouts and URL normalization to the blocking JWKS fetch.

The sync validation path lacks timeouts on BlockingClient::new().get(...).send(), which can block indefinitely on slow Keycloak responses, pinning a request thread. Additionally, it skips the issuer_url trailing-slash normalization that the async version applies via trim_end_matches('/'), creating inconsistency and potential URL formatting issues.

Suggested fix
-        let jwks_url = format!("{}/protocol/openid-connect/certs", self.issuer_url);
-
-        let client = BlockingClient::new();
+        let jwks_url = format!(
+            "{}/protocol/openid-connect/certs",
+            self.issuer_url.trim_end_matches('/')
+        );
+
+        let client = BlockingClient::builder()
+            .timeout(KEYCLOAK_TIMEOUT)
+            .connect_timeout(Duration::from_secs(5))
+            .build()
+            .map_err(|e| Box::new(e) as Box<dyn std::error::Error + Send + Sync>)?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/keycloak.rs` around lines 927 - 935, Normalize self.issuer_url by
trimming any trailing '/' (e.g., use self.issuer_url.trim_end_matches('/'))
before building jwks_url, and create the blocking HTTP client with explicit
timeouts instead of BlockingClient::new() (use
BlockingClient::builder().timeout(Duration::from_secs(...)).build() ) so the
.get(&jwks_url).send() call cannot block indefinitely; update the code around
jwks_url, BlockingClient::new(), and the .get(...).send() sequence to use the
normalized URL and a timed client and propagate/send a clear error if the
request times out or fails.
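The trailing-slash normalization half of the fix can be exercised on its own; a minimal sketch (the example issuer URL is made up):

```rust
// Normalize the issuer URL before appending the JWKS path, so a
// trailing slash in configuration does not produce "//" in the URL.
fn jwks_url(issuer_url: &str) -> String {
    format!(
        "{}/protocol/openid-connect/certs",
        issuer_url.trim_end_matches('/')
    )
}

fn main() {
    let with_slash = jwks_url("https://kc.example.com/realms/demo/");
    let without = jwks_url("https://kc.example.com/realms/demo");
    // Both spellings of the issuer produce the same well-formed URL.
    assert_eq!(with_slash, without);
    assert_eq!(
        with_slash,
        "https://kc.example.com/realms/demo/protocol/openid-connect/certs"
    );
}
```

The timeout half is just client construction: build once with `BlockingClient::builder().timeout(...)` as in the suggested fix, rather than `BlockingClient::new()`, so `.send()` cannot block a request thread indefinitely.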
