From f4c67c466c2a3dade3cadfcc4340dcc0c6e9ffe7 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Sun, 19 Apr 2026 08:45:36 +0000 Subject: [PATCH 01/12] docs: add multi-language client specification document Agent-Logs-Url: https://github.com/Open-J-Proxy/ojp/sessions/1bb8d149-8b3d-47d0-9f4c-763af9cd2c78 Co-authored-by: rrobetti <7221783+rrobetti@users.noreply.github.com> --- .../multi-language-client-spec/CLIENT_SPEC.md | 1070 +++++++++++++++++ 1 file changed, 1070 insertions(+) create mode 100644 documents/multi-language-client-spec/CLIENT_SPEC.md diff --git a/documents/multi-language-client-spec/CLIENT_SPEC.md b/documents/multi-language-client-spec/CLIENT_SPEC.md new file mode 100644 index 000000000..3ebe59052 --- /dev/null +++ b/documents/multi-language-client-spec/CLIENT_SPEC.md @@ -0,0 +1,1070 @@ +# OJP Multi-Language Client Specification + +> **Status:** Draft — April 2026 +> **Scope:** This document defines every aspect that a new OJP client library (in any language other than Java) must implement in order to be fully compatible with an OJP server. It is written language-agnostically; where Java-specific concepts appear they are labelled as the reference implementation only. +> **Reference implementation:** `ojp-jdbc-driver` module. +> **Protocol source of truth:** `ojp-grpc-commons/src/main/proto/StatementService.proto` and `echo.proto`. + +--- + +## Table of Contents + +1. [gRPC Interface Implementation](#1-grpc-interface-implementation) +2. [URL Parsing](#2-url-parsing) +3. [Client Identity](#3-client-identity) +4. [Connection Establishment and connHash Caching](#4-connection-establishment-and-connhash-caching) +5. [Session Management](#5-session-management) +6. [Session Stickiness](#6-session-stickiness) +7. [Load Balancing](#7-load-balancing) +8. [Failover](#8-failover) +9. [Health Checking](#9-health-checking) +10. 
[Connection Redistribution on Recovery](#10-connection-redistribution-on-recovery) +11. [Cluster Health Propagation](#11-cluster-health-propagation) +12. [Transaction Management (non-XA)](#12-transaction-management-non-xa) +13. [Savepoints](#13-savepoints) +14. [XA / Distributed Transactions](#14-xa--distributed-transactions) +15. [Statement Execution](#15-statement-execution) +16. [Parameter Type Mapping](#16-parameter-type-mapping) +17. [Temporal Type Handling](#17-temporal-type-handling) +18. [Result Set and Streaming](#18-result-set-and-streaming) +19. [LOB (Large Object) Handling](#19-lob-large-object-handling) +20. [CallResource Protocol](#20-callresource-protocol) +21. [Error and Exception Mapping](#21-error-and-exception-mapping) +22. [Configuration System](#22-configuration-system) +23. [Query Result Caching](#23-query-result-caching) +24. [Security / Transport](#24-security--transport) +25. [DataSource / Integration API](#25-datasource--integration-api) +26. [Testing Coverage](#26-testing-coverage) + +--- + +## 1. gRPC Interface Implementation + +### What to implement + +The client must implement stubs for every RPC in `StatementService` and `EchoService`. 
+ +**`StatementService` RPCs:** + +| RPC | Type | Purpose | +|---|---|---| +| `connect` | unary | Open a logical connection and receive `SessionInfo` | +| `executeUpdate` | unary | DML (INSERT / UPDATE / DELETE / DDL) | +| `executeQuery` | server-streaming | SELECT — returns a stream of `OpResult` blocks | +| `fetchNextRows` | unary | Pull the next page of rows for an open result set | +| `createLob` | client-streaming | Upload LOB data to the server in chunks | +| `readLob` | server-streaming | Download LOB data from the server | +| `terminateSession` | unary | Release server-side session state | +| `startTransaction` | unary | Begin an explicit transaction | +| `commitTransaction` | unary | Commit the active transaction | +| `rollbackTransaction` | unary | Roll back the active transaction | +| `callResource` | unary | Generic remote call for metadata, cursor navigation, savepoints | +| `xaStart` | unary | Begin an XA transaction branch | +| `xaEnd` | unary | End an XA transaction branch | +| `xaPrepare` | unary | Prepare an XA transaction branch | +| `xaCommit` | unary | Commit an XA transaction branch | +| `xaRollback` | unary | Roll back an XA transaction branch | +| `xaRecover` | unary | List XIDs of prepared transactions | +| `xaForget` | unary | Forget a heuristically completed transaction | +| `xaSetTransactionTimeout` | unary | Set XA timeout in seconds | +| `xaGetTransactionTimeout` | unary | Get current XA timeout | +| `xaIsSameRM` | unary | Check whether two sessions share a resource manager | + +**`EchoService` RPC:** + +| RPC | Type | Purpose | +|---|---|---| +| `Echo` | unary | Lightweight heartbeat / connectivity check | + +### gRPC channel lifecycle + +- One `ManagedChannel` (or equivalent) per server endpoint. Channels are long-lived and shared across all logical connections to that endpoint. +- Channels are created lazily on first connection to an endpoint, or eagerly during initialisation when endpoints are known upfront. 
+- Use DNS-prefixed targets (`dns:///host:port`) where the gRPC runtime supports it, to allow future SRV-based discovery. +- Blocking stubs are used for synchronous operations; async stubs are required for client-streaming (`createLob`) and server-streaming (`executeQuery`, `readLob`) RPCs. +- Channel shutdown must be graceful (allow in-flight calls to complete) and must be triggered on client shutdown. + +--- + +## 2. URL Parsing + +### URL format + +``` +jdbc:ojp[host:port(datasourceName),host2:port2(datasourceName2)]_actual-db-url +``` + +All four parts are mandatory: +- `jdbc:ojp` — fixed prefix that identifies the driver. +- `[...]` — bracket-enclosed, comma-separated list of OJP server endpoints. Each endpoint is `host:port`, optionally followed by a datasource name in parentheses: `host:port(dsName)`. +- `_` — separator between the OJP section and the actual database URL. +- `actual-db-url` — the full database URL that the server will use to connect to the real database (e.g., `postgresql://localhost:5432/mydb`). + +### Parsing rules + +1. **Extract the endpoint list** by capturing everything between `[` and `]`. +2. **Split by comma** to enumerate endpoints. Trim whitespace around each item. +3. **For each endpoint**, split on `:` to obtain host and port. If a `(dsName)` suffix is present, strip it and record the datasource name; default is `"default"`. +4. **Validate** that host is non-empty, port is an integer in `[1, 65535]`, and at least one endpoint is present. +5. **Extract the actual database URL** by removing everything up to and including the first `]_` pattern. +6. **Produce a single-endpoint URL** for use in `ConnectionDetails.url` by replacing `[host1:port1,host2:port2]` with `[chosen_host:chosen_port]` (the endpoint actually selected for the first connection). This stripped URL is forwarded to the server; the server never sees the multinode list. 
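The parsing rules above can be sketched as follows. This is a minimal illustration in Python, not part of any shipped OJP client; the function name `parse_ojp_url` is hypothetical.

```python
import re

# Hypothetical helper illustrating the URL parsing rules of this section.
# Raises ValueError on malformed URLs.
def parse_ojp_url(url: str):
    # Rule 1 and 5: capture the bracketed endpoint list and the actual DB URL
    # (everything after the first "]_" pattern).
    match = re.match(r"jdbc:ojp\[([^\]]+)\]_(.+)$", url)
    if match is None:
        raise ValueError("URL must look like jdbc:ojp[host:port,...]_db-url")
    endpoint_part, db_url = match.group(1), match.group(2)

    endpoints = []
    # Rule 2: split by comma, trimming whitespace around each item.
    for item in endpoint_part.split(","):
        item = item.strip()
        # Rule 3: host, port, and optional "(dsName)" suffix.
        ds_match = re.match(r"([^:]+):(\d+)(?:\(([^)]+)\))?$", item)
        if ds_match is None:
            raise ValueError(f"Malformed endpoint: {item!r}")
        host, port, ds = ds_match.group(1), int(ds_match.group(2)), ds_match.group(3)
        # Rule 4: validate host and port range.
        if not (1 <= port <= 65535):
            raise ValueError(f"Invalid port in endpoint: {item!r}")
        endpoints.append((host, port, ds or "default"))
    return endpoints, db_url
```

For instance, `parse_ojp_url("jdbc:ojp[a:1059(web),b:1059]_postgresql://db/mydb")` yields the endpoints `("a", 1059, "web")` and `("b", 1059, "default")` together with the actual database URL `postgresql://db/mydb`.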
+ +### Examples + +| Input | Endpoints | Datasource names | Actual DB URL | +|---|---|---|---| +| `jdbc:ojp[localhost:10591]_postgresql://db:5432/mydb` | `localhost:10591` | `default` | `postgresql://db:5432/mydb` | +| `jdbc:ojp[a:1059,b:1059]_h2:mem:test` | `a:1059`, `b:1059` | `default`, `default` | `h2:mem:test` | +| `jdbc:ojp[a:1059(web),b:1059(analytics)]_postgresql://db/mydb` | `a:1059`, `b:1059` | `web`, `analytics` | `postgresql://db/mydb` | + +--- + +## 3. Client Identity + +### clientUUID + +- Generate one random UUID (version 4) when the client library is first loaded or when the process starts. This UUID must remain stable for the entire lifetime of the process. +- Attach `clientUUID` to every `ConnectionDetails` message sent to the server. +- The server uses `clientUUID` to group all sessions from the same client process. +- Do not persist `clientUUID` across process restarts; each new process should generate a fresh UUID. + +--- + +## 4. Connection Establishment and connHash Caching + +### First connection (cache miss) + +1. Build a `ConnectionDetails` message: + - `url` — the single-endpoint URL extracted during parsing (see §2). + - `user`, `password` — credentials. + - `clientUUID` — the stable process UUID (see §3). + - `properties` — datasource-specific properties from configuration (see §22), including cache rules (see §23). + - `serverEndpoints` — list of all known server endpoints as `host:port` strings, used by the server for cluster coordination. + - `clusterHealth` — current cluster health string (see §11); empty on very first connect. + - `isXA` — `true` for XA connections, `false` otherwise. +2. Call `connect(ConnectionDetails)` on the chosen server. Receive `SessionInfo`. +3. Cache the returned `connHash`, keyed on `url + "|" + user + "|" + password + "|" + datasourceName`. Also store the full `ConnectionDetails` so it can be replayed if the server restarts. +4. Return the received `SessionInfo` to the caller. 
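The connection-key caching behaviour described in this section can be sketched as a small client-side cache. This is an illustrative Python sketch only: `ConnHashCache` is a hypothetical name, plain dictionaries stand in for the `ConnectionDetails` and `SessionInfo` messages, and `connect_rpc` stands in for the real gRPC `connect()` stub.

```python
# Illustrative sketch of the connHash cache (hypothetical class, not a real
# OJP client API). Dicts stand in for protobuf messages.
class ConnHashCache:
    def __init__(self, connect_rpc):
        self._connect_rpc = connect_rpc  # callable(ConnectionDetails) -> SessionInfo
        self._conn_hash = {}             # connection key -> connHash
        self._details = {}               # connection key -> ConnectionDetails (kept for replay)

    @staticmethod
    def key(url, user, password, ds_name):
        # Cache key as specified: url + "|" + user + "|" + password + "|" + datasourceName
        return "|".join([url, user, password, ds_name])

    def get_session(self, details, ds_name):
        k = self.key(details["url"], details["user"], details["password"], ds_name)
        self._details[k] = details
        if not details.get("isXA") and k in self._conn_hash:
            # Cache hit (non-XA only): build SessionInfo locally, no gRPC call.
            return {"connHash": self._conn_hash[k],
                    "clientUUID": details["clientUUID"], "isXA": False}
        # Cache miss -- or XA, which always calls the server.
        session = self._connect_rpc(details)
        if not details.get("isXA"):
            self._conn_hash[k] = session["connHash"]
        return session

    def invalidate(self, k):
        # NOT_FOUND recovery: drop the connHash but keep ConnectionDetails for replay.
        self._conn_hash.pop(k, None)
```

The key point the sketch captures is that only the first connection for a given key issues a `connect()` RPC; later non-XA connections are served locally until a `NOT_FOUND` invalidates the entry.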
+ +### Subsequent connections (cache hit, non-XA only) + +When a subsequent connection uses the same credentials: +1. Look up `connHash` from the local cache by the connection key. +2. Build a `SessionInfo` locally without making any gRPC call: + ``` + SessionInfo { + connHash: <cached connHash> + clientUUID: <process clientUUID> + isXA: false + } + ``` +3. Return this locally-built `SessionInfo`. No `sessionUUID` is set yet; it will be assigned by the server when the first SQL operation requires a session (e.g., on `startTransaction`). + +**XA connections always call the server** — caching is disabled for XA because each XA connection must create a dedicated pool entry on a specific server. + +### Cache invalidation (NOT_FOUND recovery) + +When any gRPC call returns `Status.NOT_FOUND`, the server has lost its in-memory pool (e.g., after a restart). Recovery procedure: +1. Remove the cached `connection key → connHash` entry (but keep the stored `ConnectionDetails`). +2. Re-issue a real `connect()` RPC using the stored `ConnectionDetails`. +3. Cache the new `connHash` returned. +4. Retry the original failed operation once with the new `SessionInfo`. +5. This retry is only safe if the original request had no active `sessionUUID` (no open transaction). If a session was in progress, surface the error to the caller — the transaction state is permanently lost. + +--- + +## 5. Session Management + +### SessionInfo fields + +| Field | Type | Meaning | +|---|---|---| +| `connHash` | string | Server-side key identifying which connection pool to use | +| `clientUUID` | string | Client process identity (see §3) | +| `sessionUUID` | string | Server-side session handle; set once a session is established (on `startTransaction`, LOB creation, etc.) 
| +| `transactionInfo` | `TransactionInfo` | Contains `transactionUUID` and `transactionStatus` (`TRX_ACTIVE`, `TRX_COMMITED`, `TRX_ROLLBACK`) | +| `sessionStatus` | `SessionStatus` | `SESSION_ACTIVE` or `SESSION_TERMINATED` | +| `isXA` | bool | Whether this is an XA session | +| `targetServer` | string | `host:port` of the server this session is pinned to (set by the server, used by the client for stickiness) | +| `clusterHealth` | string | Current cluster health snapshot from the server's perspective | + +### Lifecycle rules + +- Always propagate the **latest** `SessionInfo` on every outgoing request. The server updates and returns it in every response; the client must replace its local copy with the one returned. +- When the response contains a `sessionUUID` that was absent in the request, register it immediately with the session-stickiness layer (see §6). +- On connection close: call `terminateSession(SessionInfo)`. This is mandatory for releasing server-side resources, especially in multinode deployments where multiple servers may hold pools. +- If `sessionStatus == SESSION_TERMINATED` is received, treat the connection as closed and do not make further calls on it. + +--- + +## 6. Session Stickiness + +### Rule + +Once a `sessionUUID` is established, **every subsequent request for that session must go to the same server**. The server embeds `targetServer` (`host:port`) in the `SessionInfo` response; the client must record this binding and honour it. + +### Enforcement + +- Maintain a thread-safe map of `sessionUUID → host:port`. +- On each request: if `sessionUUID` is set in the local `SessionInfo`, look up the bound server. Route the request to that server only. +- If the bound server is currently marked unhealthy: **raise an error to the caller** — do not silently reroute to another server. The in-flight session state (open transaction, LOB handle, cursor) cannot be migrated and the caller must handle the failure. 
+- When a session is closed (`terminateSession`), remove the binding from the map and decrement the session count for that server in the load-balancing tracker (see §7). + +### Session binding sources + +A session binding is created or updated in these cases: +- A response contains a `sessionUUID` that was not present in the request (first assignment). +- The `targetServer` field in a response differs from the currently recorded binding (re-binding after a recovery; log a warning). + +--- + +## 7. Load Balancing + +### Server selection strategies + +Two strategies must be supported, selectable via configuration (see §22, property `ojp.loadaware.selection.enabled`): + +**Least-connections (default, `true`)** +Select the healthy server with the lowest number of active sessions. Track session counts in a thread-safe counter per server endpoint. Use round-robin as a tie-breaker when all servers have equal counts. + +**Round-robin (`false`)** +Cycle through healthy servers in order using an atomic counter modulo the number of healthy servers. + +### When selection runs + +Server selection runs on every new connection attempt (non-XA, first `connect()`) and on every XA `connect()`. Once a session is assigned a server (via session stickiness), selection does not run again for that session. + +### Healthy server filter + +Only servers whose `isHealthy() == true` are eligible for selection. If no healthy servers exist, raise a connection error. + +--- + +## 8. Failover + +### What triggers failover + +Connection-level gRPC errors indicate that the server is unreachable. The following gRPC status codes are treated as connectivity failures: + +| Status code | Trigger failover? 
| +|---|---| +| `UNAVAILABLE` | Yes | +| `DEADLINE_EXCEEDED` | Yes | +| `UNKNOWN` (with "connection" in message) | Yes | +| `INTERNAL` with SQL metadata trailers | **No** — this is a database-level error | +| `NOT_FOUND` | **No** — triggers reconnect (see §4), not failover | +| `RESOURCE_EXHAUSTED` (pool exhaustion) | **No** — surface to caller | +| Any `SQLException` from server | **No** | + +### Failover procedure + +1. When a connectivity error is detected on a server: + a. Mark the server unhealthy (`isHealthy = false`), recording the failure timestamp. + b. Log the failure. +2. Select the next healthy server (using the configured strategy, excluding the failed server and any already attempted in this retry cycle). +3. Retry the operation on the new server. +4. If all servers have been attempted and all failed, raise a connection error to the caller. +5. Retry attempts and delay between retries are configurable (see §22, properties `ojp.multinode.retry.attempts` and `ojp.multinode.retry.delay`). + +### What must NOT trigger failover + +- Database errors (bad SQL, constraint violations, auth failures) — surface directly to caller. +- Pool exhaustion — surface directly to caller. +- Session-invalidation errors (session lost after server failure) — surface directly to caller; the caller must re-establish the session. + +--- + +## 9. Health Checking + +### Background task + +Run a periodic background task that checks server health. The task must: +- Run at a configurable fixed interval (property `ojp.health.check.interval`, default 5 000 ms). +- Not block the main execution thread. +- Be a daemon task so it does not prevent process shutdown. + +### Two-phase check + +**Phase 1 — probe healthy servers (detect newly failed servers)** +For each currently healthy server, send a `connect()` with empty credentials. If the call throws any exception, mark the server unhealthy and call the server-failure handler (see §11). 
+ +**Phase 2 — probe unhealthy servers (detect recovery)** +For each currently unhealthy server, check if enough time has passed since the last failure (property `ojp.health.check.threshold`, default 5 000 ms). If so, probe the server. If the probe succeeds, mark it healthy and trigger recovery procedures (see §10). + +### Health probe modes + +| Mode | How to probe | When to use | +|---|---|---| +| Heartbeat (lightweight) | Send `connect()` with empty `url`, `user`, `password` — any response means transport is up | Default | +| Full validation | Send `connect()` with real credentials; on success, call `terminateSession` on the returned session | When heartbeat is insufficient | + +### Configurable properties (see §22) + +| Property | Default | Meaning | +|---|---|---| +| `ojp.health.check.interval` | 5000 ms | How often the check runs | +| `ojp.health.check.threshold` | 5000 ms | How long to wait before re-probing an unhealthy server | +| `ojp.health.check.timeout` | 5000 ms | Maximum time for a single probe call | +| `ojp.redistribution.enabled` | `true` | Whether to run the periodic health checker at all | + +--- + +## 10. Connection Redistribution on Recovery + +### Goal + +When a failed server comes back online, rebalance client-side connections so that the recovered server receives its fair share of traffic again. This avoids all load remaining on the servers that survived the outage. + +### Procedure on recovery + +1. Before marking the server healthy, **proactively re-initialise pools** on the recovered server. For every cached `connHash`/`ConnectionDetails` pair, call `connect()` on the recovered server so it creates the HikariCP pool immediately. This avoids `NOT_FOUND` errors on the first SQL call routed there. +2. Mark the server healthy. +3. Push the updated cluster health string to all healthy servers (see §11) so they can resize their pools. +4. 
If redistribution is enabled (`ojp.redistribution.enabled = true`), begin rebalancing: + - Determine the ideal share: `totalConnections / numberOfHealthyServers`. + - Identify over-loaded servers (connections > ideal share). + - Close a fraction of idle connections on over-loaded servers so they are returned to the pool, then re-opened — the client's load-balancing layer will route the re-opens to the least-loaded server (including the recovered one). + - Honour the configurable fraction (`ojp.redistribution.idleRebalanceFraction`, default 1.0) and max-close-per-cycle limit (`ojp.redistribution.maxClosePerRecovery`, default 100). + +--- + +## 11. Cluster Health Propagation + +### Cluster health string format + +``` +host1:port1(UP);host2:port2(DOWN);host3:port3(UP) +``` + +Each semicolon-separated segment is `host:port(STATUS)` where status is `UP` or `DOWN`. + +### Client responsibilities + +- **Build** the cluster health string from local server endpoint health state before every `connect()` call and before every operation that carries a `SessionInfo` (by populating `SessionInfo.clusterHealth`). +- **Consume** the cluster health string returned in `SessionInfo.clusterHealth` on every response. Update local endpoint health states accordingly: mark endpoints `DOWN` as unhealthy and endpoints `UP` as healthy (if they were previously failed). +- **Push** the updated cluster health to all currently healthy servers after a server health state change (failure or recovery). This is done by calling `connect()` on each healthy server with a `ConnectionDetails` that contains the new `clusterHealth`. The server uses this to resize its pool immediately. + +### Generation + +``` +generate_cluster_health(endpoints): + return ";".join( + f"{ep.host}:{ep.port}({'UP' if ep.is_healthy else 'DOWN'})" + for ep in endpoints + ) +``` + +--- + +## 12. Transaction Management (non-XA) + +### autoCommit semantics + +- Default state is `autoCommit = true`. 
+- When `autoCommit` is switched **off** (`false`), immediately call `startTransaction(SessionInfo)`. Store the returned `SessionInfo` (which now contains a `transactionUUID` and `TRX_ACTIVE` status). +- When `autoCommit` is switched **on** (`true`) while a transaction is active (`TRX_ACTIVE`), immediately call `commitTransaction(SessionInfo)` to commit the pending work. +- In `autoCommit = false` mode, no `startTransaction` call is needed before each SQL statement — the server tracks the open transaction via `sessionUUID`. + +### Commit and rollback + +| Client call | gRPC call | Condition | +|---|---|---| +| `commit()` | `commitTransaction(SessionInfo)` | Only when `autoCommit == false` | +| `rollback()` | `rollbackTransaction(SessionInfo)` | Only when `autoCommit == false` | + +Always replace the local `SessionInfo` with the one returned by these calls. + +### Transaction isolation + +- Set isolation level via `callResource` with `CallType.CALL_SET`, resource name `"TransactionIsolation"`, and the integer isolation level as parameter. +- Get isolation level via `callResource` with `CallType.CALL_GET`, resource name `"TransactionIsolation"`. +- The isolation level must be reset to the default after each logical connection is returned to a pool (if the client integrates with a connection pool). + +--- + +## 13. Savepoints + +Savepoints are implemented through the `callResource` protocol using `ResourceType.RES_SAVEPOINT`. + +### Creating a savepoint + +Call `callResource` with: +- `resourceType = RES_SAVEPOINT` +- `target.callType = CALL_SET` (or `CALL_INSERT` for named savepoints, depending on server version) +- `target.resourceName = "Savepoint"` +- `target.params = [savepointName]` if named; empty for anonymous savepoints. + +The response contains the savepoint UUID in `CallResourceResponse.resourceUUID`. 
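The savepoint-creation request above can be assembled as in the following sketch. This is illustrative Python only: plain dictionaries stand in for the generated `CallResourceRequest`/`TargetCall` protobuf messages (see §20), and the helper name is hypothetical.

```python
# Hypothetical helper: builds the callResource request for creating a
# savepoint. Dicts stand in for the protobuf messages of §20.
def build_create_savepoint_request(session, savepoint_name=None):
    # [name] for a named savepoint; empty params for an anonymous one.
    params = [savepoint_name] if savepoint_name is not None else []
    return {
        "session": session,
        "resourceType": "RES_SAVEPOINT",
        "target": {
            "callType": "CALL_SET",
            "resourceName": "Savepoint",
            "params": params,
        },
    }
```

The server's `CallResourceResponse.resourceUUID` from this call is then stored by the client and supplied as the `resourceUUID` of the later rollback/release requests.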
+ +### Rolling back to a savepoint + +Call `callResource` with: +- `resourceType = RES_SAVEPOINT` +- `resourceUUID = <savepoint UUID>` +- `target.callType = CALL_ROLLBACK` + +### Releasing a savepoint + +Call `callResource` with: +- `resourceType = RES_SAVEPOINT` +- `resourceUUID = <savepoint UUID>` +- `target.callType = CALL_RELEASE` + +--- + +## 14. XA / Distributed Transactions + +### Overview + +XA support maps the standard XA resource manager protocol to gRPC RPCs. XA connections are always pinned to a single server (§6). + +### XA transaction lifecycle + +``` +xaStart(XaStartRequest) -- Begin branch; safe to retry on connection error +xaEnd(XaEndRequest) -- End branch; NEVER retry after this point +xaPrepare(XaPrepareRequest) -- Two-phase prepare; returns XA_OK or XA_RDONLY +xaCommit(XaCommitRequest) -- Commit (onePhase=true for one-phase optimisation) +xaRollback(XaRollbackRequest) -- Roll back the branch +xaRecover(XaRecoverRequest) -- List in-doubt XIDs (for recovery after crash) +xaForget(XaForgetRequest) -- Forget a heuristically completed branch +``` + +### Xid encoding (XidProto) + +| Field | Type | Meaning | +|---|---|---| +| `formatId` | int32 | Transaction format ID | +| `globalTransactionId` | bytes | Global transaction ID (up to 64 bytes) | +| `branchQualifier` | bytes | Branch qualifier (up to 64 bytes) | + +### Retry policy + +- **`xaStart`** only: retry on connection-level errors (see §8). No transaction state exists yet so retrying is safe. +- **All other XA operations**: do not retry automatically. Surface failures to the caller's transaction manager. + +### XA session binding + +On the response to `xaStart`, record the `sessionUUID → targetServer` binding (§6). All subsequent XA operations for this branch must go to the same server. If that server is unavailable, raise `XAException(XAER_RMFAIL)`. + +### Timeout + +- `xaSetTransactionTimeout(seconds)` and `xaGetTransactionTimeout()` are straightforward pass-throughs to the server. 
+- `xaIsSameRM` checks whether two `SessionInfo` objects originate from the same resource manager (same server). + +--- + +## 15. Statement Execution + +### Three statement types + +**Plain Statement** +Execute arbitrary SQL strings without parameters. Maps to `executeUpdate` or `executeQuery` with an empty `parameters` list. + +**Prepared Statement** +Pre-compiled SQL with positional parameters (`?` placeholders). Parameters are accumulated locally and sent with the SQL in a single `StatementRequest`. Assign and track a `statementUUID` (a random UUID per prepared statement instance) for server-side resource management. + +**Callable Statement** +Stored-procedure calls with IN, OUT, and INOUT parameters. The stored-procedure call string is prepared on the server via `callResource` with `CallType.CALL_PREPARE` first. The returned `resourceUUID` becomes the Callable Statement handle. Parameters are registered by index and type, and OUT/INOUT values are retrieved from `CallResourceResponse.values` after execution. + +### StatementRequest structure + +``` +StatementRequest { + session: SessionInfo // current session + sql: string // the SQL (or call string) + parameters: ParameterProto[] // indexed parameters + statementUUID: string // UUID for this statement (for resource tracking) + properties: PropertyEntry[] // optional per-statement properties +} +``` + +### Execution routing + +- Use `executeUpdate` for INSERT / UPDATE / DELETE / DDL — returns `OpResult` with `type = INTEGER` containing affected row count. +- Use `executeQuery` for SELECT — returns a server-streaming response. Consume the first `OpResult` to get the initial batch; call `fetchNextRows` for subsequent pages (see §18). +- After any execution, update the local `SessionInfo` from the `OpResult.session` field. + +--- + +## 16. 
Parameter Type Mapping + +### ParameterProto + +Each parameter is represented as: +``` +ParameterProto { + index: int32 // 1-based parameter position + type: ParameterTypeProto // one of the 28 type codes + values: ParameterValue[] // one value for normal params; multiple for array params +} +``` + +### ParameterTypeProto values and their ParameterValue encoding + +| Proto enum value | Wire field in `ParameterValue` | Notes | +|---|---|---| +| `PT_NULL` | `is_null = true` | Explicit null | +| `PT_BOOLEAN` | `bool_value` | | +| `PT_BYTE` | `int_value` | Clamp to byte range | +| `PT_SHORT` | `int_value` | Clamp to short range | +| `PT_INT` | `int_value` | | +| `PT_LONG` | `long_value` | | +| `PT_FLOAT` | `float_value` | | +| `PT_DOUBLE` | `double_value` | | +| `PT_BIG_DECIMAL` | `string_value` | Encode as `"<unscaledInteger> <scale>"` — see §16.1 | +| `PT_STRING` | `string_value` | | +| `PT_BYTES` | `bytes_value` | Raw bytes | +| `PT_DATE` | `date_value` | `google.type.Date` (year/month/day, no timezone) | +| `PT_TIME` | `time_value` | `google.type.TimeOfDay` (hours/minutes/seconds/nanos) | +| `PT_TIMESTAMP` | `timestamp_value` | `TimestampWithZone` — see §17 | +| `PT_ASCII_STREAM` | `bytes_value` | ASCII bytes | +| `PT_UNICODE_STREAM` | `bytes_value` | Unicode bytes | +| `PT_BINARY_STREAM` | `bytes_value` | Binary bytes | +| `PT_OBJECT` | varies | Best-effort mapping to one of the concrete value types | +| `PT_CHARACTER_READER` | `string_value` | Contents of the character stream | +| `PT_REF` | `string_value` | REF value as string | +| `PT_BLOB` | (LOB reference UUID) | Create LOB first (§19); then pass UUID as `string_value` | +| `PT_CLOB` | (LOB reference UUID) | Same as BLOB | +| `PT_ARRAY` | `int_array_value` / `long_array_value` / `string_array_value` | Use the typed array message matching element type | +| `PT_URL` | `url_value` (StringValue) | `URL.toExternalForm()` — presence-aware; unset = null | +| `PT_ROW_ID` | `rowid_value` (StringValue) | Base64-encoded bytes of the RowId — 
presence-aware | +| `PT_N_STRING` | `string_value` | Same wire format as PT_STRING | +| `PT_N_CHARACTER_STREAM` | `string_value` | Contents of the NCharacter stream | +| `PT_N_CLOB` | (LOB reference UUID) | Same as CLOB | +| `PT_SQL_XML` | `string_value` | XML content as string | + +#### 16.1 BigDecimal encoding + +BigDecimal is serialised as a space-separated string: `"<unscaledInteger> <scale>"`. + +- `unscaledInteger`: the decimal string representation of the unscaled value (may be negative), e.g. `"-12345"`. +- `scale`: integer scale (number of decimal places), e.g. `2`. +- Full value = `unscaledInteger × 10^(-scale)`. + +Example: `BigDecimal("123.45")` → `"12345 2"`. + +> **Note:** A separate binary wire format is documented in `documents/protocol/BIGDECIMAL_WIRE_FORMAT.md` for contexts where binary efficiency is needed. + +#### 16.2 Presence-aware fields + +`url_value`, `rowid_value`, `uuid_value`, `biginteger_value`, `rowidlifetime_value` are all `google.protobuf.StringValue` (a wrapper message). An absent (unset) wrapper means SQL NULL. An empty string inside the wrapper is a valid non-null value. + +--- + +## 17. 
Temporal Type Handling + +### TimestampWithZone + +Timestamps are transmitted as: + +``` +TimestampWithZone { + instant: google.protobuf.Timestamp // seconds + nanos since Unix epoch (UTC) + timezone: string // IANA zone ID or UTC offset (e.g., "Europe/Rome", "+02:00") + original_type: TemporalType // preserves the caller's original type +} +``` + +### TemporalType enum + +| Value | Original type | +|---|---| +| `TEMPORAL_TYPE_UNSPECIFIED` | Default / unknown | +| `TEMPORAL_TYPE_TIMESTAMP` | `java.sql.Timestamp` | +| `TEMPORAL_TYPE_CALENDAR` | `java.util.Calendar` | +| `TEMPORAL_TYPE_OFFSET_DATE_TIME` | `java.time.OffsetDateTime` | +| `TEMPORAL_TYPE_LOCAL_DATE_TIME` | `java.time.LocalDateTime` | +| `TEMPORAL_TYPE_INSTANT` | `java.time.Instant` | +| `TEMPORAL_TYPE_LOCAL_DATE` | `java.time.LocalDate` | +| `TEMPORAL_TYPE_LOCAL_TIME` | `java.time.LocalTime` | +| `TEMPORAL_TYPE_OFFSET_TIME` | `java.time.OffsetTime` | + +### Encoding rules + +1. Convert the host-language datetime value to an absolute UTC instant (seconds + nanoseconds since the Unix epoch). +2. Record the IANA timezone or UTC offset string. +3. Set `original_type` to the closest matching `TemporalType` enum value. + +### Decoding rules + +On the receiving side, use `original_type` to reconstruct the correct host-language type: +- `TEMPORAL_TYPE_LOCAL_DATE_TIME` / `TEMPORAL_TYPE_TIMESTAMP` → local datetime in the client's timezone. +- `TEMPORAL_TYPE_OFFSET_DATE_TIME` → datetime with offset reconstructed from the `timezone` string. +- `TEMPORAL_TYPE_INSTANT` → UTC instant. +- `TEMPORAL_TYPE_LOCAL_DATE` → date only (no time component). +- `TEMPORAL_TYPE_LOCAL_TIME` / `TEMPORAL_TYPE_OFFSET_TIME` → time-only value with or without offset. + +**Date-only values** use `google.type.Date` (year, month, day — no timezone). +**Time-only values** use `google.type.TimeOfDay` (hours, minutes, seconds, nanos — no timezone). + +### Timezone requirement + +The OJP server must always run with `user.timezone=UTC`. 
Client libraries should also normalise to UTC when encoding timestamps, using the `timezone` field to carry the original zone for faithful reconstruction. + +--- + +## 18. Result Set and Streaming + +### Consuming executeQuery + +`executeQuery` is a server-streaming RPC. The response stream contains one or more `OpResult` messages: + +1. **First `OpResult`**: always contains the initial data batch in `query_result`: + - `resultSetUUID` — server-side handle for this result set. + - `labels` — ordered list of column names. + - `rows` — first batch of `ResultRow` objects, each containing a `ParameterValue` per column. + - `flag` — if `"ROW_BY_ROW"`, the server sends one row per stream message (row-by-row mode); otherwise the initial batch may contain multiple rows. + +2. **Subsequent `OpResult` messages** (only in non-row-by-row streaming mode): additional batches until the stream closes. + +3. **`fetchNextRows`**: After the initial stream closes, call `fetchNextRows(ResultSetFetchRequest)` with `resultSetUUID` and a page size to fetch additional rows. Repeat until the response contains an empty `rows` list or the result set is exhausted. + +### Column value decoding + +Map each `ParameterValue` oneof to the host language's equivalent type following the inverse of the encoding table in §16. Pay attention to `is_null = true` for SQL NULL values. + +### Cursor navigation + +Scrollable result sets support cursor positioning through `callResource` with `ResourceType.RES_RESULT_SET` and the appropriate `CallType`: + +| Cursor operation | CallType | +|---|---| +| `next()` | `CALL_NEXT` | +| `first()` | `CALL_FIRST` | +| `last()` | `CALL_LAST` | +| `beforeFirst()` | `CALL_BEFORE` | +| `afterLast()` | `CALL_AFTER` | +| `absolute(row)` | `CALL_ABSOLUTE` | +| `relative(rows)` | `CALL_RELATIVE` | +| `previous()` | `CALL_PREVIOUS` | +| `close()` | `CALL_CLOSE` | + +--- + +## 19. 
LOB (Large Object) Handling + +### LOB types + +| LobType enum | Meaning | +|---|---| +| `LT_BLOB` | Binary large object | +| `LT_CLOB` | Character large object | +| `LT_BINARY_STREAM` | Binary stream (column-streaming variant) | +| `LT_ASCII_STREAM` | ASCII character stream | +| `LT_UNICODE_STREAM` | Unicode character stream | +| `LT_CHARACTER_STREAM` | Generic character stream | + +### Writing a LOB (createLob) + +1. Open a client-streaming call to `createLob`. +2. Send one or more `LobDataBlock` messages: + ``` + LobDataBlock { + session: SessionInfo + position: int64 // byte offset of this chunk + data: bytes // chunk content (recommended chunk size: 32–64 KB) + lobType: LobType + metadata: PropertyEntry[] // used for binary streams to carry prepared statement info + } + ``` +3. Close the stream. The server responds with a `LobReference` stream (typically one message): + ``` + LobReference { + session: SessionInfo + uuid: string // LOB handle + bytesWritten: int32 + lobType: LobType + } + ``` +4. Store the `LobReference.uuid`. This UUID is what gets passed as a parameter value (§16) when binding the LOB to a SQL statement. + +### Reading a LOB (readLob) + +Call `readLob(ReadLobRequest)`: +``` +ReadLobRequest { + lobReference: LobReference // uuid + session info + position: int64 // start byte (1-based for JDBC compatibility) + length: int32 // max bytes to return +} +``` +Receive a server-streaming response of `LobDataBlock` messages. Concatenate the `data` fields in order to reconstruct the content. + +### LOB and session stickiness + +LOB handles are server-side objects. A connection that has an open LOB must remain bound to the same server (§6). Do not reroute such connections during failover; instead surface the error to the caller. + +--- + +## 20. 
CallResource Protocol

The `callResource` RPC is a generic mechanism for operations that do not fit a dedicated RPC — primarily `DatabaseMetaData` queries, `ResultSet` cursor/update operations, `Statement` cancellation, savepoint management, and resource lifecycle calls.

### Request

```
CallResourceRequest {
  session: SessionInfo
  resourceType: ResourceType   // what kind of resource to call
  resourceUUID: string         // the server-side handle for this resource
  target: TargetCall           // the specific operation to perform
  properties: PropertyEntry[]
}
```

### TargetCall (supports chaining)

```
TargetCall {
  callType: CallType           // one of the 48 call type codes
  resourceName: string         // e.g., "Catalog", "TransactionIsolation", "Savepoint"
  params: ParameterValue[]     // input arguments
  nextCall: TargetCall         // optional chained call (recursive)
}
```

### ResourceType values

| Value | Meaning |
|---|---|
| `RES_RESULT_SET` | An open result set |
| `RES_STATEMENT` | A plain statement |
| `RES_PREPARED_STATEMENT` | A prepared statement |
| `RES_CALLABLE_STATEMENT` | A callable statement |
| `RES_LOB` | A LOB object |
| `RES_CONNECTION` | The connection itself (for metadata, catalog, etc.) |
| `RES_SAVEPOINT` | A savepoint |

### Response

```
CallResourceResponse {
  session: SessionInfo
  resourceUUID: string         // UUID of a newly created resource, if any
  values: ParameterValue[]     // return values (may be empty)
}
```

Always update the local `SessionInfo` from `response.session`.
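As an illustration, a chained metadata lookup can be modelled as nested `TargetCall`s. The resource names `"MetaData"` and `"Tables"`, the empty `resourceUUID`, and the placeholder session below are assumptions made for the sketch, not confirmed server conventions:

```python
def target_call(call_type, resource_name, params=(), next_call=None):
    """Build a TargetCall message as a plain dict mirroring the proto sketch above."""
    call = {"callType": call_type, "resourceName": resource_name, "params": list(params)}
    if next_call is not None:
        call["nextCall"] = next_call  # recursive chaining
    return call

# Hypothetical chained call: connection.getMetaData().getTables(None, "public", "%", None)
session_info = {"connHash": "abc123", "clientUUID": "client-1"}  # placeholder SessionInfo
request = {
    "session": session_info,
    "resourceType": "RES_CONNECTION",  # the call targets the connection itself
    "resourceUUID": "",                # no handle needed for the connection
    "target": target_call(
        "CALL_GET", "MetaData",
        next_call=target_call("CALL_GET", "Tables", params=[None, "public", "%", None]),
    ),
    "properties": [],
}
```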

### CallType reference (48 codes)

`CALL_SET`, `CALL_GET`, `CALL_IS`, `CALL_ALL`, `CALL_NULLS`, `CALL_USES`, `CALL_SUPPORTS`, `CALL_STORES`, `CALL_NULL`, `CALL_DOES`, `CALL_DATA`, `CALL_NEXT`, `CALL_CLOSE`, `CALL_WAS`, `CALL_CLEAR`, `CALL_FIND`, `CALL_BEFORE`, `CALL_AFTER`, `CALL_FIRST`, `CALL_LAST`, `CALL_ABSOLUTE`, `CALL_RELATIVE`, `CALL_PREVIOUS`, `CALL_ROW`, `CALL_UPDATE`, `CALL_INSERT`, `CALL_DELETE`, `CALL_REFRESH`, `CALL_CANCEL`, `CALL_MOVE`, `CALL_OWN`, `CALL_OTHERS`, `CALL_UPDATES`, `CALL_DELETES`, `CALL_INSERTS`, `CALL_LOCATORS`, `CALL_AUTO`, `CALL_GENERATED`, `CALL_RELEASE`, `CALL_NATIVE`, `CALL_PREPARE`, `CALL_ROLLBACK`, `CALL_ABORT`, `CALL_EXECUTE`, `CALL_ADD`, `CALL_ENQUOTE`, `CALL_REGISTER`, `CALL_LENGTH`

---

## 21. Error and Exception Mapping

### SQL errors carried in gRPC trailers

When the server encounters a SQL error, it returns `Status.INTERNAL` with a `SqlErrorResponse` message attached to the trailing metadata. Extract it using the proto-typed metadata key for `SqlErrorResponse`; by gRPC convention this is the message's fully qualified name with a `-bin` suffix (in Java, `ProtoUtils.keyForProto(SqlErrorResponse.getDefaultInstance())`).

```
SqlErrorResponse {
  reason: string               // human-readable message
  sqlState: string             // ANSI SQL state code
  vendorCode: int32            // database-specific error code
  sqlErrorType: SqlErrorType   // SQL_EXCEPTION or SQL_DATA_EXCEPTION
}
```

Map to the host language's exception hierarchy:
- `SQL_EXCEPTION` → standard SQL exception.
- `SQL_DATA_EXCEPTION` → data-specific SQL exception (subtype).

### Error classification matrix

| Condition | gRPC status | Client action |
|---|---|---|
| SQL error (bad query, constraint, etc.) 
| `INTERNAL` + `SqlErrorResponse` trailer | Throw SQL exception; do not retry; do not mark server unhealthy | +| Pool not found (server restarted) | `NOT_FOUND` | Invalidate connHash cache; reconnect; retry once (§4) | +| Server unreachable | `UNAVAILABLE` | Failover to next server (§8) | +| Request timeout | `DEADLINE_EXCEEDED` | Failover to next server (§8) | +| Pool exhausted | `RESOURCE_EXHAUSTED` | Throw pool-exhaustion error; do not retry; do not mark server unhealthy | +| Session invalidated (server failure) | Session-not-found message | Throw session-lost error; do not retry; let caller decide | +| Session stickiness violation (server down) | Local check before RPC | Throw connection error immediately; do not reroute | + +--- + +## 22. Configuration System + +### Configuration sources (in priority order) + +1. **System / environment properties** (highest priority) — e.g., `-Dojp.health.check.interval=10000` or environment variable equivalents. +2. **`ojp.properties` file** — loaded from the classpath or a well-known filesystem path. +3. **Built-in defaults** (lowest priority). + +### Property namespacing + +Properties can be global or per-datasource. 
Per-datasource properties are prefixed with the datasource name:

```properties
# Global
ojp.health.check.interval=5000

# Per-datasource (datasource name: "analytics")
analytics.ojp.health.check.interval=10000
```

### Standard configuration properties

| Property | Default | Meaning |
|---|---|---|
| `ojp.health.check.interval` | `5000` (ms) | Periodic health check interval |
| `ojp.health.check.threshold` | `5000` (ms) | Minimum wait before re-probing an unhealthy server |
| `ojp.health.check.timeout` | `5000` (ms) | Probe call timeout |
| `ojp.redistribution.enabled` | `true` | Enable/disable the health checker and redistribution |
| `ojp.redistribution.idleRebalanceFraction` | `1.0` | Fraction of idle connections to close per rebalance cycle |
| `ojp.redistribution.maxClosePerRecovery` | `100` | Max connections closed per recovery event |
| `ojp.loadaware.selection.enabled` | `true` | Use least-connections; `false` = round-robin |
| `ojp.multinode.retry.attempts` | `3` | Max failover retry attempts |
| `ojp.multinode.retry.delay` | `100` (ms) | Delay between retry attempts |
| `ojp.datasource.name` | `"default"` | Active datasource name (sent to the server) |
| `ojp.grpc.tls.enabled` | `false` | Enable TLS on gRPC channels |
| `ojp.grpc.tls.cert.path` | — | Path to client certificate for mTLS |

### Duration format

Duration values support the following suffixes:
- No suffix — milliseconds (e.g. `5000`)
- `ms` — milliseconds (e.g. `500ms`)
- `s` — seconds (e.g. `10s`)
- `m` — minutes (e.g. `2m`)

---

## 23. Query Result Caching

Cache configuration is **defined on the client and forwarded to the server**: the client reads local cache rules and sends them to the server as `ConnectionDetails.properties` entries during `connect()`. The server applies them transparently; the client does not implement any caching logic itself. 
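Concretely, the client-side half amounts to collecting the locally configured cache rules into `PropertyEntry`-style key/value pairs for `ConnectionDetails.properties`. The dict shapes below are illustrative, not the exact wire types:

```python
def cache_property_entries(local_props: dict) -> list:
    """Select the ojp.cache.* settings from the client's local configuration
    and shape them as PropertyEntry-style pairs (illustrative)."""
    return [
        {"key": k, "value": str(v)}
        for k, v in sorted(local_props.items())
        if k == "ojp.cache.enabled" or k.startswith("ojp.cache.queries.")
    ]
```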

### Properties sent to the server

| Property key | Meaning |
|---|---|
| `ojp.cache.enabled` | `"true"` to enable caching |
| `ojp.cache.queries.<n>.pattern` | Regex pattern matching SQL queries to cache |
| `ojp.cache.queries.<n>.ttl` | TTL in seconds for cached results |
| `ojp.cache.queries.<n>.invalidateOn` | Comma-separated table names that invalidate this rule |
| `ojp.cache.queries.<n>.enabled` | `"true"` / `"false"` to toggle individual rules |

`<n>` is a 1-based integer index. Rules are processed in index order.

### Example configuration

```properties
ojp.cache.enabled=true
ojp.cache.queries.1.pattern=SELECT .* FROM products.*
ojp.cache.queries.1.ttl=600
ojp.cache.queries.1.invalidateOn=products,product_prices
ojp.cache.queries.2.pattern=SELECT .* FROM users.*
ojp.cache.queries.2.ttl=300
ojp.cache.queries.2.invalidateOn=users
```

---

## 24. Security / Transport

### Plaintext (default)

Create a plaintext gRPC channel targeting `dns:///host:port`. This is suitable for internal networks or local development.

### TLS

When `ojp.grpc.tls.enabled = true`, create a TLS-secured channel:
- Use the platform's default trust store or a custom CA certificate.
- Support mutual TLS (mTLS) when `ojp.grpc.tls.cert.path` is set.
- Certificate paths and key material must be loaded from configurable filesystem paths; do not hard-code them.

### Credential handling

- Passwords must never be logged or included in exception messages.
- Connection keys used for cache lookups (§4) may include the password as a cache key only — they must not be serialised or persisted.

---

## 25. DataSource / Integration API

### DataSource wrapper

Provide a higher-level `DataSource` (or equivalent) object that:
- Holds connection configuration (URL, user, password, properties).
- Exposes a `getConnection()` method that calls `Driver.connect()` internally.
- Integrates cleanly with the host language's database access conventions (e.g., Python's `DB-API 2.0`, Go's `database/sql`, Node.js connection objects).

### Framework integration (Spring Boot example)

For Java/Spring Boot:
- Provide a `spring-boot-starter-ojp` auto-configuration module.
- Auto-configure an `OjpDataSource` bean when the driver is on the classpath.
- Expose a bridge (`OjpSystemPropertiesBridge`) that copies Spring Boot `application.yml` properties to JVM system properties so the configuration system (§22) can pick them up.
- **Disable** the framework's own built-in connection pool (e.g., HikariCP in Spring Boot) when OJP is in use — double-pooling is the most common misconfiguration and causes incorrect behaviour.

For other languages, document clearly in the library README that the application-side connection pool must be disabled when using OJP.

---

## 26. Testing Coverage

A conformant client implementation must ship a test suite that exercises all the aspects above. Tests that require a live OJP server (and optionally a real database) should be **gated behind feature flags** so the suite can run incrementally in CI.

### Test infrastructure requirements

- A running OJP server (see `ojp-server` module and `download-drivers.sh`).
- At minimum, an embedded/in-process database (e.g., H2) for fast baseline tests.
- Optional: containerised databases (PostgreSQL, MySQL, MariaDB, Oracle, SQL Server, DB2, CockroachDB) gated by per-database flags.

### Test categories and required scenarios

#### Basic CRUD
- SELECT, INSERT, UPDATE, DELETE via plain Statement and PreparedStatement.
- Verify affected row counts and returned ResultSet contents.
- Verify empty result sets are handled correctly.

#### Multiple data types
- Round-trip every `ParameterTypeProto` value through INSERT + SELECT.
- Cover: all integer widths, float, double, BigDecimal, string, boolean, byte array, date, time, timestamp (with and without timezone), LocalDate, LocalTime, LocalDateTime, OffsetDateTime, OffsetTime, Instant, URL, UUID, RowId, BLOB, CLOB, array, NULLs for each type.

#### Statement variants
- Plain `Statement`: `executeQuery`, `executeUpdate`, `execute`, `executeBatch`, `getResultSet`, `getUpdateCount`, `getGeneratedKeys`, `cancel`, `close`.
- `PreparedStatement`: all `setXxx` methods, `executeBatch`, multiple executions with the same prepared statement, `getParameterMetaData`.
- `CallableStatement`: IN, OUT, INOUT parameters; `registerOutParameter`; retrieval of OUT values after execution; named parameters where supported.

#### ResultSet navigation
- Forward-only cursors: `next()`, `wasNull()`, `close()`.
- Scrollable cursors: `first()`, `last()`, `beforeFirst()`, `afterLast()`, `absolute(n)`, `relative(n)`, `previous()`.
- Multi-block pagination: queries large enough to exceed one fetch page; verify all rows are retrieved.

#### ResultSet metadata
- `getColumnCount()`, `getColumnName()`, `getColumnType()`, `getColumnTypeName()`, `getPrecision()`, `getScale()`, `isNullable()`, `isAutoIncrement()`.

#### DatabaseMetaData
- `getTables()`, `getColumns()`, `getPrimaryKeys()`, `getIndexInfo()`, `getProcedures()`, `getTypeInfo()`, `supportsXxx()` methods.
- Verify results match the actual database schema.

#### Transactions
- Commit: insert rows in a transaction, commit, verify rows persist.
- Rollback: insert rows in a transaction, rollback, verify rows are absent.
- `autoCommit = false` then `setAutoCommit(true)` — verify implicit commit.
- Transaction isolation level: set, verify via `getTransactionIsolation()`, reset after connection return.

#### Savepoints
- Create a named and an anonymous savepoint.
- Rollback to each; verify partial rollback semantics.
- Release a savepoint.
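The partial-rollback scenario above can be written against any DB-API-style connection. This sketch uses `sqlite3` purely as a stand-in for an OJP-backed connection, since its dialect supports `SAVEPOINT`:

```python
import sqlite3

def run_savepoint_scenario(conn) -> list:
    """Named-savepoint partial rollback: row 2 is discarded, row 1 survives.
    `conn` is sqlite3 here only as a stand-in for an OJP connection."""
    conn.isolation_level = None                  # manage the transaction explicitly
    conn.execute("CREATE TABLE t (id INTEGER)")
    conn.execute("BEGIN")
    conn.execute("INSERT INTO t VALUES (1)")
    conn.execute("SAVEPOINT sp1")
    conn.execute("INSERT INTO t VALUES (2)")
    conn.execute("ROLLBACK TO SAVEPOINT sp1")    # partial rollback to sp1
    conn.execute("RELEASE SAVEPOINT sp1")
    conn.execute("COMMIT")
    return [row[0] for row in conn.execute("SELECT id FROM t ORDER BY id")]
```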

#### XA transactions
- Full lifecycle: `xaStart`, `xaEnd`, `xaPrepare`, `xaCommit`.
- Rollback path: `xaStart`, `xaEnd`, `xaPrepare`, `xaRollback`.
- One-phase commit (`onePhase=true`).
- `xaRecover`: verify in-doubt XIDs are returned.
- `xaForget`: verify heuristically completed branch is removed.
- Transaction isolation reset after XA session.

#### LOBs
- BLOB: write a small blob (< 1 chunk), a large blob (multiple chunks), read back both; verify byte-for-byte equality.
- CLOB: same as BLOB but with character content.
- Binary stream, ASCII stream, Unicode stream: write via stream API, read back.
- Hydratable LOB: verify that a LOB reference can be passed as a parameter to a second statement.
- NULL LOB: verify that `setBlob(null)` / `setClob(null)` sends a SQL NULL.

#### Session affinity
- Verify that a connection with an open transaction always routes to the same server.
- Verify that a connection holding an open LOB always routes to the same server.
- Verify that when the bound server is down, an appropriate error is raised rather than silent rerouting.

#### Multi-block / large result sets
- Execute a query that returns more rows than one page. Verify all rows arrive and are in the correct order.

#### Multinode load balancing
- With two or more server endpoints, open `N` connections and verify they are distributed across servers (round-robin and least-connections modes separately).

#### Multinode failover
- Kill one server mid-operation; verify the operation is retried on a surviving server (for stateless operations).
- Verify a server is marked unhealthy after failure.
- Verify subsequent connections avoid the unhealthy server.

#### Multinode recovery and redistribution
- Bring a server back; verify it is marked healthy after the health check interval.
- Verify new connections start routing to the recovered server.
- Verify connection redistribution closes a fraction of idle connections on over-loaded servers.
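Several of the multinode scenarios above reduce to asserting that connections end up evenly spread across the healthy servers; a small helper like this (illustrative) can back those checks:

```python
from collections import Counter

def assert_balanced(assignments, servers, tolerance=1):
    """Assert that connection-to-server assignments are evenly spread:
    round-robin over healthy servers should differ by at most `tolerance`."""
    counts = Counter(assignments)
    per_server = [counts.get(s, 0) for s in servers]
    assert max(per_server) - min(per_server) <= tolerance, per_server
```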

#### XA multinode
- Verify that each XA session binds to exactly one server.
- Verify that failover of an XA session to another server raises an error (not a silent reroute).
- Verify XA redistribution after server recovery.

#### connHash caching / connect-RPC skip
- Open two connections with the same credentials; verify only one `connect()` gRPC call is made.
- Simulate a `NOT_FOUND` response; verify the driver invalidates the cache and re-issues `connect()`.

#### Session stickiness error path
- Establish a session on server A. Mark server A unhealthy. Attempt a SQL operation. Verify an error is raised rather than the request being silently routed to server B.

#### Cluster health propagation
- Fail one server; verify the cluster health string sent in subsequent requests marks it `DOWN`.
- Recover the server; verify the health string marks it `UP`.

#### Concurrency / pool exhaustion
- Send more concurrent requests than the server-side pool size; verify pool-exhaustion errors are surfaced cleanly and do not mark servers unhealthy.

#### Slow query segregation
- Send queries that take longer than the slow-query threshold; verify they use the reserved slow-query slots and do not starve fast queries.

#### Multi-datasource
- Configure two endpoints with different datasource names; verify each endpoint uses its own datasource configuration.

#### Configuration loading
- Verify properties are loaded from `ojp.properties`.
- Verify system properties override file properties.
- Verify per-datasource properties override global properties.

#### Performance / mini stress
- Open and close 100–1000 connections in parallel; verify no connection leaks, no deadlocks, and no degrading error rate.

#### Database-specific test suites

Each database must have a dedicated test class gated by its own flag. The class must cover the full set of above scenarios for that database's specific SQL dialect, type system, and edge cases.
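Gating can be as simple as reading the per-database flag from the environment before a suite runs. The flag names follow the table that follows; mapping them onto environment variables (rather than JVM `-D` properties) is an assumption for non-Java test runners:

```python
import os

def db_suite_enabled(flag: str) -> bool:
    """True when a per-database feature flag (e.g. enablePostgresTests) is
    switched on; reading it from the environment is an assumed convention."""
    return os.environ.get(flag, "false").strip().lower() in ("true", "1")
```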
+ +| Database | Feature flag | +|---|---| +| H2 | `enableH2Tests` | +| PostgreSQL | `enablePostgresTests` | +| MySQL | `enableMySQLTests` | +| MariaDB | `enableMariaDBTests` | +| Oracle | `enableOracleTests` | +| SQL Server | `enableSqlServerTests` | +| DB2 | `enableDb2Tests` | +| CockroachDB | `enableCockroachDBTests` | + +H2 tests (in-process, no external dependency) must always be runnable in CI without any extra setup and should act as the first gate before any database-specific jobs run. + +--- + +## Appendix A — Proto file locations + +| File | Location | +|---|---| +| Main protocol | `ojp-grpc-commons/src/main/proto/StatementService.proto` | +| Generic value containers | `ojp-grpc-commons/src/main/proto/containers.proto` | +| Echo / heartbeat | `ojp-grpc-commons/src/main/proto/echo.proto` | + +## Appendix B — Reference implementation classes + +| Aspect | Java class | +|---|---| +| gRPC stubs | `StatementServiceGrpcClient` | +| Multinode routing | `MultinodeStatementService`, `MultinodeConnectionManager` | +| URL parsing | `MultinodeUrlParser`, `UrlParser` | +| Session tracking | `SessionTracker` | +| Health checking | `HealthCheckValidator`, `HealthCheckConfig` | +| Redistribution | `ConnectionRedistributor`, `XAConnectionRedistributor` | +| Error mapping | `GrpcExceptionHandler` | +| Connection lifecycle | `Connection` | +| Statement execution | `Statement`, `PreparedStatement`, `CallableStatement` | +| Result set | `ResultSet`, `RemoteProxyResultSet` | +| LOB handling | `Blob`, `Clob`, `NClob`, `Lob`, `LobServiceImpl` | +| XA | `OjpXAResource`, `OjpXAConnection`, `OjpXADataSource` | +| Driver entry point | `Driver` | +| DataSource wrapper | `OjpDataSource` | From d86de063f55fe6192485f745ae40bef290c29e00 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Sun, 19 Apr 2026 11:28:32 +0000 Subject: [PATCH 02/12] docs: add Java driver cross-references to each spec section Agent-Logs-Url: 
https://github.com/Open-J-Proxy/ojp/sessions/da435d87-9c75-4844-b379-03cf974f166f Co-authored-by: rrobetti <7221783+rrobetti@users.noreply.github.com> --- .../multi-language-client-spec/CLIENT_SPEC.md | 179 ++++++++++++++++++ 1 file changed, 179 insertions(+) diff --git a/documents/multi-language-client-spec/CLIENT_SPEC.md b/documents/multi-language-client-spec/CLIENT_SPEC.md index 3ebe59052..928d86f7f 100644 --- a/documents/multi-language-client-spec/CLIENT_SPEC.md +++ b/documents/multi-language-client-spec/CLIENT_SPEC.md @@ -84,6 +84,12 @@ The client must implement stubs for every RPC in `StatementService` and `EchoSer - Blocking stubs are used for synchronous operations; async stubs are required for client-streaming (`createLob`) and server-streaming (`executeQuery`, `readLob`) RPCs. - Channel shutdown must be graceful (allow in-flight calls to complete) and must be triggered on client shutdown. +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`StatementService`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementService.java): the unified interface declaring all RPC methods (`connect`, `executeUpdate`, `executeQuery`, `fetchNextRows`, `createLob`, `readLob`, `terminateSession`, `startTransaction`, `commitTransaction`, `rollbackTransaction`, `callResource`, all XA operations). +> - `ojp-jdbc-driver` — [`StatementServiceGrpcClient`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java): the single-node gRPC implementation of `StatementService`; contains the concrete gRPC stub calls and the `grpcChannelOpenAndStubsInitialized()` channel lifecycle method. +> - `ojp-jdbc-driver` — [`MultinodeStatementService`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeStatementService.java): the multinode façade that wraps `StatementServiceGrpcClient` per endpoint with routing, failover, and stickiness. 
+> - `ojp-grpc-commons` — [`GrpcChannelFactory`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/GrpcChannelFactory.java): `createChannel(host, port)` / `createChannel(target)` — builds `ManagedChannel` instances with plaintext or TLS; handles the `dns:///` prefix and max inbound message size. + --- ## 2. URL Parsing @@ -117,6 +123,11 @@ All three parts are mandatory: | `jdbc:ojp[a:1059,b:1059]_h2:mem:test` | `a:1059`, `b:1059` | `default`, `default` | `h2:mem:test` | | `jdbc:ojp[a:1059(web),b:1059(analytics)]_postgresql://db/mydb` | `a:1059`, `b:1059` | `web`, `analytics` | `postgresql://db/mydb` | +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`MultinodeUrlParser`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeUrlParser.java): `parseServerEndpoints(url, dataSourceNames)` parses the bracket-enclosed endpoint list; `extractActualJdbcUrl(url)` strips the OJP prefix; `replaceBracketsWithSingleEndpoint(url, endpoint)` produces the single-endpoint URL forwarded to the server; `getOrCreateStatementService(url)` is the main entry point that ties parsing to channel creation. +> - `ojp-jdbc-driver` — [`UrlParser`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/UrlParser.java): `parseUrlWithDataSource(url)` handles single-node URL parsing and datasource name extraction. +> - `ojp-jdbc-driver` — [`Driver.connect(url, info)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Driver.java): the JDBC driver entry point that calls both parsers and dispatches to single-node or multinode paths. + --- ## 3. Client Identity @@ -128,6 +139,9 @@ All three parts are mandatory: - The server uses `clientUUID` to group all sessions from the same client process. - Do not persist `clientUUID` across process restarts; each new process should generate a fresh UUID. 
+> **Reference implementation:** +> - `ojp-jdbc-driver` — [`ClientUUID`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/ClientUUID.java): `getUUID()` returns the static, process-scoped UUID that is generated once at class-loading time via `UUID.randomUUID()`. + --- ## 4. Connection Establishment and connHash Caching @@ -171,6 +185,13 @@ When any gRPC call returns `Status.NOT_FOUND`, the server has lost its in-memory 4. Retry the original failed operation once with the new `SessionInfo`. 5. This retry is only safe if the original request had no active `sessionUUID` (no open transaction). If a session was in progress, surface the error to the caller — the transaction state is permanently lost. +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.connect()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): orchestrates first-connect vs. cache-hit logic; calls `connectToAllServers()` for the real RPC path and `buildLocalSessionInfo()` for the cache-hit path. +> - `MultinodeConnectionManager.computeConnectionKey()`: builds the `url|user|password|datasourceName` cache key. +> - `MultinodeConnectionManager.invalidateConnHash()`: removes the stale key from `connHashByConnectionKey` on `NOT_FOUND`. +> - `MultinodeConnectionManager.reconnectForConnHash()`: re-issues the real `connect()` RPC using stored `ConnectionDetails` and updates the cache. +> - `MultinodeConnectionManager.buildLocalSessionInfo()`: constructs the in-memory `SessionInfo` for cache-hit connections without an RPC call. + --- ## 5. Session Management @@ -195,6 +216,12 @@ When any gRPC call returns `Status.NOT_FOUND`, the server has lost its in-memory - On connection close: call `terminateSession(SessionInfo)`. This is mandatory for releasing server-side resources, especially in multinode deployments where multiple servers may hold pools. 
- If `sessionStatus == SESSION_TERMINATED` is received, treat the connection as closed and do not make further calls on it. +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`Connection`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Connection.java): holds the mutable `session` field (`SessionInfo`); `close()` calls `terminateSession(session)` and nulls the session; `checkValid()` guards every method against a closed or force-invalidated connection. +> - `ojp-jdbc-driver` — [`MultinodeStatementService.withClusterHealth()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeStatementService.java): enriches outgoing `SessionInfo` with the current cluster health string before each RPC. +> - `MultinodeStatementService.checkAndBindSession()`: updates the stickiness map whenever the server returns a new or changed `sessionUUID`. +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.terminateSession()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): forwards `terminateSession` to every server that received a `connect()` for this `connHash`. + --- ## 6. Session Stickiness @@ -216,6 +243,13 @@ A session binding is created or updated in these cases: - A response contains a `sessionUUID` that was not present in the request (first assignment). - The `targetServer` field in a response differs from the currently recorded binding (re-binding after a recovery; log a warning). +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.affinityServer(sessionKey)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): returns the bound server for a `sessionUUID`, or selects a new one via load balancing when no binding exists yet; throws `SQLException` if the bound server is unhealthy. 
+> - `MultinodeConnectionManager.bindSession(sessionUUID, targetServer)`: records the `sessionUUID → host:port` mapping in `sessionToServerMap`. +> - `MultinodeConnectionManager.getBoundTargetServer(sessionUUID)`: reads the current binding. +> - `MultinodeConnectionManager.unbindSession(sessionUUID)`: removes the binding on session close. +> - `ojp-jdbc-driver` — [`SessionTracker`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/SessionTracker.java): maintains per-server session counts used by the load-balancer and redistribution logic. + --- ## 7. Load Balancing @@ -238,6 +272,12 @@ Server selection runs on every new connection attempt (non-XA, first `connect()` Only servers whose `isHealthy() == true` are eligible for selection. If no healthy servers exist, raise a connection error. +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.selectHealthyServer()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): the entry point that dispatches to one of the two strategies based on config. +> - `MultinodeConnectionManager.selectByLeastConnections(healthyServers)`: picks the server with the lowest active-session count; falls back to round-robin on a tie. +> - `MultinodeConnectionManager.selectByRoundRobin(healthyServers)`: atomically increments `roundRobinCounter` and selects `servers[counter % size]`. +> - `ojp-jdbc-driver` — [`ServerEndpoint`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/ServerEndpoint.java): holds `isHealthy`, `lastFailureTime`, host, and port state for each endpoint. + --- ## 8. Failover @@ -272,6 +312,13 @@ Connection-level gRPC errors indicate that the server is unreachable. The follow - Pool exhaustion — surface directly to caller. - Session-invalidation errors (session lost after server failure) — surface directly to caller; the caller must re-establish the session. 
+> **Reference implementation:** +> - `ojp-jdbc-driver` — [`GrpcExceptionHandler.isConnectionLevelError()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/GrpcExceptionHandler.java): classifies a `StatusRuntimeException` as a connectivity failure vs. a SQL/business error. +> - `GrpcExceptionHandler.isPoolNotFoundException()`: returns `true` for `NOT_FOUND`, triggering reconnect rather than failover. +> - `GrpcExceptionHandler.isSessionInvalidationError()`: returns `true` when the server indicates the session is gone. +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.handleServerFailure(endpoint, exception)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): marks the server unhealthy and timestamps the failure. +> - `MultinodeStatementService.executeOpResultWithSessionStickinessAndBinding()`: the retry loop that catches `StatusRuntimeException`, calls `isConnectionLevelError`, drives the server-selection retry cycle, and calls `handleServerFailure` on each failed attempt. + --- ## 9. Health Checking @@ -307,6 +354,12 @@ For each currently unhealthy server, check if enough time has passed since the l | `ojp.health.check.timeout` | 5000 ms | Maximum time for a single probe call | | `ojp.redistribution.enabled` | `true` | Whether to run the periodic health checker at all | +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.performHealthCheck()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): the scheduled task body; implements the two-phase check (probe healthy servers, then probe unhealthy ones) and triggers recovery. 
+> - `ojp-jdbc-driver` — [`HealthCheckValidator.validateServer(endpoint)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/HealthCheckValidator.java): performs a single lightweight probe; `validateServer(endpoint, connectionDetails)` performs the full-validation probe with real credentials followed by `terminateSession`. +> - `ojp-jdbc-driver` — [`HealthCheckConfig`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/HealthCheckConfig.java): POJO holding `healthCheckIntervalMs`, `healthCheckThresholdMs`, `healthCheckTimeoutMs`, and `redistributionEnabled`. +> - `MultinodeConnectionManager` constructor: schedules `performHealthCheck` on a `ScheduledExecutorService` at the configured interval. + --- ## 10. Connection Redistribution on Recovery @@ -326,6 +379,12 @@ When a failed server comes back online, rebalance client-side connections so tha - Close a fraction of idle connections on over-loaded servers so they are returned to the pool, then re-opened — the client's load-balancing layer will route the re-opens to the least-loaded server (including the recovered one). - Honour the configurable fraction (`ojp.redistribution.idleRebalanceFraction`, default 1.0) and max-close-per-cycle limit (`ojp.redistribution.maxClosePerRecovery`, default 100). +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.reinitializePoolOnRecoveredServer(recoveredServer)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): iterates `connectionDetailsByConnHash` and calls `connect()` on the recovered server for each stored `ConnectionDetails` before marking it healthy. +> - `ojp-jdbc-driver` — [`ConnectionRedistributor.rebalance(recoveredServers, allHealthyServers)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/ConnectionRedistributor.java): closes a fraction of idle connections on over-loaded servers for non-XA mode. 
+> - `ojp-jdbc-driver` — [`XAConnectionRedistributor.rebalance(recoveredServers, allHealthyServers)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/XAConnectionRedistributor.java): equivalent redistribution for XA connections. +> - `ojp-jdbc-driver` — [`ConnectionTracker`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/ConnectionTracker.java): maintains the per-server `Connection` list consulted by `ConnectionRedistributor`. + --- ## 11. Cluster Health Propagation @@ -354,6 +413,11 @@ generate_cluster_health(endpoints): ) ``` +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.generateClusterHealth()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): builds the semicolon-delimited health string from `serverEndpoints`. +> - `MultinodeConnectionManager.pushClusterHealthToAllHealthyServers()`: broadcasts an updated `ConnectionDetails` (with new `clusterHealth`) to every healthy server via `connect()`. +> - `MultinodeStatementService.withClusterHealth(sessionInfo)`: attaches the current health string to an outgoing `SessionInfo` before each RPC. + --- ## 12. Transaction Management (non-XA) @@ -380,6 +444,13 @@ Always replace the local `SessionInfo` with the one returned by these calls. - Get isolation level via `callResource` with `CallType.CALL_GET`, resource name `"TransactionIsolation"`. - The isolation level must be reset to the default after each logical connection is returned to a pool (if the client integrates with a connection pool). +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`Connection.setAutoCommit(boolean)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Connection.java): calls `commitTransaction` when switching on and `startTransaction` when switching off; updates the local `session` field from each response. 
+> - `Connection.commit()` / `Connection.rollback()`: delegate to `statementService.commitTransaction(session)` / `rollbackTransaction(session)` when `autoCommit == false`. +> - `Connection.close()`: calls `terminateSession(session)` unconditionally. +> - `Connection.setTransactionIsolation(level)` / `getTransactionIsolation()`: forwarded via `callProxy(CallType.CALL_SET/GET, "TransactionIsolation", ...)`. +> - `ojp-jdbc-driver` — [`StatementServiceGrpcClient.startTransaction()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java) / `commitTransaction()` / `rollbackTransaction()`: the single-node gRPC calls. + --- ## 13. Savepoints @@ -410,6 +481,11 @@ Call `callResource` with: - `resourceUUID = ` - `target.callType = CALL_RELEASE` +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`Connection.setSavepoint()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Connection.java) / `setSavepoint(name)`: calls `callProxy` with `CALL_SET`, `"Savepoint"`, and the optional name; wraps the returned resource UUID in a [`Savepoint`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Savepoint.java) object. +> - `Connection.rollback(Savepoint)`: calls `callProxy` with `CALL_ROLLBACK`, `"Savepoint"`, and the savepoint's resource UUID. +> - `Connection.releaseSavepoint(Savepoint)`: calls `callProxy` with `CALL_RELEASE`. + --- ## 14. XA / Distributed Transactions @@ -452,6 +528,12 @@ On the response to `xaStart`, record the `sessionUUID → targetServer` binding - `xaSetTransactionTimeout(seconds)` and `xaGetTransactionTimeout()` are straightforward pass-throughs to the server. - `xaIsSameRM` checks whether two `SessionInfo` objects originate from the same resource manager (same server). 
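
The session binding and `xaIsSameRM` behaviour described above can be sketched in a few lines. This is an illustrative sketch only; the class, method, and map names are hypothetical stand-ins for the driver's internal sessionUUID → server bookkeeping, not the real API:

```java
import java.util.HashMap;
import java.util.Map;

public class XaBindingSketch {
    // sessionUUID -> server endpoint, recorded from each xaStart response
    private final Map<String, String> sessionToServer = new HashMap<>();

    public void recordBinding(String sessionUUID, String targetServer) {
        sessionToServer.put(sessionUUID, targetServer);
    }

    // xaIsSameRM: two sessions share a resource manager iff both are bound
    // to the same (known) OJP server endpoint.
    public boolean isSameRM(String sessionA, String sessionB) {
        String a = sessionToServer.get(sessionA);
        return a != null && a.equals(sessionToServer.get(sessionB));
    }
}
```

A client in any language only needs the equivalent of this map, keyed by session UUID, to answer `xaIsSameRM` locally without a server round trip.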
+> **Reference implementation:** +> - `ojp-jdbc-driver` — [`OjpXAResource`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/xa/OjpXAResource.java): implements `XAResource`; all 10 lifecycle methods (`start`, `end`, `prepare`, `commit`, `rollback`, `recover`, `forget`, `setTransactionTimeout`, `getTransactionTimeout`, `isSameRM`); contains the `xaStart` retry loop and the `toXidProto` / `fromXidProto` conversion helpers. +> - `ojp-jdbc-driver` — [`OjpXAConnection`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/xa/OjpXAConnection.java): creates the XA-mode `StatementService` connection (always calling the server, never cache-hit) and vends `OjpXAResource`. +> - `ojp-jdbc-driver` — [`OjpXADataSource`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/xa/OjpXADataSource.java): entry point for XA; calls `MultinodeConnectionManager.connectXA()` to pin the session to a single server. +> - `ojp-jdbc-driver` — [`StatementServiceGrpcClient.xaStart()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java) … `xaIsSameRM()`: the 10 single-node gRPC stub wrappers. + --- ## 15. Statement Execution @@ -485,6 +567,11 @@ StatementRequest { - Use `executeQuery` for SELECT — returns a server-streaming response. Consume the first `OpResult` to get the initial batch; call `fetchNextRows` for subsequent pages (see §18). - After any execution, update the local `SessionInfo` from the `OpResult.session` field. +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`Statement`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Statement.java): `executeQuery(sql)` → `statementService.executeQuery(...)`; `executeUpdate(sql)` → `statementService.executeUpdate(...)`; holds `statementUUID` (assigned lazily); `execute(sql)` handles the dual-result case. 
+> - `ojp-jdbc-driver` — [`PreparedStatement`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/PreparedStatement.java): accumulates parameters in a `SortedMap`; `executeQuery()` and `executeUpdate()` pass the full param map to `statementService`; all 28 `setXxx(index, value)` methods map to the corresponding `ParameterType` (see §16). +> - `ojp-jdbc-driver` — [`CallableStatement`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/CallableStatement.java): issues `callResource(CALL_PREPARE)` on construction; retrieves OUT/INOUT values via `callResource(CALL_EXECUTE)` after execution. + --- ## 16. Parameter Type Mapping @@ -550,6 +637,13 @@ Example: `BigDecimal("123.45")` → `"12345 2"`. `url_value`, `rowid_value`, `uuid_value`, `biginteger_value`, `rowidlifetime_value` are all `google.protobuf.StringValue` (a wrapper message). An absent (unset) wrapper means SQL NULL. An empty string inside the wrapper is a valid non-null value. +> **Reference implementation:** +> - `ojp-grpc-commons` — [`ProtoConverter.toProto(Parameter)`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/ProtoConverter.java): converts a host-language `Parameter` object to `ParameterProto`; `fromProto(ParameterProto)` is the inverse. +> - `ProtoConverter.toParameterValue(Object value)`: the central dispatcher that routes each Java type to the correct `ParameterValue` oneof field. +> - `ProtoConverter.fromParameterValue(ParameterValue, ParameterType)`: decodes a wire value back to a Java object using both the value and the declared type as hints. +> - `ojp-grpc-commons` — [`ProtoTypeConverters`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/ProtoTypeConverters.java): `uuidToProto(UUID)` / `uuidFromProto(StringValue)`, `urlToProto(URL)` / `urlFromProto(StringValue)`, `rowIdToProto(RowId)` / `rowIdBytesFromProto(StringValue)` — handles the presence-aware `StringValue` wrappers for UUID, URL, and RowId. 
+> - `ojp-grpc-commons` — [`BigDecimalWire`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/BigDecimalWire.java): `writeBigDecimal` / `readBigDecimal` — binary wire encoding for BigDecimal (also see `documents/protocol/BIGDECIMAL_WIRE_FORMAT.md`). + --- ## 17. Temporal Type Handling @@ -602,6 +696,18 @@ On the receiving side, use `original_type` to reconstruct the correct host-langu The OJP server must always run with `user.timezone=UTC`. Client libraries should also normalise to UTC when encoding timestamps, using the `timezone` field to carry the original zone for faithful reconstruction. +> **Reference implementation:** +> - `ojp-grpc-commons` — [`TemporalConverter`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/TemporalConverter.java): the definitive encoding/decoding reference for all temporal types: +> - `toTimestampWithZone(java.sql.Timestamp, ZoneId)` / `fromTimestampWithZone(TimestampWithZone)` — `Timestamp` ↔ `TimestampWithZone`. +> - `calendarToTimestampWithZone(Calendar)` / `timestampWithZoneToCalendar(TimestampWithZone)` — `Calendar`. +> - `offsetDateTimeToTimestampWithZone` / `timestampWithZoneToOffsetDateTime` — `OffsetDateTime`. +> - `localDateTimeToTimestampWithZone` / `timestampWithZoneToLocalDateTime` — `LocalDateTime`. +> - `instantToTimestampWithZone` / `timestampWithZoneToInstant` — `Instant`. +> - `localDateToProtoDate(LocalDate)` / `protoDateToLocalDate(Date)` — `LocalDate` ↔ `google.type.Date`. +> - `localTimeToProtoTimeOfDay(LocalTime)` / `protoTimeOfDayToLocalTime(TimeOfDay)` — `LocalTime` ↔ `google.type.TimeOfDay`. +> - `offsetTimeToTimestampWithZone` / `timestampWithZoneToOffsetTime` — `OffsetTime`. +> - `fromTimestampWithZoneToObject(TimestampWithZone)`: the unified decoder that uses `TemporalType` to reconstruct the original type. + --- ## 18. 
Result Set and Streaming @@ -640,6 +746,12 @@ Scrollable result sets support cursor positioning through `callResource` with `R | `previous()` | `CALL_PREVIOUS` | | `close()` | `CALL_CLOSE` | +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`ResultSet`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/ResultSet.java): `next()` drives the multi-block iteration; `setNextOpResult()` loads a new batch from the iterator; `nextWithSessionUpdate()` updates the session from each block. All `getXxx(columnIndex)` methods call `ProtoConverter.fromParameterValue()` on the column's `ParameterValue`. +> - `ojp-jdbc-driver` — [`RemoteProxyResultSet`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/RemoteProxyResultSet.java): base class holding `resultSetUUID` and `statementService`; all scrollable-cursor operations issue `callResource(RES_RESULT_SET, CALL_FIRST/LAST/ABSOLUTE/…)`. +> - `ojp-jdbc-driver` — [`StatementServiceGrpcClient.fetchNextRows(sessionInfo, resultSetUUID, size)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java): the RPC that fetches the next page. +> - `ojp-grpc-commons` — [`ProtoConverter.fromProto(OpQueryResultProto)`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/ProtoConverter.java): deserialises the initial `OpQueryResult` (labels + rows + resultSetUUID). + --- ## 19. LOB (Large Object) Handling @@ -695,6 +807,13 @@ Receive a server-streaming response of `LobDataBlock` messages. Concatenate the LOB handles are server-side objects. A connection that has an open LOB must remain bound to the same server (§6). Do not reroute such connections during failover; instead surface the error to the caller. 
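
The chunking side of the client-streaming upload can be sketched as follows. `Block` is a hypothetical stand-in for the `LobDataBlock` proto message, and the real call feeds blocks into a gRPC stream rather than collecting them in a list:

```java
import java.util.ArrayList;
import java.util.List;

public class LobChunker {
    // Illustrative stand-in for the LobDataBlock proto message.
    public record Block(long position, byte[] data) {}

    // Split a payload into fixed-size blocks, mirroring the client-streaming
    // createLob call: each block carries the 1-based byte position at which
    // the server must write it.
    public static List<Block> chunk(byte[] payload, int blockSize) {
        List<Block> blocks = new ArrayList<>();
        for (int off = 0; off < payload.length; off += blockSize) {
            int len = Math.min(blockSize, payload.length - off);
            byte[] data = new byte[len];
            System.arraycopy(payload, off, data, 0, len);
            blocks.add(new Block(off + 1L, data)); // JDBC LOB positions are 1-based
        }
        return blocks;
    }
}
```

The download path is the inverse: concatenate the `data` of each received block in stream order.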
+> **Reference implementation:** +> - `ojp-jdbc-driver` — [`LobServiceImpl`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/LobServiceImpl.java): `sendBytes(lobType, pos, inputStream)` opens the client-streaming `createLob` call, chunks the data into `LobDataBlock` messages, and returns the `LobReference`. `parseReceivedBlocks(Iterator)` reassembles chunks from a `readLob` stream into an `InputStream`. +> - `ojp-jdbc-driver` — [`StatementServiceGrpcClient.createLob(connection, iterator)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java): the client-streaming gRPC call; uses an async stub and a `CountDownLatch` to bridge the streaming API back to a synchronous return value. +> - `StatementServiceGrpcClient.readLob(lobReference, pos, length)`: the server-streaming gRPC call that returns an `Iterator`. +> - `ojp-jdbc-driver` — [`Blob`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Blob.java): `getBytes(pos, length)` and `getBinaryStream()` call `readLob`; `setBytes(pos, bytes)` calls `sendBytes`. [`Clob`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Clob.java) mirrors the same pattern for character data. +> - `ojp-jdbc-driver` — [`BinaryStream`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/BinaryStream.java): streams binary content directly via `createLob` without materialising the full byte array. + --- ## 20. CallResource Protocol @@ -752,6 +871,11 @@ Always update the local `SessionInfo` from `response.session`. 
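
A minimal host-language sketch of the request envelope follows. The record and enum types here are illustrative stand-ins for the generated proto classes, not the real stubs:

```java
public class CallResourceSketch {
    public enum ResourceType { RES_CONNECTION, RES_RESULT_SET }
    public enum CallType { CALL_GET, CALL_SET, CALL_ROLLBACK, CALL_RELEASE }

    public record Target(CallType callType, String resourceName, String resourceUUID) {}
    public record CallResourceRequest(ResourceType resourceType, Target target) {}

    // §12: read the current transaction isolation level from the connection.
    public static CallResourceRequest getTransactionIsolation() {
        return new CallResourceRequest(ResourceType.RES_CONNECTION,
                new Target(CallType.CALL_GET, "TransactionIsolation", null));
    }

    // §13: roll back to a previously created savepoint by its resource UUID.
    public static CallResourceRequest rollbackToSavepoint(String savepointUUID) {
        return new CallResourceRequest(ResourceType.RES_CONNECTION,
                new Target(CallType.CALL_ROLLBACK, "Savepoint", savepointUUID));
    }
}
```

The same envelope shape serves every `callResource` use in this document; only the resource type, call type, resource name, and optional UUID vary.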
`CALL_SET`, `CALL_GET`, `CALL_IS`, `CALL_ALL`, `CALL_NULLS`, `CALL_USES`, `CALL_SUPPORTS`, `CALL_STORES`, `CALL_NULL`, `CALL_DOES`, `CALL_DATA`, `CALL_NEXT`, `CALL_CLOSE`, `CALL_WAS`, `CALL_CLEAR`, `CALL_FIND`, `CALL_BEFORE`, `CALL_AFTER`, `CALL_FIRST`, `CALL_LAST`, `CALL_ABSOLUTE`, `CALL_RELATIVE`, `CALL_PREVIOUS`, `CALL_ROW`, `CALL_UPDATE`, `CALL_INSERT`, `CALL_DELETE`, `CALL_REFRESH`, `CALL_CANCEL`, `CALL_MOVE`, `CALL_OWN`, `CALL_OTHERS`, `CALL_UPDATES`, `CALL_DELETES`, `CALL_INSERTS`, `CALL_LOCATORS`, `CALL_AUTO`, `CALL_GENERATED`, `CALL_RELEASE`, `CALL_NATIVE`, `CALL_PREPARE`, `CALL_ROLLBACK`, `CALL_ABORT`, `CALL_EXECUTE`, `CALL_ADD`, `CALL_ENQUOTE`, `CALL_REGISTER`, `CALL_LENGTH` +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`StatementServiceGrpcClient.callResource(CallResourceRequest)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java): the single-node gRPC call. +> - `ojp-jdbc-driver` — [`DatabaseMetaData`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/DatabaseMetaData.java): every `DatabaseMetaData` method (>200 in total) is implemented by calling `callResource` with `RES_CONNECTION` and the appropriate `CallType` (e.g., `CALL_GET` for `getURL()`, `CALL_SUPPORTS` for `supportsXxx()`, `CALL_STORES` for `storesXxx()`). The private helper `newCallBuilder()` creates the skeleton `CallResourceRequest`. +> - `ojp-jdbc-driver` — `Connection.callProxy(callType, resourceName, returnType, params)`: the private convenience wrapper used throughout `Connection` and `DatabaseMetaData` to issue `callResource` calls without building the full request proto by hand. + --- ## 21. 
Error and Exception Mapping @@ -785,6 +909,12 @@ Map to the host language's exception hierarchy: | Session invalidated (server failure) | Session-not-found message | Throw session-lost error; do not retry; let caller decide | | Session stickiness violation (server down) | Local check before RPC | Throw connection error immediately; do not reroute | +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`GrpcExceptionHandler.handle(StatusRuntimeException)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/GrpcExceptionHandler.java): extracts `SqlErrorResponse` from gRPC trailing metadata on `Status.INTERNAL` and throws the appropriate `SQLException` with SQL state and vendor code. +> - `GrpcExceptionHandler.isPoolNotFoundException(exception)`: returns `true` for `NOT_FOUND`. +> - `GrpcExceptionHandler.isSessionInvalidationError(exception)`: returns `true` for session-invalidation error messages. +> - `GrpcExceptionHandler.isConnectionLevelError(exception)`: returns `true` for `UNAVAILABLE`, `DEADLINE_EXCEEDED`, and connection-related `UNKNOWN` errors. + --- ## 22. Configuration System @@ -832,6 +962,12 @@ Duration values support the following suffixes: - `s` — seconds (e.g. `10s`) - `m` — minutes (e.g. `2m`) +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`DatasourcePropertiesLoader`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/DatasourcePropertiesLoader.java): `loadOjpPropertiesForDataSource(datasourceName)` merges file properties, system properties, and environment variables with per-datasource prefix resolution. `loadOjpProperties()` loads the base `ojp.properties` file from the classpath. +> - `ojp-jdbc-driver` — [`HealthCheckConfig`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/HealthCheckConfig.java): the strongly-typed POJO that holds all health-check and redistribution settings, populated by `MultinodeUrlParser` from the loaded `Properties`. 
+> - `ojp-jdbc-driver` — [`MultinodeUrlParser.readIntProperty(props, key, default)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeUrlParser.java) / `readLongProperty(...)`: reads typed values from the merged `Properties` object. +> - `ojp-grpc-commons` — [`GrpcClientConfig.load()`](../../ojp-grpc-commons/src/main/java/org/openjproxy/config/GrpcClientConfig.java): loads the gRPC-specific settings (max inbound message size, TLS config) from `ojp.properties`. + --- ## 23. Query Result Caching @@ -862,6 +998,9 @@ ojp.cache.queries.2.ttl=300 ojp.cache.queries.2.invalidateOn=users ``` +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`CacheConfigurationBuilder.addCachePropertiesToMap(propertiesMap, datasourceName)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/CacheConfigurationBuilder.java): reads cache rules from the loaded `Properties` and appends them to the `ConnectionDetails.properties` map that is sent to the server on `connect()`. `parseDurationToSeconds(duration)` handles the same duration format as §22. + --- ## 24. Security / Transport @@ -882,6 +1021,11 @@ When `ojp.grpc.tls.enabled = true`, create a TLS-secured channel: - Passwords must never be logged or included in exception messages. - Connection keys used for cache lookups (§4) may include the password as a cache key only — they must not be serialised or persisted. +> **Reference implementation:** +> - `ojp-grpc-commons` — [`GrpcChannelFactory.createChannel(host, port)`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/GrpcChannelFactory.java): creates a plaintext `ManagedChannel` with configurable max inbound message size; `createSecureChannel(host, port, size, tlsConfig)` builds the TLS-secured variant; `buildSslContext(tlsConfig)` sets up Netty's `SslContext` from the certificate paths. 
+> - `ojp-grpc-commons` — [`GrpcClientConfig`](../../ojp-grpc-commons/src/main/java/org/openjproxy/config/GrpcClientConfig.java): loaded by `GrpcClientConfig.load()` from `ojp.properties`; exposes `getTlsConfig()` and `getMaxInboundMessageSize()`. +> - `ojp-grpc-commons` — [`TlsConfig`](../../ojp-grpc-commons/src/main/java/org/openjproxy/config/TlsConfig.java): holds `enabled`, `certPath`, `keyPath`, `caPath`, and `clientAuth` flags. + --- ## 25. DataSource / Integration API @@ -903,6 +1047,11 @@ For Java/Spring Boot: For other languages, document clearly in the library README that the application-side connection pool must be disabled when using OJP. +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`OjpDataSource`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/OjpDataSource.java): implements `javax.sql.DataSource`; `getConnection()` / `getConnection(user, password)` delegate to `DriverManager.getConnection(url, info)` which invokes the registered `Driver`. +> - `ojp-jdbc-driver` — [`OjpXADataSource`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/xa/OjpXADataSource.java): implements `javax.sql.XADataSource`; `getXAConnection()` creates an `OjpXAConnection` (and thus an `OjpXAResource`) for JTA integration. +> - `spring-boot-starter-ojp` module: provides the Spring Boot auto-configuration class and the `OjpSystemPropertiesBridge` bean; sets `spring.datasource.type=OjpDataSource` and excludes `DataSourceAutoConfiguration` to prevent double-pooling. + --- ## 26. Testing Coverage @@ -1040,6 +1189,36 @@ Each database must have a dedicated test class gated by its own flag. The class H2 tests (in-process, no external dependency) must always be runnable in CI without any extra setup and should act as the first gate before any database-specific jobs run. 
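
A gating helper for database-specific test classes might look like the sketch below. The flag naming convention shown is hypothetical; the only requirement from this section is that H2 is always on and every other database is opt-in:

```java
public class TestGate {
    // H2 runs in-process and needs no external database, so it is always
    // enabled; every other database is opt-in behind a flag (flag naming
    // here is hypothetical, e.g. -DenablePostgresTests=true).
    public static boolean enabled(String database) {
        if ("h2".equalsIgnoreCase(database)) {
            return true;
        }
        String flag = "enable"
                + Character.toUpperCase(database.charAt(0))
                + database.substring(1).toLowerCase()
                + "Tests";
        return Boolean.parseBoolean(System.getProperty(flag, "false"));
    }
}
```

In a test framework this check would live in a class-level guard (a skip condition or conditional annotation) so the whole class is skipped when the flag is absent.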
+> **Reference implementation — test classes by area:** +> +> | Test area | Java test class(es) | +> |---|---| +> | Basic CRUD | [`BasicCrudIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/BasicCrudIntegrationTest.java) | +> | Multiple data types | `H2MultipleTypesIntegrationTest`, `PostgresMultipleTypesIntegrationTest`, `MySQLMultipleTypesIntegrationTest`, `OracleMultipleTypesIntegrationTest`, `SQLServerMultipleTypesIntegrationTest`, `Db2MultipleTypesIntegrationTest`, `CockroachDBMultipleTypesIntegrationTest`, `MariaDBMultipleTypesIntegrationTest` | +> | Statement variants | `H2StatementExtensiveTests`, `H2PreparedStatementExtensiveTests` (and per-DB equivalents) | +> | ResultSet navigation / metadata | `H2ResultSetTest` (and per-DB), `H2ResultSetMetaDataExtensiveTests`, `H2ReadMultipleBlocksOfDataIntegrationTest` | +> | DatabaseMetaData | `H2DatabaseMetaDataExtensiveTests`, `H2ConnectionExtensiveTests` (and per-DB) | +> | Transactions | `H2ConnectionExtensiveTests`, [`TransactionIsolationResetTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/TransactionIsolationResetTest.java) | +> | Savepoints | `H2SavepointTests` (and per-DB `*SavepointTests`) | +> | XA transactions | [`PostgresXAIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/PostgresXAIntegrationTest.java), `MySQLXAIntegrationTest`, `MariaDBXAIntegrationTest`, `OracleXAIntegrationTest`, `SqlServerXAIntegrationTest`, `Db2XAIntegrationTest`, [`XASessionInvalidationTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/XASessionInvalidationTest.java) | +> | LOBs | [`BlobIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/BlobIntegrationTest.java), [`BinaryStreamIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/BinaryStreamIntegrationTest.java), [`HydratedLobValidationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/HydratedLobValidationTest.java) (and per-DB `*Blob*` / `*BinaryStream*`) | +> | 
Session affinity | [`H2SessionAffinityIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/H2SessionAffinityIntegrationTest.java) (and per-DB `*SessionAffinity*`) | +> | Multi-block result sets | `H2ReadMultipleBlocksOfDataIntegrationTest` (and per-DB) | +> | Multinode load balancing | [`LoadAwareServerSelectionTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/LoadAwareServerSelectionTest.java), [`MultinodeIntegrationTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeIntegrationTest.java) | +> | Multinode failover | [`MultinodeFailoverTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeFailoverTest.java), [`MultinodeConnectionManagerErrorHandlingTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeConnectionManagerErrorHandlingTest.java) | +> | Multinode recovery / redistribution | [`MultinodeRecoveryTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeRecoveryTest.java) | +> | XA multinode | [`MultinodeXAIntegrationTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeXAIntegrationTest.java) | +> | connHash caching | [`ConnectRpcSkipOptimisationTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/ConnectRpcSkipOptimisationTest.java), [`UnifiedConnectionModeTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/UnifiedConnectionModeTest.java) | +> | Session stickiness error path | [`MultinodeTargetServerBindingTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeTargetServerBindingTest.java), `MultinodeStatementServiceTest` | +> | Cluster health propagation | [`MultinodeConnectionManagerClusterHealthTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeConnectionManagerClusterHealthTest.java) | +> | Concurrency / pool exhaustion | 
[`ConcurrencyTimeoutTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/ConcurrencyTimeoutTest.java) | +> | Multi-datasource | [`MultiDataSourceIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/MultiDataSourceIntegrationTest.java), [`MultiDataSourceConfigurationTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/jdbc/MultiDataSourceConfigurationTest.java) | +> | Configuration loading | [`DatasourcePropertiesLoaderSystemPropertyTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/jdbc/DatasourcePropertiesLoaderSystemPropertyTest.java), [`DatasourcePropertiesLoaderEnvironmentTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/jdbc/DatasourcePropertiesLoaderEnvironmentTest.java) | +> | URL parsing | [`MultinodeUrlParserTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeUrlParserTest.java), [`UrlParserTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/jdbc/UrlParserTest.java), [`DriverMultinodeUrlTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/jdbc/DriverMultinodeUrlTest.java) | +> | DataSource API | [`OjpDataSourceTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/jdbc/OjpDataSourceTest.java), [`OjpXADataSourceTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/jdbc/xa/OjpXADataSourceTest.java) | +> | Health check config | [`HealthCheckConfigTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/HealthCheckConfigTest.java), [`MultinodeRetryConfigTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeRetryConfigTest.java) | +> | Session tracker unit | [`SessionTrackerTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/SessionTrackerTest.java) | + --- ## Appendix A — Proto file locations From bd1c1bde0da2f6a06da4a1a916617e23a0177c5e Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Sun, 19 Apr 2026 16:12:41 +0000 Subject: [PATCH 03/12] docs: update spec for 
recent failover/rebalancing fixes (PRs #443-#449) Agent-Logs-Url: https://github.com/Open-J-Proxy/ojp/sessions/3bdf2786-d9ee-4f1b-a09a-a5b069f4f53f Co-authored-by: rrobetti <7221783+rrobetti@users.noreply.github.com> --- .../multi-language-client-spec/CLIENT_SPEC.md | 42 +++++++++++++------ 1 file changed, 29 insertions(+), 13 deletions(-) diff --git a/documents/multi-language-client-spec/CLIENT_SPEC.md b/documents/multi-language-client-spec/CLIENT_SPEC.md index 928d86f7f..2b7228f08 100644 --- a/documents/multi-language-client-spec/CLIENT_SPEC.md +++ b/documents/multi-language-client-spec/CLIENT_SPEC.md @@ -292,15 +292,20 @@ Connection-level gRPC errors indicate that the server is unreachable. The follow | `DEADLINE_EXCEEDED` | Yes | | `UNKNOWN` (with "connection" in message) | Yes | | `INTERNAL` with SQL metadata trailers | **No** — this is a database-level error | +| `INTERNAL` without SQL metadata trailers | Yes — treated as a transport-level failure | | `NOT_FOUND` | **No** — triggers reconnect (see §4), not failover | | `RESOURCE_EXHAUSTED` (pool exhaustion) | **No** — surface to caller | +| `CANCELLED` | **No** — this is a client-initiated cancellation signal; must never mark a server unhealthy | | Any `SQLException` from server | **No** | ### Failover procedure 1. When a connectivity error is detected on a server: - a. Mark the server unhealthy (`isHealthy = false`), recording the failure timestamp. - b. Log the failure. + a. Capture whether the server was previously healthy (`wasHealthy`). + b. Mark the server unhealthy (`isHealthy = false`), recording the failure timestamp. + c. Log the failure. + d. If this is a genuine healthy → unhealthy transition (`wasHealthy == true`), submit `pushClusterHealthToAllHealthyServers()` asynchronously to the background scheduler so surviving servers resize their pools immediately. The push is submitted (not called inline) to avoid blocking the query thread. + e. 
Shut down the gRPC channel for the failed server gracefully (allow in-flight calls to drain, then discard). 2. Select the next healthy server (using the configured strategy, excluding the failed server and any already attempted in this retry cycle). 3. Retry the operation on the new server. 4. If all servers have been attempted and all failed, raise a connection error to the caller. @@ -313,10 +318,10 @@ Connection-level gRPC errors indicate that the server is unreachable. The follow - Session-invalidation errors (session lost after server failure) — surface directly to caller; the caller must re-establish the session. > **Reference implementation:** -> - `ojp-jdbc-driver` — [`GrpcExceptionHandler.isConnectionLevelError()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/GrpcExceptionHandler.java): classifies a `StatusRuntimeException` as a connectivity failure vs. a SQL/business error. +> - `ojp-jdbc-driver` — [`GrpcExceptionHandler.isConnectionLevelError()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/GrpcExceptionHandler.java): classifies a `StatusRuntimeException` as a connectivity failure vs. a SQL/business error. `CANCELLED` is explicitly **excluded** (it is a client-side signal, not a server failure). > - `GrpcExceptionHandler.isPoolNotFoundException()`: returns `true` for `NOT_FOUND`, triggering reconnect rather than failover. > - `GrpcExceptionHandler.isSessionInvalidationError()`: returns `true` when the server indicates the session is gone. -> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.handleServerFailure(endpoint, exception)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): marks the server unhealthy and timestamps the failure. 
+> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.handleServerFailure(endpoint, exception)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): marks the server unhealthy, timestamps the failure, shuts down the gRPC channel gracefully, and — only on a genuine healthy→unhealthy transition (`wasHealthy == true`) — submits `pushClusterHealthToAllHealthyServers()` to the background `healthCheckScheduler` so the cluster health push does not block the query thread. > - `MultinodeStatementService.executeOpResultWithSessionStickinessAndBinding()`: the retry loop that catches `StatusRuntimeException`, calls `isConnectionLevelError`, drives the server-selection retry cycle, and calls `handleServerFailure` on each failed attempt. --- @@ -333,10 +338,10 @@ Run a periodic background task that checks server health. The task must: ### Two-phase check **Phase 1 — probe healthy servers (detect newly failed servers)** -For each currently healthy server, send a `connect()` with empty credentials. If the call throws any exception, mark the server unhealthy and call the server-failure handler (see §11). +Run when there are active XA sessions (`sessionToServerMap` is non-empty) **or** cached non-XA connection details (`connectionDetailsByConnHash` is non-empty). This dual guard ensures both XA and non-XA workloads trigger early failure detection. The guard prevents spurious "no healthy servers" errors before any connection has been established. For each currently healthy server that passes the guard, send a probe call. If the call fails, mark the server unhealthy and call the server-failure handler (see §8 and §11). **Phase 2 — probe unhealthy servers (detect recovery)** -For each currently unhealthy server, check if enough time has passed since the last failure (property `ojp.health.check.threshold`, default 5 000 ms). If so, probe the server. If the probe succeeds, mark it healthy and trigger recovery procedures (see §10). 
+For each currently unhealthy server, check if enough time has passed since the last failure (property `ojp.health.check.threshold`, default 5 000 ms). If so, probe the server. If the probe succeeds, run recovery (see §10). ### Health probe modes @@ -355,7 +360,7 @@ For each currently unhealthy server, check if enough time has passed since the l | `ojp.redistribution.enabled` | `true` | Whether to run the periodic health checker at all | > **Reference implementation:** -> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.performHealthCheck()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): the scheduled task body; implements the two-phase check (probe healthy servers, then probe unhealthy ones) and triggers recovery. +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.performHealthCheck()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): the scheduled task body; implements the two-phase check. Phase 1 fires when `!sessionToServerMap.isEmpty() || !connectionDetailsByConnHash.isEmpty()` (XA sessions OR non-XA cached connections). Phase 1 failure calls `pushClusterHealthToAllHealthyServers()` inline on the health-check thread. Phase 2 calls `reinitializePoolOnRecoveredServer()` before `markHealthy()`, then pushes cluster health. > - `ojp-jdbc-driver` — [`HealthCheckValidator.validateServer(endpoint)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/HealthCheckValidator.java): performs a single lightweight probe; `validateServer(endpoint, connectionDetails)` performs the full-validation probe with real credentials followed by `terminateSession`. > - `ojp-jdbc-driver` — [`HealthCheckConfig`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/HealthCheckConfig.java): POJO holding `healthCheckIntervalMs`, `healthCheckThresholdMs`, `healthCheckTimeoutMs`, and `redistributionEnabled`. 
> - `MultinodeConnectionManager` constructor: schedules `performHealthCheck` on a `ScheduledExecutorService` at the configured interval. @@ -370,8 +375,8 @@ When a failed server comes back online, rebalance client-side connections so tha ### Procedure on recovery -1. Before marking the server healthy, **proactively re-initialise pools** on the recovered server. For every cached `connHash`/`ConnectionDetails` pair, call `connect()` on the recovered server so it creates the HikariCP pool immediately. This avoids `NOT_FOUND` errors on the first SQL call routed there. -2. Mark the server healthy. +1. **Reinitialize pools on the recovered server first** (before marking healthy). Check whether any non-XA connections have been cached (`connectionDetailsByConnHash` is non-empty). If so, for every cached `connHash`/`ConnectionDetails` pair, call `connect()` on the recovered server so it creates the HikariCP pool immediately. This closes the NOT_FOUND window that would otherwise exist between marking the server healthy and the first SQL call reaching it. Only after all pools are pre-created, proceed to step 2. +2. Mark the server healthy (`endpoint.markHealthy()`). 3. Push the updated cluster health string to all healthy servers (see §11) so they can resize their pools. 4. If redistribution is enabled (`ojp.redistribution.enabled = true`), begin rebalancing: - Determine the ideal share: `totalConnections / numberOfHealthyServers`. @@ -380,7 +385,7 @@ When a failed server comes back online, rebalance client-side connections so tha - Honour the configurable fraction (`ojp.redistribution.idleRebalanceFraction`, default 1.0) and max-close-per-cycle limit (`ojp.redistribution.maxClosePerRecovery`, default 100). 
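
The rebalancing arithmetic in step 4 can be sketched as a pure function. This is a minimal illustration, assuming integer division for the ideal share; `plan_rebalance` and its parameter names are hypothetical helpers for this spec, not the driver's API:

```python
# Hypothetical helper illustrating the rebalancing arithmetic of step 4.
# conn_counts: list of (server, open_connection_count) pairs covering all
# healthy servers, including the just-recovered one (typically at 0).
def plan_rebalance(conn_counts, idle_rebalance_fraction=1.0, max_close_per_recovery=100):
    total = sum(count for _, count in conn_counts)
    ideal = total // len(conn_counts)  # ideal share per healthy server
    plan = {}
    for server, count in conn_counts:
        excess = max(0, count - ideal)  # connections above the ideal share
        # Close only idle connections, capped by the configured fraction
        # (ojp.redistribution.idleRebalanceFraction) and per-cycle limit
        # (ojp.redistribution.maxClosePerRecovery)
        to_close = min(int(excess * idle_rebalance_fraction), max_close_per_recovery)
        if to_close > 0:
            plan[server] = to_close
    return plan
```

With two loaded servers at 9 connections each and a recovered server at 0, the ideal share is 6, so 3 idle connections are closed on each loaded server; the displaced workload reconnects and lands on the recovered server through normal load balancing (§7).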
> **Reference implementation:** -> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.reinitializePoolOnRecoveredServer(recoveredServer)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): iterates `connectionDetailsByConnHash` and calls `connect()` on the recovered server for each stored `ConnectionDetails` before marking it healthy. +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.reinitializePoolOnRecoveredServer(recoveredServer)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): runs only when `!connectionDetailsByConnHash.isEmpty()`; iterates the map and calls `connect()` on the recovered server for each stored `ConnectionDetails`; always called **before** `endpoint.markHealthy()` to eliminate the NOT_FOUND window. > - `ojp-jdbc-driver` — [`ConnectionRedistributor.rebalance(recoveredServers, allHealthyServers)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/ConnectionRedistributor.java): closes a fraction of idle connections on over-loaded servers for non-XA mode. > - `ojp-jdbc-driver` — [`XAConnectionRedistributor.rebalance(recoveredServers, allHealthyServers)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/XAConnectionRedistributor.java): equivalent redistribution for XA connections. > - `ojp-jdbc-driver` — [`ConnectionTracker`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/ConnectionTracker.java): maintains the per-server `Connection` list consulted by `ConnectionRedistributor`. @@ -401,7 +406,13 @@ Each semicolon-separated segment is `host:port(STATUS)` where status is `UP` or - **Build** the cluster health string from local server endpoint health state before every `connect()` call and before every operation that carries a `SessionInfo` (by populating `SessionInfo.clusterHealth`). - **Consume** the cluster health string returned in `SessionInfo.clusterHealth` on every response. 
Update local endpoint health states accordingly: mark endpoints `DOWN` as unhealthy and endpoints `UP` as healthy (if they were previously failed). -- **Push** the updated cluster health to all currently healthy servers after a server health state change (failure or recovery). This is done by calling `connect()` on each healthy server with a `ConnectionDetails` that contains the new `clusterHealth`. The server uses this to resize its pool immediately. +- **Proactively push** the updated cluster health to all currently healthy servers whenever the topology changes (a server fails or recovers). This push happens via two independent triggers — both are necessary: + + **Trigger 1 — health-check thread**: When `performHealthCheck()` detects a newly failed server or a recovered server, it calls `pushClusterHealthToAllHealthyServers()` inline on the health-check thread. This covers the case when no SQL traffic is active at the moment of the topology change. + + **Trigger 2 — query thread**: When a SQL query thread detects server failure via `handleServerFailure()`, it submits `pushClusterHealthToAllHealthyServers()` to the background scheduler. This covers the race where the query thread marks the server unhealthy before the health checker runs (the health checker's Phase 1 loop would then skip the already-unhealthy server and never push). The push is submitted asynchronously to avoid blocking the query thread. + + The push is done by calling `connect()` on each healthy server with a `ConnectionDetails` whose `clusterHealth` field contains the new topology string. The server uses this to resize its pool immediately, regardless of whether any SQL is in flight. 
### Generation @@ -415,8 +426,10 @@ generate_cluster_health(endpoints): > **Reference implementation:** > - `ojp-jdbc-driver` — [`MultinodeConnectionManager.generateClusterHealth()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): builds the semicolon-delimited health string from `serverEndpoints`. -> - `MultinodeConnectionManager.pushClusterHealthToAllHealthyServers()`: broadcasts an updated `ConnectionDetails` (with new `clusterHealth`) to every healthy server via `connect()`. -> - `MultinodeStatementService.withClusterHealth(sessionInfo)`: attaches the current health string to an outgoing `SessionInfo` before each RPC. +> - `MultinodeConnectionManager.pushClusterHealthToAllHealthyServers()`: calls `connect()` on every healthy server with the new cluster health embedded in `ConnectionDetails`; only runs when `!connectionDetailsByConnHash.isEmpty()` (no-op until the first real connection is established). +> - `MultinodeConnectionManager.handleServerFailure()` (Trigger 2): submits `pushClusterHealthToAllHealthyServers()` to `healthCheckScheduler` on a genuine healthy→unhealthy transition so query threads are never blocked by the push. +> - `MultinodeConnectionManager.performHealthCheck()` (Trigger 1): calls `pushClusterHealthToAllHealthyServers()` directly (inline on health-check thread) after marking a server DOWN or after a recovered server is marked healthy. +> - `MultinodeStatementService.withClusterHealth(sessionInfo)`: attaches the current health string to an outgoing `SessionInfo` before each RPC (reactive secondary path). 
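
The generation and consumption rules above amount to a simple string round-trip. A runnable sketch (the helper names are illustrative; the wire format of semicolon-joined `host:port(UP|DOWN)` segments is as specified above):

```python
# Build the wire format: semicolon-separated "host:port(UP|DOWN)" segments
def format_cluster_health(endpoints):
    return ";".join(
        f"{host}:{port}({'UP' if healthy else 'DOWN'})"
        for host, port, healthy in endpoints
    )

# Parse a received clusterHealth string back into {"host:port": is_healthy}
def parse_cluster_health(health):
    states = {}
    for segment in health.split(";"):
        if not segment:
            continue  # tolerate the empty string sent on the very first connect
        host_port, status = segment.rsplit("(", 1)
        states[host_port] = status.rstrip(")") == "UP"
    return states
```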
--- @@ -905,10 +918,13 @@ Map to the host language's exception hierarchy: | Pool not found (server restarted) | `NOT_FOUND` | Invalidate connHash cache; reconnect; retry once (§4) | | Server unreachable | `UNAVAILABLE` | Failover to next server (§8) | | Request timeout | `DEADLINE_EXCEEDED` | Failover to next server (§8) | +| Client-side cancellation | `CANCELLED` | Do **not** failover; do **not** mark server unhealthy; surface to caller | | Pool exhausted | `RESOURCE_EXHAUSTED` | Throw pool-exhaustion error; do not retry; do not mark server unhealthy | | Session invalidated (server failure) | Session-not-found message | Throw session-lost error; do not retry; let caller decide | | Session stickiness violation (server down) | Local check before RPC | Throw connection error immediately; do not reroute | +> **Note:** Before this classification was established (prior to April 2026) the server incorrectly used `Status.CANCELLED` for SQL errors. The correct status is `Status.INTERNAL` with a `SqlErrorResponse` trailer. Any implementation must use `INTERNAL` for SQL errors and must not treat `CANCELLED` as a server failure. + > **Reference implementation:** > - `ojp-jdbc-driver` — [`GrpcExceptionHandler.handle(StatusRuntimeException)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/GrpcExceptionHandler.java): extracts `SqlErrorResponse` from gRPC trailing metadata on `Status.INTERNAL` and throws the appropriate `SQLException` with SQL state and vendor code. > - `GrpcExceptionHandler.isPoolNotFoundException(exception)`: returns `true` for `NOT_FOUND`. 
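
The routing rules in the table above reduce to a small pure mapping. A sketch, assuming string status-code names; `classify_grpc_error` is an illustrative helper, not the driver's API, and treating unlisted codes as connection-level failures is an assumption:

```python
# Illustrative mapping of gRPC status code -> client action (see table above).
ACTIONS = {
    "INTERNAL": "decode SqlErrorResponse trailer and raise SQL error",  # never failover
    "NOT_FOUND": "invalidate connHash cache, reconnect, retry once",    # pool lost (§4)
    "UNAVAILABLE": "failover to next server",                           # §8
    "DEADLINE_EXCEEDED": "failover to next server",                     # §8
    "CANCELLED": "surface to caller",                     # never failover; server stays healthy
    "RESOURCE_EXHAUSTED": "raise pool-exhaustion error",  # no retry; server stays healthy
}

def classify_grpc_error(status_code):
    # Assumption: codes not listed above are treated as connection-level failures
    return ACTIONS.get(status_code, "failover to next server")
```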
From 35ae38f7c94b6ec19cbec729b3d09b12b4ccf83f Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Sun, 19 Apr 2026 17:05:06 +0000 Subject: [PATCH 04/12] =?UTF-8?q?docs:=20rewrite=20=C2=A72=20as=20language?= =?UTF-8?q?-agnostic=20ConnectionDetails=20assembly=20guide?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Agent-Logs-Url: https://github.com/Open-J-Proxy/ojp/sessions/5c0e63ee-e3db-4996-842d-9aaff69c6cab Co-authored-by: rrobetti <7221783+rrobetti@users.noreply.github.com> --- .../multi-language-client-spec/CLIENT_SPEC.md | 68 +++++++++++-------- 1 file changed, 40 insertions(+), 28 deletions(-) diff --git a/documents/multi-language-client-spec/CLIENT_SPEC.md b/documents/multi-language-client-spec/CLIENT_SPEC.md index 2b7228f08..5e623d7bd 100644 --- a/documents/multi-language-client-spec/CLIENT_SPEC.md +++ b/documents/multi-language-client-spec/CLIENT_SPEC.md @@ -92,41 +92,53 @@ The client must implement stubs for every RPC in `StatementService` and `EchoSer --- -## 2. URL Parsing +## 2. Connection Configuration and Building ConnectionDetails -### URL format +### What the client collects from the user -``` -jdbc:ojp[host:port(datasourceName),host2:port2(datasourceName2)]_actual-db-url -``` +A non-Java OJP client does not use a JDBC URL. Instead, it collects the following configuration items directly from the user or from a configuration file: + +| Item | Required | Description | +|---|---|---| +| OJP server endpoints | Yes | One or more `host:port` pairs for the OJP server(s). In multinode mode this is a list. | +| Datasource name | No | A logical name for this datasource, default `"default"`. Used to keep separate connection pools per named datasource on the same server. | +| Database URL | Yes | The connection URL for the **real database** that the OJP server will connect to (e.g., `jdbc:postgresql://db:5432/mydb`). This is sent verbatim to the server. 
| +| User | Yes | Database username. | +| Password | Yes | Database password. | +| Properties | No | Additional key-value configuration pairs (pool sizing, cache rules, etc. — see §22, §23). | + +### Building the `ConnectionDetails` message -All three parts are mandatory: -- `jdbc:ojp` — fixed prefix that identifies the driver. -- `[...]` — bracket-enclosed, comma-separated list of OJP server endpoints. Each endpoint is `host:port`, optionally followed by a datasource name in parentheses: `host:port(dsName)`. -- `_` — separator between the OJP section and the actual database URL. -- `actual-db-url` — the full database URL that the server will use to connect to the real database (e.g., `postgresql://localhost:5432/mydb`). +Map the collected configuration to the `ConnectionDetails` proto fields as follows: -### Parsing rules +| Proto field | Type | Value | +|---|---|---| +| `url` | `string` | The **actual database URL** (e.g., `jdbc:postgresql://db:5432/mydb`). The server uses this to create the real database connection pool. | +| `user` | `string` | Database username. | +| `password` | `string` | Database password. | +| `clientUUID` | `string` | Stable process UUID (see §3). | +| `properties` | `repeated PropertyEntry` | Configuration key-value pairs; include `ojp.datasource.name = ` when using a named datasource. | +| `serverEndpoints` | `repeated string` | All OJP server addresses as `"host:port"` strings (the full cluster list, not just the chosen endpoint). | +| `clusterHealth` | `string` | Current cluster health string (see §11); empty string on the very first connect. | +| `isXA` | `bool` | `true` for XA connections, `false` otherwise. | + +> **Important:** the `url` field must be consistent across all client processes that connect to the same logical datasource. The server computes `connHash` as SHA-256(`url + user + password + datasource_name`). If different clients send different `url` strings for the same database, the server creates separate pools. 
+ +### `connHash` cache key (client side) -1. **Extract the endpoint list** by capturing everything between `[` and `]`. -2. **Split by comma** to enumerate endpoints. Trim whitespace around each item. -3. **For each endpoint**, split on `:` to obtain host and port. If a `(dsName)` suffix is present, strip it and record the datasource name; default is `"default"`. -4. **Validate** that host is non-empty, port is an integer in `[1, 65535]`, and at least one endpoint is present. -5. **Extract the actual database URL** by removing everything up to and including the first `]_` pattern. -6. **Produce a single-endpoint URL** for use in `ConnectionDetails.url` by replacing `[host1:port1,host2:port2]` with `[chosen_host:chosen_port]` (the endpoint actually selected for the first connection). This stripped URL is forwarded to the server; the server never sees the multinode list. +The client caches the `connHash` returned by the server after the first `connect()` RPC. The local lookup key for this cache is: -### Examples +``` +url + "|" + user + "|" + password + "|" + datasource_name +``` -| Input | Endpoints | Datasource names | Actual DB URL | -|---|---|---|---| -| `jdbc:ojp[localhost:10591]_postgresql://db:5432/mydb` | `localhost:10591` | `default` | `postgresql://db:5432/mydb` | -| `jdbc:ojp[a:1059,b:1059]_h2:mem:test` | `a:1059`, `b:1059` | `default`, `default` | `h2:mem:test` | -| `jdbc:ojp[a:1059(web),b:1059(analytics)]_postgresql://db/mydb` | `a:1059`, `b:1059` | `web`, `analytics` | `postgresql://db/mydb` | +Use the same `url` string that was placed in `ConnectionDetails.url` so the cache key matches the server's `connHash` computation. 
> **Reference implementation:** -> - `ojp-jdbc-driver` — [`MultinodeUrlParser`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeUrlParser.java): `parseServerEndpoints(url, dataSourceNames)` parses the bracket-enclosed endpoint list; `extractActualJdbcUrl(url)` strips the OJP prefix; `replaceBracketsWithSingleEndpoint(url, endpoint)` produces the single-endpoint URL forwarded to the server; `getOrCreateStatementService(url)` is the main entry point that ties parsing to channel creation. -> - `ojp-jdbc-driver` — [`UrlParser`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/UrlParser.java): `parseUrlWithDataSource(url)` handles single-node URL parsing and datasource name extraction. -> - `ojp-jdbc-driver` — [`Driver.connect(url, info)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Driver.java): the JDBC driver entry point that calls both parsers and dispatches to single-node or multinode paths. +> - `ojp-grpc-commons` — [`ConnectionDetails` proto](../../ojp-grpc-commons/src/main/proto/StatementService.proto): field definitions for `url`, `user`, `password`, `clientUUID`, `properties`, `serverEndpoints`, `clusterHealth`, `isXA`. +> - `ojp-server` — [`ConnectionHashGenerator.hashConnectionDetails()`](../../ojp-server/src/main/java/org/openjproxy/grpc/server/utils/ConnectionHashGenerator.java): SHA-256 of `url + user + password + datasource_name_from_properties` — the server-side connHash algorithm. +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.computeConnectionKey()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): client-side cache key = `url + "|" + user + "|" + password + "|" + datasource_name`. 
+> - `ojp-jdbc-driver` — [`MultinodeUrlParser`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeUrlParser.java): Java reference for how the JDBC URL is parsed to extract server endpoints, datasource names, and the actual DB URL before building `ConnectionDetails` (Java-specific; not needed in non-Java clients). --- @@ -148,8 +160,8 @@ All three parts are mandatory: ### First connection (cache miss) -1. Build a `ConnectionDetails` message: - - `url` — the single-endpoint URL extracted during parsing (see §2). +1. Build a `ConnectionDetails` message (see §2 for field mapping): + - `url` — the actual database connection URL. - `user`, `password` — credentials. - `clientUUID` — the stable process UUID (see §3). - `properties` — datasource-specific properties from configuration (see §22), including cache rules (see §23). From 11300c655fa8bf78e8e44bc9913aaab545fd9e29 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Sun, 19 Apr 2026 17:13:34 +0000 Subject: [PATCH 05/12] docs: add gRPC pseudo-code examples and reframe JDBC-specific sections in CLIENT_SPEC.md Agent-Logs-Url: https://github.com/Open-J-Proxy/ojp/sessions/759120cb-e3cf-4e95-9329-aacb037ad0b7 Co-authored-by: rrobetti <7221783+rrobetti@users.noreply.github.com> --- .../multi-language-client-spec/CLIENT_SPEC.md | 489 +++++++++++++++++- 1 file changed, 464 insertions(+), 25 deletions(-) diff --git a/documents/multi-language-client-spec/CLIENT_SPEC.md b/documents/multi-language-client-spec/CLIENT_SPEC.md index 5e623d7bd..5b239f13a 100644 --- a/documents/multi-language-client-spec/CLIENT_SPEC.md +++ b/documents/multi-language-client-spec/CLIENT_SPEC.md @@ -10,7 +10,7 @@ ## Table of Contents 1. [gRPC Interface Implementation](#1-grpc-interface-implementation) -2. [URL Parsing](#2-url-parsing) +2. [Connection Configuration and Building ConnectionDetails](#2-connection-configuration-and-building-connectiondetails) 3. 
[Client Identity](#3-client-identity) 4. [Connection Establishment and connHash Caching](#4-connection-establishment-and-connhash-caching) 5. [Session Management](#5-session-management) @@ -84,6 +84,18 @@ The client must implement stubs for every RPC in `StatementService` and `EchoSer - Blocking stubs are used for synchronous operations; async stubs are required for client-streaming (`createLob`) and server-streaming (`executeQuery`, `readLob`) RPCs. - Channel shutdown must be graceful (allow in-flight calls to complete) and must be triggered on client shutdown. +### Pseudo-code + +```python +# Create one long-lived channel per OJP server endpoint +channel = grpc.create_channel("localhost:10591", credentials=grpc.local_channel_credentials()) +stub = StatementServiceStub(channel) # used for all SQL operations +echo = EchoServiceStub(channel) # used for health heartbeats + +# On process shutdown — drain in-flight calls before closing +channel.shutdown(grace_period_seconds=5) +``` + > **Reference implementation:** > - `ojp-jdbc-driver` — [`StatementService`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementService.java): the unified interface declaring all RPC methods (`connect`, `executeUpdate`, `executeQuery`, `fetchNextRows`, `createLob`, `readLob`, `terminateSession`, `startTransaction`, `commitTransaction`, `rollbackTransaction`, `callResource`, all XA operations). > - `ojp-jdbc-driver` — [`StatementServiceGrpcClient`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java): the single-node gRPC implementation of `StatementService`; contains the concrete gRPC stub calls and the `grpcChannelOpenAndStubsInitialized()` channel lifecycle method. @@ -172,6 +184,47 @@ Use the same `url` string that was placed in `ConnectionDetails.url` so the cach 3. Cache the returned `connHash`, keyed on `url + "|" + user + "|" + password + "|" + datasourceName`. 
Also store the full `ConnectionDetails` so it can be replayed if the server restarts. 4. Return the received `SessionInfo` to the caller. +### Pseudo-code + +```python +# --- First connection (cache miss) --- +req = ConnectionDetails( + url = "jdbc:postgresql://db:5432/mydb", # actual DB URL + user = "alice", + password = "secret", + clientUUID = CLIENT_UUID, # stable process UUID (§3) + serverEndpoints = ["host1:10591", "host2:10591"], # full cluster list + clusterHealth = build_cluster_health(endpoints), # §11; "" on very first call + isXA = False, + properties = [PropertyEntry(key="ojp.datasource.name", string_value="default")] +) + +session = stub.connect(req) +# session.connHash = "abc123..." — server-computed pool key +# session.clientUUID = CLIENT_UUID + +# Cache connHash for subsequent connections +cache_key = f"{req.url}|{req.user}|{req.password}|default" +connhash_cache[cache_key] = session.connHash +stored_details[session.connHash] = req # kept for NOT_FOUND recovery (see below) + +# --- Subsequent connection (cache hit, non-XA) --- +# No RPC call needed — build SessionInfo locally from the cached connHash +session = SessionInfo( + connHash = connhash_cache[cache_key], + clientUUID = CLIENT_UUID, + isXA = False + # sessionUUID is absent; the server assigns it lazily on startTransaction +) + +# --- NOT_FOUND recovery --- +# If any RPC returns Status.NOT_FOUND (server restarted, pool lost): +del connhash_cache[cache_key] +session = stub.connect(stored_details[old_conn_hash]) # re-issue real connect() +connhash_cache[cache_key] = session.connHash # update cache +# then retry the original failed RPC once +``` + ### Subsequent connections (cache hit, non-XA only) When a subsequent connection uses the same credentials: @@ -228,6 +281,22 @@ When any gRPC call returns `Status.NOT_FOUND`, the server has lost its in-memory - On connection close: call `terminateSession(SessionInfo)`. 
This is mandatory for releasing server-side resources, especially in multinode deployments where multiple servers may hold pools. - If `sessionStatus == SESSION_TERMINATED` is received, treat the connection as closed and do not make further calls on it. +### Pseudo-code + +```python +# Every gRPC call returns an updated SessionInfo — always replace the local copy +resp = stub.executeUpdate(StatementRequest(session=current_session, sql="...")) +current_session = resp.session # ← update after every call + +# When a new sessionUUID appears in the response, record the server binding (§6) +if resp.session.sessionUUID and resp.session.sessionUUID != current_session.sessionUUID: + bind_session(resp.session.sessionUUID, resp.session.targetServer) + +# Close a connection — release server-side state +stub.terminateSession(current_session) +# After this call, discard current_session and do not make further calls on it +``` + > **Reference implementation:** > - `ojp-jdbc-driver` — [`Connection`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Connection.java): holds the mutable `session` field (`SessionInfo`); `close()` calls `terminateSession(session)` and nulls the session; `checkValid()` guards every method against a closed or force-invalidated connection. > - `ojp-jdbc-driver` — [`MultinodeStatementService.withClusterHealth()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeStatementService.java): enriches outgoing `SessionInfo` with the current cluster health string before each RPC. 
@@ -371,6 +440,44 @@ For each currently unhealthy server, check if enough time has passed since the l
 | `ojp.health.check.timeout` | 5000 ms | Maximum time for a single probe call |
 | `ojp.redistribution.enabled` | `true` | Whether to run the periodic health checker at all |
 
+### Pseudo-code
+
+```python
+# Lightweight heartbeat: send empty credentials — any response means transport is up
+def heartbeat_probe(stub):
+    try:
+        stub.connect(ConnectionDetails(url="", user="", password=""))
+        return True  # server is reachable
+    except grpc.RpcError:
+        return False  # mark server unhealthy (§8)
+
+# Full validation: connect with real credentials, then immediately terminate
+def full_validation_probe(stub, stored_details):
+    try:
+        session = stub.connect(stored_details)
+        stub.terminateSession(session)
+        return True
+    except grpc.RpcError:
+        return False
+
+# Periodic background task
+def run_health_check(endpoints, stubs, stored_details):
+    for ep in endpoints:
+        if ep.is_healthy:
+            # Phase 1 — probe healthy server; detect new failures
+            if stored_details or xa_sessions:  # guard: skip if no connections yet
+                if not heartbeat_probe(stubs[ep]):
+                    handle_server_failure(ep)
+                    push_cluster_health_inline(endpoints, stored_details)  # Trigger 1 (§11): inline on this thread
+        else:
+            # Phase 2 — probe unhealthy server; detect recovery
+            if time_since(ep.last_failure) >= HEALTH_CHECK_THRESHOLD:
+                if heartbeat_probe(stubs[ep]):
+                    reinitialize_pool_on_recovered_server(ep, stored_details)  # §10
+                    ep.mark_healthy()
+                    push_cluster_health_inline(endpoints, stored_details)  # §11
+```
+
 > **Reference implementation:**
 > - `ojp-jdbc-driver` — [`MultinodeConnectionManager.performHealthCheck()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): the scheduled task body; implements the two-phase check. Phase 1 fires when `!sessionToServerMap.isEmpty() || !connectionDetailsByConnHash.isEmpty()` (XA sessions OR non-XA cached connections).
Phase 1 failure calls `pushClusterHealthToAllHealthyServers()` inline on the health-check thread. Phase 2 calls `reinitializePoolOnRecoveredServer()` before `markHealthy()`, then pushes cluster health. > - `ojp-jdbc-driver` — [`HealthCheckValidator.validateServer(endpoint)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/HealthCheckValidator.java): performs a single lightweight probe; `validateServer(endpoint, connectionDetails)` performs the full-validation probe with real credentials followed by `terminateSession`. @@ -436,6 +543,41 @@ generate_cluster_health(endpoints): ) ``` +### Pseudo-code + +```python +# Build the health string from local endpoint state +def build_cluster_health(endpoints): + return ";".join( + f"{ep.host}:{ep.port}({'UP' if ep.is_healthy else 'DOWN'})" + for ep in endpoints + ) + +# Push updated cluster health to all healthy servers via a connect() call. +# The server uses the clusterHealth field to resize its pool immediately. +def push_cluster_health(endpoints, stored_details): + if not stored_details: + return # no connections yet — nothing to push + health_str = build_cluster_health(endpoints) + for conn_hash, details in stored_details.items(): + push_req = ConnectionDetails(**details, clusterHealth=health_str) + for ep in endpoints: + if ep.is_healthy: + stubs[ep].connect(push_req) # no-op for pool creation; resizes pool + +# Consume the cluster health returned in every gRPC response +def consume_cluster_health(session_info): + for segment in session_info.clusterHealth.split(";"): + host_port, status = segment.rsplit("(", 1) + status = status.rstrip(")") + endpoint = find_endpoint(host_port) + if status == "DOWN" and endpoint.is_healthy: + handle_server_failure(endpoint) + elif status == "UP" and not endpoint.is_healthy: + # do not mark healthy here — let the health-check thread confirm (§9) + pass +``` + > **Reference implementation:** > - `ojp-jdbc-driver` — 
[`MultinodeConnectionManager.generateClusterHealth()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): builds the semicolon-delimited health string from `serverEndpoints`. > - `MultinodeConnectionManager.pushClusterHealthToAllHealthyServers()`: calls `connect()` on every healthy server with the new cluster health embedded in `ConnectionDetails`; only runs when `!connectionDetailsByConnHash.isEmpty()` (no-op until the first real connection is established). @@ -447,27 +589,62 @@ generate_cluster_health(endpoints): ## 12. Transaction Management (non-XA) -### autoCommit semantics +### Transaction lifecycle -- Default state is `autoCommit = true`. -- When `autoCommit` is switched **off** (`false`), immediately call `startTransaction(SessionInfo)`. Store the returned `SessionInfo` (which now contains a `transactionUUID` and `TRX_ACTIVE` status). -- When `autoCommit` is switched **on** (`true`) while a transaction is active (`TRX_ACTIVE`), immediately call `commitTransaction(SessionInfo)` to commit the pending work. -- In `autoCommit = false` mode, no `startTransaction` call is needed before each SQL statement — the server tracks the open transaction via `sessionUUID`. +The server tracks open transactions per session. The client controls when transactions begin and end by calling explicit RPCs. -### Commit and rollback - -| Client call | gRPC call | Condition | -|---|---|---| -| `commit()` | `commitTransaction(SessionInfo)` | Only when `autoCommit == false` | -| `rollback()` | `rollbackTransaction(SessionInfo)` | Only when `autoCommit == false` | +- **Start a transaction**: call `startTransaction(SessionInfo)`. The returned `SessionInfo` contains a `transactionUUID` and `transactionStatus = TRX_ACTIVE`. All subsequent SQL calls on this session run inside the transaction until it is committed or rolled back. +- **Commit**: call `commitTransaction(SessionInfo)`. 
Returns updated `SessionInfo` with `transactionStatus = TRX_COMMITED`. +- **Rollback**: call `rollbackTransaction(SessionInfo)`. Returns updated `SessionInfo` with `transactionStatus = TRX_ROLLBACK`. +- **Auto-commit mode** (optional, for JDBC compatibility): if your client API exposes an auto-commit flag, implement it by calling `startTransaction` when the flag is switched off, and `commitTransaction` when it is switched back on while a transaction is active. In auto-commit mode, each SQL statement runs without an explicit transaction; the server commits each statement individually. Always replace the local `SessionInfo` with the one returned by these calls. ### Transaction isolation -- Set isolation level via `callResource` with `CallType.CALL_SET`, resource name `"TransactionIsolation"`, and the integer isolation level as parameter. -- Get isolation level via `callResource` with `CallType.CALL_GET`, resource name `"TransactionIsolation"`. -- The isolation level must be reset to the default after each logical connection is returned to a pool (if the client integrates with a connection pool). +Set or get the isolation level by calling `callResource` with `RES_CONNECTION` and `CallType.CALL_SET` / `CALL_GET` and resource name `"TransactionIsolation"`. The isolation level should be reset to the default after each logical connection is reused. 
+ +### Pseudo-code + +```python +# Begin an explicit transaction +session = stub.startTransaction(session) +# session.transactionInfo.transactionUUID = "txn-uuid" +# session.transactionInfo.transactionStatus = TRX_ACTIVE + +# Execute SQL within the open transaction +resp = stub.executeUpdate(StatementRequest(session=session, sql="INSERT INTO orders ...")) +session = resp.session # always update local session + +# Commit +session = stub.commitTransaction(session) +# session.transactionInfo.transactionStatus = TRX_COMMITED + +# — OR — Rollback +session = stub.rollbackTransaction(session) +# session.transactionInfo.transactionStatus = TRX_ROLLBACK + +# Set transaction isolation (READ_COMMITTED = 2) +resp = stub.callResource(CallResourceRequest( + session = session, + resourceType = RES_CONNECTION, + target = TargetCall( + callType = CALL_SET, + resourceName = "TransactionIsolation", + params = [ParameterValue(int_value=2)] + ) +)) +session = resp.session + +# Get current isolation level +resp = stub.callResource(CallResourceRequest( + session = session, + resourceType = RES_CONNECTION, + target = TargetCall(callType=CALL_GET, resourceName="TransactionIsolation") +)) +isolation_level = resp.values[0].int_value +session = resp.session +``` > **Reference implementation:** > - `ojp-jdbc-driver` — [`Connection.setAutoCommit(boolean)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Connection.java): calls `commitTransaction` when switching on and `startTransaction` when switching off; updates the local `session` field from each response. 
@@ -506,6 +683,41 @@ Call `callResource` with: - `resourceUUID = <the savepoint's resource UUID>` - `target.callType = CALL_RELEASE` +### Pseudo-code + +```python +# Create a named savepoint +resp = stub.callResource(CallResourceRequest( + session = session, + resourceType = RES_SAVEPOINT, + target = TargetCall( + callType = CALL_SET, + resourceName = "Savepoint", + params = [ParameterValue(string_value="my_savepoint")] # omit for anonymous + ) +)) +savepoint_uuid = resp.resourceUUID # keep this to roll back or release later +session = resp.session + +# Roll back to the savepoint (partial undo) +resp = stub.callResource(CallResourceRequest( + session = session, + resourceType = RES_SAVEPOINT, + resourceUUID = savepoint_uuid, + target = TargetCall(callType=CALL_ROLLBACK, resourceName="Savepoint") +)) +session = resp.session + +# Release the savepoint (no longer needed) +resp = stub.callResource(CallResourceRequest( + session = session, + resourceType = RES_SAVEPOINT, + resourceUUID = savepoint_uuid, + target = TargetCall(callType=CALL_RELEASE, resourceName="Savepoint") +)) +session = resp.session +``` + > **Reference implementation:** > - `ojp-jdbc-driver` — [`Connection.setSavepoint()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Connection.java) / `setSavepoint(name)`: calls `callProxy` with `CALL_SET`, `"Savepoint"`, and the optional name; wraps the returned resource UUID in a [`Savepoint`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Savepoint.java) object. > - `Connection.rollback(Savepoint)`: calls `callProxy` with `CALL_ROLLBACK`, `"Savepoint"`, and the savepoint's resource UUID. @@ -553,6 +765,49 @@ On the response to `xaStart`, record the `sessionUUID → targetServer` binding - `xaSetTransactionTimeout(seconds)` and `xaGetTransactionTimeout()` are straightforward pass-throughs to the server. - `xaIsSameRM` checks whether two `SessionInfo` objects originate from the same resource manager (same server).
+### Pseudo-code + +```python +xid = XidProto( + formatId = 1, + globalTransactionId = b"global-tx-001", + branchQualifier = b"branch-1" +) + +# 1. Start the XA branch (safe to retry on connection error) +resp = stub.xaStart(XaStartRequest(session=session, xid=xid, flags=0)) +session = resp.session # bind session.targetServer → this server for all remaining calls + +# 2. Execute SQL within the branch (normal executeUpdate/executeQuery calls) +resp = stub.executeUpdate(StatementRequest(session=session, sql="UPDATE accounts ...")) +session = resp.session + +# 3. End the branch — do NOT retry past this point +resp = stub.xaEnd(XaEndRequest(session=session, xid=xid, flags=0)) +session = resp.session + +# 4. Prepare (two-phase commit, phase 1) +prep = stub.xaPrepare(XaPrepareRequest(session=session, xid=xid)) +# prep.result = XA_OK (proceed to commit) or XA_RDONLY (read-only; no commit needed) + +# 5a. Commit (two-phase) +stub.xaCommit(XaCommitRequest(session=session, xid=xid, onePhase=False)) + +# 5b. — OR — One-phase optimisation (skip xaPrepare) +stub.xaCommit(XaCommitRequest(session=session, xid=xid, onePhase=True)) + +# 5c. — OR — Rollback +stub.xaRollback(XaRollbackRequest(session=session, xid=xid)) + +# Recovery: list in-doubt XIDs after a crash +resp = stub.xaRecover(XaRecoverRequest(session=session, flag=TMSTARTRSCAN)) +for recovered_xid in resp.xids: + stub.xaCommit(...) 
# or xaRollback — decision belongs to the transaction manager + +# Forget a heuristically completed branch +stub.xaForget(XaForgetRequest(session=session, xid=xid)) +``` + > **Reference implementation:** > - `ojp-jdbc-driver` — [`OjpXAResource`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/xa/OjpXAResource.java): implements `XAResource`; all 10 lifecycle methods (`start`, `end`, `prepare`, `commit`, `rollback`, `recover`, `forget`, `setTransactionTimeout`, `getTransactionTimeout`, `isSameRM`); contains the `xaStart` retry loop and the `toXidProto` / `fromXidProto` conversion helpers. > - `ojp-jdbc-driver` — [`OjpXAConnection`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/xa/OjpXAConnection.java): creates the XA-mode `StatementService` connection (always calling the server, never cache-hit) and vends `OjpXAResource`. @@ -563,25 +818,27 @@ On the response to `xaStart`, record the `sessionUUID → targetServer` binding ## 15. Statement Execution -### Three statement types +### Sending SQL to the server + +All SQL is executed by populating a `StatementRequest` and calling either `executeUpdate` or `executeQuery` on the stub. -**Plain Statement** -Execute arbitrary SQL strings without parameters. Maps to `executeUpdate` or `executeQuery` with an empty `parameters` list. +**Parameterless SQL** +Set `sql` to the full query string and leave `parameters` empty. -**Prepared Statement** -Pre-compiled SQL with positional parameters (`?` placeholders). Parameters are accumulated locally and sent with the SQL in a single `StatementRequest`. Assign and track a `statementUUID` (a random UUID per prepared statement instance) for server-side resource management. +**Parameterized SQL** +Set `sql` with `?` positional placeholders and populate the `parameters` list with one `ParameterProto` per `?`. Parameters are accumulated locally and sent together in a single `StatementRequest`. 
Assign a `statementUUID` (a random UUID per logical prepared-statement instance) so the server can track resources tied to that statement. -**Callable Statement** -Stored-procedure calls with IN, OUT, and INOUT parameters. The stored-procedure call string is prepared on the server via `callResource` with `CallType.CALL_PREPARE` first. The returned `resourceUUID` becomes the Callable Statement handle. Parameters are registered by index and type, and OUT/INOUT values are retrieved from `CallResourceResponse.values` after execution. +**Stored-procedure calls** +First call `callResource` with `CallType.CALL_PREPARE` to register the procedure on the server and receive a `resourceUUID`. Then call `callResource` with `CallType.CALL_EXECUTE` to run it, passing IN parameters and retrieving OUT/INOUT values from `CallResourceResponse.values`. ### StatementRequest structure ``` StatementRequest { session: SessionInfo // current session - sql: string // the SQL (or call string) - parameters: ParameterProto[] // indexed parameters - statementUUID: string // UUID for this statement (for resource tracking) + sql: string // the SQL string + parameters: ParameterProto[] // indexed parameters (empty for parameterless SQL) + statementUUID: string // random UUID per statement instance properties: PropertyEntry[] // optional per-statement properties } ``` @@ -592,6 +849,62 @@ StatementRequest { - Use `executeQuery` for SELECT — returns a server-streaming response. Consume the first `OpResult` to get the initial batch; call `fetchNextRows` for subsequent pages (see §18). - After any execution, update the local `SessionInfo` from the `OpResult.session` field. 
+### Pseudo-code + +```python +# DML — INSERT / UPDATE / DELETE (use executeUpdate) +resp = stub.executeUpdate(StatementRequest( + session = session, + sql = "INSERT INTO orders(customer, amount) VALUES(?, ?)", + parameters = [ + ParameterProto(index=1, type=PT_STRING, values=[ParameterValue(string_value="Alice")]), + ParameterProto(index=2, type=PT_INT, values=[ParameterValue(int_value=42)]) + ], + statementUUID = new_uuid() # random UUID per statement instance +)) +session = resp.session # always update local session +rows_affected = resp.value.int_value # e.g., 1 + +# Query — SELECT (use executeQuery, which is server-streaming) +req = StatementRequest( + session = session, + sql = "SELECT id, name FROM orders WHERE customer = ?", + parameters = [ParameterProto(index=1, type=PT_STRING, + values=[ParameterValue(string_value="Alice")])], + statementUUID = new_uuid() +) +result_set_uuid = None +for op_result in stub.executeQuery(req): # iterate the server-streaming response + qr = op_result.query_result + if result_set_uuid is None: + result_set_uuid = qr.resultSetUUID + labels = qr.labels # e.g., ["id", "name"] + for row in qr.rows: + id_val = row.values[0].int_value + name_val = row.values[1].string_value + session = op_result.session +# Fetch additional pages → see §18 + +# Stored procedure — CALL_PREPARE then CALL_EXECUTE +prep_resp = stub.callResource(CallResourceRequest( + session = session, + resourceType = RES_CALLABLE_STATEMENT, + target = TargetCall(callType=CALL_PREPARE, resourceName="{call my_proc(?,?)}", + params=[ParameterValue(int_value=1)]) # IN param +)) +proc_uuid = prep_resp.resourceUUID +session = prep_resp.session + +exec_resp = stub.callResource(CallResourceRequest( + session = session, + resourceType = RES_CALLABLE_STATEMENT, + resourceUUID = proc_uuid, + target = TargetCall(callType=CALL_EXECUTE) +)) +out_value = exec_resp.values[0] # first OUT/INOUT parameter value +session = exec_resp.session +``` + > **Reference implementation:** > - 
`ojp-jdbc-driver` — [`Statement`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Statement.java): `executeQuery(sql)` → `statementService.executeQuery(...)`; `executeUpdate(sql)` → `statementService.executeUpdate(...)`; holds `statementUUID` (assigned lazily); `execute(sql)` handles the dual-result case. > - `ojp-jdbc-driver` — [`PreparedStatement`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/PreparedStatement.java): accumulates parameters in a `SortedMap`; `executeQuery()` and `executeUpdate()` pass the full param map to `statementService`; all 28 `setXxx(index, value)` methods map to the corresponding `ParameterType` (see §16). @@ -771,6 +1084,45 @@ Scrollable result sets support cursor positioning through `callResource` with `R | `previous()` | `CALL_PREVIOUS` | | `close()` | `CALL_CLOSE` | +### Pseudo-code + +```python +# After executeQuery stream closes, fetch additional pages with fetchNextRows +result_set_uuid = ... # captured from the first op_result (§15) +all_rows = [] +while True: + resp = stub.fetchNextRows(ResultSetFetchRequest( + session = session, + resultSetUUID = result_set_uuid, + size = 500 # rows per page + )) + session = resp.session + if not resp.query_result.rows: + break # no more rows — result set exhausted + all_rows.extend(resp.query_result.rows) + +# Close the result set explicitly when done +stub.callResource(CallResourceRequest( + session = session, + resourceType = RES_RESULT_SET, + resourceUUID = result_set_uuid, + target = TargetCall(callType=CALL_CLOSE) +)) + +# Cursor navigation — jump to an absolute row (scrollable result sets only) +resp = stub.callResource(CallResourceRequest( + session = session, + resourceType = RES_RESULT_SET, + resourceUUID = result_set_uuid, + target = TargetCall( + callType = CALL_ABSOLUTE, + params = [ParameterValue(int_value=10)] # jump to row 10 + ) +)) +session = resp.session +current_row = resp.values # column values for row 10 +``` + > **Reference implementation:** > - 
`ojp-jdbc-driver` — [`ResultSet`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/ResultSet.java): `next()` drives the multi-block iteration; `setNextOpResult()` loads a new batch from the iterator; `nextWithSessionUpdate()` updates the session from each block. All `getXxx(columnIndex)` methods call `ProtoConverter.fromParameterValue()` on the column's `ParameterValue`. > - `ojp-jdbc-driver` — [`RemoteProxyResultSet`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/RemoteProxyResultSet.java): base class holding `resultSetUUID` and `statementService`; all scrollable-cursor operations issue `callResource(RES_RESULT_SET, CALL_FIRST/LAST/ABSOLUTE/…)`. @@ -832,6 +1184,47 @@ Receive a server-streaming response of `LobDataBlock` messages. Concatenate the LOB handles are server-side objects. A connection that has an open LOB must remain bound to the same server (§6). Do not reroute such connections during failover; instead surface the error to the caller. +### Pseudo-code + +```python +CHUNK_SIZE = 64 * 1024 # 64 KB recommended chunk size + +# --- Write a LOB (createLob is client-streaming) --- +def write_lob(stub, session, data_bytes, lob_type=LT_BLOB): + def generate_blocks(): + for offset in range(0, len(data_bytes), CHUNK_SIZE): + yield LobDataBlock( + session = session, + position = offset, + data = data_bytes[offset : offset + CHUNK_SIZE], + lobType = lob_type + ) + lob_ref = stub.createLob(generate_blocks()) # client-streaming → single LobReference + # lob_ref.uuid → the LOB handle; pass as parameter to executeUpdate (see §16) + # lob_ref.bytesWritten → sanity check + return lob_ref.uuid + +# Bind the LOB UUID when executing a statement +lob_uuid = write_lob(stub, session, my_bytes) +stub.executeUpdate(StatementRequest( + session = session, + sql = "INSERT INTO docs(content) VALUES(?)", + parameters = [ParameterProto(index=1, type=PT_BLOB, + values=[ParameterValue(string_value=lob_uuid)])] +)) + +# --- Read a LOB (readLob is server-streaming) --- 
+def read_lob(stub, session, lob_uuid, lob_type=LT_BLOB, max_bytes=10_000_000): + req = ReadLobRequest( + lobReference = LobReference(uuid=lob_uuid, session=session, lobType=lob_type), + position = 1, # 1-based start position + length = max_bytes + ) + return b"".join(block.data for block in stub.readLob(req)) + +content = read_lob(stub, session, lob_uuid) +``` + > **Reference implementation:** > - `ojp-jdbc-driver` — [`LobServiceImpl`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/LobServiceImpl.java): `sendBytes(lobType, pos, inputStream)` opens the client-streaming `createLob` call, chunks the data into `LobDataBlock` messages, and returns the `LobReference`. `parseReceivedBlocks(Iterator)` reassembles chunks from a `readLob` stream into an `InputStream`. > - `ojp-jdbc-driver` — [`StatementServiceGrpcClient.createLob(connection, iterator)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java): the client-streaming gRPC call; uses an async stub and a `CountDownLatch` to bridge the streaming API back to a synchronous return value. @@ -896,6 +1289,52 @@ Always update the local `SessionInfo` from `response.session`. 
`CALL_SET`, `CALL_GET`, `CALL_IS`, `CALL_ALL`, `CALL_NULLS`, `CALL_USES`, `CALL_SUPPORTS`, `CALL_STORES`, `CALL_NULL`, `CALL_DOES`, `CALL_DATA`, `CALL_NEXT`, `CALL_CLOSE`, `CALL_WAS`, `CALL_CLEAR`, `CALL_FIND`, `CALL_BEFORE`, `CALL_AFTER`, `CALL_FIRST`, `CALL_LAST`, `CALL_ABSOLUTE`, `CALL_RELATIVE`, `CALL_PREVIOUS`, `CALL_ROW`, `CALL_UPDATE`, `CALL_INSERT`, `CALL_DELETE`, `CALL_REFRESH`, `CALL_CANCEL`, `CALL_MOVE`, `CALL_OWN`, `CALL_OTHERS`, `CALL_UPDATES`, `CALL_DELETES`, `CALL_INSERTS`, `CALL_LOCATORS`, `CALL_AUTO`, `CALL_GENERATED`, `CALL_RELEASE`, `CALL_NATIVE`, `CALL_PREPARE`, `CALL_ROLLBACK`, `CALL_ABORT`, `CALL_EXECUTE`, `CALL_ADD`, `CALL_ENQUOTE`, `CALL_REGISTER`, `CALL_LENGTH` +### Pseudo-code + +```python +# --- Get the database catalog name (connection-level metadata) --- +resp = stub.callResource(CallResourceRequest( + session = session, + resourceType = RES_CONNECTION, + resourceUUID = "", # empty for connection-level calls + target = TargetCall(callType=CALL_GET, resourceName="Catalog") +)) +catalog_name = resp.values[0].string_value +session = resp.session # always update local session + +# --- Check a database capability --- +resp = stub.callResource(CallResourceRequest( + session = session, + resourceType = RES_CONNECTION, + target = TargetCall(callType=CALL_SUPPORTS, resourceName="Transactions") +)) +supports_transactions = resp.values[0].bool_value +session = resp.session + +# --- Cancel a running statement --- +resp = stub.callResource(CallResourceRequest( + session = session, + resourceType = RES_STATEMENT, + resourceUUID = statement_uuid, # UUID of the statement to cancel + target = TargetCall(callType=CALL_CANCEL) +)) +session = resp.session + +# --- Chained call: get Schema and Catalog in one round-trip --- +resp = stub.callResource(CallResourceRequest( + session = session, + resourceType = RES_CONNECTION, + target = TargetCall( + callType = CALL_GET, + resourceName = "Schema", + nextCall = TargetCall(callType=CALL_GET, 
resourceName="Catalog") + ) +)) +schema_name = resp.values[0].string_value +catalog_name = resp.values[1].string_value +session = resp.session +``` + > **Reference implementation:** > - `ojp-jdbc-driver` — [`StatementServiceGrpcClient.callResource(CallResourceRequest)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java): the single-node gRPC call. > - `ojp-jdbc-driver` — [`DatabaseMetaData`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/DatabaseMetaData.java): every `DatabaseMetaData` method (>200 in total) is implemented by calling `callResource` with `RES_CONNECTION` and the appropriate `CallType` (e.g., `CALL_GET` for `getURL()`, `CALL_SUPPORTS` for `supportsXxx()`, `CALL_STORES` for `storesXxx()`). The private helper `newCallBuilder()` creates the skeleton `CallResourceRequest`. From 4ef1e2f5c30659aedf2bdd1ff55cbdae0eaa288f Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Sun, 19 Apr 2026 22:51:55 +0000 Subject: [PATCH 06/12] docs: add gRPC pseudo-code examples and reframe JDBC-specific sections in CLIENT_SPEC.md Agent-Logs-Url: https://github.com/Open-J-Proxy/ojp/sessions/f769d13a-bbd8-4dbc-a3e1-f8f1395a1bf7 Co-authored-by: rrobetti <7221783+rrobetti@users.noreply.github.com> --- .../multi-language-client-spec/write_files.py | 1625 +++++++++++++++++ 1 file changed, 1625 insertions(+) create mode 100644 documents/multi-language-client-spec/write_files.py diff --git a/documents/multi-language-client-spec/write_files.py b/documents/multi-language-client-spec/write_files.py new file mode 100644 index 000000000..181cb2373 --- /dev/null +++ b/documents/multi-language-client-spec/write_files.py @@ -0,0 +1,1625 @@ +#!/usr/bin/env python3 +"""Script to write the two OJP client spec files.""" +import base64 +import os + +SPEC_DIR = "/home/runner/work/ojp/ojp/documents/multi-language-client-spec" + +CLIENT_SPEC = r"""# OJP Multi-Language Client Specification + 
+> **Status:** Draft — April 2026 +> **Scope:** This document defines every aspect that a new OJP client library (in any language other than Java) must implement in order to be fully compatible with an OJP server. It is written language-agnostically; where Java-specific concepts appear they are labelled as the reference implementation only. +> **Reference implementation:** `ojp-jdbc-driver` module. +> **Protocol source of truth:** `ojp-grpc-commons/src/main/proto/StatementService.proto` and `echo.proto`. + +--- + +## Table of Contents + +1. [Overview](#1-overview) +2. [Core Concepts](#2-core-concepts) + - [2.1 The Virtual Connection Model](#21-the-virtual-connection-model) + - [2.2 Session and connHash](#22-session-and-connhash) + - [2.3 Session Affinity (Stickiness)](#23-session-affinity-stickiness) + - [2.4 Client vs. Server Responsibilities](#24-client-vs-server-responsibilities) +3. [Architecture and Data Flow](#3-architecture-and-data-flow) + - [3.1 gRPC Interface and Channel Setup](#31-grpc-interface-and-channel-setup) + - [3.2 Connection Configuration (ConnectionDetails)](#32-connection-configuration-connectiondetails) + - [3.3 Client Identity (clientUUID)](#33-client-identity-clientuuid) + - [3.4 Multinode Load Balancing](#34-multinode-load-balancing) + - [3.5 Cluster Health Propagation](#35-cluster-health-propagation) +4. [Client Responsibilities](#4-client-responsibilities) + - [4.1 Connection Establishment and connHash Caching](#41-connection-establishment-and-connhash-caching) + - [4.2 Session Lifecycle](#42-session-lifecycle) + - [4.3 Failover](#43-failover) + - [4.4 Health Checking](#44-health-checking) + - [4.5 Connection Redistribution on Recovery](#45-connection-redistribution-on-recovery) +5. [Minimal End-to-End Example](#5-minimal-end-to-end-example) +6. [Error Handling](#6-error-handling) + - [6.1 Error Classification Matrix](#61-error-classification-matrix) + - [6.2 SQL Errors vs. 
gRPC Transport Errors](#62-sql-errors-vs-grpc-transport-errors) +7. [Implementation Guidance](#7-implementation-guidance) + - [7.1 Statement Execution](#71-statement-execution) + - [7.2 Parameter Type Mapping](#72-parameter-type-mapping) + - [7.3 Temporal Type Handling](#73-temporal-type-handling) + - [7.4 Result Set Streaming](#74-result-set-streaming) + - [7.5 LOB (Large Object) Handling](#75-lob-large-object-handling) + - [7.6 Transaction Management (non-XA)](#76-transaction-management-non-xa) + - [7.7 Savepoints](#77-savepoints) + - [7.8 XA / Distributed Transactions](#78-xa--distributed-transactions) + - [7.9 callResource Protocol](#79-callresource-protocol) + - [7.10 Configuration System](#710-configuration-system) + - [7.11 Query Result Caching](#711-query-result-caching) + - [7.12 Security / Transport](#712-security--transport) + - [7.13 DataSource / Integration API](#713-datasource--integration-api) +8. [Testing Coverage](#8-testing-coverage) + +--- + +## 1. Overview + +OJP (Open J Proxy) is the world's first open-source JDBC Type 3 driver. It works by placing a gRPC server between applications and their databases. Applications use an OJP client library — and never touch a real database connection directly. The OJP server owns all real connections in a HikariCP pool and services SQL requests on behalf of clients. + +This document specifies everything a non-Java OJP client library must implement to be fully protocol-compatible. All communication between client and server uses gRPC over HTTP/2. The proto definitions in `ojp-grpc-commons/src/main/proto/StatementService.proto` and `echo.proto` are the authoritative source for message formats and RPC signatures. + +--- + +## 2. Core Concepts + +### 2.1 The Virtual Connection Model + +An OJP "connection" is not a real database connection. Real database connections are owned and managed by the OJP server's HikariCP pool. 
The client holds a `SessionInfo` — a lightweight value object the server uses to route each incoming request to the correct pool. + +Opening a connection is inexpensive. For non-XA connections, a cache hit means no RPC is needed at all: the client constructs a `SessionInfo` locally from a cached `connHash` and begins issuing SQL calls immediately. Multiple client connections may share the same server-side pool by sharing the same `connHash`. + +Because the server already pools real database connections, the application side must not add another connection pool on top. Double-pooling (e.g., HikariCP on the app side in addition to the server side) causes resource waste and incorrect connection-count accounting. + +### 2.2 Session and connHash + +`connHash` is a server-computed SHA-256 hash of the tuple `(url, user, password, datasource_name)`. It identifies which pool on the server to route requests to. The client receives `connHash` on the first `connect()` RPC and must cache it for all subsequent connections that share the same credentials. + +`sessionUUID` is the server-side handle for a persistent session. It is NOT assigned at connection time. The server assigns it on the first operation that requires a persistent server-side context — for example, `startTransaction()`, LOB creation, or a stored-procedure call. Until a `sessionUUID` exists, each request is effectively stateless: the server picks any available connection from the pool identified by `connHash` and processes the request. + +Once a `sessionUUID` is established it is returned in every subsequent response alongside the same `connHash`. The client must replace its local `SessionInfo` with the one returned after every RPC call. + +### 2.3 Session Affinity (Stickiness) + +Once a `sessionUUID` is set, every subsequent request for that session must go to the same server. The server encodes the target server address in the `targetServer` field of every `SessionInfo` response. 
The client must record this binding in a `sessionUUID -> host:port` map and enforce it strictly. + +Routing a sticky session to a different server is a protocol error — it is not a transparent optimisation. The session state (open transaction, cursor, LOB handle) lives on a specific server and cannot be migrated. If the bound server becomes unreachable, the client must raise an error to the caller rather than silently rerouting. + +### 2.4 Client vs. Server Responsibilities + +The server owns: real database connections, connection pool management, transaction state, LOB storage, cursor state, and query result caching. + +The client owns: `SessionInfo` propagation on every request and response, `connHash` caching and cache-invalidation logic, server endpoint health tracking, load balancing across healthy endpoints, failover on transport errors, cluster health string construction and proactive pushing to healthy servers, and session stickiness enforcement. + +--- + +## 3. Architecture and Data Flow + +### 3.1 gRPC Interface and Channel Setup + +The client must implement stubs for every RPC in `StatementService` and `EchoService`. 
+ +**`StatementService` RPCs:** + +| RPC | Type | Purpose | +|---|---|---| +| `connect` | unary | Open a logical connection and receive `SessionInfo` | +| `executeUpdate` | unary | DML (INSERT / UPDATE / DELETE / DDL) | +| `executeQuery` | server-streaming | SELECT — returns a stream of `OpResult` blocks | +| `fetchNextRows` | unary | Pull the next page of rows for an open result set | +| `createLob` | client-streaming | Upload LOB data to the server in chunks | +| `readLob` | server-streaming | Download LOB data from the server | +| `terminateSession` | unary | Release server-side session state | +| `startTransaction` | unary | Begin an explicit transaction | +| `commitTransaction` | unary | Commit the active transaction | +| `rollbackTransaction` | unary | Roll back the active transaction | +| `callResource` | unary | Generic remote call for metadata, cursor navigation, savepoints | +| `xaStart` | unary | Begin an XA transaction branch | +| `xaEnd` | unary | End an XA transaction branch | +| `xaPrepare` | unary | Prepare an XA transaction branch | +| `xaCommit` | unary | Commit an XA transaction branch | +| `xaRollback` | unary | Roll back an XA transaction branch | +| `xaRecover` | unary | List XIDs of prepared transactions | +| `xaForget` | unary | Forget a heuristically completed transaction | +| `xaSetTransactionTimeout` | unary | Set XA timeout in seconds | +| `xaGetTransactionTimeout` | unary | Get current XA timeout | +| `xaIsSameRM` | unary | Check whether two sessions share a resource manager | + +**`EchoService` RPC:** + +| RPC | Type | Purpose | +|---|---|---| +| `Echo` | unary | Lightweight heartbeat / connectivity check | + +**gRPC channel lifecycle:** + +One `ManagedChannel` (or equivalent) per server endpoint. Channels are long-lived and shared across all logical connections to that endpoint. They are created lazily on first connection or eagerly during initialisation when endpoints are known upfront. 
Use DNS-prefixed targets (`dns:///host:port`) where the gRPC runtime supports it. Blocking stubs are used for synchronous operations; async stubs are required for client-streaming (`createLob`) and server-streaming (`executeQuery`, `readLob`) RPCs. Channel shutdown must be graceful and triggered on client shutdown. + +```python +# Create one long-lived channel per OJP server endpoint +channel = grpc.secure_channel("localhost:10591", grpc.local_channel_credentials()) +stub = StatementServiceStub(channel) # used for all SQL operations +echo = EchoServiceStub(channel) # used for health heartbeats + +# On process shutdown — close the channel once in-flight calls have drained +channel.close() +``` + +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`StatementService`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementService.java): the unified interface declaring all RPC methods. +> - `ojp-jdbc-driver` — [`StatementServiceGrpcClient`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java): the single-node gRPC implementation; contains the concrete gRPC stub calls. +> - `ojp-jdbc-driver` — [`MultinodeStatementService`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeStatementService.java): the multinode facade that wraps `StatementServiceGrpcClient` per endpoint with routing, failover, and stickiness. +> - `ojp-grpc-commons` — [`GrpcChannelFactory`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/GrpcChannelFactory.java): builds `ManagedChannel` instances with plaintext or TLS. + +--- + +### 3.2 Connection Configuration (ConnectionDetails) + +A non-Java OJP client does not use a JDBC URL. Instead, it collects the following configuration items directly from the user or from a configuration file: + +| Item | Required | Description | +|---|---|---| +| OJP server endpoints | Yes | One or more `host:port` pairs for the OJP server(s).
In multinode mode this is a list. | +| Datasource name | No | A logical name for this datasource, default `"default"`. Used to keep separate connection pools per named datasource on the same server. | +| Database URL | Yes | The connection URL for the **real database** that the OJP server will connect to (e.g., `jdbc:postgresql://db:5432/mydb`). This is sent verbatim to the server. | +| User | Yes | Database username. | +| Password | Yes | Database password. | +| Properties | No | Additional key-value configuration pairs (pool sizing, cache rules, etc. — see §7.10, §7.11). | + +Map the collected configuration to the `ConnectionDetails` proto fields as follows: + +| Proto field | Type | Value | +|---|---|---| +| `url` | `string` | The **actual database URL** (e.g., `jdbc:postgresql://db:5432/mydb`). The server uses this to create the real database connection pool. | +| `user` | `string` | Database username. | +| `password` | `string` | Database password. | +| `clientUUID` | `string` | Stable process UUID (see §3.3). | +| `properties` | `repeated PropertyEntry` | Configuration key-value pairs; include `ojp.datasource.name = ` when using a named datasource. | +| `serverEndpoints` | `repeated string` | All OJP server addresses as `"host:port"` strings (the full cluster list, not just the chosen endpoint). | +| `clusterHealth` | `string` | Current cluster health string (see §3.5); empty string on the very first connect. | +| `isXA` | `bool` | `true` for XA connections, `false` otherwise. | + +The `url` field must be consistent across all client processes that connect to the same logical datasource. The server computes `connHash` as SHA-256(`url + user + password + datasource_name`). If different clients send different `url` strings for the same database, the server creates separate pools. 
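The pool-separation behaviour follows directly from the hash input. A minimal sketch of the server-side computation is below; the exact digest encoding is an assumption of this sketch, and `ConnectionHashGenerator` remains the authoritative algorithm:

```python
import hashlib

def conn_hash(url, user, password, datasource_name="default"):
    # SHA-256 over url + user + password + datasource_name, as described
    # above. Hex encoding of the digest is an assumption of this sketch;
    # verify against the server's ConnectionHashGenerator.
    raw = (url + user + password + datasource_name).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()

# Two clients sending different url strings for the same physical database
# end up in different server-side pools:
h1 = conn_hash("jdbc:postgresql://db:5432/mydb", "app", "secret")
h2 = conn_hash("jdbc:postgresql://db:5432/mydb?ssl=false", "app", "secret")
assert h1 != h2
```

This is why the `url` string must be byte-for-byte identical across every client process that should share one pool.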
+ +The client-side cache key for the `connHash` lookup is: `url + "|" + user + "|" + password + "|" + datasource_name` + +> **Reference implementation:** +> - `ojp-grpc-commons` — [`ConnectionDetails` proto](../../ojp-grpc-commons/src/main/proto/StatementService.proto): field definitions. +> - `ojp-server` — [`ConnectionHashGenerator.hashConnectionDetails()`](../../ojp-server/src/main/java/org/openjproxy/grpc/server/utils/ConnectionHashGenerator.java): SHA-256 of `url + user + password + datasource_name_from_properties` — the server-side connHash algorithm. +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.computeConnectionKey()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): client-side cache key. +> - `ojp-jdbc-driver` — [`MultinodeUrlParser`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeUrlParser.java): Java reference for how the JDBC URL is parsed to extract server endpoints, datasource names, and the actual DB URL before building `ConnectionDetails` (Java-specific; not needed in non-Java clients). + +--- + +### 3.3 Client Identity (clientUUID) + +Generate one random UUID (version 4) when the client library is first loaded or when the process starts. This UUID must remain stable for the entire lifetime of the process. Attach `clientUUID` to every `ConnectionDetails` message sent to the server. The server uses `clientUUID` to group all sessions from the same client process. Do not persist `clientUUID` across process restarts; each new process must generate a fresh UUID. + +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`ClientUUID`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/ClientUUID.java): `getUUID()` returns the static, process-scoped UUID that is generated once at class-loading time via `UUID.randomUUID()`. 
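The process-scoped identity can be sketched as a module-level constant, mirroring the static-field approach of `ClientUUID`; the helper function and its dict stand-in for the `ConnectionDetails` proto are illustrative only:

```python
import uuid

# Generated once at import time; stable for the lifetime of the process.
# A new process generates a fresh value — never persist this across restarts.
CLIENT_UUID = str(uuid.uuid4())

def new_connection_details(url, user, password):
    # Illustrative stand-in for the ConnectionDetails proto: every message
    # sent from this process carries the same clientUUID.
    return {"url": url, "user": user, "password": password,
            "clientUUID": CLIENT_UUID}
```

Every `ConnectionDetails` built through such a helper automatically carries the same UUID, which is what lets the server group all sessions from one client process.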
+ +--- + +### 3.4 Multinode Load Balancing + +Two strategies must be supported, selectable via configuration (see §7.10, property `ojp.loadaware.selection.enabled`): + +**Least-connections (default, `true`):** Select the healthy server with the lowest number of active sessions. Track session counts in a thread-safe counter per server endpoint. Use round-robin as a tie-breaker when all servers have equal counts. + +**Round-robin (`false`):** Cycle through healthy servers in order using an atomic counter modulo the number of healthy servers. + +Server selection runs on every new connection attempt (non-XA, first `connect()`) and on every XA `connect()`. Once a session is assigned a server via session stickiness, selection does not run again for that session. Only servers whose `isHealthy() == true` are eligible for selection. If no healthy servers exist, raise a connection error. + +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.selectHealthyServer()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): the entry point that dispatches to one of the two strategies based on config. +> - `MultinodeConnectionManager.selectByLeastConnections(healthyServers)`: picks the server with the lowest active-session count. +> - `MultinodeConnectionManager.selectByRoundRobin(healthyServers)`: atomically increments `roundRobinCounter` and selects `servers[counter % size]`. +> - `ojp-jdbc-driver` — [`ServerEndpoint`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/ServerEndpoint.java): holds `isHealthy`, `lastFailureTime`, host, and port state for each endpoint. + +--- + +### 3.5 Cluster Health Propagation + +The cluster health string format is: `host1:port1(UP);host2:port2(DOWN);host3:port3(UP)` + +Each semicolon-separated segment is `host:port(STATUS)` where status is `UP` or `DOWN`. 
+ +The client must build the cluster health string from local endpoint state before every `connect()` call. It must also consume the cluster health string returned in `SessionInfo.clusterHealth` on every response, updating local endpoint states: endpoints marked `DOWN` must be treated as unhealthy; endpoints marked `UP` that were previously failed must not be marked healthy immediately — the health-check thread must confirm first. + +When the topology changes (a server fails or recovers), the client must proactively push the updated cluster health to all currently healthy servers via two independent triggers — both are necessary: + +**Trigger 1 — health-check thread**: When `performHealthCheck()` detects a newly failed or recovered server, it calls `pushClusterHealthToAllHealthyServers()` inline. This covers the case when no SQL traffic is active. + +**Trigger 2 — query thread**: When a SQL query thread detects server failure via `handleServerFailure()`, it submits `pushClusterHealthToAllHealthyServers()` to the background scheduler asynchronously. This covers the race where the query thread marks the server unhealthy before the health checker runs. + +The push is done by calling `connect()` on each healthy server with a `ConnectionDetails` whose `clusterHealth` field contains the new topology string. The server uses this to resize its pool immediately. + +```python +# Build the health string from local endpoint state +def build_cluster_health(endpoints): + return ";".join( + f"{ep.host}:{ep.port}({'UP' if ep.is_healthy else 'DOWN'})" + for ep in endpoints + ) + +# Push updated cluster health to all healthy servers via a connect() call. +# The server uses the clusterHealth field to resize its pool immediately. 
+def push_cluster_health(endpoints, stored_details): + if not stored_details: + return # no connections yet — nothing to push + health_str = build_cluster_health(endpoints) + for conn_hash, details in stored_details.items(): + push_req = ConnectionDetails(**details, clusterHealth=health_str) + for ep in endpoints: + if ep.is_healthy: + stubs[ep].connect(push_req) # no-op for pool creation; resizes pool + +# Consume the cluster health returned in every gRPC response +def consume_cluster_health(session_info): + for segment in session_info.clusterHealth.split(";"): + host_port, status = segment.rsplit("(", 1) + status = status.rstrip(")") + endpoint = find_endpoint(host_port) + if status == "DOWN" and endpoint.is_healthy: + handle_server_failure(endpoint) + elif status == "UP" and not endpoint.is_healthy: + # do not mark healthy here — let the health-check thread confirm (§4.4) + pass +``` + +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.generateClusterHealth()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): builds the semicolon-delimited health string from `serverEndpoints`. +> - `MultinodeConnectionManager.pushClusterHealthToAllHealthyServers()`: calls `connect()` on every healthy server with the new cluster health embedded in `ConnectionDetails`. +> - `MultinodeConnectionManager.handleServerFailure()` (Trigger 2): submits `pushClusterHealthToAllHealthyServers()` to `healthCheckScheduler` on a genuine healthy->unhealthy transition. +> - `MultinodeConnectionManager.performHealthCheck()` (Trigger 1): calls `pushClusterHealthToAllHealthyServers()` directly after marking a server DOWN or after a recovered server is marked healthy. +> - `MultinodeStatementService.withClusterHealth(sessionInfo)`: attaches the current health string to an outgoing `SessionInfo` before each RPC (reactive secondary path). + +--- + +## 4. 
Client Responsibilities
+
+### 4.1 Connection Establishment and connHash Caching
+
+#### First connection (cache miss)
+
+1. Build a `ConnectionDetails` message (see §3.2 for field mapping).
+2. Call `connect(ConnectionDetails)` on the chosen server. Receive `SessionInfo`.
+3. Cache the returned `connHash`, keyed on `url + "|" + user + "|" + password + "|" + datasourceName`. Also store the full `ConnectionDetails` for replay if the server restarts.
+4. Return the received `SessionInfo` to the caller.
+
+#### Subsequent connections (cache hit, non-XA only)
+
+When a subsequent connection uses the same credentials:
+1. Look up `connHash` from the local cache.
+2. Build a `SessionInfo` locally without making any gRPC call:
+   ```
+   SessionInfo {
+     connHash: <cached connHash>
+     clientUUID: <process clientUUID>
+     isXA: false
+   }
+   ```
+3. Return this locally-built `SessionInfo`. No `sessionUUID` is set yet; it will be assigned by the server lazily.
+
+XA connections always call the server — caching is disabled for XA because each XA connection must create a dedicated pool entry on a specific server.
+
+#### Cache invalidation (NOT_FOUND recovery)
+
+When any gRPC call returns `Status.NOT_FOUND`, the server has lost its in-memory pool. Recovery procedure:
+1. Remove the cached `connection-key -> connHash` entry (but keep the stored `ConnectionDetails`).
+2. Re-issue a real `connect()` RPC using the stored `ConnectionDetails`.
+3. Cache the new `connHash` returned.
+4. Retry the original failed operation once with the new `SessionInfo`.
+5. This retry is only safe if the original request had no active `sessionUUID`. If a session was in progress, surface the error to the caller — the transaction state is permanently lost. 
+
+```python
+# --- First connection (cache miss) ---
+req = ConnectionDetails(
+    url = "jdbc:postgresql://db:5432/mydb",           # actual DB URL
+    user = "alice",
+    password = "secret",
+    clientUUID = CLIENT_UUID,                         # stable process UUID (§3.3)
+    serverEndpoints = ["host1:10591", "host2:10591"], # full cluster list
+    clusterHealth = build_cluster_health(endpoints),  # §3.5; "" on very first call
+    isXA = False,
+    properties = [PropertyEntry(key="ojp.datasource.name", string_value="default")]
+)
+
+session = stub.connect(req)
+# session.connHash = "abc123..." — server-computed pool key
+# session.clientUUID = CLIENT_UUID
+
+# Cache connHash for subsequent connections
+cache_key = f"{req.url}|{req.user}|{req.password}|default"
+connhash_cache[cache_key] = session.connHash
+stored_details[session.connHash] = req  # kept for NOT_FOUND recovery (see below)
+
+# --- Subsequent connection (cache hit, non-XA) ---
+# No RPC call needed — build SessionInfo locally from the cached connHash
+session = SessionInfo(
+    connHash = connhash_cache[cache_key],
+    clientUUID = CLIENT_UUID,
+    isXA = False
+    # sessionUUID is absent; the server assigns it lazily on startTransaction
+)
+
+# --- NOT_FOUND recovery ---
+# If any RPC returns Status.NOT_FOUND (server restarted, pool lost):
+old_conn_hash = connhash_cache.pop(cache_key)          # invalidate the stale cache entry
+session = stub.connect(stored_details[old_conn_hash])  # re-issue real connect()
+connhash_cache[cache_key] = session.connHash           # update cache
+# then retry the original failed RPC once
+```
+
+> **Reference implementation:**
+> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.connect()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): orchestrates first-connect vs. cache-hit logic.
+> - `MultinodeConnectionManager.computeConnectionKey()`: builds the `url|user|password|datasourceName` cache key.
+> - `MultinodeConnectionManager.invalidateConnHash()`: removes the stale key from `connHashByConnectionKey` on `NOT_FOUND`. 
+> - `MultinodeConnectionManager.reconnectForConnHash()`: re-issues the real `connect()` RPC using stored `ConnectionDetails` and updates the cache. +> - `MultinodeConnectionManager.buildLocalSessionInfo()`: constructs the in-memory `SessionInfo` for cache-hit connections without an RPC call. + +--- + +### 4.2 Session Lifecycle + +**SessionInfo fields:** + +| Field | Type | Meaning | +|---|---|---| +| `connHash` | string | Server-side key identifying which connection pool to use | +| `clientUUID` | string | Client process identity (see §3.3) | +| `sessionUUID` | string | Server-side session handle; set once a session is established | +| `transactionInfo` | `TransactionInfo` | Contains `transactionUUID` and `transactionStatus` (`TRX_ACTIVE`, `TRX_COMMITED`, `TRX_ROLLBACK`) | +| `sessionStatus` | `SessionStatus` | `SESSION_ACTIVE` or `SESSION_TERMINATED` | +| `isXA` | bool | Whether this is an XA session | +| `targetServer` | string | `host:port` of the server this session is pinned to (set by the server, used by the client for stickiness) | +| `clusterHealth` | string | Current cluster health snapshot from the server's perspective | + +**Lifecycle rules:** + +Always propagate the latest `SessionInfo` on every outgoing request. The server updates and returns it in every response; the client must replace its local copy with the one returned. When the response contains a `sessionUUID` that was absent in the request, register it immediately with the session-stickiness layer. On connection close, call `terminateSession(SessionInfo)` — this is mandatory for releasing server-side resources. If `sessionStatus == SESSION_TERMINATED` is received, treat the connection as closed and make no further calls on it. + +**Session stickiness enforcement:** + +Once a `sessionUUID` is established, every subsequent request for that session must go to the same server. Maintain a thread-safe map of `sessionUUID -> host:port`. 
On each request, if `sessionUUID` is set, look up the bound server and route the request there exclusively. If the bound server is currently marked unhealthy, raise an error to the caller — do not silently reroute. When a session is closed via `terminateSession`, remove the binding from the map and decrement the session count for that server in the load-balancing tracker.
+
+```python
+# Every gRPC call returns an updated SessionInfo — always replace the local copy
+resp = stub.executeUpdate(StatementRequest(session=current_session, sql="..."))
+
+# When a new sessionUUID appears in the response, record the server binding (§2.3).
+# Compare before overwriting the local copy; updating first would make this check never fire.
+if resp.session.sessionUUID and resp.session.sessionUUID != current_session.sessionUUID:
+    bind_session(resp.session.sessionUUID, resp.session.targetServer)
+
+current_session = resp.session  # update after every call
+
+# Close a connection — release server-side state
+stub.terminateSession(current_session)
+# After this call, discard current_session and do not make further calls on it
+```
+
+> **Reference implementation:**
+> - `ojp-jdbc-driver` — [`Connection`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Connection.java): holds the mutable `session` field (`SessionInfo`); `close()` calls `terminateSession(session)` and nulls the session.
+> - `ojp-jdbc-driver` — [`MultinodeStatementService.withClusterHealth()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeStatementService.java): enriches outgoing `SessionInfo` with the current cluster health string before each RPC.
+> - `MultinodeStatementService.checkAndBindSession()`: updates the stickiness map whenever the server returns a new or changed `sessionUUID`.
+> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.terminateSession()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): forwards `terminateSession` to every server that received a `connect()` for this `connHash`. 
+> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.affinityServer(sessionKey)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): returns the bound server for a `sessionUUID`, or selects a new one via load balancing when no binding exists yet. +> - `MultinodeConnectionManager.bindSession(sessionUUID, targetServer)`: records the `sessionUUID -> host:port` mapping in `sessionToServerMap`. +> - `MultinodeConnectionManager.unbindSession(sessionUUID)`: removes the binding on session close. +> - `ojp-jdbc-driver` — [`SessionTracker`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/SessionTracker.java): maintains per-server session counts used by the load-balancer and redistribution logic. + +--- + +### 4.3 Failover + +**What triggers failover:** + +| Status code | Trigger failover? | +|---|---| +| `UNAVAILABLE` | Yes | +| `DEADLINE_EXCEEDED` | Yes | +| `UNKNOWN` (with "connection" in message) | Yes | +| `INTERNAL` with SQL metadata trailers | **No** — this is a database-level error | +| `INTERNAL` without SQL metadata trailers | Yes — treated as a transport-level failure | +| `NOT_FOUND` | **No** — triggers reconnect (see §4.1), not failover | +| `RESOURCE_EXHAUSTED` (pool exhaustion) | **No** — surface to caller | +| `CANCELLED` | **No** — client-initiated cancellation; must never mark a server unhealthy | +| Any `SQLException` from server | **No** | + +**Failover procedure:** + +1. Capture whether the server was previously healthy (`wasHealthy`). +2. Mark the server unhealthy (`isHealthy = false`), recording the failure timestamp. +3. Log the failure. +4. If this is a genuine healthy -> unhealthy transition (`wasHealthy == true`), submit `pushClusterHealthToAllHealthyServers()` asynchronously to the background scheduler — do not block the query thread. +5. Shut down the gRPC channel for the failed server gracefully. +6. 
Select the next healthy server using the configured strategy, excluding the failed server and any already attempted in this retry cycle. +7. Retry the operation on the new server. +8. If all servers have been attempted and all failed, raise a connection error to the caller. + +Retry attempts and delay between retries are configurable (see §7.10, properties `ojp.multinode.retry.attempts` and `ojp.multinode.retry.delay`). + +**What must NOT trigger failover:** database errors, pool exhaustion, and session-invalidation errors must all be surfaced directly to the caller. + +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`GrpcExceptionHandler.isConnectionLevelError()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/GrpcExceptionHandler.java): classifies a `StatusRuntimeException` as a connectivity failure vs. a SQL/business error. `CANCELLED` is explicitly **excluded**. +> - `GrpcExceptionHandler.isPoolNotFoundException()`: returns `true` for `NOT_FOUND`, triggering reconnect rather than failover. +> - `GrpcExceptionHandler.isSessionInvalidationError()`: returns `true` when the server indicates the session is gone. +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.handleServerFailure(endpoint, exception)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): marks the server unhealthy, timestamps the failure, shuts down the gRPC channel gracefully, and submits `pushClusterHealthToAllHealthyServers()` on a genuine healthy->unhealthy transition. +> - `MultinodeStatementService.executeOpResultWithSessionStickinessAndBinding()`: the retry loop that catches `StatusRuntimeException`, calls `isConnectionLevelError`, drives the server-selection retry cycle. + +--- + +### 4.4 Health Checking + +Run a periodic background task that checks server health. 
The task must run at a configurable fixed interval (property `ojp.health.check.interval`, default 5 000 ms), not block the main execution thread, and be a daemon task so it does not prevent process shutdown. Start the background scheduler before the first connection is accepted. + +**Two-phase check:** + +**Phase 1 — probe healthy servers (detect newly failed servers):** Run when there are active XA sessions (`sessionToServerMap` is non-empty) **or** cached non-XA connection details (`connectionDetailsByConnHash` is non-empty). For each currently healthy server that passes the guard, send a probe call. If the call fails, mark the server unhealthy. + +**Phase 2 — probe unhealthy servers (detect recovery):** For each currently unhealthy server, check if enough time has passed since the last failure (property `ojp.health.check.threshold`, default 5 000 ms). If so, probe the server. If the probe succeeds, run recovery (see §4.5). + +**Health probe modes:** + +| Mode | How to probe | When to use | +|---|---|---| +| Heartbeat (lightweight) | Send `connect()` with empty `url`, `user`, `password` — any response means transport is up | Default | +| Full validation | Send `connect()` with real credentials; on success, call `terminateSession` on the returned session | When heartbeat is insufficient | + +```python +# Lightweight heartbeat: send empty credentials — any response means transport is up +def heartbeat_probe(stub): + try: + stub.connect(ConnectionDetails(url="", user="", password="")) + return True # server is reachable + except grpc.RpcError: + return False # mark server unhealthy (§4.3) + +# Full validation: connect with real credentials, then immediately terminate +def full_validation_probe(stub, stored_details): + try: + session = stub.connect(stored_details) + stub.terminateSession(session) + return True + except grpc.RpcError: + return False + +# Periodic background task +def run_health_check(endpoints, stubs, stored_details): + for ep in endpoints: + if 
ep.is_healthy: + # Phase 1 — probe healthy server; detect new failures + if stored_details or xa_sessions: # guard: skip if no connections yet + if not heartbeat_probe(stubs[ep]): + handle_server_failure(ep) + push_cluster_health_async(endpoints, stored_details) + else: + # Phase 2 — probe unhealthy server; detect recovery + if time_since(ep.last_failure) >= HEALTH_CHECK_THRESHOLD: + if heartbeat_probe(stubs[ep]): + reinitialize_pool_on_recovered_server(ep, stored_details) # §4.5 + ep.mark_healthy() + push_cluster_health_inline(endpoints, stored_details) # §3.5 +``` + +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.performHealthCheck()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): the scheduled task body; implements the two-phase check. +> - `ojp-jdbc-driver` — [`HealthCheckValidator.validateServer(endpoint)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/HealthCheckValidator.java): performs a single lightweight probe. +> - `ojp-jdbc-driver` — [`HealthCheckConfig`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/HealthCheckConfig.java): POJO holding `healthCheckIntervalMs`, `healthCheckThresholdMs`, `healthCheckTimeoutMs`, and `redistributionEnabled`. +> - `MultinodeConnectionManager` constructor: schedules `performHealthCheck` on a `ScheduledExecutorService` at the configured interval. + +--- + +### 4.5 Connection Redistribution on Recovery + +When a failed server comes back online, rebalance client-side connections so that the recovered server receives its fair share of traffic again. + +**Procedure on recovery:** + +1. **Reinitialize pools on the recovered server first** (before marking healthy). Check whether any non-XA connections have been cached (`connectionDetailsByConnHash` is non-empty). If so, for every cached `connHash`/`ConnectionDetails` pair, call `connect()` on the recovered server so it creates the HikariCP pool immediately. 
This eliminates the NOT_FOUND window that would otherwise exist between marking the server healthy and the first SQL call reaching it. Only after all pools are pre-created, proceed to step 2. +2. Mark the server healthy (`endpoint.markHealthy()`). +3. Push the updated cluster health string to all healthy servers (see §3.5) so they can resize their pools. +4. If redistribution is enabled (`ojp.redistribution.enabled = true`), begin rebalancing: + - Determine the ideal share: `totalConnections / numberOfHealthyServers`. + - Identify over-loaded servers (connections > ideal share). + - Close a fraction of idle connections on over-loaded servers so they are returned to the pool, then re-opened. + - Honour the configurable fraction (`ojp.redistribution.idleRebalanceFraction`, default 1.0) and max-close-per-cycle limit (`ojp.redistribution.maxClosePerRecovery`, default 100). + +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.reinitializePoolOnRecoveredServer(recoveredServer)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): always called **before** `endpoint.markHealthy()`. +> - `ojp-jdbc-driver` — [`ConnectionRedistributor.rebalance(recoveredServers, allHealthyServers)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/ConnectionRedistributor.java): closes a fraction of idle connections on over-loaded servers for non-XA mode. +> - `ojp-jdbc-driver` — [`XAConnectionRedistributor.rebalance(recoveredServers, allHealthyServers)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/XAConnectionRedistributor.java): equivalent redistribution for XA connections. +> - `ojp-jdbc-driver` — [`ConnectionTracker`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/ConnectionTracker.java): maintains the per-server `Connection` list consulted by `ConnectionRedistributor`. + +--- + +## 5. 
Minimal End-to-End Example + +The following pseudo-code shows the complete sequence for a basic OJP client session: channel setup, connect, query, transaction with DML, and clean close. + +```python +import grpc +import uuid + +# -- 1. Channel and stub setup +# One long-lived channel per OJP server endpoint; shared across all connections. +CLIENT_UUID = str(uuid.uuid4()) # stable for this process lifetime +channel = grpc.insecure_channel("localhost:10591") +stub = StatementServiceStub(channel) + +# In-process state +connhash_cache = {} # cache_key -> connHash +stored_details = {} # connHash -> ConnectionDetails (for NOT_FOUND recovery) + +# -- 2. connect() — first connection (cache miss) +cache_key = "jdbc:postgresql://db:5432/mydb|alice|secret|default" + +if cache_key not in connhash_cache: + req = ConnectionDetails( + url = "jdbc:postgresql://db:5432/mydb", + user = "alice", + password = "secret", + clientUUID = CLIENT_UUID, + serverEndpoints = ["localhost:10591"], + clusterHealth = "", # empty on very first connect + isXA = False, + properties = [PropertyEntry(key="ojp.datasource.name", + string_value="default")] + ) + session = stub.connect(req) + connhash_cache[cache_key] = session.connHash + stored_details[session.connHash] = req +else: + # Cache hit — build SessionInfo locally; no RPC needed + session = SessionInfo(connHash=connhash_cache[cache_key], + clientUUID=CLIENT_UUID, isXA=False) + +# -- 3. 
executeQuery() — read rows +result_set_uuid = None +rows = [] +for op_result in stub.executeQuery(StatementRequest( + session = session, + sql = "SELECT id, name FROM products WHERE active = ?", + parameters = [ParameterProto(index=1, type=PT_BOOLEAN, + values=[ParameterValue(bool_value=True)])], + statementUUID = str(uuid.uuid4()))): + qr = op_result.query_result + if result_set_uuid is None: + result_set_uuid = qr.resultSetUUID + rows.extend(qr.rows) + session = op_result.session # always update local session + +# Close result set when done +stub.callResource(CallResourceRequest( + session=session, resourceType=RES_RESULT_SET, + resourceUUID=result_set_uuid, + target=TargetCall(callType=CALL_CLOSE))) + +# -- 4. Transaction — startTransaction, executeUpdate, commitTransaction +session = stub.startTransaction(session) + +resp = stub.executeUpdate(StatementRequest( + session = session, + sql = "INSERT INTO orders(customer, amount) VALUES(?, ?)", + parameters = [ + ParameterProto(index=1, type=PT_STRING, + values=[ParameterValue(string_value="alice")]), + ParameterProto(index=2, type=PT_INT, + values=[ParameterValue(int_value=99)]) + ], + statementUUID = str(uuid.uuid4()))) +session = resp.session + +session = stub.commitTransaction(session) + +# -- 5. terminateSession() — release server-side state +stub.terminateSession(session) +# Discard session — do not make further calls on it. +channel.close() +``` + +--- + +## 6. Error Handling + +### 6.1 Error Classification Matrix + +| Condition | gRPC status | Client action | +|---|---|---| +| SQL error (bad query, constraint, etc.) 
| `INTERNAL` + `SqlErrorResponse` trailer | Throw SQL exception; do not retry; do not mark server unhealthy | +| Pool not found (server restarted) | `NOT_FOUND` | Invalidate connHash cache; reconnect; retry once (§4.1) | +| Server unreachable | `UNAVAILABLE` | Failover to next server (§4.3) | +| Request timeout | `DEADLINE_EXCEEDED` | Failover to next server (§4.3) | +| Client-side cancellation | `CANCELLED` | Do **not** failover; do **not** mark server unhealthy; surface to caller | +| Pool exhausted | `RESOURCE_EXHAUSTED` | Throw pool-exhaustion error; do not retry; do not mark server unhealthy | +| Session invalidated (server failure) | Session-not-found message | Throw session-lost error; do not retry; let caller decide | +| Session stickiness violation (server down) | Local check before RPC | Throw connection error immediately; do not reroute | + +### 6.2 SQL Errors vs. gRPC Transport Errors + +When the server encounters a SQL error, it returns `Status.INTERNAL` with a `SqlErrorResponse` message attached to the trailing metadata. The client must extract this trailer and use its fields to construct a meaningful error. + +``` +SqlErrorResponse { + reason: string // human-readable message + sqlState: string // ANSI SQL state code + vendorCode: int32 // database-specific error code + sqlErrorType: SqlErrorType // SQL_EXCEPTION or SQL_DATA_EXCEPTION +} +``` + +Map to the host language's exception hierarchy: +- `SQL_EXCEPTION` -> standard SQL exception. +- `SQL_DATA_EXCEPTION` -> data-specific SQL exception (subtype). + +A transport error — `UNAVAILABLE`, `DEADLINE_EXCEEDED`, `UNKNOWN` (with "connection" in the message), or `INTERNAL` without a `SqlErrorResponse` trailer — triggers the failover procedure in §4.3. + +> **Note:** Prior to April 2026 the server incorrectly used `Status.CANCELLED` for SQL errors. The correct status is `Status.INTERNAL` with a `SqlErrorResponse` trailer. 
Implementations must use `INTERNAL` for SQL errors and must not treat `CANCELLED` as a server failure. + +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`GrpcExceptionHandler.handle(StatusRuntimeException)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/GrpcExceptionHandler.java): extracts `SqlErrorResponse` from gRPC trailing metadata on `Status.INTERNAL` and throws the appropriate `SQLException` with SQL state and vendor code. +> - `GrpcExceptionHandler.isPoolNotFoundException(exception)`: returns `true` for `NOT_FOUND`. +> - `GrpcExceptionHandler.isSessionInvalidationError(exception)`: returns `true` for session-invalidation error messages. +> - `GrpcExceptionHandler.isConnectionLevelError(exception)`: returns `true` for `UNAVAILABLE`, `DEADLINE_EXCEEDED`, and connection-related `UNKNOWN` errors. + +--- + +## 7. Implementation Guidance + +### 7.1 Statement Execution + +All SQL is executed by populating a `StatementRequest` and calling either `executeUpdate` or `executeQuery` on the stub. + +**Parameterless SQL:** Set `sql` to the full query string and leave `parameters` empty. + +**Parameterized SQL:** Set `sql` with `?` positional placeholders and populate the `parameters` list with one `ParameterProto` per `?`. Parameters are accumulated locally and sent together in a single `StatementRequest`. Assign a `statementUUID` (a random UUID per logical prepared-statement instance) so the server can track resources tied to that statement. + +**Stored-procedure calls:** First call `callResource` with `CallType.CALL_PREPARE` to register the procedure on the server and receive a `resourceUUID`. Then call `callResource` with `CallType.CALL_EXECUTE` to run it, passing IN parameters and retrieving OUT/INOUT values from `CallResourceResponse.values`. + +**Execution routing:** +- Use `executeUpdate` for INSERT / UPDATE / DELETE / DDL — returns `OpResult` with `type = INTEGER` containing affected row count. 
+- Use `executeQuery` for SELECT — returns a server-streaming response. Consume the first `OpResult` to get the initial batch; call `fetchNextRows` for subsequent pages (see §7.4). +- After any execution, update the local `SessionInfo` from the `OpResult.session` field. + +```python +# DML — INSERT / UPDATE / DELETE (use executeUpdate) +resp = stub.executeUpdate(StatementRequest( + session = session, + sql = "INSERT INTO orders(customer, amount) VALUES(?, ?)", + parameters = [ + ParameterProto(index=1, type=PT_STRING, values=[ParameterValue(string_value="Alice")]), + ParameterProto(index=2, type=PT_INT, values=[ParameterValue(int_value=42)]) + ], + statementUUID = new_uuid() # random UUID per statement instance +)) +session = resp.session # always update local session +rows_affected = resp.value.int_value # e.g., 1 + +# Query — SELECT (use executeQuery, which is server-streaming) +req = StatementRequest( + session = session, + sql = "SELECT id, name FROM orders WHERE customer = ?", + parameters = [ParameterProto(index=1, type=PT_STRING, + values=[ParameterValue(string_value="Alice")])], + statementUUID = new_uuid() +) +result_set_uuid = None +for op_result in stub.executeQuery(req): # iterate the server-streaming response + qr = op_result.query_result + if result_set_uuid is None: + result_set_uuid = qr.resultSetUUID + labels = qr.labels # e.g., ["id", "name"] + for row in qr.rows: + id_val = row.values[0].int_value + name_val = row.values[1].string_value + session = op_result.session +# Fetch additional pages -> see §7.4 + +# Stored procedure — CALL_PREPARE then CALL_EXECUTE +prep_resp = stub.callResource(CallResourceRequest( + session = session, + resourceType = RES_CALLABLE_STATEMENT, + target = TargetCall(callType=CALL_PREPARE, resourceName="{call my_proc(?,?)}", + params=[ParameterValue(int_value=1)]) # IN param +)) +proc_uuid = prep_resp.resourceUUID +session = prep_resp.session + +exec_resp = stub.callResource(CallResourceRequest( + session = session, + 
resourceType = RES_CALLABLE_STATEMENT, + resourceUUID = proc_uuid, + target = TargetCall(callType=CALL_EXECUTE) +)) +out_value = exec_resp.values[0] # first OUT/INOUT parameter value +session = exec_resp.session +``` + +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`Statement`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Statement.java): `executeQuery(sql)` -> `statementService.executeQuery(...)`; `executeUpdate(sql)` -> `statementService.executeUpdate(...)`. +> - `ojp-jdbc-driver` — [`PreparedStatement`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/PreparedStatement.java): accumulates parameters in a `SortedMap`; all 28 `setXxx(index, value)` methods map to the corresponding `ParameterType` (see §7.2). +> - `ojp-jdbc-driver` — [`CallableStatement`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/CallableStatement.java): issues `callResource(CALL_PREPARE)` on construction; retrieves OUT/INOUT values via `callResource(CALL_EXECUTE)` after execution. 
+
+---
+
+### 7.2 Parameter Type Mapping
+
+Each parameter is represented as:
+```
+ParameterProto {
+  index: int32              // 1-based parameter position
+  type: ParameterTypeProto  // one of the 29 type codes
+  values: ParameterValue[]  // one value for normal params; multiple for array params
+}
+```
+
+**ParameterTypeProto values and their ParameterValue encoding:**
+
+| Proto enum value | Wire field in `ParameterValue` | Notes |
+|---|---|---|
+| `PT_NULL` | `is_null = true` | Explicit null |
+| `PT_BOOLEAN` | `bool_value` | |
+| `PT_BYTE` | `int_value` | Clamp to byte range |
+| `PT_SHORT` | `int_value` | Clamp to short range |
+| `PT_INT` | `int_value` | |
+| `PT_LONG` | `long_value` | |
+| `PT_FLOAT` | `float_value` | |
+| `PT_DOUBLE` | `double_value` | |
+| `PT_BIG_DECIMAL` | `string_value` | Encode as `"<unscaledInteger> <scale>"` — see §7.2.1 |
+| `PT_STRING` | `string_value` | |
+| `PT_BYTES` | `bytes_value` | Raw bytes |
+| `PT_DATE` | `date_value` | `google.type.Date` (year/month/day, no timezone) |
+| `PT_TIME` | `time_value` | `google.type.TimeOfDay` (hours/minutes/seconds/nanos) |
+| `PT_TIMESTAMP` | `timestamp_value` | `TimestampWithZone` — see §7.3 |
+| `PT_ASCII_STREAM` | `bytes_value` | ASCII bytes |
+| `PT_UNICODE_STREAM` | `bytes_value` | Unicode bytes |
+| `PT_BINARY_STREAM` | `bytes_value` | Binary bytes |
+| `PT_OBJECT` | varies | Best-effort mapping to one of the concrete value types |
+| `PT_CHARACTER_READER` | `string_value` | Contents of the character stream |
+| `PT_REF` | `string_value` | REF value as string |
+| `PT_BLOB` | (LOB reference UUID) | Create LOB first (§7.5); then pass UUID as `string_value` |
+| `PT_CLOB` | (LOB reference UUID) | Same as BLOB |
+| `PT_ARRAY` | `int_array_value` / `long_array_value` / `string_array_value` | Use the typed array message matching element type |
+| `PT_URL` | `url_value` (StringValue) | `URL.toExternalForm()` — presence-aware; unset = null |
+| `PT_ROW_ID` | `rowid_value` (StringValue) | Base64-encoded bytes of the RowId — presence-aware |
+| `PT_N_STRING` | `string_value` | Same wire format as PT_STRING |
+| `PT_N_CHARACTER_STREAM` | `string_value` | Contents of the NCharacter stream |
+| `PT_N_CLOB` | (LOB reference UUID) | Same as CLOB |
+| `PT_SQL_XML` | `string_value` | XML content as string |
+
+#### 7.2.1 BigDecimal encoding
+
+BigDecimal is serialised as a space-separated string: `"<unscaledInteger> <scale>"`.
+
+- `unscaledInteger`: the decimal string representation of the unscaled value (may be negative).
+- `scale`: integer scale (number of decimal places).
+- Full value = `unscaledInteger * 10^(-scale)`.
+
+Example: `BigDecimal("123.45")` -> `"12345 2"`.
+
+> **Note:** A separate binary wire format is documented in `documents/protocol/BIGDECIMAL_WIRE_FORMAT.md` for contexts where binary efficiency is needed.
+
+#### 7.2.2 Presence-aware fields
+
+`url_value`, `rowid_value`, `uuid_value`, `biginteger_value`, `rowidlifetime_value` are all `google.protobuf.StringValue` (a wrapper message). An absent (unset) wrapper means SQL NULL. An empty string inside the wrapper is a valid non-null value.
+
+> **Reference implementation:**
+> - `ojp-grpc-commons` — [`ProtoConverter.toProto(Parameter)`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/ProtoConverter.java): converts a host-language `Parameter` object to `ParameterProto`; `fromProto(ParameterProto)` is the inverse.
+> - `ProtoConverter.toParameterValue(Object value)`: the central dispatcher that routes each Java type to the correct `ParameterValue` oneof field.
+> - `ProtoConverter.fromParameterValue(ParameterValue, ParameterType)`: decodes a wire value back to a Java object using both the value and the declared type as hints.
+> - `ojp-grpc-commons` — [`ProtoTypeConverters`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/ProtoTypeConverters.java): handles the presence-aware `StringValue` wrappers for UUID, URL, and RowId.
+> - `ojp-grpc-commons` — [`BigDecimalWire`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/BigDecimalWire.java): `writeBigDecimal` / `readBigDecimal` — binary wire encoding for BigDecimal. + +--- + +### 7.3 Temporal Type Handling + +Timestamps are transmitted as: + +``` +TimestampWithZone { + instant: google.protobuf.Timestamp // seconds + nanos since Unix epoch (UTC) + timezone: string // IANA zone ID or UTC offset (e.g., "Europe/Rome", "+02:00") + original_type: TemporalType // preserves the caller's original type +} +``` + +**TemporalType enum:** + +| Value | Original type | +|---|---| +| `TEMPORAL_TYPE_UNSPECIFIED` | Default / unknown | +| `TEMPORAL_TYPE_TIMESTAMP` | `java.sql.Timestamp` | +| `TEMPORAL_TYPE_CALENDAR` | `java.util.Calendar` | +| `TEMPORAL_TYPE_OFFSET_DATE_TIME` | `java.time.OffsetDateTime` | +| `TEMPORAL_TYPE_LOCAL_DATE_TIME` | `java.time.LocalDateTime` | +| `TEMPORAL_TYPE_INSTANT` | `java.time.Instant` | +| `TEMPORAL_TYPE_LOCAL_DATE` | `java.time.LocalDate` | +| `TEMPORAL_TYPE_LOCAL_TIME` | `java.time.LocalTime` | +| `TEMPORAL_TYPE_OFFSET_TIME` | `java.time.OffsetTime` | + +**Encoding rules:** +1. Convert the host-language datetime value to an absolute UTC instant (seconds + nanoseconds since the Unix epoch). +2. Record the IANA timezone or UTC offset string. +3. Set `original_type` to the closest matching `TemporalType` enum value. + +**Decoding rules:** On the receiving side, use `original_type` to reconstruct the correct host-language type. + +Date-only values use `google.type.Date` (year, month, day — no timezone). Time-only values use `google.type.TimeOfDay` (hours, minutes, seconds, nanos — no timezone). + +The OJP server must always run with `user.timezone=UTC`. Client libraries should normalise to UTC when encoding timestamps, using the `timezone` field to carry the original zone for faithful reconstruction. 
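The encoding and decoding rules can be sketched with the standard library alone. The dicts below stand in for the `TimestampWithZone` message, and only offset-based zones are handled (IANA zone IDs would need `zoneinfo`); both simplifications are assumptions of this sketch:

```python
from datetime import datetime, timedelta, timezone

def encode_timestamp(dt: datetime) -> dict:
    """Encode an aware datetime as a TimestampWithZone-like dict (offset zones only)."""
    utc = dt.astimezone(timezone.utc)
    offset = dt.strftime("%z")                          # e.g. "+0200"
    return {
        "instant": {"seconds": int(utc.timestamp()),    # whole seconds since Unix epoch (UTC)
                    "nanos": utc.microsecond * 1000},   # sub-second part as nanoseconds
        "timezone": offset[:3] + ":" + offset[3:],      # normalised to "+02:00"
        "original_type": "TEMPORAL_TYPE_OFFSET_DATE_TIME",
    }

def decode_timestamp(msg: dict) -> datetime:
    """Rebuild an aware datetime from the UTC instant plus the recorded offset."""
    hours, minutes = msg["timezone"].lstrip("+-").split(":")
    delta = timedelta(hours=int(hours), minutes=int(minutes))
    tz = timezone(-delta if msg["timezone"].startswith("-") else delta)
    secs = msg["instant"]["seconds"] + msg["instant"]["nanos"] / 1_000_000_000
    return datetime.fromtimestamp(secs, tz)
```

A full client would branch on `original_type` here to reconstruct date-only, time-only, and local variants.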
+ +> **Reference implementation:** +> - `ojp-grpc-commons` — [`TemporalConverter`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/TemporalConverter.java): the definitive encoding/decoding reference for all temporal types. + +--- + +### 7.4 Result Set Streaming + +`executeQuery` is a server-streaming RPC. The response stream contains one or more `OpResult` messages: + +1. **First `OpResult`**: always contains the initial data batch in `query_result`: + - `resultSetUUID` — server-side handle for this result set. + - `labels` — ordered list of column names. + - `rows` — first batch of `ResultRow` objects, each containing a `ParameterValue` per column. + - `flag` — if `"ROW_BY_ROW"`, the server sends one row per stream message. + +2. **Subsequent `OpResult` messages** (only in non-row-by-row streaming mode): additional batches until the stream closes. + +3. **`fetchNextRows`**: After the initial stream closes, call `fetchNextRows(ResultSetFetchRequest)` with `resultSetUUID` and a page size to fetch additional rows. Repeat until the response contains an empty `rows` list. + +Map each `ParameterValue` oneof to the host language's equivalent type following the inverse of the encoding table in §7.2. Pay attention to `is_null = true` for SQL NULL values. + +**Cursor navigation:** Scrollable result sets support cursor positioning through `callResource` with `ResourceType.RES_RESULT_SET` and the appropriate `CallType`: + +| Cursor operation | CallType | +|---|---| +| `next()` | `CALL_NEXT` | +| `first()` | `CALL_FIRST` | +| `last()` | `CALL_LAST` | +| `beforeFirst()` | `CALL_BEFORE` | +| `afterLast()` | `CALL_AFTER` | +| `absolute(row)` | `CALL_ABSOLUTE` | +| `relative(rows)` | `CALL_RELATIVE` | +| `previous()` | `CALL_PREVIOUS` | +| `close()` | `CALL_CLOSE` | + +```python +# After executeQuery stream closes, fetch additional pages with fetchNextRows +result_set_uuid = ... 
# captured from the first op_result (§7.1) +all_rows = [] +while True: + resp = stub.fetchNextRows(ResultSetFetchRequest( + session = session, + resultSetUUID = result_set_uuid, + size = 500 # rows per page + )) + session = resp.session + if not resp.query_result.rows: + break # no more rows — result set exhausted + all_rows.extend(resp.query_result.rows) + +# Close the result set explicitly when done +stub.callResource(CallResourceRequest( + session = session, + resourceType = RES_RESULT_SET, + resourceUUID = result_set_uuid, + target = TargetCall(callType=CALL_CLOSE) +)) + +# Cursor navigation — jump to an absolute row (scrollable result sets only) +resp = stub.callResource(CallResourceRequest( + session = session, + resourceType = RES_RESULT_SET, + resourceUUID = result_set_uuid, + target = TargetCall( + callType = CALL_ABSOLUTE, + params = [ParameterValue(int_value=10)] # jump to row 10 + ) +)) +session = resp.session +current_row = resp.values # column values for row 10 +``` + +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`ResultSet`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/ResultSet.java): `next()` drives the multi-block iteration; `setNextOpResult()` loads a new batch from the iterator; `nextWithSessionUpdate()` updates the session from each block. All `getXxx(columnIndex)` methods call `ProtoConverter.fromParameterValue()` on the column's `ParameterValue`. +> - `ojp-jdbc-driver` — [`RemoteProxyResultSet`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/RemoteProxyResultSet.java): base class holding `resultSetUUID` and `statementService`; all scrollable-cursor operations issue `callResource(RES_RESULT_SET, CALL_FIRST/LAST/ABSOLUTE/...)`. +> - `ojp-jdbc-driver` — [`StatementServiceGrpcClient.fetchNextRows(sessionInfo, resultSetUUID, size)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java): the RPC that fetches the next page. 
+> - `ojp-grpc-commons` — [`ProtoConverter.fromProto(OpQueryResultProto)`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/ProtoConverter.java): deserialises the initial `OpQueryResult` (labels + rows + resultSetUUID). + +--- + +### 7.5 LOB (Large Object) Handling + +**LOB types:** + +| LobType enum | Meaning | +|---|---| +| `LT_BLOB` | Binary large object | +| `LT_CLOB` | Character large object | +| `LT_BINARY_STREAM` | Binary stream (column-streaming variant) | +| `LT_ASCII_STREAM` | ASCII character stream | +| `LT_UNICODE_STREAM` | Unicode character stream | +| `LT_CHARACTER_STREAM` | Generic character stream | + +**Writing a LOB (createLob):** +1. Open a client-streaming call to `createLob`. +2. Send one or more `LobDataBlock` messages: + ``` + LobDataBlock { + session: SessionInfo + position: int64 // byte offset of this chunk + data: bytes // chunk content (recommended chunk size: 32-64 KB) + lobType: LobType + metadata: PropertyEntry[] // used for binary streams to carry prepared statement info + } + ``` +3. Close the stream. The server responds with a `LobReference`: + ``` + LobReference { + session: SessionInfo + uuid: string // LOB handle + bytesWritten: int32 + lobType: LobType + } + ``` +4. Store the `LobReference.uuid`. This UUID is passed as a parameter value (§7.2) when binding the LOB to a SQL statement. + +**Reading a LOB (readLob):** Call `readLob(ReadLobRequest)`: +``` +ReadLobRequest { + lobReference: LobReference // uuid + session info + position: int64 // start byte (1-based for JDBC compatibility) + length: int32 // max bytes to return +} +``` +Receive a server-streaming response of `LobDataBlock` messages. Concatenate the `data` fields in order to reconstruct the content. + +LOB handles are server-side objects. A connection that has an open LOB must remain bound to the same server. Do not reroute such connections during failover; surface the error to the caller. 
+ +```python +CHUNK_SIZE = 64 * 1024 # 64 KB recommended chunk size + +# --- Write a LOB (createLob is client-streaming) --- +def write_lob(stub, session, data_bytes, lob_type=LT_BLOB): + def generate_blocks(): + for offset in range(0, len(data_bytes), CHUNK_SIZE): + yield LobDataBlock( + session = session, + position = offset, + data = data_bytes[offset : offset + CHUNK_SIZE], + lobType = lob_type + ) + lob_ref = stub.createLob(generate_blocks()) # client-streaming -> single LobReference + # lob_ref.uuid -> the LOB handle; pass as parameter to executeUpdate (see §7.2) + # lob_ref.bytesWritten -> sanity check + return lob_ref.uuid + +# Bind the LOB UUID when executing a statement +lob_uuid = write_lob(stub, session, my_bytes) +stub.executeUpdate(StatementRequest( + session = session, + sql = "INSERT INTO docs(content) VALUES(?)", + parameters = [ParameterProto(index=1, type=PT_BLOB, + values=[ParameterValue(string_value=lob_uuid)])] +)) + +# --- Read a LOB (readLob is server-streaming) --- +def read_lob(stub, session, lob_uuid, lob_type=LT_BLOB, max_bytes=10_000_000): + req = ReadLobRequest( + lobReference = LobReference(uuid=lob_uuid, session=session, lobType=lob_type), + position = 1, # 1-based start position + length = max_bytes + ) + return b"".join(block.data for block in stub.readLob(req)) + +content = read_lob(stub, session, lob_uuid) +``` + +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`LobServiceImpl`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/LobServiceImpl.java): `sendBytes(lobType, pos, inputStream)` opens the client-streaming `createLob` call, chunks the data into `LobDataBlock` messages, and returns the `LobReference`. `parseReceivedBlocks(Iterator)` reassembles chunks from a `readLob` stream into an `InputStream`. 
+> - `ojp-jdbc-driver` — [`StatementServiceGrpcClient.createLob(connection, iterator)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java): the client-streaming gRPC call; uses an async stub and a `CountDownLatch` to bridge the streaming API back to a synchronous return value. +> - `StatementServiceGrpcClient.readLob(lobReference, pos, length)`: the server-streaming gRPC call that returns an `Iterator`. +> - `ojp-jdbc-driver` — [`Blob`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Blob.java): `getBytes(pos, length)` and `getBinaryStream()` call `readLob`; `setBytes(pos, bytes)` calls `sendBytes`. [`Clob`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Clob.java) mirrors the same pattern for character data. +> - `ojp-jdbc-driver` — [`BinaryStream`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/BinaryStream.java): streams binary content directly via `createLob` without materialising the full byte array. + +--- + +### 7.6 Transaction Management (non-XA) + +The server tracks open transactions per session. The client controls when transactions begin and end by calling explicit RPCs. + +- **Start a transaction**: call `startTransaction(SessionInfo)`. The returned `SessionInfo` contains a `transactionUUID` and `transactionStatus = TRX_ACTIVE`. +- **Commit**: call `commitTransaction(SessionInfo)`. Returns updated `SessionInfo` with `transactionStatus = TRX_COMMITED`. +- **Rollback**: call `rollbackTransaction(SessionInfo)`. Returns updated `SessionInfo` with `transactionStatus = TRX_ROLLBACK`. + +Always replace the local `SessionInfo` with the one returned by these calls. + +Set or get the isolation level by calling `callResource` with `RES_CONNECTION` and `CallType.CALL_SET` / `CALL_GET` and resource name `"TransactionIsolation"`. The isolation level should be reset to the default after each logical connection is reused. 
+ +```python +# Begin an explicit transaction +session = stub.startTransaction(session) +# session.transactionInfo.transactionUUID = "txn-uuid" +# session.transactionInfo.transactionStatus = TRX_ACTIVE + +# Execute SQL within the open transaction +resp = stub.executeUpdate(StatementRequest(session=session, sql="INSERT INTO orders ...")) +session = resp.session # always update local session + +# Commit +session = stub.commitTransaction(session) +# session.transactionInfo.transactionStatus = TRX_COMMITED + +# -- OR -- Rollback +session = stub.rollbackTransaction(session) +# session.transactionInfo.transactionStatus = TRX_ROLLBACK + +# Set transaction isolation (READ_COMMITTED = 2) +resp = stub.callResource(CallResourceRequest( + session = session, + resourceType = RES_CONNECTION, + target = TargetCall( + callType = CALL_SET, + resourceName = "TransactionIsolation", + params = [ParameterValue(int_value=2)] + ) +)) +session = resp.session + +# Get current isolation level +resp = stub.callResource(CallResourceRequest( + session = session, + resourceType = RES_CONNECTION, + target = TargetCall(callType=CALL_GET, resourceName="TransactionIsolation") +)) +isolation_level = resp.values[0].int_value +session = resp.session +``` + +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`Connection.setAutoCommit(boolean)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Connection.java): calls `commitTransaction` when switching on and `startTransaction` when switching off. +> - `Connection.commit()` / `Connection.rollback()`: delegate to `statementService.commitTransaction(session)` / `rollbackTransaction(session)` when `autoCommit == false`. +> - `Connection.close()`: calls `terminateSession(session)` unconditionally. +> - `Connection.setTransactionIsolation(level)` / `getTransactionIsolation()`: forwarded via `callProxy(CallType.CALL_SET/GET, "TransactionIsolation", ...)`. 
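A client library will usually wrap these three RPCs in a host-language idiom. A hypothetical Python context manager — `stub` is any object exposing the three RPCs, and `SessionHolder` is an illustrative wrapper (not part of the protocol) that enforces the "always replace the local SessionInfo" rule:

```python
from contextlib import contextmanager

class SessionHolder:
    """Illustrative holder: every RPC result overwrites the local SessionInfo."""
    def __init__(self, session):
        self.session = session

@contextmanager
def transaction(stub, holder):
    holder.session = stub.startTransaction(holder.session)          # TRX_ACTIVE
    try:
        yield holder
    except Exception:
        holder.session = stub.rollbackTransaction(holder.session)   # TRX_ROLLBACK
        raise
    else:
        holder.session = stub.commitTransaction(holder.session)     # TRX_COMMITED
```

On a clean exit the body commits; any exception rolls back and re-raises, mirroring the reference driver's `autoCommit == false` behaviour.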
+
+---
+
+### 7.7 Savepoints
+
+Savepoints are implemented through the `callResource` protocol using `ResourceType.RES_SAVEPOINT`.
+
+**Creating a savepoint:** Call `callResource` with `resourceType = RES_SAVEPOINT`, `target.callType = CALL_SET`, `target.resourceName = "Savepoint"`, and `target.params = [savepointName]` if named; empty for anonymous savepoints. The response contains the savepoint UUID in `CallResourceResponse.resourceUUID`.
+
+**Rolling back to a savepoint:** Call `callResource` with `resourceType = RES_SAVEPOINT`, `resourceUUID = <savepoint UUID>`, `target.callType = CALL_ROLLBACK`.
+
+**Releasing a savepoint:** Call `callResource` with `resourceType = RES_SAVEPOINT`, `resourceUUID = <savepoint UUID>`, `target.callType = CALL_RELEASE`.
+
+```python
+# Create a named savepoint
+resp = stub.callResource(CallResourceRequest(
+    session = session,
+    resourceType = RES_SAVEPOINT,
+    target = TargetCall(
+        callType = CALL_SET,
+        resourceName = "Savepoint",
+        params = [ParameterValue(string_value="my_savepoint")]  # omit for anonymous
+    )
+))
+savepoint_uuid = resp.resourceUUID  # keep this to roll back or release later
+session = resp.session
+
+# Roll back to the savepoint (partial undo)
+resp = stub.callResource(CallResourceRequest(
+    session = session,
+    resourceType = RES_SAVEPOINT,
+    resourceUUID = savepoint_uuid,
+    target = TargetCall(callType=CALL_ROLLBACK, resourceName="Savepoint")
+))
+session = resp.session
+
+# Release the savepoint (no longer needed)
+resp = stub.callResource(CallResourceRequest(
+    session = session,
+    resourceType = RES_SAVEPOINT,
+    resourceUUID = savepoint_uuid,
+    target = TargetCall(callType=CALL_RELEASE, resourceName="Savepoint")
+))
+session = resp.session
+```
+
+> **Reference implementation:**
+> - `ojp-jdbc-driver` — [`Connection.setSavepoint()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Connection.java) / `setSavepoint(name)`: calls `callProxy` with `CALL_SET`, `"Savepoint"`, and the optional name; wraps the returned resource UUID in a 
[`Savepoint`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Savepoint.java) object. +> - `Connection.rollback(Savepoint)`: calls `callProxy` with `CALL_ROLLBACK`, `"Savepoint"`, and the savepoint's resource UUID. +> - `Connection.releaseSavepoint(Savepoint)`: calls `callProxy` with `CALL_RELEASE`. + +--- + +### 7.8 XA / Distributed Transactions + +XA support maps the standard XA resource manager protocol to gRPC RPCs. XA connections are always pinned to a single server (see §2.3). + +**XA transaction lifecycle:** + +``` +xaStart(XaStartRequest) -- Begin branch; safe to retry on connection error +xaEnd(XaEndRequest) -- End branch; NEVER retry after this point +xaPrepare(XaPrepareRequest) -- Two-phase prepare; returns XA_OK or XA_RDONLY +xaCommit(XaCommitRequest) -- Commit (onePhase=true for one-phase optimisation) +xaRollback(XaRollbackRequest) -- Roll back the branch +xaRecover(XaRecoverRequest) -- List in-doubt XIDs (for recovery after crash) +xaForget(XaForgetRequest) -- Forget a heuristically completed branch +``` + +**Xid encoding (XidProto):** + +| Field | Type | Meaning | +|---|---|---| +| `formatId` | int32 | Transaction format ID | +| `globalTransactionId` | bytes | Global transaction ID (up to 64 bytes) | +| `branchQualifier` | bytes | Branch qualifier (up to 64 bytes) | + +**Retry policy:** `xaStart` only: retry on connection-level errors. All other XA operations: do not retry automatically. Surface failures to the caller's transaction manager. + +```python +xid = XidProto( + formatId = 1, + globalTransactionId = b"global-tx-001", + branchQualifier = b"branch-1" +) + +# 1. Start the XA branch (safe to retry on connection error) +resp = stub.xaStart(XaStartRequest(session=session, xid=xid, flags=0)) +session = resp.session # bind session.targetServer -> this server for all remaining calls + +# 2. 
Execute SQL within the branch (normal executeUpdate/executeQuery calls) +resp = stub.executeUpdate(StatementRequest(session=session, sql="UPDATE accounts ...")) +session = resp.session + +# 3. End the branch — do NOT retry past this point +resp = stub.xaEnd(XaEndRequest(session=session, xid=xid, flags=0)) +session = resp.session + +# 4. Prepare (two-phase commit, phase 1) +prep = stub.xaPrepare(XaPrepareRequest(session=session, xid=xid)) +# prep.result = XA_OK (proceed to commit) or XA_RDONLY (read-only; no commit needed) + +# 5a. Commit (two-phase) +stub.xaCommit(XaCommitRequest(session=session, xid=xid, onePhase=False)) + +# 5b. -- OR -- One-phase optimisation (skip xaPrepare) +stub.xaCommit(XaCommitRequest(session=session, xid=xid, onePhase=True)) + +# 5c. -- OR -- Rollback +stub.xaRollback(XaRollbackRequest(session=session, xid=xid)) + +# Recovery: list in-doubt XIDs after a crash +resp = stub.xaRecover(XaRecoverRequest(session=session, flag=TMSTARTRSCAN)) +for recovered_xid in resp.xids: + stub.xaCommit(...) # or xaRollback -- decision belongs to the transaction manager + +# Forget a heuristically completed branch +stub.xaForget(XaForgetRequest(session=session, xid=xid)) +``` + +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`OjpXAResource`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/xa/OjpXAResource.java): implements `XAResource`; all 10 lifecycle methods; contains the `xaStart` retry loop and the `toXidProto` / `fromXidProto` conversion helpers. +> - `ojp-jdbc-driver` — [`OjpXAConnection`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/xa/OjpXAConnection.java): creates the XA-mode `StatementService` connection and vends `OjpXAResource`. +> - `ojp-jdbc-driver` — [`OjpXADataSource`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/xa/OjpXADataSource.java): entry point for XA; calls `MultinodeConnectionManager.connectXA()` to pin the session to a single server. 
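Two client-side details from the rules above can be sketched in plain Python: validating the `XidProto` 64-byte component caps before the first `xaStart`, and the xaStart-only retry loop. The dict shape and the `ConnectionError` stand-in (for a gRPC connection-level error) are illustrative, not the generated types:

```python
def make_xid(format_id: int, gtrid: bytes, bqual: bytes) -> dict:
    """Build an XidProto-like dict, enforcing the 64-byte limits from the Xid table."""
    if len(gtrid) > 64:
        raise ValueError("globalTransactionId exceeds 64 bytes")
    if len(bqual) > 64:
        raise ValueError("branchQualifier exceeds 64 bytes")
    return {"formatId": format_id,
            "globalTransactionId": gtrid,
            "branchQualifier": bqual}

def xa_start_with_retry(stub, session, xid, flags=0, attempts=3):
    """xaStart is the ONLY XA call that may be retried on connection errors."""
    last = None
    for _ in range(attempts):
        try:
            return stub.xaStart(session=session, xid=xid, flags=flags)
        except ConnectionError as exc:      # stand-in for a transport-level failure
            last = exc
    raise last
```

All other XA operations (`xaEnd` onward) must surface failures directly to the transaction manager without retrying.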
+ +--- + +### 7.9 callResource Protocol + +The `callResource` RPC is a generic mechanism for operations that do not fit a dedicated RPC — primarily `DatabaseMetaData` queries, `ResultSet` cursor/update operations, `Statement` cancellation, savepoint management, and resource lifecycle calls. + +**Request:** + +``` +CallResourceRequest { + session: SessionInfo + resourceType: ResourceType // what kind of resource to call + resourceUUID: string // the server-side handle for this resource + target: TargetCall // the specific operation to perform + properties: PropertyEntry[] +} +``` + +**TargetCall (supports chaining):** + +``` +TargetCall { + callType: CallType // one of the 47+ call type codes + resourceName: string // e.g., "Catalog", "TransactionIsolation", "Savepoint" + params: ParameterValue[] // input arguments + nextCall: TargetCall // optional chained call (recursive) +} +``` + +**ResourceType values:** + +| Value | Meaning | +|---|---| +| `RES_RESULT_SET` | An open result set | +| `RES_STATEMENT` | A plain statement | +| `RES_PREPARED_STATEMENT` | A prepared statement | +| `RES_CALLABLE_STATEMENT` | A callable statement | +| `RES_LOB` | A LOB object | +| `RES_CONNECTION` | The connection itself (for metadata, catalog, etc.) | +| `RES_SAVEPOINT` | A savepoint | + +**Response:** + +``` +CallResourceResponse { + session: SessionInfo + resourceUUID: string // UUID of a newly created resource, if any + values: ParameterValue[] // return values (may be empty) +} +``` + +Always update the local `SessionInfo` from `response.session`. 
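The recursive `nextCall` chaining can be built from a flat list of operations; a small illustrative helper, with plain dicts standing in for the `TargetCall` message:

```python
def chain_calls(*calls):
    """Fold (callType, resourceName, params) tuples into a nested TargetCall-like dict.

    The first tuple becomes the outermost call; each later tuple is linked
    through nextCall, so the server executes them in the given order.
    """
    target = None
    for call_type, name, params in reversed(calls):
        target = {"callType": call_type,
                  "resourceName": name,
                  "params": params,
                  "nextCall": target}
    return target
```

The response's `values` list then carries one return value per chained call, in chain order.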
+ +**CallType reference (47 codes):** `CALL_SET`, `CALL_GET`, `CALL_IS`, `CALL_ALL`, `CALL_NULLS`, `CALL_USES`, `CALL_SUPPORTS`, `CALL_STORES`, `CALL_NULL`, `CALL_DOES`, `CALL_DATA`, `CALL_NEXT`, `CALL_CLOSE`, `CALL_WAS`, `CALL_CLEAR`, `CALL_FIND`, `CALL_BEFORE`, `CALL_AFTER`, `CALL_FIRST`, `CALL_LAST`, `CALL_ABSOLUTE`, `CALL_RELATIVE`, `CALL_PREVIOUS`, `CALL_ROW`, `CALL_UPDATE`, `CALL_INSERT`, `CALL_DELETE`, `CALL_REFRESH`, `CALL_CANCEL`, `CALL_MOVE`, `CALL_OWN`, `CALL_OTHERS`, `CALL_UPDATES`, `CALL_DELETES`, `CALL_INSERTS`, `CALL_LOCATORS`, `CALL_AUTO`, `CALL_GENERATED`, `CALL_RELEASE`, `CALL_NATIVE`, `CALL_PREPARE`, `CALL_ROLLBACK`, `CALL_ABORT`, `CALL_EXECUTE`, `CALL_ADD`, `CALL_ENQUOTE`, `CALL_REGISTER`, `CALL_LENGTH` + +```python +# --- Get the database catalog name (connection-level metadata) --- +resp = stub.callResource(CallResourceRequest( + session = session, + resourceType = RES_CONNECTION, + resourceUUID = "", # empty for connection-level calls + target = TargetCall(callType=CALL_GET, resourceName="Catalog") +)) +catalog_name = resp.values[0].string_value +session = resp.session # always update local session + +# --- Check a database capability --- +resp = stub.callResource(CallResourceRequest( + session = session, + resourceType = RES_CONNECTION, + target = TargetCall(callType=CALL_SUPPORTS, resourceName="Transactions") +)) +supports_transactions = resp.values[0].bool_value +session = resp.session + +# --- Cancel a running statement --- +resp = stub.callResource(CallResourceRequest( + session = session, + resourceType = RES_STATEMENT, + resourceUUID = statement_uuid, # UUID of the statement to cancel + target = TargetCall(callType=CALL_CANCEL) +)) +session = resp.session + +# --- Chained call: get Schema and Catalog in one round-trip --- +resp = stub.callResource(CallResourceRequest( + session = session, + resourceType = RES_CONNECTION, + target = TargetCall( + callType = CALL_GET, + resourceName = "Schema", + nextCall = TargetCall(callType=CALL_GET, 
resourceName="Catalog") + ) +)) +schema_name = resp.values[0].string_value +catalog_name = resp.values[1].string_value +session = resp.session +``` + +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`StatementServiceGrpcClient.callResource(CallResourceRequest)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java): the single-node gRPC call. +> - `ojp-jdbc-driver` — [`DatabaseMetaData`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/DatabaseMetaData.java): every `DatabaseMetaData` method (>200 in total) is implemented by calling `callResource` with `RES_CONNECTION` and the appropriate `CallType`. +> - `ojp-jdbc-driver` — `Connection.callProxy(callType, resourceName, returnType, params)`: the private convenience wrapper used throughout `Connection` and `DatabaseMetaData`. + +--- + +### 7.10 Configuration System + +**Configuration sources (in priority order):** + +1. System / environment properties (highest priority) — e.g., `-Dojp.health.check.interval=10000`. +2. `ojp.properties` file — loaded from the classpath or a well-known filesystem path. +3. Built-in defaults (lowest priority). + +**Property namespacing:** Properties can be global or per-datasource. 
Per-datasource properties are prefixed with the datasource name: + +```properties +# Global +ojp.health.check.interval=5000 + +# Per-datasource (datasource name: "analytics") +analytics.ojp.health.check.interval=10000 +``` + +**Standard configuration properties:** + +| Property | Default | Meaning | +|---|---|---| +| `ojp.health.check.interval` | `5000` (ms) | Periodic health check interval | +| `ojp.health.check.threshold` | `5000` (ms) | Minimum wait before re-probing an unhealthy server | +| `ojp.health.check.timeout` | `5000` (ms) | Probe call timeout | +| `ojp.redistribution.enabled` | `true` | Enable/disable the health checker and redistribution | +| `ojp.redistribution.idleRebalanceFraction` | `1.0` | Fraction of idle connections to close per rebalance cycle | +| `ojp.redistribution.maxClosePerRecovery` | `100` | Max connections closed per recovery event | +| `ojp.loadaware.selection.enabled` | `true` | Use least-connections; `false` = round-robin | +| `ojp.multinode.retry.attempts` | `3` | Max failover retry attempts | +| `ojp.multinode.retry.delay` | `100` (ms) | Delay between retry attempts | +| `ojp.datasource.name` | `"default"` | Active datasource name (sent to the server) | +| `ojp.grpc.tls.enabled` | `false` | Enable TLS on gRPC channels | +| `ojp.grpc.tls.cert.path` | — | Path to client certificate for mTLS | + +**Duration format:** No suffix = milliseconds; `ms` = milliseconds; `s` = seconds; `m` = minutes. + +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`DatasourcePropertiesLoader`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/DatasourcePropertiesLoader.java): `loadOjpPropertiesForDataSource(datasourceName)` merges file properties, system properties, and environment variables with per-datasource prefix resolution. 
+> - `ojp-jdbc-driver` — [`HealthCheckConfig`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/HealthCheckConfig.java): the strongly-typed POJO that holds all health-check and redistribution settings.
+> - `ojp-jdbc-driver` — [`MultinodeUrlParser.readIntProperty(props, key, default)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeUrlParser.java) / `readLongProperty(...)`: reads typed values from the merged `Properties` object.
+> - `ojp-grpc-commons` — [`GrpcClientConfig.load()`](../../ojp-grpc-commons/src/main/java/org/openjproxy/config/GrpcClientConfig.java): loads the gRPC-specific settings (max inbound message size, TLS config) from `ojp.properties`.
+
+---
+
+### 7.11 Query Result Caching
+
+Cache configuration is defined entirely on the client side — the client reads local cache rules and sends them to the server as `ConnectionDetails.properties` entries during `connect()`. The server applies them transparently; the client does not implement any caching logic itself.
+
+**Properties sent to the server:**
+
+| Property key | Meaning |
+|---|---|
+| `ojp.cache.enabled` | `"true"` to enable caching |
+| `ojp.cache.queries.<n>.pattern` | Regex pattern matching SQL queries to cache |
+| `ojp.cache.queries.<n>.ttl` | TTL in seconds for cached results |
+| `ojp.cache.queries.<n>.invalidateOn` | Comma-separated table names that invalidate this rule |
+| `ojp.cache.queries.<n>.enabled` | `"true"` / `"false"` to toggle individual rules |
+
+`<n>` is a 1-based integer index. Rules are processed in index order.
+ +**Example configuration:** + +```properties +ojp.cache.enabled=true +ojp.cache.queries.1.pattern=SELECT .* FROM products.* +ojp.cache.queries.1.ttl=600 +ojp.cache.queries.1.invalidateOn=products,product_prices +ojp.cache.queries.2.pattern=SELECT .* FROM users.* +ojp.cache.queries.2.ttl=300 +ojp.cache.queries.2.invalidateOn=users +``` + +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`CacheConfigurationBuilder.addCachePropertiesToMap(propertiesMap, datasourceName)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/CacheConfigurationBuilder.java): reads cache rules from the loaded `Properties` and appends them to the `ConnectionDetails.properties` map that is sent to the server on `connect()`. + +--- + +### 7.12 Security / Transport + +**Plaintext (default):** Create a plaintext gRPC channel targeting `dns:///host:port`. Suitable for internal networks or local development. + +**TLS:** When `ojp.grpc.tls.enabled = true`, create a TLS-secured channel. Use the platform's default trust store or a custom CA certificate. Support mutual TLS (mTLS) when `ojp.grpc.tls.cert.path` is set. Certificate paths and key material must be loaded from configurable filesystem paths. + +**Credential handling:** Passwords must never be logged or included in exception messages. Connection keys used for cache lookups may include the password as a cache key only — they must not be serialised or persisted. + +> **Reference implementation:** +> - `ojp-grpc-commons` — [`GrpcChannelFactory.createChannel(host, port)`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/GrpcChannelFactory.java): creates a plaintext `ManagedChannel` with configurable max inbound message size; `createSecureChannel(host, port, size, tlsConfig)` builds the TLS-secured variant. +> - `ojp-grpc-commons` — [`GrpcClientConfig`](../../ojp-grpc-commons/src/main/java/org/openjproxy/config/GrpcClientConfig.java): exposes `getTlsConfig()` and `getMaxInboundMessageSize()`. 
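A sketch of how a client might flatten locally configured rules into the property map sent on `connect()`. The rule-dict field names are illustrative; only the generated `ojp.cache.*` keys are protocol-relevant, and the 1-based `<n>` index is assigned from list order:

```python
def build_cache_properties(rules, enabled=True):
    """Flatten cache rules into the ojp.cache.* entries for ConnectionDetails.properties."""
    props = {"ojp.cache.enabled": "true" if enabled else "false"}
    for n, rule in enumerate(rules, start=1):        # <n> is 1-based; processed in order
        prefix = f"ojp.cache.queries.{n}"
        props[f"{prefix}.pattern"] = rule["pattern"]
        props[f"{prefix}.ttl"] = str(rule["ttl"])
        if rule.get("invalidate_on"):
            props[f"{prefix}.invalidateOn"] = ",".join(rule["invalidate_on"])
        props[f"{prefix}.enabled"] = "true" if rule.get("enabled", True) else "false"
    return props
```

The resulting map is merged into `ConnectionDetails.properties`; the server does all matching and invalidation.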
+> - `ojp-grpc-commons` — [`TlsConfig`](../../ojp-grpc-commons/src/main/java/org/openjproxy/config/TlsConfig.java): holds `enabled`, `certPath`, `keyPath`, `caPath`, and `clientAuth` flags. + +--- + +### 7.13 DataSource / Integration API + +Provide a higher-level `DataSource` (or equivalent) object that holds connection configuration (URL, user, password, properties) and exposes a `getConnection()` method that calls `Driver.connect()` internally. Integrate cleanly with the host language's database access conventions. + +For Java/Spring Boot, provide a `spring-boot-starter-ojp` auto-configuration module. Auto-configure an `OjpDataSource` bean when the driver is on the classpath. Disable the framework's own built-in connection pool (e.g., HikariCP in Spring Boot) when OJP is in use — double-pooling is the most common misconfiguration and causes incorrect behaviour. + +For other languages, document clearly in the library README that the application-side connection pool must be disabled when using OJP. + +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`OjpDataSource`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/OjpDataSource.java): implements `javax.sql.DataSource`; `getConnection()` / `getConnection(user, password)` delegate to `DriverManager.getConnection(url, info)`. +> - `ojp-jdbc-driver` — [`OjpXADataSource`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/xa/OjpXADataSource.java): implements `javax.sql.XADataSource`; `getXAConnection()` creates an `OjpXAConnection` (and thus an `OjpXAResource`) for JTA integration. +> - `spring-boot-starter-ojp` module: provides the Spring Boot auto-configuration class and the `OjpSystemPropertiesBridge` bean; sets `spring.datasource.type=OjpDataSource` and excludes `DataSourceAutoConfiguration` to prevent double-pooling. + +--- + +## 8. Testing Coverage + +A conformant client implementation must ship a test suite that exercises all the aspects above. 
Tests that require a live OJP server (and optionally a real database) should be **gated behind feature flags** so the suite can run incrementally in CI. + +**Test infrastructure requirements:** +- A running OJP server (see `ojp-server` module and `download-drivers.sh`). +- At minimum, an embedded/in-process database (e.g., H2) for fast baseline tests. +- Optional: containerised databases (PostgreSQL, MySQL, MariaDB, Oracle, SQL Server, DB2, CockroachDB) gated by per-database flags. + +**Test categories and required scenarios:** + +#### Basic CRUD +- SELECT, INSERT, UPDATE, DELETE via plain Statement and PreparedStatement. +- Verify affected row counts, returned ResultSet contents. +- Verify empty result sets are handled correctly. + +#### Multiple data types +- Round-trip every `ParameterTypeProto` value through INSERT + SELECT. +- Cover: all integer widths, float, double, BigDecimal, string, boolean, byte array, date, time, timestamp (with and without timezone), LocalDate, LocalTime, LocalDateTime, OffsetDateTime, OffsetTime, Instant, URL, UUID, RowId, BLOB, CLOB, array, NULLs for each type. + +#### Statement variants +- Plain `Statement`: `executeQuery`, `executeUpdate`, `execute`, `executeBatch`, `getResultSet`, `getUpdateCount`, `getGeneratedKeys`, `cancel`, `close`. +- `PreparedStatement`: all `setXxx` methods, `executeBatch`, multiple executions with the same prepared statement, `getParameterMetaData`. +- `CallableStatement`: IN, OUT, INOUT parameters; `registerOutParameter`; retrieval of OUT values after execution; named parameters where supported. + +#### ResultSet navigation +- Forward-only cursors: `next()`, `wasNull()`, `close()`. +- Scrollable cursors: `first()`, `last()`, `beforeFirst()`, `afterLast()`, `absolute(n)`, `relative(n)`, `previous()`. +- Multi-block pagination: queries large enough to exceed one fetch page; verify all rows are retrieved. 
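
The multi-block pagination scenario above reduces to a drain loop: one `executeQuery` for the first block, then repeated `fetchNextRows` calls until the server runs out of rows. The sketch below runs against a faked transport; the stub and method names, and the convention that a block shorter than the page size signals end-of-data, are assumptions for illustration only.

```python
# Sketch of draining a paged result set. Assumes exhaustion is signalled by a
# block shorter than the page size; FakePagingStub stands in for the real
# gRPC transport.
def iter_all_rows(stub, request, page_size):
    block = stub.execute_query(request)       # first block arrives with the query
    while True:
        yield from block
        if len(block) < page_size:            # short (or empty) block: no more pages
            return
        block = stub.fetch_next_rows(request)

class FakePagingStub:
    """Serves pre-canned blocks in order, emulating server-side pagination."""
    def __init__(self, pages):
        self._pages = list(pages)
    def execute_query(self, request):
        return self._pages.pop(0)
    def fetch_next_rows(self, request):
        return self._pages.pop(0) if self._pages else []
```

A conformance test then asserts that every row comes back exactly once and in order, regardless of how many blocks the server split the result into.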
+ +#### ResultSet metadata +- `getColumnCount()`, `getColumnName()`, `getColumnType()`, `getColumnTypeName()`, `getPrecision()`, `getScale()`, `isNullable()`, `isAutoIncrement()`. + +#### DatabaseMetaData +- `getTables()`, `getColumns()`, `getPrimaryKeys()`, `getIndexInfo()`, `getProcedures()`, `getTypeInfo()`, `supportsXxx()` methods. +- Verify results match the actual database schema. + +#### Transactions +- Commit: insert rows in a transaction, commit, verify rows persist. +- Rollback: insert rows in a transaction, rollback, verify rows are absent. +- `autoCommit = false` then `setAutoCommit(true)` — verify implicit commit. +- Transaction isolation level: set, verify via `getTransactionIsolation()`, reset after connection return. + +#### Savepoints +- Create a named and an anonymous savepoint. +- Rollback to each; verify partial rollback semantics. +- Release a savepoint. + +#### XA transactions +- Full lifecycle: `xaStart`, `xaEnd`, `xaPrepare`, `xaCommit`. +- Rollback path: `xaStart`, `xaEnd`, `xaPrepare`, `xaRollback`. +- One-phase commit (`onePhase=true`). +- `xaRecover`: verify in-doubt XIDs are returned. +- `xaForget`: verify heuristically completed branch is removed. +- Transaction isolation reset after XA session. + +#### LOBs +- BLOB: write a small blob (< 1 chunk), a large blob (multiple chunks), read back both; verify byte-for-byte equality. +- CLOB: same as BLOB but with character content. +- Binary stream, ASCII stream, Unicode stream: write via stream API, read back. +- Hydratable LOB: verify that a LOB reference can be passed as a parameter to a second statement. +- NULL LOB: verify that `setBlob(null)` / `setClob(null)` sends a SQL NULL. + +#### Session affinity +- Verify that a connection with an open transaction always routes to the same server. +- Verify that a connection holding an open LOB always routes to the same server. +- Verify that when the bound server is down, an appropriate error is raised rather than silent rerouting. 
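
These affinity scenarios all exercise one client-side invariant: a session, once bound, routes only to its `targetServer`, and a down bound server is an error, never a reroute. A compact sketch of that invariant — the class and method names are illustrative (the Java reference keeps this state in `SessionTracker`):

```python
# Illustrative affinity table: sessionUUID -> bound endpoint, with the
# fail-don't-reroute rule for unhealthy bound servers.
class AffinityTable:
    def __init__(self):
        self._bindings = {}    # sessionUUID -> "host:port"
        self._healthy = set()  # endpoints currently considered healthy

    def mark_healthy(self, endpoint, healthy=True):
        (self._healthy.add if healthy else self._healthy.discard)(endpoint)

    def bind(self, session_uuid, target_server):
        self._bindings[session_uuid] = target_server

    def route(self, session_uuid):
        target = self._bindings[session_uuid]
        if target not in self._healthy:
            # Sticky requests must fail loudly, never silently reroute.
            raise ConnectionError(f"bound server {target} is unavailable")
        return target

    def unbind(self, session_uuid):
        self._bindings.pop(session_uuid, None)  # idempotent, like terminateSession
```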
+ +#### Multi-block / large result sets +- Execute a query that returns more rows than one page. Verify all rows arrive and are in the correct order. + +#### Multinode load balancing +- With two or more server endpoints, open `N` connections and verify they are distributed across servers (round-robin and least-connections modes separately). + +#### Multinode failover +- Terminate one server mid-operation; verify the operation is retried on a surviving server (for stateless operations). +- Verify a server is marked unhealthy after failure. +- Verify subsequent connections avoid the unhealthy server. + +#### Multinode recovery and redistribution +- Bring a server back; verify it is marked healthy after the health check interval. +- Verify new connections start routing to the recovered server. +- Verify connection redistribution closes a fraction of idle connections on over-loaded servers. + +#### XA multinode +- Verify that each XA session binds to exactly one server. +- Verify that failover of an XA session to another server raises an error (not a silent reroute). +- Verify XA redistribution after server recovery. + +#### connHash caching / connect-RPC skip +- Open two connections with the same credentials; verify only one `connect()` gRPC call is made. +- Simulate a `NOT_FOUND` response; verify the driver invalidates the cache and re-issues `connect()`. + +#### Session stickiness error path +- Establish a session on server A. Mark server A unhealthy. Attempt a SQL operation. Verify an error is raised rather than the request being silently routed to server B. + +#### Cluster health propagation +- Stop one server; verify the cluster health string sent in subsequent requests marks it `DOWN`. +- Recover the server; verify the health string marks it `UP`. + +#### Concurrency / pool exhaustion +- Send more concurrent requests than the server-side pool size; verify pool-exhaustion errors are surfaced cleanly and do not mark servers unhealthy. 
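
The cluster-health scenarios above compare against the propagated health string, whose segments have the form `host:port(UP|DOWN)` joined by semicolons. A round-trip sketch (the function names are illustrative):

```python
# Parse and re-serialise the cluster health string, e.g.
# "nodeA:1059(UP);nodeB:1059(DOWN)"  <->  {"nodeA:1059": True, "nodeB:1059": False}
def parse_cluster_health(s):
    health = {}
    for segment in filter(None, s.split(";")):
        endpoint, _, state = segment.rstrip(")").partition("(")
        health[endpoint] = state == "UP"
    return health

def format_cluster_health(health):
    return ";".join(f"{ep}({'UP' if up else 'DOWN'})" for ep, up in health.items())
```

The tests then stop a server, re-read the string sent on the next request, and assert that its segment flipped to `DOWN` (and back to `UP` after recovery).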
+ +#### Slow query segregation +- Send queries that take longer than the slow-query threshold; verify they use the reserved slow-query slots and do not starve fast queries. + +#### Multi-datasource +- Configure two endpoints with different datasource names; verify each endpoint uses its own datasource configuration. + +#### Configuration loading +- Verify properties are loaded from `ojp.properties`. +- Verify system properties override file properties. +- Verify per-datasource properties override global properties. + +#### Performance / mini stress +- Open and close 100-1000 connections in parallel; verify no connection leaks, no deadlocks, and no degrading error rate. + +#### Database-specific test suites + +Each database must have a dedicated test class gated by its own flag: + +| Database | Feature flag | +|---|---| +| H2 | `enableH2Tests` | +| PostgreSQL | `enablePostgresTests` | +| MySQL | `enableMySQLTests` | +| MariaDB | `enableMariaDBTests` | +| Oracle | `enableOracleTests` | +| SQL Server | `enableSqlServerTests` | +| DB2 | `enableDb2Tests` | +| CockroachDB | `enableCockroachDBTests` | + +H2 tests (in-process, no external dependency) must always be runnable in CI without any extra setup and should act as the first gate before any database-specific jobs run. 
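
The gating itself can be as simple as a truthy flag lookup. The Java suite reads these flags as JVM system properties (e.g. `-DenableH2Tests=true`); the sketch below uses environment variables to show the same idea — the env-var mechanism is an assumption for non-JVM languages, not a requirement:

```python
# Illustrative per-database gate: a suite runs only when its flag (from the
# table above) is explicitly set to "true"; everything else is skipped.
import os

def suite_enabled(flag_name, env=os.environ):
    """True only when the per-database feature flag is explicitly 'true'."""
    return env.get(flag_name, "false").lower() == "true"
```

A CI pipeline would set `enableH2Tests` unconditionally (the always-runnable first gate) and enable the container-backed flags only in the dedicated per-database jobs.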
+ +> **Reference implementation — test classes by area:** +> +> | Test area | Java test class(es) | +> |---|---| +> | Basic CRUD | [`BasicCrudIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/BasicCrudIntegrationTest.java) | +> | Multiple data types | `H2MultipleTypesIntegrationTest`, `PostgresMultipleTypesIntegrationTest`, `MySQLMultipleTypesIntegrationTest`, `OracleMultipleTypesIntegrationTest`, `SQLServerMultipleTypesIntegrationTest`, `Db2MultipleTypesIntegrationTest`, `CockroachDBMultipleTypesIntegrationTest`, `MariaDBMultipleTypesIntegrationTest` | +> | Statement variants | `H2StatementExtensiveTests`, `H2PreparedStatementExtensiveTests` (and per-DB equivalents) | +> | ResultSet navigation / metadata | `H2ResultSetTest` (and per-DB), `H2ResultSetMetaDataExtensiveTests`, `H2ReadMultipleBlocksOfDataIntegrationTest` | +> | DatabaseMetaData | `H2DatabaseMetaDataExtensiveTests`, `H2ConnectionExtensiveTests` (and per-DB) | +> | Transactions | `H2ConnectionExtensiveTests`, [`TransactionIsolationResetTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/TransactionIsolationResetTest.java) | +> | Savepoints | `H2SavepointTests` (and per-DB `*SavepointTests`) | +> | XA transactions | [`PostgresXAIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/PostgresXAIntegrationTest.java), `MySQLXAIntegrationTest`, `MariaDBXAIntegrationTest`, `OracleXAIntegrationTest`, `SqlServerXAIntegrationTest`, `Db2XAIntegrationTest`, [`XASessionInvalidationTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/XASessionInvalidationTest.java) | +> | LOBs | [`BlobIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/BlobIntegrationTest.java), [`BinaryStreamIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/BinaryStreamIntegrationTest.java), [`HydratedLobValidationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/HydratedLobValidationTest.java) (and per-DB `*Blob*` / `*BinaryStream*`) | +> | 
Session affinity | [`H2SessionAffinityIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/H2SessionAffinityIntegrationTest.java) (and per-DB `*SessionAffinity*`) | +> | Multi-block result sets | `H2ReadMultipleBlocksOfDataIntegrationTest` (and per-DB) | +> | Multinode load balancing | [`LoadAwareServerSelectionTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/LoadAwareServerSelectionTest.java), [`MultinodeIntegrationTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeIntegrationTest.java) | +> | Multinode failover | [`MultinodeFailoverTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeFailoverTest.java), [`MultinodeConnectionManagerErrorHandlingTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeConnectionManagerErrorHandlingTest.java) | +> | Multinode recovery / redistribution | [`MultinodeRecoveryTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeRecoveryTest.java) | +> | XA multinode | [`MultinodeXAIntegrationTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeXAIntegrationTest.java) | +> | connHash caching | [`ConnectRpcSkipOptimisationTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/ConnectRpcSkipOptimisationTest.java), [`UnifiedConnectionModeTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/UnifiedConnectionModeTest.java) | +> | Session stickiness error path | [`MultinodeTargetServerBindingTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeTargetServerBindingTest.java), `MultinodeStatementServiceTest` | +> | Cluster health propagation | [`MultinodeConnectionManagerClusterHealthTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeConnectionManagerClusterHealthTest.java) | +> | Concurrency / pool exhaustion | 
[`ConcurrencyTimeoutTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/ConcurrencyTimeoutTest.java) | +> | Multi-datasource | [`MultiDataSourceIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/MultiDataSourceIntegrationTest.java), [`MultiDataSourceConfigurationTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/jdbc/MultiDataSourceConfigurationTest.java) | +> | Configuration loading | [`DatasourcePropertiesLoaderSystemPropertyTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/jdbc/DatasourcePropertiesLoaderSystemPropertyTest.java), [`DatasourcePropertiesLoaderEnvironmentTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/jdbc/DatasourcePropertiesLoaderEnvironmentTest.java) | +> | URL parsing | [`MultinodeUrlParserTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeUrlParserTest.java), [`UrlParserTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/jdbc/UrlParserTest.java), [`DriverMultinodeUrlTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/jdbc/DriverMultinodeUrlTest.java) | +> | DataSource API | [`OjpDataSourceTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/jdbc/OjpDataSourceTest.java), [`OjpXADataSourceTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/jdbc/xa/OjpXADataSourceTest.java) | +> | Health check config | [`HealthCheckConfigTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/HealthCheckConfigTest.java), [`MultinodeRetryConfigTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeRetryConfigTest.java) | +> | Session tracker unit | [`SessionTrackerTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/SessionTrackerTest.java) | + +--- + +## Appendix A — Proto file locations + +| File | Location | +|---|---| +| Main protocol | `ojp-grpc-commons/src/main/proto/StatementService.proto` | +| Generic value containers | `ojp-grpc-commons/src/main/proto/containers.proto` | +| Echo / heartbeat | 
`ojp-grpc-commons/src/main/proto/echo.proto` |
+
+## Appendix B — Reference implementation classes
+
+| Aspect | Java class |
+|---|---|
+| gRPC stubs | `StatementServiceGrpcClient` |
+| Multinode routing | `MultinodeStatementService`, `MultinodeConnectionManager` |
+| URL parsing | `MultinodeUrlParser`, `UrlParser` |
+| Session tracking | `SessionTracker` |
+| Health checking | `HealthCheckValidator`, `HealthCheckConfig` |
+| Redistribution | `ConnectionRedistributor`, `XAConnectionRedistributor` |
+| Error mapping | `GrpcExceptionHandler` |
+| Connection lifecycle | `Connection` |
+| Statement execution | `Statement`, `PreparedStatement`, `CallableStatement` |
+| Result set | `ResultSet`, `RemoteProxyResultSet` |
+| LOB handling | `Blob`, `Clob`, `NClob`, `Lob`, `LobServiceImpl` |
+| XA | `OjpXAResource`, `OjpXAConnection`, `OjpXADataSource` |
+| Driver entry point | `Driver` |
+| DataSource wrapper | `OjpDataSource` |

From a52e778d97dbf7681905ac0b8cade20608628343 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Mon, 20 Apr 2026 07:18:24 +0000
Subject: [PATCH 07/12] chore: remove accidentally committed write_files.py
 temp file

Agent-Logs-Url: https://github.com/Open-J-Proxy/ojp/sessions/6a131905-98ee-408b-8155-88759641e1c0
Co-authored-by: rrobetti <7221783+rrobetti@users.noreply.github.com>
---
 .../CLIENT_SPEC_AI.md                         |  445 +++++
 .../multi-language-client-spec/write_files.py | 1625 -----------------
 2 files changed, 445 insertions(+), 1625 deletions(-)
 create mode 100644 documents/multi-language-client-spec/CLIENT_SPEC_AI.md
 delete mode 100644 documents/multi-language-client-spec/write_files.py

diff --git a/documents/multi-language-client-spec/CLIENT_SPEC_AI.md 
b/documents/multi-language-client-spec/CLIENT_SPEC_AI.md new file mode 100644 index 000000000..faa08ac6c --- /dev/null +++ b/documents/multi-language-client-spec/CLIENT_SPEC_AI.md @@ -0,0 +1,445 @@ +# OJP Client Specification — Machine-Oriented Reference + +> **Status:** Normative — April 2026 +> **Scope:** Defines the complete behavioral contract for any OJP client implementation. +> **Keywords:** MUST, MUST NOT, SHOULD, MAY as defined in RFC 2119. +> **Protocol source:** `ojp-grpc-commons/src/main/proto/StatementService.proto`, `echo.proto` +> **Human-readable companion:** [`CLIENT_SPEC.md`](CLIENT_SPEC.md) + +--- + +## 1. Terminology + +| Term | Definition | +|---|---| +| **Client** | A software library implementing this specification. | +| **Server** | An OJP server instance exposing `StatementService` and `EchoService` via gRPC. | +| **Endpoint** | A `host:port` pair identifying one Server. | +| **Virtual Connection** | A client-side object representing logical access to a database pool, identified by a `SessionInfo` token. Does not correspond 1:1 to a real database connection. | +| **Real Connection** | A JDBC connection held by the Server's HikariCP pool. The Client never holds one directly. | +| **connHash** | A server-computed SHA-256 string keying a specific connection pool. Computed as SHA-256(`url + user + password + datasource_name`). | +| **SessionInfo** | A proto message propagated on every RPC. Contains `connHash`, `clientUUID`, `sessionUUID`, `transactionInfo`, `sessionStatus`, `isXA`, `targetServer`, `clusterHealth`. | +| **sessionUUID** | A server-assigned handle for a stateful session (transaction, LOB, cursor). Absent until the Server assigns it. | +| **targetServer** | The `host:port` the Server binds a `sessionUUID` to. The Client MUST route all requests carrying that `sessionUUID` to this server. | +| **clientUUID** | A stable UUID v4 generated once per Client process lifetime. 
| +| **clusterHealth** | A semicolon-delimited string of `host:port(UP\|DOWN)` segments reflecting known endpoint health. | +| **connHash cache** | A thread-safe client-side map: `url\|user\|password\|datasourceName → connHash`. Populated on first non-XA `connect()` RPC. | + +--- + +## 2. State Machine + +### 2.1 Connection States + +| State | Description | +|---|---| +| `DISCONNECTED` | No `SessionInfo` exists; no RPC has been made. | +| `CONNECTING` | `connect()` RPC is in flight. | +| `CONNECTED` | `connHash` is known; no `sessionUUID` assigned. Requests are stateless. | +| `SESSION_ACTIVE` | `sessionUUID` is assigned; stickiness is enforced. | +| `IN_TRANSACTION` | `SESSION_ACTIVE` and `transactionStatus = TRX_ACTIVE`. | +| `TERMINATED` | `terminateSession()` has been called. No further RPCs are permitted. | + +### 2.2 Connection State Transitions + +| From | Trigger | To | Required Action | +|---|---|---|---| +| `DISCONNECTED` | `connect()` called, cache miss | `CONNECTING` | Issue `connect()` RPC | +| `DISCONNECTED` | `connect()` called, cache hit (non-XA) | `CONNECTED` | Build `SessionInfo` locally; NO RPC | +| `CONNECTING` | `connect()` RPC succeeds | `CONNECTED` | Cache `connHash`; store `ConnectionDetails` | +| `CONNECTING` | `connect()` RPC fails (transport error) | `DISCONNECTED` | Failover (§6.2); retry | +| `CONNECTED` | `startTransaction()` succeeds | `IN_TRANSACTION` | Update local `SessionInfo`; bind `sessionUUID` if newly assigned | +| `CONNECTED` | RPC returns new `sessionUUID` | `SESSION_ACTIVE` | Bind `sessionUUID → targetServer` | +| `SESSION_ACTIVE` | `startTransaction()` succeeds | `IN_TRANSACTION` | | +| `IN_TRANSACTION` | `commitTransaction()` succeeds | `SESSION_ACTIVE` | `transactionStatus = TRX_COMMITED` | +| `IN_TRANSACTION` | `rollbackTransaction()` succeeds | `SESSION_ACTIVE` | `transactionStatus = TRX_ROLLBACK` | +| `SESSION_ACTIVE` or `IN_TRANSACTION` | `terminateSession()` called | `TERMINATED` | Unbind `sessionUUID`; decrement 
server session count | +| Any | `NOT_FOUND` received | `DISCONNECTED` | Invalidate `connHash` cache entry; re-issue `connect()` | +| Any | `UNAVAILABLE` / `DEADLINE_EXCEEDED` | `DISCONNECTED` (that server) | Mark Endpoint `UNHEALTHY`; failover | + +### 2.3 Server Endpoint States + +| State | Description | +|---|---| +| `HEALTHY` | Server is reachable; eligible for load-balancing selection. | +| `UNHEALTHY` | Server has failed; not eligible for selection; health checker probes it periodically. | + +**`HEALTHY → UNHEALTHY`:** Triggered by any of: `UNAVAILABLE`, `DEADLINE_EXCEEDED`, `UNKNOWN` (message contains "connection"), `INTERNAL` without `SqlErrorResponse` trailer. + +**`UNHEALTHY → HEALTHY`:** Triggered by successful health probe **after** `reinitializePoolOnRecoveredServer()` completes — the pool MUST be pre-created before `markHealthy()` is called. + +### 2.4 Invalid Transitions + +| Situation | Required Behavior | +|---|---| +| Any RPC called on a `TERMINATED` connection | MUST raise an error immediately; MUST NOT make any RPC. | +| Request routed to a server other than `targetServer` when `sessionUUID` is set | MUST raise an error; MUST NOT silently reroute. | +| `terminateSession()` called twice | MUST be idempotent (no error, no second RPC). | +| XA operation when `targetServer` is `UNHEALTHY` | MUST raise `XAER_RMFAIL` (or language equivalent); MUST NOT reroute. | +| `CANCELLED` treated as server failure | MUST NOT mark server `UNHEALTHY`; MUST NOT failover. | + +--- + +## 3. Protocol Model + +### 3.1 Message Structures + +``` +ConnectionDetails: + url: string # actual database connection URL (e.g., jdbc:postgresql://...) 
+  user: string                    # database username
+  password: string                # database password
+  clientUUID: string              # stable process UUID (§4.1)
+  properties: list[PropertyEntry] # key-value config pairs; include ojp.datasource.name
+  serverEndpoints: list[string]   # all known OJP addresses as "host:port"
+  clusterHealth: string           # current health string (§3.5); "" on first call
+  isXA: bool                      # true for XA connections
+
+SessionInfo:
+  connHash: string                # opaque pool key; treat as immutable once received
+  clientUUID: string              # echoed from ConnectionDetails
+  sessionUUID: string             # absent until server assigns; triggers stickiness when set
+  transactionInfo: {
+    transactionUUID: string
+    transactionStatus: TRX_ACTIVE | TRX_COMMITED | TRX_ROLLBACK
+  }
+  sessionStatus: SESSION_ACTIVE | SESSION_TERMINATED
+  isXA: bool
+  targetServer: string            # "host:port"; MUST be used for routing when sessionUUID is set
+  clusterHealth: string           # server's view of cluster topology
+
+StatementRequest:
+  session: SessionInfo            # MUST include current SessionInfo
+  sql: string
+  parameters: list[ParameterProto]
+  statementUUID: string           # new random UUID per statement instance
+  properties: list[PropertyEntry]
+
+ParameterProto:
+  index: int32                    # 1-based parameter position
+  type: ParameterTypeProto        # one of 28 enum values (see §9.1)
+  values: list[ParameterValue]    # one for normal params; multiple for array params
+
+ParameterValue (oneof):
+  is_null: bool                   # SQL NULL
+  bool_value: bool
+  int_value: int32                # also used for PT_BYTE, PT_SHORT
+  long_value: int64
+  float_value: float
+  double_value: double
+  string_value: string            # also PT_BIG_DECIMAL ("<unscaledValue> <scale>"), PT_CHARACTER_READER, PT_SQL_XML
+  bytes_value: bytes              # PT_BYTES, PT_ASCII_STREAM, PT_UNICODE_STREAM, PT_BINARY_STREAM
+  date_value: google.type.Date    # PT_DATE
+  time_value: google.type.TimeOfDay # PT_TIME
+  timestamp_value: TimestampWithZone # PT_TIMESTAMP
+  int_array_value: IntArray       # PT_ARRAY of ints
+  long_array_value: LongArray     # PT_ARRAY of longs
+  string_array_value: 
StringArray # PT_ARRAY of strings + url_value: google.protobuf.StringValue # PT_URL; absent wrapper = SQL NULL + rowid_value: google.protobuf.StringValue # PT_ROW_ID; base64 bytes; absent = SQL NULL + +TimestampWithZone: + epochSeconds: int64 + nanos: int32 + timezone: string # IANA zone ID or UTC offset (e.g., "America/New_York", "+05:30") + originalType: TemporalType # UNSPECIFIED | TIMESTAMP | CALENDAR | OFFSET_DATE_TIME | + # LOCAL_DATE_TIME | INSTANT | LOCAL_DATE | LOCAL_TIME | OFFSET_TIME + +CallResourceRequest: + session: SessionInfo + resourceType: ResourceType # RES_RESULT_SET | RES_STATEMENT | RES_PREPARED_STATEMENT | + # RES_CALLABLE_STATEMENT | RES_LOB | RES_CONNECTION | RES_SAVEPOINT + resourceUUID: string + target: TargetCall + properties: list[PropertyEntry] + +TargetCall: + callType: CallType # one of 47 codes + resourceName: string + params: list[ParameterValue] + nextCall: TargetCall # optional chaining for multiple operations in one round-trip + +CallResourceResponse: + session: SessionInfo + resourceUUID: string # UUID of newly created resource, if any + values: list[ParameterValue] # return values + +XidProto: + formatId: int32 + globalTransactionId: bytes # max 64 bytes + branchQualifier: bytes # max 64 bytes + +LobDataBlock: + session: SessionInfo + position: int64 # byte offset of this chunk + data: bytes # chunk content; recommended size 32–64 KB + lobType: LobType # LT_BLOB | LT_CLOB | LT_BINARY_STREAM | LT_ASCII_STREAM | + # LT_UNICODE_STREAM | LT_CHARACTER_STREAM + metadata: list[PropertyEntry] + +LobReference: + session: SessionInfo + uuid: string # LOB handle; pass as PT_BLOB/PT_CLOB string_value parameter + bytesWritten: int32 + lobType: LobType + +SqlErrorResponse (in gRPC trailing metadata on Status.INTERNAL): + reason: string + sqlState: string # ANSI SQL state code + vendorCode: int32 # database-specific error code + sqlErrorType: SQL_EXCEPTION | SQL_DATA_EXCEPTION +``` + +### 3.2 RPC Catalogue + +| RPC | Stream Type | Retry on 
Transport Error? | +|---|---|---| +| `connect` | unary | YES (always for XA; non-XA cache miss only) | +| `executeUpdate` | unary | Only if `sessionUUID` absent in request | +| `executeQuery` | server-streaming | Only if `sessionUUID` absent in request | +| `fetchNextRows` | unary | NO | +| `createLob` | client-streaming | NO | +| `readLob` | server-streaming | NO | +| `terminateSession` | unary | NO | +| `startTransaction` | unary | Only if `sessionUUID` absent in request | +| `commitTransaction` | unary | NO | +| `rollbackTransaction` | unary | NO | +| `callResource` | unary | Only if `sessionUUID` absent in request | +| `xaStart` | unary | YES | +| `xaEnd` | unary | NO | +| `xaPrepare` | unary | NO | +| `xaCommit` | unary | NO | +| `xaRollback` | unary | NO | +| `xaRecover` | unary | NO | +| `xaForget` | unary | NO | +| `xaSetTransactionTimeout` | unary | NO | +| `xaGetTransactionTimeout` | unary | NO | +| `xaIsSameRM` | unary | NO | +| `EchoService.Echo` | unary | used for health probes only | + +--- + +## 4. Client Contract + +### 4.1 Initialization + +1. The client MUST generate one UUID v4 as `clientUUID` at library initialization time. This value MUST remain constant for the process lifetime and MUST NOT be persisted across restarts. +2. The client MUST create one gRPC `ManagedChannel` (or equivalent) per distinct server endpoint. Channels MUST be long-lived and shared across all connections to that endpoint. +3. The client MUST use graceful channel shutdown (allow in-flight calls to drain) on process termination. +4. The client MUST start a background health-check task scheduled at `ojp.health.check.interval` (default 5 000 ms). + +### 4.2 Connection Rules + +**Non-XA first connect (cache miss):** + +1. Build `ConnectionDetails` with `url`, `user`, `password`, `clientUUID`, `serverEndpoints`, `clusterHealth`, `isXA=false`, and applicable `properties`. +2. Call `connect(ConnectionDetails)` on the selected endpoint. +3. 
Cache: `connHashByKey[url+"|"+user+"|"+password+"|"+datasourceName] = response.connHash`.
+4. Cache: `storedDetails[response.connHash] = ConnectionDetails` (for `NOT_FOUND` recovery).
+5. Return `SessionInfo` from response.
+
+**Non-XA subsequent connect (cache hit):**
+
+1. Look up `connHash` from cache using the connection key.
+2. Build `SessionInfo` locally: `{connHash, clientUUID, isXA=false}`. MUST NOT set `sessionUUID`.
+3. Return without making any RPC call.
+
+**XA connect (always RPC):**
+
+1. MUST always call `connect(ConnectionDetails)` with `isXA=true`.
+2. MUST immediately bind `response.sessionUUID → response.targetServer`.
+
+**NOT_FOUND recovery:**
+
+1. Remove `connHashByKey[connectionKey]` from cache (keep `storedDetails`).
+2. Re-issue `connect(storedDetails[oldConnHash])`.
+3. Update `connHashByKey[connectionKey]` with the new `connHash`.
+4. Retry the original failed operation once.
+5. This retry MUST NOT be performed if a `sessionUUID` was active — the session state is permanently lost and the error MUST be surfaced to the caller.
+
+### 4.3 Session Propagation Rules
+
+1. The client MUST include the current `SessionInfo` in every outgoing RPC request.
+2. The client MUST replace its local `SessionInfo` with the `SessionInfo` returned in every RPC response.
+3. When the response `SessionInfo` contains a `sessionUUID` not present in the request, the client MUST immediately register: `sessionUUID → response.targetServer`.
+4. The client MUST call `terminateSession(session)` exactly once when closing a connection. After this call, the connection MUST be considered unusable.
+
+### 4.4 Statement Execution Rules
+
+1. The client MUST generate a new random UUID as `statementUUID` for each `StatementRequest`.
+2. Parameters MUST use 1-based indexing in `ParameterProto.index`.
+3. `PT_BIG_DECIMAL` MUST be encoded as `string_value = "<unscaledValue> <scale>"` (space-separated). Example: `BigDecimal("123.45")` → `"12345 2"`.
+4. 
Presence-aware fields (`url_value`, `rowid_value`, `uuid_value`, `biginteger_value`) use `google.protobuf.StringValue` wrappers. An absent (unset) wrapper MUST be treated as SQL NULL. An empty string inside the wrapper is a valid non-null value. + +### 4.5 Resource Lifecycle Rules + +1. LOB handles (`LobReference.uuid`) are server-side objects. They MUST NOT be used after `terminateSession()`. +2. Result set handles (`resultSetUUID`) are server-side objects. The client MUST call `callResource(RES_RESULT_SET, CALL_CLOSE)` when done, unless the connection is being terminated. +3. Savepoint handles (from `CALL_SET` on `RES_SAVEPOINT`) MUST NOT be used after `commitTransaction()` or `rollbackTransaction()`. + +--- + +## 5. Concurrency Model + +1. A single `SessionInfo` / connection object MUST NOT be used concurrently from multiple threads without external synchronization. +2. The `connHash` cache MUST be thread-safe. Multiple connections with the same credentials will read from it concurrently. +3. The `sessionUUID → targetServer` map MUST be thread-safe. +4. Per-server session counts (for load balancing) MUST be updated atomically. +5. The background health-check task MUST run in a separate thread / goroutine / async task and MUST NOT block SQL execution paths. +6. `pushClusterHealthToAllHealthyServers()` MUST be submitted to a background scheduler (non-blocking) when called from a query thread via `handleServerFailure()`. It MAY be called inline when called from the health-check thread. + +--- + +## 6. 
Error Handling and Retry Semantics + +### 6.1 Error Classification + +| gRPC Status | Condition | Required Client Action | +|---|---|---| +| `INTERNAL` + `SqlErrorResponse` trailer | SQL error (bad query, constraint, auth) | Throw SQL exception; DO NOT retry; DO NOT mark server `UNHEALTHY` | +| `NOT_FOUND` | Pool not found (server restarted) | Invalidate `connHash`; reconnect; retry once if no active `sessionUUID` | +| `UNAVAILABLE` | Server unreachable | Mark server `UNHEALTHY`; failover (§6.2) | +| `DEADLINE_EXCEEDED` | Request timed out | Mark server `UNHEALTHY`; failover (§6.2) | +| `UNKNOWN` (message contains "connection") | Transport failure | Mark server `UNHEALTHY`; failover (§6.2) | +| `INTERNAL` (no `SqlErrorResponse` trailer) | Transport-level internal error | Mark server `UNHEALTHY`; failover (§6.2) | +| `CANCELLED` | Client-initiated cancellation | DO NOT mark server `UNHEALTHY`; DO NOT failover; surface to caller | +| `RESOURCE_EXHAUSTED` | Pool exhausted | DO NOT retry; DO NOT mark server `UNHEALTHY`; surface to caller | +| Session-invalidation message | Session state lost after server failure | DO NOT retry; surface to caller | + +### 6.2 Failover Procedure (Ordered Steps) + +1. Capture `wasHealthy = endpoint.isHealthy`. +2. Set `endpoint.isHealthy = false`. Record `endpoint.lastFailureTime = now()`. +3. Log the failure. +4. If `wasHealthy == true`: submit `pushClusterHealthToAllHealthyServers()` to the background scheduler. MUST NOT block the caller thread. +5. Gracefully shut down the gRPC channel for the failed endpoint (allow in-flight calls to drain, then discard). +6. Select the next `HEALTHY` endpoint using the configured strategy, excluding all already-attempted endpoints in this retry cycle. +7. Retry the original operation on the new endpoint. +8. If all endpoints are `UNHEALTHY` or exhausted: raise a connection error to the caller. 
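As a non-normative illustration, the ordered failover steps above can be sketched as follows. The `TransportError`, endpoint attributes, and `push_cluster_health` hook are hypothetical stand-ins, not part of the protocol or the reference implementation:

```python
import time

class TransportError(Exception):
    """Stand-in for UNAVAILABLE / DEADLINE_EXCEEDED / transport-level INTERNAL."""

class AllEndpointsUnavailable(ConnectionError):
    """Raised when every endpoint is UNHEALTHY or already attempted (step 8)."""

def execute_with_failover(endpoints, operation, attempts=3, delay_ms=100,
                          push_cluster_health=lambda: None):
    """Failover sketch: mark the failing endpoint UNHEALTHY, push cluster health
    on a genuine healthy-to-unhealthy transition, then retry on the next healthy
    endpoint, excluding endpoints already attempted in this retry cycle."""
    attempted = set()
    for _ in range(attempts):
        candidates = [ep for ep in endpoints
                      if ep.is_healthy and id(ep) not in attempted]
        if not candidates:
            break
        ep = candidates[0]                     # step 6: selection strategy elided
        attempted.add(id(ep))
        try:
            return operation(ep)               # step 7: retry on the new endpoint
        except TransportError:
            was_healthy = ep.is_healthy        # step 1: capture previous state
            ep.is_healthy = False              # step 2: mark UNHEALTHY
            ep.last_failure_time = time.time()
            if was_healthy:                    # step 4: non-blocking in a real client
                push_cluster_health()
            time.sleep(delay_ms / 1000.0)      # retry delay (ojp.multinode.retry.delay)
    raise AllEndpointsUnavailable("all OJP endpoints UNHEALTHY or exhausted")
```

The real implementation submits the cluster-health push to a background scheduler rather than calling it inline; the inline call here only keeps the sketch short.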
+ +### 6.3 Retry Limits + +| Property | Default | Meaning | +|---|---|---| +| `ojp.multinode.retry.attempts` | `3` | Maximum failover attempts per operation | +| `ojp.multinode.retry.delay` | `100` ms | Delay between retry attempts | + +--- + +## 7. Session and Affinity Rules + +### 7.1 When Affinity Is Required + +Affinity is required whenever `sessionUUID` is present in the local `SessionInfo`. This includes all of: +- Any open transaction (`IN_TRANSACTION` state) +- Any open LOB handle +- Any open server-side cursor (`resultSetUUID`) +- Any XA session (entire lifetime of the XA branch) + +### 7.2 How Affinity Is Maintained + +1. The client MUST maintain a thread-safe map: `sessionUUID → host:port`. +2. When routing a request that carries a `sessionUUID`, the client MUST look up the bound server and route exclusively to it. +3. When a response returns a `sessionUUID` not present in the request, the client MUST add the binding immediately. +4. When a response returns a `targetServer` different from the currently bound server, the client MUST update the binding and SHOULD log a warning. +5. On `terminateSession()`: MUST remove the binding and MUST decrement the active-session count for the previously bound server. + +### 7.3 Affinity Violation Behavior + +If the bound server is `UNHEALTHY` when a sticky request is made, the client MUST: + +1. Raise an error to the caller immediately. +2. MUST NOT reroute the request to any other server. +3. MUST NOT retry the operation automatically. + +--- + +## 8. Versioning and Compatibility + +1. The client MUST be compiled against the same `.proto` files as the target server version. +2. The client SHOULD send only fields defined in the proto version it was compiled against. +3. The client MUST gracefully handle unknown enum values in responses by treating them as the zero/default value. +4. The client MUST NOT depend on the internal structure or value of `connHash` — it is an opaque string assigned by the server. +5. 
The cluster health string format (`host:port(UP|DOWN);...`) MUST be treated as stable across minor versions. The client MUST parse only the `UP`/`DOWN` token and MUST ignore any additional tokens inside the parentheses.
+
+---
+
+## 9. Compliance Requirements
+
+### 9.1 MUST Implement
+
+- All 21 `StatementService` RPCs and `EchoService.Echo`
+- `connHash` caching (non-XA cache-hit path with no RPC)
+- `NOT_FOUND` recovery (invalidate cache, reconnect, retry once)
+- `SessionInfo` propagation on every RPC (send current, replace with response)
+- Session stickiness enforcement (`sessionUUID → targetServer` binding)
+- Failover (mark `UNHEALTHY`, select next endpoint, retry up to configured limit)
+- Background health checking (Phase 1: probe healthy endpoints; Phase 2: probe unhealthy endpoints)
+- Health check guard: Phase 1 fires only when `sessionToServerMap` non-empty **OR** `connectionDetailsByConnHash` non-empty
+- Cluster health string generation (`host:port(UP|DOWN);...`) and consumption
+- Both load-balancing strategies (least-connections and round-robin) selectable via `ojp.loadaware.selection.enabled`
+- `terminateSession()` on connection close
+- Graceful gRPC channel shutdown on process termination
+- All 29 `ParameterTypeProto` values (encode and decode): `PT_NULL`, `PT_BOOLEAN`, `PT_BYTE`, `PT_SHORT`, `PT_INT`, `PT_LONG`, `PT_FLOAT`, `PT_DOUBLE`, `PT_BIG_DECIMAL`, `PT_STRING`, `PT_BYTES`, `PT_DATE`, `PT_TIME`, `PT_TIMESTAMP`, `PT_ASCII_STREAM`, `PT_UNICODE_STREAM`, `PT_BINARY_STREAM`, `PT_OBJECT`, `PT_CHARACTER_READER`, `PT_REF`, `PT_BLOB`, `PT_CLOB`, `PT_ARRAY`, `PT_URL`, `PT_ROW_ID`, `PT_N_STRING`, `PT_N_CHARACTER_STREAM`, `PT_N_CLOB`, `PT_SQL_XML`
+- `BigDecimal` encoding as `"<unscaledValue> <scale>"`
+- `TimestampWithZone` encoding/decoding for all 9 `TemporalType` values
+- LOB write (`createLob` client-streaming, chunked at 32–64 KB) and read (`readLob` server-streaming)
+- Non-XA transaction lifecycle (`startTransaction`, `commitTransaction`, `rollbackTransaction`)
+- Savepoints via `callResource` (`RES_SAVEPOINT`, `CALL_SET`/`CALL_ROLLBACK`/`CALL_RELEASE`)
+- `callResource` protocol (all 7 `ResourceType` values, all 47 `CallType` codes)
+- Configuration loading: system/env properties > `ojp.properties` file > built-in defaults; per-datasource prefix `<datasourceName>.ojp.*`
+- TLS transport support (plaintext default; TLS when `ojp.grpc.tls.enabled=true`)
+- `clientUUID` generation (one UUID v4 per process lifetime)
+- `reinitializePoolOnRecoveredServer()` called **before** `endpoint.markHealthy()` on recovery
+
+### 9.2 SHOULD Implement
+
+- Full XA transaction lifecycle (all 10 XA RPCs)
+- Full-validation health probe (in addition to heartbeat probe)
+- Connection redistribution on recovery (rebalancing idle connections across servers)
+- Cache rule pass-through via `ConnectionDetails.properties`
+- `DataSource` wrapper / integration API matching host platform conventions
+- Per-datasource configuration namespacing
+
+### 9.3 MAY Implement
+
+- Async / non-blocking RPC API surface
+- Metrics / telemetry export (OpenTelemetry recommended)
+- Client-side connection pooling — NOTE: if implemented, pool size MUST effectively be 1 per virtual connection; double-pooling causes incorrect behavior
+
+---
+
+## 10. 
Action → Protocol Mapping + +| High-Level Action | gRPC RPC(s) | Notes | +|---|---|---| +| `open_connection(endpoint, db_url, user, password)` | `connect(ConnectionDetails)` — only on cache miss | Cache hit: no RPC; build `SessionInfo` locally | +| `open_xa_connection(...)` | `connect(ConnectionDetails, isXA=true)` | Always RPC; pins to one endpoint | +| `execute_query(sql, params)` | `executeQuery(StatementRequest)` | Server-streaming; first message has labels + first batch of rows | +| `fetch_more_rows(result_set_uuid, page_size)` | `fetchNextRows(ResultSetFetchRequest)` | Empty `rows` list means result set exhausted | +| `execute_update(sql, params)` | `executeUpdate(StatementRequest)` | `OpResult.value.int_value` = affected row count | +| `call_stored_procedure(sql, params)` | `callResource(CALL_PREPARE)` then `callResource(CALL_EXECUTE)` | `CALL_PREPARE` returns `resourceUUID` | +| `begin_transaction()` | `startTransaction(SessionInfo)` | Returns `SessionInfo` with `TRX_ACTIVE` | +| `commit()` | `commitTransaction(SessionInfo)` | Returns `SessionInfo` with `TRX_COMMITED` | +| `rollback()` | `rollbackTransaction(SessionInfo)` | Returns `SessionInfo` with `TRX_ROLLBACK` | +| `set_savepoint(name)` | `callResource(RES_SAVEPOINT, CALL_SET, "Savepoint", [name])` | Returns `resourceUUID` for later rollback/release | +| `rollback_to_savepoint(uuid)` | `callResource(RES_SAVEPOINT, CALL_ROLLBACK, resourceUUID=uuid)` | | +| `release_savepoint(uuid)` | `callResource(RES_SAVEPOINT, CALL_RELEASE, resourceUUID=uuid)` | | +| `write_lob(data)` | `createLob(stream LobDataBlock)` | Client-streaming; chunk at 32–64 KB; returns `LobReference.uuid` | +| `read_lob(uuid, pos, len)` | `readLob(ReadLobRequest)` | Server-streaming; concatenate `data` fields in order | +| `close_result_set(uuid)` | `callResource(RES_RESULT_SET, CALL_CLOSE, resourceUUID=uuid)` | | +| `navigate_cursor(uuid, op, row)` | `callResource(RES_RESULT_SET, CALL_ABSOLUTE/RELATIVE/FIRST/LAST/NEXT/PREVIOUS, …)` | | 
+| `cancel_statement(uuid)` | `callResource(RES_STATEMENT, CALL_CANCEL, resourceUUID=uuid)` | | +| `get_db_metadata(key)` | `callResource(RES_CONNECTION, CALL_GET, resourceName=key)` | | +| `set_transaction_isolation(level)` | `callResource(RES_CONNECTION, CALL_SET, "TransactionIsolation", [level])` | | +| `close_connection()` | `terminateSession(SessionInfo)` | MUST be called exactly once per connection | +| `health_probe_heartbeat(endpoint)` | `connect(url="", user="", password="")` or `EchoService.Echo` | Any response means transport is up | +| `health_probe_full(endpoint, details)` | `connect(details)` then `terminateSession(session)` | Full pool validation | +| `push_cluster_health(endpoints, stored)` | `connect(ConnectionDetails{clusterHealth=…})` on each healthy endpoint | No-op for pool creation; server resizes pool | +| `xa_start(xid)` | `xaStart(XaStartRequest)` | Safe to retry on transport error | +| `xa_end(xid)` | `xaEnd(XaEndRequest)` | MUST NOT retry | +| `xa_prepare(xid)` | `xaPrepare(XaPrepareRequest)` | MUST NOT retry | +| `xa_commit(xid, one_phase)` | `xaCommit(XaCommitRequest)` | MUST NOT retry | +| `xa_rollback(xid)` | `xaRollback(XaRollbackRequest)` | MUST NOT retry | +| `xa_recover()` | `xaRecover(XaRecoverRequest)` | | +| `xa_forget(xid)` | `xaForget(XaForgetRequest)` | | diff --git a/documents/multi-language-client-spec/write_files.py b/documents/multi-language-client-spec/write_files.py deleted file mode 100644 index 181cb2373..000000000 --- a/documents/multi-language-client-spec/write_files.py +++ /dev/null @@ -1,1625 +0,0 @@ -#!/usr/bin/env python3 -"""Script to write the two OJP client spec files.""" -import base64 -import os - -SPEC_DIR = "/home/runner/work/ojp/ojp/documents/multi-language-client-spec" - -CLIENT_SPEC = r"""# OJP Multi-Language Client Specification - -> **Status:** Draft — April 2026 -> **Scope:** This document defines every aspect that a new OJP client library (in any language other than Java) must implement in order 
to be fully compatible with an OJP server. It is written language-agnostically; where Java-specific concepts appear they are labelled as the reference implementation only. -> **Reference implementation:** `ojp-jdbc-driver` module. -> **Protocol source of truth:** `ojp-grpc-commons/src/main/proto/StatementService.proto` and `echo.proto`. - ---- - -## Table of Contents - -1. [Overview](#1-overview) -2. [Core Concepts](#2-core-concepts) - - [2.1 The Virtual Connection Model](#21-the-virtual-connection-model) - - [2.2 Session and connHash](#22-session-and-connhash) - - [2.3 Session Affinity (Stickiness)](#23-session-affinity-stickiness) - - [2.4 Client vs. Server Responsibilities](#24-client-vs-server-responsibilities) -3. [Architecture and Data Flow](#3-architecture-and-data-flow) - - [3.1 gRPC Interface and Channel Setup](#31-grpc-interface-and-channel-setup) - - [3.2 Connection Configuration (ConnectionDetails)](#32-connection-configuration-connectiondetails) - - [3.3 Client Identity (clientUUID)](#33-client-identity-clientuuid) - - [3.4 Multinode Load Balancing](#34-multinode-load-balancing) - - [3.5 Cluster Health Propagation](#35-cluster-health-propagation) -4. [Client Responsibilities](#4-client-responsibilities) - - [4.1 Connection Establishment and connHash Caching](#41-connection-establishment-and-connhash-caching) - - [4.2 Session Lifecycle](#42-session-lifecycle) - - [4.3 Failover](#43-failover) - - [4.4 Health Checking](#44-health-checking) - - [4.5 Connection Redistribution on Recovery](#45-connection-redistribution-on-recovery) -5. [Minimal End-to-End Example](#5-minimal-end-to-end-example) -6. [Error Handling](#6-error-handling) - - [6.1 Error Classification Matrix](#61-error-classification-matrix) - - [6.2 SQL Errors vs. gRPC Transport Errors](#62-sql-errors-vs-grpc-transport-errors) -7. 
[Implementation Guidance](#7-implementation-guidance) - - [7.1 Statement Execution](#71-statement-execution) - - [7.2 Parameter Type Mapping](#72-parameter-type-mapping) - - [7.3 Temporal Type Handling](#73-temporal-type-handling) - - [7.4 Result Set Streaming](#74-result-set-streaming) - - [7.5 LOB (Large Object) Handling](#75-lob-large-object-handling) - - [7.6 Transaction Management (non-XA)](#76-transaction-management-non-xa) - - [7.7 Savepoints](#77-savepoints) - - [7.8 XA / Distributed Transactions](#78-xa--distributed-transactions) - - [7.9 callResource Protocol](#79-callresource-protocol) - - [7.10 Configuration System](#710-configuration-system) - - [7.11 Query Result Caching](#711-query-result-caching) - - [7.12 Security / Transport](#712-security--transport) - - [7.13 DataSource / Integration API](#713-datasource--integration-api) -8. [Testing Coverage](#8-testing-coverage) - ---- - -## 1. Overview - -OJP (Open J Proxy) is the world's first open-source JDBC Type 3 driver. It works by placing a gRPC server between applications and their databases. Applications use an OJP client library — and never touch a real database connection directly. The OJP server owns all real connections in a HikariCP pool and services SQL requests on behalf of clients. - -This document specifies everything a non-Java OJP client library must implement to be fully protocol-compatible. All communication between client and server uses gRPC over HTTP/2. The proto definitions in `ojp-grpc-commons/src/main/proto/StatementService.proto` and `echo.proto` are the authoritative source for message formats and RPC signatures. - ---- - -## 2. Core Concepts - -### 2.1 The Virtual Connection Model - -An OJP "connection" is not a real database connection. Real database connections are owned and managed by the OJP server's HikariCP pool. The client holds a `SessionInfo` — a lightweight value object the server uses to route each incoming request to the correct pool. 
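For illustration only, the client-side value object described above might be modeled like this; the field names mirror the proto, but the class itself is a sketch, not part of the reference implementation:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SessionInfo:
    """Lightweight routing handle; the server owns the real DB connection."""
    conn_hash: str                       # identifies the server-side pool
    client_uuid: str                     # stable per-process identity
    session_uuid: Optional[str] = None   # assigned lazily by the server
    is_xa: bool = False
    target_server: Optional[str] = None  # host:port for sticky routing

    @property
    def requires_affinity(self) -> bool:
        # Once a sessionUUID exists, requests must stick to target_server.
        return self.session_uuid is not None
```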
- -Opening a connection is inexpensive. For non-XA connections, a cache hit means no RPC is needed at all: the client constructs a `SessionInfo` locally from a cached `connHash` and begins issuing SQL calls immediately. Multiple client connections may share the same server-side pool by sharing the same `connHash`. - -Because the server already pools real database connections, the application side must not add another connection pool on top. Double-pooling (e.g., HikariCP on the app side in addition to the server side) causes resource waste and incorrect connection-count accounting. - -### 2.2 Session and connHash - -`connHash` is a server-computed SHA-256 hash of the tuple `(url, user, password, datasource_name)`. It identifies which pool on the server to route requests to. The client receives `connHash` on the first `connect()` RPC and must cache it for all subsequent connections that share the same credentials. - -`sessionUUID` is the server-side handle for a persistent session. It is NOT assigned at connection time. The server assigns it on the first operation that requires a persistent server-side context — for example, `startTransaction()`, LOB creation, or a stored-procedure call. Until a `sessionUUID` exists, each request is effectively stateless: the server picks any available connection from the pool identified by `connHash` and processes the request. - -Once a `sessionUUID` is established it is returned in every subsequent response alongside the same `connHash`. The client must replace its local `SessionInfo` with the one returned after every RPC call. - -### 2.3 Session Affinity (Stickiness) - -Once a `sessionUUID` is set, every subsequent request for that session must go to the same server. The server encodes the target server address in the `targetServer` field of every `SessionInfo` response. The client must record this binding in a `sessionUUID -> host:port` map and enforce it strictly. 
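A minimal, non-normative sketch of such a binding map follows; the class and method names are illustrative, not the reference implementation's:

```python
import threading

class StickySessionRouter:
    """Tracks sessionUUID -> host:port bindings and enforces affinity."""

    def __init__(self):
        self._bindings = {}
        self._lock = threading.Lock()

    def record(self, session_uuid, target_server):
        with self._lock:
            self._bindings[session_uuid] = target_server

    def route(self, session_uuid, healthy_servers):
        """Return the bound server, or fail; a sticky session is never rerouted."""
        with self._lock:
            bound = self._bindings.get(session_uuid)
        if bound is None:
            raise KeyError(f"no binding for session {session_uuid}")
        if bound not in healthy_servers:
            # Session state lives on exactly one server: raise, never reroute.
            raise ConnectionError(f"bound server {bound} is unavailable")
        return bound

    def forget(self, session_uuid):
        with self._lock:
            self._bindings.pop(session_uuid, None)
```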
- -Routing a sticky session to a different server is a protocol error — it is not a transparent optimisation. The session state (open transaction, cursor, LOB handle) lives on a specific server and cannot be migrated. If the bound server becomes unreachable, the client must raise an error to the caller rather than silently rerouting. - -### 2.4 Client vs. Server Responsibilities - -The server owns: real database connections, connection pool management, transaction state, LOB storage, cursor state, and query result caching. - -The client owns: `SessionInfo` propagation on every request and response, `connHash` caching and cache-invalidation logic, server endpoint health tracking, load balancing across healthy endpoints, failover on transport errors, cluster health string construction and proactive pushing to healthy servers, and session stickiness enforcement. - ---- - -## 3. Architecture and Data Flow - -### 3.1 gRPC Interface and Channel Setup - -The client must implement stubs for every RPC in `StatementService` and `EchoService`. 
- -**`StatementService` RPCs:** - -| RPC | Type | Purpose | -|---|---|---| -| `connect` | unary | Open a logical connection and receive `SessionInfo` | -| `executeUpdate` | unary | DML (INSERT / UPDATE / DELETE / DDL) | -| `executeQuery` | server-streaming | SELECT — returns a stream of `OpResult` blocks | -| `fetchNextRows` | unary | Pull the next page of rows for an open result set | -| `createLob` | client-streaming | Upload LOB data to the server in chunks | -| `readLob` | server-streaming | Download LOB data from the server | -| `terminateSession` | unary | Release server-side session state | -| `startTransaction` | unary | Begin an explicit transaction | -| `commitTransaction` | unary | Commit the active transaction | -| `rollbackTransaction` | unary | Roll back the active transaction | -| `callResource` | unary | Generic remote call for metadata, cursor navigation, savepoints | -| `xaStart` | unary | Begin an XA transaction branch | -| `xaEnd` | unary | End an XA transaction branch | -| `xaPrepare` | unary | Prepare an XA transaction branch | -| `xaCommit` | unary | Commit an XA transaction branch | -| `xaRollback` | unary | Roll back an XA transaction branch | -| `xaRecover` | unary | List XIDs of prepared transactions | -| `xaForget` | unary | Forget a heuristically completed transaction | -| `xaSetTransactionTimeout` | unary | Set XA timeout in seconds | -| `xaGetTransactionTimeout` | unary | Get current XA timeout | -| `xaIsSameRM` | unary | Check whether two sessions share a resource manager | - -**`EchoService` RPC:** - -| RPC | Type | Purpose | -|---|---|---| -| `Echo` | unary | Lightweight heartbeat / connectivity check | - -**gRPC channel lifecycle:** - -One `ManagedChannel` (or equivalent) per server endpoint. Channels are long-lived and shared across all logical connections to that endpoint. They are created lazily on first connection or eagerly during initialisation when endpoints are known upfront. 
Use DNS-prefixed targets (`dns:///host:port`) where the gRPC runtime supports it. Blocking stubs are used for synchronous operations; async stubs are required for client-streaming (`createLob`) and server-streaming (`executeQuery`, `readLob`) RPCs. Channel shutdown must be graceful and triggered on client shutdown.
-
-```python
-import grpc
-
-# Create one long-lived channel per OJP server endpoint (plaintext by default)
-channel = grpc.insecure_channel("dns:///localhost:10591")
-stub = StatementServiceStub(channel)   # used for all SQL operations
-echo = EchoServiceStub(channel)        # used for health heartbeats
-
-# On process shutdown — close the channel once in-flight calls have finished
-channel.close()
-```
-
-> **Reference implementation:**
-> - `ojp-jdbc-driver` — [`StatementService`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementService.java): the unified interface declaring all RPC methods.
-> - `ojp-jdbc-driver` — [`StatementServiceGrpcClient`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java): the single-node gRPC implementation; contains the concrete gRPC stub calls.
-> - `ojp-jdbc-driver` — [`MultinodeStatementService`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeStatementService.java): the multinode facade that wraps `StatementServiceGrpcClient` per endpoint with routing, failover, and stickiness.
-> - `ojp-grpc-commons` — [`GrpcChannelFactory`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/GrpcChannelFactory.java): builds `ManagedChannel` instances with plaintext or TLS.
-
----
-
-### 3.2 Connection Configuration (ConnectionDetails)
-
-A non-Java OJP client does not use a JDBC URL. Instead, it collects the following configuration items directly from the user or from a configuration file:
-
-| Item | Required | Description |
-|---|---|---|
-| OJP server endpoints | Yes | One or more `host:port` pairs for the OJP server(s). 
In multinode mode this is a list. |
-| Datasource name | No | A logical name for this datasource, default `"default"`. Used to keep separate connection pools per named datasource on the same server. |
-| Database URL | Yes | The connection URL for the **real database** that the OJP server will connect to (e.g., `jdbc:postgresql://db:5432/mydb`). This is sent verbatim to the server. |
-| User | Yes | Database username. |
-| Password | Yes | Database password. |
-| Properties | No | Additional key-value configuration pairs (pool sizing, cache rules, etc. — see §7.10, §7.11). |
-
-Map the collected configuration to the `ConnectionDetails` proto fields as follows:
-
-| Proto field | Type | Value |
-|---|---|---|
-| `url` | `string` | The **actual database URL** (e.g., `jdbc:postgresql://db:5432/mydb`). The server uses this to create the real database connection pool. |
-| `user` | `string` | Database username. |
-| `password` | `string` | Database password. |
-| `clientUUID` | `string` | Stable process UUID (see §3.3). |
-| `properties` | `repeated PropertyEntry` | Configuration key-value pairs; include `ojp.datasource.name = <name>` when using a named datasource. |
-| `serverEndpoints` | `repeated string` | All OJP server addresses as `"host:port"` strings (the full cluster list, not just the chosen endpoint). |
-| `clusterHealth` | `string` | Current cluster health string (see §3.5); empty string on the very first connect. |
-| `isXA` | `bool` | `true` for XA connections, `false` otherwise. |
-
-The `url` field must be consistent across all client processes that connect to the same logical datasource. The server computes `connHash` as SHA-256(`url + user + password + datasource_name`). If different clients send different `url` strings for the same database, the server creates separate pools. 
- -The client-side cache key for the `connHash` lookup is: `url + "|" + user + "|" + password + "|" + datasource_name` - -> **Reference implementation:** -> - `ojp-grpc-commons` — [`ConnectionDetails` proto](../../ojp-grpc-commons/src/main/proto/StatementService.proto): field definitions. -> - `ojp-server` — [`ConnectionHashGenerator.hashConnectionDetails()`](../../ojp-server/src/main/java/org/openjproxy/grpc/server/utils/ConnectionHashGenerator.java): SHA-256 of `url + user + password + datasource_name_from_properties` — the server-side connHash algorithm. -> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.computeConnectionKey()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): client-side cache key. -> - `ojp-jdbc-driver` — [`MultinodeUrlParser`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeUrlParser.java): Java reference for how the JDBC URL is parsed to extract server endpoints, datasource names, and the actual DB URL before building `ConnectionDetails` (Java-specific; not needed in non-Java clients). - ---- - -### 3.3 Client Identity (clientUUID) - -Generate one random UUID (version 4) when the client library is first loaded or when the process starts. This UUID must remain stable for the entire lifetime of the process. Attach `clientUUID` to every `ConnectionDetails` message sent to the server. The server uses `clientUUID` to group all sessions from the same client process. Do not persist `clientUUID` across process restarts; each new process must generate a fresh UUID. - -> **Reference implementation:** -> - `ojp-jdbc-driver` — [`ClientUUID`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/ClientUUID.java): `getUUID()` returns the static, process-scoped UUID that is generated once at class-loading time via `UUID.randomUUID()`. 
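A non-normative sketch of both rules, the process-scoped `clientUUID` and the client-side cache key; the helper name `connection_key` is illustrative:

```python
import uuid

# Generated once per process; stable for the entire process lifetime (§3.3).
# A new process generates a fresh UUID — never persist it across restarts.
CLIENT_UUID = str(uuid.uuid4())

def connection_key(url: str, user: str, password: str,
                   datasource_name: str = "default") -> str:
    """Client-side cache key for the connHash lookup (§3.2)."""
    return f"{url}|{user}|{password}|{datasource_name}"
```

Note that this key is distinct from `connHash` itself: the key is computed locally to index the cache, while `connHash` is an opaque value the server returns from `connect()`.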
- ---- - -### 3.4 Multinode Load Balancing - -Two strategies must be supported, selectable via configuration (see §7.10, property `ojp.loadaware.selection.enabled`): - -**Least-connections (default, `true`):** Select the healthy server with the lowest number of active sessions. Track session counts in a thread-safe counter per server endpoint. Use round-robin as a tie-breaker when all servers have equal counts. - -**Round-robin (`false`):** Cycle through healthy servers in order using an atomic counter modulo the number of healthy servers. - -Server selection runs on every new connection attempt (non-XA, first `connect()`) and on every XA `connect()`. Once a session is assigned a server via session stickiness, selection does not run again for that session. Only servers whose `isHealthy() == true` are eligible for selection. If no healthy servers exist, raise a connection error. - -> **Reference implementation:** -> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.selectHealthyServer()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): the entry point that dispatches to one of the two strategies based on config. -> - `MultinodeConnectionManager.selectByLeastConnections(healthyServers)`: picks the server with the lowest active-session count. -> - `MultinodeConnectionManager.selectByRoundRobin(healthyServers)`: atomically increments `roundRobinCounter` and selects `servers[counter % size]`. -> - `ojp-jdbc-driver` — [`ServerEndpoint`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/ServerEndpoint.java): holds `isHealthy`, `lastFailureTime`, host, and port state for each endpoint. - ---- - -### 3.5 Cluster Health Propagation - -The cluster health string format is: `host1:port1(UP);host2:port2(DOWN);host3:port3(UP)` - -Each semicolon-separated segment is `host:port(STATUS)` where status is `UP` or `DOWN`. 
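The two selection strategies described in §3.4 can be sketched, non-normatively, as follows; endpoint attributes are assumed, and the round-robin tie-breaker for least-connections is elided for brevity:

```python
import itertools

_round_robin = itertools.count()  # stands in for an atomic counter

def select_by_least_connections(healthy):
    """Least-connections: the lowest active-session count wins (default)."""
    return min(healthy, key=lambda ep: ep.active_sessions)

def select_by_round_robin(healthy):
    """Round-robin: counter modulo the number of healthy servers."""
    return healthy[next(_round_robin) % len(healthy)]

def select_healthy_server(endpoints, load_aware=True):
    """Dispatch per ojp.loadaware.selection.enabled; only healthy servers qualify."""
    healthy = [ep for ep in endpoints if ep.is_healthy]
    if not healthy:
        raise ConnectionError("no healthy OJP server available")
    strategy = select_by_least_connections if load_aware else select_by_round_robin
    return strategy(healthy)
```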
- -The client must build the cluster health string from local endpoint state before every `connect()` call. It must also consume the cluster health string returned in `SessionInfo.clusterHealth` on every response, updating local endpoint states: endpoints marked `DOWN` must be treated as unhealthy; endpoints marked `UP` that were previously failed must not be marked healthy immediately — the health-check thread must confirm first. - -When the topology changes (a server fails or recovers), the client must proactively push the updated cluster health to all currently healthy servers via two independent triggers — both are necessary: - -**Trigger 1 — health-check thread**: When `performHealthCheck()` detects a newly failed or recovered server, it calls `pushClusterHealthToAllHealthyServers()` inline. This covers the case when no SQL traffic is active. - -**Trigger 2 — query thread**: When a SQL query thread detects server failure via `handleServerFailure()`, it submits `pushClusterHealthToAllHealthyServers()` to the background scheduler asynchronously. This covers the race where the query thread marks the server unhealthy before the health checker runs. - -The push is done by calling `connect()` on each healthy server with a `ConnectionDetails` whose `clusterHealth` field contains the new topology string. The server uses this to resize its pool immediately. - -```python -# Build the health string from local endpoint state -def build_cluster_health(endpoints): - return ";".join( - f"{ep.host}:{ep.port}({'UP' if ep.is_healthy else 'DOWN'})" - for ep in endpoints - ) - -# Push updated cluster health to all healthy servers via a connect() call. -# The server uses the clusterHealth field to resize its pool immediately. 
-def push_cluster_health(endpoints, stored_details):
-    if not stored_details:
-        return  # no connections yet — nothing to push
-    health_str = build_cluster_health(endpoints)
-    for conn_hash, details in stored_details.items():
-        push_req = ConnectionDetails()
-        push_req.CopyFrom(details)           # details is the stored ConnectionDetails message
-        push_req.clusterHealth = health_str
-        for ep in endpoints:
-            if ep.is_healthy:
-                stubs[ep].connect(push_req)  # no-op for pool creation; resizes pool
-
-# Consume the cluster health returned in every gRPC response
-def consume_cluster_health(session_info):
-    for segment in session_info.clusterHealth.split(";"):
-        host_port, status = segment.rsplit("(", 1)
-        status = status.rstrip(")")
-        endpoint = find_endpoint(host_port)
-        if status == "DOWN" and endpoint.is_healthy:
-            handle_server_failure(endpoint)
-        elif status == "UP" and not endpoint.is_healthy:
-            # do not mark healthy here — let the health-check thread confirm (§4.4)
-            pass
-```
-
-> **Reference implementation:**
-> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.generateClusterHealth()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): builds the semicolon-delimited health string from `serverEndpoints`.
-> - `MultinodeConnectionManager.pushClusterHealthToAllHealthyServers()`: calls `connect()` on every healthy server with the new cluster health embedded in `ConnectionDetails`.
-> - `MultinodeConnectionManager.handleServerFailure()` (Trigger 2): submits `pushClusterHealthToAllHealthyServers()` to `healthCheckScheduler` on a genuine healthy->unhealthy transition.
-> - `MultinodeConnectionManager.performHealthCheck()` (Trigger 1): calls `pushClusterHealthToAllHealthyServers()` directly after marking a server DOWN or after a recovered server is marked healthy.
-> - `MultinodeStatementService.withClusterHealth(sessionInfo)`: attaches the current health string to an outgoing `SessionInfo` before each RPC (reactive secondary path).
-
----
-
-## 4. 
Client Responsibilities
-
-### 4.1 Connection Establishment and connHash Caching
-
-#### First connection (cache miss)
-
-1. Build a `ConnectionDetails` message (see §3.2 for field mapping).
-2. Call `connect(ConnectionDetails)` on the chosen server. Receive `SessionInfo`.
-3. Cache the returned `connHash`, keyed on `url + "|" + user + "|" + password + "|" + datasourceName`. Also store the full `ConnectionDetails` for replay if the server restarts.
-4. Return the received `SessionInfo` to the caller.
-
-#### Subsequent connections (cache hit, non-XA only)
-
-When a subsequent connection uses the same credentials:
-1. Look up `connHash` from the local cache.
-2. Build a `SessionInfo` locally without making any gRPC call:
-   ```
-   SessionInfo {
-     connHash: <cached connHash>
-     clientUUID: <process clientUUID>
-     isXA: false
-   }
-   ```
-3. Return this locally-built `SessionInfo`. No `sessionUUID` is set yet; it will be assigned by the server lazily.
-
-XA connections always call the server — caching is disabled for XA because each XA connection must create a dedicated pool entry on a specific server.
-
-#### Cache invalidation (NOT_FOUND recovery)
-
-When any gRPC call returns `Status.NOT_FOUND`, the server has lost its in-memory pool. Recovery procedure:
-1. Remove the cached `connection-key -> connHash` entry (but keep the stored `ConnectionDetails`).
-2. Re-issue a real `connect()` RPC using the stored `ConnectionDetails`.
-3. Cache the new `connHash` returned.
-4. Retry the original failed operation once with the new `SessionInfo`.
-5. This retry is only safe if the original request had no active `sessionUUID`. If a session was in progress, surface the error to the caller — the transaction state is permanently lost. 
- -```python -# --- First connection (cache miss) --- -req = ConnectionDetails( - url = "jdbc:postgresql://db:5432/mydb", # actual DB URL - user = "alice", - password = "secret", - clientUUID = CLIENT_UUID, # stable process UUID (§3.3) - serverEndpoints = ["host1:10591", "host2:10591"], # full cluster list - clusterHealth = build_cluster_health(endpoints), # §3.5; "" on very first call - isXA = False, - properties = [PropertyEntry(key="ojp.datasource.name", string_value="default")] -) - -session = stub.connect(req) -# session.connHash = "abc123..." — server-computed pool key -# session.clientUUID = CLIENT_UUID - -# Cache connHash for subsequent connections -cache_key = f"{req.url}|{req.user}|{req.password}|default" -connhash_cache[cache_key] = session.connHash -stored_details[session.connHash] = req # kept for NOT_FOUND recovery (see below) - -# --- Subsequent connection (cache hit, non-XA) --- -# No RPC call needed — build SessionInfo locally from the cached connHash -session = SessionInfo( - connHash = connhash_cache[cache_key], - clientUUID = CLIENT_UUID, - isXA = False - # sessionUUID is absent; the server assigns it lazily on startTransaction -) - -# --- NOT_FOUND recovery --- -# If any RPC returns Status.NOT_FOUND (server restarted, pool lost): -del connhash_cache[cache_key] -session = stub.connect(stored_details[old_conn_hash]) # re-issue real connect() -connhash_cache[cache_key] = session.connHash # update cache -# then retry the original failed RPC once -``` - -> **Reference implementation:** -> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.connect()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): orchestrates first-connect vs. cache-hit logic. -> - `MultinodeConnectionManager.computeConnectionKey()`: builds the `url|user|password|datasourceName` cache key. -> - `MultinodeConnectionManager.invalidateConnHash()`: removes the stale key from `connHashByConnectionKey` on `NOT_FOUND`. 
-> - `MultinodeConnectionManager.reconnectForConnHash()`: re-issues the real `connect()` RPC using stored `ConnectionDetails` and updates the cache. -> - `MultinodeConnectionManager.buildLocalSessionInfo()`: constructs the in-memory `SessionInfo` for cache-hit connections without an RPC call. - ---- - -### 4.2 Session Lifecycle - -**SessionInfo fields:** - -| Field | Type | Meaning | -|---|---|---| -| `connHash` | string | Server-side key identifying which connection pool to use | -| `clientUUID` | string | Client process identity (see §3.3) | -| `sessionUUID` | string | Server-side session handle; set once a session is established | -| `transactionInfo` | `TransactionInfo` | Contains `transactionUUID` and `transactionStatus` (`TRX_ACTIVE`, `TRX_COMMITED`, `TRX_ROLLBACK`) | -| `sessionStatus` | `SessionStatus` | `SESSION_ACTIVE` or `SESSION_TERMINATED` | -| `isXA` | bool | Whether this is an XA session | -| `targetServer` | string | `host:port` of the server this session is pinned to (set by the server, used by the client for stickiness) | -| `clusterHealth` | string | Current cluster health snapshot from the server's perspective | - -**Lifecycle rules:** - -Always propagate the latest `SessionInfo` on every outgoing request. The server updates and returns it in every response; the client must replace its local copy with the one returned. When the response contains a `sessionUUID` that was absent in the request, register it immediately with the session-stickiness layer. On connection close, call `terminateSession(SessionInfo)` — this is mandatory for releasing server-side resources. If `sessionStatus == SESSION_TERMINATED` is received, treat the connection as closed and make no further calls on it. - -**Session stickiness enforcement:** - -Once a `sessionUUID` is established, every subsequent request for that session must go to the same server. Maintain a thread-safe map of `sessionUUID -> host:port`. 
On each request, if `sessionUUID` is set, look up the bound server and route the request there exclusively. If the bound server is currently marked unhealthy, raise an error to the caller — do not silently reroute. When a session is closed via `terminateSession`, remove the binding from the map and decrement the session count for that server in the load-balancing tracker. - -```python -# Every gRPC call returns an updated SessionInfo — always replace the local copy -resp = stub.executeUpdate(StatementRequest(session=current_session, sql="...")) -current_session = resp.session # update after every call - -# When a new sessionUUID appears in the response, record the server binding (§2.3) -if resp.session.sessionUUID and resp.session.sessionUUID != current_session.sessionUUID: - bind_session(resp.session.sessionUUID, resp.session.targetServer) - -# Close a connection — release server-side state -stub.terminateSession(current_session) -# After this call, discard current_session and do not make further calls on it -``` - -> **Reference implementation:** -> - `ojp-jdbc-driver` — [`Connection`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Connection.java): holds the mutable `session` field (`SessionInfo`); `close()` calls `terminateSession(session)` and nulls the session. -> - `ojp-jdbc-driver` — [`MultinodeStatementService.withClusterHealth()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeStatementService.java): enriches outgoing `SessionInfo` with the current cluster health string before each RPC. -> - `MultinodeStatementService.checkAndBindSession()`: updates the stickiness map whenever the server returns a new or changed `sessionUUID`. -> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.terminateSession()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): forwards `terminateSession` to every server that received a `connect()` for this `connHash`. 
-> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.affinityServer(sessionKey)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): returns the bound server for a `sessionUUID`, or selects a new one via load balancing when no binding exists yet. -> - `MultinodeConnectionManager.bindSession(sessionUUID, targetServer)`: records the `sessionUUID -> host:port` mapping in `sessionToServerMap`. -> - `MultinodeConnectionManager.unbindSession(sessionUUID)`: removes the binding on session close. -> - `ojp-jdbc-driver` — [`SessionTracker`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/SessionTracker.java): maintains per-server session counts used by the load-balancer and redistribution logic. - ---- - -### 4.3 Failover - -**What triggers failover:** - -| Status code | Trigger failover? | -|---|---| -| `UNAVAILABLE` | Yes | -| `DEADLINE_EXCEEDED` | Yes | -| `UNKNOWN` (with "connection" in message) | Yes | -| `INTERNAL` with SQL metadata trailers | **No** — this is a database-level error | -| `INTERNAL` without SQL metadata trailers | Yes — treated as a transport-level failure | -| `NOT_FOUND` | **No** — triggers reconnect (see §4.1), not failover | -| `RESOURCE_EXHAUSTED` (pool exhaustion) | **No** — surface to caller | -| `CANCELLED` | **No** — client-initiated cancellation; must never mark a server unhealthy | -| Any `SQLException` from server | **No** | - -**Failover procedure:** - -1. Capture whether the server was previously healthy (`wasHealthy`). -2. Mark the server unhealthy (`isHealthy = false`), recording the failure timestamp. -3. Log the failure. -4. If this is a genuine healthy -> unhealthy transition (`wasHealthy == true`), submit `pushClusterHealthToAllHealthyServers()` asynchronously to the background scheduler — do not block the query thread. -5. Shut down the gRPC channel for the failed server gracefully. -6. 
Select the next healthy server using the configured strategy, excluding the failed server and any already attempted in this retry cycle. -7. Retry the operation on the new server. -8. If all servers have been attempted and all failed, raise a connection error to the caller. - -Retry attempts and delay between retries are configurable (see §7.10, properties `ojp.multinode.retry.attempts` and `ojp.multinode.retry.delay`). - -**What must NOT trigger failover:** database errors, pool exhaustion, and session-invalidation errors must all be surfaced directly to the caller. - -> **Reference implementation:** -> - `ojp-jdbc-driver` — [`GrpcExceptionHandler.isConnectionLevelError()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/GrpcExceptionHandler.java): classifies a `StatusRuntimeException` as a connectivity failure vs. a SQL/business error. `CANCELLED` is explicitly **excluded**. -> - `GrpcExceptionHandler.isPoolNotFoundException()`: returns `true` for `NOT_FOUND`, triggering reconnect rather than failover. -> - `GrpcExceptionHandler.isSessionInvalidationError()`: returns `true` when the server indicates the session is gone. -> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.handleServerFailure(endpoint, exception)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): marks the server unhealthy, timestamps the failure, shuts down the gRPC channel gracefully, and submits `pushClusterHealthToAllHealthyServers()` on a genuine healthy->unhealthy transition. -> - `MultinodeStatementService.executeOpResultWithSessionStickinessAndBinding()`: the retry loop that catches `StatusRuntimeException`, calls `isConnectionLevelError`, drives the server-selection retry cycle. - ---- - -### 4.4 Health Checking - -Run a periodic background task that checks server health. 
The task must run at a configurable fixed interval (property `ojp.health.check.interval`, default 5 000 ms), not block the main execution thread, and be a daemon task so it does not prevent process shutdown. Start the background scheduler before the first connection is accepted. - -**Two-phase check:** - -**Phase 1 — probe healthy servers (detect newly failed servers):** Run when there are active XA sessions (`sessionToServerMap` is non-empty) **or** cached non-XA connection details (`connectionDetailsByConnHash` is non-empty). For each currently healthy server that passes the guard, send a probe call. If the call fails, mark the server unhealthy. - -**Phase 2 — probe unhealthy servers (detect recovery):** For each currently unhealthy server, check if enough time has passed since the last failure (property `ojp.health.check.threshold`, default 5 000 ms). If so, probe the server. If the probe succeeds, run recovery (see §4.5). - -**Health probe modes:** - -| Mode | How to probe | When to use | -|---|---|---| -| Heartbeat (lightweight) | Send `connect()` with empty `url`, `user`, `password` — any response means transport is up | Default | -| Full validation | Send `connect()` with real credentials; on success, call `terminateSession` on the returned session | When heartbeat is insufficient | - -```python -# Lightweight heartbeat: send empty credentials — any response means transport is up -def heartbeat_probe(stub): - try: - stub.connect(ConnectionDetails(url="", user="", password="")) - return True # server is reachable - except grpc.RpcError: - return False # mark server unhealthy (§4.3) - -# Full validation: connect with real credentials, then immediately terminate -def full_validation_probe(stub, stored_details): - try: - session = stub.connect(stored_details) - stub.terminateSession(session) - return True - except grpc.RpcError: - return False - -# Periodic background task -def run_health_check(endpoints, stubs, stored_details): - for ep in endpoints: - if 
ep.is_healthy: - # Phase 1 — probe healthy server; detect new failures - if stored_details or xa_sessions: # guard: skip if no connections yet - if not heartbeat_probe(stubs[ep]): - handle_server_failure(ep) - push_cluster_health_async(endpoints, stored_details) - else: - # Phase 2 — probe unhealthy server; detect recovery - if time_since(ep.last_failure) >= HEALTH_CHECK_THRESHOLD: - if heartbeat_probe(stubs[ep]): - reinitialize_pool_on_recovered_server(ep, stored_details) # §4.5 - ep.mark_healthy() - push_cluster_health_inline(endpoints, stored_details) # §3.5 -``` - -> **Reference implementation:** -> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.performHealthCheck()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): the scheduled task body; implements the two-phase check. -> - `ojp-jdbc-driver` — [`HealthCheckValidator.validateServer(endpoint)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/HealthCheckValidator.java): performs a single lightweight probe. -> - `ojp-jdbc-driver` — [`HealthCheckConfig`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/HealthCheckConfig.java): POJO holding `healthCheckIntervalMs`, `healthCheckThresholdMs`, `healthCheckTimeoutMs`, and `redistributionEnabled`. -> - `MultinodeConnectionManager` constructor: schedules `performHealthCheck` on a `ScheduledExecutorService` at the configured interval. - ---- - -### 4.5 Connection Redistribution on Recovery - -When a failed server comes back online, rebalance client-side connections so that the recovered server receives its fair share of traffic again. - -**Procedure on recovery:** - -1. **Reinitialize pools on the recovered server first** (before marking healthy). Check whether any non-XA connections have been cached (`connectionDetailsByConnHash` is non-empty). If so, for every cached `connHash`/`ConnectionDetails` pair, call `connect()` on the recovered server so it creates the HikariCP pool immediately. 
This eliminates the NOT_FOUND window that would otherwise exist between marking the server healthy and the first SQL call reaching it. Only after all pools are pre-created, proceed to step 2. -2. Mark the server healthy (`endpoint.markHealthy()`). -3. Push the updated cluster health string to all healthy servers (see §3.5) so they can resize their pools. -4. If redistribution is enabled (`ojp.redistribution.enabled = true`), begin rebalancing: - - Determine the ideal share: `totalConnections / numberOfHealthyServers`. - - Identify over-loaded servers (connections > ideal share). - - Close a fraction of idle connections on over-loaded servers so they are returned to the pool, then re-opened. - - Honour the configurable fraction (`ojp.redistribution.idleRebalanceFraction`, default 1.0) and max-close-per-cycle limit (`ojp.redistribution.maxClosePerRecovery`, default 100). - -> **Reference implementation:** -> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.reinitializePoolOnRecoveredServer(recoveredServer)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): always called **before** `endpoint.markHealthy()`. -> - `ojp-jdbc-driver` — [`ConnectionRedistributor.rebalance(recoveredServers, allHealthyServers)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/ConnectionRedistributor.java): closes a fraction of idle connections on over-loaded servers for non-XA mode. -> - `ojp-jdbc-driver` — [`XAConnectionRedistributor.rebalance(recoveredServers, allHealthyServers)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/XAConnectionRedistributor.java): equivalent redistribution for XA connections. -> - `ojp-jdbc-driver` — [`ConnectionTracker`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/ConnectionTracker.java): maintains the per-server `Connection` list consulted by `ConnectionRedistributor`. - ---- - -## 5. 
Minimal End-to-End Example - -The following pseudo-code shows the complete sequence for a basic OJP client session: channel setup, connect, query, transaction with DML, and clean close. - -```python -import grpc -import uuid - -# -- 1. Channel and stub setup -# One long-lived channel per OJP server endpoint; shared across all connections. -CLIENT_UUID = str(uuid.uuid4()) # stable for this process lifetime -channel = grpc.insecure_channel("localhost:10591") -stub = StatementServiceStub(channel) - -# In-process state -connhash_cache = {} # cache_key -> connHash -stored_details = {} # connHash -> ConnectionDetails (for NOT_FOUND recovery) - -# -- 2. connect() — first connection (cache miss) -cache_key = "jdbc:postgresql://db:5432/mydb|alice|secret|default" - -if cache_key not in connhash_cache: - req = ConnectionDetails( - url = "jdbc:postgresql://db:5432/mydb", - user = "alice", - password = "secret", - clientUUID = CLIENT_UUID, - serverEndpoints = ["localhost:10591"], - clusterHealth = "", # empty on very first connect - isXA = False, - properties = [PropertyEntry(key="ojp.datasource.name", - string_value="default")] - ) - session = stub.connect(req) - connhash_cache[cache_key] = session.connHash - stored_details[session.connHash] = req -else: - # Cache hit — build SessionInfo locally; no RPC needed - session = SessionInfo(connHash=connhash_cache[cache_key], - clientUUID=CLIENT_UUID, isXA=False) - -# -- 3. 
executeQuery() — read rows -result_set_uuid = None -rows = [] -for op_result in stub.executeQuery(StatementRequest( - session = session, - sql = "SELECT id, name FROM products WHERE active = ?", - parameters = [ParameterProto(index=1, type=PT_BOOLEAN, - values=[ParameterValue(bool_value=True)])], - statementUUID = str(uuid.uuid4()))): - qr = op_result.query_result - if result_set_uuid is None: - result_set_uuid = qr.resultSetUUID - rows.extend(qr.rows) - session = op_result.session # always update local session - -# Close result set when done -stub.callResource(CallResourceRequest( - session=session, resourceType=RES_RESULT_SET, - resourceUUID=result_set_uuid, - target=TargetCall(callType=CALL_CLOSE))) - -# -- 4. Transaction — startTransaction, executeUpdate, commitTransaction -session = stub.startTransaction(session) - -resp = stub.executeUpdate(StatementRequest( - session = session, - sql = "INSERT INTO orders(customer, amount) VALUES(?, ?)", - parameters = [ - ParameterProto(index=1, type=PT_STRING, - values=[ParameterValue(string_value="alice")]), - ParameterProto(index=2, type=PT_INT, - values=[ParameterValue(int_value=99)]) - ], - statementUUID = str(uuid.uuid4()))) -session = resp.session - -session = stub.commitTransaction(session) - -# -- 5. terminateSession() — release server-side state -stub.terminateSession(session) -# Discard session — do not make further calls on it. -channel.close() -``` - ---- - -## 6. Error Handling - -### 6.1 Error Classification Matrix - -| Condition | gRPC status | Client action | -|---|---|---| -| SQL error (bad query, constraint, etc.) 
| `INTERNAL` + `SqlErrorResponse` trailer | Throw SQL exception; do not retry; do not mark server unhealthy | -| Pool not found (server restarted) | `NOT_FOUND` | Invalidate connHash cache; reconnect; retry once (§4.1) | -| Server unreachable | `UNAVAILABLE` | Failover to next server (§4.3) | -| Request timeout | `DEADLINE_EXCEEDED` | Failover to next server (§4.3) | -| Client-side cancellation | `CANCELLED` | Do **not** failover; do **not** mark server unhealthy; surface to caller | -| Pool exhausted | `RESOURCE_EXHAUSTED` | Throw pool-exhaustion error; do not retry; do not mark server unhealthy | -| Session invalidated (server failure) | Session-not-found message | Throw session-lost error; do not retry; let caller decide | -| Session stickiness violation (server down) | Local check before RPC | Throw connection error immediately; do not reroute | - -### 6.2 SQL Errors vs. gRPC Transport Errors - -When the server encounters a SQL error, it returns `Status.INTERNAL` with a `SqlErrorResponse` message attached to the trailing metadata. The client must extract this trailer and use its fields to construct a meaningful error. - -``` -SqlErrorResponse { - reason: string // human-readable message - sqlState: string // ANSI SQL state code - vendorCode: int32 // database-specific error code - sqlErrorType: SqlErrorType // SQL_EXCEPTION or SQL_DATA_EXCEPTION -} -``` - -Map to the host language's exception hierarchy: -- `SQL_EXCEPTION` -> standard SQL exception. -- `SQL_DATA_EXCEPTION` -> data-specific SQL exception (subtype). - -A transport error — `UNAVAILABLE`, `DEADLINE_EXCEEDED`, `UNKNOWN` (with "connection" in the message), or `INTERNAL` without a `SqlErrorResponse` trailer — triggers the failover procedure in §4.3. - -> **Note:** Prior to April 2026 the server incorrectly used `Status.CANCELLED` for SQL errors. The correct status is `Status.INTERNAL` with a `SqlErrorResponse` trailer. 
Implementations must use `INTERNAL` for SQL errors and must not treat `CANCELLED` as a server failure. - -> **Reference implementation:** -> - `ojp-jdbc-driver` — [`GrpcExceptionHandler.handle(StatusRuntimeException)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/GrpcExceptionHandler.java): extracts `SqlErrorResponse` from gRPC trailing metadata on `Status.INTERNAL` and throws the appropriate `SQLException` with SQL state and vendor code. -> - `GrpcExceptionHandler.isPoolNotFoundException(exception)`: returns `true` for `NOT_FOUND`. -> - `GrpcExceptionHandler.isSessionInvalidationError(exception)`: returns `true` for session-invalidation error messages. -> - `GrpcExceptionHandler.isConnectionLevelError(exception)`: returns `true` for `UNAVAILABLE`, `DEADLINE_EXCEEDED`, and connection-related `UNKNOWN` errors. - ---- - -## 7. Implementation Guidance - -### 7.1 Statement Execution - -All SQL is executed by populating a `StatementRequest` and calling either `executeUpdate` or `executeQuery` on the stub. - -**Parameterless SQL:** Set `sql` to the full query string and leave `parameters` empty. - -**Parameterized SQL:** Set `sql` with `?` positional placeholders and populate the `parameters` list with one `ParameterProto` per `?`. Parameters are accumulated locally and sent together in a single `StatementRequest`. Assign a `statementUUID` (a random UUID per logical prepared-statement instance) so the server can track resources tied to that statement. - -**Stored-procedure calls:** First call `callResource` with `CallType.CALL_PREPARE` to register the procedure on the server and receive a `resourceUUID`. Then call `callResource` with `CallType.CALL_EXECUTE` to run it, passing IN parameters and retrieving OUT/INOUT values from `CallResourceResponse.values`. - -**Execution routing:** -- Use `executeUpdate` for INSERT / UPDATE / DELETE / DDL — returns `OpResult` with `type = INTEGER` containing affected row count. 
-- Use `executeQuery` for SELECT — returns a server-streaming response. Consume the first `OpResult` to get the initial batch; call `fetchNextRows` for subsequent pages (see §7.4). -- After any execution, update the local `SessionInfo` from the `OpResult.session` field. - -```python -# DML — INSERT / UPDATE / DELETE (use executeUpdate) -resp = stub.executeUpdate(StatementRequest( - session = session, - sql = "INSERT INTO orders(customer, amount) VALUES(?, ?)", - parameters = [ - ParameterProto(index=1, type=PT_STRING, values=[ParameterValue(string_value="Alice")]), - ParameterProto(index=2, type=PT_INT, values=[ParameterValue(int_value=42)]) - ], - statementUUID = new_uuid() # random UUID per statement instance -)) -session = resp.session # always update local session -rows_affected = resp.value.int_value # e.g., 1 - -# Query — SELECT (use executeQuery, which is server-streaming) -req = StatementRequest( - session = session, - sql = "SELECT id, name FROM orders WHERE customer = ?", - parameters = [ParameterProto(index=1, type=PT_STRING, - values=[ParameterValue(string_value="Alice")])], - statementUUID = new_uuid() -) -result_set_uuid = None -for op_result in stub.executeQuery(req): # iterate the server-streaming response - qr = op_result.query_result - if result_set_uuid is None: - result_set_uuid = qr.resultSetUUID - labels = qr.labels # e.g., ["id", "name"] - for row in qr.rows: - id_val = row.values[0].int_value - name_val = row.values[1].string_value - session = op_result.session -# Fetch additional pages -> see §7.4 - -# Stored procedure — CALL_PREPARE then CALL_EXECUTE -prep_resp = stub.callResource(CallResourceRequest( - session = session, - resourceType = RES_CALLABLE_STATEMENT, - target = TargetCall(callType=CALL_PREPARE, resourceName="{call my_proc(?,?)}", - params=[ParameterValue(int_value=1)]) # IN param -)) -proc_uuid = prep_resp.resourceUUID -session = prep_resp.session - -exec_resp = stub.callResource(CallResourceRequest( - session = session, - 
resourceType = RES_CALLABLE_STATEMENT, - resourceUUID = proc_uuid, - target = TargetCall(callType=CALL_EXECUTE) -)) -out_value = exec_resp.values[0] # first OUT/INOUT parameter value -session = exec_resp.session -``` - -> **Reference implementation:** -> - `ojp-jdbc-driver` — [`Statement`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Statement.java): `executeQuery(sql)` -> `statementService.executeQuery(...)`; `executeUpdate(sql)` -> `statementService.executeUpdate(...)`. -> - `ojp-jdbc-driver` — [`PreparedStatement`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/PreparedStatement.java): accumulates parameters in a `SortedMap`; all 28 `setXxx(index, value)` methods map to the corresponding `ParameterType` (see §7.2). -> - `ojp-jdbc-driver` — [`CallableStatement`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/CallableStatement.java): issues `callResource(CALL_PREPARE)` on construction; retrieves OUT/INOUT values via `callResource(CALL_EXECUTE)` after execution. 
-
----
-
-### 7.2 Parameter Type Mapping
-
-Each parameter is represented as:
-```
-ParameterProto {
-  index: int32             // 1-based parameter position
-  type: ParameterTypeProto // one of the 29 type codes
-  values: ParameterValue[] // one value for normal params; multiple for array params
-}
-```
-
-**ParameterTypeProto values and their ParameterValue encoding:**
-
-| Proto enum value | Wire field in `ParameterValue` | Notes |
-|---|---|---|
-| `PT_NULL` | `is_null = true` | Explicit null |
-| `PT_BOOLEAN` | `bool_value` | |
-| `PT_BYTE` | `int_value` | Clamp to byte range |
-| `PT_SHORT` | `int_value` | Clamp to short range |
-| `PT_INT` | `int_value` | |
-| `PT_LONG` | `long_value` | |
-| `PT_FLOAT` | `float_value` | |
-| `PT_DOUBLE` | `double_value` | |
-| `PT_BIG_DECIMAL` | `string_value` | Encode as `"<unscaledInteger> <scale>"` — see §7.2.1 |
-| `PT_STRING` | `string_value` | |
-| `PT_BYTES` | `bytes_value` | Raw bytes |
-| `PT_DATE` | `date_value` | `google.type.Date` (year/month/day, no timezone) |
-| `PT_TIME` | `time_value` | `google.type.TimeOfDay` (hours/minutes/seconds/nanos) |
-| `PT_TIMESTAMP` | `timestamp_value` | `TimestampWithZone` — see §7.3 |
-| `PT_ASCII_STREAM` | `bytes_value` | ASCII bytes |
-| `PT_UNICODE_STREAM` | `bytes_value` | Unicode bytes |
-| `PT_BINARY_STREAM` | `bytes_value` | Binary bytes |
-| `PT_OBJECT` | varies | Best-effort mapping to one of the concrete value types |
-| `PT_CHARACTER_READER` | `string_value` | Contents of the character stream |
-| `PT_REF` | `string_value` | REF value as string |
-| `PT_BLOB` | (LOB reference UUID) | Create LOB first (§7.5); then pass UUID as `string_value` |
-| `PT_CLOB` | (LOB reference UUID) | Same as BLOB |
-| `PT_ARRAY` | `int_array_value` / `long_array_value` / `string_array_value` | Use the typed array message matching element type |
-| `PT_URL` | `url_value` (StringValue) | `URL.toExternalForm()` — presence-aware; unset = null |
-| `PT_ROW_ID` | `rowid_value` (StringValue) | Base64-encoded bytes of the RowId — presence-aware |
-| `PT_N_STRING` | `string_value` | Same wire format as PT_STRING |
-| `PT_N_CHARACTER_STREAM` | `string_value` | Contents of the NCharacter stream |
-| `PT_N_CLOB` | (LOB reference UUID) | Same as CLOB |
-| `PT_SQL_XML` | `string_value` | XML content as string |
-
-#### 7.2.1 BigDecimal encoding
-
-BigDecimal is serialised as a space-separated string: `"<unscaledInteger> <scale>"`.
-
-- `unscaledInteger`: the decimal string representation of the unscaled value (may be negative).
-- `scale`: integer scale (number of decimal places).
-- Full value = `unscaledInteger * 10^(-scale)`.
-
-Example: `BigDecimal("123.45")` -> `"12345 2"`.
-
-> **Note:** A separate binary wire format is documented in `documents/protocol/BIGDECIMAL_WIRE_FORMAT.md` for contexts where binary efficiency is needed.
-
-#### 7.2.2 Presence-aware fields
-
-`url_value`, `rowid_value`, `uuid_value`, `biginteger_value`, `rowidlifetime_value` are all `google.protobuf.StringValue` (a wrapper message). An absent (unset) wrapper means SQL NULL. An empty string inside the wrapper is a valid non-null value.
-
-> **Reference implementation:**
-> - `ojp-grpc-commons` — [`ProtoConverter.toProto(Parameter)`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/ProtoConverter.java): converts a host-language `Parameter` object to `ParameterProto`; `fromProto(ParameterProto)` is the inverse.
-> - `ProtoConverter.toParameterValue(Object value)`: the central dispatcher that routes each Java type to the correct `ParameterValue` oneof field.
-> - `ProtoConverter.fromParameterValue(ParameterValue, ParameterType)`: decodes a wire value back to a Java object using both the value and the declared type as hints.
-> - `ojp-grpc-commons` — [`ProtoTypeConverters`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/ProtoTypeConverters.java): handles the presence-aware `StringValue` wrappers for UUID, URL, and RowId.
-> - `ojp-grpc-commons` — [`BigDecimalWire`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/BigDecimalWire.java): `writeBigDecimal` / `readBigDecimal` — binary wire encoding for BigDecimal. - ---- - -### 7.3 Temporal Type Handling - -Timestamps are transmitted as: - -``` -TimestampWithZone { - instant: google.protobuf.Timestamp // seconds + nanos since Unix epoch (UTC) - timezone: string // IANA zone ID or UTC offset (e.g., "Europe/Rome", "+02:00") - original_type: TemporalType // preserves the caller's original type -} -``` - -**TemporalType enum:** - -| Value | Original type | -|---|---| -| `TEMPORAL_TYPE_UNSPECIFIED` | Default / unknown | -| `TEMPORAL_TYPE_TIMESTAMP` | `java.sql.Timestamp` | -| `TEMPORAL_TYPE_CALENDAR` | `java.util.Calendar` | -| `TEMPORAL_TYPE_OFFSET_DATE_TIME` | `java.time.OffsetDateTime` | -| `TEMPORAL_TYPE_LOCAL_DATE_TIME` | `java.time.LocalDateTime` | -| `TEMPORAL_TYPE_INSTANT` | `java.time.Instant` | -| `TEMPORAL_TYPE_LOCAL_DATE` | `java.time.LocalDate` | -| `TEMPORAL_TYPE_LOCAL_TIME` | `java.time.LocalTime` | -| `TEMPORAL_TYPE_OFFSET_TIME` | `java.time.OffsetTime` | - -**Encoding rules:** -1. Convert the host-language datetime value to an absolute UTC instant (seconds + nanoseconds since the Unix epoch). -2. Record the IANA timezone or UTC offset string. -3. Set `original_type` to the closest matching `TemporalType` enum value. - -**Decoding rules:** On the receiving side, use `original_type` to reconstruct the correct host-language type. - -Date-only values use `google.type.Date` (year, month, day — no timezone). Time-only values use `google.type.TimeOfDay` (hours, minutes, seconds, nanos — no timezone). - -The OJP server must always run with `user.timezone=UTC`. Client libraries should normalise to UTC when encoding timestamps, using the `timezone` field to carry the original zone for faithful reconstruction. 
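The encoding and decoding rules above can be sketched with Python's standard `datetime` and `zoneinfo` modules. This is an illustration, not the wire implementation: the returned dict merely stands in for the `TimestampWithZone` message fields, and `TEMPORAL_TYPE_LOCAL_DATE_TIME` is assumed because the example input carries no zone of its own.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def encode_timestamp_with_zone(local_dt: datetime, zone_id: str) -> dict:
    """Encode per the rules above: absolute UTC instant + original zone id.
    The dict keys mirror the TimestampWithZone fields (sketch only)."""
    aware = local_dt.replace(tzinfo=ZoneInfo(zone_id))
    delta = aware - datetime(1970, 1, 1, tzinfo=timezone.utc)
    return {
        "seconds": delta.days * 86_400 + delta.seconds,  # instant.seconds
        "nanos": delta.microseconds * 1_000,             # instant.nanos
        "timezone": zone_id,                             # original zone
        "original_type": "TEMPORAL_TYPE_LOCAL_DATE_TIME",
    }

def decode_to_local(ts: dict) -> datetime:
    """Inverse: rebuild the original wall-clock value from instant + zone."""
    instant = datetime.fromtimestamp(ts["seconds"] + ts["nanos"] / 1e9,
                                     tz=timezone.utc)
    return instant.astimezone(ZoneInfo(ts["timezone"])).replace(tzinfo=None)

ts = encode_timestamp_with_zone(datetime(2026, 4, 19, 12, 30), "Europe/Rome")
# Europe/Rome is UTC+2 in April, so the encoded instant is 10:30 UTC;
# decode_to_local(ts) reproduces the original 12:30 wall-clock value.
```

The round trip only works because the original zone travels alongside the UTC instant; dropping the `timezone` field would make `LocalDateTime`-style values unrecoverable.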
- -> **Reference implementation:** -> - `ojp-grpc-commons` — [`TemporalConverter`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/TemporalConverter.java): the definitive encoding/decoding reference for all temporal types. - ---- - -### 7.4 Result Set Streaming - -`executeQuery` is a server-streaming RPC. The response stream contains one or more `OpResult` messages: - -1. **First `OpResult`**: always contains the initial data batch in `query_result`: - - `resultSetUUID` — server-side handle for this result set. - - `labels` — ordered list of column names. - - `rows` — first batch of `ResultRow` objects, each containing a `ParameterValue` per column. - - `flag` — if `"ROW_BY_ROW"`, the server sends one row per stream message. - -2. **Subsequent `OpResult` messages** (only in non-row-by-row streaming mode): additional batches until the stream closes. - -3. **`fetchNextRows`**: After the initial stream closes, call `fetchNextRows(ResultSetFetchRequest)` with `resultSetUUID` and a page size to fetch additional rows. Repeat until the response contains an empty `rows` list. - -Map each `ParameterValue` oneof to the host language's equivalent type following the inverse of the encoding table in §7.2. Pay attention to `is_null = true` for SQL NULL values. - -**Cursor navigation:** Scrollable result sets support cursor positioning through `callResource` with `ResourceType.RES_RESULT_SET` and the appropriate `CallType`: - -| Cursor operation | CallType | -|---|---| -| `next()` | `CALL_NEXT` | -| `first()` | `CALL_FIRST` | -| `last()` | `CALL_LAST` | -| `beforeFirst()` | `CALL_BEFORE` | -| `afterLast()` | `CALL_AFTER` | -| `absolute(row)` | `CALL_ABSOLUTE` | -| `relative(rows)` | `CALL_RELATIVE` | -| `previous()` | `CALL_PREVIOUS` | -| `close()` | `CALL_CLOSE` | - -```python -# After executeQuery stream closes, fetch additional pages with fetchNextRows -result_set_uuid = ... 
# captured from the first op_result (§7.1) -all_rows = [] -while True: - resp = stub.fetchNextRows(ResultSetFetchRequest( - session = session, - resultSetUUID = result_set_uuid, - size = 500 # rows per page - )) - session = resp.session - if not resp.query_result.rows: - break # no more rows — result set exhausted - all_rows.extend(resp.query_result.rows) - -# Close the result set explicitly when done -stub.callResource(CallResourceRequest( - session = session, - resourceType = RES_RESULT_SET, - resourceUUID = result_set_uuid, - target = TargetCall(callType=CALL_CLOSE) -)) - -# Cursor navigation — jump to an absolute row (scrollable result sets only) -resp = stub.callResource(CallResourceRequest( - session = session, - resourceType = RES_RESULT_SET, - resourceUUID = result_set_uuid, - target = TargetCall( - callType = CALL_ABSOLUTE, - params = [ParameterValue(int_value=10)] # jump to row 10 - ) -)) -session = resp.session -current_row = resp.values # column values for row 10 -``` - -> **Reference implementation:** -> - `ojp-jdbc-driver` — [`ResultSet`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/ResultSet.java): `next()` drives the multi-block iteration; `setNextOpResult()` loads a new batch from the iterator; `nextWithSessionUpdate()` updates the session from each block. All `getXxx(columnIndex)` methods call `ProtoConverter.fromParameterValue()` on the column's `ParameterValue`. -> - `ojp-jdbc-driver` — [`RemoteProxyResultSet`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/RemoteProxyResultSet.java): base class holding `resultSetUUID` and `statementService`; all scrollable-cursor operations issue `callResource(RES_RESULT_SET, CALL_FIRST/LAST/ABSOLUTE/...)`. -> - `ojp-jdbc-driver` — [`StatementServiceGrpcClient.fetchNextRows(sessionInfo, resultSetUUID, size)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java): the RPC that fetches the next page. 
-> - `ojp-grpc-commons` — [`ProtoConverter.fromProto(OpQueryResultProto)`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/ProtoConverter.java): deserialises the initial `OpQueryResult` (labels + rows + resultSetUUID). - ---- - -### 7.5 LOB (Large Object) Handling - -**LOB types:** - -| LobType enum | Meaning | -|---|---| -| `LT_BLOB` | Binary large object | -| `LT_CLOB` | Character large object | -| `LT_BINARY_STREAM` | Binary stream (column-streaming variant) | -| `LT_ASCII_STREAM` | ASCII character stream | -| `LT_UNICODE_STREAM` | Unicode character stream | -| `LT_CHARACTER_STREAM` | Generic character stream | - -**Writing a LOB (createLob):** -1. Open a client-streaming call to `createLob`. -2. Send one or more `LobDataBlock` messages: - ``` - LobDataBlock { - session: SessionInfo - position: int64 // byte offset of this chunk - data: bytes // chunk content (recommended chunk size: 32-64 KB) - lobType: LobType - metadata: PropertyEntry[] // used for binary streams to carry prepared statement info - } - ``` -3. Close the stream. The server responds with a `LobReference`: - ``` - LobReference { - session: SessionInfo - uuid: string // LOB handle - bytesWritten: int32 - lobType: LobType - } - ``` -4. Store the `LobReference.uuid`. This UUID is passed as a parameter value (§7.2) when binding the LOB to a SQL statement. - -**Reading a LOB (readLob):** Call `readLob(ReadLobRequest)`: -``` -ReadLobRequest { - lobReference: LobReference // uuid + session info - position: int64 // start byte (1-based for JDBC compatibility) - length: int32 // max bytes to return -} -``` -Receive a server-streaming response of `LobDataBlock` messages. Concatenate the `data` fields in order to reconstruct the content. - -LOB handles are server-side objects. A connection that has an open LOB must remain bound to the same server. Do not reroute such connections during failover; surface the error to the caller. 
- -```python -CHUNK_SIZE = 64 * 1024 # 64 KB recommended chunk size - -# --- Write a LOB (createLob is client-streaming) --- -def write_lob(stub, session, data_bytes, lob_type=LT_BLOB): - def generate_blocks(): - for offset in range(0, len(data_bytes), CHUNK_SIZE): - yield LobDataBlock( - session = session, - position = offset, - data = data_bytes[offset : offset + CHUNK_SIZE], - lobType = lob_type - ) - lob_ref = stub.createLob(generate_blocks()) # client-streaming -> single LobReference - # lob_ref.uuid -> the LOB handle; pass as parameter to executeUpdate (see §7.2) - # lob_ref.bytesWritten -> sanity check - return lob_ref.uuid - -# Bind the LOB UUID when executing a statement -lob_uuid = write_lob(stub, session, my_bytes) -stub.executeUpdate(StatementRequest( - session = session, - sql = "INSERT INTO docs(content) VALUES(?)", - parameters = [ParameterProto(index=1, type=PT_BLOB, - values=[ParameterValue(string_value=lob_uuid)])] -)) - -# --- Read a LOB (readLob is server-streaming) --- -def read_lob(stub, session, lob_uuid, lob_type=LT_BLOB, max_bytes=10_000_000): - req = ReadLobRequest( - lobReference = LobReference(uuid=lob_uuid, session=session, lobType=lob_type), - position = 1, # 1-based start position - length = max_bytes - ) - return b"".join(block.data for block in stub.readLob(req)) - -content = read_lob(stub, session, lob_uuid) -``` - -> **Reference implementation:** -> - `ojp-jdbc-driver` — [`LobServiceImpl`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/LobServiceImpl.java): `sendBytes(lobType, pos, inputStream)` opens the client-streaming `createLob` call, chunks the data into `LobDataBlock` messages, and returns the `LobReference`. `parseReceivedBlocks(Iterator)` reassembles chunks from a `readLob` stream into an `InputStream`. 
-> - `ojp-jdbc-driver` — [`StatementServiceGrpcClient.createLob(connection, iterator)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java): the client-streaming gRPC call; uses an async stub and a `CountDownLatch` to bridge the streaming API back to a synchronous return value. -> - `StatementServiceGrpcClient.readLob(lobReference, pos, length)`: the server-streaming gRPC call that returns an `Iterator`. -> - `ojp-jdbc-driver` — [`Blob`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Blob.java): `getBytes(pos, length)` and `getBinaryStream()` call `readLob`; `setBytes(pos, bytes)` calls `sendBytes`. [`Clob`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Clob.java) mirrors the same pattern for character data. -> - `ojp-jdbc-driver` — [`BinaryStream`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/BinaryStream.java): streams binary content directly via `createLob` without materialising the full byte array. - ---- - -### 7.6 Transaction Management (non-XA) - -The server tracks open transactions per session. The client controls when transactions begin and end by calling explicit RPCs. - -- **Start a transaction**: call `startTransaction(SessionInfo)`. The returned `SessionInfo` contains a `transactionUUID` and `transactionStatus = TRX_ACTIVE`. -- **Commit**: call `commitTransaction(SessionInfo)`. Returns updated `SessionInfo` with `transactionStatus = TRX_COMMITED`. -- **Rollback**: call `rollbackTransaction(SessionInfo)`. Returns updated `SessionInfo` with `transactionStatus = TRX_ROLLBACK`. - -Always replace the local `SessionInfo` with the one returned by these calls. - -Set or get the isolation level by calling `callResource` with `RES_CONNECTION` and `CallType.CALL_SET` / `CALL_GET` and resource name `"TransactionIsolation"`. The isolation level should be reset to the default after each logical connection is reused. 
- -```python -# Begin an explicit transaction -session = stub.startTransaction(session) -# session.transactionInfo.transactionUUID = "txn-uuid" -# session.transactionInfo.transactionStatus = TRX_ACTIVE - -# Execute SQL within the open transaction -resp = stub.executeUpdate(StatementRequest(session=session, sql="INSERT INTO orders ...")) -session = resp.session # always update local session - -# Commit -session = stub.commitTransaction(session) -# session.transactionInfo.transactionStatus = TRX_COMMITED - -# -- OR -- Rollback -session = stub.rollbackTransaction(session) -# session.transactionInfo.transactionStatus = TRX_ROLLBACK - -# Set transaction isolation (READ_COMMITTED = 2) -resp = stub.callResource(CallResourceRequest( - session = session, - resourceType = RES_CONNECTION, - target = TargetCall( - callType = CALL_SET, - resourceName = "TransactionIsolation", - params = [ParameterValue(int_value=2)] - ) -)) -session = resp.session - -# Get current isolation level -resp = stub.callResource(CallResourceRequest( - session = session, - resourceType = RES_CONNECTION, - target = TargetCall(callType=CALL_GET, resourceName="TransactionIsolation") -)) -isolation_level = resp.values[0].int_value -session = resp.session -``` - -> **Reference implementation:** -> - `ojp-jdbc-driver` — [`Connection.setAutoCommit(boolean)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Connection.java): calls `commitTransaction` when switching on and `startTransaction` when switching off. -> - `Connection.commit()` / `Connection.rollback()`: delegate to `statementService.commitTransaction(session)` / `rollbackTransaction(session)` when `autoCommit == false`. -> - `Connection.close()`: calls `terminateSession(session)` unconditionally. -> - `Connection.setTransactionIsolation(level)` / `getTransactionIsolation()`: forwarded via `callProxy(CallType.CALL_SET/GET, "TransactionIsolation", ...)`. 

---

### 7.7 Savepoints

Savepoints are implemented through the `callResource` protocol using `ResourceType.RES_SAVEPOINT`.

**Creating a savepoint:** Call `callResource` with `resourceType = RES_SAVEPOINT`, `target.callType = CALL_SET`, `target.resourceName = "Savepoint"`, and `target.params = [savepointName]` if named; leave `params` empty for anonymous savepoints. The response contains the savepoint UUID in `CallResourceResponse.resourceUUID`.

**Rolling back to a savepoint:** Call `callResource` with `resourceType = RES_SAVEPOINT`, `resourceUUID = <savepoint UUID>`, `target.callType = CALL_ROLLBACK`.

**Releasing a savepoint:** Call `callResource` with `resourceType = RES_SAVEPOINT`, `resourceUUID = <savepoint UUID>`, `target.callType = CALL_RELEASE`.

```python
# Create a named savepoint
resp = stub.callResource(CallResourceRequest(
    session = session,
    resourceType = RES_SAVEPOINT,
    target = TargetCall(
        callType = CALL_SET,
        resourceName = "Savepoint",
        params = [ParameterValue(string_value="my_savepoint")]  # omit for anonymous
    )
))
savepoint_uuid = resp.resourceUUID  # keep this to roll back or release later
session = resp.session

# Roll back to the savepoint (partial undo)
resp = stub.callResource(CallResourceRequest(
    session = session,
    resourceType = RES_SAVEPOINT,
    resourceUUID = savepoint_uuid,
    target = TargetCall(callType=CALL_ROLLBACK, resourceName="Savepoint")
))
session = resp.session

# Release the savepoint (no longer needed)
resp = stub.callResource(CallResourceRequest(
    session = session,
    resourceType = RES_SAVEPOINT,
    resourceUUID = savepoint_uuid,
    target = TargetCall(callType=CALL_RELEASE, resourceName="Savepoint")
))
session = resp.session
```

> **Reference implementation:**
> - `ojp-jdbc-driver` — [`Connection.setSavepoint()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Connection.java) / `setSavepoint(name)`: calls `callProxy` with `CALL_SET`, `"Savepoint"`, and the optional name; wraps the returned resource UUID in a
[`Savepoint`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Savepoint.java) object. -> - `Connection.rollback(Savepoint)`: calls `callProxy` with `CALL_ROLLBACK`, `"Savepoint"`, and the savepoint's resource UUID. -> - `Connection.releaseSavepoint(Savepoint)`: calls `callProxy` with `CALL_RELEASE`. - ---- - -### 7.8 XA / Distributed Transactions - -XA support maps the standard XA resource manager protocol to gRPC RPCs. XA connections are always pinned to a single server (see §2.3). - -**XA transaction lifecycle:** - -``` -xaStart(XaStartRequest) -- Begin branch; safe to retry on connection error -xaEnd(XaEndRequest) -- End branch; NEVER retry after this point -xaPrepare(XaPrepareRequest) -- Two-phase prepare; returns XA_OK or XA_RDONLY -xaCommit(XaCommitRequest) -- Commit (onePhase=true for one-phase optimisation) -xaRollback(XaRollbackRequest) -- Roll back the branch -xaRecover(XaRecoverRequest) -- List in-doubt XIDs (for recovery after crash) -xaForget(XaForgetRequest) -- Forget a heuristically completed branch -``` - -**Xid encoding (XidProto):** - -| Field | Type | Meaning | -|---|---|---| -| `formatId` | int32 | Transaction format ID | -| `globalTransactionId` | bytes | Global transaction ID (up to 64 bytes) | -| `branchQualifier` | bytes | Branch qualifier (up to 64 bytes) | - -**Retry policy:** `xaStart` only: retry on connection-level errors. All other XA operations: do not retry automatically. Surface failures to the caller's transaction manager. - -```python -xid = XidProto( - formatId = 1, - globalTransactionId = b"global-tx-001", - branchQualifier = b"branch-1" -) - -# 1. Start the XA branch (safe to retry on connection error) -resp = stub.xaStart(XaStartRequest(session=session, xid=xid, flags=0)) -session = resp.session # bind session.targetServer -> this server for all remaining calls - -# 2. 
Execute SQL within the branch (normal executeUpdate/executeQuery calls) -resp = stub.executeUpdate(StatementRequest(session=session, sql="UPDATE accounts ...")) -session = resp.session - -# 3. End the branch — do NOT retry past this point -resp = stub.xaEnd(XaEndRequest(session=session, xid=xid, flags=0)) -session = resp.session - -# 4. Prepare (two-phase commit, phase 1) -prep = stub.xaPrepare(XaPrepareRequest(session=session, xid=xid)) -# prep.result = XA_OK (proceed to commit) or XA_RDONLY (read-only; no commit needed) - -# 5a. Commit (two-phase) -stub.xaCommit(XaCommitRequest(session=session, xid=xid, onePhase=False)) - -# 5b. -- OR -- One-phase optimisation (skip xaPrepare) -stub.xaCommit(XaCommitRequest(session=session, xid=xid, onePhase=True)) - -# 5c. -- OR -- Rollback -stub.xaRollback(XaRollbackRequest(session=session, xid=xid)) - -# Recovery: list in-doubt XIDs after a crash -resp = stub.xaRecover(XaRecoverRequest(session=session, flag=TMSTARTRSCAN)) -for recovered_xid in resp.xids: - stub.xaCommit(...) # or xaRollback -- decision belongs to the transaction manager - -# Forget a heuristically completed branch -stub.xaForget(XaForgetRequest(session=session, xid=xid)) -``` - -> **Reference implementation:** -> - `ojp-jdbc-driver` — [`OjpXAResource`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/xa/OjpXAResource.java): implements `XAResource`; all 10 lifecycle methods; contains the `xaStart` retry loop and the `toXidProto` / `fromXidProto` conversion helpers. -> - `ojp-jdbc-driver` — [`OjpXAConnection`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/xa/OjpXAConnection.java): creates the XA-mode `StatementService` connection and vends `OjpXAResource`. -> - `ojp-jdbc-driver` — [`OjpXADataSource`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/xa/OjpXADataSource.java): entry point for XA; calls `MultinodeConnectionManager.connectXA()` to pin the session to a single server. 
- ---- - -### 7.9 callResource Protocol - -The `callResource` RPC is a generic mechanism for operations that do not fit a dedicated RPC — primarily `DatabaseMetaData` queries, `ResultSet` cursor/update operations, `Statement` cancellation, savepoint management, and resource lifecycle calls. - -**Request:** - -``` -CallResourceRequest { - session: SessionInfo - resourceType: ResourceType // what kind of resource to call - resourceUUID: string // the server-side handle for this resource - target: TargetCall // the specific operation to perform - properties: PropertyEntry[] -} -``` - -**TargetCall (supports chaining):** - -``` -TargetCall { - callType: CallType // one of the 47+ call type codes - resourceName: string // e.g., "Catalog", "TransactionIsolation", "Savepoint" - params: ParameterValue[] // input arguments - nextCall: TargetCall // optional chained call (recursive) -} -``` - -**ResourceType values:** - -| Value | Meaning | -|---|---| -| `RES_RESULT_SET` | An open result set | -| `RES_STATEMENT` | A plain statement | -| `RES_PREPARED_STATEMENT` | A prepared statement | -| `RES_CALLABLE_STATEMENT` | A callable statement | -| `RES_LOB` | A LOB object | -| `RES_CONNECTION` | The connection itself (for metadata, catalog, etc.) | -| `RES_SAVEPOINT` | A savepoint | - -**Response:** - -``` -CallResourceResponse { - session: SessionInfo - resourceUUID: string // UUID of a newly created resource, if any - values: ParameterValue[] // return values (may be empty) -} -``` - -Always update the local `SessionInfo` from `response.session`. 
- -**CallType reference (47 codes):** `CALL_SET`, `CALL_GET`, `CALL_IS`, `CALL_ALL`, `CALL_NULLS`, `CALL_USES`, `CALL_SUPPORTS`, `CALL_STORES`, `CALL_NULL`, `CALL_DOES`, `CALL_DATA`, `CALL_NEXT`, `CALL_CLOSE`, `CALL_WAS`, `CALL_CLEAR`, `CALL_FIND`, `CALL_BEFORE`, `CALL_AFTER`, `CALL_FIRST`, `CALL_LAST`, `CALL_ABSOLUTE`, `CALL_RELATIVE`, `CALL_PREVIOUS`, `CALL_ROW`, `CALL_UPDATE`, `CALL_INSERT`, `CALL_DELETE`, `CALL_REFRESH`, `CALL_CANCEL`, `CALL_MOVE`, `CALL_OWN`, `CALL_OTHERS`, `CALL_UPDATES`, `CALL_DELETES`, `CALL_INSERTS`, `CALL_LOCATORS`, `CALL_AUTO`, `CALL_GENERATED`, `CALL_RELEASE`, `CALL_NATIVE`, `CALL_PREPARE`, `CALL_ROLLBACK`, `CALL_ABORT`, `CALL_EXECUTE`, `CALL_ADD`, `CALL_ENQUOTE`, `CALL_REGISTER`, `CALL_LENGTH` - -```python -# --- Get the database catalog name (connection-level metadata) --- -resp = stub.callResource(CallResourceRequest( - session = session, - resourceType = RES_CONNECTION, - resourceUUID = "", # empty for connection-level calls - target = TargetCall(callType=CALL_GET, resourceName="Catalog") -)) -catalog_name = resp.values[0].string_value -session = resp.session # always update local session - -# --- Check a database capability --- -resp = stub.callResource(CallResourceRequest( - session = session, - resourceType = RES_CONNECTION, - target = TargetCall(callType=CALL_SUPPORTS, resourceName="Transactions") -)) -supports_transactions = resp.values[0].bool_value -session = resp.session - -# --- Cancel a running statement --- -resp = stub.callResource(CallResourceRequest( - session = session, - resourceType = RES_STATEMENT, - resourceUUID = statement_uuid, # UUID of the statement to cancel - target = TargetCall(callType=CALL_CANCEL) -)) -session = resp.session - -# --- Chained call: get Schema and Catalog in one round-trip --- -resp = stub.callResource(CallResourceRequest( - session = session, - resourceType = RES_CONNECTION, - target = TargetCall( - callType = CALL_GET, - resourceName = "Schema", - nextCall = TargetCall(callType=CALL_GET, 
resourceName="Catalog") - ) -)) -schema_name = resp.values[0].string_value -catalog_name = resp.values[1].string_value -session = resp.session -``` - -> **Reference implementation:** -> - `ojp-jdbc-driver` — [`StatementServiceGrpcClient.callResource(CallResourceRequest)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java): the single-node gRPC call. -> - `ojp-jdbc-driver` — [`DatabaseMetaData`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/DatabaseMetaData.java): every `DatabaseMetaData` method (>200 in total) is implemented by calling `callResource` with `RES_CONNECTION` and the appropriate `CallType`. -> - `ojp-jdbc-driver` — `Connection.callProxy(callType, resourceName, returnType, params)`: the private convenience wrapper used throughout `Connection` and `DatabaseMetaData`. - ---- - -### 7.10 Configuration System - -**Configuration sources (in priority order):** - -1. System / environment properties (highest priority) — e.g., `-Dojp.health.check.interval=10000`. -2. `ojp.properties` file — loaded from the classpath or a well-known filesystem path. -3. Built-in defaults (lowest priority). - -**Property namespacing:** Properties can be global or per-datasource. 
Per-datasource properties are prefixed with the datasource name: - -```properties -# Global -ojp.health.check.interval=5000 - -# Per-datasource (datasource name: "analytics") -analytics.ojp.health.check.interval=10000 -``` - -**Standard configuration properties:** - -| Property | Default | Meaning | -|---|---|---| -| `ojp.health.check.interval` | `5000` (ms) | Periodic health check interval | -| `ojp.health.check.threshold` | `5000` (ms) | Minimum wait before re-probing an unhealthy server | -| `ojp.health.check.timeout` | `5000` (ms) | Probe call timeout | -| `ojp.redistribution.enabled` | `true` | Enable/disable the health checker and redistribution | -| `ojp.redistribution.idleRebalanceFraction` | `1.0` | Fraction of idle connections to close per rebalance cycle | -| `ojp.redistribution.maxClosePerRecovery` | `100` | Max connections closed per recovery event | -| `ojp.loadaware.selection.enabled` | `true` | Use least-connections; `false` = round-robin | -| `ojp.multinode.retry.attempts` | `3` | Max failover retry attempts | -| `ojp.multinode.retry.delay` | `100` (ms) | Delay between retry attempts | -| `ojp.datasource.name` | `"default"` | Active datasource name (sent to the server) | -| `ojp.grpc.tls.enabled` | `false` | Enable TLS on gRPC channels | -| `ojp.grpc.tls.cert.path` | — | Path to client certificate for mTLS | - -**Duration format:** No suffix = milliseconds; `ms` = milliseconds; `s` = seconds; `m` = minutes. - -> **Reference implementation:** -> - `ojp-jdbc-driver` — [`DatasourcePropertiesLoader`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/DatasourcePropertiesLoader.java): `loadOjpPropertiesForDataSource(datasourceName)` merges file properties, system properties, and environment variables with per-datasource prefix resolution. 
> - `ojp-jdbc-driver` — [`HealthCheckConfig`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/HealthCheckConfig.java): the strongly-typed POJO that holds all health-check and redistribution settings.
> - `ojp-jdbc-driver` — [`MultinodeUrlParser.readIntProperty(props, key, default)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeUrlParser.java) / `readLongProperty(...)`: reads typed values from the merged `Properties` object.
> - `ojp-grpc-commons` — [`GrpcClientConfig.load()`](../../ojp-grpc-commons/src/main/java/org/openjproxy/config/GrpcClientConfig.java): loads the gRPC-specific settings (max inbound message size, TLS config) from `ojp.properties`.

---

### 7.11 Query Result Caching

Cache configuration flows strictly **from client to server**: the client reads local cache rules and sends them to the server as `ConnectionDetails.properties` entries during `connect()`. The server applies them transparently; the client does not implement any caching logic itself.

**Properties sent to the server:**

| Property key | Meaning |
|---|---|
| `ojp.cache.enabled` | `"true"` to enable caching |
| `ojp.cache.queries.<index>.pattern` | Regex pattern matching SQL queries to cache |
| `ojp.cache.queries.<index>.ttl` | TTL in seconds for cached results |
| `ojp.cache.queries.<index>.invalidateOn` | Comma-separated table names that invalidate this rule |
| `ojp.cache.queries.<index>.enabled` | `"true"` / `"false"` to toggle individual rules |

`<index>` is a 1-based integer index. Rules are processed in index order.
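Collecting these rules for the `connect()` call can be sketched as below. The helper name and the plain-dict stand-in for the loaded properties are assumptions, and the sketch assumes contiguous indices; a real client would copy the result into `ConnectionDetails.properties`.

```python
def collect_cache_properties(local_props: dict) -> dict:
    """Gather every cache rule to forward verbatim on connect().
    Rules use a 1-based index; this sketch stops scanning at the
    first index that has no pattern."""
    out = {}
    if local_props.get("ojp.cache.enabled") != "true":
        return out  # caching disabled: send nothing
    out["ojp.cache.enabled"] = "true"
    i = 1
    while f"ojp.cache.queries.{i}.pattern" in local_props:
        for suffix in ("pattern", "ttl", "invalidateOn", "enabled"):
            key = f"ojp.cache.queries.{i}.{suffix}"
            if key in local_props:
                out[key] = local_props[key]
        i += 1
    return out

local_props = {
    "ojp.cache.enabled": "true",
    "ojp.cache.queries.1.pattern": "SELECT .* FROM products.*",
    "ojp.cache.queries.1.ttl": "600",
}
# collect_cache_properties(local_props) returns the entries to place in
# ConnectionDetails.properties before calling connect().
```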
- -**Example configuration:** - -```properties -ojp.cache.enabled=true -ojp.cache.queries.1.pattern=SELECT .* FROM products.* -ojp.cache.queries.1.ttl=600 -ojp.cache.queries.1.invalidateOn=products,product_prices -ojp.cache.queries.2.pattern=SELECT .* FROM users.* -ojp.cache.queries.2.ttl=300 -ojp.cache.queries.2.invalidateOn=users -``` - -> **Reference implementation:** -> - `ojp-jdbc-driver` — [`CacheConfigurationBuilder.addCachePropertiesToMap(propertiesMap, datasourceName)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/CacheConfigurationBuilder.java): reads cache rules from the loaded `Properties` and appends them to the `ConnectionDetails.properties` map that is sent to the server on `connect()`. - ---- - -### 7.12 Security / Transport - -**Plaintext (default):** Create a plaintext gRPC channel targeting `dns:///host:port`. Suitable for internal networks or local development. - -**TLS:** When `ojp.grpc.tls.enabled = true`, create a TLS-secured channel. Use the platform's default trust store or a custom CA certificate. Support mutual TLS (mTLS) when `ojp.grpc.tls.cert.path` is set. Certificate paths and key material must be loaded from configurable filesystem paths. - -**Credential handling:** Passwords must never be logged or included in exception messages. Connection keys used for cache lookups may include the password as a cache key only — they must not be serialised or persisted. - -> **Reference implementation:** -> - `ojp-grpc-commons` — [`GrpcChannelFactory.createChannel(host, port)`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/GrpcChannelFactory.java): creates a plaintext `ManagedChannel` with configurable max inbound message size; `createSecureChannel(host, port, size, tlsConfig)` builds the TLS-secured variant. -> - `ojp-grpc-commons` — [`GrpcClientConfig`](../../ojp-grpc-commons/src/main/java/org/openjproxy/config/GrpcClientConfig.java): exposes `getTlsConfig()` and `getMaxInboundMessageSize()`. 
-> - `ojp-grpc-commons` — [`TlsConfig`](../../ojp-grpc-commons/src/main/java/org/openjproxy/config/TlsConfig.java): holds `enabled`, `certPath`, `keyPath`, `caPath`, and `clientAuth` flags. - ---- - -### 7.13 DataSource / Integration API - -Provide a higher-level `DataSource` (or equivalent) object that holds connection configuration (URL, user, password, properties) and exposes a `getConnection()` method that calls `Driver.connect()` internally. Integrate cleanly with the host language's database access conventions. - -For Java/Spring Boot, provide a `spring-boot-starter-ojp` auto-configuration module. Auto-configure an `OjpDataSource` bean when the driver is on the classpath. Disable the framework's own built-in connection pool (e.g., HikariCP in Spring Boot) when OJP is in use — double-pooling is the most common misconfiguration and causes incorrect behaviour. - -For other languages, document clearly in the library README that the application-side connection pool must be disabled when using OJP. - -> **Reference implementation:** -> - `ojp-jdbc-driver` — [`OjpDataSource`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/OjpDataSource.java): implements `javax.sql.DataSource`; `getConnection()` / `getConnection(user, password)` delegate to `DriverManager.getConnection(url, info)`. -> - `ojp-jdbc-driver` — [`OjpXADataSource`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/xa/OjpXADataSource.java): implements `javax.sql.XADataSource`; `getXAConnection()` creates an `OjpXAConnection` (and thus an `OjpXAResource`) for JTA integration. -> - `spring-boot-starter-ojp` module: provides the Spring Boot auto-configuration class and the `OjpSystemPropertiesBridge` bean; sets `spring.datasource.type=OjpDataSource` and excludes `DataSourceAutoConfiguration` to prevent double-pooling. - ---- - -## 8. Testing Coverage - -A conformant client implementation must ship a test suite that exercises all the aspects above. 
Tests that require a live OJP server (and optionally a real database) should be **gated behind feature flags** so the suite can run incrementally in CI. - -**Test infrastructure requirements:** -- A running OJP server (see `ojp-server` module and `download-drivers.sh`). -- At minimum, an embedded/in-process database (e.g., H2) for fast baseline tests. -- Optional: containerised databases (PostgreSQL, MySQL, MariaDB, Oracle, SQL Server, DB2, CockroachDB) gated by per-database flags. - -**Test categories and required scenarios:** - -#### Basic CRUD -- SELECT, INSERT, UPDATE, DELETE via plain Statement and PreparedStatement. -- Verify affected row counts, returned ResultSet contents. -- Verify empty result sets are handled correctly. - -#### Multiple data types -- Round-trip every `ParameterTypeProto` value through INSERT + SELECT. -- Cover: all integer widths, float, double, BigDecimal, string, boolean, byte array, date, time, timestamp (with and without timezone), LocalDate, LocalTime, LocalDateTime, OffsetDateTime, OffsetTime, Instant, URL, UUID, RowId, BLOB, CLOB, array, NULLs for each type. - -#### Statement variants -- Plain `Statement`: `executeQuery`, `executeUpdate`, `execute`, `executeBatch`, `getResultSet`, `getUpdateCount`, `getGeneratedKeys`, `cancel`, `close`. -- `PreparedStatement`: all `setXxx` methods, `executeBatch`, multiple executions with the same prepared statement, `getParameterMetaData`. -- `CallableStatement`: IN, OUT, INOUT parameters; `registerOutParameter`; retrieval of OUT values after execution; named parameters where supported. - -#### ResultSet navigation -- Forward-only cursors: `next()`, `wasNull()`, `close()`. -- Scrollable cursors: `first()`, `last()`, `beforeFirst()`, `afterLast()`, `absolute(n)`, `relative(n)`, `previous()`. -- Multi-block pagination: queries large enough to exceed one fetch page; verify all rows are retrieved. 
- -#### ResultSet metadata -- `getColumnCount()`, `getColumnName()`, `getColumnType()`, `getColumnTypeName()`, `getPrecision()`, `getScale()`, `isNullable()`, `isAutoIncrement()`. - -#### DatabaseMetaData -- `getTables()`, `getColumns()`, `getPrimaryKeys()`, `getIndexInfo()`, `getProcedures()`, `getTypeInfo()`, `supportsXxx()` methods. -- Verify results match the actual database schema. - -#### Transactions -- Commit: insert rows in a transaction, commit, verify rows persist. -- Rollback: insert rows in a transaction, rollback, verify rows are absent. -- `autoCommit = false` then `setAutoCommit(true)` — verify implicit commit. -- Transaction isolation level: set, verify via `getTransactionIsolation()`, reset after connection return. - -#### Savepoints -- Create a named and an anonymous savepoint. -- Rollback to each; verify partial rollback semantics. -- Release a savepoint. - -#### XA transactions -- Full lifecycle: `xaStart`, `xaEnd`, `xaPrepare`, `xaCommit`. -- Rollback path: `xaStart`, `xaEnd`, `xaPrepare`, `xaRollback`. -- One-phase commit (`onePhase=true`). -- `xaRecover`: verify in-doubt XIDs are returned. -- `xaForget`: verify heuristically completed branch is removed. -- Transaction isolation reset after XA session. - -#### LOBs -- BLOB: write a small blob (< 1 chunk), a large blob (multiple chunks), read back both; verify byte-for-byte equality. -- CLOB: same as BLOB but with character content. -- Binary stream, ASCII stream, Unicode stream: write via stream API, read back. -- Hydratable LOB: verify that a LOB reference can be passed as a parameter to a second statement. -- NULL LOB: verify that `setBlob(null)` / `setClob(null)` sends a SQL NULL. - -#### Session affinity -- Verify that a connection with an open transaction always routes to the same server. -- Verify that a connection holding an open LOB always routes to the same server. -- Verify that when the bound server is down, an appropriate error is raised rather than silent rerouting. 
- -#### Multi-block / large result sets -- Execute a query that returns more rows than one page. Verify all rows arrive and are in the correct order. - -#### Multinode load balancing -- With two or more server endpoints, open `N` connections and verify they are distributed across servers (round-robin and least-connections modes separately). - -#### Multinode failover -- Terminate one server mid-operation; verify the operation is retried on a surviving server (for stateless operations). -- Verify a server is marked unhealthy after failure. -- Verify subsequent connections avoid the unhealthy server. - -#### Multinode recovery and redistribution -- Bring a server back; verify it is marked healthy after the health check interval. -- Verify new connections start routing to the recovered server. -- Verify connection redistribution closes a fraction of idle connections on over-loaded servers. - -#### XA multinode -- Verify that each XA session binds to exactly one server. -- Verify that failover of an XA session to another server raises an error (not a silent reroute). -- Verify XA redistribution after server recovery. - -#### connHash caching / connect-RPC skip -- Open two connections with the same credentials; verify only one `connect()` gRPC call is made. -- Simulate a `NOT_FOUND` response; verify the driver invalidates the cache and re-issues `connect()`. - -#### Session stickiness error path -- Establish a session on server A. Mark server A unhealthy. Attempt a SQL operation. Verify an error is raised rather than the request being silently routed to server B. - -#### Cluster health propagation -- Stop one server; verify the cluster health string sent in subsequent requests marks it `DOWN`. -- Recover the server; verify the health string marks it `UP`. - -#### Concurrency / pool exhaustion -- Send more concurrent requests than the server-side pool size; verify pool-exhaustion errors are surfaced cleanly and do not mark servers unhealthy. 
- -#### Slow query segregation -- Send queries that take longer than the slow-query threshold; verify they use the reserved slow-query slots and do not starve fast queries. - -#### Multi-datasource -- Configure two endpoints with different datasource names; verify each endpoint uses its own datasource configuration. - -#### Configuration loading -- Verify properties are loaded from `ojp.properties`. -- Verify system properties override file properties. -- Verify per-datasource properties override global properties. - -#### Performance / mini stress -- Open and close 100-1000 connections in parallel; verify no connection leaks, no deadlocks, and no degrading error rate. - -#### Database-specific test suites - -Each database must have a dedicated test class gated by its own flag: - -| Database | Feature flag | -|---|---| -| H2 | `enableH2Tests` | -| PostgreSQL | `enablePostgresTests` | -| MySQL | `enableMySQLTests` | -| MariaDB | `enableMariaDBTests` | -| Oracle | `enableOracleTests` | -| SQL Server | `enableSqlServerTests` | -| DB2 | `enableDb2Tests` | -| CockroachDB | `enableCockroachDBTests` | - -H2 tests (in-process, no external dependency) must always be runnable in CI without any extra setup and should act as the first gate before any database-specific jobs run. 
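A minimal sketch of flag-gated suite selection, assuming environment variables carry the feature flags from the table above (the gating mechanism in a real client test harness may differ; only the flag names come from this specification):

```python
import os

def suite_enabled(flag):
    """True when the given database feature flag is switched on."""
    return os.environ.get(flag, "false").lower() == "true"

def run_database_suites(run_suite):
    """Always run H2 first as the CI gate, then any enabled DB suites."""
    results = {"H2": run_suite("H2")}  # no flag: always runnable
    for db, flag in [("PostgreSQL", "enablePostgresTests"),
                     ("MySQL", "enableMySQLTests"),
                     ("Oracle", "enableOracleTests")]:
        if suite_enabled(flag):
            results[db] = run_suite(db)
    return results
```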
- -> **Reference implementation — test classes by area:** -> -> | Test area | Java test class(es) | -> |---|---| -> | Basic CRUD | [`BasicCrudIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/BasicCrudIntegrationTest.java) | -> | Multiple data types | `H2MultipleTypesIntegrationTest`, `PostgresMultipleTypesIntegrationTest`, `MySQLMultipleTypesIntegrationTest`, `OracleMultipleTypesIntegrationTest`, `SQLServerMultipleTypesIntegrationTest`, `Db2MultipleTypesIntegrationTest`, `CockroachDBMultipleTypesIntegrationTest`, `MariaDBMultipleTypesIntegrationTest` | -> | Statement variants | `H2StatementExtensiveTests`, `H2PreparedStatementExtensiveTests` (and per-DB equivalents) | -> | ResultSet navigation / metadata | `H2ResultSetTest` (and per-DB), `H2ResultSetMetaDataExtensiveTests`, `H2ReadMultipleBlocksOfDataIntegrationTest` | -> | DatabaseMetaData | `H2DatabaseMetaDataExtensiveTests`, `H2ConnectionExtensiveTests` (and per-DB) | -> | Transactions | `H2ConnectionExtensiveTests`, [`TransactionIsolationResetTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/TransactionIsolationResetTest.java) | -> | Savepoints | `H2SavepointTests` (and per-DB `*SavepointTests`) | -> | XA transactions | [`PostgresXAIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/PostgresXAIntegrationTest.java), `MySQLXAIntegrationTest`, `MariaDBXAIntegrationTest`, `OracleXAIntegrationTest`, `SqlServerXAIntegrationTest`, `Db2XAIntegrationTest`, [`XASessionInvalidationTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/XASessionInvalidationTest.java) | -> | LOBs | [`BlobIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/BlobIntegrationTest.java), [`BinaryStreamIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/BinaryStreamIntegrationTest.java), [`HydratedLobValidationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/HydratedLobValidationTest.java) (and per-DB `*Blob*` / `*BinaryStream*`) | -> | 
Session affinity | [`H2SessionAffinityIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/H2SessionAffinityIntegrationTest.java) (and per-DB `*SessionAffinity*`) | -> | Multi-block result sets | `H2ReadMultipleBlocksOfDataIntegrationTest` (and per-DB) | -> | Multinode load balancing | [`LoadAwareServerSelectionTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/LoadAwareServerSelectionTest.java), [`MultinodeIntegrationTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeIntegrationTest.java) | -> | Multinode failover | [`MultinodeFailoverTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeFailoverTest.java), [`MultinodeConnectionManagerErrorHandlingTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeConnectionManagerErrorHandlingTest.java) | -> | Multinode recovery / redistribution | [`MultinodeRecoveryTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeRecoveryTest.java) | -> | XA multinode | [`MultinodeXAIntegrationTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeXAIntegrationTest.java) | -> | connHash caching | [`ConnectRpcSkipOptimisationTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/ConnectRpcSkipOptimisationTest.java), [`UnifiedConnectionModeTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/UnifiedConnectionModeTest.java) | -> | Session stickiness error path | [`MultinodeTargetServerBindingTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeTargetServerBindingTest.java), `MultinodeStatementServiceTest` | -> | Cluster health propagation | [`MultinodeConnectionManagerClusterHealthTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeConnectionManagerClusterHealthTest.java) | -> | Concurrency / pool exhaustion | 
[`ConcurrencyTimeoutTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/ConcurrencyTimeoutTest.java) | -> | Multi-datasource | [`MultiDataSourceIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/MultiDataSourceIntegrationTest.java), [`MultiDataSourceConfigurationTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/jdbc/MultiDataSourceConfigurationTest.java) | -> | Configuration loading | [`DatasourcePropertiesLoaderSystemPropertyTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/jdbc/DatasourcePropertiesLoaderSystemPropertyTest.java), [`DatasourcePropertiesLoaderEnvironmentTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/jdbc/DatasourcePropertiesLoaderEnvironmentTest.java) | -> | URL parsing | [`MultinodeUrlParserTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeUrlParserTest.java), [`UrlParserTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/jdbc/UrlParserTest.java), [`DriverMultinodeUrlTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/jdbc/DriverMultinodeUrlTest.java) | -> | DataSource API | [`OjpDataSourceTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/jdbc/OjpDataSourceTest.java), [`OjpXADataSourceTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/jdbc/xa/OjpXADataSourceTest.java) | -> | Health check config | [`HealthCheckConfigTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/HealthCheckConfigTest.java), [`MultinodeRetryConfigTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeRetryConfigTest.java) | -> | Session tracker unit | [`SessionTrackerTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/SessionTrackerTest.java) | - ---- - -## Appendix A — Proto file locations - -| File | Location | -|---|---| -| Main protocol | `ojp-grpc-commons/src/main/proto/StatementService.proto` | -| Generic value containers | `ojp-grpc-commons/src/main/proto/containers.proto` | -| Echo / heartbeat | 
`ojp-grpc-commons/src/main/proto/echo.proto` | - -## Appendix B — Reference implementation classes - -| Aspect | Java class | -|---|---| -| gRPC stubs | `StatementServiceGrpcClient` | -| Multinode routing | `MultinodeStatementService`, `MultinodeConnectionManager` | -| URL parsing | `MultinodeUrlParser`, `UrlParser` | -| Session tracking | `SessionTracker` | -| Health checking | `HealthCheckValidator`, `HealthCheckConfig` | -| Redistribution | `ConnectionRedistributor`, `XAConnectionRedistributor` | -| Error mapping | `GrpcExceptionHandler` | -| Connection lifecycle | `Connection` | -| Statement execution | `Statement`, `PreparedStatement`, `CallableStatement` | -| Result set | `ResultSet`, `RemoteProxyResultSet` | -| LOB handling | `Blob`, `Clob`, `NClob`, `Lob`, `LobServiceImpl` | -| XA | `OjpXAResource`, `OjpXAConnection`, `OjpXADataSource` | -| Driver entry point | `Driver` | -| DataSource wrapper | `OjpDataSource` | -""" - -with open(os.path.join(SPEC_DIR, "CLIENT_SPEC.md"), "w") as f: - f.write(CLIENT_SPEC) - -print(f"CLIENT_SPEC.md: {len(CLIENT_SPEC)} chars, {len(CLIENT_SPEC.splitlines())} lines") -print("Done writing CLIENT_SPEC.md") From 5cebd4181102a93a2d891dedd4565a59bb0d4de3 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Mon, 20 Apr 2026 09:54:05 +0000 Subject: [PATCH 08/12] docs: restructure CLIENT_SPEC.md into 8-section hierarchy with Core Concepts, E2E Example, and pseudo-code Agent-Logs-Url: https://github.com/Open-J-Proxy/ojp/sessions/99757170-ee53-49e9-9a5e-32a90c8c917b Co-authored-by: rrobetti <7221783+rrobetti@users.noreply.github.com> --- .../multi-language-client-spec/CLIENT_SPEC.md | 1626 +++++++---------- 1 file changed, 691 insertions(+), 935 deletions(-) diff --git a/documents/multi-language-client-spec/CLIENT_SPEC.md b/documents/multi-language-client-spec/CLIENT_SPEC.md index 5b239f13a..8fff240fb 100644 --- a/documents/multi-language-client-spec/CLIENT_SPEC.md +++ 
b/documents/multi-language-client-spec/CLIENT_SPEC.md @@ -1,46 +1,110 @@ # OJP Multi-Language Client Specification -> **Status:** Draft — April 2026 -> **Scope:** This document defines every aspect that a new OJP client library (in any language other than Java) must implement in order to be fully compatible with an OJP server. It is written language-agnostically; where Java-specific concepts appear they are labelled as the reference implementation only. -> **Reference implementation:** `ojp-jdbc-driver` module. +> **Status:** Draft — April 2026 +> **Scope:** Defines every aspect that a new OJP client library (in any language) must implement to be fully compatible with an OJP server. Written language-agnostically; Java-specific concepts are labelled as reference implementation only. +> **Reference implementation:** `ojp-jdbc-driver` module. > **Protocol source of truth:** `ojp-grpc-commons/src/main/proto/StatementService.proto` and `echo.proto`. +> **Machine-oriented companion:** [`CLIENT_SPEC_AI.md`](CLIENT_SPEC_AI.md) --- ## Table of Contents -1. [gRPC Interface Implementation](#1-grpc-interface-implementation) -2. [Connection Configuration and Building ConnectionDetails](#2-connection-configuration-and-building-connectiondetails) -3. [Client Identity](#3-client-identity) -4. [Connection Establishment and connHash Caching](#4-connection-establishment-and-connhash-caching) -5. [Session Management](#5-session-management) -6. [Session Stickiness](#6-session-stickiness) -7. [Load Balancing](#7-load-balancing) -8. [Failover](#8-failover) -9. [Health Checking](#9-health-checking) -10. [Connection Redistribution on Recovery](#10-connection-redistribution-on-recovery) -11. [Cluster Health Propagation](#11-cluster-health-propagation) -12. [Transaction Management (non-XA)](#12-transaction-management-non-xa) -13. [Savepoints](#13-savepoints) -14. [XA / Distributed Transactions](#14-xa--distributed-transactions) -15. [Statement Execution](#15-statement-execution) -16. 
[Parameter Type Mapping](#16-parameter-type-mapping) -17. [Temporal Type Handling](#17-temporal-type-handling) -18. [Result Set and Streaming](#18-result-set-and-streaming) -19. [LOB (Large Object) Handling](#19-lob-large-object-handling) -20. [CallResource Protocol](#20-callresource-protocol) -21. [Error and Exception Mapping](#21-error-and-exception-mapping) -22. [Configuration System](#22-configuration-system) -23. [Query Result Caching](#23-query-result-caching) -24. [Security / Transport](#24-security--transport) -25. [DataSource / Integration API](#25-datasource--integration-api) -26. [Testing Coverage](#26-testing-coverage) +1. [Overview](#1-overview) +2. [Core Concepts](#2-core-concepts) + - 2.1 [Virtual Connections](#21-virtual-connections) + - 2.2 [Deferred Session Assignment](#22-deferred-session-assignment) + - 2.3 [Session Affinity](#23-session-affinity) + - 2.4 [Client vs. Server Responsibilities](#24-client-vs-server-responsibilities) +3. [Architecture and Data Flow](#3-architecture-and-data-flow) + - 3.1 [gRPC Interface and Channel Setup](#31-grpc-interface-and-channel-setup) + - 3.2 [Connection Configuration (ConnectionDetails)](#32-connection-configuration-connectiondetails) + - 3.3 [Client Identity (clientUUID)](#33-client-identity-clientuuid) + - 3.4 [Load Balancing](#34-load-balancing) + - 3.5 [Cluster Health Propagation](#35-cluster-health-propagation) +4. [Client Responsibilities](#4-client-responsibilities) + - 4.1 [Connection Establishment and connHash Caching](#41-connection-establishment-and-connhash-caching) + - 4.2 [Session Lifecycle](#42-session-lifecycle) + - 4.3 [Failover](#43-failover) + - 4.4 [Health Checking](#44-health-checking) + - 4.5 [Connection Redistribution on Recovery](#45-connection-redistribution-on-recovery) +5. [Minimal End-to-End Example](#5-minimal-end-to-end-example) +6. [Error Handling](#6-error-handling) + - 6.1 [Error Classification](#61-error-classification) + - 6.2 [SQL Errors vs. 
Transport Errors](#62-sql-errors-vs-transport-errors) +7. [Implementation Guidance](#7-implementation-guidance) + - 7.1 [Statement Execution](#71-statement-execution) + - 7.2 [Parameter Type Mapping](#72-parameter-type-mapping) + - 7.3 [Temporal Type Handling](#73-temporal-type-handling) + - 7.4 [Result Set Streaming](#74-result-set-streaming) + - 7.5 [LOB Handling](#75-lob-large-object-handling) + - 7.6 [Transaction Management (non-XA)](#76-transaction-management-non-xa) + - 7.7 [Savepoints](#77-savepoints) + - 7.8 [XA / Distributed Transactions](#78-xa--distributed-transactions) + - 7.9 [callResource Protocol](#79-callresource-protocol) + - 7.10 [Configuration System](#710-configuration-system) + - 7.11 [Query Result Caching](#711-query-result-caching) + - 7.12 [Security / Transport](#712-security--transport) + - 7.13 [DataSource / Integration API](#713-datasource--integration-api) +8. [Testing Coverage](#8-testing-coverage) --- -## 1. gRPC Interface Implementation +## 1. Overview -### What to implement +OJP (Open J Proxy) is a JDBC Type 3 proxy. Its central idea is that real database connections are owned exclusively by the OJP server, which manages them in HikariCP connection pools. Client applications communicate with the server via gRPC rather than opening direct database connections. + +``` +[Application] ──native API──> [OJP Client Library] ──gRPC/HTTP2──> [OJP Server] ──JDBC──> [Database] +``` + +This architecture lets many application instances scale independently without overwhelming the database, because the proxy enforces a global connection limit. + +A non-Java OJP client replaces the `ojp-jdbc-driver` module. It must implement all 21 `StatementService` RPCs plus `EchoService.Echo`, handle the `SessionInfo` propagation contract on every call, and manage endpoint health, failover, and session stickiness on the client side. The server handles everything else: real connection management, transaction state, LOB storage, cursor state, and query caching. 
+ +> **Important operational rule:** Application-side connection pools **must be disabled** when using OJP. Double-pooling causes incorrect behavior and resource waste. This is the single most common misconfiguration. + +--- + +## 2. Core Concepts + +Before diving into implementation details, understand these four foundational ideas. Everything else in this specification follows from them. + +### 2.1 Virtual Connections + +An OJP "connection" is not a real database connection. The real JDBC connections are held exclusively in the server's HikariCP pool. What the client holds is a `SessionInfo` — a lightweight proto message containing a `connHash` (a pool identifier), the `clientUUID`, and (once assigned) a `sessionUUID`. + +Opening a connection is cheap. For non-XA connections after the first one, the client can satisfy the `connect()` call entirely from a local cache: it looks up the `connHash` for the given database credentials and builds the `SessionInfo` locally without making any gRPC call. This means connection acquisition for cached credentials costs only a hash-map lookup. + +Multiple client connections sharing the same database credentials share the same server-side pool through the same `connHash`. The server distributes real JDBC connections across all these logical client connections. + +Because the server owns the real connections, application-side connection pools are redundant and harmful — they create a second pool that fights with the server's pool for real database connections. + +### 2.2 Deferred Session Assignment + +The `sessionUUID` field in `SessionInfo` is not assigned at connection time. It is assigned by the server on the first operation that requires a persistent server-side session — for example, `startTransaction()`, creating a LOB, or opening a scrollable cursor. Until the server assigns a `sessionUUID`, requests are effectively stateless: any server can handle them using any real connection from the appropriate pool. 
+ +This means that for simple read-only queries (no transactions, no LOBs), the client never receives a `sessionUUID` at all, and all requests can be freely routed to any healthy server. + +### 2.3 Session Affinity + +Once a `sessionUUID` is assigned, **every subsequent request for that session must go to the same server**. The server encodes the binding in `SessionInfo.targetServer` (`host:port`). The client must record this mapping and enforce it on every outgoing request. + +Session affinity covers all of: open transactions, open LOB handles, open server-side cursors, and XA transaction branches. Rerouting any of these to a different server is a protocol error — the session state exists only on the original server and cannot be migrated. + +If the bound server becomes unhealthy while a sticky session is open, the client must raise an error to the caller immediately. It must not silently reroute the request to another server. + +### 2.4 Client vs. Server Responsibilities + +**The server owns:** real JDBC connections and HikariCP pool management; transaction state; LOB storage; server-side cursor state; query result caching; slow-query slot management; pool resizing in response to cluster health changes. + +**The client owns:** `SessionInfo` propagation (attach current `SessionInfo` to every request; replace with response); `connHash` caching; endpoint health tracking; load balancing; failover; cluster health string building and pushing to surviving servers; session stickiness enforcement (`sessionUUID → targetServer` binding); background health-check task; connection redistribution after server recovery. + +--- + +## 3. Architecture and Data Flow + +### 3.1 gRPC Interface and Channel Setup The client must implement stubs for every RPC in `StatementService` and `EchoService`. 
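The session-affinity rule from §2.3 can be sketched as a routing check applied before every RPC. The class and field names below are illustrative, not part of the protocol; the point is the fail-fast branch when the bound server is down:

```python
class StickySessionRouter:
    """Once a sessionUUID is bound to targetServer, every request must
    go there or fail fast; never silently reroute a sticky session."""
    def __init__(self, endpoints):
        self.endpoints = endpoints  # {"host:port": is_healthy}

    def route(self, session_info, select_any_healthy):
        target = session_info.get("targetServer")
        if session_info.get("sessionUUID") and target:
            if not self.endpoints.get(target, False):
                raise ConnectionError(
                    f"session {session_info['sessionUUID']} is bound to "
                    f"{target}, which is unhealthy")
            return target
        return select_any_healthy()  # stateless request: free routing
```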
@@ -76,13 +140,13 @@ The client must implement stubs for every RPC in `StatementService` and `EchoSer |---|---|---| | `Echo` | unary | Lightweight heartbeat / connectivity check | -### gRPC channel lifecycle +**gRPC channel lifecycle:** - One `ManagedChannel` (or equivalent) per server endpoint. Channels are long-lived and shared across all logical connections to that endpoint. -- Channels are created lazily on first connection to an endpoint, or eagerly during initialisation when endpoints are known upfront. -- Use DNS-prefixed targets (`dns:///host:port`) where the gRPC runtime supports it, to allow future SRV-based discovery. -- Blocking stubs are used for synchronous operations; async stubs are required for client-streaming (`createLob`) and server-streaming (`executeQuery`, `readLob`) RPCs. -- Channel shutdown must be graceful (allow in-flight calls to complete) and must be triggered on client shutdown. +- Channels are created lazily on first connection, or eagerly during initialisation when endpoints are known upfront. +- Use DNS-prefixed targets (`dns:///host:port`) where the gRPC runtime supports it. +- Blocking stubs for synchronous operations; async stubs required for client-streaming (`createLob`) and server-streaming (`executeQuery`, `readLob`) RPCs. +- Channel shutdown must be graceful (allow in-flight calls to complete) and triggered on client shutdown. 
### Pseudo-code @@ -90,168 +154,207 @@ The client must implement stubs for every RPC in `StatementService` and `EchoSer # Create one long-lived channel per OJP server endpoint channel = grpc.create_channel("localhost:10591", credentials=grpc.local_channel_credentials()) stub = StatementServiceStub(channel) # used for all SQL operations -echo = EchoServiceStub(channel) # used for health heartbeats +echo = EchoServiceStub(channel) # used for health heartbeats # On process shutdown — drain in-flight calls before closing channel.shutdown(grace_period_seconds=5) ``` > **Reference implementation:** -> - `ojp-jdbc-driver` — [`StatementService`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementService.java): the unified interface declaring all RPC methods (`connect`, `executeUpdate`, `executeQuery`, `fetchNextRows`, `createLob`, `readLob`, `terminateSession`, `startTransaction`, `commitTransaction`, `rollbackTransaction`, `callResource`, all XA operations). -> - `ojp-jdbc-driver` — [`StatementServiceGrpcClient`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java): the single-node gRPC implementation of `StatementService`; contains the concrete gRPC stub calls and the `grpcChannelOpenAndStubsInitialized()` channel lifecycle method. -> - `ojp-jdbc-driver` — [`MultinodeStatementService`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeStatementService.java): the multinode façade that wraps `StatementServiceGrpcClient` per endpoint with routing, failover, and stickiness. -> - `ojp-grpc-commons` — [`GrpcChannelFactory`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/GrpcChannelFactory.java): `createChannel(host, port)` / `createChannel(target)` — builds `ManagedChannel` instances with plaintext or TLS; handles the `dns:///` prefix and max inbound message size. 
+> - `ojp-jdbc-driver` — [`StatementService`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementService.java): the unified interface declaring all RPC methods. +> - `ojp-jdbc-driver` — [`StatementServiceGrpcClient`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java): the single-node gRPC implementation; contains the concrete gRPC stub calls and the `grpcChannelOpenAndStubsInitialized()` channel lifecycle method. +> - `ojp-jdbc-driver` — [`MultinodeStatementService`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeStatementService.java): the multinode facade that wraps `StatementServiceGrpcClient` per endpoint with routing, failover, and stickiness. +> - `ojp-grpc-commons` — [`GrpcChannelFactory`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/GrpcChannelFactory.java): `createChannel(host, port)` — builds `ManagedChannel` instances with plaintext or TLS; handles the `dns:///` prefix and max inbound message size. --- -## 2. Connection Configuration and Building ConnectionDetails - -### What the client collects from the user +### 3.2 Connection Configuration (ConnectionDetails) A non-Java OJP client does not use a JDBC URL. Instead, it collects the following configuration items directly from the user or from a configuration file: | Item | Required | Description | |---|---|---| | OJP server endpoints | Yes | One or more `host:port` pairs for the OJP server(s). In multinode mode this is a list. | -| Datasource name | No | A logical name for this datasource, default `"default"`. Used to keep separate connection pools per named datasource on the same server. | -| Database URL | Yes | The connection URL for the **real database** that the OJP server will connect to (e.g., `jdbc:postgresql://db:5432/mydb`). This is sent verbatim to the server. | +| Datasource name | No | A logical name for this datasource, default `"default"`. 
| +| Database URL | Yes | The connection URL for the **real database** (e.g., `jdbc:postgresql://db:5432/mydb`). Sent verbatim to the server. | | User | Yes | Database username. | | Password | Yes | Database password. | -| Properties | No | Additional key-value configuration pairs (pool sizing, cache rules, etc. — see §22, §23). | - -### Building the `ConnectionDetails` message +| Properties | No | Additional key-value configuration pairs (pool sizing, cache rules, etc. — see §7.10, §7.11). | Map the collected configuration to the `ConnectionDetails` proto fields as follows: | Proto field | Type | Value | |---|---|---| -| `url` | `string` | The **actual database URL** (e.g., `jdbc:postgresql://db:5432/mydb`). The server uses this to create the real database connection pool. | +| `url` | `string` | The **actual database URL**. The server uses this to create the real database connection pool. | | `user` | `string` | Database username. | | `password` | `string` | Database password. | -| `clientUUID` | `string` | Stable process UUID (see §3). | +| `clientUUID` | `string` | Stable process UUID (see §3.3). | | `properties` | `repeated PropertyEntry` | Configuration key-value pairs; include `ojp.datasource.name = ` when using a named datasource. | -| `serverEndpoints` | `repeated string` | All OJP server addresses as `"host:port"` strings (the full cluster list, not just the chosen endpoint). | -| `clusterHealth` | `string` | Current cluster health string (see §11); empty string on the very first connect. | +| `serverEndpoints` | `repeated string` | All OJP server addresses as `"host:port"` strings (the full cluster list). | +| `clusterHealth` | `string` | Current cluster health string (see §3.5); empty string on the very first connect. | | `isXA` | `bool` | `true` for XA connections, `false` otherwise. | -> **Important:** the `url` field must be consistent across all client processes that connect to the same logical datasource. 
The server computes `connHash` as SHA-256(`url + user + password + datasource_name`). If different clients send different `url` strings for the same database, the server creates separate pools. +> **Important:** the `url` field must be consistent across all client processes that connect to the same logical datasource. The server computes `connHash` as SHA-256(`url + user + password + datasource_name`). Inconsistent `url` strings cause separate pools to be created. -### `connHash` cache key (client side) +**`connHash` cache key (client side):** `url + "|" + user + "|" + password + "|" + datasource_name` -The client caches the `connHash` returned by the server after the first `connect()` RPC. The local lookup key for this cache is: +> **Reference implementation:** +> - `ojp-grpc-commons` — [`ConnectionDetails` proto](../../ojp-grpc-commons/src/main/proto/StatementService.proto): field definitions. +> - `ojp-server` — [`ConnectionHashGenerator.hashConnectionDetails()`](../../ojp-server/src/main/java/org/openjproxy/grpc/server/utils/ConnectionHashGenerator.java): SHA-256 of `url + user + password + datasource_name_from_properties`. +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.computeConnectionKey()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): client-side cache key computation. +> - `ojp-jdbc-driver` — [`MultinodeUrlParser`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeUrlParser.java): Java reference for JDBC URL parsing (Java-specific; not needed in non-Java clients). -``` -url + "|" + user + "|" + password + "|" + datasource_name -``` +--- -Use the same `url` string that was placed in `ConnectionDetails.url` so the cache key matches the server's `connHash` computation. +### 3.3 Client Identity (clientUUID) + +Generate one random UUID (version 4) when the client library is first loaded or when the process starts. This UUID must remain stable for the entire lifetime of the process. 
Attach `clientUUID` to every `ConnectionDetails` message sent to the server. Do not persist `clientUUID` across process restarts. > **Reference implementation:** -> - `ojp-grpc-commons` — [`ConnectionDetails` proto](../../ojp-grpc-commons/src/main/proto/StatementService.proto): field definitions for `url`, `user`, `password`, `clientUUID`, `properties`, `serverEndpoints`, `clusterHealth`, `isXA`. -> - `ojp-server` — [`ConnectionHashGenerator.hashConnectionDetails()`](../../ojp-server/src/main/java/org/openjproxy/grpc/server/utils/ConnectionHashGenerator.java): SHA-256 of `url + user + password + datasource_name_from_properties` — the server-side connHash algorithm. -> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.computeConnectionKey()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): client-side cache key = `url + "|" + user + "|" + password + "|" + datasource_name`. -> - `ojp-jdbc-driver` — [`MultinodeUrlParser`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeUrlParser.java): Java reference for how the JDBC URL is parsed to extract server endpoints, datasource names, and the actual DB URL before building `ConnectionDetails` (Java-specific; not needed in non-Java clients). +> - `ojp-jdbc-driver` — [`ClientUUID`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/ClientUUID.java): `getUUID()` returns the static, process-scoped UUID generated once at class-loading time via `UUID.randomUUID()`. --- -## 3. Client Identity +### 3.4 Load Balancing + +Two strategies must be supported, selectable via configuration (see §7.10, property `ojp.loadaware.selection.enabled`): -### clientUUID +**Least-connections (default, `true`):** Select the healthy server with the lowest number of active sessions. Track session counts in a thread-safe counter per server endpoint. Use round-robin as a tie-breaker when all servers have equal counts. 
-- Generate one random UUID (version 4) when the client library is first loaded or when the process starts. This UUID must remain stable for the entire lifetime of the process. -- Attach `clientUUID` to every `ConnectionDetails` message sent to the server. -- The server uses `clientUUID` to group all sessions from the same client process. -- Do not persist `clientUUID` across process restarts; each new process should generate a fresh UUID. +**Round-robin (`false`):** Cycle through healthy servers in order using an atomic counter modulo the number of healthy servers. + +Server selection runs on every new connection attempt (non-XA, first `connect()`) and on every XA `connect()`. Once a session is assigned a server (via session stickiness), selection does not run again for that session. Only servers whose `isHealthy() == true` are eligible. If no healthy servers exist, raise a connection error. > **Reference implementation:** -> - `ojp-jdbc-driver` — [`ClientUUID`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/ClientUUID.java): `getUUID()` returns the static, process-scoped UUID that is generated once at class-loading time via `UUID.randomUUID()`. +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.selectHealthyServer()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): dispatches to one of the two strategies based on config. +> - `MultinodeConnectionManager.selectByLeastConnections(healthyServers)`: picks the server with the lowest active-session count; falls back to round-robin on a tie. +> - `MultinodeConnectionManager.selectByRoundRobin(healthyServers)`: atomically increments `roundRobinCounter` and selects `servers[counter % size]`. +> - `ojp-jdbc-driver` — [`ServerEndpoint`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/ServerEndpoint.java): holds `isHealthy`, `lastFailureTime`, host, and port state for each endpoint. --- -## 4. 
Connection Establishment and connHash Caching +### 3.5 Cluster Health Propagation + +**Cluster health string format:** + +``` +host1:port1(UP);host2:port2(DOWN);host3:port3(UP) +``` + +Each semicolon-separated segment is `host:port(STATUS)` where status is `UP` or `DOWN`. + +**Client responsibilities:** + +- **Build** the cluster health string from local server endpoint health state before every `connect()` call and before every operation that carries a `SessionInfo`. +- **Consume** the cluster health string returned in `SessionInfo.clusterHealth` on every response. Update local endpoint health states accordingly. +- **Proactively push** the updated cluster health to all currently healthy servers whenever the topology changes. Two independent triggers — both are necessary: + + **Trigger 1 — health-check thread**: When `performHealthCheck()` detects a newly failed or recovered server, it calls `pushClusterHealthToAllHealthyServers()` inline on the health-check thread. + + **Trigger 2 — query thread**: When a SQL query thread detects server failure via `handleServerFailure()`, it submits `pushClusterHealthToAllHealthyServers()` to the background scheduler asynchronously (to avoid blocking the query thread). + + The push is done by calling `connect()` on each healthy server with a `ConnectionDetails` whose `clusterHealth` field contains the new topology string. + +### Pseudo-code + +```python +def build_cluster_health(endpoints): + return ";".join( + f"{ep.host}:{ep.port}({'UP' if ep.is_healthy else 'DOWN'})" + for ep in endpoints + ) -### First connection (cache miss) +def push_cluster_health(endpoints, stored_details): + if not stored_details: + return # no connections yet + health_str = build_cluster_health(endpoints) + for conn_hash, details in stored_details.items(): + push_req = ConnectionDetails(**details, clusterHealth=health_str) + for ep in endpoints: + if ep.is_healthy: + stubs[ep].connect(push_req) -1. 
Build a `ConnectionDetails` message (see §2 for field mapping): - - `url` — the actual database connection URL. - - `user`, `password` — credentials. - - `clientUUID` — the stable process UUID (see §3). - - `properties` — datasource-specific properties from configuration (see §22), including cache rules (see §23). - - `serverEndpoints` — list of all known server endpoints as `host:port` strings, used by the server for cluster coordination. - - `clusterHealth` — current cluster health string (see §11); empty on very first connect. - - `isXA` — `true` for XA connections, `false` otherwise. +def consume_cluster_health(session_info): + for segment in session_info.clusterHealth.split(";"): + host_port, status = segment.rsplit("(", 1) + status = status.rstrip(")") + endpoint = find_endpoint(host_port) + if status == "DOWN" and endpoint.is_healthy: + handle_server_failure(endpoint) + # UP: do not mark healthy here — let the health-check thread confirm +``` + +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.generateClusterHealth()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): builds the semicolon-delimited health string. +> - `MultinodeConnectionManager.pushClusterHealthToAllHealthyServers()`: calls `connect()` on every healthy server; only runs when `!connectionDetailsByConnHash.isEmpty()`. +> - `MultinodeStatementService.withClusterHealth(session)`: enriches outgoing `SessionInfo` with the current cluster health string before each RPC. + +--- + +## 4. Client Responsibilities + +### 4.1 Connection Establishment and connHash Caching + +**First connection (cache miss):** + +1. Build a `ConnectionDetails` message (see §3.2 for field mapping). 2. Call `connect(ConnectionDetails)` on the chosen server. Receive `SessionInfo`. -3. Cache the returned `connHash`, keyed on `url + "|" + user + "|" + password + "|" + datasourceName`. 
Also store the full `ConnectionDetails` so it can be replayed if the server restarts.
+3. Cache the returned `connHash`, keyed on `url + "|" + user + "|" + password + "|" + datasourceName`. Also store the full `ConnectionDetails` for NOT_FOUND recovery.
 4. Return the received `SessionInfo` to the caller.
 
+**Subsequent connections (cache hit, non-XA only):**
+
+1. Look up `connHash` from the local cache by the connection key.
+2. Build a `SessionInfo` locally without making any gRPC call: `{connHash, clientUUID, isXA: false}`.
+3. Return this locally-built `SessionInfo`. No `sessionUUID` is set yet; the server assigns one lazily on the first operation that requires a session.
+
+XA connections always call the server — caching is disabled for XA because each XA connection must create a dedicated pool entry on a specific server.
+
+**Cache invalidation (NOT_FOUND recovery):**
+
+When any gRPC call returns `Status.NOT_FOUND` (the server has lost its in-memory pool, e.g. after a restart):
+1. Remove the cached `connHash` entry (but keep the stored `ConnectionDetails`).
+2. Re-issue a real `connect()` RPC using the stored `ConnectionDetails`.
+3. Cache the new `connHash` returned.
+4. Retry the original failed operation once with the new `SessionInfo`.
+5. This retry is only safe if the original request had no active `sessionUUID`. If a session was in progress, surface the error to the caller — the transaction state is permanently lost.
+ ### Pseudo-code ```python -# --- First connection (cache miss) --- +# First connection (cache miss) req = ConnectionDetails( - url = "jdbc:postgresql://db:5432/mydb", # actual DB URL + url = "jdbc:postgresql://db:5432/mydb", user = "alice", password = "secret", - clientUUID = CLIENT_UUID, # stable process UUID (§3) - serverEndpoints = ["host1:10591", "host2:10591"], # full cluster list - clusterHealth = build_cluster_health(endpoints), # §11; "" on very first call + clientUUID = CLIENT_UUID, + serverEndpoints = ["host1:10591", "host2:10591"], + clusterHealth = build_cluster_health(endpoints), # "" on very first call isXA = False, - properties = [PropertyEntry(key="ojp.datasource.name", string_value="default")] + properties = [PropertyEntry(key="ojp.datasource.name", value="default")] ) - session = stub.connect(req) -# session.connHash = "abc123..." — server-computed pool key -# session.clientUUID = CLIENT_UUID - -# Cache connHash for subsequent connections cache_key = f"{req.url}|{req.user}|{req.password}|default" connhash_cache[cache_key] = session.connHash -stored_details[session.connHash] = req # kept for NOT_FOUND recovery (see below) +stored_details[session.connHash] = req # kept for NOT_FOUND recovery -# --- Subsequent connection (cache hit, non-XA) --- -# No RPC call needed — build SessionInfo locally from the cached connHash +# Subsequent connection (cache hit, non-XA) session = SessionInfo( connHash = connhash_cache[cache_key], clientUUID = CLIENT_UUID, isXA = False - # sessionUUID is absent; the server assigns it lazily on startTransaction + # sessionUUID is absent; assigned lazily by server on startTransaction ) -# --- NOT_FOUND recovery --- -# If any RPC returns Status.NOT_FOUND (server restarted, pool lost): +# NOT_FOUND recovery del connhash_cache[cache_key] -session = stub.connect(stored_details[old_conn_hash]) # re-issue real connect() -connhash_cache[cache_key] = session.connHash # update cache +session = 
stub.connect(stored_details[old_conn_hash]) +connhash_cache[cache_key] = session.connHash # then retry the original failed RPC once ``` -### Subsequent connections (cache hit, non-XA only) - -When a subsequent connection uses the same credentials: -1. Look up `connHash` from the local cache by the connection key. -2. Build a `SessionInfo` locally without making any gRPC call: - ``` - SessionInfo { - connHash: - clientUUID: - isXA: false - } - ``` -3. Return this locally-built `SessionInfo`. No `sessionUUID` is set yet; it will be assigned by the server when the first SQL operation requires a session (e.g., on `startTransaction`). - -**XA connections always call the server** — caching is disabled for XA because each XA connection must create a dedicated pool entry on a specific server. - -### Cache invalidation (NOT_FOUND recovery) - -When any gRPC call returns `Status.NOT_FOUND`, the server has lost its in-memory pool (e.g., after a restart). Recovery procedure: -1. Remove the cached `connHash → connection-key` entry (but keep the stored `ConnectionDetails`). -2. Re-issue a real `connect()` RPC using the stored `ConnectionDetails`. -3. Cache the new `connHash` returned. -4. Retry the original failed operation once with the new `SessionInfo`. -5. This retry is only safe if the original request had no active `sessionUUID` (no open transaction). If a session was in progress, surface the error to the caller — the transaction state is permanently lost. - > **Reference implementation:** -> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.connect()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): orchestrates first-connect vs. cache-hit logic; calls `connectToAllServers()` for the real RPC path and `buildLocalSessionInfo()` for the cache-hit path. 
+> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.connect()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): orchestrates first-connect vs. cache-hit logic. > - `MultinodeConnectionManager.computeConnectionKey()`: builds the `url|user|password|datasourceName` cache key. > - `MultinodeConnectionManager.invalidateConnHash()`: removes the stale key from `connHashByConnectionKey` on `NOT_FOUND`. > - `MultinodeConnectionManager.reconnectForConnHash()`: re-issues the real `connect()` RPC using stored `ConnectionDetails` and updates the cache. @@ -259,179 +362,119 @@ When any gRPC call returns `Status.NOT_FOUND`, the server has lost its in-memory --- -## 5. Session Management +### 4.2 Session Lifecycle -### SessionInfo fields +**SessionInfo fields:** | Field | Type | Meaning | |---|---|---| | `connHash` | string | Server-side key identifying which connection pool to use | -| `clientUUID` | string | Client process identity (see §3) | -| `sessionUUID` | string | Server-side session handle; set once a session is established (on `startTransaction`, LOB creation, etc.) | +| `clientUUID` | string | Client process identity (see §3.3) | +| `sessionUUID` | string | Server-side session handle; set once a session is established | | `transactionInfo` | `TransactionInfo` | Contains `transactionUUID` and `transactionStatus` (`TRX_ACTIVE`, `TRX_COMMITED`, `TRX_ROLLBACK`) | | `sessionStatus` | `SessionStatus` | `SESSION_ACTIVE` or `SESSION_TERMINATED` | | `isXA` | bool | Whether this is an XA session | -| `targetServer` | string | `host:port` of the server this session is pinned to (set by the server, used by the client for stickiness) | +| `targetServer` | string | `host:port` of the server this session is pinned to | | `clusterHealth` | string | Current cluster health snapshot from the server's perspective | -### Lifecycle rules +**Lifecycle rules:** - Always propagate the **latest** `SessionInfo` on every outgoing request. 
The server updates and returns it in every response; the client must replace its local copy with the one returned.
-- When the response contains a `sessionUUID` that was absent in the request, register it immediately with the session-stickiness layer (see §6).
-- On connection close: call `terminateSession(SessionInfo)`. This is mandatory for releasing server-side resources, especially in multinode deployments where multiple servers may hold pools.
-- If `sessionStatus == SESSION_TERMINATED` is received, treat the connection as closed and do not make further calls on it.
+- When the response contains a `sessionUUID` that was absent in the request, register it immediately: record the binding `sessionUUID → response.targetServer`.
+- On connection close: call `terminateSession(SessionInfo)`. This is mandatory for releasing server-side resources.
+- If `sessionStatus == SESSION_TERMINATED` is received, treat the connection as closed; make no further calls on it.
+
+**Session affinity enforcement:**
+
+- Maintain a thread-safe map of `sessionUUID → host:port`.
+- On each request: if `sessionUUID` is set in the local `SessionInfo`, look up the bound server. Route the request to that server only.
+- If the bound server is currently marked unhealthy: **raise an error to the caller** — do not silently reroute. The in-flight session state (open transaction, LOB handle, cursor) cannot be migrated to another server.
+- When a session is closed (`terminateSession`), remove the binding from the map and decrement the session count for that server.
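The affinity rules above can be exercised with a small, self-contained sketch. All names here are illustrative (`SessionAffinityMap` is not part of any driver API; the reference logic lives in `MultinodeConnectionManager`):

```python
import threading


class SessionAffinityError(Exception):
    """Raised when a session's bound server is currently unhealthy."""


class SessionAffinityMap:
    """Illustrative thread-safe registry of sessionUUID -> "host:port" bindings."""

    def __init__(self):
        self._lock = threading.Lock()
        self._bindings = {}        # sessionUUID -> "host:port"
        self._session_counts = {}  # "host:port" -> active session count

    def bind(self, session_uuid, target_server):
        with self._lock:
            previous = self._bindings.get(session_uuid)
            if previous == target_server:
                return
            if previous is not None:
                # Re-binding (e.g. after a recovery): release the old server's count.
                self._session_counts[previous] -= 1
            self._bindings[session_uuid] = target_server
            self._session_counts[target_server] = self._session_counts.get(target_server, 0) + 1

    def route(self, session_uuid, healthy_servers):
        """Return the bound server, raise if it is unhealthy, or None if unbound."""
        with self._lock:
            server = self._bindings.get(session_uuid)
        if server is None:
            return None  # no binding yet: caller falls back to load balancing
        if server not in healthy_servers:
            # Never silently reroute: in-flight session state cannot migrate.
            raise SessionAffinityError(f"bound server {server} is unhealthy")
        return server

    def unbind(self, session_uuid):
        with self._lock:
            server = self._bindings.pop(session_uuid, None)
            if server is not None:
                self._session_counts[server] -= 1
```

A client would call `bind()` whenever a response carries a new `sessionUUID`/`targetServer` pair, `route()` before each RPC, and `unbind()` from `terminateSession`.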
### Pseudo-code ```python # Every gRPC call returns an updated SessionInfo — always replace the local copy -resp = stub.executeUpdate(StatementRequest(session=current_session, sql="...")) -current_session = resp.session # ← update after every call +resp = stub.executeUpdate(StatementRequest(session=current_session, sql="...")) +current_session = resp.session # update after every call -# When a new sessionUUID appears in the response, record the server binding (§6) +# When a new sessionUUID appears in the response, record the server binding if resp.session.sessionUUID and resp.session.sessionUUID != current_session.sessionUUID: bind_session(resp.session.sessionUUID, resp.session.targetServer) # Close a connection — release server-side state stub.terminateSession(current_session) -# After this call, discard current_session and do not make further calls on it ``` > **Reference implementation:** -> - `ojp-jdbc-driver` — [`Connection`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Connection.java): holds the mutable `session` field (`SessionInfo`); `close()` calls `terminateSession(session)` and nulls the session; `checkValid()` guards every method against a closed or force-invalidated connection. +> - `ojp-jdbc-driver` — [`Connection`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Connection.java): holds the mutable `session` field; `close()` calls `terminateSession(session)`; `checkValid()` guards every method against a closed connection. > - `ojp-jdbc-driver` — [`MultinodeStatementService.withClusterHealth()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeStatementService.java): enriches outgoing `SessionInfo` with the current cluster health string before each RPC. > - `MultinodeStatementService.checkAndBindSession()`: updates the stickiness map whenever the server returns a new or changed `sessionUUID`. 
-> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.terminateSession()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): forwards `terminateSession` to every server that received a `connect()` for this `connHash`. - ---- - -## 6. Session Stickiness - -### Rule - -Once a `sessionUUID` is established, **every subsequent request for that session must go to the same server**. The server embeds `targetServer` (`host:port`) in the `SessionInfo` response; the client must record this binding and honour it. - -### Enforcement - -- Maintain a thread-safe map of `sessionUUID → host:port`. -- On each request: if `sessionUUID` is set in the local `SessionInfo`, look up the bound server. Route the request to that server only. -- If the bound server is currently marked unhealthy: **raise an error to the caller** — do not silently reroute to another server. The in-flight session state (open transaction, LOB handle, cursor) cannot be migrated and the caller must handle the failure. -- When a session is closed (`terminateSession`), remove the binding from the map and decrement the session count for that server in the load-balancing tracker (see §7). - -### Session binding sources - -A session binding is created or updated in these cases: -- A response contains a `sessionUUID` that was not present in the request (first assignment). -- The `targetServer` field in a response differs from the currently recorded binding (re-binding after a recovery; log a warning). - -> **Reference implementation:** -> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.affinityServer(sessionKey)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): returns the bound server for a `sessionUUID`, or selects a new one via load balancing when no binding exists yet; throws `SQLException` if the bound server is unhealthy. 
+> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.affinityServer(sessionKey)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): returns the bound server for a `sessionUUID`; throws `SQLException` if the bound server is unhealthy. > - `MultinodeConnectionManager.bindSession(sessionUUID, targetServer)`: records the `sessionUUID → host:port` mapping in `sessionToServerMap`. -> - `MultinodeConnectionManager.getBoundTargetServer(sessionUUID)`: reads the current binding. -> - `MultinodeConnectionManager.unbindSession(sessionUUID)`: removes the binding on session close. > - `ojp-jdbc-driver` — [`SessionTracker`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/SessionTracker.java): maintains per-server session counts used by the load-balancer and redistribution logic. --- -## 7. Load Balancing - -### Server selection strategies - -Two strategies must be supported, selectable via configuration (see §22, property `ojp.loadaware.selection.enabled`): - -**Least-connections (default, `true`)** -Select the healthy server with the lowest number of active sessions. Track session counts in a thread-safe counter per server endpoint. Use round-robin as a tie-breaker when all servers have equal counts. - -**Round-robin (`false`)** -Cycle through healthy servers in order using an atomic counter modulo the number of healthy servers. - -### When selection runs - -Server selection runs on every new connection attempt (non-XA, first `connect()`) and on every XA `connect()`. Once a session is assigned a server (via session stickiness), selection does not run again for that session. - -### Healthy server filter - -Only servers whose `isHealthy() == true` are eligible for selection. If no healthy servers exist, raise a connection error. 
- -> **Reference implementation:** -> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.selectHealthyServer()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): the entry point that dispatches to one of the two strategies based on config. -> - `MultinodeConnectionManager.selectByLeastConnections(healthyServers)`: picks the server with the lowest active-session count; falls back to round-robin on a tie. -> - `MultinodeConnectionManager.selectByRoundRobin(healthyServers)`: atomically increments `roundRobinCounter` and selects `servers[counter % size]`. -> - `ojp-jdbc-driver` — [`ServerEndpoint`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/ServerEndpoint.java): holds `isHealthy`, `lastFailureTime`, host, and port state for each endpoint. - ---- - -## 8. Failover - -### What triggers failover +### 4.3 Failover -Connection-level gRPC errors indicate that the server is unreachable. The following gRPC status codes are treated as connectivity failures: +**What triggers failover:** | Status code | Trigger failover? 
| |---|---| | `UNAVAILABLE` | Yes | | `DEADLINE_EXCEEDED` | Yes | | `UNKNOWN` (with "connection" in message) | Yes | -| `INTERNAL` with SQL metadata trailers | **No** — this is a database-level error | -| `INTERNAL` without SQL metadata trailers | Yes — treated as a transport-level failure | -| `NOT_FOUND` | **No** — triggers reconnect (see §4), not failover | +| `INTERNAL` with SQL metadata trailers | **No** — database-level error | +| `INTERNAL` without SQL metadata trailers | Yes — transport-level failure | +| `NOT_FOUND` | **No** — triggers reconnect (see §4.1) | | `RESOURCE_EXHAUSTED` (pool exhaustion) | **No** — surface to caller | -| `CANCELLED` | **No** — this is a client-initiated cancellation signal; must never mark a server unhealthy | +| `CANCELLED` | **No** — client-initiated cancellation; must never mark a server unhealthy | | Any `SQLException` from server | **No** | -### Failover procedure +**Failover procedure:** -1. When a connectivity error is detected on a server: - a. Capture whether the server was previously healthy (`wasHealthy`). - b. Mark the server unhealthy (`isHealthy = false`), recording the failure timestamp. - c. Log the failure. - d. If this is a genuine healthy → unhealthy transition (`wasHealthy == true`), submit `pushClusterHealthToAllHealthyServers()` asynchronously to the background scheduler so surviving servers resize their pools immediately. The push is submitted (not called inline) to avoid blocking the query thread. - e. Shut down the gRPC channel for the failed server gracefully (allow in-flight calls to drain, then discard). -2. Select the next healthy server (using the configured strategy, excluding the failed server and any already attempted in this retry cycle). -3. Retry the operation on the new server. -4. If all servers have been attempted and all failed, raise a connection error to the caller. -5. 
Retry attempts and delay between retries are configurable (see §22, properties `ojp.multinode.retry.attempts` and `ojp.multinode.retry.delay`). +1. Capture whether the server was previously healthy (`wasHealthy`). +2. Mark the server unhealthy (`isHealthy = false`), recording the failure timestamp. +3. Log the failure. +4. If `wasHealthy == true`, submit `pushClusterHealthToAllHealthyServers()` asynchronously to the background scheduler (to avoid blocking the query thread). +5. Shut down the gRPC channel for the failed server gracefully. +6. Select the next healthy server (using the configured strategy, excluding already-attempted servers). +7. Retry the operation on the new server. +8. If all servers have been attempted and all failed, raise a connection error to the caller. +9. Retry attempts and delay are configurable (see §7.10, `ojp.multinode.retry.attempts` and `ojp.multinode.retry.delay`). -### What must NOT trigger failover - -- Database errors (bad SQL, constraint violations, auth failures) — surface directly to caller. -- Pool exhaustion — surface directly to caller. -- Session-invalidation errors (session lost after server failure) — surface directly to caller; the caller must re-establish the session. +**What must NOT trigger failover:** database errors, pool exhaustion, session-invalidation errors — surface all directly to caller. > **Reference implementation:** -> - `ojp-jdbc-driver` — [`GrpcExceptionHandler.isConnectionLevelError()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/GrpcExceptionHandler.java): classifies a `StatusRuntimeException` as a connectivity failure vs. a SQL/business error. `CANCELLED` is explicitly **excluded** (it is a client-side signal, not a server failure). +> - `ojp-jdbc-driver` — [`GrpcExceptionHandler.isConnectionLevelError()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/GrpcExceptionHandler.java): classifies a `StatusRuntimeException` as a connectivity failure vs. 
a SQL/business error. `CANCELLED` is explicitly **excluded**. > - `GrpcExceptionHandler.isPoolNotFoundException()`: returns `true` for `NOT_FOUND`, triggering reconnect rather than failover. > - `GrpcExceptionHandler.isSessionInvalidationError()`: returns `true` when the server indicates the session is gone. -> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.handleServerFailure(endpoint, exception)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): marks the server unhealthy, timestamps the failure, shuts down the gRPC channel gracefully, and — only on a genuine healthy→unhealthy transition (`wasHealthy == true`) — submits `pushClusterHealthToAllHealthyServers()` to the background `healthCheckScheduler` so the cluster health push does not block the query thread. -> - `MultinodeStatementService.executeOpResultWithSessionStickinessAndBinding()`: the retry loop that catches `StatusRuntimeException`, calls `isConnectionLevelError`, drives the server-selection retry cycle, and calls `handleServerFailure` on each failed attempt. +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.handleServerFailure(endpoint, exception)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): marks the server unhealthy, timestamps the failure, shuts down the gRPC channel gracefully, and — only on a genuine healthy→unhealthy transition — submits `pushClusterHealthToAllHealthyServers()` to the background `healthCheckScheduler`. +> - `MultinodeStatementService.executeOpResultWithSessionStickinessAndBinding()`: the retry loop that drives the server-selection retry cycle and calls `handleServerFailure` on each failed attempt. --- -## 9. Health Checking - -### Background task +### 4.4 Health Checking -Run a periodic background task that checks server health. The task must: -- Run at a configurable fixed interval (property `ojp.health.check.interval`, default 5 000 ms). 
-- Not block the main execution thread. -- Be a daemon task so it does not prevent process shutdown. +Run a periodic background task that checks server health at a configurable fixed interval (property `ojp.health.check.interval`, default 5 000 ms). The task must not block the main execution thread and must be a daemon task. -### Two-phase check +**Two-phase check:** -**Phase 1 — probe healthy servers (detect newly failed servers)** -Run when there are active XA sessions (`sessionToServerMap` is non-empty) **or** cached non-XA connection details (`connectionDetailsByConnHash` is non-empty). This dual guard ensures both XA and non-XA workloads trigger early failure detection. The guard prevents spurious "no healthy servers" errors before any connection has been established. For each currently healthy server that passes the guard, send a probe call. If the call fails, mark the server unhealthy and call the server-failure handler (see §8 and §11). +**Phase 1 — probe healthy servers (detect newly failed servers):** +Run when there are active XA sessions (`sessionToServerMap` is non-empty) **or** cached non-XA connection details (`connectionDetailsByConnHash` is non-empty). This dual guard ensures both XA and non-XA workloads trigger early failure detection. For each currently healthy server that passes the guard, send a probe call. If the call fails, mark the server unhealthy and call the server-failure handler. -**Phase 2 — probe unhealthy servers (detect recovery)** -For each currently unhealthy server, check if enough time has passed since the last failure (property `ojp.health.check.threshold`, default 5 000 ms). If so, probe the server. If the probe succeeds, run recovery (see §10). +**Phase 2 — probe unhealthy servers (detect recovery):** +For each currently unhealthy server, check if enough time has passed since the last failure (property `ojp.health.check.threshold`, default 5 000 ms). If so, probe the server. If the probe succeeds, run recovery (see §4.5). 
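The two gating rules above (the Phase 1 dual guard and the Phase 2 recovery threshold) can be sketched as pure helper functions. This is a hedged illustration under assumed names; it is not the driver's API:

```python
import time
from dataclasses import dataclass


@dataclass
class Endpoint:
    host: str
    port: int
    is_healthy: bool = True
    last_failure: float = 0.0  # monotonic timestamp of the most recent failure


def phase1_guard(xa_sessions, cached_details):
    """Phase 1 probes run only once there are active XA sessions OR cached
    non-XA connection details; before that, probing would only produce
    spurious failures."""
    return bool(xa_sessions) or bool(cached_details)


def due_for_recovery_probe(ep, threshold_ms=5000, now=None):
    """Phase 2 gate: probe an unhealthy endpoint only after the configured
    threshold (ojp.health.check.threshold, default 5000 ms) has elapsed
    since its last recorded failure."""
    if ep.is_healthy:
        return False
    now = time.monotonic() if now is None else now
    return (now - ep.last_failure) * 1000.0 >= threshold_ms
```

The health-check task would evaluate `phase1_guard` once per cycle and `due_for_recovery_probe` per unhealthy endpoint before issuing the actual probe RPC.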
-### Health probe modes +**Health probe modes:** | Mode | How to probe | When to use | |---|---|---| | Heartbeat (lightweight) | Send `connect()` with empty `url`, `user`, `password` — any response means transport is up | Default | | Full validation | Send `connect()` with real credentials; on success, call `terminateSession` on the returned session | When heartbeat is insufficient | -### Configurable properties (see §22) +**Configurable properties:** | Property | Default | Meaning | |---|---|---| @@ -443,416 +486,208 @@ For each currently unhealthy server, check if enough time has passed since the l ### Pseudo-code ```python -# Lightweight heartbeat: send empty credentials — any response means transport is up def heartbeat_probe(stub): try: stub.connect(ConnectionDetails(url="", user="", password="")) - return True # server is reachable - except grpc.RpcError: - return False # mark server unhealthy (§8) - -# Full validation: connect with real credentials, then immediately terminate -def full_validation_probe(stub, stored_details): - try: - session = stub.connect(stored_details) - stub.terminateSession(session) return True except grpc.RpcError: return False -# Periodic background task def run_health_check(endpoints, stubs, stored_details): for ep in endpoints: if ep.is_healthy: - # Phase 1 — probe healthy server; detect new failures if stored_details or xa_sessions: # guard: skip if no connections yet if not heartbeat_probe(stubs[ep]): handle_server_failure(ep) push_cluster_health_async(endpoints, stored_details) else: - # Phase 2 — probe unhealthy server; detect recovery if time_since(ep.last_failure) >= HEALTH_CHECK_THRESHOLD: if heartbeat_probe(stubs[ep]): - reinitialize_pool_on_recovered_server(ep, stored_details) # §10 + reinitialize_pool_on_recovered_server(ep, stored_details) # §4.5 ep.mark_healthy() - push_cluster_health_inline(endpoints, stored_details) # §11 + push_cluster_health_inline(endpoints, stored_details) ``` > **Reference implementation:** -> - 
`ojp-jdbc-driver` — [`MultinodeConnectionManager.performHealthCheck()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): the scheduled task body; implements the two-phase check. Phase 1 fires when `!sessionToServerMap.isEmpty() || !connectionDetailsByConnHash.isEmpty()` (XA sessions OR non-XA cached connections). Phase 1 failure calls `pushClusterHealthToAllHealthyServers()` inline on the health-check thread. Phase 2 calls `reinitializePoolOnRecoveredServer()` before `markHealthy()`, then pushes cluster health. -> - `ojp-jdbc-driver` — [`HealthCheckValidator.validateServer(endpoint)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/HealthCheckValidator.java): performs a single lightweight probe; `validateServer(endpoint, connectionDetails)` performs the full-validation probe with real credentials followed by `terminateSession`. +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.performHealthCheck()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): the scheduled task body; implements the two-phase check. Phase 1 fires when `!sessionToServerMap.isEmpty() || !connectionDetailsByConnHash.isEmpty()`. Phase 2 calls `reinitializePoolOnRecoveredServer()` before `markHealthy()`, then pushes cluster health. +> - `ojp-jdbc-driver` — [`HealthCheckValidator.validateServer(endpoint)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/HealthCheckValidator.java): performs a single lightweight probe; `validateServer(endpoint, connectionDetails)` performs the full-validation probe. > - `ojp-jdbc-driver` — [`HealthCheckConfig`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/HealthCheckConfig.java): POJO holding `healthCheckIntervalMs`, `healthCheckThresholdMs`, `healthCheckTimeoutMs`, and `redistributionEnabled`. 
-> - `MultinodeConnectionManager` constructor: schedules `performHealthCheck` on a `ScheduledExecutorService` at the configured interval. --- -## 10. Connection Redistribution on Recovery +### 4.5 Connection Redistribution on Recovery -### Goal +When a failed server comes back online, rebalance client-side connections so that the recovered server receives its fair share of traffic again. -When a failed server comes back online, rebalance client-side connections so that the recovered server receives its fair share of traffic again. This avoids all load remaining on the servers that survived the outage. +**Procedure on recovery:** -### Procedure on recovery - -1. **Reinitialize pools on the recovered server first** (before marking healthy). Check whether any non-XA connections have been cached (`connectionDetailsByConnHash` is non-empty). If so, for every cached `connHash`/`ConnectionDetails` pair, call `connect()` on the recovered server so it creates the HikariCP pool immediately. This closes the NOT_FOUND window that would otherwise exist between marking the server healthy and the first SQL call reaching it. Only after all pools are pre-created, proceed to step 2. +1. **Reinitialize pools on the recovered server first** (before marking healthy). For every cached `connHash`/`ConnectionDetails` pair, call `connect()` on the recovered server so it creates the HikariCP pool immediately. This closes the NOT_FOUND window between marking the server healthy and the first SQL call reaching it. 2. Mark the server healthy (`endpoint.markHealthy()`). -3. Push the updated cluster health string to all healthy servers (see §11) so they can resize their pools. +3. Push the updated cluster health string to all healthy servers (see §3.5). 4. If redistribution is enabled (`ojp.redistribution.enabled = true`), begin rebalancing: - Determine the ideal share: `totalConnections / numberOfHealthyServers`. - Identify over-loaded servers (connections > ideal share). 
- - Close a fraction of idle connections on over-loaded servers so they are returned to the pool, then re-opened — the client's load-balancing layer will route the re-opens to the least-loaded server (including the recovered one). - - Honour the configurable fraction (`ojp.redistribution.idleRebalanceFraction`, default 1.0) and max-close-per-cycle limit (`ojp.redistribution.maxClosePerRecovery`, default 100). + - Close a fraction of idle connections on over-loaded servers; the client's load-balancing layer will route re-opens to the least-loaded server. + - Honour `ojp.redistribution.idleRebalanceFraction` (default 1.0) and `ojp.redistribution.maxClosePerRecovery` (default 100). > **Reference implementation:** -> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.reinitializePoolOnRecoveredServer(recoveredServer)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): runs only when `!connectionDetailsByConnHash.isEmpty()`; iterates the map and calls `connect()` on the recovered server for each stored `ConnectionDetails`; always called **before** `endpoint.markHealthy()` to eliminate the NOT_FOUND window. +> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.reinitializePoolOnRecoveredServer(recoveredServer)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): runs only when `!connectionDetailsByConnHash.isEmpty()`; always called **before** `endpoint.markHealthy()`. > - `ojp-jdbc-driver` — [`ConnectionRedistributor.rebalance(recoveredServers, allHealthyServers)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/ConnectionRedistributor.java): closes a fraction of idle connections on over-loaded servers for non-XA mode. > - `ojp-jdbc-driver` — [`XAConnectionRedistributor.rebalance(recoveredServers, allHealthyServers)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/XAConnectionRedistributor.java): equivalent redistribution for XA connections. 
> - `ojp-jdbc-driver` — [`ConnectionTracker`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/ConnectionTracker.java): maintains the per-server `Connection` list consulted by `ConnectionRedistributor`. --- -## 11. Cluster Health Propagation - -### Cluster health string format - -``` -host1:port1(UP);host2:port2(DOWN);host3:port3(UP) -``` - -Each semicolon-separated segment is `host:port(STATUS)` where status is `UP` or `DOWN`. - -### Client responsibilities - -- **Build** the cluster health string from local server endpoint health state before every `connect()` call and before every operation that carries a `SessionInfo` (by populating `SessionInfo.clusterHealth`). -- **Consume** the cluster health string returned in `SessionInfo.clusterHealth` on every response. Update local endpoint health states accordingly: mark endpoints `DOWN` as unhealthy and endpoints `UP` as healthy (if they were previously failed). -- **Proactively push** the updated cluster health to all currently healthy servers whenever the topology changes (a server fails or recovers). This push happens via two independent triggers — both are necessary: - - **Trigger 1 — health-check thread**: When `performHealthCheck()` detects a newly failed server or a recovered server, it calls `pushClusterHealthToAllHealthyServers()` inline on the health-check thread. This covers the case when no SQL traffic is active at the moment of the topology change. +## 5. Minimal End-to-End Example - **Trigger 2 — query thread**: When a SQL query thread detects server failure via `handleServerFailure()`, it submits `pushClusterHealthToAllHealthyServers()` to the background scheduler. This covers the race where the query thread marks the server unhealthy before the health checker runs (the health checker's Phase 1 loop would then skip the already-unhealthy server and never push). The push is submitted asynchronously to avoid blocking the query thread. 
- - The push is done by calling `connect()` on each healthy server with a `ConnectionDetails` whose `clusterHealth` field contains the new topology string. The server uses this to resize its pool immediately, regardless of whether any SQL is in flight. - -### Generation - -``` -generate_cluster_health(endpoints): - return ";".join( - f"{ep.host}:{ep.port}({'UP' if ep.is_healthy else 'DOWN'})" - for ep in endpoints - ) -``` - -### Pseudo-code - -```python -# Build the health string from local endpoint state -def build_cluster_health(endpoints): - return ";".join( - f"{ep.host}:{ep.port}({'UP' if ep.is_healthy else 'DOWN'})" - for ep in endpoints - ) - -# Push updated cluster health to all healthy servers via a connect() call. -# The server uses the clusterHealth field to resize its pool immediately. -def push_cluster_health(endpoints, stored_details): - if not stored_details: - return # no connections yet — nothing to push - health_str = build_cluster_health(endpoints) - for conn_hash, details in stored_details.items(): - push_req = ConnectionDetails(**details, clusterHealth=health_str) - for ep in endpoints: - if ep.is_healthy: - stubs[ep].connect(push_req) # no-op for pool creation; resizes pool - -# Consume the cluster health returned in every gRPC response -def consume_cluster_health(session_info): - for segment in session_info.clusterHealth.split(";"): - host_port, status = segment.rsplit("(", 1) - status = status.rstrip(")") - endpoint = find_endpoint(host_port) - if status == "DOWN" and endpoint.is_healthy: - handle_server_failure(endpoint) - elif status == "UP" and not endpoint.is_healthy: - # do not mark healthy here — let the health-check thread confirm (§9) - pass -``` - -> **Reference implementation:** -> - `ojp-jdbc-driver` — [`MultinodeConnectionManager.generateClusterHealth()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeConnectionManager.java): builds the semicolon-delimited health string from `serverEndpoints`. 
-> - `MultinodeConnectionManager.pushClusterHealthToAllHealthyServers()`: calls `connect()` on every healthy server with the new cluster health embedded in `ConnectionDetails`; only runs when `!connectionDetailsByConnHash.isEmpty()` (no-op until the first real connection is established). -> - `MultinodeConnectionManager.handleServerFailure()` (Trigger 2): submits `pushClusterHealthToAllHealthyServers()` to `healthCheckScheduler` on a genuine healthy→unhealthy transition so query threads are never blocked by the push. -> - `MultinodeConnectionManager.performHealthCheck()` (Trigger 1): calls `pushClusterHealthToAllHealthyServers()` directly (inline on health-check thread) after marking a server DOWN or after a recovered server is marked healthy. -> - `MultinodeStatementService.withClusterHealth(sessionInfo)`: attaches the current health string to an outgoing `SessionInfo` before each RPC (reactive secondary path). - ---- - -## 12. Transaction Management (non-XA) - -### Transaction lifecycle - -The server tracks open transactions per session. The client controls when transactions begin and end by calling explicit RPCs. - -- **Start a transaction**: call `startTransaction(SessionInfo)`. The returned `SessionInfo` contains a `transactionUUID` and `transactionStatus = TRX_ACTIVE`. All subsequent SQL calls on this session run inside the transaction until it is committed or rolled back. -- **Commit**: call `commitTransaction(SessionInfo)`. Returns updated `SessionInfo` with `transactionStatus = TRX_COMMITED`. -- **Rollback**: call `rollbackTransaction(SessionInfo)`. Returns updated `SessionInfo` with `transactionStatus = TRX_ROLLBACK`. -- **Auto-commit mode** (optional, for JDBC compatibility): if your client API exposes an auto-commit flag, implement it by calling `startTransaction` when the flag is switched off, and `commitTransaction` when it is switched back on while a transaction is active. 
In auto-commit mode, each SQL statement runs without an explicit transaction; the server commits each statement individually. - -Always replace the local `SessionInfo` with the one returned by these calls. - -### Transaction isolation - -Set or get the isolation level by calling `callResource` with `RES_CONNECTION` and `CallType.CALL_SET` / `CALL_GET` and resource name `"TransactionIsolation"`. The isolation level should be reset to the default after each logical connection is reused. - -### Pseudo-code +The following self-contained pseudo-code covers the full lifecycle of a typical OJP client interaction: channel setup, connection (with caching), a SELECT query, a transaction with DML, and graceful shutdown. ```python -# Begin an explicit transaction -session = stub.startTransaction(session) -# session.transactionInfo.transactionUUID = "txn-uuid" -# session.transactionInfo.transactionStatus = TRX_ACTIVE - -# Execute SQL within the open transaction -resp = stub.executeUpdate(StatementRequest(session=session, sql="INSERT INTO orders ...")) -session = resp.session # always update local session - -# Commit -session = stub.commitTransaction(session) -# session.transactionInfo.transactionStatus = TRX_COMMITED - -# — OR — Rollback -session = stub.rollbackTransaction(session) -# session.transactionInfo.transactionStatus = TRX_ROLLBACK - -# Set transaction isolation (READ_COMMITTED = 2) -resp = stub.callResource(CallResourceRequest( - session = session, - resourceType = RES_CONNECTION, - target = TargetCall( - callType = CALL_SET, - resourceName = "TransactionIsolation", - params = [ParameterValue(int_value=2)] - ) -)) -session = resp.session - -# Get current isolation level -resp = stub.callResource(CallResourceRequest( - session = session, - resourceType = RES_CONNECTION, - target = TargetCall(callType=CALL_GET, resourceName="TransactionIsolation") -)) -isolation_level = resp.values[0].int_value -session = resp.session -``` - -> **Reference implementation:** -> - 
`ojp-jdbc-driver` — [`Connection.setAutoCommit(boolean)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Connection.java): calls `commitTransaction` when switching on and `startTransaction` when switching off; updates the local `session` field from each response. -> - `Connection.commit()` / `Connection.rollback()`: delegate to `statementService.commitTransaction(session)` / `rollbackTransaction(session)` when `autoCommit == false`. -> - `Connection.close()`: calls `terminateSession(session)` unconditionally. -> - `Connection.setTransactionIsolation(level)` / `getTransactionIsolation()`: forwarded via `callProxy(CallType.CALL_SET/GET, "TransactionIsolation", ...)`. -> - `ojp-jdbc-driver` — [`StatementServiceGrpcClient.startTransaction()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java) / `commitTransaction()` / `rollbackTransaction()`: the single-node gRPC calls. +import grpc, uuid ---- - -## 13. Savepoints - -Savepoints are implemented through the `callResource` protocol using `ResourceType.RES_SAVEPOINT`. - -### Creating a savepoint - -Call `callResource` with: -- `resourceType = RES_SAVEPOINT` -- `target.callType = CALL_SET` (or `CALL_INSERT` for named savepoints, depending on server version) -- `target.resourceName = "Savepoint"` -- `target.params = [savepointName]` if named; empty for anonymous savepoints. - -The response contains the savepoint UUID in `CallResourceResponse.resourceUUID`. 
- -### Rolling back to a savepoint - -Call `callResource` with: -- `resourceType = RES_SAVEPOINT` -- `resourceUUID = ` -- `target.callType = CALL_ROLLBACK` - -### Releasing a savepoint +CLIENT_UUID = str(uuid.uuid4()) +connhash_cache = {} # url|user|password|dsname -> connHash +stored_details = {} # connHash -> ConnectionDetails (for NOT_FOUND recovery) -Call `callResource` with: -- `resourceType = RES_SAVEPOINT` -- `resourceUUID = ` -- `target.callType = CALL_RELEASE` +channel = grpc.create_channel("ojp-server:10591", + credentials=grpc.local_channel_credentials()) +stub = StatementServiceStub(channel) -### Pseudo-code - -```python -# Create a named savepoint -resp = stub.callResource(CallResourceRequest( - session = session, - resourceType = RES_SAVEPOINT, - target = TargetCall( - callType = CALL_SET, - resourceName = "Savepoint", - params = [ParameterValue(string_value="my_savepoint")] # omit for anonymous - ) -)) -savepoint_uuid = resp.resourceUUID # keep this to roll back or release later -session = resp.session +# Open a connection (first call -> cache miss -> real RPC) +req = ConnectionDetails( + url = "jdbc:postgresql://db:5432/mydb", + user = "alice", + password = "secret", + clientUUID = CLIENT_UUID, + serverEndpoints = ["ojp-server:10591"], + clusterHealth = "", # empty on very first connect + isXA = False, + properties = [PropertyEntry(key="ojp.datasource.name", value="default")] +) +session = stub.connect(req) # single RPC; subsequent connects use the cache +cache_key = "jdbc:postgresql://db:5432/mydb|alice|secret|default" +connhash_cache[cache_key] = session.connHash +stored_details[session.connHash] = req -# Roll back to the savepoint (partial undo) -resp = stub.callResource(CallResourceRequest( - session = session, - resourceType = RES_SAVEPOINT, - resourceUUID = savepoint_uuid, - target = TargetCall(callType=CALL_ROLLBACK, resourceName="Savepoint") -)) -session = resp.session +# Execute a SELECT query +result_set_uuid = None +labels = [] +rows = [] 
+for op_result in stub.executeQuery(StatementRequest( + session = session, + sql = "SELECT id, name FROM orders WHERE customer = ?", + parameters = [ParameterProto(index=1, type=PT_STRING, + values=[ParameterValue(string_value="alice")])], + statementUUID = str(uuid.uuid4()))): + qr = op_result.query_result + if result_set_uuid is None: + result_set_uuid = qr.resultSetUUID + labels = qr.labels + rows.extend(qr.rows) + session = op_result.session # always update session after every RPC -# Release the savepoint (no longer needed) -resp = stub.callResource(CallResourceRequest( +stub.callResource(CallResourceRequest( session = session, - resourceType = RES_SAVEPOINT, - resourceUUID = savepoint_uuid, - target = TargetCall(callType=CALL_RELEASE, resourceName="Savepoint") + resourceType = RES_RESULT_SET, + resourceUUID = result_set_uuid, + target = TargetCall(callType=CALL_CLOSE) )) -session = resp.session -``` - -> **Reference implementation:** -> - `ojp-jdbc-driver` — [`Connection.setSavepoint()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Connection.java) / `setSavepoint(name)`: calls `callProxy` with `CALL_SET`, `"Savepoint"`, and the optional name; wraps the returned resource UUID in a [`Savepoint`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Savepoint.java) object. -> - `Connection.rollback(Savepoint)`: calls `callProxy` with `CALL_ROLLBACK`, `"Savepoint"`, and the savepoint's resource UUID. -> - `Connection.releaseSavepoint(Savepoint)`: calls `callProxy` with `CALL_RELEASE`. - ---- - -## 14. XA / Distributed Transactions - -### Overview - -XA support maps the standard XA resource manager protocol to gRPC RPCs. XA connections are always pinned to a single server (§6). 
- -### XA transaction lifecycle - -``` -xaStart(XaStartRequest) -- Begin branch; safe to retry on connection error -xaEnd(XaEndRequest) -- End branch; NEVER retry after this point -xaPrepare(XaPrepareRequest) -- Two-phase prepare; returns XA_OK or XA_RDONLY -xaCommit(XaCommitRequest) -- Commit (onePhase=true for one-phase optimisation) -xaRollback(XaRollbackRequest) -- Roll back the branch -xaRecover(XaRecoverRequest) -- List in-doubt XIDs (for recovery after crash) -xaForget(XaForgetRequest) -- Forget a heuristically completed branch -``` - -### Xid encoding (XidProto) -| Field | Type | Meaning | -|---|---|---| -| `formatId` | int32 | Transaction format ID | -| `globalTransactionId` | bytes | Global transaction ID (up to 64 bytes) | -| `branchQualifier` | bytes | Branch qualifier (up to 64 bytes) | - -### Retry policy - -- **`xaStart`** only: retry on connection-level errors (see §8). No transaction state exists yet so retrying is safe. -- **All other XA operations**: do not retry automatically. Surface failures to the caller's transaction manager. - -### XA session binding - -On the response to `xaStart`, record the `sessionUUID → targetServer` binding (§6). All subsequent XA operations for this branch must go to the same server. If that server is unavailable, raise `XAException(XAER_RMFAIL)`. - -### Timeout +# Execute a transaction +session = stub.startTransaction(session) +# session.sessionUUID is now assigned; all subsequent calls go to session.targetServer -- `xaSetTransactionTimeout(seconds)` and `xaGetTransactionTimeout()` are straightforward pass-throughs to the server. -- `xaIsSameRM` checks whether two `SessionInfo` objects originate from the same resource manager (same server). 
+resp = stub.executeUpdate(StatementRequest( + session = session, + sql = "INSERT INTO orders(customer, amount) VALUES(?, ?)", + parameters = [ + ParameterProto(index=1, type=PT_STRING, values=[ParameterValue(string_value="alice")]), + ParameterProto(index=2, type=PT_INT, values=[ParameterValue(int_value=99)]) + ], + statementUUID = str(uuid.uuid4()) +)) +session = resp.session +rows_affected = resp.value.int_value # e.g., 1 -### Pseudo-code +session = stub.commitTransaction(session) -```python -xid = XidProto( - formatId = 1, - globalTransactionId = b"global-tx-001", - branchQualifier = b"branch-1" -) +# Close the connection and shut down +stub.terminateSession(session) +channel.shutdown(grace_period_seconds=5) +``` -# 1. Start the XA branch (safe to retry on connection error) -resp = stub.xaStart(XaStartRequest(session=session, xid=xid, flags=0)) -session = resp.session # bind session.targetServer → this server for all remaining calls +--- -# 2. Execute SQL within the branch (normal executeUpdate/executeQuery calls) -resp = stub.executeUpdate(StatementRequest(session=session, sql="UPDATE accounts ...")) -session = resp.session +## 6. Error Handling -# 3. End the branch — do NOT retry past this point -resp = stub.xaEnd(XaEndRequest(session=session, xid=xid, flags=0)) -session = resp.session +### 6.1 Error Classification -# 4. Prepare (two-phase commit, phase 1) -prep = stub.xaPrepare(XaPrepareRequest(session=session, xid=xid)) -# prep.result = XA_OK (proceed to commit) or XA_RDONLY (read-only; no commit needed) +| Condition | gRPC status | Client action | +|---|---|---| +| SQL error (bad query, constraint, etc.) 
| `INTERNAL` + `SqlErrorResponse` trailer | Throw SQL exception; do not retry; do not mark server unhealthy | +| Pool not found (server restarted) | `NOT_FOUND` | Invalidate connHash cache; reconnect; retry once (§4.1) | +| Server unreachable | `UNAVAILABLE` | Failover to next server (§4.3) | +| Request timeout | `DEADLINE_EXCEEDED` | Failover to next server (§4.3) | +| Client-side cancellation | `CANCELLED` | Do **not** failover; do **not** mark server unhealthy; surface to caller | +| Pool exhausted | `RESOURCE_EXHAUSTED` | Throw pool-exhaustion error; do not retry; do not mark server unhealthy | +| Session invalidated (server failure) | Session-not-found message | Throw session-lost error; do not retry; let caller decide | +| Session stickiness violation (server down) | Local check before RPC | Throw connection error immediately; do not reroute | -# 5a. Commit (two-phase) -stub.xaCommit(XaCommitRequest(session=session, xid=xid, onePhase=False)) +### 6.2 SQL Errors vs. Transport Errors -# 5b. — OR — One-phase optimisation (skip xaPrepare) -stub.xaCommit(XaCommitRequest(session=session, xid=xid, onePhase=True)) +When the server encounters a SQL error, it returns `Status.INTERNAL` with a `SqlErrorResponse` message attached to the trailing metadata. Extract it using the proto message key for `SqlErrorResponse`. -# 5c. — OR — Rollback -stub.xaRollback(XaRollbackRequest(session=session, xid=xid)) +``` +SqlErrorResponse { + reason: string + sqlState: string // ANSI SQL state code + vendorCode: int32 // database-specific error code + sqlErrorType: SqlErrorType // SQL_EXCEPTION or SQL_DATA_EXCEPTION +} +``` -# Recovery: list in-doubt XIDs after a crash -resp = stub.xaRecover(XaRecoverRequest(session=session, flag=TMSTARTRSCAN)) -for recovered_xid in resp.xids: - stub.xaCommit(...) 
# or xaRollback — decision belongs to the transaction manager +Map to the host language's exception hierarchy: `SQL_EXCEPTION` -> standard SQL exception; `SQL_DATA_EXCEPTION` -> data-specific SQL exception (subtype). -# Forget a heuristically completed branch -stub.xaForget(XaForgetRequest(session=session, xid=xid)) -``` +> **Note:** Before April 2026, the server incorrectly used `Status.CANCELLED` for SQL errors. The correct status is `Status.INTERNAL` with a `SqlErrorResponse` trailer. Any implementation must use `INTERNAL` for SQL errors and must not treat `CANCELLED` as a server failure. > **Reference implementation:** -> - `ojp-jdbc-driver` — [`OjpXAResource`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/xa/OjpXAResource.java): implements `XAResource`; all 10 lifecycle methods (`start`, `end`, `prepare`, `commit`, `rollback`, `recover`, `forget`, `setTransactionTimeout`, `getTransactionTimeout`, `isSameRM`); contains the `xaStart` retry loop and the `toXidProto` / `fromXidProto` conversion helpers. -> - `ojp-jdbc-driver` — [`OjpXAConnection`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/xa/OjpXAConnection.java): creates the XA-mode `StatementService` connection (always calling the server, never cache-hit) and vends `OjpXAResource`. -> - `ojp-jdbc-driver` — [`OjpXADataSource`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/xa/OjpXADataSource.java): entry point for XA; calls `MultinodeConnectionManager.connectXA()` to pin the session to a single server. -> - `ojp-jdbc-driver` — [`StatementServiceGrpcClient.xaStart()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java) … `xaIsSameRM()`: the 10 single-node gRPC stub wrappers. +> - `ojp-jdbc-driver` — [`GrpcExceptionHandler.handle(StatusRuntimeException)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/GrpcExceptionHandler.java): extracts `SqlErrorResponse` from gRPC trailing metadata on `Status.INTERNAL`. 
+> - `GrpcExceptionHandler.isPoolNotFoundException(exception)`: returns `true` for `NOT_FOUND`. +> - `GrpcExceptionHandler.isSessionInvalidationError(exception)`: returns `true` for session-invalidation error messages. +> - `GrpcExceptionHandler.isConnectionLevelError(exception)`: returns `true` for `UNAVAILABLE`, `DEADLINE_EXCEEDED`, and connection-related `UNKNOWN` errors. --- -## 15. Statement Execution +## 7. Implementation Guidance -### Sending SQL to the server +### 7.1 Statement Execution All SQL is executed by populating a `StatementRequest` and calling either `executeUpdate` or `executeQuery` on the stub. -**Parameterless SQL** -Set `sql` to the full query string and leave `parameters` empty. +**Parameterless SQL:** Set `sql` to the full query string and leave `parameters` empty. -**Parameterized SQL** -Set `sql` with `?` positional placeholders and populate the `parameters` list with one `ParameterProto` per `?`. Parameters are accumulated locally and sent together in a single `StatementRequest`. Assign a `statementUUID` (a random UUID per logical prepared-statement instance) so the server can track resources tied to that statement. +**Parameterized SQL:** Set `sql` with `?` positional placeholders and populate the `parameters` list with one `ParameterProto` per `?`. Assign a `statementUUID` (a random UUID per logical prepared-statement instance). -**Stored-procedure calls** -First call `callResource` with `CallType.CALL_PREPARE` to register the procedure on the server and receive a `resourceUUID`. Then call `callResource` with `CallType.CALL_EXECUTE` to run it, passing IN parameters and retrieving OUT/INOUT values from `CallResourceResponse.values`. +**Stored-procedure calls:** First call `callResource` with `CallType.CALL_PREPARE` to register the procedure on the server and receive a `resourceUUID`. Then call `callResource` with `CallType.CALL_EXECUTE` to run it, passing IN parameters and retrieving OUT/INOUT values from `CallResourceResponse.values`. 
-### StatementRequest structure +**StatementRequest structure:** ``` StatementRequest { - session: SessionInfo // current session - sql: string // the SQL string - parameters: ParameterProto[] // indexed parameters (empty for parameterless SQL) - statementUUID: string // random UUID per statement instance - properties: PropertyEntry[] // optional per-statement properties + session: SessionInfo + sql: string + parameters: ParameterProto[] + statementUUID: string + properties: PropertyEntry[] } ``` -### Execution routing - -- Use `executeUpdate` for INSERT / UPDATE / DELETE / DDL — returns `OpResult` with `type = INTEGER` containing affected row count. -- Use `executeQuery` for SELECT — returns a server-streaming response. Consume the first `OpResult` to get the initial batch; call `fetchNextRows` for subsequent pages (see §18). -- After any execution, update the local `SessionInfo` from the `OpResult.session` field. +**Execution routing:** Use `executeUpdate` for INSERT / UPDATE / DELETE / DDL (returns affected row count in `value.int_value`). Use `executeQuery` for SELECT. Always update the local `SessionInfo` from `OpResult.session`. 
### Pseudo-code ```python -# DML — INSERT / UPDATE / DELETE (use executeUpdate) +# DML resp = stub.executeUpdate(StatementRequest( session = session, sql = "INSERT INTO orders(customer, amount) VALUES(?, ?)", @@ -860,72 +695,55 @@ resp = stub.executeUpdate(StatementRequest( ParameterProto(index=1, type=PT_STRING, values=[ParameterValue(string_value="Alice")]), ParameterProto(index=2, type=PT_INT, values=[ParameterValue(int_value=42)]) ], - statementUUID = new_uuid() # random UUID per statement instance -)) -session = resp.session # always update local session -rows_affected = resp.value.int_value # e.g., 1 - -# Query — SELECT (use executeQuery, which is server-streaming) -req = StatementRequest( - session = session, - sql = "SELECT id, name FROM orders WHERE customer = ?", - parameters = [ParameterProto(index=1, type=PT_STRING, - values=[ParameterValue(string_value="Alice")])], statementUUID = new_uuid() -) -result_set_uuid = None -for op_result in stub.executeQuery(req): # iterate the server-streaming response - qr = op_result.query_result - if result_set_uuid is None: - result_set_uuid = qr.resultSetUUID - labels = qr.labels # e.g., ["id", "name"] - for row in qr.rows: +)) +session = resp.session +rows_affected = resp.value.int_value + +# Query +for op_result in stub.executeQuery(StatementRequest( + session=session, sql="SELECT id, name FROM orders", + statementUUID=new_uuid())): + for row in op_result.query_result.rows: id_val = row.values[0].int_value name_val = row.values[1].string_value session = op_result.session -# Fetch additional pages → see §18 -# Stored procedure — CALL_PREPARE then CALL_EXECUTE +# Stored procedure prep_resp = stub.callResource(CallResourceRequest( - session = session, - resourceType = RES_CALLABLE_STATEMENT, - target = TargetCall(callType=CALL_PREPARE, resourceName="{call my_proc(?,?)}", - params=[ParameterValue(int_value=1)]) # IN param + session=session, resourceType=RES_CALLABLE_STATEMENT, + target=TargetCall(callType=CALL_PREPARE, 
resourceName="{call my_proc(?,?)}", + params=[ParameterValue(int_value=1)]) )) -proc_uuid = prep_resp.resourceUUID -session = prep_resp.session - exec_resp = stub.callResource(CallResourceRequest( - session = session, - resourceType = RES_CALLABLE_STATEMENT, - resourceUUID = proc_uuid, - target = TargetCall(callType=CALL_EXECUTE) + session=prep_resp.session, resourceType=RES_CALLABLE_STATEMENT, + resourceUUID=prep_resp.resourceUUID, + target=TargetCall(callType=CALL_EXECUTE) )) -out_value = exec_resp.values[0] # first OUT/INOUT parameter value +out_value = exec_resp.values[0] session = exec_resp.session ``` > **Reference implementation:** -> - `ojp-jdbc-driver` — [`Statement`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Statement.java): `executeQuery(sql)` → `statementService.executeQuery(...)`; `executeUpdate(sql)` → `statementService.executeUpdate(...)`; holds `statementUUID` (assigned lazily); `execute(sql)` handles the dual-result case. -> - `ojp-jdbc-driver` — [`PreparedStatement`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/PreparedStatement.java): accumulates parameters in a `SortedMap`; `executeQuery()` and `executeUpdate()` pass the full param map to `statementService`; all 28 `setXxx(index, value)` methods map to the corresponding `ParameterType` (see §16). +> - `ojp-jdbc-driver` — [`Statement`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Statement.java): `executeQuery(sql)` and `executeUpdate(sql)` delegate to `statementService`; holds `statementUUID` assigned lazily. +> - `ojp-jdbc-driver` — [`PreparedStatement`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/PreparedStatement.java): accumulates parameters in a `SortedMap`; all 28 `setXxx(index, value)` methods map to the corresponding `ParameterType` (see §7.2). 
> - `ojp-jdbc-driver` — [`CallableStatement`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/CallableStatement.java): issues `callResource(CALL_PREPARE)` on construction; retrieves OUT/INOUT values via `callResource(CALL_EXECUTE)` after execution.

---

-## 16. Parameter Type Mapping
-
-### ParameterProto
+### 7.2 Parameter Type Mapping

Each parameter is represented as:
+
```
ParameterProto {
- index: int32 // 1-based parameter position
- type: ParameterTypeProto // one of the 28 type codes
- values: ParameterValue[] // one value for normal params; multiple for array params
+ index: int32
+ type: ParameterTypeProto
+ values: ParameterValue[]
}
```

-### ParameterTypeProto values and their ParameterValue encoding
+**ParameterTypeProto values and their ParameterValue encoding:**

| Proto enum value | Wire field in `ParameterValue` | Notes |
|---|---|---|
@@ -937,19 +755,19 @@ ParameterProto {
| `PT_LONG` | `long_value` | |
| `PT_FLOAT` | `float_value` | |
| `PT_DOUBLE` | `double_value` | |
-| `PT_BIG_DECIMAL` | `string_value` | Encode as `"<unscaledInteger> <scale>"` — see §16.1 |
+| `PT_BIG_DECIMAL` | `string_value` | Encode as `"<unscaledInteger> <scale>"` — see §7.2.1 |
| `PT_STRING` | `string_value` | |
| `PT_BYTES` | `bytes_value` | Raw bytes |
| `PT_DATE` | `date_value` | `google.type.Date` (year/month/day, no timezone) |
| `PT_TIME` | `time_value` | `google.type.TimeOfDay` (hours/minutes/seconds/nanos) |
-| `PT_TIMESTAMP` | `timestamp_value` | `TimestampWithZone` — see §17 |
+| `PT_TIMESTAMP` | `timestamp_value` | `TimestampWithZone` — see §7.3 |
| `PT_ASCII_STREAM` | `bytes_value` | ASCII bytes |
| `PT_UNICODE_STREAM` | `bytes_value` | Unicode bytes |
| `PT_BINARY_STREAM` | `bytes_value` | Binary bytes |
| `PT_OBJECT` | varies | Best-effort mapping to one of the concrete value types |
| `PT_CHARACTER_READER` | `string_value` | Contents of the character stream |
| `PT_REF` | `string_value` | REF value as string |
-| `PT_BLOB` | (LOB reference UUID) | Create LOB first (§19); then pass UUID as
`string_value` |
+| `PT_BLOB` | (LOB reference UUID) | Create LOB first (§7.5); then pass UUID as `string_value` |
| `PT_CLOB` | (LOB reference UUID) | Same as BLOB |
| `PT_ARRAY` | `int_array_value` / `long_array_value` / `string_array_value` | Use the typed array message matching element type |
| `PT_URL` | `url_value` (StringValue) | `URL.toExternalForm()` — presence-aware; unset = null |
@@ -959,34 +777,26 @@ ParameterProto {
| `PT_N_CLOB` | (LOB reference UUID) | Same as CLOB |
| `PT_SQL_XML` | `string_value` | XML content as string |

-#### 16.1 BigDecimal encoding
-
-BigDecimal is serialised as a space-separated string: `"<unscaledInteger> <scale>"`.
+#### 7.2.1 BigDecimal encoding

-- `unscaledInteger`: the decimal string representation of the unscaled value (may be negative), e.g. `"-12345"`.
-- `scale`: integer scale (number of decimal places), e.g. `2`.
-- Full value = `unscaledInteger × 10^(-scale)`.
-
-Example: `BigDecimal("123.45")` → `"12345 2"`.
+BigDecimal is serialised as a space-separated string: `"<unscaledInteger> <scale>"`. Example: `BigDecimal("123.45")` yields `"12345 2"`.

> **Note:** A separate binary wire format is documented in `documents/protocol/BIGDECIMAL_WIRE_FORMAT.md` for contexts where binary efficiency is needed.

-#### 16.2 Presence-aware fields
+#### 7.2.2 Presence-aware fields

`url_value`, `rowid_value`, `uuid_value`, `biginteger_value`, `rowidlifetime_value` are all `google.protobuf.StringValue` (a wrapper message). An absent (unset) wrapper means SQL NULL. An empty string inside the wrapper is a valid non-null value.

> **Reference implementation:**
> - `ojp-grpc-commons` — [`ProtoConverter.toProto(Parameter)`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/ProtoConverter.java): converts a host-language `Parameter` object to `ParameterProto`; `fromProto(ParameterProto)` is the inverse.
> - `ProtoConverter.toParameterValue(Object value)`: the central dispatcher that routes each Java type to the correct `ParameterValue` oneof field.
-> - `ProtoConverter.fromParameterValue(ParameterValue, ParameterType)`: decodes a wire value back to a Java object using both the value and the declared type as hints. -> - `ojp-grpc-commons` — [`ProtoTypeConverters`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/ProtoTypeConverters.java): `uuidToProto(UUID)` / `uuidFromProto(StringValue)`, `urlToProto(URL)` / `urlFromProto(StringValue)`, `rowIdToProto(RowId)` / `rowIdBytesFromProto(StringValue)` — handles the presence-aware `StringValue` wrappers for UUID, URL, and RowId. -> - `ojp-grpc-commons` — [`BigDecimalWire`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/BigDecimalWire.java): `writeBigDecimal` / `readBigDecimal` — binary wire encoding for BigDecimal (also see `documents/protocol/BIGDECIMAL_WIRE_FORMAT.md`). +> - `ProtoConverter.fromParameterValue(ParameterValue, ParameterType)`: decodes a wire value back to a Java object. +> - `ojp-grpc-commons` — [`ProtoTypeConverters`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/ProtoTypeConverters.java): `uuidToProto(UUID)` / `uuidFromProto(StringValue)`, `urlToProto(URL)` / `urlFromProto(StringValue)`, `rowIdToProto(RowId)` / `rowIdBytesFromProto(StringValue)`. +> - `ojp-grpc-commons` — [`BigDecimalWire`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/BigDecimalWire.java): `writeBigDecimal` / `readBigDecimal` — binary wire encoding for BigDecimal. --- -## 17. Temporal Type Handling - -### TimestampWithZone +### 7.3 Temporal Type Handling Timestamps are transmitted as: @@ -998,7 +808,7 @@ TimestampWithZone { } ``` -### TemporalType enum +**TemporalType enum:** | Value | Original type | |---|---| @@ -1012,65 +822,28 @@ TimestampWithZone { | `TEMPORAL_TYPE_LOCAL_TIME` | `java.time.LocalTime` | | `TEMPORAL_TYPE_OFFSET_TIME` | `java.time.OffsetTime` | -### Encoding rules - -1. Convert the host-language datetime value to an absolute UTC instant (seconds + nanoseconds since the Unix epoch). -2. 
Record the IANA timezone or UTC offset string. -3. Set `original_type` to the closest matching `TemporalType` enum value. - -### Decoding rules - -On the receiving side, use `original_type` to reconstruct the correct host-language type: -- `TEMPORAL_TYPE_LOCAL_DATE_TIME` / `TEMPORAL_TYPE_TIMESTAMP` → local datetime in the client's timezone. -- `TEMPORAL_TYPE_OFFSET_DATE_TIME` → datetime with offset reconstructed from the `timezone` string. -- `TEMPORAL_TYPE_INSTANT` → UTC instant. -- `TEMPORAL_TYPE_LOCAL_DATE` → date only (no time component). -- `TEMPORAL_TYPE_LOCAL_TIME` / `TEMPORAL_TYPE_OFFSET_TIME` → time-only value with or without offset. - -**Date-only values** use `google.type.Date` (year, month, day — no timezone). -**Time-only values** use `google.type.TimeOfDay` (hours, minutes, seconds, nanos — no timezone). +**Encoding rules:** (1) Convert the host-language datetime value to a UTC instant (seconds + nanoseconds since Unix epoch). (2) Record the IANA timezone or UTC offset string. (3) Set `original_type` to the closest matching `TemporalType` enum value. -### Timezone requirement +**Decoding rules:** Use `original_type` to reconstruct the correct host-language type. `TEMPORAL_TYPE_LOCAL_DATE` uses `google.type.Date`. `TEMPORAL_TYPE_LOCAL_TIME` uses `google.type.TimeOfDay`. -The OJP server must always run with `user.timezone=UTC`. Client libraries should also normalise to UTC when encoding timestamps, using the `timezone` field to carry the original zone for faithful reconstruction. +The OJP server must always run with `user.timezone=UTC`. Client libraries should also normalise to UTC when encoding timestamps. 
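The three encoding rules and the `original_type`-driven decoding can be sketched as follows. This is a minimal illustration, not the generated gRPC classes: the `TimestampWithZone` dataclass and its field names (`epoch_seconds`, `nanos`, `tz`, `original_type`) are stand-ins for the real proto message, and only the `OFFSET_DATE_TIME` branch is shown.

```python
from dataclasses import dataclass
from datetime import datetime, timezone, timedelta

@dataclass
class TimestampWithZone:   # illustrative stand-in, not the generated proto class
    epoch_seconds: int     # rule 1: seconds since the Unix epoch, in UTC
    nanos: int             # rule 1: sub-second nanoseconds
    tz: str                # rule 2: original zone / UTC offset string
    original_type: str     # rule 3: closest TemporalType enum name

def encode(dt: datetime) -> TimestampWithZone:
    # Normalise to an absolute UTC instant; keep the original offset string
    instant = dt.astimezone(timezone.utc)
    seconds = int(instant.replace(microsecond=0).timestamp())
    nanos = instant.microsecond * 1000
    return TimestampWithZone(seconds, nanos, dt.strftime("%z"),
                             "TEMPORAL_TYPE_OFFSET_DATE_TIME")

def decode(twz: TimestampWithZone) -> datetime:
    instant = datetime.fromtimestamp(twz.epoch_seconds, tz=timezone.utc)
    instant = instant.replace(microsecond=twz.nanos // 1000)
    if twz.original_type == "TEMPORAL_TYPE_OFFSET_DATE_TIME":
        # Reconstruct the offset from a "+HHMM" style string
        sign = 1 if twz.tz[0] == "+" else -1
        offset = sign * timedelta(hours=int(twz.tz[1:3]), minutes=int(twz.tz[3:5]))
        return instant.astimezone(timezone(offset))
    return instant  # INSTANT and similar types stay in UTC

original = datetime(2026, 4, 19, 8, 45, 36, 500000,
                    tzinfo=timezone(timedelta(hours=2)))
assert decode(encode(original)) == original
```

Note that the wire value is always the UTC instant; the `tz` string exists only so the receiver can faithfully rebuild the original offset-bearing type.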
> **Reference implementation:** -> - `ojp-grpc-commons` — [`TemporalConverter`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/TemporalConverter.java): the definitive encoding/decoding reference for all temporal types: -> - `toTimestampWithZone(java.sql.Timestamp, ZoneId)` / `fromTimestampWithZone(TimestampWithZone)` — `Timestamp` ↔ `TimestampWithZone`. -> - `calendarToTimestampWithZone(Calendar)` / `timestampWithZoneToCalendar(TimestampWithZone)` — `Calendar`. -> - `offsetDateTimeToTimestampWithZone` / `timestampWithZoneToOffsetDateTime` — `OffsetDateTime`. -> - `localDateTimeToTimestampWithZone` / `timestampWithZoneToLocalDateTime` — `LocalDateTime`. -> - `instantToTimestampWithZone` / `timestampWithZoneToInstant` — `Instant`. -> - `localDateToProtoDate(LocalDate)` / `protoDateToLocalDate(Date)` — `LocalDate` ↔ `google.type.Date`. -> - `localTimeToProtoTimeOfDay(LocalTime)` / `protoTimeOfDayToLocalTime(TimeOfDay)` — `LocalTime` ↔ `google.type.TimeOfDay`. -> - `offsetTimeToTimestampWithZone` / `timestampWithZoneToOffsetTime` — `OffsetTime`. -> - `fromTimestampWithZoneToObject(TimestampWithZone)`: the unified decoder that uses `TemporalType` to reconstruct the original type. +> - `ojp-grpc-commons` — [`TemporalConverter`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/TemporalConverter.java): the definitive encoding/decoding reference for all temporal types including `toTimestampWithZone`, `fromTimestampWithZone`, `calendarToTimestampWithZone`, `offsetDateTimeToTimestampWithZone`, `localDateTimeToTimestampWithZone`, `instantToTimestampWithZone`, `localDateToProtoDate`, `localTimeToProtoTimeOfDay`, `offsetTimeToTimestampWithZone`, and `fromTimestampWithZoneToObject` (the unified decoder). --- -## 18. Result Set and Streaming - -### Consuming executeQuery +### 7.4 Result Set Streaming `executeQuery` is a server-streaming RPC. The response stream contains one or more `OpResult` messages: -1. 
**First `OpResult`**: always contains the initial data batch in `query_result`: - - `resultSetUUID` — server-side handle for this result set. - - `labels` — ordered list of column names. - - `rows` — first batch of `ResultRow` objects, each containing a `ParameterValue` per column. - - `flag` — if `"ROW_BY_ROW"`, the server sends one row per stream message (row-by-row mode); otherwise the initial batch may contain multiple rows. - +1. **First `OpResult`**: contains the initial data batch in `query_result`: `resultSetUUID`, `labels` (ordered column names), `rows` (first batch of `ResultRow` objects), and `flag` (`"ROW_BY_ROW"` for one-row-per-message mode). 2. **Subsequent `OpResult` messages** (only in non-row-by-row streaming mode): additional batches until the stream closes. +3. **`fetchNextRows`**: After the initial stream closes, call `fetchNextRows(ResultSetFetchRequest)` with `resultSetUUID` and a page size. Repeat until the response contains an empty `rows` list. -3. **`fetchNextRows`**: After the initial stream closes, call `fetchNextRows(ResultSetFetchRequest)` with `resultSetUUID` and a page size to fetch additional rows. Repeat until the response contains an empty `rows` list or the result set is exhausted. - -### Column value decoding - -Map each `ParameterValue` oneof to the host language's equivalent type following the inverse of the encoding table in §16. Pay attention to `is_null = true` for SQL NULL values. +Map each `ParameterValue` oneof to the host language's equivalent type following the inverse of the encoding table in §7.2. Pay attention to `is_null = true` for SQL NULL values. 
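The column-decoding step can be sketched like this. Plain dicts stand in for the `ParameterValue` oneof here; real code would use the generated gRPC classes and their oneof accessors, and would dispatch over the full field set of §7.2, not just the handful shown.

```python
# Hypothetical decoder for one row of a query_result; `is_null` wins over
# any value field, yielding SQL NULL.
def decode_value(pv: dict):
    if pv.get("is_null"):
        return None  # SQL NULL regardless of the declared parameter type
    # Dispatch on whichever oneof field is set (inverse of the encoding table)
    for field in ("string_value", "int_value", "long_value",
                  "double_value", "bool_value", "bytes_value"):
        if field in pv:
            return pv[field]
    raise ValueError("unsupported ParameterValue field")

def decode_row(labels, row):
    # labels come from the first OpResult; row is one ResultRow's values
    return dict(zip(labels, (decode_value(pv) for pv in row)))

labels = ["id", "name", "deleted_at"]
row = [{"int_value": 7}, {"string_value": "ada"}, {"is_null": True}]
assert decode_row(labels, row) == {"id": 7, "name": "ada", "deleted_at": None}
```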
-### Cursor navigation - -Scrollable result sets support cursor positioning through `callResource` with `ResourceType.RES_RESULT_SET` and the appropriate `CallType`: +**Cursor navigation** (scrollable result sets) — through `callResource` with `ResourceType.RES_RESULT_SET`: | Cursor operation | CallType | |---|---| @@ -1087,53 +860,45 @@ Scrollable result sets support cursor positioning through `callResource` with `R ### Pseudo-code ```python -# After executeQuery stream closes, fetch additional pages with fetchNextRows -result_set_uuid = ... # captured from the first op_result (§15) +# Fetch additional pages +result_set_uuid = ... # captured from the first op_result all_rows = [] while True: resp = stub.fetchNextRows(ResultSetFetchRequest( - session = session, - resultSetUUID = result_set_uuid, - size = 500 # rows per page + session=session, resultSetUUID=result_set_uuid, size=500 )) session = resp.session if not resp.query_result.rows: - break # no more rows — result set exhausted + break all_rows.extend(resp.query_result.rows) -# Close the result set explicitly when done +# Close the result set stub.callResource(CallResourceRequest( - session = session, - resourceType = RES_RESULT_SET, - resourceUUID = result_set_uuid, - target = TargetCall(callType=CALL_CLOSE) + session=session, resourceType=RES_RESULT_SET, + resourceUUID=result_set_uuid, + target=TargetCall(callType=CALL_CLOSE) )) -# Cursor navigation — jump to an absolute row (scrollable result sets only) +# Absolute cursor navigation (scrollable result sets only) resp = stub.callResource(CallResourceRequest( - session = session, - resourceType = RES_RESULT_SET, - resourceUUID = result_set_uuid, - target = TargetCall( - callType = CALL_ABSOLUTE, - params = [ParameterValue(int_value=10)] # jump to row 10 - ) + session=session, resourceType=RES_RESULT_SET, + resourceUUID=result_set_uuid, + target=TargetCall(callType=CALL_ABSOLUTE, params=[ParameterValue(int_value=10)]) )) -session = resp.session -current_row = 
resp.values # column values for row 10 +session = resp.session +current_row = resp.values ``` > **Reference implementation:** -> - `ojp-jdbc-driver` — [`ResultSet`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/ResultSet.java): `next()` drives the multi-block iteration; `setNextOpResult()` loads a new batch from the iterator; `nextWithSessionUpdate()` updates the session from each block. All `getXxx(columnIndex)` methods call `ProtoConverter.fromParameterValue()` on the column's `ParameterValue`. -> - `ojp-jdbc-driver` — [`RemoteProxyResultSet`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/RemoteProxyResultSet.java): base class holding `resultSetUUID` and `statementService`; all scrollable-cursor operations issue `callResource(RES_RESULT_SET, CALL_FIRST/LAST/ABSOLUTE/…)`. +> - `ojp-jdbc-driver` — [`ResultSet`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/ResultSet.java): `next()` drives the multi-block iteration; all `getXxx(columnIndex)` methods call `ProtoConverter.fromParameterValue()` on the column's `ParameterValue`. +> - `ojp-jdbc-driver` — [`RemoteProxyResultSet`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/RemoteProxyResultSet.java): base class holding `resultSetUUID` and `statementService`; all scrollable-cursor operations issue `callResource(RES_RESULT_SET, ...)`. > - `ojp-jdbc-driver` — [`StatementServiceGrpcClient.fetchNextRows(sessionInfo, resultSetUUID, size)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java): the RPC that fetches the next page. -> - `ojp-grpc-commons` — [`ProtoConverter.fromProto(OpQueryResultProto)`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/ProtoConverter.java): deserialises the initial `OpQueryResult` (labels + rows + resultSetUUID). --- -## 19. 
LOB (Large Object) Handling +### 7.5 LOB (Large Object) Handling -### LOB types +**LOB types:** | LobType enum | Meaning | |---|---| @@ -1144,124 +909,224 @@ current_row = resp.values # column values for row 10 | `LT_UNICODE_STREAM` | Unicode character stream | | `LT_CHARACTER_STREAM` | Generic character stream | -### Writing a LOB (createLob) - -1. Open a client-streaming call to `createLob`. -2. Send one or more `LobDataBlock` messages: - ``` - LobDataBlock { - session: SessionInfo - position: int64 // byte offset of this chunk - data: bytes // chunk content (recommended chunk size: 32–64 KB) - lobType: LobType - metadata: PropertyEntry[] // used for binary streams to carry prepared statement info - } - ``` -3. Close the stream. The server responds with a `LobReference` stream (typically one message): - ``` - LobReference { - session: SessionInfo - uuid: string // LOB handle - bytesWritten: int32 - lobType: LobType - } - ``` -4. Store the `LobReference.uuid`. This UUID is what gets passed as a parameter value (§16) when binding the LOB to a SQL statement. - -### Reading a LOB (readLob) - -Call `readLob(ReadLobRequest)`: -``` -ReadLobRequest { - lobReference: LobReference // uuid + session info - position: int64 // start byte (1-based for JDBC compatibility) - length: int32 // max bytes to return -} -``` -Receive a server-streaming response of `LobDataBlock` messages. Concatenate the `data` fields in order to reconstruct the content. +**Writing a LOB (createLob):** Open a client-streaming call to `createLob`. Send one or more `LobDataBlock` messages (recommended chunk size: 32–64 KB) with `session`, `position` (byte offset), `data`, `lobType`, and optional `metadata`. Close the stream. The server responds with a `LobReference` containing `uuid` (the LOB handle), `bytesWritten`, and `lobType`. Store the `LobReference.uuid` and pass it as a parameter value (§7.2) when binding the LOB. 
-### LOB and session stickiness +**Reading a LOB (readLob):** Call `readLob(ReadLobRequest)` and concatenate the `data` fields in order from the server-streaming response. -LOB handles are server-side objects. A connection that has an open LOB must remain bound to the same server (§6). Do not reroute such connections during failover; instead surface the error to the caller. +LOB handles are server-side objects. A connection with an open LOB must remain bound to the same server (§2.3). Do not reroute during failover. ### Pseudo-code ```python -CHUNK_SIZE = 64 * 1024 # 64 KB recommended chunk size +CHUNK_SIZE = 64 * 1024 -# --- Write a LOB (createLob is client-streaming) --- def write_lob(stub, session, data_bytes, lob_type=LT_BLOB): def generate_blocks(): for offset in range(0, len(data_bytes), CHUNK_SIZE): - yield LobDataBlock( - session = session, - position = offset, - data = data_bytes[offset : offset + CHUNK_SIZE], - lobType = lob_type - ) - lob_ref = stub.createLob(generate_blocks()) # client-streaming → single LobReference - # lob_ref.uuid → the LOB handle; pass as parameter to executeUpdate (see §16) - # lob_ref.bytesWritten → sanity check - return lob_ref.uuid - -# Bind the LOB UUID when executing a statement + yield LobDataBlock(session=session, position=offset, + data=data_bytes[offset:offset+CHUNK_SIZE], lobType=lob_type) + return stub.createLob(generate_blocks()).uuid + lob_uuid = write_lob(stub, session, my_bytes) stub.executeUpdate(StatementRequest( - session = session, - sql = "INSERT INTO docs(content) VALUES(?)", - parameters = [ParameterProto(index=1, type=PT_BLOB, - values=[ParameterValue(string_value=lob_uuid)])] + session=session, sql="INSERT INTO docs(content) VALUES(?)", + parameters=[ParameterProto(index=1, type=PT_BLOB, + values=[ParameterValue(string_value=lob_uuid)])] )) -# --- Read a LOB (readLob is server-streaming) --- def read_lob(stub, session, lob_uuid, lob_type=LT_BLOB, max_bytes=10_000_000): req = ReadLobRequest( - lobReference = 
LobReference(uuid=lob_uuid, session=session, lobType=lob_type), - position = 1, # 1-based start position - length = max_bytes + lobReference=LobReference(uuid=lob_uuid, session=session, lobType=lob_type), + position=1, length=max_bytes ) return b"".join(block.data for block in stub.readLob(req)) - -content = read_lob(stub, session, lob_uuid) ``` > **Reference implementation:** -> - `ojp-jdbc-driver` — [`LobServiceImpl`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/LobServiceImpl.java): `sendBytes(lobType, pos, inputStream)` opens the client-streaming `createLob` call, chunks the data into `LobDataBlock` messages, and returns the `LobReference`. `parseReceivedBlocks(Iterator)` reassembles chunks from a `readLob` stream into an `InputStream`. -> - `ojp-jdbc-driver` — [`StatementServiceGrpcClient.createLob(connection, iterator)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java): the client-streaming gRPC call; uses an async stub and a `CountDownLatch` to bridge the streaming API back to a synchronous return value. -> - `StatementServiceGrpcClient.readLob(lobReference, pos, length)`: the server-streaming gRPC call that returns an `Iterator`. +> - `ojp-jdbc-driver` — [`LobServiceImpl`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/LobServiceImpl.java): `sendBytes(lobType, pos, inputStream)` opens the client-streaming `createLob` call, chunks the data, and returns the `LobReference`. `parseReceivedBlocks(Iterator)` reassembles chunks from a `readLob` stream. +> - `ojp-jdbc-driver` — [`StatementServiceGrpcClient.createLob(connection, iterator)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java): the client-streaming gRPC call. +> - `StatementServiceGrpcClient.readLob(lobReference, pos, length)`: the server-streaming gRPC call. 
> - `ojp-jdbc-driver` — [`Blob`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Blob.java): `getBytes(pos, length)` and `getBinaryStream()` call `readLob`; `setBytes(pos, bytes)` calls `sendBytes`. [`Clob`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Clob.java) mirrors the same pattern for character data. -> - `ojp-jdbc-driver` — [`BinaryStream`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/BinaryStream.java): streams binary content directly via `createLob` without materialising the full byte array. --- -## 20. CallResource Protocol +### 7.6 Transaction Management (non-XA) + +Use the three transaction RPCs — `startTransaction`, `commitTransaction`, `rollbackTransaction` — together with `terminateSession` for connection close. All four update and return `SessionInfo`; always replace the local copy. + +Transaction isolation is set or retrieved by calling `callResource` with `RES_CONNECTION` and `CallType.CALL_SET` / `CALL_GET` and resource name `"TransactionIsolation"`.
+ +### Pseudo-code + +```python +# Begin an explicit transaction +session = stub.startTransaction(session) +# session.transactionInfo.transactionStatus == TRX_ACTIVE + +resp = stub.executeUpdate(StatementRequest(session=session, sql="INSERT INTO orders ...")) +session = resp.session + +session = stub.commitTransaction(session) +# session.transactionInfo.transactionStatus == TRX_COMMITED +# — OR — +session = stub.rollbackTransaction(session) +# session.transactionInfo.transactionStatus == TRX_ROLLBACK + +# Set transaction isolation (READ_COMMITTED = 2) +resp = stub.callResource(CallResourceRequest( + session=session, resourceType=RES_CONNECTION, + target=TargetCall(callType=CALL_SET, resourceName="TransactionIsolation", + params=[ParameterValue(int_value=2)]) +)) +session = resp.session + +# Get current isolation level +resp = stub.callResource(CallResourceRequest( + session=session, resourceType=RES_CONNECTION, + target=TargetCall(callType=CALL_GET, resourceName="TransactionIsolation") +)) +isolation_level = resp.values[0].int_value +session = resp.session +``` + +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`Connection.setAutoCommit(boolean)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Connection.java): calls `commitTransaction` when switching on and `startTransaction` when switching off. +> - `Connection.commit()` / `Connection.rollback()`: delegate to `statementService.commitTransaction(session)` / `rollbackTransaction(session)` when `autoCommit == false`. +> - `Connection.close()`: calls `terminateSession(session)` unconditionally. +> - `Connection.setTransactionIsolation(level)` / `getTransactionIsolation()`: forwarded via `callProxy(CallType.CALL_SET/GET, "TransactionIsolation", ...)`. + +--- + +### 7.7 Savepoints + +Savepoints are implemented through the `callResource` protocol using `ResourceType.RES_SAVEPOINT`. 
+ +**Creating a savepoint:** Call `callResource` with `resourceType = RES_SAVEPOINT`, `target.callType = CALL_SET`, `target.resourceName = "Savepoint"`, `target.params = [savepointName]` if named. The response contains the savepoint UUID in `CallResourceResponse.resourceUUID`. + +**Rolling back to a savepoint:** Call `callResource` with `resourceType = RES_SAVEPOINT`, `resourceUUID` set to the savepoint UUID from creation, `target.callType = CALL_ROLLBACK`. + +**Releasing a savepoint:** Call `callResource` with `resourceType = RES_SAVEPOINT`, `resourceUUID` set to the savepoint UUID from creation, `target.callType = CALL_RELEASE`. + +### Pseudo-code + +```python +# Create a named savepoint +resp = stub.callResource(CallResourceRequest( + session=session, resourceType=RES_SAVEPOINT, + target=TargetCall(callType=CALL_SET, resourceName="Savepoint", + params=[ParameterValue(string_value="my_savepoint")]) +)) +savepoint_uuid = resp.resourceUUID +session = resp.session + +# Roll back to the savepoint +resp = stub.callResource(CallResourceRequest( + session=session, resourceType=RES_SAVEPOINT, resourceUUID=savepoint_uuid, + target=TargetCall(callType=CALL_ROLLBACK, resourceName="Savepoint") +)) +session = resp.session + +# Release the savepoint +resp = stub.callResource(CallResourceRequest( + session=session, resourceType=RES_SAVEPOINT, resourceUUID=savepoint_uuid, + target=TargetCall(callType=CALL_RELEASE, resourceName="Savepoint") +)) +session = resp.session +``` + +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`Connection.setSavepoint()`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Connection.java) / `setSavepoint(name)`: calls `callProxy` with `CALL_SET`, `"Savepoint"`, and the optional name; wraps the returned resource UUID in a [`Savepoint`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/Savepoint.java) object. +> - `Connection.rollback(Savepoint)`: calls `callProxy` with `CALL_ROLLBACK`. +> - `Connection.releaseSavepoint(Savepoint)`: calls `callProxy` with `CALL_RELEASE`.
+ +--- + +### 7.8 XA / Distributed Transactions + +XA support maps the standard XA resource manager protocol to gRPC RPCs. XA connections are always pinned to a single server (§2.3). + +**XA transaction lifecycle:** + +``` +xaStart(XaStartRequest) -- Begin branch; safe to retry on connection error +xaEnd(XaEndRequest) -- End branch; NEVER retry after this point +xaPrepare(XaPrepareRequest) -- Two-phase prepare; returns XA_OK or XA_RDONLY +xaCommit(XaCommitRequest) -- Commit (onePhase=true for one-phase optimisation) +xaRollback(XaRollbackRequest) -- Roll back the branch +xaRecover(XaRecoverRequest) -- List in-doubt XIDs (for recovery after crash) +xaForget(XaForgetRequest) -- Forget a heuristically completed branch +``` + +**Xid encoding (XidProto):** + +| Field | Type | Meaning | +|---|---|---| +| `formatId` | int32 | Transaction format ID | +| `globalTransactionId` | bytes | Global transaction ID (up to 64 bytes) | +| `branchQualifier` | bytes | Branch qualifier (up to 64 bytes) | + +**Retry policy:** `xaStart` only — retry on connection-level errors (see §4.3). All other XA operations must not be retried automatically. Surface failures to the caller's transaction manager. + +On the response to `xaStart`, record the `sessionUUID -> targetServer` binding. All subsequent XA operations for this branch must go to the same server. If that server is unavailable, raise `XAException(XAER_RMFAIL)`. 
+ +### Pseudo-code + +```python +xid = XidProto(formatId=1, globalTransactionId=b"global-tx-001", branchQualifier=b"branch-1") + +resp = stub.xaStart(XaStartRequest(session=session, xid=xid, flags=0)) +session = resp.session # bind session.targetServer for all remaining calls + +resp = stub.executeUpdate(StatementRequest(session=session, sql="UPDATE accounts ...")) +session = resp.session + +resp = stub.xaEnd(XaEndRequest(session=session, xid=xid, flags=0)) +session = resp.session + +prep = stub.xaPrepare(XaPrepareRequest(session=session, xid=xid)) +# prep.result = XA_OK or XA_RDONLY + +stub.xaCommit(XaCommitRequest(session=session, xid=xid, onePhase=False)) +# OR: stub.xaRollback(XaRollbackRequest(session=session, xid=xid)) +# OR: stub.xaCommit(XaCommitRequest(session=session, xid=xid, onePhase=True)) # skip xaPrepare + +# Recovery +resp = stub.xaRecover(XaRecoverRequest(session=session, flag=TMSTARTRSCAN)) +# Forget +stub.xaForget(XaForgetRequest(session=session, xid=xid)) +``` + +> **Reference implementation:** +> - `ojp-jdbc-driver` — [`OjpXAResource`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/xa/OjpXAResource.java): implements `XAResource`; all 10 lifecycle methods; contains the `xaStart` retry loop and the `toXidProto` / `fromXidProto` conversion helpers. +> - `ojp-jdbc-driver` — [`OjpXAConnection`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/xa/OjpXAConnection.java): creates the XA-mode `StatementService` connection (always calling the server, never cache-hit) and vends `OjpXAResource`. +> - `ojp-jdbc-driver` — [`OjpXADataSource`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/xa/OjpXADataSource.java): entry point for XA; calls `MultinodeConnectionManager.connectXA()` to pin the session to a single server. 
+ +--- + +### 7.9 callResource Protocol The `callResource` RPC is a generic mechanism for operations that do not fit a dedicated RPC — primarily `DatabaseMetaData` queries, `ResultSet` cursor/update operations, `Statement` cancellation, savepoint management, and resource lifecycle calls. -### Request +**Request:** ``` CallResourceRequest { session: SessionInfo - resourceType: ResourceType // what kind of resource to call - resourceUUID: string // the server-side handle for this resource - target: TargetCall // the specific operation to perform + resourceType: ResourceType + resourceUUID: string + target: TargetCall properties: PropertyEntry[] } -``` -### TargetCall (supports chaining) - -``` TargetCall { - callType: CallType // one of the 47 call type codes - resourceName: string // e.g., "Catalog", "TransactionIsolation", "Savepoint" - params: ParameterValue[] // input arguments - nextCall: TargetCall // optional chained call (recursive) + callType: CallType + resourceName: string + params: ParameterValue[] + nextCall: TargetCall // optional chained call (recursive) } ``` -### ResourceType values +**ResourceType values:** | Value | Meaning | |---|---| @@ -1273,62 +1138,35 @@ TargetCall { | `RES_CONNECTION` | The connection itself (for metadata, catalog, etc.) 
| | `RES_SAVEPOINT` | A savepoint | -### Response +**CallType reference (47 codes):** -``` -CallResourceResponse { - session: SessionInfo - resourceUUID: string // UUID of a newly created resource, if any - values: ParameterValue[] // return values (may be empty) -} -``` +`CALL_SET`, `CALL_GET`, `CALL_IS`, `CALL_ALL`, `CALL_NULLS`, `CALL_USES`, `CALL_SUPPORTS`, `CALL_STORES`, `CALL_NULL`, `CALL_DOES`, `CALL_DATA`, `CALL_NEXT`, `CALL_CLOSE`, `CALL_WAS`, `CALL_CLEAR`, `CALL_FIND`, `CALL_BEFORE`, `CALL_AFTER`, `CALL_FIRST`, `CALL_LAST`, `CALL_ABSOLUTE`, `CALL_RELATIVE`, `CALL_PREVIOUS`, `CALL_ROW`, `CALL_UPDATE`, `CALL_INSERT`, `CALL_DELETE`, `CALL_REFRESH`, `CALL_CANCEL`, `CALL_MOVE`, `CALL_OWN`, `CALL_OTHERS`, `CALL_UPDATES`, `CALL_DELETES`, `CALL_INSERTS`, `CALL_LOCATORS`, `CALL_AUTO`, `CALL_GENERATED`, `CALL_RELEASE`, `CALL_NATIVE`, `CALL_PREPARE`, `CALL_ROLLBACK`, `CALL_ABORT`, `CALL_EXECUTE`, `CALL_ADD`, `CALL_ENQUOTE`, `CALL_REGISTER`, `CALL_LENGTH` Always update the local `SessionInfo` from `response.session`. 
-### CallType reference (47 codes) - -`CALL_SET`, `CALL_GET`, `CALL_IS`, `CALL_ALL`, `CALL_NULLS`, `CALL_USES`, `CALL_SUPPORTS`, `CALL_STORES`, `CALL_NULL`, `CALL_DOES`, `CALL_DATA`, `CALL_NEXT`, `CALL_CLOSE`, `CALL_WAS`, `CALL_CLEAR`, `CALL_FIND`, `CALL_BEFORE`, `CALL_AFTER`, `CALL_FIRST`, `CALL_LAST`, `CALL_ABSOLUTE`, `CALL_RELATIVE`, `CALL_PREVIOUS`, `CALL_ROW`, `CALL_UPDATE`, `CALL_INSERT`, `CALL_DELETE`, `CALL_REFRESH`, `CALL_CANCEL`, `CALL_MOVE`, `CALL_OWN`, `CALL_OTHERS`, `CALL_UPDATES`, `CALL_DELETES`, `CALL_INSERTS`, `CALL_LOCATORS`, `CALL_AUTO`, `CALL_GENERATED`, `CALL_RELEASE`, `CALL_NATIVE`, `CALL_PREPARE`, `CALL_ROLLBACK`, `CALL_ABORT`, `CALL_EXECUTE`, `CALL_ADD`, `CALL_ENQUOTE`, `CALL_REGISTER`, `CALL_LENGTH` - ### Pseudo-code ```python -# --- Get the database catalog name (connection-level metadata) --- +# Get database catalog resp = stub.callResource(CallResourceRequest( - session = session, - resourceType = RES_CONNECTION, - resourceUUID = "", # empty for connection-level calls - target = TargetCall(callType=CALL_GET, resourceName="Catalog") + session=session, resourceType=RES_CONNECTION, resourceUUID="", + target=TargetCall(callType=CALL_GET, resourceName="Catalog") )) catalog_name = resp.values[0].string_value -session = resp.session # always update local session - -# --- Check a database capability --- -resp = stub.callResource(CallResourceRequest( - session = session, - resourceType = RES_CONNECTION, - target = TargetCall(callType=CALL_SUPPORTS, resourceName="Transactions") -)) -supports_transactions = resp.values[0].bool_value -session = resp.session +session = resp.session -# --- Cancel a running statement --- +# Cancel a running statement resp = stub.callResource(CallResourceRequest( - session = session, - resourceType = RES_STATEMENT, - resourceUUID = statement_uuid, # UUID of the statement to cancel - target = TargetCall(callType=CALL_CANCEL) + session=session, resourceType=RES_STATEMENT, resourceUUID=statement_uuid, + 
target=TargetCall(callType=CALL_CANCEL) )) session = resp.session -# --- Chained call: get Schema and Catalog in one round-trip --- +# Chained call: get Schema and Catalog in one round-trip resp = stub.callResource(CallResourceRequest( - session = session, - resourceType = RES_CONNECTION, - target = TargetCall( - callType = CALL_GET, - resourceName = "Schema", - nextCall = TargetCall(callType=CALL_GET, resourceName="Catalog") - ) + session=session, resourceType=RES_CONNECTION, + target=TargetCall(callType=CALL_GET, resourceName="Schema", + nextCall=TargetCall(callType=CALL_GET, resourceName="Catalog")) )) schema_name = resp.values[0].string_value catalog_name = resp.values[1].string_value @@ -1337,66 +1175,22 @@ session = resp.session > **Reference implementation:** > - `ojp-jdbc-driver` — [`StatementServiceGrpcClient.callResource(CallResourceRequest)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/StatementServiceGrpcClient.java): the single-node gRPC call. -> - `ojp-jdbc-driver` — [`DatabaseMetaData`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/DatabaseMetaData.java): every `DatabaseMetaData` method (>200 in total) is implemented by calling `callResource` with `RES_CONNECTION` and the appropriate `CallType` (e.g., `CALL_GET` for `getURL()`, `CALL_SUPPORTS` for `supportsXxx()`, `CALL_STORES` for `storesXxx()`). The private helper `newCallBuilder()` creates the skeleton `CallResourceRequest`. -> - `ojp-jdbc-driver` — `Connection.callProxy(callType, resourceName, returnType, params)`: the private convenience wrapper used throughout `Connection` and `DatabaseMetaData` to issue `callResource` calls without building the full request proto by hand. - ---- - -## 21. Error and Exception Mapping - -### SQL errors carried in gRPC trailers - -When the server encounters a SQL error, it returns `Status.INTERNAL` with a `SqlErrorResponse` message attached to the trailing metadata. Extract it using the proto message key for `SqlErrorResponse`. 
- -``` -SqlErrorResponse { - reason: string // human-readable message - sqlState: string // ANSI SQL state code - vendorCode: int32 // database-specific error code - sqlErrorType: SqlErrorType // SQL_EXCEPTION or SQL_DATA_EXCEPTION -} -``` - -Map to the host language's exception hierarchy: -- `SQL_EXCEPTION` → standard SQL exception. -- `SQL_DATA_EXCEPTION` → data-specific SQL exception (subtype). - -### Error classification matrix - -| Condition | gRPC status | Client action | -|---|---|---| -| SQL error (bad query, constraint, etc.) | `INTERNAL` + `SqlErrorResponse` trailer | Throw SQL exception; do not retry; do not mark server unhealthy | -| Pool not found (server restarted) | `NOT_FOUND` | Invalidate connHash cache; reconnect; retry once (§4) | -| Server unreachable | `UNAVAILABLE` | Failover to next server (§8) | -| Request timeout | `DEADLINE_EXCEEDED` | Failover to next server (§8) | -| Client-side cancellation | `CANCELLED` | Do **not** failover; do **not** mark server unhealthy; surface to caller | -| Pool exhausted | `RESOURCE_EXHAUSTED` | Throw pool-exhaustion error; do not retry; do not mark server unhealthy | -| Session invalidated (server failure) | Session-not-found message | Throw session-lost error; do not retry; let caller decide | -| Session stickiness violation (server down) | Local check before RPC | Throw connection error immediately; do not reroute | - -> **Note:** Before this classification was established (prior to April 2026) the server incorrectly used `Status.CANCELLED` for SQL errors. The correct status is `Status.INTERNAL` with a `SqlErrorResponse` trailer. Any implementation must use `INTERNAL` for SQL errors and must not treat `CANCELLED` as a server failure. 
- -> **Reference implementation:** -> - `ojp-jdbc-driver` — [`GrpcExceptionHandler.handle(StatusRuntimeException)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/GrpcExceptionHandler.java): extracts `SqlErrorResponse` from gRPC trailing metadata on `Status.INTERNAL` and throws the appropriate `SQLException` with SQL state and vendor code. -> - `GrpcExceptionHandler.isPoolNotFoundException(exception)`: returns `true` for `NOT_FOUND`. -> - `GrpcExceptionHandler.isSessionInvalidationError(exception)`: returns `true` for session-invalidation error messages. -> - `GrpcExceptionHandler.isConnectionLevelError(exception)`: returns `true` for `UNAVAILABLE`, `DEADLINE_EXCEEDED`, and connection-related `UNKNOWN` errors. +> - `ojp-jdbc-driver` — [`DatabaseMetaData`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/DatabaseMetaData.java): every `DatabaseMetaData` method (>200 in total) is implemented by calling `callResource` with `RES_CONNECTION` and the appropriate `CallType`. +> - `ojp-jdbc-driver` — `Connection.callProxy(callType, resourceName, returnType, params)`: the private convenience wrapper for issuing `callResource` calls. --- -## 22. Configuration System +### 7.10 Configuration System -### Configuration sources (in priority order) +**Configuration sources (in priority order):** -1. **System / environment properties** (highest priority) — e.g., `-Dojp.health.check.interval=10000` or environment variable equivalents. +1. **System / environment properties** (highest priority) — e.g., `-Dojp.health.check.interval=10000`. 2. **`ojp.properties` file** — loaded from the classpath or a well-known filesystem path. 3. **Built-in defaults** (lowest priority). -### Property namespacing - -Properties can be global or per-datasource. 
Per-datasource properties are prefixed with the datasource name: +**Property namespacing:** -``` +```properties # Global ojp.health.check.interval=5000 @@ -1404,7 +1198,7 @@ ojp.health.check.interval=5000 analytics.ojp.health.check.interval=10000 ``` -### Standard configuration properties +**Standard configuration properties:** | Property | Default | Meaning | |---|---|---| @@ -1421,27 +1215,20 @@ analytics.ojp.health.check.interval=10000 | `ojp.grpc.tls.enabled` | `false` | Enable TLS on gRPC channels | | `ojp.grpc.tls.cert.path` | — | Path to client certificate for mTLS | -### Duration format - -Duration values support the following suffixes: -- No suffix — milliseconds (e.g. `5000`) -- `ms` — milliseconds (e.g. `500ms`) -- `s` — seconds (e.g. `10s`) -- `m` — minutes (e.g. `2m`) +**Duration format** — values support: no suffix = milliseconds; `ms` = milliseconds; `s` = seconds; `m` = minutes. > **Reference implementation:** -> - `ojp-jdbc-driver` — [`DatasourcePropertiesLoader`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/DatasourcePropertiesLoader.java): `loadOjpPropertiesForDataSource(datasourceName)` merges file properties, system properties, and environment variables with per-datasource prefix resolution. `loadOjpProperties()` loads the base `ojp.properties` file from the classpath. -> - `ojp-jdbc-driver` — [`HealthCheckConfig`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/HealthCheckConfig.java): the strongly-typed POJO that holds all health-check and redistribution settings, populated by `MultinodeUrlParser` from the loaded `Properties`. -> - `ojp-jdbc-driver` — [`MultinodeUrlParser.readIntProperty(props, key, default)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/MultinodeUrlParser.java) / `readLongProperty(...)`: reads typed values from the merged `Properties` object. 
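A sketch of assembling indexed rule properties for the `connect()` properties map, together with the §22 duration format (helper and key-layout names follow the tables above, but the functions are illustrative, not the `CacheConfigurationBuilder` API):

```python
def parse_duration_seconds(value: str) -> int:
    """Parse the duration format from §22: bare number = milliseconds,
    plus ms / s / m suffixes. Sub-second values truncate to 0 seconds."""
    v = value.strip()
    if v.endswith("ms"):
        return int(v[:-2]) // 1000
    if v.endswith("s"):
        return int(v[:-1])
    if v.endswith("m"):
        return int(v[:-1]) * 60
    return int(v) // 1000  # no suffix: milliseconds

def build_cache_properties(rules: list) -> dict:
    """Flatten rule dicts into 1-based indexed ojp.cache.queries.N.* keys,
    matching the property layout sent to the server on connect()."""
    props = {"ojp.cache.enabled": "true"}
    for i, rule in enumerate(rules, start=1):
        for field, value in rule.items():
            props[f"ojp.cache.queries.{i}.{field}"] = value
    return props
```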
-> - `ojp-grpc-commons` — [`GrpcClientConfig.load()`](../../ojp-grpc-commons/src/main/java/org/openjproxy/config/GrpcClientConfig.java): loads the gRPC-specific settings (max inbound message size, TLS config) from `ojp.properties`. +> - `ojp-jdbc-driver` — [`DatasourcePropertiesLoader`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/DatasourcePropertiesLoader.java): `loadOjpPropertiesForDataSource(datasourceName)` merges file properties, system properties, and environment variables with per-datasource prefix resolution. +> - `ojp-jdbc-driver` — [`HealthCheckConfig`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/grpc/client/HealthCheckConfig.java): the strongly-typed POJO holding all health-check and redistribution settings. +> - `ojp-grpc-commons` — [`GrpcClientConfig.load()`](../../ojp-grpc-commons/src/main/java/org/openjproxy/config/GrpcClientConfig.java): loads gRPC-specific settings (max inbound message size, TLS config) from `ojp.properties`. --- -## 23. Query Result Caching +### 7.11 Query Result Caching Cache configuration is entirely **client-side to server** — the client reads local cache rules and sends them to the server as `ConnectionDetails.properties` entries during `connect()`. The server applies them transparently; the client does not implement any caching logic itself. -### Properties sent to the server +**Properties sent to the server:** | Property key | Meaning | |---|---| @@ -1453,7 +1240,7 @@ Cache configuration is entirely **client-side to server** — the client reads l `` is a 1-based integer index. Rules are processed in index order. 
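The channel-mode decision reduces to a small helper (a sketch only; the real TLS setup builds an `SslContext` from the configured certificate paths):

```python
def channel_mode(tls_enabled: bool, cert_path=None) -> str:
    """Decide which kind of gRPC channel to build from the TLS settings.

    Returns one of: 'plaintext', 'tls', 'mtls'. In a real client this
    selects between insecure and TLS channel credentials, loading the
    client certificate from cert_path when mutual TLS is configured.
    """
    if not tls_enabled:
        return "plaintext"
    return "mtls" if cert_path else "tls"
```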
-### Example configuration +**Example configuration:** ```properties ojp.cache.enabled=true @@ -1466,77 +1253,49 @@ ojp.cache.queries.2.invalidateOn=users ``` > **Reference implementation:** -> - `ojp-jdbc-driver` — [`CacheConfigurationBuilder.addCachePropertiesToMap(propertiesMap, datasourceName)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/CacheConfigurationBuilder.java): reads cache rules from the loaded `Properties` and appends them to the `ConnectionDetails.properties` map that is sent to the server on `connect()`. `parseDurationToSeconds(duration)` handles the same duration format as §22. +> - `ojp-jdbc-driver` — [`CacheConfigurationBuilder.addCachePropertiesToMap(propertiesMap, datasourceName)`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/CacheConfigurationBuilder.java): reads cache rules from the loaded `Properties` and appends them to the `ConnectionDetails.properties` map sent to the server on `connect()`. --- -## 24. Security / Transport - -### Plaintext (default) - -Create a plaintext gRPC channel targeting `dns:///host:port`. This is suitable for internal networks or local development. - -### TLS +### 7.12 Security / Transport -When `ojp.grpc.tls.enabled = true`, create a TLS-secured channel: -- Use the platform's default trust store or a custom CA certificate. -- Support mutual TLS (mTLS) when `ojp.grpc.tls.cert.path` is set. -- Certificate paths and key material must be loaded from configurable filesystem paths; do not hard-code them. +**Plaintext (default):** Create a plaintext gRPC channel targeting `dns:///host:port`. Suitable for internal networks or local development. -### Credential handling +**TLS:** When `ojp.grpc.tls.enabled = true`, create a TLS-secured channel. Use the platform's default trust store or a custom CA certificate. Support mutual TLS (mTLS) when `ojp.grpc.tls.cert.path` is set. Certificate paths and key material must be loaded from configurable filesystem paths; do not hard-code them. 
-- Passwords must never be logged or included in exception messages. -- Connection keys used for cache lookups (§4) may include the password as a cache key only — they must not be serialised or persisted. +**Credential handling:** Passwords must never be logged or included in exception messages. Connection keys used for cache lookups (§4.1) may include the password as a cache key only — they must not be serialised or persisted. > **Reference implementation:** -> - `ojp-grpc-commons` — [`GrpcChannelFactory.createChannel(host, port)`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/GrpcChannelFactory.java): creates a plaintext `ManagedChannel` with configurable max inbound message size; `createSecureChannel(host, port, size, tlsConfig)` builds the TLS-secured variant; `buildSslContext(tlsConfig)` sets up Netty's `SslContext` from the certificate paths. -> - `ojp-grpc-commons` — [`GrpcClientConfig`](../../ojp-grpc-commons/src/main/java/org/openjproxy/config/GrpcClientConfig.java): loaded by `GrpcClientConfig.load()` from `ojp.properties`; exposes `getTlsConfig()` and `getMaxInboundMessageSize()`. +> - `ojp-grpc-commons` — [`GrpcChannelFactory.createChannel(host, port)`](../../ojp-grpc-commons/src/main/java/org/openjproxy/grpc/GrpcChannelFactory.java): creates a plaintext `ManagedChannel`; `createSecureChannel(host, port, size, tlsConfig)` builds the TLS-secured variant. +> - `ojp-grpc-commons` — [`GrpcClientConfig`](../../ojp-grpc-commons/src/main/java/org/openjproxy/config/GrpcClientConfig.java): loaded from `ojp.properties`; exposes `getTlsConfig()` and `getMaxInboundMessageSize()`. > - `ojp-grpc-commons` — [`TlsConfig`](../../ojp-grpc-commons/src/main/java/org/openjproxy/config/TlsConfig.java): holds `enabled`, `certPath`, `keyPath`, `caPath`, and `clientAuth` flags. --- -## 25. 
DataSource / Integration API +### 7.13 DataSource / Integration API -### DataSource wrapper +Provide a higher-level `DataSource` (or equivalent) object that holds connection configuration and exposes a `getConnection()` method. Integrate cleanly with the host language's database access conventions (e.g., Python's DB-API 2.0, Go's `database/sql`, Node.js connection objects). -Provide a higher-level `DataSource` (or equivalent) object that: -- Holds connection configuration (URL, user, password, properties). -- Exposes a `getConnection()` method that calls `Driver.connect()` internally. -- Integrates cleanly with the host language's database access conventions (e.g., Python's `DB-API 2.0`, Go's `database/sql`, Node.js connection objects). - -### Framework integration (Spring Boot example) - -For Java/Spring Boot: -- Provide a `spring-boot-starter-ojp` auto-configuration module. -- Auto-configure an `OjpDataSource` bean when the driver is on the classpath. -- Expose a bridge (`OjpSystemPropertiesBridge`) that copies Spring Boot `application.yml` properties to JVM system properties so the configuration system (§22) can pick them up. -- **Disable** the framework's own built-in connection pool (e.g., HikariCP in Spring Boot) when OJP is in use — double-pooling is the most common misconfiguration and causes incorrect behaviour. - -For other languages, document clearly in the library README that the application-side connection pool must be disabled when using OJP. +**Disable** the framework's own built-in connection pool when OJP is in use. For Spring Boot (Java), provide a `spring-boot-starter-ojp` auto-configuration module that excludes `DataSourceAutoConfiguration`. For other languages, document this clearly in the library README. 
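For a DB-API 2.0 host language the wrapper can stay very small; a sketch with hypothetical names (the connect function is injected here only to keep the example self-contained):

```python
class OjpDataSource:
    """Minimal DataSource sketch: holds connection configuration and
    hands out connections on demand. connect_fn stands in for the
    client library's real connect entry point (hypothetical)."""

    def __init__(self, url, user, password, connect_fn):
        self.url = url
        self.user = user
        self.password = password
        self._connect = connect_fn

    def get_connection(self):
        # Delegates to the driver-level connect, as the JDBC reference
        # implementation delegates to DriverManager.getConnection().
        return self._connect(self.url, self.user, self.password)
```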
> **Reference implementation:** -> - `ojp-jdbc-driver` — [`OjpDataSource`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/OjpDataSource.java): implements `javax.sql.DataSource`; `getConnection()` / `getConnection(user, password)` delegate to `DriverManager.getConnection(url, info)` which invokes the registered `Driver`. -> - `ojp-jdbc-driver` — [`OjpXADataSource`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/xa/OjpXADataSource.java): implements `javax.sql.XADataSource`; `getXAConnection()` creates an `OjpXAConnection` (and thus an `OjpXAResource`) for JTA integration. -> - `spring-boot-starter-ojp` module: provides the Spring Boot auto-configuration class and the `OjpSystemPropertiesBridge` bean; sets `spring.datasource.type=OjpDataSource` and excludes `DataSourceAutoConfiguration` to prevent double-pooling. +> - `ojp-jdbc-driver` — [`OjpDataSource`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/OjpDataSource.java): implements `javax.sql.DataSource`; `getConnection()` delegates to `DriverManager.getConnection(url, info)`. +> - `ojp-jdbc-driver` — [`OjpXADataSource`](../../ojp-jdbc-driver/src/main/java/org/openjproxy/jdbc/xa/OjpXADataSource.java): implements `javax.sql.XADataSource`; `getXAConnection()` creates an `OjpXAConnection` for JTA integration. +> - `spring-boot-starter-ojp` module: provides the Spring Boot auto-configuration class and the `OjpSystemPropertiesBridge` bean; excludes `DataSourceAutoConfiguration` to prevent double-pooling. --- -## 26. Testing Coverage +## 8. Testing Coverage A conformant client implementation must ship a test suite that exercises all the aspects above. Tests that require a live OJP server (and optionally a real database) should be **gated behind feature flags** so the suite can run incrementally in CI. -### Test infrastructure requirements - -- A running OJP server (see `ojp-server` module and `download-drivers.sh`). 
-- At minimum, an embedded/in-process database (e.g., H2) for fast baseline tests. -- Optional: containerised databases (PostgreSQL, MySQL, MariaDB, Oracle, SQL Server, DB2, CockroachDB) gated by per-database flags. +**Test infrastructure requirements:** A running OJP server (see `ojp-server` module and `download-drivers.sh`). At minimum, an embedded/in-process database (e.g., H2) for fast baseline tests. Optional: containerised databases gated by per-database flags. -### Test categories and required scenarios +### Required Test Scenarios #### Basic CRUD - SELECT, INSERT, UPDATE, DELETE via plain Statement and PreparedStatement. -- Verify affected row counts, returned ResultSet contents. -- Verify empty result sets are handled correctly. +- Verify affected row counts, returned ResultSet contents, and empty result sets. #### Multiple data types - Round-trip every `ParameterTypeProto` value through INSERT + SELECT. @@ -1545,7 +1304,7 @@ A conformant client implementation must ship a test suite that exercises all the #### Statement variants - Plain `Statement`: `executeQuery`, `executeUpdate`, `execute`, `executeBatch`, `getResultSet`, `getUpdateCount`, `getGeneratedKeys`, `cancel`, `close`. - `PreparedStatement`: all `setXxx` methods, `executeBatch`, multiple executions with the same prepared statement, `getParameterMetaData`. -- `CallableStatement`: IN, OUT, INOUT parameters; `registerOutParameter`; retrieval of OUT values after execution; named parameters where supported. +- `CallableStatement`: IN, OUT, INOUT parameters; `registerOutParameter`; retrieval of OUT values after execution. #### ResultSet navigation - Forward-only cursors: `next()`, `wasNull()`, `close()`. @@ -1556,8 +1315,7 @@ A conformant client implementation must ship a test suite that exercises all the - `getColumnCount()`, `getColumnName()`, `getColumnType()`, `getColumnTypeName()`, `getPrecision()`, `getScale()`, `isNullable()`, `isAutoIncrement()`. 
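The round-trip pattern behind the data-type tests can be illustrated against in-process sqlite3 as a stand-in; a conformance suite would run the same shape through the OJP client under test:

```python
import sqlite3

def roundtrip(value):
    """Insert a value through a parameterised statement, read it back,
    and return it, so equality with the input can be asserted."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (v)")
    conn.execute("INSERT INTO t (v) VALUES (?)", (value,))
    row = conn.execute("SELECT v FROM t").fetchone()
    conn.close()
    return row[0]
```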
#### DatabaseMetaData -- `getTables()`, `getColumns()`, `getPrimaryKeys()`, `getIndexInfo()`, `getProcedures()`, `getTypeInfo()`, `supportsXxx()` methods. -- Verify results match the actual database schema. +- `getTables()`, `getColumns()`, `getPrimaryKeys()`, `getIndexInfo()`, `getProcedures()`, `getTypeInfo()`, `supportsXxx()` methods. Verify results match the actual database schema. #### Transactions - Commit: insert rows in a transaction, commit, verify rows persist. @@ -1566,9 +1324,7 @@ A conformant client implementation must ship a test suite that exercises all the - Transaction isolation level: set, verify via `getTransactionIsolation()`, reset after connection return. #### Savepoints -- Create a named and an anonymous savepoint. -- Rollback to each; verify partial rollback semantics. -- Release a savepoint. +- Create a named and an anonymous savepoint. Rollback to each; verify partial rollback semantics. Release a savepoint. #### XA transactions - Full lifecycle: `xaStart`, `xaEnd`, `xaPrepare`, `xaCommit`. @@ -1597,7 +1353,7 @@ A conformant client implementation must ship a test suite that exercises all the - With two or more server endpoints, open `N` connections and verify they are distributed across servers (round-robin and least-connections modes separately). #### Multinode failover -- Kill one server mid-operation; verify the operation is retried on a surviving server (for stateless operations). +- Simulate one server going down mid-operation; verify the operation is retried on a surviving server (for stateless operations). - Verify a server is marked unhealthy after failure. - Verify subsequent connections avoid the unhealthy server. @@ -1639,9 +1395,9 @@ A conformant client implementation must ship a test suite that exercises all the #### Performance / mini stress - Open and close 100–1000 connections in parallel; verify no connection leaks, no deadlocks, and no degrading error rate. 
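Per-database gating can be honoured with a tiny helper (the environment-variable convention shown is illustrative):

```python
import os

def flag_enabled(name: str) -> bool:
    """True when a test feature flag is switched on via an environment
    variable, e.g. OJP_TEST_POSTGRES=true (name illustrative). Tests
    gated this way are skipped unless the flag is explicitly enabled."""
    return os.environ.get(name, "false").strip().lower() in ("true", "1", "yes")
```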
-#### Database-specific test suites +### Database-specific test suites -Each database must have a dedicated test class gated by its own flag. The class must cover the full set of above scenarios for that database's specific SQL dialect, type system, and edge cases. +Each database must have a dedicated test class gated by its own flag. | Database | Feature flag | |---|---| @@ -1668,7 +1424,7 @@ H2 tests (in-process, no external dependency) must always be runnable in CI with > | Transactions | `H2ConnectionExtensiveTests`, [`TransactionIsolationResetTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/TransactionIsolationResetTest.java) | > | Savepoints | `H2SavepointTests` (and per-DB `*SavepointTests`) | > | XA transactions | [`PostgresXAIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/PostgresXAIntegrationTest.java), `MySQLXAIntegrationTest`, `MariaDBXAIntegrationTest`, `OracleXAIntegrationTest`, `SqlServerXAIntegrationTest`, `Db2XAIntegrationTest`, [`XASessionInvalidationTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/XASessionInvalidationTest.java) | -> | LOBs | [`BlobIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/BlobIntegrationTest.java), [`BinaryStreamIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/BinaryStreamIntegrationTest.java), [`HydratedLobValidationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/HydratedLobValidationTest.java) (and per-DB `*Blob*` / `*BinaryStream*`) | +> | LOBs | [`BlobIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/BlobIntegrationTest.java), [`BinaryStreamIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/BinaryStreamIntegrationTest.java), [`HydratedLobValidationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/HydratedLobValidationTest.java) (and per-DB variants) | > | Session affinity | 
[`H2SessionAffinityIntegrationTest`](../../ojp-jdbc-driver/src/test/java/openjproxy/jdbc/H2SessionAffinityIntegrationTest.java) (and per-DB `*SessionAffinity*`) | > | Multi-block result sets | `H2ReadMultipleBlocksOfDataIntegrationTest` (and per-DB) | > | Multinode load balancing | [`LoadAwareServerSelectionTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/LoadAwareServerSelectionTest.java), [`MultinodeIntegrationTest`](../../ojp-jdbc-driver/src/test/java/org/openjproxy/grpc/client/MultinodeIntegrationTest.java) | From 045e484f447b5e386e01660dffda6250dccb3543 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Mon, 20 Apr 2026 10:00:45 +0000 Subject: [PATCH 09/12] docs: replace pool-specific "HikariCP" references with pool-agnostic wording in spec files Agent-Logs-Url: https://github.com/Open-J-Proxy/ojp/sessions/9a44563c-e1eb-4652-8136-ff1bd686aa0c Co-authored-by: rrobetti <7221783+rrobetti@users.noreply.github.com> --- documents/multi-language-client-spec/CLIENT_SPEC.md | 8 ++++---- documents/multi-language-client-spec/CLIENT_SPEC_AI.md | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/documents/multi-language-client-spec/CLIENT_SPEC.md b/documents/multi-language-client-spec/CLIENT_SPEC.md index 8fff240fb..8d5c447c2 100644 --- a/documents/multi-language-client-spec/CLIENT_SPEC.md +++ b/documents/multi-language-client-spec/CLIENT_SPEC.md @@ -52,7 +52,7 @@ ## 1. Overview -OJP (Open J Proxy) is a JDBC Type 3 proxy. Its central idea is that real database connections are owned exclusively by the OJP server, which manages them in HikariCP connection pools. Client applications communicate with the server via gRPC rather than opening direct database connections. +OJP (Open J Proxy) is a JDBC Type 3 proxy. 
Its central idea is that real database connections are owned exclusively by the OJP server, which manages them in pluggable connection pools (HikariCP by default, replaceable via SPI). Client applications communicate with the server via gRPC rather than opening direct database connections. ``` [Application] ──native API──> [OJP Client Library] ──gRPC/HTTP2──> [OJP Server] ──JDBC──> [Database] @@ -72,7 +72,7 @@ Before diving into implementation details, understand these four foundational id ### 2.1 Virtual Connections -An OJP "connection" is not a real database connection. The real JDBC connections are held exclusively in the server's HikariCP pool. What the client holds is a `SessionInfo` — a lightweight proto message containing a `connHash` (a pool identifier), the `clientUUID`, and (once assigned) a `sessionUUID`. +An OJP "connection" is not a real database connection. The real JDBC connections are held exclusively in the server's connection pool. What the client holds is a `SessionInfo` — a lightweight proto message containing a `connHash` (a pool identifier), the `clientUUID`, and (once assigned) a `sessionUUID`. Opening a connection is cheap. For non-XA connections after the first one, the client can satisfy the `connect()` call entirely from a local cache: it looks up the `connHash` for the given database credentials and builds the `SessionInfo` locally without making any gRPC call. This means connection acquisition for cached credentials costs only a hash-map lookup. @@ -96,7 +96,7 @@ If the bound server becomes unhealthy while a sticky session is open, the client ### 2.4 Client vs. Server Responsibilities -**The server owns:** real JDBC connections and HikariCP pool management; transaction state; LOB storage; server-side cursor state; query result caching; slow-query slot management; pool resizing in response to cluster health changes. 
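The cached connect path can be sketched as follows (field and method names are illustrative; the real token is the `SessionInfo` proto message, and the connHash is computed server-side as SHA-256 over url + user + password + datasource name, with the exact concatenation and encoding here being an assumption):

```python
import hashlib

def conn_hash(url, user, password, datasource_name):
    """Server-side pool key: SHA-256 over the credential tuple."""
    raw = url + user + password + datasource_name
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

class ConnHashCache:
    """Client-side cache: credentials -> connHash, so repeat connect()
    calls build a SessionInfo-like token locally without any RPC."""

    def __init__(self):
        self._cache = {}

    def session_info_or_none(self, key, client_uuid):
        h = self._cache.get(key)
        if h is None:
            return None  # cache miss: caller must RPC connect() and store()
        # sessionUUID stays empty until the server assigns one.
        return {"connHash": h, "clientUUID": client_uuid, "sessionUUID": ""}

    def store(self, key, conn_hash_value):
        self._cache[key] = conn_hash_value
```

On a cache hit, connection acquisition is a single dictionary lookup, which is what makes virtual connections cheap.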
+**The server owns:** real JDBC connections and connection pool management (pool implementation is pluggable via SPI); transaction state; LOB storage; server-side cursor state; query result caching; slow-query slot management; pool resizing in response to cluster health changes. **The client owns:** `SessionInfo` propagation (attach current `SessionInfo` to every request; replace with response); `connHash` caching; endpoint health tracking; load balancing; failover; cluster health string building and pushing to surviving servers; session stickiness enforcement (`sessionUUID → targetServer` binding); background health-check task; connection redistribution after server recovery. @@ -521,7 +521,7 @@ When a failed server comes back online, rebalance client-side connections so tha **Procedure on recovery:** -1. **Reinitialize pools on the recovered server first** (before marking healthy). For every cached `connHash`/`ConnectionDetails` pair, call `connect()` on the recovered server so it creates the HikariCP pool immediately. This closes the NOT_FOUND window between marking the server healthy and the first SQL call reaching it. +1. **Reinitialize pools on the recovered server first** (before marking healthy). For every cached `connHash`/`ConnectionDetails` pair, call `connect()` on the recovered server so it pre-warms the connection pool immediately. This closes the NOT_FOUND window between marking the server healthy and the first SQL call reaching it. 2. Mark the server healthy (`endpoint.markHealthy()`). 3. Push the updated cluster health string to all healthy servers (see §3.5). 4. 
If redistribution is enabled (`ojp.redistribution.enabled = true`), begin rebalancing: diff --git a/documents/multi-language-client-spec/CLIENT_SPEC_AI.md b/documents/multi-language-client-spec/CLIENT_SPEC_AI.md index faa08ac6c..857c1a671 100644 --- a/documents/multi-language-client-spec/CLIENT_SPEC_AI.md +++ b/documents/multi-language-client-spec/CLIENT_SPEC_AI.md @@ -16,7 +16,7 @@ | **Server** | An OJP server instance exposing `StatementService` and `EchoService` via gRPC. | | **Endpoint** | A `host:port` pair identifying one Server. | | **Virtual Connection** | A client-side object representing logical access to a database pool, identified by a `SessionInfo` token. Does not correspond 1:1 to a real database connection. | -| **Real Connection** | A JDBC connection held by the Server's HikariCP pool. The Client never holds one directly. | +| **Real Connection** | A JDBC connection held by the Server's connection pool. The Client never holds one directly. | | **connHash** | A server-computed SHA-256 string keying a specific connection pool. Computed as SHA-256(`url + user + password + datasource_name`). | | **SessionInfo** | A proto message propagated on every RPC. Contains `connHash`, `clientUUID`, `sessionUUID`, `transactionInfo`, `sessionStatus`, `isXA`, `targetServer`, `clusterHealth`. | | **sessionUUID** | A server-assigned handle for a stateful session (transaction, LOB, cursor). Absent until the Server assigns it. 
| From 5d528a218eb9ed451d8cfe1e905e08a17aed033d Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Thu, 30 Apr 2026 15:58:10 +0000 Subject: [PATCH 10/12] docs: add last-updated date to CLIENT_SPEC.md and CLIENT_SPEC_AI.md Agent-Logs-Url: https://github.com/Open-J-Proxy/ojp/sessions/cc5ba102-9987-42f1-a067-4115fdc99a09 Co-authored-by: rrobetti <7221783+rrobetti@users.noreply.github.com> --- documents/multi-language-client-spec/CLIENT_SPEC.md | 1 + documents/multi-language-client-spec/CLIENT_SPEC_AI.md | 1 + 2 files changed, 2 insertions(+) diff --git a/documents/multi-language-client-spec/CLIENT_SPEC.md b/documents/multi-language-client-spec/CLIENT_SPEC.md index 8d5c447c2..0859c3568 100644 --- a/documents/multi-language-client-spec/CLIENT_SPEC.md +++ b/documents/multi-language-client-spec/CLIENT_SPEC.md @@ -1,6 +1,7 @@ # OJP Multi-Language Client Specification > **Status:** Draft — April 2026 +> **Last updated:** 2026-04-30 (reflects all changes merged to `main` up to and including this date) > **Scope:** Defines every aspect that a new OJP client library (in any language) must implement to be fully compatible with an OJP server. Written language-agnostically; Java-specific concepts are labelled as reference implementation only. > **Reference implementation:** `ojp-jdbc-driver` module. > **Protocol source of truth:** `ojp-grpc-commons/src/main/proto/StatementService.proto` and `echo.proto`. 
diff --git a/documents/multi-language-client-spec/CLIENT_SPEC_AI.md b/documents/multi-language-client-spec/CLIENT_SPEC_AI.md index 857c1a671..3b1fff87d 100644 --- a/documents/multi-language-client-spec/CLIENT_SPEC_AI.md +++ b/documents/multi-language-client-spec/CLIENT_SPEC_AI.md @@ -1,6 +1,7 @@ # OJP Client Specification — Machine-Oriented Reference > **Status:** Normative — April 2026 +> **Last updated:** 2026-04-30 (reflects all changes merged to `main` up to and including this date) > **Scope:** Defines the complete behavioral contract for any OJP client implementation. > **Keywords:** MUST, MUST NOT, SHOULD, MAY as defined in RFC 2119. > **Protocol source:** `ojp-grpc-commons/src/main/proto/StatementService.proto`, `echo.proto` From 28384b152bf642da5005b3e2193511bdbb2162a7 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Thu, 30 Apr 2026 16:03:40 +0000 Subject: [PATCH 11/12] docs: simplify last-updated note in CLIENT_SPEC.md and CLIENT_SPEC_AI.md Agent-Logs-Url: https://github.com/Open-J-Proxy/ojp/sessions/94f6dc15-c840-48b2-ae8c-a7d2cdcf087c Co-authored-by: rrobetti <7221783+rrobetti@users.noreply.github.com> --- documents/multi-language-client-spec/CLIENT_SPEC.md | 2 +- documents/multi-language-client-spec/CLIENT_SPEC_AI.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/documents/multi-language-client-spec/CLIENT_SPEC.md b/documents/multi-language-client-spec/CLIENT_SPEC.md index 0859c3568..8bf8009b3 100644 --- a/documents/multi-language-client-spec/CLIENT_SPEC.md +++ b/documents/multi-language-client-spec/CLIENT_SPEC.md @@ -1,7 +1,7 @@ # OJP Multi-Language Client Specification > **Status:** Draft — April 2026 -> **Last updated:** 2026-04-30 (reflects all changes merged to `main` up to and including this date) +> **Last updated:** 2026-04-30 > **Scope:** Defines every aspect that a new OJP client library (in any language) must implement to be fully compatible with an OJP server. 
Written language-agnostically; Java-specific concepts are labelled as reference implementation only. > **Reference implementation:** `ojp-jdbc-driver` module. > **Protocol source of truth:** `ojp-grpc-commons/src/main/proto/StatementService.proto` and `echo.proto`. diff --git a/documents/multi-language-client-spec/CLIENT_SPEC_AI.md b/documents/multi-language-client-spec/CLIENT_SPEC_AI.md index 3b1fff87d..0ddc31ba8 100644 --- a/documents/multi-language-client-spec/CLIENT_SPEC_AI.md +++ b/documents/multi-language-client-spec/CLIENT_SPEC_AI.md @@ -1,7 +1,7 @@ # OJP Client Specification — Machine-Oriented Reference > **Status:** Normative — April 2026 -> **Last updated:** 2026-04-30 (reflects all changes merged to `main` up to and including this date) +> **Last updated:** 2026-04-30 > **Scope:** Defines the complete behavioral contract for any OJP client implementation. > **Keywords:** MUST, MUST NOT, SHOULD, MAY as defined in RFC 2119. > **Protocol source:** `ojp-grpc-commons/src/main/proto/StatementService.proto`, `echo.proto` From facf7188e321356085b0cad211342f5712a8b903 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Wed, 6 May 2026 17:09:36 +0000 Subject: [PATCH 12/12] docs: add language equivalents table to CLIENT_SPEC.md overview Agent-Logs-Url: https://github.com/Open-J-Proxy/ojp/sessions/75e175b5-0175-4c88-bfe3-b40c8cb465ab Co-authored-by: rrobetti <7221783+rrobetti@users.noreply.github.com> --- .../multi-language-client-spec/CLIENT_SPEC.md | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/documents/multi-language-client-spec/CLIENT_SPEC.md b/documents/multi-language-client-spec/CLIENT_SPEC.md index 8bf8009b3..9cc3bd537 100644 --- a/documents/multi-language-client-spec/CLIENT_SPEC.md +++ b/documents/multi-language-client-spec/CLIENT_SPEC.md @@ -63,6 +63,21 @@ This architecture lets many application instances scale independently without ov A non-Java OJP client replaces the 
`ojp-jdbc-driver` module. It must implement all 21 `StatementService` RPCs plus `EchoService.Echo`, handle the `SessionInfo` propagation contract on every call, and manage endpoint health, failover, and session stickiness on the client side. The server handles everything else: real connection management, transaction state, LOB storage, cursor state, and query caching. +### Language Equivalents + +Each target language has its own standard database-access API. When implementing an OJP client, map OJP's connection/statement/result-set concepts to that language's native API: + +| Language | JDBC Equivalent / Standard API | Description | +| :--- | :--- | :--- | +| **Go** | `database/sql` | The standard Go package providing a generic interface for SQL-like databases. | +| **Python** | DB-API 2.0 (`pep-249`) | A standard specification followed by almost all Python database drivers (e.g., `psycopg2`, `sqlite3`). | +| **C# / .NET** | `ADO.NET` | The foundational data access technology for .NET, using standard classes like `DbConnection`. | +| **C++** | `ODBC` | The "Open Database Connectivity" standard used for cross-platform database-agnostic apps. | +| **Node.js** | Common Driver Patterns / ORMs | Relies on community standards or ORMs like Sequelize or Prisma rather than a single built-in API. | +| **Ruby** | `DBI` / `Active Record` | Uses the Database Interface (DBI) module or the Active Record abstraction layer. | +| **PHP** | `PDO` (PHP Data Objects) | A lightweight, consistent interface for accessing various databases in PHP. | +| **Dart** | `sql_conn` / `drift` | Uses specific community packages or the Drift library for structured data access. | + > **Important operational rule:** Application-side connection pools **must be disabled** when using OJP. Double-pooling causes incorrect behavior and resource waste. This is the single most common misconfiguration. ---