feat(optimizer): [2/N] Optimizer REST service layer #531
mkuchenbecker wants to merge 6 commits into mkuchenb/optimizer-1
Service interface and implementation for all optimizer CRUD operations including complete-operation lifecycle, stats upsert with history double-write, and filtered queries. Three REST controllers expose the endpoints. The apps/optimizer shared module provides lightweight entity/repo copies for the analyzer and scheduler apps. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Align OptimizerDataServiceImpl with renamed repository methods from optimizer-1 review feedback. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
e628426 to ef3260f
Resolve repo conflicts by taking optimizer-1's clean find-only versions. Scheduler-specific methods and streamAll removed per review feedback. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
mkuchenbecker
left a comment
this needs tests
Propagate CompleteOperationRequest orphan field removal. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
H2 integration tests for OptimizerDataServiceImpl covering completeOperation (write history, not-found) and upsertTableStats (create, update, history append). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Strengthen upsertTableStats test to verify history rows contain the raw delta stats from each call, not just the row count. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
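The double-write behavior these tests target can be sketched in plain Java. This is a minimal in-memory illustration, not the actual OptimizerDataServiceImpl: the class name StatsUpsertSketch, the HistoryRow type, and the use of an opaque String as the stats payload are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for the stats upsert; not the real service. */
final class StatsUpsertSketch {
    /** One appended history entry: the table it belongs to and the raw payload of that call. */
    static final class HistoryRow {
        final String tableUuid;
        final String stats;

        HistoryRow(String tableUuid, String stats) {
            this.tableUuid = tableUuid;
            this.stats = stats;
        }
    }

    // Current stats: one entry per table UUID, overwritten on every upsert.
    final Map<String, String> current = new HashMap<>();
    // History: append-only, one entry per upsert call.
    final List<HistoryRow> history = new ArrayList<>();

    /** Overwrites the current row and appends the same raw payload to history. */
    void upsert(String tableUuid, String stats) {
        current.put(tableUuid, stats);
        history.add(new HistoryRow(tableUuid, stats));
    }
}
```

Calling upsert twice for the same UUID leaves a single current entry holding the second payload, while history keeps both payloads in call order. That is the property the strengthened test checks against the real repositories.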
 * with the history row, or 404 if the operation does not exist.
 */
@PostMapping("/{id}/complete")
public ResponseEntity<TableOperationsHistoryDto> completeOperation(
We need the table name and database name as input. We can keep the URL format the same as the Tables Service URLs, e.g. v1/databases/DB/tables/TABLE.
Or they can be passed as parameters.
These APIs are intentionally keyed by table UUID because of drop-and-recreate semantics: a recreated table is a brand-new entity for the optimizer (new stats, new storage, new operation history), and a name-based key would conflate two distinct identities. The Spark caller of /{id}/complete already has the operation id. We'll add a name-based variant when a concrete use case lands; today the only such use case is operation-history browsing, which is covered separately.
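The drop-and-recreate argument can be made concrete with a small sketch. The code below is illustrative only: DropRecreateSketch and OperationRow are hypothetical names, and the grouping helpers stand in for whatever lookup the service actually performs.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Hypothetical illustration of name-keyed versus UUID-keyed grouping. */
final class DropRecreateSketch {
    /** Minimal stand-in for an operation row: table name plus the UUID assigned at creation. */
    static final class OperationRow {
        final String databaseName;
        final String tableName;
        final UUID tableUuid;

        OperationRow(String databaseName, String tableName, UUID tableUuid) {
            this.databaseName = databaseName;
            this.tableName = tableName;
            this.tableUuid = tableUuid;
        }
    }

    /** Groups by "db.table": a dropped table and its recreation share one key. */
    static Map<String, Integer> countByName(List<OperationRow> rows) {
        Map<String, Integer> counts = new HashMap<>();
        for (OperationRow r : rows) {
            counts.merge(r.databaseName + "." + r.tableName, 1, Integer::sum);
        }
        return counts;
    }

    /** Groups by UUID: the recreated table gets its own key. */
    static Map<UUID, Integer> countByUuid(List<OperationRow> rows) {
        Map<UUID, Integer> counts = new HashMap<>();
        for (OperationRow r : rows) {
            counts.merge(r.tableUuid, 1, Integer::sum);
        }
        return counts;
    }
}
```

With one row for a dropped table and one for its recreation under the same name, countByName reports a single key with two rows, while countByUuid keeps the two identities apart.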
/** Fetch a single operation row by its ID, regardless of status. Returns 404 if not found. */
@GetMapping("/{id}")
public ResponseEntity<TableOperationsDto> getTableOperation(@PathVariable String id) {
Same comment: database name and table name are needed.
Same answer — fetch-by-id is intentional for the same drop-and-recreate reason. The list endpoint at the controller root already accepts databaseName / tableName as optional query-param filters when a multi-criteria browse is needed.
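As a rough sketch of how optional databaseName / tableName filters on a list endpoint typically behave, the plain-Java example below treats a null filter as "not supplied". The class OperationFilterSketch and its Operation type are hypothetical, and the real controller may implement filtering differently (for example, in the repository query).

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical illustration of optional query-parameter filters. */
final class OperationFilterSketch {
    /** Minimal stand-in for an operation row. */
    static final class Operation {
        final String id;
        final String databaseName;
        final String tableName;

        Operation(String id, String databaseName, String tableName) {
            this.id = id;
            this.databaseName = databaseName;
            this.tableName = tableName;
        }
    }

    /** Applies each filter only when it is non-null; null means "no constraint". */
    static List<Operation> list(List<Operation> all, String databaseName, String tableName) {
        return all.stream()
            .filter(op -> databaseName == null || databaseName.equals(op.databaseName))
            .filter(op -> tableName == null || tableName.equals(op.tableName))
            .collect(Collectors.toList());
    }
}
```

Passing null for both filters returns every row, so the same method covers the unfiltered browse and the narrowed, multi-criteria case.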
/** REST controller for {@code table_operations}. */
@RestController
@RequestMapping("/v1/table-operations")
Can we have a common format for all URLs, with a common prefix /v1/optimizer/ and the resource as the suffix? The URL would then be something like /v1/optimizer/operations.
Claude: Renamed to /v1/optimizer/operations. Applied the same /v1/optimizer/... namespacing across all three controllers.
/** REST controller for {@code table_operations_history}. */
@RestController
@RequestMapping("/v1/table-operations-history")
Can we have a common format for all URLs, with a common prefix /v1/optimizer/ and the resource as the suffix? The URL would then be something like /v1/optimizer/history or /v1/optimizer/operations-history.
Claude: Renamed to /v1/optimizer/operations-history (the more descriptive of the two, to disambiguate from stats history).
/** Return the most recent history for a table, newest first, up to {@code limit} rows. */
@GetMapping("/{tableUuid}")
public ResponseEntity<List<TableOperationsHistoryDto>> getHistory(
Table name and database name?
We probably need both. This API is used by the analyzer to find the history for a particular UUID, but people fetching the history will do so by name.
Claude: Done — added GET /v1/optimizer/databases/{databaseName}/tables/{tableName}/operations-history for human/name-based access. The UUID-keyed path stays for the analyzer. Backed by a new composite index on table_operations_history (database_name, table_name) at the schema layer.
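To illustrate how the two access paths differ, here is a minimal in-memory sketch of "newest first, up to limit rows" for both keys. HistoryLookupSketch, HistoryRow, and the completedAt field are hypothetical names chosen for the example; the real endpoints are backed by repository queries rather than stream filtering.

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical illustration of UUID-keyed versus name-keyed history lookup. */
final class HistoryLookupSketch {
    /** Minimal stand-in for a history row; completedAt is an assumed ordering column. */
    static final class HistoryRow {
        final String tableUuid;
        final String databaseName;
        final String tableName;
        final Instant completedAt;

        HistoryRow(String tableUuid, String databaseName, String tableName, Instant completedAt) {
            this.tableUuid = tableUuid;
            this.databaseName = databaseName;
            this.tableName = tableName;
            this.completedAt = completedAt;
        }
    }

    /** Sorts newest first and keeps at most {@code limit} rows. */
    private static List<HistoryRow> newestFirst(Stream<HistoryRow> rows, int limit) {
        return rows
            .sorted(Comparator.comparing((HistoryRow r) -> r.completedAt).reversed())
            .limit(limit)
            .collect(Collectors.toList());
    }

    /** Analyzer-style lookup: exactly one table identity. */
    static List<HistoryRow> byUuid(List<HistoryRow> all, String tableUuid, int limit) {
        return newestFirst(all.stream().filter(r -> r.tableUuid.equals(tableUuid)), limit);
    }

    /** Name-based lookup: may span several UUIDs if the table was dropped and recreated. */
    static List<HistoryRow> byName(List<HistoryRow> all, String databaseName, String tableName, int limit) {
        return newestFirst(
            all.stream().filter(r -> r.databaseName.equals(databaseName) && r.tableName.equals(tableName)),
            limit);
    }
}
```

The name-based variant can return rows from more than one table UUID, which matches the split described above: the analyzer reads by UUID, and people browse by name.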
/** REST controller for managing per-table stats in the optimizer DB. */
@RestController
@RequestMapping("/v1/table-stats")
- @RequestMapping("/v1/table-stats")
+ @RequestMapping("/v1/optimizer/table-stats")
or
- @RequestMapping("/v1/table-stats")
+ @RequestMapping("/v1/optimizer/stats")
Claude: Renamed to /v1/optimizer/stats (took the shorter of the two; symmetric with /v1/optimizer/operations and /v1/optimizer/operations-history).
 * Iceberg commit. Idempotent.
 */
@PutMapping("/{tableUuid}")
public ResponseEntity<TableStatsDto> upsertTableStats(
Database name and table name are needed.
The PUT path is intentionally UUID-keyed — the Tables Service caller writes by UUID, and stats for a recreated table need to land under a fresh row, not collide with the dropped table's history. The request body already carries databaseName / tableName as denormalized fields. Same position as the operations endpoints: we'll add name-based access if a concrete use case lands.
Optimizer Stack
Summary
PR 2 of N in the optimizer stack.
Overall Project
Service Design doc.
Service layer and REST controllers for the optimizer service, plus the apps/optimizer shared module providing lightweight entity/repo copies for the analyzer and scheduler apps.

Changes
- Service layer: OptimizerDataService interface and OptimizerDataServiceImpl, covering CRUD operations, complete-operation lifecycle, stats upsert with history double-write, and filtered queries.
- Controllers: TableOperationsController, TableOperationsHistoryController, TableStatsController, exposing REST endpoints per the design doc API spec.
- Shared module (apps/optimizer): Lightweight entity and repository copies used by the analyzer and scheduler apps to read optimizer state directly from MySQL.

Testing Done
H2 integration tests in OptimizerDataServiceImplTest (5 tests):
- completeOperation_writesHistoryFromOperationRow: saves SCHEDULED row, completes it, asserts history DTO fields
- completeOperation_notFound_returnsEmpty: completes nonexistent ID, asserts empty
- upsertTableStats_createsNewRow: upserts new table, asserts DTO and repo row
- upsertTableStats_updatesExistingRow: upserts twice, asserts overwrite with single row
- upsertTableStats_appendsHistoryOnEveryCall: upserts twice, asserts 2 history rows

Additional Information