Edge 373 placement controller log decision trace in json #31

Merged
ktatarnikovhiro merged 14 commits into main from EDGE-373-placement-controller-log-decision-trace-in-json
Dec 31, 2025

@ktatarnikovhiro ktatarnikovhiro commented Dec 31, 2025

This PR introduces detailed tracing and logging for placement decisions, propagating structured trace data through the models and business logic and out to external consumers. The goal is to retain a fully traceable chain of decision steps for each scheduling/placement operation, improving debugging, auditability, and visibility into how placement outcomes are determined.

Notable Changes

  • New Models and OpenAPI Additions:
    • Added TraceLogRowModel and NamespacedNameModel to both Python and OpenAPI schemas.
    • Added trace fields to BidResponseModel and a required name field to BidRequestModel.
  • Trace Propagation:
    • Placement-related objects (GreedyPlacement, PlacementResult, etc.) now accept and record detailed trace logs (TraceLog).
    • Traces are collected as lists of structured TraceLogRow objects, capturing timestamp, zone, application name, message, and state.
  • API and Internal Model Updates:
    • Updated API, client, and domain models to support structured traces and propagate them between scheduler, placement, and storage layers.
    • Refactored code so that trace information flows through actions and results and is saved in persistent stores (DecisionStore).
  • Test and Fixture Updates:
    • All affected tests now use new trace fields and models, updating expectations and test data for the new format.
  • Dependency and Tooling Bumps:
    • Updated core dependencies such as kubernetes, fastapi, uvicorn, and dev tools.
    • Added and pinned certain transitive dependencies for security and compatibility (filelock, urllib3).
  • Implementation Details:
    • Added now_millis to clocks to support accurate timestamping of trace log events.
    • Enhanced and formalized logging at scheduling, placement, and result-reporting steps.
    • Adjusted various function/method signatures to accept/return traces as needed.
    • Improved trace resets and composition to enable concatenation of local and remote traces.
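
As a rough illustration of the trace collection described above, here is a minimal sketch using plain dataclasses. It is not the PR's actual code: the field names, the state strings, and the `record`, `extend`, and `reset` helpers are assumptions made for the example, and only the listed row contents (timestamp, zone, application name, message, state), the `now_millis` clock, and the local/remote concatenation come from the description.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class NamespacedName:
    # Assumed shape: the PR names NamespacedNameModel but does not list its fields.
    namespace: str
    name: str


@dataclass(frozen=True)
class TraceLogRow:
    # Field names are assumptions; the PR lists only what a row captures.
    timestamp_ms: int
    zone: str
    application: NamespacedName
    message: str
    state: str


def now_millis() -> int:
    """Wall-clock time in whole milliseconds since the Unix epoch."""
    return time.time_ns() // 1_000_000


@dataclass
class TraceLog:
    zone: str
    # The clock is injectable so tests can supply deterministic timestamps.
    clock: Callable[[], int] = now_millis
    rows: List[TraceLogRow] = field(default_factory=list)

    def record(self, application: NamespacedName, message: str, state: str) -> TraceLogRow:
        """Append one locally produced row, stamped with the current clock value."""
        row = TraceLogRow(self.clock(), self.zone, application, message, state)
        self.rows.append(row)
        return row

    def extend(self, remote_rows: Iterable[TraceLogRow]) -> None:
        """Concatenate rows received from another zone onto the local trace."""
        self.rows.extend(remote_rows)

    def reset(self) -> List[TraceLogRow]:
        """Return the collected rows and start a fresh, empty trace."""
        rows, self.rows = self.rows, []
        return rows
```

Passing a fake clock such as `clock=lambda: 1000` keeps timestamps stable in tests, which is one plausible reason to put `now_millis` on the clock abstraction rather than calling the system time directly.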

Impact

  • Debuggability: Each placement attempt now produces a complete, structured trace, making it much easier to follow the decision-making process.
  • Auditability: System actions are now tracked with zone, name, time, state, and arbitrary messages.
  • Test Coverage: All affected tests have been updated to expect and assert on new trace fields.
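
For readers who want to picture the audit output, the sketch below serializes a trace as one JSON object per line. The `TraceRow` class, its field names, the state strings, and the `trace_to_json_lines` helper are hypothetical and are not taken from the PR; the only grounded part is that each row carries zone, name, time, state, and a message.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterable


@dataclass(frozen=True)
class TraceRow:
    # Hypothetical flattened row; field names are assumptions for this example.
    timestamp_ms: int
    zone: str
    name: str
    state: str
    message: str


def trace_to_json_lines(rows: Iterable[TraceRow]) -> str:
    """Render rows as newline-delimited JSON, ordered by timestamp.

    sorted() is stable, so rows with equal timestamps keep their input order.
    """
    ordered = sorted(rows, key=lambda r: r.timestamp_ms)
    return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in ordered)


if __name__ == "__main__":
    rows = [
        TraceRow(1002, "zone-b", "web", "PLACED", "placement accepted"),
        TraceRow(1000, "zone-a", "web", "PENDING", "bid requested"),
    ]
    print(trace_to_json_lines(rows))
```

Sorting by timestamp is a design choice for readability once local and remote rows are merged; a real implementation might instead preserve arrival order.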

@ktatarnikovhiro ktatarnikovhiro merged commit 2661df6 into main Dec 31, 2025
6 checks passed
@ktatarnikovhiro ktatarnikovhiro deleted the EDGE-373-placement-controller-log-decision-trace-in-json branch December 31, 2025 14:14