RFC: Streamlining dashboard release and deployment #35
annakrystalli wants to merge 1 commit into main
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
nickreich
left a comment
This is a great summary of the current status of things, and a helpful take on some concrete next steps. Some comments and questions:
- From a quick read of the whole RFC, the implications of a breaking schema change weren't clear to me. For example, if we introduce a breaking schema change, do we have to go through all dashboards and force a change to their config files? Or will they be able to limp along with a previous version? If we need to make changes, then I think part of the documented process should be how to ensure all those changes get made.
- Claude skills still feel new to me. While they do seem like a prime candidate for a way to "automate" some of these processes, they feel a bit unknown to me. They will probably be easier to maintain, but we would want clear documentation so that whenever the process changes, that change is reflected in skill updates.
- In general, could we be more specific about whether the plan is to complete both Phase 1 and 2, or just Phase 1, prior to starting on the new eval schema changes?
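As a rough illustration of the first point (making sure every dashboard picks up a breaking change), the check could be sketched as below. This is only a sketch under stated assumptions: the `schema_version` field, the dashboard names, and the version numbers are all hypothetical and are not taken from the RFC or the dashboard tools.

```python
from typing import Mapping


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a string such as 'v1.2.0' or '1.2.0' into a comparable tuple."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def find_outdated_dashboards(
    configs: Mapping[str, Mapping[str, object]], required: str
) -> list[str]:
    """Return the names of dashboards whose config predates `required`.

    Assumption: each config declares a `schema_version` field. A config
    without that field is treated as outdated.
    """
    minimum = parse_version(required)
    outdated = []
    for name, config in configs.items():
        declared = config.get("schema_version")
        if declared is None or parse_version(str(declared)) < minimum:
            outdated.append(name)
    return sorted(outdated)


# Hypothetical dashboards and versions, for illustration only.
example_configs = {
    "hub-a-dashboard": {"schema_version": "v1.0.0"},
    "hub-b-dashboard": {"schema_version": "v2.0.0"},
    "hub-c-dashboard": {},
}
outdated = find_outdated_dashboards(example_configs, "v2.0.0")
print(outdated)  # ['hub-a-dashboard', 'hub-c-dashboard']
```

A list like this could serve as the checklist for a breaking release: the release is not done until the list is empty.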
Thanks for the speedy review @nickreich!
Re: breaking schema changes
Yes, breaking changes will still require updating config files across all dashboards. This is effectively the status quo: while the dashboard tools don't explicitly enforce […]
The alternative (version-pinnable tool references + backwards compatibility) was considered and rejected (see "Other Options Considered" 1) because it would require significantly more effort to maintain compatibility across the tool chain (different languages, Docker images, config schemas, JS modules) and would push active version management onto hub admins.
What the RFC does propose to improve the breaking change experience:
Re: Claude Code skills
Fair concern. Worth noting that skills are essentially documented runbooks in executable form: they encode the same step-by-step processes that are currently in the developer docs, but in a way that can be executed and checked interactively. Think of it as an executable script with Claude handholding you through the process, with the opportunity to work through issues that arise or ask questions as you go along. Unlike a plain script that either succeeds or fails, if something unexpected happens midway (a test fails, a version doesn't match), Claude can help diagnose and adapt rather than just aborting.
On maintenance: when the underlying process changes, the skill is the thing that gets updated, so there's no separate doc to keep in sync. That said, we should make sure each skill links back to the relevant developer docs, and that any process change triggers a skill review. We could add that as an explicit step in the […]
If it would help, I can demo one of the simpler skills (e.g., […]
Re: phasing vs eval-metrics timeline
The plan is to complete Phase 1 before starting on the eval schema changes. The skills and automated […]
For Phase 2, the most impactful items would be replacing the Docker CI comparison tests with independent validation and the dedicated test hub to run them against, since together they unblock confident releases. However, they're more involved, so whether we tackle them before or alongside the eval schema work depends on how quickly we want to move on those changes. The remaining Phase 2 items (schema handling rework, Docker elimination evaluation) can be picked off if and when we decide we want to; they're improvements, not blockers.
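To make the runbook comparison in the skills reply a little more concrete, here is a minimal sketch of a checklist that reports every step with a hint instead of aborting at the first failure. It is not how Claude Code skills are implemented; the `Step` type, the step names, and the version strings are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    check: Callable[[], bool]  # returns True when the step's condition holds
    hint: str  # where to look if the check fails


def run_runbook(steps: list[Step]) -> list[str]:
    """Run every check in order and report each outcome.

    Unlike a script that exits on the first error, all steps are
    evaluated so the full picture is visible in one pass.
    """
    report = []
    for step in steps:
        try:
            passed = step.check()
        except Exception as error:  # a crashing check is reported, not fatal
            report.append(f"ERROR {step.name}: {error} ({step.hint})")
            continue
        if passed:
            report.append(f"OK {step.name}")
        else:
            report.append(f"FAIL {step.name} ({step.hint})")
    return report


# Hypothetical release state, for illustration only.
released = "1.2.0"
pinned = "1.1.0"
steps = [
    Step("tests pass", lambda: True, "re-run the test suite locally"),
    Step(
        "pinned version matches release",
        lambda: pinned == released,
        "update the pinned version",
    ),
]
report = run_runbook(steps)
for line in report:
    print(line)
```

The interactive part (diagnosing and adapting when a step fails) is what the skill would add on top of a static report like this one.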
Echoing @nickreich's comment:
Some small comments: "Aims":
"Claude Code skills for dashboard operations":
"Automated renv.lock update workflow"
"Replace comparison-based Docker CI with independent validation"
"Dedicated test hub and end-to-end smoke test"
"Rework config schema handling in hubPredEvalsData"
"Evaluate eliminating Docker indirection for hubPredEvalsData"
Summary
renv.lock updates
Context
The team's first post-departure schema deployment (hubPredEvalsData#33) exposed structural problems in the multi-repo release chain. With the eval-metrics-expansion project planning multiple schema changes, these issues need addressing before the next deployment cycle.
🤖 Generated with Claude Code