Improve Lakebase skill: synced tables docs, apps decision gate, and connectivity#56
Conversation
Stacked a follow-up at #59 — adds the SP-role-creation block for AppKit Lakebase apps (the case where the SP fails with `password authentication failed`).
keugenek left a comment:
Please fix the 'synced tables' term to be more specific before merging.
```diff
   },
   "databricks-apps": {
-    "description": "Databricks Apps development and deployment",
+    "description": "Databricks Apps development and deployment (evaluates analytics vs synced tables data access)",
```
@pkosiec is it clear enough for AI when you say 'synced tables'? Maybe there is a more descriptive, more specific term?
Unfortunately I don't think so — it's the official name: https://docs.databricks.com/aws/en/oltp/instances/sync-data/sync-table. I'll use "Lakebase synced tables" then.
From my testing an agent wasn't confused with the term but I agree, let's make it more clear 👍
```diff
@@ -1,12 +1,12 @@
 {
   "version": "2",
   "updated_at": "2026-04-22T15:52:03Z",
```
These timestamps would better be derived from git, not from the file contents.
Good point. Looked into deriving timestamps from git — the issue is that git log only knows about committed state, so you'd need a two-step workflow: commit skill changes first, then run the script to regenerate manifest.json, then commit again (or amend). That's friction for every skill edit.
Current st_mtime approach isn't perfect (varies across machines after clone/rebase), but validate already strips updated_at from comparison, so they're effectively cosmetic. The trade-off doesn't seem worth the workflow change right now 🤔
Nitpick Review

| File | Type |
|---|---|
| `manifest.json` | config |
| `scripts/skills.py` | python |
| `skills/databricks-apps/SKILL.md` | docs |
| `skills/databricks-apps/references/appkit/jobs.md` | deleted (-141) |
| `skills/databricks-apps/references/appkit/lakebase.md` | docs |
| `skills/databricks-apps/references/appkit/overview.md` | docs |
| `skills/databricks-lakebase/SKILL.md` | docs (major rewrite) |
| `skills/databricks-lakebase/references/connectivity.md` | docs |
| `skills/databricks-lakebase/references/reverse-etl.md` | deleted (-103) |
| `skills/databricks-lakebase/references/synced-tables.md` | new (+214) |
Reviewed by three parallel agents: Security, Correctness, Design.
🔴 Must Fix (CRITICAL / HIGH)
1. `scripts/skills.py` — `iter_skill_files` filter was removed without replacement
- Severity: HIGH (Correctness CRITICAL · Design MEDIUM · Security LOW — all three reviewers flagged this)
- Where: `scripts/skills.py:98` (`get_skill_updated_at`) and `:183` (`generate_manifest`)
- Issue: The deleted helper filtered out dot-prefixed paths (`.DS_Store`, `.git`), `__pycache__/`, and `*.pyc`. The replacement uses bare `rglob("*")` at both call sites. On any macOS dev machine, Finder will drop `.DS_Store` into skill directories and those files will (a) get added to the manifest's `files` list, (b) bump `updated_at` non-deterministically, and (c) potentially expose accidentally-committed dotfiles (e.g. `.env`) in the published manifest.
- Fix: Restore the filter inline or via a helper:

```python
if any(part.startswith(".") for part in f.relative_to(skill_dir).parts):
    continue
if f.suffix == ".pyc" or "__pycache__" in f.parts:
    continue
```
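For context, the deleted helper could be reconstructed as a standalone sketch like the one below — the name `iter_skill_files` and the filtering rules come from this finding, but the exact original signature is an assumption:

```python
from pathlib import Path
from typing import Iterator


def iter_skill_files(skill_dir: Path) -> Iterator[Path]:
    """Yield a skill's files, skipping dotfiles, __pycache__/, and *.pyc.

    Hypothetical reconstruction of the deleted helper in scripts/skills.py;
    the original signature and ordering guarantees may differ.
    """
    for f in sorted(skill_dir.rglob("*")):
        if not f.is_file():
            continue
        rel = f.relative_to(skill_dir)
        # Skip anything under a dot-prefixed path: .DS_Store, .git, .env, ...
        if any(part.startswith(".") for part in rel.parts):
            continue
        # Skip Python bytecode artifacts.
        if f.suffix == ".pyc" or "__pycache__" in rel.parts:
            continue
        yield f
```

Calling this from both `get_skill_updated_at` and `generate_manifest` keeps the filter in one place, so the two call sites cannot drift apart again.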
2. `manifest.json` — `updated_at` timestamps rolled backward across multiple skills
- Severity: HIGH (Correctness HIGH · Design HIGH)
- Where: Top-level and at least 6 skills (`databricks-apps` 2026-04-28 → 2026-04-27; `databricks-core`, `databricks-dabs`, `databricks-jobs`, `databricks-model-serving`, `databricks-pipelines` all 2026-04-28 → 2026-04-23)
- Issue: The manifest appears to have been regenerated against a stale checkout, so skills not changed in this PR now claim to be older than they were on `main`. CI's `validate` mode strips `updated_at` before comparison, so it won't catch this. Any consumer that uses these timestamps for cache-busting will incorrectly believe the skills are stale.
- Fix: Regenerate `manifest.json` against the actual branch HEAD (after fixing finding 1). Confirm no unrelated `updated_at` values move backward versus `main`.
3. `scripts/skills.py:47–50` — `databricks-proto-first` entry added but skill directory does not exist
- Severity: HIGH (Correctness HIGH · Design MEDIUM)
- Where: `SKILL_METADATA` dict
- Issue: New entry `"databricks-proto-first"` has no matching `skills/databricks-proto-first/` directory. The generator only validates the on-disk → metadata direction (raises if a directory has no metadata), not metadata → on-disk, so the dead entry is silently dropped today. This will cause confusion when the skill is eventually added (silent conflict or surprising activation) and is misleading to future contributors.
- Fix: Either commit the `skills/databricks-proto-first/` directory in this PR or remove the dict entry. Add an assertion that every `SKILL_METADATA` key resolves to an existing directory.
4. `skills/databricks-apps/SKILL.md` — Decision Gate placement
- Severity: HIGH (Design)
- Where: `SKILL.md:60–76`
- Issue: The new Decision Gate sits inside the "Development Workflow" section, after the "Required Reading by Phase" table and "Generic Guidelines". An LLM that reads the skill top-down will hit scaffolding instructions (~line 20) before encountering the gate (line 60), defeating the gate's purpose.
- Fix: Promote the Decision Gate to a top-level section directly after the skill description, or at minimum add an inline **⚠️ Decision Gate required — see below** to the "Scaffolding" row of the Required Reading table.
5. `skills/databricks-apps/references/appkit/lakebase.md` ↔ `synced-tables.md` content duplicated
- Severity: HIGH (Design)
- Where: `appkit/lakebase.md:180–193` vs `databricks-lakebase/references/synced-tables.md` (whole file)
- Issue: The "Use synced tables when / Do NOT use" guidance is duplicated in two skills and will drift. The single source of truth should be the lakebase skill.
- Fix: In `appkit/lakebase.md`, replace the duplicated lists with a one-line pointer ("see `databricks-lakebase/references/synced-tables.md` for full guidance") and keep only the AppKit-integration-specific notes.
🟡 Should Fix (MEDIUM)
- [MEDIUM/security] `connectivity.md:155–159` — `resolve_host()` passes the raw first line of `dig +short` output to `hostaddr` without IP validation; a CNAME, warning text, or empty output silently flows through. Fix: validate with `socket.inet_pton(...)`; raise `ValueError` otherwise.
- [MEDIUM/security] `connectivity.md:84` — Pattern 3 promotes `export LAKEBASE_PG_URL="postgresql://user:password@..."` (creds in shell history/process listing) without a clear "local-dev only / use a `.env` excluded from VCS / don't put real prod creds here" callout. Fix: add an explicit warning plus a `.env`/pg_service.conf alternative.
- [MEDIUM/correctness] `databricks-lakebase/SKILL.md:233` — the Bash snippet parses CLI JSON via `['status']['hosts']['host']` while the Python examples (`connectivity.md:22, 33`) use the SDK attribute path. The CLI JSON shape isn't verified in the docs; if `hosts` is an array or named differently, the snippet fails silently. Fix: verify against `databricks postgres get-endpoint -o json` and add a comment showing the expected shape.
- [MEDIUM/correctness] `databricks-apps/SKILL.md` "When to Use What" routing table — has rows for Analytics, Lakebase CRUD, Genie, and Model Serving but no row for Synced Tables / low-latency UC reads, which the new Decision Gate routes to. Fix: add a bullet pointing low-latency UC reads at `--features lakebase` plus the `databricks-lakebase` skill.
- [MEDIUM/design] `appkit/lakebase.md:228–230` — two consecutive callout blocks repeat the same instruction ("use `databricks postgres create-synced-table`, see lakebase skill, grant SP"). Fix: delete one.
- [MEDIUM/design] `appkit/overview.md:11–16` — two `--features lakebase` rows (Synced Tables vs Lakebase OLTP) with identical Init commands but different `--set` requirements. An LLM cannot disambiguate them from the table alone. Fix: include the full init command (with required `--set` flags) in each row, or annotate that OLTP requires extra flags.
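The `resolve_host()` validation suggested in the first item above could be sketched as follows. Splitting out a pure helper keeps the `dig` parsing testable; the function name comes from `connectivity.md`, while the exact structure is an assumption:

```python
import socket
import subprocess


def first_valid_ip(dig_output: str) -> str:
    """Return the first line of `dig +short` output that is a literal IP.

    CNAME targets, warning text, and empty output all fail inet_pton,
    so they raise instead of silently flowing into libpq's hostaddr.
    """
    for line in dig_output.splitlines():
        candidate = line.strip()
        for family in (socket.AF_INET, socket.AF_INET6):
            try:
                socket.inet_pton(family, candidate)
                return candidate  # first valid A/AAAA record
            except OSError:
                pass
    raise ValueError(f"no valid IP in dig output: {dig_output!r}")


def resolve_host(hostname: str) -> str:
    """Resolve via dig and return a validated IP literal (sketch)."""
    out = subprocess.run(
        ["dig", "+short", hostname],
        capture_output=True, text=True, check=True,
    ).stdout
    return first_valid_ip(out)
```

`inet_pton` accepts only literal IPv4/IPv6 addresses, so anything else `dig` emits — including trailing-dot CNAMEs — is rejected explicitly.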
🟢 Consider (LOW / NIT)
- [LOW/security] `appkit/lakebase.md:193` — the UC FGAC bypass for synced tables is buried in a "Do NOT use" bullet list. Promote it to a dedicated `> **Security note**` callout in the "How It Works" section.
- [LOW/security] `databricks-lakebase/SKILL.md:241–243` — the example `GRANT SELECT ON ALL TABLES IN SCHEMA public …` plus `ALTER DEFAULT PRIVILEGES` grants the SP read on every current and future object in `public`, which is broader than most users likely intend. Recommend a dedicated schema for synced tables.
- [LOW/correctness] `appkit/lakebase.md:463, :484` — references the lakebase `SKILL.md` "Grant app SP access to synced tables" section, but that text isn't an `##`/`###` heading — it's a bold label inside "Other Workflows". Anchor links won't resolve. Fix: promote it to a real heading.
- [LOW/design] `databricks-lakebase/SKILL.md:3` — the frontmatter `description` is ~253 chars, ~4× the length of sibling skills (`databricks-jobs` ~85, `databricks-pipelines` ~90). Trim to a single sentence; routing models don't benefit from prose.
- [LOW/design] `databricks-apps/SKILL.md:72` — introduces the novel pattern `*Agent notes (do not show to user):*` with no precedent or convention defined elsewhere. Establish the convention (scope, who honors it) or drop it.
- [NIT/design] Deletion of `references/appkit/jobs.md` (141 lines) leaves no in-skill answer to "trigger/monitor a Lakeflow job from my app". Add a 10-line stub or a one-line pointer in `overview.md` to the `npx @databricks/appkit docs` Jobs plugin plus the `databricks-jobs` skill.
- [NIT/design] `databricks-lakebase/SKILL.md:28` says "Synced tables (reverse ETL)" while `synced-tables.md:5` correctly says "Previously known as Reverse ETL." Use the deprecated-name framing in both places, or drop the parenthetical from `SKILL.md`.
What was checked and found clean
- No hardcoded credentials, no `eval`/`exec`/`shell=True`, no `yaml.load` without SafeLoader, no `pickle`, no `sslmode=disable` patterns. The `subprocess.run` calls in `scripts/skills.py` use list form and don't incorporate user input.
- All files listed in `manifest.json` exist on disk; no phantom entries for the deleted `jobs.md`/`reverse-etl.md`.
- No remaining markdown links to the deleted reference files.
- The cross-skill relative path `../../../databricks-lakebase/references/synced-tables.md` (`appkit/lakebase.md:228`) resolves on disk.
- SQL in the `synced-tables.md` examples is valid Postgres syntax; CLI commands use the new `databricks postgres create-synced-table` (consistent with the Autoscaling context).
- The Autoscaling Max−Min ≤ 16 CU constraint, removed from `SKILL.md`, is preserved in `references/computes-and-scaling.md`.
- The Decision Gate's A/B conditions are mutually exclusive; the "skip for pure CRUD" escape hatch is logically sound.
Reviewed in parallel by Security · Correctness · Design agents (Sonnet 4.6) on 2026-04-30.
Diff: 1103 lines, 55KB, against origin/main.
- Add Pattern 4 (Python Databricks App with platform-injected env vars) to connectivity.md
- Clarify connection patterns are Python-specific; JS/AppKit apps get full auto-injection
- Remove duplicate Data API / Reverse ETL pointers already covered in reference docs block
- Fix missing blank lines before section headers in SKILL.md

Co-authored-by: Isaac
- Rename reverse-etl.md to synced-tables.md (official Databricks terminology)
- Replace all `databricks database` commands with `databricks postgres` (old Provisioned API → new Autoscaling API)
- Add `new_pipeline_spec` as required field with warning: storage_catalog must be a regular UC catalog, not the Lakebase catalog (DLT can't create event logs in Postgres-backed schemas)
- Add Prerequisites section for `create-catalog`
- Add complete field reference table
- Add NYC taxi example (end-to-end lifecycle)
- Add synced tables section to apps lakebase.md (read-only pattern, permission grants, differences from CRUD tables)
- All commands verified against live Lakebase project

Co-authored-by: Isaac
- Fix broken relative path in appkit/lakebase.md (../../ → ../../../)
- Trim duplicate autoscaling section in SKILL.md (defer to computes-and-scaling.md)
- Restore Data API and Synced Tables links in Other Workflows section
- Add "Previously known as Reverse ETL" to synced-tables.md for discoverability

Co-authored-by: Isaac
…sync troubleshooting

- Expand SKILL.md description with trigger words (synced tables, reverse ETL, Data API, scale-to-zero, OAuth, connection pooling) for better skill activation
- Switch to 3rd person voice per create-skill best practices
- Replace feature status table with bulleted Capabilities section
- Add 3 synced-table troubleshooting entries (storage_catalog, CDF, permissions)
- Update manifest description to match

Co-authored-by: Isaac
- Add DNS resolution (macOS) workaround to connectivity.md (was referenced from SKILL.md but missing)
- Rewrite synced table read example as tRPC route in appkit/lakebase.md (was Express-style, contradicting tRPC guidance)
- Deduplicate create-catalog command in synced-tables.md NYC taxi example
- Remove redundant storage_catalog warning callout (already in field table)
- Clarify that delete-synced-table leaves the Postgres table behind

Co-authored-by: Isaac
Add use-case guidance to help AI assistants discover synced tables for the right patterns: operational consoles, user-facing apps on analytical data, online feature serving, hybrid read/write, and Postgres-specific capabilities. Include anti-patterns (OLAP, writes, huge tables, FGAC).

- Broaden Lakebase row in decision table to mention synced data
- Add callout below table pointing to synced tables section
- Add architecture pattern, use cases, and anti-patterns

Co-authored-by: Isaac
Add a bullet for "Read lakehouse data with low latency" that points agents to the Lakebase guide when apps need fast point lookups. Add guidance note for agents to ask about latency requirements rather than defaulting to DBSQL for all read-from-UC use cases. Co-authored-by: Isaac
Agent testing showed the "When to Use What" section routes to analytics by default, even when the user explicitly asks for "fast" or "instant" reads. The synced tables bullet was buried after three analytics bullets that matched first.

- Add decision gate at top of section: if user mentions speed/latency, present both analytics and synced tables options before proceeding
- Rewrite guidance note to reference the gate instead of defaulting
- Remove redundant standalone synced tables bullet (covered by gate)

Co-authored-by: Isaac
…idance

From agent testing:

- Add psql connection workflow (generate-credential + psql) for running GRANT, CREATE INDEX, etc. against Lakebase
- Add post-deploy GRANT workflow for app SP access to synced tables
- Add source table guidance: don't run ad-hoc CREATE TABLE AS SELECT
- Add DAB caveat: synced_database_tables maps to Provisioned API, not Autoscaling. DAB support for postgres_synced_tables not yet available.
- Add creation + GRANT note to apps lakebase.md

Co-authored-by: Isaac
- Fix psql sslmode syntax (connection string instead of broken positional arg)
- Add scriptable psql workflow for agent automation
- Add endpoint path note for generate-database-credential
- Align GRANT SQL between SKILL.md and apps/lakebase.md (add ALTER DEFAULT PRIVILEGES)
- Surface DABs caveat in SKILL.md troubleshooting table
- Broaden latency decision gate triggers in apps SKILL.md
- Add "sync gold tables, not raw" as top best practice
- Clarify 8 TB storage limit scope (per branch)

Co-authored-by: Isaac
Restructure the data access decision in databricks-apps SKILL.md:

- Embed as Step 0 inside Development Workflow (not a separate section)
- Always present Analytics vs Synced Tables comparison table for UC reads
- Add hybrid apps note (--features analytics,lakebase)
- Remove keyword-scanning logic that agents rationalized past
- Add constraints section and app access cross-ref to synced-tables.md
- Remove duplicate content across reference files

Co-authored-by: Isaac
Co-authored-by: Orca <help@stably.ai>
- Swap column order: Synced Tables is now option (A) for primacy
- Strengthen instruction before the table: "you MUST show the table" and "Do not recommend one option over the other"
- Reframe "Setup complexity" → "Setup", "Warehouse required" → "Always available" (positive framing for synced tables)
- Add "operational apps" to synced tables Best For row

Co-authored-by: Isaac
Co-authored-by: Orca <help@stably.ai>
- Simplify comparison table to 3 user-facing rows (Speed, Best for, How it works)
- Move scaffold commands and links to agent-only notes section
- Add contextual recommendation: lean toward Synced Tables (A) unless
clearly aggregation/dashboard use case
- Remove negative framing ("trade-off", "setup complexity")
Co-authored-by: Isaac
Co-authored-by: Orca <help@stably.ai>
- Remove duplicate GRANT SQL from apps/lakebase.md, keep canonical location in lakebase/SKILL.md with cross-reference
- Rename "Step 0" to "Data Access Decision Gate" for clarity since it's conditional (only applies when app reads UC data)
- Cast PGPORT env var to int in connectivity.md Pattern 4

Co-authored-by: Isaac
Co-authored-by: Orca <help@stably.ai>
- Strengthen Decision Gate callout in Required Reading table
- Deduplicate synced tables guidance in appkit/lakebase.md (pointer to lakebase skill)
- Merge duplicate callout blocks into single block
- Add IP validation to resolve_host() in connectivity.md
- Add local-dev-only warning for Pattern 3 credentials
- Add synced tables row to "When to Use What" routing table
- Add expected JSON shape comment to scriptable CLI snippet
- Disambiguate overview.md lakebase rows with "(no --set flags needed)"
- Trim lakebase frontmatter description to one sentence
- Normalize "reverse ETL" phrasing across files
- Drop *Agent notes* pattern, use plain label
- Add dedicated-schema recommendation for synced tables grants

Co-authored-by: Isaac
Adds the SQL block (with `databricks_create_role` + DML grants) needed once after the first deploy of an AppKit Lakebase app — without it the SP gets `password authentication failed`. Promotes `databricks psql` as the runnable form of #56's manual `generate-database-credential` recipe. Also flags that the `databricks postgres create-role` CLI rejects every SP-role payload, so agents stop trying to use it. Co-authored-by: Isaac
Summary

- Replaced `reverse-etl.md` with a comprehensive `synced-tables.md` covering CLI syntax, sync modes, prerequisites, constraints, and worked examples

Screenshots
Case 1:
Case 2:
Sample prompts
Test plan

- `python3 scripts/skills.py validate` passes