[Engine] Built-in cache step type — replace 5-step cache pattern #81

@ameet

Description

Problem

Any flow that caches LLM or API results currently requires a 5-step pattern:

  1. buildCacheKey — deterministic key from inputs
  2. readCacheFile — file-read with onError: continue
  3. checkCache — evaluate hit/miss/expired (age-based TTL)
  4. buildCacheEntry — quality gate + serialize (after computation)
  5. writeCacheFile — persist to disk (guarded on truthy output)
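For readers who have not seen the pattern, here is a minimal Python sketch of what the five steps do. It is an illustration, not the code of any existing flow: the function names mirror the step ids above, while the `writtenAt` field, the SHA-256 key, and the entry layout are assumptions.

```python
import hashlib
import json

TTL_MS = 30 * 24 * 60 * 60 * 1000  # hardcoded per flow, as in the current pattern


def build_cache_key(inputs: dict) -> str:
    """Step 1: deterministic key from inputs (sorted keys keep the hash stable)."""
    payload = json.dumps(inputs, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def read_cache_file(path: str):
    """Step 2: file read that swallows errors, like onError: continue."""
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    except (OSError, ValueError):  # missing file or corrupt JSON
        return None


def check_cache(entry, now_ms: int) -> str:
    """Step 3: classify as hit, miss, or expired using an age-based TTL."""
    if not isinstance(entry, dict) or "writtenAt" not in entry:
        return "miss"
    return "hit" if now_ms - entry["writtenAt"] < TTL_MS else "expired"


def build_cache_entry(result, now_ms: int):
    """Step 4: quality gate plus serialization; None means do not cache."""
    if not result:
        return None
    return {"writtenAt": now_ms, "data": result}


def write_cache_file(path: str, entry) -> bool:
    """Step 5: persist only when the entry is truthy."""
    if not entry:
        return False
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(entry, fh)
    return True
```

Every cached flow carries its own copy of this logic, including the TTL constant.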

This pattern must be implemented identically in every cached flow. In a codebase of 90+ flows, 12 use this pattern, which adds up to 60 steps (12 × 5) dedicated to caching.

Brittleness

The pattern has 4 known failure modes that teams build regression tests to catch:

  1. Step ordering bug: buildCacheEntry references $.steps.buildResult but is declared before it. If execution order changes, the cache writes stale data.
  2. Null guard missing: writeCacheFile must check that $.steps.buildCacheEntry.output is truthy. Without this guard, it writes null or undefined to the cache file.
  3. onError missing: readCacheFile, buildCacheEntry, and writeCacheFile all need onError: continue. Missing any one causes the flow to fail on cache corruption.
  4. TTL inconsistency: each flow hardcodes its own TTL calculation, so TTL logic can easily diverge across flows.
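The first and third failure modes above are the kind of thing a regression test can check mechanically. The sketch below is a hypothetical lint, assuming a flow is a list of step objects with an `id`, an optional `onError`, and references written as `$.steps.<id>` somewhere in the step body. The function name and flow shape are illustrative, not part of the engine.

```python
import json
import re

# Matches references such as $.steps.buildResult anywhere inside a step.
STEP_REF = re.compile(r"\$\.steps\.([A-Za-z_][A-Za-z0-9_]*)")
# Step ids the issue says must carry onError: continue.
NEEDS_ON_ERROR = ("readCacheFile", "buildCacheEntry", "writeCacheFile")


def lint_cache_steps(steps: list) -> list:
    """Return human-readable problems found in a list of step dicts."""
    problems = []
    position = {step["id"]: i for i, step in enumerate(steps)}
    for i, step in enumerate(steps):
        # Failure mode 1: a step refers to another step declared after it.
        for ref in sorted(set(STEP_REF.findall(json.dumps(step)))):
            if position.get(ref, -1) > i:
                problems.append(f"{step['id']} references {ref} before it is declared")
        # Failure mode 3: a cache step lacks onError: continue.
        if step["id"] in NEEDS_ON_ERROR and step.get("onError") != "continue":
            problems.append(f"{step['id']} is missing onError: continue")
    return problems
```

A check like this has to be maintained alongside every cached flow, which is part of the cost the proposal aims to remove.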

Proposal

Add a first-class cache step type:

{
  "id": "cached",
  "type": "cache",
  "cache": {
    "namespace": "company-research",
    "key": "{{$.input.company}}-{{$.input.stage}}",
    "ttl": "30d",
    "qualityGate": {
      "minLength": 100,
      "requiredFields": ["data", "score"]
    },
    "steps": [
      { "id": "doExpensiveWork", "type": "bash", "bash": { "command": "..." } },
      { "id": "buildResult", "type": "code", "code": { "source": "..." } }
    ]
  }
}
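To make the `key` field concrete, here is one way the template could be resolved against the flow context. This is a sketch under assumptions: the `{{$.path}}` placeholder grammar is inferred from the example above, `render_key` is a hypothetical helper, and failing on a missing input (rather than rendering an empty string) is a design choice, not confirmed engine behavior.

```python
import re

# Placeholders of the form {{$.input.company}}; the grammar is an assumption.
PLACEHOLDER = re.compile(r"\{\{\s*\$\.([A-Za-z0-9_.]+)\s*\}\}")


def render_key(template: str, context: dict) -> str:
    """Substitute each {{$.a.b}} placeholder with context["a"]["b"]."""

    def resolve(match):
        value = context
        for part in match.group(1).split("."):
            # A KeyError here avoids caching under a partially filled key.
            value = value[part]
        return str(value)

    return PLACEHOLDER.sub(resolve, template)


key = render_key(
    "{{$.input.company}}-{{$.input.stage}}",
    {"input": {"company": "acme", "stage": "seed"}},
)
# key is "acme-seed"
```

Because the key depends only on the template and the inputs, the same inputs always map to the same cache entry within a namespace.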

Semantics:

  • On cache hit (key exists, not expired, passes quality gate): skip inner steps, return cached data
  • On cache miss/expired: execute inner steps, validate output against quality gate, write to cache
  • Cache key is deterministic from the template expression
  • TTL supports human-readable durations: 3d, 30d, 1h
  • Quality gate is optional; if present, prevents caching bad results
  • All file I/O and error handling is managed by the engine
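The semantics above can be sketched in a few lines of Python. This is an illustration rather than a proposed implementation. The in-memory `store` dict stands in for the engine's file I/O, `run_inner` stands in for the inner steps, and the status labels are invented. Two points the issue leaves open are filled with assumptions: `minLength` is measured on the JSON-serialized output, and a result that fails the gate is returned to the flow but not cached.

```python
import json


def passes_gate(data, gate) -> bool:
    """Apply the optional quality gate; no gate means everything passes."""
    if gate is None:
        return True
    if len(json.dumps(data)) < gate.get("minLength", 0):
        return False
    required = gate.get("requiredFields", [])
    if required and not isinstance(data, dict):
        return False
    return all(field in data for field in required)


def run_cache_step(store, key, ttl_ms, gate, run_inner, now_ms):
    """Return cached data on a valid hit, otherwise run the inner steps."""
    entry = store.get(key)
    if entry is not None:
        fresh = now_ms - entry["writtenAt"] < ttl_ms
        if fresh and passes_gate(entry["data"], gate):
            return {"status": "hit", "data": entry["data"]}
    # Miss or expired: do the expensive work, then cache only good results.
    data = run_inner()
    if passes_gate(data, gate):
        store[key] = {"writtenAt": now_ms, "data": data}
        return {"status": "miss", "data": data}
    return {"status": "uncached", "data": data}
```

With the read, check, gate, and write in one place, the ordering and null-guard mistakes described under Brittleness have nowhere to occur in flow definitions.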

Benefits

  • 12 flows × 5 steps = 60 steps reduced to 12 cache blocks
  • Eliminates 4 known brittleness patterns
  • Standardizes TTL format (no more 30 * 24 * 60 * 60 * 1000 in code)
  • Quality gates become declarative, not imperative
  • Cache invalidation becomes a platform feature (one cache clear --namespace X)
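As a rough illustration of the standardized TTL format, a duration parser could look like the sketch below. The issue only shows the `d` and `h` units; the `s` and `m` units and the name `parse_ttl_ms` are assumptions.

```python
import re

# Milliseconds per unit; only "h" and "d" appear in the issue, "s" and "m" are assumed.
UNIT_MS = {"s": 1000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}
DURATION = re.compile(r"(\d+)([smhd])")


def parse_ttl_ms(text: str) -> int:
    """Convert a duration such as "30d" or "1h" to milliseconds."""
    match = DURATION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unsupported TTL: {text!r}")
    return int(match.group(1)) * UNIT_MS[match.group(2)]


# "30d" replaces the hand-written arithmetic used in flows today.
assert parse_ttl_ms("30d") == 30 * 24 * 60 * 60 * 1000
```

Rejecting anything that does not match the pattern keeps a typo such as "30 days" from silently becoming a zero or default TTL.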
