Support calculated measures (derived metrics) in YAML #202

@boringdata

Problem

BSL supports calc_measures in the Python API (SemanticModel(calc_measures={...})) but not in YAML definitions. As a result, a derived metric such as fraud_rate = fraud_volume / transaction_volume has to inline the full aggregation expression in YAML:

fraud_rate:
  expr: (_.has_fraudulent_dispute.cast("int64") * _.eur_amount).sum() / _.eur_amount.sum()

Instead of the more maintainable:

calculated_measures:
  fraud_rate:
    expr: fraud_volume / transaction_volume
    description: "Fraud rate as ratio of fraudulent volume to total volume"
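
To make the equivalence of the two forms concrete, here is a small worked example in plain Python. It does not use BSL; the rows are made-up values, and the variable names simply mirror the measure names above.

```python
# Toy rows standing in for a transactions table (made-up values).
rows = [
    {"eur_amount": 100.0, "has_fraudulent_dispute": False},
    {"eur_amount": 50.0, "has_fraudulent_dispute": True},
    {"eur_amount": 250.0, "has_fraudulent_dispute": False},
    {"eur_amount": 100.0, "has_fraudulent_dispute": True},
]

# Base measures, each aggregation written once.
fraud_volume = sum(int(r["has_fraudulent_dispute"]) * r["eur_amount"] for r in rows)
transaction_volume = sum(r["eur_amount"] for r in rows)

# Inlined form: both aggregations are repeated inside the ratio.
fraud_rate_inlined = (
    sum(int(r["has_fraudulent_dispute"]) * r["eur_amount"] for r in rows)
    / sum(r["eur_amount"] for r in rows)
)

# Calculated form: the ratio refers to the base measures by name.
fraud_rate = fraud_volume / transaction_volume

assert fraud_rate == fraud_rate_inlined
print(fraud_rate)  # 150.0 / 500.0
```

Both forms give the same number; the calculated form just avoids restating the aggregation logic.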

Proposal

Add a calculated_measures section to the YAML schema whose expressions can reference other measures by name. The implementation would:

  1. Parse calculated_measures in from_yaml() / from_config()
  2. Resolve measure references at evaluation time
  3. Pass them through to SemanticModel(calc_measures=...)
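
A rough sketch of how steps 1 and 2 could look, independent of BSL internals. Everything here is an assumption for illustration: the helper names (referenced_names, parse_calculated_measures, evaluate) are hypothetical, the config dict stands in for what a YAML loader would return, the top-level measures key is assumed, and only simple arithmetic between measure names is supported. The exact value format that SemanticModel(calc_measures=...) expects is not covered.

```python
import ast
import operator

# Stand-in for the parsed YAML file (hypothetical layout, including the "measures" key).
config = {
    "measures": {
        "fraud_volume": {"expr": '(_.has_fraudulent_dispute.cast("int64") * _.eur_amount).sum()'},
        "transaction_volume": {"expr": "_.eur_amount.sum()"},
    },
    "calculated_measures": {
        "fraud_rate": {
            "expr": "fraud_volume / transaction_volume",
            "description": "Fraud rate as ratio of fraudulent volume to total volume",
        },
    },
}

_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def referenced_names(expr: str) -> set[str]:
    """Return every bare identifier used in an expression string."""
    tree = ast.parse(expr, mode="eval")
    return {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}


def parse_calculated_measures(config: dict) -> dict[str, str]:
    """Step 1: collect calculated measures and reject unknown references."""
    calculated = config.get("calculated_measures", {})
    known = set(config.get("measures", {})) | set(calculated)
    result = {}
    for name, spec in calculated.items():
        unknown = referenced_names(spec["expr"]) - known
        if unknown:
            raise ValueError(f"{name} references unknown measures: {sorted(unknown)}")
        result[name] = spec["expr"]
    return result


def evaluate(name: str, calc: dict, base_values: dict, _stack: tuple = ()):
    """Step 2: resolve references recursively at evaluation time."""
    if name in base_values:
        return base_values[name]
    if name in _stack:
        raise ValueError("circular reference: " + " -> ".join(_stack + (name,)))

    def walk(node):
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Name):
            return evaluate(node.id, calc, base_values, _stack + (name,))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError(f"unsupported syntax in {name!r}")

    return walk(ast.parse(calc[name], mode="eval").body)


calc = parse_calculated_measures(config)
# Already-aggregated base measure values (made-up numbers).
base_values = {"fraud_volume": 150.0, "transaction_volume": 500.0}
print(calc)
print(evaluate("fraud_rate", calc, base_values))
```

Validating references at parse time surfaces typos in measure names when the YAML is loaded, while the cycle check during evaluation guards against calculated measures that reference each other.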

Why

  • DRY: Avoid duplicating aggregation logic across measures
  • Readability: fraud_volume / transaction_volume is clearer than the inlined version
  • Consistency: The Python API already supports this via the calc_measures parameter
  • LLM agents: Agents reading semantic_model.yml can better understand metric relationships when they see explicit references
