Feature Request: Extend Symbolic Tensor Graph to Support LLM Inference Workloads #23

@indukantdeo

Description


The current Symbolic Tensor Graph (STG) framework is extremely valuable for synthetic modeling of distributed LLM training workloads, supporting rich parallelization strategies (DP/TP/PP/SP) and exporting Chakra execution traces for ASTRA‑sim exploration.
However, LLM inference is increasingly a dominant workload in production systems and presents fundamentally different characteristics from training:

- Autoregressive, token-by-token execution
- Heavy emphasis on KV-cache memory traffic and placement
- No backward pass or optimizer steps
- Latency-critical optimization (end-to-end and per-token) rather than throughput-oriented
- Emerging parallelism strategies (context parallelism, decode batching, speculative decoding)
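To make the KV-cache point concrete, here is a minimal sketch (hypothetical helper names, not part of the STG API) of how the KV cache grows by a fixed increment with every generated token during autoregressive decode, which is the memory-traffic pattern a synthetic inference trace would need to model:

```python
def kv_cache_bytes(num_layers, num_heads, head_dim, seq_len, dtype_bytes=2):
    """Bytes held by the KV cache after seq_len tokens.

    Factor of 2 accounts for storing both K and V per layer;
    dtype_bytes=2 assumes fp16/bf16 activations.
    """
    return 2 * num_layers * num_heads * head_dim * seq_len * dtype_bytes


def decode_trace(prompt_len, new_tokens, num_layers=32, num_heads=32, head_dim=128):
    """KV-cache size after each generated token (autoregressive decode loop).

    Toy model sizes roughly in the 7B-parameter range; each decode step
    appends one token's worth of K/V per layer, so cache size grows linearly.
    """
    sizes = []
    for t in range(1, new_tokens + 1):
        sizes.append(kv_cache_bytes(num_layers, num_heads, head_dim, prompt_len + t))
    return sizes


trace = decode_trace(prompt_len=1024, new_tokens=4)
```

Under these toy assumptions, each decode step adds a constant 2 × layers × heads × head_dim × dtype_bytes to the cache, and total cache size scales linearly with context length; this per-step memory delta (and where the cache lives) is exactly what distinguishes an inference trace from a training one.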

Extending STG to natively support LLM inference trace generation would make it highly impactful for deployment‑time system design, hardware exploration, and inference‑stack research, complementing the current training‑focused workflow.
