The current Symbolic Tensor Graph (STG) framework is extremely valuable for synthetic modeling of distributed LLM training workloads, supporting rich parallelization strategies (DP/TP/PP/SP) and exporting Chakra execution traces for ASTRA‑sim exploration.
However, LLM inference is increasingly a dominant workload in production systems and presents fundamentally different characteristics from training:
- Autoregressive token-by-token execution
- Heavy emphasis on KV-cache memory traffic and placement
- Absence of a backward pass and optimizer steps
- Latency-critical optimization (end-to-end and per-token) rather than throughput-oriented
- Emerging parallelism strategies (context parallelism, decode batching, speculative decoding)
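To make the first three points concrete, here is a minimal sketch (plain Python, not an STG API; all parameter names and sizes are illustrative assumptions) contrasting the one-shot prefill phase with autoregressive decode, where each generated token grows the KV cache by a fixed increment and there is no backward pass:

```python
def kv_cache_bytes(seq_len, n_layers=32, n_heads=32, head_dim=128, dtype_bytes=2):
    # K and V tensors per layer, each of shape [seq_len, n_heads, head_dim].
    # Model shape is a hypothetical ~7B-class config, fp16 (2 bytes/element).
    return 2 * n_layers * seq_len * n_heads * head_dim * dtype_bytes

def simulate_inference(prompt_len, gen_tokens):
    # Prefill: a single pass over the whole prompt populates the cache.
    cache_len = prompt_len
    trace = [("prefill", prompt_len, kv_cache_bytes(cache_len))]
    # Decode: one token per step; each step reads the full cache and
    # appends one entry -- no backward pass, no optimizer step.
    for _ in range(gen_tokens):
        cache_len += 1
        trace.append(("decode", 1, kv_cache_bytes(cache_len)))
    return trace

for phase, tokens, cache in simulate_inference(prompt_len=512, gen_tokens=4):
    print(f"{phase}: +{tokens} tokens, KV cache = {cache / 2**20:.1f} MiB")
```

Even this toy model shows the asymmetry a trace generator would need to capture: prefill is one large compute-bound operator, while decode is a long tail of small, memory-bound steps whose KV-cache footprint grows linearly with generated tokens.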
Extending STG to natively support LLM inference trace generation would make it highly impactful for deployment‑time system design, hardware exploration, and inference‑stack research, complementing the current training‑focused workflow.