TraceLens

TraceLens is a Python library focused on automating analysis from trace files and enabling rich performance insights. Designed with simplicity and extensibility in mind, this library provides tools to simplify the process of profiling and debugging complex distributed training and inference systems. Find the PyTorch Conference 2025 poster for TraceLens here.

Key Features

✨ Hierarchical Performance Breakdowns: Pinpoint bottlenecks with a top-down view, moving from the overall GPU timeline (idle/busy) to operator categories (e.g., convolutions), individual operators, and right down to unique argument shapes.

⚙️ Compute & Roofline Modeling: Automatically translate raw timings into efficiency metrics like TFLOP/s and TB/s for popular operations. Determine if an op is compute- or memory-bound and see how effectively your code is using the hardware.

🔗 Multi-GPU Communication Analysis: Accurately diagnose scaling issues by dissecting collective operations. TraceLens separates pure communication time from synchronization skew and calculates effective bandwidth on your real workload, not a synthetic benchmark.

🔄 Trace Comparison: Quantify the impact of your changes with powerful trace diffing. By analyzing performance at the CPU dispatch level, TraceLens enables meaningful side-by-side comparisons across different hardware and software versions.

▶️ Event Replay: Isolate any operation for focused debugging. TraceLens generates minimal, self-contained replay scripts directly from trace metadata, making it simple to share IP-safe test cases with kernel developers.

🔧 Extensible SDK: Get started instantly with ready-to-use scripts, then build your own custom workflows using a flexible and hackable Python API.

Quick Start

Installation

1. Install TraceLens directly from GitHub:

pip install git+https://github.com/AMD-AGI/TraceLens.git

2. Command Line Scripts for popular analyses

Generate Excel Reports from Traces Detailed docs here (you can use compressed traces too such as .zip and .gz)

# PyTorch profiler traces
TraceLens_generate_perf_report_pytorch --profile_json_path path/to/your/trace.json

# rocprofv3 traces
TraceLens_generate_perf_report_rocprof --profile_json_path path/to/results.json

Compare Traces Detailed docs here

TraceLens_compare_perf_reports_pytorch \
    baseline.xlsx \
    candidate.xlsx \
    --names baseline candidate \
    --sheets all \
    -o comparison.xlsx

Generate Collective Performance Report Detailed docs here

TraceLens_generate_multi_rank_collective_report_pytorch \
    --trace_dir /path/to/traces \
    --world_size 8 \

Refer to the individual module docs in the docs/ directory and the example notebooks under examples/ for further guidance.

📦 Custom Workflows: Check out examples/custom_workflows/ for community-contributed utilities including roofline_analyzer and traceMap — powerful tools we're working on integrating more tightly into the core library.

Supported Profile Formats

TraceLens supports multiple profiling formats:

Format	Tool	Documentation
PyTorch	`torch.profiler`	docs/generate_perf_report.md
JAX	XPlane protobuf	docs/generate_perf_report_jax.md
rocprofv3	AMD ROCm rocprofiler-sdk	docs/generate_perf_report_rocprof.md

rocprofv3 Support

TraceLens now supports AMD's rocprofv3 (rocprofiler-sdk) JSON format:

# Generate performance report from rocprofv3 trace
TraceLens_generate_perf_report_rocprof \
    --profile_json_path trace_results.json \
    --short_kernel_study \
    --kernel_details

Features:

GPU timeline breakdown (kernel, memory, idle time)
Kernel summary with statistical analysis
Automatic kernel categorization (GEMM, Attention, Elementwise, etc.)
Short kernel analysis
Grid/block dimension tracking

See docs/generate_perf_report_rocprof.md for detailed usage.

Contributing

We welcome issues, bug reports, and pull requests. Feel free to open discussions in the GitHub repository or contribute new performance models, operator mappings or analysis modules. Please see CONTRIBUTING.md for guidelines.

Name		Name	Last commit message	Last commit date
Latest commit History 457 Commits
.github		.github
TraceLens		TraceLens
docs		docs
examples		examples
tests		tests
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

TraceLens

Key Features

Quick Start

Installation

Supported Profile Formats

rocprofv3 Support

Contributing

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors 23

Uh oh!

Languages

License

AMD-AGI/TraceLens

Folders and files

Latest commit

History

Repository files navigation

TraceLens

Key Features

Quick Start

Installation

Supported Profile Formats

rocprofv3 Support

Contributing

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors 23

Uh oh!

Languages

Packages