Overview
Built-in profiling tools for kernel performance and memory analysis.
Features
Kernel Profiler
import pygpukit as gk
from pygpukit.profiling import Profiler
with Profiler() as p:
    result = gk.ops.matmul(a, b)
    gk.ops.softmax(result)
p.summary()
# Kernel        | Time (ms) | TFLOPS | Memory (GB/s)
# gemm_fp8_bf16 | 0.42      | 312.5  | 1250.0
# softmax       | 0.08      | -      | 890.0
p.export('profile.json') # Chrome trace format
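As a rough illustration of how the TFLOPS and Memory (GB/s) columns could be derived from a measured kernel time, here is a minimal sketch. The helper names `matmul_tflops` and `bandwidth_gbs` are hypothetical and not part of pygpukit. The sketch assumes the conventional 2*M*N*K FLOP count for a matrix multiply and that the number of bytes moved is known to the caller.

```python
def matmul_tflops(m: int, n: int, k: int, time_ms: float) -> float:
    """Throughput of an (m x k) @ (k x n) matmul in TFLOPS.

    Hypothetical helper: assumes 2*m*n*k floating-point operations
    (one multiply and one add per inner-product term).
    """
    if time_ms <= 0:
        raise ValueError("time_ms must be positive")
    flops = 2 * m * n * k
    seconds = time_ms * 1e-3
    return flops / seconds / 1e12


def bandwidth_gbs(bytes_moved: float, time_ms: float) -> float:
    """Effective memory bandwidth in GB/s (1 GB = 1e9 bytes).

    Hypothetical helper: the caller supplies the total bytes read
    plus written by the kernel.
    """
    if time_ms <= 0:
        raise ValueError("time_ms must be positive")
    seconds = time_ms * 1e-3
    return bytes_moved / seconds / 1e9
```

For memory-bound kernels such as softmax, a FLOP-based figure is not very informative, which is consistent with the `-` shown in the TFLOPS column above.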
Memory Analyzer
from pygpukit.profiling import MemoryProfiler
with MemoryProfiler() as mp:
    model = QwenModel.from_safetensors(path)
    output = model.generate(prompt)
mp.peak_memory() # 12.5 GB
mp.current_memory() # 8.2 GB
mp.allocation_trace() # List of allocations
mp.fragmentation() # 0.15 (15% fragmented)
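The text above does not say how the fragmentation ratio is defined. One common choice, shown here only as an assumption, is one minus the share of free memory held by the largest free block. The function name `fragmentation_ratio` and its input (a list of free block sizes in bytes) are hypothetical.

```python
from typing import Sequence


def fragmentation_ratio(free_block_sizes: Sequence[int]) -> float:
    """Return a fragmentation score in [0, 1).

    Assumed definition (not confirmed by the source):
        1 - largest_free_block / total_free_bytes
    0.0 means all free memory is one contiguous block; values near 1.0
    mean free memory is scattered across many small blocks.
    """
    total_free = sum(free_block_sizes)
    if total_free == 0:
        # No free memory at all: nothing to fragment.
        return 0.0
    return 1.0 - max(free_block_sizes) / total_free
```

Under this definition, free blocks of 85, 10, and 5 units give a ratio of about 0.15, since the largest block holds 85% of the free space.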
Timeline View
from pygpukit.profiling import timeline
with timeline('inference.json'):
    for _ in range(100):
        model.forward(input)
# Open in chrome://tracing
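To make the output format concrete, here is a minimal sketch of a context manager that writes complete ("X" phase) events in the Chrome Trace Event JSON format. The names `trace_to_file` and `span` are hypothetical and are not the pygpukit implementation. The sketch measures host wall-clock time with `time.perf_counter`, so it does not capture GPU kernel timings.

```python
import json
import time
from contextlib import contextmanager


@contextmanager
def trace_to_file(path):
    """Collect timed spans and write them as a Chrome trace on exit.

    Hypothetical helper: yields a `span(name)` context manager that
    records one complete event per use.
    """
    events = []

    @contextmanager
    def span(name):
        start = time.perf_counter()
        try:
            yield
        finally:
            end = time.perf_counter()
            events.append({
                "name": name,
                "ph": "X",                   # complete event: start + duration
                "ts": start * 1e6,           # timestamps are in microseconds
                "dur": (end - start) * 1e6,
                "pid": 0,
                "tid": 0,
            })

    try:
        yield span
    finally:
        # Write the file even if the traced code raised.
        with open(path, "w") as f:
            json.dump({"traceEvents": events}, f)
```

Usage would look like `with trace_to_file('inference.json') as span:` followed by `with span('forward'): ...` around each step. The resulting file can be loaded in chrome://tracing.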
Implementation