@cj401-amd

This PR is based on the PR for XLA 0.6.0 (ROCm/xla#302) and the upstream PR openxla/xla#29769.

It integrates rocprofiler-sdk for improved profiling, using rocprofiler_force_configure() and annotations, and supports both time-based and step-based profiling:

  • Keep roctracer (v1) for ROCm versions < 6.3
  • rocprofiler-sdk and roctracer are selected at compile time based on a ROCm version guard
  • Add unit tests for rocm_collector and rocm_tracer for v3 (ROCm version >= 6.3)
  • TODO: still need to figure out how to add more kernel-related stats, e.g., kernel size, occupancy, DMA copy, etc.
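
For reviewers unfamiliar with the compile-time selection, here is a minimal sketch of how a version guard can pick one backend or the other. The macro name `EXAMPLE_ROCM_VERSION`, its `major*10000 + minor*100 + patch` encoding, and the function `SelectedProfilerBackend()` are hypothetical names chosen for illustration; they are not taken from this PR's diff.

```cpp
#include <string>

// Hypothetical version macro, for illustration only.
// Assumed encoding: major*10000 + minor*100 + patch, so ROCm 6.3.0 -> 60300.
// In a real build this would be supplied by the build system.
#ifndef EXAMPLE_ROCM_VERSION
#define EXAMPLE_ROCM_VERSION 60300
#endif

// Returns the name of the profiling backend chosen at compile time.
// Only one branch is compiled, so the other backend's headers and
// symbols would never be referenced in that build.
std::string SelectedProfilerBackend() {
#if EXAMPLE_ROCM_VERSION >= 60300
  return "rocprofiler-sdk (v3)";  // ROCm >= 6.3
#else
  return "roctracer (v1)";        // ROCm < 6.3
#endif
}
```

Because the choice is made by the preprocessor, a given binary contains exactly one of the two tracer implementations, which is why the v3 unit tests only apply to builds with ROCm version >= 6.3.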
