Commit 01569aa
[AMDGPU] Add TDM Descriptor Optimization Pass
AMDGPU code generation frequently builds tensor and address descriptors from sequences of insertelement instructions. When these patterns reuse many fields or carry cross-iteration dependencies, the current lowering produces REG_SEQUENCE operations in MIR rather than INSERT_SUBREG chains, which leads to suboptimal register allocation.
This PR introduces the AMDGPUTDMOptimization pass that:
- Pattern Detection: Identifies descriptor creation chains with reusable constant fields
- Grouping: Groups similar patterns to maximize optimization opportunities
- Transformation: Converts insertelement chains to alloca + field updates in address space 5
- Integration: Works with SROA to generate INSERT_SUBREG operations for optimal register allocation

1 parent e3ef26d commit 01569aa
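The detection and grouping steps listed in the commit message can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the code in this commit: it does not use the LLVM API, and every name in it (`FieldWrite`, `Chain`, `ConstKey`, `finalWrites`, `constantLanes`, `groupChains`, `writesAfterSharing`, the `minConstants` threshold) is hypothetical. It treats a descriptor chain as a list of lane writes, groups chains whose constant lanes match, and counts how many lane writes remain if each group's constants are stored once and only the varying lanes are updated afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical model of one insertelement in a descriptor-building chain.
// `constant` holds the value when the lane is written with a compile-time
// constant; it is empty when the value varies (e.g. per loop iteration).
struct FieldWrite {
  unsigned lane;
  std::optional<std::uint32_t> constant;
};

using Chain = std::vector<FieldWrite>;
// Grouping key: (lane, constant) pairs in ascending lane order.
using ConstKey = std::vector<std::pair<unsigned, std::uint32_t>>;

// Resolve a chain to the final write per lane; a later write overrides an
// earlier write to the same lane.
std::map<unsigned, std::optional<std::uint32_t>> finalWrites(const Chain &chain) {
  std::map<unsigned, std::optional<std::uint32_t>> result;
  for (const FieldWrite &w : chain)
    result[w.lane] = w.constant;
  return result;
}

// Pattern detection: collect the lanes whose final write is a constant.
ConstKey constantLanes(const Chain &chain) {
  ConstKey key;
  for (const auto &[lane, value] : finalWrites(chain))
    if (value)
      key.emplace_back(lane, *value);
  return key;
}

// Grouping: chains with identical constant lanes could share one storage slot.
// Chains with fewer than `minConstants` constant lanes are left out.
std::map<ConstKey, std::vector<std::size_t>>
groupChains(const std::vector<Chain> &chains, std::size_t minConstants) {
  std::map<ConstKey, std::vector<std::size_t>> groups;
  for (std::size_t i = 0; i < chains.size(); ++i) {
    ConstKey key = constantLanes(chains[i]);
    if (key.size() >= minConstants)
      groups[std::move(key)].push_back(i);
  }
  return groups;
}

// Transformation, as a cost model only: the constants of a group are written
// once, then each member chain updates just its non-constant lanes.
std::size_t writesAfterSharing(const std::vector<Chain> &chains,
                               const ConstKey &key,
                               const std::vector<std::size_t> &members) {
  assert(!members.empty() && "a group always has at least one chain");
  std::size_t writes = key.size();
  for (std::size_t i : members)
    writes += finalWrites(chains[i]).size() - key.size();
  return writes;
}
```

In this model, two four-lane chains that share two constant lanes need 6 lane writes after sharing instead of 8. The real pass operates on LLVM IR and relies on SROA and instruction selection for the final form, so the numbers here only illustrate why grouping by shared constants can pay off.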
File tree
5 files changed, +1015 -0 lines changed
- llvm
  - lib/Target/AMDGPU
  - test/CodeGen/AMDGPU