132 changes: 132 additions & 0 deletions proposals/NNNN-uniformity-qualifiers.md
@@ -0,0 +1,132 @@
<!-- {% raw %} -->

# Uniformity Qualifiers

* Proposal: [NNNN](NNNN-uniformity-qualifiers.md)
* Author(s): [Chris Bieneman](https://github.com/llvm-beanz)
* Sponsor: [Chris Bieneman](https://github.com/llvm-beanz)

* Status: **Under Consideration**

## Introduction

The HLSL Single Program Multiple Data (SPMD) programming model defines a program
in terms of how it operates on a single element of data. An SPMD program may be
executed on a traditional scalar processor or a Single Instruction Multiple Data
(SIMD) processor. On a SIMD processor, a single instruction may produce results
for multiple threads of execution from the source programming model. This is
sometimes referred to as Single Instruction Multiple Threads (SIMT).

Under HLSL's execution model, groups of threads form hierarchical scopes:
* A _dispatch_ represents the full set of threads spawned from a CPU API
invocation.
* A _thread group_ represents a subset of a dispatch that can execute
concurrently.
* A _wave_ represents a subset of a thread group that executes on a single
SIMD processor.
* A _quad_ represents a grouping of four adjacent threads in a wave.
Comment on lines +22 to +28
Member

Worth including Vulkan terminology here as well?


When a shader is executing threads concurrently on one or more processing cores,
an emergent property of _uniformity_ exists within the thread group, wave and
quad scopes.

Uniformity can refer to data or control flow. If a variable has the same value
across all threads in a scope, it is said to be _uniform_ across that scope.
Similarly, if all threads within a scope are actively executing instructions
within a control flow block, the control flow is said to be _uniform control
flow_ across that scope.
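
To make the definition of data uniformity concrete, here is a minimal sketch in standard C++ rather than HLSL. It models a scope as an array holding one value per thread. The helper names `isUniform` and `isQuadUniform` are hypothetical, and the sketch assumes quads are aligned groups of four adjacent lanes, which the text above does not state explicitly.

```cpp
#include <array>
#include <cstddef>

// Hypothetical model: Lanes holds one value per thread in some scope.
// The value is uniform across the scope if every lane holds the same value.
template <std::size_t N>
constexpr bool isUniform(const std::array<int, N> &Lanes) {
  for (std::size_t I = 1; I < N; ++I)
    if (Lanes[I] != Lanes[0])
      return false;
  return true;
}

// Assumption: a quad is an aligned group of four adjacent lanes.
// The value is quad-uniform if each such group holds a single value.
template <std::size_t N>
constexpr bool isQuadUniform(const std::array<int, N> &Lanes) {
  static_assert(N % 4 == 0, "lane count must be a multiple of four");
  for (std::size_t Q = 0; Q < N; Q += 4)
    for (std::size_t I = 1; I < 4; ++I)
      if (Lanes[Q + I] != Lanes[Q])
        return false;
  return true;
}

// Eight lanes: each quad agrees internally, but the two quads differ.
constexpr std::array<int, 8> ExampleLanes = {3, 3, 3, 3, 9, 9, 9, 9};
static_assert(isQuadUniform(ExampleLanes), "uniform within each quad");
static_assert(!isUniform(ExampleLanes), "not uniform across all lanes");
```

The example value is uniform at quad scope but not across the wider eight-lane scope, which is why uniformity has to be stated relative to a scope.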

## Motivation
Member

I'd love to know what, if anything, this proposal does to NonUniformResourceIndex.

Collaborator Author

It should cease to exist, and I should capture that.


Uniformity of data and control flow are central concepts in SIMT execution
models, and are required for correct execution of shader programs. Despite
the importance of this fundamental property, it is not represented in any
explicit way in the HLSL language.
Comment on lines +42 to +45
Member

Maybe mention that this requires backend compilers to attempt to analyse for uniformity?


This proposal seeks to address that by introducing core concepts around
uniformity to HLSL's type system and programming model.

## Proposed solution

### Uniformity as a Type Qualifier
Member

Are there alternative approaches to using type qualifiers? I think doing so means we need to change the grammar?

Could something like:

group_uniform<int> myValue

Be made to work?

Collaborator Author

If we have a template and non-template spelling that don't use the same words we could do something similar. For example:

template<typename T>
using GroupUniform = group_uniform T;

Would simply enable:

GroupUniform<int> MyInt;


This proposal introduces a new set of type qualifiers to represent the different
scopes of uniformity:
* `group_uniform`
* `simd_uniform`
* `quad_uniform`
Comment on lines +56 to +58
Member

It'd be good to explain how these relate to the dispatch, thread group, wave etc. scopes that were defined above.

* non-uniform (default state with no associated keyword)
Member

Might be nice to have a way to explicitly mark non-uniform? Would leave open possibility of a "strict" mode where uniformity annotations are required and would allow developers to make their intent more explicit.

Collaborator Author

I need to think on this. My intent was that we would only have a strict mode, so explicit uniformity would be required everywhere that requires uniformity, and that no annotation would mean non-uniform. With that approach I'm not sure how an explicit non_uniform helps, but maybe I'm missing something.

Member

@damyanp damyanp Feb 22, 2025

I'm mainly wanting to avoid developers having to come up with naming or commenting conventions to document non-uniformity in their code, so I see it filling a role more like the signed keyword. (Although I expect now to be educated that signed actually does something more than I think it does)

Comment on lines +56 to +59

Is there any merit to having a command_uniform scope or similar indicating uniformity across all thread groups? E.g. for constants? There are, to my knowledge, some implementations that care about this (at least outside of DX12).


`group_uniform` is the widest scope of uniformity, and implies all other
scopes. `simd_uniform` implies `quad_uniform`.

A new "UniformityReduction" cast will convert a value from one uniformity scope
to another, as long as the source scope is wider than the destination scope.
Comment on lines +64 to +66

Something I'm missing here is what happens if you did, e.g.:

group_uniform int a;
simd_uniform int b;
int c = a * b;

I would expect c to automatically be treated as simd_uniform (i.e. the smaller of the scopes) as the result, but that's not spelled out here AFAICT.


Oh wait now I look again that's what the next two sentences are saying - perhaps worth an example?


Any GLValue with a uniformity scope can be implicitly converted to a GLValue
with reduced uniform scope or no uniformity scope.

Any PRValue with a uniformity scope can be implicitly converted to a PRValue
with reduced uniform scope or no uniformity scope.

No implicit or explicit cast can increase uniformity scope.
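
The conversion rules above amount to an ordering on scopes. As a rough illustration only, the following standard C++ sketch models that ordering. The `Uniformity` enum and `canConvert` function are hypothetical names invented for this sketch and are not part of the proposal or of any compiler.

```cpp
// Hypothetical model of the proposed scopes, ordered narrowest to widest.
// NonUniform stands for the default state with no qualifier.
enum class Uniformity { NonUniform, Quad, Simd, Group };

// A conversion is permitted only if it keeps or reduces the uniformity
// scope. No conversion may increase it.
constexpr bool canConvert(Uniformity From, Uniformity To) {
  return To <= From;
}

// group_uniform can be reduced to simd_uniform or to no uniformity.
static_assert(canConvert(Uniformity::Group, Uniformity::Simd), "reduce");
static_assert(canConvert(Uniformity::Simd, Uniformity::NonUniform), "drop");
// quad_uniform cannot be promoted to simd_uniform.
static_assert(!canConvert(Uniformity::Quad, Uniformity::Simd), "no increase");
```

Under this model the "implies" relationships fall out of the ordering: a `group_uniform` value is acceptable anywhere a `simd_uniform` or `quad_uniform` value is expected.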

HLSL library functionality that produces uniform results will be updated to
produce appropriately qualified uniform types. These functions can produce
uniform values from non-uniform inputs. For example:

```hlsl
simd_uniform bool WaveActiveAllTrue(bool);
quad_uniform bool QuadAny(bool);
```

Compile-time constants and `groupshared` variable declarations imply
`group_uniform` uniformity.
Comment on lines +85 to +86

Should constant buffer/SRV data also be automatically marked this way? They're supposed to be read-only and non-volatile for the lifetime of the shader AFAIU.


Built-in operators will produce result values whose uniformity is the
intersection of the uniformity of their arguments, that is, the narrowest scope
among the operands.
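
One way to read this rule is as taking the narrower of the operand scopes. The standard C++ sketch below models that reading. The `Uniformity` enum and `resultOf` function are hypothetical names used only for illustration.

```cpp
// Hypothetical model of the proposed scopes, ordered narrowest to widest.
enum class Uniformity { NonUniform, Quad, Simd, Group };

// Result uniformity of a binary built-in operator: the narrower of the two
// operand scopes.
constexpr Uniformity resultOf(Uniformity LHS, Uniformity RHS) {
  return LHS < RHS ? LHS : RHS;
}

// group_uniform * simd_uniform yields simd_uniform.
static_assert(resultOf(Uniformity::Group, Uniformity::Simd) ==
                  Uniformity::Simd,
              "narrower scope wins");
// Any non-uniform operand makes the result non-uniform.
static_assert(resultOf(Uniformity::Group, Uniformity::NonUniform) ==
                  Uniformity::NonUniform,
              "non-uniform operand");
```

This matches the `a * b` case raised in review, where a `group_uniform` operand combined with a `simd_uniform` operand is expected to give a `simd_uniform` result.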

Vector and matrix component access expressions and structure member expressions
will produce result values with the same uniformity as the base object.

```hlsl
groupshared int SomeData[10];
simd_uniform int WaveReadLaneFirst(int);

void fn(int Val) {
  // WaveReadLaneFirst produces a simd_uniform value.
  simd_uniform int Idx = WaveReadLaneFirst(Val);
  // group_uniform indexed by simd_uniform produces simd_uniform value.
  simd_uniform int GSVal = SomeData[Idx];
  // group_uniform with group_uniform index produces group_uniform value.
  group_uniform int GSVal2 = SomeData[SomeData[0]];
  // Binary operator of group_uniform and simd_uniform values produces a
  // simd_uniform value.
  if (GSVal > GSVal2) { // This control flow can be defined as simd_uniform.
  }
}
```

Uniformity qualifiers may be applied to shader inputs. When a qualifier is
applied to an input, the compiler will diagnose known cases where the qualifier
mismatches, and it will trust the user in other cases. Runtime validation may
be added to catch incorrect source annotations.
Comment on lines +112 to +115

It would be good to have some sort of interaction with NonUniformResourceIndex documented - feels like these qualifiers could just outright replace it.


### Uniformity Requirements for Functions

This proposal introduces a new set of attributes for defining the control flow
uniformity requirements of functions. These new attributes take the form:

```hlsl
[[hlsl::required_uniform(group|simd|quad)]]
```

With these annotations applied to function declarations the compiler can produce
diagnostics when functions with a required uniformity are called in contexts
with insufficient uniformity. For example, a quad or derivative method called in
non-uniform control flow can become an error that is trivially identified on the
AST.
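
As a rough sketch of the diagnostic described above, the following standard C++ models the check a compiler might perform at each call site. The `Uniformity` enum and `callIsValid` function are hypothetical names, and the proposal does not specify how the check would actually be implemented.

```cpp
// Hypothetical model of the proposed scopes, ordered narrowest to widest.
enum class Uniformity { NonUniform, Quad, Simd, Group };

// A call is valid when the control flow at the call site is at least as
// uniform as the scope named in the callee's required_uniform attribute.
constexpr bool callIsValid(Uniformity Required, Uniformity Context) {
  return Context >= Required;
}

// A function requiring quad uniformity may be called from simd-uniform
// control flow, since simd_uniform implies quad_uniform.
static_assert(callIsValid(Uniformity::Quad, Uniformity::Simd), "ok");
// The same function called from non-uniform control flow would be diagnosed.
static_assert(!callIsValid(Uniformity::Quad, Uniformity::NonUniform),
              "diagnosed");
```

Because the check only compares two scopes known at the call site, it needs no data-flow analysis beyond what the type qualifiers already provide.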

<!-- {% endraw %} -->