✨ Add non-linear control flow conversions between QC and QCO dialects#1396

Open
li-mingbao wants to merge 122 commits into munich-quantum-toolkit:main from li-mingbao:scf-conversion

Conversation

@li-mingbao
Contributor

@li-mingbao li-mingbao commented Dec 19, 2025

Description

This PR adds support for converting the scf operations scf.while, scf.for, and scf.if, as well as additional functions, between the QC and QCO dialects. This enables the conversion of programs with non-linear control flow.

Reopened since #1391 got automatically closed after the branch was deleted.
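As a rough, hypothetical sketch (not taken verbatim from this PR's tests), the core of the conversion can be seen on an scf.if: the reference-semantics QC form mutates qubits in place, while the value-semantics QCO form threads qubits through the region as SSA values:

```mlir
// QC (reference semantics): the branch mutates %q0 in place.
scf.if %cond {
  qc.x %q0 : !qc.qubit
}

// QCO (value semantics): the qubit is threaded through the region as an
// SSA value and yielded from both branches.
%q1 = scf.if %cond -> (!qco.qubit) {
  %q = qco.x %q0 : !qco.qubit -> !qco.qubit
  scf.yield %q : !qco.qubit
} else {
  scf.yield %q0 : !qco.qubit
}
```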

Checklist:

  • The pull request only contains commits that are focused and relevant to this change.
  • I have added appropriate tests that cover the new/changed functionality.
  • I have updated the documentation to reflect these changes.
  • I have added entries to the changelog for any noteworthy additions, changes, fixes, or removals.
  • I have added migration instructions to the upgrade guide (if needed).
  • The changes follow the project's style guidelines and introduce no new warnings.
  • The changes are fully tested and pass the CI checks.
  • I have reviewed my own code changes.

@codecov

codecov bot commented Dec 19, 2025

Codecov Report

❌ Patch coverage is 98.14020% with 13 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
mlir/lib/Dialect/QCO/Builder/QCOProgramBuilder.cpp 93.8% 9 Missing ⚠️
mlir/lib/Conversion/QCOToQC/QCOToQC.cpp 97.9% 3 Missing ⚠️
mlir/lib/Conversion/QCToQCO/QCToQCO.cpp 99.6% 1 Missing ⚠️


@denialhaag denialhaag added enhancement Improvement of existing feature MLIR Anything related to MLIR labels Dec 19, 2025
Member

@denialhaag denialhaag left a comment


I went through all the changes again and only found a couple of small suggestions.

I like your implementation for looping through lists of qubits. I see your point that the implementation in the QCO dialect is a bit verbose, but we might just have to accept that.

Judging by the tests, the current implementation is quite flexible, allowing us to loop through arbitrary lists of qubits. I don't have a strong stance on whether we should reintroduce the concept of registers. That said, I don't think they would reduce the complexity of the implementation, especially if we want to loop through arbitrary lists.

I would hand over to @burgholzer again for his input. 😌

@denialhaag denialhaag requested a review from burgholzer January 26, 2026 19:30
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
mlir/include/mlir/Dialect/QCO/Builder/QCOProgramBuilder.h (1)

1287-1307: Document the new Region* parameter in validateQubitValue.

Line 1288’s doc block lists only qubit; please add @param region and fix the grammar in the next block to avoid API confusion.

✍️ Suggested doc fix
   /**
    * @brief Validate that a qubit value is valid and unconsumed
    * @param qubit Qubit value to validate
+   * @param region Region that owns the qubit SSA value
    * @throws Aborts if qubit is not tracked (consumed or never created)
    */
   void validateQubitValue(Value qubit, Region* region) const;
@@
-   * @param region The Region in where the qubits are defined.
+   * @param region The Region where the qubits are defined.
    */
   void updateQubitTracking(Value inputQubit, Value outputQubit, Region* region);

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@mlir/include/mlir/Dialect/QCO/Builder/QCOProgramBuilder.h`:
- Around line 1092-1258: Add runtime validation to prevent container qubit types
for scalar-only ops: in QCOProgramBuilder methods scfIf (validate the ValueRange
qubits), scfWhile (validate ValueRange args), funcFunc (validate argTypes and
resultTypes) and funcCall (validate operands), check that each
operand/argument/result is a scalar qubit type (not a MemRefType/TensorType or
other container of qubits) and emit a clear diagnostic or assert if a container
type is found; alternatively, if you prefer documentation-only, update the API
comments for scfIf, scfWhile, funcFunc and funcCall to state explicitly that
only scalar qubit types are accepted and converters cannot handle memref/tensor
containers.
- Around line 1052-1086: The Value branch of index handling in tensorExtract and
tensorInsert lacks type validation and can accept non-IndexType Values; update
the implementations (or utils::variantToValue usage) to detect when the index
variant holds a Value and assert or error if that Value's type is not
mlir::IndexType (e.g., using indexValue.getType().isa<mlir::IndexType>()),
emitting a clear diagnostic or failing fast with a descriptive message
mentioning tensorExtract/tensorInsert and the invalid index type; alternatively,
if you prefer a contract-based approach, document the precondition on
tensorExtract and tensorInsert that Value indices must be IndexType.

Comment on lines +1052 to +1086
/**
* @brief Construct a tensor.extract operation at the given index
*
* @param tensor The tensor where the value is extracted
* @param index The index where the value is extracted
* @return The extracted value
*
* @par Example:
* ```c++
* q = builder.tensorExtract(tensor, iv);
* ```
* ```mlir
* %q = tensor.extract %tensor[%iv] : tensor<3x!qco.qubit>
* ```
*/
Value tensorExtract(Value tensor, const std::variant<int64_t, Value>& index);

/**
* @brief Construct a tensor.insert operation at the given index
*
* @param element The inserted value
* @param tensor The tensor where the value is inserted
* @param index The index where the value is inserted
* @return The resulting tensor
*
* @par Example:
* ```c++
* newTensor = builder.tensorInsert(element, tensor, iv);
* ```
* ```mlir
* %newTensor = tensor.insert %q into %tensor[%iv] : tensor<3x!qco.qubit>
* ```
*/
Value tensorInsert(Value element, Value tensor,
const std::variant<int64_t, Value>& index);
Contributor


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

rg -n -C10 "void.*tensorExtract|Value.*tensorExtract|void.*tensorInsert|Value.*tensorInsert" mlir/lib/Dialect/QCO/Builder/QCOProgramBuilder.cpp | head -100

Repository: munich-quantum-toolkit/core

Length of output: 1721


🏁 Script executed:

fd -t f "\.h" mlir/include | grep -i utils | head -20

Repository: munich-quantum-toolkit/core

Length of output: 198


🏁 Script executed:

rg -n "variantToValue" mlir/include/mlir/Dialect/QCO/QCOUtils.h mlir/include/mlir/Dialect/Utils/Utils.h

Repository: munich-quantum-toolkit/core

Length of output: 182


🏁 Script executed:

cat -n mlir/include/mlir/Dialect/Utils/Utils.h | sed -n '40,80p'

Repository: munich-quantum-toolkit/core

Length of output: 1750


🏁 Script executed:

rg -n "constantFromScalar" mlir/include/mlir/Dialect/Utils/Utils.h -A 20 | head -60

Repository: munich-quantum-toolkit/core

Length of output: 1907


The int64_t case is handled correctly, but Value indices lack type validation.

The tensorExtract and tensorInsert implementations use utils::variantToValue() which properly normalizes int64_t indices to index constants via builder.getIndexAttr(). However, when a Value index is provided, it is returned as-is with no validation that it has IndexType. This allows invalid IR if a non-index Value is passed. Add a runtime check to ensure Value indices have IndexType, or document this as a precondition.


Comment on lines +1092 to +1258
/**
* @brief Construct a scf.for operation with iter args
*
* @param lowerbound Lowerbound of the loop
* @param upperbound Upperbound of the loop
* @param step Stepsize of the loop
* @param initArgs Initial arguments for the iter args
* @param body Function that builds the body of the for operation
* @return ValueRange of the results
*
* @par Example:
* ```c++
* builder.scfFor(lb, ub, step, initArgs, [&](Value iv, ValueRange iterArgs)
* -> llvm::SmallVector<Value> { auto q1 = builder.x(iterArgs[0]);
* return {q1};
* });
* ```
* ```mlir
* %q1 = scf.for %iv = %lb to %ub step %step iter_args(%arg0 = %q0)
* -> !qco.qubit {
* %q2 = qco.x %arg0 : !qco.qubit -> !qco.qubit
* scf.yield %q2 : !qco.qubit
* }
* ```
*/
ValueRange
scfFor(const std::variant<int64_t, Value>& lowerbound,
const std::variant<int64_t, Value>& upperbound,
const std::variant<int64_t, Value>& step, ValueRange initArgs,
llvm::function_ref<llvm::SmallVector<Value>(Value, ValueRange)> body);
/**
* @brief Construct a scf.while operation with return values
*
* @param args Arguments for the while loop
* @param beforeBody Function that builds the before body of the while
* operation
* @param afterBody Function that builds the after body of the while operation
* @return ValueRange of the results
*
* @par Example:
* ```c++
* builder.scfWhile(args, [&](ValueRange iterArgs) -> llvm::SmallVector<Value>
* { auto q1 = builder.h(iterArgs[0]);
* auto [q2, measureRes] = builder.measure(q1);
* builder.scfCondition(measureRes, q2);
* return {q2};
* }, [&](ValueRange iterArgs) -> llvm::SmallVector<Value> {
* auto q1 = builder.x(iterArgs[0]);
* return {q1};
* });
* ```
* ```mlir
* %q1 = scf.while (%arg0 = %q0): (!qco.qubit) -> (!qco.qubit) {
* %q2 = qco.h(%arg0)
* %q3, %result = qco.measure %q2 : !qco.qubit
* scf.condition(%result) %q3 : !qco.qubit
* } do {
* ^bb0(%arg0 : !qco.qubit):
* %q4 = qco.x %arg0 : !qco.qubit -> !qco.qubit
* scf.yield %q4 : !qco.qubit
* }
* ```
*/
ValueRange
scfWhile(ValueRange args,
llvm::function_ref<llvm::SmallVector<Value>(ValueRange)> beforeBody,
llvm::function_ref<llvm::SmallVector<Value>(ValueRange)> afterBody);

/**
* @brief Construct a scf.if operation with return values
*
* @param condition Condition for the if operation
* @param qubits Qubits used in the if/else body
* @param thenBody Function that builds the then body of the if
* operation
* @param elseBody Function that builds the else body of the if operation
* @return ValueRange of the results
*
* @par Example:
* ```c++
* builder.scfIf(condition, qubits, [&]() -> llvm::SmallVector<Value> {
* auto q1 = builder.h(q0);
* return {q1};
* }, [&]() -> llvm::SmallVector<Value> {
* auto q1 = builder.x(q0);
* return {q1};
* });
* ```
* ```mlir
* %q1 = scf.if %condition -> (!qco.qubit) {
* %q2 = qco.h %q0 : !qco.qubit -> !qco.qubit
* scf.yield %q2 : !qco.qubit
* } else {
* %q2 = qco.x %q0 : !qco.qubit -> !qco.qubit
* scf.yield %q2 : !qco.qubit
* }
* ```
*/
ValueRange scfIf(const std::variant<bool, Value>& condition,
ValueRange qubits,
llvm::function_ref<llvm::SmallVector<Value>()> thenBody,
llvm::function_ref<llvm::SmallVector<Value>()> elseBody);

/**
* @brief Construct a scf.condition operation with yielded values
*
* @param condition Condition for condition operation
* @param yieldedValues ValueRange of the yieldedValues
* @return Reference to this builder for method chaining
*
* @par Example:
* ```c++
* builder.scfCondition(condition, yieldedValues);
* ```
* ```mlir
* scf.condition(%condition) %q0 : !qco.qubit
* ```
*/
QCOProgramBuilder& scfCondition(Value condition, ValueRange yieldedValues);

//===--------------------------------------------------------------------===//
// Func operations
//===--------------------------------------------------------------------===//

/**
* @brief Construct a func.call operation with return values
*
* @param name Name of the function that is called
* @param operands ValueRange of the used operands
* @return ValueRange of the results
*
* @par Example:
* ```c++
* auto q1 = builder.funcCall("test", {q0});
* ```
* ```mlir
* %q1 = func.call @test(%q0) : (!qco.qubit) -> !qco.qubit
* ```
*/
ValueRange funcCall(StringRef name, ValueRange operands);

/**
* @brief Construct a func.func operation with return values
*
* @param name Name of the function that is called
* @param argTypes TypeRange of the arguments
* @param resultTypes TypeRange of the results
* @param body Body of the function
* @return Reference to this builder for method chaining
*
* @par Example:
* ```c++
* builder.funcFunc("test", argTypes, resultTypes, [&](ValueRange args) ->
* llvm::SmallVector<Value> { auto q1 = builder.h(args[0]); return {q1};
* })
* ```
* ```mlir
* func.func @test(%arg0 : !qco.qubit) -> !qco.qubit {
* %q1 = qco.h %arg0 : !qco.qubit -> !qco.qubit
* func.return %q1 : !qco.qubit
* }
* ```
*/
QCOProgramBuilder&
funcFunc(StringRef name, TypeRange argTypes, TypeRange resultTypes,
llvm::function_ref<llvm::SmallVector<Value>(ValueRange)> body);

Contributor


⚠️ Potential issue | 🟠 Major

Enforce/document scalar-only qubit operands for scf.if/scf.while/func.

Only scf.for supports qubit containers in the current QC↔QCO conversion flow; scf.if, scf.while, and func.* are scalar-qubit-only. The builder API currently accepts ValueRange without an explicit restriction, so it can emit IR the converters cannot handle. Please add runtime validation (preferred) or clearly document the restriction in these APIs.
Based on learnings “In the QCToQCO and QCOToQC conversion passes (mlir/lib/Conversion/QCToQCO/QCToQCO.cpp and mlir/lib/Conversion/QCOToQC/QCOToQC.cpp), only scf.for operations can handle memref (in QC) or tensor (in QCO) containers of qubits as operands/results. The scf.if and scf.while operations, as well as func operations, are restricted to scalar qubit types only and never handle memref/tensor containers. This constraint applies to both dialects and their conversions.”

📝 Suggested doc note (minimal)
@@
  /**
   * @brief Construct a scf.while operation with return values
+  * @note Only scalar !qco.qubit operands/results are supported by current
+  *       QC↔QCO conversions.
@@
  /**
   * @brief Construct a scf.if operation with return values
+  * @note Only scalar !qco.qubit operands/results are supported by current
+  *       QC↔QCO conversions.
@@
  /**
   * @brief Construct a func.call operation with return values
+  * @note Only scalar !qco.qubit operands/results are supported by current
+  *       QC↔QCO conversions.
@@
  /**
   * @brief Construct a func.func operation with return values
+  * @note Only scalar !qco.qubit operands/results are supported by current
+  *       QC↔QCO conversions.

@mergify mergify bot added the conflict label Feb 2, 2026
Member

@burgholzer burgholzer left a comment


Hey @li-mingbao 👋🏼
Thanks for all your work on this. Truly great to see and much appreciated.
Sorry that it took me so long to get back to this. I wanted to make sure that I can dedicate enough time to this to form a solid opinion and give proper feedback here and last week's conference has not really helped in that regard.

To provide some context for how I approached the review and how to interpret the inline comments:

  • I started with the builders, first QC, then QCO, to get a feeling for what it would be like to build programs with structured elements and/or functions from an API point of view.
  • Then, I moved over to the conversion tests and thought my way through all of the new test cases to really see the builder API in practice.
  • I did not review the conversion code itself for now, as I believe it should be a high priority to first clear up the builder APIs, and even more so the semantics of both of our dialects. The conversions should then, ideally, only be an execution of the translation between the design choices of both dialects.

I read most files top to bottom, so some of the earlier comments might not be 100% relevant because they are partly covered by things further down in the files (especially in case of the tests). I hope the comments still make sense.


On the more general topic of how to deal with registers and scf/func:
Given how we are definitely not the first people dealing with these kind of things, I tried to dig a little deeper into what other frameworks are doing to handle these concepts. I'll try to summarize my findings in a separate comment below, to keep that separated from the review here. (This will probably take a little while more)

Comment on lines +872 to +909
//===--------------------------------------------------------------------===//
// MemRef operations
//===--------------------------------------------------------------------===//

/**
* @brief Allocate a memref register and insert the given values
*
* @param elements The stored elements
* @return The memref register
*
* @par Example:
* ```c++
* builder.memrefAlloc(elements);
* ```
* ```mlir
* %memref = memref.alloc() : memref<2x!qc.qubit>
* memref.store %q0, %memref[%c0] : memref<2x!qc.qubit>
* memref.store %q1, %memref[%c1] : memref<2x!qc.qubit>
* ```
*/
Value memrefAlloc(ValueRange elements);

/**
* @brief Load a value from a memref register
*
* @param memref The memref register
* @param index The index where the value is extracted
* @return The extracted value
*
* @par Example:
* ```c++
* builder.memrefLoad(memref, index);
* ```
* ```mlir
* %q0 = memref.load %memref[%c0] : memref<2x!qc.qubit>
* ```
*/
Value memrefLoad(Value memref, const std::variant<int64_t, Value>& index);
Member


Are these functions really that helpful? The QCProgramBuilder inherits from the OpBuilder and can directly be used to construct memrefs.

Comment on lines +915 to +933
/**
* @brief Construct a scf.for operation without iter args
*
* @param lowerbound Lowerbound of the loop
* @param upperbound Upperbound of the loop
* @param step Stepsize of the loop
* @param body Function that builds the body of the for operation
* @return Reference to this builder for method chaining
*
* @par Example:
* ```c++
* builder.scfFor(lb, ub, step, [&](Value iv) { builder.x(q0); });
* ```
* ```mlir
* scf.for %iv = %lb to %ub step %step {
* qc.x %q0 : !qc.qubit
* }
* ```
*/
Member


The repeated application of a gate to the same qubit isn't particularly interesting. Can we extend this example to demonstrate the use of the loop variable? For example to apply a Hadamard gate to all qubits, which is really common in quantum algorithms.
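A hedged sketch of what such an example could look like in the QC dialect (illustrative only; the register size, bounds, and value names are assumptions):

```mlir
// Apply a Hadamard gate to every qubit stored in the register.
scf.for %iv = %c0 to %c4 step %c1 {
  %q = memref.load %memref[%iv] : memref<4x!qc.qubit>
  qc.h %q : !qc.qubit
}
```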

Comment on lines +940 to +965
* @brief Construct a scf.while operation without return values
*
* @param beforeBody Function that builds the before body of the while
* operation
* @param afterBody Function that builds the after body of the while operation
* @return Reference to this builder for method chaining
*
* @par Example:
* ```c++
* builder.scfWhile([&] {
* builder.h(q0);
* auto res = builder.measure(q0);
* builder.scfCondition(res);
* }, [&] {
* builder.x(q0);
* });
* ```
* ```mlir
* scf.while : () -> () {
* qc.h %q0 : !qc.qubit
* %res = qc.measure %q0 : !qc.qubit -> i1
* scf.condition(%res)
* } do {
* qc.x %q0 : !qc.qubit
* scf.yield
* }
Member


Does it make sense to show both a while and a do-while loop here?
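For context, a do-while can be expressed in scf by moving all of the work into the "before" region; a minimal sketch in the style of the example above:

```mlir
// do-while: the entire body lives in the "before" region, so it always
// runs at least once; the "after" region is empty.
scf.while : () -> () {
  qc.x %q0 : !qc.qubit
  %res = qc.measure %q0 : !qc.qubit -> i1
  scf.condition(%res)
} do {
  scf.yield
}
```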

//===--------------------------------------------------------------------===//

/**
* @brief Construct a scf.for operation without iter args
Member


I am slightly irritated by the "without iter args" here and the "without return values" down below. This feels like internal knowledge that QCO needs these, but it does not really add value to the descriptions here.

Comment on lines +980 to +995
* @par Example:
* ```c++
* builder.scfIf(condition, [&] {
* builder.h(q0);
* }, [&] {
* builder.x(q0);
* });
* ```
* ```mlir
* scf.if %condition {
* qc.h %q0 : !qc.qubit
* } else {
* qc.x %q0 : !qc.qubit
* }
* ```
*/
Member


Given how common it is in quantum programs to only have a then branch but no else, it's probably worth showing that in the example here.
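Such a then-only variant would presumably look like this (a sketch using the syntax from the existing example):

```mlir
// Conditionally apply a correction; no else branch required.
scf.if %condition {
  qc.x %q0 : !qc.qubit
}
```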

// Test conversion from qc to qco for scf.for operation with a memref register
auto input = buildQCIR([](mlir::qc::QCProgramBuilder& b) {
auto reg = b.allocQubitRegister(4);
auto memref = b.memrefAlloc(reg);
Member


Don't memrefs have to be deallocated as well?
Same question for tensors I suppose; don't they need to be "freed" or something?

Contributor Author


Yes, memrefs need to be deallocated. I forgot to add that, thanks for reminding me.
For tensors, I don't think there is a dealloc operation.
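For reference, the corresponding op is memref.dealloc; tensors are value-typed, so buffer deallocation is typically introduced later by bufferization rather than written explicitly. A minimal sketch:

```mlir
%memref = memref.alloc() : memref<2x!qc.qubit>
// ... use the register ...
memref.dealloc %memref : memref<2x!qc.qubit>
```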

Comment on lines +414 to +424
auto input = buildQCIR([](mlir::qc::QCProgramBuilder& b) {
auto reg0 = b.allocQubitRegister(4, "q0");
auto reg1 = b.allocQubitRegister(4, "q1");
auto memref0 = b.memrefAlloc(reg0);
b.scfFor(0, 3, 1, [&](Value iv) {
auto extractedQubit = b.memrefLoad(memref0, iv);
b.x(extractedQubit);
auto memref1 = b.memrefAlloc(reg1);
b.scfFor(0, 3, 1, [&](Value iv2) {
auto q1 = b.memrefLoad(memref1, iv2);
b.cx(extractedQubit, q1);
Member


I kind of like this demonstration of the nested memref. At the same time, I can't really help but feel this would be much simpler if the two Register allocations would have created a memref from the beginning.

Comment on lines +58 to +64
static void runCanonicalizationPass(ModuleOp module) {
PassManager pm(module.getContext());
pm.addPass(createCanonicalizerPass());
if (pm.run(module).failed()) {
FAIL() << "Error during canonicalization";
}
}
Member


The compiler pipeline tests also run pm.addPass(createRemoveDeadValuesPass()); here.
I suppose it would make sense to also add this here?

Contributor Author


This pass causes some issues when there is an iter_arg and the iteration variable is not used; this currently causes a crash when the pass is run (also for non-quantum programs).
This issue was already fixed in the LLVM project some time ago, but it has not been part of any release yet.

Member


Ah, yeah. I remember. At the same time, just following many of the comments on the tests, I don't really think that loops without using the loop index make too much sense (there are use cases, but we can happily avoid them in the tests for now).
With LLVM 22.0, these will hopefully work smoothly then.
I'd be in favor of enabling the additional pass and just using programs that make use of the iter_args and the iteration variable (if that is feasible and I am not overlooking something).

Member


This is a fairly general observation but we are really only comparing the output of the two builders against one another. I somehow have the feeling that this could be slightly dangerous as it might turn out that canonicalization changes the programs in ways that we do not anticipate and if one were to inspect the actual IR, it would look nothing like we would imagine.
Have you manually inspected the IRs? And could we potentially add a similar pretty print for the two IRs to the test as we currently use it in the compiler pipeline tests?
I think this would aid in checking the correctness and our assumptions.

Member


I have only looked at the QC to QCO test file so far, but given how I assume that this is "just" the inverse direction, similar comments apply here.
One could even go as far and question whether these tests should be combined to form roundtrip tests QC -> QCO -> QC and assert that QC (begin) == QC (end).
At that point, one could also include the respective programs in the compiler pipeline tests, couldn't one? Is there any particular reason for not doing that?

Contributor Author


I will just respond here to answer most of the test related questions to have them bundled.

Both files are pretty much the same, just in opposite directions.
I wasn't sure where to put the tests, as these are just a part of the pipeline for now and are missing the final conversion to QIR. That's why I split them up for now and test them separately.

The tests are quite minimalistic at the moment and only check for strings, to keep things simple for now since they are just temporary. Once the whole conversion pipeline for these operations is established down to QIR (and the specification for the representation of scf operations is defined), the tests would be integrated into the pipeline tests.
The tests themselves should be correct to the best of my knowledge.

In the same notion I did not really pay attention to test real programs and more or less just tested the concepts but I can adapt them to be more realistic programs.

Member


I might be misremembering, but I think when I wrote the pipeline tests, I explicitly wrote them so that the QIR step is optional. If that is indeed true (to be checked), I would argue it makes sense to move the tests there and include them as part of the compiler pipeline tests.
Even if not, it may make sense to combine the two test files to a "qc_qco_roundtrip" (or whatever) test file to avoid the redundancy and to avoid having to spell out the same programs twice. That should hopefully also allow us to iterate more quickly on the design once this PR is merged

@burgholzer
Member

CUDA-Q

Overview

The Quake dialect overloads its operations to represent memory and value semantics in one program.

For memory semantics, it defines

  • a dedicated allocation function for reference-value qubits and vectors thereof
// Allocate a single qubit
%qubit = quake.alloca !quake.ref

// Allocate a qubit register with a size known at compilation time
%veq = quake.alloca !quake.veq<4> {name = "quantum"}

// Allocate a qubit register with a size known at runtime time
%veq = quake.alloca(%size : i32) !quake.veq<?>
  • a dedicated deallocation function for reference-value qubits and vectors thereof
%1 = quake.alloca !quake.veq<4>
...
quake.dealloc %1 : !quake.veq<4>
  • a dedicated concatenation function for combining existing values and/or vectors of reference-valued qubits
%veq = quake.concat %r1, %v1, %r2 : (!quake.ref, !quake.veq<?>,
                                     !quake.ref) -> !quake.veq<?>
  • a dedicated extraction function for extracting qubit values from vectors of reference-valued qubits
%zero = arith.constant 0 : i32
%qr = quake.extract_ref %qv[%zero] : (!quake.veq<?>, i32) -> !quake.ref
  • a dedicated sub-vector function for extracting slices from vectors
%0 = arith.constant 2 : i32
%1 = arith.constant 6 : i32
%qr = quake.subveq %qv, %0, %1 : (!quake.veq<?>, i32, i32) ->
                                  !quake.veq<5>
  • a dedicated size function for getting the size of a qubit vector
%0 = quake.alloca !quake.veq<4>
// %1 will be 4 with canonicalization.
%1 = quake.veq_size %0 : (!quake.veq<4>) -> i64

%2 = ... : !quake.veq<?>
// %3 may not be computed until runtime.
%3 = quake.veq_size %2 : (!quake.veq<?>) -> i64

For the conversion to/from value semantics, it defines

  • wrap and unwrap operations that convert between references and what they call "wires" (qubit values).
%0 = ... : !quake.ref
%1 = quake.unwrap %0 : (!quake.ref) -> !quake.wire
%2 = quake.rx (%dbl) %1 : (f64, !quake.wire) -> !quake.wire
quake.wrap %2 to %0 : !quake.wire, !quake.ref

For the value semantics, it defines

  • dedicated null_wire and sink_wire operations that match allocation and deallocation operations in the memory-semantics model
  • dedicated bundle_cable and terminate_cable operations that either take a variadic list of wires (qubit values) and combine them into a "cable" (collection of qubit values) or the reverse.
  • dedicated null_cable and sink_cable functions as syntactic sugar over null_wire and sink_wire for cables.

General Observations

Generally, it feels like this is somewhat reinventing the wheel for a lot of concepts that are already present in MLIR.
The ref qubit type is simply our qc.qubit, while the wire is our qco.qubit.
The veq for memory semantics feels very close to our memref usage.
The cable idea seems to match well to the general idea of our tensor usage.

The cable concept is not used in the entire cuda-quantum repository.

I did not really find examples of loops that use quantum types within them inside the repository. But I think it is fair to assume that the value semantics aren't particularly used within the library so there does not seem to be much to learn for the SCF/func topics here.

Registers are explicitly used and allocated by the same function that allocates individual qubit references.

@burgholzer
Member

burgholzer commented Feb 4, 2026

Catalyst

Overview

The Quantum dialect of Catalyst generally uses value semantics.

It defines

  • a quantum.bit type representing a value-semantics qubit.
  • a quantum.reg type representing an array of value-semantic qubits.
  • a quantum.alloc operation for allocating a register of qubits
  • a quantum.alloc_qb operation for allocating a single qubit
  • a quantum.dealloc operation for deallocating a register
  • a quantum.dealloc_qb operation for deallocating a single qubit
  • a quantum.extract operation that extracts a qubit value from a register (and does not yield an updated register SSA value)
  • a quantum.insert operation that inserts a qubit value into a register and yields a new register SSA value.
  • a quantum.adjoint region operation (similar to our modifier) that takes a quantum.reg as an argument and returns an updated value for the register.
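
As a rough sketch of this pattern (syntax approximated from Catalyst's examples; op spellings such as quantum.custom may differ in detail), extracting a qubit, applying a gate, and inserting it back might look like:

```mlir
// Allocate a two-qubit register (value semantics).
%r0 = quantum.alloc(2) : !quantum.reg
// Extract qubit 0; note that no updated register SSA value is produced.
%q0 = quantum.extract %r0[0] : !quantum.reg -> !quantum.bit
// Apply a gate; the qubit value is consumed and a new one is returned.
%q1 = quantum.custom "Hadamard"() %q0 : !quantum.bit
// Insert the updated qubit; this DOES yield a new register SSA value.
%r1 = quantum.insert %r0[0], %q1 : !quantum.reg, !quantum.bit
quantum.dealloc %r1 : !quantum.reg
```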

Observations

This follows a pretty similar philosophy as our QCO dialect, with the main difference that it explicitly defines a register type and the corresponding operations to allocate, deallocate, extract, and insert from it.
I believe the notable thing here is that extraction from a register does not generate a new SSA value for the register. (Insertion does.)

Regions generally seem to accept registers. At the same time, Catalyst only defines an adjoint region. It does not define a Ctrl region, but instead handles controls as part of unitary operations.

It is hard to find proper examples of structured control flow concepts that demonstrate the use of more than two qubits and how these flow with linear types.
I found one example where a register is used as part of an scf.for loop as an iter_arg, but that is really it.

@burgholzer
Member

Jeff

Overview

The Jeff spec generally uses linear types and value semantics.
It defines the following relevant types:

  • qubit: A single quantum bit. Must be treated "linearly": it cannot be copied or deleted (satisfying the no-cloning and no-deleting theorems of quantum mechanics). Structurally, a qubit value must be used exactly once as an input after being produced as an output.
  • qureg: A quantum register—a dynamically sized array of qubits. Also treated linearly: the entire register must be used exactly once. The length of a quantum register is not known at compile time, but can be queried at runtime.

It defines the following operations for qubits:

  • Lifecycle: alloc, free, reset.
  • Measurement: measure (destructive), measureNd (non-destructive).
  • Gates: The gate operation applies a QubitGate.

It defines the following operations on qubit registers:

  • Lifecycle: alloc (takes number of qubits to allocate as input), free, and create (creates a register from input qubits).
  • Manipulation: extractIndex, insertIndex, extractSlice, insertSlice, split, join.
  • Introspection: length.

Jeff defines for loops as

# For loop.
#
# The loop iterates from start to stop (exclusive) by step.
# The region is the loop body that is executed once for each iteration.
# The loop maintains a state consisting of any number of values.
# Each iteration receives the state from the previous iteration,
# or the initial state for the first iteration.
# When the loop finishes, the final state is returned.
# Iterations also have access to the current iteration value.
#
# Inputs:
# - `int(N)`: The (signed) start value.
# - `int(N)`: The (signed) stop value (exclusive).
# - `int(N)`: The (signed) step value.
# - `... state`: Any number of values that are passed to the loop body.
#
# Outputs:
# - `... state`: Any number of values that are returned from the loop body.
#
# The region must have the signature `(int(N), ... state) -> (... state)`.
# The first parameter is the current iteration value.
# It is undefined behavior if step is zero.

and the while loop as

# While loop.
#
# The condition is checked before each iteration.
# If the condition is true, the loop body is executed.
# The loop maintains a state consisting of any number of values.
# Each iteration receives the state from the previous iteration,
# or the initial state for the first iteration.
# When the loop finishes, the final state is returned.
#
# Inputs:
# - `... state`: Any number of values that are passed to the condition and body.
#
# Outputs:
# - `... state`: Any number of values that are returned from the body.

condition @3 :Region;
# The condition region that determines whether to continue looping.
#
# The region must have the signature `(... state) -> (int(1))`.
# The output is the condition result - true to continue, false to stop.
# The condition can only evaluate the state, not modify it.

body @4 :Region;
# The body region that is executed on each iteration.
#
# The region must have the signature `(... state) -> (... state)`.
# The outputs are passed as inputs to the condition region for the next iteration.

Observations

Notably, Jeff treats qubit registers as linear types as well.
Extraction from a register yields a qubit and an updated register.
Slicing returns a new qubit register for the slice and an updated value for the modified register.
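
Jeff itself is a binary format, but rendered in a hypothetical textual IR (the op names follow the spec; the surface syntax here is made up purely for illustration), the linear register handling could look like:

```
%q, %reg1 = jeff.extractIndex %reg0[%i]      // yields the qubit AND an updated register
%q1       = jeff.gate "H" %q                 // the gate consumes the qubit and produces a new value
%reg2     = jeff.insertIndex %reg1[%i], %q1  // insertion again yields an updated register
```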

The SCF handling and semantics are probably closest to what we may want to follow here.
Especially treating collections of qubits, i.e., registers, as linear types as well seems to fit quite nicely with what we want to achieve in QCO.

@denialhaag
Member

@burgholzer, thanks a lot for the summaries!

Given everything, I would say that we should add the concept of registers to both QC and QCO, even if it's just to make compatibility with other frameworks easier.

The QC case seems rather straightforward to me, and utilizing memref seems more than reasonable.

In principle, I would also be in favor of utilizing tensor in the QCO case. However, if we want to treat registers as linear types, I do not know whether we can do so, since tensor.extract simply does not return the modified tensor. The Jeff spec seems well thought through to me, and I would lean toward following their approach, but I'm also not sure whether this justifies a custom implementation of the register concept in QCO.

@ystade
Collaborator

ystade commented Feb 5, 2026

@burgholzer the summaries are great!

I am still very much in favor of re-using what already exists in MLIR if suitable. Or in other words, personally, I would only add a dedicated "quantum register" if we identified a good reason why a tensor is not suitable for it.

In particular because it reduces ambiguity: it will always be possible to store qubits in either dialect in a memref or tensor even if we define an extra qubit register. So using the existing concepts also reduces the number of possible ways to represent a program and reduces the need for canonicalization passes.

@ystade
Collaborator

ystade commented Feb 5, 2026

@burgholzer More importantly, and distinct from the discussion above, I would very much like to have this PR merged rather sooner than later. If we add registers, the corresponding implementation should be done in a separate PR even if it changes code of this PR. I would very much like not to delay this PR much longer because @li-mingbao already put so much work into it, and I think it's quite fair to declare this work package of his master's thesis as done.

@burgholzer
Member

@denialhaag @ystade to the technical points: I fully agree with your assessment. We should be using existing MLIR concepts as much as possible, reasonable and feasible.

The QC side of things should be rather straightforward by relying on memref. We essentially had that at some point already.

For the QCO side, it's a bit trickier due to the question of whether we want the collections of qubit values to be linear types as well (Jeff does that, Catalyst does not).
For the patterns here, it does not change that much. The "extract" sequences would simply not extract from the same tensor SSA value, but from an updated SSA value produced by the previous extract.
Before SCF boundaries, qubits would be collected into aggregate values, passed to the respective regions, and unpacked there; before leaving the region, they would be re-packed and either returned or passed to the next loop iteration/block.
That should work pretty cleanly.
Now the big question is whether we find an MLIR builtin that would allow us to represent collections of linear types as a linear type or whether we need to define this ourselves.

Also trying to incorporate @DRovara's offline suggestion here: maybe we can define our own type with the appropriate semantics, but have that type implement all the same/similar interfaces as existing constructs so that we can still take advantage of as much of the MLIR infrastructure as possible. This is also still to be checked.

@burgholzer
Member

@burgholzer More importantly, and distinct from the discussion above, I would very much like to have this PR merged rather sooner than later. If we add registers, the corresponding implementation should be done in a separate PR even if it changes code of this PR. I would very much like not to delay this PR much longer because @li-mingbao already put so much work into it, and I think it's quite fair to declare this work package of his master's thesis as done.

Valid point. I am also happy to try and work on the code after the PR is merged. For that, I think the main part to still be addressed is the testing setup. Most design questions can be deferred to a follow-up.
I probably won't check the actual conversion code in detail here then, but will also keep that for a follow-up PR.

Note, however, that this procedure also means that the code being contributed here will probably see (substantial) changes in the near future depending on our design decisions. This will influence all code that is being written on top of it. I am all for getting this PR merged sooner than later, but just so we are clear about the consequences of that.

@ystade
Collaborator

ystade commented Feb 5, 2026

@burgholzer Just wanted to add one point that was also raised in the offline discussion by @DRovara and @burgholzer: So far, we do not implement the linear types 100%, e.g., for scf.if. So a related question is also whether we want to set up everything around it such that our linear types are actually linear types, meaning that we have to define a new if construct.

@li-mingbao
Contributor Author

Hey,
I don't really mind the length of this PR, but I think it would be good to first reach a conclusion on how these scf operations should be represented in the dialects.
Looking at the responses, I think there is no way around introducing registers in some form again in the dialects.
This should be done in a separate PR anyway, as it affects more parts of the codebase.

It would make sense to clarify these points first, as this would save quite a few iterations in this PR. (In hindsight, we should have done this before I started working on it.)

Otherwise I can also close this PR for now or use it as a discussion thread, and once everything is addressed I can open a "clean" one so that it can be merged.

@burgholzer
Member

@burgholzer Just wanted to add one point that was also raised in the offline discussion by @DRovara and @burgholzer: So far, we do not implement the linear types 100%, e.g., for scf.if. So a related question is also whether we want to set up everything around it such that our linear types are actually linear types, meaning that we have to define a new if construct.

Yeah, this is also what came up as part of the review, see #1396 (comment)
Interestingly, we would probably define the new construct solely for QCO and not for QC, where this is not so much of an issue.
So it would then be part of the conversion to convert an SCF construct to our specialized construct whenever qubits are involved in the loop/if/etc.
I will try to do some more digging on linear types in MLIR today and report back with some findings.

@burgholzer
Member

burgholzer commented Feb 5, 2026

Hey, I don't really mind the length of this PR, but I think it would be good to first reach a conclusion on how these scf operations should be represented in the dialects. Looking at the responses, I think there is no way around introducing registers in some form again in the dialects. This should be done in a separate PR anyway, as it affects more parts of the codebase.

It would make sense to clarify these points first, as this would save quite a few iterations in this PR. (In hindsight, we should have done this before I started working on it.)

Otherwise I can also close this PR for now or use it as a discussion thread, and once everything is addressed I can open a "clean" one so that it can be merged.

At least we need some way to aggregate individual values in some form of container that can be indexed as part of SCF constructs.
Maybe the step of moving towards actual registers, e.g., when we read in a QuantumComputation, could be postponed to a follow up in any case.

And just to have that said out loud here again: I believe answering these questions is much more important and impactful than the QIR conversion, for the sole reason that this part of QIR is hardly standardized at the moment. It is not yet clear how all of these constructs should even look in QIR. So for the most part, we would be coming up with our own concepts anyway. (I also think that MLIR should be capable of handling a lot of the conversion from SCF to CFG-style LLVM IR, but that is a different story.)

What I believe would be most helpful to drive the discussion here to some concrete solution is to define a kind of benchmark program that we want to be able to represent in our infrastructure and work our way from there. I am thinking about something like https://github.com/unitaryfoundation/jeff/blob/main/benchmarks/structured/iqpe/iqpe.qasm which has a loop and an if nested in that loop (the power modifier can simply be resolved by tweaking the rotation angle).
Precise steps would be to define how this program should look like in QC (that part should be easy) and how it should look like in QCO (that's the tricky part).
Then we need to write builders that can produce programs that look like the envisioned representation and write conversions that can reasonably convert from one to the other and back.

I have no particular preference if this work is best continued as part of this PR, which lays a really great foundation, but will probably need quite some tweaking; or if it is better to start with a clean slate and just use this PR as inspiration. Or if it is a mixture of both.

Edit: Maybe some helpful context to revisit: #1115
We already once tried to think about how to represent iQPE in the dialects.
@DRovara's suggestion had both reference and value semantics, my suggestion had a function call in there as well.

@burgholzer
Member

burgholzer commented Feb 6, 2026

Okay, I took quite some time now to brainstorm and collect ideas on how we could potentially represent registers of qubits in both dialects and use existing MLIR concepts as much as possible.

For QC this actually does not seem too hard and is very close to what we had at some point for MQTRef.
We can use MLIR's memref type to represent registers with reference semantics in the QC dialect. We can directly use memref.alloc to allocate a register of qubits. This supports compile-time as well as runtime sizes out of the box, which is pretty neat.
If we wanted to have more of the semantics directly in our dialect, we could add a qc.alloc_reg operation that would return a memref and add a canonicalization that would immediately lower this operation to a memref.alloc. Any transformation passes or conversions should only ever have to deal with the memref "world".

In the translation from qc::QuantumComputation we would emit the respective memref calls at the very beginning of the main entry point function. Internally, in the QCProgramBuilder, we would probably generate memref.load instructions for every qubit so that we have individual SSA values for the qubits similarly to how we generate these at the moment through individual qc.alloc calls.
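
A minimal sketch of what this could look like in QC (assuming a qc.h op applied to qubit references; the exact spelling of the gate op is only inferred from the dialect naming and may differ):

```mlir
// Allocate a register of four qubit references.
%reg = memref.alloc() : memref<4x!qc.qubit>
%c0 = arith.constant 0 : index
// Load a reference once; it stays valid for the whole program.
%q0 = memref.load %reg[%c0] : memref<4x!qc.qubit>
qc.h %q0
// ...
memref.dealloc %reg : memref<4x!qc.qubit>
```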

For QCO, the story is not that easy. MLIR has no builtin collection type that has linear type semantics. While tensor is an immutable collection of types, where tensor.insert will produce a new tensor SSA value, reading from the tensor via tensor.extract only yields an SSA value for the element, but not a new SSA tensor value.
At first this felt like a showstopper. However, I then realized that we could simply define our own qco.extract operation that takes a tensor of qubits plus an index and yields the respective value plus a new SSA value for the tensor itself. This should be sufficient to enable treating the tensors that we produce as linear types.
We would most likely implement a few similar canonicalizations as are implemented for the tensor.extract operation, like eliminating extract operations directly followed by inserts.
Since we want to treat tensors as linear types in QCO, this means that every tensor (of qubits) value must be used exactly once. This necessitates a qco.dealloc_reg operation that acts as a sink for the tensors corresponding to registers.
For parity/symmetry, we probably want to also add a qco.alloc_reg that produces a tensor and may even be canonicalized to a tensor creation operation.
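
Put together, a sketch of the proposed ops (qco.alloc_reg, qco.extract, qco.insert, and qco.dealloc_reg are all hypothetical at this point, as is the syntax):

```mlir
%reg0 = qco.alloc_reg : tensor<4x!qco.qubit>
%c0 = arith.constant 0 : index
// Extraction yields the qubit AND a new SSA value for the tensor.
%q0, %reg1 = qco.extract %reg0[%c0] : tensor<4x!qco.qubit> -> !qco.qubit, tensor<4x!qco.qubit>
%q1 = qco.h %q0 : !qco.qubit
%reg2 = qco.insert %q1 into %reg1[%c0] : tensor<4x!qco.qubit>
// Every tensor value is used exactly once; the final one is sunk here.
qco.dealloc_reg %reg2 : tensor<4x!qco.qubit>
```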

Conversions would convert a memref (of qubits) to a qco.alloc_reg (or tensor creation), memref.load would be replaced by tensor.extract, memref.dealloc would be replaced by qco.dealloc_reg. memref.store (with qubit value) should not occur in QC programs that we produce so there is no real need to handle it.

scf.for loops in QC would then index into the memref as part of the loop.
The conversion would add a tensor iter_arg for every memref used in the loop body as well as individual qubits for every standalone qubit.
scf.while should not be too different to handle.
func.func that take a memref of qubits as arguments in QC should have the parameter converted to a tensor and get additional tensor return values for returning the updated values.
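
For illustration, a loop applying H to every qubit of a register might then look like this in QCO (again using the hypothetical qco.extract/qco.insert ops sketched above the exact syntax is an assumption):

```mlir
%res = scf.for %i = %c0 to %c4 step %c1
    iter_args(%t = %reg0) -> (tensor<4x!qco.qubit>) {
  %q, %t1 = qco.extract %t[%i] : tensor<4x!qco.qubit> -> !qco.qubit, tensor<4x!qco.qubit>
  %q1 = qco.h %q : !qco.qubit
  %t2 = qco.insert %q1 into %t1[%i] : tensor<4x!qco.qubit>
  scf.yield %t2 : tensor<4x!qco.qubit>
}
```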

One of the remaining tricky parts is scf.if. Similarly to tensor it is not natively designed to handle linear types. scf.if is defined with the NoRegionArguments trait, so that there is no way to add arguments to the individual regions.
After quite a bit of thinking, it feels like there is no real way around a custom qco.if that we can use as part of qco.
This operation would look extremely similar to a scf.if with the exception that it also takes a variadic list of qco.qubit or tensor of qco.qubit as an operand, adds region arguments to the then and else region, and forces that both branches return updated values for all qco.qubit's or tensors thereof.
The conversion from QC to QCO would look at an scf.if, analyze its branches for whether they use any qubit values or memrefs thereof, and if so, would convert them to qco.if.
The reverse conversion would simply replace qco.if with scf.if, dropping the block/region arguments.
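
A sketch of how such a qco.if could look (entirely hypothetical syntax, including the ins(...) clause and qco.yield):

```mlir
%q_out = qco.if %cond ins(%q0 : !qco.qubit) -> (!qco.qubit) {
^bb0(%arg0: !qco.qubit):
  %r = qco.x %arg0 : !qco.qubit
  qco.yield %r : !qco.qubit
} else {
^bb0(%arg0: !qco.qubit):
  // The else branch must also return an updated value for every qubit.
  qco.yield %arg0 : !qco.qubit
}
```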

This comment has become long enough already. I think it does pretty much cover a full proposal that should resolve all problems we observed here.
I will try to use my remaining time today (over the weekend) to write down the (i)QPE programs or something similar that makes use of all the concepts.

@denialhaag
Member

I wanted to briefly comment on two things that came to mind:

However, I then realized that we could simply define our own qco.extract operation that takes a tensor of qubits plus an index and yields the respective value plus a new SSA value for the tensor itself. This should be sufficient to enable treating the tensors that we produce as linear types. We would most likely implement a few similar canonicalizations as are implemented for the tensor.extract operation, like eliminating extract operations directly followed by inserts. Since we want to treat tensors as linear types in QCO, this means that every tensor (of qubits) value must be used exactly once. This necessitates a qco.dealloc_reg operation that acts as a sink for the tensors corresponding to registers. For parity/symmetry, we probably want to also add a qco.alloc_reg that produces a tensor and may even be canonicalized to a tensor creation operation.

I like that idea! It might be worth creating a wrapper for all useful tensor operations to ensure the IR remains consistent. It might feel a bit weird if a qubit is extracted with qco.extract but inserted with tensor.insert, but maybe that's just me. We definitely also want a version of tensor.extract_slice.

Conversions would convert a memref (of qubits) to a qco.alloc_reg (or tensor creation), memref.load would be replaced by tensor.extract, memref.dealloc would be replaced by qco.dealloc_reg. memref.store (with qubit value) should not occur in QC programs that we produce so there is no real need to handle it.

I do agree that there is no inherent need for memref.store, but this will likely mean that we have to create the tensor.insert/qco.insert operations during the conversion of memref.dealloc. We probably cannot preserve the location of tensor.insert/qco.insert operations during a QCO-QC-QCO round-trip (to be fair, such a round-trip does not necessarily make a lot of sense).

@MatthiasReumann
Collaborator

I spent a little bit of time in the last few days thinking about this problem, and here are my two cents and some comments on the proposal of @burgholzer above.

First principles

Why do we need register in the first place? As far as I can tell, registers are a must-have to support loop indexing1

for i in [0:3] {
  h q[i];
}

For all other use cases, a list of qubit SSA values should suffice2. If this is correct, couldn't we simply keep everything the same but handle loop indexing explicitly? For example (credits to @li-mingbao):

%g = group %q0, %q1, %q2
scf.for %iv = 0 to 3 step 1 {
   qc.h %g[%iv]
}
/// continue with %q0, %q1, %q2
/// do not continue with group (group is a temporary helper construct). 

In theory, the memref and tensor dialects could provide the necessary means for grouping.

Semantics

One thing that bothers me is that the memref dialect provides a copy operation. Without any additional verification mechanism one could copy a quantum register. Moreover, memref.alloc has an attribute alignment which also makes little sense in the quantum domain. The counter-argument is of course: "We won't use that".

One-Shot Bufferization

From what I understand, the one-shot bufferization pass does not only perform the transformation from the tensor to the memref dialect but also tries to optimize buffer use, for example by re-using buffers. An important problem here is that the one-shot bufferization pass could also insert memref.copy operations. Consequently, we would require a custom conversion pass anyway (I am 90% sure of that).

Custom Logic

Given that it requires custom operations for linear typing and conversions, how much additional effort is required to implement custom registers? What do we gain by reusing the existing dialects besides less tablegen code?3


I hope all of this doesn't sound too skeptical of the current proposed solution (which I still think is a valid one) 😅. Nonetheless, food for thought.

Footnotes

  1. This is really interesting imo. Thinking from a circuit perspective, gate repetitions are trivial because you can simply "copy" a slice of the circuit and insert it N times. However, once you actually start fusing classical with quantum code, it gets tricky.

  2. From my experience, working with these (e.g. for transformation passes) also works quite well.

  3. I am not sure if a conversion from memref to the llvm dialect will ever yield valid QIR.

@burgholzer
Member

Thanks @MatthiasReumann for the input! Much appreciated 🙏🏼

As far as I can tell, registers are a must-have to support loop indexing

I would argue that's the immediate feature we see right now. Other ideas include passing lists/arrays/collections of qubits to functions. Assume there is a fairly generic implementation of the QFT which takes a list of qubits as input and probably uses loops to write down the quantum operations. Maybe this boils down to the same main argument though: It seems necessary for writing scale-invariant code, i.e., code where the number of qubits does not impact the size of the IR.

For all other use cases, a list of qubit SSA values should suffice

I think that the current proposal would essentially boil down to this.
Assuming that we define all registers upfront, this would mean that memrefs/tensors are created at the beginning of the program.
The qubits would be extracted from there once (by design), until the first loop construct is encountered.
In QC, we can simply pass around the Register container (memref or custom) as it still holds the same references that we have been using for the entire program.
In QCO, we would have to aggregate the individual SSA values before the loop (we could call this a State), pass that to the loop via iter_args, extract values as needed in the loop body, insert them back at the very end of the loop body, and eventually return the State from the scf.for.
After the loop, we would "unpack" all qubits again and proceed until we hit the next loop construct.
In the end, at least in QCO, we would probably either collect all individual qubits in a State and free the state, or free every individual qubit.

Semantics

One thing that bothers me is that the memref dialect provides a copy operation. Without any additional verification mechanism one could copy a quantum register. Moreover, memref.alloc has an attribute alignment which also makes little sense in the quantum domain. The counter-argument is of course: "We won't use that".

One-Shot Bufferization

From what I understand, the one-shot bufferization pass does not only perform the transformation from the tensor to memref dialect but also tries to optimize buffer use. For example, re-use buffers. An important problem here is that the one-shot bufferization pass could also insert memref.copy operations. Consequently, we would require a custom conversion pass anyways (I am 90% sure of that).

Custom Logic

Given that it requires custom operations for linear typing and conversions, how much additional effort is required to implement custom registers? What do we gain by reusing the existing dialects besides less tablegen code?

These are all very valid points, especially for memref.
I think we might end up implementing a couple more canonicalization rules for registers than if we would simply reuse existing concepts, but it would allow us to get full control of the API and semantics of the operations, which could be a big plus.

I also still have @DRovara's idea in the back of my head that we may implement our own concepts for registers and linear collections of qubits, but heavily use existing traits used by memref, tensor, etc. to save some implementation effort.


All that being said: I believe I am starting to be convinced that it might not be too bad of an idea to implement domain specific concepts ourselves.
Can we flesh out the corresponding proposal a little more precisely?

@denialhaag
Member

I wanted to briefly give some context on how some of the raised points are handled in Jeff.

I would argue that's the immediate feature we see right now. Other ideas include passing lists/arrays/collections of qubits to functions. Assume there is a fairly generic implementation of the QFT which takes a list of qubits as input and probably uses loops to write down the quantum operations. Maybe this boils down to the same main argument though: It seems necessary for writing scale-invariant code, i.e., code where the number of qubits does not impact the size of the IR.

Functions in Jeff can also have QuregType arguments. The type is basically treated as equal to QubitType, I2, I8, and any other supported type, meaning that it can be used wherever any other type can be used.

I think that the current proposal would essentially boil down to this. Assuming that we define all registers upfront, this would mean that memrefs/tensors are created at the beginning of the program. The qubits would be extracted from there once (by design), until the first loop construct is encountered. In QC, we can simply pass around the Register container (memref or custom) as it still holds the same references that we have been using for the entire program. In QCO, we would have to aggregate the individual SSA values before the loop (we could call this a State), pass that to the loop via iter_args, extract values as needed in the loop body, insert them back at the very end of the loop body, and eventually return the State from the scf.for. After the loop, we would "unpack" all qubits again and proceed until we hit the next loop construct. In the end, at least in QCO, we would probably either collect all individual qubits in a State and free the state, or free every individual qubit.

In Jeff, the aggregation is achieved via the QuregCreateOp, which returns a QuregType. There is no distinct concept for "mid-IR" qubit aggregations (such as the proposed State). That said, I could see the benefit of making this a distinct concept and restricting registers to initial allocation only.

@DRovara
Collaborator

DRovara commented Feb 12, 2026

Semantics

One thing that bothers me is that the memref dialect provides a copy operation. Without any additional verification mechanism one could copy a quantum register. Moreover, memref.alloc has an attribute alignment which also makes little sense in the quantum domain. The counter-argument is of course: "We won't use that".

Just a quick comment, but I guess this really depends on how you view the semantics of the memref Qubit arrays.
Since memref.load is supposed to get the element stored inside the memref itself, and what we get from memref.load is a qubit reference, it would be reasonable to view a copy of the memref like a copy of a set of pointers, which should clearly be allowed.

Then again, I'm not sure if our current conversions would even support the concept of having multiple pointers point to the same qubit, so maybe it's not such a reasonable assumption after all 🤔

@burgholzer
Member

Then again, I'm not sure if our current conversions would even support the concept of having multiple pointers point to the same qubit, so maybe it's not such a reasonable assumption after all 🤔

I am pretty sure they don't. But this is kind of expected as we currently do not explicitly handle memref<qc.qubit> in the conversions anyway. The qubit tracking that we do as part of the conversion(s) would need to be updated anyway when we introduce the concept of registers.
I would guess that this is more of a problem for QCO than for QC though because tracking the linear types is probably harder than tracking the references.
