Implementation of Fast Token Delivery flow within Dynamatic #253

pcineverdies · 2024-10-23T09:07:02Z

pcineverdies
Oct 23, 2024
Collaborator

Description

The work from Unleashing Parallelism in Elastic Circuits with Faster Token Delivery [1] is still missing in the codebase. The goal of this issue is to outline a roadmap for their implementation, highlighting what needs to be done.

The Fast Token Delivery (FTD) methodology is a different algorithm to build the handshake IR of the the cf IR. It gets rid of the concepts of basic blocks in the original CFG, relying only on the relationship between producers and consumers among different operations. In the paper, it was shown it can provide significant execution time advantage, together with a reduction of area utilization.

The methodology aims at getting rid of anything related to control dependencies between basic blocks. However, at a first stage of the implementation, the network of control merges is maintained. This allows to allocate the memory operations to the LSQ, and to indicate the termination of the execution of the kernel to the memory interfaces.

The idea of the issue is to design an alternative version of the current CfToHandshake pass (let's call it FtdCfToHandshake) which can be enabled through a flag at compile time (compile --fast-token-delivery). In this way, the current version of Dynamatic requires no change.
Also, since this methodology is about the conversion from the cf dialect to the handshake dialect, the rest of the Dynamatic flow remains unchanged (considering both the transformations to the handshake IR and the following lowering to the hw dialect).

Steps

Finish a Boolean Logic library in Dynamatic.
The FTD methodology requires many analysis of boolean conditions within the circuit (it is mainly about understanding which conditions must be satisfied to go from node A to node B following a given path). A draft of this library is already present in the Experimental folder. However, some new features must be added in order for it to be used in the algorithm. We are aware that having a custom library might be redundant in the codebase - taking into account that Espresso is already among the dependencies of the project. Once that FTD is implemented, we aim at replacing the current custom library with something else.
Implement an analysis pass to obtain control dependency information on the CFG.
This analysis pass (based on [4]) is a fundamental step of the FTD algorithm and the GSA conversion.
Implement an analysis pass which extracts the GSA information out of the MLIR's SSA representation.
The FTD methodology is based on GSA, an alternative version to SSA which employs boolean conditions together with phi nodes (phi nodes are not explicit in MLIR, but encoded through block arguments in the cf dialect). There are many papers showing how to approach such a problem, such as [2] and [3]. The main consequence of having GSA is that, instead of using merge nodes for phi functions, we can go for multiplexers.
Implement the FTD methodology according to [1].
This step is the largest among the ones presented in the issue, as it requires to fully rewrite the CfToHandshake pass. In this first phase, the way memories are handled remains identical to what is done in the current Dynamatic (thus without SQ, using the network of control merges to allocate blocks in the LSQ). This can be further divided in the following steps:
- Initial pass structure and required classes (intermediate non complete step);
- Use the GSA information to get rid of block arguments in basic blocks and substitute them with multiplexers (intermediate non complete step);
- Use the start signal to trigger constants and undefined values in the circuit (intermediate non complete step);
- Add the regeneration mechanism for signals in loops (intermediate non complete step);
- Add the suppression mechanism for signals (final complete step);

Additional comments

As I mentioned above, the rest of the flow is not modified.
Some issues might arise from the interaction with other passes.
Since this is something not strictly related to the FTD algorithm (but rather, on the consistency of the whole flow) some new issues will be opened later on.

[1] A. Elakhras, A. Guerrieri, L. Josipović, and P. Ienne, “Unleashing parallelism in elastic circuits with faster token delivery,” in 2022 32nd International Conference on Field-Programmable Logic and Applications (FPL), IEEE, 2022, pp. 253–261. Accessed: Oct. 14, 2024. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/10035134/
[2] P. Havlak, “Construction of thinned gated single-assignment form,” in Languages and Compilers for Parallel Computing, U. Banerjee, D. Gelernter, A. Nicolau, and D. Padua, Eds., Berlin, Heidelberg: Springer, 1994, pp. 477–499. doi: 10.1007/3-540-57659-2_28.
[3] P. Tu and D. Padua, “Efficient Building and Placing of Gating Functions,” ACM SIGPLAN Notices, vol. 30, no. 6, pp. 47–55, Jan. 1995, doi: 10.1145/223428.207115.
[4] J. Ferrante, K. J. Ottenstein, and J. D. Warren, “The program dependence graph and its use in optimization,” ACM Trans. Program. Lang. Syst., vol. 9, no. 3, pp. 319–349, Jul. 1987, doi: 10.1145/24039.24041.

Jiahui17 · 2024-10-23T12:44:03Z

Jiahui17
Oct 23, 2024
Maintainer

@pcineverdies Thanks for formatting the PR! I will forward the issue to experts on this (TBH I am not familiar with the compiler things that you are doing here). In any case, here are some thoughts on the plan:

Comment on step 1:

We are aware that having a custom library might be redundant in the codebase - taking into account that Espresso is already among the dependencies of the project. Once that FTD is implemented, we aim at replacing the current custom library with something else.

I think the goal is to make the code introduced in every PR clean (which is the whole point of having a plan like this); now since we all agree that the Boolean library is redundant, it is better to clean it up in the first place.

Comment on step 2 and 3:

Implement an analysis pass which extracts the GSA information out of the MLIR's SSA representation

Isn't the plan to do a conversion pass from SSA to GSA, and then do something like GsaToHandshake? This implementation looks like a hack around the cf dialect though...

Update after talking to others
It would be nice (in the issue description) to have a clear separation between the analysis pass and the conversion pass; so that in case we implement a GSA dialect in the future, many of the code can be reused.

0 replies

lucas-rami · 2024-10-24T22:20:46Z

lucas-rami
Oct 24, 2024
Maintainer

Thanks a lot for doing this, it is great to be able to see a plan like this before PRs start pouring in. Few comments below.

We are aware that having a custom library might be redundant in the codebase - taking into account that Espresso is already among the dependencies of the project. Once that FTD is implemented, we aim at replacing the current custom library with something else.

Just to be clear, I (and everybody else I assume) have 0 emotional attachment to Espresso. I added it to the project over the summer because the people who did the initial work on this told me it was necessary. If it turns out not to be, I am happy to rm -r the folder.

Implement an analysis pass which extracts the GSA information out of the MLIR's SSA representation

Isn't the plan to do a conversion pass from SSA to GSA, and then do something like GsaToHandshake? This implementation looks like a hack around the cf dialect though...

It would be nice (in the issue description) to have a clear separation between the analysis pass and the conversion pass; so that in case we implement a GSA dialect in the future, many of the code can be reused.

Maybe we already all agree on this but I don't think that MLIR can truly support a GSA representation in the IR so it seems difficult to implement an "SSA to GSA" transformation pass (unless you are relying on special operation attributes perhaps?). To me an analysis pass seems like the right choice for this, as this allows you to maintain whatever "GSA information" you need in memory instead of in the IR itself.

0 replies

pcineverdies · 2024-10-25T06:52:07Z

pcineverdies
Oct 25, 2024
Collaborator Author

For what regards the boolean library, I agree that it's important to come up with a clean implementation.
Espresso is a nice software when it comes to boolean optimization, but it was not primarily designed to be a boolean logic library. Either I find a library which comprises both the functionalities (thus having a new reliable dependency, getting rid of both Espresso and the current hand-crafted library) or we continue with what we have right now. I'll be back with my proposal to this Issue in a couple of days.

Right now, GSA is implemented as an analysis pass which provides information to FtdCfToHandshake to perform the conversion (instead of instantiating MERGEs for the live-in values of each block it relies on MUXes). I see the point of having a gsa dialect as an intermediate step between cf and handshake. This would consist in making implicit SSA MLIR representation (which was a fundamental design choice in the framework) explicit.

Here is an example just for clarity.

C code

unsigned loop_sum(in_int_t a[N]) {
  int x = 0;
  for (int i = 0; i < N; i++) x += a[i];
  return x;
}

cf dialect

module {
  func.func @loop_sum(%arg0: memref<8xi32> {handshake.arg_name = "a"}) -> i32 {
    %c0 = arith.constant {handshake.name = "constant2"} 0 : index
    %c0_i32 = arith.constant {handshake.name = "constant3"} 0 : i32
    cf.br ^bb1(%c0, %c0_i32 : index, i32) {handshake.name = "br0"}
  ^bb1(%0: index, %1: i32):  // 2 preds: ^bb0, ^bb1
    %c8 = arith.constant {handshake.name = "constant4"} 8 : index
    %c1 = arith.constant {handshake.name = "constant5"} 1 : index
    %2 = memref.load %arg0[%0] {handshake.mem_interface = #handshake.mem_interface<MC>, handshake.name = "load1"} : memref<8xi32>
    %3 = arith.addi %1, %2 {handshake.name = "addi0"} : i32
    %4 = arith.addi %0, %c1 {handshake.name = "addi1"} : index
    %5 = arith.cmpi ult, %4, %c8 {handshake.name = "cmpi0"} : index
    cf.cond_br %5, ^bb1(%4, %3 : index, i32), ^bb2 {handshake.name = "cond_br0"}
  ^bb2:  // pred: ^bb1
    return {handshake.name = "return0"} %3 : i32
  }
}

gsa dialect (explicit phi without block arguments, see [2] for semantics)

module {
  func.func @loop_sum(%arg0: memref<8xi32> {handshake.arg_name = "a"}) -> i32 {
    %c0 = arith.constant {handshake.name = "constant2"} 0 : index
    %c0_i32 = arith.constant {handshake.name = "constant3"} 0 : i32
    cf.br ^bb1 {handshake.name = "br0"}
  ^bb1():  // 2 preds: ^bb0, ^bb1
    %0 = gsa.mu(%c0, %4); :index
    %1 = gsa.mu(%c0_i32, %3); :i32
    %c8 = arith.constant {handshake.name = "constant4"} 8 : index
    %c1 = arith.constant {handshake.name = "constant5"} 1 : index
    %2 = memref.load %arg0[%0] {handshake.mem_interface = #handshake.mem_interface<MC>, handshake.name = "load1"} : memref<8xi32>
    %3 = arith.addi %1, %2 {handshake.name = "addi0"} : i32
    %4 = arith.addi %0, %c1 {handshake.name = "addi1"} : index
    %5 = arith.cmpi ult, %4, %c8 {handshake.name = "cmpi0"} : index
    cf.cond_br %5, ^bb1, ^bb2 {handshake.name = "cond_br0"}
  ^bb2:  // pred: ^bb1
    return {handshake.name = "return0"} %3 : i32
  }
}

On the one hand, this latter approach seems cleaner, and it would allow to handle GSA gates as normal operations in the IR (rather than having a custom data-structure with the GSA information). On the other hand, as Lucas mentioned, gsa.mu and gsa.gamma would just be some placeholders without an intrinsic semantics attached, which might be an issue when it comes to consistency with the rest of the framework.

Currently - in our working branch - we are relying on the analysis pass, but it's also immediate to introduce a gsa dialect and an intermediate conversion (cfToGsa -> FtdGsaToHandshake). I'm open to both solutions!

0 replies

lucas-rami · 2024-10-27T16:25:46Z

lucas-rami
Oct 27, 2024
Maintainer

gsa dialect (explicit phi without block arguments, see [2] for semantics)

That's what I meant, I am 99% sure MLIR won't let you do what you do with gsa.mu in your example. The reason is that you reference %4 and %3 before they are defined in the graph dominance sense, which is not allowed within an SSACFG region, the kind kind of region inside a func.func operation. Even if your operation's "internal semantics" would make it so that these values are never "picked" before ^bb1 executes once, this is completely opaque to MLIR.

FYI, there is another kind of region called a graph region which allows you to do such things as using an SSA value before it is defined in the IR (a non-external handshake.func operation contains a single of those regions) as long as you have a single basic block in the region, which is unfortunately not the case here.

Therefore I strongly suggest going for an analysis pass. I am not saying you could not dirtily hack around a semi-explicit version of GSA in the IR itself, but I'm saying it won't be practical to use, assuming it is even doable without losing one's sanity.

0 replies

Jiahui17 · 2024-10-29T13:42:46Z

Jiahui17
Oct 29, 2024
Maintainer

@pcineverdies @lucas-rami thank you for the clarifications and discussions! The current plan sounds good to me. I will give a look into #147

0 replies

rpirayadi · 2024-11-12T09:52:55Z

rpirayadi
Nov 12, 2024
Collaborator

Following the discussion on issue #185 and specifically this comment, I would like to mention that making the Fast Token Delivery code more modular, so I can run it later on over the Handshake MLIR would be really useful for my purposes. More specifically, I would like a function that allows me to pass two newly added producer and consumer, and inserts the necessary circuit for Fast Token Delivery (i.e., appending suppressors and generators where necessary).

0 replies

pcineverdies · 2024-11-12T10:15:52Z

pcineverdies
Nov 12, 2024
Collaborator Author

I am currently working in this direction. In this way, FtdCfToHandshake will produce a circuit that is fully functional, but you will be also able to run the algorithm on a portion of the IR.

From an abstract point of view, this would end up in two functions which look like this:

LogicalResult addRegenProducerConsumer(...);
LogicalResult addSuppressionProducerConsumer(...);

The possibility of using FTD outside of the conversion pass introduces however a few problems. I am going to list all of them, maybe this helps us in finding a solution :)

FTD requires CFG information. There is unfortunately no way to skip this requirement, as the CFG is source both of the dominance information and the loop analysis. Clearly, this is not a limit for the conversion pass, but after the IR is flattened (no more blocks) the info is not available.
Support/CFG might be a starting point in this direction, but it needs to be adapted to the structure of the handshake IR after the conversion pass.
Another possibility would be to annotate the CFG information to the handshake::FuncOp operation, and build it again out of that (I assume there is a strong reason why @lucas-rami decided to rebuild the CFG rather than annotating it).
FTD requires to know which operations were added by FTD itself. In the conversion pass, this is handled using local hash maps to store such operations. The main idea is that while running FTD, some pairs of producers/consumers should be skipped to guarantee correctness. Same holds in case someone wants to re-run FTD after the conversion is done.
A solution to such a problem might be to annotate the single handshake operations with the FTD information, so that this can be also extracted at a later stage (see below). I don't like adding "garbage" this way to the IR, but this is functionally correct (I already tested it).

    %98 = not %109 {ftd.skip, handshake.bb = 7 : ui32, handshake.name = "not23"} : <i1>

EDIT: this requirement for FTD does not compromise #186 and #187.

0 replies

lucas-rami · 2024-11-12T21:58:23Z

lucas-rami
Nov 12, 2024
Maintainer

Thanks for the insights on your current/future API @pcineverdies, that's helpful. That's what I meant when I said that sometimes things are much easier conceptually than in the implementation ;)

Another possibility would be to annotate the CFG information to the handshake::FuncOp operation, and build it again out of that (I assume there is a strong reason why @lucas-rami decided to rebuild the CFG rather than annotating it).

I'm actually quite in favor of doing that. I never got around to do it because there has never been a direct need so far but I think it makes a lot of sense and will be almost surely necessary to do other things in the future (e.g., for adapting CFDFC identification to work with FTD circuits). Possibly something to add on this tracking issue and implement at some point.

I don't like adding "garbage" this way to the IR, but this is functionally correct (I already tested it).

Not a fan either, especially if it's only useful in the "I want to call this function from outside the pass" case. Would it be possible to somehow expose the analysis that derives which operations should be skipped?

0 replies

pcineverdies · 2024-11-13T07:02:46Z

pcineverdies
Nov 13, 2024
Collaborator Author

Thanks for your suggestions!
I will look forward for both the problems.

For the former point, not only we need to move the CFG information around, but we also need a way to use MLIR/LLVM APIs over it (for instance, to use the DominanceInfo object). I need to dive into MLIR a bit deeper to understand its feasibility.

EDIT: I was thinking that what's really convenient to have is the block structure as it is just before the call to flattenAndTerminate. Clearly it's not a thing to leave it there after the handshake conversion. However, we already have all the operations annotated with their basic blocks, what if we keep track of the edges and we rebuild the block structure out of that in MLIR (adding blocks and cf.br/cf.cond_br)? It probably sounds sketchy, but it would allow to reuse all the functionalities we have throughout the conversion from cf to handshake, and it would be just an intermediate version of the IR throughout a single pass, so that we can flatten it before the end of the pass.

For the latter: right now the operations are marked according to different categories as they are created ("you should be always skipped, you should be skipped only in this case, you should be skipped in this other case..."). Maybe I end up finding a way to rescue the information out of the IR topology itself. Might be back on this point! :)

0 replies

lucas-rami · 2024-11-13T23:12:49Z

lucas-rami
Nov 13, 2024
Maintainer

However, we already have all the operations annotated with their basic blocks, what if we keep track of the edges and we rebuild the block structure out of that in MLIR (adding blocks and cf.br/cf.cond_br)?

Not sure I understand your idea there. Do you want to maintain a "shell CF function" in the IR post conversion that basically just has the original control flow? If that's it then I am against keeping around IR operations that represent no actual hardware in the end just for the sake of future analysis. That's the job of attributes, which is why I liked you idea of annotating the original CFG on Handshake functions during the conversion. At some point I looked at the DominanceInfo stuff and I seem to vaguely remember that there was a way to pass it an arbitrary graph to compute dominance for, even if that graph is not a CFG of MLIR basic blocks. If that is true, then you should be able to use those utilities through the attribute-encoded CFG.

0 replies

pcineverdies · 2024-11-20T14:55:33Z

pcineverdies
Nov 20, 2024
Collaborator Author

As I have dived a bit deeper into the topic, I'd like to show you my current proposal to the problem "we need to maintain the CFG structure after the conversion from cf to handshake".

As I mentioned earlier, it is really convenient to have the block structure in the IR itself as it happens in the cf format. This allows to use a lot of facilities provided by MLIR. Also, Since this is what we adopt in the conversion pass, it would allow to reuse much of the code developed for the conversion pass itself.

However (and we are absolutely on the same page here) after the conversion is done, nothing except hardware should be in the handshake version of the circuit. This holds for the final outputs of the transformation pass. However, I believe that it might be beneficial to temporary build the block structure again throughout a transformation pass, exploit the functionalities we need and then flatten the IR again.

I'll show you what I have in mind with an example (fir).

We start with the following function in the cf dialect.

module {
  func.func @fir(%arg0: memref<1000xi32> {handshake.arg_name = "di"}, %arg1: memref<1000xi32> {handshake.arg_name = "idx"}) -> i32 {
    %c0 = arith.constant {handshake.name = "constant2"} 0 : index
    %c0_i32 = arith.constant {handshake.name = "constant3"} 0 : i32
    cf.br ^bb1(%c0, %c0_i32 : index, i32) {handshake.name = "br0"}
  ^bb1(%0: index, %1: i32):  // 2 preds: ^bb0, ^bb1
    %c999 = arith.constant {handshake.name = "constant5"} 999 : index
    %c1000 = arith.constant {handshake.name = "constant6"} 1000 : index
    %c1 = arith.constant {handshake.name = "constant7"} 1 : index
    %2 = memref.load %arg1[%0] {handshake.mem_interface = #handshake.mem_interface<MC>, handshake.name = "load2"} : memref<1000xi32>
    %3 = arith.subi %c999, %0 {handshake.name = "subi0"} : index
    %4 = memref.load %arg0[%3] {handshake.mem_interface = #handshake.mem_interface<MC>, handshake.name = "load3"} : memref<1000xi32>
    %5 = arith.muli %2, %4 {handshake.name = "muli0"} : i32
    %6 = arith.addi %1, %5 {handshake.name = "addi0"} : i32
    %7 = arith.addi %0, %c1 {handshake.name = "addi2"} : index
    %8 = arith.cmpi ult, %7, %c1000 {handshake.name = "cmpi0"} : index
    cf.cond_br %8, ^bb1(%7, %6 : index, i32), ^bb2 {handshake.name = "cond_br0"}
  ^bb2:  // pred: ^bb1
    return {handshake.name = "return0"} %6 : i32
  }
}

While converting to handshake, we don't to anything special. However, we keep track of the "edges" between basic blocks. In this case, it's important to remember that:

there is an unconditional branch from BB0 to BB1;
there is a conditional branch from BB1 to either BB1 or BB2, dependant on the result of cmpi0 (the name analysis will keep this name consistent across passes).

As an annotation, we might store something like this (syntax is just as a reference):

{[
   {"src": 0, "dstT":1, "dstF": 1, "cond": ""},
   {"src": 1, "dstT":1, "dstF": 2, "cond": "cmpi0"},
]}

We would end-up with an handshake IR with nothing but handshake operations, together with (annotated somewhere in the handshake::func operation) the CFG structure.

module {
  handshake.func @fir(%arg0: memref<1000xi32>, %arg1: memref<1000xi32>, %arg2: !handshake.control<>, %arg3: !handshake.control<>, %arg4: !handshake.control<>, ...) -> (!handshake.channel<i32>, !handshake.control<>, !handshake.control<>, !handshake.control<>) attributes {argNames = ["di", "idx", "di_start", "idx_start", "start"], resNames = ["out0", "di_end", "idx_end", "end"]} {

    %outputs, %memEnd = mem_controller[%arg1 : memref<1000xi32>] %arg3 (%addressResult) %result_16 {connectedBlocks = [1 : i32], handshake.name = "mem_controller0"} : (!handshake.channel<i32>) -> !handshake.channel<i32>
    %1 = constant %arg4 {handshake.bb = 0 : ui32, handshake.name = "constant3", value = 0 : i32} : <i32>
    [...]
    %2 = br %arg4 {handshake.bb = 0 : ui32, handshake.name = "br1"} : <>

    %trueResult, %falseResult = cond_br %3, %30 {handshake.bb = 1 : ui32, handshake.name = "cond_br1"} : <i1>, <i32>
    [...]
    %trueResult_14, %falseResult_15 = cond_br %31, %result {handshake.bb = 1 : ui32, handshake.name = "cond_br7"} : <i1>, <>

    %result_16, %index_17 = control_merge %falseResult_15  {handshake.bb = 2 : ui32, handshake.name = "control_merge1"} : <>, <i1>
    end {handshake.bb = 2 : ui32, handshake.name = "end0"} %trueResult_2, %memEnd_1, %memEnd, %arg4 : <i32>, <>, <>, <>
  }
}{[{"src": 0, "dstT":1, "dstF": 1, "cond": ""},{"src": 1, "dstT":1, "dstF": 2, "cond": "cmpi0"},]}

Now, each of the current transformation passes is not affected by this new setup.
Let's say that I want to create a new pass over the handshake dialect in which I really need the CFG structure.

What I can do is to parse the edge information annotated, and temporary create some blocks with cf.br and cf.cond_br terminator to re-insert the CFG structure. This is simple as all the operations in the IR are already marked with their respective original basic block. In this case, we would end up (while at an intermediate point of the transform pass) with:

module {
  handshake.func @fir(%arg0: memref<1000xi32>, %arg1: memref<1000xi32>, %arg2: !handshake.control<>, %arg3: !handshake.control<>, %arg4: !handshake.control<>, ...) -> (!handshake.channel<i32>, !handshake.control<>, !handshake.control<>, !handshake.control<>) attributes {argNames = ["di", "idx", "di_start", "idx_start", "start"], resNames = ["out0", "di_end", "idx_end", "end"]} {
    %1 = constant %arg4 {handshake.bb = 0 : ui32, handshake.name = "constant3", value = 0 : i32} : <i32>
    [...]
    %4 = br %arg4 {handshake.bb = 0 : ui32, handshake.name = "br1"} : <>
    cf.br ^bb1(%c0, %c0_i32 : index, i32) {handshake.name = "br0"}
  ^bb1():  // 2 preds: ^bb0, ^bb1
    %trueResult, %falseResult = cond_br %3, %30 {handshake.bb = 1 : ui32, handshake.name = "cond_br1"} : <i1>, <i32>
    [...]
    %31 = cmpi ult, %30, %25 {handshake.bb = 1 : ui32, handshake.name = "cmpi0"} : <i32>
    %trueResult_14, %falseResult_15 = cond_br %31, %result {handshake.bb = 1 : ui32, handshake.name = "cond_br7"} : <i1>, <>
    cf.cond_br %31, ^bb1(), ^bb2 {handshake.name = "cond_br0"}
  ^bb2:  // pred: ^bb1
      %result_16, %index_17 = control_merge %falseResult_15  {handshake.bb = 2 : ui32, handshake.name = "control_merge1"} : <>, <i1>
    end {handshake.bb = 2 : ui32, handshake.name = "end0"} %trueResult_2, %memEnd_1, %memEnd, %arg4 : <i32>, <>, <>, <>
  }
}{[{"src": 0, "dstT":1, "dstF": 1, "cond": ""},{"src": 1, "dstT":1, "dstF": 2, "cond": "cmpi0"},]}

This clearly makes no sense from a circuit perspective, but it allows us to have all the features we used to have in the conversion pass.
Once that we are done with the transformation (before terminating), we flatten the IR: we get rid of cf.branch and cf.cond_br and we put everything back in the only block of the region.

module {
  handshake.func @fir(%arg0: memref<1000xi32>, %arg1: memref<1000xi32>, %arg2: !handshake.control<>, %arg3: !handshake.control<>, %arg4: !handshake.control<>, ...) -> (!handshake.channel<i32>, !handshake.control<>, !handshake.control<>, !handshake.control<>) attributes {argNames = ["di", "idx", "di_start", "idx_start", "start"], resNames = ["out0", "di_end", "idx_end", "end"]} {

    %outputs, %memEnd = mem_controller[%arg1 : memref<1000xi32>] %arg3 (%addressResult) %result_16 {connectedBlocks = [1 : i32], handshake.name = "mem_controller0"} : (!handshake.channel<i32>) -> !handshake.channel<i32>
    %1 = constant %arg4 {handshake.bb = 0 : ui32, handshake.name = "constant3", value = 0 : i32} : <i32>
    [...]
    %2 = br %arg4 {handshake.bb = 0 : ui32, handshake.name = "br1"} : <>

    %trueResult, %falseResult = cond_br %3, %30 {handshake.bb = 1 : ui32, handshake.name = "cond_br1"} : <i1>, <i32>
    [...]
    %trueResult_14, %falseResult_15 = cond_br %31, %result {handshake.bb = 1 : ui32, handshake.name = "cond_br7"} : <i1>, <>

    %result_16, %index_17 = control_merge %falseResult_15  {handshake.bb = 2 : ui32, handshake.name = "control_merge1"} : <>, <i1>
    end {handshake.bb = 2 : ui32, handshake.name = "end0"} %trueResult_2, %memEnd_1, %memEnd, %arg4 : <i32>, <>, <>, <>
  }
}{[{"src": 0, "dstT":1, "dstF": 1, "cond": ""},{"src": 1, "dstT":1, "dstF": 2, "cond": "cmpi0"},]}

In this way, the sketchy representation is only intermediate and the final output is always consistent with the handshake expectations.

What do you think?

So far I have made some experiments while still in the conversion pass, and it seems to work without any issue. After the call to flattenAndTerminate I recreate the block structure, I use it to get some dominance information over the original CFG and then I flatten everything and continue the compilation flow.

0 replies

lucas-rami · 2024-11-23T18:01:34Z

lucas-rami
Nov 23, 2024
Maintainer

Thanks for elaborating on your proposal. If that is the only way to go (I can't think of another) then I think it is ok, even though it feels like kind of a hack. Few comments below.

there is a conditional branch from BB1 to either BB1 or BB2, dependant on the result of cmpi0 (the name analysis will keep this name consistent across passes).

I don't think you need to store the name of the condition's producer if you just care about the CFG. You will anyway create a "dummy" structure so it doesn't need to have proper conditionals on the cf.cond_brs. I am also mentioning this because this information is likely to become inconsistent with the IR state at seemingly random times. If a pass rewrites the producer of one of your conditionals the operation's name will change (unless extra steps are taken to maintain operation names) and then you won't be able to map back the attribute to the actual IR.

One last thought: have you considered creating a full func-level function with your desired CFG instead of modifying the Handshake function itself, then just deleting the func-level function entirely? I am thinking it may be easier because you won't have any "real" operation inside to care about in the process. Just create the function, insert your blocks and terminators as desired, run the analysis, then delete the entire thing. Feel free to ignore this if it is actually harder to implement than what you have, just wanted to share the idea in case you didn't already have it.

0 replies

pcineverdies · 2024-11-23T22:26:44Z

pcineverdies
Nov 23, 2024
Collaborator Author

Thanks! I am finding my way around through the implementation (I had one minor problem yesterday, I hope it's just a matter of modifying some dependencies in Passes.td - when I try to create a cf operation inside a transformation pass MLIR complains over the dialect not being imported).

I see the problem of the naming. Having the operation determining the edge to be taken might (70% sure) might be useful; however - while optimizing the working code before a PR - we might end up stating that such an information is not necessary.

On the contrary, having operations inside the blocks (thus, moving them back and forth across them) is necessary to easily access information related to the relationship between the operation themselves and the CFG. Just to give you an idea, it's about getTerminator, using the loop analysis to say if an operation is inside of a loop and which loop...

With respect to the "hacky" way, I also see it as a dirty approach. However, after these few months working on the project, I realized that the handshake level - being really in between the software and the hardware - always requires going back and forth the two abstractions. My idea comes out of this thought: "It's so simple to manipulate the IR in cfToHandshake before calling flattenAndTerminate. Since no major information gets lost in the process, why can't it be that simple afterwards?"

In the end, if I'm not wrong, the main contributions of such a proposal in the current codebase are the revisits of both ICFPT'19 and buffer placement (CFDFC?), getting rid of the CFG.h functionalities to properly support FTD. I'm not sure if I'll be directly involved in this update - we'll see!

0 replies

lucas-rami · 2024-11-24T16:52:11Z

lucas-rami
Nov 24, 2024
Maintainer

when I try to create a cf operation inside a transformation pass MLIR complains over the dialect not being imported

You may need to add the cf dialect as a dependent dialect in the pass declaration in Passes.td.

On the contrary, having operations inside the blocks (thus, moving them back and forth across them) is necessary to easily access information related to the relationship between the operation themselves and the CFG. Just to give you an idea, it's about getTerminator, using the loop analysis to say if an operation is inside of a loop and which loop...

The relationship between operations and the CFG is currently handled by the handshake.bb attributes, so it doesn't seem necessary to do anything else to link operations to "basic blocks". Besides, do we care for any loop analysis post conversion to Handshake? The right place and time to do whatever CF-style analysis you want in the conversion pass is at the very beginning when you have the func-level function around; once you start translating to Handshake it becomes harder to do analysis on the partially valid IR so you should just be busy doing the translation.

Since no major information gets lost in the process, why can't it be that simple afterwards?

getting rid of the CFG.h functionalities to properly support FTD.

I think I need to clarify a few things here. Major information does get lost in the process, and this is unavoidable considering that we are moving from a CFG-based software-like IR to a pure graph (a CDFG in our jargon). From Handshake forward, the original CFG no longer exists---and I mean that from a conceptual point of view, not just as in "it's no longer visible in the IR"---even if we want to believe really hard that it still does. While some of our downstream algorithms rely on us remembering what that CFG looked like (hence the handshake.bb attributes and this discussion about annotating the CFG somewhere), to me this points to limitations in our algorithms themselves (not the one you are currently implementing though, which can rely on the CFG since its input is an actual CFG); it does not indicate that we should go to great lengths to be able to reconstruct the original CFG at any time and associate all operations to a particular "BB" because we know the latter is basically impossible in the general case.

A good example is FTD circuits and the CFDFC identification logic for buffer placement. These two are currently incompatible because FTD circuits "mess up the CDFG to CFG correspondence" (for good reasons) enough so that the current approach fails to identify to which CFG cycles some edges belong. Most likely this is fixable because FTD circuits mess up the CFG/CDFG in a predictable/limited way that we can probably recover from (though I don't think anybody ever bothered to make a formal argument for this, it looks feasible in practice). However, tomorrow someone will invent a new even more aggressive optimization that transforms the circuit to such an extent that we won't be able to semantically link some operations to any BB that used to exist, and we should be able to handle that i.e., we should not rely on the original CFG.

All of this to say that, every time we rely on the CFG being recoverable in Handshake, we are a little bit lying to ourselves. We cross our fingers and hope that the circuit has not been transformed too hard since conversion from CF, but this is obviously unreliable. Nobody is talking of getting rid of CFG.h (which should really be called CDFG.h), if anything the analyses this unit provides should be greatly improved and become our standard suite for analyzing the CDFG reliably.

Pragmatically, I still think it makes sense to maintain the original CFG as an annotation on the function because we will most likely need it in the near future, but I only personally care about the list of BBs and edges between them.

0 replies

pcineverdies · 2024-12-04T08:57:27Z

pcineverdies
Dec 4, 2024
Collaborator Author

I'm back on this topic.
I was able to make a pass which uses the information of the CFG as annotated in the IR to re-insert the control flow structure, exploit it and get rid of it before terminating.

The API for this CFGAnnotation library looks like this:

annotateCFG: during the conversion pass, the edges are saved as annotation of the handshake::FuncOp operation, in a a format which looks like {[0,1],[1,1,2,cmpi1],[2,3]}. The first number represents the source node; if only one number is following, then we have an unconditional branch; otherwise, we have a conditional branch with false destination, true destination and name of the operation defining the outcome.
restoreCfStructure: when required, the annotation is used to revert the control flow structure.
flattenFunction: both the blocks and the terminators are removed, so that the final operation is consistent with the requirement of having only one block.

A couple of thoughts over what you have mentioned in your past few messages.

The whole need of the control flow structure is to run fast token delivery later in the flow. If we add new operations and new connections after the conversion pass itself, we need to make sure that those new operations undergo the same flow adopted in the conversion stage.
Because of the inner logic of fast token delivery, it is unavoidable to know the conditions determining each conditional edge in the CFG. This is why I inserted cmp1 above in the API description.

I am also mentioning this because this information is likely to become inconsistent with the IR state at seemingly random times. If a pass rewrites the producer of one of your conditionals the operation's name will change (unless extra steps are taken to maintain operation names) and then you won't be able to map back the attribute to the actual IR.

This is clearly the main issue in this requirement. I already encountered a couple of cases in which, due to a renaming, it was not possible to restore the CFG structure.
I came up with a couple of alternatives, which I'd like to have your opinion on:

An error is raised if the operation cannot be recovered;
The very last cmpi/cmpf from the same basic block is adopted as condition;
Some strong countermeasures are taken throughout the flow to make sure that the name of the condition is maintained as it is.

However, each of these solutions have some issues:

The end-user cannot do anything to get rid of this error;
The chosen operation might be not correct, leading to a wrong circuit;
Very strict as a requirement.

As you said the best way to go would be to live without this information, but, for our scopes, this seems unavoidable.

0 replies

pcineverdies · 2024-12-04T09:18:44Z

pcineverdies
Dec 4, 2024
Collaborator Author

What I mentioned above is sort of orthogonal to the conversion pass using fast token delivery which, on the contrary, is now functional (the integration tests working without FTD are also working with FTD - clearly this is not a metric of perfection but I consider it as a significant milestone). This is the code structure:

experimental/lib/Support/handshakeSupport. Set of functionalities which are required across many libraries (for instance, extracting paths between basic blocks, checking conditions over loops, indexing blocks according to their relationship in the dominance tree...).
experimental/lib/Analysis/GSAAnalysis. Merged with [Experimental][Analysis] GSA Analysis #187.
lib/Analysis/ControlDependenceAnalysis. Merged with [Analysis] Control Dependence Analysis #186.
experimental/lib/Support/FtdSupport. Set of functionalities which implement the main FTD algorithm.
experimental/lib/Conversion/FtdCfToHandshake. Conversion pass exploiting all the above functionalities to generate a functional circuit.
experimental/lib/Transform/CombineSteeringLogic. The conversion pass introduces much redundancy in the circuit, which is then removed through this pass.

You can have a clue of what is going on in my worknig branch.

If it looks okay to you, in the next few weeks I plan to go for 5/6 different PRs, each of them containing the above points (I also include the analysis passes as I needed some modifications since the last PRs). The core of the code is in the experimental folder.

FTD will be run from the compilation flow using a custom flag:

set-src /path/to/source.c
compile  --fast-token-delivery 
write-hdl
simulate
exit

The conversion pass runs through a class called FtdLowerFuncToHandshake, which uses LowerFuncToHandshake as base class (since some functionalities are in common). To allow this inheritance, I had to make some methods/attributes from private to protected and public. AFAIR this is the only modification I did for what regards the code outside of experimental/.

FTD cannot currently use the ICFPT'19 functionalities; also, only the "on-merges" buffer placement is available.

0 replies

lucas-rami · 2024-12-05T23:32:02Z

lucas-rami
Dec 5, 2024
Maintainer

Thanks for the summary, I was starting to loose track of some things so it is very helpful.

This is clearly the main issue in this requirement. I already encountered a couple of cases in which, due to a renaming, it was not possible to restore the CFG structure.
I came up with a couple of alternatives, which I'd like to have your opinion on [...] As you said the best way to go would be to live without this information, but, for our scopes, this seems unavoidable.

Thanks for thinking this through so well. The best solution I can think of is a mix of all three. First, and on your alternative 3, we should try to make our existing passes preserve operation names when an operation is replaced with "something equivalent". Many passes do not do that yet. (Initially we designed the "namer" so that it would explicitly disallow name reusing, but many people care about these names being consistent so at some point I relaxed this constraint and created a method on the NameAnalysis to help in replacing an op while keeping the name, which I am sure you have encountered.) This can be done incrementally and at least reduce the number of times you have to worry about losing consistency between your attribute and your IR. Of course we cannot enforce that every pass honors this (neither should we, too strict of a constraint as you mention), but we can at least ask nicely. If that fails, then we resort to alternative 2; it is debatable that we should do this alternative at all, as now if you wrongly identify the cmp you will do something wrong "silently" without necessarily realizing it on the spot. Generating a warning every time you have do this may be the good counter-measure. Finally, if all else fails, alternative 1 is the only way to go.

If it looks okay to you, in the next few weeks I plan to go for 5/6 different PRs, each of them containing the above points (I also include the analysis passes as I needed some modifications since the last PRs). The core of the code is in the experimental folder.

Awesome, thanks for thinking of breaking these things up in multiple contributions, and for the detailed plan :) As we approach end-of-the-year holidays I may not be very responsive depending on when you make your PRs but I will do my best to keep an eye on the project.

To allow this inheritance, I had to make some methods/attributes from private to protected and public. AFAIR this is the only modification I did for what regards the code outside of experimental/.

Perfectly fine.

0 replies

AyaElAkhras · 2024-12-09T21:55:25Z

AyaElAkhras
Dec 9, 2024
Maintainer

Thanks guys for all the discussions!

This is clearly the main issue in this requirement. I already encountered a couple of cases in which, due to a renaming, it was
not possible to restore the CFG structure.

I wonder if this issue can be solved by using attributes to label the operations that calculate the conditions of basic blocks, and to parse these new attributes in the CFG reconstruction, instead of relying on operation names? We could then enforce the propagation of such attributes across the different passes in the same way we propagate the handshake.bb, for instance.

Here is one way of applying this on @pcineverdies's earlier example, assuming we call the new attribute handshake.cond :

module {
  handshake.func @fir(%arg0: memref<1000xi32>, %arg1: memref<1000xi32>, %arg2: !handshake.control<>, %arg3: !handshake.control<>, %arg4: !handshake.control<>, ...) -> (!handshake.channel<i32>, !handshake.control<>, !handshake.control<>, !handshake.control<>) attributes {argNames = ["di", "idx", "di_start", "idx_start", "start"], resNames = ["out0", "di_end", "idx_end", "end"]} {

    %outputs, %memEnd = mem_controller[%arg1 : memref<1000xi32>] %arg3 (%addressResult) %result_16 {connectedBlocks = [1 : i32], handshake.name = "mem_controller0"} : (!handshake.channel<i32>) -> !handshake.channel<i32>
    %1 = constant %arg4 {handshake.bb = 0 : ui32, handshake.name = "constant3", value = 0 : i32} : <i32>
    [...]
    %2 = br %arg4 {handshake.bb = 0 : ui32, handshake.name = "br1"} : <>

    %trueResult, %falseResult = cond_br %3, %30 {handshake.bb = 1 : ui32, handshake.name = "cond_br1"} : <i1>, <i32>

    %31 = cmpi ult, %30, %25 {handshake.bb = 1 : ui32, handshake.name = "cmpi0", handshake.cond = "c0"} : <i32>
    [...]
    %trueResult_14, %falseResult_15 = cond_br %31, %result {handshake.bb = 1 : ui32, handshake.name = "cond_br7"} : <i1>, <>

    %result_16, %index_17 = control_merge %falseResult_15  {handshake.bb = 2 : ui32, handshake.name = "control_merge1"} : <>, <i1>
    end {handshake.bb = 2 : ui32, handshake.name = "end0"} %trueResult_2, %memEnd_1, %memEnd, %arg4 : <i32>, <>, <>, <>
  }
}{[{"src": 0, "dstT":1, "dstF": 1, "cond": ""},{"src": 1, "dstT":1, "dstF": 2, "cond": "c0"},]}

Any thoughts?

0 replies

lucas-rami · 2024-12-10T00:28:13Z

lucas-rami
Dec 10, 2024
Maintainer

Operation names are already attributes just like handshake.cond, so it seems these attributes would be just as hard to maintain throughout lowering passes. Unless I am missing something, our best solution may be to try to keep operation names consistent.

The only case relying on operation names would probably fail is if we start having passes delete the operations responsible for producing the i1 values that are the BB conditions without clearly replacing them with some other operation. Of course these values would still be produced in another way, but we most likely wouldn't really be able to match them to the CFG attribute on the function (I don't think we have anything like that right now, but thinking ahead). I think any attribute-based approach would suffer in this case...

Now that I think of this more however. Have you considered not relying on the operation producing the i1 but instead on the handshake.cond_brs of the block of interest? For example, if you wanted to figure out which operation produces the value that determines which successor is selected after BBX, then you would just iterate over handshake.cond_brs whose handshake.bb == X, then backtrack from the select operand to the defining operation, and voila (technically you would only need one conditional branch but you may want to sanity check that all branches in the BB are controlled by the same value). This would remove any dependency on operation attributes, names or otherwise, being maintained and instead purely rely on the CDFG itself, which is more reliable. Seems a little bit too good to be true, so I am expecting that I am missing an edge case here.

0 replies

pcineverdies · 2024-12-10T09:14:09Z

pcineverdies
Dec 10, 2024
Collaborator Author

It looks like the problem is currently caused by the pass HandshakeOptimizeBitwidths, which does not use the name analysis to propagate the names over arithmetic operations. By modifying its behaviour, everything turns out as expected. This might be beneficial for #205 too, depending on what the buffer people decide to do.

0 replies

AyaElAkhras · 2024-12-10T12:55:29Z

AyaElAkhras
Dec 10, 2024
Maintainer

Operation names are already attributes just like handshake.cond, so it seems these attributes would be just as hard to maintain throughout lowering passes. Unless I am missing something, our best solution may be to try to keep operation names consistent.

The maintenance problem remains equally for all attributes, but with handshake.cond, we rely on a flag that we set, so the only requirement would be to maintain this flag, even if someone decides to change the operation's name. If someone decides to delete the operation completely and replace it with another one, then we need a mechanism to "force" the new operation to inherit the handshake.cond attribute, but I do not know if such a "forcing" mechanism exists...

Now that I think of this more however. Have you considered not relying on the operation producing the i1 but instead on the handshake.cond_brs of the block of interest? For example, if you wanted to figure out which operation produces the value that determines which successor is selected after BBX, then you would just iterate over handshake.cond_brs whose handshake.bb == X, then backtrack from the select operand to the defining operation, and voila (technically you would only need one conditional branch but you may want to sanity check that all branches in the BB are controlled by the same value). This would remove any dependency on operation attributes, names or otherwise, being maintained and instead purely rely on the CDFG itself, which is more reliable. Seems a little bit too good to be true, so I am expecting that I am missing an edge case here.

This is not general enough because, if I understand it correctly, it requires that any branch in a particular BB has an i1 corresponding to the condition of that BB. However, this is not necessarily the case. Specifically, in FTD, some branches may use the output of a Mux that is a function of the conditions of two BBs as its i1. And, in the future, someone might invent a new circuit generation strategy that is even more wild.

0 replies

lucas-rami · 2024-12-10T12:55:51Z

lucas-rami
Dec 10, 2024
Maintainer

This would remove any dependency on operation attributes, names or otherwise, being maintained and instead purely rely on the CDFG itself, which is more reliable. Seems a little bit too good to be true, so I am expecting that I am missing an edge case here.

To slightly correct my own phrasing there since we would technically still be relying on an operation attribute, namely handshake.bb. I meant that this solution does not rely on an attribute that we have to make extra active efforts to maintain (e.g., handshake.name or handshake.cond). All passes (to my knowledge) already make the effort of maintaining correct handshake.bb attributes on all operations, since it is key to a lot of the things we do.

0 replies

lucas-rami · 2024-12-11T12:36:48Z

lucas-rami
Dec 11, 2024
Maintainer

The maintenance problem remains equally for all attributes, but with handshake.cond, we rely on a flag that we set, so the only requirement would be to maintain this flag, even if someone decides to change the operation's name.

I guess I see the benefit over using names (maintaining names remains as much as possible best-effort, maintaining this new attribute is more important for functionality), despite the fact that it is an additional burden to manage down the pipeline. Just to give some direction for implementing this, the way to manage such an attribute would be through an analysis that is queried and maintained valid by most (all?) Dynamatic passes. Using analyses for this kind of cross-pass concern is something we should do more often, including on things that already exist (handshake.bb for example should be managed by an analysis instead of "manually").

If someone decides to delete the operation completely and replace it with another one, then we need a mechanism to "force" the new operation to inherit the handshake.cond attribute, but I do not know if such a "forcing" mechanism exists...

We cannot really force this, but with an analysis you can make it relatively easy for a third-party pass to honor your requirement. Wihtout going into long details, the general idea is the same as the NameAnalysis::replaceOp method to reuse a name for an operation that replaces an existing one. This could be even easier if we started connecting our analyses to MLIR listeners during pattern-rewriting, but that's a significant refactoring I will hopefully one day find time to do.

This is not general enough because, if I understand it correctly, it requires that any branch in a particular BB has an i1 corresponding to the condition of that BB. However, this is not necessarily the case. Specifically, in FTD, some branches may use the output of a Mux that is a function of the conditions of two BBs as its i1.

Sounds good, thanks for clarifying why that would not work as is.

0 replies

AyaElAkhras · 2024-12-13T09:12:53Z

AyaElAkhras
Dec 13, 2024
Maintainer

Thanks!

0 replies

cappellini · 2025-02-09T16:12:36Z

cappellini
Feb 9, 2025
Collaborator

Hi @pcineverdies @AyaElAkhras

I was told that you're using an "Init" operation (buffer with an initial token) for your implementation. Do you have any guidance on how I can use such an operation myself? Thanks!

3 replies

pcineverdies Feb 11, 2025
Collaborator Author

Hi, sorry for late reply, this message slipped off.

We don't have any init component in the proper sense, we just build it by hand each time it's required. See Figure 6 from the original paper.

We have a 2-inputs merge, with one side connected to a constant that is activated only once through the start signal, and the other side connected to the "real" channel we want to use.
As the paper discusses, there are multiple ways of implementing this init behaviour. This is the most dataflow-like we could think of, but you can also go for a sequential approach.

I'll link a couple of lines which might help you :)

pcineverdies Feb 11, 2025
Collaborator Author

https://github.com/pcineverdies/dynamatic/blob/a762a5d56148801fb272e1c4739de7282276a266/results/atax_FTD/comp/handshake_sq.mlir#L22-L23

pcineverdies Feb 11, 2025
Collaborator Author

dynamatic/experimental/lib/Support/FtdImplementation.cpp

Lines 590 to 599 in 50607cc

    
           // Create the false constant to feed `init` 
        
           auto constOp = rewriter.create<handshake::ConstantOp>(consumerOp->getLoc(), 
        
                                                                 cstAttr, startValue); 
        
           constOp->setAttr(FTD_INIT_MERGE, rewriter.getUnitAttr()); 
        
           // Create the `init` operation 
        
           SmallVector<Value> mergeOperands = {constOp.getResult(), conditionValue}; 
        
           auto initMergeOp = rewriter.create<handshake::MergeOp>(consumerOp->getLoc(), 
        
                                                                  mergeOperands); 
        
           initMergeOp->setAttr(FTD_INIT_MERGE, rewriter.getUnitAttr());

Implementation of Fast Token Delivery flow within Dynamatic #253

Uh oh!

Uh oh!

pcineverdies Oct 23, 2024 Collaborator

Description

Steps

Additional comments

Replies: 25 comments · 3 replies

Uh oh!

Uh oh!

Jiahui17 Oct 23, 2024 Maintainer

Uh oh!

lucas-rami Oct 24, 2024 Maintainer

Uh oh!

pcineverdies Oct 25, 2024 Collaborator Author

Uh oh!

lucas-rami Oct 27, 2024 Maintainer

Uh oh!

Jiahui17 Oct 29, 2024 Maintainer

Uh oh!

rpirayadi Nov 12, 2024 Collaborator

Uh oh!

Uh oh!

pcineverdies Nov 12, 2024 Collaborator Author

Uh oh!

Uh oh!

lucas-rami Nov 12, 2024 Maintainer

Uh oh!

Uh oh!

pcineverdies Nov 13, 2024 Collaborator Author

Uh oh!

lucas-rami Nov 13, 2024 Maintainer

Uh oh!

Uh oh!

pcineverdies Nov 20, 2024 Collaborator Author

Uh oh!

lucas-rami Nov 23, 2024 Maintainer

Uh oh!

pcineverdies Nov 23, 2024 Collaborator Author

Uh oh!

lucas-rami Nov 24, 2024 Maintainer

Uh oh!

pcineverdies Dec 4, 2024 Collaborator Author

Uh oh!

Uh oh!

pcineverdies Dec 4, 2024 Collaborator Author

Uh oh!

lucas-rami Dec 5, 2024 Maintainer

Uh oh!

AyaElAkhras Dec 9, 2024 Maintainer

Uh oh!

Uh oh!

lucas-rami Dec 10, 2024 Maintainer

Uh oh!

pcineverdies Dec 10, 2024 Collaborator Author

Uh oh!

AyaElAkhras Dec 10, 2024 Maintainer

Uh oh!

lucas-rami Dec 10, 2024 Maintainer

Uh oh!

lucas-rami Dec 11, 2024 Maintainer

Uh oh!

AyaElAkhras Dec 13, 2024 Maintainer

Uh oh!

cappellini Feb 9, 2025 Collaborator

Uh oh!

pcineverdies
Oct 23, 2024
Collaborator

Replies: 25 comments 3 replies

Jiahui17
Oct 23, 2024
Maintainer

lucas-rami
Oct 24, 2024
Maintainer

pcineverdies
Oct 25, 2024
Collaborator Author

lucas-rami
Oct 27, 2024
Maintainer

Jiahui17
Oct 29, 2024
Maintainer

rpirayadi
Nov 12, 2024
Collaborator

pcineverdies
Nov 12, 2024
Collaborator Author

lucas-rami
Nov 12, 2024
Maintainer

pcineverdies
Nov 13, 2024
Collaborator Author

lucas-rami
Nov 13, 2024
Maintainer

pcineverdies
Nov 20, 2024
Collaborator Author

lucas-rami
Nov 23, 2024
Maintainer

pcineverdies
Nov 23, 2024
Collaborator Author

lucas-rami
Nov 24, 2024
Maintainer

pcineverdies
Dec 4, 2024
Collaborator Author

pcineverdies
Dec 4, 2024
Collaborator Author

lucas-rami
Dec 5, 2024
Maintainer

AyaElAkhras
Dec 9, 2024
Maintainer

lucas-rami
Dec 10, 2024
Maintainer

pcineverdies
Dec 10, 2024
Collaborator Author

AyaElAkhras
Dec 10, 2024
Maintainer

lucas-rami
Dec 10, 2024
Maintainer

lucas-rami
Dec 11, 2024
Maintainer

AyaElAkhras
Dec 13, 2024
Maintainer

cappellini
Feb 9, 2025
Collaborator