Pipelined GEMM generally has a different code structure from naive GEMM, and that structure depends on one hyperparameter: the number of pipeline stages.
This structure includes a prologue, a main loop, and an epilogue. Pipelined GEMM also uses more shared memory, because each stage needs its own buffers for the A and B tiles, and this limits the tile sizes it can use.
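The three phases and the shared-memory cost can be sketched with a small host-side model. This is plain Python, not GPU code, and it is only an illustration under stated assumptions: the names `load_tile`, `mma`, `pipelined_gemm`, and `smem_bytes` are hypothetical, the "epilogue" is taken to mean draining the tiles that are already loaded, and each stage is assumed to hold one A tile and one B tile. On a real GPU the loads would be asynchronous copies that overlap with the math; here they run in order, so only the control structure is modeled.

```python
def load_tile(A, B, k0, tile_k):
    """Stand-in for a global-to-shared copy: take K range [k0, k0 + tile_k)."""
    a = [row[k0:k0 + tile_k] for row in A]   # M x tile_k slice of A
    b = B[k0:k0 + tile_k]                    # tile_k x N slice of B
    return a, b


def mma(C, a, b):
    """Accumulate C += a @ b for one K tile."""
    for i in range(len(a)):
        for j in range(len(b[0])):
            C[i][j] += sum(a[i][k] * b[k][j] for k in range(len(b)))


def pipelined_gemm(A, B, tile_k=2, stages=3):
    """Compute A @ B tile by tile along K, using `stages` buffer slots."""
    M, K, N = len(A), len(B), len(B[0])
    num_tiles = (K + tile_k - 1) // tile_k
    C = [[0] * N for _ in range(M)]
    buffers = [None] * stages                # ring buffer, one slot per stage
    log = []                                 # records (phase, action, tile)

    # Prologue: prefetch up to stages - 1 tiles before any math starts.
    prefetch = min(stages - 1, num_tiles)
    for t in range(prefetch):
        buffers[t % stages] = load_tile(A, B, t * tile_k, tile_k)
        log.append(("prologue", "load", t))

    # Main loop: issue the load for a later tile, then compute tile t.
    steady = num_tiles - prefetch
    for t in range(steady):
        nxt = t + prefetch
        buffers[nxt % stages] = load_tile(A, B, nxt * tile_k, tile_k)
        log.append(("main", "load", nxt))
        mma(C, *buffers[t % stages])
        log.append(("main", "compute", t))

    # Epilogue (assumed meaning): no loads left, drain the buffered tiles.
    for t in range(steady, num_tiles):
        mma(C, *buffers[t % stages])
        log.append(("epilogue", "compute", t))
    return C, log


def smem_bytes(tile_m, tile_n, tile_k, stages, elem_bytes):
    """Shared memory for the A and B buffers, assuming one pair per stage."""
    return stages * (tile_m * tile_k + tile_k * tile_n) * elem_bytes


if __name__ == "__main__":
    A = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
    B = [[1, 0, 2], [0, 1, 3], [4, 5, 6], [7, 8, 9], [1, 1, 1]]
    C, log = pipelined_gemm(A, B, tile_k=2, stages=3)
    for event in log:
        print(event)
    print(smem_bytes(128, 128, 32, stages=3, elem_bytes=2))  # 49152 bytes
```

With `stages=1` the prologue and epilogue are empty and the loop degenerates to the naive load-then-compute order, which is one way to see why the stage count shapes the code. The `smem_bytes` estimate grows linearly with the stage count, so under a fixed shared-memory budget a deeper pipeline leaves less room for large tiles.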