Thanks for your excellent work on DFlash!
I’d like to compare strategies such as MTP-3 and DFlash, with the goal of identifying the more cost-efficient approach for inference systems under a given SLO constraint.
Additionally, are there any plans to open-source the training code? That would make it easier for others to reproduce the results and explore related directions.