Thank you very much for making the paper public. I believe dFlash is an outstanding and forward-looking piece of work.
I'm writing to ask whether there are plans to release the training code in the future.
My primary reasons for asking are as follows:
1. Strengthening the Ecosystem: Once the training code is available, the community can adapt dFlash to more models and datasets, which will further strengthen the dFlash ecosystem.
2. Exploring 3-Layer Configurations: I am particularly interested in testing the performance of a 3-layer setup. As I mentioned previously, a 5-layer configuration can impact TTFT (time to first token) and throughput under high concurrency. I would like to train a model with fewer layers myself to optimize for these metrics.
3. Integration with SpecForge: Once the official training code is released, we can quickly integrate it into the SpecForge project (part of sgl-project). This would enable dFlash to support training for larger-scale models and make it much easier for users to get started with dFlash training.
Thank you again for your contribution!