
Will the training code be released later? #19

@jiahe7ay

Thank you very much for making the paper public. I believe dFlash is an outstanding and forward-looking piece of work.

I’m writing to ask if there are plans to release the training code in the future.

My primary reasons for asking are as follows:

1. Community adoption: Once the training code is available, the community can adapt dFlash to more models and datasets, which will further strengthen the dFlash ecosystem.

2. Exploring 3-layer configurations: I am particularly interested in testing the performance of a 3-layer setup. As I mentioned previously, a 5-layer configuration can degrade TTFT and throughput under high concurrency. I would like to train a model with fewer layers myself to optimize for these metrics.

3. Integration with SpecForge: After the official training code is released, we can quickly integrate it into the SpecForge project (part of sgl-project). This would enable dFlash to support training for larger-scale models and make it much easier for users to get started with dFlash training.

Thank you again for your contribution~
