This repository was archived by the owner on Nov 21, 2025. It is now read-only.

Language distribution of training data #29

@rsmlgen

Hi,

May I ask what the language distribution is for the data used to train this model?
It says Emilia-50k was used, but I could not find any such subset of Emilia. Is its distribution expected to be similar to that of Emilia-101k?
