This repository was archived by the owner on Oct 31, 2023. It is now read-only.

Could you provide any instructions to preprocess the dataset? #19

@tangyuelm


Hello,

I am new to the CPC method and would like to learn from your excellent code. However, I am still confused about how to preprocess the dataset. I downloaded the LibriSpeech train-clean-100 subset from the website, but I do not know how to arrange the files into the layout below. It seems that this dataset only has training samples without labels. I am also not sure how to use the training/validation sequence lists and the Train / Val splits. Are there any detailed instructions?
PATH_AUDIO_FILES
│
└───speaker1
│   └───...
│         │   seq_11.{$EXTENSION}
│         │   seq_12.{$EXTENSION}
│         │   ...
│
└───speaker2
    └───...
          │   seq_21.{$EXTENSION}
          │   seq_22.{$EXTENSION}
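
For reference, LibriSpeech train-clean-100 unpacks as speaker/chapter/utterance.flac, which already has the speaker1/.../seq_11 shape shown above. The following is only a minimal sketch of how such a tree could be scanned and split into training and validation sequence lists. The function names (`find_sequences`, `split_sequences`, `write_list`) are hypothetical and not part of the repository. The list format used here (one sequence name per line, without extension) is an assumption that this thread does not confirm.

```python
import random
from pathlib import Path


def find_sequences(root, extension=".flac"):
    """Map each top-level speaker directory to the sorted stems of its audio files."""
    root = Path(root)
    speakers = {}
    for speaker_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # rglob descends through any intermediate folders (the "..." level in the tree)
        stems = sorted(f.stem for f in speaker_dir.rglob(f"*{extension}"))
        if stems:
            speakers[speaker_dir.name] = stems
    return speakers


def split_sequences(speakers, val_fraction=0.1, seed=0):
    """Shuffle all sequence names reproducibly and return (train, val) lists."""
    names = sorted(s for stems in speakers.values() for s in stems)
    random.Random(seed).shuffle(names)
    n_val = int(len(names) * val_fraction)
    return sorted(names[n_val:]), sorted(names[:n_val])


def write_list(names, path):
    """Write one sequence name per line (assumed list format, see the note above)."""
    Path(path).write_text("\n".join(names) + "\n")


# Example usage (the paths are placeholders):
#   speakers = find_sequences("LibriSpeech/train-clean-100")
#   train, val = split_sequences(speakers, val_fraction=0.1)
#   write_list(train, "train_split.txt")
#   write_list(val, "val_split.txt")
```

This sketch splits at the sequence level, so a speaker can appear in both lists. Whether the repository expects that, or a speaker-level split, should be checked against its own documentation.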
