Content of fine-tuning files?

I find the language a little bit unclear when describing the contents of the text and audio files used for fine-tuning.

First off, what are the acceptable contents of an utterance file? Total silence? Only one phoneme? Only one word? A whole sentence?

Secondly, should the comma-separated list in the text file have timestamps? Or is it just a chronological list of the phonemes in the associated wave file? Should this list contain duplicates if the same phoneme shows up several times in the wave file? CAN a wave file contain the same phoneme several times?

Any answers appreciated!

Content of fine-tuning files? #74
