
Few questions #1

@Fhrozen

Thank you very much for your excellent work with this tool.

I am currently trying to use it in ESPnet to generate alignments for training a model with durations, similar to what the MFA tool provides.
However, I am facing some minor issues.

I understand the tool is still under construction, but a little guidance would help me continue with my implementation.

  • An __init__.py needs to be added to the alqalign folder so that alqalign.run can be executed.
  • In run.py, what is step 1 supposed to do? Does it transcribe the audio into words/text or only into phonemes? Should it behave similarly to step 2?
  • When step 1 is used, text is no longer required, right? So the text argument becomes unnecessary? Is it also necessary to move the text processing after step 1 (

    alqalign/alqalign/run.py

    Lines 70 to 92 in 10081d5

    if text_file.is_dir():
        for text_path in sorted(text_file.glob('*')):
            utt_id = text_path.stem
            if utt_id in utt2audio:
                audio_files.append(utt2audio[utt_id])
                text_files.append(text_path)
                output_dirs.append(output_dir / utt_id)
                utt_ids.append(utt_id)
    else:
        for i, line in enumerate(open(text_file, 'r')):
            if text_format == 'kaldi':
                fields = line.strip().split()
                utt_id = fields[0]
                sent = ' '.join(fields[1:])
            else:
                utt_id = str(i)
                sent = line
            if utt_id in utt2audio:
                audio_files.append(utt2audio[utt_id])
                text_files.append(sent)
                output_dirs.append(output_dir / utt_id)
                utt_ids.append(utt_id)
    )?
  • When using an scp file, is it necessary to read the file as plain text?

    alqalign/alqalign/run.py

    Lines 60 to 62 in 10081d5

    for line in open(audio_file):
        utt_id, ark_key = line.strip().split()
        utt2audio[utt_id] = ark_key
    Is it not possible to just use kaldiio.load_scp to load the file?
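
To make the last question concrete, here is a minimal sketch of the two options I have in mind. The function names `read_scp` and `load_scp_with_kaldiio` are hypothetical and not part of alqalign. The first keeps the current text-based approach but splits only on the first whitespace, so entries that contain spaces (for example piped commands) are not broken. The second simply delegates to `kaldiio.load_scp`, which returns a lazy dict-like object; I am assuming the rest of the pipeline could accept loaded data instead of raw scp entries, which may not be the case.

```python
from pathlib import Path
from typing import Dict


def read_scp(scp_path: Path) -> Dict[str, str]:
    """Map each utterance id to its raw scp entry (a path or an ark offset)."""
    utt2audio: Dict[str, str] = {}
    with open(scp_path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # Split only on the first whitespace run so that entries
            # containing spaces (e.g. piped commands) are kept whole.
            utt_id, entry = line.split(maxsplit=1)
            utt2audio[utt_id] = entry
    return utt2audio


def load_scp_with_kaldiio(scp_path: Path):
    """Alternative (assumption): let kaldiio resolve the entries lazily."""
    # Imported here so that read_scp works without kaldiio installed.
    import kaldiio

    return kaldiio.load_scp(str(scp_path))
```

With the first function, `utt2audio` still holds strings, so the existing `if utt_id in utt2audio` checks would keep working unchanged. With the second, the same membership checks should work, but each value would be the loaded audio or matrix rather than a path.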
