Update ForcedAlignment/evaluate.py #85

@keighrim

Description

The current ForcedAlignment/evaluate.py script is incompatible with the newly proposed gold format (see clamsproject/aapb-annotations#121). It uses brittle text-matching logic that is rendered obsolete by the new process.py in the linked PR.

I propose refactoring evaluate.py to use character offsets for alignment instead of text. The updated script will no longer perform any text normalization or string comparison. Instead, it will leverage the alignment-start and alignment-end (or possibly a different set of column names when the PR is merged) character offsets from the gold .tsv files as the ground truth.
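As a hedged sketch of what reading those offsets could look like (the column names `alignment-start` and `alignment-end` follow the proposed format and may change once the PR is merged; the helper name and inline TSV snippet are hypothetical):

```python
import csv
import io

# Hypothetical gold .tsv snippet; real files would be read from disk.
gold_tsv = "alignment-start\talignment-end\n120\t185\n400\t472\n"

def read_gold_ranges(fileobj):
    """Yield (start, end) character-offset pairs from a gold TSV."""
    for row in csv.DictReader(fileobj, delimiter="\t"):
        yield int(row["alignment-start"]), int(row["alignment-end"])

ranges = list(read_gold_ranges(io.StringIO(gold_tsv)))
```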

The core change will be a complete rewrite of the _read_pred method. To create the sparse hypothesis segments, the script will:

  1. Read the character range (alignment-start, alignment-end) for a segment from the gold .tsv file.
  2. Scan the Token annotations in the prediction MMIF file.
  3. Identify the first and last Token annotations whose own character offsets fall within the gold segment's character range.
  4. Use the timestamps associated with these first and last tokens to define the boundaries of the new hypothesis segment.
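The steps above can be sketched in plain Python, abstracting away the MMIF parsing (the `Token` structure and function name here are hypothetical; a real implementation would pull character offsets and aligned timestamps out of the prediction MMIF's Token annotations):

```python
from typing import NamedTuple, Optional, Sequence, Tuple

class Token(NamedTuple):
    # character offsets into the transcript, plus aligned timestamps (seconds)
    char_start: int
    char_end: int
    time_start: float
    time_end: float

def hypothesis_segment(tokens: Sequence[Token],
                       gold_start: int,
                       gold_end: int) -> Optional[Tuple[float, float]]:
    """Map a gold character range to a hypothesis time segment, bounded by
    the timestamps of the first and last tokens whose own character offsets
    fall entirely within the range."""
    inside = [t for t in tokens
              if t.char_start >= gold_start and t.char_end <= gold_end]
    if not inside:
        return None  # no token overlaps this gold segment
    return inside[0].time_start, inside[-1].time_end
```

This assumes the token list is in reading order, which holds for typical forced-alignment output.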

This approach is significantly more robust and elegant. It completely removes the dependency on the proprietary gold transcript during evaluation and eliminates all fragile text-matching code.


One more thing to consider during the reimplementation is how we use the pyannote.metrics library.

The current implementation "downsamples" the dense (token-level) annotation in the MMIF to match the sparse annotation (every 10 tokens) in the gold data, then uses SegmentationCoverage and SegmentationPurity as the metrics. That seemed correct when I implemented it, but I'm now asking @shel-ho to confirm this usage, or to suggest alternative metrics based on their recent research on the subject.
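For reference, the two metrics reduce to simple interval arithmetic. Here is a pure-Python sketch of the standard definitions (not the pyannote.metrics implementation itself; the function names are hypothetical, and segments are modeled as (start, end) pairs): coverage asks how much of each reference segment is captured by its best-matching hypothesis segment, and purity is the same computation with the roles swapped.

```python
from typing import Sequence, Tuple

Segment = Tuple[float, float]

def _overlap(a: Segment, b: Segment) -> float:
    """Duration of the intersection of two (start, end) segments."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def coverage(reference: Sequence[Segment], hypothesis: Sequence[Segment]) -> float:
    """For each reference segment, credit only its single best-overlapping
    hypothesis segment; normalize by total reference duration."""
    total = sum(end - start for start, end in reference)
    covered = sum(max((_overlap(r, h) for h in hypothesis), default=0.0)
                  for r in reference)
    return covered / total if total else 0.0

def purity(reference: Sequence[Segment], hypothesis: Sequence[Segment]) -> float:
    """Purity is coverage with reference and hypothesis swapped."""
    return coverage(hypothesis, reference)
```

For example, splitting one 10-second reference segment into two 5-second hypothesis segments gives perfect purity (each hypothesis piece lies inside one reference segment) but only 0.5 coverage (no single hypothesis segment covers more than half of the reference), which is the over-segmentation trade-off these metrics are meant to expose.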
