context-invariance-paper

Codebase for the paper "Evaluating context-invariance in unsupervised speech representations", published at Interspeech.

Install

Clone this repository.

cd context-invariance-paper
conda env create -f environment.yml
conda activate abx-exp23

This installs everything needed to run Experiments 2 and 3. As outlined below, Experiment 1 is run separately.

Experiment 1

To run Experiment 1 for a given model, go to the Zero Resource Challenge Benchmark Toolkit and follow the instructions for running the abx-LS benchmark. Alternatively, for more granular control, use https://github.com/zerospeech/libri-light-abx2/ directly.

Experiment 2

GENERATING ABX SUBMISSIONS FROM THE TRANSCRIPTION. This repository includes code to generate a 1-hot-encoded ABX submission from the transcription. It can also generate several 1-hot-encoded submissions with deliberately added errors, in which the phoneme boundaries are occasionally shifted. To generate these submissions, run:

conda activate abx-exp23
python experiment2/gen_transcription_submission.py [output_path]
python experiment2/gen_error_submissions.py [output_path]
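To illustrate the idea behind these two scripts, here is a minimal sketch of 1-hot encoding a phoneme alignment at a 10 ms frame step and of jittering the phoneme boundaries. It is not the repository's implementation: the function names, the (onset, offset, phone) segment format, and the uniform jitter are all assumptions made for illustration.

```python
import numpy as np

def one_hot_frames(segments, phones, n_frames, frame_s=0.01):
    """Build an (n_frames, n_phones) 1-hot matrix from (onset, offset, phone) segments.

    Hypothetical helper: the segment format and frame step are assumptions.
    """
    index = {p: i for i, p in enumerate(phones)}
    feats = np.zeros((n_frames, len(phones)), dtype=np.float32)
    for onset, offset, phone in segments:
        start = int(round(onset / frame_s))
        end = min(int(round(offset / frame_s)), n_frames)
        feats[start:end, index[phone]] = 1.0
    return feats

def shift_boundaries(segments, max_shift_s, rng):
    """Jitter each internal boundary by up to max_shift_s seconds.

    Adjacent segments keep sharing the shifted boundary, so every frame
    still receives exactly one label. The uniform jitter is an assumption.
    """
    segments = [list(s) for s in segments]
    for i in range(len(segments) - 1):
        lower = segments[i][0]       # boundary may not pass the segment's onset
        upper = segments[i + 1][1]   # nor the next segment's offset
        boundary = segments[i][1] + rng.uniform(-max_shift_s, max_shift_s)
        boundary = min(max(boundary, lower), upper)
        segments[i][1] = boundary
        segments[i + 1][0] = boundary
    return [tuple(s) for s in segments]

# Usage: a clean encoding and an error-injected one for the same utterance.
alignment = [(0.0, 0.03, "a"), (0.03, 0.05, "b")]
clean = one_hot_frames(alignment, ["a", "b"], n_frames=5)
noisy = one_hot_frames(
    shift_boundaries(alignment, 0.02, np.random.default_rng(0)), ["a", "b"], n_frames=5
)
```

Because a shifted boundary only relabels the frames next to it, the noisy matrix differs from the clean one only around phoneme transitions.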

RUNNING THE COMPARISON. Once you have the 1-hot-encoded submissions, or if you want to test another submission, run https://github.com/zerospeech/libri-light-abx2/ with the option --pooling hamming and compare the result with the default ABX score (i.e. --pooling none). For these submissions, use one of the clean subsets and set the following options:

--feature_size 0.01
--speaker_mode within
--context_mode all

This will compute the score for both the within-context and without-context conditions.
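As a rough intuition for the pooled condition, the sketch below collapses the frames of one item into a single vector using Hamming-window weights. It assumes that --pooling hamming denotes a Hamming-weighted average over an item's frames. That reading is not confirmed here, so check the libri-light-abx2 source for the exact definition. The function name is hypothetical.

```python
import numpy as np

def hamming_pool(frames):
    """Collapse a (T, D) item into one D-dimensional vector.

    Assumption: frames are averaged with Hamming-window weights, so frames
    near the centre of the item count more than frames near its edges.
    """
    frames = np.asarray(frames, dtype=np.float64)
    weights = np.hamming(frames.shape[0])
    return (weights[:, None] * frames).sum(axis=0) / weights.sum()

# Usage: a 5-frame 1-hot item whose edge frames carry a neighbouring phoneme.
item = np.array([[1, 0], [0, 1], [0, 1], [0, 1], [1, 0]])
pooled = hamming_pool(item)
```

Under this assumption, mislabelled frames at the edges of an item contribute little to the pooled vector, whereas with --pooling none every frame enters the frame-by-frame comparison.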

Experiment 3

GENERATING SUBMISSIONS WITH A BLURRING FILTER APPLIED. To generate a blurred submission from a given submission, run

conda activate abx-exp23
python experiment3/convolution_submission_gen/convolution_submission_gen.py -h

and follow the instructions. For the results in the paper, run with --convolution_type running_mean and --window_s_running_mean 3, then repeat with --window_s_running_mean set to 5 and to 7.
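For readers unfamiliar with the running-mean blur, the following sketch smooths a feature matrix along the time axis with a centred moving average. It is an illustration only: it assumes the window is counted in frames and that edges are zero-padded, neither of which is confirmed for convolution_submission_gen.py, and the function name is hypothetical.

```python
import numpy as np

def running_mean(features, window):
    """Smooth a (T, D) feature matrix along time with a centred moving average.

    Assumptions: `window` is a number of frames with window <= T, and edges
    are zero-padded (mode="same"), so the first and last frames are attenuated.
    """
    features = np.asarray(features, dtype=np.float64)
    kernel = np.ones(window) / window
    columns = [
        np.convolve(features[:, d], kernel, mode="same")
        for d in range(features.shape[1])
    ]
    return np.stack(columns, axis=1)

# Usage: blur the same features with the three window sizes mentioned above.
feats = np.random.default_rng(0).normal(size=(100, 8))
blurred = {w: running_mean(feats, w) for w in (3, 5, 7)}
```

Larger windows mix each frame with more of its neighbours, which is what makes the resulting representations progressively more context-dependent.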

EXTRACTING WORD-LEVEL FEATURES FOR EVALUATION. If you already have word-level representations saved, skip to RUNNING THE EVALUATION ON THE SUBMISSIONS below. Otherwise, run

map_feature_extractor.py [submission_path] [output_path] [item_file_path]

The item file used for the results in the paper is provided with this repository (words_split_nohapax_dev-clean). The submission must be in Zerospeech2021 format; see Zerospeech Benchmarks.

RUNNING THE EVALUATION ON THE SUBMISSIONS. Once you have the submissions from above, run

python experiment3/mapcode/compute_map_from_dir.py [words feature dir] [output path]
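As background on what a MAP (mean average precision) score over word-level features measures, here is a minimal sketch. It assumes one embedding per word token, cosine similarity for ranking, and tokens of the same word type as the relevant items. compute_map_from_dir.py may differ in any of these choices, and the function name is hypothetical.

```python
import numpy as np

def mean_average_precision(embeddings, labels):
    """Mean average precision of same-word retrieval.

    Assumptions: cosine similarity, non-zero embeddings, and each token used
    once as a query against all other tokens. Queries with no other token of
    the same word (hapaxes) are skipped.
    """
    x = np.asarray(embeddings, dtype=np.float64)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    sims = x @ x.T
    labels = np.asarray(labels)
    average_precisions = []
    for q in range(len(labels)):
        others = np.delete(np.arange(len(labels)), q)
        order = others[np.argsort(-sims[q, others], kind="stable")]
        relevant = labels[order] == labels[q]
        if not relevant.any():
            continue
        hits = np.cumsum(relevant)                # relevant items seen so far
        ranks = np.arange(1, len(order) + 1)
        average_precisions.append((hits[relevant] / ranks[relevant]).mean())
    return float(np.mean(average_precisions))

# Usage: two tokens each of two words, embedded close to their own word type.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
score = mean_average_precision(vectors, ["cat", "cat", "dog", "dog"])
```

A score near 1 means tokens of the same word are ranked ahead of all other tokens. Blurring the underlying frame features would be expected to change this score only insofar as it changes the word-level embeddings.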
