This repository was archived by the owner on Oct 31, 2023. It is now read-only.

Support word analogy score computation for other languages#74

Open
harpaj wants to merge 1 commit into facebookresearch:main from harpaj:wordanalogy_other_languages

harpaj commented Sep 3, 2018

Currently, the word analogy score computation is artificially restricted to English for no apparent reason.

Although the get_evaluation script currently downloads only the English dataset, others are available, for example from fastText for French, Hindi and Polish, and (in a slightly different format) for Finnish.

This PR makes it easier to work with these datasets by detecting if the questions-words.txt file exists.
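To illustrate the idea, here is a minimal sketch of such a check. It is not the code from this PR: the function name `find_analogy_file` and the `<eval_root>/<language>/questions-words.txt` directory layout are assumptions made for the example.

```python
import os

# File name mentioned in this PR; the directory layout below is an assumption.
ANALOGY_FILENAME = "questions-words.txt"


def find_analogy_file(eval_root, language):
    """Return the path to the analogy file for `language`, or None if absent.

    Hypothetical helper: assumes one sub-directory per language code
    under `eval_root`, each optionally containing questions-words.txt.
    """
    path = os.path.join(eval_root, language, ANALOGY_FILENAME)
    # Decide based on the file's presence instead of a hard-coded language list.
    return path if os.path.isfile(path) else None
```

With a check like this, the caller can skip the analogy evaluation whenever the function returns `None`, so any language whose dataset has been placed in the expected directory is picked up without further code changes.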

aconneau commented Sep 3, 2018

Hello, thanks for the PR,

Could you add the download of the fastText French, Hindi and Polish word analogy files to the get_evaluation.sh script (https://github.com/facebookresearch/MUSE/blob/master/data/get_evaluation.sh#L18-L22) so that this fix works for these languages?

Thank you,
Best,
Alexis
