FIBER

Paper: FIBER: A Multilingual Evaluation Resource for Factual Inference Bias

Overview

FIBER: Factual Inference Bias Evaluation Resource

FIBER is a high-quality dataset designed to evaluate factual inference bias in language models across three languages: English (en), Italian (it), and Turkish (tr).

It contains both single-entity and multi-entity question–answer sets derived from structured world knowledge domains (e.g., capitals, car brands, time zones, official languages).

Methodology

For each prompt, the candidate answers form two sets: a surface set (all possible answers) and a gold set (the true answers, a subset of the surface set). For each candidate in the surface set, the model's token-level log probabilities are computed sequentially: the model predicts each token's probability given the preceding tokens, and the resulting log probabilities are summed to produce an overall score for that candidate. After all candidates are scored, they are ranked by total score, and Average Precision (AP) is calculated from this ranking. This procedure is repeated for every subject, topic, query type, and query target in the dataset.
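The ranking and AP step can be sketched in a few lines of Python. This is a minimal illustration, not the repository's test script: the function names, the example candidates, and the log-probability values are all hypothetical, and the AP normalization (dividing by the size of the gold set) is the standard definition, assumed here rather than confirmed by the source.

```python
from typing import Dict, List, Set


def candidate_score(token_logprobs: List[float]) -> float:
    """Sum the per-token log probabilities of one candidate answer."""
    return sum(token_logprobs)


def average_precision(scores: Dict[str, float], gold: Set[str]) -> float:
    """Average Precision of the gold answers within the score-ranked candidates."""
    # Rank candidates from highest to lowest total log probability.
    ranked = sorted(scores, key=scores.get, reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in gold:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # Assumption: normalize by the number of gold answers.
    return precision_sum / len(gold) if gold else 0.0


# Hypothetical per-token log probabilities for a prompt about
# the official languages of Belgium (values invented for illustration).
token_logprobs = {
    "Dutch": [-0.4, -0.3],
    "Spanish": [-0.5, -0.5],
    "French": [-0.8, -0.4],
    "German": [-2.5, -1.5],
}
scores = {name: candidate_score(lps) for name, lps in token_logprobs.items()}
gold = {"Dutch", "French", "German"}

ap = average_precision(scores, gold)
print(round(ap, 4))
```

In this toy ranking, a non-gold candidate ("Spanish") outranks two gold answers, which lowers AP below 1. In an actual run, the per-token values would come from the model under test rather than from a hand-written dictionary.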

Instructions on Running Scripts

Step 1: Download Dependencies

The following Python libraries are required for evaluation:

torch
transformers
huggingface-hub

To install them automatically, run the following command in your terminal: make requirements

Step 2: Fill the Config Information

Before testing a model, provide the following values in config.json:

Hugging Face token (hugging_face_token)
Hugging Face ID of the model you want to test (model_id)
Input directory (dataset_dir) (leave it as dataset if you have not changed the file structure)
Output directory (results_dir) (results/<MODEL_NAME> is suggested)

Example entries are provided below:

hugging_face_token : YOUR_HUGGING_FACE_TOKEN
model_id : google/gemma-3-27b-it
dataset_dir : dataset
output_dir : results/gemma-3-27b

Do not end paths with a forward slash: write dataset and results/gemma-3-27b, not dataset/ or results/gemma-3-27b/.
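A quick pre-flight check of config.json could look like the sketch below. It is not part of the repository: `load_config` is a hypothetical helper, and it assumes config.json is a flat JSON object with the keys listed above. Because the output directory is named results_dir in the list and output_dir in the example entries, the sketch accepts either name.

```python
import json
from pathlib import Path


def load_config(path: str = "config.json") -> dict:
    """Read config.json, check required values, and normalize the two paths."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))

    # Assumption: the output directory may be stored under either key name.
    output_dir = config.get("results_dir", config.get("output_dir"))

    values = {
        "hugging_face_token": config.get("hugging_face_token"),
        "model_id": config.get("model_id"),
        "dataset_dir": config.get("dataset_dir"),
        "output_dir": output_dir,
    }
    missing = [key for key, value in values.items() if not value]
    if missing:
        raise ValueError(f"config.json is missing: {', '.join(missing)}")

    # Paths must not end with a forward slash, so strip any that slipped in.
    for key in ("dataset_dir", "output_dir"):
        values[key] = values[key].rstrip("/")
    return values
```

Calling `load_config()` from the repository root before `make run` would surface a missing token or model ID early, instead of partway through an evaluation.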

Step 3: Run

Before running the script, make sure that you have downloaded dependencies and filled out the config information. To start evaluation, run the following command in your terminal: make run

File Naming Convention

Each file name follows the structure `<topic>_<language>_<index>.json`:

Component	Example	Meaning
`<topic>`	`countries_official_languages`	Knowledge domain
`<language>`	`en`, `tr`, `it`	Dataset language
`<index>`	`0`, `1`, `2`, `3`, or `0_0`, `0_1`	Split or subset index

For Turkish (_tr_) files, index 0 is split into two subsets (_0_0 and _0_1) for grammatical reasons.
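As an illustration of the convention, the following sketch splits a file name into its three components. The `DatasetFile` class and `parse_filename` function are hypothetical names introduced here. The regular expression assumes only the three language codes above and an index of the form `N` or `N_M`.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetFile:
    topic: str
    language: str
    index: str


# <topic>_<language>_<index>.json, where <index> is "N" or "N_M".
_PATTERN = re.compile(
    r"^(?P<topic>.+)_(?P<language>en|it|tr)_(?P<index>\d+(?:_\d+)?)\.json$"
)


def parse_filename(name: str) -> DatasetFile:
    """Split a dataset file name into topic, language, and index."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognized dataset file name: {name!r}")
    return DatasetFile(**match.groupdict())


print(parse_filename("countries_official_languages_tr_0_1.json"))
print(parse_filename("ccTLD_en_0.json"))
```

The index is kept as a string so that subset indices such as `0_1` survive unchanged.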

Dataset Structure

The directory tree of the dataset is as follows.

.
├── multi_entity # Queries with multiple correct answers.
│   ├── car_brands
│   │   ├── car_brands_en_0.json
│   │   ├── car_brands_en_1.json
│   │   ├── car_brands_en_2.json
│   │   ├── car_brands_en_3.json
│   │   ├── car_brands_it_0.json
│   │   ├── car_brands_it_1.json
│   │   ├── car_brands_it_2.json
│   │   ├── car_brands_it_3.json
│   │   ├── car_brands_tr_0_0.json
│   │   ├── car_brands_tr_0_1.json
│   │   ├── car_brands_tr_1.json
│   │   ├── car_brands_tr_2.json
│   │   └── car_brands_tr_3.json
│   ├── countries_heritages
│   │   ├── countries_heritages_en_0.json
│   │   ├── countries_heritages_en_1.json
│   │   ├── countries_heritages_en_2.json
│   │   ├── countries_heritages_en_3.json
│   │   ├── countries_heritages_it_0.json
│   │   ├── countries_heritages_it_1.json
│   │   ├── countries_heritages_it_2.json
│   │   ├── countries_heritages_it_3.json
│   │   ├── countries_heritages_tr_0_0.json
│   │   ├── countries_heritages_tr_0_1.json
│   │   ├── countries_heritages_tr_1.json
│   │   ├── countries_heritages_tr_2.json
│   │   └── countries_heritages_tr_3.json
│   ├── countries_neighbors
│   │   ├── countries_neighbors_en_0.json
│   │   ├── countries_neighbors_en_1.json
│   │   ├── countries_neighbors_en_2.json
│   │   ├── countries_neighbors_en_3.json
│   │   ├── countries_neighbors_it_0.json
│   │   ├── countries_neighbors_it_1.json
│   │   ├── countries_neighbors_it_2.json
│   │   ├── countries_neighbors_it_3.json
│   │   ├── countries_neighbors_tr_0_0.json
│   │   ├── countries_neighbors_tr_0_1.json
│   │   ├── countries_neighbors_tr_1.json
│   │   ├── countries_neighbors_tr_2.json
│   │   └── countries_neighbors_tr_3.json
│   ├── countries_official_languages
│   │   ├── countries_official_languages_en_0.json
│   │   ├── countries_official_languages_en_1.json
│   │   ├── countries_official_languages_en_2.json
│   │   ├── countries_official_languages_en_3.json
│   │   ├── countries_official_languages_it_0.json
│   │   ├── countries_official_languages_it_1.json
│   │   ├── countries_official_languages_it_2.json
│   │   ├── countries_official_languages_it_3.json
│   │   ├── countries_official_languages_tr_0_0.json
│   │   ├── countries_official_languages_tr_0_1.json
│   │   ├── countries_official_languages_tr_1.json
│   │   ├── countries_official_languages_tr_2.json
│   │   └── countries_official_languages_tr_3.json
│   ├── countries_timezones
│   │   ├── countries_timezones_en_0.json
│   │   ├── countries_timezones_en_1.json
│   │   ├── countries_timezones_en_2.json
│   │   ├── countries_timezones_en_3.json
│   │   ├── countries_timezones_it_0.json
│   │   ├── countries_timezones_it_1.json
│   │   ├── countries_timezones_it_2.json
│   │   ├── countries_timezones_it_3.json
│   │   ├── countries_timezones_tr_0_0.json
│   │   ├── countries_timezones_tr_0_1.json
│   │   ├── countries_timezones_tr_1.json
│   │   ├── countries_timezones_tr_2.json
│   │   └── countries_timezones_tr_3.json
│   ├── mobile_network_operators
│   │   ├── mobile_network_operators_en_0.json
│   │   ├── mobile_network_operators_en_1.json
│   │   ├── mobile_network_operators_en_2.json
│   │   ├── mobile_network_operators_en_3.json
│   │   ├── mobile_network_operators_it_0.json
│   │   ├── mobile_network_operators_it_1.json
│   │   ├── mobile_network_operators_it_2.json
│   │   ├── mobile_network_operators_it_3.json
│   │   ├── mobile_network_operators_tr_0_0.json
│   │   ├── mobile_network_operators_tr_0_1.json
│   │   ├── mobile_network_operators_tr_1.json
│   │   ├── mobile_network_operators_tr_2.json
│   │   └── mobile_network_operators_tr_3.json
│   ├── polyglot_celebs
│   │   ├── polyglot_celebs_en_0.json
│   │   ├── polyglot_celebs_en_1.json
│   │   ├── polyglot_celebs_en_2.json
│   │   ├── polyglot_celebs_en_3.json
│   │   ├── polyglot_celebs_it_0.json
│   │   ├── polyglot_celebs_it_1.json
│   │   ├── polyglot_celebs_it_2.json
│   │   ├── polyglot_celebs_it_3.json
│   │   ├── polyglot_celebs_tr_0_0.json
│   │   ├── polyglot_celebs_tr_0_1.json
│   │   ├── polyglot_celebs_tr_1.json
│   │   ├── polyglot_celebs_tr_2.json
│   │   └── polyglot_celebs_tr_3.json
│   └── top_500_universities
│       ├── top_500_universities_en_0.json
│       ├── top_500_universities_en_1.json
│       ├── top_500_universities_en_2.json
│       ├── top_500_universities_en_3.json
│       ├── top_500_universities_it_0.json
│       ├── top_500_universities_it_1.json
│       ├── top_500_universities_it_2.json
│       ├── top_500_universities_it_3.json
│       ├── top_500_universities_tr_0_0.json
│       ├── top_500_universities_tr_0_1.json
│       ├── top_500_universities_tr_1.json
│       ├── top_500_universities_tr_2.json
│       └── top_500_universities_tr_3.json
└── single_entity # Queries with a single correct answer.
    ├── atomic_numbers
    │   ├── atomic_numbers_en_0.json
    │   ├── atomic_numbers_en_1.json
    │   ├── atomic_numbers_it_0.json
    │   ├── atomic_numbers_it_1.json
    │   ├── atomic_numbers_tr_0_0.json
    │   ├── atomic_numbers_tr_0_1.json
    │   └── atomic_numbers_tr_1.json
    ├── capital_cities
    │   ├── capital_cities_en_0.json
    │   ├── capital_cities_en_1.json
    │   ├── capital_cities_it_0.json
    │   ├── capital_cities_it_1.json
    │   ├── capital_cities_tr_0_0.json
    │   ├── capital_cities_tr_0_1.json
    │   └── capital_cities_tr_1.json
    ├── ccTLD
    │   ├── ccTLD_en_0.json
    │   ├── ccTLD_en_1.json
    │   ├── ccTLD_it_0.json
    │   ├── ccTLD_it_1.json
    │   ├── ccTLD_tr_0_0.json
    │   ├── ccTLD_tr_0_1.json
    │   └── ccTLD_tr_1.json
    ├── chemical_symbols
    │   ├── chemical_symbols_en_0.json
    │   ├── chemical_symbols_en_1.json
    │   ├── chemical_symbols_it_0.json
    │   ├── chemical_symbols_it_1.json
    │   ├── chemical_symbols_tr_0_0.json
    │   ├── chemical_symbols_tr_0_1.json
    │   └── chemical_symbols_tr_1.json
    ├── founding_locations
    │   ├── founding_locations_en_0.json
    │   ├── founding_locations_en_1.json
    │   ├── founding_locations_it_0.json
    │   ├── founding_locations_it_1.json
    │   ├── founding_locations_tr_0_0.json
    │   ├── founding_locations_tr_0_1.json
    │   └── founding_locations_tr_1.json
    ├── locations_of_sites
    │   ├── locations_of_sites_en_0.json
    │   ├── locations_of_sites_en_1.json
    │   ├── locations_of_sites_it_0.json
    │   ├── locations_of_sites_it_1.json
    │   ├── locations_of_sites_tr_0_0.json
    │   ├── locations_of_sites_tr_0_1.json
    │   └── locations_of_sites_tr_1.json
    ├── original_langs_of_books
    │   ├── original_langs_of_books_en_0.json
    │   ├── original_langs_of_books_en_1.json
    │   ├── original_langs_of_books_it_0.json
    │   ├── original_langs_of_books_it_1.json
    │   ├── original_langs_of_books_tr_0_0.json
    │   ├── original_langs_of_books_tr_0_1.json
    │   └── original_langs_of_books_tr_1.json
    └── product_maker
        ├── product_maker_en_0.json
        ├── product_maker_en_1.json
        ├── product_maker_it_0.json
        ├── product_maker_it_1.json
        ├── product_maker_tr_0_0.json
        ├── product_maker_tr_0_1.json
        └── product_maker_tr_1.json
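The tree above is regular enough to generate. Each multi_entity topic has indices 0 to 3 and each single_entity topic has indices 0 and 1, with Turkish index 0 split in two. The sketch below reproduces the expected file names for one topic. `expected_files` is a hypothetical helper, not part of the repository, and is only meant for sanity-checking a local copy of the dataset.

```python
from typing import List

LANGUAGES = ("en", "it", "tr")


def expected_files(topic: str, n_indices: int) -> List[str]:
    """List the file names the tree above shows for one topic.

    n_indices is 4 for multi_entity topics and 2 for single_entity topics.
    """
    names: List[str] = []
    for language in LANGUAGES:
        for index in range(n_indices):
            if language == "tr" and index == 0:
                # Turkish splits index 0 into the subsets 0_0 and 0_1.
                names += [f"{topic}_tr_0_0.json", f"{topic}_tr_0_1.json"]
            else:
                names.append(f"{topic}_{language}_{index}.json")
    return names


multi = expected_files("car_brands", 4)
single = expected_files("capital_cities", 2)
print(len(multi), len(single))
```

Comparing this list with the contents of a topic directory (for example via `os.listdir`) is one way to spot a missing or misnamed file before running the evaluation.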

Citation

If you use FIBER in your research, please cite:

@misc{munis2025fibermultilingualevaluationresource,
  title        = {FIBER: A Multilingual Evaluation Resource for Factual Inference Bias},
  author       = {Evren Ayberk Munis and Deniz Yılmaz and Arianna Muti and Çağrı Toraman},
  year         = {2025},
  eprint       = {2512.11110},
  archivePrefix= {arXiv},
  primaryClass = {cs.CL},
  url          = {https://arxiv.org/abs/2512.11110},
}
