Large Language Models (LLMs) often exhibit inconsistent robustness to input perturbations and demonstrate uneven proficiency across knowledge domains. While prior research has independently explored robustness and model understanding, this project investigates their intersection.
Core Research Question: Is a model’s lack of robustness to input variations indicative of a deeper lack of knowledge in the associated skill domain?
LLMs, while powerful, can be brittle—struggling with slightly reworded questions or specific domains. Understanding whether this brittleness is a symptom of shallow understanding can guide both evaluation practices and model improvement. By exploring the correlation between perturbation sensitivity and conceptual weakness, this research aims to:
- Illuminate blind spots in LLM capabilities.
- Improve evaluation strategies by combining robustness and knowledge profiling.

The project combines two existing resources:

- **DOVE**: A dataset of prompt perturbations derived from multiple evaluation benchmarks. For each original question, DOVE provides numerous rephrased variants and corresponding model outputs.
- **EvalTree (on MMLU)**: A hierarchical evaluation framework that categorizes benchmark questions by skill type. EvalTree provides fine-grained performance metrics per skill and can generate new questions in a structured way.
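
How DOVE is stored varies by release, so the dataclass below is only a minimal sketch of the per-question structure this project assumes after preprocessing; the field names are illustrative, not official DOVE column names.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DoveRecord:
    """One original benchmark question plus its DOVE perturbations.

    Field names are this project's assumptions, not DOVE's official schema.
    """
    question_id: str                 # identifier of the original MMLU question
    original_question: str           # unperturbed question text
    variants: List[str] = field(default_factory=list)          # rephrased prompts
    variant_correct: List[bool] = field(default_factory=list)  # model success per variant
```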

- **Primary model**: LLaMA3, chosen due to the availability of an existing EvalTree built on MMLU.
- **Alternative model**: OLMoE, which may be considered for deeper trace-based analysis if needed, using tools such as OLMoE Trace.

- **Robustness Score**: Measured as the model's success rate over DOVE-generated question perturbations.
- **Skill Accuracy**: Taken from EvalTree's category-wise evaluation of LLaMA3 on MMLU.

- **Robustness Assessment**: Compare performance on original and perturbed questions from DOVE (a scoring sketch follows this list).
- **Category Proficiency Analysis**: Correlate robustness scores with EvalTree's domain-wise accuracy ratings.
- **Generalization Testing (optional)**: Use EvalTree to generate new questions and measure the model's ability to generalize within high- and low-robustness domains.
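
As a concrete reference for the Robustness Score and the Robustness Assessment step, here is a minimal per-question scoring sketch. It assumes DOVE data has already been parsed into records with `question_id` and `variant_correct` fields (as in the `DoveRecord` sketch above); the helper names are ours, not part of DOVE or EvalTree.

```python
from typing import Dict, Iterable

def robustness_score(correct_flags: Iterable[bool]) -> float:
    """Success rate of the model over all DOVE perturbations of one question."""
    flags = list(correct_flags)
    if not flags:
        raise ValueError("No perturbations available for this question.")
    return sum(flags) / len(flags)

def robustness_by_question(records) -> Dict[str, float]:
    """Map each original question id to its robustness score."""
    return {r.question_id: robustness_score(r.variant_correct) for r in records}
```

A score of 1.0 means the model answered every rephrasing of a question correctly; scores near the random-guess baseline signal brittleness on that question.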
- Select questions from MMLU that overlap with DOVE.
- Calculate robustness scores for each.
- Compare to category accuracy in EvalTree.
- Analyze trends or correlations between robustness and domain proficiency.
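
A sketch of the correlation step, assuming per-question robustness scores (from the sketch above), a question-to-category mapping derived from EvalTree, and EvalTree's per-category accuracy are available as plain dictionaries. `scipy` is an extra assumption here and would need to be added to `requirements.txt` if it is not already listed.

```python
from collections import defaultdict
from typing import Dict, Mapping

from scipy.stats import spearmanr  # assumed dependency, not guaranteed by this repo

def robustness_by_category(
    question_robustness: Mapping[str, float],
    question_to_category: Mapping[str, str],
) -> Dict[str, float]:
    """Average per-question robustness within each EvalTree category."""
    totals, counts = defaultdict(float), defaultdict(int)
    for qid, score in question_robustness.items():
        category = question_to_category.get(qid)
        if category is None:   # question not covered by the EvalTree mapping
            continue
        totals[category] += score
        counts[category] += 1
    return {category: totals[category] / counts[category] for category in totals}

def robustness_accuracy_correlation(
    category_robustness: Mapping[str, float],
    category_accuracy: Mapping[str, float],
) -> float:
    """Spearman rank correlation between robustness and accuracy across categories."""
    shared = sorted(set(category_robustness) & set(category_accuracy))
    rho, _ = spearmanr(
        [category_robustness[c] for c in shared],
        [category_accuracy[c] for c in shared],
    )
    return rho
```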

- Use EvalTree's generation methods to create new MCQs:
  - From high-robustness categories.
  - From low-robustness categories.
- Evaluate LLaMA3's performance on these to test within-domain generalization.
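
One way to summarize this comparison is sketched below: categories are split by a robustness threshold and accuracy on the newly generated MCQs is compared between the two groups. The 0.8 cutoff and the input formats are our assumptions, not something DOVE or EvalTree prescribes.

```python
from statistics import mean
from typing import List, Mapping, Tuple

def split_categories_by_robustness(
    category_robustness: Mapping[str, float],
    threshold: float = 0.8,   # assumed cutoff; tune to the observed score distribution
) -> Tuple[List[str], List[str]]:
    """Partition EvalTree categories into high- and low-robustness groups."""
    high = [c for c, r in category_robustness.items() if r >= threshold]
    low = [c for c, r in category_robustness.items() if r < threshold]
    return high, low

def generalization_gap(
    generated_mcq_results: Mapping[str, List[bool]],  # category -> correctness per new MCQ
    high: List[str],
    low: List[str],
) -> float:
    """Accuracy on new MCQs from high-robustness categories minus low-robustness ones."""
    def group_accuracy(categories: List[str]) -> float:
        flags = [f for c in categories for f in generated_mcq_results.get(c, [])]
        return mean(flags) if flags else float("nan")
    return group_accuracy(high) - group_accuracy(low)
```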
- Construct a modified EvalTree where each node represents average robustness rather than accuracy.
- Analyze this “robustness tree” to uncover latent structural weaknesses or inconsistencies across domains.
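
The robustness tree can be built by replacing per-node accuracy with average robustness. The sketch below assumes a simplified nested-dict view of the hierarchy (internal nodes map child names to subtrees, leaves hold question ids); the actual EvalTree data structures differ, so treat this as illustrative rather than the project's implementation.

```python
from statistics import mean
from typing import Dict, List, Mapping, Union

# Assumed tree shape: internal nodes map child names to subtrees,
# leaf nodes hold the list of question ids in that skill category.
TreeNode = Union[Dict[str, "TreeNode"], List[str]]

def robustness_tree(node: TreeNode, question_robustness: Mapping[str, float]) -> Dict:
    """Annotate every node with the mean robustness of the questions beneath it."""
    if isinstance(node, list):   # leaf: list of question ids
        scores = [question_robustness[q] for q in node if q in question_robustness]
        return {"robustness": mean(scores) if scores else None, "questions": node}
    children = {name: robustness_tree(child, question_robustness) for name, child in node.items()}
    child_scores = [c["robustness"] for c in children.values() if c["robustness"] is not None]
    return {"robustness": mean(child_scores) if child_scores else None, "children": children}
```

Averaging child scores weights every subtree equally; weighting by question count is an equally reasonable choice if categories differ a lot in size.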
Repository layout:

```
.
├── Data/
│   ├── dove/                    # Raw or processed DOVE-related data
│   └── evaltree/                # Data related to EvalTree
├── ExtractDove/
│   ├── extract_dove.py          # Scripts to extract or preprocess DOVE data
│   └── ...                      # Additional scripts or configs
├── ExtractQuestion/
│   ├── extract_questions.py     # Scripts to extract/generate questions
│   └── ...                      # Additional scripts or configs
├── EvalTree/                    # Forked EvalTree project (converted to a normal folder)
│   ├── src/                     # (Example) source code from EvalTree
│   └── ...                      # Other contents from the EvalTree repo
├── README.md
└── requirements.txt
```

- Clone this repo.
- Set up the environment: `pip install -r requirements.txt`
- Run the analysis scripts in `/analysis` to reproduce the results from Phase 1 and Phase 2.
- All code assumes access to the LLaMA3 model with EvalTree support. You may substitute with OLMoE where traceability or tooling support is stronger.
- The DOVE dataset must be preprocessed to align its questions with EvalTree skill mappings (see the sketch below).
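
Because DOVE and EvalTree index MMLU questions differently, this alignment can be done by normalizing the question text on both sides and matching DOVE questions against EvalTree's question-to-skill mapping. The sketch below assumes simple exact matching after normalization; the input formats are our assumptions.

```python
import re
from typing import Dict, Mapping, Optional

def normalize(text: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace before matching."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return re.sub(r"\s+", " ", text).strip()

def align_dove_to_evaltree(
    dove_questions: Mapping[str, str],    # DOVE question id -> original question text
    evaltree_skills: Mapping[str, str],   # MMLU question text -> EvalTree skill/category
) -> Dict[str, Optional[str]]:
    """Map each DOVE question id to an EvalTree category (None when no match is found)."""
    lookup = {normalize(q): skill for q, skill in evaltree_skills.items()}
    return {qid: lookup.get(normalize(text)) for qid, text in dove_questions.items()}
```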
This research builds on the foundation laid by the creators of DOVE, EvalTree, and MMLU. Special thanks to the teams behind LLaMA3 and OLMoE for making their models and tools available to the community.