News Topic Classification 📰

Student: Matteo Ientile

Context: Data Science & Machine Learning Exam - Winter 2026

📌 Project Overview

This repository contains the solution for the News Topic Classification problem. The project implements a Machine Learning pipeline that classifies news articles into distinct categories, addressing challenges such as high dimensionality and class imbalance. The best solution ranked in the top 25% of a leaderboard of 200+ participants.
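
As an illustration of the class imbalance challenge mentioned above, the sketch below computes per-class weights that are inversely proportional to class frequency. This is the heuristic scikit-learn applies for `class_weight="balanced"`; it is shown only as one common option and is not necessarily what the notebooks use. The function name and the toy labels are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {label: weight} with weight = n_samples / (n_classes * count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Hypothetical toy labels, not taken from the project's dataset.
toy_labels = ["sports"] * 6 + ["politics"] * 3 + ["tech"] * 1
weights = balanced_class_weights(toy_labels)

# Rare classes receive larger weights than frequent ones.
for label, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{label}: {weight:.3f}")
```

With weights like these, a misclassified example from a rare class contributes more to the training loss than one from a frequent class.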

The solution is divided into two distinct parts:

- Exploration & Tuning: Deep analysis and hyperparameter search.
- Final Pipeline: The optimized, production-ready model.

⚠️ Important Note on Performance

Please read before executing:

`1_Exploration_and_Tuning.ipynb` runs computationally expensive Grid Search and Randomized Search operations and takes a significant amount of time to complete.

The notebook has been saved with all outputs visible. It is recommended to view the static outputs rather than re-running the cells unless you intend to reproduce the full tuning process from scratch.
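
To give a rough sense of why the tuning notebook is slow, the sketch below counts how many model fits a grid search and a randomized search would need under cross-validation (excluding any final refit on the best parameters). The parameter names, grid values, fold count, and `n_iter` are all hypothetical; the real search spaces are defined in the notebook.

```python
from itertools import product
from math import prod

# Hypothetical search space; the real grids live in the notebook.
param_grid = {
    "C": [0.01, 0.1, 1, 10],
    "ngram_range": [(1, 1), (1, 2)],
    "max_features": [10_000, 50_000, 100_000],
}
cv_folds = 5  # assumed number of cross-validation folds


def grid_search_fits(grid, folds):
    """Grid search fits every combination once per fold."""
    return prod(len(values) for values in grid.values()) * folds


def randomized_search_fits(n_iter, folds):
    """Randomized search fits n_iter sampled combinations once per fold."""
    return n_iter * folds


# Enumerate the combinations explicitly to show what the grid covers.
all_combinations = list(product(*param_grid.values()))

grid_fits = grid_search_fits(param_grid, cv_folds)
random_fits = randomized_search_fits(n_iter=10, folds=cv_folds)

print(f"combinations: {len(all_combinations)}")
print(f"grid search fits: {grid_fits}")
print(f"randomized search fits: {random_fits}")
```

The grid cost grows multiplicatively with every added parameter, which is why viewing the saved outputs is usually preferable to re-running the search.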

📂 Repository Structure

| File | Description |
| --- | --- |
| `1_Exploration_and_Tuning.ipynb` | Analysis & R&D. Contains Exploratory Data Analysis (EDA), split strategy, and extensive Hyperparameter Tuning. |
| `2_Final_Model_Solution.ipynb` | Production Pipeline. The final, reproducible solution using the best hyperparameters found. It retrains on the full Development set and generates the submission CSV quickly. |
| `requirements.txt` | List of Python dependencies required to run the environment. |
| `Report.pdf` | Official IEEE-format report describing the pipeline. |

🚀 How to Run

Install Dependencies: Ensure you are in the project directory and run:
```
pip install -r requirements.txt
```
Data Placement: Ensure the dataset files (`development.csv` and `evaluation.csv`) are located in the root directory of this repository.

Generate Submission (Fast): Run the solution notebook to reproduce the final model and submission file:
- Open and run `2_Final_Model_
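
Before running either notebook, the data placement step can be verified with a small check like the one below. It is a minimal sketch that only confirms the two dataset files named above exist in a given directory; the helper name `missing_dataset_files` is hypothetical and not part of the repository.

```python
from pathlib import Path

# File names taken from the Data Placement step.
REQUIRED_FILES = ("development.csv", "evaluation.csv")


def missing_dataset_files(root="."):
    """Return the required dataset files that are absent from `root`."""
    root_path = Path(root)
    return [name for name in REQUIRED_FILES if not (root_path / name).is_file()]


missing = missing_dataset_files()
if missing:
    print("Missing dataset files:", ", ".join(missing))
else:
    print("All dataset files found.")
```

Run it from the repository root so that the default `root="."` points at the directory the notebooks read from.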
