This repository was archived by the owner on Mar 16, 2026. It is now read-only.

luciobaiocchi/PCA_Analysis

 
 


Computational Linear Algebra: PCA Psychophysical Analysis

This project applies Principal Component Analysis (PCA) and K-Means Clustering to a high-dimensional dataset (93 variables) to explore the relationship between psychographic traits—such as personality, phobias, and habits—and physical characteristics.

📌 Project Overview

The analysis reduces the 93 initial variables to 5 factors that capture the largest share of variance in the behavioral and physical data. K-Means clustering on these factors then segments the population into profiles that can be compared by height and weight.
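As a brief reminder of what "capturing variance" means here, the reduction can be written as follows. This is the standard PCA formulation, not a transcription of the notebook; the symbols $X$, $C$, $V$, $\lambda_i$ and $Z$ are notation introduced for this sketch, and $X$ is assumed to be the standardized $n \times 93$ data matrix.

$$
C = \frac{1}{n-1} X^\top X = V \Lambda V^\top, \qquad \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_{93}), \quad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_{93} \ge 0
$$

$$
Z = X V_5, \qquad \text{retained variance} = \frac{\sum_{i=1}^{5} \lambda_i}{\sum_{i=1}^{93} \lambda_i}
$$

Here $V_5$ holds the first five columns of $V$ (the leading eigenvectors), so $Z$ is the $n \times 5$ matrix of factor scores passed on to clustering.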

🛠️ Key Methodologies

  • Data Preprocessing: Handling categorical data via Ordinal Encoding and normalizing numerical features using StandardScaler.
  • Dimensionality Reduction: Implementing PCA to condense the feature space while retaining most of the variance in the data.
  • Unsupervised Learning: Utilizing K-Means Clustering to segment the population based on the principal components.
  • Statistical Profiling: Analyzing cluster centroids to interpret the correlation between height/weight and behavioral factors (e.g., extraversion vs. anxiety).
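The preprocessing, PCA and clustering steps above can be sketched as a single scikit-learn pipeline. This is a minimal illustration under stated assumptions, not the notebook's actual code: the data frame is synthetic, the column names (`smoking`, `punctuality`, and so on) are hypothetical, and scaling every column after encoding is one possible reading of the preprocessing step. Only the choices of 5 components and 4 clusters come from this README.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder, StandardScaler

# Synthetic stand-in data: column names and values are hypothetical,
# not taken from the project's dataset.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "height": rng.normal(173, 10, n),
    "weight": rng.normal(66, 13, n),
    "extraversion": rng.integers(1, 6, n).astype(float),
    "anxiety": rng.integers(1, 6, n).astype(float),
    "smoking": rng.choice(["never", "former", "current"], n),
    "punctuality": rng.choice(["early", "on time", "late"], n),
})

categorical = ["smoking", "punctuality"]

# Ordinal-encode the categorical columns and pass the numeric ones through.
# Without an explicit `categories` argument the encoder orders levels
# alphabetically, which may not match their natural order.
encode = ColumnTransformer(
    [("cat", OrdinalEncoder(), categorical)],
    remainder="passthrough",
)

pipeline = Pipeline([
    ("encode", encode),
    ("scale", StandardScaler()),       # zero mean, unit variance per column
    ("pca", PCA(n_components=5)),      # 5 factors, as in the README
    ("kmeans", KMeans(n_clusters=4, n_init=10, random_state=0)),
])

labels = pipeline.fit_predict(df)      # cluster label per row
scores = pipeline[:-1].transform(df)   # PCA scores fed to K-Means
explained = pipeline.named_steps["pca"].explained_variance_ratio_

print("cumulative explained variance:", explained.cumsum().round(3))
```

Keeping the steps in one `Pipeline` means the scaler and PCA are fitted on the same data as the clustering, and `pipeline[:-1]` exposes the factor scores for plotting or inspection.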

📊 Key Findings

The model identified four distinct psychophysical profiles:

  • The Reckless Hedonist (Cluster 1): Linked to larger physical stature (avg. 178.6 cm, 73.3 kg) and sensation-seeking behaviors.
  • The Anxious Conformist (Cluster 0): Characterized by smaller physical frames (avg. 169.3 cm, 59.3 kg).
  • The Female-Dominant Profile (Cluster 2): 60.6% female with distinct psychological markers.
  • The Heterogeneous Group (Cluster 3): High weight variability ($\sigma = 16.8$ kg), representing a physically diverse segment.
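Per-cluster statistics like the averages, standard deviation and female share quoted above can be computed with a pandas group-by. The sketch below is illustrative only: the data are randomly generated, the column names are hypothetical, and its output does not reproduce the figures reported in this README.

```python
import numpy as np
import pandas as pd

# Synthetic example data; in practice `cluster` would hold the K-Means labels.
rng = np.random.default_rng(1)
n = 120
profile = pd.DataFrame({
    "cluster": rng.integers(0, 4, n),
    "height": rng.normal(173, 10, n),   # cm
    "weight": rng.normal(66, 13, n),    # kg
    "gender": rng.choice(["female", "male"], n),
})

# One row per cluster: size, mean height/weight, weight spread, female share.
summary = profile.groupby("cluster").agg(
    n=("height", "count"),
    height_mean=("height", "mean"),
    weight_mean=("weight", "mean"),
    weight_std=("weight", "std"),
    female_pct=("gender", lambda g: 100 * (g == "female").mean()),
)

print(summary.round(1))
```

Reading the table row by row gives the kind of profile listed above: a cluster's mean height and weight, how spread out its weights are, and its gender balance.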

💻 Technologies Used

  • Language: Python 3
  • Libraries:
    • Scikit-Learn (PCA, KMeans, Preprocessing)
    • Pandas & NumPy (Data Manipulation)
    • Matplotlib (Visualization)
  • Environment: Jupyter Notebook

🎓 Academic Context

  • Course: Computational Linear Algebra
  • Academic Year: 2025/2026
  • Authors: Lucio Baiocchi, Leonardo Passafiume

About

Second assignment for the course Computational Linear Algebra for Large Scale Problems.
