K-Means Clustering of NBA Player Performance — 2000–2021

A data analysis project in R that explores NBA player performance in three ways: trends in physical attributes, a breakdown of Dean Oliver's Four Factors by position, and unsupervised K-Means clustering of players from the 2020/21 season.

🖼️ Visuals

K-Means Cluster Plot — NBA 2020/21

Three distinct player profiles emerge from the clustering of Oliver's Four Factors (eFG%, TOV%, ORB%, FTr%).
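
As a rough illustration of the clustering step, the sketch below runs base R's `kmeans()` on standardised Four Factors with k = 3. The data are synthetic and the column names (`eFG`, `TOV`, `ORB`, `FTr`) are placeholders, so this is an assumption-laden outline rather than the code in `analysis.R`.

```r
# Synthetic stand-in for a 2020/21 Four Factors table (illustrative values only).
set.seed(42)
n <- 150
players <- data.frame(
  eFG = rnorm(n, mean = 0.53, sd = 0.05),
  TOV = rnorm(n, mean = 12.5, sd = 3),
  ORB = rnorm(n, mean = 5,    sd = 3.5),
  FTr = rnorm(n, mean = 0.25, sd = 0.10)
)

# Standardise each metric so none dominates the Euclidean distance.
scaled <- scale(players)

# k = 3 mirrors the three profiles reported in this README;
# nstart = 25 reruns the algorithm from several random starts.
fit <- kmeans(scaled, centers = 3, nstart = 25)

# Attach the labels, then summarise each cluster on the original scale.
players$cluster <- factor(fit$cluster)
print(table(players$cluster))
print(aggregate(. ~ cluster, data = players, FUN = mean))
```

Reading the per-cluster means on the original scale is what allows a cluster to be labelled, for example, as high eFG% with low TOV%.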

Elbow Method — Optimal Number of Clusters
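
The Elbow method can be sketched in base R by fitting K-Means for a range of k and plotting the total within-cluster sum of squares. The data below are synthetic (three loose groups in four dimensions), and the range 1 to 8 is an arbitrary choice for illustration, not necessarily what the project uses.

```r
set.seed(1)

# Hypothetical data: three groups of 50 points around different centres.
centres <- matrix(c(0, 0, 0, 0,
                    4, 4, 0, 0,
                    0, 4, 4, 4), ncol = 4, byrow = TRUE)
x <- do.call(rbind, lapply(1:3, function(i) {
  matrix(rnorm(50 * 4), ncol = 4) + matrix(centres[i, ], 50, 4, byrow = TRUE)
}))
x <- scale(x)

# Total within-cluster sum of squares for each candidate k.
ks  <- 1:8
wss <- sapply(ks, function(k) kmeans(x, centers = k, nstart = 25)$tot.withinss)

# The "elbow" is the k after which adding clusters yields little further drop.
plot(ks, wss, type = "b",
     xlab = "Number of clusters k",
     ylab = "Total within-cluster sum of squares")
```

The `factoextra` package listed under Dependencies provides `fviz_nbclust(x, kmeans, method = "wss")`, which draws the same kind of curve with ggplot2.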

📋 Overview

The project is structured in four stages:

1. Data cleaning: handling NA values, unit conversion (ft/inches → cm, lbs → kg), deduplication, and position standardisation.
2. Physical analysis: evolution of average height and weight across all NBA seasons (1950–2021), broken down by position.
3. Four Factors analysis: comparison of Dean Oliver's four offensive efficiency metrics (eFG%, TOV%, ORB%, FTr%) across the five positions.
4. K-Means clustering: unsupervised clustering of 2020/21 players based on the Four Factors, with the optimal number of clusters determined via the Elbow method.
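
The unit conversion in the data cleaning stage might look roughly like the sketch below. It assumes heights are stored as "feet-inches" strings (such as "6-10") and weights in pounds. Both the input format and the helper names `height_to_cm` and `lbs_to_kg` are hypothetical, and it uses plain arithmetic rather than the `measurements` package.

```r
# Assumed format: height as "feet-inches" text, e.g. "6-10".
height_to_cm <- function(height) {
  parts <- strsplit(as.character(height), "-", fixed = TRUE)
  vapply(parts, function(p) {
    # Missing or malformed entries stay NA instead of raising an error.
    if (length(p) != 2 || anyNA(p)) return(NA_real_)
    (as.numeric(p[1]) * 12 + as.numeric(p[2])) * 2.54  # inches -> cm
  }, numeric(1))
}

# 1 lb = 0.45359237 kg (exact definition).
lbs_to_kg <- function(lbs) lbs * 0.45359237

# Hypothetical biographical rows, including one with missing values.
bio <- data.frame(
  name   = c("Player A", "Player B", "Player C"),
  height = c("6-10", "5-11", NA),
  weight = c(240, 175, NA)
)
bio$height_cm <- height_to_cm(bio$height)
bio$weight_kg <- lbs_to_kg(bio$weight)
print(bio)
```

Keeping NA values as NA through the conversion leaves the decision to drop or impute them to a separate, explicit step.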

📂 Repository Structure

ClusteringNBAperformance/
├── analysis.R              # Main analysis script
├── R/
│   └── helpers.R           # Utility functions
├── data/
│   ├── seasons_stats.csv   # Season-level player statistics (Kaggle)
│   ├── player_data.csv     # Player biographical data (Kaggle)
│   └── README.md           # Data description and provenance
├── output/
│   └── figures/
│       ├── kmeans_cluster.png
│       ├── elbow_plot.png
│       ├── four_factors.png
│       └── physique_scatter.png
└── docs/
    └── variable_glossary.md

📦 Dependencies

install.packages(c(
  "tidyverse", "ggplot2", "measurements", "factoextra", "GGally"
))

🚀 Reproducing the Analysis

Clone the repository:

git clone https://github.com/marinoalfonso/ClusteringNBAperformance.git
cd ClusteringNBAperformance

Install dependencies (see above).
Run the analysis:
```
source("analysis.R")
```

The script will clean the data, produce all plots, and export figures to output/figures/.

📊 Data Source

Both datasets are sourced from the Kaggle dataset NBA Players Stats since 1950 by Gilermo:

| File | Description | Rows |
| --- | --- | --- |
| `seasons_stats.csv` | Season-level player statistics, 1950–2022 | ~24 000 |
| `player_data.csv` | Player biographical data (height, weight, college, etc.) | ~3 900 |

The datasets are included in this repository for reproducibility. Please refer to the Kaggle license for terms of use.

🔑 Key Findings

| Stage | Main Result |
| --- | --- |
| Physical trends | Average NBA player height has remained stable (~200 cm) since the 1980s; weight has increased slightly, especially among Centres and Power Forwards. |
| Four Factors | Centres lead in ORB% and FTr%; Point Guards lead in eFG% and TOV%. |
| Clustering | 3 clusters emerge: high-efficiency scorers (high eFG%, low TOV%), rebounders/interior players (high ORB%, high FTr%), and perimeter / transitional players. |
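
For readers unfamiliar with the metrics in the table above, the sketch below computes them from box-score totals using the commonly published definitions. All input values and column names are hypothetical. FTr is taken here as FTA/FGA (the player-level free throw rate), and the ORB% line follows the usual player-level estimate, which needs team and opponent rebound totals. The project may instead read these columns directly from the dataset.

```r
# Hypothetical season totals for one player plus team/opponent context.
box <- data.frame(
  FG = 400, FGA = 850, X3P = 100, FTA = 200, TOV = 120,
  ORB = 80, MP = 2000,
  TmMP = 19780, TmORB = 800, OppDRB = 2800
)

four_factors <- with(box, data.frame(
  # Effective field goal %: a made three counts as 1.5 field goals.
  eFG_pct = (FG + 0.5 * X3P) / FGA,
  # Turnovers per 100 plays (shots, trips to the line, turnovers).
  TOV_pct = 100 * TOV / (FGA + 0.44 * FTA + TOV),
  # Share of available offensive rebounds grabbed while on the floor.
  ORB_pct = 100 * (ORB * (TmMP / 5)) / (MP * (TmORB + OppDRB)),
  # Free throw attempts per field goal attempt.
  FTr     = FTA / FGA
))
print(round(four_factors, 3))
```

Note that a higher TOV% means more turnovers per play, so leading in TOV% is a weakness rather than a strength.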

👤 Author

Alfonso Marino
GitHub · Feel free to open an issue or submit a PR.

📄 License

This project is licensed under the MIT License.
