Open
29 commits
1fe66cc
Updated .gitignore file
Dec 8, 2025
e1e9164
Updated config.yaml file
Dec 8, 2025
a03c317
Added project data files
Dec 8, 2025
817ef4a
Updated notebooks, and deleted template notebooks
Dec 8, 2025
b5a7d5b
Data cleaning: removed 242 duplicates, handled NaN in Sleep Disorder,…
axmbesso Dec 8, 2025
850005b
update day 1
VeroniqueFanchonna Dec 8, 2025
8acdd64
Updated README file for day 1
Dec 8, 2025
40e03a6
Updated config.yaml clean dataframe
Dec 8, 2025
8d4927e
Updated notebook file
Dec 8, 2025
532abb7
Reorganized project: moved notebook to notebooks folder, removed temp…
axmbesso Dec 9, 2025
02a7dde
Updated cleaning notebook and replaced clean dataset
axmbesso Dec 9, 2025
0e8a177
Merge pull request #2 from viladomiupati-sys/Pati
viladomiupati-sys Dec 9, 2025
353945d
Merge pull request #3 from viladomiupati-sys/veronique
viladomiupati-sys Dec 9, 2025
92dfcad
Merge branch 'main' into carmelina
Axoudouxou Dec 9, 2025
dc75936
Merge pull request #1 from viladomiupati-sys/carmelina
viladomiupati-sys Dec 9, 2025
8c12b1c
Format secondary hypotheses for clarity
viladomiupati-sys Dec 9, 2025
b6323c5
update 2 day 1
VeroniqueFanchonna Dec 9, 2025
cd145f0
Merge pull request #4 from viladomiupati-sys/veronique
viladomiupati-sys Dec 9, 2025
5f530f8
H1 hypothesys day 2
VeroniqueFanchonna Dec 9, 2025
75a3025
Sleep health analysis
axmbesso Dec 10, 2025
6abeae8
Upload ananalysis file
Dec 10, 2025
5ba453d
README.md day 2
Dec 10, 2025
c5d147f
Changed name csv
Dec 10, 2025
3068c1e
merge analysis & README "day 2"
viladomiupati-sys Dec 10, 2025
752b174
merge analysis "day 2"
viladomiupati-sys Dec 10, 2025
1f92b9e
merge analysis "day 2"
viladomiupati-sys Dec 10, 2025
9e5ab86
Uploaded slide
Dec 11, 2025
e5456a2
Changing file
Dec 11, 2025
71835fd
Deleting figures we dont need
Dec 11, 2025
2 changes: 2 additions & 0 deletions .gitignore
@@ -6,3 +6,5 @@ notebooks/.env
notebooks/.DS_Store
.DS_Store
*.in
.virtual_documents/
anaconda_projects/
115 changes: 58 additions & 57 deletions README.md
@@ -1,77 +1,78 @@
# Project overview
...
## 💤 Sleep Health & Lifestyle Analysis
#### Business Case: Predicting Sleep Disorders

# Installation

1. **Clone the repository**:
### 📌 Project Overview
This project analyzes a Sleep Health & Lifestyle dataset to identify key factors associated with sleep disorders (Insomnia and Sleep Apnea).
The goal is to understand how lifestyle, physiological metrics, and stress levels contribute to sleep disorder risk and to support early intervention strategies.

```bash
git clone https://github.com/YourUsername/repository_name.git
```

2. **Install UV**
### 🎯 Business Problem
Sleep disorders increase medical costs and stress, and they reduce quality of life.
Identifying high-risk individuals early enables:
- Preventive healthcare
- Reduced diagnosis costs
- Targeted wellbeing programs

If you're a MacOS/Linux user type:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
### ❓ Research Questions
- Which lifestyle and physiological factors correlate with sleep disorders?
- Can stress, BMI, activity, and sleep patterns predict disorder presence?
- What differentiates insomnia from sleep apnea?

If you're a Windows user open an Anaconda Powershell Prompt and type :

```bash
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
```
### 🧪 Hypotheses
#### Primary Hypothesis (H1)
Individuals with high stress, high BMI, low sleep duration, and poor sleep quality are significantly more likely to have a sleep disorder.
**H0:** Sleep disorder presence is independent of these factors.

3. **Create an environment**
#### Secondary Hypotheses
- **H1a:** Obesity increases likelihood of sleep apnea.
- **H1b:** Higher stress correlates with insomnia.
- **H1c:** Sleeping <6 hours increases disorder risk.
- **H1d:** Low physical activity (<40 min/day) increases disorder prevalence.
- **H1e:** High heart rate / BP increases apnea risk.

```bash
uv venv
```

3. **Activate the environment**
### 🧹 Data Cleaning Summary
- Checked for missing values, incorrect data types, and duplicates.
- Standardized column names and trimmed string formatting.
- Normalized inconsistent categories (e.g., "Normal" vs "Normal Weight").
- Split Blood Pressure into numeric Systolic and Diastolic columns.
- Converted all relevant columns to numeric types.
- Filled missing Sleep Disorder values with "No Disorder".
- Removed duplicate rows (242 duplicates dropped).

If you're a MacOS/Linux user type (if you're using a bash shell):
Final result: a clean, consistent dataset ready for analysis.
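
The cleaning steps listed above could be sketched roughly as follows. This is a minimal, standard-library illustration rather than the notebook's actual code: the function name `clean_rows`, the made-up raw rows, and the rule that duplicates are rows identical in every column except `person_id` are all assumptions.

```python
def clean_rows(rows):
    """Apply the summarized cleaning steps to a list of row dicts."""
    cleaned, seen = [], set()
    for row in rows:
        # Trim stray whitespace from string fields.
        row = {k: (v.strip() if isinstance(v, str) else v) for k, v in row.items()}
        # Normalize inconsistent categories ("Normal Weight" -> "Normal").
        if row["bmi_category"] == "Normal Weight":
            row["bmi_category"] = "Normal"
        # Fill missing Sleep Disorder values with "No Disorder".
        if not row.get("sleep_disorder"):
            row["sleep_disorder"] = "No Disorder"
        # Split "126/83" into numeric systolic and diastolic columns.
        systolic, diastolic = row["blood_pressure"].split("/")
        row["systolic"], row["diastolic"] = int(systolic), int(diastolic)
        # Assumption: person_id is ignored when detecting duplicates.
        key = tuple(sorted((k, v) for k, v in row.items() if k != "person_id"))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


# Hypothetical raw rows, for illustration only.
raw = [
    {"person_id": "1", "bmi_category": "Overweight", "blood_pressure": "126/83", "sleep_disorder": ""},
    {"person_id": "2", "bmi_category": "Normal Weight", "blood_pressure": "125/80", "sleep_disorder": None},
    {"person_id": "3", "bmi_category": "Normal", "blood_pressure": "125/80", "sleep_disorder": ""},
]
cleaned = clean_rows(raw)
print(len(cleaned))  # row 3 duplicates row 2 once categories are normalized
```

Running the normalization before the duplicate check matters here: rows that differ only by "Normal" versus "Normal Weight" would otherwise survive as distinct records.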

```bash
source ./venv/bin/activate
```
### 📊 Exploratory Data Analysis (EDA)
Univariate and bivariate analyses were performed to understand data distributions and relationships between variables.
- Target Distribution: The target is slightly imbalanced, with 73 "No Disorder" cases versus 59 disorder cases (Insomnia n=29, Sleep Apnea n=30).
- Key Correlations: Blood pressure (systolic and diastolic) and heart rate both showed a strong positive correlation with the presence of a sleep disorder.

If you're a MacOS/Linux user type (if you're using a csh/tcsh shell):
### 🔬 Hypothesis Testing & Results (H1, H2 & H3)
Statistical tests and Logistic Regression models were used to validate the hypotheses, segmenting results by disorder type.

```bash
source ./venv/bin/activate.csh
```
#### H1: Lifestyle and Physiological Factors

If you're a Windows user type:
**H1a:** Obesity & Apnea ✅ Validated --> Apnea prevalence in the "Obese" group (57.14%) was significantly higher than in the "Normal" group (9.59%).
**H1b:** Stress & Insomnia ✅ Validated --> Patients with Insomnia reported the highest average stress level (7.21).
**H1d:** Low Activity (<40 min) ❌ Not Validated --> No significant difference in disorder prevalence was found between the low activity group (42%) and the adequate activity group (45%).
**H1e:** High HR / BP & Apnea ✅ Validated --> Elevated blood pressure and heart rate are significant indicators for Sleep Apnea risk.
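
The group prevalence behind H1a can be reproduced with a short calculation. The sketch below uses the seven "Obese" rows present in `data/clean/sleep_health_project_clean.csv` (person IDs 4, 6, 7, 94, 146, 265, 277); the helper name `prevalence` is an assumption, and the notebooks may compute this differently.

```python
def prevalence(rows, group, disorder):
    """Percentage of rows in a BMI group that have the given disorder."""
    in_group = [r for r in rows if r["bmi_category"] == group]
    if not in_group:
        return 0.0
    hits = sum(r["sleep_disorder"] == disorder for r in in_group)
    return round(100 * hits / len(in_group), 2)


# (person_id, sleep_disorder) for the Obese rows in the cleaned CSV.
obese = [
    (4, "Sleep Apnea"), (6, "Insomnia"), (7, "Insomnia"), (94, "Sleep Apnea"),
    (146, "Sleep Apnea"), (265, "Insomnia"), (277, "Sleep Apnea"),
]
rows = [
    {"person_id": pid, "bmi_category": "Obese", "sleep_disorder": disorder}
    for pid, disorder in obese
]
apnea_rate = prevalence(rows, "Obese", "Sleep Apnea")
print(apnea_rate)  # 57.14 (4 of 7 rows)
```

With only seven rows in the "Obese" group, the percentage is sensitive to single cases, which is worth keeping in mind when reading the comparison.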

```bash
.\venv\Scripts\activate
```
#### H2: Age Effects
Age showed a moderate positive association with sleep disorders.
- Apnea vs. Insomnia: Sleep Apnea patients are older on average (44.7 years) than Insomnia patients (41.6 years).
- Sleep Duration: The hypothesis that age correlates with lower sleep duration was rejected; the opposite (a positive correlation) was observed in this sample.
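
As an illustration of how the age versus sleep-duration relationship can be checked, here is a hand-rolled Pearson correlation. The eight (age, sleep_duration) pairs are copied from rows of the cleaned CSV (person IDs 1, 2, 8, 95, 190, 280, 317, 359), but the subset is arbitrary and far too small to count as evidence; the function name `pearson` is an assumption, not the notebook's code.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Illustrative subset of the cleaned dataset: age and sleep_duration.
ages = [27, 28, 29, 36, 43, 50, 53, 59]
sleep_hours = [6.1, 6.2, 7.8, 7.2, 6.5, 8.3, 8.5, 8.0]

r = pearson(ages, sleep_hours)
print(f"r = {r:.2f}")  # positive for this subset
```

Running the same function over the full `age` and `sleep_duration` columns would give the sample-wide coefficient that the statement above refers to.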

4. **Install dependencies**:
#### H3: Combined Risk Factor Model
An integrated risk scoring system for population segmentation was developed and strongly validated.

```bash
uv pip install -r requirements.txt
```
- Low Risk = 55% of population --> 18.31% disorder rate
- Medium Risk = 17% of population --> 62.50% disorder rate
- High Risk = 28% of population --> 83.78% disorder rate
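
The README does not spell out how the risk score is built, so the following is a purely hypothetical sketch of what a tiered additive score could look like. The four flags, the thresholds for stress (>= 7) and systolic pressure (>= 130), and the tier cut-offs are all assumptions; only the "<6 hours" sleep threshold comes from H1c. It will not necessarily reproduce the percentages above.

```python
def risk_score(row):
    """Hypothetical additive score: one point per risk flag (0 to 4)."""
    flags = [
        row["stress_level"] >= 7,                       # assumed cut-off
        row["bmi_category"] in {"Overweight", "Obese"},
        row["sleep_duration"] < 6,                      # threshold from H1c
        row["systolic"] >= 130,                         # assumed cut-off
    ]
    return sum(int(flag) for flag in flags)


def risk_tier(score):
    """Map a score to a tier; the boundaries are assumptions."""
    if score <= 1:
        return "Low"
    if score == 2:
        return "Medium"
    return "High"


# Two rows taken from the cleaned CSV (person IDs 4 and 8).
person_4 = {"stress_level": 8, "bmi_category": "Obese", "sleep_duration": 5.9, "systolic": 140}
person_8 = {"stress_level": 6, "bmi_category": "Normal", "sleep_duration": 7.8, "systolic": 120}

print(risk_tier(risk_score(person_4)))  # High
print(risk_tier(risk_score(person_8)))  # Low
```

An additive score like this keeps each tier easy to explain, at the cost of weighting every factor equally; a fitted model such as the logistic regression mentioned earlier would assign its own weights.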

# Questions
...

# Dataset
...

## Main dataset issues

- ...
- ...
- ...

## Solutions for the dataset issues
...

# Conclussions
...

# Next steps
...
### 🏁 Conclusions & Insights
1. Predictive Efficacy: The combined risk model (H3) segmented the population effectively, showing a clear dose-response relationship (a higher risk score corresponds to higher disorder prevalence).
2. Disorder Differentiation: Risk factors are distinct: Insomnia is primarily associated with high stress levels and slightly younger ages; Sleep Apnea is strongly associated with high physiological metrics (BP, HR) and overweight/obesity.
3. Intervention Focus: The most effective preventive strategies should focus on blood pressure control and weight management to mitigate Apnea risk.
Binary file added archive.zip
Binary file not shown.
4 changes: 2 additions & 2 deletions config.yaml
@@ -1,5 +1,5 @@
input_data:
file: "../data/raw/raw_data_file.csv"
file: "../data/raw/sleep_health_and_lifestyle_dataset.csv"

output_data:
file: "../data/clean/cleaned_data_file.csv"
file: "../data/clean/sleep_health_project_clean.csv"
Empty file removed data/clean/cleaned_data_file.csv
Empty file.
133 changes: 133 additions & 0 deletions data/clean/sleep_health_project_clean.csv
@@ -0,0 +1,133 @@
person_id,gender,age,occupation,sleep_duration,quality_of_sleep,physical_activity_level,stress_level,bmi_category,blood_pressure,heart_rate,daily_steps,sleep_disorder,systolic,diastolic
1,Male,27,Software Engineer,6.1,6,42,6,Overweight,126/83,77,4200,No Disorder,126,83
2,Male,28,Doctor,6.2,6,60,8,Normal,125/80,75,10000,No Disorder,125,80
4,Male,28,Sales Representative,5.9,4,30,8,Obese,140/90,85,3000,Sleep Apnea,140,90
6,Male,28,Software Engineer,5.9,4,30,8,Obese,140/90,85,3000,Insomnia,140,90
7,Male,29,Teacher,6.3,6,40,7,Obese,140/90,82,3500,Insomnia,140,90
8,Male,29,Doctor,7.8,7,75,6,Normal,120/80,70,8000,No Disorder,120,80
11,Male,29,Doctor,6.1,6,30,8,Normal,120/80,70,8000,No Disorder,120,80
14,Male,29,Doctor,6.0,6,30,8,Normal,120/80,70,8000,No Disorder,120,80
17,Female,29,Nurse,6.5,5,40,7,Normal,132/87,80,4000,Sleep Apnea,132,87
18,Male,29,Doctor,6.0,6,30,8,Normal,120/80,70,8000,Sleep Apnea,120,80
19,Female,29,Nurse,6.5,5,40,7,Normal,132/87,80,4000,Insomnia,132,87
20,Male,30,Doctor,7.6,7,75,6,Normal,120/80,70,8000,No Disorder,120,80
21,Male,30,Doctor,7.7,7,75,6,Normal,120/80,70,8000,No Disorder,120,80
25,Male,30,Doctor,7.8,7,75,6,Normal,120/80,70,8000,No Disorder,120,80
26,Male,30,Doctor,7.9,7,75,6,Normal,120/80,70,8000,No Disorder,120,80
31,Female,30,Nurse,6.4,5,35,7,Normal,130/86,78,4100,Sleep Apnea,130,86
32,Female,30,Nurse,6.4,5,35,7,Normal,130/86,78,4100,Insomnia,130,86
33,Female,31,Nurse,7.9,8,75,4,Normal,117/76,69,6800,No Disorder,117,76
34,Male,31,Doctor,6.1,6,30,8,Normal,125/80,72,5000,No Disorder,125,80
35,Male,31,Doctor,7.7,7,75,6,Normal,120/80,70,8000,No Disorder,120,80
38,Male,31,Doctor,7.6,7,75,6,Normal,120/80,70,8000,No Disorder,120,80
44,Male,31,Doctor,7.8,7,75,6,Normal,120/80,70,8000,No Disorder,120,80
50,Male,31,Doctor,7.7,7,75,6,Normal,120/80,70,8000,Sleep Apnea,120,80
51,Male,32,Engineer,7.5,8,45,3,Normal,120/80,70,8000,No Disorder,120,80
53,Male,32,Doctor,6.0,6,30,8,Normal,125/80,72,5000,No Disorder,125,80
54,Male,32,Doctor,7.6,7,75,6,Normal,120/80,70,8000,No Disorder,120,80
57,Male,32,Doctor,7.7,7,75,6,Normal,120/80,70,8000,No Disorder,120,80
63,Male,32,Doctor,6.2,6,30,8,Normal,125/80,72,5000,No Disorder,125,80
67,Male,32,Accountant,7.2,8,50,6,Normal,118/76,68,7000,No Disorder,118,76
68,Male,33,Doctor,6.0,6,30,8,Normal,125/80,72,5000,Insomnia,125,80
69,Female,33,Scientist,6.2,6,50,6,Overweight,128/85,76,5500,No Disorder,128,85
71,Male,33,Doctor,6.1,6,30,8,Normal,125/80,72,5000,No Disorder,125,80
75,Male,33,Doctor,6.0,6,30,8,Normal,125/80,72,5000,No Disorder,125,80
81,Female,34,Scientist,5.8,4,32,8,Overweight,131/86,81,5200,Sleep Apnea,131,86
83,Male,35,Teacher,6.7,7,40,5,Overweight,128/84,70,5600,No Disorder,128,84
85,Male,35,Software Engineer,7.5,8,60,5,Normal,120/80,70,8000,No Disorder,120,80
86,Female,35,Accountant,7.2,8,60,4,Normal,115/75,68,7000,No Disorder,115,75
87,Male,35,Engineer,7.2,8,60,4,Normal,125/80,65,5000,No Disorder,125,80
89,Male,35,Engineer,7.3,8,60,4,Normal,125/80,65,5000,No Disorder,125,80
94,Male,35,Lawyer,7.4,7,60,5,Obese,135/88,84,3300,Sleep Apnea,135,88
95,Female,36,Accountant,7.2,8,60,4,Normal,115/75,68,7000,Insomnia,115,75
96,Female,36,Accountant,7.1,8,60,4,Normal,115/75,68,7000,No Disorder,115,75
97,Female,36,Accountant,7.2,8,60,4,Normal,115/75,68,7000,No Disorder,115,75
99,Female,36,Teacher,7.1,8,60,4,Normal,115/75,68,7000,No Disorder,115,75
101,Female,36,Teacher,7.2,8,60,4,Normal,115/75,68,7000,No Disorder,115,75
104,Male,36,Teacher,6.6,5,35,7,Overweight,129/84,74,4800,Sleep Apnea,129,84
105,Female,36,Teacher,7.2,8,60,4,Normal,115/75,68,7000,Sleep Apnea,115,75
106,Male,36,Teacher,6.6,5,35,7,Overweight,129/84,74,4800,Insomnia,129,84
107,Female,37,Nurse,6.1,6,42,6,Overweight,126/83,77,4200,No Disorder,126,83
108,Male,37,Engineer,7.8,8,70,4,Normal,120/80,68,7000,No Disorder,120,80
110,Male,37,Lawyer,7.4,8,60,5,Normal,130/85,68,8000,No Disorder,130,85
111,Female,37,Accountant,7.2,8,60,4,Normal,115/75,68,7000,No Disorder,115,75
126,Female,37,Nurse,7.5,8,60,4,Normal,120/80,70,8000,No Disorder,120,80
127,Male,38,Lawyer,7.3,8,60,5,Normal,130/85,68,8000,No Disorder,130,85
128,Female,38,Accountant,7.1,8,60,4,Normal,115/75,68,7000,No Disorder,115,75
138,Male,38,Lawyer,7.1,8,60,5,Normal,130/85,68,8000,No Disorder,130,85
145,Male,38,Lawyer,7.1,8,60,5,Normal,130/85,68,8000,Sleep Apnea,130,85
146,Female,38,Lawyer,7.4,7,60,5,Obese,135/88,84,3300,Sleep Apnea,135,88
147,Male,39,Lawyer,7.2,8,60,5,Normal,130/85,68,8000,Insomnia,130,85
148,Male,39,Engineer,6.5,5,40,7,Overweight,132/87,80,4000,Insomnia,132,87
149,Female,39,Lawyer,6.9,7,50,6,Normal,128/85,75,5500,No Disorder,128,85
150,Female,39,Accountant,8.0,9,80,3,Normal,115/78,67,7500,No Disorder,115,78
152,Male,39,Lawyer,7.2,8,60,5,Normal,130/85,68,8000,No Disorder,130,85
162,Female,40,Accountant,7.2,8,55,6,Normal,119/77,73,7300,No Disorder,119,77
164,Male,40,Lawyer,7.9,8,90,5,Normal,130/85,68,8000,No Disorder,130,85
166,Male,41,Lawyer,7.6,8,90,5,Normal,130/85,70,8000,Insomnia,130,85
167,Male,41,Engineer,7.3,8,70,6,Normal,121/79,72,6200,No Disorder,121,79
168,Male,41,Lawyer,7.1,7,55,6,Overweight,125/82,72,6000,No Disorder,125,82
170,Male,41,Lawyer,7.7,8,90,5,Normal,130/85,70,8000,No Disorder,130,85
175,Male,41,Lawyer,7.6,8,90,5,Normal,130/85,70,8000,No Disorder,130,85
178,Male,42,Salesperson,6.5,6,45,7,Overweight,130/85,72,6000,Insomnia,130,85
179,Male,42,Lawyer,7.8,8,90,5,Normal,130/85,70,8000,No Disorder,130,85
185,Female,42,Teacher,6.8,6,45,7,Overweight,130/85,78,5000,Sleep Apnea,130,85
187,Female,43,Teacher,6.7,7,45,4,Overweight,135/90,65,6000,Insomnia,135,90
188,Male,43,Salesperson,6.3,6,45,7,Overweight,130/85,72,6000,Insomnia,130,85
190,Male,43,Salesperson,6.5,6,45,7,Overweight,130/85,72,6000,Insomnia,130,85
192,Male,43,Salesperson,6.4,6,45,7,Overweight,130/85,72,6000,Insomnia,130,85
202,Male,43,Engineer,7.8,8,90,5,Normal,130/85,70,8000,Insomnia,130,85
204,Male,43,Engineer,6.9,6,47,7,Normal,117/76,69,6800,No Disorder,117,76
205,Male,43,Engineer,7.6,8,75,4,Overweight,122/80,68,6800,No Disorder,122,80
206,Male,43,Engineer,7.7,8,90,5,Normal,130/85,70,8000,No Disorder,130,85
210,Male,43,Engineer,7.8,8,90,5,Normal,130/85,70,8000,No Disorder,130,85
219,Male,43,Engineer,7.8,8,90,5,Normal,130/85,70,8000,Sleep Apnea,130,85
220,Male,43,Salesperson,6.5,6,45,7,Overweight,130/85,72,6000,Sleep Apnea,130,85
221,Female,44,Teacher,6.6,7,45,4,Overweight,135/90,65,6000,Insomnia,135,90
222,Male,44,Salesperson,6.4,6,45,7,Overweight,130/85,72,6000,Insomnia,130,85
223,Male,44,Salesperson,6.3,6,45,7,Overweight,130/85,72,6000,Insomnia,130,85
238,Female,44,Teacher,6.5,7,45,4,Overweight,135/90,65,6000,Insomnia,135,90
248,Male,44,Engineer,6.8,7,45,7,Overweight,130/85,78,5000,Insomnia,130,85
249,Male,44,Salesperson,6.4,6,45,7,Overweight,130/85,72,6000,No Disorder,130,85
250,Male,44,Salesperson,6.5,6,45,7,Overweight,130/85,72,6000,No Disorder,130,85
251,Female,45,Teacher,6.8,7,30,6,Overweight,135/90,65,6000,Insomnia,135,90
253,Female,45,Teacher,6.5,7,45,4,Overweight,135/90,65,6000,Insomnia,135,90
257,Female,45,Teacher,6.6,7,45,4,Overweight,135/90,65,6000,Insomnia,135,90
262,Female,45,Teacher,6.6,7,45,4,Overweight,135/90,65,6000,No Disorder,135,90
264,Female,45,Manager,6.9,7,55,5,Overweight,125/82,75,5500,No Disorder,125,82
265,Male,48,Doctor,7.3,7,65,5,Obese,142/92,83,3500,Insomnia,142,92
266,Female,48,Nurse,5.9,6,90,8,Overweight,140/95,75,10000,Sleep Apnea,140,95
268,Female,49,Nurse,6.2,6,90,8,Overweight,140/95,75,10000,No Disorder,140,95
269,Female,49,Nurse,6.0,6,90,8,Overweight,140/95,75,10000,Sleep Apnea,140,95
270,Female,49,Nurse,6.1,6,90,8,Overweight,140/95,75,10000,Sleep Apnea,140,95
274,Female,49,Nurse,6.2,6,90,8,Overweight,140/95,75,10000,Sleep Apnea,140,95
277,Male,49,Doctor,8.1,9,85,3,Obese,139/91,86,3700,Sleep Apnea,139,91
279,Female,50,Nurse,6.1,6,90,8,Overweight,140/95,75,10000,Insomnia,140,95
280,Female,50,Engineer,8.3,9,30,3,Normal,125/80,65,5000,No Disorder,125,80
281,Female,50,Nurse,6.0,6,90,8,Overweight,140/95,75,10000,No Disorder,140,95
282,Female,50,Nurse,6.1,6,90,8,Overweight,140/95,75,10000,Sleep Apnea,140,95
283,Female,50,Nurse,6.0,6,90,8,Overweight,140/95,75,10000,Sleep Apnea,140,95
299,Female,51,Engineer,8.5,9,30,3,Normal,125/80,65,5000,No Disorder,125,80
303,Female,51,Nurse,7.1,7,55,6,Normal,125/82,72,6000,No Disorder,125,82
304,Female,51,Nurse,6.0,6,90,8,Overweight,140/95,75,10000,Sleep Apnea,140,95
305,Female,51,Nurse,6.1,6,90,8,Overweight,140/95,75,10000,Sleep Apnea,140,95
307,Female,52,Accountant,6.5,7,45,7,Overweight,130/85,72,6000,Insomnia,130,85
309,Female,52,Accountant,6.6,7,45,7,Overweight,130/85,72,6000,Insomnia,130,85
313,Female,52,Engineer,8.4,9,30,3,Normal,125/80,65,5000,No Disorder,125,80
316,Female,53,Engineer,8.3,9,30,3,Normal,125/80,65,5000,Insomnia,125,80
317,Female,53,Engineer,8.5,9,30,3,Normal,125/80,65,5000,No Disorder,125,80
319,Female,53,Engineer,8.4,9,30,3,Normal,125/80,65,5000,No Disorder,125,80
325,Female,53,Engineer,8.3,9,30,3,Normal,125/80,65,5000,No Disorder,125,80
333,Female,54,Engineer,8.4,9,30,3,Normal,125/80,65,5000,No Disorder,125,80
339,Female,54,Engineer,8.5,9,30,3,Normal,125/80,65,5000,No Disorder,125,80
340,Female,55,Nurse,8.1,9,75,4,Overweight,140/95,72,5000,Sleep Apnea,140,95
342,Female,56,Doctor,8.2,9,90,3,Normal,118/75,65,10000,No Disorder,118,75
344,Female,57,Nurse,8.1,9,75,3,Overweight,140/95,68,7000,No Disorder,140,95
345,Female,57,Nurse,8.2,9,75,3,Overweight,140/95,68,7000,Sleep Apnea,140,95
350,Female,57,Nurse,8.1,9,75,3,Overweight,140/95,68,7000,Sleep Apnea,140,95
353,Female,58,Nurse,8.0,9,75,3,Overweight,140/95,68,7000,Sleep Apnea,140,95
359,Female,59,Nurse,8.0,9,75,3,Overweight,140/95,68,7000,No Disorder,140,95
360,Female,59,Nurse,8.1,9,75,3,Overweight,140/95,68,7000,No Disorder,140,95
361,Female,59,Nurse,8.2,9,75,3,Overweight,140/95,68,7000,Sleep Apnea,140,95
365,Female,59,Nurse,8.0,9,75,3,Overweight,140/95,68,7000,Sleep Apnea,140,95
367,Female,59,Nurse,8.1,9,75,3,Overweight,140/95,68,7000,Sleep Apnea,140,95