Author: Jose Rodrigo Chacon
Date: September 2023
Dataset Source: Fitbit dataset on Kaggle
This project analyzes Fitbit data to explore health-related questions about activity patterns, calorie burn, exercise duration, and sleep quality.
- The selected Fitbit datasets cover daily activity, calorie burn, steps taken, and sleep.
- The research questions include identifying the most active hours and days of the week, exploring the relationship between steps and calorie burn, and investigating the impact of exercise duration on sleep quality.
- Statistical analysis includes histograms, line plots, correlation matrices, and hypothesis tests to answer these questions.
- Regression analysis is conducted to examine how activity levels influence sleep duration.
- Data preprocessing involves merging datasets, selecting predictors, and categorizing data based on activity levels and sleep patterns.
- Various regression models are evaluated using metrics like the root mean squared error (RMSE) to determine the best model for predicting sleep duration based on activity levels.
- The discussion addresses the quality and consistency of the Fitbit data and highlights the need for improvements such as a larger sample size and more consistent data.
- Recommendations for future research include using experimental manipulations to support causal inference and addressing limitations such as missing data points and potential confounding factors.
In summary, the analysis uses Fitbit data to examine how activity levels relate to calorie burn and sleep, and it offers insights and recommendations for future research.
- Data preprocessing
- Exploratory data analysis
- Statistical analysis
- Regression modeling
- Data visualization
- Report writing and communication