This repository restructures the provided analysis script into a reproducible R project. The layout separates reusable functions, data storage, scripts, and tests so you can continue developing or share the work on GitHub.
- R/ – reusable functions (config.R, data_loading.R, feature_engineering.R).
- scripts/ – analysis entry points (run_analysis.R).
- data-raw/ – store source data (expected: database_final.xlsx).
- data/ – optional processed data exports.
- results/ – automatically created to hold RDS/metadata outputs.
- tests/ – testthat unit tests for the data-processing helpers.
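As a quick sanity check, the layout above can be verified from the repository root before running anything. The following is only a minimal sketch: check_layout() is a hypothetical helper and not part of the repository. The directory names are taken from the list above; data/ and results/ are left out because they are optional or created automatically.

```r
# Hypothetical helper (not part of the repository): return the names of the
# expected top-level directories that do not exist under `root`.
check_layout <- function(root = ".",
                         expected = c("R", "scripts", "data-raw", "tests")) {
  present <- dir.exists(file.path(root, expected))
  expected[!present]
}

# Report the result for the current working directory.
missing_dirs <- check_layout()
if (length(missing_dirs) > 0) {
  message("Missing directories: ", paste(missing_dirs, collapse = ", "))
} else {
  message("All expected directories are present.")
}
```

Because data-raw/ is git-ignored, a fresh clone may report it as missing until the dataset has been added.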
- Install the required R packages (one time):
      source("R/config.R")
      load_required_packages()
- Place your raw Excel dataset at data-raw/database_final.xlsx (or update the path in scripts/run_analysis.R).
- Execute the main script from the repository root:
      Rscript scripts/run_analysis.R

Outputs are written to results/:
- database_final_cleaned.rds – cleaned wide data.
- database_long.rds – long-format version ready for modeling.
- metadata.txt – human-readable summary of detected variables.
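Once the script has finished, the two RDS files can be loaded back into an R session with base R's readRDS(). The sketch below assumes the file names listed above and wraps the loading in a hypothetical helper, read_results(), which is not part of the repository. The structure of the returned objects depends on your dataset.

```r
# Hypothetical helper (not part of the repository): load the two RDS outputs
# from `dir`, stopping with a clear message if either file is missing.
read_results <- function(dir = "results") {
  files <- c(
    cleaned = file.path(dir, "database_final_cleaned.rds"),
    long    = file.path(dir, "database_long.rds")
  )
  missing <- files[!file.exists(files)]
  if (length(missing) > 0) {
    stop(
      "Output file(s) not found: ", paste(missing, collapse = ", "),
      ". Run scripts/run_analysis.R first."
    )
  }
  # lapply() keeps the names of `files`, so the result is a named list
  # with elements `cleaned` and `long`.
  lapply(files, readRDS)
}
```

Calling outputs <- read_results() from the repository root then gives the wide data as outputs$cleaned and the long-format data as outputs$long.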
Run the lightweight unit tests to confirm the helpers behave as expected:
      Rscript -e 'testthat::test_dir("tests/testthat")'

- The repository is already organized with a clear layout and .gitignore entries for common R artifacts.
- Add your data to data-raw/ (it is git-ignored by default) and push the branch to GitHub.