Skip to content

RowTimis99CDS/aus-health-sql-portfolio

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

17 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Australian Public Health Data: SQL Analysis

(aus-health-sql-portfolio)

Project overview

This project demonstrates practical SQL skills using Australian Institute of Health and Welfare (AIHW) survey data on hospital usage patterns across Australia's Primary Health Network (PHN) regions.

The goal is not database engineering, but to show clear, well-documented SQL transformations and analytical thinking using real public health data - the kind of work a data analyst does daily.

Skills Demonstrated

  • Data loading & staging: Loading CSV data into PostgreSQL with deliberate design choices
  • Data cleaning & transformation: Handling non-numeric values, type conversions, business rule application
  • SQL techniques: CTEs, aggregations, views, JOINs across related tables
  • Analytical thinking: Formulating business questions, interpreting results, drawing insights
  • Documentation: Clear code comments, reproducible workflow, assumption documentation

Data Source

  • Source: Australian Institute of Health and Welfare (AIHW)
  • Datasets:
    • Percentage of adults who went to hospital ED in preceding 12 months, by PHN
    • Percentage of adults admitted to hospital in preceding 12 months, by PHN
  • License: Creative Commons Attribution 4.0 (CC BY 4.0)
  • URL: aihw-hpf-52-data-tables

Workflow

1. Raw load

(01_raw_load.sql)

Raw CSV files are loaded into staging tables with minimal constraints.

  • Raw CSV files are loaded manually into staging tables using pgAdmin.
  • Load scripts intentionally exclude client-specific commands (e.g. \copy) to keep SQL portable.
  • All fields are stored as text to avoid load failures due to formatting issues (e.g. “NP” values).
  • No transformations are applied at this stage.
  • Staging tables act as a reproducible snapshot of the source data.

2. Cleaned data

(02_clean_transform.sql)

Cleaned tables are created from the staging layer with explicit transformations:

  • Percentage fields are converted to numeric values.
  • Rows containing non-published values (NP) are excluded from analysis tables.
  • Financial years (e.g. 2013–14) are converted to a single numeric year using the ending financial year as a business rule (e.g. 2014). This layer produces analysis-ready tables with consistent data types and documented assumptions.

3. Analysis

(03_analysis.sql)

Q1: How do hospital admission rates vary by PHN?

Query: vw_admission_by_phn

Finding: Across 30 PHN areas, most (19/30) had admission rates above the national average of 12.8%.

  • Highest: North Coast NSW (15.74%), Darling Downs QLD (15.64%)
  • Lowest: Western Sydney (10.62%), Central & Eastern Sydney (10.64%)

Business insight: Regional variation in hospital demand - useful for resource allocation and policy targeting.


Q2: Are emergency visits consistently higher than hospital admissions?

Query: vw_visits_vs_admissions_gap

Finding: The gap between ED visits and admissions ranged from -0.88 to +6.46 percentage points across PHNs, with most showing 1-3 points difference.

Business insight: ED visits generally exceed admissions, indicating pressure on emergency services varies significantly by region.


Q3: How have admission rates changed over time nationally?

Query: vw_avg_admissions_by_year

Finding: Over five years, admission rates remained stable (12.7% to 13.7%) with no clear trend.

Business insight: National admission rates are stable - no major policy or population-driven shifts in this period.

What I Learned

  • Staging tables matter: Loading raw data as text first prevents load failures and makes debugging easier
  • Document your business rules: Converting "2013–14" to 2014 is a decision that needs to be explicit and traceable
  • Views keep analysis clean: Building views on cleaned data rather than raw data makes queries simpler and more reliable

How to Run

  1. Install PostgreSQL (tested on version 9.6)
  2. Create database: CREATE DATABASE aihw_analysis;
  3. Run scripts in order:
    • 01_raw_load.sql - creates staging tables
    • Load CSV files via pgAdmin (instructions in script comments)
    • 02_clean_transform.sql - creates cleaned tables
    • 03_analysis.sql - creates views and runs analysis queries

About

Demonstrates practical SQL skills using Australian Institute of Health and Welfare (AIHW) survey data, focusing on realistic data loading, cleaning, and analysis preparation tasks typical of a data analyst role. Emphasises well-documented SQL transformations and analytical thinking, rather than database engineering.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors