Course repository for BDATA 200 Introduction to Data Studies at the University of Washington.
| | |
|---|---|
| Instructor | Pedro Albuquerque, PhD |
| Office Hours | Fridays, 4:00 PM – 6:00 PM (by appointment, booked at least 24 hours in advance) |

| Component | Weight |
|---|---|
| Weekly Assignments (discussions, in-class activities, readings) | 25% |
| Project 1: Kaggle Dataset (Pitch + Video + Reflection) | 25% |
| Project 2: Data Analysis Report (Literature + Pitch + Report + Presentation + Reflection) | 30% |
| Social Media Post | 20% |
- Google Account — required to access Google Colab, the primary coding environment
- Personal laptop recommended (must run Microsoft Excel and access Google Colab)
- Laptop lending available through the UW IT Laptop & WiFi Hotspot Lending Program
- Data Storytelling — Visualization, narrative strategies, and masters of data storytelling (Hans Rosling, David McCandless, Nate Silver)
- Data Workflow & Working with Data — Reproducible workflows, Google Colab, Jupyter Notebooks, Python fundamentals, NumPy & pandas
- Data Visualization Libraries — matplotlib, plotnine (ggplot-style), plotly (interactive charts)
- Data Analysis & Statistics — Probability, research design, descriptive & inferential statistics, correlation vs. causation
- Misleading Plots & Graph Design — Best practices, color accessibility, detecting deception in data
- Multivariate Analysis & Visualization — Dimensionality reduction (PCA, t-SNE, UMAP), clustering (K-Means, Hierarchical, DBSCAN)
- Network Visualization — Graph structures, community detection, NetworkX & Pyvis
- Spatial Analysis & Visualization — Spatial statistics, Moran's I, choropleth maps, geopandas & folium
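
As a small illustration of the spatial statistics topic above, the following is a minimal sketch of global Moran's I in plain Python. It is not taken from the course materials: the function names `morans_i` and `line_adjacency`, the toy values, and the choice of binary neighbor weights along a line are all assumptions made for the example. In practice a library from the geopandas ecosystem would normally handle the weights and the significance test.

```python
def line_adjacency(n):
    """Binary spatial weights for n locations on a line (hypothetical helper).

    Each location is a neighbor of the one before and after it.
    """
    weights = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        weights[i][i + 1] = 1
        weights[i + 1][i] = 1
    return weights


def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weights matrix.

    I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2
    where W is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    squared = sum(d * d for d in dev)
    return (n / total_weight) * (cross / squared)


w = line_adjacency(4)

# Similar values sit next to each other: positive spatial autocorrelation.
print(round(morans_i([1, 2, 3, 4], w), 2))  # 0.33

# High and low values alternate: negative spatial autocorrelation.
print(round(morans_i([1, 4, 1, 4], w), 2))  # -1.0
```

Values near +1 suggest that similar observations cluster in space, values near -1 suggest that neighbors tend to differ, and values near 0 suggest no clear spatial pattern.
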
LLMs may be used to support your work, but the reasoning must remain your own. Use them for code, summaries, and exploration, but always explain why your approach is correct, never submit AI output you don't understand, and cite any use of AI.