- Really good collection of video lecture series by Brandon Foltz
- Statistics and probability course from Khan Academy
- Highly recommended: Practical Statistics for Data Scientists, 2nd Edition (O'Reilly link, Amazon)
- Population vs. Sample
- Population vs. Sample video from Khan Academy
- Population vs. Sample video by Brandon Foltz
- Ch 2: Data and Sampling Distributions, from the book 'Practical Statistics for Data Scientists' (Safari link, Amazon link)
- Mean/Median/Mode - video by Brandon Foltz
- Standard deviation - video by Brandon Foltz
- Trimmed Mean
- Trimmed Mean - video
- Graphing categorical variables - video by Brandon Foltz
- Understanding histograms - video by Brandon Foltz
- Covariance - video by Brandon Foltz
- Correlation - video by Brandon Foltz
- Covariance Matrix - video by Brandon Foltz
- Ch 1: Exploratory Data Analysis, from the book 'Practical Statistics for Data Scientists' (Safari link, Amazon link)
You should be familiar with the following:
- Population vs. sample
- Variable types (quantitative, discrete, continuous)
- Plot basic graphs
- Compute summary statistics such as mean, median, variance, and standard deviation
- Correlation
These are simple exercises designed to reinforce your learning so far.
★☆☆ - Easy
★★☆ - Medium
★★★ - Challenging
★★★★ - Bonus
We have some sample salary data (in thousands) from two cities.
city1 = [15, 12, 20, 25, 50, 35, 75, 80, 60, 45, 36]
city2 = [40, 42, 45, 60, 55, 52, 56, 52, 62, 57, 48]
Calculate the mean, median, variance, and standard deviation for each city.
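One way to check your answers is with Python's standard `statistics` module. This is only a sketch: the helper name `summarize` is made up for illustration, and it assumes the lists are samples, so it uses the sample variance (dividing by n - 1) rather than the population variance.

```python
import statistics

city1 = [15, 12, 20, 25, 50, 35, 75, 80, 60, 45, 36]
city2 = [40, 42, 45, 60, 55, 52, 56, 52, 62, 57, 48]

def summarize(salaries):
    """Return mean, median, sample variance and sample standard deviation."""
    return {
        "mean": statistics.mean(salaries),
        "median": statistics.median(salaries),
        "variance": statistics.variance(salaries),  # sample variance, divides by n - 1
        "std_dev": statistics.stdev(salaries),      # square root of the sample variance
    }

summary1 = summarize(city1)
summary2 = summarize(city2)
for name, summary in (("city1", summary1), ("city2", summary2)):
    print(name, {key: round(value, 2) for key, value in summary.items()})
```

If you would rather treat each list as a full population, `statistics.pvariance` and `statistics.pstdev` divide by n instead.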
- Read nba player stats data
- Extract the Salary column
- Find min/max/mean/median of salary
- Is there a large variance in salary? How will you find out?
- Find the 10% trimmed mean of salary
- Do some plots for salary
Hint: boxplot and histograms
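For the trimmed mean, the idea can be sketched in plain Python as below. It assumes the common convention that a 10% trimmed mean drops 10% of the sorted values from each end (the convention `scipy.stats.trim_mean` also follows). The function name and the salary values are hypothetical and are not taken from the NBA data.

```python
def trimmed_mean(values, proportion=0.10):
    """Mean after dropping `proportion` of the sorted values from each end."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of values cut from each tail
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical salaries with one extreme value (illustration only)
salaries = [2, 3, 3, 4, 4, 5, 5, 6, 7, 100]
print(sum(salaries) / len(salaries))  # plain mean, pulled up by the outlier
print(trimmed_mean(salaries, 0.10))   # mean of the middle eight values
```

Comparing the two printed numbers shows why the trimmed mean is useful when a few very large salaries dominate the ordinary mean.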
- Read nba player stats data
- Extract the Height and Weight columns
- You will notice the height is in feet-inches format, for example 6-4. You will need to convert this to a single numeric value.
- Create a new column called height_cm
- Conversion formula is:
cm = feet * 30.48 + inches * 2.54
- Create a new column called
- Is there a correlation between height and weight?
- Create a plot to illustrate the relationship
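The height conversion in this exercise could look roughly like the sketch below. The function name `height_to_cm` is an assumption made for illustration, and it expects every value to be a clean string of the form feet-inches, such as 6-4; missing or malformed values in the real file would need extra handling.

```python
def height_to_cm(height):
    """Convert a 'feet-inches' string such as '6-4' to centimetres."""
    feet, inches = height.split("-")
    # Conversion formula from the exercise: cm = feet * 30.48 + inches * 2.54
    return int(feet) * 30.48 + int(inches) * 2.54

print(height_to_cm("6-4"))
```

With pandas, and assuming the data frame is named `df` and the column is named `Height`, the new column could then be built with something like `df["height_cm"] = df["Height"].apply(height_to_cm)`.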
- Read house-sales.csv
- Create a correlation matrix for this data
- Analyze which attributes affect Saleprice
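To see what a correlation matrix contains, here is a minimal pure-Python sketch that computes the Pearson correlation for every pair of columns. The column names and values are invented for illustration and are not taken from house-sales.csv; the helper names `pearson` and `correlation_matrix` are also hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Map every pair of column names to its Pearson correlation."""
    return {a: {b: pearson(columns[a], columns[b]) for b in columns} for a in columns}

# Hypothetical columns and values, NOT from house-sales.csv
data = {
    "price": [200, 250, 310, 400, 480],
    "area": [90, 110, 140, 180, 210],
    "age": [40, 35, 20, 12, 5],
}
matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

Values near 1 or -1 in the price row point to attributes that move strongly with the price, while values near 0 suggest little linear relationship. In pandas, `DataFrame.corr()` computes the same kind of Pearson matrix for numeric columns.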