Variance, Standard Deviation, and Standardization
Video Overview
What We're Learning
- How to measure spread/dispersion in data
- Variance and Standard Deviation
- How to compare different datasets
Core Concept: Measuring Spread
The Big Idea: Average alone isn't enough! Two datasets can have the same average but look totally different.
Example to Remember:
- Dataset 1: 9, 9, 9, 9, 9 (average = 9)
- Dataset 2: 1, 4, 9, 12, 19 (average = 9)
Both have the same average, but Dataset 2 is way more spread out!
Variance (s²)
What It Measures
How far, on average, each data point is from the mean (but squared).
Formula
$$s^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$$
Memory Hack: "The Three Steps Dance"
- Subtract: Find how far each value is from the mean (xᵢ - x̄)
- Square: Make all differences positive (xᵢ - x̄)²
- Average: Add them up and divide by n
Why Square?
- Gets rid of negative values (otherwise they'd cancel out!)
- Emphasizes extreme values (outliers get more weight)
Quick Example
Dataset: 1, 4, 9, 12, 19
- Mean (x̄) = 9
- Variance (s²) = 39.6
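The three steps can be sketched in a few lines of Python. This is a minimal illustration that assumes the divide-by-n definition used in these notes; the function name `variance` is our own choice, not something from the lecture.

```python
def variance(values):
    """Average squared distance from the mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    # Steps 1 and 2: subtract the mean, then square each difference
    squared_diffs = [(x - mean) ** 2 for x in values]
    # Step 3: average the squared differences
    return sum(squared_diffs) / n


dataset_1 = [9, 9, 9, 9, 9]
dataset_2 = [1, 4, 9, 12, 19]

print(variance(dataset_1))  # 0.0  (no spread at all)
print(variance(dataset_2))  # 39.6 (same mean, much more spread)
```

Both datasets have mean 9, but only the variance reveals how different they are.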
Standard Deviation (s)
What It Is
Just the square root of the variance, which brings us back to the original units!
Formula
$$s = \sqrt{s^2} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$$
Memory Hack
"Variance's little brother": same thing but in normal units (not squared)
Why Use It?
- Easier to interpret (same units as your data)
- If measuring height in cm, SD is in cm (not cm²)
Quick Example
- Variance = 39.6
- Standard Deviation = √39.6 ≈ 6.29
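To check the quick example, here is a small sketch using Python's standard `statistics` module. It assumes the divide-by-n convention from these notes, which corresponds to `pvariance` and `pstdev`; the similarly named `variance` and `stdev` divide by n - 1 and would give different numbers.

```python
import math
import statistics

data = [1, 4, 9, 12, 19]

var = statistics.pvariance(data)   # divide-by-n variance
sd_from_var = math.sqrt(var)       # square root of the variance
sd_direct = statistics.pstdev(data)  # same quantity, computed directly

print(round(var, 2))          # 39.6
print(round(sd_from_var, 2))  # 6.29
print(round(sd_direct, 2))    # 6.29
```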
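Frequency Table Version
When data is in a frequency table:
$$s^2 = \frac{1}{n}\sum_{i} f_i\,(x_i - \bar{x})^2, \qquad n = \sum_{i} f_i, \qquad \bar{x} = \frac{1}{n}\sum_{i} f_i\,x_i$$
Memory Hack
Same formula, but now multiply by frequency (fᵢ) because some values appear more times!
Example: Coffee Consumption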
| Cups (x) | Frequency (f) |
|---|---|
| 0 | 10 |
| 1 | 16 |
| 2 | 12 |
| 3 | 27 |
Each value gets weighted by how often it appears.
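As a sketch of how the weighting works on the coffee table, the snippet below computes the mean, variance, and SD directly from the (value, frequency) pairs. It assumes the divide-by-n convention used in these notes; the variable names are illustrative.

```python
cups = [0, 1, 2, 3]
freq = [10, 16, 12, 27]

n = sum(freq)  # total number of observations
# Weighted mean: each value counts as many times as it appears
mean = sum(f * x for x, f in zip(cups, freq)) / n
# Weighted variance: each squared deviation is multiplied by its frequency
variance = sum(f * (x - mean) ** 2 for x, f in zip(cups, freq)) / n
sd = variance ** 0.5

print(n)                   # 65
print(round(mean, 2))      # 1.86
print(round(variance, 2))  # 1.26
print(round(sd, 2))        # 1.12
```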
Linear Transformations (The Magic Rules!)
Rule 1: Adding/Subtracting a Constant (± a)
What happens: NOTHING to variance or SD!
New variance = Old variance
New SD = Old SD
Memory Hack: "Shifting doesn't change spread"
Think of it like moving all your data points together: the spread between them stays the same!
Example
- Original scores: 70, 80, 90
- Add 10 to each: 80, 90, 100
- The spread is identical!
Rule 2: Multiplying/Dividing by a Constant (× b or ÷ b)
What happens:
- Variance gets multiplied by b²
- SD gets multiplied by |b|
- Dividing by b is the same as multiplying by 1/b
Memory Hack: "Stretching stretches the spread"
- If you double all values → spread (SD) doubles
- But variance squares the factor → 2² = 4, so variance quadruples
Example: Grade Adjustment
Scores: 91, 77, 61, 83, 88, 71, 89
- Mean = 80
- Lecturer adds 5% to each grade
- Multiply by 1.05 (which is b)
- New variance = (1.05)² × old variance = 1.1025 × old variance
- New SD = 1.05 × old SD
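Both transformation rules can be checked numerically on the grade data. This is a rough sketch using the standard `statistics` module with the divide-by-n functions; the shift of 10 points for Rule 1 is an assumption chosen for illustration, not part of the grade example.

```python
import math
import statistics

scores = [91, 77, 61, 83, 88, 71, 89]

var_old = statistics.pvariance(scores)
sd_old = statistics.pstdev(scores)

# Rule 1: adding a constant (10 is an illustrative choice) leaves spread unchanged
shifted = [x + 10 for x in scores]
# Rule 2: a 5% raise multiplies every score by b = 1.05
scaled = [x * 1.05 for x in scores]

print(statistics.mean(scores))  # 80
print(math.isclose(statistics.pvariance(shifted), var_old))             # True
print(math.isclose(statistics.pvariance(scaled), 1.05 ** 2 * var_old))  # True
print(math.isclose(statistics.pstdev(scaled), 1.05 * sd_old))           # True
```

`math.isclose` is used instead of `==` because multiplying by 1.05 introduces tiny floating-point rounding differences.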
Coefficient of Variation (CV)
What It Is
A relative measure of spread that lets you compare different datasets, even with different units!
Formula
$$CV = \frac{s}{\bar{x}}$$
Memory Hack: "SD per unit of average"
Shows how big the SD is relative to the mean.
When to Use It
Comparing apples to oranges!
Example: Height vs Weight
- Height: Mean = 174 cm, SD = 14 cm
  - CV = 14/174 ≈ 0.080 = 8%
- Weight: Mean = 60 kg, SD = 9 kg
  - CV = 9/60 = 0.150 = 15%
Interpretation: Students are more homogeneous (similar) in height than weight!
The Rule
Lower CV = More homogeneous (less spread)
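The height vs weight comparison can be reproduced with a tiny helper. This is only a sketch; the function name `coefficient_of_variation` is our own, and it assumes a positive mean (CV is not meaningful when the mean is zero or negative).

```python
def coefficient_of_variation(mean, sd):
    """SD expressed as a fraction of the mean (unit-free)."""
    return sd / mean


cv_height = coefficient_of_variation(174, 14)  # cm cancels out
cv_weight = coefficient_of_variation(60, 9)    # kg cancels out

print(f"{cv_height:.1%}")     # 8.0%
print(f"{cv_weight:.1%}")     # 15.0%
print(cv_height < cv_weight)  # True: heights are more homogeneous
```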
Z-Score (Standard Score)
What It Is
Tells you how many standard deviations a value is from the mean.
Formula
$$z = \frac{x - \bar{x}}{s}$$
Memory Hack: "Distance in SD units"
- z = 0 → exactly at average
- z = 1 → one SD above average
- z = -2 → two SDs below average
Reverse Formula
If you know the z-score and want to find the original value:
$$x = \bar{x} + z \cdot s$$
Z-Score: Practical Example
The Question
Student got 85 in Statistics and 80 in Economics. Where did they do better?
Statistics Course
- Mean = 82, SD = 5
- Student's score = 85
- z = (85 - 82) / 5 = 0.6
Economics Course
- Mean = 70, SD = 3
- Student's score = 80
- z = (80 - 70) / 3 ≈ 3.33
Answer: Student excelled more in Economics! (3.33 SDs above average vs only 0.6)
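The same comparison, sketched in Python. The helper names `z_score` and `from_z` are hypothetical; they simply encode the z-score formula and its reverse.

```python
def z_score(x, mean, sd):
    """How many SDs the value x lies from the mean."""
    return (x - mean) / sd


def from_z(z, mean, sd):
    """Reverse formula: recover the original value from a z-score."""
    return mean + z * sd


z_stats = z_score(85, mean=82, sd=5)
z_econ = z_score(80, mean=70, sd=3)

print(round(z_stats, 2))  # 0.6
print(round(z_econ, 2))   # 3.33
print(z_econ > z_stats)   # True: relatively stronger in Economics
print(round(from_z(z_stats, mean=82, sd=5), 2))  # 85.0
```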
Z-Score Properties (Super Important!)
No matter what your original data looks like, after standardizing:
- Mean of the z-scores = 0
- SD of the z-scores = 1 (so the variance is also 1)
Why This Matters
Z-scores standardize everything, so you can compare different variables!
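A quick way to convince yourself of the "always 0 and 1" property is to standardize a dataset and measure the result. The sketch below reuses the dataset 1, 4, 9, 12, 19 from the variance example and assumes the divide-by-n SD (`statistics.pstdev`).

```python
import math
import statistics

data = [1, 4, 9, 12, 19]
mean = statistics.mean(data)
sd = statistics.pstdev(data)

# Standardize: convert every value to its z-score
z = [(x - mean) / sd for x in data]

# Compare with a tolerance, since floating-point results are not exact
print(math.isclose(statistics.mean(z), 0.0, abs_tol=1e-12))  # True
print(math.isclose(statistics.pstdev(z), 1.0))               # True
```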
Comparison Table
| Student | Height | Weight | IQ |
|---|---|---|---|
| Evyatar | 175 | 70 | 100 |
| Iti | 160 | 70 | 130 |
| Ariel | 165 | 60 | 90 |
| Mean | 165 | 60 | 100 |
| SD | 10 | 12 | 12 |
Calculate z-scores to compare!
- Evyatar height: z = (175-165)/10 = 1.0
- Iti IQ: z = (130-100)/12 = 2.5 → Most outstanding!
- Ariel weight: z = (60-60)/12 = 0.0 (exactly average)
Memory Hacks Summary
| Concept | Memory Trick |
|---|---|
| Variance | "Average squared distance" |
| SD | "Variance's little brother in normal units" |
| Adding constant | "Shifting doesn't change spread" |
| Multiplying constant | "Stretching stretches the spread" |
| CV | "SD per unit of average" |
| Z-score | "Distance in SD units" |
| Z properties | "Always 0 and 1 after standardizing" |
True/False Practice Problems
Problem 1
"If distribution is negatively skewed, the z-score of the 5th decile will be negative."
Answer: FALSE!
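- 5th decile = median = middle value
- The z-score of the mean is always 0, so the sign of the median's z-score depends on whether the median is above or below the mean
- In a negatively skewed distribution, the long left tail typically pulls the mean below the median
- So the median's z-score is typically positive, not negative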
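To see Problem 1 in numbers, here is a small sketch with a made-up, negatively skewed dataset (one very low value forming a left tail). The data are hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical data with a long left tail (negative skew)
data = [1, 8, 9, 10, 10]

mean = statistics.mean(data)      # pulled down by the low value
median = statistics.median(data)  # the 5th decile
sd = statistics.pstdev(data)

z_median = (median - mean) / sd

print(mean)                # 7.6
print(median)              # 9
print(round(z_median, 2))  # 0.41 (positive, not negative)
```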
Problem 2
"After a 7% salary increase for everyone, Daniela's z-score (originally 2.5) will increase."
Answer: FALSE!
- Everyone gets the same percentage increase
- This is multiplication by a constant (× 1.07)
- Z-score stays the same! (both numerator and denominator are multiplied by the same amount)
- Her relative position doesn't change
Why?
Original:
$$z = \frac{x - \bar{x}}{s} = 2.5$$
After:
$$z' = \frac{1.07x - 1.07\bar{x}}{1.07s} = \frac{x - \bar{x}}{s} = 2.5$$
The 1.07 cancels out!
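The cancellation can also be checked numerically. The salaries below are invented for illustration (they are not Daniela's actual data, so the z-score is not 2.5); the point is only that the z-score is identical before and after the raise.

```python
import math
import statistics

# Hypothetical salaries; the last one plays the role of the high earner
salaries = [4000, 5000, 5500, 6000, 9500]
raised = [x * 1.07 for x in salaries]  # 7% raise for everyone


def z_of_last(values):
    """z-score of the last value, using the divide-by-n SD."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return (values[-1] - mean) / sd


z_before = z_of_last(salaries)
z_after = z_of_last(raised)

print(math.isclose(z_before, z_after))  # True: relative position unchanged
```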
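Quick Reference Formulas
Variance:
$$s^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$$
Standard Deviation:
$$s = \sqrt{s^2}$$
Frequency Table Variance:
$$s^2 = \frac{1}{n}\sum_{i} f_i\,(x_i - \bar{x})^2$$
Coefficient of Variation:
$$CV = \frac{s}{\bar{x}}$$
Z-Score:
$$z = \frac{x - \bar{x}}{s}$$
Reverse Z-Score:
$$x = \bar{x} + z \cdot s$$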
Final Tips
- Always check units: SD should match your data units
- Use CV for comparisons: Different units? Use CV!
- Z-scores are universal: Best for comparing across different scales
- Remember the transformation rules: Save time on exams!
- Negative z-score = below average; positive = above average
Good luck!