
Measures of Dispersion

In statistics, measures of dispersion describe how spread out, or variable, the values in a data set are. Knowing how far apart the data points lie gives you insight that the average (mean) alone cannot. Let's look at these concepts in more detail.

Why are measures of dispersion important?

Imagine two classes take a maths test, and each class has an average score of 70 out of 100. Does this mean that the two classes performed the same? Not necessarily. The average alone hides the variation in scores. If the scores in one class run from 50 to 90 while the scores in the other run from 68 to 72, the performances are quite different. Measures of dispersion highlight these differences by showing how widely the scores are spread out.

Types of measures of dispersion

There are several major measures of dispersion:

Range
Interquartile Range (IQR)
Variance
Standard Deviation

1. Range

Range is the simplest measure of dispersion. It is calculated as the difference between the maximum and minimum values in a data set. It tells you the span of your data.

Range = Maximum value - Minimum value

For example, let's say we have the following data set of scores:

Data: 10, 15, 20, 25, 30

The range is:

Range = 30 - 10 = 20

Although easy to calculate, the range uses only the two extreme values, so a single outlier can make it misrepresent the spread of the rest of the data.
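The calculation above can be sketched in a few lines of Python. The function name `data_range` is a hypothetical helper, and the value 100 is an outlier invented purely for illustration; it is not part of the original data set.

```python
def data_range(values):
    """Return the range: maximum value minus minimum value."""
    if not values:
        raise ValueError("values must not be empty")
    return max(values) - min(values)


scores = [10, 15, 20, 25, 30]
print(data_range(scores))  # 20

# Hypothetical outlier, added only to show how one extreme value changes the range.
scores_with_outlier = scores + [100]
print(data_range(scores_with_outlier))  # 90
```

One extra value moves the range from 20 to 90, even though the other five scores are unchanged.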

2. Interquartile Range (IQR)

The interquartile range (IQR) measures the spread of the middle half of the data. It is the difference between the upper quartile (Q3) and the lower quartile (Q1), so it is the width of the interval in which the central 50% of the data lies.

IQR = Q3 - Q1

To calculate the IQR, follow these steps:

Arrange the data in ascending order.
Identify the quartiles (Q1 and Q3).
Subtract Q1 from Q3.

Let's look at an example:

Data: 4, 8, 15, 16, 23, 42

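First, arrange the data (here it is already in order). Next, split the data into a lower half (4, 8, 15) and an upper half (16, 23, 42). Q1 is the median of the lower half and Q3 is the median of the upper half:

Q1 (25th percentile) = 8
Q3 (75th percentile) = 23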

Then calculate the IQR:

IQR = Q3 - Q1 = 23 - 8 = 15
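The steps above can be sketched in Python as follows. This is a minimal sketch that assumes the "median of halves" convention for quartiles; the function names `quartiles` and `iqr` are hypothetical helpers. Other conventions exist (for example, interpolated percentiles in statistics libraries) and can give slightly different quartiles for the same data.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: when the number of values is odd, the middle value
    is included in both halves.
    """
    data = sorted(values)  # step 1: arrange in ascending order
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    half = (n + 1) // 2  # size of each half
    lower = data[:half]
    upper = data[n - half:]
    return median(lower), median(upper)  # step 2: identify Q1 and Q3


def iqr(values):
    """Return the interquartile range Q3 - Q1."""
    q1, q3 = quartiles(values)
    return q3 - q1  # step 3: subtract Q1 from Q3


data = [4, 8, 15, 16, 23, 42]
print(quartiles(data))  # (8, 23)
print(iqr(data))        # 15
```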

[Figure: Visualizing the IQR]

3. Variance

Variance measures the average squared deviation from the mean. It shows how much the data points differ from the mean of the data set, and because the deviations are squared, it gives extra weight to outliers.

The formula for the variance \( \sigma^2 \) of a population is:

\[ \sigma^2 = \frac{\sum (x_i - \mu)^2}{N} \]

For a sample, we use:

\[ s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} \]

Where:

\( x_i \) = each value
\( \mu \) = mean of the population
\( \bar{x} \) = mean of the sample
\( N \) = number of values in the population
\( n \) = number of values in the sample

Example using sample variance:

Data: 6, 8, 10, 12, 14

Find the mean:

\[ \bar{x} = \frac{6 + 8 + 10 + 12 + 14}{5} = 10 \]

Calculate the squared deviations from the mean:

(6 - 10)^2 = 16
(8 - 10)^2 = 4
(10 - 10)^2 = 0
(12 - 10)^2 = 4
(14 - 10)^2 = 16

Sample variance (add the squared deviations and divide by n - 1):

\[ s^2 = \frac{16 + 4 + 0 + 4 + 16}{5 - 1} = \frac{40}{4} = 10 \]
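The two variance formulas can be sketched in Python like this. The function names `sample_variance` and `population_variance` are hypothetical helpers written for illustration; Python's standard `statistics` module offers `variance` and `pvariance` for the same purpose.

```python
def sample_variance(values):
    """Sample variance: sum of squared deviations divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)


def population_variance(values):
    """Population variance: sum of squared deviations divided by N."""
    n = len(values)
    if n < 1:
        raise ValueError("need at least one value")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


data = [6, 8, 10, 12, 14]
print(sample_variance(data))      # 10.0
print(population_variance(data))  # 8.0
```

The only difference between the two functions is the divisor, which is why the sample variance (10) is larger than the population variance (8) for the same five values.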

4. Standard deviation

The standard deviation is the square root of the variance, which provides a measure of dispersion in the same units as the original data, making it easier to understand intuitively.

For the variance we calculated earlier:

\[ s = \sqrt{10} \approx 3.16 \]

The standard deviation is valuable because it is expressed in the same units as the data, providing better context.
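As a quick check of this result, here is a short sketch using Python's standard `statistics` module, where `stdev` is the sample standard deviation (it divides by n - 1) and `variance` is the sample variance. The variable names are illustrative.

```python
import math
import statistics

data = [6, 8, 10, 12, 14]

s_squared = statistics.variance(data)  # sample variance
s = statistics.stdev(data)             # sample standard deviation

print(s_squared)              # 10
print(round(s, 2))            # 3.16
print(round(math.sqrt(s_squared), 2))  # 3.16, the square root of the variance
```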

[Figure: Visualizing Variance and Standard Deviation]

Choosing the right measure

Understanding each measure of dispersion helps you choose the right measure based on the context:

Range: Gives a quick check of the spread, but is very sensitive to outliers.
IQR: Better for skewed data because it is largely unaffected by outliers; it focuses on the spread of the middle 50% of the data.
Variance: Uses every data point, but is sensitive to outliers because the deviations are squared; useful for in-depth analysis.
Standard Deviation: Easiest to interpret and compare across data sets because it is in the same units as the data points.
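To see how differently these measures react to an outlier, the sketch below compares them before and after adding one extreme value. The data set is the one from the Range section; the outlier of 100 is hypothetical and added only for illustration. The helper names `iqr` and `spread` are also hypothetical, and the IQR uses the median-of-halves convention (middle value included in both halves when the count is odd), which is an assumption.

```python
import statistics


def iqr(values):
    """IQR using medians of the lower and upper halves (assumed convention)."""
    data = sorted(values)
    n = len(data)
    half = (n + 1) // 2
    return statistics.median(data[n - half:]) - statistics.median(data[:half])


def spread(values):
    """Return range, IQR, and sample standard deviation for a data set."""
    return {
        "range": max(values) - min(values),
        "iqr": iqr(values),
        "stdev": statistics.stdev(values),
    }


base = [10, 15, 20, 25, 30]
with_outlier = base + [100]  # hypothetical outlier

before = spread(base)
after = spread(with_outlier)
for name in before:
    print(f"{name}: {before[name]:.2f} -> {after[name]:.2f}")
```

With this data the range and the standard deviation both grow more than fourfold, while the IQR only moves from 10 to 15.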

Practical Example

Consider the following example of two data sets showing the miles run by two groups of athletes in a week:

Group A: 15, 16, 17, 18, 19
Group B: 10, 14, 17, 20, 23

Group A has a mean of 17 miles and Group B has a mean of 16.8 miles, so the two averages are almost identical. Now, calculate the measures of dispersion:

Range:
- Group A: 19 - 15 = 4
- Group B: 23 - 10 = 13

IQR (the data are already in order; with five values, the median is included in both halves when finding Q1 and Q3):
- Group A: Q1 = 16, Q3 = 18, IQR = 18 - 16 = 2
- Group B: Q1 = 14, Q3 = 20, IQR = 20 - 14 = 6

Variance:

Group A:

Mean = 17
(15 - 17)^2 = 4
(16 - 17)^2 = 1
(17 - 17)^2 = 0
(18 - 17)^2 = 1
(19 - 17)^2 = 4
\[ s^2 = \frac{4 + 1 + 0 + 1 + 4}{5 - 1} = \frac{10}{4} = 2.5 \]

Group B:

Mean = (10 + 14 + 17 + 20 + 23) / 5 = 16.8
(10 - 16.8)^2 = 46.24
(14 - 16.8)^2 = 7.84
(17 - 16.8)^2 = 0.04
(20 - 16.8)^2 = 10.24
(23 - 16.8)^2 = 38.44
\[ s^2 = \frac{46.24 + 7.84 + 0.04 + 10.24 + 38.44}{5 - 1} = \frac{102.8}{4} = 25.7 \]

Standard Deviation:
- Group A: \( \sqrt{2.5} \approx 1.58 \)
- Group B: \( \sqrt{25.7} \approx 5.07 \)

When comparing these measurements, Group B shows much greater dispersion than Group A, as indicated by its higher range, IQR, variance, and standard deviation. Although the two groups have nearly the same mean, the variability in their running distances is very different.
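Hand calculations like these are easy to get wrong, so it can help to recompute every measure directly from the data. The sketch below does that with Python's standard `statistics` module. The helper names `quartiles` and `summarize` are hypothetical, and the quartiles use the median-of-halves convention with the middle value included in both halves, which is an assumption; other quartile conventions can give different IQR values.

```python
import statistics


def quartiles(values):
    """Return (Q1, Q3) as medians of the lower and upper halves."""
    data = sorted(values)
    n = len(data)
    half = (n + 1) // 2  # middle value is shared when n is odd
    return statistics.median(data[:half]), statistics.median(data[n - half:])


def summarize(values):
    """Return the mean and the four measures of dispersion."""
    q1, q3 = quartiles(values)
    return {
        "mean": statistics.mean(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "stdev": statistics.stdev(values),        # sample standard deviation
    }


groups = {
    "Group A": [15, 16, 17, 18, 19],
    "Group B": [10, 14, 17, 20, 23],
}

for name, miles in groups.items():
    stats = summarize(miles)
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

Printing both summaries side by side makes the comparison between the two groups immediate.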

Conclusion

Measures of dispersion include a variety of tools that provide information about the variability of data, helping you estimate the reliability and volatility of data points in a set. Each measure has its own strengths and weaknesses depending on the nature and context of the data you are analyzing, allowing you to approach data analysis from a broader perspective.

Understanding and using measures of dispersion enables you to describe data sets more completely, which in turn leads to better-informed decision-making in real-world scenarios, scientific research, economics, and many other fields. By mastering these concepts, you develop a strong foundation in statistics that enhances your ability to effectively analyze and interpret data.
