Hypothesis Testing
Hypothesis testing is a fundamental concept in statistics, used to make judgements about the characteristics of a population. It is a method that allows us to use sample data to decide between two competing hypotheses about a population parameter.
Introduction to hypothesis testing
At its core, hypothesis testing is a process by which we check whether a statement (hypothesis) about a population parameter is plausible given the sample data we have. The main elements of hypothesis testing are the null hypothesis, the alternative hypothesis, the test statistic, the rejection region, and the conclusion.
Null and alternative hypotheses
The null hypothesis (denoted as H₀) is a statement about the population parameter we want to test. It is usually a statement of no effect or no difference. In contrast, the alternative hypothesis (denoted as H₁ or Hₐ) is what we assume to be true if the null hypothesis is rejected. It represents an effect or a difference.
Null hypothesis (H₀): μ = μ₀
Alternative hypothesis (Hₐ): μ ≠ μ₀ (two-tailed)
Alternative hypothesis (Hₐ): μ > μ₀ (right-tailed)
Alternative hypothesis (Hₐ): μ < μ₀ (left-tailed)
Test statistic
A test statistic is a value calculated from sample data that is used to decide how compatible the data are with the null hypothesis. The choice of test statistic depends on the type of data and the hypothesis being tested. Common examples include the z-statistic, t-statistic, and F-statistic.
For example, if we want to test a hypothesis about a mean, we can calculate the z-score for the sample mean:
z = (x̄ - μ₀) / (σ/√n)
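As a minimal sketch, the same calculation can be written in Python; the numbers below (sample mean 1012, hypothesized mean 1000, σ = 50, n = 25) are purely illustrative.

```python
import math

def z_statistic(sample_mean, mu0, sigma, n):
    """One-sample z statistic: (x̄ - μ0) / (σ / √n)."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

# Hypothetical numbers: x̄ = 1012, μ0 = 1000, σ = 50, n = 25.
print(z_statistic(1012, 1000, 50, 25))  # 12 / (50/5) = 1.2
```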
Rejection region
The rejection region is determined by the significance level, denoted by α, which is the probability of rejecting the null hypothesis when it is actually true. Common choices for α are 0.05, 0.01, and 0.10.
If the test statistic falls in the rejection region, we reject the null hypothesis in favor of the alternative hypothesis.
Conclusion
Based on the test statistic and the rejection region, we draw conclusions. If the test statistic falls in the rejection region, we reject the null hypothesis, suggesting that there is enough evidence to support the alternative hypothesis. If it does not fall in the rejection region, we fail to reject the null hypothesis.
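To make the decision rule concrete, here is a small Python sketch using scipy.stats.norm to find a two-tailed critical value; the observed z value is a made-up illustration.

```python
from scipy.stats import norm

alpha = 0.05
z_observed = 2.1                      # hypothetical test statistic

z_crit = norm.ppf(1 - alpha / 2)      # two-tailed critical value, ≈ 1.96
if abs(z_observed) > z_crit:
    print("Reject H0: the statistic falls in the rejection region.")
else:
    print("Fail to reject H0.")
```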
Types of hypothesis testing
Hypothesis tests can be classified into several types depending on the population parameters of interest and the data available.
One-sample z-test
The one-sample z-test is used when we want to compare a sample mean with a known population mean. This test assumes that the population is normally distributed and the population variance is known.
One-sample t-test
If the population variance is unknown, we use a one-sample t-test instead of a z-test. It is appropriate when the sample size is small and the population is assumed to be approximately normally distributed.
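As an illustrative sketch, scipy.stats.ttest_1samp performs this test; the sample values and the hypothesized mean of 50 below are made up.

```python
import numpy as np
from scipy import stats

# Hypothetical sample of 12 measurements; test H0: μ = 50 against a two-sided Ha.
sample = np.array([51.2, 49.8, 50.6, 52.1, 48.9, 50.3,
                   51.7, 49.5, 50.9, 52.4, 49.1, 50.8])
t_stat, p_value = stats.ttest_1samp(sample, popmean=50)

print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
# Reject H0 at α = 0.05 if p < 0.05.
```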
Two-sample t-test
The two-sample t-test compares the means of two independent samples. It tests whether the means of two groups are equal. This test uses the pooled standard deviation when the variances are assumed equal, or the individual sample variances (Welch's t-test) when they are not.
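A brief sketch with scipy.stats.ttest_ind; the two samples are invented, and setting equal_var=False requests Welch's version, which does not pool the variances.

```python
import numpy as np
from scipy import stats

# Hypothetical yields from two independent production lines.
group_a = np.array([20.1, 21.4, 19.8, 22.0, 20.7, 21.1, 19.5, 20.9])
group_b = np.array([18.9, 19.6, 20.2, 18.4, 19.1, 19.9, 18.7, 19.3])

# equal_var=False gives Welch's t-test; equal_var=True pools the variances.
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```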
Example: hypothesis testing for the mean
Suppose a company manufactures light bulbs with an average life of 1000 hours. A researcher believes the true average life is less than 1000 hours and wants to test this hypothesis using a random sample of 30 light bulbs.
Let's define our hypotheses:
H₀: μ = 1000
Hₐ: μ < 1000
Significance level: α = 0.05.
Assuming that the standard deviation of the population is 100, the test statistic can be calculated as follows:
z = (x̄ - μ₀) / (σ/√n)
where x̄ is the sample mean, n is the sample size, σ is the population standard deviation, and μ₀ is the population mean under H₀.
If the calculated z value falls to the left of our critical value on the normal distribution curve (found in z-tables), we reject the null hypothesis.
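A possible Python sketch of this worked example follows; the sample mean of 960 hours is an assumed value, since the text does not report one.

```python
from math import sqrt
from scipy.stats import norm

mu0, sigma, n, alpha = 1000, 100, 30, 0.05
x_bar = 960                                  # assumed sample mean for illustration

z = (x_bar - mu0) / (sigma / sqrt(n))        # test statistic
z_crit = norm.ppf(alpha)                     # left-tailed critical value ≈ -1.645
p_value = norm.cdf(z)                        # P(Z ≤ z) under H0

print(f"z = {z:.3f}, critical value = {z_crit:.3f}, p = {p_value:.4f}")
if z < z_crit:
    print("Reject H0: evidence that mean life is below 1000 hours.")
else:
    print("Fail to reject H0.")
```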
Real-life applications of hypothesis testing
- Medical research: Comparing the effectiveness of a new drug to a placebo.
- Manufacturing: Comparing the means of different production processes to determine which process is more efficient.
- Marketing: Evaluating the impact of a new campaign on customer sales or engagement compared to an old strategy.
- Education: Determining if a new teaching method works better than the traditional method.
Common mistakes in hypothesis testing
Type I and Type II errors
Two common mistakes in hypothesis testing are Type I and Type II errors:
- Type I error: rejecting the null hypothesis when it is true. The probability of a Type I error occurring is the significance level α.
- Type II error: failing to reject the null hypothesis when the alternative hypothesis is true. The probability of committing a Type II error is denoted as β.
Example of errors in hypothesis testing: medical testing
In a medical test for a disease, the null hypothesis might be that the person does not have the disease (H₀), and the alternative hypothesis that the person does have the disease (Hₐ).
- Type I error: The test shows that the person has the disease when in fact they do not. This can cause unnecessary stress and treatment.
- Type II error: The test fails to identify a disease when a person actually has it. This results in the disease not being treated.
The power of a test
The power of a test is the probability that it correctly rejects a false null hypothesis, equal to 1 - β. Higher power means a greater probability of detecting an effect when one truly exists, thereby reducing Type II errors.
Power can be increased in the following ways (a small power calculation is sketched after this list):
- Increasing the sample size.
- Selecting a higher significance level (which increases the probability of a Type I error).
- A larger effect size (the bigger the true difference, the easier it is to detect).
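For instance, here is a rough Python sketch of the power of a left-tailed one-sample z-test; the scenario (null mean 1000, true mean 980, σ = 100) is hypothetical and mirrors the light-bulb example above.

```python
from math import sqrt
from scipy.stats import norm

def power_left_tailed_z(mu0, mu_true, sigma, n, alpha):
    """Power of a left-tailed one-sample z-test when the true mean is mu_true."""
    z_crit = norm.ppf(alpha)                      # rejection threshold on the z scale
    shift = (mu_true - mu0) / (sigma / sqrt(n))   # where the true mean sits on that scale
    return norm.cdf(z_crit - shift)               # P(reject H0 | μ = mu_true)

# Hypothetical scenario: H0 mean 1000, true mean 980, σ = 100; power grows with n.
for n in (30, 60, 120):
    print(n, round(power_left_tailed_z(1000, 980, 100, n, 0.05), 3))
```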
Critical values and p-values
Critical values
Critical values are the threshold values that define the boundaries of the rejection region(s). For a z-test, the critical values are the z-scores beyond which the tail area(s) of the normal distribution equal the chosen significance level α.
P-value
The p-value is the probability of obtaining a test statistic at least as extreme as the one observed, assuming the null hypothesis is true. A small p-value (typically ≤ 0.05) indicates strong evidence against the null hypothesis, so we reject H₀. A large p-value (> 0.05) indicates weak evidence against H₀, so we fail to reject it.
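As a small sketch, the p-value for a z statistic can be read off the standard normal distribution with scipy.stats.norm; the observed value of -2.19 is just an example.

```python
from scipy.stats import norm

z = -2.19                          # hypothetical observed z statistic

p_left  = norm.cdf(z)              # left-tailed p-value
p_right = norm.sf(z)               # right-tailed p-value
p_two   = 2 * norm.sf(abs(z))      # two-tailed p-value

print(f"left: {p_left:.4f}, right: {p_right:.4f}, two-sided: {p_two:.4f}")
```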
Example: using the p-value
In a study about average sleep hours among students, the null hypothesis states that students sleep an average of 7 hours per night. The sample data provide an average of 6.6 hours with a calculated p-value of 0.03.
Conclusion: since 0.03 < 0.05, we reject the null hypothesis and conclude that the average sleep time is different from 7 hours.
Conclusion
Hypothesis testing provides a systematic way to make decisions using data. Although it does not deliver definitive proof, it lets us weigh the evidence for or against statements about a population. Understanding the process, the types of tests, the possible errors, and how to apply these concepts effectively can significantly impact outcomes in fields such as medicine, business, and the social sciences.