Krista King Math | Online math help

Inferential statistics and hypothesis statements

The goal of inferential statistics

One thing we like to do in statistics is make a statement about a population parameter, collect a sample from that population, investigate the sample, and then make a clear statement about whether or not that sample supports our original statement about the population.

We’ll cover each part of this process throughout this section. Here’s where we’re headed. Hypothesis testing follows five steps:

  1. State the null and alternative hypotheses.

  2. Determine the level of significance.

  3. Calculate the test statistic.

  4. Find critical value(s).

  5. State the conclusion.

This process is also called inferential statistics, because we’re using information we have about the sample to make inferences about the population.

For instance, I might have been told that ???40\%??? of cars in my town are blue. I could take a random sample of cars in my town, look at the proportion of cars in that sample that are blue (maybe ???37\%???), and then run a hypothesis test to judge whether my sample, which produced ???\hat p=0.37???, is consistent with the claim that ???p=0.40???.

Proof vs. support

But it’s important to make the distinction right up front between proving a claim and providing support for a claim.

When we do inferential statistics, we’re usually not able to prove something with certainty (my ???\hat p=0.37??? doesn’t prove that ???p=0.40???, even though it might be consistent with that claim). Instead, we use the data to support a theory that we have. If the evidence in the data is strong enough, we can provide strong support for our theory, but we still can’t prove it.

Hypotheses for means and proportions

Before we can use inferential statistics, we first need a hypothesis, which is a statement of expectation about a population parameter that we develop for the purpose of testing it (???40\%??? of cars in my town are blue).

In any hypothesis test, the first thing we always want to do is state what are called the null and alternative hypotheses. Every hypothesis test contains this set of two opposing statements about a population parameter.

The alternative hypothesis ???H_a??? is what we expect we’ll find in the data. Once we have a statement about what we expect to happen, we always want to state the opposite claim, which we call the null hypothesis, ???H_0???.

???H_a???: the proportion of blue cars in my town is not ???40\%???

???H_0???: ???40\%??? of cars in my town are blue

Interestingly enough, we always test the null hypothesis ???H_0???, not the alternative hypothesis ???H_a???. We say that, if our sample gives us good enough evidence, then we can reject the null hypothesis, and therefore provide evidence that supports our alternative hypothesis.
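To make the five steps concrete, here is a minimal Python sketch that runs them on the blue-car example. It is only an illustration under stated assumptions: the sample size ???n=200??? and the significance level ???\alpha=0.05??? are not given in the text, and the test used is a one-proportion z-test based on the normal approximation.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: hypotheses from the text. H0: p = 0.40, Ha: p != 0.40 (two-tailed).
p0 = 0.40      # claimed population proportion
p_hat = 0.37   # sample proportion from the text
n = 200        # ASSUMPTION: the text does not give a sample size

# Step 2: level of significance (ASSUMPTION: a commonly used value).
alpha = 0.05

# Step 3: test statistic, using the standard error computed under H0.
standard_error = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / standard_error

# Step 4: two-tailed critical value from the standard normal distribution.
z_critical = NormalDist().inv_cdf(1 - alpha / 2)

# Step 5: conclusion. Reject H0 only if z falls beyond a critical value.
reject_null = abs(z) > z_critical
print(f"z = {z:.3f}, critical values = +/-{z_critical:.3f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

With these assumed inputs, the test statistic is about ???-0.87???, which is well inside the critical values of roughly ???\pm1.96???, so we would fail to reject ???H_0???. A larger sample with the same ???\hat p??? could lead to a different conclusion, which is why the sample size matters.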

In this section we’ll focus on hypothesis tests about two population parameters: the population mean ???\mu??? and the population proportion ???p???.

For population means

Remember that the population mean is the mean value of some characteristic that we’re interested in. For example, the mean height of American females might be ???\mu=65??? inches if the average American woman is ???5'5''??? tall.

Whether we’re investigating a population proportion or a population mean, the null hypothesis states the status quo that the population parameter is ???\le???, ???=???, or ???\ge??? the claimed value. The null hypothesis always says that the population mean (or parameter) is what we thought it was; nothing new or different is happening.

So if we’re testing the claim that the mean height of American females is ???\mu=65??? inches, the null hypothesis is ???H_0:\mu=65???.

If we think the population mean for height of American females is different than this claim, then we state that in the alternative hypothesis. We could use the alternative hypothesis to make these claims about the height of American females:

  • The mean height of American females is different than ???\mu=65???: ???H_a:\mu\ne65???

  • The mean height of American females is greater than ???\mu=65???: ???H_a:\mu>65???

  • The mean height of American females is less than ???\mu=65???: ???H_a:\mu<65???
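The direction of the alternative hypothesis determines where the critical values (step 4 of the process) fall. Here is a minimal Python sketch of that idea. It assumes the test statistic follows a standard normal distribution (a z-test) and uses ???\alpha=0.05??? by default. Neither assumption comes from the text, and `critical_values` is a hypothetical helper name.

```python
from statistics import NormalDist

def critical_values(alternative, alpha=0.05):
    """Return the z critical value(s) for the given form of Ha.

    alternative: "!=" (two-tailed), ">" (upper-tailed), or "<" (lower-tailed).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "!=":
        # Split alpha evenly between the two tails.
        cutoff = std_normal.inv_cdf(1 - alpha / 2)
        return (-cutoff, cutoff)
    if alternative == ">":
        # All of alpha sits in the upper tail.
        return (std_normal.inv_cdf(1 - alpha),)
    if alternative == "<":
        # All of alpha sits in the lower tail.
        return (std_normal.inv_cdf(alpha),)
    raise ValueError("alternative must be '!=', '>', or '<'")

for sign in ("!=", ">", "<"):
    values = ", ".join(f"{v:.3f}" for v in critical_values(sign))
    print(f"Ha: mu {sign} 65 -> critical value(s): {values}")
```

A "different than" alternative splits the significance level across both tails, while a "greater than" or "less than" alternative puts all of it in one tail.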

For population proportions

On the other hand, the population proportion is the proportion of the population that meets some criterion we’ve established. For example, the proportion of American females with blue eyes might be ???p=0.15??? if ???15\%??? of American females have blue eyes.

If the claim we’re testing is that ???15\%??? of American females have blue eyes, then the null hypothesis would be ???H_0:p=0.15???. The null hypothesis says that the population proportion is ???15\%???.

If we think the population proportion for American females with blue eyes is different than this ???15\%??? claim, then we state that in the alternative hypothesis. For instance, we could use the alternative hypothesis to make these claims about American females with blue eyes:

  • The proportion of American females with blue eyes is different than ???p=0.15???: ???H_a:p\ne0.15???

  • The proportion of American females with blue eyes is greater than ???p=0.15???: ???H_a:p>0.15???

  • The proportion of American females with blue eyes is less than ???p=0.15???: ???H_a:p<0.15???

As you can see for both population means and population proportions, the alternative hypothesis states the opposite of the null hypothesis, and it’s the claim we support when the sample gives us enough evidence to reject the null hypothesis. Because the null hypothesis always includes a ???\le???, ???=???, or ???\ge??? sign, the alternative hypothesis always includes a ???>???, ???\neq???, or ???<??? sign, respectively.

If ???H_0??? is ???\mu=??? or ???p=???, then ???H_a??? is ???\mu\neq??? or ???p\neq???

If ???H_0??? is ???\mu\leq??? or ???p\leq???, then ???H_a??? is ???\mu>??? or ???p>???

If ???H_0??? is ???\mu\geq??? or ???p\geq???, then ???H_a??? is ???\mu<??? or ???p<???
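The three pairings above can be written as a small lookup. This is only an illustrative Python sketch; the names `ALTERNATIVE_FOR_NULL` and `hypothesis_pair` are hypothetical and not part of any library.

```python
# Each null-hypothesis relation paired with its opposing alternative relation.
ALTERNATIVE_FOR_NULL = {"=": "!=", "<=": ">", ">=": "<"}

def hypothesis_pair(parameter, null_relation, value):
    """Build the H0 and Ha statements for a parameter and a claimed value."""
    alt_relation = ALTERNATIVE_FOR_NULL[null_relation]
    return (
        f"H0: {parameter} {null_relation} {value}",
        f"Ha: {parameter} {alt_relation} {value}",
    )

print(hypothesis_pair("mu", "<=", 65))
print(hypothesis_pair("p", "=", 0.15))
```

Notice that the equality always lives in the null hypothesis, so the alternative never contains ???=???, ???\le???, or ???\ge???.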





How to set up hypothesis statements

Example

Write sets of hypothesis statements to test the claims that students at Springdale High School (SHS) perform 1) differently than, 2) better than, and 3) worse than students at Greenville High School (GHS) on the SAT. Let ???\mu_S??? and ???\mu_G??? be the mean SAT scores of students at SHS and GHS.

1) To test the claim that students at SHS perform differently on the SAT than students at GHS, we would write these hypothesis statements:

Null: Students at Springdale High School do not perform differently on the SAT than students at Greenville High School:

???H_0:\ \mu_S=\mu_G???

Alternative: Students at Springdale High School perform differently on the SAT than students at Greenville High School:

???H_a:\ \mu_S\ne\mu_G???

2) To test the claim that students at SHS perform better on the SAT than students at GHS, we would write these hypothesis statements:

Null: Students at Springdale High School perform no better on the SAT than students at Greenville High School (the same or worse):

???H_0:\ \mu_S\leq\mu_G???

Alternative: Students at Springdale High School perform better on the SAT than students at Greenville High School:

???H_a:\ \mu_S>\mu_G???

3) To test the claim that students at SHS perform worse on the SAT than students at GHS, we would write these hypothesis statements:

Null: Students at Springdale High School perform no worse on the SAT than students at Greenville High School (the same or better):

???H_0:\ \mu_S\geq\mu_G???

Alternative: Students at Springdale High School perform worse on the SAT than students at Greenville High School:

???H_a:\ \mu_S<\mu_G???

Some textbooks and teachers use a different convention, where the null hypothesis always includes only an ???=??? sign and the alternative hypothesis is stated with ???<???, ???\neq???, or ???>???. You can use either convention; just make sure you stay consistent with whichever one you choose.

