V26 Hypothesis Testing

Welcome to part two of our video series on hypothesis testing. In this video, we're going to talk about the various cases involving inference for one mean, mu, and these cases will revolve around when we use the Z distribution versus when we use the T distribution. We will also cover hypothesis tests as proofs by contradiction, and finally we'll talk about the relationship of a confidence interval to a hypothesis test. I'm Renee Clark from the Swanson School of Engineering at the University of Pittsburgh.

Okay, so let's talk about the first case for inference on the mean, and this is the case in which sigma squared, the population variance, is known. In this case, we know we can use the Z distribution. This is your Z random variable, Z = (xbar - mu) / (sigma / sqrt(n)), and it has sigma in the denominator, which we can use because sigma squared is known. But in order to use Z, we must also know that our xbar is normally distributed. Under what cases will xbar be normally distributed? One of two cases: if our underlying population for X is normally distributed, then automatically xbar will be normally distributed; or if our n is large, meaning greater than or equal to 30, then again xbar will be normally distributed by the central limit theorem. So, if either or both of these two cases is true, then we know that xbar will be normally distributed, in which case we can transform it to a Z random variable and use the Z distribution and tables in the back of the book.
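The standardization described above can be sketched in Python. The sample values, the known sigma, and the candidate mu below are hypothetical numbers chosen just for illustration:

```python
import math
from statistics import NormalDist

# Hypothetical sample; sigma is the KNOWN population standard deviation.
sample = [51.2, 49.8, 50.5, 52.1, 50.9, 51.4, 49.6, 50.2]
sigma = 1.5          # known population standard deviation (assumed)
mu = 50.0            # a candidate value for the population mean

n = len(sample)
xbar = sum(sample) / n

# Transform xbar to a standard normal random variable:
# Z = (xbar - mu) / (sigma / sqrt(n))
z = (xbar - mu) / (sigma / math.sqrt(n))

# With Z in hand, probabilities come from the standard normal
# distribution (the role of the Z tables in the back of the book).
p_upper = 1 - NormalDist().cdf(z)
print(round(z, 3), round(p_upper, 4))
```

Here `NormalDist().cdf` plays the role of the Z table: it returns the probability that a standard normal variable falls below z.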

Okay, so for a hypothesis test, you know that this is what the null and the alternative hypotheses look like. The null says mu equals mu sub zero; mu0 is our hypothesized value for mu. Versus the alternative that directly opposes it: mu not equal to mu0. This alternative hypothesis, or H sub 1, is the new claim or belief that we're trying to test about mu. It is sometimes called the researcher's hypothesis or the scientist's hypothesis. The alternative is what we are trying to show, or prove, about mu. Now, recall that a hypothesis test proceeds as a proof by contradiction. In a proof by contradiction, if you're trying to prove the alternative, that mu is not mu0, then what you have to do is assume the opposite. So you assume the null hypothesis, that mu does equal mu0, and that's why mu0 gets inserted for mu in the test statistic. So, with a hypothesis test, we insert the hypothesized value mu0 for mu in the Z random variable in order to do a proof by contradiction, which is how a hypothesis test proceeds.
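The "insert mu0 for mu" step can be sketched as a two-sided z test. All numbers below (mu0, sigma, n, xbar) are hypothetical:

```python
import math
from statistics import NormalDist

# Sketch of a two-sided z test with sigma known (hypothetical numbers).
mu0 = 50.0      # hypothesized value from H0: mu = mu0
sigma = 2.0     # known population standard deviation
n = 36
xbar = 50.9     # observed sample mean

# Assume H0 is true and insert mu0 for mu in the Z random variable.
z = (xbar - mu0) / (sigma / math.sqrt(n))

# Two-sided p-value: chance of a Z at least this extreme in either tail.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05
reject = p_value < alpha    # small p-value contradicts H0, so reject it
print(round(z, 2), round(p_value, 4), reject)
```

A small p-value is the "contradiction": data this extreme would be very unlikely if mu really equaled mu0, so we reject the null.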

Alright, let's talk about the relationship of a hypothesis test to a confidence interval. Say we have the following null and alternative hypotheses: the null is that mu is 50, and the alternative that directly opposes it says that mu is not equal to 50. Now, because hypothesis tests and confidence intervals are equivalent forms of inference, when you run a hypothesis test, you can also calculate a confidence interval. For this hypothesis test, the corresponding 95% confidence interval for mu is (50.18, 52.42): we are 95% confident mu is somewhere between 50.18 and 52.42. What you notice is that the null hypothesized value of 50 is not contained in that confidence interval. It is outside, because the lower limit of the interval is 50.18, and the value 50 is just below that. So, if mu0 = 50 is outside of the confidence interval, that means 50 is not a plausible value for mu at this alpha level, and therefore we're going to reject this value for mu. We are going to reject mu equal to 50. Putting that into more formal language: for a two-sided hypothesis test, if mu0, the value you are hypothesizing, is outside of your 100(1 - alpha)% confidence interval, then you reject the null hypothesis that mu equals mu0 at level alpha, because mu0 is not a plausible value.

Okay, and this alpha is the same as that alpha. Restating this in just a bit more formal language: since CIs and HTs, or confidence intervals and hypothesis tests, are equivalent forms of inference, take your 100(1 - alpha)% confidence interval on mu and your two-sided hypothesis test on mu, mu equal to mu0 versus mu not equal to mu0, run at a significance level of alpha. We will reject the null hypothesis at that level of significance if the value mu0 we're hypothesizing for mu is not contained in, or is outside of, our confidence interval. And again, the alpha in the confidence level is the same as the alpha of the test.
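The CI/HT equivalence can be checked numerically. The center 51.3 and standard error 0.5714 below are hypothetical values chosen to be consistent with the stated interval (50.18, 52.42):

```python
from statistics import NormalDist

mu0 = 50.0          # null hypothesized value: H0 says mu = 50
alpha = 0.05

# Hypothetical sample summary consistent with the example's 95% CI.
xbar = 51.3         # sample mean (center of the interval)
se = 0.5714         # standard error, sigma / sqrt(n)

# 100(1 - alpha)% confidence interval: xbar +/- z_{alpha/2} * se
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96
ci = (xbar - z_crit * se, xbar + z_crit * se)

# Equivalence rule: reject H0 at alpha exactly when mu0 is outside the CI.
reject = not (ci[0] <= mu0 <= ci[1])
print(tuple(round(v, 2) for v in ci), reject)
```

Since 50 falls below the lower limit 50.18, `reject` comes out `True`, matching the decision in the transcript.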

Alright, let's talk about another case for inference on the mean, and this is the case where our sigma squared, the population variance, is unknown, and, in addition, our n is small. So, what do we do in this case? We don't know sigma and we have a small n. Well, you remember we've got to use the T distribution in this case, and the T random variable, T = (xbar - mu) / (s / sqrt(n)), looks very similar to Z, but it has s in the denominator instead of sigma. Remember, Z has sigma; s is your sample standard deviation, and because we don't know sigma, we have to estimate it by s. Recall that T has n minus one degrees of freedom, and one of the conditions for using the T distribution is that your underlying population, your population of x's, has to be normally distributed. This must hold: X must be normally distributed in order to use the T distribution. But we're going to proceed just as we did in the other case, in that we're going to insert mu0 for mu in our T random variable in order to do a proof by contradiction, for the same reasons.
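This small-sample case can be sketched with a one-sample t test, assuming scipy is available; the sample values and mu0 are hypothetical. The manual statistic is computed first, then compared against `scipy.stats.ttest_1samp`, which does the same calculation:

```python
import math
from statistics import mean, stdev
from scipy import stats

# Hypothetical small sample (n < 30), sigma unknown, X assumed normal.
sample = [50.6, 51.2, 49.9, 50.8, 51.5, 50.1, 50.9, 51.3]
mu0 = 50.0                  # hypothesized mean from H0

n = len(sample)
xbar = mean(sample)
s = stdev(sample)           # sample standard deviation estimates sigma

# T random variable with mu0 inserted for mu; degrees of freedom = n - 1.
t = (xbar - mu0) / (s / math.sqrt(n))

# scipy's one-sample t test computes the same statistic and a
# two-sided p-value from the T distribution with n - 1 df.
result = stats.ttest_1samp(sample, popmean=mu0)
print(round(t, 3), round(result.pvalue, 4))
```

The hand-computed `t` and `result.statistic` agree, which is a useful sanity check that the library is doing exactly the calculation in the formula.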

Okay, and let's talk about our third case for inference on the mean, mu, and this is the case where the population variance, sigma squared, is still unknown, but now we have a large n. What do we do in that case? Well, we can go back, as you recall, to using the Z distribution, because n is large (greater than or equal to 30). When n is large, s becomes a very good estimator of sigma, or, stated another way, s squared becomes a very good estimator of sigma squared, meaning that the sample values become very good estimators of the population values.

Alright, so in this case, we said we can go back to using Z. Now, recall Z has sigma in the denominator, but with n large, the sample standard deviation becomes a very good estimator of the population standard deviation. So what we do, in essence, is cross out that sigma in the formula and replace it with s, because s is a good approximation. We are still going to insert mu0 for mu in that Z random variable to do a proof by contradiction.
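The large-n case then looks just like the first sketch, except s stands in for the unknown sigma. The synthetic sample below (n = 40) and mu0 are hypothetical:

```python
import math
from statistics import NormalDist, mean, stdev

# Large-n case: sigma unknown, but n >= 30, so s replaces sigma
# and Z applies by the central limit theorem (hypothetical data).
sample = [50 + 0.1 * k for k in range(40)]   # n = 40 synthetic values
mu0 = 51.0                                   # hypothesized mean from H0

n = len(sample)
xbar = mean(sample)
s = stdev(sample)        # sample standard deviation replacing sigma

# Same Z statistic as the known-sigma case, with s in the denominator.
z = (xbar - mu0) / (s / math.sqrt(n))
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(z, 2), round(p_value, 6))
```

No normality assumption on X is needed here: with n = 40, the central limit theorem makes xbar approximately normal regardless of the shape of the population.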

Okay, but the benefit of using Z in this case is that the central limit theorem applies. With n large, we know xbar is normally distributed by the central limit theorem, so we don't have to assume or know that the population of X is normal, as you have to do when using the T distribution. That's why it's beneficial to use the Z distribution if you can, and here we can, because n is large.

We wish to thank the National Science Foundation, under Grant 233582, for supporting our work. Thank you for watching.