V26
Hypothesis Testing

Welcome to part two of our video
series in support of hypothesis testing. In this video, we're going to talk about the various cases involving inference for one mean, mu, and these cases revolve around when we use the Z distribution versus when we use the t distribution. We will also cover hypothesis tests as proofs by contradiction, and finally we'll talk about the relationship of a confidence interval to a hypothesis test. I'm Renee Clark from the Swanson School of Engineering at the University of Pittsburgh. Let's talk about the
first case for inference on the mean: the case in which sigma squared, the population variance, is known. In this case, we can use the Z distribution. The Z random variable is Z = (Xbar − mu) / (sigma / sqrt(n)); it has sigma in the denominator, which we can use because sigma squared is known. But in order to use Z, we must also know that Xbar is normally distributed. Under what conditions will Xbar be normally distributed? One of two: if the underlying population for X is normally distributed, then Xbar is automatically normally distributed; or, if n is large, meaning greater than or equal to 30, then Xbar is normally distributed by the central limit theorem. If either or both of these conditions hold, then Xbar is normally distributed, in which case we can transform it to a Z random variable and use the Z distribution and the tables in the back of the book. For a hypothesis test, recall what the null
and the alternative hypotheses look like. The null is mu = mu_0, where mu_0 is our hypothesized value for mu, versus the alternative that directly opposes it: mu ≠ mu_0. This alternative hypothesis, H_1, is the new claim or belief that we're trying to test about mu; it is sometimes called the researcher's hypothesis or the scientist's hypothesis. The alternative is what we are trying to show, or prove, about mu. You
recall that a hypothesis test proceeds as a proof by contradiction: if you're trying to prove the alternative, that mu is not mu_0, then you assume the opposite. You assume the null hypothesis, that mu does equal mu_0, and that's why mu_0 gets inserted for mu in the test statistic. So, in a hypothesis test, we insert the hypothesized value mu_0 for mu in the Z random variable in order to do a proof by contradiction, which is how a hypothesis test proceeds.
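The case just described can be sketched in a few lines of Python. This is only an illustration, not code from the course; the sample values xbar, sigma, n, and mu_0 below are hypothetical.

```python
from math import erf, sqrt

def z_test_known_sigma(xbar, mu0, sigma, n, alpha=0.05):
    """Two-sided one-sample z-test of H0: mu = mu0, with sigma known."""
    # Insert the hypothesized value mu0 for mu in the Z random variable.
    z = (xbar - mu0) / (sigma / sqrt(n))
    # Standard normal CDF, computed from the error function.
    def phi(x):
        return 0.5 * (1.0 + erf(x / sqrt(2.0)))
    p_value = 2.0 * (1.0 - phi(abs(z)))
    return z, p_value, p_value < alpha  # reject H0 when p < alpha

# Hypothetical example: xbar = 51.3, sigma = 2.8, n = 25, mu0 = 50.
z, p, reject = z_test_known_sigma(51.3, 50.0, 2.8, 25)
```

A small p-value means the data contradict the assumption that mu equals mu_0, so we reject the null.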
Now let's talk about the relationship of a hypothesis test to a confidence interval. Say we have the following null and alternative hypotheses: the null is that mu is 50, and the alternative that directly opposes it says that mu is not equal to 50. Because hypothesis tests and confidence intervals are equivalent forms of inference, when you run a hypothesis test, you can also calculate a confidence interval. For this hypothesis test, the corresponding 95% confidence interval for mu is (50.18, 52.42): we are 95% confident mu is somewhere between 50.18 and 52.42. Notice that the null hypothesized value of 50 is not contained in that confidence interval; the lower limit is 50.18, so a value of 50 falls just outside of it. If mu_0 = 50 is outside of the confidence interval, then 50 is not a plausible value for mu at this alpha level, and therefore we reject mu = 50. Putting that into more formal language: for a two-sided hypothesis test, if the value you are hypothesizing for mu is outside your 100(1 − alpha)% confidence interval, then you reject the null hypothesis, mu = mu_0, at significance level alpha, because it is not a plausible value. The alpha in the confidence level is the same alpha at which the test is run. Restating this a bit more formally: since confidence intervals and hypothesis tests are equivalent forms of inference, for a 100(1 − alpha)% confidence interval on mu and a two-sided hypothesis test of mu = mu_0 versus mu ≠ mu_0 run at significance level alpha, we reject the null hypothesis if the value we are hypothesizing for mu is not contained in, or is outside of, the confidence interval. Again, the alpha in the confidence level is the same alpha used for the test.
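This equivalence is easy to check numerically. Here is a sketch with hypothetical sample values (chosen only for illustration; they are not the exact numbers behind the interval above): build the 95% interval and compare its decision with the two-sided z-test's decision.

```python
from math import sqrt

def ci_and_test_agree(xbar, mu0, sigma, n):
    """Show the CI decision and the two-sided z-test decision match."""
    z_crit = 1.96  # standard normal critical value for alpha = 0.05
    half_width = z_crit * sigma / sqrt(n)
    ci = (xbar - half_width, xbar + half_width)   # 95% CI for mu
    reject_by_ci = not (ci[0] <= mu0 <= ci[1])    # is mu0 outside the CI?
    z = (xbar - mu0) / (sigma / sqrt(n))
    reject_by_test = abs(z) > z_crit              # is |z| in the rejection region?
    return ci, reject_by_ci, reject_by_test

ci, by_ci, by_test = ci_and_test_agree(xbar=51.3, mu0=50.0, sigma=2.8, n=25)
```

The two decisions agree because "mu_0 outside the interval" and "|z| beyond the critical value" are the same inequality rearranged.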
Now let's talk about another case for inference on the mean: the case where sigma squared, the population variance, is unknown and, in addition, n is small. What do we do when we don't know sigma and we have a small n? We have to use the t distribution, and the t random variable is T = (Xbar − mu) / (s / sqrt(n)). It looks very similar to Z, but it has s, the sample standard deviation, in the denominator instead of sigma: because we don't know sigma, we have to estimate it by s. Recall that T has n − 1 degrees of freedom, and one of the conditions for using the t distribution is that the underlying population, the population of X's, must be normally distributed. X must be normally distributed in order to use the t distribution. But we proceed just as we did in the other case: we insert mu_0 for mu in the T random variable in order to do a proof by contradiction, for the same reasons.
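A sketch of this small-sample case, with hypothetical data. Python's standard library has no t-distribution CDF, so this version compares |t| against a critical value read from a t table (2.262 is the two-sided 5% critical value for 9 degrees of freedom).

```python
from math import sqrt
from statistics import mean, stdev

def t_test_unknown_sigma(data, mu0, t_crit):
    """Two-sided one-sample t-test of H0: mu = mu0 (sigma unknown, n small)."""
    n = len(data)
    xbar = mean(data)
    s = stdev(data)                   # sample standard deviation estimates sigma
    t = (xbar - mu0) / (s / sqrt(n))  # mu0 inserted for mu, as in the z case
    return t, abs(t) > t_crit         # reject H0 when |t| exceeds the critical value

# Hypothetical sample of n = 10 observations; df = 9, so t_crit = 2.262 at alpha = 0.05.
sample = [51.2, 50.8, 51.5, 52.0, 50.9, 51.7, 51.1, 51.4, 50.6, 51.8]
t, reject = t_test_unknown_sigma(sample, mu0=50.0, t_crit=2.262)
```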
Now let's talk about our third case for inference on the mean, mu: sigma squared, the population variance, is still unknown, but in this case we have a large n. What do we do? We can go back to using the Z distribution, because n is large (greater than or equal to 30). When n is large, s becomes a very good estimator of sigma, or, stated another way, s squared becomes a very good estimator of sigma squared, meaning that the sample values become very good estimators of the population values.
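That claim is easy to check by simulation; here is a quick sketch with an arbitrary seed. For a large sample, the sample standard deviation s lands very close to the true sigma.

```python
import random
from statistics import stdev

random.seed(0)  # arbitrary seed, just for reproducibility
sigma = 2.0
big_sample = [random.gauss(50.0, sigma) for _ in range(10_000)]

# With n large, s is a very good estimator of sigma:
# for n = 10,000 it typically falls within a few percent of 2.0.
s = stdev(big_sample)
```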
So, in this case, we said we can go back to using Z. Recall that Z has sigma in the denominator, but with n large, the sample standard deviation becomes a very good estimator of the population standard deviation. So what we do, in essence, is cross out the sigma in the formula and replace it with s, because s is a good approximation. We still insert mu_0 for mu in that Z random variable to do a proof by contradiction. The benefit of using Z in this case is that the central limit theorem applies: with n large, we know Xbar is normally distributed by the central limit theorem, so we don't have to assume or know that the population of X is normal, as you do when using the t distribution. That's why it's beneficial to use the Z distribution if you can, and here we can, because n is large.
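The large-n case can be sketched like the case-1 test, with s substituted for the unknown sigma (again with hypothetical data):

```python
from math import erf, sqrt
from statistics import mean, stdev

def z_test_large_n(data, mu0, alpha=0.05):
    """Two-sided one-sample z-test with s in place of the unknown sigma (n large)."""
    n = len(data)
    xbar, s = mean(data), stdev(data)
    z = (xbar - mu0) / (s / sqrt(n))  # sigma crossed out, replaced by s
    def phi(x):
        return 0.5 * (1.0 + erf(x / sqrt(2.0)))
    p_value = 2.0 * (1.0 - phi(abs(z)))
    return z, p_value, p_value < alpha

# Hypothetical sample with n = 40 >= 30, so the CLT covers normality of Xbar.
data = [50.0 + 0.1 * i for i in range(40)]
z, p, reject = z_test_large_n(data, mu0=50.0)
```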
We wish to thank the National Science Foundation under Grant 233582 for supporting our work. Thank you for watching.