Estimation

Welcome to part four of our video series on estimation. In this video, we are going to review topics relevant to estimating the difference in two independent means. These topics include standardizing the difference in the sample means to a z random variable, ensuring normality of this difference, and calculating the pooled estimate of the variance for use with the t distribution. I'm Renee Clark from the Swanson School of Engineering at the University of Pittsburgh.

Okay, so let's first review how to transform the difference in two sample means to a z random variable, something we covered in our chapter on sampling distribution theory. Focus first on the picture on the right, in which we have two independent populations, one and two, and we take a sample of a certain size from each of them, N1 and N2. In this case, we are assuming that the population variances are known; that is, sigma1 and sigma2 are each known. To transform the difference in the two sample averages, X1 bar minus X2 bar, to a z random variable, we proceed as we always do: we subtract off the mean, or expected value, of the difference, which we remember from our sampling distribution theory is simply mu1 minus mu2, and then we divide by the standard deviation of the difference in the means. Recall that the variance of the difference in the means is sigma1^2/N1 + sigma2^2/N2, so the standard deviation shown in the denominator is simply the square root of that quantity:

Z = [(X1 bar - X2 bar) - (mu1 - mu2)] / sqrt(sigma1^2/N1 + sigma2^2/N2)
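As a quick numerical sketch (this example is not from the video; all of the numbers below are hypothetical), the standardization can be computed like this:

```python
import math

def z_statistic(xbar1, xbar2, mu_diff, sigma1, sigma2, n1, n2):
    """Standardize (X1 bar - X2 bar) when both population variances are known."""
    # Standard deviation of the difference: sqrt(sigma1^2/n1 + sigma2^2/n2)
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return ((xbar1 - xbar2) - mu_diff) / se

# Hypothetical sample results:
z = z_statistic(xbar1=52.0, xbar2=50.0, mu_diff=0.0,
                sigma1=4.0, sigma2=5.0, n1=40, n2=50)
print(round(z, 3))  # prints 2.108
```

With that z in hand, probabilities come straight from the standard normal table, exactly as the video describes.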
Now, in order to do this transformation, we must know that the difference in those two sample means is normally distributed. In order to transform any random variable to a z random variable, it must be normally distributed. So the question is: under what conditions will this quantity be normally distributed? These are the possibilities. First, if your underlying populations are each normally distributed, in other words, if X1 is normally distributed and X2 is normally distributed, then automatically X1 bar will be normally distributed and X2 bar will be normally distributed; we learned that earlier. The second possibility is if your sample sizes N1 and N2, the sample sizes used to calculate each of your sample averages X1 bar and X2 bar, are each sufficiently large, meaning greater than or equal to 30. Then, again, X1 bar will be normally distributed and X2 bar will be normally distributed, this time by the central limit theorem. If either one of these cases holds, then the difference in the two averages will be normally distributed, because a linear combination of normally distributed random variables is itself normally distributed; we learned that earlier as well. So, this is a review, and that is what we wanted to arrive at: the difference must be normally distributed so that we can transform it to a z and use the z probability tables in the back of the book.
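As an illustration of the large-sample case (this simulation is not from the video; the exponential populations, sample sizes, and repetition count are made-up assumptions), we can check that the standardized difference in sample means behaves approximately like a standard normal even when the populations themselves are skewed:

```python
import random
import statistics
from math import sqrt

# Two skewed (exponential, rate 1) populations: mean 1, variance 1 each.
# With n1, n2 >= 30, the CLT says each sample mean is approximately normal,
# so the standardized difference should look like a standard normal z.
random.seed(1)
n1, n2, reps = 40, 50, 20000
mu1 = mu2 = 1.0
var1 = var2 = 1.0

zs = []
for _ in range(reps):
    xbar1 = statistics.fmean(random.expovariate(1.0) for _ in range(n1))
    xbar2 = statistics.fmean(random.expovariate(1.0) for _ in range(n2))
    se = sqrt(var1 / n1 + var2 / n2)  # std. dev. of the difference in means
    zs.append(((xbar1 - xbar2) - (mu1 - mu2)) / se)

# A standard normal has mean 0 and standard deviation 1:
print(round(statistics.fmean(zs), 2), round(statistics.pstdev(zs), 2))
```

The simulated mean and standard deviation come out very close to 0 and 1, which is what the central limit theorem predicts for these sample sizes.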
One last topic I wanted to talk to you about is what's known as the pooled estimate of the variance, which is something we are going to use with the t distribution. Recall that we use the t distribution when sigma, the population standard deviation, is unknown, in which case we have to use the sample standard deviation instead. Now, suppose your population variances are unknown, and again we have two independent populations. If those variances are both unknown but you have some reason to believe they are equal, that is, that the spread of the two populations is the same, then you can use what's known as the pooled estimate of the variance. It is calculated as

Sp^2 = [(N1 - 1)S1^2 + (N2 - 1)S2^2] / (N1 + N2 - 2),

where the subscript p stands for pooled. You'll see that this estimate takes into account a combination of your sample variances; with a pooled estimate, we say that our sample variances are pooled, or combined. Pooled is just another way to say combined, brought together. This Sp squared is actually a weighted average of your two sample variances, S1 squared and S2 squared, which both appear in the formula, with each weighted by its degrees of freedom. What are the degrees of freedom? They are N1 minus one and N2 minus one.
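Sketched in code (again, this is illustration rather than material from the video; the sample variances and sample sizes below are hypothetical):

```python
def pooled_variance(s1_sq, s2_sq, n1, n2):
    """Pooled estimate Sp^2: a weighted average of the two sample variances,
    each weighted by its degrees of freedom (n - 1)."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

# Hypothetical sample results: S1^2 = 6.25 with n1 = 10, S2^2 = 4.00 with n2 = 16
sp2 = pooled_variance(s1_sq=6.25, s2_sq=4.00, n1=10, n2=16)
print(round(sp2, 3))  # prints 4.844
```

Notice that the result lands between the two sample variances and sits closer to 4.00, the variance from the larger sample, exactly as a degrees-of-freedom-weighted average should.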
The reason we use a pooled estimate of the variance is that it is a better estimate of the common variance than simply using either sample variance individually in the calculation. We wish to thank the National Science Foundation for supporting our work under Grant 233582. Thank you for watching.