V30 Hypothesis Testing

Welcome to part six of our video series in support of hypothesis testing. In this video, we are going to review the sample proportion and how that's calculated. We're also going to discuss use of the Z distribution for performing inference on the population proportion. I'm Renee Clark from the Swanson School of Engineering at the University of Pittsburgh.

Okay, this is our sample proportion. We call it p hat. It serves as a point estimate for our population proportion, p. P hat is calculated by taking y/n. Y is your total number of successes, and you can think of it as a count. It's a count variable, a count of the particular characteristic or condition that you're interested in, regardless of whether that is a desirable or an undesirable characteristic. So, for example, you might be interested in counting the number of defective items, or the number of positive COVID tests, neither of which is highly desirable, right? Or you may be interested in counting the number of non-defective items. So that is why we call y a count of your successes in your n Bernoulli trials, whether or not the outcome is a desirable one. You'll recall that n is your number of Bernoulli trials, and a Bernoulli trial can result in one of two outcomes. So y is distributed according to the binomial distribution, as you'll recall.
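As a quick illustration of p hat = y/n, here is a minimal Python sketch. The function name `sample_proportion` and the counts (12 defective items out of 80 inspected) are made up for the example and do not come from the lecture.

```python
def sample_proportion(y: int, n: int) -> float:
    """Return p hat = y / n, the point estimate of the population proportion p."""
    if n <= 0:
        raise ValueError("n must be a positive number of Bernoulli trials")
    if not 0 <= y <= n:
        raise ValueError("y must be a count between 0 and n")
    return y / n


# Hypothetical data: 12 defective items (the "successes" being counted)
# out of 80 items inspected.
p_hat = sample_proportion(y=12, n=80)
print(f"p hat = {p_hat:.3f}")
```

With these made-up counts, p hat is 12/80 = 0.15. Note that "success" here is a defective item, which matches the point that a success is simply whatever outcome you chose to count.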

Okay, and it so happens we can use the normal distribution to approximate the binomial distribution, which we do for ease of calculation when the following conditions hold: np greater than or equal to 5, and n(1 - p) greater than or equal to 5. P is the population proportion of successes, 1 - p is the proportion of failures, and n, of course, is your number of Bernoulli trials.

Okay, so, in performing inference for one proportion, p, this is our z test statistic: z equals p hat minus p sub 0, divided by the square root of p sub 0 times (1 - p sub 0) over n. And, of course, because we can use the normal distribution to approximate the binomial, that's why we've got a z there, which is the standard normal variable.

Okay, so, let's say our null and alternative hypotheses are set up as follows. The null is that p, our population proportion, equals some hypothesized proportion, p sub 0. We haven't said what that value is, but it's some hypothesized proportion. So, for example, maybe that's 0.75. The alternative for a two-sided test is that p is not equal to p sub 0.

Okay, so, what we do in conducting a proof by contradiction is we insert that hypothesized proportion into the test statistic in each place that the population proportion is found. Remember, p is the population proportion, which we don't know. And in addition, because we want to do a proof by contradiction, we have to assume the null to be true, which is p equals p sub 0. So that is an additional reason why we insert p sub 0 into the test statistic in each place that we see a p. That's why p sub 0 appears there, there, and there.
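To make the test statistic concrete, here is a rough Python sketch of the one-proportion z test, assuming the standard form z = (p hat - p0) / sqrt(p0 (1 - p0) / n), in which the hypothesized proportion appears in three places. The function name `one_proportion_z_test` and the sample counts are hypothetical. Only the hypothesized value of 0.75 echoes the lecture's example.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(y: int, n: int, p0: float) -> tuple[float, float]:
    """Two-sided z test of H0: p = p0 versus H1: p != p0.

    Returns the z statistic and the two-sided p-value.
    """
    if not 0 < p0 < 1:
        raise ValueError("p0 must be strictly between 0 and 1")
    p_hat = y / n                      # sample proportion
    se_null = sqrt(p0 * (1 - p0) / n)  # standard error assuming H0 is true
    z = (p_hat - p0) / se_null         # p0 is used in all three places
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical sample: 66 successes in 100 trials, tested against p0 = 0.75.
z, p_value = one_proportion_z_test(y=66, n=100, p0=0.75)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.4f}")
```

The standard error is built from p0 rather than p hat because the test assumes the null to be true, which is the proof-by-contradiction idea described above.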

Okay, so, in trying to determine whether we meet the key assumption for using the normal approximation to the binomial, which is that np is greater than or equal to 5 and n(1 - p) is greater than or equal to 5, what we do, again, is insert the hypothesized proportion, p sub 0, in those two places where we find p. That is, we check that n times p sub 0 and n times (1 - p sub 0) are each greater than or equal to 5, and this is how we test that key assumption.
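The check just described can be sketched in a few lines of Python. The function name `normal_approx_ok` and the sample sizes are illustrative assumptions. The threshold of 5 is the one stated in the lecture.

```python
def normal_approx_ok(n: int, p0: float, threshold: float = 5.0) -> bool:
    """Check n*p0 >= threshold and n*(1 - p0) >= threshold.

    The hypothesized proportion p0 stands in for the unknown p.
    """
    return n * p0 >= threshold and n * (1 - p0) >= threshold


# Hypothetical cases, both using a hypothesized proportion of 0.75.
print(normal_approx_ok(n=100, p0=0.75))  # 75 and 25 are both at least 5
print(normal_approx_ok(n=10, p0=0.75))   # 7.5 passes, but 2.5 is below 5
```

In the second case the expected number of failures is only 2.5, so the normal approximation would not be appropriate for that sample size.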

Okay, finally, I wanted to discuss the relationship to a confidence interval for one proportion. So, let's say our null hypothesis is the following: p equals some hypothesized proportion, and let's say that hypothesized proportion happens to be 0.5. Now let's say I calculate a confidence interval for p of 0.25 to 0.75. This is a confidence interval for p, that is, p being between some lower limit and some upper limit. In this first example, the hypothesized proportion of 0.5 is contained in that interval, so p sub 0 of 0.5 is a plausible value for p. So we would fail to reject the null for this case. But let's say I calculated a confidence interval of 0.1 to 0.2 for p. In this case, the hypothesized proportion of 0.5 is outside of this confidence interval; it's above 0.2. So in this case, p equal to 0.5 is not plausible for p, in which case we would reject the null hypothesis.
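The decision rule in these two examples can be sketched as a simple containment check. This is only an illustration: the function name `decide_from_interval` is hypothetical, the interval endpoints are taken as given rather than computed, and the sketch assumes a two-sided test whose significance level matches the interval's confidence level (for example, a 95% interval with alpha of 0.05).

```python
def decide_from_interval(lower: float, upper: float, p0: float) -> str:
    """Return the test decision implied by a confidence interval for p."""
    if lower > upper:
        raise ValueError("lower limit must not exceed upper limit")
    if lower <= p0 <= upper:
        # p0 is a plausible value for p
        return "fail to reject H0"
    # p0 lies outside the interval, so it is not plausible
    return "reject H0"


# The two intervals from the lecture, with hypothesized proportion 0.5.
print(decide_from_interval(0.25, 0.75, p0=0.5))
print(decide_from_interval(0.10, 0.20, p0=0.5))
```

The first interval contains 0.5, so the null is not rejected. The second interval lies entirely below 0.5, so the null is rejected.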

I wish to thank the National Science Foundation under Grant 233582 for supporting our work. Thank you for watching.