|
V33
Hypothesis Testing

Welcome to part nine of our video
series in support of hypothesis testing. In this video, we are going to
discuss the goodness of fit test. I'm Renee Clark
from the Swanson School of Engineering at the University of Pittsburgh.

Okay, so, thus far, we've tested
hypotheses about population parameters, right? A quantity such as the
population mean, mu, or perhaps a population proportion, or the population
variance, or the difference in two means, or the ratio of two population
variances, right, or maybe the difference in two population proportions. Okay,
so, we've tested hypotheses about parameter values. Next, we are going to
test hypotheses, not about parameters, but about data distributions. Okay,
so, specifically, this type of a hypothesis test asks a question such as, “does
the distribution of my observed, or sample, data follow a
certain, we'll call it, theoretical or hypothesized distribution?” So, for example, does my observed
data that I have follow, say, a normal distribution, or does it follow a
uniform distribution, okay, or any other distribution that exists? Exponential,
for example. Okay, this type of hypothesis test is known as a goodness
of fit test between two distributions. Those two distributions are my
observed, or my sample distribution, the sample of
data that I have, versus some theoretical distribution that I'd like to
compare my sample data to.

Let's look at the example of
tossing a six-sided die. Okay, now, with a die, the outcomes have a uniform
distribution, okay, assuming each side of the die has an equal
probability of occurring. Okay, in fact, these outcomes
have a discrete uniform distribution, okay,
because of the six distinct or separate sides of the die. Okay, so, let's say
we wanted to investigate whether a particular die is fair. Okay, we would begin with the null
hypothesis that our outcomes have a discrete uniform distribution, meaning
that each face of the die is equally likely to occur. Okay, so, the
probability function for a fair die looks like this: f(x) = 1/6 for
each of your six distinct outcomes, okay, or six sides of the die. Okay, so,
in this probability function, there are six equal probabilities, okay, each
worth 1/6. Okay, so, let's say we toss a die 120 times, and that die
is in fact fair, okay, or we hypothesize that die to be fair, that is, that the
outcomes have a discrete uniform distribution. Okay, then, we expect, and
this is a key word, we expect each face of the die to appear 20 times, okay,
as shown in the table. Okay, we would obtain that by taking 120 over 6, which is
20. Okay, so, that's what we expect.

Okay, now, what we actually observe, though, when rolling that die may not
exactly match what we had expected, okay? So, a goodness of fit test is going
to explore the differences between what you expect and what you actually observe.

So, for example, for outcome one, or
face one, we, in this case, did in fact observe 20, and we expected 20. But,
for face number three, we actually observed 17
although we expected 20. And, in other cases,
for face six, for example, we observed more than 20. Okay, so, each
of these columns does add to 120. Okay, so, a goodness of fit test that I
have abbreviated right there is going to explore whether the differences
between your expected and your observed frequencies are due, perhaps, to
chance, okay, or, alternatively, whether they're due to the distribution not
being discrete uniform, or, in other words, whether they're
due to the die not being fair.

We wish to thank the National
Science Foundation under Grant 233582 for supporting our work. Thank you for watching. |
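
As a supplement to the die example in this video, here is a minimal Python sketch. It builds the discrete uniform probability function f(x) = 1/6, computes the expected count of 20 per face for 120 tosses, and lists the observed-minus-expected differences that a goodness of fit test examines. Only the observed count for face one (20), the observed count for face three (17), and the fact that face six came up more than 20 times are taken from the video; the remaining observed counts are hypothetical values chosen so that the column adds to 120, and all function and variable names are illustrative assumptions.

```python
from fractions import Fraction

FACES = [1, 2, 3, 4, 5, 6]


def uniform_pmf(faces):
    """Discrete uniform probability function: f(x) = 1/k for each of k outcomes."""
    k = len(faces)
    return {face: Fraction(1, k) for face in faces}


def expected_counts(pmf, n_tosses):
    """Expected frequency of each outcome under the hypothesis: n * f(x)."""
    return {face: n_tosses * p for face, p in pmf.items()}


n_tosses = 120
pmf = uniform_pmf(FACES)
expected = expected_counts(pmf, n_tosses)  # 120 * 1/6 = 20 for every face

# HYPOTHETICAL observed counts. Only face 1 (20), face 3 (17), and
# "face 6 is more than 20" come from the video; faces 2, 4, 5 and the
# exact value for face 6 are made up so that the total is 120.
observed = {1: 20, 2: 22, 3: 17, 4: 18, 5: 19, 6: 24}

# The differences a goodness of fit test goes on to examine.
differences = {face: observed[face] - expected[face] for face in FACES}

print("face  observed  expected  difference")
for face in FACES:
    print(f"{face:>4}  {observed[face]:>8}  {expected[face]!s:>8}  {differences[face]!s:>10}")
```

This only tabulates the differences. Deciding whether they are due to chance or to the die not being fair is the job of the goodness of fit test itself.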