|
V22
Estimation
Welcome to part five of our video
series on estimation. In this video, we are going to return to the concept of
pairing data, but we're going to talk specifically about why we pair data. Okay,
and then we're going to discuss the setup that paired data needs to take
in order to do statistical, or inferential, analysis with
the data. I'm Renee Clark from the Swanson School of Engineering at the
University of Pittsburgh. Okay, so, let's talk about
why we pair data. Okay, let's recall before and after studies that we
discussed in a previous video, okay, in which the same person, or it could
even be an item, is measured both before and after a certain
intervention or method that you are trying, or testing, out. Okay,
let's say that intervention happens to be a new teaching method in the
classroom, or something different you're trying in
the classroom. Okay, and you are trying to assess students’ abilities with
your enhanced, or new, method. Now, with pairing, okay, you're able to
test that same child's ability both before and after you
apply your new method. Now, if you're able to do that,
to test that same child's ability both before and after, okay, that's better
than testing your method using two completely independent groups of
students, with completely different students
in each group, okay, in which you may test one group with your method and
then compare that to a second group in which you didn't use your method. Okay,
pairing is better than using two independent groups, and the reason for
that is, with pairing, okay, you're able to control for variables that exist
between students. Okay, things such as, perhaps, what might have been their
prior knowledge on a topic or in a certain area, or what their natural
capabilities might be. It could also include more social
variables, such as parental oversight or socioeconomic status, etc. Right? There
are many, many, many variables that lead to differences
student to student. But, with pairing, you're able to
control for these variables. Okay, so, in essence,
what pairing does is it eliminates or controls for these other sources of
variability. That's key. Okay, so, you'll recall that with pairing,
okay, the experimental unit remains the same both before and after, or
without and with, your intervention, right? So, let's say we are measuring
Renee. She is an experimental unit, but we're going to record her
measurements both before and after. Okay, each row, or each subject,
is an experimental unit, and they remain the same throughout the study both
before and after. Okay, so, this experimental unit remains
the same, or has the same variables, both before and after, including
variables that are not being tested, right? Okay, so, mathematically, pairing
reduces the variance. Okay, it reduces the variance in the
difference between your two variables. So, X and Y represent each
of your two dependent populations. Okay, this is the formula for
the variance of the difference in the two populations, and the reason that, mathematically,
the variance is reduced is that you are subtracting the positive
covariance there. Okay, you can see how you're
subtracting that. You're subtracting two times that.
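The formula on the slide is not reproduced in this transcript. Presumably it is the standard identity for the variance of a difference of two random variables, which in the X and Y notation used here reads:

```latex
\mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) - 2\,\mathrm{Cov}(X, Y)
```

When X and Y are independent, the covariance is zero and the variance of the difference is just Var(X) + Var(Y), so any positive covariance makes the variance of the paired difference smaller than that.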
The covariance term is typically positive for paired data, okay, and so
just recall that covariance is a measure of the nature of the linear
relationship between two variables. In this case, the two variables would be
X and Y, and these variables are not independent, right? They're dependent
because they're paired. So, if they're dependent and not independent,
you would expect them to have a relationship of some sort. Okay, so,
paired data has a certain setup for
statistical analysis. Okay, so, with paired data, we say that we have, in
rows, subjects or pairs, however you want to say it. Okay, so, in this
example shown here, we have six rows, or six pairs of before and after data.
X1 represents the before measurement. X2 represents the after measurement.
Okay, and, if you recall from an
earlier video, the quantity that we're actually going
to be analyzing with a paired data analysis is the difference between the two
measurements. Okay, and we call these differences between the X1 and the X2 d
sub i, where i is simply the
row number. Okay, the difference is calculated either by taking X1 - X2
or X2 - X1. It doesn't matter in which order you take the difference,
as long as you use the same order for every row, okay? Okay,
so, for the first row, for example, you see that its d sub i, or its difference,
is equal to one, which was obtained by
2 – 1. So, we took X2 - X1. The second row's value of two was 5 – 3, and so on down
the table. Okay, so, with this data, we obtain six individual differences,
right, because there were six rows. Okay, so, what you do next with these
differences is that you average them. Okay, so, it just so happens that these
differences are equal to 1, 2, 3, 4, 5, 6. If you were to take the
average of those six numbers, you would calculate 3.5, which you can see is
right in the middle there. We call that average D bar.
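As a quick check of the arithmetic, here is a minimal Python sketch that averages the six differences from the table. It also computes the sample standard deviation of the same differences, which the video turns to next. Only the first two (before, after) pairs are stated in the video, so only those are used to illustrate the subtraction. The variable names are illustrative and do not come from the video.

```python
from statistics import mean, stdev

# The six differences d_i from the table (after minus before, X2 - X1).
differences = [1, 2, 3, 4, 5, 6]

# The first two rows as described in the video: (before X1, after X2).
# The remaining rows are not stated in the transcript, so they are omitted.
first_rows = [(1, 2), (3, 5)]

# Taking X2 - X1 for each pair reproduces the first two differences.
first_diffs = [x2 - x1 for x1, x2 in first_rows]

d_bar = mean(differences)   # average of the differences (D bar)
s_d = stdev(differences)    # sample standard deviation (n - 1 denominator)

print(f"first two differences = {first_diffs}")
print(f"d_bar = {d_bar}")
print(f"s_d   = {s_d:.2f}")
```

Note that `stdev` uses the n - 1 (sample) denominator. That is the version that rounds to the 1.9 quoted in the video, whereas the population version (`pstdev`) would give a smaller value.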
Okay, so, D bar is the average across all your individual
differences in that table. Okay, using those six individual differences, 1, 2,
3, 4, 5, 6, we also calculate the standard deviation, which, again, if you were
to calculate the sample standard deviation of 1, 2, 3, 4, 5, 6, you would come up with
about 1.9. We give that the symbol S sub d, for standard deviation of the
differences. That is, S sub d is the
standard deviation of all of your individual differences, d sub i.
We wish to thank the National
Science Foundation under Grant 233582 for supporting our work. Thank you for watching. |