Fullerton College Balanced and Unbalanced Distributions Probability Worksheet

Description

ONLY DO PAGES 61-64

Lecture 4 - Part I: Balanced and Unbalanced Distributions
The first part of this lecture will answer the following questions:
Q1. If you collect data, and you have an outlier (an extreme score: either a really low score or a really
high score) how will that outlier affect the average score (a measure of central tendency) of the
distribution? For example, if we had a list of salaries, and one of those salaries was Bill Gates'
salary, how would that extremely high salary affect the mean of the set of salaries?
Q2. We have three measures of central tendency. Which is the best measure to use for a set of data:
the mean, median or mode?
Q3. If the data you collect results in two clusters of scores (such as: the times to run a race might fall
into 2 clusters – one for males and another for females) how would you report the average score?
Q4. If you graphed (created polygons) of distributions with an outlier (that is, a distribution with either
an extremely high score or an extremely low score), how will the resulting graphs appear?
Q5. If you are presented with the measures of central tendency for a set of data, and the 3 measures are
the same value (M = Mdn = Mo) what is the difference between that distribution and another
distribution in which the M, Mdn and Mo are not the same value?
From now on, we will be using curves to approximate the shape of a frequency polygon. When you
see a curve, you need to remember that the X and Y axes are "there," even though they will no longer be
drawn, and remember that frequency is represented on the Y axis and that scores (data) are
represented on the X axis. For example, previously we graphed polygons similar to this:
Polygon:
From now on, we will only show curves (approximations of polygons) as follows:
When you are presented with a curve, visualize the following information that will not be provided:
(Q1) Some distributions are balanced, that is, high scores are balanced out by low scores. When
these distributions are graphed, the curve that represents the shape of the polygon and histogram will
be _______________________________ – folding the curve at the center creates two identical
halves. Example of symmetrical curve:
(this is balanced, not skewed, a term we are about to define)
48
__________________________________________ – a distribution that is unbalanced by an extreme
score (or a few extreme scores) at or near one end.
Let's compare two distributions and their means. First, here is a distribution of exam scores:
63
65
68
68
68
70
70
71
72
M = 68.33
Here is a very similar distribution of exam scores:
63
65
68
68
68
70
70
71
200
M = 82.56
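The two means above can be checked with a short sketch. It uses Python's standard `statistics` module and the exam scores exactly as listed; the variable names are only illustrative.

```python
from statistics import mean

# Exam scores copied from the two lists above.
scores = [63, 65, 68, 68, 68, 70, 70, 71, 72]
scores_with_outlier = [63, 65, 68, 68, 68, 70, 70, 71, 200]

# Changing a single score from 72 to 200 moves the mean by about 14 points.
mean_balanced = round(mean(scores), 2)
mean_skewed = round(mean(scores_with_outlier), 2)

print(mean_balanced)  # 68.33
print(mean_skewed)    # 82.56
```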
Notice that this second distribution is a ____________________________ distribution (unbalanced by
an extreme score).
The extreme scores are called _________________________________.
(Q2) In general, when we want to describe a distribution, the best measure of central tendency is the
mean (it gives us the most information), but let us look at the means of the two distributions above.
Notice how one extreme score had a huge effect on the value of the mean in the second (skewed)
distribution. Is this mean (82.56) a good representation of the typical exam score in this distribution (a
good measure of central tendency)? _____________
From this example, we see that when a distribution is skewed, the mean is __________ the best
measure of central tendency to describe the distribution.
In sum, the mean is NOT appropriate if the distribution is somewhat unbalanced (skewed). That is, the
mean does not do a good job of representing a skewed distribution.
What is the median of this last distribution? Mdn =
49
Notice how the median is a better representation of the typical exam score in this distribution than the
mean. From this example, we learn that the best measure of central tendency for describing a
skewed distribution is the ____________________ (since the median is the "middle" score, it is not
as affected by extreme scores).
Note: Income related distributions are almost always skewed (and as we will see later, they are almost
always skewed to the right). Therefore, from what we just learned, if you are describing an income
distribution (such as a distribution of salaries), the most appropriate measure to use is almost always
the ___________________ (because income distributions are almost always skewed, and the median is
the most appropriate measure of central tendency for skewed distributions).
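As a rough illustration of the note above, the sketch below compares the mean and the median of a small salary list. The salaries are hypothetical values invented for this example (they do not come from the lecture), with one very large salary acting as the outlier.

```python
from statistics import mean, median

# Hypothetical salaries (assumption for illustration only); the last one is an outlier.
salaries = [32_000, 35_000, 38_000, 40_000, 41_000, 45_000, 1_000_000]

mean_salary = mean(salaries)
median_salary = median(salaries)

print(round(mean_salary))  # 175857
print(median_salary)       # 40000
```

In this made-up list, six of the seven salaries are below 46,000, so the middle score describes a typical salary far better than the mean does.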
Review: What is a bimodal distribution?
(Q3) A bimodal distribution has two distinct ______________________ of scores and therefore a
single value (such as the mean or median) is not a good description of the distribution.
An example of a bimodal distribution would be the case of a hypothetical distribution of running times
in seconds of the 100 yard dash for a co-ed physical education class. (Note: this example is making an
assumption that everyone in the class would identify as either male or female and, of course, we now
know this is not the case.) The running times for girls would probably cluster around one mode, and
the running times for boys would probably cluster around another mode. If the physical education
teacher wanted to best express the average results, the teacher would present both modes. As you
see, the best representation of a bimodal distribution is the modes, because the modes reflect BOTH
clusters of scores.
When a distribution has more than one mode, the _________________ themselves, not the mean
or the median, should be the measure of central tendency used to represent the distribution.
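The idea of reporting both modes can be sketched with `statistics.multimode` (available in Python 3.8 and later), which returns every value tied for the highest frequency. The running times below are hypothetical numbers made up for this example.

```python
from statistics import mean, median, multimode

# Hypothetical running times in seconds (assumption for illustration only).
times = [12, 12, 12, 13, 14, 15, 16, 16, 16, 17]

modes = multimode(times)   # every value tied for most frequent
mean_time = mean(times)
median_time = median(times)

print(modes)        # [12, 16]
print(mean_time)    # 14.3
print(median_time)  # 14.5
```

Here the mean and median both land between the two clusters, on values that almost nobody actually ran, while the two modes point at the clusters themselves.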
Check your understanding:
1.
When a distribution is skewed (contains an outlier(s)), what is the best measure of central
tendency?
2.
When a distribution is bimodal, what is the best measure of central tendency?
3.
In general, which measure of central tendency would be best to use (if the distribution is not
skewed or bimodal)?
4.
If you are at a job interview and you are told the "average" salary, which measure of central
tendency would be most representative of the average salary (given that we know that income
distributions tend to be skewed)?
50
(Q4) Positively and Negatively Skewed Distributions and Measures of Central Tendency
_______________________________________ skewed distributions (or "skewed to the left"
distributions) are distributions that have a few extremely low scores. The polygon of this type of
distribution would look something like this:
Note: The "tail" represents the extreme score(s), in this case, the few low scores.
_______________
______________________________________ skewed distributions (or "skewed to the right"
distributions) are distributions that have a few extremely high scores. The polygon of this type of
distribution would look something like this:
Note: The "tail" represents a few extreme high scores.
Note. Distributions of income are almost always positively skewed (skewed to the right).
(Q5) Note, if a distribution is not skewed, nor bimodal, then M = Mdn = Mo (approximately equal
to each other) and may look like this:
As we learned earlier, the highest point represents the frequency of the mode, and if the distribution is
not skewed, the M is about equal to the Mdn which is about equal to the Mode. (The more the M,
Mdn and Mo differ from each other, the greater the skew.)
51
The _________________________ is the measure of central tendency that is most unduly
influenced by a few extreme scores. Thus the mean is always located toward the
___________________ end of the curve.
Also, remember that the _____________ is found under the highest point of a graph (curve).
For example, in a negatively skewed distribution:
In a positively skewed distribution:
Note that the mean is always pulled toward the tail.
Since income distributions are almost always skewed to the right, which measure of central tendency
will be the largest value? ___________________
Therefore, if you know the mean and median of a distribution, you can assess the skewness (this is
one of your questions you must answer on Computer Assignment #1) of the distribution by
comparing the sizes of the M and Mdn. This is how you assess skewness.
• If the M = Mdn = Mo, then the distribution is not skewed.
• If the mean is the measure of central tendency that is largest in value, then the distribution is
  skewed to the _________________________.
• And if the mean is the measure of central tendency that is the lowest in value, the distribution
  is skewed to the ________________________.
• Also note that the larger the difference between the mean and the median, the greater the total
  amount of skew of the distribution.
• Since income distributions are almost always skewed right, the mean of an income
  distribution is usually very high relative to the actual scores.
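The comparison described in this list can be sketched as a small function. The function name `skew_direction` and the example numbers are hypothetical; the only rule it encodes is the lecture's own statement that the mean is pulled toward the tail.

```python
def skew_direction(mean_value, median_value):
    """Rule-of-thumb skew check based only on comparing M and Mdn."""
    if mean_value > median_value:
        # Mean pulled toward a tail of high scores.
        return "skewed right (positive)"
    if mean_value < median_value:
        # Mean pulled toward a tail of low scores.
        return "skewed left (negative)"
    return "not skewed"


# Hypothetical values, not taken from the worksheet problems.
print(skew_direction(30, 20))  # skewed right (positive)
print(skew_direction(20, 30))  # skewed left (negative)
print(skew_direction(25, 25))  # not skewed
```

The size of the gap between the two arguments gives a rough sense of how strong the skew is, as the list above notes.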
Therefore, if I were to tell you that for a given distribution, the measures of central tendency are: Mo
= 40, the Mdn = 25, and the M = 7,
is this distribution skewed? ______
If so, in which direction? __________
Because?
52
Check your learning with the following problems (we will do some of these together)
Questions regarding which is the best measure of central tendency to use:
1. Given that a distribution has the following measures of central tendency
M = 12, Mdn = 12, Mo = 12
which measure of central tendency is most appropriate and why?
The _________, because the distribution is ________
2. Given that a distribution has the following measures of central tendency,
M = 21.25 Mdn = 12.00 Mo = 12
which measure of central tendency is the most appropriate and why?
The ____________, because the distribution is __________
3. Given that a distribution has the following measures of central tendency.
M = 63.22 Mdn = 63.00 Mo = 62 and 65
Which measure of central tendency is most appropriate and why?
The ____________, because the distribution is _____________
Questions regarding assessing the skewness of a distribution by comparing the M and Mdn:
4. The median of a distribution of incomes is 50,000 dollars and the mean of the distribution of
incomes is 75,000 dollars. Because the M ____ Mdn, this distribution is skewed to the
____________________.
5. You determine that a distribution has a M = 60, Mdn = 60 and Mo = 60. Is the distribution
skewed or not skewed? If skewed, is it skewed right (positive) or skewed left (negative). Why?
6. A distribution has a M = 200, Mdn = 150 and Mo = 120. Is the distribution skewed or not
skewed? If skewed, is it skewed right (positive) or skewed left (negative). Why?
7. A distribution has a M = 350, Mdn = 450 and Mo = 500. Is the distribution skewed or not
skewed? If skewed, is it skewed right (positive) or skewed left (negative). Why?
Given what we have learned about skewed distributions, answer the following:
8. A distribution is skewed to the right. Which measure of central tendency (mean, median or
mode) will be the largest number? Hint: draw a sketch of a distribution skewed to the right.
Place the M, Mdn and Mo in their proper places relative to each other.
9. A distribution is skewed to the right. Which measure of central tendency will be the smallest
value? (see hint above)
10. Given that income distributions (such as salaries) are often skewed to the right, if you are
considering a job and know the MEAN salary for that job, it is likely that the mean will make
the salaries appear ____________________ (larger or smaller)?
53
Part II of Lecture 4:
Dispersion, also called Variation (how values in a distribution vary)
This part of the lecture will answer the following questions:
Some distributions may have scores that are similar to each other while other distributions may have
scores that are wildly dissimilar.
1. How can we tell whether distributions contain scores that are similar to each other or contain
dissimilar scores? Is there a measurement that tells us how much the scores differ from each
other (or how much they differ from the mean)?
2. How will graphs (curves) of distributions look different given whether the distribution contains
similar scores, or whether the distribution contains dissimilar scores?
Measures of central tendency (M, Mdn, Mo) tell us about the "average" or typical score for a given
distribution.
Measures of VARIABILITY tell us about how scores in a distribution ___________or ________.
Another term for variability is DISPERSION.
Importance of measures of variability: What if we knew that the mean temperature in Anchorage, Alaska, is
56 degrees F, and that in Honolulu, Hawaii, the mean temperature is 64 degrees F? It would appear, by
comparing the means, that these two distributions of temperatures were similar, thus it would appear
that the climate is similar.
However, knowing information about the VARIABILITY of the distributions of temperatures gives a
whole different picture of a comparison of the climates. For example, in Honolulu, the temperature
rarely rises above 84 degrees F or dips below 50 degrees F. In Anchorage, on the other hand, the
temperature may reach 100 degrees F during the summer and drop to a negative 40 degrees F during
the winter.
Measures of variability (dispersion)
___________________________________ is the measurement of the width of the entire distribution
and is found simply by figuring the difference between the highest and the lowest scores. It is
symbolized by R.
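A minimal sketch of the range calculation, using a hypothetical list of scores invented for this example:

```python
# Hypothetical scores (assumption for illustration only).
scores = [4, 9, 2, 7, 5]

# R = highest score minus lowest score.
score_range = max(scores) - min(scores)

print(score_range)  # 7
```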
Ex: In a distribution the highest score is 95 and the lowest score is 25. What is the range? R =
Check your learning (together):
Calculate the range of the following distribution of quiz scores (together):
1. Quiz 1 scores: 1, 1, 3, 2, 10 , 6, 7, 7
R=
2. Quiz 2 scores: 8, 9, 15, 8, 2, 8, 3
R=
54
Deviation from the mean
One measure of variability is a ________________________ score (also called the "deviation from
the mean" score), represented by the symbol x. The deviation score can be found, for each score in a
distribution, by subtracting the mean from the raw score (X - M). Thus, the deviation score, x,
represents (1) the distance from the mean and (2) the direction of any raw score from the mean
(whether the raw score is above the mean or below the mean): if the deviation score (x) is
positive, then the raw score is above the mean, and if the deviation score (x) is negative, the raw score
is below the mean.
Deviation Score: x = X - M.
Let's calculate x for the scores 7, 7, 6, 5, 5. The M = 6 (remember: M = ΣX/N, which in this case
equals (7 + 7 + 6 + 5 + 5)/5 = 30/5 = 6).
X      (X - M) = x
7
7
6
5
5
Notice that the deviation scores (x) add to equal zero. This will always be the case (that is, when you
add the positive deviation scores together, they will always equal the sum of the negative deviation
scores) due to the definitions of the mean and the deviation scores.
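The claim that deviation scores always sum to zero can be checked with a short sketch. The three scores are hypothetical values chosen for this example so that the worksheet table above is left for you to complete.

```python
from statistics import mean

# Hypothetical raw scores (assumption for illustration only).
raw_scores = [2, 4, 9]

m = mean(raw_scores)  # M = (2 + 4 + 9) / 3 = 5

# Deviation score for each raw score: x = X - M.
deviations = [score - m for score in raw_scores]

print(deviations)       # [-3, -1, 4]
print(sum(deviations))  # 0
```

The positive deviation (4) exactly balances the negative ones (-3 and -1), which is what the definition of the mean guarantees.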
Check your learning: Calculate the following deviation scores.
Five convicts were given the following five prison sentences (in years): 13, 7, 6, 2, 2
The mean equals 6 years (M = 6). (Remember: M = ΣX/N, which in this case equals (13 + 7 + 6 + 2
+ 2)/5 = 30/5 = 6).
X      X - M      x
13
7
6
2
2
The deviation score (x = X – M) tells us how much each score deviates from the mean.
We often want to know how much (on average) all the scores in a distribution vary from the mean. If
we can take an average of our deviation scores (x), we will know how much, on average, all the scores
deviate from the mean (we would know the average, or standard, deviation from the mean).
What is the formula for the average deviation score? To find the average (standard) deviation we
would take every deviation score for every score in the distribution, add them together and divide by
the number of scores (that is, we would average the deviation scores). This is the formula for the
average of the deviation scores (x): Σx / N
55
You may notice a problem in that earlier we stated that all deviation scores (x) add up to zero, and this
presents a problem in that the numerator would always be zero. We will learn how to deal with this
problem in a minute, but for now, let us discuss the purpose of the average of the deviation scores.
Instead of saying "the average of how far each score in a distribution deviates from the distribution's
mean" we say the "standard deviation." (More precisely, the standard deviation is actually the square
root of the average of the squared deviation scores, as we will see in a minute.) Thus the
_______________________ _________________________ is a measure of variability that
indicates by how much all of the scores in a distribution typically differ or vary from the
___________________.
The standard deviation is the most commonly used measure of how much all the scores, on average,
deviate from the mean of the distribution. The larger the standard deviation, the ______________ the
scores are spread out around the mean. The smaller the standard deviation, the _________________
the scores are spread out around the mean. Thus by comparing the sizes of the standard deviation of
different distributions, we can compare how much the scores vary from the mean. For example, which
distribution would have a larger standard deviation? A distribution of temperature in Hawaii or a
distribution of temperatures in Alaska? _________________________ temperatures would have a
larger standard deviation because there is more variability in the temperatures.
Now, how can we calculate a meaningful average of the deviation scores, when we just saw that the
deviation scores will always add up to zero?
We can square the deviation scores (calculate x squared), and calculate an average of the squared
deviation scores. To calculate the average of the squared deviation scores, we add the squared
deviation scores together and divide by our number of scores, as follows:
Σx² / N
This average (mean) of the squared deviation scores is the _______________________.
Since we had to square the deviation scores, for a more meaningful measure of variability we could
take the square root of the variance. The standard deviation is the square root of the average of the
squared deviation scores (square root of the variance):
SD = √(Σx² / N)
(In other words, the SD is equal to the square root of the Variance: SD = √V.)
The standard deviation is the most commonly used measure of how much all the scores, on
average, deviate from the mean of the distribution (but don't forget that technically, the standard
deviation is the square root of the average of the squared deviation scores.)
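The deviation method described above can be sketched step by step. The eight scores are hypothetical values chosen for this example because they give round numbers; `pvariance` and `pstdev` from Python's `statistics` module (which also divide by N) are used only as a cross-check.

```python
import math
from statistics import pstdev, pvariance

# Hypothetical scores (assumption for illustration only).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(scores)
m = sum(scores) / n                       # M = ΣX / N = 5.0

deviations = [x - m for x in scores]      # x = X - M
squared = [d ** 2 for d in deviations]    # x²

variance = sum(squared) / n               # V = Σx² / N
sd = math.sqrt(variance)                  # SD = √V

print(variance)  # 4.0
print(sd)        # 2.0

# Cross-check against the library functions that divide by N.
print(math.isclose(variance, pvariance(scores)))  # True
print(math.isclose(sd, pstdev(scores)))           # True
```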
56
Given these definitions of deviation scores (x), variance (V), and standard deviation (SD), we can
calculate the standard deviation of a set of scores (a sample).
The deviation method for determining the standard deviation follows directly from its definition:
the standard deviation is the square root of the average of the squared deviation
scores.
SD = √(Σx² / N)
Let us look back at the deviation scores you calculated for those prison sentences. Here was the "Check
your Learning" problem:
Five convicts were given the following five prison sentences (in years): 13, 7, 6, 2, 2
The mean equals 6 years (M = 6).
Now we will calculate the standard deviation using the deviation method.
Whenever you are asked to calculate the standard deviation using the deviation method, make
the following chart.
Below is the chart with the information that we already calculated.
Review, what will happen when we sum the deviation scores? ______________
That is why we need to square the deviation scores.
X      X - M      x      x²
13     13 - 6     7
7      7 - 6      1
6      6 - 6      0
2      2 - 6      -4
2      2 - 6      -4
Now we can average the squares of the deviation scores (and remember that this measure is the
variance). This is how to calculate the variance:
V = Σx² / N
Since we had to square our deviation scores, it often is more descriptive of a distribution if we take
the square root of our average. As we saw above, the square root of the average of the squares of the
deviation scores (square root of the variance) is our very important measure of variability, the
_standard_ _deviation___.
57
As we see, our standard deviation (SD) is
SD = √(Σx² / N) = √V
or in other words, the SD is equal to the square root of the Variance.
For the example we are working on, this is how we would calculate the standard deviation once we
know the variance:
When we compute the SD this way (using the deviation method), it is clear that the variance is "the
average of the squared deviations from the mean" and that the square root of the variance (square root
of the average squared deviations from the mean) is the standard deviation (which gives us
information about how much the scores, on average, deviate from the mean).
Check your learning:
Five different convicts were given the following five prison sentences (in years): 10, 8, 6, 4, 2
The mean equals 6 years. (Remember: M = ΣX/N, which in this case equals (10 + 8 + 6 + 4 + 2)/5 =
30/5 = 6).
Use the deviation method to calculate the variance and standard deviation.
Note, whenever you are asked to calculate the standard deviation, using the deviation method,
first make this chart:
X      X - M      x      x²
10
8
6
4
2
Note: This is how to calculate the true standard deviation of a sample of scores (SD). When SPSS
and most scientific calculators calculate the standard deviation, they use the sample of scores to
estimate the population's standard deviation (its symbol is s). As we stated earlier, a researcher is
usually most interested in using a sample of scores to learn about the population, and thus the
estimated standard deviation of the population (s) is the standard deviation you most often
encounter, not the true standard deviation of a sample (SD) which is what we are calculating here.
The estimate of the population's standard deviation (s), the standard deviation most often
encountered, is slightly larger than the true standard deviation of the sample (SD) that we are
currently learning to calculate. Much more on the estimate of the population's standard deviation (s)
later.
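The difference between SD and s can be previewed with Python's `statistics` module, where `pstdev` divides by N (matching the SD calculated in this lecture) and `stdev` divides by N - 1 (the usual estimate of the population's standard deviation). The scores are hypothetical values invented for this example, and the sketch makes no claim about how any particular calculator or statistics package labels its output.

```python
from statistics import pstdev, stdev

# Hypothetical scores (assumption for illustration only).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

sd_sample = pstdev(scores)   # divides Σx² by N
s_estimate = stdev(scores)   # divides Σx² by N - 1

print(sd_sample)             # 2.0
print(round(s_estimate, 2))  # 2.14
```

Dividing by the smaller number N - 1 is what makes s come out slightly larger than SD for the same scores.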
58
_______KURTOSIS_________________ describes how peaked or how flat a curve (graph) is. A
curve (graph) is very peaked when there is a small variability (small standard deviation) and a curve is
very flat if there is a large standard deviation (large variability).
When you have a distribution where most scores cluster around the mean, and which therefore has a small
amount of variability around the mean (that is, the deviation scores are small, and there is a small
standard deviation), it is called a _______LEPTOKURTIC___________ curve, and
it will be very peaked, and look like this when graphed:
When you have a distribution where the scores deviate widely from the mean, and which therefore has a large
amount of variability around the mean (that is, it has large deviation scores and a large standard
deviation), it is called a _____PLATYKURTIC______ curve, and
it will look like this when graphed:
When you have a distribution that is about halfway between these two, it is called
____MESOKURTIC_________ . This shape of curve will become very important in the next chapter
and it looks like this:
SEE NEXT PAGE TO HELP YOU REMEMBER THE NAMES OF CURVES
59
60
NAME ________________________________________________
Homework Assignment #4:
1. Given that a distribution has the following measures of central tendency
M = 120, Mdn = 100, Mo = 80
which measure of central tendency is most appropriate and why?
The _________, because the distribution is ________
2. Given that a distribution has the following measures of central tendency,
M = 120 Mdn = 120 Mo = 120
which measure of central tendency is the most appropriate and why?
The ____________, because the distribution is __________
3. Given that a distribution has the following measures of central tendency.
M = 85 Mdn = 83 Mo = 77 and 90
Which measure of central tendency is most appropriate and why?
The ____________, because the distribution is _____________
Questions regarding assessing the skewness of a distribution by comparing the M and Mdn:
4. You determine that a distribution has a M = 80, Mdn = 80 and Mo = 80. Is the distribution
skewed or not skewed? If skewed, is it skewed right (positive) or skewed left (negative). Why?
Because the M ____ Mdn, this distribution is _______________.
5. A distribution has a M = 500, Mdn = 450 and Mo = 400. Is the distribution skewed or not
skewed? If skewed, is it skewed right (positive) or skewed left (negative). Why?
Because the M ____ Mdn, this distribution is _______________.
6. A distribution has a M = 50, Mdn = 70 and Mo = 80. Is the distribution skewed or not
skewed? If skewed, is it skewed right (positive) or skewed left (negative). Why?
Because the M ____ Mdn, this distribution is _______________.
7. The median home value in a community is $300,000. In the same community, the mean home
value is $450,000. Is this distribution of home values skewed? If so, in which direction?
Why?
Because the M ____ Mdn, this distribution is _______________.
Given what we have learned about skewed distributions, answer the following:
8. If the mean is higher than the median, the distribution is skewed to the __________.
9. If the mean is lower than the median, the distribution is skewed to the _________.
10. If a distribution is skewed, what is the appropriate measure of central tendency? ______
61
11. A distribution is skewed to the left. Which measure of central tendency (mean, median or
mode) will be the largest number? Which measure of central tendency will have the smallest
value? First, draw a sketch of a distribution skewed to the left. Place the M, Mdn and Mo
in their proper places relative to each other. Your sketch:
When a distribution is skewed to the left, the ______________ will be the largest measure of
central tendency and the _______________ will be the smallest measure of central tendency.
12. By comparing the mean and median, the greater the difference between them, the
____________________ (greater? Or Lesser?) the skew.
13. You are presented with measures of central tendency regarding divorce. The average number
of years a couple are married before divorce is 7 years, and the median number of years is 6;
however, most divorces occur at either 3 years of marriage or at 23 years of marriage. Which
measure of central tendency would be best to use to describe this data and why?
Because this distribution is _______________, the best measure of central tendency is _________
14. Which measure of central tendency is most influenced by extreme scores? _____
15. If a distribution is negatively skewed, the mean lies to the _________ (left or right?) of the
median.
16. A symmetrical distribution must only have one mode? ________ (True or False)
17. When a distribution is bimodal, what is the best measure of central tendency to describe the
data? _____
Given the three curves above:
18. Which one represents a platykurtic distribution? _____
19. Which one represents a mesokurtic distribution?______
20. Which one has the smallest standard deviation? ______
21. Which one has the largest standard deviation? ______
22. Name the kurtosis of the curve represented in "B": _________
62
Figure A ^
Figure B ^
23. Which one of the above curves is positively skewed?_______
24. Which distribution would have a mean that was a larger number than the median and mode?
25. Which figure is the one most likely to have the following measures of central tendency:
M = 6, Mdn = 10, Mode = 12? ________
26. What is the best measure of central tendency to describe Figure B? _____
27. What is the best measure of central tendency to describe Figure A? _______
28. Draw a curve, below, that is a bimodal distribution
29. What is the best measure of central tendency for the distribution you just drew? ______
30. Draw a curve in which M = Mdn = Mo
31. Draw a curve that is skewed to the right (positively)
32. When a distribution is skewed right (positively), are there very few high or low scores?
________
33. When a distribution is skewed left (negatively), there are very few ___________ (high? Or
low?) scores.
63
34. Calculate the mean, median, range, variance and standard deviation (use the deviation method, as
we did in this lecture, to calculate the standard deviation) for the following set of essay scores:
9, 12, 16, 13, 10.
Remember, whenever you are asked to calculate the standard deviation, using the deviation
method, first make this chart:
X      X - M      x      x²
35. Calculate the mean, median, range, variance and standard deviation (use the deviation method, as
we did in this lecture, to calculate the standard deviation) for the following exam scores:
3, 5, 10, 7
(Remember to start by making the chart.)
To help you check your work, for this problem, SD = 2.59.
64
