Mathematical Statistics 465 Homework 01, Spring 2005

Due: February 10, 2005, at the start of class

TCGtS stands for The Cartoon Guide to Statistics.

  1. Collect a data set consisting of the weights (in pounds) and heights (in inches) of at least 50 people. Put each height/weight pair on a separate index card.
    1. Make a stem-and-leaf plot for the heights and a stem-and-leaf plot for the weights.
    2. Find the mean, median, and quartiles for the heights and the weights.
    3. Find the inter-quartile range and the standard deviation for the heights and weights.
    4. Make box-and-whisker plots for the heights and for the weights.
    5. Make histograms for the heights and the weights.
    6. Make a scatter plot of your data. Put the heights on the horizontal axis and the weights on the vertical axis.
    7. Make a copy of your scatter plot and overlay the least squares regression line. What is the sum of the squared errors?
    8. Make a copy of your scatter plot and overlay the median-median line. What is the sum of the squared errors?
    9. Make a copy of the scatter plot and overlay the least squares quadratic function. What is the sum of the squared errors?
    10. Make a copy of the scatter plot and overlay the least squares cubic function. What is the sum of the squared errors?
    11. What differences do you notice in your squared errors among the four plots?
    12. What is the correlation coefficient for the height and the weight data you have collected?
    13. What do your results tell you about the following statement and its relationship to your data? "Mass is proportional to volume, and for a sphere, volume is proportional to the cube of the diameter."
  2. What do you expect the histogram to look like for 100 rolls of a die? Now, roll a single die 100 times and record the outcomes. Give summary statistics and a histogram. What do you make of this?
  3. Get dice of two different colors, or at least a pair of dice where you can tell one from the other. (I will refer to them as red and green.) Roll this pair of dice 100 times, and record each outcome as an ordered pair, (Green, Red).
    1. If we were to make a histogram of the sum of the outcomes of each toss, what would you expect the histogram to look like? What is the relation to an addition table for the whole numbers from 1 to 6?
    2. Construct a histogram of these sums from your data, and compute summary statistics. How different is the observed histogram from the one you expected? Propose a statistic that measures this difference and compute it.
    3. Construct a scatter plot of your ordered pairs. Compute lines of best fit both by least squares and by the median-median line method. How good are these fits? Compare the results with the height and weight data from problem 1. Explain any similarities and differences.
    4. What is the correlation coefficient for the results on the green and the red dice? Interpret this result.
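
Several parts above ask for a least squares line, a median-median line, and the sum of the squared errors for each. The following is a minimal Python sketch of one way to compute them, not a prescribed method for this assignment. The function names are hypothetical, and the grouping rule is an assumption: it uses the common three-group version of the median-median line, in which the points are sorted by x and split into thirds, a single leftover point goes to the middle group, and two leftover points go to the outer groups. Some textbooks also keep points with tied x values in the same group, which this sketch does not do.

```python
from statistics import mean, median


def least_squares_line(points):
    """Return (slope, intercept) of the ordinary least squares line."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are equal; slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def median_median_line(points):
    """Return (slope, intercept) of the three-group median-median line.

    Assumed grouping rule: sort by x, split into thirds; one leftover
    point goes to the middle group, two leftover points go to the
    outer groups. Tied x values may be split across groups.
    """
    pts = sorted(points)
    n = len(pts)
    if n < 3:
        raise ValueError("need at least three points")
    base, extra = divmod(n, 3)
    outer = base + (1 if extra == 2 else 0)
    groups = (pts[:outer], pts[outer:n - outer], pts[n - outer:])
    # Summary point of each group: (median of x values, median of y values).
    (x_l, y_l), (x_m, y_m), (x_r, y_r) = [
        (median([x for x, _ in g]), median([y for _, y in g])) for g in groups
    ]
    if x_r == x_l:
        raise ValueError("outer groups share the same median x")
    # Slope comes from the two outer summary points.
    slope = (y_r - y_l) / (x_r - x_l)
    # Intercept is the average of the three lines of this slope that pass
    # through the three summary points.
    intercept = (y_l + y_m + y_r) / 3 - slope * (x_l + x_m + x_r) / 3
    return slope, intercept


def sum_squared_errors(points, slope, intercept):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points)


if __name__ == "__main__":
    # Made-up illustrative points, not real height/weight measurements.
    data = [(60, 110), (62, 120), (64, 135), (66, 140), (68, 155),
            (70, 165), (72, 180), (74, 190), (76, 260)]
    for name, fit in (("least squares", least_squares_line),
                      ("median-median", median_median_line)):
        m, b = fit(data)
        print(name, m, b, sum_squared_errors(data, m, b))
```

Because the least squares line minimizes the sum of the squared errors by construction, its sum can never exceed that of the median-median line on the same data. The median-median line depends only on group medians, so a few extreme points tend to move it less.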