Homework 10

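# Chunk defaults: short figures, and no messages or warnings in the knitted output
knitr::opts_chunk$set(fig.height=3, message=FALSE, warning=FALSE)
# Load the ISCAM workspace provided with the textbook
load(url("http://www.rossmanchance.com/iscam3/ISCAM.RData"))
# Packages for data manipulation, plotting, and reading data files
library(dplyr)
library(ggplot2)
library(readr)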
Homework is available at: http://www.rossmanchance.com/iscam3/instructors.html

Chapter 5 HW 6 (the answer to (d) is a test of association. Why?)

The 2003 study on volunteerism conducted by the Bureau of Labor Statistics reported the sample percentages who performed volunteer work, broken down by many other variables. For example, respondents were categorized by age. The following reports the percentage of sample respondents in each age group who had performed volunteer work in the previous year:

*See the online HW for a better layout to this table of data:*

| Age group | 16–24 years | 25–34 years | 35–44 years | 45–54 years | 55–64 years | 65 or more |
|---|---|---|---|---|---|---|
| % volunteer | 21.9% | 24.8% | 34.1% | 31.3% | 27.5% | 22.7% |

  (a) Is this information sufficient to construct a segmented bar graph for comparing the proportions of volunteers across the various age categories? If so, do so, and comment on what the graph reveals. If not, explain.

  (b) Explain why this information is not sufficient to conduct a chi-square test of whether these sample proportions differ significantly across the age categories.
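
For the first question above, one possible approach is sketched below in base R. It assumes that the reported percentages can be used directly as the heights of the "volunteer" segments, with the remainder of each bar being non-volunteers. The object names `pct`, `age`, and `props` and the shortened age labels are illustrative choices, not part of the assignment.

```r
# Volunteer percentages by age group, copied from the table above.
pct <- c(21.9, 24.8, 34.1, 31.3, 27.5, 22.7) / 100

# Shortened age-group labels (an illustrative assumption).
age <- c("16-24", "25-34", "35-44", "45-54", "55-64", "65+")

# Each column holds the two segments of one bar: volunteers and non-volunteers.
props <- rbind(Vol = pct, noVol = 1 - pct)
colnames(props) <- age

# barplot() stacks the rows of a matrix by default (beside = FALSE),
# which gives one segmented bar per age group.
barplot(props, legend.text = TRUE,
        xlab = "Age group", ylab = "Proportion of respondents",
        main = "Volunteer status by age group")
```

Every bar has total height 1, so the graph compares conditional proportions only and carries no information about how many people were sampled in each group.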

The sample sizes in each age group are not given in the report, but based on other information we can estimate them to be as follows:

*See the online HW for a better layout to this table of data:*

| Age group | 16–24 years | 25–34 years | 35–44 years | 45–54 years | 55–64 years | 65 or more |
|---|---|---|---|---|---|---|
| Sample size | 9,719 | 10,613 | 12,070 | 10,959 | 7,329 | 9,310 |

  (c) Use this information to produce a table of counts with age groups in columns and volunteer status (yes or no) in rows.
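# Observed counts: row 1 = volunteers, row 2 = non-volunteers, one column per age group
vol.data <- matrix(c(2128, 2632, 4116, 3430, 2015, 2113, 7591, 7981, 7954, 7529, 5314, 7197),
                   ncol=6, byrow=TRUE)
rownames(vol.data) <- c("Vol","noVol")
colnames(vol.data) <- c("y1624","y2534","y3544","y4554","y5564", "y65plus")
# Print the 2 x 6 table of counts
vol.data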
##       y1624 y2534 y3544 y4554 y5564 y65plus
## Vol    2128  2632  4116  3430  2015    2113
## noVol  7591  7981  7954  7529  5314    7197
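
The handout does not show where the counts in `vol.data` come from. A plausible reconstruction, offered here as an assumption, is to multiply each estimated sample size by the reported percentage, round to the nearest integer, and subtract from the sample size to get the non-volunteers. The names `n`, `pct`, `vol`, and `counts` are illustrative.

```r
# Estimated sample sizes and reported volunteer percentages, from the tables above.
n   <- c(9719, 10613, 12070, 10959, 7329, 9310)
pct <- c(21.9, 24.8, 34.1, 31.3, 27.5, 22.7) / 100

# Assumed derivation: volunteers = sample size x percentage, rounded to an integer.
vol <- round(n * pct)

# Non-volunteers are whatever remains of each group's sample.
counts <- rbind(Vol = vol, noVol = n - vol)
colnames(counts) <- c("y1624", "y2534", "y3544", "y4554", "y5564", "y65plus")
counts
```

Under this assumption the result matches the `vol.data` matrix entered by hand above.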
  (d) Consider a chi-square test on the table that you produced in (c). Would this be a test of homogeneity of proportions or association between variables? Explain.

  (e) Conduct the chi-square test. Report the hypotheses, check of technical conditions, sampling distribution, test statistic, and p-value. (Provide the details of your calculations and/or relevant computer output.) Summarize your conclusion.

  (f) Construct a 2 \(\times\) 6 table with the same row and column headings as in (c), but containing only + and – signs indicating whether the observed count is larger (+) or smaller (–) than expected in that cell. Does this table reveal a pattern? Explain what that pattern suggests about the relationship between age group and volunteerism.
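
The last two questions above could be approached with base R's `chisq.test()`, as in the sketch below. This is one possible route rather than the required one. The table is re-entered so the block runs on its own, and the names `test` and `signs` are illustrative. For a 2 \(\times\) 6 table the reference distribution has \((2-1)(6-1) = 5\) degrees of freedom.

```r
# Table of counts from (c), re-entered so this block is self-contained.
vol.data <- matrix(c(2128, 2632, 4116, 3430, 2015, 2113,
                     7591, 7981, 7954, 7529, 5314, 7197),
                   ncol = 6, byrow = TRUE,
                   dimnames = list(c("Vol", "noVol"),
                                   c("y1624", "y2534", "y3544",
                                     "y4554", "y5564", "y65plus")))

# Pearson chi-square test on the two-way table.
test <- chisq.test(vol.data)

# Technical condition (common rule of thumb): every expected count is at least 5.
test$expected

# Test statistic, degrees of freedom, and p-value.
test$statistic
test$parameter
test$p.value

# Sign table: "+" where observed exceeds expected, "-" otherwise.
signs <- ifelse(test$observed > test$expected, "+", "-")
signs
```

Because `ifelse()` keeps the shape and dimnames of its logical input, `signs` prints with the same row and column headings as the table of counts.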

Chapter 5 HW 7

U.S. Volunteerism (cont.)
Reconsider the previous question about volunteerism. Suppose that the sample sizes had all been smaller by a factor of 100 (so that the entire study included only about 600 subjects) but that the conditional proportions of volunteerism within each age group had all turned out the same.

  (a) How (if at all) would you expect the segmented bar graph to change? Explain.

  (b) How (if at all) would you expect the test statistic to change? Explain.

  (c) How (if at all) would you expect the p-value to change? Explain.

  (d) How (if at all) would you expect your conclusion to change? Explain.

  (e) Repeat the chi-square analysis with this greatly reduced sample size (round the observed counts in the new table to the nearest integer). Confirm or correct your answers to (b)–(d) in light of this analysis.
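
A minimal sketch of the reduced-sample analysis follows, assuming the new table is obtained by dividing every count in the original table by 100 and rounding. The names `small.data`, `full`, and `small` are illustrative, and the original table is re-entered so the block runs on its own.

```r
# Original table of counts from the previous question.
vol.data <- matrix(c(2128, 2632, 4116, 3430, 2015, 2113,
                     7591, 7981, 7954, 7529, 5314, 7197),
                   ncol = 6, byrow = TRUE,
                   dimnames = list(c("Vol", "noVol"),
                                   c("y1624", "y2534", "y3544",
                                     "y4554", "y5564", "y65plus")))

# Shrink every count by a factor of 100 and round to the nearest integer.
small.data <- round(vol.data / 100)
small.data

# Run the same chi-square test on both tables.
full  <- chisq.test(vol.data)
small <- chisq.test(small.data)

# Check the expected counts for the small table before trusting the p-value.
small$expected

# Side-by-side comparison of statistic, degrees of freedom, and p-value.
rbind(full  = c(statistic = unname(full$statistic),
                df = unname(full$parameter), p.value = full$p.value),
      small = c(statistic = unname(small$statistic),
                df = unname(small$parameter), p.value = small$p.value))
```

Rounding changes the conditional proportions slightly, so the small table only approximately preserves the original percentages.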