knitr::opts_chunk$set(fig.height=3, message=FALSE, warning=FALSE)
load(url("http://www.rossmanchance.com/iscam3/ISCAM.RData"))
library(dplyr)
library(ggplot2)
library(readr)
homework at: http://www.rossmanchance.com/iscam3/instructors.html
The 2003 study on volunteerism conducted by the Bureau of Labor Statistics reported the percentages of sample respondents who performed volunteer work, broken down by many other variables. For example, respondents were categorized by age. The following table reports the percentage of sample respondents in each age group who had performed volunteer work in the previous year:
*See the online HW for a better layout to this table of data:*
| Age group | 16–24 years | 25–34 years | 35–44 years | 45–54 years | 55–64 years | 65 or more |
|---|---|---|---|---|---|---|
| % volunteer | 21.9% | 24.8% | 34.1% | 31.3% | 27.5% | 22.7% |
Is this information sufficient to construct a segmented bar graph for comparing the proportions of volunteers across the various age categories? If so, do so, and comment on what the graph reveals. If not, explain.
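One possible way to draw such a graph from the percentages alone is sketched below. It assumes each respondent is either a volunteer or a non-volunteer, so the two segments in every bar sum to 100%. The names `age_group`, `pct_vol`, `bar_data`, and `seg_plot` are introduced here for illustration only.

```r
library(ggplot2)

# Percentages copied from the table above
age_group <- c("16-24", "25-34", "35-44", "45-54", "55-64", "65 or more")
pct_vol   <- c(21.9, 24.8, 34.1, 31.3, 27.5, 22.7)

# Long format: one row per age group and volunteer status
bar_data <- data.frame(
  age     = factor(rep(age_group, times = 2), levels = age_group),
  status  = rep(c("Volunteer", "Non-volunteer"), each = length(age_group)),
  percent = c(pct_vol, 100 - pct_vol)
)

# geom_col() stacks the two segments, so every bar has total height 100
seg_plot <- ggplot(bar_data, aes(x = age, y = percent, fill = status)) +
  geom_col() +
  labs(x = "Age group", y = "Percent of respondents", fill = NULL)
seg_plot
```

Because only conditional percentages are plotted, the graph does not depend on how many respondents fall in each age group.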
Explain why this information is not sufficient to conduct a chi-square test of whether these sample proportions differ significantly across the age categories.
The sample sizes in each age group are not given in the report, but based on other information we can estimate them to be as follows:
*See the online HW for a better layout to this table of data:*
| Age group | 16–24 years | 25–34 years | 35–44 years | 45–54 years | 55–64 years | 65 or more |
|---|---|---|---|---|---|---|
| Sample size | 9,719 | 10,613 | 12,070 | 10,959 | 7,329 | 9,310 |
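# Two-way table of counts: rows = volunteer status, columns = age group.
# Row 1 holds volunteers, row 2 non-volunteers; each column sums to that group's sample size.
vol.data <- matrix(c(2128, 2632, 4116, 3430, 2015, 2113,
                     7591, 7981, 7954, 7529, 5314, 7197),
                   ncol = 6, byrow = TRUE)
rownames(vol.data) <- c("Vol", "noVol")
colnames(vol.data) <- c("y1624", "y2534", "y3544", "y4554", "y5564", "y65plus")
vol.data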
##       y1624 y2534 y3544 y4554 y5564 y65plus
## Vol    2128  2632  4116  3430  2015    2113
## noVol  7591  7981  7954  7529  5314    7197
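As a sanity check, the counts can be compared against the two tables given earlier. The sketch below re-enters the matrix so it runs on its own; `n_by_age`, `pct_reported`, `pct_from_counts`, and `counts_from_pct` are illustrative names, not part of the original analysis.

```r
# Count table as entered above
vol.data <- matrix(c(2128, 2632, 4116, 3430, 2015, 2113,
                     7591, 7981, 7954, 7529, 5314, 7197),
                   ncol = 6, byrow = TRUE,
                   dimnames = list(c("Vol", "noVol"),
                                   c("y1624", "y2534", "y3544",
                                     "y4554", "y5564", "y65plus")))

# Column totals should reproduce the estimated sample sizes
n_by_age <- colSums(vol.data)
n_by_age

# Percent volunteering within each age group, rounded as in the report
pct_from_counts <- round(100 * vol.data["Vol", ] / n_by_age, 1)
pct_from_counts

# Going the other way: reported percentage times sample size, rounded
pct_reported    <- c(21.9, 24.8, 34.1, 31.3, 27.5, 22.7)
counts_from_pct <- round(pct_reported / 100 * n_by_age)
counts_from_pct
```

Both directions should agree with the tables if the matrix was entered correctly.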
Consider a chi-square test on the table that you produced in (c). Would this be a test of homogeneity of proportions or association between variables? Explain.
Conduct the chi-square test. Report the hypotheses, check of technical conditions, sampling distribution, test statistic, and p-value. (Provide the details of your calculations and/or relevant computer output.) Summarize your conclusion.
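A minimal sketch of how the computer output for this test might be obtained with base R's `chisq.test()`. The matrix is re-entered so the block is self-contained, and `vol.test` is a name introduced here for illustration. The hypotheses and the interpretation are still left to the reader.

```r
# Count table from part (c)
vol.data <- matrix(c(2128, 2632, 4116, 3430, 2015, 2113,
                     7591, 7981, 7954, 7529, 5314, 7197),
                   ncol = 6, byrow = TRUE,
                   dimnames = list(c("Vol", "noVol"),
                                   c("y1624", "y2534", "y3544",
                                     "y4554", "y5564", "y65plus")))

# Pearson chi-square test on the 2 x 6 table
# (the continuity correction in chisq.test() only applies to 2 x 2 tables)
vol.test <- chisq.test(vol.data)

vol.test$expected    # expected counts, for checking the technical conditions
vol.test$statistic   # chi-square test statistic
vol.test$parameter   # degrees of freedom: (2 - 1) * (6 - 1)
vol.test$p.value     # p-value from the chi-square distribution

# A common rule of thumb: all expected counts at least 5
all(vol.test$expected >= 5)
```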
Construct a 2 \(\times\) 6 table with the same row and column headings as in (c), but containing only + and – signs indicating whether the observed count is larger (+) or smaller (–) than expected in that cell. Does this table reveal a pattern? Explain what that pattern suggests about the relationship between age group and volunteerism.
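The sign table can also be generated by comparing observed and expected counts cell by cell. This is a sketch under the same data as above; `expected` and `signs` are illustrative names.

```r
# Count table from part (c)
vol.data <- matrix(c(2128, 2632, 4116, 3430, 2015, 2113,
                     7591, 7981, 7954, 7529, 5314, 7197),
                   ncol = 6, byrow = TRUE,
                   dimnames = list(c("Vol", "noVol"),
                                   c("y1624", "y2534", "y3544",
                                     "y4554", "y5564", "y65plus")))

# Expected counts under the null hypothesis: row total * column total / grand total
expected <- outer(rowSums(vol.data), colSums(vol.data)) / sum(vol.data)

# "+" where observed exceeds expected, "-" otherwise;
# ifelse() keeps the matrix shape and the row/column names
signs <- ifelse(vol.data > expected, "+", "-")
signs
```

In a table with two rows, the second row of signs is always the mirror image of the first, so the pattern across age groups can be read from the `Vol` row alone.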
U.S. Volunteerism (cont.)
Reconsider the previous question about volunteerism. Suppose that the sample sizes had all been smaller by a factor of 100 (so that the entire study included only about 600 subjects) but that the conditional proportions of volunteerism within each age group had all turned out the same.
How (if at all) would you expect the segmented bar graph to change? Explain.
How (if at all) would you expect the test statistic to change? Explain.
How (if at all) would you expect the p-value to change? Explain.
How (if at all) would you expect your conclusion to change? Explain.
Repeat the chi-square analysis with this greatly reduced sample size (round the observed counts in the new table to the nearest integer). Confirm or correct your answers to (b)–(d) in light of this analysis.
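One way to set up the reduced-sample analysis is sketched below. It divides every observed count by 100 and rounds to the nearest integer, as the question describes. `small.data`, `full.test`, and `small.test` are names introduced here for illustration.

```r
# Original count table
vol.data <- matrix(c(2128, 2632, 4116, 3430, 2015, 2113,
                     7591, 7981, 7954, 7529, 5314, 7197),
                   ncol = 6, byrow = TRUE,
                   dimnames = list(c("Vol", "noVol"),
                                   c("y1624", "y2534", "y3544",
                                     "y4554", "y5564", "y65plus")))

# Counts for a study 1/100 the size, rounded to whole respondents
small.data <- round(vol.data / 100)
small.data
sum(small.data)   # total sample size, roughly 600

# Conditional proportions of volunteers are nearly unchanged
round(prop.table(vol.data, margin = 2)["Vol", ], 3)
round(prop.table(small.data, margin = 2)["Vol", ], 3)

# Same test on both tables, for comparison
full.test  <- chisq.test(vol.data)
small.test <- chisq.test(small.data)

c(full = unname(full.test$statistic), small = unname(small.test$statistic))
c(full = full.test$p.value, small = small.test$p.value)
```

Rounding changes the conditional proportions slightly, so the two test statistics will not differ by a factor of exactly 100.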