load(url("http://www.rossmanchance.com/iscam3/ISCAM.RData"))
library(dplyr)
library(ggplot2)
Note:
Homework (HW) is at: http://www.rossmanchance.com/iscam3/instructors.html
Practice Problems (PP) are in the textbook at the end of the investigations.
Note on the lab write-up: if you want to use an iscam function but don’t want the plot to print in the knitted file, type the following at the beginning of the chunk:
to start R chunk: ```{r fig.keep="none"}
To suppress warnings and messages (this can be combined with the figure option above):
to start R chunk: ```{r warning=FALSE, message=FALSE}
For each of the following statements: (i) identify which is being considered the explanatory variable and which the response variable. (ii) then suggest a potential confounding variable that explains the observed association between the explanatory and response variables. (iii) finally explain how the suggested (confounding) variable is related both to the response and to differences between the explanatory variable groups.
(You’ll first need a simulation like the one in Inv 3.1; then you can use iscamnormprob or iscamtwopropztest.)
Born in California?
As a transplant to California, author A wondered whether California residents were more or less likely to have been born in California back in 1950 or more recently, say in 2000. To investigate this question, he took a random sample of 500 CA residents from the 1950 Census and an independent random sample of 500 CA residents from the 2000 Census. The results are shown in the table below:
| Birthplace | 1950 | 2000 |
|---|---|---|
| Born in CA | 219 | 258 |
| Not born in CA | 281 | 242 |
| Total | 500 | 500 |
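The comparison in the table can be sketched in base R as follows. This is a rough sketch rather than the textbook's method: the simulation draws two independent binomial samples with a common (pooled) probability, which is only an assumption about what "a simulation like Inv 3.1" means, and the variable names, seed, and number of repetitions are all arbitrary choices. The normal-approximation part is the usual pooled two-proportion z-test, computed by hand instead of with an iscam function.

```r
# Observed counts from the table above
born_1950 <- 219
born_2000 <- 258
n_1950 <- 500
n_2000 <- 500

# Sample proportions and their observed difference (2000 minus 1950)
p_1950 <- born_1950 / n_1950
p_2000 <- born_2000 / n_2000
obs_diff <- p_2000 - p_1950

# Pooled proportion: the common value assumed under the null hypothesis
p_pool <- (born_1950 + born_2000) / (n_1950 + n_2000)

# Simulation (assumed setup): independent binomial samples sharing p_pool
set.seed(1)      # arbitrary seed, only for reproducibility
n_sims <- 10000  # arbitrary number of repetitions
sim_1950 <- rbinom(n_sims, size = n_1950, prob = p_pool) / n_1950
sim_2000 <- rbinom(n_sims, size = n_2000, prob = p_pool) / n_2000
sim_diff <- sim_2000 - sim_1950

# Two-sided simulated p-value (small tolerance guards against rounding ties)
p_sim <- mean(abs(sim_diff) >= abs(obs_diff) - 1e-12)

# Normal approximation: pooled two-proportion z statistic and p-value
se_pool <- sqrt(p_pool * (1 - p_pool) * (1 / n_1950 + 1 / n_2000))
z <- obs_diff / se_pool
p_norm <- 2 * pnorm(-abs(z))

c(obs_diff = obs_diff, z = z, p_sim = p_sim, p_norm = p_norm)
```

The simulated and normal-based p-values should be close to each other here because both samples are large.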
Pulling All-Nighters
A study published in the January 2008 issue of the journal Behavioral Sleep Medicine involved a survey of 120 students at St. Lawrence University, a small liberal arts college in upstate New York. Researchers found that students who claimed to have never pulled an all-nighter had an average GPA of 3.1, compared to 2.9 for students who claimed to have pulled all-nighters.
Identify the explanatory and response variables in this study. Classify each as categorical or quantitative.
Is this an observational study or a randomized experiment? Explain how you know.
Suppose that the difference between these two averages is shown to be statistically significant. Can you legitimately conclude that pulling all-nighters causes a student’s GPA to decrease? If so, explain why. If not, identify a potential confounding variable, and explain how it provides an alternative explanation for why the all-nighter group has a significantly lower average GPA.
(The summary on pg 188 of your text should be helpful; you may also need to find a z critical value using either iscamnormprob or pnorm.)
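As a small illustration of the critical value mentioned in the hint: base R's `qnorm` is the inverse of `pnorm`, so it returns the critical value directly, and `pnorm` can then confirm it. The 95% confidence level and the names `alpha` and `z_crit` are assumptions made only for this sketch.

```r
# Assumed confidence level of 95%, so alpha = 0.05 split across two tails
alpha <- 0.05

# qnorm returns the z value with the given area to its left
z_crit <- qnorm(1 - alpha / 2)

# Check with pnorm: the area between -z_crit and z_crit should equal 1 - alpha
central_area <- pnorm(z_crit) - pnorm(-z_crit)

c(z_crit = z_crit, central_area = central_area)
```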
U.S. Volunteerism (cont.)
From 3.75: In the September 2003 study of volunteerism in the U.S. conducted by the Bureau of Labor Statistics, 25.1% of men and 32.2% of women surveyed said that they had done volunteer work for or through an organization in the previous year.
Reconsider the previous question and the study about volunteerism.
(a) Suppose that the sample sizes had been the same for men and for women. Determine the smallest sample size so that the difference between the observed sample proportions of 0.251 and 0.322 would be significant at the 0.05 level (with a two-sided test).
(b) For the sample sizes that you found in (a), would the same difference in sample proportions (0.071) have been significant if the sample proportions had been 0.451 and 0.522? Report the test statistic and p-value in this case.
(c) Repeat (b) with the same difference in sample proportions (0.071), but assuming that the sample proportions had been 0.051 and 0.122.
(d) Summarize your findings from this analysis.
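One possible way to set up the sample-size search for the volunteerism question is sketched below. It assumes the pooled two-proportion z-test with equal group sizes, and it ignores the fact that the proportions may not correspond to whole numbers of people at a given n. The helper `z_two_prop` and the other names are hypothetical, introduced only for this illustration.

```r
# Hypothetical helper: pooled two-proportion z statistic with n per group
z_two_prop <- function(p1, p2, n) {
  p_pool <- (p1 + p2) / 2                    # equal n, so pooling is a simple average
  se <- sqrt(p_pool * (1 - p_pool) * 2 / n)  # pooled standard error
  (p2 - p1) / se
}

# Two-sided critical value at the 0.05 level
z_crit <- qnorm(0.975)

# Increase n until the observed difference first reaches significance
n_min <- 1
while (abs(z_two_prop(0.251, 0.322, n_min)) < z_crit) {
  n_min <- n_min + 1
}
n_min

# The same helper can be reused for other pairs of proportions at n_min
z_other <- z_two_prop(0.451, 0.522, n_min)
p_other <- 2 * pnorm(-abs(z_other))
c(z = z_other, p_value = p_other)
```

Because the pooled standard error depends on how close the proportions are to 0.5, the same difference of 0.071 gives different test statistics for different pairs of proportions at a fixed n.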