diff --git a/GrabData.R b/GrabData.R
index f4cc802..4aeab06 100644
--- a/GrabData.R
+++ b/GrabData.R
@@ -12,4 +12,5 @@ mydata3 <- filter(mydata2,BIGBANG=="True" | BIGBANG=="False",EVOLVED=="True"|EVO
 GSSdata <- filter(mydata3,CAPPUN=="FAVOR"|CAPPUN=="OPPOSE",VOTE12=="Voted"|VOTE12=="Did not vote",VOTE16=="Voted"|VOTE16=="Did not vote") %>% droplevels()
 levels(GSSdata$VOTE12)[1] <- "voted12"
 levels(GSSdata$VOTE12)[2] <- "no in 12"
-rm(Gss,Gss1,mydata,mydata2,mydata3)
\ No newline at end of file
+rm(Gss,Gss1,mydata,mydata2,mydata3)
+
diff --git a/gss2018.rmd b/gss2018.rmd
index 4774400..4ce2eca 100644
--- a/gss2018.rmd
+++ b/gss2018.rmd
@@ -1,7 +1,7 @@
 ---
 title: "General Social Survey"
-author: "Your Name"
-date: "Year 2020"
+author: "Kaylie Brehm"
+date: "Summer 2022"
 output:
   html_document:
     number_sections: true
@@ -26,29 +26,68 @@ source("GrabData.R")
 The data in the dataframe GSSdata is from the 2018 General Social Survey. The first blocks of R-code has selected down a subset of the data to just 16 variables. It has further removed unwanted factor levels in much of the data. Examine the code in the GrabData.R file to see what it is doing. Some of the variables are categorical and others are numerical. Be sure to do a variable analysis before tackling each question.
 First question - Is opinion on the death penalty (CAPPUN) independent of gun ownership (OWNGUN)?
+$H_0$: Opinion on the death penalty is independent of gun ownership.
+$H_A$: Opinion on the death penalty is not independent of gun ownership.
 ## Methods
-##Results
+Both variables are categorical, each with two levels: gun ownership (OWNGUN) is yes or no, and opinion on the death penalty (CAPPUN) is favor or oppose. The analysis technique is therefore CAT~CAT. The results include bar charts, row and column percentages, a chi-square test of independence, and a Fisher exact test with its odds ratio.
+
+
+## Results
 ### Descriptive Results
+
+
 #### Graphical Descriptive Results
+We create two bar charts: one based on counts and the other on percentages.
+
+
+```{r}
+dd2 <- GSSdata %>% group_by(CAPPUN,OWNGUN) %>% summarize(count=n()) %>% mutate(prcnt=count/sum(count))
+# group_by followed by summarize(count=n()) gives the cell counts; mutate then converts them to proportions within each CAPPUN level
+basicC <- ggplot(dd2,aes(x=CAPPUN,y=count,fill=OWNGUN))
+basicC + geom_bar(stat="identity",position="dodge")
+# Now the percentage plot
+basicCC <- ggplot(dd2,aes(x=CAPPUN,y=prcnt*100,fill=OWNGUN))
+basicCC + geom_bar(stat="identity", position = "dodge")
+```
+
+The charts show that those who oppose capital punishment are much more likely to say no to gun ownership than yes. Among those who favor capital punishment, only slightly more say no to gun ownership than yes.
+
 #### Numerical Descriptive Results
+```{r}
+table2 <- xtabs(~CAPPUN + OWNGUN, data=GSSdata)
+rowPerc(table2)
+colPerc(table2)
+```
+
+The first table (row percentages) shows, for each opinion on capital punishment, the percentage in each gun-ownership category. About 70.97% of those who oppose capital punishment say no to gun ownership, compared with about 51.72% of those who favor it.
+
 ### Inferential Results
+```{r}
+chisq.test(table2)
+chisqtestGC(table2)
+fisher.test(table2)
+```
+
+If opinion on capital punishment depends on gun ownership, the observed counts will differ from the counts expected under independence. The chi-square statistic sums the squared differences between observed and expected counts, each divided by the expected count. The p-value is the probability of seeing a difference at least this large if the null hypothesis were true. The null hypothesis is that opinion on the death penalty is independent of gun ownership. The p-value of the chi-square test is 0.02022. Since this is below 0.05, I reject the null hypothesis. The p-value of the Fisher exact test is 0.01651, which is also below 0.05, so I again reject the null hypothesis. The odds ratio is 2.271, meaning the odds of gun ownership among those who favor capital punishment are about 2.27 times the odds among those who oppose it.
+
+
 # Question 2