From 12d7723ca1fe86989ad590151beb1f2a7cf583c7 Mon Sep 17 00:00:00 2001
From: Cristina Ceballos
Date: Mon, 11 Nov 2019 14:02:39 -0800
Subject: [PATCH 1/3] created pset4

---
 Ceballos_ps4.Rmd | 75 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 75 insertions(+)
 create mode 100644 Ceballos_ps4.Rmd

diff --git a/Ceballos_ps4.Rmd b/Ceballos_ps4.Rmd
new file mode 100644
index 0000000..4c17336
--- /dev/null
+++ b/Ceballos_ps4.Rmd
@@ -0,0 +1,75 @@
+---
+title: 'Psych 251 PS4: Simulation'
+author: "Cristina Ceballos"
+date: "11/11/2019"
+output:
+  html_document:
+    toc: true
+---
+
+This is problem set #4, in which we want you to integrate your knowledge of data wrangling with some basic simulation skills. It's a short problem set to help you get your feet wet in testing statistical concepts through "making up data" rather than consulting a textbook or doing math.
+
+For ease of reading, please separate your answers from our text by marking our text with the `>` character (indicating quotes).
+
+```{r, warning=F, message=F}
+library(tidyverse)
+```
+
+Let's start by convincing ourselves that t-tests have the appropriate false positive rate. Run 10,000 t-tests with standard, normally distributed data from a made-up 30-person, single-measurement experiment (the command for sampling from a normal distribution is `rnorm`).
+
+The goal of these t-tests is to determine, based on 30 observations, whether the underlying distribution (in this case a normal distribution with mean 0 and standard deviation 1) has a mean that is different from 0. In reality, the mean is not different from 0 (we generated the data using `rnorm` with mean 0), but sometimes the 30 observations we get in our experiment will suggest that the mean is higher or lower. In this case, we'll get a "significant" result and incorrectly reject the null hypothesis of mean 0.
+
+What's the proportion of "significant" results ($p < .05$) that you see?
+
+First do this using a `for` loop.
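One possible sketch of the `for`-loop version (the seed and variable names are arbitrary choices, not part of the assignment):

```{r}
# Run 10,000 one-sample t-tests on null data (mean 0, sd 1, n = 30)
set.seed(251)                 # arbitrary seed, for reproducibility only
n_sims <- 10000
p_values <- numeric(n_sims)   # preallocate the result vector
for (i in 1:n_sims) {
  p_values[i] <- t.test(rnorm(30))$p.value
}
mean(p_values < .05)          # proportion of false positives; should land near .05
```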
+

```{r}
```

Next, do this using the `replicate` function:

```{r}
```

How does this compare to the intended false-positive rate of $\alpha=0.05$?

> ANSWER

OK, that was a bit boring. Let's try something more interesting: let's implement a p-value sniffing simulation, in the style of Simmons, Nelson, & Simonsohn (2011).

Consider this scenario: you have done an experiment, again with 30 participants (one observation each, just for simplicity). The question is whether the true mean is different from 0. You aren't going to check the p-value every trial, but let's say you run 30; then, if the p-value falls in the range .05 < p < .25, you optionally run 30 more and add those data, then test again. But if the original p-value is < .05, you call it a day, and if the original is > .25, you also stop.

First, write a function that implements this sampling regime.

```{r}
double.sample <- function () {
}
```

Now call this function 10,000 times and find out what happens.

```{r}
```

Is there an inflation of false positives? How bad is it?

> ANSWER

Now modify this code so that you can investigate this "double the sample" rule in a bit more depth. In the previous question, the researcher doubles the sample only when they think they got "close" to a significant result, i.e. when their not-significant p is less than 0.25. What if the researcher were more optimistic? See what happens in these three other scenarios:

* The researcher doubles the sample whenever their p-value is not significant, but it's less than 0.5.
* The researcher doubles the sample whenever their p-value is not significant, but it's less than 0.75.
* The researcher doubles their sample whenever they get ANY p-value that is not significant.

How do these choices affect the false positive rate?

HINT: Try to do this by making the function `double.sample` take the upper p-value as an argument, so that you can pass this through dplyr.

HINT 2: You may need more samples.
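One possible sketch of the parameterized version (the `p_upper` argument name is illustrative, not prescribed):

```{r}
# Run 30 subjects; if the result is "close" (.05 < p < p_upper), run 30 more and re-test
double.sample <- function (p_upper = 0.25) {
  s <- rnorm(30)
  p <- t.test(s)$p.value
  if (p > .05 & p < p_upper) {
    p <- t.test(c(s, rnorm(30)))$p.value
  }
  p
}
mean(replicate(10000, double.sample(0.5)) < .05)  # false-positive rate under the 0.5 rule
```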
Find out by looking at how the results change from run to run.

```{r}
```

What do you conclude on the basis of this simulation? How bad is this kind of data-dependent policy?

> ANSWER

From 92643aa000800ed7154df4bb9357e6cd27f2f35d Mon Sep 17 00:00:00 2001
From: Cristina Ceballos
Date: Tue, 12 Nov 2019 11:47:46 -0800
Subject: [PATCH 2/3] pset4 completed

---
 .Rhistory | 512 ++++++++++++++++
 Ceballos_ps4.Rmd | 123 +++-
 Ceballos_ps4.html | 557 ++++++++++++++++++
 .../rpubs.com/rpubs/Document.dcf | 10 +
 4 files changed, 1199 insertions(+), 3 deletions(-)
 create mode 100644 .Rhistory
 create mode 100644 Ceballos_ps4.html
 create mode 100644 rsconnect/documents/Ceballos_ps4.Rmd/rpubs.com/rpubs/Document.dcf

diff --git a/.Rhistory b/.Rhistory
new file mode 100644
index 0000000..abae361
--- /dev/null
+++ b/.Rhistory
@@ -0,0 +1,512 @@
+# Now I'll try the histogram function on just one column, d$Game1Angry1.
+unique(vector_of_Game1Angry1)
+range(vector_of_Game1Angry1)
+hist(vector_of_Game1Angry1)
+# I learn that the likert scores range from 1 to 7 for that SINGLE column.
+# Now, I'll do the same process, but on all the columns, not just one column.
+# I can tell R to select all the likert values between the 7th column of the dataset and the 70th column of the dataset, and to make a single vector using the c() function.
+vector_of_all_likert_scores <- c(d[,7:70], recursive = TRUE, use.names = FALSE)
+length(vector_of_all_likert_scores)
+# Now I'll try the same functions on all the likert scores. The unique and hist functions took too long to calculate, so I ended up just using the range function.
+# range(vector_of_all_likert_scores) +# unique(vector_of_all_likert_scores) +hist(vector_of_all_likert_scores) +tail(d) +filtered_d = d %>% +filter(is.na(DoNotUse)) # your code here: exclude subjects that are marked as "DoNotUse" +filtered_d = filtered_d %>% +select(c("Subject", "Cond"), # Generally important columns for both hypotheses +contains("Game"), # we want all the game columns for hypothesis 1 +-contains("Intro"), -c("WhichGames", "GameComments"), # except these +starts_with("DinerDashWith"), c("SOFMusicEnemies", "SOFNoMusicEnemies")) # These columns are for hypothesis 2 +rating_hyp_d = filtered_d %>% +filter(is.na(DoNotUseVideoGamePerformanceData)) %>% # first, let's get rid of the subjects who did so poorly on one game that their data is unusable +select(-DoNotUseVideoGamePerformanceData, # now get rid of that column +-starts_with("DinerDash"), # and the other columns we don't need +-starts_with("SOF")) +performance_hyp_d = filtered_d %>% +select(-contains("Game")) # your code here: remove the columns containing "Game" in the name +tiny_demo_d = head(performance_hyp_d, 2) # get just the first two subjects performance data, for a demo +tiny_demo_d +tiny_demo_d %>% gather(Measurement, Value, +-c("Subject", "Cond")) # this tells it to gather all columns *except* these ones. +performance_hyp_long_d = performance_hyp_d %>% +gather(Measurement, Score, -c("Subject", "Cond")) +rating_hyp_long_d = rating_hyp_d %>% +gather(Measurement, Rating, -c("Subject", "Cond")) +performance_hyp_long_d = performance_hyp_long_d %>% +mutate(ConfrontationalGame = grepl("SOF", Measurement), # create a new variable that will say whether the measurement was of the game soldier of fortune (SOF). 
+WithMusic = !grepl("NoMusic|WithoutMusic", Measurement), # creates a new column named WithMusic, which is False if the measurement contains *either* "NoMusic" or "WithoutMusic" +MusicCondition = factor(ifelse(Cond > 3, Cond-3, Cond), levels = 1:3, labels = c("Anger", "Exciting", "Neutral"))) # Get rid of uninterpretable condition labels +rating_hyp_long_d = rating_hyp_long_d %>% +mutate( +IsRecall = grepl("Friends|Strangers", Measurement) +) +rating_hyp_long_d = rating_hyp_long_d %>% +mutate( +GameNumber = as.numeric(substr(rating_hyp_long_d$Measurement, 5, 5)), +ConfrontationalGame = GameNumber <= 2, # in a mutate, we can use a column we created (or changed) right away. Games 1 and 2 are confrontational, games 3 and 4 are not. +Emotion = str_extract(Measurement, "Angry|Neutral|Excited|Exciting|Calm"), +Emotion = ifelse(Emotion == "Excited", "Exciting", # this just gets rid of some annoying labeling choices +ifelse(Emotion == "Calm", "Neutral", Emotion)) +) +performance_hyp_long_d %>% +group_by(ConfrontationalGame) %>% +summarize(AvgScore = mean(Score, na.rm=T)) # the na.rm tells R to ignore NA values +performance_hyp_long_d = performance_hyp_long_d %>% +group_by(ConfrontationalGame, WithMusic) %>% # we're going to compute four sets of z-scores, one for the confrontational game without music, one for the confrontational game with, one for the nonconfrontational game without music, and one for the nonconfrontational game with +mutate(z_scored_performance = scale(Score)) %>% +ungroup() +rating_summary_d = rating_hyp_long_d %>% +group_by(ConfrontationalGame, Emotion) %>% +summarize(MeanRating = mean(Rating, na.rm=T)) +rating_summary_d +ggplot(rating_summary_d, aes(x=ConfrontationalGame, y= MeanRating, fill=Emotion)) + +geom_bar(position="dodge", stat="identity") + +scale_fill_brewer(palette="Set1") +model = lm(Rating ~ ConfrontationalGame * Emotion, rating_hyp_long_d) +summary(model) +performance_diff_d = performance_hyp_long_d %>% +mutate(WithMusic = 
factor(WithMusic, levels=c(F, T), labels=c("PreMusic", "PostMusic"))) %>% # first, tweak the variable so our code is easier to read.
+select(-c("Score", "Measurement")) %>% # now we remove columns we don't need (bonus: leave them in and see if you can understand what goes wrong!)
+spread(WithMusic, z_scored_performance) %>%
+mutate(ImprovementScore=PostMusic-PreMusic)
+performance_diff_d
+performance_diff_summary_d = performance_diff_d %>%
+group_by(ConfrontationalGame, MusicCondition) %>%
+summarize(MeanImprovementScore = mean(ImprovementScore, na.rm=T))
+performance_diff_summary_d
+ggplot(performance_diff_summary_d, aes(x=ConfrontationalGame, y=MeanImprovementScore, fill=MusicCondition)) +
+geom_bar(position="dodge", stat="identity") +
+scale_fill_brewer(palette="Set1")
+performance_model = lm(ImprovementScore ~ ConfrontationalGame * MusicCondition, performance_diff_d)
+summary(performance_model)
+install.packages("devtools")
+devtools::install_github("TomHardwicke/ReproReports")
+library("ReproReports")
+devtools::install_github("TomHardwicke/ReproReports")
+library("ReproReports")
+install.packages("devtools")
+devtools::install_github("TomHardwicke/ReproReports")
+library("ReproReports")
+install.packages("devtools")
+devtools::install_github("TomHardwicke/ReproReports")
+force = TRUE
+TRUE
+"force = TRUE"
+force = TRUE
+devtools::install_github("TomHardwicke/ReproReports")
+devtools::install_github("TomHardwicke/ReproReports"), force = TRUE
+devtools::install_github("TomHardwicke/ReproReports") force = TRUE
+devtools::install_github("TomHardwicke/ReproReports") force = TRUE
+devtools::install_github("TomHardwicke/ReproReports")
+force = TRUE
+library("ReproReports")
+install.packages("devtools")
+devtools::install_github("TomHardwicke/ReproReports")
+devtools::install_github("TomHardwicke/ReproReports", force = TRUE)
+library("ReproReports")
+library("ReproReports")
+install.packages(c("backports", "covr", "curl", "digest", "htmltools", "htmlwidgets",
"later", "markdown", "pkgbuild", "pkgconfig", "promises", "rmarkdown", "shiny", "tinytex", "xfun")) +library("ReproReports") +library("devtools") +devtools::install_github("TomHardwicke/ReproReports") +library(ReproReports) +ReproReports::reproCheck() +session_info() +articleID <- "6-1-2015" # insert the article ID code here e.g., "10-3-2015" +reportType <- "Pilot" # specify whether this is the 'pilot' report or 'copilot' report +pilotNames <- "Cristina Ceballos" # insert the pilot's name here e.g., "Tom Hardwicke". +copilotNames <- "Insub Kim" # # insert the co-pilot's name here e.g., "Michael Frank". +pilotTTC <- 120 # insert the pilot's estimated time to complete (in minutes, it is fine to approximate) e.g., 120 +copilotTTC <- NA # insert the co-pilot's estimated time to complete (in minutes, it is fine to approximate) e.g., 120 +pilotStartDate <- "11/3/2019" # insert the piloting start date in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") +copilotStartDate <- NA # insert the co-piloting start date in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") +completionDate <- NA # insert the date of final report completion in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") +# sets up some formatting options for the R Markdown document +knitr::opts_chunk$set(echo=TRUE, warning=FALSE, message=FALSE) +# load packages +library(tidyverse) # for data munging +library(knitr) # for kable table formating +library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files +library(readxl) # import excel files +library(ReproReports) # custom reporting functions +library(foreign) +# Prepare report object. 
This will be updated automatically by the reproCheck function each time values are compared +reportObject <- data.frame(dummyRow = TRUE, reportedValue = NA, obtainedValue = NA, valueType = NA, percentageError = NA, comparisonOutcome = NA, eyeballCheck = NA) +data <- read_excel("GroupA_6-1-2015/data/data_Exp1.xlsx", sheet = "summary") +head(data) +colnames(data) +articleID <- "6-1-2015" # insert the article ID code here e.g., "10-3-2015" +reportType <- "copilot" # specify whether this is the 'pilot' report or 'copilot' report +pilotNames <- "Cristina Ceballos" # insert the pilot's name here e.g., "Tom Hardwicke". +copilotNames <- "Insub Kim" # # insert the co-pilot's name here e.g., "Michael Frank". +pilotTTC <- 520 # insert the pilot's estimated time to complete (in minutes, it is fine to approximate) e.g., 120 +copilotTTC <- 180 # insert the co-pilot's estimated time to complete (in minutes, it is fine to approximate) e.g., 120 +pilotStartDate <- "11/3/2019" # insert the piloting start date in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") +copilotStartDate <- "11/10/2019" # insert the co-piloting start date in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") +completionDate <- "11/10/2019" # insert the date of final report completion in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") +# sets up some formatting options for the R Markdown document +knitr::opts_chunk$set(echo=TRUE, warning=FALSE, message=FALSE) +# load packages +library(tidyverse) # for data munging +library(knitr) # for kable table formating +library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files +library(readxl) # import excel files +library(ReproReports) # custom reporting functions +library(foreign) +# Prepare report object. 
This will be updated automatically by the reproCheck function each time values are compared +reportObject <- data.frame(dummyRow = TRUE, reportedValue = NA, obtainedValue = NA, valueType = NA, percentageError = NA, comparisonOutcome = NA, eyeballCheck = NA) +data <- read_excel("data/data_Exp1.xlsx", sheet = "summary") +articleID <- "6-1-2015" # insert the article ID code here e.g., "10-3-2015" +reportType <- "Pilot" # specify whether this is the 'pilot' report or 'copilot' report +pilotNames <- "Cristina Ceballos" # insert the pilot's name here e.g., "Tom Hardwicke". +copilotNames <- "Insub Kim" # # insert the co-pilot's name here e.g., "Michael Frank". +pilotTTC <- 520 # insert the pilot's estimated time to complete (in minutes, it is fine to approximate) e.g., 120 +copilotTTC <- 180 # insert the co-pilot's estimated time to complete (in minutes, it is fine to approximate) e.g., 120 +pilotStartDate <- "11/3/2019" # insert the piloting start date in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") +copilotStartDate <- "11/10/2019" # insert the co-piloting start date in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") +completionDate <- "11/10/2019" # insert the date of final report completion in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") +# sets up some formatting options for the R Markdown document +knitr::opts_chunk$set(echo=TRUE, warning=FALSE, message=FALSE) +# load packages +library(tidyverse) # for data munging +library(knitr) # for kable table formating +library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files +library(readxl) # import excel files +library(ReproReports) # custom reporting functions +library(foreign) +# Prepare report object. 
This will be updated automatically by the reproCheck function each time values are compared
+reportObject <- data.frame(dummyRow = TRUE, reportedValue = NA, obtainedValue = NA, valueType = NA, percentageError = NA, comparisonOutcome = NA, eyeballCheck = NA)
+data <- read_excel("GroupA_6-1-2015/data/data_Exp1.xlsx", sheet = "summary")
+head(data)
+# Right now, the data is arranged in wide format. I want to make it longer, so I will use pivot_longer to make it tidy.
+# But first, I have to manually modify the columns and assign each column a column name. To do so, I created a new Excel file called "data_Exp1_modified". In this Excel file, I manually renamed each column. For example, the first column was called "closedloop_grasping_uncrowded_3cm". The other columns followed the same naming rules.
+# In the Excel file, I also deleted empty columns. I also removed the two top columns.
+# I use the read_excel function to read the data.
+data_modified <- read_excel("GroupA_6-1-2015/data/data_Exp1_modified.xlsx", sheet = "summary")
+# Now I use the functions "separate" and "pivot_longer" to make my data tidy.
+pivoted_data <- pivot_longer(data_modified, closedloop_grasping_uncrowded_3cm:openloop_estimation_crowded_3.75cm, names_to = "column", values_to="estimate")
+tidy_data <- separate(pivoted_data, column, into = c("open_closed","grasping_estimating","crowded_uncrowded","cm"), sep = "_")
+colnames(tidy_data)
+# Now my data is tidy!
+# To run the ANOVA, I should get rid of the "means" information in the tidy_data. To do that, I simply selected the first 160 rows of the dataset, and omitted the bottom rows 161-176, since those are the rows that have the means.
+ANOVA_tidy_data <- tidy_data[1:160,]
+# "ANOVA_tidy_data is the dataset I will use for the ANOVA analysis. It excludes the "means" rows.
+# Next, I created a dataset that has ONLY the "means" information, since that is the information used in Figure 3.
+Fig3_tidy_data <- tidy_data[161:176,] +# "Fig3_tidy_data is the dataset I will use for try to recreate Fig3 from the paper. It has only the "means" rows. +# I used the aov function to +ANOVA_results <- aov(estimate ~ grasping_estimating * crowded_uncrowded * open_closed * cm, ANOVA_tidy_data) +summary(ANOVA_results) +colnames(Fig3_tidy_data) +# plot("cm", Fig3_tidy_data[10:16], Fig3_tidy_data, "l") +# commented out because I could not get the code to work +# Reprocheck for the four-way ANOVA F value: +reproCheck("6.818", "2.662", valueType = "F") +# Reprocheck for the four-way ANOVA p value: +reproCheck("0.28", "0.105", valueType = "p") +reportObject <- reportObject %>% +filter(dummyRow == FALSE) %>% # remove the dummy row +select(-dummyRow) %>% # remove dummy row designation +mutate(articleID = articleID) %>% # add variables to report +select(articleID, everything()) # make articleID first column +# decide on final outcome +if(any(reportObject$comparisonOutcome %in% c("MAJOR_ERROR", "DECISION_ERROR"))){ +finalOutcome <- "Failure" +}else{ +finalOutcome <- "Success" +} +# collate report extra details +reportExtras <- data.frame(articleID, pilotNames, copilotNames, pilotTTC, copilotTTC, pilotStartDate, copilotStartDate, completionDate, finalOutcome) +# save report objects +if(reportType == "pilot"){ +write_csv(reportObject, "pilotReportDetailed.csv") +write_csv(reportExtras, "pilotReportExtras.csv") +} +if(reportType == "copilot"){ +write_csv(reportObject, "copilotReportDetailed.csv") +write_csv(reportExtras, "copilotReportExtras.csv") +} +articleID <- "6-1-2015" # insert the article ID code here e.g., "10-3-2015" +reportType <- "Pilot" # specify whether this is the 'pilot' report or 'copilot' report +pilotNames <- "Cristina Ceballos" # insert the pilot's name here e.g., "Tom Hardwicke". +copilotNames <- "Insub Kim" # # insert the co-pilot's name here e.g., "Michael Frank". 
+pilotTTC <- 520 # insert the pilot's estimated time to complete (in minutes, it is fine to approximate) e.g., 120 +copilotTTC <- 180 # insert the co-pilot's estimated time to complete (in minutes, it is fine to approximate) e.g., 120 +pilotStartDate <- "11/3/2019" # insert the piloting start date in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") +copilotStartDate <- "11/10/2019" # insert the co-piloting start date in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") +completionDate <- "11/10/2019" # insert the date of final report completion in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") +# sets up some formatting options for the R Markdown document +knitr::opts_chunk$set(echo=TRUE, warning=FALSE, message=FALSE) +# load packages +library(tidyverse) # for data munging +library(knitr) # for kable table formating +library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files +library(readxl) # import excel files +library(ReproReports) # custom reporting functions +library(foreign) +# Prepare report object. This will be updated automatically by the reproCheck function each time values are compared +reportObject <- data.frame(dummyRow = TRUE, reportedValue = NA, obtainedValue = NA, valueType = NA, percentageError = NA, comparisonOutcome = NA, eyeballCheck = NA) +data <- read_excel("GroupA_6-1-2015/data/data_Exp1.xlsx", sheet = "summary") +head(data) +# Right now, the data is arranged in wide format. I want to make it longer, so I will use pivot_longer to make it tidy. +# But first, I have to manually modify the columns and assign each column a column name. To do so, I created a new Excel file called "data_Exp1_modified". In this Excel file, I manually renamed each column. For example, the first column was called "closedloop_grasping_uncrowded_3cm". The other columns followed the same naming rules. +# In the Excel file, I also deleted empty columns. I also removed the two top columns. +# I use the read_excel function to read the data. 
+data_modified <- read_excel("GroupA_6-1-2015/data/data_Exp1_modified.xlsx", sheet = "summary") +# Now I use the functions "separate" and "pivot_longer" to make my data tidy. +pivoted_data <- pivot_longer(data_modified, closedloop_grasping_uncrowded_3cm:openloop_estimation_crowded_3.75cm, names_to = "column", values_to="estimate") +tidy_data <- separate(pivoted_data, column, into = c("open_closed","grasping_estimating","crowded_uncrowded","cm"), sep = "_") +colnames(tidy_data) +# Now my data is tidy! +# To run the ANOVA, I should get rid of the "means" information in the tidy_data. To do that, I simply selected the first 160 rows of the dataset, and ommitted the bottom rows 161-176, since those are the rows that have the means. +ANOVA_tidy_data <- tidy_data[1:160,] +# "ANOVA_tidy_data is the dataset I will use for the ANOVA analysis. It excludes the "means" rows. +# Next, I created a dataset that has ONLY the "means" information, since that is the information used in Figure 3. +Fig3_tidy_data <- tidy_data[161:176,] +# "Fig3_tidy_data is the dataset I will use for try to recreate Fig3 from the paper. It has only the "means" rows. 
+# I used the aov function to +ANOVA_results <- aov(estimate ~ grasping_estimating * crowded_uncrowded * open_closed * cm, ANOVA_tidy_data) +summary(ANOVA_results) +colnames(Fig3_tidy_data) +# plot("cm", Fig3_tidy_data[10:16], Fig3_tidy_data, "l") +# commented out because I could not get the code to work +# Reprocheck for the four-way ANOVA F value: +reproCheck("6.818", "2.662", valueType = "F") +# Reprocheck for the four-way ANOVA p value: +reproCheck("0.28", "0.105", valueType = "p") +reportObject <- reportObject %>% +filter(dummyRow == FALSE) %>% # remove the dummy row +select(-dummyRow) %>% # remove dummy row designation +mutate(articleID = articleID) %>% # add variables to report +select(articleID, everything()) # make articleID first column +# decide on final outcome +if(any(reportObject$comparisonOutcome %in% c("MAJOR_ERROR", "DECISION_ERROR"))){ +finalOutcome <- "Failure" +}else{ +finalOutcome <- "Success" +} +# collate report extra details +reportExtras <- data.frame(articleID, pilotNames, copilotNames, pilotTTC, copilotTTC, pilotStartDate, copilotStartDate, completionDate, finalOutcome) +# save report objects +if(reportType == "pilot"){ +write_csv(reportObject, "pilotReportDetailed.csv") +write_csv(reportExtras, "pilotReportExtras.csv") +} +if(reportType == "copilot"){ +write_csv(reportObject, "copilotReportDetailed.csv") +write_csv(reportExtras, "copilotReportExtras.csv") +} +devtools::session_info() +articleID <- "6-1-2015" # insert the article ID code here e.g., "10-3-2015" +reportType <- "copilot" # specify whether this is the 'pilot' report or 'copilot' report +pilotNames <- "Cristina Ceballos" # insert the pilot's name here e.g., "Tom Hardwicke". +copilotNames <- "Insub Kim" # # insert the co-pilot's name here e.g., "Michael Frank". 
+pilotTTC <- 520 # insert the pilot's estimated time to complete (in minutes, it is fine to approximate) e.g., 120 +copilotTTC <- 180 # insert the co-pilot's estimated time to complete (in minutes, it is fine to approximate) e.g., 120 +pilotStartDate <- "11/3/2019" # insert the piloting start date in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") +copilotStartDate <- "11/10/2019" # insert the co-piloting start date in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") +completionDate <- "11/10/2019" # insert the date of final report completion in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") +# sets up some formatting options for the R Markdown document +knitr::opts_chunk$set(echo=TRUE, warning=FALSE, message=FALSE) +# load packages +library(tidyverse) # for data munging +library(knitr) # for kable table formating +library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files +library(readxl) # import excel files +library(ReproReports) # custom reporting functions +library(foreign) +# Prepare report object. This will be updated automatically by the reproCheck function each time values are compared +reportObject <- data.frame(dummyRow = TRUE, reportedValue = NA, obtainedValue = NA, valueType = NA, percentageError = NA, comparisonOutcome = NA, eyeballCheck = NA) +data <- read_excel("data/data_Exp1.xlsx", sheet = "summary") +articleID <- "6-1-2015" # insert the article ID code here e.g., "10-3-2015" +reportType <- "Pilot" # specify whether this is the 'pilot' report or 'copilot' report +pilotNames <- "Cristina Ceballos" # insert the pilot's name here e.g., "Tom Hardwicke". +copilotNames <- "Insub Kim" # # insert the co-pilot's name here e.g., "Michael Frank". 
+pilotTTC <- 520 # insert the pilot's estimated time to complete (in minutes, it is fine to approximate) e.g., 120
+copilotTTC <- 180 # insert the co-pilot's estimated time to complete (in minutes, it is fine to approximate) e.g., 120
+pilotStartDate <- "11/3/2019" # insert the piloting start date in US format e.g., as.Date("01/25/18", format = "%m/%d/%y")
+copilotStartDate <- "11/10/2019" # insert the co-piloting start date in US format e.g., as.Date("01/25/18", format = "%m/%d/%y")
+completionDate <- "11/10/2019" # insert the date of final report completion in US format e.g., as.Date("01/25/18", format = "%m/%d/%y")
+# sets up some formatting options for the R Markdown document
+knitr::opts_chunk$set(echo=TRUE, warning=FALSE, message=FALSE)
+# load packages
+library(tidyverse) # for data munging
+library(knitr) # for kable table formatting
+library(haven) # import and export 'SPSS', 'Stata' and 'SAS' files
+library(readxl) # import Excel files
+library(ReproReports) # custom reporting functions
+library(foreign)
+# Prepare report object. This will be updated automatically by the reproCheck function each time values are compared
+reportObject <- data.frame(dummyRow = TRUE, reportedValue = NA, obtainedValue = NA, valueType = NA, percentageError = NA, comparisonOutcome = NA, eyeballCheck = NA)
+data <- read_excel("GroupA_6-1-2015/data/data_Exp1.xlsx", sheet = "summary")
+head(data)
+# Right now, the data is arranged in wide format. I want to make it longer, so I will use pivot_longer to make it tidy.
+# But first, I have to manually modify the columns and assign each column a name. To do so, I created a new Excel file called "data_Exp1_modified". In this Excel file, I manually renamed each column; for example, I renamed the first column to "closedloop_grasping_uncrowded_3cm". The other columns follow the same naming rule.
+# In the Excel file, I also deleted empty columns and removed the top two rows.
+# I use the read_excel function to read the data.
+data_modified <- read_excel("GroupA_6-1-2015/data/data_Exp1_modified.xlsx", sheet = "summary")
+# Now I use the functions "pivot_longer" and "separate" to make my data tidy.
+pivoted_data <- pivot_longer(data_modified, closedloop_grasping_uncrowded_3cm:openloop_estimation_crowded_3.75cm, names_to = "column", values_to = "estimate")
+tidy_data <- separate(pivoted_data, column, into = c("open_closed", "grasping_estimating", "crowded_uncrowded", "cm"), sep = "_")
+colnames(tidy_data)
+# Now my data is tidy!
+# To run the ANOVA, I should get rid of the "means" information in tidy_data. To do that, I simply selected the first 160 rows of the dataset and omitted the bottom rows 161-176, since those are the rows that hold the means.
+ANOVA_tidy_data <- tidy_data[1:160,]
+# "ANOVA_tidy_data" is the dataset I will use for the ANOVA analysis. It excludes the "means" rows.
+# Next, I created a dataset that has ONLY the "means" information, since that is the information used in Figure 3.
+Fig3_tidy_data <- tidy_data[161:176,]
+# "Fig3_tidy_data" is the dataset I will use to try to recreate Fig3 from the paper. It has only the "means" rows.
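As a sanity check of the reshape above, the same `pivot_longer` + `separate` pattern can be run on a toy data frame. This is a sketch: the column names mimic the real ones, but the subjects and numeric values are made up for illustration.

```r
library(tidyr)

# Toy wide table: one row per subject, one column per condition (made-up values)
wide <- data.frame(
  subject = 1:2,
  closedloop_grasping_uncrowded_3cm  = c(3.10, 2.90),
  openloop_estimation_crowded_3.75cm = c(4.00, 3.80)
)

# One row per subject x condition
long <- pivot_longer(wide, cols = -subject,
                     names_to = "column", values_to = "estimate")

# Split each condition name into the four factors at the underscores
tidy <- separate(long, column,
                 into = c("open_closed", "grasping_estimating",
                          "crowded_uncrowded", "cm"),
                 sep = "_")

tidy  # 4 rows: one per subject x condition, with the four factors as columns
```

`separate()` splits on every underscore, which works here because each condition name has exactly four underscore-delimited fields; a name with an extra underscore would need a regex `sep` or `extra = "merge"`.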
+setwd("C:/Users/C. Ceballos/OneDrive - Leland Stanford Junior University/Year 4 2019-2020/PSYCH251_HWs/problem_sets")
+library(tidyverse)
+averages <- numeric(10000)
+for (i in 1:10000) {
+X <- rnorm(30, 0, 1)
+averages[i] <- mean(X)
+}
+averages
diff --git a/Ceballos_ps4.Rmd b/Ceballos_ps4.Rmd
index 4c17336..1a01ce3 100644
--- a/Ceballos_ps4.Rmd
+++ b/Ceballos_ps4.Rmd
@@ -24,16 +24,36 @@ What's the proportion of "significant" results ($p < .05$) that you see?
 
 First do this using a `for` loop. 
 
 ```{r}
+
+result <- numeric(10000)
+
+for (i in 1:10000) {
+  X <- rnorm(30, 0, 1)
+  result[i] <- t.test(X)$p.value
+}
+
+sig_results <- result < 0.05
+
+sum(sig_results)/10000
+
 ```
 
 Next, do this using the `replicate` function: 
 
 ```{r}
+
+pvals <- replicate(10000, t.test(rnorm(30))$p.value)
+
+sig_pvals <- pvals < 0.05
+
+sum(sig_pvals)/10000
+
 ```
 
 How does this compare to the intended false-positive rate of $\alpha=0.05$?
 
-> ANSWER
+> This answer makes sense. I'm getting a false positive rate of about 5 percent, or slightly less. That's what I would expect, given that I am testing for significance at $p < .05$.
 
 Ok, that was a bit boring. Let's try something more interesting - let's implement a p-value sniffing simulation, in the style of Simons, Nelson, & Simonsohn (2011).
 
@@ -43,17 +63,39 @@ First, write a function that implements this sampling regime. 
 ```{r}
 double.sample <- function () {
+
+  first_sample <- rnorm(30)
+
+  pvals_2 <- t.test(first_sample)$p.value
+
+  if (pvals_2 < 0.05) {
+    return(pvals_2)
+  } else if (pvals_2 > 0.25) {
+    return(pvals_2)
+  } else {
+    second_sample <- c(first_sample, rnorm(30))
+    return(t.test(second_sample)$p.value)
+  }
+
 }
+
 ```
 
 Now call this function 10k times and find out what happens.
 
 ```{r}
+
+tenk_results <- replicate(10000, double.sample())
+
+sum(tenk_results < 0.05)/10000
+
 ```
 
 Is there an inflation of false positives? How bad is it?
 
-> ANSWER
+> Yes, there's an inflation of false positives. I'm getting about 7 percent false positives using the double-sampling technique. This is what Mike talked about in class, with p-hacking. While I may falsely believe that I am holding the false-positive rate below 0.05, the double-sampling procedure is actually giving me a higher rate.
 
 Now modify this code so that you can investigate this "double the sample" rule in a bit more depth. In the previous question, the researcher doubles the sample only when they think they got "close" to a significant result, i.e. when their not-significant p is less than 0.25. What if the researcher was more optimistic? See what happens in these 3 other scenarios:
 
@@ -68,8 +110,83 @@ HINT: Try to do this by making the function `double.sample` take the upper p value as an argument, so that you can pass this through dplyr.
 
 HINT 2: You may need more samples. Find out by looking at how the results change from run to run. 
 
 ```{r}
+
+# The researcher doubles the sample whenever their p-value is not significant, but it's less than 0.5.
+
+double.sample0.5 <- function () {
+
+  first_sample_0.5 <- rnorm(30)
+
+  pvals_0.5 <- t.test(first_sample_0.5)$p.value
+
+  if (pvals_0.5 < 0.05) {
+    return(pvals_0.5)
+  } else if (pvals_0.5 > 0.5) {
+    return(pvals_0.5)
+  } else {
+    second_sample_0.5 <- c(first_sample_0.5, rnorm(30))
+    return(t.test(second_sample_0.5)$p.value)
+  }
+
+}
+
+tenk_results_0.5 <- replicate(10000, double.sample0.5())
+
+sum(tenk_results_0.5 < 0.05)/10000
+
+
+# The researcher doubles the sample whenever their p-value is not significant, but it's less than 0.75.
+
+double.sample0.75 <- function () {
+
+  first_sample_0.75 <- rnorm(30)
+
+  pvals_0.75 <- t.test(first_sample_0.75)$p.value
+
+  if (pvals_0.75 < 0.05) {
+    return(pvals_0.75)
+  } else if (pvals_0.75 > 0.75) {
+    return(pvals_0.75)
+  } else {
+    second_sample_0.75 <- c(first_sample_0.75, rnorm(30))
+    return(t.test(second_sample_0.75)$p.value)
+  }
+
+}
+
+tenk_results_0.75 <- replicate(10000, double.sample0.75())
+
+sum(tenk_results_0.75 < 0.05)/10000
+
+
+# The researcher doubles the sample whenever their p-value is not significant.
+
+double.sample_whenever <- function () {
+
+  first_sample_whenever <- rnorm(30)
+  pvals_whenever <- t.test(first_sample_whenever)$p.value
+
+  if (pvals_whenever < 0.05) {
+    return(pvals_whenever)
+  } else {
+    second_sample_whenever <- c(first_sample_whenever, rnorm(30))
+    return(t.test(second_sample_whenever)$p.value)
+  }
+
+}
+
+tenk_results_whenever <- replicate(10000, double.sample_whenever())
+
+sum(tenk_results_whenever < 0.05)/10000
+
 ```
 
 What do you conclude on the basis of this simulation? How bad is this kind of data-dependent policy?
 
-> ANSWER
+> I see an inflation of false positives. The more liberal I am with my double-sampling policy, the higher the rate of false positives. For example, when I doubled the sample whenever the first result was not significant, regardless of the p-value, I got about an 8 percent rate of false positives. This data-dependent double-sampling policy leads to many more false positives, even though my putative significance threshold is still 0.05.
diff --git a/Ceballos_ps4.html b/Ceballos_ps4.html
new file mode 100644
index 0000000..5c1d284
--- /dev/null
+++ b/Ceballos_ps4.html
@@ -0,0 +1,557 @@
+[HTML head boilerplate stripped; document title: "Psych 251 PS4: Simulation"]
+[Rendered HTML body of Ceballos_ps4.Rmd — the prose and code duplicate the Rmd source above. The knitted outputs were:
+  for-loop false-positive rate:                    ## [1] 0.0527
+  replicate false-positive rate:                   ## [1] 0.0504
+  double.sample (double when .05 < p < .25):       ## [1] 0.0763
+  double.sample0.5 (double when .05 < p < .5):     ## [1] 0.0803
+  double.sample0.75 (double when .05 < p < .75):   ## [1] 0.0821
+  double.sample_whenever (double when p > .05):    ## [1] 0.086
+Closing HTML markup and scripts omitted.]
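Following HINT 1 in the problem set above, the three copy-pasted variants can be collapsed into a single function that takes the upper p threshold as an argument. This is an editor's sketch of that refactor, not part of the original submission; the `upper` and `n` arguments are my additions.

```r
# Generalized optional-stopping simulation: double the sample whenever the
# first p-value lands in [0.05, upper). upper = 0.25 reproduces the original
# double.sample(); upper = 1 is the "double whenever not significant" rule.
double.sample <- function(upper = 0.25, n = 30) {
  first_sample <- rnorm(n)
  p <- t.test(first_sample)$p.value
  if (p < 0.05 || p >= upper) {
    return(p)  # stop: already significant, or not "close" enough to continue
  }
  # "close" result: collect n more observations and re-test everything
  t.test(c(first_sample, rnorm(n)))$p.value
}

# False-positive rate under each rule. Rates vary from run to run; the
# submission's knitted run gave roughly 0.076, 0.080, 0.082, and 0.086,
# versus the nominal 0.05.
sapply(c(0.25, 0.5, 0.75, 1),
       function(u) mean(replicate(10000, double.sample(u)) < 0.05))
```

Because `upper` is an ordinary argument, the four stopping rules can be compared in one `sapply` (or passed through dplyr, as the hint suggests) instead of maintaining four near-identical functions.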
diff --git a/rsconnect/documents/Ceballos_ps4.Rmd/rpubs.com/rpubs/Document.dcf b/rsconnect/documents/Ceballos_ps4.Rmd/rpubs.com/rpubs/Document.dcf
new file mode 100644
index 0000000..3fb0ddc
--- /dev/null
+++ b/rsconnect/documents/Ceballos_ps4.Rmd/rpubs.com/rpubs/Document.dcf
@@ -0,0 +1,10 @@
+name: Document
+title:
+username:
+account: rpubs
+server: rpubs.com
+hostUrl: rpubs.com
+appId: https://api.rpubs.com/api/v1/document/549194/a2d7a0d184e842a1bc80a0b566751e7f
+bundleId: https://api.rpubs.com/api/v1/document/549194/a2d7a0d184e842a1bc80a0b566751e7f
+url: http://rpubs.com/publish/claim/549194/7334c10ee0ea4cf095682e9e5e13467c
+when: 1573587960.91685

From d3bad5ec9401d99a12984c187dae7c009da76dae Mon Sep 17 00:00:00 2001
From: Cristina Ceballos
Date: Sat, 30 Nov 2019 14:20:15 -0500
Subject: [PATCH 3/3] pset 4 updated

---
 .RData                        |  Bin 0 -> 70354 bytes
 .Rhistory                     | 1012 ++++++++++++++++-----------------
 Ceballos_ggplot2_exercise.Rmd |  182 ++++++
 blake exercise.Rmd            |  122 ++++
 blake-exercise.html           |  487 ++++++++++++++++
 hw4 updated.Rmd               |  251 ++++++++
 hw4-updated.html              |  628 ++++++++++++++++++++
 7 files changed, 2176 insertions(+), 506 deletions(-)
 create mode 100644 .RData
 create mode 100644 Ceballos_ggplot2_exercise.Rmd
 create mode 100644 blake exercise.Rmd
 create mode 100644 blake-exercise.html
 create mode 100644 hw4 updated.Rmd
 create mode 100644 hw4-updated.html

diff --git a/.RData b/.RData
new file mode 100644
index 0000000000000000000000000000000000000000..13950b018874b4a537f7dc802cd0fe9436625f2b
GIT binary patch
literal 70354
[binary data omitted]
zaTRc-Rean_n!V!l!|ByJK?>A+jix&5E?*zfbk|95ifA&8R`IcM$bcJ!Eb9(u$|e{% z72)Y+o?aL76i3-rU_zj>+xiqqx$W44LRkPE+x8?+(aXXMSU%I$ZFEIzK+_l>X(%{R%WG? zG^KRdM~JLm0^7re2S`~jwuLU^b#T+8a-~945>$YJI=uo{cc~@FI5%tq3Is?+=p|h` zVe!F8992kilr&RSoz+m%N=QKt;`5{;R2fZF0|rwHll7WT2$FWKXo@E%mfS(AR`e1_ z!erpEfX!-1^H%j@uX(DDl2mxFxZ!?qL7k^C%w&y9js&wotjyIt&2-r^x-1|+wF!p6 z3=u$|Y23_asD#n46)pnvf9l8re3Lfvj<*WLcDGGRj8tB*{rw>>&h5t>T-+mY+|k2+ zI4ka}NJt)W6(qSLtkQ7M-kiAp#FF43iN|z<9;r%OO-7nyx%$6FsGQBZ&Xkh0&m~bw zef$~e`#`pskkr+XuPRl}1lLhLy(Wb!9LR|f2ZkF2&8I-BLd-pZU!v&x-UwBEwKz&t zAz9tmXUGR7>Iux}Va~_;aPZJiHIgX83dznXu#uV-CR=$X zBOhcl(^iD12=C%{Nj%PTyeEm2Z_aZ-#JHfQVhjhZ5Je|(^mlb2*G~cyKvumkIY>Tk z6FEW^l1@fP3v8F9#w;Xog&Q*XVhK`pm?frFN-xuKMAeEv$mk~Uz#PZc`5z!oE6Xl} zL_}1h;j~cyx0HspD{v9!6c`a4-r6kVsnpH0oX_V*2g-Y`TaE&{dJYldSwSFI7Rb(9 zQb=UH$;LI37;p5P+;;uq2tKZH0F*Cy%p}etaiw9ZDzWE}W?T7F5~x|8${_ZfhDu6P zo3um_mkACRF(|}Uq^$N;O}+@)4_+?jEjLA8RL!Zhm$Yi4f1l7JSI1#(r&Dltt* zLN_t_K1H&?e2M5mm&O$#{u`*FkTIhyQ)-QERT7c^S^k`Wf3uQY5VManKw$*B0m-Cb5yTyue#PSeOr@ zAQAOd*!!5*=S!RhF01|oP1lpYvgr)1Rr{!pnWy6f=jzO$R8ClBnUM@n_$ZO|Rd`Qa z;7U!lEEVT-fu_7%ky|ct(n2NRthDT=DiiX+8y=SEzRziNfQIK5Y!_xmw{2^7+l3&yGx7o?CV(Fm9C z)G_^%)6RO%rFO~bq^aUXHm@9TSQ2}_QlDXf4#wrJF`4~BVc>T&dDLx160idhqY=5v zqo9)qyMiuCX1vk5Rm1|4(;*14nuP8$-BK8{in%zFgIE~#tHM>Hax{RgjpqK+|BOTx zR&phtUm#VzHKSqK#!r>+c4{Mkm}Y$_Jt*paXnL!6s~Ctr560G1_`d?xAW*^aBZB|5 zB$5a+Gsm?BY@V?T7Vl$=F{8c$TR9ktlr2_6qZ8>#WNp1jrK_WH##c~6tG)E~f#UBa zMkoS0DB_tP$MOIP+SpAqD%M)ReePTHJgK7zf(P%wYCDYMtZ~g%@ec8EnAuoJNj1Ck zYOEbY)-PRKqxd58WAH_Ma#x^1wFqWs*&{1Rc^CW*<(u?;cWyx{OoJ2v>~WUEEI>MM zsD`4ot23R|6&dht+)?mG#LTv?^lGo+x|!1x$doOnvXoN%8o5O7{kG|^3L<6Q>-1{i zFlh#CE<7Ynb4ctw{9(HEXYaJs$L64k6XYG1rHpLg9c2VQ12Ix9!8Ui`*8j|^4IxiF zGQ4BaKRX!9{I*}HOu0rcBcHP-WJ6X%!+~0g<&{a~$h?uaDT}0jZ%X4xOx<4at9ig0UYY5Y=83l8Y#x#fGerD9+x{d< zG!vNn@+Dy=hjx_k`3u%~9oqTes+fwX}CearCVpSBW8=D`=#~i zWFWpy_wcHU07hvBBP7A&jF0(VHY=PgDU&;qj-R)|7pYXFlsCZnesAE-X7WUoH>9$| 
zqDyLNnkF@T2I;5@vOU&&E|WmOt)de6GSS&bSF`BmH-i@J0-L)(gYbUP@>k~U-*zmY|&-T4$kE8_iC1oRm>bCQ62*irzn8dt3MpLMyJmD)tA+UY@)B@gcJ8l zgJ-HNqW4kKG!hDT7Dolbc~p85DZvVVO{=(bciXV&GW$oK7eF1mHt9UEV+J@&&PY=5 z>hvvkDNI9>k=Ko@FVq58sqgZ9?CsU3C(l_Wrr=Xe`Lxvlm0p&(7hg{7t#TRgtvWSUc4~LQL z*@9o9r4;)jY)tG^PaxP5tY^AADPtx8TQwo@=bMqmiqaNaBJAMrs7 zN;Y7g;gtZ$s!DC*b3lZ>Qfyx|!XbRSHUu2X06(QiN1YT?L?YKpAkHp#bt- zcYj}EocF-GMYWpI*3Pz%Z!q-1=+MJ&{xBNaDZx9J%gSpij}-Q zV%KL;6?aA$B}|*7!hWkF8DYbLcjnDID(PRe0x$qK3aS!rX7vDEE3aM4n=50IS2xRq zf)pjtS98i*Hw7*=fjZKWbV;uTZG9&f-33a3Kc;|cP?CeMZD{HJvPjHJhg|}IZjw|0 z(bBi=_HKdT0%X%vgc~tVV4@@Sl zLfM#R3ccox2DD%}M`XzB0hA_*zGhQ0z^}0Bm1dElDi#?tGOeBvsx(S{uph4bTOWyt zPcoCdcx7g5WgD(UgY>&<>961ohRH;j5}HA7M)C{SrDnqI6)x!%woXqpn^QTj(Iwk~65>cA0!d4{z_LMM?>ks;;?K6r@(Eu1x0qmjH8z6(vn zBKMO>+0adpLIt?&YSl_%f}~NbBngT_s)Pb2mJ`cs&K3k&C0ej9Lp4JYux!6U1){sC z8+j3YZG;)`1>eVJ$if6>oi<2;N9F02iawdiEATQ*fNPe$P&5q)l9YA5Rl_e7L3%9_ ztdmN~CPDeyN$^2}-W`VIYxOKTu3MXe;tzjp^hFRsI|?+fFnVo-k-{+J^ZD9%l8Qj5 zrJy7g3@1~BqU2h_qL2*pDQKTIVsh}XNG+j0MtPKL= zRp7T`dX5N$+C<@;#P~_p`-+5v=13|UE`jkiqgD(}X4ZX+#NJS)a=`}>Dh+!ik;9Kt zN(nS@&C470RceS5bB&M6!P+9#7s^w)DiRqXb{NnvA<+b5B=8(pr7?(QV2P+AFLA`W z56WjM#>yo!O^KOgCZChY2DEy+of27+kqOona8aP68g$kOiOg9K=FK&1o#0n4Uxlan zD$SOBGnp$R6)H%P0?j6cH&>tPp(SxSbTcv)L>mV7)$TN~u#rUbo}J^yQ;^&cq%B1% z4PNo;jov_1sjKqz)naLdf`s6xOl~xW7~-pOAf>{U1R@=KN~qfnlr^E0FYPx?mDI3@la7l^)q>%7 z!?^o`8?@lWut&%G_gany?iK8jE**90cWQNRJ}u2_&GC_h?+n)L6*ws4X0HBI+Y{Ng z52*;wclEsCEAzj-$NOJlsfwBs!b?i8NwR}F{OAAm*F|nD3W@x*tAW0zlEyWt;O37G zoU^^;orGqTO>76jbnIB*U!Z`^UKLpEfMwxk!g^Y@jX&we8Lvy@F`X-J0mSc2gqm#b zzA)Ev3;YBZ%QOB=3i1yMPjH^DFi=ri`jeWSH$u-e}3k%3FzxMP|?=^aP zt??><99HZxN5|9q4|)!7`-n(}J@i-uH(BnS;f7chpKipX>l6iG5PA2&W7eg7hbf4h zbe__+cP6-``{z@9q~4)16Gokk)jBuxGJtWjJk7lM;ACnG^(j2T;3g+*iF^WavK8nG z76GA>mB1~RdywKLsxxV~G@KazXwR${WxMVwxcva;BI)&rNyqzExtIilux#!oUJ$UA zDAagaf~(fNkB3pGAj5)pvHW1^o|)bYXQZJ|oCxB^oU7g$qIXjrexc+H3M5M@^Rk5V zN=?|tJ)Nv+wEBef$*4fCmg?|nA*jgL>dk9B9xJhIOLK3-jPDIWBg37;8sFpB2rl|sRgekEb+S8U_1w47czYOFf_rsVn 
z5|&z#vjkkfc&xYKnpL$t2UR7u8(OEy!han<0O_km=v6N~cVc;?p5;?ajATOA2)mV= zj#auT$f$mv@DcSaVgUXpfr_OF4QS;Vg>A^Nmy(a=wTki$`q4D1Gx>@sv7RDf2^;~E zA`;89bMxY#I zM+x-?xuUJHYP6KseaDj$ZV4ytNx2c0?ZTcuJ*~P9=|w zC5?q^p>NokKqYyV6TrnXx-*b^p{Z$@FUK(q4g&zD8^~8gvQ<%Pw0diKS=2zR_N<B`HIl5b)>* z)mvWi+ybXc?v@SdWu5b*EzswibJ@BDuIAiO3f+3pX(~t`SmGLC&RwaeTWy^j$vF&K z9|+3os1MPS3u7fs2>sE1(S_wK$wWBtY(Q0B^$Md~lLW3tB*XO;lru)~$(pH$LuW~t z5*2up3#m8tuvY?fEF+dj0iTkXQwCzyiZH#9)g*6ev|{w&j0U5#n!q?s7zq;qMMMuM z>Ugmr6*$4OrXgW^_I#p);?%zRR@Z+*1mjR4` zBb_p{<+gn)GmXxo2n>ZqQ#g+5xLWbTj5OQsfYY8Tu$R?cp@L7aDE%Y~LVTX18YFO& z4}60-Zbd5~*>0U6hde^H&RRmN8s?!T5^cZdEQxj^;`~nijFQ08kzN9Zc>=Iu(DZx~ zSi?L_8Rzn1B_7HoZ-Ggm?{7KMWbEP1Nh8LcZ=gubusdQs4yvy(xPHt6+qHlbJaf)1 z5Mq1M)Lk73BH+p*C4?WrH=?db=d0MuMlmm?@h*J|f&*PonzV(CoAd6FRcn=I*h(Ns zSgJawE>l>qywO8Dxq4|BkYyH!2=#763^*0Txqjg5+u8egsv5zk^Z2P96Z1zaT3493 zNkTmoyuy%Z@Xo+yTPIAvtORs!l!RM!4LE$*=}{AqupL|bTb8ad8$zg8drr8FRZR?@ zni^h_jY^z2lATvYJijhUeJvP8?slX?ZuUiiFFX%l*FDW%I-Wy&LWNFVFsqxWNX+Qq z?16)MUzpweW zDK%t>A~E&AEwt+Cw~A=#3k@0zb?`_vR*ZKBy0R_3>zj5z{VRd!)Q(^IrIT&&(?_)r znfD{l>!paDrcIO4Q<%9=M>_94e5>$k&2%LA5lQ2+tv`2t`=pI6fVTZ5?&@mwBP;FG z3+aGw9QvFvJSRn1zNhovfk3oyPp4N-WsHz@ZZi*N+t5nQ1@5WjfI0mH;f3)S%;+jS zq3A?(mU}*n&*^!OV`&NPlhAXdXIe&@7aykKuX)D%Xsjr*$*;V77OE^dXZ9SQI6FWV ze|<+hJ{MBz%xQQ;EPU3q`3t;cWSqe?A5})_f(xI8E_PX;ir^X)!8l#sUbGYA!!Rm*68AtNDQ>A!DS^`{F?6g zN7k#q%3>y2piMjs@%JN=&YRc$xA(7dD`9u0&O@Q=wz^eu~^weDD0&h|myLgXX z=tO$+?o-qvb4h3DT;RHH78(eK6La6j1VApYvCO?I>IUHw-@j}Ey| z2QTbw3Q7DLSe(SzB3qpJ(&E(wKVL(W!rLVaU8cFLRt&*@40V~`O_N<%)U?E}3av%^ ziQlU+S-_BCja?<$Jt$|Za5qb*Ki{7%z39?gZ2istLP8SC95g<;ylJVOAmyIl)EjlK z%XpFIm#n75H@I|j0K8Ql=z^C1F0fC;)gE&4XFH5t2xwv?SytLvutSWN2;{4-%S}^- z{Pu-9fyBuXcC&&>7?b-SW}AB+K`FM4dYSVxHN=`P=L7BpahD zs~MUo7uBhL)CddSj~O`aq8bi3-`<#zyHjxUB8xW6QaMNHSJ7+JLJe`3<%tW)p#d%m zm*Zzff3`v!h7{qK*Lv}TC(!(vJ>n*IbjgR*K>Osf#fw5I_oe4t z#-p`VKl-TQn$Z5!<#Fw^>7A3YC> z7`H64!{lpVSrplSh|!aZ$X_b0w`V&O)s0-GE#me1|f`7M4(V&%^_ z#<{npA$~V`85ZOgk&=4Z9&duT42F$D{8aeYJQD0o<1!aT!0W?RI&`TpWr}KhWtD`!AxO-kEW^0G8~B0Y+Bm*k 
z6j1-H%^PF+&Tf_JceVX0)lKJs@q*5#U*HK5P3mLzv+bWLkcqpUT$7u>M2?NyzP3xm zm4;_mbuTm{heMrG*@L`{N63-Mx9IFg9qY;3tH+=hlxw3M{0a4)Zl%TPZTR|0H+6Ba zi*hZ_p4BH>uY>G1;Rcy?9M?(FLMn1hb*b9Rq|{-qOyS3i)AR7-Pt(z}FyFz8F0U`4 zUfy&@yc%6><6K8G9#fB4QXI($HUCIl*WNK7O!+c5(a2?=txpC7o_bvS2|M1zUyOqi zMEfD9+~kgZpq{s?(;@mdDGu;E`>DA2^RBG$wTPE-iGya(kqpPlic6f(H-?|!b|-0$GYyN? zsSA0ii-2Ps=TdIyo6*G}aRP&Qxq!x8m2%gJxCSqF!hlFLFDS16>&PqCPxfKfZT5+^ za%cZH!y9|UpMD;*qD(Z-I`APOt=4CTSsQ8o!KHrE$5?qloFEm%pA2!HtXN%e;KNy9 zo1ONcGi{EnGWHHFVC0_t>y%Z@v=H2zU+uU7^)qc8|99WICVrp3s^dhBeQ9A(-1u^S zKz*9Mf7Jpsa`>pz5UXFZCj>bw8MS#;S7kWhniUoJ<5zN_=RQGJP6+_k#6KKn@H z(WeK)_w*b|?2<$M+kuTl`meqv--$QAE_o}HqmzEWb|kTTV-5XR$jjSNp|$lJvHcq{ zPY<}AEqfok?ToWgRDaiJM?}t1;RgG1#_)fvh(Cp1@w3Q)%2`7{@G@cBUT4rVvh6Y4 zAbV9DX8DqUkz4jr$ye99jwqJj2^cR)qoxcN2Q8gi?h!C@zb)49xpTmc?jNF&iAmY` z@{Ahf68>XC6DpZZL#i*mZLJgX3I z2zP)Hj`Qqj&{v`b5a@7{!wu~lv;#_h>khu_J#wV}q!LcO%aw_7DZgJ|VyZCdCF7)Bz;^SxUuB!aW6288&WEgayJRjrh1DWj` zD$9o$%h2(!I-V_xNeXcm3&-gPB4Q3h%7+Rw^8)G%?JtbK`8j=O?^?T*_7EB0zx{{w zh@M(ipfzCH$HkDEeW91M-3}&fxQBezF>3O%iJ#_8=oI@6@*xGZ=X28?67zDom5j07`X6Hv=D zJvwl9cf!&9$fLDmo`a_rt`41+I9g6VMbX~qH})Pq-F5^P{uFj}p}F`&wU9w5jn7#n zJTv8kDQ@kLPu!Ce(Z5#}KfDVpT+|89b9KQvew~er;!o~!=EZIc&8h01609083vwA> zUJ%gr5fpR*H_)&vy$0Ov?O)IG+FFqD5^0z^DfRMGeU-s71UV6L>NGC_?wSC3lH&A$ zeZ11HhtMx_x6{9gcQ{5E=caS;z0CAhe1xRHXPlIpp4VQxcq{sH)$2nl@ge{{032iX}R$=B?V!z57Z875RFE_98Ikr|`@=MvvuO*8y;L~o#au38C?P8zE2?nw#EwmY1 z+vRuIvS7W!>QXsvPMg|PKD#$!o0dD)UNL~U0Lh4#l$$+HdT>To#$B?n%jqoh7n zv1w`_%&P;+o(=b~Z^G;#lgoJlU8Sy#ise9n0cp1sd)1n23&`F1as3!h4S1173-2Ij zc4tR?Tqr;`sxCEqsf&Mpskp>(!v<0N+uhw}rz_mJ1MVAU-^kZg6P1@Px|tNGiK&zuF4dH9)CCj4`uOy58&o87! 
zJUtkuR4v?M+el>mYR7nM-l)dh?dScJ6}dk6<=f7UhNBJKVnh8#BgHW_wAp{Ib3Z&+ z!3+64qIl%>5qVWN7gZYzE(1$bbNV};Z9w1Y6m})YzZ%%9^U)?dr;6(3K8nje72oF^ z?Ofy(KOF096CLNYy2QuS`}n^aU?jcykqD2y}D$Z^J#4T;_B0vh4{<&%Xhh!?phjN8y))G zA@pcc$#PogyRDvGr#xVJ)-p5`ZyzxRj>DA$%hF8WG|KK+rJ9j?oSQbL6f`c^ms$0gG4bzfSyD#c=y&z zwC#r-T(xiZQt$V;uhfAV)Z~{Eyy3V*9$|b*QeOQb3Xon^b!l1XbeUr^cNh8S1Y^go z*t<2{>(rNTH8Jp>-;F+6N>0_sGXp(~B%r4@>XbI&1GBKp6h9KX0xpb@tewIQmZlvD ztS`6U5mLyjd{#LztB0D*w~kl;T%-C9<&=|LP5-&1Td9iH#gbM{VJHk9bDztER*-JB zws=q7T;N*rSGr?d^(C*+SJK6XqidT=zSjRlVcbd$RX;cqSGE`y`XJiTJlESaB`5D3 z8H+8@q)>Q&Dug%srUZ-oMfvv>%Rziq_T26t3Uh5?=@+{9cp!)4oRG=QhX=?zciBM` zu@1S~s|8T{#H~WYI6a*|TUD9P#iQ1|yGSZ5q7-)2ZpWlPblK4w(K*t1Zg^%pasLY6 z!edMR{)+OM*{S{K&)AO7af{#m@;l@6`x ztIF$!-)@C*znHpiIP)cEWoh`WfPy~`{ABp$Z|1wxw6C~_*Jb_r=I}M*Z|lf^zS{HP zQRpJ`>63A*AKh!#oF3lw!18P3nbWi3&$7O~gKFu0ytJ~Vo;l;2^6o&Z?}4}nR~YBy z)N|**I}rD<(f8RkX6Ina!+@gy&@CBx^m)pE9nbl^BIGlY)3oGs-GAL&v3MkH-46V% z3!&e5tSpMZ-b?=xw)rdeb;F4-QQ!WW=BTb%Cv__Q;T93`B*F7y-B;4=cfRXS?!ETt z@Z%}(t>kO?lHHky-zJ=vv%ZsN*Z%at<=BRccbi>LH+n~jzDupxzwCeU_1R&U4Pa&wc;-t}%ae!Y7$!jFGIFxh$C09LOf3dMPo6i|*_Td4n0EFkQPA zr|ls{%wzs0UGHhd+RV<#VcMjkoNcCD-e;503FFoT#;3+tta+kO7@ZRs&^K)Q7*M9C zgQfJ(a7VW@${McpX<18h=07Lw3RgD%!|i#u@oy(uUWf~UyfM65>t?^XYi&q8>f%(9 z@O%PvXsgLt*OZwS*0NrkB35kQOd)G(@5r))$HU_}_WcT?A{r;kqi2`qH0m)-H~~6E zpUV59B}CxdLI;C9Z>!aX;2Xlo3ys`cQNka%GisuzVhP>Cn0No+xi|7o1bzL@A?L2i zS)8;SB1Sd;Hw}>;Q(^X3(aR!B(id3njnzF91fJ6-FW0~-(wsW6gwMjB)zR^>gg9Z(kI`Z^X^(OC%NJT|#qBVubDGp6s@D357jh4qd2 zInXQO-1i9;R+}$p@qqe(O6*Z$mrSnrEn}g-avZ*>>(o;zXI@ z^De`*1x5P(WO#ntkzzW|@t6Vp#{=|)7mnc|t+pvu`G&FBJb>*XzS@oYO{5 z-cm-MPtM))VOpEwC9J5m9+ye}Y-tFB=hDZUR)|*KbF|y1LA>N)hOO#TEsUMg`u-P( zyiW&TpW$Gvu)LDM)6KzXe_hrm)cjpYb1d!71>E(pMPa$BKc=n=nsKq$$SdF z&~lygax8%)y#MYWEn1z@pWKr`Ccpb!EWGf1rZ!sO`AYM%&9rU{5t>xe=DnG>&jjz; zynUvAZ|pVj2H!++(o?$wMVw!A`dxpN#)*~*E$8$s1KN-+fXN0n;8TB9X~Z{?*5;Qn zW^$LqfhMJ3onihiPOx{ZJT9ourujZpGB-*deZx@gRbfUi*`jl^cg7*|5;$Diu{khm z@<8tVI=p>ZvC@k_I9oIi9^T?KM^bdjaO%wNWX_xQ&ZI%uQuAWghG#r7Yb{@kO1&Ou 
zSnFp9;MwRu-F%LweHzLWUCFo_#ZWILYkeBVQgV3CrYr<^ksQz9SrdX~8LBhg8 zb069n3cYGIJFEVxz~~+g9;0IR)=eU5{9c@uFz+w@{;L9udo+U>Q?s`T5=qlQzugJC z9*XC*OY3eB*|89%rWUUCF4W4WUr$!(ohj)bb6Cxg4>A)CN-d=HF4Xml&RL%S@MzZ@I59tqAeqFS(ywp8xr?qMQvsChs%%kKm$^vU6j(AspPT zcA+$b*2Hs!&kXyuH{%Vir11_}g`y2Ya15%FA;kb}@)??qPUSP&>Ym1)BR{<2%Uhsv zZlRk&RqDLOIkMr+8b-1)0D4)-&vQq#o0j6OTvwkKMGHQcHTgZ5=QAp4LjM^~SXCyh z9l%_omaF;iZWm)7X`#hG8D7ZdZ9^wbm=l<^ytFR_Js*4f4;bKvqj5Zys7gdRl>((n zKH$9=W2Ut_DO@kkY4>~~T{wWdaQxm%{%=h8MsOnZnmb3vZ$+<%j&p+V#Fzr6^qXSN z%KSU*;%z1|2f5FJ?%CwtX)^y*u0t3AoFfBKWHp}CN5VikZOG`XYuZfV9guXJ;n|~C zt@qF3WZDc1TVSgs3L#&9>_)pu#`pR-AZ9kR!X2C|lvEdZ|?(A ztmHF!F@eK`WTzbkT>-gxx4fD_|C)-)YWAoR?;;q`Dey?w4-*}B< zK;1n(d9mOd%SwJLKR(3z9ASq6h>bM>x}PxGB18S&oPsm%7`|q|@j8nE6$mV%5-q^< z-7Jx5H3YuD2SC3V{U_sU4Dc5RYn?$k-x70e3jTr6divn^yS#1s7qHK1764}Q|0ogv zuyNy)VbboM3+mjJCKp(_XN$Y}OaEZK9`;5U%Dq(spbegM$=E*U;WJ?K+K1?%=QQ&R z2nkM7EMZ4@D{qXLb(DEF`BgWch%G#(*7%d7h^w)1aOx2JEJ9+|MPx28#uV|ET|5bQ zZ{(B+z2pv?rLbOFw2sKOQB1mtaQVL*t{V$8zbZ(+M_A*mx^*__-UwY}`&LZtLm*Bb zxPaz-_>Nx976K!>ZY-+&sxY^guq6-Hgo((BgJA z@{rv7SP3q%zL^4G3`aQ~wzV^AALb=U7`mjzP=J2~3_LWYCwMWatdrQg&A1? zG}AQhD(~R=EXU_+qkOb2MVNwae^w{cFd_X%(33cgi{#G?B<|Y3JX1F#y*%WEJKBh` z0&z2n(aZbvwD^n#n3B%g&z;mX=*spmG;(d$}X9!cp4`$1tp>98(gFYd!C zL3(O>{%@mzBMu{R!t@!2MV4-a=5k`*J!?s6^*!4N)ZST_NV;~9_MmW+WK-D>b)Z+@ zr_0QXIe3`e?Ubjy+Uk+7Y>MM}74u%9WWe#%DC-#Zs^EZ4OGYoddQy9S+Yd#b^iK9+ z!mLw3;mP2W&@uCwV+KscDIONG_3KXY^U5rQXa1^(qrkhqqS44TpCk%BTPeq

!&b004EVw zLk4XTdg&e54TGpx-6lZ!Q;p6-G$+hxm1MKrUR@kw}FZ8>Yj4SrV&H#?`I?Nw%F4q8yOA3F>rP*{Yk|9^?dL`09E~hI_jW90Qfk zOe@4HPbkMv>rzQ8H2u>EX&|7|Kb)#&p_yKozA-1GwS%A!7tS7P^F2)$(L;VV;8|bp zZpU~eiA@RBe@j>+pRhK+>NxRUXadm}D*^@gIgW-B)>{8y!gYVLNNCJykHiG*oren6 zk0-3XkkFYLe48(c?hAbns=Sd{MhRA{WzTZFq^9{Z=_(0Hrm*fSXioH9ap#$ISUYl$ z_0~_cPf{M&NJQATnetdURAP88+s9~{RlMwQG@J8E%}!}=@$SavF7;fn^CC2XDpjJ? zy#paz8KAMg&@_-IG|_#r5`lH0tSE3yZI6k-p@UrR+k{rA(qJ;zHlYD3M>{{J`Wr5) zyDO@7;i`w%Khy;4NjLAVcC*5guZ}v!R+MEVI`t~mvAc^eGixy$l)Nro;N3;=G1F~iv_V>gh)_* zPm+oE$QOvMFMIz&cP_*^2jr>T1%(7a;nuz3Nl;HoxLbdCEHvnac0R@yEeNB~;8>KM z;Qhr6D(#I)Hg>3i!Pi?0G>^Sdk6KA)^iXocCGx)imS!UW9*G zk&P|4uM=Z|eTYjI5k+q0N(Lo~$RKC0=~(uMwMgzu>sY-I;Xt#Wt6= z5}}py9Fg0ENa&VyA`y709N29_BUGjL8u6;_#8vvzc_o#n#8AO*3BCZC_I=js{*Yo; zlrz(jW9#F{N`<(zt3i&JB*G>6Yy%^~_nCtG1CADyY5hg#cumAl7=h&mz$R=)3R$%b>w5TrV zF;@EovxDub0b~v)j?gyEm6P+3-SbcZ2IeSJE`d)(j-zgNbz2G zY==9Z@If*N*I|O6QDSlYKyyjG^a6NeHp<$K-Lu6Ix<(YoLi7`eErNYwsX)YQMoe4% zWxF)QBg>90qkRzK+yM&8Zcn6)Y+*Kt$yt;I0(4)V!)Kf3j?I1{wz1siCa5tu-;SAc z0IpmLl!ZFzady`z#@10#aycdxvznzzMnyMz{$m34sX1yxb?91vqq1h}8CoQ>Ii1vf{ED)UYtNn? 
zi_*ZOVUW_mn?8_Ff17*4ckw9P(u|z;AkhH7O#TK%_jBoN)AXB%PaqfA>DLmiD6}Vn z!9=?*=-LNhg>J=Y5=u2!*3M&S=Xd9?uKTVu_;!CIsuA$4yY`WnUcQ;?v594E&JI0c z-Rlnd$vQR&9GbBHg0aP{j5wEXHmA+Qiu@d$>JRi`Y%_l4Ml9BAMYM|!K@Wz01Ib;Z zzBra?0d`v?)aJCycoT*|^yLC8FG2|jmT}E`A%+Dbn=NL%*pr5N`ii*^v)T0PT;h;V)X3Ej`vuX=lpy}b}*V~);R$sEF_v)Y67nIAdh->P;d^1Q4DQ5-iM`lW} z7=Iw+M5>XMhN-n!5g+@)jt@uX38qQFSbACT2rPSnfT;%Sy-?#lrV*9kRg&y&1JnMQ zq6vq{;+UB+WkfFcHX#JM1>tDkCVYlgUV}^awp$3c^m3_f6Q}_i%dMoVw#!!owuZC} z+o^{NGdG6Qc-4{VGMbIuuZgUey5-@iW0@T?A6BILOw9tWMr|%cw?P76+ETAH^FXLa z?1zhd%87{!ZI7{Iub9&?a}TgIEOR2pd03P1M$=|xXp|jR;mHI-)+&Xs^T}rXunTv= zk14YF;K$@+1@MG?NiWB(&yri}<3ujPXArIT(Hh88F-qEN;T=e|+aP~2Ne&*Eoevwl zW`}KjDaHd5t`<7L3ZQ^Te+(x|(44Dr-Y?Xn`Z4sCA&$-3vdwZrwX4R)CyO#6nP_!P z0C{M@cVv9!x@JlD3ySsXoazUXNVFLvDUAuAcLYLQw5oFsY!SW?@ z7KW}2FU3Z+Kz+J%8=$c4d_7`vB4tbw9z<*(zUoE1idKClz&;Cv1@EYh5A7x@X;)Jn zTHTqT_R){{=!%+peJ^6=ZLU#Z>>lawMZ0YeCH;tA?x~d@W>jKv4s8gYX8f4gcw!O_ z$s_4Fs}`H+(DNQbSJZCo0cIgeH{tv5+Ren-EQqaHLj4Wx_+((s9Pz)LcokI+-MuR~ zwLKz-LHEsfVODpbjf<7HnsGvhv<|=@{fLTAi>g?AbL|Y|aP91zd!%XlBlWblLB0*5 z2O@F}Pw0@O+PD3Y`u&QIRB{APa!Xdnb>IP)eWhkO_sA)T{a})}_d+ZP^)LZ! 
zTZW*7l|w7R@cZ52Sx~#B7$MBuZLBASkO-B#4!3$5eQ+}LlL@+48L6ue0i!YS5C(|^SK(&1a zEDPP!koBasW#IW;T+0+fH?(??D~)0UfrE%#92^(Q*+UmK<4`O9`H|C4de2{zpOGFe zpNv`KTYuT6tgsNqDkrSN-u=@5!pBtV>0Zq;fCeldcN6$Uv@LsQiI4?gW-lCUrD3*dUV@PA88Am41^2Cu z*~_uQGjd;^P|aG~vop`X*Y}p{8jL%2=+Nd`-O@5tgJ{9tqCz}g0#WOL*yh6Sx-g?a znm)5K9#V_#3fe!OV%SnloMcZ*W4|E7k>tXD)C#OVIFqD(t7+t#g3Z-4jtbq`@0)wW zP9&@Q!eS)lM0Faw_vFT>vre6XEcAsA1Cbz<2-+jg11D0nKK?y+FbLd+V*>*nz$)BFrCw|&_a zyNE3b!sfa9GW~I*3PZ*~lkP?C&Vre7qd9x-5|hu`DO<%WHP3$*k4PHP3d4HDVGl$` zAI>)13o4RxV{81{tqGp1NB{RV#D#+zJfK zNj(^pt!@rKp51zFtc-ncZR8T$oK1MZ*DFCvt#{6-DHjy-+mzh0bPAbeY94OgVDmBy z9~(az zzofT*bbPMSZ1VNWij$*O$7Z(fD(XUf`|*vOwx`LcPC{sTfl@i$-H{{x-wVGW|SF7n!36W}(hSWk}jCs@kKQZ!;=WM4G5X=H1evg;7sgg3URCsSz9vW^*0B;;uw4W`dN*1ZI-I0t0Mo!4;s?Gq1b) zT(lB@7Ov|${BFMlw^$hkt$IV6jz z*6F|Ct<wU;VGm4> zlAzsYoHVMfzI|M(?`slW%DwvbT(kZyUn5%&?!bpVF-Fb>;&Qe3u)zIfaTluirQ(Xy<>t?x2rAaO>BkbU9zxV(%&13$(k|b}J5{h%+fV*kaM_diP&6bS~nM)`r1>-W@{DhmmkWZSP;uUR(Lv*jXZP(sTA$a zw(TB{{(1$j+*7Pea}A3TMFnvkD)mI@I>(B-B#8Pv`Q(xe$)f2e(?;KQK7MIoKK4-y z_N2`vmNs!9ogn6AGtL8k#FABA5Mw^`R?F3`DLATm7SDj=rj@iJ8DSD+vryu>0OOf| zN!+kvvRMSML&KyW(iKN{6;EW{rU&sP1E=h3PKe3-> zQ+PHt^WQMGD}*DbiOLoAsMepon^}^ZN$3zga0=@Y_00V<&iUn9R^0IKNM$oSr8b>p za6q!kw>t=3^0&7zmkZ_Q*ULLpq*f_BAF75hEYOcH$|H_{^#!w(tx0U=d!TOgjg&Lh zSR?W(a?Ma}Ju;5lzT}GCdj!W-J4QZ z+-<7WDgIBXvryRXFDar1kPg#tcJj-QzNn0h19ui)wPWLR={D8;TLK(_eaT-^oZ(;6 z!M?z+>aN7KIFo#y!tiC)x&pnQ5Ap|#ui53)zthuw&}Fhl*icRXRUWg-H!e;-;iugJ z=EuyN(+T|DUr&v*RZRW`o58yp%hV?k=2Y2Av+-^O*9oWg$csJMjUdB)s4 z?O0rET7CW-E^CxC4Ht4Hcs*ZdokJlb&<1a;%k{mDFSUF*$=-b$VQO@X#64#e;VE8% zlUOMTbLF1*EL?o6xVP5G)->gy^f|UD&g#8%=?C_b2iJ@X|12gO zI>Z|@av0f~r+60A3{&dM&WSq5S>@g>HTcKZ0s)tjVi&?NKiH7Xa-`KP-qU5KFaA-KTha!C?@+;|9;Y3uzh%O@c{O!G%^m`Pz?RGLGbLQipIiKyNz7VPhS{ARcD;I*ZY^wf* zx`pkF?&@+KXU)9p0ZBQJ@_=1pEyq4JOF8HAz@}_-hCWd(MoOi;ZpYS+eIi+`lym^C zT6yHFx|5__C*C?b&u&UNhj6d9rLZumtathB7~az0{!fHb&m9!X8Z#a}9D@HpFw^ur zsAwhse?u?GXva`X1IS6<;=~UM|2N2MvmK2L!TeJz=Mp=z$8`eRICQPQa03SWulMT$fh~1Xa7y*_J`oqAA4M5eNEF 
z3DZ_pC{dX_z+~z^Lt&%mpB}wT3pi8v3%FlOK=ZHaeWvDbt;>MOb>cKf(M3B6dvoF3 z0ju)~fPm&0qB36tfUPXipkV?c6KIIZ0)TUucl*0xKy$~KR6uhCv4*_QWBPcO@~eQT z1`6}P0Gs|KV-YvW!25Tb)>|w%_gPA5&{X@f$t7P#9HP!6L~S+g9J(MR>YQ9t7);{m z=;FH3)#-=U{h{tA(LcDo0vTB!w%%ksDYH2|XJ=ArkP=|{o~P_BX;na_z_)0usaM6w z!=hu#XthY(<%RfuUTb6D}or_bf@D^c;D&Nfwd8w(7OUn{)h(GPQ#ir;kc`xF`Y zk9Ok#C-W$uzq+u^BtG46jkI+HHq>ym#i{OU3qVJEm{IgXb_8nu0W!y!nSvS>Lyb(%U-6%%_--sH5bs)?ipCK*ZgV60~ZVXg4eO<=q5v2 zK>qo^;l-OUX>l^*7O&A$1BYz!`hvHy;&w)#wYaSw{)MDB&!vmV2(ABj{rBnNkQ4s; zf;X}J^CLrAY*r(eiX)|+SKj^yT)|HbG~`Jetvcpsv>N%WbtKDN`HtOq zZMr8^XmpFB3ci<9WdMA9*(z79FEm7cZF`c#U(9}jvPB8dNn3}i!1s#7OpQvHibFM& z$X!gwW357$X?ABcoM#f*Jih;3m{9(Ynqb#CZC!e0jE2%ymuRi~Tp~~aI{YtXwmhaZ zlV-;i?D`4LwXS@)c7a|Agg$y@ghusDaYVd1&5kA5bvE&DNA8?QtOo5VR_=T{BEwiav+BQKinPL&uSM_V1(EJq8{TnM}TA4a?5!K?j#5#g0_KZYuM>hUFVr)+G%{6T&g3XYAF*vahc|u;DQA*<-)70S zy2a%8`26zaqk}N|*s~hZD&u76^gAZ^urATPCojda7TQGVU_{?7nEL}?i-KvCM@49 zYzi17&4n(n)V16dvo`A9=Uyn=+0~OQfJV$Kt3>N?d-*eFNCa9XnG}^o6^-~Ak&Q)s z9s-L4TeLtHhgtvBFuXE2()|^@RA6v2Qh{K@mCb5{om2SjNH5;p-6(q0zz?&`l`g*a zM^62l1gX`L-%+FHC|hfRpw&2pcBCrsh-1|4oLaM}8rAKQ!@Vnm;MSYrwG%dROmcO% z=#OQQ;bY&fxG5EIT^{jb^FoH#T_(&k*@2@w*yglz5r@<@sW}I(V5bKy;H@fAB^S=9 zEb3ASPIN2xPxZ0iHcSV*eZ1bl zSgZ2onflXZ0NMiCJSIb;n8sW})|keeLe}nT7>BNj*1CoL7hJ;D98WSBYu!TEczhlM zsC8>YupHI2qGBs8X6hrR05FzW;4Y8FABq#6-wP&Iw8*Foe4dW2KVrj?4vC$$d7r0G z);B%9bk|bCf#=b&;s(wOIXW#LGW(d?BxtQy-!HA8PqkHz*`=;6)jB5lgc5n{++uuw z;Fw4ice5Tzfi+#T+By}+93oX6A_JiQT{a_Cj;8zH((cALpLhcvgQ}?-Kx$>RRrO+eyyi1(N<|g?VKcO zxeZQ|jOR~IJ891&z%o#f%B4CuM>yE7%Y)mzf}cFkkqGqX&WoI0#NFamSg~a?a~$em z_3FRU#j1+CwPaS7CFOl(Q_^d2dtdS(eUW1JR5eQPS0A`64fTQM2jr&y#zDqB7E*?r zKfsh_EoWlNGL|Wrr;Q;{mhP5Qn=IY+AtfveG{yz1n^Rga#GO$sd&@h>*cI0#BJbyM zPO8+KauMLydz+j%DY{po2ze z12+dCL={S$9Ptu`EdQ6qp+Q-gz}RDvrD_}q3Ly!zy$huev`<6VK4NHDNIK@J3LKP~ z?}$5*W!}rmXV1vjL)^?~v2hQr9%=J z)E+0=(iSV?oGY?q#-Q0pSaYzQl%JPQ)=Z5#>?1e#kM$9b9p_=qspVEuxh9*`q~DE*;bkflt`Jc#`-{ADd=wfiomyRg?S zZRKF>Gm&MOrA%!s0ANVzj)KR_e7;!EWj=4fit(q!DZqoQTFY2;Kx5vg`z2e~7)&i& 
zQ(0ylPD@JL3^U5&PdV~btTEV0h4cX9d`g;^oWBLnL>L11IZgZ+yF~sW7hQFAo zRLpYP$%MB#?0mTIY8^>(aHcwRTHS5wBWyW1=Qz05+?Q&J3rd~!&hSj*^Q~(!QB?;A zP@hBw$hL#8hu#Z~kOS|dAdSJBM8ooayi#TtZpkdSKiPe0E17Y7fMIQq(fI=- z8Q7Em_=2R=ZJ9kM+pnA?$x1hbXqRMoii2z81cJhoAGzr_AbZR@PXypil%+~i2EBq( zDfnz$1ksF^F+TxXUi#%|_k&<|Q;bViE6Zw2miB{MH1lWHb9fP$MVPhivB+waLq+ZV zoVxpL)%45CB|^x8c%&(EAbaMuD0bh0T^3AX26Zd8e|KU!8qmiL1jL-N=$Aj1M6F5g zWiuw@Ej9X3_>pwHdxZArxObd#7y1{Ly#tg;CvlZams}El0qe5N=Z@W4`U`bP!QiYe z($Z`&7PZ2Fnq=6r4Fy&&z<`zzNtTgoEG1MP!M-;Br=jT;fM4&f>Ryh`Ml8^UVKv>E z!545AccyjXY6z%bqf@1`f{v+7z2p+JmNtVqyPtNPd1hQHsVnWEigPQF@4k%pw0wXJ z*R|9-pA1Y|oev+Y45IIIIaf1qHb6I|eOI9y5}y4FdX7UqlBiutm_i>IPnM5WclWiyyP)v3C4CDOsPT<%?f&OE_Grbx-hI22izlB(d@1-ZjeWe|`zH{=yeFZg zeuZP`JhA-{8bCB$CMMf0@d0m^$Uu^Moy@pX9Wuo;G|n(D`MQCzqUmOt=?W=#VNUC5 zU9=U?=N@4nvxVkg1!>;Hy_sgM4YLL8Ul1un%zYPe{rD_Cb!eW`)1|On+quHQ_J=%O z+K?*4rIsqQ_*&3P)J@WmB<1~6{)16LhxlSBwy??3lrm5O!`2BfQwJ-ovmV7OXzFTE zJlIEB6iUh+UpPggzO#NHHb#Wk2S?IWbgXk6;>hMDgElfV(d#;WE^G*D`q{YFbcIBXzzkT`&+PQKm+Z4P}58?rh()$%VyJ3uiGO| zx?o*3U)LbjI2$L?Ak!h}etnQitzVLEsGKB)1w zjLt;M2(=HUG}IyATo)&pW+o;Rp6uWp?jT#oTAe!kC?mVJzo~D=JL52I7Wj2UCF^J^ zpG$^$a0ci@T#Iq5p@U<{g$jjmUF6VD*7}^bGKCcsQ{dohBHs=qoK;~A$u!>Im~-Z^ z--cwit`HmOIP1a_H}IBZ)|t@(!_T;+dOB7TCHFEq2Jvy=SUtx|isXF1FbQPWFV_iD z`MZxbfrzTuWeR{94y>}Mf#ply>{ClKPZG9%p>nD9I|w?UFbJjhhu2^_vyhFL&P<@Y zJJYq5DTGi=fP>)CAkfddV^ANn)+v@T$I}8bHWWtJ=8ys2im#te+*b#0UmyPhxY)W@ z_0p=YC^w+8*;DjdK6 zHXrMAGYNY3?OaIMmFL&LGl74)PZmu`e>yiwU2hsyrboHw3>%r?uzouu%lR<%ftzq} z(DT{CW;jNDcH;&%KS*QuFk>lw#M@tR4@&TA9Y5{P+$gZ8VzUua(+-QGa%^VLsq8NA z4BkjE1m&yfZQ}d_zt4Z{x{+$w0{6_`SsB>`4F8-x)m{3J z2^Yl2rT2L<{3jo}`Wj?V7|e;5;q)SaLI%p1zeMxYY(jljB9o#L(Nu~ z2K_1f0|8Gd-+oefN*fO8Za&T9^Ba>5#0^+OR>S60Fq%4xkMb3Q1&FL5 zSTq#EId2x?t_B&a^{7WonC>$kDf|09EUqA6?< z*Yw5rs^gb5s`b-(hfR}%CuXben2zzSaMgGIO=D$qEzeKHfhP>r_H2qqW${lgcqF$E zEY~~~!c8xtsT-5azL=@Npch%jQ)}`+Sbw$lgTmzqhv2Yk!&LPXgN==| zGSGmnPEx)kCr&0N2e@SDy-$=P~`f}go4|W~j z%?b5jYX2nO7pEcKhvYb+&o)I-1?SLnCzRN^<0hVIjgfg-U#zWKpwIEWK4d;o50|s* 
z@x^b4XzhG-5I8DSH9}j&^U9z_q^g$3H^QOWzT929Amz#_f*%)Aw?uh=5(oj62lS1W z=?*kp&Tnt_Thy}P@xG=W2P$an^BRr@ZQ$t!wv=T4sj` zVAY|%GH@4HMu z^~c_^%$5UgCit$eP#;uHQyYQJ+HJ2sFk*4g8&4dVT*mupPs(tloI&vDBWyfrJIcFZM?@^Z}y3 zI&E0fEm7l9T_1G)sI`IF4K}ks8?+6zU`_!pt$BFLM+56AZmW8Tqj$%0ANa#Q)Tw&%>OS^ScQv&YsR&u*wRNWSNV|Fr} zBoIspt7~rzYMpzO*slJgHS5~XYj^D|eZFsAoOWt4bj??X`z{*B<1akjwOqkA zWaYyNn{B3Rz-4FG5tkL{R)N8j2CS7YHMR8v7nP>A0jZzNYRY4E<5YGv-?6&(0jq4^ z!|4t`-%hI%-^!^Rl<%Nbqwn%Wj<@gNB;;?SbCQqx%nk{3ZUyN~s?Z*KZ2syL^L>rR zL)jMS%G$L+?K~5Bli#OS!xP-W@d!1)S>x+rAr;;foOf6mr2dFfxCzHxf`8C#;!xLE z_gtZ#PL9F~kv!8)+dRZ{)wXQPc~3@CR>Pr=R%)s4NKxQ*0*SSYjBI;_U-j+le=Fm zY13bE)*6qt|HR`sSc0&A$BH@{Egf&yh?~`qyMa}NWbOya2DQ>2HLLbt%@5SW;r1Z2 zGC}`XvzXfEsh<9{XF+Uf z3e?x$w_5Z`88>nxwKx>E>9V;wdl=?sA%9&qv;3w<4wv?`mff9$3}dsXZq7j__=e?8 zY?J?<+vYE$#s7F?zX7@F=SiivA9JXq(Z{pQglbI5q`VQD~}ua(hG6fzWEO-_lOM6A|zUL_M(D&%mN@F>skG zzl>7E5Wb|VM3r(JHI2=t*IPd4~&PBr;crv%D3es%rqBoIBY{($i<$`AS_do5f0 zK)2QP=pxn7k~r;j`kd#a&2i9rM`X(6p0dwFvL>NK00+qd^)*_+WTI*C&}LL)Z@?##9H9v+Y+>#P=o zQxIW`x@_)V4LxglOv=QGky+6ii=szGS_jSsQX6~q<7*o^4dxR+F;qI>$^BtpNWsnr zVe)ke|VMJj*L%wfyPzhPqcZG!w2>eawCyz*i){vu)0^nC+St9ex%Mh3 z?cm0!ve|~d3AM7g-g>~^Bs1`RlnD``9-LdOOq>de=*V(cJYH(bLb@yNicXJfRM9%( zm8MtP;uEGY6u{LUPX7!9Uig$Raq4Fj(@uN}Ut3`|TTyS8{ z`z4vp#*u@?eO~s)2`=kFKN0oUmt`uOo5{`EKMti(wMBy zUo%WljYab*QC3aCMkiUSQ;oiqDczrLgH1o$bvfKtgDL9+cxBJ21+?#|)xhRpga776 zQv+oKFsK{-4GNo64dW-Xbx}kiKRPgcQyl6+=#AMpW-PS-I!6_ne5C!Uo*(x0xG)k9 z4(Qz~6opp>`!eVI_Iu>a`G+CyKuA73+KBTdG4n76rk<*?6RrZE7staTZ9 z=UbgM4trbEh@_+YefN_ndACpcg3}$oB6>f3pUgkjMEFUYWM%vCe0;YM&rK};`hHet zpk)wtj2{&`3O$ESqF^WDvWJVp9Jx>WAEZ5aA|Ysy|4{(kIx!Xj3h_;>{0H{3G;_`@5S|TIPCA_xm>*ff+abVFzj6 z$F6<*e8d5?lPKvvQNcW!Qk9KZ@*;X(J0bG)FEmP$vOqjuv>Vw5Gg^-}q*iR-FhyjP z%GzCgT04Zh4cBS53VQD_D(Pf@*7yB>r6Sj>#*d2EQchaET4W}ZC4><%Rh%wbEfN%O zk-^@angeehp|aV6R7C0BDzuCDAMw|~0V7Wv^RmE*^4;P*g({GkgLNekq$wL-YUH(xXjrnSF6LOE>`GA7kVU0pdW-% z8TC>(W<)q{OxE4=z8gbYoUU$$2YS5zviaP{$j|DCm8Dt2`@YEDxn*&@A>$@^y0pVO 
zwn<6oD;Y8GC^cnX0@f(;d_Q`^Pjo&i-)+f*u*EwOz7M^)Tf!#bM1K1whAmEcr)ED5 z@#RzEeMlf=u9^1vL^f_SXh+7E7je+==*x7@%Z!{48YxYO)QSA*j(*^imqIpi3*419 z5maYN`TB)Q7c=$4QTjk>UcM2Qn-q=5brK5)*v{D7jf?~09woBH+%1#xT-<($=(^V@ zj0o|a7LPQ^;G|un)P^U28s4Mm?Dt7!i%$mLgCR22dRttMLL986g7vKy&3f=*S-sg7 zb!UvWix!M}io-Hi{YQobYEun$uQ~G@WHq=n7B3pL>*?04PCseW)zxsue5IMT>pt-* zbh@Q0=Az)mrVQ3oQp|fZ5D&fF=(_N2#y&hb-tzLnFug*A#H4XK-Pz;gzOT^Ejlj@b zaM(fn%p7q>G*{{yu7;4)rgsuuY zi%JC!#CPuNH_80bi*U#Y)S<>~Z<=oWwn(+uNOo zd#x@BezVWB_t?bKOCCFDNx1Bjw#&Dgjl=mj0duE5vdjD!3>rmSQ>f-_&Z^eMw{+ny zjpd6I!Ieb=8IEOBNeZQ83Uz9 zN@q4ja-_c&6UM*zi)h^Nta_I8An;yx>0`TyVs6WaX!AlX#+DRoBR|9sj289gjbe;X z=`>iV@KJTDOKJIT@CzytPp+KB#5XJ1x_DG<>Dk?V>`~d)5qS)@b%mYW${D$Gw0Tiu z+t4rRb8X`B%hjFB59{>{wDJ&qO?@6AtRn%=+V6GHW1}PFPMG5XB{t8MR%Dpn8!#B^0&BDDt#=L8&_>MHCz>nMINwRHjY#`qoQ(aCgxEelL1g1q z9Z)rboe04V{Lt6)_BMYND`%1h-7;}M*CdHHdBk&_jDP(;BI9-IFaw+=)BIgsNM0)1 zJd@>~_pKh%y{hZ`W`7#?5p)B~7WB99=fotA%66(bz2CQR!;5K}S~?OSPqU2X}nRProEh2gjzKMU`Z+TVtY{FHGD z{*zfp0UpC8aw7i2#)(}5bIa-YMsniW(od!5ti??WZQjp4;$}5o!anDC=vDCe0pm5J zFKPj$PWN;Zohh0F#mWX94*?IbU6hQNg=@O-k2%fc$x#$#httC^ zAB@mvm{G@`n)`Z#r)lAI(9mF2 zo%;Vaf_rHFJNM81#D@fx)7pTKOPH)9mHo#dn37=ef|@?p?gw@2TK?Kn$F1UcU^;v4cFx!QOkL&P zY*jB1`0GyId;0{%+%qgt@tcu3K#oXaX;nUHYhOj?)i_!qf|~|;*OqOJAU#KV?;pEY zr2p*lFm@GW{qFYh`M`BzC&zgUk;J%R@iwHn3|7v~4sckq|Be0T@NHvDSgWSn@Vlc%W%!GN^7dK-jVAxY&xKOiHK(0x@wJ{LRva^Yl%AI~nxUNT(?` zHMgSs3+Qw;^^`i&3DP-ESi^M>fLsrai~4d$o0;s8a@ywYR;yU(Z>Xe{n7<74N&}@V z{sQV%dg-PPyZ>70qbs|zmXx2k0#r{-%1>Vj`Cp}iE|&*M5mD~=n+u))+;x{wS?DaD zj?XPER=MQ-+XB8=?NX7dC*lJ8(yscDLcCPny`D#gdY0=($`K0#p7$;~k z+Wx=K{#$KITHWUe;{`1M3RFk~?kG+r_-lt?GaUI!Q#+-()Ppx) z_D;#fP`d6aPJ*J=gk1>lP;mSyO0S&u#_&vjmQyggVIeCOJIvJnFGKLE$LD>ufhi}W zVBAnduzH%41|tN~)zTwmKeE~rWwY^j9OhEJc&pv3%KKEGp!^zoX$el;xpu?7yhux}cJ*=chl;87+igNAeQEnAeM`}4TCoDdem_2>|uJ4 zbWv}lRdNt)4QN!zz(%k1U&r@*6oK?1>#s097nj6sy2qp`1CfU30Rc1hW=^KJjSs?Kb`wT=jrI=*W)K>(Z%*pBy`I} zgKe*qrHYSM{V}9qPcN!2t}OwhH4NWy?IlYWGEAM|N8j>T(P^9dHlbhL^hzCb0pz0c zuRBHEh>CQn!oPL^tl)75;{*tIFAa?RAzj9}w~3C8*s`-5^BV5;5( 
zNlR0zf}b^eZG>myF`83A@A-uwR8guPXbrt)X4G}f`Tn`ph)@_HYA8VI?D$hkS_~5h zNd0OC3!GBf}9&lEl4Yd!pN=0cEV94{`g~B-hq35a7 z?cLc*^BrD+&uDA0g!Eg{>msDwHbCaHgk%Tw*=}~Pn@DV%6N5Q>J#~<5fH}P6_xq0H zNZIvhg$s)DbLJeIgRf&wj@*W&J_RCWQNJ2VB8v_fu7wq}Nr8j!2h80t#-NVKS z8FuN+h!wt4WLm4y*2(80Qws2)BEu@3F5M{;ey3ROf*tB9%0)n0az|_f_WVQF-HV<4 z*Mv@g`wAHQ^aH{MbKUP%fE4R?Pt3oOWyTrxCT(U-DV_hTb}s9+QB#ag2j1?v*H`o^ z(dJC~X+AUjrHqJ#R5 ze%68M)IZuDZx*v%$5{#MU)!N~EMJw&M<6H2G#6JS_I^4IhiA7=Y+I*C;&LKcJ6=d@ z$w+DPvKw4vi`6!j=u#vQC~_!04*iksW+=V!_0BfGmUMV8Pj*9$BpPc7$ZNfd>`Lm6 z;oFrD41JlMK}ZSlE0Zk6nK~(K_TFUMSer(sZs}J${z?6==z<|iSEI`2Ppn(T4~3Vb zv`6qih-FBc82)%Xa?eZ|X|Oqe51}-3E6Jy1H?viVmCuCU7O)H|7}M|cdb=(p04OM& z@RL}!AVVXGjiH}bbLB}P?w_!_?&ug|P=3Th$v|=&{1!mnyR)nN`egiC-Bt_wlD#fw z$uSzgM0*Ig5B~C%W*6lcF;Y!a5p^a)Yi9hsn-|X$2r{a0<}=%LqP~piH_uPdPxBl7 z{K_(E{*?L7j+$eu;4nr{8oMJZOy}F~hdjZs_7CJ_XTPwM9cG|CS&pEQ&en-PvrC8Q zkPgRi!7cO288S~X>z26b1`vbUSvJ)|*_o_~Odr5Ge6&VC z5I_cRxk<0|Ht!@$7}<@dsp()xdi#eOy3YT|mTYnzl1)Zq47|k$yzx&bkp0OMbIoWoJNc&9 z5lz&uMWFAgN8<@%GKRk1ANE^_2F-0#a2@>Up>PbSy>)ap72``MEPhkZc2@6j3TvW9 z%%^E|Fv8G;<ATAA}1dTO%UOBSUVo>>`W~hCkn`kgmhM#b6+FC;Jy6{dy>m} zWcXghAJ&P;TdYQGWed*m(0g)u={X3M3w@q=Ayy##n&;>pYkrvMk>*Is1yTO9Yb%-~ zo!xD{>76G<(kb-YO9!4sHbSNHslzfvTSfo$R#5fg{dKdGltx%1=|00O^qPC@%d=4@ zgQejriPC)2Q>o2CPs(NBZwr5pq<3wgn|!@0S=)f+yjAM=8XN`?FW{zjqtqWpWuYCb!HO_xo^FG%=0b9j*8C zeKkr31>-Rvsb!Oy3?=OB|45f9O5ueIes)+XJD4u9Gpb@0fXiF-45RqQmDSNREglli`jx! 
-filter(is.na(DoNotUse)) # your code here: exclude subjects that are marked as "DoNotUse"
-filtered_d = filtered_d %>%
-select(c("Subject", "Cond"), # Generally important columns for both hypotheses
-contains("Game"), # we want all the game columns for hypothesis 1
--contains("Intro"),
-c("WhichGames", "GameComments"), # except these
-starts_with("DinerDashWith"), c("SOFMusicEnemies", "SOFNoMusicEnemies")) # These columns are for hypothesis 2
-rating_hyp_d = filtered_d %>%
-filter(is.na(DoNotUseVideoGamePerformanceData)) %>% # first, let's get rid of the subjects who did so poorly on one game that their data is unusable
-select(-DoNotUseVideoGamePerformanceData, # now get rid of that column
--starts_with("DinerDash"), # and the other columns we don't need
--starts_with("SOF"))
-performance_hyp_d = filtered_d %>%
-select(-contains("Game")) # your code here: remove the columns containing "Game" in the name
-tiny_demo_d = head(performance_hyp_d, 2) # get just the first two subjects performance data, for a demo
-tiny_demo_d
-tiny_demo_d %>% gather(Measurement, Value,
--c("Subject", "Cond")) # this tells it to gather all columns *except* these ones.
-performance_hyp_long_d = performance_hyp_d %>%
-gather(Measurement, Score, -c("Subject", "Cond"))
-rating_hyp_long_d = rating_hyp_d %>%
-gather(Measurement, Rating, -c("Subject", "Cond"))
-performance_hyp_long_d = performance_hyp_long_d %>%
-mutate(ConfrontationalGame = grepl("SOF", Measurement), # create a new variable that will say whether the measurement was of the game soldier of fortune (SOF).
-WithMusic = !grepl("NoMusic|WithoutMusic", Measurement), # creates a new column named WithMusic, which is False if the measurement contains *either* "NoMusic" or "WithoutMusic" -MusicCondition = factor(ifelse(Cond > 3, Cond-3, Cond), levels = 1:3, labels = c("Anger", "Exciting", "Neutral"))) # Get rid of uninterpretable condition labels -rating_hyp_long_d = rating_hyp_long_d %>% -mutate( -IsRecall = grepl("Friends|Strangers", Measurement) +colnames(diamonds) +head(diamonds) +qplot(diamonds$carat, diamonds$price) +colnames(diamonds) +diamonds %>% +qplot(carat, price) +diamonds %>% +ggplot(carat, price) +diamonds %>% +ggplot("carat", "price") +diamonds %>% +qplot("carat", "price") +data <- diamonds +data <- diamonds +data %>% +qplot("carat", "price") +library(tidyverse) +head(diamonds) +qplot(diamonds$carat, diamonds$price) +qplot(diamonds$price) +qplot(data = diamonds, price) +qplot(data = diamonds, carat, price) +# Piping does not work for qplot, because the first argument of qplot is NOT data!! The first argument of qplot is "x". If you use ggplot, then you can pipe, but then it has different syntax and has aes. 
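The note above is worth pinning down with a runnable sketch, using ggplot2's built-in `diamonds` data: `qplot()` takes `x` as its first argument, so a piped data frame lands in the wrong slot, whereas `ggplot()` takes `data` first and pipes cleanly.

```r
library(tidyverse)

# qplot: the data frame must be passed by name, because the
# first positional argument is x, not data
qplot(data = diamonds, x = carat, y = price)

# ggplot: the first argument IS data, so the pipe works,
# and aesthetics go inside aes()
diamonds %>%
  ggplot(aes(x = carat, y = price)) +
  geom_point()
```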
+qplot(data = diamonds, +x = carat, +y = price, +shape = clarity, +color = cut) +d <- ggplot(diamonds, aes(x=carat, y=price)) # first you set the aesthetic and dataset +d + geom_point() # then you add geoms +d + geom_point(aes(colour = carat)) # and you can keep doing this to add layers to the plot +d <- ggplot(diamonds, aes(x = carat, y = price)) # first you set the aesthetic and dataset +d + geom_point() # then you add geoms +d + geom_point(aes(colour = carat)) # and you can keep doing this to add layers to the plot +#qplot(data = diamonds, +# x = carat, +# y = price, +# shape = clarity, +# color = cut) +ggplot(data = diamonds +mapping = aes(x = carat, y = price) +#qplot(data = diamonds, +# x = carat, +# y = price, +# shape = clarity, +# color = cut) +ggplot(data = diamonds, +mapping = aes(x = carat, y = price) ) -rating_hyp_long_d = rating_hyp_long_d %>% -mutate( -GameNumber = as.numeric(substr(rating_hyp_long_d$Measurement, 5, 5)), -ConfrontationalGame = GameNumber <= 2, # in a mutate, we can use a column we created (or changed) right away. Games 1 and 2 are confrontational, games 3 and 4 are not. 
-Emotion = str_extract(Measurement, "Angry|Neutral|Excited|Exciting|Calm"), -Emotion = ifelse(Emotion == "Excited", "Exciting", # this just gets rid of some annoying labeling choices -ifelse(Emotion == "Calm", "Neutral", Emotion)) +#qplot(data = diamonds, +# x = carat, +# y = price, +# shape = clarity, +# color = cut) +ggplot(data = diamonds, +mapping = aes(x = "carat", y = "price") ) -performance_hyp_long_d %>% -group_by(ConfrontationalGame) %>% -summarize(AvgScore = mean(Score, na.rm=T)) # the na.rm tells R to ignore NA values -performance_hyp_long_d = performance_hyp_long_d %>% -group_by(ConfrontationalGame, WithMusic) %>% # we're going to compute four sets of z-scores, one for the confrontational game without music, one for the confrontational game with, one for the nonconfrontational game without music, and one for the nonconfrontational game with -mutate(z_scored_performance = scale(Score)) %>% -ungroup() -rating_summary_d = rating_hyp_long_d %>% -group_by(ConfrontationalGame, Emotion) %>% -summarize(MeanRating = mean(Rating, na.rm=T)) -rating_summary_d -ggplot(rating_summary_d, aes(x=ConfrontationalGame, y= MeanRating, fill=Emotion)) + -geom_bar(position="dodge", stat="identity") + -scale_fill_brewer(palette="Set1") -model = lm(Rating ~ ConfrontationalGame * Emotion, rating_hyp_long_d) -summary(model) -performance_diff_d = performance_hyp_long_d %>% -mutate(WithMusic = factor(WithMusic, levels=c(F, T), labels=c("PreMusic", "PostMusic"))) %>% # first, tweak the variable so our code is easier to read. -select(-c("Score", "Measurement")) %>% # now we remove columns we don't need (bonus: leave them in and see if you can understand what goes wrong!) 
-spread(WithMusic, z_scored_performance) %>% -mutate(ImprovementScore=PostMusic-PreMusic) -performance_diff_d -performance_diff_summary_d = performance_diff_d %>% -group_by(ConfrontationalGame, MusicCondition) %>% -summarize(MeanImprovementScore = mean(ImprovementScore, na.rm=T)) -performance_diff_summary_d -ggplot(performance_diff_summary_d, aes(x=ConfrontationalGame, y=MeanImprovementScore, fill=MusicCondition)) + -geom_bar(position="dodge", stat="identity") + -scale_fill_brewer(palette="Set1") -performance_model = lm(ImprovementScore ~ ConfrontationalGame * MusicCondition, performance_diff_d) -summary(performance_model) -library(foreign) # for reading spss formatted data -library(tidyr) -library(dplyr) -library(stringr) # useful for some string manipulation -library(ggplot2) -d = read.spss("Tamiretal2008ReplicationData.sav", to.data.frame=T) -head(d) -colnames(d) -# First, I'll make a vector with all the likert values in d$Game1Angry1, which is the 7th column of the dataset -vector_of_Game1Angry1 <- d[,7] -length(vector_of_Game1Angry1) -print(vector_of_Game1Angry1) -# Now I'll try the histogram function on just one column, d$Game1Angry1. -unique(vector_of_Game1Angry1) -range(vector_of_Game1Angry1) -hist(vector_of_Game1Angry1) -# I learn that the likert scores range from 1 to 7 for that SINGLE column. -# Now, I'll do the same process, but on all the columns with likert ratings, not just one column. -# I can tell R to select all the likert values between the 7th column of the dataset and the 70th column of the dataset, and to make a single vector using the c() function. -vector_of_all_likert_scores <- c(d[,7:70], recursive = TRUE, use.names = FALSE) -length(vector_of_all_likert_scores) -# Now I'll try the hist functions on all the likert scores. 
-# range(vector_of_all_likert_scores)
-# unique(vector_of_all_likert_scores)
-hist(vector_of_all_likert_scores)
-tail(d)
+#qplot(data = diamonds,
+# x = carat,
+# y = price,
+# shape = clarity,
+# color = cut)
+ggplot(data = diamonds,
+mapping = aes(x = carat, y = price) )
+#qplot(data = diamonds,
+# x = carat,
+# y = price,
+# shape = clarity,
+# color = cut)
+ggplot(data = diamonds,
+mapping = aes(x = carat, y = price)
++ geom_point() )
-install.packages("devtools")
-devtools::install_github("TomHardwicke/ReproReports")
-library("ReproReports")
-devtools::install_github("TomHardwicke/ReproReports")
-library("ReproReports")
-install.packages("devtools")
-devtools::install_github("TomHardwicke/ReproReports")
-library("ReproReports")
-install.packages("devtools")
-devtools::install_github("TomHardwicke/ReproReports")
-force = TRUE
-TRUE
-"force = TRUE"
-force = TRUE
-devtools::install_github("TomHardwicke/ReproReports")
-devtools::install_github("TomHardwicke/ReproReports"), force = TRUE
-devtools::install_github("TomHardwicke/ReproReports") force = TRUE
-devtools::install_github("TomHardwicke/ReproReports") force = TRUE
-devtools::install_github("TomHardwicke/ReproReports")
-force = TRUE
-library("ReproReports") -install.packages("devtools") -devtools::install_github("TomHardwicke/ReproReports") -devtools::install_github("TomHardwicke/ReproReports", force = TRUE) -library("ReproReports") -library("ReproReports") -install.packages(c("backports", "covr", "curl", "digest", "htmltools", "htmlwidgets", "later", "markdown", "pkgbuild", "pkgconfig", "promises", "rmarkdown", "shiny", "tinytex", "xfun")) -library("ReproReports") -library("devtools") -devtools::install_github("TomHardwicke/ReproReports") -library(ReproReports) -ReproReports::reproCheck() -session_info() -articleID <- "6-1-2015" # insert the article ID code here e.g., "10-3-2015" -reportType <- "Pilot" # specify whether this is the 'pilot' report or 'copilot' report -pilotNames <- "Cristina Ceballos" # insert the pilot's name here e.g., "Tom Hardwicke". -copilotNames <- "Insub Kim" # # insert the co-pilot's name here e.g., "Michael Frank". -pilotTTC <- 120 # insert the pilot's estimated time to complete (in minutes, it is fine to approximate) e.g., 120 -copilotTTC <- NA # insert the co-pilot's estimated time to complete (in minutes, it is fine to approximate) e.g., 120 -pilotStartDate <- "11/3/2019" # insert the piloting start date in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") -copilotStartDate <- NA # insert the co-piloting start date in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") -completionDate <- NA # insert the date of final report completion in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") -# sets up some formatting options for the R Markdown document -knitr::opts_chunk$set(echo=TRUE, warning=FALSE, message=FALSE) -# load packages -library(tidyverse) # for data munging -library(knitr) # for kable table formating -library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files -library(readxl) # import excel files -library(ReproReports) # custom reporting functions -library(foreign) -# Prepare report object. 
This will be updated automatically by the reproCheck function each time values are compared -reportObject <- data.frame(dummyRow = TRUE, reportedValue = NA, obtainedValue = NA, valueType = NA, percentageError = NA, comparisonOutcome = NA, eyeballCheck = NA) -data <- read_excel("GroupA_6-1-2015/data/data_Exp1.xlsx", sheet = "summary") -head(data) -colnames(data) -articleID <- "6-1-2015" # insert the article ID code here e.g., "10-3-2015" -reportType <- "copilot" # specify whether this is the 'pilot' report or 'copilot' report -pilotNames <- "Cristina Ceballos" # insert the pilot's name here e.g., "Tom Hardwicke". -copilotNames <- "Insub Kim" # # insert the co-pilot's name here e.g., "Michael Frank". -pilotTTC <- 520 # insert the pilot's estimated time to complete (in minutes, it is fine to approximate) e.g., 120 -copilotTTC <- 180 # insert the co-pilot's estimated time to complete (in minutes, it is fine to approximate) e.g., 120 -pilotStartDate <- "11/3/2019" # insert the piloting start date in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") -copilotStartDate <- "11/10/2019" # insert the co-piloting start date in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") -completionDate <- "11/10/2019" # insert the date of final report completion in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") -# sets up some formatting options for the R Markdown document -knitr::opts_chunk$set(echo=TRUE, warning=FALSE, message=FALSE) -# load packages -library(tidyverse) # for data munging -library(knitr) # for kable table formating -library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files -library(readxl) # import excel files -library(ReproReports) # custom reporting functions -library(foreign) -# Prepare report object. 
This will be updated automatically by the reproCheck function each time values are compared -reportObject <- data.frame(dummyRow = TRUE, reportedValue = NA, obtainedValue = NA, valueType = NA, percentageError = NA, comparisonOutcome = NA, eyeballCheck = NA) -data <- read_excel("data/data_Exp1.xlsx", sheet = "summary") -articleID <- "6-1-2015" # insert the article ID code here e.g., "10-3-2015" -reportType <- "Pilot" # specify whether this is the 'pilot' report or 'copilot' report -pilotNames <- "Cristina Ceballos" # insert the pilot's name here e.g., "Tom Hardwicke". -copilotNames <- "Insub Kim" # # insert the co-pilot's name here e.g., "Michael Frank". -pilotTTC <- 520 # insert the pilot's estimated time to complete (in minutes, it is fine to approximate) e.g., 120 -copilotTTC <- 180 # insert the co-pilot's estimated time to complete (in minutes, it is fine to approximate) e.g., 120 -pilotStartDate <- "11/3/2019" # insert the piloting start date in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") -copilotStartDate <- "11/10/2019" # insert the co-piloting start date in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") -completionDate <- "11/10/2019" # insert the date of final report completion in US format e.g., as.Date("01/25/18", format = "%m/%d/%y") -# sets up some formatting options for the R Markdown document -knitr::opts_chunk$set(echo=TRUE, warning=FALSE, message=FALSE) -# load packages -library(tidyverse) # for data munging -library(knitr) # for kable table formating -library(haven) # import and export 'SPSS', 'Stata' and 'SAS' Files -library(readxl) # import excel files -library(ReproReports) # custom reporting functions -library(foreign) -# Prepare report object. 
This will be updated automatically by the reproCheck function each time values are compared
-reportObject <- data.frame(dummyRow = TRUE, reportedValue = NA, obtainedValue = NA, valueType = NA, percentageError = NA, comparisonOutcome = NA, eyeballCheck = NA)
-data <- read_excel("GroupA_6-1-2015/data/data_Exp1.xlsx", sheet = "summary")
-head(data)
-# Right now, the data is arranged in wide format. I want to make it longer, so I will use pivot_longer to make it tidy.
-# But first, I have to manually modify the columns and assign each column a column name. To do so, I created a new Excel file called "data_Exp1_modified". In this Excel file, I manually renamed each column. For example, the first column was called "closedloop_grasping_uncrowded_3cm". The other columns followed the same naming rules.
-# In the Excel file, I also deleted empty columns. I also removed the two top columns.
-# I use the read_excel function to read the data.
-data_modified <- read_excel("GroupA_6-1-2015/data/data_Exp1_modified.xlsx", sheet = "summary")
-# Now I use the functions "separate" and "pivot_longer" to make my data tidy.
-pivoted_data <- pivot_longer(data_modified, closedloop_grasping_uncrowded_3cm:openloop_estimation_crowded_3.75cm, names_to = "column", values_to="estimate")
-tidy_data <- separate(pivoted_data, column, into = c("open_closed","grasping_estimating","crowded_uncrowded","cm"), sep = "_")
-colnames(tidy_data)
-# Now my data is tidy!
-# To run the ANOVA, I should get rid of the "means" information in the tidy_data. To do that, I simply selected the first 160 rows of the dataset, and omitted the bottom rows 161-176, since those are the rows that have the means.
-ANOVA_tidy_data <- tidy_data[1:160,]
-# "ANOVA_tidy_data" is the dataset I will use for the ANOVA analysis. It excludes the "means" rows.
-# Next, I created a dataset that has ONLY the "means" information, since that is the information used in Figure 3.
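The pivot_longer-then-separate recipe recorded in this history generalizes; here is a minimal sketch on a hypothetical two-column frame that follows the same `loop_task_crowding_size` naming scheme (the column names and values are made up for illustration):

```r
library(tidyverse)

# Hypothetical wide data standing in for data_Exp1_modified
wide <- tibble(
  closedloop_grasping_uncrowded_3cm  = c(3.1, 2.9),
  openloop_estimation_crowded_3.75cm = c(3.8, 4.0)
)

# One row per measurement, then split the packed column name
# into its four design factors on the "_" separator
wide %>%
  pivot_longer(everything(), names_to = "column", values_to = "estimate") %>%
  separate(column,
           into = c("open_closed", "grasping_estimating",
                    "crowded_uncrowded", "cm"),
           sep = "_")
```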
-Fig3_tidy_data <- tidy_data[161:176,]
-# "Fig3_tidy_data" is the dataset I will use to try to recreate Fig3 from the paper. It has only the "means" rows.
-# I used the aov function to
-ANOVA_results <- aov(estimate ~ grasping_estimating * crowded_uncrowded * open_closed * cm, ANOVA_tidy_data)
-summary(ANOVA_results)
-colnames(Fig3_tidy_data)
-# plot("cm", Fig3_tidy_data[10:16], Fig3_tidy_data, "l")
-# commented out because I could not get the code to work
-# Reprocheck for the four-way ANOVA F value:
-reproCheck("6.818", "2.662", valueType = "F")
-# Reprocheck for the four-way ANOVA p value:
-reproCheck("0.28", "0.105", valueType = "p")
-reportObject <- reportObject %>%
-filter(dummyRow == FALSE) %>% # remove the dummy row
-select(-dummyRow) %>% # remove dummy row designation
-mutate(articleID = articleID) %>% # add variables to report
-select(articleID, everything()) # make articleID first column
-# decide on final outcome
-if(any(reportObject$comparisonOutcome %in% c("MAJOR_ERROR", "DECISION_ERROR"))){
-finalOutcome <- "Failure"
-}else{
-finalOutcome <- "Success"
-}
-# collate report extra details
-reportExtras <- data.frame(articleID, pilotNames, copilotNames, pilotTTC, copilotTTC, pilotStartDate, copilotStartDate, completionDate, finalOutcome)
-# save report objects
-if(reportType == "pilot"){
-write_csv(reportObject, "pilotReportDetailed.csv")
-write_csv(reportExtras, "pilotReportExtras.csv")
-}
-if(reportType == "copilot"){
-write_csv(reportObject, "copilotReportDetailed.csv")
-write_csv(reportExtras, "copilotReportExtras.csv")
-}
-devtools::session_info()
-setwd("C:/Users/C.
Ceballos/OneDrive - Leland Stanford Junior University/Year 4 2019-2020/PSYCH251_HWs/problem_sets") +d <- ggplot(diamonds, aes(x = carat, y = price)) # first you set the aesthetic and dataset +d + geom_point() # then you add geoms +# d + geom_point(aes(colour = carat)) # and you can keep doing this to add layers to the plot +#qplot(data = diamonds, +# x = carat, +# y = price, +# shape = clarity, +# color = cut) +d <- ggplot(diamonds, aes(x = carat, y = price)) +d + geom_point() +# ggplot(data = diamonds, +# mapping = aes(x = carat, y = price) +# + geom_point() +) +#qplot(data = diamonds, +# x = carat, +# y = price, +# shape = clarity, +# color = cut) +ggplot(diamonds, +aes(x = carat, y = price, +shape = clarity, +color = cut)) + +geom_point() +# d + geom_point() +# ggplot(data = diamonds, +# mapping = aes(x = carat, y = price) +# + geom_point() +) +ggplot(diamonds, +aes(x = carat, y = price, +shape = clarity, +color = cut)) + +geom_point() + +facet_grid(clarity ~ cut) +ggplot(diamonds, +aes(x = carat, y = price, +color = cut)) + +geom_point() + +facet_grid(clarity ~ cut) +ggplot(diamonds, aes(x=carat, y=price)) + +geom_point(shape = ".") + +facet_grid(cut ~ clarity) +ggplot(diamonds, aes(x=carat, y=price)) + +geom_point(shape = ".") + +facet_grid(cut ~ clarity) + +geom_smooth() +ggplot(diamonds, aes(x=carat, y=price)) + +geom_point(shape = ".") + +facet_grid(cut ~ clarity) + +# geom_smooth() +ggplot(diamonds, aes(x=carat, y=price)) + +geom_point(shape = ".") + +facet_grid(cut ~ clarity) # + +# geom_smooth() +ggplot(diamonds, aes(x=carat, y=price)) + +geom_point(shape = ".") + +facet_grid(cut ~ clarity) + +geom_smooth() +ggplot(diamonds, aes(x=carat, y=price)) + +geom_histogram(binwidth = "5") + +facet_grid(cut ~ clarity) +ggplot(diamonds, aes(x=carat)) + +geom_histogram(binwidth = "5") + +facet_grid(cut ~ clarity) +ggplot(diamonds, aes(x=carat)) + +geom_histogram(binwidth = 5) + +facet_grid(cut ~ clarity) +ggplot(diamonds, aes(x=carat)) + +geom_histogram(binwidth = 
5) + +facet_grid(cut ~ clarity) + +geom_density() +ggplot(diamonds, aes(x=carat)) + +geom_histogram(binwidth = 5) + +# facet_grid(cut ~ clarity) + +geom_density() +ggplot(diamonds, aes(x=carat)) + +geom_histogram(binwidth = 5) + +# facet_grid(cut ~ clarity) + +# geom_density() +ggplot(diamonds, aes(x=carat)) + +geom_histogram(binwidth = 5) # + +# facet_grid(cut ~ clarity) + +# geom_density() +ggplot(diamonds, aes(x=carat)) + +geom_histogram(binwidth = 5) + +facet_grid(cut ~ clarity) + +geom_density() +ggplot(diamonds, aes(x=carat)) + +geom_histogram(binwidth = 5) + +facet_grid(cut ~ clarity) + +geom_density() +ggplot(diamonds, aes(x=carat)) + +geom_histogram(binwidth = 5) + +facet_grid(cut ~ clarity) + +geom_density() +ggplot(diamonds, aes(x=carat, y=price)) + +geom_point(shape = ".") + +facet_grid(cut ~ clarity) + +geom_smooth() +library(tidyverse) +# two helper functions +sem <- function(x) {sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))} +ci95 <- function(x) {sem(x) * 1.96} # lazy normal approximation +d <- read_csv("data/Ontogeny_fairness_seven_societies_data.csv", +na = c("NA", ".")) # they use . to indicate NA +facVars <- c("eq.uneq", "value", "decision") +d[, facVars] <- lapply(d[, facVars], factor) +d$trial.number <- as.numeric(gsub(".(\\d+)", "\\1", d$trial)) +facVars <- c("eq.uneq", "value", "decision") +d[, facVars] <- lapply(d[, facVars], factor) +d$trial.number <- as.numeric(gsub(".(\\d+)", "\\1", d$trial)) +d$trial_type <- factor(d$eq.uneq, +levels = c("E","U"), +labels = c("Equal","Unequal")) +d$condition <- factor(d$condition, +levels = c("AI","DI"), +labels = c("Advantageous","Disadvantageous")) +library(tidyverse) +# two helper functions +sem <- function(x) {sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))} +ci95 <- function(x) {sem(x) * 1.96} # lazy normal approximation +d <- read_csv("data/Ontogeny_fairness_seven_societies_data.csv", +na = c("NA", ".")) # they use . 
to indicate NA +facVars <- c("eq.uneq", "value", "decision") +d[, facVars] <- lapply(d[, facVars], factor) +d$trial.number <- as.numeric(gsub(".(\\d+)", "\\1", d$trial)) +d$trial_type <- factor(d$eq.uneq, +levels = c("E","U"), +labels = c("Equal","Unequal")) +d$condition <- factor(d$condition, +levels = c("AI","DI"), +labels = c("Advantageous","Disadvantageous")) +d$trial_type <- factor(d$eq.uneq, +levels = c("E","U"), +labels = c("Equal","Unequal")) +d$condition <- factor(d$condition, +levels = c("AI","DI"), +labels = c("Advantageous","Disadvantageous")) +colnames(d) +View(d) +colnames(d) +chart1 <- d %>% +group_by(country, actor.age.years) %>% +summarize() +chart1 <- d %>% +group_by(country, actor.age.years) %>% +summarize() +chart1 <- d %>% +group_by(country, actor.age.years) %>% +summarize(count()) +chart1 <- d %>% +group_by(country, actor.age.years) %>% +summarize(l = length()) +chart1 <- d %>% +group_by(country) %>% +summarize(number_n = n()) +chart1 <- d %>% +group_by(country) %>% +summarize(number_n = mean()) +colnames(d) +chart1 <- d %>% +group_by(country) %>% +summarize(count_n = count(decision)) +chart1 <- d %>% +group_by(country) %>% +summarize(decision_n = count(decision)) +chart1 <- d %>% +group_by(country) %>% +summarize(decision, Count = n ()) +chart1 <- d %>% +group_by(country) %>% +summarize(decision, Count = n ()) +chart1 <- d %>% +group_by(country) %>% +summarize(Count = n ()) +chart1 <- d %>% +group_by(country) %>% +summarize(count = n ()) +colnames(d) +library(tidyverse) +# two helper functions +sem <- function(x) {sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))} +ci95 <- function(x) {sem(x) * 1.96} # lazy normal approximation +d <- read_csv("data/Ontogeny_fairness_seven_societies_data.csv", +na = c("NA", ".")) # they use . 
to indicate NA +facVars <- c("eq.uneq", "value", "decision") +d[, facVars] <- lapply(d[, facVars], factor) +d$trial.number <- as.numeric(gsub(".(\\d+)", "\\1", d$trial)) +d$trial_type <- factor(d$eq.uneq, +levels = c("E","U"), +labels = c("Equal","Unequal")) +d$condition <- factor(d$condition, +levels = c("AI","DI"), +labels = c("Advantageous","Disadvantageous")) +colnames(d) +chart1 <- d %>% +group_by(country) %>% +summarize(count = n ()) +library(tidyverse) +# two helper functions +sem <- function(x) {sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))} +ci95 <- function(x) {sem(x) * 1.96} # lazy normal approximation +colnames(d) +chart1 <- d %>% +group_by(country) %>% +summarize(count = n ()) +chart1 <- d %>% +group_by(country) %>% +summarize(count = n ()) +d %>% +group_by(country) %>% +summarize(count = n ()) +country +d %>% +group_by(country, decision) %>% +summarize(count = n ()) +colnames(d) +d %>% +group_by(country, decision, eq.uneq) %>% +summarize(count = n ()) +d %>% +group_by(country, eq.uneq, decision) %>% +summarize(count = n ()) +ggplot(d, aes(x = decision)) + +geom_histogram(stat = "count") +colnames(d) +d %>% +group_by(country, eq.uneq, decision) %>% +summarize(count = n ()) +ggplot(d, aes(x = decision)) + +geom_histogram(stat = "count") +ggplot(d, aes(x = decision)) + +geom_histogram(stat = "count") +?facet +ggplot(d, aes(x = decision)) + +geom_histogram(stat = "count") +facet_wrap(country) +ggplot(d, aes(x = decision)) + +geom_histogram(stat = "count") +facet_wrap(~country) +ggplot(d, aes(x = decision)) + +geom_histogram(stat = "count") +facet_wrap(~eq.uneq) +ggplot(d, aes(x = decision)) + +geom_histogram(stat = "count") +facet_wrap(country~eq.uneq) +ggplot(d, aes(x = decision)) + +geom_histogram(stat = "count") + +facet_wrap(country~eq.uneq) +ggplot(d, aes(x = decision)) + +geom_histogram(stat = "count") + +facet_wrap(~country) +colnames(d) +d <- d %>% +filter(eq.uneq != "NA" & decision != "NA") +colnames(d) +d <- d %>% +filter(eq.uneq != "NA" & decision 
!= "NA") +d %>% +group_by(country, eq.uneq, decision) %>% +summarize(count = n ()) +ggplot(d, aes(x = decision)) + +geom_histogram(stat = "count") +ggplot(d, aes(x = decision)) + +geom_histogram(stat = "count") + +facet_wrap(~country) +colnames(d) +colnames(d) +d <- d %>% +filter(eq.uneq != "NA" & decision != "NA") +d %>% +group_by(country, eq.uneq, decision) %>% +summarize(count = n ()) +ggplot(d, aes(x = decision)) + +geom_histogram(stat = "count") +ggplot(d, aes(x = decision)) + +geom_histogram(stat = "count") + +facet_wrap(~country) +d <- d %>% +group_by(country) %>% +mutate() +ggplot(d, aes(x = decision)) + +geom_histogram(stat = "count") + +facet_wrap(~country) +?mutate +d <- d %>% +mutate(mean(decision == accept)) +d <- d %>% +mutate(mean(decision == "accept")) +colnames(d) +ggplot(d, aes(x = decision)) + +geom_histogram(stat = "count") + +facet_wrap(~country) +d <- d %>% +mutate(percent_accept <- mean(decision == "accept")) +colnames(d) +d <- d %>% +mutate(mean(decision == "accept")) +colnames(d) +colnames(d) +d <- d %>% +mutate(foo = mean(decision == "accept")) +d <- d %>% +mutate(percentage_accept = mean(decision == "accept")) +colnames(d) library(tidyverse) +# two helper functions +sem <- function(x) {sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))} +ci95 <- function(x) {sem(x) * 1.96} # lazy normal approximation +d <- read_csv("data/Ontogeny_fairness_seven_societies_data.csv", +na = c("NA", ".")) # they use . 
to indicate NA +facVars <- c("eq.uneq", "value", "decision") +d[, facVars] <- lapply(d[, facVars], factor) +d$trial.number <- as.numeric(gsub(".(\\d+)", "\\1", d$trial)) +d$trial_type <- factor(d$eq.uneq, +levels = c("E","U"), +labels = c("Equal","Unequal")) +d$condition <- factor(d$condition, +levels = c("AI","DI"), +labels = c("Advantageous","Disadvantageous")) +d <- d %>% +filter(eq.uneq != "NA" & decision != "NA") +d %>% +group_by(country, eq.uneq, decision) %>% +summarize(count = n ()) +ggplot(d, aes(x = decision)) + +geom_histogram(stat = "count") +ggplot(d, aes(x = decision)) + +geom_histogram(stat = "count") + +facet_wrap(~country) +d <- d %>% +mutate(percentage_accept = mean(decision == "accept")) +head(d) +head(d) +d <- d %>% +group_by(country) %>% +mutate(percentage_accept = mean(decision == "accept")) +head(d) +tail(d) +ggplot(d, aes(x = percentage_accept)) + +geom_histogram(stat = "count") + +facet_wrap(~country) +colnames(d) +ggplot(d, aes(x = country)) + +geom_histogram(stat = "count") + +facet_wrap(~percentage_accept) +ggplot(d, aes(x = country)) + +geom_histogram(stat = "count") + +facet_wrap(~actor.age.years) +ggplot(d, aes(x = actor.age.years)) + +geom_histogram(stat = "count") + +facet_wrap(~country) +ggplot(d, aes(x = actor.age.years)) + +geom_smooth + +facet_wrap(~country) +ggplot(d, aes(x = actor.age.years)) + +geom_smooth() + +facet_wrap(~country) +ggplot(d, aes(x = actor.age.years, y = percentage_accept)) + +geom_smooth() + +facet_wrap(~country) +d <- d %>% +group_by(country, actor.age.years) %>% +mutate(percentage_accept = mean(decision == "accept")) +ggplot(d, aes(x = actor.age.years, y = percentage_accept)) + +geom_smooth() + +facet_wrap(~country) +d <- d %>% +group_by(country, actor.age.years, eq.uneq) %>% +mutate(percentage_accept = mean(decision == "accept")) +ggplot(d, aes(x = actor.age.years, y = percentage_accept)) + +geom_smooth() + +facet_wrap(~country) +ggplot(d, aes(x = actor.age.years, y = percentage_accept)) + +geom_smooth() 
+ +facet_wrap(eq.uneq ~country) +?geom_smooth +ggplot(d, aes(x = actor.age.years, y = country)) + +geom_line() + +facet_wrap(eq.uneq ~country) +ggplot(d, aes(x = actor.age.years)) + +geom_line() + +facet_wrap(eq.uneq ~country) +ggplot(d, aes(x = actor.age.years, y = percent_accept, color = country)) + +geom_smooth() + +facet_wrap(eq.uneq ~country) library(tidyverse) -averages <- numeric(10000) -for (i in 1:10000) { -X <- rnorm(30, 0, 1) -averages[i] <- mean(X) -} -averages <- numeric(10000) -for (i in 1:10000) { -X <- rnorm(30, 0, 1) -averages[i] <- mean(X) -} -averages -averages <- numeric(10000) -for (i in 1:10000) { -X <- rnorm(30, 0, 1) -averages[i] <- mean(X) -} -averages <- numeric(10000) -for (i in 1:10000) { -X <- rnorm(30, 0, 1) -averages[i] <- (X) -} +# two helper functions +sem <- function(x) {sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))} +ci95 <- function(x) {sem(x) * 1.96} # lazy normal approximation +d <- read_csv("data/Ontogeny_fairness_seven_societies_data.csv", +na = c("NA", ".")) # they use . 
to indicate NA +facVars <- c("eq.uneq", "value", "decision") +d[, facVars] <- lapply(d[, facVars], factor) +d$trial.number <- as.numeric(gsub(".(\\d+)", "\\1", d$trial)) +d$trial_type <- factor(d$eq.uneq, +levels = c("E","U"), +labels = c("Equal","Unequal")) +d$condition <- factor(d$condition, +levels = c("AI","DI"), +labels = c("Advantageous","Disadvantageous")) +d <- d %>% +filter(eq.uneq != "NA" & decision != "NA") +d %>% +group_by(country, eq.uneq, decision) %>% +summarize(count = n ()) +ggplot(d, aes(x = decision)) + +geom_histogram(stat = "count") +ggplot(d, aes(x = decision)) + +geom_histogram(stat = "count") + +facet_wrap(~country) +d <- d %>% +group_by(country, actor.age.years, eq.uneq) %>% +mutate(percentage_accept = mean(decision == "accept")) +head(d) +tail(d) +ggplot(d, aes(x = country)) + +geom_histogram(stat = "count") + +facet_wrap(~actor.age.years) +colnames(d) +ggplot(d, aes(x = actor.age.years, y = percentage_accept)) + +geom_smooth() + +facet_wrap(eq.uneq ~country) +ggplot(d, aes(x = actor.age.years, y = percentage_accept, color = country)) + +geom_smooth() + +facet_wrap(eq.uneq ~country) +ggplot(d, aes(x = actor.age.years, y = percentage_accept, color = country)) + +geom_smooth() + +facet_wrap(~eq.uneq) +d <- d %>% +mutate(percentage_accept = mean(decision == "accept")) +ggplot(d, aes(x = actor.age.years, y = percentage_accept, color = country)) + +geom_smooth() + +facet_wrap(~eq.uneq) +ggplot(d, aes(x = actor.age.years, y = percentage_accept, color = country)) + +geom_smooth() + +facet_wrap(~eq.uneq) diff --git a/Ceballos_ggplot2_exercise.Rmd b/Ceballos_ggplot2_exercise.Rmd new file mode 100644 index 0000000..7ce6175 --- /dev/null +++ b/Ceballos_ggplot2_exercise.Rmd @@ -0,0 +1,182 @@ +--- +title: 'ggplot2 intro exercise' +author: "Cristina Ceballos" +date: "November 13, 2019" +output: html_document +--- + +This is an in-class exercise on using `ggplot2`. + +Note, that this example is from the_grammar.R on [http://had.co.nz/ggplot2](). 
I've adapted this for psych 251 purposes + +First install and load the package. It's part of the "core tidvyerse". + +```{r} +library(tidyverse) +``` + +# Exploring ggplot2 using `qplot` + +We'll start by using `qplot`. `qplot` is the easy interface, meant to replace `plot`. You can give it simple `qplot(x,y)` examples, or slightly more complex examples like `qplot(x, y, col=grp, data=d)`. + +We're going to be using the `diamonds` dataset. This is a set of measurements of diamonds, along with their price etc. + +```{r} +head(diamonds) +qplot(diamonds$carat, diamonds$price) +qplot(diamonds$price) +colnames(diamonds) +``` + +Scatter plots are trivial, and easy to add features to. Modify this plot so that it uses the dataframe rather than working from variables in the general namespace (good to get away from retyping `diamonds$` every time you reference a variable). + +```{r} + +qplot(data = diamonds, carat, price) + + +# Piping does not work for qplot, because the first argument of qplot is NOT data!! The first argument of qplot is "x". If you use ggplot, then you can pipe, but then it has different syntax and has aes. + +``` + +Try adding clarity and cut, using shape and color as your visual variables. + +```{r} + + +qplot(data = diamonds, + x = carat, + y = price, + shape = clarity, + color = cut) + +``` + + +# More complex with `ggplot` + +`ggplot` is just a way of building `qplot` calls up more systematically. It's +sometimes easier to use and sometimes a bit more complicated. What I want to show off here is the functionality of being able to build up complex plots with multiple elements. You can actually do this using `qplot` pretty easily, but there are a few things that are hard to do. + +`ggplot` is the basic call, where you specify A) a dataframe and B) an aesthetic (`aes`) mapping from variables in the plot space to variables in the dataset. 
+
+```{r}
+d <- ggplot(diamonds, aes(x = carat, y = price)) # first you set the aesthetic and dataset
+d + geom_point() # then you add geoms
+# d + geom_point(aes(colour = carat)) # and you can keep doing this to add layers to the plot
+```
+
+Try writing this as a single set of additions (e.g. one line of R code, though you can put in linebreaks). This is the most common workflow for me.
+
+```{r}
+
+```
+
+# Facets
+
+Let's try out another version of the `qplot` example above. Rewrite the last qplot example with ggplot.
+
+```{r}
+
+# qplot(data = diamonds,
+#       x = carat,
+#       y = price,
+#       shape = clarity,
+#       color = cut)
+
+ggplot(diamonds,
+       aes(x = carat, y = price,
+           shape = clarity,
+           color = cut)) +
+  geom_point()
+
+# d + geom_point()
+
+# ggplot(data = diamonds,
+#        mapping = aes(x = carat, y = price)) +
+#   geom_point()
+
+```
+
+One of the primary benefits of `ggplot2` is the use of facets - also known as small multiples in the Tufte vocabulary. That last plot was probably hard to read. Facets could make it better. Try adding `facet_grid(x ~ y)`. `x ~ y` means row facets are by `x`, column facets by `y`.
+
+```{r}
+
+ggplot(diamonds,
+       aes(x = carat, y = price,
+           color = cut)) +
+  geom_point() +
+  facet_grid(clarity ~ cut)
+
+# The facet statement has two variables: clarity and cut. With facet_grid(clarity ~ cut),
+# clarity defines the rows and cut (Fair, Good, Very Good, Premium, Ideal) defines the columns.
+# This gives you a grid of mini-plots.
+
+# Otherwise, you can switch to facet_wrap(), which wraps the panels around into the most
+# pleasing grid of subplots:
+# facet_wrap(~clarity)
+
+# Facets are really useful when you have discrete variables and want to look at an effect
+# in some continuous variable space.
+
+# To export these kinds of plots, you can use ggsave("~/Desktop/Diamonds.pdf", plot = p)
+
+# You can also plot to the plotting window and use Export, but that is less reproducible.
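+
+# A possible way to balance color and faceting (my sketch, not part of the original
+# exercise): wrap on clarity only and keep cut as color, so there are fewer panels to scan.
+ggplot(diamonds, aes(x = carat, y = price, color = cut)) +
+  geom_point(alpha = .1) +
+  facet_wrap(~clarity)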
+
+
+```
+
+But facets can also get overwhelming. Try to strike a good balance between color, shape, and faceting.
+
+HINT: `facet_grid(. ~ x)` puts x on the columns, but `facet_wrap(~ x)` (no dot) *wraps* the facets.
+
+```{r}
+
+```
+
+# Geoms
+
+As you've seen above, the basic unit of a ggplot plot is a "geom" - a mapping between data (via an "aesthetic") and a particular geometric configuration on coordinate axes.
+
+Let's try adding some geoms and manipulating their parameters. One combo I really like is a scatterplot with a smoothing layer (`geom_smooth`). Try adding one onto this plot.
+
+```{r}
+ggplot(diamonds, aes(x = carat, y = price)) +
+  geom_point(shape = ".") +
+  facet_grid(cut ~ clarity) +
+  geom_smooth()
+```
+
+CHALLENGE: You could also try starting with a histogram and adding densities (`geom_density`). Try [this link](https://stackoverflow.com/questions/5688082/ggplot2-overlay-histogram-with-density-curve).
+
+```{r}
+
+# Plot the histogram on the density scale (per the linked answer) so the density curve
+# is visible; a binwidth of .5 suits carat's roughly 0-5 range better than 5.
+ggplot(diamonds, aes(x = carat)) +
+  geom_histogram(aes(y = ..density..), binwidth = .5) +
+  geom_density() +
+  facet_grid(cut ~ clarity)
+
+```
+
+# Themes and plot cleanup
+
+I like a slightly cleaner look to my plots. Luckily, ggplot allows you to add "themes" to your plots. Try doing the same plot but adding `+ theme_bw()` or `+ theme_classic()`. Different themes work better for different applications, in my experience. My favorite right now is `ggthemes::theme_few`.
+
+You can also try different color scales. Experiment with `scale_color_...` - try writing that and hitting TAB for autocomplete options. Check out the help on this.
+
+You can also try transparency/different point sizes to clean up scatterplots. Try `alpha = .1` or `pch = "."` to make transparent or small points.
+
+Finally, don't forget to "fix the axis labels"!
+
+Here's a somewhat ugly plot - see if you can make it look awesome.
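+For instance, combining the tools above (a sketch; `ggthemes::theme_few` would also work if that package is installed, so this sticks to base `ggplot2`):
+
+```{r}
+ggplot(diamonds, aes(x = carat, y = price, col = cut)) +
+  geom_point(alpha = .1, size = .5) +
+  facet_wrap(~clarity) +
+  labs(x = "Carat", y = "Price (USD)", col = "Cut") +
+  theme_classic()
+```
+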
+
+```{r}
+ggplot(diamonds, aes(x = carat, y = price, col = cut)) +
+  geom_point() +
+  facet_wrap(~clarity)
+```
+
diff --git a/blake exercise.Rmd b/blake exercise.Rmd
new file mode 100644
index 0000000..e3a95d3
--- /dev/null
+++ b/blake exercise.Rmd
@@ -0,0 +1,122 @@
+---
+title: 'Blake et al. (2015) exercise'
+author: "Mike Frank"
+date: "November 15, 2019"
+output:
+  html_document:
+    toc: true
+---
+
+# Intro
+
+This is an in-class exercise exploring Blake et al. (2015), [Ontogeny of fairness in seven societies](http://www.nature.com/nature/journal/v528/n7581/full/nature15703.html), *Nature*.
+
+Please explore these data together (without looking at the analyses supplied by the authors).
+
+The overall goal is to understand the degree to which the data support the authors' hypotheses, and to make some more awesome plots along the way.
+
+```{r}
+library(tidyverse)
+
+# two helper functions
+sem <- function(x) {sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))}
+ci95 <- function(x) {sem(x) * 1.96} # lazy normal approximation
+```
+
+# Data Prep
+
+First read in the data, as distributed by the journal.
+
+```{r}
+d <- read_csv("data/Ontogeny_fairness_seven_societies_data.csv",
+              na = c("NA", ".")) # they use . to indicate NA
+```
+
+Do some preprocessing, taken directly from the supplemental material.
+
+```{r}
+facVars <- c("eq.uneq", "value", "decision")
+d[, facVars] <- lapply(d[, facVars], factor)
+d$trial.number <- as.numeric(gsub(".(\\d+)", "\\1", d$trial))
+```
+
+Rename things so that they are easy to deal with. I hate hard to remember abbreviations for condition names.
+
+```{r}
+d$trial_type <- factor(d$eq.uneq,
+                       levels = c("E","U"),
+                       labels = c("Equal","Unequal"))
+d$condition <- factor(d$condition,
+                      levels = c("AI","DI"),
+                      labels = c("Advantageous","Disadvantageous"))
+```
+
+# Variable exploration
+
+Describe the dataset graphically in ways that are useful for you to get a handle on the data collection effort.
+
+Histograms are good. Ages of the participants are useful too.
+
+Remember your `group_by` + `summarise` workflow. This will help you here.
+
+```{r}
+
+# NAs were parsed as real NA values on import, so drop them with is.na()
+# rather than comparing against the string "NA"
+d <- d %>%
+  filter(!is.na(eq.uneq), !is.na(decision))
+
+d %>%
+  group_by(country, eq.uneq, decision) %>%
+  summarize(count = n())
+
+ggplot(d, aes(x = decision)) +
+  geom_bar()
+
+ggplot(d, aes(x = decision)) +
+  geom_bar() +
+  facet_wrap(~country)
+
+# proportion of "accept" decisions within each country x age x trial-type cell
+d <- d %>%
+  group_by(country, actor.age.years, eq.uneq) %>%
+  mutate(percentage_accept = mean(decision == "accept")) %>%
+  ungroup()
+
+head(d)
+tail(d)
+
+ggplot(d, aes(x = country)) +
+  geom_bar() +
+  facet_wrap(~actor.age.years)
+
+```
+
+Make sure you understand what the design was: how many trials per participant, what was between- and within-subject, etc.
+
+# Hypothesis-related exploration
+
+In this section, explore the authors' hypotheses related to advantageous and disadvantageous inequity aversion. Create 1 - 3 pictures that describe the support (or lack of it) for this hypothesis.
+
+```{r}
+
+colnames(d)
+
+ggplot(d, aes(x = actor.age.years, y = percentage_accept)) +
+  geom_smooth() +
+  facet_wrap(eq.uneq ~ country)
+
+```
+
+```{r}
+
+ggplot(d, aes(x = actor.age.years, y = percentage_accept, color = country)) +
+  geom_smooth() +
+  facet_wrap(~eq.uneq)
+
+```
+
diff --git a/blake-exercise.html b/blake-exercise.html
new file mode 100644
index 0000000..2711a64
--- /dev/null
+++ b/blake-exercise.html
@@ -0,0 +1,487 @@

diff --git a/hw4 updated.Rmd b/hw4 updated.Rmd
new file mode 100644
index 0000000..5ac51f0
--- /dev/null
+++ b/hw4 updated.Rmd
@@ -0,0 +1,251 @@
+---
+title: 'Psych 251 PS4: Simulation + Analysis'
+author: "Cristina Ceballos"
+date: "11/15/2019"
+output:
+  pdf_document:
+    toc: yes
+  html_document:
+    toc: yes
+---
+
+This is problem set #4, in which we want you to integrate your knowledge of data wrangling with some basic simulation skills. It's a short problem set to help consolidate your `ggplot2` skills and then help you get your feet wet in testing statistical concepts through "making up data" rather than consulting a textbook or doing math.
+
+For ease of reading, please separate your answers from our text by marking our text with the `>` character (indicating quotes).
+
+# Part 1: ggplot practice
+
+This part is a warmup; it should be relatively straightforward `ggplot2` practice.
+
+Load data from Frank, Vul, Saxe (2011, Infancy), a study in which we measured infants' looking to hands in moving scenes. There were infants from 3 months all the way to about two years, and there were two movie conditions (`Faces_Medium`, in which kids played on a white background, and `Faces_Plus`, in which the backgrounds were more complex and the people in the videos were both kids and adults). An eye-tracker measured children's attention to faces. This version of the dataset only gives two conditions and only shows the amount of looking at hands (other variables were measured as well).
+
+```{r, warning=F, message=F}
+library(tidyverse)
+```
+
+```{r}
+fvs <- read_csv("data/FVS2011-hands.csv")
+
+colnames(fvs)
+head(fvs)
+```
+
+First, use `ggplot` to plot a histogram of the ages of children in the study. NOTE: this is a repeated measures design, so you can't just take a histogram of every measurement.
+
+```{r}
+
+# collapse to one row per child first, so ages aren't double-counted
+subjects <- fvs %>%
+  group_by(subid, age) %>%
+  summarise()
+
+ggplot(subjects, aes(x = age)) + geom_histogram()
+
+```
+
+Second, make a scatter plot showing hand looking as a function of age and condition. Add appropriate smoothing lines. Take the time to fix the axis labels and make the plot look nice.
+
+```{r}
+
+ggplot(fvs, aes(x = age, y = hand.look, col = condition)) +
+  geom_point(shape = ".") +
+  geom_smooth() +
+  labs(x = "Age", y = "Looking at hands", col = "Condition")
+
+```
+
+What do you conclude from this pattern of data?
+
+> Around 15 months, infants start paying more attention to hands.
+
+What statistical analyses would you perform here to quantify these differences?
+
+> ANSWER HERE.
+
+# Part 2: Simulation
+
+```{r, warning=F, message=F}
+library(tidyverse)
+```
+
+Let's start by convincing ourselves that t-tests have the appropriate false positive rate. Run 10,000 t-tests with standard, normally-distributed data from a made up 30-person, single-measurement experiment (the command for sampling from a normal distribution is `rnorm`).
+
+The goal of these t-tests is to determine, based on 30 observations, whether the underlying distribution (in this case a normal distribution with mean 0 and standard deviation 1) has a mean that is different from 0. In reality, the mean is not different from 0 (we sampled it using `rnorm`), but sometimes the 30 observations we get in our experiment will suggest that the mean is higher or lower. In this case, we'll get a "significant" result and incorrectly reject the null hypothesis of mean 0.
+
+What's the proportion of "significant" results ($p < .05$) that you see?
+
+First do this using a `for` loop.
+ +```{r} + +result <- numeric(10000) + +for (i in 1:10000) { + X <- rnorm(30, 0, 1) + result[i] <- t.test(X)$p.value +} + +sig_results <- result < 0.05 + +sum(sig_results)/10000 + + +``` + +Next, do this using the `replicate` function: + +```{r} + +pvals <- replicate(10000, t.test(rnorm(30))$p.value) + +sig_pvals <- pvals < 0.05 + +sum(sig_pvals)/10000 + +``` + +How does this compare to the intended false-positive rate of $\alpha=0.05$? + +> This answer makes sense. I'm getting a false positive rate of about 5 percent or slightly less than 5 percent. That's what I would expect given that I am testing for signifiance of ($p < .05$). + +Ok, that was a bit boring. Let's try something more interesting - let's implement a p-value sniffing simulation, in the style of Simons, Nelson, & Simonsohn (2011). + +Consider this scenario: you have done an experiment, again with 30 participants (one observation each, just for simplicity). The question is whether the true mean is different from 0. You aren't going to check the p-value every trial, but let's say you run 30 - then if the p-value is within the range p < .25 and p > .05, you optionally run 30 more and add those data, then test again. But if the original p value is < .05, you call it a day, and if the original is > .25, you also stop. + +First, write a function that implements this sampling regime. + +```{r} +double.sample <- function () { + + first_sample <- rnorm(30) + + pvals_2 <- (t.test(first_sample)$p.value) + + if(pvals_2<0.05) { + return(pvals_2) + } else if (pvals_2>0.25) { + return(pvals_2) + } else { + second_sample <- c(first_sample, rnorm(30)) + return(t.test(second_sample)$p.value) + } + +} + + +``` + +Now call this function 10k times and find out what happens. + +```{r} + +tenk_results <- replicate(10000, double.sample()) + +sum(tenk_results < 0.05)/10000 + + +``` + +Is there an inflation of false positives? How bad is it? + +> Yes, there's an inflation of false positives. 
I'm getting about 7 percent false positives using the double-sample technique. This is the p-hacking pattern Mike talked about in class. While I may falsely believe that I am holding my false positive rate below 0.05, the double-sampling rule is actually giving me a higher one.
+
+Now modify this code so that you can investigate this "double the sample" rule in a bit more depth. In the previous question, the researcher doubles the sample only when they think they got "close" to a significant result, i.e. when their not-significant p is less than 0.25. What if the researcher was more optimistic? See what happens in these 3 other scenarios:
+
+* The researcher doubles the sample whenever their pvalue is not significant, but it's less than 0.5.
+* The researcher doubles the sample whenever their pvalue is not significant, but it's less than 0.75.
+* The researcher doubles their sample whenever they get ANY pvalue that is not significant.
+
+How do these choices affect the false positive rate?
+
+HINT: Try to do this by making the function `double.sample` take the upper p value as an argument, so that you can pass this through dplyr.
+
+HINT 2: You may need more samples. Find out by looking at how the results change from run to run.
+
+```{r}
+
+# The researcher doubles the sample whenever their pvalue is not significant, but it's less than 0.5.
+
+double.sample0.5 <- function () {
+  
+  first_sample_0.5 <- rnorm(30)
+
+  pvals_2_0.5 <- t.test(first_sample_0.5)$p.value
+
+  if (pvals_2_0.5 < 0.05) {
+      return(pvals_2_0.5)
+  } else if (pvals_2_0.5 > 0.5) {
+      return(pvals_2_0.5) 
+  } else {
+      second_sample_0.5 <- c(first_sample_0.5, rnorm(30))
+      return(t.test(second_sample_0.5)$p.value)
+  }
+
+}
+
+tenk_results_0.5 <- replicate(10000, double.sample0.5())
+
+sum(tenk_results_0.5 < 0.05)/10000
+
+
+# The researcher doubles the sample whenever their pvalue is not significant, but it's less than 0.75. 

+
+double.sample0.75 <- function () {
+  
+  first_sample_0.75 <- rnorm(30)
+
+  pvals_2_0.75 <- t.test(first_sample_0.75)$p.value
+
+  if (pvals_2_0.75 < 0.05) {
+      return(pvals_2_0.75)
+  } else if (pvals_2_0.75 > 0.75) {
+      return(pvals_2_0.75) 
+  } else {
+      second_sample_0.75 <- c(first_sample_0.75, rnorm(30))
+      return(t.test(second_sample_0.75)$p.value)
+  }
+
+}
+
+tenk_results_0.75 <- replicate(10000, double.sample0.75())
+
+sum(tenk_results_0.75 < 0.05)/10000
+
+
+# The researcher doubles the sample whenever their pvalue is not significant.
+
+double.sample_whenever <- function () {
+  
+  first_sample_whenever <- rnorm(30)
+
+  pvals_2_whenever <- t.test(first_sample_whenever)$p.value
+
+  if (pvals_2_whenever < 0.05) {
+      return(pvals_2_whenever)
+  } else {
+      second_sample_whenever <- c(first_sample_whenever, rnorm(30))
+      return(t.test(second_sample_whenever)$p.value)
+  }
+
+}
+
+tenk_results_whenever <- replicate(10000, double.sample_whenever())
+
+sum(tenk_results_whenever < 0.05)/10000
+
+```
+
+What do you conclude on the basis of this simulation? How bad is this kind of data-dependent policy?
+
+> I see an inflation of false positives. Broadly, the more liberal I am with my double-sampling policy, the higher the rate of false positives, though individual runs of 10,000 simulations vary noticeably. For example, when I sampled again whenever my p value was not significant, I got roughly an 8 to 9 percent false positive rate. This data-dependent double-sampling policy leads to many more false positives, even though my nominal significance threshold is still 0.05.
diff --git a/hw4-updated.html b/hw4-updated.html
new file mode 100644
index 0000000..96feec5
--- /dev/null
+++ b/hw4-updated.html
@@ -0,0 +1,628 @@
+
+Psych 251 PS4: Simulation + Analysis
+

This is problem set #4, in which we want you to integrate your knowledge of data wrangling with some basic simulation skills. It’s a short problem set to help consolidate your ggplot2 skills and then help you get your feet wet in testing statistical concepts through “making up data” rather than consulting a textbook or doing math.

+

For ease of reading, please separate your answers from our text by marking our text with the > character (indicating quotes).

+
+

Part 1: ggplot practice

+

This part is a warmup; it should be relatively straightforward ggplot2 practice.

+

Load data from Frank, Vul, Saxe (2011, Infancy), a study in which we measured infants’ looking to hands in moving scenes. There were infants from 3 months all the way to about two years, and there were two movie conditions (Faces_Medium, in which kids played on a white background, and Faces_Plus, in which the backgrounds were more complex and the people in the videos were both kids and adults). An eye-tracker measured children’s attention to faces. This version of the dataset only gives two conditions and only shows the amount of looking at hands (other variables were measured as well).

+
library(tidyverse)
+
fvs <- read_csv("data/FVS2011-hands.csv")
+
## Parsed with column specification:
+## cols(
+##   subid = col_double(),
+##   age = col_double(),
+##   condition = col_character(),
+##   hand.look = col_double()
+## )
+
colnames(fvs)
+
## [1] "subid"     "age"       "condition" "hand.look"
+
head(fvs)
+
## # A tibble: 6 x 4
+##   subid   age condition    hand.look
+##   <dbl> <dbl> <chr>            <dbl>
+## 1     2  3.16 Faces_Medium    0.0319
+## 2    93  5.03 Faces_Medium    0.119 
+## 3    29  5.85 Faces_Medium    0.0921
+## 4    76  5.85 Faces_Medium    0.130 
+## 5    48  6.08 Faces_Medium    0.0138
+## 6   101  6.15 Faces_Medium    0.0438
+

First, use ggplot to plot a histogram of the ages of children in the study. NOTE: this is a repeated measures design, so you can’t just take a histogram of every measurement.

+
fvs %>%
+    group_by(subid, age) %>%
+    summarise()
+
## # A tibble: 119 x 2
+## # Groups:   subid [119]
+##    subid   age
+##    <dbl> <dbl>
+##  1     1 12.0 
+##  2     2  3.16
+##  3     3  9.53
+##  4     4 15.4 
+##  5     5  9.86
+##  6     6  9.40
+##  7     7 13.6 
+##  8     8 13.3 
+##  9     9 12.5 
+## 10    10 14.5 
+## # ... with 109 more rows
+
ggplot(fvs, aes(x=fvs$age)) + geom_histogram()
+
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
+

+

Second, make a scatter plot showing hand looking as a function of age and condition. Add appropriate smoothing lines. Take the time to fix the axis labels and make the plot look nice.

+
ggplot(fvs, aes(x=fvs$age, y=fvs$hand.look)) + 
+  geom_point(shape = ".") + 
+  geom_smooth()
+
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
+

+
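Since the question asks for hand looking as a function of age and condition, one option is to map condition to colour so that each movie type gets its own points and smoothing line. A sketch along these lines (not part of the knitted output above):

```r
library(tidyverse)

ggplot(fvs, aes(x = age, y = hand.look, colour = condition)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "loess") +
  labs(x = "Age (months)",
       y = "Proportion of looking at hands",
       colour = "Condition")
```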

What do you conclude from this pattern of data?

+
+

Hand looking is low and relatively flat across the first year; starting around 15 months, infants begin paying substantially more attention to hands.

+
+

What statistical analyses would you perform here to quantify these differences?

+
+

A linear regression predicting hand looking from age, condition, and their interaction; given the repeated measures design, a mixed-effects model with random intercepts for subject would be a reasonable alternative.

+
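One concrete version of such an analysis, sketched with `lm()` (the variable names follow the dataset above; a mixed-effects model would be a reasonable alternative given the repeated measures):

```r
# Does hand looking increase with age, and does the increase differ by condition?
model <- lm(hand.look ~ age * condition, data = fvs)
summary(model)
```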
+
+
+

Part 2: Simulation

+
library(tidyverse)
+

Let’s start by convincing ourselves that t-tests have the appropriate false positive rate. Run 10,000 t-tests with standard, normally-distributed data from a made up 30-person, single-measurement experiment (the command for sampling from a normal distribution is rnorm).

+

The goal of these t-tests is to determine, based on 30 observations, whether the underlying distribution (in this case a normal distribution with mean 0 and standard deviation 1) has a mean that is different from 0. In reality, the mean is not different from 0 (we sampled it using rnorm), but sometimes the 30 observations we get in our experiment will suggest that the mean is higher or lower. In this case, we’ll get a “significant” result and incorrectly reject the null hypothesis of mean 0.

+

What’s the proportion of “significant” results (\(p < .05\)) that you see?

+

First do this using a for loop.

+
result <- numeric(10000)
+
+for (i in 1:10000) {
+  X <- rnorm(30, 0, 1)
+  result[i] <- t.test(X)$p.value
+}
+
+sig_results <- result < 0.05
+
+sum(sig_results)/10000
+
## [1] 0.0501
+

Next, do this using the replicate function:

+
pvals <- replicate(10000, t.test(rnorm(30))$p.value)
+
+sig_pvals <- pvals < 0.05
+
+sum(sig_pvals)/10000
+
## [1] 0.0533
+

How does this compare to the intended false-positive rate of \(\alpha=0.05\)?

+
+

This answer makes sense. I’m getting a false positive rate of about 5 percent, or slightly less. That’s what I would expect given that I am testing for significance at \(p < .05\).

+
+
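A quick way to check whether estimates like 0.0501 and 0.0533 are consistent with a true rate of 0.05 is the Monte Carlo standard error of a proportion over 10,000 simulations:

```r
# standard error of a proportion p estimated from n_sims independent simulations
p <- 0.05
n_sims <- 10000
se <- sqrt(p * (1 - p) / n_sims)
se                        # about 0.0022
c(p - 2 * se, p + 2 * se) # roughly 0.046 to 0.054
```

Both observed proportions fall comfortably inside that range, so the t-test's false positive rate looks correct.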

Ok, that was a bit boring. Let’s try something more interesting - let’s implement a p-value sniffing simulation, in the style of Simons, Nelson, & Simonsohn (2011).

+

Consider this scenario: you have done an experiment, again with 30 participants (one observation each, just for simplicity). The question is whether the true mean is different from 0. You aren’t going to check the p-value every trial, but let’s say you run 30 - then if the p-value is within the range p < .25 and p > .05, you optionally run 30 more and add those data, then test again. But if the original p value is < .05, you call it a day, and if the original is > .25, you also stop.

+

First, write a function that implements this sampling regime.

+
double.sample <- function () {
+  
+  first_sample <- rnorm(30)
+
+    pvals_2 <- (t.test(first_sample)$p.value)
+
+  if(pvals_2<0.05) {
+      return(pvals_2)
+  } else if (pvals_2>0.25) {
+      return(pvals_2) 
+  } else {
+      second_sample <- c(first_sample, rnorm(30))
+      return(t.test(second_sample)$p.value)
+  }
+
+}
+

Now call this function 10k times and find out what happens.

+
tenk_results <- replicate(10000, double.sample())
+
+sum(tenk_results < 0.05)/10000
+
## [1] 0.0699
+

Is there an inflation of false positives? How bad is it?

+
+

Yes, there’s an inflation of false positives. I’m getting about 7 percent false positives using the double-sample technique. This is the p-hacking pattern Mike talked about in class. While I may falsely believe that I am holding my false positive rate below 0.05, the double-sampling rule is actually giving me a higher one.

+
+

Now modify this code so that you can investigate this “double the sample” rule in a bit more depth. In the previous question, the researcher doubles the sample only when they think they got “close” to a significant result, i.e. when their not-significant p is less than 0.25. What if the researcher was more optimistic? See what happens in these 3 other scenarios:

+
  • The researcher doubles the sample whenever their pvalue is not significant, but it’s less than 0.5.
  • The researcher doubles the sample whenever their pvalue is not significant, but it’s less than 0.75.
  • The researcher doubles their sample whenever they get ANY pvalue that is not significant.
+

How do these choices affect the false positive rate?

+

HINT: Try to do this by making the function double.sample take the upper p value as an argument, so that you can pass this through dplyr.

+

HINT 2: You may need more samples. Find out by looking at how the results change from run to run.
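Per HINT 1, the three scenarios below can also be collapsed into one function that takes the upper p value as an argument. A sketch (the argument names `upper` and `n` are illustrative):

```r
double.sample <- function(upper = 0.25, n = 30) {
  first <- rnorm(n)
  p <- t.test(first)$p.value
  # resample only when p is non-significant but under the upper threshold
  if (p >= 0.05 && p < upper) {
    p <- t.test(c(first, rnorm(n)))$p.value
  }
  p
}

# false positive rate under each stopping rule
sapply(c(0.25, 0.5, 0.75, 1), function(u) {
  mean(replicate(10000, double.sample(u)) < 0.05)
})
```

Passing `upper = 1` corresponds to doubling the sample after any non-significant result.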

+
# The researcher doubles the sample whenever their pvalue is not significant, but it's less than 0.5.
+
+double.sample0.5 <- function () {
+  
+  first_sample_0.05 <- rnorm(30)
+
+    pvals_2_0.05 <- (t.test(first_sample_0.05)$p.value)
+
+    if(pvals_2_0.05<0.05) {
+      return(pvals_2_0.05)
+  } else if (pvals_2_0.05>0.5) {
+      return(pvals_2_0.05) 
+  } else {
+      second_sample_0.05 <- c(first_sample_0.05, rnorm(30))
+      return(t.test(second_sample_0.05)$p.value)
+  }
+
+}
+
+tenk_results_0.05 <- replicate(10000, double.sample0.5())
+
+sum(tenk_results_0.05 < 0.05)/10000
+
## [1] 0.0815
+
# The researcher doubles the sample whenever their pvalue is not significant, but it's less than 0.75.
+
+double.sample0.75 <- function () {
+  
+  first_sample_0.75 <- rnorm(30)
+
+    pvals_2_0.75 <- (t.test(first_sample_0.75)$p.value)
+
+  
+  if(pvals_2_0.75<0.05) {
+      return(pvals_2_0.75)
+  } else if (pvals_2_0.75>0.75) {
+      return(pvals_2_0.75) 
+  } else {
+      second_sample_0.75 <- c(first_sample_0.75, rnorm(30))
+      return(t.test(second_sample_0.75)$p.value)
+  }
+
+}
+
+tenk_results_0.75 <- replicate(10000, double.sample0.75())
+
+sum(tenk_results_0.75 < 0.05)/10000
+
## [1] 0.0728
+
# The researcher doubles the sample whenever their pvalue is not significant.
+
+double.sample_whenever <- function () {
+  
+  first_sample_whenever <- rnorm(30)
+    pvals_2_whenever <- (t.test(first_sample_whenever)$p.value)
+
+  if(pvals_2_whenever<0.05) {
+      return(pvals_2_whenever)
+  } else {
+      second_sample_whenever <- c(first_sample_whenever, rnorm(30))
+      return(t.test(second_sample_whenever)$p.value)
+  }
+
+}
+
+tenk_results_whenever <- replicate(10000, double.sample_whenever())
+
+sum(tenk_results_whenever < 0.05)/10000
+
## [1] 0.0859
+

What do you conclude on the basis of this simulation? How bad is this kind of data-dependent policy?

+
+

I see an inflation of false positives. Broadly, the more liberal I am with my double-sampling policy, the higher the rate of false positives, though individual runs of 10,000 simulations vary noticeably. For example, when I sampled again whenever my p value was not significant, I got roughly an 8 to 9 percent false positive rate. This data-dependent double-sampling policy leads to many more false positives, even though my nominal significance threshold is still 0.05.

+
+