The structure of the final paper for this class has changed slightly since the last time I taught POLS 1600.
I have expanded the total number of sections and placed less emphasis on some concepts. There are things this paper does, which I no longer ask you to do (e.g. explain the difference between simulation and asymptotic approaches to statistical inference). There are things I ask you to do (e.g. write a separate theory section that motivates your expectations) which this paper was not required to do.
Still, this paper remains an excellent example of how to frame your question, articulate expectations, describe your data, estimate models, and interpret your results.
In presidential elections, candidates focus a lot of their time and resources on competitive “swing” states which often decide the balance of the Electoral College. In 2016, Michigan, New Hampshire, Pennsylvania, Wisconsin, Florida, and Minnesota all came down to less than a 2 percent margin between Hillary Clinton and Donald Trump. [1] Together, these states control 89 electoral votes. A minor swing in the popular vote in these highly competitive states could result in a drastically different outcome in the Electoral College.
According to Fairvote.org, 66% of eligible people voted in the 12 most competitive states in the 2012 Presidential Election but only 57% did in the nation’s 38 other states. [2] One study found that voters in competitive states were more resilient towards adverse weather conditions than voters in uncompetitive states. [3] Another study found that every 10 percent increase in the margin of victory of a Governor or Senator decreased voter turnout by 1%. [4] It appears that people will be more likely to vote as the probability of their vote being decisive increases.
This paper will look at how voter turnout changes the more competitive a voter perceives an election to be. It will determine if voter turnout in the 2016 Presidential Election was influenced by the perceived competitiveness of the election or another factor such as party identification or past voting habits. It will also look at how living in one of the six states with a less than 2% margin in 2016 influenced perceived competitiveness and voter turnout.
This paper will use data from the American National Election Study from 2016. The unit of analysis for this study is eligible voters in the United States. The ANES is conducted using a random sample of eligible U.S. voters who completed two interviews, one before and one after the 2016 Election. Individuals completed the survey either in person or online. 4270 individuals completed the survey.
Variable | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. | NA’s |
---|---|---|---|---|---|---|---|
2016 Pres. Race Competitive | 0 | 0 | 1 | 0.725 | 1 | 1 | 27 |
Turnout in 2016 | 0 | 1 | 1 | 0.867 | 1 | 1 | 939 |
Voted in 2012 | 0 | 0 | 1 | 0.733 | 1 | 1 | 16 |
Competitive State | 0 | 0 | 0 | 0.175 | 0 | 1 | 0 |
The primary independent variable of interest in this paper is the perceived competitiveness of the 2016 Presidential Election. It is measured by a survey question that asked respondents whether the election would be close or whether one candidate would win by quite a bit. This question was asked in the first survey, before the presidential election took place. The variable is recoded so that 1 represents respondents who thought the election would be close (competitive) and 0 represents respondents who thought one candidate would win by quite a bit (not competitive). 3076 respondents identified the election as competitive while 1167 respondents said it would not be competitive. According to Table 1, 72.5% of respondents said that the 2016 Presidential Election would be competitive. Figure 1 shows a map of mean perceived competitiveness in 2016 by state. The transparency is set to the number of respondents from each state. States with fewer respondents will have less precise means because a mean based on fewer observations is more sensitive to the answers of individual respondents. Florida has a high number of respondents who perceived the election as competitive while California has a high number of respondents who perceived it as less competitive.
The dependent variable in this paper is whether the respondent voted in the 2016 Presidential Election. It is measured by a survey question that asked respondents to self-report if they voted for president in 2016. A measure of 0 represents respondents who reported they did not vote in 2016. A measure of 1 represents respondents who reported that they did vote in 2016. 444 respondents said they did not vote in 2016 while 2887 respondents said they did vote. According to Table 1, about 86.7% of respondents reported voting in 2016. National voter turnout in 2016 was 55% [5] so this sample over-represents voter turnout. This is likely because people say they vote even if they don’t because it is seen as a more socially acceptable response. Figure 2 shows a map of mean turnout in 2016 by state. California has a high number of respondents and a high reported turnout compared to Texas which has a high number of respondents but a lower reported turnout.
Party | Number of Respondents |
---|---|
Democrat | 1450 |
Independent | 1367 |
Other | 148 |
Republican | 1231 |
The first covariate in this paper is whether or not the respondent voted for president in 2012. This variable is measured by a survey question that asked respondents if they did or did not vote in 2012. Those who voted in 2012 are coded with a measure of 1. Those who did not vote in 2012 are coded with a measure of 0. According to Table 1, 73.27% of respondents voted in the 2012 Presidential Election. National turnout in 2012 was 58.6% [5] so this sample also over-represents voter turnout. Similar to the variable measuring turnout in 2016, this is likely because some respondents lied and said they voted because voting is seen as a more socially acceptable response.
The second covariate in this paper is party identification. This variable is measured by a survey question that asked respondents if they identify as a Democrat, Republican, Independent or other party affiliation. Respondents who answered “Other PLEASE SPECIFY” in the survey and provided a specific party are recoded to a general “Other” value. This paper treats party identification as a categorical variable with respondents falling into one of four categories instead of using an ideological spectrum. There are 1450 Democrats, 1367 Independents, 1231 Republicans, and 148 respondents with other party affiliations represented in the survey.
The third covariate in this paper is whether or not the respondent lives in a competitive state. Michigan, New Hampshire, Pennsylvania, Wisconsin, Florida, and Minnesota are identified as competitive states because they all came down to a less than 2% margin between Hillary Clinton and Donald Trump in the popular vote. This variable is based on the survey question that asked respondents to identify their state. A new variable called “comp_state” identifies whether or not an individual lives in a competitive state. A respondent is assigned a 1 if their state is one of the six “competitive” states or a 0 otherwise. 17.47% of all respondents lived in one of the six competitive states at the time of the survey.
\[ turnout \sim pres\_competitive \]
It is expected that if an individual believes the 2016 Presidential Election is competitive, they will be more likely to vote. The first relationship tested will be turnout modeled by perceived competitiveness. The coefficient of perceived competitiveness will be positive because it is expected to increase turnout. The effect will be relatively small because most respondents answered that they both thought the election was competitive and voted in 2016.
The central limitation in an observational study is that it is impossible to observe the counterfactual. If an individual thought the election was competitive and voted, it is impossible to know if they would have voted if they thought it was not competitive instead. This problem can be partially solved by adding covariates to the model to remove variation on the dependent variable attributable to other factors. Covariates help isolate the effect of the primary independent variable of interest and limit the problem of causal inference.
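The role of a covariate can be illustrated with a small simulation. The sketch below uses made-up data (not the ANES data), and the variable names `habit`, `perceived`, and `voted` are hypothetical. It assumes a world in which habitual voting drives both perceived competitiveness and turnout, while perceived competitiveness has no effect of its own.

```r
# Simulated illustration only: these are NOT the ANES data.
set.seed(1600)
n <- 5000

# Hypothetical confounder: habitual voters (1) vs. non-habitual voters (0)
habit <- rbinom(n, 1, 0.7)

# Habitual voters are more likely to see the race as close...
perceived <- rbinom(n, 1, 0.6 + 0.2 * habit)

# ...and more likely to vote. Perceived competitiveness has no true effect here.
voted <- rbinom(n, 1, 0.5 + 0.4 * habit)

# Without the covariate, the coefficient on `perceived` absorbs part of the habit effect
naive <- lm(voted ~ perceived)

# With the covariate, the coefficient on `perceived` shrinks toward zero
adjusted <- lm(voted ~ perceived + habit)

round(c(naive    = coef(naive)[["perceived"]],
        adjusted = coef(adjusted)[["perceived"]]), 3)
```

In this simulated setting the unadjusted coefficient is positive even though the true effect is zero, which is the kind of spurious relationship that adding a covariate is meant to reduce.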
\[ turnout \sim pres\_competitive + prev\_voter \]
The second relationship tested will be turnout modeled by perceived competitiveness with turnout in 2012 as a covariate. Skeptics of the first model might say that a positive relationship between competitiveness and turnout is not valid because people who vote habitually are more likely both to vote and to see elections as competitive. By adding turnout in 2012 as a covariate, the model can remove the variation in perceived competitiveness and turnout in 2016 that is attributable to past voting in 2012. It is expected that the coefficient for turnout in 2012 will be positive because previous voters are more likely to vote. This should decrease the coefficient for perceived competitiveness compared to the previous model because some of the variation in turnout will be explained by individuals who are in the habit of voting.
\[ turnout \sim pres\_competitive * party\_id \]
The third relationship tested will be turnout modeled by perceived competitiveness with party identification as an interaction covariate. Interaction variables help demonstrate how a covariate affects the relationship between the dependent variable and the primary independent variable of interest. Certain political parties might do a better job of getting their base to perceive an election as competitive. By adding political identification as an interaction, the model can calculate how political parties influence the relationship between perceived competitiveness and turnout. It is expected that the coefficient will be positive for individuals who think the election is competitive and negative for those who think it will not be competitive regardless of party. Due to the narrative in the media that Hillary Clinton was going to easily win the election over Donald Trump, Republicans are expected to view the election as more competitive than Democrats and have a higher rate of turnout.
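Written out as a regression equation, an interaction model of this kind can be sketched as follows. The \(\beta\) symbols and the indicator notation \(party_{ik}\) are assumptions introduced here for illustration, with \(k\) indexing the Independent, Other, and Republican categories and Democrats serving as the reference category.

\[
turnout_i = \beta_0 + \beta_1 \, pres\_competitive_i + \sum_{k} \beta_{2k} \, party_{ik} + \sum_{k} \beta_{3k} \, (pres\_competitive_i \times party_{ik}) + \varepsilon_i
\]

Under this notation, \(\beta_1\) is the difference in turnout associated with perceiving the election as competitive among Democrats, and \(\beta_1 + \beta_{3k}\) is the corresponding difference for party \(k\).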
\[ turnout \sim pres\_competitive * comp\_state \]
The fourth relationship tested will be turnout modeled by perceived competitiveness with living in a competitive state as an interaction covariate. Competitive states get more attention from candidates and the media than non-competitive states. This could cause individuals to perceive the election as being more competitive overall, increasing their likelihood of turnout. By adding competitive states as an interaction covariate, the model can determine how living in a competitive state influences the relationship between perceived competitiveness and turnout. It is expected that the coefficient will be positive for individuals who think the election is competitive and negative for those who don’t, regardless of state. Respondents who live in competitive states should view the election as more competitive than respondents in non-competitive states and have a higher rate of turnout.
Term | Model 1 |
---|---|
(Intercept) | 0.85*** (0.01) |
pres_competitive | 0.02 (0.01) |
R² | 0.00 |
Adj. R² | 0.00 |
Num. obs. | 3313 |
RMSE | 0.34 |
Standard errors in parentheses. *** p < 0.001; ** p < 0.01; * p < 0.05 |
Term | 2.5 % | 97.5 % |
---|---|---|
(Intercept) | 0.8304838 | 0.8751342 |
pres_competitive | -0.0068549 | 0.0453558 |
In the first model, the coefficient of perceived competitiveness is 0.02. This means that, according to the model, respondents who perceived the 2016 Presidential Election as competitive had a reported turnout rate about 2 percentage points higher than respondents who did not.
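With a binary outcome and a single binary predictor, the slope from a linear model is simply the difference in turnout rates between the two groups. The sketch below shows this on simulated data (not the ANES data); the variable names and the probabilities used to generate the data are assumptions chosen for illustration.

```r
# Simulated illustration only: these are NOT the ANES data.
set.seed(2016)
n <- 3000
perceived <- rbinom(n, 1, 0.725)                  # 1 = thinks the race is close
voted     <- rbinom(n, 1, 0.85 + 0.02 * perceived)

fit <- lm(voted ~ perceived)

# Difference in turnout rates between the two groups
diff_in_means <- mean(voted[perceived == 1]) - mean(voted[perceived == 0])

# The slope equals the difference in group means (in proportion units)
c(slope = coef(fit)[["perceived"]], diff_in_means = diff_in_means)

# Multiply by 100 to read the slope in percentage points
100 * coef(fit)[["perceived"]]
```

Because the outcome is coded 0 or 1, the slope is a difference in proportions, which is why it reads most naturally in percentage points.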
A confidence interval is a range of values that is likely to include the true parameter. It can be calculated through a process called bootstrapping, where a sampling distribution is created by repeatedly drawing samples with replacement from a single dataset. Sampling with replacement means that when a value is chosen from the dataset it is returned, so there is a chance it can be selected again. The standard error is the standard deviation, or variation, of the sampling distribution. For a 95% confidence interval, 1.96 multiplied by the standard error is subtracted from and added to the coefficient produced by the model to calculate the lower and upper bounds. If a relationship is statistically significant at the 0.05 level, both bounds of the confidence interval will be on the same side of zero. The confidence interval for the first model is between -0.01 and 0.05. Because this interval includes zero, the relationship in the first model is not statistically significant.
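The two ways of building an interval can be compared side by side. The sketch below uses simulated data (not the ANES data) with hypothetical variable names: it computes the interval from the coefficient plus or minus 1.96 standard errors, then a percentile interval from 1000 bootstrap resamples.

```r
# Simulated illustration only: these are NOT the ANES data.
set.seed(1600)
n <- 2000
dat <- data.frame(perceived = rbinom(n, 1, 0.7))
dat$voted <- rbinom(n, 1, 0.80 + 0.05 * dat$perceived)

fit <- lm(voted ~ perceived, data = dat)
est <- coef(fit)[["perceived"]]
se  <- summary(fit)$coefficients["perceived", "Std. Error"]

# Analytic interval: estimate plus or minus 1.96 standard errors
analytic_ci <- c(lower = est - 1.96 * se, upper = est + 1.96 * se)

# Bootstrap interval: resample rows with replacement and re-estimate the slope
boot_est <- replicate(1000, {
  idx <- sample(nrow(dat), replace = TRUE)
  coef(lm(voted ~ perceived, data = dat[idx, ]))[["perceived"]]
})
boot_ci <- quantile(boot_est, c(0.025, 0.975))

rbind(analytic = analytic_ci, bootstrap = unname(boot_ci))
```

With a sample of this size the two intervals should be close to each other, since the standard deviation of the bootstrap estimates approximates the standard error.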
A p-value is the probability of obtaining results at least as extreme as those observed if the null hypothesis is true. The null hypothesis is a statement that there is no relationship between the two primary variables. In the case of the first model, the null hypothesis would be that perceived competitiveness has no effect on voter turnout. To determine the p-value, the values of the dependent variable are shuffled to break any relationship in the data. A null distribution is created by repeating this shuffling many times and recalculating the test statistic each time. Then, the p-value is the proportion of test statistics in the null distribution that are at least as extreme as the observed one. If a p-value is less than 0.05, the null hypothesis is rejected and the relationship is considered statistically significant. The p-value for the first model is greater than 0.05. The p-value and confidence interval both lead to the conclusion that the effect of perceived competitiveness on turnout is not statistically significant.
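The shuffling procedure can be sketched in a few lines. The example below uses simulated data (not the ANES data) in which there is no true relationship by construction; the helper `diff_means` and the variable names are hypothetical.

```r
# Simulated illustration only: these are NOT the ANES data.
set.seed(1600)
n <- 2000
perceived <- rbinom(n, 1, 0.7)
voted     <- rbinom(n, 1, 0.85)   # no true relationship by construction

# Test statistic: difference in turnout rates between the two groups
diff_means <- function(y, x) mean(y[x == 1]) - mean(y[x == 0])
observed <- diff_means(voted, perceived)

# Null distribution: shuffle the outcome many times and recompute the statistic
null_dist <- replicate(2000, diff_means(sample(voted), perceived))

# Two-sided p-value: share of shuffled statistics at least as extreme as observed
p_value <- mean(abs(null_dist) >= abs(observed))
p_value
```

Shuffling the outcome keeps the overall turnout rate fixed while destroying any link to the predictor, so the shuffled statistics show what differences arise by chance alone.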
A p-value gives a hard cutoff for determining the statistical significance of a relationship whereas a confidence interval provides a range of values that is consistent with the data. Confidence intervals might be preferred to p-values because they give a more complete picture of a relationship by providing a range of possible values that the actual result is likely to be contained within. P-values might be preferred to confidence intervals because they offer a way to quickly evaluate the statistical significance of a result.
Term | Model 1 | Model 2 | Model 3 | Model 4 |
---|---|---|---|---|
(Intercept) | 0.85*** (0.01) | 0.66*** (0.02) | 0.88*** (0.02) | 0.85*** (0.01) |
pres_competitive | 0.02 (0.01) | 0.01 (0.01) | 0.00 (0.02) | 0.02 (0.01) |
prev_voter | | 0.24*** (0.01) | | |
party_idIndependent | | | -0.12*** (0.03) | |
party_idOther | | | -0.23*** (0.07) | |
party_idRepublican | | | 0.04 (0.03) | |
pres_competitive:party_idIndependent | | | 0.08* (0.03) | |
pres_competitive:party_idOther | | | 0.13 (0.08) | |
pres_competitive:party_idRepublican | | | -0.02 (0.03) | |
comp_state | | | | 0.05 (0.03) |
pres_competitive:comp_state | | | | -0.04 (0.04) |
R² | 0.00 | 0.08 | 0.02 | 0.00 |
Adj. R² | 0.00 | 0.08 | 0.01 | 0.00 |
Num. obs. | 3313 | 3300 | 3279 | 3313 |
RMSE | 0.34 | 0.32 | 0.34 | 0.34 |
Standard errors in parentheses. *** p < 0.001; ** p < 0.01; * p < 0.05 |
Term | 2.5 % | 97.5 % |
---|---|---|
(Intercept) | 0.6296898 | 0.6912645 |
pres_competitive | -0.0106285 | 0.0394461 |
prev_voter | 0.2165550 | 0.2724112 |
In the second model, turnout in 2012 is added as a covariate. The coefficient of perceived competitiveness is 0.01 and the coefficient of turnout in 2012 is 0.24. Based on the p-value and confidence interval, perceived competitiveness is still not statistically significant when controlling for previous voting habits. Respondents who voted in 2012 are 24 percentage points more likely to vote than those who did not. The confidence interval for this estimate is between 22 and 27 percentage points. The p-value is less than 0.001, meaning the relationship between turnout in 2016 and turnout in 2012 is statistically significant. According to the second model, turnout in 2012 is a better predictor of turnout in 2016 than perceived competitiveness.
Term | 2.5 % | 97.5 % |
---|---|---|
(Intercept) | 0.8509454 | 0.9148625 |
pres_competitive | -0.0381497 | 0.0416733 |
party_idIndependent | -0.1738353 | -0.0661106 |
party_idOther | -0.3624558 | -0.0956598 |
party_idRepublican | -0.0215141 | 0.0924408 |
pres_competitive:party_idIndependent | 0.0169087 | 0.1441818 |
pres_competitive:party_idOther | -0.0261667 | 0.2812159 |
pres_competitive:party_idRepublican | -0.0892191 | 0.0429307 |
In the third model, party ID is added as an interaction covariate. Independents and those with other party affiliations who perceive an election to be non-competitive are 12 and 23 percentage points less likely to vote, respectively, compared to Democrats who perceive it as non-competitive. Both of these values are statistically significant with p-values less than 0.001. Independents are 8 percentage points more likely to vote if they perceive an election to be competitive compared to if they perceive it as uncompetitive. This relationship has a p-value of less than 0.05 and a confidence interval between 1.7 and 14 percentage points, making it statistically significant. Perceiving an election to be competitive has no statistically significant effect on turnout among Democrats, Republicans, or those with other party affiliations.
Predicted probabilities can be calculated for the interaction models by plugging the possible combinations of the interaction covariate and the primary independent variable of interest into an equation for the line of best fit generated by the linear model. Figure 3 shows the predicted values for turnout based on the combinations of party identification and perceived competitiveness. Republicans and Democrats have about the same predicted turnout regardless of the perceived competitiveness of the election. Independents and those with other party affiliations are more likely to vote in competitive elections than non-competitive elections. However, as demonstrated by the coefficients from the model, the relationship between perceiving an election as competitive and turnout for respondents with other party affiliations is not statistically significant.
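The plug-in calculation can be sketched by hand from the rounded Model 3 coefficients reported in the regression table. The helper `predict_turnout` and the short coefficient names below are hypothetical, and because the inputs are rounded to two decimals the results are approximate rather than the exact values plotted in Figure 3.

```r
# Rounded Model 3 coefficients as reported in the regression table
b <- c(intercept = 0.88, comp = 0.00,
       Independent = -0.12, Other = -0.23, Republican = 0.04,
       comp_Independent = 0.08, comp_Other = 0.13, comp_Republican = -0.02)

# Hypothetical helper: predicted turnout for one party and one perception value.
# Democrats are the reference category, so they get no party or interaction term.
predict_turnout <- function(party, comp) {
  main  <- if (party == "Democrat") 0 else b[[party]]
  inter <- if (party == "Democrat") 0 else b[[paste0("comp_", party)]]
  b[["intercept"]] + b[["comp"]] * comp + main + inter * comp
}

# Every combination of party and perceived competitiveness (0 = no, 1 = yes)
grid <- expand.grid(party = c("Democrat", "Independent", "Other", "Republican"),
                    comp = c(0, 1), stringsAsFactors = FALSE)
grid$predicted <- mapply(predict_turnout, grid$party, grid$comp)
grid
```

For example, an Independent who perceives the election as competitive gets 0.88 + 0.00 - 0.12 + 0.08 = 0.84, compared with 0.88 - 0.12 = 0.76 for an Independent who does not.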
Term | 2.5 % | 97.5 % |
---|---|---|
(Intercept) | 0.8216759 | 0.8700237 |
pres_competitive | -0.0037435 | 0.0531607 |
comp_state | -0.0157294 | 0.1102894 |
pres_competitive:comp_state | -0.1111848 | 0.0326825 |
In the fourth model, living in a competitive state is added as an interaction covariate. Among respondents who perceive the election as non-competitive, those who live in a competitive state are 5 percentage points more likely to vote than those who live in a non-competitive state. The interaction coefficient of -0.04 means that the effect of perceiving the election as competitive is 4 percentage points lower in competitive states than in non-competitive states. Combined with the coefficient of 0.02 on perceived competitiveness, respondents in competitive states who perceive the election as competitive are about 2 percentage points less likely to vote than those who perceive it as non-competitive. Figure 4 shows the predicted values for turnout based on perceived competitiveness and living in a competitive state. Respondents who live in non-competitive states have a slightly higher predicted turnout if they perceive the election as competitive. On the other hand, respondents who live in competitive states have a slightly lower predicted turnout if they perceive the election as competitive. Neither the coefficient for living in a competitive state nor the interaction has a p-value of less than 0.05. Therefore, there is no statistically significant relationship between living in a competitive state and voter turnout based on the model.
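As a rough check, the four predicted values can be worked out from the rounded Model 4 coefficients in the regression table (intercept 0.85, perceived competitiveness 0.02, competitive state 0.05, interaction -0.04). Because the inputs are rounded, these sums are approximations of the values plotted in Figure 4.

\[
\begin{aligned}
\text{non-competitive state, election not seen as close:} \quad & 0.85 \\
\text{non-competitive state, election seen as close:} \quad & 0.85 + 0.02 = 0.87 \\
\text{competitive state, election not seen as close:} \quad & 0.85 + 0.05 = 0.90 \\
\text{competitive state, election seen as close:} \quad & 0.85 + 0.02 + 0.05 - 0.04 = 0.88
\end{aligned}
\]

On these rounded figures, predicted turnout is about 2 percentage points higher for those who see the election as close in non-competitive states and about 2 percentage points lower in competitive states.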
The results of this study do not support the original expectations. It was hypothesized that there would be a relationship between perceived competitiveness and voter turnout in the 2016 Presidential Election. No statistically significant relationship was observed between the perceived competitiveness and voter turnout, even when turnout in 2012 and living in a competitive state were introduced as covariates.
Both Independents and individuals with other party identifications were less likely to vote when they perceived the election as non-competitive. For Independents, this relationship could be explained by a perception that their vote matters less in an election where one partisan base has a clear advantage. For members of another party, this relationship could be explained by their candidate of choice having a worse chance in an uncompetitive election. This is supported by an increased focus on third-party candidates like Gary Johnson in 2016 or Ralph Nader in 2000 compared to other, less competitive, election years.
Competitiveness only seems to have a statistically significant effect on Independents. Independents were more likely to vote when they perceived the election as competitive. This could be because Independents might feel that their vote matters more in competitive elections. In addition, when an election is close, campaigns may invest more resources in getting Independents to vote because they are seen as more “persuadable.”
This study is important because it leads to the conclusion that an individual’s decision to vote may not be influenced by their perception of the competitiveness of an election. Instead, other factors like previous voting habits, race, gender, or age may be more important in determining whether or not an individual heads to the polls. One major limitation of this study is the lack of verified turnout in the 2016 and 2012 elections. Voters are overrepresented in the data, possibly leading to skewed results. In a future analysis, a dataset with verified turnout should be used. Another limitation of this study is the binary measure of competitiveness in the survey. It might yield more interesting results if respondents were asked to rate how competitive they thought the election was going to be on a continuous scale. Future studies could look at using available data to explain increased turnout among Independents and those with other party affiliations. While this study does not provide any conclusive evidence for a relationship between perceived competitiveness and voter turnout, it provides avenues for future research to explore.
[1] “The 10 Closest States in the 2016 Election.” U.S. News & World Report. Accessed December 2, 2019. https://www.usnews.com/news/the-run-2016/articles/2016-11-14/the-10-closest-states-in-the-2016-election.
[2] FairVote.org. “What Affects Voter Turnout Rates.” FairVote. Accessed December 2, 2019. https://www.fairvote.org/what_affects_voter_turnout_rates.
[3] Fraga, Bernard L., and Eitan Hersh. “Voting Costs and Voter Turnout in Competitive Elections.” In APSA 2010 Annual Meeting Paper. 2010.
[4] Gilliam, Franklin D. “Influences on Voter Turnout for U. S. House Elections in Non-Presidential Years.” Legislative Studies Quarterly 10, no. 3 (1985): 339-51. www.jstor.org/stable/440035.
[5] Wilson, Reid. “New Report Finds That Voter Turnout in 2016 Topped 2012.” TheHill, March 16, 2017. https://thehill.com/homenews/state-watch/324206-new-report-finds-that-voter-turnout-in-2016-topped-2012.
Data Citation
American National Election Studies, Stanford University, and University of Michigan. American National Election Study: 2016 Pilot Study. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2016-03-16. https://doi.org/10.3886/ICPSR36390.v1
# Set Working Directory
setwd("~/Desktop/POLS 1600/PROJECT/Draft")
# Load Libraries (normal libraries from lab + 'foreign' for loading data)
if (!require("pacman")){ install.packages("pacman") }
pacman::p_load("knitr","mosaic","foreign", "tidyverse","texreg", "haven", "DeclareDesign", "labelled", "maps", "mapproj","datasets", "usmap")
# Load the Data
df <- remove_labels(read_dta("anes_timeseries_2016_Stata12.dta"))
# Recode Age
df$age <- df$V161267
df$age[df$age < 18] <- NA
table(df$age)
table(df$V161267)
# Recode Gender
df$gender <- df$V161342
# Set missing codes (below 1) to NA first, while the column is still numeric
df$gender[df$gender < 1] <- NA
df$gender[df$gender == 1] <- "Male"
df$gender[df$gender == 2] <- "Female"
df$gender[df$gender == 3] <- "Other"
table(df$gender)
table(df$V161342)
# Recode Race
df$race <- df$V161310x
# Set missing codes (below 1) to NA first, while the column is still numeric
df$race[df$race < 1] <- NA
df$race[df$race == 1] <- "White"
df$race[df$race == 2] <- "Black"
df$race[df$race == 3] <- "Asian"
df$race[df$race == 4] <- "Native American"
df$race[df$race == 5] <- "Hispanic"
df$race[df$race == 6] <- "Other"
table(df$race)
table(df$V161310x)
# Recode Voter Reg
# 1 is registered. 0 is not registered.
df$voter_reg <- df$V161011
df$voter_reg[df$voter_reg == 1 | df$voter_reg == 2] <- 1
df$voter_reg[df$voter_reg == 3] <- 0
df$voter_reg[df$voter_reg < 0] <- NA
table(df$voter_reg)
table(df$V161011)
# Recode Party ID
df$party_id <- df$V161155
# Set codes below 1 to NA first, while the column is still numeric
df$party_id[df$party_id < 1] <- NA
df$party_id[df$party_id == 1] <- "Democrat"
df$party_id[df$party_id == 2] <- "Republican"
df$party_id[df$party_id == 3] <- "Independent"
df$party_id[df$party_id == 5] <- "Other"
table(df$party_id)
table(df$V161155)
# Recode Previous Voter
# After recoding: 1 is voted in 2012. 0 is didn't vote in 2012 (raw code 2).
df$prev_voter <- df$V161005
df$prev_voter[df$prev_voter == 1] <- 1
df$prev_voter[df$prev_voter == 2] <- 0
df$prev_voter[df$prev_voter < 0] <- NA
table(df$prev_voter)
table(df$V161005)
# Recode Competitive
# 1 is competitive. 0 is not as competitive.
df$pres_competitive <- df$V161147
df$pres_competitive[df$pres_competitive == 1] <- 1
df$pres_competitive[df$pres_competitive == 2] <- 0
df$pres_competitive[df$pres_competitive < 0] <- NA
table(df$pres_competitive)
table(df$V161147)
# Recode Turnout
# 0 is didn't vote. 1 is voted
df$turnout <- df$V162031x
df$turnout[df$turnout == 0] <- 0
df$turnout[df$turnout == 1] <- 1
df$turnout[df$turnout < 0] <- NA
table(df$turnout)
table(df$V162031x)
# Create New Sub Dataframe with Select Variables
# (makes things cleaner/easier)
df_sub <- with(df,
data.frame("state" = V161010e,
"age" = age,
"gender" = gender,
"race" = race,
"voter_reg" = voter_reg,
"party_id" = party_id,
"prev_voter" = prev_voter,
"pres_competitive" = pres_competitive,
"turnout" = turnout))
# Competitive states
comp_states <- c("WI","MI","PA","FL", "NH", "MN")
df_sub$comp_state <- ifelse(df_sub$state %in% comp_states, 1, 0 )
table(df_sub$comp_state)
sum_array <- rbind(summary(df_sub$pres_competitive),
summary(df_sub$turnout),
summary(df_sub$prev_voter),
summary(df_sub$comp_state))
row.names(sum_array) <- c("2016 Pres. Race Competitive",
                          "Turnout in 2016",
                          "Voted in 2012",
                          "Competitive State")
kable(sum_array,
caption = "Table 1: Summary Statistics",
digits = 3)
# Load Data Frame of States
states_tmp <- data.frame(state = unlist(state.abb),
region = tolower(unlist(state.name)))
df_sub <- df_sub %>% left_join(states_tmp, by = "state")
# Code Mean of pres_competitive and turnout by State
df_sub_state <- df_sub %>%
  group_by(region) %>%
  summarise(
    Respondents = n(),
    Competitive = mean(pres_competitive, na.rm = TRUE),
    Turnout = mean(turnout, na.rm = TRUE))
# Join df_sub_state to Map Data
us_states <- map_data("state")
# Join explicitly on the shared lowercase state-name column
us_states <- us_states %>% left_join(df_sub_state, by = "region")
# Plot Map of pres_competitive
ggplot() +
ggtitle("Figure 1: Mean Perceived Competitiveness in 2016 by State") +
geom_polygon(data = us_states,
aes(fill = Competitive,
x = long,
y = lat,
group = group,
alpha = Respondents)) +
theme_void() +
coord_map()
# Plot Map of turnout
ggplot() +
ggtitle("Figure 2: Mean Turnout in 2016 by State") +
geom_polygon(data = us_states,
aes(fill = Turnout,
x = long,
y = lat,
group = group,
alpha = Respondents)) +
theme_void() +
coord_map()
kable(table(df_sub$party_id),
col.names = c("Party", "Number of Respondents"),
caption = "Table 2: Respondent Breakdown by Party")
# Create the Models
lm1 <- lm(turnout ~ pres_competitive, data = df_sub)
lm2 <- lm(turnout ~ pres_competitive + prev_voter, data = df_sub)
lm3 <- lm(turnout ~ pres_competitive*party_id, data = df_sub)
lm4 <- lm(turnout ~ pres_competitive*comp_state, data = df_sub)
htmlreg(lm1,
doctype = FALSE,
caption.above = TRUE,
caption = "Table 3: Model 1 Summary Statistics")
kable(confint(lm1),
caption = "Table 4: Model 1 Confidence Intervals")
htmlreg(list(lm1, lm2, lm3, lm4),
        doctype = FALSE,
        caption.above = TRUE,
        caption = "Table 5: Models 1-4 Summary Statistics")
kable(confint(lm2),
caption = "Table 6: Model 2 Confidence Intervals")
# Generate Prediction Dataframe
pred_df <- expand.grid(pres_competitive = c(0, 1),
party_id = na.omit(unique(df_sub$party_id)))
pred_df <- cbind(pred_df,
predict(lm3, pred_df, interval = "confidence"))
# Produce Chart Using pred_df
# kable(pred_df,
# col.names = c("Competitive Election", "Party ID", "Fit", "Lower", "Upper"),
# caption = "Table 5: Model 3 Predicted Turnout Values")
# Produce Figure Using pred_df
pred_df %>%
mutate(Competitive = ifelse(pres_competitive == 1,
"Competitive",
"Not Competitive")) %>%
ggplot(aes(x = party_id,
y = fit,
col = Competitive)) +
ggtitle("Figure 3: Predicted Turnout Based on Partisanship") +
geom_point(position = position_dodge(width = .5)) +
geom_linerange(aes(ymin = lwr, ymax = upr),
position = position_dodge(width = .5))+
labs(
x = "Partisanship",
y = "Predicted Turnout")
kable(confint(lm3),
caption = "Table 7: Model 3 Confidence Intervals")
# Generate Prediction Dataframe
pred_df2 <- expand.grid(pres_competitive = c(0, 1),
comp_state = c(0, 1))
pred_df2 <- cbind(pred_df2,
predict(lm4, pred_df2, interval = "confidence"))
# Produce Chart Using pred_df2
# kable(pred_df2,
# col.names = c("Competitive Election", "Competitive State", "Fit", "Lower", "Upper"),
# caption = "Table 7: Model 4 Predicted Turnout Values")
# Produce Figure Using pred_df2
pred_df2 %>%
mutate(Competitive = ifelse(pres_competitive == 1,
"Competitive",
"Not Competitive")) %>%
ggplot(aes(x = comp_state,
y = fit,
col = Competitive)) +
ggtitle("Figure 4: Predicted Turnout Based on Competitive State") +
geom_point(position = position_dodge(width = .5)) +
geom_linerange(aes(ymin = lwr, ymax = upr),
position = position_dodge(width = .5))+
labs(
x = "Competitive State",
y = "Predicted Turnout")
kable(confint(lm4),
caption = "Table 8: Model 4 Confidence Intervals")