Prediction 8: Aggregating Across Models
October 28, 2024 – Even though Halloween is on Thursday, the scariest thing this week is the fact that we are now 8 days away from the 2024 Presidential Election!
Each week on this blog, I’ve explored different factors of elections and campaigns that can affect vote outcomes, from fundamentals like polling and the economy to fundraising and campaign events. Now, before I release my final model and final predictions next Monday, I want to revisit all the models I’ve made since starting this blog almost two months ago. Putting all of these different electoral factors together, what do my models say about the 2024 election?
To answer this question, I compiled the predictions from my earlier blog posts: the economic, polling, incumbency, demographic, campaign finance, and campaign event models. I excluded the predictions from my first blog post because they were not made with any statistical inference method, and because many of my other models already include lagged vote share. All the other models are the same as they appear in their original posts, except the polling model. I replaced my original polling model with the simple regression created for last week’s ensemble model, using the most recent polling averages. Of the two polling models in that ensemble, I chose the one with polling average difference as its only input variable, since it had the lower RMSE of the two. The graph below summarizes how the predictions from each model differ (or not) by state, as well as which states have only five models in their aggregated prediction. As expected, the individual models are split on which candidate they predict to win the most competitive states.
After compiling the state predictions for each of these six models, I calculated, for each state, the unweighted average of the point predictions and of the lower and upper interval bounds. To treat every model equally, I did not attempt to weight the models by out-of-sample error. This type of simple aggregation across models has precedent in the forecasting literature, such as in Mongrain and Stegmaier (2024). Because many states do not have polling data for the 2024 election, these states do not have predictions from the polling model. Their mean prediction is therefore calculated from the five other models only. The results of these calculations are shown below.
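As a rough illustration of the averaging step described above, here is a minimal Python sketch. The model names, state names, and numbers are made up for the example and are not my actual predictions. I also assume each model reports a point prediction with a lower and upper bound, expressed as a Democratic vote share in percent. States missing from a model (for example, states without polls) are simply averaged over the models that do cover them.

```python
from statistics import mean

# Hypothetical, made-up numbers for illustration only.
# Each model maps state -> (point prediction, lower bound, upper bound),
# assumed here to be Democratic vote share in percent.
model_predictions = {
    "economy": {"State A": (51.0, 46.0, 56.0), "State B": (44.0, 39.0, 49.0)},
    "polling": {"State A": (50.0, 48.0, 52.0)},  # no polling data for State B
    "incumbency": {"State A": (49.0, 44.0, 54.0), "State B": (46.0, 41.0, 51.0)},
}


def aggregate(model_predictions):
    """Unweighted mean of predictions and bounds over the models available per state."""
    states = sorted({s for preds in model_predictions.values() for s in preds})
    result = {}
    for state in states:
        # Keep only the models that actually have a prediction for this state.
        available = [preds[state] for preds in model_predictions.values() if state in preds]
        # zip(*available) groups all points, all lowers, and all uppers together.
        point, lower, upper = (mean(column) for column in zip(*available))
        result[state] = {
            "mean": point,
            "lower": lower,
            "upper": upper,
            "n_models": len(available),
        }
    return result


aggregated = aggregate(model_predictions)
for state, summary in aggregated.items():
    print(state, summary)
```

Keeping `n_models` in the output makes it easy to flag which states were averaged over fewer models, which is the same information the graph conveys for states without polling.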
Averaging across this set of fundamentals and campaign-based models, my aggregated model predicts that the Democratic candidate, incumbent Vice President Kamala Harris, will win the 2024 Presidential Election with 292 electoral votes, compared to 246 electoral votes for the Republican candidate, former President Donald Trump. Harris wins by carrying the swing states of Wisconsin, Michigan, Pennsylvania, Nevada, and Georgia. Her victory is narrower than Biden’s in 2020: the predicted vote shares in these states are tighter, and Trump flips Arizona back while retaining North Carolina.
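The step from state-level averages to an electoral vote total can be sketched as follows. This is only an illustration under stated assumptions: the state names, vote shares, and electoral vote counts are made up, and I assume the averaged prediction is a Democratic two-party vote share, so that anything above 50 percent counts as a Democratic win.

```python
def tally_electoral_votes(mean_share, electoral_votes, threshold=50.0):
    """Sum electoral votes by predicted winner (winner-take-all in every state)."""
    dem = sum(ev for state, ev in electoral_votes.items() if mean_share[state] > threshold)
    # Everything not won by the Democrat goes to the Republican,
    # including an exact tie at the threshold (an arbitrary choice for this sketch).
    rep = sum(electoral_votes.values()) - dem
    return dem, rep


# Hypothetical inputs for illustration only.
mean_share = {"State A": 50.4, "State B": 45.0, "State C": 57.2}
electoral_votes = {"State A": 10, "State B": 16, "State C": 4}

dem_total, rep_total = tally_electoral_votes(mean_share, electoral_votes)
print(dem_total, rep_total)  # 14 16
```

The sketch ignores the district-level allocation used in Maine and Nebraska. As a sanity check on the real totals, 292 + 246 = 538, the full size of the Electoral College.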
My aggregate model’s predictions are in general agreement with popular forecasting models such as FiveThirtyEight and Silver Bulletin. Because my methodology differs from these forecasts, it is notable that our models converge. My model suggests that the upcoming election is truly a close contest. As illustrated below, many states could go to either party, and my model predicts extremely close vote shares in the states most crucial to both candidates.
Next week, I will present my final model predictions for the 2024 Presidential Election, using similar aggregation and ensembling methods. Soon we will have answers on how my model performed, and on whether the factors I’ve used to construct my models actually have predictive power.