How well did election models perform in the 2022 midterms?

(Chart: FiveThirtyEight Senate forecasts over time)

We’re looking at the state-level election forecasts from FiveThirtyEight, The Economist, and Decision Desk HQ (DDHQ) for the 2022 U.S. Senate races.  The data was captured at 10:30 am on the morning of Election Day.

Since ballots are still being counted, the Georgia runoff isn’t until December 6th, and litigation is inevitable, we’re assuming Democrats hold on to win in Georgia, Nevada, and Arizona.  If a surprise does happen somewhere, this analysis can always be revisited.

Rather than attempt to classify Evan McMullin in Utah, we’re measuring each model’s probability for the Republican candidate.  Alaska (ranked-choice voting) and Louisiana (jungle primary) had multiple Republican candidates on the November 8th ballot, so the probabilities measured for these two states are the sum of the probabilities for all Republican candidates in each race.
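
As a concrete illustration, here is a minimal Python sketch (with made-up probabilities, not any model’s actual output) of how a multi-candidate race is collapsed into a single Republican win probability:

```python
from collections import defaultdict

# Hypothetical model output: (state, candidate, party, win probability).
forecasts = [
    ("AK", "Murkowski", "R", 0.62),
    ("AK", "Tshibaka",  "R", 0.37),
    ("AK", "Chesbro",   "D", 0.01),
    ("UT", "Lee",       "R", 0.95),
    ("UT", "McMullin",  "I", 0.05),
]

# Sum the win probabilities of every Republican candidate in each state.
republican_prob = defaultdict(float)
for state, _candidate, party, prob in forecasts:
    if party == "R":
        republican_prob[state] += prob

print(dict(republican_prob))  # roughly {'AK': 0.99, 'UT': 0.95}
```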

The three models’ state-level outputs for all 35 Senate races are measured in two ways:

  1. Brier score.  Truth is quantified as either 1 (Republican won the election) or 0 (Democrat won the election).  For each race, the error is the difference between the actual outcome and the model’s probability of the Republican candidate(s) winning.  The average of the squared errors across all races is the model’s Brier score.  Brier scores range from 0 to 1, and lower is better.
  2. Predicting the Senate race winner.  A threshold of 0.5 on the Republican win probability determines which candidate the model picked (both metrics are sketched in code after this list).
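
To make the two metrics concrete, here is a minimal sketch with made-up numbers rather than any model’s actual forecasts:

```python
# 1 = Republican won, 0 = Democrat won (hypothetical results in four races).
outcomes = [1, 0, 1, 0]
gop_win_prob = [0.90, 0.35, 0.55, 0.60]  # hypothetical forecast probabilities

# Brier score: the average squared difference between outcome and probability.
brier = sum((o - p) ** 2 for o, p in zip(outcomes, gop_win_prob)) / len(outcomes)

# Winner calls: a probability above 0.5 means the model picked the Republican.
calls = [1 if p > 0.5 else 0 for p in gop_win_prob]
correct = sum(c == o for c, o in zip(calls, outcomes))

print(f"Brier score: {brier:.4f}, correct calls: {correct}/{len(outcomes)}")
```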

Here is how the models compare with Warnock winning the Georgia runoff:

Model               Brier score   Correct
The Economist       0.0326        33/35
FiveThirtyEight     0.0367        32/35
Decision Desk HQ    0.0373        33/35

The Brier scores for all three models were extremely close together, with the highest and lowest only 0.0047 apart.

The three models predicted the same winner (using the 0.5 threshold) in 33 races.  They called 32 of those correctly; the lone shared miss was Nevada, where all three models wrongly predicted Laxalt to beat Cortez Masto.

The two states where the models differed were Georgia and Pennsylvania:

  • In Georgia, DDHQ’s model forecast Warnock while The Economist and FiveThirtyEight projected Walker.
  • In Pennsylvania, The Economist predicted Fetterman while DDHQ and FiveThirtyEight predicted Oz.

At the end of the day, The Economist had the lowest (again, meaning best) Brier score, while The Economist and DDHQ tied for the most races correctly called (33 out of 35).

The Economist’s Brier score benefited from the certainty of its predictions, with 13 of its state forecasts pegged at exactly 0 or 1.  In contrast, DDHQ had absolute certainty in only one race (Republicans winning in North Dakota), while FiveThirtyEight had none.
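
A quick worked example shows why that certainty cuts both ways under the Brier score:

```python
def contribution(outcome, prob):
    """Squared error a single race adds (before averaging) to the Brier score."""
    return (outcome - prob) ** 2

print(contribution(1, 1.0))  # 0.0   -- a correct 100% forecast adds no error
print(contribution(1, 0.9))  # ~0.01 -- a hedged-but-correct forecast adds a little
print(contribution(0, 1.0))  # 1.0   -- a wrong 100% forecast adds the maximum possible
```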

Given the shock of Election Day, maybe nothing in politics should be taken for granted.  Here is how the models compare with Walker winning the Georgia runoff:

Model               Brier score   Correct
The Economist       0.0307        34/35
FiveThirtyEight     0.0291        33/35
Decision Desk HQ    0.0385        32/35

If Walker wins, The Economist will end up having correctly predicted the most Senate races, while FiveThirtyEight will have the lowest Brier score.
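
Because the two tables differ only in the assumed Georgia outcome, the comparison amounts to a simple sensitivity check.  Here is a sketch of that check, again with placeholder probabilities rather than the models’ actual forecasts:

```python
# Placeholder forecasts for three close races (probability that the Republican wins).
gop_win_prob = {"GA": 0.55, "NV": 0.60, "PA": 0.45}
decided = {"NV": 0, "PA": 0}  # hypothetical settled outcomes: 1 = R won, 0 = D won

def score(outcomes, probs):
    """Return (Brier score, number of correct 0.5-threshold calls)."""
    errors = [(outcomes[s] - probs[s]) ** 2 for s in outcomes]
    correct = sum((probs[s] > 0.5) == (outcomes[s] == 1) for s in outcomes)
    return sum(errors) / len(errors), correct

# Recompute both metrics with the one undecided race (Georgia) flipped.
for ga_result, label in [(0, "Warnock wins"), (1, "Walker wins")]:
    outcomes = {**decided, "GA": ga_result}
    brier, correct = score(outcomes, gop_win_prob)
    print(f"{label}: Brier {brier:.4f}, correct {correct}/{len(outcomes)}")
```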