Subpar Forecasting, 2022!

It’s that time of year again. Last year, you all participated in our second annual subparforecasting competition, where you made probabilistic predictions on some things that might happen in 2022. The results are now in, and this is our summary of how things went.

You also might want to look at last year’s results for comparison’s sake.

What is this about?

Subparforecasting is a joking reference to superforecasting, which comes from Philip Tetlock’s research on how to find and develop a group of people who are unusually good at making predictions.

A key part of Tetlock’s process is getting feedback: making predictions, seeing how they work out, and learning from the process. Matthew Yglesias did a nice summary of this in his piece on how to be less full of shit. You can also look at his follow-up post on how his own predictions fared last year.

Scoring

We scored everyone’s submissions as a way of generating that feedback, and we’re using almost the same approach as last year.

Last year we used the logarithmic scoring rule. For log scoring, you should think of the probability p of a prediction as a number between 0 and 1, with 0 representing 0%, and 1 representing 100%.

A traditional log score is then defined as follows.

\text{score}(p) = \begin{cases} \ln(p), & \text{if the outcome was true} \\ \ln(1-p), & \text{if the outcome was false} \end{cases}

We’re essentially using log-scoring, but we’re subtracting out \ln(0.5) from the score (which increases the score, since \ln(0.5) is negative!). As a result, your score is always zero if your prediction is 50%, and higher scores are better. This reflects the fact that we treat a 50% prediction as a kind of a baseline, as we’ll discuss below.
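In other words, the score we’re actually using is:

\text{score}(p) = \begin{cases} \ln(p) - \ln(0.5), & \text{if the outcome was true} \\ \ln(1-p) - \ln(0.5), & \text{if the outcome was false} \end{cases}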

This is a bit easier to understand with an example. Say you predicted that a given outcome was 90% likely to happen, and it did. Then your score would be \ln(0.9) - \ln(0.5) = 0.59. If it didn’t happen, your score would be \ln(0.1) - \ln(0.5) = -1.61.
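If you want to check the arithmetic yourself, here’s a minimal Python sketch of the scoring rule described above. (The function name and the handling of 0% and 100% predictions are our own choices here, not the actual scoring code.)

    import math

    def adjusted_log_score(p, outcome):
        # Probability the predictor put on what actually happened.
        prob_of_actual = p if outcome else 1.0 - p
        # An absolutely confident prediction that misses puts probability 0 on
        # the actual outcome, and ln(0) is negative infinity.
        if prob_of_actual == 0.0:
            return float("-inf")
        # Shift by ln(0.5) so that a 50% prediction scores exactly zero.
        return math.log(prob_of_actual) - math.log(0.5)

    print(round(adjusted_log_score(0.9, True), 2))   # 0.59
    print(round(adjusted_log_score(0.9, False), 2))  # -1.61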

Absolutely confident predictions

By “absolutely confident”, we mean a prediction of either 0% or 100%. An absolutely confident prediction that turns out wrong should in some sense be infinitely surprising! And log scoring says as much, punishing you infinitely for such a prediction.

There were a lot fewer absolutely confident predictions this year: just a single one, rather than the dozen or so we saw last year.

As you might remember, 50% of the absolutely confident predictions were wrong in 2021. This year, the one fully confident prediction turned out to be wrong.

Handling non-predictions

As we explained last year, not making a prediction is treated the same as assigning a probability of 50%, which scores exactly zero. That’s why you’ll notice in our rankings that everyone is counted as having made the same number of predictions.
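Concretely, here’s a sketch of how a person’s overall score can be computed, reusing the adjusted_log_score function from the sketch above. (The data layout, with predictions and outcomes stored as dictionaries keyed by question, is our own assumption.)

    def average_score(predictions, outcomes):
        # Skipped questions are treated as 50% predictions, which contribute
        # exactly zero; the ranking is each person's average score.
        total = 0.0
        for question, outcome in outcomes.items():
            p = predictions.get(question, 0.5)
            total += adjusted_log_score(p, outcome)
        return total / len(outcomes)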

The rankings

Here are our rankings. Each person’s score is just the average of their scores across all of the questions.

name score
Franco Baseggio 0.305
Sharon Fenick 0.255
Michael Farbiarz 0.248
Jonas Peters 0.246
Nick Salter (and family) 0.246
Dmitry Gorenburg 0.245
Blanca 0.234
Dianne Newman 0.220
Zev Isaac Minsky-Primus 0.207
Yaron Minsky 0.204
Gabriel Farbiarz 0.188
Richard Primus 0.182
Averagey McAverageface 0.181
Lois 0.181
Ida Gorenburg 0.172
Eyal Minsky-Fenick 0.153
Lucas Kimball 0.141
Yair 0.124
Lisa 0.123
Ada Fenick 0.122
Alan Promer 0.122
Sigal Minsky-Primus 0.114
Martha A. Escobar 0.105
Megan Lewis 0.097
Romana Primus 0.092
NAFTALY MINSKY 0.086
Eitan Minsky-Fenick 0.085
Sarah Williams 0.083
Shula Minsky 0.078
Jacob Gorenburg 0.077
Monse 0.046
David R H Miller 0.040
Jeremy Dauber 0.033
Sam Wurzel 0.022
Sarah Farbiarz 0.001
Michelle Fisher -0.006
Sally Gottesman -0.011
Debra Fine -0.030
Daniel Primus Cohen -0.052
Elana Farbiarz -0.068
Eli Cohen -0.096
Belinda -0.109
Nava Minsky-Primus -0.164
Sanjyot Dunung -0.354
nancy koziol -inf


In order to represent the wisdom of the crowds, we added a synthetic player, Averagey McAverageface. Averagey’s prediction on any particular question is just the average of everyone else’s predictions on that question. Averagey did pretty well this year, though he didn’t make it to the very top of the ranking. That’s pretty much in line with last year.
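For the curious, here’s roughly how such a synthetic player can be computed, again as a sketch under our own assumptions about how the data is stored (a dictionary mapping each real player to their predictions):

    def averagey_prediction(everyone, question):
        # Averagey's prediction is the mean of all the real players'
        # probabilities on this question (skipped answers counted as 50%,
        # matching how they're scored).
        probs = [preds.get(question, 0.5) for preds in everyone.values()]
        return sum(probs) / len(probs)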

Another thing you can look at is how many of the scores are positive. A positive score means doing better than someone who didn’t answer any questions, which in turn is the same as guessing 50% for everything.

Last year, most people’s predictions were in negative territory, but this year, a solid majority of people are positive. That’s a sign of real progress!

Visualizing the predictions

The following graphic lets you visualize the data and see how the predictors did collectively and individually.

This visualization works best on an ordinary computer, not a tablet or phone. That’s because you need to hover with your mouse in order to uncover more information. To hover over something, move the tip of your mouse pointer over the element in question and leave it there for a moment!

Here’s the visualization: