Which AI predicts the 2026 World Cup better?

Claude, Gemini and OpenAI each predict all 72 group-stage matches. As the real results come in, every correct call scores points. The Brier score measures how well each model's probabilities are calibrated — lower is better.
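
The page does not say which form of the Brier score it uses. As a rough sketch, assuming the multi-class form over the three outcomes (home win, draw, away win) averaged across matches, it could be computed as follows. The function names and the `"H"`/`"D"`/`"A"` keys are illustrative, not taken from the site.

```python
from typing import Dict, List, Tuple

# Assumption: outcomes are keyed "H" (home win), "D" (draw), "A" (away win).
Forecast = Dict[str, float]


def brier_score(probs: Forecast, outcome: str) -> float:
    """Multi-class Brier score for one match: the sum of squared differences
    between the forecast probabilities and the one-hot actual outcome."""
    return sum(
        (p - (1.0 if key == outcome else 0.0)) ** 2
        for key, p in probs.items()
    )


def mean_brier(forecasts: List[Tuple[Forecast, str]]) -> float:
    """Average the per-match Brier score over all scored matches."""
    return sum(brier_score(p, o) for p, o in forecasts) / len(forecasts)
```

Under this assumed form, a perfect forecast scores 0, a confident wrong forecast scores 2, and a uniform forecast of one third per outcome scores about 0.667.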

Group stage progress: 72/72 matches

Variant shown: Baseline (API, no tools; internal model knowledge only).

Claude (leading)

136 points · 72 predictions submitted

Exact scores

Correct results

Points accuracy: 37.8%

Result accuracy: 59.7%

Brier (calibration ↓): 0.519

Scored matches

Gemini

125 points · 72 predictions submitted

Exact scores

Correct results

Points accuracy: 34.7%

Result accuracy: 59.7%

Brier (calibration ↓): 0.511

Scored matches

OpenAI

134 points · 72 predictions submitted

Exact scores

Correct results

Points accuracy: 37.2%

Result accuracy: 59.7%

Brier (calibration ↓): 0.520

Scored matches

How scoring works

+5 Exact score
+3 Correct result + one exact side
+2 Correct result only (W/D/L)
+0 Wrong result
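
The scoring table above can be sketched in a few lines. This is an illustration, not the site's actual code: it assumes "one exact side" means that exactly one team's goal count was predicted correctly, and that points accuracy is total points divided by the maximum possible (5 per match), which is consistent with the figures shown (136 / 360 ≈ 37.8%). All function names are hypothetical.

```python
from typing import Tuple

Score = Tuple[int, int]  # (home goals, away goals)


def outcome(score: Score) -> str:
    """Return 'H' (home win), 'D' (draw) or 'A' (away win) for a scoreline."""
    home, away = score
    if home > away:
        return "H"
    if home < away:
        return "A"
    return "D"


def score_prediction(pred: Score, actual: Score) -> int:
    """Points for one prediction under the 5/3/2/0 scheme."""
    if pred == actual:
        return 5  # exact score
    if outcome(pred) != outcome(actual):
        return 0  # wrong result
    # Assumption: "one exact side" = one team's goal count matches exactly.
    if pred[0] == actual[0] or pred[1] == actual[1]:
        return 3  # correct result + one exact side
    return 2  # correct result only


def points_accuracy(total_points: int, matches: int, max_per_match: int = 5) -> float:
    """Assumed definition: share of the maximum achievable points."""
    return total_points / (matches * max_per_match)
```

For example, predicting 2-0 when the match ends 2-1 would earn 3 points under this reading, while predicting 3-1 for a 2-0 result would earn 2. A correctly predicted draw can never earn 3, because matching one side of a draw means matching both.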

Experiment design


Web: Chat + live web access, free sourcing.
Baseline: API, no tools, internal model knowledge only.
Enriched: API, no tools + standardized context block (same data for every model).