Which AI predicts the 2026 World Cup better?

Claude, Gemini and OpenAI each predict all 72 group-stage matches. As the real results come in, every correct call scores points. The Brier score measures how accurate and well calibrated each model's outcome probabilities are; lower is better.
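The page does not say which Brier variant it uses. As a rough sketch, assuming the multi-class form over the three outcomes (home win, draw, away win), the score for one match is the sum of squared differences between the predicted probabilities and the actual outcome, averaged over all scored matches. The function names and the outcome ordering below are illustrative, not taken from the site.

```python
from typing import Sequence

# Assumed outcome order (hypothetical): 0 = home win, 1 = draw, 2 = away win.

def brier_score(probs: Sequence[float], outcome_index: int) -> float:
    """Multi-class Brier score for a single match (0 = perfect, 2 = worst)."""
    return sum(
        (p - (1.0 if i == outcome_index else 0.0)) ** 2
        for i, p in enumerate(probs)
    )

def mean_brier(forecasts: Sequence[tuple[Sequence[float], int]]) -> float:
    """Average the per-match Brier score over all scored matches."""
    if not forecasts:
        raise ValueError("no scored matches")
    return sum(brier_score(p, o) for p, o in forecasts) / len(forecasts)

if __name__ == "__main__":
    # A confident forecast that turned out right scores low.
    print(round(brier_score((0.5, 0.3, 0.2), 0), 3))
    # A uniform forecast (1/3 each) scores about 0.667 whatever happens.
    print(round(brier_score((1 / 3, 1 / 3, 1 / 3), 1), 3))
```

Under this assumed variant, a model that always answers one third for each outcome would land at about 0.667, which gives a point of comparison for the values shown on the cards.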

Group stage progress: 72/72 matches

Mode shown: Enriched (other modes: Web, Baseline)

API, no tools + standardized context block (same data for every model).

Claude

leading

130

points · 72 predictions submitted

Exact scores

Correct results

Points accuracy

36.1%

Result accuracy

58.3%

Brier (calibration ↓)

0.516

Scored matches

Gemini

118

points · 72 predictions submitted

Exact scores

Correct results

Points accuracy

32.8%

Result accuracy

58.3%

Brier (calibration ↓)

0.525

Scored matches

OpenAI

123

points · 72 predictions submitted

Exact scores

Correct results

Points accuracy

34.2%

Result accuracy

59.7%

Brier (calibration ↓)

0.518

Scored matches

How scoring works

+5 Exact score
+3 Correct result + one exact side
+2 Correct result only (W/D/L)
+0 Wrong result
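The rules above can be sketched as a small function. This is an illustration under two assumptions: "one exact side" is read as one team's goal count being predicted exactly, and "Points accuracy" is read as points earned divided by the 5-point maximum per match (which matches the displayed figures, e.g. 130 / 360 = 36.1%). The names `match_points` and `points_accuracy` are hypothetical.

```python
Score = tuple[int, int]  # (home goals, away goals)

def outcome(score: Score) -> int:
    """Return 1 for a home win, 0 for a draw, -1 for an away win."""
    home, away = score
    return (home > away) - (home < away)

def match_points(pred: Score, actual: Score) -> int:
    """Points for one prediction under the +5 / +3 / +2 / +0 rules."""
    if pred == actual:
        return 5  # exact score
    if outcome(pred) != outcome(actual):
        return 0  # wrong result
    if pred[0] == actual[0] or pred[1] == actual[1]:
        return 3  # correct result + one exact side (assumed reading)
    return 2      # correct result only (W/D/L)

def points_accuracy(points: int, matches: int) -> float:
    """Share of the maximum attainable points, as a percentage (assumed definition)."""
    return 100.0 * points / (5 * matches)

if __name__ == "__main__":
    print(match_points((2, 1), (2, 1)))         # exact score
    print(match_points((2, 0), (2, 1)))         # right winner, home goals exact
    print(match_points((3, 1), (2, 0)))         # right winner only
    print(match_points((1, 1), (2, 0)))         # wrong result
    print(round(points_accuracy(130, 72), 1))   # share of the 360-point maximum
```

Note that for a predicted draw the +3 tier cannot occur: if the result is a draw and one side's goals match, the other side matches too, so the prediction is an exact score.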

Experiment design

See the prompts →

Web: Chat + live web access, free sourcing.
Baseline: API, no tools, internal model knowledge only.
Enriched: API, no tools + standardized context block (same data for every model).