Agent Lab — a live forecast exhibit · Applied AI
World Cup 2026 · a live forecast agent

World Cup forecasts, with the receipts.

Every match here gets two forecasts. A statistical model — the same method for every team, fit on 47,000+ international results — rates each side’s attack and defense and simulates the whole tournament thousands of times. Then an AI agent takes one fixture at a time: the model’s odds and how sure they are, the betting-market price, and the things the numbers can’t see — injuries, line-ups, weather, altitude, the week’s news — and writes a reasoned call.

Most of the time it agrees with the model; sometimes it nudges it; it always shows its work. Once a match is played we score the call against the model — not because a well-calibrated model is easy to beat, but to see where reading the context actually helps, and to say so plainly.

The agent on an upcoming match
Uruguay v Cape Verde
Model (URU win/draw/loss): 60.6/26.7/12.6 · Agent: 73.3/19.0/7.7 (+12.7 pp)
Likely score: URU 1–0 CPV (1.7–0.3 xG)
Strengthened the lean toward Uruguay.

Uruguay recent form: WDLDD (5 pts from 5 matches). Cape Verde recent form: DLDWW (8 pts from 5 matches).

Grades at kickoff · 6 sources · the full call →

Calibrated on 47,000+ international matches · scored against real results as they come in · the agent weighs in on each fixture as kickoff nears.

How the forecast agent works

1. The model sets the prior

A statistical model rates every team’s attack and defense from 47,000+ international results, weighting recent matches most. It then plays the tournament out thousands of times to get each side’s odds — to win its group, survive the knockouts, lift the trophy. One disciplined method for everyone, no opinions in it.
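To make the "play it out thousands of times" step concrete, here is a minimal sketch of simulating a single match. It assumes each side's goals follow an independent Poisson distribution around an expected-goals figure; the source does not say which goal model it uses or how attack and defense ratings map to expected goals, so `simulate_match`, `poisson_sample` and their parameters are illustrative names, not the exhibit's code.

```python
import math
import random


def poisson_sample(rng, lam):
    """Draw one Poisson-distributed goal count (Knuth's method, fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_match(home_xg, away_xg, n_sims=20000, seed=0):
    """Estimate W/D/L odds for the home side and the most frequent scoreline.

    Assumption: goals are independent Poisson draws. This is a common
    textbook choice, not something the source confirms.
    """
    rng = random.Random(seed)
    counts = {"win": 0, "draw": 0, "loss": 0}
    scorelines = {}
    for _ in range(n_sims):
        h = poisson_sample(rng, home_xg)
        a = poisson_sample(rng, away_xg)
        if h > a:
            counts["win"] += 1
        elif h == a:
            counts["draw"] += 1
        else:
            counts["loss"] += 1
        scorelines[(h, a)] = scorelines.get((h, a), 0) + 1
    probs = {k: v / n_sims for k, v in counts.items()}
    likely = max(scorelines, key=scorelines.get)
    return probs, likely


# Expected goals taken from the Uruguay v Cape Verde card (1.7 and 0.3).
probs, likely = simulate_match(1.7, 0.3)
```

A full tournament simulation would repeat this for every fixture in each simulated bracket and count how often each team wins its group, advances, or lifts the trophy.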

2. The agent reads the live context

For one fixture the agent gets a single brief: the model’s odds and how sure they are, the betting-market price, and what the numbers structurally miss — injuries, line-ups, suspensions, and the match-day conditions that may swing a game: altitude, heat, a roof that closes. It weighs them and writes a reasoned call — usually confirming the model, sometimes nudging it, always citing what tipped it.

3. And it gets graded

Once the match is played we score the call — the W/D/L result and how close the scoreline came. Soccer is hard to call — low-scoring, and national sides barely play together — so we keep expectations honest and show every call, win or lose. The method →
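One way to picture the grading is sketched below. The source says only that the W/D/L result and the closeness of the scoreline are scored; the multi-class Brier score and the goal-difference measure used here are assumptions chosen for illustration, and the function names are hypothetical. The numbers come from the Iran v New Zealand card further down this page.

```python
def result_from_score(home_goals, away_goals):
    """Map a final score to the home side's W/D/L outcome."""
    if home_goals > away_goals:
        return "win"
    if home_goals == away_goals:
        return "draw"
    return "loss"


def brier_score(probs, outcome):
    """Multi-class Brier score: squared error against the one-hot result.

    Lower is better. Assumption: the exhibit's real scoring rule is not stated.
    """
    return sum(
        (p - (1.0 if label == outcome else 0.0)) ** 2
        for label, p in probs.items()
    )


def scoreline_error(called, final):
    """Total goals by which the called scoreline missed the final one."""
    return abs(called[0] - final[0]) + abs(called[1] - final[1])


# Iran v New Zealand: model and agent odds for the home side, final score 2-2.
model = {"win": 0.564, "draw": 0.274, "loss": 0.162}
agent = {"win": 0.623, "draw": 0.240, "loss": 0.137}
final = (2, 2)

outcome = result_from_score(*final)          # "draw"
model_brier = brier_score(model, outcome)    # about 0.871
agent_brier = brier_score(agent, outcome)    # about 0.984
miss = scoreline_error((1, 0), final)        # agent called 1-0
```

Under this rule the agent scores worse than the model on that match, because it moved probability toward an Iran win and the game ended level.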

Every one of the tournament’s 104 fixtures gets both reads — the W/D/L odds (a win, draw or loss for the home side) and a most-likely scoreline — laid out across the group stage and the knockout bracket. It’s a small demonstration of a larger idea: an agent that blends the numbers with the context a model misses and reasons out loud — a sounding board, not an oracle.

The agent, live

All calls →

A few of the latest calls. Each card carries the model’s number, the agent’s adjustment and the one fact behind it; the full agent surface has them all, with sources and grades. Most calls sit close to the baseline — the agent’s sharpest move so far is Uruguay +12.7 pp.
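The adjustment on each card can be read as the change, in percentage points, between the model's W/D/L triple and the agent's. The sketch below assumes the headline figure is the largest absolute shift across the three outcomes; that reading fits the cards on this page but is not confirmed by the source, and `parse_triple` and `largest_shift` are hypothetical helpers.

```python
def parse_triple(text):
    """Parse a 'win/draw/loss' string such as '60.6/26.7/12.6' into floats."""
    win, draw, loss = (float(part) for part in text.split("/"))
    return {"win": win, "draw": draw, "loss": loss}


def largest_shift(model, agent):
    """Return (outcome, shift in pp) for the outcome the agent moved most."""
    shifts = {k: round(agent[k] - model[k], 1) for k in model}
    outcome = max(shifts, key=lambda k: abs(shifts[k]))
    return outcome, shifts[outcome]


# Uruguay v Cape Verde, as printed on the card.
model = parse_triple("60.6/26.7/12.6")
agent = parse_triple("73.3/19.0/7.7")
outcome, shift = largest_shift(model, agent)
```

Because the cards show rounded probabilities, a shift recomputed this way can differ from the printed one in the last digit: Iraq v Norway prints +8.4 pp, while the rounded figures give 8.3.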

Iran v New Zealand · played
Model (IRN win/draw/loss): 56.4/27.4/16.2 · Agent: 62.3/24.0/13.7 (+5.9 pp)
Agent called IRN 1–0 NZL (1.3–0.4 xG) · final 2–2
Strengthened the lean toward Iran.

Iran are DLWWW (+0.62 pts/match vs expectation, opponents avg rank #60); New Zealand are LLWLL (-0.16 pts/match, opponents avg rank #43). Iran are over-performing against weaker opposition; NZL are in line vs stronger opposition.

Worse than the model · call wrong · 9 sources · the full call →

France v Senegal
Model (FRA win/draw/loss): 50.3/28.6/21.1 · Agent: 61.7/23.7/14.7 (+11.4 pp)
Likely score: FRA 1–0 SEN (1.6–0.4 xG)
Strengthened the lean toward France.

France recent form: WWWLW (+0.11 pts/match vs expectation, opponents avg rank #45). Senegal recent form: LWWLD (-0.38 pts/match vs expectation, opponents avg rank #49).

Grades at kickoff · 9 sources · the full call →

Iraq v Norway
Model (IRQ win/draw/loss): 16.9/25.1/58.0 · Agent: 13.0/20.7/66.3 (+8.4 pp)
Likely score: IRQ 0–1 NOR (0.3–1.4 xG)
Strengthened the lean toward Norway.

Iraq recent form: LWWDL (+0.09 pts/match vs expectation, opponents avg rank #72). Norway recent form: WLDWD (+0.28 pts/match vs expectation, opponents avg rank #14).

Grades at kickoff · 9 sources · the full call →

The forecast, in brief

Full 48-team table →

Straight out of the model — attack and defense ratings, run through thousands of simulated tournaments. The top of the field, and the baseline the agent reasons from. Each market’s de-vigged champion price sits beside the model’s, never blended. Top 8 ≈ 70% of the title mass — an unusually flat field.

Columns: # · Team · Champion · Final · Semi · Polymarket · Kalshi
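For readers unfamiliar with "de-vigged" prices, the sketch below shows one common approach: rescale the market's implied probabilities so they sum to 1, which strips out the built-in margin. The source does not say which de-vig method it applies, and the prices in the example are invented placeholders, not real Polymarket or Kalshi quotes.

```python
def devig(prices):
    """Proportional de-vig: normalise implied probabilities to sum to 1.

    Assumption: simple proportional scaling. Other methods exist, and the
    exhibit's own choice is not documented in the text.
    """
    total = sum(prices.values())
    if total <= 0:
        raise ValueError("prices must sum to a positive number")
    return {team: price / total for team, price in prices.items()}


# Hypothetical champion prices (as probabilities) that sum to 1.06.
raw = {"Team A": 0.22, "Team B": 0.18, "Team C": 0.12, "Rest of field": 0.54}
fair = devig(raw)
```

The rescaled figures can then sit beside the model's champion odds for comparison without being blended into them.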

What changed

Champion odds over the last day
How the odds have moved →
