Agent Lab — a live forecast exhibit · Applied AI
World Cup 2026 · a live forecast agent

World Cup forecasts, with the receipts.

Every match here gets two forecasts. A statistical model, using the same method for every team and fit on 47,000+ international results, rates each side’s attack and defense and simulates the whole tournament thousands of times. Then an AI agent takes one fixture at a time. It reads the model’s odds and how uncertain they are, the betting-market price, and the things the numbers can’t see (injuries, line-ups, weather, altitude, the week’s news), and writes a reasoned call.

Most of the time it agrees with the model; sometimes it nudges it; it always shows its work. Once a match is played we score the call against the model — not because a well-calibrated model is easy to beat, but to see where reading the context actually helps, and to say so plainly.

The agent on an upcoming match
Uruguay v Cape Verde
URU · model 61.1/26.5/12.3 · agent 73.3/19.0/7.7 · +12.2 pp
Likely score URU 1–0 CPV · 1.7–0.3 xG
Strengthened the lean toward Uruguay.

Uruguay recent form: WDLDD (6 pts from 5 matches). Cape Verde recent form: DLDWW (8 pts from 5 matches).

Grades at kickoff · 6 sources · the full call →

Calibrated on 47,000+ international matches · scored against real results as they come in · the agent weighs in on each fixture as kickoff nears.

How the forecast agent works

1. The model sets the prior

A statistical model rates every team’s attack and defense from 47,000+ international results, weighting recent matches most. It then plays the tournament out thousands of times to get each side’s odds — to win its group, survive the knockouts, lift the trophy. One disciplined method for everyone, no opinions in it.
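As a rough illustration of "rate attack and defense, then simulate", here is a minimal sketch for a single match. It assumes goals follow independent Poisson distributions and that recency is handled with a half-life weight; the page states neither detail, so `recency_weight`, `sample_poisson`, `simulate_match` and every parameter value are hypothetical.

```python
import math
import random

def recency_weight(age_days, half_life_days=365.0):
    """Hypothetical weighting: a result loses half its weight every half-life."""
    return 0.5 ** (age_days / half_life_days)

def sample_poisson(rng, mean):
    """Draw one Poisson-distributed goal count (Knuth's multiplication method)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def simulate_match(home_xg, away_xg, n_sims=20000, seed=0):
    """Estimate (home win, draw, away win) probabilities by simulation.

    home_xg and away_xg are expected goals; in a fuller model they would be
    derived from each side's attack and defense ratings (not shown here).
    """
    rng = random.Random(seed)
    wins = draws = losses = 0
    for _ in range(n_sims):
        home_goals = sample_poisson(rng, home_xg)
        away_goals = sample_poisson(rng, away_xg)
        if home_goals > away_goals:
            wins += 1
        elif home_goals == away_goals:
            draws += 1
        else:
            losses += 1
    return wins / n_sims, draws / n_sims, losses / n_sims

# Example inputs only: expected goals of 1.7 and 0.3.
win, draw, loss = simulate_match(1.7, 0.3)
print(f"W/D/L: {win:.1%} / {draw:.1%} / {loss:.1%}")
```

A full tournament simulation would repeat this for every fixture in a bracket and count how often each team advances; the single-match version only shows the core loop.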

2. The agent reads the live context

For one fixture the agent gets a single brief: the model’s odds and how sure they are, the betting-market price, and what the numbers structurally miss — injuries, line-ups, suspensions, and the match-day conditions that may swing a game: altitude, heat, a roof that closes. It weighs them and writes a reasoned call — usually confirming the model, sometimes nudging it, always citing what tipped it.

3. And it gets graded

Once the match is played we score the call: the W/D/L result and how close the scoreline came. Soccer is hard to call (low-scoring, and national sides barely play together), so we keep expectations honest and show every call, win or lose.

The method →
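One way such grading could work is sketched below. The page does not say which scoring rule it uses, so the Brier score for the W/D/L odds and the absolute goal difference for the scoreline are both assumptions, and `brier_score` and `scoreline_error` are hypothetical names. The numbers come from the Canada v Qatar card further down the page.

```python
def brier_score(probs_pct, outcome):
    """Multi-class Brier score for W/D/L percentages; lower is better.

    probs_pct is (win, draw, loss) for the home side in percent;
    outcome is "W", "D" or "L" from the home side's point of view.
    """
    labels = ("W", "D", "L")
    return sum(
        (pct / 100.0 - (1.0 if label == outcome else 0.0)) ** 2
        for label, pct in zip(labels, probs_pct)
    )

def scoreline_error(called, final):
    """Total goals by which the called scoreline missed the final one."""
    return abs(called[0] - final[0]) + abs(called[1] - final[1])

# Canada v Qatar: a home win, final score 6-0, agent called 2-0.
model_odds = (66.1, 18.6, 14.6)
agent_odds = (69.0, 19.7, 11.3)

print("model Brier:", round(brier_score(model_odds, "W"), 4))
print("agent Brier:", round(brier_score(agent_odds, "W"), 4))
print("scoreline error:", scoreline_error((2, 0), (6, 0)))
```

Under this assumed rule the agent's lower Brier score would agree with the card's "better than the model" grade, while the scoreline error shows the call was still four goals off.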

Every one of the tournament’s 104 fixtures gets both reads — the W/D/L odds (a win, draw or loss for the home side) and a most-likely scoreline — laid out across the group stage and the knockout bracket. It’s a small demonstration of a larger idea: an agent that blends the numbers with the context a model misses and reasons out loud — a sounding board, not an oracle.

The agent, live

All calls →

A few of the latest calls. Each card carries the model’s number, the agent’s adjustment and the one fact behind it; the full agent surface has them all, with sources and grades. Most calls sit close to the baseline — the agent’s sharpest move so far is Portugal +13.2 pp.
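To make the card arithmetic concrete, the sketch below reads two W/D/L triples and reports the shift in the home side's win probability in percentage points (pp). The helper names `parse_triple` and `home_shift_pp` are hypothetical, and it assumes the first triple on a card is the model's and the second is the agent's, which matches the Uruguay v Cape Verde card above.

```python
def parse_triple(text):
    """Turn a 'W/D/L' string such as '61.1/26.5/12.3' into three floats (percent)."""
    win, draw, loss = (float(part) for part in text.split("/"))
    return win, draw, loss

def home_shift_pp(model_text, agent_text):
    """Agent's home-win probability minus the model's, in percentage points."""
    model_win, _, _ = parse_triple(model_text)
    agent_win, _, _ = parse_triple(agent_text)
    return round(agent_win - model_win, 1)

# Uruguay v Cape Verde: model 61.1/26.5/12.3, agent 73.3/19.0/7.7
print(home_shift_pp("61.1/26.5/12.3", "73.3/19.0/7.7"))  # 12.2
```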

Canada v Qatar
played
CAN · model 66.1/18.6/14.6 · agent 69.0/19.7/11.3 · +2.9 pp
Agent called CAN 2–0 QAT · 1.6–0.4 xG · final 6–0
Strengthened the lean toward Canada.

Canada WDDWD, in line with expectations, against opponents of average rank #55; Qatar DLLDD, under-performing by 0.49 pts/match, against opponents of average rank #70.

Better than the model · call right · 9 sources · the full call →

Mexico v South Korea
MEX · model 48.5/26.8/24.7 · agent 50.7/27.0/22.3 · +2.2 pp
Likely score MEX 1–0 KOR · 1.0–0.5 xG
Strengthened the lean toward Mexico.

Mexico DWWWW, +1.09 pts/match vs expectation, against opponents of average rank #38; South Korea LLWWW, +0.53 pts/match vs expectation, against opponents of average rank #75. Both over-perform, but Mexico's form is stronger and came against tougher opponents.

Grades at kickoff · 9 sources · the full call →

United States v Australia
USA · model 40.5/27.3/32.2 · agent 41.7/27.7/30.7 · +1.2 pp
Likely score USA 1–0 AUS · 0.9–0.7 xG
Strengthened the lean toward the United States.

USA are LLWLW, +0.38 pts/match vs expectation, against opponents of average rank #16; AUS are WWLDW, +0.43 pts/match vs expectation, against opponents of average rank #48. Both over-perform, but the USA faced much stronger opposition, which makes their form more impressive.

Grades at kickoff · 6 sources · the full call →

The forecast, in brief

Full 48-team table →

Straight out of the model — attack and defense ratings, run through thousands of simulated tournaments. The top of the field, and the baseline the agent reasons from. Each market’s de-vigged champion price sits beside the model’s, never blended. Top 8 ≈ 70% of the title mass — an unusually flat field.
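For readers unfamiliar with "de-vigged" prices, here is a minimal sketch of one common approach, proportional normalization. Raw market prices across all outcomes usually sum to more than 1 because of the market's margin, and dividing each by the total removes that excess. The page does not say which de-vig method it applies, so this choice is an assumption, and the team labels and prices below are made up for illustration.

```python
def devig(raw_prices):
    """Rescale raw implied probabilities so they sum to 1.

    raw_prices maps each outcome to its market-implied probability
    (for a prediction market, the contract price on a 0 to 1 scale).
    """
    total = sum(raw_prices.values())
    if total <= 0:
        raise ValueError("prices must sum to a positive number")
    return {name: price / total for name, price in raw_prices.items()}

# Hypothetical champion prices; they sum to 1.08, an 8% overround.
raw = {"Team A": 0.27, "Team B": 0.21, "Rest of field": 0.60}
fair = devig(raw)
for name, prob in fair.items():
    print(f"{name}: {prob:.1%}")
```

Keeping the de-vigged market price next to the model's number rather than blending them, as the page describes, lets a reader see where the two disagree.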

# Team Champion Final Semi Polymarket Kalshi

What changed

Champion odds over the last day
Champion odds, on the move
How the odds have moved →
