An analysis of 6,314 pitcher-seasons from the past decade
High-leverage relievers sit in a valuation gap between baseball's two primary value metrics. WAR systematically undervalues elite relievers while WPA systematically overvalues them — and the truth lies somewhere in between.
The Core Tension: WAR says high-leverage relievers are barely worth noticing. WPA says they’re the most impactful pitchers in baseball. How should this be reconciled?
WAR (Wins Above Replacement): Measures pitcher value in terms of performance (FIP) and volume (IP). Isolates underlying skill but doesn’t fully account for leverage.
WPA (Win Probability Added): Measures actual impact on win probability, fully incorporating timing and context. But it absorbs managerial usage, sequencing luck, and situational noise the pitcher doesn’t control.
pLI (average Leverage Index): Quantifies the average importance of the game situations in which a pitcher is used. Leverage is determined by game flow and managerial deployment, not by the reliever himself.
Starting pitcher (SP): Games Started ≥ 5 and Innings Pitched ≥ 20 in a season.
Relief pitcher (RP): Games ≥ 5 and Games Started < 3 in a season. Pitchers meeting neither definition are excluded.
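The two role definitions above can be written as a small rule. This is a minimal sketch, not the code behind the analysis: the function name `classify_role` and its argument names are illustrative assumptions, and only the thresholds come from the text.

```python
from typing import Optional


def classify_role(games: int, games_started: int, innings_pitched: float) -> Optional[str]:
    """Label one pitcher-season as 'SP', 'RP', or None (excluded)."""
    # Starter: Games Started >= 5 and Innings Pitched >= 20.
    if games_started >= 5 and innings_pitched >= 20:
        return "SP"
    # Reliever: Games >= 5 and Games Started < 3.
    if games >= 5 and games_started < 3:
        return "RP"
    # Seasons meeting neither definition are dropped from the dataset.
    return None


# Hypothetical seasons, for illustration only.
print(classify_role(games=32, games_started=32, innings_pitched=190.0))
print(classify_role(games=60, games_started=0, innings_pitched=58.0))
print(classify_role(games=10, games_started=4, innings_pitched=30.0))
```

Because a starter needs at least 5 starts and a reliever fewer than 3, no season can satisfy both rules. Swingmen with 3 or 4 starts fall through to the excluded group.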
Before identifying high-leverage relievers, it’s worth seeing just how differently starters and relievers are deployed. The violin plots below show the distributions of Innings Pitched and pLI across all individual pitcher-seasons — not aggregated careers, but each season as its own data point.
SPs dominate the workload — a typical SP season covers 3–4× the innings of a typical RP season, which is precisely why WAR favors them so heavily.
SPs cluster around league-average pLI (~1.0), but RPs show enormous variance. A substantial group consistently operates well above that baseline — the high-leverage arms deployed in the highest-stakes situations.
To move beyond per-season snapshots, we aggregate each pitcher’s career totals across all seasons in the dataset, then apply k-means clustering (k = 3) on career-average pLI to separate relievers into three distinct tiers. Because pLI captures how often a pitcher is deployed in high-stakes situations, this clustering effectively identifies the subset of relievers whose managers trust them with the game on the line — the closers, setup men, and firemen who anchor a bullpen. Under this framework, the cutoff between mid- and high-leverage sits at a pLI of 1.08, corresponding to the 10-year average leverage of pitchers like Giovanny Gallegos, Tyler Ferguson, and Joe Jiménez.
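For readers who want to see the mechanics of the tiering step, here is a minimal sketch of one-dimensional k-means (Lloyd's algorithm) with k = 3. It is not the implementation used in the analysis: the quantile-based initialisation, the function names, and the sample pLI values are all assumptions made for illustration, and taking the cutoff as the midpoint between neighbouring centroids is one common convention rather than a confirmed detail of the article. The hypothetical data will not reproduce the 1.08 cutoff.

```python
def kmeans_1d(values, k=3, iterations=100):
    """Lloyd's algorithm on scalar data; returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic start: place the initial centroids at evenly spaced quantiles.
    centroids = [ordered[int((i + 0.5) * len(ordered) / k)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(sum(members) / len(members) if members else centroids[j])
        if updated == centroids:  # converged
            break
        centroids = updated
    return centroids, labels


def tier_cutoffs(centroids):
    """Boundaries between adjacent tiers: midpoints of neighbouring centroids."""
    ordered = sorted(centroids)
    return [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]


# Hypothetical career-average pLI values, not taken from the dataset.
career_pli = [0.55, 0.60, 0.70, 0.95, 1.00, 1.05, 1.60, 1.80, 1.90]
centroids, labels = kmeans_1d(career_pli, k=3)
low_mid_cutoff, mid_high_cutoff = tier_cutoffs(centroids)
print(labels)
print(round(mid_high_cutoff, 2))
```

In one dimension, nearest-centroid assignment always splits the pLI axis into contiguous intervals, so the tiers are fully described by two cutoff values.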
At their core, WAR and WPA answer different questions. WAR asks how much total value a player provides over a season, independent of context, while WPA asks who actually swings games in real time. The divergence between these lenses is nowhere more apparent than in the evaluation of elite relievers.
WAR and WPA both measure value, but they rest on different baselines, making direct scalar comparison invalid. Still, contrasting the stories they tell reveals important structural insights.
Consider the asymmetry: Zack Wheeler throws six strong innings with a comfortable lead and shifts win probability by only a few percentage points. Jhoan Duran enters in the eighth with the tying run on third, strikes out two batters, and swings win probability by 30 to 40 percentage points in two minutes. Both performances require real skill, but WPA rewards Duran’s two minutes far more than Wheeler’s six innings, not because Duran is better, but because the situation carried more weight.
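The asymmetry can be made concrete with a small calculation. WPA credits a pitcher with the change in his team's win probability across each event he is involved in, and those changes are summed. The sketch below uses invented win-probability values chosen only to match the magnitudes described above. They are not real game data, and the function name `wpa` is an illustrative assumption.

```python
def wpa(events):
    """Sum the win-probability change over (before, after) pairs."""
    return sum(after - before for before, after in events)


# Hypothetical: a starter protects a comfortable lead for six innings,
# nudging win probability up by about a point per inning.
starter_events = [
    (0.90, 0.91), (0.91, 0.92), (0.92, 0.93),
    (0.93, 0.94), (0.94, 0.95), (0.95, 0.96),
]

# Hypothetical: a reliever enters with the tying run on third
# and records two strikeouts.
reliever_events = [(0.55, 0.72), (0.72, 0.90)]

print(f"starter WPA:  {wpa(starter_events):+.2f}")
print(f"reliever WPA: {wpa(reliever_events):+.2f}")
```

Under these assumed numbers, two batters faced are worth several times as much WPA as six innings, purely because each out moved win probability further.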
Each dot below is a single pitcher-season. SPs spread horizontally (high WAR, moderate WPA) — their volume drives skill-based value but dilutes game-context impact. High-leverage RPs spread vertically (low WAR, wide WPA range) — their value is concentrated in high-stakes moments but inflated by sequencing and managerial deployment. The gap between the two clusters is the valuation no-man’s-land where neither metric tells the full story.
Zooming out from individual seasons to aggregate totals over the full decade sharpens the contrast. The bar chart and leaderboard below show cumulative WAR and WPA for all SPs vs. the high-leverage reliever tier identified above.
By WPA, high-leverage relievers as a group outproduce all starting pitchers combined. By WAR, they’re worth only a fraction of the starters’ total.
The leaderboard below combines SPs and high-leverage RPs, sorted by total WPA. It’s telling that relievers appear so prominently on this list, alongside aces who threw roughly two to three times as many innings (Max Scherzer’s 1,492.2 IP against Josh Hader’s 510.0, for example). This suggests that elite high-leverage performance by individual relievers persists well beyond short-term variance.
| # | Player | Role | WPA | WAR | IP | pLI | Seasons |
|---|---|---|---|---|---|---|---|
| 1 | Max Scherzer | SP | 24.7 | 40.9 | 1492.2 | 0.96 | 10 |
| 2 | Justin Verlander | SP | 22.1 | 34.9 | 1449.4 | 0.96 | 8 |
| 3 | Jacob deGrom | SP | 21.7 | 37.5 | 1196.5 | 0.95 | 9 |
| 4 | Josh Hader | RP | 21.5 | 13.4 | 510.0 | 1.84 | 9 |
| 5 | Kenley Jansen | RP | 20.3 | 14.7 | 585.0 | 1.85 | 10 |
| 6 | Clayton Kershaw | SP | 20.0 | 31.9 | 1242.0 | 0.94 | 10 |
| 7 | Gerrit Cole | SP | 18.8 | 36.4 | 1489.5 | 0.94 | 9 |
| 8 | Chris Sale | SP | 18.3 | 35.6 | 1192.2 | 0.96 | 8 |
| 9 | Zack Wheeler | SP | 17.5 | 37.6 | 1441.6 | 0.96 | 9 |
| 10 | Raisel Iglesias | RP | 16.2 | 13.2 | 639.5 | 1.66 | 10 |
| 11 | Aroldis Chapman | RP | 15.0 | 14.2 | 500.0 | 1.80 | 10 |
| 12 | Blake Snell | SP | 14.7 | 26.4 | 1156.6 | 0.99 | 10 |
| 13 | Devin Williams | RP | 14.3 | 9.0 | 295.8 | 1.87 | 7 |
| 14 | Edwin Diaz | RP | 14.0 | 15.4 | 517.0 | 1.97 | 9 |
| 15 | Max Fried | SP | 13.3 | 23.6 | 1051.1 | 0.96 | 8 |
| 16 | Corbin Burnes | SP | 12.6 | 22.6 | 917.6 | 0.97 | 7 |
| 17 | Shane Bieber | SP | 12.4 | 21.1 | 869.7 | 0.98 | 7 |
| 18 | David Robertson | RP | 12.3 | 7.9 | 435.1 | 1.73 | 9 |
| 19 | Brandon Woodruff | SP | 11.2 | 17.4 | 700.8 | 0.95 | 7 |
| 20 | Felipe Vazquez | RP | 10.5 | 7.3 | 282.1 | 1.62 | 4 |
| 21 | Blake Treinen | RP | 10.4 | 8.5 | 455.2 | 1.58 | 9 |
| 22 | Mike Clevinger | SP | 10.3 | 14.2 | 791.8 | 0.93 | 8 |
| 23 | Aaron Nola | SP | 10.1 | 36.7 | 1635.9 | 0.95 | 10 |
| 24 | Scott Barlow | RP | 10.1 | 5.9 | 454.4 | 1.34 | 8 |
| 25 | Emmanuel Clase | RP | 10.0 | 9.9 | 357.9 | 1.80 | 6 |
| 26 | Framber Valdez | SP | 9.9 | 20.4 | 1078.8 | 0.97 | 8 |
| 27 | Ranger Suarez | SP | 9.9 | 15.5 | 741.6 | 0.96 | 6 |
| 28 | Kevin Gausman | SP | 9.7 | 33.9 | 1635.1 | 0.95 | 10 |
| 29 | Corey Kluber | SP | 9.7 | 21.9 | 967.4 | 0.92 | 7 |
| 30 | Zack Britton | RP | 9.6 | 4.3 | 242.5 | 1.67 | 6 |
| 31 | Shohei Ohtani | SP | 9.5 | 13.9 | 526.3 | 1.01 | 5 |
| 32 | Jhoan Duran | RP | 9.5 | 5.9 | 253.4 | 1.90 | 4 |
| 33 | Jordan Romano | RP | 9.4 | 3.4 | 270.7 | 1.81 | 7 |
| 34 | Zac Gallen | SP | 9.4 | 17.6 | 1007.1 | 0.98 | 7 |
| 35 | Zack Greinke | SP | 9.3 | 21.6 | 1292.8 | 0.96 | 8 |
| 36 | Felix Bautista | RP | 9.3 | 4.8 | 160.4 | 1.87 | 3 |
| 37 | Kyle Hendricks | SP | 9.3 | 22.4 | 1482.8 | 0.93 | 10 |
| 38 | Nathan Eovaldi | SP | 9.2 | 18.8 | 1085.9 | 0.95 | 9 |
| 39 | Tarik Skubal | SP | 9.1 | 19.3 | 765.5 | 0.91 | 6 |
| 40 | Kirby Yates | RP | 9.1 | 7.4 | 389.9 | 1.24 | 8 |
| 41 | Liam Hendriks | RP | 9.1 | 12.0 | 408.7 | 1.35 | 9 |
| 42 | Trevor Bauer | SP | 9.0 | 18.8 | 934.4 | 0.97 | 6 |
| 43 | Logan Webb | SP | 9.0 | 24.3 | 1060.7 | 1.01 | 7 |
| 44 | Paul Skenes | SP | 8.9 | 10.8 | 320.2 | 0.97 | 2 |
| 45 | Freddy Peralta | SP | 8.8 | 17.8 | 928.9 | 0.95 | 8 |
| 46 | Andrew Miller | RP | 8.8 | 5.5 | 273.5 | 1.41 | 6 |
| 47 | Seth Lugo | SP | 8.6 | 14.8 | 990.9 | 1.13 | 10 |
| 48 | Tony Watson | RP | 8.5 | 2.4 | 328.5 | 1.52 | 6 |
| 49 | Stephen Strasburg | SP | 8.2 | 17.9 | 682.5 | 0.96 | 5 |
| 50 | Hyun-Jin Ryu | SP | 8.2 | 12.3 | 705.5 | 0.94 | 7 |
Yet if these contextual factors were purely noise, their effects would largely cancel out over large samples. Instead, over a decade of league-wide data, high-leverage relievers consistently dominate aggregate WPA, reflecting a durable structural feature of the game: leverage is systematically concentrated into a small number of bullpen innings. This is made explicit in pLI data, where elite relievers operate at roughly double the average leverage of starters, placing them in the most consequential moments of nearly every game.
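A rough check of both points, using only the first five rows of the leaderboard above, is sketched below. The WPA, IP, and pLI values are copied from the table. The helper name `summarize` and the choice to weight pLI by innings are assumptions made for this illustration, not the article's stated method.

```python
# (player, role, WPA, IP, pLI) copied from the first five leaderboard rows.
ROWS = [
    ("Max Scherzer", "SP", 24.7, 1492.2, 0.96),
    ("Justin Verlander", "SP", 22.1, 1449.4, 0.96),
    ("Jacob deGrom", "SP", 21.7, 1196.5, 0.95),
    ("Josh Hader", "RP", 21.5, 510.0, 1.84),
    ("Kenley Jansen", "RP", 20.3, 585.0, 1.85),
]


def summarize(rows):
    """Return {role: (WPA per 100 IP, innings-weighted pLI)}."""
    totals = {}
    for _, role, wpa, ip, pli in rows:
        t = totals.setdefault(role, {"wpa": 0.0, "ip": 0.0, "pli_ip": 0.0})
        t["wpa"] += wpa
        t["ip"] += ip
        t["pli_ip"] += pli * ip  # assumption: weight each pLI by innings pitched
    return {
        role: (100 * t["wpa"] / t["ip"], t["pli_ip"] / t["ip"])
        for role, t in totals.items()
    }


summary = summarize(ROWS)
for role, (rate, pli) in summary.items():
    print(f"{role}: {rate:.2f} WPA per 100 IP, pLI {pli:.2f}")
```

For this small sample, the two relievers accumulate WPA at more than twice the starters' rate per inning while pitching at nearly double the leverage, which is the pattern the paragraph describes for the full dataset.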