Friday, March 12

The Sheer Improbability

My last post on pitchers who won two-thirds of their starts in a season got me to thinking again about Gator's amazing record in September division title races. I noted Lefty Grove's incredible '31 season in which he won 27 of 30 starts, a feat unmatched in baseball history. I noted that Bob Welch won a higher percentage of his starts in 1990 than any pitcher since 1954, and he won only 24 games during his best 30-start stretch.

I've noted before the sheer improbability of winning 26 of any 30 starts selected on the basis of any unbiased criterion. It's highly improbable that a pitcher would win 26 of 30 weekend starts, or starts in day games at home, or starts against teams in his own division. But to get an even clearer idea of how difficult it is to win 26 of any 30 starts, even 30 starts selected by a manifestly biased criterion, consider the following.

Wednesday, March 10

Start The Game, Win The Game

Trivia question: since 1920 there have been only two pitchers to win two-thirds or more of their starts in a season more than twice (minimum 30 starts). Who are they? While you think about that I'll give you some idea of how special this achievement is.

Since 1954 a pitcher has started 30 or more games in a major league season more than 3000 times. Only 30 times has a pitcher started 30 or more games and won two-thirds of his starts. That's less than 1% of the 30-start seasons since 1954.
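For a rough sense of why so few seasons clear this bar, here is a minimal sketch that treats each start as an independent trial with a fixed chance of a win. That independence and the per-start win probabilities below are simplifying assumptions chosen for illustration, not figures taken from the historical record.

```python
from math import comb


def prob_at_least(wins: int, starts: int, p: float) -> float:
    """P(X >= wins) for X ~ Binomial(starts, p)."""
    return sum(
        comb(starts, k) * p**k * (1 - p) ** (starts - k)
        for k in range(wins, starts + 1)
    )


# Hypothetical per-start win probabilities (illustrative assumptions only).
for p in (0.40, 0.50, 0.60):
    two_thirds = prob_at_least(20, 30, p)   # two-thirds of 30 starts
    grove_like = prob_at_least(26, 30, p)   # 26 or more wins in 30 starts
    print(f"p={p:.2f}  P(20+ of 30)={two_thirds:.4f}  P(26+ of 30)={grove_like:.8f}")
```

Even a pitcher who is a coin flip to win any given start would, under this model, win 26 or more of 30 less than once in ten thousand tries.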

Since 1954 only four pitchers have accomplished this feat more than once. Can you name them? Here are some hints. Sandy Koufax never did it, narrowly missing in '66. Greg Maddux never did it, either, although he came close in '95 when he won 19 of 29 starts. Whitey Ford came close in '56 and '63, but never did it. Steve Carlton never did it, although he won 27 of his 41 starts in '72.

The following all-time greats did it once: Bob Gibson, Randy Johnson, and Tom Seaver. What year do you think Gibson did it? No, it wasn't '68, it was '70. Doc Gooden did it in his great '85 season. Denny McLain did it in '68 when he won 31 games.

Here are three of the four post-1954 pitchers who did it more than once: Roger Clemens did it twice, in '86 and '90. Juan Marichal did it twice, in '66 and '68. Pedro Martinez did it in '02 and just missed the 30-start minimum in '99, when he won 22 of his 29 starts. Let's give Pedro credit for that season anyway, because 22 wins would still have been more than two-thirds even if he had made a 30th start and lost it.
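The qualifying rule (at least 30 starts, wins in at least two-thirds of them) can be sketched as a small check. The win and start totals below are the ones quoted in these posts; the function and constant names are made up for this example, and the last entry is the hypothetical "lost 30th start" version of Pedro's '99.

```python
from fractions import Fraction

MIN_STARTS = 30             # minimum starts used in these posts
THRESHOLD = Fraction(2, 3)  # two-thirds of starts


def qualifies(wins: int, starts: int) -> bool:
    """True if the season has enough starts and wins in at least 2/3 of them."""
    return starts >= MIN_STARTS and Fraction(wins, starts) >= THRESHOLD


# (wins in starts, starts) as quoted in the posts
seasons = {
    "Grove '31": (27, 30),
    "Carlton '72": (27, 41),
    "Maddux '95": (19, 29),
    "Martinez '99": (22, 29),
    "Martinez '99 with a lost 30th start (hypothetical)": (22, 30),
}

for label, (wins, starts) in seasons.items():
    print(f"{label}: {wins}/{starts} = {wins / starts:.3f}  qualifies={qualifies(wins, starts)}")
```

Using exact fractions avoids any floating-point wobble right at the two-thirds line, for instance a 20-for-30 season.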

The answer to our trivia question? Here's another hint first: both pitchers to have won 2/3s of their starts in a season more than twice are lefties. The answer? Lefty Grove did it four times, in '28, '30, '31 and '32. And Ron Guidry did it three times, in '78, '83 and '85. They are the only two pitchers since 1920 to have won 2/3s of their starts in a season more than twice.*

Tuesday, March 9

Leverage Adjusted ERA (Or "Not All Runs Are Equal")

It's been surprising to me, given the profusion of new pitching statistics (FIP, VORP, Component ERA), that we haven't seen an expression of ERA or ERA+ that adjusts for leverage, weighting runs allowed in high-leverage situations more and runs allowed in low-leverage situations less. The data is available in the game logs at Baseball-Reference.com, but poring through and aggregating the data would be a tedious exercise. Fangraphs.com aggregates the data on a seasonal basis in the WPA, WPA/LI and Clutch statistics, but expresses the statistics in terms of incremental games won or lost rather than adjusted ERA.

Fangraphs calculates "Clutch" by subtracting WPA/LI, which aggregates the unleveraged increase or decrease in win probabilities associated with each plate appearance against a pitcher, from WPA, which also aggregates the win probabilities but assigns a leverage factor to each event based on the game situation (score, inning, base and out situation). Generally speaking, a pitcher with a positive Clutch factor performed better in high-leverage situations relative to his overall seasonal performance, or declined in performance in low-leverage situations relative to his overall seasonal performance, or some combination of the two. A better performance in high-leverage situations means that the incremental outs the pitcher got in high-leverage situations count for more than an average out (i.e., an out obtained in a game situation with a leverage factor of 1.0). A worse performance in low-leverage situations means that the incremental runs the pitcher allowed in low-leverage situations count for less than the average run (i.e., a run scored in a game situation with a leverage factor of 1.0).
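The subtraction described above can be sketched on a toy set of plate appearances. This follows the description in this post only; Fangraphs's published formula may differ in detail (for example, by also normalizing WPA by the pitcher's average leverage). The `PlateAppearance` type and every number below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PlateAppearance:
    wpa: float  # change in the pitcher's team win probability from this event
    li: float   # leverage index of the situation (1.0 = average, must be > 0)


def wpa_total(events: list[PlateAppearance]) -> float:
    """Leveraged total: raw win probability added."""
    return sum(e.wpa for e in events)


def wpa_li_total(events: list[PlateAppearance]) -> float:
    """Unleveraged total: each event's WPA divided by its leverage index."""
    return sum(e.wpa / e.li for e in events)


def clutch(events: list[PlateAppearance]) -> float:
    """Clutch as described in the post: WPA minus WPA/LI."""
    return wpa_total(events) - wpa_li_total(events)


# Hypothetical events: a big out in a tight spot, a hit allowed in a blowout,
# and an ordinary out in an average situation.
events = [
    PlateAppearance(wpa=0.10, li=2.0),
    PlateAppearance(wpa=-0.02, li=0.5),
    PlateAppearance(wpa=0.03, li=1.0),
]

print(f"WPA={wpa_total(events):.2f}  WPA/LI={wpa_li_total(events):.2f}  Clutch={clutch(events):.2f}")
```

The toy pitcher did his best work when leverage was high and his worst when it was low, so his Clutch comes out positive.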

The significance of the Clutch statistic should be obvious: not all runs allowed (and runs prevented) are equal. For example, the run surrendered in the bottom of the ninth of a tie game should be counted differently than the run surrendered in the bottom of the first inning after the visiting team took a six-run lead in the top half of the inning. ERA and ERA+ count each run the same, notwithstanding that the two runs I used as examples are likely to have had hugely disparate impacts on the outcome of the game. The advantage of expressing the number of leverage-weighted runs allowed as a variation on ERA should also be obvious: most fans will not know whether a Clutch factor of 0.74 is merely above average, or very good, or a spectacular achievement, but fans know how to compare a 116 ERA+ to a 135 ERA+.
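To make the two-run example concrete, here is one possible way to weight earned runs by leverage and express the result on the ERA scale. This is a sketch under my own assumptions, not an established statistic: the function names are invented, the leverage values are invented, and a real version would have to decide how to normalize the weights over a season.

```python
def era(earned_runs: float, innings: float) -> float:
    """Conventional ERA: earned runs per nine innings."""
    return 9 * earned_runs / innings


def leverage_weighted_era(run_leverages: list[float], innings: float) -> float:
    """ERA in which each earned run counts as its leverage index.

    run_leverages holds one leverage index per earned run allowed,
    so a run at leverage 1.0 counts as exactly one run.
    """
    return 9 * sum(run_leverages) / innings


# Hypothetical: two earned runs over nine innings.
# One scored in the bottom of the ninth of a tie game (assumed leverage 3.0),
# one in the bottom of the first with a six-run lead (assumed leverage 0.2).
run_leverages = [3.0, 0.2]
innings = 9.0

print(f"ERA: {era(len(run_leverages), innings):.2f}")
print(f"Leverage-weighted ERA: {leverage_weighted_era(run_leverages, innings):.2f}")
```

With these made-up inputs the conventional ERA is 2.00 and the leverage-weighted figure is 3.20, because the ninth-inning run counts for far more than the first-inning one. If every run came at leverage 1.0, the two numbers would be identical.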