The Most Important Pitching Study Ever Done: An Overview
For many years, I’ve been toiling on the statistical fringes of the eternal baseball question: what is the best way to handle pitching? My aim has been to put together a baseball equivalent of a “grand unified theory” which would account for the key changes in the way pitching staffs have been deployed over the years. The time has come to let my findings see the light of day.
Thanks to Dave Smith and www.retrosheet.org, I have compiled literally millions of data points covering many aspects of pitching usage and performance from 1950-2014. I looked at every box score since 1950 for the big starting pitcher study which I’ll be presenting here in a series of articles. Here is the way I’ve mapped the articles:
1. Overview: earliest ten seasons vs. most recent ten seasons
2. The 1950s
3. The 1960s
4. The 1970s
11. Probably a separate article for each of the past five seasons.
In my scouring of box scores, I looked for a specific situation which I felt would represent the clearest changes in the varying ways in which managers have used pitchers in the past 65 seasons. I was right. It’s easy to see that back in the 1950s, the starting pitcher was expected to pitch the whole game or give out trying, while today’s managers seem to approach each game by starting at the end. For so many years, Joe Torre went to the ballpark praying that he’d get to use Mariano Rivera to wrap up a win in the ninth inning. To do that, his set-up man would have to get the job done in the eighth inning, and perhaps two or three other relievers would be called on to handle things in the seventh and eighth innings. The starting pitcher is no longer expected to give his team more than six or seven innings, the so-called “quality start” that old-timers scoff at. Exactly how did this change evolve? When did this all happen? Does it help teams win?
That was my overriding question: is this ultimately a winning strategy? We have gone from nine-man pitching staffs in the 1950s to twelve-man staffs today, thereby giving the manager much less flexibility in non-pitching substitutions. Suppose it turned out that using a dozen pitchers the way Casey Stengel and Al Lopez used to use just nine does not result in more wins? That’s what we’ll be looking at with this study.
It used to be that close to 60% of relievers entered the game when their team was losing. In recent years, that trend has reversed, with nearly 60% of relievers entering with their team in the lead. Managers are wearing down more and more arms, but is it bringing more wins? No, it isn’t. Teams have instituted pitch counts at every level of the organization, limited innings for certain pitchers, and engaged the best trainers and medical staff to nurse their pitchers along without encountering serious arm injuries. Yet the injuries have not abated, and there is much debate about whether this is because pitchers are being pushed too hard or because their arms aren’t being developed enough. The answer to that debate might be that both cases are true: pitchers are being coddled early on and their arms are never conditioned to throw a lot of pitches or innings; they’re also being told to throw as hard as they can while they’re in the game, knowing that the manager will be getting them out of there before long. Perhaps they’re simply pressuring and straining arms that aren’t built up to withstand it.
So the old time-bomb phenomenon continues for pitchers. They seem doomed to break down at some point almost no matter how they are trained and used. Coddle them, push them, abuse them, confuse them, rely on them, or whatever you do with them—most of their arms will break down at some point. So how do you get the most out of them as long as they last? Could old-time managers actually get just as much out of a nine-man or ten-man staff as today’s managers do out of twelve or even thirteen hurlers? That was the idea behind my studies.
Here’s the situation I looked at. Your starting pitcher has given you seven innings and has a lead of three runs or less. Lest you think this is too restrictive, bear in mind that since 1950, it has occurred 38,498 times, or just under 600 times per season. That’s a good-sized sample to draw from, and it involves what apparently is the key decision for a manager. Do you leave your pitcher in? If you let him start the eighth inning, how far do you go with him? If he gets you through the eighth inning with the lead, what do you do? If you let him try for a complete game, does it work? Do you change your mind and bring in a reliever? In what situation does the reliever enter, and how does he perform?
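As a quick check on that per-season figure, here is a minimal Python sketch of the arithmetic. It uses only the two numbers quoted above (38,498 occurrences, seasons 1950 through 2014). The variable names are mine, not anything from the study's worksheets.

```python
# Totals quoted in the text; nothing here is read from the underlying spreadsheets.
TOTAL_OCCURRENCES = 38_498
FIRST_SEASON, LAST_SEASON = 1950, 2014

# Inclusive season count: 1950 through 2014 is 65 seasons.
seasons = LAST_SEASON - FIRST_SEASON + 1
per_season = TOTAL_OCCURRENCES / seasons

print(f"{seasons} seasons, about {per_season:.0f} occurrences per season")
```

That works out to roughly 592 occurrences a season, consistent with "just under 600."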
Those are some of the questions I asked as I recorded 17 pieces of data for each occurrence. A few times each season, both starters face that situation, when the lead changes hands in the top of the eighth inning. For each season, I broke down this data separately by league (since the 1973 advent of the designated hitter, only National League managers have this key pitching decision affected by the possibility of pinch-hitting for the pitcher), and compiled 56 pieces of analytical data along with the percentage of times when each result occurred.
My wife was appalled yesterday when I did some calculations and admitted that I have gathered over 700,000 pieces of data from the study, residing in just two Excel worksheets. Some are more substantial and more important (such as the number of times the pitcher gets to start the eighth inning), while others are rare and more diverting than important (how often a starting pitcher is scored on in the ninth inning of a complete-game win). But there they are, and after catching up on recent raw data, I’m ready to start analyzing it. I do not have formal training as a statistician, but I’ve always been good at playing with numbers. That’s what I’ll be doing in these articles: shedding assumptions and trying to make sense out of the numbers, the way Bill James did in the early years of his Abstract.
In this overview article, I’m going to talk about ten of the most revealing numbers. In the following table, I present those numbers as they occurred in the 1950s and in the most recent decade of play (2005-2014), to show you the sweep of changes over the past 65 years.
A closer look at these numbers will give you the broad parameters of this study. “Lead>8” is the basic situation in which the starter has a lead of three runs or less when his team takes the field in the eighth inning. It is today’s “save situation,” but in the 1950s nobody had heard of saves. The manager’s job was to win games, and most of the time the starting pitcher carried the load at least into the eighth inning. This situation occurred in 39% of games in the 1950s, or twice every five games. The frequency has dropped to nearly half that figure in the past ten years. Why does it happen so much less often? Because managers have increasingly tended to take the starter out after six or even five innings when he has a small lead, putting an even greater load on the bullpen. One of my central beliefs about pitching is that a pitcher who is demonstrably doing well does not need to be replaced until evidence appears that he’s losing his stuff. Today’s manager simply does not believe that, as this table reflects.
In the 1950s, if you got through the seventh inning with a lead, you started the eighth inning a whopping 95.1% of the time, and if you finished the eighth inning in good shape, you started the ninth inning 95.3% of the time. Such faith in the starting pitcher has plummeted steadily over the years, with the figures dropping to 47.9% and 41%, respectively. No matter how well you pitch that seventh or even eighth inning, the odds are that today’s manager is going to lift you in favor of a handful of relievers. As my research on relief pitching shows, these managers follow scripted scenarios in which Pitcher A is expected to get two outs, Pitcher B is expected to get out a key left-handed hitter, Pitcher C is expected to do the same as B, Pitcher D is expected to carry the load in the eighth inning, and Pitcher E serves as the final-inning closer. Those changes have become routine regardless of how well each reliever performs. The empirical has yielded to the theoretical, but it doesn’t win more games! I have blogged about this before, for instance in this post from 2011: http://charlesapril.com/2011/05/in-case-you-havent-noticed-relief.html.
Back to the table. The percentages tell you how rare the complete game has become. In the 1950s, a starter who began the eighth inning wound up pitching a complete game 69% of the time, or a little more than two-thirds. Since 2005, that has gone down to 17.5%, or a little more than one-sixth of the time. Three-fourths of the time, what would have been a complete game, with the bullpen resting rather than working, has become a parade of relievers. Nothing slows games down like pitching changes, but what results are produced?
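One way to read the "three-fourths" figure is as the relative drop in the complete-game rate between the two periods. The short Python sketch below works that out from the two percentages quoted above. Treating the figure as a relative drop is my assumption about how it was derived, and the variable names are illustrative.

```python
# Complete-game rates for starters who began the eighth inning (from the text).
cg_rate_1950s = 0.69     # 1950s
cg_rate_recent = 0.175   # 2005-2014

# Share of 1950s-style complete games that no longer happen.
# Assumption: "three-fourths" refers to this relative drop.
lost_share = (cg_rate_1950s - cg_rate_recent) / cg_rate_1950s

print(f"Relative drop in complete games: {lost_share:.1%}")
```

The result is a shade under 75%, which squares with "three-fourths of the time."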
“BS non-CG” tells you how often the bullpen blows the save when a starter is relieved at any time after the seventh inning. Note that this is the first stat I’ve shown you that has remained close. It is worse today than it was in the 1950s, but it is better than it was in the previous ten-season period (1995-2004), when the bullpens collectively blew roughly 23% of their save opportunities.
Finally, we have the most important number of all: team wins. The bottom line. Looking just at these two decades, we find a 1.6-percentage-point increase in the team win rate. That means 1.6 more wins per hundred events. However, since teams today average fewer than ten such events per season, each team realizes an extra win every seven seasons or so. But that’s looking at just two decades. Over the course of 65 seasons, the average is right around 84-85%. Out of those 65 seasons, the Team Wins figure has landed in a four-point range (82.5-86.5%) 54 times. The worst season was 1957 at 79.4%, and the best was 1976 at 87.6%. Second-best was 2014 at 87.5%. If you take out 2014 and 1957, the margin narrows to 84.6-83.7%.
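The "extra win every seven seasons or so" estimate can be sketched in a few lines of Python. The 1.6-point gain comes from the text. The events-per-season values of 8, 9, and 10 are assumptions chosen for illustration, since the text says only "fewer than ten." The function name is hypothetical.

```python
def seasons_per_extra_win(point_gain: float, events_per_season: float) -> float:
    """Seasons needed for a win-rate gain (in percentage points) to add one win."""
    extra_wins_per_season = (point_gain / 100) * events_per_season
    return 1 / extra_wins_per_season


# Assumed event counts per team-season; the study's exact figure is not given here.
for events in (8, 9, 10):
    seasons = seasons_per_extra_win(1.6, events)
    print(f"{events} events per season: one extra win every {seasons:.1f} seasons")
```

Under those assumptions the answer runs from a little over six seasons to nearly eight, which brackets the figure of about seven.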
That isn’t much to show for a strategy that has run rampant and has affected everything from off-season roster moves on down the managerial ladder to the poor schmo who lifts his starter after 6 2/3 innings of four-hit work, with nobody on base, because the next hitter is just 1-for-12 lifetime against the southpaw warming up in his bullpen. Everything done by the organization to create the roster needed to implement this pervasive strategy impacts this decision, yet it doesn’t lead to more wins.
In the coming articles, we’ll be looking at these and other perspectives from my 65-season study. We’ll look at who these pitchers were: for instance, the somewhat-less-than-immortal John Tsitouris, who won 34 games in the majors. In 1963, pitching for the Reds, he faced a “save situation” at the start of the eighth inning nine times. His manager, Fred Hutchinson, let him start the eighth inning every time, and he got through that inning unscathed in eight of them. In those eight games, Hutchinson sent him out there for the ninth inning seven times, and he went the distance for six complete-game wins. In the seventh of those ninth innings, he was relieved and the bullpen blew the save. He also got two wins with saves. Tsitouris, it should be noted, was fifth on the Reds in games started that year.
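The Tsitouris line can be laid out as a small funnel, using the same denominators as the rates discussed in this overview: eighth innings started per opportunity, ninth innings started per clean eighth, and complete games per eighth inning started. The Python sketch below is only an illustration. The class and field names are hypothetical and are not the 17 fields recorded in the study.

```python
from dataclasses import dataclass


@dataclass
class StarterFunnel:
    """Counts for one pitcher-season; field names are illustrative only."""
    opportunities: int       # led by three runs or fewer after seven innings
    started_eighth: int      # manager let him start the eighth
    survived_eighth: int     # got through the eighth unscathed
    started_ninth: int       # manager let him start the ninth
    complete_game_wins: int  # finished the game and won

    def rates(self) -> dict[str, float]:
        return {
            "started_eighth": self.started_eighth / self.opportunities,
            "started_ninth": self.started_ninth / self.survived_eighth,
            "complete_game": self.complete_game_wins / self.started_eighth,
        }


# John Tsitouris, 1963 Reds, using the counts given in the text.
tsitouris_1963 = StarterFunnel(
    opportunities=9,
    started_eighth=9,
    survived_eighth=8,
    started_ninth=7,
    complete_game_wins=6,
)

for name, rate in tsitouris_1963.rates().items():
    print(f"{name}: {rate:.1%}")
```

On those counts, Hutchinson let him start the eighth every time and the ninth in seven of eight chances, and two-thirds of his eighth-inning starts ended as complete-game wins.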
I hope you’ll join me in exploring the various changes that have taken place since the 1950s in pitching usage and pitching success. We’ll break down the numbers and see who did what, how the domino effect made ripples, and why things unfolded as they did.