Baseball Digest Daily

No-no Fever!

by Matt Mitchell

And the only solution is to predict a no-hitter

In case you missed it, Gavin Floyd of the White Sox almost threw a no-hitter again last week. (If you need to catch up, Diamond Rundown #6 on this very site is a good place to start.) And when someone comes that close twice in the early going of a season, it’s almost natural to predict, “He’s going to no-hit some team this year!”
You might have guessed that today’s discussion will be about predicting a no-hitter from the wealth of numerical data that exists, and you’d be right. Pat yourself on the back for remembering this is still the Sabermetric Soapbox without the column name above (or for being smart enough to read the tag). So where should we start? Well, let’s start with (who else?) Bill James, who estimated the odds of a no-no in a single game from the pitcher’s career statistics:

[(3*IP)/((3*IP)+H)]^26

Here IP is innings pitched and H is hits allowed. Simple probability and logic: the more frequently you get people out without allowing a hit, the better your chance of getting 27 batters out without allowing a hit. J.C. Bradbury modifies this formula slightly to center it more on the pitcher’s own ability and to use the Poisson distribution, but he essentially follows the same idea. Since you know you’re curious, this means Floyd’s odds of a no-hitter based on his career numbers as of 5/13/08 are 0.03%.
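For anyone who wants to play along at home, James’s estimate can be sketched in a few lines of Python. The function name and the sample line of 200 IP and 180 H are made up for illustration and are not any real pitcher’s numbers. Innings are assumed to be entered as true fractions (200.333 rather than the box-score 200.1).

```python
def james_no_hit_odds(innings_pitched: float, hits_allowed: int, exponent: int = 26) -> float:
    """Bill James's single-game estimate: [(3*IP) / ((3*IP) + H)] ^ 26."""
    outs = 3 * innings_pitched  # each inning pitched is three outs
    return (outs / (outs + hits_allowed)) ** exponent


# Hypothetical career line, not a real pitcher: 200 IP, 180 H.
odds = james_no_hit_odds(200, 180)
print(f"{odds:.2%}")
```

The exponent is left as a parameter so a different estimate of batters per game can be swapped in.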

But that’s all I can find online. Let’s stay within Bill’s framework, but consider a few extra components:

  • Defense tends to be ignored, courtesy of DIPS, but it always seems to play a critical role in a no-hitter. A statistic like DER (Defensive Efficiency Ratio) accounts for the percentage of batted balls that become outs. Let’s use this statistic here.
  • Since we’re accounting for defense, we’ll need to know how often the ball is put into play where the defense can field it. The denominator of the BABIP formula is a good start, but we want it from the pitcher’s perspective and expressed as a percentage of batters faced. Let’s use (BFP-BB-HBP-K-HRA)/BFP. This will be referred to as BIP%.
  • Some balls leave the yard and never come back. Home runs will be accounted for with HR%, defined as HRA/BFP.
  • We’ll also need to know what percentage of batters finish their turn at bat without a hit and without putting the ball in play (strikeouts, walks, and hit batsmen). This is everything not covered in the previous two bullets, so 1-HR%-BIP% should work fine.
  • What about walks and other ways of reaching base that aren’t hits? They get no special treatment here, as they are neither hits nor outs, the only factors that matter in a no-hitter. However, this does ignore the issue of pitching with runners on, which tends to be a detriment to preserving a no-hitter.
  • The exponent is the last component. Bill used 26 as an estimate, assuming that somewhere along the way a caught stealing or double play would factor into the mix. What he probably wanted was the average number of batters with an out-producing at-bat per 27 outs. Easy to say, harder to calculate. I used the Retrosheet play-by-play files since 2000, counting the number of plays with an out and dividing by the number of games times 2. This came out to 25.8, close to James’ 26. Let’s keep using 26 since it’s an integer.
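The exponent calculation in the last bullet can be sketched roughly as follows. This is only an illustration of the counting logic: the (game_id, outs_on_play) tuples are a hypothetical, simplified stand-in and not the actual Retrosheet event-file layout, and the function name is invented here.

```python
def out_plays_per_team_game(plays):
    """Average number of out-producing plays per team per game.

    `plays` is a hypothetical, simplified list of (game_id, outs_on_play)
    tuples, not the real Retrosheet event-file format.
    """
    plays = list(plays)
    games = {game_id for game_id, _ in plays}
    # Count plays on which at least one out was recorded.
    out_plays = sum(1 for _, outs in plays if outs > 0)
    # Two teams bat in every game, hence "games times 2".
    return out_plays / (2 * len(games))


# Toy game: one team makes its 27 outs on 27 plays; the other hits into a
# double play, so its 27 outs come on 26 plays. Hits (0 outs) are skipped.
toy = [("G1", 1)] * 27 + [("G1", 1)] * 25 + [("G1", 2)] + [("G1", 0)] * 8
print(out_plays_per_team_game(toy))  # 26.5
```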

And there you have it. The full formula thus looks something like this:

NoHitOdds = [(1-HR%-BIP%)+ BIP%*DER]^26
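As a rough sketch, the formula translates to Python like this. The function and argument names are my own, DER is passed in rather than computed, and the sample stat line is invented for illustration rather than taken from any real pitcher.

```python
def no_hit_odds(bfp: int, bb: int, hbp: int, k: int, hra: int,
                der: float, exponent: int = 26) -> float:
    """NoHitOdds = [(1 - HR% - BIP%) + BIP% * DER] ^ 26."""
    bip_pct = (bfp - bb - hbp - k - hra) / bfp  # share of batters putting the ball in play
    hr_pct = hra / bfp                          # share of batters homering
    no_contact_pct = 1 - hr_pct - bip_pct       # everything else: no hit, no ball in play
    per_batter = no_contact_pct + bip_pct * der # chance a single batter goes hitless
    return per_batter ** exponent


# Hypothetical season line (assumed numbers): 800 BFP, 60 BB, 5 HBP,
# 150 K, 20 HRA, with a .700 DER behind the pitcher.
print(f"{no_hit_odds(800, 60, 5, 150, 20, der=0.700):.3%}")
```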

Let’s throw the numbers into the cruncher and see what the odds of a no-hitter were for pitchers with 15 or more GS in 2007. Here’s the list: 2007 No Hit Odds. I didn’t calculate this for pitchers with fewer than 15 starts, so there are no odds for Clay Buchholz or Floyd from last year.

Perhaps the thing that stands out is that defense is a major key in this calculation. El Duque, who tops this list, also has the highest DER. Also in the top 10 on the list, Bonser was #2 in DER and Bergmann #3. These names jumped out at me as pitchers not known for great “stuff” but with above-average defenses behind them (Mets, Twins, and Nationals). Perhaps there is something to be learned about some of the more remarkable (read: more unexpected) no-hitters from this nugget of insight.
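One way to see why defense weighs so heavily is to hold the pitcher constant and vary only DER. The sketch below does that with assumed inputs (a 70% BIP% and a 2.5% HR%, picked purely for illustration), and the helper names are hypothetical.

```python
# Assumed, illustrative pitcher profile (not a real pitcher).
BIP_PCT = 0.70   # share of batters who put the ball in play
HR_PCT = 0.025   # share of batters who homer


def odds_at(der: float, exponent: int = 26) -> float:
    """No-hit odds for the fixed profile above at a given DER."""
    per_batter = (1 - HR_PCT - BIP_PCT) + BIP_PCT * der
    return per_batter ** exponent


# Same pitcher in front of three different defenses.
for der in (0.68, 0.70, 0.72):
    print(f"DER {der:.3f}: {odds_at(der):.3%}")
```

With these assumed inputs, moving DER from .680 to .720 more than doubles the estimate, because a small per-batter edge gets compounded 26 times.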

DISCLAIMER: This is by no means perfected, and in reality it’s more of a retrospective tool. Your comments and criticisms are, as always, most welcome.

Thanks to the Hardball Times glossary for the formulas, and the Lahman DB for seasonal stats.

Comments (2) -> “No-no Fever!”

  1. Brian Joseph
    14 May 2008 10:55

    Give me $10 to no-hit on Tim Lincecum, please.

  2. Mike Lynch
    14 May 2008 15:40

    I wonder which pitcher in history had:

    1. The best odds of throwing a no-hitter?
    2. The best odds of throwing a no-hitter, but never threw one?
    3. The worst odds of those who have thrown no-hitters?

    That would be interesting to know. Great article, Matt!
