{"id":33244,"date":"2020-02-09T20:10:22","date_gmt":"2020-02-10T01:10:22","guid":{"rendered":"https:\/\/seamheads.com\/blog\/?p=33244"},"modified":"2020-02-09T20:33:18","modified_gmt":"2020-02-10T01:33:18","slug":"major-league-equivalencies-for-the-negro-leagues","status":"publish","type":"post","link":"https:\/\/seamheads.com\/blog\/2020\/02\/09\/major-league-equivalencies-for-the-negro-leagues\/","title":{"rendered":"Major League Equivalencies for The Negro Leagues"},"content":{"rendered":"<p>Major League Equivalents (MLEs) are a series of calculations designed to take non-major league baseball performance and estimate what that performance\u2019s results would look like statistically in the context of the Major Leagues. Bill James gets credit for popularizing MLEs, as he outlined his method for minor league batters in the 1985 Baseball Abstract. James was only interested at the time in making sense of minor league statistics, but MLEs can be used to evaluate ANY baseball performance, including minor league, Japanese or other foreign league, Negro League, NCAA leagues, etc. You can also use the basic MLE procedure to evaluate the performance of an American League player relative to the National League, or perhaps calculate what type of batting statistics Ty Cobb\u2019s 1909 performance would look like in the 2007 AL.  <\/p>\n<p>Of course, creating MLEs at all is a bit of a fool\u2019s errand. We can\u2019t really KNOW how any player would have played if placed in a different environment, especially one that is drastically different regarding its level of competition. Some players can adapt and adjust their playing when faced with different settings, and some have difficulty. However, it can be a fun and enlightening exercise.  <\/p>\n<p>For current baseball players, MLEs can also be used to build predictive models of future play results.  
This is really what James had in mind \u2013 a way to use past non-major league data to predict future major league performance of young players by converting that non-major league data into data that would approximate MLB level play, then using THAT data along with any MLB historical data on the player to give a greater sample size on which to make predictions.  Today, everyone from major league team executives to fantasy league players relies on predictions built upon some basic framework for MLEs.<\/p>\n<p>Besides executives and fantasy baseball players, MLEs can be useful for baseball historians and baseball \u2018gamers\u2019 (those who play simulation games like Diamond Mind, Out of the Park, APBA).    MLEs can help to answer questions such as:<\/p>\n<p>\u2022 How would Ted Williams, Bob Gibson, Ty Cobb, and Barry Bonds do if all placed in a league together?<br \/>\n\u2022 What if Japanese League players had been allowed in MLB beginning in the 1960\u2019s?<br \/>\n\u2022 What if the Major Leagues had integrated in the 1920\u2019s?<\/p>\n<p>MLEs can give us somewhat realistic \u201cWhat ifs?\u201d that can be analyzed, simulated, and just plain enjoyed.<\/p>\n<p>Creating good MLEs involves these basic steps:<\/p>\n<p>1.  Determine the relative strengths of the FROM League environment vs. the TO League environment.   <\/p>\n<p>Ideally, you\u2019d have actual data from which to do this, such as players who move from the PCL to the NL within the same year. Compare their stats between the two, adjust for quantity (one player may have only 5 PAs in the NL and 450 in the PCL that year, while another has 400 and 70, for example), adjust for selective sampling if needed, sum up, and compare. For Japanese Leagues, you generally only have players moving to and from MLB BETWEEN seasons, so you would want to pair one season to the following season, but since the player would be a year older the second year, maybe make a slight adjustment for age to make those pairs comparable.   
 For Negro Leagues, you may have only limited pairs in the 1940\u2019s, or almost no pairs in the 1920\u2019s, in which case you have to make some assumptions (educated guesses) about league strengths.<\/p>\n<p>2.  Determine the differences in League Run Environments. <\/p>\n<p>This SHOULD be straightforward, but it\u2019s not.   For example, if 10 Runs per Game are scored in the PCL, and 8 Runs per Game are scored in the NL, you would think that the PCL stats for the MLE calculation would need to be decreased by 20% for batters and pitchers (lower runs allowed for pitchers).    However, ballparks on average may be a little smaller in the PCL, and perhaps if the PCL had played their season in MLB parks, they would have scored only 15% more than the NL instead of 20% more.  If that\u2019s the case, then a BATTER moving from the PCL to the NL is going to see his offensive production decline by even MORE than 20%, while a PCL pitcher would actually see his Runs Allowed IMPROVE by more than 20%!   This means you need the next step:<\/p>\n<p>3.  Determine the differences in Ballparks (and other factors) between leagues.<\/p>\n<p>As mentioned, league run environments are impacted by the parks, the balls, and the bats (like NCAA players moving from aluminum bats to wooden bats).   If a player like Tuffy Rhodes is moving from the NL to Japan, he\u2019s moving to a run scoring environment around 6% LESS than MLB so we would expect his stat line adjustment in Step #2 to be 6% worse.   However, partially due to parks and partially due to the baseball, the park run environment in Japan (pre-2012) is much more hitter friendly ON AVERAGE than parks in MLB, perhaps as much as 13% more hitter friendly.  So, not only does Tuffy get around a 10% boost in step #1 for moving to a weaker league, he gets another 7% boost (13% &#8211; 6%) from steps #2 and #3 together.<\/p>\n<p>Calculating this step is tricky, because the evidence is intertwined with the league run scoring environment.  
The best estimating technique is to look at the DIFFERENCE between batters and pitchers who move between the same league environments.   For example, if the empirical evidence shows that PCL batters hit 15% worse in MLB, while PCL pitchers allow only 5% more runs moving to MLB, that\u2019s evidence that the PCL parks are around 5% more hitter friendly on average than MLB parks: (-15% + 5%)\/2 = -5%.<\/p>\n<p>4.  Determine the differences in Ballparks WITHIN leagues.<\/p>\n<p>Step #3 uses the \u2018average\u2019 parks for the FROM and TO leagues, but the specific park a batter played in, and the specific park he\u2019s being calculated into, should be adjusted for if estimates are known.<\/p>\n<p>Several good publicly available methods have already been created to calculate MLEs.  Bill James of course had his formulas in the 1985 Abstract, specifically for AA and AAA players going to MLB.   James then had the \u201cWillie Davis Method\u201d in his Historical Baseball Abstract, specifically to convert any one major league batting season into a \u2018neutral\u2019 major league.   Dan Szymborski does MLEs called ZiPS that are calculated very similarly to Bill James\u2019s for batters, only he also has formulas for pitchers.<\/p>\n<p>I too have my own MLE calculations, with batting MLEs based primarily on the \u201cOdds Ratio\u201d method outlined in many blogs over the years by Tom M. Tango, author of \u201cThe Book\u201d and currently the Senior Database Architect of Stats for MLB Advanced Media.  
For pitching MLEs, my calculations closely follow the method of Sean Smith, whose method was previously used by Baseball-Reference.com for their neutralized stat calculations.<\/p>\n<p>Since here at Seamheads we specialize in the rich history of Negro Leaguers, one question that often lurks in the background, and sometimes the foreground, is \u201cJust how good WERE those guys?\u201d  MLEs are the tool that, along with those important environment variables and caveats above, can help us down the road a bit to answering that question.<\/p>\n<p>The nitty-gritty details of the calculations are for a future article, but to demonstrate the power of MLEs we will take the Negro League stats for Wilber \u201cBullet\u201d Rogan, who as a two-way player will provide us with both batting and pitching stats to work with, and as a Hall of Fame performer will give us an idea of how good the top players in the Negro Leagues might have been.   We\u2019ll use my \u201cKJOK\u201d method, and see what the results look like using the 2019 NL as the MLE \u201cTO\u201d season:<\/p>\n<p>Here are Rogan\u2019s raw stats from the Negro Leagues (per seamheads.com):<\/p>\n<p> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/seamheads.com\/blog\/wp-content\/uploads\/Rogan1.png\" alt=\"\" width=\"561\" height=\"220\" class=\"aligncenter size-full wp-image-33247\" srcset=\"https:\/\/seamheads.com\/blog\/wp-content\/uploads\/Rogan1.png 561w, https:\/\/seamheads.com\/blog\/wp-content\/uploads\/Rogan1-300x118.png 300w\" sizes=\"auto, (max-width: 561px) 100vw, 561px\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/seamheads.com\/blog\/wp-content\/uploads\/Rogan2.png\" alt=\"\" width=\"557\" height=\"175\" class=\"aligncenter size-full wp-image-33248\" srcset=\"https:\/\/seamheads.com\/blog\/wp-content\/uploads\/Rogan2.png 557w, https:\/\/seamheads.com\/blog\/wp-content\/uploads\/Rogan2-300x94.png 300w\" sizes=\"auto, (max-width: 557px) 100vw, 557px\" 
\/><\/p>\n<p>Here are the translated stats using my method:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/seamheads.com\/blog\/wp-content\/uploads\/Rogan3.png\" alt=\"\" width=\"599\" height=\"212\" class=\"aligncenter size-full wp-image-33249\" srcset=\"https:\/\/seamheads.com\/blog\/wp-content\/uploads\/Rogan3.png 599w, https:\/\/seamheads.com\/blog\/wp-content\/uploads\/Rogan3-300x106.png 300w\" sizes=\"auto, (max-width: 599px) 100vw, 599px\" \/><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/seamheads.com\/blog\/wp-content\/uploads\/Rogan4.png\" alt=\"\" width=\"596\" height=\"190\" class=\"aligncenter size-full wp-image-33250\" srcset=\"https:\/\/seamheads.com\/blog\/wp-content\/uploads\/Rogan4.png 596w, https:\/\/seamheads.com\/blog\/wp-content\/uploads\/Rogan4-300x96.png 300w\" sizes=\"auto, (max-width: 596px) 100vw, 596px\" \/><\/p>\n<p>Some general observations on the results:<\/p>\n<p>While the method does try to \u2018shape\u2019 statistics for a change in eras where the distribution of Singles, Home Runs, etc. is vastly different, like the 1920\u2019s Negro Leagues versus the 2019 National League, the larger the differences in distribution, the harder it is for the model to create realistic stats in the new environment.  We are not only moving from the Negro Leagues to MLB, we are also moving from the 1920s to 2019.  If we had moved into the 1920s National League environment, the model would do a better job.  So instead of hitting .340 in the 2019 era, maybe a better estimate would be .320 but with a few more extra base hits.  The model adds plate appearances for the difference in league game schedules between the leagues, so the higher HR numbers are a combination of a much higher HR environment and a longer season schedule.   
<\/p>\n<p>On the pitching side, the combination of Rogan striking out batters at a much higher rate than contemporary Negro League pitchers, put into the high-strikeout 2019 environment, results in translated strikeout totals that may be a bit too high to be realistic. <\/p>\n<p>Admittedly, these are Rogan\u2019s prime, best seasons, but the translations do seem to confirm his reputation as a great two-way player.   Note again that this does not mean Rogan is PREDICTED to hit 36 home runs if he played in the 2019 NL instead of the 1922 NNL.  It just means that given what he did in the 1922 NNL, and making some assumptions about the quality of play and the ballparks, what he did at the plate would be approximately EQUIVALENT to hitting 36 home runs, batting .300, etc.<\/p>\n<p>Assumptions of course can be wrong.   The point is not necessarily that the MLEs are \u2018correct\u2019, but that we are now starting to have the data for players, leagues, ballparks, etc. that, combined with statistical tools, can be used to approximate \u201chow good these guys really were\u201d as opposed to purely guessing based on anecdotal stories or very incomplete statistics that lack any league scoring context to provide an analytical framework.<\/p>\n<p>In future articles we\u2019ll step back into the detailed data a bit and discuss how to analyze players when we don\u2019t have 100% complete data to work with, like missing strikeouts for batters.   How do we get around that?   Even if we can approximate batting or pitching performance, what about defense, or even baserunning?  Do we have ways to approximate those also, or are they hopelessly lost to history?   
Stay tuned\u2026.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Major League Equivalents (MLEs) are a series of calculations designed to take non-major league baseball performance and estimate what that performance\u2019s results would look like statistically in the context of the Major Leagues. Bill James gets credit for popularizing MLEs, as he outlined his method for minor league batters in the 1985 Baseball Abstract. James [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4235,5],"tags":[4303],"class_list":["post-33244","post","type-post","status-publish","format-standard","hentry","category-top-stories","category-statistical-analysis","tag-negro-league-baseball"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/seamheads.com\/blog\/wp-json\/wp\/v2\/posts\/33244","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/seamheads.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/seamheads.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/seamheads.com\/blog\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/seamheads.com\/blog\/wp-json\/wp\/v2\/comments?post=33244"}],"version-history":[{"count":2,"href":"https:\/\/seamheads.com\/blog\/wp-json\/wp\/v2\/posts\/33244\/revisions"}],"predecessor-version":[{"id":33251,"href":"https:\/\/seamheads.com\/blog\/wp-json\/wp\/v2\/posts\/33244\/revisions\/33251"}],"wp:attachment":[{"href":"https:\/\/seamheads.com\/blog\/wp-json\/wp\/v2\/media?parent=33244"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/seamheads.com\/blog\/wp-json\/wp\/v2\/categories?post=33244"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/seamheads.com\/blog\/wp-json\/wp\/v2\/tags?post=33244"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":tru
e}]}}