Are Ethier and VanSlyke Actually Better?
by D.K. Robinson, 7/22/14
Don Mattingly’s decision to abruptly move Matt Kemp from CF to LF and replace him with a platoon of Andre Ethier and Scott VanSlyke may well go down in history as one of the most curious and potentially catastrophic moves ever made by a Dodgers manager. For years, many bloggers and fans of sabermetrics have yearned for the day that the managers of their teams would follow their statistical breadcrumbs and start making actual meaningful decisions based on an alphabet soup of acronyms for what are commonly referred to as “advanced metrics.” Well, be careful what you wish for, because Mr. Mattingly and others in the Dodgers organization have grossly misread the UZR-flavored tea leaves and are now selling out one of the Dodgers’ most valuable players based on an accidental misinformation campaign!
. . . All right, so that’s a bit of an overstatement. I’ll confess to being a stat geek. I love sabermetrics. In fact, I’ve been tinkering with my own metrics since I first discovered Bill James’ writings in the 80s while I was still playing dice-and-charts baseball simulation games. That being said, I’ve got some serious concerns about UZR, which I’ll get to in a minute. Now, back to the article . . .
Using UZR to Force Kemp Out of CF?
On May 23, 2014, Bill Plunkett of the OC Register wrote an excellent story with the headline “Kemp’s outfield woes becoming hard to defend.” Kemp had been benched in favor of Andre Ethier the day before and Plunkett quoted Mattingly as saying “Obviously center field has been a situation we continue to look at . . . We need to get better there. We’re looking at all options.” Plunkett wrote that “Mattingly acknowledged that defensive metrics like UZR (Ultimate Zone Rating) and D-WAR (defensive-WAR) do not reflect favorably on Kemp” and further quoted Mattingly as saying “I’m certainly not going to sit here and bash anybody or publicly go into the numbers. We know what they are, though.” Yikes! Mattingly’s been sniffing the UZR and the UZR’s been making him crazy!
So what, exactly, is this UZR? According to the popular website fangraphs.com: “UZR is an advanced defensive metric that uses play-by-play data recorded by Baseball Info Solutions (BIS) to estimate each fielder’s defensive contribution in theoretical runs above or below an average fielder at his position in that player’s league and year.” The raw UZR statistic essentially reflects the number of extra runs a fielder prevents or allows when compared to an average fielder and is cumulative in nature. So, for example, a player with a UZR of 10.0 has theoretically prevented 10 runs from scoring by his excellent defense when compared to an average player at his position, while a player with a UZR of -10.0 has theoretically allowed 10 runs to score that would not have scored if an average defender were manning his position. UZR’s cousin, UZR/150, projects a player’s UZR over 150 defensive games so researchers can estimate the positive or negative impact of having a player man a position over a full season.
So, what does UZR think of Matt Kemp? Not much. Matt Kemp played at least 158 games in CF during each season from 2009-2011 and won the Gold Glove in both 2009 and 2011. (Yes, I understand that the Gold Glove Awards are often not awarded to the best defensive players in the league . . . but, still, enough managers and coaches thought highly enough of Kemp’s defense to cast more votes for him than any other centerfielder twice in three years.) Kemp’s UZR figures for those three seasons were +3.2 (2009), -25.8 (2010) and -4.8 (2011), respectively. What’s fairly mind-blowing about these UZR scores is the huge discrepancy between 2009 and 2010. If these numbers are to be believed, Kemp plummeted from being a slightly above-average defender in 2009 to being a butcher who cost his team a full run every week in 2010 and, just as abruptly, rebounded by a full “21 runs” in 2011! Kemp’s UZR/150 statistics have generally declined since 2011, dropping to -12.6 in 2012, -35.0 in 2013 and rebounding only slightly to -33.6 this season. Of course, if these numbers are to be believed, no team in its right mind would allow Kemp to play CF as his mere presence there would literally cost the Dodgers a full run every 4th or 5th game. Not an extra baserunner . . . but a full run! The more articles I read citing UZR as the smoking gun that justifies Mattingly’s decision to yank Kemp out of CF, the more the entire premise of the story UZR purports to tell rang false! I felt compelled to take a fresh look at this “advanced defensive metric” and some possible alternatives.
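For anyone who wants to check the “one run every so often” arithmetic, here is a quick sketch in Python. It treats UZR/150 as a plain runs-per-150-games rate and assumes a 26-week regular season. Both are my simplifications rather than anything FanGraphs publishes, and the function names are made up for illustration.

```python
def games_per_run(uzr_per_150: float) -> float:
    """Games played per run gained or lost at a given UZR/150 rate."""
    if uzr_per_150 == 0:
        raise ValueError("a rate of zero never adds up to a full run")
    return 150 / abs(uzr_per_150)


def runs_per_week(season_uzr: float, weeks_in_season: float = 26.0) -> float:
    """Average runs gained or lost per week (ASSUMES a 26-week season)."""
    return abs(season_uzr) / weeks_in_season


print(round(games_per_run(-35.0), 1))  # 2013 UZR/150 from the article
print(round(games_per_run(-33.6), 1))  # 2014 UZR/150 from the article
print(round(runs_per_week(-25.8), 2))  # 2010 season UZR from the article
```

Under those assumptions the 2013 and 2014 rates work out to roughly one run every 4.3 and 4.5 games, and the 2010 figure to just under one run a week.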
UZR Has Flaws? – Try P150 On For Size!
Let’s start with the basics: the primary job of a centerfielder is to field the balls hit to centerfield. According to FanGraphs, in 2013, the average team had 332 balls hit to CF, or about 2.05 CF plays per game. The Dodgers had the fewest balls actually hit to CF (267). Obviously, Kemp can’t catch balls that aren’t hit to CF! But how well did Kemp field the balls that were hit in his general direction? The short answer is: a whole lot better than the press would have you think!
Fortunately, the good folks at FanGraphs have yet another new set of defensive data that can be very helpful in answering that question in detail. It’s called “Inside Edge Fielding” and it breaks down all of the balls hit to the zone of a particular position by the observer’s sense of how difficult the ball was to catch. More specifically, it sorts batted balls into a series of categories, which I’ll list here with the MLB-average rate of making the play in each category for the 2012-2014 time period: Impossible (0%), Remote (9.33%), Unlikely (34.2%), Even (59.14%), Likely (84.7%) and Routine (99.37%).
Using this “Inside Edge Fielding” data, I created a spreadsheet to calculate the number of extra plays that an above-average defender would make – and that a below-average defender would fail to make – over the course of 150 full games. The results were interesting. Because Kemp has not played a full season since 2011, I used the aggregate of his 2012-2014 statistics in order to have a reasonable sample size. Based on that data, compared to an average MLB CF, Kemp fails to make 4.3 plays per 150 games. I’ve decided to call this metric the “P150” – so Kemp’s P150 rating for 2012-2014 is -4.3.
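The spreadsheet formula itself isn’t spelled out above, so here is one plausible reading of it as a Python sketch, not the actual spreadsheet. It assumes that expected plays are chances times the MLB-average rate in each category, and that the difference from actual plays is scaled to 150 nine-inning games by innings played. The function name and the sample fielder are hypothetical; the sample numbers are not real data for Kemp or anyone else.

```python
# MLB-average conversion rates per Inside Edge category, 2012-2014 (from the article).
LEAGUE_RATES = {
    "Impossible": 0.0,
    "Remote": 0.0933,
    "Unlikely": 0.342,
    "Even": 0.5914,
    "Likely": 0.847,
    "Routine": 0.9937,
}


def p150(chances, plays_made, innings, rates=LEAGUE_RATES):
    """Plays made above (+) or below (-) league average per 150 nine-inning games.

    chances and plays_made are dicts keyed by category name.
    ASSUMPTION: actual plays are compared with chances * league rate and the
    difference is scaled by innings; the exact spreadsheet formula is not given.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    expected = sum(chances[c] * rates[c] for c in chances)
    actual = sum(plays_made.get(c, 0) for c in chances)
    full_games = innings / 9.0
    return (actual - expected) * 150.0 / full_games


# HYPOTHETICAL fielder over exactly 150 games (1350 innings); not real data.
sample_chances = {"Impossible": 20, "Remote": 10, "Unlikely": 10,
                  "Even": 10, "Likely": 20, "Routine": 300}
sample_plays = {"Impossible": 0, "Remote": 1, "Unlikely": 3,
                "Even": 7, "Likely": 16, "Routine": 297}

print(round(p150(sample_chances, sample_plays, innings=1350.0), 1))
```

A fielder like this one, who beats the league on “Even” balls but drops a few extra “Routine” ones, ends up slightly below zero. That illustrates how good plays and mistakes can net out under this reading.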
OK, so a P150 rating of -4.3 is not actually good. In fact, it ranks Kemp in the lower tier of centerfielders – but the statistic itself gives some real-world context to how Kemp’s defensive liabilities actually manifest themselves on the field! There is a huge difference between failing to make a net 4.3 plays per season and being personally responsible for opening the floodgates and allowing 26-35 extra runs per season (as alleged by UZR). The P150 metric suggests that, on average, about once every six weeks, Kemp fails to make a play that an average CF would make. And, yes, you may observe Kemp actually make mistakes more often than that – but bear in mind that most of those mistakes are offset by Kemp’s above-average ability to make more challenging plays (e.g., Kemp converts batted balls in the “Even” category 67% of the time vs. the MLB average of 59%).
Mommy, Where Does Data Come From?
By the way, it is important to remember that all of this data is initially logged by human beings who work for Baseball Info Solutions (BIS). These “stringers” have an extremely difficult job because they must keep track of a multitude of variables on every play. Consider this excerpt from “The FanGraphs UZR Primer” (http://www.fangraphs.com/blogs/the-fangraphs-uzr-primer/):
“With UZR, if a fielder makes an out, and the UZR engine estimates that it was a difficult ball to field (and turn into an out) by an average fielder at that position, then the fielder will get more credit than if the UZR engine determined that it was an easy ball to field. Likewise, if a batted ball drops for a hit, a fielder will get more negative credit if UZR determined that it was an easy ball to field (for that fielding position) and less negative credit if it was a difficult ball to field. If a fielder makes an error, UZR automatically assumes that it was a relatively easy ball to field, since that is presumably the definition of an error in the first place, so there is no need to incorporate the speed and location of the batted ball and other factors that can influence how difficult a batted ball is to field. In other words, in UZR, errors are treated as balls that are normally fielded by that fielder and that fielder only (the one who made the error), 95% of the time, or whatever the average error rate is for that position and that type of ball.”
Wow, that’s a lot to keep track of while taking in a ball game! I have no knowledge about how the good people at BIS hire and train their stringers. I expect that they do an excellent job. That being said, I have to wonder if the “human element” may ever come into play when these stringers are logging what they see happen on the field. For example, if Kemp’s speed allows him to get close to a ball in play that a slower CF would never have a prayer of catching up to, might that ever have an impact on the data the stringer writes down? I could easily imagine that a well-hit ball to the gap that Kemp narrowly misses could be logged as a “failure” by Kemp to make a play when, if Ethier or VanSlyke were in CF, the play would just be logged as a double that one of those guys had to chase down. Consider that, according to FanGraphs’ data for VanSlyke’s first 128 innings as a CF, there have been zero plays to CF that fall within the “Remote,” “Unlikely,” or “Even” categories and only one play in the “Likely” category. Perhaps some of the stringers subconsciously realize that, when VanSlyke is playing CF, nearly every ball hit to that zone must either be “Routine” or “Impossible!”
Who Can Prove Whether UZR’s Premise Is True or False?
It’s at this juncture that I would like to issue a friendly request to the sabermetric community: Can someone steeped in UZR and its methodology please explain how, in 2013, Kemp’s defensive performance could have cost the Dodgers a full 35 runs over the course of 150 games? Seriously, where would all those runs have come from? And can someone please cite some actual examples of Kemp’s porous defense causing the opposing team to score runs at a rate anywhere near 35-runs-per-150-games? And if those UZR/150 figures do not accurately reflect the true impact of Kemp’s defensive lapses, can someone please clarify that (and explain it to Mr. Mattingly)? According to the “Inside Edge Fielding” data, Kemp’s 2013 rate of converting batted balls into outs was proficient enough that, over the course of the equivalent of 150 games, the net impact of his defense would have been two failures to make a play that an average CF would have made. That doesn’t seem like enough to cause opponents to score 35 times.
By the way, it should be noted that when we separate Kemp’s recent partial-year statistics on a season-by-season basis we get some rather curious results. According to the “Inside Edge Fielding” data, Kemp was actually above-average in 2012 (a P150 of +2.5), somewhat below average in 2013 (-2.0) and horrifically below average in 2014 (-23.4!). However, I would caution the reader that these statistics can become very volatile when the sample size gets too small, and Kemp has only played 326 innings in CF this season (the rough equivalent of 36 games). I would submit that, given the chance to play CF consistently, as Kemp’s mobility and confidence in his ankle improve, he might at least be able to approach his 2013 level of defense.
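To put a rough number on that volatility, here is a small Python sketch. It assumes P150 is a raw plays-above-average count scaled by innings to 150 nine-inning games, which is my reading of the metric rather than a confirmed formula. The function name is made up for illustration.

```python
def p150_swing_per_play(innings: float) -> float:
    """How far one extra play made (or missed) moves a per-150-games rate.

    ASSUMPTION: the rate is a raw play count scaled to 150 nine-inning games.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    game_equivalents = innings / 9.0
    return 150.0 / game_equivalents


swing = p150_swing_per_play(326)   # Kemp's 2014 CF innings, per the article
print(round(326 / 9.0, 1))         # game equivalents
print(round(swing, 1))             # P150 points per single play
print(round(-23.4 / swing, 1))     # raw plays implied by a -23.4 P150
```

Under that assumption, every single ball caught or missed in a 326-inning sample moves the rating by about four points, and a P150 of -23.4 boils down to something like five or six plays in raw terms.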
Do Ethier and VanSlyke Stink More or Less Than Kemp in CF?
What are the Dodgers’ alternatives? If Kemp is now a below-average CF, do Ethier or VanSlyke represent improvements? Ethier has now played over 1000 innings in CF across 2013 and 2014 (over 120 G) and his P150 is currently -5.3, which is a full play worse than Kemp’s -4.3 for the 2012-2014 period. Ethier’s P150 for 2014 is a disappointing -9.6 over 442.1 innings, but that is obviously much better than Kemp’s P150 of -23.4 accrued in 326 innings this season. VanSlyke’s sample size is hopelessly small (just 128.1 innings, the equivalent of just over 14 G) and his P150 so far is a dreadful -17.5. What about Puig? In 55.1 innings in CF (a crazy small sample size), Puig’s P150 is an awful -21.2.
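A note for anyone redoing these sample-size comparisons: I’m assuming the innings totals above follow the usual box-score convention, where the digit after the decimal counts outs, so “128.1” means 128 and one-third innings. Here is a minimal Python sketch of the conversion to nine-inning game equivalents. The helper name is made up for illustration.

```python
def parse_innings(text: str) -> float:
    """Convert box-score innings notation ('128.1' = 128 and 1/3) to a decimal.

    ASSUMPTION: the digit after the point is a count of outs (0, 1 or 2).
    """
    whole, _, outs_text = text.partition(".")
    outs = int(outs_text) if outs_text else 0
    if outs not in (0, 1, 2):
        raise ValueError(f"invalid outs digit in {text!r}")
    return int(whole) + outs / 3.0


# Innings totals quoted in the article.
for name, innings_text in [("Ethier 2014", "442.1"),
                           ("VanSlyke", "128.1"),
                           ("Puig", "55.1")]:
    games = parse_innings(innings_text) / 9.0
    print(f"{name}: {games:.1f} game equivalents")
```

By this conversion VanSlyke’s 128.1 innings come to a little over 14 games, which matches the figure above.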
So what should Mattingly do? First, he needs to free himself from the notion that Matt Kemp playing CF would cost his team anywhere near 35 runs per season. Second, Mattingly should give some careful thought as to whether or not he believes that, with consistent playing time, Matt Kemp has the potential to just get back to where he was defensively before July 22, 2013 (when he broke his ankle). Third, Mattingly should pay attention to the fact that Ethier’s and VanSlyke’s limited range in CF makes them defensive liabilities with little chance of improvement.
Meanwhile, I intend to root for Matt’s return to CF. And, failing that, I’ll pray for Joc!
. . . and, just for fun, here’s a small slice of my initial P150 data (P=Plays, C=Chances):