Tuesday, November 24, 2009

The Cy Young, Sabermetrics, and Evaluating Pitchers

First, an announcement. I'm now contributing at Baseball Reflections. The gig is for more general baseball content, and is a weekly item. So far, there've been two articles, one on Jorge Posada, one on instant replay. Lots of good stuff over at the site. Stop by and take a look.

I keep mulling over the Cy Young results. I've mentioned before I don't have an issue with who won, but I've been wondering about the methodology used to select the winner.

My rankings for Cy Young went Lincecum, Vazquez, Carpenter, Wainwright. Keith Law came up with the same rankings, albeit with Wainwright in place of Carpenter for third, putting me in the interesting position of agreeing with Keith Law. I came to my order after looking at some of the traditional metrics (ERA, Wins, Strikeouts, etc.), and some of the new statistics (FIP, WAR). I allowed the more sophisticated stats to trump the traditional ones. Fairly or not, Keith Law came under fire for his rankings, which caused me to re-examine mine.

For years, we in the sabermetric community have dissed wins as a measure of a pitcher's performance, and with good reason. The way managers use their pitching staff, especially their bullpens, has rendered the win pretty meaningless. If you've played any fantasy baseball in a league using wins as a statistical category, you've seen one of your relief pitchers get credit for a win after throwing 4 pitches, or one of your starters get a no decision after throwing 8 shutout innings because the closer came in and started throwing BP.

ERA is also out of vogue, mostly because of unearned runs being determined by the awarding of errors, an inherently subjective statistic based solely on the official scorer's determination as to whether the fielder should have made the play cleanly. We invented things like WHIP to better understand what made a pitcher successful. Then Tom Tango invented FIP, which attempted to boil down pitching evaluation to those things a pitcher controlled - allowing HR, walks and hit batsmen, and strikeouts. FIP removed the rest of the defense from the pitcher evaluation. Most people believe using FIP and stats of that nature has put pitcher evaluation on the right track.
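For anyone who wants to see the arithmetic, here's a rough sketch of the commonly published FIP formula in Python. The 13/3/2 weights are the ones usually quoted. The constant that puts FIP on the ERA scale changes by league and season, so the 3.10 default below is just a placeholder I'm assuming, and the stat line in the example is made up, not any real pitcher's.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching from the commonly published formula.

    hr, bb, hbp, k: home runs, walks, hit batsmen, and strikeouts.
    ip: innings pitched as a true decimal (one third of an inning is 1/3,
        not the box-score ".1").
    constant: league/season scaling constant; 3.10 is an assumed
        placeholder, not an official value for any particular year.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    # Home runs, walks, and hit batsmen hurt the pitcher; strikeouts help.
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical season: 10 HR, 50 BB, 5 HBP, 200 K in 200 innings.
example = fip(hr=10, bb=50, hbp=5, k=200, ip=200.0)
print(round(example, 2))
```

Notice what isn't in there: singles, doubles, ground balls, double plays. Nothing that happens once the ball is put in play and stays in the yard counts for or against the pitcher.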

What about the pitchers who pitch to contact, and use their defense and ballpark effectively? I think this comment, from a Cy Young post Cardinal70 did, sums up the sabermetric community's thoughts:

I'm sympathetic to the "Should groundball pitchers be punished for basically doing their job?" argument. However, that's an a priori argument that assumes that their approach is correct. In some way, such as in the aggregate, perhaps it is. But as far as an individual pitcher's contribution -- what he alone is able to do -- fielding-independent stats tell us more about the pitcher himself. If we are rewarding individual accomplishments, as it seems the Cy Young does, team philosophies are irrelevant. They're reflected, however, in a team's success.

The author of this comment isn't some schmoe. It's Pip from Fungoes, a man whose opinion I respect, an educated man who speaks intelligently about baseball in his blog posts. But I've come to disagree with this position. I think the SABR community is missing the forest for the trees.

The point of pitching isn't to give up no walks, no home runs, not hit anyone, and strike everybody out. The point is to get outs and keep guys off base. If you can't keep guys off base, then get outs and don't let them score. The strikeout is only one of a variety of ways the pitcher can succeed in preventing runs.

The philosophy behind FIP is right on the money. It gives the pitcher credit for executing his pitches correctly. Most HR are allowed because a pitcher leaves the pitch in the fat part of the plate; perhaps a fastball with no movement or a breaking ball that spins but doesn't break. Walks, HBP - can't find the strike zone or can't control where the ball is going. Strikeouts: most times a K is because of a well-thrown pitch in the exact location it was intended to go. No argument on the components of FIP.

However, pitchers don't pitch in a vacuum, and aren't the only guys on the field when pitching.
If Buzz Bissinger is to be believed, before each game the pitching coach, pitcher, and catcher get together to discuss how they will attack the opposing lineup. They discuss pitch location and tendencies of individual hitters, to develop a game plan for the night. It's reasonable to extend this preparation to the bench coach who positions the defense. I'm sure pitching coaches and bench coaches discuss the pitcher's approach to each hitter, so as to better position the defense. Pitchers who are able to execute their pitches and use that defensive alignment should get credit for it.

Think about it. How many times have you watched a game, and in inning after inning with guys on base the pitcher manages to get the hitter to roll the ball right to an infielder? Think that was by accident?

Evaluating pitchers should also take into account the types of outs that are made. In Chris Carpenter's 7 September complete game shutout against Milwaukee, he gave up two balls to the outfield. Nine IP, 1 hit, 2 walks, 10 K's. A dominating performance. The fact that 26 of the 27 outs were recorded by an infielder puts a whole other dimension on it for me. Of the 17 hitters that did put the ball in play, 16 couldn't get it out of the infield, meaning they either were fooled, or the pitch location was so good they couldn't center the ball on their bat and drive it. Carpenter should get credit for having the ability to throw that kind of game.

When you get down to it, FIP, WAR, ERA, K, K/9, BB/K, LD%, GB%, all these metrics are simply tools to develop a picture of how good the pitcher is. There's no one statistic, no magic formula, that spits out who's good and who's not, and basing a Cy Young vote on one or two of them is inherently misguided. Yes, I realize I'm making fun of my vote. Choosing pitching rankings by evaluating all of the data available, tempering it with personal observation if possible, is a much better way of doing business.

Again, I don't disagree with how the Cy Young voting shook out. The top three vote getters were all deserving of the award, and the fact 10 points separated them is good evidence the voters were torn as to who was the best. Wins and ERA aren't the be-all and end-all for evaluating pitchers. But neither are FIP and WAR. And not taking the use of the defense into account when deciding which pitcher has performed the best over the course of a whole season is to not use all the data at our disposal. It does a disservice to pitchers that don't have Lincecum's stuff but are still mighty effective pitchers.

I disagree with the community. You can't properly evaluate pitching without including some statistical information on how they use their defense. This is, after all, a team game.


Cardinal70 said...

That's a lot of what I was trying to get at, but done much better!

I mean, if we are just going to use WAR or something of that nature, why bother with voting? There's got to be some other facets of this, in my mind.

Good stuff, Mike. Maybe we can revisit it during tomorrow's show.

Mike said...

Let's plan on that. Discussing the Cy Young voting alone would take most of the hour.