Introduction

We don’t really have a widely available, coherent metric that tells us how good a pitcher is, independent of his home park and the defence behind him (and if anyone feels tempted to say ‘ERA’ here, read Dave Cameron’s article on pitcher evaluation first). FIP and xFIP are the most commonly used general pitching stats we have, but they suffer from the limitation of only considering strikeouts, walks, hit batsmen and home runs. We know there are other things that are mostly under a pitcher’s control, and we also know roughly how much control a pitcher exerts over said events. The hope for tRA, then, was to construct a metric which takes into account every action a pitcher is responsible for and turns those numbers into runs and outs, built on a logical and transparent mathematical framework.

Theory and Method

We start with a fixed set of ways in which a batter-pitcher matchup can end when it is resolved between the pair (i.e. not ended by a caught stealing or catcher’s interference). The batter may strike out looking, strike out swinging, walk, be walked intentionally or be hit by a pitch. The hitter may also hit a line drive, a ground ball, an outfield fly, a popup, a bunt or a home run. These outcomes can be regarded as being governed by the pitcher, provided that the sample size is large enough.

tRA is built around knowing how many runs and outs each of these events is worth. These individual factors differ from year to year and from league to league as we account for different run-scoring environments. Using play-by-play data we can, in any given year, determine the average number of outs made on a given type of play by going through each play, counting the outs made on each type, and dividing by the total number of plays of that type. This is fairly straightforward, although a small correction factor has to be introduced to deal with outs made on the bases. An example table for 2008 is shown below:

[Table: 2008 out values by event type]
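The counting step described above can be sketched in a few lines of Python. This is only an illustration: the `(event_type, outs_made)` tuple layout, the function name `average_outs_by_event` and the sample plays are all assumptions made up for the example, and the correction for outs made on the bases is left out.

```python
from collections import defaultdict


def average_outs_by_event(plays):
    """Return the mean number of outs recorded per play for each event type.

    `plays` is an iterable of (event_type, outs_made) tuples. This layout is
    an assumption for the sketch, not the format of any real data feed.
    """
    total_outs = defaultdict(int)
    play_counts = defaultdict(int)
    for event_type, outs_made in plays:
        total_outs[event_type] += outs_made
        play_counts[event_type] += 1
    return {event: total_outs[event] / play_counts[event] for event in play_counts}


# Made-up plays, purely to exercise the function (not real 2008 data).
sample_plays = [
    ("GB", 1), ("GB", 0), ("GB", 2), ("GB", 1),
    ("K", 1), ("K", 1),
    ("LD", 0), ("LD", 1),
]
out_values = average_outs_by_event(sample_plays)
```

A ground ball can be worth more than one out on a single play (a double play), which is why the outs are summed rather than simply counting plays that produced an out.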

Runs are slightly more tricky. We have to introduce a run expectancy matrix in order to work out how many runs should score from any given game situation (bases empty, no out, etc.). In general, they look something like this, but it’s certainly possible to build your own from play-by-play data, and given that we deal with a variety of leagues here at StatCorner, that’s exactly what we do. Once the matrix is derived, we can work out the difference in runs on any given play by looking at the following:

play_run_value = runs_scored + (run_expectancy_after - run_expectancy_before)
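As a rough sketch of how that formula might be applied, the snippet below looks up the before and after states in a run expectancy table. The `(bases, outs)` keys, the `RUN_EXPECTANCY` name and every number in it are placeholders invented for illustration, not real league values. Treating the end of an inning as a state worth zero expected runs is also an assumption of this sketch.

```python
# Hypothetical run expectancy table keyed by (bases occupied, outs).
# All values are made-up placeholders, not derived from any real season.
RUN_EXPECTANCY = {
    ("---", 0): 0.50,
    ("1--", 0): 0.90,
    ("---", 1): 0.27,
    ("---", 2): 0.10,
}


def play_run_value(runs_scored, state_before, state_after, re_matrix):
    """Runs scored on the play plus the change in run expectancy.

    `state_after` is None when the play ends the inning; the sketch assumes
    no further runs can be expected from that state.
    """
    re_before = re_matrix[state_before]
    re_after = 0.0 if state_after is None else re_matrix[state_after]
    return runs_scored + (re_after - re_before)


# Single with the bases empty and nobody out.
single_value = play_run_value(0, ("---", 0), ("1--", 0), RUN_EXPECTANCY)

# Third out of the inning with the bases empty.
third_out_value = play_run_value(0, ("---", 2), None, RUN_EXPECTANCY)
```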

With a little effort/fairy dust, the yearly average value of each type of play can be determined. So far for 2008 it looks a little like this:

[Table: 2008 run values by event type]

Again, a small correction factor is tossed in for baserunning effects (e.g. a runner scoring on a pickoff attempt error). If these run values are combined with the frequency with which a pitcher gives up each outcome (after making some park adjustments based on these numbers [spreadsheet]) and multiplied by total batters faced (TBF), we can determine how many runs that pitcher would have given up in a neutral park in front of an average defence. From the outs table shown earlier we can likewise figure out how many outs, and therefore innings, he would have been expected to pitch through. tRA can then be determined as follows:

tRA = expected_runs/expected_outs*27

which gives us the runs a pitcher would be expected to give up per 9 innings (27 outs) pitched.
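Putting the pieces together, a minimal sketch of the calculation might look like the following. The event labels, rates, run values and out values are all invented placeholders, and the rates are assumed to be park-adjusted already; none of this is StatCorner's actual code or data.

```python
def tra(event_rates, run_values, out_values, tbf):
    """Expected runs per 27 outs, given per-batter event rates.

    event_rates: share of batters faced ending in each event (assumed to be
                 park-adjusted already).
    run_values / out_values: average runs and outs per event.
    tbf: total batters faced.
    """
    expected_runs = tbf * sum(event_rates[e] * run_values[e] for e in event_rates)
    expected_outs = tbf * sum(event_rates[e] * out_values[e] for e in event_rates)
    return expected_runs / expected_outs * 27


# Placeholder numbers for a hypothetical pitcher (not real data).
rates = {"K": 0.20, "BB": 0.10, "GB": 0.40, "FB": 0.30}
runs = {"K": -0.10, "BB": 0.30, "GB": 0.05, "FB": 0.10}
outs = {"K": 1.00, "BB": 0.00, "GB": 0.80, "FB": 0.70}

pitcher_tra = tra(rates, runs, outs, tbf=800)
```

Because TBF multiplies both the expected runs and the expected outs, it cancels in the final ratio; it matters for the run and out totals themselves rather than for the rate.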

Another point worth considering is regression towards the mean. Certain pitching stats are known to fluctuate quite wildly from year to year. To correct for this, every outcome is regressed towards the mean based on its year-to-year correlation and the total batters the pitcher has faced on the season, with less regression applied the larger the sample size. The actual values to which regression is applied are as follows:

K%, BB%, HBP%, GB per ball in play%, IFF per ball in air%, LD per ball in air%, and HR per FB%

The order is extremely important, as influencing GB% will have an effect on LD% later, and so on, sometimes causing regression away from the mean in unusual situations.
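The exact regression formula is not spelled out here, so the following is only a sketch of one common way to shrink a rate towards the league mean, with the amount of shrinkage falling as the sample grows. The function name `regress_rate` and the constant `k` (standing in for whatever the year-to-year correlation implies for a given stat) are assumptions, as are all the numbers. The sketch treats each rate independently, so it does not reproduce the order-dependent interactions described above.

```python
def regress_rate(observed, league_mean, n, k):
    """Shrink an observed rate towards the league mean.

    n: number of opportunities behind the observed rate (e.g. batters faced).
    k: hypothetical constant; a larger k means a noisier stat and therefore
       more regression. With n == k the result is halfway between the two.
    """
    return (n * observed + k * league_mean) / (n + k)


# Made-up example: a 25% strikeout rate against an 18% league average.
small_sample = regress_rate(0.25, 0.18, n=200, k=200)
large_sample = regress_rate(0.25, 0.18, n=1800, k=200)
```

With the same observed rate, the larger sample ends up closer to 25% and the smaller one closer to the league's 18%, which matches the behaviour described in the text.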

2007 Player Cards Examples: Batista, Bedard, Felix, Silva, Washburn.

If there are any questions, you can send them here.

© 2009. tRA is the intellectual property of Graham MacAree.