Fumbles, Interceptions, What's the Difference?
One statistic1 that football coaches and sportscasters emphasize is turnovers. A good defense creates turnovers. A sloppy offense commits turnovers. Winning teams have a positive turnover margin (i.e. their defense forces more turnovers than their offense commits). Surely, then, a lost fumble and an interception must be equally detrimental to an offense. Right?
1. A "statistic" is a summary measure calculated from a data set. Statistics is a field concerned with extracting useful information from the data. Unfortunately, a common perception is that statistics (the profession) is nothing more than recording statistics (the summary numbers). A humorous, albeit disappointing, example of this confusion took place in an airport two years ago. While laid over, I began a conversation with a fellow traveler and subsequently divulged that I had earned a Ph.D. in statistics. His response was to ask about the total number of people afflicted with emphysema. Surely, he reasoned, studying statistics was merely an exercise in committing minutiae to memory.
Interceptions (Three Different Ones)2
Before discussing the value of an interception relative to a lost fumble, we'll stipulate that not all turnovers are equally harmful. To illustrate, we'll consider three types of interceptions:
In the first case, the half would have ended anyway, so the interception is of little consequence (although the quarterback's rating suffers). In the second case, the net yardage is better than the team could have expected from a punt, so this interception helps the team that threw it3. However, in the last situation, the interception represents a double-digit swing in points (from one team inside the other's 5 to a touchdown in the other direction). So, clearly, not even all interceptions are equal.
Since specific interceptions vary in the extent to which they harm the offenses (as do specific fumbles), this discussion will focus on the average value of a turnover. The question we'll attempt to answer is whether or not a fumble is more costly than an interception (or vice versa) on average.
2. While SportsQuant is solely focused on objective sports analysis and commentary, we couldn't pass up the reference to Pink Floyd's Pigs (Three Different Ones) on their 1977 album Animals.
3. Nothing irks us quite so much as a defensive back diving to make an interception on 4th down. Granted, given most coaches' reluctance to go for it on 4th down, usually these games are practically over anyway. However, given the forgone field position, a DB's decision to lay out for an interception in this case is at best stupid and at worst selfish (although in a contract year, every INT counts).
We'll examine the relative worth of interceptions and fumbles based on play-by-play data from every game in the 2004 NFL season (over 40,000 plays, which include 527 interceptions and 358 lost fumbles).
At the research page, the idea of win probability was introduced and illustrated using the progression of the Jacksonville at Pittsburgh game played on October 16, 2005. One way to measure the impact of a turnover is how it changes the probability of a team subsequently winning. (In the Jacksonville at Pittsburgh example, Rashean Mathis's interception return for a touchdown in overtime changed Jacksonville's probability of winning drastically.) However, as this example illustrates, the timing of a turnover can have a large impact on the value of the play. (Consider how much less Mathis's interception would have helped the Jags if it occurred in the first quarter of a tie game rather than in overtime.)
To mitigate the effects of time on the value assigned to turnovers, we'll consider the impact the plays have on expected points rather than win probability. Carter and Machol4 have shown that expected points are nearly linear in field position5, so it suffices to examine the net yardage resulting from each turnover.
4. Carter, Virgil and Machol, Robert (1971) Operations Research on Football, Operations Research, 19 (2) 541-544.
5. The result is that the value (V) of having field position Y yards from your opponent's goal line is a linear function. Specifically, using Carter and Machol's data, V = 5.91 - 0.077 Y. For example, the expected points scored on a drive beginning at your own 20 yard line (80 yards from the goal) is 5.91 - (0.077)(80) = -0.25 pts. Similarly, the value of possession on your opponent's 5 yard line is 5.91 - (0.077)(5) = 5.525 pts. Using this formula, the break-even point is your own 23 yard line. Any deeper is an advantage to the defense; any further downfield is an advantage to the offense.
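The arithmetic in footnote 5 can be checked with a short Python sketch. The function name `expected_points` is ours, chosen for illustration; the coefficients are the ones quoted from Carter and Machol.

```python
def expected_points(yards_to_goal):
    """Linear value of possession, V = 5.91 - 0.077 * Y (coefficients from footnote 5)."""
    return 5.91 - 0.077 * yards_to_goal

# Own 20 yard line is 80 yards from the opponent's goal line.
print(round(expected_points(80), 3))   # -0.25
# Opponent's 5 yard line.
print(round(expected_points(5), 3))    # 5.525

# Break-even point: solve 5.91 - 0.077 * Y = 0 for Y.
break_even = 5.91 / 0.077              # roughly 76.75 yards from the goal
print(round(100 - break_even))         # 23, i.e. about your own 23 yard line
```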
Turnovers and Field Position
Average Net Yards:
Perhaps the most common summary measure for a data set is the sample mean, which is the average of all the observations. The mean net yardage (from the perspective of the team giving up possession of the football) on fumbles and interceptions is comparable. For fumbles, the offense nets -0.7 yards, while on interceptions the offense nets -3.2 yards. Using Carter and Machol's approximate linear relationship, a lost fumble costs the offense 4.2 points (regardless of the original line of scrimmage), and an interception costs the offense 4.4 points.
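One way to see why the cost does not depend on the original line of scrimmage is the following derivation. It is our reconstruction, not necessarily the exact calculation behind the figures above. It assumes that the opponent's possession is worth the negative of its expected points to the team that turned the ball over, that Y is the pre-snap distance to the opponent's goal line, and that n is the net yardage from the offense's perspective (negative when ground is lost).

```latex
\begin{align*}
V(Y) &= 5.91 - 0.077\,Y \\
\Delta V &= -V\bigl(100 - (Y - n)\bigr) - V(Y) \\
         &= -\bigl(5.91 - 0.077\,(100 - Y + n)\bigr) - \bigl(5.91 - 0.077\,Y\bigr) \\
         &= -11.82 + 7.7 + 0.077\,n \\
         &= -4.12 + 0.077\,n
\end{align*}
```

The Y terms cancel, so only the net yardage matters. Plugging in n = -0.7 gives about -4.17 points and n = -3.2 gives about -4.37 points, which round to the 4.2 and 4.4 point costs quoted above.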
Variability in Field Position:
Although the average cost of an interception (measured in terms of field position or expected points) is slightly larger than that of a lost fumble, there is more to the story. Anyone who has bad memories of an introductory statistics course knows that statisticians love to compute standard deviations. (The standard deviation gives a crude measure of how spread out the data are. Standard deviations are always non-negative, and larger values indicate greater variability in recorded values.) The standard deviations are 23.9 yards for lost fumbles and 27.5 for interceptions, which seem similar.
So the means are similar, and the standard deviations are comparable. That means the distributions are the same, right? Not really. The following plot gives smoothed histograms of the net yardage associated with lost fumbles (blue) and interceptions (red). It shows (emphatically) that a larger percentage of fumbles result in a net change of near zero yards, while the chance of a very large absolute change in field position is much higher for interceptions. The x-axis gives the net yards associated with a lost fumble or an interception. The y-axis gives the relative frequency with which each net yardage is observed. (The absolute numbers on the y-axis aren't meaningful; only their relative values are.)
Why the original line of scrimmage matters:
Net yardage depends on the original line of scrimmage. If you have the ball on your own goal line, even a turnover returned for a touchdown results in only a scant change in net yardage. What can the line of scrimmage show us about the net yardage resulting from a turnover? The scatter plot shows net yards (y-axis) resulting from lost fumbles (blue) and interceptions (red) against pre-snap yards to goal (x-axis). The dashed lines represent the minimum and maximum net yardage possible based on the original line of scrimmage. The scatter plot shows that many more interceptions (56) than fumbles (0) were returned for touchdowns. It also emphasizes the large proportion of fumbles recovered for practically zero yards regardless of the original line of scrimmage.
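The bounds on net yardage can be sketched in Python. This is an illustration under our own assumptions about how the dashed lines are defined: the worst case is a return all the way to the offense's own goal line, and the best case is the ball ending up at the opponent's goal line (touchback placement is ignored). The function name `net_yard_bounds` is hypothetical.

```python
def net_yard_bounds(yards_to_goal):
    """Return (min_net, max_net) net yards for the offense on a turnover.

    yards_to_goal is the pre-snap distance to the opponent's goal line (1 to 99).
    Assumption: net yardage is measured from the original line of scrimmage.
    """
    if not 1 <= yards_to_goal <= 99:
        raise ValueError("yards_to_goal must be between 1 and 99")
    min_net = -(100 - yards_to_goal)  # turnover returned for a touchdown
    max_net = yards_to_goal           # ball ends up at the opponent's goal line
    return min_net, max_net

# Backed up on your own 1 yard line: a return touchdown is only -1 net yard.
print(net_yard_bounds(99))  # (-1, 99)
# At the opponent's 5: a return touchdown is a 95 yard swing.
print(net_yard_bounds(5))   # (-95, 5)
```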
What to make of it:
The average cost of an interception is slightly more than a lost fumble, and the plots indicate that there is a much higher chance of something very bad happening when you throw an interception. So by any measure, interceptions are worse than fumbles. It's very tempting to come to this conclusion, but it's also wrong!
In the earlier arguments, we used summary information to determine that the average cost of fumbles and interceptions was comparable, but the potential for disaster was much larger for an interception. But the earlier analysis contained a glaring omission.
Notice in both the smoothed histogram and scatter plot that there are some fumbles which result in large net yardage for the team giving up possession of the ball. (Such plays occur after a long run or a completed pass when the ball carrier is caught from behind and fumbles.) This differs crucially from a large positive net play resulting from an interception, because the offense earned those yards before fumbling whereas the offense did not earn those yards before being intercepted. Considering net yardage in the same manner implicitly assumes that had the ball not been intercepted, the intended receiver would have certainly caught the pass -- a highly suspect assumption.
To account for this difference in earned vs. unearned yardage, an appropriate measure of net yardage for a fumble is not the change in line of scrimmage from one play to the next. Rather it's the change from the position of the fumble to the subsequent line of scrimmage (in other words, return yardage). Unfortunately, the SportsQuant database doesn't have the precise location at which fumbles occurred. As a proxy, we can replace all positive net yard plays with zero. (Although possible, it's unlikely a recovering defensive player will intentionally lose yardage after securing possession.) In this case, the average net yardages for an interception and a fumble are -3.2 (interceptions aren't adjusted so this number is the same) and -6.8, respectively. So, on average, a fumble results in more than twice as many lost yards of field position as an interception.
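The proxy adjustment amounts to capping each fumble's net yardage at zero. A minimal Python sketch follows; the sample values are made up for illustration and are not taken from the 2004 data.

```python
from statistics import mean

def adjust_fumble_net_yards(net_yards):
    """Replace positive net yardage with zero; leave zero and negative values alone."""
    return [min(n, 0) for n in net_yards]

# Hypothetical net yards (offense's perspective) on a handful of lost fumbles.
sample = [12, -3, 0, 25, -40, -1, 4]
adjusted = adjust_fumble_net_yards(sample)

print(adjusted)  # [0, -3, 0, 0, -40, -1, 0]
# The adjusted mean can only stay the same or become more negative.
print(mean(sample), mean(adjusted))
```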
The difference in net yardage is significant. In statistical parlance, a Wilcoxon6 rank-sum test gives a one-sided p-value of 3.7 × 10^-6. In plain terms, this means that if there really were no difference in net yardage, we would expect to see a difference this substantial only 3.7 times per million experiments. Therefore, these data represent either a four-in-a-million kind of fluke, or our working assumption (turnovers are equal) is wrong. Since the chances of this happening entirely due to luck are minuscule, this constitutes overwhelming evidence that fumbles tend to be more costly than interceptions.
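For readers who want to see the mechanics, here is a minimal sketch of a one-sided rank-sum test in plain Python. It uses midranks for ties and a normal approximation, which is reasonable for samples as large as ours but only rough for the tiny made-up samples below. The function names and data are ours for illustration; in practice a library routine such as SciPy's `scipy.stats.mannwhitneyu` would normally be used, and we make no claim that this reproduces the exact procedure behind the p-value above.

```python
import math
from collections import Counter

def midranks(values):
    """Map each distinct value to its average 1-based rank in the sorted sample."""
    positions = {}
    for i, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(i)
    return {v: sum(p) / len(p) for v, p in positions.items()}

def rank_sum_less(x, y):
    """One-sided test that x tends to be smaller than y.

    Returns (z, p) using the normal approximation with a tie correction.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    w = sum(ranks[v] for v in x)           # rank sum of the first sample
    u = w - n1 * (n1 + 1) / 2              # Mann-Whitney U statistic
    mean_u = n1 * n2 / 2
    tie_term = sum(t**3 - t for t in Counter(pooled).values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - mean_u) / math.sqrt(var_u)
    p = 0.5 * math.erfc(-z / math.sqrt(2)) # standard normal CDF evaluated at z
    return z, p

# Hypothetical net yards, NOT the 2004 data.
fumbles = [0, 0, -1, -2, -3, -5, -8, -12, -20, -35]
interceptions = [25, 14, 8, 3, 0, -2, -6, -15, -30, -60]
z, p = rank_sum_less(fumbles, interceptions)
print(f"z = {z:.2f}, one-sided p = {p:.3f}")
```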
6. The Wilcoxon test is a non-parametric hypothesis test which attempts to determine if one group (in this case fumble net yards) is larger than another (interception net yards). Many people have heard of the t-test which has the same objective. However, the t-test relies on a number of parametric assumptions (such as normality of data). Since the smoothed histograms do not look bell-shaped, a non-parametric test (which makes no such assumptions) is more appropriate.
After adjusting net yardage resulting from lost fumbles, the average cost of a fumble is at least 0.3 points more than for an interception (4.65 vs. 4.35). Recall, the adjustment was to replace all positive net yard fumbles with zero. This is a best case scenario. For example, suppose a runner gained 10 yards before fumbling and the defensive player returned the fumble for 15 yards. In our data, we'd record this as a -5 net yard play, even though we should consider it as -15 yards (from the spot of the fumble). Therefore, it's very likely that the average fumble is even more costly than we have concluded.
A naive analysis (which doesn't consider yardage gained before fumbling) would conclude that interceptions are slightly more costly on average and result in a higher chance of a defensive touchdown. Upon closer inspection, though, we reach a far murkier conclusion -- namely that fumbles are more costly on average, but that, due to a higher degree of variability, the most costly turnovers are interceptions. (The least costly turnovers are also interceptions, but few coaches lose sleep worrying about benign mistakes.)
As with many complicated questions, there's no simple answer to the question "are all turnovers the same?" Those inclined toward such arguments can easily make the case that fumbles are more harmful than interceptions. However, since many of the most damaging turnovers are interceptions, one can make the case that they are more difficult to overcome. This disagreement between average behavior and extreme behavior is not uncommon. The answer to the question, in such cases, depends on what you value as important. (In decision theory terminology, the answer depends on the choice of the objective function.)