Statistical significance is not the same as material significance, or importance.
In statistical hypothesis testing, a result is statistically significant when it would be very unlikely to occur if the null hypothesis were true, that is, when it is larger than we would expect by chance alone. Within a given study, the result is statistically significant when the calculated p-value, p, is less than the chosen significance level, α. In other words, statistical significance reflects a low probability of observing data at least this extreme purely by chance.
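As a minimal sketch of the p < α decision rule, consider a made-up example that is not part of our analysis: testing whether a coin is fair after seeing 15 heads in 20 flips. The function names and the numbers below are assumptions chosen purely for illustration, with α set to the conventional 0.05.

```python
from math import comb


def one_sided_binomial_p_value(successes: int, trials: int) -> float:
    """P(X >= successes) for X ~ Binomial(trials, 0.5), i.e. under a fair-coin null."""
    tail_count = sum(comb(trials, k) for k in range(successes, trials + 1))
    return tail_count / 2 ** trials


def is_statistically_significant(p_value: float, alpha: float = 0.05) -> bool:
    """Apply the decision rule from the text: significant when p < alpha."""
    return p_value < alpha


# Hypothetical data: 15 heads in 20 flips.
p = one_sided_binomial_p_value(15, 20)
print(f"p = {p:.4f}, significant at 0.05: {is_statistically_significant(p)}")
```

Here the p-value comes out to roughly 0.02, so the result would be called statistically significant at α = 0.05. Whether anyone should care about a slightly unfair coin is a separate question.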
On the other hand, practical significance is about whether we should even care about the result, that is, whether the effect is useful in an applied context. An effect could be statistically significant, but that does not in itself mean that it is a good idea to spend money, time, or other resources on pursuing it in the real world. To put it succinctly, “A calculated difference is practically significant if the actual difference it is estimating will affect a decision to be made.” Essentially, practical significance arises from the applicability of the result to decision making; it is usually more subjective than statistical significance and almost always depends on external factors.
Naturally, then, a statistically significant result might not be practically significant. Statistical significance by itself does not imply that the results have practical consequences. This can happen when one uses a statistical test with very high power, for example one based on a very large sample: even a small difference from the hypothesized value can easily turn out to be statistically significant. However, that small difference might be meaningless in the specific situation at hand.
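To see how a high-powered test can flag a trivial difference, here is a rough sketch using a one-sample z-test with a known standard deviation. The effect size (0.01 standard deviations), the sample sizes, and the function name are all hypothetical; nothing here comes from our baseball data.

```python
from math import erfc, sqrt


def two_sided_z_test_p_value(observed_mean, null_mean, sigma, n):
    """Two-sided p-value of a one-sample z-test with known standard deviation sigma."""
    z = (observed_mean - null_mean) / (sigma / sqrt(n))
    # For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return erfc(abs(z) / sqrt(2))


# Hypothetical numbers: the observed mean sits only 0.01 standard deviations
# above the null value, and only the sample size changes between runs.
for n in (100, 10_000, 1_000_000):
    p = two_sided_z_test_p_value(observed_mean=0.01, null_mean=0.0, sigma=1.0, n=n)
    print(f"n = {n:>9,}: p = {p:.3g}, significant at 0.05: {p < 0.05}")
```

The difference being tested never changes, yet the two smaller samples give p-values well above 0.05 while the million-observation sample gives one far below it. The test became more powerful; the effect did not become more important.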
An example from the world of baseball statistics illustrates this phenomenon. It has been well documented that on-base percentage (OBP) and slugging percentage (SLG) correlate highly with a team's wins, leading to the now-popular stat on-base plus slugging (OPS). Both statistics have even been shown to be statistically significant predictors of the number of wins a team achieves in a season. However, another statistic, earned run average (ERA), is also statistically significant, as we show below.
Here are the results of a regression of Wins on five variables (Batting Average, OBP, SLG, ERA, and Strikeouts Per Inning) for the 1995-2014 seasons. The variables with stars (*) next to them are statistically significant:
The finding that OBP and SLG are statistically significant predictors of wins is also practically significant. Teams are going out, finding players with a high OPS, and paying them a ton of money. This is why home run stars with a low OBP were left out in the cold last winter when it came to contract extensions. Mark Trumbo, the home run king last season, for example, had to settle for a contract much lower than he expected because, even with his high SLG, he had only about a league-average OPS. The logic is simple: teams whose rosters have an above-average OPS win more games.
This then brings us to ERA. Unfortunately, ERA is a skewed and fundamentally flawed statistic. ERA tries to isolate the pitcher’s own performance by counting only earned runs, as opposed to unearned runs, which are chalked up to errors made by the defense or to other things outside of the pitcher’s control. If it were a perfect way to remove those runs that should not count against the pitcher, it would have value. However, it is not perfect, because it depends in large part on how errors are defined.
An error really only captures obvious mistakes, not the more nuanced aspects of poor defensive work, such as mispositioning or a slow first step. For a fielder to be charged with an error, he must already have done something right by being in the correct place to attempt the play. A bad fielder may therefore avoid many errors simply by being unable to reach batted or thrown balls that a better fielder would have grabbed. It is thus possible for a poor fielder to have fewer errors than a better defender simply by being in the wrong place. Additionally, errors are subject to scorer bias, and at times it is difficult even to tell a hit from an error!
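To make the scorer's influence concrete, here is a small sketch built on the standard ERA formula, nine times earned runs divided by innings pitched. The pitcher, the run totals, and the two scoring outcomes are invented for illustration, and the function name is our own.

```python
def earned_run_average(earned_runs: int, innings_pitched: float) -> float:
    """ERA = 9 * earned runs / innings pitched (earned runs allowed per nine innings).

    innings_pitched is a true decimal number of innings, not box-score notation.
    """
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return 9 * earned_runs / innings_pitched


# Hypothetical pitcher: 70 total runs allowed over 200 innings.
runs_allowed, innings = 70, 200.0

# One scorer rules that 10 of those runs followed errors (unearned);
# another scorer, watching the same plays, rules that only 2 did.
era_lenient = earned_run_average(runs_allowed - 10, innings)
era_strict = earned_run_average(runs_allowed - 2, innings)
print(f"ERA with 10 unearned runs: {era_lenient:.2f}")
print(f"ERA with  2 unearned runs: {era_strict:.2f}")
```

The pitcher threw exactly the same pitches in both cases, but his ERA moves from 2.70 to 3.06 depending on how the plays behind him were scored.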
Because ERA is not widely used in sabermetrics communities to rank pitchers, it is no surprise that baseball teams do not lean on it either. Teams are not looking for pitchers with a low ERA the way they look for hitters with a high OPS, because the statistic is biased and there are dozens of better stats for measuring a pitcher’s effectiveness, for instance FIP.
Therefore, even though ERA may be statistically significant in its correlation with a team’s wins in a season, it has no real practical significance, since building a strategy around acquiring pitchers with a low ERA is illogical. ERA will remain ingrained in baseball culture for the foreseeable future, but when your team trades for a pitcher with a high ERA, know that there could be other statistics under the surface supporting the maneuver.
The SaberSmart Team