Musings Going into the 2015 Season

Duquesne Frog

I've been struggling to come up with some blog ideas to pass the time until the breathlessly anticipated first DUSHEE ranking comes out, marking the official beginning of college football season. This morning, the Google News aggregator passed this article from the Detroit News before my eyes: the audaciously named "Top three things to know about college football analytics" by one Ed Feng, a fellow engineer.

http://www.detroitnews.com/story/sports/college/2015/09/18/feng-top-three-things-to-know-about-college-football-analytics/72385374/

In his article, he presents some interesting data and reaches a number of conclusions, some of which appear to be justified by the data and others not so much. First, in his introduction, he makes a point that I've made on these pages numerous times, namely that the college football season provides one of the least rich data sets of any popular American sport:

"College football also presents challenges due to a lack of data. The 12-game regular season seems like spare change compared to the 82 games for the NBA or 162 for MLB. Moreover, football offers almost no numbers on important players such as offensive linemen."

The other piece of this is that, in addition to the small number of games each team plays, there is a glut of teams all nominally counted in the same population. There are 125 teams playing in the FBS, and each plays only roughly 10% of the other teams in the league.
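As a quick sanity check on that "roughly 10%" figure, the arithmetic can be sketched in a few lines of Python. The 125-team count and the 12-game schedule come from the text above; the assumption that every game is played against another FBS team is mine, and it overstates the coverage slightly because some games are played against FCS opponents.

```python
# Back-of-the-envelope schedule coverage for the FBS.
# Assumption (not from the article): all 12 regular-season games are
# played against other FBS teams.

def schedule_coverage(n_teams: int, games_per_team: int) -> dict:
    """Return how much of the possible schedule is actually played."""
    opponents_available = n_teams - 1
    possible_pairings = n_teams * (n_teams - 1) // 2
    games_played = n_teams * games_per_team // 2  # each game involves two teams
    return {
        "share_of_opponents_faced": games_per_team / opponents_available,
        "possible_pairings": possible_pairings,
        "games_played": games_played,
        "share_of_pairings_played": games_played / possible_pairings,
    }


if __name__ == "__main__":
    for name, value in schedule_coverage(125, 12).items():
        print(f"{name}: {value}")
```

Under these inputs each team sees a bit under 10% of its possible opponents, and fewer than one in ten of the possible pairings is ever played.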

Now we get to the three things. I'm not sure why these are the "top" three things we should know about college football analytics, but we'll blame the headline writer for that.

Thing 1: Last Year Matters

Here Feng really makes two assertions: 1) there is a vast disparity in resources among the 125 teams in the FBS and because of that 2) past performance is a predictor of future results. To me, the obvious place to look for evidence in support of these assertions would be a correlation between resources (athletic budgets and revenue, alumni bases, attendance, etc.) and performance over time.

Instead he focuses on assertion #2 and attempts to correlate team performance from 2013 to 2014. We've discussed here the striking lack of consistency that a team of college football players shows from game-to-game within a season (http://www.thefroghorn.com/index.php/blog/2/entry-39-blind-squirrels-finding-acorns/). The data he shows indicates to me that, not surprisingly, the consistency doesn't improve when you add the variables of player and leadership turnover from season-to-season into the mix:

[Figure: 635781490283908784-EFFinal1.jpg, Feng's chart of team performance in 2013 versus 2014, with a diagonal reference line]

Feng looks at this data set and sees evidence of "strong persistence" from season to season in team performance. However, take away the diagonal line, which biases the mind toward seeing a correlation in a blob of data points where, at best, only a very weak correlation exists, and I think you'd be hard pressed to argue that the data set "strongly" shows anything. "Strong" correlation would show a concentration of data points on or near that diagonal line instead of a blob of points that is ever-so-slightly skewed in the direction of correlation. The fact that the data set IS such a random scattering tells me that the correlation of the previous season's results to the upcoming season's results is very weak and that the 70% prediction rate claimed is due to the high count of "body bag" games, where many teams in the top right portion of that chart play teams in the bottom left.
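The argument that a respectable pick rate can coexist with a weak season-to-season correlation, because lopsided games are easy to call, can be explored with a toy simulation. The sketch below is neither Feng's model nor DUSHEE. Every number in it (the correlation, the normally distributed ratings, the cutoff for a "mismatch", and the rule that the better current team always wins) is an assumption chosen for illustration.

```python
import random


def simulate_prediction_rates(correlation=0.3, n_teams=125, n_games=20000,
                              mismatch_gap=1.0, seed=0):
    """Pick winners from last year's rating and score the picks.

    Illustrative assumptions, not real data: ratings are standard normal,
    this year's rating is built to have the requested correlation with
    last year's, and the team with the higher current rating always wins.
    """
    rng = random.Random(seed)
    last = [rng.gauss(0, 1) for _ in range(n_teams)]
    noise_scale = (1 - correlation ** 2) ** 0.5
    this = [correlation * x + noise_scale * rng.gauss(0, 1) for x in last]

    hits = {"mismatch": 0, "close": 0}
    totals = {"mismatch": 0, "close": 0}
    for _ in range(n_games):
        a, b = rng.sample(range(n_teams), 2)  # two distinct teams
        gap = abs(last[a] - last[b])
        kind = "mismatch" if gap > mismatch_gap else "close"
        predicted_a_wins = last[a] > last[b]
        actual_a_wins = this[a] > this[b]
        totals[kind] += 1
        hits[kind] += predicted_a_wins == actual_a_wins

    rates = {
        kind: (hits[kind] / totals[kind]) if totals[kind] else float("nan")
        for kind in totals
    }
    rates["overall"] = sum(hits.values()) / n_games
    rates["mismatch_share"] = totals["mismatch"] / n_games
    return rates


if __name__ == "__main__":
    for name, value in simulate_prediction_rates().items():
        print(f"{name}: {value:.3f}")
```

In a setup like this, picks in lopsided matchups should come out right noticeably more often than picks between closely rated teams, so the more lopsided games a schedule contains, the better the overall prediction rate will look.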

Thing 2: Predicting Turnovers

Nothing will split the nu-skool college football analytics advocate from the old-school football fan/writer more than the randomness of the turnover. Both camps agree on the importance of the turnover on the outcome of the game (http://www.footballstudyhall.com/2013/8/23/4649718/college-football-turnover-margin-winning-percentage), but the two sides strongly disagree on the ability a team has to generate turnovers (and prevent their own) by either sheer will or strength of play. Analytics will tell you without equivocation that turnover margin is one of the most random metrics from season to season you can find (http://grantland.com/features/nfl-stats-predicting-success/, http://www.advancedfootballanalytics.com/index.php/home/research/general/79-examining-luck-in-nfl-turnovers). Teams, even those in the NFL which possess far more season-to-season roster cohesiveness than college teams, do not have the ability to maintain high turnover margins consistently.

(The one exception to this has been the recent vintage Patriots who have consistently not fumbled at a level far exceeding any other team in the league, evidence used by many that they have, at minimum, uniquely discovered some competitive advantage, and most probably that they are cheating. http://www.wsj.com/articles/patriots-always-keep-a-tight-grip-on-the-ball-1422054846)

Feng here presents one of the few instances of evidence for some modicum of predictability in turnovers that I've seen in the correlation between quarterback accuracy and interception rate:

[Figure: 635781490284844790-EFFinal2.jpg, Feng's chart of quarterback accuracy versus interception rate]

Even here, the correlation isn't rock-solid, but it is weakly there, and as Feng states:

"Correlation doesn't imply causation, but it's reasonable to think that less accurate quarterbacks tend to throw more interceptions. Sometimes an errant quarterback throws the ball at his receiver's ankles, other times it goes to the other team."

The randomness and unpredictability of turnovers is why I don't use turnover margin explicitly in DUSHEE, although it is there in its impact on point margin. There is no question that turnovers have a strong impact on point margin. And it is, in part, why you see so much variability in a team's performance from game-to-game and season-to-season.
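To see what pure luck alone would look like, here is a toy simulation in which every team has identical turnover "skill". The number of drives per game and the per-drive turnover probability are made-up assumptions for illustration, not measured values, and the two seasons are drawn independently by construction.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def season_margin(rng, games=12, drives=12, p_turnover=0.12):
    """Season turnover margin when every drive, for either side, has the
    same hypothetical chance of ending in a turnover (no skill at all)."""
    margin = 0
    for _ in range(games * drives):
        margin += rng.random() < p_turnover  # takeaway on an opponent drive
        margin -= rng.random() < p_turnover  # giveaway on our own drive
    return margin


def simulate_turnover_luck(n_teams=125, seed=0):
    """Return two independent seasons of margins and their correlation."""
    rng = random.Random(seed)
    year1 = [season_margin(rng) for _ in range(n_teams)]
    year2 = [season_margin(rng) for _ in range(n_teams)]
    return year1, year2, pearson(year1, year2)


if __name__ == "__main__":
    year1, year2, r = simulate_turnover_luck()
    print("best and worst margin, year 1:", max(year1), min(year1))
    print("year-to-year correlation:", round(r, 3))
```

Even with no skill differences at all, single-season margins spread out widely in a model like this, while the year-to-year correlation should hover near zero.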

Thing 3: Efficiency, Efficiency, Efficiency

While in most respects I fall into the nu-skool analytics camp, the focus on efficiency, or looking at statistics on a per-play basis, is not one with which I am fully on board. First and foremost, the game is not played on a per-play basis; it is played on a per-game basis. If you are a successful grind-it-out, huddle-up, milk-the-clock kind of offense and can keep the opponent's offense off the field, you will be viewed poorly on an efficiency basis. In my mind, if you put 500 yards of total offense against an opponent, not only does it not matter whether it took you 50 or 80 plays to do it, it might actually be preferable in some cases to do it in 80 plays (i.e., less "efficiently") because you are helping out your defense more by keeping them off the field and giving the opposing offense fewer opportunities to score.

This was always the failing of the run-and-shoot offenses of the Jerry Glanville era. Yes they moved the ball and scored lots of points quickly (read, efficiently) but they also kept their defenses on the field for lots of plays.

This was the raison d'être for Gary Patterson offenses prior to the Meacham/Cumbie era and a pretty successful one ... grind it out, score points in a methodical fashion, don't put your defense in bad situations.

So, from a performance evaluation standpoint, I'm still not convinced that the per-play basis is more indicative of a team's quality than the per-game basis.
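The 500-yard example above can be put in numbers with a short sketch. The two offenses are hypothetical, and the flat 30 seconds of game clock per play is my own simplifying assumption rather than a measured figure.

```python
# Two hypothetical offenses with the same total yardage.
# Assumption (for illustration only): every play, including the time
# between snaps, burns 30 seconds of a 60-minute game clock.

SECONDS_PER_PLAY = 30
GAME_MINUTES = 60


def offense_summary(total_yards, plays):
    """Compare the per-game and per-play views of one offensive outing."""
    minutes_used = plays * SECONDS_PER_PLAY / 60
    return {
        "yards_per_game": total_yards,
        "yards_per_play": total_yards / plays,
        "minutes_used": minutes_used,
        "minutes_left_for_opponent": GAME_MINUTES - minutes_used,
    }


if __name__ == "__main__":
    for label, plays in (("quick-strike", 50), ("grind-it-out", 80)):
        print(label, offense_summary(500, plays))
```

On a per-play basis the 50-play offense looks far better, yet both gained the same 500 yards, and under the clock assumption the 80-play offense leaves the opponent considerably less time with the ball.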

Whither DUSHEE

So, I don't think Dr. Feng has convinced me that any of this portends any necessary changes to DUSHEE for the upcoming season. However, one significant factor that DUSHEE should account for but doesn't is the effect of home field. As it currently stands, DUSHEE does not distinguish between performance at home versus away, though pretty much any analytical approach shows a clear advantage (usually 3-5 points) to playing at home. As of this writing, I still haven't formulated how I want to change DUSHEE to account for it, but I have decided I need to come up with some basis for doing so. Stay tuned ...



1 Comment


I agree that home vs away is an important component. One of the (many) things that drove me nuts last season was the argument that Baylor's last-second 3-point win at home somehow was definitive proof that they were better than TCU.
