We built a machine learning model that predicted the Super Bowl would have record-low viewership. Here are three things it had to learn first.
For any ambitious data scientist, the phrase “maybe this is a wild idea” is a clue to listen closely. Problem solving and creativity are among the most important traits of excellent data scientists, so you’re always listening for crazy ideas that might just be possible with the huge leaps forward data science is taking every year.
Last year, as the world locked down to try to contain COVID-19, one of the biggest drivers of demand for many retail, food and delivery companies became televised sports. For many Americans, gathering to watch the game became even more of a highlight of their week. My team builds models that aggregate and predict the impact of demand drivers such as conferences, concerts and severe weather, so we decided to build a model that could predict the viewership of televised sports games as well. The available viewership data was always post-game, and at a designated marketing area level rather than by county. We knew food retailers, delivery groups and grocers needed forward-looking, more granular data, and we decided to build a model to provide it.
We knew it was going to be hard. But we also suspected that a previous series of models we had built might come in handy. We started building the model, learned a lot, deployed cutting-edge technology, and ultimately arrived at accurate viewership predictions for televised sports. I want to share how we did this with other data scientists, but before we dive into how we built the model, the major question for anyone with a prediction model is: does it work?
Two weeks out from the 2021 Super Bowl, our model delivered a surprising forecast: the lowest Super Bowl viewership in 20 years. It predicted 96 million viewers, down 6 million from 2020 and the lowest since 1997. The post-game viewership reported by CNBC was 96.9 million, so our new model was doing well.
Data scientists are frequently asked to explain their forecasts. Digging into the result, we were able to identify it was largely due to three factors:
- Sports fan prediction: Sports viewership was lower than the previous year due to a drop in the NFL’s popularity. Like most sports, the NFL attracted fewer viewers in 2020. Our model considers several factors, including historical trends and shifts in league popularity.
- Super Bowl team rankings: Based on the popularity of the two teams that made it to the Super Bowl, as reflected by their regular-season rankings, our model predicted slightly lower viewership than last year. While the Kansas City Chiefs were the most popular team this year, they were playing the Tampa Bay Buccaneers, the 10th most popular team. Last year, the Chiefs played the San Francisco 49ers, the third most popular team. Team popularity has a heavy influence on the number of viewers who tune in.
- Game uncertainty: This is one of the most interesting features of our model. It calculates how much uncertainty about which team will win affects viewership. In general, a more unevenly matched game attracts fewer viewers, because everyone assumes they already know the outcome. Given the lower uncertainty in this year’s Chiefs versus Buccaneers matchup, our model predicted a lower probability of fans tuning in, and therefore lower viewership, for this year’s game (a rough sketch of this idea follows the list).
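To make the popularity and uncertainty factors above concrete, here is a minimal sketch of how a win probability could be turned into an uncertainty score that, together with team popularity, adjusts a baseline tune-in probability. The function names, weights and simple linear blend are illustrative assumptions, not our production model.

```python
import math

def matchup_uncertainty(p_home_win: float) -> float:
    """Binary entropy of the win probability, normalized to [0, 1].

    An evenly matched game (p = 0.5) returns 1.0; a near-certain
    outcome (p close to 0 or 1) returns a value close to 0.0.
    """
    p = min(max(p_home_win, 1e-6), 1 - 1e-6)
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def tune_in_probability(base_rate: float,
                        team_popularity: float,
                        p_home_win: float,
                        uncertainty_weight: float = 0.3,
                        popularity_weight: float = 0.5) -> float:
    """Hypothetical blend: a baseline tune-in rate adjusted by team
    popularity and by how uncertain the outcome is."""
    score = (base_rate
             + popularity_weight * (team_popularity - 0.5)
             + uncertainty_weight * (matchup_uncertainty(p_home_win) - 0.5))
    return min(max(score, 0.0), 1.0)

# A lopsided matchup lowers the predicted tune-in probability
# relative to an evenly matched one, all else being equal.
print(tune_in_probability(0.55, team_popularity=0.8, p_home_win=0.75))
print(tune_in_probability(0.55, team_popularity=0.8, p_home_win=0.50))
```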
No model is perfect the first time, and while this prediction was 99.6% accurate, there were a lot of dead ends, late nights and endless hours of hard thinking to get there. Every machine learning model aims to provide more context and intelligence than we had before it, and to get smarter and more accurate with time. I want to share three of the big lessons we learned along the way so any team working on an intensely challenging new model can get to better outcomes faster.
After our first few weeks of experimentation, I was feeling pretty defeated. We had designed a supervised machine learning architecture based on external historical viewership data to predict the viewership of future games. We were investigating several external sources of viewership data which, while deeply useful for some use cases, just weren’t built for ours (forecasting). It wasn’t working. My team and I had tried more than ten different approaches to estimate the viewership of historical games, but we were having no luck.
On a Friday, I let the rest of our company’s executive team know it wasn’t working as hoped, and headed into the weekend. While surfing, I realized there was a way we could arrive at the predictions without relying on external viewership data to build supervised machine learning models. We would estimate the number of sports fans per county, and then build a probability model based on hundreds of factors per game to identify what proportion of that fan base would tune in for a given game.
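As a rough illustration of that two-stage structure (the county codes, feature names and placeholder probability below are hypothetical, not PredictHQ’s actual model), predicted viewership for a game can be framed as a sum over counties of the estimated fan base multiplied by a per-game tune-in probability:

```python
from typing import Callable, Dict

def predict_game_viewership(
    fans_per_county: Dict[str, float],
    tune_in_prob: Callable[[str, dict], float],
    game_features: dict,
) -> float:
    """Sum over counties: estimated fan base x probability that a fan
    in that county tunes in, given the game's features."""
    return sum(
        fan_count * tune_in_prob(county, game_features)
        for county, fan_count in fans_per_county.items()
    )

# Hypothetical inputs for illustration only.
fans = {"06037": 1_200_000, "17031": 900_000}   # county FIPS -> estimated fans
features = {"stage": "postseason", "p_home_win": 0.62, "kickoff_hour": 18}

viewers = predict_game_viewership(
    fans,
    tune_in_prob=lambda county, f: 0.45,  # placeholder probability model
    game_features=features,
)
print(f"Predicted viewers: {viewers:,.0f}")
```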
This was possible because we had already built an extensive entities system for PredictHQ, which tracks every sports game across the United States. To do so accurately, we also track teams, individual performers, venues and much more, with more than five years of historical data. For example, we know each venue’s exact latitude and longitude, so it was easy to identify how far from its home county a game was being played, which we confirmed had a material impact on viewership. We also knew which teams sold out stadiums, the impact the time of a game had on ticket sales and more.
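As one concrete illustration, with a venue’s latitude and longitude in hand, a distance-from-home feature is straightforward to derive. The haversine sketch below uses approximate coordinates purely as an example; the inputs and naming are assumptions, not our exact implementation:

```python
import math

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two lat/lon points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Hypothetical example: distance between a venue and a team's home-county centroid.
venue = (27.976, -82.503)        # Raymond James Stadium, Tampa (approx.)
home_county = (39.100, -94.579)  # Kansas City area centroid (approx.)
print(f"{haversine_km(*venue, *home_county):.0f} km from home")
```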
This gave us a reliable foundation from which to build our model. It enabled deep segmentation, which is critical for accuracy in highly complex models. But we couldn’t have created our model without taking a leap and using an emerging data science approach.
Probability models have been valuable for data scientists for a long time. And we wanted to use one, because explainability is so important for data scientists — especially when their work is informing demand forecasting decisions that can generate millions of dollars in additional revenue and savings.
But probability models can also become unwieldy quickly when you are dealing with a brand new problem and a wide range of data. We needed to create a model that could learn from the data itself, rather than our team setting the parameters and testing and checking.
Parametric probability models require strong assumptions about the data to work. In practice, the real data distribution is far more complex than those assumptions allow, so parametric models fall short. At that point you need to set the model free to learn from the data, purely and directly. That requires strong learning capacity, which, after decades in data science, I am convinced can only come from advanced non-parametric elements. It can be daunting to incorporate new approaches into models, but data science is moving fast as a profession, and feats are possible now that weren’t a few years or even months ago. Our non-parametric models, by not assuming a fixed form for the mapping function, capture and learn the non-linear relationships in our deeply transformed raw features.
For example, we needed to accurately model how fans’ interest differs by stage of the season (just one of hundreds of specific factors). Our sports SMEs and research suggest that, in general, interest grows through the season, starting lower in the pre-season and increasing through to the postseason. But the trend is non-linear: there are local maxima along the way, such as the kick-off games of the regular season. We needed a non-parametric approach to pinpoint the trend and its contributing factors, such as team performance and game uncertainty. Not using one would have held us, the model and our customers back. With a parametric approach, we would have had to make assumptions, run a linear regression, check the results and repeat thousands of times (years of work) to arrive at the same conclusions; our non-parametric model identified them within its first few weeks.
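To illustrate the difference on a toy version of the season-stage problem (the data below is synthetic, and gradient-boosted trees stand in for whatever non-parametric learner a team might choose), a non-parametric fit can recover a local kick-off spike that a straight-line fit smooths away:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic "interest by week of season": a rising trend with a local
# spike at the regular-season kick-off (week 4 in this toy setup).
weeks = np.arange(0, 22).reshape(-1, 1)
interest = 0.4 + 0.02 * weeks.ravel() + 0.15 * (weeks.ravel() == 4)
interest += rng.normal(0, 0.01, size=interest.shape)

linear = LinearRegression().fit(weeks, interest)
boosted = GradientBoostingRegressor(n_estimators=200, max_depth=2).fit(weeks, interest)

kickoff = np.array([[4]])
print("observed kick-off interest:", round(float(interest[4]), 3))
print("linear fit at kick-off:    ", round(float(linear.predict(kickoff)[0]), 3))
print("boosted fit at kick-off:   ", round(float(boosted.predict(kickoff)[0]), 3))
```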
Once we had a model that we were confident was functioning well and on the way to laser-like accuracy, we reached out to some of our most engaged food and delivery customers to test it alongside our team.
This was critical. As a data scientist, it is easy to focus purely on the science and the joy of experimentation and discovery. But if you can’t make your work drive value for your company and its customers, you won’t get to keep creating exciting work for long.
We used a two-track benchmark system to review our model:
- At the aggregated national level, our predicted viewership for games that had already occurred was within a close margin of the historical viewership reported by post-broadcast estimation models. As the graph above shows, our predictions are similar to external post-game sources such as NBC’s post-game viewership data.
- We also tracked whether our viewership figures correlated with the expected impact on demand for our key customers.
Tracking both elements was critical to creating forecast-grade data. Accurate predictions that don’t relate to customer transactional data would indicate that we weren’t identifying relevant information for smarter demand forecasting.
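A minimal version of that two-track check might look like the sketch below; the viewership and demand-lift numbers are placeholders, and the real review used external post-game estimates and our customers’ transactional data:

```python
import numpy as np

def mape(actual: np.ndarray, predicted: np.ndarray) -> float:
    """Mean absolute percentage error against post-game viewership estimates."""
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)

# Track 1: national-level accuracy vs. post-broadcast estimates (placeholder values).
post_game = np.array([14.2, 18.9, 25.1, 96.9])   # millions of viewers
predicted = np.array([13.8, 19.4, 24.0, 96.0])
print(f"MAPE vs post-game estimates: {mape(post_game, predicted):.1f}%")

# Track 2: correlation between predicted viewership and a customer's
# observed demand lift on game days (placeholder values).
demand_lift = np.array([0.06, 0.09, 0.12, 0.31])
corr = float(np.corrcoef(predicted, demand_lift)[0, 1])
print(f"Correlation with customer demand lift: {corr:.2f}")
```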
These are only three lessons and we learned many more. But I wanted to share these three as soon as possible, because I know there are so many data scientists who are looking at 2021 and beyond, and facing significant challenges. Businesses need to be data-driven as we enter a long-anticipated recovery that will be highly fragmented. It’s never been more important for data scientists to step up and evangelize the true power of data science for demand planning, company efficiency and innovation.