Now that most of the NBA teams have passed the halfway point of the season, we recalculated our end-of-season win total predictions and playoff probabilities. In truth, we don’t think that the second half of the season matters that much, in terms of playoff probability.
Since the top eight teams from each conference make the playoffs, the likelihood that any of the top four teams in wins per conference at mid-season would fall out of the top eight by the end of the regular season is astronomically small. However, the second half does matter for the teams fighting for the last couple of spots, as well as, of course, for playoff seeding.
Last fall, we generated win probabilities for all 33 games of the MLB postseason, including the 2 wild-card games, 26 divisional and championship round games, and the 5 World Series games. We think it is finally time to declare how we did.
For each game, we compared win probabilities from six sources: two "baselines" (50/50 odds for each game, and the home team winning each game); two for-profit industries (Vegas betting lines and FiveThirtyEight's Elo); and our two win probability metrics (Runs Scored/Runs Allowed wp%, created in 2017, and our Bayesian SaberSmart Simulator, created in 2018).
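To make "how we did" concrete, one standard way to grade probabilistic forecasts like these is the Brier score: the mean squared difference between the stated win probability and the 0/1 outcome, where lower is better. This is a hedged sketch with made-up game results, not the blog's actual scoring method or data.

```python
# Brier score sketch for comparing probabilistic forecasters.
# The outcomes and probability lists below are hypothetical examples.

def brier_score(probs, outcomes):
    """Mean squared error of forecast probabilities vs. 0/1 results.

    Lower is better; a constant 50/50 forecaster always scores 0.25.
    """
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

outcomes = [1, 0, 1, 1]            # hypothetical results (1 = home team won)
coin_flip = [0.5, 0.5, 0.5, 0.5]   # the 50/50 baseline
home_picks = [1.0, 1.0, 1.0, 1.0]  # "home team always wins" as a hard pick

print(brier_score(coin_flip, outcomes))   # 0.25
print(brier_score(home_picks, outcomes))  # 0.25 here: one of four picks missed
```

Note that the "home team wins" baseline scores exactly the fraction of games the home team lost, which is why it can beat or lose to the coin flip depending on the slate of games.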
A reader over on our Twitter recently brought to our attention a weekly MLB feature on ESPN known as the 100s Tracker. They say the purpose of this tracker is "projecting (the) date (the) Red Sox, (and) others will hit (the) century mark". Naturally, anything about projecting baseball wins using "on-pace" logic grabs my attention, especially when it comes from America's own entertainment and sports programming network.
If you have been following this blog this summer, you know that I have been waging a bitter war against the blatant use of "on-pace" in sports analytics, arguing that it should be replaced by "expected wins".
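The distinction is easy to show in code. Below is an illustrative sketch, not the blog's actual model: "on-pace" extrapolates the current win rate straight to 162 games, while an "expected wins" estimate first shrinks that rate toward .500 before projecting the remaining games. The shrinkage weight (`prior_games`) is a hypothetical choice for illustration.

```python
# Contrast naive "on-pace" extrapolation with a regressed "expected wins"
# estimate. The regression weight below is an assumption, not a fitted value.

def on_pace_wins(wins, games_played, season_length=162):
    """Naive extrapolation: current win rate projected over the full season."""
    return wins / games_played * season_length

def expected_wins(wins, games_played, season_length=162, prior_games=60):
    """Shrink the observed win rate toward .500 before projecting the
    remaining games, so early hot or cold streaks count for less."""
    rate = (wins + 0.5 * prior_games) / (games_played + prior_games)
    return wins + rate * (season_length - games_played)

# A 20-6 start is "on pace" for about 125 wins; expected wins is far lower.
print(round(on_pace_wins(20, 26), 1))    # 124.6
print(round(expected_wins(20, 26), 1))   # 99.1
```

The gap between the two numbers is the whole argument: "on-pace" treats a hot month as if it will repeat for six more, while expected wins bakes in regression to the mean.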
Wisdom of the crowd is more than the ill-fated television show of the same name. We have all heard the old adage that many minds are better than one, and nature bears it out abundantly: across countless species, social creatures working together as unified systems can outperform the vast majority of their individual members when solving problems and making decisions. Bees, fish, and ants are classic examples.
It should come as less of a surprise, then, that humans working together can efficiently converge on solutions to decision problems, and even make accurate predictions. This theory of collective intelligence has been studied and analyzed for the past century in an effort to establish how, when, and under what circumstances it predicts accurately, and when it fails.
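The core statistical effect is easy to simulate. In this toy sketch (the true value, noise level, and crowd size are all arbitrary choices, not data from any study), each "individual" makes a noisy guess at an unknown quantity, and the crowd's average lands much closer to the truth than the typical individual does.

```python
import random
import statistics

# Toy wisdom-of-the-crowd simulation: independent noisy guesses,
# averaged together, beat the typical individual guess.

random.seed(42)            # fixed seed so the demo is reproducible
TRUE_VALUE = 100.0         # the quantity everyone is trying to estimate
guesses = [random.gauss(TRUE_VALUE, 15) for _ in range(500)]

crowd_estimate = statistics.mean(guesses)
crowd_error = abs(crowd_estimate - TRUE_VALUE)
typical_error = statistics.mean(abs(g - TRUE_VALUE) for g in guesses)

print(f"crowd error:              {crowd_error:.2f}")
print(f"typical individual error: {typical_error:.2f}")
```

The catch, and a big part of why the theory sometimes fails, is the independence assumption: when the guesses share a common bias, averaging cancels noise but not the bias.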
Have you heard of machine learning but haven’t found a way to implement any algorithms? Do you use R for all of your machine learning models and are wondering how to scale and deploy your models to production quickly and efficiently? Do you solely use R, or caret, for your machine learning models and want to diversify your skillset?
No judgement if you do, but let me introduce you to what is, in my opinion, a superior way to craft and deploy machine learning models using Python and scikit-learn.
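As a taste of what that workflow looks like, here is a minimal scikit-learn sketch: load a dataset, split it, fit a model, and score it. The iris dataset and logistic regression are placeholders for whatever problem and estimator you actually care about, not a recommendation for a specific model.

```python
# Minimal scikit-learn workflow: data -> split -> fit -> evaluate.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Load a small built-in dataset as a stand-in for your own data.
X, y = load_iris(return_X_y=True)

# Hold out a test set so the accuracy estimate is honest.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit a simple classifier; swap in any estimator with the same fit/predict API.
model = LogisticRegression(max_iter=500)
model.fit(X_train, y_train)

acc = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy: {acc:.3f}")
```

The appeal, compared with juggling model-specific interfaces, is that every scikit-learn estimator shares the same `fit`/`predict` contract, so the surrounding pipeline code barely changes when you swap models.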