Moneyball 3.0: How Visual Data Is Supercharging Sabermetrics in Sports
Before Michael Lewis released his 2003 book, Moneyball: The Art of Winning an Unfair Game, few baseball fans (and even many executives) had ever heard of sabermetrics, the term coined to describe advanced statistical analysis of the sport. These days "Moneyball" has become the shorthand, and the practice of analyzing all the data you can to get an edge on the competition is ubiquitous throughout sports. Today, thanks to advances such as computer vision and artificial intelligence (AI), we're about to enter a new era of Moneyball, in which teams, GMs and coaches can find the next advantage using cutting-edge data capture methods and technology.
The original Moneyball: Sabermetrics goes prime time
Before the original Moneyball era started, most baseball teams built their rosters by relying on old-school human scouts and the usual statistics: high batting averages, lots of home runs, plenty of stolen bases, flashy RBI numbers and so on. It was a mix of traditional statistics that had been around since the late 19th century and expert baseball opinion.
But around the turn of the millennium, faced with a low-budget payroll, Oakland A's GM Billy Beane and his predecessor Sandy Alderson rethought that approach. They were inspired by the work of Bill James, a fan turned writer who self-published an annual series of books called Baseball Abstract that delved into the numbers in a way nobody had before. The data showed that players of a certain type, most often guys with high on-base percentages, were more valuable to teams than the market for them suggested. Other teams wanted the guys who could hit; the A's just wanted guys who didn't make outs. As they learned from James' teachings, RBI (runs batted in) were simply the result of guys getting on base and their teammates driving them in. Stealing bases, depending on the player, was more likely to result in outs, reducing the rest of the lineup's chances of keeping an inning going. Hitting home runs was great, and would surely get a player on TV, but swinging away in every at-bat and striking out at the expense of patience and getting on base was counterproductive.
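The on-base logic is easy to see with a little arithmetic. Batting average ignores walks entirely, while on-base percentage counts every time a hitter avoids an out. The sketch below uses the standard formulas with made-up stat lines for two hypothetical hitters who look identical by batting average:

```python
def batting_average(hits, at_bats):
    """Traditional stat: hits per official at-bat (walks don't count)."""
    return hits / at_bats

def on_base_percentage(hits, walks, hbp, at_bats, sac_flies):
    """Times reaching base per plate appearance that could end in an out.
    Standard formula: (H + BB + HBP) / (AB + BB + HBP + SF)."""
    return (hits + walks + hbp) / (at_bats + walks + hbp + sac_flies)

# Two hypothetical hitters with identical batting averages (.273),
# but Player B draws 80 walks to Player A's 20.
a_avg = batting_average(150, 550)
b_avg = batting_average(150, 550)
a_obp = on_base_percentage(150, 20, 5, 550, 5)   # about .302
b_obp = on_base_percentage(150, 80, 5, 550, 5)   # about .367
```

By the traditional measure the two players are interchangeable; by OBP, Player B makes far fewer outs, which is exactly the kind of undervalued gap the A's exploited.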
The approach of looking at new types of statistics in new ways worked. Fast forward a few years, and after the Oakland A's made it to the playoffs several times despite some of the lowest payrolls in baseball, other teams caught on, creating analytics departments that delved deeper into newfangled metrics such as ultimate zone rating and fielding independent pitching. Clubs such as the Tampa Bay Rays started hiring Wall Street veterans to pore over the numbers (printouts of box scores and spreadsheets), determine which stats were overlooked, and use that data to their advantage when it came to acquiring players.
Moneyball 2.0: Getting deeper into the stats
Then around the mid-aughts came what could be called the Moneyball 2.0 era. Baseball lovers started digging into the stats too, comparing them to every player in history and inventing new ways to both evaluate past performance and project how a team would do in the future based on its roster. Sites like Baseball-Reference and Fangraphs proved to be popular destinations. Journalists figured out ways to break down these statistical analyses and the projections of algorithmic systems like PECOTA, and fans learned the lingo (it also didn't hurt that the popularity of fantasy baseball sent users looking for any extra edge for their own imaginary teams).
MLB caught on and started to look for new ways to capture data and gain a predictive and competitive edge. In 2006, the league's Advanced Media arm delved into visual data capture and analysis when it introduced Pitch/FX, which uses three cameras to track the angle, speed, break, location and type of every pitch thrown in a game. Handy for television broadcasts to show whether a pitch was a ball or a strike, it also proved valuable to teams: coaches and players had a new tool for studying film of themselves and opponents.
Moneyball 3.0: Let’s get visual
With higher-definition cameras arriving every year, and more and more fans starting to understand and seek out advanced statistics, the league sought new ways to analyze the game. In 2014, MLBAM launched Statcast, which utilizes multiple cameras and Doppler radar to measure the movements of every player on the field. Now fans can know the exit velocity of a Yankees phenom Aaron Judge home run, or how Clayton Kershaw's curveball moves vertically and horizontally with a spin rate that makes it nearly impossible to hit.
Teams mine data out of it, while fans get to marvel at just how amazing or improbable a catch, pitch or homer really was. Take, for example, the Minnesota Twins' Byron Buxton and his diving grab, where Statcast measured the time he'd need to catch the ball, how far he'd need to run, the efficiency of the route he took, the speed of his first step, his top speed during the run and much more. Measuring those numbers against every other play Statcast has recorded, the system's proprietary algorithms determined, almost instantly, that he had a 24% chance of making the play. Pitchers now regularly check Statcast between innings, and All-Stars like Kris Bryant have used it to change their approach at the plate. Between his Rookie of the Year season and 2016, the Cubs third baseman used this data to straighten out his swing and change his "launch angle," which led to him hitting 39 home runs, winning the NL MVP award and helping Chicago win its first title in 108 years. Again, it's just a little edge that helps.
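Statcast's actual catch-probability model is proprietary and fit to thousands of tracked plays, but the underlying idea is simple enough to sketch: reduce a play to "how far did the fielder have to run, and how long did he have?", then map the required sprint speed to a probability. Everything below, including the parameter values, is an invented toy model for illustration only:

```python
import math

def catch_probability(distance_ft, hang_time_s, midpoint=27.0, steepness=0.6):
    """Toy model of a Statcast-style catch probability.

    Reduces the play to the sprint speed (ft/s) a fielder would need,
    then runs it through a logistic curve: easy plays near 1.0,
    impossible ones near 0.0. The midpoint and steepness here are
    made-up numbers, not fitted to any real tracking data.
    """
    required_speed = distance_ft / hang_time_s
    return 1.0 / (1.0 + math.exp(steepness * (required_speed - midpoint)))

# A routine fly ball (60 ft in 4 s) vs. a Buxton-style sprint (110 ft in 4 s)
routine = catch_probability(distance_ft=60, hang_time_s=4.0)
diving = catch_probability(distance_ft=110, hang_time_s=4.0)
```

The real system adds route efficiency, first-step quality, direction of travel and more, but the principle is the same: compare the demands of this play to the population of plays already recorded.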
Next-gen sabermetrics like this, which augment the analysis of existing data with previously unavailable data captured in new ways, can basically be described as Moneyball 3.0. Today's clubs and players can get an unprecedented amount and variety of visual data on every play, every pitch and every swing, from completely new perspectives and capture methods. And it's spreading to other sports. Former Microsoft CEO Steve Ballmer, now the owner of the Los Angeles Clippers, recently announced a collaboration with Second Spectrum, which uses multiple cameras around the arena, computer vision and AI to capture, track and analyze passes, positions and routes in real-time video of games. It then delivers insights such as the probability that a given team or player scores on a particular play. Working with pro soccer teams and about three quarters of the NBA, Second Spectrum and its computer vision-powered insights arguably helped the Golden State Warriors, who have been using the tech to help coach their players since 2014, win three NBA championships.
For viewers, the Clippers collaboration adds an augmented reality (AR) layer to broadcasts that superimposes fantasy stats, animations and predictive highlights on TV, computer and mobile screens. Eventually, as Ballmer has noted, software will be able to provide feeds from player perspectives, which, besides offering fans cool new "cam" views, can also be further analyzed by computer vision to provide untapped visual data. The idea is to convey just how impressive what's happening on the court or the field really is, including the seemingly preternatural awareness of these top-level athletes.
“We have already used machine learning to identify hundreds of complex basketball actions like pick-and-rolls, off-ball screens, and closeouts,” said Second Spectrum CEO Rajiv Maheswaran on Reddit. “I have a great sense of pride whenever we show our system to coaches and GMs and see their faces when they realize its full capability.”
Following the Moneyball: Where does it go from here?
In the fall of 2016, the NBA signed a six-year, $250 million deal with Second Spectrum and sports data and betting company Sportradar. The overall focus of the agreement is fusing Second Spectrum's visual sports data collection with Sportradar's existing sports data, with several end goals in mind: giving teams the ability to analyze their players, giving fans the ability to learn more about a player or a single play through a second-screen experience, and, eventually, aiding gamblers in making bets on games—something that NBA commissioner Adam Silver hopes is legalized someday. By the end of the 2017-2018 season, Second Spectrum will have installed its visual player-tracking cameras and system in all the NBA's arenas, a tremendous boon to the evolution of computer vision-enabled data capture that is Moneyball 3.0.
Visual and other next-gen data capture and analytics are finding their way into other sports, too. In Major League Soccer, Second Spectrum's recent deal with the San Jose Earthquakes will use the company's visual player-tracking system to deliver insights not only to coaches, but also directly to players via mobile phones. In Europe, La Liga soccer clubs are using drones and multi-camera setups to record every aspect of a game, capturing every player at up to 10 frames per second, with computer vision and AI analyzing the efficacy of players and strategies. Both baseball and cricket players are relying on HD- and infrared-camera-equipped drones and in-bat sensors that evaluate field pitch conditions and measure speeds, angles, contact rates and more on every swing, to analyze whether a particular approach is working or needs to be adjusted. Hockey teams are using computer vision technology that tracks players and their movements in game footage. The technology, by Sportlogiq, maps the coordinates of every player during every second and, via an algorithm, can extrapolate many advanced data points, such as a player's tendency to shoot from a certain spot on the ice or how well he handles the puck.
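Once a system like Sportlogiq's has reduced game footage to per-second coordinates, extracting a tendency stat is mostly a matter of aggregation. The sketch below is a hypothetical, much-simplified version of that last step: it bins shot coordinates into coarse rink zones (the zone boundaries and data are invented, not Sportlogiq's) and counts each player's shots per zone:

```python
from collections import Counter

def shot_zone(x, y):
    """Map rink coordinates (feet, origin at center ice) to a coarse zone.

    The boundaries are arbitrary round numbers for illustration;
    a real system would use a much finer grid fitted to the rink.
    """
    if abs(x) > 64 and abs(y) < 9:
        return "slot"          # close to the net, between the faceoff dots
    if abs(x) > 25:
        return "offensive_zone"
    return "neutral_or_defensive"

def shot_tendencies(shots):
    """shots: iterable of (player, x, y) shot events.
    Returns {player: Counter of zone -> shot count}."""
    tendencies = {}
    for player, x, y in shots:
        tendencies.setdefault(player, Counter())[shot_zone(x, y)] += 1
    return tendencies

# Invented sample data: player "A" likes shooting from the slot
sample = [("A", 70, 0), ("A", 70, 5), ("A", 30, 20), ("B", 10, 0)]
profile = shot_tendencies(sample)
```

The hard part, of course, is the computer vision that produces the coordinates in the first place; the analytics layered on top look a lot like classic sabermetrics.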
Even outside of the direct gameplay realm, computer vision is helping teams optimize player performance and their businesses. To implement a "prehab" strategy, the NFL is using technology that fuses biometric information culled from sensors that players wear with gameplay video to identify potential injury areas and help players improve their movements (because a winning team is a team that has all its players performing optimally). And companies such as GumGum are using computer vision and AI to track the appearance of logos throughout broadcast and social media images and videos, enabling teams and stadiums to maximize the value of their brand sponsorships.
It's still early, so the specific effects of all this newly culled and crunched data are hard to judge. The technology hasn't been around long enough to generate much of a track record, and individual teams still aren't willing to divulge exactly how they're using this digital information to gain an edge. But sit tight: If the disruptive, competitive influence of new statistics in baseball and other sports so far is any indication, computer vision and other next-gen player and data capture methods will become standard for sports organizations going forward.
Illustration by Neil Stevens