Data Intelligence for Sport: Put Some Smarts in Your Cheers
We have seen the scenario come true before: A cash-strapped professional baseball team rises up to win its league championship, out of left field. Drastic performance improvements of sport teams are often the result of game strategies developed using data intelligence. In baseball, “sabermetrics” – statistical analyses of players’ performances and game activities – began garnering attention in the 1980s. Inspired by how American Major League Baseball teams quickly adopted sabermetrics as a key component of game-plan strategy, the Japan Statistical Society established the Sport Data Subcommittee in 2009 to spearhead sport applications of data science in the country. Having recently celebrated the seventh installment of the group’s ever-growing flagship annual event, the “Sports Data Analysis Competition,” the subcommittee leaders believe data analytics is destined to serve as a fundamental tool in sport teams’ pursuit of victory. Prof. Yoshiyasu Tamura of the Institute of Statistical Mathematics, Prof. Fumitake Sakaori of Chuo University and Prof. Akinobu Takeuchi of Jissen Women’s University – three of the founding members of the subcommittee and the contest – share their takes on the future of the application of data science in sport and what it means for the talent that will be required to help develop successful game strategies.
Ask the Experts: Yoshiyasu Tamura (The Institute of Statistical Mathematics)
Dr. Tamura serves as specially appointed professor at the Institute of Statistical Mathematics (ISM). He works to develop statistical methods and studies various applications of those methods. He is also known for his research on physical random number generators, a crucial tool for simulations conducted by the ISM. He received his Ph.D. in science from Tokyo Institute of Technology. Dr. Tamura was appointed to his current position in 2018 after serving as associate professor and professor at the institute from 1986 and 1997, respectively.
Ask the Experts: Fumitake Sakaori (Chuo University)
Dr. Sakaori serves as associate professor in the Department of Mathematics, Faculty of Science and Engineering, at Chuo University and as visiting associate professor at the Institute of Statistical Mathematics. He specializes in statistical modeling, computational statistics, statistical science in sports, and statistical education. Dr. Sakaori was appointed to his current position at Chuo University after working as assistant professor in the College of Sociology at Rikkyo University and as adjunct professor at Chuo University’s Faculty of Science and Engineering.
Ask the Experts: Akinobu Takeuchi (Jissen Women’s University)
Dr. Takeuchi is professor for the Faculty of Humanities and Social Sciences at Jissen Women’s University and the director of the university’s Information Center. He specializes in statistical science, behavior metrics, and statistical education, including data science education. As the chairperson of the Special Committee of Statistical Education, Dr. Takeuchi has long worked to develop educational methods for fostering statistical thinking.
Prospective Data Scientist Turned Professional Baseball Player
On March 19, 2018, dozens of data scientists from across the country, both professionals and students, descended on the headquarters of the Institute of Statistical Mathematics, part of the Research Organization of Information and Systems (ROIS), in the suburban Tokyo city of Tachikawa for the 7th annual Sports Data Analysis Competition award ceremony. The event highlighted the work of the grand prize, first prize and honorable mention winners in the highly competitive contest, which began in June 2017 with a call for submissions, followed by a September deadline and the announcement of the awardees in January. The rapid growth of the contest’s popularity over the past years speaks to the surging interest in the use of data analytics in sport, said Prof. Yoshiyasu Tamura of the Institute of Statistical Mathematics, a baseball enthusiast who used to play the sport for his university as a catcher.
“I love all kinds of sports, and that’s how I became interested in applying statistics to sports,” Prof. Tamura said, looking back on his college days. “When we began the competition, very few people had heard of the term ‘data science,’ and we didn’t even have the Japan Data Scientist Society. A lot of people are sports fans like me. It makes sports an ideal subject for practicing analytics.”
The success of the Sports Data Analysis Competition has proven him right. The 2017 contest drew submissions from as many as 61 teams for three sports categories – baseball, soccer and basketball – each of which is broken down into three subcategories: “Analytics” and “Infographics” for adults, and “Secondary Education” for middle and high schoolers.
Entries ran the gamut in subject matter, from “Topological data analysis for the evaluation of the robustness of the defense system,” submitted by a Keio University team, to “Analysis of basketball scores,” entered by a Chuo University team. The two won the grand prizes in the analytics and infographics categories, respectively.
“When I think about this year’s entries, there was more quality research that was soccer-related than baseball-related,” Prof. Tamura said. “In baseball, you only have pitch-by-pitch data. In soccer, on the other hand, you not only have data on an entire play but also ball tracking and touch-by-touch data, allowing you to synthesize them all for analysis. The quality of soccer data is really improving.”
The competition organizers are also trying to offer participants more sports to choose from, which was the reason behind adding the rugby category for the 2016 contest.
“Statistics can be difficult to analyze, but sports have the magical power to get people excited about analyzing data,” Prof. Tamura said. “I know of one university student who participated in the competition twice and went on to become a professional baseball player. I was stoked to find out about that,” he said. “Every professional baseball team nowadays is on the hunt for data analysis talent.”
Climbing the Learning Curve with Sport Data Analysis
Contest judges first screen the submissions based on how attractive the analysis themes are.
“If there is a surprising result or highly practical use for the research, we would add points for that, too,” said Prof. Fumitake Sakaori of Chuo University, who has led the competition along with Prof. Tamura since its inception. “In other words, we look at whether the analysis is warranted.”
The next most important criteria are whether the teams used appropriate analysis techniques and how much ingenuity they showed in conducting their analysis, according to Prof. Sakaori. Originality and novelty count, he said.
“Teams that show originality in data analysis and groups that try novel processing techniques to produce their own data instead of using the variables given by the competition organizers can earn extra points, as well,” Prof. Sakaori said.
“The great thing about sports is that those analyzing the data easily understand what it is that they are trying to get out of their analysis. Thus, sports data makes for excellent educational material, especially when one is familiar with whichever sport is the subject of analysis,” Prof. Sakaori said.
He explained that the Sport Data Subcommittee’s single biggest goal in organizing the competition was to encourage people to make use of data and grow their ability to think in terms of mathematical statistics.
“The competition program has been very effective in that light,” Prof. Sakaori said. “I can see contestants’ skills improve every time they come back for another competition. From the angles they take in their analysis to their objectives, everything gets better. I suppose looking at excellent work done by other teams inspires you and helps you develop statistical thinking.”
In the end, you have to ask yourself for whose benefit you are analyzing data – whether it’s for the players, for the coaches or for the team owners – or whether you are doing it to enhance the experience of watching games on TV, Prof. Sakaori said.
“Once you answer these questions, it becomes clear how you should present your analysis,” he said.
Looking Beyond 2020: Data Analysis May Be Sports’ Future
In the Sports Data Analysis Competition, participants get to work with streamlined datasets provided to them. In real life, however, it is difficult to find well-organized quality data, and in some sports, you can’t find any data at all, according to Prof. Sakaori.
“That creates a challenge for people who are trying to tap into the power of data analysis to contribute to the sports industry,” Prof. Sakaori said. “Another challenge is how to produce analytical data from things like video footage of sports tournaments. You can take data from video of a judo tournament, for example, but that’s unstructured data and you wouldn’t necessarily know how to approach it. We must advance our knowledge in not only statistical science but also informatics in order to deal with these issues,” he said.
Some countries have all sorts of professional sports leagues, including basketball leagues, soccer leagues and American football leagues. In these countries, there are many scientific journals focused on sports statistics, and a great deal of research is being conducted on the topic as well, according to Prof. Sakaori.
“In Japan, we are now seeing more sports consulting businesses being established to work with professional teams and strategize how to boost the industry for the post-2020 Olympics era,” Prof. Sakaori said. “Since the Japan Sports Analysts Association (JSAA) was established in 2014, a number of conferences have been organized, and they are growing larger every year, offering opportunities for sports analysts to compare notes. If this is any indication, I predict sports games 10 years from now will look much different from the ones we see today.”
Everyone Needs to Be Able to Talk Data
The 7th annual Sports Data Analysis Competition marked the fifth year since the contest was opened to middle and high schoolers and drew as many as 65 submissions to the Secondary Education category. Prof. Akinobu Takeuchi of Jissen Women’s University, who has overseen the category since its beginning, said he has been impressed by the quality of the students’ work.
“People from around the world often look at the Japanese as being highly skilled in mathematics. But statistics reveal that many other countries, particularly China, the U.S., South Korea, New Zealand, England and Australia, provide much more advanced math education than Japan does,” Prof. Takeuchi said.
There is a movement to revamp the math curriculum in Japanese secondary education with an eye toward providing all students with the statistical skills necessary to live in the data age. Prof. Takeuchi has been actively involved in that initiative in recent years.
“Until recently, middle and high schools in Japan weren’t doing enough to teach the concept of an ‘average’ and how the ‘median’ value and ‘mode’ have statistical significance. It makes you wonder how this will prepare the students for their future – if they will become capable of joining their counterparts from around the world even for a casual business conversation,” Prof. Takeuchi said.
He said that the sports data used in the competition are “pretty clean and almost normally distributed.” Data that statistical scientists have to work with in real life, including social survey data, is never that clean, he said, but having neat data helps contestants explore and discover how to effectively use data to draw useful analysis from it.
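The difference between the average, the median and the mode that Prof. Takeuchi wants students to grasp is easiest to see on data that is not clean. The following is a minimal sketch in Python, not part of the competition materials: the scores are hypothetical numbers invented for illustration, and the variable names are assumptions.

```python
from statistics import mean, median, mode

# Hypothetical points scored by one player over nine games (illustrative only,
# not data from the competition). One unusually high-scoring game skews the set.
points = [12, 14, 14, 15, 16, 14, 13, 15, 48]

average = mean(points)      # pulled upward by the single 48-point game
middle = median(points)     # middle value once the games are sorted
most_common = mode(points)  # value that occurs most often

print(f"mean={average:.1f} median={middle} mode={most_common}")
```

A single 48-point game lifts the mean to roughly 17.9, while the median and the mode both stay at 14, which is closer to what this hypothetical player scores in a typical game.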
“At the end of the day, you want your sports team to win. But, just because they won, it’s not necessarily clear why they won or what worked the most. We have to figure that out scientifically. And we want the contestants to learn the thought process required in statistical science,” Prof. Takeuchi said. “If I take basketball as an example, we want them to analyze why the team has a low shooting rate, what winning teams or losing teams have in common and other factors by using various analysis techniques. It is important to be able to realize a new technique could shed light on causal relationships underlying what they’ve found.”
This year’s entry from a team from Kannonji Daiichi High School in Kagawa prefecture epitomized such learning, according to Prof. Takeuchi. The team conducted analysis to figure out ways to strengthen a local basketball team. The students created a diagram for their poster that summarized the causes and effects of game results. They conducted a principal component analysis and verified the analysis method, as well. The poster, which proposes a new strategy for the basketball team, won the grand prize in the category.
“Regardless of whether you are interested in majoring in science or want to study the humanities and become a marketing professional, data science skills provide the foundation for all. This competition’s purpose is to raise the next generations who are fluent in the language of data,” Prof. Takeuchi said.
Interviewer: Rue Ikeya
Photographs: Yuji Iijima unless noted otherwise
Released on: November 12, 2018 (The Japanese version released on April 12, 2018)