Stanford researcher applies evolutionary math to March Madness

Imagining a situation where all the games in the NCAA basketball tournament have to be played sequentially in the same arena, mathematical geneticist Noah Rosenberg asks: In how many possible sequences can these games be played?

March Madness is upon us and with it, the tradition of making tournament brackets. Where most of us see a grid of future travails and triumphs that will determine the NCAA Division I champions, Stanford mathematical geneticist Noah Rosenberg sees something more: evolutionary history.

Stanford women’s basketball has earned a No. 1 seed for the third year in a row in the 2023 NCAA Tournament. The team won the tournament in 2021. (Image credit: Stanford Athletics / Bob Drebin)

In 2021, because of COVID-19 precautions, the single-elimination tournaments of March Madness, usually nationwide events, were each confined to a single location.

This happened to resemble a problem that Rosenberg had recently taught in his mathematical evolutionary modeling class: What is the number of sequences in which an evolutionary process can produce a particular tree structure with a particular set of relationships among the species? Translated to March Madness, this question becomes: If all the games in a single-elimination sports tournament are played sequentially in the same arena, in how many possible sequences can the games be played?

“The question opens up a new line of thinking,” said Rosenberg, the Stanford Professor in Population Genetics and Society in the School of Humanities and Sciences. “It’s a neat setup where the math of evolutionary biology solves the sports problem, and evolutionary biology is in turn aided by the connection to sports.”

Rosenberg and Matthew King, a former undergraduate student, detailed this problem – and the unique perspective it offers – in a paper that will appear in an upcoming issue of Mathematics Magazine.

A shared perspective

Evolutionary biologists are interested in relationships among species, as well as in what those relationships can tell us about the evolutionary history that has given rise to a species. Many features of these histories, such as how quickly the species have been diversifying, can be estimated from tree diagrams. The study of evolutionary relationships among organisms is known as phylogenetics, so these tree diagrams are called phylogenetic trees.

It turns out that these paths of biological evolution have lookalikes in many seemingly disparate areas, such as search algorithms in computer science, branching sequences in epidemic transmission, and graph theory in math. By pursuing the similarities between the trees of his evolution research and those of March Madness, Rosenberg saw an opportunity not only to answer a fun question about sports but also to see how that answer would reflect back on evolutionary theory.

“One species can diverge into two and then into two pairs of species, sometimes in one order, and sometimes the order is reversed. But it’s rarely considered that the two pairs split exactly at the same time,” said Rosenberg. “In sports, though, one often does see games scheduled at the same time in tournaments that have multiple arenas available to them. So that makes us think about solving a more unusual evolutionary biology problem where branching events happen at the same time.”

The March Madness-verse

When Rosenberg and King, now a graduate student at Harvard University, finally ran the numbers on a March Madness scenario, the answer surprised them.

“One gets used to how sports tournaments tend to proceed in a very particular way with sequences of ‘rounds,’ but as we show in the paper, that canonical scheduling choice is chosen from a quite remarkably high number of possibilities,” said Rosenberg.

As it stands now, there are 1,905,458,855,466,636,787,971,925,146,177,334,793,473,753,765,414,856,950,607,419,556,152,726,849,614,067 (or 1.91 x 10^78) game sequences possible for one format of the 68-team tournament. If the games should ever again be confined to a single arena, there will still be 360,410,120,625,822,474,490,741,822,944,015,962,624,736,196,480,481,624,064,000,000,000,000 (3.60 x 10^68) possibilities.
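To get a feel for how quickly such counts grow, here is a minimal Python sketch of the one-arena question for a simplified case: a perfectly balanced bracket of 2, 4, 8, 16, ... teams in which games are played one at a time and a game can be played only after both games feeding into it. It is not the authors' method and does not reproduce the 68-team figures; the function name count_game_sequences and the balanced-bracket assumption are illustrative choices. The idea is that the two halves of the bracket can be scheduled independently and then interleaved in any way before the final.

```python
from math import comb


def count_game_sequences(rounds: int) -> int:
    """Count the orders in which the games of a balanced single-elimination
    bracket with 2**rounds teams can be played one at a time.

    Assumption (illustrative, not from the paper): every team plays in the
    first round, and a game can only be played after the two games that
    feed into it.
    """
    if rounds == 0:
        # A single team needs no games: exactly one (empty) schedule.
        return 1
    # Each half of the bracket has 2**(rounds - 1) teams,
    # and therefore 2**(rounds - 1) - 1 games.
    games_per_half = 2 ** (rounds - 1) - 1
    orders_per_half = count_game_sequences(rounds - 1)
    # Pick which time slots go to the first half, order each half
    # independently, then play the final last.
    interleavings = comb(2 * games_per_half, games_per_half)
    return interleavings * orders_per_half ** 2


if __name__ == "__main__":
    for rounds in range(1, 7):
        print(2 ** rounds, "teams:", count_game_sequences(rounds))
```

For an eight-team bracket, count_game_sequences(3) gives 80: each four-team half can be played in 2 orders, and there are 20 ways to interleave the three games of one half with the three games of the other before the final.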

While the authors enjoyed this thought experiment, computing these numbers wasn’t really the point. Like following a phylogenetic tree, it was about the journey. A problem that started from a class turned into a sports conundrum, led to a new connection, and inspired a mathematical research paper co-authored by a faculty member and an undergraduate.

“Mathematical phylogenetics has a lot of problems that are not hard to state, but that involve serious mathematical thinking that is accessible to undergraduate research,” said Rosenberg. “If people take away anything from it, I hope they think a little about evolutionary trees during their next favorite sporting event.”
