Learning analytics at Stanford takes huge leap forward with MOOCs

Stanford's Lytics Lab studies data from massive online courses to learn more about how we learn.

Graduate students Emily Schneider, René Kizilcec and Chris Piech work on a conference presentation describing their work analyzing student behavior in massive open online courses. (Photo: L.A. Cicero)

The hottest thing about online learning might be the opportunity it affords for learning about how people learn. It can tell us when students get fed up with lectures, how men and women react differently, the degree to which helping others can help students and how online forums can stimulate better performance.

All that and more goes inside what Stanford's Learning Analytics group calls the data cauldron. Comprising graduate students, researchers, professors and visitors from the fields of education, computer science, communication and sociology, the Lytics Lab meets weekly under the auspices of the Office of the Vice Provost for Online Learning and the Learning Sciences and Technology Design (LSTD) program at the Graduate School of Education.

Stanford's early adoption of online learning has given the university a head start not only with the classes per se but also with the information deriving from them.

Lytics meetings used to be modest affairs. But word has gotten out, and the shortage of chairs attests to the shared assumption that online learning does not simply hold out promise to millions of potential students around the world and hundreds of Stanford students. It also might help answer a multitude of questions about how humans learn and interact.

Three members of the group presented the results of their research at the Learning Analytics and Knowledge (LAK) meeting in Leuven, Belgium, this week. Their project is one of several ongoing team projects in Lytics, which include a dashboard to help instructors monitor student engagement; a study of peer assessment based on 63,000 peer grades in a massive open online course on human-computer interaction; automated feedback for coding assignments; and predictors of student performance.

Stirring the data cauldron

Learning analytics refers to the interpretation of a wide range of data produced by and gathered on behalf of students to assess progress, predict performance and identify problems. Data are collected when students complete assignments, take exams, watch videos, participate on class forums or do peer assessments. As more data are collected, new questions can be asked, and classes can improve.

There is admittedly a great deal of hype and misunderstanding about massive open online courses, or MOOCs, which was what prompted the three doctoral students presenting in Leuven to begin their research.

The trio – René Kizilcec, in the Department of Communication; Chris Piech, in the Department of Computer Science; and Emily Schneider, in the LSTD program – were concerned about some of the criticism aimed at MOOCs and wanted good data on how to counter it.

Why do so many students start a class and then quickly drop out? Why, and when, do they bypass certain elements of online classes? And why are they taking the classes to begin with? In the researchers' opinion, the wrong questions were being asked.

In their paper, "Deconstructing Disengagements: Analyzing Learner Subpopulations in Massive Open Online Courses," Kizilcec, Piech and Schneider looked at student behavior in three MOOCs offered by Stanford faculty: Computer Science 101, a high-school-level course; Algorithms: Design and Analysis, at the undergraduate level; and the graduate-level Probabilistic Graphical Models.

They found that people take classes or stop for different reasons, and therefore referring globally to "dropouts" makes no sense in the online context. They identified four groups of participants: those who completed most assignments, those who audited, those who gradually disengaged and those who sporadically sampled. (Most students who sign up never actually show up, making their inclusion in the data problematic.) The point of all this is not simply to record who is doing what but to "provide educators, instructional designers and platform developers with insights for designing effective and potentially adaptive learning environments that best meet the needs of MOOC participants," the researchers wrote.
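To make the four groups concrete, here is a minimal sketch of how a learner's week-by-week activity could be sorted into them. It is an illustration only, not the researchers' method: the weekly codes ("A", "V", "-"), the 75 percent thresholds and the function name `classify_learner` are all assumptions made up for this example.

```python
from collections import Counter

# Hypothetical weekly activity codes (assumed for this sketch, not from the study):
#   "A" = submitted that week's assignment
#   "V" = watched videos only
#   "-" = no activity
def classify_learner(weeks: str) -> str:
    """Label one learner with illustrative rules; thresholds are arbitrary."""
    n = len(weeks)
    if n == 0 or set(weeks) == {"-"}:
        return "no-show"  # signed up but never showed up
    assessed = weeks.count("A")
    watched = weeks.count("V")
    if assessed >= 0.75 * n:
        return "completing"  # did most assignments
    if assessed == 0 and watched >= 0.75 * n:
        return "auditing"  # followed the videos, skipped assignments
    # Disengaging: active every week from the start for at least two weeks,
    # then silent for the rest of the course.
    last_active = max(i for i, w in enumerate(weeks) if w != "-")
    active_weeks = sum(1 for w in weeks[: last_active + 1] if w != "-")
    if 1 <= last_active < n - 1 and active_weeks == last_active + 1:
        return "disengaging"
    return "sampling"  # dipped in and out

# Made-up learners in an eight-week course.
learners = ["AAAAAAAA", "VVVVVVVV", "AAAV----", "-V---V--", "--------"]
counts = Counter(classify_learner(w) for w in learners)
```

A rule-based pass like this is only one option; the groups could also be derived by clustering learners with similar activity patterns, which avoids hand-picked thresholds.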

For example, in all three computer science courses they analyzed, they found a high correlation between "completing learners" and participation on forum pages, suggesting a positive feedback loop: The more students interacted with others on the forum page, the better they learned. This led the researchers to suggest that designers should consider building other community-oriented features, including regularly scheduled videos and discussions, to promote social behavior.

While many people take online courses for certification and skills acquisition, many more take them simply for intellectual stimulation – again making "completion" a questionable criterion of worth. In that regard, auditors should be encouraged, not reprimanded for skipping quizzes they don't need, the researchers wrote. The completion rates for the three classes, with percentages based on initial enrollment, were 27 percent for the high-school-level class, 8 percent for the undergraduate-level course and 5 percent for the graduate-level class. But 74 percent of the enrollees in the undergraduate-level course and 80 percent of the enrollees in the graduate-level class were samplers, meaning they may well have dipped in and out according to time constraints and interest.
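Because so many enrollees never show up, a completion rate depends heavily on what it is divided by. The short sketch below shows the arithmetic with made-up counts; the numbers and the function name `completion_rate` are hypothetical and are not taken from the study.

```python
def completion_rate(completed: int, denominator: int) -> float:
    """Return completers as a percentage of the chosen base population."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * completed / denominator

# Made-up counts for a single course (not figures from the paper).
enrolled = 1000   # everyone who signed up
active = 400      # those who did anything at all on the site
completed = 100   # those who finished most assignments

rate_of_enrolled = completion_rate(completed, enrolled)  # 10.0
rate_of_active = completion_rate(completed, active)      # 25.0
```

The same 100 completers look like a 10 percent rate against initial enrollment and a 25 percent rate against active learners, which is why the choice of base matters when completion figures are compared.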

Finally, the researchers found substantial gender differences in the more advanced classes. Counting "active learners," defined as those who did anything at all on the website (around half the original enrollees), 64 percent of the high-school-level class were men, and the percentage rose to 88 percent men for both the undergraduate-level and graduate-level courses.

Interdisciplinary blend

Kizilcec, Piech and Schneider are enrolled in three different schools at Stanford: Humanities and Sciences, Engineering, and Education, respectively. But that only makes them work more fluidly, they say.

"We're all humanists," said Schneider, whose undergraduate degree is in English from Swarthmore, "and first and foremost we're committed to the humans who are learning through these systems. On the other side of the sea of data there are people coming to MOOCs from a vast range of backgrounds, and we want to optimize systems to best meet their needs."

Piech is the son of teachers, and he grew up in what he described as an "educational environment," first in Kenya and then in Malaysia. "I always knew I wanted to do something with education," he said. "I grew up watching people trying to do good things, and I spent lots of time thinking how I should make my mark on the world." Piech was an undergraduate teaching assistant in CS106, Programming Methodology, at Stanford, and it was that experience that cemented his wish to combine computer science with education.

Kizilcec, meanwhile, has an undergraduate degree in philosophy and economics from University College London. "I'm really excited about this work," he said. "On the one hand, it can change millions of people's lives. And, on the other, online learning allows us to learn about learning in a totally unprecedented way."

Roy Pea, the David Jacks Professor at the Graduate School of Education, is adviser to many of the students in the Lytics Lab and a keynote speaker at the LAK meeting. The group, he says, "is an exciting innovator at the nexus of learning sciences and learning analytics, and that is my central keynote theme."

R. F. MacKay is a writer for the Office of the Vice Provost for Online Learning.