As college students click, swipe and tap through their daily lives – both in the classroom and outside of it – they’re creating a digital footprint of how they think, learn and behave that boggles the mind.


Students generate huge amounts of data in their online lives. Stanford sociologist Mitchell Stevens wants to make sure this information is both used and protected. (Image credit: PeopleImages.com / Getty Images)

“We’re standing under a waterfall, feasting on information that’s never existed before,” said Mitchell Stevens, a sociologist and associate professor at Stanford Graduate School of Education (GSE). “All of this data has the power to redefine higher education.”

To Stevens and others, this mass of data is full of promise – but also peril. The researchers talk excitedly about big data helping higher education discover its Holy Grail: learning so deeply personalized that it both keeps struggling students from dropping out and pushes star performers to excel.

Yet, at the same time, they worry that the data will be misused, sold or stolen. Consider, for example, what might happen if data show that students who fit a certain profile struggle in a core course. Could those students be prevented from taking the class or pushed down a different path just because the data say they should?

Responsible uses

So earlier this summer, researchers at Stanford and Ithaka S+R, a nonprofit education consulting firm, brought together 70 representatives – mostly from academia, but also from government, leading nonprofits and the commercial education technology industry – to discuss some of the hot-button issues surrounding big data in higher education. The convening received financial support from several Stanford programs, including the Cyber Initiative, the Digital Civil Society Lab, the Institute for Research in the Social Sciences (IRiSS) and the McCoy Family Center for Ethics in Society, as well as the Spencer Foundation.

The ideas that came out of that meeting, and a similar one that took place two years ago, form the basis of a new Stanford-hosted website, “Responsible Use of Student Data in Higher Education.” The site launched Sept. 6.

“A university is meant to be a safe place for students to fail so they can learn, and we need to protect that,” said Timothy McKay, an astrophysicist and the faculty director of the University of Michigan’s Digital Innovation Greenhouse, which is developing cutting-edge ways of learning through technology. McKay attended the summer meeting at the Asilomar Conference Grounds in California.

Currently, formal rules governing what can and can’t be done with student data are murky.

Existing laws, both state and federal, along with institutional protocols, were established long before big data and the influx of outside companies into higher education. Unsure about what to do, many colleges and universities are restricting researchers’ access to student data. At the same time, professors and students freely download apps or use online education services, often without their schools’ knowledge.

“There’s a lot of trepidation at most institutions about potential overreach and that leads to under-reach,” said Martin Kurzweil, the director of the educational transformation program at Ithaka S+R. “So a lot of players are moving in to fill those gaps and it’s not always clear how they’re using student data.”

That’s where the Stanford/Ithaka partnership and website come in.

The idea, said Stevens, who also directs the Center for Advanced Research through Online Learning, is to provide colleges and universities with the resources needed to address the unanswered questions raised by big data.

“By nailing a bunch of documents about the responsible use of student data to an electronic tree, we want to start a national conversation,” said Stevens, who has long pushed for ethical standards around educational data. He is co-founder of an interdisciplinary discussion series at Stanford GSE known as Education’s Digital Future.

Standard of care for data

Ultimately, Stevens and others would like to see higher education develop a standard of care parallel to the one that governs doctor-patient relationships. As part of that goal, attendees at this summer’s conference drafted a set of voluntary ethical principles to help schools address big data’s challenges.

The guidelines center on four core ideas. The first calls on all players in higher education, including students and vendors, to recognize that data collection is a joint venture with clearly defined goals and limits. The second holds that students should be told how their data are collected and analyzed, and should be able to appeal what they see as misinformation. The third emphasizes that schools have an obligation to use data-driven insights to improve their teaching. And the fourth establishes that education is about opening up opportunities for students, not closing them.

For Stevens, his “aha!” moment about the need for a big data code of ethics came soon after “MOOC mania” struck higher education in 2012.

The sudden rise of MOOCs, or massive open online courses, and the deluge of data that followed were both thrilling and unsettling. He instantly saw the potential for a new multidisciplinary science that brings together experts in computer science, engineering, economics, sociology and psychology. He also realized nobody knew the right or wrong way to handle the information crush.

He sensed, too, how urgent it is for higher education to get in front of the issues.

“Academic self-governance is an important feature of American higher education,” Stevens said. “I’d hate to wake up one morning and have these issues solely regulated by the government.”

Kent Wada of UCLA, who attended both this summer’s conference and its predecessor in 2014, agrees. “Everything is happening really fast and that’s both great and horrifying,” he said. “It’s really critical that we look at these issues now.”