How Stanford researchers design reliable, human-focused AI systems

In brief

Building good artificial intelligence technologies requires large amounts of high-quality data and consistent thoughtfulness about how we want these tools to be used.
Stanford computer scientists Ludwig Schmidt and Diyi Yang help develop the foundations of AI tools – such as building the training data needed to create and improve a variety of algorithms to power AI systems.
They are both optimistic about AI’s future, while taking seriously their roles in ensuring these tools are carefully designed.

Artificial intelligence can seem like magic, but the products of AI don’t come out of thin air. These tools are built on algorithms created by people to process and interpret information in a predictable way. Many tools contain more than one algorithm, and if you want high-quality output from your AI, the algorithms must be structured carefully with the end users in mind and trained on large amounts of high-quality data.

Here we explore the work of Ludwig Schmidt and Diyi Yang, two assistant professors of computer science in the Stanford School of Engineering, who work on the foundational science and theory that make AI possible. With that responsibility in mind, they aim to improve the datasets AI systems are trained on, how people interact with AI systems, and the societal impacts of this technology.

Their research is providing creative solutions to questions such as: How can we make AI systems work equally well for people who speak different languages? What makes a training dataset a “good” dataset? Can AI chatbots help teach people social skills?

“We want to build AI models for humans,” said Yang, who leads the Social and Language Technologies Lab, which specializes in socially aware natural language processing, large language models, and human-AI interactions. “This means that we care not only about the technical capabilities of these systems, but also about their societal impact.”

Better data, better AI

Schmidt’s research tackles the societal impact of AI by working to improve the datasets used to train these systems in the first place.

“A lot of the training data for AI currently comes from the internet because it’s the only very large text dataset we have,” said Schmidt, who specializes in AI training datasets, language models, and multimodality (text, image, audio, and video data sources), and is also a technical staff member at Anthropic and at LAION. “Eventually, we will run out of internet data to train on, so an important research question is figuring out what we do when we run out of training data from the internet.”

Schmidt builds the techniques that curate massive training datasets containing several trillion words. These datasets teach AI systems how to perform reliably and safely when encountering new data.

“Many people are using AI in their work now because the models have begun to be useful,” Schmidt said. “At the same time, the models also still make mistakes that limit their usefulness for more complex tasks. Some of these mistakes are due to deficiencies in the training data, and my research helps address these shortcomings of AI models so they become more reliable collaborators.”

Human-centered AI

“Language was a culture shock for me when I came to the U.S. as an international student,” said Yang. “It quickly became clear that it is not just words and grammar. Language reflects both culture and the people who shape it.”

This experience motivated Yang to study how AI systems process different languages and how people of different cultures experience these systems.

AI models are disproportionately trained on English data, so AI systems often work better for people who read and speak English. Even within the English language, there are disparities in how language data are processed by AI. Yang’s research found that people living in different parts of the U.S. (and of the world) with their own varieties of English have different experiences with AI systems.

“Our lab is working to make AI models more inclusive so they can understand more languages in general, and we hope to build AI systems that understand the different regional languages so that people in, for example, New York, San Francisco, and Atlanta can experience the technology in the same way and have a positive experience with AI,” said Yang.

Yang’s interest in human-centered AI extends to building and testing tools, such as large language models that are designed to help people improve their social skills.

“Social skills are actually very hard to learn,” said Yang. “This prompted me to ask: Is there a way that we could use AI to teach social skills?”

In response, Yang and her lab built an AI framework, called AI Partner and AI Mentor, which enables users to role-play conversations with an AI Partner and receive coaching from an AI Mentor. Using this framework, Yang’s lab is developing Rehearsal to help users practice conflict resolution and CARE to help counselors practice giving emotional support to patients.

“The ability to rehearse a potential conversation and get tailored feedback is important because most people don’t have the resources or the social capital to learn those soft skills,” said Yang. “We found that just 20 minutes of practice with the AI Partner and AI Mentor can help people develop social skills and become more aware of how to use empathetic language and open-ended questions.”

Moving forward with responsibility

Anticipating how AI systems will likely evolve and be used in the future is critical to ensuring the safety, reliability, and positive social impact of AI, the researchers explained.

“The last few years of AI research has focused on building chatbots such as ChatGPT and Claude,” said Schmidt. “Next, researchers will go beyond question-answering AI models to AI agents that can complete complex tasks autonomously. This will likely require new specialized training data, and I’m excited to figure out how to build such datasets.”

Yang is also optimistic about the future of AI but points out that there are risks associated with careless use of this technology. Understanding how and when AI can go wrong is vital because it helps researchers avoid pitfalls and deliberately design something good with this technology.

“AI products and AI applications can enter the real world very quickly, often without a rigorous and careful understanding of their impact or the consequences of their use,” said Yang. “This means that, not only do we have the advances in and the promise of AI, but we also face potential risks and concerns in its use. This is why we need to move forward with responsibility.”

Schmidt is also a faculty member of Stanford Data Science. Yang is also a faculty affiliate of the Institute for Human-Centered Artificial Intelligence (HAI).