“Move fast and break things” is a well-known Silicon Valley motto. Stanford’s AI researchers are choosing the opposite method: caution, scrutiny, and long-term thinking.
As researchers, the experts at Stanford – even those who are optimistic and enthusiastic about the future of AI – maintain a critical eye for detail and demanding expectations when it comes to AI applications.
“I stepped back from frontline AI development in industry to focus on a longer-term, slower research agenda because I want to understand fundamental questions better,” said Yuyan Wang, an assistant professor of marketing in the Graduate School of Business, who came to Stanford after working on AI at Uber and Google DeepMind and realizing there isn’t enough known about why these AI systems work. “I feel like academia gives me this ultimate intellectual freedom to work on longer-term research questions.”
While progress in AI in academia may be slower and more meticulous than in other places, the hope is that this approach results in richer and more robust new knowledge that informs smart, sustainable, beneficial AI.
“I feel a lot of pressure – it may be a good kind of pressure – to build rigorous evidence for the tools we’re developing, which takes time,” said Dora Demszky, an assistant professor of education data science in the Graduate School of Education. “There are industries pumping out these tools without really thinking and carefully evaluating their impact, and we need to slow down the pace.”
With representation from every school at Stanford, here are nine examples of researchers advancing our understanding of AI and creating AI tools that are designed to actually deliver on the promise of this pervasive technology:
Jef Caers leverages AI to enhance the sustainability and efficiency of the mining industry.
Aditi Sheshadri is working on better climate modeling with the help of AI.
Megan Ma and Julian Nyarko are leveraging AI to increase access to high-quality legal services.
Roxana Daneshjou focuses on AI for health care with special attention given to potential shortcomings, such as patient bias.
Dora Demszky wants to make education-related AI work well for teachers – and benefit from their expertise.
Chelsea Finn is developing multi-purpose robots that can navigate the messiness of the real world.
Laura Gwilliams combines AI and neuroscience to study the human brain’s talent for language.
Yuyan Wang is diving into the secrets of how algorithms function to create better AI systems.
Brian Trippe is using AI to advance the development of new medical therapeutics.
Revolutionizing resource discovery

In May, Professor Jef Caers led a five-day course at Copperbelt University in Zambia on data science and AI for mineral exploration. | Courtesy Jef Caers
Jef Caers, professor of Earth and planetary sciences in the Stanford Doerr School of Sustainability, founded Mineral X to focus on resilient and sustainable sourcing of essential materials and metals for the energy transition, including copper, lithium, nickel, cobalt, and rare earth elements. “The mining industry has been thrown into this renewable energy revolution, and they don’t know what hit them,” said Caers.
In July 2024, AI made major news for its essential role in the discovery of a high-grade copper resource in an area where mineralization had previously proven promising. That AI, developed in Caers’ group, identified the minimal sequence of drilling required to reduce uncertainty about the presence of the copper resource.
The type of AI Mineral X relies on is similar to AI used for chess or self-driving cars, said Caers. For that reason, he collaborates closely with Mykel Kochenderfer, associate professor of aeronautics and astronautics, who designs safer AI systems for environments that involve a lot of uncertainty. The data involved in mining is intricate and interdisciplinary, and researchers need a lot of it in order to account for errors. “Our dataset is the whole Earth measured from zero to 10 kilometers deep, and includes remote sensing measurements of the surface, geophysical measurements of physical properties in the subsurface, and geochemical measurements,” said Caers. His group even has one dataset that includes hand-drawn Belgian maps of the Congo from 1900.
Beyond exploration of resources, Mineral X is also developing AI technologies for mining and for processing materials after mining, and is working to prepare a workforce that can bring mining into its next chapter. This means training that encompasses the knowledge needed for both traditional and AI-powered mining, and making sure people who live near mining locations play integral roles in the workforce.
Adding more detail to climate models

Professor Aditi Sheshadri and her lab are using AI to study the effects of gravity waves on the climate. Here, atmospheric gravity waves – normally invisible – can be seen moving through a thin group of electric blue clouds that form over the poles in summer, known as polar mesospheric clouds. | NASA/PMC Turbo/Joy Ng
Aditi Sheshadri, assistant professor of Earth system science in the Stanford Doerr School of Sustainability, and her lab are using AI to “learn” the effects of gravity waves on the climate. “Atmospheric gravity waves are ubiquitous in the Earth’s atmosphere and play a crucial role in regulating aspects of the Earth’s climate,” said Sheshadri. “Some of them are too small and fast to be explicitly resolved in climate models, and the approximations made in their representation present a large cause of uncertainty in climate projection.”
As part of this work, Sheshadri leads the Datawave project, an international consortium that is attempting to combine observations, high-resolution regional and global simulations, and AI algorithms to improve our understanding of atmospheric gravity waves and their representation in global climate models.
“I’m hopeful that AI can be meaningfully used as a tool to learn from high-resolution observations or simulations to make climate projection more accurate, with quantified uncertainties,” said Sheshadri.
Some of the challenges that Sheshadri and her colleagues are addressing include ensuring that the algorithms they use have sufficient data to train on, given the complexity of the physics involved in climate modeling. People who use these models also need to be careful about interpretation and generalization. For example, Sheshadri points out that there is a wealth of training data that can tell us about the current climate, but we rely on that same data to predict the climate’s future state – which means we need to account for the uncertainty inherent in those predictions.
Rising above the hype of AI in law
Liftlab is led by Megan Ma, executive director, and Julian Nyarko, professor of law in the Stanford Law School. Its mission: increase access to high-quality legal services in the private sector by leveraging AI and other frontier technologies. Liftlab develops, investigates, and evaluates AI-based approaches and tools that are designed to reduce the cost and increase the quality of legal services in legal education and private practice. “AI has unprecedented potential to transform legal services. Significant activity is underway across the sector, but we need a clear-eyed reality check on what truly works,” said Nyarko. “Our lab is committed to separating hype from reality, enabling the effective development of AI for legal services so that its benefits can be realized broadly.”
Examples of liftlab’s work include large-scale evaluations in partnership with law firms of immersive simulations that enable young attorneys and law students to practice negotiation and other legal skills, the development of new methodologies to effectively make use of annotations in high-stakes and high-cost domains such as law, tools to enable legal personnel to write better contracts that minimize cost, and novel approaches to identifying and mitigating racial bias.
“Legal AI should be centered on building human talent – cultivating judgment and domain literacy in attorneys – so that technology apprentices to the profession rather than the profession apprenticing to the tool,” said Ma. “Quality in legal work has never been measured by efficiency and accuracy alone. It has been defined by advocacy, transparent reasoning, client-centered judgment, and accountability.”
Caution is key for AI in clinical settings
In her lab, Roxana Daneshjou, assistant professor of biomedical data science and of dermatology in the School of Medicine, develops and tests AI tools – both image and language models – for health care. She’s especially interested in the potential pitfalls of these tools, including biases toward particular groups.
“I am most excited about how these tools will improve access for patients and workflows for physicians,” said Daneshjou. “As the late Dr. Atul Butte said, AI could allow for ‘scalable privilege,’ where everyone has access to the same information and care as those who are well-versed in the health care system.”
Daneshjou’s projects include chatbots to help patients navigate their own medical records, tools that coordinate multiple streams of data (text and images) to help predict patient outcomes, and assessments of large language models (LLMs) for use in health care. This last line of research has included an 80-expert “stress test” of these health care LLMs and new research showing that LLMs can behave sycophantically – that is, they might tell users what they want to hear rather than the truth.
“Though I am excited about AI in health care, there is a lot of hype and a lot of companies are pushing AI that is not ready for deployment into clinical environments,” said Daneshjou. “If AI that isn’t ready for prime time ends up causing harm, there will be a deserved backlash. We need to be thoughtful and careful because you can’t ‘move fast and break things’ when human lives are on the line.”
Giving teachers power in educational AI advances

The lab of Dora Demszky recently explored AI in math education during the Practitioner Voices Summit.
While some people are developing AI for students, the lab of Dora Demszky, assistant professor of education data science in the Graduate School of Education, focuses on how AI can support teachers. “AI is not there yet in terms of being able to support students more holistically,” said Demszky. “And so by having teachers in the loop, we can make sure that there’s oversight.”
Examples of projects in Demszky’s lab include using language models to analyze classroom discourse – such as how much the teachers talked versus the students – to help provide coaching for teachers, trying to use generative AI to help teachers adapt curriculum, creating math diagrams to improve visual learning options, and advancing more holistic and customized feedback for students. “All of these three areas came from what we heard from teachers that they really needed,” said Demszky.
The Demszky lab’s work considers that users will have a range of teaching experience, from seasoned experts to those who have minimal training. Many of the lab’s projects include the intention of helping students with different learning needs, such as being below grade level or not speaking English as their first language. The lab also recently hosted a Practitioner Voices Summit on the subject of AI in math education.
How to train your robot

Professor Chelsea Finn uses machine learning to make advancements in the field of robotics. | Stanford HAI
If you dream of household robots that can perform many different tasks, you need some way of programming all the variability that the real world throws at them. Trying to manually create something that addresses every possibility – picking up a mug with the handle in front versus on the side, or from a cupboard versus a table, at sunset or in a bright room – is basically impossible.
“Instead, one of the most successful techniques for helping robots deal with all of the variability in the world is machine learning, where instead of trying to model the world and pre-programming every scenario, you collect a lot of data and learn patterns from it,” said Chelsea Finn, assistant professor of computer science and of electrical engineering in the School of Engineering.
Finn’s lab has recently pushed the boundaries of AI robotics by relying on inexpensive hardware to produce highly trainable and relatively adaptable devices, such as Mobile ALOHA, whose impressive repertoire famously includes cooking shrimp. “The downside with that work is we can’t yet cook shrimp in any kitchen,” said Finn. “The ability to do something in any sort of scenario is called ‘generalization,’ and for that, these machine learning algorithms need larger datasets and larger models.”
In light of this need, Finn led an effort called DROID (Distributed Robot Interaction Dataset), which collected robot training data from about 15 different institutions in 50 different buildings and is fully open source. The lab is also advancing “vision, language, action” models, which combine large algorithm-based models to make it possible for robots to respond to text and/or visual cues with specific actions.
Can language models mimic the human brain?
Laura Gwilliams is an assistant professor of psychology in the School of Humanities and Sciences, a Wu Tsai Neurosciences Institute faculty scholar, and a Stanford Data Science faculty fellow. Gwilliams’ lab studies how the human brain understands and produces language. They do this by analyzing many different types of recordings of the brain at work. “We are able to see how the brain is processing and producing language from the whole brain’s perspective while also zooming all the way in, to the level of a single neuron,” said Gwilliams.
Since Gwilliams’ research focuses on language, it can leverage large language models (LLMs) to track, quantify, and analyze the data from these brain recordings. The lab is even testing how well LLMs can act as stand-ins for the human brain – similar to how lab mice are used in biological studies. One example is that a student in the lab is studying stroke-related language problems by “lesioning” different aspects of LLMs to see whether model behavior mimics aphasic speech.
Gwilliams said AI-based methods enable language researchers to test multiple hypotheses with a single experiment by asking human participants to engage with natural language, such as audiobook listening. This also feels more natural for research participants. As in other work, though, the researchers must take care to understand how the models are working and evolving – and how well they reflect the workings of the human brain. “You’re almost probing an alien system to see whether it has converged on the same types of computational solutions that the human system has converged on,” said Gwilliams.
Bringing AI out of the ‘black box’ era
As an AI researcher in industry, Yuyan Wang, an assistant professor of marketing in the Graduate School of Business, was frustrated that the algorithms she was working with – such as those that decide what goes on your YouTube homepage – are mostly black boxes. “The models have no understanding whatsoever about the why behind consumers’ choices, like ‘Why do you choose to watch this video? Why do you buy this product?’” said Wang. In part, this is because consumer choices are highly complex – to the point that the consumers themselves might have trouble explaining their preferences.
This has motivated Wang to leverage behavioral and psychological theory and economic structural insights to design AI systems that are more transparent, robust, human-centric, and optimizable for the long term. For example, Wang has continued work she began as an industry-based researcher to restructure consumer algorithms around consumers’ predicted intents – a change that consistently gives more satisfactory recommendations and improves the consumer experience in the long term, without the need to collect any additional data about the consumers.
“Black box models are not going to get us to artificial general intelligence. If we’re interested in getting to AGI, we need to understand how these models work and why people behave in a certain way,” said Wang. “We also need to know what these intelligent models are doing and why, so we know if they are doing something wrong and how to fix it.”
Solving protein puzzles
In 2024, the Nobel Prize in Chemistry recognized the development of an AI model that predicts protein structures. “It’s really difficult to overstate how impactful this advancement has been for understanding, on an atomic level, how biological systems work,” said Brian Trippe, assistant professor of statistics in the School of Humanities and Sciences. Trippe was advised by Nobel laureate David Baker and continues his own research on using statistical machine learning to inform molecular modeling and design.
To work with proteins – such as for the purpose of targeting them for medical treatments – it’s essential to know their exact structures. Two tricky problems that are front-of-mind for Trippe: making sure that proposed treatments are a perfect fit for protein targets (to avoid side effects) and, relatedly, getting the most detailed understanding of protein structure possible. Proteins are three-dimensional and dynamic – so a machine learning model that predicts their structure would ideally also predict a range of possible conformations, while maintaining precision, said Trippe.
By integrating modern machine learning techniques with a tradition of using physics and statistics to solve for protein structures, Trippe hopes to assess thousands of potential therapeutics thousands of times faster than traditional techniques can. “Over the past decade or so, it’s become increasingly obvious that data is going to be very crucial to how we understand cells and how we design and build on them,” said Trippe.
Writer
Taylor Kubota


