People
Faculty
Stuart Russell
Professor of EECS at UC Berkeley
Director of Kavli Center for Ethics, Science, and the Public
Stuart Russell received his B.A. with first-class honours in physics from Oxford University in 1982 and his Ph.D. in computer science from Stanford in 1986. He then joined the faculty of the University of California at Berkeley, where he is a Professor (and formerly Chair) of Electrical Engineering and Computer Sciences and holder of the Smith-Zadeh Chair in Engineering. He is a Fellow of the American Association for Artificial Intelligence, the Association for Computing Machinery, and the American Association for the Advancement of Science. His book Artificial Intelligence: A Modern Approach (with Peter Norvig) is the standard text in AI; it has been translated into 13 languages and is used in over 1300 universities in 118 countries. His research covers a wide range of topics in artificial intelligence including machine learning, probabilistic reasoning, knowledge representation, planning, real-time decision making, multitarget tracking, computer vision, computational physiology, and philosophical foundations. His current concerns include the threat of autonomous weapons and the long-term future of artificial intelligence and its relation to humanity. In 2022, he was selected to become the inaugural director of the newly created Kavli Center for Ethics, Science, and the Public. You can read more about Prof. Russell’s work and accomplishments at his website.
Pieter Abbeel
Professor of EECS at UC Berkeley
Pieter Abbeel has been a Professor at UC Berkeley (EECS, BAIR) since 2008 and was a Research Scientist at OpenAI during 2016-2017. Pieter has developed apprenticeship learning algorithms that have enabled advanced helicopter aerobatics, including maneuvers such as tic-tocs, chaos and auto-rotation, which only exceptional human pilots can perform. His group has enabled the first end-to-end completion of reliably picking up a crumpled laundry article and folding it and has pioneered deep reinforcement learning for robotics, including learning locomotion and visuomotor skills. His current research focuses on robotics and machine learning with particular focus on deep reinforcement learning, deep imitation learning, deep unsupervised learning, meta-learning, learning-to-learn, and AI safety. You can read more about Prof. Abbeel’s work and accomplishments at his website.
Anca Dragan
Assistant Professor of EECS at UC Berkeley
Founder, InterACT Lab
Anca Dragan got her B.Sc. in Computer Science from Jacobs University Bremen in Germany, after which she went to Carnegie Mellon for her Ph.D. in Robotics. She is now an Assistant Professor in the EECS Department at UC Berkeley. Her goal is to enable robots to work with, around, and in support of people. She runs the InterACT Lab, which focuses on algorithms for human-robot interaction—algorithms that move beyond the robot’s function in isolation, and generate robot behavior that also accounts for interaction and coordination with end-users. She also helped found and serves on the steering committee for the Berkeley AI Research (BAIR) Lab.
You can read more about Prof. Dragan’s work and accomplishments at her website.
Bart Selman
Professor of Computer Science and Engineering at Cornell University
Bart Selman is a Professor of Computer Science at Cornell University; previously, he worked at AT&T Bell Laboratories. His research interests include computational sustainability, efficient reasoning procedures, planning, knowledge representation, and connections between computer science and statistical physics. He has (co-)authored over 100 publications, including six which won best paper awards. He is a Fellow of the American Association for Artificial Intelligence and a Fellow of the American Association for the Advancement of Science.
You can read more about Prof. Selman’s work and accomplishments at his website.
Emma Pierson
Assistant Professor of Computer Science at UC Berkeley
Emma Pierson is an assistant professor of computer science, affiliated with the Berkeley AI Research Lab, Computational Precision Health, and the Center for Human-Compatible AI. She will be taking PhD students in Fall 2025; please apply to Berkeley in Fall 2024 and mention her name in your application if you are interested in working with her. Her work develops data science and machine learning methods to study two broad areas: inequality and healthcare. Please see the "Publications" link for representative papers. Her work has been recognized by best paper awards at KDD and AISTATS, an NSF CAREER award, a Rhodes Scholarship, Hertz Fellowship, Rising Star in EECS, MIT Technology Review 35 Innovators Under 35, Forbes 30 Under 30 in Science, AI2050 Early Career Fellowship, and Samsung AI Researcher of the Year. She writes a statistics blog, Obsession with Regression, and has also written for The New York Times, FiveThirtyEight, The Atlantic, The Washington Post, Wired, and various other publications.
Joseph Halpern
Professor of Computer Science and Engineering at Cornell University
Joseph Halpern is a Professor of Computer Science at Cornell University. His research focuses on the interface between game and decision theory and computer science, on reasoning about knowledge and uncertainty, and on causality. He has also done work on and continues to think actively about security, (fault tolerant) distributed computing, and modal logic. You can read more about Prof. Halpern’s work and accomplishments at his website.
Michael Wellman
Professor of Computer Science and Engineering at the University of Michigan
Michael P. Wellman is a Professor of Computer Science & Engineering at the University of Michigan. He received a Ph.D. from the Massachusetts Institute of Technology in 1988 for his work in qualitative probabilistic reasoning and decision-theoretic planning. From 1988 to 1992, Wellman conducted research in these areas at the USAF’s Wright Laboratory. For the past 25 years, his research has focused on computational market mechanisms and game-theoretic reasoning methods, with applications in electronic commerce, finance, and cyber-security. Wellman previously served as Chair of the ACM Special Interest Group on Electronic Commerce (SIGecom) and as Executive Editor of the Journal of Artificial Intelligence Research. He is a Fellow of the Association for the Advancement of Artificial Intelligence and the Association for Computing Machinery.
You can read more about Prof. Wellman’s work and accomplishments at his website.
Satinder Singh Baveja
Professor of Computer Science and Engineering at the University of Michigan
Director of the Artificial Intelligence Laboratory
Satinder Singh Baveja is a Professor in the Division of Computer Science and Engineering, Department of Electrical Engineering and Computer Science at the University of Michigan, Ann Arbor, as well as Director of the AI Lab there. His main research interest is in the old-fashioned goal of Artificial Intelligence (AI), that of building autonomous agents that can learn to be broadly competent in complex, dynamic, and uncertain environments. The field of reinforcement learning (RL) has focused on this goal and accordingly his deepest contributions are in RL. From time to time, he takes seriously the challenge of building agents that can interact with other agents and even humans in both artificial and natural environments. This has led to research in computational game theory / mechanism design, cognitive science, and human-computer interaction.
You can read more about Prof. Singh’s work and accomplishments at his website.
Serina Chang
Assistant Professor of EECS at UC Berkeley
Serina Chang is an Assistant Professor, with a joint appointment in EECS and Computational Precision Health. Previously, she completed her Ph.D. in Computer Science at Stanford University. She is currently doing a postdoc at Microsoft Research NYC in the Computational Social Science group. Her research falls at the intersection of AI, public health, and social science. Specifically, she develops AI and graph methods to study human behavior, improve public health, and guide data-driven policymaking. Her work is recognized by a KDD Best Paper Award, NSF Graduate Research Fellowship, Meta PhD Fellowship, EECS Rising Stars, Rising Stars in Data Science, and Cornell Future Faculty Symposium, and has been featured by over 650 news outlets, including The New York Times and The Washington Post.
Tania Lombrozo
Professor of Psychology at UC Berkeley
Tania Lombrozo received her Ph.D. from Harvard University and is a Professor of Psychology at UC Berkeley. She will be joining the Department of Psychology at Princeton University in July 2018. Her research aims to address basic questions about learning, reasoning, and decision-making using the empirical tools of experimental psychology and the conceptual tools of analytic philosophy. You can read more about Prof. Lombrozo’s work here or here.
Tom Griffiths
Professor of Psychology and Cognitive Science at Princeton
Tom Griffiths is a Professor of Psychology and Cognitive Science at Princeton, and previously the Director of the Computational Cognitive Science Lab and the Institute of Cognitive and Brain Sciences at the University of California, Berkeley. He is interested in developing mathematical models of higher level cognition and understanding the formal principles that underlie our ability to solve the computational problems we face in everyday life. His current focus is on inductive problems, such as probabilistic reasoning, learning causal relationships, acquiring and using language, and inferring the structure of categories. He tries to analyze these aspects of human cognition by comparing human behavior to optimal or “rational” solutions to the underlying computational problems. For inductive problems, this usually means exploring how ideas from artificial intelligence, machine learning, and statistics (particularly Bayesian statistics) connect to human cognition. You can read more about Prof. Griffiths’ work and accomplishments at his website.
Staff
Mark Nitzberg
Executive Director of CHAI
Head of Strategic Outreach for Berkeley AI Research
Dr. Mark Nitzberg is currently Executive Director of CHAI, as well as head of strategic outreach for Berkeley AI Research (BAIR). He began studying AI as a stowaway student at MIT in the early 1980s and wrote his Ph.D. in Computer Vision and Human Perception in 1991 under David Mumford at Harvard University. Dr. Nitzberg has built companies and products in the areas of computer vision, machine learning, financial portfolio optimization, workflow efficiencies, online commerce, development aid data capture and analytics, and film and theatre. Most recently he was the Director of Computer Vision Products at A9/Amazon following their acquisition of The Blindsight Corporation, a maker of assistive technologies for low vision and active aging, where Mark was founding CEO. You can read more about Dr. Nitzberg’s work at his LinkedIn.
James Paul (J.P.) Gonzales
Assistant to Stuart Russell
James Paul "J.P." Gonzales serves as the executive assistant to Professor Stuart Russell. Embedded at CHAI and employed by the Berkeley Existential Risk Initiative (BERI), J.P. provides administrative, travel/event-planning and PR assistance to Prof. Russell and CHAI, with the primary goal of supporting Prof. Russell's AI safety research and advocacy. He received his Bachelor of Arts in Cognitive Science at UC Merced and Master of Science in Experimental Psychology at Arizona State University, and currently resides in San Diego, California.
Sarah Otis
Assistant Director of CHAI
Researchers
Andrew Critch
Research Scientist
Dr. Andrew Critch is currently a part-time Research Scientist at CHAI, and spends most of his time away from UC Berkeley serving as CEO of Encultured AI.
Ian Baker
Research Engineer with Jonathan Stray
Ian Baker has focused on shipping production-quality code and leading high-impact software teams for more than two decades. Professionally, he cofounded a YCombinator-backed startup and led it through its acquisition, built recommender systems and ML infrastructure at Dropbox, and worked to drive editor engagement at Wikipedia. He is also an accomplished industrial artist: with his collaborators, he holds the world record for the hottest video game, and built a 2.5 mile long straightedge that you could use to see the curvature of the Earth. His present goal is to bring industry-leading software engineering practices to academic research on AI alignment.
Jonathan Stray
Senior Scientist
Jonathan Stray is a Senior Scientist at the Center for Human-Compatible AI at UC Berkeley, where he works on the design of recommender systems for better personalized news and information. He previously taught the dual master's degree in computer science and journalism at Columbia University, built several pieces of software for investigative journalism, worked as an editor at the Associated Press, and developed graphics algorithms for Adobe. He holds an MSc in Computer Science from the University of Toronto and an MA in Journalism from the University of Hong Kong.
Julia Morris
Research Associate
Julia Morris is an STS-trained social technologist focused on projects that make AI policy more transparent and accessible.
Justin Svegliato
Research Scientist
Justin is a Research Scientist in the Center for Human-Compatible AI at UC Berkeley. He received his PhD in Computer Science at UMass Amherst where he was advised by Shlomo Zilberstein. The goal of his research is to build autonomous systems that operate in the open world for long periods of time in an efficient, reliable, and ethical way. Recently, his work has focused on ensuring that autonomous systems comply with different ethical theories, such as utilitarianism, Kantianism, and prima facie duties. His work has also focused on using metareasoning to enable autonomous systems to recover from exceptions, maintain and restore safe operation, and optimize the performance of anytime planning algorithms for bounded rationality. You can learn more about Justin at his website.
Livia Morris
Research Associate
Livia Morris is a designer with a background in cognitive science and science and technology studies. At CHAI she worked on projects to catalog and synthesize legislative information about AI.
Research Fellows
Ben Plaut
Research Fellow
Ben is a postdoctoral research fellow at CHAI, mentored by Stuart Russell. He received his PhD from Stanford under the supervision of Ashish Goel. He does a mix of theory and empirical research, focusing on generalization: how a model handles unfamiliar inputs. Ben thinks that many safety failures can be framed as misgeneralization. Overall, he aims to design safety methods that are theoretically grounded and can potentially scale to very advanced AI systems, but which can be tested on (and are useful for) systems we have today. You can read more on Ben’s website.
Brian Judge
Research Fellow
Brian Judge is an AI Policy Fellow at CHAI, where he studies the governance of artificial intelligence and its implications for global political economy. He received his PhD in Political Science from the University of California, Berkeley. His work has appeared in New Political Economy, New Media & Society, and the Cambridge Journal of Economics, and his first book is forthcoming from Columbia University Press. You can read more about Brian’s work at his website.
Cameron Allen
Research Fellow
Cameron Allen is a postdoc at CHAI working with Stuart Russell. Cam studies the computations that enable intelligence. His expertise is mainly in reinforcement learning, planning, observation and action abstraction, representation learning, memory, and factored world modeling. He is especially interested in exploring how abstraction can help tackle problems in value alignment, interpretability, and bounded rationality. Prior to CHAI, Cam completed his PhD at Brown University, advised by George Konidaris, where he studied structured abstractions for general-purpose decision making. You can learn more about Cam's work at his website.
George Obaido
Research Fellow
George Obaido is a Postdoctoral Scholar at the University of California, San Diego, United States. He holds a PhD in Computer Science from the University of the Witwatersrand in Johannesburg, South Africa, where he also obtained his Master's degree. His research interests lie in using machine learning to find solutions to many societal problems. He has collaborated actively with researchers in several other disciplines of computer science, particularly in the Data Economies and Data Markets Group of Mechanism Design for Social Good, a multi-institutional initiative using techniques from algorithms, optimization, and mechanism design, along with insights from other disciplines, to improve access to opportunity for historically underserved and disadvantaged communities.
Khanh Nguyen
Research Fellow
Khanh Nguyen is a postdoc at CHAI, being mentored by Prof. Stuart Russell. Prior to joining CHAI, he spent a year as a postdoc at Princeton NLP with Prof. Karthik Narasimhan. Khanh obtained his PhD at the University of Maryland, College Park, under the guidance of Prof. Hal Daumé III. His research aims to enhance the performance and safety of AI agents by enabling them to communicate and collaborate more effectively with humans. He has developed frameworks for learning from natural language feedback and for learning to ask natural, uncertainty-grounded questions. Previously, he worked on reinforcement learning from human ratings for text generation, which is a currently popular method for fine-tuning large language models like GPTs.
Michael Cohen
Research Fellow
Michael is a postdoc at CHAI supervised by Stuart Russell. He received his PhD at Oxford University advised by Michael Osborne. His research has identified conditions under which we can expect advanced artificial long-term planning agents to escape humanity's control in order to control their inputs, and he has developed multiple idealized designs for agents which avoid those conditions. This includes an isolated environment, pessimism, and robust imitation. Recently, he has also worked on designing a regulatory framework to prevent extinction from future advanced artificial agents. You can learn more about Michael at his website.
Paria Rashidinejad
Research Fellow
Paria is a postdoctoral research fellow working with Stuart Russell. Read more on her LinkedIn.
Alumni
Caroline Jeanmaire
PhD Student at Oxford University, Public Policy
Director of Strategic Research and Partnerships (2019 - 2021)
Caroline led CHAI’s partnership and external relations strategy, focusing on building a research community around AI safety and relationships with key stakeholders. She also researched models of international coordination to ensure the safety and reliability of AI systems. Before working at CHAI, she was an AI Policy Researcher and Project Manager at The Future Society, a think-tank incubated at Harvard’s Kennedy School of Government. She notably supported the organization of the first and second Global Governance of AI Forums at the World Government Summit in Dubai, with over 200 attendees. In the 2019 edition, she managed the Geopolitics of AI and International Panel on AI research committees. Caroline is experienced in multi-party coordination and negotiation. She was a Youth Delegate to the United Nations for two years with the French delegation. She participated in numerous climate negotiations and technical intersessions (including COP21, COP22, COP23 and COP24). Caroline has a dual master’s degree in International Relations from Peking University and Sciences Po Paris and a bachelor’s degree in political sciences from Sciences Po Paris. She also studied at Tufts University and at the Graduate Fletcher School of Law and Diplomacy. Caroline speaks English, French, Spanish and Mandarin Chinese. Caroline is currently pursuing her PhD in Public Policy at The Blavatnik School of Government at Oxford University.
Karthika Mohan
Assistant Professor at Oregon State University
Postdoctoral Scholar (2017-2021)
Karthika Mohan is an Assistant Professor at Oregon State University. At CHAI, she was a postdoctoral scholar mentored by Stuart Russell. Karthika received her PhD in Computer Science (Artificial Intelligence) from UCLA, where she was advised by Judea Pearl. Her research is interdisciplinary, and her areas of interest include Causal Inference, Graphical Models and AI Safety. She was awarded the 2017 Google Outstanding Graduate Research Award, which is a UCLA Commencement Award. Currently she serves on the editorial board of the Journal of Causal Inference. In addition, she is active on program committees of leading AI/ML conferences and reviews for journals in varied disciplines such as Machine Learning, Psychology and Philosophy. For more details, please visit here.
Adam Gleave
Founder at FAR AI
PhD Student in Artificial Intelligence (2017-2023)
Adam's research focuses on reward specification for reinforcement learning. Most recently, Adam developed the EPIC distance to compare reward functions. Previously, he has worked on reward modeling techniques with a number of collaborators, including inverse reinforcement learning (IRL) from vision and multi-task IRL. Adam is also interested in the robustness of reinforcement learning, and has demonstrated the existence of adversarial policies in multi-agent environments: policies that cause an opponent to fail despite behaving seemingly randomly. Prior to joining Berkeley, Adam did a M.Phil. with Christian Steinruecken and Zoubin Ghahramani in the Machine Learning Group at the University of Cambridge. You can learn more about Adam’s work at his website and by following him on Twitter.
Alex Turner
Research Scientist at Google DeepMind on Scalable Alignment Team.
Research Fellow (2022 - 2023)
Alex wanted to understand how to reliably form values and other cognitive structures within AIs. For example, he is interested in how different reward schedules + learning algorithms + environments (e.g. Pac-Man with PPO using score differentials as state-action-state rewards) translate into policy-level circuitry (e.g. IF near ghost, THEN move away). In this light, he wants to understand how this process occurs in human beings, the only generally intelligent agents we have ever seen. Alex is currently a research scientist at Google DeepMind on the Scalable Alignment team.
Cody Wild
Research Software Engineer at Google
Research Engineer (2018-2021)
Cody Wild is a Research Software Engineer at Google, previously Research Engineer at CHAI. She holds a B.S. in Statistical Economics from Tulane University, and an M.S. in Analytics from the University of San Francisco. In the years after finishing her Master's program, Cody worked as an industry Data Scientist, first at LendUp, and then on an applied R&D team for Sophos Labs working on developing deep learning models to detect viruses and other malicious file types. Outside of formal employment, she’s worked to deepen her knowledge in Machine Learning by reading and synthesizing technical content from the field, which can be found online in both long and short form. She is passionate about precise understanding, clear arguments, and clean code, and is glad to be helping support a research team that’s thinking critically about the ways their work and AI writ large could negatively impact the world.
Dan Hendrycks
Director at the Center for AI Safety
PhD Student, Computer Science (2018-2022)
Dan Hendrycks is Director at the Center for AI Safety. His research aims to disentangle and concretize the components necessary for safe AI. This leads him to work on quantifying and improving the performance of models in unforeseen out-of-distribution scenarios. He also works on measuring a model’s alignment with human values. Dan received his BS from the University of Chicago. You can find out more about his research at his website.
Daniel Filan
Research Manager at MATS
PhD Student in EECS
Daniel Filan is a PhD student of EECS at UC Berkeley, supervised by Stuart Russell. He’s interested in effective altruism and wants to ensure that future artificial intelligences who may be much more strategically intelligent than us behave in a safe way. In 2016, Daniel worked with Tom Everitt, Mayank Daswani, and Marcus Hutter, considering the problem that a sufficiently advanced AI could choose to modify its source code in order to have easily achievable goals, and such modifications may not be to humans’ liking (paper). They determined that an agent will not self-modify if and only if the value function of the agent anticipates the consequences of self-modification and uses the agent’s current utility function when evaluating the future. Daniel is currently thinking about mechanistic transparency, the problem of how to understand the workings of trained models. You can learn more about Daniel’s work at his personal website.
Dorsa Sadigh
Assistant Professor of Computer Science and Electrical Engineering at Stanford University
PhD Student, Electrical Engineering and Computer Science (2012-2017)
Dorsa Sadigh is an Assistant Professor in Computer Science and Electrical Engineering at Stanford University. Her research interests lie in the intersection of robotics, learning, and control theory. Specifically, she is interested in developing algorithms for safe and adaptive human-robot interaction. Dorsa received her doctoral degree in Electrical Engineering and Computer Sciences (EECS) from UC Berkeley in 2017, and her bachelor’s degree in EECS from UC Berkeley in 2012. She has been awarded the NSF CAREER award, the AFOSR Young Investigator Program Award, the IEEE TCCPS early career award, the Google Faculty Award, and the Amazon Faculty Research Award. Learn more at her website.
Dylan Hadfield-Menell
Assistant Professor at MIT
PhD Student, Computer Science (2013-2020)
Dylan will be starting as an Assistant Professor at the Massachusetts Institute of Technology in July 2021. While at CHAI, Dylan was a Ph.D. student at UC Berkeley advised by Anca Dragan, Pieter Abbeel, and Stuart Russell. His research focused on algorithms that facilitate human-compatible artificial intelligence. In particular, he tried to develop frameworks that account for uncertainty about the objective being optimized. Before coming to Berkeley, Dylan did a Master’s of Engineering with Leslie Kaelbling and Tomás Lozano-Pérez at MIT. At Berkeley, Dylan’s research has taken a turn to focus more on AI safety and, thinking longer term, AI value alignment. In 2016, he and his advisors formally described a cooperative inverse reinforcement learning problem (paper). The problem serves as a tool to help researchers consider how robots could learn humans’ values via cooperative instruction. While robots could learn humans’ values by observing humans, cooperative instruction is likely to be significantly faster. In 2017, Dylan and his advisors described an “off-switch game,” a simplified problem describing scenarios in which a human would like to turn off a robot, but the robot is able to disable its off-switch (paper). They showed that a robot who is uncertain about the utility derived from various outcomes in the game is more likely to allow a human to turn it off. You can learn more about Dylan’s other professional work at his website and follow him on Twitter here.
Erdem Biyik
Assistant Professor of Computer Science at USC
Postdoctoral Researcher, EECS (2022-2023)
Erdem is an Assistant Professor in the Department of Computer Science at the University of Southern California, which he joined in 2023. He was a postdoctoral scholar at UC Berkeley mentored by Stuart Russell and Anca Dragan. He received his B.Sc. degree from Bilkent University, Turkey, in 2017, and his Ph.D. degree from Stanford University in 2022, where he was advised by Dorsa Sadigh. His research interests lie in the intersection of robotics, artificial intelligence, machine learning and game theory. He is interested in enabling robots to actively learn from various forms of human feedback and designing robot policies to improve the efficiency of multi-agent systems both in cooperative and competitive settings. He also worked at Google as a research intern in 2021, where he adapted his active robot learning algorithms to recommender systems.
George Matheos
PhD Student at MIT
Former Undergraduate Student
Jaime Fernandez Fisac
Assistant Professor of Electrical Engineering at Princeton University
PhD Student, Control, Intelligent Systems and Robotics (2013-2018)
Jaime is currently an Assistant Professor in Electrical Engineering at Princeton University. Prior to this appointment, he worked at Waymo as a Research Scientist. While Jaime was at CHAI as a PhD student, he worked on autonomous robots in both academia and industry, with a particular focus on collision avoidance and multi-agent systems. Broadly, his research focused on safely introducing robotics into society. Under the guidance of CHAI Professors Anca Dragan and Tom Griffiths, and along with fellow CHAI graduate students Vael Gates and Dylan Hadfield-Menell, Jaime worked on solving the cooperative inverse reinforcement learning (CIRL) dynamic game using well-established models of human inference, decision making, and theory of mind from the cognitive science literature. Previous solutions have relied on modelling both the human and robot as perfectly rational and able to coordinate in advance, which are nontrivial assumptions in the real world. Instead, Jaime and his colleagues’ work models the human as pedagogic (i.e., her behaviour will aim to be instructive) and the robot as pragmatic (i.e., it knows the human is not perfectly rational but is still trying to teach it). Results suggest this formulation produces robots that are more competent collaborators (paper). In the past, Jaime has also researched how we can have robots choose their course of action in a way that will be easy for a human observer to anticipate (paper) and how incorporating uncertainty into a safety framework for robotic systems that works in conjunction with their learning process can provide meaningful safety guarantees (paper). At Princeton, Jaime intends to look into how AI systems can utilize models of human cognition and behavior to ensure safe interaction with people. He believes that the safer robots will be those that engage their users and procure their cooperation, rather than try to protect against their indifference.
He hopes that designing safe human-centered robotic systems in the short term will give us key insights to tackle the broader, long-term AI safety problem. Also, he has done research on having AI treat their models as fallible to protect against overconfidence. You can explore more about this topic by watching this video. You can learn more about Jaime at his website.
Lawrence Chan
Member of Technical Staff at METR
PhD Student (2023), Advised by Anca Dragan and Stuart Russell
As of January 2023, Lawrence is working at METR (formerly “ARC Evals”), where he conducts evaluations of large language models. Previously, he was at Redwood Research, focusing on adversarial training and neural network interpretability.
He is also pursuing a PhD at UC Berkeley under the supervision of Anca Dragan and Stuart Russell. Before that, he earned a BAS in Computer Science and Logic and a BS in Economics from the University of Pennsylvania’s M&T Program, where he had the opportunity to work with Philip Tetlock on applying machine learning to forecasting.
Lawrence's primary research interests include mechanistic interpretability and scalable oversight. He has also explored conceptual questions related to learning human values.
Additionally, he occasionally blogs about AI alignment and other topics on LessWrong and the AI Alignment Forum.
Michael Dennis
Researcher at Google DeepMind
PhD Student, Artificial Intelligence and Computational Geometry (2016-2023)
Before joining CHAI, Michael worked in theoretical computer science. He is working to close the gap between game theoretic principles and current approaches to multi-agent learning in order to provide better assurances of both performance and stability in these systems. Currently, this takes the form of work on Unsupervised Environment Design (UED), which aims to build complex and challenging environments automatically to promote efficient learning and transfer. This framework has deep connections to decision theory, which allows us to make guarantees about how the resulting policies would perform in human-designed environments, without having ever trained on them. You can learn more about Michael's work at his website and by following him on Twitter.
Olivia Watkins
Member of Technical Staff at OpenAI
PhD Student (Fall 2019 - Spring 2024)
Olivia is currently a Member of Technical Staff at OpenAI. While at CHAI, Olivia was co-advised by Pieter Abbeel and Trevor Darrell. She is excited about human-in-the-loop learning, and more generally about finding better ways to provide human supervision and priors to AI agents. More details at https://aliengirlliv.github.
Rohin Shah
Research Scientist at Google DeepMind
PhD Student, Artificial Intelligence (2014-2020)
Rohin currently works as a Research Scientist on the technical AGI safety team at DeepMind. Prior to DeepMind, Rohin joined CHAI in his fourth year after he became convinced of the importance of building safe, aligned AI. He now thinks about how to provide specifications of good behavior in ways other than reward functions, especially ones that do not require much human effort. In his PhD, he worked on building AI systems that can learn to assist a human user, even if they don't initially know what the user wants. His general interests in CS are very broad, including AI, machine learning, programming languages, complexity theory, algorithms, and security, and so he started his PhD working on program synthesis. Rohin writes the Alignment Newsletter, a weekly publication with recent content relevant to AI alignment that has over 600 subscribers. You can learn more about his work at his website and by following him on Twitter.
Sam Toyer
Member of Technical Staff at OpenAI, Model Safety Team
PhD Student in Artificial Intelligence (2018-2024)
Sam is currently a member of technical staff at OpenAI on the model safety team. Sam was a PhD student in EECS at UC Berkeley, advised by Stuart Russell. As an undergraduate at the Australian National University, he worked on a range of topics, including human pose estimation from video, establishing best practices for serving large-scale satellite data on the web, and accelerating classical and probabilistic planning with deep learning. Currently, he is interested in how to make intelligent agents that can infer and act in accordance with the preferences of their owners. More details about his research are available on his website.
Smitha Milli
Research Scientist at Meta FAIR (NYC) in the AI & Society group
PhD Student, Electrical Engineering and Computer Science, 2017-2022
Smitha Milli is a Research Scientist at Meta FAIR (NYC) in the AI & Society group. Their work primarily focuses on (a) rigorous evaluation of systems interacting in feedback loops with humans (e.g. recent work on measuring effects of Twitter’s ranking algorithm) and (b) designing and learning objective functions for those systems that produce more socially-beneficial outcomes. They hold a PhD in EECS from UC Berkeley and their postdoc is funded by an Open Philanthropy early career grant. You can learn more about Smitha’s research on their website and follow them on Twitter here.
Steven Wang
Research Engineer (2018-2021)
Steven completed a Master's in Computer Science at ETH Zurich in 2023. At CHAI, he was an ML research engineer. He became interested in AI Safety after reading the 80000 Hours career guide. During his undergraduate studies at UC Berkeley, Steven worked with Anca Dragan and Jaime Fisac on confidence-based human predictions. As an intern at CHAI, he worked with Dylan Hadfield-Menell on extensions to Inverse Reward Design. You can check out what he’s been coding up lately on his GitHub.
Thanard Kurutach
AI/ML Research Scientist at Cruise
PhD Student, Computer Science (2016-2020)
Thanard Kurutach is an AI/ML Research Scientist at Cruise. He graduated in 2021 from UC Berkeley with a PhD in AI/Robotics, advised by Prof. Pieter Abbeel and Prof. Stuart Russell. His thesis is titled Learning, Planning, and Acting with Models. Previously, he was a Math and CS double major at MIT. Read more on his LinkedIn.
Thomas Krendl Gilbert
CEO at Hortus AI
PhD Student, Machine Ethics and Epistemology (2015-2021)
Tom graduated in 2021 with his PhD in Machine Ethics and Epistemology. This fall he will join the Digital Life Initiative at Cornell Tech as a postdoc. With prior training in philosophy, sociology, and political theory, Tom researches the ethical and political predicaments that emerge when technical practitioners use machine learning to reshape organizational decision-making. His recent paper “Hard Choices in Artificial Intelligence” argues that advanced AI systems should be developed democratically so that relevant preferences and norms are modeled in a way that is simultaneously accountable, contextually appropriate, and compatible between stakeholders. This work has concrete implications for the design of AI systems that are fair for distinct subpopulations, safe when enmeshed with institutional practices, and accountable to the public interest, including medium-term applications like automated vehicles. You can learn more about his work at his website.
Vael Gates
Head of Community at FAR AI
PhD Student, Computational Cognitive Science (2016-2021)
In 2021 Vael graduated with their PhD in Neuroscience with a focus on computational cognitive science. This fall Vael will start as a postdoc at Stanford HAI / CISAC, where they intend to work on an ethnography of AI researchers. At UC Berkeley, Vael was advised by Professor Tom Griffiths (a PI at CHAI). Vael’s research is broadly aimed at developing computational models of social cognition. Vael is interested in: How people can infer beliefs and intentions in others, especially by observing others’ actions and employing recursive theory-of-mind (e.g. inverse reinforcement learning, social psychology); Group-level equilibria when agents are collaborating or competing (e.g. game theory, agent-based modeling); and Mechanism design and other ways quantitative characterizations of a phenomenon can be used to predict and shape behavior. Recently, Vael has worked on two projects related to CHAI’s mission. They have assisted with the paper on solving the cooperative inverse reinforcement learning (CIRL) dynamic game and are working with Professors Anca D. Dragan, Tom L. Griffiths, and Anant Sahai on preference aggregation across agents. More specifically, in the latter project Vael and colleagues have set up a study in which participants are presented with a problem that requires mediating between the preferences of multiple agents. Vael and colleagues take participants’ responses and attempt to explain them using a quantitative model. Their hope is to create a baseline standard of “fair” reactions to the problem, a standard to which the behavior of future AIs can be compared. Vael wants to continue to approach social cognition and social inference problems from a computational perspective, using probabilistic models and large-scale web-based crowdsourcing to investigate the computational goals and algorithms driving the social mind.
By understanding the complex inferences made by human minds, they hope to contribute to the development of artificial intelligence that can collaborate and is compatible with human behavior. You can learn more at Vael’s website.
Affiliates
Alison Gopnik
Professor of Psychology at UC Berkeley
Affiliate Professor of Philosophy at UC Berkeley
Alison Gopnik is a professor of psychology and affiliate professor of philosophy at the University of California at Berkeley. She received her BA from McGill University and her PhD from Oxford University. She is an internationally recognized leader in the study of cognitive science and of children’s learning and development. She was one of the founders of the field of “theory of mind” and an originator of the “theory theory” of children’s development, and more recently introduced the idea that probabilistic models and Bayesian inference could be applied to children’s learning. She has held a Center for Advanced Studies in the Behavioral Sciences Fellowship, the Moore Distinguished Scholar fellowship at the California Institute of Technology, the All Souls College Distinguished Visiting Fellowship at Oxford, and King’s College Distinguished Visiting Fellowship at Cambridge. She is an elected member of the Society of Experimental Psychologists and the American Academy of Arts and Sciences, and a fellow of the Cognitive Science Society and the American Association for the Advancement of Science. She has been continuously supported by the NSF and was PI on a 2.5 million dollar interdisciplinary collaborative grant on causal learning from the McDonnell Foundation. She is the author or coauthor of over 100 journal articles and several books including “Words, thoughts and theories” MIT Press, 1997, and the bestselling and critically acclaimed popular books “The Scientist in the Crib” William Morrow, 1999, “The Philosophical Baby; What children’s minds tell us about love, truth and the meaning of life”, and “The Gardener and the Carpenter”, Farrar, Straus and Giroux; the latter two won the Cognitive Development Society Best Book Prize in 2009 and 2016.
She has also written widely about cognitive science and psychology for The New York Times, The Atlantic, The New Yorker, Science, Scientific American, The Times Literary Supplement, The New York Review of Books, New Scientist and Slate, among others. Her TED talk on her work has been viewed more than 3 million times. And she has frequently appeared on TV and radio including “The Charlie Rose Show” and “The Colbert Report”. Since 2013 she has written the Mind and Matter column for the Wall Street Journal. She lives in Berkeley with her husband Alvy Ray Smith, and has three children and three grandchildren.
Brandie Nonnecke
Founding Director of the CITRIS Policy Lab
Associate Research Professor at the Goldman School of Public Policy
Brandie Nonnecke, PhD is Founding Director of the CITRIS Policy Lab, headquartered at UC Berkeley. She is an Associate Research Professor at the Goldman School of Public Policy (GSPP) where she directs the Tech Policy Initiative, a collaboration between CITRIS and GSPP to strengthen tech policy education, research, and impact. Brandie is the Director of Our Better Web, a program that supports empirical research, policy analysis, training, and engagement to address the sharp rise of online harms. She is a co-director of the Berkeley Center for Law and Technology at Berkeley Law where she leads the Project on Artificial Intelligence, Platforms, and Society. She also co-directs the UC Berkeley AI Policy Hub, an interdisciplinary initiative training researchers to develop effective AI governance and policy frameworks. Brandie is the host of TecHype, a groundbreaking video and audio series that debunks misunderstandings around emerging technologies and explores effective technical and policy strategies to harness emerging technologies for good. Brandie served as a Technology and Human Rights Fellow at the Carr Center for Human Rights Policy at the Harvard Kennedy School. She also completed fellowships at the Schmidt Futures International Strategy Forum, Aspen Institute’s Tech Policy Hub, and the World Economic Forum. Her research has been published in Science, Wired, Telecommunications Policy, the Journal of Information Technology and Politics, among other outlets. Her work has been cited by the FTC, NIST, the White House Office of Science and Technology Policy, as well as in the Washington Post, BBC, NPR, among other venues. Brandie was named one of the 100 Brilliant Women in AI Ethics in 2021.
Brian Christian
Author of The Most Human Human, Algorithms to Live By, and The Alignment Problem
Brian Christian is the author of The Most Human Human, which was named a Wall Street Journal bestseller, a New York Times Editors’ Choice, and a New Yorker favorite book of the year. He is the author, with Tom Griffiths, of Algorithms to Live By, a #1 Audible bestseller, Amazon best science book of the year and MIT Technology Review best book of the year.
His third book, The Alignment Problem, was released in October of 2020.
Christian’s writing has been translated into nineteen languages, and has appeared in The New Yorker, The Atlantic, Wired, The Wall Street Journal, The Guardian, The Paris Review, and in scientific journals such as Cognitive Science. Christian has been featured on The Daily Show with Jon Stewart, Radiolab, and The Charlie Rose Show, and has lectured at Google, Facebook, Microsoft, the Santa Fe Institute, and the London School of Economics. His work has won several awards, including fellowships at Yaddo and the MacDowell Colony, publication in Best American Science & Nature Writing, and an award from the Academy of American Poets.
Born in Wilmington, Delaware, Christian holds degrees in philosophy, computer science, and poetry from Brown University and the University of Washington. A Visiting Scholar at the University of California, Berkeley, he lives in San Francisco.
Charis Thompson
Chancellor's Professor of Gender & Women's Studies at UC Berkeley
Charis Thompson is Chancellor’s Professor of Gender & Women’s Studies, and a former founding director of the Science, Technology, and Society Center at the University of California, Berkeley; and Professor of Sociology, London School of Economics and Political Science. Her current book in progress, Getting Ahead, focuses on minds, bodies, and emotions in an age of populism and technology elites. Getting Ahead is the third in her book series on technology and democracy. The first, Making Parents (2005), won the Rachel Carson Prize from the Society for the Social Study of Science, and examined gender, race, and agency in relation to reproductive technologies. The second book, Good Science (2013), examined the geopolitics and bioethics of pro-cures innovation economies. Thompson served on the Nuffield Council on Bioethics Working Group on Genome Editing, and currently serves on the World Economic Forum’s Global Technology Council on Technology, Values, and Policy, as well as UC Berkeley’s Stem Cell Research Oversight Committee, and the faculty advisory board of the Center for Race and Gender. She is a recipient of the Social Science Division Distinguished Teaching Award, and in 2017 received an Honorary Doctorate from the Norwegian Science and Technology University.
David Krueger
Assistant Professor at the University of Montreal
David Krueger is an Assistant Professor at the University of Montreal and a member of Mila. His research group focuses on Deep Learning, AI Alignment, AI safety, and AI Policy. He is broadly interested in work (including in areas outside of Machine Learning, e.g. AI governance) that could reduce the risk of human extinction (“x-risk”) resulting from out-of-control AI systems. Particular interests include:
- Reward modeling and reward gaming
- Aligning foundation models
- Understanding learning and generalization in deep learning and foundation models, especially via “empirical theory” approaches
- Preventing the development and deployment of socially harmful AI systems
- Elaborating and evaluating speculative concerns about more advanced future AI systems
Dawn Song
Professor of EECS at UC Berkeley
Dawn Song is a Professor in the Department of Electrical Engineering and Computer Science at UC Berkeley. Her research interest lies in deep learning and security. She has studied diverse security and privacy issues in computer systems and networks, including areas ranging from software security, networking security, distributed systems security, applied cryptography, blockchain and smart contracts, to the intersection of machine learning and security. She is the recipient of various awards including the MacArthur Fellowship, the Guggenheim Fellowship, the NSF CAREER Award, the Alfred P. Sloan Research Fellowship, the MIT Technology Review TR-35 Award, the Faculty Research Award from IBM, Google and other major tech companies, and Best Paper Awards from top conferences in Computer Security and Deep Learning. She is a winner of the AMiner Most Influential Scholar Award, as the most cited scholar in Computer Security. She obtained her Ph.D. degree from UC Berkeley. Prior to joining the UC Berkeley faculty, she was a faculty member at Carnegie Mellon University from 2002 to 2007.
You can read more about Prof. Song’s work and accomplishments here.
Demian Pouzo
Associate Professor of Economics at UC Berkeley
Demian Pouzo is an associate professor in the Department of Economics at UC Berkeley. Demian joined the faculty at Berkeley in 2009 as an assistant professor after receiving his PhD in Economics from NYU. He also holds an MA and BA in Economics from Universidad Torcuato Di Tella (Argentina). Pouzo’s research interests include econometrics as well as other fields such as economic theory and macroeconomics.
Gillian Hadfield
Research Professor, Johns Hopkins Whiting School of Engineering
Gillian Hadfield holds a J.D. from Stanford Law School and a Ph.D. in economics from Stanford University. She is a leading proponent of the reform and redesign of legal systems for a rapidly changing world facing tremendous challenge from globalization and technology. Her extensive research examines how to make law more accessible, effective, and capable of fulfilling its role in balancing innovation, growth, and fairness. Prof. Hadfield is a member of the World Economic Forum’s Global Future Council on the Future of Technology, Values and Policy and co-curates the Forum’s Transformation Map for Justice and Legal Infrastructure. She was appointed in 2017 to the American Bar Association’s Commission on the Future of Legal Education, serves as Director of the USC Center for Law and Social Science and is a member of the World Justice Project’s Research Consortium. Her book, Rules for a Flat World: Why Humans Invented Law and How to Reinvent It for a Complex Global Economy, was published by Oxford University Press in November 2016. You can read more about Prof. Hadfield’s work and accomplishments at her website.
Jacob Steinhardt
Assistant Professor of Statistics at UC Berkeley
Professor Steinhardt’s goal is to make the conceptual advances necessary for machine learning systems to be reliable and aligned with human values. This includes the following directions:
Robustness: How can we build models robust to distributional shift, to adversaries, to model mis-specification, and to approximations imposed by computational constraints? What is the right way to evaluate such models?
Reward specification and reward hacking: Human values are too complex to be specified by hand. How can we infer complex value functions from data? How should an agent make decisions when its value function is approximate due to noise in the data or inadequacies in the model? How can we prevent reward hacking–degenerate policies that exploit differences between the inferred and true reward?
Scalable alignment: Modern ML systems are often too large, and deployed too broadly, for any single person to reason about in detail, posing challenges to both design and monitoring. How can we design ML systems that conform to interpretable abstractions? How do we enable meaningful human oversight at training and deployment time despite the large scale? How will these large-scale systems affect societal equilibria?
These challenges require rethinking both the theoretical and empirical paradigms of ML. Theories of statistical generalization do not account for the extreme types of generalization considered above, and decision theory does not account for cases where the reward function is only approximate. Meanwhile, measuring empirical test accuracy on a fixed distribution is insufficient to analyze phenomena such as robustness to distributional shift.
For more information about Professor Steinhardt, please visit: https://www.stat.berkeley.edu/~jsteinhardt/
Jakob Foerster
Assistant Professor of Computer Science at the University of Toronto and the Vector Institute
Jakob Foerster received a CIFAR AI chair in 2019 and is starting as an Assistant Professor at the University of Toronto and the Vector Institute in the academic year 20/21. During his PhD at the University of Oxford, he helped bring deep multi-agent reinforcement learning to the forefront of AI research and interned at Google Brain, OpenAI, and DeepMind. He has since been working as a research scientist at Facebook AI Research in California, where he will continue advancing the field up to his move to Toronto. He was the lead organizer of the first Emergent Communication (EmeCom) workshop at NeurIPS in 2017, which he has helped organize ever since. Learn more at his website.
John Zysman
Professor Emeritus in Political Science at UC Berkeley
John Zysman is Professor Emeritus in the Department of Political Science. He received his B.A. from Harvard University and his Ph.D. from the Massachusetts Institute of Technology. He has written extensively on European and Japanese political economy, policy and corporate strategy.
His most recent articles include:
The Next Phase in the Digital Revolution: Abundant Computing, Platforms, Growth and Employment in February 2018 CACM
Intelligent Tools and Digital Platforms: Implications for Work and Employment in Intereconomics Review of European Economic Policy (Issue 6 2017)
The Rise of the Platform Economy in NAS Issues in Science and Technology, Spring 2016
Entrepreneurial Finance in the Era of Intelligent Tools and Digital Platforms: Implications and Consequences for Work
Recent books include:
The Third Globalization: Can Wealthy Countries Stay Rich Ed. with Dan Breznitz (Oxford University Press, 2013)
Can Green Sustain Growth: From the Religion to the Reality To Sustainable Prosperity Ed. with Mark Huberty (Stanford University Press, 2013)
Earlier books include:
The Highest Stakes: The Economic Foundations of the Next Security System (Oxford University Press, 1992)
Manufacturing Matters: The Myth of the Post-Industrial Economy (Basic Books, 1987)
Governments, Markets, and Growth: Finance and the Politics of Industrial Change (Cornell University Press, 1983)
Juliana Schroeder
Assistant Professor at the Haas School of Business at UC Berkeley
Juliana Schroeder researches how people navigate their social worlds: first, how people form inferences about others’ mental states and mental capacities and, second, how these inferences influence their interactions.
Juliana’s research has been published in journals such as Journal of Personality and Social Psychology, Journal of Experimental Psychology, and Psychological Science. It has been featured by outlets such as the New York Times, Newsweek, NBC, and the Today Show, and has been funded by the National Science Foundation.
Juliana received her B.A. in psychology and economics from The University of Virginia. She received an M.A. in social psychology and advanced methods from the University of Chicago, and an M.B.A. from The Chicago Booth School of Business. Her Ph.D. is in Psychology and Business from the University of Chicago. Juliana is currently an Assistant Professor at the Berkeley Haas School of Business.
You can read more about Prof. Schroeder’s work and accomplishments here.
Ken Goldberg
Professor and Chair of the Industrial Engineering and Operations Research Department at UC Berkeley
Ken Goldberg is an artist, inventor, and UC Berkeley Professor focusing on robotics. He was appointed the William S. Floyd Jr Distinguished Chair in Engineering and serves as Chair of the Industrial Engineering and Operations Research Department. He has secondary appointments in EECS, Art Practice, the School of Information, and Radiation Oncology at the UCSF Medical School. Ken is Director of the CITRIS “People and Robots” Initiative and the UC Berkeley AUTOLAB where he and his students pursue research in machine learning for robotics and automation in warehouses, homes, and operating rooms. Ken developed the first provably complete algorithms for part feeding and part fixturing and the first robot on the Internet. Despite agonizingly slow progress, he persists in trying to make robots less clumsy. He has over 250 peer-reviewed publications and 8 U.S. Patents. He co-founded and served as Editor-in-Chief of the IEEE Transactions on Automation Science and Engineering. Ken’s artwork has appeared in 70 exhibits including the Whitney Biennial and films he has co-written have been selected for Sundance and nominated for an Emmy Award. Ken was awarded the NSF PECASE (Presidential Faculty Fellowship) from President Bill Clinton in 1995, elected IEEE Fellow in 2005 and selected by the IEEE Robotics and Automation Society for the George Saridis Leadership Award in 2016. He lives in the Bay Area and is madly in love with his wife, filmmaker and Webby Awards founder Tiffany Shlain, and their two daughters.
You can read more about Prof. Goldberg’s work and accomplishments here.
Lara Buchak
Professor of Philosophy at Princeton University
Lara Buchak is a Professor in the Philosophy Department at Princeton University. Her research interests include decision theory, social choice theory, epistemology, ethics, and the philosophy of religion. Her book Risk and Rationality (2013) concerns how an individual ought to take risk into account when making decisions. Her research following the book has focused on applications of her view to ethics, arguing that we ought to defer to individuals’ risk-attitudes in biomedical research; that we ought to weight worse scenarios very heavily in setting climate policy; and that we ought to care a great deal about the interests of the worse-off when acting ethically. Other topics she has written on include the nature and rationality of faith; group decision-making; the relationship between assigning probability to a hypothesis and believing that hypothesis outright; and the nature of free will. Professor Buchak received her Ph.D. from Princeton in 2009. She spent 12 years in the Philosophy Department at UC Berkeley before returning to Princeton.
Lijie Chen
Assistant Professor of EECS at UC Berkeley
I am an Assistant Professor at the Department of EECS at UC Berkeley. I am honored to be part of Berkeley's Theory Group. Prior to that, I was a Miller Research Fellow at UC Berkeley, hosted by Avishay Tal and Umesh V. Vazirani. I got my Ph.D. from MIT, and I was very fortunate to be advised by Ryan Williams.
Mariano Florentino Cuéllar
Chair, Board of Directors at the Center for Advanced Study in the Behavioral Sciences
Justice Mariano-Florentino Cuéllar was nominated by Governor Jerry Brown and began serving on the California Supreme Court in January 2015. A naturalized U.S. citizen born in Northern Mexico, Cuéllar received a B.A. from Harvard magna cum laude, a J.D. from Yale Law School, and a Ph.D. in Political Science from Stanford. Before serving on the Court, he was the Stanley Morrison Professor of Law and Professor (by courtesy) of Political Science at Stanford University. A member of the Stanford faculty since 2001, Cuéllar has written books and articles on administrative law and legislation, criminal law, international law, cyberlaw, immigration, public health law, and the history of institutions. Between 2004 and 2015, Cuéllar also held leadership positions at Stanford’s Freeman Spogli Institute for International Studies. During his tenure leading the Institute and, earlier, its Center for International Security and Cooperation, Cuéllar grew the Institute’s faculty, expanded Stanford’s role in nuclear security research, launched university-wide initiatives on global poverty and cyber security, and broadened opportunities for student and faculty research abroad. You can read more about Justice Cuéllar’s work and accomplishments here or here.
Marion Fourcade
Professor of Sociology at UC Berkeley
Marion Fourcade received her PhD from Harvard University (2000) and taught at New York University and Princeton University before joining the Berkeley sociology department in 2003. A comparative sociologist by training and taste, she is interested in variations in economic and political knowledge and practice across nations. Her first book, Economists and Societies (Princeton University Press 2009), explored the distinctive character of the discipline and profession of economics in three countries. A second book, The Ordinal Society (with Kieran Healy), is under contract. This book investigates new forms of social stratification and morality in the digital economy. Other recent research focuses on the valuation of nature in comparative perspective; the moral regulation of states; the comparative study of political organization (with Evan Schofer and Brian Lande); the microsociology of courtroom exchanges (with Roi Livne); the sociology of economics, with Etienne Ollion and Yann Algan, and with Rakesh Khurana; the politics of wine classifications in France and the United States (with Rebecca Elliott and Olivier Jacquet). A final book-length project, Measure for Measure: Social Ontologies of Classification, will examine the cultural and institutional logic of what we may call “national classificatory styles” across a range of empirical domains.
Fourcade is also an Associate Fellow of the Max Planck-Sciences Po Center on Coping with Instability in Market Societies (Maxpo), and a past President of the Society for the Advancement of Socio-Economics (2016).
You can read more about Prof. Fourcade’s work and accomplishments here.
Michael Littman
Professor of Computer Science at Brown University
Michael Lederman Littman works mainly in reinforcement learning, but has done work in machine learning, game theory, computer networking, partially observable Markov decision process solving, computer solving of analogy problems and other areas. He is currently a professor of computer science at Brown University.
Before graduate school, Michael worked with Thomas Landauer at Bellcore and was granted a patent for one of the earliest systems for cross-language information retrieval. Michael received his Ph.D. in computer science from Brown University in 1996. From 1996 to 1999, he was a professor at Duke University. During his time at Duke, he worked on an automated crossword solver, PROVERB, which won an Outstanding Paper Award in 1999 from AAAI and competed in the American Crossword Puzzle Tournament. From 2000 to 2002, he worked at AT&T. From 2002 to 2012, he was a professor at Rutgers University; he chaired the department from 2009 to 2012. In Summer 2012 he returned to Brown University as a full professor.
Moritz Hardt
Assistant Professor of EECS at UC Berkeley
Moritz Hardt is an Assistant Professor in the Department of Electrical Engineering and Computer Sciences at the University of California, Berkeley. Hardt investigates algorithms and machine learning with a focus on reliability, validity, and societal impact. After obtaining a PhD in Computer Science from Princeton University, he held positions at IBM Research Almaden, Google Research and Google Brain. Hardt is a co-founder of the Workshop on Fairness, Accountability, and Transparency in Machine Learning (FAT/ML) and a co-author of the forthcoming textbook “Fairness and Machine Learning”. He has received an NSF CAREER award, a Sloan fellowship, and best paper awards at ICML 2018 and ICLR 2017. You can read more about Prof. Hardt’s work and accomplishments here.
Nika Haghtalab
Assistant Professor in the Department of Computer Science at Cornell University
Nika Haghtalab is an Assistant Professor in the Department of Computer Science at Cornell University. She works broadly on the theoretical aspects of machine learning and algorithmic economics. She especially cares about developing a theory for machine learning that accounts for its interactions with people and organizations, and the wide range of social and economic limitations, aspirations, and behavior they demonstrate. Prior to Cornell, Nika was a postdoctoral researcher at Microsoft Research, New England, in 2018-2019. She received her Ph.D. from the Computer Science Department of Carnegie Mellon University, co-advised by Avrim Blum and Ariel Procaccia. Her thesis, titled Foundation of Machine Learning, by the People, for the People, received the CMU School of Computer Science Dissertation Award (2018) and a SIGecom Dissertation Honorable Mention Award (2019).
Niko Kolodny
Professor of Philosophy at UC Berkeley
Niko Kolodny is Professor of Philosophy at UC Berkeley. He works in moral and political philosophy. He has written on the ethical significance of personal relationships and the nature of rationality, especially instrumental rationality. More recent papers focus on the value of democracy, the justification of the state, and the future of humanity. He has designed a new course for both philosophy and data science majors, “Moral Questions of Data Science,” which explores, among other things, the implications of replacing human judgment in social decision-making with statistical inference and automated algorithms.
Owain Evans
Research Lead (new AI Safety group in Berkeley)
Research Associate, Oxford University
Rediet Abebe
Assistant Professor of Computer Science at the University of California, Berkeley
Rediet Abebe is a Junior Fellow at the Harvard Society of Fellows and an incoming Assistant Professor of Computer Science at the University of California, Berkeley. Abebe holds a Ph.D. in computer science from Cornell University as well as graduate degrees from Harvard University and the University of Cambridge. Her research is in the fields of artificial intelligence and algorithms, with a focus on equity and justice concerns. She co-founded and co-organizes Mechanism Design for Social Good (MD4SG), a multi-institutional, interdisciplinary research initiative working to improve access to opportunity for historically disadvantaged communities. Abebe’s research has informed policy and practice at the National Institutes of Health (NIH) and the Ethiopian Ministry of Education. Abebe has been honored in the MIT Technology Review’s 35 Innovators Under 35, ELLE, and the Bloomberg 50 list as a “one to watch.” She has presented her research in venues including the National Academy of Sciences, the United Nations, and the Museum of Modern Art. Abebe co-founded Black in AI, a non-profit organization tackling representation and inclusion issues in AI. Her research is deeply influenced by her upbringing in her hometown of Addis Ababa, Ethiopia.
Rosie Campbell
Managing Director of Eleos AI Research
Rosie Campbell currently works as a Technical Program Manager at OpenAI. Prior to working at OpenAI, she used to be the Assistant Director of CHAI. Her academic background is in Computer Science (MSc) and Physics (BSc) and includes some Philosophy and Machine Learning. She is motivated by the long-term and short-term challenges of aligning AI with human interests, but optimistic about the benefits of friendly AI. Before joining CHAI, Rosie worked as a Research Engineer at BBC R&D in Manchester, UK. Here, she worked on a variety of technical research and development projects, including a leading role on a team exploring artificially intelligent production systems. While in Manchester Rosie co-founded the BBC’s Machine Learning Special Interest Group, as well as Manchester Futurists, a thriving community group exploring the impact of emerging technology on society and the future.
Scott Emmons
Research Scientist at Google DeepMind
Scott received his PhD at UC Berkeley and was advised by Stuart Russell. He is interested in both the theory and practice of AI alignment. Scott has helped characterize how RLHF can lead to deception when the AI sees more than the human, develop multimodal attacks and benchmarks for ope
Siddharth Srivastava
Assistant Professor of Computer Science at Arizona State University
Siddharth Srivastava is an Assistant Professor of Computer Science at the School of Computing, Informatics, and Decision Systems Engineering at Arizona State University. He received his Ph.D. in Computer Science from the University of Massachusetts Amherst. Previously, Prof. Srivastava was a Staff Scientist at the United Technologies Research Center in Berkeley. Prior to that, he was a postdoctoral researcher in the RUGS group at the University of California Berkeley. His research objective is to develop intelligent robots and software agents that assist humans in their daily lives. Towards this objective, his research focuses on developing formal frameworks, algorithms and implementations that allow autonomous agents to reason and act efficiently under uncertainty. His dissertation work received a Best Paper award at the International Conference on Automated Planning and Scheduling (ICAPS) and an Outstanding Dissertation award from the Department of Computer Science at UMass Amherst.
You can read more about Prof. Srivastava’s work and accomplishments at his website.
Tom Lenaerts
Professor at the Université Libre de Bruxelles
Tom Lenaerts, PhD, is Professor at the Université Libre de Bruxelles (ULB) where he is co-heading the Machine Learning Group (MLG). MLG targets machine learning, AI and behavioural intelligence research focusing on time series analysis, causal and network inference, collective decision-making, social AI and behavioural analysis with applications in finance, medicine, cybersecurity and biology. He is currently the director of the Interuniversity Institute of Bioinformatics in Brussels and also holds a partial affiliation as research professor with the Artificial Intelligence Lab of the Vrije Universiteit Brussel, the Flemish counterpart of the ULB. He is vice-chair of the Benelux Association for Artificial Intelligence. He has worked in a variety of interdisciplinary domains and has co-authored many papers in AI in areas such as evolutionary optimisation, collective intelligence, evolutionary game theory, computational biology and bioinformatics.
Vincent Corruble
Associate Professor at Sorbonne Université
Vincent Corruble received a degree in Engineering (Ecole Centrale de Lille, France), an M.S. in Systems Engineering (University of Virginia, 1992), and a Ph.D. in Artificial Intelligence (UPMC, Paris 6, 1996). After postdocs in South Africa, in the US, and in the UK, he joined the faculty of UPMC (now Sorbonne Université), LIP6 laboratory.
He has contributed to the field of Machine Learning and Pattern Recognition, Machine discovery and Data Science, with applications to history of medicine and medical research. In the early 2000’s, he turned his attention to intelligent agents, especially learning agents and multi-agent systems, with a special interest for multi-agent reinforcement learning in complex systems. This found significant applications in the area of Game AI and later in city modeling and urban simulation. This also led to a focus on virtual characters and agent architectures, with emphasis on affective computing to model emotions and their interplay with personality and social interactions.
Lately Vincent Corruble has developed an interest in AI safety, and more specifically in how agents and humans can develop safe cooperation and trust over time. In this context he visited CHAI during the summer of 2019. He has been a member of the Multi-Agent Systems group at LIP6 (Sorbonne Université, Paris) since 2007.
Wesley Holliday
Professor of Philosophy at UC Berkeley
Wesley Holliday is an Associate Professor of Philosophy and Faculty Member of the Group in Logic and the Methodology of Science at the University of California, Berkeley. He currently serves as the Chair of the Group in Logic and co-organizer of the Berkeley-Stanford Circle in Logic and Philosophy. Prof. Holliday has worked mainly in formal philosophy and logic, especially modal logic, intuitionistic logic, epistemic logic and epistemology, logic and natural language, logic and probability, and logic and social choice theory. His recent research includes voting theory and computational social choice. Read more on his website.
Zhijing Jin
Assistant Professor at the University of Toronto and Vector Institute
CIFAR AI Chair, ELLIS Advisor
Graduate Students
Aly Lidayan
PhD Student, Fall 2020 - Present
Alyssa Li Dayan is a Computer Science PhD student at UC Berkeley. She received her BS in mathematics with computer science from Massachusetts Institute of Technology in 2018. During undergrad she did research in systems neuroscience and computational cognitive science, before spending a couple years in industry working on prediction and simulation for autonomous vehicles. You can visit their personal website here.
Anand Siththaranjan
PhD Student, Fall 2021 - Present
Arnaud Fickinger
PhD Student, Fall 2019 - Present
Arnaud began his PhD at UC Berkeley in 2019 and is supervised by Stuart Russell. He is interested in multiagent systems and is currently working on adding a social choice flavor to assistance games and creating complex behaviors via multiagent interaction. Arnaud received his MS in Artificial Intelligence and Advanced Visual Computing and his BS in Applied Mathematics and Computer Science from the École Polytechnique in Palaiseau, France. He is currently a Visiting Researcher at Facebook AI and has been a Visiting Research Fellow at Harvard University as well as a Software Engineering Intern at Leia Inc. in Menlo Park.
Bhaskar Mishra
PhD Student, Fall 2023 - Present
Bhaskar is a PhD student advised by Prof. Stuart Russell. He is currently interested in reinforcement learning theory with the aim of advancing the theoretical understanding of modern reinforcement learning algorithms which have exhibited strong performance experimentally. Before coming to Berkeley, Bhaskar graduated from the University of Florida with a dual degree in Mathematics and Computer Science. During undergrad, Bhaskar worked on research in empirical game-theoretic analysis and statistical learning theory while advised by Prof. Amy Greenwald from Brown University and Prof. Cyrus Cousins from UMass Amherst. Aside from academic work, Bhaskar enjoys rock climbing, playing ukulele, and reading philosophy.
Cassidy Laidlaw
PhD Student, Fall 2020 - Present
Erik Jenner
PhD Student, Fall 2022 - Present
Erik is a PhD student advised by Stuart Russell. He is interested in developing techniques for aligning AI with human values that could scale to very powerful future AI systems. Most recently, he has worked on better understanding potential-based reward shaping and on distance measures between reward functions. Before joining Berkeley, Erik completed an MSc in Artificial Intelligence at the University of Amsterdam, and an undergraduate degree in physics at the University of Heidelberg. You can learn more about his research at his website (https://ejenner.com).
Hanlin Zhu
PhD Student, Fall 2021 - Present
Hanlin is a Ph.D. student in the Department of Electrical Engineering and Computer Science at UC Berkeley advised by Prof. Jiantao Jiao and Prof. Stuart Russell. Previously, he graduated from Yao Class at Tsinghua University. Hanlin's current research interests focus on machine learning theory, especially reinforcement learning theory, as well as their applications to real world problems. He is also interested in theoretical computer science and mechanism design. You can learn more about Hanlin on his website.
Jakub Grudzien Kuba
PhD Student, Fall 2022 - Present
Jessy Lin
PhD Student, Fall 2020 - Present
Jessy Lin is a PhD student at Berkeley AI Research. She is interested in building machine learning systems that work well in the dynamic, interactive settings of the real world.
Jessy graduated with a double-major in computer science / electrical engineering and philosophy from MIT. Previously Jessy worked on human-inspired AI with the Computational Cognitive Science Group and on human-in-the-loop machine translation at Lilt. At MIT she organized many seasons of HackMIT with the hackathon community. Jessy also spent a summer with the Natural Language Understanding group at Google Research NY, advised by David Weiss. Learn more at her website.
Jiahai Feng
PhD Student, Fall 2023 - Present
Johannes Treutlein
PhD Student, Fall 2022 - Present
Johannes is a first-year PhD student advised by Stuart Russell. He is broadly interested in empirical and theoretical research to ensure that AI systems remain safe and reliable with increasing capabilities. He is currently working on investigating learned optimization in machine learning models and on developing models whose goals generalize robustly out of distribution. Previously, Johannes studied computer science and mathematics at the University of Toronto, the University of Oxford, and the Technical University of Berlin. He is also a former CHAI intern. For more information, visit his website.
Julian Yocum
PhD Student, Fall 2024 - Present
Julian is a PhD student studying AI cognition and deep learning theory through the lenses of neuroscience, physics, and mechanistic & developmental interpretability. He completed his undergraduate and master's degrees in physics and AI at MIT, where he was advised by Dylan Hadfield-Menell. While he does not believe in desks, you can catch him in the BAIR cafeteria or the cafe across the street. Things he does believe in include dancing, singing, and getting lost (regardless of competency).
Mark Bedaywi
PhD Student, Fall 2024 - Present
Micah Carroll
PhD Student, Fall 2020 - Present
Micah Carroll is an Artificial Intelligence PhD student at UC Berkeley advised by Professors Anca Dragan and Stuart Russell. Originally from Italy, Micah graduated with a Bachelor’s in Statistics from Berkeley in 2019. He has worked at Microsoft Research and at the Center for Human-Compatible AI (CHAI). His research interests lie in human-AI systems: in particular measuring the effects of social media (and other recommenders) on users, and improving techniques for human-AI collaboration. You can find him on his website or on Twitter.
Michelle Li
PhD Student, Fall 2024 - Present
Michelle is a PhD student interested in reinforcement learning and multi-agent learning, especially for robust human-AI collaboration and assistance. Prior to Berkeley, Michelle completed a BS in math/CS at MIT and worked for two years at Waymo Research on reward learning and simulation for autonomous driving. Michelle is also a former CHAI intern. Website: https://varmichelle.github.io/
Niklas Lauffer
PhD Student, Fall 2021 - Present
Niklas is a PhD student starting in Fall 2021 advised by Stuart Russell and Sanjit Seshia. He's interested in human-AI cooperation, multiagent learning, and AI safety. In his research he uses tools from machine learning, game theory, and formal methods. Niklas received his BS in computer science and mathematics from the University of Texas at Austin in 2021, where he worked in the Autonomous Systems group under Ufuk Topcu. He also spent time at NASA Ames Research Center in the Planning and Scheduling Group. Learn more at Niklas' website.
Rachel Freedman
PhD Student, Fall 2019 - Present
Rachel Freedman is a PhD student advised by Professor Stuart Russell. Her research focuses on reinforcement learning and reward modeling as paths toward value-aligned AI. She’s particularly interested in integrating methodologies and expertise from computer science, statistics, cognitive science and economics to resolve the inherently interdisciplinary challenge of ensuring that AI is beneficial for humanity. In her undergraduate thesis, Rachel adapted a kidney exchange algorithm to align with societal values by isolating the underlying moral frameworks in human responses to moral dilemmas. The resulting paper was awarded “Outstanding Student Paper Honorable Mention” at AAAI 2018, and will be published in the AI Journal in 2020. Rachel’s past work on this morally vital and complex topic makes her a great fit for CHAI’s long-term goal of benefiting society. Prior to joining CHAI, Rachel studied computer science, neuroscience, and philosophy at Duke University through the Robertson Scholars Leadership Program, and at Oxford University as a Registered Visiting Student. At Oxford she cofounded and ran an interdisciplinary existential risk discussion society that met at the Future of Humanity Institute. More details about her work are available at her website.
Shreyas Kapur
PhD Student, Fall 2022 - Present
Shreyas is a PhD student at Berkeley AI. He is interested in developing algorithms that can learn to construct rich, interpretable models of the world as quickly and robustly as humans do.
Shreyas got his undergraduate and Master's degrees at MIT, where he was part of the Computational Cognitive Science Group advised by Prof. Josh Tenenbaum, working on neurosymbolic techniques for building structured world models.
During his time as an undergrad, he spent summers at DeepMind, Waymo, and worked on research projects under Prof. Pattie Maes at the Fluid Interfaces Group. He also helped organize HackMIT.
Tu (Alina) Trinh
Master's Student
Tu (Alina) Trinh is a master's student advised by Stuart Russell. She is interested in human-AI collaboration, AI robustness and capability, and applications of (inverse) reinforcement learning to real-world scenarios. Previously, she has also conducted research related to topic modeling and autonomous vehicle path-planning. Tu completed her undergraduate studies at UC Berkeley, where she received degrees in computer science and business administration with a minor in data science. During her summers, she interned at Salesforce, Amazon, and PEAK6, building large-scale software services and systems.
Yuxi Liu
PhD Student, Fall 2020 - Present
Yuxi Liu is a PhD student with interests in mathematics, theoretical physics, philosophy, and AI. Before coming to Berkeley, Yuxi graduated from Australian National University with degrees in mathematics and theoretical physics. Learn more at yuxiliu1995.github.io.
Affiliated Graduate Students
Jacy Reese Anthis
PhD Student, Sociology and Statistics, University of Chicago
Jacy Reese Anthis is a computational social scientist researching human-AI interaction and machine learning. His research focuses on “digital minds,” humanlike AI systems that can work side-by-side with humans and appear to have reasoning, emotion, agency, and other mental faculties. His research has been published in top academic venues, such as CHI, CSCW, and NeurIPS, and featured in global media outlets, such as Vox, Forbes, and The Guardian. Anthis has presented his work at conferences and seminars in 28 countries. He is a visiting scholar at the Institute for Human-Centered AI (HAI) at Stanford University, a co-founder of the nonprofit research organization Sentience Institute, and a PhD candidate at the University of Chicago.
Pulkit Verma
Postdoctoral Associate in the Interactive Robotics Group at CSAIL and AeroAstro, MIT
Pulkit is a Postdoctoral Associate in the Interactive Robotics Group at CSAIL and AeroAstro, MIT, working with Prof. Julie Shah. He completed his Ph.D. at the School of Computing and Augmented Intelligence at Arizona State University, advised by Prof. Siddharth Srivastava. During his Ph.D., he was also affiliated with UC Berkeley’s Center for Human-Compatible Artificial Intelligence and ASU’s Center for Human, Artificial Intelligence, and Robot Teaming. He earned his M.Tech. from the Department of Computer Science & Engineering at IIT Guwahati, where he was advised by Prof. Pradip K. Das.
Pulkit's primary interests lie in AI safety, AI planning, action-model learning, interpretability, and the analysis of abstractions. His research focuses on ensuring the safe and reliable behavior of AI agents. Currently, he is investigating the minimal set of requirements an AI system needs to enable users to assess and understand the limits of its safe operability.
In the past, he has worked in the areas of bio-inspired robotics, speech processing, and the Internet of Things.
You can learn more about Pulkit’s work at his website.
Interns
Dillon Sandhu
Intern
Dillon Sandhu is a PhD student at Duke University and an intern with CHAI. He works on reinforcement learning and representation learning. At CHAI, he is working on benchmarking LLM Agents’ abilities to yield control to avoid failure.
Ala’a Tamam
Policy Intern
Ala'a Tamam is a rising senior at Northeastern University in Boston, studying Data Science and International Affairs, with a minor in Computational Social Science. Her work sits at the intersection of technology, policy, and global justice. Ala’a has previously worked in two congressional offices, focusing on the integration of emerging technologies into governance and public service. Her research and professional interests include AI policy, international law, and human rights, especially in contexts shaped by conflict, migration, and inequality.
Alexandra Souly
Intern
Alexandra finished her undergraduate studies in Mathematics at the University of Cambridge in 2020, after which she worked at Microsoft as a Software Engineer for two years. She is currently pursuing a master’s degree in Machine Learning at University College London, with her thesis focusing on multi-agent RL. At CHAI, she will be working with Sam Toyer on producing a novel benchmark for value learning.
Alina Yang
Intern
Alina is a rising junior at MIT pursuing a bachelor’s degree in mathematics and computer science. This summer, she’s working with Bhaskar Mishra to develop improved algorithms for approximate inference on Bayesian networks. Previously, she worked on approximate matrix multiplication algorithms at TUM and interned as a trader at Five Rings.
Andrew Garber
Intern
Andrew completed his BA in statistics at Harvard College in 2024. At CHAI, he’ll be working with Scott Emmons and Rohan Subramani to extend the theory of cooperative inverse reinforcement learning to situations involving asymmetric information, for example when the AI observes more of the environment than the human.
Austin Tripp
Intern
As of 2024, Austin is finishing his PhD at the Cambridge Machine Learning group focusing on machine learning for molecules. At CHAI he is modelling how human preferences can change over time. More information is available on his website: austintripp.ca
Aymane El Gadarri
Intern
Aymane completed his MS and BS in Applied Mathematics and Computer Science at Ecole Polytechnique in France. As a CHAI intern, he is interested in bridging the theory and practice gap in deep reinforcement learning. Broadly, he is interested in using machine learning for better, fairer, and more robust decision-making under uncertainty. Prior to his internship, he conducted research on quantum computing for solving combinatorial optimization problems and on statistical learning for drug discovery. Aymane will start his PhD in Applied Mathematics at MIT in the fall. Visit his LinkedIn profile here.
Juan Liévano
Intern
Juan is a mathematician (Universidad de los Andes, Bogotá) with a master's degree in pure mathematics (IMPA, Rio de Janeiro). He has worked as an industry machine learning engineer at Quantil in Bogotá. As an intern at CHAI he will work with Dr. Benjamin Plaut on reinforcement learning algorithms designed to work in contexts where agent errors have irreversible costs.
Martín Soto
Intern
Martín is a graduate in Maths and in Physics, and is finishing his MSc in Mathematical Logic, where he's pursuing research in Computational Complexity with Albert Atserias. He's spent the last year exploring theoretical and conceptual research avenues in AI Safety, with collaborators from OpenAI and London's Center on Long-term Risk, among others. He's now decided to focus on multi-polar failure modes, tackling them both from theory (Cooperative AI) and governance. That's why, at CHAI, he'll be working with Niklas Lauffer on multi-agent training dynamics. In the past, he's also researched Algorithmic Decision Theory and AI Safety cause prioritization.
Mohamad H. Danesh
Intern
I am Mohamad H. Danesh, currently pursuing a PhD at McGill University in the Centre for Intelligent Machines lab. My PhD research centers on the development of safe learning agents, reflecting my commitment to advancing the field's understanding and application of ethical and risk-aware AI. Prior to embarking on my doctoral journey, I completed my MSc at Oregon State University under the supervision of Prof. Alan Fern. It was during my master's program that I developed a keen interest in RL, a field that has since become my primary focus. As an upcoming Summer 2024 CHAI intern, I am eager to collaborate with Khanh Nguyen on investigating the dynamics of control allocation among agents, particularly in the context of yielding and requesting control.
Pavel Czempin
Intern
Pavel is a PhD student at USC. His research interests are robust agents in diverse task spaces and beneficial AI systems. At CHAI he is working with Rachel Freedman on learning the expertise of human annotators. Find out more about him on his website.
Ram Rachum
Intern
In his previous career, Ram was a software engineer at Google, Dell and other software companies. Ram has been a contributor and activist in the open-source software community since 2009, and in 2020 he was honored as a Fellow of the Python Software Foundation. In 2022 Ram fell in love with AI Safety research and made a career change. Ram is an AI Safety researcher at the GOLD lab at Tufts University, and a research intern at CHAI. Ram believes that reproducing emergent social behavior in AI agents is the most likely path to creating an AI that has a social and moral understanding. We humans interact socially with each other our entire lives, and the mass of positive and negative interactions that we have determines our character and our ethical principles. Ram suggests that in order to create an AGI which will take the well-being of humans into account, we should break down the social skills that we humans take for granted, and design AI agents that are able to discover them and apply them to interactions with other agents and with humans. Read more on Ram's research site.
Zeno Marquis
Intern
Zeno is a rising senior at Tufts studying neuroscience and computer science. He’s interested in the crossover between neuroscience and AI, including understanding the human cognitive impacts of AI systems and modeling properties of the human brain. At CHAI, he’s working under Mark Nitzberg developing an AI policy tracker.
Former Interns
Alex Turner
Former Intern
Alex is a fourth-year PhD student at Oregon State University interested in making significant progress on AI alignment; in particular, better understanding and quantifying when, why, and how people would be affected by deployment of proposed AI systems. Last summer, he introduced an approach that aims implicitly to account for the difficulty of specifying a formal goal for an AI: have the agent complete its task without overly changing its ability to complete other tasks. The ability to complete one task is often tightly correlated with the ability to complete another. The key insight is that we can have the agent remain equally able to complete tasks in general, hopefully thereby remaining able to do the right thing - even without specifying what that right thing is.
Alexis Wan
Former Intern
Alexis is a senior at UC Berkeley and an undergraduate intern at CHAI. She works with graduate student Alyssa Li Dayan on the problem of balancing the trade-off between diversity and accuracy for recommendation systems. At Berkeley she is majoring in Computer Science and Applied Math.
Antoni Lorente Martinez
Former Intern
Toni is a first-year PhD student, advised by Dr. Mark Coté (King’s College London) and Dr. Jordi Vallverdú (Universitat Autònoma de Barcelona). His work focuses on developing an experimental philosophical approach to the mind via AI and cognitive science, in order to explore the nature and limits of philosophical inquiry. In particular, he is currently researching the relationship between empirical findings and theory development. He is also interested in AI safety and the epistemological role of fiction. Before starting his PhD, Toni graduated in Aerospace Engineering (2017), and obtained an MA in Political Philosophy at Universitat Pompeu Fabra (2019), and an MSc in Philosophy of Science at the London School of Economics (2020). You can read one of his last articles here.
Beth Barnes
Former Intern
Beth is a researcher at OpenAI. Previously she worked as Research Assistant to the Chief Scientist at DeepMind. Beth participated in CHAI’s first internship cohort in 2017, where she implemented and proved properties of IRL/IRD algorithms. Beth earned a degree in Computer Science at the University of Cambridge. At Cambridge she cofounded Future of Sentience (FuSe), a student society affiliated with CSER focused on improving the long-run future. FuSe created an All-Party Parliamentary Group advocating for representation of future generations in the UK Parliament. View Beth’s LinkedIn profile here.
Carlo Attubato
Former Intern
Carlo Attubato is a PhD student in Computer Science at Linacre College, University of Oxford. He is supervised by Professor Georg Gottlob and works in conjunction with the RAISON DATA project on the topic of integrating automated symbolic reasoning with statistical machine learning. He receives full funding from the Royal Society. In 2020, Carlo received his Masters in Mathematics and Philosophy at Hertford College, University of Oxford. He currently serves as Senior Consultant to OSG Digital Consultancy and Contributing Editor to the Oxford Public Philosophy journal. At CHAI, Carlo works with Michael Dennis on a project combining language-acquisition, learning, and decision-making into a single end-to-end pipeline for RL agents to specify their goals and beliefs. Their goal is to create RL agents which are more interpretable, aligned, and better at generalising.
Charlie Griffin
Former Intern
Charlie just graduated with a Master of Computer Science and Philosophy from the University of Oxford. At CHAI, he's working on generalising the cooperative inverse reinforcement learning framework to constraint- or rule-based ethical frameworks. His other research interests include the inductive bias of deep learning, active learning for aligning language models, bias in vision models and democratic paradigms for AI alignment. Back in the UK, he started Oxford's AI safety student group and is now Head of Student Research at the AI Safety Hub.
Charlotte Roman
Former Intern
Charlotte is a PhD student at the University of Warwick at the center for Mathematics for Real-world Systems. In her PhD research, she uses game theory to study how people respond to intelligent traffic systems such as route planning apps and traffic lights. She is also interested in cooperative multiagent reinforcement learning. At CHAI, Charlotte is working on finding safe beliefs that encourage cooperation in sequential social dilemmas. Her broad interests include AI safety, fairness, and bounded rationality.
Chris Cundy
Former Intern
Chris is a PhD Student at Stanford University studying Machine Learning. His research is concerned with ensuring that sophisticated AI systems will robustly and reliably carry out the tasks that we want them to. At CHAI, Chris worked with Stuart Russell and Daniel Filan. He has also worked at the Future of Humanity Institute at Oxford University, collaborating with Owain Evans on scalable human supervision of complex AI tasks. Chris studied Physics as an undergraduate at Cambridge University before switching to complete a computer science master’s degree. During his master’s, he worked with Carl E. Rasmussen, developing variational methods for Gaussian Process State-Space Models. His website is cundy.me.
Cynthia Chen
Former Intern
Cynthia is a rising senior at the University of Hong Kong. She is deeply passionate about developing AI systems that are beneficial to society and align with human values. Broadly, she is interested in improving robustness and reliability in decision-making algorithms. Cynthia’s research at CHAI primarily focuses on improving imitation learning agents with representation learning methods. Prior to CHAI, Cynthia conducted research in Automatic Machine Learning (AutoML) and had her paper published at a top conference. She has also spent part of her undergraduate years at Stanford University and Columbia University as a Visiting Student. Currently, Cynthia is on the organizing committees of the SafeAI workshop at AAAI and the AISafety workshop at IJCAI, where she hopes to help grow the AI alignment research field. You can find out more about Cynthia at her website.
Davis Foote
Former Intern
Davis completed his degree at UC Berkeley in 2017, after which he joined Google Brain's Medical Imaging team. In late 2019 he left Google and vowed never to touch a computer again. This proved to be intractable, and he has since pivoted to investigating computational models of human behavior, beliefs, and desires. He hopes to use these models to make "actually enrich people's lives" a more likely side-effect of abdicating humanity's decision-making power to black-box optimization systems. At CHAI, he will be working with Micah Carroll on demonstrating and characterizing manipulation by recommender systems trained via reinforcement learning to optimize long-term engagement metrics, as well as on models of preference shifts in humans.
David Lindner
Former Intern
David is a master's student at ETH Zurich working on reinforcement learning. Broadly, his research is concerned with building reliable intelligent systems that interact with the world. Recently he has been interested in learning robust reward functions from human feedback and ways to reason about the uncertainty an RL agent has about its reward function. At CHAI, David is working with Rohin Shah on inferring human preferences from the current state of the world in complex environments. In the past, he has done work in condensed matter physics, network optimization, computational social science and bias in machine learning. His website is davidlindner.me.
Dmitrii Krasheninnikov
Former Intern
Dmitrii is a second-year master's student at the University of Amsterdam, where he studies artificial intelligence. At CHAI, Dmitrii investigates ways to combine misspecified and conflicting reward functions obtained via different specification processes (e.g. hand-specification and inverse reinforcement learning). This is Dmitrii’s third time working at CHAI. Previously he worked with Rohin Shah on an algorithm that can learn human preferences from the current state of the world, and studied properties of the maximum causal entropy inverse RL algorithm with Andrew Critch.
Before CHAI, Dmitrii worked on the stability of deep learning algorithms as an intern at a small startup in Finland, and designed efficient algorithms for processing next-generation sequencing data at the Bioinformatics Institute in Russia.
You can learn more about Dmitrii at his personal website.
Dylan Cope
Former Intern
Dylan Cope is a third-year PhD student at King’s College London with the Safe and Trusted AI CDT, and a visiting doctoral student at Imperial College London. His doctoral research focuses on explainee-centric explanation in the context of multi-agent communication, with applications to the AI alignment problem. Outside of his PhD he has conducted research on robustness to distributional shift, transfer learning, and emergent communication in reinforcement learning. Previously, he worked as an applied AI data scientist and a software engineer. At CHAI, Dylan will be working with Justin Svegliato on metareasoning in reinforcement learning.
Edmund Mills
Former Intern
Eric Michaud
Former Intern
Eric is a fourth-year undergraduate at UC Berkeley studying mathematics. He is broadly interested in understanding why deep learning works so well, and in investigating the general principles that give rise to intelligence in physical systems.
As a CHAI intern, Eric is working with Adam Gleave to develop interpretability techniques for reward functions. He hopes that this work will help to detect when learned reward functions fail to reflect human preferences, without having to train a policy and without knowing a ground-truth reward.
Before working at CHAI, Eric interned with the Berkeley SETI Research Center and the Lawrence Livermore National Laboratory. You can learn more about him at his personal website.
Erik Jenner
Former Intern
Erik completed his physics undergraduate degree in 2020 and is now a Master's student in artificial intelligence at the University of Amsterdam. He is passionate about Effective Altruism and wants to use his career to help ensure that AI has a positive impact. At CHAI, he will be working with Adam Gleave on improving the interpretability of reward functions. This could help evaluate learned reward functions when no ground truth is available. Before his CHAI internship, Erik did research on graph-based segmentation, causality, and equivariant deep learning.
Ethan Mendes
Former Intern
Ethan recently graduated with a computer science degree from Georgia Tech. His research interests involve understanding the capabilities and dangers of LLMs and utilizing LLMs in the creation of beneficial human-in-the-loop systems. At CHAI, Ethan is working with Sam and Micah on cross-model adversarial example transfer to measure the ethical and moral understanding of LLMs.
Euan Ong
Former Intern
Euan Ong is a final-year undergraduate at the University of Cambridge, studying Computer Science. His ambition is to develop powerful, yet safe and interpretable abstract reasoners, whose internal state and behaviour remain transparent to the end user. To this end, he is particularly interested in exploring how the mathematical tool-kits we use to understand and structure programs ‒ such as formal methods, types and category theory ‒ can inspire new ways to both reverse-engineer existing neural networks, and build scalable neurosymbolic systems. You can read more about his research on his website.
Harry Giles
Former Intern
Harry is a master's graduate in mathematics from Cambridge, UK. He is interested in statistics, and has experience working in finance. In particular, he is interested in reinforcement learning and other topics in decision making under uncertainty and/or constraints.
At CHAI, Harry has been working with Lawrence Chan on models for suboptimal RL policies, in particular Boltzmann rationality, for which he has also been working on the IRL problem. You can contact him through his website.
Henry Papadatos
Former Intern
Henry is currently finishing his master’s degree in robotics and data science at the Swiss Federal Institute of Technology Lausanne (EPFL), where he also co-founded and organized the Lausanne AI alignment team. Also interested in the governance aspect of AI, he is a co-author of the «Navigating AI risks» newsletter. At CHAI, Henry will work on reinforcement learning from human preferences (RLHP). He will identify and demonstrate shortcomings in RLHP that can lead to dangerous reward learning failures, then propose solutions to make RLHP safer and more robust. His work will be supervised by Rachel Freedman.
Jess Riedel
Former Intern
Jess Riedel is a postdoc at the Perimeter Institute for Theoretical Physics specializing in quantum information, decoherence, and the quantum-classical transition. In particular, he has worked on dark-matter detection with matter interferometers and on rigorous definitions for wavefunction branches in a many-body context. As a visiting scholar at CHAI, he has been pursuing an interest in the physical embedding of agents and in the formal verification of self-modifying hardware. Riedel was previously a postdoc at IBM Research with Charlie Bennett. He earned his PhD in Physics from UC Santa Barbara in 2012, advised by Wojciech Zurek at Los Alamos National Lab. His website is jessriedel.com.
Joar Skalse
Former Intern
Joar is a first-year PhD student at Oxford University, funded by the Future of Humanity Institute. He previously completed the BA and MCompPhil in Computer Science and Philosophy at Oxford, graduating top of his year. Joar's research is on machine learning and how to make AI systems safe and reliable, but his broader interests include everything from cognitive science to formal epistemology. In the past he has done research in areas such as philosophical decision theory, constrained reinforcement learning, computational learning theory, inductive logic programming, and active learning. At CHAI, he is working with Adam Gleave to understand the advantages and disadvantages of different reward learning methodologies from a theoretical perspective.
Joe Benton
Former Intern
Joe completed his Master's in mathematics at Trinity College, Cambridge in 2021. His primary research interests are in reinforcement learning and AI robustness. At CHAI, he will focus on encouraging robustness in RL agents by developing methods for agents to request additional information where necessary. Previously, Joe has worked as a quantitative trading intern at Jane Street, and as a research assistant in the algebraic geometry group in the University of Cambridge's maths department. After interning with CHAI, he will begin a PhD in statistical machine learning at the University of Oxford.
Johannes Treutlein
Former Intern
Johannes is an incoming MSc student in computer science at the University of Toronto. Previously, he studied mathematics and computer science at the Technical University of Berlin and the University of Oxford. His research interests include multi-agent learning, game theory, bargaining and decision theory. At CHAI, he is working with Jakob Foerster and Michael Dennis on whether and how cheap talk can help with zero-shot coordination in cooperative multi-agent RL. Before that, Johannes was a research assistant at the machine learning lab at his university. His website is johannestreutlein.com.
Jonathan Colaço Carr
Former Intern
Julian Yocum
Former Intern
Julian is a Master's student in Artificial Intelligence at the Massachusetts Institute of Technology and an organizer on the MIT AI Alignment team. His research is in multi-agent contracting and negotiation with agentic LLMs (such as Auto-GPT) in open world settings like Minecraft. His work is supervised by Justin Svegliato and Dylan Hadfield-Menell.
Jun Shern Chan
Former Intern
Jun works to study and reduce risks from advanced AI systems. He has spent time as a researcher at New York University working on language model safety, and at CHAI he will be working with Dan Hendrycks to investigate power-seeking behaviour in AI agents. Prior to his current research interests, Jun spent some years building algorithms for autonomous vehicle sensor calibration at Motional, and studied Electrical and Electronic Engineering at Imperial College London.
Qingyuan Lu
Former Intern
Qingyuan is a recent graduate from MIT with a BS in Computation and Cognition. At CHAI, Qingyuan has worked with Justin Svegliato on ethically compliant autonomous systems and with Sam Toyer, Scott Emmons, and Olivia Watkins on creating a benchmark for jailbreaking attacks against LLMs. You can find more about Qingyuan at https://qylu4156.github.io/.
Lauro Langosco di Langosco
Former Intern
Lauro graduated with a Master's in applied mathematics from ETH Zurich. He is primarily interested in the problems of AI corrigibility and alignment. Currently he is researching unusual ways in which deep learning systems can fail to generalize (for example by competently pursuing an objective different from the one they were trained on). Prior to joining CHAI, Lauro worked on topics in probabilistic deep learning and the theory of stochastic optimization. He also co-organized Effective Altruism Zurich, helping run regular events, a conference on AI governance, and a reading group on AI alignment.
Leon Lang
Former Intern
Leon completed master's degrees in mathematics and artificial intelligence and is currently pursuing a PhD in machine learning and information theory. He has research experience in equivariant deep learning and multivariate information theory and recently became interested in using his career to reduce existential risk from AI. At CHAI, he will work with Erik Jenner and Scott Emmons on a project incorporating partial observability and bounded computation into cooperative inverse reinforcement learning. His website is https://langleon.github.io/.
Lev McKinney
Former Intern
Lev is an undergraduate student at the University of Toronto majoring in Computer Science with a focus on Artificial Intelligence. His research interests include model-based reinforcement learning and reward learning. At CHAI, Lev worked with Adam Gleave on improving transfer in reward learning and providing new open source baselines to accelerate research in the field. Previously, Lev has presented work at the NeurIPS Deep RL workshop on training dynamics models with techniques from self-supervised learning.
Lukas Berglund
Former Intern
Lukas Berglund is currently finishing his bachelor's degree in mathematics at Vanderbilt University. This summer he is working with Cassidy Laidlaw on creating a benchmark for AI–human assistance games with unknown goals. Prior to interning at CHAI, Lukas interned as a software engineer at Google and as a trader at Jane Street Capital.
Luke Bailey
Former Intern
Luke Bailey is a rising senior at Harvard College pursuing a bachelor's degree in computer science and mathematics and a master's degree in computer science. He is currently conducting research on continual learning at the Harvard Data to Actionable Knowledge lab and is a member of the Harvard AI Safety Team. This summer he will be working with Scott Emmons on developing adversarial attacks on vision-language models.
Mason Nakamura
Former Intern
Mason recently graduated with an applied mathematics and data science degree. He is an incoming computer science PhD student at UMass Amherst. He has research experience in combinatorial geometry, graph theory, and metareasoning. At CHAI, he works under Justin Svegliato on explainability and metareasoning. His website is https://www.masonnakamura.com/.
Matthew Farrugia-Roberts
Former Intern
Matthew is an aspiring academic researcher with an emerging research interest in the theory, capabilities, limitations, and beneficence of intelligent systems. He has completed coursework mainly in mathematics, computer science, and artificial intelligence at the University of Melbourne and ETH Zürich. Matthew is completing a minor thesis on understanding the remarkable generalization ability of deep learning systems using tools from pure mathematics and theoretical computer science. As an intern at CHAI in 2021, Matthew will work with Adam Gleave and Joar Skalse to investigate the theoretical foundations of the learning of reward functions.
Matthew Rahtz
Former Intern
Matthew is a Research Engineer at DeepMind. He earned an MSc in Neural Systems and Computation at ETH Zurich and a bachelor’s in Electronic Engineering at the University of York. Matthew interned at CHAI in 2018. Read his blog at amid.fish.
Max Kaufmann
Former Intern
Max completed his undergraduate computer science degree at the University of Cambridge, UK in 2022. At the moment, he is interested in understanding current iterations of large machine learning models, with the hope of building useful frameworks for the study of powerful AI systems. At CHAI he will be working under Dan Hendrycks, researching how knowledge of morality can be effectively extracted from modern large language models. Before CHAI, Max worked on adversarial robustness within computer vision, both through an industry internship and academic research. He is an active member of the effective altruism community, and a committee member of the Cambridge Existential Risks Initiative.
Michael Chen
Former Intern
Michael graduated from Georgia Tech in 2022 with a bachelor's degree in computer science. At CHAI, he worked with Scott Emmons to train language models to understand ethical scenarios. During his undergraduate studies, he researched the application of deep learning to the design of nanophotonic structures. He's passionate about AI alignment, effective altruism, and AI safety field-building. You can learn more about Michael through his LinkedIn or schedule a chat with him on Calendly.
Michael McDonald
Former Intern
Michael completed his undergraduate studies at UC Berkeley with a double major in computer science and applied mathematics. Broadly, his interests cover value alignment in multi-step/multi-objective task scenarios and how robots can learn to efficiently aid humans in such scenarios.
As an undergraduate student, Michael researched under Professor Pieter Abbeel with Dylan Hadfield-Menell on task and motion planning algorithms - initially for a laundry-robot exhibit for the Victoria and Albert Museum in London meant to examine the feasibility and challenges of integrating robotic agents into everyday life. That work then served as foundation for designing imitation-learning algorithms for accomplishing multi-objective plans with normally intractable cost functions.
This coming autumn Michael will begin his master’s here at Berkeley in the EECS 5th year M.S. program, advised by Professor Anca Dragan.
Michał Mgeładze-Arciuch
Former Intern
I've just finished my undergraduate studies in Computer Science at the University of Cambridge and am planning on staying on for one more year to obtain my Master's degree. Before CHAI, I interned as a Software Engineer at Jane Street, where I optimised the performance of a Random Forest model and created a distributed application for operations monitoring. Beyond AI, I am interested in decentralised systems and computational biology and have taken part in a few hackathons in these areas. Apart from the tech stuff, I love playing volleyball, white water kayaking and dancing salsa.
Nathan Miller
Former Intern
Nathan Miller is an undergraduate studying Computer Science at UC Berkeley. He is a researcher with Berkeley Artificial Intelligence Research and is currently a CHAI intern working with Micah Carroll. Read more about Nathan on his LinkedIn.
Neel Alex
Former Intern
Neel is a former UC Berkeley Computer Science undergrad. He hopes to make AI safety a larger part of AI research as a whole. Given how important AI will be and the risks it poses to humanity's potential futures, Neel hopes to have AI researchers think about safety side-by-side with their own research. At CHAI, he currently works with Rohin Shah on a benchmarking project intended to show flaws in current approaches to reinforcement learning and to propose solutions that will bring us closer to alignable AI.
Neel Nanda
Former Intern
Neel studied mathematics at the University of Cambridge, graduating in 2020. He has spent the past year working with a range of AI Alignment labs. He worked with Michael Cohen at the Future of Humanity Institute on theoretical Bayesian Reinforcement Learning, and with Jonathan Uesato at DeepMind on learning from noisy and biased labels. He hopes to better understand how neural networks work on the inside, and to use this understanding to help create networks that robustly do what we want them to. At CHAI, he works with Daniel Filan on neural network interpretability and clusterability.
Nitish Dashora
Former Intern
Nitish Dashora (Berkeley EECS Undergraduate '24) was born and raised in Columbus, OH. He is extremely interested in machine learning algorithms for general intelligence, namely those concerned with robotics. He has been researching AI algorithms for high-speed, off-road self-driving at Berkeley AI Research (BAIR). He has also been working with the Redwood Center for Theoretical Neuroscience to explore semantically meaningful learning representations in computer vision. He worked at NASA Jet Propulsion Laboratory (JPL) on autonomous racing and published as a first author at the International Conference on Robotics and Automation (ICRA). He also worked at Amazon Web Services (AWS) as a Software Engineer Intern to develop his enterprise coding skills. Nitish aims to pursue a PhD in robot learning and hopes to connect neuroscience, machine learning, and decision-making through the medium of robotics to achieve true artificial general intelligence (AGI).
Oliver Daniels-Koch
Former Intern
Oliver Richardson
Former Intern
Pavel Czempin
Former Intern
Pavel is a Master's student at the Technical University of Munich, focusing on Artificial Intelligence. His research interests include Deep Reinforcement Learning and making AI safe and robust. At CHAI he works with Adam Gleave to find defenses for adversarial policies. He previously worked as a student research assistant at Fraunhofer AISEC on cognitive security technologies. For more information, please visit his website.
Pedro Freire
Former Intern
Pedro is an M.Sc. student in Mathematics and Computer Science at Ecole Polytechnique, France. Some of his AI interests include (i) how to make agents better at processing higher-level feedback from humans, and (ii) the development of tools for RL interpretability - in particular to better understand how we can reduce variance and increase sample-efficiency in learning, and have agents with better guarantees of generalization. His other interests include interactive theorem proving, software tooling, game development, virtual reality, and human-computer interaction more broadly.
Previously, he worked at Google developing tools for monitoring and performing statistical analysis on the behavior of YouTube video classifiers, and worked at Facebook on the development of Augmented Reality software. He is currently working on benchmarks specialized in evaluating properties of Inverse Reinforcement Learning algorithms.
Rafael Albert
Former Intern
Rafael completed his undergraduate education in March 2021, graduating with two bachelor's degrees in computer science and mathematics from RWTH Aachen University, a leading German institution. Prior to working on AI, he conducted research in mathematical logic that resulted in a journal paper and his university's Best Thesis Award. At CHAI, Rafael works with Michael Dennis on reinforcement learning agents that are robust to external perturbations. After his internship with CHAI, he will intern as a quantitative trader at Jane Street before commencing graduate study in machine learning in the fall of 2021. In addition to research, Rafael is passionate about Effective Altruism: he has served as Director of his local EA chapter and organized an official university course on the topic at his undergraduate institution with more than 200 participants.
Sana Pandey
Intern
Sana (UC Berkeley Data Science and Cognitive Science Undergraduate ’24) is working with Jonathan Stray on the ethical implementation and regulation of recommender systems. Her research interests are driven by the intersection of human experiences and artificial intelligence, encompassing work on neural network-driven recommendation systems at Apple, NLP and sentiment prediction with Woebot Health, and science and technology policy trend analysis with the University of Pennsylvania’s Lauder Institute. She also holds a strong interest in the international proliferation of AI, and is a graduate of the Stanford China Scholars Program along with the Department of State’s National Security Language for Youth Initiative. In her free time, she enjoys fencing, diving into science fiction novels, and attempting to learn salsa.
Sandy Tanwisuth
Intern
Sandy is an incoming 2024 CHAI intern working on Multi-agent Learning and Cooperative AI. Sandy is broadly interested in Cooperative AI, addressing cooperation from an individual agent's incentives and focusing on mechanism design for systems that incentivize multi-agent cooperation. Sandy draws research inspiration from their interdisciplinary research training, including but not limited to cognitive science, computational neuroscience, behavioral economics, game theory, and social sciences. Sandy was fortunate enough to be trained by many great scientists including Samuel M. McClure, John P. O'Doherty, Andrew Kayser, and Steven T. Piantadosi and the amazing people in their respective labs. While doing research in Multi-agent Learning independently, Sandy was also fortunate to collaborate with several fantastic people at Noisebridge and wonderful Multi-agent Learning researchers at UMD MARL Reading Group, The Multi-Agent Learning Seminar, and Cooperative AI Foundation. At CHAI, Sandy will be working with Niklas Lauffer on Representation Learning for Effective Coordination. Learn more on Sandy's personal website.
Apart from research, Sandy also believes in being the change they want to see in the world and doing the best that they can to uplift lives. That is why Sandy is currently doing the best they can to support work in the technical AI Safety space generally.
Sergei Volodin
Former Intern
Sergei is a Master's student at the Swiss Federal Institute of Technology in Lausanne (EPFL) studying Computer Science and Neuroscience. Sergei did an internship at Google which resulted in a paper on causal models in reinforcement learning. Sergei co-founded the local AI safety group in Lausanne, which conducts research and holds discussions. At CHAI, Sergei is working on adversarial policies with Adam Gleave.
Shivam Singhal
Former Intern
Shivam is an undergraduate student at UC Berkeley, majoring in EECS. He aspires to better understand the interactions between humans and AI, and he is particularly interested in the application of data-driven technology to socially impactful causes. At CHAI, he works with Smitha Milli on research related to disentangling value from temptation in recommender systems. Apart from his academic ventures, he is co-president of Tech+Social Impact, a club at Berkeley that leads important conversations about ethics in the tech space.
Shlomi Hod
Former Intern
Shlomi is a data scientist and educator. He works on developing knowledge, practices, and tools to audit and mitigate the biases of real-world AI systems. He is the developer of Responsibly, an open-source toolkit that brings techniques from AI bias and fairness research to practitioners. At CHAI, Shlomi works on developing a new method for increasing the interpretability of neural networks under the supervision of Daniel Filan. Besides that, Shlomi teaches various computing classes on Python programming and machine learning. His focus is on teaching a methodical and thoughtful process of programming problem-solving. He is one of the co-founders of the Israeli Cyber Education Center, where he led the design of national computing programs for kids and teenagers. For example, he initiated and managed the development of the cybersecurity Bagrut (the Israeli matriculation exam, the equivalent of A-level), and co-authored a tutorial-style Computer Network textbook in Hebrew. Before that, Shlomi worked as an algorithmic researcher and a research team leader in cybersecurity. He holds a B.Sc. in pure Mathematics from Bar-Ilan University. Currently, he is studying for an M.Sc. in Cognitive Systems at the University of Potsdam in Germany. In January 2020, Shlomi will start a Ph.D. in Computer Science at Boston University.
Sören Mindermann
Former Intern
Sören is a DPhil student in CS in the Applied and Theoretical Machine Learning Group at Oxford University (OATML). His interests in machine learning include its economic and political properties, how it scales, and safety. Before joining OATML, Sören worked on reward inference and machine learning for game theory with David Duvenaud and Roger Grosse at Toronto’s Vector Institute, at the Future of Humanity Institute at Oxford, and with CHAI. He has degrees in machine learning (UCL), maths (Amsterdam) and Future Planet Studies (Amsterdam). His website is oatml.cs.ox.ac.uk/members/soren_mindermann/.
Stephen Casper
Former Intern
Stephen Casper is a rising senior at Harvard College pursuing a degree in statistics and a minor in mathematics. In addition to interning at CHAI under Daniel Filan, he is affiliated with the MIT/Harvard Center for Brains, Minds, and Machines and the Harvard Kreiman Lab. Broadly, his research interests align with the goals of making AI systems safe and aligned. More specifically, areas that he is interested in include understanding compressible features and generalization in deep networks, robust reinforcement learning, and decision theory.
Stephen’s work with CHAI is focusing on mechanistic transparency via modular interpretations of neural networks. However, he is also thinking about adversaries in contexts of deep reinforcement learning and decision theory.
In addition to being fascinated by questions in machine intelligence, Stephen’s primary motivations for working on AI alignment involve using it as a strategy for shaping the future impactfully for the better. He is excited about the Effective Altruism community and the paradigm of maximizing the good that one can do in the world.
For more information and contact info, please see his website.
Thomas Woodside
Former Intern
Thomas is a rising junior undergraduate at Yale University, studying computer science. Previously, Thomas has completed internships in machine learning research and engineering at Kiva, NASA, and a security startup. At CHAI, he will be working with Scott Emmons, researching safe exploration and natural language constraints for reinforcement learning. Thomas is interested in the question of how to make AI more beneficial for humanity, and hopes that his internship will help him develop skills and knowledge to help answer that question. At Yale, Thomas has been involved with running AI Safety reading groups to raise awareness of the issue, and also helps lead Yale Effective Altruism.
Yawen Duan
Former Intern
Yawen Duan received a B.S. in Decision Analytics at the University of Hong Kong in 2020. His research interests include out-of-distribution robustness and generalizability for decision-making algorithms. At CHAI, Yawen works with Adam Gleave on research on the benefits of reward function transfer: to what extent do reward learning methods actually transfer, and how can we exploit transfer to improve the reliability of RL systems? Prior to CHAI, Yawen spent a year as a full-time research intern at Huawei Noah’s Ark Lab, where he worked on Neural Architecture Search (NAS) and published two academic papers at the Conference on Computer Vision and Pattern Recognition (CVPR) and European Conference on Computer Vision (ECCV). In addition to research, he is deeply passionate about doing the most good. He was a Tianxia Fellow and the director of Effective Altruism Hong Kong.
Yijin Hua
Former Intern
Yijin completed her undergraduate degree in computer science at UC Berkeley in 2018, after which she worked for three years as a software engineer on search ranking systems. Now, she is back in school and pursuing a master's degree at UCLA. She is interested in combinatorics, causal inference, and AI for science. At CHAI, she is working on developing ways to measure the confidence of LLMs and mitigate cases of confident hallucinations in LLMs, mentored by Justin Svegliato and Sam Toyer.
Yulong Lin
Former Intern
Yulong studies Computer Science at the University of Cambridge, where he has obtained a starred first. For his dissertation, he worked with Professor Michael Bronstein and Professor Pietro Liò to benchmark dynamic graph neural networks and develop new models. He has previously interned at Amazon Web Services (AWS) and the National University of Singapore (NUS). He is interested in ensuring that AI is developed safely. At CHAI, he works with Scott Emmons to develop robust models in response to unrestricted adversarial examples. Read more about Yulong on his LinkedIn.