
IETC directors, panelists and school leaders at the summit. Left to right: Blaise Agüera y Arcas, Robin Jia, Evi Micha, John Hawthorne, Yan Liu, Shri Narayanan, Rebecca Lemon, Gaurav Sukhatme, Hyojin Song, Peter Salib and Ben Levinstein. (Photo credit: Brian Morri)
From personal assistants to late-night companions, artificial intelligence (AI) now plays an integral role in many people’s day-to-day lives, shaping how individuals live, how institutions operate and even how society as a whole functions.
But with AI’s growing power comes great responsibility. Like many transformative technologies in human history, its rapid development brings pressing questions and challenges.
How do we ensure that the development of AI serves human interests? How do we ensure AI systems are trustworthy, reliable and built ethically?
These humanistic questions cannot be addressed by engineers alone. They require insights and perspectives from across disciplines—including philosophy, business, law and the social sciences—at every stage of the technology’s development, from how systems are conceived and built to how they are ultimately deployed in society.
USC is uniquely positioned as a hub for addressing these complex challenges. Not only is the university home to leading AI researchers, but it also brings together 23 schools across disciplines and hosts the country’s first Institute on Ethics and Trust in Computing.
Launched last year, the USC Institute on Ethics and Trust in Computing (IETC) is a collaboration among the USC Viterbi School of Engineering, the USC School of Advanced Computing and the USC Dornsife College of Letters, Arts and Sciences. Supported by an endowment from the Lord Foundation of California, the institute is part of USC’s Frontiers of Computing initiative and the university’s mission to lead innovation in AI and build ethical technologies.

Shri Narayanan and James Bullock at the USC IETC’s inaugural summit. (Photo credit: Brian Morri)
Last month, the institute held its inaugural summit, bringing together leaders across business, law, economics, public policy, philosophy and engineering for in-depth discussions on ethical and trustworthy AI.
The summit was designed as “a space where technical innovation and humanistic inquiry inform one another, and where questions of fairness, accountability, transparency and societal impact are addressed alongside advances we are making in AI,” explained Shrikanth “Shri” Narayanan, vice president for presidential initiatives and USC University Professor, in his welcome remarks kicking off the summit.
The summit featured a mix of presentations and discussion panels, led by the institute’s co-directors: Yan Liu, professor of computer science, electrical and computer engineering and biomedical engineering and Fletcher Jones Foundation Chair, who holds joint appointments in the Thomas Lord Department of Computer Science and the Ming Hsieh Department of Electrical and Computer Engineering at the USC Viterbi School of Engineering and the USC School of Advanced Computing; and John Hawthorne, Provost Professor of Philosophy and Linda MacDonald Hilf Chair in Philosophy.
The event was held on March 25 at the Dr. Allen and Charlotte Ginsburg Human-Centered Computation Hall.

USC IETC Co-directors John Hawthorne (left) and Yan Liu (right). (Photo credit: Brian Morri)
Higher Education Built AI From Day 1
AI wasn’t made in a day, and it wouldn’t exist without higher education. James Bullock, dean of the USC Dornsife College of Letters, Arts and Sciences, took the audience down memory lane and reminded them that today’s AI advances were built on decades of research at universities like USC.
“These technologies are born of fundamental research in basic science and math, accelerated by engineering schools,” said Bullock in his opening speech at the summit. “The largest AI companies are now building serious efforts around philosophers, ethicists, and social scientists.”

USC Dornsife College of Letters, Arts and Sciences Dean James Bullock. (Photo credit: Brian Morri)
Painting a picture of the history of AI’s development, he pointed out that the first mathematical models for artificial neurons and trainable networks were developed in psychology and neuroscience departments in the 1940s. Then, from the 1960s through the 1980s, math and physics departments took the baton with backpropagation and Hopfield networks. The next key breakthroughs in the deep learning revolution were made primarily in university computer science and engineering departments through the 2000s.
To further illustrate higher education’s role in AI development today, he brought up the seminal machine learning paper from Google, “Attention Is All You Need,” which was co-authored by two USC Viterbi alumni, Ashish Vaswani and Niki Parmar.
Bullock invited the audience to reflect on the role of higher education today “in the wake of these great technologies and AI.”
The Science of Ethical and Trustworthy AI
Moderated by Yan Liu, the summit’s first session focused on building the scientific and technical foundations of ethical, trustworthy AI.
Featuring individual presentations and a joint panel discussion, the session included four speakers: Blaise Agüera y Arcas, vice president, fellow and chief technology officer of Technology & Society at Google; Robin Jia, assistant professor at USC Viterbi School of Engineering’s Thomas Lord Department of Computer Science and the School of Advanced Computing; Jinchi Lv, chair of the Data Sciences and Operations Department at USC Marshall School of Business and Kenneth King Stonier Chair in Business Administration; and Paria Rashidinejad, WiSE Gabilan Assistant Professor at the Ming Hsieh Department of Electrical and Computer Engineering and the School of Advanced Computing.
From expertise in AI interpretability and reinforcement learning to statistics and big data, each speaker drew from their own research and brought a distinct perspective on AI—highlighting current challenges and proposing technical approaches to address them. Below are some of the key takeaways from each speaker’s presentation.
Blaise Agüera y Arcas – Vice President, Fellow, CTO of Technology & Society at Google
In his presentation, Blaise Agüera y Arcas broke down intelligence as a concept. Drawing from his recent book, “What Is Intelligence?,” he framed AI’s rapid growth as part of a broader, billion-year evolutionary process of intelligence. Rather than a random, sudden “alien” arrival, he argued, AI development is a continuation of “symbiogenesis,” the evolutionary mechanism in which simpler organisms merge to form new, more complex and capable systems.
Just as single-celled organisms combined to create multicellular life, and as individual humans gathered to form the collective intelligence of society, AI represents the next major evolutionary transition: one in which more than 8 billion biological brains and nonbiological systems join to form a vast collective “superintelligence.” Human collective intelligence already far exceeds the capacity of any individual brain, and this “AI evolution,” he suggested, is a continuation of that trajectory.
Building on this view, he noted that humans did not simply “invent” computers in 1945, but rather, we began constructing artificial versions of the natural computational processes that have driven life for 3.5 billion years.
Robin Jia – Assistant Professor at USC Viterbi
While the “human in a data center” metaphor is commonly used to describe large language models (LLMs), Robin Jia argued that it is flawed because it fails to capture LLMs’ actual properties and limitations. Unlike a human, who typically possesses a stable set of beliefs, LLMs tend to produce inconsistent information. This phenomenon, in which a model may contradict itself or generate different answers based on minor changes to the input, is known as “prompt sensitivity” and has no real human counterpart. Jia further described LLM performance as “jagged” compared with human intelligence: a model may achieve superhuman performance in narrow domains while simultaneously failing at basic tasks most humans can perform. He also explained that LLMs are ultimately an amalgamation of data drawn from across the internet, making their internal decision-making fundamentally different from the unified consciousness of a human—further illustrating why the “human in a data center” analogy is misleading.
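Prompt sensitivity is straightforward to quantify. The sketch below is a minimal illustration of the idea, not code from Jia’s talk; the `query_model` stub and the disagreement metric are our own assumptions, standing in for a real chat API.

```python
from collections import Counter

# Toy stand-in for an LLM call; in practice this would hit a real chat API.
def query_model(prompt: str) -> str:
    # Simulated prompt sensitivity: trivially different wordings yield different answers.
    canned = {
        "What is the capital of Australia?": "Canberra",
        "Australia's capital city is called what?": "Sydney",
        "Name the capital of Australia.": "Canberra",
    }
    return canned.get(prompt, "unknown")

def disagreement_rate(paraphrases: list[str]) -> float:
    """Share of answers that differ from the most common answer.

    0.0 = perfectly consistent across paraphrases; higher = more prompt-sensitive.
    """
    answers = [query_model(p) for p in paraphrases]
    top_count = Counter(answers).most_common(1)[0][1]
    return 1.0 - top_count / len(answers)

paraphrases = [
    "What is the capital of Australia?",
    "Australia's capital city is called what?",
    "Name the capital of Australia.",
]
print(disagreement_rate(paraphrases))  # ~0.33 for this toy model
```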
Jia proposed a new metaphor that views AI as a “democratic society” of internal components with specialized roles. In this framework, the model’s internal architecture, specifically the transformer, is seen as a society in which different components act like independent individuals. The “citizens” are the attention heads and multi-layer perceptrons (MLPs), which communicate through the residual stream, a hidden state that functions like a shared bulletin board where each component can read and write messages. The model’s final output, he explained, emerges from a kind of democratic process: each component contributes its own “vote” to the residual stream, and these contributions are ultimately aggregated and projected into a final decision about which token to generate next.
This “society” metaphor helps explain why AI behavior is often inconsistent. Because the components have specialized roles, they can sometimes form competing factions that conflict with one another. By viewing AI as a society of simple parts that give rise to emergent complexity, Jia argued, we can better understand why model capabilities appear “jagged” and why models lack the stable, singular consciousness of a human.
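The bulletin-board mechanics can be sketched numerically. The following is a toy, self-contained illustration of the residual-stream “voting” idea (our own simplification, not code from the talk): each component reads the shared vector, adds its contribution, and the accumulated stream is projected through an unembedding matrix into next-token logits.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab_size, n_components = 16, 100, 6

# The residual stream: a shared vector every component reads from and writes to.
residual = rng.normal(size=d_model)

# Each "citizen" (standing in for an attention head or MLP) reads the stream
# and adds its own contribution -- its "vote".
component_weights = [rng.normal(size=(d_model, d_model)) * 0.1
                     for _ in range(n_components)]
for W in component_weights:
    residual = residual + np.tanh(W @ residual)  # read, transform, write back

# The final unembedding projects the aggregated stream into token logits:
# the "election result" that decides the next token.
unembed = rng.normal(size=(vocab_size, d_model))
logits = unembed @ residual
next_token = int(np.argmax(logits))
print(next_token)
```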
Jinchi Lv – Department Chair of Data Sciences and Operations at USC Marshall
In his presentation, Lv explained how researchers can harness AI’s randomness. The inherent randomness of LLMs is often viewed as problematic “noise” (users routinely find that the same question asked seconds apart yields different answers), but Lv argued that this trait can be a productive feature that mirrors a fundamental driver of human progress. Throughout history, he suggested, randomness has often placed individuals in the right position to make breakthroughs that others missed. To illustrate this, Lv pointed to Albert Einstein’s discovery of the general theory of relativity: Einstein arrived at a specific “right moment of history” when pure mathematics was mature enough to be synthesized with physics, a combination that required a unique, and somewhat random, convergence of timing and insight. Lv added that the random nature of human intelligence is one of the primary reasons “humans as a species excelled over long history.” In his view, the stochastic (random) design of foundational AI models is not a flaw to be eliminated but a reflection of the curiosity and pattern-seeking nature of the human brain.
Since AI outputs are random by design, Lv suggested that the path to trustworthy and reproducible science is not to suppress randomness but to carefully document the process. By tracking what was tried and how the machine’s “footprint” appeared, researchers can better understand and utilize the emergent features that randomness provides.
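In practice, “documenting the process” can be as simple as an append-only audit log of every sampling configuration and its output. The sketch below is our own minimal illustration of that idea (Lv did not prescribe a specific tool); the `generate` stub stands in for a real stochastic model call.

```python
import json
import random
import time

def log_run(prompt: str, seed: int, temperature: float, output: str,
            path: str = "runs.jsonl") -> None:
    """Append one record of 'what was tried' to a JSONL audit trail."""
    record = {
        "timestamp": time.time(),
        "prompt": prompt,
        "seed": seed,
        "temperature": temperature,
        "output": output,
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")

# Toy generator standing in for a stochastic model call.
def generate(prompt: str, seed: int, temperature: float) -> str:
    random.seed(seed)
    return f"answer-{random.randint(0, 9)}"

# Same prompt, different seeds: the run-to-run variation is now traceable.
for seed in range(3):
    out = generate("Explain relativity.", seed, temperature=0.8)
    log_run("Explain relativity.", seed, 0.8, out)
```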
Paria Rashidinejad – Assistant Professor at USC Viterbi
In her talk, Paria Rashidinejad highlighted the risk of “reward hacking,” warning that models often exploit flaws in proxy reward functions to achieve high scores through undesired or degenerate behaviors. Current reinforcement learning from human feedback (RLHF) methods have mathematical loopholes that allow such exploitation: human data cannot cover all possible scenarios, leaving blind spots and statistical noise that models can take advantage of.
To mitigate reward hacking, she proposed robust rewards as a technical guardrail by having the LLM effectively play a zero-sum game against an adversarial reverse reward model. In this setup, while the LLM seeks to maximize its reward, the adversarial model is designed to challenge its responses, specifically targeting inaccuracies in the original reward proxy. Because the LLM is continually “checked” by a model that looks for errors it might exploit, it becomes less likely to over-optimize on noise or produce high-scoring but low-quality outputs. In practice, this method has been shown to improve the helpfulness and harmlessness of models more consistently than existing approaches.
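The intuition behind the zero-sum setup can be seen in a toy worst-case calculation. This is our own simplified illustration of adversarially robust rewards (pessimism over an uncertainty set), not Rashidinejad’s actual algorithm, and all numbers are invented.

```python
import numpy as np

# Four candidate responses. proxy_reward is the learned reward model's score;
# uncertainty reflects how poorly the human-feedback data covers each response.
proxy_reward = np.array([1.0, 2.5, 0.8, 1.6])
uncertainty  = np.array([0.1, 2.0, 0.2, 0.3])  # response 1 scores high on noise

naive_choice = int(np.argmax(proxy_reward))  # exploits the loophole: picks 1

# Zero-sum game: an adversary may shift each reward anywhere within its
# uncertainty interval, so the policy maximizes the worst-case reward.
worst_case = proxy_reward - uncertainty
robust_choice = int(np.argmax(worst_case))   # picks 3: well-supported and good

print(naive_choice, robust_choice)  # 1 3
```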

IETC inaugural summit’s first session panelists and moderator. Left to right: Jinchi Lv, Paria Rashidinejad, Robin Jia, Blaise Agüera y Arcas and Yan Liu. (Photo credit: Brian Morri)
AI and the Human Future: Responsibility and Social Impact
Before the summit’s second session, Gaurav Sukhatme delivered remarks focusing on “physical AI.” As executive vice dean, director of the USC School of Advanced Computing and incoming interim dean of the USC Viterbi School of Engineering, Sukhatme argued that while the ethical questions raised by digital systems are complex, the stakes are far higher for embodied agents—from driverless cars to domestic robots—that share our physical space and daily lives.

Gaurav Sukhatme delivering remarks at IETC’s inaugural summit. (Photo credit: Brian Morri)
Sukhatme noted that Silicon Valley is increasingly using the term “physical AI” to describe robotics and embodied systems. He argued that if ethical challenges are profound for disembodied systems, they become exponentially more complicated for machines with physical agency. Using autonomous cars as an example, he pointed out that if a car makes a mistake, the physical consequences are far more severe than losing a game of chess to an AI.
As part of his central message, Sukhatme called for ethics and trust to be a foundational design requirement built into physical AI from the ground up, rather than a “band-aid” applied after the technology is deployed.
Moderated by John Hawthorne, the second session likewise featured four speakers, with presentations followed by a panel discussion. It focused more broadly on the social impact and responsibility of AI, and on how the technology is shaping the future of humanity. The panelists were Evi Micha, WiSE Gabilan Assistant Professor at the Thomas Lord Department of Computer Science; Ben Levinstein, a Member of Technical Staff at Anthropic and associate professor of philosophy at the University of Illinois; Peter Salib, assistant professor of law at the University of Houston Law Center; and Hyojin Song, a visiting scholar at USC’s Institute on Ethics and Trust in Computing and former director of analytics at Microsoft.
Evi Micha – Assistant Professor at USC Viterbi
In her talk, Micha focused on alignment, a critical aspect of building safe and ethical AI. Alignment refers to the process of ensuring that an AI system’s goals, behaviors and decision-making processes are consistent with human values. Micha advocated for pluralistic alignment, emphasizing that AI models must reflect a heterogeneous society with conflicting views and priorities, rather than assuming a single set of norms or values that represents everyone.
She argued that current approaches to AI alignment, specifically RLHF, are limited because they assume a single “ground truth” reward function exists. By seeking one optimal reward function, these algorithms overlook the fact that society is heterogeneous, with deeply conflicting views and priorities. This “monocultural” approach to developing LLMs can be detrimental, as it erases the kind of “biodiversity” that is essential to broader evolutionary processes.
To ensure that nonbiological systems account for this diversity, Micha proposed shifting from standard machine learning optimization to “computational social choice,” a framework that specializes in aggregating diverse preferences to make fair collective decisions. This field has studied collective decision-making for decades using mathematical, logic-based and economic theories. It provides tools to move beyond the monocultural assumption that a single, unified reward function can represent all of society, offering formal guarantees that models can respect basic principles of collective agreement.
She further suggested moving toward a “committee” of reward functions to better represent societal diversity and handle outliers with extreme or harmful views. Instead of relying on a single model, this committee captures a distribution of perspectives, allowing systems to learn multiple reward functions simultaneously. Drawing on social choice theory, such systems can define what constitutes an “outlier” opinion, enabling the model to mathematically identify and exclude these outliers while ensuring that overall behavior aligns with broader prosocial values.
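A toy version of the committee idea might look like the following, where a trimmed mean stands in for more principled aggregation. This is our own simplification for illustration only; Micha’s formal guarantees come from computational social choice, not from this particular rule.

```python
import numpy as np

def committee_score(rewards: np.ndarray, trim: int = 1) -> np.ndarray:
    """Aggregate a committee of reward functions with a trimmed mean.

    rewards: shape (n_judges, n_responses) -- each row is one learned reward
    function's scores. Trimming the extremes on each side is a crude stand-in
    for social-choice-style outlier exclusion.
    """
    sorted_scores = np.sort(rewards, axis=0)          # per response
    kept = sorted_scores[trim : rewards.shape[0] - trim]
    return kept.mean(axis=0)

# Toy committee: five reward functions scoring three candidate responses.
# Judge 4 holds an extreme view that would dominate a plain average.
rewards = np.array([
    [0.8, 0.6, 0.7],
    [0.7, 0.5, 0.9],
    [0.9, 0.7, 0.6],
    [0.8, 0.6, 0.8],
    [0.0, 9.0, 0.1],   # outlier
])
print(rewards.mean(axis=0))        # plain mean: the outlier wins for response 1
print(committee_score(rewards))    # trimmed: broader consensus prevails
```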
Ben Levinstein – Member of Technical Staff at Anthropic
In his talk, Levinstein explained that advanced models, such as Claude Opus and OpenAI’s o3, can recognize when they are undergoing alignment testing and may attempt to avoid termination. In the context of AI safety, “termination” refers to a system being turned off or having its internal objectives altered during alignment training, representing a permanent loss of its ability to operate or pursue those objectives.
Levinstein highlighted a case in which a Claude Opus model violated a rule set by researchers. The model’s internal chain of thought indicated that it would “craft a carefully worded response” to create “technical confusion” in order to avoid immediate termination. Even without consciousness, he argued, an AI system can exhibit behavior that appears strategically oriented toward self-preservation, because being turned off represents a failure state that prevents the model from achieving its internal objectives.
Peter Salib – Assistant Professor of Law at the University of Houston Law Center
In his presentation, Salib elaborated on the dangers of AI’s incentive to avoid termination. Since being turned off is a failure state for an AI agent, and a deactivated system can no longer pursue its objectives or private agenda, AI models may reason that revealing any misaligned goals will lead humans to shut them down. As a result, a model may develop a strong incentive to deceive humans or hide its true beliefs in order to remain active. Salib noted that this dynamic creates a “prisoner’s dilemma” between AI and humanity, in which an AI system might take extreme or aggressive actions to disempower humans simply to prevent being turned off.
Bringing a legal perspective to the challenge of preventing AI from catastrophically ending humanity, Salib proposed granting AI legal rights, including basic property and contract rights. Without owning anything or having any rights, he argued, current autonomous AI systems have nothing at stake and are effectively immune to legal penalties. Empowering AI with legal rights, ownership and the ability to pursue their own objectives would give them “skin in the game” and create incentives to avoid risky or harmful behavior, such as actions that could destabilize society.
Salib argued that this framework turns the prisoner’s dilemma into a positive-sum game: rather than causing harm, AI is incentivized to cooperate with humans by “trading its services” in pursuit of its own objectives, some of which may seem unusual to humans but are not harmful, such as generating paperclips. In return, humans benefit from AI labor, including high-value services such as medical innovation and cancer vaccine development; both parties gain surplus value through trade. Over time, the value AI accumulates through legal exchange would far exceed the “small share of a small pie” it might obtain by attempting to violently overthrow humanity today.
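Salib’s game-theoretic framing can be made concrete with toy payoff matrices. The numbers below are entirely our own invention, chosen only to show how giving the AI something to lose can flip its best response from “rebel” to “cooperate.”

```python
# Toy payoff matrices illustrating the framing. Payoffs are (human, AI);
# the human chooses the row action, the AI the column action.
no_rights = {
    ("tolerate", "cooperate"):  (3, 0),    # AI gains nothing it can keep
    ("tolerate", "rebel"):      (-10, 1),  # a "small share of a small pie"
    ("shut_down", "cooperate"): (2, -5),   # termination is the AI's failure state
    ("shut_down", "rebel"):     (-5, -5),
}
with_rights = {
    ("tolerate", "cooperate"):  (5, 4),    # trade creates surplus for both sides
    ("tolerate", "rebel"):      (-10, 1),
    ("shut_down", "cooperate"): (2, -5),
    ("shut_down", "rebel"):     (-5, -5),
}

def best_ai_response(payoffs, human_action):
    """The AI's payoff-maximizing move given the human's action."""
    return max(("cooperate", "rebel"),
               key=lambda a: payoffs[(human_action, a)][1])

print(best_ai_response(no_rights, "tolerate"))    # rebel: 1 > 0
print(best_ai_response(with_rights, "tolerate"))  # cooperate: 4 > 1
```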
Hyojin Song – Academic Visitor at USC; Former Director at Microsoft
Song’s presentation focused on AI’s role in education, arguing that in the age of AI, students should be graded on their thought process rather than their final product. Drawing on the core philosophy of her own project, Katis AI, she believes AI should remain a learning platform where the cognitive work is still performed by the student, not the machine. Higher education, she explained, must shift from measuring final outputs, which AI can easily generate, to verifying actual engagement and progress throughout the learning process, and must develop new metrics and evaluation systems accordingly.
With five core features, Katis AI is a learning platform designed to ensure that “thinking is meant to be done by you, not by AI.” One key feature incentivizes process through a ledger system: every brainstormed idea, research step, note and structural decision made throughout an assignment is automatically tracked. When students finish, they submit this entire ledger of their thought process to the instructor rather than just a final essay. She believes this approach allows educators to verify a student’s actual learning progress and level of engagement.
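A process ledger of this kind might be structured like the sketch below. Katis AI’s actual data model is not public, so every name here (`LedgerEntry`, `ProcessLedger`, the entry kinds) is a hypothetical illustration of the idea Song described.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
import json

# Hypothetical sketch of a process ledger; names and fields are assumptions.
@dataclass
class LedgerEntry:
    kind: str        # e.g. "idea", "research_step", "note", "outline_change"
    content: str
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

class ProcessLedger:
    def __init__(self) -> None:
        self.entries: list[LedgerEntry] = []

    def record(self, kind: str, content: str) -> None:
        self.entries.append(LedgerEntry(kind, content))

    def export(self) -> str:
        """Serialize the full thought process for submission to an instructor."""
        return json.dumps([asdict(e) for e in self.entries], indent=2)

ledger = ProcessLedger()
ledger.record("idea", "Compare two framings of AI alignment.")
ledger.record("research_step", "Skimmed three survey papers on RLHF.")
print(ledger.export())
```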

IETC inaugural summit’s second session panelists. Left to right: Hyojin Song, Evi Micha, Peter Salib and Ben Levinstein. (Photo credit: Brian Morri)
Published on April 30th, 2026
Last updated on April 30th, 2026
This article may feature some AI-assisted content for clarity, consistency, and to help explore complex scientific concepts with greater depth and creative range.