Pluralistic Alignment
@ NeurIPS 2024 Workshop
December 14, 2024.
Vancouver Convention Center (West Meeting Room 116, 117).
Exploring Pluralistic Perspectives in AI
Welcome to the Pluralistic Alignment Workshop! Aligning AI with human preferences and values is increasingly important. Yet, today’s AI alignment methods have been shown to be insufficient for capturing the vast space of complex, and often conflicting, real-world values. Our workshop will discuss how to integrate diverse perspectives, values, and expertise into pluralistic AI alignment. We aim to explore new methods for multi-objective alignment, drawing inspiration from governance and consensus-building practices to address conflicting values. Discussion will include technical approaches for dataset collection, algorithm development, and the design of human-AI interaction workflows that reflect pluralistic values among diverse populations. By gathering experts from various fields, this workshop seeks to foster interdisciplinary collaboration and push the boundaries of the understanding, development, and practice of pluralistic AI alignment.
Stay tuned by following us on Twitter @pluralistic_ai.
Time | Program |
---|---|
9:00-9:10 | Opening remarks |
9:10-9:55 | Keynote Talk: Monojit Choudhury, LLMs for a Multi-cultural World: A case for On-demand Value Alignment |
9:55-10:40 | Keynote Talk: Hannah Rose Kirk, Interpersonal and Intrapersonal Dilemmas in Achieving Pluralistic Alignment |
10:40-11:40 | Poster Session & Coffee Break |
11:40-12:25 | Keynote Talk: Yejin Choi, TBD |
12:25-13:30 | Lunch Break |
13:30-14:15 | Keynote Talk: Seth Lazar, Philosophical Foundations for Pluralistic Alignment |
14:15-15:00 | Keynote Talk: Melanie Mitchell, The Role of Metacognition in Wise and Aligned Machine Intelligence |
15:00-15:30 | Coffee Break |
15:30-16:45 | 5 Contributed Talks (10-minute talk + 5-minute Q&A each) |
16:45-17:30 | Keynote Talk: Michael Bernstein, Interactive Simulacra of Human Attitudes and Behavior |
17:30-17:40 | Closing Remarks |
Keynote Abstracts

Keynote Talk: Monojit Choudhury, LLMs for a Multi-cultural World: A case for On-demand Value Alignment

Aligning AI models to human values is of utmost importance, but to which values, and of which humans? In our multicultural world there is no universal set or hierarchy of values. Therefore, if foundation models are aligned to some particular values, we run the risk of excluding users, usage contexts, and applications that require alignment to conflicting values. I will discuss a set of experiments with moral dilemmas across various LLMs and languages, which show that while the moral reasoning capability of LLMs grows with model size, this ability is severely compromised when the model is strongly aligned to a particular set of values. This seriously limits the usability of such models in diverse applications and regions that prefer conflicting value hierarchies. I will use this as a case to argue against generic value alignment for foundation models; instead, foundation models should possess the ability to reason with any arbitrary value system specified in their prompt or through knowledge injection.
Keynote Talk: Hannah Rose Kirk, Interpersonal and Intrapersonal Dilemmas in Achieving Pluralistic Alignment

Early work in AI alignment relied on restrictive assumptions about human behaviour to make progress even in simple 1:1 settings with a single operator. This talk addresses two key challenges in developing more pluralistic and realistic models of human preferences for alignment today. In Part I, we challenge the assumption that values and preferences are universal or acontextual by examining interpersonal dilemmas: what happens when we disagree with one another? I'll introduce the PRISM Alignment Dataset as a key new resource that contextualizes preference ratings across diverse human groups with detailed sociodemographic data. In Part II, we challenge the assumption that values and preferences are stable or exogenous by exploring intrapersonal dilemmas: what happens when we disagree with ourselves? I'll introduce ongoing research on anthropomorphism in human-AI interaction, examining how revealed preferences often conflict with stated preferences, especially regarding AI systems' social capabilities and in longitudinal interactions.
Keynote Talk: Yejin Choi, TBD

Details to be updated soon.
Keynote Talk: Seth Lazar, Philosophical Foundations for Pluralistic Alignment

Why does pluralism matter, and how do different arguments for pluralism condition the methods by which we should realise it? This talk considers different possible justifications for pluralistic AI, and argues that, as long as we’re not using AI systems to exercise significant degrees of power, the best way to achieve pluralism is through ensuring a vibrant ecosystem of competing, varied, and (at least in some cases) open models.
Keynote Talk: Melanie Mitchell, The Role of Metacognition in Wise and Aligned Machine Intelligence

I will argue that AI alignment, especially in pluralistic contexts, will require machines to be able to understand and reason about concepts appropriately in diverse situations, and to be able to explain these reasoning processes. In humans, such capacities are enabled by metacognitive abilities: being sensitive to context, grasping others' perspectives, and recognizing the limits of one's own capabilities. I will discuss possible approaches toward AI metacognition and why such abilities may be paramount in developing machines with the “wisdom” needed for pluralistic alignment.
Keynote Talk: Michael Bernstein, Interactive Simulacra of Human Attitudes and Behavior

Effective models of human attitudes and behavior can empower applications ranging from immersive environments to social policy simulation. However, traditional simulations have struggled to capture the complexity and contingency of human behavior. I argue that modern artificial intelligence models allow us to re-examine this limitation. I make my case through computational software agents that simulate human attitudes and behavior. I discuss how we used this approach, which we call generative agents, to model a representative sample of 1,000 Americans and replicate their attitudes and behavior 85% as well as they replicate themselves two weeks later. Extending my line of argument, I explore how modeling human behavior and attitudes can help us design more effective online social spaces, understand the societal disagreement underlying modern AI models, and better embed societal values into our algorithms.
Accepted papers, including oral presentations and posters, are available on OpenReview.
Our workshop aims to bring together researchers with diverse scientific backgrounds, including (but not limited to) machine learning, human-computer interaction, philosophy, and policy studies. More broadly, our workshop lies at the intersection of the computer and social sciences. We welcome all interested researchers to discuss all aspects of pluralistic AI, from its definition to the technical pipeline to broad deployment and social acceptance.
We invite submissions that discuss the technical, philosophical, and societal aspects of pluralistic AI. We provide a non-exhaustive list of topics we hope to cover below. We also welcome any submissions that are broadly relevant to pluralistic alignment.
Submission Instructions
We invite authors to submit anonymized papers of up to 4 pages, excluding references and appendices. All submissions should be in PDF format and made through the OpenReview submission portal. Submissions must follow the NeurIPS 2024 template; checklists are not required. Reviews will be double-blind, with at least three reviewers assigned to each paper to ensure a thorough evaluation process.
We welcome various types of papers, including works in progress, position papers, policy papers, and academic papers. All accepted papers will be available on the workshop website but are considered non-archival.
Travel Support
A limited number of travel grants will be available to cover expenses. Financial support for lodging and registration will be provided subject to our available funding; travel expenses are handled via reimbursement. We extend our thanks to OpenAI for their generous sponsorship of our workshop.
Please fill out the travel support application form by Oct 18, 2024, AOE.
All deadlines are 11:59 pm UTC-12h (“Anywhere on Earth”).
Date | Event |
---|---|
July 15, 2024 | Call for Workshop Papers |
 | Paper Submission Deadline |
Oct 9, 2024 | Notification of Acceptance |
Oct 18, 2024 | Travel Support Application Deadline |
Nov 4, 2024 | Notification of Travel Support Decisions |
Nov 14, 2024 | Camera-Ready Version Due |
Dec 14, 2024 | Workshop Date |
Please email pluralistic-alignment-neurips2024@googlegroups.com if you have any questions.