Pluralistic Alignment

@ NeurIPS 2024 Workshop

December 14, 2024.
Vancouver Convention Center (West Meeting Room 116, 117).

Exploring Pluralistic Perspectives in AI


Welcome to the Pluralistic Alignment Workshop! Aligning AI with human preferences and values is increasingly important, yet today’s AI alignment methods have been shown to be insufficient for capturing the vast space of complex – and often conflicting – real-world values. Our workshop will discuss how to integrate diverse perspectives, values, and expertise into pluralistic AI alignment. We aim to explore new methods for multi-objective alignment, drawing inspiration from governance and consensus-building practices to address conflicting values. Discussion will cover technical approaches to dataset collection, algorithm development, and the design of human-AI interaction workflows that reflect pluralistic values among diverse populations. By gathering experts from various fields, this workshop seeks to foster interdisciplinary collaboration and push the boundaries of the understanding, development, and practice of pluralistic AI alignment.

Stay tuned by following us on Twitter @pluralistic_ai.

Speakers

Yejin Choi

University of Washington

Melanie Mitchell

Santa Fe Institute

Seth Lazar

Australian National University

Michael Bernstein

Stanford University

Hannah Rose Kirk

University of Oxford

Schedule

Time Program
9:00-9:10 Opening remarks
9:10-9:55 Keynote Talk: Monojit Choudhury, LLMs for a Multi-cultural World: A case for On-demand Value Alignment
9:55-10:40 Keynote Talk: Hannah Rose Kirk, Interpersonal and Intrapersonal Dilemmas in Achieving Pluralistic Alignment
10:40-11:40 Poster Session & Coffee Break
11:40-12:25 Keynote Talk: Yejin Choi, TBD
12:25-13:30 Lunch Break
13:30-14:15 Keynote Talk: Seth Lazar, Philosophical Foundations for Pluralistic Alignment
14:15-15:00 Keynote Talk: Melanie Mitchell, The Role of Metacognition in Wise and Aligned Machine Intelligence
15:00-15:30 Coffee Break
15:30-16:45 Contributed Talks: 5 talks (10-min presentation + 5-min Q&A each)
16:45-17:30 Keynote Talk: Michael Bernstein, Interactive Simulacra of Human Attitudes and Behavior
17:30-17:40 Closing Remarks

Keynote Talks

LLMs for a Multi-cultural World: A Case for On-demand Value Alignment

Speaker: Monojit Choudhury

Aligning AI models to human values is of utmost importance, but to which values, and of which humans? In our multicultural world there is no universal set or hierarchy of values. Therefore, if foundation models are aligned to some particular values, we run the risk of excluding users, usage contexts, and applications that require alignment to conflicting values. I will discuss a set of experiments with moral dilemmas across various LLMs and languages, which shows that while the moral reasoning capability of LLMs grows with model size, this ability is greatly compromised when the model is strongly aligned to a certain set of values. This seriously limits the usability of the model in diverse applications and regions that prefer conflicting value hierarchies. I will use this as a case to argue against generic value alignment for foundation models; instead, foundation models should possess the ability to reason with any arbitrary value system specified in their prompt or through knowledge injection.

Interpersonal and Intrapersonal Dilemmas in Achieving Pluralistic Alignment

Speaker: Hannah Rose Kirk

Early work in AI alignment relied on restrictive assumptions about human behaviour to make progress even in simple 1:1 settings with a single operator. This talk addresses two key challenges in developing more pluralistic and realistic models of human preferences for alignment today. In Part I, we challenge the assumption that values and preferences are universal or acontextual through examining interpersonal dilemmas - what happens when we disagree with one another? I'll introduce the PRISM Alignment Dataset as a key new resource that contextualizes preference ratings across diverse human groups with detailed sociodemographic data. In Part II, we challenge the assumption that values and preferences are stable or exogenous by exploring intrapersonal dilemmas - what happens when we disagree with ourselves? I'll introduce ongoing research on anthropomorphism in human-AI interaction, examining how revealed preferences often conflict with stated preferences, especially regarding AI systems' social capabilities and in longitudinal interactions.

Title: TBD

Speaker: Yejin Choi

Details to be updated soon.

Philosophical Foundations for Pluralistic Alignment

Speaker: Seth Lazar

Why does pluralism matter, and how do different arguments for pluralism condition the methods by which we should realise it? This talk considers different possible justifications for pluralistic AI, and argues that, as long as we’re not using AI systems to exercise significant degrees of power, the best way to achieve pluralism is through ensuring a vibrant ecosystem of competing, varied, and (at least in some cases) open models.

The Role of Metacognition in Wise and Aligned Machine Intelligence

Speaker: Melanie Mitchell

I will argue that AI alignment, especially in pluralistic contexts, will require machines to be able to understand and reason about concepts appropriately in diverse situations, and to be able to explain these reasoning processes. In humans, such capacities are enabled by metacognitive abilities: being sensitive to context, grasping others' perspectives, and recognizing the limits of one's own capabilities. I will discuss possible approaches toward AI metacognition and why such abilities may be paramount in developing machines with the “wisdom” needed for pluralistic alignment.

Interactive Simulacra of Human Attitudes and Behavior

Speaker: Michael Bernstein

Effective models of human attitudes and behavior can empower applications ranging from immersive environments to social policy simulation. However, traditional simulations have struggled to capture the complexity and contingency of human behavior. I argue that modern artificial intelligence models allow us to re-examine this limitation. I make my case through computational software agents that simulate human attitudes and behavior. I discuss how we used this approach, which we call generative agents, to model a representative sample of 1,000 Americans and replicate their attitudes and behavior 85% as well as they replicate themselves two weeks later. Extending my line of argument, I explore how modeling human behavior and attitudes can help us design more effective online social spaces, understand the societal disagreement underlying modern AI models, and better embed societal values into our algorithms.

Accepted Papers

Accepted papers are available on OpenReview.

Call for Papers

Our workshop aims to bring together researchers with diverse scientific backgrounds, including (but not limited to) machine learning, human-computer interaction, philosophy, and policy studies. More broadly, our workshop lies at the intersection of the computer and social sciences. We welcome all interested researchers to discuss all aspects of pluralistic AI, from its definition to the technical pipeline to broad deployment and social acceptance.

We invite submissions that discuss the technical, philosophical, and societal aspects of pluralistic AI. We provide a non-exhaustive list of topics we hope to cover below, and we also welcome any submissions broadly relevant to pluralistic alignment.

  • Philosophy:
    • Definitions and frameworks for pluralistic alignment
    • Ethical considerations in aligning AI with diverse human values
  • Machine learning:
    • Methods for pluralistic ML training and learning algorithms
    • Methods for handling annotation disagreements
    • Evaluation metrics and datasets suitable for pluralistic AI
  • Human-computer interaction:
    • Designing human-AI interaction that reflects diverse user experiences and values
    • Integrating existing surveys on human values into AI design
    • Navigating privacy challenges in pluralistic AI systems
  • Social sciences:
    • Methods for achieving consensus and different forms of aggregation
    • Assessment and measurement of the social impact of pluralistic AI
    • Dealing with pluralistic AI representing values that are offensive to some cultural groups
  • Policy studies:
    • Policy and laws for the deployment of pluralistic AI
    • Democratic processes for incorporating diverse values into AI systems on a broad scale
  • Applications:
    • Case studies in areas such as hate speech mitigation and public health

Submission Instructions

We invite authors to submit anonymized papers of up to 4 pages, excluding references and appendices. All submissions should be in PDF format and made through the OpenReview submission portal. Submissions must follow the NeurIPS 2024 template; checklists are not required. Reviews will be double-blind, with at least three reviewers assigned to each paper to ensure a thorough evaluation process.

We welcome various types of papers, including works in progress, position papers, policy papers, and academic papers. All accepted papers will be available on the workshop website but are considered non-archival.

Travel Support

A limited number of travel grants will be available to cover expenses. Financial support will be provided for lodging and registration, subject to our available funding. Travel expenses are handled via reimbursement. We extend our thanks to OpenAI for their generous sponsorship of our workshop.

Please fill out the travel support application form by Oct 18, 2024 AOE.

Important Dates

All deadlines are 11:59 pm UTC-12h (“Anywhere on Earth”).

Jul 15, 2024 Call for Workshop Papers
Sep 10, 2024 Paper Submission Deadline (extended from Sep 07)
Oct 09, 2024 Notification of Acceptance
Oct 18, 2024 Travel Support Application Deadline
Nov 04, 2024 Notification of Travel Support Decisions
Nov 14, 2024 Camera-Ready Version Due
Dec 14, 2024 Workshop Date

Organization

Organizing Committee

Moksh Jain

Mila & Université de Montréal

Ruyuan Wan

Pennsylvania State University

Mitchell L. Gordon

OpenAI and MIT CSAIL

Dongyeop Kang

University of Minnesota

Maarten Sap

CMU LTI

Amy Zhang

University of Washington

He He

New York University

Scientific Advisory Board

Yoshua Bengio

Mila & Université de Montréal

Jeffrey P. Bigham

Carnegie Mellon University

Contact us

Please email pluralistic-alignment-neurips2024@googlegroups.com if you have any questions.

Sponsor

OpenAI