
arxiv: 2305.09620 · v4 · pith:NQEXAKUNnew · submitted 2023-05-16 · cs.CL · cs.AI · cs.LG

AI-Augmented Surveys: Leveraging Large Language Models and Surveys for Opinion Prediction

classification cs.CL · cs.AI · cs.LG
keywords surveys, opinions, llms, missing, opinion, models, predicting, prediction

Nationally representative surveys track public opinion, yet they ask only a limited set of questions each year, limiting their potential to capture historical changes. To fill this gap, we develop a large language model (LLM)-based framework for predicting missing responses in repeated cross-sectional surveys by incorporating embeddings for questions, respondents, and survey periods. We introduce two new applications of LLMs to survey research: retrodiction (predicting year-level missing opinions) and unasked opinion prediction (predicting entirely missing opinions). Using data from the 1972-2021 General Social Surveys (GSS), our LLM-based models perform strongly in retrodicting masked GSS opinions through cross-validation and public opinions measured by other organizations in years when the GSS did not ask them. These capabilities enable us to recover missing trends and pinpoint when public attitudes changed, such as the rising support for same-sex marriage. However, performance remains modest for unasked opinion prediction. We show when our models outperform established benchmarks, examine which opinions and respondents are more predictable, and evaluate whether our approach reduces LLMs' tendency to homogenize predicted responses. Our study demonstrates that LLMs and surveys can mutually enhance each other: LLMs broaden survey potential, while surveys calibrate LLMs for simulating human opinions.
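
The difference between the two tasks named in the abstract can be illustrated with a toy hold-out split. This is only a sketch under assumptions, not the authors' evaluation code: the `Response` record, the question IDs `q1`/`q2`, the years, and both function names are hypothetical. Retrodiction hides a question in one survey year while it stays visible in other years; unasked opinion prediction hides the question in every year.

```python
from typing import NamedTuple


class Response(NamedTuple):
    # Hypothetical toy record; real survey data has many more fields.
    respondent: str
    question: str
    year: int
    answer: int  # toy coding: 1 = agree, 0 = disagree


def retrodiction_split(rows, question, year):
    """Hold out one question in one year; it stays visible in other years."""
    held_out = [r for r in rows if r.question == question and r.year == year]
    train = [r for r in rows if not (r.question == question and r.year == year)]
    return train, held_out


def unasked_split(rows, question):
    """Hold out one question in every year, as if it had never been asked."""
    held_out = [r for r in rows if r.question == question]
    train = [r for r in rows if r.question != question]
    return train, held_out


# Made-up rows for illustration only.
rows = [
    Response("a", "q1", 2000, 1),
    Response("a", "q2", 2000, 0),
    Response("b", "q1", 2010, 0),
    Response("b", "q2", 2010, 1),
]

train_r, held_r = retrodiction_split(rows, "q1", 2000)
train_u, held_u = unasked_split(rows, "q1")
```

In the retrodiction split the training rows still contain `q1` from 2010, so a model can learn from the same question in another year. In the unasked split no `q1` rows remain in training, which is consistent with the abstract's report that this second task is the harder one.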



Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. A Roadmap to Pluralistic Alignment

    cs.AI 2024-02 unverdicted novelty 6.0

    The paper formalizes three types of pluralistic AI models and three benchmark classes, arguing that current alignment techniques may reduce rather than increase distributional pluralism.

  2. From Demographics to Survey Anchors: Evaluating LLM Agents for Modeling Retirement Attitudes

    cs.CY 2026-04 conditional novelty 5.0

    Demographic-only LLM agents for retirement survey prediction exhibit central tendency bias, fail to reproduce incorrect or 'don't know' answers, and miss factor interactions in regressions, unlike survey-anchored agents.

  3. Large Language Models as Virtual Survey Respondents: Evaluating Sociodemographic Response Generation

    cs.AI 2025-09 conditional novelty 5.0

    Introduces PAS and FAS task abstractions plus the LLM-S^3 benchmark to evaluate LLMs on generating sociodemographic survey responses across 11 real datasets and multiple models.