AI-Augmented Surveys: Leveraging Large Language Models and Surveys for Opinion Prediction

Byungkyu Lee; Junsol Kim

arXiv: 2305.09620 · v4 · pith:NQEXAKUNnew · submitted 2023-05-16 · 💻 cs.CL · cs.AI · cs.LG


Pith reviewed 2026-05-24 08:45 UTC · model grok-4.3

classification 💻 cs.CL · cs.AI · cs.LG

keywords large language models · survey research · public opinion · opinion prediction · General Social Survey · retrodiction · missing data


The pith

Large language models retrodict missing survey opinions by embedding questions, respondents, and years.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an LLM framework that predicts omitted responses in repeated cross-sectional surveys by representing each question, each respondent, and each survey wave as embeddings. These models recover masked answers from the 1972-2021 General Social Surveys in cross-validation and match independent measurements from other organizations in years when the GSS skipped items. The filled series locate the timing of attitude changes, including the rise in support for same-sex marriage. Accuracy falls when the task shifts to opinions that no survey in the data ever measured. The work treats surveys and LLMs as mutually corrective: surveys anchor the models while the models expand the reach of survey records.

Core claim

By incorporating embeddings for questions, respondents, and survey periods into large language models, the framework predicts masked GSS opinions accurately in cross-validation and matches external public opinion data from years the GSS did not field certain items. These predictions recover complete time trends and locate inflection points in attitude change.

What carries the argument

Embeddings for questions, respondents, and survey periods that allow the model to generate individualized response predictions.
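
As a rough illustration of how three embeddings could combine into one individualized prediction, here is a minimal sketch. It is not the paper's architecture, which this page does not specify: the additive combination, the logistic link, and every name and number below (`dot`, `predict_prob`, the toy vectors) are assumptions made purely for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def predict_prob(question_vec, respondent_vec, period_vec):
    """Hypothetical scorer: probability that a respondent agrees with a question.

    Assumption: the respondent and period embeddings are added into one
    context vector, scored against the question embedding, and squashed
    with a logistic function. The paper's actual model may differ.
    """
    context = [r + p for r, p in zip(respondent_vec, period_vec)]
    score = dot(question_vec, context)
    return 1.0 / (1.0 + math.exp(-score))


# Toy 3-dimensional embeddings (made-up numbers, not learned from GSS data).
question = [0.8, -0.2, 0.5]
respondent = [0.3, 0.1, -0.4]
period_1990 = [-0.5, 0.0, -0.5]
period_2020 = [0.5, 0.0, 0.5]

# Same respondent and question, two different survey periods.
p_early = predict_prob(question, respondent, period_1990)
p_late = predict_prob(question, respondent, period_2020)
print(f"early period: {p_early:.3f}, late period: {p_late:.3f}")
```

In this toy setup only the period vector changes between the two calls, so any difference in the predicted probability comes from the period embedding alone. That is the kind of structure the load-bearing premise relies on.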

Load-bearing premise

That the learned embeddings capture enough structure to predict opinions across time and questions without substantial bias or homogenization of responses.

What would settle it

If the model's predictions for opinions measured by both the GSS and another organization in the same year diverge substantially from the external measurements, the retrodiction claim would be falsified.

Figures

Figures reproduced from arXiv: 2305.09620 by Byungkyu Lee, Junsol Kim.

**Figure 2.** An overview of our methodological framework.

**Figure 3.** Model performance for predicting three types of missing responses at individual ...

**Figure 4.** Illustration of the potential application of our models and matrix factorization ...

**Figure 5.** Coefficient plots from OLS regression models predicting individual-level AUC ...

**Figure 6.** Coefficient plots from OLS regression models predicting opinion-level AUC ...

Original abstract

Nationally representative surveys track public opinion, yet they ask only a limited set of questions each year, limiting its potential to capture historical changes. To fill this gap, we develop a large language model (LLM)-based framework for predicting missing responses in repeated cross-sectional surveys by incorporating embeddings for questions, respondents, and survey periods. We introduce two new applications of LLMs to survey research: retrodiction (predicting year-level missing opinions) and unasked opinion prediction (predicting entirely missing opinions). Using data from the 1972-2021 General Social Surveys, our LLM-based models perform strongly in retrodicting masked GSS opinions through cross-validation and public opinions measured by other organizations in years when the GSS did not ask them. These capabilities enable us to recover missing trends and pinpoint when public attitudes changed, such as the rising support for same-sex marriage. However, performance remains modest for unasked opinion prediction. We show when our models outperform established benchmarks, examine which opinions and and respondents are more predictable, and evaluate whether our approach reduces LLMs' tendency to homogenize predicted responses. Our study demonstrates that LLMs and surveys can mutually enhance each other: LLMs broaden survey potential, while surveys calibrate LLMs for simulating human opinions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The embedding framework retrodicts masked GSS opinions reasonably well via CV and external checks, but unasked-opinion results stay modest and the temporal extrapolation claim needs tighter tests.


The main takeaway is that question, respondent, and period embeddings let the model recover masked GSS items in cross-validation and line up with other polls in years when the GSS did not field those items. That part looks usable for filling gaps in repeated surveys. The unasked-opinion task, by contrast, shows only modest performance, which the abstract itself flags. This split suggests the method is stronger at interpolation within observed distributions than at true out-of-distribution prediction. One concern worth checking in the full methods is whether the period embeddings absorb average yearly levels rather than isolate attitude shifts; if the embeddings are fit jointly on all data, the retrodiction numbers could overstate how well the model would recover trends from truly unseen years. Dependence on GSS data is another soft spot, though the external validation against other organizations gives partial relief. The paper does not appear to ship code or formal proofs, so reproducibility will rest on the usual survey-data details. Overall the work is aimed at social scientists who want to extend limited repeated cross-sections without new fieldwork. It is coherent on its own terms and engages honestly with the limits of the unasked task, so it clears the bar for peer review even if the central extrapolation claim needs more scrutiny in revision.

Referee Report

3 major / 2 minor

Summary. The paper claims that an LLM-based framework incorporating embeddings for questions, respondents, and survey periods can predict missing responses in repeated cross-sectional surveys such as the GSS (1972-2021). It reports strong performance in retrodicting masked GSS opinions via cross-validation and in aligning with external organizations' measurements in non-GSS years, enabling recovery of historical trends (e.g., rising support for same-sex marriage), while performance is modest for predicting entirely unasked opinions. The work also examines predictability by opinion/respondent type and whether the approach reduces LLM homogenization.

Significance. If the central retrodiction and external-validation results hold after addressing methodological details, the approach would meaningfully extend the temporal coverage of survey data and provide a calibrated method for using LLMs to simulate opinions, with direct utility for trend recovery in public-opinion research. The external validation against independent polls is a positive feature that partially mitigates data-source dependence.

major comments (3)

[Abstract and §3 (Methods)] The cross-validation procedure for retrodiction is described only at a high level; it is unclear whether period embeddings are learned jointly over all years (including those supplying supervision) or held out in a manner that tests extrapolation rather than interpolation of within-year correlations. If the former, the reported strong retrodiction performance does not yet demonstrate recovery of attitude shifts in truly unseen periods, which is load-bearing for the central claim.

[Abstract and Results] Quantitative claims of strong performance and outperformance of benchmarks are presented without reported error bars, full hyperparameter/model specifications, or explicit data-exclusion criteria. This prevents assessment of whether the modest unasked-opinion results and the stronger retrodiction results are statistically distinguishable from the benchmarks.

[§4 (External validation)] While alignment with other organizations' polls in non-GSS years is cited as supporting evidence, the paper does not report the exact overlap in question wording, sampling frames, or adjustment for mode effects; without these, the degree of independent grounding remains difficult to evaluate.
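
The distinction behind the first major comment can be made concrete with a small sketch of two evaluation splits. It is a hedged illustration only: the record layout and the function names `random_mask_split` and `leave_period_out_split` are hypothetical, and neither is claimed to be the paper's actual procedure.

```python
import random


def random_mask_split(records, test_fraction=0.2, seed=0):
    """Hold out a random subset of records; every year can still appear in training."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def leave_period_out_split(records, held_out_year):
    """Hold out every record from one survey year, so that year is never seen in training."""
    train = [r for r in records if r["year"] != held_out_year]
    test = [r for r in records if r["year"] == held_out_year]
    return train, test


# Toy records (made-up values): 5 respondents x 2 questions x 3 survey years.
records = [
    {"respondent": i, "question": q, "year": y, "answer": (i + y) % 2}
    for i in range(5)
    for q in ("q_a", "q_b")
    for y in (2000, 2010, 2020)
]

train_r, test_r = random_mask_split(records)
train_p, test_p = leave_period_out_split(records, held_out_year=2010)

years_seen_random = {r["year"] for r in train_r}
years_seen_period = {r["year"] for r in train_p}
print(sorted(years_seen_random), sorted(years_seen_period))
```

Under random masking the training set still contains responses from every year, so a period embedding for each year can be fit directly. Under the leave-period-out split the held-out year contributes nothing to training, which is the stricter extrapolation test the comment asks about.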

minor comments (2)

[Abstract] Abstract contains a repeated word: 'which opinions and and respondents'.
[§3] Notation for the three embedding types (question, respondent, period) should be introduced once with consistent symbols and reused throughout to improve readability.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments, which identify key areas where additional methodological detail and transparency will strengthen the manuscript. We address each major comment below and will revise the paper to improve clarity and rigor.


Referee: [Abstract and §3 (Methods)] The cross-validation procedure for retrodiction is described only at a high level; it is unclear whether period embeddings are learned jointly over all years (including those supplying supervision) or held out in a manner that tests extrapolation rather than interpolation of within-year correlations. If the former, the reported strong retrodiction performance does not yet demonstrate recovery of attitude shifts in truly unseen periods, which is load-bearing for the central claim.

Authors: We will revise §3 to provide a detailed, explicit description of the cross-validation procedure, including the training of question, respondent, and period embeddings. In the current implementation, period embeddings are learned jointly over all years in the GSS data because retrodiction focuses on imputing masked individual responses by leveraging the full observed structure and correlations within the dataset. We will add text distinguishing this from the external validation task, which evaluates predictions against independent polls in periods absent from the GSS for those questions. This clarification will specify that retrodiction demonstrates recovery of within-survey patterns while external validation addresses performance on unseen periods.

Revision: yes

Referee: [Abstract and Results] Quantitative claims of strong performance and outperformance of benchmarks are presented without reported error bars, full hyperparameter/model specifications, or explicit data-exclusion criteria. This prevents assessment of whether the modest unasked-opinion results and the stronger retrodiction results are statistically distinguishable from the benchmarks.

Authors: We agree that the absence of these details limits evaluation. In the revised manuscript we will report error bars (standard errors or bootstrap intervals) for all performance metrics in the Results section and Abstract. We will add an appendix containing full hyperparameter values, model specifications, training details, and explicit data inclusion/exclusion criteria. These changes will allow direct assessment of whether differences from benchmarks are statistically meaningful.

Revision: yes

Referee: [§4 (External validation)] While alignment with other organizations' polls in non-GSS years is cited as supporting evidence, the paper does not report the exact overlap in question wording, sampling frames, or adjustment for mode effects; without these, the degree of independent grounding remains difficult to evaluate.

Authors: We will expand §4 with a table or subsection detailing the specific external poll questions, the degree of wording overlap with corresponding GSS items, sampling frame differences, and any adjustments applied for survey mode or other methodological factors. Where precise information is unavailable from the source documentation, we will explicitly note the limitation and its implications for interpreting the validation results.

Revision: yes
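
The bootstrap intervals mentioned in the second response could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the authors' code: the percentile method, the helper names `auc` and `bootstrap_auc_interval`, and the toy labels and scores are all hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the chance a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_interval(labels, scores, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC, resampling cases with replacement."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lack one of the two classes
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Toy data (made-up): 1 = agrees, 0 = disagrees, with model scores.
labels = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.65, 0.6, 0.4, 0.35, 0.3, 0.2, 0.1]

point = auc(labels, scores)
lo, hi = bootstrap_auc_interval(labels, scores)
print(f"AUC = {point:.2f}, 95% bootstrap interval = [{lo:.2f}, {hi:.2f}]")
```

Computing the same interval for a benchmark model on the same resampled indices would allow a paired comparison, which is one way to judge whether a difference from the benchmark is statistically meaningful.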

Circularity Check

0 steps flagged

No significant circularity; derivation uses independent validation


The paper trains embeddings and models on GSS responses then evaluates retrodiction via cross-validation on masked GSS items plus direct comparison to independent polls from other organizations in non-GSS years. This constitutes standard held-out evaluation against external benchmarks rather than any reduction of a claimed prediction to a fitted input or self-citation by construction. No equations, self-definitional steps, or load-bearing self-citations are present in the manuscript description that would force the central results to equal their inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no explicit free parameters, axioms, or invented entities; the framework implicitly assumes standard LLM embedding capabilities can capture opinion dynamics.

pith-pipeline@v0.9.0 · 5756 in / 1139 out tokens · 25907 ms · 2026-05-24T08:45:06.614032+00:00 · methodology


Forward citations

Cited by 6 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Will Scaling Improve Social Simulation with LLMs?
cs.CL · 2026-07 · conditional · novelty 7.0

Scaling improves LLM social simulation fidelity in most opinion and behavior tasks but not for human cognitive bias calibration or low-resource domains.

In-Context Learning for the Imputation of Public Opinion Data with Large Language Models
cs.CL · 2026-06 · unverdicted · novelty 7.0

ICL with LLMs reduces absolute imputation error for survey data versus MICE PMM across MCAR/MAR/MNAR mechanisms and yields narrower intervals with near-nominal coverage.

A Roadmap to Pluralistic Alignment
cs.AI · 2024-02 · unverdicted · novelty 6.0

The paper formalizes three types of pluralistic AI models and three benchmark classes, arguing that current alignment techniques may reduce rather than increase distributional pluralism.

A Single Rewrite Suffices: Empirical Lessons from Production Skill Description Optimization
cs.CL · 2026-06 · unverdicted · novelty 5.0

A single LLM rewrite of skill descriptions using false positive and negative cases matches manual optimization performance in production, with most other pipeline components adding little value.

From Demographics to Survey Anchors: Evaluating LLM Agents for Modeling Retirement Attitudes
cs.CY · 2026-04 · conditional · novelty 5.0

Demographic-only LLM agents for retirement survey prediction exhibit central tendency bias, fail to reproduce incorrect or 'don't know' answers, and miss factor interactions in regressions, unlike survey-anchored agents.

Large Language Models as Virtual Survey Respondents: Evaluating Sociodemographic Response Generation
cs.AI · 2025-09 · conditional · novelty 5.0

Introduces PAS and FAS task abstractions plus the LLM-S^3 benchmark to evaluate LLMs on generating sociodemographic survey responses across 11 real datasets and multiple models.

Reference graph

Works this paper leans on

103 extracted references · 103 canonical work pages · cited by 6 Pith papers · 3 internal anchors

[1]

write newline

" write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION word.in "In " FUNCTION format.date ye...

work page
[2]

Abid, Abubakar, Maheen Farooqi, and James Zou. 2021. Persistent Anti-Muslim Bias in Large Language Models . In Proceedings of the 2021 AAAI / ACM Conference on AI , Ethics , and Society \/ , pp. 298--306. Association for Computing Machinery

work page 2021
[3]

Preprint, arXiv:2208.10264

Aher, Gati, Rosa I. Arriaga, and Adam Tauman Kalai. 2023. Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies . arXiv:2208.10264 https://arxiv.org/abs/2208.10264 [cs.CL]

work page arXiv 2023
[4]

Aiyappa, Rachith, Jisun An, Haewoon Kwak, and Yong-Yeol Ahn. 2023. Can We Trust the Evaluation on ChatGPT ? arXiv:2303.12767 https://arxiv.org/abs/2303.12767 [cs.CL]

work page arXiv 2023
[5]

Ansolabehere, Stephen, Jonathan Rodden, and James M. Snyder. 2008. The Strength of Issues : Using Multiple Measures to Gauge Preference Stability , Ideological Constraint , and Issue Voting . The American Political Science Review\/ 102:215--232

work page 2008
[6]

Prabin Bhandari, Antonios Anastasopoulos, and Dieter Pfoser

Argyle, Lisa P., Ethan C. Busby, Nancy Fulda, Joshua R. Gubler, Christopher Rytting, and David Wingate. 2023. Out of One , Many : Using Language Models to Simulate Human Samples . Political Analysis\/ https://doi.org/10.1017/pan.2023.2 https://doi.org/10.1017/pan.2023.2

work page doi:10.1017/pan.2023.2 2023
[7]

Xue, Peter S

Atari, Mohammad, Mona J. Xue, Peter S. Park, Damián Blasi, and Joseph Henrich. 2023. Which Humans ?

work page 2023
[8]

Bail, Christopher A. 2023. Can Generative AI Improve Social Science ? SocArXiv . https://doi.org/10.31235/osf.io/rwtzs https://doi.org/10.31235/osf.io/rwtzs

work page doi:10.31235/osf.io/rwtzs 2023
[9]

Baldassarri, Delia and Andrew Gelman. 2008. Partisans without Constraint : Political Polarization and Trends in American Public Opinion . American Journal of Sociology\/ 114:408--446

work page 2008
[10]

Baldassarri, Delia and Amir Goldberg. 2014. Neither Ideologues nor Agnostics : Alternative Voters ' Belief System in an Age of Partisan Politics . American Journal of Sociology\/ 120:45--95

work page 2014
[11]

Baldassarri, Delia and Barum Park. 2020. Was There a Culture War ? Partisan Polarization and Secular Trends in US Public Opinion . The Journal of Politics\/ 82:809--827

work page 2020
[12]

Baunach, Dawn Michelle. 2012. Changing Same-Sex Marriage Attitudes in America from 1988 Through 2010. Public Opinion Quarterly\/ 76:364--378

work page 2012
[13]

Beauchamp, Nicholas. 2017. Predicting and Interpolating State-Level Polls Using Twitter Textual Data . American Journal of Political Science\/ 61:490--503

work page 2017
[14]

and Shanto Iyengar

Behr, Roy L. and Shanto Iyengar. 1985. Television News , Real-World Cues , and Changes in the Public Agenda . Public Opinion Quarterly\/ 49:38

work page 1985
[15]

Berinsky, Adam J. 2017. Measuring Public Opinion with Surveys . Annual Review of Political Science\/ 20:309--329

work page 2017
[16]

Blumenstock, Joshua, Gabriel Cadamuro, and Robert On. 2015. Predicting Poverty and Wealth from Mobile Phone Metadata. Science\/ 350:1073--1076

work page 2015
[17]

Boutyline, Andrei and Laura K Soter. 2021. Cultural schemas: What they are, how to find them, and what to do once you’ve caught one. American Sociological Review\/ 86:728--758

work page 2021
[18]

Boutyline, Andrei and Stephen Vaisey. 2017. Belief Network Analysis: A Relational Approach to Understanding the Structure of Attitudes. American journal of sociology\/ 122:1371--1447

work page 2017
[19]

Brand, James, Ayelet Israeli, and Donald Ngwe. 2023. Using gpt for market research. Available at SSRN 4395751\/

work page 2023
[20]

Brayne, Sarah. 2020. Predict and Surveil : Data , Discretion , and the Future of Policing \/ . Oxford University Press

work page 2020
[21]

Brooks, Clem and Jeff Manza. 2006. Social Policy Responsiveness in Developed Democracies . American Sociological Review\/ 71:474--494

work page 2006
[22]

Brown, Tom, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, and Amanda Askell. 2020. Language Models Are Few-Shot Learners. Advances in neural information processing systems\/ 33:1877--1901

work page 2020
[23]

Burstein, Paul. 2003. The Impact of Public Opinion on Public Policy : A Review and an Agenda . Political Research Quarterly\/ 56:29--40

work page 2003
[24]

Cesare, Nina, Hedwig Lee, Tyler McCormick, Emma Spiro, and Emilio Zagheni. 2018. Promises and Pitfalls of Using Digital Traces for Demographic Research . Demography\/ 55:1979--1999

work page 2018
[25]

Chu, Eric, Jacob Andreas, Stephen Ansolabehere, and Deb Roy. 2023. Language Models Trained on Media Diets Can Predict Public Opinion. arXiv:2303.16779 https://arxiv.org/abs/2303.16779 [cs.CL]

work page arXiv 2023
[26]

Converse, P. 1964. The Nature of Belief Systems in Mass Publics . In Ideology and Discontent \/ , edited by Apter, D. E. , pp. 206--261. The Free Press

work page 1964
[27]

Couper, Mick P. 2017. New Developments in Survey Data Collection . Annual Review of Sociology\/ 43:121--145

work page 2017
[28]

Morgan, and Tom W

Davern, Michael, Rene Bautista, Jeremy Freese, Stephen L. Morgan, and Tom W. Smith. 2021. General Social Surveys, 1972-2021 Cross-section [machine-readable data file, 68,846 cases]. Principal Investigator, Michael Davern; Co-Principal Investigators, Rene Bautista, Jeremy Freese, Stephen L. Morgan, and Tom W. Smith; Sponsored by National Science Foundation...

work page 2021
[29]

Oil Spill

DellaPosta, Daniel. 2020. Pluralistic Collapse : The “ Oil Spill ” Model of Mass Opinion Polarization . American Sociological Review\/ 85:507--536. Publisher: SAGE Publications Inc

work page 2020
[30]

DellaPosta, Daniel, Yongren Shi, and Michael Macy. 2015. Why Do Liberals Drink Lattes? American Journal of Sociology\/ 120:1473--1511

work page 2015
[31]

Dillion, Danica, Niket Tandon, Yuling Gu, and Kurt Gray. 2023. Can AI Language Models Replace Human Participants? Trends in Cognitive Sciences\/ https://doi.org/10.1016/j.tics.2023.04.008 https://doi.org/10.1016/j.tics.2023.04.008

work page doi:10.1016/j.tics.2023.04.008 2023
[32]

DiMaggio, Paul, Eszter Hargittai, Coral Celeste, and Steven Shafer. 2004. Digital Inequality : From Unequal Access to Differentiated Use . In Social Inequality \/ , edited by Kathryn M. Neckerman, pp. 355--400. Russell Sage Foundation

work page 2004
[33]

DiMaggio, Paul, Ramina Sotoudeh, Amir Goldberg, and Hana Shepherd. 2018. Culture out of attitudes: Relationality, population heterogeneity and attitudes toward science and religion in the US. Poetics\/ 68:31--51

work page 2018
[34]

Dominguez-Olmedo, Ricardo, Moritz Hardt, and Celestine Mendler-D \"u nner. 2023. Questioning the survey responses of large language models. arXiv preprint arXiv:2306.07951\/

work page arXiv 2023
[35]

Downs, Anthony. 1972. Up and down with Ecology: The Issue-Attention Cycle. The public\/ 28:38--50

work page 1972
[36]

Eloundou, Tyna, Sam Manning, Pamela Mishkin, and Daniel Rock. 2023. GPTs Are GPTs : An Early Look at the Labor Market Impact Potential of Large Language Models . arXiv:2303.10130 https://arxiv.org/abs/2303.10130 [econ.GN]

work page arXiv 2023
[37]

and Melissa M

Ferraro, Kenneth F. and Melissa M. Farmer. 1999. Utility of Health Data from Social Surveys : Is There a Gold Standard for Measuring Morbidity ? American Sociological Review\/ 64:303--315

work page 1999
[38]

Floridi, Luciano, Josh Cowls, Monica Beltrametti, Raja Chatila, Patrice Chazerand, Virginia Dignum, Christoph Luetge, Robert Madelin, Ugo Pagallo, Francesca Rossi, Burkhard Schafer, Peggy Valcke, and Effy Vayena. 2018. AI4People An Ethical Framework for a Good AI Society : Opportunities , Risks , Principles , and Recommendations . Minds and Machines\/ 28:689--707

work page 2018
[39]

Friedkin, Noah E and Eugene C Johnsen. 2011. Social influence network theory: A sociological examination of small group dynamics\/ , volume 33. Cambridge University Press

work page 2011
[40]

Garip, Filiz. 2020. What failure to predict life outcomes can teach us. Proceedings of the National Academy of Sciences\/ 117:8234--8235. Publisher: Proceedings of the National Academy of Sciences

work page 2020
[41]

Goldberg, Amir. 2011. Mapping Shared Understandings Using Relational Class Analysis : The Case of the Cultural Omnivore Reexamined . American Journal of Sociology\/ 116:1397--1436

work page 2011
[42]

Moreno, and Antoine Doucet

Gonz \'a lez-Gallardo , Carlos-Emiliano, Emanuela Boros, Nancy Girdhar, Ahmed Hamdi, Jose G. Moreno, and Antoine Doucet. 2023. Yes but.. Can ChatGPT Identify Entities in Historical Documents ? arXiv:2303.17322 https://arxiv.org/abs/2303.17322 [cs.DL]

work page arXiv 2023
[43]

Lam, Joon Sung Park, Kayur Patel, Jeff Hancock, Tatsunori Hashimoto, and Michael S

Gordon, Mitchell L., Michelle S. Lam, Joon Sung Park, Kayur Patel, Jeff Hancock, Tatsunori Hashimoto, and Michael S. Bernstein. 2022. Jury Learning : Integrating Dissenting Voices into Machine Learning Models . In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems \/ , pp. 1--19. Association for Computing Machinery

work page 2022
[44]

Roberts, and Brandon M

Grimmer, Justin, Margaret E. Roberts, and Brandon M. Stewart. 2022. Text as Data : A New Framework for Machine Learning and the Social Sciences \/ . Princeton University Press

work page 2022
[45]

Grossmann, Igor, Matthew Feinberg, Dawn C Parker, Nicholas A Christakis, Philip E Tetlock, and William A Cunningham. 2023. AI and the transformation of social science research. Science\/ 380:1108--1109

work page 2023
[46]

a m \"a l \

H \"a m \"a l \"a inen, Perttu, Mikke Tavast, and Anton Kunnari. 2023. Evaluating Large Language Models in Generating Synthetic HCI Research Data : A Case Study . In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems \/ , pp. 1--19. Association for Computing Machinery

work page 2023
[47]

Hastie, Trevor J. 1992. Generalized Additive Models . In Statistical Models in S \/ . Routledge

work page 1992
[48]

Hilgartner, Stephen and Charles L Bosk. 1988. The Rise and Fall of Social Problems: A Public Arenas Model. American journal of Sociology\/ 94:53--78

work page 1988
[49]

Holm, Elizabeth A. 2019. In Defense of the Black Box. Science\/ 364:26--27

work page 2019
[50]

Honaker, James, Gary King, and Matthew Blackwell. 2011. Amelia II : A Program for Missing Data . Journal of Statistical Software\/ 45:1--47

work page 2011
[51]

Horton, John J. 2023. Large Language Models as Simulated Economic Agents : What Can We Learn from Homo Silicus ? arXiv:2301.07543 https://arxiv.org/abs/2301.07543 [econ.GN]

work page arXiv 2023
[52]

Igo, Sarah E. 2008. The Averaged American : Surveys , Citizens , and the Making of a Mass Public \/ . Harvard University Press

work page 2008
[53]

Jansen, Bernard J., Soon-gyo Jung, and Joni Salminen. 2023. Employing large language models in survey research. Natural Language Processing Journal\/ 4:100020

work page 2023
[54]

Jefferson, Hakeem. 2020. The Curious Case of Black Conservatives : Construct Validity and the 7-Point Liberal-Conservative Scale . Available at SSRN: https://ssrn.com/abstract=3602209 or http://dx.doi.org/10.2139/ssrn.3602209 https://ssrn.com/abstract=3602209 or http://dx.doi.org/10.2139/ssrn.3602209

work page doi:10.2139/ssrn.3602209 2020
[55]

Jiang, Hang, Doug Beeferman, Brandon Roy, and Deb Roy. 2022. CommunityLM : Probing Partisan Worldviews from Language Models . In Proceedings of the 29th International Conference on Computational Linguistics \/ , pp. 6818--6826. International Committee on Computational Linguistics

work page 2022
[56]

Joo, Won-Tak and Jason Fletcher. 2020. Out of Sync, out of Society: Political Beliefs and Social Networks. Network Science\/ 8:445--468

work page 2020
[57]

Jumper, John, Richard Evans, Alexander Pritzel, Tim Green, Michael Figurnov, Olaf Ronneberger, Kathryn Tunyasuvunakool, Russ Bates, Augustin Z \'i dek, Anna Potapenko, Alex Bridgland, Clemens Meyer, Simon A. A. Kohl, Andrew J. Ballard, Andrew Cowie, Bernardino Romera-Paredes , Stanislav Nikolov, Rishub Jain, Jonas Adler, Trevor Back, Stig Petersen, David ...

work page 2021
[58]

Jurafsky, Daniel and James Martin. 2023. Speech and Language Processing , 3rd Edition Draft \/

work page 2023
[59]

Kiley, Kevin and Stephen Vaisey. 2020. Measuring Stability and Change in Personal Culture Using Panel Data . American Sociological Review\/ 85:477--506

work page 2020
[60]

Kirk, Hannah Rose, Bertie Vidgen, Paul R \"o ttger, and Scott A. Hale. 2023. Personalisation within Bounds: A Risk Taxonomy and Policy Framework for the Alignment of Large Language Models with Personalised Feedback. arXiv:2303.05453 https://arxiv.org/abs/2303.05453 [cs.CL]

work page arXiv 2023
[61]

Koren, Yehuda, Robert Bell, and Chris Volinsky. 2009. Matrix Factorization Techniques for Recommender Systems . Computer\/ 42:30--37. Conference Name: Computer

work page 2009
[62]

Kozlowski, Austin C and James P Murphy. 2021. Issue alignment and partisanship in the American public: Revisiting the ‘partisans without constraint’thesis. Social Science Research\/ 94:102498

work page 2021
[63]

Kozlowski, Austin C., Matt Taddy, and James A. Evans. 2019. The Geometry of Culture : Analyzing the Meanings of Class through Word Embeddings . American Sociological Review\/ 84:905--949

work page 2019
[64]

Lall, Ranjit and Thomas Robinson. 2022. The MIDAS touch: accurate and scalable missing-data imputation with deep learning. Political Analysis\/ 30:179--196

work page 2022
[65]

Latour, Bruno. 2007. Reassembling the Social : An Introduction to Actor-Network-Theory \/ . OUP Oxford

work page 2007
[66]

Berelson, and Hazel Gaudet

Lazarsfeld, Paul F., Bernard R. Berelson, and Hazel Gaudet. 1948. The People 's Choice \/ . Columbia University Press, 3rd edition

work page 1948
[67]

Lersch, Philipp M. 2023. Change in Personal Culture over the Life Course . American Sociological Review\/ 88:220–251

work page 2023
[68]

Liu, Yinhan, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. Roberta: A Robustly Optimized Bert Pretraining Approach. arXiv:1907.11692 https://arxiv.org/abs/1907.11692 [cs.CL]

work page internal anchor Pith review Pith/arXiv arXiv 2019
[69]

Longpre, Shayne, Gregory Yauney, Emily Reif, Katherine Lee, Adam Roberts, Barret Zoph, Denny Zhou, Jason Wei, Kevin Robinson, David Mimno, et al. 2023. A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity. arXiv:2305.13169 https://arxiv.org/abs/2305.13169 [ cs.CL ]

work page arXiv 2023
[70]

Smith, and Michael Hout

Marsden, Peter V., Tom W. Smith, and Michael Hout. 2020. Tracking US Social Change Over a Half-Century : The General Social Survey at Fifty . Annual Review of Sociology\/ 46:109--134

work page 2020
[71]

Martin, John Levi. 2010. Life's a Beach but You're an Ant, and Other Unwelcome News for the Sociology of Culture. Poetics\/ 38:229--244

work page 2010
[72]

Mei, Qiaozhu, Yutong Xie, Walter Yuan, and Matthew O Jackson. 2024. A Turing test of whether AI chatbots are behaviorally similar to humans. Proceedings of the National Academy of Sciences\/ 121:e2313925121

work page 2024
[73]

Miller, Sasha Mitts, Adithya Renduchintala, Stephen Roller, Dirk Rowe, Weiyan Shi, Joe Spisak, Alexander Wei, David Wu, Hugh Zhang, and Markus Zijlstra

Meta Fundamental AI Research Diplomacy Team (FAIR) , Anton Bakhtin, Noam Brown, Emily Dinan, Gabriele Farina, Colin Flaherty, Daniel Fried, Andrew Goff, Jonathan Gray, Hengyuan Hu, Athul Paul Jacob, Mojtaba Komeili, Karthik Konath, Minae Kwon, Adam Lerer, Mike Lewis, Alexander H. Miller, Sasha Mitts, Adithya Renduchintala, Stephen Roller, Dirk Rowe, Weiya...

[74]

Milbauer, Jeremiah, Adarsh Mathew, and James Evans. 2021. Aligning Multidimensional Worldviews and Discovering Ideological Differences. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pp. 4832--4845

[75]

Molina, Mario and Filiz Garip. 2019. Machine learning for sociology. Annual Review of Sociology 45:27--45

[76]

Moore, Frances C., Nick Obradovich, Flavio Lehner, and Patrick Baylis. 2019. Rapidly Declining Remarkability of Temperature Anomalies May Obscure Public Perception of Climate Change. Proceedings of the National Academy of Sciences 116:4905--4910

[77]

Nadeem, Moin, Anna Bethke, and Siva Reddy. 2020. StereoSet: Measuring Stereotypical Bias in Pretrained Language Models. arXiv:2004.09456 https://arxiv.org/abs/2004.09456 [cs.CL]

[78]

O'Connor, Brendan, Ramnath Balasubramanyan, Bryan Routledge, and Noah Smith. 2010. From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series. In Proceedings of the Fourth International Conference on Weblogs and Social Media, pp. 122--129. AAAI Press

[79]

Park, Barum. 2018. How Are We Apart? Continuity and Change in the Structure of Ideological Disagreement in the American Public, 1980--2012. Social Forces 96:1757--1784

[80]

Park, Chan Young, Julia Mendelsohn, Karthik Radhakrishnan, Kinjal Jain, Tushar Kanakagiri, David Jurgens, and Yulia Tsvetkov. 2021. Detecting Community Sensitive Norm Violations in Online Conversations. Findings of the Association for Computational Linguistics: EMNLP 2021, pp. 3386--3397


[2]

Abid, Abubakar, Maheen Farooqi, and James Zou. 2021. Persistent Anti-Muslim Bias in Large Language Models. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, pp. 298--306. Association for Computing Machinery

[3]

Aher, Gati, Rosa I. Arriaga, and Adam Tauman Kalai. 2023. Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies. arXiv:2208.10264 https://arxiv.org/abs/2208.10264 [cs.CL]

[4]

Aiyappa, Rachith, Jisun An, Haewoon Kwak, and Yong-Yeol Ahn. 2023. Can We Trust the Evaluation on ChatGPT? arXiv:2303.12767 https://arxiv.org/abs/2303.12767 [cs.CL]

[5]

Ansolabehere, Stephen, Jonathan Rodden, and James M. Snyder. 2008. The Strength of Issues: Using Multiple Measures to Gauge Preference Stability, Ideological Constraint, and Issue Voting. The American Political Science Review 102:215--232

[6]

Argyle, Lisa P., Ethan C. Busby, Nancy Fulda, Joshua R. Gubler, Christopher Rytting, and David Wingate. 2023. Out of One, Many: Using Language Models to Simulate Human Samples. Political Analysis https://doi.org/10.1017/pan.2023.2

[7]

Atari, Mohammad, Mona J. Xue, Peter S. Park, Damián Blasi, and Joseph Henrich. 2023. Which Humans?

[8]

Bail, Christopher A. 2023. Can Generative AI Improve Social Science? SocArXiv. https://doi.org/10.31235/osf.io/rwtzs

[9]

Baldassarri, Delia and Andrew Gelman. 2008. Partisans without Constraint: Political Polarization and Trends in American Public Opinion. American Journal of Sociology 114:408--446

[10]

Baldassarri, Delia and Amir Goldberg. 2014. Neither Ideologues nor Agnostics: Alternative Voters' Belief System in an Age of Partisan Politics. American Journal of Sociology 120:45--95

[11]

Baldassarri, Delia and Barum Park. 2020. Was There a Culture War? Partisan Polarization and Secular Trends in US Public Opinion. The Journal of Politics 82:809--827

[12]

Baunach, Dawn Michelle. 2012. Changing Same-Sex Marriage Attitudes in America from 1988 Through 2010. Public Opinion Quarterly 76:364--378

[13]

Beauchamp, Nicholas. 2017. Predicting and Interpolating State-Level Polls Using Twitter Textual Data. American Journal of Political Science 61:490--503

[14]

Behr, Roy L. and Shanto Iyengar. 1985. Television News, Real-World Cues, and Changes in the Public Agenda. Public Opinion Quarterly 49:38

[15]

Berinsky, Adam J. 2017. Measuring Public Opinion with Surveys. Annual Review of Political Science 20:309--329

[16]

Blumenstock, Joshua, Gabriel Cadamuro, and Robert On. 2015. Predicting Poverty and Wealth from Mobile Phone Metadata. Science 350:1073--1076

[17]

Boutyline, Andrei and Laura K Soter. 2021. Cultural schemas: What they are, how to find them, and what to do once you’ve caught one. American Sociological Review 86:728--758

[18]

Boutyline, Andrei and Stephen Vaisey. 2017. Belief Network Analysis: A Relational Approach to Understanding the Structure of Attitudes. American Journal of Sociology 122:1371--1447

[19]

Brand, James, Ayelet Israeli, and Donald Ngwe. 2023. Using GPT for market research. Available at SSRN 4395751

[20]

Brayne, Sarah. 2020. Predict and Surveil: Data, Discretion, and the Future of Policing. Oxford University Press

[21]

Brooks, Clem and Jeff Manza. 2006. Social Policy Responsiveness in Developed Democracies. American Sociological Review 71:474--494

[22]

Brown, Tom, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, and Amanda Askell. 2020. Language Models Are Few-Shot Learners. Advances in Neural Information Processing Systems 33:1877--1901

[23]

Burstein, Paul. 2003. The Impact of Public Opinion on Public Policy: A Review and an Agenda. Political Research Quarterly 56:29--40

[24]

Cesare, Nina, Hedwig Lee, Tyler McCormick, Emma Spiro, and Emilio Zagheni. 2018. Promises and Pitfalls of Using Digital Traces for Demographic Research. Demography 55:1979--1999

[25]

Chu, Eric, Jacob Andreas, Stephen Ansolabehere, and Deb Roy. 2023. Language Models Trained on Media Diets Can Predict Public Opinion. arXiv:2303.16779 https://arxiv.org/abs/2303.16779 [cs.CL]

[26]

Converse, P. 1964. The Nature of Belief Systems in Mass Publics. In Ideology and Discontent, edited by Apter, D. E., pp. 206--261. The Free Press

[27]

Couper, Mick P. 2017. New Developments in Survey Data Collection. Annual Review of Sociology 43:121--145

[28]

Davern, Michael, Rene Bautista, Jeremy Freese, Stephen L. Morgan, and Tom W. Smith. 2021. General Social Surveys, 1972-2021 Cross-section [machine-readable data file, 68,846 cases]. Principal Investigator, Michael Davern; Co-Principal Investigators, Rene Bautista, Jeremy Freese, Stephen L. Morgan, and Tom W. Smith; Sponsored by National Science Foundation...

[29]

DellaPosta, Daniel. 2020. Pluralistic Collapse: The “Oil Spill” Model of Mass Opinion Polarization. American Sociological Review 85:507--536

[30]

DellaPosta, Daniel, Yongren Shi, and Michael Macy. 2015. Why Do Liberals Drink Lattes? American Journal of Sociology 120:1473--1511

[31]

Dillion, Danica, Niket Tandon, Yuling Gu, and Kurt Gray. 2023. Can AI Language Models Replace Human Participants? Trends in Cognitive Sciences https://doi.org/10.1016/j.tics.2023.04.008

[32]

DiMaggio, Paul, Eszter Hargittai, Coral Celeste, and Steven Shafer. 2004. Digital Inequality: From Unequal Access to Differentiated Use. In Social Inequality, edited by Kathryn M. Neckerman, pp. 355--400. Russell Sage Foundation

[33]

DiMaggio, Paul, Ramina Sotoudeh, Amir Goldberg, and Hana Shepherd. 2018. Culture out of attitudes: Relationality, population heterogeneity and attitudes toward science and religion in the US. Poetics 68:31--51

[34]

Dominguez-Olmedo, Ricardo, Moritz Hardt, and Celestine Mendler-Dünner. 2023. Questioning the survey responses of large language models. arXiv preprint arXiv:2306.07951

[35]

Downs, Anthony. 1972. Up and down with Ecology: The Issue-Attention Cycle. The Public Interest 28:38--50

[36]

Eloundou, Tyna, Sam Manning, Pamela Mishkin, and Daniel Rock. 2023. GPTs Are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models. arXiv:2303.10130 https://arxiv.org/abs/2303.10130 [econ.GN]

[37]

Ferraro, Kenneth F. and Melissa M. Farmer. 1999. Utility of Health Data from Social Surveys: Is There a Gold Standard for Measuring Morbidity? American Sociological Review 64:303--315

[38]

Floridi, Luciano, Josh Cowls, Monica Beltrametti, Raja Chatila, Patrice Chazerand, Virginia Dignum, Christoph Luetge, Robert Madelin, Ugo Pagallo, Francesca Rossi, Burkhard Schafer, Peggy Valcke, and Effy Vayena. 2018. AI4People An Ethical Framework for a Good AI Society: Opportunities, Risks, Principles, and Recommendations. Minds and Machines 28:689--707

[39]

Friedkin, Noah E and Eugene C Johnsen. 2011. Social influence network theory: A sociological examination of small group dynamics, volume 33. Cambridge University Press

[40]

Garip, Filiz. 2020. What failure to predict life outcomes can teach us. Proceedings of the National Academy of Sciences 117:8234--8235

[41]

Goldberg, Amir. 2011. Mapping Shared Understandings Using Relational Class Analysis: The Case of the Cultural Omnivore Reexamined. American Journal of Sociology 116:1397--1436

[42]

González-Gallardo, Carlos-Emiliano, Emanuela Boros, Nancy Girdhar, Ahmed Hamdi, Jose G. Moreno, and Antoine Doucet. 2023. Yes but.. Can ChatGPT Identify Entities in Historical Documents? arXiv:2303.17322 https://arxiv.org/abs/2303.17322 [cs.DL]

[43]

Gordon, Mitchell L., Michelle S. Lam, Joon Sung Park, Kayur Patel, Jeff Hancock, Tatsunori Hashimoto, and Michael S. Bernstein. 2022. Jury Learning: Integrating Dissenting Voices into Machine Learning Models. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, pp. 1--19. Association for Computing Machinery

[44]

Grimmer, Justin, Margaret E. Roberts, and Brandon M. Stewart. 2022. Text as Data: A New Framework for Machine Learning and the Social Sciences. Princeton University Press

[45]

Grossmann, Igor, Matthew Feinberg, Dawn C Parker, Nicholas A Christakis, Philip E Tetlock, and William A Cunningham. 2023. AI and the transformation of social science research. Science 380:1108--1109

[46]

Hämäläinen, Perttu, Mikke Tavast, and Anton Kunnari. 2023. Evaluating Large Language Models in Generating Synthetic HCI Research Data: A Case Study. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, pp. 1--19. Association for Computing Machinery

[47]

Hastie, Trevor J. 1992. Generalized Additive Models. In Statistical Models in S. Routledge

[48]

Hilgartner, Stephen and Charles L Bosk. 1988. The Rise and Fall of Social Problems: A Public Arenas Model. American Journal of Sociology 94:53--78

[49]

Holm, Elizabeth A. 2019. In Defense of the Black Box. Science 364:26--27

[50]

Honaker, James, Gary King, and Matthew Blackwell. 2011. Amelia II: A Program for Missing Data. Journal of Statistical Software 45:1--47

[51]

Horton, John J. 2023. Large Language Models as Simulated Economic Agents: What Can We Learn from Homo Silicus? arXiv:2301.07543 https://arxiv.org/abs/2301.07543 [econ.GN]

[52]

Igo, Sarah E. 2008. The Averaged American: Surveys, Citizens, and the Making of a Mass Public. Harvard University Press

[53]

Jansen, Bernard J., Soon-gyo Jung, and Joni Salminen. 2023. Employing large language models in survey research. Natural Language Processing Journal 4:100020

[54]

Jefferson, Hakeem. 2020. The Curious Case of Black Conservatives: Construct Validity and the 7-Point Liberal-Conservative Scale. Available at SSRN: https://ssrn.com/abstract=3602209 or http://dx.doi.org/10.2139/ssrn.3602209

[55]

Jiang, Hang, Doug Beeferman, Brandon Roy, and Deb Roy. 2022. CommunityLM: Probing Partisan Worldviews from Language Models. In Proceedings of the 29th International Conference on Computational Linguistics, pp. 6818--6826. International Committee on Computational Linguistics

[56]

Joo, Won-Tak and Jason Fletcher. 2020. Out of Sync, out of Society: Political Beliefs and Social Networks. Network Science 8:445--468

[57]

Jumper, John, Richard Evans, Alexander Pritzel, Tim Green, Michael Figurnov, Olaf Ronneberger, Kathryn Tunyasuvunakool, Russ Bates, Augustin Zídek, Anna Potapenko, Alex Bridgland, Clemens Meyer, Simon A. A. Kohl, Andrew J. Ballard, Andrew Cowie, Bernardino Romera-Paredes, Stanislav Nikolov, Rishub Jain, Jonas Adler, Trevor Back, Stig Petersen, David ...

[58]

Jurafsky, Daniel and James Martin. 2023. Speech and Language Processing, 3rd Edition Draft

[59]

Kiley, Kevin and Stephen Vaisey. 2020. Measuring Stability and Change in Personal Culture Using Panel Data. American Sociological Review 85:477--506

[60]

Kirk, Hannah Rose, Bertie Vidgen, Paul Röttger, and Scott A. Hale. 2023. Personalisation within Bounds: A Risk Taxonomy and Policy Framework for the Alignment of Large Language Models with Personalised Feedback. arXiv:2303.05453 https://arxiv.org/abs/2303.05453 [cs.CL]

[61]

Koren, Yehuda, Robert Bell, and Chris Volinsky. 2009. Matrix Factorization Techniques for Recommender Systems. Computer 42:30--37

[62]

Kozlowski, Austin C and James P Murphy. 2021. Issue alignment and partisanship in the American public: Revisiting the ‘partisans without constraint’ thesis. Social Science Research 94:102498

[63]

Kozlowski, Austin C., Matt Taddy, and James A. Evans. 2019. The Geometry of Culture: Analyzing the Meanings of Class through Word Embeddings. American Sociological Review 84:905--949

[64]

Lall, Ranjit and Thomas Robinson. 2022. The MIDAS touch: accurate and scalable missing-data imputation with deep learning. Political Analysis 30:179--196

[65]

Latour, Bruno. 2007. Reassembling the Social: An Introduction to Actor-Network-Theory. OUP Oxford
