Stayin' Aligned Over Time: Towards Longitudinal Human-LLM Alignment via Contextual Reflection and Privacy-Preserving Behavioral Data
Pith reviewed 2026-05-07 03:20 UTC · model grok-4.3
The pith
User preferences for LLM outputs shift between immediate feedback and later reflection after real-world consequences, showing single-moment data is incomplete.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Through the BITE system, immediate user preferences for LLM outputs were collected at the moment of interaction and then compared with preferences elicited later at contextually relevant decision points. The study showed measurable shifts in how participants assessed dimensions such as accuracy and relevance once real-world consequences had occurred, demonstrating that static, single-moment preference datasets miss these temporal dynamics in everyday LLM use.
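The comparison at the heart of this claim — the same output rated at capture time and again after consequences unfold — can be sketched as a per-dimension shift computation. The record format and dimension names below are illustrative assumptions, not the paper's actual data schema.

```python
# Hypothetical sketch: compare immediate vs. delayed preference ratings
# per dimension. The record format is assumed, not taken from the paper.

def preference_shifts(records):
    """Mean (delayed - immediate) rating change per dimension."""
    totals, counts = {}, {}
    for r in records:
        for dim, immediate in r["immediate"].items():
            delayed = r["delayed"].get(dim)
            if delayed is None:
                continue  # no follow-up reflection captured for this dimension
            totals[dim] = totals.get(dim, 0.0) + (delayed - immediate)
            counts[dim] = counts.get(dim, 0) + 1
    return {dim: totals[dim] / counts[dim] for dim in totals}

# Example: two interactions rated 1-5 at capture time and again later.
records = [
    {"immediate": {"accuracy": 5, "relevance": 4},
     "delayed":   {"accuracy": 3, "relevance": 4}},
    {"immediate": {"accuracy": 4, "relevance": 5},
     "delayed":   {"accuracy": 3, "relevance": 4}},
]
shifts = preference_shifts(records)
# accuracy: ((3-5) + (3-4)) / 2 = -1.5 ; relevance: (0 + (-1)) / 2 = -0.5
```

A negative shift on a dimension like accuracy is exactly the kind of signal a single-moment dataset would miss.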
What carries the argument
BITE, a browser-based system that detects consequential LLM interactions, issues context-triggered follow-up reflection prompts at later decision points, and gathers user-controlled privacy-preserving behavioral traces to interpret preference changes.
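The pipeline BITE implements — flag a consequential interaction, then surface a reflection prompt at a later decision point — can be caricatured as a small scheduler. Everything below (the keyword trigger, the event methods, the prompt wording) is an assumed simplification for illustration, not BITE's actual detection logic.

```python
# Illustrative sketch of a context-triggered reflection queue; the trigger
# heuristic and event fields are assumptions, not BITE's real implementation.

CONSEQUENTIAL_KEYWORDS = {"book", "buy", "apply", "invest", "diagnose"}

class ReflectionQueue:
    def __init__(self):
        self.pending = []  # interactions awaiting a follow-up prompt

    def on_llm_interaction(self, interaction_id, prompt_text):
        """Flag interactions that look consequential for later follow-up."""
        words = set(prompt_text.lower().split())
        if words & CONSEQUENTIAL_KEYWORDS:
            self.pending.append(interaction_id)

    def on_decision_point(self, interaction_id):
        """Fire a reflection prompt when the user revisits the decision."""
        if interaction_id in self.pending:
            self.pending.remove(interaction_id)
            return f"Looking back, how useful was the answer for {interaction_id}?"
        return None  # never flagged, so no follow-up

q = ReflectionQueue()
q.on_llm_interaction("trip-42", "Which flight should I book to Oslo?")
q.on_llm_interaction("chat-7", "Tell me a joke")
prompt = q.on_decision_point("trip-42")  # flagged earlier, fires a prompt
silent = q.on_decision_point("chat-7")   # not flagged, stays silent
```

A real browser-based deployment would replace the keyword heuristic with richer signals (page context, revisits, purchases), but the queue-then-trigger shape is the same.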
If this is right
- Single-moment preference datasets may misrepresent how users ultimately value LLM outputs once real-world consequences are observed.
- Alignment evaluation requires temporally distributed signals that incorporate evolving judgments over time.
- Context-triggered reflection combined with behavioral traces supplies richer data for assessing alignment in everyday settings.
- Progressive, user-controlled consent mechanisms can support ongoing data collection without constant monitoring.
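The progressive, user-controlled consent idea in the last point can be made concrete as a tiered access check: each tier unlocks strictly more behavioral-data categories than the one before. The tier names and data categories here are invented for illustration and do not come from the paper.

```python
# Hypothetical tiers of progressive consent; each level is a superset of
# the previous one, so users can step up sharing over time.

CONSENT_TIERS = [
    ("minimal",  {"preference_ratings"}),
    ("standard", {"preference_ratings", "page_titles"}),
    ("full",     {"preference_ratings", "page_titles", "interaction_logs"}),
]

def allowed_categories(tier_name):
    """Look up the data categories a consent tier permits."""
    for name, categories in CONSENT_TIERS:
        if name == tier_name:
            return categories
    raise ValueError(f"unknown consent tier: {tier_name}")

def can_collect(tier_name, category):
    """Collection is permitted only if the user's tier covers the category."""
    return category in allowed_categories(tier_name)

example = can_collect("standard", "page_titles")  # permitted under this assumed tier
```

Gating every collection call through a check like this is what lets data gathering continue "without constant monitoring": the system only ever sees what the current tier allows.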
Where Pith is reading between the lines
- If temporal shifts prove consistent, alignment training pipelines could incorporate delayed feedback loops to better match long-term user satisfaction.
- The same combination of immediate capture and later reflection could be adapted to evaluate other AI systems where outcomes unfold gradually, such as planning assistants or recommendation engines.
- Future work could test whether the magnitude of preference change varies by task type or by how long after the original interaction the reflection occurs.
Load-bearing premise
The preference differences observed across two weeks with only eight participants reflect genuine temporal shifts in alignment rather than prompting artifacts introduced by the BITE system or limitations of the small sample.
What would settle it
A larger study spanning more participants and longer periods that finds no systematic differences between immediate and delayed preferences, or finds differences that exactly match the timing and wording of BITE reflection prompts, would undermine the claim that single-moment data is generally insufficient.
Original abstract
Current human-AI alignment and evaluation methods for large language models (LLMs) often rely on preference signals collected immediately after an interaction. This practice implicitly treats preference as static, even though many LLM-mediated decisions unfold over time and may be re-evaluated differently after real-world consequences and observed outcomes. Therefore, we argue for a methodological shift from single-moment preference elicitation to longitudinal, context-situated alignment measurement. We present a methodological framework for collecting temporally grounded alignment signals by combining (1) in-situ preference capture, (2) context-triggered follow-up preference reflection, and (3) privacy-preserving behavioral traces that help interpret preference change. As an instantiation of this methodology, we introduce BITE, a browser-based system that detects consequential LLM interactions, prompts reflection across later decision points, and supports progressive, user-controlled consent for sharing behavioral data. Through a two-week longitudinal deployment study with 8 participants, our approach surfaced differences between immediate and later user preferences in accuracy, relevance, and other dimensions of the LLM output. Our findings highlight the limitations of single-moment preference datasets and underscore the importance of longitudinal methods for alignment evaluation in everyday use.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper argues that current human-LLM alignment methods rely on immediate post-interaction preferences, which treat preferences as static despite real-world re-evaluation over time. It proposes a methodological framework for longitudinal alignment via (1) in-situ preference capture, (2) context-triggered follow-up reflection, and (3) privacy-preserving behavioral traces. The BITE browser-based system is presented as an implementation that detects consequential interactions, prompts later reflection, and manages consent. A two-week deployment study with 8 participants is reported to have surfaced differences between immediate and later preferences on dimensions including accuracy and relevance of LLM outputs, highlighting limitations of single-moment datasets.
Significance. If the empirical claims can be substantiated with stronger controls and larger samples, the work would usefully draw attention to temporal dynamics in user preferences for LLM evaluation, a topic of growing relevance in HCI and AI alignment. The combination of reflective prompts with behavioral logging offers a concrete direction for context-situated measurement. The study provides only preliminary evidence, however, so the significance remains prospective rather than demonstrated.
Major comments (2)
- [Abstract / Study Description] The central claim that the two-week study 'surfaced differences' and thereby 'highlight[s] the limitations of single-moment preference datasets' rests on an N=8 deployment without a control arm (i.e., participants who log interactions but receive no reflection prompts). This design cannot separate genuine temporal preference evolution from effects induced by the BITE system's own context-triggered prompts, directly weakening the methodological argument.
- [Abstract] No details are supplied on statistical tests, exclusion criteria, pre-registered analysis plan, effect sizes, or inter-rater reliability for any qualitative coding of preference dimensions. Without these, the reported differences cannot be evaluated for reliability or generalizability, which is load-bearing for the claim that single-moment methods are broadly insufficient.
Minor comments (1)
- [Abstract] The phrase 'other dimensions of the LLM output' is underspecified; listing the additional dimensions examined and the measurement approach (scales, themes, etc.) would improve clarity.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. The comments highlight important considerations for strengthening the presentation of our exploratory deployment study. We address each major comment below and have revised the manuscript to improve clarity on study limitations, analysis procedures, and the scope of our claims.
Point-by-point responses
- Referee: [Abstract / Study Description] The central claim that the two-week study 'surfaced differences' and thereby 'highlight[s] the limitations of single-moment preference datasets' rests on an N=8 deployment without a control arm (i.e., participants who log interactions but receive no reflection prompts). This design cannot separate genuine temporal preference evolution from effects induced by the BITE system's own context-triggered prompts, directly weakening the methodological argument.
Authors: We agree that the absence of a control arm (in which interactions are logged without reflection prompts) prevents full isolation of natural temporal preference shifts from any re-evaluation induced by the prompts. The two-week deployment was conceived as an initial, naturalistic illustration of the proposed methodological framework and BITE system rather than a controlled experiment establishing causality. The observed differences between immediate and later-elicited preferences nonetheless demonstrate that single-moment captures can miss dimensions that become salient after reflection and real-world use. In the revised manuscript we have expanded the limitations subsection to explicitly discuss the lack of a control condition, clarified that the primary contribution lies in the framework and system design, and adjusted language in the abstract and discussion to characterize the findings as preliminary and illustrative. revision: partial
- Referee: [Abstract] No details are supplied on statistical tests, exclusion criteria, pre-registered analysis plan, effect sizes, or inter-rater reliability for any qualitative coding of preference dimensions. Without these, the reported differences cannot be evaluated for reliability or generalizability, which is load-bearing for the claim that single-moment methods are broadly insufficient.
Authors: The deployment study employed qualitative thematic analysis of the preference reflections and behavioral traces rather than quantitative hypothesis testing; therefore no statistical tests or effect sizes were performed. No pre-registered analysis plan was used, consistent with the exploratory character of the work. All eight participants completed the full two-week period, so no exclusion criteria were applied. Qualitative coding of preference dimensions (accuracy, relevance, etc.) was conducted by a single researcher with iterative refinement against the raw reflections; formal inter-rater reliability metrics were not computed. The revised manuscript now includes a dedicated analysis-methods subsection describing the coding process and adds an expanded limitations paragraph addressing generalizability and the preliminary nature of the evidence. revision: yes
Circularity Check
No significant circularity: empirical user-study proposal with no derivations or self-referential modeling
full rationale
The paper is a methodological proposal instantiated via a two-week deployment study with 8 participants. It contains no equations, fitted parameters, predictive models, or derivation chains. Claims about differences in immediate vs. later preferences are presented as direct observations from the BITE system deployment rather than outputs derived from prior self-citations or ansatzes. No load-bearing uniqueness theorems, self-definitional constructs, or renaming of known results appear. The work is self-contained as an empirical contribution; any concerns about sample size or control conditions are validity issues, not circularity.
Axiom & Free-Parameter Ledger
Axioms (2)
- Domain assumption: User preferences for LLM outputs can meaningfully change over time after observing real-world consequences.
- Domain assumption: Privacy-preserving behavioral traces can be collected and interpreted without introducing new privacy risks or biasing reflections.
Invented entities (1)
- BITE browser-based system (no independent evidence)