Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL)

TruthfulQA: Measuring How Models Mimic Human Falsehoods

2 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Team-Based Self-Play With Dual Adaptive Weighting for Fine-Tuning LLMs

cs.CL · 2026-05-11 · unverdicted · novelty 6.0

TPAW uses teams of current and historical model checkpoints that collaborate and compete, plus adaptive weightings for responses and players, to improve self-supervised LLM alignment and outperform baselines.

Data-dependent Exploration for Online Reinforcement Learning from Human Feedback

cs.LG · 2026-05-06 · unverdicted · novelty 6.0

DEPO uses historical data to build a data-dependent uncertainty bonus for exploration in online RLHF, yielding an adaptive regret bound and stronger empirical performance than baselines.

citing papers explorer

Showing 2 of 2 citing papers.

Team-Based Self-Play With Dual Adaptive Weighting for Fine-Tuning LLMs

cs.CL · 2026-05-11 · unverdicted · none · ref 51

Data-dependent Exploration for Online Reinforcement Learning from Human Feedback

cs.LG · 2026-05-06 · unverdicted · none · ref 21
