A Theory of Time-Sensitive Language Generation: Sparse Hallucination Beats Mode Collapse

Atul Ganju; Shaddin Dughmi; Shang-Hua Teng; Travis McVoy

arxiv: 2605.11302 · v2 · pith:RC5VFPV3new · submitted 2026-05-11 · 💻 cs.LG · cs.AI· cs.CL

A Theory of Time-Sensitive Language Generation: Sparse Hallucination Beats Mode Collapse

Atul Ganju , Travis McVoy , Shaddin Dughmi , Shang-Hua Teng This is my paper

Pith reviewed 2026-05-21 07:49 UTC · model grok-4.3

classification 💻 cs.LG cs.AIcs.CL

keywords language generationhallucinationtimelinessconsistencydeadline functionsdensity

0 comments

The pith

Generators with a hallucination rate that vanishes over time can meet any superlinear deadline while covering the language at optimal density.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies language generation where strings must appear before deadlines set by their rank in a global preference ordering. Eventually consistent generators cannot produce outputs in time under this model. Relaxing consistency so that the rate of incorrect outputs drops toward zero as time passes allows the generator to achieve optimal density for every superlinear deadline function. The bound is tight: the same relaxation still rules out timely generation when deadlines are linear.

Core claim

Timely generation is impossible for eventually consistent generators. Under the mildest natural relaxation of consistency—a hallucination rate that vanishes over time—we can achieve optimal density with respect to any superlinear deadline function. This is tight by ruling out timely generation with linear deadlines and vanishing hallucination rate.

What carries the argument

Vanishing hallucination rate, the condition that the fraction of incorrect strings produced tends to zero as time increases, which relaxes eventual consistency enough to permit optimal timely coverage for superlinear deadlines.

If this is right

Eventually consistent generators cannot satisfy any nontrivial timeliness requirement under rank-based deadlines.
A vanishing hallucination rate suffices for optimal density against every superlinear deadline function.
Linear deadlines remain impossible to meet even when the hallucination rate vanishes.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The result indicates that sparse, diminishing errors can substitute for strict consistency in settings where both coverage and speed matter.
The same relaxation may apply to other sequential generation problems that balance quality ordering against production timing.

Load-bearing premise

A global preference ordering on strings exists and each string's deadline is set by a function of its rank in that ordering.

What would settle it

An explicit construction of a generator whose hallucination rate goes to zero yet fails to hit optimal density on some superlinear deadline function, or succeeds on a linear deadline function.

Figures

Figures reproduced from arXiv: 2605.11302 by Atul Ganju, Shaddin Dughmi, Shang-Hua Teng, Travis McVoy.

**Figure 1.** Figure 1: Visual for behavior of SBG algorithm: every tick mark represents a generated element and larger tick marks represent deadlines for corresponding prefixes of the target language. Since C(m) → ∞, the true target K = L i ⋆ is eventually included among these candidates. From that point onward, a 1/C(m) fraction of speculative steps in epoch m target K, so each epoch yields a constant expected number of fresh e… view at source ↗

read the original abstract

We study language generation in the limit under a global preference ordering on strings, as introduced by Kleinberg and Wei. As is done in previous work, we aim for breadth, but impose an additional requirement of timeliness: higher-ranked strings should be generated earlier. A string is then only credited if it is generated before a deadline, where its deadline is defined by a function that maps a string's rank in the target language to the time by which it must be produced. This is in keeping with a central consideration in machine learning, where inductive bias favors ``simpler'' or ``more plausible'' outputs, all else being equal. We show that timely generation is impossible in a strong sense for eventually consistent generators -- the protagonists of most prior related work. Under what is perhaps the mildest natural relaxation of consistency, a hallucination rate that vanishes over time, we show that we can circumvent our impossibility result. In particular, we can achieve optimal density with respect to any superlinear deadline function. We also show this is tight by ruling out timely generation with linear deadlines and vanishing hallucination rate.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper studies language generation in the limit under a global preference ordering on strings. It imposes timeliness via rank-based deadline functions and proves that eventually consistent generators cannot achieve timely generation. Under the relaxation to a vanishing hallucination rate, the authors establish that optimal density is achievable for any superlinear deadline function and prove tightness by showing impossibility for linear deadlines.

Significance. If the central results hold, the work supplies a clean theoretical separation between strong consistency and a mild relaxation that permits sparse hallucinations, while delivering tight characterizations with respect to deadline growth rates. This directly addresses a core tension in modern language modeling between breadth and the inductive bias toward simpler outputs, and the tightness result is a notable strength.

minor comments (3)

[§3] §3, Definition 2.1: the formal definition of the vanishing hallucination rate is introduced without an immediate comparison to the eventual-consistency condition used in the impossibility result; adding a short remark on the quantitative difference would improve readability.
[Theorem 4.3] Theorem 4.3: the statement that density is optimal for superlinear deadlines would be easier to parse if the precise notion of 'optimal density' (e.g., the asymptotic ratio) were restated in the theorem statement rather than only in the surrounding prose.
[Figure 1] Figure 1: the caption does not indicate whether the plotted curves are theoretical bounds or empirical simulations; a single clarifying sentence would prevent misinterpretation.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of our work and the recommendation for minor revision. The provided summary correctly captures the core contributions: the impossibility of timely generation for eventually consistent generators, the achievability of optimal density under vanishing hallucination rates for superlinear deadlines, and the matching impossibility result for linear deadlines.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper explicitly defines its model via a global preference ordering on strings and a deadline function mapping rank to production time, then proves an impossibility result for eventually consistent generators under timeliness. It separately establishes a positive result achieving optimal density for any superlinear deadline under the vanishing hallucination rate relaxation, plus a matching impossibility for linear deadlines. These steps are presented as independent theorems with explicit tightness arguments. The reference to Kleinberg and Wei supplies only the base setup for breadth; the timeliness constraint, hallucination relaxation, and density results are derived from the stated assumptions without reducing to fitted parameters, self-definitions, or load-bearing self-citations. The derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

Review is abstract-only so ledger is necessarily incomplete; the central claims rest on the existence of a global preference ordering and the technical definition of vanishing hallucination rate.

axioms (1)

domain assumption Existence of a global preference ordering on strings
Introduced by Kleinberg and Wei and used to define ranks and deadlines.

invented entities (1)

Vanishing hallucination rate no independent evidence
purpose: Mildest natural relaxation of eventual consistency to enable timely generation
Defined as the rate of incorrect outputs going to zero over time.

pith-pipeline@v0.9.0 · 5735 in / 1281 out tokens · 45114 ms · 2026-05-21T07:49:48.383252+00:00 · methodology

Review history (2 revisions) →

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Constants/AlphaDerivationExplicit.lean phi_golden_ratio, phi_fixed_point echoes

?

echoes
ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

Conjecture 1 (The Golden-Ratio Barrier for Timely Language Generation). ... Balancing these two exponents leads to the relation α² + α = 1, whose positive solution is α = (√5−1)/2. Equivalently, the corresponding delay exponent is the golden ratio φ = (1 + √5)/2.
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel, J_uniquely_calibrated_via_higher_derivative echoes

?

echoes
ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

We show that timely generation is impossible ... for eventually consistent generators. Under ... vanishing hallucination rate ... optimal density with respect to any superlinear deadline function ... tight by ruling out ... linear deadlines

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

31 extracted references · 31 canonical work pages

[1]

D. Angluin. Inductive inference of formal languages from positive data.Information and Control, 45(2):117–135, 1980. 10

work page 1980
[2]

Arjovsky and L

M. Arjovsky and L. Bottou. Towards principled methods for training generative adversarial networks. InInternational Conference on Learning Representations, 2017

work page 2017
[3]

Arjovsky, S

M. Arjovsky, S. Chintala, and L. Bottou. Wasserstein generative adversarial networks. In D. Precup and Y. W. Teh, editors,Proceedings of the 34th International Conference on Machine Learning, volume 70 ofProceedings of Machine Learning Research, pages 214–223. PMLR, 06–11 Aug 2017

work page 2017
[4]

Y. Bai, D. Panigrahi, and I. Zhang. Language generation in the limit: Noise, loss, and feedback. In Proceedings of the 2026 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 794–

work page 2026
[5]

Charikar and C

M. Charikar and C. Pabbaraju. Exploring facets of language generation in the limit. In N. Haghtalab and A. Moitra, editors,Proceedings of Thirty Eighth Conference on Learning Theory, volume 291 of Proceedings of Machine Learning Research, pages 854–887. PMLR, 30 Jun–04 Jul 2025

work page 2025
[6]

Charikar and C

M. Charikar and C. Pabbaraju. Pareto-optimal non-uniform language generation. In37th International Conference on Algorithmic Learning Theory, 2026

work page 2026
[7]

T. Feng, T. H. Trinh, G. Bingham, D. Hwang, Y. Chervonyi, J. Jung, J. Lee, C. Pagano, S. hyun Kim, F. Pasqualotto, S. Gukov, J. N. Lee, J. Kim, K. Hou, G. Ghiasi, Y. Tay, Y. Li, C. Kuang, Y. Liu, H. Lin, E. Z. Liu, N. Nayakanti, X. Yang, H.-T. Cheng, D. Hassabis, K. Kavukcuoglu, Q. V. Le, and T. Luong. Towards autonomous mathematics research, 2026

work page 2026
[8]

D. A. Freedman. On tail probabilities for martingales.The Annals of Probability, 3(1):100–118, 1975

work page 1975
[9]

L. Gao, A. Madaan, S. Zhou, U. Alon, P. Liu, Y. Yang, J. Callan, and G. Neubig. Pal: program-aided language models. InProceedings of the 40th International Conference on Machine Learning, ICML’23. JMLR.org, 2023

work page 2023
[10]

E. M. Gold. Language identification in the limit.Information and Control, 10(5):447–474, 1967

work page 1967
[11]

Hanneke, A

S. Hanneke, A. Karbasi, A. Mehrotra, and G. Velegkas. On union-closedness of language generation. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2026

work page 2026
[12]

A. T. Kalai and S. S. Vempala. Calibrated language models must hallucinate. InProceedings of the 56th Annual ACM Symposium on Theory of Computing, STOC 2024, page 160–171, New York, NY, USA, 2024. Association for Computing Machinery

work page 2024
[13]

Kalavasis, A

A. Kalavasis, A. Mehrotra, and G. Velegkas. On the limits of language generation: Trade-offs between hallucination and mode-collapse. InProceedings of the 57th Annual ACM Symposium on Theory of Computing, STOC ’25, page 1732–1743, New York, NY, USA, 2025. Association for Computing Ma- chinery

work page 2025
[14]

Kalavasis, A

A. Kalavasis, A. Mehrotra, and G. Velegkas. On characterizations for language generation: Interplay of hallucinations, breadth, and stability. In37th International Conference on Algorithmic Learning Theory, 2026

work page 2026
[15]

Karbasi, O

A. Karbasi, O. Montasser, J. Sous, and G. Velegkas. (im)possibility of automated hallucination detection in large language models. InICML 2025 Workshop on Reliable and Responsible Foundation Models, 2025

work page 2025
[16]

Kleinberg and S

J. Kleinberg and S. Mullainathan. Language generation in the limit. In A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang, editors,Advances in Neural Information Processing Systems, volume 37, pages 66058–66079. Curran Associates, Inc., 2024

work page 2024
[17]

Kleinberg and F

J. Kleinberg and F. Wei. Density measures for language generation. In2025 IEEE 66th Annual Symposium on Foundations of Computer Science (FOCS), pages 620–658, 2025

work page 2025
[18]

Kleinberg and F

J. Kleinberg and F. Wei. Language generation and identification from partial enumeration: Tight density bounds and topological characterizations, 2025. arxiv preprint; accepted to STOC 2026. 11

work page 2025
[19]

J. Li, Y. Xia, R. Yan, H. Sun, D. Zhao, and T.-Y. Liu. Stylized dialogue generation with multi-pass dual learning. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P. Liang, and J. W. Vaughan, editors,Advances in Neural Information Processing Systems, volume 34, pages 28470–28481. Curran Associates, Inc., 2021

work page 2021
[20]

Mehrotra, G

A. Mehrotra, G. Velegkas, X. Yu, and F. Zhou. Language generation with infinite contamination, 2025. arxiv preprint; to appear in COLT 2026

work page 2025
[21]

Peale, V

C. Peale, V. Raman, and O. Reingold. Representative language generation. In A. Singh, M. Fazel, D. Hsu, S. Lacoste-Julien, F. Berkenkamp, T. Maharaj, K. Wagstaff, and J. Zhu, editors,Proceedings of the 42nd International Conference on Machine Learning, volume 267 ofProceedings of Machine Learning Research, pages 48518–48541. PMLR, 13–19 Jul 2025

work page 2025
[22]

Pillutla, S

K. Pillutla, S. Swayamdipta, R. Zellers, J. Thickstun, S. Welleck, Y. Choi, and Z. Harchaoui. Mauve: Measuring the gap between neural text and human text using divergence frontiers. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P. Liang, and J. W. Vaughan, editors,Advances in Neural Information Processing Systems, volume 34, pages 4816–4828. Curran Associate...

work page 2021
[23]

C. Qian, P. Han, Q. Luo, B. He, X. Chen, Y. Zhang, H. Du, J. Yao, X. Yang, D. Zhang, Y. Li, and H. Ji. EscapeBench: Towards advancing creative intelligence of language model agents. In W. Che, J. Nabende, E. Shutova, and M. T. Pilehvar, editors,Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers),...

work page 2025
[24]

Raman and V

A. Raman and V. Raman. Generation from noisy examples. In A. Singh, M. Fazel, D. Hsu, S. Lacoste- Julien, F. Berkenkamp, T. Maharaj, K. Wagstaff, and J. Zhu, editors,Proceedings of the 42nd Inter- national Conference on Machine Learning, volume 267 ofProceedings of Machine Learning Research, pages 51079–51093. PMLR, 13–19 Jul 2025

work page 2025
[25]

Raman, J

V. Raman, J. Li, and A. Tewari. Generation through the lens of learning theory. In N. Haghtalab and A. Moitra, editors,Proceedings of Thirty Eighth Conference on Learning Theory, volume 291 of Proceedings of Machine Learning Research, pages 4740–4776. PMLR, 30 Jun–04 Jul 2025

work page 2025
[26]

J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter, F. Xia, E. Chi, Q. V. Le, and D. Zhou. Chain-of-thought prompting elicits reasoning in large language models. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, editors,Advances in Neural Information Processing Systems, volume 35, pages 24824–24837. Curran Associates, Inc., 2022

work page 2022
[27]

preference order

J. Zhang, S. Yu, D. Chong, A. Sicilia, M. R. Tomz, C. D. Manning, and W. Shi. Verbalized sampling: How to mitigate mode collapse and unlock llm diversity, 2025. A Additional Discussion of Related Work The literature on language generation is vast and multifaceted. We therefore cannot do all of it justice, and emphasize only the most relevant papers to our...

work page 2025
[29]

That is, the algorithm eventually only produces guesses that are consistent with the target language, and moreover, it guesses the target language correctly infinitely often

there exists a finite timeTsuch that for allt≥T, ˆLt ⊆K. That is, the algorithm eventually only produces guesses that are consistent with the target language, and moreover, it guesses the target language correctly infinitely often. Using this procedure, Kleinberg and Wei [17] construct a generation algorithmAsatisfyingµ up(A,∞) = 1/2, which can easily be ...

work page
[30]

ˆLt =Kfor infinitely manyt, and

work page
[31]

The algorithmGCGmaintains two objects: a listG= (g 1,

there exists a finite timeTsuch that for allt≥T, ˆLt ⊆K. The algorithmGCGmaintains two objects: a listG= (g 1, . . . , g|G|) of candidate languages which is initialized to be empty and a stage indexm∈Nwhich is initialized to be 0. The listGis constructed from the guesses output byAccurateas follows: whenever at some roundtwe have ˆLt ⊋ ˆLt−1, the guess ˆL...

work page
[32]

Gis nonempty and stops growing after some finite time:SupposeGis nonempty and there exists some final stagem ⋆ such thatg m⋆ is the last element ever appended toG

Therefore, lim sup i→∞ µi(GCG(E), K, i)≥ 1 2. Gis nonempty and stops growing after some finite time:SupposeGis nonempty and there exists some final stagem ⋆ such thatg m⋆ is the last element ever appended toG. LetT ⋆ denote the time at which gm⋆ is appended toG. Since there are infinitely many roundstfor which ˆLt =K, and sinceT ⋆ is finite, there must al...

work page

[1] [1]

D. Angluin. Inductive inference of formal languages from positive data.Information and Control, 45(2):117–135, 1980. 10

work page 1980

[2] [2]

Arjovsky and L

M. Arjovsky and L. Bottou. Towards principled methods for training generative adversarial networks. InInternational Conference on Learning Representations, 2017

work page 2017

[3] [3]

Arjovsky, S

M. Arjovsky, S. Chintala, and L. Bottou. Wasserstein generative adversarial networks. In D. Precup and Y. W. Teh, editors,Proceedings of the 34th International Conference on Machine Learning, volume 70 ofProceedings of Machine Learning Research, pages 214–223. PMLR, 06–11 Aug 2017

work page 2017

[4] [4]

Y. Bai, D. Panigrahi, and I. Zhang. Language generation in the limit: Noise, loss, and feedback. In Proceedings of the 2026 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 794–

work page 2026

[5] [5]

Charikar and C

M. Charikar and C. Pabbaraju. Exploring facets of language generation in the limit. In N. Haghtalab and A. Moitra, editors,Proceedings of Thirty Eighth Conference on Learning Theory, volume 291 of Proceedings of Machine Learning Research, pages 854–887. PMLR, 30 Jun–04 Jul 2025

work page 2025

[6] [6]

Charikar and C

M. Charikar and C. Pabbaraju. Pareto-optimal non-uniform language generation. In37th International Conference on Algorithmic Learning Theory, 2026

work page 2026

[7] [7]

T. Feng, T. H. Trinh, G. Bingham, D. Hwang, Y. Chervonyi, J. Jung, J. Lee, C. Pagano, S. hyun Kim, F. Pasqualotto, S. Gukov, J. N. Lee, J. Kim, K. Hou, G. Ghiasi, Y. Tay, Y. Li, C. Kuang, Y. Liu, H. Lin, E. Z. Liu, N. Nayakanti, X. Yang, H.-T. Cheng, D. Hassabis, K. Kavukcuoglu, Q. V. Le, and T. Luong. Towards autonomous mathematics research, 2026

work page 2026

[8] [8]

D. A. Freedman. On tail probabilities for martingales.The Annals of Probability, 3(1):100–118, 1975

work page 1975

[9] [9]

L. Gao, A. Madaan, S. Zhou, U. Alon, P. Liu, Y. Yang, J. Callan, and G. Neubig. Pal: program-aided language models. InProceedings of the 40th International Conference on Machine Learning, ICML’23. JMLR.org, 2023

work page 2023

[10] [10]

E. M. Gold. Language identification in the limit.Information and Control, 10(5):447–474, 1967

work page 1967

[11] [11]

Hanneke, A

S. Hanneke, A. Karbasi, A. Mehrotra, and G. Velegkas. On union-closedness of language generation. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2026

work page 2026

[12] [12]

A. T. Kalai and S. S. Vempala. Calibrated language models must hallucinate. InProceedings of the 56th Annual ACM Symposium on Theory of Computing, STOC 2024, page 160–171, New York, NY, USA, 2024. Association for Computing Machinery

work page 2024

[13] [13]

Kalavasis, A

A. Kalavasis, A. Mehrotra, and G. Velegkas. On the limits of language generation: Trade-offs between hallucination and mode-collapse. InProceedings of the 57th Annual ACM Symposium on Theory of Computing, STOC ’25, page 1732–1743, New York, NY, USA, 2025. Association for Computing Ma- chinery

work page 2025

[14] [14]

Kalavasis, A

A. Kalavasis, A. Mehrotra, and G. Velegkas. On characterizations for language generation: Interplay of hallucinations, breadth, and stability. In37th International Conference on Algorithmic Learning Theory, 2026

work page 2026

[15] [15]

Karbasi, O

A. Karbasi, O. Montasser, J. Sous, and G. Velegkas. (im)possibility of automated hallucination detection in large language models. InICML 2025 Workshop on Reliable and Responsible Foundation Models, 2025

work page 2025

[16] [16]

Kleinberg and S

J. Kleinberg and S. Mullainathan. Language generation in the limit. In A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang, editors,Advances in Neural Information Processing Systems, volume 37, pages 66058–66079. Curran Associates, Inc., 2024

work page 2024

[17] [17]

Kleinberg and F

J. Kleinberg and F. Wei. Density measures for language generation. In2025 IEEE 66th Annual Symposium on Foundations of Computer Science (FOCS), pages 620–658, 2025

work page 2025

[18] [18]

Kleinberg and F

J. Kleinberg and F. Wei. Language generation and identification from partial enumeration: Tight density bounds and topological characterizations, 2025. arxiv preprint; accepted to STOC 2026. 11

work page 2025

[19] [19]

J. Li, Y. Xia, R. Yan, H. Sun, D. Zhao, and T.-Y. Liu. Stylized dialogue generation with multi-pass dual learning. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P. Liang, and J. W. Vaughan, editors,Advances in Neural Information Processing Systems, volume 34, pages 28470–28481. Curran Associates, Inc., 2021

work page 2021

[20] [20]

Mehrotra, G

A. Mehrotra, G. Velegkas, X. Yu, and F. Zhou. Language generation with infinite contamination, 2025. arxiv preprint; to appear in COLT 2026

work page 2025

[21] [21]

Peale, V

C. Peale, V. Raman, and O. Reingold. Representative language generation. In A. Singh, M. Fazel, D. Hsu, S. Lacoste-Julien, F. Berkenkamp, T. Maharaj, K. Wagstaff, and J. Zhu, editors,Proceedings of the 42nd International Conference on Machine Learning, volume 267 ofProceedings of Machine Learning Research, pages 48518–48541. PMLR, 13–19 Jul 2025

work page 2025

[22] [22]

Pillutla, S

K. Pillutla, S. Swayamdipta, R. Zellers, J. Thickstun, S. Welleck, Y. Choi, and Z. Harchaoui. Mauve: Measuring the gap between neural text and human text using divergence frontiers. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P. Liang, and J. W. Vaughan, editors,Advances in Neural Information Processing Systems, volume 34, pages 4816–4828. Curran Associate...

work page 2021

[23] [23]

C. Qian, P. Han, Q. Luo, B. He, X. Chen, Y. Zhang, H. Du, J. Yao, X. Yang, D. Zhang, Y. Li, and H. Ji. EscapeBench: Towards advancing creative intelligence of language model agents. In W. Che, J. Nabende, E. Shutova, and M. T. Pilehvar, editors,Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers),...

work page 2025

[24] [24]

Raman and V

A. Raman and V. Raman. Generation from noisy examples. In A. Singh, M. Fazel, D. Hsu, S. Lacoste- Julien, F. Berkenkamp, T. Maharaj, K. Wagstaff, and J. Zhu, editors,Proceedings of the 42nd Inter- national Conference on Machine Learning, volume 267 ofProceedings of Machine Learning Research, pages 51079–51093. PMLR, 13–19 Jul 2025

work page 2025

[25] [25]

Raman, J

V. Raman, J. Li, and A. Tewari. Generation through the lens of learning theory. In N. Haghtalab and A. Moitra, editors,Proceedings of Thirty Eighth Conference on Learning Theory, volume 291 of Proceedings of Machine Learning Research, pages 4740–4776. PMLR, 30 Jun–04 Jul 2025

work page 2025

[26] [26]

J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter, F. Xia, E. Chi, Q. V. Le, and D. Zhou. Chain-of-thought prompting elicits reasoning in large language models. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, editors,Advances in Neural Information Processing Systems, volume 35, pages 24824–24837. Curran Associates, Inc., 2022

work page 2022

[27] [27]

preference order

J. Zhang, S. Yu, D. Chong, A. Sicilia, M. R. Tomz, C. D. Manning, and W. Shi. Verbalized sampling: How to mitigate mode collapse and unlock llm diversity, 2025. A Additional Discussion of Related Work The literature on language generation is vast and multifaceted. We therefore cannot do all of it justice, and emphasize only the most relevant papers to our...

work page 2025

[28] [29]

That is, the algorithm eventually only produces guesses that are consistent with the target language, and moreover, it guesses the target language correctly infinitely often

there exists a finite timeTsuch that for allt≥T, ˆLt ⊆K. That is, the algorithm eventually only produces guesses that are consistent with the target language, and moreover, it guesses the target language correctly infinitely often. Using this procedure, Kleinberg and Wei [17] construct a generation algorithmAsatisfyingµ up(A,∞) = 1/2, which can easily be ...

work page

[29] [30]

ˆLt =Kfor infinitely manyt, and

work page

[30] [31]

The algorithmGCGmaintains two objects: a listG= (g 1,

there exists a finite timeTsuch that for allt≥T, ˆLt ⊆K. The algorithmGCGmaintains two objects: a listG= (g 1, . . . , g|G|) of candidate languages which is initialized to be empty and a stage indexm∈Nwhich is initialized to be 0. The listGis constructed from the guesses output byAccurateas follows: whenever at some roundtwe have ˆLt ⊋ ˆLt−1, the guess ˆL...

work page

[31] [32]

Gis nonempty and stops growing after some finite time:SupposeGis nonempty and there exists some final stagem ⋆ such thatg m⋆ is the last element ever appended toG

Therefore, lim sup i→∞ µi(GCG(E), K, i)≥ 1 2. Gis nonempty and stops growing after some finite time:SupposeGis nonempty and there exists some final stagem ⋆ such thatg m⋆ is the last element ever appended toG. LetT ⋆ denote the time at which gm⋆ is appended toG. Since there are infinitely many roundstfor which ˆLt =K, and sinceT ⋆ is finite, there must al...

work page