arXiv preprint arXiv:2310.17022

Controlled decoding from language models · arXiv 2310.17022

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

TokenRatio: Principled Token-Level Preference Optimization via Ratio Matching

cs.CL · 2026-05-12 · unverdicted · novelty 7.0 · 2 refs

TBPO posits a token-level Bradley-Terry model and derives a Bregman-divergence density-ratio matching loss that generalizes DPO while preserving token-level optimality.

Generalization in LLM Problem Solving: The Case of the Shortest Path

cs.AI · 2026-04-16 · unverdicted · novelty 6.0

LLMs show strong spatial generalization to unseen maps in shortest-path tasks but fail length scaling due to recursive instability, with data coverage setting hard limits.

citing papers explorer

TokenRatio: Principled Token-Level Preference Optimization via Ratio Matching

cs.CL · 2026-05-12 · unverdicted · none · ref 152 · 2 links

TBPO posits a token-level Bradley-Terry model and derives a Bregman-divergence density-ratio matching loss that generalizes DPO while preserving token-level optimality.

Generalization in LLM Problem Solving: The Case of the Shortest Path

cs.AI · 2026-04-16 · unverdicted · none · ref 34

LLMs show strong spatial generalization to unseen maps in shortest-path tasks but fail length scaling due to recursive instability, with data coverage setting hard limits.
