Automata-based constraints for language model decoding. arXiv preprint arXiv:2407.08103,
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CL (2)
Years: 2026 (2)
Representative citing papers
A 355M-parameter byte-level LM on 80B multilingual tokens exhibits UTF-8 validity converging after 4.2B tokens versus 2.1B for perplexity, with higher validity on rare characters than common ones.
citing papers explorer
- LatticeBridge: Rare-Event Sequential Inference for Faithful Structured Sequence Synthesis
A twisted sequential Monte Carlo decoder with surface automata improves exact constraint satisfaction in structured text generation over standard decoding baselines across three benchmarks.
- Beyond Perplexity: UTF-8 Validity in Byte-aware Language Models
A 355M-parameter byte-level LM on 80B multilingual tokens exhibits UTF-8 validity converging after 4.2B tokens versus 2.1B for perplexity, with higher validity on rare characters than common ones.