DEMASK adds a lightweight pairwise-dependency predictor to dLLMs and uses greedy selection to enable parallel unmasking whose total-variation error is provably bounded under sub-additivity.
2 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CL 2
verdicts
UNVERDICTED 2
representative citing papers
Language models encode modal categories via linear difference vectors in their activations that predict fine-grained human plausibility judgments better than prior reports suggested.
citing papers explorer
- Dependency-Guided Parallel Decoding in Discrete Diffusion Language Models
DEMASK adds a lightweight pairwise-dependency predictor to dLLMs and uses greedy selection to enable parallel unmasking whose total-variation error is provably bounded under sub-additivity.
- Is This Just Fantasy? Language Model Representations Reflect Human Judgments of Event Plausibility
Language models encode modal categories via linear difference vectors in their activations that predict fine-grained human plausibility judgments better than prior reports suggested.