ISBN 979-8-4007-0103-0
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields: cs.LG (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
roles: method (1)
polarities: use method (1)
representative citing papers
ARCH is a hierarchical flow-based generative model that enables tractable conditional intensity computation and arbitrary conditioning for spatiotemporal event distributions.
SNMPP builds a product-form neural influence kernel from a signed interaction network over event classes and a delay-aware monotonic temporal network to enable explicit discovery of inter-event relationships alongside strong prediction.
citing papers explorer
- Layer Collapse in Diffusion Language Models
  Diffusion language models develop early-layer collapse around an indispensable super-outlier due to overtraining, resulting in higher compressibility and reversed optimal sparsity patterns versus autoregressive models.
- Arbitrarily Conditioned Hierarchical Flows for Spatiotemporal Events
  ARCH is a hierarchical flow-based generative model that enables tractable conditional intensity computation and arbitrary conditioning for spatiotemporal event distributions.
- Structured Neural Marked Point Processes for Interpretable Event Interaction Modeling
  SNMPP builds a product-form neural influence kernel from a signed interaction network over event classes and a delay-aware monotonic temporal network to enable explicit discovery of inter-event relationships alongside strong prediction.