QLAM extends state-space models with quantum superposition in the hidden state for linear-time long-sequence modeling and reports consistent gains over RNN and transformer baselines on sequential image tasks.
Transformers are rnns: Fast autoregressive transformers with linear attention,
3 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 3representative citing papers
ECO uses supervised warm-up plus iterative batched DPO on a Mamba backbone to reach top neural performance on TSP and CVRP while lowering memory growth and raising throughput.
LetsTalk combines a multimodal diffusion transformer, noise-regularized memory bank, deep compression autoencoder, and symbiotic/direct fusion schemes to achieve state-of-the-art quality and efficiency in long-duration talking video generation.
citing papers explorer
-
QLAM: A Quantum Long-Attention Memory Approach to Long-Sequence Token Modeling
QLAM extends state-space models with quantum superposition in the hidden state for linear-time long-sequence modeling and reports consistent gains over RNN and transformer baselines on sequential image tasks.
-
Rethinking Efficiency in Neural Combinatorial Optimization: Batched Preference Optimization with Mamba
ECO uses supervised warm-up plus iterative batched DPO on a Mamba backbone to reach top neural performance on TSP and CVRP while lowering memory growth and raising throughput.
-
Multimodal Diffusion Transformer with Memory Bank for Scalable Long-Duration Talking Video Generation
LetsTalk combines a multimodal diffusion transformer, noise-regularized memory bank, deep compression autoencoder, and symbiotic/direct fusion schemes to achieve state-of-the-art quality and efficiency in long-duration talking video generation.