hub
Training data-efficient image transformers & distillation through attention
10 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
- TCP-SSM: Efficient Vision State Space Models with Token-Conditioned Poles
  TCP-SSM conditions stable poles on visual tokens to explicitly control memory decay and oscillation in SSMs, cutting computation up to 44% while matching or exceeding accuracy on classification, segmentation, and detection.
- QKFormer: Hierarchical Spiking Transformer using Q-K Attention
  A hierarchical spiking transformer using Q-K attention achieves 85.65% top-1 accuracy on ImageNet-1K, the first direct-trained SNN to exceed 85%.
- Registers Matter for Pixel-Space Diffusion Transformers
  Register tokens enhance pixel-space DiT training and output quality via cleaner high-noise feature maps, and a dual-stream design adds further gains with little overhead.
- Nonlinear Bipolar Compensation: Handling Outliers in Post-Training Quantization
  Nonlinear Bipolar Compensation with Bipolar Logarithmic Transformation reduces outlier effects in post-training quantization by performing compensation in a compressed transformed space.
- Taming Outlier Tokens in Diffusion Transformers
  Outlier tokens in DiTs are addressed with Dual-Stage Registers, which reduce artifacts and improve image generation on ImageNet and text-to-image tasks.
- Linear-Time Global Visual Modeling without Explicit Attention
  Dynamic parameterization of standard layers can replace explicit attention for linear-time global visual modeling.
- TRUST: Test-Time Refinement using Uncertainty-Guided SSM Traverses
  TRUST is a test-time adaptation method for SSM vision models that uses uncertainty-guided traversal permutations to refine Mamba parameters via pseudo-labels and weight averaging, improving robustness on distribution shifts.
- ASAP: Attention Sink Anchored Pruning
  ASAP prunes tokens in ViTs by anchoring on attention sinks modeled as lazy random walks, using cumulative transition matrices and radial diffusion clustering to compress redundancy while preserving accuracy.
- Micro-Defects Expose Macro-Fakes: Detecting AI-Generated Images via Local Distributional Shifts
  MDMF detects AI-generated images by learning patch-level forensic signatures and quantifying their distributional discrepancies with MMD, yielding larger separation than global methods when micro-defects are present.
- Colinearity Decay: Training Quantization-Friendly ViTs with Outlier Decay
  Colinearity-Decay regularizer trains ViTs that maintain or improve full-precision accuracy while delivering higher accuracy after low-bit quantization on ImageNet and COCO tasks.