Vision transformer adapter for dense predictions
3 Pith papers cite this work.
Fields: cs.CV (3)
citing papers explorer
- What and Where to Adapt: Structure-Semantics Co-Tuning for Machine Vision Compression via Synergistic Adapters
S2-CoT coordinates a Structural Fidelity Adapter in the encoder-decoder with a Semantic Context Adapter in the entropy model to convert potential performance loss into state-of-the-art gains across base codecs while using only a small fraction of parameters.
- DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation
DeCo decouples high- and low-frequency generation in pixel diffusion via a DiT plus lightweight decoder and a frequency-aware flow-matching loss, reaching FID 1.62 at 256x256 and 2.22 at 512x512 on ImageNet while closing the gap to latent diffusion methods.
- Parameter-Efficient Multi-Task Learning via Progressive Task-Specific Adaptation
Introduces progressive task-specific multi-task adaptation for vision transformers, sharing adapters early and specializing later with gradient-based task allocation, outperforming prior methods on PASCAL and NYUD-v2 with fewer trainable parameters.