A training-free adaptive subspace projection method mitigates semantic collapsing in generative personalization by isolating and adjusting drift in a low-dimensional subspace, using the stable pre-trained embedding as an anchor.
arXiv preprint arXiv:2312.04461 (2024)
6 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV 6
roles: other 1
polarities: unclear 1
representative citing papers
MIBE introduces a multi-subject interaction benchmark (MIB) with silver and gold sets and a dual-head evaluator (MIE) trained on VLM labels that outperforms baselines in matching human judgments.
PortraitGen integrates real-image exemplars into GRPO sampling and applies dual rewards (OmniReward and AI-Portrait) to improve photorealism, claiming better results than baselines on a new PortraitBench.
IdGlow is a progressive two-stage diffusion framework that uses task-adaptive timestep scheduling, temporal gating, VLM prompt synthesis, and group-level DPO to balance identity preservation and scene coherence in multi-subject image generation.
A 30B-parameter transformer and related models generate high-quality videos and audio, claiming state-of-the-art results on text-to-video, video editing, personalization, and audio generation tasks.