StochasT uses stochastic clustering of language tasks into varying turn depths for the same image to improve LVLMs on both single-turn and multi-turn scenarios without discarding data.
Personalized visual instruction tuning
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 4verdicts
UNVERDICTED 4roles
background 1polarities
background 1representative citing papers
Introduces Personal VCL formalization and benchmark revealing LMM context gaps, plus an Agentic Context Bank baseline that boosts personalized visual reasoning.
PersonaVLM adds memory extraction, multi-turn retrieval-based reasoning, and personality inference to multimodal LLMs, yielding 22.4% gains on a new long-term personalization benchmark and outperforming GPT-4o.
ICPT adds an adaptive-length projection module and two geometric regularizations to enable efficient, high-accuracy personalization of LVLMs across complex multi-concept tasks.
citing papers explorer
-
StochasT: Learning with Stochastic Turn Depth for Visual Instruction Tuning
StochasT uses stochastic clustering of language tasks into varying turn depths for the same image to improve LVLMs on both single-turn and multi-turn scenarios without discarding data.
-
Personal Visual Context Learning in Large Multimodal Models
Introduces Personal VCL formalization and benchmark revealing LMM context gaps, plus an Agentic Context Bank baseline that boosts personalized visual reasoning.
-
PersonaVLM: Long-Term Personalized Multimodal LLMs
PersonaVLM adds memory extraction, multi-turn retrieval-based reasoning, and personality inference to multimodal LLMs, yielding 22.4% gains on a new long-term personalization benchmark and outperforming GPT-4o.
-
Personalize Your Large Vision-language Models With In-context Prompt Tuning
ICPT adds an adaptive-length projection module and two geometric regularizations to enable efficient, high-accuracy personalization of LVLMs across complex multi-concept tasks.