UniEmo unifies emotional understanding and generation by extracting multi-scale features via learnable expert queries, guiding diffusion-based image generation, and using dual feedback to improve both tasks.
Detecting and grounding multi-modal media manipulation and beyond
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.CV 2verdicts
UNVERDICTED 2representative citing papers
ForgeryGPT integrates a forgery localization expert and mask encoder into an LLM for pixel-level forgery detection, localization, and explainable output via three-stage training on custom mask-text and instruction datasets.
citing papers explorer
-
UniEmo: Unifying Emotional Understanding and Generation with Learnable Expert Queries
UniEmo unifies emotional understanding and generation by extracting multi-scale features via learnable expert queries, guiding diffusion-based image generation, and using dual feedback to improve both tasks.