Photorealistic text-to-image diffusion models with deep language understanding
2 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.CV 2
verdicts
UNVERDICTED 2
representative citing papers
ForgeryGPT integrates a forgery localization expert and mask encoder into an LLM for pixel-level forgery detection, localization, and explainable output via three-stage training on custom mask-text and instruction datasets.
citing papers explorer
- UniEmo: Unifying Emotional Understanding and Generation with Learnable Expert Queries
  UniEmo unifies emotional understanding and generation by extracting multi-scale features via learnable expert queries, guiding diffusion-based image generation, and using dual feedback to improve both tasks.
- ForgeryGPT: A Multimodal LLM for Interpretable Image Forgery Detection and Localization
  ForgeryGPT integrates a forgery localization expert and mask encoder into an LLM for pixel-level forgery detection, localization, and explainable output via three-stage training on custom mask-text and instruction datasets.