Visual instruction tuning

· 2024

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Toward Generalizable Forgery Detection and Reasoning

cs.CV · 2025-03-27 · unverdicted · novelty 7.0

FakeReasoning is an MLLM-based framework for unified forgery detection and reasoning on AI-generated images, supported by the new MMFR-Dataset of 120K images and 378K annotations across 10 generators.

UniEmo: Unifying Emotional Understanding and Generation with Learnable Expert Queries

cs.CV · 2025-07-31 · unverdicted · novelty 6.0

UniEmo unifies emotional understanding and generation by extracting multi-scale features via learnable expert queries, guiding diffusion-based image generation, and using dual feedback to improve both tasks.

AutoSpatial: Visual-Language Reasoning for Social Robot Navigation through Efficient Spatial Reasoning Learning

cs.RO · 2025-03-10 · unverdicted · novelty 5.0

AutoSpatial improves VLM spatial reasoning for social navigation by combining minimal manual supervision with auto-labeled VQA pairs and hierarchical training, showing gains up to 20.5% in action prediction over baselines.

Recent Advances in Multimodal Affective Computing: An NLP Perspective

cs.CL · 2024-09-11 · unverdicted · novelty 3.0

Survey organizing multimodal affective computing research around four NLP tasks, method paradigms, datasets, evaluation protocols, and future directions while releasing a resource repository.

citing papers explorer

Showing 4 of 4 citing papers.

Toward Generalizable Forgery Detection and Reasoning cs.CV · 2025-03-27 · unverdicted · none · ref 29
UniEmo: Unifying Emotional Understanding and Generation with Learnable Expert Queries cs.CV · 2025-07-31 · unverdicted · none · ref 76
AutoSpatial: Visual-Language Reasoning for Social Robot Navigation through Efficient Spatial Reasoning Learning cs.RO · 2025-03-10 · unverdicted · none · ref 4
Recent Advances in Multimodal Affective Computing: An NLP Perspective cs.CL · 2024-09-11 · unverdicted · none · ref 75
