Emo-llama: Enhancing facial emotion understanding with instruction tuning

Bohao Xing, Zitong Yu, Xin Liu, Kaishen Yuan, Qilang Ye, Weicheng Xie, Huanjing Yue, Jingyu Yang, Heikki Kälviäinen · 2024 · arXiv 2408.11424

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

ActFER: Agentic Facial Expression Recognition via Active Tool-Augmented Visual Reasoning

cs.CV · 2026-04-10 · unverdicted · novelty 7.0

ActFER reformulates facial expression recognition as active tool-augmented visual reasoning with a custom reinforcement learning algorithm, UC-GRPO, and outperforms passive MLLM baselines on AU prediction.

FPBench: A Comprehensive Benchmark of Multimodal Large Language Models for Fingerprint Analysis

cs.CV · 2025-12-19 · conditional · novelty 6.0

FPBench evaluates 20 MLLMs across 8 fingerprint tasks on 7 datasets and shows fine-tuning vision and language encoders improves performance by 7-39%.

Insights from Visual Cognition: Understanding Human Action Dynamics with Overall Glance and Refined Gaze Transformer

cs.CV · 2026-04-08 · unverdicted · novelty 5.0

The OG-ReG Transformer achieves state-of-the-art results on Kinetics-400, Something-Something v2, and Diving-48 by combining global glance and local gaze processing paths.

citing papers explorer

Showing 1 of 1 citing paper after filters.

FPBench: A Comprehensive Benchmark of Multimodal Large Language Models for Fingerprint Analysis

cs.CV · 2025-12-19 · conditional · none · ref 58

FPBench evaluates 20 MLLMs across 8 fingerprint tasks on 7 datasets and shows fine-tuning vision and language encoders improves performance by 7-39%.
