Chaoyou Fu

Identifiers

  • name variant Chaoyou Fu 0.60 · backfill

Papers (13)

  1. SpeechParaling-Bench: A Comprehensive Benchmark for Paralinguistic-Aware Speech Generation cs.CL · 2026 · author #9
  2. Tango: Taming Visual Signals for Efficient Video Large Language Models cs.CV · 2026 · author #6
  3. ActFER: Agentic Facial Expression Recognition via Active Tool-Augmented Visual Reasoning cs.CV · 2026 · author #8
  4. Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding cs.CV · 2026 · author #1
  5. Agentic-MME: What Agentic Capability Really Brings to Multimodal Intelligence? cs.AI · 2026 · author #13
  6. VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding cs.CV · 2026 · author #5
  7. PersonaVLM: Long-Term Personalized Multimodal LLMs cs.CL · 2026 · author #2
  8. Thyme: Think Beyond Images cs.CV · 2025 · author #4
  9. VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction cs.CV · 2025 · author #1
  10. MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans? cs.CV · 2024 · author #4
  11. Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis cs.CV · 2024 · author #1
  12. A Survey on Multimodal Large Language Models cs.CV · 2023 · author #2
  13. MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models cs.CV · 2023 · author #1

Mentions

  • 2501.01957 #1 · arxiv_oai · confidence 0.70 Chaoyou Fu
  • 2408.13257 #4 · arxiv_oai · confidence 0.70 Chaoyou Fu
  • 2306.13549 #2 · arxiv_oai · confidence 0.70 Chaoyou Fu

Frequent Coauthors