LLM-based NLG Evaluation: Current Status and Challenges

Mingqi Gao, Xinyu Hu, Jie Ruan, Xiao Pu, Xiaojun Wan · 2024 · arXiv 2402.01383

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

When 'YES' Meets 'BUT': Can Large Models Comprehend Contradictory Humor Through Comparative Reasoning?

cs.CV · 2025-03-29 · unverdicted · novelty 6.0

Presents YesBut (V2) benchmark and shows state-of-the-art VLMs significantly underperform humans on tasks requiring comparative reasoning for contradictory humor in comics.

Sensorimotor Self-Recognition in Multimodal Large Language Model-Driven Robots

cs.AI · 2025-05-25 · unverdicted · novelty 4.0

Multimodal LLMs in robots develop self-identification and predictive awareness through sensorimotor loops, with structural equation modeling linking sensory integration to dimensions of the minimal self.

From Cool Demos to Production-Ready FMware: Core Challenges and a Technology Roadmap

cs.SE · 2024-10-28 · unverdicted · novelty 4.0

A semi-structured thematic synthesis identifies core challenges in FM selection, alignment, prompting, orchestration, testing, deployment, and cross-cutting concerns like observability for production-ready FMware.

citing papers explorer

Showing 3 of 3 citing papers.

When 'YES' Meets 'BUT': Can Large Models Comprehend Contradictory Humor Through Comparative Reasoning?

cs.CV · 2025-03-29 · unverdicted · none · ref 77

Sensorimotor Self-Recognition in Multimodal Large Language Model-Driven Robots

cs.AI · 2025-05-25 · unverdicted · none · ref 42

From Cool Demos to Production-Ready FMware: Core Challenges and a Technology Roadmap

cs.SE · 2024-10-28 · unverdicted · none · ref 46
