Two cartoon pears holding hands

Keep the same action/context structure
Constraints:
- The random entity must be a concrete, visualizable noun
- Must be completely unrelated to original pun
- Do NOT reuse common examples (vary your selection)
Example: Original Visual: “T · 2026

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

"I See What You Did There": Can Large Vision-Language Models Understand Multimodal Puns?

cs.CL · 2026-04-07 · unverdicted · novelty 6.0

Vision-language models largely fail to distinguish multimodal puns from adversarial non-puns but gain an average 16.5% F1 improvement from prompt-level and model-level interventions.

citing papers explorer

Showing 1 of 1 citing paper.

"I See What You Did There": Can Large Vision-Language Models Understand Multimodal Puns?

cs.CL · 2026-04-07 · unverdicted · none · ref 7
