GPT-4V processes interleaved image-text inputs generically and supports visual referring prompting for new human-AI interaction.
Large language models are zero-shot reasoners
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
representative citing papers
The paper surveys reinforced reasoning techniques for LLMs, covering automated data construction, learning-to-reason methods, and test-time scaling as steps toward Large Reasoning Models.
The paper provides the first comprehensive survey of multimodal chain-of-thought reasoning, including foundational concepts, a taxonomy of methodologies, application analyses, challenges, and future directions.
citing papers explorer
-
The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision)
GPT-4V processes interleaved image-text inputs generically and supports visual referring prompting for new human-AI interaction.
-
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models
The paper surveys reinforced reasoning techniques for LLMs, covering automated data construction, learning-to-reason methods, and test-time scaling as steps toward Large Reasoning Models.
-
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
The paper provides the first comprehensive survey of multimodal chain-of-thought reasoning, including foundational concepts, a taxonomy of methodologies, application analyses, challenges, and future directions.