MiniGPT-4: Enhancing vision-language understanding with advanced large language models

Deyao Zhu, Jun Chen, Xiaoqian Shen, Xiang Li, Mohamed Elhoseiny · 2023

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Focusing Where Vision Matters: Selective Training for Large Vision Language Models via Visual Information Gain

cs.CV · 2026-02-19 · unverdicted · novelty 7.0

Introduces the VIG metric, which measures visual contribution via perplexity reduction, and applies it to selectively train LVLMs on high-VIG samples and tokens, improving grounding with reduced supervision.

LogicVista: Multimodal LLM Logical Reasoning Benchmark in Visual Contexts

cs.AI · 2024-07-06 · conditional · novelty 6.0

LogicVista is a new benchmark dataset with 448 visual logic questions that evaluates multimodal LLMs on five reasoning tasks covering nine capabilities.

Orca: Progressive Learning from Complex Explanation Traces of GPT-4

cs.CL · 2023-06-05 · conditional · novelty 6.0

A 13B model called Orca learns detailed reasoning from GPT-4 explanation traces and reaches parity with ChatGPT on Big-Bench Hard while outperforming other 13B models.

citing papers explorer

Focusing Where Vision Matters: Selective Training for Large Vision Language Models via Visual Information Gain
cs.CV · 2026-02-19 · unverdicted · none · ref 8
LogicVista: Multimodal LLM Logical Reasoning Benchmark in Visual Contexts
cs.AI · 2024-07-06 · conditional · none · ref 4
Orca: Progressive Learning from Complex Explanation Traces of GPT-4
cs.CL · 2023-06-05 · conditional · none · ref 23
