BLIP-2: Bootstrapping language-image pre- training with frozen image encoders and large language models

Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi · 2023

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

browse 1 citing papers

citation-role summary

background 1

background 1

cs.CV · 2024-08-03 · conditional · novelty 5.0

MiniCPM-Llama3-V 2.5 delivers GPT-4V-level multimodal performance on phones through architecture, pretraining, and alignment optimizations.

Showing 1 of 1 citing paper.

MiniCPM-V: A GPT-4V Level MLLM on Your Phone cs.CV · 2024-08-03 · conditional · none · ref 54
MiniCPM-Llama3-V 2.5 delivers GPT-4V-level multimodal performance on phones through architecture, pretraining, and alignment optimizations.