BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models

Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi · 2023

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

StableSketcher: Enhancing Diffusion Model for Pixel-based Sketch Generation via Visual Question Answering Feedback

cs.CV · 2025-10-23 · conditional · novelty 6.0

StableSketcher improves text-to-sketch generation by fine-tuning a diffusion VAE and adding a VQA-based RL reward, while releasing the SketchDUO dataset of sketches with captions and QA pairs.

Ouroboros: Single-step Diffusion Models for Cycle-consistent Forward and Inverse Rendering

cs.CV · 2025-08-20 · unverdicted · novelty 6.0

Ouroboros uses two single-step diffusion models with cycle consistency for forward and inverse rendering, extending intrinsic decomposition to indoor/outdoor scenes with faster inference than multi-step methods.

citing papers explorer

StableSketcher: Enhancing Diffusion Model for Pixel-based Sketch Generation via Visual Question Answering Feedback
cs.CV · 2025-10-23 · conditional · none · ref 18

Ouroboros: Single-step Diffusion Models for Cycle-consistent Forward and Inverse Rendering
cs.CV · 2025-08-20 · unverdicted · none · ref 36
