Multi-grained vision language pre-training: Aligning texts with visual concepts

6 Pith papers cite this work. Polarity classification is still indexing.

fields

cs.CV 5 · cs.LG 1

representative citing papers

Stitch-a-Demo: Video Demonstrations from Multistep Descriptions

cs.CV · 2025-03-18 · unverdicted · novelty 7.0

Stitch-a-Demo is a retrieval-based method that assembles visually coherent video demonstrations from multistep textual descriptions by training on weakly supervised procedural data with hard negatives.

ViperGPT: Visual Inference via Python Execution for Reasoning

cs.CV · 2023-03-14 · unverdicted · novelty 7.0

ViperGPT generates executable Python code to compose pre-trained vision-and-language modules into programs that answer visual queries, reaching state-of-the-art results with no additional training.

citing papers explorer

Showing 6 of 6 citing papers.