2.5 years in class: A multimodal textbook for vision-language pretraining

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

fields

cs.AI: 2 · cs.CV: 1

years

2026: 2 · 2025: 1

verdicts

UNVERDICTED 3

roles

background 1

polarities

background 1

representative citing papers

MMSearch-R1: Incentivizing LMMs to Search

cs.CV · 2025-06-25 · unverdicted · novelty 7.0

MMSearch-R1 uses reinforcement learning to train multimodal models for on-demand multi-turn internet search with image and text tools, outperforming same-size RAG baselines and matching larger ones while cutting search calls by over 30%.

Logics-Parsing-Omni Technical Report

cs.AI · 2026-03-10 · unverdicted · novelty 6.0

The Omni Parsing framework converts complex multimodal signals into locatable, enumerable, and traceable structured knowledge via hierarchical detection, recognition, and interpretation with strict evidence alignment.
