VideoGPT+: Integrating image and video encoders for enhanced video understanding

Muhammad Maaz, Hanoona Rasheed, Salman Khan, Fahad Shahbaz Khan · 2024

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

LLaVA-Octopus: Unlocking Instruction-Driven Adaptive Projector Fusion for Video Understanding

cs.CV · 2025-01-09 · unverdicted · novelty 5.0

LLaVA-Octopus introduces instruction-driven adaptive fusion of multiple visual projectors in a multimodal LLM to improve video understanding performance.

citing papers explorer

Showing 1 of 1 citing paper.

LLaVA-Octopus: Unlocking Instruction-Driven Adaptive Projector Fusion for Video Understanding

cs.CV · 2025-01-09 · unverdicted · none · ref 50

LLaVA-Octopus introduces instruction-driven adaptive fusion of multiple visual projectors in a multimodal LLM to improve video understanding performance.
