In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

Xiang, J · 2025

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

browse 4 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Brain3D: EEG-to-3D Decoding of Visual Representations via Multimodal Reasoning

cs.CV · 2026-04-09 · unverdicted · novelty 7.0

A multimodal pipeline decodes EEG into 3D meshes via EEG-to-image, MLLM reasoning, diffusion, and single-image-to-3D conversion, reporting 85.4% 10-way accuracy and 0.648 CLIPScore.

StreamGVE: Training-Free Video Editing via Few-Step Streaming Video Generation

cs.CV · 2026-05-20 · unverdicted · novelty 6.0

StreamGVE enables high-quality training-free video editing by converting the task to noise-to-data streaming generation with dual-branch fast sampling, self-attention bridges, cross-attention grounding, source-oriented guidance, and visual prompting.

Multimodal LLMs under Pairwise Modalities

cs.CV · 2026-05-20 · unverdicted · novelty 6.0

A two-stage framework enables multimodal LLMs to learn shared latent representations from pairwise modality data and achieve cross-modal generation when incorporating new modalities.

LSRM: High-Fidelity Object-Centric Reconstruction via Scaled Context Windows

cs.CV · 2026-04-06 · conditional · novelty 6.0

LSRM scales transformer context windows with native sparse attention and geometric routing to deliver high-fidelity feed-forward 3D reconstruction and inverse rendering that approaches dense optimization quality.

citing papers explorer

Showing 4 of 4 citing papers.

Brain3D: EEG-to-3D Decoding of Visual Representations via Multimodal Reasoning cs.CV · 2026-04-09 · unverdicted · none · ref 34
A multimodal pipeline decodes EEG into 3D meshes via EEG-to-image, MLLM reasoning, diffusion, and single-image-to-3D conversion, reporting 85.4% 10-way accuracy and 0.648 CLIPScore.
StreamGVE: Training-Free Video Editing via Few-Step Streaming Video Generation cs.CV · 2026-05-20 · unverdicted · none · ref 82
StreamGVE enables high-quality training-free video editing by converting the task to noise-to-data streaming generation with dual-branch fast sampling, self-attention bridges, cross-attention grounding, source-oriented guidance, and visual prompting.
Multimodal LLMs under Pairwise Modalities cs.CV · 2026-05-20 · unverdicted · none · ref 61
A two-stage framework enables multimodal LLMs to learn shared latent representations from pairwise modality data and achieve cross-modal generation when incorporating new modalities.
LSRM: High-Fidelity Object-Centric Reconstruction via Scaled Context Windows cs.CV · 2026-04-06 · conditional · none · ref 67
LSRM scales transformer context windows with native sparse attention and geometric routing to deliver high-fidelity feed-forward 3D reconstruction and inverse rendering that approaches dense optimization quality.

In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer