10•Jimin Tang et al

Kling-Avatar: Grounding SIGGRAPH Conference Papers ’26, July 19–23 · 2026 · arXiv 2509.09595

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 1 · baseline: 1 · method: 1

citation-polarity summary

background: 1 · baseline: 1 · use method: 1

representative citing papers

VidSplat: Gaussian Splatting Reconstruction with Geometry-Guided Video Diffusion Priors

cs.CV · 2026-05-12 · unverdicted · novelty 6.0

VidSplat iteratively synthesizes novel views with geometry-guided video diffusion to enable robust Gaussian splatting reconstruction from sparse or single-image inputs.

OmniHuman: A Large-scale Dataset and Benchmark for Human-Centric Video Generation

cs.CV · 2026-04-20 · unverdicted · novelty 6.0

OmniHuman is a new large-scale multi-scene dataset with video-, frame-, and individual-level annotations for human-centric video generation, accompanied by the OHBench benchmark that adds metrics aligned with human perception.

Empowering Video Translation using Multimodal Large Language Models

cs.CV · 2026-04-13 · unverdicted · novelty 4.0

The paper offers the first focused review of MLLM-based video translation organized by a three-role taxonomy of Semantic Reasoner, Expressive Performer, and Visual Synthesizer, plus open challenges.

citing papers explorer

Showing 3 of 3 citing papers.

VidSplat: Gaussian Splatting Reconstruction with Geometry-Guided Video Diffusion Priors

cs.CV · 2026-05-12 · unverdicted · none · ref 6

VidSplat iteratively synthesizes novel views with geometry-guided video diffusion to enable robust Gaussian splatting reconstruction from sparse or single-image inputs.

OmniHuman: A Large-scale Dataset and Benchmark for Human-Centric Video Generation

cs.CV · 2026-04-20 · unverdicted · none · ref 12

OmniHuman is a new large-scale multi-scene dataset with video-, frame-, and individual-level annotations for human-centric video generation, accompanied by the OHBench benchmark that adds metrics aligned with human perception.

Empowering Video Translation using Multimodal Large Language Models

cs.CV · 2026-04-13 · unverdicted · none · ref 131

The paper offers the first focused review of MLLM-based video translation organized by a three-role taxonomy of Semantic Reasoner, Expressive Performer, and Visual Synthesizer, plus open challenges.
