A relightable Gaussian Splatting method for virtual production decomposes scenes into fixed appearance and variable lighting by parameterizing primitives to directly sample high-resolution background textures, enabling controllable relighting without physically-based rendering or far-field maps.
Llava-interactive: An all-in-one demo for image chat, segmentation, generation and editing
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 2polarities
background 2representative citing papers
LLM Orchestration integrates modality experts via an LLM controller, cross-modal memory, and interaction layer to enable multimodal input-output without gradient-based training.
Sa2VA unifies SAM-2 segmentation with MLLM reasoning into a single model for referring segmentation and conversation on images and videos, supported by a new 72k-expression Ref-SAV dataset.
Survey proposing a taxonomy for document parsing into pipeline-based systems and VLM-driven unified models, reviewing components, metrics, benchmarks, and challenges.
citing papers explorer
-
Relightable Gaussian Splatting for Virtual Production Using Image-Based Illumination
A relightable Gaussian Splatting method for virtual production decomposes scenes into fixed appearance and variable lighting by parameterizing primitives to directly sample high-resolution background textures, enabling controllable relighting without physically-based rendering or far-field maps.
-
Training-Free Multimodal Large Language Model Orchestration
LLM Orchestration integrates modality experts via an LLM controller, cross-modal memory, and interaction layer to enable multimodal input-output without gradient-based training.
-
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos
Sa2VA unifies SAM-2 segmentation with MLLM reasoning into a single model for referring segmentation and conversation on images and videos, supported by a new 72k-expression Ref-SAV dataset.
-
Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction
Survey proposing a taxonomy for document parsing into pipeline-based systems and VLM-driven unified models, reviewing components, metrics, benchmarks, and challenges.
- Don't Guess, Just Ask: Resolving Ambiguity in Referring Segmentation via Multi-turn Clarification