arXiv preprint arXiv:2402.17766 (2024)

Qi, Z · 2024 · arXiv 2402.17766

4 Pith papers cite this work. Polarity classification is still indexing.



citation-role summary

background: 1 · dataset: 1

citation-polarity summary

background: 2

representative citing papers

Abstract 3D Perception for Spatial Intelligence in Vision-Language Models

cs.CV · 2025-11-14 · unverdicted · novelty 7.0

SandboxVLM enhances VLMs' spatial intelligence by encoding 3D geometry as abstract bounding boxes in a four-stage zero-shot pipeline, yielding an 8.3% improvement on the SAT Real benchmark.

Multimodal LLMs under Pairwise Modalities

cs.CV · 2026-05-20 · unverdicted · novelty 6.0

A two-stage framework enables multimodal LLMs to learn shared latent representations from pairwise modality data and achieve cross-modal generation when incorporating new modalities.

Affordance Agent Harness: Verification-Gated Skill Orchestration

cs.RO · 2026-05-01 · unverdicted · novelty 6.0 · 2 refs

Affordance Agent Harness is a verification-gated orchestration system that unifies skills via an evidence store, episodic memory priors, an adaptive router, and a self-consistency verifier to improve accuracy-cost tradeoffs in open-world affordance grounding.

A Systematic Survey on Deep Learning Architectures for Point Cloud Classification and Segmentation

cs.CV · 2026-05-16 · unverdicted · novelty 4.0

A systematic literature survey that categorizes deep learning architectures for point cloud classification, part segmentation, and semantic segmentation, evaluates them on benchmarks, and discusses innovations, limitations, and future directions.

citing papers explorer

Showing 4 of 4 citing papers.

Abstract 3D Perception for Spatial Intelligence in Vision-Language Models · cs.CV · 2025-11-14 · unverdicted · none · ref 40
Multimodal LLMs under Pairwise Modalities · cs.CV · 2026-05-20 · unverdicted · none · ref 42
Affordance Agent Harness: Verification-Gated Skill Orchestration · cs.RO · 2026-05-01 · unverdicted · none · ref 50 · 2 links
A Systematic Survey on Deep Learning Architectures for Point Cloud Classification and Segmentation · cs.CV · 2026-05-16 · unverdicted · none · ref 71
