pith. sign in

Explain before you answer: A survey on compositional visual reasoning

9 Pith papers cite this work. Polarity classification is still indexing.

9 Pith papers citing it

citation-role summary

background 2

citation-polarity summary

years

2026 7 2025 2

roles

background 2

polarities

background 2

clear filters

representative citing papers

Dynamic Execution Commitment of Vision-Language-Action Models

cs.CV · 2026-05-12 · unverdicted · novelty 7.0 · 3 refs

A3 reframes dynamic action chunk commitment in VLA models as self-speculative prefix verification, accepting the longest continuous sequence of actions that satisfies consensus-ordered conditional invariance and prefix-closed sequential consistency.

Mull-Tokens: Modality-Agnostic Latent Thinking

cs.CV · 2025-12-11 · unverdicted · novelty 6.0

Mull-Tokens are modality-agnostic latent tokens that enable free-form multimodal thinking and deliver up to 16% gains on spatial reasoning benchmarks.

Visual Compositional Tuning

cs.CV · 2025-04-30 · unverdicted · novelty 6.0

COMPACT synthesizes compositional visual instruction data to reduce VIT training data by 90% while achieving 100.2% of full performance across eight multimodal benchmarks.

ARIS: Agentic and Relationship Intelligence System for Social Robots

cs.RO · 2026-05-01 · unverdicted · novelty 4.0

ARIS integrates a graph-based Social World Model, RAG, and agentic architecture for social robots and reports higher user ratings for intelligence, animacy, anthropomorphism, and likeability than an LLM baseline in a 23-person study with the Pepper robot.

citing papers explorer

Showing 7 of 7 citing papers after filters.