A pair-centric set-prediction model unifies present HOI detection and multi-horizon anticipation in video by modeling future interactions as residual transitions from current pair states, backed by a temporally corrected benchmark.
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields: cs.CV (4)
years: 2026 (4)
roles: method (1)
polarities: use method (1)
representative citing papers
DEX is a modular network using dynamically activated experts and a group-EMA director to learn emergent modular representations for multi-modality medical vision foundation models, evaluated on a new 4M-image benchmark across 10 modalities and 26 downstream tasks.
A unified cost-aware formulation couples fine-grained high-resolution sampling decisions with cross-patch representation prediction to achieve superior performance-cost trade-offs on remote sensing recognition and retrieval tasks using a new 10M-image benchmark.
citing papers explorer
- Rethinking Video Human-Object Interaction: Set Prediction over Time for Unified Detection and Anticipation
A pair-centric set-prediction model unifies present HOI detection and multi-horizon anticipation in video by modeling future interactions as residual transitions from current pair states, backed by a temporally corrected benchmark.
- Learning Emergent Modular Representations in Multi-modality Medical Vision Foundation Models
DEX is a modular network using dynamically activated experts and a group-EMA director to learn emergent modular representations for multi-modality medical vision foundation models, evaluated on a new 4M-image benchmark across 10 modalities and 26 downstream tasks.
- Observe Less, Understand More: Cost-aware Cross-scale Observation for Remote Sensing Understanding
A unified cost-aware formulation couples fine-grained high-resolution sampling decisions with cross-patch representation prediction to achieve superior performance-cost trade-offs on remote sensing recognition and retrieval tasks using a new 10M-image benchmark.
- Case-Aware Medical Image Classification with Multimodal Knowledge Graphs and Reliability-Guided Refinement