Title resolution pending
6 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields: cs.CV (6)
years: 2026 (6)
roles: background (2)
polarities: background (2)
representative citing papers
RelWitness introduces relation witnesses as observable visual-geometric cues to classify unannotated relations and enable positive-unlabeled learning for open-vocabulary 3D scene graph generation.
C²R framework for robust dataset distillation prioritizes small-margin adversaries via a derived perturbation score and widens class boundaries with contrastive loss, yielding 2.8% average robust accuracy gains on CIFAR and ImageNet benchmarks.
ScriptHOI decomposes HOI phrases into state slots, uses slot-wise script coverage and conflict matching, and applies interval partial-label learning to improve rare and unseen interaction detection.
citing papers explorer
- Resolving Ambiguity in Composed Image Retrieval via Calibrated Interaction
  Composed image retrieval is reframed as calibrated intent resolution under uncertainty via conformal prediction sets and expected-information-gain clarification, with new AmbiCIR benchmark showing matched single-turn SOTA and faster multi-turn resolution with valid coverage.
- RelWitness: Open-Vocabulary 3D Scene Graph Generation with Visual-Geometric Relation Witnesses
  RelWitness introduces relation witnesses as observable visual-geometric cues to classify unannotated relations and enable positive-unlabeled learning for open-vocabulary 3D scene graph generation.
- Mind Your Margin and Boundary: Are Your Distilled Datasets Truly Robust?
  C²R framework for robust dataset distillation prioritizes small-margin adversaries via a derived perturbation score and widens class boundaries with contrastive loss, yielding 2.8% average robust accuracy gains on CIFAR and ImageNet benchmarks.
- ScriptHOI: Learning Scripted State Transitions for Open-Vocabulary Human-Object Interaction Detection
  ScriptHOI decomposes HOI phrases into state slots, uses slot-wise script coverage and conflict matching, and applies interval partial-label learning to improve rare and unseen interaction detection.
- ReLIC-SGG: Relation Lattice Completion for Open-Vocabulary Scene Graph Generation
- CAGE-SGG: Counterfactual Active Graph Evidence for Open-Vocabulary Scene Graph Generation