arXiv preprint arXiv:2604.01617 (2026)

Qianyun Yang, Zhiwei Chen, Yupeng Hu, Zixu Li, Zhiheng Fu, Liqiang Nie · 2026 · arXiv 2604.01617

Canonical reference. 78% of citing Pith papers cite this work as background.

12 Pith papers citing it

Background 78% of classified citations

citation-role summary

background 9

citation-polarity summary

background 7 unclear 2

representative citing papers

ConeSep: Cone-based Robust Noise-Unlearning Compositional Network for Composed Image Retrieval

cs.CV · 2026-04-22 · unverdicted · novelty 7.0

ConeSep tackles noisy triplet correspondences in composed image retrieval by introducing geometric fidelity quantization to locate noise, negative boundary learning for semantic opposites, and targeted unlearning via optimal transport, outperforming prior methods on FashionIQ and CIRR.

GateMOT: Q-Gated Attention for Dense Object Tracking

cs.CV · 2026-04-29 · unverdicted · novelty 6.0

GateMOT proposes Q-Gated Attention to enable linear-complexity, spatially aware attention for state-of-the-art dense object tracking on benchmarks like BEE24.

OmniTrend: Content-Context Modeling for Scalable Social Popularity Prediction

cs.CV · 2026-04-29 · unverdicted · novelty 6.0

OmniTrend predicts popularity by combining separate content attractiveness and contextual exposure predictors using cross-modal and exogenous signals.

HotComment: A Benchmark for Evaluating Popularity of Online Comments

cs.AI · 2026-04-28 · unverdicted · novelty 6.0

HotComment is a new multimodal benchmark that quantifies online comment popularity via content quality assessment, interaction-based prediction, and agent-simulated user engagement, accompanied by the StyleCmt stylistic model.

Air-Know: Arbiter-Calibrated Knowledge-Internalizing Robust Network for Composed Image Retrieval

cs.CV · 2026-04-21 · unverdicted · novelty 6.0

Air-Know decouples MLLM-based external arbitration from proxy learning via knowledge internalization and dual-stream training to overcome noisy triplet correspondence in composed image retrieval.

INTENT: Invariance and Discrimination-aware Noise Mitigation for Robust Composed Image Retrieval

cs.CV · 2026-04-20 · unverdicted · novelty 6.0

INTENT mitigates cross-modal correspondence noise and modality-inherent noise in composed image retrieval via FFT-based visual invariant composition and bi-objective discriminative learning.

HABIT: Chrono-Synergia Robust Progressive Learning Framework for Composed Image Retrieval

cs.CV · 2026-04-20 · unverdicted · novelty 6.0

HABIT improves robustness in composed image retrieval under noisy triplets by quantifying sample cleanliness via mutual information transition rates and applying dual-consistency progressive learning to retain good patterns and correct bad ones.

ReTrack: Evidence-Driven Dual-Stream Directional Anchor Calibration Network for Composed Video Retrieval

cs.CV · 2026-04-20 · unverdicted · novelty 6.0

ReTrack calibrates directional bias in composed video features using semantic disentanglement and bidirectional evidence alignment to improve retrieval performance on CVR and CIR tasks.

Think in Latent Thoughts: A New Paradigm for Gloss-Free Sign Language Translation

cs.CV · 2026-04-16 · unverdicted · novelty 6.0

A new SLT framework uses latent thoughts as a middle reasoning layer and plan-then-ground decoding to improve coherence and faithfulness in gloss-free sign language translation.

Grounding Multi-Hop Reasoning in Structural Causal Models via Group Relative Policy Optimization

cs.AI · 2026-05-02 · unverdicted · novelty 5.0

SCM-GRPO grounds multi-hop fact verification in structural causal models and applies GRPO reinforcement learning to optimize reasoning chain length, outperforming baselines on HoVer and EX-FEVER.

Seeing Further and Wider: Joint Spatio-Temporal Enlargement for Micro-Video Popularity Prediction

cs.MM · 2026-04-22 · unverdicted · novelty 5.0

A new joint spatio-temporal enlargement model for micro-video popularity prediction using frame scoring for long sequences and a topology-aware memory bank for unbounded historical associations.

CurEvo: Curriculum-Guided Self-Evolution for Video Understanding

cs.CV · 2026-04-29 · unverdicted · novelty 4.0

CurEvo integrates curriculum guidance into self-evolution to structure autonomous improvement of video understanding models, yielding gains on VideoQA benchmarks.

citing papers explorer

ConeSep: Cone-based Robust Noise-Unlearning Compositional Network for Composed Image Retrieval

cs.CV · 2026-04-22 · unverdicted · none · ref 32

GateMOT: Q-Gated Attention for Dense Object Tracking

cs.CV · 2026-04-29 · unverdicted · none · ref 77

OmniTrend: Content-Context Modeling for Scalable Social Popularity Prediction

cs.CV · 2026-04-29 · unverdicted · none · ref 71

HotComment: A Benchmark for Evaluating Popularity of Online Comments

cs.AI · 2026-04-28 · unverdicted · none · ref 87

Air-Know: Arbiter-Calibrated Knowledge-Internalizing Robust Network for Composed Image Retrieval

cs.CV · 2026-04-21 · unverdicted · none · ref 7

INTENT: Invariance and Discrimination-aware Noise Mitigation for Robust Composed Image Retrieval

cs.CV · 2026-04-20 · unverdicted · none · ref 18

HABIT: Chrono-Synergia Robust Progressive Learning Framework for Composed Image Retrieval

cs.CV · 2026-04-20 · unverdicted · none · ref 52

ReTrack: Evidence-Driven Dual-Stream Directional Anchor Calibration Network for Composed Video Retrieval

cs.CV · 2026-04-20 · unverdicted · none · ref 38

Think in Latent Thoughts: A New Paradigm for Gloss-Free Sign Language Translation

cs.CV · 2026-04-16 · unverdicted · none · ref 57

Grounding Multi-Hop Reasoning in Structural Causal Models via Group Relative Policy Optimization

cs.AI · 2026-05-02 · unverdicted · none · ref 69

Seeing Further and Wider: Joint Spatio-Temporal Enlargement for Micro-Video Popularity Prediction

cs.MM · 2026-04-22 · unverdicted · none · ref 70

CurEvo: Curriculum-Guided Self-Evolution for Video Understanding

cs.CV · 2026-04-29 · unverdicted · none · ref 89
