arXiv preprint arXiv:2506.17733 (2025)

Mengqi Lei, Siqi Li, Yihong Wu, Han Hu, You Zhou, Xinhu Zheng, Guiguang Ding, Shaoyi Du, Zongze Wu, Yue Gao · 2025 · arXiv 2506.17733

14 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1 baseline 1

citation-polarity summary

background 1 baseline 1

representative citing papers

Towards Self-Explainable Document Visual Question Answering with Chain-of-Explanation Predictions

cs.LG · 2026-05-07 · unverdicted · novelty 7.0

CoExVQA uses a chain-of-explanation to ground DocVQA answers in localized document regions, achieving state-of-the-art explainable performance with a 12% ANLS gain on PFL-DocVQA over prior baselines.

WUTDet: A 100K-Scale Ship Detection Dataset and Benchmarks with Dense Small Objects

cs.CV · 2026-04-09 · unverdicted · novelty 7.0

WUTDet is a 100K-image ship detection dataset with benchmarks indicating Transformer models outperform CNN and Mamba architectures in accuracy and small-object detection for complex maritime environments.

DLEBench: Evaluating Small-scale Object Editing Ability for Instruction-based Image Editing Model

cs.CV · 2026-02-27 · unverdicted · novelty 7.0

DLEBench is the first benchmark for small-scale object editing in instruction-based image editing models, using 1889 samples, seven instruction types, and a dual-mode evaluation protocol to reveal performance gaps in 10 tested models.

Hypergraph-Enhanced Training-Free and Language-Free Few-Shot Anomaly Detection

cs.CV · 2026-05-11 · unverdicted · novelty 6.0

HyperFSAD uses sparse hypergraph matching on DINOv3 features plus dual-branch scoring to deliver training-free and language-free few-shot anomaly detection that reaches state-of-the-art on six industrial and medical datasets.

Sparse Hypergraph-Enhanced Frame-Event Object Detection with Fine-Grained MoE

cs.CV · 2026-04-13 · unverdicted · novelty 6.0

Hyper-FEOD fuses RGB and event data via sparse hypergraph cross-modal fusion and region-specialized MoE experts to improve accuracy-efficiency in object detection.

Ψ-Map: Panoptic Surface Integrated Mapping Enables Real2Sim Transfer

cs.RO · 2026-04-13 · unverdicted · novelty 6.0

Ψ-Map combines plane-constrained Gaussian surfels from LiDAR with end-to-end panoptic lifting to deliver high-precision geometric and semantic reconstruction in large-scale environments at real-time speeds.

RACANet: Reliability-Aware Crowd Anchor Network for RGB-T Crowd Counting

cs.CV · 2026-04-27 · unverdicted · novelty 5.0

RACANet proposes a reliability-aware two-stage fusion network with cross-modal pretraining and local anchor modules that outperforms prior RGB-T crowd counting methods on standard benchmarks.

Fast-SegSim: Real-Time Open-Vocabulary Segmentation for Robotics in Simulation

cs.RO · 2026-04-13 · unverdicted · novelty 5.0

Fast-SegSim achieves real-time 3D-consistent open-vocabulary segmentation by optimizing feature accumulation in 2D Gaussian Splatting with Precise Tile Intersection and Top-K Hard Selection.

FMC-DETR: Frequency-Decoupled Multi-Domain Coordination for Aerial-View Object Detection

cs.CV · 2025-09-27 · unverdicted · novelty 5.0

FMC-DETR proposes a frequency-decoupled fusion framework with WeKat backbone, MDFC coordination, and CPF fusion modules that claims state-of-the-art results on remote sensing object detection benchmarks.

UAVDB: Point-Guided Masks for UAV Detection and Segmentation

cs.CV · 2024-09-09 · unverdicted · novelty 5.0

Introduces UAVDB dataset for UAV detection/segmentation via PIC point-to-box conversion and SAM2 masks, with YOLO baselines showing PIC+SAM2 outperforms prior annotation methods on IoU.

A Marine Debris Detection Framework for Ocean Robots via Self-Attention Enhancement and Feature Interaction Optimization

cs.CV · 2026-05-08 · unverdicted · novelty 4.0

YOLO-MD improves underwater marine debris detection by adding a Dual-Branch Convolutional Enhanced Self-Attention module, a lightweight shift operation, and SFG-Loss for class imbalance, achieving 0.875 precision and 0.849 mAP50 on the UODM dataset.

Resource-Constrained UAV-Based Weed Detection for Site-Specific Management on Edge Devices

cs.CV · 2026-04-25 · unverdicted · novelty 4.0

YOLOv11s and RT-DETRv2-R50-M provide the best accuracy-speed trade-off for real-time weed detection on edge UAV systems, with mAP50 up to 79% and low latency.

Beyond Mamba: Enhancing State-space Models with Deformable Dilated Convolutions for Multi-scale Traffic Object Detection

cs.CV · 2026-04-09 · unverdicted · novelty 4.0

MDDCNet combines Mamba blocks with deformable dilated convolutions, enhanced feed-forward networks, and an attention-aggregating feature pyramid to achieve better multi-scale traffic object detection than prior detectors.

NTIRE 2026 Rip Current Detection and Segmentation (RipDetSeg) Challenge Report

cs.CV · 2026-04-18 · unverdicted · novelty 1.0

The NTIRE 2026 RipDetSeg Challenge evaluated AI methods for rip current detection and segmentation, finding that pretrained general-purpose models with augmentation and post-processing performed well on a diverse multi-country dataset.

citing papers explorer

Towards Self-Explainable Document Visual Question Answering with Chain-of-Explanation Predictions cs.LG · 2026-05-07 · unverdicted · none · ref 28
CoExVQA uses a chain-of-explanation to ground DocVQA answers in localized document regions, achieving state-of-the-art explainable performance with a 12% ANLS gain on PFL-DocVQA over prior baselines.
WUTDet: A 100K-Scale Ship Detection Dataset and Benchmarks with Dense Small Objects cs.CV · 2026-04-09 · unverdicted · none · ref 35
WUTDet is a 100K-image ship detection dataset with benchmarks indicating Transformer models outperform CNN and Mamba architectures in accuracy and small-object detection for complex maritime environments.
DLEBench: Evaluating Small-scale Object Editing Ability for Instruction-based Image Editing Model cs.CV · 2026-02-27 · unverdicted · none · ref 12
DLEBench is the first benchmark for small-scale object editing in instruction-based image editing models, using 1889 samples, seven instruction types, and a dual-mode evaluation protocol to reveal performance gaps in 10 tested models.
Hypergraph-Enhanced Training-Free and Language-Free Few-Shot Anomaly Detection cs.CV · 2026-05-11 · unverdicted · none · ref 18
HyperFSAD uses sparse hypergraph matching on DINOv3 features plus dual-branch scoring to deliver training-free and language-free few-shot anomaly detection that reaches state-of-the-art on six industrial and medical datasets.
Sparse Hypergraph-Enhanced Frame-Event Object Detection with Fine-Grained MoE cs.CV · 2026-04-13 · unverdicted · none · ref 9
Hyper-FEOD fuses RGB and event data via sparse hypergraph cross-modal fusion and region-specialized MoE experts to improve accuracy-efficiency in object detection.
Ψ-Map: Panoptic Surface Integrated Mapping Enables Real2Sim Transfer cs.RO · 2026-04-13 · unverdicted · none · ref 44
Ψ-Map combines plane-constrained Gaussian surfels from LiDAR with end-to-end panoptic lifting to deliver high-precision geometric and semantic reconstruction in large-scale environments at real-time speeds.
RACANet: Reliability-Aware Crowd Anchor Network for RGB-T Crowd Counting cs.CV · 2026-04-27 · unverdicted · none · ref 11
RACANet proposes a reliability-aware two-stage fusion network with cross-modal pretraining and local anchor modules that outperforms prior RGB-T crowd counting methods on standard benchmarks.
Fast-SegSim: Real-Time Open-Vocabulary Segmentation for Robotics in Simulation cs.RO · 2026-04-13 · unverdicted · none · ref 34
Fast-SegSim achieves real-time 3D-consistent open-vocabulary segmentation by optimizing feature accumulation in 2D Gaussian Splatting with Precise Tile Intersection and Top-K Hard Selection.
FMC-DETR: Frequency-Decoupled Multi-Domain Coordination for Aerial-View Object Detection cs.CV · 2025-09-27 · unverdicted · none · ref 21
FMC-DETR proposes a frequency-decoupled fusion framework with WeKat backbone, MDFC coordination, and CPF fusion modules that claims state-of-the-art results on remote sensing object detection benchmarks.
UAVDB: Point-Guided Masks for UAV Detection and Segmentation cs.CV · 2024-09-09 · unverdicted · none · ref 37
Introduces UAVDB dataset for UAV detection/segmentation via PIC point-to-box conversion and SAM2 masks, with YOLO baselines showing PIC+SAM2 outperforms prior annotation methods on IoU.
A Marine Debris Detection Framework for Ocean Robots via Self-Attention Enhancement and Feature Interaction Optimization cs.CV · 2026-05-08 · unverdicted · none · ref 26
YOLO-MD improves underwater marine debris detection by adding a Dual-Branch Convolutional Enhanced Self-Attention module, a lightweight shift operation, and SFG-Loss for class imbalance, achieving 0.875 precision and 0.849 mAP50 on the UODM dataset.
Resource-Constrained UAV-Based Weed Detection for Site-Specific Management on Edge Devices cs.CV · 2026-04-25 · unverdicted · none · ref 21
YOLOv11s and RT-DETRv2-R50-M provide the best accuracy-speed trade-off for real-time weed detection on edge UAV systems, with mAP50 up to 79% and low latency.
Beyond Mamba: Enhancing State-space Models with Deformable Dilated Convolutions for Multi-scale Traffic Object Detection cs.CV · 2026-04-09 · unverdicted · none · ref 5
MDDCNet combines Mamba blocks with deformable dilated convolutions, enhanced feed-forward networks, and an attention-aggregating feature pyramid to achieve better multi-scale traffic object detection than prior detectors.
NTIRE 2026 Rip Current Detection and Segmentation (RipDetSeg) Challenge Report cs.CV · 2026-04-18 · unverdicted · none · ref 38
The NTIRE 2026 RipDetSeg Challenge evaluated AI methods for rip current detection and segmentation, finding that pretrained general-purpose models with augmentation and post-processing performed well on a diverse multi-country dataset.
