Yolov10: Real-time end-to-end object detection

Yolov10: Real-time end-to-end object detection · 2024 · arXiv 2405.14458

26 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 2 · baseline: 1

citation-polarity summary

background: 2 · baseline: 1

representative citing papers

Unlocking the Visual Record of Materials Science: A Large-Scale Multimodal Dataset from Scientific Literature

cs.CV · 2026-06-29 · accept · novelty 8.0

MatMMExtract pipeline creates MatSciFig dataset of 391k annotated materials science figure panels and MaterialScope detection dataset with high accuracy.

M$^2$E-UAV: A Benchmark and Analysis for Onboard Motion-on-Motion Event-Based Tiny UAV Detection

cs.CV · 2026-05-11 · conditional · novelty 8.0 · 2 refs

M²E-UAV is the first benchmark dataset and evaluation protocol for tiny UAV detection from a moving event camera in motion-on-motion conditions.

Zero-Shot Learning in Industrial Scenarios: New Large-Scale Benchmark, Challenges and Baseline

cs.AI · 2026-06-06 · unverdicted · novelty 6.0

Presents MMIO benchmark and RTVP method achieving state-of-the-art 42.2% AP in zero-shot industrial defect detection.

RefDiffNet: Learning to Expose Subtle PCB Defects Before Detection

cs.CV · 2026-05-30 · unverdicted · novelty 6.0

RefDiffNet is a lightweight input enhancement block that uses reference image comparison to expose PCB defects, delivering up to 18% relative mAP50:95 gains across YOLO, RT-DETR, and Faster R-CNN detectors with 0.004-0.005M extra parameters.

Small Object Detection in Industrial Recycling: A New Dataset and YOLO Performance Evaluation

cs.CV · 2026-05-26 · unverdicted · novelty 6.0

Releases a recycling-specific dataset of >10k images and evaluates YOLO variants on small dense overlapping objects with augmentation and anomaly detection.

BabelDOC: Better Layout-Preserving PDF Translation via Intermediate Representation

cs.CV · 2026-05-11 · unverdicted · novelty 6.0

BabelDOC uses an intermediate representation to decouple layout from content for improved layout-preserving PDF translation.

DM$^3$-Nav: Decentralized Multi-Agent Multimodal Multi-Object Semantic Navigation

cs.MA · 2026-04-23 · unverdicted · novelty 6.0

DM³-Nav delivers decentralized multi-agent semantic navigation for multimodal open-vocabulary multi-object tasks that matches centralized baselines in simulation and succeeds in real-world robot deployments.

SoftHGNN: Soft Hypergraph Neural Networks for General Visual Recognition

cs.CV · 2025-05-21 · unverdicted · novelty 6.0

SoftHGNN introduces differentiable soft hyperedges via learnable prototypes and top-k sparse selection to model high-order visual interactions and improve recognition accuracy.

YOLOv12: Attention-Centric Real-Time Object Detectors

cs.CV · 2025-02-18 · unverdicted · novelty 6.0

YOLOv12 is a new attention-based real-time object detector that reports higher accuracy than YOLOv10, YOLOv11, and RT-DETR variants at comparable or better speed and efficiency.

TinyFormer: Preserving Tiny Objects in YOLO-DETR Hybrid Real-time Detectors

cs.CV · 2026-05-24 · unverdicted · novelty 5.0

TinyFormer adds Parallel Bi-fusion Module and Spatial Semantic Adapter to a YOLO-DETR hybrid, raising small-object AP by 1.6 points to 58.5% on MS COCO while keeping real-time speed.

STAR-IOD: Scale-decoupled Topology Alignment with Pseudo-label Refinement for Remote Sensing Incremental Object Detection

cs.CV · 2026-05-20 · unverdicted · novelty 5.0

STAR-IOD applies scale-decoupled topology alignment and K-Means-based pseudo-label refinement to reduce catastrophic forgetting in remote sensing incremental object detection, reporting 1.7% and 2.1% mAP gains on new DIOR-IOD and DOTA-IOD datasets.

Deep Learning-Based Computer Vision for Beam Selection and Proactive Blockage Prediction

eess.SP · 2026-05-06 · unverdicted · novelty 5.0

Vision-aided deep learning delivers 98.96% beam prediction accuracy and over 98% proactive blockage prediction for mm-wave links, including the first treatment of simultaneous non-uniform mobility.

A Real-time Scale-robust Network for Glottis Segmentation in Nasal Transnasal Intubation

eess.IV · 2026-04-30 · unverdicted · novelty 5.0

A scale-robust lightweight CNN for glottis segmentation achieves 92.9% mDice at over 170 FPS with a 19 MB model size on three datasets.

DocRevive: A Unified Pipeline for Document Text Restoration

cs.CV · 2026-04-11 · unverdicted · novelty 5.0 · 2 refs

A unified pipeline using OCR, inpainting, and diffusion models restores text in degraded documents on a new synthetic benchmark dataset, evaluated with the proposed UCSM metric.

UAVDB: Point-Guided Masks for UAV Detection and Segmentation

cs.CV · 2024-09-09 · unverdicted · novelty 5.0

Introduces UAVDB dataset for UAV detection/segmentation via PIC point-to-box conversion and SAM2 masks, with YOLO baselines showing PIC+SAM2 outperforms prior annotation methods on IoU.

Structure-Guided Mixed Masked Pretraining and Spatial Continuity Regularization for Printed Circuit Board Defect Detection

cs.CV · 2026-06-02 · unverdicted · novelty 4.0

A new PCB defect detection method using structure-guided masked pretraining and spatial continuity regularization achieves 85.5% mAP0.5 on the DsPCBSD+ dataset.

Hierarchically Decoupled Mixture-of-Experts for Robust Traffic Sign Recognition in Complex Driving Scenarios

cs.CV · 2026-06-01 · unverdicted · novelty 4.0

A hierarchically decoupled heterogeneous MoE framework with YOLO experts and lightweight gating network reports 76.8% mAP50-95 on a composite traffic sign dataset, a 2.3% gain over baseline with 39.4% lower compute.

Sustainable Intelligence for the Wild: Democratizing Ecological Monitoring via Knowledge-Adaptive Edge Expert Agents

cs.AI · 2026-05-15 · unverdicted · novelty 4.0

Proposes a knowledge-adaptive edge expert agent architecture for sustainable biodiversity monitoring that separates visual perception from reasoning with an explicit knowledge base.

DFIR-DETR: Frequency-Domain Iterative Refinement and Dynamic Feature Aggregation for Small Object Detection

cs.CV · 2025-12-08 · unverdicted · novelty 4.0

DFIR-DETR augments RT-DETR with frequency-domain iterative refinement and dynamic feature aggregation, reporting 92.9% mAP50 on NEU-DET and 51.6% on VisDrone at 11.7M parameters and 47.2 GFLOPs.

MinerU: An Open-Source Solution for Precise Document Content Extraction

cs.CV · 2024-09-27 · conditional · novelty 4.0

MinerU delivers an open-source pipeline for high-precision document content extraction by integrating specialized models with tuned preprocessing and postprocessing rules.

A Goal-Oriented Networking Approach for Intelligent IoT Service Deployment

cs.NI · 2026-05-27 · unverdicted · novelty 3.0

A multi-objective optimization framework is proposed to assess KPIs in goal-oriented IoT service deployment, with simulation results indicating network efficiency benefits.

A Comparative Study of Modern Object Detectors for Robust Apple Detection in Orchard Imagery

cs.CV · 2026-04-11 · unverdicted · novelty 3.0

YOLO11n achieves the highest mAP@0.5:0.95 of 0.6065 for apple localization, with other detectors showing trade-offs in recall and precision at low confidence thresholds.

Underwater Waste Detection Using Deep Learning A Performance Comparison of YOLOv7 to 10 and Faster RCNN

cs.CV · 2025-07-25 · unverdicted · novelty 3.0

YOLOv8 achieves the highest mAP of 80.9% for detecting 15 classes of underwater waste among the tested models.

YOLOv8 to YOLO11: A Comprehensive Architecture In-depth Comparative Review

cs.CV · 2025-01-23 · unverdicted · novelty 2.0

Comparative review of YOLOv8 to YOLO11 architectures based on papers, docs, and code inspection, noting incremental improvements and some unchanged blocks.

citing papers explorer

Showing 26 of 26 citing papers.

Unlocking the Visual Record of Materials Science: A Large-Scale Multimodal Dataset from Scientific Literature
cs.CV · 2026-06-29 · accept · none · ref 30
MatMMExtract pipeline creates MatSciFig dataset of 391k annotated materials science figure panels and MaterialScope detection dataset with high accuracy.

M$^2$E-UAV: A Benchmark and Analysis for Onboard Motion-on-Motion Event-Based Tiny UAV Detection
cs.CV · 2026-05-11 · conditional · none · ref 13 · 2 links
M²E-UAV is the first benchmark dataset and evaluation protocol for tiny UAV detection from a moving event camera in motion-on-motion conditions.

Zero-Shot Learning in Industrial Scenarios: New Large-Scale Benchmark, Challenges and Baseline
cs.AI · 2026-06-06 · unverdicted · none · ref 103
Presents MMIO benchmark and RTVP method achieving state-of-the-art 42.2% AP in zero-shot industrial defect detection.

RefDiffNet: Learning to Expose Subtle PCB Defects Before Detection
cs.CV · 2026-05-30 · unverdicted · none · ref 36
RefDiffNet is a lightweight input enhancement block that uses reference image comparison to expose PCB defects, delivering up to 18% relative mAP50:95 gains across YOLO, RT-DETR, and Faster R-CNN detectors with 0.004-0.005M extra parameters.

Small Object Detection in Industrial Recycling: A New Dataset and YOLO Performance Evaluation
cs.CV · 2026-05-26 · unverdicted · none · ref 27
Releases a recycling-specific dataset of >10k images and evaluates YOLO variants on small dense overlapping objects with augmentation and anomaly detection.

BabelDOC: Better Layout-Preserving PDF Translation via Intermediate Representation
cs.CV · 2026-05-11 · unverdicted · none · ref 18
BabelDOC uses an intermediate representation to decouple layout from content for improved layout-preserving PDF translation.

DM$^3$-Nav: Decentralized Multi-Agent Multimodal Multi-Object Semantic Navigation
cs.MA · 2026-04-23 · unverdicted · none · ref 44
DM³-Nav delivers decentralized multi-agent semantic navigation for multimodal open-vocabulary multi-object tasks that matches centralized baselines in simulation and succeeds in real-world robot deployments.

SoftHGNN: Soft Hypergraph Neural Networks for General Visual Recognition
cs.CV · 2025-05-21 · unverdicted · none · ref 74
SoftHGNN introduces differentiable soft hyperedges via learnable prototypes and top-k sparse selection to model high-order visual interactions and improve recognition accuracy.

YOLOv12: Attention-Centric Real-Time Object Detectors
cs.CV · 2025-02-18 · unverdicted · none · ref 53
YOLOv12 is a new attention-based real-time object detector that reports higher accuracy than YOLOv10, YOLOv11, and RT-DETR variants at comparable or better speed and efficiency.

TinyFormer: Preserving Tiny Objects in YOLO-DETR Hybrid Real-time Detectors
cs.CV · 2026-05-24 · unverdicted · none · ref 24
TinyFormer adds Parallel Bi-fusion Module and Spatial Semantic Adapter to a YOLO-DETR hybrid, raising small-object AP by 1.6 points to 58.5% on MS COCO while keeping real-time speed.

STAR-IOD: Scale-decoupled Topology Alignment with Pseudo-label Refinement for Remote Sensing Incremental Object Detection
cs.CV · 2026-05-20 · unverdicted · none · ref 131
STAR-IOD applies scale-decoupled topology alignment and K-Means-based pseudo-label refinement to reduce catastrophic forgetting in remote sensing incremental object detection, reporting 1.7% and 2.1% mAP gains on new DIOR-IOD and DOTA-IOD datasets.

Deep Learning-Based Computer Vision for Beam Selection and Proactive Blockage Prediction
eess.SP · 2026-05-06 · unverdicted · none · ref 35
Vision-aided deep learning delivers 98.96% beam prediction accuracy and over 98% proactive blockage prediction for mm-wave links, including the first treatment of simultaneous non-uniform mobility.

A Real-time Scale-robust Network for Glottis Segmentation in Nasal Transnasal Intubation
eess.IV · 2026-04-30 · unverdicted · none · ref 43
A scale-robust lightweight CNN for glottis segmentation achieves 92.9% mDice at over 170 FPS with a 19 MB model size on three datasets.

DocRevive: A Unified Pipeline for Document Text Restoration
cs.CV · 2026-04-11 · unverdicted · none · ref 40 · 2 links
A unified pipeline using OCR, inpainting, and diffusion models restores text in degraded documents on a new synthetic benchmark dataset, evaluated with the proposed UCSM metric.

UAVDB: Point-Guided Masks for UAV Detection and Segmentation
cs.CV · 2024-09-09 · unverdicted · none · ref 67
Introduces UAVDB dataset for UAV detection/segmentation via PIC point-to-box conversion and SAM2 masks, with YOLO baselines showing PIC+SAM2 outperforms prior annotation methods on IoU.

Structure-Guided Mixed Masked Pretraining and Spatial Continuity Regularization for Printed Circuit Board Defect Detection
cs.CV · 2026-06-02 · unverdicted · none · ref 64
A new PCB defect detection method using structure-guided masked pretraining and spatial continuity regularization achieves 85.5% mAP0.5 on the DsPCBSD+ dataset.

Hierarchically Decoupled Mixture-of-Experts for Robust Traffic Sign Recognition in Complex Driving Scenarios
cs.CV · 2026-06-01 · unverdicted · none · ref 15
A hierarchically decoupled heterogeneous MoE framework with YOLO experts and lightweight gating network reports 76.8% mAP50-95 on a composite traffic sign dataset, a 2.3% gain over baseline with 39.4% lower compute.

Sustainable Intelligence for the Wild: Democratizing Ecological Monitoring via Knowledge-Adaptive Edge Expert Agents
cs.AI · 2026-05-15 · unverdicted · none · ref 31
Proposes a knowledge-adaptive edge expert agent architecture for sustainable biodiversity monitoring that separates visual perception from reasoning with an explicit knowledge base.

DFIR-DETR: Frequency-Domain Iterative Refinement and Dynamic Feature Aggregation for Small Object Detection
cs.CV · 2025-12-08 · unverdicted · none · ref 70
DFIR-DETR augments RT-DETR with frequency-domain iterative refinement and dynamic feature aggregation, reporting 92.9% mAP50 on NEU-DET and 51.6% on VisDrone at 11.7M parameters and 47.2 GFLOPs.

MinerU: An Open-Source Solution for Precise Document Content Extraction
cs.CV · 2024-09-27 · conditional · none · ref 31
MinerU delivers an open-source pipeline for high-precision document content extraction by integrating specialized models with tuned preprocessing and postprocessing rules.

A Goal-Oriented Networking Approach for Intelligent IoT Service Deployment
cs.NI · 2026-05-27 · unverdicted · none · ref 31
A multi-objective optimization framework is proposed to assess KPIs in goal-oriented IoT service deployment, with simulation results indicating network efficiency benefits.

A Comparative Study of Modern Object Detectors for Robust Apple Detection in Orchard Imagery
cs.CV · 2026-04-11 · unverdicted · none · ref 1
YOLO11n achieves the highest mAP@0.5:0.95 of 0.6065 for apple localization, with other detectors showing trade-offs in recall and precision at low confidence thresholds.

Underwater Waste Detection Using Deep Learning A Performance Comparison of YOLOv7 to 10 and Faster RCNN
cs.CV · 2025-07-25 · unverdicted · none · ref 20
YOLOv8 achieves the highest mAP of 80.9% for detecting 15 classes of underwater waste among the tested models.

YOLOv8 to YOLO11: A Comprehensive Architecture In-depth Comparative Review
cs.CV · 2025-01-23 · unverdicted · none · ref 4
Comparative review of YOLOv8 to YOLO11 architectures based on papers, docs, and code inspection, noting incremental improvements and some unchanged blocks.

YOLOv11: An Overview of the Key Architectural Enhancements
cs.CV · 2024-10-23 · unverdicted · none · ref 15
YOLOv11 adds blocks such as C3k2, SPPF, and C2PSA to improve feature extraction, mAP, and efficiency while supporting detection, segmentation, pose, and oriented detection across model sizes.

Global Offshore Wind Infrastructure: Deployment and Operational Dynamics from Dense Sentinel-1 Time Series
cs.CV · 2026-04-22 · unreviewed · ref 41
