RT-DETRv2: Improved Baseline with Bag-of-Freebies for Real-Time Detection Transformer

RT-DETRv2: Improved Baseline with Bag-of-Freebies for Real-Time Detection Transformer · 2024 · arXiv 2407.17140

13 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 1 · baseline: 1

citation-polarity summary

background: 1 · baseline: 1

representative citing papers

AdvScene: Rethinking Adversarial Patch Evaluation Through Scene Robustness

cs.CR · 2026-05-28 · unverdicted · novelty 7.0

AdvScene is a scene-grounded evaluation method using Adversarial Patch-to-Scene Embedding (APSE) to map the operational envelope of physical adversarial patches in reconstructed real environments.

ReLeaf: Benchmarking Leaf Segmentation across Domains and Species

cs.CV · 2026-05-05 · unverdicted · novelty 7.0

A YOLO26 model trained on four leaf segmentation datasets reaches 83.9% mean mAP50-95 on their test sets but only 40.2% on a new 23-species benchmark, revealing substantial cross-domain generalization gaps.

Segmenting, Fast and Slow: Real-Time Open-Vocabulary Video Instance Segmentation with Dual-Path Processing

cs.CV · 2026-06-30 · unverdicted · novelty 6.0

SegFS is a dual-path architecture that uses sparse keyframe open-vocabulary predictions to condition a fast feature-space network for efficient temporal instance segmentation in videos.

Architect-Ant: Editable Automatic Furnishing of Architectural Floor Plans

cs.AI · 2026-06-09 · unverdicted · novelty 6.0

Architect-Ant fine-tunes a vision-language model on the new AntPlan-270 dataset using procedural reasoning traces and preference optimization to output editable DSL furniture layouts that can be rendered into images.

Improving Layout Representation Learning Across Inconsistently Annotated Datasets via Agentic Harmonization

cs.CV · 2026-04-13 · unverdicted · novelty 6.0

VLM-based harmonization of inconsistent annotations across two document layout corpora raises detection F-score from 0.860 to 0.883 and table TEDS from 0.750 to 0.814 while tightening embedding clusters.

YOLOv12: Attention-Centric Real-Time Object Detectors

cs.CV · 2025-02-18 · unverdicted · novelty 6.0

YOLOv12 is a new attention-based real-time object detector that reports higher accuracy than YOLOv10, YOLOv11, and RT-DETR variants at comparable or better speed and efficiency.

DeWorldSG: Depth-Aware 3D Semantic Scene Graph Generation via World-Model Priors

cs.CV · 2026-07-01 · unverdicted · novelty 5.0

DeWorldSG improves 3D scene graph generation from RGB-D sequences by using depth-guided 3D Gaussian object nodes and V-JEPA 2 world-model priors for spatiotemporal relation refinement, reporting large recall gains on 3DSSG and ReplicaSSG.

TinyFormer: Preserving Tiny Objects in YOLO-DETR Hybrid Real-time Detectors

cs.CV · 2026-05-24 · unverdicted · novelty 5.0

TinyFormer adds Parallel Bi-fusion Module and Spatial Semantic Adapter to a YOLO-DETR hybrid, raising small-object AP by 1.6 points to 58.5% on MS COCO while keeping real-time speed.

ConRTF: Edge-Constrained Boundary Distribution Refinement for Realtime TransFormer Table Structure Recognition

cs.CV · 2026-07-01 · unverdicted · novelty 4.0

ConRTF adds an edge-constrained fine-grained localization loss to a distribution-based real-time detector to improve boundary accuracy in table structure recognition, claiming up to +1.6 GriTS gains on PubTables-1M while remaining data-efficient.

RT-SDGOD: Real-Time Single-Domain Generalized Object Detection

cs.CV · 2026-06-08 · unverdicted · novelty 4.0

RT-SDGDet applies one-to-many supervision, Discriminative Evidence Diversity Learning, and Dual-view Evidence Consistency Learning during training to reduce missed detections in real-time object detectors under unseen domain shifts.

Ultralytics YOLO26: Unified Real-Time End-to-End Vision Models

cs.CV · 2026-06-02 · unverdicted · novelty 4.0

YOLO26 presents a unified real-time vision model family with dual-head end-to-end design, new training components, and task-specific heads that reports improved mAP-latency tradeoffs on COCO and LVIS benchmarks across detection, segmentation, pose, and oriented detection.

Resource-Constrained UAV-Based Weed Detection for Site-Specific Management on Edge Devices

cs.CV · 2026-04-25 · unverdicted · novelty 4.0

YOLOv11s and RT-DETRv2-R50-M provide the best accuracy-speed trade-off for real-time weed detection on edge UAV systems, with mAP50 up to 79% and low latency.

YOLO26 vs. YOLOv8: A Comprehensive Architectural Benchmark of Next-Generation Real-Time Object Detection Models

cs.CV · 2026-05-24 · unverdicted · novelty 2.0

Empirical benchmark finds YOLO26 superior on Pascal VOC accuracy and efficiency but YOLOv8 faster on GPU, with both models struggling similarly on VisDrone small-object detection.

citing papers explorer

Showing 1 of 1 citing paper after filters.

YOLOv12: Attention-Centric Real-Time Object Detectors

cs.CV · 2025-02-18 · unverdicted · none · ref 41
