M²C-EvDet: Multi-Domain Multi-Order Cross-Modal Knowledge Distillation for Event-based Object Detection
Pith reviewed 2026-06-26 00:30 UTC · model grok-4.3
The pith
M^2C-EvDet improves event-based object detection by distilling frequency features and multi-order relations from frame-based models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
M^2C-EvDet is a Multi-domain and Multi-order Cross-modal knowledge distillation framework for Event-based object Detection. Built upon frequency learning and hypergraph computation, M^2C-EvDet integrates two specialized modules: Adaptive Frequency-Decoupled Feature Distillation (AF^2D^2) and Multi-Order Relational Distillation (MORD) to address limitations of prior distillation methods that only focus on spatial semantics or pair-wise relations.
What carries the argument
The M^2C-EvDet framework, with AF^2D^2 for adaptive frequency-decoupled feature distillation and MORD for multi-order relational distillation via hypergraph computation.
If this is right
- Event-based detectors achieve higher accuracy in complex scenarios than methods limited to spatial or pairwise distillation.
- Visual semantics missing from sparse event data are transferred more completely across modalities.
- The performance disparity between EvDet and frame-based detection is reduced through combined frequency and relational knowledge transfer.
- Distillation now operates effectively on both frequency domains and higher-order scene relations.
Where Pith is reading between the lines
- The same frequency-plus-hypergraph approach could be tested on other event-based tasks such as tracking or segmentation.
- Combining the framework with existing large frame-based pretrained models might lower the data requirements for training event detectors.
- Hypergraph modeling of multi-order relations may prove useful in other sparse sensing domains beyond vision.
Load-bearing premise
That the frequency-decoupled and multi-order relational modules will successfully capture and transfer the missing visual semantics from frame-based models to event data in complex scenarios beyond what prior spatial or pairwise methods achieved.
What would settle it
Experiments on standard event detection benchmarks showing no meaningful accuracy gain over prior distillation baselines specifically in complex or high-dynamic-range sequences.
Figures
read the original abstract
Event-based object Detection (EvDet), as a biologically inspired visual perception paradigm, demonstrates superior performance in scenarios demanding high temporal resolution and a wide dynamic range. Nevertheless, the inherent sparse representations and inadequate visual semantics of event data result in a considerable performance disparity between EvDet and frame-based object detection. Previous works attempt to alleviate this cross-modal discrepancy through knowledge distillation, yet they only focus on spatial visual semantics or pair-wise relational information, thus limiting performance in more complex scenarios. To address this challenge, this paper proposes M^2C-EvDet, a Multi-domain and Multi-order Cross-modal knowledge distillation framework for EvDet. Built upon frequency learning and hypergraph computation, M^2C-EvDet integrates two specialized modules: Adaptive Frequency-Decoupled Feature Distillation (AF^2D^2) and Multi-Order Relational Distillation (MORD).
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes M^2C-EvDet, a Multi-domain Multi-Order Cross-Modal knowledge distillation framework for Event-based Object Detection (EvDet). It introduces two modules—Adaptive Frequency-Decoupled Feature Distillation (AF^2D^2) and Multi-Order Relational Distillation (MORD)—built on frequency learning and hypergraph computation to transfer richer visual semantics from frame-based models to sparse event data, addressing limitations of prior distillation approaches that focus only on spatial semantics or pairwise relations.
Significance. If the modules prove effective at capturing and transferring multi-domain and multi-order information, the work could meaningfully narrow the performance gap between EvDet and frame-based detection in complex, high-dynamic-range scenarios, contributing a new direction for cross-modal distillation in event-based vision.
major comments (2)
- [Abstract] Abstract: The central claim of improved performance in complex scenarios rests on the success of AF^2D^2 and MORD, yet the manuscript provides no equations, algorithmic details, or pseudocode for how frequency decoupling is performed adaptively or how hypergraph-based multi-order relations are computed and distilled.
- [Abstract] Abstract: No experimental results, ablation studies, or quantitative comparisons are presented, so the assertion that the proposed modules outperform prior spatial or pairwise distillation methods cannot be assessed.
Simulated Author's Rebuttal
We thank the referee for the detailed comments on the abstract. The full manuscript contains the technical details and experimental results referenced in the body; we address each point below and indicate where revisions to the abstract may be appropriate.
read point-by-point responses
-
Referee: [Abstract] Abstract: The central claim of improved performance in complex scenarios rests on the success of AF^2D^2 and MORD, yet the manuscript provides no equations, algorithmic details, or pseudocode for how frequency decoupling is performed adaptively or how hypergraph-based multi-order relations are computed and distilled.
Authors: The abstract is a high-level summary. The full manuscript provides the requested equations for adaptive frequency decoupling in AF^2D^2 (Section 3.2, including the frequency-domain formulation and adaptive weighting) and the hypergraph construction plus multi-order distillation loss for MORD (Section 3.3, with explicit hyperedge definitions and message-passing steps). Algorithmic details and a pseudocode outline appear in the supplementary material. We can add one sentence to the abstract that points to these sections if the editor prefers. revision: partial
-
Referee: [Abstract] Abstract: No experimental results, ablation studies, or quantitative comparisons are presented, so the assertion that the proposed modules outperform prior spatial or pairwise distillation methods cannot be assessed.
Authors: The full manuscript reports comprehensive experiments in Section 4, including quantitative mAP comparisons on multiple event-based datasets, ablation studies isolating AF^2D^2 and MORD, and direct comparisons against prior spatial and pairwise distillation baselines. The abstract summarizes the outcome of these experiments without specific numbers due to length limits. No change to the abstract is required on this point. revision: no
Circularity Check
No significant circularity detected
full rationale
The paper proposes a new framework M^2C-EvDet integrating AF^2D^2 and MORD modules built on frequency learning and hypergraph computation for cross-modal distillation. No equations, derivations, or fitted parameters are described in the provided text that reduce by construction to the inputs. The central claim is an architectural proposal whose validity rests on empirical performance rather than any self-referential definition or self-citation chain. No load-bearing steps match the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Event-Based Vision: A Survey,
G. Guillermo, D. Tobi, M. O. Garrick, B. Chiara, T. Brian, C. Andrea, L. Stefan, D. Andrew, C. Jorg, D. Kostas, and S. Davide, “Event-Based Vision: A Survey,”IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020
2020
-
[2]
EVCSLR: Event-guided Continuous Sign Language Recognition and Benchmark,
Y . Jiang, Y . Wang, S. Li, Y . Zhang, Q. Guo, Q. Chu, and Y . Gao, “EVCSLR: Event-guided Continuous Sign Language Recognition and Benchmark,”IEEE Transactions on Multimedia, 2024
2024
-
[3]
RGB-D Visual Per- ception for Occluded Scenes via Event Camera,
S. Li, Z. Wu, Y . Li, Z. Xue, Y .-S. Liu, and Y . Gao, “RGB-D Visual Per- ception for Occluded Scenes via Event Camera,”International Journal of Computer Vision, pp. 1–22, 2025
2025
-
[4]
Event- based Low-illumination Image Enhancement,
Y . Jiang, Y . Wang, S. Li, Y . Zhang, M. Zhao, and Y . Gao, “Event- based Low-illumination Image Enhancement,”IEEE Transactions on Multimedia, vol. 26, pp. 1920–1931, 2023
1920
-
[5]
3D Feature Tracking via Event Camera,
S. Li, Z. Zhou, Z. Xue, Y . Li, S. Du, and Y . Gao, “3D Feature Tracking via Event Camera,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2024, pp. 18 974–18 983
2024
-
[6]
AEGNN: Asynchronous Event-based Graph Neural Networks,
S. Schaefer, D. Gehrig, and D. Scaramuzza, “AEGNN: Asynchronous Event-based Graph Neural Networks,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2022, pp. 12 371–12 381
2022
-
[7]
Spiking Transformers for Event-based Single Object Tracking,
J. Zhang, B. Dong, H. Zhang, J. Ding, F. Heide, B. Yin, and X. Yang, “Spiking Transformers for Event-based Single Object Tracking,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2022, pp. 8801–8810
2022
-
[8]
Recurrent Vision Transformers for Object Detection with Event Cameras,
M. Gehrig and D. Scaramuzza, “Recurrent Vision Transformers for Object Detection with Event Cameras,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2023, pp. 13 884–13 893
2023
-
[9]
EvRT-DETR: Latent Space Adaptation of Image Detectors for Event-based Vision,
D. Torbunov, Y . Ren, A. Ghose, O. Dim, and Y . Cui, “EvRT-DETR: Latent Space Adaptation of Image Detectors for Event-based Vision,” in Proceedings of the IEEE International Conference on Computer Vision, 2025, pp. 9812–9821
2025
-
[10]
Low-Latency Automotive Vision with Event Cameras,
D. Gehrig and D. Scaramuzza, “Low-Latency Automotive Vision with Event Cameras,”Nature, vol. 629, no. 8014, pp. 1034–1040, 2024
2024
-
[11]
FlexEvent: Towards Flexible Event-Frame Object Detection at Varying Operational Frequen- cies,
D. Lu, L. Kong, G. Lee, C. Chane, and W. Ooi, “FlexEvent: Towards Flexible Event-Frame Object Detection at Varying Operational Frequen- cies,”arXiv preprint arXiv:2412.06708, 2025
-
[12]
Frequency-Adaptive Low-Latency Object Detection Using Events and Frames,
H. Zhang, X. Wang, C. Xu, X. Wang, F. Xu, H. Yu, L. Yu, and W. Yang, “Frequency-Adaptive Low-Latency Object Detection Using Events and Frames,”arXiv preprint arXiv:2412.04149, 2024
-
[13]
From Global to Local: Multi-Scale Out-of-Distribution Detection,
J. Zhang, L. Gao, B. Hao, H. Huang, J. Song, and H. Shen, “From Global to Local: Multi-Scale Out-of-Distribution Detection,”IEEE Transactions on Image Processing, vol. 32, pp. 6115–6128, 2023
2023
-
[14]
From Channel Bias to Feature Redundancy: Uncovering the
J. Zhang, X. Luo, L. Gao, D. Zou, H. Shen, and J. Song, “From Channel Bias to Feature Redundancy: Uncovering the” Less is More” Principle in Few-Shot Learning,”arXiv e-prints, pp. arXiv–2310, 2023
2023
-
[15]
A Closer Look at Conditional Prompt Tuning for Vision-Language Models,
J. Zhang, S. Wu, L. Gao, J. Song, N. Sebe, and H. T. Shen, “A Closer Look at Conditional Prompt Tuning for Vision-Language Models,”arXiv preprint arXiv:2506.23856, 2025
-
[16]
Reliable Few-shot Learning under Dual Noises,
J. Zhang, J. Song, L. Gao, N. Sebe, and H. T. Shen, “Reliable Few-shot Learning under Dual Noises,” 2025
2025
-
[17]
Object-Centric Cross-Modal Feature Distillation for Event-based Ob- ject Detection,
L. Li, A. Linger, M. Millhaeusler, V . Tsiminaki, Y . Li, and D. Dai, “Object-Centric Cross-Modal Feature Distillation for Event-based Ob- ject Detection,” inProceedings of the IEEE International Conference on Robotics and Automation. IEEE, 2024, pp. 15 440–15 447
2024
-
[18]
Event-Aware Distilled DETR for Object Detection in an Automotive Context,
D. Rossi, P. Vasseur, F. Morbidi, C. Demonceaux, and F. Rameau, “Event-Aware Distilled DETR for Object Detection in an Automotive Context,” inIEEE Intelligent V ehicles Symposium, 2025
2025
-
[19]
Wavelet Integrated CNNs for Noise- robust Image Classification,
Q. Li, L. Shen, S. Guo, and Z. Lai, “Wavelet Integrated CNNs for Noise- robust Image Classification,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2020, pp. 7245–7254
2020
-
[20]
HGNN +: General Hypergraph Neural Networks,
Y . Gao, Y . Feng, S. Ji, and R. Ji, “HGNN +: General Hypergraph Neural Networks,”IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 3, pp. 3181–3199, 2023
2023
-
[21]
SuperEvent: Cross- Modal Learning of Event-based Keypoint Detection,
Y . Burkhardt, S. Schaefer, and S. Leutenegger, “SuperEvent: Cross- Modal Learning of Event-based Keypoint Detection,”arXiv preprint arXiv:2504.00139, 2025
-
[22]
Depth AnyEvent: A Cross-Modal Distillation Paradigm for Event- Based Monocular Depth Estimation,
L. Bartolomei, E. Mannocci, F. Tosi, M. Poggi, and S. Mattoccia, “Depth AnyEvent: A Cross-Modal Distillation Paradigm for Event- Based Monocular Depth Estimation,” inProceedings of the IEEE International Conference on Computer Vision, 2025, pp. 19 669–19 678
2025
-
[23]
I2EKD: Efficient and Versatile Image-to-Event Knowledge Distillation,
H. Liu, G. Yu, H. Cao, S. Qu, F. Lu, Y . Zhong, Z. Lu, L. Leng, and G. Chen, “I2EKD: Efficient and Versatile Image-to-Event Knowledge Distillation,”IEEE Transactions on Circuits and Systems for Video Technology, 2025. 12
2025
-
[24]
When Object Detection Meets Knowledge Distillation: A Survey,
Z. Li, P. Xu, X. Chang, L. Yang, Y . Zhang, L. Yao, and X. Chen, “When Object Detection Meets Knowledge Distillation: A Survey,”IEEE Trans- actions on Pattern Analysis and Machine Intelligence, vol. 45, no. 8, pp. 10 555–10 579, 2023
2023
-
[25]
Context Matters: Distilling Knowledge Graph for Enhanced Object Detection,
A. Yang, S. Lin, C.-H. Yeh, M. Shu, Y . Yang, and X. Chang, “Context Matters: Distilling Knowledge Graph for Enhanced Object Detection,” IEEE Transactions on Multimedia, vol. 26, pp. 487–500, 2023
2023
-
[26]
A Hier- archical Semantic Distillation Framework for Open-V ocabulary Object Detection,
S. Fu, J. Yan, Q. Yang, X. Wei, X. Xie, and W.-S. Zheng, “A Hier- archical Semantic Distillation Framework for Open-V ocabulary Object Detection,”IEEE Transactions on Multimedia, 2025
2025
-
[27]
Improve Object Detection with Feature-Based Knowledge Distillation: Towards Accurate and Efficient Detectors,
L. Zhang and K. Ma, “Improve Object Detection with Feature-Based Knowledge Distillation: Towards Accurate and Efficient Detectors,” in Proceedings of the International Conference on Learning Representa- tions, 2020
2020
-
[28]
Focal and Global Knowledge Distillation for Detectors,
Z. Yang, Z. Li, X. Jiang, Y . Gong, Z. Yuan, D. Zhao, and C. Yuan, “Focal and Global Knowledge Distillation for Detectors,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2022, pp. 4643–4652
2022
-
[29]
Masked Generative Distillation,
Z. Yang, Z. Li, M. Shao, D. Shi, Z. Yuan, and C. Yuan, “Masked Generative Distillation,” inProceedings of the European Conference on Computer Vision. Springer, 2022, pp. 53–69
2022
-
[30]
D 3ETR: Decoder Distillation for Detection Transformer,
X. Chen, J. Chen, Y . Liu, and G. Zeng, “D 3ETR: Decoder Distillation for Detection Transformer,”arXiv preprint arXiv:2211.09768, 2022
-
[31]
DE- TRDistill: A Universal Knowledge Distillation Framework for DETR- families,
J. Chang, S. Wang, H.-M. Xu, Z. Chen, C. Yang, and F. Zhao, “DE- TRDistill: A Universal Knowledge Distillation Framework for DETR- families,” inProceedings of the IEEE International Conference on Computer Vision, 2023, pp. 6898–6908
2023
-
[32]
KD-DETR: Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling,
Y . Wang, X. Li, S. Weng, G. Zhang, H. Yue, H. Feng, J. Han, and E. Ding, “KD-DETR: Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2024, pp. 16 016–16 025
2024
-
[33]
CLoCKDistill: Consistent Location-and- Context-aware Knowledge Distillation for DETRs,
Q. Lan and Q. Tian, “CLoCKDistill: Consistent Location-and- Context-aware Knowledge Distillation for DETRs,”arXiv preprint arXiv:2502.10683, 2025
-
[34]
Frequency- aware Feature Fusion for Dense Image Prediction,
L. Chen, Y . Fu, L. Gu, C. Yan, T. Harada, and G. Huang, “Frequency- aware Feature Fusion for Dense Image Prediction,”IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024
2024
-
[35]
Y . Cui, B. Gao, Y . Zhang, X. Dong, J. Xiang, D. Li, and Z. Tu, “WD-DETR: Wavelet Denoising-Enhanced Real-Time Object Detection Transformer for Robot Perception with Event Cameras,”arXiv preprint arXiv:2506.09098, 2025
-
[36]
Frequency Attention for Knowledge Distillation,
C. Pham, V .-A. Nguyen, T. Le, D. Phung, G. Carneiro, and T.-T. Do, “Frequency Attention for Knowledge Distillation,” inProceedings of the IEEE Winter Conference on Applications of Computer Vision, 2024, pp. 2277–2286
2024
-
[37]
FreeKD: Knowledge Distillation via Semantic Frequency Prompt,
Y . Zhang, T. Huang, J. Liu, T. Jiang, K. Cheng, and S. Zhang, “FreeKD: Knowledge Distillation via Semantic Frequency Prompt,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2024, pp. 15 931–15 940
2024
-
[38]
Learning Representation on Opti- mized High-order Manifold for Visual classification,
X. Ma, W. Liu, Q. Tian, and Y . Gao, “Learning Representation on Opti- mized High-order Manifold for Visual classification,”IEEE Transactions on Multimedia, vol. 24, pp. 3989–4001, 2021
2021
-
[39]
CrossHypergraph: Consis- tent High-order Semantic Network for Few-shot Image Classification,
Y . Zhang, H. Wang, S. Zhang, and B. Leng, “CrossHypergraph: Consis- tent High-order Semantic Network for Few-shot Image Classification,” IEEE Transactions on Multimedia, pp. 1–12, 2025
2025
-
[40]
Hypergraph-Based Remaining Prototype Alignment for Open-Set Cross-Domain Image Retrieval,
Y . Xu, Y . Feng, X. Zhong, Y . Gao, and Z. Wu, “Hypergraph-Based Remaining Prototype Alignment for Open-Set Cross-Domain Image Retrieval,”IEEE Transactions on Multimedia, 2025
2025
-
[41]
Residual Fuzzy Alignment on Hypergraph for Open-Set 3D Cross-Modal Retrieval,
Y . Xu, Y . Feng, X. Zhuang, J. Wang, Z. Wu, and Y . Gao, “Residual Fuzzy Alignment on Hypergraph for Open-Set 3D Cross-Modal Retrieval,” IEEE Transactions on Multimedia, 2025
2025
-
[42]
Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation,
Y . Feng, J. Huang, S. Du, S. Ying, J.-H. Yong, Y . Li, G. Ding, R. Ji, and Y . Gao, “Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation,”IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024
2024
-
[43]
Yolov13: Real-time object detection with hypergraph- enhanced adaptive visual perception,
M. Lei, S. Li, Y . Wu, H. Hu, Y . Zhou, X. Zheng, G. Ding, S. Du, Z. Wu, and Y . Gao, “YOLOv13: Real-Time Object Detection with Hypergraph-Enhanced Adaptive Visual Perception,”arXiv preprint arXiv:2506.17733, 2025
-
[44]
Feature Pyramid Networks for Object Detection,
T.-Y . Lin, P. Doll´ar, R. Girshick, K. He, B. Hariharan, and S. Belongie, “Feature Pyramid Networks for Object Detection,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 2117–2125
2017
-
[45]
Path Aggregation Network for Instance Segmentation,
S. Liu, L. Qi, H. Qin, J. Shi, and J. Jia, “Path Aggregation Network for Instance Segmentation,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 8759–8768
2018
-
[46]
Pyramid Vision Transformer: A versatile Backbone for Dense Prediction without Convolutions,
W. Wang, E. Xie, X. Li, D.-P. Fan, K. Song, D. Liang, T. Lu, P. Luo, and L. Shao, “Pyramid Vision Transformer: A versatile Backbone for Dense Prediction without Convolutions,” inProceedings of the IEEE International Conference on Computer Vision, 2021, pp. 568–578
2021
-
[47]
Attention is All You Need,
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is All You Need,” inProceedings of the Advances in Neural Information Processing Systems, vol. 30, 2017
2017
-
[48]
Vision HGNN: An Image is More than a Graph of Nodes,
Y . Han, P. Wang, S. Kundu, Y . Ding, and Z. Wang, “Vision HGNN: An Image is More than a Graph of Nodes,” inProceedings of the IEEE International Conference on Computer Vision, 2023, pp. 19 878–19 888
2023
-
[49]
Motion Robust High- Speed Light-Weighted Object Detection with Event Camera,
B. Liu, C. Xu, W. Yang, H. Yu, and L. Yu, “Motion Robust High- Speed Light-Weighted Object Detection with Event Camera,”IEEE Transactions on Instrumentation and Measurement, vol. 72, pp. 1–13, 2023
2023
-
[50]
State Space Models for Event Cameras,
N. Zubic, M. Gehrig, and D. Scaramuzza, “State Space Models for Event Cameras,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2024, pp. 5819–5828
2024
-
[51]
D-FINE: Redefine Regression Task in DETRs as Fine-grained Distribution Refinement,
Y . Peng, H. Li, P. Wu, Y . Zhang, X. Sun, and F. Wu, “D-FINE: Redefine Regression Task in DETRs as Fine-grained Distribution Refinement,” arXiv preprint arXiv:2410.13842, 2024
-
[52]
Fusing event-based and rgb camera for robust object detection in adverse conditions,
A. Tomy, A. Paigwar, K. S. Mann, A. Renzaglia, and C. Laugier, “Fusing event-based and rgb camera for robust object detection in adverse conditions,” inIEEE International Conference on Robotics and Automation. IEEE, 2022, pp. 933–939
2022
-
[53]
SODFormer: Streaming Object Detection with Transformer using Events and Frames,
D. Li, Y . Tian, and J. Li, “SODFormer: Streaming Object Detection with Transformer using Events and Frames,”IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 11, pp. 14 020–14 037, 2023
2023
-
[54]
Microsoft COCO: Common Objects in Context,
T.-Y . Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Doll ´ar, and C. L. Zitnick, “Microsoft COCO: Common Objects in Context,” inProceedings of the European Conference on Computer Vision. Springer, 2014, pp. 740–755
2014
-
[55]
Ultralytics YOLO,
G. Jocher, A. Chaurasia, and J. Qiu, “Ultralytics YOLO,” Jan. 2023. [Online]. Available: https://github.com/ultralytics/ultralytics
2023
-
[56]
Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection,
X. Li, W. Wang, L. Wu, S. Chen, X. Hu, J. Li, J. Tang, and J. Yang, “Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection,”Proceedings of the Advances in Neural Information Processing Systems, vol. 33, pp. 21 002–21 012, 2020
2020
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.